136 views

Uploaded by אוהד כהן

save

- 314 02
- Dominator-Based Algorithms in Logic Synthesis and Verification
- CMJV02I03P0445
- differential element
- Mathematics Syllabus for Main Examination
- 5_3notes181
- 2011
- Maths 4 test 3 2016.pdf
- 1718-mod0-1054
- 4130 Errata
- Chap 01 Real Analysis: Real Number System
- Module 1-
- Maths 4 test 3 2016.pdf
- Closed and Open Compact
- Definite Integral
- AIAAJ-sv
- Chapter 1 Sets
- Mathematics Material for Revision Class
- MA242SumII2015Syllabus
- Engineering Mathematics
- 9709_s10_er.pdf
- Fundamental of Calculus (Presentation Slide)A
- Esxplicação Teorema Da Substituição de Variáveis
- CH
- Cerme7 Wg14 Paper Vandebrouck Revised Dec2010
- 18.02 Textbook
- MStat Sample 2009
- Calculus With Analytic Geometry
- MATH220syllabus
- Parish%2c Duraisamy (2016) - Non-local Closure Models for Large Eddy Simul....pdf
- Yes Please
- The Unwinding: An Inner History of the New America
- Sapiens: A Brief History of Humankind
- The Innovators: How a Group of Hackers, Geniuses, and Geeks Created the Digital Revolution
- Elon Musk: Tesla, SpaceX, and the Quest for a Fantastic Future
- Dispatches from Pluto: Lost and Found in the Mississippi Delta
- Devil in the Grove: Thurgood Marshall, the Groveland Boys, and the Dawn of a New America
- John Adams
- The Prize: The Epic Quest for Oil, Money & Power
- The Emperor of All Maladies: A Biography of Cancer
- This Changes Everything: Capitalism vs. The Climate
- Grand Pursuit: The Story of Economic Genius
- A Heartbreaking Work Of Staggering Genius: A Memoir Based on a True Story
- The New Confessions of an Economic Hit Man
- Team of Rivals: The Political Genius of Abraham Lincoln
- The Hard Thing About Hard Things: Building a Business When There Are No Easy Answers
- Smart People Should Build Things: How to Restore Our Culture of Achievement, Build a Path for Entrepreneurs, and Create New Jobs in America
- Rise of ISIS: A Threat We Can't Ignore
- The World Is Flat 3.0: A Brief History of the Twenty-first Century
- Bad Feminist: Essays
- Steve Jobs
- Angela's Ashes: A Memoir
- How To Win Friends and Influence People
- Extremely Loud and Incredibly Close: A Novel
- The Sympathizer: A Novel (Pulitzer Prize for Fiction)
- The Silver Linings Playbook: A Novel
- Leaving Berlin: A Novel
- The Light Between Oceans: A Novel
- The Incarnations: A Novel
- You Too Can Have a Body Like Mine: A Novel
- The Love Affairs of Nathaniel P.: A Novel
- Life of Pi
- The Flamethrowers: A Novel
- Brooklyn: A Novel
- The Rosie Project: A Novel
- The Blazing World: A Novel
- We Are Not Ourselves: A Novel
- The First Bad Man: A Novel
- The Master
- Bel Canto
- A Man Called Ove: A Novel
- The Kitchen House: A Novel
- Beautiful Ruins: A Novel
- Interpreter of Maladies
- The Wallcreeper
- The Art of Racing in the Rain: A Novel
- Wolf Hall: A Novel
- The Cider House Rules
- A Prayer for Owen Meany: A Novel
- The Bonfire of the Vanities: A Novel
- Lovers at the Chameleon Club, Paris 1932: A Novel
- The Perks of Being a Wallflower
- The Constant Gardener: A Novel

You are on page 1of 206

Raz Kupferman

Institute of Mathematics

The Hebrew University

May 5, 2011

2

Contents

1 Real numbers 1

1.1 Axioms of ﬁeld . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Axioms of order (as taught in 2009) . . . . . . . . . . . . . . . . 7

1.3 Axioms of order (as taught in 2010, 2011) . . . . . . . . . . . . . 10

1.4 Absolute values . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

1.5 Special sets of numbers . . . . . . . . . . . . . . . . . . . . . . . 16

1.6 The Archimedean property . . . . . . . . . . . . . . . . . . . . . 18

1.7 Axiom of completeness . . . . . . . . . . . . . . . . . . . . . . . 22

1.8 Rational powers . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

1.9 Real-valued powers . . . . . . . . . . . . . . . . . . . . . . . . . 40

1.10 Addendum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

2 Functions 43

2.1 Basic deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

2.2 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

2.3 Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

2.4 Limits and order . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

2.5 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

2.6 Theorems about continuous functions . . . . . . . . . . . . . . . 70

2.7 Inﬁnite limits and limits at inﬁnity . . . . . . . . . . . . . . . . . 78

2.8 Inverse functions . . . . . . . . . . . . . . . . . . . . . . . . . . 80

2.9 Uniform continuity . . . . . . . . . . . . . . . . . . . . . . . . . 84

ii CONTENTS

3 Derivatives 91

3.1 Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

3.2 Rules of diﬀerentiation . . . . . . . . . . . . . . . . . . . . . . . 98

3.3 Another look at derivatives . . . . . . . . . . . . . . . . . . . . . 102

3.4 The derivative and extrema . . . . . . . . . . . . . . . . . . . . . 104

3.5 Derivatives of inverse functions . . . . . . . . . . . . . . . . . . . 117

3.6 Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

3.7 Taylor’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 123

4 Integration theory 133

4.1 Deﬁnition of the integral . . . . . . . . . . . . . . . . . . . . . . 133

4.2 Integration theorems . . . . . . . . . . . . . . . . . . . . . . . . 142

4.3 The fundamental theorem of calculus . . . . . . . . . . . . . . . . 153

4.4 Riemann sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156

4.5 The trigonometric functions . . . . . . . . . . . . . . . . . . . . 158

4.6 The logarithm and the exponential . . . . . . . . . . . . . . . . . 163

4.7 Integration methods . . . . . . . . . . . . . . . . . . . . . . . . . 167

5 Sequences 171

5.1 Basic deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 171

5.2 Limits of sequences . . . . . . . . . . . . . . . . . . . . . . . . . 172

5.3 Inﬁnite series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

CONTENTS iii

Foreword

1. Texbooks: Spivak, Meizler, Hochman.

iv CONTENTS

Chapter 1

Real numbers

In this course we will cover the calculus of real univariate functions, which was

developed during more than two centuries. The pioneers were Isaac Newton

(1642-1737) and Gottfried Wilelm Leibniz (1646-1716). Some of their follow-

ers who will be mentioned along this course are Jakob Bernoulli (1654-1705),

Johann Bernoulli (1667-1748) and Leonhard Euler (1707-1783). These ﬁrst two

generations of mathematicians developed most of the practice of calculus as we

know it; they could integrate many functions, solve many diﬀerential equations,

and sum up a large number of inﬁnite series using a wealth of sophisticated analyt-

ical techniques. Yet, they were sometimes very vague about deﬁnitions and their

theory often laid on shaky grounds. The sound theory of calculus as we know it

today, and as we are going to learn it in this course was mostly developed through-

out the 19th century, notably by Joseph-Louis Lagrange (1736-1813), Augustin

Louis Cauchy (1789-1857), Georg Friedrich Bernhard Riemann (1826-1866), Pe-

ter Gustav Lejeune-Dirichlet (1805-1859), Joseph Liouville (1804-1882), Jean-

Gaston Darboux (1842-1917), and Karl Weierstrass (1815-1897). As calculus

was being established on ﬁrmer grounds, the theory of functions needed a thor-

ough revision of the concept of real numbers. In this context, we should mention

Georg Cantor (1845-1918) and Richard Dedekind (1831-1916).

1.1 Axioms of ﬁeld

Even though we have been using “numbers” as an elementary notion since ﬁrst

grade, a rigorous course of calculus should start by putting even such basic con-

2 Chapter 1

cept on axiomatic grounds.

The set of real numbers will be deﬁned as an instance of a complete, ordered

ﬁeld (לש רודס הדש).

Deﬁnition 1.1 A ﬁeld F (הדש) is a non-empty set on which two binary opera-

tions are deﬁned

1

: an operation which we call addition, and denote by +, and

an operation which we call multiplication and denote by (or by nothing, as in

a b = ab). The operations on elements of a ﬁeld satisfy nine deﬁning properties,

which we list now.

1. Addition is associative (יצוביק): for all a, b, c ∈ F,

(a + b) + c = a + (b + c).

It should be noted that addition was deﬁned as a binary operation. As such,

there is no a-priori meaning to a +b +c. In fact, the meaning is ambiguous,

as we could ﬁrst add a + b and then add c to the sum, or conversely, add

b + c, and then add the sum to a. The ﬁrst axiom states that in either case,

the result is the same.

What about the addition of four elements, a + b + c + d? Does it require a

separate axiom of equivalence? We would like the following additions

((a + b) + c) + d

(a + (b + c)) + d

a + ((b + c) + d)

a + (b + (c + d))

(a + b) + (c + d)

to be equivalent. It is easy to see that the equivalence follows fromthe axiom

of associativity. Likewise (although it requires some non-trivial work), we

can show that the n-fold addition

a

1

+ a

2

+ + a

n

is deﬁned unambiguously.

1

Abinary operation in F is a function that “takes” an ordered pair of elements in F and “returns”

an element in F. That is, to every a, b ∈ F corresponds one and only one a + b ∈ F and a b ∈ F.

In formal notation,

(∀a, b ∈ F)(∃!c ∈ F) : (c = a + b).

Real numbers 3

2. Existence of an additive-neutral element (רוביחל ילרטיינ רביא): there

exists an element 0 ∈ F such that for all a ∈ F

a + 0 = 0 + a = a.

3. Existence of an additive inverse (ידגנ רביא): for all a ∈ F there exists an

element b ∈ F, such that

a + b = b + a = 0.

It is customary to denote this additive inverse by (−a).

2

With the ﬁrst three axioms there are a few things we can show. For example,

that

a + b = a + c implies b = c.

The proof requires all three axioms:

a + b = a + c

(−a) + (a + b) = (−a) + (a + c) (existence of inverse)

((−a) + a)) + b = ((−a) + a) + c (associativity)

0 + b = 0 + c (property of inverse)

b = c. (0 is the neutral element)

It follows at once that every element a ∈ F has a unique additive inverse,

for if b and c were both additive inverses of a, then a + b = 0 = a + c, from

which follows that b = c. Hence we can refer to the additive inverse of a,

which justiﬁes the notation (−a).

We can also show that there is a unique element in F satisfying the neutral

property (it is not an a priori fact). Suppose there were a, b ∈ F such that

a + b = a.

Then adding (−a) (on the left!) and using the law of associativity we get

that b = 0.

2

The notation (−a) assumes implicitly that the inverse is unique. This is not an assumption, but

follows from the ﬁrst three axioms.

4 Chapter 1

Comment: We only postulate the operations of addition and multiplication.

Subtraction is a short-hand notation for the addition of the additive inverse,

a − b

def

= a + (−b).

Comment: A set satisfying the ﬁrst three axioms is called a group (הרובח).

By themselves, those three axioms have many implications, which can ﬁll

an entire course.

4. Addition is commutative (יפוליח): for all a, b ∈ F

a + b = b + a.

With this law we ﬁnally obtain that any (ﬁnite!) summation of elements of

F can be re-arranged in any order.

On the other hand, our experience with numbers tells us that a − b = b − a

implies that a = b. There is no way we can prove it using only the ﬁrst four

axioms! Indeed, all we obtain from it that

a + (−b) = b + (−a) (given)

(a + (−b)) + (b + a) = (b + (−a)) + (a + b) (commutativity)

a + ((−b) + b) + a = b + ((−a) + a) + b (associativity)

a + 0 + a = b + 0 + b (property of additive inverse)

a + a = b + b, (property of neutral element)

but how can we deduce from that that a = b?

With very little comments, we can state now the corresponding laws for

multiplication:

5. Multiplication is associative: for all a, b, c ∈ F,

a (b c) = (a b) c.

6. Existence of a multiplicative-neutral element: there exists an element

1 ∈ F such that for all a ∈ F,

a 1 = 1 a = a.

Real numbers 5

7. Existence of a multiplicative inverse (יכפוה רביא): for every a 0 there

exists an element a

−1

∈ F such that

3

a a

−1

= a

−1

a = 1.

The condition that a 0 has strong implications. For example, from the

fact that a b = a c we can deduce that b = c only if a 0.

Comment: Division is deﬁned as multiplication by the inverse,

a/b

def

= a b

−1

.

8. Multiplication is commutative: for every a, b ∈ F,

a b = b a.

The last axiom relates the addition and multiplication operations:

9. Distributive law (וביקה קוח): for all a, b, c ∈ F,

a (b + c) = a b + a c.

We can revisit now the a − b = b − a problem. We can now proceed,

a + a = b + b

1 a + 1 a = 1 b + 1 b (property of neutral element)

(1 + 1) a = (1 + 1) b (distributive law).

We would be done if we knew that 1 + 1 0, for we would multiply both

sides by (1 + 1)

−1

. However, this does not follow from the axioms of ﬁeld!

Example: Consider the following set F = ¦e, f ¦ with the following properties,

+ e f

e e f

f f e

e f

e e e

f e f

3

Here again, the uniqueness of the multiplicative inverse has to be proved, without which there

is no justiﬁcation to the notation a

−1

.

6 Chapter 1

It takes some explicit veriﬁcation to check that this is indeed a ﬁeld (in fact the

smallest possible ﬁeld), with e being the additive neutral and f being the multi-

plicative neutral (do you recognize this ﬁeld?).

Note that

e − f = e + (−f ) = e + f = f + (−e) = f − e,

and yet e f , which is then no wonder that we can’t prove, based only on the

ﬁeld axioms, that f − e = e − f implies that e = f .

Proposition 1.1 For every a ∈ F, a 0 = 0.

Proof : Using sequentially the property of the neutral element and the distributive

law:

a 0 = a (0 + 0) = a 0 + a 0.

Adding to both sides −(a 0) we obtain

a 0 = 0.

B

Comment: Suppose it were the case that 1 = 0, i.e., that the same element of F is

both the additive neutral and the multiplicative neutral. It would follows that for

every a ∈ F,

a = a 1 = a 0 = 0,

which means that 0 is the only element of F. This is indeed a ﬁeld according to

the axioms, however a very boring one. Thus, we rule it out and require the ﬁeld

to satisfy 0 1.

Comment: In principle, every algebraic identity should be proved from the ax-

ioms of ﬁeld. In practice, we will assume henceforth that all known algebraic

manipulations are valid. For example, we will not bother to prove that

a − b = −(b − a)

(even though the proof is very easy). The only exception is (of course) if you get

to prove an algebraic identity as an assignment.

Real numbers 7

Exercise 1.1 Let F be a ﬁeld. Prove, based on the ﬁeld axioms, that

' If ab = a for some ﬁeld element a 0, then b = 1.

^ a

3

− b

3

= (a − b)(a

2

+ ab + b

2

).

· If c 0 then a/b = (ac)/(bc).

´ If b, d 0 then

a

b

+

c

d

=

ad + bc

bd

.

Justify every step in your proof!

1.2 Axioms of order (as taught in 2009)

The real numbers contain much more structure than just being a ﬁeld. They also

form an ordered set. The property of being ordered can be formalized by three

axioms:

Deﬁnition 1.2 (רוּדָס הדשׁ) A ﬁeld F is said to be ordered if it has a distin-

guished subset P ⊂ F, which we call the positive elements, such that the following

three properties are satisﬁed:

1. Trichotomy: every element a ∈ F satisﬁes one and only one of the follow-

ing properties. Either (i) a ∈ P, or (ii) (−a) ∈ P, or (iii) a = 0.

2. Closure under addition: if a, b ∈ P then a + b ∈ P.

3. Closure under multiplication: if a, b ∈ P then a b ∈ P.

We supplement these axioms with the following deﬁnitions:

a < b means b − a ∈ P

a > b means b < a

a ≤ b means a < b or a = b

a ≥ b means a > b or a = b.

Comment: a > 0 means, by deﬁnition, that a−0 = a ∈ P. Similarly, a < 0 means,

by deﬁnition, that 0 − a = (−a) ∈ P.

We can then show a number of well-known properties of real numbers:

8 Chapter 1

Proposition 1.2 For every a, b ∈ F either (i) a < b, or (ii) a > b, or (iii) a = b.

Proof : This is an immediate consequence of the trichotomy law, along with our

deﬁnitions. We are asked to show that either (i) a−b ∈ P, or (ii) b−a = −(a−b) ∈

P, or (iii) that a − b = 0. That is, we are asked to show that a − b satisﬁes the

trichotomy, which is an axiom. B

Proposition 1.3 If a < b then a + c < b + c.

Proof :

(b + c) − (a + c) = b − a ∈ P.

B

Proposition 1.4 (Transitivity) If a < b and b < c then a < c.

Proof :

c − a = (c − b)

.,,.

in P

+(b − a)

.,,.

in P

.,,.

in P by closure under addition

.

B

Proposition 1.5 If a < 0 and b < 0 then ab > 0.

Proof : We have seen above that a, b < 0 is synonymous to (−a), (−b) ∈ P. By

the closure under multiplication, it follows that (−a)(−b) > 0. It remains to check

that the axioms of ﬁeld imply that ab = (−a)(−b). B

Corollary 1.1 If a 0 then a a ≡ a

2

> 0.

Real numbers 9

Proof : If a 0, then by the trichotomy either a > 0, in which case a

2

> 0 follows

from the closure under multiplication, or a < 0 in which case a

2

> 0 follows from

the previous proposition, with a = b. B

Corollary 1.2 1 > 0.

Proof : This follows from the fact that 1 = 1

2

. B

Another corollary is that the ﬁeld of complex numbers (yes, this is not within

the scope of the present course) cannot be ordered, since i

2

= (−1), which implies

that (−1) has to be positive, hence 1 has to be negative, which violates the previous

corollary.

Proposition 1.6 If a < b and c > 0 then ac < bc.

Proof : It is given that b − a ∈ P and c ∈ P. Then,

bc − ac = (b − a)

.,,.

in P

c

.,,.

in P

.,,.

in P by closure under multiplication

.

B

Exercise 1.2 Let F be an ordered ﬁeld. Prove based on the axioms that

' a < b if and only if (−b) < (−a).

^ If a < b and c > d, then a − c < b − d.

· If a > 1 then a

2

> a.

´ If 0 < a < 1 then a

2

< a.

I If 0 ≤ a < b and 0 ≤ c < d then ac < bd.

I If 0 ≤ a < b then a

2

< b

2

.

Exercise 1.3 Prove using the axioms of ordered ﬁelds and using induction on

n that

10 Chapter 1

' If 0 ≤ x < y then x

n

< y

n

.

^ If x < y and n is odd then x

n

< y

n

.

· If x

n

= y

n

and n is odd then x = y.

´ If x

n

= y

n

and n is even then either x = y of x = −y.

I Conclude that

x

n

= y

n

if and only if x = y or x = −y.

Exercise 1.4 Let F be an ordered ﬁeld, with multiplicative neutral term 1; we

also denote 1 +1 = 2. Prove that (a +b)/2 is well deﬁned for all a, b ∈ F, and that

if a < b, then

a <

a + b

2

< b.

Exercise 1.5 Show that there can be no ordered ﬁeld that has a ﬁnite number

of elements.

1.3 Axioms of order (as taught in 2010, 2011)

The real numbers contain much more structure than just being a ﬁeld. They also

form an ordered set. The property of being ordered can be formalized by four

axioms:

Deﬁnition 1.3 (רוּדָס הדשׁ) A ﬁeld F is said to be ordered if there exists a rela-

tion < (a relation is a property that every ordered pair of elements either satisfy

or not), such that the following four properties are satisﬁed:

O1 Trichotomy: every pair of elements a, b ∈ F satisﬁes one and only one of

the following properties. Either (i) a < b, or (ii) b < a, or (iii) a = b.

O2 Transitivity: if a < b and b < c then a < c.

O3 Invariance under addition: if a < b then a + c < b + c for all c ∈ F.

O4 Invariance under multiplication by positive numbers: if a < b then ac <

bc for all 0 < c.

Real numbers 11

Comment: The set of numbers a for which 0 < a are called the positive numbers.

The set of numbers a for which a < 0 are called the negative numbers.

We supplement these axioms with the following deﬁnitions:

a > b means b < a

a ≤ b means a < b or a = b

a ≥ b means a > b or a = b.

Proposition 1.7 The set of positive numbers is closed under addition and multi-

plication, namely

If 0 < a, b then 0 < a + b and 0 < ab.

Proof : Closure under addition follows from the fact that 0 < a implies that

b = 0 + b (neutral element)

< a + b, (O3)

and b < a + b follows from transitivity.

Closure under multiplication follows from the fact that

0 = 0 a (proved)

< b a. (O4)

B

(2 hrs, 2010) (2 hrs, 2011)

Proposition 1.8 0 < a if and only if (−a) < 0, i.e., a number is positive if and

only if its additive inverse is negative.

Proof : Suppose 0 < a, then

(−a) = 0 + (−a) (neutral element)

< a + (−a) (O3)

= 0 (additive inverse)

12 Chapter 1

i.e., (−a) < 0. Conversely, if (−a) < 0 then

0 = (−a) + a (additive inverse)

< 0 + a (O3)

= a (neutral element)

B

Corollary 1.3 The set of positive numbers is not empty.

Proof : Since by assumption 1 0, it follows by O1 that either 0 < 1 or 1 < 0. In

the latter case 0 < (−1), hence either 1 or (−1) is positive. B

Proposition 1.9 0 < 1.

Proof : By trichotomy, either 0 < 1, or 1 < 0, or 0 = 1. The third possibility is

ruled out by assumption. Suppose then, by contradiction, that 1 < 0. Then for

every 0 < a (and we know that at least one such exists),

a = a 1 < a 0 = 0,

i.e., a < 0 which violates the trichotomy axiom. Hence 0 < 1. B

Proposition 1.10 0 < a if and only if 0 < a

−1

, i.e., a number is positive if and

only if its multiplicative inverse is also positive.

Proof : Let 0 < a and suppose by contradiction that a

−1

< 0 (it can’t be zero

because a

−1

a = 1). Then,

1 = a a

−1

< a 0 = 0,

which is a contradiction. B

Real numbers 13

Proposition 1.11 If a < 0 and b < 0 then ab > 0.

Proof : We have seen above that a, b < 0 implies that 0 < (−a), (−b). By the

closure under multiplication, it follows that (−a)(−b) > 0. It remains to check that

the axioms of ﬁeld imply that ab = (−a)(−b). B

Corollary 1.4 If a 0 then a a ≡ a

2

> 0.

Proof : If a 0, then by the trichotomy either a > 0, in which case a

2

> 0 follows

from the closure under multiplication, or a < 0 in which case a

2

> 0 follows from

the previous proposition, with a = b. B

Another corollary is that the ﬁeld of complex numbers (which is not within the

scope of the present course) cannot be ordered, since i

2

= (−1), which implies

that (−1) has to be positive, hence 1 has to be negative, which is a violation.

1.4 Absolute values

Deﬁnition 1.4 For every element a in an ordered ﬁeld F we deﬁne the absolute

value (טלחומ רע),

¦a¦ =

_

¸

¸

_

¸

¸

_

a a ≥ 0

(−a) a < 0.

We see right away that ¦a¦ = 0 if and only if a = 0, and otherwise ¦a¦ > 0.

Proposition 1.12 (Triangle inequality (שלושמה ויוויש יא)) For every a, b ∈ F,

¦a + b¦ ≤ ¦a¦ + ¦b¦.

Proof : We need to examine four cases (by trichotomy):

1. a ≥ 0 and b ≥ 0.

14 Chapter 1

2. a ≤ 0 and b ≥ 0.

3. a ≥ 0 and b ≤ 0.

4. a ≤ 0 and b ≤ 0.

In the ﬁrst case, ¦a¦ = a and ¦b¦ = b, hence

¦a + b¦ = a + b = ¦a¦ + ¦b¦.

(Hey, does this mean that ¦a + b¦ ≤ ¦a¦ + ¦b¦?) In the fourth case, ¦a¦ = (−a),

¦b¦ = (−b), and 0 < −(a + b), hence

¦a + b¦ = −(a + b) = (−a) + (−b) = ¦a¦ + ¦b¦.

It remains to show the second case, as the third case follows by interchanging the

roles of a and b. Let a ≤ 0 and b ≥ 0, i.e., ¦a¦ = (−a) and ¦b¦ = b. We need to show

that

¦a + b¦ ≤ (−a) + b.

We can divide this case into two sub-cases. Either a + b ≥ 0, in which case

¦a + b¦ = a + b ≤ (−a) + b = ¦a¦ + ¦b¦.

If a + b < 0, then

¦a + b¦ = −(a + b) = (−a) + (−b) ≤ (−a) + b = ¦a¦ + ¦b¦,

which completes the proof

4

. B

Comment: By replacing b by (−b) we obtain that

¦a − b¦ ≤ ¦a¦ + ¦b¦.

The following version of the triangle inequality is also useful:

4

We have used the fact that a ≥ 0 implies (−a) ≤ a and a ≤ 0 implies a ≤ (−a). This is very

easy to show. For example,

a > 0 =⇒ a + (−a) > 0 + (−a) =⇒ 0 > (−a),

and (−a) < a follows from transitivity.

Real numbers 15

Proposition 1.13 (Reverse triangle inequality) For every a, b ∈ F,

¦¦a¦ − ¦b¦¦ ≤ ¦a + b¦.

Proof : By the triangle inequality,

∀a, c ∈ F, ¦a − c¦ ≤ ¦a¦ + ¦c¦.

Set c = b + a, in which case ¦b¦ ≤ ¦a¦ + ¦b + a¦, or

¦b¦ − ¦a¦ ≤ ¦b + a¦.

By interchanging the roles of a and b,

¦a¦ − ¦b¦ ≤ ¦a + b¦.

It follows that

5

¦¦a¦ − ¦b¦¦ ≤ ¦a + b¦,

B

Proposition 1.14 For every a, b, c ∈ F,

¦a + b + c¦ ≤ ¦a¦ + ¦b¦ + ¦c¦.

Proof : Use the triangle inequality twice,

¦a + b + c¦ = ¦a + (b + c)¦ ≤ ¦a¦ + ¦b + c¦ ≤ ¦a¦ + ¦b¦ + ¦c¦.

B

(2 hrs, 2009)

Exercise 1.6 Let min(x, y) and max(x, y) denote the smallest and the largest

of the two arguments, respectively. Show that

max(x, y) =

1

2

(x + y + ¦x − y¦) and min(x, y) =

1

2

(x + y − ¦x − y¦) .

5

Here we used the fact that any property satisﬁed by a and (−a) is also satisﬁed by ¦a¦.

16 Chapter 1

1.5 Special sets of numbers

What kind of sets can have the properties of an ordered ﬁeld? We start by intro-

ducing the natural numbers (ייעבטה ירפסמה), by taking the element 1 (which

exists by the axioms), and naming the numbers

1 + 1, 1 + 1 + 1, 1 + 1 + 1 + 1, etc.

That these numbers are all distinct follows from the axioms of order (each number

is greater than its predecessor because 0 < 1). That we give them names, such as

2, 3, 4, and may represent them using a decimal system is immaterial. We denote

the set of natural numbers by N. Clearly the natural numbers do not form a ﬁeld

(for example, there are no additive inverses because all the numbers are positive).

We can then augment the set of natural numbers by adding the numbers

0, −1, −(1 + 1), −(1 + 1 + 1), etc.

This forms the set of integers (ימלשה), which we denote by Z. The integers form

a commutative group with respect to addition, but are not a ﬁeld since multiplica-

tive inverses do not exist

6

.

The set of numbers can be further augmented by adding all integer quotients,

m/n, n 0. This forms the set of rational numbers, which we denote by Q. It

should be noted that the rational numbers are the set of integer quotients modulo

an equivalence relation

7

. A rational number has inﬁnitely many representations

as the quotient of two integers.

6

How do we know that (1 + 1)

−1

, for example, is not an integer? We know that it is positive

and non-zero. Suppose that it were true that 1 < (1 + 1)

−1

, then

1 + 1 = (1 + 1) 1 < (1 + 1) (1 + 1)

−1

= 1,

which is a violation of the axioms of order.

7

A few words about equivalence relations (תוליקש סחי). An equivalence relation on a set S is

a property that any two elements either have or don’t. If two elements a, b have this property, we

say that a is equivalent to b, and denote it by a ∼ b. An equivalence relation has to be symmetric

(a ∼ b implies b ∼ a), reﬂexive (a ∼ a for all a) and transitive (a ∼ b and b ∼ c implies a ∼ c).

Thus, with every element a ∈ S we can associate an equivalence class (תוליקש תקלחמ), which

is ¦b : b ∼ a¦. Every element belongs to one and only one equivalence class. An equivalence

relation partitions S into a collection of disjoint equivalence classes; we call this set S modulo the

equivalence relation ∼, and denote it S/ ∼.

Real numbers 17

Proposition 1.15 Let b, d 0. Then

a

b

=

c

d

if and only if ad = bc.

Proof : Multiply/divide both sides by bd (which cannot be zero if b, d 0). B

It can be checked that the rational numbers form an ordered ﬁeld.

(4 hrs, 2010) (4 hrs, 2011)

In this course we will not teach mathematical induction, nor inductive deﬁni-

tions; we will assume these concepts to be understood. As an example of an

inductive deﬁnition, we deﬁne for all a ∈ Q and n ∈ N, the n-th power, by setting,

a

1

= a and a

k+1

= a a

k

.

As an example of an inductive proof, consider the following proposition:

Proposition 1.16 Let a, b > 0 and n ∈ N. Then

a > b if and only if a

n

> b

n

.

Proof : We proceed by induction. This is certainly true for n = 1. Suppose this

were true for n = k. Then

a > b implies a

k

> b

k

implies a

k+1

= a a

k

> a b

k

> b b

k

= b

k+1

.

Conversely, if a

k+1

> b

k+1

then a ≤ b would imply that a

k+1

≤ b

k+1

, hence a > b.

B

Another inequality, which we will need occasionally is:

Proposition 1.17 (Bernoulli inequality) For all x > (−1) and n ∈ N,

(1 + x)

n

≥ 1 + nx.

18 Chapter 1

Proof : The proof is by induction. For n = 1 both sides are equal. Suppose this

were true for n = k. Then,

(1 + x)

k+1

= (1 + x)(1 + x)

k

≥ (1 + x)(1 + kx) = 1 + (k + 1)x + kx

2

≥ 1 + (k + 1)x,

where in the left-most inequality we have used explicitly the fact that 1 + x > 0.

B

Exercise 1.7 The following exercise deal with logical assertions, and is in

preparation to the type of assertions we will be dealing with all along this course.

For each of the following statements write its negation in hebrew, without using

the word “no”. Then write both the statement and its negation, using logical

quantiﬁers like ∀ and ∃.

' For every integer n, n

2

≥ n holds.

^ There exists a natural number M such that n < M for all natural numbers n.

· For every integer n, m, either n ≥ m or −n ≥ −m.

´ For every natural number n there exist natural numbers a, b,, such that b < n,

1 < a and n = ab.

(3 hrs, 2009)

1.6 The Archimedean property

We are aiming at constructing an entity that matches our notions and experiences

associated with numbers. Thus far, we deﬁned an ordered ﬁeld, and showed that it

must contain a set, which we called the integers, along with all numbers express-

ible at ratios of integers—the rational numbers Q. Whether it contains additional

elements is left for the moment unanswered. The point is that Q already has the

property of being an ordered ﬁeld.

Before proceeding, we will need the following deﬁnitions:

Deﬁnition 1.5 A set of elements A ⊂ F is said to be bounded from above (וסח

ליעלמ) if there exists an element M ∈ F, such that a ≤ M for all a ∈ A, i.e.,

(∃M ∈ F) : (∀a ∈ A)(a ≤ M).

Real numbers 19

Such an element M is called an upper bound (ליעלמ סח) for A. A is said to be

bounded from below (ערלמ וסח) if

(∃m ∈ F) : (∀a ∈ A)(a ≥ m).

Such an element m is called a lower bound (ערלמ סח) for A. A set is said to be

bounded (וסח) if it is bounded both from above and below.

Note that if a set is upper bounded, then the upper bound is not unique, for if M is

an upper bound, so are M + 1, M + 2, and so on.

Proposition 1.18 A set A is bounded if and only if

(∃M ∈ F) : (∀a ∈ A)(¦a¦ ≤ M).

Proof :

8

Suppose A is bounded, then

(∃M

1

∈ F) : (∀a ∈ A)(a ≤ M

1

)

(∃M

2

∈ F) : (∀a ∈ A)(a ≥ M

2

).

That is,

(∃M

1

∈ F) : (∀a ∈ A)(a ≤ M

1

)

(∃M

2

∈ F) : (∀a ∈ A)((−a) ≤ (−M

2

)).

By taking M = max(M

1

, −M

2

) we obtain

(∀a ∈ A)(a ≤ M ∧ (−a) ≤ M),

i.e.,

(∀a ∈ A)(¦a¦ ≤ M).

The other direction isn’t much diﬀerent. B

8

Once and for all: P if Q means that “if Q holds then P holds”, or Q implies P. P only if

Q means that “if Q does not hold, then P does not hold either”, which implies that P implies Q.

Thus, “if and only if” means that each one implies the other.

20 Chapter 1

A few notational conventions: for a < b we use the following notations for an

open (חותפ עטק), closed (רוגס עטק), and semi-open (רוגס יצח עטק) segment,

(a, b) = ¦x ∈ F : a < x < b¦

[a, b] = ¦x ∈ F : a ≤ x ≤ b¦

(a, b] = ¦x ∈ F : a < x ≤ b¦

[a, b) = ¦x ∈ F : a ≤ x < b¦.

Each of these sets is bounded (above and below).

We also use the following notations for sets that are bounded on one side,

(a, ∞) = ¦x ∈ F : x > a¦

[a, ∞) = ¦x ∈ F : x ≥ a¦

(−∞, a) = ¦x ∈ F : x < a¦

(−∞, a] = ¦x ∈ F : x ≤ a¦.

The ﬁrst two sets are bounded from below, whereas the last two are bounded from

above. It should be emphasized that ±∞ are not members of F! The above is

purely a notation. Thus, “being less than inﬁnity” does not mean being less than

a ﬁeld element called inﬁnity.

(5 hrs, 2011)

Consider now the set of natural numbers, N. Does it have an upper bound? Our

“geometrical picture” of the number axis clearly indicates that this set is un-

bounded, but can we prove it? We can easily prove the following:

Proposition 1.19 The natural numbers are not upper bounded by a natural num-

ber.

Proof : Suppose n ∈ N were an upper bound for N, i.e.,

(∀k ∈ N) : (k ≤ n).

However, m = n + 1 ∈ N and m > n, i.e.,

(∃m ∈ N) : (n < m),

which is a contradiction. B

Similarly,

Real numbers 21

Corollary 1.5 The natural numbers are not bounded from above by any rational

number

Proof : Suppose that p/q ∈ Q was an upper bound for N. Since, by the axioms of

order, p/q ≤ p, it would imply the existence of a natural number p that is upper

bound for N.

9

B

(5 hrs, 2010)

It may however be possible that an irrational element in F be an upper bound for

N. Can we rule out this possibility? It turns out that this cannot be proved from

the axioms of an ordered ﬁeld. Since, however, the boundedness of the naturals is

so basic to our intuition, we may impose it as an additional axiom, known as the

axiom of the Archimedean ﬁeld: the set of natural numbers is unbounded from

above

10

.

This axiom has a number of immediate consequences:

Proposition 1.20 In an Archimedean ordered ﬁeld F:

(∀a ∈ F)(∃n ∈ N) : (a < n).

Proof : If this weren’t the case, then N would be bounded. Indeed, the negation of

the proposition is

(∃a ∈ F) : (∀n ∈ N)(n ≤ a),

i.e., a is an upper bound for N. B

Corollary 1.6 In an Archimedean ordered ﬁeld F:

(∀ > 0)(∃n ∈ N) : (1/n < ).

9

Convince yourself that p/q ≤ p for all p, q ∈ N.

10

This additional assumption is temporary. The Archimedean property will be provable once

we add our last axiom.

22 Chapter 1

Proof : By the previous proposition,

(∀ > 0)(∃n ∈ N) : (0 < 1/ < n)

.,,.

(0<1/n<)

.

B

Corollary 1.7 In an Archimedean ordered ﬁeld F:

(∀x, y > 0)(∃n ∈ N) : (y < nx).

Comment: This is really what is meant by the Archimedean property. For every

x, y > 0, a segment of length y can be covered by a ﬁnite number of segments of

length x.

y

x x x x x x

Proof : By the previous corollary with = y/x,

(∀x, y > 0)(∃n ∈ N) : (y/x < n)

.,,.

(y<nx)

.

B

1.7 Axiom of completeness

The Greeks knew already that the ﬁeld of rational numbers is “incomplete”, in the

sense that there is no rational number whose square equals 2 (whereas they knew

by Pythagoras’ theorem that this should be the length of the diagonal of a unit

square). In fact, let’s prove it:

Proposition 1.21 There is no r ∈ Q such that r

2

= 2.

Real numbers 23

Proof : Suppose, by contradiction, that r = n/m, where n, m ∈ N (we can assume

that r is positive) satisﬁes r

2

= 2. Although it requires some number theoreti-

cal knowledge, we will assume that it is known that any rational number can be

brought into a form where m and n have no common divisor. By assumption,

m

2

/n

2

= 2, i.e., m

2

= 2n

2

. This means that m

2

is even, from which follows that m

is even, hence m = 2k for some k ∈ N. Hence, 4k

2

= 2n

2

, or n

2

= 2k

2

, from which

follows that n is even, contradicting the assumption that m and n have no common

divisor. B

Proof : Another proof: suppose again that r = n/m is irreducible and r

2

= 2.

Since, n

2

= 2m

2

, then

m

2

< n

2

< 4m

2

,

it follows that

m < n < 2m < 2n,

and

0 < n − m < m and 0 < 2m − n < n.

Consider now the ratio

q =

2m − n

n − m

,

whose numerator and denominator are both smaller than that of r. By elementary

arithmetic

q

2

=

4m

2

− 4mn + n

2

n

2

− 2nm + m

2

=

4 − 4n/m + n

2

/m

2

n

2

/m

2

− 2n/m + 1

=

6 − 4n/m

3 − 2n/m

= 2,

which is a contradiction. B

Exercise 1.8 Prove that there is no r ∈ Q such that r

2

= 3.

Exercise 1.9 What fails if we try to apply the same arguments to prove that

there is no r ∈ Q such that r

2

= 4 (an assertion that happens to be false)?

A way to cope with this “missing number” would be to add

√

2 “by hand” to the

set of rational numbers, along with all the numbers obtained by ﬁeld operations

involving

√

2 and rational numbers (this is called in algebra a ﬁeld extension)

24 Chapter 1

(הבחרה הדש)

11

. But then, what about

√

3? We could add all the square roots of

all positive rational numbers. And then, what about

3

√

2? Shall we add all n-th

roots? But then, what about a solution to the equation x

5

+ x + 1 = 0 (it cannot be

expressed in terms of roots as a result of Galois’ theory)?

It turns out that a single additional axiom, known as the axiom of completeness

(תומלשה תמויסכא), completes the set of rational numbers in one fell swoop, such

to provide solutions to all those questions. Let us try to look in more detail in

what sense is the ﬁeld of rational numbers “incomplete”. Consider the following

two sets,

A = ¦x ∈ Q : 0 < x, x

2

≤ 2¦

B = ¦y ∈ Q : 0 < y, 2 ≤ y

2

¦.

Every element in B is greater or equal than every element in A (by transitivity and

by the fact that 0 < x ≤ y implies x

2

≤ y

2

). Formally,

(∀a ∈ A ∧ ∀b ∈ B)(a ≤ b).

Does there exist a rational number that “separates” the two sets, i.e., does there

exist an element c ∈ Q such that

(∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b) ?

It can be shown that if there existed such a c it would satisfy c

2

= 2, hence such a

c does not exist. This observation motivates the following deﬁnition:

Deﬁnition 1.6 An ordered ﬁeld F is said to be complete if for every two non-

empty sets, A, B ⊆ F satisfying

(∀a ∈ A ∧ ∀b ∈ B)(a ≤ b).

there exists a c ∈ F such that

(∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b).

We will soon see how this axiom “completes” the ﬁeld Q.

We next introduce more deﬁnitions. Let’s start with a motivating example:

11

It can be shown that this ﬁeld extension of Q consists of all elements of the form

¦a + b

√

2 : a, b ∈ Q¦.

Real numbers 25

Example: For any set [a, b) = ¦x : a ≤ x < b¦, b is an upper bound, but so is any

larger element, e.g., b + 1. In fact, it is clear that b is the least upper bound.

Deﬁnition 1.7 Let A ⊆ F be a set. An element M ∈ F is called a least upper

bound (וילע סח) for A if (i) it is an upper bound for A, and (ii) if M

·

is also an

upper bound for A then M ≤ M

·

. That is,

(∀a ∈ A)(a ≤ M)

(∀M

·

∈ F)( if (∀a ∈ A)(a ≤ M

·

) then (M ≤ M

·

)).

We can see right away that a least upper bound, if it exists, is unique:

Proposition 1.22 Let A be a set and let M be a least upper bound for A. If M

·

is

also a least upper bound for A then M = M

·

.

Proof : It follows immediately from the deﬁnition of the least upper bound. If

M and M

·

are both least upper bounds, then both are in particular upper bounds,

hence M ≤ M

·

and M

·

≤ M, which implies that M = M

·

. B

We call the least upper bound of a set (if it exists!) a supremum, and denote it

by sup A. Similarly, the greatest lower bound (ותחת סח) of a set is called the

inﬁmum, and it is denoted by inf A.

12

(7 hrs, 2011)

We can provide an equivalent deﬁnition of the least upper bound:

Proposition 1.23 Let A be a set. A number M is a least upper bound if and only

if (i) it is an upper bound, and (ii)

(∀ > 0)(∃a ∈ A) : (a > M − ).

12

In some books the notations lub (least upper bound) and glb (greatest lower bound) are used

instead of sup and inf.

26 Chapter 1

M

A

!!

a

Proof : There are two directions to be proved.

1. Suppose ﬁrst that M were a least upper bound for A, i.e.,

(∀a ∈ A)(a ≤ M) and (∀M

·

∈ F)(if (∀a ∈ A)(a ≤ M

·

) then (M ≤ M

·

)).

Suppose, by contradiction, that there exists an > 0 such that a ≤ M− for

all a ∈ A, i.e., that

(∃ > 0) : (∀a ∈ A)(a < M − ),

then M − is an upper bound for A, smaller than the least upper bound,

which is a contradiction.

2. Conversely, suppose that M is an upper bound and

(∀ > 0)(∃a ∈ A) : (a > M − ).

Suppose that M was not a least upper bound. Then there exists a smaller

upper bound M

·

< M. Take = M − M

·

. Then, a ≤ M

·

= M − for all

a ∈ A, or in formal notation,

(∃ > 0) : (∀a ∈ A)(a < M − ),

which is a contradiction.

B

Examples: What are the least upper bounds (if they exist) in the following exam-

ples:

' [−5, 15] (answer: 15).

Real numbers 27

^ [−5, 15) (answer: 15).

· [−5, 15] ∪ ¦20¦ (answer: 20).

´ [−5, 15] ∪ (17, 18) (answer: 18).

I [−5, ∞) (answer: none).

I ¦1 − 1/n : n ∈ N¦ (answer: 1).

Exercise 1.10 Let F be an ordered ﬁeld, and ∅ A ⊆ F. Prove that M = inf A

if and only if M is a lower bound for A, and

(∀ > 0)(∃a ∈ A) : (a < M + ).

Exercise 1.11 Show that the axiom of completeness is equivalent to the fol-

lowing statement: for every two non-empty sets A, B ⊂ F, satisfying that a ∈ A

and b ∈ B implies a ≤ b, there exists an element c ∈ F, such that

a ≤ c ≤ b for all a ∈ A and b ∈ B.

Exercise 1.12 For each of the following sets, determine whether it is upper

bounded, lower bounded, and if they are ﬁnd their inﬁmum and/or supremum:

' ¦(−2)

n

: n ∈ N¦.

^ ¦1/n : n ∈ Z, n 0¦.

· ¦1 + (−1)

n

: n ∈ N¦.

´ ¦1/n + (−1)

n

: n ∈ N¦.

I ¦n/(2n + 1) : n ∈ N¦.

I ¦x : ¦x

2

− 1¦ < 3¦.

Exercise 1.13 Let A, B be non-empty sets of real numbers. We deﬁne

A + B = ¦a + b : a ∈ A, b ∈ B¦

A − B = ¦a − b : a ∈ A, b ∈ B¦

A B = ¦a b : a ∈ A, b ∈ B¦.

Find A + B, A B and A A in each of the following cases:

' A = ¦1, 2, 3¦ and B = ¦−1, −2, −1/2, 1/2¦.

^ A = B = [0, 1].

28 Chapter 1

Exercise 1.14 Let A be a non-empty set of real numbers. Is each of the fol-

lowing statements necessarity true? Prove it or give a counter example:

' A + A = ¦2¦ A.

^ A − A = ¦0¦.

· ∀x ∈ A A, x ≥ 0.

Exercise 1.15 In each of the following items, A, B are non-empty subsets of a

complete ordered ﬁeld. For the moment, you can only use the axiom of complete-

ness for the least upper bound.

' Suppose that A is lower bounded and set B = ¦x : −x ∈ A¦. Prove that B is

upper bounded, that A has an inﬁmum, and sup B = −inf A.

^ Show that if A is upper bounded and B is lower bounded, then sup(A− B) =

sup A − inf B.

· Show that if A, B are lower bounded then inf(A + B) = inf A + inf B.

´ Suppose that A, B are upper bounded. It is necessarily true that sup(A B) =

sup A sup B?

I Show that if A, B only contain non-negative terms and are upper bounded,

then sup(A B) = sup A sup B.

Deﬁnition 1.8 Let A ⊂ F be a subset of an ordered ﬁeld. It is said to have a

maximum if there exists an element M ∈ A which is an upper bound for A. We

denote

M = max A.

It is said to have a minimum if there exists an element m ∈ A which is a lower

bound for A. We denote

m = min A.

(7 hrs, 2010)

Proposition 1.24 If a set A has a maximum, then the maximum is the least upper

bound, i.e.,

sup A = max A.

Similarly, if A has a minimum, then the minimum is the greatest lower bound,

inf A = min A.

Real numbers 29

Proof : Let M = max A.

13

Then M is an upper bound for A, and for every > 0,

A ÷ M > M − ,

that is (∀ > 0)(∃a ∈ A)(a > M−), which proves that M is the least upper bound.

B

Examples: What are the maxima (if they exist) in the following examples:

' [−5, 15] (answer: 15).

^ [−5, 15) (answer: none).

· [−5, 15] ∪ ¦20¦ (answer: 20).

´ [−5, 15] ∪ (17, 18) (answer: none).

I [−5, ∞) (answer: none).

I ¦1 − 1/n : n ∈ N¦ (answer: none).

(5 hrs, 2009)

Proposition 1.25 Every ﬁnite set in an ordered ﬁeld has a minimum and a maxi-

mum.

Proof : The proof is by induction on the size of the set. B

Corollary 1.8 Every ﬁnite set in an ordered ﬁeld has a least upper bound and a

greatest lower bound.

Consider now the ﬁeld Q of rational numbers, and its subset

A = ¦x ∈ Q : 0 < x, x

2

< 2¦.

This set is bounded from above, as 2, for example, is an upper bound. Indeed, if

x ∈ A, then

x

2

< 2 < 2

2

,

13

We emphasize that not every (non-empty) set has a maximum. The statement is that if a

maximum exists, then it is also the supremum.

30 Chapter 1

which implies that x < 2.

0 1 2

A

an upper bound

But does A have a least upper bound?

Proposition 1.26 Suppose that A has a least upper bound, then

(sup A)

2

= 2.

Proof : We assume that A has a least upper bound, and denote

α = sup A.

Clearly, 1 < α < 2.

Suppose, by contradiction, that α

2

< 2. Relying on the Archimedean property, we

choose n to be an integer large enough, such that

n > max

_

1

α

,

6

2 − α

2

_

.

We are going to show that there exists a rational number r > α, such that r

2

< 2,

which means that α is not even an upper bound for A—contradiction:

(α + 1/n)

2

= (α)

2

+

2

n

α +

1

n

2

< α

2

+

3

n

α (use 1/n < α)

< α

2

+

6

n

(use α < 2)

< α

2

+ 2 − α

2

= 2, (use 6/n < 2 − α

2

)

which means that α + 1/n ∈ A.

Real numbers 31

Similarly, suppose that α

2

> 2. This time we set

n >

2α

α

2

− 2

.

We are going to show that there exists a rational number r < α, such that r

2

> 2,

which means that r is an upper bound for A smaller than α—again, a contradiction:

(α − 1/n)

2

= α

2

−

2

n

α +

1

n

2

> α

2

−

2

n

α (omit 1/n

2

)

> α

2

+ 2 − α

2

= 2, (use n/2α > α

2

− 2)

which means that, indeed, α − 1/n is an upper bound for A. B

Since we saw that there is no r ∈ Q such that r

2

= 2, it follows that there is no

supremum to this set in Q. We will now see that completeness guarantees the

existence of a lowest upper bound:

Proposition 1.27 In a complete ordered ﬁeld every non-empty set that is upper-

bounded has a least upper bound.

Proof : Let A ⊂ F be a non-empty set that is upper-bounded. Deﬁne

B = ¦b ∈ F : b is an upper bound for A¦,

which by assumption is non-empty. Clearly,

(∀a ∈ A ∧ ∀b ∈ B)(a ≤ b).

By the axiom of completeness,

(∃c ∈ F) : (∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b).

Clearly c is an upper bound for A smaller or equal than every other upper bound,

hence

c = sup A.

B

(8 hrs, 2010)

The following proposition is an immediate consequence:

32 Chapter 1

Proposition 1.28 In a complete ordered ﬁeld every non-empty set that is bounded

from below has a greatest lower bound (an inﬁmum).

Proof : Let m be a lower bound for A, i.e.,

(∀a ∈ A)(m ≤ a).

Consider the set

B = ¦(−a) : a ∈ A¦ .

Since

(∀a ∈ A)((−m) ≥ (−a)),

it follows that (−m) is a upper bound for B. By the axiom of completeness, B has

a least upper bound, which we denote by M. Thus,

(∀a ∈ A)(M ≥ (−a)) and (∀ > 0)(∃a ∈ A) : ((−a) > M − ).

This means that

(∀a ∈ A)((−M) ≤ a) and (∀ > 0)(∃a ∈ A) : (a < (−M) + ).

This means that (−M) = inf A. B

With that, we deﬁne the set of real numbers as a complete ordered ﬁeld, which we

denote by R. It turns out that this deﬁnes the set uniquely, up to a relabeling of its

elements (i.e., up to an isomorphism).

14

The axiom of completeness can be used to prove the Archimedean property. In

other words, the axiom of completeness solves at the same time the “problems”

pointed out in the previous section.

Proposition 1.29 (Archimedean property) N is not bounded from above.

14

Strictly speaking, we should prove that such an animal exists. Constructing the set of real

numbers from the set of rational numbers is beyond the scope of this course. You are strongly

recommended, however, to read about it. See for example the construction of Dedekind cuts.

Real numbers 33

Proof : Suppose N was bounded. Then it would have a least upper bound M,

(∃M ∈ R) : (∀n ∈ N) (n ≤ M).

Since n ∈ N implies n + 1 ∈ N,

(∀n ∈ N) ((n + 1) ≤ M),

i.e.,

(∀n ∈ N) (n ≤ (M − 1)).

This means that M−1 is an upper bound, contradicting the fact that M is the least

upper bound. B

(9 hrs, 2011)

We next prove an important facts about the density (תופיפצ) of rational numbers

within the reals.

Proposition 1.30 (The rational numbers are dense in the reals) Let x, y ∈ R,

such that x < y. There exists a rational number q ∈ Q such that x < q < y.

In formal notation,

(∀x, y ∈ R, x < y)(∃q ∈ Q) : (x < q < y).

Proof : Since the natural numbers are not bounded, there exists an m ∈ Z ⊂ Q

such that m < x. Also there exists a natural number n ∈ N such that 1/n < y − x.

Consider now the sequence of rational numbers,

r

k

= m +

k

n

, k = 0, 1, 2, . . . .

From the Archimedean property follows that there exists a k such that

r

k

= m +

k

n

> x.

Let k be the smallest such number (the exists a smallest k because every ﬁnite set

has a minimum), i.e.,

r

k

> x and r

k−1

≤ x.

34 Chapter 1

Hence,

x < r

k

= r

k−1

+

1

n

≤ x +

1

n

< y,

which completes the proof.

x y !

q

1/"

If we start #om an integer poin$

on the le% of x and proceed by rational

steps of size less than &y'x(, then w)

reach a rational point between x and y.

B

The following proposition provides a useful fact about complete ordered ﬁelds.

We will use it later in the course.

Proposition 1.31 ((יכתחה תמל)) Let A, B ⊂ F be two non-empty sets in a com-

plete ordered ﬁeld, such that

(∀a ∈ A ∧ ∀b ∈ B)(a ≤ b).

Then the following three statements are equivalent:

' (∃!M ∈ F) : (∀a ∈ A ∧ ∀b ∈ B)(a ≤ M ≤ b).

^ sup A = inf B.

· (∀ > 0)(∃a ∈ A, b ∈ B) : (b − a < ).

Comment: The claim is not that the three items follow from the given data. The

claim is that each statement implies the two other.

Proof : Suppose that the ﬁrst statement holds. We have already seen that sup A is

a lower bound for B. This implies at once that sup A ≤ inf B. If sup A < inf B,

then any number m in between would satisfy a < m < b for all a ∈ A, b ∈ B,

contradicting the assumption, hence ' implies ®.

Real numbers 35

By the properties of the inﬁmum and the supremum,

(∀ > 0)(∃a ∈ A) : (a > sup A − /2)

(∀ > 0)(∃b ∈ B) : (b < inf B + /2).

That is,

(∀ > 0)(∃a ∈ A) : (a + /2 > sup A)

(∀ > 0)(∃b ∈ B) : (b − /2 < inf B),

i.e., (∀ > 0)(∃a ∈ A, b ∈ B) such that

b − a < (inf B + /2) − (sup A − /2) = .

Thus, ® implies ·.

It remains to show that · implies '. Suppose M was not unique, i.e., there were

M

1

< M

2

such that

(∀a ∈ A ∧ ∀b ∈ B)(a ≤ M

1

< M

2

≤ b).

Set m = (M

1

+ M

2

)/2 and = (M

2

− M

1

). Then,

(∀a ∈ A, b ∈ B) (a ≤ m − /2, b ≥ m + /2)

.,,.

b−a≥(m+/2)−(m−/2)=

,

i.e., if ' does not hold then · does not hold. This concludes the proof. B

(7 hrs, 2009)

Exercise 1.16 In this exercise you may assume that

√

2 is irrational.

' Show that if a + b

√

2 = 0, where a, b ∈ Q, then a = b = 0.

^ Conclude that a + b

√

2 = c + d

√

2, where a, b, c, d ∈ Q, then a = c and

b = d.

· Show that if a + b

√

2 ∈ Q, where a, b ∈ Q, then b = 0.

´ Show that the irrational numbers are dense, namely that there exists an irra-

tional number between every two.

I Show that the set

¦a + b

√

2 : a, b ∈ Q¦

with the standard addition and multiplication operations is a ﬁeld.

36 Chapter 1

Exercise 1.17 Show that the set

¦m/2

n

: m ∈ Z, n ∈ N¦

is dense.

Exercise 1.18 Let A, B ⊆ R be non-empty sets. Prove or disprove each of the

following statements:

' If A ⊆ B and B is bounded from above then A is bounded from above.

^ If A ⊆ B and B is bounded from below then A is bounded from below.

· If every b ∈ B is an upper bound for A then every a ∈ A is a lower bound

form B.

´ A is bounded from above if and only if A ∩ Z is bounded from above.

Exercise 1.19 Let A ⊆ R be a non-empty set. Determine for each of the fol-

lowing statements whether it is equivalent, contradictory, implied by, or implying

the statement that s = sup A, or none of the above:

' a ≤ s for all a ∈ A.

^ For every a ∈ A there exists a r ∈ R, such that a < t < s.

· For every x ∈ R such that x > s there exists an a ∈ A such that s < a < x.

´ For every ﬁnite subset B ⊂ A, s ≥ max B.

I s is the supremum of A ∪ ¦s¦.

1.8 Rational powers

We deﬁned the integer powers recursively,

x

1

= x x

k+1

= x x

k

.

We also deﬁne for x 0, x

0

= 1 and x

−n

= 1/x

n

.

Proposition 1.32 (Properties of integer powers) Let x, y 0 and m, n ∈ Z, then

' x

m

x

n

= x

m+n

.

Real numbers 37

^ (x

m

)

n

= x

mn

.

· (xy)

n

= x

n

y

n

.

´ 0 < α < β and n > 0 implies α

n

< β

n

.

I 0 < α < β and n < 0 implies α

n

> β

n

.

I α > 1 and n > m implies α

n

> α

m

.

I 0 < α < 1 and n > m implies α

n

< α

m

.

Proof : Do it. B

(10 hrs, 2011)

Deﬁnition 1.9 Let x > 0. An n-th root of x is a positive number y, such that

y

n

= x.

It is easy to see that if x has an n-th root then it is unique (since y

n

= z

n

, would

imply that y = z). Thus, we denote it by either

n

√

x, or by x

1/n

.

Theorem 1.1 (Existence and uniqueness of roots)

(∀x > 0 ∧ ∀n ∈ N)(∃!y > 0) : (y

n

= x).

(10 hrs, 2010)

Proof : Consider the set

S = ¦z ≥ 0 : z

n

< x¦.

This is a non-empty set (it contains zero) and bounded from above, since

if x ≤ 1 and z ∈ S then z

n

< x ≤ 1, which implies that z ≤ 1

if x > 1 and z ∈ S then z

n

< x, which implies that z ≤ x,

It follows that regardless of the value of x, z ∈ S implies that z ≤ max(1, x), and

the latter is hence an upper bound for S . By the axiom of completeness, there

exists a unique y = sup S , which is the “natural suspect” for being an n-th root of

x. Indeed, we will show that y

n

= x.

38 Chapter 1

Claim: y is positive Indeed,

0 <

x

1 + x

< 1,

hence

_

x

1 + x

_

n

<

x

1 + x

< x.

Thus, x/(1 + x) ∈ S , which implies that y > 0.

Claim: y

n

≥ x Suppose, by contradiction that y

n

< x. Set

0 < < min

_

1,

x − y

n

nx

_

.

Then,

nx < x − y

n

,

and

y

n

< x(1 − n) ≤ x(1 − )

n

,

where we used the Bernoulli inequality. This ﬁnally implies that

_

y

1 −

_

n

< x.

Thus, we found a number greater than y, whose n-th power is less than x, i.e., in

S . This contradicts the assumption that y is an upper bound for S .

Claim: y

n

≤ x Suppose, by contradiction that y

n

> x. Set

0 < < min

_

1,

y

n

− x

ny

n

_

.

Then,

ny

n

< y

n

− x,

and

[(1 − )y]

n

= (1 − )

n

y

n

≥ (1 − n)y

n

> x,

where we used once more the Bernoulli inequality. Thus, we found a number less

than y, whose n-th power is greater than x, i.e., it is an upper bound for S . This

contradicts the assumption that y is the least upper bound for S .

Real numbers 39

From the two inequalities follows (by trichotomy) that y

n

= x. B

Having shown that every positive number has a unique n-th root, we may proceed

to deﬁne rational powers.

Deﬁnition 1.10 Let r = m/n ∈ Q, then for all x > 0

x

r

= (x

m

)

1/n

.

There is one little delicacy with this deﬁnition. Recall that rational numbers do

not have a unique representation as the ratio of two integers. We thus need to show

that the above deﬁnition is independent of the representation

15

. In other words, if

ad = bc, with a, b, c, d ∈ Z, b, d 0, then for all x > 0

x

a/b

= x

c/d

.

This is a non-trivial fact. We need to prove that ad = bc implies that

sup¦z > 0 : z

b

< x

a

¦ = sup¦z > 0 : z

d

< x

c

¦.

The arguments goes as follows:

(x

a/b

)

bd

= (((x

a

)

1/b

)

b

)

d

= (x

a

)

d

= x

ad

= x

bc

= (x

c

)

b

= (((x

c

)

1/d

)

d

)

b

= (x

c/d

)

bd

,

hence x

a/b

= x

c/d

.

(8 hrs, 2009)

Proposition 1.33 (Properties of rational powers) Let x, y > 0 and r, s ∈ Q, then

' x

r

x

s

= x

r+s

.

^ (x

r

)

s

= x

rs

.

· (xy)

r

= x

r

y

r

.

´ 0 < α < β and r > 0 implies α

r

< β

r

.

I 0 < α < β and r < 0 implies α

r

> β

r

.

I α > 1 and r > s implies α

r

> α

s

.

I 0 < α < 1 and r > s implies α

r

< α

s

.

15

This is not obvious. Suppose we wanted to deﬁne the notion of even/odd for rational numbers,

by saying that a rational number is even if its numerator is even. Such as deﬁnition would imply

that 2/6 is even, but 1/3 (which is the same number!) is odd.

40 Chapter 1

Proof : We are going to prove only two items. Start with the ﬁrst. Let r = a/b and

s = c/d. Using the (proved!) laws for integer powers,

(x

a/b

x

c/d

)

bd

= (x

a/b

)

bd

(x

c/d

)

bd

= = x

ad

x

bc

= x

ad+bc

= (x

a/b+c/d

)

bd

,

from which follows that

x

a/b

x

c/d

= x

a/b+c/d

.

Take then the fourth item. Let r = a/b. We already know that α

a

< β

a

. Thus

it remains to show α

1/b

< β

1/b

. This has to be because α

1/b

≥ β

1/b

would have

implied (from the rules for integer powers!) that α ≥ β. B

1.9 Real-valued powers

Delayed to much later in the course.

1.10 Addendum

Existence of non-Archimedean ordered ﬁelds Does there exist ordered ﬁelds

that do not satisfy the Archimedean property? The answer is positive. We will give

one example, bearing in mind that some of the arguments rely on later material.

Consider the set of rational functions. It is easy to see that this set forms a ﬁeld

with respect to function addition and multiplication, with f (x) ≡ 0 for additive

neutral element and f (x) ≡ 1 for multiplicative neutral element. Ignore the fact

that a rational function may be undeﬁned at a ﬁnite collection of points. The

”natural elements” are the constant functions, f (x) ≡ 1, f (x) ≡ 2, etc. We endow

this set with an order relation by deﬁning the set P of positive elements as the

rational functions f that are eventually positive for x suﬃciently large. We hace

f > g if f (x) > g(x) for x suﬃciently large. It is easy to see that the set of “natural

elements” are bounded, say, by the element f (x) = x (which obviously belongs to

the set).

Connected set A set A ⊆ R will be said to be connected (רישק) if x, y ∈ A and

x < z < y implies z ∈ A (that is, every point between two elements in the set is

Real numbers 41

also n the set). We state that a (non-empty) connected set can only be of one of

the following forms:

¦a¦, (a, b), [a, b), (a, b], [a, b]

(a, ∞), [a, ∞), (−∞, b), (−∞, b], (−∞, ∞).

That is, it can only be a single point, an open, closed or semi-open segments, an

open or closed ray, or the whole line.

Exercise 1.20 Prove it.

42 Chapter 1

Chapter 2

Functions

2.1 Basic deﬁnitions

What is a function? There is a standard way of deﬁning functions, but we will

deliberately be a little less formal than perhaps we should, and deﬁne a function

as a “machine”, which when provided with a number, returns a number (only one,

and always the same for the same input). In other words, a function is a rule that

assigns real numbers to real numbers.

A function is deﬁned by three elements:

' A domain (וּחת): a subset of R. The numbers which may be “fed into the

machine”.

^ A range (חווט): another subset of R. Numbers that may be “emitted by the

machine”. We do not exclude the possibility that some of these numbers

may never be returned. We only require that every number returned by the

function belongs to its range.

· A transformation rule (הקתעה). The crucial point is that to every number

in its domain corresponds one and only one number in its range (תוּיכרע דח).

We normally denote functions by letters, like we do for real numbers (and for any

other mathematical entity). To avoid ambiguities, the function has to be deﬁned

properly. For example, we may denote a function by the letter f . If A ⊆ R is its

domain, and B ⊆ R is its range, we write f : A → B ( f maps the set A into the set

B). The transformation rule has to specify what number in B is assigned by the

44 Chapter 2

function to each number x ∈ A. We denote the assignment by f (x) (the function

f evaluated at x). That the assignment rule is “assign f (x) to x” is denoted by

f : x ·→ f (x) (pronounced “ f maps x to f (x)”)

1

.

Deﬁnition 2.1 Let f : A → B be a function. Its image (הנומת) is the subset of B

of numbers that are actually assigned by the function. That is,

Image( f ) = ¦y ∈ B : ∃x ∈ A : f (x) = y¦.

The function f is said to be onto B (לע) if B is its image. f is said to be one-to-

one (תיכרע דח דח) if to each number in its image corresponds a unique number

in its domain, i.e.,

(∀y ∈ Image( f ))(∃!x ∈ A) : ( f (x) = y).

Examples:

' A function that assigns to every real number its square. If we denote the

function by f , then

f : R → R and f : x ·→ x

2

.

We may also write f (x) = x

2

.

We should not say, however, that “the function f is x

2

”. In particular, we

may use any letter other than x as an argument for f . Thus, the functions

f : R → R, deﬁned by the transformation rules f (x) = x

2

, f (t) = t

2

,

f (α) = α

2

and f (ξ) = ξ

2

are identical

2

.

Also, it turns out that the function f only returns non-negative numbers.

There is however nothing wrong with the deﬁnition of the range as the

whole of R. We could limit the range to be the set [0, ∞), but not to the

set [1, 2].

^ A function that assigns to every w ±1 the number (w

3

+3w+5)/(w

2

−1).

If we denote this function by g, then

g : R \ ¦±1¦ → R and g : w ·→

w

3

+ 3w + 5

w

2

− 1

.

1

Programmers: think of f : A → B as deﬁning the “type” or “syntax” of the function, and of

f : x ·→ f (x) as deﬁning the “action” of the function.

2

It takes a great skill not to confuse the Greek letters ζ (zeta) and ξ (xi).

Functions 45

· A function that assigns to every −17 ≤ x ≤ π/3 its square. This function

diﬀers from the function in the ﬁrst example because the two functions do

not have the same domain (diﬀerent “syntax” but same “routine”).

´ A function that assigns to every real numbers the value zero if it is irrational

and one if it is rational. This function is known as the Dirichlet function.

We have f : R → ¦0, 1¦, with

f : x ·→

_

¸

¸

_

¸

¸

_

0 x is irrational

1 x is rational.

I A function deﬁned on the domain

A = ¦2, 17, π

2

/17, 36/π¦ ∪ ¦a + b

√

2 : a, b ∈ Q¦,

such that

x ·→

_

¸

¸

¸

¸

¸

¸

¸

¸

_

¸

¸

¸

¸

¸

¸

¸

¸

_

5 x = 2

36/π x = 17

28 x = π

2

/17 or 36/π

16 otherwise.

The range may be taken to be R, but the image is ¦6, 16, 28, 36/π¦.

I A function deﬁned on R \ Q (the irrational numbers), which assigns to x

the number of 7’s in its decimal expansion, if this number is ﬁnite. If this

number is inﬁnite, then it returns −π. This example diﬀers from the previous

ones in that we do not have an assignment rule in closed form (how the heck

do we compute f (x)?). Nevertheless it provides a legal assignment rule

3

.

I For every n ∈ N we may deﬁne the n-th power function f

n

: R → R, by

f

n

: x ·→ x

n

. Here again, we will avoid referring to “the function x

n

”. The

function f

1

: x ·→ x is known as the identity function, often denoted by Id,

namely

Id : R → R, Id : x ·→ x.

3

We limited the range of the function to the irrationals because rational numbers may have a

non-unique decimal representation, e.g.,

0.7 = 0.6999999 . . . .

46 Chapter 2

I There are many functions that you all know since high school, such as the

sine, the cosine, the exponential, and the logarithm. These function re-

quire a careful, sometimes complicated, deﬁnition. It is our choice in the

present course to assume that the meaning of these functions (along with

their domains and ranges) are well-understood.

Given several functions, they can be combined together to form new functions.

For example, functions form a vector space over the reals. Let f : A → R and

g : B → R be given functions, and a, b ∈ R. We may deﬁne a new function

4

a f + b g : A ∩ B → R, a f + b g : x ·→ a f (x) + b g(x).

This is a vector ﬁeld whose zero element is the zero function, x ·→ 0; given a

function f , its inverse is −f = (−1) f .

Moreover, functions form an algebra. We may deﬁne the product of two func-

tions,

f g : A ∩ B → R, f g : x ·→ f (x)g(x),

as well as their quotient,

f /g : A ∩ ¦z ∈ B : g(z) 0¦ → R, f /g : x ·→ f (x)/g(x),

(12 hrs, 2011)

A third operation that can combine two functions is composition. Let f : A → B

and g : B → C. We deﬁne

g ◦ f : A → C, g ◦ f : x ·→ g( f (x)).

For example, if f is the sine function and g is the square function, then

5

g ◦ f : ξ ·→ sin

2

ξ and f ◦ g : ζ ·→ sin ζ

2

,

i.e., composition is non-commutative. On the other hand, composition is associa-

tive, namely,

( f ◦ g) ◦ h = f ◦ (g ◦ h).

4

This is not trivial. We are adding “machines”, not numbers.

5

It is deﬁnitely a habit to use the letter x as generic argument for functions. Beware of ﬁxations!

Any letter including Ξ is as good.

Functions 47

Note that for every function f ,

Id ◦ f = f ◦ Id = f ,

so that the identity is the neutral element with respect to function composition.

This should not be confused with the fact that x ·→ 1 is the neutral element with

respect to function multiplication.

Example: Consider the function f that assigns the rule

f : x ·→

x + x

2

+ x sin x

2

x sin x + x sin

2

x

.

This function can be written as

f =

Id +Id Id +Id sin ◦(Id Id)

Id sin +Id sin sin

.

**Example: Recall the n-th power functions f
**

n

. A function P is called a polynomial

of degree n if there exist real numbers (a

i

)

n

i=0

, with a

n

0, such that

P =

n

k=0

a

k

f

k

.

The union over all n’s of polynomials of degree n is the set of polynomials. A

function is called rational if it is the ratio of two polynomials.

(10 hrs, 2009)

Exercise 2.1 The following exercise deals with the algebra of functions. Con-

sider the three following functions,

s(x) = sin x P(x) = 2

x

and S (x) = x

2

,

all three deﬁned on R.

' Express the following functions without using the symbols s, P, and S : (i)

(s ◦ P)(y), (ii) (S ◦ s)(ξ), and (iii) (S ◦ P ◦ s)(t) + (s ◦ P)(t).

48 Chapter 2

^ Write the following functions using only the symbols s, P, and S , along

with the binary operators +, , and ◦: (i) f (x) = 2

sin x

, (ii) f (t) = 2

2

x

, (iii)

f (u) = sin(2

u

+ 2

u

2

), (iv) f (y) = sin(sin(sin(2

2

sin y

))), and (v) f (a) = 2

sin

2

a

+

sin(a

2

) + 2

sin(a

2

+sin a)

.

Exercise 2.2 An open segment I is called symmetric if there exists an a > 0

such that I = (−a, a). Let f : I → R where I is a symmetric segment; the function

f is called even if f (−x) = f (x) for every x ∈ I, and odd if f (−x) = −f (x) for

every x ∈ I.

Let f , g : I → R, where I is a symmetric segments, and consider the four possibili-

ties of f , g being either even or odd. In each case, determine whether the speciﬁed

function is even, odd, or not necessarily either two. Explain your statement, and

where applicable give a counter example:

' f + g.

^ f g.

· f ◦ g (here we assume that g : I → J and f : J → R, where I and J are

symmetric segments).

Exercise 2.3 Show that any function f : R → R can be written as

f = E + O,

where E is even and O is odd. Prove then that this decomposition is unique. That

is, if f = E + O = E

·

+ O

·

, then E = E

·

and O = O

·

.

Exercise 2.4 The following exercise deals with polynomials.

' Prove that for every polynomial of degree ≥ 1 and for every real number

a there exists a polynomial g and a real number b, such that f (x) = (x −

a)g(x) + b (hint: use induction on the degree of the polynomial).

^ Prove that if f is a polynomial satisfying f (a) = 0, then there exists a

polynomial g such that f (x) = (x − a)g(x).

· Conclude that if f is a polynomial of degree n, then there are at most n real

numbers a, such that f (a) = 0 (at most n roots).

Exercise 2.5 Let a, b, c ∈ R and a 0, and let f : A → B be given by

f : x ·→ ax

2

+ bx + c.

Functions 49

' Suppose B = R. Find the largest segment A for which f is one-to-one.

^ Suppose A = R. Find the largest segment B for which f is onto.

· Find segments A and B as large as possible for which f is one-to-one and

onto.

Exercise 2.6 Let f satisfy the relation f (x + y) = f (x) + f (y) for all x, y ∈ R.

' Prove inductively that

f (x

1

+ x

2

+ + x

n

) = f (x

1

) + f (x

2

) + + f (x

n

)

for all x

1

, . . . , x

n

∈ R.

^ Prove the existence of a real number c, such that f (x) = cx for all x ∈ Q.

(Note that we claim nothing for irrational arguments. )

2.2 Graphs

As you know, we can associate with every function a graph. What is a graph? A

drawing? A graph has to be thought of as a subset of the plane. For a function

f : A → B, we deﬁne the graph of f to be the set

Graph( f ) = ¦(x, y) : x ∈ A, y = f (x)¦ ⊂ A × B.

The deﬁning property of the function is that it is uniquely deﬁned, i.e.,

(x, y) ∈ Graph( f ) and (x, z) ∈ Graph( f ) implies y = z.

The function is one-to-one if

(x, y) ∈ Graph( f ) and (w, y) ∈ Graph( f ) implies x = w.

It is onto B if

(∀y ∈ B)(∃x ∈ A) : ((x, y) ∈ Graph( f )).

The deﬁnition we gave to a function as an assignment rule is, strictly speaking, not

a formal one. The standard way to deﬁne a function is via its graph. A function is

a graph—a subset of the Cartesian product space A × B.

50 Chapter 2

Comment: In fact, this is the general way to deﬁne functions. Let A and B be two

sets (of anything!). A function f : A → B is a subset of the Cartesian product

Graph( f ) ⊂ A × B = ¦(a, b) : a ∈ A, b ∈ B¦,

satisfying

(∀a ∈ A)(∃!b ∈ B) : ((a, b) ∈ Graph( f )).

(12 hrs, 2010)

2.3 Limits

Deﬁnition 2.2 Let x ∈ R. A neighborhood of x (הביבס) is an open segment

(a, b) that contains the point x (note that since the segment is open, x cannot be

a boundary point). A punctured neighborhood of x (תֶבֶקוּנמ הביבס) is a set

(a, b) \ ¦x¦ where a < x < b.

a a

punctured neighborhood

neighborhood

Deﬁnition 2.3 Let A ⊂ R. A point a ∈ A is called an interior point of A (הדוּקנ

תימינפ) if it had a neighborhood contained in A.

Notation: We will mostly deal with symmetric neighborhoods (whether punc-

tured or not), i.e., neighborhoods of a of the form

¦x : ¦x − a¦ < δ¦

for some δ > 0. We will introduce the following notations for neighborhoods,

B(a, δ) = (a − δ, a + δ)

B

◦

(a, δ) = ¦x : 0 < ¦x − a¦ < δ¦.

Functions 51

We will also deﬁne one-sided neighborhoods,

B

+

(a, δ) = [a, a + δ)

B

◦

+

(a, δ) = (a, a + δ)

B

−

(a, δ) = (a − δ, a]

B

◦

−

(a, δ) = (a − δ, a).

Example: For A = [0, 1], the point 1/2 is an interior point, but not the point 0.

**Example: The set Q ⊂ R has no interior points, and neither does its complement,
**

R \ Q.

In this section we explain the meaning of the limit of a function at a point. Infor-

mally, we say that:

The limit of a function f at a point a is , if we can make f assign

a value as close to as we wish, by making its argument suﬃciently

close to a (excluding the value a itself).

Note that the function does not need to be equal to at the point a; in fact, it does

not even need to be deﬁned at the point a.

Example: Consider the function

f : R → R f : x ·→ 3x.

We claim that the limit of this function at the point 5 is 15. This means that we

can make f (x) be as close to 15 as we wish, by making x suﬃciently close to 5,

with 5 itself being excluded. Suppose you want f (x) to diﬀer from 15 by less than

1/100. This means that you want

15 −

1

100

< f (x) = 3x < 15 +

1

100

.

This requirement is guaranteed if

5 −

1

300

< x < 5 +

1

300

.

52 Chapter 2

Thus, if we take x to diﬀer from 5 by less than 1/300 (but more than zero!), then

we are guaranteed to have f (x) within the desired range. Since we can repeat this

construction for any number other than 1/100, then we conclude that the limit of

f at 5 is 15. We write,

lim

5

f = 15.

although the more common notation is

6

lim

x→5

f (x) = 15.

We can be more precise. Suppose you want f (x) to diﬀer from 15 by less than ,

for some > 0 of your choice. In other words, you want

¦ f (x) − 15¦ = ¦3x − 15¦ < .

This is guaranteed if ¦x − 5¦ < /3, thus given > 0, choosing x within a sym-

metric punctured neighborhood of 5 of length 2/3 guarantees that f (x) is within

a distance of from 15.

This example insinuates what would be a formal deﬁnition of the limit of a func-

tion at a point:

Deﬁnition 2.4 Let f : A → B with a ∈ A an interior point. We say that the limit

of f at a is , if for every > 0, there exists a δ > 0, such that

7

¦ f (x) − ¦ < for all x ∈ B

◦

(a, δ).

In formal notation,

(∀ > 0)(∃δ > 0) : (∀x ∈ B

◦

(a, δ))(¦ f (x) − ¦ < ).

6

The choice of not using the common notation becomes clear when you realize that you could

as well write

lim

ℵ→5

f (ℵ) = 15 or lim

ξ→5

f (ξ) = 15.

7

The game is “you give me and I give you δ in return”.

Functions 53

a

l

a

l

! !

"

Player A: This is my !.

Player B: No problem. Here is my ".

Since Player B can ﬁnd a " for every choice of ! made by Player A,it fo!ows tha"

the limit of f at a is l.

Example: Consider the square function f : R → R, f : x ·→ x

2

. We want to show

that

lim

3

f = 9.

By deﬁnition, we need to show that for every > 0 we can ﬁnd a δ > 0, such that

¦ f (x) − 9¦ < whenever x ∈ B

◦

(3, δ).

Thus, the game it to respond with an appropriate δ for every .

To ﬁnd the appropriate δ, we examine the condition that needs to be satisﬁed:

¦ f (x) − 9¦ = ¦x − 3¦ ¦x + 3¦ < .

If we impose that ¦x − 3¦ < δ then

¦ f (x) − 9¦ < ¦x + 3¦ δ.

If the ¦x +3¦ wasn’t there we would have set δ = and we would be done. Instead,

we note that by the triangle inequality,

¦x + 3¦ = ¦x − 3 + 6¦ ≤ ¦x − 3¦ + 6,

so that for all ¦x − 3¦ < δ,

¦ f (x) − 9¦ ≤ ¦x − 3¦ (¦x − 3¦ + 6) < (6 + δ)δ.

54 Chapter 2

Now recall: given we have the freedom to choose δ > 0 such to make the right-

hand side less or equal than . We may freely require for example that d ≤ 1, in

which case

¦ f (x) − 9¦ < (6 + δ)δ ≤ 7δ.

If we furthermore take δ ≤ /7, then we reach the desired goal.

To summarize, given > 0 we take δ = min(1, /7), in which case (∀x : 0 <

¦x − 3¦ < δ),

¦ f (x) − 9¦ = ¦x − 3¦ ¦x + 3¦ ≤ ¦x − 3¦ (¦x − 3¦ + 6) < (6 + δ)δ ≤ (6 + 1)

7

= .

This proves (by deﬁnition) that lim

3

f = 9.

We can show, in general, that

lim

a

f = a

2

for all a ∈ R. [do it!]

(13 hrs, 2010)

Example: The next example is the function f : (0, ∞), f : x ·→ 1/x. We are going

to show that for every a > 0,

lim

a

f =

1

a

.

First ﬁx a; it is not a variable. Let > 0 be given. We ﬁrst observe that

¸

¸

¸

¸

¸

f (x) −

1

a

¸

¸

¸

¸

¸

=

¸

¸

¸

¸

¸

1

x

−

1

a

¸

¸

¸

¸

¸

=

¦x − a¦

ax

.

We need to be careful that the domain of x does not include zero. We start by

requiring that ¦x − a¦ < a/2, which at once implies that x > a/2, hence

¸

¸

¸

¸

¸

f (x) −

1

a

¸

¸

¸

¸

¸

<

2¦x − a¦

a

2

.

If we further require that ¦x − a¦ < a

2

/2, then ¦ f (x) − 1/a¦ < . To conclude,

¸

¸

¸

¸

¸

f (x) −

1

a

¸

¸

¸

¸

¸

< whenever x ∈ B

◦

(a, δ),

for δ = min(a/2, a

2

/2).

(14 hrs, 2011)

Functions 55

Example: Consider next the Dirichlet function, f : R → R,

f : x ·→

_

¸

¸

_

¸

¸

_

0 x is irrational

1 x is rational.

We will show that f does not have a limit at zero. To show that for all ,

lim

0

f ,

we need to show the negation of

(∃ ∈ R) : (∀ > 0)(∃δ > 0) : (∀x ∈ B

◦

(0, δ))( f (x) ∈ B(, )),

i.e.,

(∀ ∈ R)(∃ > 0) : (∀δ > 0)(∃x ∈ B

◦

(0, δ)) : (¦ f (x) − ¦ ≥ ).

Indeed, take = 1/4. Then no matter what δ is, there exist points 0 < ¦x¦ < δ for

which f (x) = 0 and points for which f (x) = 1, i.e.,

(∀δ > 0)(∃x, y ∈ B

◦

(0, δ)) : ( f (x) = 0, f (y) = 1).

No can satisfy both

¦0 − ¦ <

1

4

and ¦1 − ¦ <

1

4

.

I.e.,

(∀ ∈ R, ∀δ > 0)(∃x ∈ B

◦

(0, δ)) : ( f (x) B(,

1

4

)).

(12 hrs, 2009)

Comment: It is important to stress what is the negation that the limit f at a is :

There exists an > 0, such that for all δ > 0, there exists an x, which

satisﬁes 0 < ¦x − a¦ < δ, but not ¦ f (x) − ¦ < .

Example: Consider, in contrast, the function f : R → R,

f : x ·→

_

¸

¸

_

¸

¸

_

0 x is irrational

x x is rational.

56 Chapter 2

Here we show that

lim

0

f = 0.

Indeed, let > 0 be given, then by taking δ = ,

x ∈ B

◦

(0, δ) implies ¦ f (x) − 0¦ < .

**Having a formal deﬁnition of a limit, and having seen a number of example, we
**

are in measure to prove general theorems about limits. The ﬁrst theorem states

that a limit, if it exists, is unique.

Theorem 2.1 (Uniqueness of the limit) A function f : A → B has at most one

limit at any interior point a ∈ A.

Proof : Suppose, by contradiction, that

lim

a

f = and lim

a

f = m,

with m. By deﬁnition, there exists a δ

1

> 0 such that

¦ f (x) − ¦ <

1

4

¦ − m¦ whenever x ∈ B

◦

(a, δ

1

),

and there exists a δ

2

> 0 such that

¦ f (x) − m¦ <

1

4

¦ − m¦ whenever x ∈ B

◦

(a, δ

2

).

Take δ = min(δ

1

, δ

2

). Then, whenever x ∈ B

◦

(a, δ),

¦ − m¦ = ¦( f (x) − m) − ( f (x) − )¦ ≤ ¦ f (x) − ¦ + ¦ f (x) − m¦ <

1

2

¦ − m¦,

which is a contradiction.

Functions 57

a

l

!

"

Player A: This is my !.

Player B: I give up. No " ﬁts.

Since Player B cannot ﬁnd a " for a particular choice of ! made by Player A,it fo!ows tha"

the limit of f at a cannot be both l and #.

#

!

a

l

!

#

!

B

Exercise 2.7 In each of the following items, determine whether the limit ex-

ists, and if it does calculate its value. You have to calculate the limit directly from

its deﬁnition (without using, for example, limits arithmetic—see below).

' lim

2

f , where f (x) = 7x.

^ lim

0

f , where f (x) = x

2

+ 3.

· lim

0

f , where f (x) = 1/x.

´ lim

a

f , where

f (x) =

_

¸

¸

¸

¸

¸

_

¸

¸

¸

¸

¸

_

1 x > 0

0 x = 0

−1 x < 0,

and a ∈ R.

I lim

a

f , where f (x) = ¦x¦.

I lim

a

f , where f (x) =

√

¦x¦.

Exercise 2.8 Let I be an open interval in R, with a ∈ I. Let f , g : I \ ¦a¦ → R

(i.e., deﬁned on a punctured neighborhood of a). Let J ⊂ I be an open interval

containing the point a, such that f (x) = g(x) for all x ∈ J \ ¦a¦. Show that

lim

a

f exists if and only if lim

a

g exists, and if they do exist they are equal. (This

exercise shows that the limit of a function at a point only depends on its properties

in a neighborhood of that point).

58 Chapter 2

Exercise 2.9 The Riemann function R : R → R is deﬁned as

R(x) =

_

¸

¸

_

¸

¸

_

1/q x = p/q ∈ Q

0 x Q,

where the representation p/q is in reduced form (for example R(4/6) = R(2/3) =

1/3). Note that R(0) = R(0/1) = 1. Show that lim

a

R exists everywhere and is

equal to zero.

We have seen above various examples of limits. In each case, we had to “work

hard” to prove what the limit was, by showing that the deﬁnition was satisﬁed.

This becomes impractical when the functions are more complex. Thus, we need

to develop theorems that will make our task easier. But ﬁrst, a lemma:

Lemma 2.1 Let , m ∈ R. Then,

'

¦x − ¦ <

2

and ¦y − m¦ <

2

imply ¦(x + y) − ( + m)¦ < .

^

¦x − ¦ < min

_

1,

2(¦m¦ + 1)

_

and ¦y − m¦ < min

_

1,

2(¦¦ + 1)

_

imply

¦xy − m¦ < .

· If m 0 and

¦y − m¦ < min

_

¦m¦

2

,

¦m¦

2

2

_

,

then y 0 and

¸

¸

¸

¸

¸

1

y

−

1

m

¸

¸

¸

¸

¸

< .

(15 hrs, 2011)

Functions 59

Proof : The ﬁrst item is obvious (the triangle inequality). Consider the second

item. If the two conditions are satisﬁed then

¦x¦ = ¦x − + ¦ < ¦x − ¦ + ¦¦ < ¦¦ + 1,

hence,

¦xy − m¦ = ¦x(y − m) + m(x − )¦ ≤ ¦x¦¦y − m¦ + ¦m¦¦x − ¦

<

¦¦ + 1

2(¦¦ + 1)

+

¦m¦

2(¦m¦ + 1)

< .

For the third item it is clear that ¦y − m¦ < ¦m¦/2 implies that ¦y¦ > ¦m¦/2, and in

particular, y 0. Then

¸

¸

¸

¸

¸

1

y

−

1

m

¸

¸

¸

¸

¸

=

¦y − m¦

¦y¦¦m¦

<

¦m¦

2

2¦m¦(¦m¦/2)

= .

B

(13 hrs, 2009)

(15 hrs, 2010)

Theorem 2.2 (Limits arithmetic) Let f and g be two functions deﬁned in a neigh-

borhood of a, such that

lim

a

f = and lim

a

g = m.

Then,

lim

a

( f + g) = + m and lim

a

( f g) = m.

If, moreover, 0, then

lim

a

_

1

f

_

=

1

.

Proof : This is an immediate consequence of the above lemma. Since, given > 0,

we know that there exists a δ > 0, such that

8

¦ f (x) − ¦ <

2

and ¦g(x) − m¦ <

2

whenever x ∈ B

◦

(a, δ).

8

This δ > 0 is already the smallest of the two δ’s corresponding to each of the functions.

60 Chapter 2

It follows that

¦( f (x) + g(x)) − ( + m)¦ < whenever x ∈ B

◦

(a, δ).

There also exists a δ > 0, such that

¦ f (x) − ¦ < min

_

1,

2(¦m¦ + 1)

_

and ¦g(x) − m¦ < min

_

1,

2(¦¦ + 1)

_

whenever x ∈ B

◦

(a, δ). It follows that

¦ f (x)g(x) − m¦ < whenever x ∈ B

◦

(a, δ).

Finally, there exists a δ > 0, such that

¦ f (x) − ¦ < min

_

¦¦

2

,

¦¦

2

2

_

whenever x ∈ B

◦

(a, δ).

It follows that

¸

¸

¸

¸

¸

1

f (x)

−

1

¸

¸

¸

¸

¸

< whenever x ∈ B

◦

(a, δ).

B

Comment: It is important to point out that the fact that f + g has a limit at a does

not imply that either f or g has a limit at that point; take for example the functions

f (x) = 5 + 1/x and g(x) = 6 − 1/x at x = 0.

Example: What does it take now to show that

9

lim

x→a

x

3

+ 7x

5

x

2

+ 1

=

a

3

+ 7a

5

a

2

+ 1

.

We need to show that for all c ∈ R

lim

a

c = c,

and that for all k ∈ N,

lim

x→a

x

k

= a

k

,

which follows by induction once we show that for k = 1.

9

This is the standard notation to what we write in these notes as

f : ξ ·→

ξ

3

+ 7ξ

5

ξ

2

+ 1

and lim

a

f =

a

3

+ 7a

5

a

2

+ 1

.

Functions 61

Exercise 2.10 Show that

lim

x→a

x

n

= a

n

, a ∈ R, n ∈ N,

directly from the deﬁnition of the limit, i.e., without using limits arithmetic.

Exercise 2.11 Let f , g be deﬁned in a punctured neighborhood of a point a.

Prove or disprove:

' If lim

a

f and lim

a

g do not exist, then lim

a

( f + g) does not exist either.

^ If lim

a

f and lim

a

g do not exist, then lim

a

( f g) does not exist either

· If both lim

a

f and lim

a

( f + g) exist, then lim

a

g also exists.

´ If both lim

a

f and lim

a

( f g) exist, then lim

a

g also exists.

I If lim

a

f exists and lim

a

( f +g) does not exist, then lim

a

( f +g) does no exist.

I If g is bounded in a punctured neighborhood of a and lim

a

f = 0, then

lim

a

( f g) = 0.

We conclude this section by extending the notion of a limit to that of a one-sided

limit.

Deﬁnition 2.5 Let f : A → B with a ∈ A an interior point. We say that the limit

on the right (ימימ לובג) of f at a is , if for every > 0, there exists a δ > 0, such

that

¦ f (x) − ¦ < whenever x ∈ B

◦

+

(a, δ).

We write,

lim

a

+

f = .

An analogous deﬁnition is given for the limit on the left (לאמשמ לובג).

62 Chapter 2

a

l

a

l

! !

"

Player A: This is my !.

Player B: No problem. Here is my ".

Since Player B can ﬁnd a " for every choice of ! made by Player A,it fo!ows tha"

the right#limit of f at a is l.

Exercise 2.12 We denote by f : x ·→ [x| the lower integer value of x.

' Draw the graph of this function.

^ Find

lim

a

−

f and lim

a

+

f

for a ∈ R. Prove it based on the deﬁnition of one-sided limits.

Exercise 2.13 Show that f : (0, ∞) → R, f : x ·→

√

x satisﬁes

lim

a

f =

√

a,

for all a > 0.

2.4 Limits and order

In this section we will prove a number of properties pertinent to limits and order.

First a deﬁnition:

Deﬁnition 2.6 Let f : A → B and a ∈ A an interior point. We say that f is

locally bounded (תימוקמ המוסח) near a if there exists a punctured neighborhood

U of a, such that the set

¦ f (x) : x ∈ U¦

Functions 63

is bounded. Equivalently, there exists a δ > 0 such that the set

¦ f (x) : x ∈ B

◦

(a, δ)¦

is bounded.

Comment: As in the previous section, we consider punctured neighborhoods of a

point, and we don’t even care if the function is deﬁned at the point.

Proposition 2.1 Let f and g be two functions deﬁned in a punctured neighbor-

hood U of a point a. Suppose that

f (x) < g(x) ∀x ∈ U,

and that

lim

a

f = and lim

a

g = m.

Then ≤ m.

Comment: The very fact that f and g have limits at a is an assumption.

Comment: Note that even though f (x) < g(x) is a strict inequality, the resulting

inequality of the limits is in a weak sense. To see what this must be the case,

consider the example

f (x) = ¦x¦ and g(x) = 2¦x¦.

Even though f (x) < g(x) in an open neighborhood of 0,

lim

0

f = lim

0

g = 0.

Proof : Suppose, by contradiction, that > g and set =

1

2

( − m). Thus, there

exists a δ > 0 such that

(∀x ∈ B

◦

(a, δ)) : (¦ f (x) − ¦ < and ¦g(x) − m¦ < ).

In particular, for every such x,

f (x) > − = −

− m

2

= m +

− m

2

= m + > g(x),

which is a contradiction. B

64 Chapter 2

Proposition 2.2 Let f and g be two functions deﬁned in a punctured neighbor-

hood U of a point a. Suppose that

lim

a

f = and lim

a

g = m,

with < m. Then there exists a δ > 0 such that

(∀x ∈ B

◦

(a, δ)) : ( f (x) < g(x)).

Comment: This time both inequalities are strong.

Proof : Let =

1

2

(m − ). There exists a δ > 0 such that

(∀x ∈ B

◦

(a, δ)) : (¦ f (x) − ¦ < and ¦g(x) − m¦ < ).

For every such x,

f (x) < + < + + (g(x) − m + ) = g(x).

B

Proposition 2.3 Let f , g, h be deﬁned in a punctured neighborhood U of a. Sup-

pose that

(∀x ∈ U) : ( f (x) ≤ g(x) ≤ h(x)),

and

lim

a

f = lim

a

h = .

Then

lim

a

g = .

Proof : It is given that for every > 0 there exists a δ > 0 such that

(∀x ∈ B

◦

(a, δ)) : ( − < f (x) and h(x) < + ).

Functions 65

Then for every such x,

− < f (x) ≤ g(x) ≤ h(x) < + ,

i.e.,

(∀x ∈ B

◦

(a, δ)) : (¦g(x) − ¦ < ).

B

(17 hrs, 2011)

Proposition 2.4 Let f be deﬁned in a punctured neighborhood U of a. If the limit

lim

a

f =

exists, then f is locally bounded near a.

Proof : Immediate from the deﬁnition of the limit. B

Proposition 2.5 Let f be deﬁned in a punctured neighborhood of a point a. Then,

lim

a

f = if and only if lim

a

( f − ) = 0.

Proof : Very easy. B

Proposition 2.6 Let f and g be deﬁned in a punctured neighborhood of a point

a. Suppose that

lim

a

f = 0

whereas g is locally bounded near a. Then,

lim

a

( f g) = 0.

Proof : Very easy. B

66 Chapter 2

Comment: Note that we do not require g to have a limit at a.

With the above tools, here is another way of proving the product property of the

arithmetic of limits (this time without and δ).

Proposition 2.7 (Arithmetic of limits, product) Let f , g be deﬁned in a punc-

tured neighborhood of a point a, and

lim

a

f = and lim

a

g = m.

Then

lim

a

( f g) = m.

Proof : Write

f g − m = ( f − )

.,,.

limit zero

g

.,,.

loc. bdd

.,,.

limit zero

− (g − m)

.,,.

limit zero

.,,.

limit zero

.

B

(17 hrs, 2010)

2.5 Continuity

Deﬁnition 2.7 A function f : A → R is said to be continuous at an inner point

a ∈ A, if it has a limit at a, and

lim

a

f = f (a).

Comment: If f has a limit at a and the limit diﬀers from f (a) (or that f (a) is

undeﬁned) , then we say that f has a removable discontinuity (הקילס תופיצר יא)

at a.

Examples:

' We saw that the function f (x) = x

2

has a limit at 3, and that this limit was

equal 9. Hence, f is continuous at x = 3.

Functions 67

^ We saw that the function f (x) = 1/x has a limit at any a > 0 and that this

limit equals 1/a. By a similar calculation we could have shown that it has a

limit at any a < 0 and that this limit also equals 1/a. Hence, f is continuous

at all x 0. Since f is not even deﬁned at x = 0, then it is not continuous at

that point.

· The function f (x) = x sin 1/x (it was seen in the tutoring session) is contin-

uous at all x 0. At zero it is not deﬁned, but if we deﬁne f (0) = 0, then it

is continuous for all x ∈ R (by the bounded times limit zero argument).

´ The function

f : x ·→

_

¸

¸

_

¸

¸

_

x x ∈ Q

0 x Q,

is continuous at x = 0, however it is not continuous at any other point,

because the limit of f at a 0 does not exist.

I The elementary functions sin and cos are continuous everywhere. To prove

that sin is continuous, we need to prove that for every > 0 there exists a

δ > 0 such that

¦ sin x − sin a¦ < whenever x ∈ B

◦

(a, δ).

We will take it for granted at the moment.

Our theorems on limit arithmetic imply right away similar theorems for continu-

ity:

Theorem 2.3 If f : A → R and g : A → R are continuous at a ∈ A then f +g and

f g are continuous at a. Moreover, if f (a) 0 then 1/ f is continuous at a.

Proof : Obvious. B

Comment: Recall that in the deﬁnition of the limit, we said that the limit of f at a

is if for every > 0 there exists a δ > 0 such that

¦ f (x) − ¦ < whenever x ∈ B

◦

(a, δ).

There was an emphasis on the fact that the point x itself is excluded. Continuity

is deﬁned as that for every > 0 there exists a δ > 0 such that

¦ f (x) − f (a)¦ < whenever x ∈ B

◦

(a, δ).

68 Chapter 2

Note that we may require that this be true whenever x ∈ B(a, δ). There is no need

to exclude the point x = a.

(19 hrs, 2010)

With that, we have all the tools to show that a function like, say,

f (x) =

sin

2

x + x

2

+ x

4

sin x

1 + sin

2

x

is continuous everywhere in R. But what about a function like sin x

2

. Do we have

the tools to show that it is continuous. No. We don’t yet have a theorem for the

composition of continuous functions.

Theorem 2.4 Let g : A → B and f : B → C. Suppose that g is continuous at an

inner point a ∈ A, and that f is continuous at g(a) ∈ B, which is an inner point.

Then, f ◦ g is continuous at a.

Proof : Let > 0. Since f is continuous at g(a), then there exists a δ

1

> 0, such

that

¦ f (y) − f (g(a))¦ < whenever y ∈ B(g(a), δ

1

).

Since g is continuous at a, then there also exists a δ

2

> 0, such that

¦g(x) − g(a)¦ < δ

1

or g(x) ∈ B(g(a), δ

1

) whenever x ∈ B(a, δ

2

).

Combining the two we get that

¦ f (g(x)) − f (g(a))¦ < whenever B(a, δ

2

).

B

Deﬁnition 2.8 A function f is said to be right-continuous (ימימ הפיצר) at a if

lim

a

+

f = f (a).

A similar deﬁnition is given for left-continuity (לאמשמ הפיצר).

So far, we have only dealt with continuity at points. Usually, we are interested in

continuity on intervals.

Functions 69

Deﬁnition 2.9 A function f is said to be continuous on an open interval (a, b) if

it is continuous at all x ∈ (a, b). It is said to be continuous on a closed interval

[a, b] if it is continuous on the open interval, and in addition it is right-continuous

at a and left-continuous at b.

Comment: We usually think of continuous function as “well-behaved”. One

should be careful with such interpretations; see for example

f (x) =

_

¸

¸

_

¸

¸

_

x sin

1

x

x 0

0 x = 0.

Exercise 2.14 Suppose f : R → R satisﬁes f (a + b) = f (a) + f (b) for all

a, b ∈ R. )We have already seen that this implies that f (x) = cx, for some constant

c for all x ∈ Q.) Suppose that in addition f is continuous at zero. Prove that f is

continuous everywhere.

Exercise 2.15 ' Prove that if f : A → B is continuous in A, then so is ¦ f ¦

(where ¦ f ¦(x)

def

= ¦ f (x)¦).

^ Prove that if f : A → R and g : A → R are continuous then so are max( f , g)

and min( f , g).

Exercise 2.16 Suppose that g, h : R → R are continuous at a and satisfy

g(a) = h(a). We deﬁne a new function that “glues” g and h at a:

f (x) =

_

¸

¸

_

¸

¸

_

g(x) x ≤ a

h(x) x > a.

Prove that f is continuous at a.

Exercise 2.17 Here is another “gluing” problem: Suppose that g is continuous

in [a, b], that h is continuous is [b, c], and g(b) = h(b). Deﬁne

f (x) =

_

¸

¸

_

¸

¸

_

g(x) x ∈ [a, b]

h(x) x ∈ [b, c].

Prove that f is continuous in [a, c].

70 Chapter 2

Example: Here is another “crazy function” due to Johannes Karl Thomae (1840-

1921):

r(x) =

_

¸

¸

_

¸

¸

_

1/q x = p/q

0 x Q,

where x = p/q assumes that x is rational in reduced form. This function has the

wonderful property of being continuous at all x Qand discontinuous at all x ∈ Q

(this is because its limit is everywhere zero). It took some more time until Vito

Volterra proved in 1881 that there can be no function that is continuous on Q and

discontinuous on R \ Q.

2.6 Theorems about continuous functions

Theorem 2.5 Suppose that f is continuous at a and f (a) > 0. Then there exists a

neighborhood of a in which f (x) > 0. That is, there exists a δ > 0 such that

f (x) > 0 whenever x ∈ B(a, δ).

Comment: An analogous theorem holds if f (a) < 0. Also, a one-sided version

can be proved, whereby if f is right-continuous at a and f (a) > 0, then

f (x) > 0 whenever x ∈ B

+

(a, δ).

Comment: Note that continuity is only required at the point a.

Proof : Since f is continuous at a, there exists a δ > 0 such that

¦ f (x) − f (a)¦ <

f (a)

2

whenever x ∈ B(a, δ).

But, ¦ f (x) − f (a)¦ <

f (a)

2

implies

f (a) − f (x) ≤ ¦ f (a) − f (x)¦ <

f (a)

2

,

Functions 71

i.e., f (x) > f (a)/2 > 0.

a

f!a"

1/2 f!a"

3/2 f!a"

!

B

This theorem can be viewed as a lemma for the following important theorem:

(15 hrs, 2009)

(20 hrs, 2010)

Theorem 2.6 (Intermediate value theorem ייניבה רע טפשמ) Suppose f is

continuous on an interval [a, b], with f (a) < 0 and f (b) > 0. Then, there ex-

ists a point c ∈ (a, b), such that f (c) = 0.

(19 hrs, 2011)

Comment: Continuity is required on the whole intervals, for consider f : [0, 1] →

R:

f (x) =

_

¸

¸

_

¸

¸

_

−1 0 ≤ x <

1

2

+1

1

2

≤ x ≤ 1.

Comment: If f (a) > 0 and f (b) < 0 the same conclusion holds for just replace f

by (−f ).

Proof : Consider the set

A = ¦x ∈ [a, b] : f (y) < 0 for all y ∈ [a, x]¦.

72 Chapter 2

The set is non empty for it contains a. Actually, the previous theorems guarantees

the existence of a δ > 0 such that [a, a+δ) ⊆ A. This set is also bounded by b. The

previous theorem guarantees even the existence of a δ > 0 such that b − δ is an

upper bound for A. By the axiom of completeness, there exists a number c ∈ (a, b)

such that

c = sup A.

f!a"

b c

f!b"

a

A

Suppose it were true that f (c) < 0. By the previous theorem, there exists a δ > 0

such that

f (x) < 0 whenever c ≤ x ≤ c + δ,

Because c is the least upper bound of a it follows that also

f (x) < 0 whenever x < c,

i.e., f (x) < 0 for all x ∈ [a, c + δ], which means that c + δ ∈ A, contradicting the

fact that c is an upper bound for A.

Suppose it were true that f (c) > 0. By the previous theorem, there exists a δ > 0

such that

f (x) > 0 whenever c − δ ≤ x ≤ c,

which implies that c − δ is an upper bound for A, i.e., c cannot be the least upper

bound of A. By the trichotomy property, we conclude that f (c) = 0. B

Corollary 2.1 If f is continuous on a closed interval [a, b], such that

f (a) < α < f (b),

then there exists a point c ∈ (a, b) at which f (c) = α.

Functions 73

Proof : Apply the previous theorem for g(x) = f (x) − α. B

Exercise 2.18 Suppose that f is continuous on [a, b] with f (x) ∈ Q for all

x ∈ [a, b]. What can be said about f ?

Exercise 2.19 (i) Suppose that f , g : [a, b] → R are continuous such that

f (a) < g(a) and f (b) > g(b). Show that there exists a point c ∈ [a, b] such that

f (c) = g(c). (ii) A ﬂy ﬂies from Tel-Aviv to Or Akiva (along the shortest path).

He starts its way at 10:00 and ﬁnishes at 16:00. The following day are makes his

way back along the very same path, starting again at 10:00 and ﬁnishing at 16:00.

Show that there will be a point along the way that he will cross a the exact same

time on both days.

Exercise 2.20 (i) Let f : [0, 1] → R be continuous, f (0) = f (1). Show

that there exists an x

0

∈ [0,

1

2

], such that f (x

0

) = f (x

0

+

1

2

). (Hint: consider

g(x) = f (x +

1

2

) − f (x).) (ii) Show that there exists an x

0

∈ [0, 1 −

1

n

], such that

f (x

0

) = f (x

0

+

1

n

).

Lemma 2.2 If f is continuous at a, then there is a δ > 0 such that f has an upper

bound on the interval (a − δ, a + δ).

Proof : This is obvious, for by the deﬁnition of continuity, there exists a δ > 0,

such that

f (x) ∈ B( f (a), 1) whenever x ∈ B(a, δ),

i.e., f (x) < f (a) + 1 on this interval. B

Theorem 2.7 (Karl Theodor Wilhelm Weierstraß) If f is continuous on [a, b]

then it is bounded from above of that interval, that is, there exists a number M,

such that f (x) < M for all x ∈ [a, b].

Comment: Here too, continuity is needed on the whole interval, for look at the

“counter example” f : [−1, 1] → R,

f (x) =

_

¸

¸

_

¸

¸

_

1/x x 0

0 x = 0.

74 Chapter 2

Comment: It is crucial that the interval [a, b] be closed. The function f : (0, 1] →

R, f : x ·→ 1/x is continuous on the semi-open interval, but it is not bounded from

above.

Proof : Let

A = ¦x ∈ [a, b] : f is bounded from above on [a, x]¦.

By the previous lemma, there exists a δ > 0 such that a + δ ∈ A. Also A is upper

bounded by b, hence there exists a c ∈ (a, b], such that

c = sup A.

We claim that c = b, for if c < b, then by the previous lemma, there exists a

neighborhood of c in which f is upper bounded, and c cannot be an upper bound

for A. B

Comment: In essence, this theorem is based on the fact that if a continuous func-

tion is bounded up to a point, then it is bounded up to a little farther. To be able to

take such increments up to b we need the axiom of completeness.

(17 hrs, 2009)

(22 hrs, 2010)

(20 hrs, 2011)

Theorem 2.8 (Weierstraß, Maximum principle ומיסקמה ורקע) If f is con-

tinuous on [a, b], then there exists a point c ∈ [a, b] such that

f (x) ≤ f (c) for all x ∈ [a, b].

Comment: Of course, there is a corresponding minimum principle.

Proof : We have just proved that f is upper bounded on [a, b], i.e., the set

A = ¦ f (x) : a ≤ x ≤ b¦

Functions 75

is upper bounded. This set is non-empty for it contains the point f (a). By the

axiom of completeness it has a least upper bound, which we denote by

α = sup A.

We need to show that this supremum is in fact a maximum; that there exists a

point c ∈ [a, b], for which f (c) = α.

Suppose, by contradiction, that this were not the case, i.e., that f (x) < α for all

x ∈ [a, b]. We deﬁne then a new function g : [a, b] → R,

g =

1

α − f

.

This function is deﬁned everywhere on [a, b] (since we assumed that the denom-

inator does not vanish), it is continuous and positive. We will show that g is not

upper bounded on [a, b], contradicting thus the previous theorem. Indeed, since

α = sup A:

For all M > 0 there exists a y ∈ [a, b] such that f (y) > α − 1/M.

For this y,

g(y) >

1

α − (α − 1/M)

= M,

i.e., for every M > 0 there exists a point in [a, b] at which f takes a value greater

than M. B

Comment: Here too we needed continuity on a closed interval, for consider the

“counter examples”

f : [0, 1) → R f (x) = x

2

,

and

f : [0, 1] → R f (x) =

_

¸

¸

_

¸

¸

_

x

2

x < 1

0 x = 1

.

These functions do not attain a maximum in [0, 1].

Exercise 2.21 Let f : [a, b] → R be continuous, with f (a) = f (b) = 0 and

M = sup¦ f (x) : a ≤ x ≤ b¦ > 0.

Prove that for every 0 < c < M, the set ¦x ∈ [a, b] : f (x) = c¦ has at least two

elements.

76 Chapter 2

Exercise 2.22 Let f : [a, b] → R be continuous. Prove or disprove each of

the following statements:

' It is possible that f (x) > 0 for all x ∈ [a, b], and that for every > 0 there

exists an x ∈ [a, b] for which f (x) < .

^ There exists an x

0

∈ [a, b] such that for every [a, b] ÷ x x

0

, f (x) < f (x

0

).

Theorem 2.9 If n ∈ N is odd, then the equation

f (x) = x

n

+ a

n−1

x

n−1

+ . . . a

1

x + a

0

= 0

has a root (a solution) for any set of constants a

0

, . . . , a

n−1

.

Proof : The idea it to show that existence of points a, b for which f (b) > 0 and

f (a) < 0, and apply the intermediate value theorem, based on the fact that poly-

nomials are continuous. The only technical issue is to ﬁnd such points a, b in a

way that works for all choices of a

0

, . . . , a

n−1

.

Let

M = max(1, 2n¦a

0

¦, . . . , 2n¦a

n−1

¦).

Then for ¦x¦ > M,

f (x)

x

n

= 1 +

a

n−1

x

+ +

a

0

x

n

(u + v ≥ u − ¦v¦) ≥ 1 −

¦a

n−1

¦

¦x¦

− −

¦a

0

¦

¦x

n

¦

(¦x

n

¦ > ¦x¦) ≥ 1 −

¦a

n−1

¦

¦x¦

− −

¦a

0

¦

¦x¦

(¦x¦ > ¦a

i

¦) ≥ 1 −

¦a

n−1

¦

2n¦a

n−1

¦

− −

¦a

0

¦

2n¦a

0

¦

= 1 − n

1

2n

> 0.

Since the sign of x

n

is the same as the sign of x, it follows that f (x) is positive for

x ≥ M and negative for x ≤ −M, hence there exists a a < c < b such that f (c) = 0.

B

(18 hrs, 2009)

The next theorem deals with the case where n is even:

Functions 77

Theorem 2.10 Let n be even. Then the function

f (x) = x

n

+ a

n−1

x

n−1

+ . . . a

1

x + a

0

= 0

has a minimum. Namely, there exists a c ∈ R, such that f (c) ≤ f (x) for all x ∈ R.

Proof : The idea is simple. We are going to show that there exists a b > 0 for

which f (x) > f (0) for all ¦x¦ ≥ b. Then, looking at the function f in the interval

[−b, b], we know that it assumes a minimum in this interval, i.e., that there exists

a c ∈ [−b, b] such that f (c) ≤ f (x) for all x ∈ [−b, b]. In particular, f (c) ≤ f (0),

which in turn is less than f (x) for all x [−b, b]. It follows that f (c) ≤ f (x) for

all x ∈ R.

It remains to ﬁnd such a b. Let M be deﬁned as in the previous theorem. Then,

for all ¦x¦ > M, using the fact that n is even,

f (x) = x

n

_

1 +

a

n−1

x

+ +

a

0

x

n

_

≥ x

n

_

1 −

¦a

n−1

¦

¦x¦

− −

¦a

0

¦

¦x

n

¦

_

≥ x

n

_

1 −

¦a

n−1

¦

¦x¦

− −

¦a

0

¦

¦x¦

_

≥ x

n

_

1 −

¦a

n−1

¦

2n¦a

n−1

¦

− −

¦a

0

¦

2n¦a

0

¦

_

=

1

2

x

n

.

Let then b > max(M,

n

_

2¦ f (0)¦), from which follows that f (x) > f (0) for ¦x¦ > b.

This concludes the proof. B

Corollary 2.2 Consider the equation

f (x) = x

n

+ a

n−1

x

n−1

+ . . . a

1

x + a

0

= α

with n even. Then there exists a number m such that this equation has a solution

for all α ≥ m but has no solution for α < m.

78 Chapter 2

Proof : According to the previous theorem there exists a c ∈ R such that f (c) is

the minimum of f . If α < f (c) then for all x, f (x) ≥ f (c) > α, i.e., there is no

solution. For α = f (c), c is a solution. For α > f (c) we have f (c) < α and for

large enough x, f (x) > α, hence, by the intermediate value theorem a root exists.

B

2.7 Inﬁnite limits and limits at inﬁnity

In this section we extend the notions of limits discussed in previous sections to

two cases: (i) the limit of a function at a point is inﬁnite, and (ii) the limit of a

function at inﬁnity. Before we start recall: inﬁnity is not a real number!.

Deﬁnition 2.10 Let f : A → B be a function. We say that the limit of f at a is

inﬁnity, denoted

lim

a

f = ∞,

if for every M there exists a δ > 0, such that

f (x) > M whenever 0 < ¦x − a¦ < δ.

Similarly,

lim

a

f = −∞,

if for every M there exists a δ > 0, such that

f (x) < M whenever 0 < ¦x − a¦ < δ.

Example: Consider the function

f : x ·→

_

¸

¸

_

¸

¸

_

1/¦x¦ x 0

17 x = 0.

The limit of f at zero is inﬁnity. Indeed given M we take δ = 1/M, then

0 < ¦x¦ < δ implies f (x) > M.

Functions 79

Deﬁnition 2.11 Let f : R → R. We say that

lim

∞

f = ,

if for every > 0 there exists an M, such that

¦ f (x) − ¦ < whenever x > M.

Similarly,

lim

−∞

f = ,

if for every > 0 there exists an M, such that

¦ f (x) − ¦ < whenever x < M.

Example: Consider the function f : x ·→ 3 + 1/x

2

. Then the limit of f at inﬁnity

is 3, since given > 0 we take M = 1/

√

, and

x > M implies ¦ f (x) − 3¦ < .

**Comment: These deﬁnitions are in full agreement with all previous deﬁnitions of
**

limits, if we adopt the idea that “neighborhoods of inﬁnity” are sets of the form

(M, ∞).

Exercise 2.23 Prove directly from the deﬁnition that

lim

x→3

+

x

3

+ 2x + 1

x − 3

= ∞.

Exercise 2.24 Calculate the following limit using limit arithmetic,

lim

x→∞

_ √

x

2

+ 2x − x

_

.

Exercise 2.25 Prove that

lim

x→∞

f (x) exists if and only if lim

x→0

+

f (1/x) exist,

in which case they are equal.

80 Chapter 2

Figure 2.1: Inﬁnite limit and limit at inﬁnity.

Exercise 2.26 Let g be deﬁned in a punctured neighborhood of a ∈ R, with

lim

a

g = ∞. Show that lim

a

1/g = 0.

Exercise 2.27

' Let f , g be deﬁned in a punctured neighborhood of a ∈ R, with lim

a

f =

lim

a

g = −∞. Show that lim

a

( f g) = ∞.

^ Let f , g be deﬁned in a punctured neighborhood of a ∈ R, with lim

a

f =

K < 0 and lim

a

g = −∞. Show that lim

a

( f /g) = 0.

(24 hrs, 2010)

2.8 Inverse functions

Suppose that f : A → B is one-to-one and onto. This means that for every b ∈ B

there exists a unique a ∈ A, such that f (a) = b. This property deﬁnes a function

from B to A. In fact, this function is also one-to-one and onto. We call this

function the function inverse to f (תיכפוה היצקנופ), and denote it by f

−1

(not to

be mistaken with 1/ f ). Thus, f

−1

: B → A,

f

−1

(y) = ¦x ∈ A : f (x) = y¦.

Functions 81

10

Note also that f

−1

◦ f is the identity in A whereas f ◦ f

−1

is the identity in B.

Comment: Recall that a function f : A → B can be deﬁned as a subset of A × B,

Graph( f ) = ¦(a, b) : f (a) = b¦.

The inverse function can be deﬁned as a subset of B × A: it is all the pairs (b, a)

for which (a, b) ∈ Graph( f ), or

Graph( f

−1

) = ¦(b, a) : f (a) = b¦.

Example: Let f : [0, 1] → R be given by f : x ·→ x

2

. This function is one-to-

one and its image is the segment [0, 1]. Thus we can deﬁne an inverse function

f

−1

: [0, 1] → [0, 1] by

f

−1

(y) = ¦x ∈ [0, 1] : x

2

= y¦,

i.e., it is the (positive) square root.

(22 hrs, 2011)

Theorem 2.11 If f : I → R is continuous and one-to-one (I is an interval), then

f is either monotonically increasing, or monotonically decreasing.

Comment: Continuity is crucial. The function f : [0, 2] → R,

f (x) =

_

¸

¸

_

¸

¸

_

x 0 ≤ x ≤ 1

6 − x 1 < x ≤ 2

is one-to-one with image [0, 1] ∪ [4, 5), but it is not monotonic.

10

Strictly speaking, the right-hand side is a set of real numbers, and not a real number. However,

the one-to-one and onto properties of f guarantee that it contains one and only one number, hence

we identify this singleton (ודיחי) with the unique element it contains.

82 Chapter 2

a b c

f!b"

f!a"

f!c"

Figure 2.2: Illustration of proof

Proof : We ﬁrst prove that for every three points a < b < c, either

f (a) < f (b) < f (c), or f (a) > f (b) > f (c).

Suppose that f (a) < f (c) and that, by contradiction, f (b) < f (a) (see Figure 2.2).

By the intermediate value theorem, there exists a point x ∈ (b, c) such that f (x) =

f (a), contradicting the fact that f is one-to-one. Similarly, if f (b) > f (c), then

there exists a point y ∈ [a, b] such that f (y) = f (c). Thus, f (a) < f (c) implies that

f (a) < f (b) < f (c). We proceed similarly if f (a) > f (c).

It follows as once that for every four points a < b < c < d, either

f (a) < f (b) < f (c) < f (d) or f (a) > f (b) > f (c) > f (d).

Fix now a and b, and suppose w.l.o.g that f (a) < f (b) (they can’t be equal). Then,

for every x, y such that x < y, we apply the above arguments to the four points

a, b, x, y (whatever their order is) and conclude that f (x) < f (y).

B

Suppose that the function f : [a, b] → R is one-to-one and monotonically in-

creasing (without loss of generality). Then, its image is the interval [ f (a), f (b)].

Indeed, f (a) and f (b) are in the range, and by the intermediate value theorem,

so is any point in this interval. By monotonicity, no point outside this interval

Functions 83

can be in the range of f . It follows that the inverse function has range and im-

age f

−1

: [ f (a), f (b)] → [a, b]. We have proved, en passant, that continuous

one-to-one functions map closed segments into closed segments

11

.

Theorem 2.12 If f : [a, b] → R is continuous and one-to-one then so is f

−1

.

Proof : Without loss of generality, let us assume that f is monotonically increas-

ing, i.e.,

x < y implies f (x) < f (y).

Then, so is f

−1

. We need to show that for every y

0

∈ ( f (a), f (b)),

lim

y

0

f

−1

= f

−1

(y

0

).

Equivalently, we need to show that for every > 0 there exists a δ > 0, such that

¦ f

−1

(y) − f

−1

(y

0

)¦ < whenever ¦y − y

0

¦ < δ.

Now, every such y

0

is equal to f (x

0

) for a unique x

0

. Since f (and f

−1

) are both

monotonically increasing, it is clear how to set δ. Let

y

1

= f (x

0

− ) and y

2

= f (x

0

+ ).

By monotonicity, y

1

< y

0

< y

2

. We let then δ = min(y

0

− y

1

, y

2

− y

0

). Every

y

0

− δ < y < y

0

+ δ is in the interval (y

1

, y

2

), and therefore,

f

−1

(y

0

) − < f

−1

(y) < f

−1

(y

0

) + .

We can do it more formally. Let

δ = min( f ( f

−1

(y

0

) + ) − y

0

, y

0

− f ( f

−1

(y

0

) − )).

Then, ¦y − y

0

¦ < δ implies

y

0

− δ < y < y

0

+ δ,

or,

f ( f

−1

(y

0

)−) = y

0

−[y

0

−f ( f

−1

(y

0

)−)] < y < y

0

+[ f ( f

−1

(y

0

)+)−y

0

] = f ( f

−1

(y

0

)+).

11

More generally, a continuous one-to-one function maps connected sets into connected sets,

and retains the open/closed properties.

84 Chapter 2

Since f

−1

is monotonically increasing, we may apply f

−1

on all terms, giving

f

−1

(y

0

) − < f

−1

(y) < f

−1

(y

0

) + ,

or ¦ f

−1

(y

0

) − f

−1

(y)¦ < .

f!x

0

"=y

0

f

#1

!y

0

"=x

0

f!x

0

+!"

f!x

0

#!"

!

"

B

(20 hrs, 2009)

(26 hrs, 2010)

2.9 Uniform continuity

Recall the deﬁnition of a continuous function: a function f : (a, b) → R is contin-

uous on the interval (a, b), if it is continuous at every point in the interval. That is,

for every x ∈ (a, b) and every > 0, there exists a δ < 0, such that

¦ f (y) − f (x)¦ < whenever ¦y − x¦ < δ and y ∈ (a, b).

In general, we expect δ to depend both on x and on . In fact, the way we set

it here, this deﬁnition applies for continuity on a closed interval as well (i.e, it

includes one-sided continuity as well).

Let us re-examine two examples:

Functions 85

Example: Consider the function f : (0, 1) → R, f : x ·→ x

2

. This function is

continuous on (0, 1). Why? Because for every x ∈ (0, 1), and every y ∈ (0, 1),

¦ f (y) − f (x)¦ = ¦y

2

− x

2

¦ = ¦y − x¦¦y + x¦ ≤ (x + 1)¦y − x¦.

Thus, given x and > 0, if we choose δ = δ(, x) = /(x + 1), then

¦ f (y) − f (x)¦ < whenever ¦x − y¦ < δ(, x) and y ∈ (0, 1).

Thus, δ depends both on and x. However, it is always legitimate to replace δ by

a smaller number. If we take δ = /2, then the same δ ﬁts all points x.

Example: Consider next the function f : (0, 1) → R, f : x ·→ 1/x. This function

is also continuous on (0, 1). Why? Given x ∈ (0, 1) and y ∈ (0, 1),

¦ f (x) − f (y)¦ =

¦x − y¦

xy

=

¦x − y¦

x[x + (y − x)]

.

If we take δ(, x) = min(x/2, x

2

/2), then

¦x − y¦ < δ implies ¦ f (x) − f (y)¦ <

δ

x(x − δ)

< .

Here, given , the closer x is to zero, the smaller is δ(, x). There is no way we

can ﬁnd a δ = δ() that would ﬁt all x.

These two examples motivate the following deﬁnition:

Deﬁnition 2.12 f : A → B is said to be uniformly continuous on A (הפיצר

הווש הדימב) if for every > 0 corresponds a δ = δ() > 0, such that

¦ f (y) − f (x)¦ < whenever ¦y − x¦ < δ and x, y ∈ A.

Note that x and y play here symmetric roles

12

.

What is the essential diﬀerence between the two above examples? The func-

tion x ·→ x

2

could have been deﬁned on the closed interval [0, 1] and it would

have been continuous there too. In contrast, there is no way we could have de-

ﬁned the function x ·→ 1/x as a continuous function on [0, 1], even if we took

care of its value at zero. We have already seen examples where continuity on a

12

The term “uniform” means that the same number can be used for all points.

86 Chapter 2

closed interval had strong implications (ensures boundedness and the existence of

a maximum). This is also the case here. We will prove that continuity on a closed

interval implies uniform continuity.

We proceed by steps:

Lemma 2.3 Let f : [a, c] → R be continuous. Let a < b < c, and suppose that

for a given > 0 there exist δ

1

, δ

2

> 0, such that

x, y ∈ [a, b] and ¦x − y¦ < δ

1

implies ¦ f (x) − f (y)¦ <

x, y ∈ [b, c] and ¦x − y¦ < δ

2

implies ¦ f (x) − f (y)¦ < .

Then there exists a δ > 0 such that

x, y ∈ [a, c] and ¦x − y¦ < δ implies ¦ f (x) − f (y)¦ < .

Proof : We have to take care at the “gluing” at the point b. Since the function is

continuous at b, then there exists a δ

3

> 0 such that

¦ f (x) − f (b)¦ <

2

whenever ¦x − b¦ < δ

3

.

Hence,

¦x − b¦ < δ

3

and ¦y − b¦ < δ

3

implies ¦ f (x) − f (y)¦ < . (2.1)

Take now δ = min(δ

1

, δ

2

, δ

3

). We claim that

¦ f (x) − f (y)¦ < whenever ¦x − y¦ < δ and x, y ∈ [a, c].

Why is that true? One of three: either x, y ∈ [a, b], in which case it follows from

the fact that δ ≤ δ

1

, or x, y ∈ [b, c], in which case it follows from the fact that

δ ≤ δ

2

. Remains the third case, where, without loss of generality, x ∈ [a, b] and

y ∈ [b, c]. Since ¦x −y¦ < δ

3

, it follows that ¦x −b¦ < δ

3

and ¦y −b¦ < δ

3

and we use

(2.1). B

(24 hrs, 2011)

Functions 87

Theorem 2.13 If f is continuous on [a, b] then it is uniformly continuous on that

interval.

Proof : Given > 0, we will say that f is -good on [a, b] if there exists a δ > 0,

such that

¦ f (x) − f (y)¦ < whenever ¦x − y¦ < δ and x, y ∈ [a, b]

(i.e., if for that particular there exists a corresponding δ). We need to show that

f is -good on [a, b] for all > 0. Note that we proved in Lemma 2.3 that if f is

-good on [a, b] and on [b, c], then it is -good on [a, c].

Given , deﬁne

A = ¦x ∈ [a, b] : f is -good on [a, x]¦.

The set A is non-empty, as it includes the point a (trivial) and it is bounded by b.

Hence it has a supremum, which we denote by α. Suppose that α < b. Since f is

continuous at α, then there exists a δ

0

> 0 such that

¦ f (y) − f (α)¦ <

2

whenever ¦y − α¦ < δ

0

and y ∈ [a, b].

Thus,

∀y, z ∈ B(α, δ

0

), ¦ f (y) − f (z)¦ ≤ ¦ f (y) − f (α)¦ + ¦ f (z) − f (α)¦ < ,

fromwhich follows that f is -good in (α−δ

0

, α+δ

0

)∩[a, b] and hence in the closed

interval [α − δ

0

/2, α + δ

0

/2] ∩ [a, b]. By the deﬁnition of α, it is also -good on

[a, α−δ

0

/2]∩[a, b], and by the previous lemma, f is -good on [a, α+δ

0

/2]∩[a, b],

contradicting the fact that α is an upper bound for A. Thus f is -good on [a, b].

Since this argument holds independently of we conclude that f is -good in [a, b]

for all > 0. B

Comment: In this course we will make use of uniform continuity only once, when

we study integration.

(27 hrs, 2010)

Examples:

88 Chapter 2

' Consider the function f (x) = sin(1/x) deﬁned on (0, 1). Even though it is

continuous, it is not uniformly continuous.

^ Consider the function Id : R → R. It is easy to see that it is uniformly

continuous. The function Id Id, however, is continuous on R, but not uni-

formly continuous. Thus, the product of uniformly continuous functions is

not necessarily uniformly continuous.

· Finally the function f (x) = sin x

2

is continuous and bounded on R, but it is

not uniformly continuous.

(22 hrs, 2009)

Exercise 2.28 Let f : R → R be continuous. Prove or disprove:

' If

lim

∞

f = lim

−∞

f = 0,

then f attains a maximum value.

^ If lim

∞

f = ∞ and lim

−∞

f = −∞, then f assumes all real values; that is f

is on R.

· If on any interval [n, n + 1], n ∈ Z, f is uniformly continuous, then f is

uniformly continuous on R.

Exercise 2.29 Suppose A ⊆ R is non-empty.

' Prove that if f , g are uniformly continuous on A then αf + βg is also uni-

formly continuous on A, where α, β ∈ R.

^ Is in the above case f g necessarily uniformly continuous?

Exercise 2.30 Let f : (0, ∞) → R be given by f : x ·→ 1/

√

x:

' Let a > 0. Prove that f is uniformly continuous on [a, ∞).

^ Show that f is not uniformly continuous on (0, ∞).

Exercise 2.31 Let I be an open connected domain (i.e., an open segment, an

open ray, or the whole line) and let f , g : I → R be uniformly continuous on I.

Prove that f g is uniformly continuous on I. (Hint: read the proof about the limit

of a product of functions being the product of their limits.)

Functions 89

Exercise 2.32 Let f : A → B and g : B → C be uniformly continuous. Show

that g ◦ f is uniformly continuous.

Exercise 2.33 Let a > 0. Show that f (x) = sin(1/x) is not uniformly contin-

uous on (0, a).

Exercise 2.34 For which values of α > 0 is the function f (x) = x

α

uniformly

continuous on [0, ∞)?

Exercise 2.35 Let α > 0. We say that f : A → B is H¨ older continuous of

order α if there exists a constant M > 0, such that

¦ f (x) − f (y)¦ < M¦x − y¦

α

, ∀x, y ∈ A.

' Show that f (x) =

√

x is H¨ older continuous of order 1/2 on (0, ∞).

^ Prove that if f is H¨ older continuous of order α on A then it is uniformly

continuous on A.

· Prove that if f is H¨ older continuous of order α > 1, then it is a constant

function.

90 Chapter 2

Chapter 3

Derivatives

3.1 Deﬁnition

In this chapter we deﬁne and study the notion of diﬀerentiable functions. Deriva-

tives were historically introduced in order to answer the need of measuring the

“rate of change” of a function. We will actually adopt this approach as a prelude

to the formal deﬁnition of the derivative.

Before we start, one technical clariﬁcation. In the previous chapter, we introduced

the limit of a function f : A → B at an (interior) point a,

lim

a

f .

Consider the function g deﬁned by

g(x) = f (a + x),

on the range,

¦x : a + x ∈ A¦.

In particular, zero is an interior point of this set. We claim that

lim

0

g = lim

a

f ,

provided, of course, that the right hand side exists. It is a good exercise to prove

it formally. Suppose that the right hand side equals . This means that

(∀ > 0)(∃δ > 0) : (∀x ∈ B

◦

(a, δ))(¦ f (x) − ¦ < ).

92 Chapter 3

Setting x − a = y,

(∀ > 0)(∃δ > 0) : (∀y ∈ B

◦

(0, δ))(¦ f (a + y) − ¦ < ),

and by the deﬁnition of g,

(∀ > 0)(∃δ > 0) : (∀y ∈ B

◦

(0, δ))(¦g(y) − ¦ < ),

i.e., lim

0

g = . Often, one does not bother to deﬁne the function g and simply

writes

lim

x→0

f (a + x) = lim

a

f ,

although this in not in agreement with our notations; f (a + x) stands for the com-

posite function,

f ◦ (x ·→ a + x).

Let now f be a function deﬁned on some interval I, and let a and x be two points

inside this interval. The variation in the value of f between these two points is

f (x) − f (a). We deﬁne the mean rate of change (עצוממ יוניש בצק) of f between

the points a and x to be the ratio

f (x) − f (a)

x − a

.

This quantity can be attributed with a number of interpretations. First, if we con-

sider the graph of f , then the mean rate of change is the slope of the secant line

(רתימ) that intersects the graph at the points a and x (see Figure 3.1). Second, it

has a meaning in many physical situations. For example, f can be the position

along a line (say, in meters relative to the origin) as function of time (say, in sec-

onds relative to an origin of time). Thus, f (a) is the distance from the origin in

meters a seconds after the time origin, and f (x) is the distance from the origin

in meters x seconds after the time origin. Then f (x) − f (a) is the displacement

(קתעה) between time a and time x, and [ f (x) − f (a)]/(x −a) is the mean displace-

ment per unit time, or the mean velocity.

This physical example is a good preliminary toward the deﬁnition of the deriva-

tive. The mean velocity, or mean rate of displacement, is an average quantity

between two instants, but there is nothing to guarantee that within this time in-

terval, the body was in a “ﬁxed state”. For example, we could intersect this time

interval into two equal sub-intervals, and measure the mean velocity in each half.

Nothing guarantees that these mean velocities will equal the mean velocity over

Derivatives 93

f!a"

a a+h

f!a+h"

!

Figure 3.1: Relation between the mean rate of change of a function and the slope

of the corresponding secant.

the whole interval. Physicists aimed to deﬁne an “instantaneous velocity”, and

the way to do it was to make the length of the interval very small. Of course,

having h small (what does small mean?) does not change the average nature of

the measured velocity.

An instantaneous rate of change can be deﬁned by using limits. Fixing the point

a, we deﬁne a function

∆

f ,a

: x ·→

f (x) − f (a)

x − a

.

The function ∆

f ,a

can be deﬁned on the same domain as f , but excluding the point

a. We say that f is diﬀerentiable at a (תיליבאיצנרפיד וא הריזג) if ∆

f ,a

has a limit

at a. We write

lim

a

∆

f ,a

= f

·

(a).

We call the limit f

·

(a) the derivative (תרזגנ) of f at a.

Comment: In the more traditional notation,

f

·

(a) = lim

x→a

f (x) − f (a)

x − a

.

At this stage, f

·

(a) is nothing but a notation. It is not (yet!) a function evaluated

94 Chapter 3

at the point a. Another standard notation, due to Leibniz, is

d f

dx

(a).

Having identiﬁed the mean rate of change as the slope of the secant line, the

derivative has a simple geometrical interpretation. As x tends to a, the corre-

sponding family of secants tends to a line which is tangent to f at the point a.

For us, these are only hand-waving arguments, as we have never assigned any

meaning to the limit of a family of lines.

Example: Consider the constant function, f : R → R, f : x ·→ c. We are going to

calculate its derivative at the point a. We construct the function

∆

f ,a

(y) =

f (y) − f (a)

y − a

=

c − c

y − a

= 0.

The limit of this function at a is clearly zero, hence f

·

(a) = 0.

In the above example, we could have computed the derivative at any point. In

other words, we can deﬁne a function that given a point x, returns the derivative of

f at that point, i.e., returns f

·

(x). A function f : A → B is called diﬀerentiable in

a subset U ⊆ A of its domain if it has a derivative at every point x ∈ U. We then

deﬁne the derivative function, f

·

: U → R, as the function,

f

·

(x) = lim

x

∆

f ,x

.

Again, the more traditional notation is,

f

·

(x) = lim

y→x

f (y) − f (x)

y − x

.

Example: Consider the function f : x ·→ x

2

. We calculate its derivative at a point

x by ﬁrst observing that

∆

f ,x

(y) =

y

2

− x

2

y − x

= x + y or ∆

f ,x

= Id +x,

so that

f

·

(x) = lim

x

∆

f ,x

= 2x.

Derivatives 95

(23 hrs, 2009)

(25 hrs, 2011)

Example: Consider now the function f : x ·→ ¦x¦. Let’s ﬁrst calculate its deriva-

tive at a point x > 0. For x > 0,

∆

f ,x

(y) =

¦y¦ − x

y − x

.

Since we are interested in the limit of ∆

f ,x

at x > 0 we may well assume that y > 0,

in which case ∆

f ,x

(y) = 1. Hence

f

·

(x) = lim

x

∆

f ,x

= 1.

For negative x, ¦x¦ = −x and we may consider y < 0 as well, so that

∆

f ,x

(y) =

(−y) − (−x)

y − a

= −1,

so that

f

·

(x) = lim

x

∆

f ,x

= −1.

Remains the point 0 itself,

∆

f ,0

(y) =

¦y¦ − ¦0¦

y − 0

= sgn(y).

The limit at zero does not exist since every neighborhood of zero has points where

this function equals one and points where this function equals minus one. Thus f

is diﬀerentiable everywhere except for the origin

1

.

(29 hrs, 2010)

One-sided diﬀerentiability This last example shows a case where a limit does

not exist, but one-sided limits do exist. This motivates the following deﬁnitions:

1

We use here a general principle, whereby lim

a

f does not exist if there exist

1

2

, such that

every neighborhood of a has points x, y, such that f (x) =

1

and f (y) =

2

.

96 Chapter 3

Deﬁnition 3.1 A function f is said to be diﬀerentiable on the right (ימימ הריזג)

at a if the one-sided limit

lim

a

+

∆

f ,a

exists. We denote the right-hand derivative (תינמי תרזגנ) by f

·

(a

+

). A similar

deﬁnition holds for left-hand derivatives.

Example: Here is one more example that practices the calculation of the deriva-

tive directly from the deﬁnition. Consider ﬁrst the function

f : x ·→

_

¸

¸

_

¸

¸

_

x sin

1

x

x 0

0 x = 0.

We have already seen that that this function is continuous at zero, but is it diﬀer-

entiable at zero? We construct the function

∆

f ,0

(y) =

f (y) − f (0)

y − 0

= sin

1

y

.

The function ∆

f ,0

does not have a limit at 0, because every neighborhood of zero

has both points where ∆

f ,0

= 1 and points where ∆

f ,0

= −1.

Example: In contrast, consider the function

f : x ·→

_

¸

¸

_

¸

¸

_

x

2

sin

1

x

x 0

0 x = 0.

We construct

∆

f ,0

(y) =

f (y) − f (0)

y − 0

= y sin

1

y

.

Now lim

0

∆

f ,0

= 0, hence f is diﬀerentiable at zero.

Diﬀerentiability and continuity The following theorem shows that diﬀeren-

tiable functions are a subclass (“better behaved”) of the continuous functions.

Theorem 3.1 If f is diﬀerentiable at a then it is continuous at a.

Derivatives 97

Proof : Note that for y a,

f (y) = f (a) + (y − a)

f (y) − f (a)

(y − a)

.

Viewing f (a) as a constant, we have a functional identity,

f = f (a) + (Id −a) ∆

f ,a

valid in a punctured neighborhood of a. By limit arithmetic,

lim

a

f = f (a) + lim

a

(Id −a) lim

a

∆

f ,a

= f (a) + 0 f

·

(a) = f (a).

B

Exercise 3.1 Show, based on the deﬁnition of the derivative, that if f is dif-

ferentiable at a ∈ R with f (x) 0, then ¦ f ¦ is also diﬀerentiable at that point.

(Hint, use the fact that a function diﬀerentiable at a point is also continuous at that

point.)

Exercise 3.2 Let I be a connected domain, and suppose that f is diﬀerentiable

at every interior point of I and continuous on I. Suppose furthermore that ¦ f

·

(x)¦ <

M for all x ∈ I. Show that f is uniformly continuous on I.

Higher order derivatives If a function f is diﬀerentiable on an interval A, we

can construct a new function—its derivative, f

·

. The derivative f

·

can have vari-

ous properties, For example, it may be continuous, or not, and in particular, it may

be diﬀerentiable on A, or on a subset of A. Then, we can deﬁne a new function—

the derivative of the derivative, or the second derivative (הינש תרזגנ), which we

denote by f

··

. By deﬁnition

f

··

(x) = lim

x

∆

f

·

,x

.

(Note that this is a limit of limits.) Likewise, the second derivative may be dif-

ferentiable, in which case we may deﬁne the third derivative f

···

, and so on. For

derivatives higher than the third, it is customary to use, for example, the notation

f

(4)

rather than f

····

. The k-th derivative, f

(k)

, is deﬁned recursively,

f

(k)

(x) = lim

x

∆

f

(k−1)

,x

.

For the recursion to hold from k = 1, we also set f

(0)

= f .

98 Chapter 3

3.2 Rules of diﬀerentiation

Recall that when we studied limits, we ﬁrst calculated limits by using the deﬁni-

tion of the limit, but very soon this became impractical, and we proved a number

of theorems (arithmetic of limits), with which we were able to easily calculate

a large variety of limits. The same exactly will happen with derivatives. After

having calculated the derivatives of a small number of functions, we develop tools

that will enable us to (easily) compute derivatives, without having to go back to

the deﬁnitions.

Theorem 3.2 If f is a constant function, then f

·

= 0 (by this we mean that f

·

:

x ·→ 0).

Proof : We proved this is in the previous section. B

Theorem 3.3 If f = Id then f

·

= 1 (by this we mean that f

·

: x ·→ 1).

Proof : Immediate from the deﬁnition. B

Theorem 3.4 Let f , g be functions deﬁned on the same domain (it suﬃces that the

domains have a non-empty intersection). If both f and g are diﬀerentiable at a

then f + g is diﬀerentiable at a, and

( f + g)

·

(a) = f

·

(a) + g

·

(a).

Proof : We consider the function

∆

f +g,a

=

( f + g) − ( f + g)(a)

Id −a

=

f + g − f (a) − g(a)

Id −a

= ∆

f ,a

+ ∆

g,a

,

Derivatives 99

and by the arithmetic of limits,

( f + g)

·

(a) = f

·

(a) + g

·

(a).

B

Theorem 3.5 (Leibniz rule) Let f , g be functions deﬁned on the same domain. If

both f and g are diﬀerentiable at a then f g is diﬀerentiable at a, and

( f g)

·

(a) = f

·

(a) g(a) + f (a) g

·

(a).

Proof : We consider the function,

∆

f g,a

=

f g − ( f g)(a)

Id −a

=

f g − f g(a) + f g(a) − f (a)g(a)

Id −a

= f ∆

g,a

+ g(a)∆

f ,a

,

and it remains to apply the arithmetic rules of limits. B

Corollary 3.1 The derivative is a linear operator,

(αf + βg)

·

= αf

·

+ βg

·

.

Proof : We only need to verify that if f is diﬀerentiable at a then so is αf and

(αf )

·

= αf

·

. This follows from the derivative of the product. B

Example: Let n ∈ N and consider the functions f

n

: x ·→ x

n

. Then,

f

·

n

(x) = n x

n−1

or equivalently f

·

n

= n f

n−1

.

We can show this inductively. We know already that this is true for n = 0, 1.

Suppose this were true for n = k. Then, f

k+1

= f

k

Id, and by the diﬀerentiation

rule for products,

f

·

k+1

= f

·

k

Id +f

k

Id

·

,

i.e.,

f

·

k+1

(x) = k x

k−1

x + x

k

1 = (k + 1) x

k

.

100 Chapter 3

Theorem 3.6 If g is diﬀerentiable at a and g(a) 0, then 1/g is diﬀerentiable at

a and

(1/g)

·

(a) = −

g

·

(a)

g

2

(a)

.

Proof : Since g is continuous at a (it is continuous since it is diﬀerentiable) and it

is non-zero, then there exists a neighborhood of a in which g does not vanish (by

Theorem 2.5). In this neighborhood we consider the function

∆

1/g,a

=

1/g − (1/g)(a)

Id −a

=

1

Id −a

_

1

g

−

1

g(a)

_

=

−1

g g(a)

∆

g,a

.

It remains to take the limit at a and apply the arithmetic laws of limits. B

(31 hrs, 2010)

Theorem 3.7 If f and g are diﬀerentiable at a and g(a) 0, then the function f /g

is diﬀerentiable at a and

( f /g)

·

=

f

·

g − g

·

f

g

2

.

Proof : Apply the last two theorems. B

(25 hrs, 2009)

With these theorems in hand, we can diﬀerentiate a large number of functions.

In particular there is no diﬃculty in diﬀerentiating a product of more than two

functions. For example,

( f gh)

·

= (( f g)h)

·

= ( f g)

·

h+( f g)h

·

= ( f

·

g+ f g

·

)h = ( f g)h

·

= f

·

gh+ f g

·

h+ f gh

·

.

Suppose we take for granted that

sin

·

= cos and cos

·

= −sin .

Then we have no problem calculating the derivative of, say, x ·→ sin

3

x, as a

product of three functions. But what about the derivative of x ·→ sin x

3

? Here we

need a rule for how to diﬀerentiate compositions.

Derivatives 101

Consider the composite function g ◦ f , i.e.,

(g ◦ f )(x) = g( f (x)).

Let’s try to calculate its derivative at a point a, assuming for the moment that both

f and g are diﬀerentiable everywhere. We then need to look at the function

∆

g◦ f ,a

(y) =

(g ◦ f )(y) − (g ◦ f )(a)

y − a

=

g( f (y)) − g( f (a))

y − a

,

and calculate its limit at a. When y is close to a, we expect the arguments f (y)

and f (a) of g to be very close. This suggest the following treatment,

∆

g◦ f ,a

(y) =

g( f (y)) − g( f (a))

f (y) − f (a)

f (y) − f (a)

y − a

=

g( f (y)) − g( f (a))

f (y) − f (a)

∆

f ,a

(y).

It looks that as y → a, since f (y) − f (a) → 0, this product tends to g

·

( f (a)) f

·

(a).

The problem is that while the limit y → a means that the case y = a is not to be

considered, there is nothing to prevent the denominator f (y)−f (a) fromvanishing,

rendering this expression meaningless. Yet, the result is correct, and it only takes

a little more subtlety to prove it.

(27 hrs, 2011)

Theorem 3.8 Let f : A → B and g : B → R. If f is diﬀerentiable at a ∈ A, and g

is diﬀerentiable at f (a), then

(g ◦ f )

·

(a) = g

·

( f (a)) f

·

(a).

Proof : We introduce the following function, deﬁned in a neighborhood of f (a),

ψ : z ·→

_

¸

¸

_

¸

¸

_

∆

g, f (a)

(z) z f (a)

g

·

( f (a)) z = f (a).

The fact that g is diﬀerentiable at f (a) implies that ψ is continuous at f (a).

We next claim that for y a

∆

g◦ f ,a

= (ψ ◦ f ) ∆

f ,a

.

102 Chapter 3

Why that? If f (y) f (a), then this equation reads

g( f (y)) − g( f (a))

y − a

=

g( f (y)) − g( f (a))

f (y) − f (a)

f (y) − f (a)

y − a

,

which holds indeed, whereas if f (y) = f (a) it reads,

g( f (y)) − g( f (a))

y − a

= g

·

( f (a))

f (y) − f (a)

y − a

,

and both sides are zero. Consider now the limit of the right hand side at a. By

limits arithmetic,

lim

a

_

(ψ ◦ f ) ∆

f ,a

_

= ψ( f (a)) lim

a

∆

f ,a

= g

·

( f (a)) f

·

(a).

B

(32 hrs, 2010)

3.3 Another look at derivatives

In this short section we provide another (equivalent) characterization of diﬀeren-

tiability and the derivative. Its purpose is to oﬀer a slightly diﬀerent angle of view

on the subject, and how clever deﬁnitions can sometimes greatly simplify proofs.

Our deﬁnition of derivatives states that a function f is diﬀerentiable at an interior

point a, if the function ∆

f ,a

has a limit at a, and we denote this limit by f

·

(a).

The function ∆

f ,a

is not deﬁned at a, hence not continuous at a, but this is a

removable discontinuity. We could say that f is diﬀerentiable at a if there exists a

real number, , such that the function

S

f ,x

(y) =

_

¸

¸

_

¸

¸

_

∆

f ,a

(y) y a

y = a

is continuous at a, and S

f ,a

(a) = is called the derivative of f at a. For y a,

S

f ,a

(y) = ∆

f ,a

(y) =

f (x) − f (a)

x − a

,

or equivalently,

f (y) = f (a) + S

f ,a

(y)(y − a).

This equation holds also for y = a. This suggests the following alternative deﬁni-

tion of the derivative:

Derivatives 103

Deﬁnition 3.2 A function f deﬁned in on open neighborhood of a point a is said

to be diﬀerentiable at a if there exists a function S

f ,a

continuous at a, such that

f (y) = f (a) + S

f ,a

(y)(y − a).

S

f ,a

(a) is called the derivative of f at a and is denoted by f

·

(a).

Of course, S

f ,a

coincides with ∆

f ,a

in some punctured neighborhood of a.

A ﬁrst example Take the function f : x → x

2

. Then

f (y) − f (a) = (y + a)(y − a),

or

f = f (a) + (Id +a)(Id −a).

Since the function Id +a is continuous at a, it follows that f is diﬀerentiable at a

and

f

·

(a) = lim

a

(Id +a) = 2a.

Leibniz’ rule Let’s now see how this alternative deﬁnition simpliﬁes certain

proofs. Suppose for example that both f and g are diﬀerentiable at a. This implies

the existence of two continuous functions, S

f ,a

and S

g,a

, such that

f (y) = f (a) + S

f ,a

(y)(y − a)

g(y) = g(a) + S

g,a

(y)(y − a)

in some neighborhood of a and f

·

(a) = S

f ,a

(a) and g

·

(a) = S

g,a

(a). Then,

f (y)g(y) =

_

f (a) + S

f ,a

(y)(y − a)

_ _

g(a) + S

g,a

(y)(y − a)

_

= f (a)g(a) +

_

f (a)S

g,a

(y) + g(a) S

f ,a

(y) + S

f ,a

(y)S

g,a

(y)(y − a)

_

(y − a).

By limit arithmetic, the function in the brackets,

F = f (a)S

g,a

+ g(a) S

f ,a

+ S

f ,a

S

g,a

(Id −a)

is continuous at a, hence f g is diﬀerentiable at a and

( f g)

·

(a) = lim

a

F = f (a)g

·

(a) + g(a) f

·

(a).

104 Chapter 3

Derivative of a composition Suppose now that f is diﬀerentiable at a and g is

diﬀerentiable at f (a). Then, there exist continuous functions, S

f ,a

and S

g, f (a)

, such

that

f (y) = f (a) + S

f ,a

(y)(y − a)

g(z) = g( f (a)) + S

g, f (a)

(z)(z − f (a)).

Now,

g( f (y)) = g( f (a)) + S

g, f (a)

( f (y))( f (y) − f (a))

= g( f (a)) + S

g, f (a)

( f (y))S

f ,a

(y)(y − a),

or,

(g ◦ f )(y) = (g ◦ f )(a) + S

g, f (a)

( f (y))S

f ,a

(y)

.,,.

[

S

g, f (a)

◦ f

]

S

f ,a

(y)

(y − a).

By the properties of continuous functions, the function in the square brackets is

continuous at a, and

(g ◦ f )

·

(a) = lim

a

_

S

g, f (a)

◦ f

_

S

f ,a

= g

·

( f (a)) f

·

(a).

(34 hrs, 2010)

(28 hrs, 2011)

3.4 The derivative and extrema

In high-school calculus, one of the main uses of diﬀerential calculus is to ﬁnd

extrema of functions. We are going to put this practice on solid grounds. First,

recall the deﬁnition,

Deﬁnition 3.3 Let f : A → B. A point a ∈ A (not necessarily an internal point)

is said to be a maximum point of f in A, if

f (x) ≤ f (a) ∀x ∈ A.

We deﬁne similarly a minimum point.

Comment: By no means a maximum point has to be unique, nor to exist. For

example, in a constant function all points are minima and maxima. On the other

Derivatives 105

hand, the function f : (0, 1) → R, f : x ·→ x

2

does not have a maximum in (0, 1).

Also, we proved that a continuous function deﬁned on a closed interval always

has a maximum (and a minumum, both, not necessarily unique).

Here is a ﬁrst connection between maximum (and minimum) points and deriva-

tives:

Theorem 3.9 Let f : (a, b) → R. If x ∈ (a, b) is a maximum point of f and f is

diﬀerentiable at x, then f

·

(x) = 0.

Proof : Since x is, by assumption, a maximum point, then for every y ∈ (a, b),

f (y) − f (x) ≤ 0.

In particular, for y x,

∆

f ,x

(y) =

f (y) − f (x)

y − x

,

satisﬁes

_

¸

¸

_

¸

¸

_

∆

f ,x

(y) ≤ 0 y > x

∆

f ,x

(y) ≥ 0 y < x.

It is given that ∆

f ,x

has a limit at x (it is f

·

(x)). We will show that this limit is

necessarily zero.

Suppose the limit, , were positive. Then, taking = /2, there has to be a δ > 0,

such that

0 <

2

< ∆

f ,x

(y) <

3

2

whenever 0 < ¦y − x¦ < δ,

but this can’t hold, since every neighborhood of x has points y where ∆

f ,x

(y) ≤ 0.

Similarly, can’t be negative, hence = 0. B

Comment: This is a uni-directional theorem. It does not imply that if f

·

(x) = 0

then x is a maximum point (nor a minimum point).

Comment: The open interval cannot be replaced by the closed interval [a, b],

because at the end points we can only consider one-sided limits.

106 Chapter 3

Deﬁnition 3.4 Let f : A → B. An interior point a ∈ A is called a local maxi-

mum of f (ימוקמ ומיסקמ), if there exists a δ > 0 such that a is a maximum of f

in (a − δ, a + δ) (“local” always means “in a suﬃciently small neighborhood”).

Theorem 3.10 (Pierre de Fermat) If a is a local maximum (or minimum) of f in

some open interval and f is diﬀerentiable at a then f

·

(a) = 0.

Proof : There is actually nothing to prove, as the previous theorem applies verba-

tim for f restricted to the interval (a − δ, a + δ). B

Comment: The converse is not true. Take for example the function f : x → x

3

. If

is diﬀerentiable in R and f

·

(0) = 0, but zero is not a local minimum, nor a local

maximum of f .

Comment: Although this theorem is due to Fermat, its proof is slightly shorter

than the proof of Fermat’s last theorem.

Deﬁnition 3.5 Let f : A → B be diﬀerentiable. A point a ∈ A is called a critical

point of f if f

·

(a) = 0. The value f (a) is called a critical value.

Locating extremal points Let f : [a, b] → R and suppose we want to ﬁnd a

maximum point of f . If f is continuous, then a maximum is guaranteed to exist.

There are three types of candidates: (1) critical points of f in (a, b), (ii) the end

points a and b, and (iii) points where f is not diﬀerentiable. If f is diﬀerentiable,

then the task reduces to ﬁnding all the critical points and comparing all the critical

values,

¦ f (x) : f

·

(x) = 0¦.

Finally, the largest critical value has to be compared to f (a) and f (b).

Example: Consider the function f : [−1, 2] → R, f : x ·→ x

3

− x. This function

is diﬀerentiable everywhere, and its critical points satisfy

f

·

(x) = 3x

2

− 1 = 0,

i.e., x = ±1/

√

3, which are both in the domain. It is easily checked that

f (1/

√

3) = −

2

3

√

3

and f (−1/

√

3) =

2

3

√

3

.

Derivatives 107

Finally, f (−1) = 0 and f (2) = 6, hence 2 is the (unique) maximum point.

So far, we have derived properties of the derivative given the function. What about

the reverse direction? Take the following example: we know that the derivative

of a constant function is zero. Is it also true that if the derivative is zero then the

function is a constant? A priori, it is not clear how to show it. How can we go

from the knowledge that

lim

y→x

f (y) − f (x)

y − x

= 0,

to showing that f is a constant function. The following two theorems will provide

us with the necessary tools:

Theorem 3.11 (Michel Rolle, 1691) If f is continuous on [a, b], diﬀerentiable in

(a, b), and f (a) = f (b), then there exists a point c ∈ (a, b) where f

·

(c) = 0.

Proof : It follows from the continuity of f that it has a maximum and a minimum

point in [a, b]. If the maximum occurs at some c ∈ (a, b) then f

·

(c) = 0 and we are

done. If the minimum occurs at some c ∈ (a, b) then f

·

(c) = 0 and we are done.

The only remaining alternative is that a and b are both minima and maxima, in

which case f is a constant and its derivative vanishes at some interior point (well,

at all of them). B

Comment: The requirement that f be diﬀerentiable everywhere in (a, b) is imper-

ative, for consider the function

f (x) =

_

¸

¸

_

¸

¸

_

x 0 ≤ x ≤

1

2

1 − x

1

2

< x ≤ 1.

Even though f is continuous in [0, 1] and f (0) = f (1) = 0, there is no interior

point where f

·

(x) = 0.

Theorem 3.12 (Mean-value theorem (עצוממה רעה טפשמ)) If f is continu-

ous on [a, b] and diﬀerentiable in (a, b), then there exists a point c ∈ (a, b) where

f

·

(c) =

f (b) − f (a)

b − a

,

that is, a point at which the derivative equals to the mean rate of change of f

between a and b.

108 Chapter 3

Comment: Rolle’s theorem is a particular case.

Proof : This is almost a direct consequence of Rolle’s theorem. Deﬁne the function

g : [a, b] → R,

g(x) = f (x) −

f (b) − f (a)

b − a

(x − a).

g is continuous on [a, b] and diﬀerentiable in (a, b). Moreover, g(a) = g(b) = f (a).

Hence by Rolle’s theorem there exists a point c ∈ (a, b), such that

0 = g

·

(c) = f

·

(c) −

f (b) − f (a)

b − a

.

B

Corollary 3.2 Let f : [a, b] → R. If f

·

(x) = 0 for all x ∈ (a, b) then f is a

constant.

Proof : Let c, d ∈ (a, b), c < d. By the mean-value theorem there exists some

e ∈ (c, d), such that

f

·

(e) =

f (d) − f (c)

d − c

,

however f

·

(e) = 0, hence f (d) = f (c). Since this holds for all pair of points, then

f is a constant. B

Comment: This is only true if f is deﬁned on an interval. Take a domain of

deﬁnition which is the union of two disjoint sets, and this is no longer true ( f is

only constant in every “connected component” (תורישק ביכר)).

(27 hrs, 2009)

Corollary 3.3 If f and g are diﬀerentiable on an interval with f

·

(x) = g

·

(x), then

there exists a real number c such that f = g + c on that interval.

Proof : Apply the previous corollary for f − g. B

Derivatives 109

Corollary 3.4 If f is continuous on a closed interval [a, b] and f

·

(x) > 0 in (a, b),

then f is increasing on that interval.

Proof : Let x < y belong to that interval. By the mean-value theorem there exists

a c ∈ (x, y) such that

f

·

(c) =

f (y) − f (x)

y − x

,

however f

·

(c) > 0, hence f (y) > f (x). B

Comment: The converse is not true. If f is increasing and diﬀerentiable then

f

·

(x) ≥ 0, but equality may hold, as in f (x) = x

3

at zero.

We have seen that at a local minimum (or maximum) the derivative (if it exists)

vanishes, but that the opposite is not true. The following theorem gives a suﬃcient

condition for a point to be a local minimum (with a corresponding theorem for a

local maximum).

Theorem 3.13 If f

·

(a) = 0 and f

··

(a) > 0 then a is a local minimum of f .

(30 hrs, 2011)

Comment: The fact that f

··

exists at a implies:

1. f

·

exists in some neighborhood of a.

2. f

·

is continuous at a.

3. f is continuous in some neighborhood of a.

Proof : By deﬁnition,

f

··

(a) = lim

a

∆

f

·

,a

> 0 where ∆

f

·

,a

(y) =

f

·

(y) − f

·

(a)

y − a

=

f

·

(y)

y − a

.

Thus there exists a δ > 0 such that

f

·

(y)

y − a

> 0 whenever 0 < ¦y − a¦ < δ,

110 Chapter 3

or,

f

·

(y) > 0 whenever a < y < a + δ

f

·

(y) < 0 whenever a − δ < y < a.

It follows that f is increasing in a right-neighborhood of a and decreasing in a

left-neighborhood of a, i.e., a is a local minimum.

Another way of completing the argument is as follows: for every y ∈ (a, a + δ):

f (y) − f (a)

y − a

= f

·

(c

y

) > 0 i.e., f (y) > f (a),

where c

y

∈ (a, y). Similarly, for every y ∈ (a − δ, a):

f (y) − f (a)

y − a

= f

·

(c

y

) < 0 i.e., f (y) > f (a),

where here c

y

∈ (y, a).

B

Example: Consider the function f (x) = x

3

− x. There are three critical points

0, ±1/

√

3: one local maximum, one local minimum, and one which is neither.

**Example: Consider the function f (x) = x
**

4

. Zero is a minimum point, even though

f

··

(0) = 0.

(36 hrs, 2010)

Theorem 3.14 If f has a local minimum at a and f

··

(a) exists, then f

··

(a) ≥ 0.

Proof : Suppose, by contradiction, that f

··

(a) < 0. By the previous theorem this

would imply that a is both a local minimum and a local maximum. This in turn

implies that there exists a neighborhood of a in which f is constant, i.e., f

·

(a) =

f

··

(a) = 0, a contradiction. B

The following theorem states that the derivative of a continuous function cannot

have a removable discontinuity.

Derivatives 111

Theorem 3.15 Suppose that f is continuous at a and

lim

a

f

·

exists,

then f

·

(a) exists and f

·

is continuous at a.

Proof : The assumption that f

·

has a limit at a implies that f

·

exists in some punc-

tured neighborhood U of a. Thus, for y ∈ U \ ¦a¦ the function f is continuous on

the closed segment that connects y and a and diﬀerentiable on the corresponding

open segment. By the mean-value theorem, there exists a point c

y

between y and

a, such that

f

·

(c

y

) =

f (y) − f (a)

y − a

,

and therefore ¸

¸

¸

¸

¸

∆

f ,a

(y) − lim

a

f

·

¸

¸

¸

¸

¸

=

¸

¸

¸

¸

¸

f

·

(c

y

) − lim

a

f

·

¸

¸

¸

¸

¸

.

Since f

·

has a limit at a,

(∀ > 0)(∃δ > 0) : (∀y ∈ B

◦

(a, δ))

_

¦ f

·

(y) − lim

a

f

·

¦ <

_

.

Since y ∈ B

◦

(a, δ) implies that c

y

∈ B

◦

(a, δ),

(∀ > 0)(∃δ > 0) : (∀y ∈ B

◦

(a, δ))

_

¦ f

·

(c

y

) − lim

a

f

·

¦ <

_

,

which by the above identity gives

(∀ > 0)(∃δ > 0) : (∀y ∈ B

◦

(a, δ))

_

¦∆

f ,a

(y) − lim

a

f

·

¦ <

_

,

which precisely means that

f

·

(a) = lim

a

f

·

.

B

Example: The function

f (x) =

_

¸

¸

_

¸

¸

_

x

2

x < 0

x

2

+ 5 x ≥ 0,

112 Chapter 3

is diﬀerentiable in a neighborhood of 0 and

lim

0

f

·

exists,

but it is not continuous, and therefore not diﬀerentiable at zero.

Example: The function

f (x) =

_

¸

¸

_

¸

¸

_

x

2

sin 1/x x 0

0 x = 0,

is continuous and diﬀerentiable in a neighborhood of zero. However, the deriva-

tive does not have a limit at zero, so that the theorem does not apply. Note, how-

ever, that f is diﬀerentiable at zero.

(29 hrs, 2009) (37 hrs, 2010)

Theorem 3.16 (Augustin Louis Cauchy; mean value theorem) Suppose that

f , g are continuous on [a, b] and diﬀerentiable in (a, b). Then, there exists a

c ∈ (a, b), such that

[ f (b) − f (a)]g

·

(c) = [g(b) − g(a)] f

·

(c).

Comment: If g(b) g(a) then this theorem states that

f (b) − f (a)

g(b) − g(a)

=

f

·

(c)

g

·

(c)

.

This may seem like a corollary of the mean-value theorem. We know that there

exist c,d, such that

f

·

(c) =

f (b) − f (a)

b − a

and g

·

(d) =

g(b) − g(a)

b − a

.

The problem is that there is no reason for the points c and d to coincide.

Derivatives 113

Proof : Deﬁne

h(x) = f (x)[g(b) − g(a)] − g(x)[ f (b) − f (a)],

and apply Rolle’s theorem. Namely, we verify that h is continuous on [a, b] and

diﬀerentiable in (a, b), and that h(a) = h(b), hence there exists a point c ∈ (a, b),

such that

h

·

(c) = f

·

(c)[g(b) − g(a)] − g

·

(c)[ f (b) − f (a)] = 0.

B

Theorem 3.17 (L’Hˆ opital’s rule) Suppose that

lim

a

f = lim

a

g = 0,

and that

lim

a

f

·

g

·

exists.

Then the limit lim

a

( f /g) exists and

lim

a

f

g

= lim

a

f

·

g

·

.

Comment: This is a very convenient tool for obtaining a limit of a fraction, when

both numerator and denominator vanish in the limit

2

.

Proof : It is implicitly assumed that f and g are diﬀerentiable in a neighborhood

of a (except perhaps at a) and that g

·

is non-zero in a neighborhood of a (except

perhaps at a); otherwise, the limit lim

a

( f

·

/g

·

) would not have existed. Note that f

and g are not necessarily deﬁned at a; since they have a limit there, there will be

no harm assuming that they are continuous at a, i.e., f (a) = g(a) = 0

3

.

2

L’Hˆ opital’s rule was in fact derived by Johann Bernoulli who was giving to the French noble-

man, le Marquis de l’Hˆ opital, lessons in calculus. L’Hˆ opital was the one to publish this theorem,

acknowledging the help of Bernoulli.

3

What we really do is to deﬁne continuous functions

˜

f (x) =

_

¸

¸

_

¸

¸

_

f (x) x a

0 x = a

and ˜ g(x) =

_

¸

¸

_

¸

¸

_

g(x) x a

0 x = a

,

114 Chapter 3

First, we claim that g does not vanish in some neighborhood U of a, for by Rolle’s

theorem, it would imply that g

·

vanishes somewhere in this neighborhood. More

formally, if

g

·

(x) 0 whenever 0 < ¦x − a¦ < δ,

then

g(x) 0 whenever 0 < ¦x − a¦ < δ,

for if g(x) = 0, with, say, 0 < x < δ, then there exists a 0 < c

x

< x < δ, where

g

·

(c

x

) = 0.

Both f and g are continuous and diﬀerentiable in some neighborhood U that con-

tains a, hence by Cauchy’s mean-value theorem, there exists for every x ∈ U, a

point c

x

between x and a, such that

f (x) − 0

g(x) − 0

=

f (x)

g(x)

=

f

·

(c

x

)

g

·

(c

x

)

. (3.1)

Since the limit of f

·

/g

·

at a exists,

(∀ > 0)(∃δ > 0) : x ∈ B

◦

(a, δ) implies

_

¸

¸

¸

¸

¸

f

·

(x)

g

·

(x)

− lim

a

f

·

g

·

¸

¸

¸

¸

¸

<

_

,

and since x ∈ B

◦

(a, δ) implies c

x

∈ B

◦

(a, δ),

(∀ > 0)(∃δ > 0) : x ∈ B

◦

(a, δ) implies

_

¸

¸

¸

¸

¸

f

·

(c

x

)

g

·

(c

x

)

− lim

a

f

·

g

·

¸

¸

¸

¸

¸

<

_

,

and by (3.1),

(∀ > 0)(∃δ > 0) : x ∈ B

◦

(a, δ) implies

_

¸

¸

¸

¸

¸

f (x)

g(x)

− lim

a

f

·

g

·

¸

¸

¸

¸

¸

<

_

,

which means that

lim

a

f

g

= lim

a

f

·

g

·

.

B

Example:

and prove the theorem for the “corrected” functions

˜

f and ˜ g. At the end, we observe that the result

must apply for f and g as well.

Derivatives 115

' The limit

lim

x→0

sin x

x

= 1.

^ The limit

lim

x→0

sin

2

x

x

2

= 1,

with two consecutive applications of l’Hˆ opital’s rule.

(32 hrs, 2011)

Comment: Purposely, we only prove one variant among many other of l’Hˆ opital’s

rule. Another variants is: Suppose that

lim

∞

f = lim

∞

g = 0,

and that

lim

∞

f

·

g

·

exists.

Then

lim

∞

f

g

= lim

∞

f

·

g

·

.

Yet another one is: Suppose that

lim

a

f = lim

a

g = ∞,

and that

lim

a

f

·

g

·

exists.

Then

lim

a

f

g

= lim

a

f

·

g

·

.

The following theorem is very reminiscent of l’Hˆ opital’s rule, but note how dif-

ferent it is:

116 Chapter 3

Theorem 3.18 Suppose that f and g are diﬀerentiable at a, with

f (a) = g(a) = 0 and g

·

(a) 0.

Then,

lim

a

f

g

=

f

·

(a)

g

·

(a)

.

Proof : Without loss of generality, assume g

·

(a) > 0. That is,

lim

y→a

g(y) − g(a)

y − a

= lim

y→a

g(y)

y − a

= g

·

(a) > 0,

hence there exists a punctured neighborhood of a where g(x)/(x − a) > 0, and in

particular, in which the numerator g(x) is non-zero. Thus, we can divide by g(x)

in a punctured neighborhood of a, and

f (x)

g(x)

=

f (x) − f (a)

g(x) − g(a)

=

f (x)−f (a)

x−a

g(x)−g(a)

x−a

=

∆

f ,a

(x)

∆

g,a

(x)

.

Using the arithmetic of limits, we obtain the desired result. B

(33 hrs, 2011)

(30 hrs, 2009)

Exercise 3.3 Show that the function

f (x) =

_

¸

¸

_

¸

¸

_

1

2

x + x

2

sin

1

x

x 0

0 x = 0

satisﬁes f

·

(0) > 0, and yet, is not increasing in any neighborhood of 0 (i.e., there

is no neighborhood of zero in which x < y implies f (x) < f (y)).

Exercise 3.4 Let I be an open segment. Any suppose that f

·

(x) = f (x) for all

x ∈ I. Show that there exists a constant c, such that f (x) = c e

x

. (Hint: consider

the function g(x) = f (x)/e

x

.)

Exercise 3.5

Derivatives 117

' Let f : R → R be diﬀerentiable. Show that if f vanishes at m points, then

f

·

vanishes at at least m − 1 points.

^ Let f be n times continuously diﬀerentiable in [a, b], and n + 1 times dif-

ferentiable in (a, b). Suppose furthermore that f (b) = 0 and f

(k)

(a) = 0 for

k = 0, 1, . . . , n. Show that there exists a c ∈ (a, b) such that f

(n+1)

(c) = 0.

Exercise 3.6 Let

f (x) =

_

¸

¸

¸

¸

¸

_

¸

¸

¸

¸

¸

_

sin(ωx) x < 0

γ x = 0

αx

2

+ βx x > 0.

Find all the values of α, β, γ, ω for which f is (i) continuous at zero, (ii) diﬀeren-

tiable at zero, (iii) twice diﬀerentiable at zero.

Exercise 3.7 Calculate the following limits:

' lim

x→0

sin(ax)

sin(bx) cos(cx)

.

^ lim

x→0

tan(x)−x

x−sin(x)

.

· lim

x→0

x

2

sin(1/x)

sin(x)

.

´ lim

x→1

(1 − x) tan

_

πx

2

_

.

Exercise 3.8 Suppose that f : (0, ∞) → R is diﬀerentiable on (0, ∞) and

lim

∞

f

·

= L. Prove, without using l’Hopital’s theorem that lim

x→∞

f (x)/x = L.

Hint: write

f (x)

x

=

f (x) − f (x

0

)

x − x

0

x − x

0

x

,

and choose x

0

appropriately in order to use Lagrange’s theorem.

3.5 Derivatives of inverse functions

We have already shown that if f : A → B is one-to-one and onto, then it has an

inverse f

−1

: B → A. We then say that f is invertible (הכיפה). We have seen

that an invertible function deﬁned on an interval must be monotonic, and that the

inverse of a continuous function is continuous. In this section, we study under

what conditions is the inverse of a diﬀerentiable function diﬀerentiable.

118 Chapter 3

Figure 3.2: The derivative of the inverse function

Recall also that

f

−1

◦ f = Id,

with Id : A → A. If both f and f

−1

were diﬀerentiable at a and f (a), respectively,

then by the composition rule,

( f

−1

)

·

( f (a)) f

·

(a) = 1.

Setting f (a) = b, or equivalently, a = f

−1

(b), we get

( f

−1

)

·

(b) =

1

f

·

( f

−1

(b))

.

There is one little ﬂaw with this argument. We have assumed ( f

−1

)

·

to exist. If this

were indeed the case, then we have just derived an expression for the derivative of

the inverse.

Corollary 3.5 If f

·

( f

−1

(b)) = 0 then f

−1

is not diﬀerentiable at b.

Example: The function f (x) = x

3

+ 2 is continuous and invertible, with f

−1

(x) =

3

√

x − 2. Since, f

·

( f

−1

(2)) = 0, then f

−1

is not diﬀerentiable at 2.

(39 hrs, 2010)

Theorem 3.19 Let f : I → R be continuous and invertible. If f is diﬀerentiable

at f

−1

(b) and f

·

( f

−1

(b)) 0, then f

−1

is diﬀerentiable at b.

Derivatives 119

Proof : Let b = f (a), i.e., a = f

−1

(b). It is given

lim

a

∆

f ,a

= f

·

(a) 0.

To each x a corresponds a y b such that

y = f (x) and x = f

−1

(y).

Then,

∆

f ,a

(x) =

f (x) − f (a)

x − a

=

y − b

f

−1

(y) − f

−1

(b)

=

1

∆

f

−1

,b

(y)

,

which we can rewrite as

∆

f

−1

,b

(y) =

1

∆

f ,a

(x)

=

1

∆

f ,a

( f

−1

(y))

.

Since f

−1

is continuous at b and ∆

f ,a

is continuous at a = f

−1

(b), it follows from

limit artithmetic that

lim

b

∆

f

−1

,b

=

1

f

·

( f

−1

(b))

.

B

Example: Consider the function f (x) = tan x deﬁned on (−π/2, π/2). We denote

its inverse by arctan x. Then

4

,

( f

−1

)

·

(x) =

1

f

·

( f

−1

(x))

= cos

2

(arctan x) =

1

1 + tan

2

(arctan x)

=

1

1 + x

2

.

**Example: Consider the function f (x) = x
**

n

, with n integer. This proves that the

derivative of f

−1

(x) = x

1/n

is

( f

−1

)

·

(x) =

1

n

x

1/n−1

,

and together with the composition rule gives the derivative for all rational powers.

4

We used the identity cos

2

= 1/(1 + tan

2

).

120 Chapter 3

Comment: So far, we have not deﬁned real-valued powers. We will get to it after

we study integration theory.

Exercise 3.9 Let f : R → R be one-to-one and onto, and furthermore dif-

ferentiable with f

·

(x) 0 for all x ∈ R. Suppose that there exists a function

F : R → R such that F

·

= f . Deﬁne G(x) = x f

−1

(x) − F( f

−1

(x)). Prove that

G

·

(x) = f

−1

(x). Note that this statement says that if we know a function whose

derivative is one-to-one and onto, then we know the function whose derivative is

its inverse.

Exercise 3.10 Let f be invertible and twice diﬀerentiable in (a, b). Further-

more, f

·

(x) 0 in (a, b). Prove that ( f

−1

)

··

exists and ﬁnd an expression for it in

terms of f and its derivatives.

Exercise 3.11 Suppose that f : R → R is continuous, one-to-one and onto,

and satisﬁes f = f

−1

. (i) Show that f has a ﬁxed point, i.e., there exists an x ∈ R

such that f (x) = x. (ii) Find such an f that has a unique ﬁxed point. Hint: start by

ﬁguring out how do such functions look like.

3.6 Complements

Theorem 3.20 Suppose that f : A → B is monotonic (say, increasing), then it has

one-sided limits at every point that has a one-sided neighborhood that belongs to

A. (In particular, if A is a connected set then f has one-sided limits everywhere.)

Proof : Consider a segment (b, a] ∈ A. We need to show that f has a left-limit at

a. The set

K = ¦ f (x) : b < x < a¦

is non-empty and upper bounded by f (a) (since f is increasing). Hence we can

deﬁne = sup K. We are going to show that

= lim

a

−

f .

Let > 0. By the deﬁnition of the supremum, there exists an m ∈ K such that

m > − . Let y be such f (y) = m. By the monotonicity of f ,

− < f (x) ≤ whenever y < x < a.

Derivatives 121

This concludes the proof. We proceed similarly for right-limits. B

Theorem 3.21 Let f : [a, b] → R be monotonic (say, increasing). Then the image

of f is a segment if and only if f is continuous.

Proof : (i) Suppose ﬁrst that f is continuous. By monotonicity,

Image( f ) ⊆ [ f (a), f (b)].

That every point of [ f (a), f (b)] is in the image of f follows from the intermediate-

value theorem.

(ii) Suppose then that Image( f ) = [ f (a), f (b)]. Suppose, by contradiction, that f

is not continuous. This implies the existence of a point c ∈ [a, b], where

f (c) > lim

c

−

f and/or f (c) < lim

c

+

f .

(We rely on the fact that one-sided derivatives exist.)

Say, for example, that f (c) < lim

c

+ f . Then,

f (c) <

f (c) + lim

c

+ f

2

< lim

c

+

f

is not in the image of f , which contradicts the fact that Image( f ) = [ f (a), f (b)].

B

Theorem 3.22 (Jean-Gaston Darboux) Let f : A → B be diﬀerentiable on (a, b),

including a right-derivative at a and a left-derivative at b. Then, f

·

has the

“intermediate-value property”, whereby for every t between f

·

(a

+

) and f

·

(b

−

)

there exists an x ∈ [a, b], such that f

·

(x) = t.

Comment: The statement is trivial when f is continuously diﬀerentiable (by

the intermediate-value theorem). It is nevertheless correct even if f

·

has dis-

continuities. This theorem thus limits the functions that are derivatives of other

functions—not every function can be a derivative.

122 Chapter 3

Proof : Without loss of generality, let us assume that f

·

(a

+

) > f

·

(b

−

). Let f

·

(a

+

) >

t > f

·

(b

−

), and consider the function

g(x) = f (x) − tx.

(Here t is a ﬁxed parameter; for every t we can construct such a function.) We

have

g

·

(a

+

) = f

·

(a

+

) − t > 0 and g

·

(b

−

) = f

·

(b

−

) − t < 0,

and we wish to show the existence of an x ∈ [a, b] such that g

·

(x) = 0.

Since g is continuous on [a, b] then it must attain a maximum. The point a can-

not be a maximum point because g is increasing at a. Similarly, b cannot be a

maximum point because g is decreasing at b. Thus, there exists an interior point c

which is a maximum, and by Fermat’s theorem, g

·

(c) = f

·

(c) − t = 0. B

(32 hrs, 2009)

Example: The function

f (x) =

_

¸

¸

_

¸

¸

_

x

2

sin 1/x x 0

0 x = 0

is diﬀerentiable everywhere, and its derivative is

f

·

(x) =

_

¸

¸

_

¸

¸

_

2x sin 1/x − cos 1/x x 0

0 x = 0

.

The derivative is not continuous at zero, but nevertheless satisfes the Darboux

theorem.

Example: How crazy can a continuous function be? It was Weierstrass who

shocked the world by constructing a continuous function that it nowhere diﬀer-

entiable! As instance of his class of example is

f (x) =

∞

k=0

cos(21

k

πx)

3

k

.

Derivatives 123

3.7 Taylor’s theorem

Recall the mean-value theorem. If f : [a, b] → R is continuous on [a, b] and

diﬀerentiable in (a, b), then for every point x ∈ (a, b) there exists a point c

x

, such

that

f (x) − f (a)

x − a

= f

·

(c

x

).

We can write it alternatively as

f (x) = f (a) + f

·

(c

x

)(x − a).

Suppose, for example, that we knew that ¦ f

·

(y)¦ < M for all y. Then, we would

deduce that f (x) cannot diﬀer from f (a) by more than M times the separation

¦x − a¦. In particular,

lim

a

[ f − f (a)] = 0.

It turns out that if we have more information, about higher derivatives of f , then

we can reﬁne a lot the estimates we have on the variation of f as we get away

from the point a.

Theorem 3.23 Let f be a given function, and a an interior point of its domain.

Suppose that the derivatives

f

·

(a), f

··

(a), . . . , f

(n)

(a)

all exist. Let a

k

= f

(k)

(a)/k! and set the Taylor polynomial of degree n,

P

f ,n,a

(x) = a

0

+ a

1

(x − a) + a

2

(x − a)

2

+ + a

n

(x − a)

n

.

Then,

lim

a

f − P

f ,n,a

(Id −a)

n

= 0.

Comment: Loosely speaking, this theorem states that, as x approaches a, P

f ,n,a

(x)

is closer to f (x) than (x − a)

n

(or that f and P

f ,n,a

are “equal up to order n”; see

below). For n = 0 it states that

lim

a

[ f − f (a)] = 0,

124 Chapter 3

whereas for n = 1,

lim

a

_

f − f (a)

(Id −a)

− f

·

(a)

_

= lim

a

∆

f ,a

− f

·

(a) = 0,

which holds by deﬁnition.

Proof : Consider the ratio

f − P

f ,n,a

(Id −a)

n

,

which is a ratio of n-times diﬀerentiable functions at a. Both numerator and de-

nominator vanish at a, so we will attempt to use l’Hˆ opital’s rule. For that we note

that

P

·

f ,n,a

= P

f

·

,n−1,a

.

Thus,

lim

a

f − P

f ,n,a

(Id −a)

n

= lim

a

f

·

− P

f

·

,n−1,a

n (Id −a)

n−1

,

provided that the right hand side exists. We then proceed to consider the right

hand side, which is the limit of the ratio of two diﬀerentiable functions that vanish

at a. Hence,

lim

a

f

·

− P

f

·

,n−1,a

n (Id −a)

n−1

= lim

a

f

··

− P

f

··

,n−2,a

n(n − 1) (Id −a)

n−2

,

provided that the right hand side exists. We proceed inductively, until we get that

lim

a

f − P

f ,n,a

(Id −a)

n

= lim

a

f

(n−1)

− P

f

(n−1)

,1,a

n! (Id −a)

,

provided that the right hand side exists. Now write the right hand side explicitly,

1

n!

lim

a

f

(n−1)

− f

(n−1)

(a) − f

(n)

(a)(Id −a)

Id −a

=

1

n!

_

lim

a

∆

f

(n−1)

,a

− f

(n)

(a)

_

,

which vanishes by the deﬁnition of f

(n)

(a). This concludes the proof. B

(35 hrs, 2011)

The polynomial P

f ,n,a

(x) is known as the Taylor polynomial of degree n of f

about a. Its existence relies on f being diﬀerentiable n times at a. If we denote by

Π

n

the set of polynomials of degree at most n, then the deﬁning property of P

f ,n,a

is:

Derivatives 125

1. P

f ,n,a

∈ Π

n

.

2. P

(k)

f ,n,a

(a) = f

(k)

(a) for k = 0, 1, . . . , n.

Deﬁnition 3.6 Let f , g be deﬁned in a neighborhood of a. We say that f and g

are equal up to order n at a if

lim

a

f − g

(Id −a)

n

= 0.

It is easy to see that “equal up to order n” is an equivalence relation, which

deﬁnes an equivalence class.

Thus, the above theorem proves that f (x) and P

f ,n,a

(provided that the latter exists)

are equal up to order n at a.

The next task is to obtain an expression for the diﬀerence between the function f

and its Taylor polynomial. We deﬁne the remainder (תיראש) of order n, R

n

(x), by

f (x) = P

f ,n,a

(x) + R

f ,n,a

(x).

Proposition 3.1 Let P, Q ∈ Π

n

be equal up to order n at a. Then P = Q.

Proof : We can always express these polynomials as

P(x) = a

0

+ a

1

(x − a) + + a

n

(x − a)

n

Q(x) = b

0

+ b

1

(x − a) + + b

n

(x − a)

n

.

We know that

lim

a

P − Q

(Id −a)

k

= 0 for k = 0, 1, . . . , n.

For k = 0

0 = lim

a

(P − Q) = a

0

− b

0

,

i.e., a

0

= b

0

. We proceed similarly for each k to show that a

k

= b

k

for k =

0, 1, . . . , n. B

126 Chapter 3

Corollary 3.6 If f is n times diﬀerentiable at a and f is equal to Q ∈ Π

n

up to

order n at a, then

P

f ,n,a

= Q.

Example: We know from high-school math that for ¦x¦ < 1

f (x) =

1

1 − x

=

∞

k=0

x

k

=

n

k=0

x

k

+

x

n+1

1 − x

.

Thus,

f (x) −

_

n

k=0

x

k

x

n

=

x

1 − x

.

Since the right hand side tends to zero as x → 0 it follows that f and

_

n

k=0

x

k

are

equal up to order n at 0, which means that

P

f ,n,0

(x) =

n

k=0

x

k

.

**Example: Similarly for ¦x¦ < 1
**

f (x) =

1

1 + x

2

=

∞

k=0

(−x

2

)

k

=

n

k=0

(−1)

k

x

2k

+ (−1)

n+1

x

2n+2

1 + x

2

.

Thus,

f (x) −

_

n

k=0

(−1)

k

x

2k

x

2n

=

x

2

1 + x

2

.

Since the right hand side tends to zero as x → 0 it follows that

P

f ,2n,0

(x) =

n

k=0

(−1)

k

x

2k

.

Derivatives 127

Example: Let f = arctan. Then,

f (x) =

_

x

0

dt

1 + t

2

=

_

x

0

_

¸

¸

¸

¸

¸

_

n

k=0

(−1)

k

t

2k

+ (−1)

n+1

t

2n+2

1 + t

2

_

¸

¸

¸

¸

¸

_

dt

=

n

k=0

(−1)

k

x

2k+1

2k + 1

+

_

x

0

(−1)

n+1

t

2n+2

1 + t

2

dt.

Since ¸

¸

¸

¸

¸

¸

¸

f (x) −

n

k=0

(−1)

k

x

2k+1

2k + 1

¸

¸

¸

¸

¸

¸

¸

≤ ¦x¦

2n+3

,

it follows that

P

f ,2n+2,0

=

n

k=0

(−1)

k

x

2k+1

2k + 1

.

(36 hrs, 2011)

Theorem 3.24 Let f , g be n times diﬀerentiable at a. Then,

P

f +g,n,a

= P

f ,n,a

+ P

g,n,a

,

and

P

f g,n,a

= [P

f ,n,a

P

g,n,a

]

n

,

where []

n

stands for a truncation of the polynomial in (Id −a)at the n-th power.

Proof : We use again the fact that if a polynomial “looks” like the Taylor polyno-

mial then it is. The sum P

f ,n,a

+ P

g,n,a

belongs to Π

n

and

lim

a

( f + g) − (P

f ,n,a

+ P

g,n,a

)

(Id −a)

n

= lim

a

f − P

f ,n,a

(Id −a)

n

+ lim

a

g − P

g,n,a

(Id −a)

n

= 0,

which proves the ﬁrst statement.

Second, we note that P

f ,n,a

P

g,n,a

∈ Π

2n

, and that

f g − P

f ,n,a

P

g,n,a

(Id −a)

n

=

f g − P

f ,n,a

g + P

f ,n,a

g − P

f ,n,a

P

g,n,a

(Id −a)

n

= g

f − P

f ,n,a

(Id −a)

n

+ P

f ,n,a

g − P

g,n,a

(Id −a)

n

,

128 Chapter 3

whose limit at a is zero. All that remains it to truncate the product at the n-th

power of (Id −a). B

Theorem 3.25 (Taylor) Suppose that f is (n +1)-times diﬀerentiable on the inter-

val [a, x]. Then,

f (x) = P

f ,n,a

(x) + R

n, f ,a

(x),

where the remainder R

f ,n,a

(x) can be represented in the Lagrange form,

R

f ,n,a

(x) =

f

(n+1)

(c)

(n + 1)!

(x − a)

n+1

,

for some c ∈ (a, x). It can also be represented in the Cauchy form,

R

f ,n,a

(x) =

f

(n+1)

(ξ)

n!

(x − ξ)

n

(x − a),

for some ξ ∈ (a, x).

Comment: For n = 0 this is simply the mean-value theorem.

Proof : Fix x, and consider the function φ : [a, x] → R,

φ(z) = f (x) − P

f ,n,z

(x)

= f (x) − f (z) − f

·

(z)(x − z) −

f

··

(z)

2!

(x − z)

2

− −

f

(n)

(z)

n!

(x − z)

n

.

We note that

φ(a) = R

f ,n,a

(x) and φ(x) = 0.

Moreover, φ is diﬀerentiable, with

φ

·

(z) = −f

·

(z) −

_

f

··

(z)(x − z) − f

·

(z)

_

−

_

f

···

(z)

2!

(x − z)

2

− f

··

(z)(x − z)

_

− −

_

f

(n+1)

(z)

n!

(x − z)

n

−

f

(n)

(z)

(n − 1)!

(x − z)

n−1

_

= −

f

(n+1)

(z)

n!

(x − z)

n

.

Derivatives 129

Let now ψ be any function deﬁned on [a, x], diﬀerentiable on (a, x) with non-

vanishing derivative. By Cauchy’s mean-value theorem, there exists a point a <

ξ < x, such that

φ(x) − φ(a)

ψ(x) − ψ(a)

=

φ

·

(ξ)

ψ

·

(ξ)

.

That is,

R

n

(x) =

ψ(x) − ψ(a)

ψ

·

(ξ)

f

(n+1)

(ξ)

n!

(x − ξ)

n

Set for example ψ(z) = (x − z)

p

for some p. Then,

R

n

(x) = (x − a)

p

f

(n+1)

(ξ)

p n!

(x − ξ)

n−p+1

For p = 1 we retrieve the Cauchy form. For p = n + 1 we retrieve the Lagrange

form. B

Comment: Later in this course we will study series and ask questions like “does

the Taylor polynomial P

n

(x) tend to f (x) as n → ∞?”. This is the notion of con-

verging series. Here we deal with a diﬀerent beast: the degree of the polynomial

n is ﬁxed and we consider how does P

n

(x) approach f (x) as x → a. The fact that

P

n

approaches f faster than (x − a)

n

as x → a means that P

n

is an asymptotic

series of f .

(34 hrs, 2009)

Exercise 3.12 Write the polynomial p(x) = x

3

−3x

2

+3x −6 as a polynomial

in x + 1. That is, ﬁnd coeﬃcients a

0

, a

1

, . . . , a

n

, such that p(x) =

_

n

k=0

a

k

(x + 1)

k

.

Solve this problem using two methods: (i) Newton’s binomial, and (ii) Taylor’s

polynomial.

Exercise 3.13 Let f : R → R be n times diﬀerentiable, such that f

(n)

(x) = 0

for all x ∈ R. Prove that f is a polynomial whose degree is at most n − 1.

Exercise 3.14 Write the Taylor polynomials of degree n at the point x

0

for

the following functions, as well as the corresponding remainders, both in the La-

grange and Cauchy forms:

' f (x) = 1/(1 + x), x

0

= 0, n ∈ N.

^ f (x) = cosh(x), x

0

= 0, n ∈ N.

130 Chapter 3

· f (x) = x

5

− x

4

+ 7x

3

− 2x

2

+ 3x + 1, x

0

= 1, n = 5.

´ f (x) = (x − 2)

6

+ 9(x + 1)

4

+ (x − 2)

2

+ 8, x

0

= 0, n = 6.

I f (x) = (1 + x)

α

, α N, x

0

= 0, n ∈ N. Use the notation

_

α

n

_

def

=

α(α − 1) . . . (α − n + 1)

n!

,

which generalizes Newton’s binomial.

I f (x) = sin(x), x

0

= π/2, n = 2k, k ∈ N.

Exercise 3.15 Show using diﬀerent methods that of f

··

(a) exists, then

f

·

(a) = lim

h→0

f (a + h) + f (a − h) − 2 f (a)

h

.

' Using l’Hopital’s rule.

^ Using Taylor’s polynomial of order two f at a. Note that you can’t use

neither Lagrange or Cauchy’s form of the remainder, since it is only given

that f is twice diﬀerentiable at a.

· Show that if

f (x) =

_

¸

¸

_

¸

¸

_

x

2

x ≥ 0

−x

2

x < 0,

then f

··

(0) does not exist, but the limit of the above fraction does exist. Why

isn’t it a contradiction?

Exercise 3.16 Evaluate, using Taylor’s polynomial, the following expressions,

up to an accuracy of at least 0.01:

' e

1/2

.

^ cos(1/2).

Exercise 3.17

' For every polynomial P(x) =

_

n

k=0

a

k

x

k

and evert q ∈ N, we denote

P

q

(x) =

_

¸

¸

_

¸

¸

_

_

q

k=0

a

k

x

k

q < n

P(x) q ≥ n.

Derivatives 131

Suppose that f , g have Taylor polynomials of order n at zero, P and Q.

Suppose furthermore that g(0) = 0, and that g does not vanish in a punctured

neighborhood of zero. Show that the Taylor polynomial of order n at zero

of f ◦ g is given by (P ◦ Q)

n

. Hint: write f (g(x)) = f (g(x))/g(x) g(x).

^ Given the Taylor polynomial P of order n at zero of the function f , write

the Taylor polynomial of order n at zero of the function g(x) = f (2x).

· Find the Taylor polynomial of order three at zero of the function f (x) =

exp(sin(x)).

Exercise 3.18 Let f be n times diﬀerentiable at x

0

and let P(x) =

_

n

k=0

a

k

x

k

be

its Taylor polynomial at zero. What is the Taylor polynomial of f

·

of order n − 1

at zero. Justify your answer.

Exercise 3.19 Show that if f and g are n times diﬀerentiable at x

0

, then

( f g)

(n)

(x

0

) =

n

k=0

_

n

k

_

f

(k)

(x

0

) g

(n−k)

(x

0

).

Hint: think of the Taylor polynomial of a product of functions.

Exercise 3.20 Suppose that f and g have derivatives of all orders in R and

that P

n

(x) = Q

n

(x) for all n ∈ N and every x ∈ R, where P

n

and Q

n

are the

corresponding Taylor polynomials at zero. Does it imply that f (x) = g(x)?

132 Chapter 3

Chapter 4

Integration theory

4.1 Deﬁnition of the integral

Derivatives were about calculating the rate of change of a function. Integrals were

invented in order to calculate areas. The fact that the two are intimately related—

that derivatives are in a sense the inverse of integrals—is a wonder.

When we learn in school about the area of a region, it is deﬁned as “the number

of squares with sides of length one that ﬁt into the region”. It is of course hard

to make sense with such a deﬁnition. We will learn how to calculate the area of

a region bounded by the x-axis, the graph of a function f , and two vertical lines

passing though points (a, 0) and (b, 0). For the moment let us assume that f ≥ 0.

This region will be denoted by R( f , a, b), and its area (once deﬁned) will be called

the integral of f on [a, b]

1

.

1

There are multiple non-equivalent ways to deﬁne integrals. We study integration theory in the

sense of Riemann; another classical integration theory is that of Lebesgue (studied in the context

of measure theory). Within the Riemann integration theory, we adopt a particular derivation due

to Darboux.

134 Chapter 4

a b

f

R!f,a,b"

Deﬁnition 4.1 A partition (הקולח) of the interval [a, b] is a partition (yes, I

know, you can’t use the word to be deﬁned in a deﬁnition) of the interval into a

ﬁnite number of segments,

[a, b] = [x

0

, x

1

] ∪ [x

1

, x

2

] ∪ ∪ [x

n−1

, x

n

].

We will identify a partition with a ﬁnite number of points,

a = x

0

< x

1

< x

2

< < x

n

= b.

Deﬁnition 4.2 Let f : [a, b] → Rbe a bounded function, and let P = ¦x

0

, x

1

, . . . , x

n

¦

be a partition of [a, b]. Let

m

i

= inf¦ f (x) : x

i−1

≤ x ≤ x

i

¦

M

i

= sup¦ f (x) : x

i−1

≤ x ≤ x

i

¦.

We deﬁne the lower sum (ותחת וכס) of f for P to be

L( f , P) =

n

i=1

m

i

(x

i

− x

i−1

).

We deﬁne the upper sum (וילע וכס) of f for P to be

U( f , P) =

n

i=1

M

i

(x

i

− x

i−1

).

Clearly, any sensible deﬁnition of the area A of R( f , a, b) must satisfy

L( f , P) ≤ A ≤ U( f , P)

for all partitions P.

Integration theory 135

a b

f

x

0

x

1

x

2

x

3

Figure 4.1: The lower and upper sums of f for P.

Comments:

' The boundedness of f is essential in these deﬁnitions.

^ It is always true that L( f , P) ≤ U( f , P) because m

i

≤ M

i

for all i.

· Is it however true that L( f , P

1

) ≤ U( f , P

2

) where P

1

and P

2

are arbitrary

partitions? It ought to be true.

(35 hrs, 2009)

(38 hrs, 2011)

Exercise 4.1 Let f : [−1, 1] → R be given by f : x ·→ x

2

and let P =

¦−1, −1/2, −1/3, 2/3, 8/9, 1¦ be a partition of [−1, 1]. Calculate L( f , P) and U( f , P).

Deﬁnition 4.3 A partition Q is called a reﬁnement (ודיע) of a partition P if

P ⊂ Q.

Lemma 4.1 Let Q be a reﬁnement of P. Then

L( f , Q) ≥ L( f , P) and U( f , Q) ≤ U( f , P).

136 Chapter 4

Proof : Suppose ﬁrst that Q diﬀers from P by exactly one point. That is,

P = ¦x

0

, x

1

, . . . , x

n

¦

Q = ¦x

0

, x

1

, . . . , x

j−1

, y, x

j

, . . . , x

n

¦.

Then,

L( f , P) =

n

i=1

m

i

(x

i

− x

i−1

)

and

L( f , Q) =

j−1

i=1

m

i

(x

i

− x

i−1

) + ˜ m

j

(y − x

j−1

) + ˆ m

j

(x

j

− y) +

n

i=j+1

m

i

(x

i

− x

i−1

),

where

˜ m

j

= inf¦ f (x) : x

j−1

≤ x ≤ y¦

ˆ m

j

= inf¦ f (x) : y ≤ x ≤ x

j

¦.

Thus,

L( f , Q) − L( f , P) = ( ˜ m

j

− m

j

)(y − x

j−1

) + ( ˆ m

j

− m

j

)(x

j

− y).

It remains to note that ˜ m

j

≥ m

j

and ˆ m

j

≥ m

j

, since both are inﬁmums over a

smaller set

2

. Thus, L( f , Q) − L( f , P) ≥ 0.

Consider now two partitions, P ⊂ Q. There exists a sequence of partitions,

P = P

0

⊂ P

1

⊂ ⊂ P

k

= Q,

such that each P

j

diﬀers from P

j−1

by one point. Hence,

L( f , P) ≤ L( f , P

1

) ≤ ≤ L( f , Q).

A similar argument works for the upper sums. B

Theorem 4.1 For every two partitions P

1

,P

2

of [a, b],

L( f , P

1

) ≤ U( f , P

2

).

2

It is always true that if A ⊂ B then

inf A ≥ inf B and sup A ≤ sup B.

Integration theory 137

Proof : Let Q = P

1

∪ P

2

. Since Q is ﬁner than both P

1

and P

2

it follows that

L( f , P

1

) ≤ L( f , Q) ≤ U( f , Q) ≤ U( f , P

2

).

B

Exercise 4.2 Prove or disprove:

' A partition Q is a reﬁnement of a partition P if and only if every segment in

Q is contained in a segment of P.

^ If a partition P is a reﬁnement of a partition Q then every segment in Q is a

segment in P.

· If R is a partition of both P and Q then it is also a reﬁnement of P ∪ Q.

Exercise 4.3 Characterize the functions f for which there exists a partition P

such that L( f , P) = U( f , P).

We may now consider the collection of lower and upper sums,

L( f ) = ¦L( f , P) : P is a partition of [a, b]¦

U ( f ) = ¦U( f , P) : P is a partition of [a, b]¦.

Note that both L( f ) and U ( f ) are sets of real numbers, and not sets of partitions.

The set L( f ) is non-empty and bounded from above by any element of U ( f ).

Similarly, the set U ( f ) is non-empty and bounded from below by any element of

L( f ). Thus, we may deﬁne the least upper bound and greatest lower bound,

L( f ) = sup L( f ) and U( f ) = inf U ( f ).

The real number U( f ) is called the upper integral (וילע לרגטניא) of f on [a, b]

and L( f ) is called the lower integral (ותחת לרגטניא) of f on [a, b]

We once proved a theorem stating that the following assertions are equivalent:

' L( f ) = U( f ).

^ There is a unique number c, such that ≤ c ≤ u for all ∈ L( f ) and

u ∈ U ( f ).

· For every > 0 there exist an ∈ L( f ) and a u ∈ U ( f ), such that u− < .

138 Chapter 4

Deﬁnition 4.4 If L( f ) = U( f ) we say that f is integrable on [a, b]. We call this

number the integral of f on [a, b], and denote it by

_

b

a

f .

That is, f is integrable on [a, b] if the upper and lower integrals coincide.

Comment: The integral of f is the unique number such that

L( f , P) ≤

_

b

a

f ≤ U( f , Q),

for all partitions P and Q.

Comment: You are all accustomed to the notation

_

b

a

f (x) dx.

This notation is indeed very suggestive about the meaning of the integral, however,

it is very important to realize that the integral only involves f , a and b, and it is

not a function of any x.

Comment: At this stage we only consider integrals of bounded functions over

bounded segments—so-called proper integrals. Integrals like

_

∞

a

f and

_

1

0

log x dx,

are instances of improper integrals. We will only brieﬂy talk about them (only

due to a lack of time).

Comment: What about the case where the sign of f is not everywhere positive?

The deﬁnition remains the same. The geometric interpretation is that areas under

the x-axis count as “negative areas”.

Two questions arise:

' How do we know whether a function is integrable?

Integration theory 139

^ How to compute the integral of an integrable function?

(40 hrs, 2011)

The following lemma will prove useful in checking whether a function is inte-

grable:

Theorem 4.2 A bounded function f is integrable if and only if there exists for

every > 0 a partition P, such that

U( f , P) < L( f , P) + .

Proof : If for every > 0 there exists a partition P, such that

U( f , P) < L( f , P) + ,

then f is integrable, by the equivalent criteria described above.

Conversely, suppose f is integrable. Then, given > 0, there exist partitions P, P

·

,

such that

U( f , P) − L( f , P

·

) < .

Let P

··

= P ∪ P

·

. Since P

··

is a reﬁnement of both, then

U( f , P

··

) − L( f , P

··

) ≤ U( f , P) − L( f , P

·

) < .

B

Example: Let f : [a, b] → R, f : x ·→ c. Then for every partition P,

L( f , P) =

n

i=1

m

i

(x

i

− x

i−1

) = c(b − a)

U( f , P) =

n

i=1

M

i

(x

i

− x

i−1

) = c(b − a).

It follows that L( f ) = U( f ) = c(b − a), hence f is integrable and

_

b

a

f = c(b − a).

140 Chapter 4

Example: Let f : [0, 1] → R,

f : x ·→

_

¸

¸

_

¸

¸

_

0 x Q

1 x ∈ Q.

For every partition and every j,

m

j

= 0 and M

j

= 1,

hence

L( f , P) = 0 and U( f , P) = 1,

and L( f ) = 0 and U( f ) = 1. Since L( f ) U( f ), f is not integrable.

(41 hrs, 2010)

Example: Consider next the function f : [0, 2] → R,

f : x ·→

_

¸

¸

_

¸

¸

_

0 x 1

1 x = 1.

Let P

n

, n ∈ N denote the family of uniform partitions,

x

i

=

2i

n

.

For every such partition,

L( f , P) = 0 and U( f , P) = 2/n.

Set now,

L

unif

( f ) = ¦L( f , P

n

) : n ∈ N¦

U

unif

( f ) = ¦U( f , P

n

) : n ∈ N¦

Clearly, L

unif

( f ) ⊆ L( f ) and U

unif

( f ) ⊆ U ( f ), hence

L( f ) = sup L( f ) ≥ sup L

unif

( f ) = 0

U( f ) = inf U ( f ) ≤ inf U

unif

( f ) = 0,

thus

0 ≤ L( f ) ≤ U( f ) ≤ 0,

Integration theory 141

which implies that f is integrable, and

_

2

0

f = 0.

This example shows that integrals are not aﬀected by the value of a function at an

isolated point.

We are next going to examine a number of examples, which on the one hand, will

practice our understanding of integrability, and on the other hand, will convince

us that we must develop tools to deal with more complicated functions.

Example: Let f : [0, b] → R, f = Id. Let P be any partition, then m

i

= x

i−1

and

M

i

= x

i

, hence

L( f , P) =

n

i=1

x

i−1

(x

i

− x

i−1

)

U( f , P) =

n

i=1

x

i

(x

i

− x

i−1

).

Using again uniform partition,

P

n

= ¦0, b/n, 2b/n, . . . , b¦,

L( f , P

n

) =

n

i=1

(i − 1)b

n

b

n

=

b

2

n

2

n(n − 1)

2

=

b

2

2

_

1 −

1

n

_

U( f , P

n

) =

n

i=1

ib

n

b

n

=

b

2

n

2

n(n + 1)

2

=

b

2

2

_

1 +

1

n

_

.

As in the previous example,

L( f ) = sup L( f ) ≥ sup L

unif

( f ) =

b

2

2

U( f ) = inf U ( f ) ≤ inf U

unif

( f ) =

b

2

2

,

hence f is integrable and

_

b

0

f =

b

2

2

.

(37 hrs, 2009)

142 Chapter 4

Example: Let f : [0, b] → R, f (x) = x

2

. Let P be any partition, then m

i

= x

2

i−1

and M

i

= x

2

i

, hence

L( f , P) =

n

i=1

x

2

i−1

(x

i

− x

i−1

)

U( f , P) =

n

i=1

x

2

i

(x

i

− x

i−1

).

Take the same uniform partition, then

3

L( f , P

n

) =

n

i=1

(i − 1)

2

b

2

n

2

b

n

=

b

3

n

3

(n − 1)n(2n − 1)

6

=

b

3

3

_

1 −

1

n

_ _

1 −

1

2n

_

U( f , P

n

) =

n

i=1

i

2

b

2

n

2

b

n

=

b

3

n

3

n(n + 1)(2n + 1)

6

=

b

3

3

_

1 +

1

n

_ _

1 +

2

n

_

.

Since

L( f ) ≥ sup L

unif

( f ) = b

3

/3 and U( f ) ≤ inf U

unif

( f ) = b

3

/3,

it follows that

_

b

0

f =

b

3

3

.

**This game can’t go much further, i.e., we need stronger tools. We start by verify-
**

ing which classes of functions are integrable.

(42 hrs, 2010)

4.2 Integration theorems

Theorem 4.3 If f [a, b] → R is monotone, then it is integrable.

3

n

k=1

k

2

=

1

6

n(n + 1)(2n + 1).

Integration theory 143

Proof : Assume, without loss of generality, that f is monotonically increasing.

Then, for every partition P,

L( f , P) =

n

i=1

f (x

i−1

)(x

i

− x

i−1

) and U( f , P) =

n

i=1

f (x

i

)(x

i

− x

i−1

),

hence,

U( f , P) − L( f , P) =

n

i=1

[ f (x

i

) − f (x

i−1

)](x

i

− x

i−1

).

Take now a family of equally-spaced partitions, P

n

= ¦a + j(b −a)/n¦

n

j=0

. For such

partitions,

U( f , P

n

) − L( f , P

n

) =

b − a

n

n

i=1

[ f (x

i

) − f (x

i−1

)] =

(b − a)[ f (b) − f (a)]

n

.

Thus, for every > 0 we can choose n > (b −a)[ f (b) − f (a)]/, which guarantees

that U( f , P

n

) − L( f , P

n

) < (we used the fact that the sum was “telescopic”). B

Theorem 4.4 If f [a, b] → R is continuous, then it is integrable.

Proof : First, a continuous function on a segment is bounded. We want to show

that for every > 0 there exists a partition P, such that

U( f , P) < L( f , P) + .

We need to somehow control the diﬀerence between m

i

and M

i

. Recall that a

function that is continuous on a closed interval is uniformly continuous. That is,

there exists a δ > 0, such that

¦ f (x) − f (y)¦ <

b − a

whenever ¦x − y¦ < δ.

Take then a partition P such that

x

i

− x

i−1

< δ for all i = 1, . . . , n.

Since continuous functions on closed intervals assume minima and maxima in the

interval, it follows that M

i

− m

i

< /(b − a), and

U( f , P) − L( f , P) =

n

i=1

(M

i

− m

i

)(x

i

− x

i−1

) <

b − a

n

i=1

(x

i

− x

i−1

) = .

144 Chapter 4

B

(42 hrs, 2011)

Theorem 4.5 Let f be a bounded function on [a, b], and let a < c < b. Then, f is

integrable on [a, b] if and only if it is integrable on both [a, c] and [c, b], in which

case

_

b

a

f =

_

c

a

f +

_

b

c

f .

Proof : (1) Suppose ﬁrst that f is integrable on [a, b]. It is suﬃcient to show that it

is integrable on [a, c]. For every > 0 there exists a partition P of [a, b], such that

U( f , P) − L( f , P) < .

If the partition P does not include the point c then let Q = P ∪ ¦c¦, and

U( f , Q) − L( f , Q) ≤ U( f , P) − L( f , P) < .

Suppose that x

j

= c, then P

·

= ¦x

0

, . . . , x

j

¦ is a partition of [a, c], and P

··

=

¦x

j

, . . . , x

n

¦ is a partition of [c, b]. We have

L( f , Q) = L( f , P

·

) + L( f , P

··

) and U( f , Q) = U( f , P

·

) + U( f , P

··

),

from which we deduce that

[U( f , P

·

) − L( f , P

·

)] = [U( f , Q) − L( f , Q)]

.,,.

<

−[U( f , P

··

) − L( f , P

··

)]

.,,.

≥0

< ,

hence f is integrable on [a, c].

(2) Conversely, suppose that f is integrable on both [a, c] and [c, b]. Then there a

exists a partition P

·

of [a, c] and a partition P

··

of [c, b], such that

U( f , P

·

) − L( f , P

·

) <

2

and U( f , P

··

) − L( f , P

··

) <

2

.

Since P = P

·

∪P

··

is a partition of [a, b], summing up, we obtain the desired result.

Integration theory 145

Thus, we have shown that f is integrable on [a, b] if and only if it is integrable on

both [a, c] and [c, b]. Next, since

L( f , P

·

) ≤

_

c

a

f ≤ U( f , P

·

) for every partition P

·

of [a, c]

L( f , P

··

) ≤

_

b

c

f ≤ U( f , P

··

) for every partition P

··

of [c, b],

it follows that

L( f , P) ≤

_

c

a

f +

_

b

c

f ≤ U( f , P) for every partition P of [a, b].

(More precisely, for every partition P that is a union of partitions of P

·

and P

··

,

but it is easy to see that this implies for every partition.) Since there is only one

such number, it is equal to

_

b

a

f . B

(39 hrs, 2009)

Comment: So far, we have only deﬁned the integral on [a, b] with b > a. We now

add the following conventions,

_

a

b

f = −

_

b

a

f and

_

a

a

f = 0,

so that (as can easily be checked) the addition rule always holds.

Theorem 4.6 If f and g are integrable on [a, b], then so is f + g, and

_

b

a

( f + g) =

_

b

a

f +

_

b

a

g.

Proof : First of all, we observe that if f and g are bounded functions on a set A,

then

4

inf¦ f (x) + g(x) : x ∈ A¦ ≥ inf¦ f (x) : x ∈ A¦ + inf¦g(x) : x ∈ A¦

sup¦ f (x) + g(x) : x ∈ A¦ ≤ sup¦ f (x) : x ∈ A¦ + sup¦g(x) : x ∈ A¦.

4

This is because for every x,

f (x) + g(x) ≥ inf¦ f (x) : x ∈ A¦ + inf¦g(x) : x ∈ A¦,

so that the right-hand side is a lower bound of f + g.

146 Chapter 4

This implies that for every partition P,

L( f + g, P) ≥ L( f , P) + L(g, P)

U( f + g, P) ≤ U( f , P) + U(g, P),

hence,

U( f + g, P) − L( f + g, P) ≤ [U( f , P) − L( f , P)] + [U(g, P) − L(g, P)]. (4.1)

We are practically done. Given > 0 there exist partitions P

·

, P

··

, such that

U( f , P

·

) − L( f , P

·

) <

2

and U(g, P

··

) − L(g, P

··

) <

2

.

Take P = P

·

∪ P

··

, then

U( f , P) − L( f , P) ≤ U( f , P

·

) − L( f , P

·

) <

2

U(g, P) − L(g, P) ≤ U(g, P

··

) − L(g, P

··

) <

2

,

and it remains to substitute into (4.1), to prove that f + g is integrable.

Now, for every partition P,

L( f , P) + L(g, P) ≤ L( f + g, P) ≤

_

b

a

( f + g) ≤ U( f + g, P) ≤ U( f , P) + U(g, P),

and also

L( f , P) + L(g, P) ≤

_

b

a

f +

_

b

a

g ≤ U( f , P) + U(g, P),

It follows that

¸

¸

¸

¸

¸

¸

_

b

a

( f + g) −

_

b

a

f −

_

b

a

g

¸

¸

¸

¸

¸

¸

≤ [U( f , P) + U(g, P)] − [L( f , P) + L(g, P)].

Since the right-hand side can be made less than for every , it follows that it is

zero

5

. B

5

If ¦x¦ < for every > 0 then x = 0.

Integration theory 147

Corollary 4.1 If f is integrable on [a, b] and g diﬀers from f only at a ﬁnite num-

ber of points then

_

b

a

f =

_

b

a

g.

Proof : We saw that (g − f ), which is non-zero only at a ﬁnite number of points, is

integrable and its integral is zero. Then we only need to use the additivity of the

integral for g = f + (g − f ). B

Exercise 4.4 Denote by

_

the upper integral. Let f , g be bounded on [a, b].

Show that

_ b

a

( f + g) ≤

_ b

a

f +

_ b

a

g.

Also, ﬁnd two functions f , g for which the inequality is strong.

Exercise 4.5 Recall the Riemann function,

R(x) =

_

¸

¸

_

¸

¸

_

1/q x = p/q ∈ Q

0 x Q.

Let [a, b] be a gievn interval.

' Show that

_

b

a

R = 0 (the lower integral).

^ Let > 0 and deﬁne

Z

= ¦x ∈ [a, b] : R(x) > ¦.

Show that Z

is a ﬁnite set.

· Show that there exists a partition P such that U(R, P) < .

´ Conclude that R is integrable on [a, b] and that

_

b

a

R = 0.

(44 hrs, 2010)

148 Chapter 4

Theorem 4.7 If f is integrable on [a, b] and c ∈ R, then c f is integrable on [a, b]

and

_

b

a

(c f ) = c

_

b

a

f .

Proof : Start with the case where c > 0. For every partition P,

L(c f , P) = c L( f , P) and U(c f , P) = c U( f , P).

Since f is integrable on [a, b], then there exists for every > 0 a partition P, such

that

U( f , P) − L( f , P) <

c

,

hence

U(c f , P) − L(c f , P) < ,

which proves that c f is integrable on [a, b]. Then, we note that for every partition

P,

L(c f , P) = c L( f , P) ≤ c

_

b

a

f ≤ c U( f , P) = U(c f , P).

Since there is only one number such that L(c f , P) ≤ ≤ U(c f , P) for all parti-

tions P, and this number is by deﬁnition

_

b

a

(c f ), then

_

b

a

(c f ) = c

_

b

a

f .

Consider then the case where c < 0; in fact, it is suﬃcient to consider the case of

c = −1, since for arbitrary c < 0 we write c f = (−1)¦c¦ f . For every partition P,

L(−f , P) = −U( f , P) and U(−f , P) = −L( f , P).

Since f is integrable on [a, b], then there exists for every > 0 a partition P, such

that

U( f , P) − L( f , P) < ,

hence

U(−f , P) − L(−f , P) = −(L( f , P) − U( f , P)) < ,

Integration theory 149

which proves that (−f ) is integrable on [a, b]. Then, we note that for every parti-

tion P,

L(−f , P) = −U( f , P) ≤ −

_

b

a

f ≤ −L( f , P) = U(−f , P),

and the rest is as above. B

(45 hrs, 2010)

This last theorem is in fact a special case of a more general theorem:

Theorem 4.8 Suppose that both f and g are integrable on [a, b], then so is their

product f g.

Proof : We will prove this theorem for the case where f , g ≥ 0. The generalization

to arbitrary functions is not hard (for example, since f and g are bounded, we

can add to each a constant to make it positive and then use the “arithmetic of

integrability”).

Let P be a partition. It is useful to introduce the following notations:

m

f

j

= inf¦ f (x) : x

j−1

≤ x ≤ x

j

¦

m

g

j

= inf¦g(x) : x

j−1

≤ x ≤ x

j

¦

m

f g

j

= inf¦ f (x)g(x) : x

j−1

≤ x ≤ x

j

¦

M

f

j

= sup¦ f (x) : x

j−1

≤ x ≤ x

j

¦

M

g

j

= sup¦g(x) : x

j−1

≤ x ≤ x

j

¦

M

f g

j

= sup¦ f (x)g(x) : x

j−1

≤ x ≤ x

j

¦.

We also deﬁne

M

f

= sup¦¦ f (x)¦ : a ≤ x ≤ b¦

M

g

= sup¦¦g(x)¦ : a ≤ x ≤ b¦.

For every x

j−1

≤ x ≤ x

j

,

m

f

j

m

g

j

≤ f (x)g(x) ≤ M

f

j

M

g

j

,

hence

m

f

j

m

g

j

≤ m

f g

j

and M

f g

j

≤ M

f

j

M

g

j

.

150 Chapter 4

It further follows that

M

f g

j

−m

f g

j

≤ M

f

j

M

g

j

−m

f

j

m

g

j

≤ (M

f

j

−m

f

j

)M

g

j

+m

f

j

(M

g

j

−m

g

j

) ≤ M

g

(M

f

j

−m

f

j

)+M

f

(M

g

j

−m

g

j

).

This inequality implies that

U( f g, P) − L( f g, P) ≤ M

g

[U( f , P) − L( f , P)] + M

f

[U(g, P) − L(g, P)].

It is then a simple task to show that the integrability of f and g implies the inte-

grability of f g. B

Theorem 4.9 If f is integrable on [a, b] and m ≤ f (x) ≤ M for all x, then

m(b − a) ≤

_

b

a

f ≤ M(b − a).

Proof : Let P

0

= ¦x

0

, x

n

¦, then

L( f , P

0

) ≥ m(b − a) and U( f , P

0

) ≤ M(b − a),

and the proof follows from the fact that

L( f , P

0

) ≤

_

b

a

f ≤ U( f , P

0

).

B

Suppose now that f is integrable on [a, b]. We proved that it is integrable on every

sub-segment. Deﬁne for every x,

F(x) =

_

x

a

f .

Theorem 4.10 The function F deﬁned above is continuous on [a, b].

Integration theory 151

Proof : Since f is integrable it is bounded. Deﬁne

M = sup¦¦ f (x)¦ : a ≤ x ≤ b¦.

Given > 0 take δ = /M. Then for ¦x − y¦ < δ (suppose wlog that y > x),

F(y) − F(x) =

_

y

a

f −

_

x

a

f =

_

y

x

f .

By the previous theorem, since −M ≤ f (x) ≤ M for all x,

−M(y − x) ≤ F(y) − F(x) ≤ M(y − x),

i.e., ¦F(y) − F(x)¦ ≤ M(y − x) < . That is, given > 0, we take δ = /M, and then

¦F(y) − F(x)¦ < whenever ¦x − y¦ < δ,

which proves that F is (uniformly) continuous on [a, b]. B

(41 hrs, 2009)

Exercise 4.6 Let f be bounded and integrable in [a, b] and denote M = sup¦¦ f (x)¦ :

a ≤ x ≤ b¦. Show that for every p > 0,

__

b

a

¦ f ¦

p

_

1/p

≤ M¦b − a¦

1/p

,

and in particular, that the integral on the left hand side exists.

Exercise 4.7 Show that if f is bounded and integrable on [a, b], then ¦ f ¦ is

integrable on [a, b] and

¸

¸

¸

¸

¸

¸

_

b

a

f

¸

¸

¸

¸

¸

¸

≤

_

b

a

¦ f ¦.

Exercise 4.8 Let a < b and c > 0 and let f be integrable on [ca, cb]. Show

that

_

cb

ca

f = c

_

b

a

g,

where g(x) = f (cx).

152 Chapter 4

Exercise 4.9 Let f be integrable on [a, b] and g(x) = f (x − c). Show that g is

integrable on [a + c, b + c] and

_

b

a

f =

_

b+c

a+c

g.

Exercise 4.10 Let f : [a, b] → R be bounded, and deﬁne the variation of f in

[a, b] as

ω

f

= sup¦ f (x) : a ≤ x ≤ b¦ − inf¦ f (x) : a ≤ x ≤ b¦.

' Prove that

ω

f

= sup¦ f (x

1

)−f (x

2

) : a ≤ x

1

, x

2

≤ b¦ = sup¦¦ f (x

1

)−f (x

2

)¦ : a ≤ x

1

, x

2

≤ b¦.

^ Show that f is integrable on [a, b] if and only if there exists for every > 0

a partition P = ¦x

0

, . . . , x

n

¦, such that

_

n

i=1

ω

i

< , where ω

i

is the variation

of f in the segment [x

i−1

, x

i

].

· Prove that if f is integrable on [a, b], and there exists an m > 0 such that

¦ f (x)¦ ≥ m for all x ∈ [a, b], then 1/ f is integrable on [a, b].

Exercise 4.11 A function s : [a, b] → R is called a step function (תיצקנופ

תוגרדמ) if there exists a partition p such that s is constant on every open interval

(we do not care for the values at the nodes),

' Show that if f is integrable on [a, b] then there exist for every > 0 two

step functions, s

1

≤ f and s

2

≥ f , such that

_

b

a

f −

_

b

a

s

1

< and

_

b

a

s

2

−

_

b

a

f < .

^ Show that if f is integrable on [a, b] then there exists for every > 0 a

continuous function g, g ≤ f , such that

_

b

a

f −

_

b

a

g < .

Discussion: A question that occupied many mathematicians in the second half

of the 19th century is what characterizes integrable functions. Hermann Hankel

(1839-1873) proved that if a function is integrable, then it is necessarily continu-

ous on a dense set of points (he also proved the opposite, but had an error in his

Integration theory 153

proof). The ultimate answer is due to Henri Lebesgue (1875-1941) who proved

that a bounded function is integrable if and only if its points of discontinuity have

measure zero (a notion not deﬁned in this course).

(44 hrs, 2011)

4.3 The fundamental theorem of calculus

Theorem 4.11 ((ילאיצנרפידה ובשחה לש ידוסיה טפשמה)) Let f be inte-

grable on [a, b] and deﬁne for every x ∈ [a, b],

F(x) =

_

x

a

f .

If f is continuous at c ∈ (a, b), then F is diﬀerentiable at c and

F

·

(c) = f (c).

Comment: The theorem states that in a certain way the integral is an operation

inverse to the derivative. Note the condition on f being continuous. For example,

if f (x) = 0 everywhere except for the point x = 1 where it is equal to one, then

F(x) = 0, and it is not true that F

·

(1) = f (1).

Proof : We need to show that

lim

c

∆

F,c

= f (c).

We are going to show that this holds for the right derivative; the same arguments

can then be applied for the left derivative.

For x > c,

F(x) − F(c) =

_

c

x

f .

If we deﬁne

m(c, x) = inf¦ f (y) : x < y < c¦

M(c, x) = sup¦ f (y) : x < y < c¦,

154 Chapter 4

then,

m(c, x)(x − c) ≤ F(x) − F(c) ≤ M(c, x)(x − c),

or

m(c, x) ≤ ∆

F,c

(x) ≤ M(c, x).

Then,

¸

¸

¸∆

F,c

(x) − f (c)

¸

¸

¸ ≤ max ¦¦m(c, x) − f (y)¦, ¦M(c, x) − f (y)¦¦ .

It remains that the right-hand side as a limit zero at c. Since f is continuous at c,

(∀ > 0)(∃δ > 0) : (∀c < y < c + δ)(¦ f (y) − f (c)¦ < /2).

For c < x < c + δ,

(∀ > 0)(∃δ > 0) : (∀c < y < c + x)(¦ f (y) − f (c)¦ < /2),

hence

¦m(c, x) − f (c)¦ ≤ /2 < e and ¦M(c, x) − f (c)¦ ≤ /2 < e.

To conclude, for every > 0 there exists a δ > 0 such that for all c < x < c + δ,

¸

¸

¸∆

F,c

(x) − f (c)

¸

¸

¸ < ,

which concludes the proof. B

Corollary 4.2 (Newton-Leibniz formula) If f is continuous on [a, b] and f = g

·

for some function g, then

_

b

a

f = g(b) − g(a).

Proof : We have seen that

F(x) =

_

x

a

f

satisﬁes the identity F

·

(x) = f (x) for all x (since f is continuous on [a, b]). We

have also seen that it implies that F = g + c, for some constant c, hence

_

x

a

f = F(x) = g(x) + c,

Integration theory 155

but since F(a) = 0 it follows that c = −g(a). Setting x = b we get the desired

result. B

This is a very useful result. It means that whenever we know a pair of functions

f , g, such that g

·

= f , we have an easy way to calculate the integral of f . On the

other hand, it is totally useless if we want to calculate the integral of f , but no

primitive function (המודק היצקנופ) g is known.

(43 hrs, 2009) (47 hrs, 2010)

Exercise 4.12 Let f : R → R be continuous and g, h : (a, b) → R diﬀeren-

tiable. Deﬁne F : (a, b) → R by

F(x) =

_

h(x)

g(x)

f .

' Show that F is well-deﬁned for all x ∈ (a, b).

^ Show that F is diﬀerentiable, and calculate its derivative.

Exercise 4.13 Diﬀerentiate the following functions:

'

_

1

0

cos(t

2

) dt.

^

_

e

x

0

cos(t) dt.

·

_

x

2

x

e

sin(t)

dt.

Exercise 4.14 Let f : R → R be continuous. For every δ > 0 we deﬁne

F

δ

(x) =

1

2δ

_

x+δ

x−δ

f .

Show that for every x ∈ R,

f (x) = lim

δ→0

+

F

δ

(x).

Exercise 4.15 Let f , g : [a, b] → R be integrable, such that g(x) ≥ 0 for all

x ∈ [a, b].

' Show that there exists a λ ∈ R, such that

inf¦ f (x) : a ≤ x ≤ b¦ ≤ λ ≤ sup¦ f (x) : a ≤ x ≤ b¦,

156 Chapter 4

and

_

b

a

( f g) = λ

_

b

a

g.

Hint: distinguish between the cases where

_

b

a

g > 0 and

_

b

a

g = 0.

^ Show that if f is continuous, then there exists a c ∈ [a, b], such that

_

b

a

( f g) = f (c)

_

b

a

g.

(This is known as the integral mean value theorem.)

Exercise 4.16 Let f : [a, b] → R be monotonic and let g : [a, b] → R be

non-negative and integrable. Show that there exists a ξ ∈ [a, b], such that

_

b

a

( f g) = f (a)

_

ξ

a

g + f (b)

_

b

ξ

g.

Exercise 4.17 Let f : [a, b] → R be integrable, c ∈ (a, b), and F(x) =

_

x

a

f .

Prove or disprove:

' If f is diﬀerentiable at c then F is diﬀerentiable at c.

^ If f is diﬀerentiable at c then F

·

is continuous at c.

· If f is diﬀerentiable in a neighborhood of c and f

·

is continuous at c then F

is twice diﬀerentiable at c.

4.4 Riemann sums

Let’s ask a practical question. Let f be a given integrable function on [a, b], and

suppose we want to calculate the integral of f on [a, b], however we do not know

any primitive function of f . Suppose that we are content with an approximation

of this integral within a maximum error of . Take a partition P: since

L( f , P) ≤

_

b

a

f ≤ U( f , P),

it follows that if the diﬀerence between the upper and lower sum is less than ,

then their average diﬀers from the integral by less that /2. If the diﬀerence is

Integration theory 157

greater than , then it can be reduced by reﬁning the partition. Thus, up to the fact

that we have to compute lower and upper sums, we can keep reﬁning until they

diﬀer by less than , and so we obtain an approximation to the integral within the

desired accuracy.

There is one drawback in this plan. Inﬁmums and supremums may be hard to

calculate. On the other hand, since m

i

≤ f (x) ≤ M

i

for all x

i−1

≤ x ≤ x

i

, it follows

that if in every interval in the partition we take some point t

i

(a sample point),

then

L( f , P) ≤

n

i=1

f (t

i

)(x

i−1

− x

i

) ≤ U( f , P).

The sum in the middle is called a Riemann sum. Its evaluation only requires

function evaluations, without any inf or sup. If the partition is suﬃciently ﬁne,

such that U( f , P) − L( f , P) < , then

¸

¸

¸

¸

¸

¸

¸

n

i=1

f (t

i

)(x

i−1

− x

i

) −

_

b

a

f

¸

¸

¸

¸

¸

¸

¸

< .

Note that we don’t know a priori whether a Riemann is greater or smaller than the

integral. What we are going to show is that Riemann sums converge to the integral

as we reﬁne the partition, irrespectively on how we sample the points t

i

.

Deﬁnition 4.5 The diameter of a partition P is

diamP = max¦x

i

− x

i−1

: 1 ≤ i ≤ n¦.

Theorem 4.12 Let f be an integrable function on [a, b]. For every > 0 there

exists a δ > 0, such that

¸

¸

¸

¸

¸

¸

¸

n

i=1

f (t

i

)(x

i−1

− x

i

) −

_

b

a

f

¸

¸

¸

¸

¸

¸

¸

<

for every partition P such that diamP < δ and any choice of points t

i

.

Proof : It is enough to show that given > 0 there exists a δ > 0 such that

¦U( f , P) − L( f , P)¦ < whenever diamP < δ,

158 Chapter 4

for any Riemann sum with the same partition will diﬀer from the integral by less

than .

Let

M = sup¦¦ f (x)¦ : a ≤ x ≤ b¦,

and take some partition P

·

= ¦y

0

, . . . , y

k

¦ for which

¦U( f , P

·

) − L( f , P

·

)¦ <

2

.

We set

δ <

4Mk

.

For every partition P of diameter less than δ,

U( f , P) − L( f , P) =

n

i=1

(M

i

− m

i

)(x

i

− x

i−1

)

can be divided into two sums. One for which each of the intervals [x

i−1

, x

i

] is

contained in an interval [y

j−1

, y

j

]. The contribution of these terms is at most /2.

The second sum includes those intervals for which x

i−1

< y

j

< x

i

. There are at

most k such terms and each contributes at most Mδ. This concludes the proof. B

4.5 The trigonometric functions

Discussion: Describe the way the trigonometric functions are deﬁned in geome-

try, and explain that we want deﬁnitions that only rely on the 13 axioms of real

numbers.

Deﬁnition 4.6 The number π is deﬁned as

π = 2

_

1

−1

√

1 − x

2

dx.

Comment: We have used here the standard notation of integrals. In our notation,

we have a function ϕ : [−1, 1] → R deﬁned as ϕ : x ·→

√

1 − x

2

, and we deﬁne

π = 2

_

1

−1

ϕ.

Integration theory 159

This is well deﬁned because ϕ is continuous, hence integrable.

Consider now a unit circle. For −1 ≤ x ≤ 1 we look at the point (x,

√

1 − x

2

),

connect it to the origin, and express the area of the sector bounded by this seg-

ment and the positive x-axis. This geometric construction is just the motivation;

formally, we deﬁne a function A : [−1, 1] → R,

A(x) =

x

√

1 − x

2

2

+

_

1

x

ϕ.

If θ is the angle of this sector, then we know that x = cos θ, and that A(x) = θ/2.

Thus,

A(cos θ) =

θ

2

,

or,

cos θ = A

−1

_

θ

2

_

.

This observation motivates the way we are going to deﬁne the cosine function.

x

A!x"

We now examine some properties of the function A. First, A(−1) = π/2 and

A(1) = 0. Second,

A

·

(x) = −

1

2

√

1 − x

2

,

i.e., the function A is monotonically decreasing. It follows that it is invertible,

with its inverse deﬁned on [0, π/2].

Deﬁnition 4.7 We deﬁne the functions cos : [0, π] → R and sin : [0, π] → R by

cos x = A

−1

(x/2) and sin x =

√

1 − cos

2

x.

160 Chapter 4

With these deﬁnitions in hand we deﬁne the other trigonometric functions, tan,

cot, etc.

We now need to show that the sine and cosine functions satisfy all known diﬀer-

ential and algebraic properties.

Theorem 4.13 For 0 < x < π,

sin

·

x = cos x and cos

·

x = −sin x.

Proof : Since A is diﬀerentiable and A

·

(x) 0 for 0 < x < π, we have by the chain

rule and the derivative of the inverse function formula,

cos

·

x =

1

2

(A

−1

)

·

(x/2) =

1

2

1

A

·

(A

−1

(x/2))

= −

1

2

1

1

2

√

1−(A

−1

(x/2))

2

= = −sin x.

The derivative of the sine function is then obtained by the composition rule. B

(45 hrs, 2009)

We are now in measure to start picturing the graphs of these two functions. First,

by deﬁnition,

sin x > 0 0 < x < π.

It thus follows that cos is decreasing in this domain. We also have

cos π = A

−1

(π/2) = −1 and cos 0 = A

−1

(0) = 1,

from which follows that

sin π = 0 and sin 0 = 0.

By the intermediate value theorem, cos must vanish for some 0 < y < π. Since

A(cos y) =

y

2

,

it follows that

y = 2A(0) = 2

_

1

0

ϕ =

π

2

,

Integration theory 161

where we have used the fact that by symmetry,

_

0

−1

ϕ =

_

1

0

ϕ.

Thus,

cos

π

2

= 0 and sin

π

2

= 1.

We can then extend the deﬁnition of the sine and cosine to all of R by setting

sin x = −sin(2π − x) and cos x = cos(2π − x) for π ≤ x ≤ 2π,

and

sin(x + 2πk) = sin x and cos(x + 2πk) = cos x for all k ∈ Z,

It is easy to verify that the expressions for the derivatives of sine and cosine remain

the same for all x ∈ R.

Since we deﬁned the trigonometric functions (more precisely their inverses) via

integrals, it is not surprising that it was straightforward to diﬀerentiate them. A

little more subtle is to derive their algebraic properties.

Lemma 4.2 Let f : R → R be twice diﬀerentiable, and satisfy the diﬀerential

equation

f

··

+ f = 0,

with f (0) = f

·

(0) = 0. Then, f is identically zero.

Proof : Multiply the equation by 2f

·

, then

2 f

··

f

·

+ 2f f

·

= [( f

·

)

2

+ f

2

]

·

= 0,

from which we deduce that

( f

·

)

2

+ f

2

= c,

for some c ∈ R. The conditions at zero imply that c = 0, which implies that f is

identically zero. B

162 Chapter 4

Theorem 4.14 Let f : R → R be twice diﬀerentiable, and satisfy the diﬀerential

equation

f

··

+ f = 0,

with f (0) = a and f

·

(0) = b. Then,

f = a cos +b sin .

Proof : Deﬁne

g = f − a cos −b sin,

then,

g

··

+ g = 0,

and g(0) = g

·

(0) = 0. It follows from the lemma that g is identically zero, hence

f = a cos +b sin. B

Theorem 4.15 For every x, y ∈ R,

sin(x + y) = sin x cos y + cos x sin y

cos(x + y) = cos x cos y − sin x sin y.

Proof : Fix y and deﬁne

f : x ·→ sin(x + y),

from which follows that

f

·

: x ·→ cos(x + y)

f

··

: x ·→ −sin(x + y).

Thus, f

··

+ f = 0 and

f (0) = sin y and f

·

(0) = cos y.

It follows from the previous theorem that

f (x) = sin y cos x + cos y sin x.

We proceed similarly for the cosine function. B

Integration theory 163

4.6 The logarithm and the exponential

Our (starting) goal is to deﬁne the function

f : x ·→ 10

x

for x ∈ R, along with its inverse, which we denote by log

10

.

For n ∈ Nwe have an inductive deﬁnition, which in particular, satisﬁes the identity

10

n

10

m

= 10

n+m

.

We then extend the deﬁnition of the exponential (with base 10) to integers, by

requiring the above formula to remain correct. Thus, we must set

10

0

= 1 and 10

−n

=

1

10

n

.

The next extension is to fractional powers of the form 10

1/n

. Since we want

10

1/n

× 10

1/n

× × 10

1/n

.,,.

n times

= 10

1

= 10,

this forces the deﬁnition 10

1/n

=

n

√

10. Then, we extend the deﬁnition to rational

powers, by

10

1/n

× 10

1/n

× × 10

1/n

.,,.

m times

= 10

m/n

.

We have proved in the beginning of this course that the deﬁnition of a rational

power is independent of the representation of that fraction.

The question is how to extend the deﬁnition of 10

x

when x is a real number?

This can be done with lots of “sups” and “infs”, but here we shall adopt another

approach. We are looking for a function f with the property that

f (x + y) = f (x) f (y),

and f (1) = 10. Can we ﬁnd a diﬀerentiable function that satisﬁes these properties?

If there was such a function, then

f

·

(x) = lim

h→0

f (x + h) − f (x)

h

= lim

h→0

f (x) f (h) − f (x)

h

=

_

lim

h→0

f (h) − 1

h

_

f (x).

164 Chapter 4

Suppose that the limit on the right hand side exists, and denote it by α. Then,

f

·

(x) = α f (x) and f (1) = 10.

This is not very helpful, as the derivative of f is expressed in terms of the (un-

known) f .

It is clear that f should be monotonically increasing (it is so for rational powers),

hence invertible. Then,

( f

−1

)

·

(x) =

1

f

·

( f

−1

(x))

=

1

αf ( f

−1

(x))

=

1

αx

.

From this we deduce that

f

−1

(x) =

_

x

1

dt

αt

= log

10

(x).

We seem to have succeeded in deﬁning (using integration) the logarithm-base-10

function, but there is a caveat. We do not know the value of α. Since, in any case,

the so far distinguished role of the number 10 is unnatural, we deﬁne an alternative

logarithm function:

Deﬁnition 4.8 For all x > 0,

log(x) =

_

x

1

dt

t

.

Comment: Since 1/x is continuous on (0, ∞), this integral is well-deﬁned for all

0 < x < ∞.

Theorem 4.16 For all x, y > 0,

log(xy) = log(x) + log(y).

Proof : Fix y and consider the function

f : x ·→ log(xy).

Integration theory 165

Then,

f

·

(x) =

1

xy

y =

1

x

.

Since log

·

(x) = 1/x, it follows that

log(xy) = log(x) + c.

The value of the constant is found by setting x = 1. B

Corollary 4.3 For integer n,

log(x

n

) = n log(x).

Proof : By induction on n. B

Corollary 4.4 For all x, y > 0,

log

_

x

y

_

= log(x) − log(y).

Proof :

log(x) = log

_

x

y

y

_

= log

_

x

y

_

+ log(y).

B

Deﬁnition 4.9 The exponential function is deﬁned as

exp(x) = log

−1

(x).

Comment: The logarithm is invertible because it is monotonically increasing, as

its derivative is strictly positive. What is less easy to see is that the domain of the

exponential is the whole of R. Indeed, since

log(2

n

) = n log(2),

166 Chapter 4

it follows that the logarithm is not bounded above. Similarly, since

log(1/2

n

) = −n log(2),

it is neither bounded below. Thus, its range is R.

Theorem 4.17

exp

·

= exp .

Proof : We proceed as usual,

exp

·

(x) = (log

−1

)

·

(x) =

1

log

·

(log

−1

(x))

= log

−1

(x) = exp(x).

B

Theorem 4.18 For every x, y,

exp(x + y) = exp(x) exp(y).

Proof : Set t = exp(x) and s = exp(y). Thus, x = log(t) and y = log(s), and

x + y = log(t) + log(s) = log(ts).

Exponentiating both sides, we get the desired result. B

Deﬁnition 4.10

e = exp(1).

There is plenty more to be done with exponentials and logarithms, like for exam-

ple, show that for rational numbers r, e

r

= exp(r), and the passage between bases.

Due to the lack of time, we will stop at this point.

Integration theory 167

4.7 Integration methods

Given a function f , a question of practical interest is how to calculate integrals,

_

b

a

f .

If we know a function F, which is a primitive of f , in the sense that F

·

= f , then

the answer is known,

_

b

a

f = F(b) − F(a)

def

= F¦

b

a

.

To eﬀectively use this formula, we must know many pairs of functions f , F. Here

is a table of pairs f , F, which we already know:

f (x) F(x)

1 x

x

n

x

n+1

/(n + 1)

cos x sin x

sin x −cos x

1/x log(x)

exp(x) exp(x)

sec

2

(x) tan(x)

1/(1 + x

2

) arctan(x)

1/

√

1 − x

2

arcsin(x)

Sums of such functions and multiplication by constants are easily treated with our

integration theorems, for

_

b

a

( f + g) =

_

b

a

f +

_

b

a

g and

_

b

a

(c f ) = c

_

b

a

f .

Functions that can be expressed in terms of powers, trigonometric, exponential

and logarithmic functions and called elementary. In this section we study tech-

niques that in some cases allow to express a primitive function in terms of ele-

mentary functions. There are basically two techniques—integration by parts, and

substitution—both we will study.

168 Chapter 4

Comment: Every continuous function f has a primitive,

F(x) =

_

x

a

f ,

where a is arbitrary. It is however not always the case that the primitive of an

elementary function is itself and elementary function.

Example: Consider the function

F(x) = x arctan(x) −

1

2

log(1 + x

2

).

It is easily checked that

f (x) = F

·

(x) = arctan(x).

As a result of this observation,

_

b

a

arctan(x) dx = F¦

b

a

.

The question is how could have we guessed this answer if the function F had not

been given to us.

Theorem 4.19 (Integration by parts (יקלחב היצרגטניא)) Let f , g have con-

tinuous ﬁrst derivatives, then

_

b

a

f g

·

= ( f g)¦

b

a

−

_

b

a

f

·

g.

Comment: When we use this formula, f and g

·

are given. Then, g on the right-

hand side may be any primitive of g

·

. Also, setting b = x and noting that primitives

are deﬁned up to constants, we have that

Primitive( f g

·

) = f Primitive(g

·

) − Primitive( f

·

Primitive(g)) + const.

Integration theory 169

Example:

_

b

a

x e

x

dx.

**Example: The 1-trick:
**

_

b

a

log(x) dx.

**Example: The back-to-original-problem-trick
**

_

b

a

(1/x) log(x) dx.

**Once we have used this method to get new f , F pairs, we use this knowledge to
**

obtain new such pairs:

Example: Using the fact that a primitive of log(x) is x log(x) − x, we solve

_

b

a

log

2

(x) dx.

**Theorem 4.20 (Substitution formula (הבצהה תטיש)) For continuously diﬀer-
**

entiable f and g,

_

b

a

(g ◦ f ) f

·

=

_

f (b)

f (a)

g.

Proof : Let G be a primitive of g, then

(G ◦ f )

·

= (g ◦ f ) f

·

,

170 Chapter 4

hence

_

b

a

(g ◦ f ) f

·

=

_

b

a

(G ◦ f )

·

= (G ◦ f )¦

b

a

= G¦

f (b)

f (a)

=

_

f (b)

f (a)

g.

B

The simplest use of this formula is by recognizing that a function is of the form

(g ◦ f ) f

·

.

Example: Take the integral

_

b

a

sin

5

(x) cos(x) dx.

If f (x) = sin(x) and g(x) = x

5

, then

sin

5

(x) cos(x) = g( f (x)) f

·

(x),

hence

_

b

a

sin

5

(x) cos(x) dx =

_

sin(b)

sin(a)

t

5

dt =

t

6

6

¸

¸

¸

¸

¸

¸

sin(b)

sin(a)

.

(46 hrs, 2009)

Chapter 5

Sequences

5.1 Basic deﬁnitions

An (inﬁnite) sequence (הרדס) is an inﬁnite ordered set of real numbers. We denote

sequences by a letter which labels the sequence, and a subscript which labels the

position in the sequence, for example,

a

1

, a

2

, a

3

, . . . .

The essential fact is that to each natural number corresponds a real number. Hav-

ing said that, we can deﬁne:

Deﬁnition 5.1 An inﬁnite sequence is a function N → R.

Comment: Let a : N → R. Rather than writing a(3), a(17), we write a

3

, a

17

.

Comment: As for functions, where we often refer to “the function x

2

”, we often

refer to “the sequence 1/n”.

Notation: We usually denote a sequence a : N → R by (a

n

). Another ubiquitous

notation is (a

n

)

∞

n=1

.

Examples:

' a : n ·→ n or a

n

= n.

172 Chapter 5

^ b : n ·→ (−1)

n

or b

n

= (−1)

n

.

· c : n ·→ 1/n or c

n

= 1/n.

´ (d

n

) = ¦2, 3, 5, 7, 11, . . . ¦, the sequence of primes.

In the ﬁrst and fourth cases we would say that “the sequence goes (תפאוש) to

inﬁnity”. In the second case we would say that “it jumps between −1 and 1”. In

the third case we would say that “it goes to zero”. We will make such statements

precise momentarily.

Like functions, sequences can be graphed. More useful often is to mark the ele-

ments of the sequence on the number axis.

a

1

a

4

a

5

a

3

a

2

5.2 Limits of sequences

Deﬁnition 5.2 A sequence (a

n

) converges (תסנכתמ) to , denoted

lim

n→∞

a

n

= ,

if

(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a

n

− ¦ < ).

Deﬁnition 5.3 A sequence is called convergent (תסנכתמ) if it converges to some

(real) number; otherwise it is called divergent (תרדבתמ) (there is no such thing

as convergence to inﬁnity).

Comment: There is a close connection between the limit of a sequence and the

limit of a function at inﬁnity. We will elaborate on that later on.

Notation: Like for functions, a “cleaner” notation for the limit of a sequence (a

n

)

would be

lim

∞

a.

Sequences 173

Example: Let a : n ·→ 1/n, or

(a

n

) = 1,

1

2

,

1

3

,

1

4

, . . .

It is easy to show that

lim

∞

a = 0,

since,

(∀ > 0)(set N = ¦1/¦) then (∀n > N)

_

¦a

n

¦ =

1

n

<

1

N

<

_

.

Example: Let a : n ·→

√

n + 1 −

√

n, or

(a

n

) =

√

2 −

√

1,

√

3 −

√

2,

√

4 −

√

3, . . . .

It seems reasonable that

lim

∞

a = 0,

however showing it requires some manipulations. One way is by an algebraic

trick,

a

n

=

(

√

n + 1 −

√

n)(

√

n + 1 +

√

n)

√

n + 1 +

√

n

=

1

√

n + 1 +

√

n

<

1

2

√

n

,

from which it is easy to proceed. The same inequality can be obtained by deﬁning

a function f : x ·→

√

x, and using the mean-value theorem,

√

y −

√

x =

y − x

2

√

ξ

,

for some x < ξ < y. Substituting x = n and y = n + 1, we obtain the desired

inequality.

Example: Let now

a : n ·→

3n

3

+ 7n

2

+ 1

4n

3

− 8n + 63

.

174 Chapter 5

It seems reasonable that

lim

∞

a =

3

4

,

but once again, proving it is not totally straightforward. One way to proceed is to

note that

a

n

=

3 + 7/n + 1/n

3

4 − 8/n

2

+ 63/n

3

,

and use limit arithmetic as stated below.

Exercise 5.1 For each of the following sequence determine whether it is con-

verging or diverging. Prove your claim using the deﬁnition of the limit:

' a

n

=

7n

2

+

√

n

n

2

+3n

.

^ a

n

=

n

4

+1

n

3

−3

.

· a

n

=

√

n

2

+ 1 − n.

Theorem 5.1 Let a and b be convergent sequences. Then the sequences (a+b)

n

=

a

n

+ b

n

and (a b)

n

= a

n

b

n

are also convergent, and

lim

∞

(a + b) = lim

∞

a + lim

∞

b

lim

∞

(a b) = lim

∞

a lim

∞

b.

If furthermore lim

∞

b 0, then there exists an N ∈ N such that b

n

0 for all

n > N, the sequence (a/b) converges and

lim

∞

a

b

=

lim

∞

a

lim

∞

b

.

Comment: Since the limit of a sequence is only determined by the “tail” of the

sequence, we do not care if a ﬁnite number of elements is not deﬁned.

Proof : Easy once you’ve done it for functions. B

Sequences 175

Exercise 5.2 Show that if a and b are converging sequences, such that

lim

∞

a = L and lim

∞

b = M,

then

lim

∞

(a + b) = L + M.

We now establish the analogy between limits of sequences and limits of functions.

Let a be a sequence, and deﬁne a function f : [1, ∞) → R by

f (x) = a

n

+ (a

n+1

− a

n

)(x − n) n ≤ x ≤ n + 1.

Proposition 5.1

lim

∞

a = if and only if lim

∞

f = .

Proof : Suppose ﬁrst that the sequence converges. Thus,

(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a

n

− ¦ < ).

Since for every x, f (x) lies between f (¦x¦) and f ([x|), it follows that for all x > N,

¦ f (x) − ¦ ≤ max

_

¦a

¦x¦

− ¦, ¦a

[x|

− ¦

_

< ,

hence

(∀ > 0)(∃N ∈ N) : (∀x > N)(¦ f (x) − ¦ < ).

Conversely, if f converges to as x → ∞, then,

(∀ > 0)(∃N ∈ N) : (∀x > N)(¦ f (x) − ¦ < ),

and this is in particular true for x ∈ N. B

Conversely, given a function f : R → R, it is often useful to deﬁne a sequence

a

n

= f (n). If f (x) converges as x → ∞, then so does the sequence (a

n

).

176 Chapter 5

Example: Let 0 < a < 1, then

lim

n→∞

a

n

=?

We deﬁne f (x) = a

x

. Then

f (x) = e

x log a

,

and a

n

= f (n). Since log a < 0 it follows that lim

x→∞

f (x) = 0, hence the sequence

converges to zero.

Comment: Like for functions, we may deﬁne convergence of sequences to ±∞,

only that we don’t use the term convergence, but rather say that a sequence ap-

proaches (תפאוש) inﬁnity.

Theorem 5.2 Let f be deﬁned on a (punctured) neighborhood of a point c. Sup-

pose that

lim

c

f = .

If a is a sequence whose entries belong to the domain of f , such that a

n

c, and

lim

∞

a = c,

then

lim

∞

f (a) = ,

where the composition of a function and a sequence is deﬁned by

[ f (a)]

n

= f (a

n

).

Conversely, if lim

∞

f (a) = for every sequence a that converges to c, then

lim

c

f = .

Comment: This theorem provides a characterization of the limit of a function at

a point in terms of sequences. This is known as Heine’s characterization of the

limit.

Proof : Suppose ﬁrst that

lim

c

f = .

Sequences 177

Thus,

(∀ > 0)(∃δ > 0) : (∀x : 0 < ¦x − c¦ < δ)(¦ f (x) − ¦ < ).

Let a be a sequence as speciﬁed in the theorem, namely, a

n

c and lim

∞

a = c.

Then

(∀δ > 0)(∃N ∈ N) : (∀n > N)(0 < ¦a

n

− c¦ < δ).

Combining the two,

(∀ > 0)(∃N ∈ N) : (∀n > N)(¦ f (a

n

) − ¦ < ),

i.e., lim

∞

f (a) = .

Conversely, suppose that for any sequence a as above lim

∞

f (a) = . Suppose, by

contradiction that

lim

c

f

(i.e., either the limit does not exist or it is not equal to c). This means that

(∃ > 0) : (∀δ > 0)(∃x : 0 < ¦x − c¦ < δ) : (¦ f (x) − ¦ ≥ ).

In particular, setting δ

n

= 1/n,

(∃ > 0) : (∀n ∈ N)(∃a

n

: 0 < ¦a

n

− c¦ < δ

n

) : (¦ f (a

n

) − ¦ ≥ ).

The sequence a converges to c but f (a) does not converge to , which contradicts

our assumptions. B

Example: Consider the sequence

a : n ·→ sin

_

13 +

1

n

2

_

.

This sequence can be written as a = sin(b), where b

n

= 13 + 1/n

2

. Since

lim

13

sin = sin(13) and lim

∞

b = 13,

it follows that

lim

∞

a = sin(13).

(48 hrs, 2009)

178 Chapter 5

Theorem 5.3 A convergent sequence is bounded.

Proof : Let a be a sequence that converges to a limit α. We need to show that there

exists L

1

, L

2

such that

L

1

≤ a

n

≤ L

2

∀n ∈ N.

By deﬁnition, setting = 1,

(∃N ∈ N) : (∀n > N)(¦a

n

− α¦ < 1),

or,

(∃N ∈ N) : (∀n > N)(α − 1 < a

n

< α + 1).

Set

M = max¦a

n

: 1 ≤ n ≤ N¦

m = min¦a

n

: 1 ≤ n ≤ N¦.

then for all n,

min(m, α − 1) ≤ a

n

≤ max(M, α + 1).

B

Theorem 5.4 Let a be a bounded sequence and let b be a sequence that converges

to zero. Then

lim

n→∞

(ab) = 0.

Proof : Let M be a bound on a, namely,

¦a

n

¦ ≤ M ∀n ∈ N.

Since b converges to zero,

(∀ > 0)(∃N ∈ N) : (∀n > N)

_

¦b

n

¦ <

M

_

.

Thus,

(∀ > 0)(∃N ∈ N) : (∀n > N) (¦a

n

b

n

¦ ≤ M¦b

n

¦ < ) ,

which implies that the sequence ab converges to zero. B

Sequences 179

Example: The sequence

a : n ·→

sin n

n

converges to zero.

Theorem 5.5 Let a be a sequence with non-zero elements, such that

lim

∞

a = 0.

Then

lim

∞

1

¦a¦

= ∞.

Proof : By deﬁnition,

(∀M > 0)(∃N ∈ N) : (∀n > N)(0 < ¦a

n

¦ < 1/M),

hence

(∀M > 0)(∃N ∈ N) : (∀n > N)(1/¦a

n

¦ > M),

which concludes the proof. B

Theorem 5.6 Let a be a sequence such that

lim

∞

¦a¦ = ∞.

Then

lim

∞

1

a

= 0.

Proof : By deﬁnition,

(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a

n

¦ > 1/),

hence

(∀ > 0)(∃N ∈ N) : (∀n > N)(0 < 1/¦a

n

¦ < ),

which concludes the proof. B

180 Chapter 5

Theorem 5.7 Suppose that a and b are convergent sequences,

lim

∞

a = α and lim

∞

b = β,

and β > α. Then there exists an N ∈ N, such that

b

n

> a

n

for all n > N,

i.e., the sequence b is eventually greater (term-by-term) than the sequence a.

Proof : Since (β − α)/2 > 0,

(∃N

1

∈ N) : (∀n > N

1

)

_

a

n

< α +

1

2

(β − α)

_

,

and

(∃N

2

∈ N) : (∀n > N

2

)

_

b

n

> β −

1

2

(β − α)

_

.

Thus,

(∃N

1

, N

2

∈ N) : (∀n > max(N

1

, N

2

)) (b

n

− a

n

> 0) .

B

Corollary 5.1 Let a be a sequence and α, β ∈ R. If the sequence a converges to α

and α > β then eventually a

n

> β.

Theorem 5.8 Suppose that a and b are convergent sequences,

lim

∞

a = α and lim

∞

b = β,

and there exists an N ∈ N, such that a

n

≤ b

n

for all n > N. Then α ≤ β.

Proof : By contradiction, using the previous theorem. B

Comment: If instead a

n

< b

n

for all n > N, then we still only have α ≤ β. Take

for example the sequences a

n

= 1/n and b

n

= 2/n. Even though a

n

< b

n

for all n,

both converge to the same limit.

Sequences 181

Theorem 5.9 (Sandwich) Suppose that a and b are sequences that converge to

the same limit . Let c be a sequence for which there exists an N ∈ N such that

a

n

≤ c

n

≤ b

n

for all n > N.

Then

lim

∞

c = .

Proof : It is given that

(∀ > 0)(∃N

1

∈ N) : (∀n > N

1

)(¦a

n

− ¦ < ),

and

(∀ > 0)(∃N

2

∈ N) : (∀n > N

2

)(¦b

n

− ¦ < ).

Thus,

(∀ > 0)(∃N

1

, N

2

∈ N) : (∀n > max(N

1

, N

2

))(− < a

n

− ≤ c

n

− ≤ b

n

− < ),

or

(∀ > 0)(∃N ∈ N) : (∀n > N)(¦c

n

− ¦ < ).

B

Example: Since for all n,

1 <

_

1 + 1/n < 1 + 1/n,

it follows that

lim

n→∞

_

1 + 1/n = 1.

We can also deduce this from the fact that

lim

x→1

√

x = 1 and lim

n→∞

(1 + 1/n) = 1.

Exercise 5.3

182 Chapter 5

' Find sequences a and b, such that a converges to zero, b is unbounded, and

ab converges to zero.

^ Find sequences a and b, such that a converges to zero, b is unbounded, and

ab converges to a limit other then zero.

· Find sequences a and b, such that a converges to zero, b is unbounded, and

ab is bounded but does not converge.

Deﬁnition 5.4 A sequence a is called increasing (הלוע) if a

n+1

> a

n

for all n. It

is called non-decreasing (תדרוי אל) if a

n+1

≥ a

n

for all n. We deﬁne similarly a

decreasing and a non-increasing sequence.

Theorem 5.10 (Bounded + monotonic = convergent) Let a be a non-decreasing

sequence bounded from above. Then it is convergent.

Proof : Set

α = sup¦a

n

: n ∈ N¦.

(Note that a supremum is a property of a set, i.e., the order in the set does not

matter.) By the deﬁnition of the supremum,

(∀ > 0)(∃N ∈ N) : (α − < a

N

≤ α).

Since the sequence is non-decreasing,

(∀ > 0)(∃N ∈ N) : (∀n > N)(α − < a

N

≤ a

n

≤ α),

and in particular,

(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a

n

− α¦ < ).

B

Example: Consider the sequence

a : n ·→

_

1 +

1

n

_

n

.

Sequences 183

We will ﬁrst show that a

n

< 3 for all n. Indeed

1

,

_

1 +

1

n

_

n

= 1 +

_

n

1

_

1

n

+

_

n

2

_

1

n

2

+ +

_

n

n

_

1

n

n

= 1 + 1 +

n(n − 1)

2! n

2

+ +

n!

n! n

n

≤ 1 + 1 +

1

2!

+ +

1

n!

≤ 1 + 1 +

1

2

+ +

1

2

n−1

< 3.

Second, we claim that this sequence is increasing, as

a

n

= 1 + 1 +

1

2!

_

1 −

1

n

_

+

1

3!

_

1 −

1

n

_ _

1 −

2

n

_

+ .

When n grows, both the number of elements grows as well as their values. It

follows that

lim

n→∞

_

1 +

1

n

_

n

exists

(and equals to 2.718 ≡ e).

Deﬁnition 5.5 Let a be a sequence. A subsequence (הרדס תת) of a is any

sequence

a

n

1

, a

n

2

, . . . ,

such that

n

1

< n

2

< .

More formally, b is a subsequence of a if there exists a monotonically increasing

sequence of natural numbers (n

k

), such that b

k

= a

n

k

for all k ∈ N.

Example: Let a by the sequence of natural numbers, namely a

n

= n. The subse-

quence b of all even numbers is

b

k

= a

2k

,

i.e., n

k

= 2k.

1

The binomial formula is

(a + b)

n

=

n

k=0

_

n

k

_

a

k

b

n−k

.

184 Chapter 5

Lemma 5.1 Any sequence contains a subsequence which is either non-decreasing

or non-increasing.

Proof : Let a be a sequence. Let’s call a number n a peak point (איש תדוקנ) of the

sequence a if a

m

< a

n

for all m > n.

a peak poin!

a peak poin!

There are now two possibilities.

There are inﬁnitely many peak points: If n

1

< n

2

< are a sequence of peak

points, then the subsequence a

n

k

is decreasing.

There are ﬁnitely many peak points: Then let n

1

be greater than all the peak points.

Since it is not a peak point, there exists an n

2

, such that a

n

2

≥ a

n

1

. Continuing this

way, we obtain a non-decreasing subsequence. B

Corollary 5.2 (Bolzano-Weierstraß) Every bounded sequence has a convergent

subsequence.

(50 hrs, 2009)

In many cases, we would like to know whether a sequence is convergent even

if we do not know what the limit is. We will now provide such a convergence

criterion.

Deﬁnition 5.6 A sequence a is called a Cauchy sequence if

(∀ > 0)(∃N ∈ N) : (∀m, n > N)(¦a

n

− a

m

¦ < ).

Sequences 185

Theorem 5.11 A sequence converges if and only if it is a Cauchy sequence.

Proof : One direction is easy

2

. If a sequence a converges to a limit α, then

(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a

n

− α¦ < /2),

hence

(∀ > 0)(∃N ∈ N) : (∀m, n > N)(¦a

n

− a

m

¦ ≤ ¦a

n

− α¦ + ¦a

m

− α¦ < ),

i.e., the sequence is a Cauchy sequence.

Suppose now that a is a Cauchy sequence. We ﬁrst show that the sequence is

bounded. Taking = 1,

(∃N ∈ N) : (∀n > N)(¦a

n

− a

N+1

¦ < 1),

which proves that the sequence is bounded.

By the Bolzano-Weierstraß theorem, it follows that a has a converging subse-

quence. Denote this subsequence by b with b

k

= a

n

k

and its limit by β. We will

show that the whole sequence converges to β.

Indeed, by the Cauchy property

(∀ > 0)(∃N ∈ N) : (∀m, n > N)(¦a

n

− a

m

¦ < /2),

and whereas by the convergence of the sequence b,

(∀N ∈ N)(∃K ∈ N : n

K

> N) : (∀k > K)(¦b

k

− β¦ < /2).

Combining the two,

(∀ > 0)(∃N, K ∈ N : n

K

> N) : (∀n > N)(¦a

n

− β¦ ≤ ¦a

n

− b

k

¦ + ¦b

k

− β¦ < ).

This concludes the proof. B

(51 hrs, 2009)

2

There is something amusing about calling sequences satisfying this property a Cauchy se-

quence. Cauchy assumed that sequences that get eventually arbitrarily close converge, without

being aware that this is something that ought to be proved.

186 Chapter 5

5.3 Inﬁnite series

Let a be a sequence. We deﬁne the sequence of partial sums (ייקלח ימוכס) of

a by

S

a

n

=

n

k=1

a

k

.

Note that S

a

does not depend on the order of summation (a ﬁnite summation). If

the sequence S

a

converges, we will interpret its limit as the inﬁnite sum of the

sequence.

Deﬁnition 5.7 A sequence a is called summable (המיכס) if the sequence of par-

tial sums S

a

converges. In this case we write

∞

k=1

a

k

def

= lim

∞

S

a

.

The sequence of partial sum is called a series (רוט). Often, the term series refers

also to the limit of the sequence of partial sums.

Example: The canonical example of a series is the geometric series. Let a be a

geometric sequence

a

n

= q

n

¦q¦ < 1.

The geometric series is the sequence of partial sums, for which we have an explicit

expresion:

S

a

n

= q

1 − q

n

1 − q

This series converges, hence this geometric sequence is summable and

∞

n=1

q

n

=

q

1 − q

.

**Comment: Note the terminology:
**

sequence is summable = series is convergent.

Sequences 187

Theorem 5.12 Let a and b be summable sequences and γ ∈ R. Then the sequences

a + b and γa are summable and

∞

n=1

(a

n

+ b

n

) =

∞

n=1

a

n

+

∞

n=1

b

n

,

and

∞

n=1

(γa

n

) = γ

∞

n=1

a

n

.

Proof : Very easy. For example,

S

a+b

n

=

n

k=1

(a

k

+ b

k

) = S

a

n

+ S

b

n

,

and the rest follows from limits arithmetic. B

An important question is how to identify summable sequences, even in cases

where the sum is not known. This brings us to discuss convergence criteria.

Theorem 5.13 (Cauchy’s criterion) The sequence a is summable if and only if for

every > 0 there exists an N ∈ N, such that

¦a

n+1

+ a

n+2

+ + a

m

¦ < for all m > n > N.

Proof : This is just a restatement of Cauchy’s convergence criterion for sequences.

We know that the series converges if and only if

(∀ > 0)(∃N ∈ N) : (∀m > n > N)(¦S

a

m

− S

a

n

¦ < ),

but

S

a

m

− S

a

n

= a

n+1

+ a

n+2

+ + a

m

.

B

188 Chapter 5

Corollary 5.3 If a is summable then

lim

∞

a = 0.

Proof : We apply Cauchy’s criterion with m = n + 1. That is,

(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a

n+1

¦ < ).

B

The vanishing of the sequence is only a necessary condition for its summability.

It is not suﬃcient as proves the harmonic series, a

n

= 1/n,

S

a

n

= 1 +

1

2

+

1

3

+

1

4

.,,.

≥1/2

+

1

5

+

1

6

+

1

7

+

1

8

.,,.

≥1/2

+ .

This grouping of terms shows that the sequence of partial sums is unbounded.

For a while we will deal with non-negative sequences. The reason will be made

clear later on.

Theorem 5.14 A non-negative sequence is summable if and only if the series (i.e.,

the sequence of partial sums) is bounded.

Proof : Consider the sequence of partial sums, which is monotonically increas-

ing. Convergence implies boundedness. Monotonicity and boundedness imply

convergence. B

Theorem 5.15 (Comparison test) If 0 ≤ a

n

≤ b

n

for every n and the sequence b is

summable, then the sequence a is summable.

Proof : Since b is summable then the corresponding series S

b

is bounded, namely.

(∃M > 0) : (∀n ∈ N)

_

0 ≤ S

b

n

≤ M

_

.

The bound M bounds the series S

a

, which therefore converges. B

Sequences 189

Example: Consider the following sequence:

a

n

=

2 + sin

3

(n + 1)

2

n

+ n

2

.

Since

0 < a

n

≤

3

2

n

def

= b

n

,

and the series associated with b converges, it follows that the series of a converges.

**Example: Consider now the sequence
**

a

n

=

1

2

n

− 1 + sin

3

(n

2

)

.

Here we have

0 <

1

2

n

≤ a

n

≤

1

2

n

− 2

≤

2

2

n

, n ≥ 2,

so again, based on the comparison test and the convergence of the geometric series

we conclude that the series of a converges.

Example: The next example is

a

n

=

n + 1

n

2

+ 1

.

It is clear that the corresponding series diverges, since a

n

“roughly behaves” like

1/n. This can be shown by the following inequality

1

n

=

n + 1

n

2

+ n

≤ a

n

,

which proves that the series S

a

is unbounded. The following theorem provides an

even easier way to show that.

(53 hrs, 2009)

Theorem 5.16 (Limit comparison) Let a and b be positive sequences, such that

lim

∞

a

b

= γ 0.

Then a is summable if and only if b is summable.

190 Chapter 5

Example: Back to the previous example, setting b

n

= 1/n we ﬁnd

lim

∞

a

b

= lim

n→∞

n

2

+ n

n

2

+ 1

= 1,

hence the series of a diverges.

Proof : Suppose that b is summable. Since

lim

∞

a

γ b

= 1,

it follows that

(∃N ∈ N) : (∀n > N)

_

a

n

γ b

n

< 2

_

,

or

(∃N ∈ N) : (∀n > N) (a

n

< 2γb

n

) .

Now,

S

a

n

= S

a

N

+

n

k=N+1

a

k

< S

a

N

+ 2γ

n

k=N+1

b

k

≤ S

a

N

+ 2γS

b

n

.

Since the right hand sides is uniformly bounded, then a is summable. The roles of

a and b can then be interchanged. B

Theorem 5.17 (Ratio test (הנמה חבמ)) Let a

n

> 0 and suppose that

lim

n→∞

a

n+1

a

n

= r.

(i) If r < 1 then (a

n

) is summable. (ii) If r > 1 then (a

n

) is not summable.

Proof : Suppose ﬁrst that r < 1. Then,

(∃s : r < s < 1)(∃N ∈ N) : (∀n > N)

_

a

n+1

a

n

≤ s

_

.

In particular, a

N+1

≤ s a

N

, a

N+2

≤ s

2

a

N

, and so on. Thus,

n

k=1

a

k

=

N

k=1

a

k

+ a

N

n

k=N+1

s

k−N

.

Sequences 191

The second term on the right hand side is a partial sum of a convergent geometric

series, hence the partial sums of (a

n

) are bounded, which proves their convergence.

Conversely, if r > 1, then

(∃s : 1 < s < r)(∃N ∈ N) : (∀n > N)

_

a

n+1

a

n

≥ s

_

,

from which follows that a

n

≥ s

n−N

a

N

for all n > N, i.e., the sequence (a

n

) is not

even bounded (and certainly not summable). B

Examples: Apply the ratio test for

a

n

=

1

n!

b

n

=

r

n

n!

c

n

= n r

n

.

Comment: What about the case r = 1 in the above theorem? This does not provide

any information about convergence, as show the two examples

3

,

∞

n=1

1

n

2

< ∞ and

∞

n=1

1

n

= ∞.

These examples motivate, however, another comparison test. Note that

lim

x→∞

_

x

1

dt

t

= ∞ whereas lim

x→∞

_

x

1

dt

t

2

< ∞.

Indeed, there is a close relation between the convergence of the series and the

convergence of the corresponding integrals.

Deﬁnition 5.8 Let f be integrable on any segment [a, b] for some a and b > a.

Then

_

∞

a

f

def

= lim

x→∞

_

x

a

f .

3

This notation means that the sequence a

n

= 1/n

2

is summable, whereas the sequence b

n

= 1/n

is not.

192 Chapter 5

Theorem 5.18 (Integral test) Let f be a positive decreasing function on [1, ∞)

and set a

n

= f (n). Then

_

∞

n=1

a

n

converges if and only if

_

∞

1

f exists.

Proof : The proof hinges on showing that

[x|

k=2

a

k

≤

_

x

1

f ≤

¦x¦

k=1

a

k

.

TO COMPLETE. B

Example: Consider the series

∞

n=1

1

n

p

for p > 0. We apply the integral text, and consider the integral

_

x

1

dt

t

p

=

_

¸

¸

_

¸

¸

_

log x p = 1

(1 − p)

−1

(x

−p+1

− 1) p 1.

This integral converges as x → ∞ if and only if p > 1, hence so does the series.

(55 hrs, 2009)

Example: One can take this further and prove that

∞

n=1

1

n(log n)

p

converges if and only if p > 1, and so does

∞

n=1

1

n log n(log log n)

p

,

and so on. This is known as Bertrand’s ladder.

We have dealt with series of non-negative sequences. The case of non-positive

sequences is the same as we can deﬁne b

n

= (−a

n

). Sequences of non-ﬁxed sign

are a diﬀerent story.

Sequences 193

Deﬁnition 5.9 A series

_

∞

n=1

a

n

is said to be absolutely convergent (תסנכתמ

טלחהב) if

_

∞

n=1

¦a

n

¦ converges. A series that converges but does not converge

absolutely is called conditionally convergent (יאנתב תסנכתמ).

Example: The series

1 −

1

2

+

1

3

−

1

4

+ ,

converges conditionally.

Example: The series

1 −

1

3

+

1

5

−

1

7

+ ,

converges conditionally as well. It is a well-known alternating series, which Leib-

niz showed (without any rigor) to be equal to π/4. It is known as the Leibniz

series.

Theorem 5.19 Every absolutely convergent series is convergent. Also, a series is

absolutely convergent, if and only if the two series formed by its positive elements

and its negative elements both converge.

Proof : The fact that an absolutely convergent series converges follows fromCauchy’s

criterion. Indeed, for every > 0 there exists an N ∈ N such that

¦a

n+1

+ + a

m

¦ ≤ ¦a

n+1

¦ + + ¦a

m

¦ <

whenever m, n > N.

Deﬁne now

4

a

+

n

=

_

¸

¸

_

¸

¸

_

a

n

a

n

> 0

0 a

n

≤ 0

and a

−

n

=

_

¸

¸

_

¸

¸

_

a

n

a

n

< 0

0 a

n

≥ 0

.

4

For example, if a

n

= (−1)

n+1

/n, then

a

+

n

= 1, 0,

1

3

, 0,

1

5

, 0,

a

−

n

= 0, −

1

2

, 0, −

1

4

, 0, −

1

6

, .

194 Chapter 5

Then, a

n

= a

+

n

+ a

−

n

, ¦a

n

¦ = a

+

n

− a

−

n

, and conversely,

a

+

n

=

1

2

(a

n

+ ¦a

n

¦) and a

−

n

=

1

2

(a

n

− ¦a

n

¦) .

Thus,

n

k=1

¦a

k

¦ =

n

k=1

a

+

n

−

n

k=1

a

−

n

,

hence if the two series of positive and negative terms converge so does the series

of absolute values. On the other hand,

n

k=1

a

+

n

=

1

2

n

k=1

a

n

+

1

2

n

k=1

¦a

n

¦

n

k=1

a

−

n

=

1

2

n

k=1

a

n

−

1

2

n

k=1

¦a

n

¦

so that the other direction holds as well. B

(56 hrs, 2009)

Theorem 5.20 (Leibniz) Suppose that a

n

is a non-increasing sequence of non-

negative numbers, i.e.,

a

1

≥ a

2

≥ ≥ 0,

and

lim

n→∞

a

n

= 0.

Then the series

∞

n=1

(−1)

n+1

a

n

= a

1

− a

2

+ a

3

− a

4

+

converges.

Proof : Simple considerations show that

s

2

≤ s

4

≤ s

6

≤ ≤ s

5

≤ s

3

≤ s

1

.

Indeed,

s

2n

≤ s

2n

+ (a

2n+1

− a

2n+2

) = s

2n+2

= s

2n+1

− a

2n+2

≤ s

2n+1

.

Sequences 195

Thus, the sequence b

n

= s

2n

is non-decreasing and bounded from above, whereas

the sequence c

n

= s

2n+1

is non-increasing and bounded from below. It follows that

both subsequences converge,

lim

n→∞

b

n

= b and lim

n→∞

c

n

= c.

By limit arithmetic

lim

n→∞

(c

n

− b

n

) = lim

n→∞

(s

2n+1

− s

2n

) = lim

n→∞

a

2n+1

= c − b,

however a

2n+1

converges to zero, hence b = c. It remains to show that if both the

odd and even elements of a sequence converge to the same limit then the whole

sequence converges to this limit.

Indeed,

(∀ > 0)(∃N

1

∈ N) : (∀n > N

1

)(¦s

2n+1

− b¦ < )

and

(∀ > 0)(∃N

2

∈ N) : (∀n > N

2

)(¦s

2n

− b¦ < ),

i.e.,

(∀ > 0)(∃N

1

, N

2

∈ N) : (∀n > max(2N

1

+ 1, 2N

2

))(¦s

n

− b¦ < ).

B

Deﬁnition 5.10 A rearrangement (שדחמ רודיס) of a sequence a

n

is a sequence

b

n

= a

f (n)

,

where f : N → N is one-to-one and onto.

Theorem 5.21 (Riemann) If

_

∞

n=1

a

n

is conditionally convergent then for every

α ∈ R there exists a rearrangement of the series that converges to α.

Proof : I will not give a formal proof, but only a sketch. B

196 Chapter 5

Example: Consider the series

S =

∞

n=1

(−1)

n+1

n

= 1 −

1

2

+

1

3

−

1

4

+

1

5

+ .

We proved that this series does indeed converge to some

1

2

< S < 1. By limit

arithmetic for sequences (or series),

1

2

S =

∞

n=1

(−1)

n+1

2n

=

1

2

−

1

4

+

1

6

−

1

8

+

1

10

+ ,

which we can also pad with zeros,

1

2

S = 0 +

1

2

+ 0 −

1

4

+ 0 +

1

6

+ 0 −

1

8

+ 0 +

1

10

+ .

Using once more sequence arithmetic,

3

2

S = 1 +

1

3

−

1

2

+

1

5

+

1

7

−

1

4

+ . . . ,

which directly proves that a rearrangement of the alternating harmonic series con-

verges to three halves of its value. In this example the rearrangement is

b

3n+1

= a

4n+1

b

3n+2

= a

4n+3

b

3n

= a

2n

.

**Comment: Here is an important fact about Cauchy sequences: let (a
**

n

) be a Cauchy

sequence with limit a. Then,

(∀ > 0)(∃N ∈ N) : (∀m, n > N)(¦a

n

− a

m

¦ < ).

Think now of the left hand side as a sequence with index m with n ﬁxed. As

m → ∞ this sequence converges to ¦a

n

− a¦. This implies that

¦a

n

− a¦ ≤ for all n > N.

Sequences 197

Theorem 5.22 If

_

∞

n=1

a

n

converges absolutely then any rearrangement

_

∞

n=1

b

n

converges absolutely, and

∞

n=1

a

n

=

∞

n=1

b

n

.

In other word, the order of summation does not matter in an absolutely convergent

series.

Proof : Let b

n

= a

f (n)

and set

s

n

=

n

k=1

a

k

and t

n

=

n

k=1

b

k

.

Let > 0 be given. Since

_

∞

n=1

a

n

converges, then there exists an N such that

¸

¸

¸

¸

¸

¸

¸

s

N

−

∞

k=1

a

k

¸

¸

¸

¸

¸

¸

¸

<

2

.

Moreover, since the sequence is absolutely convergent, we can choose N large

enough such that

∞

k=1

¦a

k

¦ −

N

k=1

¦a

k

¦ =

∞

k=N+1

¦a

k

¦ <

2

.

Take now M be suﬃciently large, so that each of the a

1

, . . . , a

N

appears among the

b

1

, . . . , b

M

. For all m > M,

¦t

m

− s

N

¦ =

¸

¸

¸

¸

¸

¸

¸

k>N, f (k)≤m

a

k

¸

¸

¸

¸

¸

¸

¸

≤

∞

k=N+1

¦a

k

¦ <

2

,

i.e.,

¸

¸

¸

¸

¸

¸

¸

∞

k=1

a

k

− t

m

¸

¸

¸

¸

¸

¸

¸

≤

¸

¸

¸

¸

¸

¸

¸

∞

k=1

a

k

− s

N

¸

¸

¸

¸

¸

¸

¸

+ ¦s

N

− t

m

¦ ≤ .

This proves that

lim

m→∞

t

m

=

∞

k=1

a

k

,

198 Chapter 5

i.e.,

∞

n=1

b

n

=

∞

n=1

a

n

.

That

_

∞

n=1

b

n

converges absolutely follows formthe fact that ¦b

n

¦ is a rearrangement

of ¦a

n

¦ and the ﬁrst part of this proof. B

Absolute convergence is also necessary in order to multiply two series. Consider

the product

_

¸

¸

¸

¸

¸

_

∞

n=1

a

n

_

¸

¸

¸

¸

¸

_

_

¸

¸

¸

¸

¸

_

∞

k=1

b

k

_

¸

¸

¸

¸

¸

_

= (a

1

+ a

2

+ )(b

1

+ b

2

+ ).

This product seems to be a sum of all products a

n

b

k

, i.e.,

∞

k,n=1

(a

n

b

k

),

except that these numbers to be added form a “matrix”, i.e., a set that can be

ordered in diﬀerent ways.

Theorem 5.23 Suppose that

_

∞

n=1

a

n

and

_

∞

n=1

b

n

converge absolutely, and that c

n

is any sequence that contains all terms a

n

b

k

. Then

∞

n=1

c

n

=

_

¸

¸

¸

¸

¸

_

∞

n=1

a

n

_

¸

¸

¸

¸

¸

_

_

¸

¸

¸

¸

¸

_

∞

k=1

b

k

_

¸

¸

¸

¸

¸

_

.

Proof : Consider the sequence

s

n

=

_

¸

¸

¸

¸

¸

_

∞

k=1

¦a

k

¦

_

¸

¸

¸

¸

¸

_

_

¸

¸

¸

¸

¸

_

∞

=1

¦b

¦

_

¸

¸

¸

¸

¸

_

.

This sequence converges as n → ∞ (by limit arithmetic), which implies that it is

a Cauchy sequence. For every > 0 there exists an N ∈ N, such that

¸

¸

¸

¸

¸

¸

¸

_

¸

¸

¸

¸

¸

_

n

k=1

¦a

k

¦

_

¸

¸

¸

¸

¸

_

_

¸

¸

¸

¸

¸

_

n

=1

¦b

¦

_

¸

¸

¸

¸

¸

_

−

_

¸

¸

¸

¸

¸

_

m

k=1

¦a

k

¦

_

¸

¸

¸

¸

¸

_

_

¸

¸

¸

¸

¸

_

m

=1

¦b

¦

_

¸

¸

¸

¸

¸

_

¸

¸

¸

¸

¸

¸

¸

<

2

Sequences 199

for all m, n > N. By the comment above,

¸

¸

¸

¸

¸

¸

¸

_

¸

¸

¸

¸

¸

_

m

k=1

¦a

k

¦

_

¸

¸

¸

¸

¸

_

_

¸

¸

¸

¸

¸

_

m

=1

¦b

¦

_

¸

¸

¸

¸

¸

_

−

_

¸

¸

¸

¸

¸

_

∞

k=1

¦a

k

¦

_

¸

¸

¸

¸

¸

_

_

¸

¸

¸

¸

¸

_

∞

=1

¦b

¦

_

¸

¸

¸

¸

¸

_

¸

¸

¸

¸

¸

¸

¸

≤

2

for all m > N. Thus,

k or >N

¦a

k

¦¦b

¦ ≤

2

< .

Let now (c

n

) be a sequence comprising of all pairs a

n

b

**. There exists an M suﬃ-
**

ciently large, such that m > M implies that either k or is greater than N, where

c

m

= a

k

b

. I.e.,

M

n=1

c

n

−

_

¸

¸

¸

¸

¸

_

N

k=1

a

k

_

¸

¸

¸

¸

¸

_

_

¸

¸

¸

¸

¸

_

N

=1

b

_

¸

¸

¸

¸

¸

_

only consists of terms a

k

b

**with either k or greater than N, hence for all m > M,
**

¸

¸

¸

¸

¸

¸

¸

m

n=1

c

n

−

_

¸

¸

¸

¸

¸

_

N

k=1

a

k

_

¸

¸

¸

¸

¸

_

_

¸

¸

¸

¸

¸

_

N

=1

b

_

¸

¸

¸

¸

¸

_

¸

¸

¸

¸

¸

¸

¸

≤

k or >N

¦a

k

¦¦b

¦ < .

Letting N → ∞,

¸

¸

¸

¸

¸

¸

¸

m

n=1

c

n

−

_

¸

¸

¸

¸

¸

_

∞

k=1

a

k

_

¸

¸

¸

¸

¸

_

_

¸

¸

¸

¸

¸

_

∞

=1

b

_

¸

¸

¸

¸

¸

_

¸

¸

¸

¸

¸

¸

¸

≤ ,

for all m > M, which proves the desires result. B

200 Chapter 5

2

Contents

1 Real numbers 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 Axioms of ﬁeld . . . . . . . . . . . . . . . . . . . . . . . . . . . Axioms of order (as taught in 2009) . . . . . . . . . . . . . . . . Axioms of order (as taught in 2010, 2011) . . . . . . . . . . . . . Absolute values . . . . . . . . . . . . . . . . . . . . . . . . . . . Special sets of numbers . . . . . . . . . . . . . . . . . . . . . . . The Archimedean property . . . . . . . . . . . . . . . . . . . . . Axiom of completeness . . . . . . . . . . . . . . . . . . . . . . . Rational powers . . . . . . . . . . . . . . . . . . . . . . . . . . . Real-valued powers . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 7 10 13 16 18 22 36 40 40 43 43 49 50 62 66 70 78 80 84

1.10 Addendum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Functions 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 Basic deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . . Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Limits and order . . . . . . . . . . . . . . . . . . . . . . . . . . . Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Theorems about continuous functions . . . . . . . . . . . . . . . Inﬁnite limits and limits at inﬁnity . . . . . . . . . . . . . . . . . Inverse functions . . . . . . . . . . . . . . . . . . . . . . . . . . Uniform continuity . . . . . . . . . . . . . . . . . . . . . . . . .

ii 3 Derivatives 3.1 3.2 3.3 3.4 3.5 3.6 3.7 4

CONTENTS 91 91 98

Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rules of diﬀerentiation . . . . . . . . . . . . . . . . . . . . . . .

Another look at derivatives . . . . . . . . . . . . . . . . . . . . . 102 The derivative and extrema . . . . . . . . . . . . . . . . . . . . . 104 Derivatives of inverse functions . . . . . . . . . . . . . . . . . . . 117 Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 Taylor’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 123 133

Integration theory 4.1 4.2 4.3 4.4 4.5 4.6 4.7

Deﬁnition of the integral . . . . . . . . . . . . . . . . . . . . . . 133 Integration theorems . . . . . . . . . . . . . . . . . . . . . . . . 142 The fundamental theorem of calculus . . . . . . . . . . . . . . . . 153 Riemann sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 The trigonometric functions . . . . . . . . . . . . . . . . . . . . 158 The logarithm and the exponential . . . . . . . . . . . . . . . . . 163 Integration methods . . . . . . . . . . . . . . . . . . . . . . . . . 167 171

5

Sequences 5.1 5.2 5.3

Basic deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 Limits of sequences . . . . . . . . . . . . . . . . . . . . . . . . . 172 Inﬁnite series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

iii .CONTENTS Foreword 1. Hochman. Meizler. Texbooks: Spivak.

iv CONTENTS .

they were sometimes very vague about deﬁnitions and their theory often laid on shaky grounds. which was developed during more than two centuries. Some of their followers who will be mentioned along this course are Jakob Bernoulli (1654-1705). Joseph Liouville (1804-1882). 1. solve many diﬀerential equations. JeanGaston Darboux (1842-1917). The pioneers were Isaac Newton (1642-1737) and Gottfried Wilelm Leibniz (1646-1716). and Karl Weierstrass (1815-1897). they could integrate many functions.Chapter 1 Real numbers In this course we will cover the calculus of real univariate functions. notably by Joseph-Louis Lagrange (1736-1813). and as we are going to learn it in this course was mostly developed throughout the 19th century. Peter Gustav Lejeune-Dirichlet (1805-1859). Augustin Louis Cauchy (1789-1857). the theory of functions needed a thorough revision of the concept of real numbers. Georg Friedrich Bernhard Riemann (1826-1866). The sound theory of calculus as we know it today. a rigorous course of calculus should start by putting even such basic con- . we should mention Georg Cantor (1845-1918) and Richard Dedekind (1831-1916). As calculus was being established on ﬁrmer grounds. Yet.1 Axioms of ﬁeld Even though we have been using “numbers” as an elementary notion since ﬁrst grade. and sum up a large number of inﬁnite series using a wealth of sophisticated analytical techniques. In this context. These ﬁrst two generations of mathematicians developed most of the practice of calculus as we know it. Johann Bernoulli (1667-1748) and Leonhard Euler (1707-1783).

In formal notation. Chapter 1 The set of real numbers will be deﬁned as an instance of a complete. and then add the sum to a. What about the addition of four elements. As such. a + b + c + d? Does it require a separate axiom of equivalence? We would like the following additions ((a + b) + c) + d (a + (b + c)) + d a + ((b + c) + d) a + (b + (c + d)) (a + b) + (c + d) to be equivalent. c ∈ F. there is no a-priori meaning to a + b + c. b ∈ F)(∃!c ∈ F) : (c = a + b). 1 . Likewise (although it requires some non-trivial work). the meaning is ambiguous. The ﬁrst axiom states that in either case. The operations on elements of a ﬁeld satisfy nine deﬁning properties. b. as in a · b = ab). b ∈ F corresponds one and only one a + b ∈ F and a · b ∈ F. Addition is associative ( ): for all a. and an operation which we call multiplication and denote by · (or by nothing. and denote by +. A binary operation in F is a function that “takes” an ordered pair of elements in F and “returns” an element in F. or conversely. ordered ﬁeld ( ). (a + b) + c = a + (b + c). as we could ﬁrst add a + b and then add c to the sum. we can show that the n-fold addition a1 + a2 + · · · + an is deﬁned unambiguously. to every a. In fact. 1. which we list now. add b + c. (∀a. That is. It is easy to see that the equivalence follows from the axiom of associativity. It should be noted that addition was deﬁned as a binary operation. Deﬁnition 1.1 A ﬁeld F ( ) is a non-empty set on which two binary operations are deﬁned1 : an operation which we call addition. the result is the same.2 cept on axiomatic grounds.

This is not an assumption. (existence of inverse) (associativity) (property of inverse) (0 is the neutral element) It follows at once that every element a ∈ F has a unique additive inverse.2 With the ﬁrst three axioms there are a few things we can show. that a+b=a+c implies b = c. Suppose there were a. 3. which justiﬁes the notation (−a). Existence of an additive-neutral element ( exists an element 0 ∈ F such that for all a ∈ F a + 0 = 0 + a = a. b ∈ F such that a + b = a. for if b and c were both additive inverses of a. such that 3 ): there ): for all a ∈ F there exists an a + b = b + a = 0. Existence of an additive inverse ( element b ∈ F. 2 . The notation (−a) assumes implicitly that the inverse is unique. For example.Real numbers 2. then a + b = 0 = a + c. It is customary to denote this additive inverse by (−a). The proof requires all three axioms: a+b=a+c (−a) + (a + b) = (−a) + (a + c) ((−a) + a)) + b = ((−a) + a) + c 0+b=0+c b = c. Hence we can refer to the additive inverse of a. from which follows that b = c. but follows from the ﬁrst three axioms. We can also show that there is a unique element in F satisfying the neutral property (it is not an a priori fact). Then adding (−a) (on the left!) and using the law of associativity we get that b = 0.

b. (given) (commutativity) (associativity) (property of additive inverse) (property of neutral element) . Subtraction is a short-hand notation for the addition of the additive inverse. There is no way we can prove it using only the ﬁrst four axioms! Indeed. By themselves. our experience with numbers tells us that a − b = b − a implies that a = b. Multiplication is associative: for all a. all we obtain from it that a + (−b) = b + (−a) (a + (−b)) + (b + a) = (b + (−a)) + (a + b) a + ((−b) + b) + a = b + ((−a) + a) + b a+0+a=b+0+b a + a = b + b. but how can we deduce from that that a = b? With very little comments. 4. a · 1 = 1 · a = a. Existence of a multiplicative-neutral element: there exists an element 1 ∈ F such that for all a ∈ F. def Comment: A set satisfying the ﬁrst three axioms is called a group ( ). b ∈ F a + b = b + a. a · (b · c) = (a · b) · c. we can state now the corresponding laws for multiplication: 5. Addition is commutative ( ): for all a. c ∈ F.4 Chapter 1 Comment: We only postulate the operations of addition and multiplication. 6. a − b = a + (−b). which can ﬁll an entire course. On the other hand. those three axioms have many implications. With this law we ﬁnally obtain that any (ﬁnite!) summation of elements of F can be re-arranged in any order.

We would be done if we knew that 1 + 1 0. f } with the following properties. ): for every a 5 0 there The condition that a 0 has strong implications. a+a=b+b 1·a+1·a=1·b+1·b (1 + 1) · a = (1 + 1) · b (property of neutral element) (distributive law). the uniqueness of the multiplicative inverse has to be proved. We can now proceed. b ∈ F. for we would multiply both sides by (1 + 1)−1 . a · b = b · a. We can revisit now the a − b = b − a problem. c ∈ F. Existence of a multiplicative inverse ( ¯ exists an element a−1 ∈ F such that3 a · a−1 = a−1 · a = 1. this does not follow from the axioms of ﬁeld! Example: Consider the following set F = {e. Multiplication is commutative: for every a. Distributive law ( ): for all a. For example. The last axiom relates the addition and multiplication operations: 9. without which there is no justiﬁcation to the notation a−1 . Comment: Division is deﬁned as multiplication by the inverse. a/b = a · b−1 . b.Real numbers 7. from the fact that a · b = a · c we can deduce that b = c only if a 0. However. . a · (b + c) = a · b + a · c. + e f 3 e e f f f e · e f e e e f e f Here again. def 8.

It would follows that for every a ∈ F. we will not bother to prove that a − b = −(b − a) (even though the proof is very easy). The only exception is (of course) if you get to prove an algebraic identity as an assignment. we rule it out and require the ﬁeld to satisfy 0 1. Proof : Using sequentially the property of the neutral element and the distributive law: a · 0 = a · (0 + 0) = a · 0 + a · 0. i. a = a · 1 = a · 0 = 0. I Comment: Suppose it were the case that 1 = 0.. and yet e f . a · 0 = 0. every algebraic identity should be proved from the axioms of ﬁeld.1 For every a ∈ F. In practice.6 Chapter 1 It takes some explicit veriﬁcation to check that this is indeed a ﬁeld (in fact the smallest possible ﬁeld). that f − e = e − f implies that e = f . Comment: In principle. Thus. Note that e − f = e + (− f ) = e + f = f + (−e) = f − e. For example. we will assume henceforth that all known algebraic manipulations are valid. however a very boring one. Proposition 1. which is then no wonder that we can’t prove.e. that the same element of F is both the additive neutral and the multiplicative neutral. This is indeed a ﬁeld according to the axioms. . based only on the ﬁeld axioms. Adding to both sides −(a · 0) we obtain a · 0 = 0. with e being the additive neutral and f being the multiplicative neutral (do you recognize this ﬁeld?). which means that 0 is the only element of F.

that a − 0 = a ∈ P. a3 − b3 = (a − b)(a2 + ab + b2 ). Either (i) a ∈ P. b ∈ P then a · b ∈ P. b ∈ P then a + b ∈ P. d 0 then a c ad + bc + = . Closure under multiplication: if a. such that the following three properties are satisﬁed: 1. Closure under addition: if a. We can then show a number of well-known properties of real numbers: . or (ii) (−a) ∈ P. or (iii) a = 0.1 Let F be a ﬁeld.2 ( ) A ﬁeld F is said to be ordered if it has a distinguished subset P ⊂ F. then b = 1. The property of being ordered can be formalized by three axioms: Deﬁnition 1. Comment: a > 0 means. Similarly. that If ab = a for some ﬁeld element a 0. that 0 − a = (−a) ∈ P. by deﬁnition. Prove. which we call the positive elements. If b. 2. They also form an ordered set. a < 0 means. If c 0 then a/b = (ac)/(bc). by deﬁnition. We supplement these axioms with the following deﬁnitions: a<b a>b a≤b a≥b means means means means b−a∈ P b<a a < b or a = b a > b or a = b.Real numbers 7 Exercise 1. b d bd Justify every step in your proof! 1.2 Axioms of order (as taught in 2009) The real numbers contain much more structure than just being a ﬁeld. based on the ﬁeld axioms. 3. Trichotomy: every element a ∈ F satisﬁes one and only one of the following properties.

(−b) ∈ P. which is an axiom.4 (Transitivity) If a < b and b < c then a < c.8 Chapter 1 Proposition 1. we are asked to show that a − b satisﬁes the trichotomy.2 For every a. Proof : c−a= (c − b) + (b − a) . That is. Proof : (b + c) − (a + c) = b − a ∈ P. I Proposition 1. or (iii) that a − b = 0.3 If a < b then a + c < b + c. I Proposition 1. . We are asked to show that either (i) a − b ∈ P. or (iii) a = b. along with our deﬁnitions. or (ii) b − a = −(a − b) ∈ P. it follows that (−a)(−b) > 0. b ∈ F either (i) a < b. By the closure under multiplication. Proof : This is an immediate consequence of the trichotomy law. Proof : We have seen above that a. It remains to check that the axioms of ﬁeld imply that ab = (−a)(−b).1 If a 0 then a · a ≡ a2 > 0. I Corollary 1.5 If a < 0 and b < 0 then ab > 0. or (ii) a > b. in P in P in P by closure under addition I Proposition 1. b < 0 is synonymous to (−a).

which implies that (−1) has to be positive.2 Let F be an ordered ﬁeld. this is not within the scope of the present course) cannot be ordered.Real numbers 9 Proof : If a 0. or a < 0 in which case a2 > 0 follows from the previous proposition. Prove based on the axioms that n that a < b if and only if (−b) < (−a). which violates the previous corollary. If a > 1 then a2 > a. I Corollary 1.2 1 > 0. Proof : This follows from the fact that 1 = 12 . in P by closure under multiplication I Exercise 1. Then. I Another corollary is that the ﬁeld of complex numbers (yes. Proof : It is given that b − a ∈ P and c ∈ P. then a − c < b − d. hence 1 has to be negative. If 0 < a < 1 then a2 < a.6 If a < b and c > 0 then ac < bc.3 Prove using the axioms of ordered ﬁelds and using induction on . If a < b and c > d. Proposition 1. If 0 ≤ a < b then a2 < b2 . since i2 = (−1). If 0 ≤ a < b and 0 ≤ c < d then ac < bd. with a = b. then by the trichotomy either a > 0. bc − ac = (b − a) · in P c in P . in which case a2 > 0 follows from the closure under multiplication. Exercise 1.

If xn = yn and n is even then either x = y of x = −y.10 If 0 ≤ x < y then xn < yn . with multiplicative neutral term 1. O3 Invariance under addition: if a < b then a + c < b + c for all c ∈ F.5 Show that there can be no ordered ﬁeld that has a ﬁnite number 1. such that the following four properties are satisﬁed: Deﬁnition 1. Conclude that xn = yn if and only if x = y or x = −y. Prove that (a + b)/2 is well deﬁned for all a. or (iii) a = b. The property of being ordered can be formalized by four axioms: tion < (a relation is a property that every ordered pair of elements either satisfy or not). a< 2 of elements. then a+b < b. or (ii) b < a. Either (i) a < b. b ∈ F satisﬁes one and only one of the following properties.4 Let F be an ordered ﬁeld. and that if a < b. b ∈ F. 2011) The real numbers contain much more structure than just being a ﬁeld. Exercise 1.3 Axioms of order (as taught in 2010. O4 Invariance under multiplication by positive numbers: if a < b then ac < bc for all 0 < c. O2 Transitivity: if a < b and b < c then a < c. They also form an ordered set. Exercise 1. Chapter 1 If x < y and n is odd then xn < yn . If xn = yn and n is odd then x = y.3 ( ) A ﬁeld F is said to be ordered if there exists a rela- O1 Trichotomy: every pair of elements a. . we also denote 1 + 1 = 2.

b then 0<a+b and 0 < ab. We supplement these axioms with the following deﬁnitions: a>b a≤b a≥b means means means b<a a < b or a = b a > b or a = b.e.Real numbers 11 Comment: The set of numbers a for which 0 < a are called the positive numbers. Closure under multiplication follows from the fact that 0=0·a < b · a. Proof : Closure under addition follows from the fact that 0 < a implies that b=0+b < a + b. a number is positive if and only if its additive inverse is negative. namely If 0 < a. Proof : Suppose 0 < a.8 0 < a if and only if (−a) < 0. The set of numbers a for which a < 0 are called the negative numbers. 2010) (2 hrs. and b < a + b follows from transitivity. 2011) (neutral element) (O3) Proposition 1. Proposition 1. i.. (proved) (O4) I (2 hrs.7 The set of positive numbers is closed under addition and multiplication. then (−a) = 0 + (−a) < a + (−a) =0 (neutral element) (O3) (additive inverse) .

a < 0 which violates the trichotomy axiom. 1 = a · a−1 < a · 0 = 0. In the latter case 0 < (−1)..10 0 < a if and only if 0 < a−1 . Then for every 0 < a (and we know that at least one such exists). a number is positive if and only if its multiplicative inverse is also positive.e. i.12 Chapter 1 i. it follows by O1 that either 0 < 1 or 1 < 0. either 0 < 1..3 The set of positive numbers is not empty.. Conversely. Then.e. I Proposition 1. i. a = a · 1 < a · 0 = 0. hence either 1 or (−1) is positive. Hence 0 < 1.9 0 < 1. if (−a) < 0 then 0 = (−a) + a <0+a =a (additive inverse) (O3) (neutral element) I Corollary 1. by contradiction. or 1 < 0. (−a) < 0. Proof : Since by assumption 1 0. which is a contradiction. Proof : Let 0 < a and suppose by contradiction that a−1 < 0 (it can’t be zero because a−1 a = 1). I Proposition 1. Suppose then.e. The third possibility is ruled out by assumption. Proof : By trichotomy. that 1 < 0. I . or 0 = 1.

1. a a≥0 |a| = (−a) a < 0. hence 1 has to be negative. then by the trichotomy either a > 0.12 (Triangle inequality ( |a + b| ≤ |a| + |b|. I Another corollary is that the ﬁeld of complex numbers (which is not within the scope of the present course) cannot be ordered. Proposition 1.4 value ( Absolute values ). . since i2 = (−1).Real numbers 13 Proposition 1. I Corollary 1. or a < 0 in which case a2 > 0 follows from the previous proposition. in which case a2 > 0 follows from the closure under multiplication. and otherwise |a| > 0. It remains to check that the axioms of ﬁeld imply that ab = (−a)(−b). By the closure under multiplication.4 For every element a in an ordered ﬁeld F we deﬁne the absolute We see right away that |a| = 0 if and only if a = 0. Proof : We have seen above that a. )) For every a. which implies that (−1) has to be positive. it follows that (−a)(−b) > 0. (−b). a ≥ 0 and b ≥ 0. b < 0 implies that 0 < (−a). which is a violation. Deﬁnition 1. with a = b. Proof : We need to examine four cases (by trichotomy): 1.4 If a 0 then a · a ≡ a2 > 0. Proof : If a 0.11 If a < 0 and b < 0 then ab > 0. b ∈ F.

and 0 < −(a + b). If a + b < 0. i. hence |a + b| = −(a + b) = (−a) + (−b) = |a| + |b|. then |a + b| = −(a + b) = (−a) + (−b) ≤ (−a) + b = |a| + |b|. |a| = (−a). |b| = (−b). We can divide this case into two sub-cases. a ≤ 0 and b ≥ 0.e. (Hey. and (−a) < a follows from transitivity. in which case |a + b| = a + b ≤ (−a) + b = |a| + |b|. We need to show that |a + b| ≤ (−a) + b. |a| = (−a) and |b| = b. which completes the proof4 . hence |a + b| = a + b = |a| + |b|. a > 0 =⇒ a + (−a) > 0 + (−a) =⇒ 0 > (−a). Either a + b ≥ 0. For example. a ≤ 0 and b ≤ 0. Let a ≤ 0 and b ≥ 0. This is very easy to show. Chapter 1 In the ﬁrst case. I Comment: By replacing b by (−b) we obtain that |a − b| ≤ |a| + |b|. 3. a ≥ 0 and b ≤ 0. 4.14 2. The following version of the triangle inequality is also useful: We have used the fact that a ≥ 0 implies (−a) ≤ a and a ≤ 0 implies a ≤ (−a).. |a| = a and |b| = b. 4 . does this mean that |a + b| ≤ |a| + |b|?) In the fourth case. as the third case follows by interchanging the roles of a and b. It remains to show the second case.

y) = 1 (x + y − |x − y|) . By interchanging the roles of a and b. b ∈ F. 2009) Exercise 1. c ∈ F.14 For every a. . Show that max(x. c ∈ F. It follows that5 ||a| − |b|| ≤ |a + b|. |a| − |b| ≤ |a + b|. or |b| − |a| ≤ |b + a|. I (2 hrs. y) and max(x. b. |a + b + c| = |a + (b + c)| ≤ |a| + |b + c| ≤ |a| + |b| + |c|. ||a| − |b|| ≤ |a + b|. respectively. Proof : By the triangle inequality.Real numbers 15 Proposition 1. |a + b + c| ≤ |a| + |b| + |c|. ∀a. Proof : Use the triangle inequality twice. in which case |b| ≤ |a| + |b + a|. y) = 5 1 (x + y + |x − y|) 2 and min(x.13 (Reverse triangle inequality) For every a. Set c = b + a. I Proposition 1. 2 Here we used the fact that any property satisﬁed by a and (−a) is also satisﬁed by |a|. |a − c| ≤ |a| + |c|.6 Let min(x. y) denote the smallest and the largest of the two arguments.

which we denote by Q. 4. 1 + 1 + 1. and naming the numbers 1 + 1. then 1 + 1 = (1 + 1) · 1 < (1 + 1) · (1 + 1)−1 = 1. there are no additive inverses because all the numbers are positive). we call this set S modulo the equivalence relation ∼. We can then augment the set of natural numbers by adding the numbers 0. We denote the set of natural numbers by N. such as 2. n 0. etc.5 Special sets of numbers What kind of sets can have the properties of an ordered ﬁeld? We start by introducing the natural numbers ( ). which is {b : b ∼ a}. which is a violation of the axioms of order. with every element a ∈ S we can associate an equivalence class ( ). and may represent them using a decimal system is immaterial. etc. This forms the set of integers ( ). 6 How do we know that (1 + 1)−1 . b have this property. That we give them names. 3. m/n.16 Chapter 1 1. . is not an integer? We know that it is positive and non-zero. reﬂexive (a ∼ a for all a) and transitive (a ∼ b and b ∼ c implies a ∼ c). The set of numbers can be further augmented by adding all integer quotients. The integers form a commutative group with respect to addition. Suppose that it were true that 1 < (1 + 1)−1 . An equivalence relation partitions S into a collection of disjoint equivalence classes. −1. 7 A few words about equivalence relations ( ). we say that a is equivalent to b. An equivalence relation has to be symmetric (a ∼ b implies b ∼ a). That these numbers are all distinct follows from the axioms of order (each number is greater than its predecessor because 0 < 1). It should be noted that the rational numbers are the set of integer quotients modulo an equivalence relation7 . An equivalence relation on a set S is a property that any two elements either have or don’t. Thus. 1 + 1 + 1 + 1. Clearly the natural numbers do not form a ﬁeld (for example. for example. A rational number has inﬁnitely many representations as the quotient of two integers. −(1 + 1 + 1). by taking the element 1 (which exists by the axioms). Every element belongs to one and only one equivalence class. This forms the set of rational numbers. which we denote by Z. −(1 + 1). but are not a ﬁeld since multiplicative inverses do not exist6 . and denote it by a ∼ b. If two elements a. and denote it S / ∼.

Real numbers

17

Proposition 1.15 Let b, d

a c = b d

0. Then if and only if ad = bc.

Proof : Multiply/divide both sides by bd (which cannot be zero if b, d It can be checked that the rational numbers form an ordered ﬁeld. (4 hrs, 2010) (4 hrs, 2011)

0).

I

In this course we will not teach mathematical induction, nor inductive deﬁnitions; we will assume these concepts to be understood. As an example of an inductive deﬁnition, we deﬁne for all a ∈ Q and n ∈ N, the n-th power, by setting, a1 = a and ak+1 = a · ak .

As an example of an inductive proof, consider the following proposition:

**Proposition 1.16 Let a, b > 0 and n ∈ N. Then
**

a>b if and only if an > bn .

Proof : We proceed by induction. This is certainly true for n = 1. Suppose this were true for n = k. Then a>b implies ak > bk implies ak+1 = a · ak > a · bk > b · bk = bk+1 .

Conversely, if ak+1 > bk+1 then a ≤ b would imply that ak+1 ≤ bk+1 , hence a > b. I Another inequality, which we will need occasionally is:

**Proposition 1.17 (Bernoulli inequality) For all x > (−1) and n ∈ N,
**

(1 + x)n ≥ 1 + nx.

18

Chapter 1

Proof : The proof is by induction. For n = 1 both sides are equal. Suppose this were true for n = k. Then, (1 + x)k+1 = (1 + x)(1 + x)k ≥ (1 + x)(1 + kx) = 1 + (k + 1)x + kx2 ≥ 1 + (k + 1)x, where in the left-most inequality we have used explicitly the fact that 1 + x > 0. I

**Exercise 1.7 The following exercise deal with logical assertions, and is in
**

preparation to the type of assertions we will be dealing with all along this course. For each of the following statements write its negation in hebrew, without using the word “no”. Then write both the statement and its negation, using logical quantiﬁers like ∀ and ∃. For every integer n, n2 ≥ n holds. There exists a natural number M such that n < M for all natural numbers n. For every integer n, m, either n ≥ m or −n ≥ −m. For every natural number n there exist natural numbers a, b,, such that b < n, 1 < a and n = ab. (3 hrs, 2009)

1.6

The Archimedean property

We are aiming at constructing an entity that matches our notions and experiences associated with numbers. Thus far, we deﬁned an ordered ﬁeld, and showed that it must contain a set, which we called the integers, along with all numbers expressible at ratios of integers—the rational numbers Q. Whether it contains additional elements is left for the moment unanswered. The point is that Q already has the property of being an ordered ﬁeld. Before proceeding, we will need the following deﬁnitions:

**Deﬁnition 1.5 A set of elements A ⊂ F is said to be bounded from above (
**

) if there exists an element M ∈ F, such that a ≤ M for all a ∈ A, i.e., (∃M ∈ F) : (∀a ∈ A)(a ≤ M).

Real numbers Such an element M is called an upper bound ( bounded from below ( ) if (∃m ∈ F) : (∀a ∈ A)(a ≥ m).

19 ) for A. A is said to be

Such an element m is called a lower bound ( ) for A. A set is said to be bounded ( ) if it is bounded both from above and below. Note that if a set is upper bounded, then the upper bound is not unique, for if M is an upper bound, so are M + 1, M + 2, and so on.

**Proposition 1.18 A set A is bounded if and only if
**

(∃M ∈ F) : (∀a ∈ A)(|a| ≤ M).

Proof : 8 Suppose A is bounded, then (∃M1 ∈ F) : (∀a ∈ A)(a ≤ M1 ) (∃M2 ∈ F) : (∀a ∈ A)(a ≥ M2 ). That is, (∃M1 ∈ F) : (∀a ∈ A)(a ≤ M1 ) (∃M2 ∈ F) : (∀a ∈ A)((−a) ≤ (−M2 )). By taking M = max(M1 , −M2 ) we obtain (∀a ∈ A)(a ≤ M ∧ (−a) ≤ M), i.e., (∀a ∈ A)(|a| ≤ M). The other direction isn’t much diﬀerent.

8

I

Once and for all: P if Q means that “if Q holds then P holds”, or Q implies P. P only if Q means that “if Q does not hold, then P does not hold either”, which implies that P implies Q. Thus, “if and only if” means that each one implies the other.

20

Chapter 1

A few notational conventions: for a < b we use the following notations for an open ( ), closed ( ), and semi-open ( ) segment, (a, b) = {x ∈ F : [a, b] = {x ∈ F : (a, b] = {x ∈ F : [a, b) = {x ∈ F : a< a≤ a< a≤ x < b} x ≤ b} x ≤ b} x < b}.

Each of these sets is bounded (above and below). We also use the following notations for sets that are bounded on one side, (a, ∞) = {x ∈ F : [a, ∞) = {x ∈ F : (−∞, a) = {x ∈ F : (−∞, a] = {x ∈ F : x > a} x ≥ a} x < a} x ≤ a}.

The ﬁrst two sets are bounded from below, whereas the last two are bounded from above. It should be emphasized that ±∞ are not members of F! The above is purely a notation. Thus, “being less than inﬁnity” does not mean being less than a ﬁeld element called inﬁnity. (5 hrs, 2011) Consider now the set of natural numbers, N. Does it have an upper bound? Our “geometrical picture” of the number axis clearly indicates that this set is unbounded, but can we prove it? We can easily prove the following:

Proposition 1.19 The natural numbers are not upper bounded by a natural number. Proof : Suppose n ∈ N were an upper bound for N, i.e., (∀k ∈ N) : (k ≤ n). However, m = n + 1 ∈ N and m > n, i.e., (∃m ∈ N) : (n < m), which is a contradiction. Similarly, I

Real numbers

21

**Corollary 1.5 The natural numbers are not bounded from above by any rational
**

number

Proof : Suppose that p/q ∈ Q was an upper bound for N. Since, by the axioms of order, p/q ≤ p, it would imply the existence of a natural number p that is upper bound for N.9 I (5 hrs, 2010) It may however be possible that an irrational element in F be an upper bound for N. Can we rule out this possibility? It turns out that this cannot be proved from the axioms of an ordered ﬁeld. Since, however, the boundedness of the naturals is so basic to our intuition, we may impose it as an additional axiom, known as the axiom of the Archimedean ﬁeld: the set of natural numbers is unbounded from above10 . This axiom has a number of immediate consequences:

**Proposition 1.20 In an Archimedean ordered ﬁeld F:
**

(∀a ∈ F)(∃n ∈ N) : (a < n).

Proof : If this weren’t the case, then N would be bounded. Indeed, the negation of the proposition is (∃a ∈ F) : (∀n ∈ N)(n ≤ a), i.e., a is an upper bound for N. I

**Corollary 1.6 In an Archimedean ordered ﬁeld F:
**

(∀ > 0)(∃n ∈ N) : (1/n < ).

Convince yourself that p/q ≤ p for all p, q ∈ N. This additional assumption is temporary. The Archimedean property will be provable once we add our last axiom.

10

9

22

Chapter 1

**Proof : By the previous proposition, (∀ > 0)(∃n ∈ N) : (0 < 1/ < n) .
**

(0<1/n< )

I

**Corollary 1.7 In an Archimedean ordered ﬁeld F:
**

(∀x, y > 0)(∃n ∈ N) : (y < nx).

**Comment: This is really what is meant by the Archimedean property. For every
**

x, y > 0, a segment of length y can be covered by a ﬁnite number of segments of length x.

y x x x x x x

**Proof : By the previous corollary with = y/x, (∀x, y > 0)(∃n ∈ N) : (y/x < n) .
**

(y<nx)

I

1.7

Axiom of completeness

The Greeks knew already that the ﬁeld of rational numbers is “incomplete”, in the sense that there is no rational number whose square equals 2 (whereas they knew by Pythagoras’ theorem that this should be the length of the diagonal of a unit square). In fact, let’s prove it:

Proposition 1.21 There is no r ∈ Q such that r2 = 2.

Real numbers

23

Proof : Suppose, by contradiction, that r = n/m, where n, m ∈ N (we can assume that r is positive) satisﬁes r2 = 2. Although it requires some number theoretical knowledge, we will assume that it is known that any rational number can be brought into a form where m and n have no common divisor. By assumption, m2 /n2 = 2, i.e., m2 = 2n2 . This means that m2 is even, from which follows that m is even, hence m = 2k for some k ∈ N. Hence, 4k2 = 2n2 , or n2 = 2k2 , from which follows that n is even, contradicting the assumption that m and n have no common divisor. I Proof : Another proof: suppose again that r = n/m is irreducible and r2 = 2. Since, n2 = 2m2 , then m2 < n2 < 4m2 , it follows that m < n < 2m < 2n, and 0<n−m<m Consider now the ratio q= 2m − n , n−m and 0 < 2m − n < n.

whose numerator and denominator are both smaller than that of r. By elementary arithmetic q2 = 4m2 − 4mn + n2 4 − 4n/m + n2 /m2 6 − 4n/m = 2 2 = 2, = n2 − 2nm + m2 n /m − 2n/m + 1 3 − 2n/m I

which is a contradiction.

Exercise 1.8 Prove that there is no r ∈ Q such that r2 = 3. Exercise 1.9 What fails if we try to apply the same arguments to prove that

there is no r ∈ Q such that r2 = 4 (an assertion that happens to be false)? √ A way to cope with this “missing number” would be to add 2 “by hand” to the set of rational numbers, along with all the numbers obtained by ﬁeld operations √ involving 2 and rational numbers (this is called in algebra a ﬁeld extension)

. This observation motivates the following deﬁnition: Deﬁnition 1. known as the axiom of completeness ( ¯ ). what about 3? We could √ all the square roots of add 3 all positive rational numbers. B ⊆ F satisfying (∀a ∈ A ∧ ∀b ∈ B)(a ≤ b). what about a solution to the equation x5 + x + 1 = 0 (it cannot be expressed in terms of roots as a result of Galois’ theory)? It turns out that a single additional axiom. Does there exist a rational number that “separates” the two sets.6 An ordered ﬁeld F is said to be complete if for every two nonempty sets. We next introduce more deﬁnitions. A = {x ∈ Q : 0 < x. Let us try to look in more detail in what sense is the ﬁeld of rational numbers “incomplete”.24 Chapter 1 √ ( )11 .e. (∀a ∈ A ∧ ∀b ∈ B)(a ≤ b). x2 ≤ 2} B = {y ∈ Q : 0 < y. And then. We will soon see how this axiom “completes” the ﬁeld Q. Every element in B is greater or equal than every element in A (by transitivity and by the fact that 0 < x ≤ y implies x2 ≤ y2 ). completes the set of rational numbers in one fell swoop. But then.. i. b ∈ Q}. 2 ≤ y2 }. there exists a c ∈ F such that (∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b). hence such a c does not exist. what about 2? Shall we add all n-th roots? But then. such to provide solutions to all those questions. Formally. does there exist an element c ∈ Q such that (∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b) ? It can be shown that if there existed such a c it would satisfy c2 = 2. Consider the following two sets. A. Let’s start with a motivating example: 11 It can be shown that this ﬁeld extension of Q consists of all elements of the form √ {a + b 2 : a.

g. I We call the least upper bound of a set (if it exists!) a supremum. if it exists. (∀a ∈ A)(a ≤ M) (∀M ∈ F)( if (∀a ∈ A)(a ≤ M ) then (M ≤ M )). it is clear that b is the least upper bound.7 Let A ⊆ F be a set.Real numbers 25 Example: For any set [a. hence M ≤ M and M ≤ M. e. 2011) We can provide an equivalent deﬁnition of the least upper bound: Proposition 1. and it is denoted by inf A. and (ii) (∀ > 0)(∃a ∈ A) : (a > M − ). If M and M are both least upper bounds.. and denote it by sup A. is unique: Proposition 1. Deﬁnition 1. An element M ∈ F is called a least upper bound ( ) for A if (i) it is an upper bound for A. We can see right away that a least upper bound. In fact. b is an upper bound. In some books the notations lub (least upper bound) and glb (greatest lower bound) are used instead of sup and inf. (7 hrs. If M is also a least upper bound for A then M = M .22 Let A be a set and let M be a least upper bound for A. 12 . but so is any larger element. Proof : It follows immediately from the deﬁnition of the least upper bound.23 Let A be a set. which implies that M = M . That is. b + 1. Similarly. and (ii) if M is also an upper bound for A then M ≤ M . b) = {x : a ≤ x < b}. the greatest lower bound ( ) of a set is called the 12 inﬁmum. A number M is a least upper bound if and only if (i) it is an upper bound. then both are in particular upper bounds.

or in formal notation. 2. which is a contradiction. a ≤ M = M − for all a ∈ A. which is a contradiction. Then there exists a smaller upper bound M < M. (∀a ∈ A)(a ≤ M) and (∀M ∈ F)(if (∀a ∈ A)(a ≤ M ) then (M ≤ M )).. that (∃ > 0) : (∀a ∈ A)(a < M − ). Suppose ﬁrst that M were a least upper bound for A. suppose that M is an upper bound and (∀ > 0)(∃a ∈ A) : (a > M − ). Suppose. 15] (answer: 15). smaller than the least upper bound. that there exists an > 0 such that a ≤ M − for all a ∈ A. Take = M − M . .26 Chapter 1 A a M !! Proof : There are two directions to be proved. i. Suppose that M was not a least upper bound. 1. then M − is an upper bound for A.. i. (∃ > 0) : (∀a ∈ A)(a < M − ). Conversely. Then. I Examples: What are the least upper bounds (if they exist) in the following examples: [−5.e.e. by contradiction.

and A ⊆ F. ∞) (answer: none).11 Show that the axiom of completeness is equivalent to the following statement: for every two non-empty sets A. 15] ∪ {20} (answer: 20). We deﬁne A + B = {a + b : a ∈ A. {1/n + (−1)n : n ∈ N}. −1/2. 1/2}. there exists an element c ∈ F.Real numbers [−5. and if they are ﬁnd their inﬁmum and/or supremum: {(−2)n : n ∈ N}. [−5. [−5. A = B = [0. determine whether it is upper bounded. Prove that M = inf A (∀ > 0)(∃a ∈ A) : (a < M + ). [−5. 2. B be non-empty sets of real numbers.10 Let F be an ordered ﬁeld. 3} and B = {−1. 1]. {1/n : n ∈ Z. A · B and A · A in each of the following cases: A = {1.12 For each of the following sets. b ∈ B} A · B = {a · b : a ∈ A. and ∅ if and only if M is a lower bound for A. {1 + (−1)n : n ∈ N}. Exercise 1. satisfying that a ∈ A and b ∈ B implies a ≤ b. 15] ∪ (17. Find A + B. b ∈ B} A − B = {a − b : a ∈ A. lower bounded.13 Let A. 15) (answer: 15). . n 0}. {n/(2n + 1) : n ∈ N}. 27 Exercise 1. Exercise 1. {x : |x2 − 1| < 3}. −2. b ∈ B}. 18) (answer: 18). B ⊂ F. {1 − 1/n : n ∈ N} (answer: 1). such that a≤c≤b for all a∈A and b ∈ B. Exercise 1.

Show that if A. Suppose that A is lower bounded and set B = {x : −x ∈ A}. We denote m = min A.14 Let A be a non-empty set of real numbers. and sup B = − inf A.. Show that if A is upper bounded and B is lower bounded. if A has a minimum. A. i.28 Chapter 1 Exercise 1. Suppose that A. Deﬁnition 1. x ≥ 0. B are upper bounded. Exercise 1. ∀x ∈ A · A. A − A = {0}. inf A = min A. that A has an inﬁmum. B are lower bounded then inf(A + B) = inf A + inf B. then sup(A · B) = sup A · sup B. B only contain non-negative terms and are upper bounded.24 If a set A has a maximum. It is said to have a minimum if there exists an element m ∈ A which is a lower bound for A. then the maximum is the least upper bound. Similarly. 2010) Proposition 1. For the moment. Prove that B is upper bounded. you can only use the axiom of completeness for the least upper bound. Is each of the following statements necessarity true? Prove it or give a counter example: A + A = {2} · A. (7 hrs. It is necessarily true that sup(A · B) = sup A · sup B? Show that if A.8 Let A ⊂ F be a subset of an ordered ﬁeld. We denote M = max A. then sup(A − B) = sup A − inf B.15 In each of the following items. B are non-empty subsets of a complete ordered ﬁeld.e. sup A = max A. . It is said to have a maximum if there exists an element M ∈ A which is an upper bound for A. then the minimum is the greatest lower bound.

[−5. The statement is that if a maximum exists. and for every > 0. 2009) Proposition 1. is an upper bound. for example. which proves that M is the least upper bound. A M>M− .8 Every ﬁnite set in an ordered ﬁeld has a least upper bound and a greatest lower bound. 15] ∪ (17. I Corollary 1. [−5. We emphasize that not every (non-empty) set has a maximum. as 2. 15) (answer: none). [−5. [−5. that is (∀ > 0)(∃a ∈ A)(a > M − ). Consider now the ﬁeld Q of rational numbers. {1 − 1/n : n ∈ N} (answer: none). then x 2 < 2 < 22 . if x ∈ A.13 Then M is an upper bound for A. (5 hrs. Indeed. then it is also the supremum. 15] ∪ {20} (answer: 20). I Examples: What are the maxima (if they exist) in the following examples: [−5. x2 < 2}. This set is bounded from above.25 Every ﬁnite set in an ordered ﬁeld has a minimum and a maximum. 15] (answer: 15). 13 . ∞) (answer: none).Real numbers 29 Proof : Let M = max A. Proof : The proof is by induction on the size of the set. 18) (answer: none). and its subset A = {x ∈ Q : 0 < x.

(use 1/n < α) (use α < 2) (use 6/n < 2 − α2 ) . Relying on the Archimedean property.30 which implies that x < 2. Suppose. by contradiction. . such that n > max 1 6 . such that r2 < 2. Clearly. α 2 − α2 We are going to show that there exists a rational number r > α. Proof : We assume that A has a least upper bound.26 Suppose that A has a least upper bound. Chapter 1 A 0 But does A have a least upper bound? an upper bound 1 2 Proposition 1. then (sup A)2 = 2. which means that α is not even an upper bound for A—contradiction: 2 1 (α + 1/n)2 = (α)2 + α + 2 n n 3 < α2 + α n 6 < α2 + n 2 < α + 2 − α2 = 2. we choose n to be an integer large enough. which means that α + 1/n ∈ A. that α2 < 2. 1 < α < 2. and denote α = sup A.

We will now see that completeness guarantees the existence of a lowest upper bound: Proposition 1.Real numbers Similarly. α − 1/n is an upper bound for A. indeed. 2 (omit 1/n2 ) (use n/2α > α2 − 2) which means that. suppose that α2 > 2.27 In a complete ordered ﬁeld every non-empty set that is upperbounded has a least upper bound. hence c = sup A. By the axiom of completeness. Clearly. Clearly c is an upper bound for A smaller or equal than every other upper bound. it follows that there is no supremum to this set in Q. I (8 hrs. Deﬁne B = {b ∈ F : b is an upper bound for A}. a contradiction: 2 1 (α − 1/n)2 = α2 − α + 2 n n 2 > α2 − α n > α2 + 2 − α2 = 2. which by assumption is non-empty. This time we set n> α2 2α . such that r2 > 2. (∀a ∈ A ∧ ∀b ∈ B)(a ≤ b). −2 31 We are going to show that there exists a rational number r < α. (∃c ∈ F) : (∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b). I Since we saw that there is no r ∈ Q such that r = 2. 2010) The following proposition is an immediate consequence: . Proof : Let A ⊂ F be a non-empty set that is upper-bounded. which means that r is an upper bound for A smaller than α—again.

to read about it.32 Chapter 1 Proposition 1. Proof : Let m be a lower bound for A. It turns out that this deﬁnes the set uniquely.. and (∀ > 0)(∃a ∈ A) : (a < (−M) + ). I and (∀ > 0)(∃a ∈ A) : ((−a) > M − ).29 (Archimedean property) N is not bounded from above. In other words. You are strongly recommended. which we denote by R. which we denote by M.28 In a complete ordered ﬁeld every non-empty set that is bounded from below has a greatest lower bound (an inﬁmum). 14 The axiom of completeness can be used to prove the Archimedean property. By the axiom of completeness. Consider the set B = {(−a) : a ∈ A} . we deﬁne the set of real numbers as a complete ordered ﬁeld. Since (∀a ∈ A)((−m) ≥ (−a)). i. up to an isomorphism).e. up to a relabeling of its elements (i. Proposition 1. however. Strictly speaking. With that.e. the axiom of completeness solves at the same time the “problems” pointed out in the previous section. See for example the construction of Dedekind cuts. 14 . (∀a ∈ A)(m ≤ a). it follows that (−m) is a upper bound for B. Thus. we should prove that such an animal exists.. Constructing the set of real numbers from the set of rational numbers is beyond the scope of this course. (∀a ∈ A)(M ≥ (−a)) This means that (∀a ∈ A)((−M) ≤ a) This means that (−M) = inf A. B has a least upper bound.

Proof : Since the natural numbers are not bounded. Proposition 1.30 (The rational numbers are dense in the reals) Let x. Consider now the sequence of rational numbers. there exists an m ∈ Z ⊂ Q such that m < x. y ∈ R. k rk = m + . (∃M ∈ R) : (∀n ∈ N) (n ≤ M). y ∈ R.e. n k = 0.e. .. . Then it would have a least upper bound M. (∀n ∈ N) ((n + 1) ≤ M). contradicting the fact that M is the least upper bound. n Let k be the smallest such number (the exists a smallest k because every ﬁnite set has a minimum).Real numbers Proof : Suppose N was bounded. In formal notation. x < y)(∃q ∈ Q) : (x < q < y). 1. I (9 hrs. i. (∀x. . 33 This means that M − 1 is an upper bound. ) of rational numbers such that x < y. From the Archimedean property follows that there exists a k such that rk = m + k > x. 2. . (∀n ∈ N) (n ≤ (M − 1)). . Also there exists a natural number n ∈ N such that 1/n < y − x. i. rk > x and rk−1 ≤ x. There exists a rational number q ∈ Q such that x < q < y. Since n ∈ N implies n + 1 ∈ N.. 2011) We next prove an important facts about the density ( within the reals.

b ∈ B.34 Hence. . (∀ > 0)(∃a ∈ A. then w) reach a rational point between x and y. Chapter 1 x < rk = rk−1 + which completes the proof. then any number m in between would satisfy a < m < b for all a ∈ A. Then the following three statements are equivalent: (∃!M ∈ F) : (∀a ∈ A ∧ ∀b ∈ B)(a ≤ M ≤ b). Proof : Suppose that the ﬁrst statement holds. The claim is that each statement implies the two other. n n 1/" q If we start #om an integer poin$ on the le% of x and proceed by rational steps of size less than &y'x(. Proposition 1. We have already seen that sup A is a lower bound for B. contradicting the assumption. such that )) Let A. 1 1 ≤ x + < y. This implies at once that sup A ≤ inf B. If sup A < inf B. Comment: The claim is not that the three items follow from the given data. ! x y I The following proposition provides a useful fact about complete ordered ﬁelds. sup A = inf B.31 (( ¯ plete ordered ﬁeld. B ⊂ F be two non-empty sets in a com(∀a ∈ A ∧ ∀b ∈ B)(a ≤ b). b ∈ B) : (b − a < ). hence x implies y. We will use it later in the course.

y implies z. where a.. b ∈ B) (a ≤ m − /2. Show that the set √ {a + b 2 : a. where a. d ∈ Q. then b = 0. Show that the irrational numbers are dense. (7 hrs. Then.e. 2009) √ I Exercise 1. It remains to show that z implies x. √ Show that if a + b 2 = 0. c. b−a≥(m+ /2)−(m− /2)= i. i. √ √ Conclude that a + b 2 = c + d 2. Thus. 35 i. . there were M1 < M2 such that (∀a ∈ A ∧ ∀b ∈ B)(a ≤ M1 < M2 ≤ b). Set m = (M1 + M2 )/2 and = (M2 − M1 ). then a = b = 0...16 In this exercise you may assume that 2 is irrational. b. (∀ > 0)(∃a ∈ A) : (a > sup A − /2) (∀ > 0)(∃b ∈ B) : (b < inf B + /2). b ∈ Q. (∀ > 0)(∃a ∈ A. b ∈ Q} with the standard addition and multiplication operations is a ﬁeld.e. (∀ > 0)(∃a ∈ A) : (a + /2 > sup A) (∀ > 0)(∃b ∈ B) : (b − /2 < inf B). where a. √ Show that if a + b 2 ∈ Q. (∀a ∈ A. b ∈ Q. This concludes the proof. b ≥ m + /2). b ∈ B) such that b − a < (inf B + /2) − (sup A − /2) = . That is. if x does not hold then z does not hold.e. then a = c and b = d. namely that there exists an irrational number between every two. Suppose M was not unique.Real numbers By the properties of the inﬁmum and the supremum.

such that a < t < s. or none of the above: a ≤ s for all a ∈ A. n ∈ Z. We deﬁned the integer powers recursively. n ∈ N} is dense. B ⊆ R be non-empty sets. For every ﬁnite subset B ⊂ A.32 (Properties of integer powers) Let x.18 Let A. Determine for each of the following statements whether it is equivalent. s is the supremum of A ∪ {s}.8 Rational powers x1 = x xk+1 = x · xk . Prove or disprove each of the following statements: If A ⊆ B and B is bounded from above then A is bounded from above. Proposition 1. or implying the statement that s = sup A. For every a ∈ A there exists a r ∈ R. Exercise 1.36 Chapter 1 Exercise 1. Exercise 1. If every b ∈ B is an upper bound for A then every a ∈ A is a lower bound form B. y 0 and m. implied by. We also deﬁne for x 0. x0 = 1 and x−n = 1/xn . . A is bounded from above if and only if A ∩ Z is bounded from above. 1. For every x ∈ R such that x > s there exists an a ∈ A such that s < a < x. contradictory. s ≥ max B.19 Let A ⊆ R be a non-empty set. then xm xn = xm+n . If A ⊆ B and B is bounded from below then A is bounded from below.17 Show that the set {m/2n : m ∈ Z.

and the latter is hence an upper bound for S . An n-th root of x is a positive number y. (10 hrs. . we denote it by either n x. z ∈ S implies that z ≤ max(1.9 Let x > 0. which implies that z ≤ 1 if x > 1 and z ∈ S then zn < x.Real numbers 37 (xm )n = xmn . It follows that regardless of the value of x. x). This is a non-empty set (it contains zero) and bounded from above. would it imply that y = z). By the axiom of completeness. Thus. which implies that z ≤ x. there exists a unique y = sup S . 2010) Proof : Consider the set S = {z ≥ 0 : zn < x}. 0 < α < β and n < 0 implies αn > βn . (xy)n = xn yn . 0 < α < β and n > 0 implies αn < βn . 2011) I Deﬁnition 1. since if x ≤ 1 and z ∈ S then zn < x ≤ 1. Proof : Do it. Theorem 1. we will show that yn = x. or by x1/n . It is easy to see that if x has an n-th root then √ is unique (since yn = zn .1 (Existence and uniqueness of roots) (∀x > 0 ∧ ∀n ∈ N)(∃!y > 0) : (yn = x). (10 hrs. Indeed. such that yn = x. α > 1 and n > m implies αn > αm . 0 < α < 1 and n > m implies αn < αm . which is the “natural suspect” for being an n-th root of x.

Chapter 1 0< hence x < 1. where we used the Bernoulli inequality. and yn < x(1 − n) ≤ x(1 − )n . Thus. by contradiction that yn < x. by contradiction that yn > x. yn − x . 1+x x n x < < x. nyn < yn − x. we found a number less than y. whose n-th power is greater than x. nx < x − yn .. This contradicts the assumption that y is an upper bound for S . we found a number greater than y. x − yn . Claim: yn ≤ x Suppose. in S . Set 0< < min 1. whose n-th power is less than x. which implies that y > 0. i. nyn . < min 1.e. it is an upper bound for S . This ﬁnally implies that y 1− n < x. Claim: yn ≥ x Suppose. i. where we used once more the Bernoulli inequality. Thus. 1+x 1+x Thus. This contradicts the assumption that y is the least upper bound for S . and [(1 − )y]n = (1 − )n yn ≥ (1 − n)yn > x. nx Then..38 Claim: y is positive Indeed.e. x/(1 + x) ∈ S . Set 0< Then.

b. d 0. (8 hrs. α > 1 and r > s implies αr > α s .Real numbers From the two inequalities follows (by trichotomy) that yn = x. by saying that a rational number is even if its numerator is even. This is a non-trivial fact. We thus need to show that the above deﬁnition is independent of the representation15 . The arguments goes as follows: (xa/b )bd = (((xa )1/b )b )d = (xa )d = xad = xbc = (xc )b = (((xc )1/d )d )b = (xc/d )bd . c. Such as deﬁnition would imply that 2/6 is even. Suppose we wanted to deﬁne the notion of even/odd for rational numbers. Deﬁnition 1. y > 0 and r. 2009) Proposition 1. (xr ) s = xrs . 39 I Having shown that every positive number has a unique n-th root. (xy)r = xr yr . we may proceed to deﬁne rational powers. We need to prove that ad = bc implies that sup{z > 0 : zb < xa } = sup{z > 0 : zd < xc }. Recall that rational numbers do not have a unique representation as the ratio of two integers. hence xa/b = xc/d . 0 < α < β and r < 0 implies αr > βr . but 1/3 (which is the same number!) is odd. then 15 xr x s = xr+s . s ∈ Q. This is not obvious. In other words. d ∈ Z. . then for all x > 0 xr = (xm )1/n . with a.10 Let r = m/n ∈ Q. b. There is one little delicacy with this deﬁnition. if ad = bc. then for all x > 0 xa/b = xc/d . 0 < α < 1 and r > s implies αr < α s . 0 < α < β and r > 0 implies αr < βr .33 (Properties of rational powers) Let x.

We will give one example. Let r = a/b.10 Addendum Existence of non-Archimedean ordered ﬁelds Does there exist ordered ﬁelds that do not satisfy the Archimedean property? The answer is positive. Start with the ﬁrst. from which follows that xa/b xc/d = xa/b+c/d . y ∈ A and x < z < y implies z ∈ A (that is. (xa/b xc/d )bd = (xa/b )bd (xc/d )bd = · · · = xad xbc = xad+bc = (xa/b+c/d )bd . The ”natural elements” are the constant functions. Thus it remains to show α1/b < β1/b . This has to be because α1/b ≥ β1/b would have implied (from the rules for integer powers!) that α ≥ β.9 Real-valued powers Delayed to much later in the course. every point between two elements in the set is . etc. with f (x) ≡ 0 for additive neutral element and f (x) ≡ 1 for multiplicative neutral element. say. It is easy to see that this set forms a ﬁeld with respect to function addition and multiplication. Connected set A set A ⊆ R will be said to be connected ( ) if x. We already know that αa < βa . We hace f > g if f (x) > g(x) for x suﬃciently large. Let r = a/b and s = c/d. f (x) ≡ 2.40 Chapter 1 Proof : We are going to prove only two items. by the element f (x) = x (which obviously belongs to the set). Consider the set of rational functions. Using the (proved!) laws for integer powers. bearing in mind that some of the arguments rely on later material. I 1. 1. It is easy to see that the set of “natural elements” are bounded. f (x) ≡ 1. Ignore the fact that a rational function may be undeﬁned at a ﬁnite collection of points. Take then the fourth item. We endow this set with an order relation by deﬁning the set P of positive elements as the rational functions f that are eventually positive for x suﬃciently large.

[a. b). [a. (a. (a. b] (−∞. b). (−∞. (a. b). it can only be a single point. an open. Exercise 1.20 Prove it. an open or closed ray.Real numbers 41 also n the set). closed or semi-open segments. . ∞). (−∞. [a. or the whole line. That is. ∞). b]. b]. ∞). We state that a (non-empty) connected set can only be of one of the following forms: {a}.

42 Chapter 1 .

To avoid ambiguities. we write f : A → B ( f maps the set A into the set B). we may denote a function by the letter f . The numbers which may be “fed into the machine”. returns a number (only one. For example. We normally denote functions by letters. We only require that every number returned by the function belongs to its range. Numbers that may be “emitted by the machine”. We do not exclude the possibility that some of these numbers may never be returned. the function has to be deﬁned properly. and deﬁne a function as a “machine”. The crucial point is that to every number in its domain corresponds one and only one number in its range ( ¯ ). A transformation rule ( ). a function is a rule that assigns real numbers to real numbers. but we will deliberately be a little less formal than perhaps we should. which when provided with a number. In other words. like we do for real numbers (and for any other mathematical entity). and always the same for the same input). A range ( ): another subset of R. The transformation rule has to specify what number in B is assigned by the .Chapter 2 Functions 2. and B ⊆ R is its range. If A ⊆ R is its domain.1 Basic deﬁnitions What is a function? There is a standard way of deﬁning functions. A function is deﬁned by three elements: A domain ( ): a subset of R.

(∀y ∈ Image( f ))(∃!x ∈ A) : ( f (x) = y). A function that assigns to every w ±1 the number (w3 + 3w + 5)/(w2 − 1). and of f : x → f (x) as deﬁning the “action” of the function. deﬁned by the transformation rules f (x) = x2 . and g:w→ w3 + 3w + 5 . f (t) = t2 . We denote the assignment by f (x) (the function f evaluated at x). 2]. 2 It takes a great skill not to confuse the Greek letters ζ (zeta) and ξ (xi).. Also. Examples: A function that assigns to every real number its square. Its image ( of numbers that are actually assigned by the function. the functions f : R → R. ) is the subset of B The function f is said to be onto B ( ) if B is its image. i.44 Chapter 2 function to each number x ∈ A.1 Let f : A → B be a function. however. that “the function f is x2 ”. then g : R \ {±1} → R 1 and f : x → x2 . Thus. Deﬁnition 2. There is however nothing wrong with the deﬁnition of the range as the whole of R. then f :R→R We may also write f (x) = x2 . f (α) = α2 and f (ξ) = ξ2 are identical2 . w2 − 1 Programmers: think of f : A → B as deﬁning the “type” or “syntax” of the function. In particular.e. it turns out that the function f only returns non-negative numbers. ∞). That is. Image( f ) = {y ∈ B : ∃x ∈ A : f (x) = y}. We could limit the range to be the set [0. but not to the set [1. we may use any letter other than x as an argument for f . If we denote this function by g. . That the assignment rule is “assign f (x) to x” is denoted by f : x → f (x) (pronounced “ f maps x to f (x)”)1 . f is said to be one-toone ( ¯ ) if to each number in its image corresponds a unique number in its domain. If we denote the function by f . We should not say.

often denoted by Id. . 0. 28. π2 /17. We have f : R → {0. For every n ∈ N we may deﬁne the n-th power function fn : R → R. The function f1 : x → x is known as the identity function. 17. This function diﬀers from the function in the ﬁrst example because the two functions do not have the same domain (diﬀerent “syntax” but same “routine”). If this number is inﬁnite. A function that assigns to every real numbers the value zero if it is irrational and one if it is rational. Id : x → x.Functions 45 A function that assigns to every −17 ≤ x ≤ π/3 its square. This example diﬀers from the previous ones in that we do not have an assignment rule in closed form (how the heck do we compute f (x)?). such that 5 36/π x→ 28 16 x=2 x = 17 x = π2 /17 or 36/π otherwise. This function is known as the Dirichlet function. e. 36/π}. by fn : x → xn . 36/π} ∪ {a + b 2 : a. namely Id : R → R. 3 . which assigns to x the number of 7’s in its decimal expansion.. A function deﬁned on R \ Q (the irrational numbers). . We limited the range of the function to the irrationals because rational numbers may have a non-unique decimal representation. The range may be taken to be R. Here again. Nevertheless it provides a legal assignment rule3 . then it returns −π. with 0 x is irrational f :x→ 1 x is rational. but the image is {6.g. 1}. we will avoid referring to “the function xn ”.6999999 . . A function deﬁned on the domain √ A = {2. 16. b ∈ Q}. if this number is ﬁnite.7 = 0.

Let f : A → B and g : B → C. b ∈ R. then5 g ◦ f : ξ → sin2 ξ and f ◦ g : ζ → sin ζ 2 . f /g : x → f (x)/g(x). It is deﬁnitely a habit to use the letter x as generic argument for functions. 0} → R. if f is the sine function and g is the square function. Beware of ﬁxations! Any letter including Ξ is as good. We are adding “machines”. x → 0.. We may deﬁne a new function4 a f + b g : A ∩ B → R. Let f : A → R and g : B → R be given functions. its inverse is − f = (−1) f . For example. sometimes complicated. f · g : x → f (x)g(x). not numbers. It is our choice in the present course to assume that the meaning of these functions (along with their domains and ranges) are well-understood. and a. This is a vector ﬁeld whose zero element is the zero function. We may deﬁne the product of two functions. Given several functions. 2011) A third operation that can combine two functions is composition. On the other hand. i. functions form a vector space over the reals. functions form an algebra. This is not trivial. 5 4 . the exponential. Moreover. the cosine. ( f ◦ g) ◦ h = f ◦ (g ◦ h). f /g : A ∩ {z ∈ B : g(z) (12 hrs. such as the sine. given a function f . as well as their quotient. f · g : A ∩ B → R.e. g ◦ f : x → g( f (x)). composition is associative.46 Chapter 2 There are many functions that you all know since high school. These function require a careful. We deﬁne g ◦ f : A → C. they can be combined together to form new functions. a f + b g : x → a f (x) + b g(x). deﬁnition. For example. and the logarithm. namely. composition is non-commutative.

2009) Exercise 2.Functions Note that for every function f . (10 hrs. The union over all n’s of polynomials of degree n is the set of polynomials. with an i=0 n 0. A function P is called a polynomial of degree n if there exist real numbers (ai )n . . Example: Consider the function f that assigns the rule f :x→ This function can be written as f = Id + Id · Id + Id · sin ◦(Id · Id) .1 The following exercise deals with the algebra of functions. P(x) = 2 x and S (x) = x2 . Id · sin + Id · sin · sin x + x2 + x sin x2 . such that P= k=0 ak fk . s(x) = sin x all three deﬁned on R. P. Consider the three following functions. Id ◦ f = f ◦ Id = f. and S : (i) (s ◦ P)(y). x sin x + x sin2 x Example: Recall the n-th power functions fn . Express the following functions without using the symbols s. 47 so that the identity is the neutral element with respect to function composition. This should not be confused with the fact that x → 1 is the neutral element with respect to function multiplication. (ii) (S ◦ s)(ξ). A function is called rational if it is the ratio of two polynomials. and (iii) (S ◦ P ◦ s)(t) + (s ◦ P)(t).

Exercise 2. b. Exercise 2. (iii) 2 sin y 2 f (u) = sin(2u + 2u ).2 An open segment I is called symmetric if there exists an a > 0 such that I = (−a. Prove that if f is a polynomial satisfying f (a) = 0. and let f : A → B be given by f : x → ax2 + bx + c. Let f : I → R where I is a symmetric segment. That is. where I is a symmetric segments. if f = E + O = E + O . where I and J are symmetric segments). and odd if f (−x) = − f (x) for every x ∈ I. along x with the binary operators +. then there are at most n real numbers a. and where applicable give a counter example: f + g. and (v) f (a) = 2sin a + 2 +sin a) sin(a2 ) + 2sin(a . determine whether the speciﬁed function is even. (ii) f (t) = 22 . such that f (a) = 0 (at most n roots). or not necessarily either two. a). g : I → R. Let f. Exercise 2. then E = E and O = O . odd. Prove then that this decomposition is unique. g being either even or odd. (iv) f (y) = sin(sin(sin(22 ))).48 Chapter 2 Write the following functions using only the symbols s. such that f (x) = (x − a)g(x) + b (hint: use induction on the degree of the polynomial). the function f is called even if f (−x) = f (x) for every x ∈ I. ·. and S .5 Let a. Exercise 2.3 Show that any function f : R → R can be written as f = E + O. c ∈ R and a 0. P. where E is even and O is odd. Prove that for every polynomial of degree ≥ 1 and for every real number a there exists a polynomial g and a real number b. In each case. f · g. Explain your statement. and consider the four possibilities of f. .4 The following exercise deals with polynomials. then there exists a polynomial g such that f (x) = (x − a)g(x). f ◦ g (here we assume that g : I → J and f : J → R. and ◦: (i) f (x) = 2sin x . Conclude that if f is a polynomial of degree n.

y) : x ∈ A. A function is a graph—a subset of the Cartesian product space A × B. Exercise 2. (x.6 Let f satisfy the relation f (x + y) = f (x) + f (y) for all x. and (x. The deﬁnition we gave to a function as an assignment rule is. xn ∈ R..2 Graphs As you know. not a formal one. What is a graph? A drawing? A graph has to be thought of as a subset of the plane. . we deﬁne the graph of f to be the set Graph( f ) = {(x. ) 2. y = f (x)} ⊂ A × B. For a function f : A → B. z) ∈ Graph( f ) implies y = z. y) ∈ Graph( f ) The function is one-to-one if (x. . . i. Suppose A = R. . The deﬁning property of the function is that it is uniquely deﬁned. (Note that we claim nothing for irrational arguments. strictly speaking. Prove inductively that f (x1 + x2 + · · · + xn ) = f (x1 ) + f (x2 ) + · · · + f (xn ) for all x1 . such that f (x) = cx for all x ∈ Q. .Functions Suppose B = R. The standard way to deﬁne a function is via its graph. Find the largest segment A for which f is one-to-one. y) ∈ Graph( f )). Find the largest segment B for which f is onto. 49 Find segments A and B as large as possible for which f is one-to-one and onto. y ∈ R. Prove the existence of a real number c.e. we can associate with every function a graph. y) ∈ Graph( f ) implies x = w. y) ∈ Graph( f ) It is onto B if (∀y ∈ B)(∃x ∈ A) : ((x. and (w.

A point a ∈ A is called an interior point of A ( ) if it had a neighborhood contained in A. δ) = {x : 0 < |x − a| < δ}. neighborhoods of a of the form {x : |x − a| < δ} for some δ > 0. Notation: We will mostly deal with symmetric neighborhoods (whether punctured or not). satisfying (∀a ∈ A)(∃!b ∈ B) : ((a.3 Limits Deﬁnition 2. B(a. Let A and B be two sets (of anything!).3 Let A ⊂ R.2 Let x ∈ R.50 Chapter 2 Comment: In fact. i. 2010) 2. b ∈ B}. We will introduce the following notations for neighborhoods. δ) = (a − δ. x cannot be a boundary point). b) ∈ Graph( f )). punctured neighborhood neighborhood a a Deﬁnition 2. (12 hrs. A function f : A → B is a subset of the Cartesian product Graph( f ) ⊂ A × B = {(a. b) : a ∈ A. . b) that contains the point x (note that since the segment is open.. this is the general way to deﬁne functions. b) \ {x} where a < x < b. A punctured neighborhood of x ( ) is a set (a.e. a + δ) B◦ (a. A neighborhood of x ( ) is an open segment (a.

In this section we explain the meaning of the limit of a function at a point. the point 1/2 is an interior point. δ) = [a. δ) = (a − δ. by making x suﬃciently close to 5. − 51 Example: For A = [0. Suppose you want f (x) to diﬀer from 15 by less than 1/100. a). in fact. and neither does its complement. This means that we can make f (x) be as close to 15 as we wish.Functions We will also deﬁne one-sided neighborhoods. B+ (a. Note that the function does not need to be equal to at the point a. Example: Consider the function f :R→R f : x → 3x. a + δ) + B− (a. with 5 itself being excluded. a] B◦ (a. 300 300 . we say that: The limit of a function f at a point a is . We claim that the limit of this function at the point 5 is 15. δ) = (a. but not the point 0. δ) = (a − δ. by making its argument suﬃciently close to a (excluding the value a itself). 100 100 This requirement is guaranteed if 5− 1 1 < x<5+ . it does not even need to be deﬁned at the point a. Informally. Example: The set Q ⊂ R has no interior points. 1]. if we can make f assign a value as close to as we wish. This means that you want 15 − 1 1 < f (x) = 3x < 15 + . R \ Q. a + δ) B◦ (a.

you want | f (x) − 15| = |3x − 15| < . for some > 0 of your choice.4 Let f : A → B with a ∈ A an interior point. choosing x within a symmetric punctured neighborhood of 5 of length 2 /3 guarantees that f (x) is within a distance of from 15. .52 Chapter 2 Thus. In other words. Suppose you want f (x) to diﬀer from 15 by less than . We write. Since we can repeat this construction for any number other than 1/100. We say that the limit of f at a is . such that7 | f (x) − | < In formal notation. x→5 We can be more precise. This example insinuates what would be a formal deﬁnition of the limit of a function at a point: Deﬁnition 2. δ). if we take x to diﬀer from 5 by less than 1/300 (but more than zero!). lim f = 15. if for every > 0. 7 The game is “you give me and I give you δ in return”. This is guaranteed if |x − 5| < /3. δ))(| f (x) − | < ). then we conclude that the limit of f at 5 is 15. 5 although the more common notation is6 lim f (x) = 15. The choice of not using the common notation becomes clear when you realize that you could as well write ℵ→5 6 for all x ∈ B◦ (a. thus given > 0. there exists a δ > 0. lim f (ℵ) = 15 or ξ→5 lim f (ξ) = 15. (∀ > 0)(∃δ > 0) : (∀x ∈ B◦ (a. then we are guaranteed to have f (x) within the desired range.

Player B: No problem. If we impose that |x − 3| < δ then | f (x) − 9| < |x + 3| δ. we need to show that for every > 0 we can ﬁnd a δ > 0. we note that by the triangle inequality. To ﬁnd the appropriate δ. Instead. δ). If the |x + 3| wasn’t there we would have set δ = and we would be done. |x + 3| = |x − 3 + 6| ≤ |x − 3| + 6. such that | f (x) − 9| < whenever x ∈ B◦ (3. We want to show that lim f = 9. Example: Consider the square function f : R → R.it fo!ows tha" the limit of f at a is l. we examine the condition that needs to be satisﬁed: | f (x) − 9| = |x − 3| |x + 3| < . so that for all |x − 3| < δ. Thus. 3 By deﬁnition.Functions 53 Player A: This is my !. l ! l ! " a a Since Player B can ﬁnd a " for every choice of ! made by Player A. f : x → x2 . | f (x) − 9| ≤ |x − 3| (|x − 3| + 6) < (6 + δ)δ. . Here is my ". the game it to respond with an appropriate δ for every .

To summarize. > 0 we take δ = min(1. Example: The next example is the function f : (0. given |x − 3| < δ). . hence f (x) − 2|x − a| 1 < . then | f (x) − 1/a| < . We may freely require for example that d ≤ 1. We start by requiring that |x − a| < a/2. ∞). which at once implies that x > a/2. then we reach the desired goal. If we furthermore take δ ≤ /7. We are going 1 lim f = . f : x → 1/x. f (x) − for δ = min(a/2. it is not a variable. We can show. in which case (∀x : 0 < | f (x) − 9| = |x − 3| |x + 3| ≤ |x − 3| (|x − 3| + 6) < (6 + δ)δ ≤ (6 + 1) = . a a2 If we further require that |x − a| < a2 /2. in general. in which case | f (x) − 9| < (6 + δ)δ ≤ 7δ. δ).54 Chapter 2 Now recall: given we have the freedom to choose δ > 0 such to make the righthand side less or equal than . To conclude. /7). (14 hrs. Let > 0 be given. 2011) 1 < a whenever x ∈ B◦ (a. We ﬁrst observe that f (x) − 1 1 1 |x − a| . that lim f = a2 a for all a ∈ R. 7 This proves (by deﬁnition) that lim3 f = 9. a a First ﬁx a. = − = a x a ax We need to be careful that the domain of x does not include zero. a2 /2). [do it!] (13 hrs. 2010) to show that for every a > 0.

δ)) : ( f (x) (12 hrs. To show that for all . Indeed.. 0 f :x→ 1 x is irrational x is rational. in contrast.e. δ)) : (| f (x) − | ≥ ). 4 1 4 and 1 |1 − | < . the function f : R → R. δ)) : ( f (x) = 0. we need to show the negation of (∃ ∈ R) : (∀ > 0)(∃δ > 0) : (∀x ∈ B◦ (0. 4 Comment: It is important to stress what is the negation that the limit f at a is : There exists an > 0. i.Functions 55 Example: Consider next the Dirichlet function. which satisﬁes 0 < |x − a| < δ. )). y ∈ B◦ (0. f : R → R. (∀ ∈ R. No can satisfy both |0 − | < I. but not | f (x) − | < . 1 )). δ))( f (x) ∈ B( ... We will show that f does not have a limit at zero.e. f (y) = 1). (∀δ > 0)(∃x. 2009) B( . Then no matter what δ is. ∀δ > 0)(∃x ∈ B◦ (0. . there exists an x. (∀ ∈ R)(∃ > 0) : (∀δ > 0)(∃x ∈ B◦ (0. 0 f :x→ x x is irrational x is rational. take = 1/4. Example: Consider. lim f 0 . there exist points 0 < |x| < δ for which f (x) = 0 and points for which f (x) = 1. i.e. such that for all δ > 0.

let > 0 be given. there exists a δ1 > 0 such that 1 | f (x) − | < | − m| 4 whenever x ∈ B◦ (a. Theorem 2. The ﬁrst theorem states that a limit. Then. Proof : Suppose. we are in measure to prove general theorems about limits. . δ). a with m. that lim f = a and lim f = m. and having seen a number of example. if it exists. is unique. then by taking δ = .56 Here we show that Chapter 2 lim f = 0. Having a formal deﬁnition of a limit. and there exists a δ2 > 0 such that 1 | f (x) − m| < | − m| 4 whenever x ∈ B◦ (a. whenever x ∈ B◦ (a. Take δ = min(δ1 . δ2 ). δ) implies | f (x) − 0| < . 1 | − m| = |( f (x) − m) − ( f (x) − )| ≤ | f (x) − | + | f (x) − m| < | − m|. δ2 ). By deﬁnition.1 (Uniqueness of the limit) A function f : A → B has at most one limit at any interior point a ∈ A. δ1 ). by contradiction. 0 Indeed. 2 which is a contradiction. x ∈ B◦ (0.

it fo!ows tha" the limit of f at a cannot be both l and #. where f (x) = 1/x. and if they do exist they are equal. Player B: I give up.. Show that lima f exists if and only if lima g exists. No " ﬁts. where f (x) = |x|. I Exercise 2. where f (x) = x2 + 3. where f (x) = 7x. You have to calculate the limit directly from its deﬁnition (without using. (This exercise shows that the limit of a function at a point only depends on its properties in a neighborhood of that point). f . with a ∈ I.7 In each of the following items. # l ! ! # l ! ! " a a Since Player B cannot ﬁnd a " for a particular choice of ! made by Player A. for example.Functions 57 Player A: This is my !. determine whether the limit exists. deﬁned on a punctured neighborhood of a). Let J ⊂ I be an open interval containing the point a. where f (x) = |x|. √ lima f . x>0 x=0 x < 0. Exercise 2. g : I \ {a} → R (i.e. lim2 lim0 lim0 lima f . Let f. such that f (x) = g(x) for all x ∈ J \ {a}.8 Let I be an open interval in R. and if it does calculate its value. lima f . . where 1 f (x) = 0 −1 and a ∈ R. f . limits arithmetic—see below). f .

We have seen above various examples of limits. |x − | < |x − | < min 1. imply |xy − m| < . 2011) . by showing that the deﬁnition was satisﬁed.58 Chapter 2 Exercise 2.1 Let . where the representation p/q is in reduced form (for example R(4/6) = R(2/3) = 1/3). 2 2 . 2(| | + 1) 2 and |y − m| < 2 imply |(x + y) − ( + m)| < . m ∈ R. y m |m| |m|2 . If m 0 and |y − m| < min then y 0 and 1 1 − < . we had to “work hard” to prove what the limit was. a lemma: Lemma 2. 2(|m| + 1) and |y − m| < min 1. Then.9 The Riemann function R : R → R is deﬁned as 1/q R(x) = 0 x = p/q ∈ Q x Q. Show that lima R exists everywhere and is equal to zero. But ﬁrst. In each case. This becomes impractical when the functions are more complex. we need to develop theorems that will make our task easier. (15 hrs. Note that R(0) = R(0/1) = 1. Thus.

|xy − m| = |x(y − m) + m(x − )| ≤ |x||y − m| + |m||x − | | |+1 |m| < + < . lim( f + g) = + m a and 1 1 = . 2(| | + 1) 2(|m| + 1) For the third item it is clear that |y − m| < |m|/2 implies that |y| > |m|/2. 2009) (15 hrs. y m |y||m| 2|m|(|m|/2) I (13 hrs. we know that there exists a δ > 0. hence. This δ > 0 is already the smallest of the two δ’s corresponding to each of the functions.Functions 59 Proof : The ﬁrst item is obvious (the triangle inequality). Then |y − m| |m|2 1 1 − = < = . If the two conditions are satisﬁed then |x| = |x − + | < |x − | + | | < | | + 1. Consider the second item. 2010) Theorem 2. δ). y 0. and in particular. a Then. moreover. 0. then lim a Proof : This is an immediate consequence of the above lemma. f lim( f · g) = · m. such that lim f = a and lim g = m.2 (Limits arithmetic) Let f and g be two functions deﬁned in a neighborhood of a. a If. such that8 | f (x) − | < 8 2 and |g(x) − m| < 2 whenever x ∈ B◦ (a. Since. given > 0. .

2(|m| + 1) and whenever x ∈ B◦ (a. δ). x→a which follows by induction once we show that for k = 1. |g(x) − m| < min 1. such that | f (x) − | < min 1. δ). δ). | f (x) − | < min 2 2 It follows that 1 1 − < f (x) whenever x ∈ B◦ (a. 2(| | + 1) whenever x ∈ B◦ (a. lim xk = ak . whenever x ∈ B◦ (a. 9 This is the standard notation to what we write in these notes as f :ξ→ ξ3 + 7ξ5 ξ2 + 1 and lim f = a a3 + 7a5 . a2 + 1 . x→a x2 + 1 a +1 We need to show that for all c ∈ R lim lim c = c. such that | | | |2 . It follows that | f (x)g(x) − m| < Finally. take for example the functions f (x) = 5 + 1/x and g(x) = 6 − 1/x at x = 0. δ). I Comment: It is important to point out that the fact that f + g has a limit at a does not imply that either f or g has a limit at that point.60 It follows that Chapter 2 |( f (x) + g(x)) − ( + m)| < There also exists a δ > 0. Example: What does it take now to show that9 x3 + 7x5 a3 + 7a5 = 2 . a and that for all k ∈ N. whenever x ∈ B◦ (a. there exists a δ > 0. δ).

Functions

61

**Exercise 2.10 Show that
**

lim xn = an ,

x→a

a ∈ R, n ∈ N,

directly from the deﬁnition of the limit, i.e., without using limits arithmetic.

**Exercise 2.11 Let f, g be deﬁned in a punctured neighborhood of a point a.
**

Prove or disprove: If lima f and lima g do not exist, then lima ( f + g) does not exist either. If lima f and lima g do not exist, then lima ( f · g) does not exist either If both lima f and lima ( f + g) exist, then lima g also exists. If both lima f and lima ( f · g) exist, then lima g also exists. If lima f exists and lima ( f + g) does not exist, then lima ( f + g) does no exist. If g is bounded in a punctured neighborhood of a and lima f = 0, then lima ( f · g) = 0. We conclude this section by extending the notion of a limit to that of a one-sided limit.

**Deﬁnition 2.5 Let f : A → B with a ∈ A an interior point. We say that the limit
**

on the right ( that

) of f at a is , if for every > 0, there exists a δ > 0, such whenever x ∈ B◦ (a, δ). +

| f (x) − | < We write,

lim f = . +

a

An analogous deﬁnition is given for the limit on the left (

).

62

Chapter 2

Player A: This is my !.

Player B: No problem. Here is my ".

l

!

l

! "

a

a

Since Player B can ﬁnd a " for every choice of ! made by Player A,it fo!ows tha" the right#limit of f at a is l.

**Exercise 2.12 We denote by f : x → x the lower integer value of x.
**

Draw the graph of this function. Find lim f −

a

and

lim f +

a

for a ∈ R. Prove it based on the deﬁnition of one-sided limits.

**Exercise 2.13 Show that f : (0, ∞) → R, f : x → x satisﬁes
**

lim f =

a

√

√

a,

for all a > 0.

2.4

Limits and order

In this section we will prove a number of properties pertinent to limits and order. First a deﬁnition:

**Deﬁnition 2.6 Let f : A → B and a ∈ A an interior point. We say that f is
**

locally bounded ( U of a, such that the set ) near a if there exists a punctured neighborhood { f (x) : x ∈ U}

Functions is bounded. Equivalently, there exists a δ > 0 such that the set { f (x) : x ∈ B◦ (a, δ)} is bounded.

63

**Comment: As in the previous section, we consider punctured neighborhoods of a
**

point, and we don’t even care if the function is deﬁned at the point.

Proposition 2.1 Let f and g be two functions deﬁned in a punctured neighborhood U of a point a. Suppose that f (x) < g(x) and that lim f =

a

∀x ∈ U, lim g = m.

a

and

Then ≤ m.

Comment: The very fact that f and g have limits at a is an assumption. Comment: Note that even though f (x) < g(x) is a strict inequality, the resulting

inequality of the limits is in a weak sense. To see what this must be the case, consider the example f (x) = |x| and g(x) = 2|x|. Even though f (x) < g(x) in an open neighborhood of 0, lim f = lim g = 0.

0 0

Proof : Suppose, by contradiction, that exists a δ > 0 such that (∀x ∈ B◦ (a, δ)) : (| f (x) − | < In particular, for every such x, f (x) > − = − which is a contradiction.

> g and set and

= 1 ( − m). Thus, there 2

|g(x) − m| < ).

−m −m =m+ = m + > g(x), 2 2 I

64

Chapter 2

**Proposition 2.2 Let f and g be two functions deﬁned in a punctured neighborhood U of a point a. Suppose that lim f =
**

a

and

lim g = m,

a

with < m. Then there exists a δ > 0 such that (∀x ∈ B◦ (a, δ)) : ( f (x) < g(x)).

**Comment: This time both inequalities are strong.
**

Proof : Let = 1 (m − ). There exists a δ > 0 such that 2 (∀x ∈ B◦ (a, δ)) : (| f (x) − | < For every such x, f (x) < + < + + (g(x) − m + ) = g(x). I and |g(x) − m| < ).

**Proposition 2.3 Let f, g, h be deﬁned in a punctured neighborhood U of a. Suppose that (∀x ∈ U) : ( f (x) ≤ g(x) ≤ h(x)), and lim f = lim h = .
**

a a

Then lim g = .

a

Proof : It is given that for every > 0 there exists a δ > 0 such that (∀x ∈ B◦ (a, δ)) : ( − < f (x) and h(x) < + ).

− < f (x) ≤ g(x) ≤ h(x) < + . lim( f g) = 0. then f is locally bounded near a. i.6 Let f and g be deﬁned in a punctured neighborhood of a point a. lim f = a if and only if lim( f − ) = 0.Functions Then for every such x. 2011) Proposition 2. Proof : Immediate from the deﬁnition of the limit. Then. Then.e. I . I Proposition 2. (∀x ∈ B◦ (a. 65 I (17 hrs.5 Let f be deﬁned in a punctured neighborhood of a point a.. a Proof : Very easy.4 Let f be deﬁned in a punctured neighborhood U of a. I Proposition 2. If the limit lim f = a exists. Suppose that lim f = 0 a whereas g is locally bounded near a. a Proof : Very easy. δ)) : (|g(x) − | < ).

a Then lim( f g) = m. product) Let f. a Proof : Write fg − m = (f − ) g − (g − m) . if it has a limit at a. Hence. ) Examples: We saw that the function f (x) = x2 has a limit at 3. Proposition 2. a Comment: If f has a limit at a and the limit diﬀers from f (a) (or that f (a) is undeﬁned) . then we say that f has a removable discontinuity ( at a. here is another way of proving the product property of the arithmetic of limits (this time without and δ).7 (Arithmetic of limits. and lim f = a and lim g = m. . bdd limit zero I (17 hrs.7 A function f : A → R is said to be continuous at an inner point a ∈ A. and that this limit was equal 9. and lim f = f (a). f is continuous at x = 3.5 Continuity Deﬁnition 2.66 Chapter 2 Comment: Note that we do not require g to have a limit at a. With the above tools. limit zero limit zero limit zero loc. g be deﬁned in a punctured neighborhood of a point a. 2010) 2.

Continuity is deﬁned as that for every > 0 there exists a δ > 0 such that | f (x) − f (a)| < whenever x ∈ B◦ (a. Our theorems on limit arithmetic imply right away similar theorems for continuity: Theorem 2. we said that the limit of f at a x ∈ B◦ (a. The function f (x) = x sin 1/x (it was seen in the tutoring session) is continuous at all x 0. because the limit of f at a 0 does not exist. I Comment: Recall that in the deﬁnition of the limit. but if we deﬁne f (0) = 0. To prove that sin is continuous. if f (a) Proof : Obvious. f is continuous at all x 0. Moreover. By a similar calculation we could have shown that it has a limit at any a < 0 and that this limit also equals 1/a. At zero it is not deﬁned. The elementary functions sin and cos are continuous everywhere. δ). δ). Since f is not even deﬁned at x = 0. then it is continuous for all x ∈ R (by the bounded times limit zero argument). . Hence.Functions 67 We saw that the function f (x) = 1/x has a limit at any a > 0 and that this limit equals 1/a. is continuous at x = 0. then it is not continuous at that point. δ). however it is not continuous at any other point.3 If f : A → R and g : A → R are continuous at a ∈ A then f + g and f · g are continuous at a. There was an emphasis on the fact that the point x itself is excluded. is if for every > 0 there exists a δ > 0 such that | f (x) − | < whenever 0 then 1/ f is continuous at a. we need to prove that for every > 0 there exists a δ > 0 such that | sin x − sin a| < whenever x ∈ B◦ (a. The function x x ∈ Q f :x→ 0 x Q. We will take it for granted at the moment.

(19 hrs. δ1 ). Proof : Let that Since g is continuous at a. Do we have the tools to show that it is continuous. δ1 ) Combining the two we get that | f (g(x)) − f (g(a))| < whenever B(a. δ2 ). We don’t yet have a theorem for the composition of continuous functions. There is no need to exclude the point x = a. f ◦ g is continuous at a. Suppose that g is continuous at an inner point a ∈ A. > 0. But what about a function like sin x2 . Since f is continuous at g(a). Usually. we have all the tools to show that a function like. No. and that f is continuous at g(a) ∈ B. 2010) With that. .68 Chapter 2 Note that we may require that this be true whenever x ∈ B(a. Deﬁnition 2. we are interested in continuity on intervals. Then. I whenever x ∈ B(a. we have only dealt with continuity at points. δ).4 Let g : A → B and f : B → C.8 A function f is said to be right-continuous ( lim f = f (a). say. Theorem 2. such | f (y) − f (g(a))| < whenever y ∈ B(g(a). δ2 ). So far. such that |g(x) − g(a)| < δ1 or g(x) ∈ B(g(a). + a ) at a if A similar deﬁnition is given for left-continuity ( ). sin2 x + x2 + x4 sin x f (x) = 1 + sin2 x is continuous everywhere in R. then there also exists a δ2 > 0. then there exists a δ1 > 0. which is an inner point.

It is said to be continuous on a closed interval [a. h : R → R are continuous at a and satisfy g(a) = h(a). then so is | f | (where | f |(x) = | f (x)|).Functions 69 Deﬁnition 2. g).) Suppose that in addition f is continuous at zero. for some constant c for all x ∈ Q. We deﬁne a new function that “glues” g and h at a: g(x) x ≤ a f (x) = h(x) x > a. One should be careful with such interpretations.16 Suppose that g. b] f (x) = h(x) x ∈ [b. Exercise 2. b] if it is continuous on the open interval. Exercise 2. Exercise 2.14 Suppose f : R → R satisﬁes f (a + b) = f (a) + f (b) for all a. c]. and g(b) = h(b). Deﬁne g(x) x ∈ [a. c].17 Here is another “gluing” problem: Suppose that g is continuous in [a. Prove that f is continuous at a.9 A function f is said to be continuous on an open interval (a. )We have already seen that this implies that f (x) = cx. Exercise 2. Prove that f is continuous everywhere. see for example x sin 1 x 0 x f (x) = 0 x = 0. b) if it is continuous at all x ∈ (a. and in addition it is right-continuous at a and left-continuous at b. g) and min( f. Comment: We usually think of continuous function as “well-behaved”. b). b ∈ R. b]. def Prove that if f : A → R and g : A → R are continuous then so are max( f.15 Prove that if f : A → B is continuous in A. c]. . Prove that f is continuous in [a. that h is continuous is [b.

6 Theorems about continuous functions Theorem 2. That is. Then there exists a neighborhood of a in which f (x) > 0. This function has the wonderful property of being continuous at all x Q and discontinuous at all x ∈ Q (this is because its limit is everywhere zero).5 Suppose that f is continuous at a and f (a) > 0. there exists a δ > 0 such that | f (x) − f (a)| < But. where x = p/q assumes that x is rational in reduced form. δ). Proof : Since f is continuous at a. | f (x) − f (a)| < f (a) 2 f (a) 2 whenever x ∈ B(a. a one-sided version can be proved. whereby if f is right-continuous at a and f (a) > 0. It took some more time until Vito Volterra proved in 1881 that there can be no function that is continuous on Q and discontinuous on R \ Q. δ). there exists a δ > 0 such that f (x) > 0 whenever x ∈ B(a. δ). Also. then f (x) > 0 whenever x ∈ B+ (a. 2 . Comment: Note that continuity is only required at the point a. Comment: An analogous theorem holds if f (a) < 0. 2. implies f (a) − f (x) ≤ | f (a) − f (x)| < f (a) .70 Chapter 2 Example: Here is another “crazy function” due to Johannes Karl Thomae (18401921): 1/q r(x) = 0 x = p/q x Q.

2 Comment: If f (a) > 0 and f (b) < 0 the same conclusion holds for just replace f by (− f ). 2011) Comment: Continuity is required on the whole intervals. x]}. b). b]. Proof : Consider the set A = {x ∈ [a. there exists a point c ∈ (a.6 (Intermediate value theorem ) Suppose f is continuous on an interval [a. such that f (c) = 0. b] : f (y) < 0 for all y ∈ [a.e. (19 hrs. 2010) Theorem 2. f (x) > f (a)/2 > 0. 1] → R: −1 f (x) = +1 1 0≤x< 2 1 ≤ x ≤ 1.Functions i. Then. 2009) (20 hrs. for consider f : [0. with f (a) < 0 and f (b) > 0. .. 71 3/2 f!a" f!a" 1/2 f!a" a ! I This theorem can be viewed as a lemma for the following important theorem: (15 hrs.

This set is also bounded by b. By the previous theorem. The previous theorem guarantees even the existence of a δ > 0 such that b − δ is an upper bound for A. such that f (a) < α < f (b). i. i. c + δ]. f!b" A a f!a" c b Suppose it were true that f (c) < 0.72 Chapter 2 The set is non empty for it contains a. b) such that c = sup A. a + δ) ⊆ A. Actually. the previous theorems guarantees the existence of a δ > 0 such that [a. which implies that c − δ is an upper bound for A. there exists a number c ∈ (a.e. By the previous theorem. b]. .e. By the axiom of completeness.. I Corollary 2. c cannot be the least upper bound of A. we conclude that f (c) = 0. f (x) < 0 for all x ∈ [a. b) at which f (c) = α.. contradicting the fact that c is an upper bound for A. there exists a δ > 0 such that f (x) > 0 whenever c − δ ≤ x ≤ c.1 If f is continuous on a closed interval [a. By the trichotomy property. Suppose it were true that f (c) > 0. Because c is the least upper bound of a it follows that also f (x) < 0 whenever x < c. there exists a δ > 0 such that f (x) < 0 whenever c ≤ x ≤ c + δ. then there exists a point c ∈ (a. which means that c + δ ∈ A.

. Show that there exists a point c ∈ [a. . b] → R are continuous such that f (a) < g(a) and f (b) > g(b). a + δ). n bound on the interval (a − δ. b] such that f (c) = g(c). (ii) A ﬂy ﬂies from Tel-Aviv to Or Akiva (along the shortest path). 1 ]. What can be said about f ? 73 I Exercise 2. 1 − 1 ]. The following day are makes his way back along the very same path.2 If f is continuous at a.) (ii) Show that there exists an x0 ∈ [0. He starts its way at 10:00 and ﬁnishes at 16:00. i.e. continuity is needed on the whole interval. b].18 Suppose that f is continuous on [a. b] with f (x) ∈ Q for all Exercise 2.20 (i) Let f : [0. δ). I Theorem 2. f (x) < f (a) + 1 on this interval. b]. b] then it is bounded from above of that interval. for look at the “counter example” f : [−1. such that f (x) < M for all x ∈ [a. such that f (x0 ) = f (x0 + 1 ). Lemma 2. Show that there exists an x0 ∈ [0. for by the deﬁnition of continuity. there exists a δ > 0. that is. Exercise 2.19 (i) Suppose that f. g : [a. 1) whenever x ∈ B(a. then there is a δ > 0 such that f has an upper Proof : This is obvious. Comment: Here too. 1] → R. there exists a number M. Show that there will be a point along the way that he will cross a the exact same time on both days. 1] → R be continuous.7 (Karl Theodor Wilhelm Weierstraß) If f is continuous on [a.Functions Proof : Apply the previous theorem for g(x) = f (x) − α. such that n f (x0 ) = f (x0 + 1 ). (Hint: consider 2 2 1 g(x) = f (x + 2 ) − f (x). f (0) = f (1). 1/x f (x) = 0 x 0 x = 0. starting again at 10:00 and ﬁnishing at 16:00. such that f (x) ∈ B( f (a). x ∈ [a.

b] be closed. b]. x]}. there exists a neighborhood of c in which f is upper bounded. I Comment: In essence.8 (Weierstraß. then there exists a point c ∈ [a. then it is bounded up to a little farther. 1] → R. b] : f is bounded from above on [a. The function f : (0. i. 2009) (22 hrs. By the previous lemma. To be able to take such increments up to b we need the axiom of completeness.e. the set A = { f (x) : a ≤ x ≤ b} . b] such that f (x) ≤ f (c) for all x ∈ [a. f : x → 1/x is continuous on the semi-open interval. We claim that c = b. b]. for if c < b.. 2011) Theorem 2. Proof : Let A = {x ∈ [a.74 Chapter 2 Comment: It is crucial that the interval [a. and c cannot be an upper bound for A. such that c = sup A. ) If f is con- Comment: Of course. (17 hrs. there is a corresponding minimum principle. hence there exists a c ∈ (a. b]. Also A is upper bounded by b. this theorem is based on the fact that if a continuous function is bounded up to a point. Proof : We have just proved that f is upper bounded on [a. 2010) (20 hrs. there exists a δ > 0 such that a + δ ∈ A. then by the previous lemma. b]. Maximum principle tinuous on [a. but it is not bounded from above.

. 1) → R and f : [0. Prove that for every 0 < c < M. x2 f (x) = 0 x<1 . for every M > 0 there exists a point in [a. b] (since we assumed that the denominator does not vanish). 1]. 1 = M. We will show that g is not upper bounded on [a. We deﬁne then a new function g : [a. for which f (c) = α. I g(y) > Comment: Here too we needed continuity on a closed interval. Indeed.21 Let f : [a. the set {x ∈ [a. that there exists a point c ∈ [a. b].e. . 1] → R f (x) = x2 .. b] → R. b] at which f takes a value greater than M. b] : f (x) = c} has at least two elements. This set is non-empty for it contains the point f (a). b]. We need to show that this supremum is in fact a maximum. since α = sup A: For all M > 0 there exists a y ∈ [a. i. by contradiction.Functions 75 is upper bounded. b] → R be continuous. For this y. which we denote by α = sup A. α− f This function is deﬁned everywhere on [a. contradicting thus the previous theorem.e. Exercise 2. α − (α − 1/M) i. b] such that f (y) > α − 1/M. Suppose. it is continuous and positive. that this were not the case. x=1 These functions do not attain a maximum in [0. g= 1 . with f (a) = f (b) = 0 and M = sup{ f (x) : a ≤ x ≤ b} > 0. that f (x) < α for all x ∈ [a. b]. By the axiom of completeness it has a least upper bound. for consider the “counter examples” f : [0.

and that for every > 0 there exists an x ∈ [a. b in a way that works for all choices of a0 . f (x) a0 an−1 + ··· + n =1+ n x x x |an−1 | |a0 | (u + v ≥ u − |v|) ≥1− − ··· − n |x| |x | |an−1 | |a0 | (|xn | > |x|) ≥1− − ··· − |x| |x| |a0 | |an−1 | (|x| > |ai |) ≥1− − ··· − 2n|an−1 | 2n|a0 | 1 =1−n· > 0. b] such that for every [a.9 If n ∈ N is odd. Let M = max(1. a1 x + a0 = 0 has a root (a solution) for any set of constants a0 . .76 Chapter 2 Exercise 2. 2009) The next theorem deals with the case where n is even: . based on the fact that polynomials are continuous. b] x x0 . . Proof : The idea it to show that existence of points a. Prove or disprove each of the following statements: It is possible that f (x) > 0 for all x ∈ [a. b]. . . . I (18 hrs. . an−1 . . . b for which f (b) > 0 and f (a) < 0. Theorem 2. it follows that f (x) is positive for x ≥ M and negative for x ≤ −M. then the equation f (x) = xn + an−1 xn−1 + . . Then for |x| > M. f (x) < f (x0 ). . . b] for which f (x) < . hence there exists a a < c < b such that f (c) = 0. and apply the intermediate value theorem. .22 Let f : [a. The only technical issue is to ﬁnd such points a. an−1 . . 2n Since the sign of xn is the same as the sign of x. 2n|an−1 |). . 2n|a0 |. There exists an x0 ∈ [a. b] → R be continuous.

Functions 77 Theorem 2. f (x) = xn 1 + ≥ ≥ ≥ = an−1 a0 + ··· + n x x |a0 | |an−1 | − ··· − n xn 1 − |x| |x | |a0 | |an−1 | xn 1 − − ··· − |x| |x| |an−1 | |a0 | − ··· − xn 1 − 2n|an−1 | 2n|a0 | 1 n x. 2 Let then b > max(M. b]. Then. using the fact that n is even. a1 x + a0 = α with n even. n 2| f (0)|). Then there exists a number m such that this equation has a solution for all α ≥ m but has no solution for α < m. b]. . It remains to ﬁnd such a b. f (c) ≤ f (0). It follows that f (c) ≤ f (x) for all x ∈ R. b].e. we know that it assumes a minimum in this interval.. Let M be deﬁned as in the previous theorem.10 Let n be even. I Corollary 2. . from which follows that f (x) > f (0) for |x| > b.2 Consider the equation f (x) = xn + an−1 xn−1 + . . Proof : The idea is simple. such that f (c) ≤ f (x) for all x ∈ R. that there exists a c ∈ [−b. for all |x| > M. . Namely. b] such that f (c) ≤ f (x) for all x ∈ [−b. Then. which in turn is less than f (x) for all x [−b. This concludes the proof. i. looking at the function f in the interval [−b. a1 x + a0 = 0 has a minimum. We are going to show that there exists a b > 0 for which f (x) > f (0) for all |x| ≥ b. there exists a c ∈ R. In particular. Then the function f (x) = xn + an−1 xn−1 + . .

such that f (x) < M whenever 0 < |x − a| < δ. c is a solution. a whenever 0 < |x − a| < δ. f (x) > α. by the intermediate value theorem a root exists. and (ii) the limit of a function at inﬁnity. If α < f (c) then for all x. We say that the limit of f at a is inﬁnity.7 Inﬁnite limits and limits at inﬁnity In this section we extend the notions of limits discussed in previous sections to two cases: (i) the limit of a function at a point is inﬁnite. . f (x) ≥ f (c) > α. Before we start recall: inﬁnity is not a real number!.e. hence. then 0 < |x| < δ implies f (x) > M. For α > f (c) we have f (c) < α and for large enough x.. lim f = −∞. Indeed given M we take δ = 1/M. denoted lim f = ∞. Deﬁnition 2.78 Chapter 2 Proof : According to the previous theorem there exists a c ∈ R such that f (c) is the minimum of f . a if for every M there exists a δ > 0. there is no solution.10 Let f : A → B be a function. such that f (x) > M Similarly. The limit of f at zero is inﬁnity. Example: Consider the function 1/|x| f :x→ 17 x 0 x = 0. i. I 2. if for every M there exists a δ > 0. For α = f (c).

since given > 0 we take M = 1/ x>M . such that | f (x) − | < Similarly.25 Prove that x→∞ lim f (x) exists if and only if x→0+ lim f (1/x) exist. Exercise 2. x→∞ lim x2 + 2x − x . such that | f (x) − | < whenever x < M. . x−3 √ Exercise 2. ∞). We say that lim f = . if we adopt the idea that “neighborhoods of inﬁnity” are sets of the form (M. if for every > 0 there exists an M. Exercise 2.23 Prove directly from the deﬁnition that x→3 lim+ x3 + 2x + 1 = ∞. Comment: These deﬁnitions are in full agreement with all previous deﬁnitions of limits. and implies | f (x) − 3| < .Functions 79 Deﬁnition 2.24 Calculate the following limit using limit arithmetic. Example: Consider the function f : x → 3 + 1/x2 . −∞ whenever x > M. ∞ if for every > 0 there exists an M.11 Let f : R → R. in which case they are equal. Then the limit of f at inﬁnity √ is 3. lim f = .

Show that lima ( f /g) = 0. with lima f = lima g = −∞. Show that lima ( f · g) = ∞.26 Let g be deﬁned in a punctured neighborhood of a ∈ R. Show that lima 1/g = 0. g be deﬁned in a punctured neighborhood of a ∈ R. g be deﬁned in a punctured neighborhood of a ∈ R. Let f. . This means that for every b ∈ B there exists a unique a ∈ A. and denote it by f −1 (not to −1 be mistaken with 1/ f ). this function is also one-to-one and onto. (24 hrs. In fact.80 Chapter 2 Figure 2. This property deﬁnes a function from B to A. We call this function the function inverse to f ( ¯ ). with lima g = ∞. Thus.1: Inﬁnite limit and limit at inﬁnity. 2010) 2. Exercise 2.27 Let f. with lima f = K < 0 and lima g = −∞. f −1 (y) = {x ∈ A : f (x) = y}. Exercise 2.8 Inverse functions Suppose that f : A → B is one-to-one and onto. f : B → A. such that f (a) = b.

Example: Let f : [0. 1] → [0. or Graph( f −1 ) = {(b. However. and not a real number. 1] by f −1 (y) = {x ∈ [0. 1] → R be given by f : x → x2 . The inverse function can be deﬁned as a subset of B × A: it is all the pairs (b. but it is not monotonic. This function is one-toone and its image is the segment [0. Strictly speaking.. or monotonically decreasing. a) : f (a) = b}. Thus we can deﬁne an inverse function f −1 : [0. 5). 2011) Theorem 2. 1] : x2 = y}. The function f : [0. x f (x) = 6 − x 0≤x≤1 1<x≤2 is one-to-one with image [0. 2] → R. a) for which (a.e. 10 . b) ∈ Graph( f ). b) : f (a) = b}. (22 hrs. Graph( f ) = {(a. then f is either monotonically increasing. the one-to-one and onto properties of f guarantee that it contains one and only one number.11 If f : I → R is continuous and one-to-one (I is an interval). i. 1] ∪ [4. Comment: Recall that a function f : A → B can be deﬁned as a subset of A × B. hence we identify this singleton ( ) with the unique element it contains. the right-hand side is a set of real numbers. 1]. it is the (positive) square root.Functions 10 81 Note also that f −1 ◦ f is the identity in A whereas f ◦ f −1 is the identity in B. Comment: Continuity is crucial.

By monotonicity. so is any point in this interval. b] → R is one-to-one and monotonically increasing (without loss of generality). contradicting the fact that f is one-to-one. Suppose that f (a) < f (c) and that. and by the intermediate value theorem. b. It follows as once that for every four points a < b < c < d. f (b) < f (a) (see Figure 2. its image is the interval [ f (a). f (a) < f (c) implies that f (a) < f (b) < f (c). Similarly. y (whatever their order is) and conclude that f (x) < f (y). Then. by contradiction.l. either f (a) < f (b) < f (c). either f (a) < f (b) < f (c) < f (d) or f (a) > f (b) > f (c) > f (d). for every x.o. By the intermediate value theorem. there exists a point x ∈ (b. if f (b) > f (c). We proceed similarly if f (a) > f (c). f (a) and f (b) are in the range. or f (a) > f (b) > f (c). y such that x < y. Indeed. we apply the above arguments to the four points a. Fix now a and b.2: Illustration of proof Proof : We ﬁrst prove that for every three points a < b < c. no point outside this interval . x.2). then there exists a point y ∈ [a. Then. f (b)]. Thus. and suppose w. c) such that f (x) = f (a). I Suppose that the function f : [a.g that f (a) < f (b) (they can’t be equal). b] such that f (y) = f (c).82 Chapter 2 f!c" f!a" f!b" a b c Figure 2.

let us assume that f is monotonically increasing. so is f −1 . en passant. such that | f −1 (y) − f −1 (y0 )| < whenever |y − y0 | < δ. By monotonicity. lim f −1 = f −1 (y0 ). that continuous one-to-one functions map closed segments into closed segments11 . i. y0 − f ( f −1 (y0 ) − )). Every y0 − δ < y < y0 + δ is in the interval (y1 . f ( f −1 (y0 )− ) = y0 −[y0 − f ( f −1 (y0 )− )] < y < y0 +[ f ( f −1 (y0 )+ )−y0 ] = f ( f −1 (y0 )+ ).Functions 83 can be in the range of f . We have proved. More generally. Then. Theorem 2. Let δ = min( f ( f −1 (y0 ) + ) − y0 . y2 ). and therefore. every such y0 is equal to f (x0 ) for a unique x0 . We can do it more formally. y0 Equivalently. f (b)] → [a. x<y implies f (x) < f (y). Let y1 = f (x0 − ) and y2 = f (x0 + ).. y1 < y0 < y2 . f (b)). It follows that the inverse function has range and image f −1 : [ f (a). a continuous one-to-one function maps connected sets into connected sets. we need to show that for every > 0 there exists a δ > 0. y2 − y0 ). Now. f −1 (y0 ) − < f −1 (y) < f −1 (y0 ) + . it is clear how to set δ.e. and retains the open/closed properties. . We need to show that for every y0 ∈ ( f (a). 11 y0 − δ < y < y0 + δ. b]. b] → R is continuous and one-to-one then so is f −1 .12 If f : [a. We let then δ = min(y0 − y1 . |y − y0 | < δ implies or. Then. Proof : Without loss of generality. Since f (and f −1 ) are both monotonically increasing.

b).e. Let us re-examine two examples: . That is. 2010) 2. such that | f (y) − f (x)| < whenever |y − x| < δ and y ∈ (a. if it is continuous at every point in the interval. 2009) (26 hrs. b) and every > 0.9 Uniform continuity Recall the deﬁnition of a continuous function: a function f : (a.84 Chapter 2 Since f −1 is monotonically increasing. In fact. it includes one-sided continuity as well). b). b) → R is continuous on the interval (a. this deﬁnition applies for continuity on a closed interval as well (i. for every x ∈ (a. the way we set it here. there exists a δ < 0. In general. or | f −1 (y0 ) − f −1 (y)| < . f!x0 +!" f!x0 "=y0 f!x0 #!" ! " f#1!y0"=x0 I (20 hrs. we may apply f −1 on all terms. giving f −1 (y0 ) − < f −1 (y) < f −1 (y0 ) + . we expect δ to depend both on x and on .

Note that x and y play here symmetric roles12 . xy x[x + (y − x)] If we take δ( . This function is continuous on (0. Thus. Example: Consider next the function f : (0. In contrast. then the same δ ﬁts all points x. even if we took care of its value at zero. 1) → R. These two examples motivate the following deﬁnition: Deﬁnition 2. if we choose δ = δ( . 1) → R. 1). | f (y) − f (x)| = |y2 − x2 | = |y − x||y + x| ≤ (x + 1)|y − x|. There is no way we can ﬁnd a δ = δ( ) that would ﬁt all x. Why? Given x ∈ (0. 1). We have already seen examples where continuity on a 12 The term “uniform” means that the same number can be used for all points. 1). x) = min(x/2. x) and y ∈ (0. it is always legitimate to replace δ by a smaller number. Why? Because for every x ∈ (0. the smaller is δ( . f : x → 1/x. given x and > 0. there is no way we could have deﬁned the function x → 1/x as a continuous function on [0. 1].12 f : A → B is said to be uniformly continuous on A ( ) if for every > 0 corresponds a δ = δ( ) > 0. . 1). the closer x is to zero. x(x − δ) Here.Functions 85 Example: Consider the function f : (0. x). However. x) = /(x + 1). given . 1) and y ∈ (0. and every y ∈ (0. This function is also continuous on (0. y ∈ A. Thus. If we take δ = /2. | f (x) − f (y)| = |x − y| |x − y| = . 1). then | f (y) − f (x)| < whenever |x − y| < δ( . δ depends both on and x. 1] and it would have been continuous there too. x2 /2). What is the essential diﬀerence between the two above examples? The function x → x2 could have been deﬁned on the closed interval [0. 1). such that whenever |y − x| < δ and | f (y) − f (x)| < x. then |x − y| < δ implies | f (x) − f (y)| < δ < . f : x → x2 .

I (24 hrs. b] and y ∈ [b. δ2 . We will prove that continuity on a closed interval implies uniform continuity. it follows that |x − b| < δ3 and |y − b| < δ3 and we use (2. |x − b| < δ3 and |y − b| < δ3 implies | f (x) − f (y)| < . We claim that | f (x) − f (y)| < whenever |x − y| < δ and x. then there exists a δ3 > 0 such that | f (x) − f (b)| < Hence. in which case it follows from the fact that δ ≤ δ1 . Since the function is continuous at b. We proceed by steps: Lemma 2. Remains the third case. y ∈ [a. y ∈ [a. such that x. where. (2. Proof : We have to take care at the “gluing” at the point b.3 Let f : [a. δ2 > 0. Why is that true? One of three: either x. y ∈ [a. in which case it follows from the fact that δ ≤ δ2 . c]. This is also the case here. c]. c].1). Since |x − y| < δ3 . Then there exists a δ > 0 such that x. y ∈ [b. c] and |x − y| < δ implies | f (x) − f (y)| < . b]. or x. 2011) . x ∈ [a. c] and and |x − y| < δ1 |x − y| < δ2 implies implies | f (x) − f (y)| < | f (x) − f (y)| < . c] → R be continuous. Let a < b < c. y ∈ [b.1) 2 whenever |x − b| < δ3 . b] x. without loss of generality.86 Chapter 2 closed interval had strong implications (ensures boundedness and the existence of a maximum). y ∈ [a. and suppose that for a given > 0 there exist δ1 . Take now δ = min(δ1 . δ3 ).

Functions 87 Theorem 2. (27 hrs. if for that particular there exists a corresponding δ). α−δ0 /2]∩[a. b] for all > 0. x]}. b] for all > 0. > 0. b]. b]. then there exists a δ0 > 0 such that | f (y) − f (α)| < Thus. The set A is non-empty. contradicting the fact that α is an upper bound for A.. I Comment: In this course we will make use of uniform continuity only once. z ∈ B(α. b] and hence in the closed interval [α − δ0 /2. b] (i. 2010) Examples: . By the deﬁnition of α. b] then it is uniformly continuous on that interval. δ0 ). when we study integration. Thus f is -good on [a. from which follows that f is -good in (α−δ0 . Suppose that α < b. f is -good on [a. We need to show that f is -good on [a. b] and on [b. Given . b]. 2 whenever |y − α| < δ0 and y ∈ [a. Since this argument holds independently of we conclude that f is -good in [a. c].3 that if f is -good on [a. | f (y) − f (z)| ≤ | f (y) − f (α)| + | f (z) − f (α)| < . α+δ0 )∩[a. Since f is continuous at α. we will say that f is -good on [a. α+δ0 /2]∩[a. |x − y| < δ Proof : Given such that | f (x) − f (y)| < whenever and x. as it includes the point a (trivial) and it is bounded by b. then it is -good on [a. c]. ∀y. deﬁne A = {x ∈ [a. y ∈ [a. b] if there exists a δ > 0. b]. α + δ0 /2] ∩ [a. it is also -good on [a.13 If f is continuous on [a. and by the previous lemma. Hence it has a supremum. b] : f is -good on [a. which we denote by α. Note that we proved in Lemma 2.e. b].

If on any interval [n. the product of uniformly continuous functions is not necessarily uniformly continuous. Finally the function f (x) = sin x2 is continuous and bounded on R.30 Let f : (0.28 Let f : R → R be continuous. then f is uniformly continuous on R.88 Chapter 2 Consider the function f (x) = sin(1/x) deﬁned on (0. n + 1]. ∞). (22 hrs. 1). an open segment. (Hint: read the proof about the limit of a product of functions being the product of their limits. g : I → R be uniformly continuous on I. Prove that f · g is uniformly continuous on I. ∞ −∞ then f attains a maximum value. It is easy to see that it is uniformly continuous. Prove that f is uniformly continuous on [a. Is in the above case f · g necessarily uniformly continuous? Exercise 2.. Thus.) . n ∈ Z. Consider the function Id : R → R. The function Id · Id. Prove or disprove: If lim f = lim f = 0. that is f is on R. ∞) → R be given by f : x → 1/ x: Let a > 0. where α. then f assumes all real values. Prove that if f. but not uniformly continuous. 2009) Exercise 2. g are uniformly continuous on A then α f + βg is also uniformly continuous on A. it is not uniformly continuous. Exercise 2. is continuous on R.29 Suppose A ⊆ R is non-empty. f is uniformly continuous. an open ray. however.31 Let I be an open connected domain (i. or the whole line) and let f. Even though it is continuous. but it is not uniformly continuous. ∞).e. If lim∞ f = ∞ and lim−∞ f = −∞. β ∈ R. Show that f is not uniformly continuous on (0. √ Exercise 2.

Exercise 2.Functions 89 Exercise 2. Prove that if f is H¨ lder continuous of order α > 1. ∞). Prove that if f is H¨ lder continuous of order α on A then it is uniformly o continuous on A. y ∈ A.35 Let α > 0. Show that f (x) = sin(1/x) is not uniformly continuous on (0. a). We say that f : A → B is H¨ lder continuous of o order α if there exists a constant M > 0.32 Let f : A → B and g : B → C be uniformly continuous. Exercise 2. ∀x.34 For which values of α > 0 is the function f (x) = xα uniformly continuous on [0. √ o Show that f (x) = x is H¨ lder continuous of order 1/2 on (0. ∞)? Exercise 2. then it is a constant o function.33 Let a > 0. such that | f (x) − f (y)| < M|x − y|α . Show that g ◦ f is uniformly continuous. .

90 Chapter 2 .

This means that (∀ > 0)(∃δ > 0) : (∀x ∈ B◦ (a. We claim that lim g = lim f. we introduced the limit of a function f : A → B at an (interior) point a. Derivatives were historically introduced in order to answer the need of measuring the “rate of change” of a function. Suppose that the right hand side equals . one technical clariﬁcation. lim f. a Consider the function g deﬁned by g(x) = f (a + x). zero is an interior point of this set. 0 a provided. that the right hand side exists. It is a good exercise to prove it formally. We will actually adopt this approach as a prelude to the formal deﬁnition of the derivative. δ))(| f (x) − | < ). of course. In particular.1 Deﬁnition In this chapter we deﬁne and study the notion of diﬀerentiable functions.Chapter 3 Derivatives 3. In the previous chapter. Before we start. on the range. {x : a + x ∈ A}. .

if we consider the graph of f . x→0 a although this in not in agreement with our notations. and by the deﬁnition of g. For example. Then f (x) − f (a) is the displacement ( ) between time a and time x. δ))(|g(y) − | < ). First. f ◦ (x → a + x). f (a + x) stands for the composite function. The mean velocity. then the mean rate of change is the slope of the secant line ( ) that intersects the graph at the points a and x (see Figure 3. Often. δ))(| f (a + y) − | < ). and let a and x be two points inside this interval. Let now f be a function deﬁned on some interval I. (∀ > 0)(∃δ > 0) : (∀y ∈ B◦ (0. it has a meaning in many physical situations. but there is nothing to guarantee that within this time interval. we could intersect this time interval into two equal sub-intervals. is an average quantity between two instants. x−a This quantity can be attributed with a number of interpretations. the body was in a “ﬁxed state”. Thus. For example. in meters relative to the origin) as function of time (say. We deﬁne the mean rate of change ( ) of f between the points a and x to be the ratio f (x) − f (a) . one does not bother to deﬁne the function g and simply writes lim f (a + x) = lim f. i. Chapter 3 (∀ > 0)(∃δ > 0) : (∀y ∈ B◦ (0. This physical example is a good preliminary toward the deﬁnition of the derivative. f (a) is the distance from the origin in meters a seconds after the time origin. The variation in the value of f between these two points is f (x) − f (a). and [ f (x) − f (a)]/(x − a) is the mean displacement per unit time.. or mean rate of displacement.92 Setting x − a = y. Second.1). or the mean velocity.e. lim0 g = . and measure the mean velocity in each half. and f (x) is the distance from the origin in meters x seconds after the time origin. in seconds relative to an origin of time). Nothing guarantees that these mean velocities will equal the mean velocity over . f can be the position along a line (say.

a = f (a).a can be deﬁned on the same domain as f . f (a) is nothing but a notation. Physicists aimed to deﬁne an “instantaneous velocity”. Fixing the point a.a has a limit at a.a : x → x−a The function ∆ f. x−a At this stage. f (a) = lim x→a f (x) − f (a) .Derivatives 93 f!a+h" f!a" ! a a+h Figure 3. Of course. a We call the limit f (a) the derivative ( ) of f at a. the whole interval. but excluding the point a. we deﬁne a function f (x) − f (a) . We say that f is diﬀerentiable at a ( ) if ∆ f.1: Relation between the mean rate of change of a function and the slope of the corresponding secant. having h small (what does small mean?) does not change the average nature of the measured velocity. It is not (yet!) a function evaluated . ∆ f. Comment: In the more traditional notation. We write lim ∆ f. An instantaneous rate of change can be deﬁned by using limits. and the way to do it was to make the length of the interval very small.

we could have computed the derivative at any point. is df (a).x . f : x → c.e. f (x) = lim y→x f (y) − f (x) .x = 2x. . Another standard notation. we can deﬁne a function that given a point x.94 Chapter 3 at the point a. We construct the function ∆ f.a (y) = f (y) − f (a) c − c = = 0. as the function. these are only hand-waving arguments. x y2 − x2 = x+y y−x or ∆ f. y−x Example: Consider the function f : x → x2 . Example: Consider the constant function. x Again. We are going to calculate its derivative at the point a. due to Leibniz. In the above example. f : U → R.x (y) = so that f (x) = lim ∆ f. the derivative has a simple geometrical interpretation.. hence f (a) = 0. We calculate its derivative at a point x by ﬁrst observing that ∆ f. For us. dx Having identiﬁed the mean rate of change as the slope of the secant line. In other words. As x tends to a. returns f (x). the corresponding family of secants tends to a line which is tangent to f at the point a. the more traditional notation is. returns the derivative of f at that point. A function f : A → B is called diﬀerentiable in a subset U ⊆ A of its domain if it has a derivative at every point x ∈ U. f (x) = lim ∆ f. We then deﬁne the derivative function. f : R → R.x = Id +x. i. y−a y−a The limit of this function at a is clearly zero. as we have never assigned any meaning to the limit of a family of lines.

x (y) = 1.x at x > 0 we may well assume that y > 0. 2009) (25 hrs.x (y) = so that f (x) = lim ∆ f. ∆ f. in which case ∆ f. x For negative x. ∆ f. Thus f is diﬀerentiable everywhere except for the origin1 . This motivates the following deﬁnitions: We use here a general principle. but one-sided limits do exist. 2011) Example: Consider now the function f : x → |x|. 1 1 2. For x > 0. |x| = −x and we may consider y < 0 as well. Let’s ﬁrst calculate its derivative at a point x > 0. such that . whereby lima f does not exist if there exist every neighborhood of a has points x.0 (y) = |y| − |0| = sgn(y). Hence f (x) = lim ∆ f. x (−y) − (−x) = −1.x = −1. 2010) One-sided diﬀerentiability This last example shows a case where a limit does not exist.x = 1.x (y) = |y| − x . y−a Remains the point 0 itself.Derivatives 95 (23 hrs. so that ∆ f. y. y−0 The limit at zero does not exist since every neighborhood of zero has points where this function equals one and points where this function equals minus one. (29 hrs. y−x Since we are interested in the limit of ∆ f. such that f (x) = 1 and f (y) = 2 .

0 does not have a limit at 0. f (y) − f (0) 1 = y sin . ) by f (a+ ). because every neighborhood of zero has both points where ∆ f.96 Chapter 3 ) Deﬁnition 3.1 If f is diﬀerentiable at a then it is continuous at a.0 (y) = f (y) − f (0) 1 = sin .0 = 0. We have already seen that that this function is continuous at zero. consider the function x2 sin 1 x f :x→ 0 We construct ∆ f. We denote the right-hand derivative ( deﬁnition holds for left-hand derivatives. A similar Example: Here is one more example that practices the calculation of the derivative directly from the deﬁnition. but is it diﬀerentiable at zero? We construct the function ∆ f.0 = 1 and points where ∆ f. Example: In contrast. Diﬀerentiability and continuity The following theorem shows that diﬀerentiable functions are a subclass (“better behaved”) of the continuous functions. . Consider ﬁrst the function x sin 1 x 0 x f :x→ 0 x = 0.1 A function f is said to be diﬀerentiable on the right ( at a if the one-sided limit lim ∆ f.0 (y) = x 0 x = 0. hence f is diﬀerentiable at zero.0 = −1. Theorem 3. y−0 y The function ∆ f.a + a exists. y−0 y Now lim0 ∆ f.

For derivatives higher than the third. use the fact that a function diﬀerentiable at a point is also continuous at that point. f = f (a) + (Id −a) ∆ f. in which case we may deﬁne the third derivative f . By deﬁnition f (x) = lim ∆ f . we have a functional identity. f . f (k) . or on a subset of A. then | f | is also diﬀerentiable at that point. is deﬁned recursively.) Exercise 3.Derivatives Proof : Note that for y a.) Likewise.x . or the second derivative ( ).x . or not. we can construct a new function—its derivative. for example. x (Note that this is a limit of limits. Then.2 Let I be a connected domain. By limit arithmetic. Higher order derivatives If a function f is diﬀerentiable on an interval A. The k-th derivative. the second derivative may be differentiable. The derivative f can have various properties. we also set f (0) = f . lim f = f (a) + lim(Id −a) lim ∆ f. a a a I Exercise 3.a = f (a) + 0 · f (a) = f (a). and in particular. f (k) (x) = lim ∆ f (k−1) . Suppose furthermore that | f (x)| < M for all x ∈ I. it may be diﬀerentiable on A. we can deﬁne a new function— the derivative of the derivative. f (y) = f (a) + (y − a) · f (y) − f (a) . Show that f is uniformly continuous on I. which we denote by f . (y − a) 97 Viewing f (a) as a constant. the notation f (4) rather than f . and suppose that f is diﬀerentiable at every interior point of I and continuous on I. (Hint. For example. it is customary to use. it may be continuous. that if f is differentiable at a ∈ R with f (x) 0. x For the recursion to hold from k = 1.a valid in a punctured neighborhood of a. and so on. .1 Show. based on the deﬁnition of the derivative.

we develop tools that will enable us to (easily) compute derivatives.a + ∆g.a .4 Let f. we ﬁrst calculated limits by using the deﬁnition of the limit. Proof : Immediate from the deﬁnition. then f = 0 (by this we mean that f : x → 0). I Theorem 3.a = ( f + g) − ( f + g)(a) Id −a f + g − f (a) − g(a) = Id −a = ∆ f. and we proved a number of theorems (arithmetic of limits). without having to go back to the deﬁnitions. I Theorem 3.3 If f = Id then f = 1 (by this we mean that f : x → 1).2 Rules of diﬀerentiation Recall that when we studied limits.2 If f is a constant function. Proof : We consider the function ∆ f +g. After having calculated the derivatives of a small number of functions. and ( f + g) (a) = f (a) + g (a). g be functions deﬁned on the same domain (it suﬃces that the domains have a non-empty intersection). with which we were able to easily calculate a large variety of limits. Theorem 3. The same exactly will happen with derivatives. Proof : We proved this is in the previous section.98 Chapter 3 3. but very soon this became impractical. . If both f and g are diﬀerentiable at a then f + g is diﬀerentiable at a.

Suppose this were true for n = k.a + g(a)∆ f. Id −a Id −a and it remains to apply the arithmetic rules of limits. and ( f g) (a) = f (a) g(a) + f (a) g (a). and by the diﬀerentiation rule for products. g be functions deﬁned on the same domain. . Then.a .e. We can show this inductively.1 The derivative is a linear operator. 99 I Theorem 3.. fn (x) = n xn−1 or equivalently fn = n fn−1 .Derivatives and by the arithmetic of limits. I Example: Let n ∈ N and consider the functions fn : x → xn . ( f + g) (a) = f (a) + g (a). If both f and g are diﬀerentiable at a then f · g is diﬀerentiable at a. i. We know already that this is true for n = 0. fk+1 (x) = k xk−1 · x + xk · 1 = (k + 1) xk .5 (Leibniz rule) Let f. Then. f g − ( f g)(a) f g − f g(a) + f g(a) − f (a)g(a) = = f ∆g. fk+1 = fk · Id. Proof : We consider the function. (α f + βg) = α f + βg . 1. This follows from the derivative of the product.a = Corollary 3. Proof : We only need to verify that if f is diﬀerentiable at a then so is α f and (α f ) = α f . fk+1 = fk · Id + fk · Id . I ∆ f g.

a .5). (31 hrs.100 Chapter 3 Theorem 3. But what about the derivative of x → sin x3 ? Here we need a rule for how to diﬀerentiate compositions. say. as a product of three functions.a = 1 1 1 1/g − (1/g)(a) −1 = − · ∆g. then 1/g is diﬀerentiable at Proof : Since g is continuous at a (it is continuous since it is diﬀerentiable) and it is non-zero. (25 hrs. 2010) Theorem 3. then the function f /g is diﬀerentiable at a and ( f /g) = f g−g f . g2 (a) 0. x → sin3 x. Then we have no problem calculating the derivative of. .6 If g is diﬀerentiable at a and g(a) a and (1/g) (a) = − g (a) .7 If f and g are diﬀerentiable at a and g(a) 0. Suppose we take for granted that sin = cos and cos = − sin . g2 Proof : Apply the last two theorems. For example. In this neighborhood we consider the function ∆1/g. ( f gh) = (( f g)h) = ( f g) h + ( f g)h = ( f g + f g )h = ( f g)h = f gh + f g h + f gh . = Id −a Id −a g g(a) g · g(a) I It remains to take the limit at a and apply the arithmetic laws of limits. then there exists a neighborhood of a in which g does not vanish (by Theorem 2. 2009) I With these theorems in hand. In particular there is no diﬃculty in diﬀerentiating a product of more than two functions. we can diﬀerentiate a large number of functions.

This suggest the following treatment. i. and it only takes a little more subtlety to prove it. the result is correct. The fact that g is diﬀerentiable at f (a) implies that ψ is continuous at f (a). rendering this expression meaningless. y−a y−a and calculate its limit at a. then (g ◦ f ) (a) = g ( f (a)) · f (a). this product tends to g ( f (a)) · f (a). Proof : We introduce the following function. we expect the arguments f (y) and f (a) of g to be very close. (27 hrs. 2011) Theorem 3. ∆g◦ f.a (y) = (g ◦ f )(y) − (g ◦ f )(a) g( f (y)) − g( f (a)) = .Derivatives Consider the composite function g ◦ f . f (y) − f (a) y−a f (y) − f (a) It looks that as y → a. 101 Let’s try to calculate its derivative at a point a.a (y) = g( f (y)) − g( f (a)) f (y) − f (a) g( f (y)) − g( f (a)) · = ∆ f. assuming for the moment that both f and g are diﬀerentiable everywhere. f (a) (z) z f (a) ψ:z→ g ( f (a)) z = f (a).a = (ψ ◦ f ) · ∆ f. deﬁned in a neighborhood of f (a). The problem is that while the limit y → a means that the case y = a is not to be considered.e. Yet. When y is close to a. We next claim that for y a ∆g◦ f. ∆ g. .8 Let f : A → B and g : B → R. If f is diﬀerentiable at a ∈ A. since f (y) − f (a) → 0. and g is diﬀerentiable at f (a). there is nothing to prevent the denominator f (y)− f (a) from vanishing.a (y). We then need to look at the function ∆g◦ f..a . (g ◦ f )(x) = g( f (x)).

. This suggests the following alternative deﬁnition of the derivative: f (x) − f (a) .a = ψ( f (a)) · lim ∆ f. The function ∆ f. and how clever deﬁnitions can sometimes greatly simplify proofs. a a I (32 hrs. x−a a. then this equation reads g( f (y)) − g( f (a)) g( f (y)) − g( f (a)) f (y) − f (a) = · . . and S f.a has a limit at a.x (y) = y=a is continuous at a. 2010) 3. hence not continuous at a. and we denote this limit by f (a). f (y) = f (a) + S f. We could say that f is diﬀerentiable at a if there exists a real number. By limits arithmetic. Consider now the limit of the right hand side at a. y−a f (y) − f (a) y−a which holds indeed. Its purpose is to oﬀer a slightly diﬀerent angle of view on the subject. if the function ∆ f. Our deﬁnition of derivatives states that a function f is diﬀerentiable at an interior point a.a (y)(y − a).102 Why that? If f (y) Chapter 3 f (a). For y S f.a is not deﬁned at a. whereas if f (y) = f (a) it reads. This equation holds also for y = a. f (y) − f (a) g( f (y)) − g( f (a)) = g ( f (a)) · .a S f. y−a y−a and both sides are zero. lim (ψ ◦ f ) · ∆ f.a (a) = is called the derivative of f at a.a (y) = or equivalently.a (y) = ∆ f.3 Another look at derivatives In this short section we provide another (equivalent) characterization of diﬀerentiability and the derivative. such that the function ∆ (y) y a f. but this is a removable discontinuity.a = g ( f (a)) · f (a).

a . Of course.a and Sg. Then f (y) − f (a) = (y + a)(y − a).a (y)(y − a) in some neighborhood of a and f (a) = S f.a + g(a) S f.2 A function f deﬁned in on open neighborhood of a point a is said to be diﬀerentiable at a if there exists a function S f. F = f (a)Sg.a (y)(y − a) g(a) + Sg.a (y)(y − a).a (y)(y − a) g(y) = g(a) + Sg.Derivatives 103 Deﬁnition 3.a (y)(y − a) = f (a)g(a) + f (a)Sg.a (y) + S f. hence f g is diﬀerentiable at a and ( f g) (a) = lim F = f (a)g (a) + g(a) f (a).a Sg.a coincides with ∆ f. such that f (y) = f (a) + S f.a continuous at a.a (a) is called the derivative of f at a and is denoted by f (a).a (a). the function in the brackets. a Leibniz’ rule Let’s now see how this alternative deﬁnition simpliﬁes certain proofs.a (Id −a) is continuous at a. it follows that f is diﬀerentiable at a and f (a) = lim(Id +a) = 2a. S f.a (y) + g(a) S f.a (y)(y − a) (y − a). This implies the existence of two continuous functions. f (y)g(y) = f (a) + S f. S f. A ﬁrst example Take the function f : x → x2 . Since the function Id +a is continuous at a. or f = f (a) + (Id +a)(Id −a).a (y)Sg. such that f (y) = f (a) + S f.a . S f. By limit arithmetic.a in some punctured neighborhood of a.a + S f.a (a) and g (a) = Sg. Suppose for example that both f and g are diﬀerentiable at a. Then.

Deﬁnition 3. We are going to put this practice on solid grounds. A point a ∈ A (not necessarily an internal point) is said to be a maximum point of f in A. 2010) (28 hrs. recall the deﬁnition. nor to exist. Then. f (a) ◦ f · S f. For example. f (a) ( f (y))( f (y) − f (a)) = g( f (a)) + Sg.a (y)(y − a) g(z) = g( f (a)) + Sg. there exist continuous functions. one of the main uses of diﬀerential calculus is to ﬁnd extrema of functions. in a constant function all points are minima and maxima.a and Sg. (g ◦ f )(y) = (g ◦ f )(a) + Sg.a (y)(y − a). Comment: By no means a maximum point has to be unique.4 The derivative and extrema In high-school calculus. S f. f (a) ◦ f ]·S f. Now.3 Let f : A → B. f (a) ( f (y))S f. a (34 hrs. f (a) (z)(z − f (a)). f (a) ( f (y))S f. ∀x ∈ A.a (y) By the properties of continuous functions. f (a) . or. First. such that f (y) = f (a) + S f. and (g ◦ f ) (a) = lim Sg. if f (x) ≤ f (a) We deﬁne similarly a minimum point. 2011) 3. g( f (y)) = g( f (a)) + Sg.104 Chapter 3 Derivative of a composition Suppose now that f is diﬀerentiable at a and g is diﬀerentiable at f (a).a = g ( f (a)) f (a). [Sg.a (y)(y − a). On the other . the function in the square brackets is continuous at a.

since every neighborhood of x has points y where ∆ f.x (y) ≤ 0.x It is given that ∆ f.x (y) = satisﬁes f (y) − f (x) .x has a limit at x (it is f (x)). 1). both. . for y x. but this can’t hold. taking such that 0< 2 < ∆ f. Similarly. the function f : (0. a maximum point. b) is a maximum point of f and f is diﬀerentiable at x. Suppose the limit. We will show that this limit is necessarily zero. ∆ (y) ≤ 0 f. . can’t be negative. b). then f (x) = 0. by assumption. In particular. f (y) − f (x) ≤ 0. Then. ∆ f.x ∆ (y) ≥ 0 f. were positive.9 Let f : (a. f : x → x2 does not have a maximum in (0. b]. then for every y ∈ (a. not necessarily unique). 1) → R. b) → R. Here is a ﬁrst connection between maximum (and minimum) points and derivatives: Theorem 3. there has to be a δ > 0.Derivatives 105 hand. Also. 0 < |y − x| < δ.x (y) < 3 2 whenever = /2. Proof : Since x is. hence = 0. y−x y>x y < x. because at the end points we can only consider one-sided limits. we proved that a continuous function deﬁned on a closed interval always has a maximum (and a minumum. Comment: The open interval cannot be replaced by the closed interval [a. It does not imply that if f (x) = 0 then x is a maximum point (nor a minimum point). If x ∈ (a. I Comment: This is a uni-directional theorem.

A point a ∈ A is called a critical point of f if f (a) = 0. a + δ). There are three types of candidates: (1) critical points of f in (a.4 Let f : A → B. 3 3 3 3 . x = ±1/ 3. (ii) the end points a and b. If Comment: Although this theorem is due to Fermat. Theorem 3.. which are both in the domain.10 (Pierre de Fermat) If a is a local maximum (or minimum) of f in some open interval and f is diﬀerentiable at a then f (a) = 0. √ i. If f is diﬀerentiable. If f is continuous. if there exists a δ > 0 such that a is a maximum of f in (a − δ. An interior point a ∈ A is called a local maximum of f ( ). Deﬁnition 3. a + δ) (“local” always means “in a suﬃciently small neighborhood”). then the task reduces to ﬁnding all the critical points and comparing all the critical values. It is easily checked that √ √ 2 2 f (1/ 3) = − √ and f (−1/ 3) = √ .e. the largest critical value has to be compared to f (a) and f (b). Proof : There is actually nothing to prove. but zero is not a local minimum. its proof is slightly shorter than the proof of Fermat’s last theorem. b). 2] → R. b] → R and suppose we want to ﬁnd a maximum point of f . and its critical points satisfy f (x) = 3x2 − 1 = 0. Comment: The converse is not true. The value f (a) is called a critical value. Take for example the function f : x → x3 . { f (x) : f (x) = 0}.106 Chapter 3 Deﬁnition 3. as the previous theorem applies verbatim for f restricted to the interval (a − δ.5 Let f : A → B be diﬀerentiable. nor a local maximum of f . Locating extremal points Let f : [a. Finally. then a maximum is guaranteed to exist. I is diﬀerentiable in R and f (0) = 0. Example: Consider the function f : [−1. f : x → x3 − x. This function is diﬀerentiable everywhere. and (iii) points where f is not diﬀerentiable.

there is no interior point where f (x) = 0. b). I Comment: The requirement that f be diﬀerentiable everywhere in (a. b−a that is. What about the reverse direction? Take the following example: we know that the derivative of a constant function is zero. b) is imperative. If the maximum occurs at some c ∈ (a. f (−1) = 0 and f (2) = 6. we have derived properties of the derivative given the function. diﬀerentiable in (a. If the minimum occurs at some c ∈ (a. b] and diﬀerentiable in (a. b) then f (c) = 0 and we are done. b]. 1] and f (0) = f (1) = 0. b]. 107 So far. b) where f (b) − f (a) . b) then f (c) = 0 and we are done. 2 Even though f is continuous in [0. . Proof : It follows from the continuity of f that it has a maximum and a minimum point in [a. at all of them). it is not clear how to show it. How can we go from the knowledge that f (y) − f (x) = 0. b) where f (c) = 0. then there exists a point c ∈ (a. The following two theorems will provide us with the necessary tools: Theorem 3.Derivatives Finally. Is it also true that if the derivative is zero then the function is a constant? A priori.12 (Mean-value theorem ( f (c) = )) If f is continuous on [a. The only remaining alternative is that a and b are both minima and maxima.11 (Michel Rolle. a point at which the derivative equals to the mean rate of change of f between a and b. in which case f is a constant and its derivative vanishes at some interior point (well. then there exists a point c ∈ (a. Theorem 3. hence 2 is the (unique) maximum point. lim y→x y−x to showing that f is a constant function. for consider the function x f (x) = 1 − x 1 0≤x≤ 2 1 < x ≤ 1. and f (a) = f (b). 1691) If f is continuous on [a. b).

Hence by Rolle’s theorem there exists a point c ∈ (a. Moreover. Proof : Apply the previous corollary for f − g. f (e) = d−c however f (e) = 0. g(a) = g(b) = f (a). d ∈ (a. hence f (d) = f (c). such that f (d) − f (c) . Proof : This is almost a direct consequence of Rolle’s theorem. Proof : Let c. b). b) then f is a constant. Take a domain of deﬁnition which is the union of two disjoint sets. then there exists a real number c such that f = g + c on that interval.108 Chapter 3 Comment: Rolle’s theorem is a particular case. d). g(x) = f (x) − b−a g is continuous on [a. and this is no longer true ( f is only constant in every “connected component” ( ¯ )). b] → R. b] and diﬀerentiable in (a. f (b) − f (a) (x − a). Since this holds for all pair of points. b). Deﬁne the function g : [a. c < d. (27 hrs. I . then f is a constant.2 Let f : [a. b−a I Corollary 3. such that 0 = g (c) = f (c) − f (b) − f (a) . 2009) Corollary 3. If f (x) = 0 for all x ∈ (a. By the mean-value theorem there exists some e ∈ (c. b] → R.3 If f and g are diﬀerentiable on an interval with f (x) = g (x). b). I Comment: This is only true if f is deﬁned on an interval.

We have seen that at a local minimum (or maximum) the derivative (if it exists) vanishes.13 If f (a) = 0 and f (a) > 0 then a is a local minimum of f . Proof : Let x < y belong to that interval. y) such that f (y) − f (x) f (c) = . The following theorem gives a suﬃcient condition for a point to be a local minimum (with a corresponding theorem for a local maximum). but that the opposite is not true. as in f (x) = x3 at zero. . hence f (y) > f (x). If f is increasing and diﬀerentiable then f (x) ≥ 0. f exists in some neighborhood of a.4 If f is continuous on a closed interval [a. Proof : By deﬁnition. f (a) = lim ∆ f a . (30 hrs. b] and f (x) > 0 in (a. 3. 2011) Comment: The fact that f exists at a implies: 1. I Comment: The converse is not true. b). Theorem 3. y−x however f (c) > 0. y−a y−a Thus there exists a δ > 0 such that f (y) >0 y−a whenever 0 < |y − a| < δ.a (y) = f (y) − f (a) f (y) = . but equality may hold.a >0 where ∆ f . 2. By the mean-value theorem there exists a c ∈ (x. f is continuous in some neighborhood of a. then f is increasing on that interval. f is continuous at a.Derivatives 109 Corollary 3.

0. and one which is neither. Proof : Suppose. even though f (0) = 0.. Example: Consider the function f (x) = x3 − x. one local minimum..14 If f has a local minimum at a and f (a) exists. y). then f (a) ≥ 0. a): f (y) − f (a) = f (cy ) < 0 y−a where here cy ∈ (y. Another way of completing the argument is as follows: for every y ∈ (a. It follows that f is increasing in a right-neighborhood of a and decreasing in a left-neighborhood of a. (36 hrs.110 or. f (y) > f (a). I i. By the previous theorem this would imply that a is both a local minimum and a local maximum. 2010) Theorem 3. for every y ∈ (a − δ. by contradiction.e.e. a is a local minimum. f (a) = f (a) = 0. i. ±1/ 3: one local maximum. I The following theorem states that the derivative of a continuous function cannot have a removable discontinuity. Zero is a minimum point. Chapter 3 f (y) > 0 f (y) < 0 whenever whenever a<y<a+δ a − δ < y < a. f (y) > f (a)..e.e. Similarly. a). . a + δ): f (y) − f (a) = f (cy ) > 0 y−a i. a contradiction. where cy ∈ (a.. There are three critical points √ Example: Consider the function f (x) = x4 . i. that f (a) < 0. This in turn implies that there exists a neighborhood of a in which f is constant.

(∀ > 0)(∃δ > 0) : (∀y ∈ B◦ (a. y−a and therefore ∆ f. δ). a a Since f has a limit at a.a (y) − lim f = f (cy ) − lim f . Proof : The assumption that f has a limit at a implies that f exists in some punctured neighborhood U of a.a (y) − lim f | < a . such that f (y) − f (a) f (cy ) = . a then f (a) exists and f is continuous at a. there exists a point cy between y and a.Derivatives 111 Theorem 3.15 Suppose that f is continuous at a and lim f exists. a I Example: The function x2 f (x) = 2 x + 5 x<0 x ≥ 0. δ)) |∆ f. δ)) | f (y) − lim f | < a . δ)) | f (cy ) − lim f | < a . for y ∈ U \ {a} the function f is continuous on the closed segment that connects y and a and diﬀerentiable on the corresponding open segment. By the mean-value theorem. which precisely means that f (a) = lim f . which by the above identity gives (∀ > 0)(∃δ > 0) : (∀y ∈ B◦ (a. Thus. . (∀ > 0)(∃δ > 0) : (∀y ∈ B◦ (a. δ) implies that cy ∈ B◦ (a. Since y ∈ B◦ (a.

b). Example: The function x2 sin 1/x f (x) = 0 x 0 x = 0. b] and diﬀerentiable in (a. b−a The problem is that there is no reason for the points c and d to coincide. so that the theorem does not apply. b). the derivative does not have a limit at zero.16 (Augustin Louis Cauchy. mean value theorem) Suppose Comment: If g(b) g(a) then this theorem states that f (c) f (b) − f (a) = .112 Chapter 3 is diﬀerentiable in a neighborhood of 0 and lim f exists. g(b) − g(a) g (c) This may seem like a corollary of the mean-value theorem. (29 hrs. . and therefore not diﬀerentiable at zero. Note. is continuous and diﬀerentiable in a neighborhood of zero. g are continuous on [a. Then. that f is diﬀerentiable at zero. 2009) (37 hrs. such that [ f (b) − f (a)]g (c) = [g(b) − g(a)] f (c). 0 but it is not continuous. such that f (c) = f (b) − f (a) b−a and g (d) = g(b) − g(a) . Theorem 3. 2010) that f. We know that there exist c.d. However. there exists a c ∈ (a. however.

f (a) = g(a) = 03 . a g g Comment: This is a very convenient tool for obtaining a limit of a fraction. otherwise.17 (L’Hˆ pital’s rule) Suppose that o lim f = lim g = 0. le Marquis de l’Hˆ pital. when both numerator and denominator vanish in the limit2 .Derivatives Proof : Deﬁne h(x) = f (x)[g(b) − g(a)] − g(x)[ f (b) − f (a)]. b). the limit lima ( f /g ) would not have existed. 0 0 x=a x=a 2 .e. Namely. 3 What we really do is to deﬁne continuous functions f (x) x a g(x) x a f˜(x) = and g(x) = ˜ . such that h (c) = f (c)[g(b) − g(a)] − g (c)[ f (b) − f (a)] = 0. L’Hˆ pital’s rule was in fact derived by Johann Bernoulli who was giving to the French nobleo man. lessons in calculus. b] and diﬀerentiable in (a. hence there exists a point c ∈ (a.. and that h(a) = h(b). 113 and apply Rolle’s theorem. I Theorem 3. a a and that lim a f exists. g Then the limit lima ( f /g) exists and lim a f f = lim . b). Note that f and g are not necessarily deﬁned at a. o o acknowledging the help of Bernoulli. L’Hˆ pital was the one to publish this theorem. we verify that h is continuous on [a. there will be no harm assuming that they are continuous at a. i. since they have a limit there. Proof : It is implicitly assumed that f and g are diﬀerentiable in a neighborhood of a (except perhaps at a) and that g is non-zero in a neighborhood of a (except perhaps at a).

then g(x) 0 whenever 0 < |x − a| < δ. δ).1) f f (x) − lim < a g g (x) . for by Rolle’s theorem. f f = lim . with.1). . say. a point c x between x and a. (∀ > 0)(∃δ > 0) : x ∈ B◦ (a. (∀ > 0)(∃δ > 0) : x ∈ B◦ (a. then there exists a 0 < c x < x < δ. Both f and g are continuous and diﬀerentiable in some neighborhood U that contains a. At the end. where g (c x ) = 0. δ) implies c x ∈ B◦ (a. we claim that g does not vanish in some neighborhood U of a. there exists for every x ∈ U. g(x) − 0 g(x) g (c x ) Since the limit of f /g at a exists.114 Chapter 3 First. such that f (x) − 0 f (x) f (c x ) = = . if g (x) 0 whenever 0 < |x − a| < δ. δ) implies and since x ∈ B◦ (a. More formally. δ) implies which means that lim a (3. a g g I Example: and prove the theorem for the “corrected” functions f˜ and g. hence by Cauchy’s mean-value theorem. δ) implies and by (3. it would imply that g vanishes somewhere in this neighborhood. (∀ > 0)(∃δ > 0) : x ∈ B◦ (a. 0 < x < δ. f f (c x ) − lim < a g g (c x ) . f f (x) − lim < a g g(x) . for if g(x) = 0. we observe that the result ˜ must apply for f and g as well.

we only prove one variant among many other of l’Hˆ pital’s o rule. ∞ g g Yet another one is: Suppose that lim f = lim g = ∞. but note how difo ferent it is: . x→0 x sin2 x = 1. a a and that lim a f exists. g Then lim a f f = lim .Derivatives The limit 115 lim The limit sin x = 1. g Then lim ∞ f f = lim . 2011) Comment: Purposely. ∞ ∞ and that lim ∞ f exists. x→0 x2 with two consecutive applications of l’Hˆ pital’s rule. Another variants is: Suppose that lim f = lim g = 0. o lim (32 hrs. a g g The following theorem is very reminiscent of l’Hˆ pital’s rule.

e. Show that there exists a constant c. Any suppose that f (x) = f (x) for all Exercise 3. and yet. ∆g. x ∈ I. Thus. g g (a) g (a) 0.5 . in which the numerator g(x) is non-zero. and in particular. y→a y→a y − a y−a hence there exists a punctured neighborhood of a where g(x)/(x − a) > 0. and f (x) − f (a) f (x) = = g(x) g(x) − g(a) f (x)− f (a) x−a g(x)−g(a) x−a = ∆ f. we can divide by g(x) in a punctured neighborhood of a. with f (a) = g(a) = 0 Then. is not increasing in any neighborhood of 0 (i.a (x) I Using the arithmetic of limits. we obtain the desired result. such that f (x) = c e x ..a (x) . Proof : Without loss of generality. there is no neighborhood of zero in which x < y implies f (x) < f (y)).116 Chapter 3 Theorem 3. (Hint: consider the function g(x) = f (x)/e x . lim g(y) − g(a) g(y) = lim = g (a) > 0. 2011) (30 hrs. That is.4 Let I be an open segment. (33 hrs. lim a and f f (a) = . assume g (a) > 0.18 Suppose that f and g are diﬀerentiable at a.) Exercise 3. 2009) Exercise 3.3 Show that the function 1 x + x2 sin 1 x f (x) = 2 0 x 0 x=0 satisﬁes f (0) > 0.

7 Calculate the following limits: lim x→0 sin(ax) . Show that if f vanishes at m points. . 1.5 Derivatives of inverse functions We have already shown that if f : A → B is one-to-one and onto. sin(bx) cos(cx) tan(x)−x lim x→0 x−sin(x) . then it has an inverse f −1 : B → A. x x − x0 x and choose x0 appropriately in order to use Lagrange’s theorem. Hint: write f (x) − f (x0 ) x − x0 f (x) = · . . Exercise 3. Find all the values of α. and n + 1 times differentiable in (a.6 Let sin(ωx) f (x) = γ 2 αx + βx x<0 x=0 x > 0. We then say that f is invertible ( ¯ ). In this section. sin(x) πx 2 lim x→0 lim x→1 (1 − x) tan . ∞) and lim∞ f = L. and that the inverse of a continuous function is continuous. Show that there exists a c ∈ (a. (ii) diﬀerentiable at zero. . Prove. β. b). then f vanishes at at least m − 1 points. Exercise 3. Let f be n times continuously diﬀerentiable in [a. 3. we study under what conditions is the inverse of a diﬀerentiable function diﬀerentiable. Suppose furthermore that f (b) = 0 and f (k) (a) = 0 for k = 0. . γ. ω for which f is (i) continuous at zero. x2 sin(1/x) . b]. . without using l’Hopital’s theorem that lim x→∞ f (x)/x = L.8 Suppose that f : (0. b) such that f (n+1) (c) = 0. We have seen that an invertible function deﬁned on an interval must be monotonic. ∞) → R is diﬀerentiable on (0. n.Derivatives 117 Let f : R → R be diﬀerentiable. (iii) twice diﬀerentiable at zero. Exercise 3.

2: The derivative of the inverse function Recall also that f −1 ◦ f = Id. Corollary 3. If f is diﬀerentiable at f −1 (b) and f ( f −1 (b)) 0. then f −1 is not diﬀerentiable at 2. There is one little ﬂaw with this argument. Example: The function f (x) = x3 + 2 is continuous and invertible. ( f −1 ) ( f (a)) f (a) = 1.118 Chapter 3 Figure 3. Since. then by the composition rule. 2010) Theorem 3. then f −1 is diﬀerentiable at b. We have assumed ( f −1 ) to exist. or equivalently. a = f −1 (b). we get ( f −1 ) (b) = 1 f ( f −1 (b)) . with Id : A → A.5 If f ( f −1 (b)) = 0 then f −1 is not diﬀerentiable at b. . If both f and f −1 were diﬀerentiable at a and f (a).19 Let f : I → R be continuous and invertible. f ( f −1 (2)) = 0. with f −1 (x) = √ 3 x − 2. Setting f (a) = b. respectively. (39 hrs. then we have just derived an expression for the derivative of the inverse. If this were indeed the case.

Then4 . ( f −1 ) (x) = 1 f ( f −1 (x)) = cos2 (arctan x) = 1 1+ tan2 (arctan x) = 1 ..a (x) = which we can rewrite as ∆ f −1 . a = f −1 (b). Then. 4 We used the identity cos2 = 1/(1 + tan2 ). −1 (b) x−a f (y) − f ∆ f −1 .e. To each x a corresponds a y y = f (x) b such that and x = f −1 (y). It is given lim ∆ f. 1 + x2 derivative of f −1 (x) = x1/n is Example: Consider the function f (x) = xn .b = . it follows from limit artithmetic that 1 lim ∆ f −1 .a = f (a) a 119 0. −1 (b)) b f (f I Example: Consider the function f (x) = tan x deﬁned on (−π/2. n and together with the composition rule gives the derivative for all rational powers.Derivatives Proof : Let b = f (a).a is continuous at a = f −1 (b). i. . This proves that the ( f −1 ) (x) = 1 1/n−1 x .b (y) Since f −1 is continuous at b and ∆ f. with n integer.b (y) = 1 1 = . ∆ f. ∆ f.a (x) ∆ f. We denote its inverse by arctan x. π/2).a ( f −1 (y)) y−b 1 f (x) − f (a) = −1 = .

and satisﬁes f = f −1 .) Proof : Consider a segment (b. Further- more. By the deﬁnition of the supremum. − < f (x) ≤ whenever y < x < a. Exercise 3. if A is a connected set then f has one-sided limits everywhere.11 Suppose that f : R → R is continuous.e. Exercise 3. Hence we can deﬁne = sup K. . there exists an m ∈ K such that m > − . then we know the function whose derivative is its inverse. Note that this statement says that if we know a function whose derivative is one-to-one and onto. a] ∈ A. Prove that ( f −1 ) exists and ﬁnd an expression for it in terms of f and its derivatives.10 Let f be invertible and twice diﬀerentiable in (a.. (i) Show that f has a ﬁxed point. By the monotonicity of f . b). 3. b). Suppose that there exists a function F : R → R such that F = f .9 Let f : R → R be one-to-one and onto. then it has one-sided limits at every point that has a one-sided neighborhood that belongs to A. We need to show that f has a left-limit at a. i. Hint: start by ﬁguring out how do such functions look like. Exercise 3. Prove that G (x) = f −1 (x). (ii) Find such an f that has a unique ﬁxed point. Let y be such f (y) = m. increasing).20 Suppose that f : A → B is monotonic (say. (In particular. Deﬁne G(x) = x f −1 (x) − F( f −1 (x)). − a Let > 0. We will get to it after we study integration theory. We are going to show that = lim f. there exists an x ∈ R such that f (x) = x.120 Chapter 3 Comment: So far. one-to-one and onto. we have not deﬁned real-valued powers. The set K = { f (x) : b < x < a} is non-empty and upper bounded by f (a) (since f is increasing).6 Complements Theorem 3. and furthermore differentiable with f (x) 0 for all x ∈ R. f (x) 0 in (a.

where f (c) > lim f − c and/or f (c) < lim f. This implies the existence of a point c ∈ [a. f (b)] is in the image of f follows from the intermediatevalue theorem. It is nevertheless correct even if f has discontinuities.21 Let f : [a. . b). f (b)]. + c (We rely on the fact that one-sided derivatives exist. Image( f ) ⊆ [ f (a). f (b)].) Say. that f is not continuous. f (c) < f (c) + limc+ f < lim f c+ 2 is not in the image of f . such that f (x) = t. including a right-derivative at a and a left-derivative at b. whereby for every t between f (a+ ) and f (b− ) there exists an x ∈ [a. We proceed similarly for right-limits. that f (c) < limc+ f . Then the image of f is a segment if and only if f is continuous. b]. This theorem thus limits the functions that are derivatives of other functions—not every function can be a derivative. Then. Proof : (i) Suppose ﬁrst that f is continuous.22 (Jean-Gaston Darboux) Let f : A → B be diﬀerentiable on (a. I Theorem 3. for example. Comment: The statement is trivial when f is continuously diﬀerentiable (by the intermediate-value theorem). (ii) Suppose then that Image( f ) = [ f (a). Then. 121 I Theorem 3. f has the “intermediate-value property”. b] → R be monotonic (say. by contradiction. f (b)]. b]. which contradicts the fact that Image( f ) = [ f (a). Suppose. increasing).Derivatives This concludes the proof. By monotonicity. That every point of [ f (a).

The point a cannot be a maximum point because g is increasing at a. b cannot be a maximum point because g is decreasing at b. x=0 The derivative is not continuous at zero. (Here t is a ﬁxed parameter. and we wish to show the existence of an x ∈ [a. g (c) = f (c) − t = 0. Similarly. Let f (a+ ) > t > f (b− ).122 Chapter 3 Proof : Without loss of generality. there exists an interior point c which is a maximum. but nevertheless satisfes the Darboux theorem. Since g is continuous on [a. and by Fermat’s theorem. 2009) Example: The function x2 sin 1/x f (x) = 0 x 0 x=0 is diﬀerentiable everywhere. and its derivative is 2x sin 1/x − cos 1/x f (x) = 0 x 0 . and consider the function g(x) = f (x) − tx. let us assume that f (a+ ) > f (b− ). Example: How crazy can a continuous function be? It was Weierstrass who shocked the world by constructing a continuous function that it nowhere diﬀerentiable! As instance of his class of example is ∞ f (x) = k=0 cos(21k πx) . b] such that g (x) = 0. b] then it must attain a maximum. 3k .) We have g (a+ ) = f (a+ ) − t > 0 and g (b− ) = f (b− ) − t < 0. I (32 hrs. Thus. for every t we can construct such a function.

b). Suppose. b] → R is continuous on [a. that we knew that | f (y)| < M for all y. then we can reﬁne a lot the estimates we have on the variation of f as we get away from the point a. Then. . such that f (x) − f (a) = f (c x ). a It turns out that if we have more information. lim a f − P f. (Id −a)n Comment: Loosely speaking. f (n) (a) all exist. for example. f (a).23 Let f be a given function. as x approaches a. b) there exists a point c x . . Theorem 3. For n = 0 it states that lim[ f − f (a)] = 0. P f.a (x) is closer to f (x) than (x − a)n (or that f and P f. .n. If f : [a. In particular. a . then for every point x ∈ (a. Then.a = 0. Suppose that the derivatives f (a).7 Taylor’s theorem Recall the mean-value theorem.a (x) = a0 + a1 (x − a) + a2 (x − a)2 + · · · + an (x − a)n .n. about higher derivatives of f .n. this theorem states that. b] and diﬀerentiable in (a. x−a We can write it alternatively as f (x) = f (a) + f (c x )(x − a). lim[ f − f (a)] = 0.n. and a an interior point of its domain.Derivatives 123 3. we would deduce that f (x) cannot diﬀer from f (a) by more than M times the separation |x − a|. .a are “equal up to order n”. Let ak = f (k) (a)/k! and set the Taylor polynomial of degree n. see below). P f.

(35 hrs. lim n−1 a n(n − 1) (Id −a)n−2 a n (Id −a) provided that the right hand side exists. For that we note o that P f. Now write the right hand side explicitly.n. a (Id −a) which holds by deﬁnition.n.a . f (n−1) − P f (n−1) . This concludes the proof.n.a − f (n) (a) .n. (Id −a)n which is a ratio of n-times diﬀerentiable functions at a. lim a Chapter 3 f − f (a) − f (a) = lim ∆ f. 2011) The polynomial P f.n−2.a is: I . so we will attempt to use l’Hˆ pital’s rule.n. = lim a (Id −a)n n! (Id −a) provided that the right hand side exists. f − P f . If we denote by Πn the set of polynomials of degree at most n.n−1. then the deﬁning property of P f.a = lim . f − P f .a f − P f. a n (Id −a)n−1 a (Id −a)n provided that the right hand side exists. Both numerator and denominator vanish at a.a . which is the limit of the ratio of two diﬀerentiable functions that vanish at a.a − f (a) = 0.a = lim . Its existence relies on f being diﬀerentiable n times at a.n−1. We then proceed to consider the right hand side.a f − P f. 1 f (n−1) − f (n−1) (a) − f (n) (a)(Id −a) 1 lim = lim ∆ f (n−1) .a (x) is known as the Taylor polynomial of degree n of f about a. Proof : Consider the ratio f − P f.a = P f . We proceed inductively. until we get that lim lim a Thus.124 whereas for n = 1. Hence.n−1.a .a f − P f .n.1. n! a Id −a n! a which vanishes by the deﬁnition of f (n) (a).

n.a (x) + R f.a (provided that the latter exists) are equal up to order n at a. which deﬁnes an equivalence class. 1. .n. 2. The next task is to obtain an expression for the diﬀerence between the function f and its Taylor polynomial. (Id −a)n It is easy to see that “equal up to order n” is an equivalence relation. by f (x) = P f. Then P = Q. a0 = b0 .a ∈ Πn . g be deﬁned in a neighborhood of a. n.. We say that f and g are equal up to order n at a if lim a f −g = 0. We proceed similarly for each k to show that ak = bk for k = 0.n. Proposition 3. . Q ∈ Πn be equal up to order n at a.1 Let P.Derivatives 1. . n. Rn (x).6 Let f.a (x). P f. . n. 1. 1. I . . .a 125 Deﬁnition 3. P(k) (a) = f (k) (a) for k = 0. f. . . We know that lim a P−Q =0 (Id −a)k for k = 0.n. For k = 0 0 = lim(P − Q) = a0 − b0 . the above theorem proves that f (x) and P f.e. . . .n. Thus. . Proof : We can always express these polynomials as P(x) = a0 + a1 (x − a) + · · · + an (x − a)n Q(x) = b0 + b1 (x − a) + · · · + bn (x − a)n . a i. We deﬁne the remainder ( ) of order n.

1 + x2 P f.6 If f is n times diﬀerentiable at a and f is equal to Q ∈ Πn up to order n at a.0 (x) = k=0 xk . then P f. which means that n xk are P f. f (x) − xn n k=0 ∞ n x = k k=0 k=0 xk + xn+1 . Example: We know from high-school math that for |x| < 1 1 f (x) = = 1−x Thus.2n.n.126 Chapter 3 Corollary 3. . 1−x xk = x . Example: Similarly for |x| < 1 1 f (x) = = 1 + x2 Thus.a = Q.n. 1−x n k=0 Since the right hand side tends to zero as x → 0 it follows that f and equal up to order n at 0. x2 .0 (x) = k=0 (−1)k x2k . 1 + x2 Since the right hand side tends to zero as x → 0 it follows that f (x) − = n n k 2k k=0 (−1) x x2n ∞ n (−x ) = 2 k k=0 k=0 (−1)k x2k + (−1)n+1 x2n+2 .

2011) Theorem 3. Proof : We use again the fact that if a polynomial “looks” like the Taylor polynomial then it is.a = [P f. P f +g.a =g + P f.n.n. Then.n.a + Pg.n.a g + P f. 1 + t2 Since x2k+1 (−1) f (x) − ≤ |x|2n+3 .n.a Pg. Then.a = P f. 2k + 1 (36 hrs.n. n (Id −a) (Id −a)n . where []n stands for a truncation of the polynomial in (Id −a)at the n-th power.0 = n (−1)k k=0 x2k+1 .a Pg. The sum P f.n. Second.n.a Pg.a .a f g − P f.n. g be n times diﬀerentiable at a.a ) = lim + lim = 0.n. x f (x) = 0 n dt = 1 + t2 x 0 n (−1) t + (−1) k 2k k=0 x n+1 t2n+2 dt 2 1+t = k=0 (−1)k x2k+1 + 2k + 1 n (−1)n+1 0 t2n+2 dt.a Pg.a g − Pg. we note that P f.n.n. 2k + 1 k=0 k it follows that P f.a ]n .n.n.a + Pg.a ( f + g) − (P f.n.n.24 Let f.a g − P f.2n+2.a .a belongs to Πn and lim a f − P f. n n a a (Id −a)n (Id −a) (Id −a) which proves the ﬁrst statement. and that f g − P f.n.a ∈ Π2n .n.n.a + Pg.a g − Pg.n.Derivatives 127 Example: Let f = arctan.n.n.a = n (Id −a) (Id −a)n f − P f. and P f g.n.

I Theorem 3. f (x) = P f.a (x) = f (n+1) (c) (x − a)n+1 . x).128 Chapter 3 whose limit at a is zero. f (z) f (n) (z) (x − z)2 − · · · − (x − z)n . φ is diﬀerentiable. All that remains it to truncate the product at the n-th power of (Id −a).n.z (x) = f (x) − f (z) − f (z)(x − z) − We note that φ(a) = R f.a (x) can be represented in the Lagrange form.n. n! Comment: For n = 0 this is simply the mean-value theorem.a (x). (n + 1)! for some c ∈ (a. It can also be represented in the Cauchy form. where the remainder R f. 2! n! f (n+1) (z) f (n) (z) (x − z)n − (x − z)n−1 n! (n − 1)! f (n+1) (z) =− (x − z)n .a (x) = for some ξ ∈ (a. and consider the function φ : [a.n. x] → R. x]. with φ (z) = − f (z) − f (z)(x − z) − f (z) − − ··· − f (z) (x − z)2 − f (z)(x − z) 2! and φ(x) = 0. f (n+1) (ξ) (x − ξ)n (x − a). f.n. x). φ(z) = f (x) − P f.a (x) Moreover. Then.a (x) + Rn. R f. n! .n.n. R f. Proof : Fix x.25 (Taylor) Suppose that f is (n + 1)-times diﬀerentiable on the interval [a.

x0 = 0. That is. both in the Lagrange and Cauchy forms: f (x) = 1/(1 + x). x) with nonvanishing derivative. f (x) = cosh(x). ﬁnd coeﬃcients a0 . Prove that f is a polynomial whose degree is at most n − 1. such that p(x) = n ak (x + 1)k . Rn (x) = (x − a) p f (n+1) (ξ) (x − ξ)n−p+1 p n! For p = 1 we retrieve the Cauchy form. .13 Let f : R → R be n times diﬀerentiable. a1 .14 Write the Taylor polynomials of degree n at the point x0 for the following functions. . n ∈ N. diﬀerentiable on (a. For p = n + 1 we retrieve the Lagrange form.Derivatives 129 Let now ψ be any function deﬁned on [a. an . Exercise 3. Rn (x) = ψ(x) − ψ(a) f (n+1) (ξ) (x − ξ)n ψ (ξ) n! Set for example ψ(z) = (x − z) p for some p. there exists a point a < ξ < x. such that φ(x) − φ(a) φ (ξ) = . n ∈ N. x0 = 0. such that f (n) (x) = 0 for all x ∈ R. By Cauchy’s mean-value theorem.12 Write the polynomial p(x) = x3 − 3x2 + 3x − 6 as a polynomial Exercise 3. I Comment: Later in this course we will study series and ask questions like “does the Taylor polynomial Pn (x) tend to f (x) as n → ∞?”. ψ(x) − ψ(a) ψ (ξ) That is. k=0 Solve this problem using two methods: (i) Newton’s binomial. as well as the corresponding remainders. Then. 2009) in x + 1. (34 hrs. Here we deal with a diﬀerent beast: the degree of the polynomial n is ﬁxed and we consider how does Pn (x) approach f (x) as x → a. . . and (ii) Taylor’s polynomial. This is the notion of converging series. Exercise 3. x]. The fact that Pn approaches f faster than (x − a)n as x → a means that Pn is an asymptotic series of f . .

f (x) = sin(x). Use the notation α def α(α − 1) . then f (0) does not exist. cos(1/2).17 For every polynomial P(x) = n k=0 ak xk and evert q ∈ N. x0 = 0. then f (a) = lim h→0 f (a + h) + f (a − h) − 2 f (a) . the following expressions. x0 = 1. Exercise 3. h Using l’Hopital’s rule. .16 Evaluate. α N. we denote q<n q ≥ n. f (x) = (x − 2)6 + 9(x + 1)4 + (x − 2)2 + 8.15 Show using diﬀerent methods that of f (a) exists. but the limit of the above fraction does exist. n = 6. . Why isn’t it a contradiction? Exercise 3. using Taylor’s polynomial. Using Taylor’s polynomial of order two f at a. since it is only given that f is twice diﬀerentiable at a. k ∈ N. x0 = π/2. n = 5. up to an accuracy of at least 0. n ∈ N. = n! n which generalizes Newton’s binomial. Note that you can’t use neither Lagrange or Cauchy’s form of the remainder. n = 2k.130 Chapter 3 f (x) = x5 − x4 + 7x3 − 2x2 + 3x + 1. q ak x k Pq (x) = k=0 P(x) . Exercise 3. (α − n + 1) . Show that if x2 x≥0 f (x) = 2 −x x < 0. f (x) = (1 + x)α .01: e1/2 . x0 = 0.

g have Taylor polynomials of order n at zero. write the Taylor polynomial of order n at zero of the function g(x) = f (2x). where Pn and Qn are the corresponding Taylor polynomials at zero.Derivatives 131 Suppose that f. Find the Taylor polynomial of order three at zero of the function f (x) = exp(sin(x)). P and Q. k Hint: think of the Taylor polynomial of a product of functions.18 Let f be n times diﬀerentiable at x0 and let P(x) = ak xk be its Taylor polynomial at zero. Hint: write f (g(x)) = f (g(x))/g(x) · g(x). n k=0 Exercise 3.20 Suppose that f and g have derivatives of all orders in R and . and that g does not vanish in a punctured neighborhood of zero. then n ( f · g)(n) (x0 ) = k=0 n (k) f (x0 ) g(n−k) (x0 ). Given the Taylor polynomial P of order n at zero of the function f . Show that the Taylor polynomial of order n at zero of f ◦ g is given by (P ◦ Q)n . that Pn (x) = Qn (x) for all n ∈ N and every x ∈ R.19 Show that if f and g are n times diﬀerentiable at x0 . Exercise 3. Justify your answer. Does it imply that f (x) = g(x)? Exercise 3. Suppose furthermore that g(0) = 0. What is the Taylor polynomial of f of order n − 1 at zero.

132 Chapter 3 .

Within the Riemann integration theory. We will learn how to calculate the area of a region bounded by the x-axis. it is deﬁned as “the number of squares with sides of length one that ﬁt into the region”. For the moment let us assume that f ≥ 0. This region will be denoted by R( f. 0) and (b. When we learn in school about the area of a region. Integrals were invented in order to calculate areas. and its area (once deﬁned) will be called the integral of f on [a. and two vertical lines passing though points (a. b).Chapter 4 Integration theory 4. 0). There are multiple non-equivalent ways to deﬁne integrals. It is of course hard to make sense with such a deﬁnition. 1 . a. b]1 . We study integration theory in the sense of Riemann. The fact that the two are intimately related— that derivatives are in a sense the inverse of integrals—is a wonder. the graph of a function f . another classical integration theory is that of Lebesgue (studied in the context of measure theory). we adopt a particular derivation due to Darboux.1 Deﬁnition of the integral Derivatives were about calculating the rate of change of a function.

a.2 Let f : [a. . P) ≤ A ≤ U( f. . We deﬁne the lower sum ( ¯ ) of f for P to be n L( f. . you can’t use the word to be deﬁned in a deﬁnition) of the interval into a ﬁnite number of segments. a = x0 < x1 < x2 < · · · < xn = b. [a. xn } be a partition of [a. . b) must satisfy L( f. xn ]. b]. b] → R be a bounded function.b" a b Deﬁnition 4. P) for all partitions P.1 A partition ( ) of the interval [a. x1 ] ∪ [x1 .a. Deﬁnition 4. and let P = {x0 . any sensible deﬁnition of the area A of R( f. Clearly. We deﬁne the upper sum ( ¯ ) of f for P to be n U( f. P) = i=1 Mi (xi − xi−1 ). P) = i=1 mi (xi − xi−1 ). I know. .134 Chapter 4 f R!f. b] is a partition (yes. x1 . x2 ] ∪ · · · ∪ [xn−1 . b] = [x0 . We will identify a partition with a ﬁnite number of points. Let mi = inf{ f (x) : xi−1 ≤ x ≤ xi } Mi = sup{ f (x) : xi−1 ≤ x ≤ xi }.

−1/3. 1]. 2011) Exercise 4. 2/3. P) and U( f. Q) ≥ L( f. P2 ) where P1 and P2 are arbitrary partitions? It ought to be true. 8/9.Integration theory 135 f a x0 x1 x2 x3 b Figure 4. Is it however true that L( f. It is always true that L( f.3 A partition Q is called a reﬁnement ( P ⊂ Q. P1 ) ≤ U( f.1 Let f : [−1. P) and U( f. 1} be a partition of [−1. 2009) (38 hrs. Then L( f. (35 hrs. P) because mi ≤ Mi for all i. Deﬁnition 4.1: The lower and upper sums of f for P. ) of a partition P if Lemma 4. P). P). Comments: The boundedness of f is essential in these deﬁnitions. Q) ≤ U( f. 1] → R be given by f : x → x2 and let P = {−1. −1/2. .1 Let Q be a reﬁnement of P. Calculate L( f. P) ≤ U( f.

such that each P j diﬀers from P j−1 by one point. . x1 . . L( f. Q) = i=1 mi (xi − xi−1 ) + m j (y − x j−1 ) + m j (x j − y) + ˜ ˆ i= j+1 mi (xi − xi−1 ). since both are inﬁmums over a ˜ ˆ 2 smaller set . x j−1 . . Q) − L( f. . Thus. P) ≥ 0. I Theorem 4. That is. xn } Q = {x0 . L( f. P1 ) ≤ · · · ≤ L( f. . There exists a sequence of partitions.P2 of [a. xn }. Consider now two partitions. . P) = i=1 n mi (xi − xi−1 ) and j−1 n L( f. x j . y. b]. A similar argument works for the upper sums. P = {x0 . .136 Chapter 4 Proof : Suppose ﬁrst that Q diﬀers from P by exactly one point. L( f. P1 ) ≤ U( f. P2 ). ˜ ˆ Thus. . It remains to note that m j ≥ m j and m j ≥ m j . P ⊂ Q. . . L( f. Then. P) ≤ L( f. 2 It is always true that if A ⊂ B then inf A ≥ inf B and sup A ≤ sup B. where m j = inf{ f (x) : x j−1 ≤ x ≤ y} ˜ m j = inf{ f (x) : y ≤ x ≤ x j }. . Q). . P = P0 ⊂ P1 ⊂ · · · ⊂ Pk = Q. Q) − L( f. P) = (m j − m j )(y − x j−1 ) + (m j − m j )(x j − y). Hence. .1 For every two partitions P1 . ˆ L( f. x1 .

L ( f ) = {L( f. Q) ≤ U( f. 137 I Exercise 4. ≤ c ≤ u for all ∈ L ( f ) and For every > 0 there exist an ∈ L ( f ) and a u ∈ U ( f ). If R is a partition of both P and Q then it is also a reﬁnement of P ∪ Q. P) : P is a partition of [a. Exercise 4. We may now consider the collection of lower and upper sums. b]}. Similarly. b] We once proved a theorem stating that the following assertions are equivalent: L( f ) = U( f ). Q) ≤ U( f. we may deﬁne the least upper bound and greatest lower bound.2 Prove or disprove: A partition Q is a reﬁnement of a partition P if and only if every segment in Q is contained in a segment of P.3 Characterize the functions f for which there exists a partition P such that L( f. and not sets of partitions. L( f ) = sup L ( f ) and U( f ) = inf U ( f ). The set L ( f ) is non-empty and bounded from above by any element of U ( f ).Integration theory Proof : Let Q = P1 ∪ P2 . such that u − < . Thus. The real number U( f ) is called the upper integral ( ) of f on [a. b]} U ( f ) = {U( f. P) : P is a partition of [a. b] and L( f ) is called the lower integral ( ) of f on [a. . If a partition P is a reﬁnement of a partition Q then every segment in Q is a segment in P. P). the set U ( f ) is non-empty and bounded from below by any element of L ( f ). P1 ) ≤ L( f. P2 ). Since Q is ﬁner than both P1 and P2 it follows that L( f. Note that both L ( f ) and U ( f ) are sets of real numbers. P) = U( f. There is a unique number c. such that u ∈ U ( f ).

We will only brieﬂy talk about them (only due to a lack of time). it is very important to realize that the integral only involves f .138 Chapter 4 Deﬁnition 4. b]. We call this number the integral of f on [a. The geometric interpretation is that areas under the x-axis count as “negative areas”.4 If L( f ) = U( f ) we say that f is integrable on [a. Comment: At this stage we only consider integrals of bounded functions over bounded segments—so-called proper integrals. and denote it by b f. b] if the upper and lower integrals coincide. b]. Two questions arise: How do we know whether a function is integrable? . Comment: The integral of f is the unique number such that b L( f. are instances of improper integrals. a This notation is indeed very suggestive about the meaning of the integral. f is integrable on [a. P) ≤ a f ≤ U( f. however. Comment: What about the case where the sign of f is not everywhere positive? The deﬁnition remains the same. a and b. Integrals like ∞ 1 f a and 0 log x dx. for all partitions P and Q. Comment: You are all accustomed to the notation b f (x) dx. a That is. and it is not a function of any x. Q).

P) − L( f. such that U( f. by the equivalent criteria described above. suppose f is integrable. P ) ≤ U( f. Proof : If for every > 0 there exists a partition P. n L( f. P) = It follows that L( f ) = U( f ) = c(b − a). a . P) < L( f.2 A bounded function f is integrable if and only if there exists for every > 0 a partition P. P) = i=1 n mi (xi − xi−1 ) = c(b − a) Mi (xi − xi−1 ) = c(b − a). such that U( f. P . i=1 U( f. P) − L( f. P ) < . 2011) 139 The following lemma will prove useful in checking whether a function is integrable: Theorem 4. Let P = P ∪ P . given > 0. Since P is a reﬁnement of both. such that U( f. P) + . then U( f. P) + . P) < L( f. b] → R.Integration theory How to compute the integral of an integrable function? (40 hrs. then f is integrable. Then. there exist partitions P. f : x → c. P ) < . P ) − L( f. Then for every partition P. I Example: Let f : [a. Conversely. hence f is integrable and b f = c(b − a).

n Lunif ( f ) = {L( f. 0 f :x→ 1 x 1 x = 1. Let Pn . and U( f. Since L( f ) and M j = 1. P) = 0 (41 hrs. hence L( f ) = sup L ( f ) ≥ sup Lunif ( f ) = 0 U( f ) = inf U ( f ) ≤ inf Uunif ( f ) = 0. n ∈ N denote the family of uniform partitions. thus 0 ≤ L( f ) ≤ U( f ) ≤ 0. 2010) and U( f. P) = 1. and L( f ) = 0 and U( f ) = 1. 1] → R. Pn ) : n ∈ N} Clearly. Example: Consider next the function f : [0. P) = 2/n. Pn ) : n ∈ N} Uunif ( f ) = {U( f. .140 Chapter 4 Example: Let f : [0. mj = 0 hence L( f. 0 f :x→ 1 For every partition and every j. xi = For every such partition. x Q x ∈ Q. 2] → R. Lunif ( f ) ⊆ L ( f ) and Uunif ( f ) ⊆ U ( f ). P) = 0 Set now. U( f ). 2i . f is not integrable. L( f.

Pn = {0. then mi = xi−1 and Mi = xi . will convince us that we must develop tools to deal with more complicated functions. n L( f. We are next going to examine a number of examples. b}. . hence n L( f. Let P be any partition. and 2 141 f = 0. . Pn ) = i=1 As in the previous example.Integration theory which implies that f is integrable. b] → R. will practice our understanding of integrability. i=1 U( f. f = Id. b/n. . P) = Using again uniform partition. 2 (37 hrs. which on the one hand. and on the other hand. Example: Let f : [0. 2b/n. L( f ) = sup L ( f ) ≥ sup Lunif ( f ) = U( f ) = inf U ( f ) ≤ inf Uunif ( f ) = hence f is integrable and b 0 b2 2 b2 . 0 This example shows that integrals are not aﬀected by the value of a function at an isolated point. 2 b2 f = . P) = i=1 n xi−1 (xi − xi−1 ) xi (xi − xi−1 ). nn n 2 2 n U( f. Pn ) = i=1 n 1 (i − 1)b b b2 n(n − 1) b2 = 2 = 1− n n n 2 2 n ib b b2 n(n + 1) b2 1 = 2 = 1+ . 2009) . .

We start by verifying which classes of functions are integrable.3 If f [a. b] → R. 6 . we need stronger tools. 3 n k2 = k=1 1 n(n + 1)(2n + 1)..2 Integration theorems Theorem 4.142 Chapter 4 2 Example: Let f : [0. Take the same uniform partition. Pn ) = i=1 Since L( f ) ≥ sup Lunif ( f ) = b3 /3 it follows that 0 b and b3 . P) = i=1 n 2 xi−1 (xi − xi−1 ) U( f. i. f (x) = x2 .e. Pn ) = i=1 n (i − 1)2 b2 b b3 (n − 1)n(2n − 1) b3 1 = 3 = 1− 2 n n n 6 3 n i2 b2 b b3 n(n + 1)(2n + 1) b3 1 = 3 = 1+ 2 n n n 6 3 n 1+ 1− 2 . Let P be any partition. 2010) 4. b] → R is monotone. then it is integrable. 3 U( f ) ≤ inf Uunif ( f ) = b3 /3. hence n L( f. then3 n L( f. P) = i=1 xi2 (xi − xi−1 ). (42 hrs. n 1 2n U( f. then mi = xi−1 and Mi = xi2 . f = This game can’t go much further.

We want to show that for every > 0 there exists a partition P. P) = i=1 (Mi − mi )(xi − xi−1 ) < b−a (xi − xi−1 ) = . n n L( f. which guarantees that U( f. Then. b] → R is continuous. P) − L( f. P) − L( f. n Thus. that f is monotonically increasing.4 If f [a. n. such that b−a Take then a partition P such that xi − xi−1 < δ | f (x) − f (y)| < whenever |x − y| < δ. then it is integrable. P) = i=1 f (xi−1 )(xi − xi−1 ) and U( f. without loss of generality. U( f. That is. We need to somehow control the diﬀerence between mi and Mi . P) = n [ f (xi ) − f (xi−1 )](xi − xi−1 ). P) < L( f. such that U( f. Pn ) = n n [ f (xi ) − f (xi−1 )] = i=1 (b − a)[ f (b) − f (a)] . . Pn ) − L( f. b−a U( f. Recall that a function that is continuous on a closed interval is uniformly continuous. for all i = 1. i=1 Take now a family of equally-spaced partitions. . there exists a δ > 0. for every partition P. Proof : First. . I Theorem 4. Pn ) − L( f. and n n U( f. P) + . i=1 . it follows that Mi − mi < /(b − a). Since continuous functions on closed intervals assume minima and maxima in the interval. P) = i=1 f (xi )(xi − xi−1 ). .Integration theory 143 Proof : Assume. for every > 0 we can choose n > (b − a)[ f (b) − f (a)]/ . Pn ) < (we used the fact that the sum was “telescopic”). For such j=0 partitions. Pn = {a + j(b − a)/n}n . a continuous function on a segment is bounded. hence.

Q)] − [U( f. P ) < 2 and U( f. P ) − L( f. c] and [c. P ) + U( f. . P ) − L( f. b]. c] and a partition P of [c. P ) − L( f. b]. hence f is integrable on [a. c]. such that U( f. . We have L( f. P ). b].5 Let f be a bounded function on [a. we obtain the desired result. suppose that f is integrable on both [a. P) − L( f. c]. For every > 0 there exists a partition P of [a. P) < . b]. . P ) − L( f. and let a < c < b. xn } is a partition of [c. b]. If the partition P does not include the point c then let Q = P ∪ {c}. b] if and only if it is integrable on both [a. . It is suﬃcient to show that it is integrable on [a. P ) < . < ≥0 = and U( f. . . b]. Proof : (1) Suppose ﬁrst that f is integrable on [a. P )] = [U( f. P ) + L( f. . x j } is a partition of [a. in which case b c b f = a a f+ c f. c] and [c. 2 Since P = P ∪P is a partition of [a. Q) ≤ U( f. and U( f. P )] < . . Then there a exists a partition P of [a.144 Chapter 4 I (42 hrs. Q) = U( f. c]. b]. . and P {x j . then P = {x0 . Then. Q) = L( f. Q) − L( f. 2011) Theorem 4. b]. f is integrable on [a. Q) − L( f. Suppose that x j = c. P) − L( f. P ) from which we deduce that [U( f. P) < . summing up. (2) Conversely. such that U( f.

a b a f =− b a f and a f = 0. f (x) + g(x) ≥ inf{ f (x) : x ∈ A} + inf{g(x) : x ∈ A}.6 If f and g are integrable on [a. for every partition P that is a union of partitions of P and P . P) for every partition P of [a. b] if and only if it is integrable on both [a. so that (as can easily be checked) the addition rule always holds. since c L( f. b]. Theorem 4.Integration theory 145 Thus. b]. c] for every partition P of [c. We now add the following conventions. b] with b > a. L( f. c] and [c. then so is f + g. we have shown that f is integrable on [a. . 4 This is because for every x. b]. P ) c for every partition P of [a. we observe that if f and g are bounded functions on a set A. P ) f ≤ U( f. then4 inf{ f (x) + g(x) : x ∈ A} ≥ inf{ f (x) : x ∈ A} + inf{g(x) : x ∈ A} sup{ f (x) + g(x) : x ∈ A} ≤ sup{ f (x) : x ∈ A} + sup{g(x) : x ∈ A}. (More precisely. and b b b ( f + g) = a a f+ a g. we have only deﬁned the integral on [a. P ) ≤ it follows that c b L( f. Next.) Since there is only one b such number. it is equal to a f . but it is easy to see that this implies for every partition. b]. I (39 hrs. 2009) Comment: So far. P) ≤ a f+ c f ≤ U( f. so that the right-hand side is a lower bound of f + g. Proof : First of all. P ) ≤ a b f ≤ U( f.

for every partition P. P ) < Take P = P ∪ P . P) + U(g. P) + L(g. P) − L(g. 5 If |x| < for every > 0 then x = 0. P ) − L( f. P) + U(g. b (4.1). P ) − L(g. P) ≤ a ( f + g) ≤ U( f + g. P) ≤ U( f. P) ≤ U(g. P) ≤ [U( f. P) − L( f. P)] − [L( f. Now. then 2 U(g. P) − L(g. L( f + g. such that U( f. 2 and it remains to substitute into (4. P) + L(g. 2 U( f. U( f + g. It follows that b b b ( f + g) − a a f− a g ≤ [U( f. P) + U(g. P) ≤ U( f. Given > 0 there exist partitions P . P ) − L( f. P ) < . P ) − L(g. P) ≥ L( f. P)] + [U(g. P) + U(g. and also L( f. it follows that it is I Since the right-hand side can be made less than zero5 . P) − L( f + g. P). P ) < L( f. P ) < .1) 2 and U(g. P) ≤ U( f. P)]. P) U( f + g. P . . P) ≤ L( f + g. to prove that f + g is integrable. We are practically done. hence. P) + L(g. P) + L(g. P).146 Chapter 4 This implies that for every partition P. P)]. P) ≤ a b b f+ a g ≤ U( f. P) − L( f. P). for every .

b] and that (44 hrs.5 Recall the Riemann function.4 Denote by Show that b the upper integral. R = 0 (the lower integral). Proof : We saw that (g − f ). Then we only need to use the additivity of the integral for g = f + (g − f ). which is non-zero only at a ﬁnite number of points. Conclude that R is integrable on [a. 2010) b a R = 0. b]. Show that b a x = p/q ∈ Q x Q. b] be a gievn interval. Also. Show that Z is a ﬁnite set. is integrable and its integral is zero.1 If f is integrable on [a. I Exercise 4. b] : R(x) > }. . Let f. ﬁnd two functions f. 1/q R(x) = 0 Let [a. Let > 0 and deﬁne Z = {x ∈ [a.Integration theory 147 Corollary 4. Exercise 4. b] and g diﬀers from f only at a ﬁnite number of points then b b f = a a g. g be bounded on [a. b b ( f + g) ≤ a a f+ a g. P) < . Show that there exists a partition P such that U(R. g for which the inequality is strong.

such that U( f. and this number is by deﬁnition a (c f ). L(− f. ≤ U(c f. Consider then the case where c < 0. P) = −U( f. P) = c L( f. . For every partition P. P) = c L( f. Since f is integrable on [a. Since f is integrable on [a. P) < . Proof : Start with the case where c > 0. P) = −(L( f. P). P). then there exists for every > 0 a partition P. P) = −L( f. P) − L(c f. then b b (c f ) = c a a f. P). in fact.7 If f is integrable on [a. hence U(− f. we note that for every partition P. such that U( f. P) − L(− f. b]. P) − L( f. it is suﬃcient to consider the case of c = −1. b L(c f. then c f is integrable on [a. b] and a b b (c f ) = c a f. P) − L( f. P)) < . P) < . P) − U( f. P) and U(− f. P) ≤ b tions P. c hence U(c f. b].148 Chapter 4 Theorem 4. P) for all parti- Since there is only one number such that L(c f. b] and c ∈ R. which proves that c f is integrable on [a. Then. L(c f. since for arbitrary c < 0 we write c f = (−1)|c| f . then there exists for every > 0 a partition P. P) = c U( f. P) and U(c f. P) ≤ c a f ≤ c U( f. P) = U(c f. b]. P) < . For every partition P.

j . we can add to each a constant to make it positive and then use the “arithmetic of integrability”). (45 hrs. j j hence m fj mg ≤ m fj g j and M jf g ≤ M jf M g . Proof : We will prove this theorem for the case where f. we note that for every partition P. P) ≤ − a f ≤ −L( f. Let P be a partition. The generalization to arbitrary functions is not hard (for example. g ≥ 0. b]. We also deﬁne M f = sup{| f (x)| : a ≤ x ≤ b} M g = sup{|g(x)| : a ≤ x ≤ b}. Then. P) = U(− f. b L(− f. then so is their product f · g.Integration theory 149 which proves that (− f ) is integrable on [a. 2010) This last theorem is in fact a special case of a more general theorem: Theorem 4. I and the rest is as above. m fj mg ≤ f (x)g(x) ≤ M jf M g . It is useful to introduce the following notations: m fj = inf{ f (x) : x j−1 ≤ x ≤ x j } mg = inf{g(x) : x j−1 ≤ x ≤ x j } j m fj g = inf{ f (x)g(x) : x j−1 ≤ x ≤ x j } M jf = sup{ f (x) : x j−1 ≤ x ≤ x j } M g = sup{g(x) : x j−1 ≤ x ≤ x j } j M jf g = sup{ f (x)g(x) : x j−1 ≤ x ≤ x j }. P). b]. since f and g are bounded. P) = −U( f.8 Suppose that both f and g are integrable on [a. For every x j−1 ≤ x ≤ x j .

P0 ) ≥ m(b − a) and U( f. b]. then b m(b − a) ≤ a f ≤ M(b − a). P0 ) ≤ a f ≤ U( f. P)] + M f [U(g. P) − L( f.10 The function F deﬁned above is continuous on [a. It is then a simple task to show that the integrability of f and g implies the integrability of f g. Proof : Let P0 = {x0 . b] and m ≤ f (x) ≤ M for all x. Theorem 4. j j j j j j j This inequality implies that U( f g. P0 ). P) ≤ M g [U( f. then L( f. b]. P) − L(g. xn }. I Suppose now that f is integrable on [a. P)]. P0 ) ≤ M(b − a). Deﬁne for every x. I Theorem 4. .9 If f is integrable on [a. x F(x) = a f. P) − L( f g. We proved that it is integrable on every sub-segment.150 It further follows that Chapter 4 M jf g −m fj g ≤ M jf M g −m fj mg ≤ (M jf −m fj )M g +m fj (M g −mg ) ≤ M g (M jf −m fj )+M f (M g −mg ). and the proof follows from the fact that b L( f.

we take δ = /M.7 Show that if f is bounded and integrable on [a.e. Show that cb b f =c ca a g. Deﬁne M = sup{| f (x)| : a ≤ x ≤ b}.8 Let a < b and c > 0 and let f be integrable on [ca. Show that for every p > 0. then | f | is integrable on [a. By the previous theorem. and then |F(y) − F(x)| < whenever |x − y| < δ. .6 Let f be bounded and integrable in [a. 2009) Exercise 4.. i. and in particular. b 1/p | f |p a ≤ M|b − a|1/p . That is. b]. Exercise 4. −M(y − x) ≤ F(y) − F(x) ≤ M(y − x). that the integral on the left hand side exists. cb]. Then for |x − y| < δ (suppose wlog that y > x). (41 hrs. Given > 0 take δ = /M. |F(y) − F(x)| ≤ M(y − x) < . given > 0. b]. Exercise 4. y x y 151 F(y) − F(x) = a f− a f = x f.Integration theory Proof : Since f is integrable it is bounded. b] and b b f ≤ a a | f |. where g(x) = f (cx). b] and denote M = sup{| f (x)| : a ≤ x ≤ b}. I which proves that F is (uniformly) continuous on [a. since −M ≤ f (x) ≤ M for all x.

then 1/ f is integrable on [a. b]. Exercise 4. .9 Let f be integrable on [a. x2 ≤ b} = sup{| f (x1 )− f (x2 )| : a ≤ x1 . b] and g(x) = f (x − c). Prove that if f is integrable on [a. Hermann Hankel (1839-1873) proved that if a function is integrable. but had an error in his .11 A function s : [a. such that b b f− a a g< . where ωi is the variation i=1 of f in the segment [xi−1 . > 0a Show that if f is integrable on [a. Show that if f is integrable on [a. b]. Show that g is b b+c f = a a+c g. Show that f is integrable on [a.10 Let f : [a. Discussion: A question that occupied many mathematicians in the second half of the 19th century is what characterizes integrable functions. xi ]. b] if and only if there exists for every > 0 a partition P = {x0 . Prove that ω f = sup{ f (x1 )− f (x2 ) : a ≤ x1 . such that b b b b > 0 two f− a a s1 < and a s2 − a f < . b]. b] as ω f = sup{ f (x) : a ≤ x ≤ b} − inf{ f (x) : a ≤ x ≤ b}. b] → R be bounded. b] then there exists for every continuous function g. b] → R is called a step function ( ) if there exists a partition p such that s is constant on every open interval (we do not care for the values at the nodes). xn }. and there exists an m > 0 such that | f (x)| ≥ m for all x ∈ [a. b] then there exist for every step functions. such that n ωi < . g ≤ f . b + c] and Exercise 4. then it is necessarily continuous on a dense set of points (he also proved the opposite. s1 ≤ f and s2 ≥ f . . . and deﬁne the variation of f in [a. Exercise 4. x2 ≤ b}.152 Chapter 4 integrable on [a + c. .

x) = inf{ f (y) : x < y < c} M(c. Comment: The theorem states that in a certain way the integral is an operation inverse to the derivative. b). Proof : We need to show that lim ∆F. b]. (44 hrs. For example. If we deﬁne m(c. For x > c. b] and deﬁne for every x ∈ [a. if f (x) = 0 everywhere except for the point x = 1 where it is equal to one.Integration theory 153 proof).c = f (c). c We are going to show that this holds for the right derivative. F(x) − F(c) = x c f. x) = sup{ f (y) : x < y < c}. then F is diﬀerentiable at c and F (c) = f (c).3 The fundamental theorem of calculus )) Let f be intex Theorem 4. Note the condition on f being continuous. the same arguments can then be applied for the left derivative. If f is continuous at c ∈ (a. 2011) 4. then F(x) = 0. The ultimate answer is due to Henri Lebesgue (1875-1941) who proved that a bounded function is integrable if and only if its points of discontinuity have measure zero (a notion not deﬁned in this course). F(x) = a f. and it is not true that F (1) = f (1).11 (( grable on [a. .

a .c (x) ≤ M(c. (∀ > 0)(∃δ > 0) : (∀c < y < c + x)(| f (y) − f (c)| < /2). x) ≤ ∆F. Chapter 4 m(c. hence x f = F(x) = g(x) + c. It remains that the right-hand side as a limit zero at c. (∀ > 0)(∃δ > 0) : (∀c < y < c + δ)(| f (y) − f (c)| < /2). or m(c. I Corollary 4.c (x) − f (c) < . x) − f (y)|.2 (Newton-Leibniz formula) If f is continuous on [a. x) − f (c)| ≤ /2 < e and |M(c. We have also seen that it implies that F = g + c. x) − f (y)|} . x)(x − c). b] and f = g for some function g. x)(x − c) ≤ F(x) − F(c) ≤ M(c. then b f = g(b) − g(a). x) − f (c)| ≤ /2 < e. ∆F. Then. b]).154 then. To conclude. |M(c. Since f is continuous at c. ∆F.c (x) − f (c) ≤ max {|m(c. which concludes the proof. for some constant c. a Proof : We have seen that F(x) = a x f satisﬁes the identity F (x) = f (x) for all x (since f is continuous on [a. x). for every > 0 there exists a δ > 0 such that for all c < x < c + δ. For c < x < c + δ. hence |m(c.

(43 hrs. esin(t) dt. . such that inf{ f (x) : a ≤ x ≤ b} ≤ λ ≤ sup{ f (x) : a ≤ x ≤ b}. b) → R by h(x) F(x) = g(x) f.14 Let f : R → R be continuous. Setting x = b we get the desired result. I This is a very useful result. we have an easy way to calculate the integral of f . 0 x2 x Exercise 4. b). b]. 1 2δ x+δ f. h : (a. 2009) (47 hrs. For every δ > 0 we deﬁne Fδ (x) = Show that for every x ∈ R. cos(t) dt. It means that whenever we know a pair of functions f. Show that there exists a λ ∈ R. but no primitive function ( ) g is known. On the other hand. Show that F is diﬀerentiable. + δ→0 Exercise 4. 2010) Exercise 4.15 Let f. Exercise 4.12 Let f : R → R be continuous and g. b] → R be integrable. x−δ f (x) = lim Fδ (x). it is totally useless if we want to calculate the integral of f . such that g = f . and calculate its derivative. such that g(x) ≥ 0 for all x ∈ [a. g.Integration theory 155 but since F(a) = 0 it follows that c = −g(a).13 Diﬀerentiate the following functions: 1 0 ex cos(t2 ) dt. b) → R diﬀerentiable. Deﬁne F : (a. Show that F is well-deﬁned for all x ∈ (a. g : [a.

Show that if f is continuous. If the diﬀerence is . ξ x a Exercise 4. P). b]. b]. b]. then there exists a c ∈ [a. P) ≤ a f ≤ U( f. Show that there exists a ξ ∈ [a. b] → R be non-negative and integrable. b]. b] → R be integrable. 4.) Exercise 4. b] → R be monotonic and let g : [a. If f is diﬀerentiable at c then F is diﬀerentiable at c.156 and Chapter 4 b b ( f g) = λ a a g.16 Let f : [a. and F(x) = Prove or disprove: f. If f is diﬀerentiable at c then F is continuous at c. c ∈ (a. b). it follows that if the diﬀerence between the upper and lower sum is less than . however we do not know any primitive function of f . Let f be a given integrable function on [a. such that b ξ b ( f g) = f (a) a a g + f (b) g. such that b b ( f g) = f (c) a a g. then their average diﬀers from the integral by less that /2. (This is known as the integral mean value theorem. Suppose that we are content with an approximation of this integral within a maximum error of . Take a partition P: since b L( f. b b Hint: distinguish between the cases where a g > 0 and a g = 0.4 Riemann sums Let’s ask a practical question. and suppose we want to calculate the integral of f on [a.17 Let f : [a. If f is diﬀerentiable in a neighborhood of c and f is continuous at c then F is twice diﬀerentiable at c.

Proof : It is enough to show that given > 0 there exists a δ > 0 such that |U( f. Its evaluation only requires function evaluations. If the partition is suﬃciently ﬁne. The sum in the middle is called a Riemann sum. P) < .12 Let f be an integrable function on [a. without any inf or sup. then n b f (ti )(xi−1 − xi ) − i=1 a f < . it follows that if in every interval in the partition we take some point ti (a sample point). such that n b > 0 there f (ti )(xi−1 − xi ) − i=1 a f < for every partition P such that diam P < δ and any choice of points ti . we can keep reﬁning until they diﬀer by less than . since mi ≤ f (x) ≤ Mi for all xi−1 ≤ x ≤ xi . What we are going to show is that Riemann sums converge to the integral as we reﬁne the partition. then n L( f. For every exists a δ > 0. up to the fact that we have to compute lower and upper sums. such that U( f.Integration theory 157 greater than . P) − L( f. P) − L( f. then it can be reduced by reﬁning the partition. and so we obtain an approximation to the integral within the desired accuracy. P)| < whenever diam P < δ. irrespectively on how we sample the points ti . b]. Deﬁnition 4. Note that we don’t know a priori whether a Riemann is greater or smaller than the integral. Inﬁmums and supremums may be hard to calculate. . P). There is one drawback in this plan. Thus.5 The diameter of a partition P is diam P = max{xi − xi−1 : 1 ≤ i ≤ n}. Theorem 4. On the other hand. P) ≤ i=1 f (ti )(xi−1 − xi ) ≤ U( f.

. The contribution of these terms is at most /2. and take some partition P = {y0 . n δ< . 1] → R deﬁned as ϕ : x → 1 1 − x2 . P) = i=1 (Mi − mi )(xi − xi−1 ) can be divided into two sums. y j ].6 The number π is deﬁned as 1 π=2 −1 √ 1 − x2 dx. P )| < . U( f. yk } for which |U( f. √ we have a function ϕ : [−1. This concludes the proof. One for which each of the intervals [xi−1 .158 Chapter 4 for any Riemann sum with the same partition will diﬀer from the integral by less than . . Deﬁnition 4. P ) − L( f.5 The trigonometric functions Discussion: Describe the way the trigonometric functions are deﬁned in geometry. Let M = sup{| f (x)| : a ≤ x ≤ b}. xi ] is contained in an interval [y j−1 . I 4. There are at most k such terms and each contributes at most Mδ. Comment: We have used here the standard notation of integrals. and explain that we want deﬁnitions that only rely on the 13 axioms of real numbers. . . 2 We set 4Mk For every partition P of diameter less than δ. and we deﬁne π=2 −1 ϕ. . In our notation. P) − L( f. The second sum includes those intervals for which xi−1 < y j < xi .

and express the area of the sector bounded by this segment and the positive x-axis. 159 √ Consider now a unit circle. with its inverse deﬁned on [0. A(−1) = π/2 and A(1) = 0. θ . Second. 1 A (x) = − √ . For −1 ≤ x ≤ 1 we look at the point (x. √ 1 x 1 − x2 + ϕ. First. . 2 or. It follows that it is invertible. This geometric construction is just the motivation.Integration theory This is well deﬁned because ϕ is continuous. and that A(x) = θ/2. formally. hence integrable.7 We deﬁne the functions cos : [0.. connect it to the origin. 2 1 − x2 i. cos θ = A−1 2 This observation motivates the way we are going to deﬁne the cosine function. then we know that x = cos θ. A!x" x We now examine some properties of the function A. A(x) = 2 x If θ is the angle of this sector. we deﬁne a function A : [−1. 1] → R. π] → R by cos x = A−1 (x/2) and sin x = √ 1 − cos2 x.e. 1 − x2 ). Thus. π/2]. Deﬁnition 4. the function A is monotonically decreasing. π] → R and sin : [0. θ A(cos θ) = .

13 For 0 < x < π. We now need to show that the sine and cosine functions satisfy all known diﬀerential and algebraic properties. Since y A(cos y) = .160 Chapter 4 With these deﬁnitions in hand we deﬁne the other trigonometric functions. etc. sin x = cos x and cos x = − sin x. 2009) We are now in measure to start picturing the graphs of these two functions. (45 hrs. tan. 2 it follows that y = 2A(0) = 2 0 1 ϕ= π . 2 . I 1−(A−1 (x/2))2 The derivative of the sine function is then obtained by the composition rule. cos must vanish for some 0 < y < π. we have by the chain rule and the derivative of the inverse function formula. Proof : Since A is diﬀerentiable and A (x) 0 for 0 < x < π. by deﬁnition. Theorem 4. We also have cos π = A−1 (π/2) = −1 from which follows that sin π = 0 and sin 0 = 0. First. and cos 0 = A−1 (0) = 1. 1 1 1 1 cos x = (A−1 ) (x/2) = =− −1 (x/2)) 2 2 A (A 2 √ 2 1 1 = · · · = − sin x. cot. By the intermediate value theorem. It thus follows that cos is decreasing in this domain. sin x > 0 0 < x < π.

which implies that f is identically zero. It is easy to verify that the expressions for the derivatives of sine and cosine remain the same for all x ∈ R.2 Let f : R → R be twice diﬀerentiable. Then. Proof : Multiply the equation by 2 f . Since we deﬁned the trigonometric functions (more precisely their inverses) via integrals. The conditions at zero imply that c = 0. I . π = 1. and cos x = cos(2π − x) for π ≤ x ≤ 2π. for some c ∈ R. cos π =0 2 and sin We can then extend the deﬁnition of the sine and cosine to all of R by setting sin x = − sin(2π − x) and sin(x + 2πk) = sin x and cos(x + 2πk) = cos x for all k ∈ Z. then 2 f f + 2 f f = [( f )2 + f 2 ] = 0. it is not surprising that it was straightforward to diﬀerentiate them.Integration theory where we have used the fact that by symmetry. 2 Thus. f is identically zero. with f (0) = f (0) = 0. 0 1 161 ϕ= −1 0 ϕ. from which we deduce that ( f )2 + f 2 = c. A little more subtle is to derive their algebraic properties. and satisfy the diﬀerential equation f + f = 0. Lemma 4.

Then.15 For every x. f + f = 0 and f (0) = sin y and It follows from the previous theorem that f (x) = sin y cos x + cos y sin x. I .14 Let f : R → R be twice diﬀerentiable. Proof : Fix y and deﬁne f : x → sin(x + y). Proof : Deﬁne g = f − a cos −b sin. hence f = a cos +b sin. sin(x + y) = sin x cos y + cos x sin y cos(x + y) = cos x cos y − sin x sin y. It follows from the lemma that g is identically zero. I Theorem 4. f (0) = cos y. with f (0) = a and f (0) = b. then.162 Chapter 4 Theorem 4. g + g = 0. from which follows that f : x → cos(x + y) f : x → − sin(x + y). Thus. We proceed similarly for the cosine function. and satisfy the diﬀerential equation f + f = 0. f = a cos +b sin . y ∈ R. and g(0) = g (0) = 0.

Integration theory

163

4.6

The logarithm and the exponential

Our (starting) goal is to deﬁne the function f : x → 10 x for x ∈ R, along with its inverse, which we denote by log10 . For n ∈ N we have an inductive deﬁnition, which in particular, satisﬁes the identity 10n 10m = 10n+m . We then extend the deﬁnition of the exponential (with base 10) to integers, by requiring the above formula to remain correct. Thus, we must set 100 = 1 and 10−n = 1 . 10n

The next extension is to fractional powers of the form 101/n . Since we want 101/n × 101/n × · · · × 101/n = 101 = 10,

n times

√ n this forces the deﬁnition 101/n = 10. Then, we extend the deﬁnition to rational powers, by 101/n × 101/n × · · · × 101/n = 10m/n .

m times

We have proved in the beginning of this course that the deﬁnition of a rational power is independent of the representation of that fraction. The question is how to extend the deﬁnition of 10 x when x is a real number? This can be done with lots of “sups” and “infs”, but here we shall adopt another approach. We are looking for a function f with the property that f (x + y) = f (x) f (y), and f (1) = 10. Can we ﬁnd a diﬀerentiable function that satisﬁes these properties? If there was such a function, then f (x) = lim

h→0

f (h) − 1 f (x + h) − f (x) f (x) f (h) − f (x) = lim = lim f (x). h→0 h→0 h h h

164

Chapter 4

Suppose that the limit on the right hand side exists, and denote it by α. Then, f (x) = α f (x) and f (1) = 10.

This is not very helpful, as the derivative of f is expressed in terms of the (unknown) f . It is clear that f should be monotonically increasing (it is so for rational powers), hence invertible. Then, ( f −1 ) (x) = From this we deduce that

x

1 f ( f −1 (x))

=

1 α f ( f −1 (x))

=

1 . αx

f −1 (x) =

1

dt = log10 (x). αt

We seem to have succeeded in deﬁning (using integration) the logarithm-base-10 function, but there is a caveat. We do not know the value of α. Since, in any case, the so far distinguished role of the number 10 is unnatural, we deﬁne an alternative logarithm function:

**Deﬁnition 4.8 For all x > 0,
**

x

log(x) =

1

dt . t

0 < x < ∞.

Comment: Since 1/x is continuous on (0, ∞), this integral is well-deﬁned for all

**Theorem 4.16 For all x, y > 0,
**

log(xy) = log(x) + log(y).

Proof : Fix y and consider the function f : x → log(xy).

Integration theory Then, f (x) = Since log (x) = 1/x, it follows that log(xy) = log(x) + c. The value of the constant is found by setting x = 1. 1 1 ·y= . xy x

165

I

**Corollary 4.3 For integer n,
**

log(xn ) = n log(x).

Proof : By induction on n.

I

**Corollary 4.4 For all x, y > 0,
**

log x = log(x) − log(y). y

Proof : log(x) = log x x y = log + log(y). y y I

**Deﬁnition 4.9 The exponential function is deﬁned as
**

exp(x) = log−1 (x).

**Comment: The logarithm is invertible because it is monotonically increasing, as
**

its derivative is strictly positive. What is less easy to see is that the domain of the exponential is the whole of R. Indeed, since log(2n ) = n log(2),

166

Chapter 4

it follows that the logarithm is not bounded above. Similarly, since log(1/2n ) = −n log(2), it is neither bounded below. Thus, its range is R.

Theorem 4.17

exp = exp .

Proof : We proceed as usual, exp (x) = (log−1 ) (x) = 1 = log−1 (x) = exp(x). log (log−1 (x)) I

**Theorem 4.18 For every x, y,
**

exp(x + y) = exp(x) · exp(y).

Proof : Set t = exp(x) and s = exp(y). Thus, x = log(t) and y = log(s), and x + y = log(t) + log(s) = log(ts). Exponentiating both sides, we get the desired result. I

Deﬁnition 4.10

e = exp(1). There is plenty more to be done with exponentials and logarithms, like for example, show that for rational numbers r, er = exp(r), and the passage between bases. Due to the lack of time, we will stop at this point.

. in the sense that F = f .Integration theory 167 4. b f. trigonometric.7 Integration methods Given a function f . and substitution—both we will study. a question of practical interest is how to calculate integrals. we must know many pairs of functions f. F. then the answer is known. In this section we study techniques that in some cases allow to express a primitive function in terms of elementary functions. There are basically two techniques—integration by parts. a To eﬀectively use this formula. for b b b b b ( f + g) = a a f+ a g and a (c f ) = c a f. b a f = F(b) − F(a) = F|b . which is a primitive of f . Functions that can be expressed in terms of powers. a If we know a function F. Here is a table of pairs f. F. exponential and logarithmic functions and called elementary. which we already know: f (x) 1 xn cos x sin x 1/x exp(x) sec2 (x) 1/(1 + x2 ) √ 1/ 1 − x2 F(x) x n+1 x /(n + 1) sin x − cos x log(x) exp(x) tan(x) arctan(x) arcsin(x) def Sums of such functions and multiplication by constants are easily treated with our integration theorems.

we have that Primitive( f g ) = f Primitive(g ) − Primitive( f Primitive(g)) + const. Example: Consider the function F(x) = x arctan(x) − It is easily checked that f (x) = F (x) = arctan(x). It is however not always the case that the primitive of an elementary function is itself and elementary function. setting b = x and noting that primitives are deﬁned up to constants. Theorem 4. a a The question is how could have we guessed this answer if the function F had not been given to us. b 1 log(1 + x2 ).168 Chapter 4 Comment: Every continuous function f has a primitive. x F(x) = a f. g on the righthand side may be any primitive of g . f and g are given. Then. where a is arbitrary. then b b )) Let f. g have con- f g = ( f g)|b − a a a f g. Comment: When we use this formula. Also. . 2 arctan(x) dx = F|b . As a result of this observation.19 (Integration by parts ( tinuous ﬁrst derivatives.

a Example: The back-to-original-problem-trick b (1/x) log(x) dx. a Theorem 4. a Once we have used this method to get new f. F pairs. f (a) (g ◦ f ) f = a Proof : Let G be a primitive of g. b f (b) )) For continuously diﬀerg. a Example: The 1-trick: b log(x) dx. then (G ◦ f ) = (g ◦ f ) f . . we solve b log2 (x) dx.Integration theory 169 Example: b x e x dx.20 (Substitution formula ( entiable f and g. we use this knowledge to obtain new such pairs: Example: Using the fact that a primitive of log(x) is x log(x) − x.

170 hence a Chapter 4 b b f (b) (g ◦ f ) f = a (G ◦ f ) = (G ◦ f )|b = G| ff (b) = a (a) g. f (a) I The simplest use of this formula is by recognizing that a function is of the form (g ◦ f ) f . hence b sin(b) sin5 (x) cos(x) dx = a sin(a) t5 dt = t6 6 sin(b) . a If f (x) = sin(x) and g(x) = x5 . Example: Take the integral b sin5 (x) cos(x) dx. sin(a) (46 hrs. 2009) . then sin5 (x) cos(x) = g( f (x)) f (x).

Comment: Let a : N → R. . . The essential fact is that to each natural number corresponds a real number. Rather than writing a(3). for example. Notation: We usually denote a sequence a : N → R by (an ).1 Basic deﬁnitions An (inﬁnite) sequence ( ) is an inﬁnite ordered set of real numbers. n=1 Examples: a : n → n or an = n. Comment: As for functions. we often refer to “the sequence 1/n”. . a1 . where we often refer to “the function x2 ”. a2 . . a(17). we write a3 . a3 . . Another ubiquitous notation is (an )∞ . We denote sequences by a letter which labels the sequence. Having said that.Chapter 5 Sequences 5.1 An inﬁnite sequence is a function N → R. and a subscript which labels the position in the sequence. a17 . we can deﬁne: Deﬁnition 5.

a “cleaner” notation for the limit of a sequence (an ) would be lim a. . ∞ . In the ﬁrst and fourth cases we would say that “the sequence goes ( ) to inﬁnity”. 11. More useful often is to mark the elements of the sequence on the number axis.172 Chapter 5 b : n → (−1)n or bn = (−1)n . the sequence of primes. . 5. ¯ ) if it converges to some ) (there is no such thing Comment: There is a close connection between the limit of a sequence and the limit of a function at inﬁnity. c : n → 1/n or cn = 1/n. 7. Like functions. otherwise it is called divergent ( as convergence to inﬁnity). In the third case we would say that “it goes to zero”.2 A sequence (an ) converges ( n→∞ if (∀ > 0)(∃N ∈ N) : (∀n > N)(|an − | < ). 3. Deﬁnition 5. Deﬁnition 5. sequences can be graphed.2 Limits of sequences ¯ ) to . We will make such statements precise momentarily. We will elaborate on that later on. }. (dn ) = {2. denoted lim an = . . Notation: Like for functions. In the second case we would say that “it jumps between −1 and 1”. a1 a5 a3 a2 a4 5.3 A sequence is called convergent ( (real) number.

One way is by an algebraic trick. Example: Let a : n → n + 1 − n. . . 4n3 − 8n + 63 . ∞ since. . . 2 ξ for some x < ξ < y. Example: Let now a:n→ 3n3 + 7n2 + 1 . 3 − 2. . . The same inequality can be obtained by deﬁning √ a function f : x → x. √ y−x √ y− x= √ . however showing it requires some manipulations. and using the mean-value theorem. . 2 3 4 It is easy to show that lim a = 0.Sequences 173 Example: Let a : n → 1/n. Substituting x = n and y = n + 1. . (∀ > 0)(set N = 1/ ) then (∀n > N) |an | = 1 1 < < n N . or (an ) = It seems reasonable that lim a = 0. n+1+ n n+1+ n 2 n from which it is easy to proceed. . 4 − 3. . √ √ √ √ ( n + 1 − n)( n + 1 + n) 1 1 an = = √ √ √ √ < √ . or 1 1 1 (an ) = 1. ∞ √ √ √ √ √ √ √ √ 2 − 1. we obtain the desired inequality.

n n4 +1 . and ∞ ∞ ∞ lim(a + b) = lim a + lim b lim(a · b) = lim a · lim b. n3 −3 √ n2 + 1 − n.1 For each of the following sequence determine whether it is converging or diverging. Exercise 5. Then the sequences (a + b)n = an + bn and (a · b)n = an bn are also convergent. 4 − 8/n2 + 63/n3 and use limit arithmetic as stated below. proving it is not totally straightforward. Theorem 5. we do not care if a ﬁnite number of elements is not deﬁned. b lim∞ b Comment: Since the limit of a sequence is only determined by the “tail” of the sequence. ∞ ∞ ∞ If furthermore lim∞ b 0. Prove your claim using the deﬁnition of the limit: an = an = an = √ 7n2 + n 2 +3n . the sequence (a/b) converges and lim ∞ 0 for all a lim∞ a = .1 Let a and b be convergent sequences. One way to proceed is to note that 3 + 7/n + 1/n3 an = .174 It seems reasonable that Chapter 5 3 lim a = . Proof : Easy once you’ve done it for functions. ∞ 4 but once again. I . then there exists an N ∈ N such that bn n > N.

| f (x) − | ≤ max |a x − |. ∞ We now establish the analogy between limits of sequences and limits of functions. |a x − | < . ∞ then lim(a + b) = L + M. and this is in particular true for x ∈ N. I Conversely. then. Conversely. such that lim a = L ∞ and lim b = M. Let a be a sequence. ∞) → R by f (x) = an + (an+1 − an )(x − n) n ≤ x ≤ n + 1. ∞ Proof : Suppose ﬁrst that the sequence converges. f (x) lies between f ( x ) and f ( x ). Thus. it follows that for all x > N. .1 lim a = ∞ if and only if lim f = . Since for every x.2 Show that if a and b are converging sequences. (∀ > 0)(∃N ∈ N) : (∀x > N)(| f (x) − | < ). Proposition 5. If f (x) converges as x → ∞. given a function f : R → R. (∀ > 0)(∃N ∈ N) : (∀n > N)(|an − | < ). hence (∀ > 0)(∃N ∈ N) : (∀x > N)(| f (x) − | < ). and deﬁne a function f : [1. then so does the sequence (an ). it is often useful to deﬁne a sequence an = f (n). if f converges to as x → ∞.Sequences 175 Exercise 5.

and an = f (n). Theorem 5. Since log a < 0 it follows that lim x→∞ f (x) = 0.176 Chapter 5 Example: Let 0 < a < 1. Proof : Suppose ﬁrst that lim f = . Then n→∞ lim an =? f (x) = e x log a . ∞ where the composition of a function and a sequence is deﬁned by [ f (a)]n = f (an ).2 Let f be deﬁned on a (punctured) neighborhood of a point c. such that an lim a = c. we may deﬁne convergence of sequences to ±∞. for every sequence a that converges to c. and then lim f (a) = . Comment: Like for functions. then Comment: This theorem provides a characterization of the limit of a function at a point in terms of sequences. c If a is a sequence whose entries belong to the domain of f . only that we don’t use the term convergence. Suppose that lim f = . This is known as Heine’s characterization of the limit. if lim∞ f (a) = limc f = . c . Conversely. ∞ c. hence the sequence converges to zero. then We deﬁne f (x) = a x . but rather say that a sequence approaches ( ) inﬁnity.

(∃ > 0) : (∀n ∈ N)(∃an : 0 < |an − c| < δn ) : (| f (an ) − | ≥ ). namely. Suppose. (∀ > 0)(∃δ > 0) : (∀x : 0 < |x − c| < δ)(| f (x) − | < ). 177 Let a be a sequence as speciﬁed in the theorem. Conversely.. In particular. This means that (∃ > 0) : (∀δ > 0)(∃x : 0 < |x − c| < δ) : (| f (x) − | ≥ ). ∞ (48 hrs. ∞ it follows that lim a = sin(13). i. which contradicts our assumptions. (∀ > 0)(∃N ∈ N) : (∀n > N)(| f (an ) − | < ). where bn = 13 + 1/n2 .e. either the limit does not exist or it is not equal to c). 2009) . an c and lim∞ a = c. I Example: Consider the sequence a : n → sin 13 + 1 . n2 This sequence can be written as a = sin(b).. Then (∀δ > 0)(∃N ∈ N) : (∀n > N)(0 < |an − c| < δ). suppose that for any sequence a as above lim∞ f (a) = . setting δn = 1/n. The sequence a converges to c but f (a) does not converge to . lim∞ f (a) = .e. Combining the two.Sequences Thus. by contradiction that lim f c (i. Since lim sin = sin(13) 13 and lim b = 13.

α + 1). |an | ≤ M Since b converges to zero. (∃N ∈ N) : (∀n > N)(α − 1 < an < α + 1). We need to show that there exists L1 . Set M = max{an : 1 ≤ n ≤ N} m = min{an : 1 ≤ n ≤ N}. Theorem 5. namely.178 Chapter 5 Theorem 5. Then n→∞ lim (ab) = 0. . (∀ > 0)(∃N ∈ N) : (∀n > N) (|an bn | ≤ M|bn | < ) . min(m.3 A convergent sequence is bounded.4 Let a be a bounded sequence and let b be a sequence that converges to zero. α − 1) ≤ an ≤ max(M. I M . Proof : Let M be a bound on a. (∀ > 0)(∃N ∈ N) : (∀n > N) |bn | < Thus. ∀n ∈ N. setting = 1. Proof : Let a be a sequence that converges to a limit α. By deﬁnition. (∃N ∈ N) : (∀n > N)(|an − α| < 1). L2 such that L1 ≤ an ≤ L2 ∀n ∈ N. which implies that the sequence ab converges to zero. or. I then for all n.

a Proof : By deﬁnition. I Theorem 5. ∞ Then lim ∞ 1 = 0.Sequences 179 Example: The sequence a:n→ converges to zero.6 Let a be a sequence such that lim |a| = ∞. which concludes the proof. which concludes the proof. (∀M > 0)(∃N ∈ N) : (∀n > N)(0 < |an | < 1/M). ∞ Then lim ∞ 1 = ∞. (∀ > 0)(∃N ∈ N) : (∀n > N)(|an | > 1/ ). I . hence (∀ > 0)(∃N ∈ N) : (∀n > N)(0 < 1/|an | < ).5 Let a be a sequence with non-zero elements. such that lim a = 0. sin n n Theorem 5. |a| Proof : By deﬁnition. hence (∀M > 0)(∃N ∈ N) : (∀n > N)(1/|an | > M).

I Corollary 5. then we still only have α ≤ β. Then there exists an N ∈ N. Proof : By contradiction.8 Suppose that a and b are convergent sequences. Then α ≤ β. Even though an < bn for all n. ∞ and there exists an N ∈ N. the sequence b is eventually greater (term-by-term) than the sequence a. 2 Thus. such that an ≤ bn for all n > N. β ∈ R. such that bn > an for all n > N. I Comment: If instead an < bn for all n > N..e. (∃N1 ∈ N) : (∀n > N1 ) an < α + 1 (β − α) . Theorem 5.180 Chapter 5 Theorem 5. Proof : Since (β − α)/2 > 0. i.1 Let a be a sequence and α. both converge to the same limit. ∞ and β > α. . 2 and (∃N2 ∈ N) : (∀n > N2 ) bn > β − 1 (β − α) . using the previous theorem. (∃N1 .7 Suppose that a and b are convergent sequences. Take for example the sequences an = 1/n and bn = 2/n. N2 )) (bn − an > 0) . If the sequence a converges to α and α > β then eventually an > β. lim a = α ∞ and lim b = β. N2 ∈ N) : (∀n > max(N1 . lim a = α ∞ and lim b = β.

N2 ))(− < an − ≤ cn − ≤ bn − < ).Sequences 181 Theorem 5. (∀ > 0)(∃N1 . N2 ∈ N) : (∀n > max(N1 . 1< it follows that n→∞ 1 + 1/n < 1 + 1/n. x→1 n→∞ Exercise 5. Proof : It is given that (∀ > 0)(∃N1 ∈ N) : (∀n > N1 )(|an − | < ). or (∀ > 0)(∃N ∈ N) : (∀n > N)(|cn − | < ). ∞ for all n > N.3 . 1 + 1/n = 1.9 (Sandwich) Suppose that a and b are sequences that converge to the same limit . Thus. and (∀ > 0)(∃N2 ∈ N) : (∀n > N2 )(|bn − | < ). Let c be a sequence for which there exists an N ∈ N such that an ≤ cn ≤ bn Then lim c = . lim We can also deduce this from the fact that √ lim x = 1 and lim (1 + 1/n) = 1. I Example: Since for all n.

and ab is bounded but does not converge. b is unbounded. It ≥ an for all n. such that a converges to zero. the order in the set does not matter. We deﬁne similarly a Theorem 5. . I Example: Consider the sequence 1 a:n→ 1+ n n . and in particular. i. (Note that a supremum is a property of a set.. b is unbounded. and ab converges to a limit other then zero.4 A sequence a is called increasing ( is called non-decreasing ( ) if an+1 decreasing and a non-increasing sequence.10 (Bounded + monotonic = convergent) Let a be a non-decreasing sequence bounded from above. and ab converges to zero.) By the deﬁnition of the supremum. Then it is convergent. b is unbounded. Find sequences a and b. such that a converges to zero. such that a converges to zero.e. (∀ > 0)(∃N ∈ N) : (∀n > N)(α − < aN ≤ an ≤ α). Find sequences a and b.182 Chapter 5 Find sequences a and b. ) if an+1 > an for all n. (∀ > 0)(∃N ∈ N) : (∀n > N)(|an − α| < ). Since the sequence is non-decreasing. Deﬁnition 5. (∀ > 0)(∃N ∈ N) : (α − < aN ≤ α). Proof : Set α = sup{an : n ∈ N}.

i. . 1 The binomial formula is (a + b)n = n k=0 n k n−k ab . Deﬁnition 5. namely an = n. both the number of elements grows as well as their values. . we claim that this sequence is increasing. nk = 2k. It follows that n 1 exists lim 1 + n→∞ n (and equals to 2. such that n1 < n2 < · · · .Sequences We will ﬁrst show that an < 3 for all n.e. an2 .718 · · · ≡ e). Indeed1 . as an = 1 + 1 + 1 1 1 1 1− + 1− 2! n 3! n 1− 2 + ··· . . ) of a is any More formally. A subsequence ( sequence an1 .. such that bk = ank for all k ∈ N. b is a subsequence of a if there exists a monotonically increasing sequence of natural numbers (nk ). k . Example: Let a by the sequence of natural numbers. . 1 1+ n n 183 n 1 n 1 n 1 + + ··· + 2 1 n 2 n n nn n(n − 1) n! =1+1+ + ··· + 2 2! n n! nn 1 1 ≤ 1 + 1 + + ··· + 2! n! 1 1 ≤ 1 + 1 + + · · · + n−1 < 3. n When n grows. 2 2 =1+ Second. The subsequence b of all even numbers is bk = a2k .5 Let a be a sequence.

then the subsequence ank is decreasing.1 Any sequence contains a subsequence which is either non-decreasing or non-increasing. There are ﬁnitely many peak points: Then let n1 be greater than all the peak points. there exists an n2 . Deﬁnition 5. (50 hrs. 2009) In many cases. Since it is not a peak point. a peak poin! a peak poin! ) of the There are now two possibilities.6 A sequence a is called a Cauchy sequence if (∀ > 0)(∃N ∈ N) : (∀m. .2 (Bolzano-Weierstraß) Every bounded sequence has a convergent subsequence. such that an2 ≥ an1 .184 Chapter 5 Lemma 5. We will now provide such a convergence criterion. Continuing this way. n > N)(|an − am | < ). Proof : Let a be a sequence. we obtain a non-decreasing subsequence. Let’s call a number n a peak point ( sequence a if am < an for all m > n. There are inﬁnitely many peak points: If n1 < n2 < · · · are a sequence of peak points. we would like to know whether a sequence is convergent even if we do not know what the limit is. I Corollary 5.

This concludes the proof. which proves that the sequence is bounded. Cauchy assumed that sequences that get eventually arbitrarily close converge. (∃N ∈ N) : (∀n > N)(|an − aN+1 | < 1). 2009) There is something amusing about calling sequences satisfying this property a Cauchy sequence. Proof : One direction is easy2 . it follows that a has a converging subsequence. 2 I . n > N)(|an − am | < /2). By the Bolzano-Weierstraß theorem.Sequences 185 Theorem 5. We ﬁrst show that the sequence is bounded. Denote this subsequence by b with bk = ank and its limit by β. Combining the two. without being aware that this is something that ought to be proved. (∀ > 0)(∃N. i. n > N)(|an − am | ≤ |an − α| + |am − α| < ).e.. We will show that the whole sequence converges to β.11 A sequence converges if and only if it is a Cauchy sequence. hence (∀ > 0)(∃N ∈ N) : (∀m. and whereas by the convergence of the sequence b. (51 hrs. If a sequence a converges to a limit α. by the Cauchy property (∀ > 0)(∃N ∈ N) : (∀m. Indeed. (∀N ∈ N)(∃K ∈ N : nK > N) : (∀k > K)(|bk − β| < /2). Suppose now that a is a Cauchy sequence. K ∈ N : nK > N) : (∀n > N)(|an − β| ≤ |an − bk | + |bk − β| < ). then (∀ > 0)(∃N ∈ N) : (∀n > N)(|an − α| < /2). the sequence is a Cauchy sequence. Taking = 1.

1−q Comment: Note the terminology: sequence is summable = series is convergent. In this case we write ∞ a ¯ ) if the sequence of par- ak = lim S a . Note that S does not depend on the order of summation (a ﬁnite summation). Let a be a geometric sequence an = qn |q| < 1. Deﬁnition 5. . Often. hence this geometric sequence is summable and ∞ qn = n=1 q . We deﬁne the sequence of partial sums ( a by n a Sn = k=1 a ak . k=1 ∞ def The sequence of partial sum is called a series ( also to the limit of the sequence of partial sums.3 Inﬁnite series ¯ ) of Let a be a sequence. for which we have an explicit expresion: 1 − qn a Sn = q 1−q This series converges. If the sequence S a converges. the term series refers Example: The canonical example of a series is the geometric series. The geometric series is the sequence of partial sums.186 Chapter 5 5. ). we will interpret its limit as the inﬁnite sum of the sequence.7 A sequence a is called summable ( tial sums S converges.

I An important question is how to identify summable sequences. Proof : This is just a restatement of Cauchy’s convergence criterion for sequences. and the rest follows from limits arithmetic. even in cases where the sum is not known. such that |an+1 + an+2 + · · · + am | < for all m > n > N.12 Let a and b be summable sequences and γ ∈ R. For example. and ∞ ∞ (γan ) = γ n=1 n=1 an . Then the sequences a + b and γa are summable and ∞ ∞ ∞ (an + bn ) = n=1 n=1 an + n=1 bn . but a a S m − S n = an+1 + an+2 + · · · + am . n a+b Sn = k=1 a b (ak + bk ) = S n + S n . We know that the series converges if and only if a a (∀ > 0)(∃N ∈ N) : (∀m > n > N)(|S m − S n | < ). I . Theorem 5.Sequences 187 Theorem 5. This brings us to discuss convergence criteria. Proof : Very easy.13 (Cauchy’s criterion) The sequence a is summable if and only if for every > 0 there exists an N ∈ N.

b (∃M > 0) : (∀n ∈ N) 0 ≤ S n ≤ M . 2 3 4 5 6 7 8 ≥1/2 ≥1/2 This grouping of terms shows that the sequence of partial sums is unbounded. which therefore converges. Proof : Consider the sequence of partial sums. Convergence implies boundedness.14 A non-negative sequence is summable if and only if the series (i. an = 1/n. which is monotonically increasing. I The vanishing of the sequence is only a necessary condition for its summability. namely..3 If a is summable then lim a = 0. Monotonicity and boundedness imply convergence. For a while we will deal with non-negative sequences.e. The bound M bounds the series S a . ∞ Proof : We apply Cauchy’s criterion with m = n + 1. Proof : Since b is summable then the corresponding series S b is bounded. (∀ > 0)(∃N ∈ N) : (∀n > N)(|an+1 | < ). then the sequence a is summable. That is. I . The reason will be made clear later on. It is not suﬃcient as proves the harmonic series.15 (Comparison test) If 0 ≤ an ≤ bn for every n and the sequence b is summable. the sequence of partial sums) is bounded.188 Chapter 5 Corollary 5. a Sn = 1 + 1 1 1 1 1 1 1 + + + + + + +··· . Theorem 5. I Theorem 5.

2n 2 −2 2 so again. . based on the comparison test and the convergence of the geometric series we conclude that the series of a converges.Sequences 189 Example: Consider the following sequence: an = Since 2 + sin3 (n + 1) . Then a is summable if and only if b is summable. ≤ an ≤ n n ≥ 2. This can be shown by the following inequality an = 1 n+1 = 2 ≤ an . 0 < an ≤ Example: Consider now the sequence an = Here we have 2n 1 . it follows that the series of a converges. 2n and the series associated with b converges. − 1 + sin3 (n2 ) 1 2 1 ≤ n. 2n + n2 3 def = bn . 2009) Theorem 5. 0< Example: The next example is n+1 . The following theorem provides an even easier way to show that. since an “roughly behaves” like 1/n. such that lim ∞ a =γ b 0. n2 + 1 It is clear that the corresponding series diverges.16 (Limit comparison) Let a and b be positive sequences. (53 hrs. n n +n which proves that the series S a is unbounded.

b n→∞ n + 1 hence the series of a diverges. γ bn it follows that (∃N ∈ N) : (∀n > N) or (∃N ∈ N) : (∀n > N) (an < 2γbn ) . Now. setting bn = 1/n we ﬁnd lim ∞ n2 + n a = lim 2 = 1. . The roles of a and b can then be interchanged. n→∞ an (i) If r < 1 then (an ) is summable.17 (Ratio test ( lim )) Let an > 0 and suppose that an+1 = r. γb an <2 . aN+1 ≤ s aN . n N n ak = k=1 k=1 ak + a N k=N+1 sk−N . a Sn n n = Sa N + k=N+1 ak < Sa N + 2γ k=N+1 b bk ≤ S a + 2γS n . an In particular. (∃s : r < s < 1)(∃N ∈ N) : (∀n > N) an+1 ≤s . Proof : Suppose ﬁrst that r < 1. Proof : Suppose that b is summable. Since lim ∞ a = 1. then a is summable.190 Chapter 5 Example: Back to the previous example. I Theorem 5. Thus. N Since the right hand sides is uniformly bounded. Then. aN+2 ≤ s2 aN . and so on. (ii) If r > 1 then (an ) is not summable.

Note that x x→∞ lim 1 dt =∞ t x whereas x→∞ lim 1 dt < ∞. t2 Indeed. n These examples motivate. Conversely. the sequence (an ) is not even bounded (and certainly not summable). Then a 3 ∞ f = lim def x x→∞ f. . b] for some a and b > a. as show the two examples3 . ∞ n=1 1 <∞ n2 ∞ and n=1 1 = ∞.Sequences 191 The second term on the right hand side is a partial sum of a convergent geometric series. hence the partial sums of (an ) are bounded. i. there is a close relation between the convergence of the series and the convergence of the corresponding integrals. another comparison test. an = Comment: What about the case r = 1 in the above theorem? This does not provide any information about convergence.e. whereas the sequence bn = 1/n is not. which proves their convergence. then (∃s : 1 < s < r)(∃N ∈ N) : (∀n > N) an+1 ≥s .8 Let f be integrable on any segment [a. if r > 1. a This notation means that the sequence an = 1/n2 is summable. Deﬁnition 5. I Examples: Apply the ratio test for 1 n! rn bn = n! cn = n r n . however. an from which follows that an ≥ sn−N aN for all n > N..

Example: Consider the series ∞ n=1 1 np for p > 0. hence so does the series. Proof : The proof hinges on showing that x x x ak ≤ k=2 1 f ≤ k=1 ak . . p 1 t This integral converges as x → ∞ if and only if p > 1. Sequences of non-ﬁxed sign are a diﬀerent story.192 Chapter 5 Theorem 5.18 (Integral test) Let f be a positive decreasing function on [1. 2009) Example: One can take this further and prove that ∞ n=1 1 n(log n) p converges if and only if p > 1. and so does ∞ n=1 1 . (55 hrs. Then ∞ n=1 an converges if and only if ∞ 1 f exists. This is known as Bertrand’s ladder. and consider the integral x p=1 dt log x = (1 − p)−1 (x−p+1 − 1) p 1. n log n(log log n) p and so on. I TO COMPLETE. ∞) and set an = f (n). We apply the integral text. The case of non-positive sequences is the same as we can deﬁne bn = (−an ). We have dealt with series of non-negative sequences.

if and only if the two series formed by its positive elements and its negative elements both converge. n 2 4 6 . Deﬁne now4 a n a+ = n 0 4 an > 0 an ≤ 0 and a n a− = n 0 an < 0 . for every > 0 there exists an N ∈ N such that |an+1 + · · · + am | ≤ |an+1 | + · · · + |am | < whenever m. Indeed. 0.19 Every absolutely convergent series is convergent. − . Also. . n > N. A series that converges but does not converge absolutely is called conditionally convergent ( ¯ ). − . 2 3 4 Example: The series 1 1 1 + − + ··· . 0. − . It is a well-known alternating series. 0. 1− Theorem 5. Proof : The fact that an absolutely convergent series converges follows from Cauchy’s criterion. It is known as the Leibniz series. · · · . 3 5 7 converges conditionally as well. 0. then 1 1 a+ = 1. which Leibniz showed (without any rigor) to be equal to π/4. a series is absolutely convergent. 1 1 1 + − + ··· . if an = (−1)n+1 /n. 0. . · · · n 3 5 1 1 1 a− = 0. an ≥ 0 For example.Sequences 193 an is said to be absolutely convergent ( ¯ ) if |an | converges. ∞ n=1 Deﬁnition 5.9 A series ∞ n=1 Example: The series 1− converges conditionally.

an = a+ + a− . 2009) I Theorem 5.20 (Leibniz) Suppose that an is a non-increasing sequence of nonnegative numbers. n n n n a+ = n Thus. n hence if the two series of positive and negative terms converge so does the series of absolute values. Indeed. Proof : Simple considerations show that s2 ≤ s4 ≤ s6 ≤ · · · ≤ s5 ≤ s3 ≤ s1 . On the other hand. and conversely. a1 ≥ a2 ≥ · · · ≥ 0. Then the series ∞ (−1)n+1 an = a1 − a2 + a3 − a4 + · · · n=1 converges. s2n ≤ s2n + (a2n+1 − a2n+2 ) = s2n+2 = s2n+1 − a2n+2 ≤ s2n+1 . 2 |ak | = k=1 k=1 a+ − n k=1 a− . i. and n→∞ lim an = 0.e. . (56 hrs. n a+ = n k=1 n 1 2 1 2 n an + k=1 n 1 2 1 2 n |an | k=1 n a− = n k=1 an − k=1 |an | k=1 so that the other direction holds as well.194 Chapter 5 Then.. 1 (an + |an |) 2 n and n a− = n n 1 (an − |an |) . |an | = a+ − a− .

n→∞ n→∞ however a2n+1 converges to zero. n→∞ lim bn = b and n→∞ lim cn = c. N2 ∈ N) : (∀n > max(2N1 + 1. hence b = c. By limit arithmetic n→∞ lim (cn − bn ) = lim (s2n+1 − s2n ) = lim a2n+1 = c − b. ) of a sequence an is a sequence Theorem 5. but only a sketch. where f : N → N is one-to-one and onto. 2N2 ))(|sn − b| < ). I Deﬁnition 5. It follows that both subsequences converge.10 A rearrangement ( bn = a f (n) . ∞ n=1 Proof : I will not give a formal proof. i. whereas the sequence cn = s2n+1 is non-increasing and bounded from below. the sequence bn = s2n is non-decreasing and bounded from above.21 (Riemann) If an is conditionally convergent then for every α ∈ R there exists a rearrangement of the series that converges to α. (∀ > 0)(∃N1 .e.. It remains to show that if both the odd and even elements of a sequence converge to the same limit then the whole sequence converges to this limit.Sequences 195 Thus. Indeed. (∀ > 0)(∃N1 ∈ N) : (∀n > N1 )(|s2n+1 − b| < ) and (∀ > 0)(∃N2 ∈ N) : (∀n > N2 )(|s2n − b| < ). I .

.. 1 S = 2 ∞ < S < 1. 2 3 2 5 7 4 which directly proves that a rearrangement of the alternating harmonic series converges to three halves of its value. In this example the rearrangement is b3n+1 = a4n+1 b3n+2 = a4n+3 b3n = a2n . As m → ∞ this sequence converges to |an − a|. Then. 2n 2 4 6 8 10 which we can also pad with zeros. By limit n=1 (−1)n+1 1 1 1 1 1 = − + − + + ··· . n 2 3 4 5 1 2 We proved that this series does indeed converge to some arithmetic for sequences (or series). Think now of the left hand side as a sequence with index m with n ﬁxed.196 Chapter 5 Example: Consider the series ∞ S = n=1 (−1)n+1 1 1 1 1 = 1 − + − + + ··· . (∀ > 0)(∃N ∈ N) : (∀m. 1 1 1 1 1 1 S =0+ +0− +0+ +0− +0+ + ··· . 1 1 1 1 1 3 S = 1 + − + + − + . Comment: Here is an important fact about Cauchy sequences: let (an ) be a Cauchy sequence with limit a.. This implies that |an − a| ≤ for all n > N.. n > N)(|an − am | < ). 2 2 4 6 8 10 Using once more sequence arithmetic.

. since the sequence is absolutely convergent. the order of summation does not matter in an absolutely convergent series.Sequences 197 Theorem 5. ∞ |tm − sN | = k>N. For all m > M. 2 Take now M be suﬃciently large. and ∞ ∞ ∞ n=1 ∞ n=1 bn an = n=1 n=1 bn . In other word. aN appears among the b1 . . . Proof : Let bn = a f (n) and set n n sn = k=1 ak ∞ n=1 and tn = k=1 bk . so that each of the a1 . we can choose N large enough such that ∞ N ∞ |ak | − k=1 k=1 |ak | = k=N+1 |ak | < .e.. ∞ This proves that m→∞ lim tm = k=1 ak . . . Let > 0 be given. Since an converges. . then there exists an N such that ∞ sN − k=1 ak < . b M . 2 Moreover. . 2 i. f (k)≤m ak ≤ k=N+1 |ak | < . . ∞ ∞ ak − tm ≤ k=1 k=1 ak − sN + |sN − tm | ≤ . .22 If an converges absolutely then any rearrangement converges absolutely.

such that m n n m |a | |b | − |a | |b | < k k 2 k=1 =1 k=1 =1 .e. Then ∞ ∞ ∞ cn = an bk .e. i. I Absolute convergence is also necessary in order to multiply two series. i. a set that can be ordered in diﬀerent ways.. n=1 n=1 k=1 ∞ n=1 Proof : Consider the sequence sn = ∞ k=1 |ak | ∞ =1 |b | . k. Theorem 5.23 Suppose that an and ∞ bn converge absolutely. Consider the product ∞ ∞ a b = (a + a + · · · )(b + b + · · · ). That ∞ bn converges absolutely follows form the fact that |bn | is a rearrangement n=1 of |an | and the ﬁrst part of this proof.. Chapter 5 ∞ ∞ bn = n=1 n=1 an . ∞ (an bk ).e. which implies that it is a Cauchy sequence.198 i. This sequence converges as n → ∞ (by limit arithmetic).n=1 except that these numbers to be added form a “matrix”. For every > 0 there exists an N ∈ N. and that cn n=1 is any sequence that contains all terms an bk . n k 1 2 1 2 n=1 k=1 This product seems to be a sum of all products an bk ..

N N M cn − ak b n=1 k=1 =1 only consists of terms ak b with either k or greater than N.. By the comment above. m m |a | |b | − k k=1 =1 199 ∞ k=1 |ak | ∞ =1 |b | ≤ 2 for all m > N. .Sequences for all m. Thus. n > N. I. I for all m > M. Let now (cn ) be a sequence comprising of all pairs an b . hence for all m > M.e. There exists an M suﬃciently large. N N m cn − ak b ≤ |ak ||b | < . where cm = ak b . which proves the desires result. |ak ||b | ≤ k or >N 2 < . n=1 k=1 =1 k or >N Letting N → ∞. such that m > M implies that either k or is greater than N. m n=1 cn − ∞ k=1 ak ∞ =1 b ≤ .

200 Chapter 5 .

- 314 02Uploaded byVince Bagsit Policarpio
- Dominator-Based Algorithms in Logic Synthesis and VerificationUploaded byMahmoud Elbayoumi
- CMJV02I03P0445Uploaded byysuresh
- differential elementUploaded by8310032914
- Mathematics Syllabus for Main ExaminationUploaded byNIlesh Bhagat
- 5_3notes181Uploaded byEge Ata Türkgeldi
- 2011Uploaded byTy Webb
- Maths 4 test 3 2016.pdfUploaded byBafundi Cishe
- 1718-mod0-1054Uploaded byNatkhatKhoopdi
- 4130 ErrataUploaded byerlandfilho
- Chap 01 Real Analysis: Real Number SystemUploaded byatiq4pk
- Module 1-Uploaded byBharat Kumar
- Maths 4 test 3 2016.pdfUploaded byBafundi Cishe
- Closed and Open CompactUploaded byBhimesh Jetti
- Definite IntegralUploaded byRiddhima Mukherjee
- AIAAJ-svUploaded byashishgupta02
- Chapter 1 SetsUploaded byendeavour909897
- Mathematics Material for Revision ClassUploaded byRohan Chacko
- MA242SumII2015SyllabusUploaded byMssss
- Engineering MathematicsUploaded bySurendra Babu Koganti
- 9709_s10_er.pdfUploaded bymatshona
- Fundamental of Calculus (Presentation Slide)AUploaded byJosephineSinaga
- Esxplicação Teorema Da Substituição de VariáveisUploaded byadcant
- CHUploaded byJatin Bisht
- Cerme7 Wg14 Paper Vandebrouck Revised Dec2010Uploaded byVassilis
- 18.02 TextbookUploaded byEthan Replicate
- MStat Sample 2009Uploaded byxyza304gmailcom
- Calculus With Analytic GeometryUploaded byJoseph Nikolai Chioco
- MATH220syllabusUploaded byAbdullah Farooque
- Parish%2c Duraisamy (2016) - Non-local Closure Models for Large Eddy Simul....pdfUploaded bySahera Shabnam