Professional Documents
Culture Documents
Lecture Notes
Originally Created for the Class of Winter Semester 2015/2016 at LMU Munich
Includes Subsequent Corrections and Revisions†
Contents
1 Foundations: Mathematical Logic and Set Theory 6
1.1 Introductory Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Propositional Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.1 Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.2 Logical Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.3 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3 Set Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.4 Predicate Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
CONTENTS 2
4 Real Numbers 51
4.1 The Real Numbers as a Complete Totally Ordered Field . . . . . . . . . 51
4.2 Important Subsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5 Complex Numbers 56
5.1 Definition and Basic Arithmetic . . . . . . . . . . . . . . . . . . . . . . . 56
5.2 Sign and Absolute Value (Modulus) . . . . . . . . . . . . . . . . . . . . . 58
5.3 Sums and Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.4 Binomial Coefficients and Binomial Theorem . . . . . . . . . . . . . . . . 63
6 Polynomials 66
6.1 Arithmetic of K-Valued Functions . . . . . . . . . . . . . . . . . . . . . . 66
6.2 Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
CONTENTS 3
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
CONTENTS 4
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
CONTENTS 5
References 292
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 6
Mathematical logic is a large field in its own right and, as indicated above, a thorough in-
troduction is beyond the scope of this class – the interested reader may refer to [EFT07],
[Kun12], and references therein. Here, we will just introduce some basic concepts using
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 7
common English (rather than formal symbolic languages – a concept touched on in Sec.
A.2 of the Appendix and more thoroughly explained in books like [EFT07]).
As mentioned before, mathematics establishes the truth or falsehood of statements. By
a statement or proposition we mean any sentence (any sequence of symbols) that can
reasonably be assigned a truth value, i.e. a value of either true, abbreviated T, or false,
abbreviated F. The following example illustrates the difference between statements and
sentences that are not statements:
Example 1.1. (a) Sentences that are statements:
The fourth sentence in Ex. 1.1(b) is not a statement, as it can not be said to be either
true or false without any further knowledge on x. The fifth sentence in Ex. 1.1(b) is
not a statement as it lacks any meaning and can, hence, not be either true or false. It
would become a statement if given a definition of what it means for a natural number
to be green.
The next step now is to combine statements into new statements using logical operators,
where the truth value of the combined statements depends on the truth values of the
original statements and on the type of logical operator facilitating the combination.
The simplest logical operator is negation, denoted ¬. It is actually a so-called unary
operator, i.e. it does not combine statements, but is merely applied to one statement.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 8
For example, if A stands for the statement “Every dog is an animal.”, then ¬A stands
for the statement “Not every dog is an animal.”; and if B stands for the statement “The
number 4 is odd.”, then ¬B stands for the statement “The number 4 is not odd.”, which
can also be expressed as “The number 4 is even.”
To completely understand the action of a logical operator, one usually writes what is
known as a truth table. For negation, the truth table is
A ¬A
T F (1.1)
F T
that means if the input statement A is true, then the output statement ¬A is false; if
the input statement A is false, then the output statement ¬A is true.
We now proceed to discuss binary logical operators, i.e. logical operators combining
precisely two statements. The following four operators are essential for mathematical
reasoning:
Conjunction: A and B, usually denoted A ∧ B.
Disjunction: A or B, usually denoted A ∨ B.
Implication: A implies B, usually denoted A ⇒ B.
Equivalence: A is equivalent to B, usually denoted A ⇔ B.
Here is the corresponding truth table:
A B A∧B A∨B A⇒B A⇔B
T T T T T T
T F F T F F (1.2)
F T F T T F
F F F F T T
When first seen, some of the assignments of truth values in (1.2) might not be completely
intuitive, due to the fact that logical operators are often used somewhat differently in
common English. Let us consider each of the four logical operators of (1.2) in sequence:
For the use in subsequent examples, let A1 , . . . , A6 denote the six statements from Ex.
1.1(a).
Conjunction: Most likely the easiest of the four, basically identical to common language
use: A ∧ B is true if, and only if, both A and B are true. For example, using Ex. 1.1(a),
A1 ∧ A4 is the statement “Every dog is an animal and 2 + 3 = 5.”, which is true since
both A1 and A4 are true. On the other hand, A1 ∧ A3 is the statement “Every dog is
an animal and the number 4 is odd.”, which is false, since A3 is false.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 9
Disjunction: The disjunction A ∨ B is true if, and only if, at least one of the statements
A, B is true. Here one already has to be a bit careful – A ∨ B defines the inclusive or,
whereas “or” in common English is often understood to mean the exclusive or (which is
false if both input statements are true). For example, using Ex. 1.1(a), A1 ∨ A4 is the
statement “Every dog is an animal or 2 + 3 = 5.”, which is true since both A1 and A4
are true. The statement A1 ∨ A3 , i.e. “Every dog is an animal or the number 4 is odd.”
is also true,
√ since A1 is true. However, the statement A2 ∨ A5 , i.e. “Every animal is a
dog or 2 < 0.” is false, as both A2 and A5 are false.
As you will have noted in the above examples, logical operators can be applied to
combine statements that have no obvious contents relation. While this might seem
strange, introducing contents-related restrictions is unnecessary as well as undesirable,
since it is often not clear which seemingly unrelated statements might suddenly appear
in a common context in the future. The same occurs when considering implications and
equivalences, where it might seem even more obscure at first.
Implication: Instead of A implies B, one also says if A then B, B is a consequence
of A, B is concluded or inferred from A, A is sufficient for B, or B is necessary for
A. The implication A ⇒ B is always true, except if A is true and B is false. At first
glance, it might be surprising that A ⇒ B is defined to be true for A false and B true,
however, there are many examples of incorrect statements implying correct statements.
For instance, squaring the (false) equality of integers −1 = 1, implies the (true) equality
of integers 1 = 1. However, as with conjunction and disjunction, it is perfectly valid
to combine statements without any obvious context relation: For example, using Ex.
1.1(a), the statement A1 ⇒ A6 , i.e. “Every dog is an animal implies x + 1 > 0 holds for
each natural number x.” is true, since A6 is true, whereas the statement A4 ⇒ A2 , i.e.
“2 + 3 = 5 implies every animal is a dog.” is false, as A4 is true and A2 is false.
Of course, the implication A ⇒ B is not really useful in situations, where the truth
values of both A and B are already known. Rather, in a typical application, one tries
to establish the truth of A to prove the truth of B (a strategy that will fail if A happens
to be false).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 10
Equivalence: A ⇔ B means A is true if, and only if, B is true. Once again, using
input statements from Ex. 1.1(a), we see that A1 ⇔ A4 , i.e. “Every dog is an animal
is equivalent to 2 + 3 = 5.”, is true as well as A2 ⇔ A3 , i.e. “Every animal is a dog is
equivalent to √the number 4 is odd.”. On the other hand, A4 ⇔ A5 , i.e. “2 + 3 = 5 is
equivalent to 2 < 0, is false.
Analogous to the situation of implications, A ⇔ B is not really useful if the truth values
of both A and B are known a priori, but can be a powerful tool to prove B to be true
or false by establishing the truth value of A. It is obviously more powerful than the
implication as illustrated by the following example (compare with Ex. 1.2):
Example 1.3. Suppose we know Sasha has obtained the highest score among the stu-
dents registered for the Analysis class. Then the statement A “Sasha has taken the
Analysis class before.” is equivalent to the statement B “The student with the highest
score has taken the class before.” As in Ex. 1.2, if we can establish Sasha to have taken
the class before, then we also know B to be true. However, in contrast to Ex. 1.2, if we
find Sasha to have taken the class for the first time, then we know B to be false.
Remark 1.4. In computer science, the truth value T is often coded as 1 and the truth
value F is often coded as 0.
1.2.3 Rules
Note that the expressions in the first row of the truth table (1.2) (e.g. A ∧ B) are not
statements in the sense of Sec. 1.2.1, as they contain the statement variables (also known
as propositional variables) A or B. However, the expressions become statements if all
statement variables are substituted with actual statements. We will call expressions of
this form propositional formulas. Moreover, if a truth value is assigned to each statement
variable of a propositional formula, then this uniquely determines the truth value of the
formula. In other words, the truth value of the propositional formula can be calculated
from the respective truth values of its statement variables – a first justification for the
name propositional calculus.
Example 1.5. (a) Consider the propositional formula (A ∧ B) ∨ (¬B). Suppose A is
true and B is false. The truth value of the formula is obtained according to the
following truth table:
A B A ∧ B ¬B (A ∧ B) ∨ (¬B)
(1.3)
T F F T T
(b) The propositional formula A ∨ (¬A), also known as the law of the excluded middle,
has the remarkable property that its truth value is T for every possible choice of
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 11
Notation 1.7. We write φ(A1 , . . . , An ) if, and only if, the propositional formula φ
contains precisely the n statement variables A1 , . . . , An .
Definition 1.8. The propositional formulas φ(A1 , . . . , An ) and ψ(A1 , . . . , An ) are called
equivalent if, and only if, φ(A1 , . . . , An ) ⇔ ψ(A1 , . . . , An ) is a tautology.
Lemma 1.9. The propositional formulas φ(A1 , . . . , An ) and ψ(A1 , . . . , An ) are equiva-
lent if, and only if, they have the same truth value for all possible assignments of truth
values to A1 , . . . , An .
Proof. If φ(A1 , . . . , An ) and ψ(A1 , . . . , An ) are equivalent and Ai is assigned the truth
value ti , i = 1, . . . , n, then φ(A1 , . . . , An ) ⇔ ψ(A1 , . . . , An ) being a tautology implies it
has truth value T. From (1.2) we see that either φ(A1 , . . . , An ) and ψ(A1 , . . . , An ) both
have truth value T or they both have truth value F.
If, on the other hand, we know φ(A1 , . . . , An ) and ψ(A1 , . . . , An ) have the same truth
value for all possible assignments of truth values to A1 , . . . , An , then, given such an
assignment, either φ(A1 , . . . , An ) and ψ(A1 , . . . , An ) both have truth value T or both
have truth value F, i.e. φ(A1 , . . . , An ) ⇔ ψ(A1 , . . . , An ) has truth value T in each case,
showing it is a tautology.
For all logical purposes, two equivalent formulas are exactly the same – it does not
matter if one uses one or the other. The following theorem provides some important
equivalences of propositional formulas. As too many parentheses tend to make formulas
less readable, we first introduce some precedence conventions for logical operators:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 12
Theorem 1.11. (a) (A ⇒ B) ⇔ ¬A ∨ B. This means one can actually define impli-
cation via negation and disjunction.
(b) (A ⇔ B) ⇔ (A ⇒ B) ∧ (B ⇒ A) , i.e. A and B are equivalent if, and only if, A
is both necessary and sufficient for B. One also calls the implication B ⇒ A the
converse of the implication A ⇒ B. Thus, A and B are equivalent if, and only if,
both A ⇒ B and its converse hold true.
Proof. Each equivalence is proved by providing a truth table and using Lem. 1.9.
(a):
A B ¬A A ⇒ B ¬A ∨ B
T T F T T
T F F F F
F T T T T
F F T T T
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 13
(j): Exercise.
(k):
A ¬A ¬¬A
T F T
F T F
(l):
A B ¬A ¬B A ⇒ B ¬B ⇒ ¬A
T T F F T T
T F F T F F
F T T F T T
F F T T T T
Having checked all the rules completes the proof of the theorem.
The importance of the rules provided by Th. 1.11 lies in their providing proof techniques,
i.e. methods for establishing the truth of statements from statements known or assumed
to be true. The rules of Th. 1.11 will be used frequently in proofs throughout this class.
Remark 1.12. Another important proof technique is the so-called proof by contradic-
tion, also called indirect proof. It is based on the observation, called the principle of
contradiction, that A ∧ ¬A is always false:
A ¬A A ∧ ¬A
T F F (1.5)
F T F
Two more rules we will use regularly in subsequent proofs are the so-called transitivity
of implication and the transitivity of equivalence (we will encounter equivalence again
in the context of relations in Sec. 1.3 below). In preparation for the transitivity rules,
we generalize implication to propositional formulas:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 14
Proof. According to Def. 1.13, the rules can be verified by providing truth tables that
show that, for all possible assignments of truth values to the propositional formulas on
the left-hand side of the implications, either the left-hand side is false or both sides are
true. (a):
A B C A ⇒ B B ⇒ C (A ⇒ B) ∧ (B ⇒ C) A ⇒ C
T T T T T T T
T F T F T F T
F T T T T T T
F F T T T T T
T T F T F F F
T F F F T F F
F T F T F F T
F F F T T T T
(b):
A B C A ⇔ B B ⇔ C (A ⇔ B) ∧ (B ⇔ C) A ⇔ C
T T T T T T T
T F T F F F T
F T T F T F F
F F T T F F F
T T F T F F F
T F F F T F F
F T F F F F T
F F F T T T T
Having checked both rules, the proof is complete.
Definition and Remark 1.15. A proof of the statement B is a finite sequence of
statements A1 , A2 , . . . , An such that A1 is true; for 1 ≤ i < n, Ai implies Ai+1 , and An
implies B. If there exists a proof for B, then Th. 1.14(a) guarantees that B is true.
Remark 1.16. Principle of Duality: In Th. 1.11, there are several pairs of rules that
have an analogous form: (c) and (d), (e) and (f), (g) and (h), (i) and (j). These
analogies are due to the general law called the principle of duality: If φ(A1 , . . . , An ) ⇒
ψ(A1 , . . . , An ) and only the operators ∧, ∨, ¬ occur in φ and ψ, then the reverse im-
plication Φ(A1 , . . . , An ) ⇐ Ψ(A1 , . . . , An ) holds, where one obtains Φ from φ and Ψ
from ψ by replacing each ∧ with ∨ and each ∨ with ∧. In particular, if, instead of an
implication, we start with an equivalence (as in the examples from Th. 1.11), then we
obtain another equivalence.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 15
Definition 1.18. The sets M and N are equal, denoted M = N , if, and only if, M and
N have precisely the same elements.
—
Definition 1.18 means we know everything about a set M if, and only if, we know all its
elements.
Definition 1.19. The set with no elements is called the empty set; it is denoted by the
symbol ∅.
Example 1.20. For finite sets, we can simply write down all its elements,
√ for example,
A := {0}, B := {0, 17.5}, C := {5, 1, 5, 3}, D := {3, 5, 1}, E := {2, 2, −2}, where the
symbolism “:=” is to be read as “is defined to be equal to”.
Note C = D, since both sets contain precisely the same elements. In particular, the
order in which the elements are written down plays no role and a set does not change if
an element is written down more than once.
If a set has many elements, instead of writing down all its elements, one might use
abbreviations such as F := {−4, −2, . . . , 20, 22, 24}, where one has to make sure the
meaning of the dots is clear from the context.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 16
Definition 1.21. The set A is called a subset of the set B (denoted A ⊆ B and also
referred to as the inclusion of A in B) if, and only if, every element of A is also an
element of B (one sometimes also calls B a superset of A and writes B ⊇ A). Please
note that A = B is allowed in the above definition of a subset. If A ⊆ B and A 6= B,
then A is called a strict subset of B, denoted A ( B.
If B is a set and P (x) is a statement about an element x of B (i.e., for each x ∈ B,
P (x) is either true or false), then we can define a subset A of B by writing
A := {x ∈ B : P (x)}. (1.6)
This notation is supposed to mean that the set A consists precisely of those elements of
B such that P (x) is true (has the truth value T in the language of Sec. 1.2).
(c) We have {3} ⊆ {6.7, 3, 0}. Letting A := {−10, −8, . . . , 8, 10}, we have {−2, 0, 2} =
{x ∈ A : x3 ∈ A}, ∅ = {x ∈ A : x + 21 ∈ A}.
Remark 1.23. As a consequence of Def. 1.18, the sets A and B are equal if, and only
if, one has both inclusions, namely A ⊆ B and B ⊆ A. Thus, when proving the equality
of sets, one often divides the proof into two parts, first proving one inclusion, then the
other.
Definition 1.24. (a) The intersection of the sets A and B, denoted A ∩ B, consists of
all elements that are in A and in B. The sets A, B are said to be disjoint if, and
only if, A ∩ B = ∅.
(b) The union of the sets A and B, denoted A ∪ B, consists of all elements that are in
A or in B (as in the logical disjunction in (1.2), the or is meant nonexclusively). If
A and B are disjoint, one sometimes writes A ∪˙ B and speaks of the disjoint union
of A and B.
(c) The difference of the sets A and B, denoted A\B (read “A minus B” or “A without
B”), consists of all elements of A that are not elements of B, i.e. A \ B := {x ∈
A: x∈ / B}. If B is a subset of a given set A (sometimes called the universe in
this context), then A \ B is also called the complement of B with respect to A.
In that case, one also writes B c := A \ B (note that this notation suppresses the
dependence on A).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 17
As mentioned
earlier, it will
often be unavoidable
to consider sets of sets. Here are first
examples: ∅, {0}, {0, 1} , {0, 1}, {1, 2} .
Definition 1.26. Given a set A, the set of all subsets of A is called the power set of A,
denoted P(A) (for reasons explained later (cf. Prop. 2.18), the power set is sometimes
also denoted as 2A ).
Example 1.27. Examples of Power Sets:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 18
So far, we have restricted our set-theoretic examples to finite sets. However, not sur-
prisingly, many sets of interest to us will be infinite (we will have to postpone a mathe-
matically precise definition of finite and infinite to Sec. 3.2). We will now introduce the
most simple infinite set.
Definition 1.28. The set N := {1, 2, 3, . . . } is called the set of natural numbers (for a
more rigorous construction of N, based on the axioms of axiomatic set theory, see Sec.
A.3.4 of the Appendix, where Th. A.46 shows N to be, indeed, infinite). Moreover, we
define N0 := {0} ∪ N.
—
Proof. In each case, the proof results from the corresponding rule of Th. 1.11:
(a):
Th. 1.11(c)
x∈A∩B ⇔x∈A∧x∈B ⇔ x ∈ B ∧ x ∈ A ⇔ x ∈ B ∩ A.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 19
Remark 1.30. The correspondence between Th. 1.11 and Th. 1.29 is no coincidence.
One can actually prove that, starting with an equivalence of propositional formulas
φ(A1 , . . . , An ) ⇔ ψ(A1 , . . . , An ), where both formulas contain only the operators ∧, ∨, ¬,
one obtains a set-theoretic rule (stating an equality of sets) by reinterpreting all state-
ment variables A1 , . . . , An as variables for sets, all subsets of a universe U , and replacing
∧ by ∩, ∨ by ∪, and ¬ by U \ (if there are no multiple negations, then we do not need
the hypothesis that A1 , . . . , An are subsets of U ). The procedure also works in the op-
posite direction – one can start with a set-theoretic formula for an equality of sets and
translate it into two equivalent propositional formulas.
That means we are interested in statements involving universal quantification via the
quantifier “for all” (one also often uses “for each” or “for every” instead), existential
quantification via the quantifier “there exists”, or both. The quantifier of universal
quantification is denoted by ∀ and the quantifier of existential quantification is denoted
by ∃. Using these symbols as well as N and R to denote the sets of natural and real
numbers, respectively, we can restate (1.11) as
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 20
∀ P (x), (1.13a)
x∈A
∃ P (x). (1.13b)
x∈A
In (1.13), A denotes a set and P (x) is a sentence involving the variable x, a so-called
predicate of x, that becomes a statement (i.e. becomes either true or false) if x is substi-
tuted with any concrete element of the set A (in particular, P (x) is allowed to contain
further quantifiers, but it must not contain any other quantifier involving x – one says
x must be a free variable in P (x), not bound by any quantifier in P (x)).
The universal statement (1.13a) has the truth value T if, and only if, P (x) has the truth
value T for all elements x ∈ A; the existential statement (1.13b) has the truth value T
if, and only if, P (x) has the truth value T for at least one element x ∈ A.
V W
Remark 1.32. Some people prefer to write instead of ∀ and instead of ∃ .
x∈A x∈A x∈A x∈A
Even though this notation has the advantage of emphasizing that the universal statement
can be interpreted as a big logical conjunction and the existential statement can be
interpreted as a big logical disjunction, it is significantly less common. So we will stick
to ∀ and ∃ in this class.
Remark 1.33. According to Def. 1.31, the existential statement (1.13b) is true if, and
only if, P (x) is true for at least one x ∈ A. So if there is precisely one such x, then
(1.13b) is true; and if there are several different x ∈ A such that P (x) is true, then
(1.13b) is still true. Uniqueness statements are often of particular importance, and one
sometimes writes
∃! P (x) (1.14)
x∈A
for the statement “there exists a unique x ∈ A such that P (x) is true”. This notation
can be defined as an abbreviation for
∃ P (x) ∧ ∀ P (y) ⇒ x = y . (1.15)
x∈A y∈A
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 21
Remark 1.35. As for propositional calculus, we also have some important rules for
predicate calculus:
(a) Consider the negation of a universal statement, ¬ ∀ P (x), which is true if, and
x∈A
only if, P (x) does not hold for each x ∈ A, i.e. if, and only if, there exists at least
one x ∈ A such that P (x) is false (such that ¬P (x) is true). We have just proved
the rule
¬ ∀ P (x) ⇔ ∃ ¬P (x). (1.17a)
x∈A x∈A
(b) If A, B are sets and P (x, y) denotes a predicate of both x and y, then ∀ ∀ P (x, y)
x∈A y∈B
and ∀ ∀ P (x, y) both hold true if, and only if, P (x, y) holds true for each x ∈ A
y∈B x∈A
and each y ∈ B, i.e. the order of two consecutive universal quantifiers does not
matter:
∀ ∀ P (x, y) ⇔ ∀ ∀ P (x, y) (1.19a)
x∈A y∈B y∈B x∈A
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 22
(b) As a more complicated example, consider the negation of the uniqueness statement
(1.14), i.e. of (1.15):
¬ ∃! P (x) ⇔ ¬ ∃ P (x) ∧ ∀ P (y) ⇒ x = y
x∈A x∈A y∈A
(1.17b), Th. 1.11(a)
⇔ ∀ ¬ P (x) ∧ ∀ ¬P (y) ∨ x = y
x∈A y∈A
Th. 1.11(i)
⇔ ∀ ¬P (x) ∨ ¬ ∀ ¬P (y) ∨ x = y
x∈A y∈A
(1.17a)
⇔ ∀ ¬P (x) ∨ ∃ ¬ ¬P (y) ∨ x = y
x∈A y∈A
Th. 1.11(j),(k)
⇔ ∀ ¬P (x) ∨ ∃ P (y) ∧ x 6= y
x∈A y∈A
Th. 1.11(a)
⇔ ∀ P (x) ⇒ ∃ P (y) ∧ x 6= y . (1.22)
x∈A y∈A
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 23
So how to decode the expression, we have obtained at the end? It states that if
P (x) holds for some x ∈ A, then there must be at least a second, different, element
y ∈ A such that P (y) is true. This is, indeed, precisely the negation of ∃! P (x).
x∈A
(d) The following example shows that different quantifiers do, in general, not commute
(i.e. do not yield equivalent statements when commuted): While the statement
∀ ∃ y>x (1.24a)
x∈R y∈R
is true (for each real number x, there is a bigger real number y, e.g. y := x + 1 will
do the job), the statement
∃ ∀ y>x (1.24b)
y∈R x∈R
is false (for example, since y > y is false). In particular, (1.24a) and (1.24b) are not
equivalent.
(e) Even though (1.14) provides useful notation, it is better not to think of ∃! as a
quantifier. It is really just an abbreviation for (1.15), and it behaves very differently
from ∃ and ∀: The following examples show that, in general, ∃! commutes neither
with ∃, nor with itself:
∃ ∃! m < n 6⇔ ∃! ∃ m<n
n∈N m∈N m∈N n∈N
(the statement on the left is true, as one can choose n = 2, but the statement on
the right is false, as ∃ m < n holds for every m ∈ N). Similarly,
n∈N
∃! ∃! m < n 6⇔ ∃! ∃! m < n
n∈N m∈N m∈N n∈N
(the statement on the left is still true and the statement on the right is still false
(there is no m ∈ N such that ∃! m < n)).
n∈N
Remark 1.37. One can make the following observations regarding the strategy for
proving universal and existential statements:
(a) To prove that ∀ P (x) is true, one must check the truth of P (x) for every element
x∈A
x ∈ A – examples are not enough!
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 24
(b) To prove that ∀ P (x) is false, it suffices to find one x ∈ A such that P (x) is
x∈A
false – such an x is then called a counterexample and one counterexample is always
enough to prove ∀ P (x) is false!
x∈A
(c) To prove that ∃ P (x) is true, it suffices to find one x ∈ A such that P (x) is true
x∈A
– such an x is then called an example and one example is always enough to prove
∃ P (x) is true!
x∈A
The subfield of mathematical logic dealing with quantified statements is called predicate
calculus. In general, one does not restrict the quantified variables to range only over
elements of sets (as we have done above). Again, we refer to [EFT07] for a deeper
treatment of the subject.
As an application of quantified statements, let us generalize the notion of union and
intersection:
Definition 1.38. Let I 6= ∅ be a nonempty set, usually called an index set in the present
context. For each i ∈ I, let Ai denote a set (some or all of the Ai can be identical).
consists of all elements x that belong to at least one Ai . The union is called disjoint
if, and only if, for each i, j ∈ I, i 6= j implies Ai ∩ Aj = ∅.
Proposition 1.39. Let I 6= ∅ be an index set, let M denote a set, and, for each i ∈ I,
let Ai denote a set. The following set-theoretic rules hold:
T T
(a) Ai ∩M = (Ai ∩ M ).
i∈I i∈I
S S
(b) Ai ∪M = (Ai ∪ M ).
i∈I i∈I
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
1 FOUNDATIONS: MATHEMATICAL LOGIC AND SET THEORY 25
T T
(c) Ai ∪M = (Ai ∪ M ).
i∈I i∈I
S S
(d) Ai ∩M = (Ai ∩ M ).
i∈I i∈I
T S
(e) M \ Ai = (M \ Ai ).
i∈I i∈I
S T
(f ) M \ Ai = (M \ Ai ).
i∈I i∈I
Proof. We prove (c) and (e) and leave the remaining proofs as an exercise.
(c):
!
\ (∗)
x∈ Ai ∪ M ⇔ x ∈ M ∨ ∀ x ∈ Ai ⇔ ∀ x ∈ Ai ∨ x ∈ M
i∈I i∈I
i∈I
\
⇔ x∈ (Ai ∪ M ).
i∈I
To justify the equivalence at (∗), we make use of Th. 1.11(b) and verify ⇒ and ⇐. For
⇒ note that the truth of x ∈ M implies x ∈ Ai ∨ x ∈ M is true for each i ∈ I. If x ∈ Ai
is true for each i ∈ I, then x ∈ Ai ∨ x ∈ M is still true for each i ∈ I. To verify ⇐, note
that the existence of i ∈ I such that x ∈ M implies the truth of x ∈ M ∨ ∀ x ∈ Ai .
i∈I
If x ∈ M is false for each i ∈ I, then x ∈ Ai must be true for each i ∈ I, showing
x ∈ M ∨ ∀ x ∈ Ai is true also in this case.
i∈I
(e):
\
x∈M\ Ai ⇔ x ∈ M ∧ ¬ ∀ x ∈ Ai ⇔ x ∈ M ∧ ∃ x ∈
/ Ai
i∈I i∈I
i∈I
[
⇔ ∃ x ∈ M \ Ai ⇔ x ∈ (M \ Ai ),
i∈I
i∈I
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 26
2.1 Functions
Definition 2.1. Let A, B be sets. Given x ∈ A, y ∈ B, the set
n o
(x, y) := {x}, {x, y} (2.1)
is called the ordered pair (often shortened to just pair) consisting of x and y. The set of
all such pairs is called the Cartesian product A × B, i.e.
A × ∅ = ∅ × A = ∅, (2.3a)
{1, 2} × {1, 2, 3} = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)} (2.3b)
6 {1, 2, 3} × {1, 2} = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)}.
= (2.3c)
Definition 2.3. Given sets A, B, a function or map f is an assignment rule that assigns
to each x ∈ A a unique y ∈ B. One then also writes f (x) for the element y. The set A
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 27
is called the domain of f , denoted dom(f ), and B is called the codomain of f , denoted
codom(f ). The information about a map f can be concisely summarized by the notation
f : A −→ B, x 7→ f (x), (2.5)
where x 7→ f (x) is called the assignment rule for f , f (x) is called the image of x, and
x is called a preimage of f (x) (the image must be unique, but there might be several
preimages). The set
graph(f ) := (x, y) ∈ A × B : y = f (x) (2.6)
is called the graph of f (not to be confused with pictures visualizing the function f ,
which are also called graph of f ). If one wants to be completely precise, then one
identifies the function f with the ordered triple (A, B, graph(f )).
The set of all functions with domain A and codomain B is denoted by F(A, B) or B A ,
i.e.
F(A, B) := B A := (f : A −→ B) : A = dom(f ) ∧ B = codom(f ) . (2.7)
Caveat: Some authors reserve the word map for continuous functions, but we use func-
tion and map synonymously.
Definition 2.4. Let A, B be sets and f : A −→ B a function.
f (T ) := {f (x) ∈ B : x ∈ T } (2.8)
f −1 (U ) := {x ∈ A : f (x) ∈ U } (2.9)
(c) f is called injective or one-to-one if, and only if, every y ∈ B has at most one
preimage, i.e. if, and only if, the preimage of {y} has at most one element:
−1
f injective ⇔ ∀ f {y} = ∅ ∨ ∃! f (x) = y
y∈B x∈A
⇔ ∀ x1 6= x2 ⇒ f (x1 ) 6= f (x2 ) . (2.10)
x1 ,x2 ∈A
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 28
(d) f is called surjective or onto if, and only if, every element of the codomain of f has
a preimage:
(e) f is called bijective if, and only if, f is injective and surjective.
f ({1, 2}) = {5, 4} = f −1 ({1, 2}), h̃−1 ({2, 4, 6}) = {1, 2, 3, 4, 5, 6}, (2.13)
Example 2.6. (a) For each nonempty set A, the map Id : A −→ A, Id(x) := x, is
called the identity on A. If one needs to emphasize that Id operates on A, then one
also writes IdA instead of Id. The identity is clearly bijective.
(b) Let A, B be nonempty sets. A map f : A −→ B is called constant if, and only if,
there exists c ∈ B such that f (x) = c for each x ∈ A. In that case, one also writes
f ≡ c, which can be read as “f is identically equal to c”. If f ≡ c, ∅ 6= T ⊆ A, and
U ⊆ B, then (
A for c ∈ U ,
f (T ) = {c}, f −1 (U ) = (2.14)
∅ for c ∈/ U.
f is injective if, and only if, A = {x}; f is surjective if, and only if, B = {c}.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 29
f (S ∩ T ) ⊆ f (S) ∩ f (T ), (2.16a)
!
\ \
f Si ⊆ f (Si ), (2.16b)
i∈I i∈I
f (S ∪ T ) = f (S) ∪ f (T ), (2.16c)
!
[ [
f Si = f (Si ), (2.16d)
i∈I i∈I
f −1 (U ∩ V ) = f (U ) ∩ f −1 (V ),
−1
(2.16e)
!
\ \
f −1 Ui = f −1 (Ui ), (2.16f)
i∈I i∈I
−1
f (U ∪ V ) = f (U ) ∪ f −1 (V ),
−1
(2.16g)
!
[ [
f −1 Ui = f −1 (Ui ), (2.16h)
i∈I i∈I
f (f (U )) ⊆ U, f −1 (f (S)) ⊇ S,
−1
(2.16i)
f −1 (U \ V ) = f −1 (U ) \ f −1 (V ). (2.16j)
Proof. We prove (2.16b) (which includes (2.16a) as a special case) and the second part
of (2.16i), and leave the remaining cases as exercises.
For (2.16b), one argues
!
\ \
y∈f Si ⇔ ∃ ∀ x ∈ Si ∧ y = f (x) ⇒ ∀ y ∈ f (Si ) ⇔ y ∈ f (Si ).
x∈A i∈I i∈I
i∈I i∈I
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 30
The observation
It is an exercise to find counterexamples that show one can not, in general, replace the
four subset symbols in (2.16) by equalities (it is possible to find examples with sets that
have at most 2 elements).
f : N −→ R, n 7→ n2 , (2.18a)
g : N −→ R, n 7→ 2n. (2.18b)
showing that composing functions is, in general, not commutative, even if the involved
functions have the same domain and the same codomain.
h ◦ (g ◦ f ) = (h ◦ g) ◦ f. (2.20)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 31
The maps
(
n/2 if n even,
g1 : N −→ N, g1 (n) := (2.23b)
1 if n odd,
(
n/2 if n even,
g2 : N −→ N, g2 (n) := (2.23c)
2 if n odd,
both constitute left inverses of f . It follows from Th. 2.13(c) below that f does not
have a right inverse.
The maps
both constitute right inverses of f . It follows from Th. 2.13(c) below that f does
not have a left inverse.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 32
the inverse is
3 for n = 1,
1 for n = 2,
g −1 : N −→ N, g −1 (n) := (2.25c)
2 for n = 3,
n for n∈/ {1, 2, 3}.
While Examples 2.12(a),(b) show that left and right inverses are usually not unique,
they are unique provided f is bijective (see Th. 2.13(c)).
(a) f : A −→ B is right invertible if, and only if, f is surjective (where the implication
“⇐” makes use of the axiom of choice (AC), see Appendix A.4).
(c) f : A −→ B is invertible if, and only if, f is bijective. In this case, the right inverse
and the left inverse are unique and both identical to the inverse.
Proof. (a): If f is surjective, then, for each y ∈ B, there exists xy ∈ f −1 {y} such that
f (xy ) = y. By AC, we can define the choice function
g : B −→ A, g(y) := xy . (2.26)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 33
(b): Fix a ∈ A. If f is injective, then, for each y ∈ B with f −1 {y} 6= ∅, let xy denote
the unique element in A satisfying f (xy ) = y. Define
(
xy for f −1 {y} 6= ∅,
g : B −→ A, g(y) := (2.27)
a otherwise.
Proof. Exercise.
Definition 2.15. (a) Given an index set I and a set A, a map f : I −→ A is sometimes
called a family (of elements in A), and is denoted in the form f = (ai )i∈I with
ai := f (i). When using this representation, one often does not even specify f and
A, especially if the ai are themselves sets.
(b) A sequence in a set A is a family of elements in A, where the index set is the set of
natural numbers N. In this case, one writes (an )n∈N or (a1 , a2 , . . . ). More generally,
a family is called a sequence, given a bijective map between the index set I and a
subset of N.
(c) Given a family of sets (Ai )i∈I , we define the Cartesian product of the Ai to be the
set of functions
( ! )
Y [
Ai := f : I −→ Aj : ∀ f (i) ∈ Ai . (2.30)
i∈I
i∈I j∈I
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 34
QI has precisely n elements with n ∈ N, then the elements of the Cartesian product
If
i∈I Ai are called (ordered) n-tuples, (ordered) triples for n = 3.
Example
T 2.16. (a) Using
S the notion of family, we can now say that the intersection
A
i∈I i and union i∈I Ai as defined in Def. 1.38 are the intersection and union of
the family of sets (Ai )i∈I , respectively. As a concrete example, let us revisit (1.26b),
where we have
\
(An )n∈N , An := {1, 2, . . . , n}, An = {1}. (2.31)
n∈N
In the following, we explain the common notation 2A for the power set P(A) of a set
A. It is related to a natural identification between subsets and their corresponding
characteristic function.
Definition 2.17. Let A be a set and let B ⊆ A be a subset of A. Then
(
1 if x ∈ B,
χB : A −→ {0, 1}, χB (x) := (2.34)
0 if x ∈/ B,
is called the characteristic function of the set B (with respect to the universe A). One
also finds the notations 1B and 1B instead of χB (note that all the notations suppress
the dependence of the characteristic function on the universe A).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 35
is bijective (recall that P(A) denotes the power set of A and {0, 1}A denotes the set of
all functions from A into {0, 1}).
Proposition 2.18 allows one to identify the sets P(A) and {0, 1}A via the bijective map
χ. This fact together with the common practice of set theory to identify the number
2 with the set {0, 1} (cf. the first paragraph of Sec. D.1 in the Appendix) explains the
notation 2A for P(A).
2.2 Relations
Definition 2.19. Given sets A and B, a relation is a subset R of A × B (if one wants
to be completely precise, a relation is an ordered triple (A, B, R), where R ⊆ A × B).
If A = B, then we call R a relation on A. One says that a ∈ A and b ∈ B are related
according to the relation R if, and only if, (a, b) ∈ R. In this context, one usually writes
a R b instead of (a, b) ∈ R.
Example 2.20. (a) The relations we are probably most familiar with are = and ≤.
The relation R of equality, usually denoted =, makes sense on every nonempty set
A:
R := ∆(A) := {(x, x) ∈ A × A : x ∈ A}. (2.36)
The set ∆(A) is called the diagonal of the Cartesian product, i.e., as a subset of
A × A, the relation of equality is identical to the diagonal:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 36
i.e. if, and only if, each x is related to y if, and only if, y is related to x.
(c) R is called antisymmetric if, and only if,
∀ (x R y ∧ y R x) ⇒ x = y , (2.43)
x,y∈A
i.e. if, and only if, the only possibility for x to be related to y at the same time that
y is related to x is in the case x = y.
(d) R is called transitive if, and only if,
∀ (x R y ∧ y R z) ⇒ x R z , (2.44)
x,y,z∈A
i.e. if, and only if, the relatedness of x and y together with the relatedness of y and
z implies the relatedness of x and z.
Example 2.22. The relations = and ≤ on R (or N) are reflexive, antisymmetric, and
transitive; = is also symmetric, whereas ≤ is not; < is antisymmetric (since x < y∧y < x
is always false) and transitive, but neither reflexive nor symmetric. The relation
R := (x, y) ∈ N2 : (x, y are both even) ∨ (x, y are both odd) (2.45)
on N is not antisymmetric, but reflexive, symmetric, and transitive. The relation
S := {(x, y) ∈ N2 : y = x2 } (2.46)
is not transitive (for example, 2 S 4 and 4 S 16, but not 2 S 16), not reflexive, not sym-
metric; it is only antisymmetric.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 37
Definition 2.23. A relation R on a set A is called an equivalence relation if, and only
if, R is reflexive, symmetric, and transitive. If R is an equivalence relations, then one
often writes x ∼ y instead of x R y.
called the equivalence class of x; each y ∈ [x] is called a representative of [x]. One
verifies that the properties of ∼ guarantee
[x] = [y] ⇔ x ∼ y ∧ [x] ∩ [y] = ∅ ⇔ ¬(x ∼ y) . (2.49)
The set of all equivalence classes I := A/ ∼:= {[x] : x ∈ A} is called the quotient set
Ṡ
of A by ∼, and A = i∈I Ai with Ai := i for each i ∈ I is the desired decomposition
of A.
Definition 2.25. A relation R on a set A is called a partial order if, and only if, R is
reflexive, antisymmetric, and transitive. If R is a partial order, then one usually writes
x ≤ y instead of x R y. A partial order ≤ is called a total or linear order if, and only if,
for each x, y ∈ A, one has x ≤ y or y ≤ x.
Notation 2.26. Given a (partial or total) order ≤ on A 6= ∅, we write x < y if, and
only if, x ≤ y and x 6= y, calling < the strict order corresponding to ≤ (note that the
strict order is never a partial order).
(a) x ∈ A is called lower (resp. upper) bound for B if, and only if, x ≤ b (resp. b ≤ x)
for each b ∈ B. Moreover, B is called bounded from below (resp. from above) if, and
only if, there exists a lower (resp. upper) bound for B; B is called bounded if, and
only if, it is bounded from above and from below.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 38
(b) x ∈ B is called minimum or just min (resp. maximum or max) of B if, and only if,
x is a lower (resp. upper) bound for B. One writes x = min B if x is minimum and
x = max B if x is maximum.
(c) A maximum of the set of lower bounds of B (i.e. a largest lower bound) is called
infimum of B, denoted inf B; a minimum of the set of upper bounds of B (i.e. a
smallest upper bound) is called supremum of B, denoted sup B.
Example 2.28. (a) For each A ⊆ R, the usual relation ≤ defines a total order on A.
For A = R, we see that N has 0 and 1 as lower bound with 1 = min N = inf N. On
the other hand, N is unbounded from above. The set M := {1, 2, 3} is bounded
with min M = 1, max M = 3. The positive real numbers R+ := {x ∈ R : x > 0}
have inf R+ = 0, but they do not have a minimum (if x > 0, then 0 < x/2 < x).
defines a partial order on A that is not a total order (for example, neither (1, 2) ≤
(2, 1) nor (2, 1) ≤ (1, 2)). For the set
B := (1, 1), (2, 1), (1, 2) , (2.51)
we have inf B = min B = (1, 1), B does not have a max, but sup B = (2, 2) (if
(m, n) ∈ A is an upper bound for B, then (2, 1) ≤ (m, n) implies 2 ≤ m and
(1, 2) ≤ (m, n) implies 2 ≤ n, i.e. (2, 2) ≤ (m, n); since (2, 2) is clearly an upper
bound for B, we have proved sup B = (2, 2)).
A different order on A is the so-called lexicographic order defined by
In contrast to the order from (2.50), the lexicographic order does define a total
order on A.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 39
Proof. Reflexivity, antisymmetry, and transitivity of ≤ clearly imply the same properties
for ≥, respectively. Moreover
proving (2.54a). Analogously, we obtain (2.54b). Next, (2.54c) and (2.54d) are implied
by (2.54a) and (2.54b), respectively. Finally, (2.54e) is proved by
Proof. Exercise.
Definition 2.31. Let A, B be nonempty sets with partial orders, both denoted by ≤
(even though they might be different). A function f : A −→ B, is called (strictly)
isotone, order-preserving, or increasing if, and only if,
∀ x < y ⇒ f (x) ≤ f (y) (resp. f (x) < f (y)) ; (2.55a)
x,y∈A
Functions that are (strictly) isotone or antitone are called (strictly) monotone.
Proposition 2.32. Let A, B be nonempty sets with partial orders, both denoted by ≤.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
2 FUNCTIONS AND RELATIONS 40
is strictly isotone and invertible, however f −1 is not isotone (since 2 < 3, but
f −1 (2) = (1, 2) and f −1 (3) = (2, 1) are not comparable, i.e. f −1 (2) ≤ f −1 (3) is not
true).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
3 NATURAL NUMBERS, INDUCTION, AND THE SIZE OF SETS 41
Induction is based on the fact that N satisfies the so-called Peano axioms:
P2: There exists an injective map S : N −→ N \ {1}, called the successor function (for
each n ∈ N, S(n) is called the successor of n).
P3: If a subset A of N has the property that 1 ∈ A and S(n) ∈ A for each n ∈ A, then
A is equal to N. Written as a formula, the third axiom is:
∀ 1 ∈ A ∧ S(A) ⊆ A ⇒ A = N .
A∈P(N)
Remark 3.1. In Def. 1.28, we had introduced the natural numbers N := {1, 2, 3, . . . }.
The successor function is S(n) = n + 1. In axiomatic set theory, one starts with the
Peano axioms and shows that the axioms of set theory allow the construction of a
set N which satisfies the Peano axioms. One then defines 2 := S(1), 3 := S(2), . . . ,
n + 1 := S(n). The interested reader can find more details in Appendix D.1.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
3 NATURAL NUMBERS, INDUCTION, AND THE SIZE OF SETS 42
Remark 3.3. To prove some φ(n) for each n ∈ N by induction according to Th. 3.2
consists of the following two steps:
(b) Perform the inductive step, i.e. prove that φ(n) (the induction hypothesis) implies
φ(n + 1).
1·2
Base Case (n = 1): 1 = 2
, i.e. φ(1) is true.
n(n+1)
Induction Hypothesis: Assume φ(n), i.e. 1 + 2 + · · · + n = 2
holds.
Induction Step: One computes
φ(n) n(n + 1) n(n + 1) + 2n + 2
1 + 2 + · · · + n + (n + 1) = +n+1=
2 2
2
n + 3n + 2 (n + 1)(n + 2)
= = , (3.4)
2 2
i.e. φ(n + 1) holds and the induction is complete.
Proof. If, for each n ∈ N, we use ψ(n) to denote ∀ φ(m), then (3.5) is equivalent to
1≤m≤n
∀ ψ(n) ⇒ ψ(n + 1) , i.e. to Th. 3.2(b) with φ replaced by ψ. Thus, Th. 3.2 implies
n∈N
ψ(n) holds true for each n ∈ N, i.e. φ(n) holds true for each n ∈ N.
Corollary 3.6. Let I be an index set. Suppose, for each i ∈ I, φ(i) is a statement. If
there is a bijective map f : N −→ I and (a) and (b) both hold, where
(a) φ f (1) is true,
(b) ∀ φ f (n) ⇒ φ f (n + 1) ,
n∈N
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
3 NATURAL NUMBERS, INDUCTION, AND THE SIZE OF SETS 43
Apart from providing a widely employable proof technique, the most important ap-
plication of Th. 3.2 is the possibility to define sequences inductively, using so-called
recursion:
(i) x1 = x.
Proof. To prove uniqueness, let (xn )n∈N and (yn )n∈N be sequences in A, both satisfying
(i) and (ii), i.e.
x1 = y1 = x and (3.6a)
∀ xn+1 = fn (x1 , . . . , xn ) ∧ yn+1 = fn (y1 , . . . , yn ) . (3.6b)
n∈N
We prove by induction (in the form of Cor. 3.5) that (xn )n∈N = (yn )n∈N , i.e.
∀ xn = yn : (3.7)
n∈N| {z }
φ(n)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
3 NATURAL NUMBERS, INDUCTION, AND THE SIZE OF SETS 44
Induction Hypothesis: Assume φ(m) for each m ∈ {1, . . . , n}, i.e. xm = ym holds for
each m ∈ {1, . . . , n}.
Induction Step: One computes
(3.6b) φ(1),...,φ(n) (3.6b)
xn+1 = fn (x1 , . . . , xn ) = fn (y1 , . . . , yn ) = yn+1 , (3.8)
F (1) = x, (3.9a)
∀ F (n + 1) = fn F (1), . . . , F (n) . (3.9b)
n∈N
and \
G := B. (3.11)
B∈F
∀ ∃! (n, xn ) ∈ G . (3.12)
n∈N xn ∈A
| {z }
φ(n)
Base Case (n = 1): From the definition of G, we know (1, x) ∈ G. If (1, a) ∈ G with
a 6= x, then H := G \ {(1, a)} ∈ F, implying G ⊆ H in contradiction to (1, a) ∈
/ H.
This shows a = x and proves φ(1).
Induction Hypothesis: Assume φ(m) for each m ∈ {1, . . . , n}.
Induction Step: From the induction hypothesis, we know
∃! (1, x1 ), . . . , (n, xn ) ∈ G.
(x1 ,...,xn )∈An
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
3 NATURAL NUMBERS, INDUCTION, AND THE SIZE OF SETS 45
Example 3.8. In many applications of Th. 3.7, one has functions gn : A −→ A and
uses
∀ fn : An −→ A, fn (a1 , . . . , an ) := gn (an ) . (3.13)
n∈N
(b) For each a ∈ R and each d ∈ R, we define the following arithmetic progression (also
called arithmetic sequence) recursively by
a1 := a, ∀ an+1 := an + d, (3.15a)
n∈N
(c) For each a ∈ R and each q ∈ R \ {0}, we define the following geometric progression
(also called geometric sequence) recursively by
x1 := a, ∀ xn+1 := xn · q, (3.16a)
n∈N
For the time being, we will continue to always specify A and the gn or fn in subsequent
recursive definitions, but in the literature, most of the time, the gn or fn are not provided
explicitly.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
3 NATURAL NUMBERS, INDUCTION, AND THE SIZE OF SETS 46
Example 3.9. (a) The Fibonacci sequence consists of the Fibonacci numbers, defined
recursively by
F0 := 0, F1 := 1, ∀ Fn+1 := Fn + Fn−1 , (3.17a)
n∈N
i.e. we have A = N0 and
(
1 for n = 1,
fn : An −→ A, fn (a1 , . . . , an ) := (3.17b)
an + an−1 for n ≥ 2.
So we obtain
(Fn )n∈N0 = (0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, . . . ). (3.17c)
i.e.
fn : An −→ A, fn (x1 , . . . , xn ) := xn + an+1 . (3.19b)
In (3.19a), one can also use other symbols for i, except a and n; for a finite sequence,
n needs to be less than the maximal index of the finite sequence.
More generally, if I is an index set and φ : {1, . . . , n} −→ I a bijective map, then
define n
X X
ai := aφ(i) . (3.19c)
i∈I i=1
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
3 NATURAL NUMBERS, INDUCTION, AND THE SIZE OF SETS 47
(b) Product Symbol: On A = R (or, more generally, on every set A, where a multiplica-
tion · : A × A −→ A is defined), define recursively, for each given (possibly finite)
sequence (a1 , a2 , . . . ) in A:
1
Y n+1
Y n
Y
ai := a1 , ai := an+1 · ai for n ≥ 1, (3.20a)
i=1 i=1 i=1
i.e.
fn : An −→ A, fn (x1 , . . . , xn ) := an+1 · xn . (3.20b)
In (3.20a), one can also use other symbols for i, except a and n; for a finite sequence,
n needs to be less than the maximal index of the finite sequence.
More generally, if I is an index set and φ : {1, . . . , n} −→ I a bijective map, then
define n
Y Y
ai := aφ(i) . (3.20c)
i∈I i=1
∀ an = a + (n − 1)d, (3.21a)
n∈N
n
X n n
∀ Sn := ai = (a1 + an ) = 2 a + (n − 1) d , (3.21b)
n∈N
i=1
2 2
(b) Given a ∈ R and q ∈ R \ {0}, let (xn )n∈N be the geometric sequence as defined in
(3.16a). We will prove by induction that
xn = a q n−1 ,
∀ (3.22a)
n∈N
n n n−1
(
X X X na for q = 1,
∀ Sn := xi = (a q i−1 ) = a q i = a (1−qn ) (3.22b)
n∈N
i=1 i=1 i=0 1−q
for q 6= 1,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
3 NATURAL NUMBERS, INDUCTION, AND THE SIZE OF SETS 48
Definition 3.12. (a) The sets A, B are defined to have the same cardinality or the
same size if, and only if, there exists a bijective map ϕ : A −→ B. One can show
that this defines an equivalence relation on every set of sets (see Th. A.54 of the
Appendix).
(b) The cardinality of a set A is n ∈ N (denoted #A = n) if, and only if, there exists
a bijective map ϕ : A −→ {1, . . . , n}. The cardinality of ∅ is defined as 0, i.e.
#∅ := 0. A set A is called finite if, and only if, there exists n ∈ N0 such that
#A = n; A is called infinite if, and only if, A is not finite, denoted #A = ∞ (in the
strict sense, this is an abuse of notation, since ∞ is not a cardinality – for example
#N = ∞ and #P(N) = ∞, but N and P(N) do not have the same cardinality,
since the power set P(A) is always strictly bigger than A (see Th. A.70 of the
Appendix) – #A = ∞ is merely an abbreviation for the statement “A is infinite”).
The interested student finds additional material regarding the uniqueness of finite
cardinality in Th. A.62 and Cor. A.63, and regarding characterizations of infinite
sets in Th. A.55 of the Appendix.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
3 NATURAL NUMBERS, INDUCTION, AND THE SIZE OF SETS 49
(c) The set A is called countable if, and only if, A is finite or A has the same cardinality
as N. Otherwise, A is called uncountable.
—
In the rest of the section, we present a number of important results regarding the natural
numbers and countability.
Theorem 3.13. (a) Every nonempty finite subset of a totally ordered set has a mini-
mum and a maximum.
Proof. (a): Let A be a set and let ≤ denote a total order on A. Moreover, let ∅ 6= B ⊆ A.
We show by induction
∀ #B = n ⇒ B has a min .
n∈N | {z }
φ(n)
Base Case (n = 1): For n = 1, B contains a unique element b, i.e. b = min B, proving
φ(1).
Induction Step: Suppose φ(n) holds and consider B with #B = n + 1. Let b be one
element from B. Then C := B \ {b} has cardinality n and, according to the induction
hypothesis, there exists c ∈ C satisfying c = min C. If c ≤ b, then c ≤ x for each x ∈ B,
proving c = min B. If b ≤ c, then b ≤ x for each x ∈ B, proving b = min B. In each
case, B has a min, proving φ(n + 1) and completing the induction.
(b): Let ∅ 6= A ⊆ N. We have to show A has a min. If A is finite, then A has a min by (a).
If A is infinite, let n be an element from A. Then the finite set B := {k ∈ A : k ≤ n}
must have a min m by (a). Since m ≤ x for each x ∈ B and m ≤ n < x for each
x ∈ A \ B, we have m = min A.
Proof. Since ∅ is countable, we may assume A 6= ∅. From Th. 3.13(b), we know that
every nonempty subset of N has a min. We recursively define a sequence in A by
(
min An if An := A \ {ai : 1 ≤ i ≤ n} 6= ∅,
a1 := min A, an+1 :=
an if An = ∅.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
3 NATURAL NUMBERS, INDUCTION, AND THE SIZE OF SETS 50
(i) A is countable.
Proof. Directly from the definition of countable in Def. 3.12(c), one obtains (i)⇒(ii) and
(i)⇒(iii). To prove (ii)⇒(i), let f : A −→ N be injective. Then f : A −→ f (A) is
bijective, and, since f (A) ⊆ N, f (A) is countable by Prop. 3.14, proving A is countable
as well. To prove (iii)⇒(i), let g : N −→ A be surjective. Then g has a right inverse
f : A −→ N. One can obtain this from Th. 2.13(a), but, here, we can actually construct
f without the axiom of choice: For a ∈ A, let f (a) := min g −1 ({a}) (recall Th. 3.13(b)).
Then, clearly, g ◦ f = IdA . But this means g is a left inverse for f , showing f is injective
according to Th. 2.13(b). Then A is countable by an application of (ii).
Q
Theorem 3.16. If (A1 , . . . , An ), n ∈ N, is a finite family of countable sets, then ni=1 Ai
is countable.
Proof. We first consider the special case n = 2 with A1 = A2 = N and show the map
ϕ : N × N −→ N, ϕ(m, n) := 2m · 3n ,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
4 REAL NUMBERS 51
Q
Induction Step: Assuming φ(n), Prop. 3.15(ii) provides injective maps f1 : Q ni=1 Ai −→
N and f2 : An+1 −→ N. To prove φ(n+1), we provide an injective map h : n+1 i=1 Ai −→
N: Define
n+1
Y
h: Ai −→ N, h(a1 , . . . , an , an+1 ) := ϕ f1 (a1 , . . . , an ), f2 (an+1 ) .
i=1
Proof. It suffices to consider the case that all Ai are nonempty. Moreover, according to
Prop. 3.15(iii), it suffices to construct a surjective map ϕ : N −→ A. Also according
to Prop. 3.15(iii), the countability of I and the Ai provides us with surjective maps
f : N −→ I and gi : N −→ Ai (here AC is used to select each gi from the set of all
surjective maps from N onto Ai ). Define
Remark 3.18. The axiom of choice is, indeed, essential for the proof of Th. 3.17. It
is shown in [Jec73, Th. 10.6] that it is consistent with the axioms of ZF (i.e. with the
axioms of Sec. A.3 of the Appendix) that, e.g., the uncountable sets P(N) and R (cf.
Th. F.2 of the Appendix) are countable unions of countable sets.
4 Real Numbers
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
4 REAL NUMBERS 52
Definition 4.1. A total order ≤ on a nonempty set A is called complete if, and only if,
every nonempty subset B of A that is bounded from above has a supremum, i.e.
∀ ∃ ∀ b≤x ⇒ ∃ s = sup B . (4.1)
B∈P(A)\{∅} x∈A b∈B s∈A
Lemma 4.2. A total order ≤ on a nonempty set A is complete if, and only if, every
nonempty subset B of A that is bounded from below has an infimum.
Proof. According to Lem. 2.29, it suffices to prove one implication. We show that (4.1)
implies that every nonempty B bounded from below has an infimum: Define
Then every b ∈ B is an upper bound for C and (4.1) implies there exists s = sup C ∈ A.
To verify s = inf B, it remains to show s ∈ C, i.e. that s is a lower bound for B.
However, every b ∈ B is an upper bound for C and s = sup C is the min of all upper
bounds for C, i.e. s ≤ b for each b ∈ B, showing s ∈ C.
Definition 4.3. Let (A, +, ·) be a field and let ≤ be a total order on A. Then A (or,
more precisely, (A, +, ·, ≤)) is called a totally ordered field if, and only if, the order is
compatible with addition and multiplication, i.e. if, and only if,
∀ x≤y ⇒ x+z ≤y+z , (4.3a)
x,y,z∈A
∀ 0 ≤ x ∧ 0 ≤ y ⇒ 0 ≤ xy . (4.3b)
x,y∈A
Finally, A is called a complete totally ordered field if, and only if, A is a totally ordered
field that is complete in the sense of Def. 4.1.
Theorem 4.4. There exists a complete totally ordered field R (it is called the set of
real numbers). Moreover, R is unique up to isomorphism, i.e. if A is a complete totally
ordered field, then there exists an isomorphism φ : A −→ R, i.e. a bijective map φ :
A −→ R, satisfying
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
4 REAL NUMBERS 53
Proof. To really prove the existence of the real numbers by providing a construction
is tedious and not easy. One possible construction is provided in Appendix D (the
existence proof is completed in Th. D.41, the results regarding the isomorphism can be
found in Th. D.45).
Theorem 4.5. The following statements and rules are valid in the set of real numbers
R (and, more generally, in every totally ordered field):
(a) x ≤ y ⇒ −x ≥ −y.
(e) If 0 < x < y, then x/y < 1, y/x > 1, and 1/x > 1/y.
and, for z ≤ 0,
(4.3b)
x ≤ y ⇒ 0 ≤ y − x ⇒ 0 ≤ (y − x)(−z) = xz − yz ⇒ xz ≥ yz.
(c): From (b), one obtains x2 ≥ 0. From Th. C.10(i), one then gets x2 > 0.
(d): If x > 0, then x−1 < 0 implies the false statement 1 = xx−1 < 0, i.e. x−1 > 0. The
case x < 0 is treated analogously.
(e): Using (d), we obtain from 0 < x < y that x/y = xy −1 < yy −1 = 1 and 1 = xx−1 <
yx−1 = y/x.
(f): x < y ⇒ x + u < y + u and u < v ⇒ y + u < y + v; both combined yield
x + u < y + v.
(g): 0 < x < y ∧ 0 < u < v ⇒ xu < yu ∧ yu < yv ⇒ xu < yv.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
4 REAL NUMBERS 54
Proof. Exercise.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
4 REAL NUMBERS 55
For a = b, one says that the intervals defined by (4.8a) – (4.8d) are degenerate or
trivial, where [a, a] = {a}, ]a, a[=]a, a] = [a, a[= ∅ – it is sometimes convenient to have
included the degenerate cases in the definition. It is sometimes also useful to abandon
the restriction a ≤ b, to let c := min{a, b}, d := max{a, b}, and to define
[a, b] := [c, d], ]a, b[:=]c, d[, ]a, b] := [c, d] \ {a}, [a, b[:= [c, d] \ {b}. (4.8i)
Theorem 4.8 (Archimedean Property). Let ǫ, x be real numbers. If ǫ > 0 and x > 0,
then there exists n ∈ N such that n ǫ > x.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
5 COMPLEX NUMBERS 56
Proof. We conduct the proof by contradiction: Suppose x is an upper bound for the set
A := {n ǫ : n ∈ N}. Since the order ≤ on R is complete, according to (4.1), there exists
s ∈ R such that s = sup A. In particular, s − ǫ is not an upper bound for A, i.e. there
exists n ∈ N satisfying n ǫ > s − ǫ. But then (n + 1) ǫ > s in contradiction to s = sup A.
This shows x is not an upper bound for A, thereby establishing the case.
5 Complex Numbers
Theorem 5.2. (a) The set of complex numbers C with addition and multiplication as
defined in Def. 5.1 forms a field, where (0, 0) and (1, 0) are the neutral elements
with respect to addition and multiplication, respectively,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
5 COMPLEX NUMBERS 57
It is customary to identify R with ι(R), as it usually does not cause any confusion.
One then just writes x instead of (x, 0).
Proof. All computations required for (a) and (c) are straightforward and are left as an
exercise; (b) is a consequence of (a), since Th. C.10 is valid for every field.
Notation 5.3. The number i := (0, 1) is called the imaginary unit (note that, indeed,
i2 = i · i = (0, 1) · (0, 1) = (0 · 0 − 1 · 1, 0 · 1 + 1 · 0) = (−1, 0) = −1). Using i, one obtains
the commonly used representation of a complex number z = (x, y) ∈ C:
z = (x, y) = x · (1, 0) + y · (0, 1) = x + iy, (5.7)
where one calls Re z := x the real part of z and Im z := y the imaginary part of z.
Moreover, z is called purely imaginary if, and only if, Re z = 0 (as a consequence of this
convention, one has the (harmless) pathology that 0 is both real and purely imaginary).
Remark 5.4. There does not exist a total order ≤ on C that makes C into a totally
ordered field (i.e. no total order on C can be compatible with addition and multiplication
in the sense of (4.3)): Indeed, if there were such a total order ≤ on C, then all the rules
of Th. 4.5 had to be valid with respect to that total order ≤. In particular, 0 < 12 = 1
and 0 < i2 = −1 had to be valid by Th. 4.5(c), and, then, 0 < 1 + (−1) = 0 had to
be valid by Th. 4.5(f). However, 0 < 0 is false, showing that there is no total order on
C that satisfies (4.3). Caveat: Of course, there do exist total orders on C, just none
compatible with addition and multiplication – for example, the lexicographic order on
R × R (defined as it was in (2.52) for N × N) constitutes a total order on C.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
5 COMPLEX NUMBERS 58
Definition and Remark 5.5. Conjugation: For each complex number z = x + iy, we
define its complex conjugate or just conjugate to be the complex number z̄ := x − iy.
We then have the following rules that hold for each z = x + iy, w = u + iv ∈ C:
(c) z = z̄ ⇔ x + iy = x − iy ⇔ y = 0 ⇔ z ∈ R.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
5 COMPLEX NUMBERS 59
It is emphasized that the sign function is only defined for real numbers (cf. Rem.
5.4)!
where the term absolute value is often preferred for real numbers z ∈ R and the
term modulus is often preferred if one also considers complex numbers z ∈
/ R.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
5 COMPLEX NUMBERS 60
Proof. We carry out the proofs for z, w ∈ C. However, for z, w ∈ R, everything can
easily be shown directly from (5.10), without making use of square roots.
Let z = x + iy with x, y ∈ R.
(a): If z 6= 0, then x 6= 0 or y 6= p 0, i.e. x2 > 0 or y 2 > 0 by Th. 4.5(c), implying
2 2
x + y > 0 by Th. 4.5(f), i.e. |z| = x2 + y 2 > 0.
√
(b): Since a := |z| ∈ R+0 , we have |a| = a2 = a = |z|.
p p
(c): Since z̄ = x − iy, we have |z̄| = x2 + (−y)2 = x2 + y 2 = |z|.
(d): It is x = Re z, y = Im z. Let a := max{|x|, |y|}. As remarked in Def. and Rem.
5.6, the square root function is increasing and, thus, taking square roots in the chain of
inequalities a2 ≤ x2 + y 2 ≤ (|x| + |y|)2 implies a ≤ |z| ≤ |x| + |y| as claimed.
(e): As remarked in Def. and Rem. 5.6, the square root function is injective, and, thus,
(e) follows from
Def. and Rem. 5.5(a)
|zw|2 = zw zw = zwz̄ w̄ = z z̄ ww̄ = |z|2 |w|2 .
u2 v2 1 2
|w−1 |2 = 2 2 2
+ 2 2 2
= 2 2
= |w|−1 ,
(u + v ) (u + v ) u +v
|z|
i.e. |w−1 | = |w|−1 . Now (f) follows from (e): | wz | = |zw−1 | = |z||w−1 | = |z||w|−1 = |w|
.
(g) follows from
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
5 COMPLEX NUMBERS 61
Theorem 5.11. Let (R, +, ·) be a ring (cf. Def. C.7 of the Appendix).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
5 COMPLEX NUMBERS 62
(c) If R is a commutative ring with unity, then, for each n ∈ N0 and each z, w ∈ R:
n
X
wn+1 − z n+1 = (w − z) z j wn−j = (w − z)(wn + zwn−1 + · · · + z n−1 w + z n ).
j=0
Proof. In each case, the proof can be conducted by an easy induction. We carry out
(c) and leave the other cases as exercises. For (c), the base case (n = 0) is provided by
the true statement w0+1 − z 0+1 = w − z = (w − z)z 0 w0−0 . For the induction step, one
computes
n+1 n
!
X X
(w − z) z j wn+1−j = (w − z) z n+1 w0 + z j wn+1−j
j=0 j=0
n
X
n+1
= (w − z)z + (w − z) w z j wn−j
j=0
ind. hyp.
= (w − z) z n+1 + w(wn+1 − z n+1 ) = wn+2 − z n+2 ,
completing the induction.
Theorem 5.12. Let (F, +, ·) be a totally ordered field (cf. Def. 4.3).
Proof. Both cases are proved by simple inductions, where Th. 4.5(f) is used in (a) and
Th. 4.5(g) is used in (b).
Theorem 5.13. Triangle Inequality: For each n ∈ N and each zj ∈ C, j ∈ {1, . . . , n}:
Xn X n
z j ≤ |zj |.
j=1 j=1
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
5 COMPLEX NUMBERS 63
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
5 COMPLEX NUMBERS 64
Proof. (a): The first identity is part of the definition in (5.16). For the second identity,
we first observe, for each k ∈ N,
Y k k−1
α α+1−j α+1−k Yα+1−j α α+1−k
= = = , (5.19)
k j=1
j k j=1
j k−1 k
which implies
α α α α+1−k α α+1
+ = 1+ =
k−1 k k−1 k k−1 k
k−1 k
α+1Yα+1−j Yα+2−j α+1
= = = . (5.20)
k j=1 j j=1
j k
(b): 00 = 1 according to (5.16). For n ∈ N, (5.18) is proved by induction. The base
1+1−1
1
case (n = 1) is provided by the true statement 1 = 1 = 1. For the induction step,
one computes
n+1 n
n+1 Y n+1+1−j n+1Yn+1−j n ind. hyp.
= = = = 1, (5.21)
n+1 j=1
j n + 1 j=1 j n
Theorem 5.16 (Binomial Theorem). Let R be a commutative ring with unity (cf. Def.
C.7 of the Appendix – for us, R = C is the most important example). For each z, w ∈ R
and each n ∈ N0 , the following formula holds:
n
n
X n n−k k n n−1
n n
(z + w) = z w =z + z w + ··· + zwn−1 + wn . (5.22)
k=0
k 1 n−1
Using the induction hypothesis, we now further manipulate the two terms on the right-
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
5 COMPLEX NUMBERS 65
The binomial theorem can now be used to infer a few more rules that hold for the
binomial coefficients:
Proof. (5.27a) is just (5.22) with z = w = 1; (5.27b) is just (5.22) with z = 1 and
w = −1.
The formulas provided by the following proposition are also sometimes useful.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
6 POLYNOMIALS 66
denotes the set of all subsets of a set A that have precisely k elements.
Proof. The induction proofs of (a) and (b) are left as exercises. For (c), one computes
k k k
X n+j (5.29) X (n + j)! (5.29) X n + j
= =
j=0
n j=0
n!(n + j − n)! j=0
j
(5.28) n + k + 1 (5.29) (n + k + 1)! n+k+1
= = = ,
k k!(n + 1)! n+1
6 Polynomials
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
6 POLYNOMIALS 67
Notation 6.2. If A is any nonempty set, then one can add and multiply arbitrary
functions f, g : A −→ K, and one can define several further operations to create new
functions from f and g:
One calls f + and f − the positive part and the negative part of f , respectively. For
R-valued functions f , we have
|f | = f + + f − . (6.1l)
6.2 Polynomials
Definition 6.3. Let n ∈ N0 . Each function from K into K, x 7→ xn , is called a
monomial. A function P from K into K is called a polynomial if, and only if, it is a
linear combination of monomials, i.e. if, and only if P has the form
n
X
P : K −→ K, P (x) = aj x j = a0 + a1 x + · · · + an x n , aj ∈ K. (6.2)
j=0
The aj are called the coefficients of P . The largest number d ≤ n such that ad 6= 0 is
called the degree of P , denoted deg(P ). If all coefficients are 0, then P is called the zero
polynomial; the degree of the zero polynomial is defined as −1 (in Th. 6.6(b) below, we
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
6 POLYNOMIALS 68
will see that each polynomial of degree n ∈ N0 is uniquely determined by its coefficients
a0 , . . . , an and vice versa).
Polynomials of degree ≤ 0 are constant. Polynomials of degree ≤ 1 have the form
P (x) = a + bx and are called affine functions (often they are also called linear functions,
even though this is not really correct for a 6= 0, since every function P that is linear (in
the sense of linear algebra) must satisfy P (0) = 0). Polynomials of degree ≤ 2 have the
form P (x) = a + bx + cx2 and are called quadratic functions.
Each ξ ∈ K such that P (ξ) = 0 is called a zero or a root of P .
A rational function is a quotient P/Q of two polynomials P and Q.
Remark 6.4. Let λ ∈ K and let P, Q be polynomials. Then λP , P +Q, and P Q defined
according to Not. 6.2 are polynomials as well. More precisely, if λ = 0 or P ≡ 0, then
λP = 0; if P ≡ 0, then P + Q = Q; if Q ≡ 0, then P + Q = P ; if P ≡ 0 or Q ≡ 0, then
P Q = 0. If λ 6= 0 and
n
X m
X
P (x) = aj x j , Q(x) = bj xj ,
j=0 j=0 (6.3)
with deg(P ) = n ≥ 0, deg(Q) = m ≥ 0, n ≥ m ≥ 0,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
6 POLYNOMIALS 69
P
i.e. cj = b0 aj = jk=0 ak bj−k , which establishes the base case, remembering bj−k = 0 for
j > k. For the induction step, we compute, for deg(Q) = m + 1,
n m+1 n m
!
X X X X
(P Q)(x) = aj x j bα xα = aj xj bm+1 xm+1 + bα xα
j=0 α=0 j=0 α=0
n m+n j
!
ind. hyp.
X X X
= aj bm+1 xm+1+j + ak bj−k xj
j=0 j=0 k=0
m+n+1 m+n j
!
X X X
= aj−m−1 bm+1 xj + ak bj−k xj
j=m+1 j=0 k=0
m+n+1 j
!
X X
= ak bj−k xj ,
j=0 k=0
which completes the induction step. There is a notational issue in the second and third
line in of the above computation, since, in both lines, the bm+1 in the first sum is the
actual bm+1 from Q, but bm+1 = 0 in the second sum in both lines, which is due to the
induction hypothesis being applied for m < m+1. This is actually used when combining
j ≤ m + n: aj−m−1 bm+1 xj + aj−m−1 ·
both sums in the last step, computing, for m + 1 ≤ P
0 · x = aj−m−1 bm+1 x . For j = m + n + 1, one has m+n+1
j j
k=0 ak bm+n+1−k = an bm+1 , since
bm+n+1−k = 0 for n > k and ak = 0 for k > n.
Finally, deg(P Q) = m + n follows from cm+n = am bn 6= 0.
Theorem 6.5. (a) For each polynomial P given in the form of (6.3) and each ξ ∈ K,
we have the identity
Xn
P (x) = bj (x − ξ)j , (6.5)
j=0
where
n
X k k−j
∀ bj = ak ξ , in particular b0 = P (ξ), b n = an . (6.6)
j∈{0,...,n}
k=j
j
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
6 POLYNOMIALS 70
Proof. (a): For ξ = 0, there is nothing to prove. For ξ 6= 0, defining the auxiliary
variable η := x − ξ, we obtain x = ξ + η and
n k
n X n n
X k k−j j X X
k (5.22)
X k k−j j
P (x) = ak (ξ + η) = ak ξ η = ak ξ η
k=0 k=0 j=0
j k=0 j=0
j
n X n n n
X k k−j j X X k k−j j
= ak ξ η = ak ξ η , (6.8)
j=0 k=0
j j=0 k=j
j
which is (6.5).
(b): According to (a), we have
n
X n−1
X
j−1
P (x) = P (ξ) + (x − ξ) Q(x), with Q(x) = bj (x − ξ) = bj+1 (x − ξ)j , (6.9)
j=1 j=0
proving (b).
Proof. (a): For n = 0, P is constant, but not the zero polynomial, i.e. P ≡ a0 6= 0 with
no zeros as claimed. For n ∈ N, the proof is conducted by induction. The base case
(n = 1) is provided by the observation that deg(P ) = 1 implies P is the affine function
with P (x) = a0 + a1 x, a1 6= 0, i.e. P has precisely one zero at ξ = −a0 /a1 . For the
induction step, assume deg(P ) = n + 1. If P has no zeros, then the assertion of (a)
holds true. Otherwise, P has at least one zero ξ ∈ K, and, according to Th. 6.5(b),
there exists a polynomial Q such that deg(Q) = n and
From the induction hypothesis, we gather that Q has at most n zeros, i.e. (6.10) implies
P has at most n + 1 zeros, which completes the induction.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 71
(b): If P (xj ) = Q(xj ) at n + 1 distinct points xj , then each of these points is a zero of
P − Q. Thus P − Q is a polynomial of degree ≤ n with at least n + 1 zeros. Then (a)
implies deg(P − Q) = −1, i.e. P − Q is the zero polynomial, i.e. aj − bj = 0 for each
j ∈ {0, . . . , n}.
Remark 6.7. Let P be a polynomial with n := deg(P ) ≥ 0. According to Th. 6.6(a), P
has at most n zeros. Using Th. 6.5(b) for an induction shows there exists k ∈ {0, . . . , n}
and a polynomial Q of degree n − k such that
k
Y
P (x) = Q(x) (x − ξj ) = (x − ξ1 )(x − ξ2 ) · · · (x − ξk )Q(x), (6.11a)
j=1
where Q does not have any zeros in K and {ξ1 , . . . , ξk } = {ξ ∈ K : P (ξ) = 0} is the set
of zeros of P . It can of course happen that P does not have any zeros and P = Q (no
ξj exist). It can also occur that some of the ξj in (6.11a) are identical. Thus, we can
rewrite (6.11a) as
l
Y
P (x) = Q(x) (x − λj )mj = (x − λ1 )m1 (x − λ2 )m2 · · · (x − λl )ml Q(x), (6.11b)
j=1
Pl
where λ1 , . . . , λl , l ∈ {0, . . . , k}, are the distinct zeros of P , and mj ∈ N with j=1 mj =
k. Then mj is called the multiplicity of the zero λj of P .
7.1 Sequences
Recall from Def. 2.15(b) that a sequence in K is a function f : N −→ K, in this context
usually denoted as f = (zn )n∈N or (z1 , z2 , . . . ) with zn := f (n). Sometimes the sequence
also has the form (zn )n∈I , where I 6= ∅ is a countable index set (e.g. I = N0 ) different
from N (in the context of convergence (see the following Def. 7.1), I must be N or it
must have the same cardinality as N, i.e. finite I are not permissible).
Definition 7.1. The sequence (zn )n∈N in K is said to be convergent with limit z ∈ K if,
and only if, for each ǫ > 0, there exists an index N ∈ N such that |zn − z| < ǫ for every
index n > N . The notation for (zn )n∈N converging to z is limn→∞ zn = z or zn → z for
n → ∞. Thus, by definition,
lim zn = z ⇔ ∀ ∃ ∀ |zn − z| < ǫ. (7.1)
n→∞ ǫ∈R+ N ∈N n>N
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 72
The sequence (zn )n∈N in K is called divergent if, and only if, it is not convergent.
Example 7.2. (a) For every constant sequence (zn )n∈N = (a)n∈N with a ∈ K, one has
limn→∞ zn = limn→∞ a = a: Since, for each n ∈ N, |zn − a| = |a − a| = 0, one can
choose N = 1 for each ǫ > 0.
1
(b) limn→∞ n+a = 0 for each a ∈ C: Here zn := 1/(n + a) (if n = −a, then set zn := w
with w ∈ C arbitrary). Given ǫ > 0, choose an arbitrary N ∈ N with N ≥ ǫ−1 + |a|.
Then, for each n ≥ N , we compute |n + a| = |n − (−a)| ≥ |n − |a|| = n − |a| >
N − |a| ≥ ǫ−1 , and, thus, |zn | = |n + a|−1 < ǫ as desired.
(c) ((−1)n )n∈N is not convergent: We have zn = 1 for each even n and zn = −1 for each
odd n. Thus, for each z 6= 1 and each even n, |zn − z| = |1 − z| > |1 − z|/2 =: ǫ > 0,
i.e. z is not a limit of (zn )n∈N . However, z = 1 is also not a limit of the sequence,
since, for each odd n, |zn − 1| = | − 1 − 1| = 2 > 1 =: ǫ > 0, proving that the
sequence has no limit.
Theorem 7.3. (a) Let (zn )n∈N be a sequence in C. Then (zn )n∈N is convergent in C
if, and only if, both (Re zn )n∈N and (Im zn )n∈N are convergent in R. Moreover, in
that case,
lim zn = z ⇔ lim Re zn = Re z ∧ lim Im zn = Im z. (7.2)
n→∞ n→∞ n→∞
Proof. (a): Suppose (zn )n∈N converges to z ∈ C. Then, given ǫ > 0, there exists N ∈ N
such that, for each n > N , |zn − z| < ǫ. In consequence, for each n > N ,
Th. 5.9(d)
| Re zn − Re z| = | Re(zn − z)| ≤ |zn − z| < ǫ, (7.4)
proving limn→∞ Re zn = Re z. The proof of limn→∞ Im zn = Im z is completely anal-
ogous. Conversely, suppose there are x, y ∈ R such that limn→∞ Re zn = x and
limn→∞ Im zn = y. Here we encounter, for the first time, what is sometimes called an ǫ/2
argument: Given ǫ > 0, there exists N ∈ N such that, for each n > N , | Re zn − x| < ǫ/2
and | Im zn − y| < ǫ/2, implying, for each n > N ,
|zn − (x + iy)| = | Re zn + i Im zn − (x + iy)|
≤ | Re zn − x| + |i|| Im zn − y| < ǫ/2 + ǫ/2 = ǫ, (7.5)
proving limn→∞ zn = x + iy.
(b) is a direct consequence of (a).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 73
(b) According to Th. 7.3(a) and Ex. 7.2(c), the sequence ( n1 + (−1)n i)n∈N is divergent.
For q = 0, there is nothing to prove. For 0 < |q| < 1, it is |q|−1 > 1, i.e. h := |q|−1 −1 > 0.
Thus, for each ǫ > 0 and N ≥ 1/(ǫh), we obtain
(7.6)
n>N ⇒ |q|−n = (1 + h)n ≥ 1 + nh > nh > 1/ǫ ⇒ |q n | = |q|n < ǫ. (7.9)
Definition 7.7. (a) Given z ∈ K and ǫ ∈ R+ , we call the set Bǫ (z) := {w ∈ K :
|w − z| < ǫ} the ǫ-neighborhood of z or, in anticipation of Analysis II, the (open) ǫ-
ball with center z (in fact, for K = C, Bǫ (z) represents an open disk in the complex
plane with center z and radius ǫ, whereas, for K = R, Bǫ (z) =]z − ǫ, z + ǫ[ is the
open interval with center z and length 2ǫ). More generally, a set U ⊆ K is called
a neighborhood of z if, and only if, there exists ǫ > 0 with Bǫ (z) ⊆ U (so, for
example, for ǫ > 0, Bǫ (z) is always a neighborhood of z, whereas R and [z − ǫ, ∞[
are neighborhoods of z for K = R, but not for K = C ([z − ǫ, ∞[ not even being
defined for z ∈
/ R); the sets {z}, {w ∈ K : Re w ≥ Re z}, {w ∈ K : Re w ≥ Re z +ǫ}
are never neighborhoods of z).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 74
(b) If φ(n) is a statement for each n ∈ N, then φ(n) is said to be true for almost all
n ∈ N if, and only if, there exists a finite subset A ⊆ N such that φ(n) is true for
each n ∈ N \ A, i.e. if, and only if, φ(n) is always true, with the possible exception
of finitely many cases.
Remark 7.8. In the language of Def. 7.7, the sequence (zn )n∈N converges to z if, and
only if, every neighborhood of z contains almost all zn .
Definition 7.9. The sequence (zn )n∈N in K is called bounded if, and only if, the set
{|zn | : n ∈ N} is bounded in the sense of Def. 2.27(a).
Proposition 7.10. Let (zn )n∈N be a sequence in K.
(a) Limits are unique, that means if z, w ∈ K such that limn→∞ zn = z and limn→∞ zn =
w, then z = w.
(a) If (bn )n∈N is a sequences in C such that there exists C ∈ R+ with |bn | ≤ C|zn | for
almost all n, then limn→∞ bn = 0.
Proof. (a): Given ǫ > 0, there exists N ∈ N such that |zn | < ǫ/C and |bn | ≤ C|zn | for
each n > N . Then, for each n > N , |bn | ≤ C|zn | < ǫ, proving limn→∞ bn = 0.
(b): If (cn )n∈N is bounded, then there exists C ∈ R+ such that |cn | ≤ C for each n ∈ N.
Thus, |cn zn | ≤ C|zn | for each n ∈ N, implying limn→∞ (cn zn ) = 0 via (a).
Example 7.12. The sequences ((−1)n )n∈N and (b)n∈N with b ∈ C are bounded. Since,
1
for each a ∈ C, limn→∞ n+a = 0 by Example 7.2(b), we obtain
(−1)n b
lim = lim =0 (7.10)
n→∞ n + a n→∞ n + a
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 75
Theorem 7.13. (a) Let (zn )n∈N and (wn )n∈N be sequences in C. Moreover, let z, w ∈ C
with limn→∞ zn = z and limn→∞ wn = w. We have the following identities:
lim (λzn ) = λz for each λ ∈ C, (7.11a)
n→∞
lim (zn + wn ) = z + w, (7.11b)
n→∞
lim (zn wn ) = zw, (7.11c)
n→∞
lim zn /wn = z/w given all wn 6= 0 and w 6= 0, (7.11d)
n→∞
lim |zn | = |z|, (7.11e)
n→∞
lim z̄n = z̄, (7.11f)
n→∞
lim znp = z p for each p ∈ N. (7.11g)
n→∞
(b) Let (xn )n∈N and (yn )n∈N be sequences in R. Moreover, let x, y ∈ R with limn→∞ xn =
x and limn→∞ yn = y. Then
lim max{xn , yn } = max{x, y}, (7.12a)
n→∞
lim min{xn , yn } = min{x, y}. (7.12b)
n→∞
(c) If, in the situation of (b) (i.e. for real sequences), xn ≤ yn holds for almost all
n ∈ N, then x ≤ y. In particular, if almost all xn ≥ 0, then x ≥ 0.
(7.11b): Given ǫ > 0, there exists N ∈ N such that, for each n > N , |zn − z| < ǫ/2 and
|wn − w| < ǫ/2, implying
∀ |zn + wn − (z + w)| ≤ |zn − z| + |wn − w| < ǫ/2 + ǫ/2 = ǫ. (7.13b)
n>N
(7.11c): Let M1 := max{|z|, 1}. According to Prop. 7.10(b), there exists M2 ∈ R+ such
that M2 is an upper bound for {|wn | : n ∈ N}. Moreover, given ǫ > 0, there exists
N ∈ N such that, for each n > N , |zn − z| < ǫ/(2M2 ) and |wn − w| < ǫ/(2M1 ), implying
|zn wn − zw| = (zn − z)wn + z(wn − w)
∀ M2 ǫ M1 ǫ (7.13c)
n>N
≤ |wn | · |zn − z| + |z| · |wn − w| < + = ǫ.
2M2 2M1
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 76
(7.11d): We first consider the case, where all zn = 1. Given ǫ > 0, there exists N ∈ N
such that, for each n > N , |wn − w| < ǫ |w|2 /2 and |wn − w| < |w|/2 (since w 6= 0 for
this case), implying |w| ≤ |w − wn | + |wn | < |w|/2 + |wn | and |wn | > |w|/2. Thus,
1 1 wn − w 2 |wn − w| 2 ǫ |w|2
∀ − = ≤ < = ǫ. (7.13d)
n>N wn w wn w |w|2 |w|2 2
The general case now follows from (7.11c).
(7.11e): This is a consequence of the inverse triangle inequality (5.13): Given ǫ > 0,
there exists N ∈ N such that, for each n > N , |zn − z| < ǫ, implying
∀ |zn | − |z| ≤ |zn − z| < ǫ. (7.13e)
n>N
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 77
Proof. (7.16) follows by simple inductions from (7.11b) and (7.11c), respectively.
Theorem 7.16 (Sandwich Theorem). Let (xn )n∈N , (yn )n∈N , and (an )n∈N be sequences
in R. If xn ≤ an ≤ yn holds for almost all n ∈ N, then
Proof. Given ǫ > 0, there exists N ∈ N such that, for each n > N , xn ≤ an ≤ yn ,
|xn − x| < ǫ, and |yn − x| < ǫ, implying
1
lim = 0. (7.19)
n→∞ n!
Definition 7.18. Let (xn )n∈N be a sequence in R. The sequence is said to diverge to
∞ (resp. to −∞), denoted limn→∞ xn = ∞ (resp. limn→∞ xn = −∞) if, and only if, for
each K ∈ R, almost all xn are bigger (resp. smaller) than K. Thus,
Proof. We treat the increasing case; the decreasing case is proved completely analo-
gously. If A is bounded and ǫ > 0, let K := sup A − ǫ; if A is unbounded, then let
K ∈ R be arbitrary. In both cases, since K can not be an upper bound, there exists
N ∈ N such that xN > K. Since the sequence is increasing, for each n > N , xN ≤ xn ,
showing | sup A − xn | < ǫ in the bounded case, and xn > K in the unbounded case.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 78
Proof. Let (wn )n∈N be a subsequence of of (zn )n∈N , i.e. there is a strictly increasing
function φ : N −→ N such that wn = zφ(n) . If limn→∞ zn = z, then, given ǫ > 0, there
is N ∈ N such that zn ∈ Bǫ (z) for each n > N . For Ñ choose any number from N that
is ≥ N and in φ(N). Take M := φ−1 (Ñ ) (where φ−1 : φ(N) −→ N). Then, for each
n > M , one has φ(n) > Ñ ≥ N , and, thus, wn = zφ(n) ∈ Bǫ (z), showing limn→∞ wn = z.
Let (wn )n∈N be a reordering of (zn )n∈N , i.e. there is a bijective function φ : N −→ N
such that wn = zφ(n) . Let ǫ and N be as before. Define
M := max{φ−1 (n) : n ≤ N }. (7.23)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 79
As φ is bijective, it is φ(n) > N for each n > M . Then, for each n > M , one has
wn = zφ(n) ∈ Bǫ (z), showing limn→∞ wn = z.
Example 7.25. The sequence ((−1)n )n∈N has cluster points 1 and −1.
Proposition 7.26. A point z ∈ K is a cluster point of the sequence (zn )n∈N in K if,
and only if, the sequence has a subsequence converging to z.
Proof. If (wn )n∈N is a subsequence of (zn )n∈N , limn→∞ wn = z, then every Bǫ (z), ǫ > 0,
contains infinitely many wn , i.e. infinitely many zn , i.e. z is a cluster point of (zn )n∈N .
Conversely, if z is a cluster point of (zn )n∈N , then, inductively, define φ : N −→ N as
follows: For φ(1), choose the index k of any point zk in B1 (z) (such a point exists, since
z is a cluster point of the sequence). Now assume that n > 1 and that φ(m) have already
been defined for each m < n. Let M := max{φ(m) : m < n}. Since B 1 (z) contains
n
infinitely many zk , there must be some zk ∈ B 1 (z) such that k > M . Choose this k as
n
φ(n). Thus, by construction, φ is strictly increasing, i.e. (wn )n∈N with wn := zφ(n) is a
subsequence of (zn )n∈N . Moreover, for each ǫ > 0, there is N ∈ N such that 1/N < ǫ.
Then, for each n > N , wn ∈ B 1 (z) ⊆ B 1 (z) ⊆ Bǫ (z), showing limn→∞ wn = z.
n N
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 80
of A∗ , i.e. xn > a + ǫ/2 holds for only finitely many n (if any). In particular, we have
shown xn < a + ǫ holds for almost all n, and a − ǫ < xn < a + ǫ must hold for infinitely
many n, showing a is a cluster point of S. To see that a is the largest cluster point of
S (i.e. a = max A), we have to show that x > a implies x is not a cluster point of S.
However, letting ǫ := x − a > 0, we had seen above that xn > a + ǫ/2 holds for only
finitely many n, i.e. Bǫ/2 (x) contains only finitely many xn , showing x is not a cluster
point of S.
It now remains to consider the complex case, i.e. a bounded sequence S := (zn )n∈N in C.
For each n ∈ N, let zn = xn +iyn with xn , yn ∈ R. Due to Th. 5.9(d), we have |xn | ≤ |zn |
and |yn | ≤ |zn |, i.e. the boundedness of S implies the boundedness of both (xn )n∈N and
(yn )n∈N . Then we know that (xn )n∈N has a cluster point x and, by Prop. 7.26, S
has a subsequence (znj )j∈N such that x = limj→∞ xnj . As the subsequence (ynj )j∈N is
still bounded, it must have a cluster point y and a subsequence (ynjk )k∈N such that
y = limk→∞ ynjk . Since x = limk→∞ xnjk as well, we now have limk→∞ znjk = x + iy =: z,
i.e. S has a subsequence converging to z. According to Prop. 7.26, z is a cluster point
of S.
Definition 7.28. A sequence (zn )n∈N in C is defined to be a Cauchy sequence if, and
only if, for each ǫ ∈ R+ , there exists N ∈ N such that |zn − zm | < ǫ for each n, m > N ,
i.e.
(zn )n∈N Cauchy ⇔ ∀+ ∃ ∀ |zn − zm | < ǫ. (7.25)
ǫ∈R N ∈N n,m>N
Theorem 7.29. The sequence (zn )n∈N in C is convergent if, and only if, it is a Cauchy
sequence.
Proof. Suppose the sequence is convergent with limn→∞ zn = z. Then, given ǫ > 0,
there is N ∈ N such that zn ∈ B 2ǫ (z) for each n > N . If n, m > N , then |zn − zm | ≤
|zn − z| + |z − zm | < 2ǫ + 2ǫ = ǫ, establishing that (zn )n∈N is a Cauchy sequence.
Conversely, suppose the sequence is a Cauchy sequence. Using similar reasoning as in
the proof of Prop. 7.10(b), we first show the sequence is bounded. If the sequence is
Cauchy, then there exists N ∈ N such that |zn − zm | < 1 for all n, m > N . Thus, the
set A := {|zn | : |zn − zN +1 | ≥ 1} ∪ {|z1 |} ⊆ R+
0 is nonempty and finite. According to
Th. 3.13(a), A has an upper bound M . Then max{M, |zN +1 | + 1} is an upper bound for
{|zn | : n ∈ N}, showing that the sequence is bounded. From Th. 7.27, we obtain that
the sequence has a cluster point z. It remains to show limn→∞ zn = z. Given ǫ > 0,
choose N ∈ N such that |zn − zm | < ǫ/2 for all n, m > N . Since z is a cluster point,
there exists k > N such that |zk − z| < ǫ/2. Thus,
ǫ ǫ
∀ |zn − z| ≤ |zn − zk | + |zk − z| < + = ǫ, (7.26)
n>N 2 2
proving limn→∞ zn = z.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 81
We claim S is not a Cauchy sequence and, thus, not convergent by Th. 7.29: For each
N ∈ N, we find n, m > N such that sn −sm > 1/2, namely m = N +1 and n = 2(N +1):
2(N +1)
X 1 1 1 1
s2(N +1) − sN +1 = = + + ··· +
k=N +2
k N +2 N +3 2(N + 1)
1 1
> (N + 1) · = . (7.28)
2(N + 1) 2
While we have just seen that S is not convergent, it is clearly increasing, i.e. Th. 7.19
implies S is unbounded and limn→∞ sn = ∞. Sequences defined by longer and longer
sums are known as series and will be studied further in Sec. 7.3 below. The series of the
present example is known as the harmonic series. It has become famous as the simplest
example of a series that does not converge even though its summands converge to 0. In
terms of the notation introduced in Sec. 7.3 below, we have shown
∞
X 1 1 1
=1+ + + · · · = ∞. (7.29)
k=1
k 2 3
7.2 Continuity
7.2.1 Definitions and First Examples
Roughly, a function is continuous if a small change in its input results in a small change
of its output. For functions defined on an interval, the notion of continuity makes
precise the idea of a function having no jump – no discontinuity – at some point x in
its domain. For example, we would say the sign function of (5.8) has precisely one
jump – one discontinuity – at x = 0, whereas quadratic functions (or, more generally,
polynomials) do not have any jumps – they are continuous.
Definition 7.31. Let M ⊆ C. If ζ ∈ M , then a function f : M −→ K is said to be
continuous in ζ if, and only if, for each ǫ > 0, there is δ > 0 such that the distance
between the values f (z) and f (ζ) is less than ǫ, provided the distance between z and ζ
is less than δ, i.e. if, and only if,
∀ + ∃ + ∀ |z − ζ| < δ ⇒ |f (z) − f (ζ)| < ǫ . (7.30)
ǫ∈R δ∈R z∈M
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 82
Moreover, f is called continuous if, and only if, f is continuous in every ζ ∈ M . The set of
all continuous functions from f : M −→ K is denoted by C(M, K), C(M ) := C(M, R).
Example 7.32. (a) Every constant map f : M −→ K, ∅ 6= M ⊆ C, is continuous: In
this case, given ǫ, we can choose any δ > 0 we want, say δ := 42: If ζ, z ∈ M , then
|f (ζ) − f (z)| = 0 < ǫ, which holds independently of δ, in particular, if |ζ − z| < δ.
(b) Every affine function f : K −→ K, f (z) := az + b is continuous: For a = 0, this
follows from (a). For a 6= 0, given ǫ > 0, choose δ := ǫ/|a|. Then,
ǫ
|z − ζ| < δ = ⇒ f (z) − f (ζ) = az + b − aζ − b
|a|
∀ ǫ . (7.31)
ζ,z∈K
= |a| |z − ζ| < |a| =ǫ
|a|
(c) The sign function of (5.8) is not continuous: It is continuous in each ξ ∈ R\{0}, but
not continuous in 0: If ξ 6= 0, then, given ǫ > 0, choose δ := |ξ|. If |x − ξ| < δ, then
sgn(x) = sgn(ξ), i.e. | sgn(x) − sgn(ξ)| = 0 < ǫ, proving continuity in ξ. However,
at 0, for ǫ := 1/2, we have
1
∀ sgn(0) − sgn(δ/2) = |0 − 1| = 1 > = ǫ, (7.32)
δ>0 2
showing sgn is not continuous in 0.
Some subtleties arise from the possibility that f can be defined on subsets of C with
very different properties. The notions introduced in Def. 7.33 help to deal with these
subtleties.
Definition 7.33. Let M ⊆ C.
(a) The point z ∈ C is called a cluster point or accumulation point of M if, and only if,
each ǫ-neighborhood of z, ǫ ∈ R+ , contains infinitely many points of M , i.e. if, and
only if,
∀ + #(M ∩ Bǫ (z)) = ∞. (7.33)
ǫ∈R
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 83
Then Bǫ (z) ∩ M = {z}, showing z is an isolated point of M . Finally, the union in (7.34)
is clearly disjoint.
Lemma 7.35. Let M ⊆ C, f : M −→ K. If ζ is an isolated point of M , then f is
always continuous in ζ.
Proof. Independently of the concrete definition of f , we know there is δ > 0 such that
Bδ (ζ) ∩ M = {ζ}. In other words, if z ∈ M with |z − ζ| < δ, then z = ζ, implying
|f (z) − f (ζ)| = 0 < ǫ for each ǫ > 0, showing f to be continuous in ζ.
Example 7.36. (a) The sign function restricted to the set M :=]−∞, −1]∪{0}∪[1, ∞[,
i.e.
1
for x ∈ [1, ∞[,
sgn(x) = 0 for x = 0,
−1 for x ∈] − ∞, −1]
is continuous: As in Ex. 7.32(c), one sees that sgn is continuous in each ξ ∈ M \{0}.
However, now it is also continuous in 0, since 0 is an isolated point of M .
To make available the power of the results on convergent sequences from Sec. 7.1 to
investigations regarding the continuity of functions, we need to understand the relation-
ship between both notions. The core of this relationship is the contents of the following
Th. 7.37, which provides a criterion allowing one to test continuity in terms of convergent
sequences:
Theorem 7.37. Let M ⊆ C, f : M −→ K. If ζ ∈ M , then f is continuous in ζ if, and
only if, for each sequence (zn )n∈N in M with limn→∞ zn = ζ, the sequence (f (zn ))n∈N
converges to f (ζ), i.e.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 84
Proof. If ζ ∈ M is an isolated point of M , then there is δ > 0 such that M ∩Bδ (ζ) = {ζ}.
Then every f : M −→ K is continuous in ζ according to Lem. 7.35. On the other hand,
every sequence in M converging to ζ must be finally constant and equal to ζ, i.e. (7.36)
is trivially valid at ζ. Thus, the assertion of the theorem holds if ζ ∈ M is an isolated
point of M .
If ζ ∈ M is not an isolated point of M , then ζ is a cluster point of M according to Prop.
7.34. So, for the remainder of the proof, let ζ ∈ M be a cluster point of M . Assume
that f is continuous in ζ and (zn )n∈N is a sequence in M with limn→∞ zn = ζ. For each
ǫ > 0, there is δ > 0 such that z ∈ M and |z − ζ| < δ implies |f (z) − f (ζ)| < ǫ. Since
limn→∞ zn = ζ, there is also N ∈ N such that, for each n > N , |zn − ζ| < δ. Thus,
for each n > N , |f (zn ) − f (ζ)| < ǫ, proving limn→∞ f (zn ) = f (ζ). Conversely, assume
that f is not continuous in ζ. We have to construct a sequence (zn )n∈N in M with
limn→∞ zn = ζ, but (f (zn ))n∈N does not converge to f (ζ). Since f is not continuous
in ζ, there must be some ǫ0 > 0 such that, for each 1/n, n ∈ N, there is at least one
zn ∈ M satisfying |zn − ζ| < 1/n and |f (zn ) − f (ζ)| ≥ ǫ0 . Then (zn )n∈N is a sequence in
M with limn→∞ zn = ζ and (f (zn ))n∈N does not converge to f (ζ).
We can now apply the rules of Th. 7.13 to see that all the arithmetic operations defined
in Not. 6.2 preserve continuity:
Proof. Let (zn )n∈N be a sequence in M such that limn→∞ zn = ζ. Then the continuity
of f and g in ζ yields limn→∞ f (zn ) = f (ζ) and limn→∞ g(zn ) = g(ζ). Then
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 85
and, finally, the continuity of f + and f − follows from the continuity of max(f, g).
f = Re f + i Im f, (7.37)
Example 7.40. (a) The continuity of the absolute value function z 7→ |z| on K can be
concluded directly from (7.11e) and, alternatively, from combining the continuity
of f : K −→ K, f (z) = z, according to Ex. 7.32(b), with the continuity of |f |
according to Th. 7.38.
P
(b) Every polynomial P : K −→ K, P (x) = nj=0 aj xj , aj ∈ K, is continuous: First
note that every monomial x 7→ xj is continuous on K by (7.11g). Then Th. 7.38
implies the continuity of x 7→ aj xj on K. Now the continuity of P follows from
(7.16a) or, alternatively, by an induction from the f + g part of Th. 7.38.
(c) Let P, Q : K −→ K, be polynomials and let A := Q−1 {0} the set of all zeros of
Q (if any). Then the rational function (P/Q) : K \ A −→ K is continuous as a
consequence of (b) plus the f /g part of Th. 7.38.
Proof. Let ζ ∈ Df and assume f is continuous in ζ and g is continuous in f (ζ). If (zn )n∈N
is a sequence in Df such that limn→∞ zn = ζ, then the continuity of f in ζ implies that
limn→∞ f (zn ) = f (ζ). Then the continuity of g in f (ζ) implies limn→∞ g(f (zn )) =
g(f (ζ)), thereby establishing the continuity of g ◦ f in ζ.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 86
Subsets A of C (and even subsets of R) can be extremely complicated. If the set A has
one or more of the benign properties defined in the following, then this can often be
exploited in some useful way (we will see an important example in Th. 7.54 below).
Definition 7.42. Consider A ⊆ C.
(a) A is called bounded if, and only if, A = ∅ or the set {|z| : z ∈ A} is bounded in R
in the sense of Def. 2.27(a), i.e. if, and only if,
∃ A ⊆ BM (0).
M ∈R+
(b) A is called closed if, and only if, every sequence in A that converges in C has its
limit in A (note that ∅ is, thus, closed).
(c) A is called compact if, and only if, A is both closed and bounded.
Example 7.43. (a) Clearly, ∅ and sets containing single points {z}, z ∈ C are com-
pact. The sets C and R are simple examples of closed sets that are not bounded.
(b) Let a, b ∈ R, a < b. Each bounded interval ]a, b[, ]a, b], [a, b[, [a, b] is, indeed,
bounded (e.g. by M := ǫ + max{|a|, |b|} for each ǫ ∈ R+ ). If (xn )n∈N is a sequence
in [a, b], converging to x ∈ R, then Th. 7.13(c) shows a ≤ x ≤ b, i.e. x ∈ [a, b] and
[a, b] is, indeed, closed. Analogously, one sees that the unbounded intervals [a, ∞[
and ] − ∞, a] are also closed. On the other hand, open and half-open intervals are
not closed: For sufficiently large n, the convergent sequence (b − n1 )n∈N is in [a, b[,
but limn→∞ (b − n1 ) = b ∈ / [a, b[, and the other cases are treated analogously. In
particular, only intervals of the form [a, b] (and trivial intervals) are compact.
(c) For each ǫ > 0 and each z ∈ C, the set Bǫ (z) is bounded (since Bǫ (z) ⊆ Bǫ+|z| (0)
by the triangle inequality), but not closed (since, for sufficiently large n ∈ N,
(z + ǫ − n1 )n∈N is a sequence in Bǫ (z), converging to z + ǫ ∈
/ Bǫ (z)). In particular,
Bǫ (z) is not compact.
Proposition 7.44. (a) Finite unions of bounded (resp. closed, resp. compact) sets are
Sn if A1 , . . . , An ⊆ C, n ∈ N, are bounded
bounded (resp. closed, resp. compact), i.e.
(resp. closed, resp. compact), then A := j=1 Aj is also bounded (resp. closed, resp.
compact).
(b) Arbitrary (i.e. finite or infinite) intersections of bounded (resp. closed, resp. com-
pact) sets are bounded (resp. closed, resp. compact), i.e. if I 6= ∅ is an arbitrary
set and, for each j ∈ I, Aj ⊆ C is bounded (resp. closed, resp. compact), then
index T
A := j∈I Aj is also bounded (resp. closed, resp. compact).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 87
Many more examples of closed sets can be obtained as preimages of closed sets under
continuous maps according to the following remark:
Remark 7.46. In Analysis II, it will be shown in the more general context of maps
f between topological spaces that a map f is continuous if, and only if, all preimages
f −1 (A) under f of closed sets A are closed. Here, we will only prove the following special
case:
f : C −→ K continuous and A ⊆ K closed ⇒ f −1 (A) ⊆ C closed. (7.38)
Indeed, suppose f is continuous and A ⊆ K is closed. If (zn )n∈N is a sequence in f −1 (A)
with limn→∞ zn = z ∈ C, then (f (zn ))n∈N is a sequence in A. The continuity of f then
implies limn→∞ f (zn ) = f (z) and, then, f (z) ∈ A, since A is closed. Thus, z ∈ f −1 (A),
showing f −1 (A) is closed.
Example 7.47. (a) For each z ∈ C and each r > 0, the closed disk B r (z) := {w ∈ C :
|z − w| ≤ r} with radius r and center z is, indeed, closed by (7.38), since
B r (z) = f −1 [0, r], (7.39)
where f is the continuous map f : C −→ R, f (w) := |z − w|. Since B r (z) is clearly
bounded, it is also compact.
(b) For each z ∈ C and each r > 0, the circle (also called a 1-sphere) Sr (z) := {w ∈ C :
|z − w| = r} with radius r and center z is closed by (7.38), since Sr (z) = f −1 {r},
where f is the same map as in (7.39). Moreover, Sr (z) is also clearly bounded, and,
thus, compact.
(c) According to (7.38), for each x ∈ R, the closed half-spaces {z ∈ C : Re z ≥ x} =
Re−1 [x, ∞[ and {z ∈ C : Im z ≥ x} = Im−1 [x, ∞[ are, indeed, closed.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 88
Theorem 7.48. A subset K of C is compact if, and only if, every sequence in K has a
subsequence that converges to some limit z ∈ K.
Proof. If K is closed and bounded, and (zn )n∈N is a sequence in K, then the boundedness,
the Bolzano-Weierstrass Th. 7.27, and Prop. 7.26 yield a subsequence that converges to
some z ∈ C. However, since K is closed, z ∈ K.
Conversely, assume every sequence in K has a subsequence that converges to some
limit z ∈ K. Let (zn )n∈N be a sequence in K that converges to some w ∈ C. Then this
sequence must have a subsequence that converges to some z ∈ K. However, according to
Prop. 7.23, it must be w = z ∈ K, showing K is closed. If K is not bounded, then there
exists a sequence (zn )n∈N in K such that limn→∞ |zn | = ∞. Every subsequence (znk )k∈N
then still has the property that limk→∞ |znk | = ∞, in particular, each subsequence is
unbounded and can not converge to some z ∈ C (let alone in K).
Caveat 7.49. In Analysis II, we will generalize the notion of compactness to subsets
of so-called metric spaces. In metric spaces, it is still true that a set K is compact if,
and only if, every sequence in K has a subsequence that converges to some limit in
K. However, while it remains true that every compact set is closed and bounded, the
converse does not(!) hold in general metric spaces (in general, even in closed sets, there
exist bounded sequences that do not have convergent subsequences).
—
One reason that compact sets are useful is that real-valued continuous functions on
compact sets assume a maximum and a minimum, which is the contents of Th. 7.54
below. In preparation, we now define maxima and minima for real-valued functions.
(a) Given z ∈ M , f has a (strict) global min at z if, and only if, f (z) ≤ f (w) (f (z) <
f (w)) for each w ∈ M \ {z}. Analogously, f has a (strict) global max at z if, and
only if, f (z) ≥ f (w) (f (z) > f (w)) for each w ∈ M \{z}. Moreover, f has a (strict)
global extreme value at z if, and only if, f has a (strict) global min or a (strict)
global max at z.
(b) Given z ∈ M , f has a (strict) local min at z if, and only if, there exists ǫ > 0
such that f (z) ≤ f (w) (f (z) < f (w)) for each w ∈ {w ∈ M : |z − w| < ǫ} \ {z}.
Analogously, f has a (strict) local max at z if, and only if, there exists ǫ > 0 such
that f (z) ≥ f (w) (f (z) > f (w)) for each w ∈ {w ∈ M : |z − w| < ǫ} \ {z}.
Moreover, f has a (strict) local extreme value at z if, and only if, f has a (strict)
local min or a (strict) local max at z.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 89
Remark 7.51. In the context of Def. 7.50, it is immediate from the respective definitions
that f has a (strict) global min at z ∈ M if, and only if, −f has a (strict) global max
at z. Moreover, the same holds if “global” is replaced by “local”. It is equally obvious
that every (strict) global min/max is a (strict) local min/max.
Proof. If (wn )n∈N is a sequence in f (K), then, for each n ∈ N, there is some zn ∈ K
such that f (zn ) = wn . As K is compact, there is a subsequence (an )n∈N of (zn )n∈N
with limn→∞ an = a for some a ∈ K. Then (f (an ))n∈N is a subsequence of (wn )n∈N and
the continuity of f yields limn→∞ f (an ) = f (a) ∈ f (K), showing that (wn )n∈N has a
convergent subsequence with limit in f (K). By Th. 7.48, we have therefore established
that f (K) is compact.
According to the definition of the inf and sup as largest lower bound and smallest upper
bound, respectively, for each n ∈ N, there must be elements xn , yn ∈ K such that
m ≤ xn ≤ m + n1 and M − n1 ≤ yn ≤ M . Since the compact set K is also closed, we get
m = limn→∞ xn ∈ K and M = limn→∞ yn ∈ K.
Example 7.55. On an unbounded set, a continuous function does not necessarily have
a global max or a global min, as one can already see from x 7→ x. An example for a
continuous function on a bounded, but not closed, interval, that does not have a global
max is f : ]0, 1] −→ R, f (x) := 1/x, which is continuous by Th. 7.38.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 90
(a): f (ξ1 ) ≤ 0: This is clear if ξ1 = b. If ξ1 < b, then, for each n ∈ N sufficiently large,
there exists xn ∈ [ξ1 , ξ1 + 1/n[⊆ [a, b] such that f (xn ) ≤ 0). Then limn→∞ xn = ξ1 and
the continuity of f implies limn→∞ f (xn ) = f (ξ1 ). Now f (ξ1 ) ≤ 0 is a consequence of
Th. 7.13(c). In particular, (a) yields a < ξ1 and f > 0 on [a, ξ1 [.
(b): f (ξ1 ) ≥ 0: The continuity of f implies limn→∞ f (ξ1 − 1/n) = f (ξ1 ) and, since we
have already seen f (ξ1 − 1/n) > 0 for each n ∈ N sufficiently large, f (ξ1 ) ≥ 0 is again a
consequence of Th. 7.13(c). In particular, we have ξ1 < b.
Combining (a) and (b), we have f (ξ1 ) = 0 and a < ξ1 < b.
Defining ξ2 := sup f −1 (R+0 ), f (ξ2 ) = 0 and a < ξ2 < b is shown completely analogous.
Then f < 0 on ]ξ2 , b] is also clear as well as ξ1 ≤ ξ2 .
Theorem 7.57 (Intermediate Value Theorem). Let a, b ∈ R with a < b. If f : [a, b] −→
R is continuous, then f assumes every value between f (a) and f (b), i.e.
h i
min{f (a), f (b)}, max{f (a), f (b)} ⊆ f [a, b] . (7.40)
Proof. If f (a) = f (b), then there is nothing to prove. If f (a) < f (b) and η ∈]f (a), f (b)[,
then consider the auxiliary function g : [a, b] −→ R, g(x) := η − f (x). Then g is
continuous with g(a) = η − f (a) > 0 and g(b) = η − f (b) < 0. According to Bolzano’s
Th. 7.56, there exists ξ ∈]a, b[ such that g(ξ) = η − f (ξ) = 0, i.e. f (ξ) = η as claimed.
If f (b) < f (a) and η ∈]f (b), f (a)[, then consider the auxiliary function g : [a, b] −→ R,
g(x) := f (x)−η. Then g is continuous with g(a) = f (a)−η > 0 and g(b) = f (b)−η < 0.
Once again, according to Bolzano’s Th. 7.56, there exists ξ ∈]a, b[ such that g(ξ) =
f (ξ) − η = 0, i.e. f (ξ) = η.
Theorem 7.58. If I ⊆ R is an interval (of one of the 8 types listed in (4.8)) and
f : I −→ R is continuous, then f (I) is also an interval (it can degenerate to a single
point if f is constant). More precisely, if ∅ 6= I = [a, b] is a compact interval, then
∅ 6= f (I) = [min f (I), max f (I)]; if I is not a compact interval, then one of the following
9 cases occurs:
f (I) = R, (7.41a)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 91
Proof. If I is a compact interval, then we merely combine Th. 7.54 with Th. 7.57.
Otherwise, let η ∈ f (I). If f (I) has an upper bound, then Th. 7.57 implies [η, sup f (I)[⊆
f (I) and f (I) ∩ [η, ∞[⊆ [η, sup f (I)]. If f (I) does not have an upper bound, then Th.
7.57 implies f (I) ∩ [η, ∞[= [η, ∞[. Analogously, one obtains f (I)∩] − ∞, η] =] − ∞, η]
or f (I)∩] − ∞, η] = [inf f (I), η] or f (I)∩] − ∞, η] =] inf f (I), η],showing that there
are
precisely the 9 possibilities of (7.41) for f (I) = f (I)∩] − ∞, η] ∪ f (I) ∩ [η, ∞[ .
The above results will have striking consequences in the following Sec. 7.2.5.
Theorem 7.60. Let I ⊆ R be an interval (of one of the 8 types listed in (4.8)). If
f : I −→ R is strictly increasing (resp. decreasing), then f has an inverse function f −1
defined on J := f (I), i.e. f −1 : J −→ I, and f −1 is continuous and strictly increasing
(resp. decreasing). If f is also continuous, then J must be an interval.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 92
ξ ∈ I with f (ξ) = η. If I = {ξ}, then J = {η}, and there is nothing to prove. It remains
to consider three cases:
(a) ξ = min I, i.e. ξ is the left endpoint of I (and ξ 6= max I),
(b) ξ = max I, i.e. ξ is the right endpoint of I (and ξ 6= min I),
(c) ξ is neither the min nor the max of I.
We carry out the proof for (c) and leave the (very similar) cases (a) and (b) as exercises.
In case (c), there are points ξ1 , ξ2 ∈ Bǫ (ξ) ∩ I such that
Then
f −1 str. inc. (7.42)
∀ f (ξ1 ) < y < f (ξ2 ) ⇒ ξ1 < f −1 (y) < ξ2 ⇒ f −1 (y) ∈ Bǫ (ξ) ,
y∈J∩Bδ (η)
Remark and Definition 7.61 (Roots). We are now in a position to fulfill the promise
made in Def. and Rem. 5.6, i.e. to prove the existence of unique roots for nonnegative
real numbers: For each n ∈ N, the function f : R+ n
0 −→ R, f (x) := x , is continuous
and strictly increasing with J := f (R+ +
0 ) = R0 . Then Th. 7.60 implies the existence
of a continuous and strictly increasing inverse function f −1 : R+ +
0 −→ R0 . For each
√ 1
x ∈ R+ 0 , we call f
−1
(x) the nth root of x and write n x := x n := f −1 (x). Then
√ 1
( n x)n = (x n )n = x is immediate from the definition. Caveat: By definition, roots are
always nonnegative and they are only defined for nonnegative numbers (when studying
complex numbers and C-valued functions more deeply in the field of Complex Analysis,
one typically extends the notion of root, but we will not
√ pursue this√ route in this√class).
As anticipated in Def. and Rem. 5.6, one also writes x instead of x and calls x the
2
square root of x.
√
Remark and Definition √ 7.62. It turns out that 2 (and many other roots)
√ are not
rational numbers, i.e. 2 ∈ √ by contradiction: If 2 ∈ Q, then
/ Q. This is easily proved
there exist natural numbers m, n ∈ N such that 2 = m/n. Moreover, by canceling
possible factors of 2, we may assume at least one of the numbers m, n is odd. Now
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 93
√
2 = m/n implies m2 = 2n2 , i.e. m2 and, thus, m must be even. In consequence, there
exists p ∈ N such that m = 2p, implying 2n2 = m2 = 4p2 and n2 = 2p2 . Thus n2 and n
must also be even, in contradiction to m, n not both being even.
The elements of R \ Q are called irrational numbers. It turns out that most real num-
bers are irrational numbers – one can show that Q is countable, whereas R \ Q is not
countable (actually, every interval contains countably many rational and uncountably
many irrational numbers, see Appendix F, in particular, Th. F.1(c) and Cor. F.4).
Theorem 7.63 (Inequality Between the Arithmetic Mean and the Geometric Mean).
If n ∈ N and x1 , . . . , xn ∈ R+
0 , then
√ x1 + · · · + xn
n
x1 · · · xn ≤ , (7.43)
n
where the left-hand side is called the geometric mean and the right-hand side is called
the arithmetic mean of the numbers x1 , . . . , xn . Equality occurs if, and only if, x1 =
· · · = xn .
Proof. If at least one of the xj is 0, then (7.43) becomes the true statement 0 ≤ x1 +···+x
n
n
with strict equality if at least one xj > 0. If x1 = · · · = xn = x, then (7.43) also holds
since both sides are equal to x. Thus, for the remainder of the proof, we assume all
xj > 0 and not all xj are equal. First, we consider the special case, where x1 +···+x
n
n
= 1.
Since not all xj are equal, there exists k with xk 6= 1. We prove (7.43) by induction for
n ∈ {2, 3, . . . } in the form
n
! n
X Y
xj = n ∧ ∃ xk 6= 1 ⇒ xj < 1.
k∈{1,...,n}
j=1 j=1
Base Case (n = 2): Since x1 + x2 = 2, 0 < x1 , x2 and not both x1 and x2 are equal to
1, there is ǫ > 0 such that x1 = 1 + ǫ and x2 = 1 − ǫ, i.e. x1 x2 = 1 − ǫ2 < 1, which
Pn+1 the base case. Induction Step: We now have n ≥ 2 and 0 < x1 , . . . , xn+1
establishes
with j=1 xj = n + 1 plus the existence of k, l ∈ {1, . . . , n + 1} such that xk = 1 + α,
xl = 1 − β with α, β > 0. Then define y := xk + xl − 1 = 1 + α − β. One observes y > 0
(since β < 1) and
n+1
X n+1
X n+1
Y
ind. hyp.
y+ xj = −1 + xj = n ⇒ y xj ≤ 1
j=1, j=1 j=1,
j6=k,l j6=k,l
(we can not exclude equality as y and all the remaining xj might be equalQto 1). Since
xk xl = (1 + α)(1 − β) = 1 + α − β − αβ = y − αβ < y, we now infer n+1 j=1 xj < 1,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 94
√
n p √ a−1
ap < 1 + (a − 1); p = 1 yields n
a<1+ . (7.44)
n n
Proof. The simple application
v
u n−p
√
n
u Y Th. 7.63 p a + n − p p
p
a = a ·
n
t p 1 < = 1 + (a − 1) (7.45)
j=1
n n
It is an exercise to show
1
lim √ = 0. (7.48)
n→∞ n
√
Now this together with 1 ≤ n
n≤1+ √2 and the Sandwich Th. 7.16 proves (7.46).
n
Example 7.66 (Euler’s Number). We use Th. 7.63 to prove the limit
n
1
e := lim 1 + (7.49)
n→∞ n
exists. It is known as Euler’s number. One can show it is an irrational number (see
Appendix H.1) and its first digits are e = 2.71828 . . . It is of exceptional importance for
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 95
where we have used that, on both sides of the inequality in (7.50), there are n + 1 factors
having the same sum, namely n + 1 + x; and the inequality in (7.43) is strict, unless
all factors are equal. We now apply (7.50) to the sequences (an )n∈N , (bn )n∈N , (cn )n∈N ,
where n n
1 1
an := 1 + , bn := 1 − ,
n n
∀
−1 !n+1 :
n+1 (7.51)
n∈N
c := b−1 = 1 1
n n+1 1− = 1+
n+1 n
Applying (7.50) with x = 1 and x = −1, respectively, yields (an )n∈N and (bn )n∈N are
strictly increasing, and (cn )n∈N is strictly decreasing. On the other hand, an < cn holds
for each n ∈ N, showing (an )n∈N is bounded from above by c1 , and (cn )n∈N is bounded
from below by a1 . In particular, Th. 7.19 implies the convergence of both (an )n∈N and
(cn )n∈N . Moreover, limn→∞ cn = limn→∞ an (1 + 1/n) = e · 1 = e, which, together with
an < e < cn for each n ∈ N, can be used to compute e to an arbitrary precision.
Definition 7.67. Let A ⊆ R be a subset of the real numbers. Then A is called dense
in R if, and only if, every ǫ-neighborhood of every real number contains a point from A,
i.e. if, and only if,
∀ ∀ + A ∩ Bǫ (x) 6= ∅.
x∈R ǫ∈R
(b) R \ Q is dense in R.
(c) For each x ∈ R, there exist sequences (rn )n∈N and (sn )n∈N in the rational numbers
Q such that x = limn→∞ rn = limn→∞ sn , (rn )n∈N is strictly increasing and (sn )n∈N
is strictly decreasing.
Proof. (a): Since each Bǫ (x) is an interval, it suffices to prove that every interval ]a, b[,
a < b, contains a rational number. If 0 ∈]a, b[, then there is nothing to prove. Suppose
0 < a < b and set δ := b − a > 0. Choose n ∈ N such that 1/n < δ and let
k k
q := max : k∈N ∧ <b .
n n
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 96
Then q ∈ Q and a < q < b. If a < b < 0, choose δ and n as above, but let
k k
q := min − : k ∈ N ∧ − > a .
n n
Then, once again, q ∈ Q and a < q < b.
(b): Analogous to (a), we show that every interval ]a, b[, a < b, contains an irrational
√ According to (a), we choose q ∈
number: √ Q∩]a, b[, δ := b − q > 0 and n ∈ N such
that
√ 2/n < δ. Then a < λ := q + 2/n < b and also λ ∈ R \ Q (otherwise,
2 = n(λ − q) ∈ Q).
(c): Using (a), for each n ∈ N, we choose rational numbers rn and sn such that
1 1 1 1
rn ∈ x − , x − , sn ∈ x + ,x + .
n n+1 n+1 n
Then, clearly, (rn )n∈N is strictly increasing, (sn )n∈N is strictly decreasing, and the Sand-
wich Th. 7.16 implies x = limn→∞ rn = limn→∞ sn .
Definition and Remark 7.69 (Exponentiation). We have already used that Not. C.5
of the Appendix defines ax for (a, x) ∈ C×N0 and for (a, x) ∈ (C\{0})×Z (using G = C
and G = C \ {0}, respectively). We will now extend the definition to (a, x) ∈ R+ × R
(later, we will further extend the definition to (a, z) ∈ R+ × C). The present extension
to (a, x) ∈ R+ × R is accomplished in two steps – first, in (a), for rational x, then, in
(b), for irrational x.
For this definition to make sense, we have to check it does not depend on the special
representation of x, i.e., we have to verify x = nk = nmkm
with k ∈ Z and m, n ∈ N
k km
implies a n = a nm . To this end, observe, using Rem. and Def. 7.61,
k √
n km √
nm
(a n )nm = ( ak )nm = akm and (a nm )nm = ( akm )nm = akm , (7.53)
k km
proving a n = a nm (here, as in Rem. and Def. 7.61, we used that λ 7→ λN is one-
to-one on R+0 for each N ∈ N). The exponentiation rules of Th. C.6 now extend to
rational exponents in a natural way, i.e., for each a, b > 0 and each x, y ∈ Q:
ax+y = ax ay , (7.54a)
(ax )y = ax y , (7.54b)
ax bx = (ab)x . (7.54c)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 97
For the proof, by possibly multiplying numerator and denominator by some natural
number, we can assume x = k/n and y = l/n with k, l ∈ Z and n ∈ N. Then
k+l Th. C.6(a) k l Th. C.6(c)
(ax+y )n = (a n )n = ak+l = ak al = (a n )n (a n )n = (ax ay )n ,
proving (7.54a);
Th. C.6(b)
l
n k Th. C.6(b) k Th. C.6(b)
x y n2 x n
((a ) ) = ((a ) ) n = ((a n )l )n = ((a n )n )l = akl
kl 2 2
= (a n2 )n = (ax y )n ,
proving (7.54b);
Th. C.6(c) k k Th. C.6(c) k Th. C.6(b)
(ax bx )n = (a n )n (b n )n ak bk = (ab)k = (ab) n ·n = ((ab)x )n ,
proving (7.54c).
Moreover, we obtain the following monotonicity rules for each a, b ∈ R+ and each
x, y ∈ Q:
∀ a < b ⇒ ax < b x , (7.55a)
x>0
∀ a < b ⇒ ax > b x , (7.55b)
x<0
∀ x < y ⇒ ax < ay , (7.55c)
a>1
∀ x < y ⇒ ax > ay . (7.55d)
0<a<1
If x = k/n with k, n ∈ N and a < b, then a1/n < b1/n according to Rem. and
Def. 7.61, which, in turn, implies ax = (a1/n )k < (b1/n )k = bx , proving (7.55a); and
a−1 > b−1 implies a−x = (a−1 )x > (b−1 )x = b−x , proving (7.55b). If x < y, set q :=
y − x > 0. Then 1 < a and (7.55a) imply 1 = 1q < aq , i.e. ax < ax aq = ay , proving
(7.55c). Similarly, 0 < a < 1 and (7.55a) imply aq < 1q = 1, i.e. ay = ax aq < ax ,
proving (7.55d).
The following estimates will also come in handy: For a ∈ R+ and x, y ∈ Q:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 98
For x ≥ 1, (7.56) is proved by ax < ax+1 < x · ax+1 + 1; for x < 1, write x = p/n
with p, n ∈ N and p < n, and apply (7.44) to obtain ax < 1 + x(a − 1) < 1 + xa <
1 + x · ax+1 . For the proof of (7.57), first consider a > 1. Moreover, by possibly
renaming x and y, we may assume x < y, i.e. z := y − x > 0. Thus, (7.56) holds
with x replaced by z. Multiplying the resulting inequality by ax yields
proving (7.57) for a > 1. For a = 1, it is clearly true, and for a < 1, it is a−1 > 1,
i.e.
|ax − ay | = |(a−1 )−x − (a−1 )−y | ≤ |y − x| (a−1 )m+1 ,
finishing the proof of (7.57).
For this definition to make sense, we have to know such sequences (qn )n∈N exist,
which we do know from Th. 7.68(c). We also know from Th. 7.68(c) that there
exists an increasing sequence (qn )n∈N in Q converging to x, in particular, bounded
by x. Then, by (7.55c) and (7.55d), respectively, (aqn )n∈N is increasing for a > 1
and decreasing for 0 < a < 1. Moreover, the sequence is bounded from above by
aN with N ∈ N, N > x, for a > 1; and bounded from below by 0 for 0 < a < 1.
In both cases, Th. 7.19 implies convergence of the sequence to some limit that we
may call ax . However, we still need to verify that, for each sequence (rn )n∈N in Q
with limn→∞ rn = x, the sequence (arn )n∈N converges to the same limit ax in R. If
limn→∞ rn = x, then limn→∞ |qn − rn | = 0. Since (rn )n∈N and (qn )n∈N are bounded,
(7.57) implies
∃ + ∀ |aqn − arn | ≤ L |qn − rn |, (7.59)
L∈R n∈N
Proposition 7.70. The exponentiation rules (7.54), the monotonicity rules (7.55), and
the estimates (7.56) and (7.57) remain valid if x, y ∈ Q is replaced by x, y ∈ R. More-
over, for each a > 0 and each sequence (xn )n∈N in R:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 99
Proof. Given x, y ∈ R, let (pn )n∈N and (qn )n∈N be sequences in Q such that limn→∞ pn =
x and limn→∞ qn = y.
We start by verifying (7.57). As we can assume (pn )n∈N and (qn )n∈N to be monotone,
we may also assume pn , qn ∈ [−m, m] for each n ∈ N. Then the rational case of (7.57)
implies
∀ |apn − aqn | ≤ L |pn − qn |,
n∈N
and Th. 7.13(c) establishes the case. Then (7.61) also follows, since
0 ≤ |axn − ax | ≤ L |xn − x| → 0.
proving (7.55a). If x < 0 and 0 < a < b, then ax = (1/a)−x > (1/b)−x = bx , proving
(7.55b).
Finally, it remains to verify (7.56). For x ≥ 1, the proof for rational x still works for
irrational x. For 0 < x < 1, one uses the usual sequence (qn )n∈N in Q with limn→∞ qn = x
and obtains (recalling a > 1)
(7.44)
ax = lim aqn ≤ lim 1 + qn (a − 1) = 1 + x(a − 1) < 1 + x · ax+1 ,
n→∞ n→∞
proving (7.56).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 100
Definition 7.71 (Exponential and Power Functions). (a) Each function of the form
f : R+ −→ R, f (x) := xα , α ∈ R, (7.62)
is called a (general) exponential function. The case where a = e with e being Euler’s
number from (7.49) is of particular interest and importance. Most of the time, when
referring to an exponential function, one actually means x 7→ ex . It is also common
to write exp(x) instead of ex .
Theorem 7.72. (a) Every power function as defined in Def. 7.71(a) is continuous on
its respective domain. Moreover, for each α > 0, it is strictly increasing on [0, ∞[;
for each α < 0, it is strictly decreasing on ]0, ∞[.
(b) Every exponential function as defined in Def. 7.71(b) is continuous. Moreover, for
each a > 1, it is strictly increasing; for each 0 < a < 1, it is strictly decreasing.
Proof. (a): The monotonicity claims are provided by (7.55a) and (7.55b), respectively.
For each α ∈ N0 , the power function is a polynomial, for each α ∈ Z, a rational function,
i.e. continuity is provided by Ex. 7.40(b) and Ex. 7.40(c), respectively. For a general
α ∈ R, the continuity proof on R+ will be postponed to Ex. 7.76(a) below, where it can
be accomplished more easily. So it remains to show the continuity in x = 0 for α > 0.
However, if (xn )n∈N is a sequence in R+ with limn→∞ xn = 0 and k ∈ N with 1/k ≤ α,
1/k
then, at least for n sufficiently large such that xn ≤ 1, 0 < xαn ≤ xn by (7.55d). Then
1/k
the continuity of x 7→ x1/k implies limn→∞ xn = 0 and the Sandwich Th. 7.16 implies
limn→∞ xαn = 0, proving continuity in x = 0.
(b): Everything has already been proved – continuity is provided by (7.61), monotonicity
is provided by (7.55c) and (7.55d).
Remark and Definition 7.73 (Logarithm). According to Th. 7.72(b), for each a ∈
R+ \ {1}, the exponential function f : R −→ R+ , f (x) := ax , is continuous and strictly
monotone with f (R) = R+ (verify that the image is all of R+ as an exercise). Then
Th. 7.60 implies the existence of a continuous and strictly monotone inverse function
f −1 : R+ −→ R. For each x ∈ R+ , we call f −1 (x) the logarithm of x to base a and write
loga x := f −1 (x). The most important special case is where the base is Euler’s number,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 101
a = e. This is called the natural logarithm. Bases a = 2 and a = 10 also carry special
names, binary and common logarithm, respectively. The notation is
ln x := loge x, lb x := log2 x, lg x := log10 x, (7.64)
however, the notation in the literature varies – one finds log used instead of ln, lb, and
lg; one also finds lg instead of lb. So you always need to verify what precisely is meant
by either notation.
Corollary 7.74. For each a ∈ R+ \ {1}, the logarithm function f : R+ −→ R, f (x) =
loga x is continuous. For a > 1, it is strictly increasing; for 0 < a < 1, it is strictly
decreasing.
Theorem 7.75. One obtains the following logarithm rules:
∀ loga 1 = 0, (7.65a)
a∈R+ \{1}
∀ loga a = 1, (7.65b)
a∈R+ \{1}
∀ ∀ aloga x = x, (7.65c)
a∈R+ \{1} x∈R+
∀ ∀ loga ax = x, (7.65d)
a∈R+ \{1} x∈R
√ 1
∀ ∀+ ∀ loga n
x= loga x, (7.65h)
+
a∈R \{1} x∈R n∈N n
∀ ∀ logb x = (logb a) loga x. (7.65i)
a,b∈R+ \{1} x∈R+
Proof. All the rules are easy consequences of the logarithm being defined as the inverse
function to f : R −→ R+ , f (x) := ax .
(7.65a): It is loga 1 = f −1 (1) = 0, as f (0) = a0 = 1.
(7.65b): It is loga a = f −1 (a) = 1, as f (1) = a1 = a.
(7.65c): It is aloga x = f (f −1 (x)) = x.
(7.65d): It is loga ax = f −1 (f (x)) = x.
(7.65e): It is loga (xy) = f −1 (xy) = f −1 f (loga x + loga y) = loga x + loga y, since
(7.65c)
f (loga x + loga y) = aloga x+loga y = aloga x aloga y = xy.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 102
(7.65f): It is loga (xy ) = f −1 (xy ) = f −1 f (y loga x) = y loga x, since
(7.65c)
f (y loga x) = ay loga x = (aloga x )y = xy .
(7.65g) is just a combination of (7.65e) and (7.65f): loga (x/y) = loga (xy −1 ) = loga x −
loga y.
√
(7.65h) is just a special case of (7.65f): loga n x = loga x1/n = n1 loga x.
(7.65i): One computes
(7.65f) (7.65c)
(logb a) loga x = logb aloga x = logb x.
Thus, we have verified all the rules and concluded the proof.
f : R+ −→ R, f (x) := xα = eα ln x , (7.66)
is continuous, which follows from Th. 7.41, since f = exp ◦(α ln), ln is continuous
by Cor. 7.74, and exp is continuous by Th. 7.72(b).
7.3 Series
7.3.1 Definition and Convergence
Series are a special type of sequences, namely sequences whose members arise from
summing up the members of another sequence. We have, on occasion, already encoun-
tered series, for example the harmonic series (sn )n∈N , whose members sn were defined
in (7.27). In the present section, we will study series more systematically.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 103
Definition 7.77. Given a sequence (an )n∈N in K (or, more generally, in any set A,
where an addition is defined), the sequence (sn )n∈N , where
n
X
∀ sn := aj , (7.67)
n∈N
j=1
The anPare called the summands of the series, the sn its partial sums. Moreover, each
series ∞j=k aj with k ∈ N is called a remainder (series) of the series (sn )n∈N .
The example of the remainder series already shows that it is useful to allow countable
index sets other than N. Thus, if (aj )j∈I , where I is a countable index set and φ : N −→
I a bijective map, then define
X X∞
aj := aφ(j) (7.69)
j∈I j=1
For sequences in K, the notion of convergence is available, and, thus, it is also available
for series arising from real or complex sequences (as such series are, again, sequences in
K).
Definition 7.78. If (sn )n∈N is a series with the sn defined as in (7.67) and with sum-
mands aj ∈ K, then the series is called convergent with limit s ∈ K if, and only if,
limn→∞ sn = s in the sense of (7.1). In that case, one writes
∞
X
aj = s (7.70)
j=1
and calls s the sum of thePseries. The series P is called divergent if, and only if, it is
∞
not convergent. We write j=1 aj = ∞ (resp. ∞ j=1 aj = −∞) if, and only if, (sn )n∈N
diverges to ∞ (resp. −∞) in the sense of Def. 7.18.
P
Caveat 7.79. One has to use care as the symbol ∞ j=1 aj is used with two completely
different meanings. If it is used according to (7.68), then it means a sequence; if it is
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 104
used according to (7.70), then it means a real or complex number (or, possibly, ∞ or
−∞). It should always be clear from the Pcontext, if it means a sequence or a number.
For example, in the statement “the series ∞j=1 2
−j
is convergent”, it means a sequence;
P∞ −j
whereas in the statement “ j=1 2 = 1”, it means a number.
P
Example 7.80. (a) For each q ∈ C with |q| < 1, ∞ j
j=0 q is called a geometric series.
From (3.22b) (the reader is asked to go back and check that (3.22b) andPits proof,
indeed, remain valid for each q ∈ C), we obtain the partial sums sn = nj=0 q j =
1−q n+1
1−q
Since |q| < 1, we know limn→∞ q n+1 = 0 from Ex. 7.6. Thus, the series is
.
convergent with
∞
X 1 − q n+1 1
∀ q j = lim sn = lim = . (7.71)
|q|<1 n→∞ n→∞ 1 − q 1−q
j=0
(a) Linearity:
∞
X ∞
X ∞
X
∀ (λ aj + µ bj ) = λ aj + µ bj . (7.73)
λ,µ∈C
j=1 j=1 j=1
(c) Monotonicity:
∞
X ∞
X
∀ aj , b j ∈ R ∧ aj ≤ b j ⇒ aj ≤ bj . (7.75)
j∈N
j=1 j=1
P∞ P∞
(d) Each P
remainder seriesP j=n+1 a j , n ∈ N, converges, and, letting S := j=1 aj ,
sn := nj=1 aj , rn := ∞ a
j=n+1 j , one has
∀ S = s n + rn , lim an = lim rn = 0. (7.76)
n∈N n→∞ n→∞
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 105
Proof. (a) follows from the first two identities of Th. 7.13(a), (b) is due to
∞
X n
X n
X n
X ∞
X
Def. and Rem. 5.5(a) (7.11f)
aj = lim aj = lim aj = lim aj = aj ,
n→∞ n→∞ n→∞
j=1 j=1 j=1 j=1 j=1
(c) follows from Th. 7.13(c), and, for (d), one computes
lim an = lim (sn − sn−1 ) = S − S = 0,
n→∞ n→∞
Xk
∀ rn = lim aj = lim (sk − sn ) = S − sn ,
n∈N k→∞ k→∞
j=n+1
lim rn = lim (S − sn ) = S − S = 0,
n→∞ n→∞
P∞ P∞
(b) If j=1 aj is divergent, then j=1 |bj | is divergent as well.
Proof. Since
Pn(b) is merely the P contraposition of (a), it suffices toP
prove (a). ToP
this end,
let sn := j=1 aj and tn := j=1 |bj | be the partial sums of j=1 aj and ∞
n ∞
j=1 |bj |,
respectively. Since (tn )n∈N converges, it must be a Cauchy sequence by Th. 7.29. Thus,
∀ ∃ ∀ |tn − tm | = |bm+1 | + · · · + |bn | < ǫ
ǫ∈R+ N ∈N, n>m>N
N ≥k
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 106
that means the error made when approximating the limit by the partial sum sn has the
same sign as the first neglected summand an+1 , and its absolute value is less than |an+1 |.
Proof. We first consider the case where a1 > 0, i.e. where there exists a strictly de-
creasing sequence of positive numbers (bn )n∈N such that an = (−1)n+1 bn . As the bn are
strictly decreasing, we obtain bn − bn+1 > 0 for each n ∈ N, such that the sequences
(un )n∈N and (vn )n∈N , defined by
n
X
∀ un := s2n = (b2j−1 − b2j ) = (b1 − b2 ) + (b3 − b4 ) + · · · + (b2n−1 − b2n ),
n∈N
j=1
n
X
∀ vn := s2n+1 = b1 − (b2j − b2j+1 )
n∈N
j=1
are strictly monotone, namely (un )n∈N strictly increasing and (vn )n∈N strictly decreasing.
Since, 0 < un < un + b2n+1 = vn < b1 for each n ∈ N, both sequences (un )n∈N and
(vn )n∈N are also bounded, and, thus, convergent by Th. 7.19, i.e. U := limn→∞ un ∈ R
and V := limn→∞ vn ∈ R. Since
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 107
Pj=1 (−a j ) = θ (−a 1 ) for a suitable θ ∈]0, 1[. However, this then yields, as before,
∞
j=1 aj = θ a1 .
P
Applying the above result to each remainder series ∞ j=n+1 aj , n ∈ N, completes the
proof of (7.79) and the theorem.
Example 7.86. (a) Each of the following alternating series clearly converges, as the
Leibniz criterion of Th. 7.85 clearly applies in each case:
∞
X (−1)j+1 1 1
=1− + − +..., (7.80a)
j=1
j 2 3
∞
X (−1)j+1 1 1
=1− + − +..., (7.80b)
j=1
2j − 1 3 5
∞
X (−1)j+1 1 1 1
= − + − +... (7.80c)
j=1
ln(j + 1) ln 2 ln 3 ln 4
(b) To see that Th. 7.85P is false without its monotonicity requirement, take any diver-
gent series with ∞ j=1 aj = ∞, 0 <P aj , limj→∞ aj = 0 (for example the harmonic
series), any convergent series with ∞ +
j=1 cj = s ∈ R and 0 < cj (for example any
geometric series with 0 < q < 1), and define
(
a(n+1)/2 for n odd,
dn :=
−cn/2 for n even.
P
an exercise to show that ∞
It is P j=1 dj is an alternating series with limn→∞ dn = 0
∞
and j=1 dj = ∞.
P∞
Definition
P∞ 7.87. The series j=1 aj in C is said to be absolutely convergent if, and
only if, j=1 |aj | is convergent.
P∞
Corollary 7.88. Every absolutely convergent series j=1 aj is also convergent and
satisfies the triangle inequality for infinite series:
∞ ∞
X X
aj ≤ |aj |. (7.81)
j=1 j=1
Proof. The corollary is given by the special case aj = bj for each j ∈ N of Th. 7.83(a).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 108
P∞
Theorem 7.89. We consider the series j=1 aj in C.
P
(a) If ∞ +
P∞cj is a convergent series such that cj ∈ R0 and |aj | ≤ cj for each j ∈ N,
j=1
then j=1 aj is absolutely convergent.
X ∞
⇒ aj is absolutely convergent, (7.83a)
j=1
∞
an+1 X
an ≥ 1 for almost all n ∈ N ⇒ aj is divergent. (7.83b)
j=1
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 109
p
the harmonic series, which does not converge, but n
1/n < 1 for each n ≥ 2 and
1/(n+1) n
1/n
= n+1 < 1 for each n ∈ N.
P
Example 7.91. (a) For each z ∈ C with |z| < 1√and each p ∈ N0 , the series ∞ p n
n=1 n z
is absolutely convergent: We have limn→∞ n p
n = 1 as a consequence of Ex. 7.65.
p p
This implies limn→∞ |an | = limn→∞ n |z|n = |z| < 1. Thus, the root test of
n n p
Thus, the ratio test of (7.83a) applies and proves absolute convergence of the series
for |z| < e. For |z|
n > e, (7.83b) applies and proves divergence. Since, according to
Ex. 7.66, 1 + n1 < e for each n ∈ N, (7.83b) applies to prove divergence also for
|z| = e.
In general, one has to use care when dealing with infinite series, as convergence properties
and even the limit in case of convergence can depend on the order of the summands (in
obvious contrast to the situation of finite sums). For real series that are convergent, but
not absolutely convergent, one has the striking Riemann rearrangement Th. 7.93, that
states one can choose an arbitrary number S ∈ R∪{−∞, ∞} and reorder the summands
such that the new series converges to S (actually, Th. 7.93 says even more, namely that
one can prescribe an entire interval of cluster points for the rearranged series). However,
the situation is better for absolutely convergent series. In Th. 7.95, we will see that the
sum of absolutely convergent series does not depend on the order of the summands.
P
Proposition 7.92. Let ∞ j=1 aj be a series in R. Defining
∀ a+
j := max{aj , 0}, a−
j := max{−a j , 0} , (7.85)
j∈N
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 110
P∞
(b) If j=1 aj is convergent, but not absolutely convergent, then
∞
X ∞
X
a+
j = a−
j = ∞. (7.86)
j=1 j=1
∞
X ∞
X ∞
X
(7.87a),(7.73)
|aj | = a+
j + a−
j , (7.88)
j=1 j=1 j=1
P P
and, in particular, ∞j=1 aj is absolutely convergent. Conversely, if ∞ j=1 aj is absolutely
P∞ + P∞ −
convergent, then j=1 aj and j=1 aj are convergent by (7.87c) and Th. 7.83(a).
P P∞ + P∞ −
(b): If ∞ j=1 aj and j=1
P∞ a j are convergent, then (7.87b) implies that j=1
P∞ aj is also
convergent and, thus, j=1 aj absolutely convergent by (a). Likewise, if j=1 aj and
P∞ − P∞ +
j=1 aP
j are convergent, then (7.87b) implies that j=1 aj is
Palso convergent and, once
∞ ∞
again, j=1 aj absolutely convergent by (a). Therefore, if j=1 aj is convergent, but
not absolutely convergent, then (7.86) must hold by (7.77).
Theorem
P 7.93 (Riemann Rearrangement
P∞ Theorem a.k.a. Riemann Series Theorem).
Let ∞ a
j=1 j be a series in R. If a
j=1 j is convergent, but not absolutely
P∞ convergent,
then, given x, y ∈ R ∪ {−∞, ∞} with x ≤ y, there exists P a rearrangement j=1 bj of the
series (i.e. a reordering (bj )j∈N of (aj )j∈N ) such that ∞ j=1 bj has precisely all elements of
[x, y] as cluster points (where we call −∞ (resp. ∞) a cluster point of the real sequence
(tn )n∈N if, and only if, #{n ∈ N : tn < −N } = ∞ (resp. #{n ∈ N : tn > N } = ∞) for
each N ∈ N). In particular,P choosing S := x = y ∈ R ∪ {−∞, ∞}, one can prescribe an
arbitrary limit S such that ∞ j=1 bj = S.
Sketch of Proof. Here we just give a sketch of the proof to convey its fairly simple idea;
a detailed proof is provided in Appendix E.1. According to Prop. 7.92(b), (7.86) must
−
hold, where the a+j and aj are as defined in (7.85). Thus, we can define
−k for x = −∞, −k for y = −∞,
∀ xk := x for x ∈ R, yk := y for y ∈ R,
k∈N
k for x = ∞, k for y = ∞,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 111
and, noting xk ≤ yk for almost all k ∈ N, alternate between adding summands a+ j until
−
the partial sum exceeds yk and subtracting summands aj until the partial sum falls
below xk . If k is sufficiently large such that xk ≤ yk , then, at each switching point (from
adding to subtracting or vice versa), the absolute value of the difference between the
last partial sum and xk or yk , respectively, is less than the value of the last contributing
nonzero summand. Since
−
lim a+j = lim aj = 0,
j→∞ j→∞
the partial sums corresponding to the switching points converge to the respective end-
points x or y, respectively, and precisely all points between x and y are cluster points.
We will now study the more benign situation of absolutely convergent series.
P P∞
Theorem 7.94. Let ∞ j=1 aj and j=1 bj be
Pseries in C such that (bn )n∈N is a reordering
∞
of (an )n∈N inPthe sense P
P of Def. 7.21. If j=1 aj is absolutely convergent, then so is
∞ ∞ ∞
b
j=1 j and a
j=1 j = b
j=1 j .
Pn Pn Pn
Proof. Let sn := j=1 aj , s̃n := j=1 |aj |, and tn := j=1 bj denote the respective
partial sums. We will show that limn→∞ (sn − tn ) = 0. Given ǫ > 0, since (s̃n )n∈N is a
Cauchy sequence by Th. 7.29, there exists N ∈ N, such that
∀ |s̃n − s̃m | = |am+1 | + · · · + |an | < ǫ.
n>m>N
Since (bn )n∈N is a reordering of (an )n∈N , there exists a bijective map φ : N −→ N such
that bn = aφ(n) for each n ∈ N. Since φ is bijective, there exists M ∈ N such that
{1, 2, . . . , N + 1} ⊆ φ{1, 2, . . . , M }. Then n > M implies φ(n) > N + 1, and
∀ ∃ |sn − tn | ≤ |aN +2 | + · · · + |aN +k | < ǫ,
n>M k∈N
since all aj with j ≤ N +1 occur in both sn and tn and cancel in sn −tn (i.e. all aj that do
not cancel must have an index j > N + 1). So we have shown that limn→∞ (sn − tn ) = 0,
which, in turn, implies
∞
X ∞
X ∞
X
bj = lim tn = lim (tn − sn + sn ) = 0 + aj = aj .
n→∞ n→∞
j=1 j=1 j=1
Pn P∞ P∞
ApplyingPthis to s̃n := j=1 |aj | yields j=1 |bj | = j=1 |aj |, proving absolute conver-
gence of ∞j=1 bj .
Theorem 7.95. Let I be an arbitrary infinite countable index set and let
[˙
I= In (7.89)
n∈N
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 112
P
(a) If the series j∈I aj (cf. (7.69)) is absolutely convergent, then
X ∞ X
X
aj = aα . (7.90)
j∈I n=1 α∈In
P
Proof. (a):PFirst notePthat Th. 7.94 implies that, for absolutely convergent j∈I aj ,
∞
the limit j∈I aj = j=1 aφ(j) does not depend on the bijective map φ : N −→ I:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 113
X ∞
X
|aφ(j) | ≤ |aφ(j) | = rN +1 < ǫ.
j∈J j=N +2
proving
X k
X ∞ X
X
aj = S(I) = lim S(In ) = aα ,
k→∞
j∈I n=1 n=1 α∈In
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 114
which is (7.90).
P
(b): (i) implies (ii) with C := j∈I |aj | using Cl. 1 (with aj replaced by |aj |). (i) implies
P (with aj replaced by |aj |). (ii) implies (i) via (7.77), as C is an upper
(iii) using (7.90)
bound for (P nj=1 |a Pφ(j) |)n∈N for each bijection φ : N −→ I. Finally, (iii) implies (ii)
∞
with C := n=1 α∈In |aα |, since, given a finite J ⊆ I, there exists k ∈ N such that
J ⊆ I1 ∪ · · · ∪ Ik , i.e.
X k
X X ∞ X
X
|aj | ≤ |aα | ≤ |aα | = C,
j∈J n=1 α∈In n=1 α∈In
Example 7.96. We apply Th. 7.95 to so-called double series, i.e. to series with index
set I := N × N. The following notation is common:
∞
X X
amn := a(m,n) , (7.91)
m,n=1 (m,n)∈N×N
where one writes amn (also am,n ) instead of a(m,n) . Recall from Th. 3.16 that N × N is
countable. In general, the convergence properties of the double series and, if it exists,
the value of the sum, will depend on the chosen bijection φ : N −→ N × N.
However, we will now assume our double series to be absolutely convergent. Then Th.
7.94 guarantees the sum does not depend on the chosen bijection and we can apply Th.
7.95. Applying Th. 7.95 to the decompositions
˙
[
N×N= {(m, n) : n ∈ N}, (7.92a)
m∈N
˙
[
N×N= {(m, n) : m ∈ N}, (7.92b)
n∈N
˙
[
N×N= {(m, n) ∈ N × N : m + n = k}, (7.92c)
k∈N
yields
X ∞ X
X ∞ ∞ X
X ∞
(7.92a) (7.92b)
a(m,n) = amn = amn
(m,n)∈N×N m=1 n=1 n=1 m=1
∞
X X ∞ X
X k−1
(7.92c)
= amn := am,k−m . (7.93)
k=2 m+n=k k=2 m=1
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 115
∞ X
X ∞ ∞
X
|am bn | = |am | B = AB < ∞,
m=1 n=1 m=1
P
i.e. ∞m,n=1 am bn is absolutely convergent according to Th. 7.95(b)(iii). Now the second
equality in (7.94) is just the third equality in (7.93), and the first equality in (7.94) also
follows from (7.93):
∞ ∞ X ∞ ∞ ∞ ∞
! ∞ !
X X X X X X
am b n = am b n = am bn = am bm ,
m,n=1 m=1 n=1 m=1 n=1 m=1 m=1
We are mostly used to representing real numbers in the decimal system. For example,
we write ∞
395 X
x= 2 1 0
= 131.6 = 1 · 10 + 3 · 10 + 1 · 10 + 6 · 10−n , (7.95a)
3 n=1
where ∞
X
−n (7.71) 1 1 2
6 · 10 = 6· 1 −1 =6· = .
n=1
1 − 10 9 3
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
7 LIMITS AND CONVERGENCE OF REAL AND COMPLEX NUMBERS 116
The decimal system represents real numbers as, in general, infinite series of decimal
fractions. Digital computers represent numbers in the dual system, using base 2 instead
of 10. For example, the number from (7.95a) has the dual representation
∞
X
x = 10000011.10 = 2 + 2 + 2 + 7 1 0
2−(2n+1) , (7.95b)
n=0
Representations with base 16 (hexadecimal) and 8 (octal) are also of importance when
working with digital computers. More generally, each natural number b ≥ 2 can be used
as a base.
Definition 7.98. Let b ≥ 2 be a natural number.
is called a b-adic series. The number b is called the base or the radix, and the
numbers dν are called digits.
(b) If x ∈ R+0 is the sum of the b-adic series given by (7.96), than one calls the b-adic
series a b-adic representation or a b-adic expansion of x.
Theorem 7.99. Given a natural number b ≥ 2 and a nonnegative real number x ∈
R+0 , there exists a b-adic series representing x, i.e. there is N ∈ Z and a sequence
(dN , dN −1 , dN −2 , . . . ) in {0, . . . , b − 1} such that
∞
X
x= dN −ν bN −ν . (7.97)
ν=0
If one introduces the additional requirement that 0 6= dN , then each x > 0 has either a
unique b-adic representation or precisely two b-adic representations. More precisely, for
0 6= dN and x > 0, the following statements are equivalent:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 117
(iii) There exists a b-adic representation of x such that dn = 0 for each n ≤ n0 for
some n0 < N .
(iv) There exists a b-adic representation of x such that dn = b − 1 for each n ≤ n0 for
some n0 ≤ N .
Example 7.100. Every natural number has precisely two decimal (i.e. 10-adic) repre-
sentations. For instance,
∞
X
−n (7.71) 1 1
2 = 2.0 = 1.9 = 1 + 9 · 10 = 1+9· 1 −1 =1+9· , (7.98)
n=1
1 − 10 9
(a) We say (fn )n∈N converges pointwise to f : M −→ K if, and only if, limn→∞ fn (z) =
f (z) for each z ∈ M , i.e. if, and only if,
(b) We say (fn )n∈N converges uniformly to f : M −→ K if, and only if,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 118
Remark 8.2. It is immediate from Def. 8.1(a),(b) that uniform convergence implies
pointwise convergence, but Ex. 8.3(b) below will show the converse is not true.
(b) The sequence (fn )n∈N , where fn : [0, 1] −→ R, fn (x) := xn , converges pointwise,
but not uniformly, to
(
0 for 0 ≤ x < 1,
f : [0, 1] −→ R, f (x) := (8.3)
1 for x = 1 :
1
∀ |fn (ξn ) − f (ξn )| = ξnn = = ǫ, (8.4)
n∈N 2
proving the convergence is not uniform.
Thus,
ǫ
∀ |f (z)−f (ζ)| ≤ |f (z)−fm (z)|+|fm (z)−fm (ζ)|+|fm (ζ)−f (ζ)| < 3· = ǫ, (8.7)
z∈M ∩Bδ (ζ) 3
proving continuity of f in ζ.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 119
(a) The series converges pointwise to f : M −→ K if, and only if, it (i.e. (sn )n∈N )
converges pointwise in the sense of Def. 8.1(a). In that case, we use the notation
∞
X
f= fj . (8.10)
j=1
If (8.10) holds, then the series is sometimes called a series expansion of f , in par-
ticular, a power series expansion if the series happens to be a power series.
P
Analogous to the situation of series in K, the notation ∞ j=1 fj is also used with
two different meanings – it can mean the sequence of partial sums as in (8.8) or, in
the case of convergent series, the limit function as in (8.10) (cf. Caveat 7.79).
(b) The series converges uniformly to f : M −→ K if, and only if, it converges uniformly
in the sense of Def. 8.1(b).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 120
P∞
Corollary 8.7. Consider a function series j=1 fj with fj : M −→ K, ∅ 6= M ⊆ C.
(a) The series converges uniformly to some f : PM −→ K if, and only if, for each
n ∈ N and each z ∈ M , the remainder series ∞j=n+1 fj (z) in K converges to some
rn (z) ∈ K such that
∀ + ∃ ∀ ∀ |rn (z)| < ǫ. (8.11)
ǫ∈R N ∈N n>N z∈M
P∞
(b) If j=1 aj is a convergent series in R+
0 , then the condition
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 121
P
Theorem 8.9. For each power series ∞ j
j=0 aj z , aj ∈ K, there exists a number r ∈
[0, ∞] := R+
0 ∪ {∞}, called the radius of convergence of the power series, such that
∞
X
z ∈ K ∧ |z| < r ⇒ aj z j converges absolutely in K, (8.13a)
j=0
X∞
z ∈ K ∧ |z| > r ⇒ aj z j diverges in K (8.13b)
j=0
In (8.15), lim sup denotes the pso-called limit superior, which is defined as the largest
cluster point of the sequence ( n |an |)n∈N if the sequence is bounded (cf. Th. 7.27) and
∞ if the sequence is unbounded. As the limit superior can be 0 or ∞, we also define
1/0 := ∞ and 1/∞ := 0 in (8.15).
One has the simpler formula
an
r = lim
, (8.16)
n→∞ an+1
provided all an are nonzero and provided the limit in (8.16) either exists in R+
0 or is ∞.
Proof. For the proof of (8.15), we apply theproot test from Th. 7.89(b). Here, for the
root test, we have to consider the sequence ( n |an ||z|n )n∈N . As a consequence of (7.11a)
and Prop. 7.26, lim supn→∞ (λxn ) = λ lim supn→∞ xn for each λ > 0 and each sequence
(xn )n∈N in R (with λ · ∞ := ∞, this also holds if the limit superior is infinite). Thus,
p p
lim sup n |an ||z|n = |z| lim sup n |an | = |z| L.
n→∞ n→∞
If |z| > 1/L, then |z| L > 1 and (7.82b) applies, i.e. (8.13b) holds for r = 1/L. If
|z| < 1/L, then |z| L < 1, and, recalling the Bolzano-Weierstrass Th. 7.27, one sees that
(7.82a) applies, i.e. (8.13a) holds for r = 1/L.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 122
P
Next, if 0 < r0 < r, then ∞ j
j=0 |aj r0 | converges according to (8.13a). Since, for each
z ∈ Br0 (0) and each j ∈ N, we have |aj z j | ≤ |aj r0j |, (8.14) is a consequence of Cor.
8.7(b).
The validity of (8.16) follows from the ratio test of Th. 7.89(c): If all an 6= 0 and z 6= 0,
then
an+1 z n+1 an+1 |z|
lim
n
= |z| lim
=
an .
n→∞ an z n→∞ an
limn→∞ an+1
an
If |z| < l := limn→∞ an+1 , then |z|/l < 1, i.e. (7.83a) applies, proving (8.13a) for r = l.
If |z| > l, then |z|/l > 1, i.e. (7.83b) applies, proving (8.13b) for r = l.
P∞
Corollary 8.10. If j=0 aj z j , aj ∈ K, is a power series with radius of convergence
r ∈]0, ∞], then the function
∞
X
f : Br (0) −→ K, f (z) := aj z j , (8.17)
j=0
which, for each α ∈ Z, follows from (7.46) and Th. 7.13(a), and, then, for all α ∈ R
from the Sandwich Th. 7.16.
Let us investigate what can happen for |z| = r = 1 for some cases: The series
P ∞ n
n=1 z (α = 0) is divergent for each z ∈ C with z = 1 by the observation that
(z )n∈NPdoes not converge to 0 for n → ∞ (as |z n | = 1 for each n ∈ N); the
n
series ∞ n=1 n
−1 n
z (α = −1) is the harmonic series, i.e. divergent, for z = 1, but
convergent for z = −1 according to Ex. 7.86(a).
P P∞ z n
(b) The radius of convergence of both ∞ zn
n=0 n! and n=0 nn is r = ∞ by (8.16) and
(8.15), respectively, since
an
lim = lim (n + 1)! = lim (n + 1) = ∞, (8.19a)
n→∞ an+1 n→∞ n! n→∞
r
p n 1 1
lim sup n |an | = lim n
= lim = 0. (8.19b)
n→∞ n→∞ n n→∞ n
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 123
P
(c) The radius of convergence of ∞ n
n=0 n! z is r = 0 by (8.16), since
an n! 1
lim = lim = lim = 0. (8.20)
n→∞ an+1 n→∞ (n + 1)! n→∞ n + 1
P
Caveat 8.12. Theorem 8.9 does not claim the uniform convergence of ∞ aj z j on
P∞ j=0
Br (0), which is usually not true (e.g., it is an exercise to show that j=0 z j does not
converge uniformlyP∞on B1 (0)). Theorem 8.9 also claims nothing about the convergence
j
or divergence of j=0 aj z for |z| = r, which has to be determined case by case (cf. Ex.
8.11(a)).
P∞ j
Definition and Remark 8.13. Given two power series p := j=0 aj z and q :=
P∞ j
j=0 bj z in K, we define their Cauchy product
∞ j
X X
j
p ∗ q := cj z , where cj := ak bj−k = a0 bj + a1 bj−1 + · · · + aj b0 . (8.21)
j=0 k=0
Note that we have not assumed any convergence of the series so far, i.e. p, q, and p ∗ q
are not K-valued functions, but sequences of K-valued functions according to Def. 8.5
(sequences of polynomials, actually). Sometimes one also calls the Cauchy product p ∗ q
the convolution of p and q.
Now, if we do assume p and q to have some nonzero radii of convergence, say rp , rq ∈
]0, ∞], respectively, then, by (8.13a), both series are absolutely convergent for each
z ∈ Br (0), where r := min{rp , rq }. Thus, the functions
∞
X ∞
X
f : Br (0) −→ K, f (z) := aj z j , g : Br (0) −→ K, g(z) := bj z j , (8.22)
j=0 j=0
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 124
From Ex. 8.11(b), we already know the radius of convergence of the power series in
(8.24) is ∞, such that the function in (8.24) is well-defined.
For the time being, we also redefine Euler’s number as e := exp(1) > 1 > 0 and, for each
x ∈ R+ , ln x := logexp(1) (x). This, as well as calling the function of (8.24) exponential
function, will be justified as soon as we will have proved
n X ∞
1 1 1 1 1
lim 1 + = = 1 + + + + ... (8.25)
n→∞ n n=0
n! 1! 2! 3!
and ∞
x
X xn x2 x3
∀ e = =1+x+ + + ... (8.26)
x∈R
n=0
n! 2! 3!
in (8.36) of Th. 8.18 and in Th. 8.16(c) below, respectively.
∀ E(x) = ax . (8.28)
x∈R
Proof. First, a = E(1) = E(0 + 1) = E(0)E(1) = E(0) a and a > 0 shows E(0) = 1.
−1
Then, for each x ∈ R, 1 = E(0) = E(x − x) = E(x)E(−x), i.e. E(−x) = E(x) ,
showing E(x) 6= 0 for each x ∈ R. Thus, E(1) > 0, the continuity of E, and the
intermediate value Th. 7.57 imply E(x) > 0 for each x ∈ R. Next, an induction shows
n
∀ ∀ E(n · x) = E(x) : (8.29)
x∈R n∈N
Applying (8.29) with x = 1 shows E(n) = an for each n ∈ N. Applying (8.29) with
x = 1/n, n ∈ N, shows a = E(1) = (E(1/n))n , i.e. E(1/n) = a1/n since E(1/n) > 0.
Next,
(8.29) k k
∀ E(k/n) = E(1/n) = (a1/n )k = a n ,
n,k∈N
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 125
showing (8.28) holds for each x ∈ Q+ . Then (8.28) also holds for each x ∈ R+ , since, if
(qn )n∈N is a sequence in Q+ with limn→∞ qn = x, then the continuity of E implies
Finally, if x ∈ R− , then
−1
ax = (a−x )−1 = E(−x) = E(x),
Theorem 8.16. We consider the exponential function exp as defined in (8.24). The
following holds:
Proof. (a) holds by Cor. 8.10; for (b), we compute (using (7.94)),
∞
X
exp(z) exp(w) = cn ,
n=0
∀ n n ;
z,w∈C
X z j wn−j 1 X n j n−j (5.22) (z + w)n
where cn = = z w =
j=0
j! (n − j)! n! j=0 j n!
(8.30)
and then (c) is an immediate consequence of (a), (b), and Prop. 8.15.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 126
Theorem 8.18. We consider the exponential function exp as defined in (8.24). With
ez := exp(z) for each z ∈ C and ln x := logexp(1) (x) for each x ∈ R+ (cf. Th. 8.16(c)
and Def. and Rem. 8.14), we have the following limits:
ez − 1
lim =1 z ∈ M := C \ {0} , (8.32)
z→0 z
ln(1 + x)
lim =1 x ∈ M :=] − 1, ∞[\{0} , (8.33)
x→0 x
1
∀ lim ln(1 + ξ x) x = ξ x ∈ M := {x ∈ R : 1 + ξ x > 0} \ {0} , (8.34)
ξ∈R x→0
1
∀ lim (1 + ξ x) x = eξ x ∈ M := {x ∈ R : 1 + ξ x > 0} \ {0} , (8.35)
ξ∈R x→0
∞
x n x
X xn
∀ lim 1 + =e = . (8.36)
x∈R n→∞ n n=0
n!
where, in the last step, it was used that limk→∞ xk = 0 and the continuity of f implies
limk→∞ f (xk ) = ln 1 = 0.
Similarly, but simpler, one obtains (8.34) and (8.35) (exercise). Finally, for the sequence
(xn )n∈N with xn := 1/n, (8.35) implies (8.36).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 127
Theorem 8.20. (a) The first two exponentiation rules of (7.54) still hold for each
a, b > 0 and each z, w ∈ C:
az+w = az aw , (8.38a)
az bz = (ab)z . (8.38b)
f : C −→ C, f (z) := az , (8.39a)
g : R+ −→ C, g(x) := xζ , (8.39b)
is continuous.
proving (8.38b).
(b): The continuity of both functions follows from the continuity of exp (according to
Th. 8.16(a)) and from the fact that continuity is preserved by compositions (according
to Th. 7.41): The exponential function f , given by f (z) = ez ln a , is the composition of
the continuous functions z 7→ z ln a and w 7→ ew , whereas (analogous to Ex. 7.76(a)),
the power function g, given by g(x) = eζ ln x , is the composition g = exp ◦(ζ ln), where
ln is continuous by Cor. 7.74.
(c): Exercise.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 128
Definition and Remark 8.21. We define the sine function, denoted sin, and the
cosine function, denoted cos by
∞
X (−1)n z 2n+1 z3 z5
sin : C −→ C, sin z := =z− + − +..., (8.41a)
n=0
(2n + 1)! 3! 5!
∞
X (−1)n z 2n z2 z4
cos : C −→ C, cos z := =1− + − +.... (8.41b)
n=0
(2n)! 2! 4!
(a) sin and cos are well-defined and continuous: For both series and each z ∈ C, we can
estimate the absolute value of the nth summand by the nth summand of the series
for the exponential function e|z| (cf. (8.36)), which we know to be convergent from
Ex. 8.11(b). Thus, by Th. 8.9, both series in (8.41) have radius of convergence ∞
and are continuous by Cor. 8.10.
(b) cos : R −→ R (i.e. cos↾R ) has a smallest positive zero α ∈ R+ . We define π := 2α.
One can show π is an irrational number (see Appendix H.2) and its first digits are
π = 3.14159 . . .
To see cos has a smallest positive zero and to obtain a first (very coarse) estimate,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 129
note
xk xk+1 x
∀ ∀ > ⇔ 1> ⇔ k+1>x ,
x∈R+ k∈N k! (k + 1)! k+1
k k+1
showing xk! > (k+1)!
x
holds for each k ≥ 2 and each x ∈]0, 3[. In particular, the
summands of the series in (8.41) converge monotonically to 0 (for k ≥ 2) and, since
the series are alternating for x 6= 0, Th. 7.85 applies and (7.79) yields
x2 x2 x4
f (x) := 1 − 2 < cos x < 1 − 2 + 24 =: g(x),
∀ (8.42)
0<x<3 x3 x3 x5
x− < sin x < x − + .
6 6 120
√ √ √
The zeros of x 7→ f p(x) are − 2,p2, i.e. 2p is its smallest positive zero;pthe zeros
√ √ √ p √ √
of x 7→ g(x) are − 6 − 2 3, − 6 + 2 3, 6 − 2 3, 6 + 2 3, i.e. 6 − 2 3
is its smallest positive zero. Thus, as f (0) = g(0) = 1, the intermediate value Th.
7.57 implies cos has a smallest positive zero α and
q
√ π √
1.4 < 2 < := α < 6 − 2 3 < 1.6 (8.43)
2
Theorem 8.22. We have the following identities:
sin 0 = 0, cos 0 = 1, (8.44a)
∀ sin z = − sin(−z), cos z = cos(−z), (8.44b)
z∈C
∀ sin(z + w) = sin z cos w + cos z sin w, (8.44c)
z,w∈C
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 130
Proof. (8.44a) is immediate from (8.41) since, for z = 0, all summands of the sine
series are 0 and all summands of the cosine series are 0, except the first one, which is
(−1)0 00
0!
= 1.
(8.44b) is also immediate from (8.41), since (−z)2n+1 = (−1)2n+1 z 2n+1 = −z 2n+1 and
(−z)2n = (−1)2n z 2n = z 2n .
(8.44c) and (8.44d) can be verified using the Cauchy product: According to (7.94),
∞ ∞
X X
sin z cos w = cn , cos z sin w = dn ,
n=0 n=0
n j 2j+1 n−j 2(n−j)
X (−1) z (−1) w
∀ where cn = , ,
z,w∈C
j=0
(2j + 1)! (2(n − j))!
n
X (−1)j z 2j (−1)n−j w2(n−j)+1
dn = ,
j=0
(2j)! (2(n − j) + 1)!
(−1)n (z + w)2n+1
= ,
(2n + 1)!
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 131
c0 = 1 and
n n−1
X (−1)n z 2j w2(n−j) X (−1)n−1 z 2j+1 w2(n−1−j)+1
∀ cn − dn−1 = −
n∈N
j=0
(2j)! (2(n − j))! j=0 (2j + 1)! (2(n − 1 − j) + 1)!
n n−1
X (−1)n z 2j w2n−2j X (−1)n z 2j+1 w2n−(2j+1)
= +
j=0
(2j)! (2n − 2j)! j=0 (2j + 1)! (2n − (2j + 1))!
2n 2n
X z j w2n−j
n (−1)n X 2n j 2n−j
= (−1) = z w
j=0
j! (2n − j)! (2n)! j=0
j
(−1)n (z + w)2n
= ,
(2n)!
proving (8.44d).
(8.44e): One computes for each z ∈ C:
(8.44d)
(sin z)2 + (cos z)2 = cos z cos(−z) − sin z sin(−z) = cos(z − z) = cos 0 = 1.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 132
For both series on the right-hand side and each z ∈ C, we can estimate the absolute
value of each summand by the corresponding summand of the exponential series for e|z|
(cf. (8.36)), showing they have radius of convergence ∞ and are continuous by Cor. 8.10.
In particular, their continuity in z = 0 proves (8.44j).
Theorem 8.23. One has sin(R) = cos(R) = [−1, 1], i.e. the image of both sine and
cosine is [−1, 1]. Moreover, for each k ∈ Z:
h π π i
sin is strictly increasing on − + 2kπ, + 2kπ , (8.45a)
2 2
π 3π
sin is strictly decreasing on + 2kπ, + 2kπ , (8.45b)
2 2
cos is strictly increasing on [(2k − 1)π, 2kπ], (8.45c)
cos is strictly decreasing on [2kπ, (2k + 1)π], (8.45d)
which, due to (8.44e), can be summarized (and visualized) by saying that, if x runs from
2kπ to 2(k + 1)π, then (cos x, sin x) runs once counterclockwise through the unit circle,
starting at (1, 0).
Proof. From (8.44e), we know sin(R) ⊆ [−1, 1] and cos(R) ⊆ [−1, 1]. As
π (8.44f) π (8.44b) (8.44a) (8.44h)
sin = 1, sin − = −1, cos 0 = 1, cos π = − cos 0 = −1,
2 2
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 133
the continuity of sine and cosine together with the intermediate value Th. 7.57 implies
sin(R) = cos(R) = [−1, 1].
3 2 4
From (8.42), we know 0 < x − x6 < sin x and cos x < 1 − x2 + x24 < 1 for each x ∈]0, π2 ],
implying
∀ cos(x + y) = cos x cos y − sin x sin y ≤ cos x cos y < cos x,
0≤x<x+y≤ π2
We now come to important complex number relations between sine, cosine, and the
exponential function.
Theorem 8.24. One has the following formulas, relating the (complex) sine, cosine,
and exponential function:
∀ eiz = cos z + i sin z (Euler formula), (8.46a)
z∈C
eiz + e−iz
∀ cos z = , (8.46b)
z∈C 2
eiz − e−iz
∀ sin z = . (8.46c)
z∈C 2i
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 134
As a first application of (8.46), we can now determine all solutions to the equation
ez = 1 and all zeros (if any) of exp, sin, and cos:
Theorem 8.25. The set of (complex) solutions to the equation ez = 1 consists precisely
of all integer multiples of 2πi, the exponential function has no zeros (neither in R nor
in C), and the set of all (real or complex) zeros of sine and cosine consists of a discrete
set of real numbers. More precisely:
Proof. Exercise.
Definition and Remark 8.26. We define tangent and cotangent by
sin z
tan : C \ cos−1 {0} −→ C, tan z := , (8.48a)
| {z } cos z
C \ {(2k + 1) π2 : k ∈ Z} by (8.47d)
cos z
cot : C \ sin−1 {0} −→ C, cot z := , (8.48b)
| {z } sin z
C \ {kπ : k ∈ Z} by (8.47c)
respectively. Since sine and cosine are both continuous, tangent and cotangent are also
both continuous on their respective domains. Both functions have period π, since, for
each z in the respective domains,
sin(z + π) (8.44h) − sin z (8.44h) − cos z
tan(z + π) = = = tan z, cot(z + π) = = cot z.
cos(z + π) − cos z − sin z
(8.49)
Since
π 1 π π 1 π
lim sin − = sin = 1 ∧ lim cos − = cos = 0
n→∞ 2 n 2 n→∞ 2 n 2
π 1 π 1
∧ cos − >0 ⇒ lim tan − = ∞,
2 n n→∞ 2 n
π π
π 1 π 1
lim sin − + = sin − = −1 ∧ lim cos − + = cos − =0
n→∞ 2 n 2 n→∞ 2 n 2
π 1 π 1
∧ cos − + >0 ⇒ lim tan − + = −∞,
2 n n→∞ 2 n
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 135
1 1 1
lim sin = sin 0 = 0 ∧ lim cos = cos 0 = 1 ∧ sin > 0
n→∞ n n→∞ n n
1
⇒ lim cot = ∞,
n→∞ n
1 1
lim sin π − = sin π = 0 ∧ lim cos π − = cos π = −1
n→∞ n n→∞ n
1 1
∧ sin π − >0 ⇒ lim cot π − = −∞,
n n→∞ n
we obtain tan(R \ cos−1 {0}) = cot(R \ sin−1 {0}) = R.
For each k ∈ Z,
i π π h
tan is strictly increasing on − + kπ, + kπ , (8.50a)
i2 2 h
cot is strictly decreasing on kπ, (k + 1)π : (8.50b)
On ]0, π2 [, sin is strictly increasing and cos is strictly decreasing, i.e. tan is strictly
increasing and cot is strictly decreasing. Since tan(−x) = sin(−x)/ cos(−x) = − tan(x),
on ] − π2 , 0[, tan is strictly increasing and cot is strictly decreasing. Taking into account
the signs of tan and cot on the respective intervals and their π-periodicity according to
(8.49) proves (8.50).
Definition and Remark 8.27. Since we have seen sin to be strictly increasing on
[− π2 , π2 ] with image [−1, 1], cos to be strictly decreasing on [0, π] with image [−1, 1], tan
to be strictly increasing on ] − π2 , π2 [ with image R, and cot to be strictly decreasing on
]0, π[ with image R; and since all four functions are continuous, Th. 7.60 implies the
existence of inverse functions, denoted by
arcsin : [−1, 1] −→ [−π/2, π/2], (8.51a)
arccos : [−1, 1] −→ [0, π], (8.51b)
arctan : R −→] − π/2, π/2[, (8.51c)
arccot : R −→]0, π[, (8.51d)
respectively, where all four inverse functions are continuous, arcsin is strictly increasing,
arccos is strictly decreasing, arctan is strictly increasing, and arccot is strictly decreasing.
Of course, using (8.45) and (8.50), respectively, one can also obtain the inverse functions
on different intervals, and, in the literature, such inverse functions are, indeed, considered
as well. Somewhat confusingly, it is common to denote all these different functions by
the same symbols, namely the ones introduced in (8.51). Here, we will not need to pursue
this any further, i.e. we will only consider the inverse functions precisely as defined in
(8.51), which are also known as the principle inverse functions of sin, cos, tan, and cot,
respectively.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 136
ϕ := arccos ξ,
In consequence,
z (8.46a)
= ξ + iη = cos ϕ + i sin ϕ = eiϕ ,
r
as desired. If y ≤ 0, then the above shows the existence of ψ ∈ R such that z̄ = x − iy =
reiψ = r cos ψ + ir sin ψ. Letting ϕ := −ψ, we, once again, have z = r cos ψ − ir sin ψ =
re−iψ = reiϕ , as desired, completing the existence proof for the representation (8.52).
Now assume (8.52) holds with r ≥ 0. Then
p
|z| = r|eiϕ | = r (sin ϕ)2 + (cos ϕ)2 = r.
Finally, if r eiϕ1 = r eiϕ2 with r > 0, then ei(ϕ1 −ϕ2 ) = 1, i.e. i(ϕ1 − ϕ2 ) ∈ {2kπi : k ∈ Z}
by (8.47a).
Definition and Remark 8.29. The representation of z ∈ C given by (8.52) is called its
polar form, where (r, ϕ) are also called polar coordinates of z, ϕ is called an argument of
z. For z 6= 0, one can fix the argument uniquely by the additional requirement ϕ ∈ [0, 2π[
(but one also finds other choices, for example ϕ ∈] − π, π], in the literature). The above
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 137
terminology is consistent with the common use of calling (r, ϕ) polar coordinates of the
vector z = (x, y) ∈ R2 (= C) (in contrast to the Cartesian coordinates (x, y)), where r
constitutes the distance of the point z = (x, y) from the origin (0, 0) and ϕ is the angle
between the vector z = (x, y) and the x-axis (cf. the three introductory paragraphs
of the previous Sec. 8.4). As promised, we can now better understand the geometric
interpretation of complex multiplication already described in Rem. 5.10: If z1 = r1 eiϕ1
and z2 = r2 eiϕ2 , then z1 z2 = r1 r2 ei(ϕ1 +ϕ2 ) , i.e. complex multiplication, indeed, means
multiplying absolute values and adding arguments.
Corollary 8.30. If z ∈ C, then |z| = 1 holds if, and only if, there exists ϕ ∈ R such
that z = eiϕ – in other words, the map
is surjective. Moreover f (ϕ1 ) = f (ϕ2 ) holds if, and only if, ϕ1 − ϕ2 = 2πk for some
k ∈ Z.
Corollary 8.31 (Roots of Unity). For each n ∈ N, the equation z n = 1 has precisely n
distinct solutions ζ1 , . . . , ζn ∈ C, where
The numbers ζ1 , . . . , ζn defined in (8.56) are called the nth roots of unity.
Proof. It is ζkn = ek2πi = 1 for each k ∈ {1, . . . , n} and the ζ1 , . . . , ζn are all distinct
by Cor. 8.30, since, for k, l ∈ {1, . . . , n} with k 6= l, (k − l)/n ∈ / Z. As ζ1 , . . . , ζn are
n distinct zeros of the polynomial P : C −→ C, P (z) := z n − 1, and P has at most n
zeros by Th. 6.6(a), ζ1 , . . . , ζn constitute all solutions to z n = 1.
We are now in a position to prove one of the central results of analysis and algebra,
namely the fundamental theorem of algebra. The following proof does not need any tools
beyond the ones provided by this class – it is actually mainly founded on continuous
functions attaining a min and a max on compact sets according to Th. 7.54 and the
existence of nth roots of unity according to Cor. 8.31.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 138
Claim 1. The function |P | attains its global min on C, i.e. there exists z0 ∈ C such that
|P | is minimal in z0 .
Proof. Proceeding by contraposition, we assume P (z0 ) 6= 0 and show that |P | does not
have a min in z0 . We need to construct z1 ∈ C such that |P (z1 )| < |P (z0 )|. To this end,
define
P (z0 + z)
p : C −→ C, p(z) := .
P (z0 )
Then p is still a polynomial of degree n. Since p(0) = 1,
n
X
∃ ∃ ∀ p(z) = 1 + bj z j , bk 6= 0.
k∈{1,...,n} bk ,...,bn ∈C z∈C
j=k
Write −b−1 −1
k in polar form, i.e. −bk = re
iϕ
with r ∈ R+ and ϕ ∈ R. Define
√
β := k r eiϕ/k i.e. β k = reiϕ = −b−1
k
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
8 CONVERGENCE OF K-VALUED FUNCTIONS 139
and
n
X
k k
q : C −→ C, q(z) := p(βz) = 1 + bk β z + bj β j z j = 1 − z k + z k+1 S(z),
j=k+1
∃ ∀ |S(z)| ≤ C.
C∈R+ z∈B 1 (0)
Letting
c := min{1, C −1 },
one obtains k+1
∀ z S(z) ≤ C |z|k+1 < |z|k
0<|z|<c
and, thus,
∀ |q(x)| ≤ 1 − xk + xk+1 S(x) < 1 − xk + xk = 1.
x∈]0,c[
Thus, finally,
|P (z0 + βx)|
∀ = |p(βx)| = |q(x)| < 1,
x∈]0,c[ |P (z0 )|
showing |P | does not have a min in z0 . N
(the ζ1 , . . . , ζn are precisely all the zeros of P , some or all of which might be iden-
tical).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 140
n = n1 + 2n2 , (8.58a)
and
n1
Y n2
Y
P (x) = c (x − ξj ) (x2 + αj x + βj ). (8.58b)
j=1 j=1
Proof. For (a), one just combines Th. 8.32 with Rem. 6.7.
(b): If P has only real coefficients, then we can take complex conjugates to obtain
showing that the nonreal zeros of P (if any) must occur in conjugate pairs. Moreover,
9 Differential Calculus
f (x) − f (ξ)
x−ξ
should provide “good” approximations of f ′ (ξ) if x tends to ξ. This leads to the following
Def. 9.1, where we also allow C-valued functions (while the above-described geometric
interpretation only works for R-valued functions, it can be applied to both the real and
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 141
the imaginary parts of a C-valued function, cf. Rem. 9.2 below); but note that we do
not consider differentiability of functions f : C −→ C, which would lead to the notion
of complex differentiability or holomorphicity, which is studied in the field of Complex
Analysis and is beyond the scope of this class.
Definition 9.1. Let a < b, f : ]a, b[−→ K (a = −∞, b = ∞ is admissible), and ξ ∈]a, b[.
Then f is said to be differentiable at ξ if, and only if, the following limit in (9.1) exists
in the sense of Def. 8.17 (where x 7→ f (x)−f
x−ξ
(ξ)
plays the role of x 7→ f (x) in Def. 8.17).
The limit is then called the derivative of f in ξ. Many symbols are used in the literature
to denote derivatives, the following provides a selection:
df (ξ) f (x) − f (ξ) f (ξ + h) − f (ξ)
f ′ (ξ) := ∂x f (ξ) := := lim = lim . (9.1)
dx x→ξ x−ξ h→0 h
Note both limits occurring in (9.1) are, indeed, identical, since the sequence (xk )k∈N in
]a, b[ converges to ξ if, and only if, the sequence (hk )k→∞ with hk := xk − ξ converges
to 0. The number in (9.1) (if it exists) is also called a differential quotient, whereas
f (x)−f (ξ)
x−ξ
is known as a difference quotient.
f is called differentiable if, and only if, it is differentiable at each ξ ∈]a, b[. In that case,
one calls the function
f ′ : ]a, b[−→ K, x 7→ f ′ (x), (9.2)
the derivative of f .
Remark 9.2. In the situation of Def. 9.1, the complex-valued function f : ]a, b[−→ C
is differentiable at ξ ∈]a, b[ if, and only if, both functions Re f, Im f : ]a, b[−→ R are
differentiable, and, in that case
f ′ (ξ) = (Re f )′ (ξ) + i (Im f )′ (ξ). (9.3)
Indeed, we merely have to note
f (x) − f (ξ) Re f (x) − Re f (ξ) Im f (x) − Im f (ξ)
∀ = +i (9.4)
x,ξ∈]a,b[, x−ξ x−ξ x−ξ
x6=ξ
and that, by (7.2) a sequence (zn )n∈N in C converges to ζ ∈ C if, and only if, both
limn→∞ Re zn = Re ζ and limn→∞ Im zn = Im ζ hold.
Definition 9.3. If f : ]a, b[−→ R as in Def. 9.1 is differentiable at ξ ∈]a, b[, then the
graph of the affine function
L : R −→ R, L(x) := f (ξ) + f ′ (ξ)(x − ξ), (9.5)
i.e. the line through (ξ, f (ξ)) with slope f ′ (ξ) is called the tangent to the graph of f at
ξ.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 142
Theorem 9.4. If f : ]a, b[−→ K as in Def. 9.1 is differentiable at ξ ∈]a, b[, then it is
continuous at ξ. In particular, if f is everywhere differentiable, then it is everywhere
continuous.
Proof. Let (xk )k∈N be a sequence in ]a, b[\{ξ} such that limk→∞ xk = ξ. Then
(xk − ξ) f (xk ) − f (ξ)
lim f (xk ) − f (ξ) = lim = 0 · f ′ (ξ) = 0, (9.6)
k→∞ k→∞ xk − ξ
proving the continuity of f in ξ.
Example 9.5. (a) For each a, b ∈ K, the affine function f : R −→ K, f (x) := ax + b,
is differentiable with f ′ (x) = a for each x ∈ R: If x ∈ R and (hk )k∈N is a sequence
with hk 6= 0 such that limk→∞ hk = 0, then
f (x + hk ) − f (x) a(x + hk ) + b − ax − b a hk
lim = lim = lim = a. (9.7)
k→∞ hk k→∞ hk k→∞ hk
(c) The sine and the cosine function f, g : R −→ R, f (x) := sin x, g(x) := cos x, are
differentiable with f ′ (x) = cos x and g ′ (x) = − sin x for each x ∈ R: If x ∈ R and
(hk )k∈N is a sequence with hk 6= 0 such that limk→∞ hk = 0, then
f (x + hk ) − f (x) sin(x + hk ) − sin x
lim = lim
k→∞ hk k→∞ hk
(8.44c) sin x cos hk + cos x sin hk − sin x
= lim
k→∞ hk
hk (cos hk − 1) sin hk
= sin x lim 2
+ cos x lim
k→∞ h k→∞ hk
k
(8.44j) 1
= (sin x) · 0 · − + (cos x) · 1 = cos x. (9.9)
2
The proof of g ′ (x) = − sin x is left as an exercise.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 143
f (0 + n1 ) − f (0)
lim 1 = lim 1 = 1, (9.10a)
n→∞ n→∞
n
1 1
f (0 − n
)− f (0) n
lim = lim = −1, (9.10b)
n→∞ − n1 n→∞ − 1
n
f (0+h)−f (0)
showing that h
does not have a limit for h → 0.
Remark 9.6. As just seen in Ex. 9.5(d), the absolute value function shows continuous
functions do not need to be differentiable. Somewhat surprisingly, with a bit more effort,
one can even construct continuous functions f : R −→ R that are not differentiable at
any x ∈ R (see Appendix J.1).
Theorem 9.7. Let a < b, f, g : ]a, b[−→ K (a = −∞, b = ∞ is admissible), and
ξ ∈]a, b[. Assume f and g are differentiable at ξ.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 144
(f g)(ξ + hk ) − (f g)(ξ)
lim
k→∞ hk
f (ξ + hk )g(ξ + hk ) − f (ξ)g(ξ + hk ) + f (ξ)g(ξ + hk ) − f (ξ)g(ξ)
= lim
k→∞ hk
f (ξ + hk ) − f (ξ) g(ξ + hk ) − g(ξ)
= lim g(ξ + hk ) lim + f (ξ) lim
k→∞ k→∞ hk k→∞ hk
= f ′ (ξ)g(ξ) + f (ξ)g ′ (ξ),
where, in the last equality, we used the continuity of g in ξ according to Th. 9.4.
For (d), one first proves the special case f ≡ 1 by
Example 9.8. (a) Each polynomial is differentiable and the derivative is, again, a
polynomial. More precisely,
n
X
P : R −→ K, P (x) = aj x j , aj ∈ K
j=0
n (9.11)
X
⇒ P ′ : R −→ K, P ′ (x) = j aj xj−1 :
j=1
The cases n = 0, 1 are provided by Ex. 9.5(a). To complete the induction P proof of
(9.11), we carry out the induction step for each n ∈ N: Writing P (x) = nj=0 aj xj +
an+1 x xn and applying the induction hypothesis as well as the rules of Th. 9.7 yields
n
X n+1
X
′ j−1 n n−1
P (x) = j aj x + an+1 (1 · x + x · n · x )= j aj xj−1 ,
j=1 j=1
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 145
(b) Clearly, the derivatives of rational functions P/Q with polynomials P and Q can
be computed from (9.11) and the quotient rule of Th. 9.7(d).
(c) The functions tan and cot as defined in (8.48) and restricted to R \ cos−1 {0} and
R \ sin−1 {0}, respectively, are differentiable and one obtains
1
tan′ : R \ cos−1 {0} −→ R, tan′ x = = 1 + (tan x)2 , (9.12a)
| {z } (cos x)2
R\{(2k+1) π2 : k∈Z}
1
cot′ : R \ sin−1 {0} −→ R, cot′ x = − 2
= −(1 + (cot x)2 ) : (9.12b)
| {z } (sin x)
R\{kπ: k∈Z}
One merely needs the derivatives of sin and cos from Ex. 9.5(c) and the quotient
rule of Th. 9.7(d):
cos x cos x − sin x(− sin x) (8.44e) 1 (8.44e)
tan′ x = = = 1 + (tan x)2 ,
(cos x)2 (cos x)2
− sin x sin x − cos x cos x (8.44e) 1 (8.44e)
cot′ x = 2
= − = −(1 + (cot x)2 ).
(sin x) (sin x)2
Theorem 9.9 (Derivative of Inverse Functions). Let a < b, I :=]a, b[ (a = −∞, b = ∞
is admissible). If f : I −→ R is differentiable and strictly increasing (resp. decreasing),
then f has a continuous, strictly increasing (resp. decreasing) inverse function f −1 de-
fined on the interval J := f (I), i.e. f −1 : J −→ I, and, for each ξ ∈ I with f ′ (ξ) 6= 0,
f −1 is differentiable at η := f (ξ) with
1 1
(f −1 )′ (η) = = . (9.13)
f ′ (ξ) f′ f −1 (η)
Proof. As a differentiable function, f is continuous by Th. 9.4, i.e. Th. 7.60 provides
all the present assertions, except differentiability at η and (9.13). Let (yk )k∈N be a
sequence in J \ {η} such that limk→∞ yk = η. Then, as f −1 is bijective and continuous,
(f −1 (yk ))k∈N is a sequence in I \ {ξ} such that limk→∞ f −1 (yk ) = ξ, and one obtains
f −1 (yk ) − f −1 (η) f −1 (yk ) − f −1 (η) 1
lim = lim = , (9.14)
k→∞ yk − η k→∞ f f −1 −1
(yk ) − f f (η) ′ −1
f f (η)
establishing the case.
Example 9.10. (a) The function ln : R+ −→ R is differentiable and, for each x ∈ R+ ,
ln′ x = 1/x: If f (x) = ex , then f ′ (x) = ex 6= 0 for each x ∈ R, ln x = f −1 (x), and
(9.13) yields
1 1 1
ln′ x = ′ = ln x = .
f (ln x) e x
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 146
(b) The function arcsin : ] √ − 1, 1[−→] − π/2, π/2[ is differentiable and, for each x ∈
′
] − 1, 1[, arcsin x = 1/ 1 − x2 : If f (x) = sin x, then f ′ (x) = cos x 6= 0 for each
x ∈] − π/2, π/2[, arcsin x = f −1 (x), and (9.13) yields
1 1 (∗) 1 1
arcsin′ x = ′ = = p =√ ,
f (arcsin x) cos arcsin x 1 − (sin arcsin x) 2 1 − x2
where, at (∗), it was used that cos2 = 1−sin2 and cos t > 0 for each t ∈]−π/2, π/2[.
(c) The function arccos
√ : ] − 1, 1[−→]0, π[ is differentiable and, for each x ∈] − 1, 1[,
arccos′ x = −1/ 1 − x2 : If f (x) = cos x, then f ′ (x) = − sin x 6= 0 for each x ∈]0, π[,
arccos x = f −1 (x), and (9.13) yields
1 1 (∗) 1 1
arccos′ x = ′ = = −p = −√ ,
f (arccos x) − sin arccos x 1 − (cos arccos x)2 1 − x2
where, at (∗), it was used that sin2 = 1 − cos2 and sin t > 0 for each t ∈]0, π[.
(d) The function arctan : R −→] − π/2, π/2[ is differentiable and, for each x ∈ R,
arctan′ x = 1/(1 + x2 ): Apply Th. 9.9 with f (x) = tan x as an exercise.
(e) The function arccot : R −→]0, π[ is differentiable and, for each x ∈ R, arccot′ x =
−1/(1 + x2 ): Apply Th. 9.9 with f (x) = cot x as an exercise.
Theorem 9.11 (Chain Rule). Let a < b, c < d, f : ]a, b[−→ R, g : ]c, d[−→ K,
f (]a, b[) ⊆]c, d[ (a, c = −∞; b, d = ∞ is admissible). If f is differentiable in ξ ∈]a, b[
and g is differentiable in f (ξ) ∈]c, d[, then g ◦ f : ]a, b[−→ K is differentiable in ξ and
(g ◦ f )′ (ξ) = f ′ (ξ)g ′ (f (ξ)). (9.15)
Let (xk )k∈N be a sequence in ]a, b[\{ξ} such that limk→∞ xk = ξ. One obtains
g f (xk ) − g f (ξ) (9.17) g̃ f (xk ) f (xk ) − f (ξ)
lim = lim
k→∞ xk − ξ k→∞ xk − ξ
f (xk ) − f (ξ)
= lim g̃ f (xk ) lim
k→∞ k→∞ xk − ξ
= f ′ (ξ)g ′ (f (ξ)), (9.18)
establishing the case.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 147
Example 9.12. (a) According to the chain rule of Th. 9.11, the function h : R −→ R,
h(x) := sin(−x3 ) is differentiable and, for each x ∈ R, h′ (x) = −3x2 cos(−x3 ).
(b) According to the chain rule of Th. 9.11, each power function h : R+ −→ K, h(x) :=
xα = eα ln x , α ∈ K, is differentiable and, for each x ∈ R+ , h′ (x) = αx eα ln x = α xα−1 .
Indeed, h = g ◦f , where f : R+ −→ R, f (x) := ln x with f ′ : R+ −→ R, f ′ (x) := x1 ,
according to Ex. 9.5(b), and g : R −→ K, g(x) := eαx , with g ′ : R −→ K,
g ′ (x) := α eαx according to Ex. 9.10(a).
If f (k+1) (ξ) exists for all ξ ∈ I, then f is said to be (k + 1)-times differentiable and the
function f (k+1) : I −→ K, x 7→ f (k+1) (ξ), is called the (k + 1)st derivative of f . It is
common to write f ′ := f (1) , f ′′ := f (2) , f ′′′ := f (3) , but f (k) if k ≥ 4.
If f (k) exists, it might or might not be continuous (cf. Ex. 9.14(c) below). One defines
n o
k (k)
∀ C (I, K) := f ∈ F(I, K) : f exists and is continuous on I , (9.20)
k∈N0
\
C ∞ (I, K) := C k (I, K) (9.21)
k∈N0
Example 9.14. (a) One has sin ∈ C ∞ (R) with sin′ = cos, sin′′ = − sin, sin′′′ = − cos,
sin(4) = sin, . . .
P
(b) A simple induction shows, for each polynomial P : R −→ K, P (x) = nj=0 aj xj ,
aj ∈ K, n ∈ N0 , that P (n) (x) = n! an . In particular, P ∈ C ∞ (R, K).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 148
Proof. Suppose f has a local max at ξ. Then there exists ǫ > 0 such that |h| < ǫ implies
f (ξ + h) − f (ξ) ≤ 0. Now let (hk )k∈N be a sequence in ]0, ǫ[ with limk→∞ hk = 0. Then
f (ξ ± hk ) − f (ξ) ≤ 0 for all k ∈ N implies
f (ξ + hk ) − f (ξ) f (ξ − hk ) − f (ξ)
f ′ (ξ) = lim ≤ 0, f ′ (ξ) = lim ≥ 0, (9.22)
k→∞ hk k→∞ −hk
showing f ′ (ξ) = 0. Now, if f has a local min at ξ, then −f has a local max at ξ, and
f ′ (ξ) = −(−f )′ (ξ) = 0 establishes the case.
Remark 9.16. For f : R −→ R, f (x) := x3 , it is f ′ (0) = 0, but f does not have a local
min or max at 0, showing that, while being necessary for an differentiable function f to
have a local extremum at ξ, f ′ (ξ) = 0 is not a sufficient condition for such an extremum
at ξ. Points ξ with f ′ (ξ) = 0 are sometimes called stationary or critical points of f .
—
Now, we first prove an important special case of the mean value theorem:
Theorem 9.17 (Rolle’s Theorem). Let a < b. If f : [a, b] −→ R is continuous on the
compact interval [a, b], differentiable on the open interval ]a, b[, and f (a) = f (b), then
there exists ξ ∈]a, b[ such that f ′ (ξ) = 0.
Proof. If f is constant, then f ′ (ξ) = 0 holds for each ξ ∈]a, b[. If f is nonconstant,
then there exists x ∈]a, b[ with f (x) 6= f (a). If f (x) > f (a), then Th. 7.54 implies the
existence of ξ ∈]a, b[ such that f attains its (global and, thus, local) max in ξ. Then
Th. 9.15 yields f ′ (ξ) = 0. The case f (x) < f (a) is treated analogously.
Theorem 9.18 (Mean Value Theorem). Let a < b. If f, g : [a, b] −→ R are continuous
on the compact interval [a, b], differentiable on the open interval ]a, b[, and g ′ (x) 6= 0 for
each x ∈]a, b[, then there exists ξ ∈]a, b[ such that
f (b) − f (a) f ′ (ξ)
= ′ . (9.23a)
g(b) − g(a) g (ξ)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 149
In the special case that g : [a, b] −→ R, g(x) = x, one obtains the standard form
f (b) − f (a)
= f ′ (ξ). (9.23b)
b−a
Proof. First note that Rolle’s Th. 9.17 and g ′ 6= 0 imply g(b) − g(a) 6= 0. Next, one
applies Rolle’s Th. 9.17 to the auxiliary function
f (b) − f (a)
h : [a, b] −→ R, h(x) := f (x) − g(x) − g(a) . (9.24)
g(b) − g(a)
Since f and g are continuous on [a, b] and differentiable on ]a, b[, so is h. Moreover,
h(a) = f (a) = h(b), i.e. Rolle’s Th. 9.17 applies and yields ξ ∈]a, b[ satisfying h′ (ξ) = 0.
However, (9.24) implies h′ (ξ) = 0 is equivalent to (9.23a).
Proof. If c < a < b < d and f ′ ≥ 0 (resp. f ′ ≤ 0, resp. f ′ ≡ 0), then (9.23b) implies
f (b) ≥ f (a) (resp. f (b) ≤ f (a), resp. f (b) = f (a)). Moreover, strict inequalities for f ′
yield strict inequality between f (b) and f (a).
Lemma 9.20. Let a < b, f : ]a, b[−→ R, ξ ∈]a, b[, and assume f is differentiable at ξ.
If f ′ (ξ) > 0 (resp. f ′ (ξ) < 0), then there exists ǫ > 0 such that ]ξ − ǫ, ξ + ǫ[⊆]a, b[ and
∀ ∀ f (a1 ) < f (ξ) < f (b1 ) resp. f (a1 ) > f (ξ) > f (b1 ) .
a1 ∈]ξ−ǫ,ξ[ b1 ∈]ξ,ξ+ǫ[
Proof. If there does not exist ǫ > 0 such that f (a1 ) < f (ξ) < f (b1 ) for each a1 ∈]ξ − ǫ, ξ[
and each b1 ∈]ξ, ξ + ǫ[, then then there exists a sequence (xk )k∈N in ]a, b[\{ξ} such that
limk→∞ xk = ξ and
f (xk ) − f (ξ)
∀ ≤ 0,
k∈N xk − ξ
showing f ′ (ξ) ≤ 0. Analogously, one obtains that f ′ (ξ) ≥ 0 provided there does not exist
ǫ > 0 such that f (a1 ) > f (ξ) > f (b1 ) for each a1 ∈]ξ − ǫ, ξ[ and each b1 ∈]ξ, ξ + ǫ[.
Caveat 9.21. The hypotheses of Lem. 9.20 are not sufficient for f to be increasing or
decreasing in any neighborhood of ξ: It is an exercise to find a counterexample.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 150
Theorem 9.22 (Sufficient Conditions for Extrema). Let c < d, let f : ]c, d[−→ R be
differentiable, and assume f ′ (ξ) = 0 for some ξ ∈]c, d[.
(a) If f ′ (x) > 0 for each x ∈]c, ξ[ and f ′ (x) < 0 for each x ∈]ξ, d[, then f has a strict
max at ξ. Likewise, if f ′′ (ξ) exists and is negative, then f has a strict max at ξ.
(b) If f ′ (x) < 0 for each x ∈]c, ξ[ and f ′ (x) > 0 for each x ∈]ξ, d[, then f has a strict
min at ξ. Likewise, if f ′′ (ξ) exists and is positive, then f has a strict min at ξ.
Proof. We just present the proof for (a); (b) is proved analogously. If f ′ (x) > 0 for each
x ∈]c, ξ[, then (9.23b) shows f (ξ)−f (a) > 0 for each c < a < ξ; analogously, if f ′ (x) < 0
for each x ∈]ξ, d[, then (9.23b) shows f (ξ) − f (b) > 0 for each ξ < b < d. Altogether, we
have shown f to have a strict max at ξ. If f ′′ (ξ) exists and is negative, then Lem. 9.20
yields the existence of ǫ > 0 such that f ′ is positive on ]ξ − ǫ, ξ[ and negative on ]ξ, ξ + ǫ[.
Applying what we have already proved with c := ξ − ǫ and d := ξ + ǫ establishes the
case.
Example 9.23. One obtains
f : R −→ R, f (x) := x ex , (9.25a)
f ′ : R −→ R, f ′ (x) = ex + x ex = (1 + x) ex , (9.25b)
f ′′ : R −→ R, f ′′ (x) = 2ex + x ex = (2 + x) ex . (9.25c)
From Th. 9.15, we know that f can have at most one extremum, namely at ξ = −1,
where f ′ (ξ) = 0. Since f ′′ (ξ) = e−x > 0, Th. 9.22(b) implies that f has a strict min at
−1.
—
The following Th. 9.24, the intermediate value theorem for derivatives, is another ap-
plication of Lem. 9.20. Even though Ex. 9.14(c) shows that derivatives do not have
to be continuous, not every function can occur as a derivative: The following Th. 9.24
shows that derivatives always satisfy an intermediate value property, even if they are
not continuous (that also means that discontinuities in derivatives are always due to
oscillations rather than jumps).
Theorem 9.24 (Intermediate Value Theorem for Derivatives). Let a, b, c, d ∈ R with
a < c ≤ d < b. If f : ]a, b[−→ R is differentiable, then f ′ assumes every value between
f ′ (c) and f ′ (d), i.e.
h i
min{f ′ (c), f ′ (d)}, max{f ′ (c), f ′ (d)} ⊆ f ′ [c, d] . (9.26)
Proof. Exercise (hint: use a suitable auxiliary function and apply Lem. 9.20 together
with Th. 7.54 and Th. 9.15).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 151
(b) limx→ξ g(x) = ∞ or limx→ξ g(x) = −∞, where Def. 8.17 is extended to the case
η ∈ {−∞, ∞} in the obvious way.
Then
f ′ (x) f (x)
lim =η ⇒ lim = η. (9.27)
x→ξ g ′ (x) x→ξ g(x)
The above statement also holds for ξ ∈ {−∞, ∞} and/or η ∈ {−∞, ∞} if, as in (b),
one extends Def. 8.17 to these cases in the obvious way.
Proof. First, we assume (a). Consider the case ξ ∈ R. Since f and g are continuous, (a)
implies f and g remain continuous, if we extend them to ξ by letting f (ξ) := g(ξ) = 0.
This extension will now allow us to apply Th. 9.18 to f and g. To prove (9.27), let
(xk )k∈N be a sequence in I with limk→∞ xk = ξ. Then (9.23a) yields, for each k ∈ N,
some ξk ∈]xk , ξ[ if xk < ξ and some ξk ∈]ξ, xk [ if ξ < xk , satisfying
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 152
f (x)
proving limx→ξ g(x)
= η.
We now assume (b), still letting (xk )k∈N be as before. Note that g ′ 6= 0 implies g
is injective by Rolle’s Th. 9.17. Then the intermediate value theorem implies g is
either strictly increasing or strictly decreasing. We proceed with the proof for the case
I =]a, ξ[, the proof for I =]ξ, b[ can be done completely analogous. We first consider
the case where g is strictly increasing, i.e. limx→ξ g(x) = ∞. Assume η ∈ R and ǫ > 0.
′ (x)
Then limx→ξ fg′ (x) = η and limx→ξ g(x) = ∞ imply
ǫ f ′ (x) ǫ
∃ ∀ g(x) > 0 ∧ η− < ′ <η+ .
c∈]a,ξ[ x∈]c,ξ[ 2 g (x) 2
Since limk→∞ xk = ξ, there exists N0 ∈ N such that, for each k > N0 , c < xk < ξ. Next,
according to Th. 9.18,
that means
f (xk )
∀ η−ǫ< < η + ǫ,
k>N g(xk )
f (x)
proving limx→ξ g(x)
= η. For η = ∞ and given n ∈ N, the argument is similar:
′ (x)
limx→ξ fg′ (x) = η and limx→ξ g(x) = ∞ imply
f ′ (x)
∃ ∀ g(x) > 0 ∧ n< ′ .
c∈]a,ξ[ x∈]c,ξ[ g (x)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 153
As before, since limk→∞ xk = ξ, there exists N0 ∈ N such that, for each k > N0 ,
c < xk < ξ. Again, according to Th. 9.18,
and
f (c) − n g(c) f (xk )
∀ n+ < .
k>N0 g(xk ) g(xk )
Since limk→∞ g(xk ) = ∞,
f (c) − n g(c)
∃ ∀ < 1,
N ≥N0 k>N g(xk )
that means
f (xk )
∀ n−1< ,
k>N g(xk )
f (x)
proving limx→ξ g(x)
= η. If η = −∞, then, using what we have already shown,
Example 9.26. (a) Applying L’Hôpital’s rule to f : ] − π/2, π/2[−→ R, f (x) := tan x,
g : ] − π/2, π/2[−→ R, g(x) := ex − 1, with ξ = 0 yields
tan x 1 + tan2 x 1
lim x
= lim x
= =1 (9.29)
x→0 e − 1 x→0 e 1
(note g ′ (x) = ex 6= 0 for each x ∈] − π/2, π/2[).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 154
(b) It can happen that a single application of L’Hôpital’s rule does not, yet, yield a
useful result, but that a repeated application does. An example is provided by
considering α > 0, n ∈ N, and f : R+ −→ R, f (x) := eα x , g : R+ −→ R,
g(x) := xn , ξ := ∞. Applying L’Hôpital’s rule n times yields
eα x α n eα x
∀+ ∀ lim = lim =∞ (9.30)
α∈R n∈N x→∞ xn x→∞ n!
(note g (k) (x) = n(n − 1) · · · (n − k + 1)xn−k 6= 0 for each k ∈ {1, . . . , n} and each
x ∈ R+ ).
(c) It can also happen that even repeated applications of L’Hôpital’s rule do not
help at all, even though limx→ξ fg(x)(x)
does exist and the hypotheses of Th. 9.25
are all satisfied. A simple example is given by f : R −→ R, f (x) := ex , g :
R −→ R, g(x) := 2ex , and ξ = −∞. Even though limx→−∞ fg(x) (x)
= 21 , one has
limx→−∞ f (n) (x) = limx→−∞ g (n) (x) = 0 for every n ∈ N.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 155
Moreover, f is called strictly convex (resp. strictly concave) if, and only if, (9.32a) (resp.
(9.32b)) always holds with strict inequality.
The following Prop. 9.29 provides equivalences for convexity. One can easily obtain the
corresponding equivalences for concavity by combining Prop. 9.29 with Lem. 9.28.
(ii) For each a, b ∈ I such that a 6= b and each λ ∈]0, 1[, the following estimate holds
(with strict inequality):
f λa + (1 − λ)b ≤ λf (a) + (1 − λ)f (b). (9.33)
(iii) For each x1 , x, x2 ∈ I such that x1 < x < x2 , one has (with strict inequality)
(iv) For each x1 , x, x2 ∈ I such that x1 < x < x2 , one has (with strict inequality)
Proof. We leave the proof as an exercise. Suggestion: Establish the following implica-
tions: (i) ⇔ (ii), (i) ⇔ (iii), (iv) ⇒ (iii), (i) ⇒ (iv).
Example 9.30. Since |λx + (1 − λ)y| ≤ λ|x| + (1 − λ)|y| for each 0 < λ < 1 and each
x, y ∈ R, the absolute value function is convex. This example also shows that a convex
function does not need to be differentiable.
—
For differentiable functions, one can formulate convexity criteria in terms of the deriva-
tive:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 156
Proposition 9.31. Let a < b, and suppose that f : [a, b] −→ R is continuous on [a, b]
and differentiable on ]a, b[. Then f is (strictly) convex (resp. (strictly) concave) on [a, b]
if, and only if, the derivative f ′ is (strictly) increasing (resp. (strictly) decreasing) on
]a, b[.
Proof. Since (−f )′ = −f ′ and −f ′ is (strictly) increasing if, and only if, f ′ is (strictly)
decreasing, it suffices to consider the (strictly) convex case. So assume that f is (strictly)
convex. Then for each x1 , x, x0 , y, x2 ∈]a, b[ such that x1 < x < x0 < y < x2 , applying
Prop. 9.29(iv), one has (with strict inequalities),
(a) f is convex (resp. concave) on [a, b] if, and only if, f ′′ ≥ 0 (resp. f ′′ ≤ 0) on ]a, b[.
(b) If f ′′ > 0 (resp. f ′′ < 0) on ]a, b[, then f is strictly convex (resp. strictly concave)
(as a caveat we remark that, here, the converse does not hold – for example x 7→ x4
is strictly convex, but its second derivative x 7→ 12x2 is 0 at x = 0).
Proof. Since −f ′′ ≥ 0 if, and only if f ′′ ≤ 0; and −f ′′ > 0 if, and only if f ′′ < 0, if
suffices to consider the convex cases. Moreover, for (a), one merely has to combine Prop.
9.31 with the fact that f ′ is increasing on ]a, b[ if, and only if, f ′′ ≥ 0 on ]a, b[. The
proof of (b) is left as an exercise.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
9 DIFFERENTIAL CALCULUS 157
(b) Since for f : R+ −→ R, f (x) = ln x, it is f ′′ (x) = −1/x2 < 0, the natural logarithm
is strictly concave on R+ .
If f is concave, then
If f is strictly convex or strictly concave, then equality in the above inequalities can only
hold if x1 = · · · = xn .
Since f is (strictly) concave if, and only if, −f is (strictly) convex, it suffices to consider
the cases where f is convex and where f is strictly convex. Thus, we assume that f is
convex and prove (9.39a) by induction. For n = 1, one has λ1 = 1 and there is nothing
to prove. For n = 2, (9.39a) reduces to (9.33), which holds due to the convexity of f .
Finally, let n > 2 and assume that (9.39a) already holds for each 1 ≤ l ≤ n − 1. Set
λ1 λn−1
λ := λ1 + · · · + λn−1 , x := x1 + · · · + xn−1 . (9.41)
λ λ
Then x ∈ I follows as in (9.40). One computes
n−1
!
X
f (λ1 x1 + · · · + λn xn ) = f λj xj + λn xn = f (λx + λn xn )
j=1
n−1
l=2 l=n−1 X λj
≤ λf (x) + λn f (xn ) ≤ λ f (xj ) + λn f (xn )
j=1
λ
= λ1 f (x1 ) + · · · + λn f (xn ), (9.42)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 158
thereby completing the induction, and, thus, the proof of (9.39a). If f is strictly convex
and (9.39a) holds with equality, then one can also proceed by induction to prove the
equality of the xj . Again, if n = 1, then there is nothing to prove. If n = 2, and x1 6= x2 ,
then strict convexity requires (9.39a) to hold with strict inequality. Thus x1 = x2 . Now
let n > 2. It is noted that (9.42) still holds. By hypothesis, the first and last term
in (9.42) are now equal, implying that all terms in (9.42) must be equal. Using the
induction hypothesis for l = 2 and the corresponding equality in (9.42), we conclude
that x = xn . Using the induction hypothesis for l = n−1 and the corresponding equality
in (9.42), we conclude that x1 = · · · = xn−1 . Finally, x = xn and x1 = · · · = xn−1 are
combined using (9.41) to get x1 = xn , finishing the proof of the theorem.
Theorem 9.35 (Inequality Between the Weighted Arithmetic Mean and the Weighted
Geometric Mean). If n ∈ N, x1 , . . . , xn ≥ 0 and λ1 , . . . , λn > 0 such that λ1 +· · ·+λn = 1,
then
xλ1 1 · · · xλnn ≤ λ1 x1 + · · · + λn xn , (9.43)
where equality occurs if, and only if, x1 = · · · = xn . In particular, for λ1 = · · · = λn =
1
n
, one recovers the inequality between the arithmetic and the geometric mean without
weights, known from Th. 7.63.
Proof. If at least one of the xj is 0, then (9.43) becomes the true statement 0 ≤
P n
j=1 λj xj with strict inequality if, and only if, at least one xj > 0. Thus, it remains
to consider the case x1 , . . . , xn > 0. As we noted in Ex. 9.33(b), the natural logarithm
ln : R+ −→ R is concave and even strictly concave. Employing Jensen’s inequality
(9.39b) yields
Applying the exponential function to both sides of (9.44), one obtains (9.43). Since
(9.44) is equivalent to (9.43), the strict concavity of ln yields that equality in (9.44)
implies x1 = · · · = xn .
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 159
R
This area M f (if it exists) will be called the integral of f over M . Moreover, for
functions f : M −→ R that are not necessarily nonnegative, we would like to count
areas of sets of the form (10.1)
(which are below the graph of f and above the set
∼
M = (x, 0) ∈ R : x ∈ M ⊆ R2 ) with a positive sign, and whereas we would like to
2
count areas of sets above the graph of f and below the set M with a negative sign. In
other words, making use of the positive and negative parts f + and f − of f = f + − f −
as defined in (6.1i) and (6.1j), respectively, we would like our integral to satisfy
Z Z Z
f= +
f − f −. (10.2)
M M M
Difficulties arise from the fact that both the function f and the set M can be extremely
complicated. To avoid dealing with complicated sets M , we restrict ourselves to the
situation of integrals over compact intervals, i.e. to integrals over sets of the form M =
[a, b]. Moreover, we will also restrict ourselves to bounded functions f , which we now
define:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 160
where r(∆, f ) is called the lower Riemann sum and R(∆, f ) is called the upper Riemann
sum associated with ∆ and f . If ∆ is tagged by τ := (t1 , . . . , tN ), then we also define
the intermediate Riemann sum
N
X N
X
ρ(∆, f ) := f (tj ) |Ij | = f (tj )(xj − xj−1 ). (10.7c)
j=1 j=1
Note that, for a = b, all the above sums are empty and we have r(∆, f ) = R(∆, f ) =
ρ(∆, f ) = 0.
Definition 10.5. Let I = [a, b] ⊆ R be an interval, a ≤ b, and suppose f : I −→ R is
bounded.
(a) Define
J∗ (f, I) := sup r(∆, f ) : ∆ is a partition of I , (10.8a)
J ∗ (f, I) := inf R(∆, f ) : ∆ is a partition of I . (10.8b)
We call J∗ (f, I) the lower Riemann integral of f over I and J ∗ (f, I) the upper
Riemann integral of f over I.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 161
(b) The function f is called Riemann integrable over I if, and only if, J∗ (f, I) = J ∗ (f, I).
If f is Riemann integrable over I, then
Z b Z Z b Z
f (x) dx := f (x) dx := f := f := J∗ (f, I) = J ∗ (f, I) (10.9)
a I a I
is called the Riemann integral of f over I. The set of all functions f : I −→ R that
are Riemann integrable over I is denoted by R(I, R) or just by R(I).
(c) The function g : I −→ C is called Riemann integrable over I if, and only if, both
Re g and Im g are Riemann integrable. The set of all Riemann integrable functions
g : I −→ C is denoted by R(I, C). If g ∈ R(I, C), then
Z Z Z Z Z
g := Re g, Im g = Re g + i Im g ∈ C (10.10)
I I I I I
(10.7) implies
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 162
(b) An example of a function that is not Riemann integrable for a < b is given by the
Dirichlet function
(
0 for x irrational,
f : [a, b] −→ R, f (x) := a < b. (10.14)
1 for x rational,
P
Since r(∆, f ) = 0 and R(∆, f ) = N j=1 |Ij | = b − a for every partition ∆ of I, one
∗
obtains J∗ (f, I) = 0 6= (b − a) = J (f, I), showing that f ∈ / R(I).
(b) If ∆ and ∆′ are partitions of [a, b] ⊆ R, then the superposition of ∆ and ∆′ , denoted
∆ + ∆′ , is the unique partition of [a, b] having ν(∆) ∪ ν(∆′ ) as its set of nodes. Note
that the superposition of ∆ and ∆′ is always a common refinement of ∆ and ∆′ .
Lemma 10.9. Let a, b ∈ R, a < b, I := [a, b], and suppose f : I −→ R is bounded with
M := kf ksup ∈ R+ ′
0 . Let ∆ be a partition of I and assume
′
α := # ν(∆ ) \ {a, b} ≥ 1 (10.15)
is the number of interior nodes that occur in ∆′ . Then, for each partition ∆ of I, the
following holds:
Proof. We carry out the proof of (10.16a) – the proof of (10.16b) can be conducted
completely analogous. Consider the case α = 1 and let ξ be the single element of
ν(∆′ ) \ {a, b}. If ξ ∈ ν(∆), then ∆ + ∆′ = ∆, and (10.16a) is trivially true. If ξ ∈
/ ν(∆),
then xk−1 < ξ < xk for a suitable k ∈ {1, . . . , N }. Define
and
m′ := inf{f (x) : x ∈ I ′ }, m′′ := inf{f (x) : x ∈ I ′′ }. (10.18)
Then we obtain
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 163
(10.19) implies
0 ≤ r(∆ + ∆′ , f ) − r(∆, f ) ≤ 2M |I ′ | + |I ′′ | ≤ 2M |∆|. (10.21)
(d) For each sequence of partitions (∆n )n∈N of I such that limn→∞ |∆n | = 0, one has
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 164
(d): For a = b, there is nothing to show. For a < b, let (∆n )n∈N be a sequence of
partitions of I such that limn→∞ |∆n | = 0, and let ∆′ be an arbitrary partition of I with
numbers α and M defined as in Lem. 10.9. Then, according to (10.16a):
(a) The integral is linear: More precisely, if f, g ∈ R(I, K) and λ, µ ∈ K, then λf +µg ∈
R(I, K) and Z Z Z
(λf + µg) = λ f + µ g. (10.28)
I I I
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 165
Thus,
Nn
X
(10.24) (10.7a)
J∗ (f + g, I) = lim r(∆n , f + g) = lim mn,j (f + g) |In,j |
n→∞ n→∞
j=1
(10.32a)
≥ lim r(∆n , f ) + r(∆n , g) = J∗ (f, I) + J∗ (g, I), (10.33a)
n→∞
Nn
X
∗ (10.24) (10.7b)
J (f + g, I) = lim R(∆n , f + g) = lim Mn,j (f + g) |In,j |
n→∞ n→∞
j=1
(10.32b)
≤ lim R(∆n , f ) + R(∆n , g) = J ∗ (f, I) + J ∗ (g, I),(10.33b)
n→∞
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 166
Nn
X
(10.24) (10.7a)
∀ J∗ (λf, I) = lim r(∆n , λf ) = lim mn,j (λf ) |In,j |
λ∈R n→∞ n→∞
j=1
(
(10.32c) λ limn→∞ r(∆n , f ) = λ J∗ (f, I) for λ ≥ 0,
= (10.33c)
λ limn→∞ R(∆n , f ) = λ J ∗ (f, I) for λ < 0,
Nn
X
∗ (10.24) (10.7b)
∀ J (λf, I) = lim R(∆n , λf ) = lim Mn,j (λf ) |In,j |
λ∈R n→∞ n→∞
j=1
(
(10.32d) λ limn→∞ R(∆n , f ) = λ J ∗ (f, I) for λ ≥ 0,
= (10.33d)
λ limn→∞ r(∆n , f ) = λ J∗ (f, I) for λ < 0.
and
Z Z Z Z Z Z Z
(f + g) = Re(f + g), Im(f + g) = Re f + Re g, Im g + Im g
I I I I I I I
Z Z Z Z Z Z
= Re f, Im f + Re g, Im g = f + g.
I I I I I I
(b): Once again, consider K = R first. For a = b, there is nothing to prove, so let
a < b. For M = 1, there is still nothing to prove. For M = 2, we have a = y0 <
y1 < y2 = b. Consider a sequence (∆n )n∈N of partitions of I, ∆n = (xn,0 , . . . , xn,Nn ),
such that limn→∞ |∆n | = 0 and y1 ∈ ν(∆n ) for each n ∈ N. Define ∆′n := (xn,0 , . . . , y1 ),
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 167
∆′′n := (y1 , . . . , xn,Nn ). Then ∆′n and ∆′′n are partitions of J1 and J2 , respectively, and
limn→∞ |∆′n | = limn→∞ |∆′′n | = 0. Moreover,
′ ′′ ′ ′′
∀ r(∆n , f ) = r(∆n , f ) + r(∆n , f ), R(∆n , f ) = R(∆n , f ) + R(∆n , f ) ,
n∈N
∗ ∗ ∗
implying
R R J∗ (f,RI) = J∗ (f, J1 )+J∗ (f, J2 ) and J (f, I) = J (f, J1 )+J (f, J2 ). This proves
I
f = J1 f + J2 f provided f ∈ R(I) ∩ R(J1 ) ∩ R(J2 ). So it just remains to show the
claimed equivalence between f ∈ R(I) and f ∈ R(J1 ) ∩ R(J2 ). If f ∈ R(J1 ) ∩ R(J2 ),
then J∗ (f, I) = J∗ (f, J1 )+J∗ (f, J2 ) = J ∗ (f, J1 )+J ∗ (f, J2 ) = J ∗ (f, I), showing f ∈ R(I).
Conversely, J∗ (f, I) = J ∗ (f, I) implies J∗ (f, J1 ) = J ∗ (f, J1 ) + J ∗ (f, J2 ) − J∗ (f, J2 ) ≥
J ∗ (f, J1 ), showing J∗ (f, J1 ) = J ∗ (f, J1 ) and f ∈ R(J1 ); f ∈ R(J2 ) follows completely
analogous. The general case now follows by induction on M . If, f ∈ R(I, C), then one
computes, using the real-valued case,
Z Z Z M Z M Z
! M Z
X X X
f= Re f, Im f = Re f, Im f = f.
I I I k=1 Jk k=1 Jk k=1 Jk
N
X
≤ Re f (tj ), Im f (tj ) |Ij |
j=1
N
X
= |f (tj )| |Ij | =: ρ(∆, |f |). (10.34)
j=1
Since the intermediate Riemann sums in (10.34) converge to the respective integrals by
(10.25b), one obtains
Z Z
(10.34)
f = lim ρ(∆, Re f ), ρ(∆, Im f ) ≤ lim ρ(∆, |f |) = |f |,
|∆|→0 |∆|→0
I I
proving (10.31).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 168
Proof. Suppose, for each ǫ > 0, there exists a partition ∆ of I such that (10.37) is
satisfied. Then
J ∗ (f, I) − J∗ (f, I) ≤ R(∆, f ) − r(∆, f ) < ǫ, (10.38)
showing J ∗ (f, I) ≤ J∗ (f, I). As the opposite inequality always holds, we have J ∗ (f, I) =
J∗ (f, I), i.e. f ∈ R(I) as claimed. Conversely, if f ∈ R(I) and (∆n )n∈N is a sequence of
partitions of I with limn→∞ |∆n | = 0, then (10.25a) implies that, for each ǫ > 0, there
is N ∈ N such that R(∆n , f ) − r(∆n , f ) < ǫ for each n > N .
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 169
The previous theorem will allow us to prove that every continuous function on [a, b] is
Riemann integrable. However, we will also need to make use of the following result:
Proposition 10.14. Let I = [a, b] ⊆ R, a ≤ b, f : I −→ R. If f is continuous, then f
is even uniformly continuous, i.e.
∀ + ∃ + ∀ |x − y| < δ ⇒ |f (x) − f (y)| < ǫ . (10.39)
ǫ∈R δ∈R x,y∈I
and |f (xn ) − f (yn )| ≥ ǫ0 . Then the sequence (xn )n∈N is bounded and the Bolzano-
Weierstrass Th. 7.27 provides a convergent subsequence (xφ(n) )n∈N , i.e. there is ξ ∈ R
with limn→∞ xφ(n) = ξ. Clearly, ξ ∈ [a, b] and (10.41) implies limn→∞ yφ(n) = ξ as
well. However, due to |f (xφ(n) ) − f (yφ(n) )| ≥ ǫ0 > 0, the sequences f (xφ(n) ) n∈N and
f (yφ(n) ) n∈N can not both converge to f (ξ), showing that f can not be continuous.
Caveat 10.15. It is important in Prop. 10.14 that f is defined on a compact interval I.
The examples f : ]0, 1] −→ R, f (x) := 1/x, and f : R −→ R, f (x) := x2 are examples
of continuous functions that are not uniformly continuous.
Theorem 10.16. Let I = [a, b] ⊆ R, a ≤ b.
Proof. (a): As f is continuous if, and only if, Re f and Im f are both continuous, it
suffices to consider a real-valued continuous f . For a = b, there is nothing to prove, so
let a < b. First note that, if f is continuous on I = [a, b], then f is bounded by Th.
7.54. Moreover, f is uniformly continuous due to Prop. 10.14. Thus, given ǫ > 0, there
is δ > 0 such that |x − y| < δ implies |f (x) − f (y)| < ǫ/|I| for each x, y ∈ I. Then, for
each partition ∆ of I satisfying |∆| < δ, we obtain
N N
X ǫ X
R(∆, f ) − r(∆, f ) = (Mj − mj )|Ij | ≤ |Ij | = ǫ, (10.42)
j=1
|I| j=1
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 170
as |∆| < δ implies |x − y| < δ for each x, y ∈ Ij and each j ∈ {1, . . . , N }. Finally,
(10.42) implies f ∈ R(I) due to Riemann’s integrability criterion of Th. 10.13.
(b): Suppose f : [a, b] −→ R is increasing. Then f is bounded, as f (a) ≤ f (x) ≤ f (b)
for each x ∈ [a, b]. If f (a) = f (b), then f is constant. Thus, assume f (a) < f (b).
Moreover, if ∆ = (x0 , . . . , xN ) is a partition of I as in Def. 10.3, then
N
X N
X
R(∆, f ) − r(∆, f ) = (Mj − mj )|Ij | = f (xj ) − f (xj−1 ) |Ij | ≤ |∆| f (b) − f (a) .
j=1 j=1
(10.43)
Thus, given ǫ > 0, we have R(∆, f ) − r(∆, f ) < ǫ for each partition ∆ of I satisfying
|∆| < ǫ/(f (b) − f (a)). In consequence, f ∈ R(I), once again due to Riemann’s integra-
bility criterion of Th. 10.13. If f is decreasing, then −f is increasing, and Th. 10.11(a)
establishes the case.
Definition and Remark 10.17. Let M ⊆ C. A function f : M −→ C f is called
Lipschitz continuous in M with Lipschitz constant L if, and only if,
∃ ∀ |f (x) − f (y)| ≤ L |x − y|. (10.44)
L∈R+
0
x,y∈M
Every Lipschitz continuous function is, indeed, continuous, since, if ξ ∈ M and (yn )n∈N
is a sequence in M with limn→∞ yn = ξ, then (10.44) implies
∀ |f (ξ) − f (yn )| ≤ L |ξ − yn |, (10.45)
n∈N
proving limn→∞ f (yn ) = f (ξ). Moreover, it is not too much harder to prove Lipschitz
continuous functions are even uniformly continuous,√but we will not pursue this right
now. On the other hand, f : R+ 0 −→ R, f (x) := x, is an example of a continuous
function (actually, even uniformly continuous) that is not Lipschitz continuous.
Theorem 10.18. Let a, b ∈ R, a ≤ b, I := [a, b].
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 171
Proof. (a),(b): We just carry out the proof for the case f ∈ R(I), φ : f (I) −→ R, and
leave the extension to the case that f or φ is C-valued as an exercise. Thus, let f ∈ R(I)
and let φ : f (I) −→ R be Lipschitz continuous. Then there exists L ≥ 0 such that
φ(x) − φ(y) ≤ L |x − y| for each x, y ∈ f (I). (10.46)
As f ∈ R(I), given ǫ > 0, Th. 10.13 provides a partition ∆ of I such that R(∆, f ) −
r(∆, f ) < ǫ/L, and we obtain
N
X
R(∆, φ ◦ f ) − r(∆, φ ◦ f ) = Mj (φ ◦ f ) − mj (φ ◦ f ) |Ij |
j=1
N
X
≤ L Mj (f ) − mj (f ) |Ij |
j=1
= L R(∆, f ) − r(∆, f ) < ǫ. (10.47)
Thus, φ ◦ f ∈ R(I) by another application of Th. 10.13.
(c): |f |, f 2 , f + , f − ∈ R(I) follows from (b) (for |f | also for f ∈ R(I, C)), since each of
the maps x 7→ |x|, x 7→ x2 , x 7→ max{x, 0}, x 7→ − min{x, 0} is Lipschitz continuous
on the bounded set f (I) (recall that f ∈ R(I) implies that f is bounded). Since
f = f + − f − , (10.2) is implied by (10.28). Finally, if f (x) ≥ δ > 0, then x 7→ x−1 is
Lipschitz continuous on the bounded set f (I), and f −1 ∈ R(I) follows from (b).
(d): For f, g ∈ R(I), we note that, due to
1
f g = (f + g)2 − (f − g)2 , (10.48a)
4
max(f, g) = f + (g − f )+ , (10.48b)
min(f, g) = g − (f − g)− , (10.48c)
everything is a consequence of (c). For f, g ∈ R(I, C), due to
f¯ = (Re f, − Im f ), (10.48d)
f g = (Re f Re g − Im f Im g, Re f Im g + Im f Re g), (10.48e)
1/g = (Re g/|g|2 , − Im g/|g|2 ), (10.48f)
everything follows from the real-valued case together with (c) and Th. 10.11(a), where
|g| ≥ δ > 0 guarantees |g|2 ≥ δ 2 > 0).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 172
We provide two variants of the fundamental theorem with slightly different flavors: In
the first variant, Th. 10.20(a), we start with a function f , obtain another function F
by means of integrating f , and recover f by taking the derivative of F . In the second
variant, Th. 10.20(b), one first differentiates the given function F , obtaining f := F ′ ,
followed by integrating f , recovering F up to an additive constant.
Notation 10.19. If a, b ∈ R, a ≤ b, I := [a, b], f : I −→ C, then denote
Z b Z Z a Z b
f := f, f := − f, (10.49a)
a I b a
and
Z x
F (x) = F (c) + F ′ (t) dt for each c, x ∈ I. (10.51b)
c
Proof. It suffices to prove the case K = R, since the case K = C then follows by applying
the case K = R to Re Fc and Im Fc (for (a)) and to Re F and Im F (for (b)). Thus, for
the rest of the proof, we assume K = R.
(a): We need to show that
Fc (ξ + h) − Fc (ξ)
lim A(h) = 0, where A(h) := − f (ξ). (10.52)
h→0 h
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 173
One computes
Z ξ+h Z ξ+h Z ξ+h
1 1 1
A(h) = f (t) dt − f (ξ) dt = f (t) − f (ξ) dt . (10.53)
h ξ h ξ h ξ
Now, given ǫ > 0, the continuity of f in ξ allows us to find δ > 0 such that |(f (t)−f (ξ)| <
ǫ/2 for each t ∈ I with |t − ξ| < δ. Thus, for each h with |h| < δ, we obtain
Z ξ+h
1 f (t) − f (ξ) dt ≤ ǫh < ǫ,
|A(h)| ≤ (10.54)
h ξ 2h
constant on I, i.e. H(x) = H(a) = F (a) − G(a) = F (a) for each x ∈ I. Evaluating at
x = b yields Z b
F (a) = H(b) = F (b) − F ′ (t) dt , (10.56)
a
thereby establishing the case.
Now we consider the remaining case of a differentiable F with integrable derivative
F ′ ∈ R(I). Consider a partition ∆ = (x0 , . . . , xN ) of I as in Def. 10.3. Then, for
each j ∈ {1, . . . , N }, the mean value theorem provides ξj ∈]xj−1 , xj [ such that F (xj ) −
F (xj−1 ) = (xj − xj−1 ) F ′ (ξj ). Thus,
N
X N
X
F (b) − F (a) = F (xj ) − F (xj−1 ) = (xj − xj−1 ) F ′ (ξj ) = ρ(∆, F ′ ). (10.57)
j=1 j=1
Example 10.22. (a) Due to the fundamental theorem, if we know a function’s an-
tiderivative, we can easily compute its integral over a given interval. Here are three
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 174
simple examples:
Z 1 1
5 x6 3x2 1 3 4
(x − 3x) dx = − = − =− , (10.58a)
6 2 0 6 2 3
Z0 e
1
dx = [ln x]e1 = ln e − ln 1 = 1, (10.58b)
x
Z1 π
sin x dx = [− cos x]π0 = 2. (10.58c)
0
Theorem 10.23. Let a, b ∈ R, a < b, I := [a, b]. If f, g ∈ C 1 (I, C), then the following
integration by parts formula holds:
Z b Z b
′ b
f g = [f g]a − f ′ g. (10.59)
a a
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 175
Proof. If f, g ∈ C 1 (I, C), then, according to the product rule, f g ∈ C 1 (I, C) with
(f g)′ = f ′ g + f g ′ . Applying (10.51a), we obtain
Z b Z b Z b
′ ′
b
[f g]a = (f g) = f g+ f g′, (10.60)
a a a
Proof. Let
Z x
F : J −→ C, F (x) := f (t) dt . (10.64)
φ(a)
According to Th. 10.20(a) and the chain rule of Th. 9.11, we obtain
proving (10.63).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 176
R1 √
Example 10.26. We compute the integral 0 t2 1 − t dt using the change of variables
x := φ(t) := 1 − t, φ′ (t) = −1:
Z 1 Z 0 Z 1
2
√ 2
√ √ √ √
t 1 − t dt = − (1 − x) x dx = ( x − 2x x + x2 x) dx
0 1 0
" 3 5 7
#1
2x 2 4x 2 2x 2 16
= − + = . (10.67)
3 5 7 105
0
Proof. The integral form (10.70) of the remainder term we prove by using induction on
m: For m = 0, the assertion is
Z x
f (x) = f (a) + f ′ (t) dt , (10.72)
a
which holds according to the fundamental theorem of calculus in the form Th. 10.20(b).
For the induction step, we assume (10.68) holds for fixed m ∈ N0 with Rm (x, a) in
integral form (10.70) and consider f ∈ C m+2 (I, K). For fixed x ∈ I, we define the
function
(x − t)m+1 (m+1)
g : I −→ K, g(t) := f (t). (10.73)
(m + 1)!
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 177
can converge to f (x) (see (a) below), diverge (see (b) below), or converge to η 6= f (x)
(see (c) below):
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 178
(b) For f : R+ −→ R, f (x) = ln x, we have f (m) (x) = (−1)m−1 (m − 1)! x−m for
each m ∈ N0 and each x ∈ R+ . Thus, for x = 23 and a = 12 , using f (k) (a) =
(−1)k−1 (k − 1)! 2k , we have
m m
X f (k) (a) k
X (−1)k−1 2k
Tm (x, a) = (x − a) = ,
k=0
k! k=0
k
which diverges for m → ∞, since the summands do not converge to 0.
(c) For ( 2
e−1/x for x 6= 0,
f : R −→ R, f (x) :=
0 for x = 0,
it is an exercise to show f ∈ C ∞ (R) with f (k) (0) = 0 for each k ∈ N0 (hint: for
2
x 6= 0, one obtains f (k) (x) = Pk (x−1 )e−1/x , where Pk is a polynomial of degree 3k).
Thus, Tm (x, 0) = 0 for each x ∈ R and each m ∈ N0 , implying
lim Tm (x, 0) = 0 6= f (x) for each x ∈ R \ {0}.
m→∞
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 179
Before we can define improper Riemann integrals, we define, in partial extension of Def.
8.17:
(a) Let I := [c, b[, f ∈ Rloc (I), and assume b = ∞ or f is unbounded. Consider the
function Z x
F : I −→ R, F (x) := f.
c
If the limit Z x
lim F (x) = lim f (10.79)
x→b x→b c
exists in R, then we define
Z Z b Z b Z x
f := f (t) dt := f := lim f.
I c c x→b c
(b) Let I :=]a, c], f ∈ Rloc (I), and assume a = −∞ or f is unbounded. Consider the
function Z c
F : I −→ R, F (x) := f.
x
If the limit Z c
lim F (x) = lim f (10.80)
x→a x→a x
exists in R, then we define
Z Z c Z c Z c
f := f (t) dt := f := lim f.
I a a x→a x
(c) Let I =]a, b[ , f ∈ Rloc (I). If the conditions of both (a) and (b) hold, i.e. (i) – (iv),
where
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 180
Rx
(ii) limx→b c
f exists in R,
(iii) a = −∞ or f is unbounded on ]a, c],
Rc
(iv) limx→a x f exists in R,
then we define Z Z Z Z Z
b b c b
f := f (t) dt := f := f+ f.
I a a a c
All the above limits of Riemann integrals (if they exist) are called improper Riemann
integrals. In each case, if the limit exists, we call f improperly Riemann integrable and
write f ∈ R(I).
Remark 10.34. (a) The definitions in Def. 10.33 are consistent with what occurs if
the limits are proper Riemann integrals: Let a, c, b ∈ R, a < c < b, and f ∈ R[a, b].
Then Z Z Z Z
x b c c
lim f= f and lim f= f. (10.81)
x→b c c x→a x a
Indeed, since f ∈ R[a, b], |f | is bounded by some M ∈ R+ ; and if (xk )k∈N is a
sequence in [a, b[ such that limk→∞ xk = b, then
Z b Z b
f ≤ |f | ≤ M (b − xk ) → 0 for k → ∞,
xk xk
implying
Z xk Z b Z b Z b Z b
Th. 10.11(b)
lim f = lim f− f = f −0= f.
k→∞ c k→∞ c xk c c
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 181
(c) Let a < c1 < c2 < b (a = −∞, b = ∞ is admissible). If I := [c1 , b[, f ∈ Rloc (I), and
Rb Rb
b = ∞ or f is unbounded, then c1 f exists if, and only if, c2 f exists. Moreover, if
the integrals exist, then Z Z Z b c2 b
f= f+ f. (10.82a)
c1 c1 c2
Rb
Indeed, if (xk )k→∞ is a sequence in [c1 , b[ such that limk→∞ xk = b and if c1
f exists,
then Z Z Z Z Z
xk xk c2 b c2
Th. 10.11(b)
lim f = lim f− f = f− f,
k→∞ c2 k→∞ c1 c1 c1 c1
Rb Rb
proving c2
f exists and (10.82a) holds. Conversely, if c2
f exists, then
Z xk Z x k Z c2 Z b Z c2
Th. 10.11(b)
lim f = lim f+ f = f+ f,
k→∞ c1 k→∞ c2 c1 c2 c1
Rb
proving c1 f exists and (10.82a) holds. Analogously, one shows that if I :=]a, c2 ],
Rc Rc
f ∈ Rloc (I), and a = −∞ or f is unbounded, then a 1 f exists if, and only if, a 2 f
exists, where, if the integrals exist, then
Z c2 Z c2 Z c1
f= f+ f. (10.82b)
a c1 a
In particular, we see that neither the existence nor the value of the improper integral
in Def. 10.33(c) depends on the choice of c.
Example 10.35. (a) Let 0 < α < 1. We claim that
Z 1 Z 1
1 1 1 1
α
dt = α = yields √ dt = 2 . (10.83)
0 t 1−α 2 0 t
Indeed, if (xk )k∈N is a sequence in ]0, 1] such that limk→∞ xk = 0, then
Z 1 1−α 1
1 t 1 − xk1−α 1
lim α
dt = lim = lim = .
k→∞ x t k→∞ 1 − α k→∞ 1 − α 1−α
k xk
showing the limit does not exist in R, but diverges to ∞. Sometimes, this is stated
in the form Z 1
1
dt = ∞. (10.84)
0 t
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 182
Then limt→∞ f (t) does not exist and f is not even bounded. However f ∈ R(R+
0)
and Z ∞ ∞ Z n+1/(n2n ) ∞
X X 1
f= n dt = 2−n = 1 − 1 = 1.
0 n=1 n n=1
1 − 2
Lemma 10.36. Let a < c < b (a = −∞, b = ∞ is admissible). Let I ⊆]a, b[ be one of
the three kinds of intervals occurring in Def. 10.33 (i.e. I = [c, b[, I =]a, c], or I =]a, b[),
and assume f, g : I −→ R to be improperly Riemann integrable over I.
Proof. We conduct the proof for the case I = [c, b[ – the case I =]a, b] can be shown
analogously, and the case I =]a, b[ then also follows. Let (xk )k∈N be a sequence in I
such that limk→∞ xk = b.
(a): One computes
Z xk Z xk Z xk Z b Z b
Th. 10.11(a)
lim λf + µg) = lim λ f +µ g =λ f +µ g,
k→∞ c k→∞ c c c c
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 183
proving (b).
Before we can proceed to Prop. 10.39 about convergence criteria for improper integrals,
we need to prove the analogon of Th. 7.19 for limits of functions.
Proof. We prove (10.87a) for the case, where f is increasing – the remaining case of
(10.87a) as well as (10.87b) can be proved completely analogous. Let (xk )k∈N be a
sequence in M \ {b} such that limk→∞ xk = b. We have to show that limk→∞ f (xk ) = η,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 184
where η := sup A for A bounded from above and η := ∞ for A not bounded from above.
Seeking a contradiction, assume limk→∞ f (xk ) = η does not hold. Due to the choice of
b, there then must be ǫ > 0 and a subsequence (yk )k∈N of (xk )k∈N such that (yk )k∈N is
strictly increasing and
(
η − ǫ if η = sup A,
∀ f (yk ) ≤
k∈N ǫ if η = ∞.
Proof. (a): We consider the case I = [c, b[ – the proof for the case I =]a, c] is completely
R
analogous, and the case I =]a, b[ then also follows. First, suppose 0 ≤ f ≤ g, and I g
exists. Since 0 ≤ f , the function
Z x
+
F : [c, b[−→ R0 , F (x) := f,
c
is increasing. Due to
Z x Z x Z b
∀ F (x) = f≤ g≤ g ∈ R+
0,
x∈[c,b[ c c c
F is also bounded from above (in the sense that {F (x) R: x ∈ [c, b[} is bounded from
x
above), i.e. Prop. 10.38 yields that limx→b F (x) = limx→b c f exists in R as claimed.
R
Now suppose 0 ≤ g ≤ f and I g diverges. As the function F above, the function
Z x
+
G : [c, b[−→ R0 , G(x) := g,
c
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
10 THE RIEMANN INTEGRAL ON INTERVALS IN R 185
is increasing. Since we assume that limx→b G(x) does not exist in R, Prop. 10.38 im-
plies limx→b G(x) = ∞. As a consequence, if (xk )k∈N is a sequence in [c, b[ such that
limk→∞ xk = b, then
Z xk Z xk
lim F (xk ) = lim f = lim g = ∞,
k→∞ k→∞ c k→∞ c
R
showing that f diverges as well.
I
R R
(b): We assume I f to converge absolutely, i.e. I |f |Rmust exist inRR. Since 0 ≤ f + ≤ |f |
and 0 ≤ f − ≤ |f |, R(a) then
R implies
R the existence of I f + and of I f − . Thus, according
to Lem. 10.36(a), I f = I f + − I f − must also exist.
Example 10.40. (a) We will use Prop. 10.39(a) to show that the improper integral
Z ∞
2
e−t dt
0
exists. Indeed,
2
∀ (t − 1)2 = t2 − 2t + 1 ≥ 0 ⇒ −t2 ≤ −2t + 1 ⇒ 0 ≤ e−t ≤ e−2t+1 ,
t∈R
and, since
Z ∞ Z x x
−2t+1 −2t+1 e−2t+1 e − e−2x+1 e
e dt = lim e dt = lim − = lim = ,
0 x→∞ 0 x→∞ 2 0
x→∞ 2 2
R∞ 2
Prop. 10.39(a) implies that 0
e−t dt exists in R.
diverges. Indeed,
2 t2
∀ t ≥0 ⇒ e ≥1 ,
t∈R
and, since Z x
lim 1 dt = lim x = ∞,
x→∞ 0 x→∞
R∞ t2
Prop. 10.39(a) implies that 0
e dt = ∞.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 186
(c) We provide an example that shows an improper integral can converge without
converging absolutely: Consider the function
(
(−1)n+1 for n ≤ t ≤ n + n1 , n ∈ N,
f : [0, ∞[−→ R, f (t) := (10.88)
0 otherwise.
Then Z ∞ n Z k+ k1 n
X X 1 (7.72)
|f | = lim 1 dt = lim = ∞, (10.89)
0 k→∞
k=1 k k→∞
k=1
k
R∞
showing 0
f does not converge absolutely. However, we will show
Z ∞ ∞
X (−1)j+1
f= =: α > 0. (10.90)
0 j=1
j
We know α > 0 from Ex. 7.86(a) and Th. 7.85. Let (xk )k∈N be a sequence in R+ 0
such that limk→∞ xk = ∞. Given ǫ > 0, choose K ∈ N such that K1 < 2ǫ and N ∈ N
such that
∀ xk > K. (10.91)
k>N
Then, for each k > N , there exists K1 ∈ N such that K < K1 ≤ xk < K1 + 1. Thus
Z xk KX1 −1
1 (−1)j+1
f (t) dt = min xk − K1 , + (10.92)
0 K1 j=1
j
and
X
∞ (−1)j+1
Z xk
α − 1 (7.79) 1 1
f (t) dt = − min xk − K1 , < +
0
j=K
j K1 K1 K1
1
2 ǫ
< < 2 · = ǫ, (10.93)
K 2
thereby proving (10.90).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 187
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 188
contained in some concrete dictionary). Then there is a smallest natural number n that
is not in A. But then n is the smallest natural number that can not be defined by
fifty English words or less, which, actually, defines n by less than fifty English words, in
contradiction to n ∈/ A.
To avoid contradictions of this type, we require P (x) to be a so-called set-theoretic
formula.
Definition A.1. (a) The language of set theory consists precisely of the following sym-
bols: ∧, ¬, ∃, (, ), ∈, =, vj , where j = 1, 2, . . . .
(b) A set-theoretic formula is a finite string of symbols from the above language of set
theory that can be built using the following recursive rules:
Remark A.3. It is noted that, for a given finite string of symbols, a computer can, in
principle, check in finitely many steps, if the string constitutes a set-theoretic formula
or not. The symbols that can occur in a set-theoretic formula are to be interpreted
as follows: The variables v1 , v2 , . . . are variables for sets. The symbols ∧ and ¬ are
to be interpreted as the logical operators of conjunction and negation as described in
Sec. 1.2.2. Similarly, ∃ stands for an existential quantifier as in Sec. 1.4: The statement
∃vj (φ) means “there exists a set vj that has the property φ”. Parentheses ( and ) are
used to make clear the scope of the logical symbols ∃, ∧, ¬. Where the symbol ∈ occurs,
it is interpreted to mean that the set to the left of ∈ is contained as an element in the
set to the right of ∈. Similarly, = is interpreted to mean that the sets occurring to the
left and to the right of = are equal.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 189
or transcriptions of actual set-theoretic formulas. For example, we use the rules of Th.
1.11 to define the additional logical symbols ∨, ⇒, ⇔ as abbreviations:
(φ) ∨ (ψ) is short for ¬((¬(φ)) ∧ (¬(ψ))) (cf. Th. 1.11(j)), (A.2a)
(φ) ⇒ (ψ) is short for (¬(φ)) ∨ (ψ) (cf. Th. 1.11(a)), (A.2b)
(φ) ⇔ (ψ) is short for ((φ) ⇒ (ψ)) ∧ ((ψ) ⇒ (φ)) (cf. Th. 1.11(b)). (A.2c)
Similarly, we use (1.17a) to define the universal quantifier:
∀vj (φ) is short for ¬(∃vj (¬(φ))). (A.2d)
Further abbreviations and transcriptions are obtained from omitting parentheses if it is
clear from the context and/or from Convention 1.10 where to put them in, by writing
variables bound by quantifiers under the respective quantifiers (as in Sec. 1.4), and by
using other symbols than vj for set variables. For example,
∀ (φ ⇒ ψ) transcribes ¬(∃v1 (¬((¬(φ)) ∨ (ψ)))).
x
Moreover,
vi 6= vj is short for ¬(vi = vj ); vi ∈
/ vj is short for ¬(vi ∈ vj ). (A.2e)
Remark A.5. Even though axiomatic set theory requires the use of set-theoretic for-
mulas as described above, the systematic study of formal symbolic languages is the
subject of the field of mathematical logic and is beyond the scope of this class (see, e.g.,
[EFT07]). In Def. and Rem. 1.15, we defined a proof of statement B from statement A1
as a finite sequence of statements A1 , A2 , . . . , An such that, for 1 ≤ i < n, Ai implies
Ai+1 , and An implies B. In the field of proof theory, also beyond the scope of this class,
such proofs are formalized via a finite set of rules that can be applied to (set-theoretic)
formulas (see, e.g., [EFT07, Sec. IV], [Kun12, Sec. II]). Once proofs have been formal-
ized in this way, one can, in principle, mechanically check if a given sequence of symbols
does, indeed, constitute a valid proof (without even having to understand the actual
meaning of the statements). Indeed, several different computer programs have been
devised that can be used for automatic proof checking, for example Coq [Wik15a], HOL
Light [Wik15b], and Isabelle [Wik15c] to name just a few.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 190
We are now in a position to formulate and discuss the axioms of axiomatic set the-
ory. More precisely, we will present the axioms of Zermelo-Fraenkel set theory, usually
abbreviated as ZF, which are Axiom 0 – Axiom 8 below. While there exist various
set theories in the literature, each set theory defined by some collection of axioms, the
axioms of ZFC, consisting of the axioms of ZF plus the axiom of choice (Axiom 9, see
Sec. A.4 below), are used as the foundation of mathematics currently accepted by most
mathematicians.
Axiom 0 Existence:
∃ (X = X).
X
In Def. 1.18 two sets are defined to be equal if, and only if, they contain precisely the
same elements. In axiomatic set theory, this is guaranteed by the axiom of extensionality:
Axiom 1 Extensionality:
∀ ∀ ∀ (z ∈ X ⇔ z ∈ Y ) ⇒ X = Y .
X Y z
Following [Kun12], we assume that the substitution property of equality is part of the
underlying logic, i.e. if X = Y , then X can be substituted for Y and vice versa without
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 191
changing the truth value of a (set-theoretic) formula. In particular, this yields the
converse to extensionality:
∀ ∀ X = Y ⇒ ∀ (z ∈ X ⇔ z ∈ Y ) .
X Y z
is an axiom. Thus, the comprehension scheme states that, given the set X,
there exists (at least one) set Y , containing precisely the elements of X that
have the property φ.
Remark A.8. Comprehension does not provide uniqueness. However, if both Y and
Y ′ are sets containing precisely the elements of X that have the property φ, then
′
∀ x ∈ Y ⇔ (x ∈ X ∧ φ) ⇔ x ∈ Y ,
x
and, then, extensionality implies Y = Y ′ . Thus, due to extensionality, the set Y given
by comprehension is unique, justifying the notation
{x : x ∈ X ∧ φ} := {x ∈ X : φ} := Y (A.3)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 192
Remark A.10. In Rem. A.4 we said that every formula with additional symbols and
notation is to be regarded as an abbreviation or transcription of a set-theoretic formula
as defined in Def. A.1(b). Thus, formulas containing symbols for defined sets (e.g. 0
or ∅ for the empty set) are to be regarded as abbreviations for formulas without such
symbols. Some logical subtleties arise from the fact that there is some ambiguity in the
way such abbreviations can be resolved: For example, 0 ∈ X can abbreviate either
ψ : ∃ φ(y) ∧ y ∈ X or χ : ∀ φ(y) ⇒ y ∈ X , where φ(y) stands for ∀ (v ∈ / y).
y y v
Then ψ and χ are equivalent if ∃! φ(y) is true (e.g., if Axioms 0 – 2 hold), but they can
y
be nonequivalent, otherwise (see discussion between Lem. 2.9 and Lem. 2.10 in [Kun80]).
—
At first glance, the role played by the free variables in φ, which are allowed to occur
in Axiom 2, might seem a bit obscure. So let us consider examples to illustrate that
allowing free variables (i.e. set parameters) in comprehension is quite natural:
(b) Note that it is even allowed for φ in comprehension to have X as a free variable, so
one can let φ be the formula ∃ (x ∈ u ∧ u ∈ X) to define the set
u
n o
X ∗ := x ∈ X : ∃ (x ∈ u ∧ u ∈ X) .
u
2∗ = {0} = 1.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 193
(as C and M are non-sets) and, by extensionality, {C} = {M } were true, in contradic-
tion to a set with a cow inside not being the same as a set with a monkey inside. Thus,
we see that all objects of the mathematical universe must be so-called hereditary sets,
i.e. sets all of whose elements (thinking of the elements as being the children of the sets)
are also sets.
A.3.2 Classes
As we need to avoid contradictions such as Russell’s antinomy, we must not require the
existence of a set {x : φ} for each set-theoretic formula φ. However, it can still be
useful to think of a “collection” of all sets having the property φ. Such collections are
commonly called classes:
(b) If φ is a set-theoretic formula, then we say the class {x : φ} exists (as a set) if, and
only if
∃ ∀ x∈X ⇔ φ (A.4)
X x
Example A.13. (a) Due to Russell’s antinomy of Sec. A.1, we know that R := {x :
x∈
/ x} forms a proper class.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 194
(b) The universal class of all sets, V := {x : x = x}, is a proper class. Once again,
this is related to Russell’s antinomy: If V were a set, then
R = {x : x ∈
/ x} = {x : x = x ∧ x ∈
/ x} = {x : x ∈ V ∧ x ∈
/ x}
Remark A.14. From the perspective of formal logic, statements involving proper
classes are to be regarded as abbreviations for statements without proper classes. For
example, it turns out that the class G of all sets forming a group is a proper class. But
we might write G ∈ G as an abbreviation for the statement “The set G is a group.”
Axioms 0 – 2 are still consistent with the empty set being the only set in existence (see
[Kun12, I.6.13]). The next axiom provides the existence of nonempty sets:
Axiom 3 Pairing:
∀ ∀ ∃ (x ∈ Z ∧ y ∈ Z).
x y Z
Thus, the pairing axiom states that, for all sets x and y, there exists a set Z
that contains x and y as elements.
0 := ∅, (A.5a)
1 := {0}, (A.5b)
2 := {0, 1} (A.5c)
Definition A.15. If x, y are sets and Z is given by the pairing axiom, then we call
(c) (x, y) := {{x}, {x, y}} the ordered pair given by x and y (cf. Def. 2.1).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 195
where, by extensionality {x} 6= {x, y} 6= {x′ }. Thus, using extensionality again, {x} =
{x′ } and x = x′ . Next, we conclude
While we now have the existence of the infinitely many different sets 0, {0}, {{0}}, . . . ,
we are not, yet, able to form sets containing more than two elements. This is remedied
by the following axiom:
Axiom 4 Union:
∀ ∃ ∀ ∀ (x ∈ X ∧ X ∈ M) ⇒ x ∈ Y .
M Y x X
Thus, the union axiom states that, for each set of sets M, there exists a set
Y containing all elements of elements of M.
Definition A.17. (a) If M is a set and Y is given by the union axiom, then define
[ [
M := X := x ∈ Y : ∃ x∈X .
X∈M
X∈M
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 196
X ∩ Y := {x ∈ X : x ∈ Y },
\
Ai := x ∈ Ai0 : ∀ x ∈ Ai ,
i∈I
i∈I
i.e. the intersection over the empty set is the class of all sets – in particular, a proper
class and not a set.
Definition A.19. We define the successor function
Thus, recalling (A.5), we have 1 = S(0), 2 = S(1); and we can define 3 := S(2), . . . In
general, we call the set S(x) the successor of the set x.
—
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 197
In Def. 2.3 and Def. 2.19, respectively, we define functions and relations in the usual
manner, making use of the Cartesian product A × B of two sets A and B, which,
according to (2.2) consists of all ordered pairs (x, y), where x ∈ A and y ∈ B. However,
Axioms 0 – 4 are not sufficient to justify the existence of Cartesian products. To obtain
Cartesian products, we employ the axiom of replacement. Analogous to the axiom
of comprehension, the following axiom of replacement actually consists of a scheme of
infinitely many axioms, one for each set-theoretic formula:
is an axiom. Thus, the replacement scheme states that if, for each x ∈ X,
there exists a unique y having the property φ (where, in general, φ will depend
on x), then there exists a set Y that, for each x ∈ X, contains this y with
property φ. One can view this as obtaining Y by replacing each x ∈ X by
the corresponding y = y(x).
Theorem A.20. If A and B are sets, then the Cartesian product of A and B, i.e. the
class
A × B := x : ∃ ∃ x = (a, b)
a∈A b∈B
exists as a set.
Proof. For each a ∈ A, we can use replacement with X := B and φ := φa being the
formula y = (a, x) to obtain the existence of the set
(in the usual way, comprehension and extensionality were used as well). Analogously,
using replacement again with X := A and φ being the formula y = {x} × B, we obtain
the existence of the set
M := {{x} × B : x ∈ A}. (A.6b)
In a final step, the union axiom now shows
[ [
M= {a} × B = A × B (A.6c)
a∈A
to be a set as well.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 198
The following axiom of infinity guarantees the existence of infinite sets (e.g., it will allow
us to define the set of natural numbers N, which is infinite by Th. A.46 below).
Axiom 6 Infinity:
∃ 0∈X ∧ ∀ (x ∪ {x} ∈ X) .
X x∈X
Thus, the infinity axiom states the existence of a set X containing ∅ (iden-
tified with the number 0), and, for each of its elements x, its successor
S(x) = x ∪ {x}.
In preparation for our official definition of N in Def. A.41 below, we will study so-called
ordinals, which are special sets also of further interest to the field of set theory (the
natural numbers will turn out to be precisely the finite ordinals). We also need some
notions from the theory of relations, in particular, order relations (cf. Def. 2.19 and Def.
2.25).
(b) R is called a strict partial order if, and only if, R is asymmetric and transitive. It is
noted that this is consistent with Not. 2.26, since, recalling the notation ∆(X) :=
{(x, x) : x ∈ X}, R is a partial order on X if, and only if, R \ ∆(X) is a strict
partial order on X. We extend the notions lower/upper bound, min, max, inf, sup
of Def. 2.27 to strict partial orders R by applying them to R ∪∆(X): We call x ∈ X
a lower bound of Y ⊆ X with respect to R if, and only if, x is a lower bound of Y
with respect to R ∪ ∆(X), and analogous for the other notions.
(c) A strict partial order R is called a strict total order or a strict linear order if, and
only if, for each x, y ∈ X, one has x = y or xRy or yRx.
(d) R is called a (strict) well-order if, and only if, R is a (strict) total order and every
nonempty subset of X has a min with respect to R (for example, the usual ≤
constitutes a well-order on N (see Th. D.5 below), but not on R (e.g., R+ does not
have a min)).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 199
Proof. (a): If a, b, c ∈ Y with aRb and bRc, then aRc, since a, b, c ∈ X and R is transitive
on X.
(b): If a ∈ Y , then a ∈ X and aRa, since R is reflexive on X.
(c): If a, b ∈ Y with aRb and bRa, then a = b, since a, b ∈ X and R is antisymmetric
on X.
(d): If a, b ∈ Y with aRb, then ¬bRa, since a, b ∈ X and R is asymmetric on X.
(e) follows by combining (a) – (d).
(f): If a, b ∈ Y with a = b and ¬aRb, then bRa, since a, b ∈ X and R is total on X.
Combining this with (e) yields (f).
(g): Due to (f), it merely remains to show that every nonempty subset Z ⊆ Y has a
min. However, since Z ⊆ X and R is a well-order on X, there is m ∈ Z such that m is
a min for R on X, implying m to be a min for R on Y as well.
Remark A.23. Since the universal class V is not a set, ∈ is not a relation in the sense
of Def. 2.19. It can be considered as a “class relation”, i.e. a subclass of V × V, but
it is a proper class. However, ∈ does constitute a relation in the sense of Def. 2.19 on
each set X (recalling that each element of X must be a set as well). More precisely, if
X is a set, then so is
R∈ := {(x, y) ∈ X × X : x ∈ y}. (A.8a)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 200
Then
∀ (x, y) ∈ R∈ ⇔ x ∈ y. (A.8b)
x,y∈X
Definition A.24. A set X is called transitive if, and only if, every element of X is also
a subset of X:
∀ x ⊆ X. (A.9a)
x∈X
(b) We define
Example A.27. Using (A.5), 0 = ∅ is an ordinal, and 1 = S(0), 2 = S(1) are both
successor ordinals (in Prop. A.43, we will identify N0 as the smallest limit ordinal). Even
though X := {1} and Y := {0, 2} are well-ordered by ∈, they are not ordinals, since
they are not transitive sets: 1 ∈ X, but 1 6⊆ X (since 0 ∈ 1, but 0 ∈ / X); similarly,
1 ∈ 2 ∈ Y , but 1 ∈
/ Y.
Lemma A.28. No ordinal contains itself, i.e.
∀ α∈
/ α.
α∈ON
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 201
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 202
(a) X is well-ordered by ∈.
(b) X is an ordinal if, and only if, X is transitive. Note: A transitive set of ordinals
X is sometimes called an initial segment of ON, since, here, transitivity can be
restated in the form
∀ ∀ α<β ⇒ α∈X . (A.13)
α∈ON β∈X
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 203
∀ α ⊆ u.
α∈X
Next, we obtain some results regarding the successor function of Def. A.19 in the context
of ordinals.
This contradiction to α ∈
/ α yields x 6= α, concluding the proof.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 204
(c) For each ordinal β, β < S(α) holds if, and only if, β ≤ α.
Proof. (a): Due to Prop. A.29, S(α) is a set of ordinals. Thus, by Cor. A.34(b), it
merely remains to prove that S(α) is transitive. Let x ∈ S(α). If x = α, then x =
α ⊆ α ∪ {α} = S(α). If x 6= α, then x ∈ α and, since α is transitive, this implies
x ⊆ α ⊆ S(α), showing S(α) to be transitive, thereby completing the proof of (a).
(b) holds, as α ∈ S(α) holds by the definition of S(α).
(c) is clear, since, for each ordinal β,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 205
k < m ∧ m ≤ n ⇒ k ≤ n ⇒ k ∈ S(n).
Definition A.41. If the set X is given by the axiom of infinity, then we use compre-
hension to define the set
N0 := {n ∈ X : n = 0 ∨ n is a natural number}
Proof. “⇒” is clear from Def. A.41 and “⇐” is due to Th. A.40.
In the following Th. A.44, we will prove that N satisfies the Peano axioms P1 – P3 of
Sec. 3.1 (if one prefers, one can show the same for N0 , where 0 takes over the role of 1).
Theorem A.44. The set of natural numbers N satisfies the Peano axioms P1 – P3 of
Sec. 3.1.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 206
Proof. For P1 and P2, we have to show that, for each n ∈ N, one has S(n) ∈ N \ {1}
and that S(m) 6= S(n) for each m, n ∈ N, m 6= n. Let n ∈ N. Then S(n) ∈ N by Prop.
A.39. If S(n) = 1, then n < S(n) = 1 by Prop. A.37(b), i.e. n = 0, in contradiction to
n ∈ N. If m, n ∈ N with m 6= n, then S(m) 6= S(n) is due to Prop. A.37(d). To prove
P3, suppose A ⊆ N has the property that 1 ∈ A and S(n) ∈ A for each n ∈ A. We need
to show A = N (i.e. N ⊆ A, as A ⊆ N is assumed). Let X := A ∪ {0}. Then X satisfies
(A.14) and Th. A.40 yields N0 ⊆ X. Thus, if n ∈ N, then n ∈ X \ {0} = A, showing
N ⊆ A.
For more basic information regarding ordinals see, e.g., [Kun12, Sec. I.8].
There is one more basic construction principle for sets that is not covered by Axioms 0
– 6, namely the formation of power sets. This needs another axiom:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 207
A.3.6 Foundation
Foundation is, perhaps, the least important of the axioms in ZF. It basically cleanses
the mathematical universe of unnecessary “clutter”, i.e. of certain pathological sets that
are of no importance to standard mathematics anyway.
Axiom 8 Foundation:
∀ ∃ (x ∈ X) ⇒ ∃ ¬∃ z ∈ x ∧ z ∈ X .
X x x∈X z
Thus, the foundation axiom states that every nonempty set X contains an
element x that is disjoint to X.
Theorem A.48. Due to the foundation axiom, the ∈ relation can have no cycles, i.e.
there do not exist sets x1 , x2 , . . . , xn , n ∈ N, such that
x1 ∈ x2 ∈ · · · ∈ xn ∈ x1 . (A.16a)
In particular, sets can not be members of themselves:
¬∃ x ∈ x. (A.16b)
x
Proof. If there were sets x1 , x2 , . . . , xn , n ∈ N, such that (A.16a) were true, then, by
using the pairing axiom and the union axiom, we could form the set
X := {x1 , . . . , xn }.
Then, in contradiction to the foundation axiom, X ∩ xi 6= ∅, for each i = 1, . . . , n:
Indeed, xn ∈ X ∩ x1 , and xi−1 ∈ X ∩ xi for each i = 2, . . . , n.
For a detailed explanation, why “sets” forbidden by foundation do not occur in standard
mathematics, anyway, see, e.g., [Kun12, Sec. I.14].
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 208
Thus, the axiom of choice postulates, for each nonempty set M, whose ele-
ments are all nonempty sets, the existence of a choice function, that means
a function that assigns, to each M ∈ M, an element m ∈ M .
Example A.49. For example, the axiom of choice postulates, for each nonempty set
A, the existence of a choice function on P(A) \ {∅} that assigns each subset of A one of
its elements.
—
The axiom of choice is remarkable since, at first glance, it seems so natural that one
can hardly believe it is not provable from the axioms in ZF. However, one can actually
show that it is neither provable nor disprovable from ZF (see, e.g., [Jec73, Th. 3.5, Th.
5.16] – such a result is called an independence proof, see [Kun80] for further material). If
you want to convince yourself that the existence of choice functions is, indeed, a tricky
matter, try to define a choice function on P(R) \ {∅} without AC (but do not spend too
much time on it – one can show this is actually impossible to accomplish).
Theorem A.52 below provides several important equivalences of AC. Its statement and
proof needs some preparation. We start by introducing some more relevant notions from
the theory of partial orders:
Definition A.50. Let X be a set and let ≤ be a partial order on X.
(a) An element m ∈ X is called maximal (with respect to ≤) if, and only if, there exists
no x ∈ X such that m < x (note that a maximal element does not have to be a
max and that a maximal element is not necessarily unique).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 209
(b) A nonempty subset C of X is called a chain if, and only if, C is totally ordered by
≤. Moreover, a chain C is called maximal if, and only if, no strict superset Y of C
(i.e. no Y ⊆ X such that C ( Y ) is a chain.
The following lemma is a bit technical and will be used to prove the implication AC
⇒ (ii) in Th. A.52 (other proofs in the literature often make use of so-called transfinite
recursion, but that would mean further developing the theory of ordinals, and we will
not pursue this route in this class).
Lemma A.51. Let X be a set and let ∅ 6= M ⊆ P(X) be a nonempty set of subsets of
X. We let M be partially ordered by inclusion, i.e. setting A ≤ B :⇔ A ⊆ B for each
A, B ∈ M. Moreover, define
[ [
∀ S := S (A.17)
S⊆M
S∈S
and assume [
∀ C is a chain ⇒ C∈M . (A.18)
C⊆M
Proof. Fix some arbitrary M0 ∈ M. We call T ⊆ M an M0 -tower if, and only if, T
satisfies the following three properties
(i) M0 ∈ T .
S
(ii) If C ⊆ T is a chain, then C ∈T.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 210
The main work of the rest of the proof consists of showing that T0 is a chain. To show
T0 to be a chain, define
Γ := M ∈ T0 : ∀ (M ⊆ N ∨ N ⊆ M ) . (A.22)
N ∈T0
∀ Φ(M ) := {N ∈ T0 : N ⊆ M ∨ g(M ) ⊆ N }
M ∈Γ
and also show each Φ(M ) to be an M0 -tower. Actually, Γ and each Φ(M )Ssatisfy (i)
due to (A.21). To verify Γ satisfies (ii), let C ⊆ Γ be a chain and U := C. Then
U ∈ T0 , since T0 satisfies (ii). If N ∈ T0 , and C ⊆ N for each C ∈ C, then U ⊆ N . If
N ∈ T0 , and there is C ∈ C such that C 6⊆ N , then N ⊆ C (since C ∈ Γ), i.e. N ⊆ U ,
showing U ∈ Γ and Γ satisfyingS(ii). Now, let M ∈ Γ. To verify Φ(M ) satisfies (ii), let
C ⊆ Φ(M ) be a chain and U := C. Then U ∈ T0 , since T0 satisfies (ii). If U ⊆ M , then
U ∈ Φ(M ) as desired. If U 6⊆ M , then there is x ∈ U such that x ∈ / M . Thus, there
is C ∈ C such that x ∈ C and g(M ) ⊆ C (since C ∈ Φ(M )), i.e. g(M ) ⊆ U , showing
U ∈ Φ(M ) also in this case, and Φ(M ) satisfies (ii). We will verify that Φ(M ) satisfies
(iii) next. For this purpose, fix N ∈ Φ(M ). We need to show g(N ) ∈ Φ(M ). We already
know g(N ) ∈ T0 , as T0 satisfies (iii). As N ∈ Φ(M ), we can now distinguish three cases.
Case 1: N ( M . In this case, we cannot have M ( g(N ) (otherwise, #(g(N ) \ N ) ≥ 2
in contradiction to (A.19)). Thus, g(N ) ⊆ M (since M ∈ Γ), showing g(N ) ∈ Φ(M ).
Case 2: N = M . Then g(N ) = g(M ) ∈ Φ(M ) (since g(M ) ∈ T0 and g(M ) ⊆ g(M )).
Case 3: g(M ) ⊆ N . Then g(M ) ⊆ g(N ) by (A.19), again showing g(N ) ∈ Φ(M ). Thus,
we have verified that Φ(M ) satisfies (iii) and, therefore, is an M0 -tower. Then, by the
definition of T0 , we have T0 ⊆ Φ(M ). As we also have Φ(M ) ⊆ T0 (from the definition
of Φ(M )), we have shown
∀ Φ(M ) = T0 .
M ∈Γ
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 211
Theorem A.52 (Equivalences to the Axiom of Choice). The following statements (i)
– (v) are equivalent to the axiom of choice (as stated as Axiom 9 above).
Q
(i) Every Cartesian product i∈I Ai of nonempty sets Ai , where I is a nonempty
index set, is nonempty (cf. Def. 2.15(c)).
(ii) Hausdorff’s Maximality Principle: Every nonempty partially ordered set X con-
tains a maximal chain (i.e. a maximal totally ordered subset).
(iii) Zorn’s Lemma: Let X be a nonempty partially ordered set. If every chain C ⊆ X
(i.e. every nonempty totally ordered subset of X) has an upper bound in X (such
chains with upper bounds are sometimes called inductive), then X contains a
maximal element (cf. Def. A.50(a)).
(iv) Zermelo’s Well-Ordering Theorem: Every set can be well-ordered (recall the defi-
nition of a well-order from Def. A.21(d)).
to prove (i).
Next, we will show AC ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ AC.
“AC ⇒ (ii)”: Assume AC and let X be a nonempty partially ordered set. Let M be the
set of all chains in X (i.e. the set of all nonempty totally ordered subsets of X). Then
∅∈/ M and M 6= ∅ (since X 6= ∅ and {x} ∈ M for each x ∈ X). Moreover, M satisfies
the hypothesis
S of Lem. A.51, since, if C ⊆ M is a chain of totally ordered subsets of X,
then C is a totally ordered subset of X, i.e. in M (here we have used the notation
of (A.17); also note that we are dealing with two different types of chains here, namely
those with respect to the order on X and those with respect to the order given by ⊆ on
M). Let f : P(X) \ {∅} −→ X be a choice function given by AC, i.e. such that
∀ f (Y ) ∈ Y.
Y ∈P(X)\{∅}
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 212
Since g clearly satisfies (A.19), Lem. A.51 applies, providing an M ∈ M such that
g(M ) = M . Thus, M ∗ = ∅, i.e. M is a maximal chain, proving (ii).
“(ii) ⇒ (iii)”: Assume (ii). To prove Zorn’s lemma, let X be a nonempty set, partially
ordered by ≤, such that every chain C ⊆ X has an upper bound. Due to Hausdorff’s
maximality principle, we can assume C ⊆ X to be a maximal chain. Let m ∈ X be an
upper bound for the maximal chain C. We claim that m is a maximal element: Indeed,
if there were x ∈ X such that m < x, then x ∈ / C (since m is upper bound for C) and
C ∪ {x} would constitute a strict superset of C that is also a chain, contradicting the
maximality of C.
“(iii) ⇒ (iv)”: Assume (iii) and let X be a nonempty set. We need to construct a
well-order on X. Let W be the set of all well-orders on subsets of X, i.e.
W := (Y, W ) : Y ⊆ X ∧ W ⊆ Y × Y ⊆ X × X is a well-order on Y .
(recall the definition of the restriction of a relation from Def. A.21(e)). To apply Zorn’s
lemma to (W, ≤), we need to check that every chain C ⊆ W has an upper bound. To
this end, if C ⊆ W is a chain, let
[ [
UC := (YC , WC ), where YC := Y, WC := W.
(Y,W )∈C (Y,W )∈C
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 213
i.e., for each nonempty set M, whose elements are all nonempty sets, there exists a
function that assigns, to each M ∈ M, a finite subset of M . Neither the proof that (i)
– (iv) of Th. A.52 are each equivalent to AC nor the proof in [Bla84] that Th. A.52(v)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 214
implies AMC makes use of the axiom of foundation (Axiom 8 above). However, the
proof that AMC implies AC does require the axiom of foundation (cf., e.g., [Hal17, Ch.
6]).
A.5 Cardinality
A.5.1 Relations to Injective, Surjective, and Bijective Maps; Schröder-
Bernstein Theorem
Proof. According to Def. 2.23, we have to prove that ∼ is reflexive, symmetric, and
transitive. According to Def. 3.12(a), A ∼ B holds for A, B ∈ M if, and only if, there
exists a bijective map f : A −→ B. Thus, since the identity Id : A −→ A is bijective,
A ∼ A, showing ∼ is reflexive. If A ∼ B, then there exists a bijective map f : A −→ B,
and f −1 is a bijective map f −1 : B −→ A, showing B ∼ A and that ∼ is symmetric.
If A ∼ B and B ∼ C, then there are bijective maps f : A −→ B and g : B −→ C.
Then, according to Th. 2.14, the composition (g ◦ f ) : A −→ C is also bijective, proving
A ∼ C and that ∼ is transitive.
The next theorem provides two interesting, and sometimes useful, characterizations of
infinite sets:
Theorem A.55. Let A be a set. Using the axiom of choice (AC) of Sec. A.4, the
following statements (i) – (iii) are equivalent. More precisely, (ii) and (iii) are equivalent
even without AC (a set A is sometimes called Dedekind-infinite if, and only if, it satisfies
(iii)), (iii) implies (i) without AC, but AC is needed to show (i) implies (ii), (iii).
(i) A is infinite.
(ii) There exists M ⊆ A and a bijective map f : M −→ N.
(iii) There exists a strict subset B ( A and a bijective map g : A −→ B.
One sometimes expresses the equivalence between (i) and (ii) by saying that a set is
infinite if, and only if, it contains a copy of the natural numbers. The property stated in
(iii) might seem strange at first, but infinite sets are, indeed, precisely those identical in
size to some of their strict subsets (as an example think of the natural bijection n 7→ 2n
between all natural numbers and the even numbers).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 215
Proof. We first prove, without AC, the equivalence between (ii) and (iii).
“(ii) ⇒ (iii)”: Let E denote the even numbers. Then E ( N and h : N −→ E,
h(n) := 2n, is a bijection, showing that (iii) holds for the natural numbers. According to
(ii), there exists M ⊆ A and a bijective map f : M −→ N. Define B := (A\M ) ∪˙ f −1 (E)
and (
x for x ∈ A \ M ,
h : A −→ B, h(x) := −1
(A.24)
f ◦ h ◦ f (x) for x ∈ M .
Then B ( A since B does not contain the elements of M that are mapped to odd
numbers under f . Still, h is bijective, since h↾A\M = IdA\M and h↾M = f −1 ◦ h ◦ f is the
composition of the bijective maps f , h, and f −1 ↾E : E −→ f −1 (E).
“(iii) ⇒ (ii)”: As (iii) is assumed, there exist B ⊆ A, a ∈ A \ B, and a bijective map
g : A −→ B. Set
M := {an := g n (a) : n ∈ N}.
We show that an 6= am for each m, n ∈ N with m 6= n: Indeed, suppose m, n ∈ N with
n > m and an = am . Then, since g is bijective, we can apply g −1 m times to an = am
to obtain
a = (g −1 )m (am ) = (g −1 )m (an ) = g n−m (a).
Since l := n − m ≥ 1, we have a = g(g l−1 (a)), in contradiction to a ∈ A \ B. Thus, all
the an ∈ M are distinct and we can define f : M −→ N, f (an ) := n, which is clearly
bijective, proving (ii).
“(iii) ⇒ (i)”: The proof is conducted by contraposition, i.e. we assume A to be finite and
proof that (iii) does not hold. If A = ∅, then there is nothing to prove. If ∅ 6= A is finite,
then, by Def. 3.12(b), there exists n ∈ N and a bijective map f : A −→ {1, . . . , n}. If
B ( A, then, according to Th. A.64(a), there exists m ∈ N0 , m < n, and a bijective
map h : B −→ {1, . . . , m}. If there were a bijective map g : A −→ B, then h ◦ g ◦ f −1
were a bijective map from {1, . . . , n} onto {1, . . . , m} with m < n in contradiction to
Th. A.62.
“(i) ⇒ (ii)”: Inductively, we construct a strictly increasing sequence M1 ⊆ M2 ⊆ . . . of
subsets Mn of A n ∈ N, and a sequence of functions fn : Mn −→ {1, . . . , n} satisfying
∀ fn is bijective, (A.25a)
n∈N
∀ m ≤ n ⇒ f n ↾M m = f m : (A.25b)
m,n∈N
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 216
Then the bijectivity of fn implies the bijectivity of fn+1 , and, since fn+1 ↾Mn = fn holds
by definition of fn+1 , the implication
holds true as well. An induction also shows Mn = {m1 , . . . , mn } and fn (mn ) = n for
each n ∈ N. We now define
[
M := Mn = {mn : n ∈ N}, f : M −→ N, f (mn ) := fn (mn ) = n. (A.27)
n∈N
(i) The sets A and B have the same cardinality (i.e. there exists a bijective map
φ : A −→ B).
We will give two proofs of the Schröder-Bernstein theorem. The first proof is rather
elegant, but also quite abstract. The second proof is longer, but less abstract. Even
though it is still nonconstructive in the general situation, in many concrete cases, it does
provide a method for actually constructing a bijective map from two injective maps. The
first proof is based on the following lemma:
Lemma A.57. Let A be a set. Consider P(A) to be endowed with the partial order
given by set inclusion, i.e., for each X, Y ∈ P(A), X ≤ Y if, and only if, X ⊆ Y . If
F : P(A) −→ P(A) is isotone with respect to that order, then F has a fixed point, i.e.
F (X0 ) = X0 for some X0 ∈ P(A).
Proof. Define \
A := {X ∈ P(A) : F (X) ⊆ X}, X0 := X (A.28)
X∈A
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 217
First Proof of Th. A.56. (i) trivially implies (ii), as one can simply set f := φ and
g := φ−1 . It remains to show (ii) implies (i). Thus, let f : A −→ B and g : B −→ A
be injective. To apply Lem. A.57, define
F : P(A) −→ P(A), F (X) := A \ g B \ f (X) ,
and note
X ⊆ Y ⊆ A ⇒ f (X) ⊆ f (Y ) ⇒ B \ f (Y ) ⊆ B \ f (X)
⇒ g B \ f (Y ) ⊆ g B \ f (X) ⇒ F (X) ⊆ F (Y ).
Thus, by Lem. A.57, F has a fixed point X0 . We claim that a bijection is obtained via
setting (
f (x) for x ∈ X0 ,
φ : A −→ B, φ(x) := −1
g (x) for x ∈ / X0 .
First, φ is well-defined, since x ∈/ X0 = F (X0 ) implies x ∈ g B \ f (X0 ) . To verify
that φ is injective, let x, y ∈ A, x 6= y. If x, y ∈ X0 , then φ(x) 6= φ(y), as f is
injective. If x, y ∈ A \ X0 , then φ(x) 6= φ(y), as g −1 is well-defined. If x ∈ X0 and
y∈/ X0 , then φ(x) ∈ f (X0 ) and φ(y) ∈ B \ f (X0 ), once again, implying φ(x) 6= φ(y). If
remains to prove surjectivity. If b ∈ f (X0 ), then φ(f −1 (b)) = b. If b ∈ B \ f (X0 ), then
g(b) ∈
/ X0 = F (X0 ), i.e. φ(g(b)) = b, showing φ to be surjective.
Second Proof of Th. A.56. As in the first proof, we only need to show (ii) implies (i).
We first assume that A and B are disjoint. To define φ, we first construct a suitable
partition of A ∪˙ B, where the subsets of the partition are given via sequences defined by
using f and g. The idea is to assign a unique sequence σ(a) to each a ∈ A and a unique
sequence σ(b) to each b ∈ B by alternately applying f and g to advance the sequence
to the right and by alternately applying f −1 and g −1 to advance the sequence to the
left, if possible (for a given a ∈ A, g −1 (a) might not be defined and, for a given b ∈ B,
f −1 (a) might not be defined). Thus, for a ∈ A, σ(a) has the form
. . . , f −1 g −1 (a) , g −1 (a), a, f (a), g f (a) , . . . (A.29)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 218
where the conditions in (A.30e) and (A.30g) are meant to implicitly require σi+1 (a) to
be defined for i+1. By induction, one shows σi−1 (a) ∈ A for each i > 0 odd, σi−1 (a) ∈ B
for each i > 0 even, σi+1 (a) ∈ A for each ma ≤ i < 0 odd, and σi+1 (a) ∈ B for each
ma ≤ i < 0 even, such that σi (a) is well-defined by (A.30) for each i ∈ Ia (with Ia = Z
if (A.30e) and (A.30g) are never satisfied). Analogously, for each b ∈ B, we define
σ(b) = (σi (b))i∈Ib recursively by
where the conditions in (A.31e) and (A.31g) are meant to implicitly require σi+1 (b) to
be defined for i+1. By induction, one shows σi−1 (b) ∈ B for each i > 0 odd, σi−1 (b) ∈ A
for each i > 0 even, σi+1 (b) ∈ B for each mb ≤ i < 0 odd, and σi+1 (b) ∈ A for each
mb ≤ i < 0 even, such that σi (b) is well-defined by (A.31) for each i ∈ Ib (with Ib = Z if
(A.31e) and (A.31g) are never satisfied). The σ(a) and σ(b) now allow us to define the
sets
∀ Sx := {σi (x) : i ∈ Ix } ⊆ A ∪˙ B. (A.32)
˙B
x∈A ∪
Moreover, we call x ∈ A ∪˙ B an A-stopper if, and only if, σ(x) terminates to the left
with some element in A; a B-stopper, if, and only if, σ(x) terminates to the left with
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 219
some element in B; and a non-stopper, if σ(x) does never terminate to the left – thus,
x A-stopper ⇔ Ix 6= Z ∧ (x ∈ A ∧ mx even) ∨ (x ∈ B ∧ mx odd) ,
x B-stopper ⇔ Ix 6= Z ∧ (x ∈ A ∧ mx odd) ∨ (x ∈ B ∧ mx even) ,
x non-stopper ⇔ Ix = Z. (A.33)
To verify (A.35), let z ∈ Sx . Then there exists i ∈ Ix such that z = σ0 (z) = σi (x) and
simple inductions show σk (z) = σk+i (x) for each k ∈ Iz and σk−i (z) = σk (x) for each
k ∈ Ix (in particular, i + Iz = Ix ), proving Sx = Sz .
We are now in a position to define the desired bijection φ : A −→ B:
(
f (a) if a is an A-stopper or a non-stopper,
φ : A −→ B, φ(a) := −1
(A.36)
g (a) if a is a B-stopper.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 220
still being injective if f, g are, the first part of the proof yields a bijective function
φ̃ : A × {0} −→ B × {1}. Then, using the clearly bijective functions
α : A −→ A × {0}, α(a) := (a, 0), (A.38a)
β : B −→ B × {1}, β(b) := (b, 1), (A.38b)
φ := β −1 ◦ φ̃ ◦ α : A −→ B is also bijective.
Remark A.58. In general, the second proof of the Schröder-Bernstein Th. A.56 is still
nonconstructive, since one has, in general, no algorithm to determine if a given element
is an A-stopper, a B-stopper, or a non-stopper. However, as the following Ex. A.59
shows, in particular situations, determining A-stoppers, B-stoppers, and non-stoppers
does not have to be difficult.
Example A.59. Let A := N0 , B := {n ∈ N0 : n even}. We consider A and B as being
made disjoint (for example, by using the trick employed in the last part of the proof
of Th. A.56 above), but, for the sake of readability, we will not reflect this in the used
notation. Define the maps
f : A −→ B, f (n) := 4n, (A.39a)
g : B −→ A, g(n) := n, (A.39b)
both being clearly injective, but not surjective. The goal is to, explicitly, find the
bijective map φ : A −→ B, given by (A.36). As an intermediate step, we determine
which elements of A are non-stoppers, A-stoppers, and B-stoppers, and likewise for the
elements of B. Clearly 0 ∈ A and 0 ∈ B are non-stoppers. We will see that all other
elements are either A-stoppers or B-stoppers. The precise claim is
A1 := {a ∈ A : a is A-stopper} = C1 := {a ∈ A : a = n 4k , n odd, k ∈ N0 }, (A.40a)
A2 := {a ∈ A : a is B-stopper} = C2 := A \ (C1 ∪ {0}), (A.40b)
B1 := {b ∈ B : b is A-stopper} = D1 := B \ (D2 ∪ {0}), (A.40c)
B2 := {b ∈ B : b is B-stopper} = D2 := {b ∈ B : b = n 2k ; n, k odd; n, k ≥ 1}.
(A.40d)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 221
Since D2 ⊆ B2 , all elements of D2 are B-stoppers, and, thus, so are all elements of C2 ,
proving C2 ⊆ A2 . Since A = C1 ∪˙ C2 ∪{0},
˙ we then also obtain A1 = C1 and A2 = C2 .
Clearly, each even b ∈ N either has the form b = n 2k with odd n, k ≥ 1 (i.e. b ∈ D2 ) or
b = n4k with n odd and k ∈ N, i.e.
Since C1 = A1 , all elements of C1 are A-stoppers, and, thus, so are all elements of D1 .
Since B = D1 ∪˙ D2 ∪{0},
˙ we then also obtain B1 = D1 and B2 = D2 .
Now that we have identified explicit formulas for A1 and A2 , we can write the assignment
rule for the bijective φ : A −→ B, given by (A.36), in the explicit form
0 if a = 0,
φ(a) := 4a if a = n 4k with n odd and k ∈ N0 , (A.43)
a if a = 2 · n4k with n odd and k ∈ N0 .
Theorem A.60. Let A, B be nonempty sets. Then the following statements are equiv-
alent (where the implication “(ii) ⇒ (i)” makes use of the axiom of choice (AC) of Sec.
A.4).
(i) The sets A and B have the same cardinality (i.e. there exists a bijective map
φ : A −→ B).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 222
Proof. The equivalences are an immediate consequence of combining Th. A.56 with Th.
A.60.
It is intuitively clear that finite cardinalities are uniquely determined. Still one has to
provide a rigorous proof. The key is the following theorem:
Then g is bijective, since it is the composition g = h ◦ f of the bijective map f with the
bijective map
Thus, the restriction g↾{1,...,m−1} : {1, . . . , m−1} −→ {1, . . . , n−1} must also be bijective,
such that the induction hypothesis yields m − 1 = n − 1, which, in turn, implies m = n
as desired.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 223
Proof. For #A = n ∈ N, we use induction to prove (a) and (b) simultaneously, i.e. we
show
∀ #A = n ⇒ ∀ ∀ #B ∈ {0, . . . , n − 1} ∧ # A \ {a} = n − 1 .
n∈N B∈P(A)\{A} a∈A
| {z }
φ(n)
Base Case (n = 1): In this case, A has precisely one element, i.e. B = A \ {a} = ∅, and
#∅ = 0 = n − 1 proves φ(1).
Induction Step: For the induction hypothesis, we assume φ(n) to be true, i.e. we assume
(a) and (b) hold for each A with #A = n. We have to prove φ(n + 1), i.e., we consider
A with #A = n + 1. From #A = n + 1, we conclude the existence of a bijective map ϕ :
A −→ {1, . . . , n + 1}. We have to construct a bijective map ψ : A \ {a} −→ {1, . . . , n}.
To this end, set k := ϕ(a) and define the auxiliary function
n + 1 for x = k,
f : {1, . . . , n + 1} −→ {1, . . . , n + 1}, f (x) := k for x = n + 1,
x for x ∈
/ {k, n + 1}.
(i) f is injective.
(ii) f is surjective.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 224
(iii) f is bijective.
Lemma A.66. For each finite set A (i.e. #A = n ∈ N0 ) and each B ⊆ A, one has
#(A \ B) = #A − #B.
Base Case (m = 1): φ(1) is precisely the statement provided by Th. A.64(b).
Induction Step: For the induction hypothesis, we assume φ(m) with 1 ≤ m < n. To
prove φ(m + 1), consider B ⊆ A with #B = m + 1. Fix an element b ∈ B and set
B1 := B \ {b}. Then #B1 = m by Th. A.64(b), A \ B = (A \ B1 ) \ {b}, and we compute
Th. A.64(b) φ(m)
#(A \ B) = # (A \ B1 ) \ {b} = #(A \ B1 ) − 1 = #A − #B1 − 1
= #A − #B,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 225
Proof. The assertion is clearly true if A or B is empty. If A and B are nonempty, then
there exist m, n ∈ N such that #A = m and #B = n, i.e. there are bijective maps
f : A −→ {1, . . . , m} and g : B −→ {1, . . . , n}.
We first consider the case A∩B = ∅. We need to construct a bijective map h : A∪B −→
{1, . . . , m + n}. To this end, we define
(
f (x) for x ∈ A,
h : A ∪ B −→ {1, . . . , m + n}, h(x) :=
g(x) + m for x ∈ B.
Proof. If at least one Ai is empty, then (A.47) is true, since both sides are 0.
The case where all Ai are nonempty is proved by induction over n, i.e. we know ki :=
#Ai ∈ N for each i ∈ {1, . . . , n} and show by induction
n
Y n
Y
∀ # Ai = ki .
n∈N
i=1 i=1
| {z }
φ(n)
Q1 Q1
Base Case (n = 1): i=1 Ai = #A1 = k1 = i=1 ki , i.e. φ(1) holds.
Induction Step: From the induction
Qn hypothesis φ(n),Qn we obtain a bijective map ϕ :
A −→ {1, . . . , N }, where A := i=1 Ai and N := i=1 ki . To prove φ(n + 1), we need
to construct a bijective map h : A × An+1 −→ {1, . . . , N · kn+1 }. Since #An+1 = kn+1 ,
there exists a bijective map f : An+1 −→ {1, . . . , kn+1 }. We define
h : A × An+1 −→ {1, . . . , N · kn+1 },
h(a1 , . . . , an , an+1 ) := f (an+1 ) − 1 · N + ϕ(a1 , . . . , an ).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
A AXIOMATIC SET THEORY 226
Since ϕ and f are bijective, and since every m ∈ {1, . . . , N · kn+1 } has a unique rep-
resentation in the form m = a · N + r with a ∈ {0, . . . , kn+1 − 1} and r ∈ {1, . . . , N }
(exercise), h is also bijective. This proves φ(n + 1) and completes the induction.
Theorem A.69. For each finite set A (i.e. #A = n ∈ N0 ), one has #P(A) = 2n .
Base Case (n = 0): For n = 0, we have A = ∅, i.e. P(A) = {∅}. Thus, #P(A) = 1 = 20 ,
proving φ(0).
Induction Step: Assume φ(n) and consider A with #A = n + 1. Then A contains at
least one element a. ForB := A \ {a}, we then
know #B = n from Th. A.64(b).
Moreover, setting M := C ∪ {a} : C ∈ P(B) , we have the disjoint decomposition
P(A) = P(B) ∪˙ M. As the map ϕ : P(B) −→ M, ϕ(C) := C ∪ {a}, is clearly bijective,
P(B) and M have the same cardinality. Thus,
Th. A.67 φ(n)
#P(A) = #P(B) + #M = #P(B) + #P(B) = 2 · 2n = 2n+1 ,
thereby proving φ(n + 1) and completing the induction.
Theorem A.70. Let A be a set. There can never exist a surjective map from A onto
P(A) (in this sense, the size of P(A) is always strictly bigger than the size of A; in
particular, A and P(A) can never have the same size).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
B ASSOCIATIVITY AND COMMUTATIVITY 227
B.1 Associativity
In the literature, the general law of associativity is often stated in the form that
a1 a2 · · · an gives the same result “for every admissible way of inserting parentheses into
a1 a2 · · · an ”, but a completely precise formulation of what that actually means seems to
be rare. As a warm-up, we first prove a special case of the general law:
Proposition B.1. Let A be a set with a composition · : A × A −→ A (which we write
as a multiplication, but, clearly, this is not essential, and we could also write it as an
addition or with some other symbol). If the composition is associative, i.e. if
then ! !
n
Y k−1
Y n
Y
∀ ∀ ∀ ai ai = ai , (B.2)
n∈N, a1 ,...,an ∈A k∈{2,...,n}
n≥2 i=k i=1 i=1
Proof. If k = n, then (B.2) is immediate from (3.20a). For 2 ≤ k < n, we prove (B.2)
by induction on n: For the base case, n = 2, there is nothing to prove. For n > 2, one
computes
n
! k−1 ! n−1
! k−1 ! n−1
! k−1 !!
Y Y (3.20a) Y Y (B.1) Y Y
ai ai = an · ai ai = an · ai ai
i=k i=1 i=k i=1 i=k i=1
n−1
Y n
Y
ind.hyp. (3.20a)
= an · ai = ai , (B.3)
i=1 i=1
The difficulty in stating the general form of the law of associativity lies in giving a
precise definition of what one means by “an admissible way of inserting parentheses into
a1 a2 · · · an ”. So how does one actually proceed to calculate the value of a1 a2 · · · an , given
that parentheses have been inserted in an admissible way? The answer is that one does
it in n − 1 steps, where, in each step, one combines two juxtaposed elements, consistent
with the inserted parentheses. There can still be some ambiguity: For example, for
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
B ASSOCIATIVITY AND COMMUTATIVITY 228
(a1 a2 )(a3 (a4 a5 )), one has the freedom of first combining a1 , a2 , or of first combining
a4 , a5 . In consequence, our general law of associativity will show that, for each admissible
sequence of n − 1 directives for combining two juxtaposed elements, the final result is
the same (under the hypothesis that (B.1) holds). This still needs some preparatory
work.
In the following,
Qn one might see it as a slight notational inconvenience that we have
defined i=1 ai as an · · · a1 rather than a1 · · · an . For this reason, we will enumerate the
elements to be combined by composition from right to left rather then from left to right.
Definition and Remark B.2. Let A be a (nonempty) set with a composition · :
A × A −→ A, let n ∈ N, n ≥ 2, and let I be a totally ordered index set, #I = n,
I = {i1 , . . . , in } with i1 < · · · < in . Moreover, let F := (ain , . . . , ai1 ) be a family of n
elements of A.
(a) An admissible composition directive (for combining two juxtaposed elements of the
family) is an index ik ∈ I with 1 ≤ k ≤ n − 1. It transforms the family F into
the family G := (ain , . . . , aik+1 aik , . . . , ai1 ). In other words, G = (bj )j∈J , where
J := I \ {ik+1 }, bj = aj for each j ∈ J \ {ik }, and bik = aik+1 aik . We can write this
transformation as two maps
(1)
F 7→ δik (F ) := G = (ain , . . . , aik+1 aik , . . . , ai1 ) = (bj )j∈J , (B.4a)
(2)
I 7→ δik (I) := J = I \ {ik+1 }. (B.4b)
Thus, an application of an admissible composition directive reduces the length of
the family and the number of indices by one.
(b) Recursively, we define (finite) sequences of families, index sets, and indices as fol-
lows:
Fn := F, In := I, (B.5a)
(1) (2)
∀ Fα−1 := δjα (Fα ), Iα−1 := δjα (Iα ), where jα ∈ Iα \ {max Iα }.
α∈{2,...,n}
(B.5b)
The corresponding sequence of indices D := (jn , . . . , j2 ) in I is called an admissible
evaluation directive. Clearly,
∀ #Iα = α, i.e. Fα has length α. (B.6)
α∈{1,...,n}
In particular, I1 = {j2 } = {i1 } (where the second equality follows from (B.4b)),
F1 = (a), and we call
D(F ) := a (B.7)
the result of the admissible evaluation directive D applied to F .
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
B ASSOCIATIVITY AND COMMUTATIVITY 229
Theorem B.3 (General Law of Associativity). Let A be a (nonempty) set with a com-
position · : A × A −→ A, let n ∈ N, n ≥ 2, and let I be a totally ordered index set,
#I = n, I = {i1 , . . . , in } with i1 < · · · < in . Moreover, let F := (ain , . . . , ai1 ) be a
family of n elements of A. If the composition is associative, i.e. if (B.1) holds, then,
for each admissible evaluation directive as defined in Def. and Rem. B.2(b), the result
is the same, namely
Yn
D(F ) = ai k . (B.8)
k=1
Proof. We conduct the proof via induction on n. For n = 3, there are only two possible
directives and (B.1) guarantees that they yield the same result. For the induction step,
let n > 3. As in Def. and Rem. B.2(b), we write D = (jn , . . . , j2 ) and obtain some
I2 = {i1 , im }, 1 < m ≤ n, as the corresponding penultimate index set. Depending on
im , we partition (jn , . . . , j3 ) as follows: Set
J1 := k ∈ {3, . . . , n} : jk < im , J2 := k ∈ {3, . . . , n} : jk ≥ im . (B.9)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
B ASSOCIATIVITY AND COMMUTATIVITY 230
B.2 Commutativity
In the present section, we will generalize the law of commutativity ab = ba to a finite
number of factors, provided the composition is also associative. For this purpose, we
introduce the notion of permutation, also useful in many other mathematical contexts.
Definition and Remark B.4. Let n ∈ N. Each bijective map π : {1, . . . , n} −→
{1, . . . , n} is called a permutation of {1, . . . , n}. The set of permutations of {1, . . . , n}
forms a group with respect to the composition of maps, the so-called symmetric group
Sn : Indeed, the composition of maps is associative by Prop. 2.10(a); the neutral element
is the identity map e : {1, . . . , n} −→ {1, . . . , n}, e(i) = i; and, for each π ∈ Sn , its
inverse map π −1 is also its inverse element in the group Sn . Caveat: Simple examples
show that Sn is not commutative.
Theorem B.5 (General Law of Commutativity). Let A be a set with an associative
composition · : A × A −→ A (which we write as a multiplication, but, clearly, this is
not essential, and we could also write it as an addition or with some other symbol). If
the composition is commutative, i.e. if
∀ ab = ba, (B.17)
a,b∈A
then n n
Y Y
∀ ∀ ∀ ai = aπ(i) . (B.18)
n∈N π∈Sn a1 ,...,an ∈A
i=1 i=1
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
B ASSOCIATIVITY AND COMMUTATIVITY 231
Before we can carry out the proof, we need to learn a bit more about permutations.
π = (i1 i2 . . . ik ). (B.20)
(a) Each permutation can be decomposed into finitely many disjoint cycles: For each
π ∈ Sn , there exists a decomposition of {1, . . . , n} into disjoint sets A1 , . . . , AN ,
N ∈ N, i.e.
N
[
{1, . . . , n} = Ai and Ai ∩ Aj = ∅ for i 6= j, (B.21)
i=1
∀ ∃ ∃ π = τN ◦ · · · ◦ τ1 , (B.23)
π∈Sn N ∈N τ1 ,...,τN ∈T
where T := (i i + 1) : i ∈ {1, . . . , n − 1} .
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
B ASSOCIATIVITY AND COMMUTATIVITY 232
Indeed, since {1, . . . , n} is finite, there must be a smallest k ∈ N such that π k (i) ∈ A1 :=
{i, π(i), . . . , π k−1 (i)}. Since π is bijective, it must be π k (i) = i and (i π(i) . . . , π k−1 (i))
is a k-cycle. We are already done in case k = n. If k < n, then consider B :=
{1, . . . , n} \ A1 . Then, again using the bijectivity of π, π↾B is a permutation onSB with
1 ≤ #B < n. By induction, there are disjoint sets A2 , . . . , AN such that B = N j=2 Aj ,
Aj consists of the distinct elements aj1 , . . . , aj,Nj and
π↾B = (aN 1 . . . aN,NN ) · · · (a21 . . . a2,N2 ).
Since π = (i π(i) . . . , π k−1 (i)) ◦ π ↾B , this finishes the proof of (B.22). If there were
another, different,
SM decomposition of π into cycles, say, given by disjoint sets B1 , . . . , BM ,
{1, . . . , n} = i=1 Bi , M ∈ N, then there were Ai 6= Bj and k ∈ Ai ∩ Bj . But then k
were in the cycle given by Ai and in the cycle given by Bj , implying Ai = {π l (k) : l ∈
N} = Bj , in contradiction to Ai 6= Bj .
(b): We first show that every π ∈ Sn is a composition of finitely many transpositions
(not necessarily transpositions from the set T ): According to (a), it suffices to show
that every cycle is a composition of finitely many transpositions. Since each 1-cycle is
the identity, it is (i) = Id = (1 2) (1 2) for each i ∈ {1, . . . , n}. If (i1 . . . ik ) is a k-cycle,
k ∈ {2, . . . , n}, then
(i1 . . . ik ) = (i1 i2 ) (i2 i3 ) · · · (ik−1 ik ) : (B.25)
Indeed,
i1
for i = ik ,
∀ (i1 i2 ) (i2 i3 ) · · · (ik−1 ik )(i) = il+1 for i = il , l ∈ {1, . . . , k − 1}, (B.26)
i∈{1,...,n}
i for i ∈
/ {i1 , . . . , ik },
proving (B.25). To finish the proof of (b), we observe that every transposition is a
composition of finitely many elements of T : If i, j ∈ {1, . . . , n}, i < j, then
(i j) = (i i + 1) · · · (j − 2 j − 1)(j − 1 j) · · · (i + 1 i + 2)(i i + 1) : (B.27)
Indeed,
∀ (i i + 1) · · · (j − 2 j − 1)(j − 1 j) · · · (i + 1 i + 2)(i i + 1)(k)
k∈{1,...,n}
j for k = i,
i for k = j,
= (B.28)
k for i < k < j,
k for k ∈ / {i, i + 1, . . . , j},
proving (B.27).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
C ALGEBRAIC STRUCTURES 233
Proof of Th. B.5. For n = 1, there is nothing to prove. So let n > 1. For l ∈
1, . . . , n − 1, let τl : {1, . . . , n} −→ {1, . . . , n} be the transposition that interchanges
l and l + 1 and leaves all other elements fixed (i.e. τl (l) = l + 1, τl (l + 1) = l, τ (α) = α
for each α ∈ {1, . . . , n} \ {l, l + 1}) and let T := {τ1 , . . . , τn−1 }. Then (B.17) and Th.
B.3 imply that the theorem holds for π = τ for each τ ∈ T . For a general permutation
π ∈ Sn , Th. B.7(b) provides a finite sequence (τ 1 , . . . , τ N ), N ∈ N, of elements of T
such that π = τ N ◦ · · · ◦ τ 1 . Thus, as we already know that the theorem holds for N = 1,
the case N > 1 follows by induction.
The following example shows that, if the composition is not associative, then, in general,
(B.17) does not imply (B.18):
Example B.8. Let A := {a, b, c} with #A = 3 (i.e. the elements a, b, c are all distinct).
Let the composition · on A be defined according to the composition table
· a b c
a b b b
b b b a
c b a a
(the table entry at the intersection of the row labeled with factor x and the column
labeled with factor y is definition of x · y). Then, clearly, · is commutative. However, ·
is not associative, since, e.g.,
C Algebraic Structures
C.1 Groups
Definition C.1. Let G be a nonempty set with a map
◦ : G × G −→ G, (x, y) 7→ x ◦ y (C.1)
(called a composition on G, the examples we have in mind are addition and multiplication
on R). Then (G, ◦) (or just G, if the composition ◦ is understood) is called a group if,
and only if, the following three conditions are satisfied:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
C ALGEBRAIC STRUCTURES 234
∀ x ◦ e = x. (C.2)
x∈G
(iii) For each x ∈ G, there exists an inverse element x ∈ G, i.e. an element x ∈ G such
that
x ◦ x = e.
G is called a commutative or abelian group if, and only if, it is a group and satisfies the
additional condition:
Theorem C.2. The following statements and rules are valid in every group (G, ◦):
∀ e ◦ x = x. (C.3)
x∈G
also holds.
Proof. (a): Let x ∈ G. By Def. C.1(iii), there exists y ∈ G such that x ◦ y = e and, in
turn, z ∈ G such that y ◦ z = e. Thus,
e ◦ z = (x ◦ y) ◦ z = x ◦ (y ◦ z) = x ◦ e = x, (C.5)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
C ALGEBRAIC STRUCTURES 235
implying
x = e ◦ z = (e ◦ e) ◦ z = e ◦ (e ◦ z) = e ◦ x (C.6)
as desired.
(b): If e, f are both neutral elements, then, using (a), f = e ◦ f = e.
(c): Assume x ◦ a = e. Then there is b such that a ◦ b = e. One computes
Proposition C.4. Let (G, ◦) and (H, ◦) be groups and let φ : G −→ H be a group
homomorphism. Let e, e′ denote the neutral elements of G and H, respectively. Then
the following holds:
(a) φ(e) = e′ .
Applying (φ(e))−1 to both sides of the above equality then proves φ(e) = e′ .
(b): We compute
φ(a−1 ) ◦ φ(a) = φ(a−1 ◦ a) = φ(e) = e′ ,
proving (b).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
C ALGEBRAIC STRUCTURES 236
Notation C.5. Exponentiation with Integer Exponents: Let G be a nonempty set with
a composition · : G × G −→ G. Assume there exists a (unique) neutral element 1 ∈ G
(satisfying x · 1 = x for each x ∈ G). Define recursively for each x ∈ G and each n ∈ N0 :
x0 := 1, ∀ xn+1 := x · xn . (C.10a)
n∈N0
Moreover, if (G, ·) constitutes a group, then also define for each x ∈ G and each n ∈ N:
(a) xm+n = xm · xn .
(b) (xm )n = xm n .
(c) If the composition is commutative (i.e. xy = yx for each x, y ∈ G), then it holds
that xn y n = (xy)n .
Proof. (a): First, we fix n ∈ N0 and prove the statement for each m ∈ N0 by induction:
The base case (m = 0) is xn = xn , which is true. For the induction step, we compute
(C.10a) ind. hyp. (C.10a)
xm+1+n = x · xm+n = x · xm · xn = xm+1 xn ,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
C ALGEBRAIC STRUCTURES 237
(b): First, we prove the statement for each n ∈ N0 by induction (for m < 0, we assume
G to be a group): The base case (n = 0) is (xm )0 = 1 = x0 , which is true. For the
induction step, we compute
(C.10a) ind. hyp. (a)
(xm )n+1 = xm · (xm )n = xm · xm n = xm n+m = xm (n+1) ,
completing the induction step. Now, let G be a group and n < 0. We already know
(xm )−1 = x−m . Thus, using what we have already shown,
(C.10b) −n
(xm )n = (xm )−1 = (x−m )−n = x(−m)(−n) = xm n .
completing the induction step. If G is a group and n < 0, then, using what we have
already shown,
(C.10b) Th. C.2(e) −n (C.10b)
xn y n = (x−1 )−n (y −1 )−n = (x−1 y −1 )−n = (xy)−1 = (xy)n ,
C.2 Rings
Definition C.7. Let R be a nonempty set with two maps
+ : R × R −→ R, (x, y) 7→ x + y,
(C.11)
· : R × R −→ R, (x, y) 7→ x · y
(+ is called addition and · is called multiplication; often one writes xy instead of x · y).
Then (R, +, ·) (or just R, if + and · are understood) is called a ring if, and only if, the
following three conditions are satisfied:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
C ALGEBRAIC STRUCTURES 238
(iii) Distributivity:
∀ x · (y + z) = x · y + x · z, (C.12a)
x,y,z∈R
∀ (y + z) · x = y · x + z · x. (C.12b)
x,y,z∈R
A ring R is called commutative if, and only if, its multiplication is commutative. More-
over, a ring called a ring with unity if, and only if, R contains a neutral element with
respect to multiplication (i.e. there is 1 ∈ R such that 1 · x = x · 1 = x for each
x ∈ R) – some authors always require a ring to have a neutral element with respect to
multiplication.
Theorem C.8. The following statements and rules are valid in every ring (R, +, ·) (let
0 denote the additive neutral element and let x, y, z ∈ R):
(a) x · 0 = 0 = 0 · x.
i.e. x · 0 = 0 follows since (R, +) is a group. The second equality follows analogously
using (C.12b).
(b): xy +x(−y) = x(y −y) = x·0 = 0, where we used (C.12a) and (a). This shows x(−y)
is the additive inverse to xy. The second equality follows analogously using (C.12b).
(c): xy = −(−(xy)) = −(x(−y)) = (−x)(−y), where (b) was used twice.
(d): x(y − z) = x(y + (−z)) = xy + x(−z) = xy − xz.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
C ALGEBRAIC STRUCTURES 239
C.3 Fields
Definition C.9. Let (F, +, ·) be a ring with unity. Then (F, +, ·) (or just F , if + and
· are understood) is called a field if, and only if, F \ {0} is a commutative group with
respect to ·.
Theorem C.10. The following statements and rules are valid in every field (F, +, ·):
(a) Inverse elements are unique. For each x ∈ F , the unique inverse with respect to
addition is denoted by −x. Also define y − x := y + (−x). For each x ∈ F \ {0}, the
unique inverse with respect to multiplication is denoted by x−1 . For x 6= 0, define
the fractions xy := y/x := yx−1 with numerator y and denominator x.
(e) x · 0 = 0.
(f ) x(−y) = −(xy).
(i) xy = 0 ⇒ x = 0 ∨ y = 0.
a b ad + bc a b ab a/c ad
+ = , · = , = ,
c d cd c d cd b/d bc
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 240
(e) follows by applying Th. C.8(a) to the ring with unity (F, +, ·).
(f) follows by applying Th. C.8(b) to the ring with unity (F, +, ·).
(g) follows by applying Th. C.8(c) to the ring with unity (F, +, ·).
(h) follows by applying Th. C.8(d) to the ring with unity (F, +, ·).
(i): If xy = 0 and x 6= 0, then y = 1 · y = x−1 xy = x−1 · 0 = 0.
(j): One computes
a b ad + bc
+ = ac−1 + bd−1 = add−1 c−1 + bcc−1 d−1 = (ad + bc)(cd)−1 =
c d cd
and
a b ab
· = ac−1 bd−1 = ab(cd)−1 =
c d cd
and
a/c ad
= ac−1 (bd−1 )−1 = ac−1 b−1 d = ad(bc)−1 = ,
b/d bc
completing the proof.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 241
This fits into the framework of Th. 3.7, using A := N0 , x1 := S(m), and, for each
n ∈ N, fn : An −→ A, fn (x1 , . . . , xn ) := S(xn ) (due to the different initializations,
one obtains a different recursion for each m ∈ N0 ).
m · 0 := 0, m · 1 := m, ∀ m · (n + 1) := m · n + m. (D.2)
n∈N
This fits into the framework of Th. 3.7, using A := N0 , x1 := m, and, for each
m, n ∈ N, fm,n : An −→ A, fm,n (x1 , . . . , xn ) := xn + m.
Theorem D.2. The set N0 of the natural numbers (including 0) with the maps of
addition and multiplication
+ : N0 × N0 −→ N0 , (x, y) 7→ x + y,
· : N0 × N0 −→ N0 , (x, y) 7→ x · y,
as defined in Def. D.1(a) and Def. D.1(b), respectively, satisfies Def. C.1(i),(ii),(iv) for
both addition and multiplication, i.e. associativity, commutativity, and the existence of a
neutral element. This can be summarized as the statement that N0 forms a commutative
semigroup with respect to both addition and multiplication (however, no group, as the
existence of inverse elements is lacking). Moreover, distributivity, i.e. Def. C.7(iii) is
also satisfied.
∀ (k + m) + n = k + (m + n). (D.3a)
k,m,n∈N0
The proof of (D.3a) is carried out by induction on n. The base case (n = 0) follows
from the first definition in (D.1): (k + m) + 0 = k + m = k + (m + 0) for every k, m ∈ N0 .
For the induction step, one computes, for every k, m, n ∈ N0 ,
(D.1) (D.1) ind. hyp.
(k + m) + (n + 1) = (k + m) + S(n) = S((k + m) + n) = S(k + (m + n))
(D.1) (D.1)
= k + S(m + n) = k + (m + S(n))
(D.1)
= k + (m + (n + 1)), (D.3b)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 242
∀ m + n = n + m. (D.4a)
m,n∈N0
The proof of (D.4a) is also carried out by induction on n. More precisely, we prove
n = 0 separately, and then carry out the induction for n ∈ N. The case n = 0 is proved
by induction on m: The base case (m = 0) is the true statement 0 + 0 = 0 = 0 + 0. For
the induction step, one computes (m + 1) + 0 = m + 1 = S(m) = S(m + 0) = S(0 + m) =
0+S(m) = 0+(m+1). The base case for the induction on n, i.e. n = 1 is also proved by
induction on m: The base case (m = 0) is the true statement 0 + 1 = S(0) = 1 = 1 + 0.
For the induction step, one computes, for every m ∈ N0 ,
(D.1) ind. hyp. (D.1)
(m + 1) + 1 = S(m + 1) = S(1 + m) = (1 + m) + 1
(D.3a)
= 1 + (m + 1). (D.4b)
Now, for the induction step of the induction on n, one computes, for every (m, n) ∈
N0 × N,
(D.1) (D.1) ind. hyp. (D.1)
m + (n + 1) = m + S(n) = S(m + n) = S(n + m) = n + S(m)
(D.1) base case (D.3a)
= n + (m + 1) = n + (1 + m) = (n + 1) + m, (D.4c)
∀ m · n = n · m. (D.5a)
m,n∈N0
∀ m = m · 1 = 1 · m. (D.5b)
m∈N0
Next, we show
∀ n · m + m = (n + 1) · m (D.5c)
m,n∈N0
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 243
via induction on m. For the base case (m = 0), we note (n + 1) · 0 = 0 by (D.2), and
n · 0 + 0 = 0 by (D.1) and (D.2). For the induction step, we compute
(D.2) (D.4a)
n · (m + 1) + m + 1 = n·m+n+m+1 = n·m+m+n+1
ind. hyp. (D.2)
= (n + 1) · m + n + 1 = (n + 1) · (m + 1).
We are now in a position to carry out the proof of (D.5a) by induction on n. More
precisely, we prove n = 0 separately, and then carry out the induction for n ∈ N. Let
n = 0. We have m · 0 = 0 for each m ∈ N0 directly from (D.2). We prove 0 · m = 0 for
each m ∈ N0 via induction on m: 0 · 0 = 0 and 0 · 1 = 0 are immediate from (D.2). For
the induction step, one computes, for every m ∈ N,
(D.2) ind. hyp., (D.1)
0 · (m + 1) = 0 · m + 0 = 0.
The base case for the induction on n ∈ N is provided by (D.5b). For the induction step,
one computes, for every (m, n) ∈ N0 × N,
(D.2) ind. hyp. (D.5c)
m · (n + 1) = m · n + m = n · m + m = (n + 1) · m,
∀ (k + m) · n = k · n + m · n. (D.6a)
k,m,n∈N0
The proof of (D.6a) is carried out by induction on n. The base case (n = 0) follows
from (D.2): (k + m) · 0 = 0 = k · 0 + m · 0. For the induction step, one computes, for
every k, m, n ∈ N0 ,
(D.2)
(k + m) · (n + 1) = (k + m) · n + k + m
ind. hyp.
= k·n+m·n+k+m
(D.4a)
= k·n+k+m·n+m
(D.2)
= k · (n + 1) + m · (n + 1), (D.6b)
∀ (k · m) · n = k · (m · n). (D.7a)
k,m,n∈N0
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 244
The proof of (D.7a) is carried out by induction on n. The base case (n = 0) follows
from (D.2): (k · m) · 0 = 0 = k · (m · 0) for every k, m ∈ N0 . For the induction step, one
computes, for every k, m, n ∈ N0 ,
(D.2) ind. hyp.
(k · m) · (n + 1) = (k · m) · n + k · m = k · (m · n) + k · m
(D.6a) (D.2)
= k · (m · n + m) = k · (m · (n + 1)), (D.7b)
n≤m :⇔ ∃ n + k = m. (D.9)
k∈N0
Theorem D.5. The relation defined in (D.9) constitutes a well-order on N0 (in partic-
ular, a total order) that is compatible with addition and multiplication, i.e. it satisfies
(4.3).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 245
To this end, note that k ≤ m means that there is l ∈ N0 such that k + l = m. But then
k + n + l = k + l + n = m + n, showing k + n ≤ m + n.
Compatibility with Multiplication: As 0 ≤ n holds for every n ∈ N0 , there is nothing to
prove.
Lemma D.6. Let a, n ∈ N. Then the following holds:
(a) a + n ∈ N.
(b) a · n ∈ N.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 246
defines a total order on G that is compatible with addition, i.e. it satisfies (4.3a). More-
over, if a multiplication is also defined on G and P ∪ {0} is closed under this multipli-
cation, then ≤ is also compatible with multiplication, i.e. it satisfies (4.3b). Of course,
one refers to the elements of P as positive and to the elements of −P as negative.
Proof. For each x ∈ G, one has x−x = 0 ∈ P ∪{0}, i.e. x ≤ x and the relation is reflexive.
If x, y ∈ G, x ≤ y and y ≤ x, then x − y ∈ P ∪ {0} and −(x − y) = y − x ∈ P ∪ {0},
and the disjointness of the union in (D.13) implies x − y = 0, i.e. x = y, showing the
relation is antisymmetric. If x, y, z ∈ G with x ≤ y and y ≤ z, then y − x ∈ P ∪ {0},
z −y ∈ P ∪{0}, and z −x = z −y +y −x ∈ P ∪{0} since P is closed under +, showing the
relation is transitive. So we have shown ≤ constitutes a partial order on G. It remains to
show the order is total. However, given the decomposition in (D.13), for each x, y ∈ G,
precisely one of the statements x − y ∈ P (i.e. y < x), x − y = 0 (i.e. x = y), x − y ∈ −P
(i.e. x < y) must be true, proving that the order is total. To see ≤ satisfies (4.3a), let
x, y, z ∈ G. If x ≤ y, then y − x ∈ P ∪ {0}, i.e. y + z − (x + z) = y + z − z − x ∈ P ∪ {0},
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 247
showing x+z ≤ y +z. The proof is completed by noting (4.3b) is precisely the statement
that P ∪ {0} is closed under multiplication.
D.3 Integers
As compared to our goal, the set of real numbers R, the set N0 still has three defi-
ciencies, namely the lack of inverse elements for addition, the lack of inverse elements
for multiplication, and that the order ≤ lacks completeness. The construction of the
integers will remedy (only) the first of the three deficiencies by providing the inverse
elements of addition.
Definition and Remark D.9. The relation ∼ on N0 × N0 defined by
(a, b) ∼ (c, d) :⇔ a + d = b + c, (D.15)
constitutes an equivalence relation on N0 × N0 (cf. Def. 2.23): Indeed, if a, b ∈ N0 , then
a + b = b + a shows (a, b) ∼ (a, b), proving ∼ to be reflexive. If a, b, c, d ∈ N0 , then
(a, b) ∼ (c, d) ⇒ a+d=b+c ⇒ c+b=d+a ⇒ (c, d) ∼ (a, b),
proving ∼ to be symmetric. If a, b, c, d, e, f ∈ N0 , then
(a, b) ∼ (c, d) ∧ (c, d) ∼ (e, f ) ⇒ a+d=b+c ∧ c+f =d+e
(D.12a)
⇒ a+d+c+f =b+c+d+e ⇒ a+f =b+e ⇒ (a, b) ∼ (e, f ),
proving ∼ to be transitive and an equivalence relation.
Definition D.10. (a) Define the set of integers Z as the set of equivalence classes of
the equivalence relation ∼ defined in (D.15), i.e.
Z := (N0 × N0 )/ ∼ = [(a, b)] : (a, b) ∈ N0 × N0 (D.16)
is the quotient set of N0 × N0 with respect to ∼ (cf. Ex. 2.24(c)). To simplify
notation, in the following, we will write
[a, b] := [(a, b)] (D.17)
for the equivalence class of (a, b) with respect to ∼.
(b) Addition on Z is defined by
+ : Z × Z −→ Z, [a, b], [c, d] →7 [a, b] + [c, d] := [a + c, b + d]. (D.18)
Subtraction on Z is defined by
− : Z × Z −→ Z, [a, b], [c, d] →7 [a, b] − [c, d] := [a, b] + [d, c]. (D.19)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 248
For the definitions in Def. D.10(b) to make sense, one needs to check that they do not
depend on the chosen representatives of the equivalence classes. Moreover, one needs to
convince oneself that these definitions yield the desired familiar operations of addition
and subtraction. Let us start by verifying the independence of the representatives is the
following Lem. D.11.
Lemma D.11. The definitions in Def. D.10(b) do not depend on the chosen represen-
tatives, i.e.
∀ ˜ ⇒ [a + c, b + d] = [ã + c̃, b̃ + d]
[a, b] = [ã, b̃] ∧ [c, d] = [c̃, d] ˜ (D.20)
˜ 0
a,b,c,d,ã,b̃,c̃,d∈N
and
∀ ˜
[a, b] = [ã, b̃]∧[c, d] = [c̃, d] ⇒ ˜ . (D.21)
[a, b]−[c, d] = [ã, b̃]−[c̃, d]
˜ 0
a,b,c,d,ã,b̃,c̃,d∈N
˜ means c + d˜ = d + c̃,
Proof. (D.20): [a, b] = [ã, b̃] means a + b̃ = b + ã, [c, d] = [c̃, d]
implying a + c + b̃ + d˜ = b + ã + d + c̃, i.e. [a + c, b + d] = [ã + c̃, b̃ + d].
˜
(D.21) is just (D.19) combined with (D.20).
Theorem D.12. The set of integers Z forms a commutative group with respect to ad-
dition as defined in Def. D.10(b), where [0, 0] is the neutral element, [b, a] is the inverse
element of [a, b] for each a, b ∈ N0 , and, denoting the inverse element of [a, b] by −[a, b]
in the usual way, [a, b] − [c, d] = [a, b] + (−[c, d]) for each a, b, c, d ∈ N0 .
proving associativity. For every a, b ∈ N0 , one obtains [a, b]+[0, 0] = [a+0, b+0] = [a, b],
proving neutrality of [0, 0], whereas [a, b] + [b, a] = [a + b, b + a] = [a + b, a + b] = [0, 0]
(since (a + b, a + b) ∼ (0, 0)) shows [b, a] = −[a, b]. Now [a, b] − [c, d] = [a, b] + (−[c, d])
is immediate from (D.19).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 249
It is customary to identify N0 with ι(N0 ), as it usually does not cause any confusion.
One then just writes n instead of [n, 0] and −n instead of [0, n] = −[n, 0].
Lemma D.14. We have the disjoint decomposition
˙
Z = N ∪{0} ∪˙ Z− , Z− := −N = {n ∈ Z : −n ∈ N}. (D.24)
Proof. Note that, due to (D.15), an equivalence class remains the same if a natural
number is added or subtracted in both components: [a, b] = [a + m, b + m]. Thus, for
each x = [a, b] ∈ Z, if a > b, then x = [a − b, 0] ∈ N; if a = b, then x = [0, 0] = 0; if
a < b, then x = [0, b − a] = −[b − a, 0] ∈ Z− . It just remains to verify that the union in
(D.24) is disjoint. However, if [n, 0] = [0, m] with m, n ∈ N0 , then n + m = 0, proving
n = m = 0, completing the proof.
Remark D.15. In the above construction, we obtained the commutative group (Z, +)
from the commutative semigroup (N0 , +). It is worth pointing out that the same con-
struction always works when, instead of with N0 , one starts with any commutative
semigroup (H, +) that satisfies the cancellation law a + c = b + c ⇒ a = b, to obtain a
commutative group (G, +) and a monomorphism ι : H −→ G.
—
Lemma D.17. The definition in Def. D.16 does not depend on the chosen representa-
tives, i.e.
∀ ˜ ˜ ˜
[a, b] = [ã, b̃] ∧ [c, d] = [c̃, d] ⇒ [ac + bd, ad + bc] = [ãc̃ + b̃d, ãd + b̃c̃] .
˜ 0
a,b,c,d,ã,b̃,c̃,d∈N
(D.26)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 250
Proof. As mentioned before, due to (D.15), an equivalence class remains the same if a
natural number is added or subtracted in both components. Thus, one computes
(D.15)
[ac + bd, ad + bc] = [ac + bd + b̃c, ad + bc + b̃c] = [(a + b̃)c + bd, ad + bc + b̃c]
(D.15)
= [(ã + b)c + bd, ad + bc + b̃c] = [ãd˜ + ãc + bd, ãd˜ + ad + b̃c]
= [ã(d˜ + c) + bd, ãd˜ + ad + b̃c] = [ã(d + c̃) + bd, ãd˜ + ad + b̃c]
= [ãc̃ + (ã + b)d, ãd˜ + ad + b̃c] = [ãc̃ + (a + b̃)d, ãd˜ + ad + b̃c]
(D.15)
= [ãc̃ + b̃d + b̃c̃, ãd˜ + b̃c + b̃c̃] = [ãc̃ + b̃(d + c̃), ãd˜ + b̃c + b̃c̃]
= [ãc̃ + b̃(d˜ + c), ãd˜ + b̃c + b̃c̃] = [ãc̃ + b̃d,
˜ ãd˜ + b̃c̃], (D.27)
completing the proof.
Theorem D.18. The set of integers Z is associative and commutative with respect to
the multiplication defined in Def. D.16. Moreover, distributivity, i.e. Def. C.7(iii) is
satisfied, [1, 0] is the neutral element of multiplication, and there are no zero divisors,
i.e.
∀ [a, b] · [c, d] = [ac + bd, ad + bc] = [0, 0] ⇒ [a, b] = [0, 0] ∨ [c, d] = [0, 0] .
a,b,c,d∈N0
(D.28)
Algebraically, the theorem can be summarized by saying that (Z, +, ·) constitutes a prin-
cipal ideal domain.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 251
l≤k :⇔ k − l ∈ N0 . (D.30)
Theorem D.20. (a) The relation defined in (D.30) constitutes a total order on Z that
is compatible with addition and multiplication, i.e. it satisfies (4.3).
Proof. (a) follows from (D.30), (D.24), and Th. D.8 since N0 is closed under addition
and multiplication.
(b): According to Def. D.5, if m, n ∈ N with n < m, then m = n + k for some k ∈ N.
In consequence ι(m) = ι(n) + ι(k) by (D.23), i.e. ι(m) − ι(n) = ι(k) ∈ N, proving
ι(n) < ι(m).
constitutes an equivalence relation on Z × (Z \ {0}) (cf. Def. 2.23): Indeed, noting that
(D.31) is precisely the same as (D.15) if + is replaced by ·, the proof from Def. and Rem.
D.9 also shows that (D.31) does, indeed, define an equivalence relation on Z × (Z \ {0}):
One merely replaces each + with · and each N0 with Z or Z \ {0}), respectively. The
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 252
only modification needed occurs for 0 ∈ {a, c, e} in the proof of transitivity (in this case,
the proof of Def. and Rem. D.9 yields adcf = 0 = bcde, which does not imply af = be),
where one now argues, for a = 0,
for c = 0,
and, for e = 0,
Definition D.22. (a) Define the set of rational numbers Q as the set of equivalence
classes of the equivalence relation ∼ defined in (D.31), i.e.
Q := Z × (Z \ {0}) / ∼ = [(a, b)] : (a, b) ∈ Z × (Z \ {0}) (D.32)
is the quotient set of Z×(Z\{0}) with respect to ∼ (cf. Ex. 2.24(c)). As is common,
we will write
a
:= a/b := [(a, b)] (D.33)
b
for the equivalence class of (a, b) with respect to ∼.
(b) Addition on Q is defined by
a c a c ad + bc
+ : Q × Q −→ Q, , 7→ + := . (D.34)
b d b d bd
Multiplication on Q is defined by
a c a c ac
· : Q × Q −→ Q, , 7→ · := . (D.35)
b d b d bd
—
For the definitions in Def. D.22(b) to make sense, one needs to check that they do not
depend on the chosen representatives of the equivalence classes, and that the results of
both addition and multiplication are always elements of Q. All this is provided by the
following lemma.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 253
Lemma D.23. The definitions in Def. D.22(b) do not depend on the chosen represen-
tatives, i.e.
!
a ã c c̃ ad + bc ãd˜ + b̃c̃
∀ ∀ = ∧ = ⇒ = (D.36)
a,c,ã,c̃∈Z ˜
b,d,b̃,d∈Z\{0} b b̃ d d˜ bd b̃d˜
and
a ã c c̃ ac ãc̃
∀ ∀ = ∧ = ⇒ = . (D.37)
a,c,ã,c̃∈Z ˜
b,d,b̃,d∈Z\{0} b b̃ d d˜ bd b̃d˜
Furthermore, the results of both addition and multiplication are always elements of Q.
Proof. (D.36) and (D.37): a/b = ã/b̃ means ab̃ = ãb, c/d = c̃/d˜ means cd˜ = c̃d, implying
ad + bc ãd˜ + b̃c̃
(ad + bc)b̃d˜ = bd(ãd˜ + b̃c̃), i.e. = (D.38)
bd b̃d˜
and
ac ãc̃
acb̃d˜ = bdãc̃,
= . i.e. (D.39)
bd b̃d˜
That the results of both addition and multiplication are always elements of Q follows
from (D.28), i.e. from the fact that Z has no zero divisors. In particular, if b, d 6= 0,
then bd 6= 0, showing (ad + bc)/(bd) ∈ Q and (ac)/(bd) ∈ Q.
Theorem D.24. (a) The set of rational numbers Q with addition and multiplication as
defined in Def. D.22 forms a field, where 0/1 and 1/1 are the neutral elements with
respect to addition and multiplication, respectively, (−a/b) is the additive inverse
to a/b, whereas b/a is the multiplicative inverse to a/b with a 6= 0.
(b) Defining subtraction and division in the usual way, for each r, s ∈ Q, by s − r :=
s + (−r) and s/r := sr−1 , respectively, with −r denoting the additive inverse of r
and r−1 denoting the multiplicative inverse of r 6= 0, all the rules stated in Th. C.10
are valid in Q.
(c) The map
k
ι : Z −→ Q, ι(k) := , (D.40)
1
is a monomorphism, i.e. it is injective and satisfies
∀ ι(k + l) = ι(k) + ι(l), (D.41a)
k,l∈Z
It is customary to identify Z with ι(Z), as it usually does not cause any confusion.
One then just writes k instead of k1 .
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 254
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 255
Q = Q+ ∪{0}
˙ ∪˙ Q− , Q− := −Q+ = {r ∈ Q : −r ∈ Q+ }, (D.44)
since
a/b ∈ Q+ ⇔ (a > 0 ∧ b > 0) ∨ (a < 0 ∧ b < 0) , (D.45a)
a/b = 0 ⇔ a = 0, (D.45b)
a/b ∈ Q− ⇔ (a > 0 ∧ b < 0) ∨ (a < 0 ∧ b > 0) . (D.45c)
s≤r :⇔ r − s ∈ Q+ +
0 := Q ∪ {0}. (D.46)
Theorem D.27. (a) The relation defined in (D.46) constitutes a total order on Q that
is compatible with addition and multiplication, i.e. it satisfies (4.3); in other words
(Q, +, ·, ≤) constitutes a totally ordered field.
Proof. (a) follows from (D.46), (D.44), and Th. D.8, since it is immediate from (D.34)
and (D.35) that Q+ is closed under addition and multiplication.
(b) is a consequence of (a), since Th. 4.5 and its proof are valid in every totally ordered
field.
(c): According to Def. D.27, if k, l ∈ Z with l < k, then n := k − l ∈ N. In consequence
ι(k) = ι(l) + ι(n) by (D.41a), i.e. ι(k) − ι(l) = ι(n) = n/1 ∈ Q+ , proving ι(l) < ι(k).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 256
the set of real numbers R such that it becomes a complete totally ordered field. There
are several different important constructions to obtain R from Q. We will describe
the construction that defines real numbers as equivalence classes of rational Cauchy
sequences following [EHH+ 95, Ch. 2.3]. The construction using so-called Dedekind cuts
can be found in [EHH+ 95, Ch. 2.2], the construction via nested intervals in [EHH+ 95,
Ch. 2.4].
Definition D.28. (a) Let S denote the set of all Cauchy sequences in Q, where we
call a sequence (rn )n∈N in Q a Cauchy sequence if, and only if,
∀ ∃ ∀ |rn − rm | < ǫ, (D.47)
ǫ∈Q+ N ∈N n,m>N
which defers from (7.25) in that ǫ has to be from Q+ rather than from R+ .
(b) Addition on S is defined by
+ : S × S −→ S, ((rn )n∈N , (sn )n∈N ) 7→ (rn )n∈N + (sn )n∈N := (rn + sn )n∈N . (D.48)
Multiplication on S is defined by
· : S × S −→ S, ((rn )n∈N , (sn )n∈N ) 7→ (rn )n∈N · (sn )n∈N := (rn sn )n∈N . (D.49)
As a consequence of the following Lem. D.29, addition and multiplication are well-
defined on S.
Lemma D.29. If (rn )n∈N and (sn )n∈N are Cauchy sequences in Q, so are (rn + sn )n∈N
and (rn sn )n∈N .
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 257
Proof. Note that, since the rational sequence (rn )n∈N is nothing but the function f :
N −→ Q, f (n) = rn , addition and multiplication as defined in Def. D.28(b) is analogous
to the definition of addition and multiplication of real-valued functions in (6.1a), (6.1c),
respectively. It is an easy exercise to verify that these function operations always inherit
associativity, commutativity, and distributivity if these rules hold for the operations
defined on the function’s codomain (i.e. for + and · on Q in our present situation of
rational sequences). The constant sequence (0, 0, . . . ) is the neutral element of addition
on S and −(rn )n∈N = (−rn )n∈N is the additive inverse of (rn )n∈N .
The reason that we need another step in our construction of R is the fact that S is not
a field: As soon as 0 occurs, even just once, in the sequence (rn )n∈N ∈ S, the sequence
does not have a multiplicative inverse (where the neutral element of multiplication is
obviously the constant sequence (1, 1, . . . )). The solution to this problem consists of
factoring out all sequences converging to 0.
Definition and Remark D.31. Let
n o
N := (rn )n∈N ∈ S : lim rn = 0 . (D.52)
n→∞
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 258
Multiplication on R is defined by
· : R × R −→ R, [f ], [g] →7 [f ] · [g] := [f g]. (D.56)
Once again, for the definitions in Def. D.32(b) to make sense, one needs to check that
they do not depend on the chosen representatives of the equivalence classes, and once
again, we provide a lemma providing this check:
Lemma D.33. The definitions in Def. D.32(b) do not depend on the chosen represen-
tatives, i.e.
∀ f − f˜ ∈ N ∧ g − g̃ ∈ N ⇒ f + g − (f˜ + g̃) ∈ N (D.57)
f,g,f˜,g̃∈S
and
∀ f − f˜ ∈ N ∧ g − g̃ ∈ N ⇒ f g − (f˜g̃) ∈ N . (D.58)
f,g,f˜,g̃∈S
Proof. Let f = (rn )n∈N , g = (sn )n∈N , f˜ = (r̃n )n∈N , g̃ = (s̃n )n∈N be elements of S such
that f − f˜ ∈ N and g − g̃ ∈ N , i.e. limn→∞ (rn − r̃n ) = limn→∞ (sn − s̃n ) = 0.
Then (7.11b) implies 0 = limn→∞ rn + sn − (r̃n + s̃n ) , proving (D.57).
To prove (D.58), one computes
lim rn sn − r̃n s̃n = lim rn (sn − s̃n ) − s̃n (rn − r̃n ) = 0, (D.59)
n→∞ n→∞
where the last equality follows from the boundedness of (rn )n∈N and (s̃n )n∈N together
with Prop. 7.11(b).
Proposition D.34. If (rn )n∈N ∈ S, then precisely one of the following statements is
correct:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 259
Proof. Let us first verify that the three statements in (D.60) are mutually exclusive. If
(D.60a) holds, then, for every ǫ ∈ Q+ , −ǫ < rn < ǫ holds for almost all (in particular,
for infinitely many) n ∈ N, i.e. (D.60b) and (D.60c) are both false. If (D.60b) holds,
then (D.60a) must be false as we have just seen. Moreover, if rn ≤ ǫ holds for at most
finitely many n ∈ N, then rn > ǫ > 0 must hold for infinitely many n ∈ N, i.e. (D.60c)
is false.
Now suppose (D.60a) and (D.60b) are false. We have to show that (D.60c) is true. Since
(D.60a) is false, there exists δ > 0 and an increasing sequence of indices (nk )k∈N with
|rnk | > δ for each k ∈ N. Since (D.60b) is false, there is an increasing sequence of indices
(mk )k∈N with rmk < 1/k. Thus, since (rn )n∈N is a Cauchy sequence, only finitely many
rnk > δ and infinitely many rnk < −δ. Now, if N ∈ N is such that |rn − rm | < δ/2 for
all n, m > N and k0 ∈ N such that nk0 > N , then rn < −δ/2 for each n > N (since
|rn − rnk0 | < δ/2). Thus, (D.60c) holds with ǫ := δ/2.
Theorem D.35. (a) The set of real numbers R with addition and multiplication as
defined in Def. D.32 forms a field, where [(0, 0, . . . )] and [(1, 1, . . . )] are the neutral
elements with respect to addition and multiplication, respectively.
It is customary to identify Q withι(Q), as it usually does not cause any confusion.
One then just writes r instead of (r, r, . . . ) .
Proof. (a): Clearly, Def. D.32(b) ensures the laws of associativity and commutativity
of addition and multiplication valid in S are preserved in R, and, likewise, the law of
distributivity. It is also immediate from (D.55) and (D.56), respectively, that [(0, 0, . . . )]
and [(1, 1, . . . )] are the respective neutral elements of addition and multiplication. More-
over, if −f is the additive inverse of f ∈ S, then [−f ] is the additive inverse of [f ] ∈ R.
It remains to show that each x = [(rn )n∈N ] 6= [(0, 0, . . . )] has a multiplicative inverse x−1
in R. We claim x−1 = [(sn )n∈N ], where
(
rn−1 for rn 6= 0,
∀ sn := (D.63)
n∈N 1 for rn = 0.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 260
We need to verify [(sn )n∈N ] ∈ R, i.e. (sn )n∈N is a Cauchy sequence. We know (rn )n∈N is
a Cauchy sequence that does not converge to 0. Thus, according to Prop. D.34, there
exists δ > 0 and M ∈ N such that, for each n > M , we have |rn | > δ (in particular,
rn 6= 0). Let ǫ > 0. As (rn )n∈N is a Cauchy sequence, there exists N ∈ N such that
N ≥ M and, for each n, m > N , |rn − rm | < ǫ δ 2 . Thus,
1 1 rn − rm ǫ δ 2
∀ |sn − sm | = − = < = ǫ, (D.64)
n,m>N rn rm rn rm δ2
Proposition D.37. (a) The definition in (D.67) does not depend on the chosen rep-
resentatives (rn )n∈N .
R = R+ ∪{0}
˙ ∪˙ R− , R− := −R+ = {x ∈ R : −x ∈ R+ }. (D.68)
Proof. (a): If (sn )n∈N ∈ S with limn→∞ (rn − sn ) = 0, then |rn − sn | < ǫ/2 for almost all
n ∈ N. Thus, since |sn | ≥ |rn | − |rn − sn |, we obtain sn > ǫ/2 for almost all n ∈ N, i.e.
#{n ∈ N : sn ≤ 2ǫ } ∈ N0 .
(b) is an immediate consequence of Prop. D.34.
Definition D.38. For each x, y ∈ R, let
y≤x :⇔ x − y ∈ R+ +
0 := R ∪ {0}. (D.69)
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 261
Theorem D.39. (a) The relation defined in (D.69) constitutes a total order on R that
is compatible with addition and multiplication, i.e. it satisfies (4.3); in other words
(R, +, ·, ≤) constitutes a totally ordered field.
Proof. (a) follows from (D.69), (D.68), and Th. D.8, once we have shown that R+ is
closed under addition and multiplication. Let (rn )n∈N ∈ S, (sn )n∈N ∈ S. If rn > ǫ1 ∈ Q+
for almost all n ∈ N and sn > ǫ2 ∈ Q+ for almost all n ∈ N, then rn + sn > ǫ1 + ǫ2 ,
showing R+ is closed under addition. Moreover, rn sn > ǫ1 ǫ2 , showing R+ is closed under
multiplication.
(b): According to Def. D.39, if r, s ∈ Q with s < r, then q := r − s ∈ Q+ . In
consequence ι(r) = ι(s) + ι(q) by (D.62a), i.e. ι(r) − ι(s) = ι(q) = [(q, q, . . . )] ∈ R+ ,
proving ι(s) < ι(r).
Finally, we will show in Th. D.41 below that the order ≤ on R is complete. However,
we first need some additional auxiliary results.
Proposition D.40. (a) For each x ∈ R, there is (rn )n∈N ∈ S satisfying limn→∞ rn = x.
(b) Every (rn )n∈N ∈ S converges in R – more precisely, limn→∞ rn = [(rn )n∈N ].
Proof. (a) and (b): If x = [(rn )n∈N ] with (rn )n∈N ∈ S, then, given ǫ > 0, choose N ∈ N
such that, for each m, n > N , one has |rn − rm | < ǫ/2. Then, for each k > N , one has
|x − rk | = |[(rn − rk )n∈N ]| < ǫ, since |rn − rk | < ǫ/2 for all n ≥ k, showing limn→∞ rn = x.
(c): Let (xn )n∈N be a Cauchy sequence in R. According to (a), for each n ∈ N, there
exists rn ∈ Q such that |xn − rn | < n1 . Then (rn )n∈N is a Cauchy sequence: Given ǫ > 0,
choose k ∈ N such that k1 < 3ǫ and |xn − xm | < 3ǫ for each n, m > k. Then
ǫ ǫ ǫ
∀ |rn − rm | ≤ |rn − xn | + |xn − xm | + |xm − rm | < + + = ǫ, (D.70)
n,m>k 3 3 3
showing (rn )n∈N is Cauchy. Thus, from (b), we obtain x ∈ R with limn→∞ rn = x. We
can now show, limn→∞ xn = x as well: Given ǫ > 0, choose N ∈ N such that N1 < 2ǫ and
|x − rn | < 2ǫ for each n > N . Then
ǫ ǫ
∀ |x − xn | ≤ |x − rn | + |rn − xn | < + = ǫ, (D.71)
n>N 2 2
showing limn→∞ xn = x and completing the proof.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 262
Proof. Let ∅ 6= A ⊆ R and let M ∈ R be an upper bound for A. We have to show that
A has a supremum in R. To this end, we recursively construct two Cauchy sequences
(xn )n∈N and (yn )n∈N in R such that (xn )n∈N is increasing, (yn )n∈N is decreasing, xn < yn ,
and limn→∞ (yn − xn ) = 0. Let x1 ∈ A be arbitrary and y1 := M . Define
(
(xn + yn )/2 if (xn + yn )/2 is not an upper bound for A,
xn+1 :=
xn otherwise,
∀
( (D.72)
n∈N
(xn + yn )/2 if (xn + yn )/2 is an upper bound for A,
yn+1 :=
yn otherwise.
Then, clearly, the xn are increasing, the yn are decreasing, and xn ≤ yn holds for each
n ∈ N. Moreover, letting d := M − x1 ≥ 0, a simple induction shows yn − xn = d/2n−1
and limn→∞ (yn − xn ) = 0. Also, for m > n,
m−1 m−1 m−1 m−1−n
X X d X −i+n d X −i 2d
xm − xn = (xi+1 − xi ) ≤ d 2−i = 2 = 2 ≤ n, (D.73)
i=n i=n
2n i=n 2n i=0 2
showing (xn )n∈N is a Cauchy sequence. Analogous, one sees that (yn )n∈N is a Cauchy
sequence. By Prop. D.40(c), we obtain s ∈ R such that s = limn→∞ xn = limn→∞ (yn −
xn + xn ) = limn→∞ yn . We claim s = sup A. If s < y, then there is n ∈ N with
s ≤ yn < y, showing y ∈/ A, i.e. s is an upper bound for A. If y < s, then there is n ∈ N
with y < xn ≤ s, showing y is not an upper bound for A. Thus, s is the smallest upper
bound for A, i.e. s = sup A.
D.6 Uniqueness
We will show in Th. D.45 below that, up to a unique isomorphism, R is the only complete
totally ordered field.
Notation D.42. Let (A, +, ·, ≤) be a complete totally ordered field. The neutral el-
ements with respect to + and ·, we denote with 0A and 1A , respectively. We recur-
sively define (n + 1)A := nA + 1A for each n ∈ N. Then NA := {nA : n ∈ N},
ZA := NA ∪ {0A } ∪ {−nA : n ∈ N}, QA := {0A } ∪ { kl : k, l ∈ ZA \ {0A }}.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 263
Proposition D.43. Let (A, +, ·, ≤) and (B, +, ·, ≤) be complete totally ordered fields.
Moreover, let φ : A −→ B be a field isomorphism, i.e. a bijective map, satisfying
Proof. (a): As (D.74a) and (D.74b) state φ to be a group homomorphism with respect
to addition and multiplication, respectively, Prop. C.4(a) yields φ(0A ) = 0B and φ(1A ) =
1B . If n ∈ N, then (D.74a) implies φ(nA + 1A ) = φ(nA ) + 1B and, thus, an induction
shows φ(nA ) = nB for each n ∈ N.
(b): If x, y ∈ A with x < y, then, by Rem. and Def. 7.61, there exists a unique z ∈ A
such that z 2 = y − x. Thus, (φ(z))2 = φ(y) − φ(x). By Th. 4.5(c), we have (φ(z))2 > 0
and, thus, φ(x) < φ(y), proving the strict isotonicity of φ.
Proof. From Prop. D.43(a), we already know φ(n) = n for each n ∈ NA . Next, if
n ∈ NA , then, using Prop. C.4(b), we obtain φ(−n) = −φ(n) = −n, showing φ(k) = k
for each k ∈ ZA . If k, l ∈ ZA \ {0A }, then φ(k/l) = φ(k · l−1 ) = φ(k) · (φ(l))−1 = kl−1 ,
where Prop. C.4(b) was used again. Thus, we already have φ(q) = q for each q ∈ QA .
From Prop. D.43(b), we know φ to be strictly isotone. Finally, if x ∈ A, then, by Th.
7.68(c), there exist a sequences (rn )n∈N in QA and (sn )n∈N in QA such that (rn )n∈N is
strictly increasing, (sn )n∈N is strictly decreasing, and
lim rn = lim sn = x.
n→∞ n→∞
As φ is isotone, we obtain
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 264
Theorem D.45. Let (A, +, ·, ≤) and (B, +, ·, ≤) be complete totally ordered fields. Then
there exists a unique isomorphism φ : A −→ B, i.e. a unique bijective φ : A −→ B,
satisfying (D.74a), (D.74b), and (D.75).
∀ φ(nA ) := nB . (D.76)
n∈N0
Indeed,
Next, we verify
∀ φ(mA + nA ) = φ(mA ) + φ(nA ) (D.78)
m,n∈N0
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 265
If m > n, then
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 266
We claim that ψ = φ−1 on QB : Indeed, for each k, l ∈ ZA \ {0A } and for each m, n ∈
ZB \ {0B }
ψ(φ(k/l)) = ψ φ(k)/φ(l) = φ−1 (φ(k))/φ−1 (φ(l)) = k/l,
φ(ψ(m/n)) = φ φ−1 (m)/φ−1 (n) = φ(φ−1 (m))/φ(φ−1 (n)) = m/n.
Moreover, we have
m k ml + kn φ(ml + kn) (D.74) for ZA φ(m)φ(l) + φ(k)φ(n)
φ + =φ = =
n l nl φ(nl) φ(n)φ(l)
φ(m) φ(k) m k
= + =φ +φ
φ(n) φ(l) n l
and
m k mk φ(mk) (D.74) for ZA φ(m)φ(k)
φ · =φ = =
n l nl φ(nl) φ(n)φ(l)
φ(m) φ(k) m k
= · =φ ·φ .
φ(n) φ(l) n l
Let r, s ∈ QA such that r < s. Then d := s − r > 0A , i.e. there are m, n ∈ NA satisfying
d= m n
. Then φ(s) − φ(r) = φ(d) = φ(m)
φ(n)
> 0B , proving φ(r) < φ(s).
In the fourth (and last) step, for each x ∈ A, we choose a sequence (rn )n∈N in QA such
that x = limn→∞ rn and set
φ(x) := lim φ(rn ). (D.83)
n→∞
To show that φ is well-defined by (D.83), we have to verify that (φ(rn ))n∈N does, indeed,
converge in B, and that φ(x) does not depend on the chosen sequence (rn )n∈N . As
(rn )n∈N converges to x, it has to be a Cauchy sequence by Th. 7.29. We show that
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
D CONSTRUCTION OF THE REAL NUMBERS 267
(φ(rn ))n∈N must be a Cauchy sequence as well: Let ǫ ∈ B, ǫ > 0B and choose ǫ̃ ∈ QB
such that 0B < ǫ̃ < ǫ. Then
∃ ∀ |rn − rm | < φ−1 (ǫ̃).
N ∈N n,m>N
proving (φ(rn ))n∈N to be a Cauchy sequence. Now Th. 7.29 implies the convergence of
(φ(rn ))n∈N . Next, we show limn→∞ rn = 0A implies limn→∞ φ(rn ) = 0B for each sequence
in QA : Indeed, as above, let ǫ > 0B and choose ǫ̃ ∈ QB such that 0B < ǫ̃ < ǫ. Then
∃ ∀ |rn | < φ−1 (ǫ̃).
N ∈N n>N
proving limn→∞ φ(rn ) = 0B . Thus, if (rn )n∈N and (sn )n∈N are sequences in QA such that
limn→∞ rn = x = limn→∞ sn , then
lim φ(rn ) = lim φ(rn − sn + sn ) = 0B + lim φ(sn ) = lim φ(sn ),
n→∞ n→∞ n→∞ n→∞
showing φ(x) 6= φ(y) and the injectivity of φ. To see that φ is surjective, let b ∈ B and
let (rn )n∈N be an increasing sequence in QB such that b = limn→∞ rn . Then (φ−1 (rn ))n∈N
is an increasing sequence in QA that is bounded, i.e. it must converge to some a ∈ A.
Then
φ(a) = lim φ(φ−1 (rn )) = lim rn = b,
n→∞ n→∞
showing φ to be surjective. Finally, if x, y ∈ A, then let (rn )n∈N and (sn )n∈N be sequences
in QA such that x = limn→∞ rn and y = limn→∞ sn . Then
φ(x + y) = lim φ(rn + sn ) = lim φ(rn ) + lim φ(sn ) = φ(x) + φ(y)
n→∞ n→∞ n→∞
and
φ(xy) = lim φ(rn sn ) = lim φ(rn ) lim φ(sn ) = φ(x)φ(y).
n→∞ n→∞ n→∞
Thus, we have verified (D.74) for each x, y ∈ A, and, thereby completed the proof.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
E SERIES: ADDITIONAL MATERIAL 268
N = I + ∪˙ I − , where (E.2a)
I + := {j ∈ N : aj ≥ 0}, (E.2b)
I − := {j ∈ N : aj < 0}. (E.2c)
∀ bj := aφ(j) , (E.3a)
j∈N
n
X
∀ tn := bj . (E.3b)
n∈N
j=1
The definition of φ will be recursive, and we will also need to recursively define an
auxiliary sequence (σj )j∈N taking values in {−1, 1}, serving as an accounting tool to
keep track if we are in the process of moving right (i.e. adding a+j ) or moving left (i.e.
−
subtracting aj ). Moreover, we need a recursively defined auxiliary function κ : N −→ N
to update the left and right boundaries xk and yk , respectively, to handle the first and
third case of (E.1) if need be. The recursion is initialized by
φ(1) := 1, (E.4a)
(
1 if t1 ≤ y1 ,
σ1 := (E.4b)
−1 if t1 > y1 ,
(
1 if t1 ≤ y1 ,
κ(1) := (E.4c)
2 if t1 > y1 ,
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
E SERIES: ADDITIONAL MATERIAL 269
and completed by
(
min I + \ φ{1, . . . , j − 1} if σj−1 = 1,
∀ φ(j) := (E.5a)
j>1 min I − \ φ{1, . . . , j − 1} if σj−1 = −1,
1 if σj−1 = 1 and tj ≤ yκ(j−1) ,
−1 if σ
j−1 = 1 and tj > yκ(j−1) ,
∀ σj := (E.5b)
j>1
−1 if σj−1 = −1 and tj ≥ xκ(j−1) ,
1 if σj−1 = −1 and tj < xκ(j−1) ,
κ(j − 1) if σj−1 = 1 and tj ≤ yκ(j−1) ,
1 + κ(j − 1) if σ
j−1 = 1 and tj > yκ(j−1) ,
∀ κ(j) := (E.5c)
j>1
κ(j − 1) if σj−1 = −1 and tj ≥ xκ(j−1) ,
1 + κ(j − 1) if σj−1 = −1 and tj < xκ(j−1) .
We note that φ is well-defined, since, according to (7.86), both I + and I − must have
infinitely many elements. Moreover, φ is injective, since, for j1 < j2 , φ(j2 ) 6= φ(j1 )
is immediate from (E.5a). Finally, φ is also surjective: Otherwise, there is a smallest
n ∈ N \ {1} such that n ∈ / φ(N). Suppose n ∈ I + . Then, according to (E.5a), there
must be j0 ∈ N such that σj = −1 for every j > j0 , i.e., according to P∞(E.5b) and
−
(E.5c), tj ≥ xκ(j0 ) ∈ R for each j > j0 , which is in contradiction to the j=1 aj = ∞
P
part of (7.86). Analogously, n ∈ I − leads to a contradiction to the ∞ +
j=1 aj = P∞ part
∞
of (7.86), completing theP∞proof of surjectivity of φ. So we have shown that P∞ j=1 bj
is a rearrangement of j=1 aj as desired. We still need to verify that j=1 bj (i.e.
(tn )n∈N ) has precisely all elements of [x, y] as cluster points. To this end, first note
that, due to (7.86) and (E.1), limj→∞ xκ(j) = −∞ holds if, and only if, x = −∞; and
limj→∞ xκ(j) = ∞ holds if, and only if, x = ∞; and likewise for the yκ(j) and y. If
x = −∞, then limj→∞ xκ(j) = −∞ and the bijectivity of φ together with (E.5b) and
(E.5c) implies
∀ ∃ tj < xκ(j−1) ≤ −N, (E.6)
N ∈N j∈N
showing ∞ is a cluster point of (tn )n∈N . Now let ξ ∈ [x, y] ∩ R and ǫ > 0. Due to
−
limj→∞ a+
j = limj→∞ aj = 0, we have
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
E SERIES: ADDITIONAL MATERIAL 270
Due to the bijectivity of φ together with (E.5b) and (E.5c), for each j0 ∈ N, there exists
j > max{j0 , N } such that tj−1 ≤ ξ ≤ tj , showing ξ is a cluster point of (tn )n∈N . On the
other hand, if ξ ∈] − ∞, x[, then x 6= −∞. If x = ∞, then limj→∞ tj = ∞ and ξ is not
a cluster point of (tn )n∈N . If ξ < x < ∞, then let ǫ := (x − ξ)/2 and choose N as in
(E.8). Then, by (E.5b) and (E.5c), for each j > N , tj > x − ǫ = ξ + ǫ, showing ξ is
not a cluster point of (tn )n∈N . Analogously, one sees that ξ ∈]y, ∞[ can not be a cluster
point of (tn )n∈N .
Proof. One estimates, using the formula for the value of a geometric series:
∞ ∞ ∞
X
N −ν
X
N −ν
X 1
dN −ν b ≤ (b − 1) b = (b − 1)b N
b−ν = (b − 1)bN 1 = bN +1 . (E.10)
ν=0 ν=0 ν=0
1− b
Note that (E.10) also shows that equality is achieved if all dn are equal to b − 1. Con-
versely, if there is n ∈ {N, N − 1, N − 2, . . . } such that dn < b − 1, then there is ñ ∈ N
such that dN −ñ < b − 1 and one estimates
∞
X ñ−1
X ∞
X
N −ν N −ν N −ñ
dN −ν b < dN −ν b + (b − 1)b + dN −ν bN −ν ≤ bN +1 , (E.11)
ν=0 ν=0 ν=ñ+1
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
E SERIES: ADDITIONAL MATERIAL 271
Proof. By subtracting dN bN from both series, one can assume dN = 0 without loss of
generality. From Lem. E.1, we know
∞
X ∞
X
N −ν
x= dN −ν b = dN −1−ν bN −1−ν ≤ bN . (E.13a)
ν=0 ν=0
Combining (E.13a) and (E.13b) yields x = bN . Once again employing Lem. E.1, (E.13a)
also shows that dn = b − 1 for each n ≤ N − 1 as claimed. Since eN > 0 and en ≥ 0
for each n, equality in (E.13b) can only occur for eN = 1 and en = 0 for each n < N ,
thereby completing the proof of the lemma.
Claim 3. One can verify by induction on n that the numbers dN −n and xn enjoy the
following properties for each n ∈ N0 :
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
E SERIES: ADDITIONAL MATERIAL 272
Proof. The induction is carried out for all three statements of (E.17) simultaneously.
From (E.15), we know bN ≤ x < bN +1 , i.e. 1 ≤ bxN < b. Using (E.16a), this yields
dN ∈ {1, . . . , b − 1} and 0 < x0 = dN bN = bN dN ≤ bN bxN = x as well as x − x0 =
x − dN bN = bN ( bxN − dN ) < bN . For n ≥ 1, by induction, one obtains 0 ≤ x − xn−1 <
b1+N −n , i.e. 0 ≤ x−x n−1
bN −n
< b. Using (E.16b), this yields dN −n ∈ {0, . . . , b − 1} and
x =x + dN −n b N −n
≤ xn−1 + bN −n x−x n−1
= x. Moreover, by induction, 0 < xn−1 =
Pn n−1 n−1 N −ν bN −n
N −n
ν=0 dN −ν b , such that (E.16b) implies P xn = xn−1 + dNP −n b ≥ xn−1 > 0 and
n−1
xn = xn−1 + dN −n bN −n = dN −n bN −n + ν=0 dN −ν bN −ν = nν=0 dN −ν bN −ν . Finally,
x − xn = x − xn−1 − dN −n bN −n = bN −n ( x−x n−1
bN −n
− dN −n ) ≤ bN −n , completing the proof
of the claim. N
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
E SERIES: ADDITIONAL MATERIAL 273
We will now show that, conversely, (i) implies (ii), (iii), and (iv). To that end, let x > 0
and suppose that x has two different b-adic representations
∞
X ∞
X
N1 −ν
x= dN1 −ν b = eN2 −ν bN2 −ν (E.19)
ν=0 ν=0
Now Lem. E.2 shows that there are precisely two b-adic representations of x, one having
the property required in (iii) and the other having property required in (iv).
Thus, in each case (N2 > N1 , N1 > N2 , and N1 = N2 ), we find that (i) implies (ii), (iii),
and (iv), thereby concluding the proof of the theorem.
In most cases, it is understood that we work only with decimal representations such
that there is no confusion about the meaning of symbol strings like 101.01. However,
in general, 101.01 could also be meant with respect to any other base, and, the number
represented by the same string of symbols does obviously depend on the base used.
Thus, when working with different representations, one needs some notation to keep
track of the base.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
E SERIES: ADDITIONAL MATERIAL 274
One frequently needs to convert representations with respect to one base into represen-
tations with respect to another base. When working with digital computers, conversions
between bases 10 and 2 and vice versa are the most obvious ones that come up. Con-
verting representations is related to the following elementary remainder theorem and
the well-known long division algorithm.
Theorem E.6. For each pair of numbers (a, b) ∈ N2 , there exists a unique pair of
numbers (q, r) ∈ N20 satisfying the two conditions a = qb + r and 0 ≤ r < b.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
F CARDINALITY OF R AND SOME RELATED SETS 275
where [f (n)] denotes the equivalence class of f (n) with respect to ∼ from (D.31), is
surjective. Thus, Q is countable by Prop. 3.15.
In the following theorem and its two corollaries, we will see that the set R of real numbers
is not countable, but has the same cardinality as the power set of N. Moreover, the same
is true for every nontrivial interval of real numbers.
Theorem F.2. Let a, b ∈ R with a < b. Recalling the notations F N, {0, 1} = {0, 1}N
for the set of sequences in {0, 1}, we obtain the following equalities of cardinalities:
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
F CARDINALITY OF R AND SOME RELATED SETS 276
(i): To prove #{0, 1}N = #P(N), we have to show the existence of a bijective map
f : {0, 1}N −→ P(N). Given σ ∈ {0, 1}N , i.e. σ is a function σ : N −→ {0, 1}, define
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
F CARDINALITY OF R AND SOME RELATED SETS 277
and we note
g (0, 0, . . . ) = 0, (F.8a)
g (1, 1, . . . ) = 1, (F.8b)
g(en ) = g(fn ) = 2−n for each n ∈ N. (F.8c)
it follows from Th. 7.99 that (the following restrictions of f which, to simplify notation,
we also denote by f )
and
f : A −→ B (F.11b)
are bijective, i.e. the full f of (F.9) is itself bijective, completing the proof of (ii).
(iii): To prove #] − 1, 1[= #R, we have to show the existence of a bijective map f :
R −→] − 1, 1[. Since we know from Def. and Rem. 8.27 that arctan : R −→] − π/2, π/2[
is bijective, we can define
2 arctan x
f : R −→]0, 1[, f (x) := . (F.12)
π
However, even though this provides a valid proof, arctan is a somewhat complicated
function (as it is defined via sin and cos, which are defined via power series). Thus, it
might be desirable to see an alternative proof, using a more elementary f . We claim
that
x
f : R −→] − 1, 1[, f (x) := , (F.13)
|x| + 1
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
F CARDINALITY OF R AND SOME RELATED SETS 278
is also bijective. Since f is clearly continuous, according to the intermediate value Th.
7.57, it suffices to show
Proof. #R = #P(N) was proved in Th. F.2 and P(N) is uncountable by Th. A.70.
Corollary F.4. If a, b ∈ R with a < b, then #(Q∩]a, b[) = #N and #(]a, b[ \Q) = #R,
i.e. ]a, b[ contains countably many rational and uncountably many irrational numbers.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
G PARTIAL FRACTION DECOMPOSITION 279
Proof. Since Q∩]a, b[⊆ Q, the claim #(Q∩]a, b[) = #N follows from Th. F.1(c), Prop.
3.14, and Th. 7.68(a).
To prove #(]a, b[ \Q) = #R, a bijection between ]a, b[ \Q and R can be constructed
analogous to the construction of f in step (ii) of the proof of Th. F.2, making use of the
fact that #]a, b[= #R and #Q = #N.
Theorem F.5. The set of complex numbers C = R × R has the same cardinality as R:
#(R × R) = #R = #P(N).
Proof. Let
A := {0, 1}N . (F.16)
By an application of Th. F.2, it suffices to prove #A = #(A×A), which is accomplished
by showing the existence of a bijective map f : A −→ A × A. We define
f : A −→ A × A, f (xj )j∈N := (yj )j∈N , (zj )j∈N , (F.17a)
where
∀ yj := x2j−1 , (F.17b)
j∈N
∀ zj := x2j , (F.17c)
j∈N
and
g : A × A −→ A, g (yj )j∈N , (zj )j∈N := (xj )j∈N , (F.18a)
where (
y(j+1)/2 for j odd,
∀ xj := (F.18b)
j∈N zj/2 for j even.
Clearly, g = f −1 , proving that f is bijective as desired.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
G PARTIAL FRACTION DECOMPOSITION 280
(for example, for the computation of the antiderivative of R, cf. Ex. 10.22(b)). The
following Th. G.1 guarantees this is always possible:
ajl ∈ C (j = 1, . . . , k, l = 1, . . . , mj ), (G.4)
such that
k mj
P (z) X X ajl
∀ R(z) = = . (G.5)
z∈C\N (Q) Q(z) j=1 l=1
(z − λj )l
Proof. We first prove the existence of the decomposition (G.5) via induction on n =
deg(Q). If n = 1, then P must be constant and there is nothing to prove. For the
induction step, consider n ≥ 2. Let ζ be a zero of Q with multiplicity m ∈ {1, . . . , n}.
Then, according to Rem. 6.7, there exists a polynomial S : C −→ C such that Q(z) =
(z − ζ)m S(z) and S(ζ) 6= 0. Noting
We will now
m−1
apply (G.8) with ζ = λk and m = mk . Since deg(T ) < n − 1 = deg (z −
ζ) S(z) < deg(Q), the induction hypothesis applies to the function in (G.8), yielding
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
G PARTIAL FRACTION DECOMPOSITION 281
We fix j0 and prove aj0 l = bj0 l via induction on l = 1, . . . , mj0 : Let l ∈ {1, . . . , mj0 } and
assume aj0 α = bj0 α has already been shown for each α > l (the induction does, indeed,
start at l = mj0 , working itself down to l = 1). Then (G.10) implies
l k mj l k mj
X aj 0 β XX ajβ X bj 0 β XX bjβ
β
+ β
= β
+ . (G.11)
β=1
(z − λj0 ) j=1, β=1
(z − λj ) β=1
(z − λj0 ) j=1, β=1
(z − λj )β
j6=j0 j6=j0
One now multiplies (G.11) by (z − λj0 )l . Then taking the limit for z → λj0 on both
sides yields aj0 l = bj0 l as desired.
If, in (G.1), P, Q : R −→ R, then the partial fraction decomposition (G.5) of Th. G.1 is
not quite satisfactory, since, even though P and Q are both real, the ajl will typically
be nonreal elements of C. As the following Th. G.2 shows, if P, Q are real, then it is
always possible to obtain a partial fraction decomposition with only real coefficients,
however its form is somewhat more complicated.
We start by using the real factorization of Q : R −→ R, deg(Q) = n ∈ N, according to
(8.58), where, as in (G.2), we combine identical factors, obtaining
k1
Y k2
Y
mj
Q(x) = c (x − λj ) (x2 + αj x + βj )nj , (G.12)
j=1 j=1
where c ∈ R; λ1 , . . . , λk1 ∈ R, k1 ∈ {0, . . . , n}, are the distinct real zeros of Q (if
any), mj ∈ N being their respective multiplicities; and (α1 , β1 ), . . . , (αk2 , βk2 ) ∈ R2
k2 ∈ {0, . . . , n}, are distinct pairs of real numbers, each pair arising from combining
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
G PARTIAL FRACTION DECOMPOSITION 282
ajl ∈ R (j = 1, . . . , k1 , l = 1, . . . , mj ), (G.14a)
bjl ∈ R (j = 1, . . . , k2 , l = 1, . . . , nj ), (G.14b)
cjl ∈ R (j = 1, . . . , k2 , l = 1, . . . , nj ) (G.14c)
such that
1 X k mj 2 X k nj
P (x) X ajl X bjl x + cjl
∀ R(x) = = l
+ . (G.15)
x∈R\N (Q) Q(x) j=1 l=1
(x − λj ) j=1 l=1
(x + αj x + βj )l
2
Proof. We show that, if P, Q are real, where Q is as in (G.12), then (G.5) can be
rewritten in the form (G.15): First, consider λj0 ∈ R to be a real zero of Q. Then
all corresponding coefficients in (G.5) are real: We prove aj0 l ∈ R via induction on
l = 1, . . . , mj0 : Let l ∈ {1, . . . , mj0 } and assume aj0 α ∈ R has already been shown for
each α > l. Then (G.5) (with z replaced by x) implies
mj 0 l k mj
aj 0 β
X X aj 0 β XX ajβ
∀ S(x) := R(x) − β
= β
+ ∈ R.
x∈R\N (Q)
β=l+1
(x − λj0 ) β=1
(x − λj0 ) j=1, β=1
(x − λj )β
j6=j0
(G.16)
One now multiplies (G.16) by (x − λj0 )l . Then taking the limit for x → λj0 on both
sides yields aj0 l ∈ R as desired (the limit on the right-hand side is clearly aj0 l and all
values and, thus, the limit on the left-hand side are clearly in R).
Thus, the summands corresponding to real zeros of Q are identical in (G.5) and (G.15).
It remains to show that terms in (G.5), corresponding to conjugate nonreal zeros of
Q, can be combined to result in the summands involving the bjl and cjl in (G.15). To
this end, consider λj0 , λj1 ∈ C to be conjugate nonreal zeros of Q, λj1 = λj0 . Then all
corresponding coefficients in (G.5) are conjugate: We prove aj0 l = aj1 l via induction on
l = 1, . . . , mj0 = mj1 : Let l ∈ {1, . . . , mj0 } and assume aj0 α = aj1 α has already been
shown for each α > l. We once again have the formula (G.16) for S(x) (even for each
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
G PARTIAL FRACTION DECOMPOSITION 283
x ∈ C \ N (Q), but we can no longer expect S(x) ∈ R). As before, after multiplying
(G.16) by (x − λj0 )l , we obtain
lim S(x)(x − λj0 )l = aj0 l . (G.17)
x→λj0
Taking complex conjugates in (G.16) and using the induction hypothesis as well as
R(x) = R(x) (since the coefficients of P, Q are real) yields
mj 1 l k mj
aj 1 β
X (G.18) X aj 1 β XX ajβ
∀ S(x) = R(x) − = + .
x∈C\N (Q)
β=l+1
(x − λj1 )β β=1
(x − λj1 )β j=1, β=1 (x − λj )β
j6=j1
(G.19)
l
If we multiply (G.19) by (x − λj1 ) , we obtain
lim S(x)(x − λj1 )l = aj1 l . (G.20)
x→λj1
Thus,
(G.17) (G.20)
aj 0 l = lim S(x)(x − λj0 )l = lim S(x)(x − λj1 )l = aj 1 l , (G.21)
x→λj0 x→λj0
as needed.
We now combine two corresponding summands of (G.5) (for x ∈ R \ N (Q)):
to simplify notation. To finish the proof of (G.15), it remains to show there are real
coefficients s1l , . . . , sll and t1l , . . . , tll such that
l
X sβl x + tβl
∀ σl = , (G.24)
l∈{1,...,mj0 }
β=1
(x2 + bx + c)β
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
G PARTIAL FRACTION DECOMPOSITION 284
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
H IRRATIONALITY OF e AND π 285
to have the form required by (G.24), thereby finishing the induction and the proof of
the theorem.
Remark G.3. Given a rational function R = P/Q as in Th. G.1 (or Th. G.2), there
remains the question of how to actually compute the coefficients ajl of the partial fraction
decomposition (G.5) (or ajl , bjl , cjl of the partial fraction decomposition (G.15) in the real
case)? First, one always needs to obtain the zeros λj and their respective multiplicities
mj , which, for deg(Q) large, can be very difficult. Then there are basically three different
possibilities to proceed, where, in practice, the most efficient way in a concrete situation
might be to mix the three strategies:
(a) Linear System: To determine k unknown coefficients, one can plug k different values
for z into (G.5) (or for x into (G.15)) to obtain a linear system for the unknown
coefficients.
(b) One can multiply (G.5) (or (G.15)) by Q, obtaining a polynomial on both sides of
the equation. As the polynomials need to be equal, the coefficients of equal powers
need to be equal on both sides, yielding a system of equations for the unknown
coefficients.
(c) Multiplying (G.5) (or (G.15)) by (z − λj )mj and setting z = λj yields the coefficient
of (z−λ1j )mj , etc.
H Irrationality of e and π
H.1 Irrationality of e
The following Prop. H.1, which will then be used to prove the irrationality of e in
Th. H.2, shows, in particular, that the series (8.26) can be used to efficiently compute
accurate approximations of e.
Proposition H.1. Defining
n−1 j
z
X z
∀ ∀ Rn (z) := e − , (H.1)
n∈N z∈C
j=0
j!
we have n
Rn (z) ≤ 2 |z|
∀ |z| ≤ 1 ⇒ , (H.2)
n∈N n!
i.e. the error made when approximating ez by the partial sum (for |z| ≤ 1) is at most as
large as twice the modulus of the first missing summand.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
H IRRATIONALITY OF e AND π 286
2
in contradiction to 0 < |n! Rn+1 (1)| < n+1
< 1, which holds according to (H.2) (recalling
n ≥ 2).
H.2 Irrationality of π
Theorem H.3. π 2 is irrational (then, in particular, π must be irrational as well).
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
H IRRATIONALITY OF e AND π 287
Moreover, since f (1 − x) = f (x) for each x ∈ R, and, thus, f (j) (1 − x) = (−1)j f (j) (x)
for each x ∈ R, we also have
∀ f (j) (1) ∈ Z. (H.11)
j∈N0
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
I TRIGONOMETRIC FUNCTIONS 288
I Trigonometric Functions
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
J DIFFERENTIAL CALCULUS 289
as claimed.
J Differential Calculus
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
J DIFFERENTIAL CALCULUS 290
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
J DIFFERENTIAL CALCULUS 291
To show that f is nowhere differentiable, we will now study suitable difference quotients.
Let (hk )k∈N be a sequence such that
1
∀ hk = ± . (J.11)
k∈N ak+1
Then (J.4) implies limk→∞ hk = 0. Let x ∈ R be arbitrary. Define
g(an (x + hk )) − g(an x)
∀ δk g(an x) := . (J.12)
k,n∈N hk
Then
(J.10) an |hk |
∀ |δk g(an x)| ≤ = an , (J.13)
k,n∈N |hk |
and, recalling a ∈ N,
1 an−(k+1)
n gan (x ± ak+1
) − gan (x) gan (x ± an
) − gan (x) (J.9)
∀ δk g(a x) = = = 0. (J.14)
n>k∈N hk hk
Thus, for each k ∈ N, we obtain
P∞ n n n
f (x + hk ) − f (x) q g(a (x + h k )) − g(a x)
δk f := = n=0
hk hk
k−1
X
(J.14)
= q n δk g(an x) + q k δk g(ak x) (J.15)
n=0
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
REFERENCES 292
We rewrite (J.16) as
Xk−1
n n 1
< η q k ak ,
q δ k g(a x) where η := , 0 < η < 1. (J.17)
n=0
aq − 1
1
According to (J.8), gak is affine on intervals of length 2ak
. Thus, since a ≥ 4 implies
1 1
∀ ≤ , (J.18)
k∈N ak+1 4ak
we can always choose the sign of hk such that
|ak (x + hk − x)|
|δk g(ak x)| = = ak . (J.19)
|hk |
Using this choice for the hk , we combine our estimates to obtain
k−1
(J.17),(J.19)
(J.15) X n n k k
∀ |δk f | = q δk g(a x) + q δk g(a x) ≥ (1 − η) q k ak . (J.20)
k∈N n=0
References
[Bla84] A. Blass. Existence of Bases Implies the Axiom of Choice. Contemporary
Mathematics 31 (1984), 31–33.
[EFT07] H.-D. Ebbinghaus, J. Flum, and W. Thomas. Einführung in die math-
ematische Logik, 5th ed. Spektrum Akademischer Verlag, Heidelberg, 2007
(German).
[EHH+ 95] H.-D. Ebbinghaus, H. Hermes, F. Hirzebruch, M. Koecher,
K. Mainzer, J. Neukirch, A. Prestel, and R. Remmert. Numbers.
Graduate Texts in Mathematics, Vol. 123, Springer-Verlag, New York, 1995,
corrected 3rd printing.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34
REFERENCES 293
[Hal17] Lorenz J. Halbeisen. Combinatorial Set Theory, 2nd ed. Springer Mono-
graphs in Mathematics, Springer, Cham, Switzerland, 2017.
[Kun80] Kenneth Kunen. Set Theory. Studies in Logic and the Foundations of
Mathematics, Vol. 102, North-Holland, Amsterdam, 1980.
[Phi17] P. Philip. Analysis III: Measure and Integration Theory of Several Vari-
ables. Lecture Notes, LMU Munich, 2016/2017, AMS Open Math Notes Ref.
# OMN:202109.111308, available in PDF format at
https://www.ams.org/open-math-notes/omn-view-listing?listingId=111308.
[Wal02] Wolfgang Walter. Analysis 2, 5th ed. Springer-Verlag, Berlin, 2002 (Ger-
man).
[Wal04] Wolfgang Walter. Analysis 1, 7th ed. Springer-Verlag, Berlin, 2004 (Ger-
man).
[Wik15b] Wikipedia. HOL Light — Wikipedia, The Free Encyclopedia. 2015, https:
//en.wikipedia.org/wiki/HOL_Light Online; accessed Sep-01-2015.
AMS Open Math Notes: Works in Progress; Reference # OMN:202109.111306; Last Revised: 2021-09-17 09:47:34