Preface
ABOUT THE CHOICE OF TOPICS
Handbook of Analysis and its Foundations (hereafter abbreviated HAF) is a self-study guide, intended for advanced undergraduates or beginning graduate students in mathematics. It will also be useful as a reference tool for more advanced mathematicians. HAF surveys analysis and related topics, with particular attention to existence proofs.
HAF progresses from elementary notions (sets, functions, products of sets) through intermediate topics (uniform completions, Tychonov's Theorem) all the way to a few advanced results (the Eberlein-Šmulian-Grothendieck Theorem, the Crandall-Liggett Theorem, and others). The book is self-contained and thus is well suited for self-directed study. It will help to compensate for the differences between students who, coming into a single graduate class from different undergraduate schools, have different backgrounds. I believe that the reading of part or all of this book would be a good project for the summer vacation before one begins graduate school in mathematics. At least, this is the book I wish I had had before I began my graduate studies.
HAF introduces and shows the connections between many topics that are customarily taught separately in greater depth: set theory, metric spaces, abstract algebra, formal logic, general topology, real analysis, and linear and nonlinear functional analysis, plus a small amount of Baire category theory, Mac Lane-Eilenberg category theory, nonstandard analysis, and differential equations. Included in these customary topics are the usual nonconstructive proofs of existence of pathological objects. Unlike most analysis books, however, HAF also includes some chapters on set theory and logic, to explain why many of those classical pathological objects are presented without examples.
HAF contains the most fundamental parts of an entire shelf of conventional textbooks. In his "automathography," Halmos [1985] said that one good way to learn a lot of mathematics is by reading the first chapters of many books.
I have tried to improve upon that collection of first chapters by eliminating the overlap between separate books, adhering to consistent notation, and inserting frequent cross-referencing between the different topics. HAF's integrated approach shows connections between topics and thus partially counteracts the fragmentation into specialized little bits that has become commonplace in mathematics in recent decades. HAF's integrated approach also supports the development
of interdisciplinary topics, such as the "intangibles" discussed later in this preface.
The content is biased toward the interests of analysts. For instance, our treatment of algebra devotes much attention to convexity but little attention to finite or noncommutative groups; our treatment of general topology emphasizes distances and meager sets but omits manifolds and homology. HAF will not transform the reader into a researcher in algebra, topology, or logic, but it will provide analysts with useful tools from those fields.
HAF includes a few "hard analysis" results: Clarkson's Inequalities, the Kobayashi-Rasmussen Inequalities, maximal inequalities for martingales and for Lebesgue measure, etc. However, the book leans more toward "soft analysis," i.e., existence theorems and other qualitative results. Preference is given to theorems that have short or elegant or intuitive proofs and that mesh well with the main themes of the book. A few long proofs (e.g., Brouwer's Theorem, James's Theorem) are included when they are sufficiently important for the themes of the book.
As much as possible, I have tried to make this book current. Most mathematical papers published each year are on advanced and specialized material, not appropriate for an introductory work. Only occasionally does a paper strengthen, simplify, or clarify some basic, classical ideas. I have combed the literature for these insightful papers as well as I could, but some of them are not well known; that is evident from their infrequent mentions in the Science Citation Index. Following are a few of HAF's unusual features:
• a thorough introduction to filters in Chapters 5 and 6, and nets in Chapter 7. Those tools are used extensively in later chapters. Included are ideas of Aarnes and Andenaes [1972] on the interchangeability of subnets and superfilters, making available the advantages of both theories of convergence.
Also included, in 15.10, is Gherman's [1980] characterization of topological convergences, which simplifies slightly the classic characterization of Kelley [1955/1975].
• an introduction to symmetric and preregular spaces, filling the conceptual gaps that are left in most introductions to T0, T1, T2, and T3 spaces (see the table in 16.1).
• a unified treatment of topological spaces, uniform spaces, topological Abelian groups, topological vector spaces, locally convex spaces, Fréchet spaces, Banach spaces, and Banach lattices, explaining these spaces in terms of increasingly specialized kinds of "distances" (see the table in 26.1).
• converses to Banach's Contraction Fixed Point Theorem, due to Bessaga [1959] and Meyers [1967], in Chapter 19. These converses show that, although Banach's theorem is quite easy to prove, a longer proof cannot yield stronger results.
• the Brouwer Fixed Point Theorem, proved via van Maaren's geometry-free version of Sperner's Lemma. This approach is particularly intuitive and elementary in that it involves neither Jacobians nor triangulations. It decomposes the proof of Brouwer's Theorem into a purely combinatorial argument (in 3.28) and a compactness argument (in 27.19).
• introductions to both the Lebesgue and Henstock integrals and a proof of their equivalence in Chapter 24. (More precisely, a Banach-space-valued function is Lebesgue
integrable if and only if it is almost separably valued and absolutely Henstock integrable.)
• pathological examples due to Nedoma, Kottman, Gordon, Dieudonné, and others, which illustrate very vividly some of the differences between ℝ^n and infinite-dimensional Banach spaces.
• an introduction to set theory, including the most interesting equivalents of the Axiom of Choice, Dependent Choice, the Ultrafilter Principle, and the Hahn-Banach Theorem. (For lists of equivalents of these principles, see the index.)
• an introduction to formal logic following the substitution rules of Rasiowa and Sikorski [1963], which are simpler and (in this author's opinion) more natural than the substitution rules used in most logic textbooks. This is discussed in 14.20.
• a discussion of model theory and consistency results, including a summary of some results of Solovay, Pincus, Shelah, et al. Those results can be used to prove the nonconstructibility of many classical pathological objects of analysis; see especially the discussions in 14.76 and 14.77.
• Neumann's [1985] nonlinear Closed Graph Theorem.
• the automatic continuity theorems of Garnir [1974] and Wright [1977]. These results are similar to Neumann's, but instead of assuming a closed graph, they replace conventional set theory with ZF + DC + BP. Their result explains in part why a Banach space in applied math has a "usual norm;" see 14.77.
In compiling this book I have acted primarily as a reporter, not an inventor or discoverer. Nearly all the theorems and proofs in HAF can be found in earlier books or in research journal articles, but in many cases those books or articles are hard to find or hard to read. This book's goal is to enhance classical results by modernizing the exposition, arranging separate topics into a unified whole, and occasionally incorporating some recent developments. I have tried to give credit where it is due, but that is sometimes difficult or impossible.
Historical inaccuracies tend to propagate through the literature. I have tried to weed out the inaccuracies by reading widely, but I'm sure I have not caught them all. Moreover, I have not always distinguished between primary and secondary sources. In many cases I have cited a textbook or other secondary source, to give credit for an exposition that I have modified in the present work.
EXISTENCE, EXAMPLES, AND INTANGIBLES
Most existence proofs use either compactness, completeness, or the Axiom of Choice; those topics receive extra attention in this book. (In fact, Choice, Completeness, Compactness was the title of an earlier, prepublication version of this book; papers that mention that title are actually citing this book.) Although those three approaches to existence are usually
quite different, they are not entirely unrelated: AC has many equivalent forms, some of which are concerned with compactness or completeness (see 17.16 and 19.13).
The term "foundations" has two meanings; both are intended in the title of this book:
(i) In nonmathematical, everyday English, "foundations" refers to any basic or elementary or prerequisite material. For instance, this book contains much elementary set theory, algebra, and topology. Those subjects are not part of analysis, but are prerequisites for some parts of analysis.
(ii) "Foundations" also has a more specialized and technical meaning. It refers to more advanced topics in set theory (such as the Axiom of Choice) and to formal logic. Many mathematicians consider these topics to be the basis for all of mathematics.
Conventional analysis books include only a page or so concerning (ii); this book contains much more. We are led to (ii) when we look for examples of pathological objects. Students and researchers need examples; it is a basic precept of pedagogy that every abstract idea should be accompanied by one or more concrete examples. Therefore, when I began writing this book (originally a conventional analysis book), I resolved to give examples of everything. However, as I searched through the literature, I was unable to find explicit examples of several important pathological objects, which I now call intangibles:
• finitely additive probabilities that are not countably additive,
• elements of (ℓ∞)* \ ℓ1 (a customary corollary of the Hahn-Banach Theorem),
• universal nets that are not eventually constant,
• free ultrafilters (used very freely in nonstandard analysis!),
• well orderings for ℝ,
• inequivalent complete norms on a vector space,
etc.
In analysis books it has been customary to prove the existence of these and other pathological objects without constructing any explicit examples, without explaining the omission of examples, and without even mentioning that anything has been omitted. Typically, the student does not consciously notice the omission, but is left with a vague uneasiness about these unillustrated objects that are so difficult to visualize. I could not understand the dearth of examples until I accidentally ventured beyond the traditional confines of analysis. I was surprised to learn that the examples of these mysterious objects are omitted from the literature because they must be omitted: Although the objects exist, it can also be proved that explicit constructions do not exist. That may sound paradoxical, but it merely reflects a peculiarity in our language: The customary requirements for an "explicit construction" are more stringent than the customary requirements for an "existence proof." In an existence proof we are permitted to postulate arbitrary choices, but in an explicit construction we are expected to make choices in an algorithmic fashion. (To make this observation more precise requires some definitions, which are given in 14.76 and 14.77.)
Though existence without examples has puzzled some analysts, the relevant concepts have been a part of logic for many years. The nonconstructive nature of the Axiom of Choice was controversial when set theory was born about a century ago, but our understanding and acceptance of it has gradually grown. An account of its history is given by Moore [1982]. It is now easy to observe that nonconstructive techniques are used in many of the classical existence proofs for pathological objects of analysis. It can also be shown, though less easily, that many of those existence theorems cannot be proved by other, constructive techniques. Thus, the pathological objects in question are inherently unconstructible. The paradox of existence without examples has become a part of the logicians' folklore, which is not easily accessible to nonlogicians. Most modern books and papers on logic are written in a specialized, technical language that is unfamiliar and nonintuitive to outsiders: Symbols are used where other mathematicians are accustomed to seeing words, and distinctions are made which other mathematicians are accustomed to blurring (e.g., the distinction between first-order and higher-order languages). Moreover, those books and papers of logic generally do not focus on the intangibles of analysis. On the other hand, analysis books and papers invoke nonconstructive principles like magical incantations, without much accompanying explanation and in some cases without much understanding. One recent analysis book asserts that analysts would gain little from questioning the Axiom of Choice. I disagree. The present work was motivated in part by my feeling that students deserve a more "honest" explanation of some of the nonexamples of analysis, especially of some of the consequences of the Hahn-Banach Theorem. When we cannot construct an explicit example, we should say so. The student who cannot visualize some object should be reassured that no one else can visualize it either.
Because examples are so important in the learning process, the lack of examples should be discussed at least briefly when that lack is first encountered; it should not be postponed until some more advanced course or ignored altogether. Though most of HAF relies only on conventional reasoning (i.e., the kind of set theory and logic that most mathematicians use without noticing they are using it), we come to a better understanding of the idiosyncrasies of conventional reasoning by contrasting it with unconventional systems, such as ZF + DC + BP or Bishop's constructivism. HAF explains the relevant foundational concepts in brief, informal, intuitive terms that should be easily understood by analysts and other nonlogicians. To better understand the role played by the Axiom of Choice, we shall keep track of its uses and the uses of certain weakened forms of AC, especially the Principle of Dependent Choices (DC), which is constructive and is equivalent to several principles about complete metric spaces; the Ultrafilter Principle (UF), which is nonconstructive and is equivalent to the Completeness and Compactness Principles of logic, as well as dozens of other important principles involving topological compactness; and the Hahn-Banach Theorem (HB), also nonconstructive, which has many important equivalent forms in functional analysis. Most analysts are not accustomed to viewing HB as a weakened form of AC, but that
viewpoint makes the Hahn-Banach Theorem's nonconstructive nature much easier to understand. This book's sketch of logic omits many proofs and even some definitions. It is intended not to make the reader into a logician, but only to show analysts the relevance of some parts of logic. The introduction to foundations for analysts is HAF's most unusual feature, but it is not an overriding feature; it takes up only a small portion of the book and can be skipped over by mathematicians who have picked up this book for its treatment of nonfoundational topics such as nets, F-spaces, or integration.
ABSTRACT VERSUS CONCRETE
I have attempted to present each set of ideas at a natural level of generality and abstraction, i.e., a level that conveys the ideas in a simple form and permits several examples and applications. Of course, the level of generality of any part of the book is partly dictated by the needs of later parts of the book. Usually, I lean toward more abstract and general approaches when they are available. By omitting unnecessary, irrelevant, or distracting hypotheses, we trim a concept down to reveal its essential parts. In many cases, omitting unnecessary hypotheses does not lengthen a proof, and it may make the proof easier to understand because the reader's attention is then focused on the few possible lines of reasoning that still remain available. For instance, every metric space can be embedded isometrically in a Banach space (see 22.14), but the "more concrete" setting of Banach spaces does not improve our understanding of metric space results such as the Contraction Fixed Point Theorem in 19.39. Here is another example of my preference for abstraction: Some textbooks build Hausdorffness into their definition of "uniform space" or "topological vector space" or "locally convex space" because most spaces used in applications are in fact Hausdorff. This may shorten the statements of several theorems by a word or two, but it does not shorten the proofs of those theorems. Moreover, it may confuse beginners by entangling concepts that are not inherently related: The basic ideas of Hausdorff spaces are independent from the other basic ideas of uniform spaces, topological spaces, and locally convex spaces; neither set of ideas actually requires the other. In HAF, Hausdorffness is a separate property; it is not built into our definitions of those other spaces.
Our not-necessarily-Hausdorff approach has several benefits, of which the greatest probably is this: The weak topology of an infinite-dimensional Banach space is an important nonmetrizable Hausdorff topology that is best explained as the supremum of a collection of pseudometrizable, non-Hausdorff topologies. (If the reader is accustomed to working only in Hausdorff spaces, HAF's not-necessarily-Hausdorff approach may take a little getting used to, but only a little. Mostly, one replaces "metric" with "pseudometric" or with the neutral notion of "distance;" one replaces "the limit" with "a limit" or with the neutral notion of "converges to.") However, a more general approach to a topic is not necessarily a simpler approach. Every idea in mathematics can be made more general and more abstract by making the hypotheses
weaker and more complicated and by introducing more definitions, but I have tried to avoid the weakly upper hemisemidemicontinuous quasipseudospaces of baroque mathematics. It is unavoidable that the beginning graduate student of mathematics must wade through a large collection of new definitions, but that collection should not be made larger than necessary. Thus we sometimes accept slightly stronger hypotheses for a theorem in order to avoid introducing more definitions. Of course, ultimately the difference between important distinctions and excessive hairsplitting is a matter of an individual mathematician's own personal taste. Converses to main implications are included in HAF whenever this can be managed conveniently, as well as in a few inconvenient cases that I deemed sufficiently important. Lists of dissimilar but equivalent definitions are collected into one long definition-and-theorem, even though that one theorem may have a painfully long proof. The single portmanteau theorem is convenient for reference, and moreover it clearly displays the importance of a concept. For instance, the notion of "ultrabarrelled spaces" seemed too advanced and specialized for this book until I saw the long list of dissimilar but equivalent definitions that now appears in 27.26. To prevent confusion, I have called the student's attention to contrasts between similar but inequivalent concepts, either by juxtaposing them (as in the case of barrels and ultrabarrels) or by including cross-referencing remarks (as in the case of Bishop's constructivism and Gödel's constructivism). Although the content is chosen for analysts, the writing style has been influenced by algebraists. Whenever possible, I have made degenerate objects such as the empty set into a special case of a rule, rather than an exception to the rule.
For instance, in this book and in algebra books, {S : S ⊆ X} is an "improper filter" on X, though it is not a filter at all according to the definition used by many books on general topology.
ORDER OF TOPICS
I have followed a Bourbaki-like order of topics, first introducing simple fundamentals and later building upon them to develop more specialized ideas. The topics are ordered to suit pedagogy rather than to emphasize applications. For instance, convexity is commonly introduced in functional analysis courses in the setting of Banach spaces or topological vector spaces, but I have found it expedient to introduce convexity as a purely algebraic notion, and then add topological considerations much later in the book. Most topological vector spaces used in applications are locally convex, but HAF first studies topological vector spaces without the additional assumption of local convexity. Topics covered within a single chapter are closely related to each other. However, in many cases the end of a chapter covers more advanced and specialized material that can be postponed; it will not be needed until much later in the book, if at all. Most of Part C (on topological and uniform spaces) can be read without Part B (logic and algebra). However, most readers should skim through Chapters 5, 6, and 7. Those chapters introduce filters and nets, tools that are used more extensively in this book than in most analysis books. I have felt justified in violating logical sequencing in one important instance. The real number system is, in some sense, the foundation of analysis, so it must be used in examples
quite early in the book. Examples given in early chapters assume an informal understanding of the real numbers, such as might be acquired in calculus and other early undergraduate courses. A more precise definition of the reals is neither needed nor attainable until Chapter 10. Much conceptual machinery must be built before we can understand and prove a statement such as this one: There exists a Dedekind complete, chain ordered field, called the real numbers. It is unique up to isomorphism if we use the conventional reasoning methods of analysts. (It is not unique if we restrict our reasoning methods to first-order languages and permit the use of nonstandard models.) The existence and uniqueness of the complete ordered field justify the usual definition of ℝ. I am surprised that these algebraic results are not proved (or even mentioned!) in many introductory textbooks on analysis. A traditional course on measure and integration would correspond roughly to part of Chapter 11, all of Chapter 21, and parts of Chapters 22-25 and 29. Integration theory is commonly introduced separately from functional analysis, but I have mixed the two topics together because I feel that each supports the other in essential ways. All of the usual definitions of the Lebesgue space L^1[0, 1] (e.g., in 19.38, 22.28, or 24.36) are quite involved; these definitions cannot be properly appreciated without some of the abstract theory of completions or Banach spaces or convergent nets. Conversely, an introduction to Banach spaces is narrow or distorted if it omits or postpones the rather important example of L^p spaces; the remaining elementary examples of Banach spaces are not diverse enough to give a proper feel for the subject.
HOW TO USE THIS BOOK
Because students' backgrounds differ greatly, I have tried to assume very few prerequisites. The book is intended for students who have finished calculus plus at least four other college math courses. HAF will rely on those four additional courses, not for specific content, but only for mathematical maturity, i.e., for the student's ability to learn new material at a certain pace and a certain level of abstraction, and to fill in a few omitted details to make an exercise into a proof. Students with that amount of preparation will find HAF self-contained; they will not need to refer to other books to read this one. Students with sufficient mathematical maturity may not even need to refer to their college calculus textbooks; Chapters 24 and 25 reintroduce calculus in the more general setting of Banach spaces. Proofs are included, or at least sketched, for all the main results of this book except a few consistency results of formal logic. For those consistency results we give references in lieu of proofs, but the conclusions are explained in sufficient detail to make them clear to beginners. Parts of HAF might be used as a classroom textbook, but HAF was written primarily for individual use. My intended reader will skip back and forth from one part of the book to another; different readers will follow different paths through the book. The reader should begin by skimming the table of contents to get acquainted with the ordering of
topics. To facilitate skipping around in the book, I have included a large index and many cross-referencing remarks. Newly defined terms are generally given in boldface to make them easy to find. These definitions are followed by alternate terminology in italics if the literature uses other terms for the same concept or by cautionary remarks if the literature also uses the same term for other concepts. The first few pages of the first chapter introduce many of the symbols and typographical conventions used throughout the book; the index ends with a list of symbols. A list of charts, tables, diagrams, and figures is included in the index under "charts." Mathematics textbooks usually postpone exercises until the end of each subchapter or each chapter, but HAF mixes exercises into the main text. In fact, HAF does not always distinguish sharply between "discussions," "theorems," "examples," and "exercises." All such assertions are true statements, with varying degrees of importance, generality, or difficulty, and with varying amounts of hints provided. Every student knows that reading through any proof in any math book is a challenge, whether that proof is marked "exercise" or not. Some computations and deductions are easier or more instructive to do than to watch, so for brevity I have intentionally given some proofs as sketches. All the "exercises" are actually part of the text; most of them will serve as essential examples or as steps in proofs of later theorems. Thus, in each chapter that is studied, the reader should work through, or at least read through, every exercise; no exercise should be skipped.
ACKNOWLEDGMENTS
I am especially grateful to Isidore Fleischer, Mai Gehrke, Paul Howard, and Constantine Tsinakis, who helped with innumerable questions about algebra and logic. I am also grateful to many other mathematicians who helped or tried to help with many different questions: Richard Ball, Howard Becker, Lamar Bentley, Dan Biles, Andreas Blass, Douglas Bridges, Norbert Brunner, Gerard Buskes, Chris Ciesielski, John Cook, Matthew Foreman, Doug Hardin, Peter Johnstone, Bjarni Jónsson, William Julian, Keith Kearnes, Darrell Kent, Menachem Kojman, Ralph Kopperman, Wilhelmus Luxemburg, Hans van Maaren, Roman Mańka, Peter Massopust, Ralph McKenzie, Charles Megibben, Norm Megill, Michael Mihalik, Zuhair Nashed, Neil Nelson, Michael Neumann, Jeffrey Norden, Simeon Reich, Fred Richman, Saharon Shelah, Stephen Simons, Steve Tschantz, Stan Wagon, and others too numerous to list here. I am also grateful to many students who read through earlier versions of parts of this book. Of course, any mistakes that remain in this book are my own. This work was supported in part by a Summer Award from the Vanderbilt University Research Council. I would also like to thank John Cook, Mark Ellingham, Martin Fryd, Bob Messer, Ruby Moore, Steve Tschantz, John Williams, and others for their help with TeX. This book was composed using several different computers and wordprocessors. It was typeset using LaTeX, with some fonts and symbols imported from AMS-TeX. I am also grateful to my family for their support of this project.
TO CONTACT ME
I've surveyed the literature as well as I could, but it's enormous; I'm sure there is much that I've overlooked. I would be grateful for comments from readers, particularly regarding errors or other suggested alterations for a possible later edition. I will post the errata and other insights on the book's World Wide Web page on the internet.
Eric Schechter, August 16, 1996
http://math.vanderbilt.edu/~schectex/ccc/
Chapter 1 Sets
MATHEMATICAL LANGUAGE AND INFORMAL LOGIC
1.1. A few typographical conventions. Certain kinds of mathematical objects are most often represented by certain kinds of letters. For instance, mathematicians often represent a point by "x" and a function by "f," and very seldom the other way around. This book will usually adhere to the following guidelines, which are consistent with much (but not all!) of the literature of algebra, topology, and analysis. The reader is cautioned that there is no standard usage, in the literature or even in this book. The guidelines in the following list will be helpful, but the guidelines will have exceptions (which should be clear from the context).
i, j, k, m, n, p, . . .              integers
p, q, r, s, t, . . .                 real numbers
ℂ, ℕ, ℚ, ℝ, ℤ                        sets of numbers
W, X, Y, Z, Ω, Γ, Λ, . . .           main sets (e.g., linear spaces)
A, B, C, L, S, T, . . .              subsets of main sets
a, b, y, z, α, β, λ, μ, π, . . .     elements of sets
𝒜, ℬ, ℰ, . . .                       sets of sets (e.g., filters, topologies)
f, g, p, q, α, β, λ, μ, π, . . .     functions
Γ, Λ, Φ, Ψ, . . .                    collections of functions
There is even some overlap between the categories listed above. For instance, in atomless set theory, discussed in 1.46, all sets are sets of sets.
1.2. All letters are variables, but some letters are more variable than others (as George Orwell might have put it). Every high school student has understood at least one example of this: the solutions of
\[ ax^2 + bx + c = 0 \]
are
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]
Here the letters a, b, c are treated as real constants, but they can be any real constants;
they vary only slightly less than x does. Usually it should be clear from the context just which letters are varying more than others.
1.3. Notes on "and" and "or." Although mathematicians base their language on English or other "natural" languages, mathematicians alter the language slightly to make it more precise or to make it fit their purposes better. Some of the differences between English and mathematics may confuse the beginner. For instance, there are two different meanings for the English word "or:"
∪     vel     inclusive or     A or B or both
+     aut     exclusive or     A or B but not both.
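The two meanings tabulated above can be compared mechanically. The following is a minimal sketch in Python, not part of the original text; the function names `vel` and `aut` are hypothetical labels chosen here to match the Latin words, and truth values stand in for the statements A and B.

```python
from itertools import product

def vel(a: bool, b: bool) -> bool:
    """Inclusive or: true when A or B or both are true."""
    return a or b

def aut(a: bool, b: bool) -> bool:
    """Exclusive or: true when exactly one of A, B is true."""
    return a != b

# Tabulate both connectives over all four truth assignments.
rows = [(a, b, vel(a, b), aut(a, b)) for a, b in product([True, False], repeat=2)]
for a, b, v, x in rows:
    print(f"A={a!s:5} B={b!s:5} vel={v!s:5} aut={x!s:5}")
```

The two connectives disagree on exactly one row, the one where A and B are both true.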
Latin distinguishes between these two meanings by using two different words: "vel" and "aut;" see Rosser [1953/1978]. In everyday English, the term "or" is ambiguous; it could have either meaning. For clarification in English, "vel" is sometimes called "and/or," and "aut" is sometimes called "either/or." In mathematics, "or" generally means "vel," unless specified otherwise. Undergraduate mathematics students sometimes confuse "and" and "or" in the following fashion: What is the solution set of x^2 - 4x + 3 > 0, in the real line? It is
{x ∈ ℝ : x < 1} ∪ {x ∈ ℝ : x > 3} = {x ∈ ℝ : x < 1 or x > 3}.
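As a hedged numerical spot check (a sketch, not a proof), the description of the solution set can be tested on a finite grid of sample points. The helper names `p` and `in_union` and the choice of grid are assumptions made here for illustration only.

```python
def p(x: float) -> float:
    """The quadratic x^2 - 4x + 3, which factors as (x - 1)(x - 3)."""
    return x * x - 4 * x + 3

def in_union(x: float) -> bool:
    """Membership in {x : x < 1} union {x : x > 3}, phrased with 'or'."""
    return x < 1 or x > 3

# Sample points -5.0, -4.75, ..., 10.0 (multiples of 1/4 are exact in floating point).
samples = [k / 4 for k in range(-20, 41)]

# The inequality p(x) > 0 and the 'or' description agree at every sample point.
agree = all((p(x) > 0) == in_union(x) for x in samples)

# Reading the answer literally with 'and' selects no points at all.
both = [x for x in samples if x < 1 and x > 3]
print(agree, both)
```

On this grid the "or" description matches the inequality everywhere, while the literal "and" reading selects no points.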
Thus, the appropriate word is "or." However, some calculus students write the solution as "x < 1 and x > 3," by which they mean "the points x that satisfy x < 1, and also the points x that satisfy x > 3;" thus they are using "and" for ∪ (union). Though such students may think that they know what they mean, this usage is not standard in mathematics and should be discontinued by students who wish to proceed in higher mathematics. Another word for "or" is disjunction; the most commonly used symbol for it is ∨. Another word for "and" is conjunction; the most commonly used symbol for it is ∧. However, we shall use ∪ and ∩ for "or" and "and," in order to reserve the symbols ∨ and ∧ for use in some related lattices. We shall use "not-A" or "¬A" as abbreviations for the statement that "statement A is not true;" some mathematicians use other symbols such as ∼A. The symbol ¬, meaning "not," is also called negation. In conventional (ordinary) logic, used throughout most of this book, ¬¬A = A; that is, not-not-A is equal to A. That equality fails in constructivist or intuitionist logic, which is discussed very briefly in Chapters 6 and 13.
1.4. The statement "A implies B" will sometimes be abbreviated as "A ⇒ B" or "A → B;" the latter expression will be used in our chapter on logic. Either of these expressions means "if A is true then B is true" or, more precisely, "whenever A is true, then B is also true." The usage of "if . . . then" in mathematics differs from the usage in English, because the mathematical statement A ⇒ B makes no prediction about B in the case where A is false. For instance, in everyday English the statement "If it rains, then I will take my umbrella" is ambiguous; it could have either of the following meanings:
(i) If it rains, then I will take my umbrella. If it doesn't rain, then I won't take my umbrella.
(ii) If it rains, then I will take my umbrella. If it doesn't rain, then I might or might not take my umbrella.
In mathematics, however, (ii) is the only customary interpretation of "if ... then." The mathematicians' implication also differs from the nonmathematicians' implication in this respect: we may have A ⇒ B even if A and B are not causally related. For instance, "if ice is hot then grass is green" is true in mathematics, but it is nonsense in ordinary English, since there is no apparent connection between the temperature of ice and the color of grass. The mathematicians' implication is sometimes referred to as material implication, to distinguish it from certain other kinds of implications not commonly used in mathematics but sometimes studied by philosophers and specialized logicians. The converse of the statement "A ⇒ B" is the statement "B ⇒ A." These two statements are not equivalent; the beginner must be careful not to confuse them. For instance, "x = 3" implies "x is a prime number," but "x is a prime number" does not imply "x = 3." The statement "A if and only if B" may be abbreviated "A iff B"; it is also written "A ⇔ B." This statement means that both A ⇒ B and the converse implication B ⇒ A are true. Statement A is stronger than statement B if A ⇒ B; then we may say B is weaker than A. More generally, a property P of objects is stronger than a property Q if every object that has property P also must have property Q, i.e., if the statement "X has property P" is stronger than the statement "X has property Q." (A related but slightly different meaning of "stronger than" is introduced in 9.4.) The mathematical usage of the terms "stronger" and "weaker" (and of other comparative adjectives such as coarser, finer, higher, lower) differs from the common nonmathematical English usage in this important respect: In English, two objects cannot be "stronger" than each other, but in mathematics they can.
Thus, when A ⇔ B, each statement is stronger than the other. In particular, a statement is always stronger than itself. To say that A implies B and B does not imply A, we could say that A is strictly stronger than B. For instance, the property of being equal to 3 is strictly stronger than the property of being a prime number. In general, "if ... then" is quite different from "if and only if." However, in mathematical definitions the words "and only if" generally are omitted and are understood implicitly, particularly when the defined word or phrase is displayed in boldface or italics. For instance, in our earlier sentence
    Statement A is stronger than statement B if A ⇒ B; then we may say B is weaker than A.
the "if" is really understood to be "if and only if."
1.5. When A and B are variables taking the values "true" or "false," then an expression such as "A and B" is a function of those variables; that is, the value of "A and B" depends on the values of A and B. The truth table below shows how several functions of
A and B depend on the values of A and B. In the table, "T" and "F" stand for "true" and "false," respectively.

    A    B    not-A    A or B    A and B    A ⇒ B    A ⇔ B
    T    T      F        T          T         T        T
    T    F      F        T          F         F        F
    F    T      T        T          F         T        F
    F    F      T        F          F         T        T

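The truth table can also be generated mechanically. The following is a minimal Python sketch, not part of the original text; the function names `implies`, `iff`, and `truth_table` are hypothetical, chosen only for this illustration. It encodes material implication as "false only when A is true and B is false," which is exactly the column shown in the table.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b


def iff(a: bool, b: bool) -> bool:
    """Biconditional: true exactly when a and b have the same truth value."""
    return a == b


def truth_table():
    """Return one row per assignment of (A, B), in the order TT, TF, FT, FF."""
    rows = []
    for a, b in product([True, False], repeat=2):
        rows.append((a, b, not a, a or b, a and b, implies(a, b), iff(a, b)))
    return rows


if __name__ == "__main__":
    header = ("A", "B", "not-A", "A or B", "A and B", "A => B", "A <=> B")
    print("  ".join(header))
    for row in truth_table():
        print("  ".join("T" if value else "F" for value in row))
```

Each tuple returned by `truth_table()` corresponds to one row of the table, with the columns in the same order as the header.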
If a statement A is known to be always false, then the statement "A ⇒ B" is true, regardless of what we know or do not know about B; under these circumstances we may say that the implication "A ⇒ B" is vacuously true, or trivially true. The term "trivially true" can also be used to describe the implication "A ⇒ B" if B is known to be always true, since in that case the validity of A need not be considered.
1.6.
Exercises.
a. The statement "A ⇒ B" is equivalent to the statement "B or not-A." Explain.
b. The contrapositive of "A ⇒ B" is the statement "not-B ⇒ not-A." Show that an implication and its contrapositive are equivalent. We shall use them interchangeably.
c. (De Morgan's Laws for logic.) Explain:
    (not-A) and (not-B)   is equivalent to   not(A or B);
    (not-A) or (not-B)    is equivalent to   not(A and B).
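Since A and B take only two values each, equivalences like those in the exercises above can be checked by exhausting all four cases. The sketch below is an illustration only and does not replace the explanations the exercises ask for; the helper names `implies` and `equivalent` are assumptions introduced here. For contrast, it also tests an implication against its converse, which are not equivalent.

```python
from itertools import product


def implies(a, b):
    # Material implication, as in the truth table: false only for (True, False).
    return not (a and not b)


def equivalent(f, g):
    """True if the two-variable Boolean functions f and g agree on all four inputs."""
    return all(f(a, b) == g(a, b) for a, b in product([True, False], repeat=2))


checks = {
    "A => B  vs  B or not-A": equivalent(implies, lambda a, b: b or (not a)),
    "A => B  vs  not-B => not-A": equivalent(implies, lambda a, b: implies(not b, not a)),
    "(not-A) and (not-B)  vs  not(A or B)": equivalent(
        lambda a, b: (not a) and (not b), lambda a, b: not (a or b)
    ),
    "(not-A) or (not-B)  vs  not(A and B)": equivalent(
        lambda a, b: (not a) or (not b), lambda a, b: not (a and b)
    ),
    "A => B  vs  B => A (the converse)": equivalent(implies, lambda a, b: implies(b, a)),
}

for name, ok in checks.items():
    print(f"{name}: {'equivalent' if ok else 'NOT equivalent'}")
```

Only the last check fails, because the implication and its converse disagree when exactly one of A, B is true.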
1.7. Duality arguments. Some concepts in mathematics occur in pairs; each member of the pair is said to be dual to the other. A few examples are listed in the table below; these examples and others are developed in more detail in later chapters. The statements about these concepts occur in pairs. In some cases, one of the two statements is preferred, because it is more relevant to applications or is simpler in appearance.

    A concept    Its dual
    and          or
    min          max
    inf          sup
    open         closed
    int          cl
    ideal        filter
    ⊆            ⊇

Generally there is a simple and mechanical method for transforming a statement into its dual statement and for transforming the proof of a statement into the proof of the dual statement. For instance, De Morgan's Laws for logic (given in 1.6.c) can be used to convert between ands and ors, by inserting a few nots. Other such conversion rules will be given in later chapters. In some cases, for brevity, we state and/or prove only one of the two statements in the pair. The other statement is left unstated and/or unproved, but the reader should be able to fill in the missing details without any difficulty.
1.8. On parsing strings of symbols. In this book, we generally read set-theoretical operations (∩, ∪, ∁, etc.) first, then set-theoretical relations (=, ⊆, ∈, etc.), then logical relations
between statements. For instance,

    A = B ⟺ C ∩ D ⊇ E        (∗)

should be interpreted as

    (A = B) ⟺ ((C ∩ D) ⊇ E).

Generally we omit the parentheses, but we may sometimes use extra spacing to make the correct interpretation more obvious:

    A = B   ⟺   C∩D ⊇ E.

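That the reading of an unparenthesized string depends on the conventions in force can be seen in a programming language as well. The following Python sketch is only an illustration: the sample sets are hypothetical, `==` between truth values stands in for "if and only if," `&` for intersection, and `>=` for "is a superset of." Python agrees with the book that the set operation binds first, but it then treats the remaining relations as a comparison chain, so the same string of symbols receives a different meaning.

```python
# Hypothetical sample sets, chosen only for illustration.
A, B = {1, 2}, {1, 2}
C, D, E = {1, 2, 3}, {2, 3, 4}, {2, 3}

# The book's reading: set operation first, then the set relations, then the
# logical relation. Between truth values, "==" plays the role of "iff".
book_reading = (A == B) == ((C & D) >= E)

# Python's own reading of the unparenthesized string: "&" still binds first,
# but "==" and ">=" then form a comparison chain, equivalent to
#   (A == B) and (B == (C & D)) and ((C & D) >= E).
python_reading = A == B == C & D >= E

print(book_reading)    # True:  both sides of the "iff" are true
print(python_reading)  # False: B is not equal to C & D
```

Writing the parentheses explicitly, as in `book_reading`, removes the dependence on any particular convention.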
We emphasize that this order of precedence depends on the context, i.e., on the fact that the present book is concerned with abstract analysis. In a different context, the expression (∗) could be read in an entirely different order. For instance, in some books on logic, ∩ means "and" and ⊇ means "is implied by." Hence all four of the symbols =, ⟺, ∩, and ⊇ are binary operations on statements, i.e., they are operators □ with the syntax that if P and Q are statements, then P □ Q is a statement. Therefore, in a logic book, the displayed equation (∗) could make sense with any arrangement of parentheses, and it would have different meanings with different arrangements of parentheses. In that context, (∗) would be highly ambiguous; some parentheses would be needed for clarification.
1.9. Proof by contradiction is a nonconstructive technique of logic, so widely used in mainstream mathematics that it generally goes unremarked. It may be confusing to beginning mathematicians who have never seen it explained. The technique is this: If we wish to prove A ⇒ B, we can assume the truth of both A and not-B. From those two assumptions we deduce a contradiction; the contradiction demonstrates that indeed A ⇒ B. The justification of this technique is 1.6.a. Proof by contradiction has this advantage: We work from two assumptions (both A and not-B) rather than just the one assumption of A; thus we have more statements on which to build. Consequently, proofs by contradiction are often easier to discover than direct proofs. Proofs by contradiction also have a couple of disadvantages:
• Proofs by contradiction are often harder to read than direct proofs because they are conceptually more complicated. A beginning student of mathematics may prefer to assume that A is true and try to discover what else is then true, a sort of one-directional approach. But a proof by contradiction works simultaneously in two directions, mixing together statements (such as A and its consequences) that we take to be true with statements (such as not-B) that we temporarily pretend are true but shall eventually decide are false. This scheme must seem diabolical, or at least amoral, to beginners: It is not concerned so much with "what is true," but rather with "what implies what."
• A proof by contradiction is often nonconstructive: It may prove the existence of some mathematical object without producing any explicit example of that object. For a very vivid example of this lack of examples, see 6.5. The availability or unavailability of explicit examples is one of the main themes of this book. A proof by contradiction may convince us that a statement is true, but it may not give us as much intuitive understanding of that statement as a direct proof would.
1.10. The phrase "we may assume" is often used in the literature in ways that may bewilder the novice. For instance, consider a proposition of this form:
Proposition A. Let X be a mathematical object satisfying hypothesis H(X). Then X satisfies conclusion C(X).
A published proof of Proposition A might begin something like this: (!) We may assume that X also satisfies property P(X).
The reasoning step (!) has several possible meanings; we shall describe three of them below. The simplest meaning of (!) would be that
(1) Hypothesis H(X) actually implies property P(X), by some reasoning that should be evident to a sufficiently advanced reader.
Readers who are not so advanced may spend many hours trying to fill in that reasoning. However, (!) may not mean (1) after all. Indeed, if (1) were true then (!) would probably be worded a bit differently, e.g., the proof might have begun by saying "We first observe that, obviously, H(X) ⇒ P(X)." A more likely meaning of (!) is this:
(2) H(X) and not-P(X) together imply C(X), by some reasoning that should be evident to the reader. Hence, in trying to prove H(X) ⇒ C(X), we may concentrate on the case where P(X) holds.
That is harder but still manageable. Alas, (!) has yet a third meaning, and this one is much too subtle for some beginners:
(3) The text will now give the details of a proof of a slightly easier proposition. After reading the proof provided for the easier proposition, the reader is expected to figure out the details of how to use that easier proposition to prove Proposition A.
The easier proposition is as follows:
Proposition B. Let Y be a mathematical object satisfying hypotheses H(Y) and P(Y). Then Y satisfies conclusion C(Y).
The missing details might go as follows: Let any object X be given, satisfying hypothesis H(X) but not necessarily property P(X). By some clever method (which the reader must figure out), we now construct a collection of related objects Y1,Y2,Y3,..., with each Yk satisfying both hypothesis H(Yk) and property P(Yk). Then Proposition B is applicable to
the Yk's, and so we can draw conclusions C(Y1), C(Y2), C(Y3), .... By some clever method (which, again, the reader must figure out), we may then use that information to help us prove C(X). In such an argument, object X does not necessarily satisfy P(X), despite the wording of statement (!). The effect of statement (!) is to discard the original object X, replace it with the new object Yk, and relabel Yk to call it X now. Some other relabeling arguments will be discussed and used in 2.19, 7.21, and 16.5.
1.11. How much formalism do we need? It is not necessary to learn the definitions of "noun" and "verb" to become a fluent speaker of English (or any other natural language). One can learn the language quite well just by studying examples; this is the method by which toddlers learn their native tongue. Similarly, most mathematicians use logic properly without ever knowing its formal rules. This book is intended for "most mathematicians," and we shall discuss logic and formal set theory as little as possible. The few concepts from logic and set theory that we shall need will be developed briefly and informally. For a more complete and formal development, the interested reader is referred to more advanced and specialized books and papers. Informal reasoning is not always reliable, in part because informal language is not always reliable. Natural languages (such as English) evolved to suit the mundane, ordinary, real world, but mathematicians often find themselves considering extraordinary ideas. For instance, a self-referencing statement such as
    This statement is false
cannot be true or false. (This is the simplest form of the Paradox of the Liar, also known as the Paradox of Epimenides.) Such statements do not arise in "ordinary" reality, but such statements show mathematicians a need for careful rules about language and reasoning.
The simplest way to deal with self-referencing statements is to simply prohibit them and avoid the confusion. We shall follow that policy in this book. However, we remark that self-referencing recently has been analyzed in a meaningful and useful way by Aczel [1988] and Barwise and Etchemendy [1987]. Such analyses are especially useful in the theory of computer programs. A computer program may operate on data files that are stored in memory; one of those files may be the program that is operating.
1.12. We should mention one more type of self-referencing before we leave the topic. The self-referencing in Epimenides's Paradox is very direct: The word "this" in the sentence "This sentence is false" points directly to the sentence in which that word is located. But Quine's Paradox, below, involves a more indirect type of self-referencing, which has some important uses in logic. A typical sentence in English consists of a subject followed by a predicate. For instance, in each of the sentences
    Jane is a girl.
    Jane runs with the ball.
the subject is "Jane" and the predicate is the remainder of the sentence. The subject is some "thing" that is being discussed; the predicate says that the subject "is" something or "does" something.

Mathematicians often wish to discuss mathematical objects, so in a mathematics text the subject of a sentence can be a mathematical symbol or formula. For instance,
    x is a variable.
    "x" is a variable.
    □ is a box symbol.
    "□" is a box symbol.
    "x = y" is an equation.
are all acceptable sentences in a mathematics book or paper. Whether we include or omit the quotation marks is generally a matter of taste. In this author's opinion, our main rule is that the intended meaning should be clear. Thus, the last example would become confusing if the quotation marks were omitted, but the quotation marks are optional in the other examples. (Of course, in a book or paper on logic, the quotation marks may have a more technical meaning, and then their use or omission is no longer a matter of taste.)

We shall now consider sentences that follow the format described above, but in these sentences the subject will be some phrase of the English language, i.e., a sentence fragment. Thus, we shall consider sentences that discuss certain sentence fragments. In each case, the sentence fragment will consist of a sentence whose subject has been omitted:
    "is a girl" is a sentence fragment composed of three words.
    "runs with the ball" is a sentence fragment composed of four words.
    "is a sentence fragment" is a sentence fragment.
    "is composed of five words" is composed of five words.
Each of those four sentences is true. The last two sentences have a peculiar structure: they consist of a sentence fragment in quotes, followed by the same sentence fragment without quotes, followed by a period. In Hofstadter [1979], the process of forming such a sentence from such a fragment is called quining. Thus, the last sentence displayed above is the result of starting from the fragment
    is composed of five words
and then quining that fragment.

Now, Quine's Paradox consists of the peculiar sentence
    "yields a falsehood when preceded by its quotation" yields a falsehood when preceded by its quotation.
or, in Hofstadter's terminology,
    "yields a falsehood when quined" yields a falsehood when quined.
(Think about it for a moment.) These sentences are paradoxical: they are false if true, and true if false. These sentences do not involve direct self-referencing of the sort found in Epimenides's Paradox; in either formulation, there is no "this" that points to itself. However, Quine's peculiar sentence discusses another sentence that would be formed as the result of a quining. Just by coincidence (not really), the sentence being discussed happens to be identical to the sentence doing the discussing. Quine formed this paradox in order to explain Gödel's Proof; see 14.62.

BASIC NOTATIONS FOR SETS

1.13. A set is a collection of objects. This is not really a definition, since we do not state what a "collection" is; we shall rely on the reader's intuition about these terms. A more formal approach will be introduced in 1.44 and the sections thereafter. The term "collection" will usually mean the same thing as "set," but occasionally "collection" may have the more general meaning of "class," discussed in 1.44.

Two sets A and B are defined to be equal (as sets) if they contain the same elements, i.e., if they satisfy x ∈ A ⟺ x ∈ B. Two mathematical objects may be equal as sets even though they have different additional structures associated with them. For instance, the real number system with its usual topology is different from the real number system with the discrete topology, i.e., these are different topological spaces. But these topological spaces are equal as sets, since they have the same members.

1.14. Three common ways to specify a set are by listing the objects in the set, by specifying a larger set and a property that determines the subset in question, and by listing a parameter set and a way to form some object from each value of the parameter. For instance, the set of odd positive integers can be represented in any of these ways:

    {1, 3, 5, 7, ...} = {n ∈ ℕ : n is odd} = {2m − 1 : m ∈ ℕ}.

(Some mathematicians would write that last expression as {2m − 1 | m ∈ ℕ}, but this book will have too many other uses for vertical bars.) In the last expression, ℕ is used as an index set, or parameter set.

The order of the elements of a set is not relevant, and repetitions are ignored. For instance, {1, 2, 3, 4} = {4, 3, 1, 2} = {1, 3, 2, 3, 4}. To emphasize this we may occasionally refer to a set as an unordered set, to contrast it with ordered sets, such as those in 1.32.

Here are the two most basic notions of sets:

"x ∈ S" is read as: x belongs to S, or x is a member of S, or x is an element of S. It is occasionally written as "S ∋ x."
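The three ways of specifying a set, and the irrelevance of order and repetition, can be tried out concretely. The sketch below is an illustration under stated assumptions: Python sets are finite, so only the part of each set lying in {1, ..., 20} is compared (the bound `N_MAX` is an arbitrary choice), and the parametrized form uses 2m − 1 so that m = 1 yields 1 under the convention ℕ = {1, 2, 3, ...}.

```python
N_MAX = 20  # assumption: compare only the elements that lie in {1, ..., 20}
naturals = range(1, N_MAX + 1)  # finite stand-in for N = {1, 2, 3, ...}

# 1. Listing the objects.
listed = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19}

# 2. A larger set together with a property that picks out the subset.
filtered = {n for n in naturals if n % 2 == 1}

# 3. A parameter set (here: m in N) and a rule forming an object from each parameter.
indexed = {2 * m - 1 for m in naturals if 2 * m - 1 <= N_MAX}

print(listed == filtered == indexed)  # True: three descriptions, one set

# Order is not relevant, and repetitions are ignored.
print({1, 2, 3, 4} == {4, 3, 1, 2} == set([1, 3, 2, 3, 4]))  # True

# Membership: "x in S" corresponds to x ∈ S.
print(7 in listed, 8 in listed)  # True False
```

The same comparison with the rule 2m + 1 would fail, since with m starting at 1 that rule never produces the element 1.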
"A ⊆ B" means x ∈ A ⇒ x ∈ B; that is, each element of A is also an element of B. It is read as: A is a subset of B, or B is a superset of A. It is also written as "B ⊇ A."

The statement "x is not an element of S" can be written x ∉ S; the statement "A is not a subset of B" is occasionally written as A ⊈ B. When S ⊆ X and S ≠ X, we say S is a proper subset of X, or X is a proper superset of S; this is sometimes written S ⊊ X or X ⊋ S. The symbols ⊂ and ⊃ are ambiguous: They are used for ⊆ and ⊇ by some mathematicians, and for ⊊ and ⊋ by other mathematicians. We shall not use ⊂ or ⊃ in this book.

1.15. Unfortunately, the terms "include" and "contain" are ambiguous. As they are commonly used in the mathematical literature, either of the statements "U includes V" or "U contains V" can have either of the meanings "U ∋ V" or "U ⊇ V." When the words "include" or "contain" are used, the reader must determine the intended meaning from context.

1.16. Some sets of numbers. Numbers are the basis of what most analysts consider to be "analysis." The list below shows some of the most commonly used sets of numbers.

    ℕ            positive integers (also known as natural numbers)
    ℤ            integers
    ℚ            rational numbers (quotients of integers)
    ℝ            real numbers (introduced formally in Chapter 10)
    [−∞, +∞]     extended reals (introduced in 1.17)
    ℂ            complex numbers (introduced in Chapter 10)
    𝕋            the circle group (introduced in Chapter 10)
    F            unspecified field, generally understood to be ℝ or ℂ
    𝔸, 𝔹, 𝔻      directed sets ("generalized numbers"; see 7.3)

In this book, ℕ and ℤ have their usual, classical meanings, and we assume a familiarity with the elementary properties of those sets of numbers. We assume an informal acquaintance with ℚ and ℝ, such as in college calculus. Relying on that informal acquaintance only for some illustrative examples, in later chapters we shall carefully develop basic ideas of orderings, techniques of computation, groups, and fields, leading up to formal definitions of ℚ, ℝ, and ℂ in Chapter 10.

Caution: Many mathematicians agree with our definition that ℕ = {1, 2, 3, ...}, but many others instead use the symbol ℕ to represent the set {0, 1, 2, 3, ...}.

Set theorists often find it useful to define the integers (and everything else) in terms of sets; see 1.46. E.g., Zermelo defined the nonnegative integers 0, 1, 2, 3, ... to be the sets ∅,
The term "infinity" has several different meanings in mathematics.. but more often it is defined to be 0. and so on. 0 neg. for which we define algebraic operations in the usual fashion.44. { { O } }. it is not a field. 1." "positive real. we have mentioned them here only to emphasize our reliance on some shared intuition about the standard (i. For purposes outside of set theory. oc 0 0 ~ I co co 0 +oc +oc I I re ll oc real +oc undef.17. but von Neumann's more complicated definition has a few advantages for purposes of set theory.c o be the names given to some two objects that are not real numbers.o c and +oc. not on a precise definition and list of properties.. it is not even an additive monoid. 1. Although it is possible to specify the positive integers uniquely by Peano's Axioms (see 14. 0 0 0 pos. to particular sets. Either of these definitions is manageable.Basic Notations for Sets 13 { O }." are abbreviations for "undefined.18. this is a very large finite number that gets larger without bound. von Neumann defined the nonnegative integers to be the sets {e}. 0 neg. In dealing with the integers. Later." and "neg.68 and 14." "1. { { { ~ } } }. denoted [oc. The reader of this book does not need to be familiar with the nonstandard integers. {e. language is sometimes used in a different fashion. +co} that is. that specification is nontrivial and rests on an understanding of conventional language. the real number system IR plus these two additional points." and "negative real. The resulting number system [oc. +co] is algebraically somewhat awkward unlike R. In the tables.e. "undef. as discussed in the preceding section. +oc].. indeed. . TIMES II+co +oc 0 oc cc I neg. Let +oc and . +oc +oc oc neg.52).28. (This point is discussed further by Hirsch [1995]. the object +co may also be abbreviated as oc." etc. as described in 5. and then the "integers" take on a new meaning. it is conceptually simpler not to attach the labels "0. 
Addition and multiplication are usually extended to this larger set of numbers by the rules indicated in the following tables. Thus N U {0} is a monoid and Z is a ring.) Instead we usually view the integers as indivisible objects.69. T h a t product may come as a surprise to some students and is discussed further in 15. is the set IR U { . however." "pos. and it is important not to confuse these with each other. In nonstandard analysis. +co co neg. 0 +oc 1. I ~ +co pos. We extend the ordering of R to this larger set by defining . and so on.. Our dealings with potential infinities may be simplified if we adjoin to R some i d e a l p o i n t s . as described in 14. The e x t e n d e d r e a l line. {e.c. and so it is now widely used in that field. conventional) integers. Preview of assorted infinities.o c ." The product of 0 and +oc is sometimes left undefined. Some older m a t h e m a t i c s books sometimes refer to a p o t e n t i a l i n f i n i t y such as limxl0 5. we shall rely on the reader's intuition. I pLUS II oc real +oc oc oc undef.o c < r < +co for all real numbers r. these notions are discussed in Chapter 8." "2.
then "a ham sandwich is better than nothing" can be written ham sandwich > 2~. card(T)} when S and T are infinite sets. think of them as entirely unrelated uses of the same strings of letters. for mathematicians have tamed infinity and made it entirely a secular matter. is an ordered field strictly larger than R. Yet another unrelated use of "infinity" is that in theology. (On the other hand. In fact." Yet another kind of infinity is the "number" of elements in an infinite set such as N or R. The beginner is urged to put aside any spiritual notions of infinity. an infinitely small but positive constant number plays a role similar to that played by the finite variable x in the expression "limxl0 f(x).71. In some other contexts. we obtain a more satisfactory algebraic system. cardinalities do not form a field.14 Chapter 1: Sets By adding many more ideal points.8 and 14.49). mathematics is not devoid of spiritual questions. it contains some numbers that are infinitely large and some numbers (besides zero) that are infinitely small. it is denoted by 2~ (or by { } in some books). this arithmetic should not be confused with the arithmetic of the hyperreal numbers. the unique solution of u 2 + 2u + 1 = 0 is u = . we consider this notation briefly in 5. it should be written as {x : x > true love} = 2~. *R. and the solution set is {1}." Similarly.20. introduced in 5." Rather.f). A s i n g l e t o n is a set {x} containing exactly one element. There are several different infinite cardinalities for instance. However.19. (For instance.1. x and {x} are used in substantially different ways.20.44. generally these two answers are interchangeable. the cardinalities of the sets N and Q are equal (see 2. Our several notions of the "infinite" are only distantly related.20. but the cardinality of the set R is larger (see 10. The objects x and {x} can never be equal (see 1. Among other things.44." 1. the first infinite ordinal is often denoted w. 
in contrast to the potential infinity mentioned above. Also related are the infinite ordinals. that "number" is called the c a r d i n a l i t y of the set. Some arithmetic of cardinalities is possible for instance. The set with no elements is called the e m p t y set (or null set). Thus. An infinitely large number is a constant that plays a role similar to the role played by finite variable such as the number x in the expression "limxT~ f(x).48. Infinite cardinal numbers are sometimes denoted by Nn. see particularly 6. we cannot conclude that "a ham sandwich is better than true love. there are infinitely many different sizes of infinities. that follows from 2.1 . "nothing is better than true love" should not be written as "~ > true love. and in some contexts the distinction between x and {x} is crucial.f). A few more sizes of sets. The h y p e r r e a l line. if we order things by our preferences. The word "nothing" is used in different ways in English.) . it is discussed in 10. however. To avoid confusion. However. and card(2 X) > card(X) for any set X. in later chapters we shall see that card(S x T) = max{card(S).) 1. For instance. so that no confusion is possible if we find it convenient to write x and {x} interchangeably. Some older mathematics books refer to it as an a c t u a l infinity.18.
The p o w e r s e t of a given set X is { S " S c_ X}. and Sx is a set for each A E A. . 9 then UXcL Sx C_ Ua~A Sx.k.x~} for some n E N U {0}.X3. Again suppose t h a t S . {x.X2.. The union of a sequence of sets S1.{Sx " A E A} is a set of sets i. . {0}. . . $2. .e.S I N S 2 N S 3 O ." Other sizes of sets will be discussed in 2. if S can be written in the form {Xl....e. thus by our definition any finite set is also a countable set. only to the sets t h a t we have called "countably infinite.20. n s~ k1 and n sk . k1 .~O. The set {Z} is a singleton.. 9 may be written as Uk=l Sk or as S1 U $2 U Sa U .s.e.21.or. WAYS TO COMBINE SETS 1..X2..e. y}} is a singleton in any case.1} is the set [P({0. The f % . y} is a singleton if and only if x = y.. {0. In particular.X2. i. if and only if it can be written in the form {xl. thus. It can be denoted by any of these expressions" AEA Other notations are available in certain special cases" The union of finitely many sets S1. $ 2 . equivalently. . {1}.. Suppose t h a t g . ... ... It can be denoted by any of these expressions: AEA The expressions n cxD n sk . Ux~o Sx is just the empty set. . We emphasize t h a t repetitions are permitted.22. and {{x. . A is a set.x2. n s2 o. It is also denoted 2 X. 1}) .x3. $3. (We permit n = 0.. U Sn. A set S is c o u n t a b l e if it is empty or can be written in the form {Xl. Note t h a t if L c_ A.. ..{~}..X3. A set t h a t is not countable is u n c o u n t a b l e .} without repetitions i. A set S is f i n i t e if the number of elements in S is a finite number i..) A set t h a t cannot be so written is i n f i n i t e .} without repetitions. We shall usually write the power set of X as ~P(X). 1}}.16. Sn may be written as U~=l Sk or as S1 U $2 U . the set of all subsets of X.Ways to Combine Sets 15 Examples for beginners to think about.}. the power set of {0. Then the u n i o n of the Sx's is the set {x 9x E Sx for at least one A}.{S~ 9A E A} is a set of sets. 1.23. 
the e m p t y set is a finite set. (:X2 1. which is a singleton so it is not empty. for reasons discussed in 2.. A set S is c o u n t a b l y i n f i n i t e if it is countable and infinite ... it has one element.X3. For instance. Caution: Some m a t h e m a t i c i a n s use these terms a little differently and apply the t e r m "countable" only to the sets of the form {Xl. Then the i n t e r s e c t i o n of the S~'s is the set {x" x E S~ for every A}.s power set of the empty set is ~P(~) .
are interpreted in a fashion analogous to that for unions. If $L \subseteq \Lambda$ and $L \neq \varnothing$, then $\bigcap_{\lambda \in L} S_\lambda \supseteq \bigcap_{\lambda \in \Lambda} S_\lambda$. The expression $\bigcap_{\lambda \in \varnothing} S_\lambda$ is not meaningful without further specification, but the following convention is often useful: If the $S_\lambda$'s are all subsets of a fixed set $X$ whose choice is understood, then we may agree to let $\bigcap_{\lambda \in \varnothing} S_\lambda$ mean $X$.

We say that two sets meet if their intersection is nonempty; otherwise the sets are disjoint. Note that $\varnothing$ and any set are disjoint. A collection of sets is disjoint (or, for emphasis, pairwise disjoint) if each pair of distinct sets in the collection is disjoint. A partition of a set $X$ is a collection of pairwise disjoint sets that have union equal to $X$.

A collection of sets $\{S_\lambda : \lambda \in \Lambda\}$ is fixed if its intersection $\bigcap_{\lambda \in \Lambda} S_\lambda$ is nonempty; the collection is free if its intersection is empty. We emphasize that this does not refer to the pairwise intersection. For instance, if $a, b, c$ are distinct objects, then the collection $\{\{a,b\}, \{a,c\}, \{b,c\}\}$ is free but not disjoint. Of course, a collection consisting of just one nonempty set is disjoint and not free. Any disjoint collection of two or more sets is free.

1.24. If $S$ and $X$ are sets, then the complement (or relative complement) of $S$ in $X$ is the set
\[ X \setminus S = \{x \in X : x \notin S\}. \]
For example, $\{a,b,c\} \setminus \{c,d\} = \{a,b\}$. We emphasize that $S$ is not necessarily a subset of $X$. Caution: Some mathematicians write the set $X \setminus S$ instead as $X - S$. However, in contexts where subtraction is meaningful (e.g., if $S$ and $X$ are subsets of $\mathbb{R}$), that also can be interpreted as $\{x - s : x \in X,\ s \in S\}$.

If the choice of $X$ is clear and/or does not need to be mentioned explicitly, and $S$ is a subset of $X$, then the relative complement of $S$ in $X$ may be written more briefly as $\complement S$. (Some mathematicians write this as $CS$ or $S^c$ or $S'$.) The $\complement$ notation is useful especially when we are considering many subsets $R, S, T$, etc., of a single set $X$. The $\complement$ notation simplifies the appearance of many results; for instance, note that
\[ S \setminus T = S \cap \complement T, \qquad \complement\complement S = S, \qquad S \subseteq T \iff \complement S \supseteq \complement T. \]
More generally, $\complement^n S = S$ when $n$ is even and $\complement^n S = \complement S$ when $n$ is odd. Here we adopt the convention that $\complement^0 S = S$ and $\complement^{n+1} S = \complement\complement^n S$; this exponential notation will be particularly helpful in Chapter 13. The symbol $\complement$ will be given a more general meaning in Chapter 13; see the discussion there.

1.25. Also simplified by the $\complement$ notation are De Morgan's Laws for sets:
\[ \complement\Big(\bigcup_{\lambda \in \Lambda} S_\lambda\Big) = \bigcap_{\lambda \in \Lambda} \complement S_\lambda, \qquad \complement\Big(\bigcap_{\lambda \in \Lambda} S_\lambda\Big) = \bigcup_{\lambda \in \Lambda} \complement S_\lambda. \]
That is: The complement of a union is the intersection of complements, and vice versa. The proofs are an easy exercise.

1.26. There is a duality (as in 1.7) between statements about any collection of sets and statements about the complements of those sets. By De Morgan's Laws, the duality transforms unions to intersections, and vice versa. This duality is order-reversing.

A collection $\{S_\lambda : \lambda \in \Lambda\}$ of subsets of a set $X$ is said to be a cover, or covering, of $X$ if $\bigcup_{\lambda \in \Lambda} S_\lambda = X$. Note that this condition is satisfied if and only if $\{\complement S_\lambda : \lambda \in \Lambda\}$ is free (where $\complement$ denotes complement in $X$). Thus, for subsets of a given set $X$, "cover" and "free" are dual concepts, in the sense of the duality between sets and their complements. Also note that a partition is the same thing as a disjoint covering.

Examples. Let $X = \{a, b, c, d, e\}$ consist of five distinct elements. Then the collection of sets $\{\{a,b\}, \{c\}, \{d,e\}\}$ is a partition of $X$. However, $\{\{a,b\}, \{c\}, \{d\}\}$ is a disjoint collection but not a partition or a cover. Also, $\{\{a,b\}, \{b,c\}, \{c,d\}, \{d,e\}\}$ is a free cover.

Further definitions. Suppose that $\mathcal{S} = \{S_\lambda : \lambda \in \Lambda\}$ is a cover of a set $X$. Then
• A subcover is a cover of $X$ that is of the form $\{S_\lambda : \lambda \in \Lambda_0\}$ for some set $\Lambda_0 \subseteq \Lambda$.
• A refinement of $\mathcal{S}$ is a cover $\mathcal{T} = \{T_\mu : \mu \in M\}$ of $X$ with the property that each $T_\mu$ is contained in some $S_\lambda$.
• A precise refinement of $\mathcal{S}$ is a cover of $X$ of the form $\mathcal{T} = \{T_\lambda : \lambda \in \Lambda\}$ (with the same set $\Lambda$), such that $T_\lambda \subseteq S_\lambda$ for each $\lambda$.
Of course, any precise refinement is a refinement.

1.27. The symmetric difference of two sets $S$ and $T$ is the set
\[ S \triangle T = (S \setminus T) \cup (T \setminus S) = (S \cup T) \setminus (S \cap T) = (S \cap \complement T) \cup (\complement S \cap T) = \{x : x \text{ is in } S \text{ or in } T \text{ but not both}\}. \]
For instance, $\{a,b,c\} \triangle \{b,c,d\} = \{a,d\}$. Note that $S \triangle T = T \triangle S$, that $S \triangle S = \varnothing$, and that $S \triangle \varnothing = S$. Also, we have $\complement(S \triangle T) = S \triangle (\complement T) = (\complement S) \triangle T$.

Exercise. Show that $(R \triangle S) \triangle T = R \triangle (S \triangle T) = \{x : x$ is in exactly one or three of the sets $R, S, T\}$. More generally, use induction to show that
\[ S_1 \triangle S_2 \triangle \cdots \triangle S_n = \{x : x \text{ is in an odd number of the sets } S_j\}. \]
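The identities of 1.24 through 1.27 can be spot-checked on small finite sets. The following Python sketch is only an illustration, not a proof; the ambient set `X`, the sample families, and the helper names (`complement`, `union_all`, `intersection_all`, `in_odd_number`) are assumptions introduced here and do not come from the text.

```python
from functools import reduce

X = set(range(1, 8))                 # ambient set, chosen only for illustration
family = [{1, 2}, {1, 3}, {2, 3}]    # plays the role of {{a,b},{a,c},{b,c}}

def complement(s, universe=X):
    """Relative complement of s in the ambient set."""
    return universe - s

def union_all(sets):
    return set().union(*sets)

def intersection_all(sets, universe=X):
    # Convention from the text: an empty intersection means the whole set X.
    return reduce(lambda acc, s: acc & s, sets, set(universe))

# De Morgan's Laws for this particular family.
de_morgan_1 = complement(union_all(family)) == intersection_all(
    [complement(s) for s in family])
de_morgan_2 = complement(intersection_all(family)) == union_all(
    [complement(s) for s in family])

# "Free" looks at the total intersection; "disjoint" looks at pairs.
is_free = intersection_all(family) == set()
is_disjoint = all(not (s & t)
                  for i, s in enumerate(family) for t in family[i + 1:])

def in_odd_number(sets):
    """Elements that belong to an odd number of the given sets."""
    return {x for x in union_all(sets) if sum(x in s for s in sets) % 2 == 1}

triple = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
sym_diff = reduce(lambda s, t: s ^ t, triple, set())   # iterated symmetric difference
```

Here `family` is free but not disjoint, and `sym_diff` agrees with `in_odd_number(triple)`, as the exercise in 1.27 predicts.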
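1.28. A Venn diagram is used to indicate the unions, intersections, and complements of several sets. Typically, a Venn diagram is used for two or three subsets of a larger set $X$ (which is sometimes called "the universe" or "the universal set" in this context; see 1.44). The set $X$ may be represented by a rectangle, and its subsets $A$, $B$, and $C$ are represented as disks contained in that rectangle. Two of these diagrams are shown below. If no assumptions are made about the sets $A$, $B$, $C$, then they are drawn "in general position," i.e., so that each of the eight sets
\[ A \cap B \cap C, \quad A \cap B \cap \complement C, \quad A \cap \complement B \cap C, \quad A \cap \complement B \cap \complement C, \quad \complement A \cap B \cap C, \quad \complement A \cap B \cap \complement C, \quad \complement A \cap \complement B \cap C, \quad \complement A \cap \complement B \cap \complement C \]
is represented by a single nonempty region in the rectangle. (See the first diagram.)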
If some relationship between the sets is known, then this may be reflected in the diagram. For instance, if we know that $A \subseteq C$, then we may draw the disk for $A$ inside the disk for $C$. (See the second diagram.)

[Figure: two Venn diagrams, each a rectangle $X$ containing disks $A$, $B$, $C$. The shaded region is $(A \triangle B) \setminus C$. In the second diagram $A \subseteq C$, and then $(A \triangle B) \setminus C = B \setminus C$.]

Do not rely too heavily on Venn diagrams or other figures, particularly complicated ones, for they can be erroneous in subtle ways. A common error is to attribute to a figure more generality than it truly possesses and thus to overlook certain special cases not explained by the figure. However, simple diagrams can be trusted if constructed carefully, and, in any case, diagrams can be used to help us find other proofs that do not rely on diagrams. (Some further remarks about the limitations of diagrams appear in Chapter 15.)

1.29. Distributive laws. The following equations occur in dual pairs. In each case it is only necessary to prove one equation; the other then follows by duality using De Morgan's Laws (1.25).

a. Intersection and union distribute over each other. That is:
\[ S \cap (T \cup U) = (S \cap T) \cup (S \cap U) \quad\text{and}\quad S \cup (T \cap U) = (S \cup T) \cap (S \cup U) \]
for all sets $S, T, U$.

b. In fact, intersection and union are infinitely distributive over each other:
\[ \Big(\bigcup_{\alpha \in A} S_\alpha\Big) \cap \Big(\bigcup_{\beta \in B} T_\beta\Big) = \bigcup_{(\alpha,\beta) \in A \times B} (S_\alpha \cap T_\beta) \quad\text{and}\quad \Big(\bigcap_{\alpha \in A} S_\alpha\Big) \cup \Big(\bigcap_{\beta \in B} T_\beta\Big) = \bigcap_{(\alpha,\beta) \in A \times B} (S_\alpha \cup T_\beta) \]
for any index sets $A$, $B$ and any sets $S_\alpha$, $T_\beta$.
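As a sanity check on 1.29.b, the two distributive identities can be evaluated for small finite families. This is a minimal sketch; the families `S_family` and `T_family` are arbitrary illustrative choices (assumptions of this example), and a finite check of course proves nothing in general.

```python
from itertools import product

S_family = [{1, 2}, {2, 3}]          # the sets S_alpha (illustrative)
T_family = [{2, 4}, {3, 4}, {1}]     # the sets T_beta (illustrative)

# (union of S's) ∩ (union of T's) = union over all pairs of (S_alpha ∩ T_beta)
lhs_union = set().union(*S_family) & set().union(*T_family)
rhs_union = set().union(*(s & t for s, t in product(S_family, T_family)))

# (intersection of S's) ∪ (intersection of T's)
#     = intersection over all pairs of (S_alpha ∪ T_beta)
lhs_meet = set.intersection(*S_family) | set.intersection(*T_family)
rhs_meet = set.intersection(*(s | t for s, t in product(S_family, T_family)))
```

The pairs $(\alpha, \beta)$ range over the product of the two index sets, which is exactly what `itertools.product` enumerates.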
1.30. Closure under operations. Let $\mathcal{S}$ be a collection of subsets of a set $X$. We say that $\mathcal{S}$ is closed under some set operation if performing that operation on members of $\mathcal{S}$ yields another member of $\mathcal{S}$. For instance, the collection $\mathcal{S}$ is
• closed under complementation if $S \in \mathcal{S} \Rightarrow \complement S \in \mathcal{S}$;
• closed under finite union if $S_1, S_2 \in \mathcal{S} \Rightarrow S_1 \cup S_2 \in \mathcal{S}$ or, equivalently, if $S_1, S_2, \ldots, S_n \in \mathcal{S} \Rightarrow S_1 \cup S_2 \cup \cdots \cup S_n \in \mathcal{S}$ for each positive integer $n$;
• closed under countable union if $S_1, S_2, S_3, \ldots \in \mathcal{S} \Rightarrow \bigcup_{j=1}^{\infty} S_j \in \mathcal{S}$;
• closed under arbitrary union if $\{S_\lambda : \lambda \in \Lambda\} \subseteq \mathcal{S} \Rightarrow \bigcup_{\lambda \in \Lambda} S_\lambda \in \mathcal{S}$.
We define closures under intersections analogously. These "closures" are special cases of Moore closures; see Chapter 4, where this notion is generalized.

FUNCTIONS AND PRODUCTS OF SETS

1.31. A function (or map or mapping or operator or operation) from a set $X$ into a set $Y$ is a rule that assigns to each argument $x \in X$ a unique value $f(x) \in Y$. This is not really a definition; we rely on the reader's intuition about what a "rule" is. However, in 1.36 we give an alternate definition that is less intuitive but more precise, in terms of subsets of products of sets. The concept of "function" evolved over several centuries; some earlier definitions are listed by Rüthing [1984].

We write $f : X \to Y$ to abbreviate the statement that $f$ is a function from $X$ into $Y$. We say $f$ is a function on $X$, or $f$ is defined on $X$. The set $X$ is the domain of $f$, often abbreviated Domain($f$) or Dom($f$). The set $Y$ is the codomain of $f$. It should not be confused with the range of $f$, which is the set $\{f(x) : x \in X\}$, often abbreviated Range($f$) or Ran($f$). The function $f$ is called surjective (or onto, or a surjection) if the range is equal to the codomain. (See also Chapter 2.)

The distinction between range and codomain may confuse some beginners. The range is a very specific set: it is the set of all the values taken on by the function. The codomain of the function is a somewhat arbitrary or nominal set; the codomain is any convenient set large enough to contain the range of the function. We may choose to describe the function in terms of the codomain instead of the range because we do not actually know the range. Another reason is so that we can compare several functions that have different ranges. For instance, the functions $f(x) = x^2$ and $g(x) = x^3$ (both defined for real numbers $x$) have different ranges, but they can both be viewed as having codomain $\mathbb{R}$, thus permitting us to ask such questions as: Is $f(x)$ always less than $g(x)$?
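The notions of closure in 1.30 and of range versus codomain in 1.31 can be made concrete on finite data. The sketch below is only illustrative: it replaces the real line by a small integer domain (an assumption made so that everything is computable), and the helper names `closed_under_union`, `value_range`, and `is_surjective` are hypothetical.

```python
def closed_under_union(collection):
    """True if the union of any two members is again a member (finite union)."""
    return all((s | t) in collection for s in collection for t in collection)

chain = {frozenset(), frozenset({1}), frozenset({1, 2})}
singletons = {frozenset({1}), frozenset({2})}

def value_range(f, domain):
    """The range: the set of values actually taken on by f."""
    return {f(x) for x in domain}

def is_surjective(f, domain, codomain):
    """Surjective means the range equals the chosen codomain."""
    return value_range(f, domain) == set(codomain)

domain = range(-3, 4)          # finite stand-in for the real line
codomain = range(-27, 28)      # one convenient codomain containing both ranges

def f(x):
    return x ** 2

def g(x):
    return x ** 3

# With a shared codomain the question "is f(x) always less than g(x)?" makes sense.
f_below_g = all(f(x) < g(x) for x in domain)
```

The two functions have different ranges on this domain, yet both fit inside the same codomain, which is what allows the comparison `f_below_g` to be posed at all.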
1.32. An ordered pair is an ordered list $(y_1, y_2)$ consisting of two mathematical objects $y_1, y_2$, which may or may not be different from each other. The ordered pair is then a new mathematical object. It is understood that the order of the objects is being noted. Thus, two ordered pairs $(y_1, y_2)$ and $(z_1, z_2)$ are considered to be equal (i.e., to be representations of the same mathematical object) if and only if $y_1 = z_1$ and $y_2 = z_2$. The unordered sets $\{1, 2\}$ and $\{2, 1\}$ are considered to be the same, but the ordered pairs $(1, 2)$ and $(2, 1)$ are different.

For some purposes in set theory (discussed in 1.46), it is convenient to view an ordered pair as a special kind of set; in particular, the ordered pair $(y, z)$ can be represented by the set $\{\{y\}, \{y, z\}\}$. This representation (which is not used in most branches of mathematics outside of set theory) preserves the essential property of ordered pairs stated above.

We now generalize. For any nonnegative integer $n$, an ordered $n$-tuple (or finite sequence, with length $n$) is a list of $n$ objects, i.e., an object expressed in any of the forms
\[ (y_1, y_2, \ldots, y_n) = (y_j : j = 1, 2, \ldots, n) = (y_j)_{j=1}^{n} = (y_j), \]
where $y_1, y_2, \ldots, y_n$ are any mathematical objects. The notation $(y_j)$ can only be used if the value of $n$ is understood. An ordered $n$-tuple can also be written as a column with entries $y_1, y_2, \ldots, y_n$. This notation is used chiefly when the $y_j$'s are numbers, but it may be used for other $y_j$'s as well.

There are several ways to represent $n$-tuples in terms of other objects (and we usually do not need to concern ourselves about which of these representations is being used). One representation is as an iteration of ordered pairs:
\[ (x_1, x_2, \ldots, x_n) = ((x_1, x_2, \ldots, x_{n-1}), x_n). \]
Another way to view an ordered $n$-tuple is as a function with domain $\{1, 2, \ldots, n\}$; its value at the argument $j$ is $y_j$. Thus, if we represent the same function by $f$, we will have $f(j) = y_j$. In particular, an ordered pair may be viewed as a function with domain $\{1, 2\}$.

A sequence (or infinite sequence) is an object of the form
\[ (y_1, y_2, y_3, \ldots) = (y_j : j \in \mathbb{N}) = (y_j)_{j=1}^{\infty} = (y_j). \]
A sequence is a function with domain $\mathbb{N}$. Again, the notation $(y_j)$ can only be used if the choice of the domain is understood. Two finite or infinite sequences $(x_j)$ and $(y_j)$ are considered to be equal if and only if they have the same length and satisfy $x_j = y_j$ for all $j$. A sequence $(y_k)$ is a subsequence of a sequence $(x_j)$ if $y_k = x_{p_k}$ for some positive integers $p_1 < p_2 < p_3 < \cdots$. For instance, $(3, 9, 27, 81, \ldots)$ is a subsequence of $(1, 3, 5, 7, 9, \ldots)$.

More generally, any function with domain $\Lambda$ may be viewed as a "$\Lambda$-tuple" $(y_\lambda)$, where $y_\lambda$ is the value of the function at the argument $\lambda$. The object $y_\lambda$ is called the $\lambda$th component (or element or entry or value) of the $\Lambda$-tuple. Again, the notation $(y_\lambda)$ can only be used if the choice of $\Lambda$ is understood. The notation of $\Lambda$-tuples is used mainly when $\Lambda$ is equipped with some sort of ordering (see especially Chapter 7), but that is not a requirement. In particular, in the ordered pair $(x, y)$, the objects $x$ and $y$ are the first and second components, respectively.
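Two ideas from 1.32 lend themselves to a quick experiment: the set-theoretic representation $\{\{y\}, \{y, z\}\}$ of an ordered pair, and the definition of a subsequence. The following sketch uses Python's `frozenset` so that sets can be members of sets; the function names are hypothetical, and the subsequence test is restricted to finite lists as a simplifying assumption.

```python
def kuratowski_pair(y, z):
    """Represent the ordered pair (y, z) by the set {{y}, {y, z}}."""
    return frozenset({frozenset({y}), frozenset({y, z})})

def is_subsequence(ys, xs):
    """True if ys can be obtained from xs by keeping terms at increasing indices."""
    remaining = iter(xs)
    # `y in remaining` advances the iterator past the first match,
    # so later terms must be found at strictly larger indices.
    return all(y in remaining for y in ys)

odds = list(range(1, 100, 2))      # finite stand-in for (1, 3, 5, 7, 9, ...)
powers_of_three = [3, 9, 27, 81]
```

The representation distinguishes $(1, 2)$ from $(2, 1)$ even though $\{1, 2\} = \{2, 1\}$, and `is_subsequence` rejects a list whose terms appear in the wrong order.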
If interpreted properly, this $\Lambda$-tuple notation is occasionally useful, because it emphasizes the conceptual similarity between $n$-tuples and more general functions. However, this notation is not standard and should only be used with caution.

We may sometimes refer to $(y_\lambda)$ as a parametrized set; then $\Lambda$ is the parameter set. We emphasize that an ordering on $\Lambda$ may or may not be present; it may be stated explicitly or may be implied in a particular context; and, in any particular context, it may be of great or small importance. We may occasionally write $\Lambda = \{\alpha, \beta, \gamma, \ldots\}$ and $(y_\lambda) = (y_\alpha, y_\beta, y_\gamma, \ldots)$, where the indices $\alpha, \beta, \gamma, \ldots$ are intended to represent typical elements of $\Lambda$. This representation should only be used with caution. It may give some readers the impression that the parameter set $\Lambda$ is a sequence, but that meaning is not intended.

Throughout this book, we use braces $\{\ \}$ for unordered sets and parentheses $(\ )$ for sequences or other parametrized sets. Note that the mathematical literature does not always observe this notational convention.

1.33. The product of $n$ sets $S_1, S_2, \ldots, S_n$ is the set of ordered $n$-tuples
\[ \prod_{j=1}^{n} S_j = S_1 \times S_2 \times \cdots \times S_n = \{(x_1, x_2, \ldots, x_n) : x_j \in S_j \text{ for all } j\}. \]
The product of a sequence of sets $S_1, S_2, S_3, \ldots$ is the set of sequences
\[ \prod_{j=1}^{\infty} S_j = S_1 \times S_2 \times S_3 \times \cdots = \{(x_1, x_2, x_3, \ldots) : x_j \in S_j \text{ for all } j\}. \]
The product of an arbitrary collection of sets $(S_\lambda : \lambda \in \Lambda)$ is the set
\[ \prod_{\lambda \in \Lambda} S_\lambda = \{(y_\lambda)_{\lambda \in \Lambda} : y_\lambda \in S_\lambda \text{ for all } \lambda\}. \]
In other words, it is the collection of all functions $f : \Lambda \to \bigcup_{\lambda \in \Lambda} S_\lambda$ that satisfy $f(\lambda) \in S_\lambda$ for each $\lambda \in \Lambda$. This collection of functions may also be viewed as a collection of $\Lambda$-tuples, as noted in 1.32. If we write $\Lambda = \{\alpha, \beta, \gamma, \ldots\}$, then the product $\prod_{\lambda \in \Lambda} S_\lambda$ may be written as $S_\alpha \times S_\beta \times S_\gamma \times \cdots$.

The set $A \times B$ is not the same as $B \times A$, but for some purposes $A \times B$ is "essentially the same" as $B \times A$, and a rearrangement of ordering may be clear in some contexts. For some purposes it is convenient to be able to say that
\[ \prod_{\lambda \in \Lambda} S_\lambda = \Big(\prod_{\lambda \in \Lambda_1} S_\lambda\Big) \times \Big(\prod_{\lambda \in \Lambda_2} S_\lambda\Big) \]
whenever $\{\Lambda_1, \Lambda_2\}$ is a partition of $\Lambda$, but this equation is only valid after an obvious rearrangement of the ordering of $\Lambda$ and removal of some parentheses.
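The description of a product in 1.33 as a collection of functions $f$ with $f(\lambda) \in S_\lambda$ can be sketched for a finite index set. In the code below each element of the product is modeled as a Python `dict` from indices to values; the name `product_of_sets` and the sample family are assumptions of this example.

```python
from itertools import product as cartesian

def product_of_sets(family):
    """family maps each index to a finite set S_index.

    Returns every function f (as a dict) with f[index] in family[index].
    """
    indices = list(family)
    return [dict(zip(indices, choice))
            for choice in cartesian(*(family[i] for i in indices))]

family = {"alpha": {0, 1}, "beta": {"x", "y", "z"}}
elements = product_of_sets(family)
```

Because each element is keyed by index rather than by position, the order in which the index set happens to be listed plays no role, which matches the remark that an ordering on the parameter set need not be present.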
1.34. When all the $S_\lambda$'s are equal to one set $S$, then their product $\prod_{\lambda \in \Lambda} S_\lambda$ may also be written as $S^\Lambda$. It is called the $\Lambda$th power of $S$. Thus,
\[ S^\Lambda = \{f : f \text{ is a function from } \Lambda \text{ into } S\}. \]
It is related to, but should not be confused with, the power set of $S$. If $\Lambda$ contains just $n$ elements for some positive integer $n$, then $S^\Lambda$ may also be written as $S^n$.

1.35. Associated with any product of sets $P = \prod_{\lambda \in \Lambda} S_\lambda$ is another collection of mappings. For each $\lambda \in \Lambda$, the $\lambda$th coordinate projection is the mapping $\pi_\lambda : P \to S_\lambda$ (surjective whenever $P$ is nonempty) given by
\[ \pi_\lambda(f) = f(\lambda) \quad\text{or}\quad \pi_\lambda(x_\alpha, x_\beta, x_\gamma, \ldots) = x_\lambda, \]
depending on whether we view $\prod_{\lambda \in \Lambda} S_\lambda$ as a collection of functions $f$ with domain $\Lambda$ or as a collection of $\Lambda$-tuples. If $\Lambda$ is the set $\{1, 2, \ldots, n\}$ or $\mathbb{N}$, then the $j$th coordinate projection will be denoted by $\pi_j$; it is the map that takes the $n$-tuple $(x_1, x_2, \ldots, x_n)$ or the sequence $(x_1, x_2, x_3, \ldots)$ to $x_j$. Notations for this mapping vary throughout the literature; the notation $\pi_\lambda$ will be used for coordinate projections throughout most of this book. The term "projection" has other meanings as well; examples appear in Chapters 8 and 22.

1.36. The graph of a function $f : X \to Y$ is the set of ordered pairs
\[ \mathrm{Graph}(f) = \mathrm{Gr}(f) = \{(x, f(x)) : x \in X\} \subseteq X \times Y. \]
We sometimes identify a function with its graph; with this viewpoint, a function may be viewed as a set of ordered pairs. Thus, a function from $X$ into $Y$ is simply a subset of $X \times Y$ with the property that each element of $X$ is the first component of one and only one of the ordered pairs that are members of that subset. On the other hand, we noted in 1.32 that an ordered pair may be viewed as a function with domain $\{1, 2\}$. To avoid confusion or circular reasoning, generally we do not adopt both of these viewpoints simultaneously.

1.37. Degenerate examples. The empty function is the rule that makes no assignments; its domain and graph are both the empty set. If $S$ is any set, then $S^\varnothing = \{\varnothing\}$, since the only rule assigning to each element of $\varnothing$ a corresponding element of $S$ is the empty function. If $S$ is any nonempty set, then $\varnothing^S = \varnothing$, since there is no rule that assigns to each element of $S$ a corresponding element of $\varnothing$. We emphasize that $\varnothing \neq \{\varnothing\}$.

1.38. An example using products. Any intersection of unions can be expressed as a union of intersections, and conversely: For any sets $C$ and $A_\gamma$ ($\gamma \in C$) and $S_{\gamma,\alpha}$ ($\alpha \in A_\gamma$, $\gamma \in C$), we have
\[ \bigcap_{\gamma \in C} \bigcup_{\alpha \in A_\gamma} S_{\gamma,\alpha} = \bigcup_{f \in \prod_{\gamma \in C} A_\gamma} \bigcap_{\gamma \in C} S_{\gamma, f(\gamma)} \quad\text{and}\quad \bigcup_{\gamma \in C} \bigcap_{\alpha \in A_\gamma} S_{\gamma,\alpha} = \bigcap_{f \in \prod_{\gamma \in C} A_\gamma} \bigcup_{\gamma \in C} S_{\gamma, f(\gamma)}. \]
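The degenerate powers of 1.37 and the first identity of 1.38 can both be checked by enumerating functions explicitly. This sketch models a function as a `dict`; the helper `functions` and the doubly indexed family `S` are hypothetical names and data chosen only for illustration.

```python
from itertools import product

def functions(domain, codomain):
    """All functions from a finite domain into a finite codomain, as dicts."""
    domain = list(domain)
    return [dict(zip(domain, values))
            for values in product(list(codomain), repeat=len(domain))]

# A doubly indexed family S[gamma][alpha]; the index sets A_gamma differ.
S = {
    "c1": {"a": {1, 2}, "b": {3}},
    "c2": {"p": {2, 3}, "q": {1, 4}},
}

# Left side: intersection over gamma of the union over alpha.
lhs = set.intersection(*(set().union(*S[c].values()) for c in S))

# Right side: union over choice functions f (one alpha per gamma)
# of the intersection over gamma of S[gamma][f(gamma)].
choice_functions = [dict(zip(S, choice)) for choice in product(*(S[c] for c in S))]
rhs = set().union(*(set.intersection(*(S[c][f[c]] for c in S))
                    for f in choice_functions))
```

The list `choice_functions` is exactly the product of the index sets $A_\gamma$, which is why products of sets appear in the identity.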
Note that if $C$ and the $A_\gamma$'s are finite sets, then $\prod_{\gamma \in C} A_\gamma$ is also a finite set; hence any finite intersection of finite unions can be represented as a finite union of finite intersections, and conversely. An analogous statement is not true for countable intersections and unions; see the remarks in Chapter 13.

1.39. More notations for functions. The symbol $\mapsto$ stands for "maps to," and is sometimes used to indicate by individual values the rule that defines a function. For instance, the function defined by $f(x) = x^2$ could also be written as $x \mapsto x^2$.

In some cases the rule defining a function $f$ is given explicitly (as by $x \mapsto x^2$). In other cases the rule is only given implicitly or indirectly, and it may be necessary to verify that the function is well defined, i.e., that the description of $f$ does in fact determine a function. In some cases we do not specify the rule, but simply postulate its existence; this is the effect of the Axiom of Choice (see (AC3) in Chapter 6).

A function $f$ may also be denoted by the expression "$f(\cdot)$," with the raised dot showing where the argument should be inserted. This notation is useful in more complicated expressions. For instance, if $F$ is a mapping from $X \times Y$ into $Z$, then
• for each fixed $x \in X$ we obtain a mapping $\varphi_x = F(x, \cdot) : Y \to Z$, defined by $y \mapsto F(x, y)$; and
• for each fixed $y \in Y$ we obtain a mapping $\psi_y = F(\cdot, y) : X \to Z$, defined by $x \mapsto F(x, y)$.
Thus the one function $F$ determines two families of functions, $\{\varphi_x : x \in X\}$ and $\{\psi_y : y \in Y\}$. We may say that these two families are dual to one another, since each family determines the other. The functions in one family have the index set of the other family as their domain. It should be noted that the different functions $\varphi_x$ are not necessarily distinct; it is possible for two different points $x, x'$ in $X$ to have the same action on $Y$, and yet differ in some other respect that is not under consideration. Thus the mapping $x \mapsto \varphi_x$, from $X$ to $Z^Y$, is not necessarily injective. When two families of functions are dual to each other, notations such as $\langle\ ,\ \rangle$ or $(\ ,\ )$ are often used. Thus, $F(x, y)$ might be written as $\varphi_x(y)$, or $\psi_y(x)$, or perhaps as $\langle x, y \rangle$.

Let $f$ be a function. Some other notations for $f(x)$ are
• $f_x$, used as in 1.32, especially when $X = \mathbb{N}$. However, be aware that $f_x$ has other meanings as well; for instance, it can mean $\partial f / \partial x$, the partial derivative of $f$ with respect to $x$. The reader must interpret "$f_x$" from the context.
• $fx$, used especially when $f$ is linear (see Chapter 11). However, this simple juxtaposition of symbols also often means composition or multiplication. Again, the reader must interpret "$fx$" from the context.
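The two dual families $\varphi_x = F(x, \cdot)$ and $\psi_y = F(\cdot, y)$ of 1.39 correspond to fixing one argument of a two-argument function. The sketch below picks a hypothetical $F(x, y) = x^2 y$ on small finite sets (both the formula and the sets are assumptions of this example) to show that $x \mapsto \varphi_x$ need not be injective.

```python
def F(x, y):
    """A hypothetical mapping from X x Y into Z, chosen for illustration."""
    return (x * x) * y

X = [-1, 0, 1, 2]
Y = [0, 1, 2]

def phi(x):
    """phi(x) is the function F(x, .) : Y -> Z."""
    return lambda y: F(x, y)

def psi(y):
    """psi(y) is the function F(., y) : X -> Z."""
    return lambda x: F(x, y)

def table(func, domain):
    """Record the action of func on a finite domain, so functions can be compared."""
    return tuple(func(t) for t in domain)

# Two different points of X with the same action on Y.
same_action = table(phi(-1), Y) == table(phi(1), Y)
```

Each family recovers $F$, since `phi(x)(y)` and `psi(y)(x)` both equal `F(x, y)`; comparing value tables is how the sketch decides whether two members of a family coincide.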
1.40. A binary operation on a set $X$ is a mapping from $X \times X$ into $X$. It is often written in the form $(x, y) \mapsto x \,\square\, y$, for some symbol $\square$. Familiar examples are addition ($+$) and multiplication ($\cdot$) on $\mathbb{R}$, and union ($\cup$), intersection ($\cap$), and symmetric difference ($\triangle$) on $\mathcal{P}(\Omega)$ for any set $\Omega$; further examples will be given in later chapters.

A binary operation $\square$ is commutative, or Abelian, if $x \,\square\, y = y \,\square\, x$ for all $x, y \in X$. A binary operation $\square$ is associative if $x \,\square\, (y \,\square\, z) = (x \,\square\, y) \,\square\, z$ for all $x, y, z \in X$. When this condition is satisfied, then parentheses are not needed; both sides of the equation above can be represented more simply as $x \,\square\, y \,\square\, z$. By repeated uses of this rule, we find that parentheses are not needed in an expression such as $x_1 \,\square\, x_2 \,\square \cdots \square\, x_n$ for any positive integer $n$. If the operation $\square$ is both commutative and associative, then the value of an expression such as $x_1 \,\square\, x_2 \,\square \cdots \square\, x_n$ does not depend on the order of the $x_j$'s.

In some instances, a binary operation is indicated by juxtaposition of the arguments, i.e., the symbol $\square$ may be omitted altogether when the meaning is clear, thus: $(x, y) \mapsto xy$. A function written this way is often called multiplication, although its behavior may differ significantly from the behavior of multiplication of real numbers; e.g., it need not be commutative (examples appear in Chapter 2).

Another symbol commonly used for a binary operation is "$+$," called addition. Algebraists occasionally use $+$ for a noncommutative operation, but analysts generally do not. In this book addition ($+$) will always denote a commutative operation. When addition or subtraction are used for binary operations, they customarily are applied last, after any other operations in the expression. For instance, $ax + b - c/d$ is generally interpreted to mean $(ax) + b - (c/d)$.

Let $\square$ be a binary operation on a set $X$, let $S$ be a set, and let $f : S \times X \to X$ be some function. We say that $f$ distributes (or is distributive) over $\square$ if
\[ f(s, x \,\square\, y) = f(s, x) \,\square\, f(s, y) \quad\text{for all } s \in S \text{ and } x, y \in X. \]
A familiar example is that, in ordinary arithmetic of real numbers, multiplication distributes over addition; that is, $s(x + y) = (sx) + (sy)$. Another example was given in 1.29.

1.41. Let $X$ and $\Lambda$ be sets. By a $\Lambda$-ary operation on $X$ we shall mean any mapping from $X^\Lambda$ into $X$. Such a function may be written as $f = f(x_\alpha, x_\beta, x_\gamma, \ldots)$, where $\Lambda = \{\alpha, \beta, \gamma, \ldots\}$. We may consider the point $(x_\alpha, x_\beta, x_\gamma, \ldots) \in X^\Lambda$ as the single argument of $f$, but alternatively we may view $f$ as having many arguments $x_\alpha, x_\beta, x_\gamma, \ldots$. We may refer to the $\alpha$th argument of $f$, the $\beta$th argument of $f$, etc. The set $\Lambda$ (or the number of elements in $\Lambda$, if it is finite) is called the arity of the operation.
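On a finite set, the properties defined in 1.40 can be tested by brute force over all arguments. The following sketch uses arithmetic modulo 6 as a stand-in for the operations discussed in the text; the choice of the set `X` and the helper names are assumptions of this example.

```python
from itertools import product

X = range(6)                     # the integers modulo 6, for illustration

def add(x, y):
    return (x + y) % 6

def mul(x, y):
    return (x * y) % 6

def sub(x, y):
    return (x - y) % 6

def is_commutative(op, X):
    return all(op(x, y) == op(y, x) for x, y in product(X, repeat=2))

def is_associative(op, X):
    return all(op(x, op(y, z)) == op(op(x, y), z)
               for x, y, z in product(X, repeat=3))

def distributes_over(f, op, S, X):
    """f : S x X -> X distributes over op if f(s, x op y) = f(s, x) op f(s, y)."""
    return all(f(s, op(x, y)) == op(f(s, x), f(s, y))
               for s in S for x, y in product(X, repeat=2))
```

For example, `distributes_over(mul, add, X, X)` holds, while `distributes_over(add, mul, X, X)` fails, and `sub` is neither commutative nor associative.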
We note a few other important cases:
• When $\Lambda$ is a finite set, with $n$ elements, then a $\Lambda$-ary operation is also called an $n$-ary operation; it may be viewed as a mapping from $X \times X \times \cdots \times X$ ($n$ factors) into $X$. We may consider it to be a function with $n$ arguments in $X$: the first argument, the second argument, etc. Typically, it is written in the form $y = f(x_1, x_2, \ldots, x_n)$. Operations that are $n$-ary for finite $n$ are also called finitary operations.
• A binary operation (defined in 1.40) is the same thing as a 2-ary operation.
• A 1-ary operation on $X$ is a mapping from $X$ into $X$. It is also called a unary operation. Typical examples are $x \mapsto -x$ (for numbers) or $S \mapsto \complement S$ (for subsets of a given set).
• It is occasionally useful to view a specially selected member of a set $X$ (such as the number 0, in $\mathbb{R}$) as an "operation." Since $X^0 = X^\varnothing = \{\varnothing\}$ is a singleton, a 0-ary operation on $X$ is a function from a singleton into $X$. In effect, it is a constant member of $X$; it is a function with 0 arguments. It is also called a nullary operation.
• The set $\Lambda$ does not have to be finite. For instance, an $\mathbb{N}$-ary operation on $X$ is a mapping from $X^{\mathbb{N}}$ into $X$, i.e., a mapping that takes each sequence of elements of $X$ to an element of $X$. Let $X = \mathcal{P}(\Omega)$ for some set $\Omega$; then
\[ (S_1, S_2, S_3, \ldots) \mapsto \bigcup_{j=1}^{\infty} S_j \quad\text{and}\quad (S_1, S_2, S_3, \ldots) \mapsto \bigcap_{j=1}^{\infty} S_j \]
are two $\mathbb{N}$-ary operations on $X$ that are important in measure theory.

ZF SET THEORY

1.42. Remark. This subchapter can be postponed; it will not be needed until much later in the book.

1.43. Russell's Paradox. How big can sets be? As we remarked in 1.11, we will attempt to avoid self-referencing statements. We must also avoid certain kinds of self-referencing definitions of sets. Defining sets in a self-referencing way can lead to sets that are too "big" to be meaningful. This is evident in the following paradox.

It seems that some sets are members of themselves. For instance, the collection of all sets that can be described in fewer than 100 words of English is a set that has just been so described, and the collection of all sets that are mentioned in this book is a set that has just been mentioned. Let us call such sets "self-inclusive." On the other hand, some sets do not include themselves. For instance, the collection of all pages in this book is a set of pages; it is not a page. We shall call such sets "non-self-inclusive." But now what about the collection of all non-self-inclusive sets? Is it a member of itself? It is if it isn't, and it isn't if it is: a contradiction either way.
1.44. A few ways to avoid paradoxical sets. Earlier in this chapter we defined a set to be "a collection of objects," but that definition leads to Russell's Paradox. A slightly better definition is: A set is a collection of already fixed objects, i.e., the objects must already be fixed before the collection is formed (Scott [1974], Shoenfield [1977]). This precludes self-referencing. With this definition, the collection of all sets is not a set. Unfortunately, this "definition" is not very precise; we shall give it some precision in Chapter 5.

A safer but more restrictive method for avoiding excessively large collections is by specifying in advance some manageable collection $V$ of sets, and then prohibiting the use of any sets outside that collection. The collection $V$ may be smaller than what one ordinarily thinks of as "all sets," but it may still be large enough for all the applications one is interested in. Thus, without substantial inconvenience, one may replace the term "set" with "member of $V$." The collection $V$ is then called the universe (or universal set, if it is a set).

In many contexts in mathematics, a universe is not specified explicitly or even mentioned; one simply assumes that the universe being used is large enough for one's applications. In other contexts it is useful to discuss the choice of the universe and even to specify it explicitly. One small, easily manageable universe is the "superstructure" over $\mathbb{R}$, which is commonly used in nonstandard analysis; it is described in Chapter 14.

The most commonly used universe is the one described by the axioms of conventional set theory, ZF + AC. This stands for Zermelo-Fraenkel set theory, as modified by Skolem, plus the Axiom of Choice. We shall list the axioms of ZF in 1.47; we shall introduce AC in Chapter 6. Conventional set theory does not permit Russell's Paradox, and it apparently does not lead to any other contradictions either. But we can only say "apparently"; the uncertainty of this is discussed further in 14.71 and the sections thereafter.

1.45. Although we shall only apply the term "set" to members of our universe $V$, it is grammatically convenient to be able to discuss other, much bigger collections, at least informally; e.g., to discuss as a "collection" those sets that satisfy a certain property. Any collection of objects will be called a class, but we shall distinguish between two types of classes:
• A set is a member of the universe $V$. Intuitively, it is a class of ordinary size.
• A proper class is a collection that is not a member of $V$. Intuitively, it is a much bigger class, one that is too big for us to safely apply to it the rules for sets.
A set can be a member of something; a proper class cannot. We refuse to consider proper classes as members of anything.
Thus, we cannot form the "set of all sets" or the "class of all classes," so Russell's Paradox does not arise. For instance, $V$ is the collection of all sets. It is a proper class, not a set.

In set theory, it is easy to give examples of proper classes; for instance, the class of all singletons or the class of all ordinals (investigated later in this book). Outside of set theory, examples are harder to produce. The class of all linear spaces and the class of all topological spaces are proper classes, but these examples are somewhat contrived. For instance, a theorem about topological spaces usually only involves a few topological spaces at a time; it can be formulated so that it does not require us to simultaneously consider all topological spaces. However, occasionally a proper class really is needed outside set theory. For instance, a convergence structure on a set $X$ (see Chapter 7) can be described by a "limit" function defined on the collection of all proper filters on $X$, or defined on the collection of all nets on $X$. The net approach has some intuitive advantages (nets are very much like sequences), but the net approach must be used with some caution: The collection of all nets on $X$ is a proper class, not a set.

The usual operations of sets make sense for classes sometimes, but not always. For instance, it usually makes sense to consider the intersection of two classes. By a function of classes we shall mean a mapping $f : \mathcal{M} \to \mathcal{N}$ from one class into another, i.e., a rule that assigns to each $M \in \mathcal{M}$ some particular $f(M) \in \mathcal{N}$. Generally the graph of such a function $f$ is not a set of ordered pairs, but rather a class of ordered pairs.

We will need proper classes only a few times in this book, so we shall not develop a systematic theory for them. We shall simply use them in an ad hoc fashion, in ways that obviously make sense, and it is sometimes convenient to apply the terminology of sets to a few classes. It should be clear in each context that we are avoiding self-referencing arguments such as Russell's Paradox.

1.46. What are sets made of? A typical set is $\{0, 1, 2, -5/3, \pi\}$; this set has five elements. The elements of a set are also called its points, whether those elements are known to be indivisible or not. For most purposes in "ordinary" mathematics (i.e., outside of set theory and logic) we do not think of the individual numbers $0, 1, 2, -5/3, \pi$ as sets that may contain other objects. Instead we think of these numbers as indivisible, and our language reflects that viewpoint; for this reason we may refer to them as atoms (or urelements or individuals or primitive objects). Likewise, we generally do not think of an ordered pair $(-5/3, \pi)$ as a set.

For most purposes in most branches of mathematics, it does not matter whether "3" is an indivisible object or a set containing three objects. What matters is how we use 3. We may define "3" in any way we wish, provided we define "$+$" so that $3 + 3 = 6$. For most mathematicians, it is simpler to view "3" as an indivisible object.

A set need not contain just atoms; it may contain other sets for its members. For instance, the points of the set $\{0, \{0, 1\}\}$ are $0$ and $\{0, 1\}$. Likewise, $\{\{0, 1\}, \{1, 2\}, \{-5/3, \pi\}\}$ is a set whose members are sets. Such sets arise naturally in analysis; for instance, a topology or a $\sigma$-algebra on a set $X$ is a collection of subsets of $X$ (see Chapter 5). The set of all topologies on $X$ is a set of sets of sets. "Ordinary" mathematicians (i.e., those not involved in logic or set theory) seldom need to go to any levels deeper than this. However, logicians and set theorists quite commonly have sets nested arbitrarily deep; for instance, consider the ordinals $\varnothing$, $\{\varnothing\}$, $\{\varnothing, \{\varnothing\}\}$, $\ldots$, described in Chapter 5.
Whether we view certain objects (e.g., the nonnegative integers) as sets or as atoms depends on our viewpoint; different branches of mathematics find different viewpoints advantageous.

Although atoms seem natural to most mathematicians, they are not really needed, and in some studies of set theory it is customary to dispense with atoms altogether. All familiar objects can be represented solely in terms of sets, without any other basic building blocks. Some details of this representation will be worked out in later chapters, but we can outline it now:
• The nonnegative integers can be built up from the empty set, as described in Chapter 5.
• A negative integer can be represented by an ordered pair involving a positive integer.
• Rational numbers may be represented in terms of pairs of integers (see Chapter 8).
• Real numbers may be represented in terms of sets of rationals (see Chapter 10 and the constructions used in the proof there).
• An ordered pair $(x, y)$ can be represented in terms of sets, and an ordered $n$-tuple in terms of ordered pairs, as discussed in 1.32.
• Functions and relations can be represented as sets of ordered pairs; see 1.36 and Chapter 3.
• A product of sets is a set of functions, and a finite or infinite sequence may be viewed as a function; see 1.32 and 1.33.
And so on. Thus, all familiar objects can be represented as sets.

However, to assert that all objects can be represented as sets is to make an additional assumption about our universe of "objects." This is one of the assumptions of conventional set theory: All of its "objects" are sets, and all the members of sets are sets, so conventional set theory is atomless. If we omit this assumption and permit the existence of objects that cannot be represented as sets, we obtain a slightly weaker system of axioms, known as set theory with atoms (or set theory with urelements); it is occasionally useful in model theory. An example of this is given in Chapter 14.

It is sometimes convenient to label the real numbers or other familiar objects as "atoms" and treat them as indivisible, even though those objects could instead be represented as sets. Thus, although "atoms" may enter into our terminology, the assumptions that underlie our work are really the assumptions of conventional, atomless set theory.

This metaphor may be helpful: Atomless (conventional) set theory is like a great collection of transparent bags, some of which are empty and some of which contain other bags, which may in turn contain other bags, and so on. There is nothing in the system except bags, and nothing to distinguish between the bags except the different combinations of bags-within-bags that they contain. In set theory with atoms, the bags may also contain beads. The beads do not contain anything, but can be distinguished by their markings. (In our metaphor for labeling familiar objects as "atoms," there is still nothing in the system but bags, but we seal some of the bags shut and mark them on the outside, and agree to treat them as beads.)
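The first item of the outline above, building the nonnegative integers from the empty set, can be illustrated with nested `frozenset` objects, which behave much like the "transparent bags" of the metaphor. This sketch assumes the usual von Neumann construction, in which each number is the set of all smaller numbers, matching the list $\varnothing, \{\varnothing\}, \{\varnothing, \{\varnothing\}\}, \ldots$ quoted in 1.46; the function name `von_neumann` is hypothetical.

```python
def von_neumann(n):
    """Return the set representing n: 0 is the empty set, and n + 1 is n ∪ {n}."""
    current = frozenset()
    for _ in range(n):
        current = current | {current}   # successor step: adjoin the set to itself
    return current

zero, one, two, three = (von_neumann(k) for k in range(4))
```

In this representation the number of elements of `von_neumann(n)` is `n`, and "less than" becomes both membership and proper inclusion, for example `two in three` and `two < three`.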
1.47. Zermelo-Fraenkel Set Theory. In the next few paragraphs we shall list the axioms of set theory to give a general impression, but later in this book we shall usually rely on the reader's intuition rather than on the list of axioms. Further discussions of ZF set theory can be found in books on set theory; for elementary treatments see, for instance, Halmos [1960] or Stoll [1963]. In some later chapters we shall briefly consider modifications of conventional set theory.

Most of the axioms of ZF are just formal statements that correspond to our informal intuition about sets. (An exception is the Axiom of Regularity, which will appear at the end of our list of axioms; it is somewhat nonintuitive and nonconstructive.) However, the reader is encouraged to put aside the usual intuitive meaning of "set" and view the axioms below as a self-contained theory that does not refer to anything familiar. To help put aside familiar intuitive notions of sets, some readers may find it helpful to glance ahead to the peculiar example in 1.48.

We assume that we are given some collection of objects, which we call "sets." We assume that some pairs of these "sets" are related by a relationship, called "is a member of," denoted $\in$. That is, when $S$ and $T$ are "sets," then the statement "$S \in T$" is either true or false. We assume that this collection of "sets" and this relationship "$\in$" satisfy certain axioms, listed below. We may then explore the consequences of those axioms.

The statement $A \subseteq B$ is an abbreviation for the statement that every member of $A$ is also a member of $B$. That is, $A \subseteq B$ is defined to mean $x \in A \Rightarrow x \in B$.

There are at least two different ways to deal with equality of sets. Some books take equality ($=$) to be a logical symbol with its customary properties, as listed in 14.27.a; thus equals can be substituted for equals. Two objects are equal if and only if they are not distinct. In such books, equality ($=$) and membership ($\in$) are already meaningful before we get to the relation between them. The relation between them is taken to be the first axiom:

Axiom (called "Extensionality" in some books). Two sets are the same if and only if they have the same members. That is, $(A = B) \iff ((x \in A) \iff (x \in B))$.

Other books follow a slightly different approach, which we shall follow here. Equality of sets is taken, not as a primitive notion, but as a defined notion. We define two sets to be equal when they have the same members. That is, $A = B$ means $x \in A \iff x \in B$. With this definition, we must not be misled by the fact that the symbol we are using ($=$) is a familiar symbol; we cannot automatically assume that equality ($=$) has all of its usual properties. From our definition of equality of sets (and our understanding of the logical symbol "$\iff$") it is easy to prove that (i) $A = A$, (ii) $A = B \Rightarrow B = A$, and (iii) $A = B$ and $B = C$ imply $A = C$. However, the last two axioms in 14.27.a, which state that "equal" quantities can be substituted for one another in any expression, do not follow directly from our definition of equality of sets, and so they will require some assumption about sets. One particular instance of the substitution principle is:

Axiom of Extensionality. If $A = B$ and $A \in C$, then $B \in C$.

It turns out that this is all we need: the general substitution principle (described in the last two axioms of 14.27.a) can be proved from our Axiom of Extensionality, by induction
on the length of the formulas involved. We shall omit the proof; it can be found in Takeuti and Zaring [1982].

Axiom of the Empty Set. There exists a set that has no members; it is denoted $\emptyset$ and called the empty set.

Note that the empty set is unique, since two sets with the same members are equal; thus we are justified in introducing a symbol "$\emptyset$" for it. We could replace this axiom with the assumption that there exists some set, for then the existence of the empty set follows from the Axiom of Comprehension by taking $P(x)$ to be the property $x \ne x$.

The next few axioms are formal restatements of some of our basic rules about permitted methods for forming sets, discussed informally earlier in this chapter.

Axiom of Pairing. If $S$ and $T$ are sets, then there exists a set whose only members are $S$ and $T$; it is denoted $\{S, T\}$.

Axiom of Unions. If $S$ is a set, then there exists a set $\mathrm{Un}(S)$ whose members are precisely the same as the members of the members of $S$. That is, $\mathrm{Un}(S)$ has the property that $[A \in \mathrm{Un}(S)] \iff [\text{there exists some } B \in S \text{ with } A \in B]$.

We call $\mathrm{Un}(S)$ the union of the members of $S$ or, more briefly, the union of $S$. To understand this axiom, keep in mind that all the elements of $S$ are sets. Here are a few examples: If $S = \{A\}$ is a singleton, then $\mathrm{Un}(S) = A$; if $S = \{A_1, A_2, A_3, \ldots\}$, then $\mathrm{Un}(S) = A_1 \cup A_2 \cup A_3 \cup \cdots$. If we use either Zermelo's or von Neumann's definition of the integers, then $\mathrm{Un}(n) = n - 1$ for positive integers $n$, and $\mathrm{Un}(0) = 0$ also.

Axiom of the Power Set. If $S$ is a set, then there exists a set whose members consist precisely of the subsets of $S$. That set is denoted by $\mathcal{P}(S)$.

Axiom of Replacement. Let $f$ be a function defined on a set $X$; that is, a rule that assigns to each element $x \in X$ some set $f(x)$. Then $\{f(x) : x \in X\}$ is a set.

Axiom of Comprehension (or Separation). If $X$ is a set, and $P(x)$ is a property that is true or false for each $x \in X$, then $\{x \in X : P(x) \text{ is true}\}$ is a set. That is, there exists a set $Y$ such that $x \in Y \iff [x \in X \text{ and } P(x) \text{ is true}]$. This can also be stated as: The intersection of a set and a class is a set.

We shall not explain at this point precisely what is meant by "function" or "property." Those terms mean approximately what one would expect them to mean, but for axiomatic set theory the function $f$ or property $P$ must be expressed in a formal first-order language. (First-order languages are introduced in 14.15 and thereafter.) This actually gives infinite schemes of axioms, one for each function $f$ and one for each property $P$. Actually, we have skipped over some of the complexity of the last two axioms; see also the related comments in Chapter 14. Though the Axiom of Replacement and the Axiom of Comprehension are usually presented as separate axioms, some mathematicians formulate their language in a slightly different fashion, so that one of these axioms becomes a consequence of the other; see Bell and Machover [1977].
Most books on set theory present the Axiom of Pairing as a separate axiom, but in fact we can make it a consequence of the previous axioms, as follows: First, define 0 to be the empty set, define 1 to be the power set of 0, and define 2 to be the power set of 1. Thus $0 = \emptyset$, $1 = \{\emptyset\}$, $2 = \{\emptyset, \{\emptyset\}\}$. Now "2" is the name of a set that contains two elements; those elements are "0" and "1." Define a function $f$ on 2 by taking $f(0) = S$ and $f(1) = T$. By the Axiom of Replacement, $\{S, T\}$ is a set.

The Axiom of Pairing can be used repeatedly, with Zermelo's definition or von Neumann's definition, to define the remaining nonnegative integers 3, 4, 5, .... However, this procedure only yields finitely many nonnegative integers. To get all of the nonnegative integers at once (i.e., to get the set $\omega = \{0, 1, 2, 3, \ldots\}$) requires something more, since the ellipsis (...) of informal mathematics is not permitted in formal set theory. The set of nonnegative integers can be constructed using

Axiom of Infinity. There exists a set $S$ with $\emptyset \in S$, and such that $A \in S \Rightarrow \{A\} \in S$.

The set $S$ given in the axiom is not quite the set of integers that we're after. However, we can construct $\omega = \mathbb{N} \cup \{0\}$ this way: Call a set $S$ "infinitylike" if it satisfies $\emptyset \in S$ and also satisfies $A \in S \Rightarrow \{A\} \in S$. The Axiom of Infinity guarantees the existence of at least one infinitylike set $S_0$. Let $P(x)$ = "for every $y$, if $y$ is infinitylike, then $x \in y$". The Axiom of Comprehension guarantees that $S_1 = \{x \in S_0 : P(x)\}$ is a set. Clearly, $S_1$ is the intersection of all infinitylike sets. It can be shown that $S_1$ is the desired set $\omega$; we omit the details.

The preceding axioms merely formalize the intuition about sets that we may have obtained from experience with finite sets. The one remaining axiom of ZF set theory is not just a formalization of our intuition, however:

Axiom of Regularity (or Foundation, or Restriction). If $X$ is a nonempty set, then $X$ has a member that does not meet $X$; i.e., there exists at least one set $A \in X$ that satisfies $A \cap X = \emptyset$.

The set $A$ whose existence is postulated by the Axiom of Regularity is sometimes called an $\in$-minimal element of $X$ (or more simply, a minimal element), for this reason: $A$ is a member of $X$, and there does not exist another set $B$, also a member of $X$, that satisfies $B \in A$.

The Axiom of Regularity precludes the possibility of certain counterintuitive sets; see 1.49. It will be clear from the reformulations in 1.50 and 6.31 that the Axiom of Regularity is concerned with sets of sets of sets of sets of .... Since such deep nesting does not occur outside of set theory, the Axiom of Regularity has little effect on "ordinary" mathematics; it is merely a technical convenience that helps set theory work properly. It can be replaced with alternative axioms; see Aczel [1988] and Barwise and Etchemendy [1987].
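The finite part of these constructions can be tried out with Python's immutable `frozenset`. The following is only a sketch under stated assumptions: the helper names (`power_set`, `union_of`, `von_neumann`, `zermelo`, `pair`) are hypothetical, and the successor rules $k \mapsto k \cup \{k\}$ (von Neumann) and $k \mapsto \{k\}$ (Zermelo) are the standard ones, which the text names but does not spell out.

```python
from itertools import combinations

EMPTY = frozenset()  # the empty set, playing the role of 0


def power_set(s):
    """Return the set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(
        frozenset(chosen)
        for size in range(len(items) + 1)
        for chosen in combinations(items, size)
    )


def union_of(s):
    """Un(S): the set of all members of members of S."""
    return frozenset(x for member in s for x in member)


def von_neumann(n):
    """von Neumann's integer n, built by the step k -> k U {k} (assumed rule)."""
    k = EMPTY
    for _ in range(n):
        k = k | frozenset([k])
    return k


def zermelo(n):
    """Zermelo's integer n, built by the step k -> {k} (assumed rule)."""
    k = EMPTY
    for _ in range(n):
        k = frozenset([k])
    return k


# 0, 1, 2 as in the text: 1 is the power set of 0, and 2 is the power set of 1.
ZERO = EMPTY
ONE = power_set(ZERO)
TWO = power_set(ONE)


def pair(s, t):
    """{S, T} obtained as the image of 2 = {0, 1} under f(0) = S, f(1) = T."""
    f = {ZERO: s, ONE: t}
    return frozenset(f[x] for x in TWO)
```

With either definition of the integers, `union_of` applied to a positive integer returns its predecessor, and `pair(s, t)` agrees with `frozenset([s, t])`. Nothing here reaches the infinite set $\omega$, which is exactly the point at which the Axiom of Infinity is needed.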
1.48. Pathological example. Since ZF's axioms are mostly in agreement with our intuition, we would not come to understand those axioms better by looking at examples that satisfy the axioms. Instead, we shall now present a peculiar little universe that violates most of the axioms. This example is a modification of one by Krivine [1971].

There are seven "sets" in our peculiar little universe, denoted by $A, B, C, D, E, F, G$. The membership relation $\in$ is represented by an arrow in the diagram below; we say $S \in T$ if there is an arrow from $S$ to $T$. Thus, the only memberships in our miniature universe are those listed beside the diagram.

[Diagram: arrows among the seven "sets" $A$ through $G$, one arrow for each membership listed below.]

memberships: $C \in A$; $A, D, E \in B$; $B \in C$; $C \in D$; $F, G \in E$; $D, G \in F$; $F \in G$.

As usual, we define $S \subseteq T$ to mean that $X \in S \Rightarrow X \in T$. With this definition, the only subset relations are: $A \subseteq D$, $D \subseteq A$, $G \subseteq E$, plus the fact that each set is a subset of itself.

Most of the axioms of ZF are violated:

• "Equality" doesn't mean what we would expect: the sets $A$ and $D$ are distinct, yet they have the same members, so they are "equal."
• The Axiom of Extensionality is violated: The sets $A$ and $D$ are "equal," yet they are not members of the same sets; we have $D \in F$ but not $A \in F$.
• The Axiom of the Empty Set is violated: Each of our "sets" has at least one member.
• The Axiom of Pairing is violated: There is no set whose only members are $C$ and $G$.
• The Axiom of Unions is violated: The members of $B$ are $A, D, E$; the sets that are members of $A$, $D$, or $E$ are the sets $C, F, G$; yet there is no set whose only members are $C, F, G$.
• The Axiom of the Power Set is violated: The sets that are subsets of $D$ are the sets $A$ and $D$, yet there is no set whose only members are $A$ and $D$.
• The Axiom of Comprehension is violated: Let $P(X)$ be the statement "$C \in X$." Then the class $\{S \in B : P(S)\}$ is not a set.
• The Axiom of Regularity is violated: The only members of $E$ are $F$ and $G$, yet neither of those sets is disjoint from $E$.
• The Axiom of Infinity, as stated in 1.47, only makes sense if we have already assumed the Axiom of the Empty Set. However, it is clear that our universe of seven "sets" does not yield an infinite set or an infinite collection of sets.

1.49. Consequences of regularity. In ZF set theory, none of the following can occur:

(i) $T \in T$ for some set $T$.
(ii) $T_n \in T_{n-1} \in T_{n-2} \in \cdots \in T_1 \in T_0 = T_n$ for some positive integer $n$ and sets $T_1, T_2, \ldots, T_n$.
(iii) $\cdots \in T_3 \in T_2 \in T_1 \in T_0$ for some infinite sequence of sets $T_0, T_1, T_2, \ldots$.

Proof. Either of conditions (i) and (ii) implies (iii), since the sequence in (iii) may repeat itself. If $T_0, T_1, T_2, \ldots$ is a sequence as in (iii), let $X = \{T_0, T_1, T_2, \ldots\}$; by the Axiom of Regularity some member of $X$ does not meet $X$, a contradiction.

1.50. (Optional.) We state without proof two more interesting consequences of the Axiom of Regularity. Proofs can be found, for instance, in Johnstone [1987] and Kunen [1980]. Actually, Johnstone shows that the Principle of Membership Induction is equivalent to the Axiom of Regularity.

Principle of Membership Induction. Suppose $P(\cdot)$ is a property of sets that is $\in$-inductive; i.e., that has this property: Whenever $P(\cdot)$ is true for all members of a set $X$, then $P(X)$ is also true. Then $P(\cdot)$ is true for all sets.

Principle of Membership Recursion. Let $p$ be a function of classes, from $\{\text{sets}\} \times \{\text{sets}\}$ into $\{\text{sets}\}$. (That is, for any sets $S, T$ suppose some set $p(S, T)$ is specified.) Then there exists a unique map $F : \{\text{sets}\} \to \{\text{sets}\}$ satisfying $F(X) = p(X, \{F(A) : A \in X\})$ for each set $X$. (In other words, it is possible to define $F(X)$ for all sets $X$, using a rule that specifies $F(X)$ in terms of the values of $F$ on the members of $X$.) We omit the proofs.
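The seven-"set" universe of 1.48 is small enough that its claimed violations can be checked by brute force. The sketch below encodes the membership list of that example as a dictionary; the variable and function names (`members`, `has_set_with_members`, `is_subset`, `containers`) are hypothetical and chosen only for this illustration.

```python
# members[T] is the collection of all S with S ∈ T, copied from the
# membership list of the example in 1.48.
members = {
    "A": {"C"},
    "B": {"A", "D", "E"},
    "C": {"B"},
    "D": {"C"},
    "E": {"F", "G"},
    "F": {"D", "G"},
    "G": {"F"},
}
universe = set(members)


def has_set_with_members(wanted):
    """Is there a "set" in the universe whose members are exactly `wanted`?"""
    return any(members[t] == set(wanted) for t in universe)


def is_subset(s, t):
    """S ⊆ T means: every member of S is also a member of T."""
    return members[s] <= members[t]


def containers(s):
    """All T in the universe with S ∈ T."""
    return {t for t in universe if s in members[t]}


# Subset relations other than the trivial S ⊆ S.
proper_subset_relations = {
    (s, t) for s in universe for t in universe if s != t and is_subset(s, t)
}

# Ingredients of the individual violations.
members_of_members_of_B = set().union(*(members[s] for s in members["B"]))
subsets_of_D = {s for s in universe if is_subset(s, "D")}
comprehension_class = {s for s in members["B"] if "C" in members[s]}
members_of_E_disjoint_from_E = {
    s for s in members["E"] if not (members[s] & members["E"])
}
```

Each computed collection matches the text: for instance `comprehension_class` comes out as the pair consisting of `A` and `D`, for which `has_set_with_members` returns `False`, and `members_of_E_disjoint_from_E` is empty, so `E` has no $\in$-minimal element.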
Chapter 2
Functions

2.1. Functions were defined in 1.31 and in 1.36; they will be studied in greater depth in this chapter.

We assume the reader has at least an informal familiarity with $\mathbb{R}$. A formal introduction to $\mathbb{R}$ is given in Chapter 10; we shall not use any of the deeper properties of $\mathbb{R}$ until then.

SOME SPECIAL FUNCTIONS

2.2. A few numerical functions.

a. If $u_m, u_{m+1}, u_{m+2}, \ldots, u_n$ are real numbers parametrized by consecutive integers $m, m+1, m+2, \ldots, n$, then their sum is
$$\sum_{j=m}^{n} u_j = u_m + u_{m+1} + u_{m+2} + \cdots + u_n,$$
and their product is
$$\prod_{j=m}^{n} u_j = u_m u_{m+1} u_{m+2} \cdots u_n.$$
(The letter $j$ may be replaced by any other letter not already in use.) These notations will later be applied more generally: not just to sums and products of real numbers, but also to sums and products of complex numbers or members of any ring. The summation notation will also apply to sums of vectors or sums of members of any additive monoid.

b. Let $X$ be some set. For each subset $S \subseteq X$ we define $1_S : X \to \{0, 1\}$ by
$$1_S(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{if } x \in X \setminus S. \end{cases}$$
We shall call this the characteristic function of $S$. Our notation "$1_S$" does not reflect the choice of $X$, which must be understood from context. Note that $1_{S \cap T} = 1_S \cdot 1_T = \min\{1_S, 1_T\}$.
Similarly, $\max\{1_S, 1_T\}$ is the characteristic function of $S \cup T$, and $1 - 1_S$ is the characteristic function of $\complement S$. Caution: Some mathematicians call $1_S$ the indicator function of $S$ or denote it by $\chi_S$, $I_S$, or other symbols. Another meaning for the term "indicator function" is given in Chapter 12.

c. For any set $X$, the Kronecker delta is the characteristic function of the diagonal set $\{(x, x) : x \in X\}$, considered as a subset of $X \times X$. Thus, it is the function $\delta : X \times X \to \{0, 1\}$ defined by
$$\delta_{xy} = \begin{cases} 0 & \text{when } x \ne y, \\ 1 & \text{when } x = y. \end{cases}$$
It is usually written with its arguments as subscripts.

d. The sign function, $\operatorname{sgn} : \mathbb{R} \to \{-1, 0, 1\}$, is defined by
$$\operatorname{sgn}(x) = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0. \end{cases}$$
It may also be written as $\operatorname{sign}(x)$.

e. Let $r_1, r_2, \ldots, r_n$ be distinct real numbers (or, more generally, distinct elements of any field; see 8.18). For $k = 1, 2, \ldots, n$, let
$$L_k(t) = \prod_{j \ne k} \frac{t - r_j}{r_k - r_j}.$$
Show that $L_1, L_2, \ldots, L_n$ are polynomials of degree $n - 1$ that satisfy $L_k(r_j) = \delta_{jk}$ (where $\delta$ is the Kronecker delta). These are the Lagrange polynomials. We shall use them for a result about linear independence in 11.18, which in turn will be used for a cardinality proof in Chapter 11.

The Lagrange polynomials are commonly used in numerical analysis in the following fashion: Let $f(t)$ be any function defined on a set that includes the numbers $r_1, r_2, \ldots, r_n$. Let
$$p(t) = \sum_{k=1}^{n} f(r_k) L_k(t).$$
Then $p(t)$ is the unique polynomial of degree at most $n - 1$ that agrees with $f$ on the set $\{r_1, r_2, \ldots, r_n\}$. It is called the interpolating polynomial and is used to approximate $f$ in various ways. (We mention it again in Chapter 15.)

2.3. The composition of two functions $f : X \to Y$ and $g : Y \to Z$ is the function $g \circ f : X \to Z$ defined by $(g \circ f)(x) = g(f(x))$. The symbol "$\circ$" may be included for emphasis or clarification, but it may be omitted otherwise: $g \circ f$ may be written "multiplicatively" as $gf$. However, the beginner is cautioned not to assume too much just on the basis of notation. For instance, unlike multiplication of real numbers, composition of functions does not satisfy the commutative law $gf = fg$.

2.4. A self-mapping of a set $X$ is a mapping from $X$ into $X$.
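The Lagrange polynomials and the interpolating polynomial described in 2.2 can be evaluated directly from their defining formulas. The following is a minimal sketch using exact rational arithmetic; the function names `lagrange_basis` and `interpolate` and the sample nodes are assumptions made for illustration, and indices run from 0 rather than from 1 as in the text.

```python
from fractions import Fraction


def lagrange_basis(nodes, k, t):
    """Value at t of L_k(t) = prod over j != k of (t - r_j) / (r_k - r_j)."""
    value = Fraction(1)
    for j, r_j in enumerate(nodes):
        if j != k:
            value *= Fraction(t - r_j) / Fraction(nodes[k] - r_j)
    return value


def interpolate(nodes, f, t):
    """Value at t of p(t) = sum over k of f(r_k) * L_k(t)."""
    return sum(f(r_k) * lagrange_basis(nodes, k, t) for k, r_k in enumerate(nodes))


# Three distinct nodes (n = 3), so each L_k has degree 2.
NODES = [Fraction(0), Fraction(1), Fraction(3)]
```

Because `NODES` has three entries, the interpolating polynomial has degree at most 2; it therefore reproduces $f(t) = t^2$ exactly at every $t$, while for $f(t) = t^3$ it agrees with $f$ only at the nodes.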
One particularly important self-mapping of $X$ is the identity mapping, denoted $i : X \to X$, defined by $i(x) = x$ for all $x$. Of course, the identity maps of different sets $X$ and $Y$ are different functions; we may write $i_X$ and $i_Y$ if the distinction needs to be displayed. Caution: Some texts, especially on category theory, write the identity map as $1_X$, but it should not be confused with the characteristic function, defined in 2.2.

For a function $f : X \to X$ we may write $f^2 = f \circ f$, $f^3 = f \circ f \circ f$, etc.; these are the iterates of $f$. It will sometimes be convenient to also write $f^1 = f$ and $f^0 = i_X$; then $f^{m+n} = f^m \circ f^n$ for any nonnegative integers $m, n$.

If $f$ is a self-mapping of $X$, then a fixed point of $f$ is any point $x \in X$ such that $f(x) = x$. Note that $x$ is then a fixed point of all the iterates of $f$, too. Preview: Many problems that do not appear to involve fixed points can be reformulated as problems about fixed points. For instance, if we are given $y$ and a function $f$ and wish to find a solution $x$ of the equation $y = f(x)$, we may rewrite the equation as $g(x) = x$, where we define $g(u) = y + u - f(u)$. This may seem rather contrived, but some problems yield to solution in this fashion. Several theorems about fixed points will be developed in later chapters.

A function $f : X \to X$ is idempotent if $f^2 = f$, or equivalently if $f(x) = x$ for all $x \in \operatorname{Range}(f)$. Some elementary examples: The absolute value function, the greatest integer function, and sgn (defined in 2.2) are idempotent maps from $\mathbb{R}$ into itself.

An involution of a set $X$ is a function $f : X \to X$ that satisfies $f^2 = i_X$. Here are some examples. The reader should already be familiar with the first few of these; the last few are a preview of material in later chapters.

• $f(x) = -x$ is an involution on $\mathbb{R}$ (or on any additive group).
• $f(x) = 1/x$ is an involution on $\mathbb{R} \setminus \{0\}$ or on $(0, +\infty)$.
• $\alpha \mapsto \bar{\alpha}$ is an involution on $\mathbb{C}$; here $\bar{\alpha}$ denotes the complex conjugate of $\alpha$.
• $A \mapsto A^T$ is an involution on certain collections of matrices.
• $S \mapsto \complement S$ is an involution on $\mathcal{P}(X)$, for any set $X$. (Later we will generalize this to Boolean algebras.)

2.5. Functions that agree. If $X$ is a set and $S \subseteq X$, then the inclusion map $i : S \to X$ is the map given by $i(s) = s$ for each $s \in S$. This arrangement is sometimes abbreviated as $i : S \subseteq X$; for emphasis or clarification we may occasionally write it as $i : S \xrightarrow{\subseteq} X$. Of course, when $S = X$, then $i$ is simply the identity map.

If $f : X \to Y$ and $S \subseteq X$, then the restriction of $f$ to $S$ is the function $f|_S : S \to Y$ that takes the value $f(s)$ at each point $s \in S$. A function $g$ is the restriction of $f$ to $S$ if and only if $g = f \circ i$ for some inclusion $i$. A function $f$ is an extension of a function $g$ if $g$ is a restriction of $f$.

Two functions $f_1 : X_1 \to Y_1$ and $f_2 : X_2 \to Y_2$ with overlapping domains are said to agree at a point $x_0 \in X_1 \cap X_2$ if $f_1(x_0) = f_2(x_0)$. They are said to agree on a set $S \subseteq X_1 \cap X_2$ if they agree at every point in $S$. Two functions differ at a point if they do not agree there.
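The notions of composition, idempotent map, involution, and fixed point can be spot-checked on a finite sample of real numbers. This is only a sketch: checking finitely many points illustrates the definitions but proves nothing about all of $\mathbb{R}$, and the names `compose`, `is_idempotent`, `is_involution`, `fixed_points`, the sample, and the particular equation $7 = 2x + 1$ are assumptions made for the example.

```python
import math


def compose(g, f):
    """Return the composition g o f, i.e. the map x -> g(f(x))."""
    return lambda x: g(f(x))


def sgn(x):
    """Sign function: 1 for x > 0, 0 for x = 0, -1 for x < 0."""
    return (x > 0) - (x < 0)


def negate(x):
    return -x


def is_idempotent(f, sample):
    """Check f(f(x)) == f(x) at every sample point."""
    return all(f(f(x)) == f(x) for x in sample)


def is_involution(f, sample):
    """Check f(f(x)) == x at every sample point."""
    return all(f(f(x)) == x for x in sample)


def fixed_points(f, sample):
    """Sample points x with f(x) == x."""
    return [x for x in sample if f(x) == x]


SAMPLE = [-2.5, -1, 0, 0.5, 3]

# Solving y = f(x) as a fixed-point problem g(x) = x with g(u) = y + u - f(u).
y = 7


def f(x):
    return 2 * x + 1


def g(u):
    return y + u - f(u)
```

On this sample, `abs`, `math.floor`, and `sgn` pass `is_idempotent`, `negate` passes `is_involution`, and the only fixed point of `g` is 3, which indeed solves $7 = 2x + 1$. The two compositions `compose(abs, negate)` and `compose(negate, abs)` differ at $-2.5$, illustrating that composition is not commutative.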
Let $f_1 : X \to Y_1$ and $f_2 : X \to Y_2$ be two functions with the same domain and different codomains. (See the following diagram.) Then $f_1$ and $f_2$ agree on all of $X$ if and only if $i_1 \circ f_1 = i_2 \circ f_2$ for some inclusions $i_1 : Y_1 \xrightarrow{\subseteq} Y$ and $i_2 : Y_2 \xrightarrow{\subseteq} Y$; here $Y$ can be any set that contains $Y_1 \cup Y_2$. When that is the case, then we usually disregard formality and consider $f_1$ and $f_2$ to be the "same" function; in general this does not lead to any confusion, but occasionally the formal distinction between two such functions is useful. See the related discussions in 9.4 and 9.20.

[Diagram: two functions $f_1$, $f_2$ that agree on their domain $X$.]

A function vanishes at a point or on a set if that function agrees there with the constant function 0, where "0" has any of its usual meanings; i.e., the real number 0, the vector 0 in some linear space, the empty set, etc.

2.6. A function $f : X \to Y$ is injective (or one-to-one, or an injection) if it has the property that $x_1 \ne x_2 \Rightarrow f(x_1) \ne f(x_2)$. (See the following diagram.) More generally, a collection $\Phi$ of mappings defined on $X$ (possibly with different codomains) is said to separate the points of $X$ if for each pair of distinct points $x_1, x_2$ in $X$ there exists at least one $f \in \Phi$ satisfying $f(x_1) \ne f(x_2)$.

If a function $f : X \to Y$ is injective, then we may define its inverse, a function $f^{-1} : \operatorname{Range}(f) \to X$, as follows: for each $y \in \operatorname{Range}(f)$, let $f^{-1}(y)$ be the unique $x \in X$ that satisfies $f(x) = y$.

Recall that a function $f : X \to Y$ is surjective if its codomain $Y$ is equal to its range $f(X)$. We say a function $f : X \to Y$ is bijective (or a bijection of $X$ onto $Y$, or a one-to-one correspondence between $X$ and $Y$) if it is both injective and surjective. It then follows that $f^{-1}$ is also a bijection from $Y$ onto $X$. Of course, whenever $f : X \to Y$ is injective, then $f$ acts as a bijection from $X$ onto $\operatorname{Range}(f)$.

Exercise. A set $S \subseteq X \times Y$ is the graph of a bijection from $X$ onto $Y$ if and only if $S$ is a set of ordered pairs such that each $x \in X$ is the first coordinate of exactly one of the pairs and each $y \in Y$ is the second coordinate of exactly one of the pairs.

A bijection from a set $X$ onto itself is called a permutation of $X$. Any involution (defined in 2.4) is a permutation.

2.7. Let $f : X \to Y$ be a function. The image (or forward image) under $f$ of any set $S \subseteq X$ is the set $f(S) = \{f(x) : x \in S\} \subseteq Y$. Thus the same symbol "$f$" is also used for a mapping from $\mathcal{P}(X)$ into $\mathcal{P}(Y)$. (In a few
The range of f. Thus.T)  f({x t xT)  f({x}..31. for a function f : X x Y ~ Z we may write f(x.e.65.T)  {f(x. the image of the domain.) The forward image map preserves some of the basic set operations: )~EA AEA The forward image map extends the given mapping f : X ~ Y if we identify each singleton in X or Y with its unique member. see 14. defined in 1.~ ) selfmapping but not permutation permutation but not involution involution but not identity map unusual contexts it can cause difficulty. for the forward image. is just the set f(X) i. Some mathematicians use a slightly different notation. 9 >0 .38 Chapter 2: Functions X Y X Y X Y surjective but not injective injective but not surjective bijective Examples with Finite Sets 9 >0 t. The notation of forward images can also be applied in one or more arguments of a function of several variables.t) 9t E T } . such as f[S] or f :: S. we shall not follow that practice here. however.
and similarly
$$f(S, T) = f(S \times T) = \{f(s, t) : s \in S,\ t \in T\}$$
for sets $S \subseteq X$ and $T \subseteq Y$. This notation can also be combined with the notation of binary operators, as noted in Chapter 1. Thus we may write $S \mathbin{\square} T = \{s \mathbin{\square} t : s \in S,\ t \in T\}$.

2.8. Let $f : X \to Y$ be any function. The inverse image (or preimage) under $f$ of any set $T \subseteq Y$ is the set $f^{-1}(T) = \{x \in X : f(x) \in T\}$. Thus $f^{-1}$ is a mapping from $\mathcal{P}(Y)$ into $\mathcal{P}(X)$. It is somewhat better behaved than the forward image, since it preserves all the basic set operations:
$$f^{-1}\Bigl(\bigcup_{\lambda \in \Lambda} T_\lambda\Bigr) = \bigcup_{\lambda \in \Lambda} f^{-1}(T_\lambda), \qquad f^{-1}\Bigl(\bigcap_{\lambda \in \Lambda} T_\lambda\Bigr) = \bigcap_{\lambda \in \Lambda} f^{-1}(T_\lambda),$$
$$f^{-1}(Y \setminus T) = X \setminus f^{-1}(T), \qquad f^{-1}(\emptyset) = \emptyset, \qquad f^{-1}(Y) = X.$$

Whether $f$ is injective or not, for any point $y \in Y$ the set $f^{-1}(\{y\})$ can also be abbreviated as $f^{-1}(y)$. If $f$ is injective, then we also have the mapping $f^{-1} : \operatorname{Range}(f) \to X$, as noted in 2.6. In particular, the set $f^{-1}(T)$ is then also equal to the forward image of $T \cap \operatorname{Range}(f)$ under the mapping $f^{-1}$.

2.9. Further properties of forward and inverse images. Let $g : X \to Y$ be some function. Then for all sets $S, T \subseteq X$ and $A, B \subseteq Y$ we have:

a. $g^{-1}(g(S)) \supseteq S$. Also, $A \subseteq B \Rightarrow g^{-1}(A) \subseteq g^{-1}(B)$.
b. $g(g^{-1}(A)) = A \cap \operatorname{Range}(g) \subseteq A$.
c. Suppose $g$ is surjective. Then $g(g^{-1}(A)) = A$; hence $g^{-1}(A) = g^{-1}(B) \Rightarrow A = B$.

DISTANCES

2.10. For later reference we note this form of the Cauchy-Bunyakovskii-Schwarz inequality:
$$x_1 y_1 + x_2 y_2 + \cdots + x_n y_n \le \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}\,\sqrt{y_1^2 + y_2^2 + \cdots + y_n^2}$$
for any real numbers $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$. Hint for the proof: $0 \le \sum_{i<j} (x_i y_j - x_j y_i)^2$. A more general form of the CBS inequality will be given in Chapter 22. An important consequence (which will be used in 2.12.a) is:
$$\sqrt{(x_1 + y_1)^2 + \cdots + (x_n + y_n)^2} \le \sqrt{x_1^2 + \cdots + x_n^2} + \sqrt{y_1^2 + \cdots + y_n^2}.$$
To prove this inequality, multiply both sides of the CBS inequality by 2, then add $\sum_{j=1}^{n} (x_j^2 + y_j^2)$ to both sides, then take square roots on both sides.
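Returning to the forward and inverse images of 2.7 through 2.9: their behavior is easy to observe on finite sets. The sketch below uses a hypothetical example, the squaring map on a small set of integers, chosen because it is neither injective nor surjective onto the stated codomain; the names `image`, `preimage`, `square`, `X`, and `Y` are assumptions for this illustration.

```python
def image(f, S):
    """Forward image f(S) = {f(x) : x in S}."""
    return {f(x) for x in S}


def preimage(f, X, T):
    """Inverse image {x in X : f(x) in T}, where X is the domain of f."""
    return {x for x in X if f(x) in T}


def square(x):
    return x * x


X = set(range(-3, 4))  # domain {-3, ..., 3}
Y = set(range(0, 10))  # codomain {0, ..., 9}; the range is {0, 1, 4, 9}

S = {1, 2}
A = {1, 2, 4}
B = {4, 5, 9}

# Inverse images preserve unions, intersections, and complements.
preimage_preserves_union = (
    preimage(square, X, A | B) == preimage(square, X, A) | preimage(square, X, B)
)
preimage_preserves_intersection = (
    preimage(square, X, A & B) == preimage(square, X, A) & preimage(square, X, B)
)
preimage_preserves_complement = (
    preimage(square, X, Y - A) == X - preimage(square, X, A)
)

# Forward images preserve unions, but may fail to preserve intersections.
image_preserves_union = image(square, S | {-1}) == image(square, S) | image(square, {-1})
image_of_intersection = image(square, {-1} & {1})
intersection_of_images = image(square, {-1}) & image(square, {1})
```

Here `preimage(square, X, image(square, S))` is strictly larger than `S`, and `image(square, preimage(square, X, A))` equals the intersection of `A` with the range, as in 2.9. The last two values differ (the empty set versus the set containing 1), which shows one way in which the forward image is less well behaved than the inverse image.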
2.11. Definitions. A quasipseudometric on a set $X$ is a mapping $d : X \times X \to [0, +\infty)$ that satisfies
$$d(x, x) = 0 \quad \text{and} \quad d(x, y) \le d(x, u) + d(u, y) \quad \text{(triangle inequality)}$$
for all $x, y, u \in X$. The number $d(x, y)$ is called the distance from $x$ to $y$. The name "triangle inequality" stems from the fact that in Euclidean geometry, the length of one side of a triangle is less than or equal to the sum of the lengths of the other two sides.

Note that many different distance functions can be defined on any one set $X$. For real-world examples, note that distance as the crow flies is different from distance as the taxicab drives. Moreover, for a quasipseudometric the distance from $x$ to $y$ is not necessarily equal to the distance from $y$ to $x$. For an asymmetric example, consider a taxicab in a city that has some one-way streets.

Except for a few brief remarks in 5.15, all of the quasipseudometrics which we shall consider in this book also satisfy
$$d(x, y) = d(y, x) \quad \text{(symmetry)};$$
they are then called pseudometrics. A pseudometric that also satisfies
$$x \ne y \;\Rightarrow\; d(x, y) > 0 \quad \text{(positive-definiteness)}$$
is called a metric.

In this book we shall sometimes discuss the positive-definite case and the not-necessarily-positive-definite case simultaneously, by writing "pseudo" in parentheses. Any such discussion should be read once with the "pseudo" and once without it. This convention also applies to G-(semi)norms, F-(semi)norms, and (semi)norms, which are special types of (pseudo)metrics introduced in later chapters.

A (pseudo)metric space is a pair $(X, d)$ consisting of a set $X$ and a (pseudo)metric $d$ on $X$. We may refer to $X$ itself as a (pseudo)metric space, if $d$ does not need to be mentioned explicitly.

A map $p : X \to Y$, from one (pseudo)metric space $(X, d)$ into another (pseudo)metric space $(Y, e)$, is called distance-preserving or isometric if it satisfies $e(p(x_1), p(x_2)) = d(x_1, x_2)$ for all $x_1, x_2 \in X$. It then preserves all the (pseudo)metric structure of $X$. If it is also injective (always true for metric spaces), then by a change of notation we may view $p$ as an inclusion map that makes $X$ a subset of $Y$.

2.12. Basic examples and properties of metrics and pseudometrics.

a. The usual metric on $\mathbb{R}$ is $d(x, y) = |x - y|$, where $|\ |$ is the usual absolute value function. For any positive integer $n$, the most commonly used metrics on $\mathbb{R}^n$ (or on $\mathbb{C}^n$) are
$$d_1(x, y) = |x_1 - y_1| + |x_2 - y_2| + \cdots + |x_n - y_n|,$$
$$d_2(x, y) = \sqrt{|x_1 - y_1|^2 + |x_2 - y_2|^2 + \cdots + |x_n - y_n|^2},$$
$$d_\infty(x, y) = \max\{|x_1 - y_1|, |x_2 - y_2|, \ldots, |x_n - y_n|\},$$
for any points $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$. It is easy to verify that these are indeed metrics; for $d_2$ use 2.10.
The metrics $d_1$ and $d_2$ are special cases of the metric
$$d_p(x, y) = \sqrt[p]{|x_1 - y_1|^p + |x_2 - y_2|^p + \cdots + |x_n - y_n|^p},$$
where $p \in [1, \infty)$; that $d_p$ is a metric will be proved in Chapter 22. The metrics $d_p$ for $1 \le p \le \infty$ may be referred to as the usual metrics on $\mathbb{R}^n$. They are equivalent, in a sense discussed in Chapter 22, and therefore they are interchangeable for most purposes.

b. A point $x$ in a metric space $(X, d)$ is isolated if there is some number $r > 0$ (which may depend on $x$) such that all other points have distance from $x$ at least equal to $r$. A discrete metric on a set $X$ is a metric that makes every point isolated. There are many such metrics on a set $X$, and some of them have substantially different properties; see Chapters 5 and 19. The simplest discrete metric is the following one, which we shall call the Kronecker metric:
$$d(x, y) = 1 - \delta_{xy} = \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \ne y \end{cases}$$
(where $\delta$ is the Kronecker delta). Some mathematicians call this the discrete metric.

c. If $d$ is a pseudometric on $X$, then $|d(x, y) - d(u, v)| \le d(x, u) + d(y, v)$ for $x, y, u, v \in X$.

d. We may define a metric on the extended real line $[-\infty, +\infty]$ by taking $d(x, y) = |f(x) - f(y)|$, where $f$ is some injective function from $[-\infty, +\infty]$ into $\mathbb{R}$. Three such functions $f(u)$ are given by
$$\arctan(u), \qquad \tanh(u), \qquad \frac{u}{1 + |u|},$$
with values of $f(u)$ at $u = \pm\infty$ defined by taking limits in the obvious fashion. Assorted other functions $f$ will also suffice for this purpose. We do not think of $|f(x) - f(y)|$ as actually being the "distance" between $x$ and $y$, but the metric is nevertheless useful for defining convergent sequences and other metric concepts. The three choices of $f$ given above yield metrics that are equivalent in the sense that they yield the same topologies and the same uniformities, and consequently agree on many other structures; e.g., they have the same convergent sequences. These notions are discussed in later chapters. (See Chapter 18.) Any of these metrics, or any other equivalent metric, may be referred to as the usual metric on $[-\infty, +\infty]$.

e. (This example assumes some familiarity with calculus.) Let $X$ be the set of all Riemann integrable (or more generally, Henstock or Lebesgue integrable) real-valued functions defined on some interval $J \subseteq \mathbb{R}$, say on $[0, 1]$. Define
$$d(p, q) = \int_J |p(t) - q(t)|\, dt.$$
This is a pseudometric on $X$, but it is not a metric. For instance, $d(p, q) = 0$ if $\{t \in J : p(t) \ne q(t)\}$ is a finite set. The pseudometric $d$ becomes a metric if we restrict it to the continuous functions on $J$.
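The examples of 2.12 up to this point can be spot-checked numerically on finite samples. The sketch below is illustrative only: a finite check cannot prove that a function is a metric, the tolerance `TOL` absorbs floating-point rounding, and the names `d_p`, `kronecker`, `extended_line`, `is_metric_on`, and `four_point_inequality_holds` are hypothetical.

```python
import itertools
import math

TOL = 1e-9  # slack for floating-point rounding


def d_p(x, y, p):
    """The usual metric d_p on R^n; p = math.inf gives the max metric."""
    gaps = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        return max(gaps)
    return sum(gap ** p for gap in gaps) ** (1 / p)


def kronecker(x, y):
    """Kronecker metric: 0 if x == y, 1 otherwise."""
    return 0 if x == y else 1


def extended_line(x, y):
    """|f(x) - f(y)| with f = arctan, defined on [-inf, +inf]."""
    return abs(math.atan(x) - math.atan(y))


def is_metric_on(d, points):
    """Check the metric axioms for d on a finite sample of points."""
    for x, y, u in itertools.product(points, repeat=3):
        if abs(d(x, x)) > TOL:                  # d(x, x) = 0
            return False
        if abs(d(x, y) - d(y, x)) > TOL:        # symmetry
            return False
        if d(x, y) > d(x, u) + d(u, y) + TOL:   # triangle inequality
            return False
        if x != y and d(x, y) <= 0:             # positive-definiteness
            return False
    return True


def four_point_inequality_holds(d, points):
    """Check |d(x, y) - d(u, v)| <= d(x, u) + d(y, v) on the sample."""
    return all(
        abs(d(x, y) - d(u, v)) <= d(x, u) + d(y, v) + TOL
        for x, y, u, v in itertools.product(points, repeat=4)
    )


PLANE_POINTS = [(0.0, 0.0), (3.0, 4.0), (1.0, -2.0), (-1.5, 2.0)]
LINE_POINTS = [-math.inf, -1.0, 0.0, 2.0, math.inf]
```

For the points $(0, 0)$ and $(3, 4)$ the three usual metrics give 7, 5, and 4 for $p = 1, 2, \infty$ respectively, and `extended_line(-math.inf, math.inf)` is $\pi$, the largest distance that this particular choice of $f$ can produce.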
y)}. then d(x. As we develop the theory of gauges. A gauge D on a set X is s e p a r a t i n g if it has the property that for each pair of distinct points x and y in X. .) An u l t r a m e t r i c is a metric that satisfies the following strengthened version of the triangle inequality" d(x. It is a metric if and only if the function f is injective i.. Conversely. A single pseudometric is not adequate to describe the structure of some spaces. some important separating gauges D used in applications consist of large collections of pseudometrics that are not metrics. By a g a u g e on a set X we shall mean a collection of pseudometrics on X.f(Y)l is a pseudometric on X. and develop some properties of pseudometrics as a special case of properties of gauges. Show that the Kronecker 2. d(u. Most gauges used in applications are separating. sometimes large collections of pseudometrics are needed. Show that this inequality implies the triangle inequality.42 Chapter 2: Functions f.18 and 18.12." but we shall not follow that practice. If f : X ~ I~ is any realvalued function on any set. Another. More generally. Our own usage follows that of Reilly [1973].11. most pseudometrics used by themselves in applications are metrics. u). For instance. satisfying x # y ~ f(x) # f(y). (Optional. metric (in 2. We may refer to X itself as a gauge space. Note that a singleton gauge D = {d} is separating precisely when the pseudometric d is a metric. Examples and exercises about separation. q) = Ip(t). see 28.13. Some mathematicians make the separation condition a part of their definition of "gauge. let (I) = {f~ : I c A} be a collection of realvalued functions on a set X. y) = I f ( x ) . where d~ (x. We shall often write "{d}" and "d" interchangeably. Thus. define a pseudometric dt on X by: dr(p. 2. but no proper subset of it is separating. y) < max{d(x. y) > O.f that the gauge D yields the product topology and product uniformity on R R.q(t)l. 
D) often can be analyzed in terms of the simpler pseudometric spaces {(X.6. Then a gauge can be defined by D = {d~: I e A}.b.12. entirely unrelated meaning of the term "gauge" is given in 24. d ) : d E D}.d. We caution that the term "gauge" is used in a wide variety of inequivalent ways in the literature. A g a u g e s p a c e is a pair (X. 2.e. We shall see in 9. The resulting gauge D = {dr : t E R} is separating. Remarks/example. for instance. For each t E I~. if D does not need to be mentioned explicitly. y) = If~(x)f~(y)l. let X = R R = {functions from I~ into R}. discuss d itself as a gauge. However.b) is an ultrametric. there exists at least one d E D satisfying d(x.14.15. a. D) consisting of a set X and a gauge D on X. Definitions. a gauge space (X.9. we shall devote special attention to the case of a gauge D = {d} consisting of just one pseudometric d. that usage works particularly well with the concepts in this book. A special case of this construction was given in 2.
This gauge is separating if and only if the collection $\Phi$ separates points of $X$ (in the sense of 2.6).

b. A gauge $D$ on a set $X$ is separating in the sense of 2.11 if and only if the collection of functions $\Phi = \{f_{x,d} : x \in X,\ d \in D\}$ defined by $f_{x,d}(y) = d(x,y)$ is a separating collection in the sense of 2.6.

CARDINALITY

2.16. Definitions. The cardinality of a finite set is the number of distinct elements in that set; thus it is a nonnegative integer. The cardinality of a set $X$ is sometimes abbreviated $|X|$. The "cardinality" of a not-necessarily-finite set is a bit harder to define; we shall postpone that concept until Chapter 6. However, it is much easier to define the comparison of cardinalities of two sets. This notion is due to Georg Cantor and is the foundation of modern set theory. (Cantor invented these ideas while investigating Fourier series; see Chapter 26.) Much of our presentation of cardinality is based on Dalen, Doets, and Swart [1978] and Kaplansky [1977].

We say that two sets $X$ and $Y$ have the same cardinality, written $\mathrm{card}(X) = \mathrm{card}(Y)$, if there exists a bijection between $X$ and $Y$. More generally, we write $\mathrm{card}(X) \le \mathrm{card}(Y)$ if $X$ has the same cardinality as some subset of $Y$, i.e., if there exists an injection from $X$ into $Y$. Also, we write $\mathrm{card}(X) < \mathrm{card}(Y)$ if $X$ and $Y$ satisfy $\mathrm{card}(X) \le \mathrm{card}(Y)$ but do not satisfy $\mathrm{card}(X) = \mathrm{card}(Y)$, i.e., if there exists an injection from $X$ into $Y$ but there does not exist a bijection from $X$ onto $Y$.

With this convention, we can now restate some of the definitions given in Chapter 1 and add a few more definitions. A set $S$ is

    finite if $\mathrm{card}(S) = \mathrm{card}(\{1, 2, \ldots, n\})$ for some nonnegative integer $n$ (in which case we call $n$ the cardinality of the set and write $\mathrm{card}(S) = n$);
    infinite if it is not finite;
    countably infinite if $\mathrm{card}(S) = \mathrm{card}(\mathbb{N})$;
    countable (or denumerable) if $\mathrm{card}(S) \le \mathrm{card}(\mathbb{N})$;
    uncountable if it is not countable;
    cofinite if it is being considered as a subset of some set $X$ and its complement $X \setminus S$ is finite.

Caution: Some mathematicians apply the term "countable" or the term "denumerable" only to the sets that have the same cardinality as $\mathbb{N}$. Similarly, some mathematicians use a slightly different definition of "infinite"; see the remark in Chapter 6.
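For small finite sets, the definitions of $\mathrm{card}(X) \le \mathrm{card}(Y)$ and $\mathrm{card}(X) = \mathrm{card}(Y)$ can be checked literally, by searching for an injection or a bijection. The following is only an illustrative sketch: the function names are hypothetical, and the brute-force search is practical only for very small sets.

```python
from itertools import permutations
from typing import Hashable, Iterable


def has_injection(xs: Iterable[Hashable], ys: Iterable[Hashable]) -> bool:
    """True if some injection from xs into ys exists (xs, ys finite sets)."""
    x_list, y_list = list(xs), list(ys)
    # An injection assigns distinct elements of ys to the elements of xs,
    # i.e. it corresponds to an arrangement of len(xs) distinct elements of ys.
    return any(True for _ in permutations(y_list, len(x_list)))


def has_bijection(xs: Iterable[Hashable], ys: Iterable[Hashable]) -> bool:
    """True if some injection from xs into ys is also onto ys."""
    x_list, y_list = list(xs), list(ys)
    target = set(y_list)
    return any(set(image) == target
               for image in permutations(y_list, len(x_list)))


def strictly_smaller(xs: Iterable[Hashable], ys: Iterable[Hashable]) -> bool:
    """card(xs) < card(ys): an injection exists but no bijection does."""
    xs, ys = list(xs), list(ys)
    return has_injection(xs, ys) and not has_bijection(xs, ys)


X = {"a", "b"}
Y = {1, 2, 3}
```

With these sample sets, `has_injection(X, Y)` holds while `has_injection(Y, X)` does not, so `strictly_smaller(X, Y)` holds, in agreement with counting elements.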
2.17. Remarks. It is customary to use the familiar symbol $\le$ for comparison of cardinalities. Do not assume too much on the basis of this notation, however; the comparison of cardinalities is not quite like the comparison of real numbers. Some familiar properties of real numbers are also valid for cardinalities, and some are not.

For instance, it is quite easy to prove that for any sets $X$, $Y$, $Z$ we have
$$\mathrm{card}(X) \le \mathrm{card}(Y) \text{ and } \mathrm{card}(Y) \le \mathrm{card}(Z) \text{ imply } \mathrm{card}(X) \le \mathrm{card}(Z).$$
(The reader should show this now, as an exercise.) Thus, comparison of cardinalities is a preordering. It is rather harder to prove that
$$\mathrm{card}(X) \le \mathrm{card}(Y) \text{ and } \mathrm{card}(Y) \le \mathrm{card}(X) \text{ imply } \mathrm{card}(X) = \mathrm{card}(Y);$$
that is the content of the Schröder-Bernstein Theorem in 2.19. In other words, comparison of distinct cardinalities is a partial ordering. Still stronger properties about the comparison of cardinalities will be proved in 6.22, but the proof is deeper and also requires that we assume the Axiom of Choice.

2.18. Further remarks. Throughout the mathematical literature, the letter $\sigma$ (a Greek lowercase sigma) is often used to indicate countable sums or unions, e.g., in $\sigma$-ideals, $\sigma$-algebras, $\sigma$-additive measures, $\sigma$-convex sets, $F_\sigma$ sets. Similarly, $\delta$ (delta) is often used to indicate countable products or intersections, e.g., in $G_\delta$ sets. We shall define these terms separately in their appropriate contexts.

2.19. Schröder-Bernstein Theorem. Let $X$ and $Y$ be sets. If there exist injections $e : Y \to X$ and $f : X \to Y$, then there exists a bijection from $X$ onto $Y$. In other words, if $\mathrm{card}(Y) \le \mathrm{card}(X)$ and $\mathrm{card}(X) \le \mathrm{card}(Y)$, then $\mathrm{card}(X) = \mathrm{card}(Y)$.

Proof. This presentation follows Cox [1968]. Since we have an injection $e : Y \to X$, by relabeling we may identify each point of $Y$ with its image under $e$. Thus, we may assume that $Y \subseteq X$ and that we are given an injection $f : X \to Y$. In the following diagram, the big box represents the set $X$.

[Diagram: the set $X$, showing the disjoint pieces $C = X \setminus Y$, $f(C)$, $f^2(C)$, $f^3(C)$, $f^4(C)$, ..., and the remainder $X \setminus S$.]

Let $C = X \setminus Y$. Since $f$ is injective and has range contained in $Y$, the sets $C$, $f(C)$, $f^2(C)$, $f^3(C)$, ... are disjoint. (Here $f^n$ is the $n$th iterate of $f$.) Let $S = \bigcup_{n=0}^{\infty} f^n(C)$. (See the diagram above.) Note that $f(S) = S \setminus C \subseteq S$. Define a function $h : X \to Y$ by taking $h(z) = f(z)$ when $z \in S$ and $h(z) = z$ when $z \in X \setminus S$. Verify that the function $h$ takes each $f^n(C)$ bijectively to $f^{n+1}(C)$, and hence $h$ is a bijection from $X$ onto $Y$.

2.20. Exercises and examples.
a. $\mathrm{card}(\varnothing) = 0$ and $\mathrm{card}(\{\varnothing\}) = 1$.

b. $\mathrm{card}(\varnothing) < \mathrm{card}(\{1\}) < \mathrm{card}(\{1,2\}) < \mathrm{card}(\{1,2,3\}) < \cdots < \mathrm{card}(\mathbb{N})$.

c. In $\mathbb{N}$, the subset $\{n : n > 4\}$ is cofinite.

d. The sets $\mathbb{N}$, $\mathbb{N} \cup \{0\}$, $\mathbb{Z}$, and {even positive integers} all have the same cardinality. Hint: See diagram below.

    $\mathbb{N}$                 = { 1,  2,  3,  4,  5,  6,  7, ... }
    $\mathbb{N} \cup \{0\}$      = { 0,  1,  2,  3,  4,  5,  6, ... }
    $\mathbb{Z}$                 = { 0,  1, -1,  2, -2,  3, -3, ... }
    {even positive integers}     = { 2,  4,  6,  8, 10, 12, 14, ... }

e. If $\mathrm{card}(X) \ge \mathrm{card}(\mathbb{N})$ and $u$ is any object, then $\mathrm{card}(X \cup \{u\}) = \mathrm{card}(X)$. Hints: This is trivial if $u \in X$. If $u \notin X$, use 2.20.d.

f. Cantor's Theorem on pairs. $\mathbb{N} \times \mathbb{N}$ is countably infinite. Hint: By tracing along the diagonals in the diagram below, we obtain the sequence
$$(1,1),\ (2,1),\ (1,2),\ (3,1),\ (2,2),\ (1,3),\ (4,1),\ \ldots,$$
which is an enumeration of $\mathbb{N} \times \mathbb{N}$.

[Diagram: the array of pairs $(i, j)$ for $i, j = 1, \ldots, 5$, with arrows tracing successive diagonals.]

g. $\mathrm{card}(\mathbb{Q}) = \mathrm{card}(\mathbb{N})$. Hint: Use the preceding result, together with the Schröder-Bernstein Theorem. Remark. Later we will show $\mathrm{card}(\mathbb{R}) = \mathrm{card}(\mathbb{N}^{\mathbb{N}}) = \mathrm{card}(2^{\mathbb{N}}) > \mathrm{card}(\mathbb{N})$. See Chapter 10.

h. If $X$ and $Y$ are finite sets, then $\mathrm{card}(X \times Y) = \mathrm{card}(X)\,\mathrm{card}(Y)$ and $\mathrm{card}(X^Y) = \mathrm{card}(X)^{\mathrm{card}(Y)}$ (with the conventions $r^0 = 1$ for $r \ge 0$ and $0^r = 0$ for $r > 0$).

i. For any set $X$, we have $\mathrm{card}(X \times X) \ge \mathrm{card}(X)$. Hint: Treat separately the cases of $X = \varnothing$ and $X \ne \varnothing$. Remarks. We have $\mathrm{card}(X \times X) = \mathrm{card}(X)$ when $X$ is empty or a singleton. We have $\mathrm{card}(X \times X) > \mathrm{card}(X)$ when $X$ is a finite set containing more than one element. In 6.22 we shall see that $\mathrm{card}(X \times X) = \mathrm{card}(X)$ when $X$ is an infinite set; however, the proof of that result will require the Axiom of Choice.
j. If $X$, $Y$, and $Z$ are any sets, then there is a natural bijection between $Z^{X \times Y}$ and $(Z^Y)^X$. Indeed, $f \in Z^{X \times Y}$ means that $(x,y) \mapsto f(x,y)$ is a map from $X \times Y$ into $Z$, while $f \in (Z^Y)^X$ means that $x \mapsto f(x, \cdot)$ is a map from $X$ into $Z^Y$. It is easy to see that this correspondence between $Z^{X \times Y}$ and $(Z^Y)^X$ is no more than a change of notation.

k. $\mathrm{card}(2^{\mathbb{N}}) = \mathrm{card}(\mathbb{N}^{\mathbb{N}})$. Hints: $\mathrm{card}(\mathbb{N}^{\mathbb{N}}) \le \mathrm{card}((2^{\mathbb{N}})^{\mathbb{N}}) = \mathrm{card}(2^{\mathbb{N} \times \mathbb{N}}) = \mathrm{card}(2^{\mathbb{N}}) \le \mathrm{card}(\mathbb{N}^{\mathbb{N}})$.

l. The set $\{0, 1\}$ is often called "2." Let $X$ be a set. We can identify each subset $S \subseteq X$ with its characteristic function $1_S : X \to \{0, 1\}$. Thus there is a natural bijection between the power set of $X$,
$$\mathcal{P}(X) = \{\text{subsets of } X\},$$
and the $X$th power of the set 2,
$$2^X = \{\text{functions from } X \text{ into } 2\}.$$
These two objects are often used interchangeably.

m. If $X$ is a finite set, then $\mathrm{card}(\mathcal{P}(X)) = 2^{\mathrm{card}(X)}$ is the number of subsets of $X$.

n. Exercise for beginners. List the eight subsets of $X = \{0, 1, 2\}$. Hint: Don't forget $\varnothing$ and $X$.

2.21. Theorem (Cantor). $\mathrm{card}(\mathcal{P}(X)) > \mathrm{card}(X)$ for every set $X$. Example: $\mathrm{card}(2^{\mathbb{N}}) > \mathrm{card}(\mathbb{N})$.

Hints: Easily $\mathrm{card}(\mathcal{P}(X)) \ge \mathrm{card}(X)$. Now suppose that there exists a bijection $f : X \to \mathcal{P}(X)$. Define $R = \{x \in X : x \notin f(x)\}$, and let $r = f^{-1}(R)$. Show that $r \in R \iff r \notin R$, a contradiction. Note: This contradiction is not paradoxical, but it is similar to Russell's Paradox (1.43).

How many kinds of infinity are there? By Cantor's Theorem (in 2.21),
$$\mathrm{card}(\mathbb{N}) < \mathrm{card}(\mathcal{P}(\mathbb{N})) < \mathrm{card}(\mathcal{P}(\mathcal{P}(\mathbb{N}))) < \mathrm{card}(\mathcal{P}(\mathcal{P}(\mathcal{P}(\mathbb{N})))) < \cdots$$
(where $\mathcal{P}$ denotes the power set). Thus there are infinitely many different kinds of infinity. We can get still more infinities, as follows: Let $S$ be the union of all the sets $\mathbb{N}$, $\mathcal{P}(\mathbb{N})$, $\mathcal{P}(\mathcal{P}(\mathbb{N}))$, $\mathcal{P}(\mathcal{P}(\mathcal{P}(\mathbb{N})))$, .... Then $S$ is bigger than any one of those sets. We can go further: We have
$$\mathrm{card}(S) < \mathrm{card}(\mathcal{P}(S)) < \mathrm{card}(\mathcal{P}(\mathcal{P}(S))) < \mathrm{card}(\mathcal{P}(\mathcal{P}(\mathcal{P}(S)))) < \cdots,$$
and we can continue this process again and again, infinitely many times.

Are there still more infinities? Perhaps there are some even bigger than anything obtained in the "list" suggested above, or perhaps there are some lying between two consecutive elements of that list.

An inaccessible cardinal (also known as a strongly inaccessible cardinal) is, roughly, a set too big to be in the list given above; i.e., it is an uncountable set that is bigger than
anything obtainable from smaller sets via power sets and unions. We shall not make this precise; refer to books on set theory and logic (e.g., Shoenfield [1967]) for details. It is not intuitively obvious whether such enormous cardinals exist. Their existence or nonexistence is taken as a hypothesis in some studies in set theory. Surprisingly, such assumptions about enormous sets lead to important conclusions about "ordinary" sets such as $\mathbb{R}$; see Chapter 14.

In applicable analysis one seldom has any need for infinite cardinalities other than $\mathrm{card}(\mathbb{N})$ or $\mathrm{card}(2^{\mathbb{N}}) = \mathrm{card}(\mathbb{R})$. The Continuum Hypothesis (CH) asserts that there are no other cardinalities between those two. Cantor spent a large part of his last years trying to prove that CH was true or false. The question remained open for decades. Finally, Gödel and Cohen developed new methods to show that neither the truth nor the falsehood of CH can be proved from the usual axioms of set theory; thus CH is independent of those axioms. This is explained briefly in Chapter 14. The Generalized Continuum Hypothesis (GCH) asserts that for any infinite set $X$, there are no other cardinalities between $\mathrm{card}(X)$ and $\mathrm{card}(2^X)$.

INDUCTION AND RECURSION ON THE INTEGERS

2.22. We assume the reader is familiar with the basic properties of the natural numbers $\mathbb{N} = \{1, 2, 3, \ldots\}$. (Caution: Some mathematicians use the symbol $\mathbb{N}$ for the set $\{0, 1, 2, 3, \ldots\}$, but in this book $0 \notin \mathbb{N}$.) Following are two basic principles about the natural numbers. Induction is a method for proving statements about objects that have already been defined; recursion is a method for defining new objects. Both of these principles are generalized to sets other than $\mathbb{N}$ elsewhere in this book; they are then referred to as transfinite induction and recursion. For now, we note a few elementary applications of the countable case.

Principle of Countable Induction. Suppose $1 \in T \subseteq \mathbb{N}$ and $T$ has the property that whenever $n \in T$ then also $n + 1 \in T$. Then $T = \mathbb{N}$.

This principle can also be formulated as a method for proving that a statement $P(x)$ is true for every $x \in \mathbb{N}$: just take $T = \{x \in \mathbb{N} : P(x)\}$. (Note: To logicians, this reformulation is not quite equivalent. It is usually understood that the statement $P(x)$ must be expressed using finitely many symbols from a language with only countably many symbols, so there are only countably many possible $P$'s, but there are uncountably many sets $T \subseteq \mathbb{N}$.)

For our second principle, we shall agree that the empty sequence (the sequence with no components, or the sequence of length 0) is a finite sequence and hence a member of the domain of $p$.

Principle of Countable Recursion. Let $T$ be a set, and let $p$ be some mapping from {finite sequences in $T$} into $T$. Then there exists a unique sequence $(t_1, t_2, t_3, \ldots)$ in $T$ that satisfies $t_n = p(t_1, t_2, \ldots, t_{n-1})$ for all $n$. In other words, our definition of $t_n$ may depend on all the preceding definitions.
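The recursion principle can be sketched in code, as a way of seeing how a single rule $p$ on finite sequences determines the whole sequence. In this illustrative sketch the names `recursive_sequence`, `fibonacci_rule`, and `doubling_rule` are hypothetical, and only finitely many terms are computed; the first term is $p$ applied to the empty sequence, as agreed above.

```python
from typing import Callable, List, Tuple


def recursive_sequence(p: Callable[[Tuple[int, ...]], int],
                       length: int) -> List[int]:
    """Return (t_1, ..., t_length), where t_n = p(t_1, ..., t_{n-1})."""
    terms: List[int] = []
    for _ in range(length):
        # p receives all previously defined terms; for t_1 it receives ().
        terms.append(p(tuple(terms)))
    return terms


def fibonacci_rule(previous: Tuple[int, ...]) -> int:
    """A rule that looks only at the last two terms."""
    if len(previous) < 2:
        return 1
    return previous[-1] + previous[-2]


def doubling_rule(previous: Tuple[int, ...]) -> int:
    """A rule that depends on all preceding terms: t_n = 1 + t_1 + ... + t_{n-1}."""
    return 1 + sum(previous)


fib = recursive_sequence(fibonacci_rule, 8)
powers = recursive_sequence(doubling_rule, 6)
```

Here `fib` is `[1, 1, 2, 3, 5, 8, 13, 21]` and `powers` is `[1, 2, 4, 8, 16, 32]`; the second rule shows why it is convenient to let $p$ see the entire finite sequence rather than only the last term.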
2.23. Examples in countable induction and recursion.

a. Factorials are defined recursively: $0! = 1$, and $(n+1)! = (n+1) \cdot (n!)$ for $n = 0, 1, 2, 3, \ldots$. (We read "$n!$" as "$n$ factorial.") The first few factorials are $0! = 1$, $1! = 1$, $2! = 2$, $3! = 6$, and $4! = 24$.

b. The binomial coefficient $\binom{n}{k}$ (read "$n$ choose $k$") can be defined directly by a formula:
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} \qquad (n = 0, 1, 2, \ldots;\ k = 0, 1, \ldots, n),$$
or it can be defined by recursion on $n$: we take $\binom{0}{0} = 1$, and then
$$\binom{n+1}{0} = \binom{n+1}{n+1} = 1, \qquad \binom{n+1}{k} = \binom{n}{k-1} + \binom{n}{k} \quad (0 < k \le n).$$
Show by induction that the two methods of defining $\binom{n}{k}$ yield the same values. Also, using the second method, show that the $\binom{n}{k}$'s are the numbers in Pascal's Triangle.

                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1

    Pascal's Triangle: Each number is the sum of the two numbers above it.

By convention, we define $\binom{n}{k} = 0$ when $n \ge 0$ and $k \in \mathbb{Z} \setminus \{0, 1, 2, \ldots, n\}$.

c. By induction on $n$, prove the Binomial Theorem:
$$(x + y)^n = \sum_{j=0}^{n} \binom{n}{j} x^j y^{n-j}.$$
An example is $(x + y)^4 = y^4 + 4xy^3 + 6x^2y^2 + 4x^3y + x^4$.

d. A prime number is an integer greater than 1 that is not divisible by any positive integer except itself and 1. The first few prime numbers are $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, $p_4 = 7$, and $p_5 = 11$. The following induction argument proves that there are infinitely many prime numbers and also gives us a crude but easy upper bound on $p_n$. Assume that $p_1, p_2, \ldots, p_n$ have already been found, for some positive integer $n$. Then $q = p_1 p_2 \cdots p_n + 1$ is greater than $p_n$, and it is not divisible by any of $p_1, p_2, \ldots, p_n$. Hence either $q$ is a new prime, or it is divisible by a new prime. In any case, $p_{n+1} \le q \le 2 p_1 p_2 \cdots p_n$. Use induction to show that $p_n \le 2^{2^n}$.

e. Joke. Every positive integer has some remarkably interesting property. "Proof." If not, let $n_0$ be the first uninteresting number. Then $n_0$ has the property that it is the first uninteresting number, but isn't that an interesting property? Exercise. Carefully explain what has gone wrong here.
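The exercises on binomial coefficients and primes can be spot-checked numerically before attempting the induction proofs. The sketch below is only a finite check, not a proof; the function names are hypothetical, and the recursion uses the convention that $\binom{n}{k} = 0$ when $k$ lies outside $\{0, \ldots, n\}$.

```python
from math import factorial
from typing import List


def binomial_by_formula(n: int, k: int) -> int:
    """n! / (k! (n-k)!) for 0 <= k <= n, and 0 otherwise (by convention)."""
    if k < 0 or k > n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))


def binomial_by_recursion(n: int, k: int) -> int:
    """Pascal's rule: each entry is the sum of the two entries above it."""
    if n == 0:
        return 1 if k == 0 else 0
    return binomial_by_recursion(n - 1, k - 1) + binomial_by_recursion(n - 1, k)


def first_primes(count: int) -> List[int]:
    """The first `count` primes, found by trial division by earlier primes."""
    primes: List[int] = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


row_4 = [binomial_by_recursion(4, k) for k in range(5)]
primes = first_primes(8)
bound_holds = all(p <= 2 ** (2 ** n) for n, p in enumerate(primes, start=1))
```

Here `row_4` reproduces the coefficients $1, 4, 6, 4, 1$ of $(x + y)^4$, and `bound_holds` confirms $p_n \le 2^{2^n}$ for the first eight primes.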
Chapter 3
Relations and Orderings

3.1. Preview. The chart below shows the connections between some kinds of preorders that we shall study in this and later chapters. Directed orderings are studied further in Chapter 7. Lattices and order completeness are studied in greater detail in Chapter 4. Boolean algebras and Heyting algebras are covered in Chapter 13.

[Chart: connections among preordered set, set with equivalence relation, directed set, poset, Dedekind complete poset, lattice, distributive lattice, semi-infinitely distributive lattice, infinitely distributive lattice, Heyting algebra, algebra of sets, $(\mathcal{P}(\Omega), \subseteq)$, chain, lattice group, ordered field, and $\mathbb{R}$.]
RELATIONS

3.2. A relation (or binary relation) on a set $X$ is simply a set $R \subseteq X \times X$. This definition generalizes that in Chapter 2, but with this change in our notation: instead of writing $(x, y) \in R$, we may write $xRy$. We may sometimes refer to the subset of $X \times X$ as the graph of the relation; we may even denote it by $\mathrm{Graph}(R)$ or $\mathrm{Gr}(R)$.

Other symbols may be used in place of $R$. Some familiar symbols used in this fashion are

    $=$, $\ne$ (relations on any set $X$),
    $<$, $\le$, $>$, $\ge$ (relations on $\mathbb{R}$),
    $\subsetneq$, $\subseteq$, $\supsetneq$, $\supseteq$ (relations on a collection of sets).

3.3. Examples and special kinds of relations.

a. Equality ($=$) is a relation; its graph is the diagonal set $I = \{(x, x) : x \in X\}$.

b. The largest relation on a set $X$ is the universal relation: $xRy$ for all $x, y \in X$. Its graph is $X \times X$. When it is viewed as an ordering, we shall call it the universal ordering. Trivial though it may be, this relation is occasionally useful.

c. The smallest relation on $X$ is the empty relation; its graph is the empty set.

d. If $R$ is a relation on $X$ and $Y \subseteq X$, then the restriction of $R$ to $Y$ (or trace of $R$ on $Y$) is the relation $R|_Y$ defined by $u\,R|_Y\,v$ if and only if $u, v \in Y$ and $uRv$. In other words, $\mathrm{Graph}(R|_Y) = \mathrm{Graph}(R) \cap (Y \times Y)$. By a slight abuse of notation, we often denote $R|_Y$ simply by the same symbol $R$; for example, a restriction of any of the relations $=$, $\le$, $\subseteq$ is still denoted by the same symbol.

e. The inverse of a relation $R$ is the relation $R^{-1}$ defined by $x\,R^{-1}\,y \iff yRx$. Note that $(R^{-1})^{-1} = R$. For instance, the inverses of $=$, $<$, $\le$, $\subsetneq$, and $\subseteq$ are the relations $=$, $>$, $\ge$, $\supsetneq$, and $\supseteq$, respectively. If $\preccurlyeq$ and $\succcurlyeq$ are relations that are inverses of each other, then there exists a duality between $\preccurlyeq$ and $\succcurlyeq$; i.e., each statement about either of these relations can be converted to a statement about the other relation. See Chapter 1.

f. The composition of any two relations $Q$ and $R$ on a set $X$ is the relation defined by
$$Q \circ R = \{(x, y) \in X \times X : xRu \text{ and } uQy \text{ for at least one } u \in X\}.$$
If $Q$ and $R$ are in fact functions, then the composition defined in this fashion is the same as the composition defined in Chapter 2. Exercise. Verify that the compositions of relations satisfy $(P \circ Q) \circ R = P \circ (Q \circ R)$.
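Since a relation is just a set of pairs, the operations above translate directly into set manipulations. The following sketch (with hypothetical function names and arbitrarily chosen sample relations on $\{1, 2, 3\}$) illustrates inverse, restriction, and composition, and checks associativity on one example; it is a numerical illustration only, not a proof of the exercise.

```python
from typing import Iterable, Set, Tuple

Relation = Set[Tuple[int, int]]


def inverse(r: Relation) -> Relation:
    """x R^{-1} y  if and only if  y R x."""
    return {(y, x) for (x, y) in r}


def restrict(r: Relation, subset: Iterable[int]) -> Relation:
    """Graph(R|_Y) = Graph(R) intersected with Y x Y."""
    members = set(subset)
    return {(x, y) for (x, y) in r if x in members and y in members}


def compose(q: Relation, r: Relation) -> Relation:
    """Q o R = {(x, y) : x R u and u Q y for at least one u}."""
    return {(x, y) for (x, u) in r for (v, y) in q if u == v}


# Arbitrary sample relations on X = {1, 2, 3}.
P: Relation = {(1, 2), (2, 3)}
Q: Relation = {(2, 2), (3, 1)}
R: Relation = {(1, 3), (2, 2), (3, 3)}

left = compose(compose(P, Q), R)
right = compose(P, compose(Q, R))
```

For these samples both `left` and `right` equal `{(1, 2), (2, 3), (3, 2)}`. Note the order of application in `compose`: as with functions, $R$ acts first and $Q$ second.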
3.4. More symbols for relations. A familiar symmetric relation is equality ($=$). For other symmetric relations, we often use the symbols $\sim$ or $\approx$.

Inequality ($\le$) and inclusion ($\subseteq$) are familiar relations that are not symmetric. For other relations that are not symmetric, or that are not known to be symmetric, we shall often use the symbols $\preccurlyeq$ and $\prec$. Occasionally we may also use $\sqsubseteq$ and $\sqsubset$.

Some mathematicians prefer to use the symbols $\le$ and $<$ for any relation that is not necessarily symmetric, because these symbols are more familiar and therefore easier to draw. However, beginners sometimes inadvertently attribute to those symbols some familiar properties of the ordering of the real numbers; e.g., they may implicitly assume that $\le$ is a chain ordering (defined in 3.23). To reduce the frequency of this type of error, we will usually reserve the symbols $\le$, $<$ for chain orderings, and use $\preccurlyeq$, $\prec$ for a more "generic" ordering. This makes it easier for beginners to disassociate themselves from the familiar properties of $\mathbb{R}$ and start over with a fresh perspective. Admittedly, $\preccurlyeq$ is difficult to draw on a blackboard; perhaps a simpler symbol could be used as a blackboard substitution.

In this book the symbols $\preccurlyeq$ and $\prec$ will always denote, respectively, a reflexive relation and an irreflexive relation, which are connected as follows:

    $x \preccurlyeq y$ means that either $x \prec y$ or $x = y$ holds;
    $x \prec y$ means that both $x \preccurlyeq y$ and $x \ne y$ hold.

In other words, the sets $I = \{(x, x) : x \in X\}$ and $\mathrm{Graph}(\prec)$ form a partition of the set $\mathrm{Graph}(\preccurlyeq)$. Because $\preccurlyeq$ and $\prec$ are connected in this fashion, we can usually state our results just in terms of $\preccurlyeq$, without explicitly mentioning corresponding results for $\prec$. A similar convention will apply to the pair $\sqsubseteq$, $\sqsubset$.

Inverses (see 3.3) of $\preccurlyeq$, $\prec$, $\sqsubseteq$, $\sqsubset$ will be denoted respectively by $\succcurlyeq$, $\succ$, $\sqsupseteq$, $\sqsupset$. The symbols $\preccurlyeq$ and $\sqsubseteq$ may be read as precedes, is less than, is smaller than, is littler than, is earlier than, etc. Their inverses ($\succcurlyeq$ and $\sqsupseteq$) may be read as succeeds, is more than, is larger than, is bigger than, is later than, etc. The symbols $\prec$ and $\sqsubset$ may be read as strictly precedes, etc., and the symbols $\succ$ and $\sqsupset$ may be read as strictly succeeds, etc.

3.5. Let $R$ be a relation on a set $X$, and let $I$ be the diagonal set of $X$ (see 3.3.a). Many relations of interest to us satisfy the condition of

    transitivity: $R \circ R \subseteq R$. That is, $xRy$ and $yRz$ imply $xRz$.

Most relations of interest to us also satisfy either

    reflexivity: $R \supseteq I$. That is, $xRx$ for all $x \in X$, or
    irreflexivity: $R \cap I = \varnothing$. That is, $xRx$ for no $x \in X$.

Also, most satisfy either

    symmetry: $R^{-1} = R$. That is, $xRy$ implies $yRx$, or
    antisymmetry: $R \cap R^{-1} \subseteq I$. That is, $xRy$ and $yRx$ imply $x = y$.

Some examples are given in 3.6.
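The five conditions just defined are easy to test mechanically on a finite relation. The sketch below uses hypothetical predicate names; `less_or_equal` is the usual ordering on $\{1, 2, 3\}$, and `sample` is a small relation chosen here for illustration because it fails all five conditions at once.

```python
from typing import Iterable, Set, Tuple

Relation = Set[Tuple[int, int]]


def is_reflexive(r: Relation, xs: Iterable[int]) -> bool:
    return all((x, x) in r for x in xs)


def is_irreflexive(r: Relation, xs: Iterable[int]) -> bool:
    return all((x, x) not in r for x in xs)


def is_symmetric(r: Relation) -> bool:
    return all((y, x) in r for (x, y) in r)


def is_antisymmetric(r: Relation) -> bool:
    # x R y and y R x together force x == y.
    return all(x == y for (x, y) in r if (y, x) in r)


def is_transitive(r: Relation) -> bool:
    return all((x, z) in r for (x, y) in r for (u, z) in r if y == u)


X = {1, 2, 3}
less_or_equal: Relation = {(x, y) for x in X for y in X if x <= y}
sample: Relation = {(1, 2), (2, 2), (2, 3), (3, 2)}

sample_properties = [
    is_reflexive(sample, X),
    is_irreflexive(sample, X),
    is_symmetric(sample),
    is_antisymmetric(sample),
    is_transitive(sample),
]
```

For `less_or_equal` the reflexivity, antisymmetry, and transitivity checks succeed and the symmetry check fails; for `sample` every entry of `sample_properties` is `False`.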
3.6. Examples and exercises.

a. Six familiar relations are $=$, $\ne$, $<$, $\le$, $\subsetneq$, $\subseteq$. Among these examples,

    $=$, $\le$, $\subseteq$ are reflexive;
    $\ne$, $<$, $\subsetneq$ are irreflexive;
    $=$, $\ne$ are symmetric;
    $=$, $<$, $\le$, $\subsetneq$, $\subseteq$ are antisymmetric;
    $=$, $<$, $\le$, $\subsetneq$, $\subseteq$ are transitive.

b. Let $R$ be a relation that is both symmetric and antisymmetric. Then (i) $R$ is reflexive if and only if $R$ is equality ($=$), and (ii) $R$ is irreflexive if and only if $R$ is the empty relation.

c. If $R$ is a relation on $X$ that is transitive and irreflexive, then $R$ is also antisymmetric, but vacuously so: there cannot be $x, y$ satisfying $xRy$ and $yRx$ simultaneously.

d. Let $\preccurlyeq$ be a reflexive relation on a set $X$ and let $\prec$ be the corresponding irreflexive relation. Then (i) $\preccurlyeq$ is symmetric if and only if $\prec$ is symmetric, and (ii) $\preccurlyeq$ is transitive if and only if $\prec$ is transitive.

e. Show that $R = \{(1,2), (2,2), (2,3), (3,2)\}$ is a relation on the set $X = \{1, 2, 3\}$ that has none of the properties listed in 3.5; i.e., $R$ is not transitive, reflexive, irreflexive, symmetric, or antisymmetric.

PREORDERED SETS

3.7. Definitions. A preorder on a set $X$ is a relation $\preccurlyeq$ that is both

    transitive ($x \preccurlyeq y$ and $y \preccurlyeq z$ imply $x \preccurlyeq z$), and
    reflexive ($x \preccurlyeq x$).

A preordered set is a pair $(X, \preccurlyeq)$ consisting of a set $X$ and a preorder $\preccurlyeq$ on $X$; we may refer to $X$ itself as the preordered set if $\preccurlyeq$ does not need to be mentioned explicitly. A similar syntax will be used for special kinds of preordered sets: partially ordered sets, directed sets, lattices, chains, well ordered sets, sets with equivalence relations, etc.

3.8. We note a few important special types of preorders. Let $(X, \preccurlyeq)$ be a preordered set. Then $\preccurlyeq$ is a

    partial order, and $(X, \preccurlyeq)$ is a partially ordered set, or poset, if $\preccurlyeq$ is antisymmetric ($x \preccurlyeq y$ and $y \preccurlyeq x$ imply $x = y$);
    equivalence relation if it is symmetric ($x \preccurlyeq y$ implies $y \preccurlyeq x$);
    directed order, and $(X, \preccurlyeq)$ is a directed set, if for each $x_1, x_2 \in X$ there exists some $y \in X$ satisfying $x_1 \preccurlyeq y$ and $x_2 \preccurlyeq y$.

3.9. Basic properties and examples.

a. In a set $X$ equipped with a relation $\preccurlyeq$, we say that two elements $x$, $y$ are comparable if at least one of the relations $x \preccurlyeq y$ or $y \preccurlyeq x$ holds. (This terminology is mainly used in posets.)
b. The only partial order that is also an equivalence relation is equality ($=$).

c. If $\preccurlyeq$ is a partial order, then so is its inverse. If $\preccurlyeq$ is an equivalence relation, then so is its inverse.

d. Subsets of ordered sets. Define restrictions as in 3.3. Verify that if the relation on $X$ has the following property, then its restriction to any set $S \subseteq X$ has the same property: reflexive, irreflexive, symmetric, antisymmetric, transitive, preorder, equivalence, or partial order.

e. Trivially, the empty set is directed. Any singleton $\{x\}$ is directed, when equipped with the relation $\{(x, x)\}$.

f. Example (from McShane [1952]). A stream or river, together with its tributaries, is directed by the relation "is upstream from." Indeed, if $x$, $y$ are any two locations in the system, then there exists a third location $z$ that is downstream (i.e., later in the water flow) from both $x$ and $y$.

g. Most directed orderings of interest to us are antisymmetric. However, the universal ordering (defined in 3.3.b) is a directed ordering that we shall find useful and that is not antisymmetric. By calling the universal ordering and similar orderings "directed," we shall achieve a few simplifications in the development of the theory.

Caution: Some mathematicians make antisymmetry part of their definition of "directed set." Though we shall not follow that practice, we remark that adding antisymmetry to the definition of "directed set" does not greatly affect the ultimate applications. A not-necessarily-antisymmetric directed set can usually be replaced by a (perhaps more complicated) antisymmetric directed set, as in Chapter 7.

h. If $\mathcal{C}$ is any collection of subsets of a set $X$, then $(\mathcal{C}, \subseteq)$ is a poset. Actually, this example is as general as we could ask for: Every poset can be represented isomorphically in this form.

i. For each $\lambda$ in some index set $\Lambda$, let $\preccurlyeq_\lambda$ be a relation on some set $X_\lambda$. Then the product of the $\preccurlyeq_\lambda$'s is the relation $\preccurlyeq$ on the product of the $X_\lambda$'s, defined thus:
$$f \preccurlyeq g \text{ in } \prod_{\lambda \in \Lambda} X_\lambda \quad \text{means that} \quad f(\lambda) \preccurlyeq_\lambda g(\lambda) \text{ for every } \lambda \in \Lambda.$$
It may also be called the componentwise ordering or coordinatewise ordering since it acts separately on each component or coordinate. We shall use the product relation on $\prod_{\lambda \in \Lambda} X_\lambda$ unless some other arrangement is specified.

Verify that if all the $\preccurlyeq_\lambda$'s have one of the following properties, then the product ordering $\preccurlyeq$ has the same property: reflexive, symmetric, antisymmetric, transitive, preorder, directed order, equivalence, or partial order.

j. There are many ways to define an ordering on a collection of functions. One commonly used method is the product ordering, defined above. Another common method is by inclusion of graphs: let $f \preccurlyeq g$ mean that $\mathrm{Gr}(f) \subseteq \mathrm{Gr}(g)$. The resulting relation is a partial ordering; this is a special case of 3.9.h.

k. The restriction of a directed order is not necessarily directed. For instance, $\mathbb{Z}^2$ with the product ordering is directed (in fact, a lattice), but its subset $\{(x, y) \in \mathbb{Z}^2 : x + y = 0\}$ is not directed.
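The example of $\mathbb{Z}^2$ and its antidiagonal can be explored on a finite window. The sketch below uses hypothetical names, and it restricts attention to the pairs with coordinates in $\{-2, \ldots, 2\}$, so it illustrates the phenomenon rather than proving the statement about all of $\mathbb{Z}^2$.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[int, int]


def product_le(p: Sequence[int], q: Sequence[int]) -> bool:
    """Componentwise (product) ordering: p <= q in every coordinate."""
    return all(a <= b for a, b in zip(p, q))


def is_directed(points: List[Point],
                le: Callable[[Point, Point], bool]) -> bool:
    """Every pair of points has a common upper bound within `points`."""
    return all(any(le(p, u) and le(q, u) for u in points)
               for p in points for q in points)


# A finite window of Z^2 and the part of the antidiagonal x + y = 0 inside it.
window: List[Point] = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
antidiagonal: List[Point] = [(x, -x) for x in range(-2, 3)]

window_directed = is_directed(window, product_le)
antidiagonal_directed = is_directed(antidiagonal, product_le)
```

The window is directed, since $(\max(x_1, x_2), \max(y_1, y_2))$ stays inside it, while the antidiagonal is not: an upper bound $(a, -a)$ for both $(1, -1)$ and $(-1, 1)$ would need $a \ge 1$ and $-a \ge 1$. The same inequality argument applies in all of $\mathbb{Z}^2$.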
MORE ABOUT EQUIVALENCES

3.10. An equivalence relation is a relation $\approx$ that is symmetric ($x \approx y \Rightarrow y \approx x$), reflexive ($x \approx x$ for all $x$), and transitive ($x \approx y$, $y \approx z \Rightarrow x \approx z$). If some equivalence relation $\approx$ has been specified, two objects $x$ and $y$ satisfying $x \approx y$ are said to be equivalent.

The student is cautioned that the term "equivalent" is highly context-dependent: That one word is used for many different relations in different parts of mathematics. Our language would be more precise if we gave slightly different names to different equivalence relations, e.g., by attaching a distinguishing prefix to the word "equivalence" for each relation, but unfortunately that is not customary.

On any set $X$, the smallest equivalence relation is equality ($=$). The largest equivalence relation is the universal relation, defined in 3.3.b; that is, $x \approx y$ for all $x$ and $y$ in $X$.

3.11. Here are a few examples of ways that equivalence relations can arise:

a. Let $\mathcal{S} = \{S_\lambda : \lambda \in \Lambda\}$ be a partition of a set $X$; that is, the sets $S_\lambda$ are disjoint and their union is the set $X$. Call two elements of $X$ equivalent if they belong to the same $S_\lambda$. It is easy to see that this is an equivalence relation on $X$. The $S_\lambda$'s are then called the equivalence classes of the relation. Actually, every equivalence relation can be expressed in this form, as shown in 3.11.c.

b. Let $\pi$ be a function with domain $X$. Define $x_1 \approx x_2$ if $\pi(x_1) = \pi(x_2)$; we easily verify that this makes $\approx$ an equivalence relation on $X$. Actually, every equivalence relation can be represented in this form, as shown below.

c. Let $\approx$ be any equivalence relation on a set $X$. For each $x \in X$, let $\pi(x) = \{y \in X : y \approx x\}$. We easily verify that, for any $x, x' \in X$, the sets $\pi(x)$ and $\pi(x')$ are either identical or disjoint; hence the distinct sets of the form $\pi(x)$ form a partition $\mathcal{S}$ of the given set $X$. Moreover, the given equivalence relation $\approx$ can be retrieved from this partition, as in 3.11.a. The collection $\mathcal{S}$ of equivalence classes is called the quotient set. The surjective mapping $\pi : X \to \mathcal{S}$ is called the quotient map or quotient projection. The given equivalence relation $\approx$ can be retrieved from this mapping, as in 3.11.b.

Represented in many ways. The quotient set is most often represented by an expression of the form $X/\xi$, where $\xi$ is any device used to define the equivalence relation. Thus, the quotient set may be represented by

    $X/{\approx}$, where $\approx$ is the equivalence relation;
    $X/\pi$, if $\approx$ is determined by a mapping $\pi$ as in 3.11.b;
    $X/S$, if $\approx$ is determined by some subgroup $S$, as in Chapter 8;
    $X/\mathcal{F}$, if $\approx$ is determined by a filter $\mathcal{F}$, as in Chapter 9;
    $X/\mathcal{J}$, if $\approx$ is determined by an ideal $\mathcal{J}$, as in Chapter 9.
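The passage between mappings, partitions, and equivalence relations can be sketched for a finite set. In the following illustration the function names are hypothetical, and the sample mapping (remainder modulo 3 on $\{0, \ldots, 5\}$) is an arbitrary choice.

```python
from collections import defaultdict
from typing import Callable, FrozenSet, Hashable, Iterable, Set, Tuple


def quotient_by_map(xs: Iterable[int],
                    pi: Callable[[int], Hashable]) -> Set[FrozenSet[int]]:
    """Equivalence classes of the relation x1 ~ x2 iff pi(x1) == pi(x2)."""
    classes = defaultdict(set)
    for x in xs:
        classes[pi(x)].add(x)
    return {frozenset(block) for block in classes.values()}


def relation_from_partition(
        partition: Iterable[FrozenSet[int]]) -> Set[Tuple[int, int]]:
    """x ~ y iff x and y belong to the same block of the partition."""
    return {(x, y) for block in partition for x in block for y in block}


X = range(6)
partition = quotient_by_map(X, lambda n: n % 3)
retrieved = relation_from_partition(partition)
expected = {(x, y) for x in X for y in X if x % 3 == y % 3}
```

Here `partition` consists of the blocks $\{0, 3\}$, $\{1, 4\}$, $\{2, 5\}$, and `retrieved` coincides with `expected`: the relation defined by the mapping is recovered from the partition it induces.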
3.12. Let $\approx$ be an equivalence relation on a set $X$; let $Q$ be the resulting quotient set and let $\pi : X \to Q$ be the quotient mapping. Show:

a. Let $f : X \to Y$ be some function. A function $f$ defined on $X$ is said to respect the equivalence $\approx$ if the value of $f(x)$ is unchanged when $x$ is replaced by an equivalent element of $X$; that is, if $x_1 \approx x_2 \Rightarrow f(x_1) = f(x_2)$. Another way to say this is that each set of the form $f^{-1}(z)$ is a union of equivalence classes. We can define a corresponding function $\hat{f} : Q \to Y$ by the rule $\hat{f}(\pi(x)) = f(x)$ if and only if $f$ respects $\approx$. We then say that the function $\hat{f}$ is well defined. The hat over the $f$ is sometimes omitted: if no confusion will result, we sometimes use the same symbol $f$ again for the new function defined on $Q$.

b. Let $R$ be some relation on $X$. Similarly, a relation $R$ on $X$ is said to respect the equivalence $\approx$ if the validity of the statement $u R v$ is unaffected when $u$, $v$ are replaced by equivalent elements of $X$; that is, if
$$u \approx u_1,\quad v \approx v_1,\quad u R v \quad\Longrightarrow\quad u_1 R v_1.$$
We can define a corresponding relation $\hat{R}$ on $Q$ by the rule $\pi(u)\,\hat{R}\,\pi(v) \iff u R v$ if and only if $R$ respects the equivalence relation $\approx$. We then say that the relation $\hat{R}$ is well defined. The hat over the $R$ is sometimes omitted: If no confusion will result, we sometimes use the same symbol $R$ again for the new relation defined on $Q$.

c. Let $\preccurlyeq$ be a preordering on a set $X$, and define $x \approx y$ to mean that $x \preccurlyeq y$ and $y \preccurlyeq x$. Show that $\approx$ is an equivalence relation on $X$, and that $\preccurlyeq$ respects this equivalence relation. Show that the resulting relation $\preccurlyeq$ is a partial ordering on the quotient set $Q$.

d. Example. Let $(X, d)$ be a pseudometric space (defined in 2.11). Define an equivalence relation on $X$ by: $x \approx y$ if and only if $d(x, y) = 0$. Then $d$ acts as a metric on the quotient space $X/{\approx}$. More precisely, let $\pi : X \to X/{\approx}$ be the quotient map; then a metric $D$ on $X/{\approx}$ can be defined by $D(\pi(x), \pi(y)) = d(x, y)$.

More generally, let $(X, D)$ be a gauge space. An equivalence relation $\approx$ on $X$ can be defined by: $x \approx y$ if and only if $d(x, y) = 0$ for all pseudometrics $d \in D$. Then $D$ acts as a separating gauge on the quotient space $X/{\approx}$.

3.13. The term "equivalent" also has some common uses that are implicit in our mathematical language: Two words, phrases, or definitions are equivalent if they have the same meaning. This is an equivalence relation on the set of all words, phrases, or definitions in our vocabulary.

Similarly, two statements are equivalent if each implies the other via some set of rules of inference. This is an equivalence relation on the set of all statements that can be expressed in our mathematical language. Since different rules of inference may be used, there are actually several meanings for "equivalent statements." Here are two main interpretations:

    • Many mathematicians call two statements "equivalent" if each implies the other easily, i.e., by a fairly short and elementary proof. Of course, "elementary" is a subjective
Caristi's Fixed Point Theorem and BrSnsted's Maximal Principle are not equivalent.56 C h a p t e r 3: Relations and Orderings term here. 9 a<x<b}. slightly different terminology is commonly used. An o r d e r i n t e r v a l in X is a subset of the form [a. (a.+oc] {x E [ . With this interpretation.51) as "equivalent" because each implies the other easily.. see the discussion in 19. statements equivalent in this sense are sometimes called effectively e q u i v a l e n t . (A3). Most mathematicians do not make any restriction on the use of the Axiom of Choice.o c . a n d antisymmetric (x 4 y and y 4 x imply x  y). In R or [oc. MORE ABOUT POSETS 3. 4) be a partially ordered set. +oc].45 and BrSnsted's Maximal Principle ((DC4) in 19.v>. what is elementary for one mathematician may not be elementary for another.18.b] {xEX 9 a4x4b} for some a. Logicians sometimes give the Axiom of Choice special status and treat it as a statement rather than as a rule of inference.. Definitions.b) (a. A set equipped with such an ordering is a p a r t i a l l y o r d e r e d set.o c . + o c ] {xE [oc. transitive ( x 4 y . Strictly speaking. An i n t e r v a l is any set of one of the following types" [a. See 6. + o c ] {x e " a < x < b}. 9 a < 9 < b}. see 19. then the Axiom of Choice or its consequences can only be used when stated explicitly as hypotheses. the relation "each implies the other easily" is not really an equivalence relation. When this system is followed. then (A1) ~ (A100) by a proof that is not necessarily easy. (A2) . it may be used freely as a "rule of inference.b] = = = {x E [ . b E X. for it is not transitive: If (A1) ~ (A2).51.14. (A99) ~ (A100) by 99 easy proofs. Recall that a p a r t i a l o r d e r is a relation 4 that is reflexive (x 4 x for all x).b] [a. 9 a < x < b}. y 4 z ==> x 4 z ) ." An example: The mathematical literature sometimes refers to Caristi's Fixed Point Theorem 19. . Let (X. For emphasis..51. 
This system which will be followed in parts of this book enables us to trace the effects of the Axiom of Choice. or p o s e t .b) = .
4) be a poset. 3.1. Note that any subset of an order bounded set is order bounded. Some older books refer to lower sets as initial segments or order ideals. orderreversing) if Xl ~ x2 ~ p(xa) E p(x2). Clearly. It is simply called "bounded" if the context is clear. +oc). It is nonempty. xEX. A lower set is equal to the union of all the principal lower sets that it contains. It is sometimes denoted by ~w. C_).2. and 27. Exercise. Although the statement "S is bounded" does not mention the set X explicitly. sES =r xES. boundedness of a set S C_ X depends very much on the choice of X. a. an interval of the form (a. It is empty if and only if w = min(X). It is proper. orderpreserving)if Xl ~ x2 ~ d e c r e a s i n g (antitone. +oc] (thus justifying our notation).c o . 3. the extended real line is the interval [oc. 4) onto a subset of the poset (T(X).{x c R" x >_ 0}U{+oc}. +oc] itself is bounded. Let (X. +oc]. 27. c_) for some collection e of sets. ~) and (Y. and the real line R is the interval ( . 3. Z is unbounded when considered as a subset of R (with its usual ordering).f. +oc] . is an order isomorphism from (X.More about Posets 57 for any extended real numbers a. In fact. d. . An interval of the form [a. Thus any poset can be represented isomorphically in the form (e. +oc] is bounded. b. One lower set is the set of p r e d e c e s s o r s of w.15. Any other lower set is called a p r o p e r lower set. b) is an o p e n i n t e r v a l . This terminology reflects the topological structure of IR or Eoc. Lower sets are discussed further in 4. introduced in 5. A mapping p" X ~ Y is i n c r e a s i n g (isotone. Two other important sets are [0. possibly inequivalent meanings see 4. X is a lower set in itself. __) be partially ordered sets.16.17.{x E R" x _> 0} and [0. sending each element to its principal lower set. A lower set in X is a set S c_ X with the property that x4s.40. A set S c_ X is o r d e r b o u n d e d if it is contained in an order interval. 
In particular.b. Special examples and properties. Fortunately. c. p(xl) ~_ p(x2). b] is sometimes called a closed interval. since [oc. +oc] (introduced in 1. but Z is bounded when considered as a subset of the extended real line [oc. Let X be a poser. b.4. all the usual meanings of "bounded" coincide at least for subsets of R n. but be aware that the term "bounded" has other. For instance.17). It is improper if and only if w = max(X). +oc) . every subset of [oc.15. 23.4. defined by Pre(w) {xCX 9 x<w}. The mapping w ~ ~w. Let (X. The p r i n c i p a l lower set determined by any w E X is the set {x E X : x ~ w}.
are both o r d e r .8. ~) into (Z.) is increasing if rl _< r2 _< r3 _< . S1 ~ ~2 ~ f(S1) C f(S2). . . The chart also includes suppreserving and infpreserving.22.18. 10. let _< be the usual ordering on Z. Then the identity map x H x is increasing from (Z.) The relationships between these kinds of mappings are explored in the next few exercises. c. as a preview of notions that will be introduced in 3. c_) into itself. defined in 2. _<). Chapter 3: Relations and Orderings s t r i c t l y increasing or decreasing or monotone if it is injective and (respectively) increasing or decreasing or monotone. for any set X. but not from (Z. a. 2 0 .. an o r d e r i s o m o r p h i s m if it is a bijection from X onto Y such that both p and p1 are increasing. d. 3. ... The inverse of an increasing bijection need not be increasing.p r e s e r v i n g . _<) into (Z.. } .l ( T 2 ) . Then the forward image map f " T(X) ~ T(Y)' and the inverse image map f .r3.7 and 2. ordered by inclusion. 2 5 . T1 C_ T2 =~ f . a chart below summarizes the results. b. S H CS is an antitone mapping from (IP(X). For instance..that is." some of these mathematicians use the term "increasing" where we have used the term "strictly increasing.l ( T 1 ) C_ f ." Analogous terminology is used for decreasing. monotone " strictly increasing Y " decreasing infpreserving suppreserving order isomorphism Caution: Some mathematicians use the terms nondecreasing or weakly increasing where we have used the term "increasing.1 . T ( y ) ~ T(X).58 m o n o t o n e if it is increasing or decreasing.r2. 4). Basic properties and examples. . A sequence of real numbers (rl. and let ~ be the partial ordering on Z defined by x ~ y if yx C {0. 15. (The terms "isotone" and "antitone" are used especially if X and Y are collections of sets. Let f " X ~ Y be any function.5..
MAX, SUP, AND OTHER SPECIAL ELEMENTS
3.19. Definitions. Let $(X, \preccurlyeq)$ be a partially ordered set, and let $y, z \in X$ and $S \subseteq X$.
a. We say $z$ is an upper bound for $S$ if $s \preccurlyeq z$ holds for each $s \in S$; we then say $S$ is bounded above. We emphasize that $z$ is not required to be an element of $S$. Dually, $z$ is a lower bound for $S$ if $s \succcurlyeq z$ holds for each $s \in S$; we then say $S$ is bounded below. A set is order bounded (as defined in 3.15) if and only if it is bounded both above and below.
b. $z$ is a maximum element of $S$ (also known as a greatest, largest, biggest, highest, or last element of $S$) if $z \in S$ and $z \succcurlyeq s$ for all $s \in S$. Clearly, a subset of a poset has at most one maximum. If it exists, it is denoted by $\max(S)$. Dually, $z$ is a minimum element of $S$ (also known as a least, smallest, littlest, lowest, or first element of $S$) if $z \in S$ and $z \preccurlyeq s$ for all $s \in S$. Again, a subset of a poset has at most one minimum; it may be denoted by $\min(S)$.
c. If $S \subseteq X$ is bounded above and the set of upper bounds of $S$ has a least element, then that element is called the least upper bound, or supremum or sup of $S$. (Among algebraists, it is also known as the join of $S$.) It is denoted l.u.b.$(S)$ or $\sup(S)$ or $\bigvee S$. If the elements of $S$ are represented by subscripted notation, as in $S = \{x_\alpha : \alpha \in A\}$, then $\bigvee S$ may also be denoted by $\bigvee_{\alpha \in A} x_\alpha$. The sup of two elements $x$ and $y$ is also written as $x \vee y$. To be precise, the value $\sup(S)$ may be referred to as the supremum of $S$ in $X$, for reasons indicated in 3.20.e. Dually, if $S \subseteq X$ is bounded below and the set of lower bounds of $S$ has a greatest element, then that element is called the greatest lower bound, or infimum or inf of $S$. (Among algebraists, it is also known as the meet of $S$.) It is denoted g.l.b.$(S)$ or $\inf(S)$ or $\bigwedge S$. The infimum of a set $S = \{x_\alpha : \alpha \in A\}$ may also be denoted by $\bigwedge_{\alpha \in A} x_\alpha$. The inf of two elements $x$ and $y$ is also written as $x \wedge y$.
d. A maximal element of $S$ is any $s_0 \in S$ with the property that no element of $S$ is strictly greater than $s_0$. Dually, a minimal element of $S$ is any $s_0 \in S$ with the property that no element of $S$ is strictly less than $s_0$.
3.20. Further remarks and notational conventions.
a. We emphasize that "max" and "min" are the abbreviations for "maximum" and "minimum," not "maximal" and "minimal." It may be helpful to think of maximal elements and suprema as two kinds of "almost maximums," i.e., objects with most of the properties one would find in a maximum. They can often be used in place of a maximum, in situations where a maximum is not available. (For instance, if we are trying to generalize some known theorem by modifying a known proof, we may at some point replace a maximum with a maximal element or a supremum.) Analogously, a minimal element or an infimum is an "almost minimum."
b. We write "$\preccurlyeq$-upper bound," "$\preccurlyeq$-maximal," "$\preccurlyeq$-$\max(S)$," "$\max_{\preccurlyeq}(S)$," etc., if we wish to emphasize or clarify which partial ordering is being used.
c. When the terms "max," "sup," etc., are applied to a collection of sets and no ordering is specified, then it is generally understood that $\subseteq$ is the ordering being used. Thus, for instance, a maximal element of a collection $\mathcal{F}$ of sets is an element of $\mathcal{F}$ that is not a subset of any other element of $\mathcal{F}$. Similarly, a largest element of $\mathcal{F}$ is an element of $\mathcal{F}$ that is a superset of every other element of $\mathcal{F}$. Note that a collection $\mathcal{F}$ of sets can only have a largest element if the union of all the elements of $\mathcal{F}$ is itself an element of $\mathcal{F}$, in which case that union is the largest element. Similarly, if $\mathcal{F}$ has a smallest member, that smallest member is equal to the intersection of all the members of $\mathcal{F}$. There are some slight similarities between our language for $(X, \preccurlyeq)$ and our language for $(\mathcal{P}(S), \subseteq)$ that may help in learning the vocabulary: $x \vee y$ is the join of two objects, while the union $A \cup B$ of two sets is obtained by "joining" them together. Also, $x \wedge y$ is the meet of two objects; two sets $A$ and $B$ are said to "meet" (in the sense of 1.26) if and only if their intersection $A \cap B$ is nonempty.
d. If $f$ is a mapping from a set $S$ into a poset, the expressions $\max f(S)$ and $\max_{s \in S} f(s)$ both mean $\max\{f(s) : s \in S\}$. Expressions for min, sup, and inf are interpreted analogously.
e. Context dependence of the definitions. The notation "$\sup(S)$" does not mention $X$ explicitly, but the value of $\sup(S)$ depends very much on the choice of the poset $(X, \preccurlyeq)$ of which $S$ is a subset. For instance, let $I$ denote the interval $[0, 1]$, and let
$$I^I = \{\text{functions from } I \text{ into } I\},$$
$$C(I, I) = \{\text{continuous functions from } I \text{ into } I\},$$
$$S = \{f \in C(I, I) : f(0) = 0\}.$$
Then $S \subseteq C(I, I) \subseteq I^I$. Let $I^I$ be given the product ordering, and let $C(I, I)$ be given the restriction of that ordering. Then the supremum of $S$ in $C(I, I)$ is the constant function 1, whereas the supremum of $S$ in $I^I$ is the characteristic function of the interval $(0, 1]$.
3.21. Elementary examples and properties. Let $(X, \preccurlyeq)$ be a partially ordered set. Let $z \in X$ and $S \subseteq X$. Then:
a. $z = \max S$ if and only if $z$ is both an element of $S$ and an upper bound for $S$.
b. $z = \min S$ if and only if $z$ is both an element of $S$ and a lower bound for $S$.
c. If $\max(S)$ exists, it is also the supremum of $S$ and the only maximal element of $S$.
d. If $\min(S)$ exists, it is also the infimum of $S$ and the only minimal element of $S$.
e. $X$ itself is bounded (in its own ordering) if and only if it has both a maximum and a minimum.
f. Let $x, y \in X$. Then $x \preccurlyeq y \iff \max\{x, y\} = y \iff \sup\{x, y\} = y$.
g. Suppose that $\sup(S)$ exists in $X$. Then $z \succcurlyeq \sup(S) \iff z \succcurlyeq s$ for every $s \in S$.
h. Degenerate example. $\varnothing$ is a bounded subset of $X$. Indeed, every element of $X$ is an upper bound and a lower bound for the empty set, since the requirement involving $s \in S$ is vacuously satisfied when there are no $s$'s. The set $\varnothing$ has a least upper bound if and only if $X$ has a first element, in which case those objects are the same. Similarly, $\varnothing$ has a greatest lower bound if and only if $X$ has a last element, in which case those objects are the same. Clearly, $\varnothing$ has neither a maximum nor a minimum element, nor a maximal or minimal element, since it does not have any element.
i. A subset of a poset can have at most one maximum and at most one supremum. However, a subset of a poset may have more than one maximal element. For instance, let $\Omega$ be any set containing more than one element, and let $X = \{\text{proper subsets of } \Omega\}$ be partially ordered by inclusion. Then each complement of a singleton (i.e., each set of the form $\Omega \setminus \{\omega\}$) is a maximal element of $X$.
j. A subset of a poset may have an upper bound without having a maximum. For instance, let $X = \mathbb{Z}^2$ have the product ordering. Then the subset $S = \{(0, -1), (-1, 0)\}$ has no maximum element, but it has $(0, 0)$ as an upper bound.
k. A subset of a poset need not have any maximal elements. For instance, let $X$ be the real line with its usual ordering. Then the set $S = \{x \in \mathbb{R} : x < 0\}$ has no maximal element, but it has 0 as a supremum. The set $\mathbb{R}$, considered as a subset of itself, has no maximum, no maximal element, and no supremum.
l. Let $(X, \preccurlyeq)$ be a poset. Then sup is an isotone map, and inf is an antitone map, from their domains into $X$. That is:
$$A \subseteq B \subseteq X \implies \sup A \preccurlyeq \sup B, \quad \inf A \succcurlyeq \inf B$$
whenever those sups and infs exist.
m. Proposition. Suppose that $\{S_\alpha : \alpha \in A\}$ is a collection of nonempty subsets of $X$ and $\inf(S_\alpha)$ exists for each $\alpha$. Then $\inf\{\inf(S_\alpha) : \alpha \in A\}$ exists if and only if $\inf\left(\bigcup_{\alpha \in A} S_\alpha\right)$ exists, in which case they are equal. (Analogous results hold for sups.) Hint: Show that $p$ is a lower bound for $\{\inf(S_\alpha) : \alpha \in A\}$ if and only if $p$ is a lower bound for $\bigcup_{\alpha \in A} S_\alpha$.
n. Let $P = \prod_{\lambda \in \Lambda} X_\lambda$ be a product of posets, with the product ordering (see 3.9.j). Let $\Phi$ be a nonempty subset of $P$. Verify that $\sup \Phi$ exists in $P$ if and only if the set $\{f(\lambda) : f \in \Phi\}$ has a supremum in $X_\lambda$ for each $\lambda$, in which case $\sup \Phi$ is a function defined on $\Lambda$ by
$$(\sup \Phi)(\lambda) = \sup\{f(\lambda) : f \in \Phi\} \quad \text{for each } \lambda \in \Lambda.$$
Thus, the supremum in $P$ is defined coordinatewise. We shall call it the pointwise supremum, or sometimes simply the supremum, of the set $\Phi$ in $P$. We emphasize that $\sup \Phi$ is a member of $P$ but not necessarily a member of $\Phi$. Analogous notations
are used for inf, max, and min. In particular, when $\Phi$ contains just two functions $x$ and $y$, we obtain $(x \vee y)(\lambda) = x(\lambda) \vee y(\lambda)$ and $(x \wedge y)(\lambda) = x(\lambda) \wedge y(\lambda)$.
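The distinctions among upper bounds, maxima, suprema, and maximal elements can be checked mechanically on a finite poset. The following is a minimal sketch, not taken from the text: the helper names (`upper_bounds`, `maximum`, `supremum`, `maximal_elements`) are hypothetical, and the poset is assumed to be a finite window of $\mathbb{Z}^2$ with the product ordering, so that the example of 3.21.j can be reproduced.

```python
from itertools import product

def upper_bounds(X, leq, S):
    """All z in X with s <= z for every s in S (z need not lie in S)."""
    return [z for z in X if all(leq(s, z) for s in S)]

def maximum(S, leq):
    """The maximum of S, or None if S has no maximum."""
    tops = [z for z in S if all(leq(s, z) for s in S)]
    return tops[0] if tops else None

def minimum(S, leq):
    """The minimum of S, or None if S has no minimum."""
    bottoms = [z for z in S if all(leq(z, s) for s in S)]
    return bottoms[0] if bottoms else None

def supremum(X, leq, S):
    """Least element of the set of upper bounds of S in X, or None."""
    return minimum(upper_bounds(X, leq, S), leq)

def maximal_elements(S, leq):
    """Elements of S with no strictly greater element in S."""
    return [z for z in S if not any(leq(z, s) and s != z for s in S)]

# Assumed example: a finite window of Z^2 with the product ordering.
X = list(product(range(-2, 3), repeat=2))

def leq(a, b):
    return a[0] <= b[0] and a[1] <= b[1]

S = [(0, -1), (-1, 0)]
print(maximum(S, leq))           # S has no maximum
print(supremum(X, leq, S))       # but it has a supremum in X
print(maximal_elements(S, leq))  # and two maximal elements
```

In this window, `supremum(X, leq, [])` returns the first element of `X`, in agreement with the degenerate example 3.21.h.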
3.22. Let $(X, \preccurlyeq)$ and $(Y, \sqsubseteq)$ be partially ordered sets. A mapping $p : X \to Y$ is
sup-preserving if, whenever $S$ is a nonempty subset of $X$ and $\sigma = \sup(S)$ exists in $(X, \preccurlyeq)$, then $\sup\{p(s) : s \in S\}$ exists in $(Y, \sqsubseteq)$ and is equal to $p(\sigma)$;
inf-preserving if, whenever $S$ is a nonempty subset of $X$ and $\iota = \inf(S)$ exists in $(X, \preccurlyeq)$, then $\inf\{p(s) : s \in S\}$ exists in $(Y, \sqsubseteq)$ and equals $p(\iota)$.
These are special kinds of increasing maps; see 3.17. Some basic properties follow.
a. Any order isomorphism is sup- and inf-preserving and strictly increasing.
b. Any sup- or inf-preserving map is also increasing. Hint: 3.21.f.
c. Examples. The inclusion maps $C(I, I) \subseteq I^I$ in 3.20.e, $V \subseteq \mathbb{R}^3$ in 4.21, and $ba(\mathcal{A}) \subseteq \mathbb{R}^{\mathcal{A}}$ in 11.47 are order-preserving, but they are not sup-preserving or inf-preserving. The inclusion $\mathcal{T} \subseteq \mathcal{P}(X)$ given in 5.21 is sup-preserving but not inf-preserving.
CHAINS
3.23. Definition. Let $(X, \preccurlyeq)$ be a poset. Then the following conditions are equivalent. If any, hence all, are satisfied, we say that $(X, \preccurlyeq)$ is a chain (or $\preccurlyeq$ is a total order or linear order or chain order).
(A) Any two elements of $X$ are comparable (defined in 3.9.a).
(B) Each two-element subset of $X$ has a first element.
(C) Each two-element subset of $X$ has a last element.
(D) Each nonempty finite subset of $X$ has a first element.
(E) Each nonempty finite subset of $X$ has a last element.
(F) $(X, \preccurlyeq)$ satisfies the Trichotomy Law: for each $x, y \in X$, exactly one of the three conditions $x \prec y$, $y \prec x$, $x = y$ holds. In other words, the sets $\mathrm{Graph}(\prec)$, $\mathrm{Graph}(\succ)$, and $I$ form a partition of $X \times X$.
3.24. Some important examples. The number systems $\mathbb{N} \subseteq \mathbb{Z} \subseteq \mathbb{Q} \subseteq \mathbb{R} \subseteq [-\infty, +\infty]$ play a major role in analysis. We shall give formal introductions to $\mathbb{Q}$ and $\mathbb{R}$ in later chapters,
but for now we assume that the reader is already familiar with these number systems at least informally. The reader should understand arithmetic and inequalities in $\mathbb{R}$. All of the number systems $\mathbb{N}, \mathbb{Z}, \mathbb{Q}, \mathbb{R}$ are chains. Indeed, $\mathbb{R}$ is a chain, and all the inclusions $\mathbb{N} \subseteq \mathbb{Z} \subseteq \mathbb{Q} \subseteq \mathbb{R}$ are order-preserving.
3.25. Elementary properties.
a. Any subset of a chain is a chain.
b. If $(X, \preccurlyeq)$ is a chain, then $(X, \succcurlyeq)$ is a chain.
c. A product of chains, with the product ordering (from 3.9.j), is not necessarily a chain. For instance, $\mathbb{N}^2$ is not a chain. Certain other orderings on a product may be chains or well orderings; see 3.44.
d. Suppose $(X, \le)$ is a chain and $S \subseteq X$ with $\sigma = \sup(S)$. Then for each $z \in X$ with $z < \sigma$ there exists some $s \in S$ with $z < s \le \sigma$.
3.26. A total preorder on a set $X$ is a preorder $\preccurlyeq$ (i.e., a reflexive, transitive relation) that also has this property: Any two elements $x, y \in X$ are comparable, i.e., at least one of the relations $x \preccurlyeq y$ or $y \preccurlyeq x$ holds.
Observe that a total preorder is in fact a total order if and only if it is antisymmetric. Let $\preccurlyeq$ be a total preorder on $X$; then:
a. An equivalence relation is given on $X$ by this rule: $x \approx y$ if both $x \preccurlyeq y$ and $y \preccurlyeq x$.
b. $\preccurlyeq$ defines a total order on the equivalence classes, i.e., on the quotient set $X/{\approx}$.
c. $\preccurlyeq$ can be extended to a total order $\le$ on $X$ (so that $\mathrm{Graph}(\preccurlyeq) \supseteq \mathrm{Graph}(\le)$) by this natural method: Define a chain ordering $\le$ arbitrarily within each of the equivalence classes. When $x$ and $y$ are not equivalent, say $x \le y$ if and only if $x \preccurlyeq y$.
3.27. The reader may be better able to appreciate transitivity and chains after considering Condorcet's Paradox: Even if we assume that each individual voter's preferences are ranked in a chain ordering, the preferences of a collection of voters (determined by majority rule) are not necessarily a chain ordering; they need not be transitive! For instance, a recent presidential election in the United States had three main candidates: Bush, Clinton, and Perot, hereafter represented by B, C, P. (For those readers who are not interested in politics, ask which fruit is preferred: banana, cherry, or peach; the mathematics is the same.) Before the election, I took a "straw poll" and asked my students which candidate they preferred. The class preferred Clinton over Bush; the class preferred Bush over Perot; but the class preferred Perot over Clinton! How is this possible? The following chart shows the details.

          B C P   B P C   C B P   C P B   P B C   P C B   Sum   Result
3-way       1       5       6       1       2       7      22
C > B                       6       1               7      14   Clinton beats Bush
C < B       1       5                       2               8
B > P       1       5       6                              12   Bush beats Perot
B < P                               1       2       7      10
P > C               5                       2       7      14   Perot beats Clinton
P < C       1               6       1                       8

Each individual voter's preferences are given by a chain ordering of the three candidates. There are six possible chain orderings of the candidates. For instance, one ordering is: Bush is first choice; Clinton is second choice; Perot is third choice. That ordering is represented by "B C P," in the first column. The row labeled "3-way" shows how many members of the class have that chain ordering; thus, that row shows the votes that would be cast in a contest between all three candidates. For instance, just one of the 22 voters had a "B C P" preference, so the number 1 appears in the "B C P" column, in the "3-way" row. Below the "3-way" row are rows showing the results of contests between any two candidates. The totals for each contest are in the column with the heading "Sum." For instance, in a contest between Bush and Clinton, 14 voters preferred Clinton over Bush, while 8 voters preferred Bush over Clinton. Thus we obtain the result "C > B." Of course, this situation can arise with other numbers of voters and other numbers of candidates.
Exercise. The simplest case of Condorcet's Paradox involves 3 candidates and 3 voters. Work out the details.
This type of paradox was first published by Condorcet in 1785. A characterization of the combinations of numbers that yield Condorcet's paradox, and further references, were given by Weber [1993]. A generalization to infinite sets of voters (with majority rule replaced by other kinds of rule) was studied by Haddad [1989]; further considerations about finite or infinite sets of voters can also be found in Kirman and Sondermann [1972].
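The pairwise tallies in the chart can be recomputed directly from the "3-way" row. The sketch below is only an illustration: the dictionary `ballots` transcribes the six counts from the chart, while the function names (`prefers`, `majority_relation`, `is_transitive`) are hypothetical helpers introduced here.

```python
from itertools import combinations

# "3-way" row of the chart: ranking (best to worst) -> number of voters.
ballots = {
    "BCP": 1, "BPC": 5, "CBP": 6,
    "CPB": 1, "PBC": 2, "PCB": 7,
}

def prefers(ballots, x, y):
    """Number of voters who rank candidate x above candidate y."""
    return sum(n for ranking, n in ballots.items()
               if ranking.index(x) < ranking.index(y))

def majority_relation(ballots, candidates="BCP"):
    """Set of pairs (x, y) such that a strict majority prefers x to y."""
    wins = set()
    for x, y in combinations(candidates, 2):
        xy, yx = prefers(ballots, x, y), prefers(ballots, y, x)
        if xy > yx:
            wins.add((x, y))
        elif yx > xy:
            wins.add((y, x))
    return wins

def is_transitive(rel):
    """True if (a, b) and (b, d) in rel always force (a, d) in rel."""
    return all((a, d) in rel
               for (a, b) in rel for (c, d) in rel if b == c)

wins = majority_relation(ballots)
print(sorted(wins))         # the three pairwise winners
print(is_transitive(wins))  # majority rule is not transitive here
```

Replacing `ballots` with three single-voter rankings is one way to experiment with the exercise above.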
VAN MAAREN'S GEOMETRY-FREE SPERNER LEMMA
3.28. Discussion and preview. The main result of this subchapter is a technical combinatorial result about preordered sets:
Van Maaren's Theorem. Let $\ell : X \to P$ be some given function, where $P$ and $X$ are nonempty sets and $P$ is finite. For each $p \in P$, assume $\preccurlyeq_p$ is a total preordering of $X$. Then there exists a function $\sigma$ from some nonempty subset of $P$ into $X$, satisfying:
• $\sigma(q) \preccurlyeq_q \sigma(r)$ for all $q, r \in \mathrm{Dom}(\sigma)$.
• There is no $x \in X$ that satisfies $\sigma(q) \prec_q x$ for all $q \in \mathrm{Dom}(\sigma)$.
• $\mathrm{Dom}(\sigma) = \ell(\mathrm{Ran}(\sigma))$.
This theorem is due to Maaren [1987]; our presentation is based on the exposition given by van de Vel (see Vel [1993]). The proof will take several pages and will require several more definitions and preliminary results. We complete the proof of the theorem in 3.36, and follow it with a corollary about approximate fixed points in 3.37. This material may be postponed. It is rather specialized and will not be used until 27.19, where we use it to prove Brouwer's Fixed Point Theorem and related results. We include van Maaren's argument this early in the book mainly in order to emphasize how elementary it is, i.e., to show that it does not depend on topology or geometry. For an abridged treatment, readers who are willing to skip some proofs may proceed directly to 3.37; the other ideas in this subchapter will not be needed elsewhere in this book.
The literature contains many different proofs of Brouwer's Theorem. Some of the proofs may appear short or elementary, but that is only because they have concealed some of the difficulty, usually by using some well-known but nontrivial theorem, about measures and Jacobian determinants or about the algebraic topology of simplicial triangulations. Those proofs, when carried out in detail, are (in this author's opinion) nonintuitive; they involve $n$-dimensional diagrams that are hard to visualize and that seem to have little to do with the central ideas of Brouwer's Theorem. Van Maaren's proof, though not shorter or simpler than the other proofs, avoids such drawbacks. Our presentation separates the proof of Brouwer's Theorem into two main components: a purely combinatorial result in 3.37 and a compactness argument in 27.19.
3.29. Notations and definitions. The cardinality of a set $S$ will be denoted $|S|$. The symmetric difference of two sets $S, T$ will be denoted $S \triangle T$. The domain and range of a function $\sigma$ will be denoted, respectively, by $\mathrm{Dom}(\sigma)$ and $\mathrm{Ran}(\sigma)$.
Throughout this subchapter, we assume some nonempty sets $P$ and $X$ are given, with $P$ finite. Also, we assume some mapping $\ell : X \to P$ is given; we shall call this function the labeling. An assignment will mean a function $\sigma : \mathrm{Dom}(\sigma) \to X$, where $\mathrm{Dom}(\sigma)$ is a nonempty subset of $P$.
Note that any assignment has a finite domain, hence also a finite range. An assignment is complete (with respect to $\ell$) if $\mathrm{Dom}(\sigma) = \ell(\mathrm{Ran}(\sigma))$. An assignment $\sigma$ will be called almost complete if $|\mathrm{Dom}(\sigma) \setminus \ell(\mathrm{Ran}(\sigma))| \le 1$. Two assignments $\sigma_1, \sigma_2$ will be called neighbors if either
$\mathrm{Dom}(\sigma_1) = \mathrm{Dom}(\sigma_2)$ and $|\mathrm{Ran}(\sigma_1) \triangle \mathrm{Ran}(\sigma_2)| = 1$, or
$\mathrm{Ran}(\sigma_1) = \mathrm{Ran}(\sigma_2)$ and $|\mathrm{Dom}(\sigma_1) \triangle \mathrm{Dom}(\sigma_2)| = 1$.
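The definitions of 3.29 translate almost literally into set operations. The following is a minimal sketch under stated assumptions: an assignment is represented as a Python dict from elements of $P$ to elements of $X$, the labeling as a dict `ell` from $X$ to $P$, and all function names and the sample data are hypothetical.

```python
def dom(sigma):
    """Dom(sigma): the set of labels on which the assignment is defined."""
    return set(sigma)

def ran(sigma):
    """Ran(sigma): the set of values taken by the assignment."""
    return set(sigma.values())

def labels(ell, sigma):
    """ell(Ran(sigma)): the labels carried by the values of sigma."""
    return {ell[x] for x in ran(sigma)}

def is_complete(ell, sigma):
    return dom(sigma) == labels(ell, sigma)

def is_almost_complete(ell, sigma):
    return len(dom(sigma) - labels(ell, sigma)) <= 1

def are_neighbors(s1, s2):
    # ^ is the symmetric difference of two Python sets.
    same_dom = dom(s1) == dom(s2) and len(ran(s1) ^ ran(s2)) == 1
    same_ran = ran(s1) == ran(s2) and len(dom(s1) ^ dom(s2)) == 1
    return same_dom or same_ran

# Assumed example: P = {1, 2}, X = {"a", "b", "c"}.
ell = {"a": 1, "b": 2, "c": 2}
sigma = {1: "a", 2: "b"}   # complete
tau = {1: "b"}             # almost complete, not complete
tau2 = {1: "b", 2: "b"}    # same range as tau, one more point in the domain
```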
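3.30. Observations. For any assignment $\sigma$,
a. $|\ell(\mathrm{Ran}(\sigma))| \le |\mathrm{Ran}(\sigma)| \le |\mathrm{Dom}(\sigma)|$. Since $|S| - |T| = |S \setminus T| - |T \setminus S|$ for any finite sets $S$ and $T$, we have
$$|\mathrm{Dom}(\sigma)| - |\ell(\mathrm{Ran}(\sigma))| = |\mathrm{Dom}(\sigma) \setminus \ell(\mathrm{Ran}(\sigma))| - |\ell(\mathrm{Ran}(\sigma)) \setminus \mathrm{Dom}(\sigma)| \le |\mathrm{Dom}(\sigma) \setminus \ell(\mathrm{Ran}(\sigma))|.$$
b. If $|\mathrm{Dom}(\sigma) \setminus \ell(\mathrm{Ran}(\sigma))| = 0$ then $\sigma$ is complete.
c. If $\sigma$ is almost complete, then $0 \le |\mathrm{Dom}(\sigma)| - |\ell(\mathrm{Ran}(\sigma))| \le 1$.
d. Let us abbreviate $\widehat{D} = |\mathrm{Dom}(\sigma)|$, $\widehat{R} = |\mathrm{Ran}(\sigma)|$, $\widehat{\ell} = |\ell(\mathrm{Ran}(\sigma))|$. The almost complete assignments can be classified into the complete assignments and three types of noncomplete assignments:

                                                      Complete    Type (i)    Type (ii)    Type (iii)
Is $\sigma$ injective?                                yes         no          yes          yes
Is $\ell$ injective on $\mathrm{Ran}(\sigma)$?        yes         yes         no           yes
Relation among $\widehat{D}, \widehat{R}, \widehat{\ell}$:
  Complete:    $\widehat{D} = \widehat{R} = \widehat{\ell}$
  Type (i):    $\widehat{D} - 1 = \widehat{R} = \widehat{\ell}$
  Type (ii):   $\widehat{D} = \widehat{R} = \widehat{\ell} + 1$
  Type (iii):  $\widehat{D} = \widehat{R} = \widehat{\ell}$
$|\ell(\mathrm{Ran}(\sigma)) \setminus \mathrm{Dom}(\sigma)|$   0           0           0            1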
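The classification in 3.30.d can be read off from three cardinalities. The sketch below is an illustration only: it assumes an assignment is a dict from $P$ to $X$ and the labeling a dict from $X$ to $P$, and the function name `classify` and the sample labeling are hypothetical.

```python
def classify(ell, sigma):
    """Place an assignment in the table of 3.30.d.

    ell   -- labeling, a dict from X to P
    sigma -- assignment, a dict from a nonempty subset of P to X
    """
    D = set(sigma)              # Dom(sigma)
    R = set(sigma.values())     # Ran(sigma)
    L = {ell[x] for x in R}     # ell(Ran(sigma))
    if len(D - L) > 1:
        return "not almost complete"
    if D == L:
        return "complete"
    if len(R) < len(D):         # sigma is not injective
        return "type (i)"
    if len(L) < len(R):         # ell is not injective on Ran(sigma)
        return "type (ii)"
    return "type (iii)"         # both injective, but Dom != ell(Ran)

# Assumed example labeling on X = {"a", "b", "c", "d"} with P = {1, 2, 3}.
ell = {"a": 1, "b": 2, "c": 2, "d": 3}
print(classify(ell, {1: "a", 2: "b"}))
print(classify(ell, {1: "b", 2: "b"}))
print(classify(ell, {1: "b", 2: "c"}))
print(classify(ell, {1: "b"}))
```

The order of the tests relies on 3.30.b: once the assignment is almost complete and not complete, exactly one of the three noncomplete types applies.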
3.31. If $\sigma$ is an almost complete assignment that is not complete, then $\mathrm{Dom}(\sigma) \setminus \ell(\mathrm{Ran}(\sigma))$ contains exactly one element. We shall call it the extraneous element for $\sigma$.
Proposition. Suppose that $\sigma$ and $\sigma'$ are assignments that are almost complete but not complete and they are neighbors. Then they have the same extraneous element.
Proof. Suppose that $\sigma$ and $\sigma'$ have extraneous elements $p$ and $p'$, respectively, where $p \ne p'$; we shall arrive at a contradiction. Since $\mathrm{Dom}(\sigma)$ and $\mathrm{Dom}(\sigma')$ differ by at most one element, one must contain the other; say $\mathrm{Dom}(\sigma) \subseteq \mathrm{Dom}(\sigma')$. Since $p \in \mathrm{Dom}(\sigma) \setminus \ell(\mathrm{Ran}(\sigma))$, we have $p \in \mathrm{Dom}(\sigma')$. Since $p$ is not the extraneous element of $\sigma'$, we have $p \in \ell(\mathrm{Ran}(\sigma'))$. Since $p \notin \ell(\mathrm{Ran}(\sigma))$, the sets $\mathrm{Ran}(\sigma')$ and $\mathrm{Ran}(\sigma)$ are different, and therefore $\mathrm{Dom}(\sigma') = \mathrm{Dom}(\sigma)$. Since the sets $\mathrm{Ran}(\sigma')$ and $\mathrm{Ran}(\sigma)$ differ by at most one element, we have $\ell(\mathrm{Ran}(\sigma)) \subseteq \ell(\mathrm{Ran}(\sigma'))$. Then
$$p' \in \mathrm{Dom}(\sigma') \setminus \ell(\mathrm{Ran}(\sigma')) \subseteq \mathrm{Dom}(\sigma) \setminus \ell(\mathrm{Ran}(\sigma)),$$
and so $p'$ is an extraneous element of $\sigma$, a contradiction.
3.32.
More assumptions and definitions. In the remainder of this subchapter we assume $P$ is the index set for a collection $\{\preccurlyeq_p : p \in P\}$ of total preorderings of $X$. (Recall from 3.26 that a preordering of $X$ is total if it makes every two elements of $X$ comparable. In several of the next few sections we make the additional assumption that all the $\preccurlyeq_p$'s are antisymmetric, i.e., they are chain orderings, but in 3.36 we drop that restriction.) An assignment $\sigma$ will be called a crystal if it satisfies these two conditions:
(CR1) $\sigma(q) \preccurlyeq_q \sigma(r)$ for all $q, r \in \mathrm{Dom}(\sigma)$.
(CR2) There is no $x \in X$ that satisfies $\sigma(q) \prec_q x$ for all $q \in \mathrm{Dom}(\sigma)$.
Let $\mathcal{C}$ be the set of almost complete crystals. A crystal $\sigma$ with $|\mathrm{Dom}(\sigma)| = k$ will be called a $k$-crystal.
3.33.
Observations. Assume all the $\preccurlyeq_p$'s are antisymmetric. Then:
a. Condition (CR1) in 3.32 can be restated as: $\sigma(q)$ is the $\preccurlyeq_q$-smallest member of $\mathrm{Ran}(\sigma)$. Thus, a crystal is uniquely determined by its domain and range.
b. Any 1-crystal is almost complete.
c. Let $\sigma$ be an assignment whose domain is a singleton, i.e., $\mathrm{Dom}(\sigma) = \{p\}$ for some $p \in P$. Then $\sigma$ is a crystal if and only if $\sigma(p)$ is the $\preccurlyeq_p$-largest member of $X$.
d. A 1-crystal is uniquely determined by its domain.
e. Suppose that $\sigma_1, \sigma_2 \in \mathcal{C}$ with $\mathrm{Ran}(\sigma_1) \subseteq \mathrm{Ran}(\sigma_2)$. Then $\sigma_1, \sigma_2$ agree (i.e., take the same values) on $\mathrm{Dom}(\sigma_1) \cap \sigma_2^{-1}(\mathrm{Ran}(\sigma_1))$. Hint: Let $q \in \mathrm{Dom}(\sigma_1) \cap \sigma_2^{-1}(\mathrm{Ran}(\sigma_1))$. By (CR1), $\sigma_j(q)$ is the $\preccurlyeq_q$-smallest member of $\mathrm{Ran}(\sigma_j)$ for $j = 1, 2$.
f. A special case of the preceding result is as follows: Suppose that $\sigma_1, \sigma_2 \in \mathcal{C}$ with $\mathrm{Ran}(\sigma_1) = \mathrm{Ran}(\sigma_2)$. Then $\sigma_1, \sigma_2$ agree on $\mathrm{Dom}(\sigma_1) \cap \mathrm{Dom}(\sigma_2)$.
g. Suppose that $\tau$ and $\tau'$ are neighboring almost complete crystals. Then one of $\tau, \tau'$ is injective and the other is not. If $\tau$ is injective and $\tau'$ is not, then $\tau$ and $\tau'$ must be related in one of these two ways:
(a) $\mathrm{Ran}(\tau') = \mathrm{Ran}(\tau)$ and $\mathrm{Dom}(\tau') \supsetneq \mathrm{Dom}(\tau)$. In this case $\tau$ and $\tau'$ agree on $\mathrm{Dom}(\tau)$.
(b) $\mathrm{Dom}(\tau) = \mathrm{Dom}(\tau')$ and $\mathrm{Ran}(\tau) \supsetneq \mathrm{Ran}(\tau')$. In this case $\tau$ and $\tau'$ agree at all but one point of $\mathrm{Dom}(\tau)$.
Hints: Use 3.30.d, 3.33.e, and 3.33.f.
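Conditions (CR1) and (CR2) can be tested by brute force when $X$ is small and finite. The following is a sketch under assumptions that are not in the text: each ordering $\preccurlyeq_p$ is encoded by a hypothetical rank table `rank[p]` (so $x \preccurlyeq_p y$ means `rank[p][x] <= rank[p][y]`), assignments are dicts, and the search simply enumerates every assignment. It looks for complete crystals, i.e., functions $\sigma$ of the kind described in Van Maaren's Theorem, on one toy instance.

```python
from itertools import combinations, product

def is_crystal(rank, X, sigma):
    """Check (CR1) and (CR2) of 3.32 for the assignment sigma."""
    d = list(sigma)
    # (CR1): sigma(q) <=_q sigma(r) for all q, r in Dom(sigma).
    cr1 = all(rank[q][sigma[q]] <= rank[q][sigma[r]] for q in d for r in d)
    # (CR2): no x in X is strictly _q-above sigma(q) for every q in Dom(sigma).
    cr2 = not any(all(rank[q][sigma[q]] < rank[q][x] for q in d) for x in X)
    return cr1 and cr2

def is_complete(ell, sigma):
    """Dom(sigma) = ell(Ran(sigma))."""
    return set(sigma) == {ell[x] for x in sigma.values()}

def assignments(P, X):
    """Every function from a nonempty subset of P into X."""
    for k in range(1, len(P) + 1):
        for d in combinations(P, k):
            for values in product(X, repeat=k):
                yield dict(zip(d, values))

def complete_crystals(P, X, ell, rank):
    return [s for s in assignments(P, X)
            if is_complete(ell, s) and is_crystal(rank, X, s)]

# Assumed toy instance: two labels, three points, two opposite chain orderings.
P = [0, 1]
X = ["a", "b", "c"]
ell = {"a": 0, "b": 1, "c": 1}
rank = {0: {"a": 0, "b": 1, "c": 2},
        1: {"a": 2, "b": 1, "c": 0}}
found = complete_crystals(P, X, ell, rank)
print(found)
```

Exhaustive enumeration is exponential in $|P|$; it is meant only for checking small cases by hand, not as a counterpart of the neighbor-following argument in the propositions that follow.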
3.34. Proposition. Assume all the $\preccurlyeq_p$'s are antisymmetric. Then any noncomplete 1-crystal $\tau$ has precisely one neighbor $\tau'$ in $\mathcal{C}$.
Proof. Any 1-crystal $\tau$ is injective. Clearly $\tau'$ cannot have empty range, so 3.33.g(b) is not possible. Thus we must have $\mathrm{Ran}(\tau') = \mathrm{Ran}(\tau)$ and $\mathrm{Dom}(\tau') \supsetneq \mathrm{Dom}(\tau)$. Say $\tau$ has graph $\{(q, b)\}$; then $\mathrm{Graph}(\tau') = \{(q, b), (q', b)\}$ for some $q' \ne q$. For $\tau'$ to be almost complete, it must satisfy $|\mathrm{Dom}(\tau') \setminus \ell(\mathrm{Ran}(\tau'))| \le 1$; that is, $|\{q, q'\} \setminus \{\ell(b)\}| \le 1$. Therefore at least one of $q, q'$ must equal $\ell(b)$. By assumption $\tau = \{(q, b)\}$ is not complete, so $\ell(b) \ne q$. Thus we must have $q' = \ell(b)$. Finally, we easily verify that $\tau' = \{(q, b), (\ell(b), b)\}$ is indeed a member of $\mathcal{C}$.
3.35. Proposition. Assume all the $\preccurlyeq_p$'s are antisymmetric. Then for $k \ge 2$, a noncomplete $k$-crystal $\sigma \in \mathcal{C}$ has precisely two neighbors in $\mathcal{C}$.
Proof. We analyze the possible values for a neighbor $\sigma'$. We consider several cases, according to the type of $\sigma$ (with types as listed in 3.30.d).
$\sigma$ is of Type (i). In this case $\sigma$ is not injective. There is one and only one pair of elements $p_1, p_2$ in $\mathrm{Dom}(\sigma)$ such that $p_1 \ne p_2$ and $\sigma(p_1) = \sigma(p_2)$. We shall obtain one neighbor of $\sigma$ from each of these two points. Let $p$ be one of $p_1, p_2$. We shall obtain a neighbor by either modifying or removing $\sigma(p)$, i.e., by either changing the definition of the function at $p$ or removing $p$ from the domain. We do this in two cases, according to whether there does or does not exist a solution $x \in X$ to this problem:
$$(*) \qquad \sigma(q) \prec_q x \quad \text{for all } q \in \mathrm{Dom}(\sigma) \setminus \{p\}.$$
If $(*)$ has any solutions, let $v$ be the $\preccurlyeq_p$-largest of those solutions. Note that $v \preccurlyeq_p \sigma(p)$, since otherwise $\sigma$ and $v$ would contradict (CR2). Now a neighbor $\sigma'$ can be defined with $\mathrm{Dom}(\sigma') = \mathrm{Dom}(\sigma)$, by taking
$$\sigma'(q) = \begin{cases} \sigma(q) & \text{when } q \ne p, \\ v & \text{when } q = p. \end{cases}$$
On the other hand, if $(*)$ has no solution, then a neighbor $\sigma'$ can be defined by just restricting to a smaller domain, i.e., taking $\mathrm{Dom}(\sigma') = \mathrm{Dom}(\sigma) \setminus \{p\}$ and taking $\sigma'$ equal to $\sigma$ on $\mathrm{Dom}(\sigma')$. It is tedious but straightforward to verify that the function $\sigma'$ defined in either of these fashions is a neighboring almost complete crystal. Thus we obtain one neighbor by either modifying or removing $\sigma(p_1)$ and another by either modifying or removing $\sigma(p_2)$.
Now we shall show that there are no other neighbors possible besides those two. Let $\sigma'$ be a neighbor of $\sigma$ in $\mathcal{C}$; what form can $\sigma'$ take? By 3.33.g, $\sigma'$ is injective, and there are two cases to consider:
(1) $\mathrm{Ran}(\sigma) = \mathrm{Ran}(\sigma')$ and $\mathrm{Dom}(\sigma) = \{p\} \cup \mathrm{Dom}(\sigma')$ for some $p \notin \mathrm{Dom}(\sigma')$, and $\sigma'$ and $\sigma$ agree on $\mathrm{Dom}(\sigma')$. Since $\sigma'$ is injective, $p$ must be one of $p_1, p_2$. Since $\sigma'$ is a crystal, by (CR2) we know that there is no $x \in X$ satisfying $\sigma(q) \prec_q x$ for all $q \in \mathrm{Dom}(\sigma')$. Thus there is no solution of problem $(*)$, and the function $\sigma'$ can only be the one obtained by removing $\sigma(p)$.
(2) $\mathrm{Dom}(\sigma) = \mathrm{Dom}(\sigma') = D$ and $\mathrm{Ran}(\sigma') = \mathrm{Ran}(\sigma) \cup \{\sigma'(p)\}$ for some $p \in D$ with $\sigma'(p) \notin \mathrm{Ran}(\sigma)$, and $\sigma$ and $\sigma'$ agree on $D \setminus \{p\}$. Since $\mathrm{Ran}(\sigma) \subseteq \mathrm{Ran}(\sigma')$, we have $\sigma(p) = \sigma'(q_1)$ for some $q_1 \in D$. Since $\sigma(p)$ belongs to $\mathrm{Ran}(\sigma)$ and $\sigma'(p)$ does not, we know $\sigma'(p) \ne \sigma(p) = \sigma'(q_1)$, and therefore $p \ne q_1$. Since $\sigma$ and $\sigma'$ agree on $D \setminus \{p\}$, we have $\sigma(q_1) = \sigma'(q_1) = \sigma(p)$. Thus $p$ and $q_1$ are distinct members of $D$ that are mapped to the same value by $\sigma$. Therefore the set $\{p, q_1\}$ is equal to the set $\{p_1, p_2\}$. Hence $p$ is one of $p_1, p_2$. Since $\sigma'$ satisfies (CR1) and $\sigma'$ is injective, we have $\sigma'(q) \prec_q \sigma'(p)$ for all $q \in \mathrm{Dom}(\sigma') \setminus \{p\}$.
That is, $\sigma(q) \prec_q \sigma'(p)$ for all $q \in \mathrm{Dom}(\sigma') \setminus \{p\}$, so $\sigma'(p)$ is a solution of $(*)$. To see that $\sigma'(p)$ must be the $\preccurlyeq_p$-largest solution of $(*)$, suppose that $x$ is a $\preccurlyeq_p$-larger solution. Then $\sigma(q) \prec_q x$ for all $q \in \mathrm{Dom}(\sigma) \setminus \{p\}$, and $\sigma'(p) \prec_p x$ as well. That is, $\sigma'(q) \prec_q x$ for all $q \in \mathrm{Dom}(\sigma')$, contradicting the fact that $\sigma'$ must satisfy (CR2). Thus, we have established that there is a solution of problem $(*)$, and the function $\sigma'$ can only be the one obtained by modifying $\sigma(p)$.
$\sigma$ is of Type (ii). In this case $\sigma$ is injective, but $\ell$ is not injective on $\mathrm{Ran}(\sigma)$. Thus $|\mathrm{Dom}(\sigma)| = |\mathrm{Ran}(\sigma)| > |\ell(\mathrm{Ran}(\sigma))|$. There is a unique pair of distinct elements $w_1, w_2 \in \mathrm{Ran}(\sigma)$ that get mapped by $\ell$ to the same value. There are unique elements $p_1, p_2 \in \mathrm{Dom}(\sigma)$ with $\sigma(p_j) = w_j$. We shall obtain one neighbor of $\sigma$ from each of these two points. If $\mathrm{Dom}(\sigma') \supsetneq \mathrm{Dom}(\sigma)$ and $\mathrm{Ran}(\sigma') = \mathrm{Ran}(\sigma)$, then we have
$$|\mathrm{Dom}(\sigma')| - 1 \ge |\mathrm{Dom}(\sigma)| \ge |\ell(\mathrm{Ran}(\sigma))| + 1 = |\ell(\mathrm{Ran}(\sigma'))| + 1,$$
contradicting 3.30.c. Thus 3.33.g(1) cannot hold. Therefore $\mathrm{Dom}(\sigma) = \mathrm{Dom}(\sigma')$ and $\mathrm{Ran}(\sigma) = \mathrm{Ran}(\sigma') \cup \{\sigma(p)\}$ for some $p$ with $\sigma(p) \notin \mathrm{Ran}(\sigma')$, and $\sigma$ and $\sigma'$ agree on the set $S = \mathrm{Dom}(\sigma) \setminus \{p\}$. Since $\mathrm{Ran}(\sigma') \subseteq \mathrm{Ran}(\sigma)$, we must have $\sigma'(p) \in \sigma(S)$. If $p \notin \{p_1, p_2\}$, then $p_1, p_2$ are distinct members of $S$ with $\ell(\sigma'(p_1)) = \ell(\sigma'(p_2))$, and $\ell(\sigma'(p)) \in \ell(\sigma'(S))$, too. It follows that $|\ell(\mathrm{Ran}(\sigma'))| \le |\mathrm{Dom}(\sigma')| - 2$, contradicting 3.30.c. Thus we must have $p \in \{p_1, p_2\}$. For each of those two choices of $p$, the value of $\sigma'(p)$ is determined uniquely: By 3.33.a, $\sigma'(p)$ must be equal to the $\preccurlyeq_p$-lowest member of $\mathrm{Ran}(\sigma')$. We have shown that only two functions (one with $p = p_1$, the other with $p = p_2$) could possibly be almost complete crystals that neighbor $\sigma$. It is easy to verify that both of those two functions are, indeed, such crystals.
$\sigma$ is of Type (iii). In this case $\sigma$ is injective, and $\ell$ is injective on $\mathrm{Ran}(\sigma)$. We shall obtain one neighbor of $\sigma$ from the unique member of $\ell(\mathrm{Ran}(\sigma)) \setminus \mathrm{Dom}(\sigma)$ and another from the unique member of $\mathrm{Dom}(\sigma) \setminus \ell(\mathrm{Ran}(\sigma))$. By 3.33.g, we can obtain a neighbor only by enlarging the domain or decreasing the range; we shall show that each of these two methods yields precisely one neighbor.
(1) Enlarging the domain: In this case $\mathrm{Dom}(\sigma') = \mathrm{Dom}(\sigma) \cup \{p\}$ for some $p \notin \mathrm{Dom}(\sigma)$, and $\mathrm{Ran}(\sigma') = \mathrm{Ran}(\sigma) = R$, and $\sigma$ and $\sigma'$ agree on $\mathrm{Dom}(\sigma)$. Since $p \notin \mathrm{Dom}(\sigma)$, $p$ is not the extraneous element of $\sigma$, and therefore $p$ is not the extraneous element of $\sigma'$. Hence $p \in \ell(\mathrm{Ran}(\sigma')) = \ell(R) = \ell(\mathrm{Ran}(\sigma))$. Thus $p$ is the unique member of $\ell(R) \setminus \mathrm{Dom}(\sigma)$. By 3.33.a, $\sigma'(p)$ must be the $\preccurlyeq_p$-least member of $R$. Thus we have specified $\sigma'$ uniquely. It is easy to verify that the function $\sigma'$ defined in this fashion is indeed an almost complete neighboring crystal.
(2) Decreasing the range: In this case $\mathrm{Dom}(\sigma) = \mathrm{Dom}(\sigma') = D$ and $\mathrm{Ran}(\sigma') = \mathrm{Ran}(\sigma) \setminus \{\sigma(p)\}$ for some $p \in D$, and $\sigma$ and $\sigma'$ agree on $D \setminus \{p\}$. Since $\ell$ is injective on the range of $\sigma$, we have $\ell(\sigma(p)) \in \ell(\mathrm{Ran}(\sigma)) \setminus \ell(\mathrm{Ran}(\sigma'))$. Since $\ell(\sigma(p)) \in \ell(\mathrm{Ran}(\sigma))$, we have $\ell(\sigma(p)) \notin \mathrm{Dom}(\sigma) \setminus \ell(\mathrm{Ran}(\sigma))$. That is, $\ell(\sigma(p)) \notin \mathrm{Dom}(\sigma') \setminus \ell(\mathrm{Ran}(\sigma'))$, since $\sigma$ and $\sigma'$ have the same extraneous point. But $\ell(\sigma(p)) \notin \ell(\mathrm{Ran}(\sigma'))$, so we conclude $\ell(\sigma(p)) \notin \mathrm{Dom}(\sigma') = \mathrm{Dom}(\sigma)$. Thus we have identified $\ell(\sigma(p))$ uniquely: it is the unique member of $\ell(\mathrm{Ran}(\sigma)) \setminus \mathrm{Dom}(\sigma)$. Since $\ell$ and $\sigma$ are injective, we have determined $p$ uniquely: It is the unique member of $\mathrm{Dom}(\sigma)$ that satisfies $\ell(\sigma(p)) \notin \mathrm{Dom}(\sigma)$. The functions $\sigma$ and $\sigma'$ agree on $D \setminus \{p\}$, and the value of $\sigma'(p)$ is determined uniquely by 3.33.a. Thus we have defined $\sigma'$ uniquely. It is a
tedious but straightforward exercise to verify that the function $\sigma'$ defined in this fashion is indeed an almost complete crystal that neighbors $\sigma$.
3.36. Van Maaren's Theorem. Suppose that $P$ and $X$ are finite sets and for each $p \in P$ we are given a total preorder $\preccurlyeq_p$ on $X$ (not necessarily antisymmetric). Let any labeling $\ell : X \to P$ be given. Then $(X, P)$ has at least one complete crystal with respect to $\ell$.
Diagram for proof of 3.36
Proof. A preliminary first step is this: We can replace each total preorder $\preccurlyeq_p$ with a total order, hereafter also denoted $\preccurlyeq_p$, by arbitrarily choosing a total ordering on each of the equivalence classes of $\preccurlyeq_p$. This replacement results in fewer crystals. Thus, it suffices to prove the theorem under the additional assumption that each preordering $\preccurlyeq_p$ is a total ordering. Since $X$ is a finite set, $\mathcal{C}$ is finite also. By 3.33.c, there exists a 1-crystal $\sigma_0$, with a singleton domain $\mathrm{Dom}(\sigma_0) = \{p_0\}$. In $\mathcal{C}$, each incomplete 1-crystal has exactly one neighbor, and each incomplete 2-crystal has exactly two neighbors. Follow a path, starting at $\sigma_0$, going from each crystal to its neighbor. If we do not encounter any complete crystals along the path, then our route is uniquely determined; it must begin and end at distinct 1-crystals (see the preceding diagram). However, at each step the extraneous point is preserved, by 3.31; thus the beginning and ending 1-crystals must have the same extraneous point, contradicting the fact that they are distinct. This proves that the path must include at least one complete crystal. (Incidentally, we have given a constructive algorithm for finding a complete crystal: just follow the path until one is encountered.)
3.37. The first theorem below is included only for motivation; we give references for it in lieu of a proof. The second theorem, though more complicated to state, is easier to prove, and we shall do so below. It will be used to prove Brouwer's Theorem in 27.19. For both theorems, let $\mathbb{R}^n$ be metrized by $d(x, y) = \max\{|x_j - y_j| : 1 \le j \le n\}$.
First Approximate Fixed Point Theorem. Let $n$ be a positive integer, let $f = (f_1, f_2, \ldots, f_n) : [0, 1]^n \to [0, 1]^n$ be a function, and let any number $\varepsilon > 0$ be given. Then there exists a set $S \subseteq [0, 1]^n$ with diameter less than $\varepsilon$, with the following property: For each $j \in \{1, 2, \ldots, n\}$ there exist some points $x = (x_1, x_2, \ldots, x_n)$ and $x' = (x_1', x_2', \ldots, x_n')$ in $S$ such that $x_j \le f_j(x)$ and $x_j' \ge f_j(x')$.
Second Approximate Fixed Point Theorem. Let $n$ be a positive integer. Let $\Delta$ be the standard $n$-simplex; that is, the set
$$\Delta = \Big\{ u \in \mathbb{R}^n : u_1, u_2, \ldots, u_n \ge 0 \text{ and } \sum_{j=1}^{n} u_j \le 1 \Big\}.$$
Let any function $f : \Delta \to \Delta$ and any number $\varepsilon > 0$ be given. Then there exists a set $S \subseteq \Delta$ that acts as an approximate fixed point of $f$, in the following sense:
• $\mathrm{diam}(S) \le \varepsilon$.
• For each $i = 1, 2, \ldots, n$, there is some $u \in S$ such that $u_i - \varepsilon \le f(u)_i$.
• There exists some point $v \in S$ satisfying $\sum_{i=1}^{n} v_i + \varepsilon \ge \sum_{i=1}^{n} f(v)_i$.
Remarks. We emphasize that $f$ is not assumed to be continuous or even measurable. Aside from the domain and codomain, we make no assumption at all about $f$. Thus, these theorems are not really "about" $f$; they are theorems about the combinatorial structure of $\mathbb{R}^n$. An analogous theorem about infinite dimensional vector spaces will be given in 27.19. A similar argument in two dimensions, more geometrical and elementary in presentation, was given by Shashkin [1991]. Theorem 2 of Baillon and Simons [1992] is also very similar. The First Approximate Fixed Point Theorem can be proved by methods similar to those below, using Wolsey's [1977] Cubical Sperner Lemma instead of our 3.36. It would be interesting to know if the First Approximate Fixed Point Theorem can also be proved by some short argument using 3.36; no such argument is presently known to this author.
Proof of the Second Approximate Fixed Point Theorem. By writing $u_{n+1} = 1 - \sum_{j=1}^{n} u_j$, we may rewrite
$$\Delta = \Big\{ u \in \mathbb{R}^{n+1} : u_1, u_2, \ldots, u_n, u_{n+1} \ge 0 \text{ and } \sum_{j=1}^{n+1} u_j = 1 \Big\}.$$
The $(n+1)$st coordinate will be treated just like the other coordinates in the following argument. Let $M$ be an integer large enough so that $2(n+1)/M < \varepsilon$. Let $X$ consist of the collection of all points $u \in \Delta$ for which all of $Mu_1, Mu_2, \ldots, Mu_n, Mu_{n+1}$ are integers. Let $P = \{1, 2, 3, \ldots, n, n+1\}$. For $1 \le j \le n+1$ define a preordering of $X$ by taking $u \preccurlyeq_j v$ when $u_j \le v_j$. Let $\sigma$ be a crystal, and let $S$ be its range. If
$$\sum_{i \in \mathrm{Dom}(\sigma)} \sigma(i)_i \le 1 - \frac{n+1}{M},$$
then there exists a member $x \in X$ that satisfies $\sigma(i)_i < x_i$ for all $i \in \mathrm{Dom}(\sigma)$, contradicting (CR2) in 3.32. Thus the inequality above does not hold.
For any $i, j \in \mathrm{Dom}(\sigma)$, we have $\sigma(i)_i \le \sigma(j)_i$, and therefore
$$1 - \frac{n+1}{M} < \sum_{i \in \mathrm{Dom}(\sigma)} \sigma(i)_i \le \sum_{i \in \mathrm{Dom}(\sigma)} \sigma(j)_i \le 1,$$
from which it follows that $0 \le \sigma(j)_i - \sigma(i)_i < \frac{n+1}{M}$ for each $i, j \in \mathrm{Dom}(\sigma)$. Therefore $|\sigma(j)_i - \sigma(k)_i| \le 2(n+1)/M$ whenever $i, j, k \in \mathrm{Dom}(\sigma)$. Hence $|u_i - v_i| \le \varepsilon$ for all $u, v \in S$ and $i \in \mathrm{Dom}(\sigma)$. On the other hand, since $\sum_{i=1}^{n+1} \sigma(j)_i = 1$, we must have $\sum_{i \notin \mathrm{Dom}(\sigma)} \sigma(j)_i < \frac{n+1}{M}$ for $j \in \mathrm{Dom}(\sigma)$ and, in particular, $\sigma(j)_i < \frac{n+1}{M}$ for each $i \notin \mathrm{Dom}(\sigma)$. Thus $0 \le u_i < \varepsilon/2$ for all $u \in S$ and $i \notin \mathrm{Dom}(\sigma)$. Therefore $\mathrm{diam}(S) < \varepsilon$.
Define a labeling $\ell : \Delta \to \{1, 2, \ldots, n, n+1\}$ as follows: let $\ell(u) = i$ if $i$ is the first coordinate that satisfies $u_i \le f(u)_i$. By 3.36 there exists a complete crystal $\sigma$ with respect to that labeling. When $i \in \mathrm{Dom}(\sigma) = \ell(\mathrm{Ran}(\sigma))$, then $i = \ell(u)$ for some $u \in S$, so $u_i \le f(u)_i$. On the other hand, we noted earlier in this proof that when $i \notin \mathrm{Dom}(\sigma)$ and $u \in S$, then $u_i \le \varepsilon$; hence $f(u)_i \ge u_i - \varepsilon$. This completes the proof.
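The two ingredients of this proof that are easiest to make concrete are the finite grid $X$ and the labeling $\ell$. The following is a minimal Python sketch of just those two pieces, not of the crystal search itself. The function names `simplex_grid`, `label`, and the sample map `rotate` are hypothetical and chosen only for illustration. A label always exists because the coordinates of $u$ and of $f(u)$ both sum to 1.

```python
from fractions import Fraction
from itertools import product


def simplex_grid(n, M):
    """Points u in R^(n+1) with u_i >= 0, sum(u) = 1, and every M*u_i an integer."""
    points = []
    for ks in product(range(M + 1), repeat=n):
        if sum(ks) <= M:
            full = ks + (M - sum(ks),)          # the (n+1)st coordinate
            points.append(tuple(Fraction(k, M) for k in full))
    return points


def label(u, f):
    """Return the first (1-based) coordinate i with u_i <= f(u)_i."""
    fu = f(u)
    for i, (ui, fi) in enumerate(zip(u, fu), start=1):
        if ui <= fi:
            return i
    # Cannot happen when sum(u) == sum(f(u)) == 1.
    raise ValueError("no coordinate satisfies u_i <= f(u)_i")


def rotate(u):
    """Hypothetical sample map of the simplex into itself: cyclic shift."""
    return u[1:] + u[:1]


grid = simplex_grid(2, 4)                       # n = 2, M = 4
labels = {u: label(u, rotate) for u in grid}
```

Exact rational arithmetic (`Fraction`) is used so that the comparison $u_i \le f(u)_i$ is not affected by rounding.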
WELL ORDERED SETS
3.38. Definition. Let $(X, \preccurlyeq)$ be a poset. We say $\preccurlyeq$ is a well ordering if each nonempty subset of $X$ has a first element. Then $X$ is a well ordered set, or a woset.
Examples. The set $\mathbb{N}$ is well ordered. Also see 3.43 and 5.44.
Remark. Well ordered sets are used only infrequently in analysis. This subchapter may be postponed or omitted if the reader is concerned only with the usual topics of analysis.
3.39.
Basic properties of wosets.
a. Any woset is a chain.
b. Any subset of a woset is a woset.
c. Let $S$ be a subset of a woset $X$. Then $S$ is a proper lower set in $X$ if and only if $S = \mathrm{Pre}(b)$ for some $b \in X$, with notation as in 3.16.b. Hint for the "only if" part: Let $b$ be the first element of $X \setminus S$.
d. Let $X$ be a woset. Then the lower sets of $X$ form a woset $\widehat{X}$, when ordered by $\subseteq$. The last element of $\widehat{X}$ is $X$. If $X$ is not empty, then the first element of $\widehat{X}$ is $\varnothing = \mathrm{Pre}(\min(X))$, where $\min(X)$ is the first member of $X$.
e. Any woset $X$ is a proper lower set of some larger woset $Y$. Indeed, one way to form such a larger set is by adjoining some new element (call it $\Box$) that is not already present in $X$ and defining $\Box$ to be larger than all the elements of $X$.
f. Induction on Wosets. Let $(X, \preccurlyeq)$ be a woset, and let $S$ be a subset of $X$ with the property that $\mathrm{Pre}(b) \subseteq S \Rightarrow b \in S$. Then in fact $S = X$. Hint: If not, let $b$ be the first element of $X \setminus S$.
3.40. Notation. For the result below, if $(X, \preccurlyeq)$ is a well ordered set and $T$ is a nonempty set, then an $X$-based sequence in $T$ will mean a function whose domain is some proper lower set of $X$ and whose range is contained in $T$. As a degenerate case, we may view the empty function (with graph equal to the empty set) as an $X$-based sequence in $T$.
Theorem of Recursion on Wosets. Let $(X, \preccurlyeq)$ be a woset and let $T$ be a nonempty set. Let any function $p : \{X\text{-based sequences in } T\} \to T$ be given. Then there exists a unique function $F : X \to T$ satisfying
$$F(x) = p\big(F|_{\mathrm{Pre}(x)}\big) \quad \text{for each } x \in X.$$
Here $F|_{\mathrm{Pre}(x)}$ denotes the restriction of $F$ to the set $\mathrm{Pre}(x) = \{w \in X : w \prec x\}$. Thus, the value of $F$ at any $x$ is determined, via the rule $p$, by the values of $F$ at all the predecessors of $x$.
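For a finite woset the recursion in 3.40 reduces to a simple loop, which may help to fix ideas. The sketch below is only an illustration under that finiteness assumption; the names `recurse` and `rule` are hypothetical. The woset is passed as a list in increasing order, and the restriction $F|_{\mathrm{Pre}(x)}$ is represented as a dictionary.

```python
def recurse(X_in_order, p):
    """Build F on a finite woset, listed in increasing order, from the rule p.

    p receives the restriction of F to Pre(x), as a dict, and returns F(x).
    """
    F = {}
    for x in X_in_order:
        # At this moment F is defined exactly on Pre(x).
        F[x] = p(dict(F))
    return F


def rule(restriction):
    """Sample rule (an assumption for illustration): 1 plus the sum of all earlier values."""
    return 1 + sum(restriction.values())


F = recurse([0, 1, 2, 3, 4], rule)
```

With this sample rule each value is one more than the sum of its predecessors, so the values double at every step.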
Remark. Compare this result with 2.22.
Proof of theorem. First we prove uniqueness. Suppose $F_1, F_2$ are two such functions, and $F_1 \ne F_2$. Let $x$ be the first member of $X$ that satisfies $F_1(x) \ne F_2(x)$. Then $F_1(w) = F_2(w)$ for all $w \in \mathrm{Pre}(x)$; that is, the restrictions $F_1|_{\mathrm{Pre}(x)}$ and $F_2|_{\mathrm{Pre}(x)}$ are the same function $\varphi$. But then $F_1(x) = p(\varphi) = F_2(x)$, a contradiction. This proves uniqueness.
We now turn to the existence proof. It will be convenient to replace $X$ with a slightly larger set. Let $Y = X \cup \{\mu\}$, where $\mu$ is some object not belonging to $X$. Extend the ordering of $X$ to an ordering on $Y$ by setting $x \prec \mu$ for all $x \in X$; then $Y$ is also a woset. Note that for each $y \in Y$, the set $\mathrm{Pre}(y)$ is a lower set in $X$; in particular, $\mathrm{Pre}(\mu) = X$. We shall prove that for each $y \in Y$ there is a function $F_y : \mathrm{Pre}(y) \to T$ satisfying
$$F_y(x) = p\big(F_y|_{\mathrm{Pre}(x)}\big) \quad \text{for each } x \in \mathrm{Pre}(y).$$
(Once this is established, we simply take $F = F_\mu$ to prove the theorem.) First note that for each $y$, we may unambiguously use the notation "$F_y$" because there is at most one such function $F_y$; that is clear by a uniqueness argument similar to the one at the beginning of the proof of the theorem.
The proof of the existence of the $F_y$'s is by induction on $y$. Assume, then, that some $\eta \in Y$ is given and that $F_y$'s exist for all $y \prec \eta$; we are to demonstrate the existence of $F_\eta$. We demonstrate that in two different ways, according to the nature of $\eta$:
First, suppose $\eta$ has an immediate predecessor $\xi$; that is, suppose $\eta$ is the first member of $Y$ after some member $\xi$. Then $\mathrm{Pre}(\eta) = \{\xi\} \cup \mathrm{Pre}(\xi)$, and $F_\xi : \mathrm{Pre}(\xi) \to T$ is a function of the sort described above. Define a function $F_\eta : \mathrm{Pre}(\eta) \to T$ by
$$F_\eta(x) = \begin{cases} F_\xi(x) & \text{when } x \in \mathrm{Pre}(\xi), \\ p(F_\xi) & \text{when } x = \xi. \end{cases}$$
It is easy to verify that $F_\eta$ has the required properties.
On the other hand, suppose $\eta$ has no immediate predecessor in $Y$. Then $\mathrm{Pre}(\eta) = \bigcup_{y \prec \eta} \mathrm{Pre}(y)$. Also, $\mathrm{Graph}(F_y)$ is an increasing function of $y$; that is,
$$y \prec y' \prec \eta \quad \Rightarrow \quad \mathrm{Graph}(F_y) \subseteq \mathrm{Graph}(F_{y'}).$$
Verify that $\mathrm{Graph}(F_\eta) = \bigcup_{y \prec \eta} \mathrm{Graph}(F_y)$ defines $F_\eta$ with the required properties.
3.41. Comparability Theorem. If $(X, \le)$ and $(Y, \preccurlyeq)$ are wosets, then exactly one of these three conditions holds:
• There exists an order isomorphism between $X$ and $Y$.
• There exists an order isomorphism from $X$ onto a proper lower set of $Y$.
• There exists an order isomorphism from $Y$ onto a proper lower set of $X$.
Furthermore, in each case the isomorphism is uniquely determined.
Proof. For each proper lower set $L \subseteq X$ and each function $\varphi : L \to Y$, define
$$p(\varphi) = \begin{cases} \min(Y \setminus \mathrm{Range}(\varphi)) & \text{if } \mathrm{Range}(\varphi) \ne Y, \\ \min(Y) & \text{if } \mathrm{Range}(\varphi) = Y. \end{cases}$$
Now define $F : X \to Y$ by recursion, as in 3.40. Then $F(\mathrm{Pre}(x))$ is an increasing function of $x$, so $X_0 = \{x \in X : F(\mathrm{Pre}(x)) \ne Y\}$ is a lower set in $X$. Show that $F(X_0)$ is a lower set in $Y$ and $F$ gives an order isomorphism from $X_0$ onto $F(X_0)$. If $F(X_0) \ne Y$, then $X_0 = X$. This establishes the existence of at least one isomorphism.
If $f$ and $g$ were distinct order isomorphisms from $X$ onto a lower set of $Y$, then we could take $x_0$ to be the first element of $X$ satisfying $f(x_0) \ne g(x_0)$; show that this leads to a contradiction. This proves uniqueness in either direction.
Suppose $f$ is an order isomorphism from $X$ onto a lower set of $Y$ and $g$ is an order isomorphism from $Y$ onto a lower set of $X$. Then $g \circ f$ is an order isomorphism from $X$ onto a lower set of $X$; but by the uniqueness result of the previous paragraph, $g \circ f$ must then be the identity map of $X$.
3.42. Corollaries.
a. If $X$ and $Y$ are wosets, then $\mathrm{card}(X) \le \mathrm{card}(Y)$ or $\mathrm{card}(Y) \le \mathrm{card}(X)$.
b. If $X$ is an infinite woset, then $\mathrm{card}(X) \ge \mathrm{card}(\mathbb{N})$, and for any $\xi \notin X$ we have $\mathrm{card}(X) = \mathrm{card}(X \cup \{\xi\})$.
3.43. Canonical Well Ordering Theorem. Let $X$ be a set, and let some mapping $p : \{\text{proper subsets of } X\} \to X$ be given that satisfies $p(S) \in X \setminus S$ for each $S$. Then there exists a unique well ordering $\preccurlyeq$ on $X$ with the following property:
$$x = p\big(\{u \in X : u \prec x\}\big) \quad \text{for each } x \in X.$$
(In other words, to find the next term in the ordering, just apply p to the set of all the terms that have already been ordered. Contrast this with (AC4) in 6.20.)
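For a finite set the parenthetical remark above is already an algorithm: start from the empty set and repeatedly apply $p$ to the set of terms ordered so far. The following Python sketch illustrates this under the assumption that $X$ is finite. The names `canonical_order`, `UNIVERSE`, and `p_max` are hypothetical; `p_max` is just one sample choice function on proper subsets.

```python
def canonical_order(X, p):
    """List the elements of a finite set X in the order determined by p.

    p maps each proper subset S of X to some element of X not in S.
    """
    X = frozenset(X)
    ordered = []
    seen = frozenset()
    while seen != X:
        nxt = p(seen)                       # next term: apply p to what is already ordered
        if nxt not in X or nxt in seen:
            raise ValueError("p(S) must lie in X \\ S")
        ordered.append(nxt)
        seen = seen | {nxt}
    return ordered


UNIVERSE = frozenset({3, 1, 4, 5, 9})


def p_max(S):
    """Sample choice rule (an assumption): pick the largest element not yet in S."""
    return max(UNIVERSE - S)


order = canonical_order(UNIVERSE, p_max)
```

By construction every element $x$ in the resulting list satisfies $x = p(\{u : u \prec x\})$, which is the defining property in 3.43.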
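Proof (modified from Malitz [1979]). Consider well orderings of subsets of $X$. When $\preccurlyeq$ is such a well ordering, let $S_\preccurlyeq$ be the subset of $X$ that it well orders, and let its sets of predecessors be denoted by
$$\mathrm{Pre}_\preccurlyeq(x) = \{s \in S_\preccurlyeq : s \preccurlyeq x, \ s \ne x\}$$
for points $x \in S_\preccurlyeq$. Say $\preccurlyeq$ is a $p$-well ordering if it also satisfies
$$x = p\big(\mathrm{Pre}_\preccurlyeq(x)\big) \quad \text{for each } x \in S_\preccurlyeq.$$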
Let $\mathcal{K}$ be the set of all $p$-well orderings. It is clear that $\mathcal{K}$ is nonempty, since the empty relation is a $p$-well ordering of the empty set. We are to show that $\mathcal{K}$ has a unique member $\preccurlyeq$ satisfying $S_\preccurlyeq = X$. As a preliminary step, we shall show that
(**) Whenever $\preccurlyeq_1$ and $\preccurlyeq_2$ are $p$-well orderings, then one of the wosets $(S_{\preccurlyeq_1}, \preccurlyeq_1)$, $(S_{\preccurlyeq_2}, \preccurlyeq_2)$ is a lower set of the other, whose ordering is just the restriction of the other's ordering.
Indeed, by 3.41, we know that there exists an order isomorphism between one of the wosets $(S_{\preccurlyeq_1}, \preccurlyeq_1)$, $(S_{\preccurlyeq_2}, \preccurlyeq_2)$ and a lower set of the other. Say $\varphi : S_{\preccurlyeq_1} \to S_{\preccurlyeq_2}$ is an order isomorphism from $(S_{\preccurlyeq_1}, \preccurlyeq_1)$ onto a lower set of $(S_{\preccurlyeq_2}, \preccurlyeq_2)$. Then $\varphi\big(\mathrm{Pre}_{\preccurlyeq_1}(u)\big) = \mathrm{Pre}_{\preccurlyeq_2}\big(\varphi(u)\big)$ for any $u \in S_{\preccurlyeq_1}$. To prove (**), it suffices to show that this isomorphism is actually an inclusion map, i.e., that $\varphi(x) = x$ in $X$ for all $x \in S_{\preccurlyeq_1}$. If $\beta = \varphi(\alpha) \ne \alpha$ for some $\alpha \in S_{\preccurlyeq_1}$, choose the $\preccurlyeq_1$-first such $\alpha$ in $S_{\preccurlyeq_1}$ and the corresponding $\beta$. Then $\varphi$ acts as the identity map on $\mathrm{Pre}_{\preccurlyeq_1}(\alpha)$. Thus
$$\mathrm{Pre}_{\preccurlyeq_1}(\alpha) = \varphi\big(\mathrm{Pre}_{\preccurlyeq_1}(\alpha)\big) = \mathrm{Pre}_{\preccurlyeq_2}\big(\varphi(\alpha)\big) = \mathrm{Pre}_{\preccurlyeq_2}(\beta),$$
and therefore
$$\alpha = p\big(\mathrm{Pre}_{\preccurlyeq_1}(\alpha)\big) = p\big(\mathrm{Pre}_{\preccurlyeq_2}(\beta)\big) = \beta,$$
a contradiction. This proves the claim (**).
Each member of $\mathcal{K}$ is a relation on $X$, which may be viewed as a subset of $X \times X$. From (**) it follows easily that the union of the elements of $\mathcal{K}$ is itself a member of $\mathcal{K}$. Hence it is the largest member of $\mathcal{K}$; let us denote it by $\sqsubseteq$. If $S_\sqsubseteq$ is not equal to $X$, then $\sqsubseteq$ extends to a strictly larger $p$-well ordering on $S_\sqsubseteq \cup \{p(S_\sqsubseteq)\}$, by defining $p(S_\sqsubseteq)$ to be larger than all the members of $S_\sqsubseteq$, contradicting the maximality of $\sqsubseteq$. Thus $S_\sqsubseteq = X$. Uniqueness follows from (**).
3.44. Products of wosets. A product of wosets, with the product ordering, is not necessarily a woset; an example is given by $\mathbb{N}^2$. Other orderings on a product may be well ordered, however:
a. Let $(\Lambda, \le)$ be a woset, and for each $\lambda \in \Lambda$ let $(X_\lambda, \le)$ be a chain. (The $\le$'s may represent different orderings.) Let $P = \prod_{\lambda \in \Lambda} X_\lambda$. The lexicographical order (or dictionary order) on $P$ is defined as follows: $p < q$ in $P$ if $p(\nu) < q(\nu)$ where $\nu \in \Lambda$ is the first component in which $p$ and $q$ differ. Show:
(i) The lexicographical ordering is a chain ordering on $P$.
(ii) If each $(X_\lambda, \le)$ is well ordered and $\Lambda$ is a finite set, then the lexicographical ordering is a well ordering on $P$.
(iii) In general, the lexicographical ordering on an infinite product is not a well ordering. Indeed, if $\Lambda$ is an infinite woset with no last element and each $X_\lambda$ is a woset containing at least two elements, then $P$ is not well ordered. To see this, let $\xi$ be the function whose value at $\lambda$ is the smallest member of $X_\lambda$; show that $P \setminus \{\xi\}$ has no smallest element. For a more concrete special case, show that $\{x \in \{0,1\}^{\mathbb{N}} : x \ne (0, 0, 0, \ldots)\}$ has no first element in $\{0,1\}^{\mathbb{N}}$.
b. Let $(X, \le)$ be a well ordered set. Define an ordering $\sqsubseteq$ on $X \times X$, as follows: For $(x, y)$ and $(u, v)$ in $X \times X$, $(u, v) \sqsubseteq (x, y)$ means that
• $\max\{u, v\} < \max\{x, y\}$, or
• $\max\{u, v\} = \max\{x, y\}$ and $u < x$, or
• $\max\{u, v\} = \max\{x, y\}$ and $u = x$ and $v \le y$.
Verify that this is a well ordering on $X \times X$. (This construction will be used in 3.45.)
3.45. Theorem on $\mathrm{card}(X^2)$. Let $X$ be an infinite set, and suppose that $X$ can be well ordered. Then $\mathrm{card}(X \times X) = \mathrm{card}(X)$.
Remarks. The present result does not require the Axiom of Choice, which tells us that every set can be well ordered; see 6.20 and 6.22.
Proof of theorem. Let $|\ |$ denote cardinality. Clearly, $|X| \le |X \times X|$. Suppose $|X| < |X \times X|$ for some infinite woset $(X, \le)$; we shall obtain a contradiction. Clearly, we can replace $X$ by any other woset with the same cardinality. By 3.39.d we may replace $X$ with the first lower set in $X$ that is infinite and satisfies $|X| < |X \times X|$. Observe that if $K$ is any proper lower set in $X$, then either $K$ is finite or $|K| = |K \times K|$; hence $|K| \ne |X|$; hence (since $K \subseteq X$) we have $|K| < |X|$. In particular, $X$ does not have a last element, for, if $X$ were an infinite woset with last element $\xi$, then $X \setminus \{\xi\}$ would be a proper lower set with the same cardinality as $X$, a contradiction.
Define a well ordering $\sqsubseteq$ on $X \times X$ as in 3.44.b. Since $X$ and $X \times X$ are well ordered, one of these sets is uniquely order isomorphic to a lower set of the other. Since $|X| < |X \times X|$, the order isomorphism must be from $X$ onto a set $L$ that is a proper lower set of $X \times X$. Then $|L| = |X|$ is an infinite cardinal. Let $(u_0, v_0)$ be the $\sqsubseteq$-first member of $(X \times X) \setminus L$. Let $w_0$ be the maximum of $u_0$ and $v_0$ in $(X, \le)$. Let $M = \{x \in X : x \le w_0\}$. Then $M$ is a lower set in $X$. Since $X$ does not have a last element, $M$ is a proper lower set in $X$, and therefore $|M| < |X|$. Observe that
$$(u, v) \in L \ \Rightarrow \ (u, v) \sqsubseteq (u_0, v_0) \ \Rightarrow \ \max\{u, v\} \le w_0 \ \Rightarrow \ u, v \in M,$$
and thus $L \subseteq M \times M$. Hence $|L| \le |M \times M|$. Since $L$ is an infinite set, $M$ must be infinite, too. By our choice of $X$, then, $|M \times M| = |M|$. Now $|X| = |L| \le |M \times M| = |M| < |X|$, a contradiction. This completes the proof.
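The ordering on $X \times X$ from 3.44.b, which drives the proof of 3.45, can be tried out on a finite chain. In the Python sketch below (an illustration only, with the hypothetical helper name `key` and $X = \{0, 1, 2, 3\}$ standing in for a woset), sorting by the tuple `(max(u, v), u, v)` reproduces the three clauses of the definition. The final check mirrors the step $L \subseteq M \times M$: everything that precedes a pair $(u_0, v_0)$ has both coordinates at most $w_0 = \max\{u_0, v_0\}$.

```python
from itertools import product


def key(pair):
    """Sort key realizing the ordering of 3.44.b: compare max first, then u, then v."""
    u, v = pair
    return (max(u, v), u, v)


n = 4
pairs = sorted(product(range(n), repeat=2), key=key)

# Every predecessor of (u0, v0) lies in M x M, where M = {x : x <= max(u0, v0)}.
initial_segments_ok = all(
    max(u, v) <= max(u0, v0)
    for idx, (u0, v0) in enumerate(pairs)
    for (u, v) in pairs[:idx]
)
```

In particular, the set of predecessors of any pair is contained in a square $M \times M$ with $M$ a proper initial part of the chain whenever the chain has no last element, which is the counting fact used in the proof.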
3.46. Definition. A collection $\mathcal{F}$ of subsets of a set $X$ is said to have finite character if for each set $S \subseteq X$, $S$ is a member of $\mathcal{F}$ if and only if each finite subset of $S$ is a member of $\mathcal{F}$.
Example. If $(X, \preccurlyeq)$ is a poset, then $\{S \subseteq X : S \text{ is chain ordered by } \preccurlyeq\}$ has finite character. Other examples will be given in Chapters 5, 11, 12, and 14.
We shall now prove the following theorem:
Finite Character Theorem (canonical choice version). Let $X$ be a set that can be well ordered and let $\mathcal{F}$ be a collection of subsets of $X$ that has finite character. Then $\mathcal{F}$ has a $\subseteq$-maximal element.
Remark. The present theorem does not require the Axiom of Choice, which would tell us that every set can be well ordered. Contrast this with (AC5) in 6.20.
Proof of theorem. Let $\preccurlyeq$ be a well ordering of $X$. We shall determine a maximal set $M \in \mathcal{F}$ by defining its characteristic function $1_M : X \to \{0, 1\}$, by transfinite recursion. At the $\alpha$th step, we have defined $1_M$ on all of $\mathrm{Pre}(\alpha)$, and thus we have determined which elements of $\mathrm{Pre}(\alpha)$ should be members of $M$; i.e., we have determined the set $\mathrm{Pre}(\alpha) \cap M$. Now define $1_M(\alpha)$, by taking it to be 1 if the set $(\mathrm{Pre}(\alpha) \cap M) \cup \{\alpha\}$ is a member of $\mathcal{F}$ and 0 otherwise. Verify that the resulting set $M$ is a $\subseteq$-maximal member of $\mathcal{F}$.
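When $X$ is finite, the transfinite recursion in this proof is an ordinary greedy loop. The Python sketch below illustrates it for the family of chains in a poset, which has finite character by the example above. The names `greedy_maximal` and `is_chain` are hypothetical, and the poset (the integers 1 through 12 ordered by divisibility, scanned in their usual order) is only a sample.

```python
def greedy_maximal(X_in_order, in_family):
    """Scan X in its well ordering; keep alpha when (M so far) + {alpha} is in the family."""
    M = set()
    for alpha in X_in_order:
        if in_family(M | {alpha}):
            M.add(alpha)
    return M


def is_chain(S):
    """Membership test for the sample family: S is a chain under divisibility."""
    return all(a % b == 0 or b % a == 0 for a in S for b in S)


M = greedy_maximal(range(1, 13), is_chain)
```

The result depends on the well ordering chosen for $X$; a different scanning order may produce a different maximal chain.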
Chapter 4
More about Sups and Infs
4.1. Chapter overview. Sups and infs were introduced briefly in Chapter 3. This chapter investigates sups and infs further and introduces the related notions of Moore closures and order completeness. Order completeness of a poset refers to the existence of sups and infs in that poset. Order completions can be constructed most easily using polars, a special type of Moore closure.
MOORE COLLECTIONS AND MOORE CLOSURES
4.2. The term "closure" has several different meanings in mathematics. Most of the meanings of "closure" are specializations of the Moore closure, defined below. (An exception: the "pretopological convergence closures" introduced in 15.4 need not be Moore closures; see the example in Chapter 15.)
In this book we shall write a closure of a set $S$ as $\mathrm{cl}(S)$. Many mathematicians write the closure of a set $S$ as $\overline{S}$. However, that notation has certain disadvantages:
(i) It is used for other purposes (e.g., complex conjugation).
(ii) It becomes awkward if one wishes to work simultaneously with two or more closures (e.g., from two different topologies). We shall use subscripted notation, such as $\mathrm{cl}_{\mathcal{S}}(S)$ and $\mathrm{cl}_{\mathcal{T}}(S)$, if we need to distinguish between several different closures.
4.3. Let $X$ be a set, and let $\mathcal{C}$ be a collection of subsets of $X$. We shall say $\mathcal{C}$ is a Moore collection of sets if:
(i) $X \in \mathcal{C}$, and
(ii) $\mathcal{C}$ is closed under arbitrary intersection, i.e., if $\{S_\lambda : \lambda \in \Lambda\} \subseteq \mathcal{C}$, then $\bigcap_{\lambda \in \Lambda} S_\lambda \in \mathcal{C}$.
(If we adopt the convention that the intersection of no subsets of $X$ is just $X$, then condition (i) can be omitted; it follows from (ii) by taking $\Lambda = \varnothing$.) A Moore collection is a collection of sets that is closed under arbitrary intersection, i.e., under arbitrary infimum with respect to the ordering given by inclusion.
In the present context, members of $\mathcal{C}$ will be called Moore closed sets, or just closed sets if the context is understood. Now, let any set $S \subseteq X$ (not necessarily a member of $\mathcal{C}$) be given. Then there exist closed sets that are supersets of $S$; for instance, $X$ itself is such a set. Among all the closed supersets of $S$, there is a smallest, namely, the intersection of all the closed supersets of $S$. We shall call it the Moore closure (or more simply, the closure) of $S$, relative to the collection $\mathcal{C}$, and we shall denote it by $\mathrm{cl}(S)$. In this fashion we define a mapping $\mathrm{cl} : \mathcal{P}(X) \to \mathcal{C}$, called the Moore closure operator associated with the collection $\mathcal{C}$. (It is easy to see that a set $T \subseteq X$ is closed if and only if $\mathrm{cl}(T) = T$.) In some cases (i.e., for some choices of $X$ and $\mathcal{C}$) we can also give some other, equivalent description of $\mathrm{cl}(S)$ that may be more convenient, e.g., a characterization directly in terms of $S$, which does not mention the collection of all closed supersets of $S$.
Though Moore closures appear in many parts of mathematics, the terminology varies greatly. Other terms sometimes used for a Moore closure are hull, saturation, or saturated hull. Another term sometimes used for a Moore closed set is saturated set. The Moore closure of $S$ is also known as the member of $\mathcal{C}$ that is generated by $S$; we may call $S$ a generating set for $\mathrm{cl}(S)$. This terminology is particularly common when the elements of $X$ and $S$ are themselves subsets of some set $\Omega$, i.e., when $S$ is a collection of sets, and $\mathrm{cl}(S)$ is the collection of sets generated by $S$. In most applications of the concept, the "closed" / "closure" terminology is commonly used, or the "generated" / "generating" terminology is commonly used, but not both. We shall introduce appropriate terminology separately in each context.
The following table previews a few kinds of Moore closures that will be studied later. The top part of the table deals with sets of points; the bottom part of the table deals with sets of sets. (Of course, from the viewpoint of axiomatic set theory, all sets are sets of sets, but most mathematicians do not share that viewpoint.)

    Moore closed set              Moore closure
    ----------------              -------------
    sublattice                    sublattice generated
    full (order convex) set       full hull
    upper set                     up closure
    max closed gauge              max closure
    saturated set                 saturation
    topologically closed set      topological closure
    ideal (in a variety)          ideal generated by a set
    convex set                    convex hull
    balanced set                  balanced hull
    linear subspace               linear span

    filter of sets                filter generated
    ideal of sets                 ideal generated
    (σ-)algebra                   (σ-)algebra generated
    topology                      topology generated
    monotone class                monotone class generated
    Moore collection              Moore collection generated
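The definition of the Moore closure as the intersection of all closed supersets can be computed directly when $X$ is finite. The following Python sketch is an illustration under that assumption; the name `moore_closure` and the sample Moore collection (all intervals of $\{0, \ldots, 5\}$, including the empty set) are hypothetical choices. Starting the intersection from $X$ itself matches the convention that the intersection of no subsets of $X$ is $X$.

```python
def moore_closure(S, X, closed_sets):
    """Intersection of all members of closed_sets that contain S."""
    S = frozenset(S)
    result = frozenset(X)                 # X is always a closed superset
    for C in closed_sets:
        if S <= C:
            result &= C
    return result


X = frozenset(range(6))
# Sample Moore collection: all "intervals" {a, ..., b-1} of X, including the empty set.
closed = {frozenset(range(a, b)) for a in range(7) for b in range(a, 7)}

hull = moore_closure({1, 4}, X, closed)
```

In this sample collection the closure of a set is the smallest interval containing it, so the closure of $\{1, 4\}$ fills in the points between 1 and 4.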
Show that C(p. b) is full. then ~ is a member of S.b E X let [ a . b] or (a. then a is a member of S. McKenzie. if x ~ y. any two sets C ( p l . called the full hull of T.e.b]. ~) be a preordered set. and let S C_ X.T)'s as the full c o m p o n e n t s of T. a..8) or topological (introduced in 5. Show" (i) If S1. The term "closure" by itself is commonly used by algebraists to refer to any Moore closure (as defined above). A few examples of Moore closures.80 Chapter 4: More about Sups and Infs Further remarks about the terminology. but the term "closure" usually is used by analysts to refer only to topological closures. Hence any set T c_ X is contained in a smallest full superset. C(p2.T) be the union of all the full sets S that satisfy p E S C T. T) is not a proper subset of some other full subset of T. We shall follow the analysts' convention in some parts of this book since this book is largely devoted to the foundations of analysis. We say that S is u p .c l o s e d if.T) is maximal for that property i. b.c l o s e d . Let (X. although the notation certainly has changed since then. For a. Show that the full hull of T is equal to Ua. b E S ~ [a. C(p.b. b] or [a. y E S =v x E S. and Tsinakis [1993]. b ] .24. and Taylor [1987].c l o s e d if. T) is a full subset of T and that C(p. whenever A is a nonempty subset of S and a . Let (X. T) are either identical or disjoint. d o w n . +cc]. b] C_ S. i n f . (ii) For any T C X and p E T. Very few closures of interest are both algebraic and topological. This will be used in 15.19). y E S =v x E S. 9 x~sforsomesES}. We may refer to the C(p. then S1 U $2 is full. let C(p. $2 are full sets that are not disjoint. Most Moore closures of interest are either algebraic (introduced in 4. T ) . Let (X. or a l o w e r set. A set S C_ X is called full (or o r d e r c o n v e x ) if a.sup(A) exists in X. Show that the sets C(p. whenever A is a nonempty subset of S and c in X.5. b) or (a. or an u p p e r set. that is clear from 16. 
~) be a partially ordered set.) Show that the full subsets of X form a Moore collection. (For example. .c and 17. Evers and Maaren [1985]. any interval [a. 4. _<) be a chain.c l o s e d .35.b and 5. inf(A) exists (Lower sets were defined in 3.4.16.{x E X ' a ~ x g b}. The basic property given in 4. T) form a partition of T that is.a is due to Moore [1910].16. McNulty.8. if x ~ y. s u p .) Show that the collections of such sets are Moore collections. For later applications we note some further properties of full subsets of chains. with resulting Moore closures as follows: upcl(S) downcl(S)  {xEX {xEX 9 x~sforsomesES}.DET[a. in [c~. Most properties of closures in this chapter are taken from Cohn [1965].
Similarly. if dl. but may simplify the proofs. We shall say that D is m a x . We may also denote it by dl V d2 V . . . The resulting Moore closure may be described as follows: If 8 is a collection of subsets of X. . do If 5I~. . This also determines a Moore closure. c.Moore Collections and Moore Closures supcl(S) infcl(S) Show also that (i) Any upclosed set is supclosed. d 2 . . another name for maxclosed is s a t u r a t e d . a gauge D is closed under addition. VP. Note that if D is maxclosed or closed under addition. it is the smallest pseudometric that is larger than or equal to all the dj's. or sumclosed. d 2 . y ) . . dn E D =:> dl V d2 V . then the smallest collection that is closed under finite union and contains 8 is Ucl(g) = { M C_ X 9 M is the union of finitely many members of g}. . (In the wider literature. can be defined by (VP)(x.~. d n ( x . {x E X {x E X 9 x . d2 E D => dl V d2 E D or. if dl.  81 9x sup(A) for some nonempty A c S}.15.. then D is directed. . this determines a type of Moore closure on the collection of all pseudometrics on X. a collection of pseudometrics. V dn.. equivalently. . D may be replaced by a directed gauge. If P = { d l . Clearly. any union of downclosed sets is downclosed. W h e n P contains just a single pseudometric. In 5.. . for many purposes.dn} is a finite collection of pseudometrics on a set X (defined in 2.e. then another pseudometric. } 3V[~ is also closed under finite union. 5 I ~ . Thus.. . the m a x c l o s u r e of a gauge D is the gauge maxcl(D) {VP 9 P is a finite subset of D } . . VP is the supremum of P in the family of all pseudometrics i.i n f ( A ) for some nonempty A c_ S}.y)  m a x { d l ( x . in 18.0. there exists some pseudometric d E D such that VP _< d. this replacement will not affect the hypotheses. (ii) A set S is upclosed if and only if X \ S is downclosed. Hence.11).. are collections of subsets of X.13 we shall see that any gauge D is uniformly equivalent to its max closure and its sum closure. 
.) (iv) 2~ is upclosed and downclosed. .c l o s e d if dl. then VP equals that pseudometric. then ["I~E{~. which we shall call the sum closure. For some theorems. (This property is not shared by most Moore collections. any downclosed set is infclosed. each of which is closed under finite union. 510.y). ) Clearly. . the collection {5I c_ [P(X)" 5{ is closed under finite union} is a Moore collection of subsets of [P(X). . d2(x.. D may be replaced with its max closure or sum closure i. (iii) Any union of upclosed sets is upclosed. . " V dn E D.e. . Preview.h we shall see that any gauge D is topologically equivalent to its max closure and its sum closure. Let D be a gauge on X that is. y ) } .d2 E D => dl + d2 E D. A gauge D is d i r e c t e d if for each finite set P C_ D.
Similarly, the other "closures" defined in 1.30 are also Moore closures. In particular, the closure under arbitrary intersections is a Moore closure. Thus the collection of all Moore collections of subsets of $X$ is a Moore collection of subsets of $\mathcal{P}(X)$.

Let $\sim$ be an equivalence relation on a set $X$; let $Y$ be the quotient set, and let $\pi : X \to Y$ be the quotient map. Say that a set $S \subseteq X$ is $\pi$-saturated, or $\sim$-saturated, if it is closed under this equivalence, i.e., if
$$x_1 \in S, \quad x_1 \sim x_2 \quad \Rightarrow \quad x_2 \in S;$$
that is, if $S$ is a union of equivalence classes. The collection of saturated subsets of $X$ is a Moore collection, i.e., it is closed under intersection. Hence we can define the corresponding Moore closure: The $\pi$-saturation, or $\pi$-saturated hull, of a set $A \subseteq X$ is the smallest $\pi$-saturated set that contains $A$; it is the intersection of the $\pi$-saturated sets that contain $A$. Show that
$$\pi^{-1}(\pi(A)) = \bigcup_{a \in A} \{x \in X : x \sim a\} = \text{the } \pi\text{-saturation of } A.$$
The forward image mapping $S \mapsto \pi(S) = \{\pi(x) : x \in S\}$ is usually defined as a mapping from $\mathcal{P}(X)$ into $\mathcal{P}(Y)$ (see 2.7), but if we restrict it to a smaller domain, we get a bijection from $\{\pi\text{-saturated subsets of } X\}$ onto $\mathcal{P}(Y)$, whose inverse is given by the inverse image mapping $T \mapsto \pi^{-1}(T) = \{x \in X : \pi(x) \in T\}$. This bijection preserves the basic set operations: complementations, intersections, and unions. That is,
$$\pi(X \setminus S) = Y \setminus \pi(S), \qquad \pi\Big(\bigcap_{\alpha \in A} S_\alpha\Big) = \bigcap_{\alpha \in A} \pi(S_\alpha), \qquad \pi\Big(\bigcup_{\alpha \in A} S_\alpha\Big) = \bigcup_{\alpha \in A} \pi(S_\alpha)$$
for any $\pi$-saturated sets $S$ and $S_\alpha$.

4.5. Axioms for a Moore closure operator. Let $X$ be a set, and suppose we are given a function $\mathrm{cl} : \mathcal{P}(X) \to \mathcal{P}(X)$. Show that "cl" is the Moore closure operator for some Moore collection of subsets of $X$ (as defined in 4.3) if and only if "cl" satisfies these three rules:
$$S \subseteq \mathrm{cl}(S) \qquad \text{(extensive)}$$
$$\mathrm{cl}(\mathrm{cl}(S)) = \mathrm{cl}(S) \qquad \text{(idempotent)}$$
$$S \subseteq T \ \Rightarrow\ \mathrm{cl}(S) \subseteq \mathrm{cl}(T) \qquad \text{(isotone)}$$
for all sets $S, T \subseteq X$. Of course, if "cl" does satisfy these axioms, then the corresponding Moore collection $\mathcal{C} \subseteq \mathcal{P}(X)$ is uniquely determined by "cl": it consists of those sets $S \subseteq X$ that satisfy $\mathrm{cl}(S) = S$.
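The three axioms for a Moore closure operator can be verified by brute force on a small finite set. The sketch below does this for a saturation operator; the set `X` and the quotient map `pi` (reduction mod 3) are hypothetical choices for illustration, and the helper names are not from the text.

```python
from itertools import combinations

X = frozenset(range(6))

def pi(x):
    """Hypothetical quotient map: x ~ y iff x and y agree mod 3."""
    return x % 3

def saturate(A):
    """The pi-saturation of A, computed as the inverse image of pi(A)."""
    image = {pi(a) for a in A}
    return frozenset(x for x in X if pi(x) in image)

def subsets(S):
    """All subsets of a finite set S, as frozensets."""
    items = list(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def satisfies_moore_axioms(cl):
    """Check that cl is extensive, idempotent, and isotone on P(X)."""
    all_sets = subsets(X)
    for S in all_sets:
        if not S <= cl(S):            # extensive
            return False
        if cl(cl(S)) != cl(S):        # idempotent
            return False
        for T in all_sets:
            if S <= T and not cl(S) <= cl(T):   # isotone
                return False
    return True
```

For instance, `satisfies_moore_axioms(saturate)` holds, while the complement map `lambda S: X - S` fails already at the extensive axiom.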
Exercise (optional).{c~.c. SOME SPECIAL TYPES OF MOORE CLOSURES 4. ) E E also. hence it satisfies the conclusions of 4. Then each subset of e has a sup and an inf in e. cl./3. ~y. The closure obtained in this fashion is called the c l o s u r e w i t h r e s p e c t t o t h e o p e r a t i o n s 9 . It also satisfies for any sets S~ c_ X (c~ E A). It is easy to see that the sets that are "closed" in this sense satisfy Moore's axioms 4. and let 9 be a collection of Aary operations on X (defined as in 1. then ~b(e~. e ~ .6.. x ~ .3(i) and (ii). . . Hence they are the closed sets for a Moore closure operator. it is not intended to imply that the index set A .. and e~.xEA S. Note that any Moore closure on X is an isotone mapping from {P(X) into {P(X). e/~. The empty set is closed under the operations 9 if and only if none of the operations in 9 is nullary..32.{c~. .) Indeed. . B . let cl" [P(X) + {P(X) be a given mapping satisfying the axioms in 4.} is a countable or ordered. .A for every A E A otherwise.'~. Here the notation is as in 1. Actually. are members of E. C_) is a complete lattice. Observation. Let X be a set. thus it is the s u p r e m u m of the Sx's../~. 4. e/~.a.29.x) is the smallest member of e that contains all the Sa's. Different members of 9 may have different A's. finite or infinite. Here is an elementary example: A collection g of subsets of a set X is closed under finite union if and only if g is closed under the binary operation tO 9 ~P(X) x ~P(X) + [P(X).. thus it is the i n f i m u m of the Sx's. For each set A c_ X and each point z E cl(A). e ~ . Then A = NXEA Ss is the largest member of e that is contained in all the Sx's. with index set A ..41). defined as in 4. see the definition in 4.cl(U. Many Moore closures used in applications are formulated as closures with respect to some sort of operations. . Let e be a Moore collection of subsets of X. define a Aary operation ~)A. . and we permit the A's to be empty or nonempty.3.5. 
every Moore closure can be represented as a closure under operations (although such a representation is not necessarily helpful). (Thus (e. . let any collection g = {Sa : ~ E A} c_ e be given. Hints: Let cl be a Moore closure on a set X that is.Some Special Types of Moore Closures 83 b. .z 9X A + X by taking ( r Z ' xz.7. c.13.)  / some member of {x~" ~ E A} if x~ . Also.}. We shall say that a set E C_ X is c l o s e d u n d e r t h e o p e r a t i o n s 9 if it has this property" Whenever ~b is a Aary operation in ~. .
. . a constant function.) Let 9 be the collection of all operations formed in this fashion. Xn is empty.x2. . Proof.X2.3. Thus.Xn) . It remains only to prove (D) ~ (A). . . then f is a nullary operation . . then the union of all the members of e is a member of iK. that is. f ( x l . We may write F . Let X be a set.d and 12. x 2 .. ?tn) otherwise. In that case the list of arguments Xl.f(). let :K be a Moore collection of subsets of X.. Now let any S G L be given.Xn)  o" G cl(F) C_ cl(K)  K. f(Xl.xn G K. Now define f " X n ~ X by ) C_ K for each K E K. . X 2 .0.0 or ( X l . By (D). We may assume cl(S) is nonempty and let any a E cl(S) be given. Un} for some nonnegative integer n (which is 0 if F is the empty set).. x 2 . See also the related exercise in 16. .{el(D)" D E 9 For (C) =~ (D). . take 9 { F C_ S " F is finite}. . then we form a nullary operation ~Po. ?t2. X n ) E K.Xn) a Xl if n . .. .. This is clear in the case where f ( x i . In the remaining case. (A) =v (B) is an easy exercise. see 9. Then the following conditions are equivalent. . It is easy to see that [K c_ L. x 2 .Xl. . . "We claim t h a t f E ~. X (for nonnegative integers n) that satisfy f( K•215215 n times Let L be the collection of subsets of X that are closed under the operations ~. Let 9 be the collection of all finitary operations f " X n .L.. (For some important examples. . f(xl. Xn)  (?tl. Indeed.21. we say that iK is an a l g e b r a i c c l o s u r e s y s t e m and cl is an a l g e b r a i c c l o s u r e o p e r a t o r .8. where the A's are finite sets.i.. X 2 . and let cl 9 IP(X) ~ K be the resulting Moore closure operator.{ U l . Here it is understood that if n ..e. .) (A) cl is the closure with respect to some collection 9 of finitary operations on X that is. . . .. (C) Whenever 9 is a collection of subsets of X that is directed by inclusion. Aary operations. 9 D) C (D) For set S c_ X we have cl(S) U { c l ( F ) " F is a finite subset of S}. when A is the empty set. . .. 
if there are any such z's. . . In other words. For (B) =~ (C). we are to show t h a t f ( x l . .84 Chapter 4: More about Sups and Infs (In particular. then the union of all the members of C is a closed set. Theorem and definition. verify that closure under the operations 9 is the same as the given Moore closure. it suffices to show that cl(S) c_ S.  4.z() z for each z in cl(O). if C C_ K and the union of any two members of e is a subset of a member of C. . We shall show t h a t iK . let e . it suffices to show t h a t a E S. we have F c_ K and. then (U. there is some finite set F C_ S such that a E cl(F). let any K E X be given and any X l . . .. (B) Whenever e is a subset of X that is directed by inclusion.. hence all. . Xn) .8. If one. we wish to show that S G K. of these conditions are satisfied. U 2 .. x 2 . therefore.b.
Thus $f \in \Phi$. By assumption $S \in \mathcal{L}$, and so the set $S$ is closed under the operation $f$. Hence $\sigma = f(u_1, u_2, \ldots, u_n)$ is a member of $S$.

4.9. Definitions. Let $X$ and $Y$ be sets. A polar from $X$ to $Y$ is a mapping $p : \mathcal{P}(X) \to \mathcal{P}(Y)$ that satisfies
$$p\Big(\bigcup_{i \in I} A_i\Big) = \bigcap_{i \in I} p(A_i)$$
for any collection $\{A_i : i \in I\}$ of subsets of $X$. The dual of $p$ is the mapping $q : \mathcal{P}(Y) \to \mathcal{P}(X)$ defined by
$$q(B) = \bigcup \{S \subseteq X : B \subseteq p(S)\} \quad \text{or, equivalently,} \quad q(B) = \{x \in X : B \subseteq p(\{x\})\}.$$
We shall also write $p(A) = A^{\triangleleft}$ and $q(B) = B^{\triangleright}$. (In many books, one symbol is used for both $\triangleleft$ and $\triangleright$. Typically it is $\circ$ or $\perp$ or $\top$.)

Exercises/basic properties:

a. Any polar is antitone, i.e., if $A_1 \subseteq A_2$ in $X$, then $p(A_1) \supseteq p(A_2)$ in $Y$.

b. The dual of a polar is also a polar. Moreover, if $q$ is the dual of $p$, then $p$ is the dual of $q$. Thus we may speak of $p, q$ as a polar pair between $X$ and $Y$.

Examples of polars will be given in studying Dedekind cuts (see 4.34) and topological vector spaces (see 28.25); several other examples are also mentioned in 4.12.

4.10. Let $X$ and $Y$ be sets, and let $p : \mathcal{P}(X) \to \mathcal{P}(Y)$ and $q : \mathcal{P}(Y) \to \mathcal{P}(X)$ be some given functions. Then the following four conditions are equivalent.

(A) $p$ and $q$ are a polar pair.

(B) $p(A) = \{y \in Y : A \subseteq q(\{y\})\}$ and $q(B) = \{x \in X : B \subseteq p(\{x\})\}$ for all $A \subseteq X$ and $B \subseteq Y$.

(C) $p$ and $q$ are antitone (as in 4.9.a), and $q \circ p$ and $p \circ q$ are extensive: $A \subseteq q(p(A))$, $B \subseteq p(q(B))$ for all $A \subseteq X$, $B \subseteq Y$.

(D) There exists a set $\Gamma \subseteq X \times Y$ such that, for all $A \subseteq X$ and $B \subseteq Y$,
$$p(A) = \{y \in Y : A \times \{y\} \subseteq \Gamma\}, \qquad q(B) = \{x \in X : \{x\} \times B \subseteq \Gamma\}.$$
This condition can be restated in the triangle notation as:
$$A^{\triangleleft} = \{y \in Y : A \times \{y\} \subseteq \Gamma\}, \qquad B^{\triangleright} = \{x \in X : \{x\} \times B \subseteq \Gamma\}.$$

4.11. Further properties of polars. Suppose $p : \mathcal{P}(X) \to \mathcal{P}(Y)$ and $q : \mathcal{P}(Y) \to \mathcal{P}(X)$ are a polar pair. Then:

a. $p \circ q \circ p = p$ and $q \circ p \circ q = q$. Hint. This follows easily from 4.10(C).
Let e l .NXEA CI(Sx). x 2_ y .{0}.qop. the polar maps A ~ p ( A ) and B ~ q ( B ) are inverses of each other. d. Then the p o l a r s p ( S ) = S <~ and q(S) = S ~ are the same. For a simple counterexample. Instead it is given other names.cl (UAEA (Sk))" cl(NxE A Sx) . Let us first restate. (ii) 0 2 _ x f o r a l l x E X . (Hence this Moore closure is not a topological closure.{0}. respectively. Consequently. which do not apply to all polar pairs. they give a bijection between the closed subsets of A and the closed subsets of B." (UAEA SA) • . in the present notation. at least for the moment. using 4. We now describe a special type of polar pair. We now apply the results of the preceding sections. Let A1 = {rational numbers} and A2 = {irrational numbers}. Show: ~_L _ {0}• _ X and X • . q o p and p o q are Moore closure operators in X and Y.) S H S •177is a Moore closure operator on X. Thus the empty set is not a closed set. Generalized orthogonality. that is. some of the conclusions already reached in the preceding sections: S C S •177 n d s • a S c_ T =~ S • _D T • •177177 (Thus the mapping S H S • is antitone.y for all x E S}. This set is called the o r t h o g o n a l c o m p l e m e n t of S.) . Assume X is a set. see 5. usually we say that x and y are o r t h o g o n a l (or p e r p e n d i c u l a r ) . Caution: This operator is not called a "closure" in most specialized contexts where it is applied. (iii) x 2 .x ~ x=0. and 2_ is a relation on X with these properties: (i) 2_ is symmetric.NXeA (S~) and (NXEA Sa) • . c. We shall denote it by cl. S • = {y E X : x 2. An analogous result also holds in Y for p o q. This result does not generalize to all Moore closures.10(D) with F = {(x.19. Applied to just the collections of closed sets.86 Chapter 4: More about Sups and Infs b. 4. but cl(A1) = cl(A2) = R. let cl be the usual topological closure when R is equipped with its usual topology. the closure of an intersection equals the intersection of the closures.z5.12. 
0 is some special member of X. Hence cl(~a) . y 2_ x.["liE/cl(Ai) for any collection {Ai" i E I} of subsets of X that is. respectively. If x _L y. Then cl (["IiEZ Ai) . The resulting closed subsets of X or Y are the sets A or B that satisfy A = q(p(A)) or B = p ( q ( B ) ) . such as "closed linear span. we shall denote them both by S • Thus. cl(A1 A A2) = ~ but el(A1) N cl(A2) = R. We also have a few new conclusions. y) E X x X : x _L y}. Then A1 N A2 = 2~.
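The orthogonal complement and the resulting closure $S \mapsto S^{\perp\perp}$ can be explored by brute force on a finite model. In the sketch below the set `X` of integer triples and the relation `orthogonal` (coordinatewise product zero) are assumptions chosen for illustration, a finite stand-in rather than a construction taken from the text; the names `perp` and `cl` are likewise hypothetical.

```python
from itertools import product

# Hypothetical finite model: triples with entries in {0, 1, 2}.
X = set(product(range(3), repeat=3))
ZERO = (0, 0, 0)

def orthogonal(x, y):
    """x is orthogonal to y when every coordinatewise product vanishes."""
    return all(a * b == 0 for a, b in zip(x, y))

def perp(S):
    """Orthogonal complement: all y in X orthogonal to every x in S."""
    return {y for y in X if all(orthogonal(x, y) for x in S)}

def cl(S):
    """The closure S -> (S-perp)-perp."""
    return perp(perp(S))

S = {(1, 0, 0)}
```

Here `perp(set())` is all of `X`, `perp(X)` is `{ZERO}`, and `cl(S)` consists of the three triples whose second and third coordinates vanish, so the empty set is not closed in this model.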
Examples. More examples will be given later in this book, in Riesz spaces (see 11.59) and in Hilbert spaces (see 22.50).

a. Let $X = \mathbb{R}^\Omega$ for some set $\Omega$, and let $x \perp y$ mean that $x(\omega)y(\omega) = 0$ for all $\omega \in \Omega$. Show that a set $S \subseteq X$ is closed (in the sense of the Moore closure) if and only if it is of the form $C_M = \{x \in X : x|_M = 0\}$ for some set $M \subseteq \Omega$. Show also that $(C_M)^\perp = C_{\Omega \setminus M}$. In Chapter 13 we shall see that the collection of closed sets obtained in the fashion indicated above is a complete Boolean lattice.

b. (This example requires some familiarity with analytic geometry from college calculus.) Let $X = \mathbb{R}^2$ and define $x \perp y$ to mean $x \cdot y = 0$, that is, $x_1 y_1 + x_2 y_2 = 0$. Represent $X$ by points in the plane. When $x$ and $y$ are both nonzero, show that $x \perp y$ means that the line segment from the origin to $x$ is perpendicular (in the usual geometric sense) to the line segment from the origin to $y$. Show that a subset of $X$ is closed (in the sense of this Moore closure) if and only if it is either $\{0\}$, $X$, or a straight line through 0. For an example of a set $S$ that is not closed, let $S$ be the line segment from the point $(1, 0)$ to the point $(2, 0)$; show that the closure of $S$ is the entire line $\{(x, y) : y = 0\}$.

LATTICES AND COMPLETENESS

4.13. Definitions. Let $(X, \preccurlyeq)$ be a poset. We say that

$(X, \preccurlyeq)$ is a lattice if every two-element subset of $X$ has a sup and an inf in $X$;

$(X, \preccurlyeq)$ is complete (or order complete, or a complete lattice) if every subset of $X$ has a sup and an inf in $X$;

$(X, \preccurlyeq)$ is a Dedekind complete poset if either of the following equivalent conditions holds. (Exercise. Prove the equivalence.)
(A) Whenever $S \subseteq X$ is nonempty and bounded above, then $S$ has a least upper bound in $X$.
(B) Whenever $S \subseteq X$ is nonempty and bounded below, then $S$ has a greatest lower bound in $X$.

Examples are given later in this chapter.

4.14. Remarks. The term "complete" generally means "not missing any parts," or "not having any holes or gaps," but this has several different meanings in different parts of mathematics. There are some strong analogies between the theories of order completions and uniform completions. (Uniform completions will be studied in later chapters.) It is possible to develop those analogies into a unifying theory,¹ but that theory is rather technical and complicated and not recommended for beginners. Readers of this book are urged to instead view order completeness and uniform completeness as two entirely unrelated concepts that, just by coincidence, use some of the same words and have slightly analogous meanings.

We also caution that the term "complete poset" has a more specialized and technical meaning among some algebraists, e.g., in domain theory. Formal logic also uses the term "complete" to mean "without holes," but the precise meaning is not closely related to order or uniform completeness. See Chapter 14.

¹See "reflective subcategories," in books on category theory, for other examples besides order completions and uniform completions.

4.15. Relations between types of posets.

a. Any lattice is both a poset and a directed set.

b. Every chain is a lattice.

c. An equivalent definition of lattice is: A poset in which every finite nonempty subset has a sup and an inf.

d. If $\preccurlyeq$ has any of the following properties, then $\succcurlyeq$ has the same property: lattice ordering, Dedekind complete, order complete.

e. A poset $(X, \preccurlyeq)$ is order complete if and only if it is order bounded and Dedekind complete.

f. If $(X, \preccurlyeq)$ is a Dedekind complete poset and both $\preccurlyeq$ and $\succcurlyeq$ are directed, then $(X, \preccurlyeq)$ is a lattice.

g. Any well ordered set is Dedekind complete.

4.16. Observations on products. With the product ordering (see 3.9.j), a product of lattices is a lattice; a product of Dedekind complete posets is a Dedekind complete poset; a product of complete lattices is a complete lattice. In each case the supremum or infimum in the product is defined pointwise.

MORE ABOUT LATTICES

4.17. If $(X, \preccurlyeq)$ is a lattice, then the binary operation $\wedge : X \times X \to X$ is both
commutative: $x_1 \wedge x_2 = x_2 \wedge x_1$, and
associative: $(x_1 \wedge x_2) \wedge x_3 = x_1 \wedge (x_2 \wedge x_3)$.
It follows that the operations in $x_1 \wedge x_2 \wedge x_3 \wedge \cdots \wedge x_n$ can be evaluated in any order (left to right, right to left, center to outside, etc.), and thus parentheses are not necessary. The value of the expression is the same as $\inf\{x_1, x_2, \ldots, x_n\}$. (Hint: 3.21.) Analogous conclusions apply for $\vee$'s and sups.
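Commutativity and associativity of $\wedge$ mean that a finite meet does not depend on the order of evaluation. The sketch below checks this on $\mathbb{Z}^2$ with the product ordering, where meets and joins are computed coordinatewise; the sample `points` are hypothetical data, and `meet` and `join` are illustrative helper names.

```python
from functools import reduce
from itertools import permutations

def meet(x, y):
    """Coordinatewise minimum: the inf in Z^2 with the product ordering."""
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):
    """Coordinatewise maximum: the sup in Z^2 with the product ordering."""
    return tuple(max(a, b) for a, b in zip(x, y))

# Hypothetical sample points.
points = [(3, 1), (2, 5), (4, 0), (2, 2)]

left_to_right = reduce(meet, points)
right_to_left = reduce(lambda acc, x: meet(x, acc), reversed(points))

# Evaluate the meet and the join in every possible order of the operands.
all_meets = {reduce(meet, order) for order in permutations(points)}
all_joins = {reduce(join, order) for order in permutations(points)}
```

Every ordering yields the same meet and the same join, so `all_meets` and `all_joins` are singletons.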
4.18. Miscellaneous properties.

a. In a lattice, $x \preccurlyeq y \iff x \wedge y = x \iff x \vee y = y$.

b. Not every subset of a lattice is a lattice; not every subset of a directed set is directed. For instance, $\mathbb{Z}^2$ with the product ordering is a lattice, but its subset $\{(x, y) \in \mathbb{Z}^2 : x + y = 0\}$ is not directed.

c. In a lattice, the union of two order bounded sets is order bounded. (Hence, in a lattice, the order bounded sets form an ideal of sets, in the sense of 5.2.)

4.19. Lattice diagrams. Lattices, particularly finite ones, can be illustrated with lattice diagrams. Two examples are given below. Elements of the lattice are indicated by vertices, i.e., dots. In these diagrams, we have $x \succcurlyeq y$ if there is a downward path from $x$ to $y$.

[Figure: lattice diagrams of $2^2$ and $M_3$.]

The first diagram shows the inclusion relation between the subsets of a two-element set. This lattice is known (among some lattice theorists) as $2^2$. The second diagram shows a lattice containing five members; 0 is the smallest member and 1 is largest. This lattice is sometimes known as $M_3$.

4.20. Meet-join characterization of lattices. We have defined "lattice" in terms of its ordering $\preccurlyeq$, but we shall now show that "lattice" can be defined instead in terms of the binary operations $\wedge$ and $\vee$.

Let $(X, \preccurlyeq)$ be a lattice. Show that these laws are satisfied, for all $x, y, z$ in a lattice:

L1 (commutative): $x \wedge y = y \wedge x$ and $x \vee y = y \vee x$,
L2 (associative): $x \wedge (y \wedge z) = (x \wedge y) \wedge z$ and $x \vee (y \vee z) = (x \vee y) \vee z$,
L3 (absorption): $x \wedge (x \vee y) = x$ and $x \vee (x \wedge y) = x$.

Conversely, suppose $X$ is a set equipped with two binary operations $\wedge, \vee$ that satisfy L1–L3. Show that $x \wedge y = x \iff x \vee y = y$. Define $\preccurlyeq$ by
$$(*) \qquad x \preccurlyeq y \iff x \wedge y = x \iff x \vee y = y.$$
Then show that $(X, \preccurlyeq)$ is a lattice. (Hint: First use L3 to prove that $x \vee x = x \wedge x = x$.)

4.21. Definition. Let $(X, \preccurlyeq)$ be a lattice. Then a sublattice of $X$ is a subset $S$ that is closed under the lattice operations $\vee, \wedge$, i.e., that satisfies
$$s_1, s_2 \in S \ \Rightarrow\ s_1 \vee s_2,\ s_1 \wedge s_2 \in S.$$
It then follows that $S$ is also a lattice, when equipped with the restrictions of $\vee, \wedge$.

The collection of all sublattices of a lattice $X$ is a Moore collection of subsets of $X$. The closure of any set $S \subseteq X$ is the smallest sublattice containing $S$; it is called the sublattice generated by $S$.

4.22. Example. The set $\mathbb{N} = \{\text{positive integers}\}$ is a lattice when ordered by this rule: $x \preccurlyeq y$ if $x$ is a divisor of $y$, that is, if $xu = y$ for some $u \in \mathbb{N}$. With this ordering, $u \vee v$ and $u \wedge v$ are the least common multiple and greatest common divisor of $u$ and $v$, respectively. A sublattice of $\mathbb{N}$ is given by $\{\text{divisors of } m\}$, for any positive integer $m$.

4.23. Example. If $(X, \preccurlyeq)$ is a lattice, $S$ is a subset, and $S$ is a lattice when equipped with the restriction of the ordering $\preccurlyeq$, it does not follow that $S$ is necessarily a sublattice of $X$.

For instance, let $V = \{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_1 + x_2 = x_3\}$. Then $V$ is a subset (in fact, a linear subspace) of $\mathbb{R}^3$. We shall order $V$ by the restriction of the product ordering; that is, $(x_1, x_2, x_3) \preccurlyeq (y_1, y_2, y_3)$ means $x_j \le y_j$ for all $j$. Then $V$ is a lattice (in fact, a vector lattice), but the lattice operations $\vee, \wedge$ determined on $V$ by the ordering $\preccurlyeq$ are not simply the restrictions of the lattice operations on $\mathbb{R}^3$. Rather, the reader should verify that
$$(x \vee y)_1 = \max\{x_1, y_1\}, \qquad (x \vee y)_2 = \max\{x_2, y_2\}, \qquad (x \vee y)_3 = (x \vee y)_1 + (x \vee y)_2,$$
and $\wedge$ is computed analogously with minima. For instance, let $x = (1, 2, 3)$ and $y = (3, -1, 2)$. Then the lattice operations of $\mathbb{R}^3$ yield $x \vee y = (3, 2, 3)$, which is not a member of $V$; the lattice operations of $V$ (defined by the formulas above) yield $x \vee y = (3, 2, 5)$.

This example may seem somewhat contrived, but it is actually quite typical of the behavior one sees in lattices of measures, which are discussed in later chapters.

4.24. Definition. For a lattice $(X, \preccurlyeq)$, the following two conditions are equivalent:
(A) $x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z)$ for all $x, y, z \in X$.
(B) $x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z)$ for all $x, y, z \in X$.
If these conditions are satisfied, we say $(X, \preccurlyeq)$ is a distributive lattice.

Definition. We shall say that a lattice $(X, \preccurlyeq)$ is semi-infinitely distributive if it satisfies either of the following conditions:
(A') $x \wedge \sup(S) = \sup_{s \in S}(x \wedge s)$,
(B') $x \vee \inf(S) = \inf_{s \in S}(x \vee s)$,
where the equations are to be interpreted in this sense: If the left side of the equation exists, then so does the right side, and they are equal.
If both of these two conditions are satisfied, the lattice $X$ is infinitely distributive.

It is clear that any semi-infinitely distributive lattice is distributive. In 5.21 we shall give an example of a semi-infinitely distributive lattice that is not infinitely distributive; thus the two laws (A') and (B') are not equivalent to each other.

Exercise. Show that conditions (A') and (B') are respectively equivalent to the following two conditions:
(A'') $\sup(R) \wedge \sup(S) = \sup\{r \wedge s : r \in R,\ s \in S\}$,
(B'') $\inf(R) \vee \inf(S) = \inf\{r \vee s : r \in R,\ s \in S\}$,
for any nonempty sets $R, S \subseteq X$. Again, each equation is to be interpreted in this fashion: If the left side of the equation exists, then so does the right side, and they are equal.

4.25. Examples.

a. Every chain is an infinitely distributive lattice.

b. If $\Omega$ is any set, then $(\mathcal{P}(\Omega), \subseteq)$ is an infinitely distributive lattice.

c. The five-element lattice $M_3$ is not distributive. (See the lattice diagrams earlier in this chapter.)

Further examples of infinitely distributive lattices will be given in 8.43.

4.26. A lattice homomorphism is a mapping $f : X \to Y$, from one lattice into another, that satisfies
$$f(x_1 \vee x_2) = f(x_1) \vee f(x_2) \quad \text{and} \quad f(x_1 \wedge x_2) = f(x_1) \wedge f(x_2)$$
for all $x_1, x_2$ in $X$. Lattice homomorphisms will be studied further in 8.48 and thereafter.

MORE ABOUT COMPLETE LATTICES

4.27. Some important examples.

a. The extended real line, $[-\infty, +\infty]$, was introduced in Chapter 1. Recall that it is obtained by adjoining two new objects, $-\infty$ and $+\infty$, to the real number system and defining $-\infty < r < +\infty$ for all real numbers $r$.

In Chapter 10, in our formal development of $\mathbb{R}$, we shall show that $\mathbb{R}$ is Dedekind complete. (More precisely, we shall prove that there exists a unique Dedekind complete ordered field and then define $\mathbb{R}$ to be that field.) For now, however, we shall "borrow" that result from Chapter 10: We shall accept the fact that $\mathbb{R}$ is Dedekind complete and use that fact in some examples below. It follows that $[-\infty, +\infty]$ is a chain that is order complete.

b. Observation. Let $\Lambda$ be any nonempty set. Then $\mathbb{R}^\Lambda = \{\text{functions from } \Lambda \text{ into } \mathbb{R}\}$ is Dedekind complete, and $[-\infty, +\infty]^\Lambda = \{\text{functions from } \Lambda \text{ into } [-\infty, +\infty]\}$ is a complete lattice, when these products are equipped with the product ordering.
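The distributive law (A) can be tested exhaustively on a finite lattice. The sketch below computes meets and joins directly from an order relation and then checks distributivity for the five-element lattice $M_3$ and for the power set of a two-element set. The element names and the helper functions `meet`, `join`, and `is_distributive` are hypothetical, chosen only for this illustration.

```python
from itertools import product

def meet(elements, leq, x, y):
    """Greatest lower bound of x and y, found by brute force."""
    lower = [z for z in elements if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))

def join(elements, leq, x, y):
    """Least upper bound of x and y, found by brute force."""
    upper = [z for z in elements if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))

def is_distributive(elements, leq):
    """Check x ^ (y v z) == (x ^ y) v (x ^ z) for all triples."""
    for x, y, z in product(elements, repeat=3):
        lhs = meet(elements, leq, x, join(elements, leq, y, z))
        rhs = join(elements, leq,
                   meet(elements, leq, x, y),
                   meet(elements, leq, x, z))
        if lhs != rhs:
            return False
    return True

# M3: bottom "0", top "1", and three pairwise incomparable atoms.
M3 = ["0", "a", "b", "c", "1"]
def leq_m3(u, v):
    return u == v or u == "0" or v == "1"

# The power set of a two-element set, ordered by inclusion.
POW2 = [frozenset(), frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})]
def leq_subset(u, v):
    return u <= v
```

In $M_3$ the triple $(a, b, c)$ already fails the law, since $a \wedge (b \vee c) = a$ while $(a \wedge b) \vee (a \wedge c) = 0$.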
4. and ~ .92 4. then the inclusion map D from (D.32.. ~). In fact. Then: (i) If D is infdense in (X.1] is not Dedekind complete. It is easy to see that in this case L~ . Then f has at least one fixed point i. show that f(p) .e.30. hence it is both a directed ordering and a partial ordering. Let (L.p.29. 4) be a complete lattice. and suppose f : L ~ L is isotone. Furthermore. It is not a chain ordering if X contains more than one element. For any set X. ~). T a r s k i ' s F i x e d P o i n t T h e o r e m . Let X and Y be complete lattices. Hints: S is nonempty.e. and inf(U~). a.{d E D 9 d 4 ~} is nonempty. that is evident from an example in 15. xl ~ x2 =~ f ( x l ) ~ f(x2). Then )~EA AEA for any set {xx 9A E A} c_ X. It is easy to see that in this case U~ = {d E D 9 d ~ ~} is nonempty. x Dually.e.. if each point ~ in X is the sup (in X) of some nonempty subset of D. Definitions. Proposition.31. Chapter 4: More about Sups and Infs Miscellaneous properties.d e n s e in X if X is the sup closure of D i. not every subset of a Dedekind complete poset is Dedekind complete. c.. show that p E S. ~) to (X. c_ X is suppreserving . A set D c_ X is s u p . Let ( X . 4. and let D C_ X. if point ~ in X is the inf (in X) of some nonempty subset of D. [0. 1] (with its usual ordering) is order complete. let this restriction be denoted again by ~.d e n s e in X if X is the inf closure of D i. For instance. a set D C_ X is i n f .sup(S). the ordering c_ makes [P(X) into a complete lattice. Let D be ordered by the restriction of the ordering ~. Not every subset of a complete lattice is a complete lattice. there exists at least one point p E L such that f(p) = p. that largest fixed point is also the largest member of the set S = {x E L : x 4 f(x)}. b. ~ ) be a poset. and suppose f : X ~ Y is an orderpreserving function that is.sup(L~). Neither of these 4 ' s is necessarily equality. since it includes the first member of L. Let p . but Q N [0. ORDER COMPLETIONS 4.11. 
among the fixed points there is a largest one.
We shall only prove (i). We shall say that X is a D e d e k i n d c o m p l e t i o n of D if (i) D C_ X.34. Let S C_ D be nonempty. Note that condition (2) can be restated as: (2a) A is the set of lower bounds of B. and the ordering of D is the restriction to D of the ordering of X. it follows that cr 4 d. Outline of proof. we wish to show that a 4/3. Consider any d E U~. Define L~.suPD(S ) exists. and (iii) (X.c. Then d ~ / 3 ~ s for every s E S.1). we have shown that U~ C_ U~. and (2b) B is the set of upper bounds of A. (A survey of different kinds of completions applicable to lattice groups was given by Ball [1989]. 4 ) t o (X. c This type of completion might be more precisely named a "generalized Dedekind completion. respectively. See also 4. The literature contains many different kinds of order completions. and (2) A . Let ~ be any upper bound for S in X.{(u. Since the infimum operation is antitone (see 3. 4) be the givenposet.v) E D x D 9 u 4 v } . Let D and X be posets. Definition.e. Thus d E U~ i. Note that the inclusion D ~ X is then suppreserving and infpreserving.) The following notion of completion seems to be best suited for the purposes of this book. Since a is the least of all the upper bounds for S that lie in D. 4. ~ ~ or. we wish to show that it is the least upper bound in X. E x i s t e n c e T h e o r e m . Let (D.B > and B . and d lies in D. then the inclusion map D from (D.33.. B) such that (1) A and B are nonempty subsets of D.9. then (ii) follows since it is dual to (i). Proof. 4) . By a c u t we shall mean an ordered pair (A. 4 ) i s Dedekind complete.31. .36.Order Completions 93 c X is infpreserving ~ (ii) If D is supdense in (X.10(D). 4.A < in the sense of 4. Then cr is also an upper bound for S in X. it follows that infx(U~) ~ infx(U~) that is.32. and define a polar pair between D and itself as in 4.21. (ii) D is both supdense and infdense in X. Every poset has a Dedekind completion. by 4. U~ as in 4. 
with partial orderings both denoted by 4. Hence d is an upper bound for S. and assume that c r . We shall use "suPD" and "suPx" to denote the supremum in D or in X. denote an infimum analogously. ~). It follows that A is downclosed and B is upclosed." since the term "Dedekind completion" usually refers to chains. DefineF .
{eED'e~d}). j(d)  ({ecD'egd}. but it then becomes more . Verify that dl 4 d2 ~ j ( d l ) ___j(d2).and infpreserving. U) is a cut in Q.~ E A} be a nonempty subset of X. 4. 4) be any poset. U {qcQ 9 q>0andq2 >2}. Be) For each d E D define : :A1 C_ A2 : :B1 _D B2. such that the inclusion is both sup. Then there exists a complete lattice (X. in which case they are the same. 4. in the sense of the proof in 4. Then the pair (L.{(A~. X has a first element if and only if D has a first element. hence we may view D as a subset of X by identifying each d c D with its image j(d). We shall see in Chapter 10 that the order completion of Q is the real number system ]R. Example. let S . Hint: 4.B2) satisfy A1 c_ A2 ~ B1 _D 132. Show that any cuts (A1.15. Verify that if S has a Elower bound. In the theorem below we shall consider Dedekind completions only for chains. B1) E (A2.n~eA A~ and B_ . If D is a lattice. the cut (L. Use those two facts to show that D is infdense and supdense in X. Further properties of the Dedekind completion. the result will be t~e same. b.34. 4) with D c_~ X.f.p(A_).B~) 9 . then the ___sup of S is the pair ( A + . It is easy to verify that j(d) is a cut and that the mapping j 9D ~ X is injective. B + ) .) 4. verify that any cut (A. Let (D. Let q<0orq2<2}.B1) and (A2. (Such a complete lattice X is sometimes called a M a c N e i l l e c o m p l e t i o n of D. Remarks. then X is a lattice. To show that (X.q(B+). Let Q L{qEQ" {rational numbers} have its usual ordering. c. thus the ordering of D is the restriction of the ordering of X. B) is the _infimum of {j(b)" b E B} and the Esupremum of { j ( a ) ' a c A}. hence we may define a partial ordering __ on X by: (A1. Let X be a Dedekind completion of a poset D.) Hint: Adjoin a lowest element and a highest element. E) is Dedekind complete. Similarly for last elements. where B+ . The theorem can be extended to a more general setting.37. where A_ . and then take the Dedekind completion. 
U) described above corresponds to the number v/2 in R.36. Then: a.35. B _ ) .N ~ A B~ and A+ . (Or take the Dedekind completion first if you prefer. and if S has a _Eupper bound. then the Einf of S is the pair ( A _ .94 Chapter 4: More about Sups and Infs Let X be the set of all cuts.
T h e n we cannot have a > b.sup(Lx) and y . Define L~.i. Suppose X is not a chain. Then there exist two distinct elements x. For simplicity of notation.sup(Lv) . it follows t h a t a < b. Define L~. and Q is another chain t h a t is Dedekind complete.sup(Lx). . For each s E S we have s . and suppose a .s u p { F ( s ) 9 s E S} exists in Q and equals F(cr). Thus we may assume S is downclosed in X.. ) If X is a Dedekind completion of a chain D. since x . Consider any a E Lx and b E Uv.Order Completions 95 complicated. Proof of linear ordering. U~ as in 4. Since x . Since D is a chain. Hold a fixed. Theorem on completions of chains. Therefore f ( d l ) is an upper bound for the set f ( L x ) . For each x E X. Most other Dedekind complete structures used in analysis can be obtained by putting together copies of IR in various ways. (**) where L~ . we are to show t h a t q .{d E D " d < ~}. and f " D ~ Q is a suppreserving mapping. sup f ( L x ) exists in Q. and therefore by 3. U~ as in 4. y E X t h a t are not c o m p a r a b l e .sup(S) in X. If f has a suppreserving extension F " X + Q. X2 are two Dedekind completions of D. it follows t h a t x ~ y.25. It is easy to see t h a t this function is increasing and is an extension of f . This reasoning is applicable for every a c Lx. Also. so Lx c_ L v.e. this does not affect our hypotheses or desired conclusion. then f extends uniquely to a suppreserving mapping F " X ~ Q. Hence S V/D .sup(L~). from 3.) The Dedekind completion of a chain D is unique up to isomorphism over D. we may replace S with the set {x E X 9x < s for some s E S}.) If X is a Dedekind completion of a chain D.38.U~Es L~. since f is suppreserving on D. (i) (ii) (Linear ordering.d we see t h a t L~. the set Lx is nonempty and is bounded above in D by any dl E Ux. (Extension of m a p p i n g s .m. t h a t extension must satisfy (**). thus a is a lower bound for Uy. in the following sense: If X1. so a E L v. 
We are mainly concerned with completing Q to obtain IR. In fact. 4. Let S be a nonempty subset of X. c_ S U {~}. then X is also a chain. then there exists an order isomorphism from X1 onto X2 t h a t maps each m e m b e r of D to itself. and let b vary over all of Uy.{ f ( d ) " d E Lx}. Proof of extension of mappings. (iii) (Uniqueness of completions. we shall not need the greater generality later in this book.31. F must be defined by this formula: F(~) supf(L~)  s u p { f ( d ) " d E L~}.21.31. such that neither x ~ y nor y ~ x is valid. Hence a function F " X + Q can be defined by (**). It suffices to show the function F defined by (**) is indeed suppreserving. since t h a t would imply x ~ a ~ b ~ y. Since Q is Dedekind complete.
For each $s \in S$ we have $s \preceq \sigma$ and hence $F(s) \preceq F(\sigma)$; thus the set $\{F(s) : s \in S\}$ is bounded above by $F(\sigma)$. Since $Q$ is Dedekind complete, it follows that $q = \sup\{F(s) : s \in S\}$ exists in $Q$ and that $q \preceq F(\sigma)$. It remains to show the reverse of this inequality. If $\sigma \notin D$, then $L_\sigma \subseteq S$, and so

$F(\sigma) = \sup\{f(d) : d \in L_\sigma\} \preceq \sup\{F(s) : s \in S\} = q.$

On the other hand, if $\sigma \in D$, then (since $f$ is sup-preserving on $D$)

$F(\sigma) = f(\sigma) = f(\sup(S \cap D)) = \sup(f(S \cap D)) \preceq \sup(F(S)) = q.$

Proof of uniqueness of completions. For $k = 1, 2$, let $f_k : D \to X_k$ be the inclusion map. Using the Extension Property with $X = X_j$ (for $j = 1, 2$) and $Q = X_k$, we see that $f_k$ extends uniquely to a sup-preserving mapping $F_{jk} : X_j \to X_k$. Thus $F_{jk}$ is the only sup-preserving mapping from $X_j$ into $X_k$ that leaves elements of $D$ fixed. Since the identity map of $X_1$ is a sup-preserving map that leaves elements of $D$ fixed, it follows that $F_{11}$ is the identity map on $X_1$ and that this is the only sup-preserving map from $X_1$ into itself that leaves elements of $D$ fixed. Analogous statements are valid for $F_{22}$ and $X_2$. The compositions $F_{21} \circ F_{12} : X_1 \to X_1$ and $F_{12} \circ F_{21} : X_2 \to X_2$ are sup-preserving maps that leave elements of $D$ fixed. Hence these maps are the identity maps on $X_1$ and $X_2$, respectively. Therefore $F_{12} : X_1 \to X_2$ is an order isomorphism.

4.39. (Optional remarks.) Although the "Dedekind completion" defined in 4.33 is probably the simplest for the purposes of this book, some mathematicians may prefer a different sort of completion. Let $(X, \preceq)$ be a poset. Let $S \subseteq X$ be partially ordered by the restriction of $\preceq$. We shall say that $(X, \preceq)$ is a sup completion of $(S, \preceq)$ if these further properties are satisfied:
(i) $(X, \preceq)$ is Dedekind complete.
(ii) The inclusion map $S \hookrightarrow X$ is sup-preserving, from $(S, \preceq)$ to $(X, \preceq)$.
(iii) If $(Q, \sqsubseteq)$ is any Dedekind complete poset and $f : (S, \preceq) \to (Q, \sqsubseteq)$ is any sup-preserving function, then $f$ extends uniquely to a sup-preserving function $F : (X, \preceq) \to (Q, \sqsubseteq)$.

This definition is slightly more complicated than the one in 4.33. However, it has the following advantages: Every poset $S$ has a sup completion $X$ that is unique, in the sense that any two sup completions $X_1$, $X_2$ are order isomorphic via a map that acts as the identity on members of $S$. $S$ is sup-dense in its sup completion $X$. $X$ is a chain if and only if $S$ is a chain, in which case the sup completion agrees with the Dedekind completion (defined in 4.33). $X$ is bounded if and only if $S$ is bounded, in which case the two posets have the same maximum and same minimum. Moreover, the Dedekind complete posets form a reflective subcategory of the category of posets, if we use sup-preserving maps for morphisms; this notion is developed in books on category theory.

We shall not prove these results, but for the ambitious reader we provide a hint: To prove existence of a sup completion of $S$, let

$X = \{C \subseteq S : C \text{ is nonempty, down-closed, bounded above, and sup-closed}\},$

then partially order $X$ by $\subseteq$.
SUPS AND INFS IN METRIC SPACES

4.40. Let $(X, d)$ be a pseudometric space (defined in 2.11), let $S \subseteq X$ be a nonempty subset, and let $x \in X$. Then the distance from $x$ to $S$ and the diameter of the set $S$ are the numbers

$\operatorname{dist}_d(x, S) = \inf_{s \in S} d(x, s), \qquad \operatorname{diam}_d(S) = \sup_{u, v \in S} d(u, v)$

in $[0, +\infty]$. (The existence of these infs and sups follows from the fact that $[0, +\infty]$ is order complete, a fact that will not be established rigorously until we investigate the real numbers carefully in Chapter 10. We shall "borrow" that result now, to give some important examples of sups and infs; we promise not to engage in any circular reasoning.) We may omit the subscript $d$ if no confusion is likely. By convention, we define $\operatorname{diam}(\varnothing) = 0$.

A set $S$ is bounded (or, more specifically, metrically bounded) if it has finite diameter. A function is sometimes called bounded if its range is a metrically bounded set. Caution: The term "bounded" has several other meanings (see Chapters 3, 23, and 27). Fortunately, most meanings of "bounded" coincide, at least when applied to subsets of $\mathbb{R}$.

4.41. Basic properties and examples. Let $(X, d)$ be a pseudometric space. Then:
a. $d(x, y) = \operatorname{dist}(x, \{y\})$.
b. Show that $|\operatorname{dist}(x, S) - \operatorname{dist}(y, S)| \le d(x, y)$ for any nonempty set $S \subseteq X$.
c. Any subset of a bounded set is bounded, and the union of finitely many bounded sets is bounded. Thus, the bounded subsets of $X$ form an ideal of sets, in the sense of 5.2.
d. In a metric space, a set with diameter 0 contains at most one point.
e. $d(x, y) = |x - y|$ and $e(x, y) = \min\{1, |x - y|\}$ are metrics on $\mathbb{R}$ that yield different collections of bounded sets. In most other respects, however, these metrics are equivalent: they yield the same topological structure and the same uniform structure (see 18.11).
f. Let $\Lambda$ be any set. Let $B(\Lambda) = \{\text{bounded functions from } \Lambda \text{ into } \mathbb{R}\}$. Then $\rho(f, g) = \sup\{|f(\lambda) - g(\lambda)| : \lambda \in \Lambda\}$ is a metric on $B(\Lambda)$.
g. Suppose $d$ is a metric on the given set $\Lambda$. Then we may embed the metric space $(\Lambda, d)$ in the metric space $(B(\Lambda), \rho)$, as follows: Fix any point in $\Lambda$; we shall denote it by "0" (although we do not assume any additive structure here). For each $\mu \in \Lambda$ define a function $f_\mu \in B(\Lambda)$ by

$f_\mu(\lambda) = d(\lambda, \mu) - d(\lambda, 0) \qquad (\lambda \in \Lambda).$

Verify that $\rho(f_\mu, f_\nu) = d(\mu, \nu)$. Thus $\mu \mapsto f_\mu$ is a distance-preserving map from $\Lambda$ into $B(\Lambda)$, and so we may view $\Lambda$ as a subset of $B(\Lambda)$. The space $B(\Lambda)$ has certain special properties that will be of interest later: It is a Banach space (see Chapters 19 and 22). Thus the example above shows that every metric space can be embedded isometrically in a Banach space.
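The definitions in 4.40 and the exercises 4.41.b, 4.41.e, and 4.41.g can be checked by brute force on a small finite sample. The following is only an illustrative sketch: the sample `points`, the subset `S`, and all function names are hypothetical choices made here, and a finite check does not replace the proofs requested in the exercises.

```python
from itertools import product

def dist(d, x, S):
    """Distance from the point x to the nonempty finite set S (4.40)."""
    return min(d(x, s) for s in S)

def diam(d, S):
    """Diameter of the finite set S; the empty set has diameter 0 by convention."""
    return max((d(u, v) for u, v in product(S, repeat=2)), default=0)

def usual(x, y):
    """The metric d(x, y) = |x - y|."""
    return abs(x - y)

def capped(x, y):
    """The metric e(x, y) = min{1, |x - y|} from 4.41.e."""
    return min(1, abs(x - y))

points = [0, 3, 7, 12]   # hypothetical finite sample; points[0] plays the role of "0"
S = [3, 7]

# 4.41.b: |dist(x, S) - dist(y, S)| <= d(x, y)
lipschitz_ok = all(
    abs(dist(usual, x, S) - dist(usual, y, S)) <= usual(x, y)
    for x, y in product(points, repeat=2)
)

# 4.41.e: the same set has different diameters under the two metrics
diam_usual = diam(usual, points)
diam_capped = diam(capped, points)

# 4.41.g: f_mu(lam) = d(lam, mu) - d(lam, 0), stored as a dict indexed by lam
base = points[0]

def f(mu):
    return {lam: usual(lam, mu) - usual(lam, base) for lam in points}

def rho(g, h):
    """Sup metric on functions defined on the finite index set `points`."""
    return max(abs(g[lam] - h[lam]) for lam in points)

isometric = all(
    rho(f(mu), f(nu)) == usual(mu, nu) for mu, nu in product(points, repeat=2)
)
```

On an unbounded sample the contrast in 4.41.e becomes sharper: the diameter under $d$ grows without bound, while the diameter under $e$ never exceeds 1.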
4.42. Let $X$ be a set, and let $f : X \times X \to [0, +\infty)$ be some function satisfying $f(x, y) = f(y, x)$. Then we can define a pseudometric $d$ on $X$ by

$d(x, y) = \inf\left\{ \sum_{j=1}^{m} f(a_{j-1}, a_j) \;:\; m \ge 0,\ a_j \in X,\ a_0 = x,\ a_m = y \right\};$

here the infimum is over all nonnegative integers $m$ and all finite sequences $(a_j)_{j=0}^{m}$ in $X$ that go from $x$ to $y$. We permit $m = 0$ in the case when $x = y$; then the sum is interpreted to be 0. The existence of the infimum follows from the fact that $[0, +\infty]$ is order complete. This construction can be summarized informally as "the distance between two points is the shortest route connecting them." It is not hard to show that $d(x, y) \le f(x, y)$ and that in fact $d$ is the largest pseudometric that is less than or equal to $f$.

More generally, the formula above defines a pseudometric $d$ if the function $f$ is merely defined on a subset $D \subseteq X \times X$, provided that subset is large enough that for each pair $(x, y) \in X \times X$ there exists at least one finite sequence $(a_j)_{j=0}^{m}$ from $x$ to $y$ satisfying $(a_0, a_1), (a_1, a_2), \ldots, (a_{m-1}, a_m) \in D$. We then define $d$ to be the infimum of the sums over all such sequences. Again, we permit $m = 0$ when $x = y$. This more general construction will be used in Chapter 19.

4.43. Weil's Pseudometrization Lemma. Let $V_0, V_1, V_2, V_3, \ldots$ be a sequence of reflexive symmetric relations on a set $X$, with $V_0 = X \times X$, satisfying $V_{n-1} \supseteq V_n \circ V_n \circ V_n$ for all $n \in \mathbb{N}$. Then there exists a pseudometric $d$ on $X$ that satisfies

$\{(x, y) \in X^2 : d(x, y) < 2^{-n}\} \subseteq V_n \subseteq \{(x, y) \in X^2 : d(x, y) \le 2^{-n}\}$ (***)

for $n = 0, 1, 2, \ldots$. In fact, $d$ may be selected as follows: Define

$f(x, y) = \inf\{2^{-n} : (x, y) \in V_n\} = \begin{cases} 2^{-n} & \text{if } (x, y) \in V_n \setminus V_{n+1}, \\ 0 & \text{if } (x, y) \in \bigcap_{n} V_n, \end{cases}$

and then define $d$ as in 4.42.

Remark. The literature contains several variants of this lemma. Some other formulations may be simpler, but the present formulation, which follows Murdeshwar [1983], has the advantage that it can be applied directly in the settings of uniform spaces, topological groups, topological vector spaces, and locally solid vector lattices; see Chapters 16 and 26.

4.44. Outline of proof. We begin by observing that

$f(u_1, u_4) \le 2 \max\{f(u_1, u_2), f(u_2, u_3), f(u_3, u_4)\}$ for all $u_1, u_2, u_3, u_4 \in X$. (1)

Indeed, if $2^{-n} = \max\{f(u_1, u_2), f(u_2, u_3), f(u_3, u_4)\}$, then $(u_1, u_2), (u_2, u_3), (u_3, u_4)$ are all in $V_n$, so $(u_1, u_4) \in V_n \circ V_n \circ V_n \subseteq V_{n-1}$.
Next, we shall show, by induction on $m$, that

$f(x_0, x_m) \le 2 \sum_{i=1}^{m} f(x_{i-1}, x_i)$ (2)

for any $m \in \mathbb{N}$ and $x_0, x_1, \ldots, x_m \in X$. To see this, let $b = \sum_{i=1}^{m} f(x_{i-1}, x_i)$; we are to show $f(x_0, x_m) \le 2b$. If $b = 0$, then $(x_{i-1}, x_i) \in V_n$ for all $i$ and $n$, hence $(x_0, x_m) \in V_n$ for all $n$, and so $f(x_0, x_m) = 0$. Thus we may assume $b > 0$. Choose $j$ as large as possible satisfying $\sum_{i=1}^{j} f(x_{i-1}, x_i) \le b/2$. Then $j < m$ and $\sum_{i=1}^{j+1} f(x_{i-1}, x_i) > b/2$, hence $\sum_{i=j+2}^{m} f(x_{i-1}, x_i) < b/2$. By two uses of the induction hypothesis we have $f(x_0, x_j) \le b$ and $f(x_{j+1}, x_m) \le b$. Also $f(x_j, x_{j+1}) \le b$ by our definition of $b$. By (1) we have $f(x_0, x_m) \le 2b$, completing the induction proof of (2).

Now define $d$ as in 4.42. Then $d$ is a pseudometric and $d \le f$. From (2) we have $f \le 2d$. The second inclusion in (***) is obvious, since $d \le f$. For the first inclusion in (***), suppose $d(x, y) < 2^{-n}$. By the definition of $d$, there exists a finite sequence $a_0, a_1, \ldots, a_m \in X$ with $a_0 = x$ and $a_m = y$ and $\sum_{j=1}^{m} f(a_{j-1}, a_j) < 2^{-n}$. By (2), $f(x, y) < 2^{-n+1}$. Since $f$ takes on only the values $2^0, 2^{-1}, 2^{-2}, \ldots$ and 0, we must have $f(x, y) \le 2^{-n}$, and hence $(x, y) \in V_n$, and we are done.
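On a finite set, the "shortest route" infimum of 4.42 can be computed with the Floyd-Warshall all-pairs shortest-path algorithm, which makes the conclusions of 4.43 and 4.44 easy to test numerically. The sketch below is an illustration under assumptions chosen here, not part of the lemma: the set $X$ is the grid $\{0, 1, \ldots, 27\}$ and $V_n = \{(x, y) : |x - y| \le 27/3^n\}$, which is reflexive and symmetric with $V_0 = X \times X$, and which satisfies $V_{n-1} \supseteq V_n \circ V_n \circ V_n$ because three steps of length at most $27/3^n$ have total length at most $27/3^{n-1}$.

```python
from fractions import Fraction
from itertools import product

N = 27
X = range(N + 1)          # hypothetical finite example: the grid {0, ..., 27}

def in_V(n, x, y):
    """Membership of (x, y) in V_n = {(x, y) : |x - y| <= 27 / 3**n}."""
    return abs(x - y) * 3**n <= N

def f(x, y):
    """f(x, y) = inf{2**-n : (x, y) in V_n}; on this grid it is 0 only when x == y."""
    if x == y:
        return Fraction(0)
    n = 0
    while in_V(n + 1, x, y):   # the V_n decrease, so stop at the last n that works
        n += 1
    return Fraction(1, 2**n)

# d(x, y) = infimum of sums of f along finite routes from x to y (Floyd-Warshall).
d = {(x, y): f(x, y) for x, y in product(X, repeat=2)}
for k, i, j in product(X, repeat=3):        # k is the outermost loop variable
    d[i, j] = min(d[i, j], d[i, k] + d[k, j])

pairs = list(product(X, repeat=2))

# d is a pseudometric: zero on the diagonal, symmetric, triangle inequality.
is_pseudometric = (
    all(d[x, x] == 0 and d[x, y] == d[y, x] for x, y in pairs)
    and all(d[x, z] <= d[x, y] + d[y, z] for x, y, z in product(X, repeat=3))
)

# d <= f, and f <= 2d as a consequence of inequality (2).
sandwiched = all(d[p] <= f(*p) <= 2 * d[p] for p in pairs)

# The two inclusions in (***), checked for n = 0, ..., 5.
inclusions = all(
    (not d[x, y] < Fraction(1, 2**n) or in_V(n, x, y))
    and (not in_V(n, x, y) or d[x, y] <= Fraction(1, 2**n))
    for x, y in pairs
    for n in range(6)
)
```

Exact rational arithmetic (`Fraction`) is used so that the comparisons with $2^{-n}$ are not affected by rounding. In this example $d$ is strictly smaller than $f$ for some pairs: the route $0 \to 9 \to 10$ costs $1/2 + 1/8$, whereas $f(0, 10) = 1$.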
Chapter 5
Filters, Topologies, and Other Sets of Sets

FILTERS AND IDEALS

5.1. Let $\mathcal{F}$ be a nonempty collection of subsets of a set $X$. We say $\mathcal{F}$ is a filter on $X$ if
(i) $S \in \mathcal{F}$ and $S \subseteq T \subseteq X$ imply $T \in \mathcal{F}$, and
(ii) $S, T \in \mathcal{F} \Rightarrow S \cap T \in \mathcal{F}$.
(For clarification or emphasis we may sometimes call such $\mathcal{F}$ a filter of sets.) Note that any such collection $\mathcal{F}$ necessarily satisfies $X \in \mathcal{F}$. Clearly, $\mathcal{P}(X) = \{\text{subsets of } X\}$ is a filter on $X$; we shall refer to it as the improper filter. Any other filter on $X$ will be called a proper filter. It is easy to see that a filter $\mathcal{F}$ is proper if and only if
(iii) $\varnothing \notin \mathcal{F}$.
However, we remark that many mathematicians, particularly topologists, use the term "filter" to refer only to collections satisfying all of (i), (ii), and (iii). Our terminology here follows that of algebraists. We prefer the algebraists' terminology because in later chapters we shall use the duality between filters and ideals.

An intuitive discussion is given in 5.3. Elementary examples of filters and ideals are given in 5.5, and further examples (particularly of interest to analysts) are previewed in 5.6.

5.2. A nonempty collection $\mathcal{I}$ of subsets of a set $X$ is an ideal on $X$ if
(i) $S \in \mathcal{I}$ and $S \supseteq T$ imply $T \in \mathcal{I}$, and
(ii) $S, T \in \mathcal{I} \Rightarrow S \cup T \in \mathcal{I}$.
If the context is not clear, we might say that $\mathcal{I}$ is an ideal of sets, to distinguish it from the "ideal in an algebra" introduced in 9.25. We can also avoid ambiguity by referring to $\mathcal{I}$ as an ideal in the Boolean algebra $\mathcal{P}(X)$, because in that setting the two notions of "ideal" coincide (see 13.17.a).

For any ideal $\mathcal{I}$, we have $\varnothing \in \mathcal{I}$. Clearly, $\mathcal{P}(X)$ is an ideal on $X$; we shall call it the improper ideal. Any other ideal on $X$ is called a proper ideal. It
is easy to see that an ideal $\mathcal{I}$ is proper if and only if
(iii) $X \notin \mathcal{I}$.

A $\sigma$-ideal is an ideal $\mathcal{I}$ that is closed under countable unions, i.e., such that

$S_1, S_2, S_3, \ldots \in \mathcal{I} \;\Rightarrow\; S_1 \cup S_2 \cup S_3 \cup \cdots \in \mathcal{I}.$

There is a simple correspondence between filters and ideals. Let $\mathcal{F}$ be a collection of subsets of $X$, and let $\mathcal{I} = \{\complement S : S \in \mathcal{F}\}$, where $\complement$ denotes complementation in $X$. Then $\mathcal{F}$ is a filter (proper filter, improper filter) if and only if $\mathcal{I}$ is an ideal (proper ideal, improper ideal, respectively). We say that $\mathcal{F}$ and $\mathcal{I}$ are dual to each other. Any statement about $\mathcal{F}$ can be translated into a statement about $\mathcal{I}$, and vice versa, but some concepts can be expressed more simply in terms of filters or in terms of ideals.

Caution: The dual ideal $\{\complement S : S \in \mathcal{F}\}$ should not be confused with the other complementary set, $\mathcal{P}(X) \setminus \mathcal{F} = \{T \subseteq X : T \notin \mathcal{F}\}$. In general, $\mathcal{P}(X) \setminus \mathcal{F}$ is neither a filter nor an ideal. However, under special circumstances $\{\complement S : S \in \mathcal{F}\}$ is equal to the ideal $\mathcal{P}(X) \setminus \mathcal{F}$; see 5.8.

5.3. To better understand the definitions of filter and ideal, suppose $\mathcal{I}$ is a nonempty collection of subsets of a set $X$, and let $\mathcal{F}$ be the dual collection $\{X \setminus S : S \in \mathcal{I}\}$. Say that a set $S \subseteq X$ is "small" if $S \in \mathcal{I}$, or "large" if $S \in \mathcal{F}$ (i.e., if $X \setminus S$ is small). Then $\mathcal{I}$ is an ideal and $\mathcal{F}$ is a filter if and only if
(i) any subset of a small set is small, and
(ii) the union of two small sets (or finitely many small sets) is small.
The ideal and filter are proper if and only if also
(iii) not every set is small.
If this third condition is satisfied, then a set $S \subseteq X$ cannot be both small and large. Can a set be neither small nor large? That depends on what $\mathcal{I}$ and $\mathcal{F}$ are; see 5.5.d and 5.8(B).

Our three rules (i), (ii), and (iii) are compatible with common nonmathematical usage of the words "small" and "large." However, different rules would also be compatible with the nonmathematical usage of those words, since nonmathematical usage deals only with finite sets; the mathematical usage also covers infinite sets. We have drawn this connection, not so much to explain small and large, but rather to explain ideal and filter.

Different ideals give us different collections of small sets. A set may be small with respect to one ideal while large with respect to another ideal; see the example in 24.39. Other words sometimes used in place of small are negligible and null (although the latter term also sometimes refers to the empty set). Other words sometimes used in place of large are residual or generic, especially in the context of directed sets or in the context of Baire category theory. Also, a large subset of $X$ is almost all of $X$. We might also say that a condition $K$ on points $x \in X$ is satisfied almost everywhere or almost always, or is $\mathcal{F}$-true or almost true, if the set $\{x \in X : K \text{ is satisfied at } x\}$ is a member of $\mathcal{F}$. This interpretation of "true" preserves some, but not all, of the usual features of that word; for instance, the conjunction of finitely many $\mathcal{F}$-true statements is $\mathcal{F}$-true (as with ordinary truth), but the conjunction of infinitely many $\mathcal{F}$-true statements is not necessarily
$\mathcal{F}$-true (unlike ordinary truth). This slightly unusual interpretation of "truth" is occasionally useful to logicians.

5.4. Let $X$ be a set. We have noted that $\mathcal{P}(X)$ is a filter on $X$, and we can easily verify that the intersection of any collection of filters on $X$ is another filter on $X$. Thus, the collection of all filters on $X$ is a Moore collection of subsets of $\mathcal{P}(X)$. Similarly, the collection of all ideals on $X$ is a Moore collection. Hence, given any $\mathcal{G} \subseteq \mathcal{P}(X)$, there exists a smallest filter (or ideal) that contains $\mathcal{G}$, namely, the intersection of all the filters (or ideals) that contain $\mathcal{G}$. We call it the filter (respectively, the ideal) generated by $\mathcal{G}$; we say that $\mathcal{G}$ is a generating set for it. (This is a special case of the Moore closure, introduced in 4.3, but the terms "closed" and "closure" are generally not used for filters and ideals.)

We shall say that $\mathcal{F}$ is a superfilter of $\mathcal{G}$ whenever $\mathcal{F}$ is a filter and $\mathcal{F} \supseteq \mathcal{G}$. With this terminology, the filter generated by $\mathcal{G}$ is simply the smallest superfilter of $\mathcal{G}$.

Show that the filter generated by a collection $\mathcal{G} \subseteq \mathcal{P}(X)$ is

$\mathcal{F} = \{F \subseteq X : F \supseteq G_1 \cap G_2 \cap \cdots \cap G_n \text{ for some finite set } \{G_1, G_2, \ldots, G_n\} \subseteq \mathcal{G}\}.$

Dually, the ideal generated by a collection $\mathcal{G} \subseteq \mathcal{P}(X)$ is

$\mathcal{I} = \{I \subseteq X : I \subseteq G_1 \cup G_2 \cup \cdots \cup G_n \text{ for some finite set } \{G_1, G_2, \ldots, G_n\} \subseteq \mathcal{G}\}.$

We permit $n = 0$ in both formulas, with the conventions that the intersection of no subsets of $X$ is just $X$ and the union of no subsets of $X$ is just $\varnothing$.

5.5. Examples and elementary properties.
a. Degenerate examples: The singleton $\{X\}$ is the smallest filter; the singleton $\{\varnothing\}$ is the smallest ideal. These are dual to each other; they both are generated by the empty set.
b. Let $A$ be a nonempty subset of $X$, and let $\mathcal{F}$ be a filter on $X$. Then the following conditions are equivalent:
(A) $\mathcal{F}$ is fixed (i.e., has nonempty intersection; see 1.26) and the intersection of its members is $A$.
(B) $\mathcal{F}$ is the filter generated by the singleton $\{A\}$.
(C) $\mathcal{F} = \{S \subseteq X : S \supseteq A\}$.
Assume that those conditions are satisfied. Then the filter $\mathcal{F}$ is dual to the ideal $\mathcal{P}(X \setminus A)$.
c. We note an important special case of the preceding example: Let $p \in X$, and take $A$ to be the singleton $\{p\}$. Thus we obtain the filter

$\mathcal{U}_p = \{S \subseteq X : S \supseteq \{p\}\} = \{S \subseteq X : p \in S\}.$

It is the fixed filter generated by the singleton $\{\{p\}\}$. It is actually an ultrafilter (defined in 5.8), hence it is called the ultrafilter fixed at $p$.
d. The ideal of finite sets is $\{S \subseteq X : S \text{ is finite}\}$. It is generated by the collection of all the singletons in $X$. The filter that is dual to this is the cofinite filter, $\{S \subseteq X : \complement S \text{ is finite}\}$, also known as the Fréchet filter. This ideal and filter are proper if and only if the set $X$ is infinite.
Example. If subsets of $\mathbb{N}$ are classified as small or large as in 5.3, using the ideal of finite sets and the cofinite filter, then the set $\{1, 3, 5, 7, \ldots\}$ is neither small nor large.
e. Let $\mathcal{F}$ and $\mathcal{I}$ be a filter and ideal on $X$, dual to each other. Show that the following are equivalent:
(A) $\mathcal{F}$ is free (i.e., has empty intersection).
(B) $\mathcal{F}$ contains the cofinite filter.
(C) $\mathcal{I}$ is a cover of $X$.
(D) $\mathcal{I}$ contains the ideal of finite sets.
f. Let $\mathcal{F}$ be a filter on $X$, and suppose $Y \subseteq X$. Then $\mathcal{F}_Y = \{Y \cap F : F \in \mathcal{F}\}$ is a filter on $Y$, sometimes called the trace of $\mathcal{F}$ on $Y$. It is a proper filter if and only if $X \setminus Y \notin \mathcal{F}$.
g. Let $\mathcal{G}_1$ and $\mathcal{G}_2$ be filters on $X$. We shall say that $\mathcal{G}_1$ meets $\mathcal{G}_2$ if every member of $\mathcal{G}_1$ meets (i.e., has nonempty intersection with) every member of $\mathcal{G}_2$. Show that there exists a proper filter $\mathcal{H} \supseteq \mathcal{G}_1 \cup \mathcal{G}_2$ if and only if $\mathcal{G}_1$ meets $\mathcal{G}_2$.
h. How to enlarge a filter. Let $\mathcal{F}$ be a proper filter on $X$, and let $K \subseteq X$; suppose that neither $K$ nor $\complement K$ is an element of $\mathcal{F}$. Then $\{F \cap L : F \in \mathcal{F} \text{ and } K \subseteq L \subseteq X\}$ is a proper filter that contains $\{K\} \cup \mathcal{F}$.
Kowalsky's filter. (Optional.) Let $I$ and $X$ be sets, let $\mathcal{G}$ be a filter on $I$, and for each $i \in I$ let $\mathcal{F}_i$ be a filter on $X$. Then $\bigcup_{G \in \mathcal{G}} \bigcap_{i \in G} \mathcal{F}_i$ is a filter on $X$; this will be used in Chapter 15.

5.6. Preview of more examples. Other important filters studied later are the collection of
• neighborhoods of a point in a topological space (see later in this chapter),
• eventual sets of a net (see Chapter 7),
• absorbing subsets of a vector space (see Chapter 12).
Other important ideals studied later are the collection of
• relatively compact subsets of a Hausdorff topological space (see Chapter 17),
• equicontinuous sets of maps between two uniform spaces (see Chapter 18),
• bounded subsets of a lattice (Chapter 4), a metric space (4.40), or a topological vector space (Chapter 27),
• precompact or totally bounded subsets of a uniform space (see Chapter 19),
• nowhere-dense subsets of a topological space, and meager subsets of a topological space; the latter is in fact a $\sigma$-ideal (see Chapter 20),
• null sets with respect to a positive charge, i.e., a finitely additive, positive set function; this is in fact a $\sigma$-ideal if the measure is countably additive (see Chapter 21),
• shy sets of a Banach space; this is in fact a $\sigma$-ideal (see Chapter 21).

5.7. We consider several useful generalizations of the notion of a proper filter. Let $\mathcal{G}$ be a nonempty collection of subsets of $X$; then $\mathcal{G}$ may or may not satisfy the following conditions:
(i) $\mathcal{G}$ is a proper filter on $X$.
(ii) Every member of $\mathcal{G}$ is nonempty, and $\mathcal{G}$ is closed under finite intersection.
(iii) $\mathcal{G}$ is a filterbase on $X$; that is, each member of $\mathcal{G}$ is nonempty, and for each pair of sets $A, B \in \mathcal{G}$ there exists some $C \in \mathcal{G}$ with $C \subseteq A \cap B$.
(iv) $\mathcal{G}$ is a filter subbase on $X$, or $\mathcal{G}$ has the finite intersection property; that is, the intersection of finitely many elements of $\mathcal{G}$ is always nonempty.
Clearly, (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv). Show the following further results:
a. Let $\mathcal{G}$ be a collection of subsets of $X$. Show that the filter generated by $\mathcal{G}$ is a proper filter if and only if $\mathcal{G}$ is a filter subbase.
b. If $\mathcal{G}$ is a filterbase, then the filter generated by $\mathcal{G}$ is the proper filter $\mathcal{F} = \{S \subseteq X : S \supseteq G \text{ for some } G \in \mathcal{G}\}$. We then say that $\mathcal{G}$ is a base for the filter $\mathcal{F}$.
c. If $\mathcal{S}$ is a filter subbase, then $\mathcal{B} = \{\text{intersections of finitely many members of } \mathcal{S}\}$ satisfies condition (ii) above, and it generates the same filter as $\mathcal{S}$ does.
d. If $\mathcal{F}$ is a filter subbase on $X$ and $S_1, S_2, \ldots, S_n$ are disjoint subsets of $X$, then at most one of the $S_i$'s is an element of $\mathcal{F}$.
e. Let $X = \mathcal{P}(\Omega)$ for some set $\Omega$. Then the set of all filter subbases on $\Omega$ is a collection of subsets of $X$ that has finite character (defined in 3.46).

5.8. Definition and exercise. Let $\mathcal{F}$ be a nonempty collection of subsets of $X$. Show that the following conditions on $\mathcal{F}$ are equivalent. If any (hence all) of them are satisfied, we say $\mathcal{F}$ is an ultrafilter. Hint: For 5.8(C) $\Rightarrow$ 5.8(B), use 5.5.h.
(A) $\mathcal{F}$ is a proper filter, and the complementary set $\mathcal{P}(X) \setminus \mathcal{F} = \{S \subseteq X : S \notin \mathcal{F}\}$ is an ideal. (This is the special circumstance mentioned in the cautionary remark at the end of 5.2.)
Thus, the dual ideal $\{\complement S : S \in \mathcal{F}\}$ is equal to $\mathcal{P}(X) \setminus \mathcal{F}$.
(B) $\mathcal{F}$ is a proper filter on $X$ that also satisfies: for each set $K \subseteq X$, either $K \in \mathcal{F}$ or $\complement K \in \mathcal{F}$. (In the terminology of 5.3, every subset of $X$ is either large or small.)
(C) $\mathcal{F}$ is a maximal filter on $X$ (or more precisely, a maximal proper filter). That is, $\mathcal{F}$ is a proper filter on $X$, and no other proper filter on $X$ contains $\mathcal{F}$.
(D) $\mathcal{F}$ is a maximal filter subbase on $X$; i.e., $\mathcal{F}$ is a filter subbase on $X$, and no other filter subbase contains $\mathcal{F}$.
(E) $\mathcal{F}$ is a proper filter on $X$, and whenever $S_1 \cup S_2 \in \mathcal{F}$, then at least one of $S_1$, $S_2$ is an element of $\mathcal{F}$.
(F) $\mathcal{F}$ is a proper filter on $X$, and whenever $S_1 \cup S_2 \cup \cdots \cup S_n \in \mathcal{F}$, then at least one of the $S_i$'s is an element of $\mathcal{F}$.

5.9. Remarks. Ultrafilters will be studied further in the last part of Chapter 6 and thereafter. For purposes of convergences, filters can be used interchangeably with nets; this concept is developed in Chapter 7 and used extensively thereafter.

5.10. Let $\mathcal{U}$ be an ultrafilter on $X$. Then one of the following two cases must hold:
(1) $\mathcal{U}$ is a fixed ultrafilter (i.e., having nonempty intersection; see 1.26). Show that in this case $\mathcal{U}$ is the ultrafilter $\mathcal{U}_p$ fixed at some point $p \in X$ (defined in 5.5.c). In this case $\mathcal{U}$ is also known as a principal ultrafilter.
(2) $\mathcal{U}$ is a free ultrafilter (i.e., having empty intersection; see 1.26). Show that in this case $\mathcal{U}$ is a superset of the cofinite filter and no element of $\mathcal{U}$ is a finite set. In particular, this case cannot occur if $X$ itself is a finite set. In this case $\mathcal{U}$ is also called a nonprincipal ultrafilter.

5.11. Remarks. Free ultrafilters will play an important role in some later parts of this book. A free ultrafilter on a set $X$ can be described as a classification of subsets of $X$ into small sets and large sets, satisfying conditions 5.3 (i), (ii), and (iii) and also satisfying
(iv) every set is either small or the complement of a small set, and
(v) every finite set is small.
Our description of fixed ultrafilters in 5.5.c is quite constructive: It tells us explicitly how to form such objects. In contrast, our description of a free ultrafilter is indirect, and we find it difficult to visualize such an object. Before continuing to the next sentence, the reader is urged to try to give a completely explicit example of a free ultrafilter.

Surprisingly, free ultrafilters do exist, but explicit examples of free ultrafilters do not exist! Thus, free ultrafilters are our first intangibles. This bizarre situation will be explained in 14.76 and 14.77. Basically, it arises because the customary criteria for "explicit examples" are somewhat stricter than the customary criteria for existence proofs.
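On a finite set, the notions of 5.1 through 5.10 can all be explored by exhaustive search. The sketch below is a hedged illustration: every function name is a choice made here, and the ground set $\{0, 1, 2\}$ is a hypothetical example. It builds the filter generated by a collection using the formula of 5.4, forms the dual ideal of 5.2, and enumerates all ultrafilters in the sense of 5.8(B), confirming that on a finite set each one is fixed at a point, as 5.10 asserts.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    xs = list(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def is_filter(F, X):
    """Axioms of 5.1: nonempty, upward closed, closed under pairwise intersection."""
    F = set(F)
    return (
        bool(F)
        and all(T in F for S in F for T in powerset(X) if S <= T)
        and all(S & T in F for S in F for T in F)
    )

def generated_filter(G, X):
    """Filter generated by G (5.4): supersets of finite intersections of members of G."""
    X = frozenset(X)
    meets = {X}                         # the intersection of no sets is X
    for g in G:
        meets |= {m & frozenset(g) for m in meets}
    return {S for S in powerset(X) if any(m <= S for m in meets)}

def dual_ideal(F, X):
    """The collection of complements of members of F (5.2)."""
    X = frozenset(X)
    return {X - S for S in F}

def is_ultrafilter(F, X):
    """Condition 5.8(B): a proper filter holding exactly one of K, X - K for each K."""
    X = frozenset(X)
    F = set(F)
    return (
        is_filter(F, X)
        and frozenset() not in F
        and all((K in F) != ((X - K) in F) for K in powerset(X))
    )

X = frozenset({0, 1, 2})                # hypothetical three-point example
subsets = powerset(X)

# {0,1} and {1,2} form a filter subbase; {0} and {1} do not (5.7.a).
proper_example = generated_filter([{0, 1}, {1, 2}], X)
improper_example = generated_filter([{0}, {1}], X)

# Brute-force search over all nonempty families of subsets of X.
families = [set(c) for r in range(1, len(subsets) + 1) for c in combinations(subsets, r)]
ultrafilters = [F for F in families if is_ultrafilter(F, X)]
principal = [{S for S in subsets if p in S} for p in X]
```

The search space has only $2^8 - 1$ families here, so the enumeration is instant; it grows doubly exponentially with the size of $X$, and of course no finite computation can exhibit a free ultrafilter.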
TOPOLOGIES

5.12. Definition. A topology on a set $X$ is a collection $\mathcal{T}$ of subsets of $X$ satisfying these three axioms:
(i) $\varnothing, X \in \mathcal{T}$,
(ii) $S_1, S_2 \in \mathcal{T} \Rightarrow S_1 \cap S_2 \in \mathcal{T}$, and
(iii) $\{S_\lambda : \lambda \in \Lambda\} \subseteq \mathcal{T} \Rightarrow \bigcup_{\lambda \in \Lambda} S_\lambda \in \mathcal{T}$.
That is, $\mathcal{T}$ contains $\varnothing$ and $X$, and $\mathcal{T}$ is closed under finite intersections and arbitrary unions. A topological space is a pair $(X, \mathcal{T})$ consisting of a set $X$ and a topology $\mathcal{T}$ on $X$; we may refer to $X$ itself as the topological space if $\mathcal{T}$ does not need to be mentioned explicitly. The members of $\mathcal{T}$ are called the open subsets of $X$. The complements of the open sets are the closed subsets of $X$.

More definitions. Let $(X, \mathcal{T})$ be a topological space. A point $x$ is called isolated if it is the only member of some open set. The topological space $X$ is disconnected if it can be partitioned into two disjoint nonempty open sets. If no such partition exists, the space $X$ is connected.

Topological spaces will be studied briefly in the next few sections and in Chapter 9, and then in much greater detail in Chapter 15 and thereafter. Topological spaces can be described in other ways: in terms of closed sets (5.13), closure or interior operators (Chapter 5), neighborhood systems (Chapters 5 and 15), bases (Chapter 15), convergences (Chapter 15), or distances (Chapter 5).

5.13. Closed sets are dual to open sets (in the sense of Chapter 1), as follows. Let $X$ be a set, and let $\mathcal{K}$ be a collection of subsets of $X$; then $\mathcal{K}$ is the collection of closed sets for a topology on $X$ if and only if $\mathcal{K}$ satisfies these conditions:
(i) $\varnothing, X \in \mathcal{K}$,
(ii) $S_1, S_2 \in \mathcal{K} \Rightarrow S_1 \cup S_2 \in \mathcal{K}$, and
(iii) $\{S_\lambda : \lambda \in \Lambda\} \subseteq \mathcal{K} \Rightarrow \bigcap_{\lambda \in \Lambda} S_\lambda \in \mathcal{K}$.
That is, $\mathcal{K}$ contains $\varnothing$ and $X$, and $\mathcal{K}$ is closed under finite unions and arbitrary intersections. Although it is customary to define a topological space in terms of its open sets, we could as easily define it in terms of the closed sets.

5.14. Remarks and more definitions. In common nonmathematical English, "open" and "closed" are opposites. This could lead beginners to expect that every subset of a topological space $X$ must be either open or closed, but that expectation is incorrect. Some sets may be neither open nor closed. In fact, in topological spaces commonly used, most subsets are neither open nor closed. We shall demonstrate this in Chapter 15 in the case where $X$ is the real line, a space typical of the topological spaces used by analysts. Also, some sets may be both open and closed. Such sets are called clopen. Indeed, in any topological space $X$, both $\varnothing$ and $X$ are clopen.

Exercise. The space $X$ is connected (as defined in 5.12) if and only if it has no other clopen subsets besides $\varnothing$ and $X$.
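For a finite collection of subsets, the axioms of 5.12 reduce to pairwise checks, so the definitions above (open, closed, clopen, isolated, connected) are easy to experiment with. The following sketch uses function names and two three-point examples that are assumptions made here for illustration. It relies on the exercise in 5.14 to test connectedness through clopen sets.

```python
from itertools import combinations

def is_topology(T, X):
    """Axioms 5.12 for a finite collection T: pairwise checks suffice when T is finite."""
    T = {frozenset(S) for S in T}
    X = frozenset(X)
    return (
        frozenset() in T and X in T
        and all(A & B in T for A in T for B in T)
        and all(A | B in T for A in T for B in T)
    )

def closed_sets(T, X):
    """Complements of the open sets."""
    X = frozenset(X)
    return {X - frozenset(S) for S in T}

def clopen_sets(T, X):
    """Sets that are both open and closed (5.14)."""
    return {frozenset(S) for S in T} & closed_sets(T, X)

def is_connected(T, X):
    """Connected iff the only clopen sets are the empty set and X (exercise in 5.14)."""
    return clopen_sets(T, X) == {frozenset(), frozenset(X)}

def isolated_points(T, X):
    """Points x for which {x} is open."""
    opens = {frozenset(S) for S in T}
    return {x for x in X if frozenset({x}) in opens}

def neither_open_nor_closed(T, X):
    """Subsets of X that are neither open nor closed."""
    xs = list(X)
    opens = {frozenset(S) for S in T}
    closed = closed_sets(T, X)
    subsets = [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]
    return {S for S in subsets if S not in opens and S not in closed}

X = {1, 2, 3}                                   # hypothetical examples
nested = [set(), {1}, {1, 2}, {1, 2, 3}]        # a chain of open sets
split = [set(), {1}, {2, 3}, {1, 2, 3}]         # X partitioned into two open sets
not_a_topology = [set(), {1}, {2}, {1, 2, 3}]   # the union {1, 2} is missing
```

In the `nested` example the point 1 is isolated, the space is connected, and the sets $\{2\}$ and $\{1, 3\}$ are neither open nor closed, which illustrates the warning in 5.14 on the smallest possible scale.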
5.15. Elementary examples of topologies. Let $X$ be any set. Then:
a. The discrete topology is $\mathcal{P}(X) = \{\text{subsets of } X\}$. Thus, it is the largest topology on $X$. It is the only topology that makes every subset of $X$ clopen. It is also the only topology that makes every point of $X$ isolated.
Finite sets are usually equipped with the discrete topology. The set $2 = \{0, 1\}$ will be used in many different contexts; we shall understand it to be equipped with the discrete topology unless some other arrangement is specified. The set $\mathbb{Z} = \{\text{integers}\}$ is also usually equipped with the discrete topology. The set $\mathbb{N}$ is most often equipped with its discrete topology. However, another interesting topology on $\mathbb{N}$ is given by $\{\varnothing, \{1\}, \{1, 2\}, \{1, 2, 3\}, \{1, 2, 3, 4\}, \ldots, \mathbb{N}\}$. (That topology can be described another way; see 5.15.d.)
b. The indiscrete topology (also called chaotic topology) is $\{\varnothing, X\}$. It is the smallest topology on $X$.
c. The cofinite topology is $\{S \subseteq X : \text{either } S \text{ is empty or } \complement S \text{ is finite}\}$. The cofinite topology is the smallest topology on $X$ that makes every singleton $\{x\}$ in $X$ a closed set. That is, a topology on $X$ makes every singleton a closed set if and only if that topology contains the cofinite topology. The cofinite topology coincides with the discrete topology when $X$ is a finite set, but the two topologies are different when $X$ is an infinite set.
d. Lower set topology (optional). Let $(X, \preceq)$ be a preordered set, and let $\mathcal{T} = \{\text{lower sets of } X\}$. Show that
(i) $\mathcal{T}$ is a topology on $X$. We shall call it the lower set topology on $X$.
(ii) The preorder $\preceq$ can be recovered from the topology $\mathcal{T}$. More generally, the mapping $\preceq \;\mapsto\; \mathcal{T}$ is injective (i.e., different preorders determine different lower set topologies), so we may view {preordered spaces} as a subclass of {topological spaces}.
(iii) Analogous properties are easily verified for the upper set topology, which is defined to be $\{\text{upper sets of } X\}$. (Indeed, if $\preceq$ is a preorder on $X$, then $\succeq$ is another preorder.)
Example. The upper set topology on $\mathbb{N}$ is $\{\varnothing, \{1, 2, 3, \ldots\}, \{2, 3, 4, \ldots\}, \{3, 4, 5, \ldots\}, \ldots\}$.
e. Let $X$ be a subset of a topological space $(Y, \mathcal{T})$. Verify that $\{X \cap T : T \in \mathcal{T}\}$ is a topology on $X$. It is called the relative topology, or subspace topology, induced on $X$ by $Y$. Any subset of a topological space will be understood to be equipped with its relative topology, unless some other arrangement is specified. Show that
(i) A subset of $X$ is open in the relative topology if and only if it is of the form $X \cap G$ for some set $G \subseteq Y$ that is $\mathcal{T}$-open.
(ii) A subset of $X$ is closed in the relative topology if and only if it is of the form $X \cap F$ for some set $F \subseteq Y$ that is $\mathcal{T}$-closed.
(iii) Suppose $X$ itself is $\mathcal{T}$-open. Then a subset of $X$ is open in the relative topology if and only if it is $\mathcal{T}$-open.
(iv) Suppose $X$ itself is $\mathcal{T}$-closed. Then a subset of $X$ is closed in the relative topology if and only if it is $\mathcal{T}$-closed.
(v) Suppose $W \subseteq X \subseteq Y$. Then the relative topology induced on $W$ by $Y$ is the same as the relative topology induced on $W$ by the relative topology induced on $X$ by $Y$.
f. Let $(X, \preceq)$ be a chain. Let $\mathcal{T}$ be the collection of all sets $T \subseteq X$ that satisfy the following condition:
For each $p \in T$, there exists some set $J$ of the form $\{x \in X : a \prec x\}$ or $\{x \in X : a \prec x \prec b\}$ or $\{x \in X : x \prec b\}$ such that $p \in J \subseteq T$.
Then $\mathcal{T}$ is a topology on $X$, called the order interval topology.
The usual topologies on $\mathbb{R}$ and $[-\infty, +\infty]$ are their order interval topologies. These sets will always be understood to be equipped with these topologies, unless some other arrangement is specified. The topology of $\mathbb{R}$ is in many ways typical of topologies used in analysis. In fact, most topological spaces used in analysis are built from copies of $\mathbb{R}$, in one way or another.
Any subset of $\mathbb{R}$ is a chain, but such sets are not always equipped with their order interval topologies. Rather, they are usually equipped with the relative topology induced by $\mathbb{R}$ (as defined in 5.15.e). That topology does not always agree with the order interval topology; we shall compare the two topologies in 15.46.
g. Definitions. Let $(X, d)$ be a pseudometric space (defined as in 2.11). For any $z \in X$ and $r > 0$, we define the open ball centered at $z$ with radius $r$ to be the set

$B_d(z, r) = \{x \in X : d(x, z) < r\}.$

(We may omit the subscript $d$ when no confusion will result.) A set $T \subseteq X$ is said to be open if for each $z \in T$, there exists some $r > 0$ such that $B_d(z, r) \subseteq T$. The reader should verify that the collection of all such sets $T$ is a topology $\mathcal{T}_d$ on $X$; we call it the pseudometric topology (or the metric topology, if $d$ is known to be a metric). Any pseudometric space will be understood to be equipped with this topology, unless some other is specified.
The reader should verify that $B_d(z, r)$ is an open set in the topological space $(X, \mathcal{T}_d)$, thus justifying our calling it the open ball. We also define the closed ball with center $z$ and radius $r$ to be the set

$K_d(z, r) = \{x \in X : d(z, x) \le r\}.$

The reader should verify that this is a $\mathcal{T}_d$-closed set.
The usual metric on $\mathbb{R}$ is that given by the absolute value function; that is, $d(x, y) = |x - y|$. The set $\mathbb{R}$ is always understood to be equipped with this metric, unless some other arrangement is specified. Exercise. Show that the resulting metric topology on $\mathbb{R}$ is the same as the order interval topology on $\mathbb{R}$.

Two of the usual metrics on the extended real line $[-\infty, +\infty]$ are
$$d(x, y) = |\arctan(x) - \arctan(y)| \qquad\text{and}\qquad d(x, y) = \left| \frac{x}{1+|x|} - \frac{y}{1+|y|} \right|.$$
(It follows easily from 2.15.a that these are both metrics.) In fact, there are many usual metrics on $[-\infty, +\infty]$, all of them slightly more complicated than one might wish. Fortunately, they are interchangeable for most purposes: They all yield the same topology, and in fact we shall see in 18.24 that they all yield the same uniformity. Exercise. Show that the two metrics given above both yield the order interval topology on $[-\infty, +\infty]$. (This exercise may be postponed; it will be easier after 15.43.)

Further examples of pseudometric topologies are given in 5.34 and elsewhere.

h. For many applications we shall need a generalization of pseudometric topologies: Let $D$ be a gauge (i.e., a collection of pseudometrics) on a set $X$. For each $d \in D$ let $B_d$ be the corresponding open ball, as in 5.15.g. Let $\mathcal{T}_D$ be the collection of all sets $T \subseteq X$ having the property that for each $x \in T$, there is some finite set $D_0 \subseteq D$ and some number $r > 0$ such that $\bigcap_{d \in D_0} B_d(x, r) \subseteq T$. Then (exercise) $\mathcal{T}_D$ is a topology on $X$. We may call it the gauge topology determined by $D$. Any gauge space $(X, D)$ will be understood to be equipped with this topology unless some other arrangement is specified. We may write $\mathcal{T}$, omitting the subscript $D$, if the choice of $D$ is clear or does not need to be mentioned.

Remarks continued. If $D$ is a directed gauge (as defined in 4.4.c), then we can always choose $D_0$ to be a singleton in the definition of $\mathcal{T}_D$ given above. Whenever convenient, we shall treat pseudometric spaces as a special case of gauge spaces, with gauge $D$ consisting of a singleton $\{d\}$. When no confusion will result, we may write $d$ and $\{d\}$ interchangeably and consider $d$ itself as a gauge. Conversely, in 5.23.c we shall see that any gauge topology $\mathcal{T}_D$ can be analyzed in terms of the simpler pseudometric topologies $\{\mathcal{T}_d : d \in D\}$.

Exercises. Different gauges on a set $X$ may determine the same topology or different topologies. Two gauges $D$ and $E$ are called equivalent (or topologically equivalent) if they determine the same topology. If $D$ is a gauge and $E$ is its max closure or its sum closure (as defined in 4.4.c), then $D$ and $E$ determine the same topology. (This result will be easier to prove later; see 15.43.)

A topological space $(X, \mathcal{T})$ is metrizable (or pseudometrizable) if there exists at least one metric $d$ on $X$ (respectively, at least one pseudometric $d$ on $X$) for which $\mathcal{T} = \mathcal{T}_d$. The topological space is gaugeable if there exists at least one gauge $D$ on $X$ for which $\mathcal{T} = \mathcal{T}_D$. This terminology is discussed further in 9.4. This (pseudo)metric or gauge is not necessarily unique. When we state that a topological space is (pseudo)metrizable or gaugeable, we do not necessarily have some
particular (pseudo)metric or gauge in mind. We say that a topology $\mathcal{T}$ and a gauge $D$ are compatible if $\mathcal{T} = \mathcal{T}_D$; this term also applies to topologies and pseudometrics.

Most topologies used in analysis are gaugeable. In 16.18 we present some examples of topologies that are not gaugeable, but these examples are admittedly somewhat contrived.

Actually, the term "gaugeable" is seldom used in practice. We shall see in 16.16 that a topology is gaugeable if and only if it is uniformizable and if and only if it is completely regular; the terms "uniformizable" and "completely regular" are commonly used in the literature.

Exercise. If $(X, D)$ is a gauge space and $S \subseteq X$, then the relative topology on $S$ is also gaugeable; it can be given by the restriction of $D$ to $S$.

(Optional.) We can generalize the notion of pseudometric topologies still further. Let $D$ be a quasigauge on $X$, i.e., a collection of quasipseudometrics on $X$ (which are not necessarily symmetric; see 2.11). We can use $D$ to define a quasigauge topology $\mathcal{T}_D$ in a fashion analogous to that in 5.15.h. That is, $\mathcal{T}_D$ is the collection of all sets $T \subseteq X$ that have the property that for each $x \in T$, there is some finite set $D_0 \subseteq D$ and some number $r > 0$ such that
$$\{u \in X : \max_{d \in D_0} d(x, u) < r\} \subseteq T.$$
(This is the supremum of the topologies $\mathcal{T}_d$ determined by the individual quasipseudometrics $d \in D$; see 5.23.c.)

Reilly's Representation. Actually, every topology $\mathcal{T}$ on a set $X$ can be determined by a quasigauge $D$. Show this with $D = \{d_G : G \in \mathcal{T}\}$, where
$$d_G(x, x') = \begin{cases} 1 & \text{if } x \in G \text{ and } x' \notin G, \\ 0 & \text{otherwise.} \end{cases}$$
This presentation follows Reilly [1973]. Similar ideas have been discovered independently in other forms; see, for instance, Kopperman [1988] and Pervin [1962]. Consequently, many of the ideas that we commonly associate with gauge spaces (uniform continuity, equicontinuity, completeness, etc.) can be extended (in a weaker and more complicated form) to arbitrary topological spaces.

5.16. Definitions. Let $(X, \mathcal{T})$ be a topological space, and let $S \subseteq X$.

a. We shall say that $S$ is a neighborhood of a point $z$ if $z \in G \subseteq S$ for some open set $G$. Then $\mathcal{N}(z) = \{\text{neighborhoods of } z\}$ is a proper filter on $X$, which we shall call the neighborhood filter at $z$ or the filter of neighborhoods of $z$.

Caution: Some mathematicians define neighborhood as we have done, but other mathematicians also require the set $S$ to be open, as part of their definition of a neighborhood of a point. With the latter approach, the neighborhoods of a point generally do not form a filter. The two definitions yield similar results for the main theorems of general topology, but the open-neighborhoods-only approach is not compatible with the pedagogical style with which general topology is developed in this book: We shall use filters frequently.
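Reilly's Representation can be checked by brute force on a small finite example. The following Python sketch is an illustration only: the function names and the three-point topology are assumptions introduced here, not part of the text. Because each $d_G$ takes only the values 0 and 1, the smallest ball around $x$ is obtained by taking $D_0$ to be the whole (finite) quasigauge and any $r \le 1$, so it suffices to test that single ball.

```python
from itertools import combinations

def d_G(G, x, y):
    """Reilly's quasipseudometric for an open set G: 1 if x in G and y not in G, else 0."""
    return 1 if (x in G and y not in G) else 0

def smallest_ball(X, topology, x):
    """{u : max over all G of d_G(x, u) < 1}, the smallest ball around x
    when the quasigauge is finite and takes values in {0, 1}."""
    return frozenset(u for u in X if all(d_G(G, x, u) == 0 for G in topology))

def quasigauge_topology(X, topology):
    """All T such that every x in T has a quasigauge ball contained in T."""
    points = list(X)
    subsets = [frozenset(c) for r in range(len(points) + 1)
               for c in combinations(points, r)]
    return {T for T in subsets
            if all(smallest_ball(points, topology, x) <= T for x in T)}

# Hypothetical example: a non-symmetric (nested) topology on three points.
X = {1, 2, 3}
topology = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})}
recovered = quasigauge_topology(X, topology)
```

In this example `recovered` coincides with `topology`, as the exercise predicts; the check is exhaustive only because the set is finite.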
b. There exist some closed sets that contain $S$ (for instance, $X$ itself), and among all such sets there is a smallest (namely, the intersection of all the closed supersets of $S$). The smallest closed set containing $S$ is called the topological closure of $S$; we shall denote it by $\mathrm{cl}(S)$. It may be called simply the closure of $S$, if the context is clear. It is a special case of the Moore closure. It is probably the type of closure that is most often used by analysts.

c. There exist some open sets that are contained in $S$ (for instance, $\emptyset$), and among all such sets there is a largest (namely, the union of all the open subsets of $S$). The largest open set contained in $S$ is called the interior of $S$; we shall denote it by $\mathrm{int}(S)$.

5.17. Elementary properties.

a. $\mathrm{int}(S) \subseteq S \subseteq \mathrm{cl}(S)$.

b. A set $S$ is open if and only if $S = \mathrm{int}(S)$, and a set $S$ is closed if and only if $S = \mathrm{cl}(S)$.

c. A set $S \subseteq X$ is open if and only if $S$ is a neighborhood of each of its points.

d. The notions of closure and interior are dual to each other, in the sense of 1.7. Show that $\complement\,\mathrm{cl}(S) = \mathrm{int}(\complement S)$ and $\complement\,\mathrm{int}(S) = \mathrm{cl}(\complement S)$, where $\complement A = X \setminus A$.

e. $z \in \mathrm{cl}(S)$ if and only if $S$ meets every neighborhood of $z$.

f. If $G$ is open and $\mathrm{cl}(S) \cap G$ is nonempty, then $S \cap G$ is nonempty.

Let $X$ and $Y$ be sets, and suppose some element of $Y$ is designated "0" (e.g., if $Y$ is a vector space, or if $Y \subseteq [-\infty, +\infty]$). Let $f : X \to Y$ be some function. If $X$ is a topological space, then the support of $f$ means the set
$$\mathrm{supp}(f) = \mathrm{cl}(\{x \in X : f(x) \ne 0\}).$$
If $X$ is not equipped with a topology, then the support of $f$ usually means the set $\{x \in X : f(x) \ne 0\}$. Note that these two definitions agree if $X$ has the discrete topology.

5.18. Closures and distances. Let $(X, d)$ be a pseudometric space. Let $S$ be a nonempty subset of $X$, and let $z \in X$. The diameter of a set and the distance from a point to a set were defined in 4.40. Then:

a. $\mathrm{dist}(z, S) = 0 \iff z \in \mathrm{cl}(S)$.

b. $\mathrm{diam}(\mathrm{cl}(S)) = \mathrm{diam}(S)$, and $\mathrm{dist}(z, S) = \mathrm{dist}(z, \mathrm{cl}(S))$.

c. Show that $\mathrm{cl}(B(z, r)) \subseteq K(z, r)$, where $B$ and $K$ are the open and closed balls, defined as in 5.15.g. Example. $\mathrm{cl}(B(z, r)) \subsetneq K(z, r)$ may sometimes occur, e.g., by taking $X = \mathbb{R}$ and $d(x, y) = \min\{1, |x - y|\}$. (See Chapter 26 for a further related result.)
d. (Optional.) Hausdorff metric. Assume $(X, d)$ is a metric space. Let $\mathcal{K} = \{\text{nonempty, closed, metrically bounded subsets of } X\}$. For $S, T \in \mathcal{K}$, let
$$h(S, T) = \max\left\{ \sup_{s \in S} \mathrm{dist}(s, T),\ \sup_{t \in T} \mathrm{dist}(t, S) \right\}.$$
Show that $h$ is a metric on $\mathcal{K}$; it is called the Hausdorff metric. (See example in the figure below.)

[Figure: Hausdorff distance $h$ between a circle and a rectangle]

5.19. Kuratowski's Closure Axioms. Let $X$ be a set, and let $\mathrm{cl} : \mathcal{P}(X) \to \mathcal{P}(X)$ be some mapping. Show that $\mathrm{cl}$ is the closure operator for a topology on $X$ if and only if $\mathrm{cl}$ satisfies these four conditions:
$$\mathrm{cl}(\emptyset) = \emptyset, \qquad S \subseteq \mathrm{cl}(S), \qquad \mathrm{cl}(\mathrm{cl}(S)) = \mathrm{cl}(S), \qquad \mathrm{cl}(S \cup T) = \mathrm{cl}(S) \cup \mathrm{cl}(T)$$
for all sets $S, T \subseteq X$. Of course, if these conditions are satisfied, then the corresponding topology is uniquely determined by $\mathrm{cl}$; its closed sets are those sets $S \subseteq X$ that satisfy $S = \mathrm{cl}(S)$.

5.20. The dual of Kuratowski's axioms follows easily; we include it here for convenient later reference. Let $X$ be a set, and let $\mathrm{int} : \mathcal{P}(X) \to \mathcal{P}(X)$ be some mapping. Then $\mathrm{int}$ is the interior operator for a topology on $X$ if and only if $\mathrm{int}$ satisfies these four conditions:
$$\mathrm{int}(X) = X, \qquad S \supseteq \mathrm{int}(S), \qquad \mathrm{int}(\mathrm{int}(S)) = \mathrm{int}(S), \qquad \mathrm{int}(S \cap T) = \mathrm{int}(S) \cap \mathrm{int}(T)$$
for all sets $S, T \subseteq X$. Of course, if these conditions are satisfied, then the corresponding topology is uniquely determined by $\mathrm{int}$; its open sets are those sets $S \subseteq X$ that satisfy $S = \mathrm{int}(S)$.

5.21. The lattice of open sets (optional). Let $(X, \mathcal{T})$ be a topological space. Then $(\mathcal{T}, \subseteq)$ is a complete lattice. Indeed, for any open sets $G_\lambda \in \mathcal{T}$ ($\lambda \in \Lambda$), the smallest open set containing all the $G_\lambda$'s is
$$\bigvee_{\lambda \in \Lambda} G_\lambda = \bigcup_{\lambda \in \Lambda} G_\lambda,$$
while the largest open set contained in all the $G_\lambda$'s is
$$\bigwedge_{\lambda \in \Lambda} G_\lambda = \mathrm{int}\Big(\bigcap_{\lambda \in \Lambda} G_\lambda\Big).$$
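On a finite set, the four conditions of Kuratowski's Closure Axioms (5.19) can be tested exhaustively, and the corresponding topology can be read off from the fixed points of cl. The Python sketch below is only an illustration under that finiteness assumption; the helper names and the sample operator `cl` are hypothetical choices made here.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    pts = list(X)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def satisfies_kuratowski(X, cl):
    """Check the four closure axioms of 5.19 on every subset of X."""
    subsets = powerset(X)
    if cl(frozenset()) != frozenset():
        return False
    for S in subsets:
        if not S <= cl(S) or cl(cl(S)) != cl(S):
            return False
        for T in subsets:
            if cl(S | T) != cl(S) | cl(T):
                return False
    return True

def topology_from_closure(X, cl):
    """Open sets = complements of the sets S with cl(S) == S."""
    X = frozenset(X)
    return {X - S for S in powerset(X) if cl(S) == S}

# Hypothetical operator on X = {0, 1, 2}: every nonempty set picks up the point 2.
def cl(S):
    S = frozenset(S)
    return S | {2} if S else S

X = {0, 1, 2}
ok = satisfies_kuratowski(X, cl)
opens = topology_from_closure(X, cl)
```

Here the closed sets are the empty set and the sets containing 2, so the open sets are exactly the subsets of {0, 1} together with X itself. An operator such as S ↦ S ∪ {2} applied to every S, including the empty set, fails the first axiom.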
Note that when the index set $\Lambda$ is finite, then $\bigcap_{\lambda \in \Lambda} G_\lambda$ is an open set, and so $\bigwedge_{\lambda \in \Lambda} G_\lambda$ is equal to $\bigcap_{\lambda \in \Lambda} G_\lambda$. That is, the inclusion $\mathcal{T} \subseteq \mathcal{P}(X)$ preserves finite sups and infs; thus it is a lattice homomorphism. However, when the index set $\Lambda$ is infinite, $\bigwedge_{\lambda \in \Lambda} G_\lambda$ generally is not equal to $\bigcap_{\lambda \in \Lambda} G_\lambda$. Thus the inclusion $\mathcal{T} \subseteq \mathcal{P}(X)$ is sup-preserving, but it generally is not inf-preserving.

It is easy to verify that $(\mathcal{T}, \subseteq)$ satisfies one of the infinite distributive laws:
$$H \wedge \bigvee_{\lambda \in \Lambda} G_\lambda = \bigvee_{\lambda \in \Lambda} (H \wedge G_\lambda). \qquad (1)$$
(See also the related results and comments in Chapter 13.) However, $(\mathcal{T}, \subseteq)$ does not necessarily satisfy the other infinite distributive law,
$$H \vee \bigwedge_{\lambda \in \Lambda} G_\lambda = \bigwedge_{\lambda \in \Lambda} (H \vee G_\lambda). \qquad (2)$$
For instance, that law is not satisfied in the following example, taken from Vulikh [1967]: Let $\mathcal{T}$ be the usual topology on the real line. Let $H = (0, 1)$, and $G_n = (1 - \frac{1}{n}, 2)$ for $n = 1, 2, 3, \ldots$. Verify that $\bigwedge_{n \in \mathbb{N}} G_n = (1, 2)$, hence the left side of equation (2) is $(0, 1) \cup (1, 2)$. On the other hand, $H \vee G_n = (0, 2)$, hence the right side of equation (2) is $(0, 2)$.

It is possible to study at least some properties of a topological space purely in terms of its lattice of open sets; one can disregard the individual points that make up those sets. An introduction to this "pointless topology" was given by Johnstone [1983]. However, this pointless topology is seldom useful in applied analysis, which is greatly concerned with points.

5.22. Neighborhood Axioms. Let $X$ be a set. For each $x \in X$, suppose $\mathcal{N}(x)$ is some filter on $X$, such that $x$ is a member of every member of $\mathcal{N}(x)$. Let
$$\mathcal{T} = \{G \subseteq X : G \in \mathcal{N}(z) \text{ for every } z \in G\}.$$
(In particular, $\emptyset \in \mathcal{T}$, since there is no $z \in G$ in that case.) Then the following three conditions are equivalent:

(A) There exists a topology on $X$ for which $\{\mathcal{N}(z) : z \in X\}$ is the system of neighborhood filters.

(B) For each $z \in X$, the collection of sets $\mathcal{T} \cap \mathcal{N}(z)$ is a base for the filter $\mathcal{N}(z)$. That is, every member of $\mathcal{N}(z)$ contains some member of $\mathcal{N}(z) \cap \mathcal{T}$.

(C) For each $z \in X$ and each $S \in \mathcal{N}(z)$, there is some $G \in \mathcal{N}(z)$ with the property that $u \in G \Rightarrow S \in \mathcal{N}(u)$.

Moreover, if (A), (B), (C) are satisfied, then the topology in (A) must be $\mathcal{T}$. (See the related result in Chapter 16.)

Hints: Let us first restate (B) as follows:

(B') For every $S \in \mathcal{N}(z)$, there is some $G \in \mathcal{N}(z) \cap \mathcal{T}$ such that $G \subseteq S$.
For (A) $\Rightarrow$ (B'), let $\mathrm{int}$ be the interior operator of the given topology; show that $G = \mathrm{int}(S)$ is a member of the collection of sets $\mathcal{T}$ described above. For (B') $\Rightarrow$ (C), note that $u \in G \Rightarrow G \in \mathcal{N}(u) \Rightarrow S \in \mathcal{N}(u)$ since $S \supseteq G$. For (C) $\Rightarrow$ (A), define an operator $\mathrm{int} : \mathcal{P}(X) \to \mathcal{P}(X)$ by $\mathrm{int}(S) = \{z \in X : S \in \mathcal{N}(z)\}$; then verify that this operator $\mathrm{int}$ satisfies the conditions of 5.20.

5.23. Here are a few more ways to make topologies:

a. If $\{\mathcal{T}_\lambda : \lambda \in \Lambda\}$ is a collection of topologies on a set $X$, then
$$\bigcap_{\lambda \in \Lambda} \mathcal{T}_\lambda = \{S \subseteq X : S \in \mathcal{T}_\lambda \text{ for every } \lambda\}$$
is also a topology on $X$. It is sometimes called the infimum of the $\mathcal{T}_\lambda$'s, since it is their greatest lower bound; i.e., it is the largest topology that is contained in all the $\mathcal{T}_\lambda$'s.

b. From the preceding result we see that the collection of all topologies on $X$ is a Moore collection of subsets of $\mathcal{P}(X)$. Thus, if $\mathcal{G}$ is any collection of subsets of $X$, then there exists a smallest topology $\mathcal{T}$ containing $\mathcal{G}$, namely, the intersection of all the topologies that contain $\mathcal{G}$. The topology $\mathcal{T}$ obtained in this fashion is the topology generated by $\mathcal{G}$. (The topology generated is a special case of the Moore closure, but the terms "closed" and "closure" generally are not used in this context.) The generating set $\mathcal{G}$ is also called a subbase for the topology $\mathcal{T}$.

Exercise. Let $\mathcal{G}$ be a collection of subsets of a set $X$. A set $T \subseteq X$ is a neighborhood of a point $x \in X$ with respect to the topology generated by $\mathcal{G}$ if and only if $T$ has the following property: There is some finite set $\{G_1, G_2, \ldots, G_n\} \subseteq \mathcal{G}$ such that $x \in \bigcap_{j=1}^{n} G_j \subseteq T$. (We permit $n = 0$, with the convention that the intersection of no subsets of $X$ is all of $X$.)

Example. The order interval topology on a chain $X$ (defined in 5.15.f) is the topology generated by the sets that can be expressed in either of the forms
$$S_a = \{x \in X : a < x\} \qquad\text{or}\qquad S^b = \{x \in X : x < b\}$$
for points $a, b \in X$.

c. If $\{\mathcal{T}_\lambda : \lambda \in \Lambda\}$ is a collection of topologies on a set $X$, then the topology generated by
$$\bigcup_{\lambda \in \Lambda} \mathcal{T}_\lambda = \{S \subseteq X : S \in \mathcal{T}_\lambda \text{ for some } \lambda\}$$
is called the supremum of the $\mathcal{T}_\lambda$'s, since it is their least upper bound; i.e., it is the smallest topology that contains all the $\mathcal{T}_\lambda$'s. The collection of all topologies on $X$ is a complete lattice when ordered by $\subseteq$, since each subcollection has an inf (see 5.23.a) and a sup.

Important example. On any gauge space $(X, D)$, the gauge topology $\mathcal{T}_D$ is the supremum of the pseudometric topologies $\{\mathcal{T}_d : d \in D\}$ (defined as in 5.15.g).
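The topology generated by a subbase (5.23.b) can be computed directly on a finite set: take all finite intersections of subbase members (with the empty intersection equal to X), then all unions of those. The Python sketch below is a minimal illustration under the assumption that X is finite; the function name and both sample subbases are hypothetical.

```python
from itertools import combinations

def generated_topology(X, subbase):
    """Smallest topology on the finite set X containing every set in `subbase`."""
    X = frozenset(X)
    subbase = [frozenset(G) for G in subbase]
    # Base: all finite intersections; the intersection of no sets is X itself.
    base = {X}
    for r in range(1, len(subbase) + 1):
        for combo in combinations(subbase, r):
            base.add(frozenset.intersection(*combo))
    # Topology: all unions of base members; the empty union is the empty set.
    topology = {frozenset()}
    for B in base:
        topology |= {U | B for U in topology}
    return topology

# Order interval topology on the finite chain 1 < 2 < 3 < 4, generated by
# the sets {x : a < x} and {x : x < b} as in the Example above.
chain = [1, 2, 3, 4]
rays = [{x for x in chain if a < x} for a in chain] + \
       [{x for x in chain if x < b} for b in chain]
chain_topology = generated_topology(chain, rays)

# A smaller hypothetical subbase on {1, 2, 3}.
small = generated_topology({1, 2, 3}, [{1, 2}, {2, 3}])
```

On a finite chain every singleton is an intersection of two rays, so `chain_topology` is the discrete topology. The second example yields the five sets ∅, {2}, {1, 2}, {2, 3}, and {1, 2, 3}.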
5.24. Remarks. The theory of topological spaces will be developed a little further in Chapter 9. It will be continued in much greater detail in Chapter 15 and thereafter.

ALGEBRAS AND SIGMA-ALGEBRAS

5.25. Definitions. Let $X$ be a set, and let $\complement$ denote complementation in $X$. An algebra (or field) of subsets of $X$ is a collection $\mathcal{S} \subseteq \mathcal{P}(X)$ with the following properties:

(i) $X \in \mathcal{S}$,
(ii) $S \in \mathcal{S} \Rightarrow \complement S \in \mathcal{S}$, and
(iii) $S, T \in \mathcal{S} \Rightarrow S \cup T \in \mathcal{S}$.

In the terminology of Chapter 1, that says: $\mathcal{S}$ contains $X$ itself and $\mathcal{S}$ is closed under complementation and finite union. It follows that $\emptyset \in \mathcal{S}$ and that $\mathcal{S}$ is closed under finite intersection and relative complementation: $S, T \in \mathcal{S}$ implies $S \cap T,\ S \setminus T \in \mathcal{S}$.

Caution: The term "algebra" has many different meanings in mathematics; several meanings will be given in Chapter 8 and one more in Chapter 11. When we need to distinguish the algebra defined above from other kinds of algebras, the algebra defined in the preceding paragraph will be called an algebra of sets.

A $\sigma$-algebra (or $\sigma$-field) of subsets of $X$ is an algebra that is closed under countable union:

(iii') $S_1, S_2, S_3, \ldots \in \mathcal{S} \Rightarrow \bigcup_{j=1}^{\infty} S_j \in \mathcal{S}$.

It follows immediately that any $\sigma$-algebra $\mathcal{S}$ is also closed under countable intersection: $S_1, S_2, S_3, \ldots \in \mathcal{S} \Rightarrow \bigcap_{j=1}^{\infty} S_j \in \mathcal{S}$.

A measurable space is a pair $(X, \mathcal{S})$, where $X$ is a set and $\mathcal{S}$ is a $\sigma$-algebra of subsets of $X$. The elements of $\mathcal{S}$ are referred to as the measurable sets in $X$. We may refer to $X$ itself as a measurable space if $\mathcal{S}$ does not need to be mentioned explicitly.

Measurable spaces $(X, \mathcal{S})$ should not be confused with measure spaces $(X, \mathcal{S}, \mu)$, introduced in Chapter 21, or with spaces of measures $\{\mu_\lambda\}$, introduced in Chapters 11 and 29. Somewhat imprecisely, we may say that a measure is a device for measuring how big sets are; a measurable space is a space that is capable of being equipped with any of several different measures; a measure space is a space that has been equipped with a particular measure; and a space of measures is a collection of measures that is equipped with some additional structure (linear, topological, etc.) that leads us to call it a "space."

5.26. Examples of ($\sigma$-)algebras. In the following examples, any statement involving "$\sigma$-" in parentheses should be read once with the "$\sigma$-" omitted and once with it included. Let $X$ be any set; then:

a. $\mathcal{P}(X) = \{\text{subsets of } X\}$ is the largest ($\sigma$-)algebra on $X$. We shall call it the discrete ($\sigma$-)algebra.

b. $\{\emptyset, X\}$ is the smallest ($\sigma$-)algebra on $X$; we shall call this the indiscrete ($\sigma$-)algebra.
c. Let $\{\mathcal{S}_\alpha : \alpha \in A\}$ be a collection of ($\sigma$-)algebras on $X$. Then
$$\bigcap_{\alpha \in A} \mathcal{S}_\alpha = \{T \subseteq X : T \in \mathcal{S}_\alpha \text{ for every } \alpha \in A\}$$
is also a ($\sigma$-)algebra on $X$.

d. In view of the preceding exercise, the collection of all ($\sigma$-)algebras on $X$ is a Moore collection of subsets of $\mathcal{P}(X)$. Hence, given any collection $\mathcal{G}$ of subsets of $X$, there exists a smallest ($\sigma$-)algebra that contains $\mathcal{G}$, namely, the intersection of all the ($\sigma$-)algebras that contain $\mathcal{G}$. We call it the ($\sigma$-)algebra generated by $\mathcal{G}$; we say that $\mathcal{G}$ is a generating set for that ($\sigma$-)algebra. (The ($\sigma$-)algebra thus generated is a special case of the Moore closure, introduced in 4.3. However, the terms "closed" and "closure" generally are not used in this context.) The $\sigma$-algebra generated by $\mathcal{G}$ is sometimes denoted by $\sigma(\mathcal{G})$.

e. Let $J \subseteq \mathbb{R}$ be an interval (possibly all of $\mathbb{R}$). Let $\mathcal{A}$ be the collection of all unions of finitely many subintervals of $J$ (where a singleton is considered to be an interval and, by convention, $\emptyset \in \mathcal{A}$ also). Show that $\mathcal{A}$ is an algebra of subsets of $J$.

f. $\{S \subseteq X : S \text{ or } \complement S \text{ is finite}\}$ is the algebra generated by the singletons of $X$.

g. $\{S \subseteq X : S \text{ or } \complement S \text{ is countable}\}$ is the $\sigma$-algebra generated by the singletons of $X$.

h. Let $\mathcal{G}$ be a collection of subsets of $X$. Then:

(i) The algebra generated by $\mathcal{G}$ is equal to the union of the algebras generated by finite subsets of $\mathcal{G}$.
(ii) The $\sigma$-algebra generated by $\mathcal{G}$ is equal to the union of the $\sigma$-algebras generated by countable subcollections of $\mathcal{G}$.

(The proof of this result assumes some familiarity with the most basic properties of countable sets; see particularly Chapter 6.)

i. Some of the most important $\sigma$-algebras are determined in one way or another by topologies. Let $(X, \mathcal{T})$ be a topological space. The Borel $\sigma$-algebra is the $\sigma$-algebra generated by $\mathcal{T}$; that is, the smallest $\sigma$-algebra containing all the open sets. Its members are called the Borel sets. (Remark. In Chapter 15 we shall see that when $X$ is any subinterval of the real line, equipped with its usual topology, then the Borel $\sigma$-algebra is generated by the algebra of finite unions of subintervals described in 5.26.)

Some other $\sigma$-algebras based on topologies are

- the almost open sets, also known as the sets with the Baire property, studied in Chapter 20;
- the Baire sets, mentioned in Chapter 20; and
- in $\mathbb{R}^n$, the Lebesgue measurable sets, studied in Chapters 21 and 24.

Caution: The "Baire sets" are not the same as the "sets with the Baire property," and the "Lebesgue measurable sets" are not the same as the "Lebesgue sets" (introduced in Chapter 25).
j. The clopen subsets of a topological space $X$ form an algebra of subsets of $X$.

5.27. Exercise. If $\mathcal{A}$ is a ($\sigma$-)algebra on $X$ and $\mathcal{J}$ is a ($\sigma$-)ideal, then
$$\mathcal{A} \mathbin{\triangle} \mathcal{J} = \{A \mathbin{\triangle} I : A \in \mathcal{A} \text{ and } I \in \mathcal{J}\}$$
is a ($\sigma$-)algebra on $X$. It is the smallest ($\sigma$-)algebra of sets that contains both $\mathcal{A}$ and $\mathcal{J}$. (See hint in diagram below.) This result will be used in Chapters 20 and 21.

[Diagram accompanying the hint]

Hint for exercise on $\mathcal{A} \mathbin{\triangle} \mathcal{J}$: Show that a set $S \subseteq X$ is an element of $\mathcal{A} \mathbin{\triangle} \mathcal{J}$ if and only if there exist sets $G$, $H$, and $T$ such that $G \subseteq S \subseteq H$, $G \subseteq T \subseteq H$, $T \in \mathcal{A}$, and $H \setminus G \in \mathcal{J}$.

5.28. More definitions (optional). Let $\Omega$ be a set. A ring of subsets of $\Omega$ (also known as a clan) is a collection $\mathcal{R}$ of subsets of $\Omega$ that satisfies $\emptyset \in \mathcal{R}$ and also
$$A, B \in \mathcal{R} \Rightarrow A \cup B,\ A \setminus B \in \mathcal{R}.$$
A $\sigma$-ring (also known as a tribe) is a ring $\mathcal{R}$ that also satisfies
$$A_1, A_2, A_3, \ldots \in \mathcal{R} \Rightarrow A_1 \cup A_2 \cup A_3 \cup \cdots \in \mathcal{R}.$$
Clearly, a collection $\mathcal{R} \subseteq \mathcal{P}(\Omega)$ is an algebra (or $\sigma$-algebra) if and only if it is a ring (or $\sigma$-ring) in which $\Omega$ is a member.

Most treatments of measure theory use either $\sigma$-algebras or $\sigma$-rings. The $\sigma$-algebra approach has the advantage that it is more algebraic, i.e., the algebraic structure of a $\sigma$-algebra is simpler than that of a $\sigma$-ring. On the other hand, $\sigma$-rings are more general, and more useful in the study of regular measures on locally compact Hausdorff spaces (see the remarks in Chapter 20), but we shall not study such measures in this book.

5.29. Let $\Omega$ be a set. A monotone class of subsets of $\Omega$ is a collection $\mathcal{M}$ of subsets of $\Omega$ with both of these properties:

(i) If $(A_j)$ is a sequence in $\mathcal{M}$ and $A_1 \subseteq A_2 \subseteq A_3 \subseteq \cdots$, then $\bigcup_{n=1}^{\infty} A_n \in \mathcal{M}$.
(ii) If $(A_j)$ is a sequence in $\mathcal{M}$ and $A_1 \supseteq A_2 \supseteq A_3 \supseteq \cdots$, then $\bigcap_{n=1}^{\infty} A_n \in \mathcal{M}$.

It is easy to see that the monotone classes form a Moore collection, i.e., any intersection of monotone classes is a monotone class. Hence, given any collection $\mathcal{A}$ of subsets of $\Omega$, there exists a smallest monotone class containing $\mathcal{A}$; we shall call it the monotone class generated by $\mathcal{A}$.
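The notion of a generated algebra of sets can be made concrete on a finite set, where one simply closes the generators under complementation and finite union until nothing new appears. The Python sketch below is an illustration under that finiteness assumption; the function name and sample generators are hypothetical. On a finite set there are only finitely many subsets, so every algebra is automatically a $\sigma$-algebra and a monotone class there.

```python
def generated_algebra(X, generators):
    """Smallest algebra of subsets of the finite set X containing the generators."""
    X = frozenset(X)
    algebra = {X, frozenset()} | {frozenset(G) for G in generators}
    changed = True
    while changed:
        changed = False
        current = list(algebra)
        for S in current:
            # Close under complementation and under pairwise (hence finite) union.
            for new in [X - S] + [S | T for T in current]:
                if new not in algebra:
                    algebra.add(new)
                    changed = True
    return algebra

# Hypothetical example: generators {1} and {1, 2} on X = {1, 2, 3, 4}.
algebra = generated_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
```

The atoms of the resulting algebra are {1}, {2}, and {3, 4}, so it has 2³ = 8 members; the points 3 and 4 cannot be separated by the generators.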
Monotone Class Theorem. Let $\mathcal{A}$ be an algebra of subsets of $\Omega$. Then the monotone class generated by $\mathcal{A}$ is equal to the $\sigma$-algebra generated by $\mathcal{A}$.

Proof. Let $\mathcal{M}$ and $\mathcal{S}$ be, respectively, the monotone class and the $\sigma$-algebra generated by $\mathcal{A}$. Thus $\mathcal{A} \subseteq \mathcal{M} \cap \mathcal{S}$. Temporarily fix any $M \in \mathcal{M}$, and let
$$\mathcal{N}_M = \{N \in \mathcal{M} : M \cap N,\ \complement M \cap N,\ M \cap \complement N \text{ all belong to } \mathcal{M}\}.$$
Verify that

a. $\mathcal{S}$ is a monotone class containing $\mathcal{A}$, hence $\mathcal{S} \supseteq \mathcal{M}$ by minimality of $\mathcal{M}$.

b. $\mathcal{N}_M$ is a monotone class.

c. If $A \in \mathcal{A}$, then $\mathcal{A} \subseteq \mathcal{N}_A$, hence $\mathcal{N}_A = \mathcal{M}$ (by the minimality of $\mathcal{M}$ among monotone classes that contain $\mathcal{A}$).

d. If $A \in \mathcal{A}$ and $M \in \mathcal{M}$, then $M \in \mathcal{N}_A$, hence $A \in \mathcal{N}_M$. Thus $\mathcal{A} \subseteq \mathcal{N}_M$. Therefore $\mathcal{N}_M = \mathcal{M}$ (by the minimality of $\mathcal{M}$ again).

e. $\mathcal{M}$ is an algebra of sets. (Indeed, if $L, M \in \mathcal{M}$, use the fact that $L \in \mathcal{M} = \mathcal{N}_M$.)

f. $\mathcal{M}$ is a $\sigma$-algebra.

g. $\mathcal{M} \supseteq \mathcal{S}$, by minimality of $\mathcal{S}$ among $\sigma$-algebras containing $\mathcal{A}$.

5.30. Remarks. Algebras of sets will be related to Boolean algebras in Chapter 13. Measurable spaces will be studied a little more in Chapter 9. Algebras of sets and $\sigma$-algebras will be used in measure theory in several later chapters.

UNIFORMITIES

5.31. Definitions. Let $X$ be a set; its diagonal is the set $I = \{(x, x) : x \in X\}$. Define inverses and compositions as in Chapter 3. Then a preuniformity on $X$ is a collection $\mathcal{U}$ of subsets of $X \times X$ that satisfies:

(i) $U \in \mathcal{U} \Rightarrow U \supseteq I$ (i.e., each $U \in \mathcal{U}$ is reflexive),
(ii) for each $U \in \mathcal{U}$, there is some $V \in \mathcal{U}$ with $V \subseteq U^{-1}$, and
(iii) for each $U \in \mathcal{U}$, there is some $V \in \mathcal{U}$ with $V \circ V \subseteq U$.

We say $\mathcal{U}$ is a uniformity on $X$ if, in addition,

(iv) $\mathcal{U}$ is a filter on $X \times X$.

(It is necessarily a proper filter since each $U \in \mathcal{U}$ contains $I$ and is therefore nonempty.) A uniform space is a pair $(X, \mathcal{U})$ consisting of a set $X$ and a uniformity $\mathcal{U} \subseteq \mathcal{P}(X \times X)$. We may refer to $X$ itself as a uniform space if $\mathcal{U}$ does not need to be mentioned explicitly. An element of $\mathcal{U}$ is called an entourage (or a vicinity).

Caution: The definitions of "uniformity" and "uniform space" vary slightly in the literature. Also, the "preuniformity" defined above is related to, but not the same as, the "subbase for a uniformity" defined in some books. (See Chapter 16.)
5.32. Uniformities constructed from distances. Let $d$ be a pseudometric on a set $X$ (defined as in 2.11). For each number $r > 0$, let
$$U_r = \{(x, x') \in X \times X : d(x, x') < r\}.$$
Then let $\mathcal{U}_d = \{V \subseteq X \times X : V \supseteq U_r \text{ for some } r > 0\}$. Then (exercise) $\mathcal{U}_d$ is a uniformity on $X$. We shall call it the uniformity on $X$ determined by $d$. A uniformity that can be determined by a pseudometric (or a metric) in this fashion is called a pseudometrizable uniformity (or a metrizable uniformity).

More generally, let $D$ be a gauge on a set $X$. Let $\mathcal{U}_D$ be the collection of all sets $U \subseteq X \times X$ that have this property: There is some number $r > 0$ and some finite set $F \subseteq D$ such that
$$\{(x, x') \in X \times X : \max_{d \in F} d(x, x') < r\} \subseteq U.$$
(The set $F$ can be taken to be a singleton, if $D$ is directed, as in 4.4.c.) Then (exercise) $\mathcal{U}_D$ is a uniformity on $X$. We shall call it the uniformity for $X$ determined by $D$. Any gauge space $(X, D)$ will be understood to be equipped with this uniformity unless some other arrangement is specified.

Two different gauges $D$ and $E$ on a set $X$ may determine different uniformities $\mathcal{U}_D$, $\mathcal{U}_E$ or the same uniformity. We say $D$ and $E$ are uniformly equivalent if they determine the same uniformity. This is an equivalence relation on the set of all gauges on $X$. Exercise. If $D$ is a gauge and $E$ is its max closure or its sum closure (as defined in 4.4.c), then $D$ and $E$ determine the same uniformity.

We say that a uniformity $\mathcal{U}$ and a gauge $D$ are compatible if $\mathcal{U} = \mathcal{U}_D$; this term also applies to uniformities and pseudometrics.

Some uniformities are not pseudometrizable; an example of this is given in Chapter 18. It would be natural to say that a uniformity $\mathcal{U}$ is "gaugeable" if $\mathcal{U} = \mathcal{U}_D$ for some gauge $D$. However, it turns out that every uniformity is gaugeable; see 16.16.

5.33. Topologies constructed from uniformities. Let $(X, \mathcal{U})$ be a uniform space. For each $U \in \mathcal{U}$ and $x \in X$, let
$$U[x] = \{y : (x, y) \in U\};$$
this is a subset of $X$. Then $\mathcal{U}[x] = \{U[x] : U \in \mathcal{U}\}$ is a filter on $X$, for each $x \in X$. It is an easy exercise to verify that the system of filters $\{\mathcal{U}[x] : x \in X\}$ satisfies condition 5.22(C), and hence it is the system of neighborhood filters for a topology $\mathcal{T}$ on $X$. That topology is called the uniform topology determined by $\mathcal{U}$. We may sometimes denote it by $\mathcal{T}_{\mathcal{U}}$. A topology that can be represented in this fashion is said to be uniformizable. We say that a topology $\mathcal{T}$ and a uniformity $\mathcal{U}$ are compatible if $\mathcal{T} = \mathcal{T}_{\mathcal{U}}$.

Different uniformities may determine different topologies or the same topology. We do not necessarily have a particular uniformity in mind when we say that a topology $\mathcal{T}$ is uniformizable; the terminology is discussed further in 9.4. Most topologies in applications are uniformizable. Some examples of nonuniformizable topologies (admittedly, rather artificial) are given in 16.18.
Exercise. Let $D$ be a gauge on $X$. Then the gauge topology $\mathcal{T}_D$ determined by $D$ as in 5.15.h is the same as the uniform topology $\mathcal{T}_{\mathcal{U}}$ determined as above from the uniformity $\mathcal{U} = \mathcal{U}_D$ defined from the gauge $D$ as in 5.32.

5.34. Examples.

a. Let $d : X \times X \to \mathbb{R}$ be the constant function 0. Then $d$ is a pseudometric. The uniformity it determines is the singleton $\{X \times X\}$. This is the smallest uniformity on $X$; we shall call it the indiscrete uniformity. It determines the indiscrete topology (defined in 5.15).

b. $\{U \subseteq X \times X : U \supseteq I\}$ is the largest uniformity on $X$; we shall call it the discrete uniformity. The topology that it determines is the discrete topology. The discrete uniformity is determined by some discrete metrics, for instance, by the Kronecker metric $1 - \delta$ (see Chapter 2). Any discrete metric (defined in Chapter 2) yields the discrete topology (defined in 5.15). However, not every discrete metric yields the discrete uniformity; an example is given in Chapter 19.

c. Let $X$ be a nonempty set, and let $\xi$ be some particular specified element of $X$. Define a pseudometric on $X$ by
$$d(u, v) = \begin{cases} 0 & \text{if both or neither of } u, v \text{ are equal to } \xi, \\ 1 & \text{if exactly one of } u, v \text{ is equal to } \xi, \end{cases}$$
for $u, v \in X$. A set $X$ equipped with this pseudometric will be called a knob space; we may refer to $\xi$ as the "knob" of $X$. Let $S = X \setminus \{\xi\}$. The resulting uniformity is
$$\mathcal{U} = \{U \subseteq X \times X : U \supseteq S \times S \text{ and } (\xi, \xi) \in U\}$$
and the resulting topology is $\{\emptyset, S, \{\xi\}, X\}$. Knob spaces will be important in certain arguments concerning the Axiom of Choice, in Chapters 17 and 19, using "Kelley's choice" (see Chapter 6).

5.35. Some basic properties of uniformities. Let $(X, \mathcal{U})$ be a uniform space. Show that

a. $U \in \mathcal{U} \Rightarrow U^{-1} \in \mathcal{U}$. (This property is not always satisfied by a preuniformity.)

b. If $U$ is an entourage, then $V = U \cap U^{-1}$ is a symmetric entourage (i.e., an entourage satisfying $V = V^{-1}$) contained in $U$.

c. If $U$ is an entourage and $k$ is a positive integer, then there exists an entourage $V$ satisfying
$$V^k = \underbrace{V \circ V \circ \cdots \circ V}_{k \text{ of the } V\text{'s}} \subseteq U.$$
Moreover, we may choose $V$ symmetric. In particular, if $U$ is an entourage then there exists a symmetric entourage $V$ such that $V^3 = V \circ V \circ V \subseteq U$. We shall use that fact in a proof in Chapter 16.
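The knob space of 5.34.c is small enough to verify by enumeration. The Python sketch below is an illustration only; the helper names and the three-point set are assumptions made here. On a finite set, choosing r below the smallest positive distance shows that the smallest ball around z is {x : d(x, z) = 0}, so a set T is open exactly when it contains that zero-distance ball around each of its points.

```python
from itertools import combinations

def knob_pseudometric(knob):
    """d(u, v) = 0 if both or neither of u, v equal the knob, 1 otherwise."""
    def d(u, v):
        return 0 if (u == knob) == (v == knob) else 1
    return d

def pseudometric_topology(X, d):
    """Open sets of the pseudometric topology on the finite set X."""
    pts = list(X)

    def zero_ball(z):
        # Smallest open ball around z (any radius below the least positive distance).
        return frozenset(x for x in pts if d(x, z) == 0)

    subsets = [frozenset(c) for r in range(len(pts) + 1)
               for c in combinations(pts, r)]
    return {T for T in subsets if all(zero_ball(z) <= T for z in T)}

# Hypothetical knob space on {"a", "b", "k"} with knob "k".
X = {"a", "b", "k"}
knob_topology = pseudometric_topology(X, knob_pseudometric("k"))

# The constant pseudometric 0 of 5.34.a gives the indiscrete topology.
indiscrete = pseudometric_topology(X, lambda u, v: 0)
```

With S = {"a", "b"}, the computed `knob_topology` consists of ∅, S, {"k"}, and X, matching the description in 5.34.c.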
5.36. Pathological example. An intersection of uniformities is not necessarily a uniformity. For instance, take $X = \mathbb{R} \times \mathbb{R}$. Let $\pi_1, \pi_2 : X \to \mathbb{R}$ be the coordinate projections; that is, $\pi_1(x_1, x_2) = x_1$ and $\pi_2(x_1, x_2) = x_2$. Define pseudometrics $d_1, d_2$ on $X$ by
$$d_1(x, y) = |\pi_1(x) - \pi_1(y)|, \qquad d_2(x, y) = |\pi_2(x) - \pi_2(y)|,$$
for $x$ and $y$ in $X$. Let $\mathcal{U}_1$ and $\mathcal{U}_2$ be the resulting pseudometric uniformities on $X$, and let $\mathcal{W} = \mathcal{U}_1 \cap \mathcal{U}_2$. Show that $\mathcal{W}$ is not a uniformity. Use that fact to show also that, although there do exist uniformities that contain $\mathcal{W}$, there does not exist a smallest uniformity containing $\mathcal{W}$.

5.37. Remarks. It will sometimes be useful to "generate" a uniformity $\mathcal{U}$ from a smaller collection $\mathcal{S}$ of sets. However, we saw in 5.36 that an intersection of uniformities is not necessarily a uniformity. Given a collection $\mathcal{S}$ of sets, we cannot simply look for the "smallest uniformity $\mathcal{U}$ that contains $\mathcal{S}$;" such a uniformity $\mathcal{U}$ need not exist. (In this respect, uniformities are not like $\sigma$-algebras or topologies.) The smaller, "generating" collection $\mathcal{S}$ cannot be chosen arbitrarily, but preuniformities (defined in 5.31) will serve quite well for this purpose.

5.38. Suppose that $\mathcal{S}$ is a preuniformity on $X$. Then $\mathcal{S}$ is a filter subbase on $X \times X$, by 5.31(i). Hence $\mathcal{S}$ generates a proper filter on $X \times X$; that filter is
$$\mathcal{U} = \{U \subseteq X \times X : U \supseteq S_1 \cap S_2 \cap \cdots \cap S_n \text{ for some finite set } \{S_1, S_2, \ldots, S_n\} \subseteq \mathcal{S}\}.$$
It is easy to show that $\mathcal{U}$ is a uniformity on $X$, and in fact $\mathcal{U}$ is the smallest uniformity containing $\mathcal{S}$. We shall call $\mathcal{U}$ the uniformity generated by $\mathcal{S}$. Caution: Despite the similar language, this is not a special case of a Moore closure.

Here are some further noteworthy properties of a preuniformity $\mathcal{S}$ on $X$ and the uniformity $\mathcal{U}$ it generates:

a. In many cases of interest, the preuniformity $\mathcal{S}$ has the property that it is closed under finite intersection, i.e., $S_1, S_2 \in \mathcal{S} \Rightarrow S_1 \cap S_2 \in \mathcal{S}$. In this case, we can simplify our formula for the uniformity $\mathcal{U}$ generated by $\mathcal{S}$; the formula becomes
$$\mathcal{U} = \{U \subseteq X \times X : U \supseteq S \text{ for some } S \in \mathcal{S}\}.$$
Moreover, in this case let us denote $S[x] = \{y : (x, y) \in S\}$ and $\mathcal{S}[x] = \{S[x] : S \in \mathcal{S}\}$; then $\mathcal{S}[x]$ is a neighborhood base at $x$ for the uniform topology.

b. The union of any family of preuniformities on $X$ is a preuniformity on $X$.

5.39. Remarks. The theory of uniform spaces will be developed a little further in Chapter 9 and in Chapter 16. It will be continued in much greater detail in Chapter 18 and thereafter. Most of these ideas are from Kelley [1955/1975], although that book does not use the term "preuniformity."
IMAGES AND PREIMAGES OF SETS OF SETS

5.40. Definitions and remarks. Let $g : X \to Y$ be some function. Show that

a. Forward image of a filter subbase. If $\mathcal{S}$ is a filter subbase or a filterbase on $X$, then $g(\mathcal{S}) = \{g(S) : S \in \mathcal{S}\}$ is a filter subbase or a filterbase, respectively, on $Y$.

b. Sets with suitable inverse images. Let $\mathcal{S}$ be a collection of subsets of $X$, and let $\mathcal{T} = \{T \subseteq Y : g^{-1}(T) \in \mathcal{S}\}$. If $\mathcal{S}$ is closed under complementation in $X$, under finite or countable or arbitrary union, or under finite or countable or arbitrary intersection, then $\mathcal{T}$ is closed under the same operation in $Y$. If $\mathcal{S}$ is a filter subbase, a filter, an ultrafilter, a fixed ultrafilter, a topology, or a $\sigma$-algebra on $X$, then $\mathcal{T}$ has the same property on $Y$. If $\mathcal{S}$ is a filter generated by a filterbase $\mathcal{B}$ on $X$, then $g(\mathcal{S})$ and $g(\mathcal{B})$ (defined as in 5.40.a) are filterbases on $Y$, both of which generate the filter $\mathcal{T}$ (defined as above).

c. Inverse image of a collection. Let $\mathcal{T}$ be a collection of subsets of $Y$, and let $g^{-1}(\mathcal{T}) = \{g^{-1}(T) : T \in \mathcal{T}\}$. If $\mathcal{T}$ is closed under complementation in $Y$, under finite or countable or arbitrary union, or under finite or countable or arbitrary intersection, then $g^{-1}(\mathcal{T})$ is closed under the same operation in $X$. If $\mathcal{T}$ is a topology or a ($\sigma$-)algebra, then $g^{-1}(\mathcal{T})$ has the same property on $X$. If $g$ is surjective and $\mathcal{T}$ is a filter subbase or filterbase on $Y$, then $g^{-1}(\mathcal{T})$ is, too, on $X$.

d. Define $g \times g : X^2 \to Y^2$ by $(g \times g)(x_1, x_2) = (g(x_1), g(x_2))$. If $\mathcal{U}$ is a preuniformity on $Y$, then
$$(g \times g)^{-1}(\mathcal{U}) = \{(g \times g)^{-1}(U) : U \in \mathcal{U}\}$$
is a preuniformity on $X$ (regardless of whether $g$ is surjective).

5.41. Remark. The ideas in the remainder of this chapter are mainly needed for set theory and logic; they can be skipped if one is only concerned with the traditional topics of analysis.

TRANSITIVE SETS AND ORDINALS

5.42. Let $X$ be a set. Then the following conditions are equivalent. If one (hence all) are satisfied, we say $X$ is transitive (or more precisely, $\in$-transitive).
(A) Whenever $A \in S$ and $S \in X$, then $A \in X$.
(B) Each member of X is also a subset of X; that is, $A \in X \Rightarrow A \subseteq X$. That can also be restated as $X \subseteq \mathcal{P}(X)$.
(C) $\mathrm{Un}(X) \subseteq X$, where $\mathrm{Un}(X)$ is the union of the members of X, as defined with the Axiom of Unions in 1.47.

Remarks. As Doets [1983] points out, the notion of transitive sets is slightly alien to most mathematicians, i.e., those not actively involved in set theory. After all, for most mathematicians, it is enough to consider sets of sets and occasionally sets of sets of sets. But if X is transitive, then each member of X is a subset of X, and so each member of each member of X is a subset of X, and so on. For some purposes in set theory, this process must be continued to an infinite depth.

Examples. The sets $\emptyset$ and $\{\emptyset\}$ and $\{\emptyset, \{\emptyset\}\}$ are transitive; the set $\{\{\emptyset\}\}$ is not.

5.43. Basic properties of transitive sets.
a. The intersection of any nonempty collection of transitive sets is a transitive set.
b. If S is any set, then S is a subset of some transitive set. In fact, $\mathrm{cl}(S) = S \cup \mathrm{Un}(S) \cup \mathrm{Un}(\mathrm{Un}(S)) \cup \mathrm{Un}(\mathrm{Un}(\mathrm{Un}(S))) \cup \cdots$ is a transitive set with $\mathrm{cl}(S) \supseteq S$.
c. cl(S) is the smallest transitive superset of S; that is, it is the intersection of all the transitive supersets of S. We shall call it the transitive closure of S. It is a special case of Moore closures (discussed in 4.3), except that in this case the domain of cl is a proper class, not a set.
d. If S is any set, then S is a member of some transitive set. For instance, one such set is the transitive closure of the singleton {S}.

5.44. Preview/examples. Because the definition of ordinals is somewhat abstract and complicated, we shall precede that definition with a few examples. Of course, the assertions that we now make about these examples cannot be proved until a few pages later; see 5.46.

The first few ordinals are the finite ordinals. Set theorists find it convenient to attach the labels "0," "1," "2," etc., to these sets. (See the related discussions in Chapter 1.) The nonnegative integers are thus defined to be the sets $0 = \emptyset$, $1 = \{0\}$, $2 = \{0, 1\}$, and so on; $n + 1 = n \cup \{n\}$ is the successor of n. Thus the set $n = \{0, 1, 2, \ldots, n-1\}$ contains exactly n elements.
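The definitions in 5.42 through 5.44 can be tried out on hereditarily finite sets. The following is only a sketch under stated assumptions: sets are modeled as Python frozensets, and the helper names (union_of_members, is_transitive, transitive_closure, von_neumann) are hypothetical, chosen for this illustration. The loop computing cl(S) stops only because every set involved is finite.

```python
from itertools import chain


def union_of_members(x):
    """Un(X): the union of the members of X (each member is a frozenset)."""
    return frozenset(chain.from_iterable(x))


def is_transitive(x):
    """Condition (B) of 5.42: every member of X is also a subset of X."""
    return all(member <= x for member in x)


def transitive_closure(s):
    """cl(S) = S u Un(S) u Un(Un(S)) u ...  (finite sets only)."""
    closure = frozenset(s)
    layer = frozenset(s)
    while layer:                       # stops once a layer is empty
        layer = union_of_members(layer)
        closure |= layer
    return closure


def von_neumann(n):
    """The finite ordinal n = {0, 1, ..., n-1}, built via n+1 = n u {n}."""
    current = frozenset()
    for _ in range(n):
        current = current | {current}
    return current


empty = frozenset()
singleton_of_one = frozenset({frozenset({empty})})   # the set {{0}}
```

In this model the set {{0}} fails the test, while its transitive closure is the ordinal 2, in line with 5.43.c.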
After the finite ordinals come the countably infinite ordinals. The first few of these are

$$\begin{aligned}
\omega &= \{0, 1, 2, 3, \ldots\},\\
\omega + 1 &= \{0, 1, 2, 3, \ldots, \omega\},\\
\omega + 2 &= \{0, 1, 2, 3, \ldots, \omega, \omega+1\},\\
&\;\;\vdots\\
2\omega &= \{0, 1, 2, 3, \ldots, \omega, \omega+1, \omega+2, \ldots\},\\
2\omega + 1 &= \{0, 1, 2, 3, \ldots, \omega, \omega+1, \omega+2, \ldots, 2\omega\},\\
2\omega + 2 &= \{0, 1, 2, 3, \ldots, \omega, \omega+1, \omega+2, \ldots, 2\omega, 2\omega+1\},\\
&\;\;\vdots\\
3\omega &= \{0, 1, 2, 3, \ldots, \omega, \omega+1, \ldots, 2\omega, 2\omega+1, 2\omega+2, \ldots\},
\end{aligned}$$

and so on. The ordinal $\omega$ is an ordered version of the unordered set $\mathbb{N} \cup \{0\}$. Again, note that the successor of any ordinal S is the ordinal $S \cup \{S\}$.

After the countable ordinals come the uncountable ordinals. They are a bit harder to visualize, but we can easily sketch a proof of their existence. Consider any uncountable set, for instance, $2^{\mathbb{N}}$, and give it a well ordering (see (AC4) in 6.20). Then find the ordinal that is order isomorphic to it (see 5.46.f). Among the uncountable ordinals that are isomorphic to subsets of $2^{\mathbb{N}}$, there is a first one. The first uncountable ordinal is equal to the set of all countable ordinals.

Before proceeding further, the reader may find it helpful to briefly review the theory of well ordered sets developed in Chapter 3.

5.45. Definitions. In the discussion below, the expression $A \mathrel{\underline{\in}} B$ will mean "$A \in B$ or $A = B$." Let X be a set. Then the following conditions are equivalent. If one (hence all) are satisfied, we say X is an ordinal.

(A) X is a transitive set, and the relation $\underline{\in}$ is a chain ordering of X.
(B) X is a transitive set, and the relation $\underline{\in}$ is a well ordering of X.
(C) X is a transitive set, and all the members of X are transitive sets.

Proof of equivalence of conditions. The implication (B) $\Rightarrow$ (A) is trivial, and the implication (A) $\Rightarrow$ (C) is a fairly easy exercise. For (A) $\Rightarrow$ (B) use the Axiom of Regularity in 1.47: If S is a nonempty subset of X, it is easy to show that any $\in$-minimal element of S is also a $\underline{\in}$-minimum element of S.

It remains to prove (C) $\Rightarrow$ (A). Suppose (A) is false; thus there exist sets $A, B \in X$ that are not $\underline{\in}$-comparable. Using the Axiom of Regularity, let $A_0$ be a $\in$-minimal element of the nonempty set $S_0 = \{A \in X : \text{some } B \in X \text{ exists such that } A, B \text{ are not } \underline{\in}\text{-comparable}\}$. Then let $B_0$ be some $\in$-minimal element of the nonempty set $T_0 = \{B \in X : A_0 \text{ and } B \text{ are not } \underline{\in}\text{-comparable}\}$.
Then $A_0$ and $B_0$ are not $\underline{\in}$-comparable. Both $A_0$ and $B_0$ are members of X, hence they are transitive sets.

We first show that $B_0 \subseteq A_0$. Indeed, let $C_0 \in B_0$; we shall show $C_0 \in A_0$. Since $C_0 \in B_0 \in X$ and X is transitive, we have $C_0 \in X$. Since $B_0$ is an $\in$-minimal element of $T_0$, it follows that $C_0$ is not a member of $T_0$. Since $C_0 \in X \setminus T_0$, it follows that $A_0$ and $C_0$ are $\underline{\in}$-comparable. If $A_0 \mathrel{\underline{\in}} C_0 \in B_0$, then $A_0 \in B_0$ because $B_0$ is transitive; that would contradict the fact that $A_0$ and $B_0$ are not $\underline{\in}$-comparable. Thus we do not have $A_0 \mathrel{\underline{\in}} C_0$; thus we must have $C_0 \in A_0$. This proves $B_0 \subseteq A_0$.

Since $A_0$ and $B_0$ are not $\underline{\in}$-comparable, they are not equal; thus $A_0 \setminus B_0$ is nonempty. Let $D_0$ be some member of $A_0 \setminus B_0$. Since $A_0$ is a $\in$-minimal element of $S_0$ and $D_0 \in A_0$, it follows that $D_0 \notin S_0$. Thus $D_0$ is $\underline{\in}$-comparable with every member of X, and in particular $D_0$ is $\underline{\in}$-comparable with $B_0$. By our choice of $D_0$ we know that $D_0 \notin B_0$; thus we must have $B_0 \mathrel{\underline{\in}} D_0$. But then $B_0 \mathrel{\underline{\in}} D_0 \in A_0$, and so $B_0 \in A_0$ because $A_0$ is transitive, contradicting the fact that $A_0$ and $B_0$ are not $\underline{\in}$-comparable. (This argument follows Shoenfield [1967].)

Remarks. In recent years the definitions above (due to von Neumann) have become standard. However, some of the earlier literature used slightly different definitions of "ordinal." To some mathematicians an ordinal meant any well ordered set. To others it meant any equivalence class of well ordered sets, where two well ordered sets are considered to be equivalent if there exists an order isomorphism between them. The latter definition may cause some difficulties, since that equivalence class is a proper class, not a set. The von Neumann definition removes these difficulties by specifying a natural representative from that equivalence class, as we shall see in 5.46.f.

5.46. Basic properties of ordinals. If X is an ordinal, then we understand X to be equipped with the ordering given by $\underline{\in}$, which makes X a well ordered set.
a. All the members of an ordinal are ordinals.
b. If X is an ordinal, then X = {proper lower sets of X}. Indeed, if $\alpha \in X$, then the set of predecessors of $\alpha$ is $\mathrm{Pre}(\alpha) = \{x \in X : x \in \alpha\} = \alpha$. Hint: 3.39.
c. If X and Y are ordinals, then $Y \subsetneq X \iff Y \in X$. Hints: $Y \in X \Rightarrow Y \subseteq X$ since X is transitive. Conversely, suppose $Y \subsetneq X$. Let $\alpha$ be the first member of $X \setminus Y$. Show that $\mathrm{Pre}(\alpha) = Y$.
d. If $\mathcal{C}$ is any nonempty subclass of the class of ordinals, then the intersection of all the members of $\mathcal{C}$ is an ordinal, and furthermore that ordinal is a member of $\mathcal{C}$; in fact, it is the smallest member of $\mathcal{C}$.
e. The only order isomorphism from an ordinal onto an ordinal is the identity map from an ordinal to itself. Hint: Induction on lower sets.
f. If $(X, \preccurlyeq)$ is a well ordered set, then there is one and only one mapping that is an order isomorphism from X onto an ordinal. (Hint: Induction on lower sets.) That ordinal is sometimes referred to as the ordinal type of X.
g. If X and Y are ordinals, then $X \in Y$ or $X = Y$ or $Y \in X$. Thus, $\underline{\in}$ is a chain ordering on the class of all ordinals.
In fact, it is a well ordering: If $\mathcal{C}$ is any nonempty subclass of the class of all ordinals, then $\mathcal{C}$ has a smallest member, namely, the intersection of all the members of $\mathcal{C}$. (Later we shall show that the class of all ordinals is a proper class, i.e., it is not a set.)
h. The union of any set of ordinals is an ordinal.
i. If X is an ordinal, then so is $X \cup \{X\}$. It is the smallest ordinal greater than X, i.e., it is the first ordinal after X. It is called the successor of X; it is sometimes written $X^+$ or $X + 1$. Any ordinal that can be written in the form $X^+$ for some X is called a successor ordinal. Examples. The ordinals 1, 2, 3, ..., and $\omega + 1$, $\omega + 2$, $\omega + 3$, ..., and $2\omega + 1$, $2\omega + 2$, $2\omega + 3$, ..., etc., are successor ordinals. Refer to 5.44. Note that any successor ordinal $X^+$ has a largest element, namely, X. Conversely, show that any ordinal with a largest element is a successor ordinal. Thus, we may equivalently define a successor ordinal to be an ordinal that has a largest element.
j. A limit ordinal is an ordinal that does not have a largest element. Examples. The ordinals $\omega$, $2\omega$, $3\omega$, ..., and the first uncountable ordinal are limit ordinals. Refer to 5.44. By our definition, the empty set is a limit ordinal, but some mathematicians use a slightly different definition for limit ordinal that excludes the empty set. (Optional.) Show that if X is a limit ordinal, then $\bigcup_{S \in X} S = X$.

5.47. An initial ordinal, also known as a cardinal or a cardinal number, is an ordinal X with the property that no earlier ordinal has the same cardinality as X. (Note: Some mathematicians add the restriction that the set be infinite as part of the definition of initial ordinal, but we shall not impose that restriction.)

Note that different ordinals may have the same cardinality. For instance, it is easy (exercise) to give bijections between the ordinals $\omega$ and $\omega + 1$ and $2\omega$. (Of course, such bijections cannot be order preserving.)

All the finite ordinals are cardinals. The first infinite ordinal is $\omega = \aleph_0$; it is countable, and $\omega$ is a cardinal. The ordinals $\omega + 1$ and $2\omega$ are not cardinals, since they have the same cardinality as $\omega$. The first uncountable ordinal is $\aleph_1$; the first uncountable ordinal is a cardinal. It follows that any infinite cardinal must be a limit ordinal.

5.48. Preview. The infinite cardinals are also called alephs; they are written $\aleph_0, \aleph_1, \aleph_2, \aleph_3, \ldots, \aleph_\omega, \aleph_{\omega+1}, \aleph_{\omega+2}, \ldots, \aleph_{2\omega}, \ldots$. "Aleph" is the name of $\aleph$, the first letter of the Hebrew alphabet. It will follow from (AC4) in 6.20 that any set S can be well ordered, and hence we can assign a cardinal number to each set. The Continuum Hypothesis is the statement that $2^{\aleph_0} = \aleph_1$. See further remarks in Chapter 6. We have these inclusions: $\{\text{alephs}\} \subset \{\text{ordinals}\} \subset \{\text{sets}\}$.
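The remark that $\omega$ and $\omega + 1$ have the same cardinality, although no bijection between them preserves order, can be made concrete. The sketch below is an illustration only: the label OMEGA and the three function names are hypothetical, and the elements of $\omega + 1$ are modeled as the nonnegative integers together with that one extra label for the largest element.

```python
OMEGA = "omega"  # hypothetical label for the largest element of omega + 1


def to_omega(x):
    """A bijection from omega + 1 = {0, 1, 2, ..., omega} onto omega."""
    return 0 if x == OMEGA else x + 1


def from_omega(n):
    """The inverse bijection, from omega onto omega + 1."""
    return OMEGA if n == 0 else n - 1


def less_in_omega_plus_one(x, y):
    """Strict ordering of omega + 1: naturals as usual, omega above all."""
    if x == OMEGA:
        return False
    if y == OMEGA:
        return True
    return x < y
```

Here 0 precedes $\omega$ in $\omega + 1$, but to_omega sends them to 1 and 0 respectively, so this particular bijection reverses that pair.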
We saw in 1.45 that {sets} is a proper class; likewise we shall show in 5.50 that {ordinals} is a proper class. The class of all alephs is also a proper class, but we shall not prove that; a proof is given by Krivine [1971].

THE CLASS OF ORDINALS

5.49. Definition. For any set X, the Hartogs number of X is defined to be the smallest ordinal $\alpha$ that satisfies $\mathrm{card}(\alpha) \not\le \mathrm{card}(X)$, i.e., the smallest ordinal $\alpha$ that does not satisfy $\mathrm{card}(\alpha) \le \mathrm{card}(X)$. (It is a cardinal number.) We do not require that $\mathrm{card}(\alpha) > \mathrm{card}(X)$. That slightly stronger statement will follow as a consequence only if we assume the Axiom of Choice; see 6.22. The Hartogs number is mainly useful if one wishes to avoid using the Axiom of Choice, e.g., to study its alternatives, as we shall do briefly in this book.

Exercise. Prove that the definition of the Hartogs number makes sense, i.e., prove that there do exist ordinals $\alpha$ satisfying $\mathrm{card}(\alpha) \not\le \mathrm{card}(X)$, and among such ordinals there is a smallest.

Hint: Let $\alpha = \{\beta : \beta \text{ is an ordinal with } \mathrm{card}(\beta) \le \mathrm{card}(X)\}$. The only hard part is showing that $\alpha$ is actually a set. (After all, the collection of all ordinals is not a set; we shall prove this below.) First use the Axiom of Comprehension in 1.47 to show that $\{(S, R) : S \subseteq X,\ R \subseteq S \times S, \text{ and } R \text{ is a well ordering of } S\}$ is a set. Then use 3.41 and the Axiom of Replacement to prove $\alpha$ is a set. Finally, show that $\alpha$ is an ordinal, and $\alpha$ is the Hartogs number of X.

5.50. Theorem. The collection of all ordinals is a proper class, i.e., not a set. That is: Let $\mathcal{O}$ = {ordinals}. Then $\mathcal{O}$ is a proper class, not a set.

We offer two slightly different proofs, because both are interesting. The first proof is more elementary in that it does not use the Hartogs number; the second proof may be preferable to some readers because it does not use the Axiom of Regularity.

First proof. Suppose that $\mathcal{O}$ is a set. Show that $\mathcal{O}$ is then an ordinal and hence a member of itself, contradicting 1.49. (This is the Burali-Forti Paradox.)

Second proof. Suppose $\mathcal{O}$ is a set. Let $\beta$ be the Hartogs number of $\mathcal{O}$. Then $\beta$ is an ordinal, hence $\beta \in \mathcal{O}$, hence $\beta \subseteq \mathcal{O}$, hence $\mathrm{card}(\beta) \le \mathrm{card}(\mathcal{O})$, contradicting the definition of the Hartogs number.

5.51. The collection of all ordinals is a proper class, i.e., a "very big" collection, much like the collection of all sets. Nevertheless, the ordinals have some interesting structure; they are well ordered by $\underline{\in}$, as we noted in 5.46.g. Consequently we have the following two principles:

Induction on the Ordinals. Suppose $\mathcal{C}$ is a class of ordinals, such that
whenever X is an ordinal whose members all belong to $\mathcal{C}$, then X also belongs to $\mathcal{C}$. Then $\mathcal{C}$ contains all the ordinals.

Recursion on the Ordinals. By an ordinal-based map we shall mean a function from some ordinal into some set. Let $\mathcal{M}$ be the class of all ordinal-based maps. Let p be some function of classes, from $\mathcal{M}$ into {sets}. Then there exists a unique function $F : \{\text{ordinals}\} \to \{\text{sets}\}$ that satisfies

$$F(X) = p\big(F|_X\big) \quad \text{for each ordinal } X.$$

In the last line above, $F|_X$ is the function F restricted to X. Note that any ordinal X is equal to its own set of predecessors. The members of the domain of $F|_X$ are the members of X, i.e., the preceding ordinals, and not X itself.

Proofs. If the induction principle does not hold, then there is some ordinal $M \notin \mathcal{C}$. Let X be the first member of $M^+$ that does not belong to $\mathcal{C}$. Then X is the first ordinal that does not belong to $\mathcal{C}$, a contradiction. The recursion principle can be proved by an argument similar to 3.40; we omit the details.

5.52. Zermelo's Fixed Point Theorem. Let $(X, \preccurlyeq)$ be a nonempty poset, with the property that each $\preccurlyeq$-chain in X has a $\preccurlyeq$-supremum in X. Suppose $f : X \to X$ is a function that satisfies $f(x) \succcurlyeq x$ for all x. Then f has at least one fixed point.

Proof. Suppose not, i.e., suppose $f(x) \succ x$ for all $x \in X$. We shall show that such a function yields a contradiction. Then $u(C) = f(\sup C)$ defines a function $u : \{\text{chains in } X\} \to X$, satisfying $u(C) \succ c$ for all $c \in C$. Let $\mathcal{H}$ be either the Hartogs number of X or the class of all ordinals, according to the reader's taste; the rest of the argument will work with either $\mathcal{H}$. Recursively define a strictly increasing mapping $\psi : \mathcal{H} \to X$ by

$$\psi(S) = f\big(u(C(S))\big), \qquad \text{where } C(S) = \{\psi(T) : T \in S\}.$$

To see that this definition makes sense, note that if $\psi$ is strictly increasing on some ordinal $S \in \mathcal{H}$, then $C(S) = \{\psi(T) : T \in S\}$ is a chain, and so $u(C(S))$ exists and is an upper bound for C(S). Hence $f(u(C(S)))$ exists and is strictly greater than every member of C(S); hence $\psi$ is defined and strictly increasing on $S^+ = S \cup \{S\}$. This completes the definition of $\psi$. However, since $\psi : \mathcal{H} \to X$ is strictly increasing, it is injective, and therefore $\mathrm{card}(\mathcal{H}) \le \mathrm{card}(X)$, a contradiction.

Remarks. The proof above is from Howard [1992]. Slightly longer proofs that avoid the use of ordinals are given by Fuchssteiner [1986] and Mańka [1988]. None of these proofs requires the Axiom of Choice or any of its consequences. If we permit the use of the Axiom of Choice and its equivalents, then Zermelo's Fixed Point Theorem is a trivial corollary of Zorn's Lemma, which is (AC7) in 6.20. Thus Zermelo's Fixed Point Theorem is occasionally useful in the study of set theory without the Axiom of Choice, as in Chapter 19.
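In a finite poset the transfinite recursion used in the proof of 5.52 collapses to ordinary iteration: starting anywhere and applying f repeatedly gives a chain $x \preccurlyeq f(x) \preccurlyeq f(f(x)) \preccurlyeq \cdots$ that must stabilize. The sketch below shows only this finite analogue, not the theorem itself. The names iterate_to_fixed_point and grow are hypothetical, and the poset is assumed to be the subsets of {0, ..., n-1} ordered by inclusion.

```python
def iterate_to_fixed_point(f, start, max_steps=10_000):
    """Apply f repeatedly until the value stops changing.

    Assumes f(x) >= x in the poset and that the poset is finite, so the
    chain start <= f(start) <= f(f(start)) <= ... must stabilize.
    """
    x = start
    for _ in range(max_steps):
        y = f(x)
        if y == x:
            return x
        x = y
    raise RuntimeError("no fixed point reached within max_steps")


def grow(s, n=5):
    """An inflationary map on subsets of {0, ..., n-1}: grow(s) contains s."""
    return s | {0} | {k + 1 for k in s if k + 1 < n}


fixed = iterate_to_fixed_point(grow, frozenset())
```

Starting from the empty set, the iteration adds one element per step and stops at the whole set {0, 1, 2, 3, 4}, which grow leaves unchanged.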
5.53. The class of all sets specified by the ZF axioms is often denoted by V, because an interesting and useful description of it was given by von Neumann. Using recursion on the ordinals (in 5.51), we define a function of classes Stage : {ordinals} $\to$ {sets} by this rule:

$$\mathrm{Stage}(\alpha) = \bigcup_{\beta \in \alpha} \mathcal{P}(\mathrm{Stage}(\beta)).$$

In other words, the $\alpha$th stage is the collection of all subsets of all sets that have already been formed in previous stages. This scheme is also known as the cumulative hierarchy. (The literature contains minor variants on this definition. Some mathematicians prefer to define Stage($\alpha$) by two slightly different formulas when $\alpha$ is a limit ordinal or when $\alpha$ is a successor ordinal. However, the ultimate effect is the same. Also, some mathematicians use the term "rank" instead of "stage.") Von Neumann's universe is the class

$$V = \bigcup_{\alpha \in \{\text{ordinals}\}} \mathrm{Stage}(\alpha).$$

Although each Stage($\alpha$) is a set, V is a proper class.

In 1.44 we stated, somewhat imprecisely and intuitively, that a set is a collection of "already fixed" sets. We have not used that statement in our formal development of ZF set theory; instead we have simply assumed that the class of all sets is some collection of objects that satisfies the ZF axioms. However, we are now ready to use the ZF axioms to prove a precise version of that earlier intuitive statement.

Theorem. In ZF set theory, every set is in some stage; that is, the von Neumann class V is the class of all sets.

Proof (following Shoenfield [1967]). Let X be any given set; we wish to show $X \in V$. By 5.43.d, let T be a transitive set with $X \in T$. If $T \subseteq V$, then we are done. Assume, then, that $T \setminus V$ is nonempty. Note that the class $T \setminus V$ is actually a set, by the Axiom of Comprehension. By the Axiom of Regularity, let M be a $\in$-minimal member of the set $T \setminus V$. Let A be any member of the set M. Then $A \in T$ by transitivity of T, but $A \notin (T \setminus V)$ by minimality of M. Thus $A \in V$. Therefore $A \in \mathrm{Stage}(\alpha_A)$ for some ordinal $\alpha_A$. Now, $\{\alpha_A : A \in M\}$ is a set of ordinals. Its union is an ordinal $\beta$, which has some successor $\beta^+$. For every $A \in M$, we have $\alpha_A \mathrel{\underline{\in}} \beta$, hence $A \in \mathrm{Stage}(\beta)$. Thus $M \subseteq \mathrm{Stage}(\beta)$, so $M \in \mathrm{Stage}(\beta^+)$, contradicting our choice of M as a set that does not belong to V.

5.54. By a slight modification of von Neumann's cumulative construction, we shall obtain Gödel's constructible universe, L. This will appear briefly in our discussions in Chapter 14. The idea is that instead of taking arbitrary subsets of $V_\alpha$ to get $V_{\alpha+1}$, we shall use describable subsets.

First, define ordered pairs and ordered triples in terms of sets, as in Chapter 1. Also, define the product of two sets as a set of ordered pairs. Then the Gödel operations are defined as follows. For any sets X and Y, let
$$\begin{aligned}
\mathcal{F}_1(X, Y) &= \{X, Y\},\\
\mathcal{F}_2(X, Y) &= X \setminus Y,\\
\mathcal{F}_3(X, Y) &= X \times Y = \{(x, y) : x \in X,\ y \in Y\}.
\end{aligned}$$

Also, for any set X, let

$$\begin{aligned}
\mathcal{F}_4(X) &= \mathrm{Dom}(X) = \{u : (u, v) \in X \text{ for some } v\},\\
\mathcal{F}_5(X) &= \{(u, v) \in X \times X : u \in v\},
\end{aligned}$$

and let $\mathcal{F}_6(X)$, $\mathcal{F}_7(X)$, $\mathcal{F}_8(X)$ be the sets of ordered triples $(u, v, w)$ obtained from the triples in X by three fixed rearrangements of the coordinates.

Now, for any set X, define

$$\mathcal{G}(X) = X \cup \{\mathcal{F}_i(u, v) : u, v \in X,\ 1 \le i \le 3\} \cup \{\mathcal{F}_i(u) : u \in X,\ 4 \le i \le 8\}.$$

Let $\mathcal{G}^2(X) = \mathcal{G}(\mathcal{G}(X))$, etc., and define

$$\mathrm{cl}(X) = X \cup \mathcal{G}(X) \cup \mathcal{G}^2(X) \cup \mathcal{G}^3(X) \cup \cdots.$$

Then cl(X) is the smallest set that contains X and is closed under the Gödel operations. We now recursively define
(i) $L_0 = \emptyset$,
(ii) $L_{\alpha+1} = \mathcal{P}(L_\alpha) \cap \mathrm{cl}(L_\alpha \cup \{L_\alpha\})$ for ordinals $\alpha$,
(iii) $L_\alpha = \bigcup_{\beta < \alpha} L_\beta$ when $\alpha$ is a limit ordinal,
and finally $L = \bigcup_{\alpha \in \text{ordinals}} L_\alpha$. The members of L are said to be Gödel constructible, or constructible relative to the ordinals. Of course, since Ord is not a set, L is a proper class.

Gödel's constructions may take uncountably many steps. They are quite different from, and should not be confused with, Bishop's constructions, introduced in 6.2, which permit only countably many steps.

Is every set in von Neumann's universe "constructed" at some stage of Gödel's hierarchy? Or are there some other sets in V that cannot be so "constructed?" In other words, is L equal to V, or are these classes different? This question cannot be answered either way except by making additional assumptions beyond those of conventional set theory. The statement V = L is called the Axiom of Constructibility; it is discussed further in 14.7. For further discussion we refer to Jech [1973], Manin [1977], or other books on logic and set theory.
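The first few stages of the cumulative hierarchy of 5.53 are small enough to compute directly from the rule that Stage($\alpha$) is the union of $\mathcal{P}(\mathrm{Stage}(\beta))$ over $\beta \in \alpha$. The following is a minimal sketch, assuming finite stages only and modeling sets as frozensets; the function names powerset and stage are hypothetical. Stage(5) already has 65536 members, so the example stops at stage 4.

```python
from functools import lru_cache
from itertools import combinations


def powerset(s):
    """P(S): the set of all subsets of the finite set S."""
    items = list(s)
    return frozenset(
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    )


@lru_cache(maxsize=None)
def stage(n):
    """Stage(n) = union of P(Stage(k)) over all k < n, for finite n."""
    result = frozenset()
    for k in range(n):
        result |= powerset(stage(k))
    return result


sizes = [len(stage(n)) for n in range(5)]
```

The stages are nested, each one containing all the earlier ones, which is the monotonicity used in the proof of the theorem in 5.53.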
Chapter 6
Constructivism and Choice

6.1. Preview. This chapter introduces the Axiom of Choice (AC) and a few weakened forms of Choice. Some relations between these principles are summarized in the chart below, which is based partly on a chart of Pincus [1974].

[Chart of implications. Labels recoverable from the original figure: AXIOM OF CHOICE (AC), with the Vector Basis Theorem and Tychonov's Theorem; DC + BP + LM (Solovay); DC + BP (Shelah); ACR (Choice for R); ACF; Hahn-Banach theorems; HB for separable spaces; Garnir-Wright Continuity Theorem; Banach's Closed Graph Theorem; CC; Uniform Boundedness Theorems; Uniform Boundedness Theorems for Norms; $(\ell_\infty)^* \ne \ell_1$; Banach-Tarski; notBP; notLM.]

Implications in the chart are downward, and we shall prove most of these implications. All assertions in the chart are understood to be in conjunction with ZF. Conventional set theory is ZF + AC, that is, Zermelo-Fraenkel set theory plus the Axiom of Choice. ZF was introduced in 1.47; for the most part, it is just a formalization of our intuition about sets. For instance, in this chapter we include proofs of AC $\Rightarrow$ DC $\Rightarrow$ CC
and AC $\Rightarrow$ UF $\Rightarrow$ ACF. That DC + BP implies the Garnir-Wright Theorem is proved in Chapter 27; a proof of $(\ell_\infty)^* \ne \ell_1 \Rightarrow$ notBP is given in Chapter 29. The proof of WUF $\Rightarrow$ notLM is somewhat complicated and is not included in this book; it was given by Sierpiński [1938]. Most of the implications are known to be irreversible; for instance, it is known that HB does not imply UF and UF does not imply AC, but proofs of these irreversibility results are beyond the scope of this book. Most of them can be found in Jech [1973] or Pincus [1974] or in references cited therein. An enormous survey of the weak forms of Choice, including implications and irreversibility results, is given by [Howard and Rubin, in preparation].

For our purposes, the most important principles are AC (the Axiom of Choice), DC (Dependent Choice), UF (the Ultrafilter Principle), and HB (the Hahn-Banach Theorem). These four principles appear in many equivalent forms in later chapters. Most interesting consequences of AC actually follow from either DC or UF. One might almost think of those principles as the "constructive component" and the "nonconstructive component" of AC. However, that description would be slightly misleading, for it is known that ZF + DC + UF does not imply AC; see Pincus [1977].

Particular attention will also be devoted to the principles BP = "every subset of $\mathbb{R}$ has the Baire property," and notBP = "there exists a subset of $\mathbb{R}$ that lacks the Baire property." The topological meaning of the Baire property will be discussed in Chapter 20, but its foundational significance must be mentioned now. The statement notBP is the weakest nonconstructive consequence of AC that we shall consider as such in this book; thus BP is our strongest negation of the Axiom of Choice. Shelah's Theorem, Con(ZF) $\Rightarrow$ Con(ZF + DC + BP), is a remarkable accomplishment; it gives us a unified method of proving the intangibility of many of the pathological objects that arise in analysis, i.e., of proving that those objects have no explicitly constructible examples. This is discussed in greater detail in 14.76 and 14.77.

EXAMPLES OF NONCONSTRUCTIVE MATHEMATICS

6.2. Most of this book follows mainstream mathematics, which is not constructivist. However, a brief discussion of constructivism will be helpful.

The terms "construct" and "construction" are used loosely by most mathematicians; these terms may be applied to any argument that builds something complicated from seemingly simpler things. However, the terms "constructive" and "constructivist" are used more narrowly. An existence proof is constructive if the proof actually finds the object in question by a procedure involving just finitely many steps or, in some cases, if the proof approximates the object arbitrarily closely by a procedure involving just countably many steps. Constructivists are mathematicians who study such proofs and/or who prefer such
proofs; constructivism is the study of such proofs. Different mathematicians have different interpretations for the term "constructive," and attach different degrees of importance to that notion as well. To be more specific, we might call this constructivism in the sense of Errett Bishop. (Bishop's constructibility should not be confused with Gödel constructibility, indicated in 5.54, which permits uncountably many steps and is very different in nature.) Actually, the literature now contains many schools of constructivism that differ slightly from Bishop's view. The survey given in the next few pages is too superficial to distinguish between these different schools. Bridges and Richman [1987] give a much more detailed survey.

6.3. Most mathematicians are part-time informal constructivists, in this respect: In teaching or learning mathematics, we try to follow any abstract idea with one or more concrete examples. Indeed, new teachers of mathematics probably hear that pedagogical practice recommended more often than any other. We follow that pedagogical practice, i.e., of giving examples whenever possible, but we must depart from that practice at times, for some mathematical ideas are inherently nonconstructive.

Indeed, some of the objects studied in this book are intangible: We shall see that the objects "exist," but that explicitly constructible examples of these objects do not exist. By this we do not mean merely that no examples have been found yet; rather, we mean that it can be proven that no explicit examples can ever be given. We shall see that this peculiar status is shared by free ultrafilters, nontrivial universal nets, well orderings of $\mathbb{R}$, finitely additive probabilities that are not countably additive, subsets of $\mathbb{R}$ that lack the Baire property, certain kinds of linear maps, and diverse other objects. Although intangibles can be avoided in applied mathematics, they are conceptually useful in pure mathematics and appear frequently in the literature (usually without much explanation). The lack of examples may be disconcerting to students. We shall give some explanation of the lack of examples, here and later; see especially 14.77.

6.4. Some examples of nonconstructive mathematics. Two of the axioms of conventional set theory are nonconstructive. The Axioms of Regularity and Choice postulate the existence of certain sets without giving any indication of how to find those sets.

The Axiom of Regularity (introduced in 1.47) is largely a formality, included in set theory for convenience; it has little effect on mathematics outside of set theory. Indeed, for most purposes it can be replaced by the Principle of $\in$-Induction (in 1.50), which is in some sense constructive or, perhaps more precisely, it is not nonconstructive. In contrast, the Axiom of Choice (introduced later in this chapter) has enormous effects on many branches of mathematics, and cannot be replaced so easily with a constructive variant.

It is ironic that Baire, Borel, and Lebesgue, three of the founders of this century's analysis, were philosophically opposed to any uses of arbitrary choices, and yet Countable Choice, a mildly nonconstructive principle involving a sequence of arbitrary choices (introduced in 6.25), was crucial to their work. They used it without noticing it; only later was this use pointed out explicitly by Sierpiński. See Moore [1983].

The Axiom of Choice is a highly visible form of nonconstructive reasoning. Some mathematicians are not aware of other kinds of nonconstructive reasoning, and consequently they use the term "constructive" simply to mean "not using the Axiom of Choice." However,
that is an erroneous usage. There are other kinds of nonconstructive existence proofs, one of which we shall now describe.

Proof by contradiction was introduced in 1.9; it can be stated as $\neg\neg P \Rightarrow P$, where $\neg$ means "not." In some constructive frameworks (e.g., in intuitionist logic; see 14.35), the principle of proof by contradiction is equivalent to the Law of the Excluded Middle: For every proposition P, either P holds or not-P holds (or more briefly, $P \vee \neg P$).

Of course, most mathematicians are in wholehearted agreement with the Law of the Excluded Middle and might have trouble seeing how the constructivists could reject it. For example, we don't yet know whether Goldbach's Conjecture¹ is true or false, but most mathematicians would agree that surely it is one or the other.

However, formal constructivists use language a bit differently from mainstream mathematicians. For constructivists, the expression "P or Q" means "we have a constructive proof of P, or we have a constructive proof of Q, or both." With this convention, the Law of the Excluded Middle becomes: For every proposition P, either P holds or not-P holds, and we can determine which one. Interpreted in the language of constructivists, Goldbach's Conjecture is not yet true or false, although someday it may attain one of those states. Thus, with this interpretation, the Law of the Excluded Middle is blatantly false; constructivists and mainstream mathematicians agree on that. But why do constructivists use language in this fashion? The example below may help us to understand why; another explanation will be given in 14.36.

¹ In 1742, Goldbach conjectured that every even integer greater than 2 can be written as the sum of two prime numbers. This is one of the most famous unsolved problems of mathematics: As of the time of this writing, no one has yet proved or disproved Goldbach's Conjecture, though many mathematicians have spent much time trying and have proved slightly weakened versions of the conjecture. Goldbach's Conjecture was part of Problem 8 in Hilbert's famous list of 23 problems for the twentieth century. See Yuan [1984] for a survey of Goldbach's Conjecture. For the purposes of this book, Goldbach's Conjecture is of interest not because of what it would tell us about prime numbers, but rather because it is a simple example of an unsolved problem that could be solved if we could carry out a countable infinity of steps. Any other unsolved problem that can be solved in that fashion will do as well for the discussions in this section and in 10.46 and 15.48. If Goldbach's Conjecture gets proved or disproved between the time this book is written and the time this book is read, simply replace it with some other such problem. (An earlier draft of this book used Fermat's Last Theorem, a more famous problem that went unsolved for 300 years. However, shortly before this book was finished, a proof of Fermat's Last Theorem was finally completed by Taylor and Wiles [1995].)

6.5. An example with irrationals. The following example is taken from Troelstra and Dalen [1988]. We shall prove the following proposition.

(P) There exist positive, irrational numbers a and b such that $a^b$ is rational.

A quick, easy, nonconstructive proof is as follows: Either (i) $\sqrt{2}^{\sqrt{2}}$ is rational, and then take $a = b = \sqrt{2}$; or (ii) $\sqrt{2}^{\sqrt{2}}$ is irrational, and then take $a = \sqrt{2}^{\sqrt{2}}$ and $b = \sqrt{2}$.
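The only computation behind the nonconstructive argument for proposition (P) is the exponent rule. As a brief worked check (standard arithmetic, not taken verbatim from the text): in case (i) the choice $a = b = \sqrt{2}$ gives $a^b = \sqrt{2}^{\sqrt{2}}$, which is rational by the case assumption, while in case (ii) the choice $a = \sqrt{2}^{\sqrt{2}}$, $b = \sqrt{2}$ gives

```latex
\[
  a^{b}
  \;=\; \Bigl(\sqrt{2}^{\,\sqrt{2}}\Bigr)^{\sqrt{2}}
  \;=\; \sqrt{2}^{\,\sqrt{2}\cdot\sqrt{2}}
  \;=\; \sqrt{2}^{\,2}
  \;=\; 2 .
\]
```

Either way $a^b$ is rational with a and b irrational; what the argument does not supply is a way to tell which of the two cases actually occurs.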
However, this proof does not tell us which of the two possibilities (i) or (ii) is valid, so we have not found a particular explicit example of a pair (a, b) satisfying (P). This proof could not be used as a subroutine in a numerical computer program: It yields not one answer, but two possible answers with no method of choosing between them.

Actually, (ii) is true and can be proved constructively, by a much longer argument. By a theorem of Gelfond and Schneider, if a and b are positive algebraic numbers, $a \neq 1$, and b is irrational, then $a^b$ is transcendental. See page 106 of Gelfond [1960]; related results are surveyed by Tijdeman [1976].

FURTHER COMMENTS ON CONSTRUCTIVISM

6.6. Making mathematics constructive. Nonconstructive arguments often can be replaced by constructive ones. Sometimes this is only with difficulty, as in the preceding example with $\sqrt{2}^{\sqrt{2}}$. Sometimes it is easier. For instance, the Trichotomy Law for Real Numbers:

for all real numbers x and y, either x < y or x = y or x > y

is not constructively provable; there is no algorithm that takes constructive descriptions of x and y and yields the assertion of one of those three relations. We shall illustrate and demonstrate this unprovability in two ways in 14.9 and 10.46. However, Bishop [1973/1985] points out that in most applications, the Trichotomy Law is not needed in its full strength; it can be replaced by the following weaker law.

Comparison Law. For any real numbers u, v, and y, if u < v then at least one of u < y or y < v must hold.

This law is constructively provable.

The alterations one makes while translating classical mathematics to constructive mathematics generally have little or no effect on the ultimate applications. For instance, one of the fundamental theorems of classical functional analysis is the Hahn-Banach Theorem; we shall study several versions of this theorem in later chapters. Some versions assert the existence of a certain type of linear functional on a normed space X. The theorem is inherently nonconstructive, but a constructive proof can be given for a variant involving normed spaces X that are separable, i.e., normed spaces that have a countable dense subset. Little is lost in restricting one's attention to separable spaces, for in applied math most or all normed spaces of interest are separable. The constructive version of the Hahn-Banach Theorem is more complicated, but it has the advantage that it actually finds the linear functional in question; see Bridges [1979].

6.7. Constructivists and mainstream mathematicians use the same words in different ways; in fact, different schools of constructivists use the same words in different ways. A basic example is in the meaning of "real number." Mainstream mathematicians have several different equivalent definitions of real numbers (see Chapter 10). One way to define
a real number is as an equivalence class of Cauchy sequences of rational numbers (see 19.33.c). Of course, every real number can be represented as the limit of such a sequence, but such sequences are not essential to our way of thinking about real numbers. But constructivists prefer to indicate a real number by a Cauchy sequence that is accompanied by some estimate of the rate of convergence, e.g., a sequence $(r_n)$ of rational numbers that satisfies $|r_m - r_n| \le \max\{1/m,\ 1/n\}$. In constructivist mathematics, all computations about real numbers are expressed, either directly or indirectly, in terms of such sequences. (Constructivist "real numbers" are discussed further in 10.46.)

Here is a more complicated example of the differences in language: In constructive analysis, the continuous functions that are of chief interest are the uniformly continuous ones. Indeed, it is hard to constructively establish that a function is continuous except by giving a modulus of uniform continuity and thus establishing that the function is indeed uniformly continuous. In the terminology of Bishop and Bridges [1985], a function on a compact interval is continuous if it has a modulus of uniform continuity; i.e., that book's definition of "continuity" is the usual definition of uniform continuity, but the context is one where the two notions are classically equivalent anyway. Of course, in mainstream mathematics, any continuous function on a compact interval is uniformly continuous, but that fact is not provable in constructive mathematics.

Moreover, under certain uses of the language, the following is true:

Ceitin's Theorem. Every function is continuous.

A proof of this startling result can be found on page 69 of Bridges and Richman [1987]. The result is slightly less startling when we consider that, even in mainstream mathematics, any function with certain "good" properties is continuous; theorems to this effect are given in 24.42, 27.28, and 27.45. Among some constructivists, the only functions that can really be called "functions" are the representable ones. And, finally, it is a theorem (in certain axiom systems of constructive mathematics) that every representable function is continuous. The introduction to constructivism given by Bridges and Mines [1984] also discusses the importance of language.

6.8. Constructivism versus mainstream mathematics. Brouwer's intuitionism was more a matter of philosophy than mathematics, and Heyting extended the matter from philosophy to formal logic. Thus, the constructivist viewpoint is foreign to most mathematicians today; we are so used to nonconstructive proofs that we tend to believe one cannot do much interesting mathematics constructively. Indeed, until a few decades ago, we would have been right. But then Bishop [1967] showed how to develop a large portion of analysis constructively. (See also the revised version, Bishop and Bridges [1985].) Since then, several other mathematicians have extended Bishop's style of reasoning and written constructive versions of many other parts of mathematics. This book, which is intended to introduce the reader to the literature, is frequently nonconstructive, since much of the literature is nonconstructive. In particular, the reader may refer to Bridges [1979] for functional analysis, to Beeson [1985] for foundations (i.e., logic and set theory), and to Bridges and Richman [1987] for a recent survey of the several different schools of constructivism.
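The two constructive ideas discussed in 6.6 and 6.7, a real number given as a rational sequence with an explicit rate of convergence, and the Comparison Law as a terminating procedure, can be sketched in a few lines. This is only an illustration under stated assumptions, not Bishop's formal development: the function names `sqrt2_approx` and `compare` are hypothetical, u and v are taken to be rationals for simplicity, and the real number y is assumed to be supplied as a function n -> r_n with |r_n - y| <= 1/n.

```python
from fractions import Fraction


def sqrt2_approx(n: int) -> Fraction:
    """Rational r_n with |r_n - sqrt(2)| <= 1/(2n), found by bisection.

    Consequently |r_m - r_n| <= 1/(2m) + 1/(2n) <= max(1/m, 1/n),
    which is the rate-of-convergence condition quoted in the text.
    """
    lo, hi = Fraction(1), Fraction(2)      # 1 < sqrt(2) < 2
    while hi - lo > Fraction(1, 2 * n):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid                        # sqrt(2) lies in [mid, hi]
        else:
            hi = mid                        # sqrt(2) lies in [lo, mid]
    return lo


def compare(u: Fraction, v: Fraction, y) -> str:
    """Comparison Law: given u < v, report one true statement among
    'u < y' and 'y < v'.  Only one approximation of y is inspected,
    so the procedure always terminates (unlike deciding x < y, x = y, x > y).
    """
    if not u < v:
        raise ValueError("the Comparison Law requires u < v")
    n = int(2 / (v - u)) + 1                # guarantees 1/n < (v - u)/2
    mid = (u + v) / 2
    # If r_n > mid then y >= r_n - 1/n > u; otherwise y <= r_n + 1/n < v.
    return "u < y" if y(n) > mid else "y < v"
```

When both u < y and y < v happen to hold, either answer is acceptable; the procedure simply returns whichever one its single approximation certifies.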
Despite its growing literature, constructivism remains separated from the mainstream of mathematics. This may be largely because constructivism's finer distinctions necessitate a use of language quite different from, and more complicated than, that of the mainstream mathematician. In particular, constructivists distinguish between notions that the classical mathematician is accustomed to viewing as identical. For instance, among some constructive analysts, $x \neq y$ simply means the negation of x = y, while x # y means² the slightly stronger condition of apartness: We can find a positive lower bound for the distance between approximations to x and y. Consequently, a mainstream mathematician can only learn constructivism by relearning his or her entire language, a sizable undertaking.

² Caution: Some constructive analysts use $x \neq y$ to denote apartness and use $\neg(x = y)$ to denote inequality.

Some philosophical questions deserve at least a brief mention here, although we shall not address them in any depth. Bishop [1973/1985] suggested that mainstream mathematicians, in pursuit of form, have lost track of content. Bishop exhorted mathematicians to return to a more meaningful mathematics. Perhaps the contentless mathematics that he condemned would include the intangibles studied elsewhere in this book (free ultrafilters, etc.), which lack examples and do not seem to be a direct reflection of anything in the "real world." However, an argument can be made for the conceptual usefulness of such objects. For instance, free ultrafilters provide a basis for nonstandard analysis, which yields new insights into calculus and other limit arguments. Moreover, we may be surprised by just what kinds of mathematics can reflect the real world; for instance, Augenstein [1994] suggests that the Banach-Tarski Decomposition may be a useful model of some interactions of subatomic particles.

Both constructive and nonconstructive thinking have their advantages. A constructive proof may be more informative (e.g., it tells us that $\sqrt{2}^{\sqrt{2}}$ is irrational; see 6.5), but a nonconstructive proof is often quicker and simpler. Extending a metaphor of Urabe: To feed one's family, it is not enough to prove that a certain pond contains a fish; ultimately one must catch the fish. On the other hand, it would be helpful to have an inexpensive device that quickly and easily determines which ponds contain fish.

6.9. Much of this book is concerned with nonconstructive mathematics. Moreover, to better understand some of the nonconstructible objects studied in this book, we shall sometimes find it helpful to vary the amount and kind of nonconstructiveness that we are willing to accept. For instance, we may compare results requiring the Axiom of Choice with results that only require a weakened form of the Axiom of Choice. At first glance, that looks like a rather strange notion; after all, either we can find a certain mathematical object, or we can't. How can we say that one object is harder to find than another object, when in fact we can't find either of them? The metaphor of "oracles" was introduced in recursion theory by Turing [1939] (see the discussion by Enderton [1977 recursion theory]); a similar metaphor may be helpful in the present context.

Imagine we have access to an oracle, who has frequent conversations with some deity. We present the oracle with various questions that we have been unable to answer by merely mortal, human methods. The oracle is able and willing to answer some, but not all, of these questions. For instance, the oracle might tell us whether Goldbach's
conjecture is true, but refuse to comment on the Riemann Hypothesis. In some cases, if the oracle gives us an answer to question A, we may use that information to deduce an answer to question B even if the oracle has not given us an answer to B. Thus, one answer may be stronger than another. Similarly, two answers may be considered equivalent to each other if each is stronger than the other, i.e., if either answer would enable us to deduce the other.

It must be emphasized that when we use the oracle's answer to A to deduce an answer to B, then we are using human, mortal reasoning; i.e., the oracle is not helping in such deductions. Thus, our relation of one answer being stronger than another is determined without the aid of the oracle; this relation does not depend on our actually having answers to either of the questions A or B. It is these relations between the answers, not the actual answers themselves, that will concern us later, when we compare different levels of nonconstructiveness. Since the oracle is not actually used to determine and compare those different levels, we may now dispense with the oracle altogether. In some of the literature, such an oracle is referred to as a "limited principle of omniscience."

6.10. Proposition. The Axiom of Regularity implies the Law of the Excluded Middle, if interpreted in constructivist terms. (Hence the Axiom of Regularity is nonconstructive.)

Remark. The following proof is modified from Beeson [1985]. Since most readers of this book probably are not familiar with constructivist language, we shall restate the proof in terms of the oracle metaphor of 6.9. Interpreted in constructivist terms, the Axiom of Regularity says that we have an oracle of the following type: We may describe to the oracle some nonempty set S, in terms that do not necessarily give a clear understanding of the set but that do at least uniquely determine the set. Then the oracle will specify to us some element $x \in S$ such that $x \cap S = \varnothing$.

Proof. Let P be a proposition (such as Goldbach's conjecture) that we can state precisely, but that we do not necessarily know to be true or false. Now define

$$S = \begin{cases} \{\varnothing, \{\varnothing\}\} & \text{if } P \text{ is true,} \\ \{\{\varnothing\}\} & \text{if } P \text{ is false.} \end{cases}$$

Then S is nonempty, since $\{\varnothing\} \in S$. The oracle will tell us either "$\varnothing$ is a member of S that does not meet S," in which case P is obviously true, or "$\{\varnothing\}$ is a member of S that does not meet S," in which case we can deduce that P is false. Thus, the oracle can be used to deduce the truth or falsehood of any proposition P.

The Axiom of Choice, if interpreted in the language of constructivism, can also be shown to imply the Law of the Excluded Middle. The proof of this implication, though short, depends on a deeper understanding of constructivist language; it does not translate readily into the language of mainstream mathematicians. We omit it here; it is given by Beeson [1985] and Bridges and Richman [1987].
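The bookkeeping in the proof of 6.10 can be traced mechanically with hereditarily finite sets. The sketch below is only an illustration of which element the oracle is forced to return in each case; the names `build_S`, `regularity_oracle`, and `decide` are hypothetical, and the program of course has to be told the truth value of P in order to build S, which is exactly the information a real "oracle" would be supplying.

```python
EMPTY = frozenset()                 # the empty set
SINGLETON = frozenset({EMPTY})      # the set {emptyset}


def build_S(p_is_true: bool) -> frozenset:
    """The set S from the proof: {∅, {∅}} if P is true, {{∅}} if P is false."""
    return frozenset({EMPTY, SINGLETON}) if p_is_true else frozenset({SINGLETON})


def regularity_oracle(S: frozenset) -> frozenset:
    """Return some member x of S with x ∩ S = ∅ (found here by brute force)."""
    for x in S:
        if not (x & S):
            return x
    raise ValueError("no member of S is disjoint from S")


def decide(S: frozenset) -> bool:
    """Read off the truth value of P from the oracle's answer."""
    return regularity_oracle(S) == EMPTY
```

When P is true, {∅} meets S (both contain ∅), so the only admissible answer is ∅; when P is false, ∅ is not in S at all, so the answer must be {∅}. In both cases the answer reveals P.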
and 19.29. 17. there exists a function f " A ~ U ~ e i X~ satisfying f ( s c X~ for each A. . abbreviated ZF + AC. (A much longer list of equivalents is given by Rubin and Rubin [1985]. ( A C 2 ) Set of R e p r e s e n t a t i v e s . and 15. If {X~ 9A c A} is a nonempty set of nonempty sets. The Axiom of Choice is "obviously true. and definability in the sense of L6vy are far outside the mainstream of thinking of most analysts. we shall study several in this and later chapters.19 the proof of equivalence of these three principles. Let {X~ 9A E A} be a nonempty set of nonempty sets that are pairwise disjoint. etc. In fact. 10.76 we shall introduce "quasiconstructibility. That is. (AC3). For instance. The A x i o m of C h o i c e has many equivalent forms. Let X be a nonempty set. constructibility in the sense of GSdel." in that it agrees with the intuition of most mathematicians. In 14.13. is called a choice f d n c t i o n ..) We shall denote our equivalents of Choice by (AC1).13.29. then we say AC is used unconsciously or implicitly. Conventional set theory is ZermeloFraenkel set theory plus the Axiom of Choice. Then for each nonempty subset S C_ X it is possible to choose some element s c S. An object x0 is said to be definable if there exists a proposition P(x) in firstorder logic for which x = x0 is the unique element for which P(x) is true.46.47. there exists a function f that assigns to each nonempty set S c_ X some representative element f ( S ) E S. We described ZermeloFraenkel set theory in 1. A function f that specifies choices. Constructivism (in the sense of Errett Bishop) will be discussed further in 6. (AC2).The Meaning of Choice 139 6. AC is so much a part of the way of thinking of most mathematicians that it can easily sneak into a proof unnoticed.48. We postpone until 6.16. (AC4). (AC3) N o n e m p t y P r o d u c t s . in this or similar contexts. Each nonempty set S c_ X certainly contains some element s. See L6vy [1965]. That is. 
Constructibility in the sense of Bishop. Most of these equivalents are discussed in next few pages. and thus to define f(S) it suffices to "just pick any such s. consider (AC1). Then there exists a set C containing exactly one element from each X~." It requires only a small stretch of the imagination to make all such choices simultaneously and thus to define the function f. THE MEANING OF CHOICE 6. collectively we shall refer to them as AC." which is (in this author's opinion) closer to the way that most analysts think. Here are three of the simplest forms of Choice: ( A C 1 ) C h o i c e F u n c t i o n for S u b s e t s . then the Cartesian product YI)~CA XA is nonempty.11.12. Logicians have another notion that is similar to constructibility. and Tychonov's Theorem and similar results on product topologies in 15. A few more equivalents are the Vector Basis Theorem in 11.
. . in fact. Canonical choice functions are sometimes available.) In situations where we need the Axiom of Choice. This is discussed further in 14. For instance. we cannot establish the existence of infinitely many choices. However. However. in other cases we cannot find f explicitly.e.N . then the choices are said to be c a n o n i c a l . first) element of S. the simplest instance of an object that exists but that we cannot illustrate with a specific example. by some completely describable procedure or rule. The reader is urged to try for a moment to think of an explicit choice function f for IR. perhaps the simplest i n t a n g i b l e i.77 and 6. (See Moore [1982].140 Chapter 6: Constructivism and Choice Cantor and other mathematicians used Choice implicitly in their early work in set theory in the late 19th century. W i t h o u t assuming V . There exists a function f that assigns to each nonempty set S C_ I~ some element f ( S ) c S. and. . But what about a choice function that works for all nonempty subsets of I~? No explicit choice function has ever been found 3 for I~. More complicated answers will choose points from larger collections of sets.e. If a choice function can be given explicitly. For instance. for analysts.34). then ~ becomes a well ordering of ]~.. rather than just a consequence of definitions.15. it can be proved that no explicit choice function ever will be found for R (see 14.. 3Logicians and set theorists m a y have a slightly different view of this m a t t e r .. then every set S with nonempty interior contains some rational number and so we may take f (S) to be the first rational number in S. then we can satisfy (AC1) by taking f ( S ) to be the smallest (i. we can write down a formula ~ t h a t has the following property: W h e n we assume V = L. except by giving an example or applying some nonconstructive principle such as AC. 2 . one specialization of AC is this principle" ( A C R ) A x i o m of C h o i c e for t h e R e a l s . 
when S is a bounded nonempty interval. we can choose the socks using a slightly weakened form of Choice discussed in 6. or even the existence of one choice. only in 1908 did Zermelo become aware of this assumption in their work and explicitly formulate it as an axiom. The Axiom of Choice makes selections for us that we do not know how to make for ourselves. usually there are infinitely many choices available. . To see why. otherwise the choices are said to be a r b i t r a r y ." Bertrand Russell illustrated it with this example: To select one sock from each of infinitely many pairs of socks requires the Axiom of Choice.L or the Axiom of Choice or any of its relatives such as DC. Hence we need to assume a principle such as AC or ACR to tell us that such a function f exists.{ 1 . for instance.g. ~ provides an "explicit example" of a well ordering of R and hence an "explicit example" of a choice function for R . CC. Sah [1990] calls it a "mathematicians' Maxwell demon. let us consider the choice function f in (AC1). Thus. Such a function is. For instance. etc.54). 3 . one way to choose shoes canonically would be to take all the left shoes. then let f (S) be the midpoint of that interval. if X . Some partial solutions suggest themselves e. . UF. but for shoes the Axiom is not needed. (Actually. } .77.) AC is so very "obviously true" that the reader may wonder why it is considered to be an axiom. for t h e y are more comfortable with the Axiom of Constructibility V = L (introduced in 5. let (r~) be an enumeration of the rationals. socks do not require the full strength of the Axiom of Choice.
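The three canonical (explicit) choice rules mentioned in this discussion, the smallest element of a set of natural numbers, the midpoint of a bounded interval, and the first rational (in a fixed enumeration) lying in a set with nonempty interior, are easy to write down as procedures. The following is a hedged sketch: the function names are hypothetical, the particular enumeration of the rationals is just one convenient choice, and a set of reals is represented by a membership predicate `contains`, so the last rule terminates only when the set actually contains a rational.

```python
from fractions import Fraction
from itertools import count
from math import gcd


def choose_smallest(S):
    """Canonical choice for a nonempty set of natural numbers: its first element."""
    return min(S)


def choose_midpoint(a: Fraction, b: Fraction) -> Fraction:
    """Canonical choice for a bounded interval with endpoints a < b."""
    return (a + b) / 2


def rationals():
    """One fixed enumeration (r_n) of all rationals, each listed exactly once."""
    yield Fraction(0)
    for s in count(2):                  # s = numerator + denominator
        for q in range(1, s):
            p = s - q
            if gcd(p, q) == 1:          # p/q is in lowest terms
                yield Fraction(p, q)
                yield Fraction(-p, q)


def choose_first_rational(contains) -> Fraction:
    """First rational of the enumeration that lies in the set {x : contains(x)}."""
    for r in rationals():
        if contains(r):
            return r
```

None of these rules extends to an arbitrary nonempty subset of the reals, which is the point of (ACR): for a set with no such describable feature the procedure has nothing to work with.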
6.13. A defective "proof" of Choice. The reader may find it instructive to consider the following "proof" of (AC3). If each $X_\lambda$ contains exactly one element $x_\lambda$, then we can conclude $\prod_{\lambda \in \Lambda} X_\lambda$ is nonempty without using any existential axiom: We know that $\prod_{\lambda \in \Lambda} X_\lambda$ contains the function x that assigns to each coordinate $\lambda$ the value $x_\lambda$. If we make all the $X_\lambda$'s larger, then this could only make $\prod_{\lambda \in \Lambda} X_\lambda$ larger, and so it would still be nonempty; thus the Axiom of Choice is "proved."

The flaw in this reasoning is a subtle one: By what method do we "make all the $X_\lambda$'s larger?" If the enlarged set $X_\lambda$ still contains the original member $x_\lambda$, and if we still know which of the elements of $X_\lambda$ is that original member $x_\lambda$, then indeed the original function x would still be available to us. But if we lose track of the $x_\lambda$'s when we enlarge the $X_\lambda$'s and all we know about the enlarged sets is that they are nonempty, then we no longer have an explicit formula or rule for choosing one element from each $X_\lambda$. We have no way to "construct" the function x except via some additional assumption such as (AC3). Thus, intuitively it is obvious that $\prod_{\lambda \in \Lambda} X_\lambda$ is nonempty; in fact, in most cases of interest, $\prod_{\lambda \in \Lambda} X_\lambda$ contains infinitely many elements. But we may be unable to find any particular element of $\prod_{\lambda \in \Lambda} X_\lambda$.

VARIANTS AND CONSEQUENCES OF CHOICE

6.14. To understand better what the Axiom of Choice is really assuming, let us contrast AC with several related principles, some of which are much weaker. The weakest of these is:

Finite "Axiom" of Choice. If n is a positive integer and $S_1, S_2, \ldots, S_n$ are nonempty sets, then $S_1 \times S_2 \times \cdots \times S_n$ is nonempty.

Although this principle is sometimes called an axiom, it really is not an axiom, for it follows "for free" from conventional logic and the axioms of ZF set theory without any additional assumptions. Indeed, ordinary mathematical logic permits us to apply an operation finitely many times. Each $S_i$ is nonempty; hence it contains (and we can choose from it) some element $s_i$. Repeat this operation n times; then $(s_1, s_2, \ldots, s_n)$ is a member of the product. We only need the Axiom of Choice, or some form of the Axiom of Choice, when we need to make infinitely many arbitrary choices. Of course, we can make a canonical choice.

6.15. Following are two variants of Choice that do not follow "for free" from logic and ZF set theory:

(ACF) Axiom of Choice for Finite Sets. Let $\mathcal{C}$ be a set whose members are nonempty finite sets. Then it is possible to choose some member s from each set $S \in \mathcal{C}$.

(MC) Multiple Choice Axiom. Let $\mathcal{C}$ be a set whose members are nonempty
sets. Then it is possible to choose some nonempty finite subset F from each set $S \in \mathcal{C}$.

Obviously AC is equivalent to ACF + MC. Actually, it can be proved that the Multiple Choice Axiom by itself is equivalent to the Axiom of Choice. The proof requires the Axiom of Regularity, unlike most other proofs mentioned in this book. The proof is too long to include here; it can be found in Jech [1973] or in Rubin and Rubin [1985].

ACF, though weaker than the Axiom of Choice, is still strong enough to act as a mathematical "Maxwell's demon," making choices for us that we cannot make for ourselves. For instance, ACF is strong enough to choose Bertrand Russell's socks (see 6.12). It can be shown that ACF implies the Law of the Excluded Middle (introduced in 6.4). A short proof of this implication is given by Goodman and Myhill [1978], but we shall not reproduce it here because it does not translate readily into the language of mainstream mathematics (see 6.8).

The Axiom of Choice for Finite Sets must not be confused with the Finite Axiom of Choice, although the names are similar.

6.16. Pathological consequences of AC. The Axiom of Choice is a nonconstructive assertion of existence: It postulates the existence of certain objects without giving any indication of how to find those objects. There may not even be a way to find those objects. We shall see in 14.77 that many of the objects generated by AC are intangibles, i.e., objects for which no constructive examples can ever be given. Moreover, some of the intangible consequences of the Axiom of Choice are pathological, i.e., they are so very different from familiar, constructible examples that they are contrary to our intuition. Perhaps the most dramatic of these pathologies is:

Banach-Tarski Decomposition. The closed unit ball in three dimensions,

$$B = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 \le 1\},$$

can be partitioned into finitely many pieces, which can be rearranged by rigid motions (i.e., rotations and translations) and recombined to form two closed unit balls, each identical to the original ball B.

At first glance, the Banach-Tarski Decomposition seems preposterous. It blatantly contradicts our intuition about the conservation of mass or volume. In fact, the theorem above is often called the "Banach-Tarski Paradox." However, in mathematics, the term "paradox" usually refers to an impossibility; the Banach-Tarski Decomposition only appears impossible at first. Actually, its paradoxical appearance can be explained away. The ordinary "volume" of a subset of $\mathbb{R}^n$ is its n-dimensional Lebesgue measure. That number is defined if the set is Lebesgue measurable, but, as we shall see later in this book, not all subsets of $\mathbb{R}^n$ are Lebesgue measurable. In particular, the pieces in the Banach-Tarski Decomposition are not Lebesgue measurable. Thus, the Banach-Tarski Decomposition does not actually violate any rules concerning volume. It simply tells us that, if we accept the Axiom of Choice, then the rules of
volume are more complicated than we might like. The intuition about volumes that we have obtained from our experience with everyday macroscopic objects in the real, physical world is only applicable to some, not all, subsets of the mathematical world $\mathbb{R}^3$. (In fact, that intuition is not even applicable to submicroscopic objects in the real world; Augenstein [1994] suggests that the Banach-Tarski Decomposition is a possible model for some kinds of interactions of subatomic particles.)

The existence of Lebesgue nonmeasurable sets is the chief reason that measure theory is generally developed for an algebra or $\sigma$-algebra $\mathcal{S}$ of subsets of a set X, rather than in the simpler setting of all subsets of X. We shall not prove the Banach-Tarski Decomposition Theorem in this book (a proof and much related material are given by Wagon [1985]), but in 21.22 we shall give Vitali's classical short proof of the existence of a Lebesgue nonmeasurable set.

Actually, the Banach-Tarski Decomposition does not require the full strength of AC. A recent proof of Pawlikowski [1991] shows that the Banach-Tarski Theorem is implied by the Hahn-Banach Theorem, a weakened form of Choice that will be studied extensively later in this book. We shall not give Pawlikowski's proof, but a crucial ingredient of that proof is Luxemburg's Boolean reformulation of the Hahn-Banach Theorem, which we shall prove in 23.19.

Most mathematicians have learned to live with such pathological consequences of the Axiom of Choice, feeling that the pathological consequences are outweighed by the advantages of AC.

6.17. What makes AC true or false? When we accept the Axiom of Choice, we declare that we intend to treat certain mathematical objects as if they exist, regardless of whether we can find examples of those objects. This implies a particular interpretation of some words, such as "choose," "exist," and "function." Constructivist mathematics and classical (mainstream) mathematics give us two intuitive interpretations of language, which make the Axiom of Choice either false or true.

Axiomatic set theory takes a more rigorous approach that does not rely on intuitive interpretations. Axiomatic set theory divests symbols and words such as $\in$, $\subseteq$, and "set" of their usual meanings and investigates how certain relations between the meaningless symbols and words imply certain other relations. In axiomatic set theory, one is not concerned with "true" or "false" (because ultimately these things are unknowable), but only with "implies." With this viewpoint, AC is simply another axiom that we may accept or reject. Alternative axiom systems are also possible, and some of them are just as consistent as conventional set theory.

Although we shall use AC freely throughout most of this book, in a few brief discussions we shall also consider some of its alternatives. Even if the reader is a "firm believer" in the Axiom of Choice, nevertheless there are strong reasons for considering its alternatives: Such considerations will improve our understanding of the consequences of Choice. Some of the alternatives to Choice, though not compatible with AC itself, are at least compatible with weakened forms of AC, which have important consequences in functional analysis; thus we shall also study such weakened forms.
( A C 4 ) Well O r d e r i n g P r i n c i p l e ( Z e r m e l o ) . 4) be a poset. Then X has a ~maximal element. Let (X. The oracle metaphor in 6. we shall keep track of its uses in some parts of this book. and (AC3).10. Every set can be well ordered.9 may be helpful in understanding these "effective" proofs. (ACI) Hint for (AC4) Hint for (AC5) Hint for =~ (AC4)" Use 3. ~) be a poset. (ACS) W e a k e n e d Z o r n L e m m a . To clarify the role played by Choice. suppose that 9" has finite character (as defined in 3. S x {S} is a bijective copy of S. Two statements are e q u i v a l e n t (or effectively equivalent) if each can be proved effectively from the other. (AC6) Maximal Chain Principle (Hausdorff). Then any member of 9~ is a subset of some C_maximal member of 9".46). A proof is effective if it does not use AC or consequences of AC except as explicitly stated hypotheses. 6. ~) be a poset. Assume every ~chain in X has a ~upper bound in X. =~ (AC6)" The 4chains form a collection of finite character. Let (X. Zorn. T e i c h m u l l e r ) . Maximal principles. ( A C 5 ) F i n i t e C h a r a c t e r P r i n c i p l e (Tukey. and let 9" be a collection of subsets of X. Relabel copies of the subsets of X so that they are all disjoint. The preceding discussions may have made clear just what the rules are that govern proofs of equivalence. with ? ( S )  f(X\S). Then X has a 4maximal element. (ACT) Z o r n ' s Lemma (Hausdorff. since {S} is a singleton and the sets S x {S} are disjoint subsets of X x T(X).20. Kuratowski.19. Let (X. For instance. Let X be a set. =~ (AC7): Use the upper bound of the maximal chain. o t h e r s ) . By a "partial choice function" for ft we shall mean a function f whose domain is some collection of nonempty subsets of (AC6) Hint for (AC7) Proof of (AC8) Hint for . Assume every subset of X that is directed by 4 has a 4upper bound in X. =~ (AC8): Any chain is a directed set. 
=~ (AC1).144 Chapter 6: Constructivism and Choice SOME EQUIVALENTS OF CHOICE 6. 6. (AC2). =~ (ACb)" Use the theorem in 3. The reader may now try to prove the equivalence of (AC1).18.43.46. Then any ~chain in X is included in a C_maximal ~chain. The following statements are equivalent to the Axiom of Choice. Hint for (AC2) ~ (AC1): See 1. Let f~ be a nonempty set.
$\Omega$, satisfying $f(S) \in S$ for each $S \in \mathrm{Dom}(f)$. Let X be the collection of partial choice functions; partially order X by taking $f \preccurlyeq g$ if $\mathrm{Graph}(f) \subseteq \mathrm{Graph}(g)$. It follows from the Finite Axiom of Choice (see 6.14) that X is nonempty and that any maximal element of X must be a function f with domain $\mathcal{P}(\Omega) \setminus \{\varnothing\}$; thus it suffices to show that X has a maximal element. Verify that the hypotheses of (AC8) are satisfied.

6.21. We ask again: Is the Axiom of Choice "true?" According to Bona [1977],

The Axiom of Choice is obviously true, the Well Ordering Principle is obviously false, and who can tell about Zorn's Lemma?

The joke is that the three principles are equivalent, as we have just seen. Still, there is a point to the jest: Our intuition isn't reliable here.

Bona's aphorism does agree with most mathematicians' intuition. The Axiom of Choice seems true, because the Axiom of Choice is worded in such a way that the simultaneity of the choices has little psychological impact. Indeed, (AC1) seems true because we can "just pick any $s \in S$," as we noted in 6.12. In contrast, the Well Ordering Principle seems false, since the simultaneity of choices is built into the well ordering. Well orderings are quite difficult to find. Indeed, a partial ordering chosen "at random" generally is not a well ordering, and we are altogether unable to find an explicit well ordering for $\mathbb{R}$. Finally, Zorn's Lemma is too complicated to seem "obviously true" or "obviously false" to most mathematicians, although of course those who use it repeatedly become accustomed to it and begin to think of it as "true."

6.22. Choice and cardinality. The Axiom of Choice and its equivalents deal with infinite sets. We understand finite sets fairly well, but it is difficult to extrapolate from finite sets and describe how infinite sets should behave; several different descriptions seem equally plausible. Our first few results about cardinality (the Schröder-Bernstein Theorem, Cantor's Theorem, etc.) did not depend on the Axiom of Choice and may have given the impression that every set has some definite "size," definable in some absolute way. Thus it may be surprising that some basic properties of cardinality are actually equivalent to the Axiom of Choice. In fact, each of the following statements is equivalent to the Axiom of Choice. For this section, let $|A|$ denote the cardinality of a set A.

(AC9) Well Ordering of Cardinals. Comparison of cardinalities is a well ordering. That is, if $\mathcal{S}$ is a set whose elements are sets, then there is some $S_0 \in \mathcal{S}$ that satisfies $|S_0| \le |T|$ for all $T \in \mathcal{S}$.

(AC10) Trichotomy of Cardinals. Comparison of cardinalities is a chain ordering. That is, for any two sets S and T, precisely one of these three conditions holds: $|S| < |T|$, $|S| = |T|$, $|S| > |T|$.

(AC11) Comparability of the Hartogs Number. If H(S) is the Hartogs number of a set S, then $|H(S)|$ and $|S|$ are comparable; i.e., one is bigger than or equal to the other (and hence $|H(S)| > |S|$).

(AC12) Squaring of Cardinals. If X is an infinite set, then $|X \times X| = |X|$.
(AC13) Multiplication of Cardinals. If X is an infinite set, Y is a nonempty set, and |X| ≥ |Y|, then |X × Y| = |X|.

(AC14) If X and Y are disjoint sets, |X| ≥ |ℕ|, and Y is nonempty, then |X ∪ Y| = |X × Y|.

(AC15) If X is an infinite set, then the cardinality of X is equal to the cardinality of ⋃_{n=1}^∞ Xⁿ = {finite sequences in X}.

Proofs. We shall first prove that (AC4), (AC9), (AC10), and (AC11) are equivalent. For a proof of (AC4) ⇒ (AC9), use the results on ordinals in Chapter 5. The implication (AC9) ⇒ (AC10) is obvious. The proof of (AC10) ⇒ (AC11) is immediate from the definition of the Hartogs number. Finally, for a proof of (AC11) ⇒ (AC4), note that we have an injection from any set X into a well ordered set H(X).

Next we shall prove that (AC12), (AC13), and (AC15) are equivalent. A proof of (AC4) ⇒ (AC12) is immediate from a result in Chapter 3. For a proof of (AC12) ⇒ (AC13), choose any y₀ ∈ Y. Then |X| = |X × {y₀}| ≤ |X × Y| ≤ |X × X| = |X|. For a proof of (AC13) ⇒ (AC15), show by induction that |Xⁿ| = |X| for all positive integers n. Hence ⋃_{n=1}^∞ Xⁿ has the same cardinality as the union of ℕ disjoint copies of X, i.e., the same cardinality as X × ℕ. Finally, (AC15) ⇒ (AC12) is obvious.

To prove (AC13) ⇒ (AC14), pick any y₀ ∈ Y and any object v that is not in X. Then |X| = |X ∪ {v}| by a result in Chapter 2. Hence

|X ∪ Y| ≤ |(X × {y₀}) ∪ ({v} × Y)| ≤ |(X ∪ {v}) × Y| = |X × Y| ≤ |(X ∪ Y) × Y| = |X ∪ Y|,

where the last equation follows from (AC13) since |X ∪ Y| ≥ |Y|.

Finally, we shall show that (AC14) ⇒ (AC11); this proof takes a bit longer but it will complete our cycle of equivalences. Let any set S be given; let H = H(S) be its Hartogs number; we wish to show that |H| and |S| are comparable. We may assume S is not finite. It follows easily that H is infinite also. Since H is an infinite ordinal, we have |H| ≥ |ℕ| by a result in Chapter 3. By relabeling, let Y be a copy of S (i.e., a set with the same cardinality as that of S) that is disjoint from H. By (AC14), we have |H ∪ Y| = |H × Y|. Therefore there exists a bijection β : H ∪ Y → H × Y. Thus H₁ = β(H) and Y₁ = β(Y) are disjoint sets whose union equals H × Y and such that |H₁| = |H| and |Y₁| = |Y| = |S|. We now consider two cases.

In the first case, there exists some y ∈ Y such that H × {y} ⊆ Y₁. In that case, the mapping h ↦ (h, y) is an injection from H into Y₁, proving that |H| ≤ |Y₁| = |S|.

In the second case, there is no such y. Hence for every y ∈ Y there exists at least one h ∈ H such that (h, y) ∉ Y₁, i.e., such that (h, y) ∈ H₁. Since H is well ordered, let a(y) be the first such h. Thus y ↦ (a(y), y) is an injection from Y into H₁, and therefore |S| = |Y| ≤ |H₁| = |H|.
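For the particular set X = ℕ, the squaring property in (AC12) needs no appeal to Choice, because an explicit bijection between pairs and single numbers is available. The sketch below uses the Cantor pairing function, a standard construction that is not taken from the text above; the names `pair` and `unpair` are our own, and for convenience the code works with {0, 1, 2, ...} rather than {1, 2, 3, ...} (shifting by 1 converts one to the other).

```python
from math import isqrt

def pair(m, n):
    """Cantor pairing: maps (m, n) with m, n >= 0 to a single integer >= 0."""
    w = m + n                      # index of the anti-diagonal containing (m, n)
    return w * (w + 1) // 2 + n

def unpair(z):
    """Inverse of pair: recovers (m, n) from z >= 0."""
    w = (isqrt(8 * z + 1) - 1) // 2   # largest w with w(w+1)/2 <= z
    n = z - w * (w + 1) // 2
    return (w - n, n)
```

Because `unpair(pair(m, n))` returns `(m, n)` and every nonnegative integer is hit exactly once, the two functions witness |ℕ × ℕ| = |ℕ| directly. For a general infinite set X no such formula is available, which is where (AC12) acquires its strength.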
6.23. How can we define the "cardinal number" of an infinite set? In 2.16, we defined the cardinal number of a finite set, but for infinite sets we merely indicated how to compare cardinalities. We would like to define an object "card(S)" separately for every set S in such a way that our definition of "card(S) ≤ card(T)" in 2.16 remains valid.

One naive approach would be to observe that equality of cardinality is an equivalence relation; thus it would seem that we can define the "cardinal number" of a set to be the equivalence class to which that set belongs. However, this approach involves an equivalence relation on the class of all sets, which is a very large proper class, perhaps too large for some purposes.

If we assume the Axiom of Choice, then we can make a canonical selection from each equivalence class, as follows: Every set can be well ordered, and hence has the same cardinality as some ordinal. There may be many such ordinals (for instance, ω, ω + 1, ω + 2, etc., all have the same cardinality) but we can choose canonically among such ordinals, by taking the first such ordinal. Thus, for any set S, let card(S) be the first ordinal that has the same cardinality as S. It is a cardinal number (i.e., an initial ordinal), as defined in Chapter 5. With this definition, "card" is a function of classes, from the collection of all sets (a proper class) to the collection of all ordinals (another proper class). Note in particular that if S is a cardinal number, then card(S) = S.

The preceding definition uses the Axiom of Choice but not the Axiom of Regularity. Some mathematicians may prefer the following alternate definition, which uses Regularity but not Choice; it follows Enderton [1977 set theory]. Given any set S, there is some stage in which S occurs (stages are defined in Chapter 5). Say S ∈ Stage(α). Recall that each stage is a set, not a proper class. Let β be the first ordinal with the property that some set T ∈ Stage(β) satisfies card(T) = card(S), i.e., the first ordinal with the property that there exists a bijection between S and some member of Stage(β). Now let

kard(S) = {T ∈ Stage(β) : card(T) = card(S)}.

Then kard(S) is a set (not a proper class), uniquely determined by S, and two sets have the same "kardinality" if and only if they have the same cardinality. However, when S is an initial ordinal, it is not equal to kard(S).

6.24. Kelley's Choice. In later chapters, the Axiom of Choice will be used to prove certain important topological principles. Some of these principles also imply AC and thus are equivalent to it. We shall now sketch a general argument, which will be used several times in later chapters to prove that certain topological principles imply (AC3). This type of argument apparently was first used by Kelley [1950].

Let {S_λ : λ ∈ Λ} be a nonempty set of nonempty sets; we wish to show that ∏_{λ∈Λ} S_λ is nonempty. For each λ, let ξ_λ be some object that is not an element of S_λ. It does not really matter what we choose for ξ_λ, so long as it is not a member of S_λ. (For instance, we could take ξ_λ = S_λ, since S_λ ∉ S_λ. Thus, the ξ_λ's can be selected without making any arbitrary choices. However, in a proof we present in Chapter 17, it is probably better not to think of ξ_λ as being equal to S_λ, for such an assignment is an irrelevant distraction.) Let Y_λ = S_λ ∪ {ξ_λ}, let X = ∏_{λ∈Λ} Y_λ, and let π_λ : X → Y_λ be the λth coordinate projection. Obviously the function ξ defined by π_λ(ξ) = ξ_λ is an element of X =
∏_{λ∈Λ} Y_λ, which is therefore nonempty.

Now let M be any finite subset of Λ. By the Finite Axiom of Choice (see 6.14), ∏_{λ∈M} S_λ is nonempty. Then the set T_M = ⋂_{λ∈M} π_λ⁻¹(S_λ) is a nonempty subset of X, as it includes (∏_{λ∈M} S_λ) × (∏_{λ∈Λ\M} {ξ_λ}). Observe that T_M ∩ T_N = T_{M∪N}; hence the collection of sets ℬ = {T_M : M is a finite subset of Λ} is a filterbase on X.

The remainder of the argument is topological and takes a different form for different topological principles. The details cannot be given here, but will be given in Chapters 15, 17, and 19. In brief: We equip each Y_λ with some simple topology (e.g., the discrete topology, the indiscrete topology, or the "knob" topology) and let X = ∏_{λ∈Λ} Y_λ be equipped with the product topology. Some assumed topological principle is then used to prove that ∏_{λ∈Λ} S_λ = ⋂_M T_M is a nonempty subset of X. Our presentation follows that of Jech [1973].

COUNTABLE CHOICE

6.25. An important weakened form of Choice is:

(CC) Axiom of Countable Choice. We can choose representative elements from a sequence of nonempty sets. In other words, if S₁, S₂, S₃, ... is a sequence of nonempty sets, then ∏_{n=1}^∞ S_n is nonempty, i.e., we can choose a sequence (x₁, x₂, x₃, ...) with x_n ∈ S_n for each n.

Countable Choice is strictly weaker than the Axiom of Choice (see Jech [1973]). However, Countable Choice is strong enough for many applications; for instance, Garnir, De Wilde, and Schmets [1968] develop a sizable portion of functional analysis using this axiom rather than the Axiom of Choice.

In fact, CC is so weak that the reader may again ask why this is an axiom, rather than just an "obviously true" statement or a consequence of definitions. To answer that question, we shall contrast CC with countable recursion (see examples in 2.23). Using either CC or recursion, one constructs a sequence (x_n). A recursive definition only allows one possible value for each x_n; the resulting sequence (x_n) is uniquely determined, and so no choices need to be made. In contrast, the sequence described in CC is not uniquely determined (unless all the S_n's are singletons), and so some arbitrary choices must be made. Of course, if the S_n's have some sort of known structure (e.g., if each S_n is a nonempty subset of ℕ) then it may be possible to make canonical choices. But when we do not know of any structure, the axiom of Countable Choice still permits us to make an infinite sequence of arbitrary choices. (Contrast this also with AC, which permits arbitrarily many arbitrary choices.)

Countable Choice will now be used to prove two very basic properties of cardinality.
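The remark about canonical choices can be made concrete. When every S_n is a nonempty subset of ℕ, the rule "take the least element" determines the sequence (x_n) outright, so no appeal to CC is needed. The following is only a minimal sketch, with finite Python sets standing in for the S_n; the name `canonical_choices` is ours and does not come from the text.

```python
def canonical_choices(sets):
    """Return [x_1, x_2, ...] where x_n is the least element of the n-th set.

    Each set is assumed to be a nonempty set of natural numbers, so the
    least element exists and the result involves no arbitrary choice.
    """
    chosen = []
    for s in sets:
        if not s:
            raise ValueError("every set must be nonempty")
        chosen.append(min(s))
    return chosen
```

The point of the contrast is that `min` exploits the well ordering of ℕ. For sets with no known structure there is no analogue of `min`, and that is exactly the situation in which CC is invoked.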
6.26. (Assume CC. Then:) The union of countably many countable sets is countable.

Proof. Let {S_λ : λ ∈ Λ} be a countable collection of countable sets; we are to prove that S = ⋃_{λ∈Λ} S_λ is countable. Since Λ is countable, by relabeling we may assume that Λ = ℕ or Λ = {1, 2, ..., N} for some positive integer N. To reflect this relabeling, let us replace the λ's with n's. We wish to show that S = ⋃_{n∈Λ} S_n is countable.

For each n, the set S_n is countable, and so there exists at least one injection from S_n into ℕ. For each n ∈ Λ, let us choose some injection ι_n : S_n → ℕ. Since there are countably many n's, we are making countably many choices; it is at this step that we need the Axiom of Countable Choice. (If the S_n's were given to us with some sort of listing already provided, so that we could choose the ι_n's canonically, then CC would not be needed.)

Now, for each x ∈ S, let n(x) be the first integer n that satisfies x ∈ S_n, and let p(x) = ι_{n(x)}(x). Then the mapping x ↦ (n(x), p(x)) is an injection from S into ℕ × ℕ, which is countable by 2.20.

6.27. (Assume CC. Then:) A set X is infinite (i.e., not finite) if and only if it contains a countably infinite set, i.e., if and only if card(X) ≥ card(ℕ).

Discussion and hints. It is clear that card(X) ≥ card(ℕ) implies X is not finite; that implication does not require the Axiom of Countable Choice. Conversely, assume X is an infinite set. We claim that

(*) the set X contains a subset A_j having exactly j elements

for each nonnegative integer j. Indeed, take A₀ = ∅. If (*) is false for some j > 0, consider the smallest such j. Then A_{j−1} is finite and X is not, so A_{j−1} ⊊ X. Thus X \ A_{j−1} is nonempty; let x be any element of X \ A_{j−1}. Now take A_j = A_{j−1} ∪ {x}. This contradiction proves (*). Using CC, choose one such set A_j for each j. Then ⋃_{j=0}^∞ A_j is a countably infinite subset of X.

Remarks. A set is Dedekind infinite if it has the same cardinality as some proper subset of itself; otherwise it is Dedekind finite. If we assume Countable Choice, then the three notions of "infinite" coincide. Some mathematicians take Dedekind infiniteness, or the condition card(X) ≥ card(ℕ), as a definition of X being infinite.

DEPENDENT CHOICE

6.28. Between the Axiom of Choice and Countable Choice lies an important but more complicated principle, the Principle of Dependent Choice (DC). We shall give two versions of this principle now and a few more versions in Chapters 19 and 20.

(DC1) Dependent Choice (version without history). Let any nonempty set S and any function f : S → {nonempty subsets of S} be given. Then there exists a sequence (x_n) in S such that x_{n+1} ∈ f(x_n) for each n.
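In (DC1) each choice depends on the one before it. When S carries enough structure to select canonically from f(x), the sequence can be produced by ordinary recursion and DC is not needed. The sketch below is an illustration under that assumption only: it takes S to be a set of natural numbers and always selects the least element of f(x). The names `dependent_sequence`, `children`, and `x0` are hypothetical.

```python
def dependent_sequence(f, x0, length):
    """Return [x_1, ..., x_length] with x_1 = x0 and x_{n+1} = min(f(x_n)).

    f maps an element of S to a nonempty set of elements of S; `length`
    is assumed to be at least 1.
    """
    xs = [x0]
    while len(xs) < length:
        options = f(xs[-1])
        if not options:
            raise ValueError("f must return a nonempty set")
        xs.append(min(options))   # canonical selection, so no arbitrary choice
    return xs

def children(x):
    """Example f on the natural numbers: f(x) = {2x, 2x + 1}."""
    return {2 * x, 2 * x + 1}
```

With `children` as f and starting point 1, every term satisfies x_{n+1} ∈ f(x_n), as (DC1) requires. What DC adds is the ability to form such a sequence when no selection rule like `min` is available.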
(DC2) Dependent Choice (version with history). Let S₁, S₂, S₃, ... be nonempty sets. For each n ≥ 1, let f_n be a mapping from S₁ × S₂ × ··· × S_n into {nonempty subsets of S_{n+1}}. Then there exists a sequence (x₁, x₂, x₃, ...) such that x_{n+1} ∈ f_n(x₁, x₂, ..., x_n) for each n.

In either formulation, the idea is that we can choose some x₁; with it in mind we can then choose some x₂; with x₁ and x₂ (or just x₂) in mind we can then choose some x₃; etc.

Exercise. (DC1) ⇔ (DC2), and (AC4) ⇒ (DC2) ⇒ CC.

Hint for (DC1) ⇒ (DC2): Let

S = ⋃_{n=1}^∞ (S₁ × S₂ × ··· × S_n) and f(x₁, x₂, ..., x_n) = {(x₁, x₂, ..., x_n)} × f_n(x₁, x₂, ..., x_n).

6.29. Remarks. It is known that AC is strictly stronger than DC, and DC is strictly stronger than CC. Recently, Howard and Rubin [1996] have shown that UF + CC does not imply DC. (The Ultrafilter Principle, UF, is discussed in 6.32 and thereafter.)

The Axiom of Dependent Choice (DC) should not be confused with the Axiom of Determinacy (AD). Though the two names sound similar, the two axioms are entirely different. We shall not study AD in this book; a good introduction to it is given by Dalen, Doets, and Swart [1978].

6.30. Optional exercise. Assume Dependent Choice. Let (X, ≤) be a chain ordered set. Show that ≤ is a well ordering of X if and only if there does not exist an infinite sequence x₁ > x₂ > x₃ > ··· in X.

6.31. Optional exercise (from Johnstone [1987]). If we assume Dependent Choice, then the Axiom of Regularity is equivalent to the following principle.

No Infinite Regress. There does not exist an infinite sequence of sets S₀, S₁, S₂, S₃, ... that satisfies ··· ∈ S₃ ∈ S₂ ∈ S₁ ∈ S₀.

Proof. In 1.49 we proved that the Axiom of Regularity implies No Infinite Regress. Conversely, suppose that the Axiom of Regularity is false. Say S₀ is a nonempty set that meets each of its elements. If S_n meets S₀, then there is some S_{n+1} ∈ S_n ∩ S₀. Use DC to form a sequence (S₀, S₁, S₂, S₃, ...) that violates No Infinite Regress.

THE ULTRAFILTER PRINCIPLE

6.32. (We assume some familiarity with filters, which were introduced in Chapter 5.) Midway between the Axiom of Choice (AC) and the Axiom of Choice for Finite Sets (ACF) is a more complicated but very important principle, the Ultrafilter Principle (also known
as the Ultrafilter Theorem). Like the Axiom of Choice, the Ultrafilter Principle has many important equivalents in many branches of mathematics. This book contains nearly two dozen equivalents of the Ultrafilter Principle; the wider literature contains many more. The equivalents discussed in this book will be denoted (UF1), (UF2), (UF3), etc.; collectively we shall refer to them as UF. They can be found in the paragraph below and in Chapters 6, 7, 9, 13, 14, 17, 19, and 28. The equivalents have not all been collected into one source. A few more equivalents are given by Jech [1973], Morillon [1986], Rav [1977], and Rubin and Rubin [1985]. Mathematicians who wish to study equivalents of UF are urged to search not only under "ultrafilter" but also under "Compactness Principle," "Boolean Prime Ideal Theorem," and "Stone Representation Theorem"; those equivalents of UF (considered later in this book) are less essential to analysts but are more famous among logicians and algebraists.

The version of greatest use for purposes of this book is

(UF1) Ultrafilter Principle (Cartan). Any proper filter is included in an ultrafilter. That is, if ℱ is a proper filter on a set X, then there exists an ultrafilter 𝒰 on X with 𝒰 ⊇ ℱ.

This result follows easily from (AC5) or (AC7). The Ultrafilter Principle is strictly weaker than the Axiom of Choice; that was proved by Halpern and Lévy [1971], but the proof is beyond the scope of this book.

6.33. Existence of free ultrafilters. Recall that an ultrafilter on an infinite set is free if and only if it contains the cofinite filter (see Chapter 5). Thus, using (UF1) to extend the cofinite filter is one method of "constructing" free ultrafilters. Thus we obtain this corollary of (UF1):

(**) On every infinite set there exists a free ultrafilter.

A special case of this result is important enough to have its own name:

(WUF) Weak Ultrafilter Theorem. A free ultrafilter exists on ℕ.

Actually, WUF plus CC implies statement (**) above. (Proof. CC tells us that X contains a countably infinite set X₀; see 6.27. Show that if ℱ is a free ultrafilter on X₀, then {S ⊆ X : S contains some member of ℱ} is a free ultrafilter on X.)

Neither of the implications AC ⇒ UF ⇒ WUF is reversible. That is, UF is strictly weaker than the Axiom of Choice, and WUF is weaker still; this is proved in Jech [1973].

Free ultrafilters on ℕ or on any infinite set are intangibles, in the sense of 14.76 and 14.77: They exist in conventional set theory, but we cannot prove their existence using just ZF + DC. This result was proved by Pincus and Solovay [1977]. It also follows from Shelah's result Con(ZF + DC + BP), via an argument of WUF ⇒ not-BP given later in this book; see the discussion in Chapter 14.

Although the free ultrafilters are harder to illustrate or imagine than the fixed ones, they are also far more numerous. A theorem of Tarski states that if X is an infinite set, then card{free ultrafilters on X} = card{P(P(X))}. This is in contrast with card{fixed
ultrafilters on X} = card(X). The proof, which uses the Axiom of Choice, is too long to be included here. Proofs can be found in Tarski [1939], Bell and Slomson [1974], and Gähler [1977].

Exercise. Show that ACR (in 6.12) implies WUF. Hints: We may replace ℝ with P(ℕ) (see Chapter 10). Thus the subsets of ℕ can be well ordered. The collection 𝒞 of filter subbases on ℕ is a collection with finite character; hence by the theorem in 3.46 the cofinite filter can be extended to a maximal member of 𝒞.

6.34. (UF2) Cowen-Engeler Lemma. Let Λ and X be sets. Let Φ be a collection of functions from subsets of Λ into X. Assume that

(i) Φ(λ) = {f(λ) : f ∈ Φ with λ ∈ Dom(f)} is a finite subset of X, for each λ ∈ Λ;

(ii) each finite set S ⊆ Λ is the domain of at least one element of Φ; and

(iii) Φ has finite character, i.e., a function f from some subset of Λ into X is a member of Φ if and only if each restriction of f to a finite subset of Domain(f) is a member of Φ.

Then Λ is the domain of at least one element of Φ.

Remarks. Actually, (UF2) remains equivalent if we make the further stipulation that X = {0, 1}; that will be evident from the argument in 13.22.

The principle (UF2) is very similar to several principles that are known as Rado's Selection Lemma. For a few results on Rado's Lemma(s) see Howard [1984] and [1993], Jech [1977], Rav [1977], and Thomassen [1983]; the reader is cautioned that those principles are not all known to be equivalent to one another.

The Cowen-Engeler Lemma, particularly with X = {0, 1}, is in many respects similar to the Compactness Principle of Propositional Logic, which is (UF16) in 14.61. In fact, as Rav [1977] points out, the Cowen-Engeler Lemma is a sort of combinatorial, nonlogicians' version of (UF16). Thus, the Cowen-Engeler Lemma can often be used in place of (UF16) but does not require any knowledge of formal logic.

6.35. Proof of (UF1) ⇒ (UF2). This proof is modified from arguments of Rav [1977] and Luxemburg [1962]. (The proof of (UF2) ⇒ (UF1) will be given via several other propositions in 13.22.)

Let Fin(Λ) = {finite subsets of Λ}. For each S ∈ Fin(Λ), let Γ_S = {f ∈ Φ : Dom(f) ⊇ S}. Then Γ_S is nonempty, by hypothesis (ii). Since Γ_S ∩ Γ_T = Γ_{S∪T}, the collection of sets {Γ_S : S ∈ Fin(Λ)} has the finite intersection property. By (UF1), there exists a (not necessarily unique) ultrafilter 𝒰 on Φ that includes {Γ_S : S ∈ Fin(Λ)}.

To define φ : Λ → X, temporarily fix any λ ∈ Λ. Note that Φ(λ) = {f(λ) : f ∈ Γ_{λ}}. The sets {f ∈ Γ_{λ} : f(λ) = x} (for x ∈ Φ(λ)) are disjoint and their union is Γ_{λ}, which is a member of the ultrafilter 𝒰. By 5.8(E), precisely one of the x's in Φ(λ) satisfies {f ∈ Γ_{λ} : f(λ) = x} ∈ 𝒰. Let that x be denoted by φ(λ). Thus, we define a function φ : Λ → X, satisfying

{f ∈ Γ_{λ} : f(λ) = φ(λ)} ∈ 𝒰 for all λ ∈ Λ.
It suffices to show that φ ∈ Φ. Let any S ∈ Fin(Λ) be given. Since Φ has finite character, it suffices to show that φ agrees on S with some f ∈ Φ. The set

⋂_{λ∈S} {f ∈ Γ_{λ} : f(λ) = φ(λ)}

is also an element of 𝒰, hence a nonempty subset of Φ. Now any f in that set will do.

Exercise. Show that (UF2) implies the Axiom of Choice for Finite Sets, which was stated in 6.15 as (ACF). Hint: Use the Finite Axiom of Choice (6.14).

6.36. Marriage Theorems (P. Hall's Theorem, M. Hall's Theorem). Let {S_γ : γ ∈ Γ} be a collection of sets. Assume either

(i) Γ is finite (for P. Hall's Theorem), or

(ii) each S_γ is finite (for M. Hall's Theorem).

Then the following two conditions are equivalent:

(A) there exists an injective function x ∈ ∏_{γ∈Γ} S_γ;

(B) card(⋃_{γ∈F} S_γ) ≥ card(F) for each finite set F ⊆ Γ.

Remarks. The theorem above gives necessary and sufficient conditions for the solvability of the "marriage problem" of combinatorics: Let Γ be a collection of heterosexual people of one gender, and let S_γ be the set of suitors (of the other gender) of person γ; then condition (A) says that all the elements of Γ can be married simultaneously to suitors.

We cannot omit hypotheses (i) and (ii). For instance, the sets {1, 2, 3, ...}, {1}, {2}, {3}, ... satisfy (B) but not (A).

We have attributed the two theorems above to Phillip Hall and Marshall Hall, respectively, because they were apparently the earliest publishers of those theorems (see Hall [1935] and Hall [1948]), but both theorems have been subsequently rediscovered many times. P. Hall's Theorem is also equivalent to several other important combinatorial matching theorems, including theorems of König and Menger in graph theory, Dilworth's Theorem on partially ordered sets, and the Ford-Fulkerson Maxflow Mincut Theorem of network theory. (By "equivalent" we mean in this instance that each theorem implies the others easily.) Surveys of related material are given by Mirsky [1971] and Reichmeider [1984].

Our proof of M. Hall's Theorem will use (UF2); later we shall use M. Hall's Theorem to prove Löwig's Theorem in 11.31. It is not yet known whether M. Hall's Theorem or Löwig's Theorem is equivalent to UF.

6.37. In both cases, the implication (A) ⇒ (B) is obvious; it is (B) ⇒ (A) that we must prove.

Proof of P. Hall's Theorem, assuming (i). This proof follows Halmos and Vaughan [1950]. Let Γ = {1, 2, ..., n}; we proceed by induction on n. For n = 1 the result is trivial. For larger n we consider two cases:

• First, suppose that each union of k S_i's (1 ≤ k ≤ n − 1) contains at least k + 1 elements. In this case we may choose any x_n ∈ S_n and then apply the induction hypothesis to the n − 1 sets S₁\{x_n}, S₂\{x_n}, ..., S_{n−1}\{x_n}.
• On the other hand, suppose that some union of k of the S_i's contains exactly k elements, for some k with 1 ≤ k ≤ n − 1. By relabeling we may assume that these are the sets S₁, S₂, ..., S_k. Let their union be T. Clearly the inductive hypothesis can be applied to the k sets S₁, S₂, ..., S_k. It suffices for us to show that the inductive hypothesis also can be applied to the n − k sets S_{k+1}\T, S_{k+2}\T, ..., S_n\T. To see that, note that for 1 ≤ r ≤ n − k, the union of any r of these n − k sets contains at least r elements, since the union of those r sets together with T is the union of k + r of the original S_i's, and hence contains at least k + r elements.

Proof of M. Hall's Theorem, assuming (ii). This proof is modified from Mirsky [1971]. Let Φ be the collection of all injective functions f defined on subsets of Γ that satisfy f(γ) ∈ S_γ for each γ ∈ Domain(f). It is easy to see that Φ has finite character, in the sense of (UF2)(iii). Also, each set Φ(γ) = {f(γ) : f ∈ Φ, γ ∈ Domain(f)} is finite, since it is contained in the finite set S_γ. By P. Hall's Theorem, each finite subset of Γ is the domain of at least one member of Φ. Thus, (UF2) is applicable, and Γ is the domain of at least one member of Φ.
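For a finite family of finite sets, conditions (A) and (B) of the Marriage Theorems can both be tested by exhaustive search, which gives a direct way to experiment with P. Hall's Theorem. The following is a brute-force sketch for small families only; it is not the inductive argument of Halmos and Vaughan, and the names `hall_condition` and `injective_choice` are our own.

```python
from itertools import combinations, product

def hall_condition(sets):
    """Condition (B): every subfamily of k sets has a union with at least k elements."""
    indices = range(len(sets))
    for k in range(1, len(sets) + 1):
        for F in combinations(indices, k):
            union = set().union(*(sets[i] for i in F))
            if len(union) < k:
                return False
    return True

def injective_choice(sets):
    """Condition (A): return a tuple (x_1, ..., x_n) of distinct elements
    with x_i in sets[i], or None if no such tuple exists."""
    for xs in product(*sets):
        if len(set(xs)) == len(xs):   # all chosen representatives are distinct
            return xs
    return None
```

For example, the family {1, 2}, {2, 3}, {1, 3} satisfies (B) and admits an injective choice, while {1}, {1}, {1, 2} fails (B) because the first two sets together contain only one element. Both functions take time exponential in the size of the family, so they are suitable for checking small cases rather than for solving matching problems in practice.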
Chapter 7
Nets and Convergences

[Chart: relations among the main types of convergences. Recoverable labels: Hausdorff; complete lattice; centered; isotone; pretopological; topological; first countable; chain; [−∞, +∞] (the extended real line).]

7.1. An elementary special case. A sequence (x_n) in a metric space (X, d) is said to converge to a limit x ∈ X if for each number ε > 0, there exists an integer N = N(ε) such that

n ≥ N ⇒ d(x_n, x) < ε.

We then write x_n → x or x = lim_{n→∞} x_n.

7.2. Chapter overview. Much of analysis can be formulated in terms of convergence of sequences in metric spaces, but occasionally we need greater generality. Most of this chapter can be postponed; it will not be needed until much later in this book.

Nets are a generalization of sequences. A sequence is a function whose domain is ℕ; a net (or "generalized sequence") is a function whose domain is any directed set 𝔻. A convergence space is a set X equipped with some rule that specifies which nets
or equivalently,¹ which filters converge to which "limits" in X. Analysts who are already familiar with convergent sequences in metric spaces should have little difficulty with convergent nets, for as we shall see in this chapter nets and convergence spaces are natural generalizations of sequences in metric spaces.

Although nets are used mainly for convergences, it is conceptually simpler to first study nets without regard to convergences, i.e., as devices for a modified sort of "counting," without any regard to limits. That is the subject of the first half of this chapter.

In later chapters we shall be primarily concerned with topological convergences and, to a much smaller degree, order convergences. The other kinds of convergences (Hausdorff, first countable, pretopological, etc.) are introduced here mainly to give a clearer understanding of the basic properties of topological and order convergences. The chart at the beginning of this chapter shows the relations between some of the main types of convergences we shall consider in this book.

We are more concerned with order convergences that are topological. For instance, the order convergence and the topological convergence in ℝ are identical, but the order viewpoint and the topological viewpoint yield different kinds of information about that convergence. One very important order convergence that is not topological is the convergence almost everywhere of [−∞, +∞]-valued random variables over a positive measure; this topic is considered briefly in Chapter 21. Other nontopological order convergences are important in the study of vector lattices, but that subject is not studied in great depth in this book.

Nets are an aid to the intuition and to the process of discovery, but they are not always essential. Nets are particularly helpful for understanding topologies that are known to be nonmetrizable (e.g., the weak topology of an infinite-dimensional normed vector space) or understanding topologies that are not known to be metrizable. But nets are also occasionally useful in metric spaces; two examples of this are the proof of Caristi's Theorem given in 19.45 and the explanation of Riemann integrals given in Chapter 24. Also, many proofs involving nets can be rewritten so that nets are not mentioned. Some researchers prefer to rewrite their proofs in that fashion: The original insight may thereby be obscured, but the result becomes readable by a wider audience since familiarity with nets is no longer required.

Before reading this chapter, it may be helpful to briefly review the introduction to filters in Sections 5.1 through 5.11.

¹ To make the convergence of nets equivalent to the convergence of filters simplifies our theory substantially, but it imposes a mild restriction on the kinds of net convergences that we shall consider. This is discussed further later in this chapter.

7.3. Review of directed sets. Recall from 3.8 the definition of directed set: It is a set X equipped with a relation ≼ that is reflexive (x ≼ x for all x) and transitive (i.e., if x ≼ y and y ≼ z, then x ≼ z) and that also satisfies this condition: for each x, y ∈ X, there exists u ∈ X such that x, y ≼ u. We review a few basic properties of directed sets from Chapter 3:

• The universal ordering (x ≼ x′ for all x, x′ ∈ X) is a directed ordering that is not antisymmetric.
• Any product of directed sets, with the product ordering, is directed.

• A subset of a directed set (when equipped with the restriction ordering) is not necessarily directed.

• Let ℬ be a nonempty collection of nonempty subsets of a set X. Then (ℬ, ⊇) is a directed set if and only if ℬ is a filterbase on X. We refer to ⊇ as the ordering by reverse inclusion. Note that, unfortunately, larger sets are "smaller" in this ordering, and smaller sets are "larger"; i.e., S ⊆ T ⇒ S ≽ T.

Since every filter is also a filterbase, we see in particular that reverse inclusion is a directed ordering on any filter. Ordinary inclusion (⊆) is also a directed ordering on any filter, but that ordering is seldom useful. Reverse inclusion is the most common directed ordering used on filters. It will be understood to be in use whenever we use a filter as a directed set, except when some other ordering is specified.

7.4. Directed sets will be used as generalizations of ℕ or ℝ, so this book will often denote directed sets by 𝔸, 𝔹, ℂ, 𝔻. (The reader must determine from context whether ℂ means a directed set or the complex numbers.) In accordance with much of the literature, we shall usually denote elements of a directed set by lowercase Greek letters (α, β, γ, etc.). Subscripts i, j, k, m, n generally will mean elements of ℕ if no other index set is indicated.

NETS

7.5. Before turning to generalized sequences, let us first review the notation of sequences. Recall that a sequence in a set X is a mapping from ℕ into X, where ℕ = {1, 2, 3, ...} has its usual ordering. A sequence can be viewed as a function, with values x(1), x(2), x(3), ..., and when it is helpful we shall adopt that notation. However, it is more common to view a sequence as a set parametrized by ℕ; then a sequence is written as (x₁, x₂, x₃, ...) or (x_n : n ∈ ℕ) or (x_n). If we disregard the ordering of (x_n), we obtain the countable set {x_n} = {x₁, x₂, x₃, ...}, which is the range of the sequence; note the use of braces instead of parentheses.

7.6. A net in a set X is any function from a nonempty directed set into X. Nets are a generalization of sequences; in fact, they are also sometimes called generalized sequences or Moore-Smith sequences. Thus, in the notation of Chapter 1, a net in X is any function x : 𝔻 → X, where 𝔻 is any nonempty directed set. The values of a net may sometimes be written as x(α), x(β), etc. However, it is more common to view a net as a set parametrized by a directed set 𝔻; thus we usually represent such a net by the expression (x_δ : δ ∈ 𝔻). We may abbreviate this as (x_δ) if (𝔻, ≼) does not need to be mentioned explicitly, but the set 𝔻 and its ordering ≼ are still understood to be part of the structure of the net. If we disregard the order on (x_δ), we obtain the set {x_δ : δ ∈ 𝔻}, which is the range of the net (x_δ); again note the use of braces
Sequences are a special case of nets, and so most statements we shall make about nets will apply to sequences as well. Of course, sequences are conceptually simpler than nets, and so whenever possible we prefer to use sequences. However, for some purposes (e.g., the study of convergence in nonmetrizable topological spaces) nets are a more natural tool.

Remark. The word "net" is perhaps unfortunate; it does not have any intuitive justification, as far as this author knows. The alternate term "stream" was suggested by McShane [1952], but "net" is the standard word.

7.7. Let any net $(x_\delta : \delta \in \mathbb{D})$ in a set $X$ be given, and let $S \subseteq X$. We shall say that

• $S$ is a tail set of the net if $S$ is of the form $\{x_\delta : \delta \succcurlyeq \delta_0\}$ for some $\delta_0 \in \mathbb{D}$.

• $S$ is an eventual (or residual) set of the net if $S$ contains some tail set; i.e., if there is some $\delta_0 \in \mathbb{D}$ such that $\{x_\delta : \delta \succcurlyeq \delta_0\} \subseteq S$. In this case we say that $x_\delta \in S$ happens eventually, or that $x_\delta \in S$ happens for all $\delta$ sufficiently large.

• $S$ is a frequent (or cofinal) set of the net if $S$ meets every tail set; i.e., if for each $\delta_0 \in \mathbb{D}$ there is some $\delta \succcurlyeq \delta_0$ such that $x_\delta \in S$. In this case we say that $x_\delta \in S$ happens frequently, or that $x_\delta \in S$ happens for arbitrarily large values of $\delta$.

• $S$ is infrequent if it is not frequent.

A set is eventual if and only if its complement is infrequent. Of course, these definitions all depend on the directed set $(\mathbb{D}, \preccurlyeq)$, the codomain $X$, and the net $(x_\delta : \delta \in \mathbb{D})$.

These terms can also be applied to subsets of a directed set $\mathbb{D}$, by viewing the identity map $i : \mathbb{D} \to \mathbb{D}$ as a $\mathbb{D}$-valued net (with $i_\delta = \delta$). Thus, a subset $S \subseteq \mathbb{D}$ is a tail set if $S$ is of the form $\{\delta \in \mathbb{D} : \delta \succcurlyeq \delta_0\}$, an eventual set if $S$ contains some set of the form $\{\delta \in \mathbb{D} : \delta \succcurlyeq \delta_0\}$, or a frequent set if $S$ meets every set of the form $\{\delta \in \mathbb{D} : \delta \succcurlyeq \delta_0\}$.

Caution: The term "tail set" has another, unrelated meaning; see 20.31.

7.8. Examples and basic properties.

a. Let $\mathbb{N}$ have its usual ordering, and consider the identity map $i : \mathbb{N} \to \mathbb{N}$ as a net. A subset of $\mathbb{N}$ is eventual if and only if it is cofinite (i.e., has finite complement), and frequent if and only if it is an infinite set.

b. Let $\mathbb{N}$ have its usual ordering. Let $i : \mathbb{N} \to \mathbb{N}$ be the identity map. Then $i_n$ is eventually greater than 5, and $i_n$ is frequently a multiple of 17.

c. Let $\mathbb{N}$ be partially ordered by $m \preccurlyeq n$ if $m$ is a factor of $n$. Then $(\mathbb{N}, \preccurlyeq)$ is a directed set, for reasons indicated in 3.9. With this ordering, the identity map $i$ is a net for which $i_n$ is eventually a multiple of 17.

7.9. Correspondence between nets and filters. Let $x : \mathbb{D} \to X$ be a net. Then $\mathcal{B} = \{\text{tail sets of the net } (x_\delta)\}$ is a filterbase on $X$; the proper filter that it generates is $\mathcal{F} = \{\text{eventual sets of the net } (x_\delta)\}$. We shall call these the filterbase of tails and the eventuality filter of $(x_\delta)$.
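As a small worked instance of these two objects (a sketch, using only the identity net on $\mathbb{N}$ with its usual ordering), the filterbase of tails and the eventuality filter can be written out explicitly:

```latex
\[
  \mathcal{B} \;=\; \bigl\{\, \{n, n+1, n+2, \ldots\} : n \in \mathbb{N} \,\bigr\},
  \qquad
  \mathcal{F} \;=\; \bigl\{\, S \subseteq \mathbb{N} : \mathbb{N} \setminus S \text{ is finite} \,\bigr\}.
\]
% S contains some tail {n, n+1, ...}  <=>  N \ S is contained in {1, ..., n-1}  <=>  N \ S is finite.
```

So for this net the eventual sets are exactly the cofinite sets, and a set is frequent (meets every tail) exactly when it is infinite.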
(Some mathematicians call $\mathcal{F}$ the filter of tails of $(x_\delta)$.) The proper ideal that is dual to the filter $\mathcal{F}$ is the collection of all infrequent subsets of $X$. In other words, a set $S \subseteq X$ is eventual if and only if $X \setminus S$ is infrequent. Using this duality, we can convert statements about frequent sets to statements about eventual sets, and vice versa. Referring to the discussion in 5.3, the reader may find it helpful to think of "eventual" as meaning "large," "infrequent" as meaning "small," and "frequent" as meaning "not small."

7.10. Further properties and examples. Let $(x_\delta : \delta \in \mathbb{D})$ be a net in a set $X$.

a. Any eventual set is frequent.

b. If $X$ is a set directed by the universal ordering (i.e., $x \preccurlyeq y$ for all $x, y \in X$), then the identity map $i : X \to X$ is a net whose eventuality filter is the singleton $\{X\}$.

c. The constant net at $z$ (i.e., the net satisfying $x_\alpha = z$ for all $\alpha$) has eventuality filter equal to the ultrafilter fixed at $z$.

d. Any superset of an eventual set is eventual. Any superset of a frequent set is frequent.

e. If $n$ is a positive integer and $S_1 \cup S_2 \cup \cdots \cup S_n$ is frequent, then at least one of the $S_i$'s is frequent.

f. Let $(\mathbb{D}, \preccurlyeq)$ be a directed set, and suppose $S \subseteq \mathbb{D}$ is frequent. Then $S$ is itself a directed set, when ordered by the restriction of the given ordering.

g. Let $p : X \to Y$ be a mapping from one set into another. If $(x_\delta)$ is a net in $X$, then $(x_\delta)$ has tail filterbase equal to $\mathcal{B} = \{\{x_\delta : \delta \succcurlyeq \delta_0\} : \delta_0 \in \mathbb{D}\}$ and eventuality filter $\mathcal{F} = \{S \subseteq X : S \supseteq B \text{ for some } B \in \mathcal{B}\}$. Show that

(i) $p(\mathcal{B}) = \{p(B) : B \in \mathcal{B}\}$ is the tail filterbase for the net $(p(x_\delta) : \delta \in \mathbb{D})$ in $Y$,

(ii) $p(\mathcal{F}) = \{p(F) : F \in \mathcal{F}\}$ is also a filterbase on $Y$,

(iii) $\mathcal{G} = \{S \subseteq Y : p^{-1}(S) \in \mathcal{F}\}$ is the eventuality filter of the net $(p(x_\delta) : \delta \in \mathbb{D})$ in $Y$,

(iv) $p(\mathcal{B}) \subseteq p(\mathcal{F}) \subseteq \mathcal{G}$, and $\mathcal{G}$ is the filter generated by both $p(\mathcal{B})$ and $p(\mathcal{F})$.

(Refer to 5.40.)

7.11. In 7.9 we saw how each net determines a proper filter, which we call the eventuality filter. Conversely, now let $\mathcal{B}$ be a proper filter on a set $X$; we wish to construct a net $(x_\delta)$ in $X$ whose eventuality filter is $\mathcal{B}$. Many such nets are available, but we shall describe one that is canonical; that is, one that can be constructed from $\mathcal{B}$ by a straightforward algorithm without any arbitrary choices. This construction is taken from Bruns and Schmidt [1955]; it was independently rediscovered by Wilansky [1970].

Let $\mathcal{B}$ be any proper filter, or more generally, any filterbase on a set $X$. Let $X$ have the universal ordering (as in 3.9.g), let $\mathcal{B}$ be ordered by reverse inclusion (as in 7.4), and let $X \times \mathcal{B}$ have the product ordering. Show that $D = \{(x, S) \in X \times \mathcal{B} : x \in S\}$ is a frequent subset of $X \times \mathcal{B}$, and hence is a directed set by 7.10.f.
Then show that the map $(x, S) \mapsto x$, from $D$ into $X$, is a net whose filterbase of tails is $\mathcal{B}$; its eventuality filter is $\mathcal{B}$ if that filterbase is a filter. We shall call this net the canonical net of $\mathcal{B}$.

This canonical construction is admittedly a bit complicated. We shall use it occasionally, but more often we shall merely need to use the fact that some canonical construction exists; i.e., that there is some canonical way to construct a net with a given eventuality filter. The specific details of the construction will not enter into most applications.

7.12. (Optional.) If $\mathcal{N}$ is any filterbase on a set $X$, then there also exists a net $(x_\alpha : \alpha \in A)$ whose filterbase of tails is $\mathcal{N}$ and such that the directed set $A$ is antisymmetric; i.e., it is also a poset. (Consequently, our applications would not be greatly affected if we made antisymmetry a part of our definition of directed set.) This construction is also from Bruns and Schmidt [1955].

Construction: Let $A = \{(u, n, B) \in X \times \mathbb{N} \times \mathcal{N} : u \in B\}$. Order it as follows: $(u, n, B) \preccurlyeq (u', n', B')$ if and only if either (i) $B \supsetneq B'$ or (ii) $B = B'$ and $n < n'$. Define $x_{(u, n, B)} = u$. Verify that $(A, \preccurlyeq)$ is a directed poset and that $\{x_\alpha : \alpha \succcurlyeq (u_0, n_0, B_0)\} = B_0$ for each $(u_0, n_0, B_0) \in A$; hence $\mathcal{N}$ is the filterbase of tails for the net.

7.13. Remarks: nets versus filters. Nets are a natural generalization of sequences, so they may be intuitively appealing to analysts, who are already familiar with sequences. On the other hand, many proofs are easier in terms of filters, since filters are always "canonical." For instance, the filter $\mathcal{N}(x)$ of all neighborhoods of a point, studied in 15.2 and 15.7 and thereafter, plays a useful special role in many proofs, since it is the smallest filter that converges to the point $x$. It can be replaced by a net, as in 7.11, but that replacement is somewhat complicated and artificial, and removes much of the available intuition; we may prefer to work with $\mathcal{N}(x)$. Filters have many other uses in set theory, logic, algebra, etc., but filters can also be used to study convergences. In fact, nets and filters yield essentially the same results about convergences.

For any proper filter $\mathcal{F}$, the eventuality filter of the canonical net of $\mathcal{F}$ is $\mathcal{F}$. This gives us a bijection between the proper filters on $X$ and a certain collection of nets in $X$. However, "most" nets are not canonical nets; this bijection is not onto the class of all nets in $X$. In fact, the class of all nets in $X$ is a proper class (see 1.44); it is far too big to be a set, since we make no restriction on the choice of the underlying directed set. In contrast, the collection of all filters on $X$ is a set of ordinary size; clearly, $\{\text{filters on } X\} \subseteq \mathcal{P}(\mathcal{P}(X))$. Nevertheless, the correspondence between filters and nets is quite good, and so we may use the two tools interchangeably.

Some mathematicians prefer nets or prefer filters, and use only one system or the other. It is this author's opinion that the ideas of nets and filters complement each other; they should not be viewed as two separate systems of ideas. In 7.9 we showed how to switch back and forth between nets and filters, so that each system can be used to its best advantage. That interchangeability is strengthened by the ideas of Aarnes and Andenæs; see especially 7.15(C). This book will make frequent use of nets and filters and of their interchangeability, thereby gaining the advantages of each.
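The canonical net of 7.11 can be computed explicitly for a small filterbase. The sketch below is illustrative only: the function names are hypothetical, the index set is stored as a plain list of pairs $(x, S)$, and the ordering is the product of the universal ordering on $X$ with reverse inclusion on the filterbase, so that $(x_1, S_1) \preccurlyeq (x_2, S_2)$ exactly when $S_1 \supseteq S_2$.

```python
def canonical_net(filterbase):
    """Return (D, leq) for the canonical net (x, S) |-> x of a finite filterbase.

    D is the index set {(x, S) : x in S, S in filterbase}; leq compares two
    indices by reverse inclusion of their second coordinates (the first
    coordinates carry the universal ordering, which imposes no condition).
    """
    base = [frozenset(s) for s in filterbase]
    D = [(x, S) for S in base for x in S]

    def leq(d1, d2):
        return d1[1] >= d2[1]   # S1 is a superset of S2

    return D, leq


def tail_set(D, leq, d0):
    """Values x taken by the net at all indices (x, S) lying above d0."""
    return frozenset(x for (x, S) in D if leq(d0, (x, S)))


base = [{1, 2, 3}, {2, 3}, {3}]
D, leq = canonical_net(base)
tails = {tail_set(D, leq, d) for d in D}
print(sorted(sorted(t) for t in tails))
```

The tail at an index $(x_0, S_0)$ is the union of all members of the filterbase contained in $S_0$, which is $S_0$ itself; so the collection of tail sets coincides with the filterbase, as claimed in 7.11.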
SUBNETS

7.14. Preview and historical remarks. In the following pages we shall compare several types of subnets. In order of increasing generality, they are

$\{\text{subsequences}\} \subseteq \{\text{frequent subnets}\} \subseteq \{\text{Willard subnets}\} \subseteq \{\text{Kelley subnets}\} \subseteq \{\text{AA subnets}\}.$

The last three types (Willard, Kelley, and AA) are our main types of subnets. Any one of these by itself would make a good definition of "subnet," and has been used as such elsewhere in the literature. Although the three definitions require slightly different proofs of theorems, they yield essentially the same statements of theorems; their near-interchangeability will follow from results in 7.19 and 15.38. The Kelley definition is oldest and is most widely used in the literature, but the other two definitions are simpler. The AA definition is the most general and yields the simplest proofs. For those reasons and other reasons indicated below, this book will use the term "subnet" to mean "AA subnet" except where noted explicitly. For an abridged treatment, the reader may skip over Willard and Kelley subnets.

Frequent subnets (introduced in 7.16) are important enough to deserve mention, but they are much more specialized. In general, they cannot be used interchangeably with the other three kinds of subnets; this will be shown in 17.29.

Subnets are a generalization of subsequences. Recall that $(y_p : p \in \mathbb{N})$ is a subsequence of $(x_n : n \in \mathbb{N})$ if we can write $y_p = x_{\varphi(p)}$ or $y_p = x_{\varphi_p}$ for some positive integers $\varphi(1) < \varphi(2) < \varphi(3) < \cdots$. Analogous ideas for nets were gradually developed by Moore, Smith, Birkhoff, Tukey, and Kelley. The theory was popularized by Kelley's textbook [1955/1975]. We shall say that $(y_\beta : \beta \in B)$ is a Kelley subnet of $(x_\alpha : \alpha \in A)$ if we can write $y_\beta = x_{\varphi(\beta)}$ or $y_\beta = x_{\varphi_\beta}$, for a function $\varphi : B \to A$ satisfying certain technical conditions discussed in 7.15.b below. A slight variant on Kelley's definition was given by Willard [1970].

While Kelley et al. were investigating nets, several other mathematicians (notably Cartan and Bourbaki) were developing an analogous theory of filters. Soon it became clear that the two systems of ideas yielded the same kinds of conclusions about uniform convergence, compactness, weak topologies, etc. Each system offered certain advantages: Nets look more like sequences and thus appeal more to the intuition of analysts; filters are amenable to arguments involving elementary set-theoretic operations and the Ultrafilter Principle. However, the two systems were not easily interchangeable; there was some awkwardness in the translation. Most mathematicians in convergence theory ended up using either nets or filters, but not both.

The difficulty is removed by a more general approach to subnets that has been suggested independently by several mathematicians (Smiley [1957], Aarnes and Andenæs [1972], Murdeshwar [1983], and perhaps others) but which, nevertheless, seems not to be widely known yet. We shall name this approach after Aarnes and Andenæs, because they investigated it in greatest depth; we present it in 7.15. The Aarnes and Andenæs (AA) approach moved further away from the original notion of subsequence, and dispensed altogether with the connecting function $\varphi : B \to A$.
Kelley's definition related two nets $x : A \to X$ and $y : B \to X$ by their behavior in the domains $A$ and $B$, but the AA approach relates the nets by their behavior in the codomain $X$. This approach makes nets and filters easily interchangeable, thus offering mathematicians the advantages of both systems.

7.15. Definitions. Let $(x_\alpha : \alpha \in A)$ and $(y_\beta : \beta \in B)$ be nets in a set $X$, with eventuality filters $\mathcal{F}$ and $\mathcal{G}$, respectively. Then:

a. The following conditions are equivalent. If any (hence all) of them are satisfied, we shall say that $(y_\beta)$ is a subnet of $(x_\alpha)$ (or more precisely, an AA subnet, or a subnet in the sense of Aarnes and Andenæs).

(A) Every $(y_\beta)$-frequent subset of $X$ is also $(x_\alpha)$-frequent. That is, if $y_\beta \in S$ for arbitrarily large values of $\beta$, then $x_\alpha \in S$ for arbitrarily large values of $\alpha$.

(B) Every $(x_\alpha)$-eventual subset of $X$ is also $(y_\beta)$-eventual. That is, if $x_\alpha \in S$ for all sufficiently large values of $\alpha$, then $y_\beta \in S$ for all sufficiently large $\beta$.

(C) $\mathcal{G} \supseteq \mathcal{F}$. (In other words, an AA subnet corresponds to a superfilter.)

(D) Each $(x_\alpha)$-tail set contains some $(y_\beta)$-tail set. That is, for each $\alpha_0 \in A$ there is some $\beta_0 \in B$ such that $\{y_\beta : \beta \succcurlyeq \beta_0\} \subseteq \{x_\alpha : \alpha \succcurlyeq \alpha_0\}$.

(E) For each eventual set $S \subseteq A$, the set $y^{-1}(x(S))$ is eventual in $B$.

b. We shall say $(y_\beta)$ is a Kelley subnet of $(x_\alpha)$ if there exists a function $\varphi : B \to A$ such that (i) $y = x \circ \varphi$; that is, $y_\beta = x_{\varphi(\beta)}$ for all $\beta \in B$; and (ii) for each eventual set $S \subseteq A$, the set $\varphi^{-1}(S)$ is eventual in $B$. Condition (ii) can be restated in either of these equivalent forms:

(ii') For each $\alpha_0 \in A$, there is some $\beta_0 \in B$ such that $\beta \succcurlyeq \beta_0 \Rightarrow \varphi(\beta) \succcurlyeq \alpha_0$.

(ii'') The $A$-valued net $\varphi : B \to A$ is an Aarnes-Andenæs subnet of the identity map $i_A : A \to A$.

c. Willard [1970] modified Kelley's definition slightly, adding a requirement of monotonicity; this may make the definition more palatable to many readers. We shall say $(y_\beta)$ is a Willard subnet of $(x_\alpha)$ if there exists a function $\varphi : B \to A$ such that (i) $y = x \circ \varphi$; that is, $y_\beta = x_{\varphi(\beta)}$ for all $\beta \in B$; (ii) $\varphi$ is monotone; that is, $\beta_1 \preccurlyeq \beta_2 \Rightarrow \varphi(\beta_1) \preccurlyeq \varphi(\beta_2)$; and (iii) for each $\alpha_0 \in A$ there is some $\beta_0 \in B$ such that $\varphi(\beta_0) \succcurlyeq \alpha_0$.
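A quick illustration of the tail-set condition (D) may help; the two sequences below are chosen here only for illustration. Take $x = (0, 5, 6, 7, 8, \ldots)$ and $y = (1, 5, 6, 7, 8, \ldots)$, that is, $x_1 = 0$, $y_1 = 1$, and $x_n = y_n = n + 3$ for $n \ge 2$.

```latex
\[
  \text{For each } \alpha_0 \in \mathbb{N}, \text{ put } \beta_0 = \max(\alpha_0, 2). \text{ Then}
\]
\[
  \{\, y_n : n \ge \beta_0 \,\}
  \;=\; \{\, k \in \mathbb{N} : k \ge \beta_0 + 3 \,\}
  \;\subseteq\; \{\, x_n : n \ge \alpha_0 \,\}.
\]
% The same computation with the roles of x and y exchanged shows that each
% x-tail is contained in some y-tail as well.
```

Hence each of the two sequences is an AA subnet of the other. Neither can be written as $x \circ \varphi$ or $y \circ \varphi$, since $0$ is not a value of $y$ and $1$ is not a value of $x$; so neither is a Kelley subnet of the other.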
7.16. Comparison of the definitions.

a. Show that any Willard subnet is also a Kelley subnet. The converse is not valid. For instance, each of the sequences $(2, 1, 4, 3, 6, 5, \ldots)$ and $(1, 2, 3, 4, 5, 6, \ldots)$ is a Kelley subnet of the other, but neither is a Willard subnet of the other.

b. Show that any Kelley subnet is also an Aarnes-Andenæs subnet. The converse is not valid. For instance, each of the sequences $(0, 5, 6, 7, 8, \ldots)$ and $(1, 5, 6, 7, 8, \ldots)$ is an AA subnet of the other, but neither is a Kelley subnet of the other.

c. Definition. Suppose $(x_\alpha : \alpha \in A)$ is a net in a set $X$ and $F$ is a frequent subset of the directed set $A$. Then $F$ is a directed set (see 7.10), and so $(x_\alpha : \alpha \in F)$ is a net. We shall say that it is a frequent subnet of the net $(x_\alpha : \alpha \in A)$. (In some of the literature, this is called a cofinal subnet.)

Frequent subnets are a generalization of subsequences. Let $(x_m : m \in \mathbb{N})$ and $(y_n : n \in \mathbb{N})$ be two sequences. Show that $(y_n)$ is a subsequence of $(x_m)$ if and only if $(y_n)$ is a frequent subnet of $(x_m)$.

d. Show that any frequent subnet is a Willard subnet (by using the inclusion map $i : F \to A$ for the map $\varphi$ in definition 7.15.c). The converse is not valid. For instance, show that $(1, 1, 2, 2, 3, 3, \ldots)$ is a Willard subnet, but not a frequent subnet, of the sequence $(1, 2, 3, 4, \ldots)$. Frequent subnets cannot be used interchangeably with Willard, Kelley, or AA subnets; see 17.29.

7.17. Further elementary properties.

a. Composition of subnets. If $(z_\gamma)$ is a subnet of $(y_\beta)$, and $(y_\beta)$ is a subnet of $(x_\alpha)$, then $(z_\gamma)$ is a subnet of $(x_\alpha)$. If the two given subnets are Kelley subnets, Willard subnets, or frequent subnets, then $(z_\gamma)$ is the same type of subnet of $(x_\alpha)$.

b. Suppose that $(x_\alpha : \alpha \in A)$ is a net in a set $X$ and $(x_\alpha)$ is eventually in some set of the form $E = E_1 \cup E_2 \cup \cdots \cup E_n \subseteq X$. Then there is at least one $j$ such that $(x_\alpha)$ is frequently in $E_j$. Thus $(x_\alpha)$ has a frequent subnet that takes all its values in $E_j$.

c. Two nets have the same eventuality filter if and only if each net is a subnet of the other. We shall then say the nets are AA-equivalent, or simply equivalent.

7.18. Lemma on Common Subnets. Let $(u_\alpha : \alpha \in A)$, $(v_\beta : \beta \in B)$, and $(w_\gamma : \gamma \in C)$ be three nets taking values in a set $X$. Say the nets have eventuality filters $\mathcal{F}$, $\mathcal{G}$, and $\mathcal{H}$, respectively. Then the following conditions are equivalent:

(A) $F \cap G \cap H$ is nonempty, for every $F \in \mathcal{F}$, $G \in \mathcal{G}$, $H \in \mathcal{H}$.

(B) $\mathcal{M} = \{S \subseteq X : S \supseteq F \cap G \cap H \text{ for some } F \in \mathcal{F},\ G \in \mathcal{G},\ H \in \mathcal{H}\}$ is a proper filter.

(C) The three filters have a common proper superfilter; i.e., there exists a proper filter which contains all three given filters.
(D) The three nets have a common AA subnet; i.e., there exists a net $(p_\lambda)$ which is an AA subnet of each of the given nets.

(E) The three given nets have a common Willard subnet; i.e., there exists a net $(p_\lambda : \lambda \in L)$ which is a Willard subnet of each of the three given nets. (It is understood that three different functions are used for the monotone mappings from $L$ into $A$, $B$, and $C$.)

Furthermore, that net can be chosen so that it is a maximal common AA subnet of the three given nets; i.e., so that if $(q_\mu)$ is any common AA subnet of the three given nets, then $(q_\mu)$ is also an AA subnet of $(p_\lambda)$.

Remarks. We have stated the lemma in terms of three nets and three filters to display a typical case. The number 3 may be replaced by any positive integer. Note that the filter $\mathcal{M}$ in condition (B) is a minimum common superfilter; i.e., it is the smallest filter containing all of the given filters. Any net corresponding to it is a maximal common AA subnet of the three given nets. A similar result is given by Gähler [1977].

Proof of lemma. The implications (C) $\Rightarrow$ (A) $\Rightarrow$ (B) $\Rightarrow$ (C) are easy; the implication (E) $\Rightarrow$ (D) is trivial. The equivalence of (C) and (D) is immediate from our correspondence between AA subnets and superfilters. It suffices to show that (A)-(D) together imply (E). It suffices to exhibit a net $(p_\lambda : \lambda \in L)$ whose eventuality filter is $\mathcal{M}$, such that $(p_\lambda : \lambda \in L)$ is a Willard subnet of each of the three given nets.

For each $(a, b, c) \in A \times B \times C$, define

$T_{a,b,c} = \{u_\alpha : \alpha \succcurlyeq a\} \cap \{v_\beta : \beta \succcurlyeq b\} \cap \{w_\gamma : \gamma \succcurlyeq c\} = \{x \in X : x = u_\alpha = v_\beta = w_\gamma \text{ for some } \alpha \succcurlyeq a,\ \beta \succcurlyeq b,\ \gamma \succcurlyeq c\}.$

Then $T_{a,b,c}$ is nonempty, by condition (A). Hence $L = \{(\alpha, \beta, \gamma) \in A \times B \times C : u_\alpha = v_\beta = w_\gamma\}$ is a frequent subset of $A \times B \times C$, when $A \times B \times C$ is given the product ordering. For each $\lambda = (\alpha, \beta, \gamma)$ in $L$, define $p_\lambda = u_\alpha = v_\beta = w_\gamma$. For the monotone mappings from $L$ into $A$, $B$, and $C$, use the coordinate projections of $A \times B \times C$; the remaining verifications are easy.

7.19. Corollary on equivalent subnets. If $(y_\beta)$ is an AA subnet of $(x_\alpha)$, then $(y_\beta)$ is equivalent (in the sense of 7.17.c) to a Willard subnet of $(x_\alpha)$.

Hints: The two given nets have a common AA subnet, namely $(y_\beta)$. As in 7.18(E), let $(p_\lambda)$ be a common Willard subnet and also a maximal common AA subnet of the two given nets. Since $(y_\beta)$ has the property for which $(p_\lambda)$ is maximal, $(y_\beta)$ is an AA subnet of $(p_\lambda)$. Thus $(y_\beta)$ and $(p_\lambda)$ are subnets of each other.

Remarks. We have seen that every Willard subnet is a Kelley subnet, every Kelley subnet is an AA subnet, and every AA subnet is equivalent to a Willard subnet. Consequently, the three types of subnets can be used interchangeably in many contexts. See especially 15.38.
7.20. Some properties of nets are subnet hereditary, in the sense that if a net has the property, then so does every subnet. For instance, we shall see in later chapters that in a topological space, every subnet of a convergent net is convergent. Likewise, some properties are supernet hereditary, in the sense that if a net has the property, then so does every supernet. For instance, in a topological space, the property of not being convergent is supernet hereditary.

Many proofs with nets involve such hereditary properties. Consequently, in many proofs it is possible to replace a given net with any convenient subnet, or with any convenient supernet. Some proofs use the phrase "we may assume ...," particularly in connection with hereditary properties. In many cases, what this means is that by relabeling, we may replace the given net with some subnet or supernet that has an additional property of interest. See the related discussion in 1.10.

7.21. Though AA subnets are simpler than Kelley subnets in most respects, Kelley subnets do have at least one advantage, which we now present in two formulations:

(1) Suppose that $f : X \to V$ is some function, $(x_\alpha : \alpha \in A)$ is some net in $X$, and $(y_\beta : \beta \in B)$ is some Kelley subnet of the net $(f(x_\alpha) : \alpha \in A)$ in $V$. Then $(x_\alpha : \alpha \in A)$ has a Kelley subnet $(s_\beta : \beta \in B)$ in $X$ such that $f(s_\beta) = y_\beta$ for each $\beta$. (Indeed, if $y_\beta = f(x_{\varphi(\beta)})$, take $s_\beta = x_{\varphi(\beta)}$.)

(2) Suppose that $((u_\alpha, v_\alpha) : \alpha \in A)$ is a net in some product of sets $U \times V$; then $(v_\alpha : \alpha \in A)$ is some net in $V$. Suppose that $(y_\beta : \beta \in B)$ is some Kelley subnet of the net $(v_\alpha : \alpha \in A)$ in $V$. Then $((u_\alpha, v_\alpha) : \alpha \in A)$ has a Kelley subnet $((p_\beta, q_\beta) : \beta \in B)$ such that $q_\beta = y_\beta$ for each $\beta$. (Indeed, if $q_\beta = y_\beta = v_{\varphi(\beta)}$, take $p_\beta = u_{\varphi(\beta)}$.)

These are actually two formulations of the same principle. To see this, observe that if $f$, $(x_\alpha)$, $(y_\beta)$ are given as in (1), then we can reformulate the problem as in (2) by taking $U = X$ and $(u_\alpha, v_\alpha) = (x_\alpha, f(x_\alpha))$. Conversely, if we are given $(u_\alpha, v_\alpha)$ as in (2), then we can reformulate the problem as in (1) by taking $X = U \times V$ and $x_\alpha = (u_\alpha, v_\alpha)$, and letting $f : X \to V$ be the projection onto the second coordinate.

UNIVERSAL NETS

7.22. Definition. A universal net (also occasionally known as an ultranet) in a set $X$ is a net $(x_\delta)$ with the property that for each set $S \subseteq X$, either (i) eventually $x_\delta \in S$ or (ii) eventually $x_\delta \in X \setminus S$.

7.23. Example. Let $(x_\delta)$ be a net in $X$. Assume $(x_\delta)$ is eventually constant; i.e., assume there exists some $z \in X$ such that eventually $x_\delta = z$. Then $(x_\delta)$ is a universal net. Although other universal nets exist, other explicit examples of universal nets do not exist! That is explained below.
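The claim that an eventually constant net is universal follows directly from the definition; the short verification below is a sketch written out for convenience, with $\delta_0$ denoting an index beyond which the net is constant.

```latex
\[
  \text{Suppose } x_\delta = z \text{ for all } \delta \succcurlyeq \delta_0,
  \text{ so that } \{x_\delta : \delta \succcurlyeq \delta_0\} = \{z\}.
\]
\[
  \text{For any } S \subseteq X: \qquad
  z \in S \;\Longrightarrow\; \{z\} \subseteq S,
  \qquad
  z \notin S \;\Longrightarrow\; \{z\} \subseteq X \setminus S .
\]
% In the first case S contains a tail set, so eventually x_delta is in S;
% in the second case X \ S contains a tail set, so eventually x_delta is in X \ S.
```

Exactly one of the two cases occurs for each $S$, which is the defining property of a universal net.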
7.24. Observations. A net $(x_\delta)$ is universal if and only if its eventuality filter is an ultrafilter. Thus, the theory of universal nets is simply a reformulation of the theory of ultrafilters. A net $(x_\delta)$ is eventually equal to some constant $x$ if and only if its eventuality filter is the fixed ultrafilter at $x$.

As we remarked in 6.33, free ultrafilters are intangibles. The same is therefore also true of universal nets that are not eventually constant. Though we have no explicit examples of these peculiar nets, nevertheless they are useful conceptual tools for some kinds of reasoning.

The Ultrafilter Principle, introduced in 6.32, can be reformulated as

(UF3) Universal Subnet Theorem (Tukey, Kelley). Every net has a subnet that is universal.

Likewise, the Weak Ultrafilter Theorem, presented in 6.33, can be reformulated as

(WUF2) Weak Universal Subnet Theorem. There exists a universal net in $\mathbb{N}$ that is not eventually constant.

If a net is universal, then any AA-equivalent net is also universal. Thus, by 7.19, in the discussions below it does not matter whether we use Willard subnets, Kelley subnets, or AA subnets.

7.25. Further properties of universal nets.

a. Let $(x_\delta : \delta \in \mathbb{D})$ be a net in some set $X$, and consider its range $R = \{x_\delta : \delta \in \mathbb{D}\}$. Show that $(x_\delta)$ is a universal net in $X$ if and only if $(x_\delta)$ is a universal net in $R$. Thus, the universality of a net $(x_\delta)$ in a set $X$ does not depend on the choice of $X$, as long as $X$ is large enough to contain all the points of that net.

b. If $(x_\delta)$ is a universal net in a set $X$ and $f : X \to Y$ is any function, then $(f(x_\delta))$ is a universal net in $Y$.

c. If $(x_\delta)$ is a universal net in $X$ and $x_\delta$ is frequently in some set $S \subseteq X$, then $x_\delta$ is eventually in $S$.

d. If $(x_\delta)$ is a universal net in a set $X = S_1 \cup S_2 \cup \cdots \cup S_n$, then there is at least one $j$ such that eventually $x_\delta \in S_j$. Hint: 5.8(E).

e. If $(x_\delta)$ is a universal net in a finite set $X$, then $(x_\delta)$ is eventually constant.

f. If $(x_n)$ is a universal net that is a sequence, then it is eventually constant. Hint: If $(x_n)$ has infinite range, then that range can be partitioned into two disjoint infinite sets. Then what?

g. If $(x_\delta)$ is a universal net, then any subnet of $(x_\delta)$ is AA-equivalent to $(x_\delta)$ and is also universal.

h. If a net $(x_\alpha : \alpha \in A)$ is not universal, then $A$ has two disjoint frequent sets $A_1$, $A_2$ such that the resulting frequent subnets $(x_\alpha : \alpha \in A_j)$ have disjoint ranges.
MORE ABOUT SUBSEQUENCES

7.26. Lemma. Let $(v_m)$ and $(y_m)$ be sequences in a set $X$. Then $v$ is an AA subnet of $y$ if and only if these two conditions are satisfied:

(i) $\operatorname{Range}(v) \setminus \operatorname{Range}(y)$ is a finite subset of $X$, and

(ii) for each $r \in X$, if $y^{-1}(r)$ is a finite subset of $\mathbb{N}$, then $v^{-1}(r)$ is also a finite subset of $\mathbb{N}$.

This argument is from Aarnes and Andenæs [1972].

Proof. First, suppose that $v$ is an AA subnet of $y$. Then $y$ is eventually in the set $\operatorname{Range}(y)$, hence $v$ is eventually in that set, hence $v_j \notin \operatorname{Range}(y)$ for only finitely many values of $j$; this implies (i). For condition (ii), suppose that $y^{-1}(r)$ is a finite set. Then $y$ is eventually in $X \setminus \{r\}$, then $v$ is also eventually in that set, hence the set $v^{-1}(r)$ is finite.

Conversely, suppose conditions (i) and (ii) are satisfied. Let $S$ be a subset of $X$ such that eventually $y \in S$; we are to show that eventually $v \in S$. The sets $\operatorname{Range}(y) \setminus S$ and $\operatorname{Range}(v) \setminus \operatorname{Range}(y)$ are finite, hence the set $\operatorname{Range}(v) \setminus S$ is finite, and it is a subset of $X \setminus S$. For each $r \in X \setminus S$, the set $y^{-1}(r)$ is finite, hence the set $v^{-1}(r)$ is finite. Therefore the set $F = \bigcup_{r \in \operatorname{Range}(v) \setminus S} v^{-1}(r)$ is a finite subset of $\mathbb{N}$. For $k$ sufficiently large, we have $k \notin F$. For all such $k$, we have $v_k \in S$.

7.27. Theorem on equivalent subsequences. Let $(x_i)$ and $(y_j)$ be sequences in a set $X$, and assume that $(y_j)$ is an AA subnet of $(x_i)$. Then $(y_j)$ is AA-equivalent to some subsequence of $(x_i)$.

This argument is from Aarnes and Andenæs [1972].

Proof. Since $\operatorname{Range}(x)$ is an eventual set for $(y_j)$, by discarding the first few terms of $(y_j)$ we may assume without loss of generality that $y(\mathbb{N}) \subseteq x(\mathbb{N})$. For each $r \in y(\mathbb{N})$, the set $y^{-1}(r)$ is nonempty. For such $r$, the set $x^{-1}(r)$ is nonempty. By 7.26, for each $r \in X$, if $x^{-1}(r)$ is finite, then $y^{-1}(r)$ is finite.

For each $r \in y(\mathbb{N})$, define a set $A_r \subseteq \mathbb{N}$ as follows:

• If $y^{-1}(r)$ is an infinite set, then $x^{-1}(r)$ is also an infinite set; in this case let $A_r = x^{-1}(r)$.

• If $y^{-1}(r)$ is a finite set, let $A_r$ be some nonempty finite subset of $x^{-1}(r)$. (Any such set will do for the purposes of this proof. If the reader desires a canonical choice of $A_r$, let $A_r$ be the singleton whose sole member is the first member of $x^{-1}(r)$.)

In either case we obtain $x(A_r) = \{r\}$. Also, the sets $A_r$ and $y^{-1}(r)$ are both finite or both infinite. Now let $A = \bigcup_{r \in y(\mathbb{N})} A_r$; then $A$ is an infinite subset of $\mathbb{N}$. Say its members are, in increasing order, $a_1 < a_2 < a_3 < \cdots$. Define $v_k = x_{a_k}$; then $(v_k)$ is a subsequence of $(x_i)$. It is clear from the definitions above that $v(\mathbb{N}) = x(A) = y(\mathbb{N})$, and for each $r \in y(\mathbb{N})$, the sets $v^{-1}(r)$ and $y^{-1}(r)$ are both finite or both infinite. By 7.26, $v$ and $y$ are AA subnets of each other.
7.28. (Optional.) There are a few minor differences between Aarnes-Andenæs subnets and Kelley subnets; here is one of them. Let $X$ be a given nonempty set. Does every net in $X$ have at least one subnet that is a sequence?

a. Yes, if we use Aarnes-Andenæs subnets and $X$ is a finite set. Indeed, there is at least one $x_0 \in X$ such that frequently $x_\alpha = x_0$. Then the constant sequence $(x_0, x_0, x_0, \ldots)$ is an AA subnet of $(x_\alpha)$.

b. No, if we use Kelley subnets. Indeed, take $A = \mathbb{N}^{\mathbb{N}}$ with the product ordering, and take $x$ to be any function from $A$ into $X$. Then no Kelley subnet of $(x_\alpha : \alpha \in A)$ is a sequence. Hint: If $(\alpha_1, \alpha_2, \alpha_3, \ldots)$ is a sequence in $A$, then each $\alpha_j$ is itself a sequence of positive integers. Say $\alpha_j = (m_{1j}, m_{2j}, m_{3j}, \ldots)$. Then there is no $j$ for which $\alpha_j \succcurlyeq (m_{11} + 1, m_{22} + 1, m_{33} + 1, \ldots)$.

c. No, if $X$ is an infinite set, regardless of which type of subnets we use. Indeed, let $\mathcal{U}$ be any free ultrafilter on $X$. (The existence of such an ultrafilter was established in 6.33.) Let $(x_\alpha)$ be a corresponding net; thus $(x_\alpha)$ is a universal net that is not eventually constant. If some sequence $(y_m)$ is a subnet of $(x_\alpha)$, then $(y_m)$ has the same eventuality filter $\mathcal{U}$, by 7.25; hence $(y_m)$ is universal and not eventually constant, contradicting 7.25.

7.29. Theorem. Let $X$ be a chain ordered set (e.g., the real line). Then any sequence in $X$ has a monotone subsequence.

Proof (Thurston [1994]). By a maximal element of a sequence we shall mean a maximal element of the range of that sequence. It is easy to see that if $s$ is a sequence that has no maximal element, then $s$ has an increasing subsequence. Now let $s = (x_1, x_2, x_3, \ldots)$ be a given sequence; we may assume that every subsequence of $s$ has a maximal element. Let $x_{n(1)}$ be a maximal element of $s$. Let $x_{n(2)}$ be a maximal element of $(x_{n(1)+1}, x_{n(1)+2}, x_{n(1)+3}, \ldots)$. Let $x_{n(3)}$ be a maximal element of $(x_{n(2)+1}, x_{n(2)+2}, x_{n(2)+3}, \ldots)$. Continuing in this fashion, we obtain positive integers $n(1) < n(2) < n(3) < \cdots$ satisfying $x_{n(1)} \ge x_{n(2)} \ge x_{n(3)} \ge \cdots$.

CONVERGENCE SPACES

7.30. By a convergence space (or limit space) we shall mean a set $X$ equipped with a function

$\lim : \{\text{proper filters on } X\} \to \{\text{subsets of } X\}.$

Any function can be used for $\lim$ in this definition, but in most cases of interest the function is determined by some structure already given on $X$: a topology, an ordering, a measure, etc. We emphasize that the value of $\lim$ is a subset of $X$. In some convergence spaces (e.g., the one used in college calculus), the set $\lim \mathcal{F}$ contains at most one point of $X$; such convergence spaces are discussed further in 7.36.
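Returning to the proof of 7.29: the selection $n(1) < n(2) < n(3) < \cdots$ can be imitated on a finite list, where every nonempty remainder automatically has a maximal element. The sketch below is only a finite analogue of that construction (the theorem itself concerns infinite sequences), and the function name is illustrative.

```python
def suffix_maxima_indices(xs):
    """Repeatedly pick the position of a maximal element of what remains.

    Returns 0-based indices n(1) < n(2) < ... such that the selected values
    are non-increasing, mirroring the choices made in the proof of 7.29.
    """
    picked = []
    start = 0
    while start < len(xs):
        # position of the first maximal element of xs[start:]
        k = max(range(start, len(xs)), key=lambda i: xs[i])
        picked.append(k)
        start = k + 1
    return picked


xs = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
indices = suffix_maxima_indices(xs)
values = [xs[i] for i in indices]
print(indices, values)
```

Each chosen element is maximal among all later entries, so every later choice is no larger than it; this is the same reason the proof obtains $x_{n(1)} \ge x_{n(2)} \ge x_{n(3)} \ge \cdots$.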
7.31. Whenever $(X, \lim)$ is a convergence space, then we shall extend the function $\lim$ in the following ways:

(a) If $\mathcal{B}$ is a filterbase on $X$, then $\lim \mathcal{B} = \lim \mathcal{F}$, where $\mathcal{F}$ is the filter generated by $\mathcal{B}$.

(b) If $(x_\alpha)$ is a net in $X$, then $\lim (x_\alpha) = \lim \mathcal{F}$, where $\mathcal{F}$ is the eventuality filter of $(x_\alpha)$.

Note that the resulting "function" $\lim : \{\text{nets in } X\} \to \{\text{subsets of } X\}$ is not a function strictly in the sense of 1.31, since $\{\text{nets in } X\}$ is a proper class, not a set, as we noted in 7.13. Note, also, that this function satisfies the following condition:

(*) if $(x_\alpha)$ and $(y_\beta)$ are nets with the same eventuality filter (i.e., if $(x_\alpha)$ and $(y_\beta)$ are AA-equivalent), then the set of limits of $(x_\alpha)$ is equal to the set of limits of $(y_\beta)$.

Conversely, if (*) is satisfied, then (b) defines a corresponding limit function on the collection of all proper filters on $X$. Thus, we have this alternate definition: A convergence space is a set $X$ that is equipped with some function $\lim : \{\text{nets in } X\} \to \{\text{subsets of } X\}$ that satisfies (*). (In many applications, we verify (*) by verifying a stronger property described below in 7.34.) Indeed, in convergence spaces we may use nets and their eventuality filters interchangeably (and use AA subnets and superfilters interchangeably, as well). Each type of object has its advantages.

7.32. More notations. If $\mathcal{F}$ is a proper filter or a net, the expression $z \in \lim \mathcal{F}$ will be read as "$z$ is a limit of $\mathcal{F}$." It may also be written as $\mathcal{F} \to z$ and read as "$\mathcal{F}$ converges to $z$." The statement "$\mathcal{F}$ does not converge to $z$" may be written as $z \notin \lim \mathcal{F}$, or as $\mathcal{F} \not\to z$.

Many variants on these notations can be used for clarification; we shall not attempt to list them all here. For instance, for a net $(x_\alpha : \alpha \in A)$, the expression $x_\alpha \to z$ may also be written as "$x_\alpha \to z$ in $X$ as $\alpha$ increases in $A$." When two or more convergences are being considered, we may use a prefix or subscript or superscript to distinguish them; for instance, we may write $z \in \tau\text{-}\lim x_\alpha$ or $z \in \lim_\tau x_\alpha$ or $x_\alpha \xrightarrow{\tau} z$ to indicate that $z$ is a limit of the net $(x_\alpha)$ when we use the convergence function determined by some structure $\tau$, rather than some other structure $\sigma$. Other variants on the notation should be clear from the context.

7.33. Remarks. For more general theories of convergences than those considered in this book, see: Bentley, Herrlich, and Lowen-Colebunders [1970] for categories of convergence spaces; Dolecki and Greco [1986] for algebraic properties of collections of convergence structures; and Gähler [1984] for "convergence spaces" that are more general than "filter convergence spaces." In Kelley's [1955/1975] book, net convergences are considered in great generality, without condition (*) being imposed a priori; see the remarks in 15.10.

7.34. Let $p : X \to Y$ be a mapping from one convergence space into another. We shall say $p$ is convergence preserving if it has this property:
whenever (x_α) is a net converging to a limit x in X, then the net (p(x_α)) converges to p(x) in Y; or, equivalently, whenever ℱ is a filter converging to a limit x in X, then the filter {S ⊆ Y : p⁻¹(S) ∈ ℱ} converges to p(x) in Y. (Exercise. Prove the equivalence.)

Observe that the composition of two convergence preserving maps is convergence preserving; this is discussed further in Chapter 9.

7.34. Definitions. Most convergence spaces of interest satisfy both of the properties below; in fact, these properties are satisfied by all the convergence spaces that we shall consider in this book. (Some mathematicians make one or both of these properties a part of their definition of convergence space.)

a. A convergence space is centered if it has the property that if 𝒰_z is the ultrafilter fixed at z, then 𝒰_z → z; or, equivalently, if (x_α) is a net such that eventually x_α = z, then x_α → z.

b. A convergence space is isotone if it has this property: if 𝒢 is a superfilter of ℱ, and ℱ → z, then 𝒢 → z; or, equivalently, if (y_β) is a subnet of (x_α), and x_α → z, then y_β → z. In the last sentence, it does not matter which type of subnet we use (Willard, Kelley, or AA), since we have built condition (*) of 7.31 into our definition of convergence space. On the other hand, for AA subnets the isotonicity condition above implies condition (*) of 7.31.

7.35. Exercise. Let X be an isotone convergence space. If (x_α) is a universal net and some subnet of (x_α) converges to z, then x_α → z also.

7.36. Definitions. A convergence space (X, lim) is Hausdorff if each net or proper filter ℱ has at most one limit, i.e., if each set of the form lim ℱ contains at most one member.

When (X, lim) is a Hausdorff convergence space, then z ∈ lim ℱ may be rewritten as z = lim ℱ; we say that z is the limit of ℱ. (Now the notation should begin to look more like that of college calculus.) In effect, our original limit function, which took values in {subsets of X}, is replaced by a new function, again denoted by "lim," which takes values in X. Thus, we are not asserting that z = {z}. The distinction between the two different lim functions should be clear in most contexts and should not cause any confusion.

Most convergence spaces or topological spaces in applications are Hausdorff, and so some mathematicians incorporate the Hausdorff condition into other definitions; e.g., they make it a part of their definition of convergence space, gauge space, compact space, completely regular space, topological linear space, or locally convex space. We shall not
follow that practice, for many of the concepts in this book are revealed more clearly if Hausdorffness is treated as a separate property. Throughout this text, Hausdorffness will be assumed only when stated explicitly. It is often helpful to analyze Hausdorff spaces in terms of other, simpler spaces that are not Hausdorff (see 15.25.d).

7.37. More notation. If X and Y are convergence spaces and Y is Hausdorff, then the equation

  lim_{x → x0} f(x) = y0

is a condition on x0, y0, and f, with the following meaning: Whenever (x_α) is a net in X \ {x0} that converges in X to x0, then f(x_α) → y0 in Y. Most limits in college calculus are of this form, in some cases with x0 or y0 equal to ∞. Making ∞ a member of our convergence space is not particularly difficult; see the discussions in 5.15.g and 18.24.

Remarks. The two most important kinds of convergences are the topological convergences, studied in Chapter 15, and the order convergences, studied in the remainder of this chapter.

CONVERGENCE IN POSETS

7.38. Definition. Let (x_α : α ∈ A) be a net taking values in a partially ordered set (X, ≼). Let z ∈ X. The literature contains several different, inequivalent definitions of convergence in partially ordered sets. The following one works best for our purposes, despite its complexity. It can be restated in other ways that are sometimes more convenient, in special contexts; see 7.40.d and 7.41 and 7.45.

We shall say that (x_α) is order convergent to z (sometimes written x_α →o z) if there exist nonempty sets S, T ⊆ X such that (S, ≼) and (T, ≽) are directed sets, sup(S) and inf(T) both exist in X and are equal to z, and for each fixed s ∈ S and t ∈ T we have eventually s ≼ x_α ≼ t.

(We emphasize that T is to be a directed set when we reverse the restriction of the given ordering. Thus, each finite subset of S must have an upper bound in S, and each finite subset of T must have a lower bound in T.)

Remarks. The most important type of order convergence needed by analysts is the order convergence in ℝ; that special case should be kept in mind by the reader at all times throughout the remainder of this chapter. However, many of the basic properties of order convergence in ℝ generalize readily to other settings that are occasionally useful. Thus, we begin our study of order convergence in a setting that has as few hypotheses as possible: the setting of partially ordered sets.

7.39. Definitions. Let (X, ≼) be a poset, and let (x_α : α ∈ A) be a net in X. We say that (x_α) is increasing if α ≼ β ⇒ x_α ≼ x_β.
This may be abbreviated x_α ↑. We say that (x_α) increases to a limit z, denoted x_α ↑ z, if in addition z = sup{x_α : α ∈ A}. Analogously, a net (x_α : α ∈ A) is decreasing (written x_α ↓) if α ≼ β ⇒ x_α ≽ x_β; the net decreases to a limit z (written x_α ↓ z) if in addition z = inf{x_α : α ∈ A}. A net is monotone if it is increasing or decreasing.

7.40. Exercises. Let (x_α : α ∈ A) be a net in a poset (X, ≼), and let z ∈ X. Then:

a. x_α ↑ z if and only if (x_α) is increasing and x_α →o z (in the sense of 7.38). Also, x_α ↓ z if and only if (x_α) is decreasing and x_α →o z (in the sense of 7.38).

b. Convergence preserves inequalities. Suppose (x_α : α ∈ A) and (y_α : α ∈ A) are nets based on the same directed set, satisfying x_α ≼ y_α for all α. If x_α →o x∞ and y_α →o y∞, then x∞ ≼ y∞.

Hint: Let S^x and T^x be two sets that satisfy the conditions in 7.38 that define the convergence x_α →o x∞; let S^y and T^y be two sets that satisfy the conditions in 7.38 that define the convergence y_α →o y∞. Fix any s^x ∈ S^x and t^y ∈ T^y; then we have eventually s^x ≼ x_α ≼ y_α ≼ t^y, and thus s^x ≼ t^y. Use that fact to prove that sup(S^x) ≼ inf(T^y).

c. Order convergence is Hausdorff. Hint: Apply the preceding result with x_α = y_α. Thus, the statement x_α →o z may be rewritten as z = o-lim x_α.

d. Order convergence in terms of monotone convergence: x_α →o z (defined as in 7.38) if and only if there exist nets (u_β : β ∈ 𝔹) and (v_γ : γ ∈ ℂ) such that u_β ↑ z and v_γ ↓ z, and for each fixed β and γ we have α-eventually x_α ∈ {x : u_β ≼ x ≼ v_γ}.

Hints: For the "if" part, let S and T be the ranges of those nets (u_β) and (v_γ). For the "only if" part, let (u_β) and (v_γ) be given by the identity maps on the sets 𝔹 = S and ℂ = T.

e. Order convergence is centered and isotone.

f. The "squeeze theorem." Suppose (x_α : α ∈ A), (y_α : α ∈ A), (z_α : α ∈ A) are nets based on the same directed set, satisfying x_α ≼ y_α ≼ z_α for all α. If x_α →o w and z_α →o w, then also y_α →o w.

g. In a complete lattice, any monotone net converges.

h. Let (X, ≼) and (Y, ≼) be posets. (Here we use the same symbol ≼ for two different partial orderings.) Let f : X → Y be some function that is sup-preserving and inf-preserving (see 3.22). Then f is also convergence-preserving (see 7.33), if X and Y are equipped with their order convergences. Hint: First show that f preserves the convergence of monotone sequences, i.e., the convergences described in 7.40.a; then use 7.40.d.

Remark: The assumptions cannot be weakened substantially; in 15.45 we give a partial converse. (Remark. Compare with 26.52(E).)
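The definition in 7.38 can be checked by hand on a simple sequence in ℝ. The following worked example is an illustration added here, not taken from the text; the sequence and the sets S and T are choices made for the sketch.

```latex
% Illustrative example: order convergence of an oscillating sequence in \mathbb{R}.
Let $x_n = (-1)^n / n$ for $n \in \mathbb{N}$, and let $z = 0$. Take
\[
  S = \{ -1/k : k \in \mathbb{N} \}, \qquad
  T = \{ 1/k : k \in \mathbb{N} \}.
\]
Both sets are chains, so $(S, \le)$ and $(T, \ge)$ are directed, and
\[
  \sup(S) = 0 = \inf(T).
\]
Fix $s = -1/k \in S$ and $t = 1/j \in T$. For every $n \ge \max\{k, j\}$ we have
\[
  -\frac{1}{k} \;\le\; -\frac{1}{n} \;\le\; x_n \;\le\; \frac{1}{n} \;\le\; \frac{1}{j},
\]
so eventually $s \le x_n \le t$. Hence $x_n \xrightarrow{\;o\;} 0$ in the sense of 7.38,
even though $(x_n)$ is neither increasing nor decreasing.
```

The same sets also exhibit the squeeze pattern: the sequences −1/n and 1/n are monotone, converge to 0, and bracket x_n.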
7.41. Theorem on convergence in chains. Let (X, ≤) be a chain. Let (x_α : α ∈ A) be a net in X, and let z ∈ X. Then x_α →o z (that is, order convergence, as defined in 7.38 or characterized in 7.40.d) if and only if these two conditions are satisfied for all σ and τ in X:

(i) if z < τ, then eventually x_α < τ, and
(ii) if z > σ, then eventually x_α > σ.

Remarks. Note that condition (i) is satisfied vacuously (i.e., for free) if z happens to be the largest element of X, for then there is no element τ that satisfies z < τ. Likewise, condition (ii) is satisfied vacuously if z happens to be the smallest element of X. Considering the examples of [−∞, +∞], [0, +∞), ℝ, we see that some chains have both a largest and smallest element, some chains have one or the other, and some chains have neither.

Proof of equivalence. It is an easy exercise that order convergence implies conditions (i) and (ii). Conversely, assume that (x_α) and z satisfy condition (i) above; we shall find a set T satisfying the conditions of 7.38. (Forming S from (ii) is similar; we omit the details.)

If the net (x_α) satisfies eventually x_α ≤ z, then the singleton T = {z} satisfies the requirements for 7.38, and we are done. Assume, therefore, that the net (x_α) does not satisfy eventually x_α ≤ z. That is, frequently x_α > z. Let T = {t ∈ X : t > z}; we shall show that this set satisfies the requirements. We have frequently x_α ∈ T, and so T is nonempty. From condition (i) we see that for each t ∈ T, eventually x_α < t. It suffices to show that z = inf(T). Clearly z is a lower bound for T; we must show that it is larger than any other lower bound. Suppose, on the contrary, that z' is a lower bound for T and z' > z. Then z' is actually a member of T, and thus z' is the smallest element of T. Therefore z and z' are adjacent in the ordering, i.e., there is no other element of X between z and z'. Since z' ∈ T, we have eventually x_α < z' and thus eventually x_α ≤ z, a contradiction. This completes the proof.

7.42. Proposition (optional). Suppose X is an infinitely distributive lattice (as defined in 4.23). Then the lattice operations ∨, ∧ are "jointly continuous," in the following sense: If ((x_α, x'_α) : α ∈ A) is a net in X × X with x_α →o x and x'_α →o x', then x_α ∨ x'_α →o x ∨ x' and x_α ∧ x'_α →o x ∧ x'.

Proof. This argument follows Vulikh [1967]. We shall show x_α ∨ x'_α →o x ∨ x'; the result for meets is proved analogously. By assumption, there exist nets (u_λ : λ ∈ L), (u'_σ : σ ∈ S), (v_μ : μ ∈ M), (v'_τ : τ ∈ T) such that

  u_λ ↑ x,  u'_σ ↑ x',  v_μ ↓ x,  v'_τ ↓ x',

and for each fixed λ, σ, μ, τ we have α-eventually u_λ ≼ x_α ≼ v_μ and u'_σ ≼ x'_α ≼ v'_τ. Let L × S and M × T have the product orderings. Define

  u_{λ,σ} = u_λ ∨ u'_σ,  v_{μ,τ} = v_μ ∨ v'_τ.
Then for each fixed λ, σ, μ, τ we have α-eventually u_{λ,σ} ≼ x_α ∨ x'_α ≼ v_{μ,τ}. Furthermore, the net (u_{λ,σ} : (λ,σ) ∈ L × S) is increasing and (v_{μ,τ} : (μ,τ) ∈ M × T) is decreasing. Then

  sup_{(λ,σ) ∈ L×S} u_{λ,σ} = sup_{λ∈L, σ∈S} (u_λ ∨ u'_σ) = (sup_{λ∈L} u_λ) ∨ (sup_{σ∈S} u'_σ) = x ∨ x'

by 3.21. Use the infinite distributivity of the lattice to prove the middle equality in this string of equations:

  inf_{(μ,τ) ∈ M×T} v_{μ,τ} = inf_{μ∈M, τ∈T} (v_μ ∨ v'_τ) = (inf_{μ∈M} v_μ) ∨ (inf_{τ∈T} v'_τ) = x ∨ x'.

Thus u_{λ,σ} ↑ (x ∨ x') and v_{μ,τ} ↓ (x ∨ x'), so x_α ∨ x'_α →o x ∨ x'.

CONVERGENCE IN COMPLETE LATTICES

7.43. Remarks on applicability of the theory. When (X, ≼) is a complete lattice, then the preceding characterizations of order convergence can be restated in other forms that are sometimes more convenient.¹ In applications, one may wish to apply the following results to other posets (X, ≼) that are not order complete, e.g., the real line ℝ or a space such as C[0,1] = {continuous functions from [0,1] into ℝ}. Here are two commonly used methods for extending the theory to such spaces:

(i) We may work in some larger set Y ⊇ X that is order complete. For instance, ℝ can be embedded in [−∞, +∞], and C[0,1] can be embedded in [−∞, +∞]^[0,1].

(ii) Alternatively, we may find some subset of X that is order complete and arrange our applications so that everything of interest stays in that subset. For instance, although ℝ is not order complete, for any real numbers a, b with a < b, the interval [a, b] is. More generally, if (X, ≼) is Dedekind complete, then any set of the form [a, b] = {x ∈ X : a ≼ x ≼ b} is order complete. Thus, the theory of convergences in order complete sets is applicable to a Dedekind complete poset, provided that we restrict our attention to nets that are eventually bounded.

¹ Examples of complete lattices to keep in mind are the extended real line [−∞, +∞] and the space [0,1]^S = {functions from S into [0,1]} with the product ordering (for any set S).

7.44. Definitions. Let (x_α : α ∈ A) be a net in a complete lattice (X, ≼). Then we may define the related objects

  s_α = inf_{β ≽ α} x_β  and  t_α = sup_{β ≽ α} x_β.

Observe that s_α ≼ x_α ≼ t_α.

The net (s_α : α ∈ A) is increasing, hence it increases to a limit. That limit is called the liminf of the given net (x_α), since it is the limit of the infs; it is also called the lower limit of the net (x_α). The liminf of the x_α's is denoted liminf x_α, or sometimes lim x_α with a bar drawn under "lim". Note that if the given net (x_α) is a sequence, then (s_α) is also a sequence.
The net (t_α : α ∈ A) is decreasing, hence it decreases to a limit. That limit is called the limsup of the given net (x_α), since it is the limit of the sups; it is also called the upper limit of the net (x_α). The limsup of the x_α's is denoted limsup x_α, or sometimes lim x_α with a bar drawn over "lim". Note that if the given net (x_α) is a sequence, then (t_α) is also a sequence. Also note that liminf x_α ≼ limsup x_α.

7.45. Theorem on convergence in lattices. Let (X, ≼) be a complete lattice, and let (x_α : α ∈ A) be a net in X. Let z ∈ X. Then the following conditions are equivalent.²

(A) x_α →o z (as defined in 7.38 or as equivalently characterized in 7.40.d).

(B) The net's eventuality filter ℱ contains a family of intervals {[s_λ, t_λ] : λ ∈ Λ} such that ⋂_{λ∈Λ} [s_λ, t_λ] = {z}.

(C) liminf x_α = z = limsup x_α.

(D) There exist nets (s_α : α ∈ A) and (t_α : α ∈ A) (based on the given directed set A) such that s_α ↑ z and t_α ↓ z, and s_α ≼ x_α ≼ t_α for all α.

² Some of these conditions make sense in a more general setting, e.g., if we merely assume that (X, ≼) is a poset, and the literature sometimes uses one of these conditions as a definition of order convergence in such a setting. However, in such a setting the several conditions listed here are not all equivalent.

Proof. To prove (A) ⇒ (B), suppose (x_α) and z satisfy the conditions in 7.38; then take Λ = S × T, that is, consider the collection of order intervals {[s, t] : s ∈ S, t ∈ T}.

To prove (B) ⇒ (C), let u_α = inf_{β≽α} x_β and v_α = sup_{β≽α} x_β for each α ∈ A. Then u_α ≼ x_α ≼ v_α. Let U = liminf x_α and V = limsup x_α; then u_α ↑ U and v_α ↓ V. Temporarily fix any λ ∈ Λ. Since [s_λ, t_λ] ∈ ℱ, we have x_β ∈ [s_λ, t_λ] for all β sufficiently large. Therefore u_α, v_α ∈ [s_λ, t_λ] for all α sufficiently large. It follows that U, V ∈ [s_λ, t_λ]. This is valid for every λ. Hence U and V both lie in ⋂_{λ∈Λ} [s_λ, t_λ] = {z}.

For (C) ⇒ (D), define s_α and t_α as in 7.44.

It is obvious that (D) implies the condition given in 7.40.d; thus (D) ⇒ (A).

7.46. Remarks. In a complete lattice, the liminf and limsup exist in any case, whether the limit exists or not. In cases where the limit does not exist or is not known to exist, the liminf and limsup serve as "almost limits," or "pseudolimits." They possess many of the properties one associates with a limit, and they can be used in place of a limit in many arguments. However, when a net has a limit, that limit is equal to the liminf and limsup. For a very different sort of generalized limit, see 12.33.

7.47. Further properties. Let (X, ≼) be a complete lattice.

a. Suppose (y_β : β ∈ 𝔹) is a subnet of (x_α : α ∈ A) in X. Then

  liminf x_α ≼ liminf y_β ≼ limsup y_β ≼ limsup x_α.

Hints: Fix any α ∈ A, and let t_α = sup_{γ≽α} x_γ. Show that eventually x_γ ≼ t_α, hence eventually y_β ≼ t_α, hence limsup y_β ≼ t_α. Then what?
b. Suppose (x_α : α ∈ A) and (y_α : α ∈ A) are nets based on the same directed set A, and x_α ≼ y_α for all α (or for all α sufficiently large). Then liminf x_α ≼ liminf y_α and limsup x_α ≼ limsup y_α, and if both nets possess limits then lim x_α ≼ lim y_α. See also the related result in Chapter 15.

7.48. Convergence of sets. Let (S_α) be a net whose elements are subsets of a set Ω, and let S ⊆ Ω also. What does S_α → S mean? There are many different definitions in the literature, not all equivalent. The simplest of these, and perhaps the most frequently useful, is in terms of the ordering in which S ≼ T means that S ⊆ T. That ordering makes 𝒫(Ω) a complete lattice. Then for any net (S_α), we have

  limsup S_α = ⋂_{α∈A} ⋃_{β≽α} S_β = {ω ∈ Ω : frequently ω ∈ S_α},
  liminf S_α = ⋃_{α∈A} ⋂_{β≽α} S_β = {ω ∈ Ω : eventually ω ∈ S_α}.

Then it is always true that liminf S_α ⊆ limsup S_α, and we define S_α → S to mean that liminf S_α = S = limsup S_α, or equivalently that S ⊆ liminf S_α and limsup S_α ⊆ S. That convergence can also be restated in terms of the characteristic functions of the sets: it says for each ω ∈ Ω, eventually 1_{S_α}(ω) = 1_S(ω).

Note that if 𝒮 is a σ-algebra of subsets of Ω, and (S_n) is a sequence in 𝒮, then limsup S_n and liminf S_n both lie in 𝒮.

7.49. Remarks. Although convergence of nets of sets is most often defined as in 7.48 (or by an equivalent formulation), other definitions are occasionally useful, particularly when the sets have some additional structure:

• A positive charge determines a pseudometric on an algebra of sets, as in Chapter 21. That pseudometric determines a convergence.

• The Hausdorff metric, defined in Chapter 5, determines a convergence for the nonempty, closed, metrically bounded subsets of a metric space.

• Several different topologies on the collection of closed subsets of a metric space are surveyed by Beer and Lucchetti [1993]. Each topology determines a convergence, as in Chapter 15.
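The formulas in 7.48 can be tried out on a sequence of finite sets. The sketch below is an illustration only: it assumes the sequence is eventually periodic (a finite `prefix` followed by a repeating `cycle`), which is what makes the infinite unions and intersections computable. The function name and that representation are hypothetical, not from the text.

```python
def lim_inf_sup(prefix, cycle):
    """Return (liminf, limsup) of the sequence prefix + cycle + cycle + ...

    Uses the formulas of 7.48:
        limsup S_n = intersection over n of (union of S_m for m >= n)
        liminf S_n = union over n of (intersection of S_m for m >= n)
    """
    prefix = [frozenset(s) for s in prefix]
    cycle = [frozenset(s) for s in cycle]
    if not cycle:
        raise ValueError("cycle must be nonempty")
    # The tail starting at index n contains the remaining prefix sets and
    # every set of the cycle.  All tails with n >= len(prefix) contain the
    # same sets, so n = 0, ..., len(prefix) covers every distinct tail.
    tails = [prefix[n:] + cycle for n in range(len(prefix) + 1)]
    tail_unions = [frozenset().union(*t) for t in tails]
    tail_intersections = [frozenset.intersection(*t) for t in tails]
    lim_sup = frozenset.intersection(*tail_unions)
    lim_inf = frozenset().union(*tail_intersections)
    return lim_inf, lim_sup


def converges(prefix, cycle):
    """S_n -> S in the sense of 7.48 exactly when liminf equals limsup."""
    lo, hi = lim_inf_sup(prefix, cycle)
    return lo == hi
```

For the sequence {9}, {1,2}, {2,3}, {1,2}, {2,3}, ... the element 2 is eventually in S_n, while 1 and 3 are only frequently in S_n, so the sequence does not converge; a sequence that is eventually constant converges to that constant set.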
Chapter 8

Elementary Algebraic Systems

MONOIDS

8.1. Definitions. A monoid is a triple (X, □, i) consisting of a set X, a binary operation □, and a special element i ∈ X (called the identity element of X), satisfying these rules:

  (x□y)□z = x□(y□z)  (associative law)
  x□i = x = i□x  (identity law)

for all x, y, z ∈ X. The fundamental operations of the monoid are □ (binary) and i (nullary). We may refer to X itself as a monoid if i and □ do not need to be mentioned explicitly. When we disregard i and □, and just consider X as a set, it is called the underlying set of the monoid.

Different monoids X and Y generally have different identity elements and different binary operations, but we may use the same symbols i and □ in different monoids if no confusion will result; we may use subscripts (i_X, i_Y, □_X, □_Y) for clarification if necessary.

8.2. Exercise. The identity element i in a monoid is uniquely determined. In fact, we don't even need the associative law for that result: if □ is a binary operation on a set X and i_1, i_2 ∈ X both satisfy the identity law in 8.1, then i_1 = i_2.

8.3. More definitions. In a monoid (X, □, i), a submonoid is a subset S ⊆ X that is closed under the fundamental operations of X, i.e., that satisfies i ∈ S and also satisfies s, t ∈ S ⇒ s□t ∈ S. Thus, it is a subset S that becomes a monoid in its own right when the monoid operations of X are restricted to S.

If (X, □, i_X) and (Y, ◇, i_Y) are monoids, a homomorphism from X to Y is a mapping f : X → Y that preserves the fundamental operations, i.e., a mapping such that

  f(i_X) = i_Y,  f(x□x') = f(x) ◇ f(x')

for all x, x' ∈ X. If f : X → Y is a bijective homomorphism, it is easy to see that f⁻¹ : Y → X is then a homomorphism as well; we call f an isomorphism of monoids.

A monoid (X, □, i) is commutative (or Abelian) if it also satisfies

  x□y = y□x  (commutative law)
for all x, y ∈ X.

For many commutative monoids, in place of □ we use the symbol +, known as addition. Then we say that the operation is written additively, and X is an additive monoid. In that case the identity element is denoted by "0" and known as zero or the additive identity. For nonempty subsets S, T of an additive monoid, we write x + S = {x + s : s ∈ S} and S + T = {s + t : s ∈ S, t ∈ T}. The definition of homomorphism can be restated for additive monoids thus:

  f(x + x') = f(x) + f(x'),  f(0) = 0.

The last equation can be written f(0_X) = 0_Y if some clarification is needed. When these conditions are met we shall say f is an additive map.

Caution: Algebraists occasionally use + for a noncommutative operation, but analysts generally do not. In this book addition will always represent a commutative operation.

For some monoids, not necessarily commutative, the symbol used in place of □ is a raised dot (·), known as multiplication. Then we say that the operation is written multiplicatively, and the monoid is a multiplicative monoid. In that case the identity element is denoted by "1" and known as one or the multiplicative identity. The symbol for multiplication may also be omitted altogether; for instance, we may write x · y instead as xy. Although this looks just like multiplication of real numbers, the reader is cautioned not to assume that it is commutative; in some cases it is, but usually it is not.

8.4. Examples of monoids.

a. ℝ is an additive monoid, as are certain subsets of ℝ, for instance [0, +∞), ℤ, ℕ ∪ {0}.

b. Arithmetic in the extended real number system [−∞, +∞] was defined in 1.17. The set [−∞, +∞] is not an additive monoid, for there is no suitable way to define (−∞) + (+∞); any other sum of two elements in [−∞, +∞] is defined. However, certain subsets of the extended real line are additive monoids, for instance [0, +∞] and ℕ ∪ {0} ∪ {+∞}. In [0, +∞] we can define not only finite sums x_1 + x_2 + ··· + x_n, but also infinite sums x_1 + x_2 + x_3 + ···; see Chapter 10.

c. A measure, or more generally a charge, is a particular type of mapping taking values in a monoid X. In most cases of interest, that monoid X is either [0, +∞] or a vector space. The choice of X is discussed further in Chapter 11. Measure theory will be introduced briefly in 11.37 and studied in much greater depth in Chapter 21.

d. (𝒫(X), ∪) and (𝒫(X), ∩) are commutative monoids for any set X.

e. Let X be a set. Then X^X = {functions from X into X} is a monoid, with the binary operation being the composition of functions, defined as in Chapter 2. The identity element of X^X is the identity map i_X : X → X, also defined in Chapter 2. Composition of functions is often written "multiplicatively," i.e., f ∘ g is often written simply as fg, but composition of functions generally is not commutative.
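The monoid axioms of 8.1 can be verified by brute force on a small finite example. The sketch below is an illustration, not part of the text: it takes X = {0, 1, 2}, represents each function X → X as a tuple of images, and checks that X^X under composition is a monoid that is not commutative, while subsets of a two-element set under union form a commutative monoid. The helper names are hypothetical.

```python
from itertools import product

X = range(3)
# A function f : X -> X is stored as a tuple, with f[x] the image of x.
FUNCS = list(product(X, repeat=len(X)))
IDENT = tuple(X)  # the identity map i_X


def compose(f, g):
    """Composition f o g, i.e. x |-> f(g(x))."""
    return tuple(f[g[x]] for x in X)


def is_monoid(elems, op, e):
    """Check closure, the identity law, and the associative law exhaustively."""
    elems = list(elems)
    pool = set(elems)
    closed = all(op(a, b) in pool for a in elems for b in elems)
    identity = all(op(a, e) == a == op(e, a) for a in elems)
    assoc = all(op(op(a, b), c) == op(a, op(b, c))
                for a in elems for b in elems for c in elems)
    return closed and identity and assoc


def is_commutative(elems, op):
    elems = list(elems)
    return all(op(a, b) == op(b, a) for a in elems for b in elems)


# Power set of {0, 1} with union; the identity element is the empty set.
SUBSETS = [frozenset(s) for s in ([], [0], [1], [0, 1])]


def union(a, b):
    return a | b
```

With these definitions, `is_monoid(FUNCS, compose, IDENT)` holds while `is_commutative(FUNCS, compose)` fails; for instance a constant map and a transposition do not commute.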
Actually, if (X, □, i) is any monoid, then X is isomorphic to a submonoid of X^X via the mapping u ↦ f_u, where f_u : X → X is the mapping defined by f_u(x) = u□x.

f. Let X be a set. Then 𝒫(X × X) = {subsets of X × X} is a monoid, with the binary operation being the composition of relations, defined as in Chapter 3. The identity of this monoid is the diagonal set I = {(x, x) : x ∈ X}. Identifying functions with their graphs, we find that X^X (discussed in 8.4.e) is a submonoid of 𝒫(X × X).

g. Let A be an alphabet, i.e., a collection of symbols that can be distinguished from one another. Let X be the set of all finite strings of symbols made from members of that alphabet; for instance, if a, b, c ∈ A, then abc and abac and cab are three different members of X. For a binary operation we use concatenation; for instance, abc ∘ abac = abcabac. The empty string, i.e., the string containing no symbols, will also be considered as a member of X; then it is the identity element and X is a monoid. More complicated algebraic systems that are similar to this one are the basis of formal logic, studied in Chapter 14.

8.5. Definitions. Let (X, □) be a monoid, with identity element i. If x□y = i, we say that x is a left inverse for y and that y is a right inverse for x; these are one-sided inverses. If x□y = y□x = i, we say that x and y are inverses of each other (or, for emphasis, two-sided inverses).

Exercises and examples. Suppose that (X, □, i) is a monoid.

a. Suppose x ∈ X, u_l is a left inverse of x, and u_r is a right inverse of x. Then u_l = u_r, and x has no other left or right inverses. Proof: u_l = u_l□i = u_l□(x□u_r) = (u_l□x)□u_r = i□u_r = u_r. The same reasoning can be applied if u_r is replaced by any other right inverse of x; thus all the right inverses of x are equal to u_l. Similarly, all the left inverses of x are equal to u_r.

b. Any element of a monoid has at most one inverse. If x has an inverse, that inverse may be denoted x⁻¹. In an additive monoid, that inverse may be denoted −x.

c. A monoid element may have many left inverses (or just one, or none). Similarly for right inverses. For instance, let ℝ^ℕ = {sequences of real numbers}, and let X = {functions from ℝ^ℕ into ℝ^ℕ}, with composition for the binary operation. Define

  x(r_1, r_2, r_3, ...) = (r_2, r_3, r_4, ...).

Also, for each real number p, define

  y_p(r_1, r_2, r_3, ...) = (p, r_1, r_2, r_3, ...).

Then x ∘ y_p = i, so x has many right inverses. The element x has no left inverses; this can be proved directly or using 8.5.a. To reverse this example, use the same set X, but use binary operation □ defined by u□v = v ∘ u.

GROUPS

8.6. Definitions. A group is a monoid in which each element has an inverse. We shall restate this definition more directly:
A group is a quadruple (X, □, ⁻¹, i) consisting of a set X and three fundamental operations that obey certain axioms. The three fundamental operations are a binary operation (x, y) ↦ x□y, a unary operation x ↦ x⁻¹, and a nullary operation i, that is, a specially selected element i ∈ X. The axioms are

  (x□y)□z = x□(y□z)  (associative law)
  x□i = x = i□x  (identity law)
  x□x⁻¹ = i = x⁻¹□x  (law of inverses)

for all x, y, z ∈ X. The group is commutative (or Abelian) if it also satisfies

  x□y = y□x  (commutative law)

for all x, y ∈ X.

8.7. Exercise (optional). There is some redundancy in our list of axioms for a group; a shorter list would suffice: Suppose X is a set equipped with a binary operation □, a unary operation x ↦ x⁻¹, and a special element (i.e., a nullary operation) i, satisfying these axioms: □ is associative, i□x = x, and x⁻¹□x = i for all x ∈ X. Show that the set and operations must also satisfy x□x⁻¹ = i, x□i = x, and (x⁻¹)⁻¹ = x for all x.

Hint: [(x⁻¹)⁻¹□x⁻¹] □ (x□x⁻¹) = (x⁻¹)⁻¹ □ [(x⁻¹□x)□x⁻¹].

8.8. More notation. An additive group or multiplicative group is a group in which the binary operation is written as + or ·, respectively. In this book + will only be used for a commutative operation.

In an additive group, the inverse of an element x is written as −x, and the sum x + (−y) is abbreviated x − y. For nonempty subsets S, T of an additive group we write

  −S = {−s : s ∈ S}  and  S − T = {s − t : s ∈ S, t ∈ T}.

In a multiplicative, commutative group, the product x · (y⁻¹) is also written x/y, or as a fraction with x over y.

8.9. More definitions. A subgroup of a group (X, □, ⁻¹, i) is a set S ⊆ X that is closed under the group's fundamental operations, i.e., that includes the identity element and also satisfies

  x, y ∈ S ⇒ x□y, x⁻¹ ∈ S.

Thus, it is a subset S that becomes a group in its own right when the fundamental operations of X are restricted to S.

The subgroup generated by a set B ⊆ X is the smallest subgroup that includes B, i.e., the intersection of all the subgroups that include B. Thus, it is the closure of B (in the sense of 4.6) under the fundamental operations of the group.

A homomorphism between groups (X, □, ⁻¹, i_X) and (Y, ◇, ⁻¹, i_Y) is a mapping f : X → Y satisfying

  f(x□x') = f(x) ◇ f(x'),  f(i_X) = i_Y,  f(x⁻¹) = f(x)⁻¹
for all x, x' ∈ X. Actually, the second and third equations can be omitted from this definition, for they follow as consequences of the first equation; the proof of this is an easy exercise. Thus, if X and Y are groups, then a mapping f : X → Y is a homomorphism of groups if and only if it is a homomorphism of monoids.

When X and Y are additive groups, the last two equations can be rewritten as f(0) = 0 and f(−x) = −f(x). Thus, a mapping f : X → Y between additive groups is a homomorphism if and only if it satisfies f(x_1 + x_2) = f(x_1) + f(x_2) for all x_1, x_2 ∈ X; we may then call it an additive mapping.

An isomorphism of groups is a bijective homomorphism.

8.10. Elementary properties and examples of groups.

a. In any group, we have i⁻¹ = i, (x⁻¹)⁻¹ = x, and (x□y)⁻¹ = y⁻¹□x⁻¹.

b. Degenerate examples. The smallest group is a singleton; for instance, the singleton {0}, with obvious operations. All one-element groups are isomorphic to each other. In any group (X, □, ⁻¹, i), the subgroup generated by the empty set is the singleton consisting of just the identity element.

The next smallest group contains just two elements. All two-element groups are isomorphic to each other. One convenient representation is this: {1, −1} is a multiplicative group, again with obvious operations.

c. Let (X, □, i) be a monoid, and let G = {x ∈ X : x has an inverse}. Then G is a submonoid of X, and in fact G is a group. A particular example of this is given in 8.10.h.

d. The integers ℤ, the rational numbers ℚ, the real numbers ℝ, and the complex numbers ℂ are additive groups, when equipped with their usual addition operation. In fact, each is a subgroup of the next; that is, {0} ⊆ ℤ ⊆ ℚ ⊆ ℝ ⊆ ℂ. The set ℤ is the subgroup of ℚ, ℝ, or ℂ generated by the set {1}.

e. The positive reals, (0, +∞), may be viewed as a commutative group whose binary operation is ordinary multiplication (·) and whose identity element is the number 1. Some subgroups are the positive rational numbers and the set {2^k : k ∈ ℤ}.

The multiplicative group of positive real numbers is isomorphic to the additive group of real numbers, by the mapping x ↦ ln x.

f. Let r be a positive number. The interval [0, r) can be viewed as an additive group, referred to as the reals modulo r. The identity element is 0. The addition operation for this group is addition modulo r, defined as follows: Give x + y its usual meaning when x + y ∈ [0, r), and let x + y be replaced by x + y − r when x + y ∈ [r, 2r). (The values of r most commonly used here are 1 and 2π.) This group is isomorphic to the circle group, discussed in 10.32; consequently [0, r) itself is sometimes referred to as the circle group.

g. Let X be a set. Then (𝒫(X), Δ, i_{𝒫(X)}, ∅) is a commutative group, where 𝒫(X) denotes the power set of X and Δ denotes symmetric difference. Note that in this group, each member of 𝒫(X) is its own inverse; thus, the inverse operation is the identity map. Hence (A Δ C) Δ (B Δ C) = A Δ B.
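Examples 8.10.f and 8.10.g are easy to experiment with. The sketch below is an illustration only; it uses exact rational arithmetic (`fractions.Fraction`) so that the modular sums are not blurred by floating-point rounding, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def add_mod(x, y, r):
    """Addition modulo r on [0, r), following the two-case rule of 8.10.f."""
    if not (0 <= x < r and 0 <= y < r):
        raise ValueError("x and y must lie in [0, r)")
    s = x + y
    return s if s < r else s - r   # s lies in [0, 2r), so one subtraction suffices


def neg_mod(x, r):
    """Inverse of x in the reals modulo r."""
    return Fraction(0) if x == 0 else r - x


def power_set(universe):
    items = list(universe)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]


def sym_diff_identity_holds(universe):
    """Check (A ^ C) ^ (B ^ C) == A ^ B for all subsets A, B, C (8.10.g)."""
    subsets = power_set(universe)
    return all((a ^ c) ^ (b ^ c) == a ^ b
               for a in subsets for b in subsets for c in subsets)
```

In Python, `^` on sets is symmetric difference, so `a ^ a == frozenset()` expresses the fact that each member of the power set is its own inverse.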
Any algebra of subsets of $X$ (defined in 5.25) is a subgroup of $\mathcal{P}(X)$.

h. A bijection from a set $X$ onto itself is a permutation of $X$. If $X$ is any nonempty set, then $\mathrm{Perm}(X) = \{\text{permutations of } X\}$ is a group, with the binary group operation given by the composition of functions and with the identity element of the group being the identity function of $X$. In fact, this is the group of invertible elements obtained from the monoid $X^X$ (see 8.4.e and 8.10.b). If $X$ contains more than two elements, then the group $\mathrm{Perm}(X)$ is not commutative. If $n$ is a positive integer, then the permutation group on a set $X$ containing $n$ elements is also called the symmetric group of order $n$; it is written $S_n$. If $X$ is the underlying set of a group $(X, \square, i)$, then an isomorphism from $X$ onto a subgroup of $\mathrm{Perm}(X)$ is given by $u \mapsto f_u$, where the permutation $f_u : X \to X$ is given by $f_u(x) = u \square x$.

i. Let $X$ be an additive group. For $x \in X$ we define $0x = 0$, $1x = x$, $2x = x + x$, $3x = x + x + x$, ...; and for $n \in \mathbb{N}$ we also define $(-n)x = -(nx)$. In this fashion we define a "multiplication" operation $(n, x) \mapsto nx$, from $\mathbb{Z} \times X$ into $X$. By induction or any other convincing argument, show that
$$(mn)x = m(nx), \qquad (m + n)x = (mx) + (nx), \qquad m(x + y) = (mx) + (my)$$
for all $m, n \in \mathbb{Z}$ and $x, y \in X$. Also show that $\mathbb{Z}x = \{nx : n \in \mathbb{Z}\}$ is a subgroup of $X$. In fact, it is the subgroup generated by the singleton $\{x\}$.

SUMS AND QUOTIENTS OF GROUPS

8.11. Let $S_1, S_2, \ldots, S_n$ be finitely many subgroups of an additive group $X$. Then the sum of the $S_j$'s is the set
$$S_1 + S_2 + \cdots + S_n = \{s_1 + s_2 + \cdots + s_n : s_1 \in S_1,\ s_2 \in S_2,\ \ldots,\ s_n \in S_n\}.$$
More generally, let $\{S_\lambda : \lambda \in \Lambda\}$ be a collection of subgroups of an additive group $X$. Their sum, $\sum_{\lambda \in \Lambda} S_\lambda$, is defined to be the set of all sums of finitely many elements of $\bigcup_{\lambda \in \Lambda} S_\lambda$. In other words, it is the set of all sums of the form $s = s_1 + s_2 + s_3 + \cdots + s_n$, where $n$ is a nonnegative integer and each $s_j$ is a member of some $S_\lambda$. Show that

(i) $\sum_{\lambda \in \Lambda} S_\lambda$ is the union of sums of finitely many of the $S_\lambda$'s.

(ii) $\sum_{\lambda \in \Lambda} S_\lambda$ is the subgroup of $X$ generated by the set $\bigcup_{\lambda \in \Lambda} S_\lambda$.
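The definition of a sum of subgroups can be tried out on a small finite example. The following is only a sketch: the group $\mathbb{Z}_{12}$ (integers with addition modulo 12), the two subgroups, and the helper names `generated_subgroup` and `subgroup_sum` are illustrative choices, not taken from the text. It computes $S_1 + S_2$ elementwise and compares the result with the subgroup generated by $S_1 \cup S_2$.

```python
from itertools import product

def generated_subgroup(gens, n):
    """Subgroup of the additive group Z_n generated by gens.

    In a finite group, closing {0} under addition of the generators
    already yields a subgroup (inverses appear automatically).
    """
    elems = {0}
    frontier = [0]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = (x + g) % n
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)

def subgroup_sum(subgroups, n):
    """All sums s1 + ... + sk (mod n) with sj taken from the j-th subgroup."""
    return frozenset(sum(choice) % n for choice in product(*subgroups))

N = 12                               # hypothetical modulus for the example
S1 = generated_subgroup([4], N)      # multiples of 4 in Z_12
S2 = generated_subgroup([6], N)      # multiples of 6 in Z_12
total = subgroup_sum([S1, S2], N)    # the sum S1 + S2
generated = generated_subgroup(S1 | S2, N)
```

Here `total` and `generated` coincide (both are the even residues), which is what exercise (ii) predicts for this particular pair of subgroups.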
8.12. Let $S = S_1 + S_2 + \cdots + S_n$ be a sum of finitely many subgroups. The set $S$ is called the internal direct sum of the $S_j$'s if it has this further property: Each $s \in S$ can be expressed in one and only one way as $s = s_1 + s_2 + \cdots + s_n$, where $s_j \in S_j$. We then write $S = S_1 \oplus S_2 \oplus \cdots \oplus S_n$, or $S = \bigoplus_{j=1}^{n} S_j$. Such a decomposition may be helpful, because it may express a complicated object $S$ in terms of simpler $S_j$'s.

More generally, let $S = \sum_{\lambda \in \Lambda} S_\lambda$ be a sum of arbitrarily many subgroups. We say $S$ is the internal direct sum of the $S_\lambda$'s, and write $S = \bigoplus_{\lambda \in \Lambda} S_\lambda$, if each $s \in S$ can be written in one and only one way as a sum $s = \sum_{\lambda \in \Lambda} s_\lambda$, where each $s_\lambda$ is a member of $S_\lambda$ and only finitely many of the $s_\lambda$'s are nonzero. (The internal direct sum is often called the "direct sum," but it should not be confused with the external direct sum described in 9.30.)

If $S = \bigoplus_{\lambda} S_\lambda$, then we can define mappings $\pi_\lambda : S \to S_\lambda$ by the rule that $s = \sum_{\lambda \in \Lambda} \pi_\lambda(s)$; we may call $\pi_\lambda$ the projection onto $S_\lambda$. (The term "projection" also has other meanings; see 1.34 and 22.45.)

Exercises. Show that

a. Each $\pi_\lambda$, considered as a map from $S$ into either $S$ or $S_\lambda$, is additive.

b. Each mapping $\pi_\lambda$, considered as a map from $S$ into itself, is idempotent (defined in 2.4); it has range $S_\lambda$.

c. $S$ is an internal direct sum of the subgroups $\{S_\lambda : \lambda \in \Lambda\}$ if and only if $S = \sum_{\lambda \in \Lambda} S_\lambda$ and $S_\mu \cap \sum_{\lambda \neq \mu} S_\lambda = \{0\}$ for each $\mu \in \Lambda$.

8.13. An important special case is that in which an additive group $X$ itself is the internal direct sum of two subgroups, say $S$ and $T$. Then we write $X = S \oplus T$. This means that each $x \in X$ can be written in one and only one way in the form $s + t$, where $s \in S$ and $t \in T$; or, equivalently, that $S + T = X$ and $S \cap T = \{0\}$. We shall then say that the subgroups $S$ and $T$ are additively complementary, or that they are additive complements of each other. (Some mathematicians would simply call these sets "complements" of each other, but in this book we have too many other uses for that term.) Some basic properties of direct sum decompositions are

a. Suppose $X = S \oplus T$. Let $\pi_S : X \to S$ and $\pi_T : X \to T$ be the projections, defined as in 8.12; that is, $x = \pi_S(x) + \pi_T(x)$ for each $x$. Then
$$\pi_S + \pi_T = i_X \quad (\text{where } i_X \text{ is the identity map of } X),$$
$$\pi_S \circ \pi_T = \pi_T \circ \pi_S = 0,$$
$$\mathrm{Range}(\pi_S) = \mathrm{Ker}(\pi_T) = S \quad \text{and} \quad \mathrm{Range}(\pi_T) = \mathrm{Ker}(\pi_S) = T.$$

b. Conversely, suppose $X$ is an additive group and $p : X \to X$ is an idempotent homomorphism. Let $q = i_X - p$. Show that $q$ is also idempotent, and $X = \mathrm{Range}(p) \oplus \mathrm{Range}(q)$.
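A small finite case may make the projections concrete. The sketch below assumes the group $\mathbb{Z}_6$ with the subgroups $S = \{0, 3\}$ and $T = \{0, 2, 4\}$; these choices and the names `decompose`, `pi_S`, `pi_T` are illustrative only. It finds, for each $x$, every way of writing $x = s + t$, and then builds the two projections as lookup tables.

```python
N = 6              # work in Z_6 (addition modulo 6); an illustrative choice
S = [0, 3]         # subgroup generated by 3
T = [0, 2, 4]      # subgroup generated by 2

def decompose(x):
    """All pairs (s, t) with s in S, t in T and s + t = x (mod N)."""
    return [(s, t) for s in S for t in T if (s + t) % N == x]

# For a direct sum, every x has exactly one decomposition.
decomp = {x: decompose(x) for x in range(N)}

# The projections, read off from the unique decomposition x = s + t.
pi_S = {x: decomp[x][0][0] for x in range(N)}
pi_T = {x: decomp[x][0][1] for x in range(N)}

# Properties of the kind listed for direct sum decompositions.
unique = all(len(pairs) == 1 for pairs in decomp.values())
sum_is_identity = all((pi_S[x] + pi_T[x]) % N == x for x in range(N))
compositions_vanish = all(pi_S[pi_T[x]] == 0 and pi_T[pi_S[x]] == 0 for x in range(N))
idempotent = all(pi_S[pi_S[x]] == pi_S[x] and pi_T[pi_T[x]] == pi_T[x] for x in range(N))
additive = all(
    pi_S[(x + y) % N] == (pi_S[x] + pi_S[y]) % N
    for x in range(N) for y in range(N)
)
```

In this example the projections happen to be $x \mapsto 3x$ and $x \mapsto 4x$ modulo 6; that is a feature of the example, not a general formula.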
8.14. Let $G$ be an additive group, and let $H$ be a subgroup. The cosets of $H$ are the sets
$$x + H = \{x + h : h \in H\}.$$
Note that any two cosets are either identical or disjoint; thus they form a partition of $G$. Let $G/H$ be the set of all cosets of $H$. Since the cosets of $H$ form a partition of $G$, they define an equivalence relation on $G$ by:
$$g_1 \approx g_2 \iff g_1, g_2 \text{ belong to the same coset} \iff g_1 - g_2 \in H.$$
The cosets of $H$ are the equivalence classes for this equivalence relation, and $G/H$ is the quotient set (as in 3.11). The quotient map $\pi : G \to G/H$ (defined as in 3.11) is given by $\pi(g) = g + H$. Note that it satisfies
$$\pi(\pi^{-1}(B)) = B, \qquad \pi^{-1}(\pi(A)) = A + H$$
for any $B \subseteq G/H$, for any $A \subseteq G$.

Define sums of sets as in 8.3. Show that
$$(x + H) + (y + H) = (x + y) + H, \qquad -(x + H) = (-x) + H.$$
Consequently, show that $G/H$ is an additive group with identity element $0 + H$ and with other operations defined as above; the group $G/H$ is called the quotient group. The quotient map is a group homomorphism from $G$ onto $G/H$. Here $G$ is an additive group, whereas Algebra books contain a more general theory of quotients, applicable to groups that are not necessarily commutative. However, that theory is more complicated and will not be needed for our purposes.

The circle group $[0, 1)$, introduced in 8.10.f, can also be described as the quotient of the additive group $\mathbb{R}$ by the subgroup $\mathbb{Z}$.

8.15. Not every quotient group $G/H$ is isomorphic to a subgroup of $G$. Example. The circle group is not isomorphic to a subgroup of $\mathbb{R}$. One easy way to show this is to note that 0 and $\tfrac{1}{2}$ are distinct solutions of $x + x = 0$ in $[0, 1)$. In the group $\mathbb{R}$, the equation $x + x = 0$ has only one solution.

8.16. Let $f : X \to Y$ be an additive mapping, i.e., a homomorphism of additive groups. Then the kernel of $f$ is the set
$$\mathrm{Ker}(f) = f^{-1}(0) = \{x \in X : f(x) = 0\}.$$
A few of its basic properties are:

a. $\mathrm{Ker}(f)$ is a subgroup of $X$; hence $0 \in \mathrm{Ker}(f)$.

b. $\mathrm{Ker}(f) = \{0\}$ if and only if $f$ is injective.

c. Degenerate examples. Let $X$ be any additive group. Then the identity map $i : X \to X$ has kernel $\{0\}$, and the constant map $x \mapsto 0$ (from $X$ into any additive group) has kernel $X$.

d. (Isomorphism Theorem.) Let $\pi : X \to X/\mathrm{Ker}(f)$ be the quotient map. Then $F(\pi(x)) = f(x)$ defines a group isomorphism $F : X/\mathrm{Ker}(f) \to \mathrm{Ran}(f)$.

8.17. Not every subgroup of every group has an additive complement. Example. $\mathbb{Z}$ is a subgroup of $\mathbb{R}$, but there is no subgroup $G \subseteq \mathbb{R}$ satisfying $\mathbb{R} = \mathbb{Z} \oplus G$. Indeed, show that if $G$ were such a group, it would be isomorphic to $\mathbb{R}/\mathbb{Z}$, and hence isomorphic to $[0, 1)$, contradicting the result in 8.15. (Contrast 11.30.)
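Cosets, kernels, and the induced isomorphism can all be listed explicitly in a finite case. The sketch below is illustrative only: it assumes the additive group $\mathbb{Z}_{12}$ and the map $f(x) = x \bmod 4$ into $\mathbb{Z}_4$, and the names `coset`, `quotient`, and `F` are hypothetical helpers. It checks that the cosets of the kernel partition the group and that $f$ takes a single value on each coset, so that $F(x + \mathrm{Ker}(f)) = f(x)$ is well defined and one-to-one.

```python
N = 12   # the group Z_12 under addition modulo 12 (illustrative choice)

def f(x):
    """Reduction mod 4: an additive map from Z_12 into Z_4 (works since 4 divides 12)."""
    return x % 4

G = range(N)
kernel = frozenset(x for x in G if f(x) == 0)

def coset(x, H):
    """The coset x + H inside Z_12."""
    return frozenset((x + h) % N for h in H)

# The quotient set: all distinct cosets of the kernel.
quotient = {coset(x, kernel) for x in G}

# Candidate induced map: record every value f takes on each coset.
F = {c: {f(x) for x in c} for c in quotient}

well_defined = all(len(values) == 1 for values in F.values())
images = sorted(next(iter(values)) for values in F.values())
is_partition = sum(len(c) for c in quotient) == N
```

Since `images` lists each element of the range exactly once, the induced map is a bijection from the quotient onto the range of `f` in this example.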
RINGS AND FIELDS

8.18. Definitions. A ring is an additive group $(R, +)$ equipped with another associative binary operation ($\cdot$), called multiplication, which distributes over addition on both the left and right:
$$w \cdot (x + y) = (w \cdot x) + (w \cdot y) \quad \text{and} \quad (x + y) \cdot w = (x \cdot w) + (y \cdot w)$$
for all $w, x, y \in R$. (By our definitions, the addition operation in any ring is commutative.)

A commutative ring is a ring in which the multiplication operation is also commutative. A ring with unit also has a special element 1 (one), such that $(R, \cdot, 1)$ is a monoid. A field is a commutative ring with unit, in which $0 \neq 1$ and in which every nonzero element has a multiplicative inverse. Consequently, in fields we are able to perform "ordinary arithmetic" computations. For instance, the student should prove (and explain) that in a field,
$$\frac{w}{x} + \frac{y}{z} = \frac{wz + xy}{xz}$$
for nonzero $x$ and $z$.

Examples. Some fields with which most readers are informally acquainted are $\mathbb{Q}$ and $\mathbb{R}$, with the usual operations of addition and multiplication; these are introduced formally in 8.22, 10.8, and 10.15. (Most of the rings used by analysts have additional structure: They are linear algebras, as explained in 11.3. However, $\mathbb{Z}$ is an important commutative ring that is not a linear algebra.)

Caution: Some mathematicians work only with rings with unit, and then they may refer to those objects simply as "rings." For a trivial example of a ring without unit, consider the even integers. For a less trivial example of considerable interest to analysts, see 11.4.

The fundamental operations of a ring with unit or a field are those of its additive group (the binary operation $+$, the unary operation $-$, and the nullary operation 0) and those of its multiplicative monoid (the binary operation $\cdot$ and the nullary operation 1). (See the related remarks in 8.54.) When we talk about fundamental operations and related concepts, then a field will simply be viewed as a particular type of ring with unit.

A homomorphism of rings with unit is a mapping $f : R \to S$ from one ring into another, which preserves the fundamental operations, i.e., which satisfies
$$f(x_1 + x_2) = f(x_1) + f(x_2), \quad f(-x) = -f(x), \quad f(0) = 0, \quad f(x_1 x_2) = f(x_1) f(x_2), \quad f(1) = 1$$
for all $x, x_1, x_2 \in R$. All of these conditions are conceptually relevant, but some of them are redundant, i.e., implied by some of the other conditions.

A homomorphism of fields will simply mean a homomorphism $f : R \to S$ of rings with unit, where $R$ and $S$ happen to be fields; no additional requirement is imposed on $f$ for this case. However, (exercise) it follows from our definition that if $f : R \to S$ is a homomorphism of fields, then $f$ is injective and $f(x^{-1}) = f(x)^{-1}$ for all $x \neq 0$.
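The conditions in the definition of a homomorphism of rings with unit, and the "ordinary arithmetic" identity for fractions, can both be spot-checked numerically. The sketch below is not a proof; it assumes the reduction map from $\mathbb{Z}$ into $\mathbb{Z}_6$ as the homomorphism (the modulus 6 and the sample ranges are arbitrary choices) and uses Python's `fractions.Fraction` as a stand-in for the field $\mathbb{Q}$.

```python
from fractions import Fraction
from itertools import product

M = 6  # modulus chosen only for illustration

def f(x):
    """Reduction mod M, viewed as a map from the ring Z into the ring Z_M."""
    return x % M

sample = range(-10, 11)

# Check that f preserves each fundamental operation on a finite sample.
preserves_operations = all(
    f(x1 + x2) == (f(x1) + f(x2)) % M       # binary operation +
    and f(x1 * x2) == (f(x1) * f(x2)) % M   # binary operation .
    and f(-x1) == (-f(x1)) % M              # unary operation -
    for x1, x2 in product(sample, repeat=2)
) and f(0) == 0 and f(1) == 1               # nullary operations 0 and 1

def fraction_identity(w, x, y, z):
    """w/x + y/z == (wz + xy)/(xz), evaluated exactly in Q."""
    return Fraction(w, x) + Fraction(y, z) == Fraction(w * z + x * y, x * z)

identity_holds = all(
    fraction_identity(w, x, y, z)
    for w, x, y, z in product(range(-3, 4), repeat=4)
    if x != 0 and z != 0
)
```

Note that `f` is not injective, which is consistent with the exercise above: $\mathbb{Z}$ and $\mathbb{Z}_6$ are rings with unit but not fields.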
8.19. Some elementary properties. For all $w, x, y, z$ in a ring $R$ with unit, we have

a. $0 \cdot x = 0 = x \cdot 0$.

b. $(-1) \cdot x = -x$; that is, the additive inverse of 1 times any ring element $x$ is the additive inverse of $x$.

c. $(-x) \cdot y = -(x \cdot y) = x \cdot (-y)$.

d. If $0 = 1$, then $R = \{0\}$. This is the smallest ring.

e. There is a unique homomorphism from $\mathbb{Z}$ into the ring $R$.

8.20. Example: finite rings and fields. Let $m$ be an integer greater than 1. For integers $x, y \in \mathbb{Z}$, write $x \equiv y \pmod{m}$ if $x - y$ is a multiple of $m$, that is, if $x - y = km$ for some integer $k$. We then say that $x$ and $y$ are congruent modulo $m$. It is easy to verify that $\equiv$ is an equivalence relation on $\mathbb{Z}$. The equivalence classes are most often represented by their smallest nonnegative members, i.e., the numbers $0, 1, 2, \ldots, m - 1$. The arithmetic operations make sense on the equivalence classes, since
$$x_1 \equiv y_1,\ x_2 \equiv y_2 \implies x_1 + x_2 \equiv y_1 + y_2,\ x_1 x_2 \equiv y_1 y_2.$$
Thus we obtain arithmetic operations on the set $\mathbb{Z}_m = \{0, 1, 2, \ldots, m - 1\}$, called the integers modulo $m$, which can be described more directly as follows: to add or multiply two numbers $x, y$ in $\mathbb{Z}_m$, take their ordinary sum or product in $\mathbb{Z}$, and then subtract a suitable multiple of $m$ to obtain an element of $\{0, 1, 2, \ldots, m - 1\}$. With these operations, $\mathbb{Z}_m$ is a commutative ring with unit. Note that, considered as an additive group, $\mathbb{Z}_m$ is the subgroup generated by $\{1\}$ in the group $[0, m)$ of reals modulo $m$, introduced in 8.10.f.

As an illustrative example, below are the addition and multiplication tables for $\mathbb{Z}_6$.

    + | 0 1 2 3 4 5        . | 0 1 2 3 4 5
    --+------------        --+------------
    0 | 0 1 2 3 4 5        0 | 0 0 0 0 0 0
    1 | 1 2 3 4 5 0        1 | 0 1 2 3 4 5
    2 | 2 3 4 5 0 1        2 | 0 2 4 0 2 4
    3 | 3 4 5 0 1 2        3 | 0 3 0 3 0 3
    4 | 4 5 0 1 2 3        4 | 0 4 2 0 4 2
    5 | 5 0 1 2 3 4        5 | 0 5 4 3 2 1

Recall that a prime number is one of the numbers 2, 3, 5, 7, 11, etc., that is, an integer greater than 1 that can only be written as a product of two positive integers if one of those factors is 1. It is an easy exercise to show that the finite ring $\mathbb{Z}_m$ is a field if and only if $m$ is a prime number. In particular, $\mathbb{Z}_2 = \{0, 1\}$ is the smallest field; it will be of some importance in the study of Boolean algebras.
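The tables for $\mathbb{Z}_m$ and the claim that $\mathbb{Z}_m$ is a field exactly when $m$ is prime are easy to experiment with. The following sketch uses hypothetical helper names (`add_table`, `mul_table`, `is_field`, `is_prime`) and tests the claim by brute force for small $m$ only; it is a numerical check, not a substitute for the exercise.

```python
def add_table(m):
    """Addition table of Z_m: entry [x][y] is (x + y) mod m."""
    return [[(x + y) % m for y in range(m)] for x in range(m)]

def mul_table(m):
    """Multiplication table of Z_m: entry [x][y] is (x * y) mod m."""
    return [[(x * y) % m for y in range(m)] for x in range(m)]

def is_field(m):
    """True if every nonzero element of Z_m has a multiplicative inverse."""
    return all(
        any((x * y) % m == 1 for y in range(1, m))
        for x in range(1, m)
    )

def is_prime(m):
    """Trial division; adequate for the small values used here."""
    return m > 1 and all(m % d != 0 for d in range(2, int(m ** 0.5) + 1))

# Compare the two predicates for every modulus from 2 up to 39.
agreement = all(is_field(m) == is_prime(m) for m in range(2, 40))

for row in mul_table(6):
    print(" ".join(str(entry) for entry in row))
```

For $m = 6$ the printed rows reproduce the multiplication table of $\mathbb{Z}_6$; the rows for 2, 3, and 4 contain no entry equal to 1, which is why those elements have no multiplicative inverse.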
Related exercises.

a. In the ring $\mathbb{Z}_6$, we have $2 \cdot 3 = 0$; thus the product of two nonzero elements is zero. In the ring $\mathbb{Z}_4$, we have $2 \cdot 2 = 0$.

b. The ring $\mathbb{Z}_4$ is not a field, but there does exist a field $\mathbb{F}_4$ containing exactly 4 elements; it is unique up to isomorphism. Find its addition and multiplication tables.

c. In the field $\mathbb{Z}_5$, the squares of the numbers 0, 1, 2, 3, 4 are the numbers 0, 1, 4, 4, 1. More generally, show that if $m$ is an odd prime number, then exactly half of the nonzero members of $\mathbb{Z}_m$ are squares of members of $\mathbb{Z}_m$.

Remarks. Finite fields are not often useful in analysis; we have mentioned them only because they offer very easily understood illustrations of the concept of "field." We shall now state without proof a few more results about finite fields; the proofs of these additional results are beyond the scope of this book, but can be found in more specialized books; see for instance, Lidl and Niederreiter [1983].

Let $q$ be an integer greater than 1. Then there exists a field $\mathbb{F}_q$ containing exactly $q$ elements if and only if $q$ is of the form $q = p^n$ for some prime number $p$ and some positive integer $n$, in which case the field $\mathbb{F}_q$ is unique (up to isomorphism). Considered as a linear space over $\mathbb{Z}_p$ (see Chapter 11), the field $\mathbb{F}_q$ is isomorphic to $(\mathbb{Z}_p)^n$. The multiplicative group $\mathbb{F}_q \setminus \{0\}$ is isomorphic (as a group) to the additive group $\mathbb{Z}_{q-1}$. The explicit formation of such finite fields, i.e., the computation of their addition and multiplication tables, is a somewhat complicated matter. However, when $p$ is an odd prime, then it is fairly easy to form a field with $p^2$ elements; a simple method is given in 10.22.

8.21. Example: products. Suppose that $(R_\lambda : \lambda \in \Lambda)$ is a collection of rings. Then we can make the Cartesian product $P = \prod_{\lambda \in \Lambda} R_\lambda$ into a ring, by defining operations coordinatewise:
$$(f + g)(\lambda) = f(\lambda) + g(\lambda), \qquad (fg)(\lambda) = f(\lambda)\, g(\lambda), \qquad \text{etc.}$$
The additive identity $0_P$ is the function that takes the value $0_\lambda$ at the $\lambda$th coordinate. If the $R_\lambda$'s are rings with unit, then so is $P$, with multiplicative identity $1_P$ equal to the function that takes the value $1_\lambda$ on the $\lambda$th coordinate.

The product of two or more fields is not a field, when operations are defined in this fashion, since any element of $P$ with a 0 in at least one component has no multiplicative inverse. However, a different method can sometimes be used to make a product of fields into a field; see 10.22.b.

8.22. The reader is undoubtedly quite familiar with the field of rational numbers,
$$\mathbb{Q} = \{m/n : m, n \in \mathbb{Z},\ n \neq 0\}.$$
Nevertheless, we shall give a formal construction of it; the same method of construction will subsequently be used to form another, less familiar field.

An integral domain is a commutative ring $D$ with the property that whenever $x, y \in D$ with $xy = 0$, then at least one of $x, y$ is 0.
Of course, any field is an integral domain. The ring $\mathbb{Z} = \{\text{integers}\}$ is an example of an integral domain that is not a field; another example will be given in 8.24. The finite rings $\mathbb{Z}_4$ and $\mathbb{Z}_6$ are not integral domains.

Let $D$ be an integral domain. For pairs $(x, m)$ and $(y, n)$ in $D \times (D \setminus \{0\})$, define $(x, m) \approx (y, n)$ to mean that $xn = ym$. Verify that this is an equivalence relation on $D \times (D \setminus \{0\})$. Define $F$ to be the set of equivalence classes. Addition and multiplication in $F$ are defined by
$$(x, m) + (y, n) = (xn + ym,\ mn), \qquad (x, m)(y, n) = (xy,\ mn).$$
The reader should verify that these operations are well defined, i.e., that the definitions above do not depend on the particular choice of representations for the equivalence classes. That is, if $(x_1, m_1) \approx (x_2, m_2)$ and $(y_1, n_1) \approx (y_2, n_2)$, verify that
$$(x_1 n_1 + y_1 m_1,\ m_1 n_1) \approx (x_2 n_2 + y_2 m_2,\ m_2 n_2), \qquad (x_1 y_1,\ m_1 n_1) \approx (x_2 y_2,\ m_2 n_2).$$
With operations so defined, verify that $F$ is a field. It is called the field of fractions of $D$, or field of quotients. The mapping $x \mapsto (x, 1)$ is an embedding of $D$ in $F$, that is, an injective ring homomorphism, and so we may view the ring $D$ as a subset of the field $F$.

Having completed our construction of $F$, we now switch to conventional notation: The equivalence class containing $(x, m)$ is represented by $x/m$ or $\frac{x}{m}$. Of course, the representation is not unique, since any pair in the equivalence class can be used to form this expression. We urge the reader not to switch to this notation until after completing the construction of $F$ and the verifications that it requires. The artificiality of the unfamiliar notation $(x, m)$ will make it less likely that we will inadvertently assume some familiar property of $F$ that has not yet been proved.

In the particular case where the integral domain $D$ is the ring $\mathbb{Z}$, the resulting field of quotients is the field of rational numbers; it is denoted by $\mathbb{Q}$.

8.23. Exercises about $\mathbb{Q}$.

a. Show that $\mathrm{card}(\mathbb{Q}) = \mathrm{card}(\mathbb{N})$.

b. There is no $x \in \mathbb{Q}$ satisfying $x^2 = 2$. Hint: If $x = p/q$, consider how many factors of 2 there are in $p$ or in $q$. (We assume familiarity with basic properties of the integers, e.g., the uniqueness of prime factorization.)

c. If $F$ is a field, there is at most one ring homomorphism from $\mathbb{Q}$ into $F$. Explain.

d. There is a unique ring homomorphism $h$ into $\mathbb{Z}_5$ from the subring of $\mathbb{Q}$ consisting of the fractions whose denominators are not multiples of 5. With that ring homomorphism, evaluate $h(\tfrac{2}{3})$. Hint: $2 = 3 \cdot 4$ in $\mathbb{Z}_5$. (Thus we obtain a member of $\{0, 1, 2, 3, 4\}$ which is in a sense "congruent modulo 5" to the fraction 2/3.)

8.24. Example: the ring of polynomials and the field of rational functions. Let $K$ be any integral domain (for instance, the integers or the rationals or the reals). Let $S = \{s, t, u, \ldots\}$ be a nonempty (finite or infinite) set of distinct symbols not already used in our description of $K$ or elsewhere in our language. We write $S$ as $\{s, t, u, \ldots\}$ to display a few typical elements, but we do not require that $S$ be countable or ordered in any fashion. (For the
simplest examples, take $S$ to be just a singleton: $S = \{s\}$.)

A monomial with variables in $S$ and coefficients in $K$ is any expression such as $a s^3 t^2 u v$, where $a \in K$ and $s, t, u, v \in S$; that is, an element of $K$ multiplied by finitely many members of $S$. If the coefficient $a$ is not zero, then the degree of the monomial is the sum of the exponents of the variables; for instance, the monomial $a s^3 t^2 u v = a s^3 t^2 u^1 v^1$ has degree $3 + 2 + 1 + 1 = 7$.

A polynomial with variables in $S$ and coefficients in $K$ is a sum of finitely many monomials, i.e., any expression such as $a s^3 + b s t^2 + c t^2 + d u v + e$, where $a, b, c, d, e \in K$ and $s, t, u, v \in S$. The degree of the polynomial is the highest degree of any of its monomials; for instance, $a s^3 + b s t^2 + c t^2 + d u v + e$ has degree 3. Addition, multiplication, and equality of polynomials are defined by the usual algebraic rules; we omit the details. The set of all polynomials with variables in $S$ and coefficients in $K$ is easily seen to form a commutative ring with unit, which we shall denote by $K[S]$.

Note that each $a \in K$ may be viewed as a constant polynomial, i.e., a polynomial of degree 0; thus each member of $K$ may be viewed as a member of $K[S]$. This mapping from $K$ into $K[S]$ is an injective ring homomorphism; thus we may view $K$ as a subset of $K[S]$.

When $S$ consists of just one variable, say $s$, then the ring $K[S] = K[\{s\}]$ may be written more briefly as $K[s]$. Then any polynomial may be written in the form
$$p(s) = a_n s^n + a_{n-1} s^{n-1} + \cdots + a_1 s + a_0,$$
where the coefficients $a_j$ are members of $K$. If $p(s)$ is not the constant function 0, then by dropping any leading zero terms we can choose the representation so that $a_n \neq 0$. Then $n$ is the degree of the polynomial, and $a_n$ is called the leading coefficient.

If the ring $K$ is an integral domain, then so is the ring $K[S]$. Hence we can form its field of quotients, as in 8.22. That field is called the field of rational functions with variables in $S$ and coefficients in $K$; we shall denote it by $K(S)$. A member of that field is a rational function with variables in $S$ and coefficients in $K$, i.e., a quotient of two polynomials. A typical rational function is
$$\frac{a s^3 + b s t^2 + c t^2 + d u v + e}{b t^3 + d s t + f u v^3 + g}.$$
Equality between such rational functions and arithmetic operations with such functions are defined in the usual fashion; we omit the details. If $S$ consists of just a single variable $s$, then the field $K(S) = K(\{s\})$ may be written more briefly as $K(s)$. However, later we shall have some uses for much larger collections $S$ as well.

8.25. Blass's Subfield (optional). Define $K(S)$ as above. A homogeneous polynomial of degree $k$ is a sum of several monomials of degree $k$; for instance, $a s^3 + b s^2 t + c s t u + d t u^2 + e v^3$ is a homogeneous polynomial of degree 3. Let
$$\mathcal{B} = \{p/q \in K(S) : p \text{ and } q \text{ are homogeneous polynomials of the same degree}\}.$$
(For instance, if $S = \{s, t, u\}$ and $K = \mathbb{R}$, then
$$\frac{3 s^3 + \sqrt{2}\, s^2 t + 11 s t u + \pi s u^2}{17 s t u - \sqrt{5}\, s t^2 + 6.179\, t^3}$$
is a typical member of $\mathcal{B}$.) Show that $\mathcal{B}$ is a subfield of $K(S)$. This field will be mentioned again in 11.29.
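The notions of degree and homogeneity lend themselves to a very small data-structure sketch. The representation below is an assumption made for illustration (a polynomial as a list of `(coefficient, exponents)` pairs, with `exponents` a dictionary from variable names to exponents); nothing in the text prescribes it, and the function names are hypothetical.

```python
def monomial_degree(exponents):
    """Degree of a monomial: the sum of the exponents of its variables."""
    return sum(exponents.values())

def degree(poly):
    """Highest degree among the monomials with nonzero coefficient.

    Returns None for the zero polynomial (a convention of this sketch only).
    """
    degrees = [monomial_degree(e) for c, e in poly if c != 0]
    return max(degrees) if degrees else None

def is_homogeneous(poly):
    """True if all monomials with nonzero coefficient share one degree."""
    return len({monomial_degree(e) for c, e in poly if c != 0}) <= 1

# The monomial a*s^3*t^2*u*v from the text, with exponents written out.
mono = {"s": 3, "t": 2, "u": 1, "v": 1}

# a*s^3 + b*s*t^2 + c*t^2 + d*u*v + e, with sample coefficients 1..5.
p = [
    (1, {"s": 3}),
    (2, {"s": 1, "t": 2}),
    (3, {"t": 2}),
    (4, {"u": 1, "v": 1}),
    (5, {}),
]

# a*s^3 + b*s^2*t + c*s*t*u + d*t*u^2 + e*v^3, homogeneous of degree 3.
h = [
    (1, {"s": 3}),
    (2, {"s": 2, "t": 1}),
    (3, {"s": 1, "t": 1, "u": 1}),
    (4, {"t": 1, "u": 2}),
    (5, {"v": 3}),
]
```

With these inputs, `monomial_degree(mono)` is 7 and `degree(p)` is 3, matching the worked values above, while `is_homogeneous` separates `h` from `p`.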
MATRICES

8.26. Matrix notation. Let $\mathbb{K}$ be a ring, and let $m$ and $n$ be positive integers. An $m$-by-$n$ matrix over $\mathbb{K}$ is a rectangular array
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
with $m$ rows and $n$ columns, where each $a_{ij}$ is an element of $\mathbb{K}$. We say $a_{ij}$ is the element (or component) in row $i$ and column $j$. The matrix $A$ given above may be represented more briefly as $A = (a_{ij} : 1 \le i \le m,\ 1 \le j \le n)$, or still more briefly as $(a_{ij})$ if no confusion will result.

An $m$-by-$n$ matrix is called a column matrix if $n = 1$ (i.e., if it consists of just one column), a row matrix if $m = 1$ (i.e., if it consists of just one row), and a square matrix if $m = n$.

The transpose of an $m$-by-$n$ matrix $A$ is the $n$-by-$m$ matrix $A^T$ obtained by flipping $A$ over diagonally, so that the $k$th row becomes the $k$th column and vice versa. Obviously, $(A^T)^T = A$. Note that the transpose of a row matrix is a column matrix, and vice versa.

For any positive integer $p$, it is customary^1 to consider elements of $\mathbb{K}^p$ as column matrices when matrices are to be used at all, but to save space on the printed page they are often represented as the transposes of row matrices. Thus the ordered $p$-tuple $(b_1, b_2, \ldots, b_p)$ can also be written as $[b_1\ b_2\ \cdots\ b_p]^T$; we emphasize that the representation with parentheses requires commas while the representation with brackets requires that the commas be omitted.

^1 Some older algebra books represent members of $\mathbb{K}^p$ as row matrices, but column matrices seem to be the prevailing convention since sometime around 1960.

8.27. Matrix multiplication has slightly complicated dimensional requirements. If $A$ is an $m$-by-$n$ matrix and $B$ is an $n$-by-$p$ matrix, then we can form their product $AB = R$, an $m$-by-$p$ matrix:
$$\underbrace{\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}}_{m\text{-by-}n} \underbrace{\begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1p} \\ b_{21} & b_{22} & \cdots & b_{2p} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{np} \end{bmatrix}}_{n\text{-by-}p} = \underbrace{\begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1p} \\ r_{21} & r_{22} & \cdots & r_{2p} \\ \vdots & \vdots & & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mp} \end{bmatrix}}_{m\text{-by-}p}$$
defined by this formula:
$$r_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{in} b_{nk}.$$

In general, multiplication of matrices is not commutative. In fact, when the product $AB$ is defined, the product $BA$ is not necessarily defined. For instance, for the matrices above, we can only define $BA$ if $m = p$, and in that case $AB$ is an $m$-by-$m$ matrix while $BA$ is an $n$-by-$n$ matrix. Thus, $AB = BA$ can only hold if $A$ and $B$ are square matrices of
the same dimension. Even then, $AB = BA$ only holds in an occasional coincidence; it does not hold in general, even if the underlying ring $\mathbb{K}$ is commutative. For example, if
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}$$
then $AB \neq BA$, provided $\mathbb{K}$ is a ring with unit in which $0 \neq 1$. However, we do have $(AB)^T = B^T A^T$ if the ring $\mathbb{K}$ is commutative.

It is an easy exercise to show that multiplication of matrices is associative: $(AB)C = A(BC)$ whenever the dimensions of the matrices match up, i.e., $A$ is an $m$-by-$n$ matrix, $B$ is an $n$-by-$p$ matrix, and $C$ is a $p$-by-$q$ matrix. Hence we may omit the parentheses and write the product simply as $ABC$. The element in row $h$, column $k$ of that product is
$$\sum_{i=1}^{n} \sum_{j=1}^{p} a_{hi}\, b_{ij}\, c_{jk}.$$

8.28. Matrices as functions on columns. An important special case of matrix multiplication is the following: Let $A$ be an $m$-by-$n$ matrix. Represent elements of $\mathbb{K}^m$ and $\mathbb{K}^n$ by column matrices, i.e., by $m$-by-1 matrices and by $n$-by-1 matrices, respectively. Then the mapping $v \mapsto Av$ is an additive map from $\mathbb{K}^n$ into $\mathbb{K}^m$. This type of map plays an important role in the theory of finite-dimensional vector spaces, discussed further in Chapter 11. It is so important that we shall write it out more explicitly here: If
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \quad \text{and} \quad v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}, \quad \text{then} \quad Av = \begin{bmatrix} a_{11} v_1 + a_{12} v_2 + \cdots + a_{1n} v_n \\ a_{21} v_1 + a_{22} v_2 + \cdots + a_{2n} v_n \\ \vdots \\ a_{m1} v_1 + a_{m2} v_2 + \cdots + a_{mn} v_n \end{bmatrix}.$$

In particular, the $n$-by-$n$ matrices act as mappings from $\mathbb{K}^n$ into itself. If $\mathbb{K}$ is a ring with unit, then the $n$-by-$n$ matrices form a monoid, under the operation of matrix multiplication. The invertible elements of that monoid form a group. An interesting subgroup consists of the permutation matrices of order $n$; these are the $n$-by-$n$ matrices $A$ that have the following property: Each row contains $n - 1$ zeros and 1 one; each column also contains $n - 1$ zeros and 1 one. For instance, there are six permutation matrices of order 3:
$$\begin{bmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix}, \quad \begin{bmatrix} 1&0&0 \\ 0&0&1 \\ 0&1&0 \end{bmatrix}, \quad \begin{bmatrix} 0&1&0 \\ 1&0&0 \\ 0&0&1 \end{bmatrix}, \quad \begin{bmatrix} 0&1&0 \\ 0&0&1 \\ 1&0&0 \end{bmatrix}, \quad \begin{bmatrix} 0&0&1 \\ 1&0&0 \\ 0&1&0 \end{bmatrix}, \quad \begin{bmatrix} 0&0&1 \\ 0&1&0 \\ 1&0&0 \end{bmatrix}.$$
Such a matrix is called a permutation matrix for the following reason. If $n$ distinct members of the ring $\mathbb{K}$ are arranged in a column matrix $v$, then the mapping $v \mapsto Av$ permutes those $n$ members, i.e., the column matrix $Av$ consists of the same $n$ members, arranged in some other order (or in the same order, if $A = I$). The group of permutation matrices of order $n$ is isomorphic (as a group) to the symmetric group of order $n$, introduced in 8.10.h.

8.29. The ring of matrices. Addition of $m$-by-$n$ matrices is defined componentwise:
$$\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} + \begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{m1} & \cdots & b_{mn} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix}.$$
Thus, we can only add two matrices if they have the same dimensions.

With multiplication and addition defined as above, the set $\mathrm{Mat}(n, \mathbb{K}) = \{n\text{-by-}n \text{ matrices over } \mathbb{K}\}$ is a ring. When $\mathbb{K}$ is a ring with unit, it has additive and multiplicative identities given by
$$0 = 0_n = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \quad \text{and} \quad I = I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}.$$
Here $I_n$ is an $n$-by-$n$ matrix that has 1s along its main diagonal and 0s elsewhere; it may be written more briefly as $(\delta_{ij})$, where $\delta$ is the Kronecker delta. In general the ring $\mathrm{Mat}(n, \mathbb{K})$ is not commutative.

ORDERED GROUPS

8.30. In this book an ordered monoid will mean an additive monoid $X$ that is equipped with a partial ordering $\preceq$ that is translation-invariant, i.e., that satisfies
$$x \preceq y \implies x + u \preceq y + u$$
for all $x, y, u \in X$. If $X$ is also a group, we shall call it an ordered group.

Most of the ordered monoids used by analysts have a great deal more structure; in fact, most of them are Riesz spaces. However, $[0, +\infty]$ is an important ordered monoid that is not even a group.

8.31. Arithmetic in ordered monoids. Let $S$ and $T$ be nonempty subsets of an ordered monoid $(X, \preceq)$. For each of the following equations, show that if the left side exists, then the right side also exists and the two sides are equal:
$$\max(S) + \max(T) = \max(S + T), \qquad \min(S) + \min(T) = \min(S + T).$$
Also show that each of the following inequalities holds if both sides exist:
$$\sup(S) + \sup(T) \succeq \sup(S + T), \qquad \inf(S) + \inf(T) \preceq \inf(S + T).$$
(Compare with 8.33.)

8.32. Proposition. The sup of a directed family of additive maps is additive.

More precisely: Let $M$ be an additive monoid, and let $(Y, \preceq)$ be an ordered monoid. Let $\Phi$ be a collection of additive maps from $M$ into $Y$. Assume $\Phi$ is directed by the product ordering on $Y^M$; that is, for each $f_1, f_2 \in \Phi$ there exists $f \in \Phi$ such that $f(x) \succeq f_1(x)$ and $f(x) \succeq f_2(x)$ for all $x \in M$. Assume that $h(x) = \sup_{f \in \Phi} f(x)$ exists in $Y$ for each $x \in M$. Then the function $h : M \to Y$ is also additive. (This result will be used in 11.57.)

Hints: The proof of $h(x + x') \preceq h(x) + h(x')$ is easy; it does not require $\Phi$ to be directed. For the reverse inequality, show that
$$h(x) + h(x') = \sup_{f_1, f_2 \in \Phi} \left[ f_1(x) + f_2(x') \right] \preceq \sup_{f \in \Phi} \left[ f(x) + f(x') \right] = h(x + x').$$

8.33. Arithmetic in ordered groups. Let $(X, \preceq)$ be an ordered group, and let $x, y \in X$. Let $S$ and $T$ be nonempty subsets of $X$. Show that

a. For each of the following equations, the left side exists if and only if the right side exists, in which case they are equal:
$$\max(x + S) = x + \max(S), \qquad \min(x + S) = x + \min(S),$$
$$\sup(x + S) = x + \sup(S), \qquad \inf(x + S) = x + \inf(S).$$

b. Duality in ordered groups. $x \preceq y \iff -x \succeq -y$. For each of the following equations, the left side exists if and only if the right side exists, in which case they are equal:
$$-\max(S) = \min(-S), \qquad -\sup(S) = \inf(-S).$$
When $S$ contains just two elements, the last equation becomes $-(u \vee v) = (-u) \wedge (-v)$ or, equivalently, $-(p \wedge q) = (-p) \vee (-q)$. From all of these equations it follows that any statements about maxima or suprema can be translated into statements about minima or infima, and vice versa. Such statements occur in pairs; the members of such a pair are said to be dual to each other. For brevity, in many cases we mention only one of the two statements; we leave the details as an exercise. See also 1.7.

c. For each of the following equations, if the left side exists, then the right side exists and the two sides are equal:
$$\sup(S) + \sup(T) = \sup(S + T), \qquad \inf(S) + \inf(T) = \inf(S + T).$$

d. Let $D$ be a subgroup of $X$. Then $D$ is sup-dense in $X$ if and only if $D$ is inf-dense in $X$.
Chapter 8: Elementary Algebraic Systems

e. Let $f : X \to Y$ be a group homomorphism, where $Y$ is another ordered group. Then $f$ is sup-preserving if and only if $f$ is inf-preserving. Hint for the "if" part: If $\sigma = \sup(S)$ for some $S \subseteq X$, then
\[ f(\sigma) = f(\sup(S)) = -f(-\sup(S)) = -f(\inf(-S)) = -\inf(f(-S)) = -\inf(-f(S)) = \sup(f(S)). \]

8.34. More definitions. If $X$ is an ordered group, then the positive cone of $X$ is the set $X_+ = \{x \in X : x \succcurlyeq 0\}$. Caution: Elements of the positive cone are not necessarily called "positive." In particular, when $X = \mathbb{R}$, then $X_+$ is the set of all nonnegative real numbers; that is, 0 is a member of the "positive cone" but it is not a positive number.

Note that the ordering can be recovered from the positive cone: We have $x \preccurlyeq y \iff y - x \in X_+$.

8.35. Exercise (optional). Let $(X, \preccurlyeq)$ be an ordered group. Then the following conditions are equivalent:
(A) $(X, \preccurlyeq)$ is a directed set.
(B) $X_+$ generates the group, that is, $X_+ - X_+ = X$.
(C) For each $x \in X$, there is some $p \in X_+$ with $p \succcurlyeq x$.

8.36. Theorem. For some examples of the property described below, see 8.37 and 8.38. Let $(X, \preccurlyeq)$ be an ordered group. We use the notation $[a, b] = \{x \in X : a \preccurlyeq x \preccurlyeq b\}$. The following conditions are then equivalent. If one, hence all, are satisfied, we say $X$ has the Riesz Decomposition Property.
(A) $[0, u] + [0, v] = [0, u + v]$ whenever $u, v \in X_+$.
(B) If $p_1, p_2, \ldots, p_m \in X$ and $q_1, q_2, \ldots, q_n \in X$ with $p_i \preccurlyeq q_j$ for all $i, j$, then there exists some $r \in X$ with $p_i \preccurlyeq r \preccurlyeq q_j$ for all $i, j$.
(C) $[x_1, y_1] + [x_2, y_2] + \cdots + [x_n, y_n] = [x_1 + x_2 + \cdots + x_n,\; y_1 + y_2 + \cdots + y_n]$ whenever $n$ is a positive integer and the $x_i$'s and $y_i$'s are members of $X$ with $x_i \preccurlyeq y_i$ for all $i$.
(D) If $x_1, x_2, \ldots, x_m \in X_+$ and $y_1, y_2, \ldots, y_n \in X_+$ with $x_1 + x_2 + \cdots + x_m = y_1 + y_2 + \cdots + y_n$, then there exist some $z_{ij} \in X_+$ for $1 \le i \le m$ and $1 \le j \le n$, such that
\[ x_i = \sum_{j=1}^{n} z_{ij} \ \text{ for all } i \qquad\text{and}\qquad y_j = \sum_{i=1}^{m} z_{ij} \ \text{ for all } j. \]
Note that in any ordered group, $[x_1, y_1] + [x_2, y_2] \subseteq [x_1 + x_2,\; y_1 + y_2]$.
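Condition (D) can be made concrete in the simplest ordered group, the integers with their usual ordering. The following is a minimal sketch, not taken from the text: the function name `riesz_decompose` and the greedy filling strategy are assumptions chosen for illustration. It builds nonnegative integers $z_{ij}$ whose row sums are the $x_i$ and whose column sums are the $y_j$.

```python
def riesz_decompose(xs, ys):
    """Greedy construction of z[i][j] >= 0 with row sums xs and column sums ys.

    Hypothetical helper for illustration; works for nonnegative integers
    xs, ys with sum(xs) == sum(ys).
    """
    xs, ys = list(xs), list(ys)
    if min(xs + ys, default=0) < 0 or sum(xs) != sum(ys):
        raise ValueError("need nonnegative entries with equal sums")
    z = [[0] * len(ys) for _ in xs]
    i = j = 0
    while i < len(xs) and j < len(ys):
        t = min(xs[i], ys[j])      # largest amount that fits in cell (i, j)
        z[i][j] += t
        xs[i] -= t                 # remaining demand of row i
        ys[j] -= t                 # remaining demand of column j
        if xs[i] == 0:
            i += 1                 # row i is complete
        else:
            j += 1                 # otherwise column j is complete
    return z


z = riesz_decompose([3, 2], [1, 1, 3])
row_sums = [sum(row) for row in z]
col_sums = [sum(col) for col in zip(*z)]
```

Here `row_sums` recovers the $x_i$ and `col_sums` recovers the $y_j$, which is exactly what (D) asks for in this special case.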
Proof of (A) $\Rightarrow$ (B). It suffices to prove (B) for $m = n = 2$; then higher values of $m, n$ follow by induction. Assume, then, $p_i \preccurlyeq q_j$ for $i, j = 1, 2$. We know that $q_2 - p_1 \in [0,\; (q_1 - p_1) + (q_2 - p_2)] = [0,\; q_1 - p_1] + [0,\; q_2 - p_2]$. Hence $q_2 - p_1 = a_1 + a_2$ for some $a_j \in [0,\; q_j - p_j]$. Let $r = a_1 + p_1 = q_2 - a_2$. Then $r = a_1 + p_1 \in [p_1, q_1]$ and $r = q_2 - a_2 \in [p_2, q_2]$.

Proof of (B) $\Rightarrow$ (C). It suffices to prove this for $n = 2$; higher values of $n$ then follow by induction. Let $z \in [x_1 + x_2,\; y_1 + y_2]$. We know $z - y_1 \preccurlyeq y_2$ and $x_2 \preccurlyeq y_2$ and $z - y_1 \preccurlyeq z - x_1$ and $x_2 \preccurlyeq z - x_1$. Hence there is some $r$ with $z - y_1 \preccurlyeq r$ and $x_2 \preccurlyeq r$ and $r \preccurlyeq y_2$ and $r \preccurlyeq z - x_1$. Hence $z - r \in [x_1, y_1]$ and $r \in [x_2, y_2]$.

Proof of (C) $\Rightarrow$ (D). It suffices to prove (D) for $m = 2$; then higher values follow by induction. If $x_1 + x_2 = y_1 + \cdots + y_n = s$, then $x_1 \in [0, s] = [0, y_1] + \cdots + [0, y_n]$, so $x_1 = z_{11} + \cdots + z_{1n}$ with $z_{1j} \in [0, y_j]$. Now let $z_{2j} = y_j - z_{1j}$.

Proof of (D) $\Rightarrow$ (A). Let $p \in [0, u + v]$. Then $p + q = u + v$ for some $q \succcurlyeq 0$. Decompose both sides of that equation as in (D); this expresses $p$ as the sum of an element that lies in $[0, u]$ and an element that lies in $[0, v]$.

8.37. A degenerate example. Any group $X$ can be ordered by this relation: $x \preccurlyeq y$ if and only if $x = y$. We shall refer to this as the trivial ordering. Despite its simplicity, this ordering will play an important role in our theory; see 12.46. Here are a few of its basic properties:
a. The positive cone $X_+$ is just $\{0\}$.
b. The set $[a, b]$ is a singleton if $a = b$, or empty if $a \ne b$.
c. The trivial ordering has the Riesz Decomposition Property.
d. It is not a lattice ordering, if $X$ contains more than one element.

LATTICE GROUPS

8.38. A lattice group is an ordered group whose ordering is a lattice ordering, i.e., an ordered group $X$ that satisfies both of the following conditions:
(i) $x \vee y$ exists for all $x, y \in X$.
(ii) $x \wedge y$ exists for all $x, y \in X$.
Since the ordering of an ordered group is translation-invariant, these statements are dual to each other, and so either implies the other; hence either one of these implies $X$ is a lattice group. Actually, an even weaker hypothesis is sufficient: If $X$ is an ordered group and $\sup\{x, 0\}$ exists for each $x \in X$, then $X$ is a lattice group.

Some examples of lattice groups are given in 11.45 and 11.53.
and the a b s o l u t e v a l u e of x. Caution: We have two notions of "absolute value" the group element / x / defined above. This book will reserve the notation Ix[ for realvalued absolute values and norms. as in 11. In the wider literature. when given the product o r d e r i n g . However. in 26. / y / are comparable in order. unfamiliar "absolute value" of members of a lattice group e. see 8.x ( A ) . then pl V P2 V . ~) is a lattice group.198 Chapter 8: Elementary Algebraic Systems It is easy to see that any lattice group has the Riesz Decomposition Property indeed if pi's and qj's satisfy the hypotheses of 8. we note that the Riesz Decomposition Property is also enjoyed by some other ordered groups that are not lattice groups. 0}.max{x. ..31 and in Chapter 22.37. Neither of these is a special case of the other. like x + and x . they may accidentally attribute some of its properties to this new. let A be any set. Then the product ~h = {functions from A into I~} is a Dedekind complete lattice group (actually a vector lattice). a simple counterexample is given in 8. then x + . The lattice operations V.39.40. a n d / x / i s just the usual absolute value Ixl. 0} and x .36(B). 8. /x/x + + x.) Our use of the notation Ix~ will serve as a constant visual reminder that the "absolute value" being considered is. However. . Examples. However. not necessarily a member of R. that convention causes some difficulties for some beginners who are already familiar with the realvalued absolute value of real or complex numbers.that is.55.41.g. (Those absolute values are not necessarily comparable.m a x { .R.45. the n e g a t i v e p a r t of x. Here our notation is slightly unconventional. respectively.. when ordered by x~y The nonnegative cone is then (RA) § = if x(A) _~ y(A) for every A E A.31. it is customary to represent both kinds of absolute values by the expression Ixl. 8. 0}. More generally. functions x + (A) = max{x(fl).x . 
When X is the real line with its usual ordering.A are defined pointwise on A. . then for any x E X we can define x+ x v o. They are called the p o s i t i v e p a r t of x. . V Pm ~ ql A q2 A ' . These three objects are elements of the nonnegative cone X+. which are discussed in 10.g. A qn.m a x { . and the nonnegative real number Ix I defined in 10.. they may inadvertently assume that any two absolute values / x / . a member of some lattice group.. If (X. for instance. This book's unusual practice may prevent confusion in contexts where both types of absolute values are needed e. the two notions coincide when X . x(x) v o. 0}. Ixl( We also have the )  Ix( )l. x(A) .
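The pointwise description of the positive part, negative part, and absolute value in a product of copies of the real line can be checked mechanically on a finite index set. The sketch below is an illustration only: it models an element of the product as a Python tuple of integers, and the helper names (`pos`, `neg`, `absval`, `meet`, `leq`) are assumptions, not notation from the text. It also exhibits two elements whose absolute values are not comparable in the product ordering.

```python
def pos(x):
    """Positive part, computed coordinatewise: max(x, 0)."""
    return tuple(max(v, 0) for v in x)

def neg(x):
    """Negative part, computed coordinatewise: max(-x, 0)."""
    return tuple(max(-v, 0) for v in x)

def absval(x):
    """Lattice absolute value, computed coordinatewise."""
    return tuple(abs(v) for v in x)

def meet(x, y):
    """Infimum in the product ordering."""
    return tuple(min(a, b) for a, b in zip(x, y))

def leq(x, y):
    """Product ordering: x <= y in every coordinate."""
    return all(a <= b for a, b in zip(x, y))

x = (3, -2, 0)
jordan_ok = tuple(p - n for p, n in zip(pos(x), neg(x))) == x      # x = x+ - x-
abs_ok = tuple(p + n for p, n in zip(pos(x), neg(x))) == absval(x)  # |x| = x+ + x-
disjoint_ok = meet(pos(x), neg(x)) == (0, 0, 0)                     # x+ and x- meet at 0

# Index set {0, 1}, with u(t) = t and v(t) = 1 - t.
u, v = (0, 1), (1, 0)
comparable = leq(absval(u), absval(v)) or leq(absval(v), absval(u))
```

With these choices `comparable` is false: the first coordinate favors one element and the second coordinate favors the other.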
. However. J o r d a n Decomposition. In 11. Hint" Sum d e c o m p o s i t i o n and 8. / x / . x .R ~. Hints" Show u .. ~ x ~ x + ~ /x/.v . .) 8.. x x + x.x . a f u n c t i o n from A into IR whereas Ix(A)l is a real n u m b e r .x + . t h e n / x / ( O ) < / y / ( O ) and / x / ( 1 ) > / y / ( 1 ) .) A 0l + x (x A 0) ..)l is an element of IRA t h a t is. y. x A y.. t h e n I x ~ .(x V y) + (x A y). u .50 we'll see t h a t if Z is a vector lattice..7 ~ + x + ~pandvp+x~p. y < > y k. hence0uAv~p. d. Some vector lattices have appreciably more c o m p l i c a t e d formulas for the sup. (Here we use ~ to c o m p a r e m e m b e r s of X and _< to c o m p a r e m e m b e r s of R. x +. R e m a r k .x ) + .x ) . i.. inf.x V ( . . we do not necessarily h a v e / x / ~ / y / or/y/~/x/. x + y . neither / x / ~ / y / nor / y / ~ / x / is valid. that A r i t h m e t i c in lattice groups.x + v ~ x and u ~ 0. z E X.( x A 0).O . R e m a r k . S u m decomposition. T h e pointwise formulas given above for x V y./ .x / .u ..( . 8 . / x / ~ y if and o n l y if b o t h . / x / . the f u n c t i o n / x / i s defined b y / x / ( t ) . the pointwise formulas for the lattice o p e r a t i o n s are not valid in some other subsets of I~A.33. x ~ O ~ x+x ~ xO ~ /x/x. x + A x Hint: By t r a n s l a t i o n invariance. T h a t is.x ) v O.42.46.21 and 11.x + and v .( x A 0) and x + .I x ~ ~ ~ (x) xO. t h e n / x / ~ x~y. g.t and y ( t ) .c.u . etc. If x and y are m e m b e r s of a lattice group X . hence u ~ x +. T h e n for any function x. See for instance 4.( . We emphasize t h a t /x// = Ix(. [(x + . Example. Let X . f. If x . sup(S). 4 1 . t h e n u .. inf(S).x + + x . On the other hand. Observe t h a t if x ( t ) . In 11. Uniqueness of the J o r d a n Decomposition.y ~ x ~ y a n d y ~ O. //x// r e m a i n valid in m a n y i m p o r t a n t subsets of ~A.x .t . some of these are given in 11. Show a.x . (xAy)+z. and let x. x .v where u A v . 
Hint: Use translationinvariance and c.x ~ O.x + v x . This can also be described as: Addition distributes over V and A.0. with the p r o d u c t ordering.47. x + A x .x ) . T h e n p . V and A are translationinvariant.O j.1. 8.33.c.x(t)l.) + x] A x[(x + . e. Thus.Lattice Groups 199 where the last expression is the usual (realvalued) absolute value of the real n u m b e r x(A). b. Let X be a lattice group.50 we'll see t h a t if X is a vector lattice.x v ( . (x + z)V (y+ z ) (x+z)A(y+z)  (xVy) +z. h. absolute value.
Then X is distributive. with n+l jO n j=O n V [Jx + (x V O)]  ( ? ) (jx) + (x V O).v/ ~ /u v/. 2 ( x V y ) = x + y + / x .y / a n d 2 ( x A y ) = x + y .u.200 Chapter 8: Elementary Algebraic Systems I. x A (y v z) . for any x E X and any nonempty set S C X. / x + y~ ~ / x / + / y / . It is the largest solid set contained in T.y / . and likewise /l~l. .v.v+/ ~ / u  v~ a n d / u  . it is called the solid k e r n e l of T.~l/x .50./x/l = U [u.y/.10. on distributivity. Hint" Let f(t) be any of the functions t +. the empty set is solid. A set S C_ X is said to be solid if it satisfies" /x/~/y/ and yES ~ xES.V [ ( J + l ) x V j x ]  jO jO Remark..) Hint: Use induction on n. to p r o v e / f ( u ) . In a Riesz space this formula simplifies to n(x +) . In fact." i. then 0 v x v (2x) v (3x) v . as in 8..+ y . let sk(T) be the union of all the solid subsets of T. /u + . . (In particular.42. (x+y)+~x + + y+ and (x + y ) ./ x .43. In a vector lattice.d x . y . or / t / . .(x A y) v (x A z) and x v (y A z) . T h a t is..f ( v ) / ~ / u v/.~l(x § y) § ~l/x . If x c X and n is a positive integer.e. By the preceding exercises. Remark. see 11. Apply this once with x .u .h. t . f ( x + y) .(nx)+. their complements form a collection of Moore closed sets in the sense of 4. + x is the sum of n x's. .(x v y) A (x v z) for every x. m. y. we can make a stronger assertion: X is infinitely distributive. q.~l(x § y ) . Show also that ( sk(T) % x E/x/.x and nx = x + x + x + . o. T r i a n g l e I n e q u a l i t y ..y / a n d x A y . (Here it is understood that 0x . Let X be a lattice group. Theorem that is.~1. n. .b. For any set T C_ X.v and once with x . z E X.3. Thus solid sets are "Moore open sets. yvu. Hint: Immediate from 8.lv// ~ I~. it follows immediately that x V y .) Note that the union of any collection of solid sets is solid. thus it is a sort of "Moore interior" (dual to a Moore closure).f ( x ) ~ f(y). V (Jx) .u]C_T 8.+x n summands + . Show that sk(T) is solid. p.. 
v (~x) = x++x ++.
\[ \text{(i)}\quad x \wedge \sup(S) = \sup\{x \wedge s : s \in S\} \qquad\text{and}\qquad \text{(ii)}\quad x \vee \inf(S) = \inf\{x \vee s : s \in S\}, \]
where each equation is interpreted in this sense: Whenever the left side of the equation exists, then the right side also exists and the two sides are equal.

Proof. It suffices to prove (i); then (ii) will follow by duality. To prove (i), assume $\sigma = \sup(S)$ exists, and let $T = \{x \wedge s : s \in S\}$; we are to show that $x \wedge \sigma$ is the supremum of $T$. It is certainly an upper bound for $T$, since $x \wedge s \preccurlyeq x \wedge \sigma$ for each $s \in S$. To show that it is the least upper bound, let $r$ be any upper bound for $T$; we are to show $r \succcurlyeq x \wedge \sigma$. For each $s \in S$ we know
\[ r \;\succcurlyeq\; x \wedge s \;=\; (x + s) - (x \vee s) \;\succcurlyeq\; (x + s) - (x \vee \sigma), \]
hence $r + (x \vee \sigma) - x \succcurlyeq s$. Take the supremum on the right; thus $r + (x \vee \sigma) - x \succcurlyeq \sigma$. Add $x - (x \vee \sigma)$ to both sides to prove $r \succcurlyeq x \wedge \sigma$.

8.44. Convergence in lattice groups. Let $(X, \preccurlyeq)$ be a lattice group. Then $x_\alpha \xrightarrow{o} x$ (as defined in 7.40.d) if and only if there exists a set $S \subseteq X$ with these three properties:
(i) $S$ is directed downward, i.e., for each $s_1, s_2 \in S$ there exists $s \in S$ with $s \preccurlyeq s_1 \wedge s_2$.
(ii) $0 = \inf(S)$.
(iii) For each $s \in S$, we have eventually $/x_\alpha - x/ \preccurlyeq s$.
Show also that
a. The lattice group operations are continuous, in the following sense: Suppose $((x_\alpha, y_\alpha))$ is a net in $X \times X$, with $x_\alpha \xrightarrow{o} x$ and $y_\alpha \xrightarrow{o} y$. Then
\[ x_\alpha \vee y_\alpha \xrightarrow{o} x \vee y, \quad x_\alpha \wedge y_\alpha \xrightarrow{o} x \wedge y, \quad x_\alpha + y_\alpha \xrightarrow{o} x + y, \quad -x_\alpha \xrightarrow{o} -x, \quad /x_\alpha/ \xrightarrow{o} /x/. \]
If $X$ is a vector lattice, then we can also conclude $c x_\alpha \xrightarrow{o} c x$ for every real number $c$.
b. Since $X$ is infinitely distributive, the conclusions of 7.38 and 7.42 are applicable.

8.45. Proposition. Let $X$ and $Y$ be lattice groups, and let $f : X \to Y$ be a group homomorphism, i.e., an additive map. Then the following conditions are equivalent:
(A) $f$ is a lattice homomorphism, that is, $f(u \vee v) = f(u) \vee f(v)$ and $f(u \wedge v) = f(u) \wedge f(v)$ for all $u, v \in X$.
(B) $f(x^+) = (f(x))^+$ for all $x \in X$.

Proof. For (A) $\Rightarrow$ (B), note that $x^+ = x \vee 0$ and $f(0) = 0$. For (B) $\Rightarrow$ (A), compute
\[ f(u \vee v) = f\big((u - v) \vee 0 + v\big) = f\big((u - v)^+\big) + f(v) = \big(f(u - v)\big)^+ + f(v) = \big(f(u) - f(v)\big) \vee 0 + f(v) = f(u) \vee f(v). \]
Thus $f$ preserves sups. By duality (8.33.c), since $f$ is a group homomorphism, it also preserves infs.
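Condition (B) gives a cheap way to rule out a candidate: one element $x$ with $f(x^+) \ne (f(x))^+$ shows that an additive map is not a lattice homomorphism. The sketch below tries this numerically for integration over $[0, 2\pi]$ applied to the sine function. The midpoint rule, the step count, and the names `integrate` and `u_plus` are assumptions made for illustration; the computed values are approximations, not exact integrals.

```python
import math

def integrate(x, a=0.0, b=2 * math.pi, n=20_000):
    """Midpoint-rule approximation of the integral of x over [a, b]."""
    h = (b - a) / n
    return h * sum(x(a + (k + 0.5) * h) for k in range(n))

def u(t):
    return math.sin(t)

def u_plus(t):
    """Pointwise positive part of u."""
    return max(u(t), 0.0)

f_u = integrate(u)            # approximately 0
f_u_plus = integrate(u_plus)  # approximately 2
pos_of_f_u = max(f_u, 0.0)    # (f(u))^+, approximately 0
```

Since `f_u_plus` is near 2 while `pos_of_f_u` is near 0, condition (B) fails for this map, even though integration is additive and order-preserving.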
8.46. Observation. Any lattice homomorphism is order-preserving. (Hint: 3.21.) However, an order-preserving group homomorphism between lattice groups is not necessarily a lattice homomorphism.

Example. Let $X = C[0, 2\pi] = \{\text{continuous functions from } [0, 2\pi] \text{ into } \mathbb{R}\}$, and let $Y = \mathbb{R}$. Define $f : X \to Y$ by $f(x) = \int_0^{2\pi} x(t)\,dt$. Now consider the function $u(t) = \sin(t)$. We have $f(u) = 0$, hence $(f(u))^+ = 0$. On the other hand,
\[ u^+(t) = \max\{0, u(t)\} = \begin{cases} \sin(t) & \text{when } 0 \le t \le \pi, \\ 0 & \text{when } \pi \le t \le 2\pi, \end{cases} \]
hence $f(u^+) = \int_0^{\pi} \sin(t)\,dt = 2$.

UNIVERSAL ALGEBRAS

8.47. We shall now study certain ideas that can be applied simultaneously to lattices, monoids, groups, Abelian groups, rings, lattice groups, etc. This material is taken from McKenzie, McNulty, and Taylor [1987] and other books on varieties or universal algebras.

An arity function, or type, for algebraic systems is a function $\tau$, defined on any nonempty set $J$, taking values in $\{0, 1, 2, 3, \ldots\}$. That function's domain, $J$, may be finite or infinite; both cases will be important in our applications. Concerning the range of $\tau$: In most examples in the literature and in all examples in this book, the arity function $\tau$ actually maps the set $J$ into the set $\{0, 1, 2\}$, but that restriction is not required for the theory; in principle other values are possible. (Some algebraists permit $\tau$ to also take infinite values; an algebraic system is then called infinitary. However, we shall only consider finitary algebraic systems, i.e., those in which each $\tau(j)$ is finite. Some of our results will use this assumption.)

Let $\tau$ be an arity function. An algebraic system of arity $\tau$ is a set $X$ equipped with a collection of functions $\varphi_j : X^{\tau(j)} \to X$ (for $j \in J$). Thus the $j$th function, $\varphi_j$, is a $\tau(j)$-ary operation on $X$ (see 1.40). The functions $\varphi_j$ are called the fundamental operations of $X$. Another term for "algebraic system" is universal algebra.

To be precise, we should denote the system by an expression such as $(X, \tau, J, \{\varphi_j\})$, i.e., the fundamental operations are part of the definition of the algebraic system; different algebraic systems may be built from the same underlying set $X$ by attaching different fundamental operations. However, in practice we often refer to $X$ itself as the algebraic system, with the choices of $J$, $\tau$, $\{\varphi_j\}$ understood.

If $X$ and $X'$ are algebraic systems with the same arity function $\tau$, then their $j$th fundamental operations $\varphi_j$ and $\varphi_j'$ may be quite different, but at least they have the same arity, i.e., they are both $\tau(j)$-ary operations; thus they behave alike in certain important respects. When no confusion will result, we may drop the primes, and use one symbol for both $\varphi_j$ and $\varphi_j'$; for instance, we commonly use the same symbol $+$ in different commutative groups.

The notation involving $\tau$, $J$, $\varphi_j$, etc., is helpful for our present purposes, i.e., developing an abstract theory of algebraic systems, but it is seldom used in the context of individual algebraic systems; see the remarks at the end of 8.53.
On the other hand, in some introductory discussions such as this one it will be helpful to use different symbols (such as $\varphi_j$ and $\varphi_j'$) to distinguish between the operations of different algebraic systems.

8.48. Let $X$ and $X'$ be algebraic systems with the same arity function $\tau$ and corresponding fundamental operations $\varphi_j$ and $\varphi_j'$. A homomorphism from $X$ into $X'$ is a mapping $f : X \to X'$ that preserves the fundamental operations, i.e., that satisfies
\[ f\big(\varphi_j(x_1, x_2, \ldots, x_{\tau(j)})\big) = \varphi_j'\big(f(x_1), f(x_2), \ldots, f(x_{\tau(j)})\big) \]
for all $j \in J$ and all $x_1, x_2, \ldots, x_{\tau(j)} \in X$. We may call this a homomorphism of arity $\tau$ to emphasize the particular arity being used. This generalizes the definitions of lattice homomorphism, monoid homomorphism, group homomorphism, and ring homomorphism, given in 4.26, 8.1, 8.9, and 8.18. (For examples see 8.52.)

Note that this definition does not involve any additional properties that may be enjoyed by the algebraic systems $X$ and $X'$. For instance, a function $f : X \to Y$, from one monoid into another, is a monoid homomorphism if and only if it satisfies $f(x_1 \mathbin{\square_X} x_2) = f(x_1) \mathbin{\square_Y} f(x_2)$ (together with preservation of the identity element), regardless of whether one or both monoids are commutative. Such additional properties may be present, but this additional information is not relevant in determining whether $f$ is a homomorphism.

8.49. Exercise. If $f : X \to Y$ is an isomorphism (i.e., a bijective homomorphism) from one algebraic system of arity $\tau$ onto another, then $f^{-1} : Y \to X$ is also a homomorphism.

8.50. Let $X$ be an algebraic system of arity $\tau$. A term in $X$ is an $n$-ary operation on $X$ that is formed by composing finitely many of the fundamental operations, finitely many times, as explained below. The identity map $x \mapsto x$ will be considered a term; it is the composition of no fundamental operations. For instance, if $\varphi_1$ is a 1-ary operation and $\varphi_2$ is a 2-ary operation, then the function
\[ (w, x, z) \mapsto \varphi_2(x, \varphi_1(z)) \]
is a term in the algebraic system. Note that the right side does not depend on $w$; this illustrates that a term is not required to depend on all of its arguments.

Let $\tau$ be an arity function. By a "term of arity $\tau$" we shall mean a method of specifying a term; it specifies a corresponding term for each algebraic system of arity $\tau$. For instance, if $\tau(1) = 1$ and $\tau(2) = 2$, then we can define a term by $\varphi_2(x, \varphi_1(z))$ but not by $\varphi_2(x, \varphi_1(z), w)$, regardless of other properties that may or may not be enjoyed by the functions $\varphi_1$ and $\varphi_2$. Note that our method of specifying a term depends only on the arities of the $\varphi_j$'s (i.e., the values of $\tau(j)$) and on the order of composition of the $\varphi_j$'s, not on other information about $X$ or the $\varphi_j$'s. The method does not refer to any particular algebraic system $X$. Hence corresponding compositions of fundamental operations can be used to define corresponding terms in different algebraic systems, provided they are of the same arity $\tau$.

Our main interest lies not in all algebraic systems of type $\tau$, but just in those algebraic systems that satisfy a given collection of identities. For instance, $\varphi_1$ is commutative if it is a binary operation satisfying $\varphi_1(x_1, x_2) = \varphi_1(x_2, x_1)$.
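For finite algebraic systems, the homomorphism condition can be verified by brute force over every fundamental operation and every tuple of arguments. The following is a minimal sketch under stated assumptions: an arity function is modeled as a Python dict, the fundamental operations as dicts of callables, and the name `is_homomorphism` is hypothetical. The example systems are the integers modulo 12 and modulo 4, each with a nullary operation (the constant 0) and a binary operation (addition).

```python
from itertools import product

def is_homomorphism(f, X, ops_X, ops_Y, arity):
    """Check f(phi_j(x_1..x_n)) == phi'_j(f(x_1)..f(x_n)) for all j and all arguments."""
    for j, n in arity.items():
        for args in product(X, repeat=n):   # n = 0 yields the single empty tuple
            lhs = f(ops_X[j](*args))
            rhs = ops_Y[j](*(f(a) for a in args))
            if lhs != rhs:
                return False
    return True

arity = {0: 0, 1: 2}                        # one nullary and one binary operation
Z12 = range(12)
ops_Z12 = {0: lambda: 0, 1: lambda a, b: (a + b) % 12}
ops_Z4 = {0: lambda: 0, 1: lambda a, b: (a + b) % 4}

reduce_mod_4 = lambda a: a % 4              # preserves both operations
shifted = lambda a: (a + 1) % 4             # fails already on the nullary operation

good = is_homomorphism(reduce_mod_4, Z12, ops_Z12, ops_Z4, arity)
bad = is_homomorphism(shifted, Z12, ops_Z12, ops_Z4, arity)
```

Note that the check uses only the arity function and the operations themselves; it never asks whether either system satisfies further identities, in line with the remark following the definition.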
8.51. An equational axiom, or identity, for algebras $X$ of arity $\tau$ is a condition on $X$ of the form
\[ p(x_1, x_2, \ldots, x_n) = q(x_1, x_2, \ldots, x_n) \quad \text{for all } x_1, x_2, \ldots, x_n \in X, \]
where $p$ and $q$ are terms of arity $\tau$. Such a condition is satisfied by some algebraic systems of arity $\tau$ and not by others.

For instance, let $\tau$ be an arity function such that $\tau(1) = 2$, i.e., such that $\varphi_1$ is a binary operation. Then the commutative law for $\varphi_1$ is the equational axiom $\varphi_1(x_1, x_2) = \varphi_1(x_2, x_1)$. This equational axiom is satisfied by the binary operation of a commutative group such as $(\mathbb{Z}, +)$, but not by the binary operation of a noncommutative group such as $\mathrm{Perm}(X)$; see 8.10.

Let $\tau$ be an arity function, and let $\mathcal{I}$ be a collection of identities compatible with $\tau$. By an algebraic system of variety $(\tau, \mathcal{I})$ we shall mean an algebraic system $X$ of arity $\tau$ that satisfies all the identities in $\mathcal{I}$. Examples will be given starting in 8.52.

Proposition (optional). The equational variety $(\tau, \mathcal{I})$ is a complete theory, in the following sense: Let $n$ be a nonnegative integer, and let $p$ and $q$ be any terms of arity $\tau$, each taking $n$ arguments. Consider the equation
\[ p(x_1, x_2, \ldots, x_n) = q(x_1, x_2, \ldots, x_n) \quad \text{for all } x_1, x_2, \ldots, x_n \in X \tag{$*$} \]
(not necessarily belonging to $\mathcal{I}$). Then equation $(*)$ is a syntactic theorem in $(\tau, \mathcal{I})$, in the sense that it can be deduced from the identities that belong to $\mathcal{I}$ by using finitely many substitutions, if and only if equation $(*)$ is a semantic theorem in $(\tau, \mathcal{I})$, in the sense that it is satisfied by every algebraic system of type $(\tau, \mathcal{I})$.

Remarks. Thus, for any equation $(*)$, we can find either a proof (as in 8.7) or a counterexample (as in 8.27, which shows by example that not every ring is commutative). Related discussion: see 14.58.

Sketch of proof. We shall omit most of the proof, since it is not needed later in this book; it can be found in more detail in Johnstone [1987] and in other textbooks. Obviously, any syntactic theorem is also a semantic theorem. To prove any semantic theorem is a syntactic theorem, the main idea is this: Call two terms $\alpha$ and $\beta$ "equivalent" if the equation $\alpha = \beta$ is a syntactic theorem in $(\tau, \mathcal{I})$; this is an equivalence relation on terms. The quotient set, i.e., the set of all equivalence classes, can be made into an algebraic system $\Sigma$ of type $(\tau, \mathcal{I})$ in a natural way. Since $p = q$ is a semantic theorem, it is satisfied by $\Sigma$; hence $p = q$ is a syntactic theorem.

Optional exercise. Carry out this argument in detail for some particularly simple variety, e.g., the variety of monoids.
EXAMPLES OF EQUATIONAL VARIETIES

8.52. Let $\tau$ be defined on the set $J = \{1, 2\}$ by the values $\tau(1) = \tau(2) = 2$. Then an algebraic system of arity $\tau$ means a set $X$ equipped with two binary fundamental operations. A lattice is an algebraic system with this arity function $\tau$, with fundamental operations denoted by $\vee$ and $\wedge$, which also satisfies the equational axioms L1-L3 of 4.20. Thus, lattices make up the equational variety $(\tau, \{\mathrm{L1}, \mathrm{L2}, \mathrm{L3}\})$. Some algebraic systems of arity $\tau$ are lattices, and some are not.

In many contexts we describe a lattice in terms of its ordering $\preccurlyeq$, but for purposes of this chapter we must instead describe a lattice in terms of its fundamental operations $\vee$, $\wedge$.

Let $X$ and $Y$ be two such algebraic systems. Then a mapping $f : X \to Y$ is a homomorphism (of arity $\tau$) if and only if it satisfies
\[ f(x_1 \vee x_2) = f(x_1) \vee f(x_2), \qquad f(x_1 \wedge x_2) = f(x_1) \wedge f(x_2) \]
for all $x_1, x_2 \in X$. A lattice homomorphism is a homomorphism of arity $\tau$ between two lattices.

Most other kinds of ordered sets (posets, directed sets, chains, etc.) cannot be described in an analogous fashion, and so they do not form equational varieties.

8.53. Let $\tau$ be the function defined on $J = \{0, 1\}$ by the values $\tau(0) = 0$ and $\tau(1) = 2$. Then an algebraic system of arity $\tau$ is a set $X$ equipped with one nullary operation $\varphi_0$ (i.e., a specially selected constant member of $X$) and one binary operation $\varphi_1$. A homomorphism from one algebraic system of this arity to another is a mapping $f : X \to X'$ that satisfies
\[ f(\varphi_0) = \varphi_0' \qquad\text{and}\qquad f\big(\varphi_1(x_1, x_2)\big) = \varphi_1'\big(f(x_1), f(x_2)\big) \]
for all $x_1, x_2 \in X$.

A monoid is an algebraic system with this arity $\tau$ whose two fundamental operations $\varphi_0$, $\varphi_1$ satisfy these three axioms:
\[ \varphi_1(x, \varphi_1(y, z)) = \varphi_1(\varphi_1(x, y), z) \quad \text{(associative law)} \]
\[ \varphi_1(\varphi_0, x) = x, \qquad \varphi_1(x, \varphi_0) = x \quad \text{(identity laws)} \]

A group is an algebraic system of arity $\sigma$ defined on $\{0, 1, 2\}$ by $\sigma(0) = 0$, $\sigma(1) = 2$, $\sigma(2) = 1$ (that is, with the same fundamental operations as monoids, plus a unary operation $\varphi_2$) and that satisfies the equational axioms above and also these two equations:
\[ \varphi_1(x, \varphi_2(x)) = \varphi_0, \qquad \varphi_1(\varphi_2(x), x) = \varphi_0 \quad \text{(inverse laws)} \]
A homomorphism of arity $\sigma$ means a homomorphism $f : X \to X'$ of arity $\tau$ that also satisfies $f(\varphi_2(x)) = \varphi_2'(f(x))$, regardless of whether $X$, $X'$ satisfy the equational axioms for a group.

A monoid or group is commutative if it satisfies the equational axioms listed above plus this axiom:
\[ \varphi_1(x_1, x_2) = \varphi_1(x_2, x_1) \quad \text{(commutative law)} \]
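Because the monoid and group axioms are equations quantified over all elements, they can be tested exhaustively on a finite set. The sketch below is only an illustration: the function names `is_monoid` and `is_group` are assumptions, and the operations are passed as a constant `e`, a binary callable `op`, and a unary callable `inv`, corresponding to a nullary, a binary, and a unary fundamental operation.

```python
from itertools import product

def is_monoid(X, e, op):
    """Associative law and both identity laws, checked on every tuple."""
    X = list(X)
    assoc = all(op(x, op(y, z)) == op(op(x, y), z) for x, y, z in product(X, repeat=3))
    ident = all(op(e, x) == x and op(x, e) == x for x in X)
    return assoc and ident

def is_group(X, e, op, inv):
    """Monoid axioms plus both inverse laws."""
    X = list(X)
    inverses = all(op(x, inv(x)) == e and op(inv(x), x) == e for x in X)
    return is_monoid(X, e, op) and inverses

Z6 = range(6)
add = lambda a, b: (a + b) % 6
mul = lambda a, b: (a * b) % 6

add_is_group = is_group(Z6, 0, add, lambda a: (-a) % 6)
mul_is_monoid = is_monoid(Z6, 1, mul)
mul_is_group = is_group(Z6, 1, mul, lambda a: a)   # candidate inverse; 0 has no inverse at all
```

Multiplication modulo 6 passes the monoid test but no choice of `inv` can satisfy the inverse laws, since 0 times anything is 0.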
Of course, the various properties of monoids and groups take a simpler appearance if we use the notations introduced earlier in this chapter:
\[ \varphi_0 = i, \qquad \varphi_1(x_1, x_2) = x_1 \mathbin{\square} x_2, \qquad \varphi_2(x) = x^{-1}. \]
Thus, we prefer those notations when we are working solely with monoids or groups. The notation $\varphi_0$, $\varphi_1$, $\varphi_2$ is advantageous only when we are trying to see how monoids and groups fit into a more general theory of algebraic systems.

8.54. Rings with unit were introduced in 8.18. A ring with unit is an algebraic system with arity function given by the table below and satisfying certain identities that we shall not list here. Rings with unit form an equational variety.

    j           0     1     2     3     4
    tau(j)      0     2     1     0     2
    phi_j       0     +     -     1     .

Attaching one more equational axiom, we obtain commutative rings with unit, another equational variety.

A field $X$ has a multiplicative inverse operation $x \mapsto x^{-1}$, but that operation is only defined on $X \setminus \{0\}$, not on all of $X$. Consequently we cannot view fields as an equational variety (unless we replace our definitions with much more complicated definitions, as some mathematicians do). Instead we shall simply view a field as a particularly interesting member of the variety of commutative rings with unit.

A Boolean ring is a ring with unit, in which each element satisfies $x^2 = x$. The fundamental operations of a Boolean ring are the fundamental operations of a ring with unit: $(0, +, -, 1, \cdot)$. Thus, Boolean rings form an equational variety; we simply add one more identity to the list of identities for rings with unit. Boolean rings will be studied in 13.13 and thereafter.

Boolean lattices are another equational variety, with rather different fundamental operations $0, 1, \vee, \wedge, \complement$, satisfying rather different equational axioms, described in 13.1. However, Boolean rings and Boolean lattices are different views of the same objects: Boolean rings and Boolean lattices can be transformed into each other, in a certain sense, as described in 13.14. The terms "Boolean ring," "Boolean lattice," and "Boolean algebra" are used interchangeably in some of the literature, but in this book we distinguish between the ring and lattice viewpoints.

8.55. Let $F$ be a field. In Chapter 11 we shall introduce $F$-linear spaces. These form an equational variety, but their arity is a little more complicated to describe. The operations of an $F$-linear space $X$ are the operations of an additive group, together with the operation of scalar multiplication. In most contexts, scalar multiplication is thought of as a mapping $m : (c, x) \mapsto cx$, from $F \times X$ into $X$. However, to fit scalar multiplication into our theory of universal algebras, we prefer to think of scalar multiplication as a collection of many unary operations $m_c : x \mapsto cx$. We have one mapping from $X$ into $X$ for each $c \in F$. If $F$ is an infinite field (such as the real numbers $\mathbb{R}$ or the complex numbers $\mathbb{C}$), then we have infinitely many of these unary operations.

We obtain the equational variety of $F$-linear algebras (defined as in 11.3) by adding one more fundamental operation (vector multiplication) governed by a few more identities.

8.56. Lattice groups were introduced in 8.38. They are an equational variety, with the fundamental operations of additive groups plus the fundamental operations of a lattice, and with appropriate equational axioms.

Part of our definition of a lattice group was the translation-invariance of the ordering, $x \preccurlyeq y \Rightarrow x + z \preccurlyeq y + z$. However, an implication is not an equational axiom, and a partial ordering is not a binary operator. Thus, the condition $x \preccurlyeq y \Rightarrow x + z \preccurlyeq y + z$ is not permitted as an ingredient in our theory of universal algebras. How can the condition be reformulated? We can dispense with $\preccurlyeq$, replacing statements of the form $x \preccurlyeq y$ with corresponding statements of the form $x \vee y = y$. The translation-invariance of the ordering can then be restated as the equational axiom $(x + z) \vee (y + z) = (x \vee y) + z$ (introduced in 8.42.a).

Vector lattices and lattice algebras will be introduced in 11.44. They are equational varieties, with the fundamental operations of vector spaces or algebras together with $\vee$, $\wedge$.

Ordered monoids, ordered groups, and ordered vector spaces, introduced in 8.30 and 11.44, are not equational varieties, since their orderings cannot be described in terms of fundamental operations.
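The reformulation of translation-invariance as an equation can be spot-checked in a small lattice group. The sketch below uses pairs of integers with coordinatewise addition and the product ordering, restricted to a finite sample of points; the sample range and the helper names `add`, `join`, and `leq` are assumptions made for illustration. It checks both that the ordering is recovered from the join and that the equational form of translation-invariance holds on the sample.

```python
from itertools import product

def add(x, y):
    """Group operation: coordinatewise addition."""
    return tuple(a + b for a, b in zip(x, y))

def join(x, y):
    """Lattice operation: coordinatewise maximum."""
    return tuple(max(a, b) for a, b in zip(x, y))

def leq(x, y):
    """Product ordering."""
    return all(a <= b for a, b in zip(x, y))

pts = list(product(range(-2, 3), repeat=2))   # 25 sample points

# x <= y exactly when x v y == y, so the ordering is definable from the join.
order_recovered = all(leq(x, y) == (join(x, y) == y) for x, y in product(pts, repeat=2))

# Equational form of translation-invariance: (x+z) v (y+z) == (x v y) + z.
axiom_holds = all(
    join(add(x, z), add(y, z)) == add(join(x, y), z)
    for x, y, z in product(pts, repeat=3)
)
```

A finite sample cannot prove the identity for the whole group, but it shows how an order-theoretic condition becomes a condition that mentions only fundamental operations.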
linear)] J \ J Fnormed spaces (contin. (An additional chart at the beginning of Chapter 22 shows some more advanced categories. additive) I I TVS (contin. which (in most cases 208 . (contin.) I topological spaces (continuous)[ lattices (lattice homom. The chart above shows some of the most basic categories that we shall consider in this book.Chapter 9 Concrete Categories I sets (functions)i posets (increasing) monoids (monoid homom. contin.) The components of a category are (i) its objects .1. contin.sets with additional structure and (ii) its m o r p h i s m s mappings between those sets.) Gnormed spaces (contin. Preview. linear) I 9.) lattice groups (additive lattice homomorphisms) vector lattices (linear lattice homom.) additive groups (additive maps) uniform spaces (unif.) metric sp.) linear spaces (linear maps) TAG (contin. additive) metric spaces (unif..
Morphisms are indicated in parentheses in the chart. It should not be confused with Baire category theory. and Mac Lane and Birkhoff [1967]. 9. this chapter may be considered as a preview of those categories.14. for instance. suppose X is isomorphic to a subset of a set Y. Different categories groups.Y. The category theory being introduced here is based loosely on the theory of Eilenberg and Mac Lane. If two objects A and B are isomorphic.3. for instance. Such a mapping is then called an i s o m o r p h i s m . see particularly 9. etc. algebraic. being concerned with different kinds of structures order. The "essence" of the objects is the part of them that does not depend on the particular choice of representation.2. The language of the EilenbergMac Lane theory is useful to us. They can be used interchangeably. Some. More generally. We say that two objects X and Y are i s o m o r p h i c if there is a correspondence between them that preserves (in both directions) all the structure currently of interest. Some of the categories mentioned in this chapter are not introduced formally until later. then we may identify X with that subset and write X c_ Y. "topological spaces (continuous)" is included in the chart to indicate the category whose objects are topological spaces and whose morphisms are continuous maps between those spaces.6. but not all.5). The EilenbergMac Lane theory was originally developed mainly for applications in algebraic topology (discussed briefly in 9. and treat them as equal. and examples will be given in some detail starting in 9.33). Introductory discussion. of all abstract thinking). We ~ a y even write X . etc. of these forgetful functors are given by the inclusion of a subcategory in a category (discussed in 9. uniform. but we shall take the liberty of modifying that language slightly to make it more useful for the purposes of analysts. Some other introductions to category theory can be found in Herrlich and Strecker [1979]. 
recently it has also been useful in the abstract theory of computer programs. Different branches of mathematics. because for most practical purposes they are the "same" set. topological. When two objects X and Y are isomorphic. This interchangeability is the heart of mathematics (and. then they differ only in their labeling and are essentially two different representations of the same object. However. most meanings of isomorphic and isomorphism can be subsumed by one abstract meaning developed in this chapter.) However. topological spaces. so . Precise definitions will be given in 9. have different properties. most theorems of EilenbergMac Lane category theory are irrelevant to the purposes of this book and will be omitted. Mac Lane [1971]. have different meanings for the terms "isomorphic" and "isomorphism. although this term has more specific meanings in some contexts. an unrelated topic introduced elsewhere in this book. if this will not cause confusion.Preview 209 of interest) preserve that additional structure in at least one direction. indeed. the "essence" of the number 4 does not depend on whether we are dealing with four apples or four airplanes.34). The line segments in the chart indicate natural relations between categories via forgetful functors (discussed in 9. thus some of our definitions differ slightly from the definitions to be found in books on category theory. A structurepreserving map from one set into another is sometimes called an e m b e d d i n g . provided that we are willing to relabel everything else that they interact with. we may sometimes identify X and Y." (This multiplicity of meanings may confuse some beginners.
ultimately they must be studied separately. However, there are analogies between the most elementary properties of these different categories, e.g., between subgroups and topological subspaces, or between products of groups and products of topological spaces. These analogies may help the beginner through the unavoidable plethora of definitions and elementary propositions.

DEFINITIONS AND AXIOMS

9.3. Following are precise definitions. Some readers will find it helpful to glance ahead to the examples, which begin in 9.6.

A concrete category consists of a collection of objects and a collection of morphisms.

An object is a pair $(X, \mathcal{S})$ consisting of a set $X$ and some additional structure $\mathcal{S}$ on $X$ (such as a preordering or a $\sigma$-algebra); then $X$ is called the underlying set of the object $(X, \mathcal{S})$. The nature of the "additional structure" will vary from one category to another, but the meaning of this term will be clear in particular categories. We will often refer to $X$ itself as the object if the choice of $\mathcal{S}$ is clear or does not need to be mentioned explicitly, but it is understood that $\mathcal{S}$ is still part of the object. Two objects $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ are considered equal (as objects) if $X = Y$ and $\mathcal{S} = \mathcal{T}$. One set $X$ may give rise to several different objects, by being equipped with several different structures; for instance, two different preordered sets may contain the same points.

To define morphisms, consider triples $((X, \mathcal{S}), (Y, \mathcal{T}), f)$ consisting of two objects of the given category and a function $f : X \to Y$ whose domain and codomain are the underlying sets $X$ and $Y$ of those two objects. The collection of all such triples forms a class that is usually larger than what we want. Some subclass will be specified as the collection of morphisms for the category. However, the specified subclass must satisfy two axioms noted below. When $f : (X, \mathcal{S}) \to (Y, \mathcal{T})$ is a morphism, we call $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ its domain and codomain, respectively. A morphism is also sometimes known as an arrow.

The collection of morphisms must satisfy these two axioms:

(i) (Compositions) Any composition of two morphisms is a morphism. In other words, if $A$, $B$, $C$ are objects and $f : A \to B$ and $g : B \to C$ are morphisms, then $g \circ f : A \to C$ is a morphism.

(ii) (Identity) For each object $A = (X, \mathcal{S})$, the identity map $i_X$ on the underlying set $X$ is a morphism from $A$ to $A$.

In most categories of interest, the class of morphisms is chosen so that the morphisms preserve the structure of the objects in at least one direction, but that preservation is not explicitly built into the two axioms listed above. In principle, it is possible to create a
category in which the morphisms are entirely unrelated to the additional structure, but that would not be a particularly interesting category.

The literature of category theory sometimes refers to categories by their objects. For instance, when topological spaces are used for objects, continuous maps are almost invariably used for morphisms, so we may refer to the "category of topological spaces." However, in general, a particular choice of objects does not force upon us a particular description of morphisms, nor vice versa. For instance, we may refer to the "category of metric spaces with continuous maps," or the "category of metric spaces with uniformly continuous maps;" these are two different categories.

Whenever we discuss two or more objects and/or morphisms together, it should be understood that all the objects and/or morphisms being considered are in the same category, unless specified otherwise.

9.4. A set $X$ may be made into an object in more than one way, by equipping it with different structures $\mathcal{S}_1, \mathcal{S}_2, \ldots$; for instance, different topologies or different orderings. Whether or not a function $f : X \to Y$ is a morphism depends on what structures $\mathcal{S}$, $\mathcal{T}$ we attach to $X$ and $Y$. Thus, in many categories of interest, the function $f$ may be a morphism for one choice of structures but not for another choice.

If the identity map $i_X$ is a morphism from $(X, \mathcal{S})$ into $(X, \mathcal{T})$, we say that $\mathcal{S}$ is stronger (or finer) than $\mathcal{T}$, or that $\mathcal{T}$ is weaker (or coarser) than $\mathcal{S}$. Clearly, the relation "stronger than" is a preordering (i.e., transitive and reflexive) on the collection of all structures on $X$. Thus any structure is stronger than itself. Here mathematical language differs from everyday English, which would prohibit anything from being stronger than itself; see Chapter 1.

In many categories of interest (but not all), this preordering is also antisymmetric and thus a partial ordering; i.e., if the identity map $i_X : X \to X$ is a morphism in both directions between $(X, \mathcal{S})$ and $(X, \mathcal{T})$, then $\mathcal{S} = \mathcal{T}$. For instance, if two topologies are both stronger than each other, then they are equal; see the last paragraph of 9.8. In some categories, if either structure is stronger than the other, then they are equal; see the last paragraph of 9.11.

This terminology ("stronger," "weaker," etc.) will also be applied to any devices (metrics, gauges, etc.) that are used to define the structure of a category. For instance, let $d$ and $e$ be metrics that determine topologies $\tau_d$ and $\tau_e$ and uniformities $\mathcal{U}_d$ and $\mathcal{U}_e$ on a set $X$. We shall say that $d$ is topologically stronger than $e$ if $\tau_d$ is stronger than $\tau_e$ (that is, if $\tau_d \supseteq \tau_e$); we shall say that $d$ is uniformly stronger than $e$ if $\mathcal{U}_d$ is stronger than $\mathcal{U}_e$ (i.e., if $\mathcal{U}_d \supseteq \mathcal{U}_e$). Two metrics are topologically equivalent or uniformly equivalent if they determine the same topology or the same uniformity. This syntactic convention also applies to other devices than metrics; for instance, it also applies to gauges. It even applies to uniformities: one uniformity $\mathcal{U}$ is topologically stronger than another uniformity $\mathcal{V}$ if it determines a stronger topology, where the topologies are defined as in Chapter 5.

If the context is understood, then we may omit mentioning the category; i.e., we may simply say $d$ is stronger than $e$ or $d$ is equivalent to $e$. This omission is made most often for topological structure; i.e., if no other meaning is evident, then "stronger" usually means "topologically stronger."
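The definition of "stronger" in 9.4 can be tried out on finite topological spaces. The following is only a minimal sketch under stated assumptions: topologies are represented as Python sets of frozensets, maps as dictionaries, and the helper names (`preimage`, `is_continuous`, `is_stronger`) are hypothetical, not notation from this book.

```python
def preimage(f, subset, domain):
    """Inverse image of `subset` under the map `f` (a dict defined on `domain`)."""
    return frozenset(x for x in domain if f[x] in subset)


def is_continuous(f, X, S, T):
    """f : (X, S) -> (Y, T) is a morphism iff the preimage of each member of T lies in S."""
    return all(preimage(f, U, X) in S for U in T)


def is_stronger(X, S, T):
    """S is stronger than T iff the identity map is a morphism from (X, S) into (X, T)."""
    identity = {x: x for x in X}
    return is_continuous(identity, X, S, T)


# Three topologies on the two-point set X = {0, 1} (illustrative choices).
X = frozenset({0, 1})
indiscrete = {frozenset(), X}
sierpinski = {frozenset(), frozenset({0}), X}
discrete = {frozenset(), frozenset({0}), frozenset({1}), X}

chain = [indiscrete, sierpinski, discrete]
# For topologies, "S stronger than T" should agree with the inclusion T <= S.
agrees_with_inclusion = all(
    is_stronger(X, S, T) == (T <= S) for S in chain for T in chain
)
```

In this sketch every topology is stronger than itself, and two topologies that are each stronger than the other coincide, which matches the reflexivity and antisymmetry remarks made for topologies.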
9.5. Let $\mathcal{G}$ and $\mathcal{H}$ be categories. We say that $\mathcal{G}$ is a subcategory of $\mathcal{H}$ if these two conditions are satisfied:

(i) $\{\mathcal{G}\text{-objects}\} \subseteq \{\mathcal{H}\text{-objects}\}$. More precisely, every $\mathcal{G}$-object can also be viewed as an $\mathcal{H}$-object, perhaps via some change of description, as in 9.10, and the mapping from $\mathcal{G}$-objects to $\mathcal{H}$-objects is injective.

(ii) Whenever $A$ and $B$ are objects of $\mathcal{G}$ (and hence also objects of $\mathcal{H}$), then every $\mathcal{G}$-morphism from $A$ into $B$ is also an $\mathcal{H}$-morphism from $A$ into $B$.

We say $\mathcal{G}$ is a full subcategory of $\mathcal{H}$ if condition (i) is satisfied, as well as the following strengthened version of condition (ii):

(ii') Whenever $A$ and $B$ are objects of $\mathcal{G}$ (and hence also objects of $\mathcal{H}$), then the $\mathcal{G}$-morphisms from $A$ into $B$ are the same as the $\mathcal{H}$-morphisms from $A$ into $B$.

Examples will be given below; see particularly 9.10.

EXAMPLES OF CATEGORIES

9.6. The simplest category is the category of sets, in which the objects are sets (without any additional structure specified) and the morphisms are functions. For an isomorphism in this category, we might use a bijection between two sets. Then two sets are isomorphic if they have the same cardinality.

9.7. A category can be formed by taking convergence spaces for objects and convergence preserving maps for morphisms; see Chapter 7.

9.8. Inverse image categories. The categories of measurable spaces, topological spaces, and uniform spaces differ in their deeper properties, but they are quite similar in their most elementary properties. In each of these categories, an object is a pair $(X, \mathcal{S})$ consisting of a set $X$ and a collection $\mathcal{S}$ of specially designated sets: a $\sigma$-algebra or topology $\mathcal{S} \subseteq \mathcal{P}(X)$, or a uniformity $\mathcal{S} \subseteq \mathcal{P}(X \times X)$. In each of these categories, a morphism is a mapping with respect to which the inverse image of a specially designated set is also a specially designated set. This is explained in greater detail below.

Topological spaces form the objects of a category. In this category, a morphism $f : (X, \mathcal{S}) \to (Y, \mathcal{T})$ is a mapping $f : X \to Y$ with the property that $T \in \mathcal{T} \Rightarrow f^{-1}(T) \in \mathcal{S}$; i.e., for which the inverse image of an open set is an open set. Such functions are called continuous maps. Some elementary examples of continuous maps are given in Chapter 15.

Measurable spaces form the objects of a category. In this category, a morphism $f : (X, \mathcal{S}) \to (Y, \mathcal{T})$ is a mapping $f : X \to Y$ with the property that $T \in \mathcal{T} \Rightarrow f^{-1}(T) \in \mathcal{S}$; i.e., for which the inverse image of a measurable set is a measurable set. Such functions are called measurable mappings. In some contexts the choices of $\mathcal{S}$ and $\mathcal{T}$ are understood and do not need to be mentioned (one may refer to a measurable mapping from $X$ to $Y$), but we emphasize that the meaning of "measurable mapping" does nevertheless depend
very much on the choices of $\mathcal{S}$ and $\mathcal{T}$. In most of the theory developed in later chapters, the codomain $Y$ is a topological space and $\mathcal{T}$ is the $\sigma$-algebra of Borel subsets of $Y$, but no such restriction will be imposed in the more general theory developed in this chapter. (We remark that even greater restrictions are imposed in applied mathematics. In that context, a "measurable mapping" usually means a measurable mapping from an open subset of $\mathbb{R}^m$ equipped with its Lebesgue measurable subsets to an open subset of $\mathbb{R}^n$ equipped with its Borel subsets. This class of mappings is not closed under composition; thus, the "measurable mappings" of applied mathematics do not form the morphisms of a category.)

Uniform spaces form the objects of a category. In this category, a morphism $f : (X, \mathcal{S}) \to (Y, \mathcal{T})$ is a mapping $f : X \to Y$ with the property that

$$T \in \mathcal{T} \;\Rightarrow\; \{(x_1, x_2) \in X \times X : (f(x_1), f(x_2)) \in T\} \in \mathcal{S};$$

i.e., for which the inverse image of a vicinity is a vicinity. Such functions are called uniformly continuous mappings. Such functions are studied further, and examples are given, in Chapter 18.

It will sometimes be convenient to adopt a notation that makes these three categories look more alike. For any set $X$, let us define

$$\widehat{X} = \begin{cases} X & \text{for measurable or topological spaces,} \\ X \times X & \text{if we are working with uniform spaces.} \end{cases}$$

For any mapping $f : X \to Y$, define a mapping $\widehat{f} : \widehat{X} \to \widehat{Y}$ by taking

$$\widehat{f} = \begin{cases} f & \text{for measurable or topological spaces,} \\ f \times f & \text{if we are working with uniform spaces,} \end{cases}$$

where $(f \times f) : (X \times X) \to (Y \times Y)$ is defined by $(f \times f)(x_1, x_2) = (f(x_1), f(x_2))$. With these conventions, an object (in any of the three categories) consists of a pair $(X, \mathcal{S})$ where $\mathcal{S}$ is a collection of subsets of $\widehat{X}$ satisfying certain axioms, and a morphism $f : (X, \mathcal{S}) \to (Y, \mathcal{T})$ is a mapping $f : X \to Y$ with the property that $T \in \mathcal{T} \Rightarrow \widehat{f}^{-1}(T) \in \mathcal{S}$.

It is easy to verify that if a mapping $f : X \to Y$ satisfies $T \in \mathcal{G} \Rightarrow \widehat{f}^{-1}(T) \in \mathcal{S}$ for some generating collection of sets $\mathcal{G} \subseteq \mathcal{T}$, then $f$ is a morphism; here "generating collection" is defined as in Chapter 5.

For each of our inverse image categories, a structure $\mathcal{S}$ on a set $X$ is stronger than another structure $\mathcal{T}$ on the same set $X$ precisely when $\mathcal{S} \supseteq \mathcal{T}$. Thus, the stronger structure is represented by the larger set.

We note a few important examples of subcategories that will be studied in later chapters: Use metric spaces for objects, and use uniformly continuous maps for morphisms; the resulting category is a subcategory of either the topological spaces or the uniform spaces. Further subcategories are obtained by further restricting the choice of morphisms: Use
Hölder continuous maps, Lipschitzian maps, or nonexpansive maps. Indeed, we shall see in later chapters that these classes of maps are closed under composition.

Some functional analysis books gloss over the distinction between topological spaces and uniform spaces, because in the setting of topological vector spaces (or more generally, in the setting of topological Abelian groups) the two kinds of structures are nearly interchangeable: There is a one-to-one correspondence between topologies and "nice" uniformities, as shown in 26.37. However, we shall maintain a distinction between topological and uniform spaces, because this facilitates understanding and because occasionally one wants to apply these concepts in some context other than that of topological Abelian groups.

A pointed topological space is a topological space $X$ with a particular point $x_0 \in X$ selected; that point is called the base point of the space. A category can be formed with objects consisting of pointed topological spaces, and with morphisms consisting of continuous maps that preserve the base points. This category will be important in 9.33.

9.9. Categories of ordered sets can be formed in many ways. Perhaps the simplest way is to use preordered sets for objects and to use increasing mappings for morphisms. Many important subcategories of this category can be obtained by using a smaller collection of objects (e.g., chains or complete lattices) and/or by using some smaller collection of morphisms (e.g., sup-preserving maps).

In the category of preordered sets with increasing mappings, the statement "$\preccurlyeq$ is stronger than $\sqsubseteq$" (as defined in 9.4) means that $x \preccurlyeq y \Rightarrow x \sqsubseteq y$. Equivalently, it means $\mathrm{Graph}(\preccurlyeq) \subseteq \mathrm{Graph}(\sqsubseteq)$. Thus the stronger preordering is represented by the smaller set, in contrast with the situation described in the last paragraph of 9.8.

9.10. Exercise (optional). Let $X$ and $Y$ be preordered sets, equipped with their lower set topologies (defined as in Chapter 5). Show that a function $f : X \to Y$ is increasing if and only if it is continuous (defined in 9.8). Conclude that preordered sets (with increasing mappings for morphisms) are a full subcategory of topological spaces (with continuous mappings for morphisms).

Caution: Although we may view each preordered set $(X, \preccurlyeq)$ as a topological space $(X, \mathcal{S})$, note that the ordering is not equal to the topology. Indeed, the ordering $\preccurlyeq$ (or its graph) is a subset of $X \times X$, whereas the topology $\mathcal{S}$ is a subset of $\mathcal{P}(X)$. Thus, we change our description of the object when we go from $(X, \preccurlyeq)$ to $(X, \mathcal{S})$.

9.11. Algebraic categories. Let $\tau$ be an arity function (defined in 8.47). The universal algebras of type $\tau$ can be used for the objects of a category, with the homomorphisms of type $\tau$ (defined in 8.48) for the morphisms. However, generally we are interested in a full subcategory of that category, obtained as follows: Let $\mathcal{J}$ be a collection of identities compatible with $\tau$; then the algebraic systems of variety $(\tau, \mathcal{J})$ (defined in 8.50) can be used as the objects for a category. Examples are the category of monoids, the category of groups, the category of Abelian groups, the category of rings, the category of lattices, and the category of lattice groups.

In an algebraic category,

($*$) if $f : X \to Y$ is a bijection and a morphism, then $f^{-1} : Y \to X$ is also a morphism.
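The invertibility statement for algebraic categories in 9.11 can be contrasted with the situation for preordered sets on a small case. This is a hedged sketch, not the book's own construction: the monoid $\mathbb{Z}_4$ under addition, the two orderings on $\{0, 1\}$, and the helper names (`is_homomorphism`, `is_increasing`, `inverse`) are illustrative assumptions.

```python
from itertools import product


def is_homomorphism(f, elems, op_x, op_y, e_x, e_y):
    """Monoid homomorphism: preserves the identity element and the binary operation."""
    return f[e_x] == e_y and all(
        f[op_x(a, b)] == op_y(f[a], f[b]) for a, b in product(elems, repeat=2)
    )


def is_increasing(f, elems, leq_x, leq_y):
    """Order morphism: a <= b in the domain implies f(a) <= f(b) in the codomain."""
    return all(
        leq_y(f[a], f[b]) for a, b in product(elems, repeat=2) if leq_x(a, b)
    )


def inverse(f):
    """Inverse of a bijection given as a dict."""
    return {v: k for k, v in f.items()}


# Algebraic case: x -> 3x (mod 4) is a bijective homomorphism of (Z_4, +, 0).
Z4 = range(4)
add4 = lambda a, b: (a + b) % 4
f = {x: (3 * x) % 4 for x in Z4}
hom_forward = is_homomorphism(f, Z4, add4, add4, 0, 0)
hom_backward = is_homomorphism(inverse(f), Z4, add4, add4, 0, 0)

# Ordered case: the identity map from ({0, 1}, equality) to ({0, 1}, usual <=)
# is an increasing bijection, but its inverse is not increasing.
P = [0, 1]
equality = lambda a, b: a == b
usual = lambda a, b: a <= b
ident = {0: 0, 1: 1}
inc_forward = is_increasing(ident, P, equality, usual)
inc_backward = is_increasing(ident, P, usual, equality)
```

In the ordered example the inverse fails to be a morphism because $0 \le 1$ in the usual ordering while $0$ and $1$ are unrelated under equality; nothing of this kind can happen for the homomorphism.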
This invertibility property is not shared by most other kinds of categories studied in this book. Thus, if $\mathcal{S}$ and $\mathcal{T}$ are two structures on a set $X$, and the identity map $i_X$ is a morphism in either direction between $(X, \mathcal{S})$ and $(X, \mathcal{T})$, then in fact $\mathcal{S} = \mathcal{T}$. (That follows from ($*$).) Thus, in an algebraic category, if one structure is stronger than another, then they are equal; it is not particularly meaningful to discuss whether one structure on a set is "stronger" than another.

9.12. Remarks: overview of categories. Early chapters of this book are devoted to the simplest categories. Later chapters will introduce more complicated and specialized categories. Many of these are "hybrid categories," combining structures of two simpler categories and also imposing some condition of compatibility between the two structures. For instance, a topological linear space has both a topology and a linear structure, which must be compatible in that the vector space operations are jointly continuous.

There is some overlap between our general classes of categories. For instance, lattices may be viewed as algebraic systems $(X, \vee, \wedge)$ or as preordered sets $(X, \preccurlyeq)$. Each viewpoint has its advantages, depending on what properties and structures we wish to study. Viewing an object in different categories may yield different kinds of information about that object. Following are a few examples.

Every metric determines a topology (see 5.15.g), but there are also some topologies that are not determined by any metric. The topologies that can be determined by a metric are called metrizable topologies. Thus, metrizable topological spaces form a full subcategory of topological spaces (both equipped with continuous maps for morphisms). On the other hand, one topology $\mathcal{T}$ may be determined by two different metrics $d$, $d'$. Thus the category of metrizable topological spaces $(X, \mathcal{T})$ is slightly different from the category of metric spaces $(X, d)$ (both equipped with continuous maps), since $(X, d)$ and $(X, d')$ are different objects in the latter category. Different questions arise naturally in these slightly different categories. For instance:

• A theorem of Banach states that any strict contraction self-mapping of a complete metric space has a fixed point. This is a statement about metric spaces. If we replace the metric with another metric that yields the same topology, the self-mapping may no longer be a strict contraction. (Meyers' converse in 19.47 considers the effects of such a replacement.)

• A theorem of Baire states that in a topologically complete space (i.e., a topological space $(X, \mathcal{T})$ whose topology can be given by various metrics, at least one of which is complete), the intersection of countably many open dense sets is dense. This is a statement about metrizable spaces. If we replace a metric with an equivalent metric, the open sets and the dense sets are unaffected.

This kind of distinction is also displayed in a chart in 18.1.

9.13. Nonconcrete categories (optional). Concrete categories will suffice for the purposes of this book, but the reader should be aware that the Eilenberg-Mac Lane theory deals with
other kinds of categories as well. A category consists of certain collections of mathematical devices called objects (not necessarily sets) and morphisms (not necessarily functions), satisfying certain rules listed below. Each morphism is represented in the form $f : A \to B$, where $f$ is the name of the morphism and $A$ and $B$ are objects called the domain and codomain of the morphism, respectively. Morphisms must satisfy these rules:

(i) If $f : A \to B$ and $g : B \to C$ are morphisms, then there exists a morphism $g \circ f : A \to C$, called the composition of $f$ and $g$.

(ii) Composition of morphisms is associative: $f \circ (g \circ h) = (f \circ g) \circ h$.

(iii) For each object $A$, there exists a morphism $i_A : A \to A$, called the identity morphism of $A$, satisfying $g \circ i_A = g$ for any morphism $g$ with domain $A$, and $i_A \circ f = f$ for any morphism $f$ with codomain $A$.

It is an easy exercise to show that the identity morphism is unique; i.e., if $i_A$ and $i'_A$ both satisfy condition (iii), then $i_A = i'_A$.

We mention two examples of nonconcrete categories:

a. Let $(S, \preccurlyeq)$ be any preordered set. For objects take elements of $S$; for morphisms take ordered pairs $(x, y)$ satisfying $x \preccurlyeq y$. Note that in this category, for any two objects $x$ and $y$ there is at most one morphism from $x$ to $y$.

b. Let $X$ and $Y$ be topological spaces, and let $[0,1]$ have its usual topology. Let $[0,1] \times X$ have the product topology (discussed elsewhere in this chapter and in Chapter 15). Let $f, g : X \to Y$ be continuous mappings. A homotopy from $f$ to $g$ is a continuous mapping $h : [0,1] \times X \to Y$ such that $h(0, x) = f(x)$ and $h(1, x) = g(x)$ for all $x \in X$. If such a mapping exists, we say $f$ and $g$ are homotopy-equivalent. The reader should verify that this is an equivalence relation on the collection of all continuous mappings from $X$ into $Y$. A category can be formed by taking topological spaces for objects and homotopy equivalence classes for morphisms. This category is typical of the ones used in algebraic topology. See also 9.33.

9.14. More definitions. In any category, concrete or not, an isomorphism between two objects $A$ and $B$ is a morphism $f : A \to B$ for which there exists another morphism $g : B \to A$ such that $g \circ f = i_A$ and $f \circ g = i_B$. This makes precise the definition given in 9.2. In a concrete category, this definition can also be restated as follows: An isomorphism between two objects $A = (X, \mathcal{S})$ and $B = (Y, \mathcal{T})$ is a bijection $f : X \to Y$ such that both $f : A \to B$ and $f^{-1} : B \to A$ are morphisms.

In any category, an automorphism of an object $A$ is an isomorphism from $A$ onto $A$. Clearly the identity map of $A$ is an automorphism, but there may be others as well. For instance, the translation mapping $\varphi : x \mapsto x + 3$, from $\mathbb{R}$ into $\mathbb{R}$, is an automorphism of
metric spaces (but it is not an automorphism, or even a morphism, of additive groups, since it does not preserve the identity).

It is easy to see that the automorphisms of $A$ form a (not necessarily Abelian) group, with composition of morphisms for the group's binary operation. The inverse of any automorphism is another automorphism. The automorphism group of $A$ is often denoted by $\mathrm{Aut}(A)$. This notation does not indicate what category is being used. We could indicate it with subscripts. For instance, let $\varphi$ be the translation map mentioned in the previous paragraph; then $\varphi \in \mathrm{Aut}_{\text{metric spaces}}(\mathbb{R})$ but $\varphi \notin \mathrm{Aut}_{\text{additive groups}}(\mathbb{R})$. However, usually the choice of category is clear from the context, and so the subscripts are not needed.

INITIAL STRUCTURES AND OTHER CATEGORICAL CONSTRUCTIONS

9.15. Definition. Let $X$ be a set, let $\{(Y_\lambda, \mathcal{T}_\lambda) : \lambda \in \Lambda\}$ be a collection of objects in some category, and for each $\lambda$ let $\varphi_\lambda : X \to Y_\lambda$ be some given mapping. Then an initial structure determined by the $\varphi_\lambda$'s and $\mathcal{T}_\lambda$'s is a structure $\mathcal{S}$ that makes $(X, \mathcal{S})$ into an object with this property:

Let $(W, \mathcal{R})$ be any object in the category, and let $f : W \to X$ be any function. Then $f$ is a morphism from $(W, \mathcal{R})$ into $(X, \mathcal{S})$ if and only if each of the compositions $\varphi_\lambda \circ f$ is a morphism from $(W, \mathcal{R})$ into $(Y_\lambda, \mathcal{T}_\lambda)$;

or, equivalently, a structure $\mathcal{S}$ that makes $(X, \mathcal{S})$ into an object with these two properties:

(i) Each of the mappings $\varphi_\lambda : X \to Y_\lambda$ is a morphism.

(ii) Let $(W, \mathcal{R})$ be any object in the category, and let $f : W \to X$ be any function. Suppose that each of the compositions $\varphi_\lambda \circ f$ is a morphism from $(W, \mathcal{R})$ into $(Y_\lambda, \mathcal{T}_\lambda)$. Then $f$ is a morphism from $(W, \mathcal{R})$ into $(X, \mathcal{S})$.

Exercises.

a. Prove the equivalence stated above.

b. If $\mathcal{S}$ is an initial structure on $X$ determined by the $\varphi_\lambda$'s and $\mathcal{T}_\lambda$'s, then $\mathcal{S}$ is weaker than any other structure on $X$ that makes the $\varphi_\lambda$'s into morphisms. It is the weakest structure that makes the $\varphi_\lambda$'s into morphisms. (For this reason it is sometimes called the weak structure.)

c. If $\mathcal{S}_1$ and $\mathcal{S}_2$ are two initial structures on $X$ determined by the $\varphi_\lambda$'s and $\mathcal{T}_\lambda$'s, then each of $\mathcal{S}_1$, $\mathcal{S}_2$ is weaker than the other. Hence, in many categories of interest to us (particularly, in our inverse image categories and algebraic categories), there is at most one initial structure determined by the $\varphi_\lambda$'s and $\mathcal{T}_\lambda$'s.

Further remarks. Let any $X$ and $(Y_\lambda, \mathcal{T}_\lambda)$'s and $\varphi_\lambda$'s be given. Then the initial structure does not always exist. For instance, in our algebraic categories, this will be evident from our discussion
in 9.20.c. However, the initial structure does always exist in our inverse image categories; we shall prove that in 9.16.

The definition of initial structure given above is admittedly rather complicated. Simpler, equivalent definitions are available in some categories, e.g., for topological spaces (see 15.24) and for uniform spaces (see Chapter 18).

9.16. Proposition. Initial structures exist in the categories of measurable spaces, topological spaces, and uniform spaces.

More precisely: Let $\{(Y_\lambda, \mathcal{T}_\lambda) : \lambda \in \Lambda\}$ be a collection of objects in one of those categories. Let $X$ be a set, and let some mappings $\varphi_\lambda : X \to Y_\lambda$ be given. Then there exists an initial structure $\mathcal{S}$ on $X$, determined by those spaces and mappings. In fact, $\mathcal{S}$ is the structure generated by the collection of sets

$$\mathcal{G} \;=\; \bigcup_{\lambda \in \Lambda} \widehat{\varphi}_\lambda^{-1}(\mathcal{T}_\lambda) \;=\; \{\widehat{\varphi}_\lambda^{-1}(T) : \lambda \in \Lambda \text{ and } T \in \mathcal{T}_\lambda\},$$

where mappings $\widehat{\varphi}_\lambda : \widehat{X} \to \widehat{Y}_\lambda$ are defined as in 9.8.

Proof. In the category of measurable spaces or topological spaces, $\mathcal{G}$ can be used to generate a structure, since any collection of sets can be used to generate a structure. In the category of uniform spaces, each $\widehat{\varphi}_\lambda^{-1}(\mathcal{T}_\lambda)$ is a preuniformity (see 5.40), hence $\mathcal{G}$ is a preuniformity, hence $\mathcal{G}$ can be used to generate a structure. Let $\mathcal{S}$ be the structure on $X$ generated by $\mathcal{G}$. We shall show that $\mathcal{S}$ is an initial structure.

It is clear that each $\varphi_\lambda$ is a morphism from $(X, \mathcal{S})$ into $(Y_\lambda, \mathcal{T}_\lambda)$, since $\mathcal{S} \supseteq \mathcal{G} \supseteq \widehat{\varphi}_\lambda^{-1}(\mathcal{T}_\lambda)$. Now suppose that $(W, \mathcal{R})$ is some object and $g : W \to X$ is some mapping such that each composition $\varphi_\lambda \circ g$ is a morphism; we must show that $g$ itself is a morphism. Let $\mathcal{E} = \{S \subseteq \widehat{X} : \widehat{g}^{-1}(S) \in \mathcal{R}\}$; it suffices to show that $\mathcal{E} \supseteq \mathcal{S}$.

We first show that $\mathcal{E} \supseteq \mathcal{G}$. Fix any $\lambda \in \Lambda$ and fix any $T \in \mathcal{T}_\lambda$. Then $\varphi_\lambda \circ g : (W, \mathcal{R}) \to (Y_\lambda, \mathcal{T}_\lambda)$ is a morphism, hence $\widehat{g}^{-1}(\widehat{\varphi}_\lambda^{-1}(T)) = (\widehat{\varphi_\lambda \circ g})^{-1}(T)$ is a member of $\mathcal{R}$; hence $\widehat{\varphi}_\lambda^{-1}(T)$ is a member of $\mathcal{E}$. Thus $\mathcal{E} \supseteq \mathcal{G}$.

In the category of measurable spaces or topological spaces, $\mathcal{E}$ is a structure on $X$, by 5.40. Since it is a structure containing $\mathcal{G}$, it also contains the structure generated by $\mathcal{G}$; thus $\mathcal{E} \supseteq \mathcal{S}$. For the category of uniform spaces, we know $\mathcal{R}$ is a uniformity on $W$, hence a filter on $W \times W$; hence $\mathcal{E}$ is a filter on $X \times X$, by 5.40. Also $\mathcal{E}$ contains $\mathcal{G}$, which is a preuniformity. Hence $\mathcal{E}$ contains the smallest filter that contains $\mathcal{G}$; that is, $\mathcal{E}$ contains $\mathcal{S}$. ∎

9.17. An important special case. Let $\{\mathcal{S}_\lambda : \lambda \in \Lambda\}$ be a collection of structures on a set $X$, in the category of measurable spaces, topological spaces, or uniform spaces. Then the supremum of the $\mathcal{S}_\lambda$'s is the smallest structure that contains $\bigcup_{\lambda \in \Lambda} \mathcal{S}_\lambda$; it is equal to the initial structure determined by the identity mappings $i_\lambda : X \to (X, \mathcal{S}_\lambda)$.

9.18. Products. Let $\{(Y_\lambda, \mathcal{T}_\lambda) : \lambda \in \Lambda\}$ be a collection of objects in some category, and let $X = \prod_{\lambda \in \Lambda} Y_\lambda$ be the product of the underlying sets. By a product structure on $X$ we shall mean an initial structure (defined as in 9.15) determined by the coordinate projections $\pi_\lambda : X \to Y_\lambda$.
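For finite topological spaces, the construction in 9.16 and the product structure of 9.18 can be carried out by brute force. The sketch below is only an illustration under stated assumptions: the Sierpinski space on $\{0, 1\}$ is an arbitrary choice of factor, topologies are sets of frozensets, and the function names are hypothetical. It generates the initial topology from the preimages under the two coordinate projections and then checks the defining property of an initial structure against every map from a two-point test space.

```python
from itertools import product


def preimage(f, subset, domain):
    """Inverse image of `subset` under the map `f` (a dict defined on `domain`)."""
    return frozenset(x for x in domain if f[x] in subset)


def is_continuous(f, X, S, T):
    """f : (X, S) -> (Y, T) is continuous iff preimages of T-open sets are S-open."""
    return all(preimage(f, U, X) in S for U in T)


def generate_topology(X, subbase):
    """Smallest topology on the finite set X that contains every set in `subbase`."""
    X = frozenset(X)
    base = {X}  # the empty intersection is X itself
    for G in subbase:  # close under finite intersections
        base |= {B & frozenset(G) for B in base}
    topology = {frozenset()}  # the empty union is the empty set
    for B in base:  # close under unions
        topology |= {U | B for U in topology}
    return topology


def initial_topology(X, maps_and_topologies):
    """Topology generated by all preimages phi^{-1}(T), as in the formula of 9.16."""
    subbase = {
        preimage(phi, T, X) for phi, top in maps_and_topologies for T in top
    }
    return generate_topology(X, subbase)


Y = frozenset({0, 1})
sierpinski = {frozenset(), frozenset({0}), Y}
X = frozenset(product(Y, Y))  # underlying set of the product
pi1 = {p: p[0] for p in X}  # coordinate projections
pi2 = {p: p[1] for p in X}
prod_top = initial_topology(X, [(pi1, sierpinski), (pi2, sierpinski)])

# Defining property: g : W -> X is continuous iff both pi1 o g and pi2 o g are.
W = Y
universal_ok = True
for values in product(sorted(X), repeat=len(W)):
    g = dict(zip(sorted(W), values))
    lhs = is_continuous(g, W, sierpinski, prod_top)
    rhs = all(
        is_continuous({w: pi[g[w]] for w in W}, W, sierpinski, sierpinski)
        for pi in (pi1, pi2)
    )
    universal_ok = universal_ok and (lhs == rhs)
```

The exhaustive check is feasible only because the spaces are tiny; it is meant to make the definition concrete, not to replace the proof.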
Products always exist in our inverse image categories; this is just a special case of 9.16. As we noted in 9.10, preordered sets (with increasing maps for morphisms) may be viewed as a full subcategory of topological spaces; hence products also exist in the category of preordered sets. Exercise (optional): Show that the product of preordered sets, defined in this fashion, is the same as the product defined in Chapter 3.

Although initial structures do not always exist in our algebraic categories (as we shall see in 9.20), nevertheless product structures always exist. Say $X = \prod_{\lambda \in \Lambda} Y_\lambda$ is a product of algebraic systems of type $(\tau, \mathcal{J})$. Fundamental operations are defined coordinatewise. More precisely, we shall now describe the action of the $n$-ary operation $\Phi_j$ of $X$ in terms of the $n$-ary operations $\varphi_{\alpha j}, \varphi_{\beta j}, \varphi_{\gamma j}, \ldots$ of the factor spaces, if $\Lambda = \{\alpha, \beta, \gamma, \ldots\}$. Say $X = Y_\alpha \times Y_\beta \times Y_\gamma \times \cdots$, and suppose $\tau(j) = n$. The function $\Phi_j$ acts on $n$-tuples $(x_1, x_2, \ldots, x_n) \in X^n$. Say $x_i = (y_{i\alpha}, y_{i\beta}, y_{i\gamma}, \ldots)$ where $y_{i\alpha} \in Y_\alpha$, $y_{i\beta} \in Y_\beta$, etc. Then

$$\Phi_j(x_1, x_2, \ldots, x_n) \;=\; \bigl(\varphi_{\alpha j}(y_{1\alpha}, y_{2\alpha}, \ldots, y_{n\alpha}),\; \varphi_{\beta j}(y_{1\beta}, y_{2\beta}, \ldots, y_{n\beta}),\; \varphi_{\gamma j}(y_{1\gamma}, y_{2\gamma}, \ldots, y_{n\gamma}),\; \ldots\bigr).$$

We leave it to the ambitious reader to unwind all the notation and verify that this formula does indeed satisfy the definition given in 9.15, and verify that $X$ is an algebraic system of type $(\tau, \mathcal{J})$. For instance, if $U$, $V$, $W$ are lattices, then lattice operations are defined on $U \times V \times W$ by

$$(u_1, v_1, w_1) \wedge (u_2, v_2, w_2) = (u_1 \wedge u_2,\; v_1 \wedge v_2,\; w_1 \wedge w_2), \qquad (u_1, v_1, w_1) \vee (u_2, v_2, w_2) = (u_1 \vee u_2,\; v_1 \vee v_2,\; w_1 \vee w_2).$$

This construction generalizes readily to products of any number of factors in any equational variety. It is sometimes called the direct product.

Some of our "hybrid" categories also have product objects. For instance, if $((Y_\lambda, \mathcal{T}_\lambda) : \lambda \in \Lambda)$ is a collection of topological vector spaces, then the product topological structure and the product vector space structure on the product set $X = \prod_{\lambda \in \Lambda} Y_\lambda$ are compatible with each other and thus yield a product topological vector space; see Chapter 26. This does not work in some other categories. For instance, a product of finitely many normed spaces is a normed space, but a product of infinitely many normed spaces is a topological vector space that cannot be equipped with a norm; see Chapter 27.

9.19. Exercise. A product of morphisms is a morphism. More precisely: For each $\lambda$, suppose that $f_\lambda : X_\lambda \to Y_\lambda$ is a morphism. Define a mapping $f$ from $P = \prod_{\lambda \in \Lambda} X_\lambda$ into $Q = \prod_{\lambda \in \Lambda} Y_\lambda$ by taking $f\bigl((x_\lambda : \lambda \in \Lambda)\bigr) = (f_\lambda(x_\lambda) : \lambda \in \Lambda)$. Assume that $P$ and $Q$ are equipped with product structures. Then $f$ is a morphism.

9.20. Discussion of codomains. Let $X$ be a subset of a set $Y$; let $i : X \to Y$ be the inclusion map. Then any function $f : W \to X$ is the "same" as the composition
$i \circ f : W \to Y$, insofar as it has the same domain and the same values $f(w) = i(f(w))$; it differs only in our designation of the codomain. Of course, if we attach structures $\mathcal{R}$, $\mathcal{S}$, $\mathcal{T}$ to the sets $W$, $X$, $Y$, making them into different objects in a category, then the difference between $f$ and $i \circ f$ may be more substantial: perhaps one is a morphism while the other is not. That depends on our choices of the additional structures $\mathcal{R}$, $\mathcal{S}$, $\mathcal{T}$.

Definition. Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be objects in some category, with $X \subseteq Y$. We say that $(X, \mathcal{S})$ is a subobject of $(Y, \mathcal{T})$ if these two objects are related by this condition:

Let $(W, \mathcal{R})$ be any object in the category, and let $f : W \to X$ be any function. Then $f : (W, \mathcal{R}) \to (X, \mathcal{S})$ is a morphism if and only if the composition $i \circ f : (W, \mathcal{R}) \to (Y, \mathcal{T})$ is a morphism.

In other words, $\mathcal{S}$ is the initial structure on $X$ determined by the inclusion map $i : X \to Y$ and the structure $\mathcal{T}$. It is sometimes called the trace of $\mathcal{T}$ on $X$.

Further remarks. In our inverse image categories, if $(Y, \mathcal{T})$ is any object, then every subset $X \subseteq Y$ has a unique subobject structure $\mathcal{S}$; that is just a special case of 9.16. In fact, it follows easily from 9.16 that the subobject structure is $\mathcal{S} = \{\widehat{X} \cap T : T \in \mathcal{T}\}$. In the category of topological spaces, $\mathcal{S}$ is the same as the relative topology, which we introduced in 5.15.e.

Subobjects in algebraic categories are studied in more detail below. Except in degenerate cases, not every subset of $X$ is closed under the fundamental operations. Thus, in an algebraic category, initial structures do not always exist.

9.21. Basic properties of subalgebras. We consider the category consisting of the algebraic systems of some type $(\tau, \mathcal{J})$, with homomorphisms of type $\tau$. (For examples, think of the category of lattices, the category of rings, or the category of lattice groups.) In this category, a subobject is called a subalgebra. Let $X$ be an object. Show that

a. $Y$ is a subalgebra of $X$ if and only if $Y$ is a subset of $X$ that is closed under the fundamental operations of $X$. It then follows that $Y$ itself is also an algebraic system of type $(\tau, \mathcal{J})$, whose fundamental operations are the restrictions to $Y$ of the fundamental operations of $X$.

b. $X$ is a subalgebra of itself.

c. The intersection of any collection of subalgebras of $X$ is a subalgebra of $X$. Thus, the subobjects of an algebraic system form a Moore collection. (We mentioned this for sublattices in Chapter 4.)

d. The intersection of all the subalgebras containing some given set $T \subseteq X$ is the smallest subalgebra containing $T$; it is called the subalgebra generated by $T$. It is the closure of $T$ under the fundamental operations. In fact, all of those operations are finitary. Hence the closure operator is an algebraic closure operator, as defined in Chapter 4. Thus, the operator $S \mapsto \mathrm{cl}(S)$ is an algebraic closure, if $\mathrm{cl}(S)$ is the submonoid, subgroup, subring, etc., generated by a set $S \subseteq X$, where $X$ is a given monoid, group, ring, etc.
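The description of the generated subalgebra as "the closure of $T$ under the fundamental operations" suggests a direct computation when the carrier set is finite. The following is a minimal sketch under that finiteness assumption; the function name `generated_subalgebra`, the encoding of operations as (arity, function) pairs, and the example monoid $(\mathbb{Z}_{12}, +, 0)$ are illustrative choices, not taken from the text.

```python
from itertools import product


def generated_subalgebra(generators, operations):
    """Closure of `generators` under `operations`, a list of (arity, function) pairs.

    A nullary operation (arity 0) is called with no arguments and contributes
    its constant. Termination is only guaranteed when the closure is finite.
    """
    closed = set(generators)
    while True:
        new = {
            op(*args)
            for arity, op in operations
            for args in product(closed, repeat=arity)
        }
        if new <= closed:  # nothing new: `closed` is stable under every operation
            return closed
        closed |= new


# The monoid (Z_12, +, 0): one binary operation and one nullary operation.
monoid_ops = [(2, lambda a, b: (a + b) % 12), (0, lambda: 0)]

sub_8 = generated_subalgebra({8}, monoid_ops)
sub_6 = generated_subalgebra({6}, monoid_ops)

# The intersection of two subalgebras is again closed under the operations.
meet = sub_8 & sub_6
meet_is_closed = generated_subalgebra(meet, monoid_ops) == meet
```

Because each operation takes only finitely many arguments, every element of the closure is reached after finitely many rounds of the loop, which is the computational counterpart of the closure operator being algebraic.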
e. The empty set is a subalgebra of $X$ if and only if none of the fundamental operations of $X$ is nullary. Most of the algebraic systems considered in this book have a nullary operation, i.e., a special element, usually denoted 0 or 1, and so the empty set is not a subalgebra in most algebraic systems considered in this book. The empty set is a subalgebra when we consider lattices, but that category is atypical among the algebraic categories studied in this book.

f. The product of subalgebras is a subalgebra. That is: Let $X = \prod_{\lambda \in \Lambda} Y_\lambda$ be the direct product of some algebraic systems $Y_\lambda$ of some variety, and let $W_\lambda$ be a subalgebra of $Y_\lambda$ for each $\lambda$. Then the set $\prod_{\lambda \in \Lambda} W_\lambda$ is a subalgebra of $X$. Hint: It is closed under the fundamental operations, since those act separately on each coordinate.

g. Let $f : X \to Y$ be a function from one algebraic system to another of the same arity. Then $f$ is a homomorphism if and only if $\mathrm{Graph}(f)$ is a subalgebra of $X \times Y$.

h. Suppose $f : X \to Y$ is a homomorphism of arity $\tau$. Then:
(i) The set $f(X) = \mathrm{Range}(f)$ is a subalgebra of $Y$.
(ii) More generally, if $S \subseteq X$ is a subalgebra of $X$, then $f(S)$ is a subalgebra of $Y$.
(iii) $f$ is uniquely determined by its values on any subset $S \subseteq X$ that generates $X$.
(iv) If $T$ is a subalgebra of $Y$, then $f^{-1}(T)$ is a subalgebra of $X$.
(v) If $X$ satisfies identities $\mathcal{I}$, then so does the subalgebra $f(X) \subseteq Y$ (regardless of whether $Y$ satisfies those identities).

VARIETIES WITH IDEALS

9.22. Remarks. Most equational varieties of interest to us have an addition operation (+), which plays a special role among the various fundamental operations. It is the basis for a theory of ideals and quotients developed below. Our presentation is based on Kurosh [1965]. Thus, what we call an object in an "ideal-supporting variety" is what Kurosh calls an "$\Omega$-group." However, we assume that "addition" is commutative. Kurosh and other algebraists do not make that assumption; consequently they have a slightly more general and more complicated theory of ideals and quotients.

9.23. Definitions. Let $(\tau, \mathcal{I})$ be a variety. We shall say that $(\tau, \mathcal{I})$ is an ideal-supporting variety if the following two further conditions are satisfied:

(i) Included among the fundamental operations of the category are operations of addition ($+$), minus ($-$), and zero ($0$) (respectively binary, unary, and nullary), satisfying identities that make $(X, +, -, 0)$ into an additive group.
(There may or may not also be other fundamental operations and other identities.)

(ii) Whenever $\varphi$ is one of the fundamental operations of the algebraic system and $\varphi$ is not nullary, then $\varphi(0, 0, \ldots, 0) = 0$. (This is an identity. It may be included as a member of $\mathcal{I}$. However, in many cases of interest it does not have to be assumed explicitly, because it follows as a consequence from other identities in $\mathcal{I}$.)

Some examples: the varieties of additive groups, rings, commutative rings, lattice groups, vector lattices, and $F$-linear spaces (for any field $F$) are all ideal-supporting. The varieties of monoids and lattices are not ideal-supporting.

9.24. Elementary observations. Suppose $(\tau, \mathcal{I})$ is an ideal-supporting variety, and $X$ is an object in this variety. Then:

a. $\{0\}$ is an object in this category, i.e., an algebraic system of type $(\tau, \mathcal{I})$.

b. The mapping $x \mapsto 0$, from $X$ into $\{0\}$, is a $\tau$-homomorphism. (However, although $\{0\}$ is a subgroup of $X$, we do not assert that $\{0\}$ is a subalgebra.)

c. If 0 is the only nullary operation, then the constant mapping $0 : X \to Y$ defined by $x \mapsto 0$ is a $\tau$-homomorphism from any object $X$ into any object $Y$.

d. If $f : X \to Y$ is a $\tau$-homomorphism, then $f$ is also a group homomorphism. Hence we may define the subgroup $\mathrm{Ker}(f) = f^{-1}(0)$ and the quotient group $X/\mathrm{Ker}(f)$ as in 8.14 and 8.17. (However, we do not assert that these additive groups are objects of the category $(\tau, \mathcal{I})$.) When clarification is needed, we may distinguish between group homomorphisms (which preserve $0, +, -$ but not necessarily any other fundamental operations) and $\tau$-homomorphisms (which preserve all the fundamental operations).

9.25. Definition and proposition. Let $(\tau, \mathcal{I})$ be an ideal-supporting variety. Let $X$ be an algebraic system of the variety $(\tau, \mathcal{I})$. Let $S \subseteq X$ be an additive subgroup of $X$. Then the following conditions are equivalent. If any (hence all) of them is satisfied, we say $S$ is an ideal in $X$.

(A) $S$ is the kernel of some $\tau$-homomorphism $f : X \to Y$, for some object $Y$ in the category.

(B) For each integer $n \ge 0$, for each fundamental operation $\varphi$ that is $n$-ary, and for each $x_1, x_2, \ldots, x_n \in X$, the set $S$ is closed under the $n$-ary operation $\psi^{\varphi}_{x_1, x_2, \ldots, x_n} : X^n \to X$ defined by

$$\psi^{\varphi}_{x_1, \ldots, x_n}(s_1, s_2, \ldots, s_n) = \varphi(x_1 + s_1, x_2 + s_2, \ldots, x_n + s_n) - \varphi(x_1, x_2, \ldots, x_n).$$

(This condition is trivially satisfied for $n = 0$, since then $\varphi$ is a constant and $\varphi - \varphi = 0$ is a member of the subgroup $S$.)
(C) The quotient group $X/S$ can be made into an object of the variety $(\tau, \mathcal{I})$ (called the quotient object), and the quotient map $\pi : X \to X/S$ can be made into a $\tau$-homomorphism, with the fundamental operations $\overline{\varphi}$ on $X/S$ defined in terms of the given fundamental operations $\varphi$ on $X$, as follows:

$$\overline{\varphi}(\pi(x_1), \pi(x_2), \ldots, \pi(x_n)) = \pi(\varphi(x_1, x_2, \ldots, x_n)).$$

(The quotient object has different names in different categories: quotient group, quotient ring, quotient algebra, quotient vector space, etc.)

Remarks. In these conditions we do not assert that $S$ is necessarily an object in the category. We might say $S$ is an ideal in the algebra $X$, to distinguish this from the "ideal of sets" introduced in 5.2. The two notions of "ideal" coincide in the context of the Boolean algebra $\mathcal{P}(\Omega)$; see 13.17.d.

The last equation in (C) is admittedly complicated. It may be easier to understand if we mention a typical example: In the category of rings, addition and multiplication in $X/S$ are operations $\oplus$ and $\odot$ defined by

$$\pi(x_1) \oplus \pi(x_2) = \pi(x_1 + x_2), \qquad \pi(x_1) \odot \pi(x_2) = \pi(x_1 \cdot x_2).$$

For both of these equations, some verification is needed: One must show that $\pi(x_1 + x_2)$ and $\pi(x_1 \cdot x_2)$ do not depend on the particular choice of representatives $x_1, x_2$ from equivalence classes; i.e., one must show that

$$\pi(x_1) = \pi(x_1'),\ \pi(x_2) = \pi(x_2') \implies \pi(x_1 + x_2) = \pi(x_1' + x_2'),\ \pi(x_1 \cdot x_2) = \pi(x_1' \cdot x_2').$$

Verifications of this sort follow from the proof of (B) $\Rightarrow$ (C), below.

Proof of (C) $\Rightarrow$ (A). If the quotient map $\pi : X \to X/S$ is a homomorphism, then $S$ is its kernel.

Proof of (B) $\Rightarrow$ (C). We must verify that the functions $\overline{\varphi}$ are well-defined by the formula in (C); i.e., we must show that

$$\pi(x_i) = \pi(x_i') \text{ for all } i \implies \pi(\varphi(x_1, \ldots, x_n)) = \pi(\varphi(x_1', \ldots, x_n')).$$

But that is just a restatement of (B). Obviously $\overline{\varphi}$ is $n$-ary if $\varphi$ is $n$-ary, and thus $X/S$ is an algebraic system of arity $\tau$. By our definition of the $\overline{\varphi}$'s, it follows that $\pi$ is a homomorphism of algebraic systems. By 9.21.h(v), it now follows that $X/S$ satisfies all the identities $\mathcal{I}$.

Proof of (A) $\Rightarrow$ (B). Assume $S = \mathrm{Ker}(f)$. Under the hypotheses of (B) we have $(x_i + s_i) - x_i \in S$, and therefore $f(x_i + s_i) = f(x_i)$. Now

$$0 = \varphi(f(x_1 + s_1), \ldots, f(x_n + s_n)) - \varphi(f(x_1), \ldots, f(x_n)) = f(\varphi(x_1 + s_1, \ldots, x_n + s_n)) - f(\varphi(x_1, \ldots, x_n)) = f\bigl(\varphi(x_1 + s_1, \ldots, x_n + s_n) - \varphi(x_1, \ldots, x_n)\bigr),$$
and therefore $\varphi(x_1 + s_1, \ldots, x_n + s_n) - \varphi(x_1, \ldots, x_n)$ is in $\mathrm{Ker}(f) = S$.

9.26. Further examples.

a. If $X$ is an object in an ideal-supporting category, then $\{0\}$ and $X$ are ideals in $X$. Any ideal in $X$ other than $X$ itself is called a proper ideal. A maximal ideal is a proper ideal that is not contained in any other proper ideal.

b. Ideals, subalgebras, and subgroups are all the same thing in the category of additive groups.

c. In the category of rings or in the category of rings with unit, an ideal in a ring $X$ is an additive subgroup $S \subseteq X$ that satisfies

$$s \in S,\ x \in X \implies sx,\ xs \in S.$$

(This may not be obvious from 9.25(A) or 9.25(C), but it is easy to see from 9.25(B), for the category of rings.)

d. The ring $\mathbb{Z}_m$ (introduced in 8.20) can also be described as the quotient of the ring $\mathbb{Z}$ by the ideal $m\mathbb{Z} = \{mz : z \in \mathbb{Z}\}$.

e. Let $X$ be a ring with unit. Then the same set $X$, with the same addition and 0 and multiplication, may also be viewed as a ring "without unit," i.e., an object in the category of rings, by forgetting that its member "1" has some special property. Observe that $X$ has the same ideals in either category. However, note that a ring $X$ is the only ideal in $X$ that contains 1; hence it is the only ideal that is also a subring (i.e., subalgebra in the category of rings with unit).

f. Let $\mathbb{R}[x]$ be the ring of all polynomials in one variable $x$, with coefficients in $\mathbb{R}$. Then:
(i) $\mathbb{R}[x]$ is an ideal in itself. Let $\mathbb{R}^{\mathbb{R}} = \{\text{functions from } \mathbb{R} \text{ into } \mathbb{R}\}$, with multiplication given pointwise, that is, $(fg)(x) = f(x)\,g(x)$; this is also a ring with unit. The inclusion map $i : \mathbb{R}[x] \to \mathbb{R}^{\mathbb{R}}$ is a homomorphism, and $\mathbb{R}[x]$ is an ideal in $\mathbb{R}[x]$ but not in $\mathbb{R}^{\mathbb{R}}$. Thus, the homomorphic image of an ideal is not necessarily an ideal. Contrast this with the result for subalgebras noted in 9.21.
(ii) The set of all polynomials of degree $\le 1$ is not an ideal in $\mathbb{R}[x]$. However, it is an additive subgroup, and thus it is an ideal when we consider the category of additive groups. Thus, whether a subset $S$ is an ideal in an algebraic system $X$ may depend on what category we use in considering $S$ and $X$.

g. Let $\mathfrak{A} = (\tau, \mathcal{I})$ and $\mathfrak{B} = (\tau, \mathcal{J})$ be two ideal-supporting varieties, with the same arity function $\tau$, i.e., with the same fundamental operations, but possibly with different sets of equational axioms. Suppose $X$ is an algebraic system that satisfies both sets of equational axioms, i.e., $X$ is an object in both categories. Then the $\mathfrak{A}$-ideals in $X$ are the same as the $\mathfrak{B}$-ideals in $X$. This may not be entirely obvious from definitions 9.25(A) or 9.25(C), but it is immediately evident from 9.25(B), since that condition only involves the fundamental operations of $X$, not the other objects of the category.
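The ring criterion in 9.26.c and the general criterion 9.25(B) can be compared by brute force on a small ring. The following is a minimal sketch, assuming the toy ring $\mathbb{Z}_2 \times \mathbb{Z}_2$ with componentwise operations; the ring and all function names are illustrative choices, not taken from the text. It checks that the "diagonal" is an additive subgroup but not an ideal, while the "first axis" is an ideal, and that both criteria agree on these two sets.

```python
from itertools import product

# Hypothetical toy ring X = Z_2 x Z_2 with componentwise operations.
N = 2
X = list(product(range(N), repeat=2))

def add(a, b):
    return ((a[0] + b[0]) % N, (a[1] + b[1]) % N)

def neg(a):
    return ((-a[0]) % N, (-a[1]) % N)

def mul(a, b):
    return ((a[0] * b[0]) % N, (a[1] * b[1]) % N)

def is_additive_subgroup(S):
    # Contains 0 and is closed under addition and negation.
    return (0, 0) in S and all(add(s, t) in S and neg(s) in S for s in S for t in S)

def is_ideal_absorbing(S):
    # Criterion of 9.26.c: s in S, x in X  =>  sx, xs in S.
    return all(mul(s, x) in S and mul(x, s) in S for s in S for x in X)

def is_ideal_psi(S):
    # Criterion 9.25(B) for the binary operation "multiplication":
    # (x1 + s1)(x2 + s2) - x1 x2 must lie in S.
    return all(
        add(mul(add(x1, s1), add(x2, s2)), neg(mul(x1, x2))) in S
        for x1 in X for x2 in X for s1 in S for s2 in S
    )

diagonal = {(0, 0), (1, 1)}      # additive subgroup (even a subring), not an ideal
first_axis = {(0, 0), (1, 0)}    # additive subgroup and an ideal
```

The diagonal contains the unit $(1, 1)$, so this is also a small instance of the remark in 9.26.e that the only ideal containing 1 is the whole ring.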
Example. A Boolean ring is a ring $X$ with unit, which satisfies $x^2 = x$ for all $x \in X$. Show that we obtain the same ideals in a Boolean ring, whether we view it in the category of rings or the category of rings with unit or the category of Boolean rings. (Boolean rings will be studied further in Chapter 13 and thereafter.)

9.27. Proposition on ideals in lattice groups. Let $X$ be a lattice group, and let $S \subseteq X$ be an additive subgroup. Then the following conditions are equivalent.

(A) $S$ is an ideal in $X$, as defined in 9.25.
(B) $S$ is solid, that is, $|x| \preccurlyeq |s|,\ s \in S \implies x \in S$.
(C) Whenever $s, s' \in S$ and $x, x' \in X$, then $[(x + s) \vee (x' + s')] - (x \vee x') \in S$.
(D) Whenever $t \in S$ and $u \in X$, then $(u \vee t) - (u \vee 0) \in S$.
(E) Whenever $s \in S$ and $u \in X$, then $(u + s)^+ - u^+ \in S$.
(F) $s \in S \iff |s| \in S$; and moreover, whenever $s \in S$ and $x \in X$ satisfy $0 \preccurlyeq x \preccurlyeq s$, then $x \in S$.

Taking $x = x' = 0$ in condition (C), we note this corollary: In the category of lattice groups, any ideal is also a sublattice.

Proof of equivalence. We begin by considering what 9.25(B) looks like in the category of lattice groups. Any additive subgroup is closed under the operation $\psi$ determined by the mappings $\varphi = 0$ or $\varphi(x) = -x$ or $\varphi(x, x') = x + x'$. The two remaining fundamental operations in a lattice group are $\vee$ and $\wedge$. Taking these binary operations for $\varphi$ yields the functions

$$\psi_1(s, s') = [(x + s) \vee (x' + s')] - (x \vee x'), \qquad \psi_2(s, s') = [(x + s) \wedge (x' + s')] - (x \wedge x').$$

Thus, an additive subgroup $S$ is an ideal if and only if it is closed under these two binary operations for every choice of $x, x' \in X$. But after some changes of sign, one of these functions is dual to the other. This proves (A) $\iff$ (C).

Proofs of (C) $\iff$ (D) $\iff$ (E) follow from translation-invariance of the lattice operations. For (D) $\Rightarrow$ (F), use $u = s$ to show $s \in S \iff |s| \in S$. For (F) $\Rightarrow$ (E), use the properties of lattice groups established in Chapter 8. Proofs of (F) $\iff$ (B) follow from elementary considerations about the absolute value function, plus the fact that $S$ is an additive group.

9.28. Further properties of ideals. Let $X$ be an object in an ideal-supporting variety. Then:

a. (Isomorphism Theorem.) If $f : X \to Y$ is a homomorphism in that category, then $X/\mathrm{Ker}(f)$ is isomorphic to $\mathrm{Ran}(f)$ by the mapping $F(\pi(x)) = f(x)$, thus generalizing the corresponding result for groups in Chapter 8.
b. If $f : X \to Y$ is a homomorphism and $T \subseteq Y$ is an ideal, then $f^{-1}(T)$ is an ideal in $X$.

c. The intersection of a subalgebra and an ideal is an ideal. That is: Let $A$ be an algebraic system in some ideal-supporting variety, let $S$ be a subalgebra of $A$, and let $I$ be an ideal in $A$. We shall show $S \cap I$ is an ideal in $S$. Proof. Let $h : A \to B$ be a homomorphism with kernel equal to $I$. Let $j : S \to A$ be the inclusion homomorphism. Then the composition $h \circ j : S \to A \to B$ has kernel equal to $S \cap I$.

d. The ideals are the sets closed under the finitary operations $\psi^{\varphi}_{x_1, \ldots, x_n}$ defined in 9.25(B). Hence our earlier results about Moore closures in 4.6 and our earlier results about algebraic closures in 4.8 are applicable. Using those results or by a direct argument, show that:
(i) Any intersection of ideals in $X$ is an ideal.
(ii) For any set $B \subseteq X$, there is a smallest ideal in $X$ that contains $B$. It is the intersection of the ideals that contain $B$. It is called the ideal generated by $B$.

e. If $S_\lambda$ ($\lambda \in \Lambda$) are ideals in $X$, then the sum $\sum_{\lambda \in \Lambda} S_\lambda$ (defined in 8.11) is also an ideal; in fact, it is the ideal generated by $\bigcup_{\lambda \in \Lambda} S_\lambda$.

f. A product of ideals is an ideal. In other words, if $E_\lambda$ is an ideal in $X_\lambda$ for each $\lambda$, then $\prod_{\lambda \in \Lambda} E_\lambda$ is an ideal in $\prod_{\lambda \in \Lambda} X_\lambda$. Hint: $E_\lambda$ is the kernel of some homomorphism $f_\lambda : X_\lambda \to Y_\lambda$. Show that $\prod_{\lambda \in \Lambda} E_\lambda$ is the kernel of the homomorphism $f = \prod_{\lambda \in \Lambda} f_\lambda : \prod_{\lambda \in \Lambda} X_\lambda \to \prod_{\lambda \in \Lambda} Y_\lambda$, defined coordinatewise.

9.29. Let $X = \prod_{\lambda \in \Lambda} Y_\lambda$ be a product of algebraic systems in some ideal-supporting variety $(\tau, \mathcal{I})$; let $X$ be equipped with the product structure. Let $\mathcal{F}$ be a filter of sets on the set $\Lambda$. Then

$$\Gamma = \{g \in X : g^{-1}(0) \in \mathcal{F}\}$$

is an ideal in the algebra $X$.

Proof. It is easy to verify that $\Gamma$ is an additive subgroup of $X$. We shall show that $\Gamma$ also satisfies 9.25(B). Let $\varphi_\lambda : Y_\lambda^n \to Y_\lambda$ ($\lambda \in \Lambda$) and $\Phi : X^n \to X$ be corresponding $n$-ary fundamental operations. Let any functions $g_1, g_2, \ldots, g_n \in \Gamma$ and $f_1, f_2, \ldots, f_n \in X$ be given. We are to show that the function

$$h = \Phi(f_1 + g_1, \ldots, f_n + g_n) - \Phi(f_1, \ldots, f_n)$$

belongs to $\Gamma$. Unwind the notation: the function $h : \Lambda \to \bigcup_{\lambda \in \Lambda} Y_\lambda$ is defined by

$$h(\lambda) = \varphi_\lambda(f_1(\lambda) + g_1(\lambda), \ldots, f_n(\lambda) + g_n(\lambda)) - \varphi_\lambda(f_1(\lambda), \ldots, f_n(\lambda)).$$

Now, each set $g_i^{-1}(0)$ belongs to the filter $\mathcal{F}$, hence $\bigcap_{i=1}^{n} g_i^{-1}(0)$ belongs to $\mathcal{F}$. From this it follows easily that $h^{-1}(0) \supseteq \bigcap_{i=1}^{n} g_i^{-1}(0)$ belongs to $\mathcal{F}$. Thus $h \in \Gamma$.
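The construction in 9.29 can be checked exhaustively in a small finite case. The sketch below is only an illustration under stated assumptions: the index set $\{0, 1, 2\}$, the factors $Y_\lambda = \mathbb{Z}_3$ (viewed as rings), and the filter of all supersets of $\{0, 1\}$ are hypothetical choices, and every name in the code is invented for the example. It builds $\Gamma = \{g : g^{-1}(0) \in \mathcal{F}\}$ and tests condition 9.25(B) for addition and multiplication.

```python
from itertools import product

# Hypothetical finite setting: index set {0, 1, 2}, every factor is the ring Z_3.
INDEX = (0, 1, 2)
M = 3
X = list(product(range(M), repeat=len(INDEX)))   # the product ring

# Filter on INDEX: all supersets of CORE (a proper filter; illustrative choice).
CORE = frozenset({0, 1})

def in_filter(subset):
    return CORE <= frozenset(subset)

def zero_set(g):
    # g^{-1}(0): the coordinates where g vanishes.
    return {i for i in INDEX if g[i] == 0}

# Gamma = {g in X : g^{-1}(0) belongs to the filter}.
GAMMA = {g for g in X if in_filter(zero_set(g))}

def add(f, g):
    return tuple((a + b) % M for a, b in zip(f, g))

def neg(f):
    return tuple((-a) % M for a in f)

def mul(f, g):
    return tuple((a * b) % M for a, b in zip(f, g))

def psi_closed(op):
    # 9.25(B) for a binary operation: op(f1+g1, f2+g2) - op(f1, f2) lies in GAMMA.
    return all(
        add(op(add(f1, g1), add(f2, g2)), neg(op(f1, f2))) in GAMMA
        for f1 in X for f2 in X for g1 in GAMMA for g2 in GAMMA
    )
```

With these choices $\Gamma$ consists of the three tuples that vanish at coordinates 0 and 1, and `psi_closed(add)` and `psi_closed(mul)` both hold, in line with the proof above.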
9.30. Corollary. Let $X = \prod_{\lambda \in \Lambda} Y_\lambda$ be a product of algebraic systems in some ideal-supporting variety $(\tau, \mathcal{I})$, equipped with the product structure. Then the set

$$\bigoplus_{\lambda \in \Lambda} Y_\lambda = \Bigl\{ f \in \prod_{\lambda \in \Lambda} Y_\lambda : f(\lambda) \ne 0 \text{ for only finitely many } \lambda\text{'s} \Bigr\}$$

is an ideal in the algebra $X$; this is immediate from 9.29 using the cofinite filter. We shall call this ideal the external direct sum of the $Y_\lambda$'s. (In some categories it coincides with the coproduct.) Of course, when $\Lambda$ is finite, then the external direct sum is the same as the product.

Further properties. For each $\lambda \in \Lambda$, an injective homomorphism $j_\lambda : Y_\lambda \to X$ can be defined by

$$j_\lambda(v) = (0, 0, \ldots, 0, v, 0, 0, \ldots),$$

that is, put $v$ in the $\lambda$th component and zeros elsewhere. Then $j_\lambda(Y_\lambda)$ is a subgroup of $X$ that is isomorphic to $Y_\lambda$. Show that the external direct sum $\bigoplus_{\lambda \in \Lambda} Y_\lambda$ is equal to the internal direct sum (as defined in 8.12) of the subgroups $j_\lambda(Y_\lambda)$. Thus, the external direct sum of the $Y_\lambda$'s is the internal direct sum of a collection of groups that are isomorphic to the $Y_\lambda$'s. If we gloss over the distinction between isomorphism and equality, then the external direct sum of the $Y_\lambda$'s is "the same as" the internal direct sum of the $Y_\lambda$'s. Also note that $\pi_\lambda \circ j_\lambda$ is the identity map of $Y_\lambda$, and $\pi_\mu \circ j_\lambda : Y_\lambda \to Y_\mu$ is the zero map if $\mu \ne \lambda$.

Caution: In the wider literature, internal direct sums and external direct sums are often used interchangeably; both are referred to simply as direct sums.

FUNCTORS

9.31. Loosely speaking, a functor is a morphism in the category of categories. A little more precisely, a functor is a mapping from one category into another, sending objects to objects and morphisms to morphisms and preserving the "relevant structure." In this context the relevant structure involves such things as the compositions of morphisms. A covariant functor preserves compositions and arrow directions; a contravariant functor reverses compositions and arrow directions.

To be entirely precise, suppose that $p : X \to Y$ and $u = v \circ w$ are a typical morphism and a typical composition of morphisms in category $\mathfrak{A}$. Then a covariant functor $F : \mathfrak{A} \to \mathfrak{B}$ yields

$$F(p) : F(X) \to F(Y) \quad \text{and} \quad F(u) = F(v) \circ F(w)$$

in category $\mathfrak{B}$, whereas a contravariant functor $G : \mathfrak{A} \to \mathfrak{B}$ yields

$$G(p) : G(Y) \to G(X) \quad \text{and} \quad G(u) = G(w) \circ G(v)$$

in category $\mathfrak{B}$.
9.32. Some elementary examples of functors.

a. The covariant power set functor is a functor from the category of sets to itself. This functor sends each set $X$ to the set $\mathcal{P}(X)$ and sends each mapping $f : X \to Y$ to the forward image map $f : \mathcal{P}(X) \to \mathcal{P}(Y)$ defined in 2.7.

b. The contravariant power set functor is another functor from the category of sets to itself. This functor also sends each set $X$ to the set $\mathcal{P}(X)$, but it sends each mapping $f : X \to Y$ to the inverse image map $f^{-1} : \mathcal{P}(Y) \to \mathcal{P}(X)$ defined in 2.8.

c. The reduced power functor $S \mapsto {}^*S$ will be discussed starting in 9.37; note particularly 9.50. This functor is covariant; it is usually represented with an asterisk on the left. Do not confuse it with the contravariant exponential functor $S \mapsto S^*$, described in 9.55 below; this functor is usually represented with an asterisk on the right.

9.33. (Optional.) We now specialize slightly the notion developed in 9.13. Let $(X, x_0)$ be a pointed topological space (defined as in 9.9). Consider all paths in $X$ that begin and end at $x_0$, that is, all continuous functions $f : [0, 1] \to X$ that satisfy $f(0) = f(1) = x_0$. Call two such paths $f, g$ equivalent if there exists a homotopy from $f$ to $g$ that preserves the endpoints, i.e., if there exists a continuous function $h : [0, 1] \times [0, 1] \to X$ that satisfies

$$h(0, t) = f(t), \qquad h(1, t) = g(t), \qquad h(s, 0) = h(s, 1) = x_0$$

for all $s, t \in [0, 1]$. It is easy to verify that the equivalence classes form a group, under the operation of "composition": to compose two paths, follow one and then the other; the inverse of any path is the same path run backward. This group, denoted $\pi_1(X, x_0)$, is called the Poincaré fundamental group of the pointed space $(X, x_0)$.

If $\varphi : (X, x_0) \to (Y, y_0)$ is a morphism of pointed topological spaces (defined as in 9.9), then we can define a mapping between the fundamental groups,

$$\pi_1(\varphi) : \pi_1(X, x_0) \to \pi_1(Y, y_0),$$

as follows: If $f : [0, 1] \to X$ is a member of some equivalence class that is, in turn, a member of $\pi_1(X, x_0)$, then $\varphi \circ f : [0, 1] \to Y$ is a member of some corresponding equivalence class that is a member of $\pi_1(Y, y_0)$. It is not hard to verify that this mapping is well defined, i.e., that it does indeed preserve equivalence, and furthermore, this mapping is a group homomorphism. Thus $\pi_1$ is a covariant functor from the category of pointed topological spaces to the category of groups. Using this functor, we can transform some questions about topological spaces into corresponding questions about groups. That is one of the basic ideas of algebraic topology. It will not be pursued further in this book, however.
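The composition rules for the two power set functors of 9.32 can be verified mechanically on finite sets. The following is a small sketch, assuming maps are stored as Python dictionaries; the sets, maps, and function names are hypothetical and serve only to illustrate the covariant rule $F(v \circ w) = F(v) \circ F(w)$ and the contravariant rule $G(v \circ w) = G(w) \circ G(v)$.

```python
from itertools import combinations

def powerset(xs):
    # All subsets of xs, as frozensets.
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def forward_image(f):
    # Covariant: f : X -> Y becomes a map P(X) -> P(Y).
    return lambda A: frozenset(f[a] for a in A)

def inverse_image(f):
    # Contravariant: f : X -> Y becomes a map P(Y) -> P(X).
    return lambda B: frozenset(x for x in f if f[x] in B)

def compose(v, w):
    # (v o w)(x) = v(w(x)) for maps stored as dicts.
    return {x: v[w[x]] for x in w}

# Hypothetical finite sets and maps  w : X -> Y,  v : Y -> Z.
X, Y, Z = {1, 2, 3}, {"a", "b"}, {"p", "q", "r"}
w = {1: "a", 2: "a", 3: "b"}
v = {"a": "p", "b": "r"}
u = compose(v, w)

# Covariant rule: the forward image of v o w is (forward image of v) o (forward image of w).
covariant_ok = all(
    forward_image(u)(A) == forward_image(v)(forward_image(w)(A)) for A in powerset(X)
)

# Contravariant rule: the inverse image of v o w is (inverse image of w) o (inverse image of v).
contravariant_ok = all(
    inverse_image(u)(C) == inverse_image(w)(inverse_image(v)(C)) for C in powerset(Z)
)
```

Note the reversed order of `w` and `v` in the second check; that reversal is exactly what "contravariant" means.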
9.34. Many covariant functors can be described as forgetful functors. A forgetful functor is in use when we go from one category to another by forgetting part of the relevant structure. For instance, a lattice is a special type of preordered set, and a lattice homomorphism is a special type of increasing map. Any theorem about increasing maps between preordered sets can also be applied to the special case of lattice homomorphisms between lattices.

If $\mathfrak{A}$ is a subcategory of $\mathfrak{B}$, then the inclusion $\mathfrak{A} \to \mathfrak{B}$ is a forgetful functor. Not every forgetful functor is of this form, however; see the two examples below. In forgetting some structure, we permit some change in the description of the objects. For instance, we noted in 9.10 that any preordered set $(X, \preccurlyeq)$ may be viewed as a topological space $(X, \mathcal{S})$, but $\preccurlyeq$ is not equal to $\mathcal{S}$.

9.35. We now describe two especially important forgetful functors that will be important in later chapters.

Every uniform structure determines a topology, and any uniformly continuous map is also continuous. Thus there is a forgetful functor from uniform spaces (with uniformly continuous maps) to topological spaces (with continuous maps). This forgetful functor is not given by the inclusion of a subcategory, since different uniformities on a set may determine the same topology; examples of that fact are given in later chapters. However, this forgetful functor has the interesting property that it preserves the formation of initial objects. That is, if $\mathcal{U}$ is the initial uniformity determined on a set $X$ by a collection of mappings $\varphi_\lambda : X \to (Y_\lambda, \mathcal{V}_\lambda)$, then the resulting uniform topology $\mathcal{T}(\mathcal{U})$ is equal to the initial topology determined by the maps $\varphi_\lambda : X \to (Y_\lambda, \mathcal{T}(\mathcal{V}_\lambda))$. (That equality may be easier to prove in Chapter 18, after we have developed a few more tools.)

Every topology determines a Borel $\sigma$-algebra, and it is easy to verify that any continuous map is measurable when its domain and codomain are equipped with the Borel $\sigma$-algebras. Thus we obtain a forgetful functor from topological spaces (with continuous maps) to measurable spaces (with measurable maps). This forgetful functor is not given by a subcategory inclusion, since different topologies may yield the same $\sigma$-algebra; for instance, the discrete topology on $\mathbb{N}$ and the two lower set topologies given in Chapter 5 all yield the discrete $\sigma$-algebra. The forgetful functor from topological spaces to measurable spaces sometimes does not preserve the formation of initial objects; an example of that fact is given in Chapter 21.

9.36. Other functors. Preview. The functors that take any poset to its sup completion, any Tychonov space to its Stone-Čech compactification, or any separated uniform space to its separated uniform completion, are examples of inclusions of reflective subcategories. That topic will not be discussed here; it can be found in Herrlich and Strecker [1979].

THE REDUCED POWER FUNCTOR

9.37. Preview. In the next few pages we shall develop a "junior version" of nonstandard analysis.
This simplified approach is less powerful than the customary treatment, but it avoids the conceptual difficulties of sets of sets of sets and avoids the formal study of mathematical languages, a study that is second nature to logicians but may seem quite foreign to many analysts. Our junior version, which may seem more natural to analysts, is adequate for a few minor applications, including a "construction" of the hyperreal number system $^*\mathbb{R}$ and an explanation of limits in terms of infinitesimals, both in Chapter 10; this will give the reader a quick taste of what nonstandard analysis is like. In Chapter 14 we shall sketch some of the remaining ingredients of the customary approaches to nonstandard analysis, but that sketch will rely on some results and intuition developed in the next few pages.

9.38. Preview of the Transfer Principle. In the next few pages we shall show how, given any set $S$, function $f$, or relation $R$, we can construct a corresponding set, function, or relation $^*S$, $^*f$, $^*R$ in a "larger universe." The Transfer Principle states that any suitably worded statement without stars is true if and only if the corresponding statement with stars is true. For instance, $S = T$ if and only if $^*S = {}^*T$, and therefore the mapping $S \mapsto {}^*S$ is injective. Likewise, we shall show in 9.45.h that $T = S_1 \cap S_2 \cap \cdots \cap S_n$ if and only if $^*T = {}^*S_1 \cap {}^*S_2 \cap \cdots \cap {}^*S_n$, for any positive integer $n$.

However, the Transfer Principle only applies to "suitably worded" statements, not to all statements. For instance, our observation about finite intersections does not extend to infinite intersections; see 9.46.a. To make precise this notion of "suitably worded statements," we will need to analyze our language; that logical analysis will be carried out in part in Chapter 14.

In the next few pages we develop some basic properties of the star mapping by purely ad hoc methods, without use of the Transfer Principle. These ad hoc methods are sometimes a bit tedious; the Transfer Principle would be a helpful shortcut. Some readers may prefer to glance through a text on nonstandard analysis, master the Transfer Principle, and then proceed through the next few pages.

9.39. Let $\Lambda$ be a nonempty set. For the discussions below we may refer to $\Lambda$ as the index set, or domain. Let $\mathcal{F}$ be a proper filter on $\Lambda$, and let $\mathcal{I}$ be the proper ideal that is dual to $\mathcal{F}$, that is, $\mathcal{I} = \{\Lambda \setminus F : F \in \mathcal{F}\}$. $\mathcal{F}$ will usually be a free ultrafilter, but other choices of $\mathcal{F}$ are also of some interest; see for instance Chapter 21. (The existence of free ultrafilters was discussed in 6.33.) Our choices of $\Lambda$, $\mathcal{F}$, $\mathcal{I}$ will be held fixed throughout the discussion.

9.40. We shall consider many functions from $\Lambda$ to various sets. For our junior version of nonstandard analysis, we will be concerned with sets such as $S^\Lambda, T^\Lambda, U^\Lambda$ (the functions from $\Lambda$ into $S$, $T$, or $U$) for various choices of sets $S, T, U$, but we will not be concerned with a larger set containing all of $S, T, U$. For simplicity, we shall disregard the codomains of these functions. Thus, any two functions that are defined on $\Lambda$ and agree at every point of $\Lambda$ will be viewed as the "same" function. The particular choice of the codomain does not matter, provided that it is sufficiently large for our applications; any larger set will do just as well.
Two functions $g, h$ defined on $\Lambda$ will be said to be $\mathcal{F}$-equivalent, or to agree $\mathcal{F}$-almost everywhere, if the set $\{\lambda \in \Lambda : g(\lambda) = h(\lambda)\}$ is "large" in the sense that it is an element of $\mathcal{F}$ or, equivalently, if the set $\{\lambda \in \Lambda : g(\lambda) \ne h(\lambda)\}$ is "small" in the sense that it is a member of $\mathcal{I}$; i.e., if the statement $g = h$ is satisfied $\mathcal{F}$-almost everywhere in the sense of 5.3. It is easy to verify that this is an equivalence relation on the set $\Omega^\Lambda = \{\text{functions from } \Lambda \text{ into } \Omega\}$, for any codomain $\Omega$. If we do not specify a codomain $\Omega$, then $\mathcal{F}$-equivalence is an equivalence relation on the proper class of all functions that are defined on $\Lambda$. For the present discussion, let $\pi(g)$ denote the equivalence class containing a function $g$.

9.41. Let $\Lambda, \mathcal{F}, \mathcal{I}$ be as above, as in 9.40, and let $S$ be any set. Then the set of equivalence classes

$$^*S = \{\pi(g) : g \in S^\Lambda\} = \{\pi(f) : f(\lambda) \in S \text{ for almost all } \lambda\}$$

is called the reduced power of $S$. (Here "for almost all $\lambda$" means for all $\lambda$ in some member of $\mathcal{F}$.) In other words, $\xi \in {}^*S$ if and only if $\xi$ is an equivalence class, at least one member of which is a function whose range is a subset of $S$.

When the filter $\mathcal{F}$ is a free ultrafilter, then the reduced power $^*S$ is called the ultrapower of $S$. When the choices of $\Lambda$ and $\mathcal{F}$ need to be mentioned explicitly, then the reduced power $^*S$ can be written instead as $S^\Lambda/\mathcal{F}$ or as $S^\Lambda/\mathcal{I}$. Usually that notation is not needed, however, for most interesting results are obtained when we hold $\Lambda$ and $\mathcal{F}$ fixed and consider what happens as $S$ is varied.

For any point $s \in S$, let $c_s$ be the constant function taking the value $s$, i.e., the function defined by $c_s(\lambda) = s$ for all $\lambda \in \Lambda$. Then it is clear that $\pi(c_s) \in {}^*S$. Moreover, it is easy to see that the mapping $s \mapsto \pi(c_s)$ is injective, i.e., if $s \ne t$, then the equivalence classes $\pi(c_s)$ and $\pi(c_t)$ are distinct. We may identify each point $s$ with the resulting equivalence class $\pi(c_s)$; thus we may consider $S$ as a subset of $^*S$. We shall see, in exercises below, that $^*S$ inherits many of the properties of $S$, and thus it is a sort of "enlarged copy" of $S$.

9.42. Remarks. The reduced power construction is a simplified version of the nonstandard enlargement construction used in nonstandard analysis. This notion of "ultrapower" should not be confused with the Banach space ultrapower, a related but slightly different object that is often used when techniques of nonstandard analysis are applied in the study of Banach spaces. A brief introduction to Banach space ultrapowers can be found in Coleman [1987].
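The definition of the reduced power in 9.40 and 9.41 can be carried out by hand when the index set is finite. The sketch below is a toy illustration under assumptions that are not in the text: the index set is $\{0, 1, 2\}$, and the filter consists of all supersets of a chosen set `core` (so every filter in the example is principal; free ultrafilters need an infinite index set and cannot be shown this way). All names are hypothetical. It counts equivalence classes and picks out those that contain a constant function $c_s$.

```python
from itertools import product

INDEX = (0, 1, 2)  # hypothetical finite index set

def equivalent(g, h, core):
    # g ~ h iff the agreement set {i : g(i) = h(i)} belongs to the filter
    # consisting of all supersets of `core`.
    agree = {i for i in INDEX if g[i] == h[i]}
    return set(core) <= agree

def reduced_power(S, core):
    # Equivalence classes of functions INDEX -> S; each function is a tuple.
    classes = []
    for g in product(sorted(S), repeat=len(INDEX)):
        for cls in classes:
            if equivalent(g, cls[0], core):
                cls.append(g)
                break
        else:
            classes.append([g])
    return [frozenset(cls) for cls in classes]

def classes_of_constants(classes):
    # Classes pi(c_s) that contain some constant function c_s.
    return [cls for cls in classes if any(len(set(g)) == 1 for g in cls)]

S = {"x", "y"}
fixed = reduced_power(S, core={0})       # fixed ultrafilter at the point 0
coarse = reduced_power(S, core={0, 1})   # proper filter that is not an ultrafilter
```

With the fixed ultrafilter there are exactly two classes, both containing a constant function, so the copy of $S$ is all of $^*S$. With the coarser filter there are four classes but only two of them contain constants, so $^*S$ is strictly larger than the copy of $S$.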
9.43. Remarks. The reduced power construction is used in substantially different ways, with different intuition and syntactic conventions, in at least two parts of analysis:

(i) Nonstandard analysis will be introduced briefly in Chapter 14. In this context, $\mathcal{F}$ is usually a free ultrafilter, and elements of $^*S$ are discussed much as though they were elements of $S$, i.e., points in some set slightly larger than $S$. For instance, elements of $^*\mathbb{R}$ are treated as some sort of generalized "numbers."

(ii) In the theory of measure and integration, reduced powers also arise naturally. Let $\mathcal{I}$ be the collection of null sets for some complete positive measure $\mu$ on a set $\Lambda$. Then $\mathcal{I}$ is a $\sigma$-ideal, but it is generally not a maximal ideal, since not every subset of $\Lambda$ is necessarily a null set or the complement of a null set. Thus, the dual filter $\mathcal{F}$ is generally not an ultrafilter. In this context, elements of $^*X$ are discussed much as though they were elements of $X^\Lambda$, i.e., functions defined on $\Lambda$. For instance, members of $^*\mathbb{R}$ are sometimes called real random variables; more generally, members of $^*X$ may be called $X$-valued random variables; this is discussed in Chapter 21. For instance, elements of the Lebesgue spaces $L^p(\Lambda, \mu)$ are equivalence classes of functions, but they are often discussed as if they were functions. This is, admittedly, an abuse of notation: the elements of $L^p(\Lambda, \mu)$ are not really functions. But the blurring of that distinction, quite common in the literature, is convenient and usually harmless because the quotient map $\pi : X^\Lambda \to {}^*X$ preserves most (not quite all) of the structures and operations that are of interest. Occasionally the distinction between functions and their equivalence classes becomes important; then the distinction is pointed out.

9.44. Exercise: When are $S$ and $^*S$ different? We have seen that $S \subseteq {}^*S$; when do we have $S \subsetneq {}^*S$ also? Assume $\mathcal{F}$ is a proper filter on $\Lambda$ (not necessarily an ultrafilter).

a. Show that $^*S = S$ if $S$ is the empty set or a singleton.

b. Suppose $\mathcal{F}$ is the fixed ultrafilter at some point $\lambda_0 \in \Lambda$. Then $^*S = S$ for all sets $S$. Thus, $^*S$ brings us nothing new; this case is of little interest to us.

c. Suppose $\mathcal{F}$ is a proper filter, but not an ultrafilter. Then $^*S \supsetneq S$ if $S$ contains two or more points. Hint: If $A$ and $\complement A$ are nonempty proper subsets of $\Lambda$ that do not belong to $\mathcal{F}$, and $x, y$ are distinct members of $S$, show that

$$f(u) = \begin{cases} x & \text{if } u \in A \\ y & \text{if } u \notin A \end{cases}$$

defines a function $f : \Lambda \to S$ that is not equivalent to a constant function.

d. If $\mathcal{F}$ is a free ultrafilter and $S$ is a finite set, then $^*S = S$. Hint: 5.8(E).

e. If $\mathcal{F}$ is a free ultrafilter and $\mathrm{card}(S) \ge \mathrm{card}(\Lambda)$, then $^*S \supsetneq S$. Hint: There exists an injective mapping $i : \Lambda \to S$; then $i$ is not equivalent to a constant mapping.

f. Corollary. If $\Lambda = \mathbb{N}$ and $\mathcal{F}$ is a free ultrafilter on $\mathbb{N}$, then $^*S \supsetneq S$ for every infinite set $S$. Hint: Here we use the fact (established in 6.27) that any infinite set $S$ satisfies $\mathrm{card}(S) \ge \mathrm{card}(\mathbb{N})$.

9.45. Further properties of reduced powers of sets. The list of properties below, and much of the other material in this subchapter, is based on Robinson and Zakon [1969]. Let $S$, $T$, $S_1, S_2, S_3, \ldots$, and $S_\alpha$ ($\alpha \in A$) be sets. Then:
. . ~. Also.g cannot be strengthened to equalities. Without additional assumptions. Let 9" be any filter on N that includes the cofinite filter. Then an ntuple of functions (fl. then *S :/: *T. . and let fl.(*S) \ (*T). N Sn) i. f2. Intersections and unions satisfy these inclusions: h. For both of the examples below. a. . The conclusions of 9. define f : N + N by taking f ( n ) = n. . *(S \ T) c_ ( * S ) \ (*T). 5.N \ {j} for all j b.46.{2. rr(f) ~ *(1%1\ Sl) 71"(f) ~ *Sl U ' S 2 if Sl . Observe that two ntuplevalued functions f = (fl. if Sj . I f S C _ T . then *(S \ T) .i may not be valid if we do not assume 9" is an ultrafilter. . . if Sj . . Let 9" be the cofinite filter. Reduced power of a finite products of sets. 4. *($1 N $2 N . f..}. . ~5. f~) may also be viewed as an ntuplevalued function. For any positive finite integer n.f and 9. "a(f) r *Sl U *S2 U *S3 U . } .. . .54). . t h e n S . if $2 . N *Sn. .~ . * S C * T ~ SCT.47. .. 8 . . (compare with 9. .45. 6. g 2 .fn be finitely many functions defined on A. If 9" is an ultrafilter. f 2 .40 if and only if the set {~ E A : f(/~) = g(A)} belongs to . The * mapping is injective" If S # T. let rr(f) be the equivalence class containing the function f. A filter 9~ will be specified below. g. . . we now show this with examples.gn) are equivalent in the sense of 9. c. . b.45. ~(f) E *($1 US2). . . d. e. We shall use the two viewpoints interchangeably. Let 9"be a filter on A. let A = N. . Then rr(f) E *(1%1)\ *(Sl). Examples. but Sl n S2 n S 3 .. .  *$1 U ' S 2 U " " U * S n. fn) and g = ( g l . . .{1. * ~ . Then rr(f) E * S l n *S2 n *S3 n ..45. . 9. ) . . then also *($1 U $2 U " " U Srt) = *S1 N *$2 N . If 9" is an ultrafilter.{j} for all j. . f 2 .The Reduced Power Functor 233 a. rr(f) E * (Sl U $2 U S3 U .T N*S. we now show this by examples. 3.. 7. the inclusions in 9. 9.
gj(A)} belongs to 9".49. Show that if Gr(p) c X x Y is the graph o f a functionp" X ~ Y. . f2.. g. Any ~c E * X may be written in the form ~ ( f ) for some function f " A ~ ~t (where (B) .) and g = (gl. g 3 . . the function *p is an extension of the function p . be infinitely many functions defined on A. .48. For instance.. if and only if the set Nj=I{/~ E A 9 fj(/~) . 9. . where 8 is the Kronecker delta (defined in 2. no two of which have the same first element. we would like to extend the function sin : I~ ~ ~ to a function *sin : *I~ . . Let's see what goes wrong. Observe that two sequencevalued functions f = (fl. if and only if the set ~ j = l {A E A 9 fj(A) . Note that. . It is easy to verify that *($1 X $2 X "'" X Sn) = *$1 X *$2 X ''' X *Sn for any n sets $1. but is not necessarily implied by. Sn. We see that fj is equivalent to gj since they agree everywhere on H except at one point. g2. g 3 .. f(n) and g(n) differ in their n t h coordinate. g 3 . $2. and let fl. f 3 . f3. we wish to define a function (*p) 9 *X ~ *Y by specifying its value on each ~ E *X. we have X c_ *X. But f = (fl. fn) is equivalent to (gl.gj()~)} belongs to 9". but not necessarily conversely. Reduced powers of functions. since Gr(p) c_ Gr(*p).p(x) for every x c X. g2.Gr(*p). . Let fj be the constant function 0. . 9. Let 9" be a filter on A. and *p(x) . .. .. f 3 . ) may also be viewed as a sequencevalued function. There are a couple of natural methods for defining a reduced power *p : *X ~ *Y.. Since 5[ is a filter. . ) is equivalent to (gl. . .40 if and only if the set {A E A : f(A) = g(A)} belongs to 9" that is..2. f2. this condition holds if and only if each of the n sets {A E A : fj(A) = gj(A)} belongs to 9". Since 9" is a filter. We may use the two viewpoints interchangeably. let A = H. . which we shall denote by *p. . Thus *(Gr(p)) . then *(Gr(p)) c_ *X x *Y is the graph of a function from *X into *Y. if (fl. how is this accomplished? 
Let p : X ~ Y be a function from one set to another. Another method is by this rule: If p 9X ~ Y is some function. . (A) One method is to identify a function with its graph. g2. (fl. then each fj is equivalent to gj. ) since they agree nowhere on H indeed. f 2 .that is.) if and only if fl is equivalent to gl. f2. and let 9" be the cofinite filter on H.. f2. . and fn is equivalent to gn. . . the condition that each of the sets {A E A : fj()~) = gj(A)} belongs to 9. . . . . this condition implies. f2. In other words. In other words. . . An equivalence class of sequences is not the same thing as a sequence of equivalence classes. . Then a sequence of functions ( f l. Then a function is a set of ordered pairs. *R. . f 3 .. ) is not equivalent to g = (gl. W h a t about an infinite product of sets? Not all of the reasoning in the preceding section generalizes readily. Therefore.d). f2 is equivalent to g2. g 2 ...234 n Chapter 9: Concrete Categories 9" that is. . and let gj : N ~ N be defined by gj(k) = 8jk. f 3 . How do we extend functions? For instance. Then for each j. fortunately they yield the same result. ) . . an equivalence class of ntuples can be represented as an ntuple of equivalence classes. .. ) are equivalent in the sense of 9.
(B) For functions f. Further properties of reduced powers of relations. *(poq) . a.The Reduced Power Functor f~ is some sufficiently large codomain). Reduced powers of relations. Show that these two definitions yield the same binary relation *R on the set *X. How do we extend relations? For instance.. show that if'rr(fl) = 7r(f2) and rr(gl) = rr(g2).12). that term was introduced in 9. rr(fl) .7r(f2) => rr(po f l ) . Hence a function (*p) 9 *X + *Y is well defined by the rule (*p)(rr(f)) rr(pof) for f e X A. Hints" The hypothesis can be restated as: G r ( f ) N (S x Y) c_ X x T. (A) One method is to identify the relation with its graph i. to work with the set Gr(R) C_ X x X. 9 . Thus * ( G r ( R ) ) = Gr(*R). '. There are a couple of natural methods for defining a binary relation *R on the set *X. *(Gr(R)) C *(X x X) = (*X) x (*X). then ( * f ) ( * S ) c *T.. g : A + X.e. W h e n p is some familiar function. The taking of reduced powers preserves identity maps i. the extension of sin. which would naturally be written * sin.51.. is customarily written sin instead. Q is the quotient map taking functions to their equivalence classes.(*p)o(*q) for any functions q" W + X and p" X + Y. 5 0 . say that :r(f) *R rr(g) if and only if the statement f R g is 9"true in the sense of 5.e. which we naturally call *R. Further properties of reduced powers of functions. For instance. If f ( S ) c T.e. 9. 9. we can show that G r ( * f ) A (*S x *Y) c_ *X x *T. b. and rr 9 ftA __+ . It is the graph of a binary relation on *X. Let R be a binary relation on a set X.52. Show that the mapping f H p o f respects the equivalence relation on X A that is.{)~ C n " f2(/~) R g 2 ( / ~ ) } C ~. Show that this makes *R well defined on *X i. Also. .. Co Let f " X + Y and let S c_ X and T c_ Y. 235 Show that these two definitions yield the same function *p. then {/~ E n " fl()~) n g l ( / ~ ) } C ~ :. that is. 
The reduced power *p" *X + *Y is injective or surjective if and only if the mapping p" X + Y has that property. then it is customary to write *p without the star. Then we can take its reduced power. fortunately they yield the same result. the taking of reduced powers preserves composition of functions. Using several results of the last few pages. then *(ix) is equal to the identity map of the set *X. we would like a corresponding notion for members of *IR. From these two facts it follows that the taking of reduced powers is a covariant functor from the category of sets to the category of sets. for some sufficiently large codomain ft. if i x is the identity map of X.3 i.31. we know 3 < 5. if and only if the set {A E A : f(A) R g(A)} is a member of 9".rr(po f2) (see 3.e.
~ c *S such that none of the conditions c~ *< 8. _<) is a chain. Hint" Use characterization (A). antisymmetric.xA/9: defined in 9. we cannot omit the assumption that fl~ be an ultrafilter. The restriction of *R to the set X C_ *X is precisely R. Then the equivalence classes of the characteristic functions of the sets A and B are elements c~. Let X be an algebraic system of an idealsupporting type (~. and S 9 S  {x c X " aRx and xRb}. In particular. * E ) . Hint" If g. b) in ]R is the corresponding interval (a. then so i s * p 9 (*X. together with 9. and that 0 < 1. then so does *R: reflexive. __u)is an orderpreserving map from one preordered set into another. In fact. 4 ) ~ (]I.41. . Proof. Reduced powers of algebraic systems. The reduced power of a complete or Dedekind complete ordering may inherit that completeness property (as in 21. the enlargement of an interval (a. dl Suppose a. and the inclusion map . {/~ "g(A) > h(A)} form a partition of A. h E S A. Example. equivalence relation. c~ . irreflexive. {/~ 9 g(A) = h(A)}.53.~. above. if X is a ring. or it may not (as in 10. if S is a chain containing at least two elements and fl~ is a proper filter on A but not an ultrafilter. If p" (X.29. any x E X is mapped to the equivalence class of the constant function from A into X whose constant value is x. it is another algebraic system of variety (r. let A be a set. b ) i n *R. b c X. XC*X is a ring homomorphism. or c~ *> ~ holds.25(C). partial order. * <) is a chain. then (*S. X is a subring of *X. g. By relabeling. bo If R has any of the following properties. then *X cannot be a chain. and let 9" be a filter of subsets of A.:J).236 Chapter 9: Concrete Categories a. preorder. and so X is a subalgebra of *X. N {g E X A 9 gl(0) c ~r} is an ideal in the product algebra X A = {functions from A into X}.41.42). we may assume that two of the elements of S are called 0 and 1. (That is. Hence we can form the quotient X A / N as in 9. In the preceding result. 
If 9" is an ultrafilter on A and (S. We may embed X in *X. such that neither A nor B is a member of 9". f.19). " 4 ) ~ (*Y. then the three sets {A "g(A) < h(A)}.49). so exactly one of them is a member of 9~. e. That is" x R y if and only if x. Then {~E*X 9 a*R~ and ~ * R b } . 9. There exist sets A and B that partition A. then *X is a ring. by the method described in 9.) The embedding is an injective homomorphism. symmetric. c. transitive. 3).45. Remark.25(C)) are the same as the reduced powers "7) of the fundamental operations 7) of X (defined as in 9.d. For instance. and the fundamental operations of the algebraic system X A / N (defined as in 9. y E X and x *R y. lattice. Then by 9. It is easy to verify (exercise) that this quotient X A/N is the same thing as the reduced power *X .
we have cE} (a A. However. Then it is possible to choose an index set A and a free ultrafilter 9~ on A.A(~) for each A E A. Proof of (UF1) ~ (UF4). by replacing it with a larger set if necessary.) E E is satisfied "almost everywhere. . We shall use A .54. We shall show that ~ E ~'IEE~ *E. For each g E 9 and each E E ~. each of these maps preserves the fundamental operations. Indeed.X+A 9 A(P:)EE}. It may be helpful to compare this principle with the following characterization of compact topological spaces: they are spaces in which. (The remainder of the proof is just a m a t t e r of unwinding the notation.43.s A / ~ " have this property: Whenever is a proper filter on a subset of ft. We shall show that this Y has the required property.E = {. the reader may find it easier to proceed on his or her own instead of reading further. The equivalence of (UF1) and (UF4) is similar to a result proved by Lutz and Goze [1981]. It is easy to verify that the collection g .32.B. 9 the coordinate projections 7rx" X A . let any proper filter ~ E 9 be given." so { E *E.) For every E E ~. introduced in 6. then ~ E ~ cl(E) is nonempty. X. and therefore preserves a great deal of the relevant structure. The Ultrafilter Principle.The Reduced Power Functor 237 Elements of *X are equivalence classes of functions from A into X. E 9 ~ E ~. then ~ E E ~ *E is nonempty. a(z) E} E E fl~. whenever ~ is a proper filter. Let 9 be the family of all proper filters on subsets of ft. and 9 the i n c l u s i o n X C~. We may assume ft is infinte. Thus the condition c(. as we remarked in 9. and let ~ E *f~ be the equivalence class of the function e. such that the resulting ultrapowers *S .{A~. Define a function c 9A .e. it has the finite intersection property. 9.. I d e a l i z a t i o n ) P r i n c i p l e . is equivalent to the following principle. E E ~} is a filter subbase i. elements of *X are sometimes discussed as if they were elements of X A or elements of X. by Cartan's Ultrafilter Principle. 
These styles of discussion are feasible largely because each of these maps is a homomorphism: 9 the quotient map 7r" X A ~ X A / N . ft by taking c(A) . Hence. Let ft be a set.ft~ = {functions from r into ft}. consider the set A.x Thus. there exists an ultrafilter fl: on A such that Y _D g. which is similar to a principle of nonstandard analysis" ( U F 4 ) E n l a r g e m e n t ( C o n c u r r e n c e . We emphasize that it is possible to make a single choice of A and 11 that works for all choices of ~.
listed below as (HI).32). (H4). Let ~ be a proper filter on a set f~. Thus the ultrafilter 9" must be free. 1] F I~ F F T For each object X in r define the set X* = {@morphisms from X into A}.33. called e x p o n e n t i a l f u n c t o r s or d u a l f u n c t o r s .44. let ~ 9A ~ ~t be any function whose equivalence class is ~. 1}. and let A be some particular object in that category. EXPONENTIAL (DUAL) FUNCTORS 9. as in the diagram below.a. But since f~ is infinite. then it is fixed whence *S .238 Chapter 9: Concrete Categories If 9" is not a free ultrafilter. by 9. Let A and 9"be as in (UF4). (H3). a contradiction. F stands for a scalar field (generally I~ or C). Proof of (UF4) ~ (UF1). A few categories that we shall consider in later chapters have functors that we shall now describe. In the table. f X > Y Ao f .A o f for all A c Y*. and (H5). we wish to extend it to an ultrafilter. by 6. let ~ E ~ E s t ~ * E .55. 2 stands for the set {0. (H2). Objects of ~ sets Boolean algebras Boolean spaces Tychonov spaces vector spaces Riesz spaces topological vector spaces Banach spaces Pontryagin groups Morphisms functions Boolean homomorphisms continuous maps continuous maps linear maps order bounded linear maps continuous linear maps continuous linear maps continuous homomorphisms A 2 2 2 [0. Then O = ~ E ~ E = ~ E ~ *E % O. It is easy to verify that 9 is an ultrafilter on f~ and that 9 _D t~. Some commonly used choices of ~ and A are listed in the table below. and T stands for the circle group (see 10. it has some free ultrafilter ~.f*(A) c X* ~ A J AcY* . Let ~ be a given category. For each morphism f " X ~ Y in the category r define a mapping f* 9 Y* ~ X* by the rule f*(A) . These categories satisfy five hypotheses. Let 9 be the filter on f~ generated by the filterbase {e(F)" F c 9"}.S for every set S.
ff and if* are the same category. it is actually irrelevant to the theory developed below. In the most interesting instances of this theory. the elements of X * separate the points of X. . for any two distinct points Xl. then the other is surjective.d. (H2) There is some natural way to attach structures to the dual sets X*. Elementary example. Also assume that (H3) The categories ff and if* have special objects A with the same underlying set (perhaps with different structures attached). working with whichever properties are more convenient. the following hypotheses are satisfied: (H1) For each object X in if. Prove that this converse is valid in the category of sets i. X * is surjective. the bidual category if** is identical to the original category if.X2 E X. We can switch back and forth between either setting and its dual. but that could be viewed as mere coincidence. where any set is an object. a converse can be proved: if one of the functions f : X ~ Y or f * : y * ~ X * is injective. though each of those terms has other meanings as well. 9. We may refer to it as the exp o n e n t i a l f u n c t o r . and making the dual functions f * : Y* ~ X * into morphisms in that category. Exercise. (Hint: Use (H1). then f H f * is the inverse image functor defined in 9. assume A is a set containing two or more points. 9. we have ff ~ if* in at least one important application: The categories of Boolean spaces and Boolean algebras are dual to each other. which maps into some category if**. see 11. T h a t is. For a simple. and any function is a morphism. there exists at least one morphism f : X ~ A such that f ( x l ) ~ = f(x2). concrete example.Exponential (Dual) Functors 239 We sometimes refer to X * and f * as the d u a l s or a d j o i n t s of X and f. or a d j o i n t f u n c t o r . For instance. making t h e m objects in another category ~*. If ff and if* are both the category of sets and A = {0. Assume ff is a category that has a dual functor. 
Suppose that the category if* also has a dual functor. 1}.22. Actually. d u a l f u n c t o r .e. Moreover. then the other is injective. and that fact is very important in the theory sketched below. though not all.32. show that if one of the functions f : X ~ Y or f * : Y* . thus they define a contravariant functor from ff into if*. mapping into some category if*. It is easy to verify that the rules X H X * and f H f * reverse arrows and compositions. in most cases of interest. In the categories where duals are useful.57. Much of the interest in exponential functors stems from the fact that some properties of X and f correspond to dual properties of X * and f*.) In some categories.56..
then the function extension of f that is. thus the canonical embedding X c ~ X * * is a morphism.e.58..240 C h a p t e r 9: Concrete Categories We shall denote both of these special objects by the same symbol A. the canonical embedding x H Tx turns out to be surjective i. The canonical embedding x ~ Tx. is then called the c a n o n i c a l i s o m o r p h i s m . This functor has some further properties of interest. 9 for the category of Pontryagin groups (i. is an c > X** and T " Y c y * * be the canonical embeddings. In some categories if.e. Such an object X will be called reflexive. the inclusion morphism X ~ X** is actually an isomorphism X c > X**. from X to X**. we may view the underlying set of X as a subset of the underlying set of X** (without regard to the additional structures attached to those sets). Thus x ~ Tx is an injective mapping from X into X**. locally compact Hausdorff Abelian groups). 9 Hint Let S 9X 9.e. In such a category we have X = X** and f = f * * for all objects X and morphisms f. f ~ f* ~ f**. then the mapping Tx = (. The inclusion map T 9 X ~ X** is sometimes known c as the c a n o n i c a l e m b e d d i n g of X in its b i d u a l . x} = A(x) c A.x} : X* ~ morphism in the category if*. 9 for the Banach spaces of type LP(#). In many categories. x) : X * ~ A. by 28. and instead view x as the function. the bidual functor has this further property: (Hh) The categories ff and if** are the same.12. Exercise. and thus Tx is a member of the set X**. with ~ for the argument... called the bidual functor: X ~ X* ~ X**. we obtain a covariant functor from ~ into ~**.. Each A E X * was defined as a function with argument x E X and value {~. Let now us change our viewpoint. are distinct. by the topological version of the Stone Representation Theorem (see 17. If f : X ~ Y i s a morphism. by 28. and X is a subobject of X** in that category. if x r x ~ then the mappings Tx and Tx. . dual functors. 
W h a t must be ~ verified is f * * ( S x ) = Tf(x) or more concretely. This isomorphism is established: 9 for the category of Hausdorff locally convex topological linear spaces with weak topologies. which we now describe. f** : X** ~ Y** 9. Then x acts as a mapping Tx = (.. with 1 < p < ~ . this further hypothesis is satisfied: (H4) If X is an object in ~ and x c X. called the e v a l u a t i o n m a p at x.59. A is a By (H1). For some objects X.44 and the sections following it). 9 for the categories of Boolean algebras and Boolean spaces. G r a p h ( f * * ) 2 G r a p h ( f ) . every object is reflexive (and so the term "reflexive" is not commonly used in those categories). In the categories of interest.e. By composing the two contravariant.44. which we may view as an inclusion i. [f**(Sx)](i~) = Tf(x)(~) for each A E Y*.50. by the Pontryagin Duality Theorem 26.
On the other hand, in some categories we can easily establish that no object is reflexive. For instance, in the category of sets (without additional structure) and in the category of infinite-dimensional vector spaces (without topology), we can prove that X* is strictly larger than X (see 2.20.1 and 11.36), and therefore X** cannot equal X.

In still other categories, some objects are reflexive while others are not, and reflexivity may be linked to other, important properties. For instance, a Banach space is reflexive if and only if its closed unit ball is weakly compact; see 28.41.
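Illustration (an added sketch, not part of the original development). In the category of sets with A = 2 = {0, 1}, the failure of reflexivity can be checked by brute force on a small finite set. The Python sketch below assumes a three-element set X and encodes each function as a tuple of (argument, value) pairs; the helper names dual, evaluate, and canonical_embedding are hypothetical, chosen only for this example. It builds X* and X**, forms the evaluation maps T_x, and compares cardinalities.

```python
from itertools import product

A = (0, 1)  # the special object A = 2 = {0, 1} (assumption: category of sets)


def dual(points):
    """Return every function from `points` into A.

    Each function is encoded as a tuple of (point, value) pairs,
    so that functions are hashable and can serve as points themselves.
    """
    points = list(points)
    return [tuple(zip(points, values)) for values in product(A, repeat=len(points))]


def evaluate(func, point):
    """Apply an encoded function to one of its arguments."""
    return dict(func)[point]


def canonical_embedding(x, x_star):
    """Return T_x, the evaluation map lam -> lam(x), encoded as a member of X**."""
    return tuple((lam, evaluate(lam, x)) for lam in x_star)


X = ["a", "b", "c"]      # hypothetical three-element set
X_star = dual(X)         # all functions X -> A
X_bidual = dual(X_star)  # all functions X* -> A
images = [canonical_embedding(x, X_star) for x in X]

print(len(X), len(X_star), len(X_bidual))
print(len(set(images)) == len(X))                 # T is injective
print(all(t in set(X_bidual) for t in images))    # each T_x lies in X**
```

For this X the sizes are 3, 8, and 256, so the three evaluation maps are distinct members of X** but fall far short of exhausting it; the canonical embedding is injective and not surjective.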
Chapter 10
The Real Numbers

10.1. Preview. Usually, a "number" means an element of a field. The two fields most commonly used in analysis are the real number system ℝ and the complex number system ℂ. We shall introduce both of these fields formally in this chapter, though we have assumed some informal familiarity with ℝ in earlier chapters.

A significant part of the history of mathematics is the successive extension of number systems, especially the inclusions ℕ ⊆ ℤ ⊆ ℚ ⊆ ℝ ⊆ ℂ. Our language still reflects the resistance with which some of these extensions originally were met. The ancient Greeks were reluctant to admit that the universe could not be explained in terms of ratios of whole numbers; even today, the word "irrational" means an element of ℝ \ ℚ but also means "crazy." Other terms occasionally used for an irrational number are "radical" or "surd" (an abbreviation of "absurd"). Centuries later, when mathematicians began to use complex numbers to analyze polynomial equations, they were reluctant to admit that −1 could have a square root. A "real" number could not be such a square root; such a square root must be "imaginary," and this name stuck, too. Mathematical nomenclature was not so disparaging a few decades ago when nonstandard analysis gave a rigorous foundation for the use of infinitesimals; by comparison, the new numbers in *ℝ \ ℝ were simply called "nonstandard," a rather neutral term.

DEDEKIND COMPLETIONS OF ORDERED GROUPS

10.2. Remarks and definition. If X is an ordered group other than {0}, then X cannot have a greatest element (easy exercise). Hence X cannot be "complete," in the sense of 3.23. The closest X can come to such a condition is Dedekind completeness. For this reason, in the context of ordered groups, a "complete ordered group" generally means a Dedekind complete ordered group.

10.3. Let (X, ≼) be an ordered group. For any x ∈ X and K ⊆ ℤ, let Kx = {kx : k ∈ K}, where the multiplication is defined as in 8.10.h. Then the following two conditions are equivalent to each other. If one, hence both, of them are satisfied, we say X is integrally closed.

(A) Whenever the set ℕx is bounded above, then x ≼ 0.
x ) are both bounded above.4. } is bounded above by/3 if and only if the set {0. } is bounded above b y / 3 . then. Some basic properties and examples. hence both. the following two conditions are equivalent to each other. then the group operations on X must satisfy ~+r / ~ = = sup{x+y inf{x + y sup{x inf{x 9 9 9 9 x. (C) Whenever the set Zx is bounded above. The group Z 2 with the lexicographical ordering (see 3.x is also an upper bound for Z+x. note that if Zx is bounded above. (D) {0} is the only subgroup of X that has an upper bound. To show (a) .44. hence/3 ~ / ~ . hence x 4 0. are Furthermore. Let (D. Examples. . and let X be a Dedekind completion of D (as in 4. Hints: Suppose x E X with Z+x bounded above.z5. x. } is bounded above. x~}. . then Z+x and Z + ( . Also. .+> (B). then all four conditions are equivalent. Any Dedekind complete. Proof. to prove (C) ~ (B) when X is lattice ordered. c. conditions (A)(B) imply conditions (C)(D). hence x 4 0 and .x 4 0.42. On the other hand. x. The groups Z and IR are Dedekind complete. 2 x . 2 x .x. a. Any subgroup of an integrally closed group is integrally closed. (al) (a2) (bl) (b2) xED. then x 4 0.33 and 4. Show that / 3 . Thus the subgroup Z(x V 0) is bounded above.19. x4~}. hence x = 0. . Furthermore. xED. All three of these groups are integrally closed. satisfied. b. The proof of (C) . Another example of an ordered group that is not integrally closed will be given in 10. . Then X can be made into an ordered group in which D is a subgroup if and only if D is integrally closed. x ~ ~c. . y 4 r l } . ordered group is integrally closed. 243 If one. 4) be an ordered group. note that the set {x.p show that Z+(x V 0) is bounded above by /3 v 0. If the ordering 4 on X is a lattice ordering. 3 x .( x V 0) 4 0. By (C). Theorem: C o m p l e t i o n of a Group. 10. x V 0 = 0. y E D. 10. suppose Z+x is bounded above by /3. x . Let/3 = sup(Z+x).a) is an ordered group that is not integrally closed. 
if those conditions are satisfied.n ( x V 0) 4 0 for n E N. . 2x.x. by adding show that . x 4 ~ . we say X is A r c h i m e d e a n .5. . we omit the details. . .34). (D) is easy. . y ~ r/}.Dedekind Completions of Ordered Groups (B) Whenever the set Z+x = {0. then x = 0. Finally. By 8. y E D . To show (B) ~ (C). Q is not.
the addition in X also satisfies ~ + 0 = 0 + ~ = ~. Then each of the following statements is equivalent to the next: p + +r p ~ u + z whenever u . x4~. +) is an additive monoid. y4r/}wheneverzEDandz4 z ~ x + y whenever x . we must show that ? = 0.g." our proof must not rely on any asyetunestablished properties of addition in X. y4r/}and : x. Clifford. If D has such a group completion X. Everett. it suffices to show that u ~ 0. Our first step will be to show that addition in X is associative. we shall show that these make X into an ordered group that has D as a subgroup. r/. and Ulam as contributors to this theorem.~ as in (bl). Let p E D. The beginner is cautioned not to assume too much just on the basis of notation: Although we use the symbol "+. Since D is supdense in X. z4<. we must not yet use associativity or subtraction (the existence of additive inverses) in X. W i t h definition (al) it is easy to see that + extends the addition operation of D. Fix any u E D with u ~ "y. we assume D is commutative. r hence the first statement is not affected by a permuting of those three terms. z E D a n d x 4 ~ . + . z4r pz~sup{x+y pzEDandu4sup{x+y : x. y E D . . Let any ~. y E D . suppose D is integrally closed.21. 0) is an additive group. Proof of theorem. z E D and x 4 ~. ~ E X be given. Observe that v4~ and hence y=~+(~) =sup{xv: 4=~ v~.. successively. we can freely use associativity and subtraction in the given group D. Conversely. we shall apply 3. z c D and u 4 ~ + r / a n d z 4 pz~uwheneveru. Lorenzen. forvED. To show that (X. y . In particular. hence integrally closed (by 10. Let X be a Dedekind completion of D. From u ~ "y we conclude. fix any ~ E X.4. the mapping x ~ . However. that .x from D into D is thus extended to a mapping from X into X. Equations (al) through (b2) follow from 8. To show that 3' ~ 0. v E D. From this it follows immediately that 0 / 4 0. z 4 y.33 and the fact that D is sup. However. x 4 ~ 4 v}. r/.a). Fuchs mentions Krull. 
whereas Fuchs does not impose that restriction.and infdense in X. x. At the end of this proof we shall show the validity of (a2) and (b2) as well.244 Chapter 10: The Real Numbers Remarks. Thus addition in X (defined as in (al)) is associative. y 4 r/.4. y 4 r l . and that + is commutative. hence D is integrally closed (by 10. and let y = ~ + (~). x4~. Our proof is based on that in Fuchs [1963].b). p~x+y+zwheneverx. Define . 0. then X is Dedekind complete. The last statement is symmetric in ~. Thus we have established that (X. Define operations on X by (al) and (bl).
and that is. v. x4~C =~ x 4 u + v .7. Proof. It suffices to show that F preserves addition. = sup(Lr Then for any ~.F(sup(L~)) F(~) + F(~) by (al) since F since f by 8. for e a c h v E D w i t h v > N(u) is bounded below. that Definitions. y E D .u f o r . hence u ~ 0. A c h a i n o r d e r e d r i n g is a ring R equipped with an ordering _< such (i) (R.33 since F since D in 10. 245 By assumption D is integrally closed. x _ < y => x 4 .u _ < y 4 .defined as in (al) and (bl).. Hence X is in fact an ordered group.6. then {x+y : x.~)) sup(f(L~)) 4. (ii) the ordering is translationinvariant all x. x 4 ~ 4 v implies u > x for e a c h v c D w i t h v > { .31. when equipped with the operations + and . l%l(u) is bounded above. y. The ordering is translationinvariant (as defined in 8. and F : X ~ Q is a suppreserving extension of f. Then F is also a group homomorphism.30). now (a2) and (b2) follow from ( a l ) a n d (bl). w e h a v e ~ 4 u + v . Q is a suppreserving group homomorphism. Therefore i n f ( . x 4 ~ .Ordered Fields and the Reals x . also. and f : D .5 is suppreserving is additive on D is suppreserving is supdense in X.f(L. with D as a subgroup.Lv)) sup(f(L~ 4.s u p ( S ) for any set S C_ X. v E D.sup(f(Lv)) F(sup(L~)) 4. Suppose the conditions of the preceding theorem are satisfied. C. ORDERED FIELDS AND THE REALS 10. u E X. Thus X is an additive group. w e h a v e x c D .S ) . ~ C X we have Define sets Lr as in 4.  Y4~7} {xED : x4~}+{yED : Y4~7} = Lr Hence F(~+~) = = = F(sup(L~ 4. 10. this follows trivially from our definition of addition in X. that Q is another ordered group._<) is a chain. Suppose.Lv) ) sup(f(L~) 4.
12 we present a chain ordered field that is not contained in •.14159265358979323. The theorem on the uniqueness of the reals.44 and 10.. It does not determine IR uniquely. thus.15.y > O ~ x y > O.a. but it also follows from a construction presented in 10. chain ordered field. and limits. or a set of pairs of rational numbers. and (ii) any two such fields are isomorphic. introduced in the next paragraph. chain ordered field by representing it in terms of rational numbers. we may discard that representation.e.11.23. Any one of these constructions is sumcient. Actually. the geometric description does not translate readily into usable algebraic axioms. but the essential properties of the real numbers. which is introduced in 10. other proofs are sketched briefly in 10. (iii) If R is also a field. that description has certain drawbacks.. or a pair of sets of rational numbers..) Some basic examples. respectively. . We shall prove those facts in 10. Discussion.246 Chapter 10: The Real Numbers x. tells us that these constructions of F~ from (~ all yield the same result. infs. chain ordered field (or.. After we have proved the existence of a Dedekind complete. The rational number system (~ (with its usual ordering) is clearly a chain ordered field. and it does not matter which one we use. in advanced analysis we usually consider the decimal expansions to be just representations for numbers. Perhaps this view of the real number system is the most concrete and the most useful for purposes of realworld applications in physics. We also think of reals as "infinite decimal expansions" such as 3. in hexadecimal. In 10. we usually think of the real number system as a model for the set of all points on a Euclidean straight line. etc. in the terminology of some mathematicians. b E Q}. that it is chain ordered means that inequalities work the way they should. engineering.. The real number system I~. Those numbers have other representations (in binary. {a + by/2: a. 
That IR is a field means that we can do ordinary arithmetic.33. primitive objects like the points on a line.15.45. etc. we may return to thinking of real numbers as indivisible.14159265358979323.45 and 19. etc. in ternary.). In grade school we learn. 10. what we really need are not concrete representations such as 3.15. in 10. there is only one real number system. To make sense of this definition. Intuitively. In the development of abstract theory. not the numbers themselves. A formal theory of such expansions is sketched in 10. we shall assume informal familiarity with that fact. However. that it is Dedekind complete means that we can take sups. but that is not specific enough for the purposes of this book. for it also fits *lt~ quite well.8. We now define the real number system R to be a Dedekind complete. All of the constructions are somewhat complicated and nonintuitive they represent a real number as a set of rational numbers. (Some mathematicians call these an ordered ring and an ordered field. The proof in 10. Certain other subsets of I~ are also chain ordered fields for instance. we shall call it a c h a i n o r d e r e d field.c. Also. we shall show that (i) there exists a Dedekind complete. a complete ordered field). However.. is a chain ordered field. which are used to prove theorems. how to perform arithmetic operations with such expansions.d is fairly detailed. Analysts often take these ideas for granted and forget how complicated a structure the real number system is. informally. we shall prove the existence of IR in several different ways.
10.9. A few basic properties.
a. If $\mathbb{F}$ is a chain ordered field, then $0 < x < y \Rightarrow 0 < \frac{1}{y} < \frac{1}{x}$.
b. If $\mathbb{F}$ is a chain ordered field, then $x^2 = -1$ has no solution $x$ in $\mathbb{F}$.
c. If $\mathbb{F}$ is a chain ordered field, then $\mathbb{F}$ has no greatest or least element.
d. No finite field can be a chain ordered field.
e. If $R$ is a chain ordered ring with unit, other than $\{0\}$, then the unique homomorphism from $\mathbb{Z}$ to $R$ (noted in Chapter 8) is injective and order-preserving. Thus $R$ contains an isomorphic copy of $\mathbb{Z}$. Hint: Let $1_R$ denote the multiplicative identity of $R$. Show that $1_R > 0$, and hence $1_R + 1_R + \cdots + 1_R$ (the sum of finitely many such terms) is also positive.
f. If $\mathbb{F}$ is a chain ordered field, then the unique ring homomorphism from $\mathbb{Q}$ into $\mathbb{F}$ (noted in Chapter 8) is injective and order-preserving. Thus $\mathbb{F}$ contains a uniquely determined isomorphic copy of $\mathbb{Q}$. Identifying various sets with their isomorphic copies when no confusion will result, we may write $\mathbb{N} \subseteq \mathbb{Z} \subseteq \mathbb{Q} \subseteq \mathbb{F}$; we shall follow this convention in results below.

10.10. An abstract construction of chain ordered fields. Let $D$ be an integral domain, and let $F$ be the resulting field of fractions, as in Chapter 8. Suppose that some ordering $\sqsubseteq$ is given on $D$, making it a chain ordered ring. Define an ordering $\preceq$ on $F$ as follows: For $m, n, x, y \in D$ with $m, n \sqsupset 0$, define $\frac{x}{m} \preceq \frac{y}{n}$ to mean $xn \sqsubseteq ym$. (The reader should verify that this ordering does not depend on the choice of the representatives of the equivalence classes.) Show that $F$ is then a chain ordered field. Moreover, considering $D$ as a subset of $F$, show that the ordering on $F$ is an extension of the ordering on $D$. Two important particular cases of this construction are given in the next two sections.

10.11. Example. When the integral domain $D$ is $\mathbb{Z}$ = {integers}, then the field of fractions is $\mathbb{Q}$, the field of rational numbers. The ordering given in 10.10 is just the usual ordering.

10.12. A non-Archimedean example. Let $(A, \le)$ be a chain ordered ring that is also an integral domain. (An example to keep in mind for now is $A = \mathbb{Z}$; later we may reconsider this construction with $A = \mathbb{R}$.) Let $x$ be a variable. Let $D$ be $A[x]$, the ring of polynomials in the one variable $x$ with coefficients in $A$; then $D$ is an integral domain (see 8.24). On $D = A[x]$, we now define this ordering: $p \sqsupset q$ will mean that the leading coefficient of the polynomial $p - q$ (defined in 8.24) is strictly greater than 0. Verify that this makes $(A[x], \sqsubseteq)$ into a chain ordered ring. The resulting field $F$ of fractions is $A(x)$, the field of rational functions in the one variable $x$ with coefficients in $A$. Hence our construction in 10.10 makes $(A(x), \preceq)$ into a chain ordered field. This example can be found in various algebra books; another source is Lightstone and Robinson [1975]. A few observations about this field will be useful later in this chapter:
a. If $p$ and $q$ are polynomials, the degree of $p$ is greater than the degree of $q$, and the leading coefficient of $p$ is positive, then $p \sqsupset q$.
b. The function $p(x) = x$ is strictly greater than every constant function $k$. Thus, the sequence 1, 2, 3, ... is bounded above.
c. If $p$ and $q$ are polynomials other than 0, the degree of the rational function $p/q$ will mean the difference $\deg(p) - \deg(q)$, and the leading coefficient of $p/q$ will mean the quotient of the leading coefficients of $p$ and $q$. Show that if $r, s$ are rational functions with positive leading coefficients, and $\deg(r) > \deg(s)$, then $r \succ s$.
d. The sequence $1, x, x^2, x^3, \ldots$ is not bounded above; that is, there does not exist a rational function $r(x)$ that satisfies $r(x) \succeq x^n$ for all nonnegative integers $n$. (Contrast this result with 10.20.)

10.13. Let $\mathbb{F}$ be a chain ordered field (as defined in 10.7). Then $\mathbb{N} \subseteq \mathbb{Z} \subseteq \mathbb{Q} \subseteq \mathbb{F}$ as noted in 10.9. Then the following conditions are equivalent; a chain ordered field $\mathbb{F}$ possessing one (hence all) of these conditions is said to be an Archimedean field.
(A) $\mathbb{F}$ is Archimedean in the sense of 10.3; i.e., $\{0\}$ is the only additive subgroup of $\mathbb{F}$ that is bounded above by an element of $\mathbb{F}$. (Note that $\mathbb{F}$ is chain ordered; hence $\mathbb{F}$ is also lattice ordered.)
(B) $\mathbb{N}$ does not have an upper bound in $\mathbb{F}$.
(C) The set $\{\frac{1}{n} : n \in \mathbb{N}\}$ has infimum (in $\mathbb{F}$) equal to 0.
(D) For each $c \in \mathbb{F}$, the set $\{m \in \mathbb{Z} : m > c\}$ has a lowest element.
(E) Between any two elements of $\mathbb{F}$ there is an element of $\mathbb{Q}$. (This is sometimes called the Density Property.)
(F) $\mathbb{Q}$ is sup-dense and inf-dense in $\mathbb{F}$ (see 4.31).

Hints for the equivalence proof: It is fairly easy to see that conditions (A) through (D) are equivalent. To show that those conditions imply (E), let $\alpha, \beta \in \mathbb{F}$ be given with $\alpha < \beta$. By (C) and (D), there exist $n \in \mathbb{N}$ with $\beta - \alpha > 1/n$, and $m \in \mathbb{Z}$ with $m > n\alpha \ge m - 1$. Show that $m/n$ lies between $\alpha$ and $\beta$. It is easy to see that (E) implies (F). To see that (F) implies (C), let $S = \{\frac{1}{n} : n \in \mathbb{N}\}$ and $T = \{q \in \mathbb{Q} : q > 0\}$. Since every element of either of these sets is greater than or equal to some member of the other set, we have $\inf(S) = \inf(T)$, but $\inf(T) = 0$ since $\mathbb{Q}$ is inf-dense in $\mathbb{F}$. This presentation is based partly on Davis [1977].

10.14. Definition and exercise. (Optional.) Let $\mathbb{F}$ be a chain ordered field. Say that a sequence $(x_n)$ is Cauchy in $\mathbb{F}$ if for each $\varepsilon$ in $\mathbb{F}$ with $\varepsilon > 0$ there exists a positive integer $M$ such that $j, k \ge M \Rightarrow -\varepsilon < x_j - x_k < \varepsilon$. (Cauchy sequences will be studied in another setting in Chapter 19.)
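The ordering of rational functions constructed in 10.12 can be tried out on a computer. The following is a minimal sketch, assuming $A = \mathbb{Q}$, polynomials stored as coefficient lists (lowest degree first), and a rational function stored as a pair `(numerator, denominator)`. The helper names (`trim`, `lead`, `less`, and so on) are hypothetical and chosen only for this illustration. A rational function counts as positive when the leading coefficients of its numerator and denominator have the same sign.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (coefficients are stored lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def lead(p):
    """Leading coefficient of p, or 0 for the zero polynomial."""
    p = trim(p)
    return p[-1] if p else 0

def sub(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def is_positive(f):
    """f = (num, den) is positive when lead(num) and lead(den) have the same sign."""
    num, den = f
    return lead(num) * lead(den) > 0

def less(f, g):
    """True if f < g in the ordered field of rational functions."""
    (a, b), (c, d) = f, g
    # g - f = (c*b - a*d) / (d*b)
    return is_positive((sub(mul(c, b), mul(a, d)), mul(b, d)))

def const(c):
    return ([Fraction(c)], [1])

X = ([0, 1], [1])        # the rational function x
INV_X = ([1], [0, 1])    # the rational function 1/x
```

With these definitions, `less(const(n), X)` holds for every natural number `n` one cares to try, which mirrors the observation that $\mathbb{N}$ is bounded above by $x$; likewise `INV_X` is positive but smaller than every `const(Fraction(1, n))`, so the infimum of $\{1/n\}$ cannot be 0 in this field.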
Proposition. Let $\mathbb{F}$ be a chain ordered field. Then (A) $\mathbb{F}$ is Archimedean if and only if (B) each bounded, monotone sequence in $\mathbb{F}$ is Cauchy. (Also see related results in 10.17.)

Hints: For (B) $\Rightarrow$ (A), it suffices to show that the set $S = \{1, \frac{1}{2}, \frac{1}{3}, \ldots\}$ has infimum equal to 0. Suppose that some $b > 0$ is a lower bound for $S$. By the Cauchy criterion, there is some positive integer $M$ such that $j, k \ge M \Rightarrow \left|\frac{1}{j} - \frac{1}{k}\right| < b$. Since $\frac{1}{2M} \in S$, we have $b \le \frac{1}{2M}$. Use $j = M$ and $k = 2M$ to obtain a contradiction.

For (A) $\Rightarrow$ (B), let $(x_n)$ be a bounded, monotone sequence. We may assume that $(x_n)$ is increasing (why?) and that $x_1 \ge 0$ (why?); thus $0 \le x_1 \le x_2 \le x_3 \le \cdots \le b$ for some $b \in \mathbb{F}$. Suppose $(x_n)$ is not Cauchy. Then there exist some $\varepsilon > 0$ in $\mathbb{F}$ and some positive integers $n_1 < p_1 < n_2 < p_2 < \cdots$ such that $x_{p_j} - x_{n_j} > \varepsilon$ for all $j$. For any positive integer $k$, show that
$b \ge x_{p_k} - 0 \ge (x_{p_k} - x_{n_k}) + \cdots + (x_{p_2} - x_{n_2}) + (x_{p_1} - x_{n_1}) > k\varepsilon.$
Thus the set $\{1, 2, 3, \ldots\}$ is bounded above by $b/\varepsilon$.

10.15. Examples and theorems about Archimedean fields.
a. $\mathbb{Q}$ is an Archimedean field.
b. The field constructed in 10.12 is not Archimedean.
c. The multiplicative group $\{x \in \mathbb{Q} : x > 0\}$ is an Archimedean chain ordered group (as defined in 10.3).
d. Existence of the reals. There exists a Dedekind complete, chain ordered field. In fact, if $\mathbb{F}$ is any Archimedean field, then the Dedekind completion of $\mathbb{F}$ is a Dedekind complete, chain ordered field.
Hints: Let $R$ denote the Dedekind completion of $\mathbb{F}$. The completion is unique up to order isomorphism, by 4.38. Show that $\{\xi \in R : \xi > 0\}$ is the (also unique) Dedekind completion of the multiplicative group $\{x \in \mathbb{F} : x > 0\}$. Use 10.5 to define addition and additive inverses in $R$ and to define multiplication and multiplicative inverses in $\{\xi \in R : \xi > 0\}$; thus $\mathbb{F}$ and $\{x \in \mathbb{F} : x > 0\}$ are subgroups of the groups $(R, +)$ and $(\{\xi \in R : \xi > 0\}, \cdot)$ respectively. Extend the definition of multiplication to other products of real numbers by $\xi \cdot \eta = [\mathrm{sgn}(\xi)][\mathrm{sgn}(\eta)]\,|\xi|\,|\eta|$. Show that this makes $R$ into a Dedekind complete, chain ordered field.
The construction of the reals by cuts was published by Dedekind in 1872.
e. Uniqueness of the reals. Let $\mathbb{R}_1$ and $\mathbb{R}_2$ be two Dedekind complete, chain ordered fields. Then $\mathbb{R}_1$ and $\mathbb{R}_2$ contain ring-isomorphic copies of $\mathbb{Q}$, and there is an isomorphism (of rings with unit) from $\mathbb{R}_1$ onto $\mathbb{R}_2$ that leaves elements of $\mathbb{Q}$ fixed and preserves order.
Hints: By 10.13(E), $\mathbb{Q}$ is both sup-dense and inf-dense in $\mathbb{R}_1$. Hence $\mathbb{R}_1$ is a Dedekind completion of $\mathbb{Q}$. Similarly for $\mathbb{R}_2$. By the uniqueness of completions (4.38), there is a unique order isomorphism from $\mathbb{R}_1$ onto $\mathbb{R}_2$ that leaves $\mathbb{Q}$ fixed. By 10.6, that
isomorphism preserves sums. Applying 10.6 to the multiplicative groups of positive elements, we see that that isomorphism also preserves products.
f. (Optional.) Let $\mathbb{F}$ be an ordered field. Show that $\mathbb{F}$ is Archimedean if and only if (after relabeling by isomorphism) we have $\mathbb{Q} \subseteq \mathbb{F} \subseteq \mathbb{R}$, in which case $\mathbb{Q}$ is both sup-dense and inf-dense in $\mathbb{F}$, and $\mathbb{R}$ is the Dedekind completion of $\mathbb{F}$.

10.16. Remarks. Our proof of the uniqueness of $\mathbb{R}$ depends on our use of conventional language and logic. If we change our rules of inference, e.g., if we restrict ourselves to first-order language and logic, as is common in nonstandard analysis, then there may be many different models of the real line (though they may be indistinguishable except through the use of higher-order language and logic). See 14.68.

10.17. (Optional.) Let $\mathbb{F}$ be a chain ordered field. Let $\mathbb{F}$ be equipped with the order interval topology (see 5.15) and the resulting convergence (see 7.41 and 15.41). Then it can be shown that the following conditions are equivalent.
(A) $\mathbb{F}$ is Dedekind complete and thus is the real line.
(B) If $(x_n)$ is any monotone, bounded sequence in $\mathbb{F}$, then $(x_n)$ has a limit in $\mathbb{F}$.
(C) $\mathbb{F}$ is Archimedean, and every Cauchy sequence in $\mathbb{F}$ (defined as in 10.14) has a limit in $\mathbb{F}$.
(D) $\mathbb{F}$ is connected (defined as in 5.12).
(E) For any $a, b \in \mathbb{F}$ with $a \le b$, the set $[a, b] = \{x \in \mathbb{F} : a \le x \le b\}$ is compact (defined as in 17.2).
(F) For any $a, b \in \mathbb{F}$ with $a \le b$, the set $[a, b] = \{x \in \mathbb{F} : a \le x \le b\}$ is pseudocompact (defined as in 17.26.a).
We shall not prove the equivalence. These conditions and others are proved equivalent by Artmann [1988]; that exposition is based in part on Steiner [1966]. Artmann's book also gives an example of a non-Archimedean field in which every Cauchy sequence converges. Thus Dedekind completeness is not the same thing as Cauchy completeness.

THE HYPERREAL NUMBERS

10.18. By the hyperreal line (or the hyperreal number system) we shall mean any non-Archimedean, chain ordered field $\mathbb{H}$ that contains $\mathbb{R}$ as a subfield. Strictly speaking, there are many hyperreal lines. We gave one construction in 10.12; two more constructions are given in 10.19 and 10.20. However, usually we work with just one such field at a time, and so it is convenient to call that field "the" hyperreal number system while we are working with it. Let $\mathbb{H}$ be a hyperreal line; thus $\mathbb{Z} \subsetneq \mathbb{Q} \subsetneq \mathbb{R} \subsetneq \mathbb{H}$. The members of $\mathbb{H}$ are called hyperreal numbers. Then:
a. Elements of $\mathbb{R}$ are called real numbers, or sometimes (for emphasis) standard real numbers. A hyperreal number $\xi$ is called bounded if $-r < \xi < r$ for some real number $r$; otherwise $\xi$ is unbounded. (Other terms commonly used in place of "bounded" are limited and hyperfinite.) Clearly, any real number is bounded. Show that some hyperreal number $\alpha$ is unbounded. Then $\pm\alpha$, $\pm 2\alpha$, $\pm 3\alpha$, etc., are different unbounded hyperreal numbers, and $\alpha^2$ is yet another one. The set of positive unbounded hyperreal numbers has no largest or smallest member. The set of bounded hyperreals is a commutative ring with unit.
b. A hyperreal number $\xi$ is called infinitesimal if $-r < \xi < r$ for every positive real number $r$. (Some mathematicians exclude 0 when they define infinitesimal, but that definition has the disadvantage that the resulting set of infinitesimals does not have such a nice algebraic structure.) Show that 0 is the only real infinitesimal. Show that a nonzero hyperreal number $\xi$ is infinitesimal if and only if $1/\xi$ is unbounded. The set of positive infinitesimal numbers has no largest or smallest member. The set of all infinitesimals is an ordered ring (without unit).
c. Every positive real number is an upper bound for the set of infinitesimals. Show that the set of infinitesimals does not have a least upper bound in $\mathbb{H}$. This illustrates the fact (which we already knew) that $\mathbb{H}$ is not Dedekind complete.
d. Two hyperreal numbers are said to be infinitely close (or infinitesimally close) if their difference is an infinitesimal. Let $\xi$ be a bounded hyperreal number. Show that there is one and only one real number $r$ that is infinitely close to $\xi$. That number $r$ is called the standard part of $\xi$; we may abbreviate it by $\mathrm{std}(\xi)$. Hint: To show that there is at least one such number, use the Dedekind completeness of $\mathbb{R}$ to prove that there is a real number $r = \inf\{s \in \mathbb{R} : s > \xi\}$; then show it has the required properties.
Thus, nestled around each real number $r$ there are infinitely many bounded hyperreal numbers, all infinitely close to that real number $r$. In some books, some of these hyperreal numbers are denoted by $r + \varepsilon$, $r - \varepsilon$, $r + \delta$, $r - \delta$, etc.; a picture of a microscope is sometimes used to suggest their closeness to $r$.
e. Show that {bounded hyperreals} = {real numbers} $\oplus$ {infinitesimals} is an internal direct sum decomposition of one additive group into two subgroups. Show that std : {bounded hyperreals} $\to$ {real numbers} is an isotone map (for the ordering) and a ring homomorphism.
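The notions of bounded, unbounded, infinitesimal, and standard part can be made concrete in the field of rational functions from 10.12, where the variable $x$ is larger than every constant. The following is only a toy sketch: it assumes rational coefficients in place of real ones, stores polynomials as coefficient lists (lowest degree first), and uses the hypothetical names `degree` and `classify`. In that field, a quotient of polynomials is unbounded when the numerator has higher degree, infinitesimal when it has lower degree, and otherwise has standard part equal to the quotient of the leading coefficients.

```python
from fractions import Fraction

def degree(p):
    """Degree of a coefficient list (lowest degree first); -1 for the zero polynomial."""
    for i in range(len(p) - 1, -1, -1):
        if p[i] != 0:
            return i
    return -1

def classify(num, den):
    """Classify num/den, viewing x as infinitely large.

    Returns (kind, standard_part). Infinitesimals are also bounded;
    the label "bounded" is used here for the non-infinitesimal bounded case.
    """
    dn, dd = degree(num), degree(den)
    if dd < 0:
        raise ZeroDivisionError("denominator is the zero polynomial")
    if dn < dd:                      # includes the zero function (dn == -1)
        return "infinitesimal", Fraction(0)
    if dn > dd:
        return "unbounded", None     # no standard part
    return "bounded", Fraction(num[dn]) / Fraction(den[dd])
```

For instance, `classify([1, 0, 3], [-5, 0, 1])` examines $(3x^2 + 1)/(x^2 - 5)$, which differs from the real number 3 by an infinitesimal.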
Remark. Different constructions may yield slightly different hyperreal number systems. For instance, one of the sequences $\alpha = (1, 0, 1, 0, 1, 0, \ldots)$ or $\beta = (0, 1, 0, 1, 0, 1, \ldots)$ is equivalent to the real number 0 and the other is equivalent to the real number 1. We can choose which is which, if we want to, by specifying $\mathcal{F}$ in more detail. Indeed, let $E$ = {odd numbers} = $\{1, 3, 5, \ldots\}$, and let $F$ = {even numbers} = $\{2, 4, 6, \ldots\}$. Let $\mathcal{C}$ = {cofinite subsets of $\mathbb{N}$}. Using results of Chapter 5, we can obtain free ultrafilters $\mathcal{E}$, $\mathcal{F}$ on $\mathbb{N}$ that satisfy $\mathcal{E} \supseteq \{E\} \cup \mathcal{C}$ and $\mathcal{F} \supseteq \{F\} \cup \mathcal{C}$. In the hyperreal line $\mathbb{R}^{\mathbb{N}}/\mathcal{E}$ we have $\pi(\alpha) = 1$ and $\pi(\beta) = 0$; in the hyperreal line $\mathbb{R}^{\mathbb{N}}/\mathcal{F}$ we have $\pi(\alpha) = 0$ and $\pi(\beta) = 1$. See also the related remarks in 10.21.

10.19. It can be shown that the smallest hyperreal line is the field $\mathbb{R}(x)$, of rational functions in one variable with real coefficients, constructed as in 10.12. Indeed, if we take the real line and adjoin some element $x$ that is infinitely large, then $x$ acts as a transcendental over $\mathbb{R}$ and hence acts algebraically as an indeterminate, i.e., as a variable. The resulting field generated by $\mathbb{R} \cup \{x\}$ must then be $\mathbb{R}(x)$. This is discussed by Fleischer [1967]. Fleischer also points out that, although we may not be able to extend quite as many functions in the setting of this field as we did in Chapter 9, at least we can extend some functions constructively.

10.20. Ultrapowers of the reals. Let $\mathcal{F}$ be a proper filter on a set $\Lambda$. Define the reduced power ${}^*\mathbb{R} = \mathbb{R}^{\Lambda}/\mathcal{F}$ and its arithmetical operations and ordering as in Chapter 9. Then ${}^*\mathbb{R}$ is a ring with unit by 9.53 since $\mathbb{R}$ is a ring with unit. In fact, ${}^*\mathbb{R}$ is a commutative lattice algebra since $\mathbb{R}$ is. (Similarly, the hypernatural numbers ${}^*\mathbb{N}$ inherit some of the properties of $\mathbb{N}$.) Show that
a. ${}^*\mathbb{R}$ is a field if and only if $\mathcal{F}$ is an ultrafilter; also, ${}^*\mathbb{R}$ is chain ordered if and only if $\mathcal{F}$ is an ultrafilter. Hints: If $\mathcal{F}$ is not an ultrafilter, then $\Lambda$ can be partitioned into sets $\Lambda_1$ and $\Lambda_2$, neither of which is an element of $\mathcal{F}$. Let $\alpha_1$ and $\alpha_2$ be their characteristic functions. Show that neither $\alpha_1$ nor $\alpha_2$ is equivalent to 0, but their product $\alpha_1\alpha_2$ is 0.
b. Suppose $\Omega$, $\mathcal{F}$ satisfy the conditions of the Enlargement Principle (9.54), and $\mathbb{R} \subseteq \Omega$. Then ${}^*\mathbb{R}$ is non-Archimedean. Hint: Let $\mathcal{E} = \{S \subseteq \mathbb{R} : S \supseteq (n, +\infty)$ for some positive integer $n\}$. Then $\mathcal{E}$ is a proper filter on $\mathbb{R}$. Show that any member of $\bigcap_{E \in \mathcal{E}} {}^*E$ is an upper bound for $\mathbb{N}$.
c. Assume that $\mathcal{F}$ is a free ultrafilter on the set $\Lambda = \mathbb{N}$. Then ${}^*\mathbb{R} = \mathbb{R}^{\mathbb{N}}/\mathcal{F}$ is a non-Archimedean, chain ordered field. In fact, the equivalence class of the sequence $(1, 2, 3, \ldots)$ is an upper bound for $\mathbb{N}$ in ${}^*\mathbb{R}$. Every countable set $S \subseteq {}^*\mathbb{R}$ is order bounded. Hint: Let $S = \{\pi(f_1), \pi(f_2), \pi(f_3), \ldots\}$, where each $f_n$ is a function from $\mathbb{N}$ into $\mathbb{R}$. Define functions $u, v : \mathbb{N} \to \mathbb{R}$ by taking
$u(k) = \min\{f_1(k), f_2(k), \ldots, f_k(k)\}, \qquad v(k) = \max\{f_1(k), f_2(k), \ldots, f_k(k)\}.$
For each $j \in \mathbb{N}$, show that $C_j = \{k \in \mathbb{N} : u(k) \le f_j(k) \le v(k)\}$ is cofinite, hence a member of $\mathcal{F}$, and thus $\pi(u) \le \pi(f_j) \le \pi(v)$. Therefore $\pi(u)$ and $\pi(v)$ are lower and upper bounds for $S$.

Further remark. Contrasting the result above with 10.12, we see that $\mathbb{R}^{\mathbb{N}}/\mathcal{F}$ is not isomorphic to the minimal hyperreal line $\mathbb{R}(x)$ discussed in 10.19. This observation is taken from Fleischer [1967].

10.21. Further remarks. (Optional.) Let $\mathcal{F}$ be a free ultrafilter on a set $\Lambda$, and define the ultrapower ${}^*\mathbb{R} = \mathbb{R}^{\Lambda}/\mathcal{F}$ as in 10.20. Does it follow that ${}^*\mathbb{R}$ is non-Archimedean? It does under certain additional hypotheses, as in 10.20, but in general the answer is not clear. For any free ultrafilter $\mathcal{F}$ the following conditions are equivalent:
(A) ${}^*\mathbb{R}$ is Archimedean.
(B) ${}^*\mathbb{R} = \mathbb{R}$.
(C) ${}^*S = S$ for every set $S \subseteq \mathbb{R}$.
(D) ${}^*\mathbb{N} = \mathbb{N}$.
(E) Whenever $\mathcal{S} = \{S_1, S_2, S_3, \ldots\}$ is a countably infinite partition of $\Lambda$, then a unique member of $\mathcal{S}$ belongs to $\mathcal{F}$.
(F) Whenever $F_1, F_2, F_3, \ldots$ is a sequence in $\mathcal{F}$, then $\bigcap_{n=1}^{\infty} F_n \in \mathcal{F}$. In other words, the ideal of sets $\{\Lambda \setminus F : F \in \mathcal{F}\}$ is a $\sigma$-ideal.

Proof of equivalence. For (A) $\Rightarrow$ (B), observe that $\mathbb{R} \subseteq {}^*\mathbb{R}$ in any case, and an ordered field $\mathbb{F}$ is Archimedean if and only if $\mathbb{Q} \subseteq \mathbb{F} \subseteq \mathbb{R}$ (see 10.15.f). For (B) $\Rightarrow$ (C), use the results of Chapter 9. Implication (C) $\Rightarrow$ (D) is obvious. For (D) $\Rightarrow$ (E), suppose that no member of $\mathcal{S}$ belongs to $\mathcal{F}$; define a function $f : \Lambda \to \mathbb{N}$ by taking $f(\lambda) = n$ when $\lambda \in S_n$. Since $f$ is equivalent to some constant $k$, the set $S_k$ is a member of $\mathcal{F}$, a contradiction. For (E) $\Rightarrow$ (F), we may assume that the sets $F_n$ are decreasing with $F_1 = \Lambda$; let $S_0 = \bigcap_n F_n$ and $S_n = F_n \setminus F_{n+1}$. Then all the sets $\Lambda \setminus S_n$ (for $n \ge 1$) belong to $\mathcal{F}$, so the unique member of the partition that belongs to $\mathcal{F}$ must be $S_0$. For (F) $\Rightarrow$ (A), suppose ${}^*\mathbb{R}$ is not Archimedean. Then $\mathbb{N}$ is bounded above by some $\xi \in {}^*\mathbb{R}$. Then $\xi$ is the equivalence class of some function $f : \Lambda \to \mathbb{R}$, such that the set $F_n = \{\lambda \in \Lambda : n < f(\lambda)\}$ is a member of $\mathcal{F}$ for each $n \in \mathbb{N}$. Hence their intersection belongs to $\mathcal{F}$, but that intersection is empty, a contradiction. This result can be found in Takeuchi [1984] and elsewhere.

Further remark. Do there exist any filters $\mathcal{F}$ satisfying those conditions (A)-(F) above? That is a famous problem in set theory. To discuss it we need a few definitions: An ultrafilter $\mathcal{F}$ on a set $\Lambda$ is said to be $\alpha$-complete if, whenever $\mathcal{C} \subseteq \mathcal{F}$ with $\mathrm{card}(\mathcal{C}) \le \alpha$, then $\bigcap_{C \in \mathcal{C}} C$ is a member of $\mathcal{F}$. An $\omega$-measurable cardinal is a cardinal $\alpha$ with this property: $\alpha$ is uncountable and there exists a free ultrafilter $\mathcal{F}$ on some set $\Lambda$ with cardinality $\alpha$ such that $\mathcal{F}$ is $\mathrm{card}(\mathbb{N})$-complete. A measurable cardinal is a cardinal $\alpha$ with this property: $\alpha$ is uncountable and there exists a free ultrafilter $\mathcal{F}$ on some set with cardinality $\alpha$ such that $\mathcal{F}$ is $\beta$-complete for every cardinality $\beta < \alpha$. Clearly, a filter satisfying condition (F) above exists if and only if an $\omega$-measurable cardinal exists, and it is shown by Bell and Slomson [1969] that an $\omega$-measurable cardinal exists if and only if a measurable cardinal exists. However,
the existence or nonexistence of a measurable cardinal cannot be proved in conventional set theory. More precisely, the following results are known:
(i) The consistency of ZF implies the consistency of ZF + AC + "there does not exist a measurable cardinal."
(ii) The axiom system ZF + AC + "there exists a measurable cardinal" is empirically consistent, but its consistency is not implied by Con(ZF).
These results can be found in Kunen [1980] and other books on formal logic.

QUADRATIC EXTENSIONS AND THE COMPLEX NUMBERS

10.22. Let $\mathbb{F}$ be a field, and suppose $q$ is an element of $\mathbb{F}$ that is not a square; i.e., assume $q \in \mathbb{F}$ and suppose there is no solution $x \in \mathbb{F}$ for the equation $x^2 = q$. (Examples are given in Chapter 8 and in 10.23.) Fix any such $q$. Let $\mathbb{F}(\sqrt{q})$ represent the set $\mathbb{F} \times \mathbb{F}$, equipped with binary operations defined as follows:
addition: $(a_1, b_1) + (a_2, b_2) = (a_1 + a_2,\ b_1 + b_2)$
multiplication: $(a_1, b_1)(a_2, b_2) = (a_1 a_2 + q b_1 b_2,\ a_1 b_2 + a_2 b_1)$
where the expressions $a_1 a_2 + q b_1 b_2$, etc., are computed using the arithmetic rules of $\mathbb{F}$. Verify that $\mathbb{F}(\sqrt{q})$ is a field when equipped with these binary operations; the additive and multiplicative identities are $(0, 0)$ and $(1, 0)$, respectively, and the multiplicative inverse of $(a, b)$ is
$\left(\dfrac{a}{a^2 - q b^2},\ \dfrac{-b}{a^2 - q b^2}\right)$ when $(a, b) \ne (0, 0)$.
(This makes sense since $(a, b) \ne (0, 0) \Rightarrow a^2 - q b^2 \ne 0$.)

Furthermore, the mapping $a \mapsto (a, 0)$ is an injective homomorphism from $\mathbb{F}$ into $\mathbb{F}(\sqrt{q})$. Thus we may view $\mathbb{F}$ as a subset of $\mathbb{F}(\sqrt{q})$. If we write $(a, 0)$ as $a$ and $(0, b)$ as $b\sqrt{q}$, then $(a, b)$ may be written as $a + b\sqrt{q}$ with all the usual rules of arithmetic being preserved. Then $\mathbb{F}(\sqrt{q})$ has additive identity 0 and multiplicative identity 1. We have extended our original field to a larger field in which the equation $x^2 = q$ does have a solution; the solution is the "number" $(0, 1) = \sqrt{q}$. Exercise. Show that the only other solution of $x^2 = q$ in $\mathbb{F}(\sqrt{q})$ is the "number" $(0, -1) = -\sqrt{q}$.

(The beginner is urged to use the ordered pair notation $(a, b)$ rather than the more familiar notation $a + b\sqrt{q}$, to reduce the likelihood of assuming something that has not already been proved.)

10.23. Examples.
a. $\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} : a, b \in \mathbb{Q}\}$ is a subfield of $\mathbb{R}$.
b. Let $m$ be an odd prime. As we noted in Chapter 8, there are elements $q \in \mathbb{Z}_m$ for which $x^2 = q$ has no solution in $\mathbb{Z}_m$. Then $\mathbb{Z}_m(\sqrt{q})$ is a field containing exactly $m^2$ elements. Exercise. Construct addition and multiplication tables for the field with 9 elements.
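The construction of 10.22 can be checked by machine in the smallest case of example 10.23.b. The sketch below takes $m = 3$ and $q = 2$ (the element 2 is not a square in $\mathbb{Z}_3$) and uses the ordered pair notation recommended above; the names `M`, `Q`, `add`, `mul`, and `inv` are chosen for this illustration only. It relies on Python's built-in `pow(d, -1, M)` for modular inverses, which needs Python 3.8 or later.

```python
M, Q = 3, 2   # work in Z_3; 2 is not a square modulo 3

def add(u, v):
    """(a1, b1) + (a2, b2) = (a1 + a2, b1 + b2), computed in Z_M."""
    return ((u[0] + v[0]) % M, (u[1] + v[1]) % M)

def mul(u, v):
    """(a1, b1)(a2, b2) = (a1*a2 + Q*b1*b2, a1*b2 + a2*b1), computed in Z_M."""
    (a1, b1), (a2, b2) = u, v
    return ((a1 * a2 + Q * b1 * b2) % M, (a1 * b2 + a2 * b1) % M)

def inv(u):
    """Multiplicative inverse (a/(a^2 - Q*b^2), -b/(a^2 - Q*b^2)) of a nonzero pair."""
    a, b = u
    d = (a * a - Q * b * b) % M
    d_inv = pow(d, -1, M)          # raises ValueError when d == 0, i.e. when u == (0, 0)
    return ((a * d_inv) % M, (-b * d_inv) % M)

elements = [(a, b) for a in range(M) for b in range(M)]

# A multiplication table for the field with 9 elements, as a dictionary.
mul_table = {(u, v): mul(u, v) for u in elements for v in elements}
```

Printing `mul_table` row by row gives one possible answer to the exercise; an addition table can be produced from `add` in the same way.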
10.24. Example. The complex numbers are the quadratic extension field $\mathbb{R}(\sqrt{-1})$, formed from $\mathbb{R}$ by the construction of 10.22; this field is usually denoted by $\mathbb{C}$. The number $\sqrt{-1}$ is usually written as $i$. Thus complex numbers can be written as $x + iy$ or $x + yi$, where $x, y$ are real. The complex number $\alpha = x + iy$ has real part, imaginary part, and complex conjugate defined by
$\mathrm{Re}\,\alpha = x, \qquad \mathrm{Im}\,\alpha = y, \qquad \bar{\alpha} = x - iy.$
(Caution: $\mathrm{Im}\,\alpha$ is the real number $y$, not $iy$.) Some mathematicians use overlines for other purposes than complex conjugation, e.g., set complementation or topological closure.

The spaces $\mathbb{C}$ and $\mathbb{R}^2$ are isomorphic, when considered as real vector spaces: They yield the same results for addition and for multiplication by a real number. Any complex-valued function of a complex variable can be rewritten as an $\mathbb{R}^2$-valued function of a variable in $\mathbb{R}^2$. If $w = f(z)$, we may write $z = x + iy$ and $w = u + iv = f(x + iy)$, where $u, v, x, y \in \mathbb{R}$. The letters $f, u, v, w, x, y, z$ are customarily used in the literature in precisely this arrangement. For instance, if $w = f(z) = z^3$, then
$u + iv = (x + iy)^3 = x^3 + 3ix^2 y - 3xy^2 - iy^3,$
so $u(x, y) = x^3 - 3xy^2$ and $v(x, y) = 3x^2 y - y^3$.

10.25. Exercise. The 2-by-2 matrices of the form $\begin{pmatrix} x & y \\ -y & x \end{pmatrix}$ with $x, y \in \mathbb{R}$, equipped with matrix addition and multiplication, form a field that is isomorphic to $\mathbb{C}$; the matrix above corresponds to the complex number $x + iy$. More generally, let $\mathbb{F}$ be a field, and let $q$ be an element of $\mathbb{F}$ that has no square root in $\mathbb{F}$. Then the 2-by-2 matrices of the form $\begin{pmatrix} x & y \\ qy & x \end{pmatrix}$ with $x, y \in \mathbb{F}$, equipped with matrix addition and multiplication, form a field that is isomorphic to the quadratic extension field $\mathbb{F}(\sqrt{q})$, with the matrix above corresponding to $x + y\sqrt{q}$.

10.26. For many purposes it is convenient to represent complex numbers as points in the plane, with the real part being the distance to the right of the origin, and the imaginary part being the distance up from the origin. This representation is sometimes known as an Argand diagram. See the following illustration. If $\alpha$ is a complex number, then $\bar{\alpha}$ is the reflection of $\alpha$ in the horizontal coordinate axis. This illustration also shows the polar coordinate representation: If $r$ is the distance from 0 to $\alpha$ and $\theta$ is the angle from the positive real axis to the line between 0 and $\alpha$, then $\alpha = r\cos\theta + ir\sin\theta$. The real number $\theta$ is sometimes called the argument of the complex number $\alpha$.

10.27. Historical remarks. Calculations with complex numbers were performed long before such numbers were properly understood or fully accepted. Cardan showed that the quadratic equation $x(10 - x) = 40$ has the two solutions $5 + \sqrt{-15}$ and $5 - \sqrt{-15}$, without any clear understanding of what such numbers could mean. Euler, around 1750, wrote that such computations are a method for showing that the equation $x(10 - x) = 40$ has no
solutions. Some mathematicians took this attitude: Of course there is no square root of a negative number, but if there were such a thing, what algebraic properties should it have? Around 1800, writings of Argand and Gauss gave our present geometrical interpretation of complex numbers as points in the plane. Finally, in 1830, William R. Hamilton published a paper explaining complex numbers in terms of ordered pairs; probably that is the simplest starting point for mathematicians learning about complex numbers today. Some accounts of the history of this subject are given by Kline [1990] and Tietze [1965].

[Figure: Argand diagram, with axes Re and Im, showing the point $\alpha = x + iy = r(\cos\theta + i\sin\theta)$.]

10.28. The rules for addition and multiplication of complex numbers are the same as the rules given in 10.22, with $q$ taken to be $-1$. Thus,
$(a_1 + ib_1) + (a_2 + ib_2) = (a_1 + a_2) + i(b_1 + b_2), \qquad (*)$
$(a_1 + ib_1)(a_2 + ib_2) = (a_1 a_2 - b_1 b_2) + i(a_1 b_2 + a_2 b_1). \qquad (**)$
The rule for addition is fairly simple; it is the same as the addition of vectors in $\mathbb{R}^2$ (see Chapter 11). The rule for multiplication, $(**)$, is more complicated, and may seem rather arbitrary to the beginner. However, it becomes much more natural when interpreted geometrically with polar coordinates. Let $\mathrm{cis}(\theta)$ denote $\cos(\theta) + i\sin(\theta)$. Using $(**)$ and some basic trigonometric identities, verify that the product of the complex numbers $r_1\,\mathrm{cis}(\theta_1)$ and $r_2\,\mathrm{cis}(\theta_2)$ is the complex number $(r_1 r_2)\,\mathrm{cis}(\theta_1 + \theta_2)$. Thus, to multiply two complex numbers, we multiply the radii and add the angles. Arithmetic with complex numbers may be viewed as transformations of the plane: Addition is a translation, while multiplication is a rotation and stretching.

10.29. It follows that $[r\,\mathrm{cis}(\theta)]^n = r^n\,\mathrm{cis}(n\theta)$ for integers $n$. This is known as De Moivre's formula. Reversing the process described above, we find that the $n$th roots of any complex number $r\,\mathrm{cis}(\theta)$ are
$r^{1/n}\,\mathrm{cis}\!\left(\dfrac{\theta + 2\pi j}{n}\right) \qquad (j = 0, 1, 2, \ldots, n - 1).$
These are points equally spaced along a circle centered at 0 with radius $r^{1/n}$.
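The root formula of 10.29 translates directly into a few lines of code. The following is a minimal numerical sketch using Python's standard `cmath` module; the function name `nth_roots` is an assumption made for this illustration, and the results are floating-point approximations rather than exact values.

```python
import cmath
import math

def nth_roots(z, n):
    """Return the n complex nth roots of a nonzero complex number z.

    Writes z = r cis(theta) and applies the formula
    r**(1/n) * cis((theta + 2*pi*j)/n) for j = 0, 1, ..., n-1.
    """
    r, theta = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * math.pi * j) / n)
            for j in range(n)]

# De Moivre's formula, checked numerically for one choice of r, theta, n.
r, theta, n = 2.0, 0.3, 5
lhs = cmath.rect(r, theta) ** n
rhs = cmath.rect(r ** n, n * theta)
```

For example, `nth_roots(-8, 3)` returns approximations to the three cube roots of $-8$, one of which is $-2$; the values `lhs` and `rhs` agree up to rounding error.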
The reader is cautioned that familiar properties of $\sqrt{z}$ or $\sqrt[3]{z}$ for positive real numbers $z$ do not always extend to complex numbers $z$. For instance, $\sqrt{p}\sqrt{q} = \sqrt{pq}$ is valid for $p, q > 0$. However, the computation
$-1 = \sqrt{-1}\,\sqrt{-1} = \sqrt{(-1)(-1)} = \sqrt{1} = 1$
is clearly incorrect. The notations $\sqrt{z}$ and $\sqrt[3]{z}$ are too ambiguous and imprecise for some computations with complex numbers, since each complex number other than 0 has two square roots and three cube roots. Indeed, the reader who makes an error of this sort is in good company: Euler made some similar mistakes, in the years before complex numbers were well understood.

10.30. Polynomial equations. The discussion in 10.29 shows that for any complex number $\alpha$ other than 0, we have $n$ distinct solutions $z$ to the polynomial equation $z^n - \alpha = 0$. What about other polynomial equations? In 17.36 we shall prove that every nonconstant polynomial (in a single complex variable, with complex coefficients) has at least one complex root; in fact, every polynomial of degree $n$ has exactly $n$ complex roots, counting multiplicities of multiple roots. Power series and analytic functions, which are like "polynomials of infinite degree," have a more complicated theory, which will be considered briefly in 22.23 and 25.27. We can actually give formulas for the roots of the simplest polynomials:

a. The quadratic equation $az^2 + bz + c = 0$ (with $a \ne 0$) has solutions given by the quadratic formula, $z = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$. Every student learns this in high school, at least, when $a, b, c$ are real numbers and $b^2 - 4ac \ge 0$. Actually, we can apply this formula with any complex numbers $a, b, c$ (with $a \ne 0$), since 10.29 tells us how to find square roots of $b^2 - 4ac$. The quadratic formula yields two distinct complex solutions $z$, or one solution repeated if $b^2 - 4ac = 0$.

b. An analogous formula, involving square roots and cube roots, can be given for the cubic equation $az^3 + bz^2 + cz + d = 0$ (with $a \ne 0$), but it is a bit more complicated. It was published by Cardan in 1545. Divide through by $a$; thus we may assume $a = 1$. Substitute $z = w - \frac{b}{3}$. Some cancellation occurs. The resulting equation can be rewritten in the more convenient form $w^3 + 3ew - 2f = 0$, with known constants
$e = \tfrac{1}{3}c - \tfrac{1}{9}b^2, \qquad f = -\tfrac{1}{27}b^3 + \tfrac{1}{6}bc - \tfrac{1}{2}d.$
Now make another substitution, taking $w = \zeta - e\zeta^{-1}$. Again some cancellation occurs. The resulting equation $\zeta^3 - e^3\zeta^{-3} - 2f = 0$ can be rewritten as $\zeta^6 - 2f\zeta^3 - e^3 = 0$. This is a quadratic in $\zeta^3$. Solve it as in 10.30.a; thus $\zeta^3 = f \pm \sqrt{f^2 + e^3}$. Then find $\zeta$, as in 10.29. This looks like six values, but there is some repetition and we only end up with three distinct solutions. This leads, finally, to $z = \zeta - \dfrac{e}{\zeta} - \dfrac{b}{3}$. This complicated procedure is seldom used in applications. Actually, numerical approximation methods do not depend on the use of these formulas.
c. A still more complicated formula or method yields the solution of the quartic equation $z^4 + az^3 + bz^2 + cz + d = 0$. This problem was solved by Cardan's student and published in Cardan's book in 1545. Here is one description of the method: By completing the square, we may rewrite the given equation as
$\left(z^2 + \tfrac{1}{2}az\right)^2 = \left(\tfrac{1}{4}a^2 - b\right)z^2 - cz - d.$
For any constant $r$ (to be specified), we have
$\left(z^2 + \tfrac{1}{2}az + r\right)^2 = \left(\tfrac{1}{4}a^2 - b + 2r\right)z^2 + (ar - c)z + (r^2 - d),$
which we shall rewrite as $\left(z^2 + \tfrac{1}{2}az + r\right)^2 = Az^2 + Bz + C$. Now, an expression of the form $Az^2 + Bz + C$ is a perfect square, i.e., an expression of the form $A(z - D)^2$, if the constants $A, B, C$ satisfy $B^2 = 4AC$, and in that case we have $D = -\dfrac{B}{2A}$. If we can choose a constant $r$ to satisfy this condition, then we will have $\left(z^2 + \tfrac{1}{2}az + r\right)^2 = A(z - D)^2$, which can be rewritten as $z^2 + \tfrac{1}{2}az + r = \pm\sqrt{A}\,(z - D)$; that is, two quadratic equations, which we can solve for $z$ as in 10.30.a. It remains only to find a value of $r$ that satisfies $B^2 = 4AC$. This equation, written in more detail, is
$(ar - c)^2 = 4\left(\tfrac{1}{4}a^2 - b + 2r\right)(r^2 - d),$
a third-degree equation for $r$, which we can solve as in 10.30.b.

d. Some quintic (or fifth-degree) equations have solutions that can be expressed in terms of fifth roots. For instance, one of the solutions of $x^5 + 20x + 32 = 0$ is the number
$-\tfrac{1}{5}\sqrt[5]{2500\sqrt{5} + 250\sqrt{50 - 10\sqrt{5}} - 750\sqrt{50 + 10\sqrt{5}}}$
$+\ \tfrac{1}{5}\sqrt[5]{2500\sqrt{5} - 250\sqrt{50 + 10\sqrt{5}} - 750\sqrt{50 - 10\sqrt{5}}}$
$+\ \tfrac{1}{5}\sqrt[5]{2500\sqrt{5} + 250\sqrt{50 + 10\sqrt{5}} + 750\sqrt{50 - 10\sqrt{5}}}$
$-\ \tfrac{1}{5}\sqrt[5]{2500\sqrt{5} - 250\sqrt{50 - 10\sqrt{5}} + 750\sqrt{50 + 10\sqrt{5}}}$
(taken from [Wolfram 1994]). Examples like this could lead one to expect that the general fifth-degree equation, like the equations of lower degree, could be solved by a
For instance. But in 1826 Abel proved that such a formula is impossible. so the roots of 2x 5 . and quotients).x 5 + x are easy to figure out: That polynomial has derivative p~(x) . In 1858 Hermite. For most choices of a. one such scheme is Newton's Method. which can be found in every modern textbook on calculus. These numerical schemes do not yield exact solutions. and we shall not give it here. d.)is \ ] a function of five variables that can be expressed entirely in terms of radicals (i. the relationships between the roots and the other numbers present in the problem. and so p is strictly increasing and gives a bijection from I~ onto ]K. In 1844 Eisenstein solved quintic equations in terms of radicals and what we shall call the E i s e n s t e i n f u n c t i o n in the paragraph below. are of interest because they reveal the p a t t e r n o{ the roots i.36) that every fifthdegree polynomial equation with complex coefficients has five complex roots. Eisenstein.E i s ( q ( a .2x + 1 cannot be represented in terms of radicals. For additional information about some of these solutions. Hermite. That formula can be produced by methods described by Stillwell [1995].. The formulas of Cardan. the quintic equation x 5 + x . et al. and a few years later Galois developed a theory that describes exactly when a polynomial is solvable by radicals.e. That is important for theoretical purposes and ultimately has some effect on engineering problems as well. Let the inverse of that function p be denoted by E i s ( x ) .2x + 1 has Galois group $5. in terms of nth roots for n _< 5. For instance. how can the roots be represented? Radicals are not enough. where q(. eo Nowadays. c. By an a b s o l u t e v a l u e on . we shall call it the E i s e n s t e i n f u n c t i o n . roots produced by a numerical scheme may have no apparent rhyme or reason.a cannot be solved by radicals. in 1877 Klein solved quintic equations in terms of radicals and the hypergeometric function. 
the polynomial 2x 5 . see Wolfram [1994] or Shurman [1995]. Still. which is not solvable. Then it can be shown that a solution of x 5 + a x 4 + bx 3 + c x 2 + d x + e . but they yield solutions to as much accuracy as one wishes. differences.Absolute Values 259 formula in terms of radicals. Let X be a field (not necessarily ordered). Definitions. more functions are needed. However. ABSOLUTE VALUES 10. The function q can be expressed in closed form. However.. Mathematicians sought such a formula for many years.31.e. we know by the Fundamental Theorem of Algebra (17.0 is given by x . 10 decimal places of accuracy is more accuracy than a n y engineering problem will ever require. when one wants to solve a polynomial equation of degree higher than two. and Brioschi solved quintic equations in terms of radicals and elliptic modular functions.5x 4 + 1 _> 1. products. generally one uses a numerical iterative scheme on an electronic computer. they may seem to be arranged entirely at random. Kronecker. but the formula is extremely long. e ) l . b. some basic properties of the polynomial p ( x ) . together with sums. If they cannot be represented using radicals.
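The remarks in 10.30 about radical formulas, Newton's Method, and the Eisenstein function can be illustrated numerically. The following Python sketch is only an illustration: the function names (`fifth_root`, `radical_root`, `newton`, `eis`), the starting points, and the tolerances are assumptions made for this example, not part of the text. It evaluates a four-term radical expression for a root of $x^5 + 20x + 32 = 0$, compares it with a Newton iteration, and approximates $\mathrm{Eis}$ by applying Newton's Method to $x^5 + x - y = 0$.

```python
import math

def fifth_root(y):
    """Real fifth root of a real number (also valid for negative y)."""
    return math.copysign(abs(y) ** 0.2, y)

def radical_root():
    """Evaluate a four-term radical expression for a root of x^5 + 20x + 32."""
    s5 = math.sqrt(5.0)
    u = math.sqrt(50 - 10 * s5)
    v = math.sqrt(50 + 10 * s5)
    t1 = fifth_root(2500 * s5 + 250 * u - 750 * v)
    t2 = fifth_root(2500 * s5 - 250 * v - 750 * u)
    t3 = fifth_root(2500 * s5 + 250 * v + 750 * u)
    t4 = fifth_root(2500 * s5 - 250 * u + 750 * v)
    return (-t1 + t2 + t3 - t4) / 5

def newton(f, df, x0, tol=1e-12, max_iter=200):
    """Newton's Method: repeat x <- x - f(x)/f'(x) until the step is tiny."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")

def eis(y):
    """Numerical inverse of p(x) = x^5 + x (hypothetical helper for Eis)."""
    return newton(lambda x: x**5 + x - y, lambda x: 5 * x**4 + 1, 0.0)

if __name__ == "__main__":
    r = radical_root()
    n = newton(lambda x: x**5 + 20 * x + 32, lambda x: 5 * x**4 + 20, -1.0)
    print(r, n, r**5 + 20 * r + 32)
    print(eis(2.0))  # p(1) = 2, so this should be close to 1
```

The numerical scheme gives the root to many decimal places but says nothing about its structure, whereas the radical expression displays that structure explicitly; this is the contrast drawn in 10.30.e.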
ABSOLUTE VALUES

10.31. Definitions. Let $X$ be a field (not necessarily ordered). By an absolute value on $X$ we mean a mapping $|\ | : X \to [0, +\infty)$ satisfying

$|x| = 0 \iff x = 0$ (positive-definiteness)
$|xy| = |x|\,|y|$ (multiplicativeness)
$|x + y| \le |x| + |y|$ (subadditivity)

for all $x, y \in X$. These properties imply also $|1| = 1$ (exercise). An absolute value is also known as a modulus or magnitude or value or valuation.

The absolute value thus defined on fields should not be confused with the absolute value defined in 8.39 for lattice groups. Fortunately, the two notions do coincide in the case where the field or lattice group is $\mathbb{R}$ (or any subfield of $\mathbb{R}$); in that case $|x|$ is just $\max\{x, -x\}$. It is the usual absolute value on $\mathbb{R}$. It will always be used on $\mathbb{R}$, except if some other arrangement is specified.

10.32. The absolute value of a complex number. For real numbers $x$ and $y$, define $|x + iy| = \sqrt{x^2 + y^2}$. Thus $|\alpha| = \sqrt{(\mathrm{Re}\,\alpha)^2 + (\mathrm{Im}\,\alpha)^2}$ for any complex number $\alpha$. Show that

a. $|\ |$ is an absolute value. It is the usual absolute value on $\mathbb{C}$. It will always be used on $\mathbb{C}$, except when some other arrangement is specified. Hints: If $\alpha = r\,\mathrm{cis}\,\theta$ for some real number $\theta$ and positive number $r$ (as in 10.28), then $|\alpha| = r$. Hence $|\alpha\beta| = |\alpha|\,|\beta|$, by our earlier observations about cis. Also,

$$|\alpha + \beta|^2 = |\alpha|^2 + 2\,\mathrm{Re}(\alpha\overline{\beta}) + |\beta|^2 \le |\alpha|^2 + 2|\alpha|\,|\beta| + |\beta|^2 = (|\alpha| + |\beta|)^2.$$

b. The usual topology and uniform structure on $\mathbb{C}$ are given by the metric $d(\alpha, \beta) = |\alpha - \beta|$. This topology and uniform structure are the same as those of $\mathbb{R}^2$; see Chapters 18 and 22. However, $\mathbb{C}$ and $\mathbb{R}^2$ have different differentiable structures; see Chapter 25.

c. The set $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$ is a commutative group whose operation is the multiplication of complex numbers. It is often called the circle group, since it is geometrically a circle, $x^2 + y^2 = 1$, in the complex plane. It is isomorphic (as a group) to the additive group $[0, 2\pi)$ introduced in Chapter 8. In fact, the mapping $\theta \mapsto \mathrm{cis}(\theta)$ (defined in 10.28) is an isomorphism from the additive group $[0, 2\pi)$ onto the multiplicative group $\mathbb{T}$. Also, the mapping $\theta \mapsto \mathrm{cis}(\theta)$ is a group homomorphism (not an isomorphism) from the additive group $\mathbb{R}$ onto the multiplicative group $\mathbb{T}$.

10.33. Let $\mathbb{F}$ be a field, not necessarily contained in $\mathbb{R}$ or containing $\mathbb{R}$. (For instance, $\mathbb{F}$ could be one of the finite fields discussed in Chapter 8.) For clarity in the discussion below, let 1 denote the multiplicative identity of $\mathbb{R}$, and let $e$ denote the multiplicative identity of $\mathbb{F}$. For any $n \in \mathbb{N}$, let us denote $ne = e + e + \cdots + e$ (the sum of $n$ $e$'s). Let $|\ |$ be an absolute value on the field $\mathbb{F}$. Then the following conditions are equivalent; if any (hence all) are satisfied we say that the absolute value is non-Archimedean.

(A) The set $\{|ne| : n \in \mathbb{N}\}$ is bounded in $\mathbb{R}$.
(B) $|ne| \le 1$ for every $n \in \mathbb{N}$.
(C) $|a + b| \le \max\{|a|, |b|\}$ for all $a, b \in \mathbb{F}$.
(D) The metric $d(u, v) = |u - v|$ satisfies the ultrametric inequality $d(u, v) \le \max\{d(u, w), d(w, v)\}$.

Proof of equivalence. The proofs of (C) $\Leftrightarrow$ (D) and (C) $\Rightarrow$ (B) $\Rightarrow$ (A) are easy; it suffices to prove (A) $\Rightarrow$ (C). Let any $a, b \in \mathbb{F}$ be given. Let $r = \sup\{|ne| : n \in \mathbb{N}\}$, and let $s = \max\{|a|, |b|\}$. For each $n \in \mathbb{N}$, observe that

$$|a + b|^n = |(a + b)^n| = \left|\sum_{j=0}^{n} \binom{n}{j} a^{n-j} b^j\right| \le \sum_{j=0}^{n} r s^n = (n + 1)\, r s^n.$$

Hence $|a + b| \le \sqrt[n]{(n + 1) r}\; s$. Take limits as $n \to \infty$ to obtain $|a + b| \le s$; this completes the proof. This proof follows Rooij [1978].

10.34. Examples of non-Archimedean valuations.

a. The discrete absolute value (or Kronecker absolute value) on any field $\mathbb{F}$ is given by

$$|x| = \begin{cases} 0 & \text{if } x = 0, \\ 1 & \text{if } x \ne 0. \end{cases}$$

Obviously this satisfies condition (C) given above.

b. Let $p$ be a prime number, i.e., one of the numbers 2, 3, 5, 7, 11, .... Any nonzero rational number can be expressed in the form $m/(np^r)$, where $r, m, n$ are integers, with $m$ and $n$ nonzero and not divisible by $p$. Define $|m/(np^r)|_p = p^r$; this is the $p$-adic absolute value on $\mathbb{Q}$. The completion of the metric space $(\mathbb{Q}, |\ |_p)$ is the system of $p$-adic numbers, which are used in algebraic number theory and in the study of topological groups. Completions of metric spaces will be studied in Chapter 19. An introduction to this subject is given by Bachman [1964].

c. Let $\mathbb{F}$ be a field, let $x$ be a variable, and let $\mathbb{F}(x)$ be the field of rational functions in the variable $x$ with coefficients in $\mathbb{F}$ (see Chapter 9). Each nonzero element can be written in the form $r(x) = x^m p(x)/q(x)$ where $m$ is an integer, $p(x)$ and $q(x)$ are polynomials, and neither $p$ nor $q$ has a factor of $x$. Then let $|r| = 2^{-m}$. Verify that this yields a non-Archimedean valuation on $\mathbb{F}(x)$. Instead of $2^{-m}$ we could use $c^{-m}$ for any constant $c > 1$.

For further reading, see also Narici and Beckenstein [1990].

Intentional ambiguity. In later chapters, the main fields we shall use are $\mathbb{R}$ and $\mathbb{C}$. We shall sometimes state a theorem involving a field $\mathbb{F}$ that may be $\mathbb{R}$ or $\mathbb{C}$, without specifying which of these fields is intended. This intentional ambiguity permits us to cover both cases simultaneously. For any $\alpha \in \mathbb{F}$, we shall let $\mathrm{Re}\,\alpha$, $\mathrm{Im}\,\alpha$, $\overline{\alpha}$, and $|\alpha|$ be the real and imaginary parts of $\alpha$, the complex conjugate of $\alpha$, and the absolute value of $\alpha$, respectively. This notation is applicable (albeit unnecessarily complicated) even when $\mathbb{F} = \mathbb{R}$; in that case we have $\mathrm{Re}\,\alpha = \alpha$ and $\mathrm{Im}\,\alpha = 0$. Likewise, the expression $|\alpha|$ has the same value regardless of whether the field being used is $\mathbb{R}$ or $\mathbb{C}$, since the absolute value function on $\mathbb{R}$ is just the restriction of the absolute value function on $\mathbb{C}$.
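Returning to the $p$-adic absolute value of 10.34, a small computation may make the ultrametric inequality (C) of 10.33 more concrete. The sketch below is an illustration only; the function name `p_adic_abs` is an assumption for this example. It uses exact rational arithmetic, strips factors of $p$ from the numerator and denominator, and returns the corresponding power of $p$.

```python
from fractions import Fraction

def p_adic_abs(x, p):
    """p-adic absolute value of a rational number x, for a prime p.

    If x = p**k * (m / n) with m, n not divisible by p, the value is p**(-k).
    By convention the absolute value of 0 is 0.
    """
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    k = 0
    num, den = x.numerator, x.denominator
    while num % p == 0:      # factors of p in the numerator raise k
        num //= p
        k += 1
    while den % p == 0:      # factors of p in the denominator lower k
        den //= p
        k -= 1
    return Fraction(1, p) ** k

if __name__ == "__main__":
    a, b = Fraction(50, 3), Fraction(7, 5)
    print(p_adic_abs(a, 5), p_adic_abs(b, 5), p_adic_abs(a + b, 5))
```

With $p = 5$, the number $50/3$ is "small" (it is divisible by $25$) while $7/5$ is "large"; their sum has the same absolute value as the larger of the two, as condition (C) permits.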
See also the related discussions in Chapters 11, 26, and 27.

10.35. Clarkson's inequality for scalars. Let $p, q \in (1, +\infty)$ with $\frac1p + \frac1q = 1$. Then

$$|\xi + \eta|^p + |\xi - \eta|^p \le 2\left(|\xi|^q + |\eta|^q\right)^{p/q}$$

for any complex numbers $\xi, \eta$, if $p \ge 2 \ge q$, and the reverse of this inequality holds if $p \le 2 \le q$. (This result will be used in Chapter 22 to prove that the Banach spaces $L^p(\mu)$ are uniformly convex; other properties of uniform convexity are studied in that chapter and in Chapter 28. Our presentation is based on that of Weir [1974].)

Proof. For most of the remaining steps, we shall give inequalities only for $p \ge 2 \ge q$; the reversed inequalities are then valid when $p \le 2 \le q$. Define

$$\varphi(t) = \left(1 + \sqrt[p]{t}\right)^p + \left(1 - \sqrt[p]{t}\right)^p \qquad (0 < t < 1).$$

Then compute the first two derivatives and simplify to:

$$\varphi'(t) = \left[\left(1 + \sqrt[p]{t}\right)^{p-1} - \left(1 - \sqrt[p]{t}\right)^{p-1}\right] t^{(1/p) - 1},$$
$$\varphi''(t) = \left(\tfrac1p - 1\right)\left[\left(1 + \sqrt[p]{t}\right)^{p-2} - \left(1 - \sqrt[p]{t}\right)^{p-2}\right] t^{(1/p) - 2}.$$

Observe that $\varphi''(t) \le 0$ (assuming $p \ge 2 \ge q$). Hence, by integrating, $\varphi(a) \le \varphi(b) + (a - b)\varphi'(b)$ for $a, b \in (0, 1)$. Substituting $a = r^p$ and $b = r^{pq}$ and simplifying, we obtain

$$(1 + r)^p + (1 - r)^p \le 2\,(1 + r^q)^{p/q} \qquad \text{for } r \in (0, 1).$$

Taking limits, we find that this is also valid for $r \in [0, 1]$.

(This paragraph can be omitted if we wish to consider only real numbers for scalars.) Next we claim that if $\zeta$ is any complex number with $|\zeta| \le 1$, then $|1 + \zeta|^p + |1 - \zeta|^p \le 2\,(1 + |\zeta|^q)^{p/q}$, again with inequality reversed if $p \le 2 \le q$. To establish this inequality, note that $\zeta$ can be represented in the form $\zeta = r\cos\theta + i\,r\sin\theta$ for some real number $\theta$, with $r = |\zeta| \in [0, 1]$. Holding $r$ fixed, define $\psi(\theta) = |1 + \zeta|^p + |1 - \zeta|^p$; it suffices to show that $\psi(\theta) \le \psi(0)$ for all $\theta$. Note that $\psi$ is periodic with period $\pi$, since $\zeta$ and $-\zeta$ switch places when we increase $\theta$ by $\pi$; hence it suffices to consider $\theta \in [0, \pi]$. Observe that

$$|1 + \zeta|^2 = (1 + \zeta)(1 + \overline{\zeta}) = 1 + 2r\cos\theta + r^2 \quad\text{and}\quad |1 - \zeta|^2 = (1 - \zeta)(1 - \overline{\zeta}) = 1 - 2r\cos\theta + r^2.$$

This yields the representation

$$\psi(\theta) = (1 + 2r\cos\theta + r^2)^{p/2} + (1 - 2r\cos\theta + r^2)^{p/2},$$

and it is then easy to compute

$$\psi'(\theta) = pr\left\{-|1 + \zeta|^{p-2} + |1 - \zeta|^{p-2}\right\}\sin\theta.$$

In the interval $0 \le \theta \le \pi$ we have $\sin\theta \ge 0$, and also

$|1 + \zeta| \ge |1 - \zeta|$ when $0 \le \theta \le \pi/2$,
$|1 + \zeta| \le |1 - \zeta|$ when $\pi/2 \le \theta \le \pi$.

Hence $\psi$ assumes a maximum at $\theta = 0$ and $\theta = \pi$ (or a minimum, if $p \le 2 \le q$). This completes the proof of the claim.

Finally, to prove the theorem, let any complex numbers $\xi, \eta$ be given. We may assume that at least one of $\xi, \eta$ is nonzero; without loss of generality we may assume $|\xi| \le |\eta|$. Then we substitute $\zeta = \xi/\eta$ in the claim and multiply both sides by $|\eta|^p$.
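Clarkson's inequality for scalars (10.35) is easy to spot-check numerically. The following Python sketch is only a sanity check under stated assumptions, not a proof: the function name `clarkson_sides`, the sample exponents, and the random test points are all choices made for this illustration. It computes both sides for complex $\xi, \eta$, with $q$ determined by $\frac1p + \frac1q = 1$.

```python
import random

def clarkson_sides(xi, eta, p):
    """Return (lhs, rhs) of Clarkson's scalar inequality for exponent p > 1.

    lhs = |xi + eta|^p + |xi - eta|^p
    rhs = 2 * (|xi|^q + |eta|^q)^(p/q), where 1/p + 1/q = 1.
    """
    q = p / (p - 1)  # conjugate exponent
    lhs = abs(xi + eta) ** p + abs(xi - eta) ** p
    rhs = 2 * (abs(xi) ** q + abs(eta) ** q) ** (p / q)
    return lhs, rhs

if __name__ == "__main__":
    random.seed(0)
    for p in (1.5, 2.0, 3.0):
        xi = complex(random.uniform(-1, 1), random.uniform(-1, 1))
        eta = complex(random.uniform(-1, 1), random.uniform(-1, 1))
        print(p, clarkson_sides(xi, eta, p))
```

For $p \ge 2$ the first number printed should not exceed the second (up to rounding), and for $p \le 2$ the comparison is reversed; at $p = 2$ both sides agree, which is the parallelogram law.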
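CONVERGENCE OF SEQUENCES AND SERIES

10.36. Remarks. Both $\mathbb{R}$ and $[-\infty, +\infty]$ are chains, hence their natural convergences are defined as in Chapter 7. The extended real line $[-\infty, +\infty]$ has the further advantage that it is order complete, hence its convergence can also be described in terms of lim inf and lim sup, as in Chapter 7. Any net in $[-\infty, +\infty]$ has both a lim inf and a lim sup in $[-\infty, +\infty]$. Any bounded net in $\mathbb{R}$ has both a lim inf and a lim sup in $\mathbb{R}$. Not every bounded net in $\mathbb{R}$ has a limit; for instance, the sequence 0, 1, 0, 1, 0, 1, ... has no limit. (However, every bounded net in $\mathbb{R}$ has generalized limits, in a sense described in Chapter 12.)

Questions about convergence in $\mathbb{R}$ can be restated as questions about convergence in $[-\infty, +\infty]$, since the convergence in $\mathbb{R}$ is just the restriction of the convergence in $[-\infty, +\infty]$. On the other hand, questions about convergence in $[-\infty, +\infty]$ can be restated as questions about convergence in a bounded subset of $\mathbb{R}$, via the following observation: The mapping $\theta \mapsto \tan\theta$ is an order isomorphism from $[-\frac{\pi}{2}, \frac{\pi}{2}]$ onto $[-\infty, +\infty]$. It is sometimes easier to work in $\mathbb{R}$, because that space has a simpler metric and simpler arithmetic. It is sometimes easier to work in $[-\infty, +\infty]$, since any net in that space has a limsup and a liminf.

10.37. (Optional.) Notions of calculus, such as limits, can be described in terms of infinitesimals. Newton and Leibniz had something like this in mind when they invented calculus. (The epsilon-delta approach now widely used in calculus books was not developed until many decades after Newton and Leibniz.)

Assume that $\mathcal{U}$ is a free ultrafilter on $\mathbb{N}$; hence ${}^*\mathbb{R} = \mathbb{R}^{\mathbb{N}}/\mathcal{U}$ is a non-Archimedean field, as discussed earlier in this chapter. Let $f : \mathbb{R} \to \mathbb{R}$ be some function; let $p, L$ be some real numbers. Show that

the right-hand limit $f(p+) = \lim_{a \downarrow 0} f(p + a)$ exists and equals $L$, if and only if for each positive infinitesimal $\alpha$ the hyperreal number ${}^*f(p + \alpha)$ is infinitely close to $L$.

(The number $a$ is understood to vary through real values, not hyperreal values.) Applying that result twice, we obtain:

$\lim_{a \to 0} f(p + a)$ exists and equals $L$ if and only if for each nonzero infinitesimal $\alpha$ the hyperreal number ${}^*f(p + \alpha)$ is infinitely close to $L$.

Proof. It suffices to prove the first equivalence. First suppose that $f(p+) = L$. Let any real number $\varepsilon > 0$ be given. Then there is some positive real number $\delta$ such that

$$a \in \mathbb{R},\quad 0 < a < \delta \quad\Longrightarrow\quad L - \varepsilon < f(p + a) < L + \varepsilon.$$

It follows from the results of Chapter 9 that

$$\alpha \in {}^*\mathbb{R},\quad 0 < \alpha < \delta \quad\Longrightarrow\quad L - \varepsilon < {}^*f(p + \alpha) < L + \varepsilon.$$

In particular, if $\alpha$ is a positive infinitesimal, then $0 < \alpha < \delta$ is satisfied for every positive real number $\delta$, hence $L - \varepsilon < {}^*f(p + \alpha) < L + \varepsilon$ is satisfied for every positive real number $\varepsilon$, hence ${}^*f(p + \alpha)$ is infinitely close to $L$.

Conversely, suppose that ${}^*f(p + \alpha)$ is infinitely close to $L$ for every positive infinitesimal $\alpha$. We wish to show that $f(p+) = L$. Let $(a_n)$ be any sequence of positive real numbers decreasing to 0; it suffices to show that $f(p + a_n) \to L$. Let any positive real number $\varepsilon$ be given; it suffices to show that $|f(p + a_n) - L| < \varepsilon$ for all $n$ sufficiently large. Define a function $A : \mathbb{N} \to \mathbb{R}$ by taking $A(n) = a_n$; let $\alpha$ be the equivalence class of that function; then $\alpha$ is a positive infinitesimal. Since ${}^*f(p + \alpha)$ is infinitely close to $L$, we have in particular $L - \varepsilon < {}^*f(p + \alpha) < L + \varepsilon$. That is, the set of all $n$ satisfying $L - \varepsilon < f(p + A(n)) < L + \varepsilon$ is a member of $\mathcal{U}$, and in particular is nonempty. The same reasoning applies to every subsequence of $(a_n)$, so no subsequence can satisfy $|f(p + a_n) - L| \ge \varepsilon$ for all of its terms. Therefore $L - \varepsilon < f(p + A(n)) < L + \varepsilon$ for all but finitely many values of $n$. This completes the proof.

10.38. (This result can be postponed; it will not be needed until Chapter 28.) Let $X$ be a nonempty set. A sequence $(f_j)$ in $\mathbb{R}^X$ is a Pryce sequence if

$$\sup_{x \in X}\ \liminf_{j \to \infty} f_j(x) \;=\; \sup_{x \in X}\ \limsup_{j \to \infty} f_j(x).$$

(The liminf and limsup take their values in $[-\infty, +\infty]$.) It is easy to show that any subsequence of a Pryce sequence is also a Pryce sequence.

Pryce Selection Theorem. Every sequence in $\mathbb{R}^X$ has a subsequence that is a Pryce sequence.

Proof. We follow the presentation of König [1986]. Throughout this proof, both subscripts and superscripts will be used as indices; superscripts will not denote exponentiation or composition. Also, for brevity, for any function $u : X \to [-\infty, +\infty]$, the symbol $\overline{u}$ will denote the number $\sup_{x \in X} u(x)$.

Let the given sequence be $(f_j)$. For $j = 1, 2, 3, \ldots$, let $g^0_j = f_j$. For each $n = 1, 2, 3, \ldots$, recursively define a point $x^n \in X$ and a subsequence $(g^n_j : j \in \mathbb{N})$ of the sequence $(g^{n-1}_j : j \in \mathbb{N})$ as follows: Let

$$p^{n-1}(x) = \liminf_{j \to \infty} g^{n-1}_j(x), \qquad q^{n-1}(x) = \limsup_{j \to \infty} g^{n-1}_j(x)$$

for each $x \in X$. Then choose $x^n \in X$ according to the value of $\overline{q}^{\,n-1}$, as follows:

If $\overline{q}^{\,n-1} = -\infty$, let $x^n$ be any point of $X$.
If $\overline{q}^{\,n-1} \in \mathbb{R}$, choose some $x^n \in X$ satisfying $q^{n-1}(x^n) > \overline{q}^{\,n-1} - \frac1n$.
If $\overline{q}^{\,n-1} = +\infty$, choose some $x^n \in X$ satisfying $q^{n-1}(x^n) > n$.

Let $r^n = q^{n-1}(x^n) = \limsup_{j \to \infty} g^{n-1}_j(x^n)$. Then some subsequence $(g^n_j : j \in \mathbb{N})$ of the sequence $(g^{n-1}_j : j \in \mathbb{N})$ satisfies $r^n = \lim_{j \to \infty} g^n_j(x^n)$. This completes the recursive definition. Since the sequence $(g^n_j(x^n) : j \in \mathbb{N})$ is convergent, we have $r^n = p^n(x^n)$.

Now let $(h^n)$ be a diagonal subsequence of the $g$'s; that is, for each $n$, let $h^n$ be a member of the sequence $(g^n_j : j \in \mathbb{N})$, and then let $h^{n+1}$ be some later member of the sequence $(g^n_j : j \in \mathbb{N})$, chosen so that $h^{n+1}$ also belongs to the subsequence $(g^{n+1}_j : j \in \mathbb{N})$. In particular, $(h^n)$ is a subsequence of the originally given sequence $(f_j)$; we shall show that $(h^n)$ is a Pryce sequence. Define

$$p(x) = \liminf_{j \to \infty} h^j(x), \qquad q(x) = \limsup_{j \to \infty} h^j(x),$$

with values in $[-\infty, +\infty]$; then $\overline{p} \le \overline{q}$ and it suffices to show that $\overline{p} \ge \overline{q}$. For each $n$, it follows that $(h^n, h^{n+1}, h^{n+2}, h^{n+3}, \ldots)$ is a subsequence of $(g^n_j : j \in \mathbb{N})$, which is in turn a subsequence of $(g^{n-1}_j : j \in \mathbb{N})$. Hence we have

$$\overline{p}^{\,0} \le \overline{p}^{\,1} \le \overline{p}^{\,2} \le \cdots \le \overline{p} \le \overline{q} \le \cdots \le \overline{q}^{\,2} \le \overline{q}^{\,1} \le \overline{q}^{\,0}.$$

Also, $r^n = \lim_{j \to \infty} h^j(x^n)$, and therefore $r^n = p(x^n) \le \overline{p}$.

We may assume that $\overline{q} > -\infty$ and $\overline{p} < +\infty$. The numbers $\overline{q}^{\,n-1}$ are bounded below by the number $\overline{q}$, which is not $-\infty$. Since $q^{n-1}(x^n) = r^n \le \overline{p} < +\infty$, we have $q^{n-1}(x^n) \le n$ for all $n$ sufficiently large. For those $n$, our definition of $x^n$ tells us that $\overline{q}^{\,n-1}$ cannot be $+\infty$. Thus for all $n$ sufficiently large we have $\overline{q}^{\,n-1}$ finite, and therefore (by our definition of $x^n$)

$$-\infty < \overline{q} - \tfrac1n \le \overline{q}^{\,n-1} - \tfrac1n < q^{n-1}(x^n) = r^n \le \overline{p} < +\infty.$$

Taking limits yields $\overline{q} \le \overline{p}$.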
10.39. Convergence of infinite series. Let $a_1, a_2, a_3, \ldots$ be complex numbers (or, in particular, real numbers). Then the expression

$$\sum_{k=1}^{\infty} a_k \qquad\text{or}\qquad a_1 + a_2 + a_3 + \cdots$$

is called a series (or an infinite series). That expression also represents the limit of the sequence $a_1,\ a_1 + a_2,\ a_1 + a_2 + a_3,\ a_1 + a_2 + a_3 + a_4,\ \ldots$; that is, the value of $\lim_{n \to \infty} \sum_{k=1}^{n} a_k$ if that limit exists. (We say the series is the "limit of the partial sums.") When the limit exists, we say the series $\sum_{k=1}^{\infty} a_k$ converges. When the limit fails to exist, we say the series diverges.

The sequence $(a_k) = (a_1, a_2, a_3, \ldots)$ should not be confused with the series $\sum_{k=1}^{\infty} a_k = a_1 + a_2 + a_3 + \cdots$. For instance, the sequence $(2^{-k}) = (\frac12, \frac14, \frac18, \ldots)$ converges to 0 while the series $\sum_{k=1}^{\infty} 2^{-k} = \frac12 + \frac14 + \frac18 + \cdots$ converges to 1 by the result in 10.41.d.

For infinite series of real numbers, it is customary to extend the definition a little further: When $a_1, a_2, a_3, \ldots$ are real numbers, $\sum_{k=1}^{\infty} a_k$ is understood to mean the limit of $a_1 + a_2 + \cdots + a_n$, not just in $\mathbb{R}$, but in the extended real line $[-\infty, +\infty]$. When the limit happens to be $+\infty$, some mathematicians say that the series diverges to infinity. (Some mathematicians say that the series converges to infinity, but we shall not follow that terminology in our discussion of infinite series.) Similar terminology applies for $-\infty$.

The definitions above all generalize readily to the case where $a_1, a_2, a_3, \ldots$ are members of any monoid equipped with a Hausdorff convergence structure (see Chapter 7). In 22.20 we consider the case where the monoid is any Banach space; in Chapter 26 we consider the case where it is a topological vector space.

10.40. When all the $a_k$'s are nonnegative real numbers, then $\sum_{k=1}^{\infty} a_k$ always exists in $[0, +\infty]$; that is, the series always converges to a finite number or diverges to $+\infty$. We may abbreviate these two cases by saying simply that $\sum_{k=1}^{\infty} a_k < \infty$ or that $\sum_{k=1}^{\infty} a_k = \infty$. Actually, in this case the order of the terms does not affect the summation.

More generally, let $(a_\lambda : \lambda \in \Lambda)$ be any parametrized collection of members of $[0, +\infty]$. Then we define the sum $\sum_{\lambda \in \Lambda} a_\lambda$ to mean the supremum of all sums of the form $\sum_{\lambda \in L} a_\lambda$ for finite sets $L \subseteq \Lambda$. The supremum exists, since $[0, +\infty]$ is order complete. When $\Lambda = \mathbb{N}$, then this definition is equivalent to the one given earlier for $\sum_{k=1}^{\infty} a_k$. Again, we are mainly interested in the countable case, because (exercise) if $\sum_{\lambda \in \Lambda} a_\lambda$ is finite, then at most countably many of the $a_\lambda$'s are nonzero. Hint: Show that for each positive integer $m$, the set $\Lambda_m = \{\lambda \in \Lambda : a_\lambda > \frac1m\}$ is finite.

10.41. Some basic properties of convergent series.

a. Suppose $\sum_{j=1}^{\infty} a_j$ and $\sum_{j=1}^{\infty} b_j$ are convergent series of real or complex numbers with finite sums, and $k$ is any constant. Then $\sum_{j=1}^{\infty} (a_j + b_j)$ and $\sum_{j=1}^{\infty} (k a_j)$ are also convergent series, with sums equal to $\sum_{j=1}^{\infty} a_j + \sum_{j=1}^{\infty} b_j$ and $k \sum_{j=1}^{\infty} a_j$, respectively.

b. If $\sum_{j=1}^{\infty} a_j$ is a convergent series of real or complex numbers, then $\lim_{j \to \infty} a_j = 0$. However, if $\lim_{j \to \infty} a_j = 0$, it does not follow that $\sum_{j=1}^{\infty} a_j$ is convergent; for example, consider the harmonic series in 10.41.f below.

c. If $0 \le a_j \le b_j$ for all $j$, then $\sum_{j=1}^{\infty} a_j \le \sum_{j=1}^{\infty} b_j$. In particular, if $0 \le a_j \le b_j$ and $\sum b_j$ is convergent, then $\sum a_j$ is convergent.

d. Geometric series. Show that $1 + r + r^2 + \cdots + r^n = (1 - r^{n+1})/(1 - r)$ if $r \ne 1$; hence

$$\sum_{j=0}^{\infty} r^j = 1 + r + r^2 + r^3 + \cdots = \begin{cases} 1/(1 - r) & \text{if } |r| < 1, \\ \text{divergent} & \text{if } |r| \ge 1. \end{cases}$$

e. Integral test. If $f : [1, +\infty) \to [0, +\infty)$ is a decreasing function, then

$$\sum_{j=2}^{n+1} f(j) \;\le\; \int_1^{n+1} f(x)\,dx \;\le\; \sum_{j=1}^{n} f(j).$$

Hence $\sum_{j=1}^{\infty} f(j)$ and $\int_1^{\infty} f(x)\,dx$ are both finite or both infinite.

f. Corollary. The series $\sum_{j=1}^{\infty} j^{-p}$ converges for real numbers $p > 1$ and diverges if $0 < p \le 1$. In particular, the harmonic series $\sum_{j=1}^{\infty} \frac1j = 1 + \frac12 + \frac13 + \cdots$ diverges. In fact, the integral test tells us that $\sum_{j=1}^{n} \frac1j$ is approximately equal to $\ln n$. However, it diverges rather slowly; i.e., to make $\sum_{j=1}^{n} \frac1j$ moderately large, we must make $n$ incredibly enormous. For instance, when $n$ is a trillion, then $\sum_{j=1}^{n} \frac1j$ is only approximately equal to $\ln(10^{12}) = 12\ln(10) \approx 27.63$. When $n$ is a googol, or $10^{100}$, then $\sum_{j=1}^{n} \frac1j$ is still only about $\ln(10^{100}) = 100\ln(10) \approx 230$.

The harmonic series is a sort of "borderline case"; it often takes delicate calculations to decide the convergence or divergence of series that are similar to the harmonic series. For instance, the series $\sum_{j=2}^{\infty} \frac{1}{j \ln j}$ and $\sum_{j=3}^{\infty} \frac{1}{j (\ln j)(\ln \ln j)}$ also diverge, even more slowly. On the other hand, the series $\sum_{j=2}^{\infty} \frac{1}{j (\ln j)^2}$ converges.

g. Let $t_n = 1 + \frac12 + \frac13 + \cdots + \frac1n - \ln n$. Show that $t_n - t_{n+1} = \int_n^{n+1} \frac1x\,dx - \frac{1}{n+1} > 0$. Hence the sequence $t_1, t_2, t_3, \ldots$ is bounded and decreasing. It therefore converges to a limit. That limit is approximately 0.577215664901532..., which is called Euler's constant; it is commonly denoted by $\gamma$.

h. Alternating series test. If $a_1 \ge a_2 \ge a_3 \ge \cdots \ge 0$ and $\lim_{n \to \infty} a_n = 0$, then the series $a_1 - a_2 + a_3 - a_4 + \cdots$ converges. We omit the rather elementary proof (which can be found in most calculus books), since we shall prove a stronger result in Chapter 22.

10.42. When we add up only finitely many numbers, or add up infinitely many nonnegative numbers, then it does not matter in what order we add them; the result is the same. However, when we add up infinitely many numbers, some positive and some negative, then changing the order of the terms may affect the answer. For instance, show that

$$1 - \tfrac12 + \tfrac13 - \tfrac14 + \tfrac15 - \tfrac16 + \cdots = \ln 2,$$

but that

$$1 + \tfrac13 - \tfrac12 + \tfrac15 + \tfrac17 - \tfrac14 + \tfrac19 + \tfrac1{11} - \tfrac16 + \cdots = \tfrac32 \ln 2,$$

using 10.41.
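The slow divergence of the harmonic series, Euler's constant, and the effect of rearranging the alternating series can all be observed numerically. The following Python sketch is an illustration under stated assumptions: the function names `t`, `alternating`, and `rearranged`, and the choice of how many terms to sum, are made up for this example, and floating-point partial sums only approximate the limits.

```python
import math

def t(n):
    """t_n = 1 + 1/2 + ... + 1/n - ln n, which decreases to Euler's constant."""
    return sum(1.0 / j for j in range(1, n + 1)) - math.log(n)

def alternating(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... with n_terms terms."""
    return sum((-1) ** (j + 1) / j for j in range(1, n_terms + 1))

def rearranged(n_blocks):
    """Partial sum of 1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ...

    Each block consists of two positive odd-denominator terms followed by
    one negative even-denominator term.
    """
    total, odd, even = 0.0, 1, 2
    for _ in range(n_blocks):
        total += 1.0 / odd + 1.0 / (odd + 2) - 1.0 / even
        odd += 4
        even += 2
    return total

if __name__ == "__main__":
    print(t(10), t(1000), t(100000))
    print(alternating(100000), math.log(2))
    print(rearranged(100000), 1.5 * math.log(2))
```

Both series use exactly the same terms; only the order differs, yet the partial sums approach different limits.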
If we change the order of the terms a bit more, we obtain the series

$$1 - \frac12 + \underbrace{\frac13}_{1\ \text{odd term}} - \frac14 + \underbrace{\frac15 + \frac17}_{2\ \text{odd terms}} - \frac16 + \underbrace{\frac19 + \frac1{11} + \frac1{13} + \frac1{15}}_{4\ \text{odd terms}} - \frac18 + \underbrace{\frac1{17} + \frac1{19} + \cdots + \frac1{31}}_{8\ \text{odd terms}} - \frac1{10} + \cdots,$$

which diverges to $+\infty$. Hints: Observe that $\frac13 > \frac14$, that $\frac15 + \frac17 > \frac18 + \frac18 = \frac14$, that $\frac19 + \frac1{11} + \frac1{13} + \frac1{15} > \frac1{16} + \frac1{16} + \frac1{16} + \frac1{16} = \frac14$, etc.

Other rearrangements of this series yield other sums. In fact, it can be proved that any number $L$ in $[-\infty, +\infty]$ can be obtained as the sum of a suitable rearrangement of the series above. (Hints: Obtain $L = -\infty$ in a fashion analogous to the method used above for the sum $L = +\infty$. Now consider any finite number $L$. Take just enough positive terms to get a partial sum that is greater than $L$; then take just enough negative terms to get a partial sum that is less than $L$; then just enough positive terms to exceed $L$ again; etc.) See the related results in Chapter 23.

Intuitively, it may be helpful to view a series this way: The numbers in $a_1 + a_2 + a_3 + \cdots$ are not all added "simultaneously"; rather, the leftmost terms are added earlier than the terms occurring farther to the right; this is reflected in our definition $\sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} (a_1 + a_2 + \cdots + a_n)$. Different orderings of the $a_k$'s yield different partial sums $s_n = a_1 + a_2 + \cdots + a_n$ and thus different sequences $(s_n)$, which may have different limits. Thus, it is erroneous and misleading to say that $\sum_{k=1}^{\infty} a_k = a_1 + a_2 + a_3 + \cdots$ is simply the "sum" of the $a_k$'s. To be more precise, we must say that $\sum_{k=1}^{\infty} a_k$ is the sum of the $a_k$'s in the specified order.

10.43. Example. In Chapter 22 we shall show that the series $\sum_{n=1}^{\infty} \frac1n \sin(nx)$ converges for any real number $x$. We shall now show that the series $\sum_{n=1}^{\infty} \frac1n |\sin(nx)|$ diverges, for any real number $x$ that is not a multiple of $\pi$.

Proof. Since $|\sin(n(x + \pi))| = |\sin(nx)|$, we may translate $x$ by any multiple of $\pi$; thus we may assume that $-\pi/2 < x \le \pi/2$ and $x \ne 0$. Since $|\sin(-nx)| = |\sin(nx)|$, we may assume that $0 < x \le \pi/2$. If $x = \pi/2$, then $|\sin(nx)| = 1$ for every odd $n$ and the series clearly diverges; hence we may assume that $0 < x < \pi/2$. Choose a positive integer $M$ large enough so that $(M - 1)x > 2\pi$. Consider any $M$ consecutive integers $k + 1, k + 2, \ldots, k + M$. Since $(k + M)x - (k + 1)x > 2\pi$, the angles $(k + 1)x, \ldots, (k + M)x$ go a bit more than once around a circle. Those angles cannot skip across the interval $(\frac14\pi, \frac34\pi)$ (modulo $2\pi$) without taking a value in that interval, since that interval has width $\pi/2$, which is larger than $x$. Thus at least one of those angles lies in the interval $(\frac14\pi, \frac34\pi)$ (modulo $2\pi$), and so at least one of the numbers $\sin((k + 1)x), \ldots, \sin((k + M)x)$ is larger than $\frac12\sqrt2$. Hence for any nonnegative integer $j$ we have

$$\max\left\{ \frac{|\sin((Mj + 1)x)|}{Mj + 1},\ \frac{|\sin((Mj + 2)x)|}{Mj + 2},\ \ldots,\ \frac{|\sin((Mj + M)x)|}{Mj + M} \right\} > \frac{\frac12\sqrt2}{Mj + M}.$$

Therefore

$$\sum_{n=1}^{\infty} \frac{|\sin(nx)|}{n} = \sum_{j=0}^{\infty} \sum_{p=1}^{M} \frac{|\sin((Mj + p)x)|}{Mj + p} > \sum_{j=0}^{\infty} \frac{\frac12\sqrt2}{M(j + 1)},$$

which diverges to $\infty$ since the harmonic series does.

10.44. Decimals from real numbers. Let $D = \{0, 1, 2, \ldots, 9\}$. For each sequence $\sigma = (s_1, s_2, s_3, s_4, \ldots)$ in $D^{\mathbb{N}}$, let

$$h(\sigma) = \frac{s_1}{10} + \frac{s_2}{10^2} + \frac{s_3}{10^3} + \frac{s_4}{10^4} + \cdots = \sum_{j=1}^{\infty} \frac{s_j}{10^j}.$$

Since the $s_j$'s are nonnegative, the partial sums are an increasing sequence, and it is easy to see that they are bounded above; hence the series converges to a finite real number $h(\sigma)$. Then the expression "$0.s_1 s_2 s_3 \cdots$" is called the decimal representation of the number $h(\sigma)$. By a decimal rational we shall mean a number of the form $m/10^k$, where $m$ and $k$ are integers. Show that

a. $0 \le h(\sigma) \le 1$.

b. Any decimal rational $m/10^k$ in $(0, 1)$ is equal to $h(\sigma)$ for exactly two different sequences $\sigma$: one that is all 0s after a certain point, and another that is all 9s after a certain point. (For instance, 0.2800000... = 0.2799999....)

c. Any other real number $r \in (0, 1)$ (i.e., not a decimal rational) is equal to $h(\sigma)$ for exactly one sequence $\sigma$.

d. Note that there are only countably many decimal rationals in $[0, 1]$. Use this to show that $\mathrm{card}([0, 1]) = \mathrm{card}(D^{\mathbb{N}})$.

e. We evolved the decimal representation system because we each have ten fingers. But mathematically, there is nothing special about the number ten. An analogous system, which might develop on a planet where the people have $b$ fingers for some integer $b > 1$, would use representations of the form

$$\frac{s_1}{b} + \frac{s_2}{b^2} + \frac{s_3}{b^3} + \frac{s_4}{b^4} + \cdots, \qquad\text{with } s_j \in \{0, 1, 2, \ldots, b - 1\}.$$

In particular, we could take $b = 2$. Thus $\mathrm{card}([0, 1]) = \mathrm{card}(\{0, 1\}^{\mathbb{N}}) = \mathrm{card}(2^{\mathbb{N}})$.

f. Cardinality of the reals. Conclude that $\mathrm{card}(\mathbb{R}) = \mathrm{card}(2^{\mathbb{N}})$. The number system $\mathbb{C}$, defined earlier in this chapter, has a natural bijection to $\mathbb{R} \times \mathbb{R}$; conclude that also $\mathrm{card}(\mathbb{C}) = \mathrm{card}(2^{\mathbb{N}})$.
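The base-$b$ representations of 10.44 can be computed digit by digit. The sketch below is an illustration only; the function names `digits` and `h` are assumptions for this example. It works with exact rational arithmetic, so it applies to rational inputs in $[0, 1)$. `digits` produces the first $n$ base-$b$ digits (choosing the expansion that does not end in all $(b-1)$s), and `h` evaluates a finite digit string as a partial sum of the series.

```python
from fractions import Fraction

def digits(r, b=10, n=12):
    """First n base-b digits s_1, ..., s_n of a rational r with 0 <= r < 1."""
    r = Fraction(r)
    out = []
    for _ in range(n):
        r *= b
        d = int(r)   # integer part; r is nonnegative, so this is the floor
        out.append(d)
        r -= d
    return out

def h(sigma, b=10):
    """Partial sum s_1/b + s_2/b^2 + ... for a finite list of digits sigma."""
    return sum(Fraction(s, b ** j) for j, s in enumerate(sigma, start=1))

if __name__ == "__main__":
    print(digits(Fraction(7, 25), n=6))        # digits of 0.28
    print(digits(Fraction(1, 3), b=2, n=6))    # binary digits of 1/3
    print(Fraction(7, 25) - h([2, 7] + [9] * 20))
```

The last line shows that the partial sums of 0.27999... differ from 0.28 by exactly $10^{-22}$ after 22 digits, which is the two-representations phenomenon of 10.44.b.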
10.45. Real numbers from decimals (optional). In the preceding section, we considered $\mathbb{R}$ as already known, i.e., as defined in 10.8 and constructed in 10.15.d, and we studied decimal expansions as infinite series in $\mathbb{R}$. Historically, however, decimal expansions predate the abstract ideas of a Dedekind complete, chain ordered field. We could actually construct a Dedekind complete, chain ordered field by using formal decimal expansions; these ideas were published by Stolz in 1886. This approach is developed in greater detail in various other sources, for instance, Abian [1981], Dienes [1957], and Ritt [1946]; we omit the details.

We assume some familiarity with $\mathbb{Q}$, but not with $\mathbb{R}$. Define $\mathbb{R}$ to be the set of all infinite sequences of the form

$$(z, y_1, y_2, y_3, y_4, \ldots) \in \mathbb{Z} \times \{0, 1, 2, \ldots, 9\}^{\mathbb{N}},$$

where we identify a sequence ending in infinitely many 0s with the corresponding sequence ending in infinitely many 9s (as in 10.44). Such a sequence will be represented, as usual, by "$z + .y_1 y_2 y_3 y_4 \cdots$". Its rational truncations are the finite sequences of symbols $z$, $z + .y_1$, $z + .y_1 y_2$, $z + .y_1 y_2 y_3$, etc., where "$z + .y_1 y_2 \cdots y_k$" represents the rational number $z + \frac{y_1}{10} + \frac{y_2}{100} + \cdots + \frac{y_k}{10^k}$. Define the ordering of $\mathbb{R}$ in the usual lexicographical fashion. Then it is easy to show that $\mathbb{R}$ is chain ordered and Dedekind complete.

Define the arithmetic operations $(+)$ and $(\cdot)$ first for rational truncations, in the obvious fashion. Then the sum of two numbers $a, b \in \mathbb{R}$ is defined to be the sup of the sums of the rational truncations of $a$ and $b$. The product of two positive numbers $a, b \in \mathbb{R}$ is the sup of the products of their rational truncations. The product of two not-necessarily-positive numbers is defined in terms of the products of positive numbers, in the obvious fashion. Zealous readers can verify that $\mathbb{R}$, defined in this fashion, is a complete ordered field.

10.46. Constructible numbers. The constructivists' notion of "number" is a bit different from the mainstream mathematicians' notion. For a constructivist, a number is acceptable if it can be approximated arbitrarily closely and some estimates can be given for how fast the approximations are converging. Thus, numbers such as $\sqrt2$ and $\pi$ are perfectly acceptable, for we have formulas (albeit complicated) for approximating these numbers to as many decimal places as we may wish. (See also 6.7.)

However, the constructivists' notion of "number" has a few surprising consequences. For instance, recall from the footnote in 6.4 that Goldbach's Conjecture asserts that for each integer $k > 1$,

(*) the number $2k$ can be written as the sum of two prime numbers.

No proof or counterexample for this proposition has yet been found. Although we do not know whether (*) is true for every $k$, we can easily test it for any particular $k$, and thus we can evaluate the number defined by

$$x_k = \begin{cases} 0 & \text{if (*) is true for this } k, \\ 1 & \text{if (*) is false for this } k. \end{cases}$$

Let us also define $x_1 = 0$. We can evaluate $x_k$ for as many $k$'s as we wish. (So far, all known $x_k$'s are 0. Perhaps someday someone will find a $k$ for which $x_k = 1$, or will prove that all the $x_k$'s are 0.) Now define

$$\Gamma = -\frac{x_1}{10} + \frac{x_2}{100} - \frac{x_3}{1000} + \cdots + (-1)^k \frac{x_k}{10^k} + \cdots.$$

(To show that this series converges to a real number, either use results about Cauchy sequences in Chapter 19, or prove that the liminf and limsup of the partial sums differ by less than $10^{-n}$ for any $n$.) We shall refer to this number $\Gamma$ as the "Goldbach number." We don't know the "exact value" of $\Gamma$ yet; it is possible that we will never know. This mysterious quality may make some classical mathematicians reluctant to accept $\Gamma$ as a "real number." Nevertheless, constructivists would say that $\Gamma$ is indeed a "real number," since we have an algorithm that can "find" $\Gamma$ as accurately as we wish; i.e., given any $\varepsilon > 0$, we can compute an approximation $\Gamma'$ satisfying $|\Gamma - \Gamma'| < \varepsilon$.

The sign of the Goldbach number is related to the Goldbach Conjecture:

• $\Gamma = 0$, if the conjecture is true.
• $\Gamma > 0$, if the conjecture is false and the first counterexample (i.e., the first contradiction to (*)) occurs when $k$ is even.
• $\Gamma < 0$, if the conjecture is false and the first counterexample occurs when $k$ is odd.

We don't yet know which of those three cases holds. It leads constructivists to conclude that, even if we know two real numbers (0 and $\Gamma$) to arbitrarily high accuracy, we still may be unable to tell which of the numbers is larger. This makes plausible our assertion in 6.6 that the Trichotomy Law for real numbers is not constructively provable. We shall encounter the Goldbach number $\Gamma$ again in 15.48.
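The algorithm behind the Goldbach number of 10.46 is short enough to write out. The following Python sketch is an illustration under stated assumptions: the function names (`is_prime`, `goldbach_holds`, `x`, `gamma_partial`) and the trial-division primality test are choices made for this example. It evaluates $x_k$ by brute force and forms exact partial sums of the series for $\Gamma$. The $n$th partial sum differs from $\Gamma$ by at most $\sum_{k > n} 10^{-k} < 10^{-n}$, which is the sense in which $\Gamma$ can be "found" as accurately as we wish.

```python
from fractions import Fraction

def is_prime(n):
    """Trial-division primality test (adequate for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def goldbach_holds(k):
    """True if 2k can be written as a sum of two primes (statement (*))."""
    return any(is_prime(a) and is_prime(2 * k - a) for a in range(2, k + 1))

def x(k):
    """x_1 = 0 by definition; otherwise 0 if (*) holds for k, else 1."""
    return 0 if k == 1 or goldbach_holds(k) else 1

def gamma_partial(n):
    """Exact n-th partial sum of the series defining the Goldbach number."""
    return sum(Fraction((-1) ** k * x(k), 10 ** k) for k in range(1, n + 1))

if __name__ == "__main__":
    print(gamma_partial(200))
```

Every partial sum that anyone has computed is 0, but no finite amount of such computation can decide whether $\Gamma = 0$, $\Gamma > 0$, or $\Gamma < 0$.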
Chapter 11
Linearity

LINEAR SPACES AND LINEAR SUBSPACES

11.1. Definitions. Let $\mathbb{F}$ be any field. An $\mathbb{F}$-linear space is a set $V$ equipped with operations $0, +, -$, which make it an additive group, and also equipped with another mapping called scalar multiplication, from $\mathbb{F} \times V$ into $V$, satisfying certain rules noted below. For any vector $v$ and scalar $c$, the result of the scalar multiplication of $c$ and $v$ is called their product. It is usually written as $c \cdot v$ or as $cv$; generally the raised dot is included only for clarification or emphasis. The rules satisfied by scalar multiplication are:

(i) $1 \cdot v = v$,
(ii) $\alpha \cdot (\beta \cdot v) = (\alpha\beta) \cdot v$,
(iii) $\alpha \cdot (u + v) = (\alpha \cdot u) + (\alpha \cdot v)$,
(iv) $(\alpha + \beta) \cdot v = (\alpha \cdot v) + (\beta \cdot v)$,

for all $\alpha, \beta \in \mathbb{F}$ and $u, v \in V$. The second rule is a sort of associativity of multiplication, although it should be noted that two different kinds of multiplication are involved: scalar times scalar and scalar times vector. The last two rules assert the distributivity of multiplication over addition; they can also be described as asserting the additivity of the mapping $v \mapsto \alpha \cdot v$ (for fixed scalar $\alpha$) and the mapping $\alpha \mapsto \alpha \cdot v$ (for fixed vector $v$).

An $\mathbb{F}$-linear space may be called a linear space, or a vector space, if the choice of the scalar field $\mathbb{F}$ is clear or does not need to be mentioned explicitly. The elements of $V$ are called vectors; we refer to $\mathbb{F}$ as the scalar field. The elements of $\mathbb{F}$ are then called the scalars.

The same symbol "0" will be used for the additive identities of the scalar field $\mathbb{F}$ and the various linear spaces; it should be clear from the context just which additive identity is meant by any "0."

Whenever we work with several linear spaces at once, it will be understood that all the linear spaces are over the same scalar field $\mathbb{F}$ (unless some other arrangement is specified); e.g., the discussion may apply to several vector spaces over $\mathbb{R}$ or to several vector spaces over $\mathbb{C}$, but we do not mix the two types unless that is mentioned explicitly. Whenever possible, we prefer not to specify what scalar field is being used, so that we can apply our results to many different
scalar fields.

11.2. Some basic properties.
a. $0 \cdot v = 0$ for any vector $v$; that is, the field's additive identity times any vector $v$ yields the linear space's additive identity.
b. $(-1) \cdot v = -v$ for any vector $v$; that is, the field's $-1$ times any vector $v$ yields the additive inverse of $v$.

11.3. More definitions. A linear algebra over a field $\mathbb{F}$ is a set $X$ equipped with $0, +, -$, and two multiplication operations $\cdot$ and $\bullet$, such that

(i) $X$ with $0, +, -, \cdot$ is a linear space over some field $\mathbb{F}$ (and $\cdot$ is the multiplication of scalars times vectors, often called the scalar multiplication);
(ii) $X$ with $0, +, -, \bullet$ is a ring. (The operation $\bullet$ may be called the ring multiplication; in some contexts it is referred to as the vector multiplication.)
(iii) The two multiplication operations satisfy this compatibility rule: $(c \cdot x) \bullet y = c \cdot (x \bullet y) = x \bullet (c \cdot y)$ for all scalars $c$ and vectors $x, y \in X$.

Such an object $X$ is simply called an "algebra" in some of the older literature. (Perhaps a better term would be linear ring, or $\mathbb{F}$-linear ring.) For clarification we might call $X$ an algebra over $\mathbb{F}$, or we might refer to it as an algebra in the classical sense. See also the related discussion in 10.34.

Of course, we have used the symbols $\cdot$ and $\bullet$ in this introductory discussion only for emphasis. Usually, the multiplication operations are both written as a raised dot ($\cdot$) or indicated by juxtaposition, i.e., the product of $x$ and $y$ (with either type of multiplication) is usually denoted $x \cdot y$ or $xy$.

The linear algebra is said to be commutative if its ring multiplication is commutative, i.e., if $x \bullet y = y \bullet x$ for all $x, y \in X$. If $(X, 0, +, -, \bullet)$ is a ring with unit 1, then the resulting linear algebra is called an algebra with unit, or a unital algebra.

Most of the rings used by analysts are linear algebras over the field $\mathbb{R}$ or $\mathbb{C}$. Boolean algebras, studied in Chapter 13 and thereafter, can be viewed as algebras over the finite field $\mathbb{Z}_2 = \{0, 1\}$.

11.4. Examples.
a. Any field $\mathbb{F}$ is a commutative unital algebra over itself.
b. More generally, if $\mathbb{F}$ is a field and $n$ is a positive integer, then $\mathbb{F}^n$ = {$n$-tuples of elements of $\mathbb{F}$} is a commutative unital algebra over $\mathbb{F}$. Elements of $\mathbb{F}^n$ are customarily represented in the form $v = (v_1, v_2, \ldots, v_n)$ using parentheses and commas, or as $n$-by-1 column matrices, or as the transposes of 1-by-$n$ row matrices: $[v_1 \; v_2 \; \cdots \; v_n]^T$.
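The vector operations on $\mathbb{F}^n$ are defined coordinatewise:
$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{pmatrix}, \qquad c \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} c x_1 \\ c x_2 \\ \vdots \\ c x_n \end{pmatrix}, \qquad \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_1 y_1 \\ x_2 y_2 \\ \vdots \\ x_n y_n \end{pmatrix},$$
etc., for any vectors $x, y \in \mathbb{F}^n$ and scalar $c \in \mathbb{F}$.

Still more generally, any product $P = \prod_{\lambda \in \Lambda} X_\lambda$ of $\mathbb{F}$-linear spaces can be made into an $\mathbb{F}$-linear space, with operations defined coordinatewise:
$$(f + g)(\lambda) = (f(\lambda)) + (g(\lambda)), \qquad (c \cdot f)(\lambda) = c \cdot (f(\lambda))$$
for all $f, g \in P$, $\lambda \in \Lambda$, and scalars $c$. If the $X_\lambda$'s are $\mathbb{F}$-(unital) algebras, then $P$ is also an $\mathbb{F}$-(unital) algebra, with vector multiplication $(fg)(\lambda) = (f(\lambda))(g(\lambda))$. It is commutative if the $X_\lambda$'s are all commutative. In particular, when all the $X_\lambda$'s are equal to one space $X$, we see that $X^\Lambda$ = {functions from $\Lambda$ into $X$} is a linear space or a linear algebra.

The product vector space takes a more intuitively appealing form if we write $\Lambda = \{\alpha, \beta, \gamma, \ldots\}$. (Here we follow the convention of 1.32: it is not assumed that $\Lambda$ is ordered or countable.) Then we have
$$(x_\alpha, x_\beta, x_\gamma, \ldots) + (y_\alpha, y_\beta, y_\gamma, \ldots) = (x_\alpha + y_\alpha, \; x_\beta + y_\beta, \; x_\gamma + y_\gamma, \ldots), \qquad c\,(x_\alpha, x_\beta, x_\gamma, \ldots) = (c x_\alpha, c x_\beta, c x_\gamma, \ldots),$$
and for linear algebras
$$(x_\alpha, x_\beta, x_\gamma, \ldots)(y_\alpha, y_\beta, y_\gamma, \ldots) = (x_\alpha y_\alpha, \; x_\beta y_\beta, \; x_\gamma y_\gamma, \ldots).$$

c. Let $\mathbb{F}$ be a field, let $n$ be a positive integer, and let $X$ be the set of all $n$-by-$n$ matrices over $\mathbb{F}$. Then $X$ is a unital algebra, with vector multiplication given by the multiplication of matrices (as defined in 8.27). This algebra is not commutative if $n > 1$.

d. More generally, if $X$ is a linear space, then the linear operators from $X$ into $X$ form a (generally noncommutative) unital algebra with ring multiplication given by composition of operators. Preview. If $X$ is a topological vector space, we may also consider the continuous linear operators; it is another unital algebra.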
e. Let $G$ be a locally compact Abelian group equipped with its Haar measure, and let $L^1(G)$ be defined accordingly (see 26.45). It can be shown that $L^1(G)$ is a commutative algebra, generally not unital, with ring multiplication defined by the convolution operation
$$(f * g)(t) = \int_G f(t - s)\, g(s)\, ds.$$

f. Several more examples are given in 11.45 and 11.46, and in Chapter 22 and thereafter.

Another important algebraic system can be described as follows: Let $X = \mathbb{R}^3$ be equipped with the usual vector space operations, as in 11.4.b. The cross product of two vectors is defined by
$$\begin{pmatrix} x_1 \\ y_1 \\ z_1 \end{pmatrix} \times \begin{pmatrix} x_2 \\ y_2 \\ z_2 \end{pmatrix} = \begin{pmatrix} y_1 z_2 - z_1 y_2 \\ z_1 x_2 - x_1 z_2 \\ x_1 y_2 - y_1 x_2 \end{pmatrix}.$$
This multiplication is anticommutative: it satisfies $x \times y = -y \times x$. Consequently, $x \times x = 0$. In particular, we have
$$i \times j = k, \qquad j \times k = i, \qquad k \times i = j,$$
and consequently
$$j \times i = -k, \qquad k \times j = -i, \qquad i \times k = -j,$$
where
$$i = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad j = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad k = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$
The cross product is not associative; for instance, $i \times (i \times j) = -j \neq 0 = (i \times i) \times j$. (Thus, $\mathbb{R}^3$ is not a linear algebra when the cross product is used for vector multiplication.)

11.5. Definitions. Let $X$ be an $\mathbb{F}$-linear space, and let $S \subseteq X$. A linear combination of elements of $S$ is an expression of the form
$$\alpha_1 s_1 + \alpha_2 s_2 + \cdots + \alpha_n s_n,$$
where $n$ is a nonnegative integer, the $s_j$'s are elements of $S$, and the $\alpha_j$'s are elements of $\mathbb{F}$. We permit $n = 0$, with the convention that the sum of no elements of $X$ is 0.

A linear subspace of $X$ is a subset $S \subseteq X$ with the property that any linear combination of elements of $S$ is also an element of $S$. Equivalently, it is a nonempty set that is closed under scalar multiplication by all scalars and under the binary operation of addition. Equivalently, it is a subalgebra in the variety of $\mathbb{F}$-linear spaces; see 8.55 and 9.21.

11.6. Basic properties. Let $X$ be an $\mathbb{F}$-linear space. Prove the following results, either directly or by using results of 9.21.
a. The whole vector space $X$ is a linear subspace of itself.
b. Any intersection of linear subspaces is a linear subspace.
c. Let $T \subseteq X$. Then there exists a smallest linear subspace containing $T$; it is the intersection of all the linear subspaces containing $T$. It is also equal to the set of
all linear combinations of elements of $T$. It is called the span (or linear span) of $T$; it may be abbreviated $\operatorname{span}(T)$. (This is a special type of Moore closure, but the term "closure" generally is not used in this context.) A linear subspace $S$ is said to be spanned by $T$ if $S = \operatorname{span}(T)$.

d. If $S$ is a linear subspace of $X$, then $S$ becomes a linear space in its own right, when the vector space operations of $X$ (addition, additive inverse, scalar multiplication, 0) are restricted to $S$.

e. If $S$ and $T$ are linear subspaces of $X$ and $c \in \mathbb{F}$, then the sets
$$S + T = \{s + t : s \in S,\ t \in T\}, \qquad cS = \{cs : s \in S\}$$
are linear subspaces also. If $S_\lambda$ ($\lambda \in \Lambda$) are linear subspaces of $X$, then the sum $\sum_{\lambda \in \Lambda} S_\lambda$ (defined in 8.11) is also a linear subspace; in fact, it is the span of $\bigcup_{\lambda \in \Lambda} S_\lambda$.

f. $\{0\}$ is a linear space over any scalar field. It is also a linear subspace of any linear space. In fact, $\{0\}$ is the span of the empty set. It is contained in the span of any set. The empty set is not a linear space.

g. The definition of linear subspace depends on the choice of the scalar field $\mathbb{F}$. For instance, the set $S = \{u \in \mathbb{C} : \operatorname{Re}(u) = 0\}$ is a linear subspace of $\mathbb{C}$ when we take $\mathbb{R}$ for the scalar field, but not when we take $\mathbb{C}$ for the scalar field.

h. Let $(Y_\lambda : \lambda \in \Lambda)$ be an indexed set of $\mathbb{F}$-linear spaces. The external direct sum of the $Y_\lambda$'s is the set
$$\bigsqcup_{\lambda \in \Lambda} Y_\lambda = \Big\{ f \in \prod_{\lambda \in \Lambda} Y_\lambda : f(\lambda) \text{ is nonzero for at most finitely many } \lambda\text{'s} \Big\}.$$
This is a linear subspace of the product $\prod_{\lambda \in \Lambda} Y_\lambda$. Of course, if $\Lambda$ is a finite set, then the external direct sum is equal to the product. The external direct sum described above is a special case of the external direct sum defined in 9.30. Caution: Some mathematicians call this the "direct sum;" see the remarks in 9.30.

An important special case is that in which all the $Y_\lambda$'s are equal to one vector space $Y$. Then the external direct sum $\bigsqcup_{\lambda \in \Lambda} Y$ is equal to the set of all functions $f : \Lambda \to Y$ that vanish on all but finitely many $\lambda$'s. Specializing further: Let $\mathbb{F}$ be the scalar field; then $\bigsqcup_{n \in \mathbb{N}} \mathbb{F}$ is the linear space consisting of all sequences of scalars that have only finitely many nonzero terms. Preview. See 11.30.

i. If $\mathbb{F}$ is any field, then $\mathbb{F}^{\mathbb{F}}$ = {functions from $\mathbb{F}$ into itself} is a linear space. (In fact, it is a commutative algebra.) For each positive integer $n$, let $P_n$ = {polynomials of degree at most $n$, in one variable, with coefficients in $\mathbb{F}$}; this is a linear subspace of $\mathbb{F}^{\mathbb{F}}$. The set $Q_n$ = {polynomials of degree exactly $n$} is not a linear space, since it is not closed under addition.

Let $\mathbb{F}$ be the scalar field (either $\mathbb{R}$ or $\mathbb{C}$). Then $\mathbb{F}^{(0,1)}$ = {functions from $(0,1)$ into $\mathbb{F}$} is a linear space. Following are some linear subspaces of $\mathbb{F}^{(0,1)}$ of types that
will be studied later in this book:

  $B$ = {bounded functions},
  $BC$ = {bounded continuous functions},
  $BUC$ = {bounded, uniformly continuous functions},
  $\mathrm{Lip}$ = {Lipschitzian functions},
  $\mathcal{D}$ = {smooth functions vanishing at endpoints}.

All of the relevant terms are defined later in this book. Exercise for more advanced readers: Show that
$$\mathbb{F}^{(0,1)} \supsetneq B \supsetneq BC \supsetneq BUC \supsetneq \mathrm{Lip} \supsetneq \mathcal{D}.$$

LINEAR MAPS

11.7. Definitions. An $\mathbb{F}$-linear map is a mapping $f : X \to Y$ from one $\mathbb{F}$-linear space into another that satisfies
$$f(x + x') = f(x) + f(x') \qquad \text{and} \qquad f(cx) = c f(x)$$
for all $x, x' \in X$ and $c \in \mathbb{F}$. We may omit the prefix "$\mathbb{F}$-" and simply refer to a linear map, if no confusion will result. However, we emphasize that the choice of $\mathbb{F}$ is part of the definition. For instance, the map $\alpha \mapsto \bar{\alpha}$ (complex conjugation), from $\mathbb{C}$ into itself, is $\mathbb{R}$-linear but not $\mathbb{C}$-linear.

The $\mathbb{F}$-linear maps are just the homomorphisms (as defined in 8.48) for the variety of all $\mathbb{F}$-linear spaces. The category of $\mathbb{F}$-linear spaces has $\mathbb{F}$-linear spaces for objects and $\mathbb{F}$-linear maps for morphisms.

Mathematicians often omit the parentheses in writing linear maps, i.e., if $f$ is linear then $f(x)$ may be written as $fx$. As usual, operations written multiplicatively are performed before operations written additively, if no parentheses dictate otherwise. Thus, an expression such as $fx + u$ is understood to mean $(f(x)) + u$, not $f(x + u)$.

A bilinear map is a mapping $f : X \times Y \to Z$ from the product of two linear spaces into a linear space, such that $f(x, \cdot) : Y \to Z$ is linear for each fixed $x \in X$, and $f(\cdot, y) : X \to Z$ is linear for each fixed $y \in Y$.

11.8. It is easy to see that if $X$ and $Y$ are $\mathbb{F}$-linear spaces, then $\operatorname{Lin}(X, Y)$ = {$\mathbb{F}$-linear maps from $X$ into $Y$} is a linear subspace of $Y^X$. A linear map from a vector space into the scalar field is also called a linear functional. The linear dual of an $\mathbb{F}$-linear space $X$ is the linear space $\operatorname{Lin}(X, \mathbb{F})$ = {linear maps from $X$ into $\mathbb{F}$}.
When the context is clear, $\operatorname{Lin}(X, \mathbb{F})$ may be called the dual of $X$ and denoted more briefly by $\operatorname{Lin}(X)$ or by $X^*$. The reader is cautioned that "dual" and "$X^*$" have other meanings in other contexts; see 9.55.

A linear isomorphism is a linear map that is bijective. Show that if $f : X \to Y$ is a linear isomorphism, then $f^{-1} : Y \to X$ is also linear, and hence is also a linear isomorphism.

11.9. Examples and further properties. Prove the following, either directly or as specializations of results about homomorphisms between algebraic systems.
a. Any linear map $f$ is a homomorphism of additive groups. Hence it satisfies $f(0) = 0$.
b. The identity map $i : X \to X$ is linear; its kernel is $\{0\}$.
c. If $X$ and $Y$ are linear spaces, then the constant mapping from $X$ to $Y$ that sends all elements to 0 is a linear map; its kernel is $X$.
d. If $f : X \to Y$ is a linear map, then $\operatorname{Graph}(f)$ is a linear subspace of $X \times Y$.
e. If $f : X \to Y$ is a linear map and $S$ is a linear subspace of $X$, then $f(S)$ is a linear subspace of $Y$.
f. If $f : X \to Y$ is a linear map and $T$ is a linear subspace of $Y$, then $f^{-1}(T)$ is a linear subspace of $X$.
g. If $f : X \to Y$ is a linear map, then $\{0\} \subseteq f^{-1}(0)$, and $f$ is injective $\iff$ $\{0\} = f^{-1}(0)$.
h. Let $A$ be an $m$-by-$n$ matrix over a field $\mathbb{F}$. Represent elements of the vector spaces $\mathbb{F}^m$ and $\mathbb{F}^n$ as column vectors. Then the map $v \mapsto Av$ is a linear map from $\mathbb{F}^n$ into $\mathbb{F}^m$.
i. (This example requires some familiarity with calculus.) The set $C[0,1]$ = {continuous functions from $[0,1]$ into $\mathbb{R}$} is a linear subspace of $\mathbb{R}^{[0,1]}$. Let any $g \in C[0,1]$ be fixed; then a linear functional $L_g : C[0,1] \to \mathbb{R}$ can be defined by the Riemann integral
$$L_g(f) = \int_0^1 f(t)\, g(t)\, dt \qquad (f \in C[0,1]).$$
This example will be generalized substantially in later chapters.
j. Let $S$ be a linear subspace of $X$. Define a relation on $X$ by $x_1 \approx x_2$ if $x_1 - x_2 \in S$. This is an equivalence relation. Let $X/S$ be the quotient space, i.e., the set of all equivalence classes, and let $\pi : X \to X/S$ be the quotient map. Then $X/S$ is a linear space, with operations defined by
$$\pi(x_1) + \pi(x_2) = \pi(x_1 + x_2), \qquad c\,\pi(x) = \pi(cx)$$
for $x_1, x_2, x \in X$ and $c \in \mathbb{F}$, and the quotient map is a linear map.
k. (Isomorphism Theorem.) If $f : X \to Y$ is a linear map, then $X/\operatorname{Ker}(f)$ is isomorphic to $\operatorname{Range}(f)$, by the mapping $F(\pi(x)) = f(x)$.
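Two of the facts above lend themselves to a quick numerical check: that a matrix acting on column vectors is a linear map, and that a linear map with a nonzero vector in its kernel cannot be injective. The sketch below is only an illustration under our own assumptions: the scalar field is taken to be $\mathbb{Q}$ (exact `Fraction` arithmetic), and the helper names `mat_vec`, `add`, `scale`, and `check_linearity`, as well as the sample matrix, are ours.

```python
from fractions import Fraction


def mat_vec(A, v):
    """Apply the m-by-n matrix A (a list of rows) to the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def add(u, v):
    """Coordinatewise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


def scale(c, v):
    """Scalar multiple c * v."""
    return [c * a for a in v]


def check_linearity(A, u, v, c):
    """Test additivity and homogeneity of v -> A v on one sample (not a proof)."""
    additive = mat_vec(A, add(u, v)) == add(mat_vec(A, u), mat_vec(A, v))
    homogeneous = mat_vec(A, scale(c, u)) == scale(c, mat_vec(A, u))
    return additive and homogeneous


# A 2-by-3 matrix over Q, i.e. a map from Q^3 into Q^2.
A = [[Fraction(1), Fraction(2), Fraction(3)],
     [Fraction(0), Fraction(1), Fraction(4)]]
u = [Fraction(1), Fraction(-1), Fraction(2)]
v = [Fraction(5), Fraction(0), Fraction(-3)]
c = Fraction(7, 2)

# A nonzero vector in the kernel of A: both rows of A annihilate it.
k = [Fraction(5), Fraction(-4), Fraction(1)]
```

Since `mat_vec(A, k)` is the zero vector, the distinct vectors `u` and `add(u, k)` have the same image, so this particular map is not injective.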
11.10. Proposition. Let $X$ and $Y$ be linear spaces. Let $S \subseteq X$, and let $f : S \to Y$ be some function. Then $f$ can be extended to a linear map $F : \operatorname{span}(S) \to Y$ if and only if $f$ has this property: whenever $s_1, s_2, \ldots, s_m$ are elements of $S$ and $a_1, a_2, \ldots, a_m$ are scalars such that $a_1 s_1 + a_2 s_2 + \cdots + a_m s_m = 0$, then $f$ satisfies $a_1 f(s_1) + a_2 f(s_2) + \cdots + a_m f(s_m) = 0$.

Moreover, if $f$ satisfies that condition, then the extension $F$ is unique, for it must satisfy
$$F(a_1 s_1 + a_2 s_2 + \cdots + a_m s_m) = a_1 f(s_1) + a_2 f(s_2) + \cdots + a_m f(s_m). \qquad (*)$$

Proof. If $(a_1 s_1 + \cdots + a_m s_m) = (b_1 t_1 + \cdots + b_n t_n)$, with the $s_i$'s and $t_j$'s in $S$, then
$$[a_1 f(s_1) + \cdots + a_m f(s_m)] - [b_1 f(t_1) + \cdots + b_n f(t_n)] = 0$$
by our hypothesis on $f$. Hence the formula (*) does indeed define a function $F$. Obviously that function is linear.

Further observation (optional). Let $\Phi$ be the collection of all graphs of functions $f$ from subsets of $X$ into $Y$, such that $f$ can be extended to a linear map on $\operatorname{span}(\operatorname{Dom}(f))$. Then $\Phi$ has finite character (see 3.46).

11.11. Real linear versus complex linear. In applications, the most important fields are $\mathbb{R}$ and $\mathbb{C}$. A linear space over the scalar field $\mathbb{R}$ is a real linear space; a linear space over the scalar field $\mathbb{C}$ is a complex linear space.

In some parts of functional analysis (e.g., vector lattices or nonlinear functional analysis) there are relatively few benefits from working with complex scalars. Consequently some mathematicians simplify their notation by only considering real linear spaces. This limitation is without much loss of generality, since every complex linear space can also be viewed as a real linear space. Indeed, we can replace the scalar multiplication $(\cdot) : \mathbb{C} \times V \to V$ with its restriction $(\cdot) : \mathbb{R} \times V \to V$, since $\mathbb{R} \subseteq \mathbb{C}$.

Complex scalars are important for some areas of functional analysis and its applications, particularly spectral theory and mathematical physics. Consequently some mathematical books and papers only consider complex linear spaces. For many purposes, this limitation is without loss of generality, for every real linear space $X$ can be viewed as a subset of a complex linear space, called the complexification of $X$, as we now show: Let $X$ be any real linear space. Then on the vector space $X \times X$ we can define the vector operations
$$(x_1, y_1) + (x_2, y_2) = (x_1 + x_2,\; y_1 + y_2) \qquad \text{and} \qquad (a + ib) \cdot (x_1, y_1) = (a x_1 - b y_1,\; a y_1 + b x_1)$$
for $(x_1, y_1)$ and $(x_2, y_2)$ in $X \times X$ and $a, b \in \mathbb{R}$. These definitions make $X \times X$ a complex linear space. We may denote it by $X + iX$ and its element $(x, y)$ by $x + iy$. Note that $X$ is isomorphic to the subset $\{(x, 0) : x \in X\}$.

This construction seems rather cumbersome, but in most applications the complexification arises naturally. Indeed, $\mathbb{C}$ (introduced in 10.24) is just the complexification of $\mathbb{R}$, and for any set $\Lambda$ the linear space $\mathbb{C}^\Lambda$ is the complexification of $\mathbb{R}^\Lambda$. Here is another example:
Let $\Lambda$ be a topological space; then a function $f : \Lambda \to \mathbb{C}$ is continuous if and only if it is of the form $f = u + iv$, where $u, v$ are real-valued continuous functions.

In 11.30.e we shall see that any complex linear space can be viewed as the complexification of a real linear space, though not necessarily in a constructive fashion.

11.12. Bohnenblust-Sobczyk Correspondence. Any complex vector space $V$ may also be viewed as a real vector space (by "forgetting" how to multiply by nonreal scalars), but the two viewpoints give us two different collections of linear functionals. How are the two collections related?

Recall that a mapping $f : V \to Z$, from one $\mathbb{F}$-linear space into another, is an $\mathbb{F}$-linear map if it is additive and satisfies $f(cv) = c f(v)$ for all $v \in V$ and $c \in \mathbb{F}$. Now suppose $V$ is a complex linear space. A real linear functional on $V$ is an $\mathbb{R}$-linear map from $V$ into $\mathbb{R}$; a complex linear functional on $V$ is a $\mathbb{C}$-linear map from $V$ into $\mathbb{C}$. Show that
a. If $f$ is a complex linear functional on $V$, then $g_1(v) = \operatorname{Re} f(v)$ and $g_2(v) = \operatorname{Im} f(v)$ are real linear functionals on $V$ with $g_2(v) = -g_1(iv)$, and $f = g_1 + i g_2$.
b. Conversely, if $g_1$ is a real linear functional on $V$, then $f(v) = g_1(v) - i g_1(iv)$ is a complex linear functional on $V$.
c. These transformations give a bijection $f \leftrightarrow g_1$ between the real linear and complex linear functionals on $V$.
d. (Optional.) Generalize the preceding argument. If $V$ is a complex linear space and $X$ is a real linear space with complexification $X + iX$, then there is a bijection between complex linear maps $f : V \to X + iX$ and real linear maps $g_1 : V \to X$. Also, if $T$ and $X$ are real linear spaces, then any real linear map from $T$ into $X$ extends uniquely to a complex linear map from $T + iT$ into $X + iX$.

LINEAR DEPENDENCE

11.13. Definitions. A set $S \subseteq X$ is linearly dependent if we can write
$$0 = c_1 s_1 + c_2 s_2 + \cdots + c_n s_n,$$
where $n$ is a positive integer, the $c_i$'s are nonzero scalars, and the $s_i$'s are distinct elements of $S$. If 0 cannot be expressed in this fashion, $S$ is linearly independent.

11.14. Observations. In any linear space:
a. $\varnothing$ is a linearly independent subset.
b. Any subset containing 0 is linearly dependent.
c. If $b$ and $c$ are distinct scalars and $v$ is any vector, then any set containing both $bv$ and $cv$ is linearly dependent.
d. If $v$ is a nonzero vector then the singleton $\{v\}$ is a linearly independent set.
e. A set $S$ is linearly dependent if and only if some point $s \in S$ is in $\operatorname{span}(S \setminus \{s\})$.
f. A set $S$ is linearly independent if and only if each finite subset of $S$ is linearly independent. Thus, the collection of all linearly independent sets has finite character (see 3.46).

11.15. Example. For each $r \in \mathbb{R}$, let $v(r) = (1, r, r^2, r^3, \ldots)$. Then $\{v(r) : r \in \mathbb{R}\}$ is a linearly independent subset of the real linear space $\mathbb{R}^{\mathbb{N}}$ = {sequences of reals}.

Hint: Suppose $a_1 v(r_1) + a_2 v(r_2) + \cdots + a_n v(r_n) = 0$ for some scalars $a_1, a_2, \ldots, a_n$ and some distinct real numbers $r_1, r_2, \ldots, r_n$. Show that $a_1 p(r_1) + a_2 p(r_2) + \cdots + a_n p(r_n) = 0$ for every polynomial $p$. By considering the Lagrange polynomials (2.2.e), show $a_1 = a_2 = \cdots = a_n = 0$.

11.16. Common Kernel Lemma. Let $X$ be a linear space, let $k$ be a positive integer, and let $\Lambda_0, \Lambda_1, \Lambda_2, \ldots, \Lambda_k$ be elements of $X^*$, the linear dual of $X$ (defined in 11.8). Then:

($P_k$) $\bigcap_{i=1}^{k} \operatorname{Ker}(\Lambda_i) \subseteq \operatorname{Ker}(\Lambda_0)$ if and only if $\Lambda_0 \in \operatorname{span}\{\Lambda_1, \Lambda_2, \ldots, \Lambda_k\}$.

($Q_k$) $\Lambda_1, \Lambda_2, \ldots, \Lambda_k$ are linearly independent elements of $X^*$ if and only if there exist vectors $x_1, x_2, \ldots, x_k \in X$ such that $\Lambda_j(x_i) = \delta_{ij}$ (where $\delta$ is the Kronecker delta).

Hints: The "if" parts are obvious. We shall prove the "only if" parts by induction on $k$, showing $Q_m \Rightarrow P_m \Rightarrow Q_{m+1}$. The proof of $Q_1$ is trivial.

To prove $Q_m \Rightarrow P_m$, first note that we can omit any element of the set $\{\Lambda_1, \Lambda_2, \ldots, \Lambda_m\}$ that is a linear combination of the other elements of that set; hence we may assume that that set is linearly independent. Now show that if $x_1, x_2, \ldots, x_m$ are as in $Q_m$ and $x \in X$, then $\sum_{j=1}^{m} \Lambda_j(x)\, x_j - x \in \bigcap_{i=1}^{m} \operatorname{Ker}(\Lambda_i)$.

To prove $P_m \Rightarrow Q_{m+1}$, let $\Lambda_0, \Lambda_1, \ldots, \Lambda_m$ be linearly independent elements of $\operatorname{Lin}(X)$. By symmetry (explain), it suffices to establish the existence of $x_0$ satisfying $\Lambda_i(x_0) = \delta_{i0}$. If no such $x_0$ exists, show $\bigcap_{i=1}^{m} \operatorname{Ker}(\Lambda_i) \subseteq \operatorname{Ker}(\Lambda_0)$.

11.17. Definitions. Let $X$ be a linear space, and let $B \subseteq X$. Then the following conditions are equivalent. If one (hence all) of them is satisfied, we say $B$ is a basis for $X$ or, to be more specific, a vector basis or linear basis. (Some mathematicians also call it a Hamel basis, but other mathematicians reserve that term for a narrower meaning indicated in 11.30.)

(A) For each nonzero vector $x \in X$, there is one and only one way (except for changing the order of the summation) to write $x = c_1 s_1 + c_2 s_2 + \cdots + c_n s_n$, with $n$ equal to a positive integer, with the $c_i$'s equal to nonzero scalars, and with the $s_i$'s equal to distinct elements of $B$.
(B) $B$ is linearly independent and $\operatorname{span}(B) = X$.
(C) $B$ is a maximal linearly independent subset of $X$, i.e., a linearly independent set that is not included in any other linearly independent set.
(D) $B$ is a minimal spanning set for $X$, that is, $\operatorname{span}(B) = X$ and $B$ does not contain any other set $A$ satisfying $\operatorname{span}(A) = X$.
Later we shall use the Axiom of Choice to prove that every linear space $V$ has a vector basis and that any two vector bases for $V$ have the same cardinality. That cardinality is called the dimension of $V$; it is written $\dim(V)$. The linear space is said to be finite dimensional or infinite-dimensional according as $\dim(V)$ is finite or infinite. When $V$ is finite-dimensional, then $\dim(V)$ is a nonnegative integer.

11.18. Examples and observations.
a. Let $X$ be the degenerate linear space $\{0\}$, which contains just the one vector 0. Then the empty set is a vector basis for $X$.
b. Let $X$ be a linear space, and let $S \subseteq X$. Then $S$ is linearly independent if and only if $S$ is a vector basis for $\operatorname{span}(S)$.
c. $\{(1,0), (0,1)\}$ and $\{(1,0), (1,1)\}$ are two different vector bases for $\mathbb{F}^2$.
d. Let $n$ be a positive integer. For each $j \in \{1, 2, \ldots, n\}$, define the vector
$$\eta_j = (0, \ldots, 0, 1, 0, \ldots, 0) = [0 \; \cdots \; 0 \; 1 \; 0 \; \cdots \; 0]^T \in \mathbb{F}^n,$$
where $\eta_j$ has a 1 in the $j$th position and 0s elsewhere. It is easy to see that any vector $v = (v_1, v_2, \ldots, v_n) \in \mathbb{F}^n$ can be written in one and only one way as $v = c_1 \eta_1 + c_2 \eta_2 + \cdots + c_n \eta_n$ for scalars $c_1, c_2, \ldots, c_n$; indeed, we must take $c_j = v_j$ for each $j$. Hence $\{\eta_1, \eta_2, \ldots, \eta_n\}$ is a vector basis for $\mathbb{F}^n$; it is called the standard basis for $\mathbb{F}^n$. The vector $\eta_j$ is called the $j$th standard basis vector for $\mathbb{F}^n$.

11.19.
a. Let $X$ and $Y$ be linear spaces, let $B$ be a vector basis for $X$, and let $f \in Y^B$. Then $f$ extends uniquely to a linear map from $X$ into $Y$. Thus, there is an isomorphism between the linear spaces $Y^B$ and $\operatorname{Lin}(X, Y)$ = {linear maps from $X$ into $Y$}. In particular, $\mathbb{F}^B$ is isomorphic to the linear dual of $X$ (defined in 11.8).
b. If $X$ is an $\mathbb{F}$-linear space with vector basis $B$, then $X$ is isomorphic to $\bigsqcup_{b \in B} \mathbb{F}$, the external direct sum of $B$ copies of the scalar field $\mathbb{F}$ (defined in 11.6). Indeed, an isomorphism $i$ from the external direct sum onto $X$ is given by $i(f) = \sum_{b \in B} f(b) \cdot b$. (This sum makes sense since $f(b)$ is a scalar, $b$ is a vector, and $f(b)$ vanishes for all but finitely many $b$.) For each $b \in B$, we have $i^{-1}(b)$ equal to $1_{\{b\}}$, the characteristic function (defined on $B$) of the singleton $\{b\}$; such functions form a vector basis for the external direct sum.

FURTHER RESULTS IN FINITE DIMENSIONS

11.20. Matrices as linear maps. Every linear map from $\mathbb{F}^n$ into $\mathbb{F}^m$ is uniquely representable as an $m$-by-$n$ matrix, in the sense of 11.9. Indeed, let $\{\eta_1, \eta_2, \ldots, \eta_n\}$ be the standard vector basis for $\mathbb{F}^n$. Let $A : \mathbb{F}^n \to \mathbb{F}^m$ be a linear map. Then the values of $A$ are determined by its values on the $\eta_j$'s; indeed, if $v = c_1 \eta_1 + c_2 \eta_2 + \cdots + c_n \eta_n$ for some scalars $c_j \in \mathbb{F}$, then $Av = c_1 A\eta_1 + c_2 A\eta_2 + \cdots + c_n A\eta_n$.
It follows that $A$ is represented by the $m$-by-$n$ matrix whose columns are the vectors $A\eta_j$ ($j = 1, 2, \ldots, n$), where $\eta_j$ is the $j$th standard basis vector.

11.21. If $v, w \in \mathbb{F}^n$, then $v^T w$ is the product of a 1-by-$n$ matrix and an $n$-by-1 matrix. Thus it is a 1-by-1 matrix, which we may view as a scalar. We shall call this the scalar product of the vectors $v$ and $w$. A couple of its basic properties are:
a. The scalar product is symmetric, that is, $v^T w = w^T v$.
b. In a product of matrices $C = AB$, introduced in 8.26, the component $c_{ik}$ is the scalar product of the $i$th row of $A$ with the $k$th column of $B$.

For each fixed $v \in \mathbb{F}^n$, define a mapping $f_v : \mathbb{F}^n \to \mathbb{F}$ by $w \mapsto v^T w$. Then:
a. $f_v$ is a linear functional on $\mathbb{F}^n$; that is, $f_v$ is a member of $(\mathbb{F}^n)^*$, the linear dual of $\mathbb{F}^n$.
b. The mapping $f : v \mapsto f_v$ is a linear map from $\mathbb{F}^n$ into $(\mathbb{F}^n)^*$.
c. The mapping $f : v \mapsto f_v$ is a bijection from $\mathbb{F}^n$ onto $(\mathbb{F}^n)^*$. (Thus $\mathbb{F}^n$ is isomorphic to its own linear dual.)

Hints: Any member of $(\mathbb{F}^n)^*$ is a linear map from $\mathbb{F}^n$ into $\mathbb{F}^1$, hence (by 11.20) representable as $f_v$ for some $n$-by-1 matrix $v$; thus $f$ is surjective. To show it is injective, note that if $v = (v_1, v_2, \ldots, v_n) \neq 0$, then $v_j \neq 0$ for at least one $j$, hence $f_v(\eta_j) \neq 0$, hence $f_v \neq 0$.

11.22. The dual of a linear map. Suppose $X$ and $Y$ are linear spaces over the scalar field $\mathbb{F}$, and $A : X \to Y$ is some linear map. Then we may define a dual map $A^* : Y^* \to X^*$ by $A^*(f) = f \circ A$, as in 9.55. Show that if $X$ and $Y$ are finite dimensional, and $A$ is represented by a matrix, then $A^*$ is represented by the transpose of that matrix.

More precisely: Let $v \mapsto f_v$ be the bijection from $\mathbb{F}^n$ onto $(\mathbb{F}^n)^*$ described above, and let $w \mapsto g_w$ be the analogous bijection from $\mathbb{F}^m$ onto $(\mathbb{F}^m)^*$. Suppose $A : \mathbb{F}^n \to \mathbb{F}^m$ is some linear map, represented by an $m$-by-$n$ matrix which we shall also denote by $A$. Define a corresponding map $A^* : (\mathbb{F}^m)^* \to (\mathbb{F}^n)^*$ by this rule: $[A^*(g_w)](x) = g_w(A(x))$ for any $x \in \mathbb{F}^n$; that is, $A^*(g_w)$ is the composition
$$\mathbb{F}^n \xrightarrow{\;A\;} \mathbb{F}^m \xrightarrow{\;g_w\;} \mathbb{F}.$$
Show that $A^*(g_w) = f_{A^T w}$, where $A^T$ is the transpose of the matrix $A$.

11.23. Lemma. Let $A$ be an $m$-by-$n$ matrix over $\mathbb{F}$. Then the $n$ columns of $A$ are linearly independent elements of $\mathbb{F}^m$ if and only if there exists an $n$-by-$m$ matrix $B$ such that $BA = I_n$. (We then say $B$ is a left inverse for $A$.)

Proof. For the "if" part, suppose $BA = I_n$, but the columns $A_1, A_2, \ldots, A_n$ are linearly dependent vectors in $\mathbb{F}^m$, i.e., suppose $c_1 A_1 + \cdots + c_n A_n$ is equal to $0_m$, the zero vector in $\mathbb{F}^m$, for some scalars $c_1, c_2, \ldots, c_n$ that are not all 0. Let $c = [c_1 \; c_2 \; \cdots \; c_n]^T$; infer that $Ac$ is equal to $0_m$. But then $c = I_n c = BAc = B0_m = 0_n$, a contradiction.
For the "only if" part, assume the columns $A_1, A_2, \ldots, A_n$ are linearly independent vectors in $\mathbb{F}^m$. View them as elements of the linear dual of $\mathbb{F}^m$, by their action in the scalar product (see 11.22). By the Common Kernel Lemma, there are vectors $b_1, b_2, \ldots, b_n \in \mathbb{F}^m$ such that $b_i^T A_j = \delta_{ij}$. Take the 1-by-$m$ matrices $b_i^T$ to be the rows of a matrix $B$; then $BA = I_n$.

11.24. Remarks. By taking transposes in the preceding result, we obtain this dual result: The rows of a matrix $A$ are linearly independent if and only if $A$ has a right inverse, i.e., a matrix $C$ such that $AC = I$.

It can also be proved, though not so easily, that the rows of a square matrix are linearly independent if and only if its columns are linearly independent. It implies that a linear operator $f : V \to V$, from a finite-dimensional linear space into itself, has a left inverse if and only if it has a right inverse. That conclusion is not valid in infinite-dimensional linear spaces, as we see from the example in 8.5. This slightly deeper result can be proved using determinants or other more advanced methods, but we shall not need it.

11.25. Example. Let $\mathbb{F}$ be a field. The set of all polynomials of degree $\leq n$, in one variable $x$, with coefficients in $\mathbb{F}$, is a linear space with dimension $n + 1$, when the vector operations are defined in the obvious fashion. One vector basis is $\{1, x, x^2, \ldots, x^n\}$.

11.26. Proposition. In any linear space $V$ over the field $\mathbb{F}$, if $w_1, w_2, \ldots, w_{n+1}$ are $n + 1$ vectors in the span of some $n$ vectors $y_1, y_2, \ldots, y_n$, then the vectors $w_1, w_2, \ldots, w_{n+1}$ are linearly dependent.

Proof. We first prove this in the special case where $V = \mathbb{F}^n$ and the $y_i$'s are the basis vectors $\eta_i$, i.e., we first show that any $n + 1$ vectors in $\mathbb{F}^n$ are linearly dependent. Assume, to the contrary, that $w_1, w_2, \ldots, w_{n+1}$ are linearly independent. View $w_1, w_2, \ldots, w_n$ as the columns of an $n$-by-$n$ matrix $W$; then there exists an $n$-by-$n$ matrix $B$ such that $BW = I$. Let $c = B w_{n+1} = [c_1 \; c_2 \; \cdots \; c_n]^T$. Show that $w_{n+1} = c_1 w_1 + \cdots + c_n w_n$, proving that $w_1, w_2, \ldots, w_{n+1}$ are linearly dependent.

Now, for an arbitrary linear space $V$, assume that $w_1, w_2, \ldots, w_{n+1}$ lie in the span of $\{y_1, y_2, \ldots, y_n\}$. Then
$$w_i = \sum_{j=1}^{n} a_{ij} y_j \qquad (i = 1, 2, \ldots, n + 1)$$
for some scalars $a_{ij}$. Let $A_i = [a_{i1} \; a_{i2} \; \cdots \; a_{in}]^T$. Then the $A_i$'s are $n + 1$ vectors in $\mathbb{F}^n$, so they are linearly dependent. Hence $c_1 A_1 + \cdots + c_{n+1} A_{n+1} = 0$ for some scalars $c_i$ that are not all 0. Now show that $c_1 w_1 + \cdots + c_{n+1} w_{n+1} = 0$.

11.27. Corollary. Let $X$ be a linear space. Assume that $X$ can be spanned by some finite subset of $X$. Let $n$ be the smallest number of vectors that span $X$. Then $X$ has at least one vector basis, any vector basis for $X$ contains exactly $n$ vectors, and $X$ is isomorphic to $\mathbb{F}^n$. (We then say that $X$ is finite dimensional, and we call $n$ the dimension of the vector space $X$.)
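The special case of the proposition, that any $n + 1$ vectors in $\mathbb{F}^n$ are linearly dependent, can be illustrated by actually computing a dependency. The sketch below is not the left-inverse argument used in the proof; it substitutes ordinary Gauss-Jordan elimination, takes the field to be $\mathbb{Q}$, and uses function names (`dependency`, `combination`) that are our own. Given vectors $w_1, \ldots, w_k$, it returns scalars $c_1, \ldots, c_k$, not all zero, with $c_1 w_1 + \cdots + c_k w_k = 0$, or `None` when the vectors are linearly independent.

```python
from fractions import Fraction


def dependency(vectors):
    """Find coefficients (not all zero) of a vanishing linear combination, or None."""
    k = len(vectors)          # number of vectors
    n = len(vectors[0])       # dimension of the ambient space Q^n
    # Column i of M holds the coordinates of vectors[i].
    M = [[Fraction(vectors[i][row]) for i in range(k)] for row in range(n)]
    pivots = []               # pivots[r] is the pivot column of row r
    r = 0
    for col in range(k):
        p = next((i for i in range(r, n) if M[i][col] != 0), None)
        if p is None:
            continue          # no pivot here: this column is "free"
        M[r], M[p] = M[p], M[r]
        lead = M[r][col]
        M[r] = [a / lead for a in M[r]]
        for i in range(n):
            if i != r and M[i][col] != 0:
                f = M[i][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(col)
        r += 1
    free = [c for c in range(k) if c not in pivots]
    if not free:
        return None           # every column has a pivot: independent
    j = free[0]
    coeffs = [Fraction(0)] * k
    coeffs[j] = Fraction(1)   # set one free coefficient to 1 ...
    for row, col in enumerate(pivots):
        coeffs[col] = -M[row][j]  # ... and solve for the pivot coefficients
    return coeffs


def combination(coeffs, vectors):
    """Evaluate sum(coeffs[i] * vectors[i]) coordinatewise."""
    n = len(vectors[0])
    return [sum(c * v[row] for c, v in zip(coeffs, vectors)) for row in range(n)]
```

With at most $n$ pivots available among $n + 1$ columns, at least one column is free, which is exactly why a dependency must exist in that case.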
CHOICE AND VECTOR BASES

11.28. Remarks. Many formulations of the Axiom of Choice were introduced in Chapter 6. By using the Axiom of Choice, we shall now obtain further results about vector bases and cardinality in infinite-dimensional linear spaces. These results provide representations of linear spaces, which may be conceptually helpful to the beginner. However, the results below are optional; they will not be needed later except for pathological examples. These results have little practical value in applied mathematics or in functional analysis, because (1) the Axiom of Choice and its consequences are nonconstructive, and (2) the vector basis of an infinite-dimensional topological linear space generally has little connection with the topology of that space.

11.29. We now state three more equivalents of Choice:

(AC16) Vector Basis Theorem (strong form). Let $X$ be a linear space over a field $\mathbb{F}$. Suppose that $I$ is a linearly independent subset of $X$, $G$ is a generating set (that is, $\operatorname{span}(G) = X$), and $I \subseteq G$. Then $I \subseteq B \subseteq G$ for some vector basis $B$.

(AC17) Vector Basis Theorem (intermediate form). Let $X$ be a linear space over a field $\mathbb{F}$, and let $G$ be a subset of $X$ that generates $X$ (that is, $\operatorname{span}(G) = X$). Then $X$ has a vector basis $B$ contained in $G$.

(AC18) Vector Basis Theorem (weak form). Every linear space has a vector basis.

Obviously (AC16) $\Rightarrow$ (AC17) $\Rightarrow$ (AC18).

Proof of (AC5) or (AC7) $\Rightarrow$ (AC16). Use 11.17(C).

Proof of (AC17) $\Rightarrow$ (AC2). This proof is due to Halpern [1966]. Let $\{S_\alpha : \alpha \in A\}$ be a nonempty set of nonempty disjoint sets; we wish to prove that there is a set $S_0$ consisting of exactly one element from each $S_\alpha$. To this end, we shall construct not only a suitable vector space, but also a suitable scalar field.

Let $S = \bigcup_{\alpha \in A} S_\alpha$. Let $E$ be a field disjoint from $S$; this can always be accomplished by relabeling. Let $\mathbb{F} = E[S]$ be the field of rational functions with coefficients in $E$ and variables in $S$ (see 8.24). Form the external direct sum
$$\Phi = \bigsqcup_{\alpha \in A} \mathbb{F} = \{ f \in \mathbb{F}^A : f(\alpha) \neq 0 \text{ for at most finitely many } \alpha \}$$
as in 11.6; then $\Phi$ is a linear space over $\mathbb{F}$. For each $s \in S$ and $\alpha \in A$, let
$$g_s(\alpha) = \begin{cases} s & \text{if } s \in S_\alpha, \\ 0 & \text{if } s \notin S_\alpha. \end{cases}$$
Then $g_s \in \Phi$ and in fact the set $G = \{g_s : s \in S\}$ spans $\Phi$. Let $B \subseteq G$ be a vector basis for $\Phi$ over $\mathbb{F}$; then $B = \{g_s : s \in S_0\}$ for some set $S_0 \subseteq S$. For each $\alpha \in A$, the characteristic
(That cardinality can therefore be called the d i m e n s i o n of the linear space. This proof is taken from Hall [1958]. Let V be an ISlinear space. . show that there exists a function f 9 I~ ~ I~ that is additive t h a t is. f.31. It may be instructive to contrast this with 8. Then tgu = ugt. . . Hint: 11. Then S is the range of a linear projection i. a 2 . If V and W are linear spaces over F. . its linear dual separates points of V. Then any two vector bases for V over F have the same cardinality. satisfying the condition that each x E X can be written in one and only one way as s + t with s C S and tET. c.18.) Proof. 11. there exists a linear map f : X ~ S that has range S and satisfies f(s) = s for each s c S.25). then f can be extended to a linear function from V into W. Corollaries of the Vector Basis Theorem. 1934). and f 9I ~ W is any function. u are distinct members of So N S~.e).d. contradicting the fact that the set {gt. tn} obtained in this fashion. g. X has a linear subspace T satisfying S+T X and SMT {0}. I C_ V is linearly independent. To show that So meets each S~ in at most one point. a. If V is any linear space. . d. or. Then S has an additive complement T that is. and thus So meets S~. Hence there is at least one gs c B that does not vanish at a. and some vectors tl. + antn for some positive integer n. given by Blass [1984]. . . b. (Recall that (MC) was stated in 6.286 Chapter 11: Linearity function of the singleton {a} is an element of 9 and thus in the span of B. e. Let S and T be vector bases for V. gu} is linearly independent. .15.e.an. as noted in 11. some nonzero scalars a l . tn E T..17.16. Each s c S can be expressed uniquely (except for the order of summation) in the form s = altl + . Let S be a linear subspace of a linear space X.42.) This proof. .18. a basis for this linear space is called a H a m e l basis. . satisfying f ( s + t) . suppose t. . equivalently.) Using such a basis. 
We may view I~ as a linear space over the scalar field Q. . (Some mathematicians apply that term more widely. is similar but somewhat longer. T h e o r e m ( L S w i g . Proof of (ACI8) =~ (MC). Let F(s) be the finite set {tl. .f(s) + f(t) but not continuous. t 2 . t 2 . Any complex linear space can be represented as the complexification of some real linear space (as defined in 11. Let S be a linear subspace of a linear space X.11). . Remark: Compare this with 24. 11. We remark that it uses Blass's subfield (8. . and so we shall omit it. Any Flinear space can be represented as an external direct sum of copies of IF (see 11.30.
44. m~t" 11.30. DIMENSION OF THE LINEAR DUAL (OPTIONAL) 11.dim(X). T h e o r e m .47.37(ii).. d i m ( X ) } .. then dim(X*) > dim(X). 11.Dimension of the Linear Dual (Optional) 287 If 8 1 . the set of all linear maps from X into F. Let B be a vector basis for X. we have dim(X*) _> dim(2N). and 11. S 2 . Hints" Let {e0.. the results below assume the Axiom of Choice.25)..card(X*). 8 k are distinct elements of S. Similarly card(T) _< card(S). el. . hence dim(X*) . L e m m a . Let B be any vector basis for X.  n < oc. Then use the SchrSderBernstein Theorem. . Let L]b~B F be the external direct sum (defined in 11.e.35. For each real number r. then dim(X*) _> card(2N).6. for otherwise S l .card((2N) B) .. c a r d ( X ) . 6. there exist points t(s) E f ( s ) such that the mapping s H t(s) is injective. .b there exists some f~ E X* satisfying f~(en) .22. Hi t . 8 2 .} be any linearly independent sequence in X. If X is infinitedimensional. If X is infinitedimensional.32. Proof.36. . . By M.15.card(X). Throughout the discussion below.18. to show that the f~'s are linearly independent members of X*. and remarks. by 11. . then card(F x B) . Also.0.max{card(F). then dim(X*)  n also. U F ( s k ) contains at least k elements. hence card(N x B) .. The results below make use of the fact (proved in 10. assume F is either R or C.card(2 NxB) card(2 B) > card(B) . then B is an infinite set. . . Let X be a linear space over F. 2.e.22. and card(X)  11.34. card(X*) Observation. e2. let F be the scalar field.card(F). notations. Hall's Marriage Theorem 6.sk would be linearly dependent (by 11. and let X* be its linear dual i. . now apply the SchrSderBernstein Theorem 2.a. A s s u m p t i o n s . Use 11. thus card(S) <_ card(T). If dim(X) . .22.m a x { c a r d ( F ) . these results should be contrasted with 27. Now apply 11.f) that card(R) = card(C) = card(2N).r ~ for n . 11.19. 11.29 to explain card(X)card (hUB F ) _< card = card(F x B) < card(X x X) .card(F B) . Proposition. 
then F ( S l ) U F(s2)LJ .card(B).. . we have card(X*) .dim(X)} by (AC13) in 6.i) of B copies of F. 1. Since the X* is isomorphic to F B.33. . By the preceding results.
PREVIEW OF MEASURE AND INTEGRATION
11.37. Definitions. Let $X$ be an additive monoid. (In most cases of interest $X$ is either $[0, +\infty]$ or a vector space.) Let $\mathcal{S}$ be a collection of subsets of a set $\Omega$ with $\varnothing \in \mathcal{S}$, and let $\tau : \mathcal{S} \to X$ be some mapping satisfying $\tau(\varnothing) = 0$. We say that $\tau$ is

finitely additive if $\tau\bigl(\bigcup_{j=1}^{n} S_j\bigr) = \sum_{j=1}^{n} \tau(S_j)$ whenever $S_1, S_2, \ldots, S_n$ are finitely many disjoint members of $\mathcal{S}$ whose union is also a member of $\mathcal{S}$;

countably additive (or $\sigma$-additive) if $X$ is equipped with a metric (or other notion of convergence) and $\tau\bigl(\bigcup_{j=1}^{\infty} S_j\bigr) = \sum_{j=1}^{\infty} \tau(S_j)$ whenever $S_1, S_2, S_3, \ldots$ is a sequence of disjoint members of $\mathcal{S}$ whose union is also a member of $\mathcal{S}$.

The expression $\sum_{j=1}^{\infty} \tau(S_j)$ is defined as in 10.39. Of course, every countably additive mapping is also finitely additive, since we may take $S_{n+1}, S_{n+2}, S_{n+3}, \ldots$ all equal to $\varnothing$. We emphasize that "finitely additive" means "at least finitely additive, and perhaps countably additive;" it does not mean "finitely additive but not countably additive."

Aside from the requirements $\varnothing \in \mathcal{S} \subseteq \mathcal{P}(\Omega)$, the collection $\mathcal{S}$ in the definition above is arbitrary. We now impose some additional restrictions. By a charge we shall mean a finitely additive mapping from an algebra of sets into an additive monoid. By a measure we shall mean a countably additive mapping from a $\sigma$-algebra of sets into an additive monoid equipped with some convergence structure.

Cautions: The terminology varies considerably throughout the literature. Some mathematicians apply the term "measures" to what we have called charges, or to countably additive charges, or to positive measures (defined below), etc. Unfortunately, the phrase "$\mu$ is a charge (or measure) on $W$" has two different meanings in the literature: It may mean that $W$ is the ($\sigma$-)algebra $\mathcal{S}$ on which $\mu$ is defined, or it may mean that $W$ is the underlying set $\Omega$ on which $\mathcal{S}$ is defined. One must determine from context just which meaning is intended.

11.38. Remarks on the choice of the codomain $X$.
In most applications of charges, the monoid $X$ is either $[0, +\infty]$ or some vector space; then $\mu$ may be called a positive charge or a vector charge, respectively. Though a wide variety of vector spaces are used in this fashion in spectral theory, in more elementary applications the vector spaces most often used for the monoid $X$ are the one-dimensional vector spaces $\mathbb{R}$ and $\mathbb{C}$. The resulting charge or measure is then called a real-valued charge or measure or a complex charge or measure, respectively. We shall study positive charges and measures in 21.9 and thereafter; real-valued charges and measures in 11.47 and thereafter; and other vector charges and measures in 29.3 and thereafter. Positive charges and vector charges differ only slightly in their definition, but more substantially in their use. We are mainly interested in positive charges when they are in fact measures; moreover, it is commonplace to fix one particular positive measure $\mu$ and then use it for many different purposes. In contrast, vector charges are sometimes of interest without countable additivity or $\sigma$-algebras, but they are of interest mainly in large
collections, i.e., we may study the relationships between many different vector charges, which are members of a "space of charges" as in 11.47. An important part of the theory of vector measures $\nu$ is the question of just when they can be represented in the form

$$\nu(S) = \int_S f(\omega)\, d\mu(\omega)$$

for some vector-valued function $f$ and some positive measure $\mu$; see 29.20 and 29.21.

11.39. Remarks on the choice of the domain $\mathcal{S}$. In most of our elementary examples of charges or measures later in this book, the collection of sets $\mathcal{S}$ is actually equal to $\mathcal{P}(\Omega) = \{\text{subsets of } \Omega\}$. However, our most important measure is Lebesgue measure, which is not so elementary and which is not defined on a $\sigma$-algebra of the form $\mathcal{P}(\Omega)$; in 21.22 we prove it cannot be extended in a natural way to $\mathcal{P}(\Omega)$. In many cases of interest, $\Omega$ is a topological space, and $\mathcal{S}$ is either the Borel $\sigma$-algebra or some $\sigma$-algebra containing the Borel $\sigma$-algebra. Recall that the Borel $\sigma$-algebra is the $\sigma$-algebra on $\Omega$ generated by the topology, i.e., the smallest $\sigma$-algebra containing all the open sets; the members of that $\sigma$-algebra are called Borel sets.

A measurable space is a pair $(\Omega, \mathcal{S})$ consisting of a set $\Omega$ and a $\sigma$-algebra $\mathcal{S}$ of subsets of $\Omega$; a measure space is a triple $(\Omega, \mathcal{S}, \mu)$ in which $\mu$ is a positive measure on $\mathcal{S}$. (It might be more descriptive to call $(\Omega, \mathcal{S}, \mu)$ a "positive measure space," but we shall not be concerned with a measure "space" in which $\mu$ is a vector measure.) Thus, a measurable space is a space that is capable of being equipped with a measure; a measure space is a space that has been equipped with a positive measure. These terms should not be confused with each other, or with a space of measures, i.e., a collection of measures equipped with some structure that makes the collection into a vector space, a topological space, or some other sort of "space," as in 11.48.

11.40. Several kinds of integrals will be introduced in this book; still more integrals can be found in the wider literature. When necessary, we shall specify what kind of integral is being used. Fortunately, the several integrals generally agree in those cases where they are all defined.
For instance, $\int_0^1 t^2\, dt$ makes sense as a Riemann integral or as a Lebesgue integral, but with either interpretation the expression has the value $1/3$. We now informally sketch some of the main features shared by most types of integrals. Precise definitions will be given later.

In general, an integral $\int_S f\, d\mu$ depends on a set $S$, a function $f$ (called the integrand), and a charge $\mu$. In some of our studies of integrals, we may hold one or two of the arguments $S$, $f$, $\mu$ fixed. When an argument is held fixed and/or its value is understood, then it may be suppressed from the notation; thus

$$\int_S f\, d\mu \quad\text{may be written as}\quad \int_S f \quad\text{or}\quad \int f\, d\mu \quad\text{or}\quad \int f.$$

Usually, when $S$ is omitted from the notation, then $S$ is understood to be equal to $\Omega$. When $\Omega$ is a subset of $\mathbb{R}^n$ and $\mu$ is Lebesgue measure, then $d\mu(\omega)$ may be written simply as $d\omega$. The integral $\int_S f\, d\mu$ may be written in greater detail as $\int_S f(\omega)\, d\mu(\omega)$. Here $\omega$ is a dummy variable, or placeholder. It is sometimes helpful in clarifying just what is the argument of $f$, particularly if the function $f$ is complicated. The integral is not altered in value if we replace $\omega$ with some other letter, or omit it altogether. Thus:

$$\int_S f(\omega)\, d\mu(\omega) = \int_S f(\lambda)\, d\mu(\lambda) = \int_S f\, d\mu.$$

11.41. Using ($\sigma$-)algebras and charges, we shall consider integrals $\int f\, d\mu$ of three main types in this book:

(i) $\mu$ is a vector charge taking values in a complete normed vector space, and $f$ is a scalar-valued function taking values in the scalar field of that vector space. Then $\int f\, d\mu$ takes values in the vector space. We shall call this a Bartle integral (though the terminology varies in the literature); this type of integral is introduced in 29.30. The mapping $(f, \mu) \mapsto \int f\, d\mu$ is bilinear, i.e., linear in each variable when the other variable is held fixed. For $f$ and $\mu$ held fixed, the mapping $S \mapsto \int_S f\, d\mu$ is finitely additive; i.e., it is a vector charge. This is algebraically the simplest type of integral we shall consider. We modify this concept in a couple of ways, indicated below, to allow $+\infty$ in our computations.

(ii) $\mu$ is a positive measure (and thus may take the value $+\infty$), $f$ is a function taking values in some complete normed vector space, and some restriction is placed on $\|f(\cdot)\|$ so that it is "not too big." Then $\int f\, d\mu$ takes values in the vector space. We shall call this a Bochner integral; it is introduced in 23.16. It is a linear function of $f$, for fixed $\mu$. For fixed $f$, the mapping $\mu \mapsto \int f\, d\mu$ is like the "upper half" of a linear map: It preserves sums and multiplication by positive constants. For $f$ and $\mu$ fixed, the mapping $S \mapsto \int_S f\, d\mu$ is countably additive; i.e., it is a vector measure. A central result for Bochner integrals is Lebesgue's Dominated Convergence Theorem, 22.29.

(iii) $\mu$ and $f$ both take values in $[0, +\infty]$, and $\int f\, d\mu$ does, too. We shall call this a positive integral; it is introduced in 21.36. It behaves like the "positive quadrant" of a bilinear mapping: The maps $f \mapsto \int f\, d\mu$ and $\mu \mapsto \int f\, d\mu$ both preserve sums and multiplication by positive constants. For $f$ and $\mu$ fixed, the mapping $S \mapsto \int_S f\, d\mu$ is countably additive; i.e., it is a positive measure. A central result for positive integrals is Lebesgue's Monotone Convergence Theorem, 21.38(ii). We emphasize that for integrals of this type, $\int f\, d\mu$ may take the value $+\infty$. When $\int f\, d\mu$ exists and is finite, we say that $f$ is integrable.

Other types of integrals over charges are possible, of course. For instance, for any vector spaces $X$, $Y$, $Z$, we could integrate an $X$-valued function $f$ with respect to a $Y$-valued measure $\mu$, using some bilinear map $\langle\ ,\ \rangle : X \times Y \to Z$; then $\int f\, d\mu$ takes values in $Z$. However, such integrals will not be studied in this book.

A few other integrals will be defined in other fashions, not in terms of charges and algebras. The Riemann integral $\int_a^b f(t)\, dt$ is reviewed in Chapter 24; in that chapter we
also introduce the Henstock integral $\int_a^b f(t)\, dt$ and the Henstock-Stieltjes integral $\int_a^b f(t)\, d\varphi(t)$, and show how these integrals are related to the Lebesgue integral. Here $f$ and $\varphi$ are functions defined on an interval $[a, b]$.

11.42. Integration of simple functions. Let $\mathcal{A}$ be an algebra of subsets of a set $\Omega$. A function $f : \Omega \to X$ is called a simple function if the range of $f$ is a finite subset of $X$, and $f^{-1}(x) \in \mathcal{A}$ for each $x \in X$ (or equivalently, for each $x \in \mathrm{Ran}(f)$). Equivalently, a simple function is one that can be written in the form

$$f(\cdot) = \sum_{j=1}^{n} 1_{S_j}(\cdot)\, x_j \qquad (*)$$

where $n$ is a positive integer, the $x_j$'s are members of $X$, and $1_{S_j}(\cdot)$ is the characteristic function of some set $S_j \in \mathcal{A}$. (The representation $(*)$ is not unique, since we do not require the $x_j$'s to be nonzero or distinct and we do not require the $S_j$'s to be disjoint.) If $X$ is a vector space, then it is easy to verify that the simple functions form a linear subspace of $X^{\Omega}$. If $X = [0, +\infty]$, the set of simple functions is not a linear space, but at least it acts like the "upper half" of a linear space: It is closed under addition and under multiplication by nonnegative constants.

Now let $\mu$ be a charge defined on $\mathcal{A}$, taking values in some monoid $K$, and let $f : \Omega \to X$ be a simple function. When it makes sense, we define

$$\int_{\Omega} f\, d\mu = \sum_{x \in X} \mu\bigl(f^{-1}(x)\bigr)\, x.$$

The summation on the right is over all $x \in X$ or, equivalently (since $\mu(\varnothing) = 0$), the summation is over all $x \in \mathrm{Ran}(f)$. Thus, the summation involves only finitely many terms. Equivalently, if $f$ is represented by $(*)$, then

$$\int_{\Omega} f\, d\mu = \sum_{j=1}^{n} \mu(S_j)\, x_j.$$

For these summations to make sense, we must also make certain restrictions: We must have some notion of how to multiply $x$ times $\mu(f^{-1}(x))$ and how to add up the resulting products. This requirement is met by any simple function, in cases 11.41(i) and 11.41(iii). In case 11.41(ii), the requirement is met by any simple function $f$ that satisfies this additional hypothesis:

$$\mu\bigl(f^{-1}(x)\bigr) < \infty \quad \text{for each } x \in X \setminus \{0\}.$$

In this case we say that $f$ is an integrable simple function. If we use representation $(*)$, then we must choose the $S_j$'s so that no nonzero vector $x_j$ is associated with a set $S_j$ that has infinite measure. (That is accomplished, for instance, if we require that $f$ be an integrable simple function and the $S_j$'s be disjoint.)
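The agreement between the two formulas for the integral of a simple function can be checked by hand on a small example. The following Python sketch is only an illustration under assumptions that are not in the text: $\Omega$ is a six-point set, $\mathcal{A}$ is its full power set, the charge `mu` is built from hypothetical point weights, and all function names are invented for this example.

```python
from fractions import Fraction

def integrate_by_range(f, mu, omega):
    """Sum of mu(f^{-1}(x)) * x over all x in Ran(f)."""
    total = Fraction(0)
    for x in {f(w) for w in omega}:
        preimage = frozenset(w for w in omega if f(w) == x)
        total += mu(preimage) * x
    return total

def integrate_by_representation(terms, mu):
    """Sum of mu(S_j) * x_j for a representation f = sum of 1_{S_j} x_j."""
    return sum((mu(S) * x for S, x in terms), Fraction(0))

# Hypothetical finite setting: Omega = {0,...,5}, algebra = all subsets.
omega = frozenset(range(6))
weights = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(2),
           3: Fraction(0), 4: Fraction(1), 5: Fraction(1, 6)}

def mu(S):
    """A finitely additive charge: total weight of the points of S."""
    return sum((weights[w] for w in S), Fraction(0))

# f = 3 * 1_{0,1,2} + 5 * 1_{2,3,4}; the two sets overlap on purpose.
terms = [(frozenset({0, 1, 2}), 3), (frozenset({2, 3, 4}), 5)]

def f(w):
    return sum(x for S, x in terms if w in S)

via_range = integrate_by_range(f, mu, omega)
via_representation = integrate_by_representation(terms, mu)
print(via_range, via_representation)
```

Both computations give the same value even though the sets in the representation are not disjoint, which is what finite additivity of the charge predicts.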
11.43. Simple functions should not be confused with step functions, though the two notions are closely related. A step function is a mapping $f : [a, b] \to X$, from some subinterval of $\mathbb{R}$ into some vector space, with the property that there exists some partition $a = t_0 < t_1 < t_2 < \cdots < t_n = b$ such that $f$ is constant on each subinterval $(t_{j-1}, t_j)$. (Different partitions may be used for different step functions.) Step functions are a special case of simple functions, as follows: Let $\mathcal{A}$ be the collection of all finite unions of subintervals of $[a, b]$. (We interpret "subintervals" so that singletons and the empty set belong to $\mathcal{A}$.) Then $\mathcal{A}$ is an algebra of sets, and the resulting $X$-valued simple functions (defined as in 11.42) are precisely the step functions.
ORDERED VECTOR SPACES
11.44. Remarks. We shall only consider ordered vector spaces using $\mathbb{R}$ for the scalar field. It is possible to develop a theory of ordered vector spaces using other scalar fields (see, for instance, Schaefer [1971]), but such a theory is more complicated and less natural and intuitively appealing; it is not recommended for beginners.

Definitions. An ordered vector space is a real vector space $X$ equipped with a partial ordering $\preccurlyeq$ such that

(i) $x \preccurlyeq y \;\Rightarrow\; x + u \preccurlyeq y + u$ (i.e., $X$ is an ordered group); and

(ii) if $x \succcurlyeq 0$ in $X$ and $r > 0$ in $\mathbb{R}$, then $rx \succcurlyeq 0$ in $X$.

We say $X$ is a Riesz space, or vector lattice, if in addition

(iii) $(X, \preccurlyeq)$ is a lattice, i.e., each finite nonempty subset of $X$ has a supremum and an infimum.

Finally, $X$ is a lattice algebra (or algebra lattice) if $X$ is also an algebra (in the classical sense, as in 11.3) whose vector multiplication satisfies

(iv) $x, y \succcurlyeq 0 \;\Rightarrow\; xy \succcurlyeq 0$.

If $X$ is a Riesz space, then a Riesz subspace is a subset $S$ that is closed under the vector operations and the lattice operations, that is,

$$s, t \in S,\ c \in \mathbb{R} \;\Rightarrow\; s + t,\ cs,\ s \vee t,\ s \wedge t \in S.$$

Clearly, such a set is itself a Riesz space, when equipped with the restriction of the operations of $X$.

11.45.
Example: real-valued functions. Let $\Lambda$ be any set. Then the product

$$\mathbb{R}^{\Lambda} = \{\text{functions from } \Lambda \text{ into } \mathbb{R}\}$$

is a Dedekind complete lattice algebra, when given the product ordering, that is, when ordered by $x \preccurlyeq y$ if $x(\lambda) \le y(\lambda)$ for every $\lambda \in \Lambda$. The vector and lattice operations are defined pointwise:

$$(x + y)(\lambda) = x(\lambda) + y(\lambda), \qquad (x \vee y)(\lambda) = \max\{x(\lambda), y(\lambda)\}, \qquad (x \wedge y)(\lambda) = \min\{x(\lambda), y(\lambda)\}.$$

More generally, for any set $S \subseteq \mathbb{R}^{\Lambda}$ that is bounded above or below by some real-valued function, we have

$$[\sup(S)](\lambda) = \sup\{s(\lambda) : s \in S\}, \qquad [\inf(S)](\lambda) = \inf\{s(\lambda) : s \in S\}.$$

When this ordering is used, many mathematicians write $x \le y$ instead of $x \preccurlyeq y$. However, in this book we shall often write $\preccurlyeq$ for such an ordering, to help beginners avoid inadvertently attributing familiar properties (e.g., a chain ordering) to a familiar symbol.

11.46. Further examples: subspaces of $\mathbb{R}^{\Lambda}$. The pointwise formulas given for $x \vee y$, $x \wedge y$, $\sup(S)$, $\inf(S)$ in the previous paragraph remain valid in many important subsets of $\mathbb{R}^{\Lambda}$; some of these are listed below.

a. The set $B(\Lambda) = \{\text{bounded functions from } \Lambda \text{ into } \mathbb{R}\}$ is a Dedekind complete lattice subalgebra of $\mathbb{R}^{\Lambda}$.

b. The space $C[a, b] = \{\text{continuous functions from } [a, b] \text{ into } \mathbb{R}\}$ is a lattice subalgebra of $\mathbb{R}^{[a,b]}$, for any real numbers $a, b$ with $a < b$. $C[a, b]$ is not Dedekind complete. Example. Show that the sequence of functions $f_n(t) = \sqrt[n]{\max\{0, t\}}$ is bounded above in $C[-1, 1]$ but does not have a least upper bound in $C[-1, 1]$.

c. The space $C^1[a, b] = \{\text{continuously differentiable functions from } [a, b] \text{ into } \mathbb{R}\}$ is a subalgebra of $\mathbb{R}^{[a,b]}$, i.e., it is closed under addition and both multiplications. Also, it is an ordered vector space. $C^1[a, b]$ is not a lattice. Example. Let $x(t) = t$ and $y(t) = -t$. Show that the set $\{x, y\}$ has an upper bound in $C^1[-1, 1]$, but not a least upper bound.

d. If we use $\mathbb{R}$ for the scalar field, then many of the Banach spaces used in the theory of measure and integration are vector lattices. They are not subspaces of $\mathbb{R}^{\Lambda}$; rather, they are subspaces of a quotient space $\mathbb{R}^{\Lambda}/J$ for some ideal $J$. Examples will be developed in later chapters.
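11.47. The space of bounded real charges. Let $\Omega$ be a set, let $\mathcal{A}$ be an algebra of subsets of $\Omega$, and let

$$ba(\mathcal{A}, \mathbb{R}) = \{\text{bounded, real-valued charges on } \mathcal{A}\}.$$

(Here, "ba" stands for "bounded additive.") Then $ba(\mathcal{A}, \mathbb{R})$ is a linear subspace of $B(\mathcal{A}, \mathbb{R}) = \{\text{bounded functions from } \mathcal{A} \text{ into } \mathbb{R}\}$, which is in turn a linear subspace of $\mathbb{R}^{\mathcal{A}} = \{\text{functions from } \mathcal{A} \text{ into } \mathbb{R}\}$. Let $ba(\mathcal{A}, \mathbb{R})$ be equipped with the restriction of the product ordering, that is,

$$\mu \preccurlyeq \nu \quad\text{means that}\quad \mu(A) \le \nu(A) \text{ for every } A \in \mathcal{A}.$$

Then $ba(\mathcal{A}, \mathbb{R})$ is a Dedekind complete vector lattice, with lattice operations as follows:

$$(\mu \vee \nu)(A) = \sup\{\mu(B) + \nu(A \setminus B) : B \in \mathcal{A},\ B \subseteq A\},$$
$$(\mu \wedge \nu)(A) = \inf\{\mu(B) + \nu(A \setminus B) : B \in \mathcal{A},\ B \subseteq A\}.$$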
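The two formulas above for the lattice operations in the space of bounded real charges can be evaluated by brute force when $\Omega$ is finite. The following Python sketch is an illustration only, under assumptions not made in the text: $\Omega$ has two points, the algebra is the full power set, and the helper names (`subsets`, `charge`, `join`, `meet`) are invented for this example.

```python
from itertools import combinations

def subsets(A):
    """All subsets of the finite set A, as frozensets."""
    items = sorted(A)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def charge(weights):
    """Finitely additive charge on the power set, given point weights."""
    return lambda S: sum(weights[w] for w in S)

def join(mu, nu):
    """(mu v nu)(A) = sup of mu(B) + nu(A \\ B) over subsets B of A."""
    return lambda A: max(mu(B) + nu(A - B) for B in subsets(A))

def meet(mu, nu):
    """(mu ^ nu)(A) = inf of mu(B) + nu(A \\ B) over subsets B of A."""
    return lambda A: min(mu(B) + nu(A - B) for B in subsets(A))

omega = frozenset({0, 1})
mu = charge({0: 1, 1: -1})
nu = charge({0: -1, 1: 1})

sup_charge = join(mu, nu)
inf_charge = meet(mu, nu)

print(sup_charge(omega), max(mu(omega), nu(omega)))
```

Here the join evaluated at $\Omega$ is strictly larger than the pointwise maximum of the two charges at $\Omega$, while the join remains additive over the two singletons.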
Although $ba(\mathcal{A}, \mathbb{R})$ is a linear subspace of $\mathbb{R}^{\mathcal{A}}$, it is not a sublattice. The lattice operations $\vee$ and $\wedge$ shown in the preceding paragraph are not simply the restrictions of the lattice operations of $\mathbb{R}^{\mathcal{A}}$. Indeed, $ba(\mathcal{A}, \mathbb{R})$, considered as a subset of $\mathbb{R}^{\mathcal{A}}$, is not closed under that space's lattice operations; an elementary example of this is given in the exercise in 21.11.c.

Since $ba(\mathcal{A}, \mathbb{R})$ is a vector lattice, each charge $\mu$ has a positive part, a negative part, and an absolute value, as defined in 8.39. In the present context those are

$$\mu^{+}(A) = \sup\{\mu(S) : S \in \mathcal{A},\ S \subseteq A\},$$
$$\mu^{-}(A) = \sup\{-\mu(S) : S \in \mathcal{A},\ S \subseteq A\},$$
$$|\mu|(A) = \sup\{\mu(S) - \mu(A \setminus S) : S \in \mathcal{A},\ S \subseteq A\},$$

respectively. The three functions $\mu^{+}$, $\mu^{-}$, and $|\mu|$ are called the positive variation, the negative variation, and the variation (or total variation) of $\mu$, respectively; they are members of $ba(\mathcal{A}, \mathbb{R})$. The variation of $\mu$ may also be written as $\mathrm{Var}(\mu)$. We emphasize that any bounded real charge has finite variation; this fact will be important in 29.6.d and 29.6.h.

The lattice $ba(\mathcal{A}, \mathbb{R})$ is Dedekind complete. If $M$ is a nonempty subset of $ba(\mathcal{A}, \mathbb{R})$, bounded above or below by some member of $ba(\mathcal{A}, \mathbb{R})$, then we have

$$[\sup(M)](A) = \sup \sum_{j=1}^{n} \mu_j(S_j) \qquad\text{or}\qquad [\inf(M)](A) = \inf \sum_{j=1}^{n} \mu_j(S_j),$$

respectively, where the sup or inf is over all positive integers $n$, all finite sets of charges $\{\mu_1, \mu_2, \ldots, \mu_n\} \subseteq M$, and partitions $A = S_1 \cup S_2 \cup \cdots \cup S_n$ where the $S_j$'s are disjoint elements of $\mathcal{A}$. If $(M, \preccurlyeq)$ or $(M, \succcurlyeq)$ is a directed set, then we obtain these simpler formulas, respectively:

$$[\sup(M)](A) = \sup_{\mu \in M} \mu(A) \qquad\text{or}\qquad [\inf(M)](A) = \inf_{\mu \in M} \mu(A).$$

(Hints: Since $ba(\mathcal{A}, \mathbb{R})$ is a lattice, we easily reduce the proof to the case where $M$ is a directed set; then use the fact that a setwise limit of charges is a charge.) The space $ba(\mathcal{A}, \mathbb{R})$ is studied in greater depth by Bhaskara Rao and Bhaskara Rao [1983].

11.48. The space of bounded, countably additive real charges. Let $\Omega$ be a set, let $\mathcal{A}$ be an algebra of subsets of $\Omega$, and let

$$ca(\mathcal{A}, \mathbb{R}) = \{\text{bounded, countably additive, real-valued charges on } \mathcal{A}\}.$$
(We emphasize that $\mathcal{A}$ is not assumed to be a $\sigma$-algebra, so the members of $ca(\mathcal{A}, \mathbb{R})$ are not necessarily measures.) Then $ca(\mathcal{A}, \mathbb{R})$ is a sublattice of $ba(\mathcal{A}, \mathbb{R})$, that is, $ca(\mathcal{A}, \mathbb{R})$ is closed under the binary operations $\vee$ and $\wedge$ of $ba(\mathcal{A}, \mathbb{R})$. (Exercise.) Moreover, $ca(\mathcal{A}, \mathbb{R})$ is Dedekind complete. If $M$ is a subset of $ca(\mathcal{A}, \mathbb{R})$ that is bounded above or below by some member of $ca(\mathcal{A}, \mathbb{R})$, then

$$[\sup(M)](A) = \sup \sum_{j=1}^{\infty} \mu_j(S_j) \qquad\text{or}\qquad [\inf(M)](A) = \inf \sum_{j=1}^{\infty} \mu_j(S_j),$$

respectively, where the sup or inf is over all countable collections $\{\mu_1, \mu_2, \mu_3, \ldots\} \subseteq M$, and partitions $A = S_1 \cup S_2 \cup S_3 \cup \cdots$, where the $S_j$'s are disjoint elements of $\mathcal{A}$. (Exercise. Verify.)

11.49. Notes on the boundedness of charges.
a. In 29.3 we shall prove that any real-valued measure (i.e., countably additive, on a $\sigma$-algebra) is bounded. In fact, any measure taking values in a Banach space is bounded.

b. A finitely additive charge on an algebra of sets need not be bounded. For example: Let $\mathcal{A} = \{S \subseteq \mathbb{N} : S \text{ is either finite or cofinite}\}$; this is an algebra (but not a $\sigma$-algebra) of subsets of $\mathbb{N}$. Define $\lambda : \mathcal{A} \to \mathbb{Z}$ by

$$\lambda(S) = \begin{cases} \mathrm{card}(S) & \text{if } S \text{ is finite,} \\ -\mathrm{card}(\mathbb{N} \setminus S) & \text{if } S \text{ is cofinite.} \end{cases}$$

Verify that $\lambda$ is a real-valued charge that is unbounded.

c. Are there any real-valued charges on a $\sigma$-algebra that are not countably additive? Well, yes and no. Such objects exist, but explicitly constructible examples of such objects do not exist. This is discussed further in 29.37.

11.50. Some basic properties of Riesz spaces. If $X$ is a Riesz space, then $X$ is a lattice group, so it has all the properties of lattice groups listed earlier in this chapter. It also has the following properties:

a. $r(x \vee y) = (rx) \vee (ry)$ and $r(x \wedge y) = (rx) \wedge (ry)$ for all $x, y \in X$ and any real number $r > 0$. Hence also $|rx| = r|x|$ and $(rx)^{+} = r(x^{+})$.

b. $x \vee y = \tfrac{1}{2}\bigl(x + y + |x - y|\bigr)$ and $x \wedge y = \tfrac{1}{2}\bigl(x + y - |x - y|\bigr)$.

c. $x \vee (-x) = |x| \succcurlyeq 0$.

d. $-y \preccurlyeq y \iff y \succcurlyeq 0$.
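The properties of Riesz spaces listed above can be spot-checked in the simplest concrete Riesz space, namely $\mathbb{R}^3$ with the coordinatewise ordering. The following Python sketch is an illustration under that assumption only; the sample vectors, the scalar `r`, and all helper names are hypothetical, and a numerical check on a few vectors is of course not a proof.

```python
from fractions import Fraction

def add(x, y):   return tuple(a + b for a, b in zip(x, y))
def scale(r, x): return tuple(r * a for a in x)
def neg(x):      return scale(-1, x)
def join(x, y):  return tuple(max(a, b) for a, b in zip(x, y))
def meet(x, y):  return tuple(min(a, b) for a, b in zip(x, y))
def absval(x):   return tuple(abs(a) for a in x)
def pos(x):      return tuple(max(a, 0) for a in x)
def leq(x, y):   return all(a <= b for a, b in zip(x, y))

x = (Fraction(3), Fraction(-2), Fraction(0))
y = (Fraction(-1), Fraction(5), Fraction(0))
z = (Fraction(1), Fraction(0), Fraction(2))
zero = (Fraction(0),) * 3
r = Fraction(5, 2)
half = Fraction(1, 2)
diff = add(x, neg(y))

# a. positive scalars commute with the lattice operations
property_a = (scale(r, join(x, y)) == join(scale(r, x), scale(r, y))
              and scale(r, meet(x, y)) == meet(scale(r, x), scale(r, y))
              and absval(scale(r, x)) == scale(r, absval(x))
              and pos(scale(r, x)) == scale(r, pos(x)))

# b. join and meet expressed through the absolute value
property_b = (join(x, y) == scale(half, add(add(x, y), absval(diff)))
              and meet(x, y) == scale(half, add(add(x, y), neg(absval(diff)))))

# c. x v (-x) = |x| and |x| is positive
property_c = join(x, neg(x)) == absval(x) and leq(zero, absval(x))

# d. -v <= v exactly when v >= 0, tested on several vectors
property_d = all(leq(neg(v), v) == leq(zero, v) for v in (x, y, z, zero))

print(property_a, property_b, property_c, property_d)
```

Exact rational arithmetic is used so that the factor $\tfrac{1}{2}$ in property b introduces no rounding error.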
11.51. Proposition. Let $X$ be a Riesz space. Then $X$ has the same ideals, whether we view $X$ as a Riesz space or (by "forgetting" how to multiply by scalars) we view $X$ as a lattice group. Thus, an ideal in a Riesz space $X$ is an additive subgroup satisfying any of the conditions in 9.27.

Proof. Since the category of lattice groups has fewer fundamental operations, it has at least as many ideals, i.e., every Riesz space ideal is a lattice group ideal. Conversely, suppose $S \subseteq X$ is a lattice group ideal; we must show that it is a Riesz space ideal. We shall use the fact that $S$ is solid (established in 9.27(B)). To show that $S$ satisfies definition 9.25(B) for Riesz spaces, it suffices to show that

$$c \in \mathbb{R},\ s \in S \;\Rightarrow\; cs \in S.$$

Since $S$ is an additive subgroup, it suffices to prove this implication in the case where $c > 0$. Since $0 \preccurlyeq s^{+} \preccurlyeq |s|$ and $0 \preccurlyeq s^{-} \preccurlyeq |s|$, we have $s^{+}, s^{-} \in S$. Since $S$ is closed under addition, $ms^{+}, ms^{-} \in S$ for any positive integer $m$. Let $m$ be some integer greater than $c$. Since $X$ is a Riesz space, we obtain

$$0 \preccurlyeq cs^{+} \preccurlyeq ms^{+} \qquad\text{and}\qquad 0 \preccurlyeq cs^{-} \preccurlyeq ms^{-},$$

and therefore $cs^{+}, cs^{-} \in S$. Now use the Jordan decomposition:

$$cs = c(s^{+} - s^{-}) = cs^{+} - cs^{-}.$$

Since $S$ is an additive group, it follows that $cs \in S$.
POSITIVE OPERATORS
11.52. Definitions. Let $X$ and $Y$ be lattices (not necessarily groups or vector spaces). A mapping $f : X \to Y$ is

a lattice homomorphism if it satisfies $f(x_1 \vee x_2) = f(x_1) \vee f(x_2)$ and $f(x_1 \wedge x_2) = f(x_1) \wedge f(x_2)$.

increasing (or isotone) if $x_1 \preccurlyeq x_2 \;\Rightarrow\; f(x_1) \preccurlyeq f(x_2)$.

order bounded if the image of any order interval is contained in an order interval, i.e., for any $x_1, x_2 \in X$ there exist $y_1, y_2 \in Y$ such that

$$f\bigl(\{x \in X : x_1 \preccurlyeq x \preccurlyeq x_2\}\bigr) \subseteq \{y \in Y : y_1 \preccurlyeq y \preccurlyeq y_2\}.$$

It is clear that $f$ is a lattice homomorphism $\Rightarrow$ $f$ is increasing $\Rightarrow$ $f$ is order bounded. Any of these three types of functions can be used as the morphisms for a category, with lattices for the objects.

Note that a linear operator between Riesz spaces (or more generally, an additive mapping between ordered groups) is increasing if and only if it is a positive operator, i.e., if and only if it satisfies $x \succcurlyeq 0 \;\Rightarrow\; f(x) \succcurlyeq 0$.

11.53. Proposition. Let $X$ and $Y$ be Riesz spaces; assume that $Y$ is Archimedean (defined as in 10.3). Let $f : X \to Y$ be an additive, increasing map, that is,

$$f(x_1 + x_2) = f(x_1) + f(x_2), \qquad\text{and}\qquad x_1 \preccurlyeq x_2 \;\Rightarrow\; f(x_1) \preccurlyeq f(x_2).$$

Then $f$ is $\mathbb{R}$-linear.

Corollary. Every lattice group homomorphism from a Riesz space into an Archimedean Riesz space is actually a Riesz space homomorphism.

Proof of proposition. It suffices to show that $f(rx) = rf(x)$ for every real number $r$ and every vector $x \in X$. By additivity and the Jordan Decomposition, it suffices to prove that equation when $r \ge 0$ and $x \succcurlyeq 0$. By additivity, it is easy to see that $f(qx) = qf(x)$ for rational numbers $q$. Since $f$ is order-preserving, we can conclude that

$$x \in X_+,\quad 0 \le q_1 \le r \le q_2,\quad q_1, q_2 \in \mathbb{Q} \;\Rightarrow\; q_1 f(x) \preccurlyeq f(rx) \preccurlyeq q_2 f(x).$$

Now, for any integer $m \in \mathbb{N}$, we can find rational numbers $q_1, q_2 \ge 0$ such that $-\tfrac{1}{m} < q_1 - r \le 0 \le q_2 - r < \tfrac{1}{m}$. It follows that

$$-\tfrac{1}{m} f(x) \preccurlyeq (q_1 - r) f(x) \preccurlyeq f(rx) - r f(x) \preccurlyeq (q_2 - r) f(x) \preccurlyeq \tfrac{1}{m} f(x).$$

Let $\gamma = f(rx) - rf(x)$; it follows that the subgroup $\mathbb{Z}\gamma = \{m\gamma : m \in \mathbb{Z}\}$ is bounded above by $f(x)$. Since $Y$ is Archimedean, it follows that $\gamma = 0$.

11.54. A pathological example. In the preceding theorem, we cannot omit the assumption that $Y$ be Archimedean. To see this, let $\mathbb{H}$ be the hyperreal line (see 10.18). We shall prove the existence of a mapping $f : \mathbb{R} \to \mathbb{H}$ that is a homomorphism for lattice groups but is not $\mathbb{R}$-linear.

First represent $\mathbb{R}$ as an internal direct sum, $\mathbb{R} = X \oplus Y$, where $X$ and $Y$ are some additive subgroups of $\mathbb{R}$ other than $\{0\}$ and $\mathbb{R}$ itself. (This can be accomplished using 11.30.a, since $\mathbb{R}$ may be viewed as a linear space over the scalar field $\mathbb{Q}$.) Let $\varepsilon$ be a nonzero infinitesimal in $\mathbb{H}$. Define $f : \mathbb{R} \to \mathbb{H}$ by taking $f(x + y) = x + (1 + \varepsilon) y$ for all $x \in X$ and $y \in Y$. Then $f$ is clearly additive. It is not linear, for if $x, y$ are nonzero real numbers with $x \in X$ and $y \in Y$ then $y f(x) = yx \ne (1 + \varepsilon) xy = x f(y)$.

It suffices to show that $f$ is order-preserving. Suppose $x_1 + y_1 < x_2 + y_2$, where $x_1, x_2 \in X$ and $y_1, y_2 \in Y$. Then $x_2 + y_2 - x_1 - y_1$ is a positive real number and $(y_1 - y_2)\varepsilon$ is an infinitesimal. Hence $(y_1 - y_2)\varepsilon < x_2 + y_2 - x_1 - y_1$. That is, $f(x_1 + y_1) < f(x_2 + y_2)$. (This example disproves an erroneous assertion of Birkhoff [1967, page 349].)

11.55. Proposition (Kantorovič). Let $X$, $Y$ be Riesz spaces, and assume $Y$ is Archimedean. Let $f : X_+ \to Y_+$ be any function. Then $f$ extends to a positive operator $F : X \to Y$ if and only if $f$ is additive, i.e., if and only if $f(x_1 + x_2) = f(x_1) + f(x_2)$ for all $x_1, x_2 \in X_+$. If that condition is satisfied, then the extension $F$ is uniquely determined: It satisfies
$$F(x) = f(x^{+}) - f(x^{-}). \qquad (**)$$

Proof. This proof follows the presentation of Aliprantis and Burkinshaw [1985]. Obviously, if $f$ extends to a positive linear operator, then $f$ must be additive and the extension $F$ must satisfy the formula $(**)$. Conversely, assume $f$ is additive and define $F : X \to Y$ by $(**)$; we must show that $F$ is linear. The proof will be in several steps:

a. If $x = u - v$ with $u, v \in X_+$, then $F(x) = f(u) - f(v)$. Hint: $x^{+} - x^{-} = x = u - v$, hence $x^{+} + v = u + x^{-}$; now use our assumption that $f$ is additive on $X_+$.

b. $F(x_1 + x_2) = F(x_1) + F(x_2)$; that is, $F$ is additive on $X$. Hint: Apply the preceding result with $u = x_1^{+} + x_2^{+}$ and $v = x_1^{-} + x_2^{-}$.

c. $F$ is an increasing function on $X$. Hint: $x \succcurlyeq 0 \;\Rightarrow\; F(x) = f(x) \succcurlyeq 0$.

Finally, apply 11.53 to complete the proof.

11.56.
Observations. Let $X$ and $Y$ be Riesz spaces. Then:
a. $L^b(X, Y)$ = {order bounded linear maps from $X$ into $Y$} is a linear subspace of $L(X, Y)$ = {linear maps from $X$ into $Y$}.
b. Let $f, g \in L(X, Y)$. Then the following conditions are equivalent:
(A) $f - g$ is an increasing operator; that is, $x_1 \preccurlyeq x_2 \Rightarrow f(x_1) - g(x_1) \preccurlyeq f(x_2) - g(x_2)$.
(B) $f - g$ is a positive operator; that is, $x \succcurlyeq 0 \Rightarrow f(x) - g(x) \succcurlyeq 0$.
(C) $f(x) \succcurlyeq g(x)$ for all $x \in X_+$. In other words, the restriction of $f$ to $X_+$ is larger than or equal to the restriction of $g$ to $X_+$, where functions on $X_+$ are ordered by the pointwise ordering, i.e., where $Y^{X_+}$ is equipped with the product ordering.
When any one (hence all) of these conditions holds, we shall write $f \succcurlyeq g$. This ordering makes $L(X, Y)$ and $L^b(X, Y)$ into ordered vector spaces.
11.57. Theorem (Riesz-Kantorovič). Let $X$ and $Y$ be Riesz spaces, and suppose $Y$ is Dedekind complete. Then the linear space
$$L^b(X, Y) = \{\text{order bounded linear operators from } X \text{ into } Y\}$$
is equal to the set of all linear operators that can be written as the difference of two positive operators. Furthermore, $L^b(X, Y)$ is a Dedekind complete Riesz space when ordered as in 11.56.b. For any $f \in L^b(X, Y)$, the positive part is given by this formula:
$$f^+(x) = \sup\{f(u) : u \in [0, x]\} \quad \text{when } x \in X_+.$$
Other lattice operations are as follows, for $x \in X_+$:
$$(f \vee g)(x) = \sup\{f(u) + g(v) : u, v \in X_+ \text{ and } u + v = x\},$$
$$(f \wedge g)(x) = \inf\{f(u) + g(v) : u, v \in X_+ \text{ and } u + v = x\},$$
$$|f|(x) = \sup\{f(u) : u \in [-x, x]\} = \sup\{|f(u)| : u \in [-x, x]\}.$$
When $\Phi$ is a nonempty subset of $L^b(X, Y)$ that is directed and that is bounded above by some member of $L^b(X, Y)$, then
$$(\sup \Phi)(x) = \sup_{f \in \Phi} f(x) \quad \text{for each } x \in X_+.$$
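As a finite-dimensional sanity check (an illustration only, not part of the text's argument), take $X = \mathbb{R}^3$ and $Y = \mathbb{R}^2$ with the coordinatewise ordering. Every linear map is then order bounded and is given by a matrix `A`. The sketch below assumes that the lattice operations on such maps are the entrywise ones, so that $f^+$ has matrix `max(A, 0)` and $|f|$ has matrix `abs(A)`, and compares these with the sup formulas of the theorem. The helper names are hypothetical. The sups are computed by enumerating the vertices of the order intervals, which is enough here because a linear functional attains its maximum over a box at a vertex.

```python
import itertools
import numpy as np

def box_vertices(lo, hi):
    """All vertices of the order interval [lo, hi] in R^n (coordinatewise order)."""
    return [np.array(v) for v in itertools.product(*zip(lo, hi))]

def sup_over(points, fn):
    """Coordinatewise supremum of fn over a finite list of points."""
    return np.max([fn(p) for p in points], axis=0)

# A hypothetical operator f(u) = A u from R^3 into R^2, and a point x >= 0.
A = np.array([[1.0, -2.0, 0.5],
              [-3.0, 0.0, 4.0]])
x = np.array([2.0, 1.0, 3.0])
f = lambda u: A @ u

# f^+(x) = sup{ f(u) : u in [0, x] }
pos_part = sup_over(box_vertices(np.zeros(3), x), f)

# |f|(x) = sup{ f(u) : u in [-x, x] } = sup{ |f(u)| : u in [-x, x] }
abs_a = sup_over(box_vertices(-x, x), f)
abs_b = sup_over(box_vertices(-x, x), lambda u: np.abs(f(u)))

print(pos_part, np.maximum(A, 0) @ x)
print(abs_a, abs_b, np.abs(A) @ x)
print(np.abs(f(x)))  # |f(x)|, which in general differs from |f|(x)
```

In this example $|f|(x)$ and $|f(x)|$ come out different, which is the distinction the text goes on to stress.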
Caution: A formula above shows the relation between $|f|(x)$ and $|f(x)|$. In general they are not the same; do not confuse them. In the expression $|f|(x)$, we take the absolute value of the vector $f$ in the lattice $L^b(X, Y)$; it is a function from $X$ into $Y$ that can be evaluated at $x$. On the other hand, $f(x)$ is a vector in the lattice $Y$, and so we can take its absolute value in that lattice to obtain $|f(x)| \in Y$.
Proof of theorem. Our proof is based on the presentation of Fremlin [1974]. It is easy to show that any positive operator is order bounded; hence any difference of two positive operators is order bounded. Conversely, suppose $f : X \to Y$ is order bounded. Define a function $g : X_+ \to Y_+$ by $g(x) = \sup\{f(u) : u \in [0, x]\}$; that supremum exists because $Y$ is assumed to be Dedekind complete. We note that $g$ is additive on $X_+$:
$$\begin{aligned}
g(x_1) + g(x_2) &= \sup\{f(u_1) + f(u_2) : u_1 \in [0, x_1],\ u_2 \in [0, x_2]\} \\
&= \sup\{f(v) : v \in [0, x_1] + [0, x_2]\} \\
&\overset{(!)}{=} \sup\{f(v) : v \in [0, x_1 + x_2]\} \\
&= g(x_1 + x_2)
\end{aligned}$$
where equation (!) is by the Riesz Decomposition Property (noted in 8.38). By 11.55, therefore, $g$ extends to a positive linear map from $X$ into $Y$, which we shall also denote by $g$. Then $g(x) \succcurlyeq f(x)$ for all $x \in X_+$, so $g - f$ is also a positive linear map. Thus $f = g - (g - f)$ is the difference of two positive linear maps.
It is an easy exercise to verify that the function $g$ constructed above is actually equal to the supremum of the set $\{0, f\}$, in the ordered vector space $L^b(X, Y)$. Thus $0 \vee f$ exists for each $f \in L^b(X, Y)$, and therefore that ordered vector space is a vector lattice, by the observations in 8.38.
To show $L^b(X, Y)$ is Dedekind complete, suppose $\Phi \subseteq L^b(X, Y)$ is a nonempty set bounded above by some $\beta \in L^b(X, Y)$; we shall show that $\sup \Phi$ exists in $L^b(X, Y)$. We may replace $\Phi$ by the collection of sups of nonempty finite subsets of $\Phi$; the existence and value of $\sup \Phi$ are not thereby affected. Thus we may assume $\Phi$ is directed; we shall show that, on $X_+$, $\sup \Phi$ is then equal to the pointwise supremum of the members of $\Phi$. Fix any $\varphi_0 \in \Phi$. We may replace each $\varphi \in \Phi$ with the function $\varphi - \varphi_0$; this does not affect the existence of $\sup \Phi$, and it replaces the value of $\sup \Phi$ with $\sup \Phi - \varphi_0$; thus we may assume that $0 \in \Phi$. Since $Y$ is Dedekind complete, $h(x) = \sup_{f \in \Phi} f(x)$ exists for each $x \in X_+$. Since $0 \in \Phi$, we have $h(x) \succcurlyeq 0$ for each $x \in X_+$. By 8.32, the function $h : X_+ \to Y_+$ is additive. By 11.55, $h$ extends to a positive linear operator from $X$ into $Y$; clearly that operator is the sup of $\Phi$ in $L^b(X, Y)$.
11.58.
Definition and corollary. Let $X$ be a Riesz space. Then the linear space
$$L^b(X, \mathbb{R}) = \{\text{order bounded linear functionals on } X\}$$
is called the order dual of $X$. It is equal to the set of all linear functionals that can be written as the difference of two positive linear functionals. It is a Dedekind complete Riesz space when equipped with this ordering: $f \succcurlyeq g$ if $x \succcurlyeq 0 \Rightarrow f(x) \geq g(x)$. It also satisfies this formula:
$$|f|(x) = \sup\{|f(u)| : u \in [-x, x]\}, \quad \text{if } x \in X_+.$$
This definition is a special case of the notion of "dual" introduced in 9.55. It is investigated further in books on vector lattices; we shall not study it further in this book.
ORTHOGONALITY IN RIESZ SPACES (OPTIONAL)
11.59. Definitions. Let $X$ be a Riesz space (or more generally, a lattice group). In this context, two elements $x, y$ are orthogonal to each other, denoted $x \perp y$, if $|x| \wedge |y| = 0$. For any set $S \subseteq X$, the orthogonal complement of $S$ is the set
$$S^\perp = \{x \in X : x \perp s \text{ for all } s \in S\}.$$
This definition is a special case of 4.12, with
$$\Gamma = \{(x, y) : x \perp y\},$$
and so the conclusions of 4.12 are applicable. Thus,
$$x \perp x \Rightarrow x = 0, \qquad S \subseteq S^{\perp\perp}, \qquad S^\perp = S^{\perp\perp\perp}, \qquad S^\perp \cap T^\perp = (S \cup T)^\perp,$$
and
$$S_1 \subseteq S_2 \Rightarrow S_2^\perp \subseteq S_1^\perp,$$
for sets $S, T, S_1, S_2 \subseteq X$. Also, if $S = T^\perp$ and $T = S^\perp$, then $S \cap T = \{0\}$.
11.60. Example. We consider $\mathbb{R}^\Lambda$ as in 11.45. Verify that $x \perp y$ if and only if $xy = 0$, where $xy$ is the function defined pointwise, i.e., $(xy)(\lambda) = [x(\lambda)][y(\lambda)]$ for all $\lambda \in \Lambda$. Also prove that two sets $S_1, S_2 \subseteq \mathbb{R}^\Lambda$ are orthogonal complements of each other if and only if they are sets of the form
$$S_j = \{x \in \mathbb{R}^\Lambda : x(\lambda) = 0 \text{ for all } \lambda \in \Lambda \setminus \Lambda_j\} \qquad (j = 1, 2),$$
where $\Lambda_1$ and $\Lambda_2$ form a partition of $\Lambda$.
11.61. Definition. Let $X$ be a Riesz space or, more generally, a lattice group. A band in $X$, also known as a normal sublattice, is an ideal (as defined in 9.27 and 11.51) that is sup-closed in $X$ (as defined in 4.4.b).
Riesz Theorem on Orthogonal Decompositions. Suppose that $X$ is a Riesz space (or, more generally, a lattice group). Assume $X$ is Dedekind complete. Then a subset $S \subseteq X$ is an orthogonal complement of some subset of $X$ if and only if $S$ is a band. Furthermore, if $S$ and $T$ are orthogonal complements of each other, then they form a direct sum decomposition: $X = S \oplus T$, as defined in 8.13. The projections $\pi_S : X \to S$ and $\pi_T : X \to T$ are homomorphisms of lattice groups (or of Riesz spaces, if $X$ is a Riesz space). The projection onto $S$ is given by the formula
$$\pi_S(x) = \sup\{|s| \wedge x : s \in S\} \quad \text{if } x \succcurlyeq 0,$$
and $\pi_S(x) = \pi_S(x^+) - \pi_S(x^-)$ in general. The projection onto $T$ is given by analogous formulas.
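The projection formula can be tried out in the concrete lattice $\mathbb{R}^\Lambda$ with $\Lambda$ finite. The following is a minimal sketch, assuming the band $S$ consists of the vectors that vanish outside an index set described by a hypothetical boolean `mask`. For $x \succcurlyeq 0$, the supremum $\sup\{|s| \wedge x : s \in S\}$ is attained at the element of $S$ that agrees with $x$ on the index set, so the code uses that element directly.

```python
import numpy as np

def meet(a, b):
    """Lattice meet in R^n with the coordinatewise ordering."""
    return np.minimum(a, b)

def band_projection(x, mask):
    """Projection onto S = {vectors vanishing where mask is False}.

    Uses pi_S(x) = pi_S(x+) - pi_S(x-), and for p >= 0 the formula
    pi_S(p) = sup{|s| meet p : s in S}, attained at s = p restricted to mask.
    """
    x_pos = np.maximum(x, 0.0)
    x_neg = np.maximum(-x, 0.0)

    def proj_nonneg(p):
        s = np.where(mask, p, 0.0)   # an element of S
        return meet(np.abs(s), p)

    return proj_nonneg(x_pos) - proj_nonneg(x_neg)

mask = np.array([True, False, True, False])   # hypothetical index set
x = np.array([3.0, -1.0, -2.0, 5.0])

s_part = band_projection(x, mask)   # component in S
t_part = x - s_part                 # component in T, the orthogonal complement

print(s_part, t_part)
print(meet(np.abs(s_part), np.abs(t_part)))   # all zeros: the parts are orthogonal
```

The two components are orthogonal in the sense of 11.59 and add up to `x`, as the decomposition $X = S \oplus T$ requires.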
Remark. Compare this theorem with 22.52.
Proof of theorem (following Bhaskara Rao and Bhaskara Rao [1983]). First suppose that $S$ is an orthogonal complement. Then $S$ is an ideal; that is a straightforward exercise. To show that $S$ is sup-closed, let $M$ be a nonempty subset of $S$, and suppose $\mu = \sup(M)$ exists in $X$. Show that $|\mu| \preccurlyeq \sup\{|m| : m \in M\}$, and hence $\mu \in S$. (These arguments actually do not require that $X$ be Dedekind complete.)
Now assume $X$ is a Dedekind complete lattice group, and $S$ is a band in $X$. Most of our proof will be concerned with showing that
$(***)$ If $T = S^\perp$ and $x \in X_+$, then $x = s_x + t_x$ for some $s_x \in S$ and $t_x \in T$.
The set $M = \{|s| \wedge x : s \in S\}$ is bounded above; since $X$ is Dedekind complete, $s_x = \sup(M)$ exists in $X$. Since $S$ is a sup-closed ideal, we have $M \subseteq S$ and also $s_x \in S$. The elements of $M$ are nonnegative; hence $s_x \succcurlyeq 0$ also. Let $t_x = x - s_x$; next we shall show that $t_x$ lies in $T = S^\perp$. Let any $\sigma \in S$ be given; we are to show that $|\sigma| \wedge |t_x| = 0$. Since $M$ is bounded above by $x$, we have $s_x \preccurlyeq x$; therefore $t_x \succcurlyeq 0$ and $|t_x| = t_x$. Let $u = |\sigma| \wedge t_x$; then $u \succcurlyeq 0$ and it suffices to show that $u \preccurlyeq 0$. Since $0 \preccurlyeq u \preccurlyeq |\sigma|$ and $\sigma \in S$ and $S$ is a sup-closed ideal, it follows that $u \in S$; hence also $u + s_x \in S$. Then
$$0 \preccurlyeq u + s_x = (|\sigma| \wedge t_x) + (x - t_x) \preccurlyeq x,$$
and so
$$u + s_x = |u + s_x| \wedge x \in M,$$
whence $u + s_x \preccurlyeq \sup(M) = s_x$, and thus $u \preccurlyeq 0$. This completes our proof of $(***)$.
Next we prove the conclusion of $(***)$ with the hypothesis weakened: We shall permit $x$ to be any element of $X$, not necessarily nonnegative. Applying the Jordan Decomposition, we have $x = p - n$, where $p, n \in X_+$. Then $p, n$ have Riesz decompositions
$$p = s_p + t_p \in S + T \qquad \text{and} \qquad n = s_n + t_n \in S + T.$$
We obtain $x = s_x + t_x$, with $s_x = s_p - s_n \in S$ and $t_x = t_p - t_n \in T$.
To show every sup-closed ideal is an orthogonal complement, let $S$ be a sup-closed ideal. Clearly $S \subseteq S^{\perp\perp}$. For the reverse inclusion, let $x \in S^{\perp\perp}$ have decomposition $x = s + t \in S + T$. Then $t \in T$, but also $t = x - s \in S^{\perp\perp} = T^\perp$. Hence $t = 0$, and $x = s \in S$.
Whenever $S$ and $T$ are orthogonal complements, they satisfy $S \cap T = \{0\}$; see 8.13. In the present context, we have also shown that $S + T = X$. Hence $S \oplus T = X$ (see 8.11), and the projections $\pi_S, \pi_T$ are uniquely determined group homomorphisms. The arguments of the preceding paragraphs show that $\pi_S$ must satisfy the formula stated in the theorem. Note that if $u \succcurlyeq 0$ then $0 \preccurlyeq \pi_S(u) \preccurlyeq u$. For any $x \in X$, both $x^+$ and $x^-$ are nonnegative, so $0 \preccurlyeq \pi_S(x^+) \preccurlyeq x^+$ and $0 \preccurlyeq \pi_S(x^-) \preccurlyeq x^-$. Since $x^+ \wedge x^- = 0$, it follows that $\pi_S(x^+) \wedge \pi_S(x^-) = 0$. From the Jordan Decomposition $x = x^+ - x^-$, we obtain $\pi_S(x) = \pi_S(x^+) - \pi_S(x^-)$, which is therefore the Jordan Decomposition of $\pi_S(x)$. Hence $[\pi_S(x)]^+ = \pi_S(x^+)$. By 8.45, $\pi_S$ is a homomorphism of lattice groups. If $X$ is a Riesz space, then $\pi_S$ is a homomorphism of Riesz spaces, by 11.53. The same conclusions can be drawn for $\pi_T$.
Chapter 12 Convexity
12.1. Preview. The diagram below shows examples of a star set, a nonconvex set, and a convex set, all of which will be defined soon. The distinction between convex and nonconvex may be easier to understand after 12.5.i.
[Figure: a typical star set in $\mathbb{R}^2$, a nonconvex set, and a convex set.]
12.2. Notational convention. Throughout the remainder of this book (except where noted otherwise), the scalar field of a linear space will always be either $\mathbb{R}$ or $\mathbb{C}$. Usually the scalar field will be denoted by $\mathbb{F}$, and we shall not specify which field is intended; this intentional ambiguity will permit us to treat both the real and complex cases simultaneously. However, we shall make free use of certain properties and structures enjoyed by $\mathbb{R}$ and $\mathbb{C}$ that are not shared by all other fields, e.g., the real part, imaginary part, complex conjugate, and absolute value (see 10.31), and the completeness of the metric determined by that absolute value (see Chapter 19).
CONVEX SETS
12.3. Definitions. Several types of sets will now be introduced together; they have similar definitions and basic properties. Let $X$ be a linear space with scalar field $\mathbb{F}$ (equal to $\mathbb{R}$ or $\mathbb{C}$), and let $S \subseteq X$. We say that the set $S$ is
a linear subspace of $X$ if $s, t \in S$ and $\lambda, \mu \in \mathbb{F}$ imply $\lambda s + \mu t \in S$;
convex if $s, t \in S$ and $\lambda \in (0, 1)$ imply $\lambda s + (1 - \lambda)t \in S$;
affine if $s, t \in S$ and $\lambda \in \mathbb{F}$ imply $\lambda s + (1 - \lambda)t \in S$;
symmetric if $s \in S \Rightarrow -s \in S$.
Also, a nonempty set $S \subseteq X$ is said to be
balanced (or circled) if, whenever $s \in S$ and $\alpha$ is a scalar with $|\alpha| \leq 1$, then $\alpha s \in S$;
absolutely convex if, whenever $s, t \in S$ and $\alpha, \beta$ are scalars with $|\alpha| + |\beta| \leq 1$, then $\alpha s + \beta t \in S$;
a star set if, whenever $s \in S$ and $\lambda \in [0, 1)$, then $\lambda s \in S$.
Caution: This definition of "star set" is well suited for our purposes, but it differs slightly from the definitions of "star body," "starlike set," etc., used elsewhere in the literature.
Though these different classes of sets ultimately must be studied separately, they do share a few basic properties: They are classes of sets that are closed under certain fundamental operations, and thus they are Moore collections (as in 4.6). For instance, a set $S \subseteq X$ is a linear subspace of $X$ if and only if $S$ is closed under all the binary operations $b_{\lambda,\mu} : X \times X \to X$ defined by $b_{\lambda,\mu}(x, y) = \lambda x + \mu y$, for all choices of $\lambda, \mu$ in the scalar field. Likewise, $S$ is a convex set if and only if $S$ is closed under all the binary operations $b_{\lambda,1-\lambda}$ for $\lambda \in [0, 1]$. The other classes of sets can be characterized similarly using not only binary operations, but also unary operations ($s \mapsto \lambda s$ for balanced sets and star sets, $s \mapsto -s$ for symmetric sets) and the nullary operation 0 (for balanced sets, absolutely convex sets, and star sets).
Since these classes are Moore collections, they are closed under intersection. Thus, any intersection of convex sets is a convex set, etc. In fact, all the fundamental operations involved are finitary, and so the resulting classes of sets are algebraic closure systems, in the sense of 4.8.
Since these classes of sets are Moore collections, they yield Moore closures (see 4.3); in fact, they yield algebraic closures (see 4.8). However, in this context it is not customary to use the term "closure." Instead we use different terms for the different kinds of closures: The smallest linear subspace containing a set $T$ is the (linear) span of $T$. The smallest convex set containing a set $T$ is the convex hull of $T$. Analogously we define the affine hull of $T$, the symmetric hull of $T$, the balanced hull of $T$, the absolutely convex hull of $T$, and the star hull of $T$. Notations for these hulls vary throughout the literature. In this book the convex hull of $T$ and balanced hull of $T$ will be denoted by $\mathrm{co}(T)$ and $\mathrm{bal}(T)$, respectively.
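The characterization of these classes as "closed under a family of operations" lends itself to a small experiment. The sketch below is an illustration only, with hypothetical names: it takes the planar set $T = ([0,1] \times [0,1]) \cup ([-1,0] \times [-1,0])$ with real scalars, samples it on a grid, and searches for violations of closure under $s \mapsto \alpha s$ with $|\alpha| \leq 1$ (balancedness) and under $b_{\lambda,1-\lambda}$ (convexity). A finite sample can only exhibit a counterexample or fail to find one; it does not prove closure.

```python
import itertools

def in_T(p):
    """Membership in T = [0,1]x[0,1] union [-1,0]x[-1,0]."""
    x, y = p
    return (0 <= x <= 1 and 0 <= y <= 1) or (-1 <= x <= 0 and -1 <= y <= 0)

# Dyadic grid values, so that all the arithmetic below is exact in floating point.
grid = [i / 4 for i in range(-4, 5)]
points = [p for p in itertools.product(grid, grid) if in_T(p)]
lams = [i / 4 for i in range(0, 5)]       # sample of [0, 1]
alphas = [i / 4 for i in range(-4, 5)]    # sample of real scalars with |alpha| <= 1

# Closure under s -> alpha * s (balanced, real scalar field).
balanced_violations = [
    (s, a) for s in points for a in alphas
    if not in_T((a * s[0], a * s[1]))
]

# Closure under (s, t) -> lam * s + (1 - lam) * t (convex).
convex_violations = [
    (s, t, lam) for s in points for t in points for lam in lams
    if not in_T((lam * s[0] + (1 - lam) * t[0], lam * s[1] + (1 - lam) * t[1]))
]

print(len(balanced_violations))   # no violation found on this sample
print(convex_violations[0])       # a witness that T is not convex
```

On this sample the set passes the balancedness test but fails the convexity test; for instance the midpoint of $(1, 0)$ and $(0, -1)$ lies outside $T$.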
12.4. Some relations between convexity and its relatives. These relationships are summarized in the following chart.
[Chart: implications among the classes {0}, nonzero singleton, linear subspace (= affine containing 0), affine, absolutely convex (= convex and balanced), balanced, convex containing 0, convex, star set, and symmetric. The relations are listed in items a through h below.]
a. A set is absolutely convex if and only if it is convex and balanced.
b. Every balanced set is a symmetric star set.
c. Every convex set that contains 0 is a star set.
d. Every affine set is convex.
e. A subset of $X$ is a linear subspace of $X$ if and only if it is affine and contains 0. Thus any linear subspace of $X$ (in particular, $X$ itself) is convex, affine, symmetric, balanced, absolutely convex, and a star set.
f. If $x \in X \setminus \{0\}$, then the singleton $\{x\}$ is an affine set, but it is not balanced.
Moreover, suppose that the scalar field $\mathbb{F}$ is $\mathbb{R}$. Then:
g. A set is balanced if and only if it is a symmetric star set.
h. A set is absolutely convex if and only if it is nonempty, symmetric, and convex.
12.5. Further elementary properties. Let $X$ be an $\mathbb{F}$-linear space. Then:
a. Any union of symmetric sets or balanced sets or star sets is, respectively, a symmetric or balanced or star set.
b. Suppose that $\mathcal{F}$ is a nonempty collection of subsets of $X$ that is directed by inclusion, i.e., such that for each $F_1, F_2 \in \mathcal{F}$ there exists some $F \in \mathcal{F}$ such that $F_1 \cup F_2 \subseteq F$. If every member of $\mathcal{F}$ is convex or affine or absolutely convex, then the union of the members of $\mathcal{F}$ also has that property, respectively. Hint: 4.8(B).
c. The convex hull of a set $T$ is equal to the set of all convex combinations of members of $T$, i.e., all vectors of the form
$$x = c_1 t_1 + c_2 t_2 + \cdots + c_n t_n$$
where $n$ is a positive integer, the $t_j$'s are members of $T$, and the $c_j$'s are positive numbers whose sum is 1.
d. The convex hull of a set $T$ is the union of the convex hulls of the finite subsets of $T$.
e. For each $\lambda$ in some index set $\Lambda$, suppose that $C_\lambda$ is a convex subset of some linear space $X_\lambda$. Then $\prod_{\lambda \in \Lambda} C_\lambda$ is a convex subset of the linear space $\prod_{\lambda \in \Lambda} X_\lambda$.
f. Let $f : X \to Y$ be a linear map. If $S \subseteq X$ is a convex set, then so is $f(S) \subseteq Y$. If $T \subseteq Y$ is a convex set, then so is $f^{-1}(T) \subseteq X$.
g. Let $X$ be a real linear space, and let $C \subseteq X$. Then there exists an ordering $\preccurlyeq$ on $X$ that makes $X$ into an ordered vector space with nonnegative cone $X_+$ equal to $C$ if and only if $C$ satisfies these conditions: (i) $C$ is convex, (ii) $C \cap (-C) = \{0\}$, and (iii) if $x \in C$ and $r > 0$ then $rx \in C$.
h. If $x, y \in X$, then the straight line through $x$ and $y$ is the set $\{\alpha x + (1 - \alpha)y : \alpha \in \mathbb{R}\}$. It is the affine hull of the set $\{x, y\}$, if the scalar field is $\mathbb{R}$. In a real vector space, a set $S \subseteq X$ is affine if and only if it contains the straight line through each pair of its members.
i. If $x, y \in X$, then the straight line segment from $x$ to $y$ is the set $\{\alpha x + (1 - \alpha)y : \alpha \in [0, 1]\}$. It is the convex hull of the set $\{x, y\}$. The points $x$ and $y$ are its endpoints. A set $S \subseteq X$ is convex if and only if it contains the straight line segment connecting each pair of its members (regardless of whether the scalar field is $\mathbb{R}$ or $\mathbb{C}$).
j. A subset of $\mathbb{R}$ is convex if and only if it is an interval.
k. The absolutely convex hull of any set $S \subseteq X$ is equal to $\mathrm{co}(\mathrm{bal}(S))$. The convex hull of a balanced set is balanced. However, the balanced hull of a convex set is not necessarily convex. See the example in the following diagram.
Example. Let the scalar field be $\mathbb{R}$, let the vector space be $\mathbb{R}^2$, and let $S = [0, 1] \times [0, 1]$. Then $S$ is convex. However, $T = \mathrm{bal}(S) = ([0, 1] \times [0, 1]) \cup ([-1, 0] \times [-1, 0])$ is balanced but not convex.
[Diagram: the square $S$ and its balanced hull $\mathrm{bal}(S)$ in $\mathbb{R}^2$.]
12.6. Exercises: arithmetic operations on convex sets.
If $S \subseteq X$ is a convex set, then $cS = \{cs : s \in S\}$ is convex for any scalar $c$, and $x_0 + S = \{x_0 + s : s \in S\}$ is convex for any vector $x_0 \in X$.
If $C$ is a convex set, then $C + C = 2C$; that is, $\{x + y : x, y \in C\} = \{2u : u \in C\}$.
If $S$ and $T$ are convex subsets of $X$, then
$$\mathrm{co}(S \cup T) = \bigcup_{0 \leq \alpha \leq 1} [\alpha S + (1 - \alpha)T].$$
For any sets $A_1, A_2, \ldots, A_k \subseteq X$, the convex hull of the sum is the sum of the convex hulls. That is,
$$\mathrm{co}\Big(\sum_{j=1}^{k} A_j\Big) = \sum_{j=1}^{k} \mathrm{co}(A_j).$$
12.7. (Optional.) One may be tempted to try to view convex sets as an equational variety and thus apply to them all the theory of equational varieties. As we noted in 12.3, it is possible to consider convex sets as algebraic systems, with fundamental operations given by the binary operations $c_r(x, y) = rx + (1 - r)y$ for $r \in (0, 1)$. However, convex sets do not form a variety, for they are not closed under the taking of homomorphic images that respect the fundamental operations.
It can be proved (see Romanowska and Smith [1985]) that the smallest variety containing all convex sets is the variety of barycentric algebras. These are the algebraic systems that have fundamental operations given by some binary operations $c_r$ for $r \in (0, 1)$, where the binary operations satisfy these identities:
$$c_r(x, x) = x \quad \text{and} \quad c_r(x, y) = c_{1-r}(y, x) \quad \text{when } 0 < r < 1,$$
$$c_{t/(s+1)}\big(c_{s/t}(x, y), z\big) = c_{s/(s+1)}\big(x, c_{t-s}(y, z)\big) \quad \text{when } 0 < s < t < s + 1.$$
The convex sets are the barycentric algebras that can be embedded in vector spaces. However, not all barycentric algebras can be so embedded.
The following example, from [Romanowska and Smith], shows that the class of convex sets is not closed under the taking of homomorphic images that respect the fundamental operations. Let $n$ be an integer greater than 1. Let $\Omega = \{e_1, e_2, \ldots, e_n\}$ be the standard basis for $\mathbb{R}^n$; that is, let $e_j = (0, \ldots, 0, 1, 0, \ldots, 0)$ be the vector with 1 in the $j$th place and 0s elsewhere. Let $\Delta$ be the convex hull of the set $\Omega$; it is a convex subset of $\mathbb{R}^n$ (called the standard simplex). We shall also consider the set $\mathcal{P}(\Omega)$ = {subsets of $\Omega$} as an algebraic system, with binary operations defined by $c_r(A, B) = A \cup B$ for $r \in (0, 1)$. (We emphasize that all the $c_r$'s, for different values of $r$, are the same binary operation.) The set $\mathcal{P}(\Omega)$ cannot possibly be isomorphic to a convex subset of a real vector space, for
any convex set that contains more than one point must contain infinitely many points. However, $\mathcal{P}(\Omega)$ is a barycentric algebra. In fact, the mapping $f : \Delta \to \mathcal{P}(\Omega)$ defined by $f\big(\sum_j r_j e_j\big) = \{e_j \in \Omega : r_j \neq 0\}$ is a homomorphism, i.e., it preserves the fundamental operations of the algebraic systems. Thus $\mathcal{P}(\Omega)$ is a homomorphic image of a convex set. Therefore it preserves any identities that could be used to define the variety of convex sets, but it is not a convex set. Thus convex sets do not form a variety.
12.8. Definition. Let $X$ be a linear space, with scalar field $\mathbb{R}$ or $\mathbb{C}$. A set $S \subseteq X$ is absorbing (or radial) if for each $x \in X$ we have $cx \in S$ for all scalars $c$ sufficiently small (i.e., for all scalars $c$ satisfying $|c| < r$, where $r$ is some positive number that may depend on $x$ and $S$). Show that the absorbing sets form a proper filter on $X$; thus they are sets that are "large" in the sense of 5.3. Absorbing sets will be important in the theory of Minkowski functionals (see 12.29.g) and topological vector spaces (see Chapters 26 and 27).
COMBINATORIAL CONVEXITY IN FINITE DIMENSIONS (OPTIONAL)
12.9. Radon's Affineness Lemma. Let $x_0, x_1, \ldots, x_k$ be vectors in $\mathbb{R}^n$, for some positive integers $k$ and $n$ with $k > n$. Then there exist real numbers $p_0, p_1, \ldots, p_k$, not all zero, such that $\sum_{j=0}^{k} p_j = 0$ and $\sum_{j=0}^{k} p_j x_j = 0$. Hint: First show that the vectors $x_1 - x_0, x_2 - x_0, \ldots, x_k - x_0$ are linearly dependent (see Chapter 11).
12.10. Carathéodory's Theorem. Let $S \subseteq \mathbb{R}^n$. Then every point in $\mathrm{co}(S)$ can be expressed as a convex combination of $n + 1$ or fewer elements of $S$.
Proof. The proof is in several steps.
(i) Let $T_k$ be the set of all convex combinations of $k$ or fewer elements of $S$. It suffices to show that if $k > n$ then $T_{k+1} \subseteq T_k$. (Why?)
(ii) Let $x \in T_{k+1}$. Then $x = a_0 x_0 + \cdots + a_k x_k$ for some $x_0, x_1, \ldots, x_k \in S$ and $a_0, a_1, \ldots, a_k \in (0, 1]$ with $a_0 + \cdots + a_k = 1$. (Explain.)
(iii) Choose real numbers $p_0, p_1, \ldots, p_k$ as in Radon's Lemma. For $j = 0, 1, \ldots, k$ and any real number $r$, let $\beta_j(r) = a_j - r p_j$. Show that $x = \sum_{j=0}^{k} \beta_j(r) x_j$ and $1 = \sum_{j=0}^{k} \beta_j(r)$.
(iv) By a suitable choice of $r$, show that $x \in T_k$.
12.11. Radon's Intersection Theorem. Let $S$ be a subset of $\mathbb{R}^n$ consisting of at least $n + 2$ points. Then $S$ can be partitioned into disjoint subsets $Q$ and $R$ such that $\mathrm{co}(Q)$ meets $\mathrm{co}(R)$.
Hints: Let $S \supseteq \{x_0, x_1, \ldots, x_{n+1}\}$. Choose real numbers $p_0, p_1, \ldots, p_{n+1}$ as in Radon's Lemma. By relabeling and reordering, we may assume $p_0, p_1, \ldots, p_r > 0$ and $p_{r+1}, p_{r+2}, \ldots, p_{n+1} \leq 0$, where $0 \leq r < n + 1$ (explain). Now let $Q \supseteq \{x_0, x_1, \ldots, x_r\}$ and $R \supseteq \{x_{r+1}, x_{r+2}, \ldots, x_{n+1}\}$. Then what?
12.12. Helly's Intersection Theorem. Let $S_0, S_1, \ldots, S_k$ be convex subsets of $\mathbb{R}^n$, where $k$ and $n$ are positive integers and $k > n$. Suppose that each $n + 1$ of these sets have nonempty intersection. Then $\bigcap_{i=0}^{k} S_i$ is nonempty.
Hints: By induction on $k$, we may assume that the intersection of any $k$ of the $S_j$'s is nonempty (explain). For each $j = 0, 1, \ldots, k$, pick some $x_j \in \bigcap_{i \neq j} S_i$. Apply Radon's Intersection Theorem to the points $x_j$. (How?)
12.13. Remarks. Other matters related to the theorems of Radon, Helly, and Carathéodory are considered by Danzer, Grünbaum, and Klee [1963]. Additional material on convexity, especially in finite dimensions, can be found in Roberts and Varberg [1973], Rockafellar [1970], and Stoer and Witzgall [1970]. The following result is interesting enough to deserve mention, though its proof is too difficult to include here:
Shapley-Folkman Theorem. Let $A_1, A_2, \ldots, A_m$ be subsets of $\mathbb{R}^n$. Suppose $x \in \sum_{j=1}^{m} \mathrm{co}(A_j)$. Then $x$ can be expressed as $x = \sum_{j=1}^{m} x_j$, where each $x_j \in \mathrm{co}(A_j)$ and where $\{j : x_j \notin A_j\}$ has cardinality at most $n$.
Taking $m$ much larger than $n$, this shows that the sum of a large number of arbitrary sets is "almost convex." Proofs can be found in the appendices of Arrow and Hahn [1971] and Starr [1969]. Actually, those proofs assume the sets $A_j$ are compact, but the problem can easily be reduced to that case by using Carathéodory's Theorem and its consequences.
CONVEX FUNCTIONS
12.14. For the definitions below, we consider functions $f$ taking values in $[-\infty, +\infty]$. The definitions can be simplified slightly when $f$ is known to be real-valued, i.e., when $-\infty, +\infty \notin \mathrm{Range}(f)$, and certainly that restricted case still covers most of the applications. For these reasons, some mathematicians define "convex" only for real-valued functions. However, the greater generality of extended real-valued functions is occasionally useful, because $[-\infty, +\infty]$ is order complete, i.e., we can always take sups and infs in $[-\infty, +\infty]$.
Arithmetic in $[-\infty, +\infty]$ is defined as in 1.17. Note that a sum of finitely many terms, $r_1 + r_2 + \cdots + r_n$, is defined if and only if $-\infty$ and $+\infty$ are not both among $r_1, r_2, \ldots, r_n$.
12.15. Definition. Let $C$ be a convex subset of a linear space $X$, and let $f : C \to [-\infty, +\infty]$ be some function. Then the following conditions are equivalent; if they are satisfied we say $f$ is a convex function.
(A) The set $\{(x, r) \in C \times \mathbb{R} : f(x) < r\}$ is a convex subset of $C \times \mathbb{R}$.
(B) The set $\{(x, r) \in C \times \mathbb{R} : f(x) \leq r\}$ is a convex subset of $C \times \mathbb{R}$. (This set is called the epigraph of $f$.)
(C) Whenever $x_0, x_1 \in C$ and $0 < \lambda < 1$, then
$$f((1 - \lambda)x_0 + \lambda x_1) \leq (1 - \lambda) f(x_0) + \lambda f(x_1)$$
whenever the right side is defined (see remark in 12.14).
(D) Whenever $n$ is a positive integer and $\lambda_1, \lambda_2, \ldots, \lambda_n$ are positive numbers summing to 1 and $x_1, x_2, \ldots, x_n \in C$, then
$$f(\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_n x_n) \leq \lambda_1 f(x_1) + \lambda_2 f(x_2) + \cdots + \lambda_n f(x_n)$$
whenever the right side is defined (see remark in 12.14).
(E) Whenever $n$ is a positive integer and $\mu_1, \mu_2, \ldots, \mu_n$ are positive numbers and $x_1, x_2, \ldots, x_n \in C$, then
$$f\left(\frac{\mu_1 x_1 + \mu_2 x_2 + \cdots + \mu_n x_n}{\mu_1 + \mu_2 + \cdots + \mu_n}\right) \leq \frac{\mu_1 f(x_1) + \mu_2 f(x_2) + \cdots + \mu_n f(x_n)}{\mu_1 + \mu_2 + \cdots + \mu_n}$$
whenever the right side is defined.
If $f$ is real-valued, i.e., if $\pm\infty \notin \mathrm{Range}(f)$, then the following conditions are also equivalent.
(F) For each $v \in X$ and $\xi \in C$, the function $p \mapsto h_{\xi,v}(p) = [f(\xi + pv) - f(\xi)]/p$ is increasing on the interval $\{p \in \mathbb{R} : p > 0,\ \xi + pv \in C\}$.
(G) For each $v \in X$ and $\xi \in C$, the function $p \mapsto [f(\xi + pv) - f(\xi)]/p$ is increasing on the set where it is defined, i.e., on the set $\{p \in \mathbb{R} \setminus \{0\} : \xi + pv \in C\}$.
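Two of these conditions are easy to probe numerically for real-valued functions on an interval: the two-point inequality (C) and the monotone difference quotient (F). The sketch below is illustrative only; the function names are hypothetical, and checking a finite grid can refute convexity but cannot establish it. It compares $x \mapsto x^2$ with $x \mapsto x^3$ on $[-3, 3]$.

```python
def violations_of_C(f, xs, lams, tol=1e-12):
    """Triples (x0, x1, lam) at which f((1-lam)x0 + lam*x1) exceeds the chord value."""
    return [
        (x0, x1, lam)
        for x0 in xs for x1 in xs for lam in lams
        if f((1 - lam) * x0 + lam * x1) > (1 - lam) * f(x0) + lam * f(x1) + tol
    ]

def difference_quotients(f, xi, v, ps):
    """Values h(p) = [f(xi + p*v) - f(xi)] / p for the given positive p's."""
    return [(f(xi + p * v) - f(xi)) / p for p in ps]

square = lambda x: x * x
cube = lambda x: x ** 3

xs = [i / 2 for i in range(-6, 7)]    # grid on [-3, 3]
lams = [0.25, 0.5, 0.75]
ps = [0.5, 1.0, 1.5, 2.0]             # increasing positive step sizes

square_bad = violations_of_C(square, xs, lams)
cube_bad = violations_of_C(cube, xs, lams)

square_q = difference_quotients(square, 1.0, 1.0, ps)    # 2 + p, increasing
cube_q = difference_quotients(cube, -3.0, 1.0, ps)       # 27 - 9p + p^2, decreasing here

print(len(square_bad), len(cube_bad))
print(square_q, cube_q)
```

The square passes both checks on this grid, while the cube fails both: it violates (C) for pairs of negative points, and its difference quotient based at $\xi = -3$ decreases.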
Hints: The equivalence of (A), (B), (C) follows by considering various cases, according to whether each number involved is $+\infty$, $-\infty$, or a finite real number. Obviously (D) implies (C) as a special case; conversely, (D) follows from (C) by induction. Condition (E) is just a reformulation of (D).
Now suppose $f$ is real-valued. To prove (C) $\Rightarrow$ (F), show $h_{\xi,v}(\lambda p) \leq h_{\xi,v}(p)$ for $0 < \lambda < 1$ by taking $x_0 = \xi$ and $x_1 = \xi + pv$. To prove (F) $\Rightarrow$ (C), take $\xi = x_0$ and $v = x_1 - x_0$; use the fact that $h_{\xi,v}(\lambda) \leq h_{\xi,v}(1)$. Obviously (G) implies (F). To prove that (C) and (F) together imply (G), note that $h_{\xi,v}(-p) = -h_{\xi,-v}(p)$; also, the inequality $h_{\xi,v}(-p) \leq h_{\xi,v}(p)$ for $p > 0$ follows from the convexity of $f$.
12.16. Further definitions. Let $X$ be a vector space, let $C$ be a convex subset of $X$, and let $f : C \to [-\infty, +\infty]$ be some function.
We say $f$ is strictly convex if it has this property: Whenever $x$ and $y$ are two distinct points in $C$ and $0 < \lambda < 1$, then $f(\lambda x + (1 - \lambda)y) < \lambda f(x) + (1 - \lambda) f(y)$. Show that if $C$ is an open interval in the real line, then any convex function from $C$ into $\mathbb{R}$ is either strictly convex or affine on some nondegenerate subinterval of $C$.
We say $f$ is quasiconvex if the set $\{x \in C : f(x) < r\}$ is a convex set for each $r \in [-\infty, +\infty]$. Show that (i) Every convex function is quasiconvex. (ii) Every increasing function from $\mathbb{R}$ into $[-\infty, +\infty]$ is quasiconvex. (iii) (Example.) The function $f(x) = x^3$ is increasing on $\mathbb{R}$, hence quasiconvex, but it is not convex. (Hint: Use 12.19(E).)
A function $g : C \to [-\infty, +\infty]$ is concave if $-g$ is convex.
A function $h : C \to [-\infty, +\infty]$ is affine if it is both concave and convex. An equivalent condition for $h$ to be affine is that whenever $x_0, x_1 \in C$ and $0 < \lambda < 1$, then $h((1 - \lambda)x_0 + \lambda x_1) = (1 - \lambda) h(x_0) + \lambda h(x_1)$ whenever the right side is defined, i.e., whenever we do not have one of $h(x_0)$, $h(x_1)$ equal to $-\infty$ and the other equal to $+\infty$. If $f$ is a real-valued function defined on a linear space, then $f$ is affine if and only if $f - f(0)$ is linear.
Caution: In some contexts the term "linear" is used for affine maps as well. Especially, a "piecewise-linear" map is a map that is defined separately on various parts of its domain and is affine on each of those parts. This terminology is especially common in numerical analysis.
12.17. Some elementary properties of convex functions. Let $X$ be a linear space, let $C$ be a convex subset of $X$, and let $f : C \to [-\infty, +\infty]$ be some function. Then:
a. $f$ is convex if and only if the restriction $f|_L$ is a convex function for each line segment $L$ whose endpoints are elements of $C$; equivalently, if and only if for each $x_0, x_1 \in C$, the function $\lambda \mapsto f((1 - \lambda)x_0 + \lambda x_1)$ is a convex function from the interval $[0, 1]$ into $[-\infty, +\infty]$.
e. (Optional.) If $f$ is real-valued and convex and its domain $C$ is the convex hull of a finite set, then $f$ is bounded. Hints: Say $C = \mathrm{co}\{x_1, x_2, \ldots, x_n\}$. First show $\sup_{x \in C} f(x) \leq \max_j f(x_j)$. Then let $u = \frac{1}{n}(x_1 + x_2 + \cdots + x_n)$. For each $y \in C$, show there is some corresponding $z \in C$ satisfying $u = \frac{1}{n+1} y + \frac{n}{n+1} z$. Use this to obtain a lower bound on $f(y)$.
f. (Optional.) Let $V$ be a real linear space. Let $\mathcal{G}$ be the collection of graphs of functions $f$ that have the property that they can be extended to convex functions from convex subsets of $V$ into $\mathbb{R}$. Then $\mathcal{G}$ has finite character (see 3.46).
12.18. Remarks. Let $C$ be a convex subset of a vector space $X$, and let $f : C \to [-\infty, +\infty]$ be a convex function. Then the set $\{x \in C : f(x) < +\infty\}$ is a convex subset of $C$, sometimes called the effective domain of $f$. Most interesting behavior of convex functions occurs in the effective domain, and we can replace $f$ with its restriction to this set without seriously affecting most results about convex functions.
Conversely, any convex function $f$ defined on any convex set $S \subseteq X$ can be extended to a convex function on any larger convex set $C$, by taking $f(x) = +\infty$ whenever $x \in C \setminus S$. The extension and the original function have the same effective domain.
Here is a simple special case: Let $f$ be the constant function 0 on some convex set $S$. Let $C$ be any larger convex set. Then $f$ can be extended to the convex function $I_S : C \to \{0, +\infty\}$ defined by
$$I_S(x) = \begin{cases} 0 & \text{when } x \in S, \\ +\infty & \text{when } x \in C \setminus S. \end{cases}$$
Then $I_S$ is a convex function, sometimes called the indicator function of $S$. Note that its definition depends on not only $S$ but also $C$, though the choice of $C$ is not reflected by our notation $I_S$. (The indicator function should not be confused with the characteristic function $1_S : C \to \{0, 1\}$, defined in 2.2.b.)
12.19. Derivatives and convexity. (These results assume some familiarity with college calculus.) Let $C \subseteq \mathbb{R}$ be an interval, and assume $f : C \to \mathbb{R}$ is continuously differentiable. Show that the following are equivalent:
(A) $f$ is convex.
(B) $f'(x)$ is an increasing function of $x$ on $C$.
(C) $(y - x)[f'(y) - f'(x)] \geq 0$ for all $x, y \in C$.
(D) $(y - x) f'(y) \geq f(y) - f(x)$ for all $x, y \in C$.
If $f$ is twice continuously differentiable, then this condition is also equivalent:
(E) $f'' \geq 0$ on $C$.
Remark. In 25.25 we shall determine precisely how much differentiability a convex function must possess.
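The derivative criteria can be checked on a grid for functions whose derivatives are known in closed form. The sketch below is an illustration with hypothetical helper names, and a finite grid can only refute a condition, not prove it. It tests two of the conditions, that $f'$ is increasing and that $(y - x) f'(y) \geq f(y) - f(x)$, for $\exp$ and for $x \mapsto x^3$ on $[-2, 2]$.

```python
import math

def check_derivative_criteria(f, df, xs, tol=1e-12):
    """Return (f' increasing on the grid, tangent inequality holds on the grid).

    xs must be sorted in increasing order. The tangent inequality is
    (y - x) * f'(y) >= f(y) - f(x) for all grid points x, y.
    """
    increasing = all(df(a) <= df(b) + tol for a, b in zip(xs, xs[1:]))
    tangent = all(
        (y - x) * df(y) >= f(y) - f(x) - tol
        for x in xs for y in xs
    )
    return increasing, tangent

xs = [i / 4 for i in range(-8, 9)]   # grid on [-2, 2]

exp_result = check_derivative_criteria(math.exp, math.exp, xs)
cube_result = check_derivative_criteria(lambda x: x ** 3, lambda x: 3 * x * x, xs)

print(exp_result)    # both conditions hold on the grid
print(cube_result)   # both conditions fail, e.g. at x = 0, y = -1
```

For the cube, the failure is already visible at $x = 0$, $y = -1$: the left side is $-3$ while $f(y) - f(x) = -1$. This matches the second-derivative test, since $f''(x) = 6x$ is negative for $x < 0$.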
12.20. Corollaries. We now note some specific applications of the preceding results.
a. Show that $t \mapsto e^t$ is convex on $\mathbb{R}$. Then use that fact to show that if $p, q \in (1, \infty)$ with $\frac{1}{p} + \frac{1}{q} = 1$, then $tu \leq \frac{t^p}{p} + \frac{u^q}{q}$ for $t, u \geq 0$.
b. The function $x \mapsto x^p$, defined on $[0, +\infty)$, is convex if $1 \leq p < \infty$ and concave if $0 < p \leq 1$.
c. Show that $\tan : [0, \pi/2) \to [0, +\infty)$ is convex. Taking limits, define $\tan(\pi/2) = +\infty$, and show that $\tan : [0, \pi/2] \to [0, +\infty]$ is convex.
12.21. Combining convex functions. Let $C$ be a convex subset of a linear space.
a. Pointwise suprema: Let $\{f_\lambda : \lambda \in \Lambda\}$ be a nonempty family of convex functions from $C$ into $[-\infty, +\infty]$. For each $x \in C$ let $\sigma(x) = \sup_{\lambda \in \Lambda} f_\lambda(x)$. Then $\sigma$ is convex. (Dually, the pointwise infimum of concave functions is concave.) Using this result (or arguing directly), prove that the mapping $x \mapsto \min\{1, x\}$ is concave on $\mathbb{R}$.
Hint: Rather than bother with separate cases according to whether $f_\lambda(x)$ is $+\infty$, $-\infty$, or a member of $\mathbb{R}$, just note that the epigraph of $\sigma$ is the intersection of the epigraphs of the $f_\lambda$'s.
A converse of this result is given by (HB4) in 12.31. Compare also the supremum results in 15.23 and 16.16(D).
b. Sums: Let $f$ and $g$ be convex functions defined on $C$, both taking values in $(-\infty, +\infty]$ or both taking values in $[-\infty, +\infty)$. Then $f + g$ is convex.
c. Compositions: Let $J \subseteq \mathbb{R}$ be an interval, i.e., a convex subset of $\mathbb{R}$. Let $f : C \to J$ and $g : J \to [-\infty, +\infty]$ both be convex, and assume $g$ is increasing. Show that the composition $g \circ f : C \to [-\infty, +\infty]$ is convex. As a particular example, show that $x \mapsto \exp(\tan(x))$ is convex on $[0, \pi/2)$.
d. Products: Let $f, g : C \to [0, +\infty)$ be convex functions. Assume also that $0 \leq [f(x) - f(y)][g(x) - g(y)]$ for all $x, y \in C$. (This condition is satisfied, for instance, if the linear space is $\mathbb{R}$, and $f$ and $g$ are both increasing or both decreasing.) Show that the product function $x \mapsto f(x)g(x)$ is also convex.
12.22. The infimum of convex functions. Let $\{f_\lambda : \lambda \in \Lambda\}$ be a nonempty family of convex functions from $C$ into $[-\infty, +\infty]$. In general, the pointwise infimum $p(x) = \inf_{\lambda \in \Lambda} f_\lambda(x)$ is not convex.
Example. For instance, let $C$ be the real line, and let $\{f_\lambda : \lambda \in \Lambda\}$ consist of just the two functions $x$ and $-x$. Then the pointwise infimum is $-|x|$, which is not convex.
Define $\iota : C \to [-\infty, +\infty]$ by taking
$$\iota(x) = \inf \sum_{j=1}^{n} c_j f_{\lambda_j}(x_j),$$
where the infimum is over all choices of $n$, $c_j$, $\lambda_j$, $x_j$ such that $n$ is a positive integer, the $\lambda_j$'s are members of $\Lambda$, the $c_j$'s are positive numbers summing to 1, and the $x_j$'s are members of $C$ satisfying $x = \sum_{j=1}^{n} c_j x_j$,
and such that $+\infty$ and $-\infty$ are not both among the values $f_{\lambda_1}(x_1), f_{\lambda_2}(x_2), \ldots, f_{\lambda_n}(x_n)$. Then $\iota$ is convex, and in fact $\iota$ is the largest convex function that satisfies $\iota \le f_\lambda$ for all $\lambda$. We may refer to it as the convex infimum of the $f_\lambda$'s. Thus, the convex functions from $C$ into $[-\infty,+\infty]$ form a complete lattice. Of course, in some cases, the convex infimum may simply be the constant $-\infty$. That is the case, for instance, when $C = \mathbb{R}$ and the collection of functions consists of just $\{-x, x\}$.

Exercise. Suppose $\{f_\lambda : \lambda \in \Lambda\}$ is directed downward, i.e., suppose that for each finite set $\Lambda_0 \subseteq \Lambda$ there is some $\mu \in \Lambda$ such that $f_\mu \le \min\{f_\lambda : \lambda \in \Lambda_0\}$. Then the pointwise infimum $p$ is equal to the convex infimum $\iota$.

NORMS, BALANCED FUNCTIONALS, AND OTHER SPECIAL FUNCTIONS

12.23. Definitions. Let $X$ be a real or complex vector space.

A function $g : X \to [-\infty,+\infty]$ is positively homogeneous if it satisfies $g(tx) = t\,g(x)$ for all $t \in [0,+\infty)$ and $x \in X$. Here we follow the convention that $0 \cdot \infty = 0$. Thus, a positively homogeneous function $g$ may have $\infty$ in its range, but it must satisfy $g(0) = 0$.

A function $g : X \to [0,+\infty)$ is homogeneous if it satisfies $g(tx) = |t|\,g(x)$ for all scalars $t$ and all $x \in X$. (Here $g$ is not permitted to take an infinite value.)

12.24. Definition and exercise. Let $X$ be a real or complex linear space; let the scalar field be denoted by $F$. Let $p : X \to [0,+\infty]$ be some mapping. Show that the following conditions are equivalent.
(A) $|c| \le 1 \Rightarrow p(cx) \le p(x)$ for scalars $c$ and vectors $x$.
(B) $|c_1| \le |c_2| \Rightarrow p(c_1 x) \le p(c_2 x)$ for scalars $c_1, c_2$ and vectors $x$.
(C) For each number $b \in [0,+\infty)$, the set $\{x \in X : p(x) \le b\}$ is balanced (in the sense of 12.3).
If one, hence all of them, is satisfied, we shall say that $p$ is a balanced function. Show that any balanced function also satisfies $p(cx) = p(|c|x)$. In particular, if $|c| = 1$, then $p(cx) = p(x)$.

Exercise. Let $g : X \to [0,+\infty)$ be some mapping. Then $g$ is homogeneous if and only if $g$ is both balanced and positively homogeneous.
12.25. Let $X$ be a linear space, and let $C$ be a subset of $X$ that is closed under addition. Suppose $\beta : C \to [-\infty,+\infty]$ does not have both $-\infty$ and $+\infty$ in its range. If $\beta(x+y) \le \beta(x) + \beta(y)$ for all $x, y \in C$, we say $\beta$ is subadditive (at least, in the context of vector spaces; the term "subadditive" has another meaning in measure theory, see Chapter 29).

A function $f : X \to [-\infty,+\infty]$ is sublinear if it is both subadditive and positively homogeneous. Such a function is convex. A seminorm is a function $f : X \to [0,+\infty)$ that is subadditive and homogeneous. Note that any such function is sublinear, hence convex. A norm is a seminorm $f$ that also satisfies $x \ne 0 \Rightarrow f(x) > 0$. Seminorms and norms will be studied in greater detail in Chapter 22 and thereafter.

Remarks. We shall use sublinearity very seldom in this book; most of our functions will satisfy either stronger hypotheses (e.g., linear or seminorm) or weaker hypotheses (e.g., convexity). An exception is the proof in 28.37, which uses a sublinear functional. See also the remarks in 12.31.

Elementary properties and examples.
a. Any linear function is sublinear; any norm is a seminorm; any seminorm is sublinear; any sublinear function is convex.
b. The map $f \mapsto f^+ = \max\{f, 0\}$ is sublinear on $\mathbb{R}^X$, for any set $X$.
c. If $p$ is subadditive, then $p(x) \le p(y) + p(x-y)$ and $p(y) \le p(x) + p(y-x)$, hence $-p(y-x) \le p(x) - p(y) \le p(x-y)$.
d. If $p : X \to [0,+\infty)$ is balanced and subadditive, then $p(cx) \le (\lfloor |c| \rfloor + 1)\,p(x) \le (|c| + 1)\,p(x)$ for all vectors $x$ and scalars $c$, where $\lfloor s \rfloor$ is the greatest integer less than or equal to $s$.
e. Exercise. If $\beta : [0,+\infty) \to [0,+\infty)$ is concave and $\beta(0) = 0$, then $\beta$ is subadditive. Hint: Use the decomposition $x = (1-\lambda)0 + \lambda(x+y)$ to show $\beta(x) \ge \frac{x}{x+y}\,\beta(x+y)$. Similarly, $\beta(y) \ge \frac{y}{x+y}\,\beta(x+y)$. Now add these two results. In particular, some subadditive functions are the functions $\arctan(s)$, $\min\{1, s\}$, $\frac{s}{1+s}$, $\tanh(s)$, and $s^p$ for $p \in (0,1]$. (Some beginners may be unfamiliar with $\tanh(s)$, which is the function $(e^s - e^{-s})/(e^s + e^{-s})$.) All of these functions except the last are also bounded; that fact will be significant in Chapter 18.

12.26. Pryce's Sublinearity Lemma. Let $X$ be a linear space, let $Y \subseteq X$ be a convex set, and let $\xi \in X$. Let $p : X \to \mathbb{R}$ be a sublinear functional. Let $a, b, c \in (0,+\infty)$, with
$$p(\xi) + ac \;<\; \inf_{y \in Y} p(\xi + a y).$$
ayo Jrbyl ~ q. a a b _b inf p(~ + arl + by) (l+)inf a >  p({+a~) 9 ~a + b ' yeY  p({) _> = b b (1 + . Y and let ~  Then ~ E Y and b b (1 + .b ) y By sublinearity of p.(a q. (This rather technical result will not be needed until 28. The hypothesis can be restated as: p(~) Consider any y0 yl 6  ac y6Y inf p(~ § ay) + 6 ayo+byl a+b for some 6 > 0. then {r c [0.) Proof (Pryce [1966]).) p ( ~ + a~) p(~). +oc) that includes 0. MINKOWSKI FUNCTIONALS 12. p(~ Jrayo + by1) Hence for any fixed ~ E Y.Minkowski Functionals Then there is a point r / c Y such that p({ + aT]) + bc < y6Y 315 inf p({ + arI + by).infp({+ay) yEY +6 a The last expression. a a +.{ . we have yEY >_ (1 + . so is a subinterval of (0.37. in square brackets []. +oc] that includes oc.) ( { + aft) . . Note that if z is a point in the vector space (not necessarily a member of S). is greater than bc + p(~ + arl) if we choose r / c Y appropriately.p(~) a yEY a (1+) a yEY inf p ( { + a y ) +  a ac.27. Let S be a star set in a vector space X.. Definitions. +oc) : r z ~ S } 1 is a subinterval of [0.) inf p(~ + ay) .
Thus the number
$$\mu_S(x) = \inf\{k \in (0,+\infty] : k^{-1}x \in S\}$$
is well defined (though it may be $\infty$). The function $\mu_S : X \to [0,+\infty]$ is the Minkowski functional of the set $S$. In our applications, we will use Minkowski functionals $\mu_S$ mainly when $S$ is a convex set, a case investigated in 12.29, but the most basic properties of Minkowski functionals do not involve convexity.

12.28. Proposition. Let $X$ be a vector space, and let $S$ be a nonempty subset of $X$. Let $g : X \to [0,+\infty]$ be some function. Then the following two conditions are equivalent:
(A) $S$ is a star set and $g$ is the Minkowski functional of $S$.
(B) $g$ is a positively homogeneous function and $\{x \in X : g(x) < 1\} \subseteq S \subseteq \{x \in X : g(x) \le 1\}$.

Hints for (B) $\Rightarrow$ (A): To show $S$ is a star set, observe that
$$x \in S,\ \lambda \in [0,1) \;\Rightarrow\; g(x) \le 1 \;\Rightarrow\; g(\lambda x) = \lambda g(x) < 1 \;\Rightarrow\; \lambda x \in S.$$
To show $\mu_S \le g$, observe that
$$g(x) < r \;\Rightarrow\; g(r^{-1}x) < 1 \;\Rightarrow\; r^{-1}x \in S \;\Rightarrow\; \mu_S(x) \le r.$$
To show $g \le \mu_S$, observe that
$$\mu_S(x) < r \;\Rightarrow\; r^{-1}x \in S \;\Rightarrow\; g(r^{-1}x) \le 1 \;\Rightarrow\; g(x) \le r.$$

12.29. Corollaries and further properties.
a. A function from a vector space into $[0,+\infty]$ is positively homogeneous if and only if it is the Minkowski functional of some star set. If $g$ is such a positively homogeneous function, then both the sets $\{x \in X : g(x) < 1\}$ and $\{x \in X : g(x) \le 1\}$ are star sets with Minkowski functional equal to $g$.
b. It is easy to see that the mapping $S \mapsto \mu_S$ is direction-reversing, i.e., if $S$ and $T$ are star sets and $S \subseteq T$, then $\mu_S \ge \mu_T$. The smallest star set, $\{0\}$, has the largest Minkowski functional; it is easily seen to be
$$\mu_{\{0\}}(x) = \begin{cases} 0 & \text{if } x = 0, \\ +\infty & \text{if } x \ne 0. \end{cases}$$
The largest star set, $X$, has the smallest Minkowski functional: $\mu_X$ is just the constant function 0.
c. Let $S$ be a balanced star set. Then $\mu_S$ is finite-valued (i.e., does not take the value $+\infty$) if and only if $S$ is absorbing (as defined in 12.8).
d. The Minkowski functional of any balanced set is a balanced function.
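The definition of the Minkowski functional can be tried out numerically. The following is only a rough sketch, not part of the text's development: it assumes the star set is given by a membership test, and the names `minkowski_functional`, `in_square`, `in_segment`, and the cutoff `k_max` are hypothetical choices made for this illustration. Because $S$ is a star set, the set of admissible $k$ is an interval reaching up to $+\infty$, so bisection locates its left endpoint.

```python
import math

def minkowski_functional(in_set, x, k_max=1e6, tol=1e-9):
    """Approximate mu_S(x) = inf{k > 0 : x/k in S} for a star set S.

    `in_set` is a membership test for S.  Values of k above `k_max`
    are not explored, so "no k <= k_max works" is reported as +infinity.
    """
    if not in_set([c / k_max for c in x]):
        return math.inf
    lo, hi = 0.0, k_max          # invariant: x/hi is in S
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if in_set([c / mid for c in x]):
            hi = mid
        else:
            lo = mid
    return hi

def in_square(v):
    # Closed unit ball of the max norm on R^2: convex, balanced, absorbing.
    return max(abs(c) for c in v) <= 1.0

def in_segment(v):
    # The segment {(t, 0) : |t| <= 1}: a star set that is not absorbing.
    return v[1] == 0 and abs(v[0]) <= 1.0

print(minkowski_functional(in_square, (3.0, -2.0)))   # close to max(|3|, |-2|)
print(minkowski_functional(in_segment, (0.0, 1.0)))   # inf: this direction is never absorbed
```

For the square the computed values agree, up to the tolerance, with the max norm, and doubling the argument doubles the result, as positive homogeneity requires; the segment shows how a star set that is not absorbing gives the value $+\infty$.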
e. If $S$ is a convex star set, then $\mu_S$ is a convex function. Hints: If $0 < \lambda < 1$ and $\alpha > \mu_S(x)$ and $\beta > \mu_S(y)$, then $\alpha^{-1}x$ and $\beta^{-1}y$ belong to $S$, hence
$$\frac{\lambda x + (1-\lambda) y}{\lambda\alpha + (1-\lambda)\beta} \;=\; \frac{\lambda\alpha}{\lambda\alpha + (1-\lambda)\beta}\,\alpha^{-1}x \;+\; \frac{(1-\lambda)\beta}{\lambda\alpha + (1-\lambda)\beta}\,\beta^{-1}y \;\in\; S.$$
f. The converse of that last result is false; a star set $S$ may be nonconvex and still have $\mu_S$ convex. For instance, let $X = \mathbb{R}^2$ and define $g(x,y) = \max\{|x|, |y|\}$; this function is convex. Thus the sets $A = \{(x,y) \in \mathbb{R}^2 : g(x,y) < 1\}$ and $B = \{(x,y) \in \mathbb{R}^2 : g(x,y) \le 1\}$ are convex. The function $g$ is the Minkowski functional of both $A$ and $B$, and also of any set between $A$ and $B$, but $A \subseteq S \subseteq B$ does not imply $S$ is convex.
g. Lemma on the construction of seminorms. Let $S \subseteq X$ be a convex, balanced set (not necessarily absorbing), and let $\mu_S$ be its Minkowski functional. Then the linear span of $S$ is the set
$$X_0 = \bigcup_{n=1}^{\infty} nS = \{x \in X : \mu_S(x) < \infty\},$$
and $\mu_S$ is a seminorm on that set. If $S$ is also absorbing, then $X_0 = X$, that is, $\mu_S$ is a seminorm on $X$.
Minkowski functionals will be used in Chapter 22 and in later chapters.

HAHN-BANACH THEOREMS

12.30. Introduction. The literature contains many closely related theorems, any one of which may be referred to as "the Hahn-Banach Theorem." These theorems are useful in different ways in different parts of analysis. We shall prove about 20 of these theorems, in this and later chapters; see 12.31 and Chapters 23, 26, 28, and 29. The various equivalent forms will be denoted by (HB1), (HB2), (HB3), etc.; collectively we shall refer to them all as HB. Still other forms of the Hahn-Banach Theorem are given by Buskes [1993], Gluschankof and Tilli [1987], Holmes [1975], Luxemburg [1969], Thierfelder [1991], Tuy [1972], and Zowe [1978]. (It is surprising that some seemingly weaker forms of HB, commonly presented as "corollaries" in the literature, such as (HB1), (HB4), and (HB9), are in fact equivalent to HB in their set-theoretic strength.)

We shall view the Hahn-Banach Theorems as weak forms of the Axiom of Choice. However, many analysts prefer to view the Axiom of Choice (AC) as simply being "true" and will therefore view the Hahn-Banach Theorem in the same fashion; these readers can skip some of the converse proofs. We shall keep track of effective proofs because the Hahn-Banach Theorem is nonconstructive: It implies the existence of certain pathological objects for which we have no explicit examples.

Remark. Considered as a set-theoretic principle, the Hahn-Banach Theorem is weaker than the Ultrafilter Principle, which is in turn weaker than the Axiom of Choice. (In 17.6 we shall prove UF $\Rightarrow$ HB.) In fact, the Hahn-Banach Theorem is strictly weaker than the Ultrafilter Principle; that fact was established by Pincus [1972], but its proof is beyond the scope of this book. A survey comparing the relative strengths of the Hahn-Banach Theorem and other weak forms of Choice is given by Pincus [1974]. Some theorems that appear similar to the
Hahn-Banach Theorem are in fact equivalent to the Axiom of Choice (see Lembcke [1979]) or the Ultrafilter Principle (see Buskes and van Rooij [1992]).

Many of the Hahn-Banach Theorems can be extended to complex vector spaces via the Bohnenblust-Sobczyk Correspondence (11.12): If $X$ is a complex vector space on which $\lambda$ is a linear functional, then $X$ can also be viewed as a real vector space on which $\mathrm{Re}\,\lambda$ is a linear functional. We shall omit the details of that argument; for simplicity we shall generally only consider real vector spaces.

12.31. Real-valued, Nontopological Hahn-Banach Theorems. We begin with vector space versions of the Hahn-Banach Theorem which do not involve any topology. In later chapters we shall present other versions in normed vector spaces, topological vector spaces, and Boolean algebras.

Following are our most basic versions of the Hahn-Banach Theorem; we shall show that they are equivalent to one another and are consequences of the Axiom of Choice. However, the proofs will be postponed until 12.36, 12.37, and 12.38, where we present proofs of more general results.

For brevity we combine certain theorems. Theorem (HB2) assumes $p$ is convex where (HB3) assumes $p$ is sublinear; otherwise those two theorems are identical. Theorems (HB4) and (HB5) differ in the same fashion. Of course, in each case the sublinear version is just a weakened form of the convex version, since any sublinear function is convex. Most of the literature assumes sublinearity (a notable exception being the excellent textbook of Reed and Simon [1972]), but it seems not to be widely known that the assumption of sublinearity can be replaced by the weaker hypothesis of convexity. The sufficiency of convexity was noted at least as early as Nakano [1959]. We will use convexity instead of sublinearity throughout this book.

It is interesting to compare (HB4) with 16.16(D) and (HB17) (see 28.4). It is also interesting to compare (HB6) with Dowker's Sandwich Theorem 16.30. Banach limits, introduced below, will be discussed further in 12.33.

(HB1) Existence of Banach Limits. Let $(\Delta, \preccurlyeq)$ be a directed set, and let $B(\Delta) = \{$bounded functions from $\Delta$ into $\mathbb{R}\}$. Then there exists a real-valued Banach limit for $(\Delta, \preccurlyeq)$, that is, a linear map $\mathrm{LIM} : B(\Delta) \to \mathbb{R}$ that satisfies $\mathrm{LIM}(f) \le \limsup_{\delta \in \Delta} f(\delta)$ for each $f \in B(\Delta)$. Note that any member of $B(\Delta)$ is a bounded net of real numbers, hence it has a limsup.

(HB2) Convex Extension Theorem and (HB3) Sublinear Extension Theorem. Suppose $X$ is a real vector space, $X_0 \subseteq X$ is a linear subspace, $\lambda : X_0 \to \mathbb{R}$ is a linear map, $p : X \to \mathbb{R}$ is a convex (or sublinear) function, and $\lambda \le p$ on $X_0$. Then $\lambda$ can be extended to a linear map $\Lambda : X \to \mathbb{R}$ that satisfies $\Lambda \le p$ on $X$.

(HB4) Convex Support Theorem and (HB5) Sublinear Support Theorem. Any convex (or sublinear) function from a real vector space into $\mathbb{R}$ is the pointwise maximum of the affine functions that lie below it. That is, if $p : X \to \mathbb{R}$ is convex (respectively, sublinear), then for each $x_0 \in X$ there exists some affine function $f : X \to \mathbb{R}$ that satisfies $f(x) \le p(x)$ for all $x \in X$ and
$f(x_0) = p(x_0)$.

(HB6) Sandwich Theorem. Let $C$ be a convex subset of a real vector space. Suppose that $e : C \to \mathbb{R}$ is a concave function, $g : C \to \mathbb{R}$ is a convex function, and $e \le g$ everywhere on $C$. Then there exists an affine function $f : C \to \mathbb{R}$ satisfying $e \le f \le g$.

Proofs will be given in 12.36, 12.37, and 12.38.

Remarks. The notion of generalized limits can be traced back at least as far as Banach [1932]. In the case of $Z = \mathbb{R}$, our definition of "Banach limit" agrees with the definition given by Yosida [1964]. However, the precise definition of "Banach limit" varies slightly from one paper to another in the literature.

CONVEX OPERATORS

Before proving that the principles in 12.31 follow from each other and from the Axiom of Choice, we shall generalize slightly. This will require more definitions:

12.32. Definitions. Let $C$ be a convex subset of a vector space and let $(Z, \preccurlyeq)$ be an ordered vector space (not necessarily a vector lattice). A mapping $p : C \to Z$ is
sublinear if $p(x+y) \preccurlyeq p(x) + p(y)$ and $p(tx) = t\,p(x)$ for all $t \in [0,+\infty)$,
convex if $p(tx + (1-t)y) \preccurlyeq t\,p(x) + (1-t)\,p(y)$ for all $t \in [0,1]$,
concave if $p(tx + (1-t)y) \succcurlyeq t\,p(x) + (1-t)\,p(y)$ for all $t \in [0,1]$,
affine if $p(tx + (1-t)y) = t\,p(x) + (1-t)\,p(y)$ for all $t \in [0,1]$;
in each case the condition is to hold for all $x, y \in C$. Note that any sublinear function is convex.

The theory of convex operators obviously includes the theory of convex real-valued functions. It also includes the theory of affine operators between (unordered) linear spaces, for if $f : X \to Y$ is any affine mapping, we can make it into a convex operator by equipping $Y$ with the trivial ordering described in Chapter 8. Another way to unite the theories of linear operators and convex functionals is via Ioffe's "fans;" these are convex-set-valued functions. An introduction to the subject and further references are given by Ioffe [1982]. We shall study convex operators further in Chapters 26 and 27.

Let $(\Delta, \sqsubseteq)$ be a directed set, and let $(Z, \preccurlyeq)$ be a Dedekind complete, ordered vector space. (The reader should keep in mind the special case of $Z = \mathbb{R}$; that is the simplest and most important case.) Let $B(\Delta, Z) = \{$bounded functions from $\Delta$ into $Z\}$; in this context a function $f$ is bounded if its range has an upper bound and a lower bound. Note that $B(\Delta, Z)$ is itself an ordered vector space.
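The Convex Support Theorem (HB4) can be seen concretely in the simplest case $X = \mathbb{R}$. The sketch below is only an illustration under assumptions of our own choosing: the convex function $p(x) = x^2$, the helper name `support_line`, and the finite grid are all hypothetical, and for a differentiable $p$ the supporting affine function is simply the tangent line, so no appeal to the Axiom of Choice is involved here.

```python
def p(x):
    # An illustrative convex function on the real line.
    return x * x

def support_line(x0):
    """Return an affine f with f <= p everywhere and f(x0) = p(x0).

    For differentiable p this is the tangent line at x0; here p'(x0) = 2*x0.
    """
    slope = 2.0 * x0
    return lambda x: p(x0) + slope * (x - x0)

grid = [k / 10.0 for k in range(-50, 51)]

# One support line: lies below p on the grid and touches p at x0 = 1.5.
f = support_line(1.5)
below = all(f(x) <= p(x) + 1e-12 for x in grid)
touches = abs(f(1.5) - p(1.5)) < 1e-12

# p is recovered (on the grid) as the pointwise maximum of its support lines.
lines = [support_line(x0) for x0 in grid]
recovered = [max(g(x) for g in lines) for x in grid]
max_error = max(abs(r - p(x)) for r, x in zip(recovered, grid))

print(below, touches, max_error)
```

The gap $p(x) - f(x)$ equals $(x - x_0)^2$, which is why the line stays below $p$ and touches it only at $x_0$.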
That is. Z). That linear subspace .f ) and L I M ( . it says that every bounded net of real numbers "converges" to a real number. the positive integers with their usual o r d e r i n g . _) is the ordered set (N. in which case their common value is the limit of f. At the end of this section we shall show that the following three conditions are equivalent: (A) LIM(f) 4 limsuP~EA (B) liminf5EA f(5) for each f E B(A. and lim inf(f) 4 limsup(f). it gives us a way of saying that..they are given by a linear map but they do not preserve topological properties quite as well as the limsup and liminf do. limsup(f)  a E A ~_~a inf sup f(/3) both exist in Z. the net (u(c) : c E A) decreases to limsuP~EA f(~).f) ~ 0. We may sometimes refer to this as the o r d e r limit. Banach limits have slightly better algebraic p r o p e r t i e s . __) If. We say that the net f c o n v e r g e s if and only if liminf(f) and l i m s u p ( f ) are equal. f ~ 0 =~ LIM(f) ~ 0. but they may give different generalized limits for those nets that do not converge in the usual sense. We may also call it the o r d i n a r y limit.320 Chapter 12: Convexity A member of B(A.generally is not all of B(A. condition (A) implies that LIM is a positive operator. We emphasize that there may be many different Zvalued Banach limits for a directed set (A. so L I M ( u . from a linear subspace of B(A. of these three conditions are satisfied. we say LIM is a Z . Clearly. It remains only to prove (C) =~ (A). Z). and LIM(f) = limsEA f(5) whenever the right side of that equation exists. taking values in Z. Z) may be viewed as a bounded net based on A. Also.L I M ( f ) by the linearity of LIM.. thus u . Z). f(5) ~ LIM(f) 4 limsuPSED f(5) for each f E B(A.e. Banach limits could be contrasted with limsups and liminfs (discussed as "generalized limits" in 7.l i m i n f ( . and condition (B) implies that LIM agrees with lim wherever the latter exists." In particular. 
(C) LIM is a positive operator that extends the ordinary limit to all of B(A. Note that (ezercise) the order limit is a positive linear operator.. Z). (A) ~ . If one. hence all. a realvalued sequential Banach limit is a way of saying that every bounded sequence of real numbers converges to a real number. when Z = R. For each e E A. Then u(c) ~ f(c). Thus LIM(f) 4 LIM(u) = lim~EzX u(C) = limsuP~EA f(5). Z) into Z.that is.f ) = . to contrast it with the generalized limit developed in the next few paragraphs. _E). The Banach limit is an extension of the ordinary limit. Z) the objects liminf(f) aEA ~ a sup inf f(~). denoted lim(f). In the next few sections we shall prove the existence of Banach limits. let u(e) = supsz~ f(5). moreover. (B) since limsup f = . in a generalized sense. for any f E B(A. (A. every bounded net "converges.i. They agree on those nets that converge in the usual sense.46).. the space of all convergent nets .v a l u e d B a n a c h l i m i t for the directed set (A. since some bounded nets are not convergent in the sense of ordinary limits. Proof of equivalence: First. Since Z is Dedekind complete. Let LIM : B(A) ~ Z be a linear map. _<) .then we shall call LIM a s e q u e n t i a l B a n a c h limit.f ~ 0.
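A Banach limit cannot be computed explicitly, since its existence is nonconstructive, but the flavor of conditions (A) through (C) can be illustrated with a more modest device. The sketch below uses Cesàro means, which are not a Banach limit on all bounded sequences (they fail to converge for some of them); they merely show, on sequences where they do settle down, a positive linear "generalized limit" that lies between liminf and limsup and agrees with the ordinary limit. The function names and the finite cutoffs are assumptions made for this illustration.

```python
def cesaro_mean(seq, n):
    """Average of the terms seq(1), ..., seq(n)."""
    return sum(seq(k) for k in range(1, n + 1)) / n

def tail_inf_sup(seq, start, stop):
    """inf and sup of seq(k) for start <= k < stop.

    A finite stand-in for the tail beyond `start`; it only approximates
    the liminf and limsup of the full sequence.
    """
    values = [seq(k) for k in range(start, stop)]
    return min(values), max(values)

def alternating(k):
    # 1, 0, 1, 0, ... : bounded but not convergent.
    return float(k % 2)

def convergent(k):
    # Converges to 2 in the ordinary sense.
    return 2.0 + 1.0 / k

lo, hi = tail_inf_sup(alternating, 5000, 10000)
mean_alt = cesaro_mean(alternating, 10000)
mean_conv = cesaro_mean(convergent, 100000)

print(lo, mean_alt, hi)    # the mean sits between the tail inf and tail sup
print(mean_conv)           # close to the ordinary limit 2
```

Any real-valued sequential Banach limit must assign the alternating sequence some value between 0 and 1; the definition used in this section does not force the value 1/2, which is just the particular value the Cesàro mean happens to produce.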
12.34. Vector-valued Hahn-Banach Theorems. We now generalize the theorems of 12.31. Let $Z$ be a Dedekind complete, ordered vector space. We shall show that the following principles are equivalent to each other and that they are all consequences of the Axiom of Choice. (It is not yet known whether they are equivalent to the Axiom of Choice or are strictly weaker.) Proofs will be given in the next few sections.

(VHB1) Existence of Banach Limits. If $(\Delta, \preccurlyeq)$ is any directed set, then there exists a $Z$-valued Banach limit for $(\Delta, \preccurlyeq)$ (as defined in 12.32).

(VHB2) Convex Extension Theorem and (VHB3) Sublinear Extension Theorem. Suppose $X$ is a real vector space, $X_0 \subseteq X$ is a linear subspace, $\lambda : X_0 \to Z$ is a linear map, $p : X \to Z$ is a convex (or sublinear) function, and $\lambda \preccurlyeq p$ on $X_0$. Then $\lambda$ can be extended to a linear map $\Lambda : X \to Z$ that satisfies $\Lambda \preccurlyeq p$ on $X$.

(VHB4) Convex Support Theorem and (VHB5) Sublinear Support Theorem. Any convex (or sublinear) function from a real vector space into $Z$ is the pointwise maximum of the affine functions that lie below it. That is, if $p : X \to Z$ is convex (respectively, sublinear), then for each $x_0 \in X$ there exists some affine function $f : X \to Z$ that satisfies $f(x) \preccurlyeq p(x)$ for all $x \in X$ and $f(x_0) = p(x_0)$.

(VHB6) Sandwich Theorem. Let $C$ be a convex subset of a real vector space. Suppose that $e : C \to Z$ is a concave function, $g : C \to Z$ is a convex function, and $e \preccurlyeq g$ everywhere on $C$. Then there exists an affine function $f : C \to Z$ satisfying $e \preccurlyeq f \preccurlyeq g$.

12.35. Finite Extension Lemma (FEL). Suppose $X$ is a real vector space and $Z$ is a Dedekind complete, ordered vector space. Suppose $X_0 \subseteq X$ is a linear subspace, $\lambda_0 : X_0 \to Z$ is linear, $p : X \to Z$ is convex, and $\lambda_0 \preccurlyeq p$ on $X_0$. Also assume that
(*) $X = \mathrm{span}(X_0 \cup S)$ for some finite set $S \subseteq X$.
Then $\lambda_0$ can be extended to a linear map $\Lambda : X \to Z$ satisfying $\Lambda \preccurlyeq p$ on $X$.

Remarks. Note that FEL differs from (VHB2) only in the addition of the hypothesis (*). We shall use FEL in proving (VHB2). FEL does not require the Axiom of Choice or any of its weaker relatives; FEL can be proved using just ZF. FEL or a similar result was already known to Banach; it appears explicitly in Luxemburg [1969].

Proof of FEL. The proof is by induction on the cardinality of $S$; thus we may assume $S$ contains just one element $\xi$. Using the linearity of $\lambda_0$, the convexity of $p$, and the fact that
$\lambda_0 \preccurlyeq p$ on $X_0$, we can verify that
$$\sup_{w \in X_0,\ s < 0} \frac{p(w + s\xi) - \lambda_0(w)}{s} \;\preccurlyeq\; \inf_{v \in X_0,\ r > 0} \frac{p(v + r\xi) - \lambda_0(v)}{r}$$
(the details of the verification are left as an exercise). Now let $\Lambda(\xi)$ be any member of $Z$ lying between those two values. $\Lambda$, being linear, must be defined by $\Lambda(x + r\xi) = \lambda_0(x) + r\Lambda(\xi)$ for all $x \in X_0$, $r \in \mathbb{R}$. From our choice of $\Lambda(\xi)$ it follows that $\Lambda \preccurlyeq p$ on $X$ (again, the details of the verification are left as an exercise). This completes the proof.

12.36. Proof of AC $\Rightarrow$ (VHB2). We shall give two different proofs.

Version (i): This is the more traditional proof. Consider all linear maps $\Lambda : W \to Z$, where $W$ is a linear subspace of $X$ that includes $X_0$ and $\Lambda$ is an extension of $\lambda$ that satisfies $\Lambda \preccurlyeq p$ on $W$. Partially order such $\Lambda$'s by inclusion of their graphs. By Zorn's Lemma, there is a maximal member of this partially ordered set, i.e., an extension $\Lambda : W \to Z$ that cannot be extended farther. If $W \subsetneq X$, choose any $\xi \in X \setminus W$. By the Finite Extension Lemma 12.35, $\Lambda$ can be extended to all of $\mathrm{span}(W \cup \{\xi\})$, contradicting the maximality of $W$. Thus $W = X$, completing the proof.

Version (ii): Some mathematicians may find the Finite Character Principle more intuitively appealing than Zorn's Lemma, so we offer an alternative proof. Consider all functions $\Lambda : W \to Z$ where $W$ is a subset of $X$ that includes $X_0$, $\Lambda$ is an extension of $\lambda$, and $\Lambda$ is a function that can be extended to a linear function (see 11.10) that satisfies $\Lambda \preccurlyeq p$ on $W$. Partially order such $\Lambda$'s by inclusion of their graphs. Use the Finite Character Principle ((AC5), in 6.20) to show that there is a maximal $\Lambda$ in this collection.

12.37. Most of the equivalence proofs. In this section we prove most of the equivalences stated in 12.34; the one remaining argument is much longer and will be given separately in 12.38.

Proof of (VHB2) $\Rightarrow$ (VHB3). Obvious.

Proof of (VHB4) $\Rightarrow$ (VHB5). Obvious.

Proof of (VHB2) $\Rightarrow$ (VHB4). Define $q(x) = p(x + x_0) - p(x_0)$. Then $q$ is also convex, and $q(0) = 0$. By (VHB2) (with $X_0 = \{0\}$), there exists some linear function $g : X \to Z$ that satisfies $g \preccurlyeq q$ everywhere on $X$. Now let $f(x) = g(x - x_0) + p(x_0)$. This completes the proof.

Proof of (VHB3) $\Rightarrow$ (VHB1). Let $X = B(\Delta, Z)$, and let $X_0 = \{$convergent nets$\}$. Let $p(x) = \limsup_{\delta \in \Delta} x(\delta)$; this functional is easily verified to be sublinear. For $x \in X_0$, let $\lambda(x) = \lim x$; then $\lambda \preccurlyeq p$ on $X_0$. By (VHB3), $\lambda$ extends to a linear map $\mathrm{LIM} : X \to Z$ satisfying $\mathrm{LIM} \preccurlyeq p$ on $X$; this completes the proof.

Proof of (VHB5) $\Rightarrow$ (VHB1). The mapping $\Phi : f \mapsto \limsup_{\delta \in \Delta} f(\delta)$ is a sublinear mapping from $B(\Delta, Z)$ into $Z$, which vanishes on the zero element of the linear space $B(\Delta, Z)$. Let $\mathrm{LIM} : B(\Delta, Z) \to Z$ be an affine mapping satisfying $\mathrm{LIM} \preccurlyeq \Phi$ and $\mathrm{LIM}(0) = \Phi(0) = 0$. The function $\mathrm{LIM}$, being affine and vanishing at 0, is linear, completing the proof.
This proof takes several ingredients from Neumann [1994]. the function p is constructed from ct. The proof will be in several steps.fo(x). y E X and c~. then ~ E Z+. S) 9 S is a finite subset of X and g E ~ s } . let (I)s be the set of all functions g 9X ~ Z that have the following properties: g is an extension of fo. Then ~Px is bounded. (The term "canonically" here refers to the fact that no arbitrary choices are needed. Define LIM" B(A.(g. For each x E X.1. We first show that (a) From any f E N and c~. and so he used Zorn's Lemma to find a minimal element of the set N investigated below. Instead. Observe that f ( x ) . and let A be ordered by: (gl. Now let A = {(g. we can canonically construct a function p E 5I that satisfies p(x) 4 f ( x ) and p (c~x + ~y) ~ c~p(x) +/3e(y) for all x. since we shall apply this construction infinitely many times later in this proof. each (I)s is nonempty. Let f0 " X0 ~ Z and p be given. y}.e(x) : x E C}.0 for all (5 sufficiently large. LIM(~x+ZyC~x/~y) (The proof above is a reformulation of an argument of Luxemburg [1969].x ) ~ g(x) ~ p(x) for all x E X. Fix any x . Since (I)s N(I)T . Then for all ~i . In our present investigation of weak forms of Choice. we may assume t = 0. Define a function f " X + Z by taking f ( x ) .38. For each finite set S c_ X.g(x). since . Neumann was not concerned with weak forms of Choice. then ~Px(5) .p ( .LIM(~Px). Z) + Z as in (VHB1).LIM(~bx) 4 LIM(p(x)) . However. T h a t fact is important.p ( . we have f (c~x + fly) . Thus g(c~x +/3y) .v) 0.fig(y) . y E C . we have x.) . ~. in our proof (heretofore unpublished) we shall use infinitely many decreasing sequences in :JV[.c~f (x) . since p(x) does not depend on (5. y E S. $2) if S1 C_ $2.) Proof of (VHB6) ~ (VHB1).~ . . S1) 4 (g2. so I(x) L I M ( f 0 ( x ) ) . 12.c~g(x) . our condition (VHB1) is essentially a translation of Luxemburg's Theorem 6./3 E (0. By replacing g with g .1. 
and the restriction of g to span(X0 U S) is linear.x ) 4 ~bx((5) 4 p(x) for all ~ E A. It remains to show that f is linear.C~x(~i) /3~y(~i) . so g is linear on the span of {x. then A is a directed set.p(x). Since LIM is a linear operator. The set 5I is nonempty..Convex Operators 323 Proof of (VHB1) =.S) . we are not permitted to use Zorn's Lemma an equivalent of AC and so it is not clear that :M has a minimal element. 1) with c~ + / 3 ./ 3 f ( y ) =  LIM(ga~x+Zv) . By Banach's Finite Extension Lemma (in 12. S) sufficiently large in A. Let ~ = i n f { g ( x ) . We use nets where Luxemburg used reduced powers of Z and the terminology of nonstandard analysis. Let N be the set of those convex functions f : C + Z that satisfy e 4 f 4 g on C.c~LIM(~b.(PSuT. and f by a uniquely specified algorithm. Therefore ~x+Zy(~i) .O.~ E Z. (VHB2). define a function gax " A + Z by taking gax(g.) /3LIM(g. Proof of (VHB1) ~ (VHB6). Also observe that if x E X0. the family of sets (I)s has the finite intersection property.35).fo(x). Let e  liminf and g  limsup. since g E 5I.
This completes the proof of (a).. we define a decreasing sequence f0 ~ fl ~ ]'2 ~ f3 ~ "'" in 9~ as follows: Let f0 = f. Since e is concave and e ~ fn. too (easy exercise). Pn ~ fn and with ~p2 ~f3 ~P3~"" in:M: (2) that is. y e C. so p is convex. too. 1) x C by making it periodic in j with period J. ~ j ) ' j E N) in (0. Extend the given finite sequence to an infinite sequence (((~j. we have e(x) ~ [fn(ax + ~ y ) .~nY) .. Now assume some fn c 9V[ is given. (0L2.9 (y) A ( x ) + Z [fn(Y)  f (x) + _Z [g(y) _ for all x.c~j)y) = c~jq(~j) + (1 .~e(Y) .OLnPn(r /~n for y e C.f. choose Pn C :)~ as in statement (a) pn(OlnX + ~nY) ~ OlnPn(X) ~ ~ne(y) Then we can define a convex function fn+l " C >Z by fn+l (Y) " for all x. From fn ~ g and the convexity of f~. Now define p(x) = infneN fn(x).y C C. By taking the infimum over all n in both sides of (1).~e(y) 9 y E C} . From the convexity of fn and the concavity of e. we find that fn+l is convex. y E C.1 . with Given fn. (4) From (3) we may deduce that e ~ fn+l o n C. Construct a sequence fl~pl~f2 as follows: Let fl .. Let/~j . this yields fn+l(X) ~ fn(X) since i n f { g ( y ) . fn+l ~ Pn follows from the convexity of Pn. Then . (1) Then e ~ fn+l. On the other hand. Hold x fixed and take the infimum over all y.~j)) f [ \ in (0. we obtain p(x)  inf{P(ax+~Y)c~ .e(y) : y E C} = 0.c~j. (3) Pn (OLnCn ~. 1) • C.324 Chapter 12: Convexity To prove (a). Hence p E ?yr. Thus fn+l c 9~. (Olj. Next we shall show that (b) From any function f E :M: and any finite sequence ((011. Hence we may define a function fn+l : C* Z by f n+ l (X) inf { f n (olx + ~Y)a .. Thus fn+l c 9V[.~2).~1). Y E C } . we obtain A( x + 9Y) .~e(y)]/c~ for all x. This completes the recursive construction of the sequence {fn }. Now let q be the pointwise infimum of the functions in that sequence.aj)q(y) for all y E C and 1 _< j _< J. completing the recursive construction of the sequence (2). The pointwise infimum of a chain of convex functions is convex. 
we can canonically construct a function q c :M: that satisfies q(y) ~ f(y) and q(o~j~j + (1 .
Taking order limits. For such an ordered pair (Q. y E C and ct E (0. q) 4 g(z) for all (Q. In this fashion we define a function ~ x " A + Z.c~)f(y).ct)@v(Q. we obtain f(c~x + (1 . Let A be the set of all ordered pairs (Q. and consider the subsequence of equations obtained from (4) by taking n .q) + (1 . j + 3J. 1). Finally we proceed to our main construction. we obtain q(Y) = q(ctj~j +/3jy) c~jq(~j) 9j for y E C. That is. Thus ~x E B(A. Define f ( x ) . Fix some particular j. q) and for each z E C. By (VHB1) there exists a Zvalued Banach limit LIM" B(A. For all (Q.c~)y) .q(z). j + 2J. Z) Z. let ~x(Q.Convex Operators 325 q is the order limit of the fn's. q) where Q is a finite subset of (0. This completes the proof. z) E Q. j + J. . Say that (Q. fix any z. q) sufficiently large. @c~z+(1a)y(Q.c~q(z)+ (1 c~)q(y). Z).c~f(z) + (1 . as well as the order limit of the pn'S. from statement (b) it follows that _ is a directed ordering of A. 1) x C and q is a member of :M: that satisfies q(a~+(1a)y)aq(~)+(1a)q(y) for a l l y E C a n d ( a . q)  ctq~x(Q. q) sufficiently large in A. ~ ) EQ. q) E A. q) for all (Q. This function has bounded range" e(z) ~ ~x(Q. . r) whenever Q c_ R. Apply the Banach limit on both sides. we have (c~. so it preserves linear combinations. The order limit is a linear operator. Then q(c~z + (1 c~)y) .j. q) E_ (R. q) . . This proves (b). Then e(x) 4 f(z) 4 g(z).LIM(~x). To show f is affine. . .
Chapter 13
Boolean Algebras

Much of this chapter is based on Halmos [1963], Monk [1989], and Sikorski [1964].

BOOLEAN LATTICES

13.1. Definition. Let $(X, \preccurlyeq)$ be a lattice that has a first element and a last element, denoted 0 and 1, respectively. If $x \in X$, then a complement of $x$ is an element $y$ that satisfies $x \wedge y = 0$ and $x \vee y = 1$. The lattice $X$ is complemented if each of its elements has at least one complement.

Recall from 4.23 that a lattice is distributive if its binary operations $\vee, \wedge$ distribute over each other, i.e., if
$$x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z), \qquad x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z)$$
for all $x, y, z \in X$.

Exercise. If $X$ is a distributive lattice with smallest and largest elements 0 and 1, then each member of $X$ has at most one complement.

A Boolean lattice is a complemented lattice that is also distributive. (Some mathematicians add the further requirement that $0 \ne 1$, but we shall not impose that restriction; see the remarks in 13.4.a.) By the exercise above, if $X$ is a Boolean lattice, then each $x \in X$ has exactly one complement, which we shall denote by $\complement x$. It is convenient to also define the symmetric difference of two elements $x$ and $y$:
$$x \,\triangle\, y = (x \wedge \complement y) \vee (\complement x \wedge y).$$

A Boolean lattice is complete if its ordering is complete, i.e., if every subset $S \subseteq X$ has a supremum (written $\bigvee S$) and an infimum (written $\bigwedge S$).

A Boolean lattice is essentially the same thing as a Boolean algebra, and the two terms may be used interchangeably. However, the term "Boolean algebra" emphasizes the universal algebra viewpoint, as discussed in 13.7. Boolean rings are introduced in 13.13; although Boolean rings and Boolean lattices are not the same, there is a natural correspondence between them, and for that reason the terms "Boolean rings" and "Boolean lattices" are occasionally used interchangeably.
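The definitions of complement, distributivity, and symmetric difference can be checked by brute force on a small lattice. The following sketch uses an example that is our own choice rather than one taken from the text: the divisors of a positive integer $n$, ordered by divisibility, with meet = gcd and join = lcm, so that the lattice's 0 is the integer 1 and its 1 is the integer $n$. All function names are hypothetical. For squarefree $n$ (such as 30) every element has the complement $n/x$; for $n = 12$ the lattice is still distributive but not complemented.

```python
from math import gcd

def divisor_lattice(n):
    """Divisors of n, ordered by divisibility (lattice 0 is 1, lattice 1 is n)."""
    return [d for d in range(1, n + 1) if n % d == 0]

def lcm(a, b):
    return a * b // gcd(a, b)

def complements(x, n):
    """All y in the lattice with x meet y = 0 (gcd 1) and x join y = 1 (lcm n)."""
    return [y for y in divisor_lattice(n) if gcd(x, y) == 1 and lcm(x, y) == n]

def is_distributive(n):
    X = divisor_lattice(n)
    return all(
        gcd(x, lcm(y, z)) == lcm(gcd(x, y), gcd(x, z))
        and lcm(x, gcd(y, z)) == gcd(lcm(x, y), lcm(x, z))
        for x in X for y in X for z in X
    )

def symdiff(x, y, n):
    """Symmetric difference (x meet Cy) join (Cx meet y), valid when n is squarefree."""
    cx, cy = n // x, n // y
    return lcm(gcd(x, cy), gcd(cx, y))

print(is_distributive(30), [complements(x, 30) for x in divisor_lattice(30)])
print(is_distributive(12), complements(2, 12))   # distributive, but 2 has no complement
print(symdiff(6, 10, 30))                        # primes {2,3} vs {2,5} differ in {3,5}
```

In both lattices each element has at most one complement, as the exercise above predicts for distributive lattices; only the squarefree case is a Boolean lattice.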
13.2. Exercise (optional). Let $X$ be a distributive lattice with smallest and largest elements 0 and 1. Then the set $S$ of all complemented elements of $X$ is a sublattice of $X$. In fact, it is a Boolean lattice if we restrict the lattice operations of $X$ to $S$.

13.3. The basic example: algebras of sets. If $\mathcal{S}$ is an algebra of subsets of a set $\Omega$ (as defined in 5.25), then $\mathcal{S}$ is a Boolean lattice, when ordered by $\subseteq$. In this case we have a conversion of symbols as described in the table below.

    Algebra of subsets of Ω:   ∅   Ω   ⊆   ∪   ∩   ∁   △
    Boolean lattice:           0   1   ≼   ∨   ∧   ∁   △

We emphasize that in Boolean lattices, $\complement$ may have other meanings; in the algebra of sets, the complement of a set $S$ is the set $\complement S = \{x \in \Omega : x \notin S\}$.

Boolean lattices are not really much more general than algebras of sets. In fact, the Stone Representation Theorem, proved later in this chapter, states that every Boolean algebra is isomorphic to some algebra of sets. However, that isomorphism is sometimes an inconvenient representation. Of course, the proof of the Stone Representation Theorem involves arbitrary choices and intangibles. Aside from the conceptual difficulty of intangibles, the chief difference between Boolean lattices and algebras of sets is one of viewpoint: When considering algebras of sets, we are permitted to consider the points that make up those sets. In contrast, the members of a Boolean lattice are considered as urelements, not necessarily containing any "points." (Compare the remarks in 5.21 about "pointless" open sets.)

13.4. Further examples of Boolean lattices.

a. If $0 = 1$ in a Boolean lattice $X$, then $X$ contains just the single element 0. We shall call $\{0\}$ the degenerate Boolean lattice; it is the smallest Boolean lattice. It is isomorphic to $\mathcal{P}(\varnothing) = \{\varnothing\}$. Any Boolean lattice with $0 \neq 1$ will be called nondegenerate. If $X$ contains more than one element, then no element of $X$ can be its own complement.

The older literature imposed the restriction that $0 \neq 1$ in any Boolean lattice (and that additional restriction is still imposed by some mathematicians today). That restriction only excludes one Boolean lattice, and so that restriction has little effect on the ultimate results of the theory if one is careful to keep track of the degenerate case. However, that restriction complicates the notation and the development of the theory, because with that restriction Boolean algebras and Boolean rings (discussed later in this chapter) do not form equational varieties. To call $\{0\}$ a Boolean lattice reflects a recent trend among algebraists. We emphasize that in this book, $\{0\}$ is a Boolean lattice and so Boolean algebras and Boolean rings do form equational varieties.

b. The next smallest Boolean lattice is the set $2 = \{0, 1\}$ consisting of two elements.
and let IF be the set of all formulas that can be formed in that language.16. Hints" Any upper bound for S is a set B that contains all the even integers. Clearly. a topological space may have other regular open sets as well.18 is complemented but not distributive. let X be the algebra of all finite or cofinite subsets of Z. This example. V. define closure and interior as in 5.int(cl(S)). The regular open sets may be described as those open sets that have no "cracks" or "pinholes." The collection R O ( f ~ ) {regular open subsets of ft}. d. c. such as ~P(fft). hO Let f~ be a topological space. See 14.33." and "not" respectively. though quite elementary. Let S be a topological space. with tET tET tET Hint: To prove (e. C) is a complete Boolean lattice.A U C to show that AV(BAC) (AVB) A(AVC) for any A. This turns out to be an equivalence relation on F. Assume that the language is equipped with some suitable collection of axioms and rules of inference. (This is a special case of 5. f. The lattice M3 given in 4. forms a complete Boolean lattice. It will play an important role in 17. a n d l e t C = { S C X 9S JJS} that is. i6I A SiiCI int (i('~ISi) ' ' c CS~\cl(S) .44. e. Show that (C.32. C E (~. it will be used in 13. A set S c f~ is r e g u l a r o p e n if S .{clopen subsets of S} is an algebra of sets. thus it is n o t a Boolean lattice. go Let L be some language. any clopen set is regular open. are complete Boolean lattices. Q c_ X we have c l ( g n Q ) = cl(g) ncl(Q). Recall that a subset of S is c l o p e n if it is both open and closed.328 Chapter13: Boolean Algebras ordered by 0 ~ 1. If r is an odd integer belonging to B. observe that for any sets P. hence it is cofinite. hence it is not finite.) Call two formulas A and ~B "equivalent" provided each implies the other via the given axioms and rules of inference. Some algebras of sets. The resulting quotient algebra is a Boolean lattice. and C corresponding to the logical notions "and. 
hence it contains all but finitely many odd integers.12 . Others are not complete. (A precise specification of such "suitable" axioms and rules will be given in 14.) Show that S = {finite subsets of the set of even integers} is a subset of X that does not have a least upper bound in X. and thus a Boolean lattice. C_) is distributive. with Boolean lattice operations given by V siiEI int ( e l ( U S / ) ) . as noted in 4. The set 2 is isomorphic to ~P(S) if S is any singleton. Let X and l be as in 4. which reflect ordinary methods of reasoning. The collection clop(S) . is extremely important. ordered by c . B.A U B and Q . Apply this with P . with the binary operations A." "or.f. let e b e t h e collection of closed subsets of X.12.19.26. For instance. then B \ {r} is a slightly smaller upper bound for S.
We may view Boolean lattices as an equational variety.7. Whenever B = (X. we obtain a new Boolean lattice if we keep the same set X and the same complementation operation. S~ E RO(f~). 0. A Boolean lattice.) b. We emphasize that. The f u n d a m e n t a l o p e r a t i o n s are V.4.i. for brevity we may state just one of them. If X is a Boolean lattice. V. a. and b ~ Cb ~ two elements is also a chain..Boolean Homomorphisms and Subalgebras 329 for S. 13. the mapping x ~ Cx is an isomorphism. in this book. Thus.4.e. xAy = 0 ~ xACy=CxAy=O. BOOLEAN HOMOMORPHISMS AND SUBALGEBRAS 13. in the sense of 8. ~. Definitions.a. L1L3 in 4. ~. the singleton {0} is a Boolean algebra (albeit a degenerate one). 1. For instance. C(x v y) = (Cx) A (Cy) and C(x A y) = (Cx) v (Cy). but it arises naturally in certain applications: It turns out to be the smallest complete Boolean lattice that fits certain constructions.5. D e M o r g a n ' s e. see the remarks in 13. is usually called a B o o l e a n a l g e b r a .20). Another Boolean lattice that is of particular interest to analysts is given in 21. It plays an important role in the theory of forcing. 13. a mapping that preserves the fundamental operations. C is an involution of X.9.i. CCx = x. then B ~ = (X.50. (In fact. the Boolean lattice operations of R O ( ~ ) generally are not just the restrictions of the Boolean lattice operations of ~P(ft). x ~ y c. 1.5. A) is another Boolean lattice .d are dual to each other. A. from B onto B~ Any statement about Boolean lattices has a dual statement that follows as a consequence by this swapping. in the sense of 13. show that ~ Cx~Cy. C. (Thus. C. When two statements are dual to each other in this fashion.6. We may occasionally revert to the term "Boolean lattice" to emphasize the ordering structure. 0. together with these axioms: xAO = xACx = O. and swap meets and joins. viewed as an algebraic system in this fashion. A few computations.53 and Bell [1985]. V) is a Boolean lattice. 
xVl = xvCx = 1. in the sense of 2. Although RO(ft) is a subcollection of T(ft) and its ordering is the restriction of the ordering of [P(ft). The duality principle.. The Boolean lattice RO(ft) may appear rather complicated.e. b = 1. A. 0. A Boolean lattice satisfies the axioms of a lattice (that is. a Boolean homomorphism means a mapping .7.. A B o o l e a n h o m o m o r p h i s m is a homomorphism in this v a r i e t y . C. b ~ Cb K>. but swap 0 and 1. 1. the two De Morgan's Laws in 13. See 14. x = y ~ d. Hence no Boolean lattice with more than L a w s . b = 0.
A Boolean homomorphism is thus a mapping $f : X \to Y$ from one Boolean algebra into another that satisfies

$f(x_1 \vee x_2) = f(x_1) \vee f(x_2), \quad f(x_1 \wedge x_2) = f(x_1) \wedge f(x_2), \quad f(\complement x) = \complement f(x), \quad f(0) = 0, \quad f(1) = 1$

for all $x, x_1, x_2 \in X$. We may call this a Boolean algebra homomorphism for emphasis or clarification.

Exercise. It suffices to show that $f$ preserves $\vee$ and $\complement$; the other conditions then follow as consequences. Hint: $0 = 0 \wedge \complement 0$.

Exercise. If $Y$ is a nondegenerate Boolean algebra, then there does not exist any homomorphism from the degenerate Boolean algebra $\{0\}$ into $Y$.

13.8. Definition and a concrete example. A two-valued homomorphism on a Boolean algebra $X$ is a Boolean homomorphism from $X$ into the Boolean algebra $2 = \{0, 1\}$. Thus, there is no two-valued homomorphism on the degenerate Boolean algebra $\{0\}$.

If $X$ is an algebra of subsets of some set $\Omega$, and $\omega_0 \in \Omega$, then one two-valued homomorphism on $X$ is the probability concentrated at $\omega_0$:

$\mu(S) = \begin{cases} 1 & \text{if } \omega_0 \in S \\ 0 & \text{if } \omega_0 \notin S \end{cases} \qquad \text{for } S \in X.$

13.9. More definitions. If $X$ is a Boolean algebra, then a Boolean subalgebra of $X$ is a subobject of $X$ in the variety of Boolean algebras, i.e., it is a set $S \subseteq X$ that is closed under the fundamental operations. Thus, a Boolean subalgebra of $X$ is a nonempty set $S \subseteq X$ that satisfies

$x_1, x_2 \in S \ \Longrightarrow\ x_1 \vee x_2,\ x_1 \wedge x_2,\ \complement x_1,\ 0,\ 1 \in S.$

Note that $S$ itself is then a Boolean algebra, when equipped with the restrictions of the operations of $X$.

We can apply to Boolean subalgebras all the conclusions of Chapter 4 about Moore closed sets and all the conclusions of Chapter 9 about subalgebras in an equational variety. In particular, $X$ is a Boolean subalgebra of itself; the intersection of any collection of Boolean subalgebras is a Boolean subalgebra; any homomorphic image of a Boolean subalgebra is a Boolean subalgebra; etc.

The Boolean subalgebra $S$ generated by a set $G \subseteq X$ is the smallest Boolean subalgebra that includes $G$; it is equal to the intersection of all the Boolean subalgebras that include $G$. The set $G$ is then called a generating set, or a set of generators, for the Boolean subalgebra $S$. In the special case where $X = \mathcal{P}(\Omega)$ for some set $\Omega$ and $G$ is a collection of subsets of $\Omega$, we find that $S$ is the algebra of sets generated by $G$ (see 5.26.e).

13.10. Normal Form Theorem. Let $X$ be a Boolean algebra, and let $G \subseteq X$. Then the Boolean subalgebra $S$ generated by $G$ can be described more concretely in three stages, as follows: Let

$G^{\complement} = \{x \in X : x \in G \text{ or } \complement x \in G\},$
$G^{\complement\wedge} = \{x \in X : x \text{ is the inf of finitely many members of } G^{\complement}\},$
$G^{\complement\wedge\vee} = \{x \in X : x \text{ is the sup of finitely many members of } G^{\complement\wedge}\}.$
(We have $1 \in G^{\complement\wedge}$ and $0 \in G^{\complement\wedge\vee}$ since the inf of no members of $X$ is 1 and the sup of no members of $X$ is 0.) Then $G^{\complement\wedge\vee} = S$. In other words, $S$ consists of the elements of $X$ that can be written in the form

$\bigvee_{i=1}^{I} \bigwedge_{j=1}^{J_i} \complement^{n(i,j)} g_{i,j} \qquad (\ast)$

where $I, J_1, J_2, \ldots, J_I$ are nonnegative integers, the $n(i,j)$'s are nonnegative integers (or more simply, 0s and 1s), and the $g_{i,j}$'s are elements of $G$. An expression such as the right side of $(\ast)$ is said to be in normal form. In particular, the subalgebra generated by a finite set is also finite.

An important special case is that in which $X = \mathcal{P}(\Omega)$ for some set $\Omega$. Then $G$, $G^{\complement}$, $G^{\complement\wedge}$, $G^{\complement\wedge\vee}$ are collections of subsets of $\Omega$, and "inf" and "sup" mean "intersection" and "union," respectively.

Hints: Obviously $G^{\complement\wedge\vee}$ is closed under finite sups. Use the Distributive Law and De Morgan's Laws to show that $G^{\complement\wedge\vee}$ is also closed under finite infs and under complementation.

Remarks. The theorem above shows that the algebra of sets generated by a given collection of sets $\mathcal{G}$ can be obtained by a three-stage construction. An analogous three-stage construction does not work for $\sigma$-algebras: If we start with a collection $\mathcal{G}$ of subsets of $\Omega$, and then close it under complementation, then under countable intersection, then under countable union, the resulting collection is not necessarily equal to the $\sigma$-algebra generated by $\mathcal{G}$. Indeed, the resulting collection is contained in the $\sigma$-algebra generated by $\mathcal{G}$, but it is not necessarily closed under complementation or countable intersection. An example is given by $\Omega = 2^{\mathbb{N}}$, with $\mathcal{G}$ equal to the collection of all sets of the form $P \times \prod_{j=m+1}^{\infty} \{0,1\}$, where $m$ is any positive integer and $P$ is any subset of $\prod_{j=1}^{m} \{0,1\}$. We omit the lengthy computation that shows that the resulting collection is not closed under complementation or countable intersection.

It is easy to see where the analogy between algebras and $\sigma$-algebras breaks down: the product of finitely many finite sets is finite, but a product $\prod_{\lambda \in \Lambda} A_\lambda$ (as in 1.38) of countably many countable sets is not necessarily countable; see 2.20.k and 2.20.l.

13.11. Sikorski's extension criterion. Let $G$ be a subset of a Boolean algebra $X$, which generates a Boolean subalgebra $S \subseteq X$. Let $Y$ be another Boolean algebra, and let $f : G \to Y$ be some mapping. Then $f$ can be extended to a Boolean homomorphism $F : S \to Y$ if and only if $f$ satisfies the following condition:

$\complement^{n_1} g_1 \wedge \complement^{n_2} g_2 \wedge \cdots \wedge \complement^{n_k} g_k = 0 \quad \text{implies} \quad \complement^{n_1} f(g_1) \wedge \complement^{n_2} f(g_2) \wedge \cdots \wedge \complement^{n_k} f(g_k) = 0$

for every nonnegative integer $k$ and every choice of $g_1, g_2, \ldots, g_k \in G$ and $n_1, n_2, \ldots, n_k \in \{0, 1\}$. Moreover, if this condition is satisfied, then the extension $F : S \to Y$ is uniquely determined.

Remark. This theorem is similar in nature to 11.10, though a bit more complicated.
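The three-stage construction of the Normal Form Theorem can be tried out in the special case of an algebra of sets. The sketch below is only an illustration under stated assumptions: the ambient algebra is the power set of a small finite set, and the names `close_under`, `generated_algebra`, `is_algebra`, `OMEGA`, and `G` are hypothetical, not from the text. It adjoins complements, then closes under finite intersections, then under finite unions, and checks that the result is already an algebra of sets.

```python
def close_under(op, sets, identity):
    """Close a family of frozensets under a binary operation.

    The value of the empty combination (`identity`) is included, matching
    the convention that the inf of no members is 1 and the sup of none is 0.
    """
    family = set(sets) | {identity}
    changed = True
    while changed:
        changed = False
        for a in list(family):
            for b in list(family):
                c = op(a, b)
                if c not in family:
                    family.add(c)
                    changed = True
    return family

def generated_algebra(omega, generators):
    """Stage 1: adjoin complements; stage 2: finite infs; stage 3: finite sups."""
    omega = frozenset(omega)
    g = {frozenset(s) for s in generators}
    g_c = g | {omega - s for s in g}
    g_c_inf = close_under(frozenset.intersection, g_c, omega)
    g_c_inf_sup = close_under(frozenset.union, g_c_inf, frozenset())
    return g_c_inf_sup

def is_algebra(omega, family):
    """Check closure under complement, union, and intersection."""
    omega = frozenset(omega)
    return (frozenset() in family and omega in family and
            all(omega - a in family for a in family) and
            all(a | b in family and a & b in family
                for a in family for b in family))

OMEGA = range(6)                 # hypothetical example set
G = [{0, 1, 2}, {2, 3}]          # hypothetical generators
S = generated_algebra(OMEGA, G)
```

With these two generators the atoms of the generated algebra are {0, 1}, {2}, {3}, and {4, 5}, so `S` has 16 members; no fourth closure stage is needed, in line with the theorem.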
. Hence.O. (2) where the join V(j. qPl A ( C ~ 2 ) . We must show that ~1 and ~2 are equal to each other. . . . we find t h a t ( ~ 9 1 ) A ~2 . j .=1 j = l whenever s V A Cn(i'J) 9 i .j's in G as above. j 2 . After we establish that. Similarly.10. we see that K Lk I 0  sACs  (ji) k = l V V A A /=1 i=1 The right side of this equation is an expression of the type in the hypothesis of Sikorski's criterion. K (ji) k = l Lk I 0.t k1 l = l for some gi. first observe by De Morgan's Law and the Distributive Law I that Cs  V (ji) i=1 9. from the two representations for s. The uniqueness of F is clear: If an extending homomorphism F " S + Y exists. Some s c S may be representable in normal form in terms of gid's in more than one way.l).j = 8 . and so we must verify that the resulting value of F(s) does not depend on the particular representation of s./.e. . together with De Morgan's Law a n d the Distributive Law. . Ji} i.j. . Now. suppose that I Ji K V i=l A j=l Cn(i'J)gi.t 'S in G. and let I ~1 Ji Cn(i'J) f ( g i . jI) that satisfy 1 _< ji <_ Ji for each i. Therefore ~1 . It is not immediately clear that equation (1) determines a function.V Lk A Cm(k'z) hk. V V /=1 i=1 AA Unwinding our computations. . we shall show that the function F defined by (1) is indeed a homomorphism. To show that (1) actually does define a function.j's and hk. hence there is at most one homomorphism F : S ~ Y that extends f. by assumption. by 13. To show this. then it must satisfy I g(8) Ji V A Cn(i'J) f (gi.. j ) and (~2  V A i=1 j=l V A Cm(k'l) k=l /=1 K Lk f(hk..332 Chapter 13: Boolean Algebras Proof of theorem. all sequences (jl.) is over all sequences (ji) e I I i 2 1 { 1 .j) . however.~2. i=1 j=l I Ji (1) Every s c S can be expressed as a combination of gi.0. .. 2 .
AXm_l AXrn) A (Cyl A Cy2"'" A C Y n . Let A .) It suffices to show that (Xl AX2 A.. r'l 9:x. let ~x = { A c_ X 9 x E A} {A E A 9 x E A} c_ X..Y~}...Yn}}. (This is just the ultrafilter on X. (a) (We permit m or n to be 0. For each x E X. The second equation follows from our representation of Cs in (2).. hence it extends uniquely to a Boolean algebra homomorphism from . Indeed.. Every Boolean algebra is the homomorphic image of some algebra of sets. r'l C~y.12. Suppose that s and F ( s ) are represented as in (1)._1 r1C~xm) r'l (Cgry. 13. and just view X as a set. Now recall the Boolean algebra structure of X..) Thus O = {9"x : x E X} is a collection of subsets of A. if X is any Boolean algebra..O.11).....~ onto X..Boolean H o m o m o r p h i s m s and Subalgebras 333 Thus F is welldefined. such that (~x.~_. with the understanding that the intersection of no subsets of A is just A. .. as in 5... Pro@ We may specify such a surjective homomorphism as follows: Temporarily forget the Boolean algebra structure of X. T h a t F also preserves V is obvious from our definition of F. r'l C~yn) 0. TarskiScottLuxemburg E p i m o r p h i s m T h e o r e m . (b) S c X 9 {Xl. We shall show that the mapping ~ H x satisfies Sikorski's criterion (13.. ci [~c~y 2 r'l .. Thus F preserves C...Y~ be any elements of X.. Let Xl.Xrn} is not a subset of X \ {Yl.Xrn } C S C X \ {Yl.. It remains to show that F is a Boolean homomorphism.5.J~) (ji) i=1  F(cs).7 that F is a homomorphism... by an argument analogous to our proof of (2).~P(X).1 The set on the left side of (a) can be rewritten as A Cyn) ..~z2 r'l . r'l..X2.. fixed at x. using the definition of F.X m and Yl.Xm_I. .Y2.Y~I. let . the first equation follows from our representation of F ( s ) in (1).~ be the algebra of subsets of A generated by O.c.Y2. Now it follows from an exercise in 13. then there exists some g) that is an algebra of sets and some surjective Boolean homomorphism f :fo+ X .. We claim that I CF(s)  V A ccn(~'J~)f(gi. 
T h a t is. The fact that this set is empty implies that {Xl~X2~.
and therefore $x_i = y_j$ for some $i$ and $j$. But then $x_i \wedge \complement y_j = 0$, implying (b).

Remarks and alternate proof. This theorem was announced by Tarski [1954]. It was subsequently used by Luxemburg [1969], who credited it to Scott. This theorem could be taken as a corollary of the Stone Representation Theorem (UF6) in 13.22. However, the Stone Representation Theorem is an equivalent of the Ultrafilter Principle, i.e., it is a weak form of the Axiom of Choice. In contrast, the present theorem does not require any arbitrary choices. We shall use that fact in our proof of (HB12) $\Rightarrow$ (HB13), in 23.19.

Readers with a greater background in algebra may prefer the following proof. Let $X_b$ denote the given Boolean algebra $X$, and let $X_s$ denote the underlying set of $X$, i.e., without its Boolean structure. Let $A$ be the free Boolean algebra that has the set $X_s$ for its set of generators. The construction of free algebras is a well-known construction and can be found in many algebra books. By an argument similar to the proof of 13.11, it can be shown (without using the Axiom of Choice or any of its weaker offspring) that the free Boolean algebra with generators $X_s$ is isomorphic to a subalgebra of $\mathcal{P}(\mathcal{P}(X_s))$; thus $A$ is isomorphic to an algebra of sets. By the definition of a free Boolean algebra (not covered in this book), the identity map $i : X_s \to X_b$ extends uniquely to a homomorphism $I : A \to X_b$, which is therefore surjective.

BOOLEAN RINGS

13.13. Definitions and remarks. A Boolean ring is a ring $X$ with unit, in which every element is idempotent, i.e., in which $x^2 = x$ for every $x \in X$. The fundamental operations are the ring operations. The collection of all Boolean rings is an equational variety, as defined in 8.50, and so our results on algebraic systems are applicable. Boolean rings are a full subcategory of the category of rings with unit.

Some of the older literature imposes the further restriction that $0 \neq 1$ as part of the definition of a Boolean ring. We emphasize that, by our definition, $\{0\}$ is a Boolean ring, and so Boolean rings do form an equational variety. In this book, the degenerate ring $\{0\}$ is a Boolean algebra; see the remarks in 13.4.a.

Exercise. Let $X$ be a Boolean ring. Show that $x = -x$ and $xy = yx$ for all $x, y \in X$. Hints: $(x + x)^2 = (x + x)$ and $(x + y)^2 = (x + y)$.

13.14. Any Boolean ring $(X, +, \cdot, 0, 1)$ can be made into a Boolean lattice $(X, \vee, \wedge, \complement, 0, 1)$ with the same underlying set, by the definitions

$p \wedge q = pq, \qquad p \vee q = p + q + pq, \qquad \complement p = 1 + p.$

The constants 0 and 1 are left unchanged. Conversely, we can make any Boolean lattice into a Boolean ring by defining

$pq = p \wedge q, \qquad p + q = p \,\triangle\, q.$
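The two conversions between ring operations and lattice operations can be verified mechanically on a small example. The following sketch assumes the Boolean ring is the power set of a three-element set with symmetric difference as sum and intersection as product; all function names are illustrative, not from the text.

```python
from itertools import combinations

OMEGA = frozenset({0, 1, 2})     # hypothetical example set
X = [frozenset(c) for r in range(len(OMEGA) + 1)
     for c in combinations(sorted(OMEGA), r)]
ZERO, ONE = frozenset(), OMEGA

# Ring structure on P(OMEGA): sum = symmetric difference, product = intersection.
def add(p, q):
    return p ^ q

def mul(p, q):
    return p & q

# Lattice operations defined from the ring operations.
def meet(p, q):
    return mul(p, q)

def join(p, q):
    return add(add(p, q), mul(p, q))

def comp(p):
    return add(ONE, p)

# Ring operations recovered from the lattice operations.
def mul_back(p, q):
    return meet(p, q)

def add_back(p, q):
    return join(meet(p, comp(q)), meet(comp(p), q))

idempotent = all(mul(x, x) == x for x in X)
self_inverse = all(add(x, x) == ZERO for x in X)       # x = -x
lattice_ok = (all(join(p, q) == p | q and meet(p, q) == p & q
                  for p in X for q in X)
              and all(comp(p) == OMEGA - p for p in X))
round_trip = all(add_back(p, q) == add(p, q) and mul_back(p, q) == mul(p, q)
                 for p in X for q in X)
```

Here `round_trip` confirms, for this one example, that passing from the ring to the lattice and back recovers the original sum and product.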
.x2. . 0... x2. Let p(xl. then a B o o l e a n s u b r i n g of X is a nonempty subset S c_ X that is closed under the fundamental operations of rings with unit i. (The relevant verifications are left as a tedious but straightforward exercise. but any Boolean lattice has both. . . V.e. X 2 .) In the mathematical literature. .Xn) . X 2 . expressed by formulas using only those variables and the fundamental operations 0..15.. Xl ~X2. However. 1. +.xn) and q(xl.50) that is.2 that an ordered group containing more than one element cannot have a lowest or highest element. . Xn) . they yield a bijection between Boolean rings and Boolean lattices." and "Boolean subalgebra" interchangeably..x2.Boolean Rings 335 These two transformations are inverses to each other.x.x~.X2.... It is then a Boolean ring in its own right. 1.Xn) be terms for the variety of Boolean algebras (as in 8. Show that S is a Boolean subring (i. . . assume p and q are functions of the variables Xl..e.{0. . . It should be noted that if X contains more than one element.. In other words. . Some features associated with Boolean rings are equivalent to features associated with Boolean lattices: a.. X 2 . then the ring and lattice structures are not compatible with each other in the sense of ordered groups (discussed in 8. Proof. ." "Boolean ring. . 13. the phrases "Boolean lattice." "Boolean subring. .. since this result is not needed later in this book. that satisfies Xl. Hereafter. Proposition (optional). closed under the Boolean lattice operations). ..Xn) is satisfied by every choice of Xl. and let S c_ X..X2.16. xn) is satisfied by every choice of Xl.q ( X l . . Xl.. A.~ in the Boolean algebra 2 .. 1 E S. x~ in every Boolean algebra X if and only if the equation p ( X l . .. . Let X be a Boolean algebra. closed under the ringwithunit operations) if and only if S is a Boolean sublattice (i.e. when equipped with the restrictions of the operations of X. x 2 . Remarks. We omit the proof. 13.. .q(xl. 
the identities that are true for all Boolean algebras are the same as the identities that are true for 2. . some caution must be exercised. A proof is given by Johnstone [1987] and other books. If X is a Boolean ring. [~ or 0. ." and "Boolean algebra" are sometimes used interchangeably.. 1}. The same set X may be viewed as a Boolean lattice or a Boolean ring. Then the equation p ( X l .30).. We noted in 10. . we shall use the terms "Boolean sublattice.X2 E S ==~ XlX2.
Show that S is an ideal in X if and only if S is nonempty and (i) s.17. Equivalently. We say S and T are dual to each other. Dual to the notion of "ideal" is that of "filter. a. f ( x l ) + f(x2).X ~ Y. we say I is a p r i m e i d e a l and F is a B o o l e a n u l t r a f i l t e r .1(1).7) if and only if f is a homomorphism of rings with unit (as in the preceding paragraph). ~ xEX sAtET. (We obtain the same ideals either way. f(xl)f(x2).25. and ~ xVtET.{Cs" s E S}. => s v t E S .) .26. x2 E X . and let F = { C x : x E I}.) Clearly the set X itself is an ideal. More definitions.18. d. with ordering given by c_.e. 13. that satisfies f(xlx2) f(xl) for all x l.. f(xl).1). see 9. By an i d e a l in X we shall mean a set S c X that is an ideal in the sense of 9. Let T . and let T C_ X. that preserves the fundamental operations of rings with units i. thus F is a proper filter. then T is a filter if and only if S is a ideal. 13. Let X be a Boolean algebra. then an ideal or filter in X as defined above is the same thing as an ideal of sets or a filter of sets (as defined in 5. Show that the following conditions are equivalent. b.l ( 1 ) because we may take f " X + {0}. Let I be a proper ideal in a Boolean algebra X. recall that 0 . If X . F is a maximal filter. and let S c_ X. Recall that a ringwithunit homomorphism is a mapping f . t E T (ii) t E T . a proper ideal that is not included in any other proper ideal.1 in the degenerate Boolean algebra {0}. Clearly the set X itself is a filter. any other ideal is called a p r o p e r ideal.e.iP(f~) for some set f~. where f is any Boolean homomorphism. from one ring with unit to another. Exercise." Let X be a Boolean algebra.2 and 5. xEX and => x A s E S .. Show that f is a homomorphism of Boolean lattices (as defined in 13.g. When any of them is satisfied. f(1) 1 Let f : X + Y be a mapping from one Boolean algebra into another. 
either type of homomorphism may be described more briefly as a B o o l e a n h o m o m o r p h i s m . t E S (ii) s E S . (A) I is a maximal ideal i. An equivalent definition can be given in terms of the lattice structure. We shall say that T is a filter if T is nonempty and (i) s.336 Chapter 13: Boolean Algebras b. Hereafter. any other filter is called a p r o p e r filter. in the category of rings with unit or in the category of Boolean rings.) c. a filter is a set of the form f . (Note that the improper filter is of the form f . (Equivalently. f ( x l + x2) f(0) 0. Let X be a Boolean ring.
(B) For each $x \in X$, exactly one of $x$, $\complement x$ is an element of $I$ (and hence the other is a member of $F$). That is, the filter $F = \{x \in X : \complement x \in I\}$ is also equal to $\{x \in X : x \notin I\}$.

(C) Whenever $x \wedge y \in I$, then at least one of $x$, $y$ is an element of $I$. Equivalently: if $x \vee y \in F$, then at least one of $x$, $y$ belongs to $F$.

(D) The quotient Boolean algebra $X/I$ is isomorphic to $\{0, 1\}$.

(E) The characteristic function of $F$ is a two-valued homomorphism on $X$ (defined as in 13.8).

If $X = \mathcal{P}(\Omega)$ for some set $\Omega$, then another equivalent condition is

(F) $F$ is an ultrafilter of sets on $\Omega$ (in the sense of 5.8).

Hint for (A) $\Rightarrow$ (B): Show that if $I$ is an ideal in $X$ containing neither $z$ nor $\complement z$, then $\{x \vee y : x \preccurlyeq z \text{ and } y \in I\}$ is a strictly larger ideal that contains $z$.

13.19. Remarks. The dual of a Boolean algebra $A$ is defined to be the set

$A^* = \{\text{two-valued homomorphisms on } A\} = \{\text{characteristic functions of Boolean ultrafilters in } A\},$

which is a subset of $2^A = \{\text{mappings from } A \text{ into } \{0, 1\}\}$. In 17.44 we shall introduce a natural topology on $A^*$. Since we may identify sets with their characteristic functions, we could say that a Boolean algebra $A$ has dual $A^* = \{\text{Boolean ultrafilters in } A\} \subseteq \mathcal{P}(A)$. However, we prefer to view $A^*$ as a subset of $2^A$ because this makes the topology on $A^*$ (introduced in 17.44) more obvious and also makes more obvious the analogy between Boolean duality and the other kinds of dualities described in 9.55.

Caution: Some mathematicians instead define $A^*$ to be the set of all prime ideals in $A$. Switching to this definition would require no changes of substance; we would simply have to replace each argument with a dual argument. Of course, switching back and forth between the two conventions is a tedious process, so we find it more convenient to stick with just one of the conventions. In this book we shall always take $A^*$ to be the set of (characteristic functions of) Boolean ultrafilters. We prefer to use ultrafilters rather than ideals here, because this makes our Boolean duality more like the other kinds of duality discussed in 9.55.

From the examples in 13.8 we see that $A^*$ is empty when $A$ is the degenerate algebra $\{0\}$, but $A^*$ is nonempty when $A = \mathcal{P}(\Omega)$ for some nonempty set $\Omega$. In (UF8) we shall prove that $A^*$ is nonempty whenever $A$ is any nondegenerate Boolean algebra.

13.20. Proposition. Any finite, nondegenerate Boolean algebra $X$ has a nonempty dual. In fact, if $x_0 \in X \setminus \{0\}$, then there exists at least one $f \in X^*$ with $f(x_0) = 1$.

Remark. This proposition does not require the Axiom of Choice or any of its weakened forms. This proposition will be used, together with a weak form of Choice, to prove (UF8) in 13.22, which removes the restriction to finite $X$'s.
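For a small finite algebra the dual can be computed by exhaustive search, which gives a concrete picture of the proposition. The sketch below assumes the algebra is the power set of a three-element set and simply tests every map into {0, 1}; the names `is_two_valued_hom`, `dual`, `points`, and `separates` are illustrative, not from the text.

```python
from itertools import combinations, product

OMEGA = frozenset({0, 1, 2})     # hypothetical example set
X = [frozenset(c) for r in range(len(OMEGA) + 1)
     for c in combinations(sorted(OMEGA), r)]

def is_two_valued_hom(f):
    """f is a dict from X into {0, 1}; check all five fundamental operations."""
    return (f[frozenset()] == 0 and f[OMEGA] == 1 and
            all(f[OMEGA - x] == 1 - f[x] for x in X) and
            all(f[x | y] == max(f[x], f[y]) and f[x & y] == min(f[x], f[y])
                for x in X for y in X))

# Brute force over all 2**8 maps from X into {0, 1}.
dual = []
for values in product((0, 1), repeat=len(X)):
    f = dict(zip(X, values))
    if is_two_valued_hom(f):
        dual.append(f)

# Each homomorphism found is a probability concentrated at one point.
points = sorted(next(w for w in OMEGA if f[frozenset({w})] == 1) for f in dual)
concentrated = all(
    all(f[x] == (1 if w in x else 0) for x in X)
    for f in dual
    for w in [next(w for w in OMEGA if f[frozenset({w})] == 1)]
)

# For every nonzero x0 there is some f in the dual with f(x0) = 1.
separates = all(any(f[x0] == 1 for f in dual) for x0 in X if x0)
```

In this example the dual has exactly three members, one for each point of the underlying set.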
Proof of proposition. Since $S_0 = \{u \in X : 0 \prec u \preccurlyeq x_0\}$ is a nonempty, finite poset, it has a minimal element. Let $u_0$ be any minimal element of $S_0$. (Different choices of $u_0$ may yield different functions $f$, but in the present argument we are only concerned with proving the existence of at least one such $f$; we do not need a canonical, particular $f$.) Observe that $y \wedge u_0$ is either $u_0$ or 0, for each $y \in X$. Use that fact to show that the set $T = \{y \in X : y \succcurlyeq u_0\}$ is a Boolean ultrafilter in $X$. Hence its characteristic function

$f(y) = \begin{cases} 1 & \text{if } y \in T \\ 0 & \text{if } y \in X \setminus T \end{cases}$

is a member of $X^*$ with $f(x_0) = 1$.

13.21. Lemma on Stone's Epimorphism. Let $X$ be any Boolean algebra. Assume that $X^*$ is nonempty. Then there exists a Boolean homomorphism from $X$ onto an algebra $\mathcal{G}$ of subsets of $X^*$. In fact, one such homomorphism may be defined as follows: For each $x \in X$, let $S(x) = \{f \in X^* : f(x) = 1\}$. Let $\mathcal{G}$ be the range of this mapping.

Proof. Verify that

$S(x \vee y) = S(x) \cup S(y), \quad S(x \wedge y) = S(x) \cap S(y), \quad S(\complement x) = X^* \setminus S(x), \quad S(0) = \varnothing, \quad S(1) = X^*.$

These equations show that $\mathcal{G}$ is an algebra of subsets of $X^*$ and that the mapping $x \mapsto S(x)$ is a homomorphism from $X$ onto $\mathcal{G}$.

Remark. These observations do not require the Axiom of Choice or any of its weakened forms. However, we can draw further conclusions about Stone's epimorphism if we assume some weakened form of the Axiom of Choice; see (UF6) in 13.22.

BOOLEAN EQUIVALENTS OF UF

13.22. We shall show that the several principles listed below are equivalent to the Ultrafilter Principle. In Chapter 6 we proved (UF1) $\Leftrightarrow$ (UF2) (and in Chapters 7 and 9 we proved that (UF1) $\Leftrightarrow$ (UF3) $\Leftrightarrow$ (UF4)); now we shall complete the cycle by proving that

(UF2) $\Rightarrow$ (UF5) $\Rightarrow$ (UF6) $\Rightarrow$ (UF7) $\Rightarrow$ (UF8) $\Rightarrow$ (UF9) $\Rightarrow$ (UF10) $\Rightarrow$ (UF1).

Although (UF1) is probably the version of the Ultrafilter Principle most useful for analysts, the variants listed below are as well known in logic and algebra. The mathematician who is searching through the literature for equivalents of UF and related material would do well to look under not only "ultrafilter," but also "prime ideal" and "Boolean."

(UF5) Boolean Separation Theorem. The dual of a Boolean algebra separates its points. That is, if $X$ is a Boolean algebra and $x_0, x_1$ are distinct members of $X$, then there exists $f \in X^*$ with $f(x_0) \neq f(x_1)$. Or, by translation, we may restate this as: If $X$ is a Boolean algebra and $x_0 \in X \setminus \{0\}$, then there exists $f \in X^*$ with $f(x_0) = 1$.
This argument is a modification of one by Rice [1968]. where that subalgebra includes the point x0 and where that homomorphism maps x0 to 1. such that { (x0. (UF2) is applicable. Proof of (UF5) ~ (UF6). so S is injective. Let 9 be the collection of all functions f from subsets of X into {0. S is included in a prime ideal if and only if S has the analogous finite join property. To verify (UF2)(ii).20 to it.21 is welldefined. in the sense of (UF2)(iii). (I) satisfies (UF2)(i) trivially. . (Equivalently. and so we can apply 13. It is easy to verify that (I) has finite character.b.) Proof of (UF2) =~ (UF5). X has a Boolean ultrafilter. there exists a twovalued probability on X.11). 1} that have the following property: f can be extended to a Boolean homomorphism from some subalgebra of X into {0. . Thus the ringwithunit homomorphism S : X ~ G has kernel equal to {0}. Thus. and let S be a nonempty subset of X. 1}.1} is finite. ( U F 8 ) B o o l e a n P r i m e I d e a l E x i s t e n c e T h e o r e m .) ( U F 1 0 ) B o o l e a n U l t r a f i l t e r E x t e n s i o n T h e o r e m . If X is a nondegenerate Boolean algebra. X* is nonempty.19. in the terminology of 23.) ( U F 9 ) B o o l e a n P r i m e I d e a l E x t e n s i o n T h e o r e m . . Then S is included in a Boolean ultrafilter if and only if S has this finite meet property: 81 A 82 A ' ' " A 8n r 0 339 for each finite set {81.21) is an isomorphism from X onto an algebra of sets. then X* is nonempty and the Stone mapping (described in 13. so Stone's mapping S : X ~ G in 13. . Every Boolean algebra is isomorphic to some algebra of sets. since the set {0. this completes the proof. If X is a nondegenerate Boolean algebra. let S be any finite subset of X. Therefore S is an isomorphism from X onto 6 . Also. 1}. Then every proper ideal in X is included in a prime ideal. from (UF5) we see that x c X \ { 0 } => S(x) =/: ~. Let X be a Boolean algebra. 
then X has a prime ideal.Boolean Equivalents of UF ( U F 6 ) S t o n e R e p r e s e n t a t i o n T h e o r e m ( e x p l i c i t v e r s i o n ) . Then the Boolean subalgebra generated by S U {x0} is finite. Then (P can be described also as the set of all functions f from subsets of X into {0. (Equivalently. Also. X* is nonempty. ( U F 7 ) S t o n e R e p r e s e n t a t i o n T h e o r e m ( s i m p l e v e r s i o n ) . Let X be a Boolean algebra. . every proper filter in X is included in a Boolean ultrafilter. 1)} u G r a p h ( f ) is the graph of a function that satisfies Sikorski's extension criterion (13. (Equivalently. s n} C S. s 2 .
b. They have most of the properties of Boolean algebras. X / I has a prime ideal P. In this subchapter we shall consider two types of algebraic systems that are slightly more general than Boolean algebras. it is the smallest such filter. We shall also refer to => as the H e y t i n g i m p l i c a t i o n . . it is the filter generated by S. and let a. this filter is contained in some ultrafilter. Let X be a lattice. Definition. Basic properties.25. Verify that r~l(P) is a prime ideal in X that includes I. but not quite all. . Then: a. The "only if" part is obvious and does not require (UF9). and thus the Heyting implication is a mapping from X • X into X. conversely.28.8. suppose S has the finite meet property. HEYTING ALGEBRAS 13. Proof of (UF10) => (UF1). I n t e r c h a n g e of H y p o t h e s e s . s n } C S } is a proper filter containing S. Immediate from 13. Let X be a relatively pseudocomplemented lattice. Immediate from the example in 13. . Proof of (UF9) => (UF10). (a => (b => c)) = (b => (a => c))." That relatively pseudocomplemented lattices are more general than Heyting algebras and Heyting algebras are more general than Boolean algebras can be seen from the examples in 13. c.18(F). Proof of (UF7) => (UF8). they lack some of the symmetry or duality of Boolean algebras. Proof of (UF8) ~ (UF9). (In fact. a => b exists in X for all a. thus they might be thought of as "onesided Boolean algebras. 13. .340 Proof of ( U F 6 ) = > Chapter 13: Boolean Algebras (UF7). ( a A ( a ~ b ) ) ~ b ~ (a~b). b E X.) By (UF9). any subset of a proper filter) has the finite meet property. b E X. The p s e u d o c o m p l e m e n t of a r e l a t i v e to b is the element of X denoted by a => b and defined by this formula: (a=>b) = max{xEX : aAx~b} if such a maximum exists.24. For the "if" part. onto the quotient Boolean algebra. In particular. 
We say that X is a r e l a t i v e l y p s e u d o c o m p l e m e n t e d l a t t i c e if the Heyting implication is a binary operation that is.23. d. By (UF8). Then the set { x C X " x ~ s l A " " A Sn f o r s o m e f i n i t e s e t { s 1 . any subset of an ultrafilter (or more generally. x ~ ( a = > b ) ifandonlyifaAx~b. . x ~ (a => (b => c)) if and only if a A b A x ~ c. 13. b is a member. Let rr : X + X / I be the quotient map. Then the set {x E X : a A x ~ b} is nonempty for instance. Obvious. Let I be a given proper ideal in X.
a A s 4 r for each s E S. (This proof can be found in Rasiowa and Sikorski [1963] and other books. To show t h a t in a relatively pseudocomplemented lattice.27. Hint: Use the Contrapositive Law with b = Ca. In fact. it satisfies one of the infinite distributive laws: If a C X and S C_ X and c r . .) k" It can be shown that relatively p s e u d o c o m p l e m e n t e d lattices form an equational variety. Let X be a Heyting algebra. obtained by setting one of the variables to 0. X has a largest element. a ~ CCa. C o n t r a p o s i t i v e d.sup(S) exists. In a Heyting algebra X. If a ><b.Heyting Algebras e. Law. we shall show t h a t a A a 4 r.25. We omit the proof. (Several of these are just specializations of results of 13. Basic properties. By assumption. it can be found in Rasiowa [1974]. Let T = {a A s : s E S}. b.26. Using the definition of => again. c. then Cb ~ Ca. j. for any a E X. In any lattice. L a w . (a ~ b) ~ ((b ~ c) =~ (a =~ c)). f. C S} exists Proof. hereafter denoted by 1. It is equal to (a =~ a). a ~ b if and only if (a => b) = l. g. X is a distributive lattice.B o o l e a n a l g e b r a ) is a relatively p s e u d o c o m p l e m e n t e d lattice with the further p r o p e r t y t h a t X has a smallest element. (0=~b)=l. 13. if a = sup(S) exists. Thus cr 4 (a ~ r). (a =~ b) ~ ((Cb) =~ (Ca)). as required.) a.i. we have a A a 4 r. then (c => a) ><(c => b) and (a ~ c) ~ (b ~ c). C 0 = I . then sup{a A s ' s and equals a A or. by definition of 3 . denoted hereafter by 0. A H e y t i n g a l g e b r a (also known as a B r o u w e r i a n lattice or a p s e u d o . e. Prove the following properties. (a =~ (Cb)) . 13.25. {x E X " b A x ~ c}. D o u b l e N e g a t i o n and use 13. we also define a unary operation C" X + X by Ca (a=~0) max{xcX 9 aAx0). h. CI=0. ( l = ~ a ) = a a n d ( a = ~ l ) = l f o r a n y a c X . If a ~ b. then a A a is an upper b o u n d for T (easy exercise). Then each s c S satisfies s 4 (a ~ r). 
The operation C is called the p s e u d o c o m p l e m e n t .(b =~ (Ca)). i. where r is any given upper b o u n d for T. Hint" 341 {x c X " c A x ~ a} {x c X " a A x ~ c} C_ D_ {x E X " c A x ~ b}. a A a is the least upper bound.
T . b. page 128]. i. COOa= Ca. also use 13. hence it satisfies the requirements for a relative pseudocomplement. It may or may not have a smallest member.) In 5. 13. Topological examples.28. and thus it may or may not be a Heyting algebra.27. (B) COa ~ a for all a E X. ((Ca) V b) d (a =~ b). and let 2) be a collection of subsets of X that is closed under finite intersections and arbitrary unions. Chapter l3: Boolean Algebras h. since it also has a smallest member . Let X be a set. by the argument given at the beginning of this section. Hint" Use 13.13.27.j.27.21. ((Ca) v (Cb)) d C(a A b). The lattice of open sets may or may not be a Boolean algebra. Then (2). g. ((Ca) A (Cb)) . with binary lattice operations U. k.21 we gave an example in which the lattice of open sets does not satisfy the other infinite distributive law and thus is not a Boolean algebra. n. Then the following conditions are equivalent: (A) a v ( C a ) .b this yields C(COa) ~ C(a). any Heyting algebra is lattice isomorphic to the lattice of open sets of some topological space.f with a In. Cb.c). Hints: (Ca) d C0(Ca) by applying the Double Negation Law to Ca.342 f.25. let S. Assume that the partially ordered set (9 C_) is a lattice. in 5. is a Heyting algebra. (a=> (Ca)) . and thus the largest member of KS.29..namely.C(a V b). T be any two given members of 2). For instance. Although we omit the proof. by 13. rq (see 15. C_) is a relatively pseudocomplemented lattice. We note two particular instances of this when X is a topological space. Which Heyting algebras are Boolean? Suppose X is a Heyting algebra. Let %S. 13.1. that fact also follows from 13. The lattice of open sets. (exercise) it does not have a smallest member if X is the real line with its usual topology.i. B r o u w e r ' s Triple N e g a t i o n Law.21 we verified directly that the lattice of open sets satisfies one of the infinite distributive laws. a proof of this is given by Rasiowa and Sikorski [1963.T is itself a member of %S. 
apply C to both sides of the Double Negation Law. .{G C 9 9S rq G c_ T}. the empty set. The open dense subsets of a topological space X form a lattice. a A ( C a ) = 0 .T. CC (b v (Cb)) . Also. these examples are from Rasiowa and Sikorski [1963]" a.(Ca).l f o r a l l a E X .T. See Rasiowa [1974]. it is a relatively pseudocomplemented lattice. Every Boolean lattice is a Heyting algebra. conversely. it can be shown that Heyting algebras form an equational variety. discussed in 5. To see this. (Although the proof is too long to present here. j. In fact. it can be shown that. Then the union of the members of %S. 1.
(C) $(a \Rightarrow b) \preccurlyeq ((Ca) \vee b)$ for all $a, b \in X$.
(D) $((Ca) \Rightarrow (Cb)) \preccurlyeq (b \Rightarrow a)$ for all $a, b \in X$.
(E) $((Ca) \Rightarrow b) \preccurlyeq ((Cb) \Rightarrow a)$ for all $a, b \in X$.
(F) X is a Boolean algebra.

Proof. If X is a Boolean algebra, then it is easy to verify that all the other conditions listed above are satisfied. Conversely: For (C) implies (A), let $b = a$. For (D) implies (B), let $b = CCa$ and simplify. For (E) implies (B), let $b = Ca$. For (B) implies (A), note that $CC(a \vee (Ca)) = 1$ and $a \preccurlyeq CCa$ in any Heyting algebra. If (A) holds, then C is a complementation operation (as in 13.1), not just pseudocomplementation; since any Heyting algebra is a distributive lattice, (F) follows.
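The equivalence of conditions (A) through (E) can be spot-checked on small examples. The following Python sketch is illustrative only and is not part of the original text: it assumes the Heyting algebra is given as the lattice of open sets of a finite topological space, with the implication computed as the union of all open G satisfying $a \cap G \subseteq b$. The function names `implies`, `neg`, and `conditions` are hypothetical.

```python
from itertools import product

def implies(opens, a, b):
    """Heyting implication a => b: the union of all open G with (a & G) <= b."""
    result = frozenset()
    for g in opens:
        if a & g <= b:
            result |= g
    return result

def neg(opens, a):
    """Pseudocomplement Ca = (a => 0), where 0 is the empty set."""
    return implies(opens, a, frozenset())

def conditions(opens):
    """Truth values of conditions (A)-(E) in the lattice of open sets `opens`."""
    top = frozenset().union(*opens)          # largest element 1
    pairs = list(product(opens, repeat=2))   # all (a, b) in X x X
    imp = lambda a, b: implies(opens, a, b)
    c = lambda a: neg(opens, a)
    return {
        "A": all(a | c(a) == top for a in opens),
        "B": all(c(c(a)) <= a for a in opens),
        "C": all(imp(a, b) <= (c(a) | b) for a, b in pairs),
        "D": all(imp(c(a), c(b)) <= imp(b, a) for a, b in pairs),
        "E": all(imp(c(a), b) <= imp(c(b), a) for a, b in pairs),
    }

# Open sets of the Sierpinski space on {0, 1}: a three-element chain.
sierpinski = [frozenset(), frozenset({0}), frozenset({0, 1})]
# Discrete topology on {0, 1}: the Boolean algebra of all subsets.
discrete = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]

print(conditions(sierpinski))
print(conditions(discrete))
```

In this sketch all five conditions fail for the three-element chain and all five hold for the power set, which is consistent with the chain being a Heyting algebra that is not Boolean.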
Chapter 14
Logic and Intangibles

14.1. Introduction. This chapter provides a brief introduction to formal logic. Contrary to the assumption of many nonmathematicians, the study of formal logic does not make us more "logical" in the usual sense of that word; i.e., the study of logic does not make us more precise or unemotional. Formal logic is not merely a more accurate or more detailed version of ordinary mathematics. Rather, it is a whole other subject, with its own methods and its own theorems, which are of a rather different nature than the theorems of other branches of mathematics.

Because many of logic's most important applications are in set theory, those two subjects are often presented together, and they may be confused in the minds of some beginners. However, logic and set theory are really different subjects. It is possible to do some interesting things in set theory without any formal logic (see Chapter 6). Conversely, logic can be applied to other theories besides set theory, e.g., ring theory, real analysis, etc. We have already seen examples of this in 8.51 and in Chapter 13.

14.2. Chapter overview. We shall cover the basics of logic, up to and including a proof of the Completeness and Compactness Principles, which show that the syntactic and semantic views of consistency are equivalent. Our presentation is mostly conventional, but we follow the unconventional approach of Rasiowa and Sikorski [1963] in our definition of "free variables" and "bound variables"; this is discussed later in this chapter.

The Completeness and Compactness Principles are also interesting to us because they are equivalent to the Ultrafilter Principle, a weak form of Choice studied extensively in other chapters of this book. An easy corollary of the Compactness Principle is the existence of nonstandard models of arithmetic and analysis in 14.63; this is one way to introduce the subject of nonstandard analysis.

After the Completeness and Compactness Principles we shall state a few more advanced results, with references in lieu of proofs. Our main goal is to develop some understanding of the notion of "consistency," so that we can understand Shelah's alternative to conventional set theory, Con(ZF) $\Rightarrow$ Con(ZF + DC + BP). This result was proved by Shelah [1984], but the proof is too long and too advanced to be included in this book. Our goal is only to understand the statement of Shelah's result and some of its applications. At the end of this chapter, we shall use Shelah's consistency result to explain intangibles, i.e., objects that "exist" in conventional mathematics but that lack
"examples." For a first reading, some may choose to skip ahead to the end of this chapter and just read the summary of consistency results and the explanation of intangibles; the rest of this chapter will not be needed elsewhere in the book.

SOME INFORMAL EXAMPLES OF MODELS

14.3. In logic we separate a language from its meanings. An interpretation of a language is a way of assigning meanings to its symbols. Formulas are not true or false in any absolute sense; they are only true or false when we give a particular interpretation to the language. For instance, the axioms of ZF set theory are usually regarded as true, but they become false if we interpret "set" and "member" in the peculiar fashion indicated in 1.48. A model of a collection of formulas is an interpretation that makes those formulas true; it is a sort of "example" for that collection of formulas.

In the view of some logicians, especially formalists, mathematical objects such as sets do not really "exist"; all that really "exists" is the language we use to discuss sets and the reasoning we can perform in that language. When we change the language or its interpretation, then the nature of sets or other mathematical objects changes. Bertrand Russell took such a viewpoint when he said: Mathematics is the only science where one never knows what one is talking about nor whether what is said is true.

If we cannot establish absolute truth, the next best thing is syntactic consistency, i.e., knowing that our axioms do not lead by logical deduction to a contradiction. By the Completeness Theorem (proved in 14.57), syntactic consistency is equivalent to semantic consistency, i.e., knowing that our collection of axioms has at least one model.

An interpretation of a language may be highly unconventional, unwieldy, and not at all intuitive. It may be constructed just for a brief, one-time use, e.g., to prove the consistency of a given collection of axioms. After a model has been used to establish consistency of some axioms, in some cases we may choose to discard the model and think solely in terms of the axioms, because they are conceptually simpler. (A good example of this is in 14.4.) Application-oriented mathematicians may choose to skip the modeling step altogether, and begin with the axioms, trusting that other mathematicians have already justified those axioms with a model.

All of the terms introduced above (interpretation, model, consistent, etc.) will be given more specialized and precise meanings later in this chapter. But first, to introduce the basic ideas, in the present subchapter we shall present some informal examples of models, where "model" has the broad and slightly imprecise meaning indicated above. Most of these examples are mere sketches, intended only to indicate the flavor of the ideas, and are not intended as exercises. The omitted details are considerable; the reader who wishes to fill in the details should consult the references in the bibliography.

14.4. Models of the reals. The axioms for an ordered field were given in 10.7; those plus
Dedekind completeness make up the axioms for the real number system. Many analysis books simply "define" $\mathbb{R}$ to be a Dedekind complete, ordered field. But how do we know that that list of axioms makes sense? We must show (or trust other mathematicians who say they have shown) that (i) there is such a field, and (ii) any two such fields are isomorphic. Proofs of (i) by different constructions in terms of the rationals are given in Chapters 10 and 19; a proof of (ii) is given in Chapter 10.

Any one of these constructions is a model of the axioms of $\mathbb{R}$, and it therefore demonstrates the consistency of the axioms of $\mathbb{R}$. However, the constructions involving Dedekind cuts, equivalence classes of Cauchy sequences, etc., are rather complicated and generally have little to do with our intended applications of the reals. The axioms for $\mathbb{R}$ are usually much simpler conceptually and more convenient for applications. Thus, after we have demonstrated consistency we may discard the model and think of the real numbers in terms of their axioms: The real number system is a complete ordered field.

14.5. A non-Euclidean geometry modeled in Euclidean geometry. During the 18th and 19th centuries, mathematicians became concerned about Euclid's Parallel Postulate, which says, in one formulation: through a given point p not on a given line L, there passes exactly one line that lies in the same plane as L but does not meet L. The other postulates of geometry are concerned with objects of finite size, such as triangles. In contrast, the Parallel Postulate is concerned with behavior of points that are very far away, perhaps infinitely far away, and so the Parallel Postulate is less self-evident.

Some mathematicians attempted to remove any doubts by proving this axiom as a consequence of the other axioms of Euclidean geometry. In these attempts, one approach was to replace the Parallel Postulate with some sort of alternative that negates the Parallel Postulate, and then try to derive a contradiction. Some alternatives to the Parallel Postulate did indeed lead to clear contradictions, but other alternatives merely led to very peculiar conclusions. The peculiar conclusions made up new, non-Euclidean geometries. For instance, a paper of Riemann (1854) developed a geometry, now called double elliptic geometry or Riemannian geometry, in which any two lines meet in two points, and the sum of the angles of a triangle is greater than 180 degrees.

At first these geometries were not seen to have anything to do with the "real" world; they were merely viewed as imaginary mathematical constructs. However, in 1868 Eugenio Beltrami observed that the axioms of two-dimensional double elliptic geometry are satisfied by the surface of an ordinary sphere of Euclidean geometry, if we interpret "line" to mean "great circle" (i.e., a circle whose diameter is the diameter of the sphere). Thus, if a contradiction arises in our reasoning about double elliptic geometry, then by the same argument with a different interpretation of the words we can obtain a contradiction in Euclidean geometry. Therefore, we have a model that establishes relative consistency: If the axioms of Euclidean geometry are noncontradictory and the
theorems we have proved about the sphere in Euclidean geometry are correct, then the axioms of double elliptic geometry are also consistent.

A similar approach to the Parallel Postulate, using the interior of a circle in the Euclidean plane, was developed by Cayley; it is discussed by Young [1911/1955]. Even if we find these bizarre geometries distasteful and prefer to concern ourselves only with Euclidean geometry, Beltrami's reasoning leads to this important conclusion: The Parallel Postulate of Euclidean geometry is not implied by the other axioms of Euclidean geometry.

In retrospect, we can now see that double elliptic geometry is every bit as "realistic" as Euclidean geometry. We can only be certain of what is near at hand. Ants on a very large sphere, if they thought at all, might think they were on a plane. Indeed, many humans thought that way until Columbus sailed. In much the same fashion, our three-dimensional space may be very slightly curved in a fourth direction, but the curvature may be so small that we have not yet detected it. Perhaps a space ship that travels far enough in a seemingly straight line will eventually return to its home planet. As Hirsch [1995] has put it so aptly, before the 19th century Euclidean geometry was "not merely an axiomatic study, but our best scientific description of physical space." For a more detailed discussion of the history of these ideas, see Kline [1980].

14.6. Specifying a universe. Here is one way to construct models of set theory: Let $\mathcal{M}$ be some given class of sets. Hereafter, interpret the term "set" to mean "member of $\mathcal{M}$." Thus, the phrases "for each set" and "for some set" will be interpreted as "for each member of $\mathcal{M}$" and "for some member of $\mathcal{M}$." Then statements in the language of set theory can be interpreted in terms of $\mathcal{M}$. Are they satisfied in the model $\mathcal{M}$? Yes, for some choices of $\mathcal{M}$; no, for others.

For instance, the definition of equality of sets (given in 1.47) says that if A and B are two sets, then $A = B$ holds if and only if for each set T, we have $T \in A \Leftrightarrow T \in B$. This condition is satisfied when "set" and "member" have their usual meanings, but not when those terms have certain unconventional meanings, as in 1.48. In the model $\mathcal{M}$, the definition of equality has this interpretation: Let $A, B \in \mathcal{M}$; then $A = B$ if and only if for each $T \in \mathcal{M}$, we have $T \in A \Leftrightarrow T \in B$. It is easy to see that "equality" of sets in the model $\mathcal{M}$ coincides with the restriction of ordinary equality to the collection $\mathcal{M}$ if and only if $\mathcal{M}$ has this property:

(!) whenever $A, B \in \mathcal{M}$ and $\mathcal{M} \cap A = \mathcal{M} \cap B$, then $A = B$.

Easy exercise (from Doets [1983]). If $\mathcal{M}$ is a transitive set (defined as in 5.42), then $\mathcal{M}$ satisfies the condition (!) above.

A converse to this exercise is Mostowski's Collapsing Lemma, which states that any model that satisfies (!) is isomorphic to a transitive model. We shall not prove this lemma; it can be found in books on axiomatic set theory.
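For a finite universe the condition (!) and the transitivity hypothesis of the exercise can be tested directly. The sketch below is an illustration under stated assumptions, not part of the original text: sets are modeled as nested Python `frozenset` values, the universe is a finite collection of such values, and the names `is_transitive` and `satisfies_bang` are hypothetical.

```python
def is_transitive(universe):
    """A collection is transitive if every member of a member is itself a member."""
    universe = frozenset(universe)
    return all(a <= universe for a in universe)

def satisfies_bang(universe):
    """Condition (!): if A, B are in the universe and have the same
    intersection with the universe, then A = B."""
    universe = frozenset(universe)
    return all(
        a == b
        for a in universe
        for b in universe
        if (a & universe) == (b & universe)
    )

# The first three von Neumann ordinals: 0 = {}, 1 = {0}, 2 = {0, 1}.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

good_universe = {zero, one, two}          # transitive
bad_universe = {one, frozenset({two})}    # not transitive; members look alike from inside

print(is_transitive(good_universe), satisfies_bang(good_universe))
print(is_transitive(bad_universe), satisfies_bang(bad_universe))
```

In the second universe, the two members are different sets, but neither contains any member of the universe, so the universe cannot tell them apart and (!) fails.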
If $\mathcal{M}$ includes some, but not all, members of von Neumann's universe V (described in 5.53), then sets $A, B \in \mathcal{M}$ may have different properties when viewed in $\mathcal{M}$ or in V. For instance, since $\mathcal{M}$ has fewer sets than V, it also has fewer functions and fewer bijective functions. It is quite possible that there exists a bijective function in V between A and B, but there does not exist such a function in $\mathcal{M}$. Thus card(A) = card(B) in V, but card(A) $\neq$ card(B) in $\mathcal{M}$. When we go from the smaller universe $\mathcal{M}$ to the larger universe V, some distinct cardinalities coalesce; this phenomenon is called cardinal collapse.

14.7. Gödel's universe. A subclass of V was used for an important model of set theory by Gödel around 1939. He interpreted "set" to mean "member of L," using the universe L of sets that are "constructible relative to the ordinals," as described in 5.54. That universe is (perhaps) smaller than the usual universe V. With this interpretation, he was able to show that the axioms of ZF set theory plus AC (the Axiom of Choice) plus GCH (the Generalized Continuum Hypothesis) are all true. He constructed his model L inside the conventional universe V, and his use of V assumed the consistency of the ZF axioms. Thus, he concluded that if ZF is consistent, then ZF + AC + GCH is consistent. Though Gödel's proof involved constructible sets, this conclusion does not mention constructible sets, and is not restricted to any particular meaning for "sets." (In 1963 Cohen showed, by other methods, that $\neg$CH is also consistent with set theory. Thus the Continuum Hypothesis and the Generalized Continuum Hypothesis are independent of the axioms of conventional set theory; see 14.8.)

The axiom V = L is called the Axiom of Constructibility, which says that all sets are constructible relative to the ordinals. Gödel's construction also shows that if ZF is consistent, then so is ZF + AC + GCH + (V = L). Thus, we do not obtain a contradiction if we assume that those two universes are the same; we cannot be sure that Gödel's constructible universe L really is smaller than von Neumann's universe V. On the other hand, it has been proved by other methods that V $\neq$ L is also consistent with set theory (see for instance Bell [1985]), so the Constructibility Axiom is independent of the axioms of conventional set theory.

14.8. Modeling the reals with random variables. The Continuum Hypothesis (CH) can be formulated as a statement about subsets of the reals: It says that no set S satisfies card($\mathbb{N}$) < card(S) < card($\mathbb{R}$). By now the literature contains several different variants of Cohen's proof that CH is independent of ZF + AC. One of the simplest to outline is the following: Let $(\Omega, \Sigma, \mu)$ be a probability space, where the set $\Omega$ has a very high cardinality. Let $\mathcal{R}$ be the space of all equivalence classes of real-valued random variables. For a suitable choice of $(\Omega, \Sigma, \mu)$, it is possible to show that

(i) For a suitably formulated axiomatization of $\mathbb{R}$, every axiom of $\mathbb{R}$ is satisfied with probability 1 by $\mathcal{R}$.
(ii) If statements that are true with probability 1 are used to generate new statements, via the rules of logic, then the new statements are also true with probability 1.
(iii) The Continuum Hypothesis, interpreted as a statement about $\mathcal{R}$, is not true with probability 1.

Here $\mathbb{R}$ is modeled by $\mathcal{R}$, and "truth" is replaced by "truth with probability 1." The axioms of set theory and of $\mathbb{R}$, though interpreted in a peculiar fashion, remain unchanged in superficial appearance, and the rules of logic remain unchanged insofar as they deal with strings of symbols. Therefore, regardless of what kind of "truth" and "sets" and "real numbers" we use, the axioms of sets and of $\mathbb{R}$ cannot be used, via the rules of logic, to deduce the Continuum Hypothesis.

The explanation sketched above is only the merest outline; the omitted details are numerous and lengthy. Some of them are given by Manin [1977].

14.9. A topos model for constructivists. We mentioned in 6.6 that the Trichotomy Law for real numbers is not constructively provable. We now sketch part of a demonstration of that unprovability, due to Scedrov. Our treatment is modified from Bridges and Richman [1987].

By a "Scedrov-real number" we shall mean a continuous function from [0, 1] into $\mathbb{R}$. Let f be such a function, let P be a statement about a real number, and let $x \in [0, 1]$; then we say P is true for f at x if P is a true statement about the real number f(y) for all y in some neighborhood of x in [0, 1]. The collection of all such points x is the truth value of P for f; it is an open set. A statement is true if its truth value is the entire interval [0, 1]. It can be demonstrated (though we shall omit the details here) that the Scedrov-real numbers are a model of the real numbers with constructivist rules of inference.

Now let $f(x) = x$ and $g(x) = 0$, for all $x \in [0, 1]$. Then the truth value of the statement $f \le g$ is the empty set (since the interior of a singleton is empty), while the truth value of $f > g$ is the interval (0, 1]. Hence the truth value of the statement "$f \le g$ or $f > g$" is the interval (0, 1], and thus that statement is not true in this model.

14.10. A finite model. The following example (from Nagel and Newman [1958]) is a bit contrived, but it illustrates a point well. We consider a mathematical system consisting of two classes of objects, K and L, which must satisfy these axioms:

1. Any two members of K are contained in just one member of L.
2. No member of K is contained in more than two members of L.
3. The members of K are not all contained in a single member of L.
4. Any two members of L intersect in just one member of K.
5. No member of L contains more than two members of K.

The consistency of this axiom system can be established by the following model:

(*) Let T be a triangle. Let L be the set of edges of T, and let K be the set of vertices of T.

We can verify that this model satisfies the preceding five axioms; thus they cannot be contradictory. We emphasize that those five axioms might also have other interpretations.
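Because the triangle model is finite, the claim that it satisfies the five axioms can be checked exhaustively. The following Python sketch is illustrative only: it labels the vertices with the hypothetical names a, b, c, represents each edge as the set of its two endpoints, and reads "contained in" as set membership. These encoding choices, and the function name `check_axioms`, are assumptions rather than part of the original text.

```python
from itertools import combinations

def check_axioms(K, L):
    """Return the truth value of each of the five axioms for the classes K and L.
    K is a set of objects; L is a list of subsets of K."""
    K = frozenset(K)
    L = [frozenset(l) for l in L]
    return {
        # 1. Any two members of K are contained in just one member of L.
        1: all(sum(1 for l in L if {p, q} <= l) == 1
               for p, q in combinations(K, 2)),
        # 2. No member of K is contained in more than two members of L.
        2: all(sum(1 for l in L if p in l) <= 2 for p in K),
        # 3. The members of K are not all contained in a single member of L.
        3: not any(K <= l for l in L),
        # 4. Any two members of L intersect in just one member of K.
        4: all(len(l & m & K) == 1 for l, m in combinations(L, 2)),
        # 5. No member of L contains more than two members of K.
        5: all(len(l & K) <= 2 for l in L),
    }

# Triangle model: vertices a, b, c and the three edges joining them.
vertices = {"a", "b", "c"}
edges = [{"a", "b"}, {"b", "c"}, {"c", "a"}]
print(check_axioms(vertices, edges))

# A non-model for comparison: one "line" containing all three points.
print(check_axioms(vertices, [{"a", "b", "c"}]))
```

The second call shows that the check is not vacuous: a single line through all three points violates axioms 3 and 5.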
We are not restricted to (*) as the only possible interpretation. However, we do have at least one model, given in (*). This is sufficient to prove that the five axioms by themselves cannot lead to a contradiction.

We have used Euclidean geometry to make (*) easy to visualize, but perhaps we do not feel certain about the reliability of Euclidean geometry. The use of geometry is not essential for our present axiom system. We can reformulate (*) without mentioning triangles:

(**) Let a, b, c be distinct objects. Let $K = \{a, b, c\}$ and $L = \{\{a, b\}, \{b, c\}, \{c, a\}\}$.

The model (**) has only finitely many parts; thus, it leaves very little room for doubt. The importance of such models is discussed in 14.70.

LANGUAGES AND TRUTHS

14.11. A language is a collection $\mathcal{L}$ of symbols, together with rules of grammar that govern the ways in which those symbols may be put together into strings of symbols called "formulas." For instance, one of the most important languages we shall study is the language of set theory. This language includes symbols such as $\in$, $\subseteq$, $=$, $\cap$, etc. Its grammatical rules tell us that $A \cap B \in C$ is a formula, but $A \subseteq\in B$, $A \cap\cup = B$, $A\cap =$ are not formulas.

In formal logic we separate a language from its meanings. Although ultimately we shall be concerned with attaching meanings to the symbols in the language $\mathcal{L}$, at the outset it is best to disregard such meanings, even the meanings of familiar symbols such as $\in$, $\subseteq$, $=$, $+$, etc. Conceptually, a good place to start is the monoid of meaningless strings of symbols, described in 8.4. For instance, in a formal language, "1 + 2" and "3" are different, unrelated strings of meaningless symbols. However, when we interpret that language in its usual fashion, then the strings "1 + 2" and "3" represent the same object.

14.12. Formal versus informal systems. Ordinary mathematicians (i.e., those other than logicians) may study sentences, theorems, and proofs about (for instance) rings or differential equations. However, logicians study sentences, theorems, and proofs about sentences, theorems, and proofs. The ordinary mathematician uses a language that describes rings or differential equations; the logician uses a language that describes languages. The logician is related to the mathematician much as a linguist is related to a novelist.

As Rosser [1939] pointed out, in works of logic we may commonly identify at least two distinct systems of reasoning:

a. The inner system of reasoning is the subject of the work. It is a sort of microcosm of reasoning. It may be less powerful than the reasoning that we use in "ordinary" mathematics, but it is delimited more precisely. Just as a theorem about rings must have precise hypotheses ("Let G be a commutative ring ..."), so too a theorem in logic
must have precise hypotheses ("Let $\mathcal{L}$ be a language with infinitely many free variable symbols ..."). The inner language is also called the object language. Formulas such as $(\forall \xi\ P(\xi, x)) \cup Q(x, f(y, z))$ will occur in the object languages studied in this chapter.

b. The outer system of reasoning is ordinary reasoning, as in algebra or analysis. It is conducted in the language of ordinary discourse, also sometimes known as the metalanguage: a natural language such as English or Japanese, modified slightly to suit the specialized needs of mathematicians. The outer language usually does not have to be formal; we can communicate effectively without first discussing in detail how we will communicate.

When the inner system is mathematics, the outer system is often called "metamathematics," which translates roughly to "beyond mathematics" or "above mathematics" or "about mathematics." For instance, the Soundness Principle 14.55(iv) and the Gödel-Mal'cev Completeness Principle in 14.57 are results about formal systems; thus they are "metatheorems" which reside in the outer system. Here is another example: Let "Con" denote consistency. Then "Con(ZF)" and "Con(ZF + AC + GCH)" are two statements about the consistency of certain axiom systems in formal set theory. Thus they are metamathematical statements, where the mathematics in this case is set theory. Then "Con(ZF) $\Rightarrow$ Con(ZF + AC + GCH)" is a metatheorem or, if we prefer, a metametatheorem.

The inner and outer systems do not necessarily have the same truths; one of these systems may be stronger than the other, as discussed below. For instance, we must assume ZF plus the Ultrafilter Principle (UF) in our outer system when we want to prove the Gödel-Mal'cev Completeness Principle. That principle can be applied to inner systems that are weaker (such as ZF) or stronger (ZF + AC) or perhaps not even directly comparable.

Beginners may find it helpful to view the Ultrafilter Principle and the Completeness Principle as "true"; then they will only need to deal with the two levels of reasoning described above. However, more advanced readers can consider a third level: Throughout many chapters of this book we study equivalents of AC and of UF, viewing them as principles which "might be true" or "might be false," depending on what kind of universe we decide to live in. The implication (UFS) $\Rightarrow$ (UF11), proved in 14.57, is a metametatheorem; it is a theorem about metatheorems such as the Completeness Principle. It is even more "outer" than the "outer system," but to avoid confusion, hereafter we shall not discuss such results in this fashion.

There is some resemblance between the logician's inner and outer systems: both systems include "sentences," "implications," "theorems," and "proofs." This may cause some confusion for beginners. No such confusion arises in other subjects, such as ring theory or differential equations; e.g., a theorem about rings generally does not look like a ring. The beginner is cautioned to carefully maintain in his or her mind the distinction between inner and outer systems. Throughout this chapter we shall use notations that support that distinction. First, however, we shall give an example of the kind of difficulties that arise when the distinction is not maintained carefully:

14.13. Berry's Paradox. Call a positive integer succinct if it can be described in
sentences of the English language using fewer than 1000 characters (where a character means a letter, a space, or a punctuation symbol). There are only finitely many different characters, and so it is clear that there are only finitely many succinct numbers. Let $n_0$ be the first positive integer that is not succinct. We have described $n_0$ in this paragraph, which is shorter than 1000 characters. So $n_0$ is succinct after all, a contradiction.

Explanation of the flaw in the reasoning. The first sentence suggests that we are to use English for the formal language of our inner system. However, English is a very fluid language, which changes even while it is being used. In attempting to make mathematically precise sense out of Berry's Paradox, we must use some frozen, unchanging "version" of English for the formal language of our inner system. We assume that some particular version of English has been selected and is understood by all parties participating in this endeavor. The term "English" hereafter is understood to refer to this frozen, formal language. This object language cannot discuss itself, and thus cannot discuss what is a "sentence of the English language." The notion of a "sentence" (as it is used here) is a metamathematical concept, i.e., a concept about the language, rather than a concept expressed in the language.

Now we can give a more precise definition of a "succinct" positive integer: It is a positive integer that can be described in the object language in fewer than 1000 characters. Likewise, $n_0$ is the first positive integer that cannot be described in the object language in fewer than 1000 characters. These definitions are mathematically precise, quite brief, and not at all fallacious, but they are formulated in the metalanguage, not in the object language. Our definition of $n_0$ is formulated in the metalanguage; we have not given a definition of $n_0$ in the object language. We certainly have not given a definition of $n_0$ that is 1000 characters or fewer in the object language. We cannot conclude that $n_0$ is succinct, so no contradiction is reached.

Everyday, nonmathematical English is permitted to talk about itself, and this kind of self-referencing can lead to paradoxes, but they are not taken seriously because, after all, English is not mathematics.

14.14. When $\mathcal{F}$ and $\mathcal{G}$ are formulas in our object language, then $\mathcal{F} \to \mathcal{G}$ is also a formula in that language. It is most often read as "$\mathcal{F}$ implies $\mathcal{G}$." We now consider two types of implications in our metalanguage, which will be investigated in greater detail later in this chapter.

a. Let $\Sigma$ be a set of formulas, and let $\mathcal{F}$ be a formula. A derivation of $\mathcal{F}$ from $\Sigma$ is a finite sequence of formulas $\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n$ such that $\mathcal{E}_n = \mathcal{F}$ and each $\mathcal{E}_j$ is either (i) an axiom, (ii) a member of $\Sigma$, or (iii) obtained from previous members of the sequence by rules of inference. When such a sequence exists, then we say $\mathcal{F}$ is a syntactic consequence of $\Sigma$; this is abbreviated as $\Sigma \vdash \mathcal{F}$. Observe that this notation does not reflect the choice of the axioms, which are nevertheless available for use in the derivation. The statement "$\Sigma \vdash \mathcal{F}$" is equivalent to this statement, which may be preferred by some readers: "any set of axioms that makes all the members of $\Sigma$ into syntactic theorems, also makes $\mathcal{F}$ into a syntactic theorem." When $\Sigma$ is the empty set, the derivation is called a proof, and the consequence $\mathcal{F}$ is called a syntactic theorem. We may write $\varnothing \vdash \mathcal{F}$ or, more briefly, $\vdash \mathcal{F}$.
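The definition of a derivation is mechanical enough to be checked by a short program. The following Python sketch is an illustration only, not part of the formal development: it assumes that formulas are encoded as strings (atoms) and nested tuples, and that the only rule of inference is detachment (from A and A → B, obtain B). The encoding and the names `implies`, `follows_by_detachment`, and `is_derivation` are hypothetical choices made for this sketch.

```python
def implies(a, b):
    """Encode the object-language formula (a -> b) as a tuple."""
    return ("->", a, b)


def follows_by_detachment(formula, earlier):
    """True if some earlier A occurs together with an earlier (A -> formula)."""
    earlier_set = set(earlier)
    return any(implies(a, formula) in earlier_set for a in earlier)


def is_derivation(sequence, sigma, axioms, goal):
    """Check clauses (i)-(iii) of the definition for every member of the sequence.

    sequence: candidate derivation E_1, ..., E_n (a list of formulas)
    sigma:    the set of hypotheses
    axioms:   the set of axioms (assumed given; the notation leaves them implicit)
    goal:     the formula that E_n must equal
    """
    if not sequence or sequence[-1] != goal:
        return False
    for j, step in enumerate(sequence):
        justified = (
            step in axioms                                  # (i) an axiom
            or step in sigma                                # (ii) a member of sigma
            or follows_by_detachment(step, sequence[:j])    # (iii) rule of inference
        )
        if not justified:
            return False
    return True


# A three-step derivation of Q from the hypotheses P and (P -> Q).
hypotheses = {"P", implies("P", "Q")}
candidate = ["P", implies("P", "Q"), "Q"]
derivable = is_derivation(candidate, hypotheses, set(), "Q")
```

The checker only verifies a proposed sequence; it does not search for one. That matches the definition, which asks only that such a sequence exist.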
A set of formulas $\Sigma$ is syntactically inconsistent if some formula and its negation are both syntactic consequences of $\Sigma$; otherwise the set of formulas is syntactically consistent. The study of syntactic consequences is sometimes called proof theory.

b. We write $\Sigma \vDash \mathcal{F}$ to say that $\mathcal{F}$ is a semantic consequence of $\Sigma$, i.e., that every interpretation of the language that makes $\Sigma$ true (and makes all of our unmentioned axioms true), also makes $\mathcal{F}$ true. This means that every model of $\Sigma$ is also a model of $\mathcal{F}$. The axioms, if any, are understood from the context and are not mentioned in this notation. When $\Sigma$ is the empty set, the consequence $\mathcal{F}$ is called a semantic theorem. We may write $\varnothing \vDash \mathcal{F}$ or, more briefly, $\vDash \mathcal{F}$. This means that any interpretation that makes the axioms true also makes $\mathcal{F}$ true. A set of formulas $\Sigma$ is semantically inconsistent if it has no models or semantically consistent if it has at least one model. The study of semantic consequences is sometimes called model theory.

On the surface, proof theory and model theory seem rather different. A fundamental and nontrivial result of first-order logic is that proof theory and model theory are equivalent, in this sense: they actually yield the same notions of "theorem" and "consistency." However, to establish the equivalence, we must first develop the syntactic and semantic views separately; thereafter, we can simply refer to a "theorem" and to "consistency," provided that our system of reasoning is sound. This equivalence will be established in 14.57 and 14.59.

A few cautionary remarks.

(i) Terminology varies in the literature. Some mathematicians prefer either the viewpoint of proof theory or the viewpoint of model theory, and so they define the terms "theorem," "true formula," "valid formula," or "tautology" to be synonymous with what we have called a "syntactic theorem" or with what we have called a "semantic theorem." Which term is applied to which type of theorem varies from one paper or book to another. That may confuse beginners, but it does not affect the ultimate results since the two kinds of theorems will eventually be shown equivalent.

(ii) The symbol $\vDash$ has another meaning, which will not be used in this chapter. We mention it to prevent confusion when the reader runs across it in some other book. If $\mathcal{M}$ is some particular model in which a formula $\mathcal{F}$ is true, some mathematicians may write $\mathcal{M} \vDash \mathcal{F}$. This is read as: $\mathcal{M}$ is a model of $\mathcal{F}$, or $\mathcal{M}$ satisfies $\mathcal{F}$, or $\mathcal{F}$ holds in $\mathcal{M}$.

(iii) The symbols $\vdash$ (syntactic implication) and $\vDash$ (semantic implication) should not be confused with these similar symbols: $\top$ (truth), $\bot$ (falsity), $\Vdash$ (forcing), which are used in some books (but not this one). Forcing is discussed briefly in 14.53.
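For propositional formulas under the familiar two-valued interpretations, semantic consequence can be tested by brute force over all truth assignments. The Python sketch below is only an illustration under those assumptions: it ignores any unmentioned axioms, it uses classical truth tables (the interpretations studied later in the chapter are more general), and the tuple encoding and the names `atoms`, `value`, and `semantic_consequence` are hypothetical.

```python
from itertools import product


def atoms(formula):
    """Collect the primitive proposition symbols occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(sub) for sub in formula[1:]))


def value(formula, assignment):
    """Evaluate a formula under a two-valued assignment (classical truth tables)."""
    if isinstance(formula, str):
        return assignment[formula]
    op = formula[0]
    if op == "not":
        return not value(formula[1], assignment)
    if op == "or":
        return value(formula[1], assignment) or value(formula[2], assignment)
    if op == "and":
        return value(formula[1], assignment) and value(formula[2], assignment)
    if op == "implies":
        return (not value(formula[1], assignment)) or value(formula[2], assignment)
    raise ValueError(f"unknown connective: {op!r}")


def semantic_consequence(sigma, formula):
    """True if every assignment making all of sigma true also makes formula true."""
    names = sorted(set().union(atoms(formula), *(atoms(s) for s in sigma)))
    for bits in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, bits))
        if all(value(s, assignment) for s in sigma) and not value(formula, assignment):
            return False  # found an interpretation of sigma that falsifies formula
    return True


# Q is a semantic consequence of {P -> Q, P}; Q is not a consequence of {P} alone.
detachment_ok = semantic_consequence([("implies", "P", "Q"), "P"], "Q")
unrelated = semantic_consequence(["P"], "Q")
```

With an empty list for `sigma`, the same function tests whether a formula is a semantic theorem in this restricted two-valued sense.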
(iv) Although $\mathcal{F} \vdash \mathcal{G}$ and $\mathcal{F} \vDash \mathcal{G}$ will ultimately be proved equivalent to each other, they are not equivalent to $\mathcal{F} \to \mathcal{G}$. In fact, no direct comparison is possible, since $\mathcal{F} \to \mathcal{G}$ is a statement in the object language. We can modify that statement slightly if we wish to make comparisons: Each of the four expressions

$\mathcal{F} \vdash \mathcal{G}$, $\mathcal{F} \vDash \mathcal{G}$, $\vdash (\mathcal{F} \to \mathcal{G})$, $\vDash (\mathcal{F} \to \mathcal{G})$

is a statement in the metalanguage. The first two are equivalent to each other, and the last two are equivalent to each other. In propositional logic, all four statements are equivalent to each other. In predicate logic, the last two statements are slightly stronger than the first two statements; this will be discussed further in 14.38 and thereafter.

14.15. The kind of logic used most often in the literature is first-order logic; it is also known as predicate logic or the predicate calculus. To be precise, we may subdivide a theory into these ingredients:

A first-order language includes an alphabet of symbols (punctuation symbols, symbols for individuals, symbols for operations, and quantifiers) and grammatical rules for forming those symbols into formulas. The specification of the symbols and rules is understood to include a specification of the arities of the operation symbols, as explained in 14.18 below. First-order language is discussed in further detail in the next subchapter.

A first-order logic includes the language, plus rules of inference and logical axioms. It may also be viewed as including the resulting theorems, i.e., the syntactic and semantic consequences of those rules and axioms. The rules of inference and logical axioms are discussed in the subchapter which begins with 14.25, and the resulting theorems are discussed in the subchapters after that.

A first-order theory includes the logic, plus extralogical axioms. It may also be viewed as including the resulting theorems. Some examples of extralogical axioms are given in 14.27.

Some readers may wish to glance ahead to 14.24, where we consider propositional logic, a special case that has fewer ingredients.

INGREDIENTS OF FIRST-ORDER LANGUAGE

We shall now list the ingredients. All of the symbols are understood as meaningless characters of a meaningless alphabet; they will only take on meaning when we consider an interpretation or quasi-interpretation (as in 14.47).

14.16. Punctuation symbols. These are parentheses for grouping, to avoid ambiguity, and commas for delimiting items in a list. It is possible to give precise rules for the use of parentheses and commas, but we shall omit the details.
14.17. Symbols for individuals. (These are omitted in propositional logic; see 14.24.) In predicate logic, there are three types of symbols for individuals:

• individual constant symbols, denoted in the discussions below by $a, b, c, \ldots$ or $a_1, a_2, a_3, \ldots$ or $a, a', a'', \ldots$;
• individual free variable symbols, denoted in the discussions below by $x, y, z, \ldots$ or $v_1, v_2, v_3, \ldots$ or $v, v', v'', \ldots$;
• individual bound variable symbols, denoted in the discussions below by $\xi, \eta, \zeta, \ldots$ or $\xi_1, \xi_2, \xi_3, \ldots$ or $\xi, \xi', \xi'', \ldots$.

(Actually, constant symbols may be dispensed with as a separate class of symbols, since they may be viewed as function symbols of arity 0; see 14.18 below. In most texts on logic, the free and bound variables are taken from the same set of symbols; this is discussed further in 14.20. Nevertheless, we prefer to use three separate sets of symbols.)

The sets of constants and variables are countably infinite in most applications, but these sets could be larger or smaller; in some texts either the constant symbols or the variable symbols are omitted altogether. Practical, everyday mathematics uses only countably many symbols; for instance, although there are uncountably many real numbers, we have no way of actually writing down distinct representations for most of those numbers. It is not even humanly possible to write down a countably infinite collection of symbols; that would require more time than any mere mortal has. However, for some theoretical purposes it can be useful to conceptualize and investigate a language with uncountably many symbols; e.g., we could say "let L be a language that includes a constant symbol $c_r$ for each real number $r$." We can talk about the $c_r$'s in the abstract, even if we can't write them all down concretely.

Hereafter, we shall assume that the set of free variables and the set of bound variables are both empty (as in propositional logic) or both infinite (as in ordinary mathematics). The case of finitely many variables turns out to be technically different and difficult, and will not be considered in this chapter. That case is manageable for elementary results but becomes difficult starting in 14.41; for simplicity of exposition we shall exclude that case from the outset.

One difficulty with that case can be explained roughly as follows: Reasoning in formal logic (or in other parts of mathematics, for that matter) involves substitutions, usually replacing all occurrences of one free variable with copies of some term whose free variables are not already in use. A single computation may involve many substitutions and thus many free variables. It will involve only a finite number of free variables, but in general we do not know in advance how large or small that finite number will be; the number may vary from one computation to another, and in general there is no finite upper bound for the number of variables needed for a computation. If our language has only finitely
many free variables, i.e., if some particular finite number is specified in advance, then we may run out of variables before some computations are completed. On the other hand, if we have infinitely many free variables, we can complete any computation and still have plenty of free variables left over.

14.18. Symbols for operations. We have three types of operation symbols, listed below. Each operation symbol has an arity, or rank, i.e., an associated nonnegative integer that specifies how many arguments each of these symbols should be followed by. For instance, if $f$ has arity 4, then we may form expressions such as $f(w, x, y, z)$. The precise rules for forming such expressions are given in 14.22 and 14.23. Although interpretations are not part of the formal language, a preview of typical interpretations may make the formal language easier to understand, so we include a few examples of interpretations here:

(i) Function symbols, here denoted $f$, $g$, $h$, etc. (These are omitted in propositional logic; see 14.24.)

A function symbol with arity 0 is a symbol that gets interpreted as a constant, e.g., the symbol "3" or "$\sqrt{5}$." For instance, we might use $\pi$ for a symbol with arity 0.

Examples in arithmetic or analysis. We might use the function symbols $+$, $-$, $\cdot$, $/$, all with arity 2, and the function symbols $\cos$ and $-$ with arity 1.

Examples in set theory. We might use the function symbols $\cap$, $\cup$ with arity 2, and $\complement$ with arity 1.

Examples in group theory. A character such as $\circ$ or $\square$ might be used as a function symbol of arity 2.

Remark. In ordinary mathematics, the character "$-$" represents both the binary operator of subtraction and the unary operator of additive inverse, but those are actually different operators, and for purposes of logic it would be best to represent them with different characters. Interestingly, these two operators are represented by different keys on some recent handheld electronic calculators, a source of confusion for mathematicians who grew up using one character for the two operations.

It is convenient to write $x + y$ instead of $+(x, y)$. Analogous modifications also apply for other commonly used binary operation symbols, such as $-$, $\cdot$, etc. The abstract discussions of expressions $f(x, y)$ will apply to expressions $x + y$ with obvious modifications.

(ii) Relation (or predicate) symbols, here denoted $P$, $Q$, $R$, etc. (In propositional logic, these occur only with arity 0, and are then called primitive proposition symbols; see 14.24.)

Examples. Some common relations of arity 2 are $<$, $\le$, $>$, $\ge$, $=$, and $\in$. Many other meanings are possible for relation symbols; for instance, in arithmetic, $R(x, y)$ might have the interpretation "$x$ is a divisor of $y$." An example of a relation with arity 1 is "$x$ is a prime number." A relation with arity 0 is just a statement that does not mention any variables.

It is actually possible to dispense with function symbols, by viewing each function of arity $n$ as a relation with arity $n + 1$. For instance, the equation $z = x + y$ determines a function $z = f(x, y)$ with arity 2, but it also determines a relation $R(x, y, z)$ with arity 3.
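The role of arities can be made concrete with a small check that every symbol in an expression is followed by the right number of arguments. The Python sketch below is an illustration only: the table `ARITY`, the chosen symbol names (including a separate name `neg` for the unary minus, in the spirit of the remark about subtraction and additive inverse), and the function `respects_arities` are assumptions of the sketch, not notation from the text.

```python
# Hypothetical signature: each function symbol is listed with its arity.
ARITY = {"+": 2, "-": 2, "neg": 1, "cos": 1, "pi": 0}

# Hypothetical stock of free variable symbols.
VARIABLES = ("x", "y", "z")


def respects_arities(expr, arity=ARITY, variables=VARIABLES):
    """Check that every symbol in a nested-tuple expression has the right argument count.

    An expression is either a string (a variable, or a symbol of arity 0)
    or a non-empty tuple (symbol, argument_1, ..., argument_n).
    """
    if isinstance(expr, str):
        return expr in variables or arity.get(expr) == 0
    symbol, *args = expr
    if arity.get(symbol) != len(args):
        return False
    return all(respects_arities(a, arity, variables) for a in args)


# x + cos(pi) is acceptable; cos applied to two arguments is not.
good = respects_arities(("+", "x", ("cos", "pi")))
bad = respects_arities(("cos", "x", "y"))
```

Nothing in this check assigns a meaning to `+` or `cos`; it only governs how the symbols may be joined into strings.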
(iii) Logical connective symbols. The precise choice of logical connective symbols may vary slightly from one exposition to another. The ones we shall use are:

$\neg$ (arity 1): not, negation
$\Cup$ (arity 2): or, disjunction
$\Cap$ (arity 2): and, conjunction
$\to$ (arity 2): implies, implication

Some mathematicians use additional connectives, e.g., the connective $\leftrightarrow$ (iff) or the connective $|$ (the Sheffer stroke). Also, some mathematicians prefer to define some of the connectives in terms of others; e.g., the connective $\Cup$ may be defined by the equation $\mathcal{A} \Cup \mathcal{B} = (\neg\mathcal{A}) \to \mathcal{B}$. However, we prefer to begin with unrelated symbols and then find relationships as a consequence of axioms. In fact, the symbols $\neg$ and $\Cup$ have slightly different meanings in intuitionist and classical logic, both of which are introduced in the following pages.

The notations vary slightly. For instance, among some mathematicians, "not," "or," "and," and "implies" may be written with other symbols: "or" may be written instead as $\vee$ or $\cup$, "and" may be written instead as $\wedge$ or $\cap$, and "implies" may be written instead as $\Rightarrow$ or $\supset$. We have chosen our notation in this book so that different symbols are used in logics ($\Cup$, $\Cap$), in lattices ($\vee$, $\wedge$), and in algebras of sets ($\cup$, $\cap$). This may reduce some confusion when two of these different kinds of structures must interact; see especially 14.27, 14.32, and 14.38.

It should be understood that meanings are not yet attached to the symbols, not even to familiar symbols such as $\neg$, $+$, $=$. We may call $\neg$ the "negation" or call $+$ the "plus sign" to make them easier to read aloud and to lend some intuition about what this is all leading up to, but we do not yet associate these symbols with their usual meanings or any other meanings. In the formal language, these symbols are merely viewed as meaningless symbols, with arities assigned to grammatically govern the joining of these meaningless symbols into meaningless strings. Meanings will be attached later, when we consider interpretations in 14.47.

In the formal theory, our meaningless symbols may also be accompanied by some axioms. The logical connective symbols are governed by the logical axioms (see 14.25). Also, function symbols may be governed by extralogical axioms as in 14.27, and relation symbols may be governed by extralogical axioms as in 14.27. No other meaning is attached to any of these symbols in the formal theory.

14.19. Quantifiers. There are two kinds of quantifiers: $\forall\xi$, the universal quantifier, usually read "for each $\xi$," and $\exists\xi$, the existential quantifier, usually read "there exists $\xi$ such that."
Here $\xi$ is a bound variable; any other bound variable may be used in the same fashion. (In propositional logic there are no variables and thus no quantifiers; see 14.24.)

We caution that the symbols $\forall$ and $\exists$ occasionally have meanings slightly different from "for each" and "there exists;" see 14.47.j. Until we study their interpretations in 14.47, the symbols $\forall$ and $\exists$ should be viewed as not having any meaning at all; they are simply meaningless symbols whose use is governed by grammatical rules and inference rules listed in 14.23(iii) and 14.26.

The quantifier $\forall$ is commonly read as "for all" in the mathematical literature, but we prefer to read it as "for each." In common English, "for all" suggests that the objects are perhaps being treated all in the same fashion. The customary mathematical meaning of $\forall$ is closer to "for each," which emphasizes that the objects under consideration can all be treated separately, one by one, perhaps with each treated differently.

We emphasize that in a first-order language, a quantifier is understood to act only on an individual variable. Thus, it is possible to say "for each individual $\xi$," but grammatically it is not permitted to say "for each formula $\mathcal{F}$" or "for each class $S$ of individuals." Those expressions are permitted in higher-order languages, i.e., languages of second or third order, etc.; we shall not investigate such languages in this book.

14.20. Discussion of bindings. Some popular expositions of logic are Mendelson [1964] and Hamilton [1978]; those textbooks have been used widely and their treatment can now be considered "conventional" or "customary." Our own treatment will follow Rasiowa and Sikorski [1963], which is unconventional in some minor respects. For instance, the Rasiowa-Sikorski treatment uses fewer definitions of symbols and more axioms governing the use of undefined symbols. A more important difference is in the use of bound and free variables:

• In conventional treatments such as Mendelson [1964] or Hamilton [1978], the same symbols are used for free variables and bound variables. The rule for incorporating quantifiers into formulas is trivial: If $\mathcal{A}$ is any formula and $x$ is any variable, then $\forall x\,\mathcal{A}$ and $\exists x\,\mathcal{A}$ are formulas, regardless of how $x$ is already being used in the formula $\mathcal{A}$ or elsewhere. One defines whether a variable symbol $x$ is bound or free according to how and where it appears in a formula. The definitions of bound and free variables and the rules for substitution are (in this author's opinion) rather complicated and nonintuitive. The definitions involve the "scope" of a quantifier and the rather convoluted notion of "a term $t$ that is free for the variable $x$ in the formula $\mathcal{F}$."

• In Rasiowa and Sikorski [1963], two disjoint sets of symbols are used. The rule for defining free and bound variables is trivial: Before we even begin to think about how to make formulas, we agree in advance which symbols ($x, y, z, \ldots$) will be free variables and which symbols ($\xi, \eta, \zeta, \ldots$) will be bound variables. The rules for incorporating those variables into formulas (described in 14.23) and for making substitutions (described in 14.26) are not trivial, but they are not particularly complicated.

To motivate either approach, we shall now discuss bindings in general.
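The contrast between the two conventions can be seen in a few lines of Python. This is a rough sketch under simplifying assumptions: formulas are nested tuples, atomic formulas take only variables as arguments (no function symbols), and the names `free_variables`, `FREE_SYMBOLS`, `BOUND_SYMBOLS`, and `is_free_symbol` are hypothetical. Under the conventional approach, whether a symbol is free depends on where it occurs, so a recursive computation is needed; under the two-alphabet approach, it is a fixed property of the symbol.

```python
def free_variables(formula):
    """Conventional approach: compute free variables from the structure of a formula.

    Encodings assumed by this sketch:
      ("atom", "P", "x", "y")      an atomic formula P(x, y)
      ("forall", "x", body)        a quantified formula (likewise "exists")
      ("not", A), ("or", A, B)     connectives (likewise "and", "implies")
    """
    tag = formula[0]
    if tag == "atom":
        return set(formula[2:])
    if tag in ("forall", "exists"):
        # The quantified variable is bound within the body.
        return free_variables(formula[2]) - {formula[1]}
    return set().union(*(free_variables(sub) for sub in formula[1:]))


# Two-alphabet approach: the status of a symbol is agreed upon in advance.
FREE_SYMBOLS = {"x", "y", "z"}
BOUND_SYMBOLS = {"xi", "eta", "zeta"}


def is_free_symbol(symbol):
    """Two-alphabet approach: no formula needs to be inspected."""
    return symbol in FREE_SYMBOLS


# In Q(x, y) or (forall x R(x, z)), the symbol x occurs both free and bound.
mixed = ("or", ("atom", "Q", "x", "y"), ("forall", "x", ("atom", "R", "x", "z")))
mixed_free = free_variables(mixed)
```

In the example, `x` is reported as free for the whole formula even though it is bound in the second half, which is exactly the kind of mixed usage that the two-alphabet convention rules out.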
In ordinary mathematics (i.e., mathematics outside of logic), bindings are generated by certain operators such as $\int$, $\sum$, $\prod$, etc. For instance, the equation $f(x) = \int_0^x \xi^2\,d\xi$ makes sense whenever $x$ is a real number. In this equation, $x$ is a free variable, and $\xi$ is a bound or dummy variable (also sometimes known as an apparent variable). The function $f$ is a function of $x$; it does not really involve $\xi$. In fact, that function can also be represented without any dummy variables: $f(x) = x^3/3$. However, dummy variables are unavoidable for certain other functions; for instance, it is well known (but not easy to prove) that the function $g(x) = \int_0^x \exp(\xi^2)\,d\xi$ cannot be represented in terms of the classical elementary functions (algebraic expressions, trigonometric functions, exponentiation, logarithms, and compositions of such).

In the paragraph above, we have followed the typographical convention of Rasiowa and Sikorski, using different sorts of letters for free variables ($x$, $y$, $z$, etc.) and bound variables ($\xi$, $\eta$, $\zeta$, etc.). But in the wider literature, that convention generally is not observed, and any letter can be used for either type of variable. For instance, the function $f$ described in the preceding paragraph could be defined as easily by the equation $f(\xi) = \int_0^\xi x^2\,dx$. In this respect, the Mendelson/Hamilton approach follows the convention of "ordinary" mathematics (i.e., outside of formal logic). However, in other respects the Mendelson/Hamilton approach differs from the conventions of ordinary mathematics, as we shall now describe.

In the equation $f(x) = \int_0^x \xi^2\,d\xi$, we can replace $\xi$ by nearly any other letter. All of the expressions

$\int_0^x \xi^2\,d\xi$, $\int_0^x \eta^2\,d\eta$, $\int_0^x \zeta^2\,d\zeta$, $\int_0^x w^2\,dw$

represent exactly the same function $f(x)$. In some sense, $\xi$ is not really a "variable" at all; it is just a "placeholder," and the place can be held just as well by nearly any other letter. There is one exception: We should not replace $\xi$ with $x$ itself. Polite mathematicians prefer not to write $f(x) = \int_0^x x^2\,dx$, since that equation uses the same letter $x$ for two different purposes, as a free variable and a bound variable. Admittedly, that type of expression can be found in some physics or engineering books; it is interpreted to mean the same thing as $f(x) = \int_0^x \xi^2\,d\xi$, but mathematicians frown upon such constructions. Likewise, an expression such as

$g(x, y) = (x + y)^2 + \int_0^x \exp(x^2)\,dx$

will make any well-bred mathematician uncomfortable, but we know that what is probably meant is

$g(x, y) = (x + y)^2 + \int_0^x \exp(\xi^2)\,d\xi$.

Analogous beasts appear in conventional logic books, but with little or no stigma attached. In the formula
(*) $Q(x, y) \Cup (\forall x\,(R(x, z)))$

the variable $x$ has one free occurrence and two bound occurrences. (The "$x$" immediately after the $\forall$ is one of the bound occurrences.) Such a distasteful formula is not absolutely necessary for proofs, since $\forall x\,(R(x, z))$ is in most respects equivalent to $\forall w\,(R(w, z))$ (see 14.42). Thus we can replace (*) with the formula $Q(x, y) \Cup (\forall w\,(R(w, z)))$, which does not mix free and bound occurrences of one symbol.

An integral formula such as $g(u) = \int_0^1 \int_0^x x^3 u\,dx\,dx$ has no clear meaning in ordinary mathematics. Nevertheless, analogous formulas appear often in logic; one such formula is

(**) $\exists x\,(\forall x\,(P(x, u)))$

which has one variable bound twice. Such a formula may seem unnatural, since it has no analogue outside of formal logic; however, it may be helpful to view (**) as having the same meaning as $\exists x\,(\forall w\,(P(w, u)))$, an expression with no double bindings. (An analogous interpretation would make $\int_0^1 \int_0^x x^3 u\,dx\,dx$ equal to $\int_0^1 \int_0^x w^3 u\,dw\,dx$.) Again, such beasts are not really necessary, since $\forall x\,P(x, u)$ is in most respects equivalent to $\forall w\,P(w, u)$ (see 14.42).

Thus, in any explanation of logic, it is necessary to either (i) prohibit nonintuitive expressions such as (*) and (**), or (ii) provide rules for dealing with such expressions. Rasiowa and Sikorski [1963] have taken option (i), and so shall we in this book. Conventional books such as Mendelson [1964] and Hamilton [1978] have followed option (ii), but the rules are necessarily rather complicated. Since the nonintuitive expressions can always be replaced by more acceptable ones anyway, the difference between options (i) and (ii) has only a superficial or cosmetic effect; it has no effect on deeper results discussed later in this chapter, such as the Completeness Principle.

A word of caution: Even the Rasiowa-Sikorski approach is not entirely trivial. Among other things, it permits expressions such as $(\exists\xi\,P(\xi)) \Cup (\forall\xi\,Q(\xi))$. The $\xi$'s in the first half of this expression are unrelated to the $\xi$'s in the second half of this expression. Some confusion might be avoided if we replace this formula with the equivalent formula $(\exists\xi\,P(\xi)) \Cup (\forall\eta\,Q(\eta))$, as explained in 14.42.

Actually, variables could be dispensed with altogether; books on combinatory logic such as Hindley, Lercher and Seldin [1972] show that everything can be expressed in terms of functions. That approach will not be followed in this book.

14.21. Substitution notation. Throughout the discussions in the next few pages, we shall frequently use this notation: Let $x$ be a free variable symbol. Let $\mathcal{A}(x)$ be a finite string of symbols in which $x$ may occur 0 or more times, and in which other free or bound variables may occur 0 or more times. Let $\alpha$ be any finite string of symbols. Then $\mathcal{A}(\alpha)$ will denote the string of symbols obtained from $\mathcal{A}(x)$ by replacing each occurrence of $x$ (if there are any) with a copy of the string $\alpha$. Of course, if $x$ does not appear in the string $\mathcal{A}(x)$, then $\mathcal{A}(\alpha)$ is identical to $\mathcal{A}(x)$.

14.22. Grammatical rules, part 1. In a first-order language, terms are certain finite strings of symbols formed recursively by these two rules:
(i) Any constant symbol or free variable symbol is a term.

(ii) If $f$ is an $n$-ary function symbol and $t_1, t_2, \ldots, t_n$ are terms, then the expression $f(t_1, t_2, \ldots, t_n)$ is a term.

There are no other terms besides those formed via these rules.

Observation: By our definition, no bound variables appear in terms.

Example. Consider a language in which 2, 5, 6 are among the constant symbols, $x$ and $y$ are among the variable symbols, $\sqrt{\ }$ and $\cos$ are function symbols of arity 1, and $+$, $-$, $\cdot$ are function symbols of arity 2. Then the string of symbols $(5 \cdot x) + (\sqrt{((6 \cdot y) - (\cos(2)))})$ is a term, written more commonly as $5x + \sqrt{6y - \cos 2}$. When it is given its usual interpretation involving real numbers, then that string of symbols is a real-valued function of two real variables.

Remark. Condition (ii) is not self-referential, i.e., it does not involve any circular reasoning that leads to a contradiction. Indeed, we may classify strings of symbols according to their length (i.e., how many symbols appear in a string) or their depth (i.e., how many times we have functions nested within functions). Then the construction in condition (ii) always forms longer terms from shorter ones or forms deeper terms from shallower ones. To prove a statement about all the terms of a language, it is often possible to proceed by induction on the length or depth of the terms.

In the discussions below, terms will generally be represented by the letters $t, t_1, t_2, \ldots$ and $s, s_1, s_2$, etc. However, it should be understood that these letters are not actually symbols making up a part of our formal language (the inner system). Rather, these letters are metavariables, i.e., they are part of the metalanguage; they are informal conventions adopted for our discussion in the outer system. The precise expression "let $t$ be a term" is an abbreviation for the imprecise and unwieldy expression "let us consider any term, such as $f(x_1, g(c_1, c_2), x_2, h(x_3, c_3, c_4))$."

14.23. Grammatical rules, part 2. Certain finite strings of symbols are known as formulas (or, in some books, well-formed formulas, or wffs). The definitions are recursive:

(i) An atomic formula (or atom) is an expression of the form $P(t_1, t_2, \ldots, t_n)$, where $P$ is an $n$-ary relation symbol and $t_1, t_2, \ldots, t_n$ are terms. It is a formula. We permit $n = 0$. Thus a primitive proposition symbol (i.e., relation symbol of arity 0) is an atomic formula. (In propositional logic, this is the only type of atomic formula, since the only relation symbols we have in propositional logic are those of arity 0; see 14.24.)

(ii) If $\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_n$ are formulas and $\star$ is an $n$-ary logical connective symbol, then $\star(\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_n)$ is a formula. Since we will only use a few connectives, this rule can be restated as: If $\mathcal{A}_1$ and $\mathcal{A}_2$ are formulas, then $(\neg\mathcal{A}_1)$, $(\mathcal{A}_1 \Cup \mathcal{A}_2)$, $(\mathcal{A}_1 \Cap \mathcal{A}_2)$, $(\mathcal{A}_1 \to \mathcal{A}_2)$ are formulas. We may omit the parentheses when no confusion is likely.

(iii) Suppose $\mathcal{A}(x)$ is a formula in which the bound variable $\xi$ does not occur. Apply the substitution notation of 14.21. Then $\forall\xi\,\mathcal{A}(\xi)$ and $\exists\xi\,\mathcal{A}(\xi)$ are also
formulas. (Of course, no formulas can be formed in this fashion if the sets of variable symbols are empty, as in propositional logic.)

There are no other formulas besides those formed recursively using the rules above.

Remarks. The beginner might be concerned that 14.23(ii) seems self-referential and thus might permit circular reasoning. However, there is no need for worry; no circularity is possible here. Each formula formed as in 14.23(ii) is longer (in number of symbols used) than the formulas from which it was formed. A statement about the set of all formulas can be proved by induction on the lengths of the formulas; this is a common method of proof.

In the discussions below, formulas will generally be represented by the letters $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, etc. However, it should be understood that these letters are not actually symbols making up a part of our formal language (the inner system). Rather, they are metavariables, adopted for our informal discussion in the outer system. The precise expression "let $\mathcal{F}$ be a formula" is an abbreviation for the imprecise and unwieldy expression "let us consider any formula, such as $(\neg P(x, y, z) \to ((R \Cap (S(x, z))) \Cup (Q(x, f(y, g(z))))))$."

14.24. An important special case of predicate logic is propositional logic (or propositional calculus, also known as sentential logic or sentential calculus). It is simpler than predicate logic, in that it has fewer ingredients. In propositional logic, there are no symbols for individuals, i.e., no constant individuals, no bound variables, and no free variables; and there are no function symbols. Consequently, the only relation symbols have arity 0, and there are no quantifiers and no terms. A typical formula in propositional logic is $(P \to (P \Cap (\neg P))) \to (\neg P)$; in contrast, a typical formula in predicate logic is $(\forall\xi\,P(\xi, x)) \Cup Q(x, f(y, z))$. Historically, it developed before other kinds of logic.

ASSUMPTIONS IN FIRST-ORDER LOGIC

In addition to its language (i.e., alphabet and grammatical rules, described above), a logical theory also involves certain assumptions. These are listed below. The literature contains many different axiomatizations of logic. We shall follow the development of Rasiowa and Sikorski [1963].

14.25. Logical axioms. Our first nine axioms determine what is known as positive logic.

(i) $(\mathcal{A} \to \mathcal{B}) \to ((\mathcal{B} \to \mathcal{C}) \to (\mathcal{A} \to \mathcal{C}))$. This is called the Syllogism Law.

(ii) $\mathcal{A} \to (\mathcal{A} \Cup \mathcal{B})$.

(iii) $\mathcal{B} \to (\mathcal{A} \Cup \mathcal{B})$.

(iv) $(\mathcal{A} \to \mathcal{C}) \to ((\mathcal{B} \to \mathcal{C}) \to ((\mathcal{A} \Cup \mathcal{B}) \to \mathcal{C}))$.

(v) $(\mathcal{A} \Cap \mathcal{B}) \to \mathcal{A}$.

(vi) $(\mathcal{A} \Cap \mathcal{B}) \to \mathcal{B}$.
(vii) $(\mathcal{C} \to \mathcal{A}) \to ((\mathcal{C} \to \mathcal{B}) \to (\mathcal{C} \to (\mathcal{A} \cap \mathcal{B})))$.
(viii) $(\mathcal{A} \to (\mathcal{B} \to \mathcal{C})) \to ((\mathcal{A} \cap \mathcal{B}) \to \mathcal{C})$. This is the Importation Law.
(ix) $((\mathcal{A} \cap \mathcal{B}) \to \mathcal{C}) \to (\mathcal{A} \to (\mathcal{B} \to \mathcal{C}))$. This is the Exportation Law.

The nine axioms above, plus the next two axioms below, determine what is commonly known as intuitionist logic. It was developed largely by Heyting and corresponds closely to intuitionist or constructivist thinking.

(x) $(\mathcal{A} \cap (\neg\mathcal{A})) \to \mathcal{B}$. This is the Duns Scotus Law.
(xi) $(\mathcal{A} \to (\mathcal{A} \cap (\neg\mathcal{A}))) \to (\neg\mathcal{A})$.

Finally, the eleven axioms above plus the twelfth axiom below determine what is known as classical logic, which is close to the way of thinking of most mathematicians.

(xii) $\mathcal{A} \cup (\neg\mathcal{A})$. This is the Law of the Excluded Middle, or tertium non datur.

The twelve rules listed above are actually axiom schemes; each of them represents infinitely many axioms. For instance, Axiom Scheme (ii) yields the axiom $P \to (P \cup Q)$, but it also yields the axiom
$$(P(x) \cap R(f(a,y))) \to ((P(x) \cap R(f(a,y))) \cup (Q \cap (\neg S)))$$
by using different formulas for $\mathcal{A}$ and $\mathcal{B}$. (Recall that $\mathcal{A}$ and $\mathcal{B}$ only belong to the metalanguage. They are informal shorthand abbreviations for expressions such as $P(x) \cap R(f(a,y))$, which belong to the object language.)

14.26. The rules of inference of our logical system are rules by which, from a given set of formulas, we may deduce (or infer) another formula. Here, "deduce" and "infer" merely mean "obtain." We are not necessarily obtaining "true" formulas; we are merely collecting "obtainable" formulas, and the rules of inference tell us which formulas are obtainable. Admittedly, the rules of inference are most often applied to formulas that are in some sense "true," but this is not always the case. For instance, in a proof by contradiction, we may assume the negation of the desired conclusion, and then use the rules of inference to try to infer various consequences of that and other assumptions, until a contradiction is reached, thereby proving the desired conclusion.

The rules of inference and logical axioms vary slightly from one exposition to another. Indeed, what one book calls a rule of inference is what another book may call a logical axiom. Our own rules, listed below, follow those of Rasiowa and Sikorski [1963]. These rules will be "justified" in 14.55(ii).

(R1) Modus ponens. Our first rule, modus ponens, also known as the rule of detachment, is present in all versions of logic. Suppose $\mathcal{A}$ and $\mathcal{B}$ are formulas. Then from $\mathcal{A}$ and $(\mathcal{A} \to \mathcal{B})$ we can infer $\mathcal{B}$.

The remaining rules below involve variables, and so they can be skipped in considering any logic that does not involve variables (such as propositional logic).

(R2) Rule of substitution. Let $x_1, x_2, \ldots, x_n$ be distinct free individual variables, and let $t_1, t_2, \ldots, t_n$ be (not necessarily distinct) terms. Let $\mathcal{A}(x_1, x_2, \ldots, x_n)$ be a formula
in which each of the free individual variables $x_1, x_2, \ldots, x_n$ occurs 0 or more times. Let $\mathcal{A}(t_1, t_2, \ldots, t_n)$ be the formula obtained from $\mathcal{A}(x_1, x_2, \ldots, x_n)$ by simultaneously replacing all occurrences of the $x_j$'s with copies of the corresponding $t_j$'s. Then from $\mathcal{A}(x_1, x_2, \ldots, x_n)$ we can infer $\mathcal{A}(t_1, t_2, \ldots, t_n)$.

In the four rules below, we follow the substitution notation of 14.21. Also, let $\mathcal{A}(x)$ be a formula in which the bound variable $\xi$ does not occur; let $\mathcal{B}$ be any formula. Then:

(R3) Introduction of existential quantifiers. Suppose $\mathcal{B}$ contains no occurrence of $x$. Then from $\mathcal{A}(x) \to \mathcal{B}$ we can infer $(\exists\xi\, \mathcal{A}(\xi)) \to \mathcal{B}$.
(R4) Introduction of universal quantifiers. Suppose $\mathcal{B}$ contains no occurrence of $x$. Then from $\mathcal{B} \to \mathcal{A}(x)$ we can infer $\mathcal{B} \to (\forall\xi\, \mathcal{A}(\xi))$.
(R5) Elimination of existential quantifiers. (We make no assumption about whether $x$ appears in $\mathcal{B}$.) From $(\exists\xi\, \mathcal{A}(\xi)) \to \mathcal{B}$ we can infer $\mathcal{A}(x) \to \mathcal{B}$.
(R6) Elimination of universal quantifiers. (We make no assumption about whether $x$ appears in $\mathcal{B}$.) From $\mathcal{B} \to (\forall\xi\, \mathcal{A}(\xi))$ we can infer $\mathcal{B} \to \mathcal{A}(x)$.

The rules of inference listed above will be assumed, i.e., taken as hypotheses in our reasoning about reasoning. Some auxiliary rules of inference will be proved as consequences of modus ponens and the logical axioms in 14.30.

The rules of inference form the basis for our syntactic implications, and in fact the rules of inference are our most basic examples of syntactic implications. The rule of modus ponens says that for certain formulas $\mathcal{F}, \mathcal{G}, \mathcal{H}$ we have $\{\mathcal{F}, \mathcal{G}\} \vdash \mathcal{H}$; the other five rules of inference are of the form $\mathcal{F} \vdash \mathcal{G}$.

14.27. Besides the logical axioms shared by essentially all first-order theories, determining our reasoning methods, a particular first-order theory may have additional, specialized axioms, determining the mathematical objects that we wish to study with that reasoning process. We refer to these as extralogical or nonlogical axioms. Below are some examples. It should be understood that these examples are not part of the general explanation of predicate logic developed in this chapter, i.e., we shall not assume these axioms later in this chapter.

Examples of extralogical axioms.

a. In many first-order systems, a special role is played by a relation symbol of rank two, called equality or equals. Most often this symbol is denoted by "=", though in some expositions it may instead be denoted by some other symbol, to emphasize that it is a symbol with precisely specified properties under formal study rather than just ordinary informal equality. It is equipped with several axioms; the precise list of axioms varies from one exposition to another. Here is a typical set of axioms for equality:

(i) $x = x$.
(ii) $(x = y) \to (y = x)$.
(iii) $((x = y) \cap (y = z)) \to (x = z)$.
(iv) Let $s_1, s_2, t$ be terms, and let $y$ be a free variable. For $j = 1, 2$, let $t_j$ be the term obtained from $t$ by replacing each occurrence of the variable $y$ with the term $s_j$. Then $(s_1 = s_2) \to (t_1 = t_2)$ is an axiom.
(v) With notation as in 14.21, if $s_1, s_2$ are terms, then $(s_1 = s_2) \to (\mathcal{A}(s_1) \to \mathcal{A}(s_2))$ is an axiom.

The first three of these axioms say that equality is an "equivalence relation," in a sense similar to that in 3.8 and 3.10. The last two axioms say that "equals can be substituted for equals." Actually, there is some redundancy in our formulation; for instance, our axioms (ii) and (iii) actually follow from axioms (i), (iv), and (v). (See Hamilton [1978].)

A logical system that includes such axioms is generally called predicate logic with equality. In this book we shall consider the axioms of equality to be extralogical axioms, since they do not occur in all first-order systems. However, we caution that some mathematicians are concerned solely with first-order systems with equality, and some of these mathematicians find it convenient to designate the axioms of equality as "logical axioms," which means that those axioms may sometimes get used without being mentioned.

b. For the theory of preordered sets, two of the binary relation symbols are $=$ and $\preccurlyeq$. Axioms used are the axioms for equality (described above) plus these axioms for the ordering:

(reflexive) $x \preccurlyeq x$,
(transitive) $((x \preccurlyeq y) \cap (y \preccurlyeq z)) \to (x \preccurlyeq z)$.

For the theory of partially ordered sets, we add this axiom:

(antisymmetric) $((x \preccurlyeq y) \cap (y \preccurlyeq x)) \to (x = y)$.

All of the axioms above are of first order, i.e., they deal only with individual members of a preordered or partially ordered set $D$, not with subsets of that set. In contrast, the Dedekind completeness of a poset $D$ is a statement requiring a higher-order language. Recall that $D$ is Dedekind complete if each nonempty subset that is bounded above has a least upper bound. In symbols, that condition is: for each $S \subseteq D$,
$$\big((\neg(S = \varnothing)) \cap (\exists x\, \forall y\, (y \in S \to y \preccurlyeq x))\big) \to \big(\exists u\, \forall x\, ((u \preccurlyeq x) \leftrightarrow (\forall y\, (y \in S \to y \preccurlyeq x)))\big),$$
where $\mathcal{A} \leftrightarrow \mathcal{B}$ is an abbreviation for $(\mathcal{A} \to \mathcal{B}) \cap (\mathcal{B} \to \mathcal{A})$. The condition begins with "for each set $S$ that is a subset of $D$," thus it involves a quantifier that ranges over subsets of $D$. There are other, equivalent ways to formulate the condition of Dedekind completeness of $D$, but none of them can be expressed in first-order language over $D$. Contrast this with 14.27.d, below.

c. For the theory of monoids, one of the binary relations is $=$, one of the binary functions is $\circ$, and some nullary function (i.e., function of arity 0) is denoted by $i$. Axioms used are the axioms of equality (described above) plus these axioms:

(associative) $(x \circ y) \circ z = x \circ (y \circ z)$,
(right identity) $x \circ i = x$,
(left identity) $i \circ x = x$.
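The contrast between the first-order poset axioms and Dedekind completeness can be made concrete on a finite structure. The following is only an illustrative sketch, not part of the formal development: the function names, the divisibility ordering, and the two sample sets are assumptions chosen for the example. The first-order axioms are checked by loops over individual elements, whereas the completeness check needs an outer loop over subsets, mirroring the quantifier "for each $S \subseteq D$."

```python
from itertools import chain, combinations

def is_reflexive(D, leq):
    # first-order: quantifies over individuals only
    return all(leq(x, x) for x in D)

def is_transitive(D, leq):
    return all(leq(x, z)
               for x in D for y in D for z in D
               if leq(x, y) and leq(y, z))

def is_antisymmetric(D, leq):
    return all(x == y for x in D for y in D if leq(x, y) and leq(y, x))

def nonempty_subsets(D):
    items = sorted(D)
    return chain.from_iterable(
        combinations(items, r) for r in range(1, len(items) + 1))

def is_dedekind_complete(D, leq):
    # higher-order: the outer loop ranges over subsets S of D
    for S in nonempty_subsets(D):
        def is_upper_bound(x):
            return all(leq(y, x) for y in S)
        bounded_above = any(is_upper_bound(x) for x in D)
        # literal transcription of: exists u, for all x,
        # (u <= x) <-> (x is an upper bound of S)
        has_least_upper_bound = any(
            all(leq(u, x) == is_upper_bound(x) for x in D) for u in D)
        if bounded_above and not has_least_upper_bound:
            return False
    return True

def divides(a, b):
    # hypothetical sample ordering on positive integers
    return b % a == 0

D1 = {1, 2, 3, 6}        # the divisors of 6
D2 = {1, 2, 3, 12, 18}   # {2, 3} has upper bounds 12 and 18, but no least one

print(is_dedekind_complete(D1, divides))
print(is_dedekind_complete(D2, divides))
```

Both sample sets satisfy the three first-order axioms under divisibility; only the subset-quantified check distinguishes them. For an infinite $D$ no such exhaustive loop is available, which is one informal way to see why the choice of language matters.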
Additional algebraic axioms can be used to determine the theory of other types of algebraic systems, e.g., groups, rings, etc.

d. In the language of set theory, a basic binary relation is $\in$ (membership). Other relations can be defined in terms of membership. For instance, $u \subseteq v$ means $(z \in u) \to (z \in v)$, and $u = v$ means $(u \subseteq v) \cap (v \subseteq u)$. In conventional (i.e., atomless) set theory, the individual elements $a, b, c, d, \ldots, x, y, z$, etc., that we discuss are intended to represent sets. The only undefined constant is $\varnothing$; all other constants are defined in terms of it, as in 5.44. Thus, 0 is an abbreviation for $\varnothing$, 1 is an abbreviation for $\{\varnothing\}$, 2 is an abbreviation for $\{\varnothing, \{\varnothing\}\}$, etc.

The most commonly used axioms of set theory are the ZF axioms listed in 1.47. To make ZF into a first-order theory we must view the Axiom of Comprehension and the Axiom of Replacement not as single axioms, but as axiom schemes. Each of these two schemes represents infinitely many different axioms. We have one Axiom of Comprehension for each property $P$ that can be formulated in the first-order language, and one Axiom of Replacement for each function $f$ that can be formulated in the first-order language. (See also the reinterpretation of these axioms indicated in 14.67.)

The language of set theory is extremely powerful; it is more expressive than any of the other languages mentioned above. As we remarked in 1.46, all familiar objects of mathematics can be expressed in this language. The integers can be built up using the Axiom of Infinity, the rational numbers can be built up using equivalence classes of pairs of integers, the real numbers can be built up using Dedekind cuts of rationals, etc. The language of set theory is sufficiently expressive for us to assert that a certain poset $X$ is Dedekind complete: We can describe the ordering as a subset of $X \times X$, and the subsets of $X$ are members of $\mathcal{P}(X)$. Contrast this with 14.27.b, above.

SOME SYNTACTIC RESULTS (PROPOSITIONAL LOGIC)

14.28. Remark. We now consider some consequences of the logical axioms and inference rules listed in 14.25 and 14.26. We begin with some results that do not mention variables or constants. Thus, these results will apply equally well to propositional logic or predicate logic. Moreover, these results will not require any rules of inference except modus ponens. In 14.39 we shall begin to consider results that do involve variables and constants.

14.29. Some basic syntactic theorems of positive logic.

(i) $\mathcal{A} \to \mathcal{A}$.
(ii) $\mathcal{A} \to (\mathcal{B} \to \mathcal{A})$.
(iii) $\mathcal{B} \to (\mathcal{A} \to \mathcal{A})$.
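(iv) $\mathcal{A} \to ((\mathcal{A} \to \mathcal{A}) \to \mathcal{A})$.
(v) $((\mathcal{A} \to \mathcal{B}) \cap \mathcal{A}) \to \mathcal{B}$.
(vi) $\mathcal{A} \to (\mathcal{B} \to (\mathcal{A} \cap \mathcal{B}))$.

These results will be proved for all formulas $\mathcal{A}$ and $\mathcal{B}$, using the axioms of positive logic (i.e., Axioms (i) through (ix) of 14.25).

Proofs. The formula
$$((((\mathcal{A} \cap \mathcal{A}) \to \mathcal{A}) \cap \mathcal{A}) \to \mathcal{A}) \to (((\mathcal{A} \cap \mathcal{A}) \to \mathcal{A}) \to (\mathcal{A} \to \mathcal{A}))$$
is an instance of Axiom (ix), and $(((\mathcal{A} \cap \mathcal{A}) \to \mathcal{A}) \cap \mathcal{A}) \to \mathcal{A}$ is an instance of Axiom (vi). Combine those, by modus ponens, to prove the formula $((\mathcal{A} \cap \mathcal{A}) \to \mathcal{A}) \to (\mathcal{A} \to \mathcal{A})$. Next, $(\mathcal{A} \cap \mathcal{A}) \to \mathcal{A}$ is another instance of Axiom (vi); combine that with the preceding formula to prove Theorem (i).

Theorem (ii) is immediate from Axioms (v) and (ix) via modus ponens (with the substitution $\mathcal{C} = \mathcal{A}$).

The formula $(\mathcal{A} \to \mathcal{A}) \to (\mathcal{B} \to (\mathcal{A} \to \mathcal{A}))$ is an instance of Theorem (ii). Combine it with Theorem (i), by modus ponens, to prove Theorem (iii).

Theorem (iv) is just an instance of Theorem (ii).

The formula $((\mathcal{A} \to \mathcal{B}) \to (\mathcal{A} \to \mathcal{B})) \to (((\mathcal{A} \to \mathcal{B}) \cap \mathcal{A}) \to \mathcal{B})$ is an instance of Axiom (viii), and the formula $(\mathcal{A} \to \mathcal{B}) \to (\mathcal{A} \to \mathcal{B})$ is an instance of Theorem (i). Combine these, by modus ponens, to prove Theorem (v).

The formula $((\mathcal{A} \cap \mathcal{B}) \to (\mathcal{A} \cap \mathcal{B})) \to (\mathcal{A} \to (\mathcal{B} \to (\mathcal{A} \cap \mathcal{B})))$ is an instance of Axiom (ix), and $(\mathcal{A} \cap \mathcal{B}) \to (\mathcal{A} \cap \mathcal{B})$ is an instance of Theorem (i). Combine these to prove Theorem (vi).

14.30. Additional rules of inference. We shall use the axioms of positive logic to prove:

(i) If $\mathcal{A}, \mathcal{B}, \mathcal{C}$ are some formulas such that $\mathcal{A} \to \mathcal{B}$ and $\mathcal{B} \to \mathcal{C}$ are syntactic theorems, then $\mathcal{A} \to \mathcal{C}$ is also a syntactic theorem.
(ii) $\mathcal{A} \cap \mathcal{B}$ is a syntactic theorem if and only if both $\mathcal{A}$ and $\mathcal{B}$ are syntactic theorems.
(iii) $(\mathcal{A} \to \mathcal{A}) \to \mathcal{A}$ is a syntactic theorem if and only if $\mathcal{A}$ is a syntactic theorem.
(iv) If $\mathcal{A} \to (\mathcal{B} \to \mathcal{C})$ and $\mathcal{A} \to \mathcal{B}$ are syntactic theorems, then $\mathcal{A} \to \mathcal{C}$ is a syntactic theorem.

These results will be used later in proofs.

Proof. We shall use not only Axioms (i) through (ix), but also some of the Theorems of the previous section. Rule (i) follows easily from the Syllogism Law (Axiom (i)) and modus ponens. For Rule (ii), observe that if $\mathcal{A} \cap \mathcal{B}$ is a syntactic theorem, then $\mathcal{A}$ and $\mathcal{B}$ are syntactic theorems by Axioms (v) and (vi) via modus ponens. Conversely, if $\mathcal{A}$ and $\mathcal{B}$ are syntactic theorems, then $\mathcal{A} \cap \mathcal{B}$ follows from Theorem (vi) and two applications of modus ponens.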
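The derivation of Theorem (i) of 14.29 is short enough to be replayed mechanically. The sketch below is a hedged illustration only: the tuple encoding of formulas and the helper names (`imp`, `conj`, `axiom_vi`, `axiom_ix`, `modus_ponens`, `derive_identity`) are assumptions made for this example, not notation from the text. Each axiom scheme becomes a function that returns one instance, and modus ponens refuses to fire unless the hypothesis matches exactly.

```python
def imp(x, y):
    """The formula x -> y, encoded as a tuple."""
    return ('->', x, y)

def conj(x, y):
    """The formula x ∩ y ("and"), encoded as a tuple."""
    return ('and', x, y)

def axiom_vi(a, b):
    """Instance of Axiom (vi): (a ∩ b) -> b."""
    return imp(conj(a, b), b)

def axiom_ix(a, b, c):
    """Instance of Axiom (ix), Exportation: ((a ∩ b) -> c) -> (a -> (b -> c))."""
    return imp(imp(conj(a, b), c), imp(a, imp(b, c)))

def modus_ponens(x, x_implies_y):
    """From x and (x -> y), return y; raise if the rule does not apply."""
    if not (isinstance(x_implies_y, tuple) and len(x_implies_y) == 3
            and x_implies_y[0] == '->' and x_implies_y[1] == x):
        raise ValueError("modus ponens does not apply")
    return x_implies_y[2]

def derive_identity(a):
    """Replay the proof of a -> a from Axioms (vi) and (ix)."""
    x = imp(conj(a, a), a)               # abbreviates (a ∩ a) -> a
    step1 = axiom_ix(x, a, a)            # ((x ∩ a) -> a) -> (x -> (a -> a))
    step2 = axiom_vi(x, a)               # (x ∩ a) -> a
    step3 = modus_ponens(step2, step1)   # x -> (a -> a)
    step4 = axiom_vi(a, a)               # (a ∩ a) -> a, i.e. x itself
    return modus_ponens(step4, step3)    # a -> a

print(derive_identity('P'))
```

Because the argument `a` may itself be any encoded formula, the same function yields every instance of the scheme $\mathcal{A} \to \mathcal{A}$, which is the sense in which the proof works "for all formulas."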
To prove Rule (iii): If $\mathcal{A}$ is a syntactic theorem, then from Theorem (iv) by modus ponens we know $(\mathcal{A} \to \mathcal{A}) \to \mathcal{A}$ is a syntactic theorem. Conversely, if $(\mathcal{A} \to \mathcal{A}) \to \mathcal{A}$ is a syntactic theorem, then from Theorem (i) by modus ponens we can conclude that $\mathcal{A}$ is also a syntactic theorem.

To prove Rule (iv): Note that
$$(\mathcal{A} \to (\mathcal{B} \to \mathcal{C})) \to ((\mathcal{A} \to \mathcal{B}) \to (\mathcal{A} \to ((\mathcal{B} \to \mathcal{C}) \cap \mathcal{B})))$$
is an instance of Axiom (vii). Combine it with the two given syntactic theorems, via modus ponens; thus we obtain $\mathcal{A} \to ((\mathcal{B} \to \mathcal{C}) \cap \mathcal{B})$ as a syntactic theorem. On the other hand, an instance of 14.29(v) gives us $((\mathcal{B} \to \mathcal{C}) \cap \mathcal{B}) \to \mathcal{C}$. Combine these two results, using 14.30(i); thus $\mathcal{A} \to \mathcal{C}$ is a syntactic theorem.

14.31. A set $\Sigma$ of formulas is syntactically inconsistent if we can use $\Sigma$ to deduce both $\mathcal{A}$ and $\neg\mathcal{A}$, for some formula $\mathcal{A}$. Note that we can then use $\Sigma$ to deduce any formula; that is clear from the Duns Scotus Law (Axiom (x) in 14.25). If $\Sigma$ is not syntactically inconsistent, then it is syntactically consistent.

A derivation is understood to involve only finitely many steps, and so it can only involve finitely many of the axioms. Therefore, a collection of formulas is syntactically consistent if and only if each finite subset of that collection is syntactically consistent. In other words, syntactic consistency of sets of formulas is a property with finite character, in the sense of 3.46.

14.32. Definition of the ordering of the language. Let $\mathbb{F}$ be the set of all formulas. We use the positive logic axioms from 14.25, plus whatever additional axioms we may choose. We now define two binary relations on $\mathbb{F}$ as follows: For formulas $\mathcal{A}$ and $\mathcal{B}$,

$\mathcal{A} \preccurlyeq \mathcal{B}$ will mean that the formula $\mathcal{A} \to \mathcal{B}$ is a syntactic theorem;
$\mathcal{A} \approx \mathcal{B}$ will mean that both $\mathcal{A} \to \mathcal{B}$ and $\mathcal{B} \to \mathcal{A}$ are syntactic theorems.

It should be emphasized that the relations $\preccurlyeq$ and $\approx$ are part of our metalanguage, not part of our object language (see 14.12.b). In other words, the expressions $\mathcal{A} \preccurlyeq \mathcal{B}$ and $\mathcal{A} \approx \mathcal{B}$ are not "formulas." It follows from Theorem (i) of 14.29 and Rule (i) of 14.30 that $\preccurlyeq$ is a preorder on $\mathbb{F}$, and $\approx$ is an equivalence relation.

Let $\mathbb{L} = (\mathbb{F}/\!\approx)$ be the set of equivalence classes. Let $[\ ] : \mathbb{F} \to \mathbb{L}$ be the quotient map; that is, $[\mathcal{A}]$ is the equivalence class containing the formula $\mathcal{A}$. We shall use equality ($=$) in its usual fashion as a relation between equivalence classes. Thus, $[\mathcal{A}] = [\mathcal{B}]$ means that $\mathcal{A}$ and $\mathcal{B}$ belong to the same equivalence class, i.e., it means that $\mathcal{A} \approx \mathcal{B}$. The preorder $\preccurlyeq$ on $\mathbb{F}$ determines a partial order on $\mathbb{L}$, which we shall also denote by $\preccurlyeq$. Thus $[\mathcal{A}] \preccurlyeq [\mathcal{B}]$ if and only if $(\mathcal{A} \to \mathcal{B})$ is a syntactic theorem. The set $\mathbb{L}$, equipped with the operations discussed below, is commonly known as the Lindenbaum algebra.
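For a small concrete picture of the Lindenbaum algebra, one can look at classical propositional logic with two primitive proposition symbols and no extralogical axioms. The sketch below rests on an assumption that is not established in this section: that in this setting a formula is a syntactic theorem exactly when it is a tautology (soundness and completeness of classical propositional logic). Under that assumption, each equivalence class can be identified with a truth table. All names in the code are hypothetical, and the computation is semantic; it illustrates the quotient but does not replace the syntactic definitions above.

```python
from itertools import product

# Assumption (not proved here): with classical axioms and no extralogical
# axioms, "A -> B is a syntactic theorem" coincides with "A -> B is a tautology".
ROWS = list(product((False, True), repeat=2))   # truth assignments to (P, Q)

def table(f):
    """Stand-in for the class [A]: the truth table of the formula A."""
    return tuple(f(p, q) for p, q in ROWS)

def leq(ta, tb):
    """[A] ≼ [B]: A -> B holds on every row."""
    return all((not a) or b for a, b in zip(ta, tb))

ALL_CLASSES = list(product((False, True), repeat=len(ROWS)))

P = table(lambda p, q: p)
Q = table(lambda p, q: q)
P_or_Q = table(lambda p, q: p or q)

# upper bounds of {[P], [Q]}, and the least one among them
upper_bounds = [t for t in ALL_CLASSES if leq(P, t) and leq(Q, t)]
least = [u for u in upper_bounds if all(leq(u, t) for t in upper_bounds)]

print(len(ALL_CLASSES))
print(least == [P_or_Q])
```

Two different formulas can land in the same class: for instance, `table(lambda p, q: not (not p and not q))` equals `P_or_Q`, so the two formulas represent the same element of the algebra.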
It should be emphasized that any axioms whatsoever (or no axioms at all) may be used in addition to Axioms (i)-(ix). Different choices of additional axioms yield different relations $\preccurlyeq$, $\approx$ and thus yield different Lindenbaum algebras. Throughout most of our discussions in this and the next chapter, we assume that some particular choice is made regarding the additional axioms, and thus we may speak of the Lindenbaum algebra.

14.33. Theorem. The Lindenbaum algebra $(\mathbb{L}, \preccurlyeq)$ defined above is, in fact, a relatively pseudocomplemented lattice (as defined in 13.24), with operations given by
$$[\mathcal{A}] \vee [\mathcal{B}] = [\mathcal{A} \cup \mathcal{B}], \qquad [\mathcal{A}] \wedge [\mathcal{B}] = [\mathcal{A} \cap \mathcal{B}], \qquad ([\mathcal{A}] \Rightarrow [\mathcal{B}]) = [\mathcal{A} \to \mathcal{B}], \qquad 1 = [\mathcal{A} \to \mathcal{A}]$$
for any formulas $\mathcal{A}, \mathcal{B}$. A formula $\mathcal{A}$ is a syntactic theorem if and only if it satisfies $[\mathcal{A}] = 1$.

If we assume the axioms of intuitionist logic, then $\mathbb{L}$ is a Heyting algebra, with $0 = [\mathcal{A} \cap (\neg\mathcal{A})]$ and $\complement[\mathcal{A}] = [\neg\mathcal{A}]$ for any formula $\mathcal{A}$. The Heyting algebra is degenerate (i.e., satisfying $0 = 1$) if and only if our set of axioms is syntactically inconsistent (i.e., there is some formula $\mathcal{A}$ such that $\mathcal{A} \cap (\neg\mathcal{A})$ is a syntactic theorem), in which case every formula is provable.

If we assume the axioms of classical logic, then $\mathbb{L}$ is a Boolean algebra. Its greatest member 1 is also equal to $[\mathcal{A} \cup (\neg\mathcal{A})]$ for any formula $\mathcal{A}$. The Boolean algebra $\mathbb{L}$ is more than just $\{0, 1\}$ if and only if at least one formula $\mathcal{F}$ is neither provable nor disprovable from the axioms.

Proof of theorem. (This argument follows the exposition of Rasiowa and Sikorski [1963].) We first show that any two-element subset of $(\mathbb{L}, \preccurlyeq)$ has a supremum. Let any two elements of $\mathbb{L}$ be given; then those two elements can be represented as $[\mathcal{A}]$ and $[\mathcal{B}]$ for some formulas $\mathcal{A}$ and $\mathcal{B}$ (which are not uniquely determined by the given elements of $\mathbb{L}$). From Axioms (ii) and (iii) we see that $[\mathcal{A} \cup \mathcal{B}]$ is an upper bound for the set $\{[\mathcal{A}], [\mathcal{B}]\}$ in the poset $(\mathbb{L}, \preccurlyeq)$. Is it least among the upper bounds? Let any other upper bound for $\{[\mathcal{A}], [\mathcal{B}]\}$ be given; say that upper bound can be represented by $[\mathcal{C}]$ for some formula $\mathcal{C}$. Then $\mathcal{A} \to \mathcal{C}$ and $\mathcal{B} \to \mathcal{C}$ are syntactic theorems, by our definition of $\preccurlyeq$. By Axiom (iv) via modus ponens, it follows that $(\mathcal{A} \cup \mathcal{B}) \to \mathcal{C}$ is also a syntactic theorem; thus $[\mathcal{A} \cup \mathcal{B}] \preccurlyeq [\mathcal{C}]$. Thus $[\mathcal{A} \cup \mathcal{B}]$ is indeed least among the upper bounds of $\{[\mathcal{A}], [\mathcal{B}]\}$. An analogous argument works for lower bounds, using Axioms (v), (vi), and (vii). Thus $\mathbb{L}$ is a lattice, with $[\mathcal{A}] \vee [\mathcal{B}] = [\mathcal{A} \cup \mathcal{B}]$ and $[\mathcal{A}] \wedge [\mathcal{B}] = [\mathcal{A} \cap \mathcal{B}]$.

Next we shall show that $([\mathcal{A}] \Rightarrow [\mathcal{B}]) = [\mathcal{A} \to \mathcal{B}]$ defines a relative pseudocomplementation operator. Let any formulas $\mathcal{A}, \mathcal{B}$ be given; we are to show that $[\mathcal{A} \to \mathcal{B}]$ is the largest $\lambda$ in $\mathbb{L}$ that satisfies $\lambda \wedge [\mathcal{A}] \preccurlyeq [\mathcal{B}]$. By Theorem (v) of 14.29 we know that $[\mathcal{A} \to \mathcal{B}]$ is one of the $\lambda$'s with that property. Is it the largest? Let any formula $\mathcal{F}$ satisfying $[\mathcal{F}] \wedge [\mathcal{A}] \preccurlyeq [\mathcal{B}]$ be given. Then $(\mathcal{F} \cap \mathcal{A}) \to \mathcal{B}$ is a syntactic theorem. The formula $((\mathcal{F} \cap \mathcal{A}) \to \mathcal{B}) \to (\mathcal{F} \to (\mathcal{A} \to \mathcal{B}))$ is also a syntactic theorem, as it is an instance of Axiom (ix). From those two syntactic theorems via modus ponens we deduce the syntactic theorem $\mathcal{F} \to (\mathcal{A} \to \mathcal{B})$. In other words, $\mathcal{F} \preccurlyeq (\mathcal{A} \to \mathcal{B})$, so $[\mathcal{A} \to \mathcal{B}]$ is indeed largest.
Moreover. By 14. B r o u w e r ' s T r i p l e N e g a t i o n Law. (~~~A)~ ( ~ A ) a n d ( . A ~ (~~A). ~ (~(A ~ ~)). By Axiom (x) we see that [Am (~A)] ~ [~] for any formulas A. ~ (~(A U ~)) and (~(A U ~)) ~ ((~A) R (~B)). (A) L a w of the E x c l u d e d Middle" A U (~A) (B) C o n v e r s e of t h e D o u b l e N e g a t i o n Law: (~~A) ~ A (D) ( ( = A ) ~ (=~B))~ (~B ~ A) . Hence L is a Heyting algebra. 14. e. (A ~ (~ ~ C)) ~ (~ ~ (A ~ e)). Thus L has a smallest element. for any formula A. This follows from the fact that the formulas correspond to identities that are satisfied in any Heyting algebra. all the formulas given by the following schemes are syntactic theorems. so L is a Boolean algebra. i. d. As in any relatively pseudocomplemented lattice. ((~A) R ( ~ ) ) h.~ A ) ) ~ (~A).370 Chapter 14: Logic and Intangibles That proves our claim about relative pseudocomplements. given by the rule 0 = [A ~ (~A)] for any formula A.A]. On the other hand. plus the ones listed in the section below. (A ~ ( . We have 0 = [A] A [. Some of the formulas below could be viewed as symbolic representations of the principle of proof by contradiction. that is.[A u (~A)] . adding any one of them to intuitionist logic yields classical logic. they are equivalent to each other. (A ~ ~B)~ ( ( ~ B ) ~ g. If we also assume Axiom (xii).25. I n t e r c h a n g e of H y p o t h e s e s . we also have ([A] =~ 0) = [A ~ (A ~ (~A))] ~ [~A] by Axiom (xi).~ A ) ~ (~~A).34. D o u b l e N e g a t i o n Law. we now know that the largest element of L is 1 = [A + A].~ ) ) ~ (~ ~ (~A)). The conclusion about inconsistency and degeneracy is now obvious. In the setting of intuitionist logic. Thus the pseudocomplement ([A] =~ 0 ) i s equal to [~A]. They are all derivable in classical logic.1. c. Now suppose our axioms include the axioms of intuitionist logic.30(iii) we know that A is a syntactic theorem if and only if 1 ~ [A]. ((~A)U ~ ) ~ (A ~ ~). (A ~ ( . In classical logic we have those formulas. 
if and only if 1 = [A]. Further consequences in intuitionistic logic. using just the intuitionist logical axioms. in the sense that they can neither be proved nor disproved syntactically. ~. f.a it follows that [~A] ~ ([A] =~ 0). 14. By 13. b. In any intuitionist logic. C o n t r a p o s i t i v e Law. in the sense that any one of them can be deduced from any of the others. then [A] V (C[A]) . a. the following formula schemes are undecidable. ((~A) U ( ~ ) ) (~A)). Some nonconstructive techniques of reasoning.35.
(E) ... for all formulas $\mathcal{A}$, ...

Proof. The equivalence of these conditions is just a restatement of the result of 13.28. To say that these conditions cannot be disproved in intuitionist logic is just to say that the axioms of classical logic are syntactically consistent; that will be established in the next chapter.

To show that these conditions cannot be proved in intuitionist logic, let $H$ be some particular Heyting algebra that is not a Boolean algebra. (An example of such is mentioned in 13.29.) Say the members of $H$ are $0, 1, a, b, c$, etc. Form a propositional calculus that has one primitive propositional symbol for each member of $H$; say the primitive propositional symbols are denoted $P_0, P_1, P_a, P_b, P_c$, etc. Now interpret each formula in the propositional logic as the corresponding member of the Heyting algebra. For example, one instance of the Duns Scotus Law is the formula $(P_c \cap (\neg P_c)) \to P_a$; interpret this as the member of the Heyting algebra represented by $(c \wedge (\complement c)) \Rightarrow a$. That expression simplifies to 1, in any Heyting algebra. It is now a tedious but straightforward matter to verify that (i) each of the eleven logical axiom schemes of intuitionist logic is represented by 1 in the Heyting algebra, and (ii) if $\mathcal{F}$ and $\mathcal{F} \to \mathcal{G}$ are formulas represented by 1 in the Heyting algebra, then $\mathcal{G}$ is also represented by 1 in the Heyting algebra, i.e., modus ponens preserves "truth," but (iii) the Law of the Excluded Middle is not represented by 1 in this particular Heyting algebra. This completes the proof.

Remarks. In effect, we have used $H$ as a "quasimodel" for our propositional calculus. Models and quasimodels will be explored in greater detail later in this chapter. However, for brevity we shall only consider classical logics, and so all our formal models and quasimodels will be Boolean-valued. Thus the argument given in the preceding paragraph does not quite fit the formal framework developed later in this chapter.

14.36. Discussion of intuitionist logic. The axiom system of classical logic is slightly stronger than that of intuitionist logic. Hence, the set of syntactic theorems in classical logic is slightly larger than that of intuitionist logic.

The connective $\cup$ has rather different meanings in classical logic and intuitionist logic. In classical logic, if $P$ is a primitive proposition symbol about which nothing in particular is assumed, then neither $P$ nor $\neg P$ is a theorem; nevertheless $P \cup (\neg P)$ is a theorem, since this is just the Law of the Excluded Middle. In contrast, in intuitionist propositional logic, if $\mathcal{A}$ and $\mathcal{B}$ are some formulas such that $\mathcal{A} \cup \mathcal{B}$ is a syntactic theorem, then at least one of $\mathcal{A}$ or $\mathcal{B}$ is a syntactic theorem. (The proof is too difficult to give here; a topological proof is given by Rasiowa and Sikorski [1963, page 394]. An analogous result for predicate logic can be found on page 430 of that book.)

This result may surprise many readers, because it is so different from what we are familiar with in classical logic. It may also puzzle some readers, because it seems to give a stronger conclusion in intuitionist logic than in classical logic even though intuitionist logic is the weaker logic. But read carefully! Since intuitionist logic has fewer axioms and fewer syntactic theorems than classical logic, the hypothesis that "$\mathcal{A} \cup \mathcal{B}$ is a syntactic theorem of intuitionist logic" is stronger than the hypothesis that "$\mathcal{A} \cup \mathcal{B}$ is a syntactic theorem of classical logic."
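The "tedious but straightforward" verification in the proof of 14.35 can be carried out by brute force on a small example. The sketch below uses the three-element chain $0 < \tfrac12 < 1$, encoded as the integers 0, 1, 2, with the usual implication on a chain; this particular algebra is an assumption chosen for illustration and need not be the example the text has in mind. The code checks that each of the eleven intuitionist axiom schemes evaluates to the top element under every assignment, that modus ponens preserves the top element, and that the Law of the Excluded Middle does not always evaluate to the top element.

```python
from itertools import product

H = (0, 1, 2)   # a three-element chain; 2 plays the role of "1" (top)
TOP = 2

def meet(a, b):
    return min(a, b)

def join(a, b):
    return max(a, b)

def imp(a, b):
    # relative pseudocomplement on a chain: largest x with min(x, a) <= b
    return TOP if a <= b else b

def neg(a):
    return imp(a, 0)

# the eleven intuitionist axiom schemes of 14.25, read as functions on H
AXIOMS = {
    'i':    lambda a, b, c: imp(imp(a, b), imp(imp(b, c), imp(a, c))),
    'ii':   lambda a, b, c: imp(a, join(a, b)),
    'iii':  lambda a, b, c: imp(b, join(a, b)),
    'iv':   lambda a, b, c: imp(imp(a, c), imp(imp(b, c), imp(join(a, b), c))),
    'v':    lambda a, b, c: imp(meet(a, b), a),
    'vi':   lambda a, b, c: imp(meet(a, b), b),
    'vii':  lambda a, b, c: imp(imp(c, a), imp(imp(c, b), imp(c, meet(a, b)))),
    'viii': lambda a, b, c: imp(imp(a, imp(b, c)), imp(meet(a, b), c)),
    'ix':   lambda a, b, c: imp(imp(meet(a, b), c), imp(a, imp(b, c))),
    'x':    lambda a, b, c: imp(meet(a, neg(a)), b),
    'xi':   lambda a, b, c: imp(imp(a, meet(a, neg(a))), neg(a)),
}

axioms_hold = all(f(a, b, c) == TOP
                  for f in AXIOMS.values()
                  for a, b, c in product(H, repeat=3))

# modus ponens preserves the top element
mp_preserves_top = all(b == TOP
                       for a, b in product(H, repeat=2)
                       if a == TOP and imp(a, b) == TOP)

excluded_middle = [join(a, neg(a)) for a in H]
converse_double_negation = [imp(neg(neg(a)), a) for a in H]

print(axioms_hold, mp_preserves_top)
print(excluded_middle)
print(converse_double_negation)
```

The middle element is where both classical schemes fail: its negation is 0, so the join with its negation is the middle element itself rather than the top. Since every formula obtainable from the eleven schemes by modus ponens evaluates to the top element, neither scheme is obtainable that way.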
Heyting developed his algebraic approach to intuitionist logic in a paper in 1930. In 1932, Kolmogorov published some related results, including this intuitive (i.e., real-world) interpretation of Heyting's formalism: Let us use letters such as $\mathcal{A}, \mathcal{B}, \mathcal{C}$, etc., to denote problems that are to be solved. Interpret connectives as follows:

$\mathcal{A} \cap \mathcal{B}$ means the problem "to solve both $\mathcal{A}$ and $\mathcal{B}$."
$\mathcal{A} \cup \mathcal{B}$ means the problem "to solve at least one of $\mathcal{A}$ or $\mathcal{B}$."
$\mathcal{A} \to \mathcal{B}$ means the problem "to show how any solution of $\mathcal{A}$ would yield a solution of $\mathcal{B}$."
$\neg\mathcal{A}$ means the problem "to show how any solution of $\mathcal{A}$ would yield a contradiction."

Then the properties of Kolmogorov's system of problem-solving coincide with the properties of Heyting's formal intuitionist propositional calculus. (For further references and discussion, see Kneebone [1963].) The Law of the Excluded Middle (which we introduced in 6.4) is not taken as an axiom in the intuitionist system of Heyting or Kolmogorov, i.e., although for some particular problems $\mathcal{A}$ we may be able to solve at least one of $\mathcal{A}$ or $\neg\mathcal{A}$, we do not have a general method for doing that. The systems of Heyting and Kolmogorov reflect a somewhat constructive viewpoint, but we shall not try to make that description precise, for there are many different schools of constructivism.

SOME SYNTACTIC RESULTS (PREDICATE LOGIC)

14.37. Remark. Our preceding syntactic results did not make any direct use of variables; they would apply equally well to propositional logic or predicate logic. We now turn to syntactic results that do involve variables. Most of these results are only relevant to predicate logic. A few of them are also relevant to propositional logic, but take a simplified form in that case; see 14.40.b for instance.

14.38. We begin by considering the relation between these two kinds of implications:

(i) $\Sigma \vdash (\mathcal{F} \to \mathcal{G})$,
(ii) $\Sigma \cup \{\mathcal{F}\} \vdash \mathcal{G}$.

Here $\mathcal{F}$ and $\mathcal{G}$ are any formulas, and $\Sigma$ is any set of formulas. It is easy to see that (i) $\Rightarrow$ (ii), i.e., that whenever $\mathcal{F}$ and $\mathcal{G}$ are some particular formulas that satisfy (i), then they also satisfy (ii). (Proof. Assume $\Sigma \vdash (\mathcal{F} \to \mathcal{G})$, and assume we are given the set of formulas $\Sigma \cup \{\mathcal{F}\}$. Since we are given $\Sigma$, we may deduce $\mathcal{F} \to \mathcal{G}$. Since we are also given $\mathcal{F}$, by modus ponens we may deduce $\mathcal{G}$.) However, in general (i.e., without additional assumptions), (ii) does not imply (i); that is shown by an example in 14.60. Under certain additional assumptions we can show that (ii) $\Rightarrow$ (i), and therefore (i), (ii) are equivalent; that is the subject of 14.39 and 14.40.
14.39. The Deduction Principle. Let $\mathcal{F}$ and $\mathcal{G}$ be formulas, and let $\Sigma$ be a set of formulas. We view $\Sigma$ as a collection of extralogical axioms. Suppose that $\Sigma \cup \{\mathcal{F}\} \vdash \mathcal{G}$; that is, there exists a derivation of $\mathcal{G}$ from $\Sigma \cup \{\mathcal{F}\}$. Suppose, moreover, that the derivation can be chosen so that whenever any of the inference rules (R2), (R3), (R4) is used, then the free variables $x_1, x_2, x_3, \ldots$ being replaced are symbols that do not appear in $\mathcal{F}$. Then $\Sigma \vdash (\mathcal{F} \to \mathcal{G})$.

Proof. Let $\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n$ be the given derivation; thus $\mathcal{E}_n = \mathcal{G}$, and each $\mathcal{E}_i$ is either a logical or extralogical axiom, or $\mathcal{F}$, or a consequence of previous $\mathcal{E}_i$'s by the rules of inference. It suffices to show, by induction on $k = 1, 2, \ldots, n$, that $\Sigma \vdash (\mathcal{F} \to \mathcal{E}_k)$. We prove this by considering cases according to the method by which $\mathcal{E}_k$ enters the given derivation.

If $\mathcal{E}_k$ is an axiom, then from $\mathcal{E}_k \to (\mathcal{F} \to \mathcal{E}_k)$ (in 14.29(ii)) and $\mathcal{E}_k$ we may deduce $\mathcal{F} \to \mathcal{E}_k$. If $\mathcal{E}_k$ is equal to $\mathcal{F}$, then $\mathcal{F} \to \mathcal{E}_k$ is the formula $\mathcal{F} \to \mathcal{F}$, which was proved in 14.29(i). Thus, in these cases we have $\Sigma \vdash (\mathcal{F} \to \mathcal{E}_k)$, without even referring to the induction hypothesis.

Next, consider the case in which $\mathcal{E}_k$ follows from previous formulas $\mathcal{E}_i$ and $\mathcal{E}_j$ via modus ponens. Then (with $i$ and $j$ switched if necessary) we may assume $\mathcal{E}_i$ is the formula $\mathcal{E}_j \to \mathcal{E}_k$. By our induction hypothesis we have $\Sigma \vdash (\mathcal{F} \to (\mathcal{E}_j \to \mathcal{E}_k))$ and $\Sigma \vdash (\mathcal{F} \to \mathcal{E}_j)$. By 14.30(iv) it follows that $\Sigma \vdash (\mathcal{F} \to \mathcal{E}_k)$.

Next, consider the case in which $\mathcal{E}_k$ follows from some previous formula $\mathcal{E}_j$ by the Rule of Substitution (R2), i.e., by replacing some or all of the free variables with specified terms. By assumption, none of those free variables appear in $\mathcal{F}$, so the same substitution leaves $\mathcal{F}$ unaffected. Thus $\mathcal{F} \to \mathcal{E}_k$ follows from $\mathcal{F} \to \mathcal{E}_j$ by the Rule of Substitution.

Next, consider the cases in which $\mathcal{E}_k$ follows from some previous formula $\mathcal{E}_j$ by one of the remaining inference rules (R3), (R4), (R5), (R6). In these cases, $\mathcal{E}_j$ is a formula $\mathcal{C} \to \mathcal{D}$ of a certain type, and from it we can deduce $\mathcal{E}_k$, a formula $\mathcal{C}' \to \mathcal{D}'$ of a certain type. It suffices to prove, by different reasonings in these four cases,

(a) that from $\mathcal{F} \to (\mathcal{C} \to \mathcal{D})$ we can deduce $\mathcal{F} \to (\mathcal{C}' \to \mathcal{D}')$.

For some of these cases it is helpful to use 14.34.a; thus (a) is established if we can just show

(b) that from $\mathcal{C} \to (\mathcal{F} \to \mathcal{D})$ we can deduce $\mathcal{C}' \to (\mathcal{F} \to \mathcal{D}')$.

For other cases, it is helpful to use Axioms (viii) and (ix) in 14.25; thus (a) is established if we can just show

(c) that from $(\mathcal{C} \cap \mathcal{F}) \to \mathcal{D}$ we can deduce $(\mathcal{C}' \cap \mathcal{F}) \to \mathcal{D}'$.

Applications of (R3) are of this form: $\mathcal{A}(x)$ contains no occurrence of $\xi$, $\mathcal{B}$ contains no occurrence of $x$, and from $\mathcal{A}(x) \to \mathcal{B}$ we infer $(\exists\xi\, \mathcal{A}(\xi)) \to \mathcal{B}$. By assumption, $x$ does not occur in $\mathcal{F}$, hence it does not occur in $(\mathcal{F} \to \mathcal{B})$. Thus, by the same rule of inference, from $\mathcal{A}(x) \to (\mathcal{F} \to \mathcal{B})$ we infer $(\exists\xi\, \mathcal{A}(\xi)) \to (\mathcal{F} \to \mathcal{B})$. That is just (b).

Applications of (R4) are of this form: $\mathcal{A}(x)$ contains no occurrence of $\xi$, $\mathcal{B}$ contains no occurrence of $x$, and from $\mathcal{B} \to \mathcal{A}(x)$ we infer $\mathcal{B} \to (\forall\xi\, \mathcal{A}(\xi))$. By assumption, $\mathcal{F}$ contains no occurrences of $x$,
Now.. we follow the substitution notation of 14. ~). . Then. Hence by modus ponens we have E U {A} F (~A). at least in propositional logic. 14.4 + (~A)) + (~A). If E U {A} is syntactically inconsistent. Applications of (R6) are of this form: A(x) contains no occurrence of ~. Hence by modus ponens we have E F. By 14. tET Proof of c. Refer to 14. by 14. or more generally let T be any set of terms with T ~_ {Xl. Let T be the set of all free variables ( . hence 23 V19" contains no occurrences of x.. In propositional logic.40. This is (c). Then E U { 9 " } k S if and only if e (In particular. When we replace each x with t in the formula A(x) ~ (3~ A(~)).34. A w e a k f o r m o f p r o o f b y c o n t r a d i c t i o n . By the same rule. and let E be a set of formulas. However.9 if and only if F (9~ + S). taking E = O.inf [A(t)]. Let 9~ and S be formulas.41. in the Lindenbaum algebra (L. This is just (c).40. Corollaries. E U {Jt} F (23 [1 ( ~ ) ) [V~ A ( ~ ) ] . In special circumstances we obtain simplified versions of the Deduction Principle: a. by our assumption in 14. By assumption.} where the xj's are distinct free variables.25. we see that 9" F. and let E be a set of formulas. hence by the same inference rule from ({B N 9") ~ A(x) we infer ({B R 9") + (V~ A(~)). Note that the variable x does not appear in the formula (3~ A(~)). from (3~ A(~)) ~ (9" + ~B) we infer A(x) + (~+ ~B).e we have the syntactic theorem (. Thus. and from {B + (V~ A(~)) we infer {B ~ A(x). the result is the formula A(t) ~ (3~ A(~)). D e d u c t i o n P r i n c i p l e for C l o s e d F o r m u l a s . This completes the proof.an infinite set. based solely on the logical axioms.X3. Since A has no free variables. then E F (~A). Then E O {9~} F.38.21. this result is valid both in classical logic and in intuitionist logic. Let t be any term in T..9 if and only if e b. 14. let 9" and S be formulas. D e d u c t i o n P r i n c i p l e for P r o p o s i t i o n a l Logic. 
Assume 9" has no free variables. tET Proof.) for some formula 23.(~A).. (Moreover. Axiom (x) from 14.) c. the formula (iB V1(~23)) ~ (~A) is an instance of the Duns ScoUts Law. T h e o r e m c h a r a c t e r i z i n g q u a n t i f i e r s as s u p a n d inf. From that syntactic theorem and inference rule (Rh) we can infer that A(x) + (3~ A(~)) is also a syntactic theorem. By the same inference rule. Suppose E is a set of formulas and A is a formula with no free variables. "+" means just what one would expect it to mean.a.X2. Applications of (R5) are of this form: A(x) contains no occurrence of ~ and from (3~ A(~)) ~ {B we infer A(x) + ~B. This is (b). we have E ~(A + (~A)) by the Deduction Pr