
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/374291336

INTRODUCTION TO MEASURE THEORY: FOUNDATIONS AND APPLICATIONS OF QUANTITATIVE ANALYSIS

Book · September 2023


1 author:

Carlos Polanco
Instituto Nacional de Cardiología

All content following this page was uploaded by Carlos Polanco on 24 November 2023.



Carlos Polanco

INTRODUCTION TO MEASURE
THEORY
FOUNDATIONS AND APPLICATIONS OF
QUANTITATIVE ANALYSIS

Bentham Science Publishers


Copyright to the book

Dear reader, if you find the information contained in this book useful and choose to reference it in
your work, research, or publication, I kindly ask that you properly cite this work. By doing so, you
contribute to acknowledging the effort and dedication invested in the creation of this material and promote
respect for copyright. Below, I provide a suggested format for citing this book: Carlos Polanco (2024).
INTRODUCTION TO MEASURE THEORY: FOUNDATIONS AND APPLICATIONS OF QUANTITATIVE
ANALYSIS. DOI: 10.13140/RG.2.2.32054.88646/4, ResearchGate GmbH. I appreciate in advance your
cooperation and respect for intellectual property.
This work, “Introduction to measure theory: foundations and applications of quantitative analysis”, created
by Carlos Polanco, is available under a Creative Commons Attribution 4.0 International (CC BY 4.0)
license. This license allows others to share, copy, distribute, and use the work, even for commercial purposes,
as long as the original authorship is credited.

Word processor: LaTeX ©2023; Operating system: Linux Fedora-39 ©2023.

Software: QuillBot (Course Hero), LLC. ©2023, & ChatGPT 4.0 OpenAI ©2023.

One accurate measurement is worth a thousand expert
opinions.

– Grace Hopper
1906 – 1992
Foreword

Few domains in the vast realm of mathematical comprehension are as fundamental and potent as measure
theory. From the fundamental concepts of size and quantity, it expands to encompass everything from the
complexities of integration to the foundational principles of probability theory. This book seeks to provide
a comprehensive introduction to the world of measure theory by illuminating its complexities and elegant
structure.
The introductory chapters “Preliminaries on Sets” and “Measurable Spaces and Measure Properties” are
devoted to equipping the reader with the necessary tools and knowledge of sets and spaces. These ideas
serve as the bedrock upon which the subsequent chapters are built, like the foundation of a magnificent
structure. With this solid foundation, the reader is led into the regions of “measurable functions” and the
crucial “Lebesgue integral.” These sections represent both an evolution and a revolution in our conception
of integration.
As we progress, “Convergence in Measure Theory” provides insights into the nature of convergence in this
newly discovered context, a cornerstone of many advanced mathematical ideas. “Lebesgue Measure on R^n”
and “Measure Theory in General Spaces” broaden the reader’s perspective by demonstrating the universal
applicability and scalability of the learned concepts.
However, as with all great voyages, this one too is riddled with enigmas. “Non-Measurable Sets and
Paradoxes” explores mathematical mysteries and paradoxes that have perplexed mathematicians for centuries.
These enticing facets of measure theory not only challenge but also enrich our comprehension.
The concluding chapters provide a natural and seamless transition to applications, particularly in the domain
of probability. “Applications in Probability” and “Convergence in Probability” illustrate the significant
impact of measure theory on the probabilistic universe. Probability can be regarded as a twist on measure
theory, making this union both inevitable and illuminating.
“INTRODUCTION TO MEASURE THEORY: FOUNDATIONS AND APPLICATIONS OF QUANTITATIVE
ANALYSIS” is designed to be both your guide and companion, whether you are a student just beginning
your mathematical voyage or an experienced mathematician seeking a new perspective. Through its pages,
you will not only gain a technical understanding of measure theory, but you will also gain an appreciation
for its beauty, depth, and immense potential.

Vladimir N. Uversky
Russian Academy of Sciences
Pushchino, Moscow region, Russia
E-mail: vuversky@usf.edu


Foreword

As the mathematics tapestry unfolds, it becomes increasingly apparent that certain strands are foundational,
providing structure to the expansive fabric by interweaving with various disciplines. One such thread is
measure theory. The book “INTRODUCTION TO MEASURE THEORY: FOUNDATIONS AND APPLICATIONS
OF QUANTITATIVE ANALYSIS” invites readers on a voyage through the landscapes of measure
theory, providing a panoramic view of its rich vistas and deep, sometimes elusive pools of intuition.
Beginning with the essential “Preliminaries on Sets,” we aim to lay the groundwork for the remainder of the
text by introducing the reader to the fundamental concepts that permeate not only measure theory but also
much of advanced mathematics. Following closely, “Measurable Spaces and Measure Properties” delves
deeper, shedding light on the spaces in which we work and the properties that make them central to this
investigation.
Our exploration of the “Lebesgue Integral” and “Measurable Functions” is analogous to climbing a mathematical
peak. From this vantage point, one can see the vast implications and applications of knowing how to measure
and integrate in novel ways that deviate from our intuitive understandings.
With the “Convergence in Measure Theory” and “Lebesgue Measure on R^n” chapters, we link new
theoretical insights with recognizable mathematical domains, bridging the gap between the abstract and
the concrete. “Measure Theory in General Spaces” strengthens this relationship by revealing the grand
architecture of measures in diverse spaces.
However, the path is not devoid of mysteries. “Non-Measurable Sets and Paradoxes” presents a series of
mathematical conundrums that emphasize the profound and, at times, perplexing aspects of our topic.
The segments on “Applications in Probability” and “Convergence in Probability” demonstrate the symbiotic
relationship between measure theory and probability as the course transitions from the theoretical to
the practical. One could argue that a thorough comprehension of measurement is essential for grasping
probability.
“INTRODUCTION TO MEASURE THEORY: FOUNDATIONS AND APPLICATIONS OF QUANTITATIVE
ANALYSIS” is a rigorous book that aspires to be more than just a technical guide. It seeks to instill awe,
invoke the joy of discovery, and cultivate a profound appreciation for the delicate interplay of logic,
intuition, and imagination in the domain of measure theory.

Thomas Buhse
Universidad Autónoma del Estado de Morelos
Cuernavaca, Morelos, Mexico
E-mail: buhse@uaem.mx


Preface

The mathematical landscape is frequently dotted with theories that fundamentally alter our understanding
of the universe. Measure theory is among these foundational theories, providing profound insights into
integration, probability, and other domains. This book, “INTRODUCTION TO MEASURE THEORY: FOUNDATIONS
AND APPLICATIONS OF QUANTITATIVE ANALYSIS,” aims
to equip the reader with the tools necessary to comprehend this intricate and fascinating field.
Beginning with “Preliminaries on Sets,” the reader is guided through an important review of the language of
sets, an indispensable glossary that paves the way for subsequent chapters. In “Measurable
Spaces and Measure Properties,” the reader is then prepared to examine the broader applications and
subtleties of measure. Here, the intuitive and the formal go hand in hand, providing a framework
for all the other components.
Next come “Measurable Functions” and the “Lebesgue Integral,” the theory’s major actors. Here, we depart
from traditional Riemann integration and take a novel approach to the comprehension of “size” and
“volume”. It is a pivot that reveals a universe of possibilities in both the deterministic and random worlds.
As the chapters proceed, we will examine increasingly nuanced topics. “Convergence in Measure Theory”
examines the complex issues surrounding the behavior of sequences of sets and functions, which are crucial
to advanced studies in functional analysis and probability theory.
The framework is expanded in “Lebesgue Measure on R^n” and “Measure Theory in General Spaces” to
accommodate multivariable functions and more exotic spaces. The seemingly abstract concepts covered in
earlier chapters are given new vitality and application in these chapters, expanding your mathematical toolbox.
However, the theory contains paradoxes. “Non-Measurable Sets and Paradoxes” is an intriguing detour into
the world of mathematical anomalies and complexities that inspire and challenge even the most seasoned
mathematicians.
The book concludes with a discussion of “Applications in Probability” and “Convergence in Probability,”
bringing together the realms of measure theory and probability. It provides a tantalizing insight into the
practical applications and broader implications of measure theory, from the theoretical to the real world.
We hope that this book will serve as a cornerstone in your mathematical education, igniting your curiosity
and providing a foundational understanding of one of the most fascinating topics in mathematics. The
universe of measure theory awaits you.


The author wishes to express gratitude to the Instituto Nacional de Cardiología “Ignacio Chávez” and the
Faculty of Sciences at Universidad Nacional Autónoma de México for providing valuable examples.

CONFLICT OF INTEREST
The author declares no conflict of interest regarding the contents of each of the chapters of this ebook.

Carlos Polanco

E-mail: polanco@unam.mx
Department of New Technologies and Intellectual Protection
Instituto Nacional de Cardiología “Ignacio Chávez”
México

Department of Mathematics, Faculty of Sciences


Universidad Nacional Autónoma de México
México
Acknowledgements

I would like to thank everyone who provided me with positive comments on my ebook proposals.

Contents

List of Symbols

1 Preliminaries on Sets
1.1 Introduction
1.2 Operations with Sets
1.3 Introduction to Families of Sets and σ-algebras
1.4 Importance of a σ-algebra in Measure Theory
1.5 Main Sets in Measure Theory
1.6 Conclusions
1.7 Exercises

2 Measurable Spaces and Measure Properties
2.1 Introduction
2.2 Measurable Spaces
2.3 Properties of a Measure
2.4 The Importance of Measure in σ-algebras
2.5 Measure Zero Sets and their Identification
2.6 The Lebesgue Measure: A Simplified Explanation
2.7 Applications of Measurable Spaces and σ-algebras
2.8 Conclusions
2.9 Exercises

3 Measurable Functions
3.1 Introduction
3.2 Measurable Functions
3.3 Properties and Operations with Measurable Functions
3.4 Importance of σ-algebras in Measure Theory
3.5 Examples across Disciplines
3.6 Conclusions
3.7 Exercises

4 Lebesgue Integral
4.1 Introduction
4.2 Introduction of the Lebesgue Integral
4.2.1 Riemann Integral and Lebesgue Integral
4.3 Examples of Lebesgue Integral
4.4 Limitations of the Riemann Integral
4.5 Understanding the Lebesgue Integral
4.6 Limitations of the Lebesgue Integral
4.7 Conclusions
4.8 Exercises

5 Convergence in Measure Theory
5.1 Introduction
5.2 Monotone Convergence Theorem
5.3 Dominated Convergence Theorem
5.4 Conclusions
5.5 Exercises

6 Lebesgue Measure on R^n
6.1 Introduction
6.2 Measure in Multidimensional Spaces
6.3 Product of Measures
6.4 Conclusions
6.5 Exercises

7 Measure Theory in General Spaces
7.1 Introduction
7.2 Measures in Topological Spaces
7.2.1 Borel Measures
7.2.2 Radon Measures
7.2.3 Dirac Measures
7.3 Conclusions
7.4 Exercises

8 Non-measurable Sets and Paradoxes
8.1 Introduction
8.2 Paradoxes in Non-measurable Sets
8.3 Examples of Non-measurable sets
8.4 Significance of Non-measurable Sets
8.5 Integration in Non-measurable Spaces
8.6 Conclusions
8.7 Exercises

9 Applications in Probability
9.1 Introduction
9.2 Probability Spaces
9.3 Random Variables and their Expectation
9.4 Lebesgue’s Dominated Convergence Theorem in Probability
9.5 Conclusions
9.6 Exercises

10 Convergence in Probability
10.1 Introduction
10.2 Law of Large Numbers
10.3 Central Limit Theorem
10.4 Conclusions
10.5 Exercises

11 SOLUTIONS
References
List of Symbols

Symbol       Description

A ∪ B        Union of sets
A ∩ B        Intersection of sets
A \ B        Difference of sets
A ∆ B        Symmetric difference of sets
A_i          Family of sets
σ-algebra    Structure associated with a family of sets

Chapter 1
Preliminaries on Sets

Carlos Polanco
Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiología “Ignacio
Chávez”. Ciudad de México, México.
Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, Ciudad de
México, México.

Abstract In this chapter, we delve deep into the foundational world of set theory, a cornerstone in modern
mathematics that underpins various advanced topics. We commence with the rudimentary yet pivotal
operations involving sets, detailing union, intersection, complement, and difference, alongside a few less
common, but equally pertinent operations. As we progress, we turn our attention to “Families of Sets”,
explicating the intricacies and characteristics that distinguish them from mere collections of sets. Integral to
our discussion is the concept of σ-algebras, powerful structures that play a quintessential role in measure
theory and probability. Readers will garner an appreciation for the capabilities of a σ-algebra, specifically
its relevance in ensuring the measurability of sets, thereby permitting sophisticated mathematical analyses.
Concluding the chapter, we introduce Borel sets, a cornerstone of real analysis and probability theory.
Through a blend of theoretical exposition and practical examples, we aim to provide readers with a
comprehensive understanding of these concepts, setting the stage for more advanced studies in analysis,
topology, and probability.

Keywords: σ-algebra, families of sets, main sets in measure theory, measure theory, operations with sets.

1.1. Introduction
In the vast landscape of mathematics, sets serve as foundational bedrock, encapsulating collections of distinct
elements and their intricate interplay. From basic operations like union and intersection to more advanced
constructs, sets are at the heart of many mathematical explorations. Beginning with the rudimentary
operations involving sets, we soon escalate to understanding the concept of “families of sets,” essentially
collections of sets with unique and significant properties. But the journey doesn’t end there. As we
delve deeper, we encounter the σ-algebra [1, 2], a structure paramount to measure theory and probability,
characterized by a collection of sets closed under certain operations.
The capabilities of a σ-algebra, both its transformative power and its limitations, form a cornerstone of this
exploration. Beyond that lies the introduction to Borel sets, the members of a specific and pivotal σ-algebra
generated by the open intervals on the real line. These Borel sets, fundamental to realms such as real analysis,
measure theory, and probability, offer a glimpse into the vast and intricate structural beauty of mathematical
systems. By the chapter’s conclusion, readers will not only grasp the essentials of set operations and σ-algebras
but also appreciate the foundational role of these structures in the broader mathematical tapestry.

(✉) Carlos Polanco: Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiología “Ignacio
Chávez”. Ciudad de México, México. Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de
México; Tel: +01 55 5595 2220; E-mail: polanco@unam.mx

1.2. Operations with Sets


Sets can be combined in various ways to produce new sets.
(1) Union: The union of two sets, A and B, denoted A ∪ B, is the set of all elements that are in A, or in B, or
in both.

Figure 1.1: Graphical representation of the union of two sets, A and B, on the real line R (points marked on
the axis, left to right: a, c, b, d). The set A is represented by the blue interval from a to b and the set B by
the red interval from c to d. Their union, A ∪ B, is represented by the dashed rectangle.

Example 1.1. If A = {1, 2} and B = {2, 3}, then A ∪ B = {1, 2, 3}.


(2) Intersection: The intersection of two sets, A and B, denoted A ∩ B, is the set of all elements that are both
in A and in B.

Figure 1.2: Graphical representation of the intersection of two sets, A and B, on the real line R (points marked
on the axis, left to right: a, c, b, d). The set A is represented by the blue outlined region from a to b and the
set B by the red outlined region from c to d. Their intersection is evident where the two outlined regions
overlap. In the context of measure theory, this visually demonstrates how two measurable sets might intersect,
creating an overlapping subset.

Example 1.2. Using the same sets A and B, A ∩ B = {2}.


(3) Difference: The difference of two sets, A and B, denoted A − B or A \ B, is the set of all elements in A
but not in B.

Figure 1.3: Graphical representation of the difference between two sets A and B in the context of measure
theory. The set A is represented as a rectangle shaded in light blue. The set B is represented as a circle with a
red outline, positioned inside A. The difference A \ B corresponds to the region of the rectangle not covered
by the circle. This visual demonstration helps to understand how the measure (or “size”) of a set changes
when another subset is removed from it.

Example 1.3. For A = {1, 2, 3} and B = {3, 4, 5}, A − B = {1, 2}.


(4) Symmetric Difference: Given two sets A and B, the symmetric difference of A and B, denoted A ∆ B, is
defined as:

A ∆ B = (A \ B) ∪ (B \ A)
In other words, A ∆ B consists of all elements that belong to exactly one of the two sets: those in A but not
in B, together with those in B but not in A.

Figure 1.4: Venn diagram of two sets A and B, with the elements 1, 2 in A only, 3, 4 in the overlap, and 5, 6
in B only. The areas shaded in yellow and green represent the symmetric difference A ∆ B, which consists of
all elements that are in A but not in B, together with those in B but not in A. The overlapping region indicates
the intersection A ∩ B.

Example 1.4. Let’s consider two sets:

A = {1, 2, 3, 4}

B = {3, 4, 5, 6}

The symmetric difference A ∆ B consists of all elements that are in A but not in B, and all elements that
are in B but not in A. Thus, A ∆ B is:

A ∆ B = {1, 2, 5, 6}

To elaborate:

(a) The numbers 1 and 2 are in A but not in B.
(b) The numbers 5 and 6 are in B but not in A.

These four numbers make up the symmetric difference between A and B.
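The four operations above map directly onto Python's built-in set type. The snippet below is only an illustrative sketch: the variable names are chosen here for the illustration (the sets of Example 1.3 are called C and D, and those of Example 1.4 are called E and F, to avoid reusing A and B).

```python
# Sets from Examples 1.1-1.4, written as Python built-in sets.
A, B = {1, 2}, {2, 3}
union = A | B                  # Example 1.1: elements in A, in B, or in both
intersection = A & B           # Example 1.2: elements in both A and B

C, D = {1, 2, 3}, {3, 4, 5}
difference = C - D             # Example 1.3: elements of C that are not in D

E, F = {1, 2, 3, 4}, {3, 4, 5, 6}
sym_diff = E ^ F               # Example 1.4: elements in exactly one of E, F
# The definition (E \ F) ∪ (F \ E) yields the same set as the ^ operator.
sym_diff_by_definition = (E - F) | (F - E)

print(union, intersection, difference, sym_diff)
```

The last two assignments compute the symmetric difference in two ways, once with the dedicated operator and once from its definition as a union of two differences, so the two results can be compared directly.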

1.3. Introduction to Families of Sets and σ -algebras


(1) Families of Sets: A family of sets is simply a collection of sets. It’s like a “set of sets.” If we consider
the sets A = {1, 2}, B = {2, 3, 4}, and C = {5}, then the family F = {A, B, C} contains three sets.

Figure 1.5: Illustration of a family of subsets {A_i} (here A_1, A_2, A_3, A_4) within a measure space X. Each
circle represents a subset A_i and the rectangle represents the entire space X. In the context of measure
theory, a family of sets like this could be of interest for various operations like unions, intersections, or set
differences, and each subset could have its own measure or “size”.

(2) σ-algebras: A σ-algebra on a set X is a collection Σ of subsets of X that includes X, is closed under
taking complements, and is closed under countable unions.

Definition 1.1. Given a set X, a σ-algebra F on X is a collection of subsets of X that satisfies the
following properties:

(a) X ∈ F.
(b) If A ∈ F, then its complement A^c = X \ A is also in F.
(c) If A_1, A_2, . . . is a sequence of sets in F, then the countable union of these sets, $\bigcup_{i=1}^{\infty} A_i$, is also in F.

In more intuitive terms: a σ-algebra is a collection of sets that includes the universal set X, is closed
under the operation of taking complements, and is closed under countable unions.
In the context of probability theory, X is the sample space and F is the set of all events (subsets of the
sample space) for which we can coherently assign a probability.


Figure 1.6: Illustration of a σ-algebra (figure labels: if A ∈ σ, then A^c ∈ σ; if B ∈ σ, then B^c ∈ σ). The
sample space, Ω, contains all possible outcomes. Any subset, like A or B, in the σ-algebra ensures that their
complements, A^c and B^c, are also in the σ-algebra. Furthermore, countable unions of these subsets would
also be part of this σ-algebra.
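Two consequences of properties (a)–(c) in Definition 1.1 are worth writing out, since the examples that follow rely on them implicitly. The derivation below is a standard argument, sketched here for convenience rather than taken from the original text; it writes the σ-algebra F as \mathcal{F}.

```latex
% (i) The empty set belongs to every sigma-algebra: by (a) and (b),
\emptyset = X^{c} \in \mathcal{F}.

% (ii) Closure under countable intersections: if A_1, A_2, \dots \in \mathcal{F},
% then each A_i^{c} \in \mathcal{F} by (b), their union lies in \mathcal{F} by (c),
% and De Morgan's law together with (b) gives
\bigcap_{i=1}^{\infty} A_i
  = \Bigl( \bigcup_{i=1}^{\infty} A_i^{c} \Bigr)^{c} \in \mathcal{F}.
```

Consequence (i) explains why ∅ appears in every σ-algebra listed in the examples of this section.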

Example 1.5. Let X = {1, 2, 3}. One possible σ-algebra on X is Σ = {∅, {1}, {2, 3}, {1, 2, 3}}.

Solution 1.1. Let’s thoroughly explain why Σ = {∅, {1}, {2, 3}, {1, 2, 3}} is a σ-algebra on X = {1, 2, 3}.
A collection Σ of subsets of a set X is a σ-algebra if it satisfies the following properties:

(1) X is in Σ.
(2) If A is in Σ, then its complement X \ A is also in Σ.
(3) If $\{A_i\}_{i=1}^{\infty}$ is a sequence of sets in Σ, then their countable union $\bigcup_{i=1}^{\infty} A_i$ is also in Σ.

Now, let’s verify these properties for our Σ:

(1) The universal set is in Σ: X = {1, 2, 3} is clearly in Σ, satisfying the first condition.
(2) Complements:
(a) Consider the set ∅. Its complement in X is X itself, which is {1, 2, 3}. We note that {1, 2, 3} ∈ Σ.
(b) For the set {1}, its complement in X is {2, 3}. We also note that {2, 3} ∈ Σ.
(c) Similarly, for the set {2, 3}, its complement is {1}, which is in Σ.
(d) Lastly, the complement of {1, 2, 3} is ∅, which is also in Σ.
Hence, all sets in Σ have their complements in Σ, satisfying the second condition.
(3) Countable Unions: Because Σ contains only four sets, any sequence of sets in Σ involves at most four
distinct sets, so every countable union reduces to a finite union of members of Σ. It is therefore enough to
check finite unions. For instance, the union of ∅ and {1} is {1}, which is in Σ. Similarly, the union of {1}
and {2, 3} is {1, 2, 3}, which is also in Σ. The remaining finite unions are checked in the same way.

Since all these properties are met, we can confidently state that Σ is a σ-algebra on X.
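The union check in step (3) can also be done exhaustively by machine. The sketch below assumes the sets of Example 1.5 are stored as Python frozensets; the helper name unions_of_subfamilies is introduced here purely for illustration. It forms the union of every non-empty subfamily of Σ and confirms that no set outside Σ is produced.

```python
from itertools import combinations

X = frozenset({1, 2, 3})
sigma = {frozenset(), frozenset({1}), frozenset({2, 3}), X}

def unions_of_subfamilies(family):
    """Return the union of every non-empty subfamily of a finite family."""
    members = list(family)
    results = set()
    for r in range(1, len(members) + 1):
        for subfamily in combinations(members, r):
            results.add(frozenset().union(*subfamily))
    return results

all_unions = unions_of_subfamilies(sigma)
closed_under_unions = all_unions <= sigma          # no new set appears
closed_under_complements = all(X - A in sigma for A in sigma)
print(closed_under_unions, closed_under_complements)  # prints: True True
```

Enumerating all subfamilies is exponential in the size of the family, so this brute-force approach is only practical for small finite examples such as this one.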

Example 1.6. Let X = {1, 2, 3} and consider the collection

F = {∅, {1}, {2, 3}}.

Now, let’s check the properties for a σ-algebra:

Solution 1.2. (1) The set X must be in the collection.
(2) If a set B is in the collection, then its complement B^c must also be in the collection.
(3) If a sequence of sets $\{B_n\}_{n=1}^{\infty}$ is in the collection, then their countable union (Eq. 1.1) must also be in
the collection.

$$\bigcup_{n=1}^{\infty} B_n \tag{1.1}$$

For our collection F: The set X is not in F. Therefore, F does not satisfy the first property of a σ-algebra.
In fact, we only need a violation of one of the properties to show that F is not a σ-algebra. In this case, the
first property is already not satisfied, so F is not a σ-algebra.

Example 1.7. Let Ω = {a, b, c}. Consider F = {∅, {a}, {b, c}, Ω}. Is F a σ-algebra?

Solution 1.3. For F to be a σ-algebra, it must satisfy three properties:

(1) Ω ∈ F.
(2) If A ∈ F, then A^c ∈ F.
(3) If A1, A2, ... are sets in F, then $\bigcup_{i=1}^{\infty} A_i$ ∈ F.

Checking each property:


(1) Ω is in F .
(2) Taking the complements: {a}^c = {b, c}, which is in F; {b, c}^c = {a}, which is in F; Ω^c = ∅, which is in F; and ∅^c = Ω, which is in F.
(3) Since F contains only finitely many sets, every countable union of sets in F reduces to a finite union, and each such union is again one of the four sets in F.
Thus, F is a σ-algebra.

Example 1.8. Let Ω = {1, 2, 3, 4}. Consider G = {∅, {1, 2}, {3, 4}}. Is G a σ-algebra?

Solution 1.4. To determine if G is a σ-algebra, we check the same properties mentioned above. The first one already fails: Ω is not in G. Thus, G is not a σ-algebra.

Example 1.9. Provide a graphical example illustrating the countable union of sets. Ensure each individual
set is clearly highlighted and how they come together to form the overall union.

Solution 1.5. The figure (Fig. 1.7) visualizes the concept of a countable union of sets on the real line. In mathematical terms, a “countable union” refers to the union of a countable sequence of sets. The term “countable” indicates that the sets can be enumerated one by one, A1, A2, A3, . . . , even when there are infinitely many of them (for instance, the set of all integers is countable).
Interpretation: The figure visually demonstrates that each set An is a distinct segment on the real line. The
idea of a countable union is illustrated by showcasing that we can have an infinity of these sets, each situated
at different parts of the line. If we were to take the union of all these sets, we’d be grouping all these
individual intervals into one larger set. In other words, any point that belongs to any set An will belong to the
union of all these sets.
In the Context of Measure Theory: In measure theory, countable unions are essential as they allow us to
construct more intricate sets from simpler ones. For instance, we might have sets whose length or “measure”
we want to compute. While each individual set An might have a straightforward measure (like the length of
an interval), the measure of their countable union could be more complex. This figure aids in visualizing
what these individual sets look like and how they can come together into a larger structure.

Figure 1.7: Graphical representation of a countable union of sets on the real line. Each set An is represented as a distinct outlined interval in blue. The process illustrated by sets A1, A2, A3, and A4 can continue indefinitely, representing the idea of a countable union in measure theory.

Example 1.10. Consider a basic set X = {a, b, c}. Determine whether the following collection of subsets is a σ-algebra on X:

F = {∅, {a}, {b, c}, {a, b, c}}

Solution 1.6. For F to be a σ-algebra, it must satisfy three properties:

(1) X is in F.
(2) If A is in F, then its complement A^c is also in F.
(3) If A1, A2, . . . are in F, then their countable union $\bigcup_{i=1}^{\infty} A_i$ is also in F.

Verification:

(1) X = {a, b, c} is clearly in F.
(2) Taking complements:
(a) The complement of ∅ is X, which is in F.
(b) The complement of {a} is {b, c}, which is in F.
(c) The complement of {b, c} is {a}, which is in F.
(d) The complement of X is ∅, which is in F.
(3) For this finite collection, every countable union reduces to a finite union. Every finite union of sets in F results in a set that’s also in F.

Therefore, F is a σ -algebra on X.

Example 1.11. Consider the set Ω = {1, 2, 3}. What is the smallest σ-algebra on this set?
Solution 1.7. The smallest σ-algebra over Ω is {∅, Ω}. Therefore,

F = {∅, {1, 2, 3}}.

Example 1.12. For the set Ω = {1, 2}, what is the power set and is it a σ -algebra?
Solution 1.8. The power set (set of all subsets) of Ω is

F = {∅, {1}, {2}, {1, 2}}.

The power set of any set Ω is always a σ -algebra.

Example 1.13. Consider a discrete set Ω = {1, 2, 3}. Determine a σ -algebra that contains all singletons and
their complements.
Solution 1.9. The σ-algebra containing all singletons and their complements, along with the empty set and Ω, is the power set of Ω:
F = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}.
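One way to see why the singletons force the whole power set is to close a starting family under complements and unions until nothing new appears. The sketch below does this for finite sets; the function name generate_sigma_algebra is a hypothetical helper for illustration, and the loop terminates only because a finite set has finitely many subsets.

```python
def generate_sigma_algebra(X, generators):
    """Smallest sigma-algebra on the finite set X that contains the generators."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(g) for g in generators}
    while True:
        new_sets = {X - s for s in family}                   # complements
        new_sets |= {a | b for a in family for b in family}  # pairwise unions
        if new_sets <= family:   # nothing new: the family is closed
            return family
        family |= new_sets

F = generate_sigma_algebra({1, 2, 3}, [{1}, {2}, {3}])
print(len(F))  # 8, the full power set of {1, 2, 3}
```

Starting from the single generator {1} instead yields the four-set σ-algebra of Example 1.5.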

Example 1.14. Consider the set of real numbers, Ω = R. Can you provide an example of a σ-algebra that divides the real numbers into two parts: numbers less than or equal to zero and numbers greater than zero?
Solution 1.10. A σ-algebra on Ω that satisfies the given condition is

F = {∅, R, (−∞, 0], (0, ∞)}.

The two parts must be complements of each other: the complement of (−∞, 0] is (0, ∞), not [0, ∞), so the point 0 belongs to exactly one of them.

1.4. Importance of a σ -algebra in Measure Theory


A σ -algebra is a structured family of sets that has certain specific properties. These properties enable a range
of operations to be performed coherently within this family. If a family of sets F is a σ -algebra over a set
Ω, then one can:

(1) Take the Complement of Any Set: If A belongs to F , its complement Ac (elements in Ω not in A) also
belongs to F .
(2) Perform Countable Unions: If one has a countable sequence of sets A1, A2, A3, . . . and each Ai belongs to F, then the countable union $\bigcup_{i=1}^{\infty} A_i$ also belongs to F.
(3) Perform Countable Intersections: Due to the complement property, if each Ai belongs to F, then the countable intersection $\bigcap_{i=1}^{\infty} A_i$ is also in F. This is because $\bigcap_{i=1}^{\infty} A_i = \left(\bigcup_{i=1}^{\infty} A_i^c\right)^c$.
(4) Include the Universal Set and the Empty Set: The universal set (or sample space) Ω and the empty set ∅ are always in F.

(5) Construct Sets from Basic Operations: Given the ability to form unions, intersections, and complements,
one can construct a wide variety of sets from the basic sets in F using these operations.
(6) Define Measures (and Probabilities) Coherently: Although not delving into measures, it’s noteworthy
that σ -algebras are essential for defining measures (and especially probability measures) on the set
family. A probability measure assigned to a σ -algebra adheres to properties like being non-negative,
assigning probability 1 to the sample space, and being σ -additive.
(7) Model and Refine Information: In advanced contexts, like stochastic process theory, σ -algebras are used
to model information available at different times. A filtration of σ -algebras can represent the evolution
of information over time.
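As a small illustration of item (6), the sketch below defines a probability measure on a four-set σ-algebra over Ω = {1, 2, 3, 4} and checks the three properties named there. The sample space, the atoms {1, 2} and {3, 4}, and the weights 1/3 and 2/3 are assumptions chosen only for this example; on a finite σ-algebra, σ-additivity reduces to additivity on disjoint pairs.

```python
from fractions import Fraction

omega = frozenset({1, 2, 3, 4})
# Hypothetical weights on the two atoms of the sigma-algebra.
atom_weights = {
    frozenset({1, 2}): Fraction(1, 3),
    frozenset({3, 4}): Fraction(2, 3),
}

def P(event):
    """Probability of an event that is a union of atoms."""
    return sum((w for atom, w in atom_weights.items() if atom <= event), Fraction(0))

F = [frozenset(), frozenset({1, 2}), frozenset({3, 4}), omega]

assert all(P(A) >= 0 for A in F)   # non-negativity
assert P(omega) == 1               # the sample space has probability 1
# Additivity on disjoint events in F.
assert all(P(A | B) == P(A) + P(B) for A in F for B in F if not (A & B))
```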

A σ -algebra is a fundamental mathematical structure in measure theory, which serves to define a measure
over sets. It is particularly important in probability theory, where the measure is often probability. Below, we
present some problems that arise when there is no σ -algebra defined on a set:

(1) You cannot define a measure: To define a measure (like the Lebesgue measure or a probability) on a
space, you need a σ -algebra over that space. Without a σ -algebra, it is not possible to talk about the
measure of sets in that space coherently.
(2) Inability to handle random events: In probability, we use σ -algebras to discuss random events. Without
a σ -algebra, we cannot formally talk about events or associated probabilities.
(3) Integration becomes impossible: Integration, like the mathematical expectation of a random variable, is
defined using the measure. Without a σ -algebra (and therefore, without a measure), you cannot integrate
functions.
(4) Lack of completeness: One of the reasons to introduce σ -algebras is to ensure the completeness of the
measure space, meaning all “small” sets in an appropriate sense are in the σ -algebra. Without a σ -
algebra, it is possible that certain sets which intuitively should have measure zero (in the context of
Lebesgue measure) or probability zero (in the context of probability) are not adequately addressed.
(5) Problems with limits: Many results in probability and analysis involve limits, such as the Central Limit
Theorem or Lebesgue’s Dominated Convergence. These theorems often require σ -algebra structures to
ensure that the limits of sequences of sets or functions are well-defined.
(6) Issues with composition of functions: Given two measurable functions, their composition is not necessarily measurable unless you have appropriate σ-algebra structures on the involved spaces.
(7) Difficulty with infinite intersections and unions: σ -algebras allow for handling infinite intersections and
unions of sets in a coherent way. Without these structures, it’s difficult or impossible to work with infinite
collections of sets.

1.5. Main Sets in Measure Theory

Open and Closed Sets


Open set: Imagine that each point in a set is like a tiny star with a halo around it. If every point in the set (each
star) has a halo, no matter how small, that stays completely within the set without crossing its boundaries,
then that set is open. It’s like having a piece of paper where you can draw inside the paper without touching
the edges; the interior of the paper would represent an open set.
Closed set: A closed set is one that contains all of its boundary points. If we think about the halo idea around each point, in a closed set not all points need to have a halo that remains entirely within the set. Some points may sit right at the “border” of the set. Imagine a box with walls; everything inside the box, including the walls themselves, represents the closed set. If a point is on the wall, it doesn’t have room for a full halo without stepping outside the box. Note that open and closed are not opposites: a set can be both (like the empty set) or neither (like the interval (0, 1]).

Definition 1.2. Open set: A set A in a metric space (X, d) is open if, for every point x in A, there exists a
positive number ε such that the open ball centered at x with radius ε is entirely contained in A. This is written
as:
∀x ∈ A, ∃ε > 0 : B(x, ε ) ⊆ A
where B(x, ε ) is the open ball centered at x with radius ε .

Definition 1.3. Closed set: A set A is closed if its complement (the points not in A) is an open set.

Figure 1.8: Illustration of an open set (circle with a dashed border) and a closed set (circle with a solid border). The dashed border indicates the points on the boundary are not included in the open set.

Example 1.15. Review the intersections and unions of open and closed sets.
Solution 1.11. Basic Properties:
(1) The union of an arbitrary collection of open sets is an open set.
(2) The intersection of a finite collection of open sets is an open set.
(3) The union of a finite collection of closed sets is a closed set.
(4) The intersection of an arbitrary collection of closed sets is a closed set.

Open Set:
A = (0, 1)
Closed Set:
B = [2, 3]

Union:

A ∪ B = (0, 1) ∪ [2, 3]
= (0, 1) ∪ (2, 3) ∪ {2, 3}

Given that we’re uniting an open set with a closed set, the outcome isn’t necessarily open or closed. In this instance, the union consists of the open interval (0, 1), the open interval (2, 3), and the two endpoints 2 and 3, and it is neither open nor closed. It is not open, because no open interval around the point 2 stays inside A ∪ B. It is not closed, because its complement contains the point 0 but no open interval around 0.
Intersection:
A ∩ B = (0, 1) ∩ [2, 3] = ∅

The intersection of a non-overlapping open set and closed set is the empty set, which is both open and closed (a clopen set).
Conclusions:
(1) The union of an open set and a closed set may be neither open nor closed.
(2) The intersection of a non-overlapping open set and closed set is the empty set, which is both open and closed.

Compact Sets
A compact set can be thought of as a set that is “small” and “closed” in a certain sense. Imagine being in
a vast meadow and wanting to define an area where you can keep your sheep without them dispersing too
much. This area you choose is akin to a compact set for two main reasons:

(1) It is closed: This means that if the sheep are close to the boundary of the area, they won’t escape. In
mathematical terms, if you choose any point within the set, even on the boundary, you’re still within the
set.
(2) It is bounded: While the meadow might be infinitely vast, the area you picked for your sheep is not.
There’s a clear limit to how far the sheep can go. They can’t walk indefinitely in one direction and still
be within the area.

Definition 1.4. A set S in a metric space M is called compact if for every open cover of S (i.e., a collection
of open sets such that S is contained in their union), there exists a finite subcover (i.e., a finite number of
those open sets which still cover S).
In the context of the real line or Euclidean space, a set is compact if and only if it is closed and bounded.

Consider the real line. A closed and bounded interval, such as [a, b], is an example of a compact set.

Figure 1.9: Illustration of the compact set [a, b] on the real line. This set is both closed and bounded, making it a compact set in the context of measure theory.

Consider the real line. The unbounded interval [a, ∞) is an example of a set that is closed but not bounded, and therefore not compact.

Figure 1.10: Illustration of the closed but unbounded set [a, ∞) on the real line. This set includes the starting point a but extends indefinitely to the right.

Example 1.16. Consider the subset S = [0, 1] of real numbers. Prove that S is compact.
Solution 1.12. We will use the Heine-Borel Theorem for R. The theorem states that a subset S of R is compact if and only if it is closed and bounded.

(1) S is bounded: By definition, S consists of all real numbers x such that 0 ≤ x ≤ 1. Hence, all numbers in S lie between 0 and 1, implying that S is bounded.
(2) S is closed: By Definition 1.3, it suffices to show that the complement R \ S = (−∞, 0) ∪ (1, ∞) is open. This complement is a union of two open intervals, and a union of open sets is open. Equivalently, S contains all its limit points: a limit of points x with 0 ≤ x ≤ 1 also satisfies these inequalities.

Combining points (1) and (2), S is found to be both closed and bounded. By the Heine-Borel Theorem, it follows that the closed interval S = [0, 1] is a compact subset of R.

Fσ and Gδ Sets
Imagine you have a bunch of closed boxes (representing closed sets) and a bunch of rings or hoops
(representing open sets).

(1) Fσ (F sigma) Set:


(a) Imagine you select some of your closed boxes and line them up in a row.
(b) An Fσ set is simply the total collection of all the boxes you’ve chosen, put together. That is, it’s
a union (when you bring two or more things together) of a countable number (like a list you can
count) of closed boxes.
(2) Gδ (G delta) Set:
(a) Now, think that you take some of your rings and arrange them so that each one envelops the previous
one, as if you were trying to focus on a specific area.
(b) A Gδ set is the area that remains in the center after you’ve placed all your rings. That is, it’s the
intersection (the common part) of a countable number of open rings.

In slightly more mathematical but still simple terms:


(1) An Fσ is a set formed by joining countably many closed sets together.
(2) A Gδ is a set formed by finding the common area (or intersection) of countably many open sets.

Definition 1.5. An Fσ set is one that can be expressed as a countable union of closed sets. Formally, if {En} is a sequence of closed sets, then the set F given by

$$F = \bigcup_{n=1}^{\infty} E_n$$

is an Fσ set.

Definition 1.6. A Gδ set is one that can be expressed as a countable intersection of open sets. Formally, if {On} is a sequence of open sets, then the set G given by

$$G = \bigcap_{n=1}^{\infty} O_n$$

is a Gδ set.

Figure 1.11: Visual representation of Fσ and Gδ sets. The Fσ set is formed by the union of closed sets (depicted by boxes) and the Gδ set is formed by the intersection of open sets (depicted by ellipses).

Example 1.17. Consider the set of rational numbers in the interval [0, 1], that is, Q ∩ [0, 1]. We can express this set as a countable union of singleton sets, each of which is closed in R. Thus, Q ∩ [0, 1] is an Fσ set.

Solution 1.13. We’ll start by recalling the concepts: an Fσ set is a countable union of closed sets, and a Gδ set is a countable intersection of open sets.
Formally,

$$\mathbb{Q} \cap [0, 1] = \bigcup_{n=1}^{\infty} \{q_n\}$$

where q1, q2, q3, . . . is an enumeration of the rationals in [0, 1].

Now, let’s construct a Gδ set using the set of irrational numbers I in [0, 1]. The irrationals in [0, 1] are what remains when the rationals qn are removed, one at a time, from a sequence of open sets that “narrow down” around [0, 1].
Formally, for every n ≥ 1, the set On = (−1/n, 1 + 1/n) \ {qn} is open, since it is an open interval with a single point removed. A point belongs to every On exactly when it lies in [0, 1] and differs from every qn, so the set of irrationals in [0, 1] can be written as:

$$I \cap [0, 1] = \bigcap_{n=1}^{\infty} \left( \left( -\tfrac{1}{n},\ 1 + \tfrac{1}{n} \right) \setminus \{q_n\} \right)$$

This set is a Gδ as it’s a countable intersection of open sets.

Summary:
(1) Q ∩ [0, 1] is an example of an Fσ set.
(2) I ∩ [0, 1] is an example of a Gδ set.
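The Fσ description of Q ∩ [0, 1] depends on being able to list the rationals in [0, 1] one by one. The sketch below gives one such enumeration, ordered by denominator; the generator name rationals_in_unit_interval is a hypothetical helper, and any other enumeration would serve equally well.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def rationals_in_unit_interval():
    """Yield every rational in [0, 1] exactly once, ordered by denominator."""
    yield Fraction(0)
    yield Fraction(1)
    d = 2
    while True:
        for n in range(1, d):
            if gcd(n, d) == 1:   # skip fractions already listed in lower terms
                yield Fraction(n, d)
        d += 1

first = list(islice(rationals_in_unit_interval(), 7))
print([str(q) for q in first])  # ['0', '1', '1/2', '1/3', '2/3', '1/4', '3/4']
```

Each value produced corresponds to one closed singleton in the countable union.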

Simple Sets
In the realm of measure theory, especially when discussing Lebesgue measure on the real line, simple sets
can be thought of as collections of basic “rectangular” blocks that can be grouped together.
Imagine you have a set of toy blocks. Each block represents an interval on the real line. Now, by grouping
several of these blocks together, you form a more complex shape, yet it’s still a combination of these basic
blocks. In this scenario, the shape you’ve formed with the blocks is a “simple set”.
A simple set on the real line is a finite union of intervals. That is, it’s a set that you can form by taking some
intervals (which could be open, closed, or mixed) and putting them together. In higher dimensions, like in
the plane or space, these “blocks” would be rectangles or box-like rectangular prisms, respectively.
Simple sets are valuable as they allow us to approximate more intricate sets and work with them in terms
of these foundational blocks. It’s akin to how constructing a complex structure becomes easier if you first
understand how to assemble its simpler components.
Definition 1.7. Let E be a subset of the real line R.
We call E a simple set if it can be expressed as a finite union of disjoint intervals. Formally, E is a simple set
if there exist intervals I1 , I2 , . . . , In such that:
(1) The intervals are pairwise disjoint: Ij ∩ Ik = ∅ for all j ≠ k.
(2) E = I1 ∪ I2 ∪ . . . ∪ In .

Note: The intervals I j can be open, closed, half-open, or even degenerate (reduced to a single point).

A simple set on the real line can be visualized as a finite union of disjoint intervals, as in the following figure.

Figure 1.12: A simple set composed of three disjoint intervals. The blue segments represent the intervals, with filled circles indicating closed endpoints and unfilled circles indicating open endpoints.

Example 1.18. Consider the intervals A1 = (2, 5] and A2 = (7, 10]. Their union is a simple set, which we denote by A:

A = (2, 5] ∪ (7, 10]

Now, suppose you want to find the characteristic function of this simple set. Recall that the characteristic function (also known as the indicator function) of a set A is defined as:

$$\chi_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}$$

Solution 1.14. For our simple set A:

$$\chi_A(x) = \begin{cases} 1 & \text{if } x \in (2, 5] \cup (7, 10] \\ 0 & \text{otherwise} \end{cases}$$

This means that χA(x) will be equal to 1 for values of x between 2 and 5 (exclusive of 2, inclusive of 5) and between 7 and 10 (exclusive of 7, inclusive of 10). For all other values, χA(x) will be 0.
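The characteristic function of this simple set is easy to evaluate numerically. The following sketch assumes every interval is of the form (a, b], as in this example; the function name indicator is a hypothetical helper introduced for illustration.

```python
def indicator(intervals):
    """Characteristic function of a finite union of half-open intervals (a, b]."""
    def chi(x):
        return 1 if any(a < x <= b for a, b in intervals) else 0
    return chi

# Each pair (a, b) stands for the interval (a, b].
chi_A = indicator([(2, 5), (7, 10)])
print([chi_A(x) for x in (2, 3, 5, 6, 7, 10, 11)])  # [0, 1, 1, 0, 0, 1, 0]
```

The values at x = 2 and x = 7 are 0 because the left endpoints are excluded, while x = 5 and x = 10 give 1.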

Borel Sets
Imagine the line of real numbers, which goes from negative infinity to positive infinity, passing through all
the numbers you know: -3, -2.5, 0, 0.5, 1, 2, 3, . . . and so on.
Now, on this line, we can pinpoint intervals, such as the interval of all numbers between 1 and 2 (this would
include 1.5, 1.75, 1.0001, and many more).
Construction Step by Step
(1) Step 1: We start by considering all possible open intervals. An open interval between two numbers a and
b (where a < b) is the set of all numbers between a and b, but not including a and b themselves. We
denote it as (a, b).
(2) Step 2: Using these open intervals, we can form other sets through set operations, like unions, intersections,
and complements. For instance, we can unite two open intervals to form a larger set, or we can take the
intersection of two intervals to form a smaller set.
(3) Step 3: We repeat these operations, allowing countably many sets at a time. That is, we don’t just take unions of two or three intervals, but unions and intersections of countably infinitely many sets, and so forth.
(4) Step 4: After all these operations, all the sets we have formed (which are many) are called “Borel sets”.

The idea behind Borel sets is to have a “manageable” collection of sets with which we can work when doing
mathematics, especially in areas like measure theory and probability. Essentially, Borel sets cover a wide
range of “interesting” or “useful” sets without being too complicated to work with.
Borel sets on the real line are, basically, sets that can be formed from open intervals through set operations (complements, countable unions, and countable intersections) applied repeatedly. These sets are foundational in various areas of mathematics and provide a structured way to work with subsets of the real numbers.

Figure 1.13: A visualization of the initial stages in constructing Borel sets. The figure shows the real number line, with an example of an open interval in blue. Using set operations, new sets are formed, represented in red. This process continues indefinitely, creating an intricate web of sets, culminating in what we call Borel sets.

Lebesgue Measurable Sets


Imagine you’re painting an infinitely long line. The way you typically measure how much you’ve painted is
by looking at how many inches or feet of the line are covered in paint.
Now, the “traditional” way of measuring this, in the spirit of the Riemann integral, is similar to placing a ruler on the line and summing up all the areas where there’s paint. However, this technique has its limitations and doesn’t work well when the paint is spread in strange or “irregular” patterns along the line.
It was Lebesgue who proposed a different way of measuring. Instead of looking at the line directly, he proposed looking at the paint and asking, “How much of the line would be covered by this particular drop of paint?”. This approach is more flexible and can handle more complicated paint distributions.
Lebesgue Measurable Sets are, in essence, all the possible patterns or ways you might spread paint along the
line such that you can accurately measure how much you’ve painted using Lebesgue’s method. If you come
across a pattern that isn’t “Lebesgue measurable”, then Lebesgue’s method can’t measure it directly.
In short, Lebesgue measurable sets are like the “allowed paint patterns” that Lebesgue’s method can
accurately measure. It’s a more general and powerful way of measuring compared to more traditional
techniques.

Definition 1.8. Given a set E ⊆ R, we define its outer measure m∗(E) as the infimum of the sum of the lengths of open intervals that cover E. Specifically, for each countable covering of E by open intervals {In}, we consider:

$$\sum_{n} \text{length}(I_n)$$

And we take the infimum over all such sums to obtain m∗(E).
We say that a set E is Lebesgue measurable if, for every ε > 0, there exists an open set O such that:

$$m^*(O \,\Delta\, E) < \varepsilon$$

Where ∆ denotes the symmetric difference, i.e., O ∆ E = (O \ E) ∪ (E \ O).


The intuition behind this definition is that a set is Lebesgue measurable if it can be approximated arbitrarily
well by open sets in terms of the outer measure.

Example 1.19. Consider the set A of numbers in [0, 1) whose decimal representation starts with the digit “3” after the decimal point. That is,

A = {x ∈ [0, 1) | the first digit of x after the decimal point is 3}

Using the terminating expansion whenever a number has two decimal representations, this is the half-open interval A = [3/10, 4/10). Now, for any n ≥ 1, let’s define the open intervals:

$$I_n = \left( \frac{3}{10} - \frac{1}{n},\ \frac{4}{10} \right)$$

Clearly, each x ∈ A belongs to every In. Conversely, a number x ≥ 4/10 belongs to no In, and a number x < 3/10 fails to belong to In as soon as 1/n ≤ 3/10 − x. So, we can express A as the following intersection:

$$A = \bigcap_{n=1}^{\infty} I_n$$

Solution 1.15. If a set can be expressed as a countable intersection or union of open intervals (or closed intervals or any combination thereof), then it is a Borel set and hence Lebesgue measurable. In this case, A is a countable intersection of open intervals, so A is Lebesgue measurable.

(1) We defined the set A based on a particular property of its elements.
(2) We constructed a sequence of sets In where each In is an open interval.
(3) By observing that the intersection of all In gives us the set A, we have expressed A as a countable intersection of open intervals.
(4) Therefore, A is Lebesgue measurable.

Example 1.20. Consider a subset E of [0, 1] such that for every ε > 0, there exists a countable collection of
open intervals {In } such that:
(1) E is contained in the union of the In ’s.
(2) The sum of the lengths of the In ’s is less than m∗ (E) + ε , where m∗ denotes the outer measure.

Show that if E = [a, b] where 0 ≤ a < b ≤ 1, then E is Lebesgue measurable.

Solution 1.16. For any ε > 0, let’s consider the open interval I1 = (a − ε/4, b + ε/4):

(1) Clearly, E is contained in I1.
(2) The length of I1 is b − a + ε/2, which is less than m∗([a, b]) + ε since m∗([a, b]) is b − a.
(3) Taking O = I1 in Definition 1.8, the symmetric difference is O ∆ E = (a − ε/4, a) ∪ (b, b + ε/4), whose outer measure is ε/2 < ε.

Therefore, E is Lebesgue measurable.


Null Sets
Imagine you are painting a straight line, and someone tells you there are specific points on that line that you
should not paint. However, these points are so incredibly tiny and sparse that even if you decide to paint
them anyway, they wouldn’t affect the overall appearance of your painted line. In fact, whether you paint
them or not, to anyone looking at the line, it would appear fully painted.
In measure theory, a null set is akin to those inconsequential points on the line. It is a set that is so “small”
in terms of measure (think of “area” or “volume” for simplicity) that it is considered to have size zero. Even
though it might contain many points (even infinitely many, like all the rational numbers between 0 and 1),
its “size” in terms of measure is null.
In other words, in the context of integration or measurement, you can ignore these null sets without
affecting the final outcome. It’s a way of saying that while there might be technical details or exceptions
when working with functions and sets, those details are so insignificant in certain contexts that they can be
“overlooked” without changing the bigger picture.

Definition 1.9. Given a measure space (X, M , µ ), where:


(1) X is the set being measured,
(2) M is a σ -algebra of subsets of X,
(3) µ : M → [0, ∞] is a measure,

a set E ∈ M is said to be a null set if:


µ (E) = 0.
That is, E is a null set if its measure is zero.

Moreover, when the measure is complete, as the Lebesgue measure on the real line is, every subset of a null set also belongs to M and is itself a null set.

Figure 1.14: Symbolic representation of a null set on the real line. The blue point denotes a set like Q ∩ [0, 1] which, despite containing infinitely many points, has a Lebesgue measure of zero.

Example 1.21. In the context of measure theory, a Null Set is a set which has a “size” of zero with respect to a given measure, even if the set is not empty. A classic example of a set with zero measure on the real line is the set of rational numbers in the interval [0, 1].

Solution 1.17. Consider Q, the set of all rational numbers. Rational numbers are those numbers that can be expressed as a fraction of two integers, i.e., q = p/s, where p is an integer and s is a non-zero integer.
Within the interval [0, 1], there are infinitely many rational numbers. Despite this, and although they are dense in [0, 1], these rationals are “sparse” in the sense of measure and do not “cover” the entire interval [0, 1]. In fact, it can be demonstrated that the set of rational numbers has zero Lebesgue measure on the real line.
An intuitive way to understand this is to attempt to “cover” all rational numbers in [0, 1] with open intervals of very small length, such that the total length of these intervals is arbitrarily small.
Because the rationals in [0, 1] are countable, we can list them as q1, q2, q3, . . . and cover each qi with an open interval Ii of length ε × 2^{−i}, where ε > 0 is an arbitrarily small number. The sum of the lengths of these intervals is:

$$\sum_{i=1}^{\infty} \varepsilon \times 2^{-i} = \varepsilon \times \left( \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots \right) = \varepsilon$$

Given that ε can be made arbitrarily small, the sum of the lengths of the intervals that cover all the rational numbers in [0, 1] can also be made arbitrarily small. This indicates that the rational numbers have zero measure in [0, 1].
This is a rough and intuitive presentation, and there are technical details involved in the actual proof. However, the key idea is that even though there are infinitely many rational numbers in [0, 1], their “size” in terms of the Lebesgue measure is zero. Hence, they form a Null Set or a set of measure zero.
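The geometric sum behind this covering argument can be checked with exact arithmetic. The sketch below is only an illustration: the function name cover_length and the choice ε = 1/100 are assumptions, and the code computes partial sums, which can suggest but not replace the limit argument.

```python
from fractions import Fraction

def cover_length(epsilon, count):
    """Total length of the first `count` covering intervals, where the i-th
    interval has length epsilon * 2**(-i)."""
    return sum(epsilon * Fraction(1, 2**i) for i in range(1, count + 1))

eps = Fraction(1, 100)
for count in (1, 5, 20):
    total = cover_length(eps, count)
    assert total < eps   # every partial sum stays strictly below epsilon
    print(count, float(total))
```

Each partial sum equals ε(1 − 2^{−count}), so the totals approach ε from below as more rationals are covered.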

1.6. Conclusions
Throughout this chapter, we engaged in a comprehensive examination of the fundamental concepts and
structures underlying advanced mathematical studies, particularly in the fields of measure theory and
probability. Beginning with the fundamentals of set operations, we delved into the intricate operations that
shape these assemblages of elements, enabling us to comprehend how sets interact with one another and
their inherent properties.
Our journey proceeded to the complexities of set families and σ-algebras. Here, we discussed how essential it is to have sets organized into groups that meet specific criteria. σ-algebras proved to be a crucial component of more complex mathematical analysis, particularly when discussing measure theory and probability spaces.
Examining the capabilities of a σ-algebra revealed the strength and potential of these structures, exposing their indelible imprint on numerous mathematical theories and applications. Not only do their properties, such as closure under countable unions and complements, establish their significance, but they also pave the way for advanced mathematical constructions and discussions.
Finally, our introduction to Borel sets emphasized the significance of these particular σ-algebras, especially when examining the real line. Borel sets, whose formation is based on infinite operations on open intervals, exemplify the exquisite complexity and immensity of mathematical thought.
As we near the end of this chapter, it becomes clear that these abstract ideas are the pillars of advanced
mathematics. They not only demonstrate the precision and rigor of mathematical thought, but also the
elegance and breadth of the discipline. As we progress, it is essential to understand these fundamental
concepts, as they will continue to serve as the foundation upon which more complex ideas and theories
are constructed.

1.7. Exercises
Exercise 1.1. Given two sets A = {1, 2, 3} and B = {2, 3, 4, 5}, find A ∪ B, A ∩ B, A − B, and B − A.

Exercise 1.2. Consider the family of sets F = {∅, {1}, {2, 3}, {1, 2, 3}}. Is F a σ-algebra?

Exercise 1.3. Given the sample space Ω = {a, b, c} and the σ-algebra A = {∅, Ω, {a}, {b, c}}, assign a probability measure P to the sets of A such that the total probability is 1.

Exercise 1.4. Given a set E ⊂ R where E = {x ∈ R : 0 < x < 1}, determine if E is a Borel set.

Exercise 1.5. Given the following family of sets:

F = {∅, {a}, {b}, {a, b}}

where a and b are distinct elements. Prove that F is a σ -algebra on the set {a, b}.

Exercise 1.6. Let A, B be two sets in a measure space with µ being the measure. If

µ(A ∩ B) = 2, µ(A^c ∩ B) = 3, µ(A ∩ B^c) = 4

Find µ (A) and µ (B).


Chapter 2
Measurable Spaces and Measure Properties

Carlos Polanco
Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México.
Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, Ciudad de
México, México.

Abstract This chapter delves into the fundamental concepts underlying measure theory, beginning with an
in-depth exploration of measurable spaces. As we traverse the landscape of this mathematical domain, we
discuss the intrinsic properties of a measure, elucidating the nuances that define and distinguish various
measures. A crucial juncture in our journey is the exploration of the intertwined relationship between
measures and σ -algebras, emphasizing their interdependence and the pivotal role of the latter in shaping the
characteristics of the former. The Lebesgue Measure, a cornerstone of modern analysis, is then introduced
in an accessible manner, stripping away the complexities to present its essence. Concluding the chapter,
we highlight the diverse applications of measurable spaces and σ -algebras, underscoring their ubiquity and
relevance in various branches of mathematics and its implications in real-world scenarios. Readers will
emerge with a solid grasp of these foundational concepts, appreciating both their theoretical significance
and practical utility.

Keywords: measurable spaces, properties of a measure, the importance of measure in σ -algebras, the
Lebesgue measure.

2.1. Introduction
Measure theory, a cornerstone of modern mathematics, has enriched our understanding of diverse areas such
as integration, probability, and functional analysis. At its heart, this theory seeks to quantify the “size” or

(B) Carlos Polanco: Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México. Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de
México; Tel: +01 55 5595 2220; E-mail: polanco@unam.mx


“volume” of sets in a way that generalizes our elementary notions of length, area, and volume. This chapter
embarks on a journey to explore the foundational concepts and the broader implications of measure theory.
We begin by introducing Measurable Spaces [3, 4, 5, 6], which sets the stage for our subsequent discussions.
With this foundational understanding, we delve into the Properties of a Measure, understanding the nuances
and intricacies that characterize measures. The interplay between measures and the structure of sets becomes
particularly evident when we discuss σ -algebras. In the section The Importance of Measure in σ -algebras,
we shed light on the symbiotic relationship between these two constructs and how they shape the landscape
of measure theory. A notable highlight of this chapter is the Lebesgue Measure. While traditional measures,
like length or volume, have their limitations, the Lebesgue measure emerges as a powerful tool, overcoming
many of these challenges. We aim to demystify this concept with a simplified explanation that brings
its importance and subtlety to the forefront. Finally, measure theory is not just an abstract mathematical
construct; it has profound implications in various domains. In Applications of Measurable Spaces and σ -
algebras, we touch upon the real-world significance and the myriad areas in which these concepts play
a pivotal role. So buckle up for an enlightening voyage into the world of measure theory, where abstract
concepts intertwine with practical applications and mathematical rigor meets intuitive understanding.

(1) To Measure: Measuring, in its essence, means assigning a number to an object or set of objects in a
systematic way. For example, when you measure the length of a book, you’re assigning a number that
describes its length. In measure theory, you measure sets in a similar way.
(2) Measure Theory: It is a field of mathematics that generalizes the notion of measuring. Instead of
measuring only lengths, areas, or volumes, measure theory seeks to understand how to measure sets
in more general spaces in a systematic and coherent manner.
(3) Measure: In this context, a measure is a function that takes a set and assigns it a non-negative number,
interpreting its “size”. However, not all sets can have a measure assigned, and the function must satisfy
certain properties. For instance, the “size” of an empty set is always 0, and if you have two non-
overlapping sets, the “size” of their union should be the sum of their individual “sizes”.

Example 2.1. One of the most important examples of a measure is the Lebesgue measure. This is a way of
assigning a “size” to sets of real numbers. The beauty of the Lebesgue measure is that it can measure sets
that simpler notions of measurement (like length, area, or volume) cannot. It is fundamental in mathematical
analysis, especially in the theory of integration.

Measure theory is a way to understand and generalize the concept of “measuring” in mathematics. It goes
beyond the simple notions of length, area, or volume, allowing the measurement of sets in broader and more
abstract contexts.

Definition 2.1. Let M be a σ -algebra on a set X and µ : M → [0, ∞] be a function. The function µ is called
a measure on M if it satisfies the following properties:

(1) Non-negativity: For every E ∈ M , µ (E) ≥ 0.


(2) Measure of the empty set: µ(∅) = 0.
(3) σ-additivity: If $\{E_n\}_{n=1}^{\infty}$ is a sequence of sets in M that are pairwise disjoint (i.e., $E_i \cap E_j = \emptyset$ for $i \neq j$),
then:

\[ \mu\left( \bigcup_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \mu(E_n) \]

When µ is a measure on M , the triple (X, M , µ ) is called a measure space.
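
The three properties in Definition 2.1 can be checked by hand on a small finite example. The sketch below is only an illustration: the point weights and the helper name `make_discrete_measure` are assumptions introduced here, not part of the text. It builds a measure on all subsets of X = {a, b, c} by summing non-negative weights.

```python
from fractions import Fraction


def make_discrete_measure(weights):
    """Return mu, where mu(E) is the sum of the weights of the points in E.

    `weights` maps each point of X to a non-negative number (illustrative values).
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")

    def mu(E):
        # Summing over an empty set gives 0, so mu(empty set) = 0.
        return sum((weights[x] for x in E), Fraction(0))

    return mu


# Hypothetical weights on X = {a, b, c}; any non-negative values would do.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
mu = make_discrete_measure(weights)

E1, E2 = frozenset({"a"}), frozenset({"b", "c"})  # two disjoint sets
print(mu(frozenset()))                 # measure of the empty set
print(mu(E1 | E2) == mu(E1) + mu(E2))  # additivity on disjoint sets
```

On a finite set, σ-additivity reduces to finite additivity, which is why checking disjoint pairs is enough in this toy setting.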



The most famous measure on R is the Lebesgue measure, but many other measures can be defined in
various contexts. The central idea is that a measure provides a formalized and generalized notion of “size”
or “volume” for sets.

2.2. Measurable Spaces


Imagine you have a large bag of gems of various colors and shapes, and you want to group them in certain
ways to take inventory. If you’re only interested in color, you might have groups of red, blue, green gems,
etc. If you’re interested in shapes, you might group them into circles, squares, triangles, etc. Each of these
ways of grouping the gems defines a “collection” of sets.
Now, in mathematics, when we try to “measure” sets, we need a special collection that’s broad enough to
include sets of interest but also structured in a way that permits measurement. These special collections are
what we call measurable spaces.
A measurable space is simply a pair (X, M ) where:
(1) X is the “universal” set or the total set under consideration (like the entire bag of gems).
(2) M is a σ -algebra over X. This means M is a collection of subsets of X (like different groupings of
gems) that meets three properties:
(a) Contains the empty set: ∅ ∈ M .
(b) Is closed under complement: if a set E is in M , then the set containing everything not in E (called
the complement of E) is also in M .
(c) Is closed under countable unions: if you have a sequence of sets E1 , E2 , E3 , . . . and each Ei is in M ,
then the union of all these sets is also in M .

In simple terms, a measurable space gives us a framework that lets us talk about measuring sets coherently.
It’s like having specific rules about how you can group gems so that any grouping you come up with
following those rules can be “counted” or “measured” systematically.
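
For a finite set X, the three conditions above can be verified mechanically. The following is a minimal sketch, assuming X is finite (so that closure under pairwise unions already gives closure under countable unions); the function name `is_sigma_algebra` and the sample families are illustrative choices.

```python
from itertools import combinations


def is_sigma_algebra(X, M):
    """Check the three sigma-algebra conditions for a family M of subsets of a finite set X."""
    X = frozenset(X)
    M = {frozenset(E) for E in M}
    if any(not E <= X for E in M):
        return False  # every member must be a subset of X
    if frozenset() not in M:
        return False  # (a) contains the empty set
    if any(X - E not in M for E in M):
        return False  # (b) closed under complement
    # (c) on a finite X, closure under pairwise unions implies closure
    # under countable unions, because only finitely many sets exist.
    return all(E | F in M for E, F in combinations(M, 2))


X = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, X]
bad = [set(), {1}, X]  # the complement {2, 3, 4} is missing

print(is_sigma_algebra(X, good))
print(is_sigma_algebra(X, bad))
```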

Figure 2.1: Illustration of a measurable space Ω with a measurable set A within it.

2.3. Properties of a Measure


A measure is a tool in mathematics that gives a formal way to describe the “size” or “amount” of sets. It’s
like how we measure the weight of an object in pounds or kilograms, but more general. A measure has some
crucial properties that make it behave nicely and logically.

(1) Non-negativity: Just as you can’t have a negative weight for an object, in measure theory, the “size”
or measure of a set is always non-negative. This means that the measure of any set is either zero or a
positive number.

∀E, µ (E) ≥ 0
(2) Measure of the empty set: Think of an empty box; it doesn’t contain anything, so its content weighs
zero. Similarly, in measure theory, the measure of an empty set (a set with nothing in it) is always zero.

µ(∅) = 0
(3) σ -additivity: This property is a bit like adding up weights. If you have several disjoint boxes (meaning
no overlaps), and you know the weight of each box, the total weight is just the sum of their individual
weights. In measure theory, if you have a collection of sets that don’t overlap, the measure of their union
(all of them put together) is the sum of their individual measures.
\[ \text{If } E_1, E_2, E_3, \ldots \text{ are pairwise disjoint, then } \quad \mu\left( \bigcup_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \mu(E_n) \]

The measure of any set is always ≥ 0. Suppose we’re measuring lengths on the real line. The measure
(length) of the interval [3, 5] is 5 − 3 = 2. Note that this is a non-negative number.
The measure of the empty set is always 0. In our context of measuring lengths on the real line, the empty set
has no length. Therefore, its measure is µ(∅) = 0.
If you have multiple disjoint sets and know their individual measures, the measure of their union is the sum
of their individual measures.

Example 2.2. Suppose we have three disjoint intervals: A = [1, 2], B = [3, 4], and C = [5, 6]. The measure
of each is 1. If we take the union of these sets, A ∪ B ∪ C, then the total measure is µ (A) + µ (B) + µ (C) =
1 + 1 + 1 = 3.

Example 2.3. Imagine you’re measuring areas on a plane. We have two disjoint squares: the first square has
an area of 4 cm2 and the second square has an area of 9 cm2 . If we consider both squares together, their
combined total area (measure) is 4 cm2 + 9 cm2 = 13 cm2 .

Example 2.4. Consider the intervals A = [2, 4] and B = [5, 7] on the real line. Determine the measure of each
interval and then the measure of their union.

Solution 2.1. The measure (length) of interval A is 4−2 = 2. Similarly, the measure of interval B is 7−5 = 2.
Since A and B are disjoint, the measure of their union A ∪ B is the sum of their individual measures.
So, µ (A ∪ B) = µ (A) + µ (B) = 2 + 2 = 4.

Example 2.5. Suppose you have three disjoint sets on the real line: C = [1, 3], D = [4, 6], and E = [7, 8].
Determine the measure of the union of these sets.

Solution 2.2. First, find the measure of each set:

(1) µ (C) = 3 − 1 = 2
(2) µ (D) = 6 − 4 = 2
(3) µ (E) = 8 − 7 = 1

Now, since the sets are disjoint, the measure of their union is the sum of their measures:

µ (C ∪ D ∪ E) = µ (C) + µ (D) + µ (E) = 2 + 2 + 1 = 5
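
The computation in Solution 2.2 can be sketched in a few lines. This is an illustration only: intervals are represented as closed intervals given by endpoint pairs (a, b), and the helper name `measure_of_disjoint_union` is an assumption of this sketch. The function refuses to add lengths when the intervals are not pairwise disjoint, because additivity applies only to disjoint sets.

```python
def length(interval):
    """Length b - a of the closed interval [a, b]."""
    a, b = interval
    if b < a:
        raise ValueError("an interval needs a <= b")
    return b - a


def measure_of_disjoint_union(intervals):
    """Sum the lengths of pairwise disjoint closed intervals."""
    ordered = sorted(intervals)
    for (_, b1), (a2, _) in zip(ordered, ordered[1:]):
        if a2 <= b1:  # closed intervals sharing a point are not disjoint
            raise ValueError("intervals are not pairwise disjoint")
    return sum(length(i) for i in ordered)


# The sets C, D, E of Example 2.5.
print(measure_of_disjoint_union([(1, 3), (4, 6), (7, 8)]))
```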

Example 2.6. Consider the set F = (2, 5) which is an open interval. Determine its measure.

Solution 2.3. Open and closed intervals with the same endpoints differ only in those endpoints, and a single
point has measure zero. The length (or measure) is therefore still obtained by subtracting the endpoints.
For the interval F, its measure is:
µ(F) = 5 − 2 = 3

Figure 2.2: Representation of a measure as the length of an interval [a, b] on the real line. Here, the measure
is b − a.

Figure 2.3: Illustration of σ-additivity. The total measure of the union of disjoint sets (intervals) is the sum
of their individual measures.

Example 2.7. Given a measure space (Ω, F, µ) and three sets A, B, and C in F with the following properties:

(1) A is a measurable set with measure µ (A) = 5.


(2) B is a measurable set with measure µ (B) = 3.
(3) C is the intersection of A and B with measure µ (C) = 2.

Determine the measure of the union of A and B, that is, µ (A ∪ B).



Solution 2.4. To find the measure of the union of A and B, we use the measure property known as the
“inclusion-exclusion principle”, which holds whenever µ(A ∩ B) is finite:

µ (A ∪ B) = µ (A) + µ (B) − µ (A ∩ B)

In this problem, A ∩ B is the set C.


Plugging in the given values, we have:

µ (A ∪ B) = µ (A) + µ (B) − µ (C)


µ (A ∪ B) = 5 + 3 − 2
µ (A ∪ B) = 6

Therefore, the measure of the union of A and B is 6.

When we want to find the measure of the union of two sets, we have to account for the fact that if we just add
their individual measures, we are counting the measure of their intersection twice (since both sets contain
this intersection). That’s why the “inclusion-exclusion principle” subtracts the measure of the intersection.
In our case, as C is the intersection of A and B, its measure was double-counted (once for A and once for B).
By subtracting µ (C) from µ (A) + µ (B), we correct this double counting and get the correct measure for the
union A ∪ B.
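
The double-counting argument can be checked concretely with the counting measure (the measure that assigns to a finite set its number of elements). The sets below are hypothetical; they were chosen only so that the measures match the values 5, 3, and 2 used in Example 2.7.

```python
def counting_measure(E):
    """Counting measure: the number of elements of a finite set."""
    return len(E)


A = {0, 1, 2, 3, 4}  # mu(A) = 5
B = {3, 4, 5}        # mu(B) = 3
C = A & B            # the intersection {3, 4}, mu(C) = 2

union_measure = counting_measure(A | B)
by_formula = counting_measure(A) + counting_measure(B) - counting_measure(C)
print(union_measure, by_formula)
```

Adding µ(A) and µ(B) alone would give 8, because the two elements of C are counted once in A and once in B.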

Example 2.8. Given a measurable space (Ω, F ), where Ω is the underlying set and F is a σ -algebra on Ω.
Suppose you have a function f : Ω → R satisfying the following property:
For every r ∈ R, the set {x ∈ Ω | f (x) > r} belongs to F .
Prove that f is a measurable function.

Solution 2.5. A function f : Ω → R is said to be measurable if, for every open set O ⊂ R, the preimage
f −1 (O) is a set in F .

(1) By the given property of f, the set {x ∈ Ω | f(x) > r} is an element of F for every r ∈ R.
(2) The set {x ∈ Ω | f(x) ≤ r} is the complement of the set {x ∈ Ω | f(x) > r}. Since F is a σ-algebra,
the complement of any set in F is also in F. Thus, {x ∈ Ω | f(x) ≤ r} ∈ F.
(3) The set {x ∈ Ω | f(x) < r} is the countable union of the sets {x ∈ Ω | f(x) ≤ r − 1/n}, n = 1, 2, . . .,
each of which lies in F by step (2). Since F is closed under countable unions,
f^{−1}((−∞, r)) = {x ∈ Ω | f(x) < r} ∈ F.
(4) For a < b, the preimage of the open interval (a, b) is {x ∈ Ω | f(x) > a} ∩ {x ∈ Ω | f(x) < b}. This is an
intersection of two sets in F, and σ-algebras are closed under finite intersections, so f^{−1}((a, b)) ∈ F.
(5) Every open set O ⊂ R is a countable union of open intervals (a_k, b_k), and f^{−1}(O) is the union of the
preimages f^{−1}((a_k, b_k)). Hence f^{−1}(O) ∈ F.

We have shown that the preimage under f of any open set in R is in F. By definition, this means f is a
measurable function.

Figure 2.4: Illustration of the measurable space Ω and its mapping to the real line R via the function f .
Points from Ω are transformed into real values on R. The blue shaded region on R above the dashed red line
at r highlights values where f (x) > r. This visual representation showcases the given property of f that for
every r ∈ R, the set {x ∈ Ω | f (x) > r} belongs to F .

Figure 2.5: Illustration of the measurable space Ω and its mapping to the real line R via the function f .
Points from Ω are transformed into real values on R. The shaded region on R highlights values greater than
a chosen real number r, representing the set of outputs where f (x) > r. This diagram visually depicts the
given property of f that for every r ∈ R, the set {x ∈ Ω | f (x) > r} is a measurable set in F .

2.4. The Importance of Measure in σ -algebras


The concept of measure in σ -algebras is fundamental in measure theory and is especially crucial in the study
of probability and statistics. Here’s a brief outline of its importance in these contexts:

(1) Formal Definition of Probability: In probability theory, a probability space consists of a sample space
Ω, a σ -algebra F of subsets of Ω, and a probability measure P that assigns to each event (a set in F )
a probability. This structure formalizes and generalizes the intuitive notions of random experiments and
their possible outcomes.
(2) Integration: Measure allows us to define the Lebesgue integral, which is a generalization of the Riemann
integral. This new integral can handle a broader class of functions and is essential in real and complex
analysis, ergodic theory, and more.
(3) Central Limit Theorem: One of the most fundamental results in probability, which states that the suitably
normalized sum of independent identically distributed random variables (under certain conditions, such as
finite variance) converges in distribution to a normal distribution as more and more variables are summed,
relies on measure theory.
(4) Conditional Probability and Conditional Expectation: Measure enables us to define and work with
conditional probabilities and expectations in a broader framework, which is crucial in statistics and
econometrics, for instance.
(5) Generality: Through the concept of measure, it’s possible to deal with more general spaces than just
the Euclidean space Rn . This allows for the modeling of a wide variety of random phenomena in
mathematics, physics, economics, and many other disciplines.
(6) Product Measure: When working with multiple measure spaces (as in the case of multiple random
variables), constructing the product measure is fundamental. This construction allows us to define a
joint probability on the product of the spaces.

(7) Handling Mathematical “Anomalies”: Measure theory allows for working with “small” sets precisely.
An example is the class of sets of measure zero in R^n, which are so “thin” that their “volume” is zero
(like the set of rational numbers in R).

The measure in σ -algebras provides the necessary tools to work rigorously with concepts of probability
and analysis, and is essential for the development and application of statistics, probability theory, and other
mathematical areas.
Example 2.9. Consider the sample space Ω = {1, 2, 3, 4}. List the σ -algebra generated by the subset A =
{1, 2}.
Solution 2.6. The subset A generates a σ -algebra, which contains the empty set, the whole space, the set A
itself, and its complement. Thus, the σ -algebra is:

F = {∅, {1, 2}, {3, 4}, Ω}
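
On a finite sample space, the σ-algebra generated by a family of subsets can be computed by repeatedly adding complements and unions until nothing new appears. The sketch below assumes Ω is finite, so finite unions suffice and the loop must terminate; the function name `generate_sigma_algebra` is an illustrative choice.

```python
def generate_sigma_algebra(omega, generators):
    """Smallest sigma-algebra on a finite set omega containing the generator sets."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    while True:
        closed = set(family)
        closed |= {omega - E for E in family}             # add complements
        closed |= {E | F for E in family for F in family}  # add pairwise unions
        if closed == family:  # nothing new: the family is closed
            return family
        family = closed


omega = {1, 2, 3, 4}
F = generate_sigma_algebra(omega, [{1, 2}])
print(sorted(sorted(E) for E in F))
```

With the single generator {1, 2}, the result consists of the empty set, {1, 2}, {3, 4}, and Ω, in line with Solution 2.6.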

Example 2.10. For the real line R, consider the set B = {x ∈ R : 0 ≤ x < 1}. Describe a σ -algebra that
contains this set.
Solution 2.7. One possible σ-algebra that contains the set B is the Borel σ-algebra on the real line, denoted
by B(R). It is the smallest σ-algebra that contains all the open intervals of the real line. The set B = [0, 1)
is the countable intersection of the open intervals (−1/n, 1), n = 1, 2, . . ., so it belongs to B(R), as do
countless other sets.
Example 2.11. Let Ω = {a, b, c, d, e}. Consider the σ -algebra

F = {∅, {a, b}, {c, d, e}, Ω}

and the measure function µ defined as µ({a, b}) = 0.4 and µ({c, d, e}) = 0.6. Find µ(Ω) and µ(∅).

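Solution 2.8. Since Ω is the disjoint union of {a, b} and {c, d, e}, additivity gives:

µ(Ω) = µ({a, b}) + µ({c, d, e}) = 0.4 + 0.6 = 1

and, by the definition of a measure,

µ(∅) = 0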
Example 2.12. Consider the real interval Ω = [0, 2]. Let the σ -algebra be the collection of all subsets of Ω
that are either countable or have a countable complement within Ω. If the measure of a single point is defined
as 0, determine the measure of the set A = [0, 1).
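Solution 2.9. The set A = [0, 1) is uncountable, and its complement in Ω, the interval [1, 2], is uncountable
as well. Hence A is neither countable nor co-countable, so it does not belong to this σ-algebra, and a measure
defined on this σ-algebra assigns no value to A. A set that does belong to it is Ω \ {1}, whose complement {1}
is countable. If, in addition, µ(Ω) = 2, then:
µ(Ω \ {1}) = µ(Ω) − µ({1}) = 2 − 0 = 2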
Example 2.13. Let Ω = R, and consider the Borel σ -algebra, B(R). Let λ be the Lebesgue measure.
Determine the measure of the set B = [1, 3) ∪ [4, 5].
Solution 2.10. The Lebesgue measure of an interval with endpoints a ≤ b is b − a, whether or not the endpoints
are included. Since [1, 3) and [4, 5] are disjoint, the measure of B is:
λ(B) = λ([1, 3)) + λ([4, 5]) = (3 − 1) + (5 − 4) = 2 + 1 = 3

2.5. Measure Zero Sets and their Identification

Definition of Measure Zero Sets


A set E ⊂ R has measure zero if, for any ε > 0, there exists a collection of open intervals I1 , I2 , . . . such that:
(1) E is contained within the union of these intervals, that is, $E \subset \bigcup_n I_n$.

(2) The sum of the lengths of these intervals is less than ε. Mathematically, $\sum_n |I_n| < \varepsilon$, where $|I_n|$ denotes
the length of interval $I_n$.

This means that we can cover the set E with intervals whose total length is arbitrarily small. Imagine that
E is a scattered set of points on a line, and you’re trying to cover these points with very small intervals. If
you can make the total length of these intervals as small as any given positive number ε , then E has measure
zero.

Examples of Measure Zero Sets


(1) Finite Sets: Any finite set of points in R has measure zero. For each point, we can cover it with an
interval of length ε/(2n), where n is the number of points and ε > 0. The sum of these lengths is ε/2 < ε,
which can be made as small as we wish.
(2) Set of Rational Numbers in [0,1]: Although this is an infinite and dense set in [0,1], it has measure zero.
We can “surround” each rational number with a very small interval such that the total length of these
intervals is less than any given positive number.
(3) Countable Sets: In general, any countable (finite or countably infinite) set in R has measure zero. If E =
{x1 , x2 , . . .}, for each x_n, choose an open interval centered at x_n of length ε/2^{n+1}. The total length of
these intervals is $\sum_{n=1}^{\infty} \varepsilon / 2^{n+1} = \varepsilon/2 < \varepsilon$.
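
The covering argument for countable sets can be made concrete. The sketch below is an illustration under stated assumptions: it enumerates the first 50 rationals of [0, 1] by increasing denominator (one of many possible enumerations), covers the n-th one with an interval of length ε/2^{n+1}, and checks that the total length stays below ε. The helper names `first_rationals` and `cover` are hypothetical.

```python
from fractions import Fraction


def first_rationals(count):
    """First `count` distinct rationals in [0, 1], listed by increasing denominator."""
    seen, listed = set(), []
    q = 1
    while len(listed) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                listed.append(r)
                if len(listed) == count:
                    break
        q += 1
    return listed


def cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an interval of length eps / 2**(n + 1)."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 2)  # half of the length eps / 2**(n + 1)
        intervals.append((x - half, x + half))
    return intervals


eps = Fraction(1, 100)
points = first_rationals(50)
intervals = cover(points, eps)
total = sum(b - a for a, b in intervals)
print(total < eps)
```

Exact fractions are used instead of floating-point numbers so that the comparison with ε is not affected by rounding.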

Identification of Measure Zero Sets


Identifying measure zero sets isn’t always straightforward. The examples above are relatively straightforward,
but there are more intricate sets. One approach is to attempt to cover the set with intervals whose total length
is as small as you desire. If you can achieve this, the set has measure zero. However, the existence of such
intervals might not be obvious and could require advanced mathematical arguments.

Why are Measure Zero Sets Important?


Even though these sets might be infinite and dense within certain intervals, their “size” in the Lebesgue
sense is null. This has implications in integration and probability theory, as a function can be altered on a
measure zero set without affecting its integral. In other words, measure zero sets are “insignificant” from the
perspective of Lebesgue integration.


Figure 2.6: A graphical representation of the interval [0, 1] on the real line. Within this interval, a countable
set of distinct points, depicted as red dots, is distributed at specific positions: 0.1, 0.22, 0.33, 0.44, 0.55, 0.66,
0.77, and 0.88. Furthermore, four of these red points (those at 0.1, 0.22, 0.33, and 0.44) are enveloped by
small blue rectangular intervals. These blue intervals demonstrate the concept of covering specific elements
of a countable set using intervals of minimal lengths.

Visualization of a Set with Non-Zero Measure


In the context of R2 , one can visualize a set with a non-zero measure as a geometric shape whose size or
area is non-zero. For instance, consider a rectangle lying on the x-axis with a given height. The measure (in
this context, area) of this rectangle would be its width multiplied by its height.


Figure 2.7: The above figure represents a rectangle, labeled as A, that has a base on the interval [1, 2] on the
x-axis and a height h = 2. The measure (area) of the set represented by this rectangle is 2 × (2 − 1) = 2,
which can be visualized on the y-axis. The rectangle is shaded in blue to highlight its region in R2 .

2.6. The Lebesgue Measure: A Simplified Explanation


Imagine trying to measure the length of something, like a stretch of road. You might approach this by using
a ruler, marking off each unit of length (e.g., each inch or meter) one at a time. This is analogous to how we
usually think about measuring the length of an interval on the real number line. If we have an interval from 1
to 4, we know it is 3 units long because 4 − 1 = 3. This is essentially the Riemann way of measuring things,
named after the mathematician Riemann.
But what if our stretch of road isn’t a single, continuous piece? What if it’s made up of scattered plots, some
of which might be infinitely small? How would you measure that? This is where the Lebesgue measure
comes into play.
Instead of focusing on where these plots are (like marking with a ruler), the Lebesgue approach focuses on
how much there is. It doesn’t matter where they are; what matters is their total amount.
Mathematically, the Lebesgue measure takes a more generalized viewpoint. Instead of summing up the
lengths of intervals, it looks at the overall “size” or “volume” of sets of numbers. This allows it to handle
much more complex sets, including those that traditional methods struggle with, such as sets containing
points that are spread out in peculiar ways.
The real genius of the Lebesgue measure lies in its ability to give a meaningful size to even very complicated
sets. For instance, there are sets with an “in-between” size: they’re not as small as a single point (which has
size 0), but they’re not quite as big as an interval either. The Lebesgue measure can precisely capture the size
of such sets.
To put it simply, while the traditional Riemann way of measuring looks at “where” to sum up lengths, the
Lebesgue way focuses on “what” there is to measure, regardless of its location.

Definition 2.2. The Lebesgue measure λ on the real line is defined for every interval (a, b) (where a ≤ b) as
the difference b − a. This measure is then extended to more complex sets using σ -algebras and the property
of countable additivity. Specifically, if a set can be represented as a countable union of disjoint intervals,
its Lebesgue measure is the sum of the measures of these intervals. The Lebesgue measure can be further
generalized to higher dimensions, leading to concepts such as area and volume, and is the foundation for
Lebesgue integration.
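
For a finite union of intervals, the rule in Definition 2.2 can be sketched as follows. The definition adds the lengths of disjoint intervals, so overlapping intervals must first be merged. The representation of an interval as an endpoint pair (a, b) and the function name `lebesgue_measure` are assumptions of this sketch; endpoints are single points of measure zero, so the code does not track whether they are included.

```python
def lebesgue_measure(intervals):
    """Total length of a finite union of intervals given as (a, b) pairs with a <= b."""
    total = 0
    current = None  # the merged interval being built, as [start, end]
    for a, b in sorted(intervals):
        if current is None or a > current[1]:
            # A gap was found: close the previous merged interval.
            if current is not None:
                total += current[1] - current[0]
            current = [a, b]
        else:
            # Overlap or shared endpoint: extend the merged interval.
            current[1] = max(current[1], b)
    if current is not None:
        total += current[1] - current[0]
    return total


print(lebesgue_measure([(1, 3), (4, 5)]))  # disjoint pieces
print(lebesgue_measure([(0, 2), (1, 3)]))  # overlapping pieces are merged first
```

Merging before summing avoids counting the overlap twice, which is the same correction the inclusion-exclusion principle makes.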


Figure 2.8: The figure above represents an interval on the real number line between 2 and 5, with a length
(or Lebesgue measure) of 3 units. This length is depicted in red. For basic intervals, the Lebesgue measure
is just the length of the interval. The complexity of the Lebesgue measure arises when considering more
intricate or disjoint sets on the number line.

Example 2.14. Consider the set E of all rational numbers in the interval [0, 1]. Since E is countable, its
Lebesgue measure is λ(E) = 0 (any countable set has measure 0). Define the function:

\[ f(x) = \begin{cases} 1 & \text{if } x \in E \\ 0 & \text{if } x \notin E \end{cases} \]

Determine the derivative f′(x) at the point x = 0.5.

Solution 2.11. The function f (x) is an indicator function for the set E. For any x that is not rational, f (x) is
0. For any x that is rational, f (x) is 1.
To compute the derivative, first note that at any neighborhood around x = 0.5, the function f takes both
values, 0 and 1.
Consider any number ε > 0. If we take an interval (0.5 − ε , 0.5 + ε ), this interval will contain both rational
and irrational numbers. Therefore, the values of f in this interval will oscillate between 0 and 1.
For the derivative of f at x = 0.5, we look at the difference quotient:

\[ \frac{f(0.5 + h) - f(0.5)}{h} \]

as h tends to 0.
Since 0.5 is rational, f(0.5) = 1. For rational h the quotient equals 0, while for irrational h it equals −1/h,
which is unbounded as h tends to 0. The quotient therefore has no limit, and f′(0.5) does not exist.
Equivalently, f is discontinuous at 0.5, so it cannot be differentiable there.
The takeaway from this example is that even though the Lebesgue measure of the rational numbers in [0, 1] is 0,
their presence affects the differentiability of functions defined in terms of them. The concept of measure helps
us understand how a “small” set (in terms of measure) can have a significant impact on other mathematical
structures, such as the differentiability of a function.

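Example 2.15. Consider the set F of all numbers x in the interval [0, 1] such that x has a decimal expansion
consisting only of the digits 0 and 1. This is a Cantor-type set, built in the same way as the Cantor ternary
set (the numbers in [0, 1] with a base-3 expansion using only the digits 0 and 2). Like the Cantor set, F is
“nowhere dense” and has Lebesgue measure λ(F) = 0: fixing the first n decimal digits covers F (apart from the
point 1) by 2^n intervals of length 10^{−n}, whose total length (2/10)^n tends to 0.
Define the function:

\[ g(x) = \begin{cases} 1 & \text{if } x \in F \\ 0 & \text{if } x \notin F \end{cases} \]

Determine the derivative g′(x) at the point x = 0.25.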

Solution 2.12. The function g(x) is an indicator function for the set F. It evaluates to 1 for any x that has a
decimal expansion using only 0s and 1s and 0 otherwise.
Neither decimal expansion of 0.25 (0.25000 . . . or 0.24999 . . .) uses only the digits 0 and 1, so 0.25 is not
in F and g(0.25) = 0. Moreover, every element of F other than 1 is at most 0.111 . . . = 1/9, so the interval
(1/9, 1) contains no point of F. This interval is a neighborhood of 0.25 on which g is identically 0.
For the derivative of g at x = 0.25, we look at the difference quotient:

\[ \frac{g(0.25 + h) - g(0.25)}{h} \]

as h tends to 0.
For |h| < 0.25 − 1/9, both g(0.25 + h) and g(0.25) equal 0, so the quotient is 0. Thus g′(0.25) exists and
equals 0.
This example contrasts with the previous one: the rationals are dense in [0, 1], whereas F is nowhere dense, so
g is locally constant away from F. At a point of F the situation changes: every neighborhood of such a point
contains numbers not in F, so g is discontinuous there and not differentiable. A set with zero Lebesgue
measure, like the Cantor set, can therefore still affect the differentiability of functions defined in terms of it.

Example 2.16. Consider the sequence of sets An ⊆ [0, 1] defined by:


 
\[ A_n = \left\{ \frac{k}{2^n} \;\middle|\; k = 0, 1, 2, \ldots, 2^n \right\} \]

These sets An consist of points that are fractions with denominators that are powers of 2. Determine the limit
set A of the An , and discuss its measure in the measurable space with the Borel σ -algebra on [0, 1].
Solution 2.13. Given a real number x in [0, 1], we can write its binary expansion as:

\[ x = \sum_{i=1}^{\infty} a_i 2^{-i} \]

where each a_i is either 0 or 1.

If x belongs to one of the sets A_n, then it has a finite base-2 expansion (a_i = 0 for all terms after the nth
term). Since k/2^n = 2k/2^{n+1}, the sets are nested: A_n ⊂ A_{n+1}.
The limit set is therefore the union A = ⋃_n A_n, the set of dyadic rationals in [0, 1], that is, the numbers
with a finite binary expansion. A number such as 1/3, whose binary expansion 0.010101 . . . never terminates,
belongs to no A_n and hence not to A.
Each A_n is finite, so A is countable and λ(A) = 0, in agreement with continuity from below:
λ(A) = lim λ(A_n) = 0. The set A is nevertheless dense in [0, 1]; its closure is the whole interval [0, 1],
which has Lebesgue measure 1.
The Borel σ-algebra on [0, 1] contains each A_n and their limit A, so measure theory gives us a simple and
unified way to describe them: a countable set, however densely spread, has measure 0.
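
The structure of the sets A_n can be explored directly for small n. The sketch below uses exact fractions; the function name `dyadic_set` is an illustrative choice. It shows that each A_n is a finite set with 2^n + 1 elements and that each set is contained in the next.

```python
from fractions import Fraction


def dyadic_set(n):
    """The set A_n = {k / 2**n : k = 0, 1, ..., 2**n} as exact fractions."""
    return {Fraction(k, 2 ** n) for k in range(2 ** n + 1)}


for n in range(4):
    A_n = dyadic_set(n)
    nested = A_n <= dyadic_set(n + 1)  # is A_n a subset of A_(n+1)?
    print(n, len(A_n), nested)
```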

Example 2.17. This is a basic example, but it introduces the key concepts of a measurable space in the
context of probability theory. Constructing Ω, F , and P is fundamental in probability and statistics to model
and analyze random experiments.
Let’s consider a classic example related to the tossing of a coin.
Sample Space Consider an experiment where we toss a coin twice. The sample space Ω is the set of all
possible outcomes. For our experiment,

Ω = {(H, H), (H, T ), (T, H), (T, T )}

where H represents heads and T represents tails.

σ-algebra Let’s consider the σ-algebra F generated by all the subsets of Ω. This is the power set of Ω,
denoted 2^Ω. Thus,

F = {∅, {(H, H)}, {(H, T)}, {(T, H)}, {(T, T)}, {(H, H), (H, T)}, . . . , Ω}

There are in total 2^4 = 16 subsets in F.

Probability Measure Lastly, let’s assign a probability measure P to this space. Assuming the coin is fair,
the probability of obtaining heads or tails in one toss is 1/2. Therefore, for any specific outcome in Ω (like
(H, T)), the probability is 1/2 × 1/2 = 1/4.
For instance:

\[ P(\{(H, T)\}) = \frac{1}{4} \]

\[ P(\{(H, H), (H, T)\}) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2} \]

Detailed Explanation:
Detailed Explanation:

(1) Sample Space: By tossing a coin twice, we’ve identified all the possible outcomes we might observe,
represented by Ω.
(2) σ -algebra: The σ -algebra F tells us which subsets of Ω we can “measure” or assign a probability to.
In this case, we can measure any subset of Ω, from the empty set to the full set Ω (and everything in
between).
(3) Probability Measure: Finally, we assign a probability to each measurable set in F . Using the assumption
that the coin is fair, we can state that each possible outcome in Ω has the same probability of 1/4. To find
the probability of any other set in F , we simply sum up the probabilities of the individual outcomes in
that set.
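
The construction of Ω, F , and P can be sketched in a few lines of Python. This is only an illustrative sketch: the names `omega`, `sigma_algebra`, and `prob` are introduced here and are not part of the text, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import chain, combinations, product

# Sample space: ordered pairs of outcomes for two tosses.
omega = list(product("HT", repeat=2))

# Sigma-algebra: the power set of omega, stored as frozensets.
sigma_algebra = [
    frozenset(subset)
    for subset in chain.from_iterable(
        combinations(omega, r) for r in range(len(omega) + 1)
    )
]

def prob(event):
    """Probability of an event under the fair-coin assumption (1/4 per outcome)."""
    return Fraction(len(event), len(omega))

print(len(sigma_algebra))               # number of events in F
print(prob({("H", "T")}))               # a single outcome
print(prob({("H", "H"), ("H", "T")}))   # the event "first toss is heads"
```

Because every outcome carries the same weight, the probability of an event reduces to counting its elements, which is exactly the summation rule described in item (3).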

Example 2.18. Let’s delve into a basic example interlinking topology and measure theory.
Topological Space Let X be the closed interval [0, 1] in R. This interval, endowed with the standard topology
induced by R, is a topological space.
Borel σ -algebra One of the most significant σ -algebras associated with topological spaces is the Borel
σ -algebra, which is the σ -algebra generated by all open sets of the space. For the interval [0, 1], the Borel
σ -algebra, denoted B([0, 1]), is the smallest collection of subsets of [0, 1] that contains the open sets and
is closed under complements, countable unions, and countable intersections.
Lebesgue Measure The Lebesgue measure, denoted m, is a measure defined on the Borel σ -algebra B([0, 1])
that assigns to each Borel set B its “size” or “length”. For instance, for any subinterval [a, b] of [0, 1], where
0 ≤ a < b ≤ 1, we have:

m([a, b]) = b − a

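Example: Consider the set A which is the union of the intervals [0, 1/4) and (3/4, 1]. We aim to find the measure
of A under the Lebesgue measure. Since the two intervals are disjoint, additivity gives:
\[ m(A) = m\left(\left[0, \tfrac{1}{4}\right)\right) + m\left(\left(\tfrac{3}{4}, 1\right]\right) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2} \]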

Hence, the “length” or “size” of A is 1/2 under the Lebesgue measure. This example showcases how
topological concepts, such as open and closed sets, can interplay with measure theory. In particular, the
topology determines which sets are “measurable” (via the Borel σ -algebra), and measure theory informs us
how “large” these sets are.
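
For finite unions of disjoint intervals, the computation above is just a sum of lengths. The sketch below illustrates this; the helper `lebesgue_measure` is a hypothetical name, and it assumes (without checking) that the intervals passed in are pairwise disjoint.

```python
from fractions import Fraction

def lebesgue_measure(intervals):
    """Total length of a finite list of pairwise disjoint intervals (a, b).

    Endpoints carry no length, so open, closed, and half-open intervals
    are treated alike. Disjointness is assumed, not checked.
    """
    return sum((b - a for a, b in intervals), Fraction(0))

# The set A = [0, 1/4) U (3/4, 1] from the example.
A = [(Fraction(0), Fraction(1, 4)), (Fraction(3, 4), Fraction(1))]
print(lebesgue_measure(A))
```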
Example 2.19. Let’s consider a straightforward example where we’ve surveyed 10 people about their favorite
color. The options provided were: Red, Blue, Green, and Yellow.
Sample Space The sample space Ω is the set of all possible response profiles. Since each person chooses one
color, and we’ve surveyed 10 people, each element of Ω is a list of 10 colors:

\[ \Omega = \{(c_1, \ldots, c_{10}) : c_i \in \{\text{Red}, \text{Blue}, \text{Green}, \text{Yellow}\}\} \]

The total number of elements in Ω is $4^{10}$ (since each of the 10 people has 4 choices).

σ -algebra For simplicity, let’s consider the σ -algebra F generated by individual choices and combinations
of two colors. That is, F contains subsets like:

{Red, Blue, Red, ...}, {Green, Green, Blue, ...}, {Red, Yellow}, . . .

Measure (Empirical Distribution Function) Suppose out of the 10 people, 4 chose Red, 3 chose Blue, 2
chose Green, and 1 chose Yellow. We can define the measure of each set in F as the proportion of people
that chose that particular set of colors. For instance:
\[ P(\text{Red}) = \frac{4}{10} = 0.4 \]
\[ P(\text{Blue}) = \frac{3}{10} = 0.3 \]
\[ P(\text{Green}) = \frac{2}{10} = 0.2 \]
\[ P(\text{Yellow}) = \frac{1}{10} = 0.1 \]
Statistical Application With this measurable space and the empirical distribution function, we can perform
basic analyses like determining the most popular color, calculating means, variances, etc. For example, we
can conclude that the most popular color among the respondents is Red with a 40% preference.
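
As a rough illustration, the empirical proportions can be computed directly from the raw answers. The list `responses` below is hypothetical data constructed to match the counts in the example, and `empirical_measure` is a name introduced only for this sketch.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical raw survey data matching the counts in the example.
responses = ["Red"] * 4 + ["Blue"] * 3 + ["Green"] * 2 + ["Yellow"]
counts = Counter(responses)

def empirical_measure(colors):
    """Proportion of respondents whose answer lies in the given set of colors."""
    return Fraction(sum(counts[c] for c in colors), len(responses))

print(empirical_measure({"Red"}))
print(empirical_measure({"Red", "Blue"}))
print(counts.most_common(1)[0][0])  # the most popular color
```

Evaluating the measure on a set of several colors simply adds the individual proportions, which is the additivity property of a measure at work.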
Example 2.20. Consider a square S in the Euclidean plane with vertices (0, 0), (1, 0), (1, 1), and (0, 1). We
aim to study certain properties of subsets of this square in the context of measure theory.
The set of all points inside the square S forms our sample space, denoted as Ω.
Consider the σ -algebra F generated by all rectangular subsets of S (that is, subsets which are themselves
rectangles with sides parallel to the axes). This σ -algebra will, of course, contain more complex sets than
simple rectangles, but for simplicity, think of it as the collection of sets generated by unions, intersections,
and complements of rectangles within S.
We define a measure m on this measurable space that assigns to each set A in F its Euclidean area. For
instance, if A is a rectangle with sides of length a and b, then m(A) = a × b.
Consider the rectangle R with vertices (0.25, 0.25), (0.75, 0.25), (0.75, 0.75), and (0.25, 0.75).
What is the measure of R?
The rectangle R has sides of length 0.5. Therefore, its measure (area) under the measure function m that
we’ve defined is:
m(R) = 0.5 × 0.5 = 0.25

This example is straightforward but illustrates how measure theory and measurable spaces can be applied
even in basic geometric contexts. The choice of rectangular subsets to generate the σ -algebra is natural here
due to the geometric structure of the square and the ease with which we can compute areas of rectangles.

Figure 2.9: Diagram of the square S with vertices at (0, 0), (1, 0), (1, 1), and (0, 1). Inside S, there’s a shaded
rectangle R representing the subset described in the example. The rectangle has vertices at (0.25, 0.25),
(0.75, 0.25), (0.75, 0.75), and (0.25, 0.75).

Example 2.21. Let’s consider an example combining elements from group theory and measure theory.
Consider the group G = Z/2Z, which is the group of integers modulo 2, with the usual addition operation
modulo 2. This group consists of two elements: 0 and 1. The operation is defined as:

0 + 0 ≡ 0 (mod 2)

0 + 1 ≡ 1 (mod 2)
1 + 0 ≡ 1 (mod 2)
1 + 1 ≡ 0 (mod 2)

Let’s now consider a measurable space, which will be G itself, with the smallest σ -algebra containing G and
the empty set: F = {∅, G}.
To complete our measurable space, we’ll assign a measure m to the sets in F . Let’s define:

\[ m(\emptyset) = 0 \]
\[ m(G) = 1 \]

Group: We’ve chosen a very simple group, Z/2Z. It’s a cyclic group and its structure is straightforward.
Measurable Space and σ -algebra: The σ -algebra chosen is the simplest possible one. It contains the empty
set (which must always be in any σ -algebra) and the total set G.
Measure: The measure function is also the simplest. It assigns measure zero to the empty set and measure
one to the total set.

2.7. Applications of Measurable Spaces and σ -algebras


Beyond the fields where the importance of measurable spaces and σ -algebras has already been exemplified,
such as topology, probability, statistics, geometry, and group theory, these structures have profound implications
in many other domains. The following are some notable applications:

(1) Functional Analysis: Particularly in the study of $L^p$ spaces, which are spaces of functions measurable
with respect to a given measure. These spaces are fundamental in many areas of analysis and partial
differential equations.
(2) Ergodic Theory: This is a branch of mathematics that deals with the long-term statistical properties of
deterministic dynamical systems. Here, measure theory and σ -algebras play a pivotal role.
(3) Information Theory: Concepts like entropy and Kullback-Leibler divergence are defined in terms of
probability measures, inherently involving measurable structures.
(4) Mathematical Physics: In quantum mechanics, the Hilbert space of system states is intimately connected
with $L^2$ spaces, a particular case of $L^p$ spaces, and thus relies on measure concepts.
(5) Control Theory: In the formulation and solution of optimal control problems, it’s often necessary to
consider spaces of measurable functions and to work with integrals with respect to measures, especially
in stochastic settings.
(6) Mathematical Economics: In game theory, optimization, and general equilibrium, it is often necessary
to work in functional spaces involving measures, integration, and measurable structures.
(7) Mathematical Biology: In models of populations or disease spread, it’s often beneficial or necessary to
work with measures, especially when dealing with continuous populations or probability distributions.
(8) Geography and Earth Sciences: When analyzing spatial or geographical data, it’s often necessary to
work with measures defined over geographical or spatial regions, implying measurable structures.

Because measure theory provides a rigorous foundation for integration and a general framework to discuss
“size” or “amount” in a very broad sense, its applications are wide and deep across many areas of
mathematics and science.

Example 2.22. Controlling an Outbreak in a City. A new contagious disease has broken out in a large city.
The city’s health department has data in the form of a list of reported infections for every block in the
city. The goal is to quarantine areas accounting for at least 60% of the total infections. Instead of a density
function across the city, we have exact counts for each block.
The city is divided into several blocks, each of which can be represented as an element in our space X. The
σ -algebra M contains subsets of X, with each subset representing a collection of blocks.
The measure, m : M → [0, ∞), assigns to each set A ∈ M the total number of infections in the blocks
contained in A.
Solution:

(1) Total Infections: Compute the total number of infections in the city by evaluating the measure of the
entire space:
\[ T = m(X) \]
(2) Target: Set a target for the number of infections to be contained within the quarantine zones. This will
be 60% of the total infections:
\[ T_{\text{target}} = 0.6 \times T \]
(3) Identifying Blocks to Quarantine:

(a) Sort all city blocks based on their individual infection counts (which can be derived directly from
the measure m by evaluating it on singletons or individual blocks).
(b) Starting from the block with the highest number of infections, sequentially add blocks to the
quarantine list until the cumulative infection count from these blocks is at least $T_{\text{target}}$.

(4) Implementation: Quarantine all identified blocks.

This reframed method, while not using a measurable function, still leverages the principles of measurable
spaces and σ -algebras. By sorting the blocks based on their infection count and then accumulating the most
affected areas, we can achieve the target of quarantining areas that account for at least 60% of total infections.
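
The selection procedure described above can be sketched as follows. The function name `blocks_to_quarantine`, the block identifiers, and the infection counts are all hypothetical and serve only to illustrate the steps.

```python
def blocks_to_quarantine(infections, fraction=0.6):
    """Greedy selection of blocks covering at least `fraction` of all infections.

    `infections` maps a block identifier to its infection count, i.e. the
    measure m evaluated on that single block.
    """
    total = sum(infections.values())   # T = m(X)
    target = fraction * total          # T_target
    chosen, covered = [], 0
    # Visit blocks from most to least affected.
    for block, count in sorted(infections.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= target:
            break
        chosen.append(block)
        covered += count
    return chosen, covered

# Hypothetical counts for five blocks.
sample = {"B1": 50, "B2": 10, "B3": 30, "B4": 5, "B5": 5}
chosen, covered = blocks_to_quarantine(sample)
print(chosen, covered)
```

Taking the most affected blocks first reaches the target with the fewest blocks; other criteria (for instance, contiguity of the quarantined area) would call for a different strategy.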

In our exploration of the “Applications of Measurable Spaces and σ -algebras,” one salient point emerges:
the true power and versatility of measure theory become evident when delving into measurable functions.
While sets, measures, and σ -algebras provide the foundational structure, it is the measurable functions that
offer the tools to interact with these structures in meaningful and often practical ways. They act as bridges,
connecting abstract mathematical constructs to real-world problems and applications. To truly appreciate
the depth and breadth of measure theory’s applications, a thorough understanding of measurable functions
is indispensable. Hence, their comprehensive study will be the primary focus of the following chapter.

2.8. Conclusions
Throughout this chapter, we have journeyed through the foundational concepts of measure theory, beginning
with the basic building blocks of measurable spaces. By exploring the inherent properties of a measure,
we’ve glimpsed the intricacies and subtleties that underpin this robust mathematical framework. Our
discourse on the σ -algebras accentuated the pivotal role they play in shaping the landscape of the theory,
ensuring we can meaningfully discuss the “size” or “magnitude” of different sets. The Lebesgue measure,
an essential centerpiece, was demystified, providing readers with an intuitive grasp of its significance and
the paradigm shift it introduced to integration theory. Finally, by delving into the applications of measurable
spaces and σ -algebras, we’ve seen how these abstract concepts permeate various domains, from functional
analysis to mathematical economics, underscoring their ubiquity and practical importance. As we close this
chapter, it becomes evident that measure theory, while intricate and demanding, is a cornerstone of modern
mathematics, acting as a bridge between pure mathematical abstraction and tangible real-world problems.
As we proceed further, we will continually encounter its profound influence, and it is our hope that readers,
armed with the knowledge from this chapter, will approach subsequent topics with increased insight and
appreciation.

2.9. Exercises
Exercise 2.1. Given a set Ω = {1, 2, 3, 4} and a σ -algebra F = {∅, Ω, {1, 2}, {3, 4}}, is (Ω, F ) a measurable
space?

Exercise 2.2. Let µ be a measure on R such that µ ([0, 1]) = 1. Find µ ([0, 1/2]) if µ is additive.

Exercise 2.3. Consider the σ -algebra A = {∅, R, {1}, R\{1}}. Assign a measure m on A such that m(R) =
1.

Exercise 2.4. Given the set E = {1/n : n ∈ N}, is E a measure zero set in R with the Lebesgue measure?

Exercise 2.5. Given the interval [a, b] in R, what is its Lebesgue measure?

Exercise 2.6. In the study of random experiments, why is it important to consider measurable spaces and
σ -algebras?

Exercise 2.7. Given a measurable space (X, M ) where X is the whole space and M is a σ -algebra. Let µ
be a measure defined on M . Show that for any set A ∈ M :

µ (Ac ) = µ (X) − µ (A)


where Ac denotes the complement of A in X.

Exercise 2.8. Explain why a measure, specifically the Lebesgue measure, defined on a σ -algebra is important
for standardizing the idea of “size” or “length” for subsets of R.
Chapter 3
Measurable Functions

Carlos Polanco
Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México.
Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, Ciudad de
México, México.

Abstract In this chapter, we embark on an exploration of the intricate realm of measurable functions,
a cornerstone of modern measure theory. Beginning with a comprehensive introduction to measurable
functions, the chapter elucidates their defining characteristics and the fundamental role they play in bridging
the gap between abstract mathematical structures and real-world applications. Delving deeper, we investigate
the inherent properties of these functions and the operations that can be performed with them, highlighting
their versatility and robustness in various mathematical scenarios. These foundational concepts pave the
way for an in-depth discussion on σ -algebras, another pivotal construct in measure theory. The chapter
unravels the significance of σ -algebras, detailing their intricate relationship with measurable functions and
emphasizing their indispensability in facilitating rigorous mathematical analyses. Lastly, recognizing the
universality of these concepts, we traverse across multiple disciplines, presenting a plethora of examples
that underscore the ubiquity and applicability of measurable functions and σ -algebras. From physics to
economics, the examples demonstrate the profound impact of measure theory, solidifying its status as
an invaluable tool in both theoretical investigations and practical applications. This chapter, replete with
rigorous definitions, theorems, and real-world illustrations, serves as a comprehensive guide for both novices
and seasoned scholars eager to delve into the rich tapestry of measurable functions and their overarching
importance in measure theory.

Keywords: σ -algebra, Borel σ -algebra, examples across disciplines, Importance of σ -algebras in measure
theory, measurable functions, properties and operations with measurable functions.

(B) Carlos Polanco: Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México. Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de
México; Tel: +01 55 5595 2220; E-mail: polanco@unam.mx

3.1. Introduction
Measure Theory is not merely an abstract branch of mathematics but an essential tool that permeates
many fields of study and application. Its essence lies in the ability to assign a notion of size or quantity
to sets, thereby allowing for a rigorous and generalized description of concepts such as length, area,
volume, and probability. Within this context, measurable functions [7, 8] emerge as protagonists, forming
the bridge between abstract sets and quantifiable structures. These functions, with their intricate properties
and operations, unveil themselves as vehicles that carry the theory from the abstract to the tangible, enabling
practical applications across various disciplines.
The role of σ -algebras in this theory cannot be overstated. They act as gatekeepers, dictating which sets
are deemed worthy of measurement and which are not, ensuring that our measurements remain coherent
and contradiction-free. As we delve deeper into their definitions and properties, we will grasp how these
structures underpin most constructions in measure theory. Our journey will ultimately take us through
concrete examples in fields as varied as physics, engineering, economics, and beyond. These examples not
only illustrate the theoretical relevance but also the real and tangible impact of measure theory on the world
around us.

3.2. Measurable Functions


First, to understand measurable functions, it’s necessary to grasp the concept of measurable sets. Imagine
you have a set of numbers and you wish to determine its “size”. For finite sets (like {1, 2, 3}), this is easy,
just count the elements. But, what if the set is infinite? In this case, the idea of “measuring” becomes a bit
more complicated. In mathematics, we use a tool called “measure” to “weigh” or “measure” sets, especially
the infinite ones. A set is “measurable” if we can determine its size in a coherent and sensible way.
Now, a measurable function is simply a function that respects this idea of measurability. That is, if you
take a measurable set of output values, the set of all inputs that the function sends into it should also be
measurable.
Imagine a function that takes temperatures in Celsius and converts them to Fahrenheit. If you pick a
measurable set of temperatures in Fahrenheit, you should also be able to measure the set of Celsius
temperatures that convert into it. If this is the case for every such set, we can say this function
is measurable.

Understanding Preimages in Function Theory


In function theory, the preimage (or inverse image) refers to the set of all elements in the domain of a function
that map to a particular element in the codomain (or range) under that function.
More formally, if f : X → Y is a function and B is a subset of Y , then the preimage of B under f , denoted by
$f^{-1}(B)$, is the set of all elements x in X such that f (x) is in B.

It’s important to note that the term “preimage” doesn’t necessarily imply the existence of an inverse function
for f . The notation $f^{-1}$ in this context doesn’t refer to the inverse function (unless one truly exists), but
rather to the operation that takes a set in Y and returns a set in X.
For example, consider the function f : R → R given by $f(x) = x^2$. The preimage of {4} under f is the set
{−2, 2}, since both −2 and 2 map to 4 under the function f .
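
On a finite domain, a preimage can be computed by direct filtering. In the sketch below, the helper `preimage` is a name introduced for illustration, and a finite range of integers stands in for the real line, so it only approximates the example above.

```python
def preimage(f, domain, B):
    """Elements of a finite `domain` that f maps into the set B."""
    return {x for x in domain if f(x) in B}

# A finite sample of integers stands in for the real line here.
domain = range(-5, 6)
square = lambda x: x ** 2

print(preimage(square, domain, {4}))
print(preimage(square, domain, {-1}))  # nothing squares to -1
```

Note that the preimage is well defined even though squaring has no inverse function: it may contain several points, or none at all.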

Figure 3.1: Illustration representing a measurable function f mapping from space X to space Y . The
structures A and B denote their respective σ -algebras. For the function to be measurable, the preimage
of any set in B should also be in A .

Why Are Measurable Functions Important?


Measurable functions are foundational in areas like analysis and measure theory, which deal with the “sizes”
and structures of infinite sets. These ideas are pivotal in advanced mathematics and have applications in
physics, economics, and other sciences.
A measurable function is one for which the set of inputs landing in any measurable set of outputs is itself
measurable. It’s a way to ensure the operations we perform in the mathematical world are coherent and
consistent. It’s like ensuring a cooking recipe works correctly no matter the size of ingredients you use!
A function is deemed measurable if it “respects” or “preserves” the structure of a given σ -algebra.
Specifically, let’s assume we have two measurable spaces (X, A ) and (Y, B), where X and Y are sets and
A and B are the respective σ -algebras on X and Y . A function f : X → Y is considered measurable if, for
every set B in the σ -algebra B, the preimage $f^{-1}(B)$ is in the σ -algebra A .
The rationale behind this definition is that the “measurability” of a function ensures that any measurable
structure (like the ability to integrate or measure) in the target set Y can be “pulled back” to the source set X.
Example 3.1. Let’s consider:
(1) X = {1, 2, 3, 4}
(2) Y = {a, b}
(3) A = {∅, {1, 2}, {3, 4}, X} is a σ -algebra on X
(4) B = {∅, {a}, {b}, Y } is a σ -algebra on Y

Let’s define the function f : X → Y as:


\[ f(x) = \begin{cases} a & \text{if } x \in \{1, 2\} \\ b & \text{if } x \in \{3, 4\} \end{cases} \]

The function f is measurable with respect to A and B because for any set in B, its preimage under f
belongs to A . For instance, the preimage of {a} is {1, 2}, which is in A .

Non-Measurable Functions
Using the same sets X, Y , A , and B as before, consider a different function:
\[ g(x) = \begin{cases} a & \text{if } x = 1 \\ b & \text{if } x \in \{2, 3, 4\} \end{cases} \]

The function g is not measurable with respect to A and B because, for example, the preimage of {a} under
g is {1}, which is not in A .
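
For finite spaces, measurability can be checked by brute force: compute the preimage of each set in the σ -algebra on Y and test whether it belongs to the σ -algebra on X. The sketch below does this for the functions f and g above; the helper `is_measurable` and the variable names are introduced here for illustration only.

```python
def is_measurable(f, X, sigma_X, sigma_Y):
    """True if the preimage of every set in sigma_Y belongs to sigma_X."""
    family = {frozenset(s) for s in sigma_X}
    return all(
        frozenset(x for x in X if f(x) in B) in family
        for B in sigma_Y
    )

X = {1, 2, 3, 4}
Y = {"a", "b"}
sigma_X = [set(), {1, 2}, {3, 4}, X]
sigma_Y = [set(), {"a"}, {"b"}, Y]

f = lambda x: "a" if x in {1, 2} else "b"
g = lambda x: "a" if x == 1 else "b"

print(is_measurable(f, X, sigma_X, sigma_Y))  # f respects the partition {1,2} | {3,4}
print(is_measurable(g, X, sigma_X, sigma_Y))  # g separates 1 from 2, which A cannot see
```

The function g fails because the σ -algebra A cannot distinguish 1 from 2; with the power set of X in place of A , g would be measurable as well.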

3.3. Properties and Operations with Measurable Functions


The properties and operations of measurable functions ensure that certain manipulations and transformations
won’t “break” their measurability. It’s a bit like having rules in a game: as long as you follow those
rules, everything will work smoothly. These characteristics make working with measurable functions in
mathematics more manageable and predictable. It’s like having reliable tools in your mathematical toolbox!

Properties of Measurable Functions


(1) Conservation of Measurability: If you have a measurable function, and you apply basic mathematical
operations to this function (like adding, subtracting, multiplying, dividing), the result will still be a
measurable function. In other words, measurable functions are resilient: you can do things to them, and
they remain measurable.
(2) Composition of Measurable Functions: If you have two functions and both are measurable, then the
resulting function from their composition will also be measurable. Imagine two machines that process
fruits: if each machine produces fruits in ordered boxes and you put a fruit from an ordered box into
the second machine, you will still have an ordered box at the end. This is what happens when you
“compose” two measurable functions.

Operations with Measurable Functions


(1) Addition and Subtraction: If you have two measurable functions, you can add or subtract their values
(just as when you add or subtract functions in basic math) and the outcome will be a new function that
is also measurable.
(2) Multiplication and Division: Similarly, if you multiply or divide two measurable functions, the result
will still be measurable (just be careful not to divide by zero!).
(3) Limits: If you have a sequence of measurable functions that converges pointwise to a function (i.e., at
every point the values get closer and closer to the value of a particular function), then this limit function
is also measurable. It’s like having a series of increasingly clearer pictures of an object: if all the photos
are “measurable,” the clearest picture (the limit) will also be.
(4) Absolute Value: The absolute value of a measurable function is, again, measurable. If you think of
absolute value as a machine that turns negative numbers into positive ones (and leaves the positives as
they are), this machine does not break the “measurability” of a function.
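
As a brief worked illustration of the absolute value rule, assume f is a real-valued function on a measurable space (X, A ) that is measurable with respect to the Borel σ -algebra on R. For every real number c, the set where |f | stays below c is either empty or the preimage of an open interval:

```latex
\[
\{x \in X : |f(x)| < c\} =
\begin{cases}
f^{-1}\bigl((-c, c)\bigr) & \text{if } c > 0,\\
\emptyset & \text{if } c \le 0.
\end{cases}
\]
```

In both cases the set belongs to A . Since the intervals (−∞, c) generate the Borel σ -algebra on R, this is enough to conclude that |f | is measurable.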

3.4. Importance of σ -algebras in Measure Theory


Measure theory is a foundational area of mathematics that deals with how to “measure” sets in a coherent
manner and extend these measurements to more general functions. σ -algebras play a crucial role in this
theory for the following reasons:

(1) Coherent Definition of Measure: σ -algebras provide a framework in which one can define a measure
(such as the Lebesgue measure) so that this measure is coherent and free of contradictions. Specifically,
they ensure that if you can measure certain basic sets, then you can also measure countable unions,
intersections, and complements of those sets.
(2) Closure Properties: Since σ -algebras are closed under complements and countable unions, any operation
we apply to measurable sets (sets in the σ -algebra) will remain measurable. This is essential for the
development of a coherent integral theory.
(3) Working with Measurable Functions: Functions that are “measurable” are those that preserve the
structure of a σ -algebra. That is, if we have a function f mapping from a space X to a space Y and
we have a σ -algebra on Y , we want the preimage of any measurable set in Y (according to that σ -
algebra) to also be measurable in X. This is key to integration theory, and σ -algebras are the right tool
to define this property.
(4) Development of Probabilistic Theory: In probability, the concept of a probability space (Ω, F , P)
includes a σ -algebra (F ), representing all events (subsets of Ω) that we can coherently assign a
probability to using the probability measure P. Without σ -algebras, we wouldn’t have a solid framework
for discussing probability in general terms.
(5) Connection with Topology: σ -algebras have a relationship with topology, particularly with the notion of
open and closed sets. For instance, the σ -algebra generated by the open sets of a topological space is
fundamental in the study of measurable functions and integration in more general spaces.
(6) Facilitate Extension of Measures: σ -algebras allow for the extension of measures from simple sets (like
intervals) to more complicated sets in a coherent manner.

σ -algebras provide the necessary framework to develop measure theory and integration in a consistent and
general manner. Without them, we could not generalize many of the fundamental concepts and results in
analysis, probability, and related areas.

3.5. Examples across Disciplines


The foundational concepts of measurable spaces and σ -algebras play a pivotal role across numerous
disciplines. Their significance extends beyond the abstract mathematical realm, touching various applied
fields ranging from geography to quantum physics. Each discipline harnesses the power of these concepts in
unique ways to address problems intrinsic to its domain.
This section aims to bridge the gap between the abstract nature of measure theory and its tangible
applications in various domains. Through a collection of meticulously curated examples, readers will glean
insights into how the seemingly intricate world of measurable functions and sets operates in real-world
contexts. These examples not only reinforce the importance of the foundational concepts but also showcase
their versatility and breadth of application.

Functional Analysis
Example 3.2. Consider the Hilbert space $L^2([0, 1])$ of square-integrable functions over the interval [0,1].
Given a sequence of functions $f_n(x) = x^n$ in $L^2([0, 1])$, investigate its convergence in the $L^2$ norm and
determine its limit.

Solution 3.1. To determine the convergence of the sequence in the $L^2$ norm, we must evaluate:

\[ \|f_n - f\|_2 = \left( \int_0^1 |f_n(x) - f(x)|^2 \, dx \right)^{1/2} \]

where f is the function to which $f_n$ converges.

Since $x^n \to 0$ for every x in [0, 1), the natural candidate for the limit is the zero function f = 0. For the
given $f_n$:

\[ \|f_n\|_2^2 = \int_0^1 x^{2n} \, dx = \left[ \frac{x^{2n+1}}{2n+1} \right]_0^1 = \frac{1}{2n+1} \]

As n goes to infinity, $\frac{1}{2n+1}$ approaches 0. This implies that the sequence $f_n$ converges to the zero function in
the $L^2$ norm.
Now, to confirm that each function in our sequence is measurable, consider the Borel σ -algebra on
[0,1], denoted by B([0, 1]). The functions fn are polynomials and are continuous on [0,1]. Since continuous
functions are measurable with respect to the Borel σ -algebra, every fn is measurable
with respect to B([0, 1]).
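
The decay of the norm can also be observed numerically. The sketch below compares the closed form $1/\sqrt{2n+1}$ with a simple midpoint-rule approximation of the integral; the function names and the number of integration steps are choices made for this illustration.

```python
import math

def l2_norm_exact(n):
    """Closed form for the L2([0,1]) norm of x**n: 1 / sqrt(2n + 1)."""
    return 1.0 / math.sqrt(2 * n + 1)

def l2_norm_numeric(n, steps=100_000):
    """Midpoint-rule approximation of the same norm."""
    h = 1.0 / steps
    total = sum(((i + 0.5) * h) ** (2 * n) for i in range(steps))
    return math.sqrt(total * h)

for n in (1, 10, 100, 1000):
    print(n, l2_norm_exact(n), l2_norm_numeric(n))
```

Both columns shrink toward 0 as n grows, even though $f_n(1) = 1$ for every n: the single point x = 1 has Lebesgue measure zero and does not affect the norm.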

Figure 3.2: Graphical representation of the sequence of functions $f_n(x) = x^n$ on the interval [0,1]. As n
increases, the function graph tends to 0 on the interval [0,1) while remaining equal to 1 at x = 1.

Ergodic Theory
Example 3.3. Consider the unit interval I = [0, 1). We define a transformation T : I → I by T (x) = 2x mod 1.
This is often referred to as the “doubling map.” The question is: Is the transformation T ergodic with respect
to the Lebesgue measure m on I?

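Solution 3.2. For T to be ergodic, it means that for any measurable set A in I such that $T^{-1}(A) = A$, either
m(A) = 0 or m(A) = 1.
To see this, let A be such a set and expand its indicator function in a Fourier series,
$1_A(x) = \sum_{k \in \mathbb{Z}} c_k e^{2\pi i k x}$. Then $1_A(T(x)) = \sum_{k \in \mathbb{Z}} c_k e^{2\pi i (2k) x}$, and the invariance
$1_A \circ T = 1_A$ forces $c_k = 0$ for odd k and $c_{2k} = c_k$ for all k. Hence, for every $k \neq 0$,
$c_k = c_{2k} = c_{4k} = \cdots$, and since the Fourier coefficients of a square-integrable function tend to 0,
we get $c_k = 0$. Only $c_0$ survives, so $1_A$ is constant almost everywhere, which means m(A) = 0 or m(A) = 1.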

Thus, T is ergodic with respect to the Lebesgue measure on I.
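
Ergodicity is a statement about a measure-preserving map, so it is worth checking that T preserves the Lebesgue measure. For an interval [a, b) inside [0, 1), the preimage under T consists of two copies of half the length. The sketch below verifies this with exact fractions; the helper names are introduced here for illustration.

```python
from fractions import Fraction

def preimage_of_interval(a, b):
    """Preimage of [a, b) within [0, 1) under T(x) = 2x mod 1: two half-length copies."""
    half = Fraction(1, 2)
    return [(a / 2, b / 2), (a / 2 + half, b / 2 + half)]

def total_length(intervals):
    """Sum of the lengths of a list of disjoint intervals (lo, hi)."""
    return sum((hi - lo for lo, hi in intervals), Fraction(0))

a, b = Fraction(1, 5), Fraction(7, 10)
pieces = preimage_of_interval(a, b)
print(pieces)
print(total_length(pieces) == b - a)
```

The check covers intervals only; the extension to all Borel sets follows because intervals generate the Borel σ -algebra.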


Figure 3.3: Graphical representation of the doubling map T (x) = 2x mod 1 on the unit interval [0, 1). The
blue curve represents the transformation, doubling each segment of the interval. Due to the modulo operation,
the interval [1/2, 1) is wrapped back onto [0, 1). The red dashed lines at y = 1 and x = 1/2 highlight the critical
points of this transformation.

Information Theory: Huffman Coding


Example 3.4. Consider the problem of designing a code to transmit messages over a communication channel.
The messages are sequences of letters from a finite set A and we wish to associate each letter with a code
(a sequence of bits). However, not all letters from A are equally likely. Let’s consider A = {a, b, c} with
probabilities P(a) = 0.5, P(b) = 0.25, and P(c) = 0.25. We want to design a code with minimum average
length.

Solution 3.3. Define a probability space (A , F , P), where F is a σ -algebra containing all the subsets of
A . The function l : A → R+ associating each letter with its code length is measurable.
An optimal code for this set, given these probabilities, is a Huffman code. Based on the provided probabilities,
the Huffman code might assign the codewords a → 0, b → 10, c → 11, so that l(a) = 1, l(b) = 2, and l(c) = 2.
The average length of the code is:

E[l(X)] = 0.5 × 1 + 0.25 × 2 + 0.25 × 2 = 1.5

Thus, on average, each letter is encoded using 1.5 bits.
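
The construction can be sketched with a priority queue. This is a minimal illustration of the standard Huffman procedure (repeatedly merging the two least likely subtrees), not a production encoder; the function name `huffman_code` is introduced here, and the single-symbol case is not handled.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a binary Huffman code for a dict mapping symbol -> probability."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)
        p1, _, code1 = heapq.heappop(heap)
        # Prefix one subtree with 0 and the other with 1, then merge them.
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

probs = {"a": 0.5, "b": 0.25, "c": 0.25}
code = huffman_code(probs)
average_length = sum(probs[s] * len(code[s]) for s in probs)
print(code, average_length)
```

The average length is the expectation of the measurable function l with respect to P, computed here as a finite weighted sum.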


Figure 3.4: Tree representation of the Huffman code for the given probabilities. The letters a, b, and c are
assigned codes based on their probabilities, resulting in an average code length of 1.5 bits.

Mathematical Physics
Example 3.5. Consider a quantum system described by the position operator x̂ acting on wave functions in
the Hilbert space $L^2(\mathbb{R})$, which represents square-integrable functions on the real line. The eigenfunctions
of x̂ corresponding to the position eigenvalue x0 are Dirac delta functions δ (x − x0 ). Now, the task is to
determine whether these eigenfunctions form a part of our Hilbert space, given the properties of the Dirac
delta function.

Solution 3.4. The Dirac delta function δ (x − x0 ) is not a true function in the classical sense, but rather a
distribution. In the context of measure theory, we can understand this as a measure (the unit point mass at
x0 ). However, the square-integrable condition for functions in $L^2(\mathbb{R})$ states:

\[ \int_{-\infty}^{\infty} |\psi(x)|^2 \, dx < \infty \]

For the Dirac delta function, the square is not even defined as a function, and the standard approximations
of it by ever narrower peaks have squared integrals that grow without bound. Thus, the Dirac delta function,
though crucial in quantum mechanics, is not an element of the Hilbert space $L^2(\mathbb{R})$.
Instead, its action is understood through the effects it has when integrated against test functions.
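
One way to see the divergence numerically is to approximate the delta function by Gaussian densities of shrinking width σ, a common (but not the only) choice of approximation. For these, the squared integral has the closed form $1/(2\sigma\sqrt{\pi})$. The sketch below compares a midpoint-rule estimate with that formula; the function names and step count are illustrative assumptions.

```python
import math

def gaussian(x, sigma):
    """Density of a centered normal distribution with standard deviation sigma."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

def squared_l2_norm(sigma, steps=20_000):
    """Midpoint-rule estimate of the integral of gaussian(x, sigma)**2 over [-10*sigma, 10*sigma]."""
    lo, hi = -10 * sigma, 10 * sigma
    h = (hi - lo) / steps
    return h * sum(gaussian(lo + (i + 0.5) * h, sigma) ** 2 for i in range(steps))

for sigma in (1.0, 0.1, 0.01):
    # The closed form 1 / (2 * sigma * sqrt(pi)) is printed alongside for comparison.
    print(sigma, squared_l2_norm(sigma), 1 / (2 * sigma * math.sqrt(math.pi)))
```

Each density integrates to 1, yet its squared integral grows like 1/σ as the peak narrows, so no limit exists in $L^2(\mathbb{R})$.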

Figure 3.5: Illustration of the Dirac delta function. While it’s not a function in the classical sense, it’s pivotal
in quantum mechanics. The spike at x0 represents its action, “picking out” a value at that point.

Figure 3.6: Graphical representation of the Dirac delta function, δ (x − x0 ). It’s characterized by being zero
everywhere except at the point x0 , where it is infinite. The area under the spike is 1, symbolizing its integral
over all space. In quantum mechanics, this “function” serves as an idealized model for a measurement at a
precise position, x0 , capturing the essence of localization in position space.

Control Theory: Stabilizing a System


Example 3.6. Consider a system whose state at time t is represented by x(t) and its dynamics is given by the
equation:
ẋ(t) = u(t)
where u(t) is the control input. The goal is to design u(t) such that x(t) converges to zero as t → ∞ and the
control input u(t) does not exceed a certain threshold.
Given the set Ω of all possible functions u(t) defined on [0, ∞), we want to find a function in this space that
is admissible. Define a σ -algebra F on Ω which includes all subsets of functions such that their absolute
value does not exceed a certain threshold α .
Find an admissible function u(t) that ensures x(t) → 0 as t → ∞.
Solution 3.5. An effective control strategy in this simplified scenario can be the proportional control:

u(t) = −k × x(t)

where k is a positive constant. Given the dynamics, the solution is:

x(t) = e−kt x(0)


Choosing k such that k|x(0)| ≤ α guarantees |u(t)| = k|x(0)|e^{−kt} ≤ α for all t ≥ 0, so u lies in the set of
admissible controls bounded by α, which belongs to F. This strategy ensures
the state x(t) converges to zero, and the control input is bounded.
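
The behavior of this controller can be illustrated numerically. The sketch below assumes a forward-Euler discretization and illustrative values x(0) = 2 and α = 1, none of which come from the original example; the function name simulate_proportional_control is likewise hypothetical.

```python
import math

def simulate_proportional_control(x0, k, alpha, t_end=10.0, dt=0.001):
    """Forward-Euler simulation of x'(t) = u(t) with u(t) = -k * x(t).

    Returns the final state, the largest |u| observed, and whether the
    control stayed within the bound alpha.
    """
    x = x0
    max_abs_u = 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        u = -k * x                      # proportional control law
        max_abs_u = max(max_abs_u, abs(u))
        x += dt * u                     # Euler step for x' = u
    return x, max_abs_u, max_abs_u <= alpha

# Assumed illustrative values (not from the text).
x0, alpha = 2.0, 1.0
k = alpha / abs(x0)                     # largest gain with k * |x0| <= alpha
x_final, max_u, admissible = simulate_proportional_control(x0, k, alpha)

# Compare with the closed-form solution x(t) = exp(-k t) x(0) at t = 10.
print(x_final, x0 * math.exp(-k * 10.0), max_u, admissible)
```

The largest control effort occurs at t = 0, which is why the bound on k depends only on the initial state.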

[Figure: state trajectory x(t) and control trajectory u(t) plotted against time]

Figure 3.7: The blue curve represents the system state trajectory x(t) and the red curve represents the control
trajectory u(t) designed to stabilize the system.

Figure 3.8: Graphical representation of the system’s state trajectory and control strategy. The blue curve
illustrates the state x(t) over time, showing its convergence to zero. The red curve represents the control
input u(t), demonstrating how the input is adjusted over time to stabilize the system while staying within the
set bounds of the σ -algebra.

Mathematical Economics
Example 3.7. Consider a market with a continuum of traders indexed by the interval I = [0, 1]. Each trader
x ∈ I possesses a quantity q(x) of a certain good. We want to understand the distribution of this good across
the market.
Assume the amount each trader has, q(x), can be represented by the function:

q(x) = sin(2π x) + 2

Our goal is to determine the set of traders who possess more than 1 unit of the good.

Solution 3.6. The problem is to find the measure of the set A defined by:

A = {x ∈ I : q(x) > 1}

Using our function for q(x), the set A is given by:

A = {x ∈ I : sin(2π x) + 2 > 1}

⇒ A = {x ∈ I : sin(2π x) > −1}

Since sin(2π x) ≥ −1 for every x, with equality on I only when 2π x = 3π/2, that is, at x = 3/4, we obtain:

⇒ A = I \ {3/4} = [0, 3/4) ∪ (3/4, 1]

This set is a measurable subset of I under the Borel sigma-algebra. Because the single point {3/4} has Lebesgue
measure 0, the measure of A is 1.

Hence, the proportion of traders having more than 1 unit of the good is 1: all traders, up to a set of measure zero.

[Figure: graph of q(x) over the interval I = [0, 1]]

Figure 3.9: Distribution of goods among traders. The region between the blue curve and the dashed red line
represents traders with more than 1 unit of the good.

Mathematical Biology
Example 3.8. Consider a closed environment, such as a pond, in which algae grow. The growth of algae in
the pond can be modeled as a function of the amount of sunlight they receive. If S(x) represents the sunlight
intensity on day x, where x ∈ [0, 365] (considering a year with 365 days), we want to measure the total
amount of sunlight the algae will receive in a given year.
We need to ensure that S is a measurable function in the sigma-algebra generated by the Borel sets on [0, 365]
to integrate it over the year.

Solution 3.7. Let’s consider the sigma-algebra B on [0, 365] generated by the Borel sets. If S is Borel
measurable, meaning S is measurable with respect to B, then we can integrate S over [0, 365] to get the
total sunlight received.
Given that most physically meaningful functions of this nature are Borel measurable, we can integrate:
\[
\int_0^{365} S(x) \, dx
\]

This integral represents the total sunlight the algae receive over the year.

[Figure: sunlight intensity S(x) over the year, plotted against days]

Figure 3.10: Graph depicting the sunlight intensity, S(x), as a function of days in the year. The oscillating
nature of the blue curve might represent seasonal changes in sunlight. The total area under the curve corresponds
to the cumulative sunlight exposure for the algae over the year.

Geography and Earth Sciences


Example 3.9. Consider a region R on Earth’s surface, represented as a subset of the plane. The region consists
of various terrains including forests, lakes, and urban areas. We have a function f : R → {0, 1} defined as:
\[
f(x, y) =
\begin{cases}
1 & \text{if } (x, y) \text{ is a forest point} \\
0 & \text{otherwise}
\end{cases}
\]

Measure the total forested area within the region R.

Solution 3.8. To find the measure of the forested region, equip R with a sigma-algebra $\mathcal{F}$ that contains
the set of points of each terrain type, so that f is measurable. Let F represent the set of all forested points,
F = {(x, y) ∈ R : f (x, y) = 1}. The area of F with respect to the Lebesgue measure is the total forested area within R.
Mathematically, this is:
\[
\text{Area of } F = \int_R f(x, y) \, dx \, dy
\]
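
In practice, such an integral is often approximated on a raster. The following sketch assumes a hypothetical 4 × 5 grid of land-cover cells and an assumed cell area of 0.25 km²; it sums the indicator f over the cells and multiplies by the cell area.

```python
def forest_area(mask, cell_area):
    """Approximate the integral of the indicator f over R.

    mask is a grid of 0/1 values (1 = forest); each cell covers cell_area.
    """
    forest_cells = sum(cell for row in mask for cell in row)
    return forest_cells * cell_area

# Hypothetical raster: 1 marks a forest cell, 0 marks any other terrain.
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
cell_area = 0.25  # km^2 per cell (assumed)

area = forest_area(mask, cell_area)
print(area)
```

This is the integral of a simple function: the value 1 times the measure of the set where f equals 1, plus 0 times the measure of the rest.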

[Figure: region R containing the forested subset F]

Figure 3.11: Illustration of region R with the forested area F highlighted. The measure of F provides the
total forested area within R.

3.6. Conclusions
Throughout this chapter, we have delved deep into the intricate world of measurable functions, highlighting
their role in measure theory. By defining measurable functions, we have laid a robust foundation for
understanding how sets can be quantified and classified in terms of their “size” or “measure”. The properties
and operations with measurable functions were introduced not just as mathematical tools but also as
fundamental to understanding the nature of these functions. For instance, we observed how operations like
summation and multiplication, or the transformation of measurable functions, result in new measurable
functions, underscoring the versatility and coherence of this category of functions. The concept of σ -algebras
was underscored as a cornerstone in measure theory. The σ -algebras provide a structure that allows us to
speak of measurable sets, and by extension, measurable functions. This framework is crucial for ensuring
that our measures are consistent and meaningful in practical contexts. Finally, the chapter culminated with
examples across various disciplines, from physics to economics, demonstrating the ubiquity and applicability
of measurable functions and measure theory at large. These examples served to solidify the idea that measure
theory is not just an abstract mathematical exercise but a vital tool with concrete applications in various
fields of knowledge. This chapter has been a profound immersion into measurable functions, laying down
their definition, properties, and significance in the broad field of measure theory. As readers, we are now
better equipped to tackle intricate problems in this domain and apply these concepts across a multitude of
disciplines.

3.7. Exercises
Exercise 3.1. Given the function f : R → R defined by
\[
f(x) =
\begin{cases}
1 & \text{if } x \in \mathbb{Q} \\
0 & \text{if } x \notin \mathbb{Q}
\end{cases}
\]

Is f a measurable function with respect to the Borel sigma-algebra on R?

Exercise 3.2. Let f , g : R → R be measurable functions defined by f (x) = x and g(x) = x^2 . Is the function
h(x) = f (x) + g(x) measurable?

Exercise 3.3. Given a set A = [0, 1) and the sigma-algebra B of Borel sets on R, is A in B?

Exercise 3.4. The Heaviside step function is defined as
\[
H(x) =
\begin{cases}
0 & \text{if } x < 0 \\
1/2 & \text{if } x = 0 \\
1 & \text{if } x > 0
\end{cases}
\]

Is the Heaviside step function measurable with respect to the Borel sigma-algebra on R?

Exercise 3.5. Let O be the collection of open sets in R. Prove that the smallest σ -algebra containing O is
the Borel σ -algebra on R.

Exercise 3.6. Let f : R → R be a measurable function. Prove that if c is a constant, then c f (the function
that maps each x to c f (x)) is also measurable.
Chapter 4
Lebesgue Integral

Carlos Polanco
Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México.
Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, Ciudad de
México, México.

Abstract In this chapter, we embark on a comprehensive journey through the landscape of integration in
mathematical analysis. We initiate our exploration by delineating the general “Definition and Properties of
the Integral,” setting a foundation for readers unfamiliar with the concept. Subsequently, we delve deeper
into the nuances of integration by presenting the “Formal Definition of the Riemann Integral,” providing a
rigorous examination of its mathematical structure and foundational significance. In an effort to elucidate
these theoretical concepts with practical application, a compelling “Case: An Unusual Roulette Spin” is
discussed, wherein the principles of integration are applied to a real-world scenario, offering readers an
engaging application of the integral’s theoretical underpinnings. The chapter concludes with an analysis of
the “Limitations of the Lebesgue Integral,” critiquing and expanding on its capabilities and constraints in
mathematical analysis. By the end of this chapter, readers will possess a profound understanding of the
integral, from its foundational definitions to its practical and theoretical limitations.

Keywords: case: an unusual roulette spin, definition and properties of the integral, formal definition of the
Riemann integral, limitations of the Lebesgue integral, measurable function, measurable space.

(B) Carlos Polanco: Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México. Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de
México; Tel: +01 55 5595 2220; E-mail: polanco@unam.mx

4.1. Introduction
The concept of integration lies at the heart of calculus and serves as a cornerstone in various scientific
disciplines ranging from physics to economics. Integration can be approached in numerous ways, each with
its own set of advantages, limitations, and fields of application. This chapter aims to explore some of the key
aspects of integration theory, focusing primarily on its formal definitions and properties. In particular, we will
delve into the definition and properties of the integral in a general sense, followed by the formal definition
of the Riemann integral, a well-known approach to integration. To illustrate the practical applications and
subtleties of these concepts, we will consider a special case: an unusual game of roulette. Lastly, we will
explore the limitations of the Lebesgue integral [9, 10], another prominent approach that generalizes the
Riemann integral.
Definition and Properties of the Integral: We will start by examining what it means to integrate a function,
including an exploration of the different properties that integrals can exhibit. These properties provide the
foundation for understanding the various types of integrals that exist.
Formal Definition of the Riemann Integral: Building on the general properties, we will provide a rigorous
definition of the Riemann Integral, and explore how it applies in a variety of contexts. We’ll discuss its
strengths as well as its limitations.
Case: An Unusual Roulette Spin: To provide an intuitive understanding of the abstract concepts introduced,
we will study a unique case involving a roulette wheel with a continuous set of points. This will illuminate
the intricacies of applying integration in real-world scenarios.
Limitations of the Lebesgue Integral: Finally, we will discuss the Lebesgue integral. Although it generalizes
the Riemann integral and solves many of its problems, the Lebesgue integral is not without its own
limitations. We will explore these constraints in detail.
By the end of this chapter, readers should have a solid understanding of the key concepts surrounding the
theory of integration, and be equipped to apply these principles in their respective fields.

4.2. Introduction of the Lebesgue Integral


To understand the Lebesgue integral, it’s important to first grasp the problem it aims to solve. When we
learn about integration in high school or in introductory university calculus courses, we’re introduced to the
Riemann integral. This integral is based on “summing up areas of rectangles” under a curve to obtain the
total area beneath that curve over a certain interval. This approach works well for many functions, but there
are some “trickier” functions for which the Riemann integral doesn’t behave nicely.
This is where the Lebesgue integral comes in. Instead of summing the areas of rectangles in the traditional
manner, as in the Riemann integral, the Lebesgue integration shifts our perspective. Instead of asking, “How
high does the function go at this point?”, we ask, “How many points of the function reach this height?”. This
subtle difference allows the Lebesgue integral to work in many cases where the Riemann integral fails.
Imagine you’re trying to measure how much paint you need to color a series of stakes of varying heights. The
Riemann integral would be like looking at each stake individually, measuring its height, and then summing
everything up. The Lebesgue integral, on the other hand, would be like sorting all the stakes by height and
then measuring how many stakes you have of each height. It’s a shift in perspective.
The Lebesgue integral uses a tool called “measure,” which is a way to generalize our notion of length (in
one dimension), area (in two dimensions), or volume (in three dimensions) to more general sets and more
dimensions. This allows us to deal with functions that have “jumps” or “holes” more effectively than the
Riemann integral.
In summary:
(1) The Riemann integral “slices and sums” based on domain points (i.e., the input values of the function).
(2) The Lebesgue integral “slices and sums” based on the function values (i.e., the heights).

The Lebesgue integral is a powerful tool in mathematics and is fundamental in many areas, especially in
functional analysis and measure theory. Its development has allowed for the tackling and solving of problems
that were once out of reach.

[Figure: two panels. Riemann Integral: function value plotted over the domain. Lebesgue Integral: measure of sets plotted over the function value.]

Figure 4.1: Comparison between Riemann and Lebesgue integration mechanisms. The Riemann integration
(left) evaluates the function’s values by partitioning the domain and summing up the areas of rectangles
based on these values. On the other hand, the Lebesgue integration (right) sorts the function by its values
and measures how many points achieve each specific value, reflecting the concept of measuring sets.

4.2.1. Riemann Integral and Lebesgue Integral


To visually understand the difference between Riemann and Lebesgue integrals, let’s consider the polynomial
function f (x) = x^2 over the interval [0,1]. This is a conceptual and visual approach rather than a detailed
analysis.
Riemann Integral: The Riemann integral of f over [0,1] involves splitting the domain (x-axis) into small
subintervals, calculating the function value at some point within each subinterval (or several points,
depending on whether it’s a lower, upper, or midpoint sum), and summing up all these areas.
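Splitting the interval [0,1] into n equal subintervals, the i-th subinterval is [x_{i−1}, x_i] where x_i = i/n.
Using the midpoint Riemann sum:
\[
\int_0^1 x^2 \, dx
= \lim_{n \to \infty} \sum_{i=1}^{n} f\!\left(\frac{i - 0.5}{n}\right) \frac{1}{n}
= \lim_{n \to \infty} \sum_{i=1}^{n} \left(\frac{i - 0.5}{n}\right)^{2} \frac{1}{n}
\]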

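To evaluate the Riemann integral of f (x) = x^2 on the interval [0,1] using Riemann sums, we’ll use the midpoints
x_i^* = (i − 0.5)/n of the subintervals of the partition. The length of each subinterval is ∆x = 1/n.

The Riemann sum is:
\[
\sum_{i=1}^{n} f(x_i^*) \, \Delta x = \sum_{i=1}^{n} \left(\frac{i - 0.5}{n}\right)^{2} \frac{1}{n}
\]

Expanding the sum:
\[
\lim_{n \to \infty} \sum_{i=1}^{n} \left(\frac{i - 0.5}{n}\right)^{2} \frac{1}{n}
= \lim_{n \to \infty} \sum_{i=1}^{n} \frac{i^2 - i + 0.25}{n^3}
= \lim_{n \to \infty} \left( \frac{1}{n^3} \sum_{i=1}^{n} i^2 - \frac{1}{n^3} \sum_{i=1}^{n} i + \frac{0.25}{n^3} \sum_{i=1}^{n} 1 \right)
\]

Using the formulas for sums of squares and integer sums:
\[
\sum_{i=1}^{n} i^2 = \frac{n(n + 1)(2n + 1)}{6}, \qquad
\sum_{i=1}^{n} i = \frac{n(n + 1)}{2}, \qquad
\sum_{i=1}^{n} 1 = n
\]

Substituting in:
\[
= \lim_{n \to \infty} \left( \frac{1}{n^3} \cdot \frac{n(n + 1)(2n + 1)}{6} - \frac{1}{n^3} \cdot \frac{n(n + 1)}{2} + \frac{0.25}{n^3} \cdot n \right)
= \lim_{n \to \infty} \left( \frac{2n^3 + 3n^2 + n}{6n^3} - \frac{n^2 + n}{2n^3} + \frac{0.25}{n^2} \right)
\]

As n approaches infinity, the first term tends to 2/6 = 1/3, while the second and third terms tend to zero. Thus, the limit is 1/3.

So:
\[
\int_0^1 x^2 \, dx = \frac{1}{3}
\]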
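
The convergence of the midpoint sums can be observed numerically. This is a minimal sketch; the function name midpoint_riemann_sum and the chosen values of n are illustrative assumptions.

```python
def midpoint_riemann_sum(f, a, b, n):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    # The i-th midpoint is a + (i - 0.5) * dx for i = 1, ..., n.
    return sum(f(a + (i - 0.5) * dx) for i in range(1, n + 1)) * dx

approximations = {
    n: midpoint_riemann_sum(lambda x: x * x, 0.0, 1.0, n)
    for n in (10, 100, 1000)
}
for n, value in approximations.items():
    print(n, value, abs(value - 1 / 3))
```

For f(x) = x^2 the closed form of the sum derived above equals 1/3 − 1/(12n^2), so the error shrinks by a factor of 100 each time n grows tenfold.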
[Figure: graph of f (x) = x^2 on [0,1]; x-axis marked at 0.25, 0.5, 0.75, 1; y-axis marked at 0.25^2, 0.5^2, 0.75^2, 1^2]

Figure 4.2: Riemann Integration for f (x) = x^2 . The blue curve represents the function f (x) = x^2 over the
interval [0,1]. The vertical red lines demonstrate the “divide and conquer” approach of Riemann integration,
where the domain (x-axis) is partitioned into small subintervals. For each subinterval, the function value is
evaluated, usually at the midpoint, to determine the height of the rectangular area contributing to the integral.
The width of each rectangle is the length of the subinterval, and the area of each rectangle contributes to the
sum that approaches the integral as the number of subintervals increases.

In the Riemann representation, the red lines depict the domain divisions, while in the Lebesgue representation,
the green lines depict the codomain divisions. The blue curve is the graph of f (x) = x^2 .
Lebesgue Integral: The Lebesgue integral takes a different route: instead of dividing the domain, it divides
the codomain (y-values of the function). It addresses the question, “How large is the set of x’s for which the
function takes a value between c and c + dc?”.
For the function f (x) = x^2 , imagine a tiny horizontal strip at y = c, of thickness dc. The set of x-values
in [0,1] for which f (x) lies within this strip is the interval [√c, √(c + dc)]. The “width” of this interval is
√(c + dc) − √c.
Heuristically, the Lebesgue integral of f over [0,1] adds up each value c weighted by the measure of the set
on which f is close to c:
\[
\int_{[0,1]} f \, d\mu \approx \sum_{c} c \, \mu\!\left( f^{-1}([c, c + dc]) \right)
\]
For f (x) = x^2 , the measure µ of f^{-1}([c, c + dc]) is √(c + dc) − √c.
This integration process can be visualized as “stacking” horizontal strips of height c and width √(c + dc) − √c.
To numerically evaluate the Lebesgue integral of f (x) = x^2 over [0,1], we integrate over the codomain (y-
values), with c ranging over [0,1]. We want to find:
\[
\sum_{c} c \left( \sqrt{c + dc} - \sqrt{c} \right)
\]

To simplify this, we approximate the width to first order in dc:
\[
\Delta x \approx \sqrt{c + dc} - \sqrt{c}
\]

Expanding √(c + dc) using a Taylor series around c:
\[
\sqrt{c + dc} \approx \sqrt{c} + \frac{dc}{2\sqrt{c}}
\]

Thus:
\[
\Delta x \approx \frac{dc}{2\sqrt{c}}
\]

Substituting this into the sum and letting dc → 0 turns it into an integral:
\[
\int_0^1 c \, \frac{dc}{2\sqrt{c}} = \int_0^1 \frac{c \, dc}{2\sqrt{c}} = \frac{1}{2} \int_0^1 \sqrt{c} \, dc
\]

Evaluating the integral:
\[
\frac{1}{2} \left[ \frac{2}{3} c^{3/2} \right]_0^1 = \frac{1}{2} \cdot \frac{2}{3} \left[ 1^{3/2} - 0^{3/2} \right] = \frac{1}{3}
\]

Thus, the Lebesgue integral of f (x) = x^2 over [0,1] is 1/3.

This result is the same as if we had integrated x^2 using the conventional Riemann integral over the interval
[0,1].
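
The codomain-partition idea can also be tried numerically. The sketch below assumes a uniform partition of the range [0, 1] into m levels and uses the lower value of each level, so it builds a simple function lying below f; the function name lebesgue_sum_x_squared is hypothetical.

```python
import math

def lebesgue_sum_x_squared(m):
    """Lower Lebesgue-style sum for f(x) = x^2 on [0, 1].

    The range [0, 1] is split into m levels [c_low, c_high). For each level,
    the set of x with f(x) in that level is [sqrt(c_low), sqrt(c_high)),
    whose measure is sqrt(c_high) - sqrt(c_low).
    """
    total = 0.0
    for j in range(m):
        c_low, c_high = j / m, (j + 1) / m
        width = math.sqrt(c_high) - math.sqrt(c_low)  # measure of the preimage
        total += c_low * width                        # value times measure
    return total

for m in (10, 100, 10_000):
    print(m, lebesgue_sum_x_squared(m))
```

Because each level contributes its lowest value, the sums approach 1/3 from below as m grows, with an error of at most 1/m.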
[Figure: graph of f (x) = x^2 on [0,1]; y-axis marked at 0.25, 0.5, 0.75; x-axis marked at √0.25, √0.5, √0.75, √1]

Figure 4.3: Lebesgue Integration for f (x) = x^2 . The blue curve again represents the function f (x) = x^2 over
the interval [0,1]. Contrary to the Riemann method, the green horizontal lines depict the Lebesgue integration
approach where the codomain (y-values) is partitioned. For each segment of constant height in the codomain,
we ask: “Over which subinterval in the domain does the function lie within this y-value range?”. The integral
is computed by summing the areas of these horizontal strips, where each strip’s width varies based on the
function’s behavior and each strip’s height is determined by the y-value partition.

The primary difference between the two integrals is the focus: Riemann revolves around partitioning the
domain (x-axis) while Lebesgue is centered on partitioning the codomain (y-axis).

4.3. Examples of Lebesgue Integral


Example 4.1. Consider the function f : [0, 2] → R defined by f (x) = x^2 . Let’s compute the Lebesgue integral
of f over this interval. For comparison, the Riemann integral is:
\[
\int_0^2 x^2 \, dx = \left[ \frac{x^3}{3} \right]_0^2 = \frac{8}{3}
\]

For the Lebesgue integral, we proceed as follows:


(1) Choose a value y in the range of f , i.e., y ∈ [0, 4].


(2) Consider the set Ey = {x ∈ [0, 2] : f (x) > y}.
(3) Integrate over y considering the measure of Ey .

For our function f (x) = x^2 :

E_y = {x ∈ [0, 2] : x^2 > y}

E_y = {x ∈ [0, 2] : x > √y}

The measure of E_y in the space [0, 2] is:

m(E_y ) = 2 − √y

Now, we integrate:
\[
\int_0^4 \left( 2 - \sqrt{y} \right) dy
= \left[ 2y - \frac{2}{3} y^{3/2} \right]_0^4
= 8 - \frac{16}{3}
= \frac{8}{3}
\]
As seen, the Lebesgue and Riemann integrals coincide in this instance. This is not surprising for “simple”
functions like polynomials, but the Lebesgue integral has advantages when dealing with more complex and
“problematic” functions not easily handled by the Riemann integral.
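
The three steps above can be sketched numerically: integrate the measure of the superlevel sets over y. The code assumes the closed form m(E_y) = 2 − √y for this particular f and approximates the outer integral with a midpoint rule; the function names are illustrative.

```python
import math

def layer_cake_integral(measure_of_superlevel, y_max, n):
    """Midpoint-rule approximation of the integral of m(E_y) over [0, y_max]."""
    dy = y_max / n
    return sum(measure_of_superlevel((j + 0.5) * dy) for j in range(n)) * dy

def m_E(y):
    # E_y = {x in [0, 2] : x^2 > y} = (sqrt(y), 2], so m(E_y) = 2 - sqrt(y).
    return 2.0 - math.sqrt(y)

value = layer_cake_integral(m_E, 4.0, 100_000)
print(value, 8 / 3)
```

This formulation never evaluates f on a partition of the domain; it only needs the measure of the set where f exceeds each level y.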
[Figure: graph of f (x) = x^2 on [0, 2], rising to the value 4 at x = 2]

Figure 4.4: Illustration of the Lebesgue integration mechanism for f (x) = x^2 on the interval [0, 2]. The red
shaded regions represent the “vertical slices” which are summed to get the integral’s value.

Example 4.2. Consider the indicator function of the rationals χQ in the interval [0, 1], defined as:
\[
\chi_{\mathbb{Q}}(x) =
\begin{cases}
1 & \text{if } x \in \mathbb{Q} \\
0 & \text{if } x \notin \mathbb{Q}
\end{cases}
\]

The Riemann integral of this function over the interval [0, 1] does not exist since the function has discontinuities
everywhere.
However, using the Lebesgue integral, we can proceed:

(1) For y = 1, the set Ey = {x ∈ [0, 1] : χQ (x) = 1} is just the set of rationals in [0, 1], which is countable and
thus has Lebesgue measure 0.
(2) For y = 0, the set Ey = {x ∈ [0, 1] : χQ (x) = 0} is the set of irrationals in [0, 1], which has Lebesgue
measure 1 (since it complements the rationals).

Therefore, the Lebesgue integral of χQ over the interval [0, 1] is:


\[
\int_0^1 \chi_{\mathbb{Q}}(x) \, dm(x) = 1 \times 0 + 0 \times 1 = 0
\]

Thus, even though the Riemann integral of this function does not exist over the interval, its Lebesgue integral
is 0.
[Figure: rationals shown at height 1 and irrationals at height 0 over the interval [0, 1]]

Figure 4.5: Illustration of the Lebesgue integration mechanism for the indicator function of the rationals χQ
in the interval [0, 1]. The red shaded region visually represents the rationals, even though their measure is 0.
The yellow shaded region represents the irrationals.

Example 4.3. Consider the indicator function of the Cantor set χC in the interval [0, 1], defined as:
\[
\chi_{C}(x) =
\begin{cases}
1 & \text{if } x \in C \\
0 & \text{if } x \notin C
\end{cases}
\]

The Cantor set C is an uncountable set but has measure 0 in the interval [0, 1].
For the Lebesgue integral:

(1) For y = 1, the set Ey = {x ∈ [0, 1] : χC (x) = 1} is just the Cantor set C. Even though it is uncountably
infinite, its Lebesgue measure is 0.
(2) For y = 0, the set Ey = {x ∈ [0, 1] : χC (x) = 0} is the complement of the Cantor set in [0, 1], which has
Lebesgue measure 1 (since the entire interval minus the measure of the Cantor set is 1).

Therefore, the Lebesgue integral of χC over the interval [0, 1] is:


\[
\int_0^1 \chi_{C}(x) \, dm(x) = 1 \times 0 + 0 \times 1 = 0
\]

This is interesting because, while the Cantor set has uncountably many points (just as many as the entire
interval [0, 1]), its Lebesgue measure (and thus its contribution to the integral) is 0.
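
The reason the Cantor set has measure 0 can be seen by tracking the total length that remains after each removal of middle thirds. The sketch below uses exact rational arithmetic; the function names are illustrative and the standard middle-thirds construction is assumed.

```python
from fractions import Fraction

def cantor_intervals(n):
    """Closed intervals remaining after n middle-third removals from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for a, b in intervals:
            third = (b - a) / 3
            next_intervals.append((a, a + third))   # keep the left third
            next_intervals.append((b - third, b))   # keep the right third
        intervals = next_intervals
    return intervals

def total_length(intervals):
    return sum(b - a for a, b in intervals)

lengths = [total_length(cantor_intervals(n)) for n in range(6)]
print(lengths)
```

After n steps the remaining length is (2/3)^n, which tends to 0. Since the Cantor set is contained in every one of these finite unions, its Lebesgue measure is at most (2/3)^n for all n, hence 0.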
[Figure: segments of the Cantor set shown on the x-axis within [0, 1]]

Figure 4.6: Illustration of the Lebesgue integration mechanism for the indicator function of the Cantor set χC
in the interval [0, 1]. The green shaded regions represent the segments of the Cantor set after two iterations.
Even though the Cantor set is uncountably infinite, its measure is 0, as can be seen by the “missing” middle
thirds.

Example 4.4. Consider the Dirichlet function D(x) in the interval [0, 1], defined as:
\[
D(x) =
\begin{cases}
1 & \text{if } x \in \mathbb{Q} \\
0 & \text{if } x \notin \mathbb{Q}
\end{cases}
\]

This function is everywhere discontinuous in the interval [0, 1], making it non-integrable in the Riemann
sense.
For the Lebesgue integral:

(1) For y = 1, the set Ey = {x ∈ [0, 1] : D(x) = 1} is the set of rationals in [0, 1]. Since the rationals are
countable and dense in [0, 1], its Lebesgue measure is 0.
(2) For y = 0, the set Ey = {x ∈ [0, 1] : D(x) = 0} is the set of irrationals in [0, 1], which has Lebesgue
measure 1 (as it complements the rationals).

Therefore, the Lebesgue integral of D(x) over the interval [0, 1] is:
\[
\int_0^1 D(x) \, dm(x) = 1 \times 0 + 0 \times 1 = 0
\]

Even though the function is discontinuous everywhere, its Lebesgue integral is well-defined and equal to 0,
contrasting with its non-integrability in the Riemann sense.
[Figure: D(x) = 1 on rationals and D(x) = 0 on irrationals, plotted over the interval [0, 1]]

Figure 4.7: Illustration of the Lebesgue integration mechanism for the Dirichlet function D(x) in the interval
[0, 1]. The function is everywhere discontinuous, taking the value of 1 on rationals and 0 on irrationals.
The shaded area represents both rationals and irrationals as they are both dense in [0, 1], but the Lebesgue
measure of the rationals is 0.

Example 4.5. Consider the function f (x) in the interval [0, 1], defined as:

f (x) = x

For the Riemann integral: This function is continuous on its domain, so it’s Riemann-integrable. Using the
fundamental theorem of calculus, we find:
\[
\int_0^1 x \, dx = \left[ \frac{1}{2} x^2 \right]_0^1 = \frac{1}{2}
\]

For the Lebesgue integral:


Given the function’s simplicity, the sets Ey for various y values are straightforward:

(1) For y in the interval [0, 1], the set Ey = {x ∈ [0, 1] : f (x) ≥ y} is the interval [y, 1]. The Lebesgue measure
of this set is 1 − y.

Therefore, integrating the measure of E_y over y, the Lebesgue integral of f (x) over the interval [0, 1] is:
\[
\int_0^1 f(x) \, dm(x) = \int_0^1 m(E_y) \, dy = \int_0^1 (1 - y) \, dy = \left[ y - \frac{1}{2} y^2 \right]_0^1 = \frac{1}{2}
\]

As expected, for this simple function, both the Riemann and Lebesgue integrals give the same result: 1/2.

[Figure: graph of f (x) = x on [0, 1], with a horizontal strip of thickness ∆y at height y and length 1 − y]

Figure 4.8: Illustration of the Lebesgue integration mechanism for the function f (x) = x in the interval [0, 1].
The shaded horizontal strip represents a typical set Ey used in the Lebesgue integral, which in this case is
the interval [y, 1]. For each y value, the width of this horizontal strip is ∆y and its length is 1 − y.

Example 4.6. Consider the roll of a fair die and let X be the random variable representing the value appearing on
the top face. We wish to compute E[X].
First, we define the probability mass function of X, f (x), as:
\[
f(x) =
\begin{cases}
\frac{1}{6} & \text{if } x \in \{1, 2, 3, 4, 5, 6\} \\
0 & \text{otherwise}
\end{cases}
\]

The expectation E[X] is given by:
\[
E[X] = \int X(\omega) \, P(d\omega)
\]

Because P is concentrated on the six outcomes, this integral reduces to a sum. Using the definition of f (x), we compute:
\[
E[X] = \sum_{k=1}^{6} k \times \frac{1}{6} = \frac{1}{6} \times \frac{7 \times 6}{2} = \frac{42}{12} = 3.5
\]

Thus, the expected value of the dice roll, E[X], is 3.5. This makes intuitive sense, as it is the average of the
possible outcomes.
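
For a discrete distribution the integral with respect to P is a finite weighted sum, which is easy to verify exactly. The sketch below assumes the uniform probabilities of 1/6 from the example and uses exact fractions to avoid rounding.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

# E[X] = sum over outcomes of (value * probability).
expectation = sum(k * p for k, p in pmf.items())
print(expectation, float(expectation))
```

The same pattern works for any finite distribution: replace the dictionary with the relevant values and probabilities, and check that the probabilities sum to 1.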

[Figure: bar chart of f (x) × x against x = 1, 2, 3, 4, 5, 6, with bar heights 1/6, 2/6, 3/6, 4/6, 5/6, 6/6]

Figure 4.9: Illustration of the computation of E[X] for a fair dice roll. The height of each bar represents
x × f (x) for each value of x from 1 to 6. The expected value is the sum of these heights,
which gives (1/6) × (7 × 6)/2 = 3.5.

Example 4.7. Let X be a random variable with a uniform distribution on the interval [0, 2]. This implies the
probability density function f (x) of X is:
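\[
f(x) =
\begin{cases}
\frac{1}{2} & \text{if } x \in [0, 2] \\
0 & \text{otherwise}
\end{cases}
\]

The expectation E[X] is given by:
\[
E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx
\]

Since f (x) is non-zero only on [0, 2], our integral becomes:
\[
E[X] = \int_0^2 x \times \frac{1}{2} \, dx = \frac{1}{2} \times \left[ \frac{x^2}{2} \right]_0^2 = \frac{1}{2} \times 2 = 1
\]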

Hence, the expected value E[X] is 1, which is the midpoint of the interval [0, 2]. This is consistent with our
intuition for a uniform distribution: the expected value should be the average of the endpoints of the interval.
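
The continuous case can be checked with a simple quadrature. The sketch below assumes a midpoint rule with 10,000 subintervals; the function name expectation_uniform is hypothetical.

```python
def expectation_uniform(a, b, n=10_000):
    """Midpoint-rule approximation of the integral of x * f(x) over [a, b],
    where f(x) = 1 / (b - a) is the uniform density on [a, b]."""
    density = 1.0 / (b - a)
    dx = (b - a) / n
    return sum((a + (i + 0.5) * dx) * density for i in range(n)) * dx

value = expectation_uniform(0.0, 2.0)
print(value)
```

Because the integrand x × 1/2 is linear, the midpoint rule is exact up to floating-point rounding, so the result agrees with the midpoint of the interval.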
[Figure: constant density f (x) = 1/2 on [0, 2], with E[X] = 1 marked on the x-axis]

Figure 4.10: Uniform distribution of X on the interval [0, 2]. The constant height is f (x) = 1/2. The expected
value E[X], shown as the red dashed line, is the midpoint of the interval, which is 1.

4.4. Limitations of the Riemann Integral


Given a function f : [a, b] → R defined on a closed and bounded interval [a, b], the Riemann integral of f
over [a, b] is defined as follows:
(1) Partition the interval [a, b] into n subintervals such that:

a = x0 < x1 < · · · < xn−1 < xn = b

Each subinterval has length ∆xi = xi − xi−1 .


(2) Choose a point c_i in each subinterval, x_{i−1} ≤ c_i ≤ x_i .
(3) Form the sum:
\[
S(f, P, C) = \sum_{i=1}^{n} f(c_i) \, \Delta x_i
\]
where P is the partition and C is the set of chosen points c_i .

The function f is Riemann-integrable over [a, b] if and only if there exists a number I such that for every
ε > 0, there exists a δ > 0 such that if P is any partition of [a, b] and C is any set of chosen points c_i with
‖P‖ < δ , then:
|S( f , P,C) − I| < ε
Here, ‖P‖ is the length of the largest of the subintervals in the partition.
If such a number I exists, we say:
\[
\int_a^b f(x) \, dx = I
\]
Conditions or Restrictions for Using the Riemann Integral:


(1) Boundedness: The function f must be bounded on the interval [a, b]. This means there exist numbers M
and m such that:
m ≤ f (x) ≤ M
for every x ∈ [a, b].
(2) Discontinuities: A bounded function having a finite number of discontinuities in the interval [a, b] can still be
Riemann-integrable. However, if the function has “too many” discontinuities (e.g., a discontinuity at
every point, as for the indicator function of the rationals), it is not Riemann-integrable.
More formally, by Lebesgue’s criterion, a bounded function is Riemann-integrable on [a, b] if and only if it is
continuous almost everywhere in the interval of integration.

Lebesgue integration theory generalizes and extends the theory of Riemann integration. In simple terms,
any function that is Riemann-integrable on a closed interval is also Lebesgue-integrable on that interval, and
both integrals will yield the same value.
However, there are functions that are Lebesgue-integrable but not Riemann-integrable. As such, the set of
Lebesgue-integrable functions is broader than the set of Riemann-integrable
functions.
Therefore, there are no functions that can only be integrated in the Riemann sense and not in the Lebesgue
sense. If a function is Riemann-integrable, it will also be so in the Lebesgue context.
That being said, there are cases where the Riemann integral might be more straightforward or conceptually
simple for certain functions, especially those that are “nice” in the classical sense, such as continuous or
piecewise continuous functions. However, this doesn’t mean the Lebesgue integral cannot be applied to
these functions; it might simply be that in certain contexts, one might prefer the Riemann technique for its
simplicity or familiarity.

4.5. Understanding the Lebesgue Integral


Imagine being a painter, with a task to paint a line on paper. Instead of simply drawing a straight line, you
decide to paint individual points along that line. The Riemann integral, which most of us learn in school,
measures how much paint you used based on the length of each painted line segment. It’s as if you’re drawing
tiny rectangles along the line and summing their area.
The Lebesgue integral, on the other hand, takes a different approach. Instead of looking at how much paint
you used on each segment of line, you look at each drop of paint and decide where on the line it was used. It’s
as though you’re organizing the paint drops by color (or value) and then determining on which line segments
each group was used.

Riemann vs. Lebesgue


If we had to summarize the difference in one sentence, it would be:

(1) Riemann: “For each line segment, how much paint did I use?”
(2) Lebesgue: “For each drop of paint, where did I put it?”

An Illustrative Example
Imagine an artist with a sheet of paper and a set of markers of varying thicknesses. When they use a thick
marker, the line they draw is wider and thus uses more ink. When they use a thin marker, the line is finer and
uses less ink.
Riemann’s Approach: The artist looks at each segment of line they’ve drawn and asks, “How much ink have
I used here?” They then sum up all these amounts to get the total ink used.
Lebesgue’s Approach: Rather than looking at line segments, the artist looks at their markers one by one and
asks, “Where have I used this particular marker and how much?” They then sum up all this ink to get the
total.

[Diagram labels. Riemann’s Approach: sum ink of each segment separately. Lebesgue’s Approach: sum ink based
on marker used. Lines shown: thin, medium, thick, and mixed.]

Figure 4.11: Illustration of Riemann vs. Lebesgue integration approaches using the analogy of an artist with
markers of varying thicknesses.

Advantages of Lebesgue
The Lebesgue integral has some key advantages:

(1) Handling Complex Functions: There are functions that are challenging or impossible to integrate using
Riemann’s approach but become manageable with Lebesgue.
(2) Measure Theory: Lebesgue introduces the concept of “measuring” subsets of the real numbers. This is
foundational in modern probability theory.
(3) Consistency in Function Spaces: In certain situations, especially when working with function limits, the
Lebesgue integral offers more consistent and fewer surprising outcomes than the Riemann integral.

The Lebesgue integral might seem like a challenging and abstract concept, but its core idea is just about
shifting our perspective. Instead of looking at “tiny pieces” of a function’s domain (as we do with Riemann),
we look at the values the function takes and see where they appear in the domain. By doing so, we can tackle
a wider range of problems and gain a deeper understanding of the underlying math.
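
To make this change of perspective concrete, the following Python sketch approximates one integral in both ways. It is an illustration under stated assumptions, not a rigorous construction: the names `riemann_sum` and `lebesgue_style_sum` are hypothetical, the test function f(x) = x^2 on [0, 1] is chosen only for convenience, and the measure of each level set {x : f(x) > t} is estimated by counting sample points. The second routine relies on the layer-cake identity for non-negative functions, ∫ f dµ = ∫ µ({f > t}) dt over t ≥ 0, a standard fact that is not derived in this chapter.

```python
import bisect

def riemann_sum(f, a, b, n_cells=2000):
    """Slice the DOMAIN: add up f(midpoint) * cell width over [a, b]."""
    width = (b - a) / n_cells
    return sum(f(a + (i + 0.5) * width) for i in range(n_cells)) * width

def lebesgue_style_sum(f, a, b, n_levels=400, n_samples=2000):
    """Slice the RANGE of a non-negative f: add up dt * measure{x : f(x) > t}."""
    width = (b - a) / n_samples
    # Sorted sample values let us count how many samples exceed a level t.
    values = sorted(f(a + (i + 0.5) * width) for i in range(n_samples))
    dt = values[-1] / n_levels
    total = 0.0
    for k in range(n_levels):
        t = (k + 0.5) * dt  # midpoint of the k-th horizontal slice
        count_above = n_samples - bisect.bisect_right(values, t)
        total += dt * (count_above * width)  # estimated measure of {f > t}
    return total

def square(x):
    return x * x

print(riemann_sum(square, 0.0, 1.0))
print(lebesgue_style_sum(square, 0.0, 1.0))
```

Both numbers should be close to 1/3. The first routine asks "for each piece of the domain, how tall is the function?", while the second asks "for each height, how large is the part of the domain lying above it?".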

4.6. Limitations of the Lebesgue Integral


The Lebesgue integral, introduced by Henri Lebesgue in the 20th century, is an extension of the classical
notion of integration, the Riemann integral. The Lebesgue integral was designed to integrate functions that
the Riemann integral cannot handle. In this regard, it has broader applicability. Nevertheless, there are certain
functions and situations where the Lebesgue integral does not directly apply or does not resolve specific
problems:

(1) Non-measurable Functions: The Lebesgue integral is based on the concept of measurability. If a function
is not Lebesgue-measurable on a given set, it cannot be integrated using the Lebesgue integral directly
over that set.
(2) Sets of Undefined Measure: The Lebesgue integral hinges on measure theory. If a set doesn’t have a
well-defined measure, it is not immediately clear how one would integrate a function over that set using
the Lebesgue integral.
(3) Problems Requiring Different Approaches: There are problems in mathematical analysis that do not
reduce directly to integration or that require a different kind of integration, such as stochastic integration
in financial mathematics or the path integral in quantum physics.
(4) Problems Outside of Real Analysis: Although the Lebesgue integral is powerful within real analysis
and is foundational in areas like functional analysis, it is not the right tool for many problems in other
branches of mathematics, such as number theory, topology, geometry, and mathematical logic, among
others.
(5) Extremely Singular Functions: Even though the Lebesgue integral can handle more singular functions
than the Riemann integral, there are still functions that defy integration, like certain distributions in
functional analysis that require a different approach.
(6) Conditional Convergence: The Lebesgue integral addresses many convergence issues that the Riemann
integral cannot. However, the dominated convergence theorem and the monotone convergence theorem
have their own hypotheses; when these are not met, the limit of the integrals need not equal the integral
of the limit.

It’s essential to note that while there are problems the Lebesgue integral cannot directly address, this tool
has revolutionized mathematical analysis and expanded our ability to work with a much broader range of
functions and problems.

4.7. Conclusions
This chapter provided a deep dive into the intricate world of integration, a cornerstone of mathematical
analysis. Beginning with the fundamentals, we established a solid grounding by exploring the formal
definition and properties of the integral. The journey from understanding the essence of the Riemann Integral,
with its robust ties to partitioning and its importance in the realm of functions with finite discontinuities,
showcased the true beauty and complexity of mathematical definitions.
Our exploration took a fascinating twist with the case study of an “Unusual Roulette Spin.” This example
offered an intuitive perspective on the Lebesgue Integral and its application, ensuring the content remained
relatable and understandable, even for readers unfamiliar with advanced integration concepts.

Lastly, it was essential to address the limitations of the Lebesgue Integral. Despite its vast power and
flexibility in handling more intricate functions than the Riemann Integral, it is not without its constraints.
Recognizing these limitations not only helps ensure correct application but also emphasizes the importance
of continually pushing mathematical boundaries.
In sum, the chapter offered a balanced blend of formal definitions, practical applications, and critical
assessments, setting the stage for more advanced topics and encouraging readers to approach mathematical
problems with both rigor and creativity.

4.8. Exercises
Exercise 4.1. Consider the characteristic function $\chi_{\mathbb{Q}}$ which is 1 on rationals and 0 on irrationals in the
interval [0,1]. Compute the Lebesgue integral of $\chi_{\mathbb{Q}}$ over [0,1].

Exercise 4.2. Find the Lebesgue integral of the function f (x) = x over the interval [0,1].

Exercise 4.3. Consider the function f defined on [0,1] by f (x) = 1 if x is rational and f (x) = 0 if x is
irrational. Is f Riemann integrable?

Exercise 4.4. Consider the function $f(x) = x^2$ on [0,1]. Compare the value of the Riemann and Lebesgue
integrals of this function over the interval.

Exercise 4.5. Is the Lebesgue integral able to integrate any bounded function on a closed and bounded
interval [a,b]?

Exercise 4.6. Let f : [a, b] → R be a bounded function. Recall the definition of the Riemann integral and
provide an example of a function that is not Riemann integrable on a closed interval, explaining the reason
behind it.

Exercise 4.7. Consider a measurable space (X, M ) where X is the set and M is the σ -algebra of measurable
sets. Define what it means for a function to be measurable with respect to M . Afterwards, describe one
limitation of the Lebesgue integral with respect to such functions.
Chapter 5
Convergence in Measure Theory

Carlos Polanco
Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México.
Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, Ciudad de
México, México.

Keywords: dominated convergence theorem, Fatou’s lemma, Lebesgue integral, monotone convergence
theorem, Riemann integral.

5.1. Introduction
In the world of mathematical analysis, the process of integration serves as a fundamental tool, encapsulating
the notions of accumulation and area in its embrace. Classical Riemann integration, though rich and
instructive, exhibits certain limitations especially when dealing with intricate sequences of functions. The
Lebesgue integration, conceived in the crucible of measure theory, steps in to bridge these gaps, giving rise
to powerful theorems that address the behavior of integrals under limit processes.
In this chapter, we delve deep into two such monumental results: The Monotone convergence theorem and
the Dominated convergence theorem. These theorems address fundamental questions: How do we deal with
the integrals of sequences of functions? Can we interchange the roles of the limit and the integral?
The Monotone convergence theorem concerns itself with sequences of functions that are monotonically
increasing, providing conditions under which the integral of the limit is the limit of the integrals. On the other
hand, the Dominated convergence theorem operates under the setting where each function in the sequence is
“dominated” by another integrable function, offering a more generalized approach to swapping limits with
integrals.

(B) Carlos Polanco: Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiología “Ignacio
Chávez”. Ciudad de México, México. Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de
México; Tel: +01 55 5595 2220; E-mail: polanco@unam.mx

Both of these theorems have extensive applications in real analysis, functional analysis, and probability
theory. They not only furnish the mathematical rigor required to justify intuitive ideas but also pave the way
for more advanced topics like ergodic theorems and the differentiation of the integral.
Embark on this journey with us as we unravel the intricacies of these convergence theorems, providing clear
statements, illuminating proofs, and illustrative examples. By the end of this chapter, readers will possess
a solid understanding of these theorems, their importance, and their applications in various domains of
mathematics.

5.2. Monotone Convergence Theorem


In basic calculus, we integrate functions over intervals on the real line. In measure theory, this idea is
generalized, and we can “integrate” functions over more general sets using what’s known as the “Lebesgue
integral.”
Suppose you have a sequence of functions f1 , f2 , f3 , . . . that are all non-negative (i.e., they always take
values greater than or equal to zero). Imagine that these functions represent, for example, different heat
distributions over a metal rod, and each function fn represents more heat than the one before it in the sense
that fn+1 (x) ≥ fn (x) for every point x.
Now, assume this sequence of functions converges pointwise to a limiting function f . Using our analogy, we
could say that f is the “final” heat distribution after a very long time.
The Monotone convergence theorem tells us that if we integrate each function fn and then take the limit as
n goes to infinity, we get the same result as if we first take the limit of the functions and then integrate it.
Mathematically, this is expressed as:
\[
\lim_{n\to\infty} \int f_n = \int \lim_{n\to\infty} f_n = \int f
\]

Going back to our heat analogy: If you have several increasing heat distributions and want to know the total
amount of heat in the end, you can do it in two ways:

(1) Measure the heat of each distribution and then wait to see where these measurements tend.
(2) Wait until the heat stabilizes and then measure the final heat.

The Monotone convergence theorem assures us that both ways will yield the same result.
This theorem is pivotal because it allows swapping the order of the limit and the integral in certain situations,
making many calculations and proofs in analysis and probability theory easier.
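
As a numerical illustration of the swap, here is a minimal Python sketch. The sequence is an assumption made for this example and does not come from the text: f_n(x) = min(n, 1/sqrt(x)) on (0, 1], which is non-negative, non-decreasing in n, and converges pointwise to f(x) = 1/sqrt(x), whose integral over (0, 1] is 2. The helper names `truncated`, `midpoint_integral`, and `exact_integral` are hypothetical.

```python
def truncated(n):
    """f_n(x) = min(n, 1/sqrt(x)): non-negative and non-decreasing in n."""
    return lambda x: min(n, x ** -0.5)

def midpoint_integral(f, a, b, cells=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / cells
    return sum(f(a + (i + 0.5) * width) for i in range(cells)) * width

def exact_integral(n):
    # The cap n on (0, 1/n^2] contributes 1/n; 1/sqrt(x) on (1/n^2, 1]
    # contributes 2 - 2/n. Together: 2 - 1/n.
    return 2 - 1 / n

for n in (1, 2, 5, 10, 100):
    print(n, midpoint_integral(truncated(n), 0.0, 1.0), exact_integral(n))
```

The integrals 2 - 1/n increase toward 2, which is the integral of the limit function, as the Monotone convergence theorem predicts for this increasing sequence.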

Figure 5.1: Graphical representation of the Monotone convergence theorem. Displayed is a sequence of
functions f1 , f2 , . . . , fn (in blue) that monotonically increase and converge pointwise to a limiting function f
(in red). The key idea is that as the functions in the sequence approach the limiting function at each point,
the area under each curve (i.e., the integral of each function) also converges to the area under the curve of
the limiting function.

Example 5.1. Consider the function f(x) defined by the series
\[
f(x) = \sum_{n=1}^{\infty} \frac{1}{n^2}\, \chi_{\left(\frac{1}{n+1},\, \frac{1}{n}\right]}(x)
\]
where $\chi_{(a,b]}$ is the characteristic function of the half-open interval (a, b] (i.e., it takes the value 1 inside
the interval and 0 outside). We aim to find the value of
\[
\int_0^1 f(x)\, dx
\]
using the Monotone convergence theorem.

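Solution 5.1. First, observe that each function in the sequence defined by
\[
f_n(x) = \frac{1}{n^2}\, \chi_{\left(\frac{1}{n+1},\, \frac{1}{n}\right]}(x)
\]
is non-negative, so the partial sums $S_N = \sum_{n=1}^{N} f_n$ form a non-decreasing sequence of non-negative
functions. Also, f(x) is the pointwise limit of these partial sums.

By the Monotone convergence theorem, applied to the partial sums, we have
\[
\int_0^1 f(x)\, dx = \int_0^1 \sum_{n=1}^{\infty} f_n(x)\, dx = \sum_{n=1}^{\infty} \int_0^1 f_n(x)\, dx
\]

Now, integrating each $f_n(x)$ over the interval [0,1]:
\[
\int_0^1 f_n(x)\, dx = \frac{1}{n^2} \int_0^1 \chi_{\left(\frac{1}{n+1},\, \frac{1}{n}\right]}(x)\, dx
= \frac{1}{n^2} \left( \frac{1}{n} - \frac{1}{n+1} \right) = \frac{1}{n^3 (n+1)}
\]

Therefore,
\[
\int_0^1 f(x)\, dx = \sum_{n=1}^{\infty} \frac{1}{n^3 (n+1)}
\]

Using the partial fraction decomposition
\[
\frac{1}{n^3 (n+1)} = \frac{1}{n^3} - \frac{1}{n^2} + \left( \frac{1}{n} - \frac{1}{n+1} \right),
\]
the bracketed part is a telescoping series with sum 1, while the first two parts sum to $\zeta(3) = \sum 1/n^3 \approx 1.2021$
and $\pi^2/6$ respectively, leaving:
\[
\int_0^1 f(x)\, dx = \zeta(3) - \frac{\pi^2}{6} + 1 \approx 0.5571
\]

Thus, the integral of the function over the interval [0,1] is $\zeta(3) - \pi^2/6 + 1$, approximately 0.557.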


Figure 5.2: Graphical representation of the function f(x) defined by the series. The function is composed
of steps over intervals whose widths decrease as we move to the left on the x-axis, toward 0. Specifically,
each interval spans from $\frac{1}{n+1}$ to $\frac{1}{n}$ for each positive integer n, and its height is given by $\frac{1}{n^2}$. The first few
intervals, shaded in a light blue tone, visually depict how the series builds up the function by adding these
decreasing “steps” or intervals. As n increases, the height of each interval diminishes, approaching zero, and
the width becomes narrower. The series continues infinitely, but only a finite number of intervals are shown
for clarity.
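
The term-by-term integration in this example can be cross-checked numerically. The sketch below is a hedged illustration: it computes the integral of each step (1/n^2) χ_(1/(n+1), 1/n] exactly as height times interval length, using Python's `fractions` module, and adds the results. The function names `term_integral` and `partial_sum` are hypothetical, and the decimal value of ζ(3) (Apéry's constant) is entered as a known constant instead of being computed.

```python
import math
from fractions import Fraction

def term_integral(n):
    """Exact integral over [0, 1] of (1/n^2) * chi_(1/(n+1), 1/n]."""
    height = Fraction(1, n * n)
    length = Fraction(1, n) - Fraction(1, n + 1)  # Lebesgue measure of the interval
    return height * length

def partial_sum(N):
    """Sum of the first N term integrals (integral of the N-th partial sum)."""
    return sum(term_integral(n) for n in range(1, N + 1))

APERY = 1.2020569031595942  # zeta(3), assumed known
closed_form = APERY - math.pi ** 2 / 6 + 1

for N in (1, 10, 100, 500):
    print(N, float(partial_sum(N)))
print("zeta(3) - pi^2/6 + 1 =", closed_form)
```

The partial sums are non-decreasing in N, which is exactly the monotonicity the theorem needs, and they settle near 0.5571.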

The Lebesgue integral extends the traditional Riemann integral, capturing a broader class of functions.
Unlike the Riemann integral, the Lebesgue integral focuses more on “how much is being added” than on
“where it’s being added”. This provides it with properties absent in the Riemann integral.
The relationship between the Lebesgue integral and the Monotone convergence theorem highlights Lebesgue
integration’s predictable behavior concerning function convergence. It showcases its superiority over the
Riemann integral in specific scenarios, such as dealing with limits of function sequences.

Example 5.2. Consider the sequence of functions $f_n$ defined on the interval [0,1]:
\[
f_n(x) =
\begin{cases}
n & \text{if } \frac{1}{n+1} \le x \le \frac{1}{n}, \\
0 & \text{otherwise.}
\end{cases}
\]

Visually, fn represents a series of “spikes” that increase in height (to infinity) but decrease in width.


Figure 5.3: As n increases, the “spike” of the function fn moves to the left and grows in height, but the width
of the spike decreases. This results in the area under the spike (i.e., the integral of fn over [0,1]) decreasing
to zero. However, despite the fact that fn converges pointwise to the zero function on [0,1], the sequence
does not converge uniformly.

(1) Pointwise Convergence: For any x in [0,1], x will eventually be outside the interval $\left[\frac{1}{n+1}, \frac{1}{n}\right]$ as n grows.
Hence, $f_n(x)$ will converge to 0 for every x in [0,1]. In other words, $f_n$ converges pointwise to the zero
function on [0,1].
(2) Riemann Integral of $f_n$: The integral of each $f_n$ over [0,1] is just the area beneath the “spike”. Since the
spike has height n and width $\frac{1}{n(n+1)}$, the integral is:
\[
\int_0^1 f_n(x)\, dx = n \times \frac{1}{n(n+1)} = \frac{1}{n+1}
\]
As n → ∞, this integral tends to 0.
(3) Integral of the Limit Function: The limit function is the zero function over [0,1]. Hence, its integral is
also 0.
(4) Pointwise Convergence vs. Convergence of Integrals: Even though $f_n$ converges pointwise to the zero
function on [0,1], and each $f_n$ is integrable with an integral tending to 0, the sequence of functions does
not converge uniformly to 0 due to the “spikes” that grow indefinitely in magnitude.

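In this example the integrals do converge to the integral of the limit function, even though the convergence
is not uniform. Pointwise convergence alone, however, does not guarantee this: if the height n of each spike
is replaced by n(n + 1), the pointwise limit is still the zero function, but every integral equals 1. In the
context of the Lebesgue integral, the Monotone convergence theorem and Dominated convergence theorem were
developed to handle such discrepancies and provide clear conditions under which limits and integration can
be interchanged.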
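
The behaviour of the spikes can be checked with exact arithmetic. The following Python sketch is illustrative only: the names `spike_value` and `spike_integral` are hypothetical, and the second height, n(n + 1), is an assumption introduced here to contrast with the height n used in the example.

```python
from fractions import Fraction

def spike_value(n, x, height):
    """Value at x of a spike of the given height on [1/(n+1), 1/n]."""
    return height if Fraction(1, n + 1) <= x <= Fraction(1, n) else 0

def spike_integral(n, height):
    """Area of the spike: height times the length of [1/(n+1), 1/n]."""
    width = Fraction(1, n) - Fraction(1, n + 1)
    return height * width

# Pointwise behaviour at a fixed point: the spike passes x and never returns.
x = Fraction(1, 3)
print([spike_value(n, x, n) for n in range(1, 8)])

# Integrals for height n (tend to 0) and height n(n+1) (always 1).
for n in (1, 10, 100):
    print(n, spike_integral(n, n), spike_integral(n, n * (n + 1)))
```

With height n the integrals are 1/(n + 1) and tend to 0. With height n(n + 1) the pointwise limit is still 0 at every point, yet each integral equals 1, so the limit and the integral cannot be interchanged for that sequence.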

Example 5.3. Consider the function sequence $f_n$ defined on the interval [0,1] as follows:
\[
f_n(x) =
\begin{cases}
n & \text{if } \frac{1}{n+1} \le x \le \frac{1}{n}, \\
0 & \text{elsewhere.}
\end{cases}
\]

The Lebesgue integral handles situations where pointwise convergence and the convergence of integrals do
not align, unlike in the case of the Riemann integral.
For the function fn , we will compute its integral using the Lebesgue integral:
Recall that, intuitively, the Lebesgue integral “sums over sets first, then sums over the function”. Thus,
considering the indicator function on the set $\left[\frac{1}{n+1}, \frac{1}{n}\right]$, the Lebesgue integral of $f_n$ over [0,1] is calculated as:
\[
\int_0^1 f_n(x)\, d\mu = n \times \mu\!\left( \left[ \frac{1}{n+1}, \frac{1}{n} \right] \right)
\]
where µ is the Lebesgue measure.

The measure (in simple terms, length) of the interval $\left[\frac{1}{n+1}, \frac{1}{n}\right]$ is:
\[
\frac{1}{n} - \frac{1}{n+1} = \frac{n+1-n}{n(n+1)} = \frac{1}{n(n+1)}
\]

Thus, the Lebesgue integral of $f_n$ over [0,1] is:
\[
\int_0^1 f_n(x)\, d\mu = n \times \frac{1}{n(n+1)} = \frac{1}{n+1}
\]

Now, considering the pointwise convergence of fn to the zero function on [0,1], the Lebesgue integral of the
limit function (which is 0 over [0,1]) is also 0.
Therefore, the sequence of Lebesgue integrals of fn converges to 0, which is exactly the same value as the
Lebesgue integral of the limit function.
This showcases the power of the Lebesgue integral: it allows us to interchange limits and integrals in
situations where the Riemann integral might not, given the appropriate conditions (like the Dominated
convergence theorem or the Monotone convergence theorem).

Theorem 5.1. Let $\{f_n\}$ be a sequence of measurable non-negative functions on a measurable set E such that
\[
f_n(x) \le f_{n+1}(x)
\]
for all x ∈ E and for all n. If
\[
f(x) = \lim_{n\to\infty} f_n(x)
\]
for all x ∈ E, then
\[
\lim_{n\to\infty} \int_E f_n\, d\mu = \int_E f\, d\mu.
\]
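Proof. (1) Given that $f_n(x) \le f_{n+1}(x)$ for all x, the sequence $\{\int_E f_n\, d\mu\}$ is non-decreasing, so it has a
limit L in [0, ∞]. Since $f_n \le f$, each term is bounded by $\int_E f\, d\mu$, hence $L \le \int_E f\, d\mu$.
(2) For the reverse inequality, fix a simple measurable function s with $0 \le s \le f$ and a constant c with
0 < c < 1, and define $E_n = \{x \in E : f_n(x) \ge c\, s(x)\}$. Each $E_n$ is measurable and $E_n \subseteq E_{n+1}$. Moreover,
$\bigcup_n E_n = E$: where s(x) = 0 this is immediate, and where s(x) > 0 we have $c\, s(x) < s(x) \le f(x)$, so
$f_n(x) \ge c\, s(x)$ for all large n because $f_n(x) \to f(x)$.
(3) For every n,
\[
\int_E f_n\, d\mu \ge \int_{E_n} f_n\, d\mu \ge c \int_{E_n} s\, d\mu.
\]
(4) Since s is simple, $\int_{E_n} s\, d\mu$ is a finite sum of terms of the form $a_i\, \mu(A_i \cap E_n)$, and by the continuity of
the measure along increasing sequences of sets it converges to $\int_E s\, d\mu$. Letting n → ∞ in (3), we get
\[
L \ge c \int_E s\, d\mu.
\]
(5) Letting c → 1 and taking the supremum over all simple functions s with $0 \le s \le f$ gives $L \ge \int_E f\, d\mu$.
Hence,
\[
\lim_{n\to\infty} \int_E f_n\, d\mu = \int_E f\, d\mu.
\]

This concludes the proof of the theorem.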

5.3. Dominated Convergence Theorem


Imagine you’re trying to calculate the distance that several people run over a series of days. Each day, they
run slightly different distances, but tend to run a specific distance over time.
To find out the average distance they run after a long time (i.e., the limit), there are two ways to go about it:

(1) First, compute the average distance they run each day, and then find the average of all these distances
after many days.
(2) Or, wait many days, see where those distances are tending towards, and then compute that limit directly.

Mathematically, these two methods correspond to interchanging the order of a limit and an integral (or sum).
However, there are scenarios where these two approaches don’t give the same answer.
The “Dominated convergence theorem” is a tool that tells us when we can safely swap the order of the limit
and the integral and still get the same result.
But what does “dominated” mean here? Well, to be able to interchange these, we need our distances (or
functions, in mathematical language) to be “dominated” by a distance (or function) we know well and whose
integral (or sum) is finite.

Imagine all the people running, but there’s an Olympic runner among them who runs faster than everyone
else. If we can guarantee that none of the other runners will ever exceed this Olympic runner in speed (i.e.,
the distance they cover), then we say the distances of the other runners are “dominated” by the Olympic
runner’s distance.
Thus, the Dominated convergence theorem states that if all the functions (distances in our example) converge
pointwise and are dominated by an integrable function, then we can interchange the limit and the integral
safely.
In the context of measure theory, this theorem is foundational because it allows handling situations where
other techniques fail. Thanks to this theorem, the Lebesgue integral surpasses the Riemann integral in terms
of flexibility and applicability.


Figure 5.4: Illustration of the Dominated convergence theorem. The sequence of functions fn (dashed blue
lines) represents functions converging pointwise to the function f (solid red line) over a given interval.
The function g (solid green line) serves as a dominating function, such that for every n and for all x in
the domain, | fn (x)|≤ g(x). This means that g always lies above the absolute value of each fn and f . The
Dominated convergence theorem assures that under these conditions, the integral of the limit function f is
the limit of the integrals of the fn functions.

Example 5.4. A classic example showcasing the advantage of the Lebesgue integral over the Riemann
integral involves the indicator function of the rationals over an interval. Let f (x) be defined on the interval
[0, 1] as:
(
1, if x is rational,
f (x) =
0, if x is irrational.

This function is not Riemann-integrable because it is not continuous at any point within the interval [0, 1].
Every subinterval of any partition contains both rational and irrational points, so every upper Riemann sum
equals 1 while every lower Riemann sum equals 0, and the two can never coincide.
However, this function is Lebesgue-integrable. We can use the Lebesgue measure λ to calculate its integral:

\[
\int_{[0,1]} f(x)\, d\lambda(x) = \lambda(\{x \in [0,1] : f(x) = 1\}) \times 1 + \lambda(\{x \in [0,1] : f(x) = 0\}) \times 0
= 0 + 0 = 0. \tag{5.1}
\]

Here, λ denotes the Lebesgue measure, and $\lambda(\{x \in [0,1] : f(x) = 1\})$ and $\lambda(\{x \in [0,1] : f(x) = 0\})$ are
0 and 1, respectively: the rationals in [0, 1] form a countable set of Lebesgue measure zero, and the
irrationals in [0, 1] have Lebesgue measure one.
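
The computation in (5.1) is an instance of integrating a simple function: multiply each value by the measure of the set where it is taken, then add. A minimal Python sketch of that bookkeeping follows. The name `integrate_simple` and the list-of-pairs representation are assumptions made for illustration, and the measures are supplied by hand instead of being computed.

```python
from fractions import Fraction

def integrate_simple(pieces):
    """Lebesgue integral of a simple function given as (value, measure) pairs."""
    return sum(value * measure for value, measure in pieces)

dirichlet_on_unit_interval = [
    (1, Fraction(0)),  # value 1 on the rationals in [0, 1], a set of measure 0
    (0, Fraction(1)),  # value 0 on the irrationals in [0, 1], a set of measure 1
]
print(integrate_simple(dirichlet_on_unit_interval))
```

The same helper works for any function taking finitely many values on measurable sets of known measure, which is the starting point for defining the Lebesgue integral of more general functions.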

Dominated Convergence Theorem


Suppose we have a sequence of functions fn that converge pointwise to f on [0, 1], and imagine each fn
is “smoothed out” in some manner, making it more “friendly” for Riemann integration. If we wished to
compute
\[
\lim_{n\to\infty} \int_{[0,1]} f_n(x)\, dx,
\]
we could run into issues if trying to interchange the limit and the integral, especially if using the Riemann
integral. However, if we could find a function g(x) that dominates all the $f_n(x)$ and which is
Lebesgue-integrable, then the Dominated convergence theorem assures us that we can safely interchange the
limit and the Lebesgue integral:
\[
\lim_{n\to\infty} \int_{[0,1]} f_n(x)\, d\lambda(x) = \int_{[0,1]} \left( \lim_{n\to\infty} f_n(x) \right) d\lambda(x).
\]

This allows us to handle a wide range of functions and sequences of functions that wouldn’t be effectively
dealt with using the Riemann integral.
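
A small numerical sketch of this interchange, under assumptions not taken from the text: the sequence f_n(x) = n sin(x/n) on [0, 1] converges pointwise to f(x) = x and is dominated by g(x) = x, because |sin t| ≤ |t|. Its integrals have the closed form n^2 (1 - cos(1/n)), rewritten below as 2 n^2 sin^2(1/(2n)) to avoid cancellation for large n. The function names are hypothetical.

```python
import math

def f(n, x):
    """f_n(x) = n * sin(x / n), which tends to x as n grows."""
    return n * math.sin(x / n)

def integral_fn(n):
    """Closed form of the integral of f_n over [0, 1]."""
    return 2 * n * n * math.sin(1 / (2 * n)) ** 2

# Domination check on a grid: |f_n(x)| <= g(x) = x.
grid = [k / 100 for k in range(101)]
print(all(abs(f(n, x)) <= x + 1e-15 for n in (1, 5, 50) for x in grid))

# The integrals approach the integral of the limit function, which is 1/2.
for n in (1, 10, 100, 1000):
    print(n, integral_fn(n))
```

Since g(x) = x is integrable on [0, 1], the theorem applies, and the printed integrals approach 1/2, the integral of the limit function x.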

Example 5.5. Compute the limit of the following sequence of integrals:
\[
\lim_{n\to\infty} \int_0^1 n x^n (1 - x)\, dx
\]

Solution 5.2. Before applying the Dominated convergence theorem, observe the sequence of functions:
\[
f_n(x) = n x^n (1 - x)
\]
We need to find a function g(x) that dominates $f_n(x)$ on [0, 1] for all n, i.e., $|f_n(x)| \le g(x)$.

Consider the behavior of $f_n(x)$:

Since $f_n(x)$ is non-negative on the interval [0, 1], we don’t have to worry about negative values. Because
$f_n$ vanishes at both endpoints, it achieves its maximum value on [0, 1] at an interior point where its
derivative is zero.
Differentiating $f_n(x)$ with respect to x, we get:
\[
f_n'(x) = n^2 x^{n-1} (1 - x) - n x^n = n x^{n-1} \bigl( n - (n+1) x \bigr)
\]

Setting this to zero and solving, we find $x = \frac{n}{n+1}$. Plugging this value into $f_n(x)$, we see the function hits
a peak at this point, with value
\[
f_n\!\left( \frac{n}{n+1} \right) = n \left( \frac{n}{n+1} \right)^{n} \frac{1}{n+1} = \left( \frac{n}{n+1} \right)^{n+1}.
\]
As n goes to infinity, the location of the peak approaches 1 and its height approaches $1/e \approx 0.368$ from below.
The peak therefore does not shrink to 0 (the convergence is not uniform), but $f_n(x) < 1/e$ on [0, 1] for all n.
Thus, any function g(x) that is greater than or equal to $f_n(x)$ on [0, 1] for all n and has a finite integral
on [0, 1] will be a dominating function. A suitable choice is g(x) = 1, as $0 \le n x^n (1 - x) \le 1$ for all x ∈ [0, 1].
Now, the integral of g(x) over [0, 1] is:
\[
\int_0^1 g(x)\, dx = \int_0^1 1\, dx = 1 < \infty
\]

The sequence also converges pointwise: for 0 ≤ x < 1 we have $n x^n \to 0$, and $f_n(1) = 0$ for every n, so
$f_n(x) \to 0$ for every x ∈ [0, 1]. Hence, all the conditions of the Dominated convergence theorem are satisfied.
Finally, we interchange the limit and the integral:
\[
\lim_{n\to\infty} \int_0^1 n x^n (1 - x)\, dx = \int_0^1 \lim_{n\to\infty} n x^n (1 - x)\, dx
\]

Since $n x^n (1 - x)$ approaches 0 for every x ∈ [0, 1] as n goes to infinity, we get:
\[
\int_0^1 0\, dx = 0
\]

Thus,
\[
\lim_{n\to\infty} \int_0^1 n x^n (1 - x)\, dx = 0
\]
[Plot: fn(x) = nx^n(1 − x) on [0, 1] for n = 10, 20, 50, together with the limit f(x) = 0. Horizontal axis: x;
vertical axis: fn(x).]

Figure 5.5: The above figure illustrates the function sequence fn(x) = nx^n(1 − x) for different values of n.
As n increases, fn(x) converges pointwise to the zero function (shown in black) on the interval [0,1], while
staying below the constant dominating function g(x) = 1, as required by the Dominated convergence theorem.

This concludes our example using the Dominated convergence theorem.
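
The result can also be checked without the theorem, because these integrals have a closed form: n times the integral of (x^n - x^(n+1)) over [0, 1] equals n/((n + 1)(n + 2)). The Python sketch below is a hedged cross-check. The names `exact_integral` and `peak_value` are hypothetical, and the peak location x = n/(n + 1) comes from solving f_n'(x) = 0.

```python
import math
from fractions import Fraction

def f(n, x):
    return n * x ** n * (1 - x)

def exact_integral(n):
    """n * (1/(n+1) - 1/(n+2)) = n / ((n+1)(n+2))."""
    return Fraction(n, (n + 1) * (n + 2))

def peak_value(n):
    """Maximum of f_n on [0, 1], attained at x = n / (n + 1)."""
    x_star = n / (n + 1)
    return f(n, x_star)

for n in (1, 10, 100, 1000):
    print(n, float(exact_integral(n)), peak_value(n))
print("1/e =", 1 / math.e)
```

The integrals shrink to 0 while the peak heights approach 1/e, so the functions stay below the constant bound g(x) = 1 even though they do not converge uniformly.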

Example 5.6. Compute the limit of the following sequence of integrals:
\[
\lim_{n\to\infty} \int_0^{\pi/2} \sin(nx)\, dx
\]

Solution 5.3. Before applying the Dominated convergence theorem, check its hypotheses for the sequence of
functions:

(1) $f_n(x) = \sin(nx)$

(2) We need to find a function g(x) that dominates $f_n(x)$ on [0, π/2] for all n, i.e., $|f_n(x)| \le g(x)$.

The function sin(nx) oscillates between -1 and 1 for all values of n. Hence, a suitable dominating function
is g(x) = 1, as it satisfies $|\sin(nx)| \le 1$ for all x ∈ [0, π/2] and all n.
The integral of g(x) over [0, π/2] is:
\[
\int_0^{\pi/2} g(x)\, dx = \int_0^{\pi/2} 1\, dx = \frac{\pi}{2} < \infty
\]

So the domination hypothesis is satisfied. The theorem also requires pointwise convergence, and this is where
the example fails: to interchange the limit and the integral,
\[
\lim_{n\to\infty} \int_0^{\pi/2} \sin(nx)\, dx = \int_0^{\pi/2} \lim_{n\to\infty} \sin(nx)\, dx,
\]
the limit on the right-hand side would have to exist. However, for 0 < x ≤ π/2 the values sin(nx) keep
oscillating as n approaches infinity and do not converge. Hence, the Dominated convergence theorem cannot be
applied.
For the sequence given, it’s best to compute the integral directly:
\[
\int_0^{\pi/2} \sin(nx)\, dx = \left[ -\frac{1}{n} \cos(nx) \right]_0^{\pi/2} = \frac{1}{n} \bigl( 1 - \cos(n\pi/2) \bigr)
\]

Since cos(nπ/2) only takes the values 1, 0, and -1, the factor 1 − cos(nπ/2) stays between 0 and 2, so
\[
0 \le \int_0^{\pi/2} \sin(nx)\, dx \le \frac{2}{n}.
\]
Therefore, the limit of the integrals as n approaches infinity exists and equals 0, even though the functions
themselves have no pointwise limit.

[Plot: fn(x) = sin(nx) on [0, π/2] for n = 10, 20, 50, together with f(x) = 0. Horizontal axis: x; vertical
axis: fn(x).]

Figure 5.6: The figure illustrates the function sequence fn (x) = sin(nx) for different values of n. As n
increases, fn (x) oscillates faster within the interval [0, π /2]. The Dominated convergence theorem can’t
be applied directly as the sequence does not have a limit function in the interval.
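
Because the theorem is not available here, it helps to look at the integrals themselves. The following Python sketch evaluates the closed form (1 - cos(nπ/2))/n and compares it with a midpoint-rule approximation. The function names and the number of cells are illustrative assumptions.

```python
import math

def integral_sin(n):
    """Closed form of the integral of sin(n x) over [0, pi/2]."""
    return (1 - math.cos(n * math.pi / 2)) / n

def midpoint_integral(n, cells=20_000):
    """Midpoint-rule approximation of the same integral, as a cross-check."""
    width = (math.pi / 2) / cells
    return sum(math.sin(n * (i + 0.5) * width) for i in range(cells)) * width

for n in (1, 2, 3, 4, 7, 50, 51):
    print(n, integral_sin(n), midpoint_integral(n), 2 / n)
```

The values are not monotone in n, but each lies between 0 and 2/n, so the sequence of integrals is squeezed toward 0 even though sin(nx) itself has no pointwise limit on (0, π/2].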

Theorem 5.2. Let $f_n : E \to \mathbb{R}$ be a sequence of measurable functions that converge pointwise to a function
$f : E \to \mathbb{R}$ on a measurable set E. If there exists an integrable function $g : E \to [0, \infty)$ such that $|f_n(x)| \le g(x)$
for all x ∈ E and for all n, then f is integrable and
\[
\lim_{n\to\infty} \int_E f_n\, d\mu = \int_E f\, d\mu.
\]

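Proof. Step 1: Prove that f is integrable.

Since $|f_n(x)| \to |f(x)|$ pointwise, we have $|f(x)| = \liminf_{n\to\infty} |f_n(x)|$, and using Fatou’s Lemma, we get:
\[
\int_E |f|\, d\mu = \int_E \liminf_{n\to\infty} |f_n|\, d\mu \le \liminf_{n\to\infty} \int_E |f_n|\, d\mu \le \int_E g\, d\mu < \infty.
\]

Thus, f is integrable.

Step 2: Prove the equality of integrals. Given that $f_n \to f$ pointwise, we have $f_n - f \to 0$. Also,
$|f_n - f| \le |f_n| + |f| \le 2g$, so the functions $2g - |f_n - f|$ are non-negative, measurable, and converge
pointwise to 2g. Applying Fatou’s Lemma to this sequence, we get:
\[
\int_E 2g\, d\mu \le \liminf_{n\to\infty} \int_E \bigl( 2g - |f_n - f| \bigr)\, d\mu = \int_E 2g\, d\mu - \limsup_{n\to\infty} \int_E |f_n - f|\, d\mu.
\]
Since $\int_E 2g\, d\mu$ is finite, it can be cancelled from both sides, which gives $\limsup_{n\to\infty} \int_E |f_n - f|\, d\mu \le 0$.
Hence,
\[
\left| \int_E f_n\, d\mu - \int_E f\, d\mu \right| \le \int_E |f_n - f|\, d\mu \to 0.
\]
Therefore,
\[
\lim_{n\to\infty} \int_E f_n\, d\mu = \int_E f\, d\mu.
\]

This concludes the proof of the Dominated convergence theorem.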

Lemma 5.1. Imagine you’re observing a sequence of clouds (each cloud representing a function) that are
all floating above a city. Each cloud has varying densities across different parts, corresponding to the values
of the function at different points. You’re interested in finding out the minimal total rainfall that will come
from these clouds as they all rain down.
Fatou’s Lemma is a tool that helps you make an educated estimation: even if you can’t determine the exact
rainfall from each cloud immediately, you can make a pretty good guess about the least amount of rain the
city might experience in the future as more and more clouds pass.
Given a sequence of non-negative measurable functions, the integral of the limit inferior of this sequence is
less than or equal to the limit inferior of the integrals of the functions.
Let $\{f_n\}$ be a sequence of non-negative measurable functions. Then:
\[
\int \liminf_{n\to\infty} f_n\, d\mu \le \liminf_{n\to\infty} \int f_n\, d\mu
\]

The “lim inf” of a sequence is the smallest value the sequence keeps returning to as n grows larger (imagine,
at each location in the city, the lightest rainfall that the late clouds keep delivering). Fatou’s Lemma says:
when you first take this pointwise lim inf and then integrate (calculate the total rainfall for the city from
that “least rainy” limiting cloud), the value will always be less than or equal to what you get if you first
integrate each cloud individually and then take the lim inf of those totals.
Fatou’s Lemma often comes in handy when dealing with sequences of functions, especially in situations
where it’s challenging to find the exact limit of the sequence. By providing a bound, it can offer insights into
the behavior of the limit, even if we can’t explicitly find or evaluate it. It’s particularly important in the study
of the Lebesgue integral and is foundational for more advanced results, like the Dominated convergence
theorem.
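
The inequality in Fatou’s Lemma can be strict. A standard illustration, which is not taken from this chapter, is f_n = n on the open interval (0, 1/n) and 0 elsewhere: every integral equals 1, yet at each fixed point the values are eventually 0. The Python sketch below checks both facts with exact arithmetic. The function names are hypothetical, and `tail_infimum` is only a finite stand-in for the infimum over all n ≥ start that appears in the definition of lim inf.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n on the open interval (0, 1/n), and 0 elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Height n times width 1/n."""
    return n * Fraction(1, n)

def tail_infimum(x, start, horizon):
    """Smallest value of f_n(x) for start <= n <= horizon."""
    return min(f(n, x) for n in range(start, horizon + 1))

points = [Fraction(1, 7), Fraction(1, 50), Fraction(9, 10)]
print([tail_infimum(x, 60, 500) for x in points])  # pointwise tail infima
print([integral_f(n) for n in (1, 10, 100)])       # integrals of f_n
```

Here the integral of the lim inf is 0 while the lim inf of the integrals is 1, so the lemma gives 0 ≤ 1 with strict inequality. No integrable function dominates this sequence, which is why the Dominated convergence theorem does not force equality.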

5.4. Conclusions
In this chapter, we’ve navigated the intricate waters of measure theory to explore two pivotal theorems:
the Monotone convergence theorem and the Dominated convergence theorem. These theorems play an
indispensable role in the modern theory of integration, particularly within the context of the Lebesgue
integral.
The Monotone convergence theorem, which pertains to non-decreasing sequences of non-negative measurable
functions that converge pointwise, enables us to swap the order of taking a limit and integrating. It assures
us that, under the constraints of monotonicity, the integral of the limit is the limit of the integrals.
On the other hand, the Dominated convergence theorem provides us with a powerful tool to handle
situations where functions converge pointwise and are dominated by another integrable function. In essence,
it allows us to control and understand the behavior of integrals under the limit, even when the functions
involved are not necessarily monotonic.
Together, these theorems bolster our understanding of how limits and integrals interact, allowing for a
more profound exploration into complex analyses, probability theory, and functional analysis. As we proceed
to the subsequent chapters, we will see these theorems in action, forming the bedrock upon which many
advanced mathematical results are built. Readers are encouraged to frequently revisit this chapter as they
encounter the practical applications of the Monotone convergence theorem and Dominated convergence
theorem in various mathematical contexts.

5.5. Exercises
Exercise 5.1. Let $(f_n)$ be a sequence of functions defined on [0, 1] by $f_n(x) = x^n$. Show that the sequence
$(1 - f_n)$ satisfies the conditions of the Monotone Convergence Theorem, and use this to find
$\lim_{n\to\infty} \int_0^1 f_n(x)\, dx$.

Exercise 5.2. Consider a sequence of functions $(g_n)$ defined on [0, 1] by $g_n(x) = \frac{\sin(nx)}{n}$. Show that $(g_n)$
satisfies the conditions of the Dominated Convergence Theorem and find $\lim_{n\to\infty} \int_0^1 g_n(x)\, dx$.

Exercise 5.3. Let $f_n : \mathbb{R} \to \mathbb{R}$ be a sequence of functions given by:
\[
f_n(x) = \frac{\sin(nx)}{n^2}
\]
Prove that: (i) $\lim_{n\to\infty} \int_{\mathbb{R}} f_n(x)\, dx = 0$. (ii) $\int_{\mathbb{R}} \liminf_{n\to\infty} f_n(x)\, dx \le 0$.

Exercise 5.4. Consider the function $f : [0, 1] \to \mathbb{R}$ defined as:
\[
f(x) =
\begin{cases}
1 & \text{if } x \text{ is rational,} \\
0 & \text{if } x \text{ is irrational.}
\end{cases}
\]

(i) Show that f is Lebesgue integrable on [0,1] and compute its Lebesgue integral. (ii) Discuss the Riemann
integrability of f on [0,1].
Chapter 6
Lebesgue Measure on R^n

Carlos Polanco
Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México.
Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, Ciudad de
México, México.

Abstract The Lebesgue measure, a fundamental concept in Measure Theory, extends our basic intuition of
length, area, and volume to a more general and rigorous framework, especially in the context of $\mathbb{R}^n$. This
chapter introduces the Lebesgue measure on $\mathbb{R}^n$, elucidating its distinctions from the traditional Riemann
integration. Key topics include the construction of the Lebesgue measure, measurable sets, the measure’s
translation invariance, and its pivotal role in Lebesgue integration. By the chapter’s end, readers will
comprehend the significance of the Lebesgue measure in modern analysis and its applicability in various
mathematical contexts.

Keywords: analogies to understand multidimensional measure, Lebesgue measure on $\mathbb{R}^n$, measure in
multidimensional spaces, moving to higher dimensions: $\mathbb{R}^n$, product of measures.

6.1. Introduction
The real line $\mathbb{R}$ is a space we are all intimately familiar with, and $\mathbb{R}^n$, its multi-dimensional counterpart,
is a natural extension. For example, $\mathbb{R}^2$ represents the plane, and $\mathbb{R}^3$ embodies the three-dimensional space
we live in. When we speak of measuring “length” in $\mathbb{R}$, “area” in $\mathbb{R}^2$, or “volume” in $\mathbb{R}^3$, we are treading on
familiar ground. But how can we generalize this notion of measurement to any arbitrary set in $\mathbb{R}^n$? [11, 12]

(B) Carlos Polanco: Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México. Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de
México; Tel: +01 55 5595 2220; E-mail: polanco@unam.mx


And more importantly, how can we do it in a way that is consistent with our intuitions and yet powerful
enough to address more abstract and intricate sets? Enter the Lebesgue measure.
Historically, Henri Lebesgue introduced his measure in the early 20th century as a solution to some of the
inconsistencies and limitations of the Riemann integral. While Riemann’s approach to integration was based
on partitioning the domain (x-axis), Lebesgue’s idea was to partition the codomain (y-axis), leading to a
more flexible and encompassing measure and integration method.
In this chapter, we will dive deep into the construction of the Lebesgue measure on $\mathbb{R}^n$, its properties, and its
implications. We will explore how it provides a more inclusive way to measure “size” that gracefully handles
even the most elusive of sets. By the end of this chapter, readers will not only appreciate the elegance and
power of the Lebesgue measure but also understand its pivotal role in modern analysis.

6.2. Measure in Multidimensional Spaces


Imagine you have a piece of string. You can easily measure its length using a ruler. That’s a one-dimensional
measure. Now, if you spread out a sheet of paper, you would need to measure both its length and its width to
know its entire size. This is a two-dimensional measure. But if you’re dealing with a box, you would need
to measure its length, width, and height. That’s a three-dimensional measure.
Now, imagine if you were in a world with more than three dimensions (hard to visualize, I know). You’d need
even more measurements to fully understand the “size” or “volume” of an object in that space. This is where
the idea of “measure in multidimensional spaces” comes into play. It’s essentially about understanding the
“size” or “volume” of objects in spaces that have more than just the usual three dimensions we’re used to.
In mathematics, spaces with any number of dimensions (including potentially infinite dimensions!) can be
studied. When mathematicians talk about “measure,” they are trying to generalize this idea of “size” or
“volume” to any kind of space, regardless of how many dimensions it has.

Analogies to Understand Multidimensional Measure:


(1) Lines, Areas, and Volumes: As already mentioned, the concept starts from one-dimensional measures
like length, expands to two-dimensional measures like area, and further to three-dimensional measures
like volume.
(2) Slices of Bread: Think about a loaf of bread. If you slice it, each slice represents a two-dimensional
space (length and width of the slice). Stack them together, and you have a volume (a third dimension).
Imagine if each slice also had another kind of measurable “depth” or “density” to it. That would be like
adding another dimension.
(3) Shadows: Consider a shadow cast by a three-dimensional object. The shadow itself is two-dimensional,
but it’s determined by a three-dimensional shape. If we lived in a four-dimensional world, 3D objects
might cast 3D “shadows”!

Why is this Useful?


(1) Advanced Physics: Theories in modern physics, like string theory, sometimes deal with more than three
spatial dimensions. Having a way to measure in multidimensional spaces is crucial for these theories.
(2) Data Analysis: In the age of big data, we often deal with data sets that have many attributes. Each
attribute can be thought of as a dimension. Tools from multidimensional measure theory can sometimes
be applied to analyze such data.
(3) Mathematical Exploration: Sometimes, mathematicians study higher dimensions purely out of curiosity
and for the sake of exploration, not always because of an immediate real-world application.

In sum, measure in multidimensional spaces is an extension of our everyday understanding of length, area,
and volume. It helps mathematicians, scientists, and data analysts measure and understand “spaces” that go
beyond our usual three-dimensional world.

6.3. Product of Measures


Think of “measure” as a mathematical way of assigning a size or volume to sets. The most intuitive measure
for us is length for lines, area for surfaces, and volume for 3D objects. The Lebesgue measure, in simple
terms, is a method of measuring “how big” a set is, and it’s especially well-suited for some complex or
oddly-shaped sets.

Moving to Higher Dimensions: $\mathbb{R}^n$

Now, real life as we experience it has three spatial dimensions (length, width, and height). But mathematically,
we can explore spaces with any number of dimensions, even though it’s hard to visualize. In mathematics,
$\mathbb{R}^n$ represents n-dimensional space. For instance, $\mathbb{R}^2$ is like a flat sheet of paper (with length and width), and
$\mathbb{R}^3$ is like the regular 3D space we live in.

What Does it Mean?


When we talk about the “product of measures”, especially in the context of $\mathbb{R}^n$, we’re essentially trying
to build a way to measure sets in higher dimensions by using our understanding of measuring sets in one
dimension.
Imagine trying to determine the area (a 2D concept) of a rectangle. You’d multiply the length (a 1D concept)
by the width (another 1D concept). This multiplication is, in essence, a “product of measures”.
So, when moving to higher dimensions, the idea is similar. If you know how to measure things in 1D (using
the Lebesgue measure), and you want to measure something in 2D, you can use the “product” of the 1D
measures. For 3D, you’d use the product of the measures of the three 1D spaces, and so on.

Lebesgue Measure on $\mathbb{R}^n$
Let’s connect this back to the Lebesgue measure in higher dimensions. Say you’re in $\mathbb{R}^2$ (like a flat sheet of
paper). If you have a “rectangle” on this sheet, its “area” would be the product of the Lebesgue measures of
the intervals defining its sides.
Now, imagine a “box” in $\mathbb{R}^3$. Its “volume” would be the product of the Lebesgue measures of the intervals
defining its three sides.
As we increase dimensions, we just keep taking products of the Lebesgue measures of each of the intervals
defining the sides of the shape in $\mathbb{R}^n$.
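
The rule just described for boxes can be sketched in a few lines of Python. This is only an illustration for sets of the form $[a_1, b_1] \times \dots \times [a_n, b_n]$; the function name `box_measure` and the list-of-pairs input format are assumptions made for this sketch, not notation from the text.

```python
from math import prod

def box_measure(sides):
    """Lebesgue measure of the box [a1, b1] x ... x [an, bn].

    `sides` is a list of (a, b) pairs with a <= b (illustrative input format).
    """
    for a, b in sides:
        if a > b:
            raise ValueError("each interval must satisfy a <= b")
    # Product of the one-dimensional Lebesgue measures (lengths) of the sides.
    return prod(b - a for a, b in sides)

rectangle = [(0, 2), (1, 2)]        # a rectangle in R^2
cuboid = [(0, 2), (0, 3), (0, 4)]   # a box in R^3
print(box_measure(rectangle))       # 2
print(box_measure(cuboid))          # 24
```

The same function works in any dimension, which mirrors the point of the paragraph: only the number of factors changes.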

Why is This Important?


(1) Advanced Mathematics & Physics: In many mathematical and physical contexts, it’s vital to understand
and work with higher-dimensional spaces. By understanding how to measure in these spaces, we can
make more advanced computations and predictions.
(2) Consistency: It provides a consistent way to talk about the “size” or “volume” of sets across all
dimensions.
(3) Flexibility: It allows us to understand and work with very oddly-shaped sets in higher dimensions, sets
that might be difficult or impossible to tackle with more naive concepts of size or volume.

The product of measures, especially in the context of the Lebesgue measure on $\mathbb{R}^n$, provides a powerful
and consistent way to understand and work with the concept of “size” or “volume” across any number of
dimensions.

Example 6.1. Consider the evaluation of the triple integral of the function f (x, y, z) = xyz over the cuboid
defined by 0 ≤ x ≤ 2, 0 ≤ y ≤ 3, and 0 ≤ z ≤ 4.

Solution 6.1. The integral of interest is given by


\[ \int_E f(x, y, z)\, d\lambda, \]

where $E$ represents the cuboid, and $\lambda$ is the Lebesgue measure in $\mathbb{R}^3$.

The triple integral can be expressed as iterated integrals in each variable:


\[ \int_0^2 \int_0^3 \int_0^4 xyz \, dz \, dy \, dx. \]

Starting with the innermost integral with respect to z:


\begin{align*}
\int_0^4 xyz \, dz &= \left[ \frac{1}{2} xyz^2 \right]_0^4 \\
&= \frac{1}{2} xy(16) \\
&= 8xy.
\end{align*}

Next, integrate with respect to y:


\begin{align*}
\int_0^3 8xy \, dy &= \left[ 4xy^2 \right]_0^3 \\
&= 4x(9) \\
&= 36x.
\end{align*}

Finally, integrate with respect to x:


\begin{align*}
\int_0^2 36x \, dx &= \left[ 18x^2 \right]_0^2 \\
&= 18(4) \\
&= 72.
\end{align*}

Thus, the triple integral of f (x, y, z) = xyz over the defined cuboid is 72.
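
As a sanity check on the value 72, the iterated integrals can be approximated numerically. The sketch below uses a plain midpoint rule; the helper names `midpoint_integral` and `triple_integral_xyz` are illustrative, and the midpoint rule is simply a convenient choice here, not a method used in the text.

```python
def midpoint_integral(g, a, b, n=200):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def triple_integral_xyz(n=20):
    """Iterated integral of xyz over [0,2] x [0,3] x [0,4], in the order dz, dy, dx."""
    inner_z = lambda x, y: midpoint_integral(lambda z: x * y * z, 0.0, 4.0, n)
    inner_y = lambda x: midpoint_integral(lambda y: inner_z(x, y), 0.0, 3.0, n)
    return midpoint_integral(inner_y, 0.0, 2.0, n)

print(triple_integral_xyz())  # close to 72
```

Because the integrand is linear in each variable separately, the midpoint rule reproduces the exact value up to floating-point rounding even with few subintervals.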

Note that this particular example can be solved using ordinary iterated Riemann integrals, since the integrand is
continuous and the domain is a rectangular box. The advantages of Lebesgue integration become more evident for
more complex sets and functions that aren’t easily tackled using Riemann’s traditional approach.

Example 6.2. Consider the set $E$ in $\mathbb{R}^5$ defined by

\[ E = \{(x_1, x_2, x_3, x_4, x_5) \in \mathbb{R}^5 : x_2 = x_3 = x_4 = x_5 = 0,\ 0 \leq x_1 \leq 1\} \]

This set represents a line segment extending from the point (0,0,0,0,0) to the point (1,0,0,0,0) in $\mathbb{R}^5$. It can
be visualized as a “cable” stretching in the direction of the first coordinate but having no breadth in the other
directions.

Solution 6.2. To find the Lebesgue measure of this set in $\mathbb{R}^5$, we need to consider its “volume” in the
5-dimensional space. The set $E$ is the degenerate box $[0, 1] \times \{0\} \times \{0\} \times \{0\} \times \{0\}$: it has no extension in
the $x_2$, $x_3$, $x_4$, and $x_5$ directions, so each of those sides has one-dimensional measure zero. The only direction
in which it has an extension is $x_1$, where its measure is 1 (the length of the line segment).
The Lebesgue measure (or “volume”) of a box in $\mathbb{R}^5$ is the product of the one-dimensional measures of its sides.
Therefore,
\[ m(E) = 1 \times 0 \times 0 \times 0 \times 0 = 0 \]

Thus, even though $E$ has an extension in one dimension, its measure in $\mathbb{R}^5$ is zero.
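
Another way to see why the measure vanishes is to enclose $E$ in thin boxes $[0, 1] \times [-\varepsilon, \varepsilon]^4$, whose 5-dimensional volume is $(2\varepsilon)^4$ and shrinks to zero with $\varepsilon$. The short sketch below tabulates this; the name `covering_volume` is an illustrative choice, not notation from the text.

```python
def covering_volume(eps):
    """5-dimensional volume of the box [0, 1] x [-eps, eps]^4, which contains E."""
    return 1.0 * (2 * eps) ** 4

# The covering boxes shrink around E, and their volumes tend to 0.
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, covering_volume(eps))
```

Since $E$ sits inside every such box, monotonicity of the measure forces $m(E) \leq (2\varepsilon)^4$ for all $\varepsilon > 0$, hence $m(E) = 0$.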

This case illustrates how a set can have an extension in one dimension and yet have a zero measure in a
higher-dimensional space. It serves as an example of how Lebesgue measure in multidimensional spaces
can yield results counterintuitive to our usual perception of “size” or “volume”.

Example 6.3. Evaluate

\[ \int_{[0,2] \times [1,2]} xy \, d(\mu \times \nu) \]

where $\mu$ and $\nu$ are the standard Lebesgue measures on $\mathbb{R}$.

Solution 6.3. (1) The integral can be decomposed as a double integral:

\[ \int_{[0,2] \times [1,2]} xy \, d(\mu \times \nu) = \int_0^2 \left( \int_1^2 xy \, d\nu(y) \right) d\mu(x) \]

(2) Compute the inner integral first:


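\begin{align*}
\int_1^2 xy \, d\nu(y) &= \int_1^2 xy \, dy \\
&= \left[ \frac{xy^2}{2} \right]_1^2 \\
&= x \left( \frac{4}{2} - \frac{1}{2} \right) \\
&= \frac{3x}{2}
\end{align*}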
(3) Now, integrate the above expression with respect to $x$:
\begin{align*}
\int_0^2 \frac{3x}{2} \, dx &= \left[ \frac{3x^2}{4} \right]_0^2 \\
&= \frac{3(2^2)}{4} - \frac{3(0^2)}{4} \\
&= 3
\end{align*}

Thus,
\[ \int_{[0,2] \times [1,2]} xy \, d(\mu \times \nu) = 3. \]
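
The decomposition into iterated integrals can be checked numerically in both orders of integration. The following is a minimal sketch using a midpoint rule; the function names are illustrative assumptions, and agreement of the two orders is what one expects here because the integrand is continuous on a bounded rectangle.

```python
def midpoint_integral(g, a, b, n=100):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def integrate_y_then_x(n=100):
    """Inner integral over y in [1, 2], outer integral over x in [0, 2]."""
    return midpoint_integral(
        lambda x: midpoint_integral(lambda y: x * y, 1.0, 2.0, n), 0.0, 2.0, n)

def integrate_x_then_y(n=100):
    """Inner integral over x in [0, 2], outer integral over y in [1, 2]."""
    return midpoint_integral(
        lambda y: midpoint_integral(lambda x: x * y, 0.0, 2.0, n), 1.0, 2.0, n)

print(integrate_y_then_x())  # close to 3
print(integrate_x_then_y())  # close to 3
```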

Example 6.4. Evaluate

\[ \int_{B(0,1)} z \, d\lambda \]

where $B(0, 1)$ is the unit ball centered at the origin in $\mathbb{R}^3$ and $\lambda$ is the standard Lebesgue measure in $\mathbb{R}^3$.

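Solution 6.4. (1) Observe that the integrand is odd in $z$ and $B(0, 1)$ is symmetric with respect to the $xy$-plane.
Thus, the integral over the upper half-ball (where $z > 0$) is the negative of the integral over the lower
half-ball (where $z < 0$), and the two contributions cancel.

(2) To confirm this by direct computation, use spherical coordinates. The transformation is:

\begin{align*}
x &= r \sin\theta \cos\varphi \\
y &= r \sin\theta \sin\varphi \\
z &= r \cos\theta
\end{align*}

with $r \in [0, 1]$, $\theta \in [0, \pi]$, and $\varphi \in [0, 2\pi]$ for the full ball (the upper half-ball corresponds to $\theta \in [0, \pi/2]$).

(3) The volume differential in spherical coordinates is $r^2 \sin\theta \, dr \, d\theta \, d\varphi$.
(4) Set up the integral:
\[ \int_{B(0,1)} z \, d\lambda = \int_0^1 \int_0^{\pi} \int_0^{2\pi} r^3 \cos\theta \sin\theta \, d\varphi \, d\theta \, dr \]

(5) Integrate first with respect to $\varphi$:

\[ 2\pi \int_0^1 \int_0^{\pi} r^3 \cos\theta \sin\theta \, d\theta \, dr \]

(6) Then with respect to $\theta$, using the antiderivative $\frac{1}{2}\sin^2\theta$ of $\cos\theta \sin\theta$:

\begin{align*}
2\pi \int_0^1 r^3 \left[ \frac{\sin^2\theta}{2} \right]_0^{\pi} dr &= 2\pi \int_0^1 0 \, dr \\
&= 0
\end{align*}

(7) For comparison, restricting to $\theta \in [0, \pi/2]$ gives $2\pi \cdot \frac{1}{4} \cdot \frac{1}{2} = \frac{\pi}{4}$ for the upper half-ball, and the
lower half-ball contributes $-\frac{\pi}{4}$, in agreement with step (1).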

Thus,
\[ \int_{B(0,1)} z \, d\lambda = 0. \]
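
A quick numerical check in spherical coordinates makes the cancellation visible: the upper half-ball alone contributes about $\pi/4$, the lower half-ball contributes the negative of that, and the total is zero. The sketch below factors the integral into its radial, polar, and azimuthal parts; the helper names are illustrative assumptions.

```python
import math

def midpoint_integral(g, a, b, n=2000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def z_integral_over_ball(theta_max):
    """Integral of z over the part of the unit ball with polar angle in [0, theta_max]."""
    radial = midpoint_integral(lambda r: r ** 3, 0.0, 1.0)
    polar = midpoint_integral(lambda t: math.cos(t) * math.sin(t), 0.0, theta_max)
    azimuthal = 2 * math.pi  # the integrand does not depend on the azimuthal angle
    return radial * polar * azimuthal

upper_half = z_integral_over_ball(math.pi / 2)  # close to pi/4
full_ball = z_integral_over_ball(math.pi)       # close to 0
print(upper_half, full_ball)
```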

6.4. Conclusions
In this chapter, we delved deep into the concept of the Lebesgue measure, specifically within the realm
of Rn . The Lebesgue measure serves as a cornerstone in modern measure theory, offering a profound and
comprehensive way of understanding “size” or “volume” that surpasses the limitations of the more intuitive,
but restrictive, Riemannian methods.
We learned how the Lebesgue measure is defined on $\mathbb{R}^n$ and its implications for functions and subsets within
this space. Notably, this measure offers a way to handle sets with intricate structures that would be elusive to
classical methods. Moreover, the introduction of the concept of “measurable” functions and sets has opened
doors to a broader and more flexible understanding of integration, which is crucial for advanced mathematical
analysis.
Furthermore, the power of the Lebesgue measure becomes evident when we consider its applications in
various fields of mathematics and its central role in Lebesgue integration. It provides us with tools to tackle
problems that were previously deemed unsolvable or ill-defined.
As we move forward in our journey through measure theory, the principles and ideas introduced in this
chapter will serve as a foundation. The Lebesgue measure on $\mathbb{R}^n$ is just the tip of the iceberg, and as we
explore more general spaces and measures, we will frequently revisit and build upon these concepts.
In the upcoming chapters, we will generalize our understanding, venturing into more abstract measure spaces
and discovering the beauty and intricacies of the mathematical universe that lies ahead.

6.5. Exercises
Example 6.5. Given a rectangle $R$ in $\mathbb{R}^2$ defined as $R = \{(x, y) : 1 \leq x \leq 2,\ 2 \leq y \leq 4\}$, find the 2-dimensional
Lebesgue measure $\lambda_2(R)$.

Example 6.6. Let $\mu$ and $\nu$ be measures on the sets $X = \{1, 2\}$ and $Y = \{a, b\}$, respectively, such that
$\mu(\{1\}) = 2$, $\mu(\{2\}) = 3$, $\nu(\{a\}) = 4$, and $\nu(\{b\}) = 5$. Find the product measure $(\mu \times \nu)(A)$ of the set
$A = \{(1, a), (2, b)\}$.

Exercise 6.1. Consider the set $A$ which is a unit square in $\mathbb{R}^2$ and the set $B$ which is a unit cube in $\mathbb{R}^3$. Provide
an intuitive analogy for understanding the difference in their measures, and describe how the concept of
“measure” extends from $\mathbb{R}^2$ to $\mathbb{R}^3$.

Exercise 6.2. Let $A$ be a rectangle in $\mathbb{R}^2$ defined by $A = [a_1, b_1] \times [a_2, b_2]$. Express the Lebesgue measure of
$A$ in terms of $a_1$, $b_1$, $a_2$, and $b_2$. Further, explain how this can be seen as a product of measures in $\mathbb{R}$.
Chapter 7
Measure Theory in General Spaces

Carlos Polanco
Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México.
Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, Ciudad de
México, México.

Abstract In this chapter, we embark on a rigorous exploration of measures within the realm of topological
spaces, focusing on their construction, properties, and applications. We begin by introducing the general
concept of measures in topological spaces, elucidating how sets are assigned a numerical value representing
“size” in a manner that aligns with our intuitive understanding, yet is adaptable to more abstract settings. Our
journey continues with a deep dive into Borel measures, a class of measures defined on the Borel σ -algebra
of a topological space, providing insights into their significance in integration theory and the foundational
Lebesgue measure. The Radon measures follow suit, encompassing locally finite Borel measures and
connecting topological properties with measure-theoretic ones. This chapter concludes with a discussion
on Dirac measures, particularly the Dirac delta “function,” underscoring its role in distribution theory and
its pivotal applications in physics and engineering. Throughout this exposition, emphasis is placed on the
intertwining of topology and measure theory, illustrating how these two branches of mathematics elegantly
complement each other, leading to profound implications both within and outside the mathematical sphere.

Keywords: Borel measures, connection with measure theory in general spaces, Dirac measures, measures
in topological spaces, Radon measures.

(B) Carlos Polanco: Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México. Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de
México; Tel: +01 55 5595 2220; E-mail: polanco@unam.mx


7.1. Introduction
The mathematical study of measures, fundamental in understanding the size and structure of sets, is an
indispensable tool in various branches of analysis. Within the realm of topological spaces, these measures
acquire nuances and complexities that require rigorous examination and understanding. This chapter offers
a systematic and formal exploration of measures within the context of topological spaces.
Initially, we shall elucidate the basic principles underpinning the concept of measures in topological
spaces [13, 14]. It is imperative to establish a robust framework to comprehend the intricacies and
technicalities of attributing “size” or “measure” to sets in these spaces.
Subsequent to this foundational discussion, the exposition transitions to the study of Borel measures. Derived
from Borel sets, which themselves emerge from countable operations on open sets, Borel measures play an
integral role in bridging topology with more applied domains, including real analysis and probability theory.
Our discourse then advances to Radon measures. Named in honor of Johann Radon, these measures are
specifically tailored for locally compact Hausdorff spaces. Their utility and relevance in this specialized
class of topological spaces warrant a detailed investigation.
Concluding this chapter, we shall turn our attention to the Dirac measures. With origins intertwined with
the Dirac delta function, these measures have vast implications across various mathematical and physical
disciplines. Their unique ability to encapsulate point sources or singularities will be rigorously explored.
The objective of this chapter is to provide the reader with a comprehensive and formal understanding of
the measures within topological spaces, emphasizing the interplay and connections amongst the various
categories. It is hoped that this rigorous treatise will serve as a valuable resource for advanced students and
researchers in the field.

7.2. Measures in Topological Spaces


To understand the concept of measure in topological spaces, we need to first grasp a few key terms.

(1) Topological Space: Imagine a set of points with certain properties that allow us to discuss notions such
as “close” or “far”, without necessarily having a concrete idea of distance. This structure allows us to
talk about concepts like continuity and limits in a very general way.
(2) Measure: It’s a way to assign a “size” or “volume” to sets of points in a space. For instance, on the real
line, the “size” of an interval is its length. But measures can be more general and don’t necessarily have
to be lengths, areas, or volumes.

(1) Definition of Measure: A measure in a topological space is a function that takes a measurable set and returns a
non-negative number, possibly infinite (representing the “size” of the set), which satisfies certain basic properties:

(a) The measure of the empty set is 0.

(b) The measure is “additive”: If you take two sets that don’t overlap, the measure of their union is the
sum of their individual measures. More precisely, the same must hold for any countable collection of
pairwise disjoint sets.

(2) Measures and Topology: The interplay between topology and measure is rich and complex. Not all sets
in a topological space have a well-defined measure. The sets for which a measure is well-defined are
termed “measurable”.
(3) Regular Measures: In many topological spaces, one wishes for measures to have additional properties
related to the topological structure of the space. A regular measure is one that interacts well with open
and closed sets of the space. For instance, in a topological space, we might want any measurable set to be
approximated from the outside by open sets whose measure is arbitrarily close to that of the set.
(4) Borel Measures: In the context of metric spaces (which are a type of topological space where you can
precisely measure distance), a Borel measure is one that is defined for all “sufficiently simple” sets
(Borel sets), which include open sets, closed sets, and more.

Why is this important?


Understanding how to define and work with measures in topological spaces is fundamental in many areas of
mathematics and physics. For instance, probability theory in general spaces requires these notions. Measures
enable us to talk about concepts like density, probability, and size in a very broad and general framework.

7.2.1. Borel Measures


Imagine you have a box filled with building blocks of different shapes and sizes. If you wanted to know how
many blocks are in total, you’d just count them one by one. But what if you wanted to know the total “size”
of the blocks inside the box? You’d need a way to measure each block and then sum those “sizes” together.
In mathematics, this concept of “size” is termed “measure”. And the way we measure certain mathematical
objects is based on specific rules to ensure our “measurement” makes sense.
Borel sets are like those blocks in the box. They are a specific type of subsets in mathematical spaces, and
they have a structure that allows us to measure them coherently.
A Borel measure is a tool that lets us assign a number (think of it as a “size”) to these Borel sets. This tool
must follow certain rules to make sure the measures we assign make sense and are consistent.
For instance, one of the rules is “additivity”. Imagine you have two blocks that don’t overlap and you want
to measure their total “size”. If you measure each block separately and then sum those measurements, you
should get the same result as if you measured both blocks together as one unit. The Borel measure ensures
this happens.

Example 7.1. Consider the real number line. A simple Borel set might be an interval, like [1, 2], which
contains all numbers between 1 and 2. If we use the Lebesgue measure (a particular Borel measure) on this interval,
we’d say its “size” is exactly 1 (because 2 minus 1 is 1). If we take another interval, [2, 4], its “size” would be 2.
Summing up the “sizes” of these two intervals gives us 3, which is what we’d expect for [1, 4], since the two intervals
share only the single point 2, and a single point has measure zero.

[Figure: real number line with tick marks at 1, 2, and 4.]

Figure 7.1: A visual representation of the intervals on the real number line. The interval [1,2] is shown in
blue and [2,4] in red.

Definition 7.1. A Borel measure is a function defined on the Borel $\sigma$-algebra that assigns to each Borel set
a non-negative extended real number, typically interpreted as the “size” or “measure” of the set.
Formally, if $\mathcal{B}(\mathbb{R}^n)$ denotes the Borel $\sigma$-algebra on $\mathbb{R}^n$ (the collection of all Borel sets in $\mathbb{R}^n$), then a measure
$m$ on $\mathcal{B}(\mathbb{R}^n)$ is a function:
\[ m : \mathcal{B}(\mathbb{R}^n) \to [0, \infty] \]
that satisfies the following properties:

(1) Non-negativity: For every set $E \in \mathcal{B}(\mathbb{R}^n)$, we have $m(E) \geq 0$.

(2) Measure of the Empty Set: $m(\emptyset) = 0$.
(3) $\sigma$-Additivity: If $\{E_i\}_{i=1}^{\infty}$ is a collection of sets in $\mathcal{B}(\mathbb{R}^n)$ that are pairwise disjoint (i.e., $E_i \cap E_j = \emptyset$ for all
$i \neq j$), then:
\[ m\left( \bigcup_{i=1}^{\infty} E_i \right) = \sum_{i=1}^{\infty} m(E_i). \]

The Lebesgue measure is a particularly important example of a Borel measure on $\mathbb{R}^n$, but there are other
Borel measures depending on the context and the requirements of the application.

Example 7.2. The Lebesgue measure on $\mathbb{R}^n$, denoted by $m$, is an extension of our conventional notion of
length, area, and volume. For an interval $I = [a, b] \subset \mathbb{R}$, the Lebesgue measure is defined as:

\[ m(I) = b - a. \]

The Lebesgue measure possesses the crucial property of being translation invariant; that is, if $E$ is a
measurable set and $x \in \mathbb{R}^n$, then:
\[ m(E + x) = m(E), \]
where $E + x$ denotes the translation of $E$ by $x$.

Example 7.3. The counting measure is another Borel measure, commonly defined on discrete spaces. If X is
a discrete set (like the integers, Z), the counting measure, denoted by µ , is defined as:

µ (E) = number of elements in E.

For instance, for the set E = {1, 2, 3}:


µ (E) = 3.
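
For finite sets, the counting measure and its additivity on disjoint sets can be checked directly. The sketch below is a minimal illustration restricted to finite sets; the function name `counting_measure` is an assumption made for this example.

```python
def counting_measure(E):
    """Counting measure of a finite set E: its number of elements."""
    return len(set(E))

E1 = {1, 2, 3}
E2 = {10, 20}
assert E1.isdisjoint(E2)  # additivity is only claimed for disjoint sets

print(counting_measure(E1))       # 3
print(counting_measure(E1 | E2))  # 5, equal to counting_measure(E1) + counting_measure(E2)
print(counting_measure(set()))    # 0, the measure of the empty set
```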

Example 7.4. Let’s solve a Lebesgue integral using a Borel measure on the real line. We’ll first describe the
context, then set up the function we want to integrate, and finally solve the integral step by step.
Assume we are working in the measurable space (R, B(R)), where B(R) denotes the Borel sigma-algebra
on the real line. The Lebesgue measure, which we’ll denote by m, is a measure defined on this sigma-algebra.

For simplicity, but also for illustrative purposes, consider the function f : R → R defined by:
\[ f(x) = \begin{cases} x^2 & \text{if } x \in [0, 1], \\ 0 & \text{otherwise.} \end{cases} \]

Solution 7.1. (1) Restrict the Domain: Since f (x) = 0 outside the interval [0,1], the Lebesgue integral of f
over all of R is simply the integral of f over [0,1].
(2) Evaluate the Integral: The Lebesgue integral of f over [0,1] looks like:
\[ \int_{\mathbb{R}} f \, dm = \int_0^1 x^2 \, dm(x) \]

where $dm(x)$ is simply $dx$ for the Lebesgue measure on $\mathbb{R}$.


(3) Compute the Integral: Now, treat this integral like any Riemann integral (since the function is straightforward
and continuous on the interval [0,1]):
\[ \int_0^1 x^2 \, dx = \left[ \frac{x^3}{3} \right]_0^1 = \frac{1^3}{3} - \frac{0^3}{3} = \frac{1}{3} \]

Thus, the Lebesgue integral of $f$ over $\mathbb{R}$ with respect to the Lebesgue measure is $\frac{1}{3}$.
It’s worth noting that the Lebesgue and Riemann integrals agree in this case because we’re dealing with
a straightforward function on a finite interval. However, the true power of the Lebesgue integral manifests
when dealing with more intricate functions and more exotic sets where the Riemann integral isn’t well-
defined or is more cumbersome to apply.
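
To see the Lebesgue point of view at work on this example, one can slice the range instead of the domain. For a non-negative measurable function, the layer-cake formula $\int f \, dm = \int_0^\infty m(\{f > t\}) \, dt$ holds; here $\{x \in [0,1] : x^2 > t\} = (\sqrt{t}, 1]$ has measure $1 - \sqrt{t}$ for $0 \leq t < 1$. The sketch below compares both computations numerically; the helper names are illustrative assumptions.

```python
import math

def midpoint_integral(g, a, b, n=10000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def level_set_measure(t):
    """m({x in [0, 1] : x**2 > t}), which is 1 - sqrt(t) for 0 <= t < 1 and 0 for t >= 1."""
    return 1.0 - math.sqrt(t) if 0.0 <= t < 1.0 else 0.0

def layer_cake_integral():
    """Integrate the measures of the level sets over the range [0, 1] of f."""
    return midpoint_integral(level_set_measure, 0.0, 1.0)

def domain_integral():
    """Integrate x**2 over the domain [0, 1] in the usual way."""
    return midpoint_integral(lambda x: x ** 2, 0.0, 1.0)

print(layer_cake_integral(), domain_integral())  # both close to 1/3
```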

7.2.2. Radon Measures


To understand Radon measures, let’s start with the idea of measuring things. Imagine you have a kitchen
scale. This scale allows you to measure the “weight” of the food you place on it. In mathematics, we also
like to “measure” things, but instead of just food, we want to measure sets.
The “measure” in mathematics is similar to the idea of measuring weight, but it’s generally more abstract.
It can refer to concepts like length, area, volume, and so on. For instance, the measure of an interval on the
real line would be its length.
Now, when trying to measure sets in more complicated spaces (like topological spaces, which are spaces
with a defined notion of closeness or continuity), things can get tricky. This is where Radon measures come
in.
A Radon measure is a special kind of measure that has “friendly” properties in topological spaces. One of
its most significant properties is that it is “locally finite.” This means that even if there might be many parts
to measure in a vast space, if we focus on a “small” piece of that space (like a compact set), the measure will
always be a finite number.

These measures are useful because, despite the complexity of the spaces we’re dealing with, they behave in
a predictable and manageable way.

Example 7.5. Imagine you have a space representing a landscape with mountains and valleys. You want
to measure the “amount” of landscape within certain bounds. The mountains might represent areas where
the measure is infinite because there’s “a lot” there. However, if you focus on a small piece of a mountain
(like the peak), the Radon measure will give you a finite number for that peak because you are looking at a
compact set.

[Figure: landscape sketch with the labels “Mountain”, “Measure”, and “Space”.]

Figure 7.2: A visualization of Radon measures on a landscape. The measure is finite over small, compact
regions like the peak of the mountain.

Definition 7.2. A Radon measure on a Hausdorff topological space $X$ is a measure defined on the Borel sets of $X$
(i.e., on the $\sigma$-algebra generated by the open sets of the space) that is finite on compact sets and regular in
the sense made precise below.
More formally:
A Borel measure $\mu$ on $X$ is called a Radon measure if it satisfies the following properties:

(1) $\mu(K) < \infty$ for every compact set $K \subset X$. (Locally finite)

(2) For any open set $U \subset X$, $\mu(U) = \sup\{\mu(K) : K \text{ is compact and } K \subset U\}$.
(Inner regular measure)
(3) For any Borel set $B \subset X$, $\mu(B) = \inf\{\mu(U) : U \text{ is open and } B \subset U\}$. (Outer regular measure)

It’s important to note that the concept of a Radon measure is central in functional analysis, measure theory,
and potential theory. Radon measures on $\mathbb{R}^n$ coincide with locally finite Borel measures, but in more general
spaces, the above definition is crucial.

Example 7.6. The Lebesgue measure is the natural extension of length in $\mathbb{R}$, area in $\mathbb{R}^2$, and volume in $\mathbb{R}^3$.
Typically denoted by $m$ or $\lambda$, the Lebesgue measure is a Radon measure that is countably additive on
disjoint sets and satisfies $m([a, b]) = b - a$ for any closed interval in $\mathbb{R}$. In $\mathbb{R}^n$, the Lebesgue measure of a
“cube” defined by $[a_1, b_1] \times \dots \times [a_n, b_n]$ is the product of the lengths of its sides: $(b_1 - a_1) \cdots (b_n - a_n)$.
Example 7.7. The Dirac measure centered at a point $x_0$ in $\mathbb{R}^n$ is a Radon measure defined for any Borel set
$A$ as:
\[ \delta_{x_0}(A) = \begin{cases} 1 & \text{if } x_0 \in A, \\ 0 & \text{otherwise.} \end{cases} \]
Essentially, the Dirac measure “concentrates” all the measure at a single point $x_0$ and assigns measure zero
to any set not containing $x_0$.
Both measures are foundational in analysis and have a wide range of applications, from measure theory and
integration to differential equations and physics.
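Example 7.8. Let’s consider a function $f : \mathbb{R} \to \mathbb{R}$ defined as:
\[ f(x) = \begin{cases} e^{-x^2} & \text{if } x \geq 0 \\ 0 & \text{if } x < 0 \end{cases} \]
We aim to compute the integral of $f$ with respect to the Radon measure $\mu$ defined by:
\[ \mu(A) = \delta_0(A) + \int_{A \cap [1, \infty)} e^{-x} \, dx \]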

Solution 7.2. Here, $\delta_0$ is the Dirac measure at the point 0. Therefore, the integral of $f$ with respect to $\mu$ is:
\[ \int f \, d\mu = f(0)\,\delta_0(\{0\}) + \int_{[1, \infty)} f(x) e^{-x} \, dx \]

Step 1: Evaluate $f(0)\,\delta_0(\{0\})$. This is simply $f(0) \times 1 = e^{-0^2} \times 1 = 1$.
Step 2: Evaluate $\int_{[1, \infty)} f(x) e^{-x} \, dx$. To do this, we need to evaluate the integral:

\[ \int_1^{\infty} e^{-x^2} e^{-x} \, dx \]

This is an improper integral, so we approach it as a limit:

\[ \lim_{a \to \infty} \int_1^a e^{-x^2} e^{-x} \, dx \]

This integral doesn’t have a closed-form solution in terms of elementary functions, but can be approximated
numerically (e.g., using Gaussian quadrature).
Step 3: Sum up the results.
\[ \int f \, d\mu = 1 + \lim_{a \to \infty} \int_1^a e^{-x^2} e^{-x} \, dx \]

Again, the second term needs to be approximated numerically.
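
The numerical value of the second term can be obtained in two ways, sketched below. Completing the square, $x^2 + x = (x + \tfrac{1}{2})^2 - \tfrac{1}{4}$, expresses the tail integral through the complementary error function as $e^{1/4} \tfrac{\sqrt{\pi}}{2} \operatorname{erfc}(\tfrac{3}{2})$; a plain midpoint rule on a truncated interval serves as an independent check. The function names, the truncation point 12, and the choice of the midpoint rule (rather than the Gaussian quadrature mentioned above) are assumptions of this sketch.

```python
import math

def tail_integral_erfc():
    """Integral of exp(-x^2 - x) over [1, inf), written with the complementary error function."""
    return math.exp(0.25) * math.sqrt(math.pi) / 2 * math.erfc(1.5)

def tail_integral_numeric(upper=12.0, n=20000):
    """Midpoint-rule approximation on [1, upper]; the integrand is negligible beyond upper."""
    h = (upper - 1.0) / n
    total = 0.0
    for i in range(n):
        x = 1.0 + (i + 0.5) * h
        total += math.exp(-(x * x + x))
    return h * total

def integral_f_dmu():
    """Point mass at 0 contributes f(0) = 1; the density part contributes the tail integral."""
    return 1.0 + tail_integral_erfc()

print(tail_integral_erfc(), tail_integral_numeric())
print(integral_f_dmu())  # roughly 1.039
```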



7.2.3. Dirac Measures


Imagine you have an inflated balloon that represents the entire space. Now, think about what would happen
if, instead of distributing the air inside uniformly throughout the balloon, you could concentrate all of that
air into one tiny, singular point within the balloon, making that specific point have all the “importance” or
“weight”.
The Dirac measure achieves a similar concept in the mathematical realm: rather than spreading measure
(akin to our intuitive sense of “weight” or “importance”) in a space evenly or broadly, the Dirac measure
places it all at one specific point.
This idea is captured by saying that for a specific point x0 , the Dirac measure at x0 gives a value of 1 to any
set containing that point and a value of 0 to any set not containing it. It’s as if all the “mass” or “energy” of
the system is concentrated at that point x0 , and nowhere else.

Example 7.9. Suppose you are hosting a party and are considering how much food and drink you need.
If all your friends have uniform eating habits, you would distribute the food evenly. However, if one of
your friends is extremely enthusiastic about potato chips and virtually doesn’t eat anything else, then that
individual is like a “Dirac point” for the potato chips. All the “weight” or “importance” of the chips falls
onto that one individual.
The Dirac measure is employed in many areas of mathematics and physics, especially in signal theory and
quantum mechanics, due to its ability to represent “point-like” concentrations of mass or energy.

Definition 7.3. Let X be a set and x0 ∈ X. The Dirac measure at x0 , denoted by δx0 , is defined for all subsets A ⊆ X as:
\[
\delta_{x_0}(A) =
\begin{cases}
1 & \text{if } x_0 \in A, \\
0 & \text{if } x_0 \notin A.
\end{cases}
\]

In other words, the Dirac measure assigns a “weight” of 1 to any set containing the point x0 and 0 otherwise.
While the Dirac measure is a true measure on X, its analogous “Dirac delta function” in the context of
functions and distributions is not a function in the traditional sense, though it’s often treated as one in
physics and engineering.

Example 7.10. Consider the real number line, R. For a fixed point a ∈ R, the Dirac measure δa at a is defined as:
\[
\delta_a(A) =
\begin{cases}
1 & \text{if } a \in A, \\
0 & \text{if } a \notin A.
\end{cases}
\]
For any interval (or more generally, Borel set) A ⊆ R, the measure δa (A) is 1 if a is in A and 0 otherwise. As an instance, for a = 0:
\[
\delta_0((-1, 1)) = 1 \quad \text{and} \quad \delta_0((1, 2)) = 0.
\]

Example 7.11. Let’s consider the plane R2 . For a fixed point (x0 , y0 ), the Dirac measure centered at this point, δ(x0 ,y0 ) , is defined as:
\[
\delta_{(x_0, y_0)}(A) =
\begin{cases}
1 & \text{if } (x_0, y_0) \in A, \\
0 & \text{if } (x_0, y_0) \notin A.
\end{cases}
\]

For instance, for a rectangle A given by [a, b] × [c, d], the measure δ(x0 ,y0 ) (A) is 1 if both a ≤ x0 ≤ b and
c ≤ y0 ≤ d; otherwise, its value is 0.
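The Dirac measures on the line and in the plane can be sketched in a few lines of code. Representing a set by its membership test, and the helper names `dirac`, `open_interval`, and `closed_rectangle`, are assumptions made here purely for illustration.

```python
def dirac(x0):
    """Return the Dirac measure at x0 as a function of a membership test."""
    def measure(contains):
        # 1 if the set contains x0, 0 otherwise
        return 1 if contains(x0) else 0
    return measure

def open_interval(lo, hi):
    """Membership test for the open interval (lo, hi)."""
    return lambda x: lo < x < hi

def closed_rectangle(a, b, c, d):
    """Membership test for the rectangle [a, b] x [c, d]."""
    return lambda p: a <= p[0] <= b and c <= p[1] <= d

delta_0 = dirac(0.0)
inside = delta_0(open_interval(-1, 1))    # 0 lies in (-1, 1)
outside = delta_0(open_interval(1, 2))    # 0 does not lie in (1, 2)

delta_p = dirac((0.5, 2.0))
in_rect = delta_p(closed_rectangle(0, 1, 1, 3))   # point lies in the rectangle
print(inside, outside, in_rect)
```

The measure never needs to enumerate the set; it only asks whether the distinguished point belongs to it, which is exactly what the definition says.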

Example 7.12. Let f : R → R be a measurable function and δ0 the Dirac measure centered at 0. Compute:
\[
\int_{\mathbb{R}} f(x)\,d\delta_0(x)
\]

Solution 7.3. Step 1: Recall the definition of the Dirac measure centered at 0:
\[
\delta_0(A) =
\begin{cases}
1 & \text{if } 0 \in A, \\
0 & \text{if } 0 \notin A.
\end{cases}
\]

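Step 2: For a measure µ that is concentrated on countably many points xi (a purely atomic measure), the Lebesgue integral reduces to a weighted sum over those points:
\[
\int_{\mathbb{R}} f(x)\,d\mu(x) = \sum_i f(x_i)\,\mu(\{x_i\})
\]
where the xi are the points with µ ({xi }) > 0. This formula does not hold for general measures such as the Lebesgue measure, which assigns measure 0 to every single point.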


Step 3: Using the Dirac measure, we only need to consider the value of f at the point where the Dirac measure is non-zero, which is x = 0. This simplifies the integral to:
\[
\int_{\mathbb{R}} f(x)\,d\delta_0(x) = f(0)\,\delta_0(\{0\})
\]

Step 4: Using the definition of δ0 , we get δ0 ({0}) = 1. Therefore:
\[
\int_{\mathbb{R}} f(x)\,d\delta_0(x) = f(0)
\]

The Lebesgue integral of a function f with respect to the Dirac measure centered at 0 is simply the value of
the function at the point 0.
This result aligns with the intuition that the Dirac measure “samples” or “picks out” the value of a function
at the point where the measure is centered. This is why the Dirac delta function is often used in engineering
and physics to model impulses and “samples” of signals.
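The “sampling” behavior can be sketched for any finite combination of Dirac measures. The function name `integrate_atomic` and the dictionary representation of the measure (point mapped to weight) are hypothetical choices made for this illustration.

```python
import math

def integrate_atomic(f, atoms):
    """Integrate f against the measure sum_i w_i * delta_{x_i}.

    `atoms` maps each point x_i to its weight w_i.
    """
    return sum(w * f(x) for x, w in atoms.items())

# The Dirac measure at 0 is the single atom 0 with weight 1.
dirac_at_0 = {0.0: 1.0}
value = integrate_atomic(math.cos, dirac_at_0)   # equals cos(0)

# A weighted combination: 0.5 * delta_1 + 2 * delta_3 applied to x^2.
mixed = integrate_atomic(lambda x: x * x, {1.0: 0.5, 3.0: 2.0})
print(value, mixed)
```

With a single atom of weight 1 the integral returns f(0), matching the result of Example 7.12; with several atoms it returns the weighted sum of sampled values.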

Connection with Measure Theory in General Spaces


So far, it might seem we are only talking about measuring volumes in space, like length, area, or 3D volume.
But the beauty of measure theory is that it applies in very general contexts.
For instance, instead of measuring the volume of an object in space, we might be interested in measuring how
“likely” an event is in a probability space. Or, we might be interested in measuring how much “information”
there is in a message in an information theory context.

Figure 7.3: Illustration of a subset E within a measurable space X. (Diagram labels: Space X, Set E.)

Example 7.13.

(1) Euclidean Space: Suppose we are working within the 2D Euclidean space (the plane). In this case,
measurable sets could be regions on the plane, and a set’s measure might be its area.
(2) Probability Space: Imagine a set representing all possible outcomes of rolling a die. The space is
X = {1, 2, 3, 4, 5, 6}. A probability measure could assign to each subset of X the likelihood of that set of
outcomes occurring when the die is rolled.
(3) Information Theory: In a context where “sets” are messages and we aim to measure how much
information is in a message, the measure might be related to the length of the message or its content.
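The probability-space item above can be made concrete with a small sketch. It assumes a fair die (each face has probability 1/6), which the text does not state explicitly; the names `OMEGA` and `prob` are illustrative.

```python
from fractions import Fraction

# Sample space for one roll of a die.
OMEGA = frozenset({1, 2, 3, 4, 5, 6})

def prob(event):
    """Uniform probability measure on subsets of OMEGA (fair die assumed)."""
    event = frozenset(event)
    if not event <= OMEGA:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(OMEGA))

p_even = prob({2, 4, 6})
p_low = prob({1, 2})
p_five = prob({5})
p_union = prob({1, 2} | {5})   # union of two disjoint events
print(p_even, p_low, p_five, p_union)
```

The last value illustrates additivity: for disjoint events the measure of the union is the sum of the measures, just as areas add for non-overlapping regions of the plane.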

7.3. Conclusions
The exploration of measures in topological spaces is foundational in understanding many advanced concepts
in mathematical analysis and its applications. At the core of our discussion, the Borel measures gave insight
into how we can assign a “size” or “volume” to the Borel sets within a topological space, providing the
groundwork for integration and probability on these sets. Building upon this, the Radon measures extended
our horizon, enabling the characterization of measures on locally compact Hausdorff spaces, thus making
connections between topological and measure-theoretic properties more evident.
The introduction of the Dirac measures, particularly the Dirac delta, added another layer of depth to our
exploration. As a measure concentrated at a single point, it plays an instrumental role in both theoretical
endeavors and practical applications, particularly in the realms of distribution theory and physics.
Collectively, these topics not only serve as keystones in modern analysis but also bridge the divide between
pure mathematics and its profound influence on applied disciplines. As we move forward, the intricate
interplay between topology and measure theory will continue to inspire and challenge our mathematical
pursuits.
7.4. Exercises
Exercise 7.1. Let X = {1, 2, 3, 4} be a topological space with the discrete topology. Define a measure µ on
this space such that µ ({1}) = 1, µ ({2, 3}) = 2, and µ ({4}) = 1. What is µ (X)?

Exercise 7.2. Let λ be the Lebesgue measure on R. Compute λ ([1, 3]), where [1, 3] denotes the closed
interval from 1 to 3.

Exercise 7.3. Given a Radon measure ν on R such that ν ([0, 1]) = 3 and ν ([1, 2]) = 2, find ν ([0, 2]).

Exercise 7.4. Consider the Dirac measure δ0 centered at 0 in R. Compute δ0 ([−ε , ε ]) for ε > 0.

Exercise 7.5. Let X be a topological space and B(X) the Borel σ -algebra generated by the open sets of
X. Show that any measure µ defined on B(X) and which assigns a finite value to compact sets is a Borel
measure.

Exercise 7.6. Let X be a locally compact Hausdorff space and x0 ∈ X. Define the Dirac measure at the point x0 by
\[
\delta_{x_0}(A) =
\begin{cases}
1 & \text{if } x_0 \in A, \\
0 & \text{otherwise}
\end{cases}
\]
for every Borel set A ⊂ X. Prove that δx0 is a Radon measure.
Chapter 8
Non-measurable Sets and Paradoxes

Carlos Polanco
Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México.
Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, Ciudad de
México, México.

Abstract In this chapter, we delve deep into the mysterious and enigmatic world of non-measurable sets, a
topic that challenges our intuitive notions of measure and integration. We begin by introducing the reader
to the foundational paradoxes inherent within non-measurable sets, elucidating how certain subsets of real
numbers defy our conventional understanding of “size” or “measure”. Venturing further, we present classic
examples of non-measurable sets, such as the Vitali Set, illuminating their counterintuitive properties and
behaviors. The significance of non-measurable sets is then discussed, emphasizing their implications on set
theory, real analysis, and the foundation of mathematics. As we approach the culmination of the chapter,
we address the intricate challenge of integration within non-measurable spaces, shedding light on why
traditional integration techniques falter and how mathematicians have sought to circumvent these inherent
complications. Readers will emerge with a renewed appreciation for the subtleties and profundities present
in the study of measure and integration, equipped with a deeper understanding of the paradoxes that reside
within the mathematical universe.

Keywords: examples of non-measurable sets, integration in non-measurable spaces, non-measurable sets and paradoxes, paradoxes in non-measurable sets, significance of non-measurable sets.

(B) Carlos Polanco: Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México. Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de
México; Tel: +01 55 5595 2220; E-mail: polanco@unam.mx

8.1. Introduction
In the vast universe of set theory and measure theory, one encounters domains that abide by well-understood
rules and others that defy intuition. Non-measurable sets belong to the latter category, existing as enigmatic
collections that do not comfortably fit within our traditional frameworks of measure. Their elusive nature has
birthed a range of paradoxes, challenging our foundational understanding of space, volume, and dimension.
This chapter embarks on a journey into the labyrinth of non-measurable sets [15]. We will first tackle the
inherent paradoxes that arise when attempting to measure the immeasurable. These paradoxes serve not
only as intriguing mathematical puzzles but also as fundamental challenges to our preconceptions about
sets and measures. Through a series of examples, we aim to illustrate the varied and surprising forms that
non-measurable sets can take, deepening our appreciation for their complexity.
Furthermore, we will delve into the significance of these sets within the broader landscape of mathematics.
While they might appear as abstract curiosities, non-measurable sets have profound implications for
our understanding of mathematical structures and foundational concepts. Their existence and properties
necessitate a reevaluation of how we approach measurement, especially when confronted with the infinite
and the infinitesimal.
Lastly, the chapter will culminate in an exploration of integration within non-measurable spaces. How does
one approach integration when standard measures fall apart? Can we glean meaningful results from these
ostensibly chaotic realms? These questions beckon us to push the boundaries of integration theory, searching
for new techniques and insights in the process.
Join us as we navigate the enigma of non-measurable sets, a topic that is as perplexing as it is fascinating,
challenging not only our mathematical acumen but also our foundational beliefs about the universe of sets.

8.2. Paradoxes in Non-measurable Sets


Let’s start with the general idea of a paradox. A paradox is a statement or situation that seems to contradict
common sense or logic. It’s like someone telling you: “The next statement is false. The previous statement
is true.” These two statements contradict each other, causing confusion. That, in essence, is what a paradox
does.
Now, let’s take this idea into the world of sets and mathematics. When we talk about sets, especially in the
context of real numbers, we often want to measure these sets in some way. What does measuring mean?
Well, in simple terms, it’s like determining how “big” or “small” they are. For example, the set of numbers
between 0 and 1 has a “measure” or “length” of 1.
But here’s where things get weird: there are some sets that are so peculiar that we can’t sensibly assign them
a measure. These are called non-measurable sets.
The “paradox” comes into play when we try to understand the properties of these non-measurable sets. One
of the most famous examples is the Banach-Tarski Paradox. Here’s the simplified version:
Imagine you have a solid chocolate ball. The Banach-Tarski Paradox states that you can break this ball into
a finite number of tiny pieces (using very specific, unconventional mathematical rules) and, if you rearrange
those pieces, you can end up with two chocolate balls identical to the original!
This defies our common sense because, in the real world, if you break something apart and then put it back
together, you get back the same thing, not twice as much. But in the abstract world of mathematics and
non-measurable sets, this is possible.
Paradoxes in the context of non-measurable sets show us the limitations and quirks of our intuition and the
concept of measure. While these situations might not have a direct real-world analog, they offer a window
into the complexities and wonders of the mathematical universe.

8.3. Examples of Non-measurable sets

Vitali Sets
One of the most famous non-measurable sets is the Vitali Set. Let’s consider the interval [0, 1) on the real line.
A Vitali Set is a subset V of [0, 1) that contains exactly one point from each class of numbers that differ from one another by a rational amount.
Formally, the Vitali Set is constructed as follows:
Two numbers x and y from [0, 1) are said to be equivalent if their difference x − y is a rational number.
This is an equivalence relation (it is reflexive, symmetric, and transitive), and the set of all numbers in [0, 1) equivalent to a given number x
forms an equivalence class. Every number in [0, 1) belongs to exactly one such class.
Using the Axiom of Choice, the Vitali Set selects exactly one number from each equivalence class.
Now, here’s the intriguing part: The Lebesgue measure (which is a generalization of the “length” for more
complex sets) assigns a measure (or “length”) of 1 to the interval [0, 1). But the Vitali Set, being a subset of
[0, 1), cannot be assigned a meaningful measure while respecting the properties of the Lebesgue measure!
That’s to say, it’s “non-measurable.”
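The reason no measure fits can be sketched in a few lines. The notation below (V for the Vitali Set, λ for the Lebesgue measure, V_q for the translate of V by q taken modulo 1) is introduced here for illustration; the argument is the standard one and relies on translation invariance and countable additivity of λ.

```latex
% Translate V by each rational q in [0,1), wrapping around modulo 1:
V_q = \{\, (v + q) \bmod 1 \;:\; v \in V \,\}, \qquad q \in \mathbb{Q} \cap [0,1).

% Because V holds exactly one point per equivalence class, the translates
% are pairwise disjoint and together cover the interval:
[0,1) = \bigsqcup_{q \in \mathbb{Q} \cap [0,1)} V_q .

% If V were measurable, translation invariance would give
% \lambda(V_q) = \lambda(V), and countable additivity would force
1 = \lambda\bigl([0,1)\bigr)
  = \sum_{q \in \mathbb{Q} \cap [0,1)} \lambda(V_q)
  = \sum_{q \in \mathbb{Q} \cap [0,1)} \lambda(V).

% The right-hand side is 0 if \lambda(V) = 0 and \infty if \lambda(V) > 0,
% so no value of \lambda(V) is consistent.
```

The contradiction uses only countably many translates, which is why countable additivity is the property that breaks.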

Sets from the Banach-Tarski Paradox


Imagine you have a solid sphere, like a chocolate ball. Now, I tell you it’s possible to divide this sphere
into a finite number of pieces (let’s say 5), and with a special kind of “magical movements” (which are just
rotations and translations, not real magic), I can rearrange these pieces to obtain two complete spheres, both
the same size as the original!
Yes, it sounds like magic or something that defies the laws of matter conservation, doesn’t it? But that’s
precisely what the Banach-Tarski Paradox tells us.
But, how is this possible? The answer lies in the foundations of set theory and what we call “non-measurable
sets.”

Measurability and Non-Measurable Sets


“Measurability” is a property of sets that allows us to assign them a “volume” or “measure.” For example,
in the real world, we say a sphere has a certain volume. But in the mathematical universe, there are sets so
strangely constructed that it’s not possible to assign them a measure in a meaningful way.
These “non-measurable” sets are essential for the Banach-Tarski paradox. The pieces into which we divide
our original sphere are, in fact, non-measurable sets. And because they don’t have a “volume” in the
traditional sense, we can manipulate them in seemingly impossible ways.
This result might sound unsettling, but it’s vital to understand that it’s a product of the pure mathematical
world and doesn’t have a direct application to the physical world as we know it. In reality, we can’t take a
chocolate ball and make two using the Banach-Tarski paradox. The findings of this paradox are grounded in
set theory and abstract concepts without counterparts in the physical world.
The Banach-Tarski Paradox is a curious and counter-intuitive phenomenon in the mathematical world. It
tells us that it’s possible to split a sphere into a finite number of pieces and, using simple rotations and
translations, rearrange these pieces to obtain two spheres identical in size to the original. While it sounds
incredible, it’s a valid result within the framework of set theory. However, it’s essential to remember that this
result is purely mathematical and doesn’t have a direct counterpart in the real world.

Hausdorff Paradox Sets


The “Hausdorff Paradox” is a concept from the realm of mathematics, specifically in set theory and geometry.
It’s quite abstract, but we’ll aim to explain it as clearly as possible.
Metric Spaces and Dimensions Most of us are familiar with the dimensions we live in:
(1) 1D (a line)
(2) 2D (a plane, like a sheet of paper)
(3) 3D (the space we inhabit, with height, width, and depth)
These are examples of “metric spaces”, places where we can measure distances.
Uncountable Sets There are certain sets of numbers that are “larger” than others in terms of how many
elements they have. For instance, though there are infinitely many integers and also infinitely many real numbers between
0 and 1, it has been shown (by Cantor’s diagonal argument) that there are “more” real numbers between 0 and 1 than integers. These larger
sets are called “uncountable”.
Rotations and Transformations Imagine you have a shape in space. You can move it, rotate it, scale it, and
so on. These changes are called “transformations”.
The Hausdorff Paradox Felix Hausdorff, a German mathematician, showed in 1914 that the surface of a sphere
in 3 dimensions, after removing a suitable countable set of points, can be divided into three disjoint parts
A, B, and C such that, using rotations alone, A, B, C, and the union B ∪ C are all congruent to one another.
Astonishingly and counter-intuitively, each part is therefore at once “a third” and “a half” of what remains
of the sphere!
This is bewildering. It’s akin to saying we can break an apple into a finite number of pieces and somehow
one piece is “as large as” two of the others combined, even though all the pieces are the same size. Clearly, this doesn’t happen in the real world, but in the abstract
realm of mathematics and geometry, it’s possible under certain conditions.
The Hausdorff Paradox demonstrates how surprising and counterintuitive the world of mathematics can be,
especially when working with abstract concepts in three-dimensional space.
All these non-measurable sets exploit certain properties of the real numbers and rely heavily on the Axiom of
Choice, a somewhat controversial principle in set theory. It’s worth noting that these non-measurable sets are
more of a mathematical curiosity and don’t have direct real-world analogs. They do, however, offer profound
insights into the nature and limitations of “size” and “measure” in mathematics.

Connection to Banach-Tarski
Both the Banach-Tarski Paradox and the concept of the Vitali Set challenge our intuition about “volume” or
“size”. In the case of the Banach-Tarski Paradox, it’s about the volume of a sphere, while in the Vitali Set,
it’s about the length of an interval.
However, directly integrating over a non-measurable set like the Vitali Set doesn’t make sense in
the context of Lebesgue integration. The Lebesgue integral is constructed precisely to handle “well-behaved”
sets, which are those that can be assigned a meaningful measure. Non-measurable sets, like the one crucial
to the Banach-Tarski Paradox or the Vitali Set, are beyond the purview of Lebesgue integration.
So, while the Banach-Tarski Paradox and Lebesgue integration both touch upon deep properties and
peculiarities of sets in the real number system, they address different aspects. The Banach-Tarski Paradox is
more about the surprising ways we can decompose and reassemble sets, while Lebesgue integration is about
assigning a “size” to sets in a way that generalizes the usual notion of length, area, or volume.

8.4. Significance of Non-measurable Sets


Non-measurable sets play a pivotal role in mathematics for a range of reasons, particularly within the study
of measure and integration. Below are some key reasons why these sets are of relevance:

(1) Challenge to Intuition: Non-measurable sets, like those appearing in the Banach-Tarski Paradox,
challenge our fundamental intuition about “size” and “volume”. Studying these sets can provide deeper
insight into the structure and properties of real numbers and space itself.
(2) Axiom of Choice: Many constructions of non-measurable sets hinge on the axiom of choice, a postulate
in set theory that is both powerful and contentious. The fact that the axiom of choice leads to the
existence of non-measurable sets has been a point of discussion in the philosophy and foundations of
mathematics.
(3) Foundations of Measure: To construct a coherent theory of measure (like the Lebesgue Measure), it’s
crucial to understand which sets can be measured and which can’t. By identifying and studying non-
measurable sets, we can better define the limits and applications of our measure theory.
(4) Motivation for Mathematical Development: Issues and paradoxes related to non-measurable sets have
driven the development of new theories and techniques in mathematics. For instance, Lebesgue measure
and Lebesgue integration were developed, in part, to address the shortcomings of
Riemann integration, especially in the context of problematic sets.
(5) Philosophical Implications: The existence of non-measurable sets raises intriguing philosophical questions
about the nature of mathematics and reality. Do these sets truly exist in the “real world,” or are they
merely artifacts of our mathematical system?
(6) Applications in Other Areas: Even though non-measurable sets are, in some sense, “strange” and
“uncommon”, the techniques developed to handle them have found applications in other areas of
mathematics, especially functional analysis, ergodic theory, and descriptive set theory.

While non-measurable sets may seem counterintuitive or esoteric, they play a foundational role in the study
and development of key mathematical areas, and provide a window into the limits and potentials of our
mathematical understanding.
Axiom of choice The Axiom of Choice is one of the most debated and fundamental axioms in set theory.
Formally, it can be stated as follows:
Given a collection C of non-empty sets, there exists a function f such that for each set S in the collection C , f (S) is an
element of S.

In more intuitive terms, if you have a collection of non-empty boxes, the Axiom of Choice states that it’s
possible to select exactly one item from each box, even if there’s no specific rule on how to make that
selection.
While the Axiom of Choice might seem obviously true for finite collections, its controversial nature arises
primarily when dealing with infinite collections, especially when no clear method exists to specify a choice
for each set in the collection.
This axiom is equivalent to other mathematical statements, such as Zorn’s Lemma and the Well-Ordering
Theorem. It’s often used to establish the existence of certain mathematical entities even when there isn’t a
clear, constructive method to produce these objects.
Despite its foundational role in modern set theory and mathematics, the Axiom of Choice has been the
subject of much debate due to some of its counterintuitive implications, like the Banach-Tarski Paradox.

8.5. Integration in Non-measurable Spaces


The question of whether one can integrate over non-measurable spaces is intriguing and warrants a detailed
response. Firstly, it is imperative to clarify what we mean by “integrate” and “non-measurable spaces.”

Integration and Measure


Integration, in its most common form, is tied to the concept of “measure.” For instance, when we integrate
a function over the interval [a, b] on the real line, we’re utilizing the Lebesgue measure (in the case of
Lebesgue integration) or the Riemann sum (in the case of Riemann integration) to “measure” the area under
the function’s curve.
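The two viewpoints can be contrasted on a simple case. The sketch below integrates f(x) = x² over [0, 1] twice: once with a midpoint Riemann sum, which slices the domain, and once in a Lebesgue-style way, which slices the range and uses the measure of each level set. The function names are hypothetical, and the level-set measure 1 − √t is worked out by hand for this particular f; this is an illustration of the idea, not a general integration routine.

```python
import math

def midpoint_riemann_sum(f, a, b, n=1000):
    """Riemann-style: partition the domain [a, b] into n equal pieces."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def lebesgue_style_x_squared(n=1000):
    """Lebesgue-style: partition the range [0, 1] of f(x) = x^2 on [0, 1].

    The set {x in [0, 1] : x^2 > t} is (sqrt(t), 1], whose length is
    1 - sqrt(t); integrating that length over t gives the same area.
    """
    h = 1.0 / n
    return h * sum(1.0 - math.sqrt((i + 0.5) * h) for i in range(n))

riemann_value = midpoint_riemann_sum(lambda x: x * x, 0.0, 1.0)
lebesgue_value = lebesgue_style_x_squared()
print(riemann_value, lebesgue_value)
```

Both values approximate 1/3. The second route only works because the level sets have a well-defined length, which is exactly where measurability enters.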
A “non-measurable set” is a set for which a measure cannot be consistently defined using a particular
measure system (like the Lebesgue measure on the real line). The existence of non-measurable sets on
the real line is tied to the axiom of choice, a foundational principle in set theory.

Can We Integrate in Non-measurable Spaces?


With this understanding, if we have a space that is entirely non-measurable, it doesn’t make sense to speak of
integration in that space in the traditional context since integration requires a notion of measure. However,
it’s rare to consider entirely non-measurable spaces in practical applications. More often, we encounter
individual sets within a measurable space that are non-measurable.
In practice, what typically happens is that one operates in spaces where measure is well-defined and avoids
directly dealing with non-measurable sets. If, for some reason, one needs to deal with a non-measurable set,
one can bound it from inside and outside by measurable sets (its inner and outer measure), although for a
genuinely non-measurable set these two bounds do not coincide.
Integration, as we understand it, hinges on the notion of measure. In non-measurable spaces or sets,
integration in the traditional sense isn’t directly applicable. However, in most practical contexts, we operate
in measurable spaces and sidestep issues associated with non-measurable sets.

Example 8.1. Integrating in a “hybrid space” containing both measurable and non-measurable regions poses
intrinsic difficulties due to the nature of Lebesgue integration. However, one can handle the issue via
approximations or by focusing solely on the measurable regions. Consider E as a non-measurable set within
the interval [0, 1], and let’s define the function f : [0, 1] → R by
\[
f(x) =
\begin{cases}
1 & \text{if } x \in E, \\
0 & \text{if } x \notin E.
\end{cases}
\]
We wish to compute the Lebesgue integral of f over [0, 1].

Solution 8.1. (1) Non-measurable Set: Since E is non-measurable, the Lebesgue measure of E is not defined.
Moreover, f is the indicator function of E, and the set {x : f (x) = 1} equals E, so f is not a measurable function and its Lebesgue integral is not defined either.
(2) Complement: The complement of E within [0, 1], denoted E c , is also non-measurable, because the measurable sets form a σ -algebra and are therefore closed under complements. We
know that f (x) = 0 for all x ∈ E c , but this alone does not give us a measurable region to integrate over.
(3) Approximation: E cannot be written as a setwise limit of measurable sets En , since such a limit would itself be measurable.
What we can do is bound E from inside and outside by measurable sets, which gives the inner measure
$\lambda_*(E) = \sup\{\lambda(A) : A \subseteq E,\ A \text{ measurable}\}$ and the outer measure
$\lambda^*(E) = \inf\{\lambda(B) : B \supseteq E,\ B \text{ measurable}\}$. These act as lower and upper bounds for the would-be integral of f ,
but for a non-measurable set $\lambda_*(E) < \lambda^*(E)$, so they do not single out one value.
(4) Conclusion: In this scenario, even though we have a function defined over a “hybrid” space containing
both measurable and non-measurable regions, we can’t compute the Lebesgue integral of f over the
entire [0, 1] due to the non-measurable component. The most we can obtain are the bounds given by
the inner and outer measures of E, which describe the behavior of f only up to the gap between them.

8.6. Conclusions
Throughout this chapter, we have delved into the intriguing world of non-measurable sets and the paradoxes
associated with them. These sets, which defy the traditional notion of “measure”, are both fascinating and
perplexing. Through various examples, we have seen how certain sets, like those arising from the Banach-
Tarski Paradox, cannot be associated with a defined volume or length in the classical sense.
Paradoxes in non-measurable sets illuminate the subtleties and complexities of the mathematical universe,
challenging our intuitions about space and measure. These examples also showcase the deep interplay
between set theory, geometry, and measure theory.
The study of non-measurable sets is not merely a theoretical exercise; its significance and relevance manifest
in areas of mathematics such as integration. As we have seen, attempting to integrate over non-measurable
spaces introduces additional challenges and considerations, underscoring the importance of a solid grasp of
measure theory.
In summary, while non-measurable sets and the associated paradoxes might seem like distant abstractions
from everyday reality, they serve as reminders of the limits of our mathematical intuition and of the power
and depth of the language of mathematics.

8.7. Exercises
Exercise 8.1. Consider a Vitali set V ⊂ [0, 1) and its translates by the rational numbers in [0, 1), taken modulo 1. This gives a collection of subsets of [0, 1) such that every two distinct
subsets are disjoint, and their union covers [0, 1). Since this collection is indexed by the rationals, and hence countable, can we assign the same “fair”
measure to each subset in the collection?

Exercise 8.2. Given the definition of the Vitali set, provide an intuition for why such a set is non-measurable.

Exercise 8.3. Why are non-measurable sets important in measure theory?

Exercise 8.4. If we attempt to integrate a function over a non-measurable set, what problems might arise?

Exercise 8.5. Describe the Vitali set and explain its significance in the context of Lebesgue measure.

Exercise 8.6. Describe the Banach-Tarski paradox and discuss the implications it has on integration in non-
measurable spaces.
Chapter 9
Applications in Probability

Carlos Polanco
Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México.
Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, Ciudad de
México, México.

Abstract In this chapter, we delve into the foundational structures underpinning modern probabilistic
thought. The chapter commences with an in-depth analysis of Probability Spaces—providing readers
with a rigorous understanding of sample spaces, events, and probability measures. Through illustrative
examples, the chapter highlights the significance of defining probability spaces in accurately modeling real-
world random phenomena. Transitioning from foundational theory to its practical application, the chapter
introduces Random Variables. Here, we demystify the concept, emphasizing its role as a bridge between
abstract probability spaces and tangible outcomes. Through comprehensive explanations complemented by
visual aids, readers will grasp how random variables function to map events to real numbers, thus paving
the way for mathematical operations and further analyses. Lastly, the chapter delves into the Expectation of
random variables, often referred to as the ‘average’ or ‘mean’ value. This section elucidates the importance
of expectation in predicting future outcomes based on probabilistic models. Utilizing a plethora of real-
life scenarios and exercises, readers are equipped with the skills to compute and interpret expectations—
serving as a crucial tool in their statistical toolkit. Throughout this chapter, our emphasis remains steadfast:
to seamlessly intertwine rigorous theoretical explanations with accessible examples. By the end, readers are
not just familiar with these concepts, but are also adept at applying them to complex real-world problems,
setting the stage for more advanced topics in subsequent chapters.

Keywords: integration in measurable spaces, Lebesgue integral, Normal distribution, Poisson distribution,
probability.

(B) Carlos Polanco: Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México. Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de
México; Tel: +01 55 5595 2220; E-mail: polanco@unam.mx

9.1. Introduction
The journey into the world of probability theory is akin to entering a vast landscape, both challenging and
fascinating. Just as maps chart out terrains, mathematicians over the ages have sought to establish rigorous
foundations for understanding random phenomena. The bedrock of this theoretical landscape is the concept
of a probability space [1, 2, 8, 16, 17].
A probability space is, at its essence, a mathematical formulation of a random experiment’s inherent
uncertainties. Be it the simple toss of a fair coin, the intricate ballet of gas molecules, or the unpredictable
whims of financial markets, a probability space offers us a structured lens to view and understand these
phenomena.
This chapter will introduce the foundational concepts that make up a probability space, setting the stage for
more advanced topics in probability theory. We shall delve deep into understanding events, outcomes, and
the measures that give them meaning.
Every probabilistic experiment results in an outcome. But how do we map these outcomes to real-world
quantities of interest? Enter the realm of random variables. These are functions that assign a real number to
each outcome of an experiment. For instance, in a game of dice, the number that faces up can be considered
a random variable.
Beyond the basic definition, there is the crucial concept of the expectation of a random variable. This gives
us a weighted average or a “mean” sense of what we can anticipate from a random experiment. Expectations
play a central role in various disciplines, from finance to physics, helping professionals gauge average
outcomes in the face of uncertainty.
In this section, we shall journey through the mathematical intricacies of random variables and their
expectations, cementing our understanding with practical examples and intuitive explanations. By the end,
the reader will be equipped with the tools to interpret and employ these concepts in diverse applications.

9.2. Probability Spaces


Measure theory is a branch of mathematics that investigates how to assign a notion of size or quantity to
sets. This is done through a “measure”. For instance, when we measure the area of a rectangle, we are using
a type of measure. In this context, probability becomes a particular measure where the “size” or “quantity”
of a set represents the likelihood of the corresponding event.
A probability space consists of three main components:

(1) Sample Space (Ω): This is the set of all possible outcomes of a random experiment. For example, in
flipping a coin, Ω could be {heads, tails}.
(2) σ-algebra (F): A collection of events. An event is a subset of the sample space. Not every subset of Ω
needs to be in F, but F must fulfill certain properties: it contains Ω itself; if an event is in F, its
complement (the outcomes not in the event) is also in F; and every countable union of events in F is
again in F. These properties ensure we can perform logical operations with the events and still
remain within F.

(3) Probability Measure (P): This is a function that assigns to each event in F a probability. This measure
must adhere to certain properties:
(a) The probability of the entire sample space (Ω) is 1: P(Ω) = 1.
(b) Probability is non-negative: for any event A, P(A) ≥ 0.
(c) If we have a countable sequence of events that do not overlap (i.e., they are pairwise mutually
exclusive), then the probability of the union of those events is the sum of their individual probabilities.

Thus, a probability space is a triple (Ω, F , P) where:


(1) Ω is the sample space.
(2) F is a σ -algebra on Ω.
(3) P is a probability measure defined over F .

A probability space allows us to define and work with random events in a mathematically rigorous manner.
It provides a framework within which we can talk about events, their probabilities, and the relationships
between different events in a coherent and well-defined context.

9.3. Random Variables and their Expectation


In simple terms, a random variable is a function that assigns a real value to each outcome in a sample space.
Imagine rolling a die: the sample space is {1, 2, 3, 4, 5, 6}. A random variable could simply be the number
that comes up when you roll the die. But you could also have more complex random variables, like “the
squared number that comes up” or “the number minus three.”
Mathematically, if Ω is our sample space and X is our random variable, then X : Ω → R is a function that
assigns to each element of Ω a real number.
Within measure theory, a random variable X is a measurable function. This means that if you take a set of
real numbers, the pre-images (the outcomes from the sample space mapped to that set) are events in our
σ -algebra F. “Measurability” ensures that we can assign probabilities to these events using our probability
measure P.
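
On a finite sample space, measurability can be checked by brute force. The following is a minimal sketch, assuming a four-point sample space and a small σ-algebra chosen purely for illustration; the helper name is_measurable is hypothetical. It tests whether the pre-image of every value taken by X is an event.

```python
def is_measurable(X, omega, sigma_algebra):
    """Return True if the pre-image of every value of X is an event.

    Assumes sigma_algebra really is a sigma-algebra on the finite set omega,
    so that checking single values suffices: any other pre-image is a finite
    union of these.
    """
    events = {frozenset(A) for A in sigma_algebra}
    values = {X(w) for w in omega}
    return all(
        frozenset(w for w in omega if X(w) == v) in events
        for v in values
    )

# Illustrative (assumed) sample space and sigma-algebra.
omega = {1, 2, 3, 4}
F = [set(), {1, 2}, {3, 4}, {1, 2, 3, 4}]

def coarse(w):
    """Constant on {1, 2} and on {3, 4}."""
    return 0 if w <= 2 else 1

def identity(w):
    """Separates all four outcomes."""
    return w

print(is_measurable(coarse, omega, F))    # True
print(is_measurable(identity, omega, F))  # False
```

The second function fails because, for example, the pre-image of the value 1 is {1}, which is not an event in F, so P cannot assign it a probability.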

Expectation
The expectation of a random variable is a weighted average based on the probabilities of the different
outcomes of that random variable. It’s a way to summarize in a single number the different possibilities
and probabilities of a random variable.
Mathematically, if X is a discrete random variable on a countable sample space Ω (meaning it takes at most
countably many values, like 1, 2, 3, ...), the expectation E[X] is defined as:

$$E[X] = \sum_{\omega \in \Omega} X(\omega)\, P(\{\omega\})$$

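If X is a continuous random variable (can take any value in a range, like any number between 0 and 1), the
sum is replaced by an integral with respect to the probability measure, and the expectation is defined as:

$$E[X] = \int_{\Omega} X(\omega)\, dP(\omega)$$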

These formulas might look complicated, but the basic idea is simple: for the expectation, you take each
possible value of X, multiply it by the probability of that value, then sum (or integrate) all up.
In layman’s terms, the expectation is the “average value” you would expect to get if you could repeat the
random experiment infinitely many times.

Relation with the Lebesgue Integral


Within measure theory, the mathematical expectation (or expected value) of a random variable is often
defined using the Lebesgue Integral. This integral is a generalization of the Riemann integral and is
particularly useful when dealing with more complex functions and more general spaces.
The Lebesgue Integral allows us to integrate functions with respect to measures that aren’t necessarily “nice”
or “simple”, like the Lebesgue measure in the case of real numbers or a probability measure in a probability
space. When talking about the expectation E[X] of a random variable X, we are actually using the Lebesgue
Integral to “sum” in a more generalized manner the products of X(ω ) and P(ω ) across the entire sample
space Ω.
In Lebesgue notation, the expectation of a random variable X is expressed as:
$$E[X] = \int_{\Omega} X\, dP$$

Here, dP represents the probability measure P, and the integration is performed in the Lebesgue sense.

Example 9.1. Imagine you’re playing a dice rolling game with a fair 6-sided die. You want to know the
“average value” you can expect when rolling the die.

(1) Random Variable X: Represents the outcome of the die roll. The sample space is Ω = {1, 2, 3, 4, 5, 6}.
(2) Probability Measure P: Since the die is fair, P({ω}) = 1/6 for each ω in Ω.
(3) Expectation E[X]:

$$E[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6} = \frac{21}{6} = 3.5$$

Thus, on average, you’d expect to get an outcome of 3.5 when rolling a fair 6-sided die, even if 3.5
isn’t a possible outcome in a single roll.
In this example, we’ve calculated the expectation using a finite sum because we’re in a discrete setting.
However, the concept extends to more general and complex sample spaces using the Lebesgue Integral,
as done in measure theory.
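
The finite sum above is easy to reproduce with exact arithmetic. The sketch below is illustrative only (the function name expectation is our own choice); it stores the probability measure as a dictionary and also evaluates the “squared number” random variable mentioned earlier.

```python
from fractions import Fraction

# Sample space and probability measure for a fair six-sided die.
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}

def expectation(X, P):
    """E[X] as the sum of X(w) * P({w}) over a finite sample space."""
    return sum(X(w) * p for w, p in P.items())

mean = expectation(lambda w: w, P)
mean_square = expectation(lambda w: w * w, P)

print(mean, float(mean))  # 7/2 3.5
print(mean_square)        # 91/6
```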

The random variable and its expected value or expectation are core concepts in probability theory. Within
the framework of measure theory, these ideas are further formalized and tied into more advanced concepts
like the Lebesgue Integral. This mathematical framework allows for the tackling of more complex problems
and practical applications in science, engineering, economics, and other fields.

Example 9.2. Buffon’s needle problem is a classic probability problem involving dropping a needle of length
l onto a floor that has parallel lines drawn a distance d apart. The goal is to determine the probability that
the needle will cross one of the lines.
Suppose a needle of length l (where l < d) is dropped randomly onto the floor. What is the probability P that
the needle will cross a line?

Solution 9.1. The solution to this problem relies on two random variables:

(1) The distance x from the center of the needle to the nearest line, which is uniformly distributed on the
interval [0, d/2].
(2) The angle θ that the needle makes with the lines, which is uniformly distributed on the interval [0, π].

The needle will cross a line if the perpendicular reach of its half-length exceeds the distance from its
center to the nearest line:

$$x < \frac{l}{2}\sin(\theta)$$

To find the probability, we’re going to use the Lebesgue integral. We’ll integrate over the region where the
above inequality holds true.
Establish the measure: Since x and θ are independent and uniformly distributed over their respective
intervals, the joint probability measure is the product of the individual measures. In other words, it has
constant density 1/((d/2) × π) with respect to dθ dx.
Set the integration limits:

(1) For x, it ranges from 0 to d/2.
(2) For θ, it ranges from 0 to π.

Set up and evaluate the integral: For a fixed θ, the needle crosses a line when x ranges from 0 to
(l/2) sin(θ). Because l < d, this upper limit never exceeds d/2.
The probability is:

$$P = \frac{1}{\frac{d}{2}\pi} \int_0^{\pi} \int_0^{\frac{l}{2}\sin(\theta)} dx\, d\theta$$

Evaluating the inner integral with respect to x:

$$P = \frac{1}{\frac{d}{2}\pi} \int_0^{\pi} \frac{l}{2}\sin(\theta)\, d\theta = \frac{l}{d\pi} \int_0^{\pi} \sin(\theta)\, d\theta$$

Evaluating the outer integral with respect to θ:

$$P = \frac{l}{d\pi} \left[-\cos(\theta)\right]_0^{\pi} = \frac{l}{d\pi}(2) = \frac{2l}{d\pi}$$

Thus, the probability P that the needle crosses a line is:

$$P = \frac{2l}{d\pi}$$

This problem serves as a good example of how integral calculus and probability techniques can be applied
in a geometric context and how the Lebesgue integral allows us to tackle problems that might be challenging
to address with the Riemann integral alone.
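
A simulation offers an independent check on a closed form of this kind. The sketch below is only illustrative: the values l = 1 and d = 2, the number of trials, and the function name are assumptions. It decides each crossing geometrically, by testing whether the two endpoints of the needle land in different strips, and compares the observed frequency with 2l/(πd), the classical value for l < d.

```python
import math
import random

def buffon_estimate(l, d, trials, seed=0):
    """Estimate the crossing probability by dropping needles at random."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(trials):
        y = rng.uniform(0.0, d)            # height of the needle's center
        theta = rng.uniform(0.0, math.pi)  # angle with the lines
        half = 0.5 * l * math.sin(theta)   # vertical half-extent of the needle
        # Lines sit at heights 0, d, 2d, ...; the needle crosses one exactly
        # when its two endpoints fall in different strips.
        if math.floor((y - half) / d) != math.floor((y + half) / d):
            crossings += 1
    return crossings / trials

l, d = 1.0, 2.0
estimate = buffon_estimate(l, d, trials=200_000)
classical = 2 * l / (math.pi * d)
print(estimate, classical)
```

With these parameters the two printed numbers should agree to about two decimal places; the residual gap is ordinary sampling error.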


Figure 9.1: Visualization of Buffon’s Needle problem. The parallel lines represent the floorboards. Two
scenarios are illustrated: a needle (in red) that doesn’t cross any line and another (in blue) that intersects a
line. For each needle, the distance x from its midpoint to the nearest line and the angle θ it makes with the
floorboards are highlighted.

Example 9.3. Consider the function f defined on the interval [0, 1] as:
$$f(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$$

This function, known as the Dirichlet function, is the indicator function of the rational numbers.

Solution 9.2. Calculate the “total sum” of f over the interval [0, 1], that is, its integral over this interval.
If we attempt to use the Riemann integral, we encounter a problem: for any subinterval of [0, 1], there will
always be both rational and irrational numbers in that subinterval, no matter how small. This makes the
function “too erratic” at every point for the Riemann integral to make sense.
The Lebesgue integral approaches integration in a different manner, considering the set of points where the
function takes certain values, rather than summing up function values over intervals.
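For f, for any number ε with 0 < ε ≤ 1, the set of points in [0, 1] where f is greater than 1 − ε is precisely
the set of rationals in [0, 1], which has Lebesgue measure zero. In other words, f is a simple function that
takes the value 1 on a set of measure 0 and the value 0 on a set of measure 1, so its integral is
1 × 0 + 0 × 1 = 0.
Therefore, the Lebesgue integral of f over [0, 1] is zero. That is:

$$\int_0^1 f\, d\mu = 0$$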

where µ is the Lebesgue measure.


This result is quite surprising and showcases the power of the Lebesgue integral. Even though f takes the
value 1 at infinitely many points in [0, 1], its “total sum” (in the sense of Lebesgue) is zero. This is because
the “density” of the points where f is non-zero (the rationals) is null in the sense of the Lebesgue measure.
This serves as a compelling example of how the Lebesgue integral can provide answers where the Riemann
integral fails. The indicator function of the rationals is not Riemann-integrable over any interval that contains
both rational and irrational numbers, which is any non-degenerate interval.


Figure 9.2: A symbolic representation of the indicator function of the rationals over the interval [0,1]. Red
dotted lines denote the value of 1 for rational numbers, while blue dotted lines denote the value of 0 for
irrational numbers. This is a conceptual representation; in reality, rational numbers are dense in the reals, so
there are no gaps.

Example 9.4. Suppose we have a random variable X that follows an exponential distribution with parameter
λ > 0. The probability density function of X is:

$$f(x) = \lambda e^{-\lambda x}, \quad x \geq 0$$

The goal is to find the expected value E[X] of this random variable.

Solution 9.3. To compute E[X], the general formula using the Lebesgue integral is:
$$E[X] = \int_{-\infty}^{\infty} x f(x)\, dx$$

Here, f (x) is the probability density function of X.

(1) Set the integration limits: Since f (x) = 0 for x < 0, the integral really just needs to be evaluated from 0
to ∞.
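(2) Set up and evaluate the integral: Plugging f(x) = λe^{−λx} into the integral, we get:

$$E[X] = \int_0^{\infty} x \lambda e^{-\lambda x}\, dx$$

To evaluate this integral, we can use integration by parts, with u = x and dv = λe^{−λx} dx, so that
du = dx and v = −e^{−λx}:

$$E[X] = \left[-x e^{-\lambda x}\right]_0^{\infty} + \int_0^{\infty} e^{-\lambda x}\, dx = 0 + \frac{1}{\lambda}$$

Solving, we get:

$$E[X] = \frac{1}{\lambda}$$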
This is a classic result in statistics and is derived using the Lebesgue integral. It serves as a good example of
how this mathematical tool can be applied to solve problems in probability theory.
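
The value 1/λ can also be checked numerically. The following is a rough sketch rather than a rigorous computation: it truncates the integral at 50/λ (an assumed cutoff, beyond which the integrand is negligible) and applies a plain midpoint rule; the function name and step count are arbitrary choices.

```python
import math

def exponential_mean(lam, steps=200_000):
    """Midpoint-rule approximation of the integral of x * lam * exp(-lam * x)."""
    upper = 50.0 / lam  # assumed cutoff; the tail beyond it is negligible
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x * lam * math.exp(-lam * x)
    return total * h

for lam in (0.5, 1.0, 4.0):
    print(lam, exponential_mean(lam), 1.0 / lam)
```

For each λ the last two columns should agree to several decimal places.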

Figure 9.3: The expected value E[X] can be visualized as the balance point of the area under the curve. For
the exponential distribution, this is equal to 1/λ.

Example 9.5. A probability space is a specialized form of a measurable space. Let’s recall the foundational
definitions:

(1) A measurable space is a pair (X, F ) where X is a set and F is a σ -algebra over X. A σ -algebra is a
collection of subsets of X that contains the empty set, is closed under complement, and is closed under
countable unions.
(2) A probability space is a triplet (X, F , P) where (X, F ) is a measurable space and P : F → [0, 1] is a
probability measure. A probability measure P must satisfy:
(a) P(∅) = 0 and P(X) = 1.
(b) If A1, A2, . . . are disjoint events in F, then:

$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$

Consider the set X = {a, b} and the collection F = {∅, {a}, {b}, {a, b}}.

Solution 9.4. First, we verify that F is a σ -algebra over X:

(1) The empty set and X are in F .


(2) The complement of {a} is {b}, and both are in F . The same applies for the complement of {b} which
is {a}.
(3) Regarding countable unions, there isn’t much to check since our set X is finite. But, for example, the
union of {a} and {b} is {a, b}, which is in F .

Thus, F is a σ -algebra over X, making (X, F ) a measurable space.


Now, let’s define a function P : F → [0, 1] as:

P(∅) = 0, P({a}) = 0.5, P({b}) = 0.5, P({a, b}) = 1

We check the properties of P:

(1) P(∅) = 0 and P(X) = P({a, b}) = 1.
(2) Since we only have finite sets in F , we only need to verify that for disjoint sets A and B, P(A ∪ B) =
P(A) + P(B). For A = {a} and B = {b}, we get P(A ∪ B) = P({a, b}) = 1, which matches P({a}) +
P({b}) = 0.5 + 0.5 = 1.

Hence, P is a probability measure over F .


Therefore (X, F, P) is a probability space built on the measurable space (X, F). This illustrates the opening
claim: a probability space is a measurable space equipped with a measure of total mass 1.


Figure 9.4: Venn diagram illustrating the set X, its subsets, and the corresponding probabilities in the
probability space.

Example 9.6. Sample Space:


Ω = {1, 2, 3, 4}

Collection of Subsets:
F = {∅, {1, 2}, {3, 4}, {1, 2, 3, 4}}

For F to be a σ -algebra, it must satisfy the following properties:

(1) Ω and ∅ are in F.
(2) If A is in F, then its complement A^c must also be in F.
(3) F is closed under countable unions. That is, if A_i ∈ F for all i, then ⋃_i A_i ∈ F.

Verification:

(1) Ω and ∅ are clearly in F.
(2) Consider the sets in F:
(a) The complement of {1, 2} is {3, 4} which is in F.
(b) The complement of {3, 4} is {1, 2} which is in F.
(c) The complement of Ω is ∅ and vice versa, both are in F.
(3) As our sample space is finite, we only need to consider finite unions. For instance, {1, 2} ∪ {3, 4} =
{1, 2, 3, 4} which is in F .

Therefore, F meets all three required properties and is a σ -algebra.
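
For finite collections, the three properties can be verified mechanically. The sketch below is one possible way to do so; the function name is_sigma_algebra and the counterexample G are our own illustrative choices. It relies on the fact that, on a finite space, closure under pairwise unions already gives closure under countable unions.

```python
from itertools import combinations

def is_sigma_algebra(omega, collection):
    """Check the sigma-algebra axioms for subsets of a finite set omega."""
    omega = frozenset(omega)
    events = {frozenset(A) for A in collection}
    # (1) The whole space and the empty set must be events.
    if omega not in events or frozenset() not in events:
        return False
    # (2) Closure under complements.
    if any((omega - A) not in events for A in events):
        return False
    # (3) Closure under unions; pairwise unions suffice on a finite space.
    return all((A | B) in events for A, B in combinations(events, 2))

omega = {1, 2, 3, 4}
F = [set(), {1, 2}, {3, 4}, {1, 2, 3, 4}]
G = [set(), {1}, {1, 2, 3, 4}]  # illustrative failure: {2, 3, 4} is missing

print(is_sigma_algebra(omega, F))  # True
print(is_sigma_algebra(omega, G))  # False
```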





Figure 9.5: Venn diagram representing the sample space Ω and the subsets of the σ -algebra F . The two
shaded circles represent the subsets {1, 2} and {3, 4}.

Example 9.7. Let’s consider the real interval [0, 1]. This interval contains both rational and irrational
numbers. The set of rational numbers in this interval is denoted by Q ∩ [0, 1]. The main questions addressed
are:

(1) Is the collection of subsets {∅, Q ∩ [0, 1], [0, 1] \ (Q ∩ [0, 1]), [0, 1]} a σ-algebra?
(2) What is the probability (in terms of Lebesgue measure) of randomly picking a rational number from the
interval [0, 1]?

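Solution 9.5. (1) Verification of σ-algebra: The set F = {∅, Q ∩ [0, 1], [0, 1] \ (Q ∩ [0, 1]), [0, 1]} should
satisfy three properties to be a σ-algebra:

(a) The entire sample space [0, 1] is in F. This holds by construction.
(b) If an event A is in F, its complement A^c should also be in F. The rationals and the irrationals in
[0, 1] are complements of each other, and so are ∅ and [0, 1].
(c) If A1, A2, . . . are in F, their countable union should also be in F. Since F has only four members,
any such union reduces to a union of at most four sets, which is again ∅, one of the two middle sets,
or [0, 1].

Thus, F is a σ-algebra.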
(2) Probability using Lebesgue Integral: The Lebesgue measure m of a set E in R is the infimum of the total
length of countable collections of intervals that cover E. To determine the measure of the rationals
Q ∩ [0, 1]:
Rational numbers can be expressed as p/q, where p and q are integers and q ≠ 0. There are countably
many rational numbers in [0, 1], so we can enumerate them as r1, r2, r3, . . .. If we take a tiny interval
around each rational, the measure of the set they cover is at most the sum of their lengths.
For each ri, let’s take an interval of length ε/2^i. Then, the total length of these intervals covering all the
rationals is at most

$$\sum_{i=1}^{\infty} \frac{\varepsilon}{2^i} = \varepsilon$$

Given that ε can be any positive number as small as we wish, the Lebesgue measure of the set of all
rationals in [0, 1] is 0.
Thus, the probability of randomly selecting a rational number from the interval [0, 1] is 0.
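
The covering argument can be made concrete on a finite piece of the enumeration. The sketch below is purely illustrative: the bound 12 on the denominators, the value of ε, and the function names are arbitrary assumptions. It lists the rationals in [0, 1] with small denominators, gives the i-th one an interval of length ε/2^i, and confirms that the total length stays below ε however many rationals are included.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_denominator):
    """Distinct rationals p/q in [0, 1] with q <= max_denominator, sorted."""
    seen = set()
    for q in range(1, max_denominator + 1):
        for p in range(0, q + 1):
            seen.add(Fraction(p, q))
    return sorted(seen)

def cover_length(rationals, eps):
    """Total length when the i-th rational gets an interval of length eps / 2**i."""
    return sum(eps / 2**i for i, _ in enumerate(rationals, start=1))

eps = Fraction(1, 1000)
rs = rationals_in_unit_interval(12)
total = cover_length(rs, eps)

print(len(rs), float(total), total < eps)
```

Raising the denominator bound adds more rationals to the list, yet the total length remains strictly below ε, which is the heart of the measure-zero argument.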


Figure 9.6: Graphical representation of the interval [0, 1]. Rational numbers are denoted by green points,
while the true density of these points is not captured due to their countable nature. The continuum of the line
represents both rational and irrational numbers.

Example 9.8. The probability density function (pdf) for a standard normal random variable X is given by:
$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$$
To calculate the probability that X falls between two values a and b, we evaluate the integral of f (x) over
the interval [a, b]. Using the Lebesgue integral notation, this would be:
$$P(a \leq X \leq b) = \int_{[a,b]} f\, d\mu$$

where µ represents the Lebesgue measure.


Furthermore, to compute the expectation and variance of X, we would employ the Lebesgue integral. For
instance, the expectation E[X] is:
$$E[X] = \int_{\mathbb{R}} x f(x)\, d\mu$$

$$E[X] = \int_{\mathbb{R}} x \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}\, d\mu$$

$$E[X] = 0$$

(as the normal distribution is symmetric about 0).


The variance Var(X) is:
$$\mathrm{Var}(X) = E[X^2] - (E[X])^2$$

$$\mathrm{Var}(X) = \int_{\mathbb{R}} x^2 \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}\, d\mu - 0^2$$

$$\mathrm{Var}(X) = 1$$

(where we’ve used the fact that E[X] = 0 for the standard normal distribution).
It’s essential to note that while the Lebesgue integral notation looks similar to Riemann integral notation,
the underlying definition and construction of the Lebesgue integral are more general. This generality allows
handling a broader range of functions and situations. The normal distribution itself can be tackled with the
Riemann integral, but the Lebesgue integral offers a more general and robust framework for working with
probability and random variables.
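
These three quantities are straightforward to evaluate numerically. The sketch below makes several assumptions for illustration: the interval is [a, b] = [−1, 1], the real line is truncated to [−12, 12], and a simple midpoint rule stands in for the integral. The probability itself is computed through the error function from the standard library.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal CDF written through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def integrate(g, a, b, steps=200_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

a, b = -1.0, 1.0
prob = normal_cdf(b) - normal_cdf(a)
mean = integrate(lambda x: x * normal_pdf(x), -12.0, 12.0)
second_moment = integrate(lambda x: x * x * normal_pdf(x), -12.0, 12.0)
variance = second_moment - mean**2

print(prob)            # about 0.6827
print(mean, variance)  # close to 0 and 1
```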


Figure 9.7: Graphical representation of the standard normal probability density function f (x). The shaded
region between a and b depicts the probability P(a ≤ X ≤ b), calculated using the Lebesgue integral.

Example 9.9. The Poisson distribution is a fundamental tool in probability theory and statistics for modeling
the number of events occurring in fixed intervals of time or space. The Lebesgue integral can be used in
contexts where the Riemann integral struggles, especially when dealing with functions that aren’t “friendly”
in the traditional sense.
Suppose you have a random variable X that follows a Poisson distribution with parameter λ . The probability
mass function (pmf) for X is:

$$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \ldots$$
Now, imagine you wish to compute the expectation E[g(X)], where g(x) is a particularly nasty function that,
for instance, jumps between many different values in an irregular manner. It might be the case that g(x) is
Riemann-integrable over some intervals but not others.
To find E[g(X)], you would need to sum over all possible values of X, weighted by their probability:

$$E[g(X)] = \sum_{k=0}^{\infty} g(k) \frac{e^{-\lambda} \lambda^k}{k!}$$

In terms of the Lebesgue integral, this amounts to integrating g with respect to the discrete measure that
the Poisson distribution places on the non-negative integers, and the expectation is well defined whenever
the series converges absolutely. If g is a complex function, it may not be easily Riemann-integrable in
the traditional sense. However, with the Lebesgue integral, we can tackle this problem more generally since
the Lebesgue integral is defined in terms of a measure (in this case, the Poisson measure) and doesn’t require
the function to be “friendly” in the same sense as the Riemann integral.
The Lebesgue integral allows you to work directly with the underlying measure (the Poisson distribution in
this case) and avoids issues that might arise if you tried to tackle the problem using only Riemann techniques.
The case demonstrates how the Lebesgue integral can be advantageous when working with the expectation
of a Poisson random variable, especially when dealing with functions that might not be easily manageable
with the Riemann integral. This is one of the reasons why measure theory and the Lebesgue integral are so
foundational in advanced probability theory: they enable us to handle a much broader range of problems and
scenarios.
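
In practice the series is simply summed. The following sketch truncates it after a fixed number of terms, which is adequate for bounded or slowly growing g and moderate λ; the parity function used as the “irregular” g, the value λ = 4, and the function name are illustrative assumptions. For the parity function the exact answer is known, P(X is even) = (1 + e^{−2λ})/2, which gives something to compare against.

```python
import math

def poisson_expectation(g, lam, terms=200):
    """Approximate E[g(X)] by truncating the series over k = 0, 1, 2, ..."""
    total = 0.0
    weight = math.exp(-lam)      # P(X = 0)
    for k in range(terms):
        total += g(k) * weight
        weight *= lam / (k + 1)  # P(X = k + 1) obtained from P(X = k)
    return total

def parity(k):
    """An irregular g: jumps between 1 and 0 at every integer."""
    return 1.0 if k % 2 == 0 else 0.0

lam = 4.0
even_prob = poisson_expectation(parity, lam)
closed_form = 0.5 * (1.0 + math.exp(-2.0 * lam))
mean = poisson_expectation(lambda k: k, lam)

print(even_prob, closed_form)  # should agree closely
print(mean)                    # close to lam
```

Updating the weight recursively avoids computing large factorials and powers separately.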

[Bar chart: Probability (vertical axis) against Number of Events (k) from 0 to 10 (horizontal axis).]

Figure 9.8: A representation of the Poisson distribution for a given parameter λ . The bars show the
probability of observing k events.

Example 9.10. Imagine a company that produces collectible cards. There is a special edition which includes
4 rare cards out of a total of 20 cards. You decide to buy 5 cards at random. The primary interest is to find
out the probability of getting exactly 3 rare cards.
To model this using a hypergeometric distribution, the probability P(X = k) is defined as:

$$P(X = k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}}$$

where:

(1) N is the total number of cards (20 in this case).


(2) K is the total number of rare cards (4 in this case).
(3) n is the total number of cards you decide to purchase (5 in this case).
(4) k is the number of rare cards you want to obtain (3 in this case).
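
With these values the probability can be evaluated directly. The short sketch below uses exact binomial coefficients from the standard library; the function name hypergeom_pmf is our own choice.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k rare cards among n cards drawn without replacement."""
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

p = hypergeom_pmf(3, N=20, K=4, n=5)
print(p, float(p))  # 10/323, roughly 0.031

# Sanity check: the probabilities over all feasible k add up to 1.
print(sum(hypergeom_pmf(k, 20, 4, 5) for k in range(0, 5)))  # 1
```

So drawing exactly 3 rare cards happens in about 3% of purchases.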

Now, let’s imagine that we want to understand how this distribution behaves as the total number of cards
N tends to infinity, while keeping the proportion of rare cards constant. This could lead us to a continuous
distribution, and this is where the Lebesgue integral could come into play.

Solution 9.6. By making N large and keeping the proportion of rare cards constant, our hypergeometric
distribution can be approximated using the binomial distribution. Furthermore, if the number of cards you
buy is also a large number, we might use a normal approximation.

The cumulative distribution function of the normal distribution is defined in terms of an integral, which can
be expressed through the error function, and this is where the Lebesgue integral might be relevant. For
instance, it makes precise the statement that a normal random variable assigns probability zero to every set
of Lebesgue measure zero, including uncountable ones such as the Cantor set.
However, this is a highly theoretical and conceptual scenario. In practice, the hypergeometric distribution is
handled in discrete terms and doesn’t typically require the application of the Lebesgue integral.


Figure 9.9: Schematic representation of the hypergeometric probability mass function and its binomial and
normal approximations. As the population size increases, the hypergeometric distribution can be approximated
by the binomial and, when the sample size is also large, by the normal distribution.

Example 9.11. Consider tossing a fair coin n = 1000 times, and we are interested in the probability of getting
heads exactly 500 times. Instead of using the binomial formula (which would be computationally intensive
for n = 1000), we can use a normal approximation.
Binomial Distribution:

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$$
Where:
n = 1000
k = 500
p = 0.5

Normal Approximation: The mean µ and variance σ 2 of a binomial distribution are given by:

µ = np

σ 2 = np(1 − p)

For n = 1000 and p = 0.5, we have:


µ = 500
σ 2 = 250
σ ≈ 15.81

To compute the probability of getting exactly 500 heads using the normal approximation, we can find the
probability that X lies between 499.5 and 500.5:

P(499.5 < X < 500.5)

To do this using the Lebesgue integral, we need the probability density function of the normal distribution:

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Thus, the desired probability is:
$$P(499.5 < X < 500.5) = \int_{499.5}^{500.5} f(x)\, dx$$

Solution 9.7.

$$P(499.5 < X < 500.5) = \int_{499.5}^{500.5} \frac{1}{15.81 \sqrt{2\pi}} e^{-\frac{(x-500)^2}{2(15.81^2)}}\, dx$$

This integral is typically solved using standard normal distribution tables or statistical software.
Using the Lebesgue integral allows us to deal with functions that might not be integrable in the traditional
Riemann sense. However, in this case, the function is smooth and well-behaved, so both Riemann and
Lebesgue integrals would yield the same result. Nevertheless, utilizing the Lebesgue integral prepares us
to work in more general contexts where Riemann integrability might not be applicable.
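
As a rough check on the approximation, the sketch below evaluates the normal integral through the error function and compares it with the exact binomial probability. It assumes Python 3.8 or later for math.comb; arbitrary-precision integers make the exact value feasible here even for n = 1000.

```python
import math

n, p, k = 1000, 0.5, 500
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Normal approximation with continuity correction.
approx = normal_cdf(k + 0.5, mu, sigma) - normal_cdf(k - 0.5, mu, sigma)

# Exact binomial probability for p = 0.5: C(n, k) / 2^n, using big integers.
exact = math.comb(n, k) / 2**n

print(approx, exact)  # both close to 0.0252
```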


Figure 9.10: Alternative visualization of the binomial distribution (blue bars) for n = 1000 and p = 0.5
alongside its normal approximation (orange dashed line) with µ = 500 and σ ≈ 15.81.

9.4. Lebesgue’s Dominated Convergence Theorem in Probability


One of the most fundamental theorems in probability theory that directly benefits from measure theory, and
in particular from the Lebesgue integral, is Lebesgue’s Dominated Convergence Theorem. This theorem
has already been explained in this book.
Theorem 9.1 (Lebesgue’s Dominated Convergence). Suppose that (Xn) is a sequence of random variables
and X is a random variable such that Xn → X almost surely (or in probability). If there
exists a random variable Y such that |Xn| ≤ Y almost surely for all n and E[Y] < ∞, then

E[|Xn − X|] → 0.

In particular, since |E[Xn] − E[X]| ≤ E[|Xn − X|], it follows that

E[Xn] → E[X].

Relevance with the Lebesgue Integral


The Riemann integral, which many students first encounter, behaves problematically when dealing with
limits under the integral sign. The Lebesgue integral, on the other hand, is much more amenable to limit
operations, making it easier to prove and apply results like the Dominated Convergence Theorem.
Lebesgue integration allows for the interchange of limits and integration operations under far more general
conditions than the Riemann integral, making it foundational in probability theory, especially when dealing
with sequences of random variables.
In the context of the theorem, the “expectation” or “expected value” E[·] is actually an integral with respect to
a probability measure, and this integral is defined using the Lebesgue integral. The dominated convergence
theorem is an essential tool in proving the convergence of expectations of sequences of random variables,
which would be far more challenging to establish without measure theory and the Lebesgue integral.
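
A small numerical experiment shows why the domination hypothesis matters. The sketch below rests on assumptions chosen for illustration: ω is uniform on [0, 1], expectations are approximated with a midpoint rule, and the two sequences are our own examples. The first, ω ↦ ω^n, is bounded by Y = 1 and its expectations 1/(n + 1) tend to 0. The second, a spike of height n on [0, 1/n], also tends to 0 almost surely, but no integrable Y dominates it and its expectations stay at 1.

```python
def expectation(X, steps=100_000):
    """Midpoint-rule approximation of E[X] for omega uniform on [0, 1]."""
    h = 1.0 / steps
    return h * sum(X((i + 0.5) * h) for i in range(steps))

def dominated(n):
    """X_n(omega) = omega**n, bounded by the integrable Y = 1."""
    return lambda w: w**n

def undominated(n):
    """Spike of height n on [0, 1/n]; admits no integrable dominating Y."""
    return lambda w: float(n) if w <= 1.0 / n else 0.0

for n in (1, 10, 100, 1000):
    print(n, expectation(dominated(n)), expectation(undominated(n)))
```

The second column shrinks towards 0, in line with the theorem, while the third stays near 1 even though the limit random variable is 0 almost surely.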

9.5. Conclusions
Throughout this chapter, we embarked on a journey through the foundational concepts of probability theory:
probability spaces and random variables, and delved into the pivotal concept of expectation.
We began by introducing probability spaces as a mathematical framework comprising a sample space,
events, and probabilities assigned to these events. This structure, derived from measure theory, provides a
rigorous and general setting to tackle problems in probability, making it essential for advanced applications.
Moving forward, we transitioned into random variables. These variables, which map outcomes in a sample
space to real numbers, serve as a bridge between the abstract universe of events and the more concrete realm
of measurable numerical outcomes. They form the backbone of many statistical procedures and real-world
applications, ranging from risk assessments in finance to predictions in machine learning.

Lastly, we explored the concept of expectation. The expectation of a random variable offers a measure of
its “center” or “average” value and plays a vital role in numerous areas of mathematics and its applications.
Whether predicting future stock prices or determining the average lifespan of a product, understanding and
calculating expectations are paramount.
In essence, this chapter provided the foundational tools required to analyze uncertainty and make predictions.
As we move forward, these tools will enable us to delve deeper into more advanced topics in probability and
statistics, building upon the strong foundation laid in this chapter.

9.6. Exercises
Exercise 9.1. Given the sample space Ω = {H, T } representing a coin flip where H is heads and T is tails,
and a probability measure P(H) = 0.5 and P(T ) = 0.5, find the probability of the event A = {H}.

Exercise 9.2. Consider a random variable X that takes on values 1, 2, and 3 with probabilities 0.2, 0.5, and
0.3, respectively. Find the expectation E[X].

Exercise 9.3. Let Xn be a sequence of random variables defined as Xn (ω ) = ωn for ω ∈ [0, 1]. Prove that
Xn converges to X(ω ) = 0 almost surely, and verify the conditions of Lebesgue’s Dominated Convergence
Theorem.

Exercise 9.4. Suppose f : R → R is a measurable function such that f (x) = 0 for almost every x with respect
to Lebesgue measure. Prove that the Lebesgue integral of f over R is zero, i.e.,
$$\int_{\mathbb{R}} f\, d\lambda = 0.$$

Exercise 9.5. Suppose X is a random variable that follows a Poisson distribution with parameter λ (for
example, λ = 4). Given that the standard normal distribution Z has mean 0 and variance 1, prove that as λ
tends to infinity, the distribution of $\frac{X - \lambda}{\sqrt{\lambda}}$ tends to the distribution of Z.
Chapter 10
Convergence in Probability

Carlos Polanco
Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México.
Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, Ciudad de
México, México.

Abstract This chapter delves into the foundational pillars of probability theory and statistical inference:
the Central Limit Theorem, the Weak Law of Large Numbers, and the Law of Large Numbers. The
reader is introduced to the profound implications of these theorems on the behavior of large collections
of random variables. The Central Limit Theorem unravels the fascinating convergence of the distribution
of standardized sample means to a standard normal distribution, irrespective of the original distribution,
as the sample size burgeons. In parallel, the Weak Law of Large Numbers and the Law of Large Numbers are
expounded upon, illuminating the convergence of sample averages to their expected values (in probability
for the weak law, almost surely for the strong form).
Through a blend of rigorous proofs, intuitive explanations, and practical examples, the chapter furnishes a
comprehensive understanding of these essential theorems and their pivotal roles in the realm of statistics,
data analysis, and beyond.

Keywords: central limit theorem, definition of weak law of large numbers, law of large numbers.

10.1. Introduction
The study of probability theory provides the foundation upon which much of statistics and data science are
built. Two pillars of this foundation are the Central Limit Theorem and the Laws of Large Numbers. These

(B) Carlos Polanco: Department of New Technologies and Intellectual Protection, Instituto Nacional de Cardiologı́a “Ignacio
Chávez”. Ciudad de México, México. Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de
México; Tel: +01 55 5595 2220; E-mail: polanco@unam.mx


theorems offer profound insights into the behavior of random sequences and serve as essential tools for both
theoretical inquiries and practical applications.
In this chapter, we will embark on an exploration of these fundamental concepts, starting with the raw
definitions and culminating in a deep understanding of their implications. Our journey begins with the Law
of Large Numbers, both in its strong and weak forms. At its core, the Law of Large Numbers guarantees
that as we collect more data, our sample averages tend to the true population average. It provides the bedrock
principle that underpins much of empirical science and the justification for taking larger samples in statistics.
Following this, our attention will turn to the Central Limit Theorem, a remarkable result that tells us that no
matter the original distribution of data, the distribution of the average of a large sample resembles a normal
distribution. This universality is what makes the normal distribution ubiquitous in many fields of study.
Together, the Central Limit Theorem and Laws of Large Numbers form the cornerstones of classical
probability theory, with ramifications spanning across disciplines from finance to physics. Join us as we delve
into the elegant world of sums and averages, of convergence and limits, and discover why these theorems
have earned their place of honor in the annals of mathematical history.

10.2. Law of Large Numbers


Imagine flipping a coin many times. For a fair coin, we know that the probability of heads is 50%, and so is the probability of tails.
But what happens if we flip the coin 10, 100, or 1,000 times? Do we see exactly 50% heads and 50% tails
every time? Probably not. However, the Law of Large Numbers tells us something very intriguing about this.
At its core, this law assures us that as we flip the coin more and more, the proportion of heads we get
tends towards its expected value, which is 50%. It’s like saying that even if we don’t know what will happen
in the next flip, if we flip it many times, in the end, the proportion of heads and tails will be quite predictable.
Now, what does this have to do with measure theory and convergence in probability?
Convergence in probability is a mathematical concept that helps us understand how sequences of random
variables behave in the long run. In the language of measure theory, it is convergence in measure: a sequence
of random variables (measurable functions) Xn converges in probability to X if, for every ε > 0, the
probability of the event |Xn − X| > ε tends to zero as n → ∞.
The Law of Large Numbers is a perfect exemplification of convergence in probability. It tells us that even
though each coin flip is random and unpredictable, the average over many flips converges to its expected
value. That is, as the number of flips increases, the proportion of heads “stabilizes” around 50%.
Why is it crucial to study this in the context of measure theory? Well, measure theory provides us with the
mathematical tools to deal with random events and their “measures” or “quantities”. It allows us to formalize
and prove laws like the Law of Large Numbers. In this chapter, we dive into the math behind this law and
explore how, in the vast realm of randomness, there are still some predictable behaviors, and all this is
unveiled through convergence in probability!

Definition of Weak Law of Large Numbers


Let (Xn )n≥1 be a sequence of independent and identically distributed (i.i.d.) random variables with expected
value E[Xi ] = µ and variance Var(Xi ) = σ 2 < ∞. The Weak Law of Large Numbers states that for every
ε > 0:
\[ \lim_{n \to \infty} P\left( \left| \frac{1}{n} \sum_{i=1}^{n} X_i - \mu \right| > \varepsilon \right) = 0 \]

Proof (Using Chebyshev’s Inequality):


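Proof. By linearity of expectation, the sample mean has expected value µ. Since the variance is finite, Chebyshev’s inequality gives, for any ε > 0:
\[ P\left( \left| \frac{1}{n} \sum_{i=1}^{n} X_i - \mu \right| > \varepsilon \right) \le \frac{\mathrm{Var}\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right)}{\varepsilon^2} \]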

Since Xi are independent,


\[ \mathrm{Var}\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} \]

Plugging this into our inequality:


\[ P\left( \left| \frac{1}{n} \sum_{i=1}^{n} X_i - \mu \right| > \varepsilon \right) \le \frac{\sigma^2}{n\varepsilon^2} \]
As n → ∞, the probability approaches zero, which proves the Weak Law of Large Numbers.
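
The bound obtained in the proof can be compared with simulated frequencies. The following is a minimal sketch, not part of the original text: it assumes Uniform(0, 1) variables (so µ = 1/2 and σ² = 1/12), and the function names, the seed, and the number of trials are illustrative choices.

```python
import random

def deviation_frequency(n, eps, trials=1000, seed=0):
    """Fraction of simulated sample means that miss mu = 0.5 by more than eps."""
    rng = random.Random(seed)  # fixed seed so that a run is reproducible
    misses = 0
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        if abs(sample_mean - 0.5) > eps:
            misses += 1
    return misses / trials

def chebyshev_bound(n, eps, variance=1.0 / 12.0):
    """The bound sigma^2 / (n * eps^2) from the proof, capped at 1."""
    return min(1.0, variance / (n * eps ** 2))

if __name__ == "__main__":
    eps = 0.05
    for n in (10, 100, 1000):
        # Columns: n, simulated frequency of a large deviation, Chebyshev bound
        print(n, deviation_frequency(n, eps), chebyshev_bound(n, eps))
```

The simulated frequency should stay below the bound and shrink as n grows; Chebyshev’s inequality is usually far from tight, so the gap between the two columns can be large.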

Example 10.1. Consider flipping a fair coin where the probability of heads, denoted as 1, is p = 0.5 and the
probability of tails, denoted as 0, is q = 0.5. Let the random variable Xi represent the result of the ith flip.
If we flip the coin n times, the proportion of heads (the sample mean) is given by $\frac{1}{n} \sum_{i=1}^{n} X_i$.
As per the Weak Law of Large Numbers, as n grows larger, the difference between the observed average and
the expected value µ = 0.5 will become increasingly smaller with high probability. In simpler terms, if you
flip the coin a large number of times, the proportion of heads will get closer to 50%.
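
For the fair coin the deviation probability can be computed exactly from the binomial distribution, since the number of heads in n flips is Binomial(n, 1/2). The sketch below is illustrative only; the function names are hypothetical, and ε is converted to an exact fraction to avoid floating-point ties at the boundary.

```python
from fractions import Fraction
from math import comb

def coin_deviation_probability(n, eps):
    """Exact P(|S_n / n - 1/2| > eps) for n flips of a fair coin."""
    eps = Fraction(str(eps))  # e.g. 0.05 becomes exactly 1/20
    half = Fraction(1, 2)
    favourable = sum(
        comb(n, k) for k in range(n + 1) if abs(Fraction(k, n) - half) > eps
    )
    return favourable / 2 ** n  # number of favourable outcomes over 2^n

def coin_chebyshev_bound(n, eps):
    """Chebyshev bound with sigma^2 = p * q = 0.25, capped at 1."""
    return min(1.0, 0.25 / (n * eps ** 2))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, coin_deviation_probability(n, 0.05), coin_chebyshev_bound(n, 0.05))
```

With ε = 0.05 the exact probability falls as n goes from 10 to 1,000, which is the behavior the Weak Law of Large Numbers describes.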

10.3. Central Limit Theorem


Suppose that you have a standard six-sided die. If you roll it many times and note down the number on each
throw, you’ll eventually realize that each number (from 1 to 6) comes up approximately the same number of
times. If you plot the frequency of each number, you get a uniform distribution because each number has an
equal chance of appearing.
Now imagine that, instead of rolling one die once, you roll it twice and sum the results. If you do this many, many
times and plot the results, you’ll see that the number 7 (which can be obtained by adding, say, 3 and 4, or 5
and 2 among others) tends to appear more often than 2 or 12. The distribution of the sums has a shape that
peaks in the middle and tapers off towards the ends.
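
The distribution of the sum of two dice can be enumerated exactly. This short sketch (the function name is an illustrative choice) lists all 36 equally likely outcomes and counts how often each sum occurs.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_distribution():
    """Exact probability of each possible sum of two fair six-sided dice."""
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {total: Fraction(c, 36) for total, c in sorted(counts.items())}

if __name__ == "__main__":
    for total, prob in two_dice_distribution().items():
        print(total, prob)
```

The sum 7 arises from six of the 36 outcomes, while 2 and 12 each arise from only one.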
The Central Limit Theorem tells us that if you keep summing more and more results of throws (or observations from any
distribution, as long as it has a finite mean and variance) and plot the sums or averages, the shape of the
distribution, once centered and rescaled, will approach what’s called a “Normal distribution” or “bell curve” (because it’s bell-shaped).
This happens regardless of what the original distribution looked like (in the die’s case, it was uniform).
The normal distribution is well-known and studied, and many statistical tools are based on it. The Central
Limit Theorem tells us that no matter how our original data is distributed (provided its variance is finite), if we take large enough samples and
compute their averages, those averages will be approximately normally distributed. This allows us to make inferences and run
statistical tests on those averages, even if our original data wasn’t normal.
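
The dice illustration can be pushed further by computing the exact distribution of the sum of n dice and comparing its standardized distribution function with the normal one. This is a sketch under stated assumptions: the function names are hypothetical, and it uses the mean 3.5 and variance 35/12 of a single fair die.

```python
from math import erf, sqrt

def dice_sum_pmf(n):
    """Exact pmf of the sum of n fair dice, built by repeated convolution."""
    pmf = {0: 1.0}
    for _ in range(n):
        new_pmf = {}
        for total, prob in pmf.items():
            for face in range(1, 7):
                new_pmf[total + face] = new_pmf.get(total + face, 0.0) + prob / 6.0
        pmf = new_pmf
    return pmf

def normal_cdf(z):
    """Distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def standardized_cdf(n, z):
    """P(Z_n <= z), where Z_n is the centered and rescaled sum of n dice."""
    mu, variance = 3.5, 35.0 / 12.0
    scale = sqrt(n * variance)
    return sum(p for s, p in dice_sum_pmf(n).items() if (s - n * mu) / scale <= z)

if __name__ == "__main__":
    for n in (1, 2, 10, 50):
        print(n, standardized_cdf(n, 1.0), normal_cdf(1.0))
```

Because the sum of dice takes only integer values, the two columns never agree exactly, but the difference becomes small as n grows.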

Connection with Measure Theory


The Central Limit Theorem and Measure Theory are intimately linked, although it’s essential to understand
that Measure Theory is a broader, more mathematically rigorous field that goes beyond the typical practical
applications of the Central Limit Theorem in statistics.
Measure Theory, in simple terms, deals with how to define and understand “volume” or “size” in very general
spaces. In mathematics, a “measure” is a way to assign a volume, length, area, etc., to sets in certain spaces.
When working in statistics and probability, it’s essential to understand how to assign probabilities to different
events. The probability, in this context, is just a particular measure that satisfies certain properties.
The space on which we define this measure in statistics is the probability space, and the events in this space
get a “probability measure” (i.e., a probability) indicating how likely they are to occur.
The Central Limit Theorem is about sequences of random variables and how their sums (or averages) are
distributed. The way we define and understand the “distribution” of a random variable involves measure.
Specifically, we use what’s called the “distribution function” to describe how probability (or measure) is
spread over the set of possible values a random variable can take.
The rigorous proof of the Central Limit Theorem, which shows that sums (or averages) of random variables
distribute in a specific way (the normal distribution), relies on concepts from Measure Theory. This proof
ensures that, under certain conditions, the probability measure of averages converges to the measure of the
normal distribution.

Definition of Central Limit Theorem


Let X1 , X2 , . . . be independent and identically distributed random variables with expected value µ and
finite variance σ² > 0. Then the normalized sum
\[ Z_n = \frac{X_1 + X_2 + \dots + X_n - n\mu}{\sigma\sqrt{n}} \]
converges in distribution to a standard normal distribution as n → ∞.

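Proof. We sketch the argument under the additional assumption that the moment generating function of the Xi exists in a neighborhood of zero.
(1) Moment Generating Functions (MGF): The MGF of a random variable X is defined as:

\[ M_X(t) = E\left[ e^{tX} \right]. \]

(2) Considering Zn , the MGF is:

\[ M_{Z_n}(t) = E\left[ e^{\, t \, \frac{X_1 + X_2 + \dots + X_n - n\mu}{\sigma\sqrt{n}}} \right]. \]

(3) Due to the independence of the Xi , we can write:

\[ M_{Z_n}(t) = \prod_{i=1}^{n} E\left[ e^{\, t \, \frac{X_i - \mu}{\sigma\sqrt{n}}} \right]. \]

(4) Using the Taylor expansion centered at zero for e^x (up to second-order terms):

\[ e^{x} \approx 1 + x + \frac{x^2}{2}. \]

Thus,

\[ E\left[ e^{\, t \, \frac{X_i - \mu}{\sigma\sqrt{n}}} \right] \approx E\left[ 1 + t \, \frac{X_i - \mu}{\sigma\sqrt{n}} + \frac{t^2 (X_i - \mu)^2}{2\sigma^2 n} \right]. \]

(5) Calculating the expectations:

\[ E\left[ t \, \frac{X_i - \mu}{\sigma\sqrt{n}} \right] = 0 \]

and

\[ E\left[ \frac{t^2 (X_i - \mu)^2}{2\sigma^2 n} \right] = \frac{t^2}{2n} \]

(because E[(Xi − µ)²] = σ²).
(6) Plugging these results into the MGF expansion, and using that the n factors are equal because the Xi are identically distributed:

\[ M_{Z_n}(t) \approx \left( 1 + \frac{t^2}{2n} \right)^{n}. \]

(7) As n → ∞, the above term converges to $e^{t^2/2}$, which is the MGF of a standard normal distribution. The neglected Taylor remainder in each factor is of smaller order than 1/n, so it does not affect the limit.
(8) Since MGFs determine the distribution (when they exist in an interval containing the point t = 0), we
conclude that Zn converges in distribution to a standard normal.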
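
The limit used in step (7) is easy to check numerically. The sketch below (function names are illustrative) evaluates the approximate MGF from step (6) for increasing n and compares it with the MGF of the standard normal distribution.

```python
from math import exp

def mgf_approximation(t, n):
    """The expression (1 + t^2 / (2n))^n from step (6)."""
    return (1.0 + t * t / (2.0 * n)) ** n

def normal_mgf(t):
    """MGF of the standard normal distribution, exp(t^2 / 2)."""
    return exp(t * t / 2.0)

if __name__ == "__main__":
    t = 1.0
    for n in (10, 100, 10_000, 1_000_000):
        print(n, mgf_approximation(t, n), normal_mgf(t))
```

For fixed t the approximation increases with n towards exp(t²/2).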

Example 10.2. The Central Limit Theorem is often framed within the context of measure theory using
the Lebesgue integral to formalize expectations and variances. This becomes particularly pertinent when
dealing with random variables having more intricate domains or when working with distributions that aren’t
absolutely continuous with respect to the Lebesgue measure.
Consider a continuous random variable X with density function f (x) over the interval [0, 1], and f (x) is
Lebesgue-measurable. We can use the Lebesgue integral to compute the expectation and variance of X:
\[ \mu = E[X] = \int_0^1 x f(x) \, dx \]
\[ \sigma^2 = V[X] = \int_0^1 (x - \mu)^2 f(x) \, dx \]

Now, consider the sum of n independent copies of X:

Sn = X1 + X2 + . . . + Xn

The Central Limit Theorem tells us that, for large n, the distribution of Sn is approximately normal
with mean nµ and variance nσ². The normalized variable is:
\[ Z_n = \frac{S_n - n\mu}{\sqrt{n\sigma^2}} \]

According to the Central Limit Theorem, Zn will approach a standard normal distribution as n goes to infinity.
The role of the Lebesgue integral here is foundational in ensuring that expectations and variances are well-
defined, and provides a more general framework than the Riemann integral for working with more complex
functions.
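
To make the example concrete, the two integrals can be approximated numerically for a specific density. The density f(x) = 2x used below is a hypothetical choice, not taken from the text, and the midpoint rule is a Riemann-type approximation; for a continuous density such as this one the Riemann and Lebesgue integrals coincide.

```python
from math import sqrt

def midpoint_integral(g, a, b, steps=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def density(x):
    """Hypothetical density on [0, 1] (an assumption for illustration): f(x) = 2x."""
    return 2.0 * x

# Expectation and variance, following the two integrals of the example
mu = midpoint_integral(lambda x: x * density(x), 0.0, 1.0)
sigma2 = midpoint_integral(lambda x: (x - mu) ** 2 * density(x), 0.0, 1.0)

def standardize(s_n, n):
    """Z_n = (S_n - n*mu) / sqrt(n*sigma^2) for an observed sum s_n."""
    return (s_n - n * mu) / sqrt(n * sigma2)

if __name__ == "__main__":
    print(mu, sigma2)
```

For this density the exact values are µ = 2/3 and σ² = 1/18, which the numerical approximations reproduce closely.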

10.4. Conclusions
Throughout this chapter, we delved deep into some of the fundamental theorems that underpin classical
statistics and probability theory. The Law of Large Numbers, both in its weak and strong forms, laid the
foundation for our understanding of how averages of random variables behave as we consider more and
more observations. Specifically, the Weak Law of Large Numbers assures us that, as the sample size grows,
the sample mean converges in probability to the true expected value. This is a powerful affirmation of the
intuition that, by collecting more data, we gain a clearer picture of the underlying truth.
The Central Limit Theorem complements this understanding by describing the distribution of the sum (or
average) of a large number of random variables, irrespective of the original distribution of the data. It is this
theorem that gives rise to the ubiquity of the normal distribution in many natural and social phenomena,
allowing us to make significant statistical inferences even with limited knowledge about the underlying
distribution.
In essence, these theorems together form the bedrock on which large parts of statistical analysis and
inferential methods are based. Their importance cannot be overstated, as they underpin the consistency
and asymptotic normality of many statistical estimators used in practice. As we continue our journey in the realm of
statistics and probability, these foundational concepts will guide our understanding and interpretation of
more advanced topics.

10.5. Exercises
Exercise 10.1. Consider a fair six-sided die. Let Xi be the random variable representing the outcome of the
i-th roll. If you roll the die 100 times, and the average of the outcomes is 3.5, verify the Law of Large
Numbers.

Exercise 10.2. A factory produces packages with a mean weight of 500 grams and a standard deviation of
50 grams. If a sample of 36 packages is selected at random, what is the probability that their average weight
is between 480 and 520 grams?

Exercise 10.3. Let X1 , X2 , . . . , Xn be independent and identically distributed (i.i.d.) random variables with
mean µ and variance σ 2 . If the sample mean is given by:

\[ \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i \]

Explain how the Central Limit Theorem (CLT) describes the distribution of $\bar{X}_n$ as n approaches infinity.

Exercise 10.4. Define the Weak Law of Large Numbers (WLLN) and explain its significance in the context
of sample averages converging to the expected value.
Chapter 11
SOLUTIONS

Solutions Chapter 1
Solution 1.1.

A ∪ B = {1, 2, 3, 4, 5}
A ∩ B = {2, 3}
A − B = {1}
B − A = {4, 5}

Solution 1.2. For F to be a σ -algebra, it must satisfy:


(1) ∅ ∈ F
(2) If A ∈ F , then $A^c$ ∈ F
(3) If Ai ∈ F for all i, then $\bigcup_i A_i$ ∈ F .

Given the set F , it fulfills the properties of a σ -algebra.

Solution 1.3.

P(∅) = 0
P({a}) = 0.4
P({b, c}) = 0.6
P(Ω) = 1

Solution 1.4. Borel sets in R are generated from open sets. The set E is an open set in R since it doesn’t
include its endpoints. Therefore, E is a Borel set.

Solution 1.5. To prove that F is a σ -algebra on {a, b}, we need to verify three properties:

(1) {a, b} is in F .
(2) If A is in F , then its complement Ac is also in F .
(3) For any countable collection {Ai } where each Ai is in F , the union $\bigcup_i A_i$ is also in F .


Let’s verify each property:

(1) {a, b} is clearly in F .


(2) Complements:
- Complement of ∅ is {a, b}, which is in F .
- Complement of {a} is {b}, which is in F .
- Complement of {b} is {a}, which is in F .
- Complement of {a, b} is ∅, which is in F .
(3) Unions: F contains every subset of {a, b}, so the union of any collection of sets in F is again a subset of {a, b} and therefore belongs to F .

Since all three properties are satisfied, we conclude that F is a σ -algebra on {a, b}.
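
For a finite family of sets, the three properties can also be checked mechanically. The sketch below is illustrative (the function name is hypothetical); for a finite family, closure under pairwise unions is enough to give closure under all countable unions.

```python
from itertools import combinations

def is_sigma_algebra(omega, family):
    """Check the three sigma-algebra properties for a finite family of subsets."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in family}
    if omega not in sets:  # (1) the whole space belongs to the family
        return False
    if any(omega - a not in sets for a in sets):  # (2) closed under complements
        return False
    # (3) closed under unions (pairwise closure suffices for a finite family)
    return all((a | b) in sets for a, b in combinations(sets, 2))

if __name__ == "__main__":
    power_set = [set(), {"a"}, {"b"}, {"a", "b"}]
    print(is_sigma_algebra({"a", "b"}, power_set))
    print(is_sigma_algebra({"a", "b"}, [set(), {"a"}, {"a", "b"}]))
```

The second family fails because the complement of {a} is missing.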

Solution 1.6. Using properties of measures and set operations:

(1) To find µ (A):


µ (A) = µ (A ∩ B) + µ (A ∩ Bc ) = 2 + 4 = 6
(2) To find µ (B):
µ (B) = µ (A ∩ B) + µ (Ac ∩ B) = 2 + 3 = 5

Hence, µ (A) = 6 and µ (B) = 5.



Solutions Chapter 2
Solution 2.1. Yes, a measurable space is defined as a set Ω together with a σ -algebra F . Hence, (Ω, F ) is
a measurable space.

Solution 2.2. Given additivity, µ([0, 1]) = µ([0, 1/2]) + µ([1/2, 1]). Since µ([0, 1]) = 1, and both intervals are
of equal length, µ([0, 1/2]) = 1/2.

Solution 2.3. Consider the σ -algebra A = {∅, R, {1}, R \ {1}}. Assign a measure m on A such that m(R) = 1.

Solution 2.4. One possible assignment is:


m(∅) = 0
m({1}) = 0
m(R \ {1}) = 1
m(R) = 1

Solution 2.5. Yes, E is countable and each point has an “interval” of measure 0. The total measure of E is
the sum of measures of these points, which is 0.

Solution 2.6. The Lebesgue measure of the interval [a, b] is its length, which is b − a.

Solution 2.7. Measurable spaces and σ -algebras provide a foundational framework for defining events and
their associated probabilities in a consistent and rigorous manner. They allow us to define probability
measures and ensure that various operations, like union or intersection of events, are well-defined.

Solution 2.8. Given that µ is a measure on M , we know:

(1) µ (E) ≥ 0 for every E ∈ M .


(2) µ (∅) = 0.
(3) µ is countably additive.

Now, considering A and Ac which are disjoint sets, their union is the whole space X. Thus,

µ (X) = µ (A ∪ Ac )

Using the additivity property of the measure,

µ (A ∪ Ac ) = µ (A) + µ (Ac )

Substituting in the above expression,

µ (X) = µ (A) + µ (Ac )

Rearranging (which is legitimate when µ (A) is finite) gives:

µ (Ac ) = µ (X) − µ (A)



Solution 2.9. The concept of “size” or “length” of a set is intuitive for simple geometric shapes but becomes
ambiguous for more intricate sets. In the realm of mathematics, this ambiguity needs a precise definition,
especially when we deal with subsets of R.

(1) σ -Algebras: These provide a structured framework that ensures the sets we consider can be combined in
ways (unions, intersections, complements) that still produce sets within the framework. This consistency
is crucial.
(2) Measure: A measure generalizes the idea of “size” or “length” to any set within the σ -algebra. It assigns
a non-negative real number to each set, indicating its size in a consistent way.
(3) Lebesgue Measure: While there are various measures, the Lebesgue measure, λ , is particularly significant
for subsets of R. It extends the intuitive notion of length for intervals to more complex sets. For an
interval (a, b), λ ((a, b)) = b − a, which matches our intuitive understanding. Additionally, Lebesgue’s
approach allows for measuring the “size” of sets that other measures can’t handle, making it essential
for integration theory and real analysis.

Defining a measure, especially the Lebesgue measure, on a σ -algebra allows mathematicians to standardize
and extend the concept of “size” or “length” to intricate subsets of R, providing a foundation for further
mathematical exploration.

Solutions for Chapter 3


Solution 3.1. The function f is measurable since the inverse images of Borel sets under f are also Borel
sets.

Solution 3.2. Yes, the sum of two measurable functions is also measurable. Here, h(x) = x + x² is measurable.

Solution 3.3. Yes, A is a half-open interval and is a Borel set, so A ∈ B.

Solution 3.4. Yes, the Heaviside step function is measurable since the inverse images of Borel sets under H
are also Borel sets.

Solution 3.5. The Borel σ -algebra, denoted B(R), is defined as the smallest σ -algebra containing all the
open sets in R.
To prove the statement, we must show:

(1) B(R) contains O, and
(2) If F is any σ -algebra containing O, then F contains B(R).

For (1): by definition, B(R) contains all open sets, so O ⊆ B(R). For (2): assume F is a σ -algebra containing O.
Since B(R) is the smallest σ -algebra containing all open sets, we must have B(R) ⊆ F .

Thus, B(R) is indeed the smallest σ -algebra containing O.

Solution 3.6. To show that c f is measurable, we need to prove that for every open set O in R, the pre-image
under c f , i.e., $(cf)^{-1}(O)$, is a measurable set.
Let O be an open set in R. We have:

\[ (cf)^{-1}(O) = \{x \in \mathbb{R} : c f(x) \in O\} \]

If c > 0:

\[ (cf)^{-1}(O) = \{x : f(x) \in O/c\}, \qquad O/c = \{y/c : y \in O\} \]

The set O/c is open because multiplication by 1/c is a homeomorphism of R. Since f is measurable, the pre-image of the open set O/c under f is measurable. Thus, $(cf)^{-1}(O)$ is
measurable.
If c < 0:

\[ (cf)^{-1}(O) = \{x : f(x) \in O/c\} \]

Dividing by a negative constant reflects and rescales O, so O/c is again open, and since f is measurable its pre-image under f is measurable. Therefore,
$(cf)^{-1}(O)$ is measurable.
In the case c = 0, c f is the constant function 0, which is trivially measurable.
Hence, in all cases, c f is a measurable function.

Solutions for Chapter 4


Solution 4.1. The set of rationals has measure zero in [0,1], hence
\[ \int_0^1 \chi_{\mathbb{Q}} \, d\mu = 0 \]

Solution 4.2.
\[ \int_0^1 x \, d\mu = \frac{1}{2} \]

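Solution 4.3. The function f is not Riemann integrable on [0,1] because its set of discontinuities is not of
measure zero: an indicator function of the rationals is discontinuous at every point of [0,1], so the set of
discontinuities is the whole interval (the rationals themselves do have measure zero).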

Solution 4.4. For this continuous function, the Riemann and Lebesgue integrals are the same:
\[ \int_0^1 x^2 \, dx = \int_0^1 x^2 \, d\mu = \frac{1}{3} \]
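
The value 1/3 can be bracketed by lower and upper sums on a uniform partition. This is a small illustrative sketch (the function name is hypothetical); it relies on x² being increasing on [0, 1], so that the infimum and supremum on each subinterval sit at its endpoints.

```python
def darboux_sums(n):
    """Lower and upper sums of f(x) = x^2 on [0, 1] with n equal subintervals."""
    width = 1.0 / n
    # x^2 is increasing on [0, 1]: infimum at the left endpoint, supremum at the right
    lower = sum((i * width) ** 2 for i in range(n)) * width
    upper = sum(((i + 1) * width) ** 2 for i in range(n)) * width
    return lower, upper

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, darboux_sums(n))
```

The two sums differ by exactly 1/n, so they squeeze the integral as the partition is refined.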

Solution 4.5. No, a function must be Lebesgue measurable to be Lebesgue integrable. There are functions
that are bounded on [a,b] but are not Lebesgue measurable.

Solution 4.6. The Riemann integral of a bounded function f : [a, b] → R is defined as:
\[ \int_a^b f(x) \, dx = \lim_{\|P\| \to 0} \sum_{i=1}^{n} f(x_i^*)(x_i - x_{i-1}) \]
where the limit is taken over all partitions P of the interval [a, b] and xi∗ is any point in the subinterval
[xi−1 , xi ].
An example of a function that is not Riemann integrable on a closed interval is the Dirichlet function defined
by:
\[ D(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases} \]
on the interval [0, 1].

The reason is that for any partition P of the interval [0, 1], each subinterval contains both rational and
irrational numbers. Thus, the supremum and infimum of D(x) on each subinterval is 1 and 0, respectively.
As a result, the upper and lower sums of D(x) over the interval [0, 1] will always differ by a positive amount,
making the difference between the upper and lower integral non-zero. Therefore, the function D(x) is not
Riemann integrable on [0, 1].

Solution 4.7. A function f : X → R is said to be measurable with respect to the σ -algebra M if for every
open set O in R, the inverse image f −1 (O) is in M . In other words:

f −1 (O) ∈ M for all open sets O ⊂ R.



Limitation of the Lebesgue Integral: One limitation of the Lebesgue integral, especially when compared to
the improper Riemann integral, is that it requires absolute integrability: f is Lebesgue integrable only if the
integral of |f| is finite. For instance, sin(x)/x has a convergent improper Riemann integral over [0, ∞), but it
is not Lebesgue integrable there because the integral of |sin(x)/x| diverges. Irregular behavior on its own is
not an obstacle: the Dirichlet function mentioned earlier is not Riemann integrable over any interval, yet it is
Lebesgue integrable over [0, 1] because the set of rational numbers (though dense in [0, 1]) has Lebesgue
measure zero. A second limitation is that the integral is defined only for measurable functions, so a
non-measurable function cannot be assigned a Lebesgue integral at all.

Solutions for Chapter 5


Solution 5.1. Firstly, note that ( fn ) is a sequence of measurable functions and fn+1 (x) ≤ fn (x) for all n and
x ∈ [0, 1].
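Because the sequence is decreasing, the Monotone Convergence Theorem applies in its version for decreasing
sequences, which requires f1 to be integrable (equivalently, if the fn are nonnegative, one may use the
Dominated Convergence Theorem with dominating function f1 ). We have
\[ \lim_{n \to \infty} \int_0^1 f_n(x) \, dx = \int_0^1 \lim_{n \to \infty} f_n(x) \, dx. \]

Since limn→∞ fn (x) = 0 for x ∈ [0, 1) and limn→∞ fn (1) = 1, the limit function vanishes except at the single
point 1, which has Lebesgue measure zero, and we find

\[ \int_0^1 \lim_{n \to \infty} f_n(x) \, dx = 0, \]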

which gives the required result.

Solution 5.2. The sequence (gn ) is a sequence of measurable functions. Also, for all x ∈ [0, 1] and n ≥ 1,
|gn (x)| ≤ (1/n) × n = 1. Thus, the sequence (gn ) is dominated by the function h(x) = 1, which is integrable over
[0, 1].
By the Dominated Convergence Theorem, we have
\[ \lim_{n \to \infty} \int_0^1 g_n(x) \, dx = \int_0^1 \lim_{n \to \infty} g_n(x) \, dx. \]

Since limn→∞ gn (x) = 0 for all x, it follows that

\[ \int_0^1 \lim_{n \to \infty} g_n(x) \, dx = 0, \]

which gives the required result.

Solution 5.3. (i) For the Dominated Convergence Theorem, we need a dominating function g such that
| fn (x)| ≤ g(x) for all n and g is integrable. Let g(x) = 1/x². We have:

\[ |f_n(x)| = \left| \frac{\sin(nx)}{n^2} \right| \le \frac{1}{n^2} = g(x) \]

Now, $\int_{\mathbb{R}} g(x) \, dx$ converges, thus by the Dominated Convergence Theorem:

\[ \lim_{n \to \infty} \int_{\mathbb{R}} f_n(x) \, dx = \int_{\mathbb{R}} 0 \, dx = 0 \]

(ii) Using Fatou’s Lemma:

\[ \int_{\mathbb{R}} \liminf_{n \to \infty} f_n(x) \, dx \le \liminf_{n \to \infty} \int_{\mathbb{R}} f_n(x) \, dx \]

Given the result from part (i), the right side is 0. Hence:
\[ \int_{\mathbb{R}} \liminf_{n \to \infty} f_n(x) \, dx \le 0 \]

Solution 5.4. (i) The set of rational numbers Q in [0,1] is countable, and hence has Lebesgue measure zero.
Therefore:
\[ \int_{[0,1]} f \, d\mu = \int_{[0,1]} 1 \cdot \chi_{\mathbb{Q}} \, d\mu + \int_{[0,1]} 0 \cdot \chi_{\mathbb{Q}^c} \, d\mu = 0 \]

Where χ denotes the characteristic function.


(ii) For any partition P of [0,1], every subinterval contains both rational and irrational numbers. Hence, the
infimum of f in any subinterval of P is 0 and the supremum is 1. This means the lower sum L( f , P) = 0 and
the upper sum U( f , P) = 1 for any partition P. Since L( f , P) and U( f , P) do not converge to the same value
as the mesh of the partition goes to zero, f is not Riemann integrable on [0,1].

Solutions for Chapter 6


Solution 6.1. The Lebesgue measure λ2 (R) of a rectangle in R2 is given by the formula:

λ2 (R) = (b − a) × (d − c)

where R = {(x, y) : a ≤ x ≤ b, c ≤ y ≤ d}.

For the given rectangle R:


λ2 (R) = (2 − 1) × (4 − 2) = 1 × 2 = 2

Solution 6.2. The product measure µ × ν of the set A can be calculated as follows:

µ × ν (A) = µ ({1}) × ν ({a}) + µ ({2}) × ν ({b})

µ × ν (A) = 2 × 4 + 3 × 5 = 8 + 15 = 23
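
The same computation can be written as a short function for finite sets. This is a sketch under assumptions: the point masses and the set A = {(1, a), (2, b)} are read off from the formula above rather than from the exercise statement, and the names are illustrative.

```python
def product_measure(pairs, mu, nu):
    """(mu x nu)(A) for a finite set A of pairs, given point masses mu and nu."""
    return sum(mu[x] * nu[y] for x, y in pairs)

# Point masses assumed from the worked solution: mu({1}) = 2, mu({2}) = 3,
# nu({a}) = 4, nu({b}) = 5, and A = {(1, a), (2, b)}.
mu = {1: 2, 2: 3}
nu = {"a": 4, "b": 5}
A = {(1, "a"), (2, "b")}

if __name__ == "__main__":
    print(product_measure(A, mu, nu))  # 2*4 + 3*5 = 23
```

Each pair contributes the product of its two point masses, and additivity sums the contributions.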

Solution 6.3. The measure in R2 , often called area, is an extension of the concept of length in R. Similarly,
the measure in R3 , often called volume, is an extension of the concept of area in R2 .
The unit square A in R2 has an “area” of 1 × 1 = 1. In contrast, the unit cube B in R3 has a “volume” of
1 × 1 × 1 = 1.
An intuitive analogy is that if you were to paint the unit square, you would measure the amount of paint
needed in terms of square units (e.g., square meters), while for the unit cube, you would measure the amount
of substance (maybe water) it can hold in cubic units (e.g., cubic meters).
Thus, as we move from R2 to R3 , our measure extends from considering 2D areas to considering 3D volumes.

Solution 6.4. For the rectangle A, the Lebesgue measure (or area) in R2 can be computed as the product of
the lengths of its sides. Specifically:

m(A) = (b1 − a1 ) × (b2 − a2 )

This formula can be understood in terms of the product of measures in R. Consider the intervals I1 = [a1 , b1 ]
and I2 = [a2 , b2 ] in R. Their Lebesgue measures (or lengths) are m(I1 ) = b1 − a1 and m(I2 ) = b2 − a2 ,
respectively. The measure of the rectangle A in R2 can be seen as the product of the measures of I1 and I2 :

m(A) = m(I1 ) × m(I2 )

This showcases how the measure of a product space can be seen as the product of the measures of the
constituent spaces, a foundational concept in measure theory when generalizing to higher dimensions.

Solutions for Chapter 7


Solution 7.1.

µ (X) = µ ({1}) + µ ({2, 3}) + µ ({4})


= 1+2+1
= 4.

Solution 7.2. Since [1, 3] is a closed interval of length 2:

λ ([1, 3]) = 2.

Solution 7.3. By additivity of the measure:

ν ([0, 2]) = ν ([0, 1]) + ν ([1, 2])


= 3+2
= 5.

Solution 7.4. For any set containing 0, the Dirac measure at 0 is 1. Therefore, for any ε > 0:

δ0 ([−ε , ε ]) = 1.

Solution 7.5. To show that µ is a Borel measure on B(X), we need to prove that it is defined on the Borel
σ -algebra B(X).
By definition, µ is defined on B(X). Additionally, the measure of compact sets is finite. In general
topological spaces, compact sets play the role analogous to closed and bounded sets in Rn . Therefore, since
µ assigns a finite value to compact sets, it respects the “boundedness” (or finiteness) in the topological space
X.
Consequently, µ is a Borel measure defined on B(X).

Solution 7.6. To show that δx0 is a Radon measure, we need to verify the following properties:

(1) δx0 is Borel measurable: This is given since δx0 is defined for every Borel set A ⊂ X.
(2) δx0 is locally finite: For any point x ∈ X, there exists an open neighborhood U such that δx0 (U) is finite.
This is clearly satisfied since δx0 (U) is either 0 or 1.
(3) δx0 is inner regular: For any open set U with x0 ∈ U, δx0 (U) = 1, and this is the supremum of measures
of compact subsets of U since any compact subset containing x0 will have measure 1 and otherwise 0.
(4) δx0 is outer regular: For any Borel set A, δx0 (A) must equal the infimum of δx0 (U) over open sets U ⊇ A.
If x0 ∈ A, every open set U ⊇ A contains x0 , so δx0 (U) = 1 = δx0 (A). If x0 ∉ A and points of X are closed
(for example, when X is Hausdorff), then U = X \ {x0 } is an open set containing A with δx0 (U) = 0 = δx0 (A).

Since all properties are met, δx0 is a Radon measure on X.



Solutions for Chapter 8


Solution 8.1. No, we cannot. The Vitali set is a classic example of a non-measurable set. If we were to assign
a positive measure m > 0 to each subset, the total measure would be infinite. If we assigned a measure of 0,
the total measure would be 0. In both cases, we get a paradox because neither result matches the measure of
the interval [0, 1].
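Solution 8.2. A Vitali set contains exactly one representative from each equivalence class of [0, 1] under the
relation x ∼ y if x − y is rational. Its translates by the rational numbers in [−1, 1] are pairwise disjoint,
cover [0, 1], and stay inside [−1, 2]. Translation invariance would force all these translates to have the same
measure, and no value works: zero gives a union of measure 0, while any positive value gives an infinite total.
Neither is compatible with a union whose measure lies between 1 and 3, so the set is deemed non-measurable.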
Solution 8.3. Non-measurable sets serve as counterexamples and indicate the boundaries of our measuring
capabilities. They illustrate the limitations of the Lebesgue measure and emphasize the necessity of certain
conditions, like σ -algebra, to guarantee the existence of a measure.
Solution 8.4. If we attempt to integrate over a non-measurable set, we can’t ensure that the integral exists or
is well-defined. Since integration is closely tied to the notion of measuring “size” (i.e., Lebesgue measure),
integrating over non-measurable sets lacks meaning or consistent interpretation within our framework of
integration.
Solution 8.5. The Vitali set is a classic example of a non-measurable subset of the real numbers. To define
the Vitali set, consider the unit interval [0, 1). Two numbers x and y in this interval are said to be equivalent
(denoted x ∼ y) if their difference x − y is rational. Essentially, this divides the interval into equivalence
classes.
A Vitali set is a collection of exactly one representative from each of these equivalence classes. More
formally, for each equivalence class C, we choose one element xC from C and put it in our Vitali set V .
Therefore, V contains exactly one representative from every equivalence class.
The significance of the Vitali set in the context of Lebesgue measure is profound: V is not Lebesgue
measurable. This means that it’s not possible to assign a “length” to the Vitali set in a way that’s consistent
with the way we assign lengths to intervals.
The existence of such non-measurable sets challenges our intuition about “size” or “length” and underscores
the intricacies and subtleties of measure theory. In particular, the Vitali set showcases the limitations of the
Lebesgue measure and the necessity of having a clear and rigorous definition of measurability.
Solution 8.6. The Banach-Tarski paradox is a result in set-theoretic geometry. It states that it’s possible to
decompose a solid ball in R3 into a finite number of non-overlapping subsets, which can then be rearranged
using only rotations and translations to result in two solid balls, each the same size as the original. This
astonishing claim seems to defy our intuition about volume.
The paradox relies heavily on the Axiom of Choice, a controversial and non-constructive principle in set
theory. The sets used in the decomposition of the ball are non-measurable, meaning they cannot be assigned
a consistent “volume” in the context of Lebesgue measure.
The implications of the Banach-Tarski paradox on integration are significant. If we were to attempt to
integrate over non-measurable sets, such as those involved in the Banach-Tarski decomposition, we would
run into insurmountable problems. The very notion of “volume” becomes ill-defined for these sets, making
integration (which, in a geometric context, can be thought of as finding “volume”) impossible or at least
ill-defined. It reinforces the importance of working within measurable spaces when performing integration
and highlights the complexities and potential pitfalls when venturing outside of these spaces.

Solutions for Chapter 9


Solution 9.1. The probability of the event A is:

P(A) = P({H}) = 0.5

Solution 9.2. The expectation of X is given by:

E[X] = 1(0.2) + 2(0.5) + 3(0.3) = 0.2 + 1 + 0.9 = 2.1

Solution 9.3. For any ω ∈ [0, 1], as n approaches infinity, Xn (ω ) approaches 0, which means Xn converges
to X = 0 almost surely.
To verify the conditions of Lebesgue’s Dominated Convergence Theorem: (i) Xn converges to X almost
surely. (ii) There exists a function g such that |Xn (ω )|≤ g(ω ) for all n and ω , and E[g] < ∞.
Taking g(ω ) = 1, we see that |Xn (ω )| ≤ 1 for all n and ω . Also, $E[g] = \int_0^1 g(\omega) \, d\omega = 1 < \infty$.
Thus, the conditions of Lebesgue’s Dominated Convergence Theorem are satisfied.

Solution 9.4. If f(x) = 0 for almost every x with respect to the Lebesgue measure, then there exists a
set N ⊂ R with Lebesgue measure zero (i.e., λ(N) = 0) such that f(x) ≠ 0 only for x ∈ N.
Since the value of the Lebesgue integral is not affected by the function’s values on a set of measure
zero, we can define a function g : R → R such that g(x) = f(x) for x ∉ N and g(x) = 0 for x ∈ N.
Because f vanishes outside N, g(x) = 0 everywhere on R. So,
$$\int_{\mathbb{R}} g \, d\lambda = 0.$$
And since f and g are equal almost everywhere, their integrals over R are the same:
$$\int_{\mathbb{R}} f \, d\lambda = \int_{\mathbb{R}} g \, d\lambda = 0.$$

Solution 9.5. For a Poisson distributed random variable X with parameter λ, its mean µ is λ and its variance
σ² is λ.
For integer λ, X has the same distribution as a sum of λ independent Poisson random variables with parameter 1,
so the Central Limit Theorem applies. For large λ, the standardized version of X given by:

$$Y = \frac{X - \mu}{\sigma} = \frac{X - \lambda}{\sqrt{\lambda}}$$

will be approximately normally distributed with mean 0 and variance 1.

This means that, as λ tends to infinity, Y converges in distribution to a standard normal random variable Z.
Therefore, the distribution of $\frac{X - \lambda}{\sqrt{\lambda}}$ tends to the distribution of Z as λ tends to infinity.
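
The convergence in Solution 9.5 can be observed numerically. The following is a minimal sketch, not part of the original solution: it compares the exact distribution function of the standardized Poisson variable with the standard normal distribution function at a few test points. The function names, the test points and the values of λ are illustrative assumptions.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); the pmf is evaluated in log space."""
    if k < 0:
        return 0.0
    return sum(
        math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
        for i in range(k + 1)
    )


def std_normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def max_cdf_gap(lam, zs=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    """Largest |P(Y <= z) - Phi(z)| over the points zs, Y = (X - lam)/sqrt(lam)."""
    return max(
        abs(poisson_cdf(math.floor(lam + z * math.sqrt(lam)), lam)
            - std_normal_cdf(z))
        for z in zs
    )


# Arbitrary illustrative values of the parameter.
for lam in (10, 100, 1000, 10000):
    print(lam, max_cdf_gap(lam))
```

The event Y ≤ z is the same as X ≤ λ + z√λ, and since X is integer valued the bound is rounded down. The printed gaps should shrink as λ grows, which is what convergence in distribution predicts at these points.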

Solutions for Chapter 10


Solution 10.1. The expected value of a single roll of the die is:
$$E(X_i) = \frac{1}{6} \times 1 + \frac{1}{6} \times 2 + \dots + \frac{1}{6} \times 6 = \frac{21}{6} = 3.5$$
Given that the average of 100 rolls is 3.5, this is consistent with the Law of Large Numbers, which states
that the average of the outcomes should approach the expected value as the number of trials increases.
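
A short simulation illustrates the statement in Solution 10.1. This is a sketch under assumptions: the die is simulated with Python's pseudo-random generator, the seed and the sample sizes are arbitrary, and the helper name `average_of_rolls` is hypothetical.

```python
import random


def average_of_rolls(n_rolls, rng):
    """Average of n_rolls simulated rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls


# Expected value of one roll: (1 + 2 + ... + 6) / 6 = 21/6
expected_value = sum(range(1, 7)) / 6

rng = random.Random(0)  # fixed seed (arbitrary) so the run is reproducible
for n in (100, 10_000, 100_000):
    print(n, average_of_rolls(n, rng))
```

Individual runs with 100 rolls can still land visibly away from 3.5; the law only says that large deviations become less likely as the number of rolls grows.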

Solution 10.2. By the Central Limit Theorem, the sampling distribution of the sample mean x̄ is approximately
normal. Its mean µ is the same as the population mean, and its standard deviation (the standard error) σx̄ is:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{50}{\sqrt{36}} \approx 8.33$$

Using the z-score formula:

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$

For x̄ = 480, $z_1 = \frac{480 - 500}{8.33} \approx -2.4$.
For x̄ = 520, $z_2 = \frac{520 - 500}{8.33} \approx 2.4$.
Using standard z-tables, P(−2.4 < z < 2.4) = 2Φ(2.4) − 1 ≈ 2(0.9918) − 1 = 0.9836, or 98.36%.
Hence, there is approximately a 98.36% chance that the average weight of the sampled packages is between
480 and 520 grams.
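
The arithmetic in Solution 10.2 can be checked without a z-table. The sketch below evaluates the standard normal distribution function through the error function; the values µ = 500, σ = 50 and n = 36 are those used in the solution, while the variable and function names are illustrative.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mu, sigma, n = 500.0, 50.0, 36  # values used in Solution 10.2

std_error = sigma / math.sqrt(n)      # standard error of the sample mean
z_low = (480.0 - mu) / std_error      # z-score of the lower limit
z_high = (520.0 - mu) / std_error     # z-score of the upper limit
prob = std_normal_cdf(z_high) - std_normal_cdf(z_low)

print(round(std_error, 2), round(z_low, 2), round(z_high, 2), round(prob, 4))
```

Because the interval is symmetric about µ, the same probability is 2Φ(z_high) − 1, which is a convenient cross-check against a printed table.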

Solution 10.3. The Central Limit Theorem (CLT) states that, given the conditions in the exercise, as n
becomes large (i.e., as n → ∞), the distribution of the standardized sample mean:

$$Z_n = \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}}$$

converges in distribution to the standard normal distribution.

Mathematically, this is expressed as:

$$Z_n \xrightarrow{d} Z$$

where Z is a standard normal random variable (with mean 0 and variance 1).

This means that, for large n, the sample mean $\bar{X}_n$ has approximately a normal distribution with mean µ and
variance σ²/n, irrespective of the distribution of the Xi variables, as long as µ and σ² are finite.

Solution 10.4. The Weak Law of Large Numbers (WLLN) states that if X1, X2, . . . are independent and
identically distributed (i.i.d.) random variables with finite mean µ, then the sample average:

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$

converges in probability to µ as n → ∞. Formally, for every ε > 0:

$$\lim_{n \to \infty} P(|\bar{X}_n - \mu| > \varepsilon) = 0$$

The significance of the WLLN is both theoretical and practical. It implies that, for any fixed tolerance ε,
the probability that the sample average differs from the expected value by more than ε shrinks to zero as the
sample size grows. Thus, by taking larger and larger samples, we obtain an empirical estimate that is, with
high probability, arbitrarily close to the true expected value. This law provides a solid foundation for the
practice of statistics, where sample averages are routinely used as estimators for population averages.
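
The convergence in probability described in Solution 10.4 can be estimated by simulation. This is a minimal sketch under stated assumptions: the Xi are taken to be Uniform(0, 1), so µ = 1/2 and the variance is 1/12; the tolerance ε = 0.05, the number of trials and the function names are arbitrary illustrative choices. The Chebyshev bound shown for comparison requires finite variance, which is stronger than the finite mean assumed by the WLLN itself.

```python
import random


def deviation_probability(n, eps, trials, rng):
    """Monte Carlo estimate of P(|mean of n Uniform(0,1) draws - 0.5| > eps)."""
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        if abs(sample_mean - 0.5) > eps:
            hits += 1
    return hits / trials


def chebyshev_bound(n, eps, variance=1.0 / 12.0):
    """Upper bound variance / (n * eps^2) on the same probability, capped at 1."""
    return min(1.0, variance / (n * eps * eps))


rng = random.Random(0)  # fixed seed (arbitrary) so the run is reproducible
eps = 0.05
for n in (10, 100, 1000):
    print(n, deviation_probability(n, eps, 2000, rng), chebyshev_bound(n, eps))
```

Both columns should decrease toward zero as n grows, with the simulated probability falling much faster than the Chebyshev bound, which is deliberately crude.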

References
[1] W. Rudin, Real and Complex Analysis, 3rd ed. New York, NY: McGraw-Hill, 1987.
[2] P. Billingsley, Probability and Measure, 3rd ed. Hoboken, NJ: Wiley, 2012.
[3] H. L. Royden, Real Analysis, 3rd ed. New York, NY, USA: Macmillan Publishing Company, 1988, a
classic text that provides a comprehensive introduction to measure and integration theory.
[4] J. L. Doob, Measure Theory, ser. Graduate Texts in Mathematics. New York, NY, USA: Springer,
1994, vol. 143, a foundational text covering various aspects of measure theory including measurable
spaces.
[5] T. Tao, An Introduction to Measure Theory. American Mathematical Soc., 2011.
[6] G. B. Folland, Real Analysis: Modern Techniques and Their Applications. John Wiley & Sons, 2013.
[7] P. R. Halmos, Measure Theory. New York, NY: Springer Science & Business Media, 2013.
[8] H. L. Royden and P. Fitzpatrick, Real Analysis, 4th ed. Upper Saddle River, NJ: Pearson, 2010.
[9] J. Elstrodt, Measure and Integration Theory. de Gruyter, 2006.
[10] E. H. Lieb and M. Loss, Analysis. American Mathematical Society, 2001.
[11] A. N. Kolmogorov and S. V. Fomin, Measure, Lebesgue Integrals, and Hilbert Space. Academic
Press, 1960.
[12] V. A. Zorich, A Course in Mathematical Analysis: Volume 2, Metric and Topological Spaces, Functions
of a Vector Variable. American Mathematical Society, 2014.
[13] G. Fubini, Sulla teoria della misura in uno spazio qualunque. Torino, Italy: Tipografia della R.
Accademia delle scienze, 1923, in Italian.
[14] C. E. Bonferroni, Lezioni di Calcolo delle Probabilità e Teoria delle Misure. Bologna, Italy:
Zanichelli, 1936, in Italian.
[15] E. Borel, Lecons sur la theorie des fonctions. Gauthier-Villars, 1905.
[16] M. Loeve, Integration: Mesure, Lebesgue Integral, Applications. North-Holland Publishing Co., 1973.
[17] G. Choquet, Théorie de la Mesure et de l’Intégration. Gauthier-Villars, 1962.
Index

Fσ sets, 12
Gδ sets, 12
σ-algebra, 4, 23

Advanced physics, 95
Axiom of choice, 120

Banach-Tarski sets, 119
Borel σ-algebra, 28, 46
Borel measures, 105
Borel sets, 15

Closed sets, 9
Compact sets, 11
Conditional convergence, 74
Control theory, 37, 50

Data analysis, 95
Difference of two sets, 3
Dominated convergence theorem, 85

Ergodic theory, 37, 47
Euclidean space, 112

Families of sets, 4
Fatou’s lemma, 89
Functional analysis, 37, 46

Geography and earth sciences, 37, 53

Hausdorff paradox sets, 118
Huffman coding, 48
Hybrid space, 121

Information theory, 37, 48, 112
Intersection of two sets, 2

Lebesgue integral, 59, 78, 128
Lebesgue measurable sets, 16
Lebesgue measure, 31
Limitations of the Lebesgue integral, 74

Mathematical biology, 37, 52
Mathematical economics, 37, 51
Mathematical exploration, 95
Mathematical physics, 37, 49
Measurable functions, 42
Measurable spaces, 23
Measure in multidimensional spaces, 94
Measure theory, 45
Measures in topological spaces, 104
Measuring sets over points, 72
Monotone convergence theorem, 78

Non-measurable functions, 44, 74
Null sets, 18

Open sets, 9

Paradoxes in non-measurable sets, 116
Preimages, 42
Probability space, 112
Probability spaces, 126
Product of measures, 95
Properties of a measure, 24

Radon measures, 107
Random variables, 127
Restrictions for using the Riemann integral, 71
Riemann, 80
Riemann integral, 59

Sets of undefined measure, 74
Simple sets, 14
Symmetric difference of sets, 3

Topological space, 104

Union of two sets, 2

Vitali sets, 117
