B. Ramdas Bhat - Modern Probability Theory - An Introductory Textbook-Wiley (1985)

Modern Probability Theory
An Introductory Textbook
(SECOND EDITION)

B. Ramdas Bhat
Professor of Statistics
Karnatak University
Dharwad

WILEY EASTERN LIMITED
New Delhi, Bangalore, Bombay, Calcutta, Madras, Hyderabad

Copyright © 1981, WILEY EASTERN LIMITED
Second Edition, 1985
First Reprint, 1986

WILEY EASTERN LIMITED
4895/24, Ansari Road, Daryaganj, New Delhi-110002
4854/21, Daryaganj, New Delhi-110002
6, Shri B. P. Wadia Road, Basavangudi, Bangalore-560004
Abid House, Dr. Bhadkamkar Marg, Bombay-400007
40/8, Ballygunge Circular Road, Calcutta-700019
Post Box No. 8804, Thiruvanmiyur, Madras-600041
Post Box No. 1050, Himayath Nagar P.O., Hyderabad-500029

This book or any part thereof may not be reproduced in any form without the written permission of the publisher.

This book is not to be sold outside the country to which it is consigned by Wiley Eastern Limited.

Published by Mohinder Singh Sejwal for Wiley Eastern Limited, 4895/24, Ansari Road, Daryaganj, New Delhi-110002 and printed by Sunil Dutt at Gopsons Papers Pvt. Ltd., A-28, Sector IX, NOIDA.

Preface to the Second Edition

The second revised edition of the book incorporates many suggestions made by my friends and colleagues. Numerous errors of the earlier edition have been weeded out. A number of problems have been added at the end of different chapters. Chapters 10 and 11 on Laws of Large Numbers and Central Limit Theorem have been enlarged and completely rewritten.

I gratefully acknowledge the suggestions and corrections received from the large number of teachers and students using the book, in particular from Profs. M.S. Prasad, M.S. Chikkegoudar, Dr. N.R. Mohan and others. I am looking forward to receiving similar suggestions from the readers in future also.

Dharwad, May 1985
B. Ramdas Bhat

Preface to the First Edition

While teaching the subject of probability theory to post-graduate students of statistics I could find books dealing with it either from a very low mathematical level or from a highly sophisticated mathematical level. Books written for students with only the knowledge of graduate level mathematics are hard to find. To fill this gap the present book is written. It forms the contents of a one-year course in probability theory which the author has taught for a number of years. The background needed is a good knowledge of mathematical analysis including Riemann integration and some familiarity with probability theory. Even though the book does not assume any previous knowledge of measure theory and develops probability theory in a self-contained manner, those who have previous knowledge of measure theory can cover the contents of the book quickly, probably in about half the time, i.e., in one semester.

Chapter 1 of the book introduces set theory and defines an event as a set of outcomes of an experiment. In this chapter the set-theoretical background is provided for defining probability on a class of events, which is done in Chapter 3. Chapter 2 discusses properties of another important concept, viz., the random variable as a measurable function. A salient feature of the book is to distinguish the "definition" of probability from the "method" of its calculation. The axiomatic definition is given and its properties as a finite measure are studied. Distribution functions of r.v.'s and general distributions are introduced in Chapter 4. These play an important role in the study of statistics. Distribution functions which are mixtures of discrete and continuous distributions are highlighted in the book. Expectations are defined similarly to the Lebesgue-Stieltjes integral. Chapter 5 is self-contained and does not assume any previous knowledge of the theory of Lebesgue integration.
Chapter 6 is concerned not only with the definitions of the various modes of convergence of a sequence of r.v.'s, viz., convergence in probability, convergence almost surely, convergence in rth mean, and convergence in distribution, but also discusses their interrelationships. There is a section on the interchange of limits and expectations. Double and iterated integrals are defined and the Fubini theorem is stated in the last section. The characteristic function is an important tool of probability analysis. Some basic properties which are used in later chapters, particularly in Chapters 8 and 11, are derived in Chapter 7. Weak convergence of distributions, which is related to weak convergence of probability measures, is an important concept amenable to sophisticated mathematical analysis. This is touched on briefly in Chapter 8.

The concept of independence occurring in probability theory does not have a parallel in measure theory. How this condition of independence affects the probability measure defined on the probability space induced by a sequence of r.v.'s is discussed in Chapter 9. Convergence and stability properties of a series of independent r.v.'s and laws of large numbers are the topics of Chapter 10. Convergence to the normal law, the so-called central limit theorem, is derived for the univariate and p-variate cases in Chapter 11.

A special feature of the book, which makes it different from the usual texts, is the treatment of conditional probability. It is defined on the space of measurable subsets of a measurable set. Conditional expectation is defined and its properties studied in Chapter 12. As an illustration, simple properties of a martingale are deduced. A Markov chain with finite state space is analysed in the last chapter, Chapter 13, as an introduction to the study of Markov processes. The latter has been a subject of intense research activity during the last three decades and has been analysed using highly sophisticated mathematical tools.
If the student's interest is kindled to take up further study in this area, the purpose of writing this chapter is fulfilled.

A notable feature of this book is the collection of theoretical and numerical problems given at the end of each chapter. Complementary topics not covered in the chapter are also mentioned here. A few of these have been taken from the question papers set at different university examinations. The author gratefully acknowledges his indebtedness to these universities. Phrases such as 'how', 'why', 'prove' have been inserted at different places to guard the inexperienced student against assuming a few statements to be true intuitively, rather than deriving the missing steps logically. Emphasis, throughout the book, is on rigorous derivation of various results, starting from the definitions and assumptions. It is hoped that such derivations would train a student in analytical and logical thinking.

I don't propose to list the books and the authors whose work inspired me to write the present work, because such a list is bound to be incomplete. My teachers at Berkeley, particularly Prof. M. Loève, are responsible for rousing my interest in the subject of probability theory. If some readers feel that the present book is an enlarged version of Loève's classical treatise, there is some justification for it.

I am indebted to the University Grants Commission for providing financial assistance to write this book. I am indebted to my friends, colleagues, and students for their valuable criticism and comments.

Dharwad, December 1980
B. Ramdas Bhat
List of Abbreviations

$\in$    element of
$\notin$    not an element of
$\Rightarrow$    logical implication
$\Leftrightarrow$    logical equivalence
$\ni$    such that
$\exists$    there exists
$\forall$    each, all or every
$\therefore$    therefore
$\because$    since
(*)    end of the proof
…    convergence in probability
…    convergence almost surely
…    convergence in law (or distribution)
…    convergence in rth mean
…    weak convergence
…    complete convergence
a.a.    almost all
$A, B, C$, etc.    sets
$\mathcal{A}, \mathcal{B}, \mathcal{C}$, etc.    classes
$A^c$    complement of $A$
a.e.    almost everywhere
a.s.    almost surely
$\mathcal{B}^k$    Borel field in $R^k$
c.f.    characteristic function
d.f.    distribution function
iff    if and only if
$I_A$    indicator function of $A$
i.d.    infinitely divisible
i.i.d.    independent identically distributed
M.C.    Markov chain
m.g.f.    moment generating function
$R^k$    k-dimensional real space
r.v.    random variable (or vector)
$\cup$    union
$\cap$    intersection
u.a.c.    up to additive constant
w.r.t.    with respect to
$X^{-1}$    inverse of $X$
LHS    left hand side
RHS    right hand side
…    communicating states of a Markov chain

Contents

Preface to the Second Edition
Preface to the First Edition
List of Abbreviations

1. Sets and Classes of Events
   1.1 The Event
   1.2 Algebra of Sets
   1.3 Fields and σ-Fields
   1.4 Class of Events
   Complements and Problems

2. Random Variables
   2.1 Functions and Inverse Functions
   2.2 Random Variables
   2.3 Limits of Random Variables
   Complements and Problems

3. Probability Space
   3.1 Definition of Probability
   3.2 Some Simple Properties
   3.3 Discrete Probability Space
   3.4 General Probability Space
   3.5 Induced Probability Space
   3.6 Other Measures
   Complements and Problems

4. Distribution Functions
   4.1 Distribution Function of a Random Variable
   4.2 Decomposition of D.F.'s
   4.3 Distribution Functions of Vector Random Variables
   4.4 Correspondence Theorem
   Complements and Problems

5. Expectation and Moments
   5.1 Definition of Expectation
   5.2 Properties of Expectation
   5.3 Moments, Inequalities
   Complements and Problems

6. Convergence of Random Variables
   6.1 Convergence in Probability
   6.2 Convergence Almost Surely
   6.3 Convergence in Distribution
   6.4 Convergence in rth Mean
   6.5 Convergence Theorems for Expectations
   6.6 Fubini's Theorem
   Complements and Problems

7. Characteristic Functions
   7.1 Definition and Simple Properties
   7.2 Some Simple Properties
   7.3 Inversion Formula
   7.4 Characteristic Functions and Moments
   7.5 Bochner's Theorem
   Complements and Problems

8. Convergence of Distribution Functions
   8.1 Weak Convergence
   8.2 Convergence of Distribution Functions and Characteristic Functions
   8.3 Convergence of Moments
   Complements and Problems

9. Independence
   9.1 Definition
   9.2 Multiplication Properties
   9.3 Zero-One Laws
   Complements and Problems

10. Laws of Large Numbers
   10.1 Convergence of a Series of Independent Random Variables
   10.2 Kolmogorov Inequalities and A.S. Convergence
   10.3 Stability of Independent R.V.'s
   Complements and Problems

11. Central Limit Theorem
   11.1 Introduction
   11.2 I.I.D. Case
   11.3 Variable Distributions
   11.4 Multi…

12. Conditioning
   12.1 Radon-Nikodym Theorem
   12.2 Conditional Expectation
   12.3 Properties of Conditional Expectation
   12.4 Martingales
   Complements and Problems

13. Finite Markov Ch…
   13.1 Definition
   13.2 Calculation of n-Step Transition Probabilities
   13.3 Recurrence
   13.4 Limiting Properties of Transition Probabilities
   Complements and Problems

Appendix A
Appendix B
Appendix C
Bibliography
Index

1. Sets and Classes of Events

1.1 THE EVENT

In everyday life we come across many phenomena, the nature of which cannot be predicted in advance, or many experiments, whose outcomes may not be known precisely. However, we may know that the outcome has to be one of several possibilities.
The weight of a newborn baby cannot be known before the birth, except that it may lie in a certain range. When a coin is tossed, we know that the outcome has to be either a head or a tail; but we do not know the outcome of a particular throw in advance.

By a statistical experiment, or simply experiment, we mean not only an experiment such as the tossing of a coin or the observation of the number of defects in a certain sample of $N$ items chosen from the daily production, in which the possible outcomes are finite, but also the observation of a phenomenon such as the weight of a newborn baby, or the weather condition of a certain region, where the number of possible outcomes is infinite. The concept of experiment is fundamental to the study of probability theory, because the theory is concerned with assigning chances to the outcomes of the statistical experiment, or to possible states of nature, and studying them.

Let us denote by $\omega$ the typical outcome of an experiment $E$. $\omega$ is called a sample point. The totality of all outcomes of $E$ will be denoted by $\Omega$ and is called the sample space. A collection of outcomes of $E$ in which we are interested will be called an event. Thus an event is a subset of $\Omega$. The number of sample points may be finite, countable or uncountable, as illustrated by the following examples.

Example 1.1. An industrial engineer may be interested in determining the proportion of defectives produced by a certain machine. He may choose $n$ items from the daily production and classify each one of them as defective (D) or not (G). The number of possible outcomes of this experiment is $2^n$. The typical outcome may be represented by an $n$-tuple consisting of D's and G's as (DDG...GD), which may be denoted by $\omega$. "The number of defectives in a sample of 5 does not exceed 2" is an event. It consists of the sample points (DDGGG), (DGDGG), (DGGDG), (DGGGD), (GDDGG), (GDGDG), (GDGGD), (GGDDG), (GGDGD), (GGGDD), (DGGGG), (GDGGG), (GGDGG), (GGGDG), (GGGGD), (GGGGG).

Example 1.2. A textile manufacturer may observe the number of faults in a yard of cloth chosen from the daily production. The possible outcome $\omega$ may be represented by a number 0, 1, 2, .... Thus $\Omega = \{0, 1, 2, \ldots\}$ contains a countably infinite number of points. "The number of faults per yard exceeds 3" is an event, containing the sample points 4, 5, 6, .... "The number of faults is 3" is an event containing only one sample point, 3.

Example 1.3. The managing director of a company may be interested in knowing the profit (or loss), $\omega$, which is likely to occur to his company in the coming year. Since $\omega$ may be positive or negative and may take any value, $\Omega$ may be taken to be $\{\omega : -\infty < \omega < \infty\}$. "Profit will exceed $a$" is an event consisting of all those $\omega$ which exceed $a$. It may be written as $\{\omega : \omega > a\}$. Similarly $\{\omega : a < \omega < b\}$ is an event consisting of all those $\omega$ which lie in the range $(a, b)$, representing the event that profit will lie in the range $(a, b)$.

Events will be denoted by Latin letters $A, B, C, \ldots$, etc. Many a time the events will consist of all those sample points $\omega$ which obey a certain relation, as is the case in Example 1.3. $\{\omega : \omega \text{ satisfying a relation } R\}$ is an event consisting of all those $\omega$'s which satisfy the relation $R$. The event consisting of a single outcome $\omega_1$ will be denoted by $\{\omega_1\}$. It is a singleton and is also called a simple event. If the event contains $\omega_1$ and $\omega_2$ it is denoted by $\{\omega_1, \omega_2\}$ and is a doubleton. For convenience, we shall consider the subset of $\Omega$ containing no point, viz., the empty set $\phi$, to be an event and shall call it an impossible event. Similarly the whole space $\Omega$ will be called a sure event.

Any collection of events is a class of events. Classes will be denoted by script letters $\mathcal{A}, \mathcal{B}, \mathcal{C}$, etc. In Example 1.1 the sample space $\Omega$ contains only a finite number of sample points. Hence the number of all possible subsets of $\Omega$ is also finite.
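The sample space of Example 1.1 is small enough to enumerate directly. The following is a minimal sketch, not part of the original text: the function names `sample_space` and `at_most_k_defectives` are illustrative assumptions, and sample points are encoded as strings of the letters D and G.

```python
from itertools import product

def sample_space(n):
    """All n-tuples of D (defective) and G (good), written as strings."""
    return ["".join(t) for t in product("DG", repeat=n)]

def at_most_k_defectives(omega, k):
    """Event: the sample points containing no more than k defectives."""
    return {w for w in omega if w.count("D") <= k}

omega = sample_space(5)
event = at_most_k_defectives(omega, 2)

print(len(omega))                            # 32, i.e. 2**5 sample points
print(len(event))                            # 16, the points listed in Example 1.1
print("DDGGG" in event, "DDDGG" in event)    # True False
```

The event is represented as a plain Python set of sample points, which mirrors the definition of an event as a subset of the sample space.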
If $\Omega$ contains $m$ points, there are $2^m$ subsets of $\Omega$. The class of all subsets of $\Omega$ is called the power set of $\Omega$. Note that the power set is a class. The number of points in a set is called its cardinality.

Example 1.4. Let $\Omega = \{\omega_1, \omega_2, \omega_3\}$. Then the power set consists of the 8 events $\phi$, $\{\omega_1\}$, $\{\omega_2\}$, $\{\omega_3\}$, $\{\omega_1, \omega_2\}$, $\{\omega_1, \omega_3\}$, $\{\omega_2, \omega_3\}$, $\{\omega_1, \omega_2, \omega_3\}$. (What is the power set of $\Omega$ when $\Omega = \{\omega_1, \omega_2, \omega_3, \omega_4\}$?)

It must be emphasized here that the experimenter may be interested only in a particular class of outcomes and not in all events. In Example 1.1, suppose the experimenter is interested only in the total number of defective items in the observed sample of 5. If $\omega_i$ denotes the outcome of observing $i$ defectives, $\Omega$ will contain only 6 sample points, $\omega_0, \omega_1, \ldots, \omega_5$. $\{\omega_i\}$ is a simple event in the present case, but a composite event containing $\binom{5}{i}$ sample points in the case of Example 1.1. The power set of $\Omega$ in the present case is only a subclass of the power set of Example 1.1.

We shall use the following abbreviations which have become standard in mathematical texts.

Symbol    Meaning    Example
$\in$    element of    $\omega \in A$
$\notin$    not an element of    $\omega \notin A$
$\Rightarrow$    logical implication    $a > b,\ b > c \Rightarrow a > c$
$\Leftrightarrow$    logical equivalence    $a \ge b$ and $b \ge a \Leftrightarrow a = b$
$\ni$    such that
$\exists$    there exists
iff    if and only if
$\forall$    each, all or every

1.2 ALGEBRA OF SETS

(a) SET OPERATIONS

Since events are sets of sample points, it is essential that one becomes familiar with the algebra of sets in order to understand manipulations involving events. In the following we assume that $\Omega$ is given. The important set operations are: (i) Complementation; (ii) Inclusion and Equality; and (iii) Union and Intersection.

(i) Complementation: To every set $A$ we can associate another set $A^c$ consisting of all points of $\Omega$ not contained in $A$. $A^c$ is called the complement of $A$. Symbolically,
$$A^c = \{\omega : \omega \notin A\}. \tag{1.1}$$
Evidently, $\Omega^c = \phi$ and $\phi^c = \Omega$.

Example 1.5. Suppose a coin is tossed 3 times.
The possible outcomes may be denoted by

HHH, HHT, HTH, THH, HTT, THT, TTH, TTT
$\omega_1,\ \omega_2,\ \omega_3,\ \omega_4,\ \omega_5,\ \omega_6,\ \omega_7,\ \omega_8$.

If $A$ is the event that at least one head turns up, $A = \{\omega_1, \omega_2, \ldots, \omega_7\}$; then $A^c = \{TTT\} = \{\omega_8\}$ represents the event that no head turns up. If $B$ is the event that exactly one head turns up, then $B = \{\omega_5, \omega_6, \omega_7\}$ and $B^c = \{\omega_1, \omega_2, \omega_3, \omega_4, \omega_8\}$, the latter representing the event that the number of heads turning up is not equal to one. In general, complements of events are events (see Section 1.4).

(ii) Inclusion: If all points of a set $A$ are also points of another set $B$, then we say that $A$ is a subset of $B$, or that $A$ is included in or contained in $B$. This is denoted by $A \subset B$ or $B \supset A$. Symbolically,
$$A \subset B \Leftrightarrow (\omega \in A \Rightarrow \omega \in B). \tag{1.2}$$
Evidently
(i) $A \subset A$ (reflexivity);
(ii) $A \subset B,\ B \subset C \Rightarrow (\omega \in A \Rightarrow \omega \in B \Rightarrow \omega \in C) \Rightarrow A \subset C$ (transitivity). (1.3)

In Example 1.5, $B$ is included in $A$. If $\Omega$ represents the collection of all students, $A$ those who pass in first class, and $B$ those who pass, then $A \subset B$. It may be noted that
$$A \subset B \Leftrightarrow B^c \subset A^c, \tag{1.4}$$
as students who do not pass form a subset of those who do not pass in first class.

We shall see later that a subset of an event may not be an event.

Equality: If $A \subset B$ and $B \subset A$, then $A$ and $B$ are said to be equal, denoted by $A = B$. Thus in any problem, in order to establish the equality of two sets $A$ and $B$ we have to prove that
$$\omega \in A \Leftrightarrow \omega \in B. \tag{1.5}$$
Since the inclusion relation is reflexive and transitive, the equality relation is also reflexive and transitive. It is also symmetric, i.e., $A = B \Rightarrow B = A$.

(iii) Union and Intersection: If $A$ and $B$ are two sets, then the set of all points $\omega$ which belong to either $A$ or $B$ is called $A$ union $B$ and is denoted by $A \cup B$. The set of all points which belong to both $A$ and $B$ is called $A$ intersection $B$ and is denoted by $A \cap B$. Symbolically,
$$A \cup B = \{\omega : \omega \in A \text{ or } \omega \in B\} = \{\omega : \omega \text{ belongs to at least one of the sets } A \text{ or } B\}, \tag{1.6}$$
$$A \cap B = \{\omega : \omega \in A \text{ and } \omega \in B\} = \{\omega : \omega \in \text{both } A \text{ and } B\}. \tag{1.7}$$
$$A \subset B \Rightarrow A \cup B = B,\ A \cap B = A \quad \text{(Why?)}. \tag{1.8}$$

Example 1.6.
In Example 1.5, if $A = \{\ldots\}$ is an event consisting of two sample points and $B$ = {the 3 tosses have at most one tail} $= \{\omega_1, \omega_2, \omega_3, \omega_4\}$, then $A \cup B$ consists of five sample points and $A \cap B$ of a single sample point.

If $A \cap B = \phi$ then $A$ and $B$ are said to be disjoint or mutually exclusive. In this case, and only in this case, $A \cup B$ will be denoted by $A + B$. Thus, $A + A^c = \Omega$. We use this notation throughout the book.

If $A \subset B$, $B \cap A^c$ will be denoted by $B - A$ and is called the proper difference of $B$ and $A$. $B - A$ and $A$ are disjoint and $(B - A) + A = B$. We may note that $A^c = \Omega - A$. Many a time $A \cap B$ is written as $AB$, omitting $\cap$.

We may note that
$$AB \subset A,\quad A - AB = AB^c \subset B^c; \qquad AB \subset B,\quad B - AB = BA^c \subset B.$$
Hence $AB^c$ and $BA^c$ are disjoint. $AB^c + BA^c = A \,\Delta\, B$ is called the symmetric difference of $A$ and $B$. $\Delta$ is called the symmetric difference operation. While the proper difference is defined only if $A \subset B$, the symmetric difference is always defined. (Mark $A \,\Delta\, B$ on a Venn diagram!) We may note that $AB + AB^c = A$. Unions, intersections and differences of events are events (see Section 1.4).

Example 1.7. Suppose $\Omega$ is the real line $R$ consisting of all real points. Let $A = (-\infty, a)$ and $B = (b, c)$, with $b < c \le a$. Then $A \supset B$ and $A \cup B = A$, $A \cap B = B$. Note that $A^c = [a, \infty)$, $B^c = (-\infty, b] \cup [c, \infty)$. Thus the complement of an interval need not be an interval.

The union of an arbitrary, possibly uncountable, number of sets is defined by
$$\bigcup_{i \in I} A_i = \{\omega : \omega \in A_i \text{ for some } i \in I\}, \tag{1.9}$$
where $I$ is an arbitrary index set assumed nonempty. If $I$ is finite we have a finite union. If $I$ is countable and is given by 1, 2, ..., we have a countable union denoted by $\bigcup_{i=1}^{\infty} A_i$. Similarly the intersection of $A_i$ $(i \in I)$ is
$$\bigcap_{i \in I} A_i = \{\omega : \omega \in A_i \text{ for all } i \in I\}. \tag{1.10}$$

The operations of $\cup$ and $\cap$ are reflexive, commutative, associative and distributive, as can be easily verified (How?).

(i) $A \cup A = A$, $A \cap A = A$; (Reflexive)
(ii) $A \cup B = B \cup A$, $A \cap B = B \cap A$; (Commutative)
(iii) $(A \cup B) \cup C = A \cup (B \cup C) = A \cup B \cup C = (A \cup C) \cup B$,
      $(A \cap B) \cap C = A \cap (B \cap C) = A \cap B \cap C = (A \cap C) \cap B$; (Associative)
(iv) $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$,
     $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
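(Distributive)

From (iv) we conclude that $A(B + C) = AB + AC$ and, more generally, $A\left(\sum_i B_i\right) = \sum_i AB_i$. By the definition of $A^c$,
$$A \cup A^c = \Omega, \qquad A \cap A^c = \phi.$$
Now
$$(A \cap B)^c = \{\omega : \omega \notin A \cap B\} = \{\omega : \omega \text{ does not belong to both } A \text{ and } B\} = \{\omega : \text{either } \omega \notin A \text{ or } \omega \notin B\} = A^c \cup B^c. \tag{1.11}$$
Replacing $A$ by $A^c$ and $B$ by $B^c$ in (1.11), we have $(A^c \cap B^c)^c = (A^c)^c \cup (B^c)^c = A \cup B$, and hence, taking complementation of both sides,
$$A^c \cap B^c = (A \cup B)^c. \tag{1.12}$$
This may be proved directly also. In general, if $A_i$, $i \in I$, are arbitrary sets,
$$\Big(\bigcup_{i \in I} A_i\Big)^c = \{\omega : \omega \text{ does not belong to even one } A_i,\ i \in I\} = \{\omega : \omega \notin A_i \text{ for each one of } i \in I\} = \bigcap_{i \in I} A_i^c \tag{1.13}$$
and
$$\Big(\bigcap_{i \in I} A_i\Big)^c = \{\omega : \omega \text{ does not belong to each one of } A_i\} = \{\omega : \omega \text{ does not belong to at least one } A_i\} = \bigcup_{i \in I} A_i^c. \tag{1.14}$$
(1.13) and (1.14) are known as de Morgan rules (or formulae) for deriving complements.

Lemma 1.1: Given a class $\{A_i,\ i = 1, 2, \ldots, n\}$ of sets there exists a class $\{B_i,\ i = 1, 2, \ldots, n\}$ of disjoint sets such that
$$\bigcup_{i=1}^{n} A_i = \sum_{i=1}^{n} B_i. \tag{1.15}$$

Proof: The Lemma will be proved by induction. Evidently
$$A_1 \cup A_2 = A_1 + A_1^c A_2 = B_1 + B_2, \text{ say}, \tag{1.16}$$
where $B_1$ and $B_2$ are disjoint. Thus the Lemma is true for $n = 2$. Suppose it is true for $n = m \ge 2$. Then
$$\bigcup_{i=1}^{m+1} A_i = \Big(\bigcup_{i=1}^{m} A_i\Big) \cup A_{m+1} = \Big(\sum_{i=1}^{m} B_i\Big) \cup A_{m+1} \quad \text{(by induction hypothesis)}$$
$$= \sum_{i=1}^{m} B_i + \Big(\sum_{i=1}^{m} B_i\Big)^c A_{m+1} \quad \text{(by 1.16)}$$
$$= \sum_{i=1}^{m} B_i + B_{m+1}, \text{ say},$$
where $B_{m+1}$ and $\sum_{i=1}^{m} B_i$ are disjoint and hence $B_{m+1}$ and $B_i$ are disjoint for $i = 1, 2, \ldots, m$ (Why?). Hence the Lemma holds for $n = m + 1$ and, by induction, for arbitrary $n$. (*)

Note that $B_i \subset A_i$ for all $i$.

Corollary 1.1:
$$\bigcup_{i=1}^{\infty} A_i = A_1 + A_1^c A_2 + A_1^c A_2^c A_3 + \cdots \tag{1.17}$$
This is the extension of Lemma 1.1 to the countable case. It tells us how to get a countable class of disjoint sets, starting from a countable class of arbitrary sets, such that their unions are equal. Thus, in future we may assume that we are given a countable class of disjoint sets, without loss of generality. (Is the class $\{B_i\}$ on the RHS of (1.17) unique?)

Proof: Suppose $\omega \in \bigcup_{i=1}^{\infty} A_i$. Then $\omega$ belongs to some $A_i$. Thus $\omega$ may belong to $A_1$ or $\omega$ may belong to $A_1^c$. In the latter case $\omega$ has to belong to $A_2$ or $A_2^c$. Thus either $\omega \in A_1^c A_2$ or $\omega \in A_1^c A_2^c$.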
In the latter case $\omega$ has to belong to $A_3$ or $A_3^c$. Continuing in this manner, $\omega$ has to belong to either $A_1$, $A_1^c A_2$, $A_1^c A_2^c A_3$, ..., $A_1^c A_2^c \cdots A_{k-1}^c A_k$, etc. Hence $\omega \in$ RHS of (1.17).

Conversely, if $\omega \in$ RHS of (1.17), $\omega \in A_1^c A_2^c \cdots A_{k-1}^c A_k$ for some $k$. But $A_1^c A_2^c \cdots A_{k-1}^c A_k \subset A_k$. Hence $\omega \in A_k$ for some $k$. Thus $\omega \in \bigcup_{k=1}^{\infty} A_k$, which establishes the equivalence of the two sets on the two sides of (1.17). (*)

Here and in what follows, (*) signifies the end of the proof.

(b) SEQUENCES, LIMITS

A sequence of sets $\{A_n\}$ is said to be monotone increasing if $A_n \subset A_{n+1}$ for each $n$. Since in this case $\bigcup_{k=1}^{n} A_k = A_n$, $\bigcup_{n=1}^{\infty} A_n = A$ is called the limit of the sequence. Symbolically this is written as $A_n \uparrow A$. If $A_n \supset A_{n+1}$ for each $n$, the sequence $\{A_n\}$ of sets is said to be monotone decreasing. Then $\bigcap_{k=1}^{n} A_k = A_n$ and $\bigcap_{n=1}^{\infty} A_n = A$ is called the limit of $\{A_n\}$. This is written as $A_n \downarrow A$.

Example 1.8. Let $A_n = \{\omega : 0 < \omega < 1 - 1/n\} \subset (-\infty, \infty) = \Omega$. Then $A_n \uparrow A = \{\omega : 0 < \omega < 1\}$. Let $B_n = \{\omega : 0 < \omega < 1 + 1/n\}$. Then $B_n \downarrow B = \{\omega : 0 < \omega \le 1\}$.

[…]

We also say that $\{A_n\}$ converges to $A$. It should be noted that $\overline{\lim}\, A_n$ and $\underline{\lim}\, A_n$ will always exist, even though $\lim A_n$ may not exist.

Example 1.9. Let $A_n$ be the set of points $(x, y)$ of the Cartesian plane lying within the rectangle bounded by the two axes and the lines $x = n$ and $y = 1/n$, i.e., […]

[…]

Lemma 1.2: A field is closed under finite unions. Conversely, a class closed under complementations and finite unions is a field.

Proof: Suppose $\mathcal{A}$ is a field. Then
(i) $A \in \mathcal{A} \Rightarrow A^c \in \mathcal{A}$, (1.23)
(ii) $A_1, A_2, \ldots, A_n \in \mathcal{A} \Rightarrow \bigcap_{i=1}^{n} A_i \in \mathcal{A}$. (1.24)
But
$A_1, A_2, \ldots, A_n \in \mathcal{A} \Rightarrow A_1^c, A_2^c, \ldots, A_n^c \in \mathcal{A}$ (by 1.23)
$\Rightarrow \bigcap_{i=1}^{n} A_i^c \in \mathcal{A}$ (by 1.24)
$\Rightarrow \big(\bigcap_{i=1}^{n} A_i^c\big)^c \in \mathcal{A}$ (by 1.23)
$\Rightarrow \bigcup_{i=1}^{n} A_i \in \mathcal{A}$ (by de Morgan rule).
Hence $\mathcal{A}$ is closed under finite unions also.

Conversely, suppose $\mathcal{A}$ is a class such that (1.23) holds true for every $A \in \mathcal{A}$ and
$$A_1, A_2, \ldots, A_n \in \mathcal{A} \Rightarrow \bigcup_{i=1}^{n} A_i \in \mathcal{A}.$$
Then, proceeding as above and using the de Morgan rule, we can prove that $\mathcal{A}$ is closed under finite intersections also. Hence we have the Lemma. (*)

From the Lemma, a field is sometimes defined as a class closed under complementations, finite intersections and/or finite unions.
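For a finite sample space, the closure conditions (1.23) and (1.24) used in the proof of Lemma 1.2 can be checked mechanically. The sketch below is illustrative only and is not taken from the text: the helper names `normalise`, `is_field` and `closed_under_unions` are assumptions, and classes are represented as collections of Python sets. It checks closure under complementation and pairwise intersection, then confirms that closure under unions follows, as the Lemma asserts.

```python
from itertools import combinations

def normalise(cls):
    """Represent a class of sets as a set of frozensets."""
    return {frozenset(a) for a in cls}

def is_field(omega, cls):
    """Check (1.23) and (1.24): closure under complementation and under
    intersections (pairwise closure suffices for finite intersections)."""
    omega, cls = frozenset(omega), normalise(cls)
    if not cls:
        return False
    closed_compl = all(omega - a in cls for a in cls)
    closed_inter = all(a & b in cls for a, b in combinations(cls, 2))
    return closed_compl and closed_inter

def closed_under_unions(cls):
    """Check closure under pairwise (hence finite) unions."""
    cls = normalise(cls)
    return all(a | b in cls for a, b in combinations(cls, 2))

omega = {1, 2, 3, 4}
A, B = {1, 2}, {2, 3}
field = [set(), A, omega - A, omega]
not_field = [set(), A, omega - A, B, omega - B, omega]

print(is_field(omega, field), closed_under_unions(field))   # True True
print(is_field(omega, not_field))                           # False
```

The second class is closed under complementation but fails because the intersection of `A` and `B`, the set {2}, is missing from it.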
Evidently, if $\mathcal{A}$ is a field,
$$A \in \mathcal{A} \Rightarrow A^c \in \mathcal{A} \Rightarrow A \cup A^c \in \mathcal{A}, \text{ i.e., } \Omega \in \mathcal{A}, \tag{1.25}$$
$$\Rightarrow A \cap A^c \in \mathcal{A}, \text{ i.e., } \phi \in \mathcal{A}. \tag{1.26}$$
Hence we have the following Lemma.

Lemma 1.3: Every field contains the empty set $\phi$ and the whole space $\Omega$.

The class containing only $\phi$ and $\Omega$ is a field. It is the smallest field and is contained in every other field. It is called the degenerate or trivial field. The power set consisting of every subset of $\Omega$ is also a field and is the largest field. If a field contains $A$, it has to contain $A^c$, and hence contains the class
$$\{A, A^c, \phi, \Omega\}. \tag{1.27}$$
But this class is a field, contained in every field containing $A$. Therefore it is the smallest field containing $A$. (What is the smallest field containing $A$, $B$?)

Example 1.13. Let $A_1, A_2, A_3$ be such that $A_i \cap A_j = \phi$ for all $i, j = 1, 2, 3$ $(i \ne j)$ and $\sum_{i=1}^{3} A_i = \Omega$. Thus $\Omega$ is subdivided into 3 mutually exclusive and exhaustive sets. To determine the field containing the class $\{A_1, A_2, A_3\}$, it is sufficient if we add to the above class the sets $\phi$, $A_1 + A_2$, $A_1 + A_3$, $A_2 + A_3$ and $\Omega$. Since $A_1 + A_2 = A_3^c$, $A_1 + A_3 = A_2^c$, $A_2 + A_3 = A_1^c$, the resulting class
$$\{\phi, A_1, A_2, A_3, A_1 + A_2, A_1 + A_3, A_2 + A_3, \Omega\} \tag{1.28}$$
is closed under complementations and finite unions; hence it is a field. It is the smallest field containing $\{A_1, A_2, A_3\}$.

Consider an arbitrary class $\mathcal{C}$ of sets. The smallest field containing $\mathcal{C}$ is called the minimal field containing $\mathcal{C}$ or the field generated by $\mathcal{C}$. This is contained in every field containing $\mathcal{C}$. If $\mathcal{C}_i$ $(i \in I)$ denote the fields containing $\mathcal{C}$, then $\mathcal{C}_0 = \bigcap_{i \in I} \mathcal{C}_i$ is the minimal field containing $\mathcal{C}$, denoted by $\mathcal{F}(\mathcal{C})$. This follows from Theorem 1.1 below and the fact that $\mathcal{C}$ is contained in $\mathcal{C}_0$ (see also Neveu (1965), p. 14).

Theorem 1.1: The intersection of an arbitrary number of fields is a field.

Proof: Let $\mathcal{C}_i$ $(i \in I)$ be the fields indexed by points of a nonempty set $I$, which may be finite, countable or uncountable. Let $\mathcal{C}_0 = \bigcap_{i \in I} \mathcal{C}_i$, the class of all sets common to all $\mathcal{C}_i$. Since each one of the $\mathcal{C}_i$ contains $\phi$ and $\Omega$, $\mathcal{C}_0$ contains $\phi$ and $\Omega$.
Suppose $A \in \mathcal{C}_0$; then $A \in \mathcal{C}_i$, $\forall\, i \in I$. Hence $A^c \in \mathcal{C}_i$ for all $i$. Thus $A^c \in \mathcal{C}_0$, and $\mathcal{C}_0$ is closed under complementation.

If $A_k \in \mathcal{C}_0$ $(k = 1, 2, \ldots, n)$, then $A_k \in \mathcal{C}_i$, $\forall\, i \in I$. Then $\bigcap_{k=1}^{n} A_k \in \mathcal{C}_i$ for all $i \in I$ and hence it belongs to $\bigcap_{i \in I} \mathcal{C}_i = \mathcal{C}_0$, implying that $\mathcal{C}_0$ is closed under finite intersections. Hence $\mathcal{C}_0$ is a field. (*)

We may note that even though the intersection of two or more fields is a field, the union of two fields may not be a field; e.g., $\{A, A^c, \phi, \Omega\}$ and $\{B, B^c, \phi, \Omega\}$ are two fields, but their union
$$\{A, A^c, B, B^c, \phi, \Omega\} \tag{1.29}$$
is not a field.

Let $\{A_i\}$, $i = 1, 2, \ldots, n$, be a class of disjoint sets such that $\sum_{i=1}^{n} A_i = \Omega$. Thus, $\{A_i\}$ are mutually exclusive and exhaustive. Then $\{A_i\}$ is called a partition of $\Omega$. In this case, $A_1^c = A_2 + A_3 + \cdots + A_n$, $(A_1 + A_2)^c = A_3 + \cdots + A_n$, and so on. Hence the class
$$\{\phi;\ A_1, A_2, \ldots, A_n;\ A_1 + A_2, A_1 + A_3, \ldots, A_{n-1} + A_n;\ A_1 + A_2 + A_3, \ldots;\ A_1 + A_2 + \cdots + A_n = \Omega\} \tag{1.30}$$
containing $\phi$, $\Omega$ and the unions of $A_i$'s taken one set at a time, 2 sets at a time, etc., which is therefore closed under finite unions, is also closed under complementations. Thus it is a field. It is the minimal field containing $\{A_i,\ i = 1, 2, \ldots, n\}$. It contains $\binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n} = 2^n$ sets.

A partition $\mathcal{A}'$ is said to be finer than a partition $\mathcal{A}$, or $\mathcal{A}$ coarser than $\mathcal{A}'$, if the minimal field $\mathcal{F}(\mathcal{A}')$ containing $\mathcal{A}'$ contains the minimal field $\mathcal{F}(\mathcal{A})$ containing $\mathcal{A}$. For example, $\{A, A^c\}$ is coarser than $\{A, BA^c, (A \cup B)^c\}$; $\{AB, AB^c, BA^c, (A \cup B)^c\}$ is finer than both these partitions. This may be verified by obtaining the minimal fields containing these partitions.

We have seen that it is very easy to obtain $\mathcal{F}(\mathcal{C})$, the minimal field containing a class $\mathcal{C}$, if $\mathcal{C}$ is a partition. In general, to obtain $\mathcal{F}(\mathcal{C})$ we may proceed along the following steps:

(i) Obtain $\mathcal{C}_1 = \{\phi, \Omega, A \text{ such that either } A \in \mathcal{C} \text{ or } A^c \in \mathcal{C}\}$, where $A \subset \Omega$. Evidently $\mathcal{C}_1$ is closed under complementation and contains $\mathcal{C}$.

(ii) Obtain the class $\mathcal{C}_2$ containing $\bigcap_{k=1}^{n} B_k$, where $B_k \in \mathcal{C}_1$, $k = 1, 2, \ldots, n$, the $B_k$'s and $n$ being arbitrary.
Now $\mathcal{C}_2$ is closed under finite intersections, but not under complementation.

(iii) Obtain $\mathcal{C}_3$, the class of all finite unions of pairwise disjoint subsets belonging to $\mathcal{C}_2$. Since they also contain complements, $\mathcal{C}_3$ is a field and is the minimal field containing $\mathcal{C}$. (Prove!)

Example 1.14. Let $\mathcal{C} = \{A, B\}$. Then
$\mathcal{C}_1 = \{\phi, \Omega, A, A^c, B, B^c\}$,
$\mathcal{C}_2 = \{\mathcal{C}_1, AB, AB^c, A^cB, A^cB^c\}$,
$\mathcal{C}_3 = \{\mathcal{C}_2,\ AB + A^cB^c,\ AB^c + A^cB,\ AB + AB^c + A^cB,\ AB + AB^c + A^cB^c,\ AB + A^cB + A^cB^c,\ AB^c + A^cB + A^cB^c\}$.
$\mathcal{C}_3$ is the smallest field containing $\mathcal{C}$. It coincides with the minimal field containing the partition $\{AB, AB^c, A^cB, A^cB^c\}$. The minimal field containing $\{A, B, C\}$ is the minimal field containing the partition
$$\{ABC,\ A^cBC,\ AB^cC,\ ABC^c,\ A^cB^cC,\ A^cBC^c,\ AB^cC^c,\ A^cB^cC^c\} \quad \text{(Why?)}.$$

(c) σ-FIELD, BOREL FIELD

Closure under finite operations does not imply closure under countable operations. In Example 1.11, we have seen that the class $\mathcal{C}$ of all intervals of the form $(x, \infty)$, $x \in R$, is closed under finite intersections. But it is not closed under countable intersections, because
$$\bigcap_{n=1}^{\infty} (x - 1/n, \infty) = [x, \infty) \notin \mathcal{C}.$$

A nonempty class of sets which is closed under complementations and countable intersections is called a σ-field; it is then closed under countable unions as well (How?). It possesses all the properties of a field, hence contains $\phi$ and $\Omega$. If $\mathcal{C}$ contains only a finite number of sets and is a field, it is also a σ-field. However, a field containing an infinite number of sets may not be a σ-field.

Example 1.15. Let $\Omega = \{1, 2, 3, 4, \ldots\}$ and $\mathcal{C}$ be the class of subsets $A$ of $\Omega$ such that either $A$ contains a finite number of points or $A^c$ contains a finite number of points. Evidently $\mathcal{C}$ is closed under complementation. It is also closed under finite unions, because $A \cup B$ will contain a finite number of points if each one of $A$ and $B$ is finite, and $(A \cup B)^c = A^c \cap B^c$ will contain a finite number of points if either $A^c$ is finite or $B^c$ is finite. Thus either $A \cup B$ contains a finite number of points or $(A \cup B)^c$ contains a finite number of points. Hence $A \cup B \in \mathcal{C}$. Thus $\mathcal{C}$ is a field. But $\mathcal{C}$ is not a σ-field. Let $A_n = \{2n\}$, $n = 1, 2, \ldots$.
‘Then ( 4: =(2,4,6...}, is neither finite ‘nor its complement is finfte. a Hence it does not belong to’. By Theorem 1.1, intersection of arbitrary number of fields is a field. Following similar steps we can prove Theorem 1.2. ‘Tuconem 1:2: The intersection of an arbitrary number of o-fields is a \, eel 7 } _ Given a class @, the minimal e-fleld containing the class © will be denoted by o(@). It is the Mversection of allsields containing ©. Ie is also called the o-field generated by ©. If @ is finite, the minimal field| | | SETS AND CLASSES OF EVENTS 15 FG) containing © coincides with the minimal o-Aeld o(@) containing @. [> To obtain o(%) from a given class @, we follow the same steps (i)-(iii) “| used in the procedure of obtaining #(@), except that in step 2 we allow \ ! 7 ‘nto be infinite, Example 116, Consider the class ¢ of all intervals of'the form(—co, x), & R, as subsets of the real line R, This class is closed under fi intersections, but not under complementation nor under countable intersections, Let o(@) = # be the minimal o-feld containing @. ‘Thea contains intervals of the form [x, 0); which are complements of sets of the form (—co, x). It contains intervals of the form (=, aj fi (Ce, a + 1/m),: (by countable intersection), | \ ap SIA = Cie ae (by complementation) pia @=Ce.IN@ 2 @ WS pe B10, ot fora, Bee Re LA J ‘@ is called the Borel field of subsets of the real line. The sets of # fe called Borel sets:~-BOtel field and Borel sets play a very important fole in the study of probability theory. Lemma L4: Let @, be the class of all intervals of the form (a, by GED), a, DER, but arbitrary. Then (6) = B, Proof: BY (1.31), (a, b) €&, for all a, b, Hence @, c #. By definition ‘of minimal o-field, o(@) C ‘ To prove inclusion in the reverse direction, we note that, by definition of (8), U (-m ae o(@), v x = (—@, 2) € ofG,), ¥ x = €co(@) where @ is defined in Example 1.16. Hence (8) C o(,). But of) = @ and therefore'a(%,). 
$= \sigma(\mathcal{C}) = \mathcal{B}$. (*)

In the same manner as above, one can prove that the Borel field is the minimal σ-field containing any one of the following classes: $(-\infty, a]$, $a \in R$, [...]

[...]

4. Prove that $\overline{\lim} (A_n \cup B_n) = \overline{\lim} A_n \cup \overline{\lim} B_n$. What can you say for $\underline{\lim} (A_n \cup B_n)$, $\overline{\lim} (A_n B_n)$, $\underline{\lim} (A_n B_n)$?

5. Let $\Omega = R^2$, $A_n$ = interior of the circle with centre at $((-1)^n/n, 0)$ and radius 1. Find $\overline{\lim} A_n$, $\underline{\lim} A_n$, and $\lim A_n$, if it exists.

6. Prove that (i) $(\bigcup_i A_i) \cap B = \bigcup_i (A_i \cap B)$, (ii) [...]

7. If $A_n = A$, $n = 1, 3, 5, \ldots$, and $A_n = B$, $n = 2, 4, 6, \ldots$, show that $\overline{\lim} A_n = A \cup B$, $\underline{\lim} A_n = A \cap B$. When does $\lim A_n$ exist?

8. Examine the following sequences of sets for convergence. If convergent, derive the limit.
(a) $A_{2n} = (0, 1/2n)$, $A_{2n+1} = [-1, 1/(2n+1)]$;
(b) $A_n$ = the set of rationals in $[1 - 1/(n+1), 1 + 1/n]$;
(c) $A_n$ = [...], $n$ odd; [...], $n$ even.

[...]

Halmos (1956) calls a σ-field a Boolean σ-algebra with operations of addition and multiplication suitably defined. A Boolean σ-algebra $\mathcal{A}$ is said to be separable or of countable type if there exists a countable family of subsets of $\Omega$ which generates $\mathcal{A}$. [...]

[...] (i) [...] $A \cup B \in \mathcal{A}$, (ii) $B \subset A \Rightarrow A - B \in \mathcal{A}$, (iii) $\Omega \in \mathcal{A}$. The class $\mathcal{A}$ is a σ-field if in addition (iv) $A_n \in \mathcal{A}$ $(n = 1, 2, \ldots) \Rightarrow \bigcup_n A_n \in \mathcal{A}$.

If $\{A_1, \ldots, A_n\}$ and $\{B_1, \ldots, B_m\}$ are two partitions of $\Omega$, show that (a) $\{A_i \cap B_j\}$ is a partition of $\Omega$, (b) $\sigma(\sigma\{A_i\} \cup \sigma\{B_j\}) = \sigma\{A_i \cap B_j\} = \sigma(\{A_i\} \cup \{B_j\})$. Can you extend the above result when $\{A_i\}$ and $\{B_j\}$ are (i) countable partitions, (ii) arbitrary classes?

Let $\mathcal{C}$ be an arbitrary class of subsets of $\Omega$ and let $\mathcal{A}$ be the collection of all finite unions $\bigcup_{i=1}^{n} A_i$ $(n = 1, 2, \ldots)$ where each $A_i = \bigcap_{j=1}^{m} B_{ij}$, with $B_{ij} \in \mathcal{C}$ or $B_{ij}^c \in \mathcal{C}$. Show that $\mathcal{A}$ is the minimal field over $\mathcal{C}$. This gives us a procedure of obtaining the minimal field over a class $\mathcal{C}$. Extend this procedure to arrive at the minimal σ-field.

$\bar{R} = R \cup \{\infty\} \cup \{-\infty\}$ is called the extended real line. One can define the Borel field $\bar{\mathcal{B}}$ in $\bar{R}$ similarly to $\mathcal{B}$ in $R$. Study the relationship between $\mathcal{B}$ and $\bar{\mathcal{B}}$.

Borel sets are subsets of $R$.
But the converse is not true. There exist subsets of $R$ which are not obtainable by countable operations on intervals. Give an example of such a set. Also give an example of a non-countable Borel set which is not an interval. (See Halmos 1956.)

Let $B$ be a Borel set in $R$. Let $B^a = \{a + x : x \in B\}$, $a \in R$. Show that $B^a$ is also a Borel set.

If $\mathcal{C}$ generates a σ-field $\mathcal{A}$, the σ-field $\mathcal{A}_B$ of subsets of $B$ is generated by $B \cap \mathcal{C}$.

2. Random Variables

2.1 FUNCTIONS AND INVERSE FUNCTIONS

(a) POINT FUNCTION AND SET FUNCTION

Suppose $\Omega$ is the sample space with sample points $\omega$. Sometimes we are interested in a value $X(\omega)$ associated with $\omega$, and not in $\omega$ itself. $X(\omega)$ may be observable while $\omega$ is not.

Example 2.1. Consider the sample space in Example 1.5, consisting of 8 possible outcomes of the experiment of tossing a coin 3 times. We may be interested only in the number of times head turns up. Thus, with the outcome $HHH = \omega_1$, we will associate a number $X(\omega_1) = 3$, representing the number of heads in $\omega_1$. With the outcome $HHT = \omega_2$ we will associate $X(\omega_2) = 2$, and so on.

A function $X$ on a space $\Omega$ to a space $\Omega'$ assigns to each point $\omega \in \Omega$ a unique point in $\Omega'$ denoted by $X(\omega)$. $X(\omega)$ is the image of the argument $\omega$ under $X$. It is also called the value of $X$ at $\omega$. $X$ is also a mapping from $\Omega$ to $\Omega'$, denoted by $\Omega \xrightarrow{X} \Omega'$. $\omega$ is mapped on to $X(\omega) = \omega' \in \Omega'$ by $X$. $X$ also establishes a correspondence relation between points of $\Omega$ and points of $\Omega'$. $\Omega$ is called the domain of $X$ and $\Omega'$ is called the range. The set $\Omega'' = [X(\omega) : \omega \in \Omega]$, which is a subset of $\Omega'$, is called the strict range of $X$. If $\Omega' = \Omega''$, then we say that $X$ is a mapping from $\Omega$ onto $\Omega'$; otherwise $X$ is said to be a mapping from $\Omega$ into $\Omega'$.

The symbols $X(\cdot)$, $X(\omega)$, etc. will be used to denote functions, even though they really denote the values of functions.

Example 2.2. Let $\Omega = [0, \pm 1, \pm 2, \ldots]$, $\Omega' = [0, 1, 2, \ldots]$ and $X(\omega) = \omega^2$. Then the strict range of $X$ is $[0, 1, 4, 9, \ldots] = \Omega''$.
Hence $X$ is a mapping of $\Omega$ 'into' $\Omega'$ and 'onto' $\Omega''$. If $X(\omega) = \omega + 2$, then $X$ is a mapping of $\Omega$ 'onto' $\Omega$.

We consider one-valued functions only. Thus, $\omega_1 = \omega_2 \Rightarrow X(\omega_1) = X(\omega_2)$. Many-valued functions such as $X(\omega) = \pm\sqrt{\omega}$, $\omega \in R$, are outside our purview. But, in general, $X(\omega_1) = X(\omega_2) \not\Rightarrow \omega_1 = \omega_2$, because there may be more than one $\omega$ which is mapped on the same image $\omega'$. If $X(\omega_1) = X(\omega_2) \Rightarrow \omega_1 = \omega_2$, whatever be $\omega_1, \omega_2 \in \Omega$, then $X$ is said to be a one-to-one (1-1) function. In Example 2.2, $X(\omega) = \omega + 2$ is a 1-1 function, as each image corresponds to a unique point $\omega \in \Omega$; but $X(\omega) = \omega^2$ is not a 1-1 function, since $\omega_1 = +2$, $\omega_2 = -2$ have the same image $X(\omega_1) = X(\omega_2)$. If $\Omega$ is the real line $(-\infty < \omega < \infty)$ and $\Omega' = (0 < \omega' < \infty)$, the function $X(\omega) = \exp(\omega)$ is a 1-1 'onto' function from $\Omega$ to $\Omega'$, and is a 1-1 'into' function from $\Omega$ to $\Omega$.

If the range space is the real line $R$ or its subset, the function is said to be a numerical or real-valued function. In Example 2.1, with each $\omega$ we associate a numerical value. Hence, we have a mapping from $\Omega$ to $R$. If the arguments of $X$ are points of a space $\Omega$, we have a point function. In the above examples, we considered only point functions. If the arguments of a function are sets of a certain class, then we have a set function. If $\mathcal{A}$ is a class of sets, and with each $A \in \mathcal{A}$ we associate a value $\mu(A)$, say, then $\mu$ is a set function. $\mu$ may represent such entities as weight, length, measure, etc., associated with the sets of $\mathcal{A}$. For example, with an interval $(a, b)$ we may associate its length $b - a$; with the disjoint union $(a, b) \cup (c, d)$ we may associate $\mu((a, b) \cup (c, d)) = (b - a) + (d - c)$, and so on.

Two real valued functions $X$ and $Y$ on $\Omega$ are said to be equal iff $X(\omega) = Y(\omega)$ $\forall \omega \in \Omega$, and we write $X = Y$. Similarly we write $X \le Y$ if $X(\omega) \le Y(\omega)$ for all $\omega \in \Omega$. Thus, to verify the equality (or inequality) of two functions, we have to verify the equality (or inequality) of the values of these functions at all argument points. If $X(\omega) = c$, a constant, for all $\omega$, $X$ is degenerate.
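The notions of point function, strict range and one-to-one mapping introduced above can be tried out on small finite spaces. The following is only an illustrative sketch in Python: the helper names (`num_heads`, `strict_range`, `is_one_to_one`) are hypothetical, and a finite window of integers is assumed as a stand-in for the infinite domain of Example 2.2.

```python
from itertools import product

# Sample space of Example 2.1: the 8 outcomes of tossing a coin 3 times.
omega = ["".join(t) for t in product("HT", repeat=3)]

def num_heads(w):
    """Point function X of Example 2.1: number of heads in the outcome w."""
    return w.count("H")

def strict_range(f, domain):
    """Set of values actually taken by f on the given (finite) domain."""
    return {f(w) for w in domain}

def is_one_to_one(f, domain):
    """True if distinct points of the finite domain have distinct images."""
    points = list(domain)
    return len(strict_range(f, points)) == len(points)

# Assumption: a finite window of integers stands in for the domain of Example 2.2.
window = range(-3, 4)
square = lambda w: w * w   # X(w) = w^2
shift = lambda w: w + 2    # X(w) = w + 2

print(strict_range(num_heads, omega))   # values taken by X on the 8 outcomes
print(is_one_to_one(square, window))    # squaring maps +w and -w to the same image
print(is_one_to_one(shift, window))     # shifting keeps distinct points distinct
```

On a finite domain, comparing the size of the strict range with the size of the domain is enough to decide whether the function is one-to-one; for infinite domains such as those in Example 2.2 the property has to be argued, not enumerated.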
(b) INVERSE FUNCTION

We have seen that for a point $\omega' \in \Omega'$ there may exist one or more than one points of $\Omega$ whose image under $X$ is $\omega'$. The set of all points $\omega \in \Omega$ whose image under $X$ is $\omega'$ is called the inverse image of $\{\omega'\}$, denoted as $X^{-1}(\{\omega'\})$. Thus,
$X^{-1}(\{\omega'\}) = [\omega \in \Omega : X(\omega) = \omega']$. (2.1)
Henceforward, sets of points or classes of sets will be denoted either by $\{\cdot\}$ or by $[\cdot]$.

In general, let $B' \subset \Omega'$. The set of all points of $\Omega$ for which $X(\omega) \in B'$ is called the inverse image of $B'$ under $X$, denoted by $X^{-1}(B')$:
$X^{-1}(B') = [\omega : X(\omega) \in B']$. (2.2)
Thus, with every point function $X$, we associate a set function $X^{-1}$, whose domain is a class $\mathcal{A}'$ of subsets of $\Omega'$ and whose range is a class $\mathcal{A}$ (say) of subsets of $\Omega$. $X^{-1}$ is called the inverse function (or mapping) of $X$. We shall denote
$X(B) = [X(\omega) : \omega \in B]$, $B \subset \Omega$,
$X^{-1}(\mathcal{A}') = [X^{-1}(B') : B' \in \mathcal{A}']$. (2.3)
Evidently $X^{-1}(\Omega') = [\omega : X(\omega) \in \Omega'] = \Omega$.

Example 2.3. Consider the function $X(\omega) = \omega^2$ from $R$ to $R$. Let $B'$ be the set of points contained in the interval $(1, 2)$. Then $X^{-1}(B') = (1, \sqrt{2}) \cup (-\sqrt{2}, -1)$. If the range and the domain of $X$ are the same and equal to $(0, \infty)$, then $X^{-1}(B') = (1, \sqrt{2})$.

Example 2.4. Consider the real valued function $I_A$ or $I(A)$ defined on $\Omega$ as follows:
$I_A(\omega) = I(A)(\omega) = 1$, if $\omega \in A$,
$= 0$, if $\omega \in A^c$. (2.4)
Then $I_A$ or $I(A)$ is called the indicator function (or characteristic function by some authors) of $A$. The strict range of $I_A$ is $I_A(\Omega) = \{I_A(\omega) : \omega \in \Omega\} = \{0, 1\}$. If $B \subset R$, the range space, then
$I_A^{-1}(B) = \varnothing$, if $B$ does not contain '0' or '1',
$= A$, if $B$ contains '1' but not '0',
$= A^c$, if $B$ contains '0' but not '1',
$= \Omega$, if $B$ contains both '0' and '1'. (2.5)
Thus
$I_A^{-1}(\mathcal{B}) = \{\varnothing, A, A^c, \Omega\} = \sigma\{A\}$. (2.6)
Evidently, $cI_A$ takes the value $c$ on $A$ and 0 on $A^c$, and hence
$(cI_A)^{-1}(\mathcal{B}) = I_A^{-1}(\mathcal{B}) = \sigma\{A\}$.
$I_\Omega$ takes the value '1' for all $\omega \in \Omega$. If $X(\omega) = c$ for all $\omega \in \Omega$, then $X = cI_\Omega$ and in this case
$X^{-1}(B') = \varnothing$ or $\Omega$, whatever be $B' \subset R$, (2.7)
$I_\Omega^{-1}$
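$(\mathcal{B}) = \{\varnothing, \Omega\} = \sigma\{\Omega\}$. (2.8)

As the indicator function is the simplest non-trivial function, we give below some of its important properties which can be easily established. (Prove!)
(i) $A \subset B \Leftrightarrow$ [...]

[...] $X(\omega) \notin B$, i.e., $\omega \notin X^{-1}(B)$. Thus,
$X^{-1}(B^c) = (X^{-1}(B))^c$. (2.11)
It may be noted here that the unions and intersections above need not be countable. (*)

Evidently, $X^{-1}(\Omega') = [\omega : X(\omega) \in \Omega'] = \Omega$, $X^{-1}(\varnothing) = \varnothing$. If $A \cap B = \varnothing$, then $X^{-1}(A \cap B) = X^{-1}(A) \cap X^{-1}(B) = \varnothing$. Hence, $X^{-1}(A)$ and $X^{-1}(B)$ are disjoint.

If $\mathcal{C}'$ is a certain class of subsets of $\Omega'$ and $X^{-1}(\mathcal{C}') = [X^{-1}(C) : C \in \mathcal{C}']$, from the above Lemma 2.1 the question arises whether $X^{-1}(\mathcal{C}')$ is a σ-field, if $\mathcal{C}'$ is a σ-field. A partial answer to this question is contained in the following Corollary 2.1 and the complete answer in Corollary 2.2.

COROLLARY 2.1: If $\mathcal{A}$ is a class of subsets of $\Omega$ and is a σ-field, then the class $\mathcal{E}$ of all sets whose inverse images belong to $\mathcal{A}$ is also a σ-field.

Proof: By definition of $\mathcal{E}$, since $\mathcal{A}$ is a σ-field, using Lemma 2.1,
$B_1, B_2, \ldots \in \mathcal{E} \Rightarrow X^{-1}(B_1), X^{-1}(B_2), \ldots \in \mathcal{A}$
$\Rightarrow \bigcap_i X^{-1}(B_i) \in \mathcal{A}$
$\Rightarrow X^{-1}(\bigcap_i B_i) \in \mathcal{A}$
$\Rightarrow \bigcap_i B_i \in \mathcal{E}$.
Thus, $\mathcal{E}$ is closed under countable intersections. Similarly, by Lemma 2.1 it is closed under complementation also. Hence $\mathcal{E}$ is a σ-field. (*)

COROLLARY 2.2: (i) If $\mathcal{E}$ is a field (or a σ-field) of subsets of $\Omega'$, $X^{-1}(\mathcal{E})$ is a field (or a σ-field) of subsets of $\Omega$. (ii) The inverse image of the minimal σ-field over any class $\mathcal{C}$ is the minimal σ-field over $X^{-1}(\mathcal{C})$, i.e.,
$\sigma\{X^{-1}(\mathcal{C})\} = X^{-1}\{\sigma(\mathcal{C})\}$. (2.12)

Proof: (i) Let $\mathcal{E}$ be a σ-field on $\Omega'$ and $\mathcal{A}$ be the class of inverse images of the elements of $\mathcal{E}$. Then, since $\mathcal{E}$ is a σ-field,
$X^{-1}(B_1), X^{-1}(B_2), \ldots \in \mathcal{A} \Rightarrow B_1, B_2, \ldots \in \mathcal{E}$
$\Rightarrow \bigcap_i B_i \in \mathcal{E}$
$\Rightarrow X^{-1}(\bigcap_i B_i) \in \mathcal{A}$
$\Rightarrow \bigcap_i X^{-1}(B_i) \in \mathcal{A}$,
using Lemma 2.1. Thus, $\mathcal{A}$ is closed under countable intersections. Since $(X^{-1}(B))^c = X^{-1}(B^c)$, $\mathcal{A}$ is closed under complementations and hence it is a σ-field. Thus, the inverse of a σ-field is a σ-field and, in particular, the inverse of a field is a field.

(ii) Let $\sigma(\mathcal{C})$ be the minimal σ-field containing $\mathcal{C}$, a class of subsets of $\Omega'$. From (i), $X^{-1}(\sigma(\mathcal{C}))$ is a σ-field of subsets of $\Omega$. It contains $X^{-1}(\mathcal{C})$ and hence $\sigma(X^{-1}(\mathcal{C}))$. Thus,
$X^{-1}(\sigma(\mathcal{C})) \supset \sigma(X^{-1}(\mathcal{C}))$.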
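(2.13)
To prove the inclusion in the reverse direction, from Corollary 2.1, the class $\mathcal{E}$ of all subsets of $\Omega'$ whose inverse images are elements of the σ-field $\sigma(X^{-1}(\mathcal{C}))$ is a σ-field. Hence
$X^{-1}(\mathcal{C}) \subset \sigma(X^{-1}(\mathcal{C})) \Rightarrow \mathcal{C} \subset \mathcal{E} \Rightarrow \sigma(\mathcal{C}) \subset \mathcal{E} \Rightarrow X^{-1}(\sigma(\mathcal{C})) \subset X^{-1}(\mathcal{E})$.
But by definition of $\mathcal{E}$, $X^{-1}(\mathcal{E})$ equals $\sigma(X^{-1}(\mathcal{C}))$. Hence
$X^{-1}(\sigma(\mathcal{C})) \subset \sigma(X^{-1}(\mathcal{C}))$. (2.14)
(2.13) and (2.14) imply (2.12). (*)

(c) MEASURABLE FUNCTION, BOREL FUNCTION, INDUCED σ-FIELD

An important class of functions is that of real valued functions $X$ on $\Omega$ to $R$. Let $\mathcal{B}$ be the Borel field of subsets of $R$, and $\mathcal{A}$ be a σ-field of subsets of $\Omega$. If $X^{-1}(B) \in \mathcal{A}$ for all Borel sets $B \in \mathcal{B}$, $X$ is said to be an $\mathcal{A}$-measurable function, or a function measurable with respect to $\mathcal{A}$. If $\Omega$ is also the real line $R$ or its subset and if $X$ is measurable with respect to the Borel field $\mathcal{B}$ on the domain, then $X$ is called a Borel function, i.e., if $X^{-1}(B) \in \mathcal{B}$ for Borel sets $B$ in the range.

Example 2.5. The indicator function $I_A$ is a real valued function. By (2.5), $I_A^{-1}(B) \in \{\varnothing, A, A^c, \Omega\}$. Since $\mathcal{A}$ is a σ-field in $\Omega$, if $A \in \mathcal{A}$, $A^c$ also belongs to $\mathcal{A}$ and hence $I_A^{-1}(B) \in \mathcal{A} \Leftrightarrow A \in \mathcal{A}$. Thus $I_A$ is measurable iff $A \in \mathcal{A}$. A function taking a constant value for $\omega \in \Omega$ is measurable with respect to any σ-field in $\Omega$ (Why?). We can verify that, if $X$ can take three distinct values $c_1, c_2, c_3$ such that
$X(\omega) = c_i$, if $\omega \in A_i$, $i = 1, 2, 3$, (2.15)
then $\{A_1, A_2, A_3\}$ forms a partition of $\Omega$ (Why?). Then $X$ will be $\mathcal{A}$-measurable if $\{A_1, A_2, A_3\} \subset \mathcal{A}$ (see Example 2.6).

Let $X$ be a real valued function on $\Omega$ and $\mathcal{B}$ be the Borel field. By Cor. 2.2, the class of sets $X^{-1}(\mathcal{B}) = \{X^{-1}(B) : B \in \mathcal{B}\}$ is a σ-field. It is called the σ-field induced by $X$. It is also denoted by $\mathcal{B}(X)$ later on. $X$ will be $\mathcal{A}$-measurable if it induces a σ-field which is a sub-σ-field of $\mathcal{A}$. If $\mathcal{A} \subset \mathcal{A}'$ and $X$ is $\mathcal{A}$-measurable, then it is also $\mathcal{A}'$-measurable. If $\mathcal{A}'' \subset \mathcal{A}$ and $X$ is $\mathcal{A}$-measurable, then $X$ may not be $\mathcal{A}''$-measurable. $X$ would be $\mathcal{A}''$-measurable if $X^{-1}(\mathcal{B}) \subset \mathcal{A}''$ also. The smallest σ-field with respect to which $X$ is measurable is $X^{-1}(\mathcal{B})$, the σ-field induced by $X$.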
Since it is contained in every σ-field with respect to which $X$ is measurable, it is the intersection of all such σ-fields. By (2.5), $I_A^{-1}(\mathcal{B}) = \{\varnothing, A, A^c, \Omega\}$, which is the σ-field induced by $I_A$. The σ-field induced by a constant is the degenerate σ-field. Therefore, a constant function is measurable with respect to every σ-field of subsets of $\Omega$. Conversely, if $\{\varnothing, \Omega\}$ is the σ-field induced by $X$, it is a constant (Why?).

Example 2.6. Consider the function $X$ defined by (2.15). Then
$X^{-1}(B) = \varnothing$, if $B$ does not contain $c_1$ or $c_2$ or $c_3$,
$= A_1$, if $B$ contains $c_1$ only,
$= A_2$, if $B$ contains $c_2$ only,
$= A_3$, if $B$ contains $c_3$ only,
$= A_1 + A_2$, if $B$ contains $c_1$ and $c_2$,
$= A_1 + A_3$, if $B$ contains $c_1$ and $c_3$,
$= A_2 + A_3$, if $B$ contains $c_2$ and $c_3$,
$= \Omega$, if $B$ contains $c_1$, $c_2$ and $c_3$.
Hence
$X^{-1}(\mathcal{B}) = \{\varnothing, A_1, A_2, A_3, A_1 + A_2, A_1 + A_3, A_2 + A_3, \Omega\}$, (2.16)
which is the minimal σ-field containing the partition $\{A_1, A_2, A_3\}$.

In general, if
$X(\omega) = c_i$, if $\omega \in A_i$, $i = 1, 2, \ldots, n$, (2.17)
$\{A_1, \ldots, A_n\}$ forms a partition of $\Omega$. The σ-field induced by $X$ is the minimal σ-field containing this partition. (Why?) Now
$X = \sum_{i=1}^{n} c_i I_{A_i}$ (2.18)
in terms of indicator functions of disjoint sets of the partition $\{A_i\}$. Since we are concerned with one-valued functions only, any function taking finitely many distinct values $\{c_1, c_2, \ldots, c_n\}$ can be written in the form (2.18), where
$A_i = X^{-1}(\{c_i\})$, $i = 1, 2, \ldots, n$. (2.19)
Hence the $A_i$'s are evidently disjoint and form a partition of $\Omega$.

A finite linear combination of indicators of sets (not necessarily disjoint) is called a simple function. If $\Omega$ is finite, all real valued functions will be simple. (Why?)

LEMMA 2.2: Any simple function $X$ can be written in the form
$X = \sum_{i=1}^{n} x_i I_{A_i}$, (2.20)
where the $x_i$'s are distinct numerical constants and $\{A_1, A_2, \ldots, A_n\}$ is a partition of $\Omega$.

Proof: Let $X$ be a linear function of $I_{B_1}$ and $I_{B_2}$ only, say
$X = c_1 I_{B_1} + c_2 I_{B_2}$. (2.21)
Then
$X(\omega) = 0$, if $\omega \notin B_1$ and $\omega \notin B_2$, i.e., $\omega \in B_1^c B_2^c = A_1$,
$= c_1$, if $\omega \in B_1$ and $\omega \notin B_2$, i.e., $\omega \in B_1 B_2^c = A_2$,
$= c_2$, if $\omega \notin B_1$ and $\omega \in B_2$, i.e., $\omega \in B_1^c B_2 = A_3$,
$= c_1 + c_2$, if $\omega \in B_1$ and $\omega \in B_2$, i.e., $\omega \in B_1 B_2 = A_4$.
Hence $X(\omega) = x_i$ for $\omega \in A_i$, where $x_1 = 0$, $x_2 = c_1$, $x_3 = c_2$ and $x_4 = c_1 + c_2$. Moreover $\{A_1, A_2, A_3, A_4\}$ forms a partition of $\Omega$. Thus, the Lemma is proved for the linear combination of 2 indicators. Suppose in general $X$ is a linear combination of $m$ indicators, i.e.,
$X = \sum_{j=1}^{m} c_j I_{B_j}$. (2.22)
Then
$X(\omega) = c_{j_1} + c_{j_2} + \cdots + c_{j_k}$ (2.23)
if $\omega \in B_{j_1} B_{j_2} \cdots B_{j_k} B_{j_{k+1}}^c \cdots B_{j_m}^c = A_i$, say, where $(j_1, j_2, \ldots, j_m)$ is a permutation of $(1, 2, \ldots, m)$ and $k = 0, 1, \ldots, m$. Thus, $X$ can have at most $2^m$ values which are distinct. The sets $A_i$ being disjoint, $\{A_i\}$ forms a partition of $\Omega$. The Lemma follows by taking for $n$ the number of distinct values of $X$. (*)

Thus, a simple function takes only a finite number of values $x_1, x_2, \ldots, x_n$ (say), and we have seen from the earlier discussion that, conversely, a function taking a finite number of values is simple. $\{A_i\}$ is the partition on which $X$ takes distinct values. It is called the partition induced by $X$. In fact
$A_i = [\omega : X(\omega) = x_i] = X^{-1}(\{x_i\})$, $i = 1, \ldots, n$.
We may note that the partitions induced by $X$ and $cX$ $(c \ne 0)$ are the same. Henceforward we shall assume that a simple function is of the form (2.20), with the $x_i$'s distinct and the $A_i$'s disjoint, forming a partition of $\Omega$.

LEMMA 2.3: The σ-field induced by a simple function (2.20) is the minimal σ-field containing the partition $\{A_1, A_2, \ldots, A_n\}$.

Proof: Let $B$ be any Borel subset of the real line containing $\{x_{i_1}, x_{i_2}, \ldots, x_{i_p}\} \subset \{x_1, x_2, \ldots, x_n\}$. Then $X(\omega) \in B$ iff $X(\omega) = x_{i_1}$, or $x_{i_2}$, or ..., or $x_{i_p}$. But $X(\omega) = x_i$ if $\omega \in A_i$. Hence
$[\omega : X(\omega) \in B] = X^{-1}(B) = \sum_{k=1}^{p} A_{i_k} \in \sigma\{A_1, A_2, \ldots, A_n\}$.
Since the sets of $\sigma\{A_i\}$ are sums of sets of $\{A_i\}$, the Lemma follows. (*)

COROLLARY 2.3: The σ-field induced by $X = \sum_{j=1}^{m} c_j I_{B_j}$ is contained in (but need not be equal to) the minimal σ-field containing $B_1, B_2, \ldots, B_m$.

The Corollary follows from Lemmas 2.2 and 2.3, since each one of the $A_i$'s belongs to $\sigma\{B_1, B_2, \ldots, B_m\}$.

A countable linear combination of indicator functions is called an
elementary function. A few authors do not distinguish simple and elementary functions and call them both simple functions. But we shall restrict ourselves to the above definition.

Without loss of generality, we can assume that an elementary function is the linear combination of a countable number of indicator functions of disjoint sets,
$X = \sum_{i=1}^{\infty} x_i I_{A_i}$;
then $A_i = X^{-1}(\{x_i\}) = [\omega : X(\omega) = x_i] = [X = x_i]$, say, and $\{A_i\}$ forms a partition of $\Omega$. (Why?)

If $B$ is a Borel set and the domain $\Omega = R$, the indicator function of $B$ is a Borel function. A linear combination of indicator functions of Borel sets is also $\mathcal{B}$-measurable and is hence a Borel function. This follows from the fact that if $B_i \in \mathcal{B}$ $(i = 1, 2, \ldots)$, then $\{B_i\} \subset \mathcal{B}$ and $\sigma\{B_i\} \subset \mathcal{B}$. Similarly, all the results we prove for $\mathcal{A}$-measurable functions will hold in particular for Borel functions.

If $X(\omega) = \omega + 2$, $\omega \in R$, then $X^{-1}(a, b) = (a - 2, b - 2) \in \mathcal{B}$. In general, $X^{-1}(B) \in \mathcal{B}$ for any Borel set $B$ in the range space (see Lemma 2.5), and hence $X$ is a Borel function. If $X(\omega) = \omega^2$, $\omega \in R$, then $X^{-1}(a, b)$, $(b > a > 0)$, equals $[\omega : \omega^2 \in (a, b)] = (-\sqrt{b}, -\sqrt{a}) \cup (\sqrt{a}, \sqrt{b}) \in \mathcal{B}$. In general, $X^{-1}(B) \in \mathcal{B}$ for any Borel set $B \subset R$, the range space, with $\mathcal{B}$ in the domain. Hence $X$ is a Borel function. Similarly, we can establish that any 1-1 monotone function $f(\omega)$, which has an inverse point function $f^{-1}$, is a Borel function (Prove!).

(d) FUNCTION OF A FUNCTION

If $X$ is a function or a mapping from $\Omega$ to $\Omega'$ and $X'$ is a function from $\Omega'$ to $\Omega''$, then $X'(X(\omega))$ is a function from $\Omega$ to $\Omega''$, denoted by $X'X$ or $X'(X)$. It is said to be a function of a function or a composition of the two functions $X$ and $X'$. Its inverse $(X'X)^{-1}$ is a function on the subsets of $\Omega''$ to the subsets of $\Omega$ such that for any $B \subset \Omega''$,
$(X'X)^{-1}(B) = [\omega : X'(X(\omega)) \in B]$
$= [\omega : X(\omega) \in X'^{-1}(B)]$
$= X^{-1}(X'^{-1}(B))$.
(2.24)
Thus, $(X'X)^{-1} = X^{-1}X'^{-1}$.

LEMMA 2.4: A Borel function of an $\mathcal{A}$-measurable function $X$ is $\mathcal{A}$-measurable and induces a sub-σ-field of that induced by $X$.

Proof: If $f(X)$ is the Borel function of $X$ and $B$ is any Borel set,
$[\omega : f(X(\omega)) \in B] = [\omega : X(\omega) \in f^{-1}(B)]$.
But, if $B$ is a Borel set, $f^{-1}(B)$ is a Borel set $B'$ and hence
$(f(X))^{-1}(B) = [\omega : X(\omega) \in B'] = X^{-1}(B') \in X^{-1}(\mathcal{B}) \subset \mathcal{A}$.
Thus $f(X)$ is $\mathcal{A}$-measurable and induces a sub-σ-field of $X^{-1}(\mathcal{B})$, which is the σ-field induced by $X$. (*)

In Example 1.5, suppose we associate with each $\omega$ a number representing (No. of heads - No. of tails), called $X(\omega)$. We can obtain the σ-fields induced by $X$ and $X^2$ and verify that the σ-field induced by $X^2$ is a sub-σ-field of that induced by $X$. We may note that $X^2$ is a Borel function of an $\mathcal{A}$-measurable function $X$.

2.2 RANDOM VARIABLES

(a) ECONOMICAL DEFINITION

So far, in the definition of a function we have not assumed that $\Omega$ is the sample space. From here onwards we shall restrict ourselves to the case where $\Omega$ is the sample space. Let $\mathcal{A}$ be the σ-field of events associated with a certain fixed experiment. Any real valued, $\mathcal{A}$-measurable function defined on $\Omega$ is called a random variable (r.v.). Thus $X$ is a r.v. if $X^{-1}(\mathcal{B})$, the σ-field induced by $X$, is contained in $\mathcal{A}$, or equivalently, $X^{-1}(B)$ is an event for every Borel set $B$ in the range space of $X$.

Example 2.7. From (2.5), if $A \in \mathcal{A}$, $I_A^{-1}(B) \in \mathcal{A}$, i.e., if $A$ is an event, $I_A$ is a random variable. By Lemma 2.3, a simple function $X = \sum_{i=1}^{n} x_i I_{A_i}$ induces the σ-field containing the partition $\{A_1, A_2, \ldots, A_n\}$. If this partition is contained in $\mathcal{A}$, i.e., if each one of the $A_i$'s is an event, then $X$ is a simple r.v.

Since singletons $\{x\}$, $(x \in R)$, belong to $\mathcal{B}$, $[\omega : X(\omega) = x] = X^{-1}(\{x\}) \in \mathcal{A}$. From Lemma 2.4, a Borel function $g(X)$ of a r.v. $X$ is a r.v. and induces a σ-field which is a sub-σ-field of that which is induced by $X$.
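The statement that a Borel function $g(X)$ induces a sub-σ-field of the one induced by $X$ can be checked by brute force on the three-toss sample space, with $X$ = (No. of heads - No. of tails) and $g(x) = x^2$. The sketch below is only an illustration in Python; it relies on the fact (Lemma 2.3) that on a finite space the induced σ-field consists of all unions of blocks of the induced partition, and the function names are hypothetical.

```python
from itertools import combinations, product

# Sample space of three coin tosses (8 outcomes).
omega = ["".join(t) for t in product("HT", repeat=3)]

def X(w):
    """Number of heads minus number of tails in the outcome w."""
    return w.count("H") - w.count("T")

def induced_partition(f, space):
    """Partition of the space into the sets [f = value], one block per value."""
    blocks = {}
    for w in space:
        blocks.setdefault(f(w), set()).add(w)
    return [frozenset(b) for b in blocks.values()]

def induced_sigma_field(f, space):
    """All unions of blocks of the induced partition (finite case only)."""
    blocks = induced_partition(f, space)
    field = set()
    for r in range(len(blocks) + 1):
        for chosen in combinations(blocks, r):
            field.add(frozenset().union(*chosen))
    return field

field_x = induced_sigma_field(X, omega)
field_x2 = induced_sigma_field(lambda w: X(w) ** 2, omega)

print(len(field_x), len(field_x2))   # 16 4
print(field_x2 <= field_x)           # True
```

Here $X$ takes the four values $-3, -1, 1, 3$, so it induces $2^4 = 16$ events, while $X^2$ takes only the values 1 and 9 and induces $2^2 = 4$ events, each of which is a union of blocks of the partition induced by $X$.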
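For a given $X$, define two non-negative functions
$X^+(\omega) = X(\omega)$, if $X(\omega) \ge 0$,
$= 0$, if $X(\omega) < 0$, (2.25)
and
$X^-(\omega) = -X(\omega)$, if $X(\omega) < 0$,
$= 0$, if $X(\omega) \ge 0$, (2.26)
called, respectively, the positive and negative parts of $X$. Then $X^+$ and $X^-$ are Borel functions of $X$ (Why?) and will be random variables if $X$ is a r.v. We may note that $X^+$ and $X^-$ are non-negative functions such that, for all $\omega$,
$X(\omega) = X^+(\omega) - X^-(\omega)$,
$|X(\omega)| = X^+(\omega) + X^-(\omega)$.
These functions play an important role in the theory of integration to be developed in later chapters.

To verify whether a certain function is a random variable, it is not necessary to determine whether $X^{-1}(B) \in \mathcal{A}$ for every $B \in \mathcal{B}$. It is sufficient if we verify $X^{-1}(\mathcal{C}) \subset \mathcal{A}$, where $\mathcal{C}$ is any class of subsets of $R$ given in equation (1.32), which generates $\mathcal{B}$. Hence we have the following 'economical definition' of a r.v.

LEMMA 2.5: $X$ is a r.v. if $X^{-1}(\mathcal{C}) \subset \mathcal{A}$, where $\mathcal{C}$ is any class of subsets of $R$ which generates $\mathcal{B}$.

Proof: We have to prove that
$X^{-1}(\mathcal{C}) \subset \mathcal{A} \Leftrightarrow X^{-1}(\mathcal{B}) \subset \mathcal{A}$. (2.27)
Since $\mathcal{C} \subset \mathcal{B}$, $X^{-1}(\mathcal{B}) \subset \mathcal{A}$ implies $X^{-1}(\mathcal{C}) \subset \mathcal{A}$. To prove the converse, since $\mathcal{A}$ is a σ-field,
$X^{-1}(\mathcal{C}) \subset \mathcal{A} \Rightarrow \sigma(X^{-1}(\mathcal{C})) \subset \mathcal{A}$
$\Rightarrow X^{-1}(\sigma(\mathcal{C})) \subset \mathcal{A}$ (by Cor. 2.2)
$\Rightarrow X^{-1}(\mathcal{B}) \subset \mathcal{A}$.
Hence we have the Lemma. (*)

From Lemma 2.5, to verify that $X$ is a r.v. it is sufficient to verify that $X^{-1}(-\infty, x) \in \mathcal{A}$ (say), for all $x \in R$. Thus some authors define a r.v. $X$ to be a real valued function for which $X^{-1}(-\infty, x)$ [...] $[\omega : X(\omega) > x] = X^{-1}(x, \infty)$ are events $\forall x \in R$, for establishing that $X$ is a r.v. We can prove directly that $X^+$, $X^-$ and $|X|$ are r.v.'s, if $X$ is a r.v. (Prove!).

(b) VECTOR RANDOM VARIABLE

With any $\omega \in \Omega$ associate $Z(\omega) = (X(\omega), Y(\omega))$, a point in the 2-dimensional Euclidean space $R^2$. $Z$ defines a function from $\Omega$ to $R^2$. In $R^2$ consider the class $\mathcal{C}$ of all rectangles bounded by the lines $x = a$, $x = b$, [...]

[...] $X^{-1}(\mathcal{B})$. (2.32)
Similarly, $Z^{-1}(\mathcal{B}_2) \supset Y^{-1}(\mathcal{B})$.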
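Therefore, $Z^{-1}(\mathcal{B}_2) \supset X^{-1}(\mathcal{B}) \cup Y^{-1}(\mathcal{B})$ and, since $Z^{-1}(\mathcal{B}_2)$ is a σ-field,
$Z^{-1}(\mathcal{B}_2) \supset \sigma[X^{-1}(\mathcal{B}) \cup Y^{-1}(\mathcal{B})]$. (2.33)
To prove the reverse inclusion, we see that
$Z^{-1}(B) = X^{-1}(a, b) \cap Y^{-1}(c, d) \in \sigma[X^{-1}(\mathcal{B}) \cup Y^{-1}(\mathcal{B})]$.
Since the R.H.S. of (2.33) is a σ-field and since $\{Z^{-1}(B), B \text{ a rectangle}\} \subset \sigma[X^{-1}(\mathcal{B}) \cup Y^{-1}(\mathcal{B})]$,
$\sigma\{Z^{-1}(B), B \text{ a rectangle}\} \subset \sigma[X^{-1}(\mathcal{B}) \cup Y^{-1}(\mathcal{B})]$,
i.e., by (2.12),
$Z^{-1}(\mathcal{B}_2) \subset \sigma[X^{-1}(\mathcal{B}) \cup Y^{-1}(\mathcal{B})]$. (2.34)
From (2.33) and (2.34) we have Lemma 2.6. (*)

The Lemma implies that $(X, Y)$ is a vector r.v. iff $X$ and $Y$ are r.v.'s.

Example 2.9. To obtain the σ-field induced by $(I_A, I_B)$, consider
$I_A^{-1}(\mathcal{B}) = \{\varnothing, A, A^c, \Omega\}$,
$I_B^{-1}(\mathcal{B}) = \{\varnothing, B, B^c, \Omega\}$,
$I_A^{-1}(\mathcal{B}) \cup I_B^{-1}(\mathcal{B}) = \{\varnothing, A, B, A^c, B^c, \Omega\}$,
$\sigma[I_A^{-1}(\mathcal{B}) \cup I_B^{-1}(\mathcal{B})] = \sigma\{AB, A^cB, AB^c, A^cB^c\} = \sigma\{A, B\}$,
which is the σ-field induced by $(I_A, I_B)$, as can be verified from Example (2.8). $\{AB, A^cB, AB^c, A^cB^c\}$ is a partition of $\Omega$ and is called the partition induced by $(I_A, I_B)$.

In general, if $X = \sum_{i=1}^{n} x_i I_{A_i}$, $Y = \sum_{j=1}^{m} y_j I_{B_j}$ are simple functions, the σ-field induced by $(X, Y)$ is the minimal σ-field containing $\{A_i B_j, i = 1, 2, \ldots, n; j = 1, 2, \ldots, m\}$ (Why?).

A Borel function on $R^2$ to $R$ (or $R^2$) is a function such that the inverse of a Borel set in $R$ (or $R^2$) is a Borel set in $R^2$.

LEMMA 2.7: Any Borel function of a vector r.v. $(X, Y)$ is a r.v.

The proof follows from Lemma 2.4, since a Borel function of a vector r.v. $(X, Y)$ induces a σ-field which is a sub-σ-field of that induced by $(X, Y)$.

It follows from Lemma 2.7 that if $(X, Y)$ is a vector random variable, $X + Y$, $X/Y$ $(Y \ne 0)$, $XY$, etc., which are Borel functions (Prove!) of $(X, Y)$, are r.v.'s. If $X$ and $Y$ are r.v.'s, one can also directly prove that $X + Y$ is a r.v. without using Lemma 2.7, as follows: By Lemma 2.5, $X + Y$ would be a r.v. if for all $x \in R$, $[X + Y < x]$ [...]

[...]

$Z_n(\omega) = \sup_{k \ge n} X_k(\omega)$ = least upper bound of $X_k(\omega)$ for $k \ge n$. Then, as $n \to \infty$,
$Y_n(\omega) \uparrow \sup_n Y_n(\omega) = \liminf X_n(\omega) = \underline{\lim} X_n(\omega)$,
$Z_n(\omega) \downarrow \inf_n Z_n(\omega) = \limsup X_n(\omega) = \overline{\lim} X_n(\omega)$,
and $\underline{\lim} X_n(\omega)$ and $\overline{\lim} X_n(\omega)$ will exist for all $\omega \in \Omega$. Moreover $\overline{\lim} X_n(\omega) \ge \underline{\lim} X_n(\omega)$, $\forall \omega$. We say that $\{X_n\}$ converges to $X$ on $A$, denoted $X_n(\omega) \to X(\omega)$ for $\omega \in A$, [...]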
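$\overline{\lim} X_n(\omega) = \underline{\lim} X_n(\omega) = X(\omega)$, [...] $\epsilon > 0$, there exists $n(\epsilon, \omega)$ such that for $n \ge n(\epsilon, \omega)$, $|X_n(\omega) - X(\omega)| < \epsilon$. If $n(\epsilon, \omega)$ is independent of $\omega \in A$, then the convergence is said to be uniform on $A$. $A$ is the set of convergence [...] is the set of all $\omega$ where $\{X_n\}$ does not converge. The set of all $\omega$ where $\overline{\lim} X_n(\omega) = \underline{\lim} X_n(\omega) = X(\omega) = \pm\infty$ is the set of divergence. If $\Omega$ is the set of convergence, then $\{X_n\}$ is said to converge everywhere. If the set of convergence is $\varnothing$, $\{X_n\}$ converges nowhere. [...]

Example 2.9. Let $\{A_n\}$ be a sequence of sets and $\{I(A_n)\}$ be the sequence of their indicator functions. Then
$\inf_{k \ge n} I(A_k)(\omega) = 1$, if $\omega \in A_k$ for all $k \ge n$,
$= 0$, otherwise.
Hence,
$\inf_{k \ge n} I(A_k) = I(\bigcap_{k \ge n} A_k)$.
Similarly,
$\sup_{k \ge n} I(A_k)(\omega) = I(\bigcup_{k \ge n} A_k)(\omega) = 1$, if $\omega \in A_k$ for some $k \ge n$,
$= 0$, otherwise.
$\liminf I(A_n)(\omega) = I(\liminf A_n)(\omega) = 1$, if $\omega \in$ all but a finite number of $A_n$'s,
$\limsup I(A_n)(\omega) = I(\limsup A_n)(\omega) = 1$, if $\omega \in$ infinitely many $A_n$'s.
If $\lim A_n$ exists, then $I(\lim A_n) = \lim I(A_n)$.

LEMMA 2.8: $Y_n$, $Z_n$, $\overline{\lim} X_n$, $\underline{\lim} X_n$ are extended r.v.'s and $\lim X_n$ (when it exists) is also an extended r.v.

Proof: Consider the function $Y_n$. Evidently it is real valued or $\pm\infty$. For any $x \in R$,
$[Y_n < x] = [\omega : \text{at least one of } X_k(\omega), k \ge n, \text{ is less than } x] = \bigcup_{k \ge n} [X_k < x]$. (2.43)
Since the $X_k$'s are r.v.'s, $[Y_n < x] \in \mathcal{A}$. Similarly,
$[Z_n > x] = \bigcup_{k \ge n} [X_k > x] \in \mathcal{A}$, (2.44)
for all $x \in R$ and all $n$. Hence $Z_n$ is an extended r.v. But supremum and infimum of extended r.v.'s are extended r.v.'s. Hence $\sup Y_n$ and $\inf Z_n$ are extended r.v.'s, i.e., $\underline{\lim} X_n$ and $\overline{\lim} X_n$ are extended r.v.'s. If $\lim X_n$ exists on $\Omega$, then it is equal to $\overline{\lim} X_n$ (say) and is hence an extended r.v.

If $A$ is the set of convergence of $\{X_n\}$, the limit of $\{X_n\}$ will be finite on $A$ only and
$A = [\omega : -\infty < \underline{\lim} X_n(\omega) = \overline{\lim} X_n(\omega) < \infty]$. (2.45)
If $V = \overline{\lim} X_n - \underline{\lim} X_n$, then $V$ is an extended real valued non-negative random variable on the set where it exists and hence
$A = V^{-1}(\{0\}) \in \mathcal{A}$.
Note that $V$ does not exist on the set of divergence, i.e., where $\overline{\lim} X_n = \underline{\lim} X_n = \pm\infty$. Define
$X(\omega) = \lim X_n(\omega)$, say, for $\omega \in A$,
$= \infty$, for $\omega \in$ set of divergence.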
Then for $B \in \mathcal{B}$, since $\lim X_n$ is an extended r.v.,
$X^{-1}(B) = (\lim X_n)^{-1}(B) \in \mathcal{A}$. (2.46)
Since the set of divergence is measurable, $X$ is an extended r.v. (*)

It may be noted that, since $Y_n$ is a function of $(X_n, X_{n+1}, \ldots)$, it is $\mathcal{B}_n$-measurable. If $T_n = \sup_{k \ge n} Y_k$, then $T_n$ is also $\mathcal{B}_n$-measurable. But $T_n = \underline{\lim} X_n$. Thus, $\underline{\lim} X_n$ is measurable with respect to $\mathcal{B}_n$ for all $n = 1, 2, \ldots$. Since $\mathcal{B}_n \downarrow \mathcal{T}$, $\underline{\lim} X_n$ is $\mathcal{T}$-measurable. Similarly, $\overline{\lim} X_n$ and $\lim X_n$ are also $\mathcal{T}$-measurable. Since $V$ is $\mathcal{T}$-measurable, $A \in \mathcal{T}$. Thus, the set of convergence is a tail event.

We have seen that if a r.v. $X$ takes only a finite number of values $x_1, x_2, \ldots, x_n$, then it is a simple r.v. If it takes a countable number of values $x_1, x_2, \ldots$, then it can be represented by
$X = \sum_{i=1}^{\infty} x_i I_{A_i}$,
and is called an elementary r.v., where $A_i = [\omega : X(\omega) = x_i]$ $(i = 1, 2, \ldots)$. The partition $\{A_i\}$ induced by $X$ contains an infinite number of events, and so also does the σ-field $\sigma\{A_i\}$ induced by $X$. A discrete r.v. is a simple or an elementary r.v.

(c) CONSTRUCTIVE DEFINITION

The following Theorem 2.1 gives us a constructive definition of a r.v. It tells us that we can get an arbitrary r.v. as the pointwise (finite) limit of either simple r.v.'s or elementary r.v.'s.

THEOREM 2.1: The class of r.v.'s = the class of finite limits of sequences of simple r.v.'s = the class of uniform (finite) limits of sequences of elementary r.v.'s.

Proof: (i) Let $\mathcal{X}$ be the class of all r.v.'s and $\mathcal{S}$ be the class of finite limits of sequences of simple r.v.'s. Let $S$ be an element of $\mathcal{S}$. $S$ is a limit of a sequence of simple r.v.'s. By Lemma 2.8, finite limits of r.v.'s are r.v.'s. Hence $S$ is a r.v. and $S \in \mathcal{X}$. Conversely, if $X$ is any r.v., define a real valued function $X_n$ by
$X_n(\omega) = n$, if $X(\omega) \ge n$,
$= (k-1)/2^n$, if $(k-1)/2^n \le X(\omega) < k/2^n$ $(k = -n2^n + 1, \ldots, n2^n)$,
$= -n$, if $X(\omega) < -n$,
i.e.,
$X_n = -nI[X < -n] + nI[X \ge n] + \sum_{k=-n2^n+1}^{n2^n} \frac{k-1}{2^n} I[(k-1)/2^n \le X < k/2^n]$. (2.47)
[...] $X_n(\omega) \to X(\omega)$ as $n \to \infty$ for all $\omega$, implying that if $X \in \mathcal{X}$, then $X \in \mathcal{S}$.

(ii) Let $\mathcal{E}$ be the class of uniform limits of elementary r.v.'s. Then, by Lemma 2.8, if $E$ is an element of $\mathcal{E}$, then $E \in \mathcal{X}$.
To prove the converse, define
$X_n = \sum_{k=-\infty}^{\infty} \frac{k}{2^n} I[k/2^n \le X < (k+1)/2^n]$. (2.48)
Now $X_n$ is an elementary r.v. and $|X_n(\omega) - X(\omega)| < 2^{-n}$ for all $\omega$. Thus, $X$ is the uniform limit of the $X_n$'s. Hence any $X$ belonging to $\mathcal{X}$ is also an element of $\mathcal{E}$. The theorem is proved. (*)

COROLLARY 2.4: If $X \ge 0$, there exists a monotone increasing sequence of non-negative simple functions $\{X_n\}$ converging to $X$.

Proof: Define
$X_n(\omega) = k2^{-n}$, if $k2^{-n} \le X(\omega) < (k+1)2^{-n}$ $(k = 0, 1, \ldots, n2^n - 1)$,
$= n$, if $X(\omega) \ge n$, (2.49)
so that $X_n(\omega) \ge 0$ for all $n$ and $\omega$. Moreover,
$X_{n+1}(\omega) = 2k \cdot 2^{-(n+1)}$, if $2k \cdot 2^{-(n+1)} \le X(\omega) < (2k+1) 2^{-(n+1)}$,
$= (2k+1) 2^{-(n+1)}$, if $(2k+1) 2^{-(n+1)} \le X(\omega) < (2k+2) 2^{-(n+1)}$, $k = 0, \ldots, (n+1)2^n - 1$,
$= n + 1$, if $X(\omega) \ge n + 1$.
Comparing with $X_n(\omega)$, we see that $X_{n+1}(\omega) \ge X_n(\omega)$ for all $n$ and $\omega$. Since $X_n(\omega) \le X(\omega)$ and $|X_n(\omega) - X(\omega)| < 2^{-n}$ for $X(\omega) < n$, $0 \le X_n(\omega) \uparrow X(\omega)$ for all $\omega \in \Omega$. Hence we have the Corollary. (*)

Theorem 2.1 and Corollary 2.4 will play a very important role in the theory of integration, where we build up the properties of integrals starting with simple functions, then deriving them for non-negative functions and, later, for arbitrary functions. With any r.v. $X$ we may associate two non-negative r.v.'s defined by (2.25) and (2.26) or, equivalently, by
$X^+ = XI[X \ge 0]$,
$X^- = -XI[X < 0]$. (2.50)
If $X$ is a r.v., $[\omega : X(\omega) \ge 0]$ and $[\omega : X(\omega) < 0]$ are events and hence the indicators of these sets are r.v.'s. Thus, $X^+$ is the product of two r.v.'s and is hence a r.v. Similarly $X^-$ is also a r.v. Since they are non-negative, there exist monotone increasing sequences of simple functions converging to them. We may note that since
$X = X^+ - X^-$,
$|X| = X^+ + X^-$, (2.51)
$X$ and $|X|$ are linear functions of $X^+$ and $X^-$.

In this section so far we were concerned with $\mathcal{A}$-measurable real valued functions on $\Omega$. If $\Omega$ is the real line with $\mathcal{B}$ as the σ-field, then the above theorem implies that any Borel measurable real valued function, i.e., Borel function, is a limit of simple Borel functions.
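The construction (2.49) in the proof of Corollary 2.4 is easy to evaluate numerically. The following is a minimal sketch in Python, in which `dyadic_approx` is a hypothetical helper name and the value $x = X(\omega)$ is supplied directly as a number; it is meant only to illustrate the monotone convergence $X_n(\omega) \uparrow X(\omega)$, not to stand in for the proof.

```python
import math

def dyadic_approx(x, n):
    """Value X_n(w) of (2.49) when X(w) = x >= 0.

    Returns k / 2**n for the k with k / 2**n <= x < (k + 1) / 2**n
    when x < n, and n when x >= n.
    """
    if x < 0:
        raise ValueError("the construction (2.49) assumes X >= 0")
    if x >= n:
        return float(n)
    return math.floor(x * 2**n) / 2**n

x = math.pi
approximations = [dyadic_approx(x, n) for n in range(1, 11)]

# The sequence never decreases and never exceeds x.
assert all(a <= b for a, b in zip(approximations, approximations[1:]))
assert all(a <= x for a in approximations)

# Once n exceeds x, the error is smaller than 2**(-n).
for n in range(4, 11):
    assert x - dyadic_approx(x, n) < 2.0 ** (-n)

print(approximations[:4])   # [1.0, 2.0, 3.0, 3.125]
```

For small $n$ the approximation is capped at $n$ (the second branch of (2.49)); only after $n$ exceeds $X(\omega)$ does the dyadic grid take over, and from then on the error is below $2^{-n}$.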
But simple Borel functions are step functions which increase or decrease in steps only. Thus, Borel functions are step functions or limits of step functions.

THEOREM 2.2: Continuous real valued functions on $R$ are Borel functions.

Proof: Let $g(\cdot)$ be a continuous function from $R$ to $R$. Consider $g(X)$, where $X$ is a $\mathcal{B}$-measurable function from $R$ to $R$. By Theorem 2.1, there exists a sequence $\{X_n\}$ of elementary functions converging uniformly to $X$. Since $X_n$ takes only a countable number of distinct values, $g(X_n)$ also takes only a countable number of distinct values and is, therefore, an elementary function from $R$ to $R$. Since $g$ is continuous, as $X_n(\omega) \to X(\omega)$, $g(X_n(\omega)) \to g(X(\omega))$. Thus, $g(X)$ is a limit of elementary Borel-measurable functions and is hence Borel-measurable. Take $X$ to be the identity function with $X(\omega) = \omega$. This is a Borel function. Then $g(X(\omega)) = g(\omega)$. Since $g(X)$ is shown to be Borel measurable for any measurable function $X$, $g(\cdot)$ is a Borel function. (*)

Since continuous functions are Borel functions and, by Lemma 2.8, the class of Borel functions is closed under the formation of finite limits, limits of continuous functions are also Borel functions. But continuous functions or their limits are called Baire functions. Theorem 2.2 implies that a Baire function is a Borel function. It may be noted that limits of continuous functions need not be continuous,
