Applied Probability
By: Paul E. Pfeiffer
Online: <http://cnx.org/content/col10708/1.4/>
CONNEXIONS
Rice University, Houston, Texas
©2008
Paul E. Pfeiffer
This selection and arrangement of content is licensed under the Creative Commons Attribution License: http://creativecommons.org/licenses/by/3.0/
Table of Contents

Preface to Pfeiffer Applied Probability

1 Probability Systems
  1.1 Likelihood
  1.2 Probability Systems
  1.3 Interpretations
  1.4 Problems on Probability Systems
  Solutions

2 Minterm Analysis
  2.1 Minterms
  2.2 Minterms and MATLAB Calculations
  2.3 Problems on Minterm Analysis
  Solutions

3 Conditional Probability
  3.1 Conditional Probability
  3.2 Problems on Conditional Probability
  Solutions

4 Independence of Events
  4.1 Independence of Events
  4.2 MATLAB and Independent Classes
  4.3 Composite Trials
  4.4 Problems on Independence of Events
  Solutions

5 Conditional Independence
  5.1 Conditional Independence
  5.2 Patterns of Probable Inference
  5.3 Problems on Conditional Independence
  Solutions

6 Random Variables and Probabilities
  6.1 Random Variables and Probabilities
  6.2 Problems on Random Variables and Probabilities
  Solutions

7 Distribution and Density Functions
  7.1 Distribution and Density Functions
  7.2 Distribution Approximations
  7.3 Problems on Distribution and Density Functions
  Solutions

8 Random Vectors and Joint Distributions
  8.1 Random Vectors and Joint Distributions
  8.2 Random Vectors and MATLAB
  8.3 Problems on Random Vectors and Joint Distributions
  Solutions

9 Independent Classes of Random Variables
  9.1 Independent Classes of Random Variables
  9.2 Problems on Independent Classes of Random Variables
  Solutions

10 Functions of Random Variables
  10.1 Functions of a Random Variable
  10.2 Function of Random Vectors
  10.3 The Quantile Function
  10.4 Problems on Functions of Random Variables
  Solutions

11 Mathematical Expectation
  11.1 Mathematical Expectation: Simple Random Variables
  11.2 Mathematical Expectation: General Random Variables
  11.3 Problems on Mathematical Expectation
  Solutions

12 Variance, Covariance, Linear Regression
  12.1 Variance
  12.2 Covariance and the Correlation Coefficient
  12.3 Linear Regression
  12.4 Problems on Variance, Covariance, Linear Regression
  Solutions

13 Transform Methods
  13.1 Transform Methods
  13.2 Convergence and the Central Limit Theorem
  13.3 Simple Random Samples and Statistics
  13.4 Problems on Transform Methods
  Solutions

14 Conditional Expectation, Regression
  14.1 Conditional Expectation, Regression
  14.2 Problems on Conditional Expectation, Regression
  Solutions

15 Random Selection
  15.1 Random Selection
  15.2 Some Random Selection Problems
  15.3 Problems on Random Selection
  Solutions

16 Conditional Independence, Given a Random Vector
  16.1 Conditional Independence, Given a Random Vector
  16.2 Elements of Markov Sequences
  16.3 Problems on Conditional Independence, Given a Random Vector
  Solutions

17 Appendices
  17.1 Appendix A to Applied Probability: Directory of m-functions and m-procedures
  17.2 Appendix B to Applied Probability: Some mathematical aids
  17.3 Appendix C: Data on some common distributions
  17.4 Appendix D to Applied Probability: The standard normal distribution
  17.5 Appendix E to Applied Probability: Properties of mathematical expectation
  17.6 Appendix F to Applied Probability: Properties of conditional expectation, given a random vector
  17.7 Appendix G to Applied Probability: Properties of conditional independence, given a random vector
  17.8 MATLAB files for "Problems" in "Applied Probability"
  Solutions

Index
Attributions
Preface to Pfeiffer Applied Probability

(This content is available online at <http://cnx.org/content/m23242/1.7/>.)
The course
This is a "first course" in the sense that it presumes no previous course in probability. The units are modules taken from the unpublished text: Paul E. Pfeiffer, ELEMENTS OF APPLIED PROBABILITY, USING MATLAB. The units are numbered as they appear in the text, although of course they may be used in any desired order. For those who wish to use the order of the text, an outline is provided, with indication of which modules contain the material.

The mathematical prerequisites are ordinary calculus and the elements of matrix algebra. A few standard series and integrals are used, and double integrals are evaluated as iterated integrals. The reader who can evaluate simple integrals can learn quickly from the examples how to deal with the iterated integrals used in the theory of expectation and conditional expectation. Appendix B (Section 17.2) provides a convenient compendium of mathematical facts used frequently in this work. And the symbolic toolbox, implementing MAPLE, may be used to evaluate integrals, if desired.

In addition to an introduction to the essential features of basic probability in terms of a precise mathematical model, the work describes and employs user defined MATLAB procedures and functions (which we refer to as m-programs, or simply programs) to solve many important problems in basic probability. This should make the work useful as a stand alone exposition as well as a supplement to any of several current textbooks.

Most of the programs developed here were written in earlier versions of MATLAB, but have been revised slightly to make them quite compatible with MATLAB 7. In a few cases, alternate implementations are available in the Statistics Toolbox, but are implemented here directly from the basic MATLAB program, so that students need only that program (and the symbolic mathematics toolbox, if they desire its aid in evaluating integrals).

Since machine methods require precise formulation of problems in appropriate mathematical form, it is necessary to provide some supplementary analytical material, principally the so-called minterm analysis. This material is not only important for computational purposes, but is also useful in displaying some of the structure of the relationships among events.
A probability model
Much of "real world" probabilistic thinking is an amalgam of intuitive, plausible reasoning and of statistical knowledge and insight. Mathematical probability attempts to lend precision to such probability analysis by employing a suitable mathematical model, which embodies the central underlying principles and structure. A successful model serves as an aid (and sometimes corrective) to this type of thinking.

Certain concepts and patterns have emerged from experience and intuition. The mathematical formulation (the mathematical model) which has most successfully captured these essential ideas is rooted in measure theory, and is known as the Kolmogorov model, after the brilliant Russian mathematician A.N. Kolmogorov (1903-1987).
One cannot prove that a model is correct. Only experience may show whether it is useful (and not incorrect). The usefulness of the Kolmogorov model is established by examining its structure and showing that patterns of uncertainty and likelihood in any practical situation can be represented adequately. Developments, such as in this course, have given ample evidence of such usefulness. The most fruitful approach is characterized by an interplay of

• A formulation of the problem in precise terms of the model and careful mathematical analysis of the problem so formulated.
• A grasp of the problem based on experience and insight. This underlies both problem formulation and interpretation of analytical results of the model. Often such insight suggests approaches to the analytical solution process.
MATLAB: A tool for learning
In this work, we make extensive use of MATLAB as an aid to analysis. I have tried to write the MATLAB programs in such a way that they constitute useful, ready-made tools for problem solving. Once the user understands the problems they are designed to solve, the solution strategies used, and the manner in which these strategies are implemented, the collection of programs should provide a useful resource. However, my primary aim in exposition and illustration is to aid the learning process and to deepen insight into the structure of the problems considered and the strategies employed in their solution. Several features contribute to that end.

1. Application of machine methods of solution requires precise formulation. The data available and the fundamental assumptions must be organized in an appropriate fashion. The requisite discipline for such formulation often contributes to enhanced understanding of the problem.
2. The development of a MATLAB program for solution requires careful attention to possible solution strategies. One cannot instruct the machine without a clear grasp of what is to be done.
3. I give attention to the tasks performed by a program, with a general description of how MATLAB carries out the tasks. The reader is not required to trace out all the programming details. However, it is often the case that available MATLAB resources suggest alternative solution strategies. Hence, for those so inclined, attention to the details may be fruitful. I have included, as a separate collection, the m-files written for this work. These may be used as patterns for extensions as well as programs in MATLAB for computations. Appendix A (Section 17.1) provides a directory of these m-files.
4. Some of the details in the MATLAB scripts are presentation details. These are refinements which are not essential to the solution of the problem. But they make the programs more readily usable. And they provide illustrations of MATLAB techniques for those who may wish to write their own programs. I hope many will be inclined to go beyond this work, modifying current programs or writing new ones.
An Invitation to Experiment and Explore
Because the programs provide considerable freedom from the burden of computation and the tyranny of tables (with their limited ranges and parameter values), standard problems may be approached with a new spirit of experiment and discovery. When a program is selected (or written), it embodies one method of
solution. There may be others which are readily implemented. The reader is invited, even urged, to explore! The user may experiment to whatever degree he or she finds useful and interesting. The possibilities are endless.
Acknowledgments
After many years of teaching probability, I have long since lost track of all those authors and books which have contributed to the treatment of probability in this work. I am aware of those contributions and am most eager to acknowledge my indebtedness, although necessarily without specific attribution.

The power and utility of MATLAB must be attributed to the long-time commitment of Cleve Moler, who made the package available in the public domain for several years. The appearance of the professional versions, with extended power and improved documentation, led to further appreciation and utilization of its potential in applied probability. The MathWorks continues to develop MATLAB and many powerful "toolboxes," and to provide leadership in many phases of modern computation. They have generously made available MATLAB 7 to aid in
checking for compatibility the programs written with earlier versions. I have not utilized the full potential of this version for developing professional quality user interfaces, since I believe the simpler implementations used herein bring the student closer to the formulation and solution of the problems studied.
CONNEXIONS
The development and organization of the CONNEXIONS modules has been achieved principally by two people: C.S. (Sid) Burrus, a former student and later a faculty colleague, then Dean of Engineering, and most importantly a long-time friend; and Daniel Williamson, a music major whose keyboard skills have enabled him to set up the text (especially the mathematical expressions) with great accuracy, and whose dedication to the task has led to improvements in presentation. I thank them and others of the CONNEXIONS team who have contributed to the publication of this work.

Paul E. Pfeiffer
Rice University
Chapter 1: Probability Systems

1.1 Likelihood

(This content is available online at <http://cnx.org/content/m23243/1.6/>.)

1.1.1 Introduction

Probability models and techniques permeate many important areas of modern life. A variety of types of random processes, reliability models and techniques, and statistical considerations in experimental work play a significant role in engineering and the physical sciences. The solutions of management decision problems use as aids decision analysis, waiting line theory, inventory theory, time series, cost analysis under uncertainty, all rooted in applied probability theory. Methods of statistical analysis employ probability analysis as an underlying discipline.

Modern probability developments are increasingly sophisticated mathematically. To utilize these, the practitioner needs a sound conceptual basis which, fortunately, can be attained at a moderate level of mathematical sophistication. There is need to develop a feel for the structure of the underlying mathematical model, for the role of various types of assumptions, and for the principal strategies of problem formulation and solution.

Probability has roots that extend far back into antiquity. The notion of chance played a central role in the ubiquitous practice of gambling. But chance acts were often related to magic or religion. For example, there are numerous instances in the Hebrew Bible in which decisions were made by lot or some other chance mechanism, with the understanding that the outcome was determined by the will of God. In the New Testament, the book of Acts describes the selection of a successor to Judas Iscariot as one of the Twelve. Two names, Joseph Barsabbas and Matthias, were put forward. The group prayed, then drew lots, which fell on Matthias.

Early developments of probability as a mathematical discipline, freeing it from its religious and magical overtones, came as a response to questions about games of chance played repeatedly. The mathematical formulation owes much to the work of Pierre de Fermat and Blaise Pascal in the seventeenth century. The game is described in terms of a well defined trial (a play); the result of any trial is one of a specific set of distinguishable outcomes. Although the result of any play is not predictable, certain statistical regularities of results are observed. The possible results are described in ways that make each result seem equally likely. If there are N such possible equally likely results, each is assigned a probability 1/N.

The developers of mathematical probability also took cues from early work on the analysis of statistical data. The pioneering work of John Graunt in the seventeenth century was directed to the study of vital statistics, such as records of births, deaths, and various diseases. Graunt determined the fractions of people in London who died from various diseases during a period in the early seventeenth century. Some thirty years later, in 1693, Edmond Halley (for whom the comet is named) published the first life insurance tables. To apply these results, one considers the selection of a member of the population on a chance basis. One then assigns the probability that such a person will have a given disease.
The trial here is the selection of a person, but the interest is in certain characteristics. We may speak of the event that the person selected will die of a certain disease, say consumption. Although it is a person who is selected, it is death from consumption which is of interest. Out of this statistical formulation came an interest not only in probabilities as fractions or relative frequencies but also in averages or expectations. These averages play an essential role in modern probability.

We do not attempt to trace this history, which was long and halting, though marked by flashes of brilliance. Certain concepts and patterns which emerged from experience and intuition called for clarification. We move rather directly to the mathematical formulation (the mathematical model) which has most successfully captured these essential ideas. This is the model, rooted in the mathematical system known as measure theory, called the Kolmogorov model, after the brilliant Russian mathematician A.N. Kolmogorov (1903-1987). Kolmogorov succeeded in bringing together various developments begun at the turn of the century, principally in the work of E. Borel and H. Lebesgue on measure theory. Kolmogorov published his epochal work in German in 1933. It was translated into English and published in 1956 by Chelsea Publishing Company.

1.1.2 Outcomes and events

Probability applies to situations in which there is a well defined trial whose possible outcomes are found among those in a given basic set. The following are typical.

• A pair of dice is rolled; the outcome is viewed in terms of the numbers of spots appearing on the top faces of the two dice. If the outcome is viewed as an ordered pair, there are thirty six equally likely outcomes. If the outcome is characterized by the total number of spots on the two dice, then there are eleven possible outcomes (not equally likely).

• A poll of a voting population is taken. Outcomes are characterized by responses to a question. For example, the responses may be categorized as positive (or favorable), negative (or unfavorable), or uncertain (or no opinion).

• A measurement is made. The outcome is described by a number representing the magnitude of the quantity in appropriate units. In some cases, the possible values fall among a finite set of integers. In other cases, the possible values may be any real number (usually in some specified interval).

• Much more sophisticated notions of outcomes are encountered in modern theory. For example, in communication or control theory, a communication system experiences only one signal stream in its life. But a communication system is not designed for a single signal stream. It is designed for one of an infinite set of possible signals. The likelihood of encountering a certain kind of signal is important in the design.
These considerations show that our probability model must deal with

• A trial which results in (selects) an outcome from a set of conceptually possible outcomes. The trial is not successfully completed until one of the outcomes is realized.

• Associated with each outcome is a certain characteristic (or combination of characteristics) pertinent to the problem at hand. In polling for political opinions, it is a person who is selected. That person has many features and characteristics (race, gender, age, occupation, religious preference, preferences for food, etc.). But the primary feature, which characterizes the outcome, is the political opinion on the question asked. Of course, some of the other features may be of interest for analysis of the poll.

Inherent in informal thought, as well as in precise analysis, is the notion of an event to which a probability may be assigned as a measure of the likelihood the event will occur on any trial. A successful mathematical model must formulate these notions with precision. An event is identified in terms of the characteristic of the outcome observed. The event "a favorable response to a polling question" occurs iff (if and only if) the respondent replies in the affirmative. A hand of five cards is drawn. The event "one or more aces" occurs iff the hand actually drawn has at least one ace. If that same hand has two cards of the suit of clubs, then the event "two clubs" has occurred. These considerations lead to the following definition.

Definition. The event determined by some characteristic of the possible outcomes is the set of those outcomes having this characteristic. The event occurs iff the outcome of the trial is a member of that set (i.e., has the characteristic determining the event).
• The event of throwing a "seven" with a pair of dice (which we call the event SEVEN) consists of the set of those possible outcomes with a total of seven spots turned up. The event SEVEN occurs iff the outcome is one of those combinations with a total of seven spots (i.e., belongs to the event SEVEN). This could be represented as follows. Suppose the two dice are distinguished (say by color) and a picture is taken of each of the thirty six possible combinations. On the back of each picture, write the number of spots. Now the event SEVEN consists of the set of all those pictures with seven on the back. Throwing the dice is equivalent to selecting randomly one of the thirty six pictures. The event SEVEN occurs iff the picture selected is one of the set of those pictures with seven on the back.

• Observing for a very long (theoretically infinite) time the signal passing through a communication channel is equivalent to selecting one of the conceptually possible signals. Now such signals have many characteristics: the maximum peak value, the frequency spectrum, the degree of differentiability, the average value over a given time period, etc. If the signal has a peak absolute value less than ten volts, a frequency spectrum essentially limited from 60 hertz to 10,000 hertz, with peak rate of change 10,000 volts per second, then it is one of the set of signals with those characteristics. The event "the signal has these characteristics" has occurred. This set (event) consists of an uncountable infinity of such signals.

One of the advantages of this formulation of an event as a subset of the basic set of possible outcomes is that we can use elementary set theory as an aid to formulation. And tools, such as Venn diagrams and indicator functions (Section 1.3) for studying event combinations, provide powerful aids to establishing and visualizing relationships between events. We formalize these ideas as follows:
• Let Ω be the set of all possible outcomes of the basic trial or experiment. We call this the basic space or the sure event, since if the trial is carried out successfully the outcome will be in Ω; hence, the event Ω is sure to occur on any trial. We must specify unambiguously what outcomes are possible. In flipping a coin, the only accepted outcomes are "heads" and "tails." Should the coin stand on its edge, say by leaning against a wall, we would ordinarily consider that to be the result of an improper trial.

• As we note above, each outcome may have several characteristics which are the basis for describing events. Suppose we are drawing a single card from an ordinary deck of playing cards. Each card is characterized by a face value (two through ten, jack, queen, king, ace) and a suit (clubs, hearts, diamonds, spades). An ace is drawn (the event ACE occurs) iff the outcome (card) belongs to the set (event) of four cards with ace as face value. A heart is drawn iff the card belongs to the set of thirteen cards with heart as suit. Now it may be desirable to specify events which involve various logical combinations of the characteristics. Thus, we may be interested in the event "the face value is jack or king and the suit is heart or spade." The set for "jack or king" is represented by the union J ∪ K and the set for "heart or spade" is the union H ∪ S. The occurrence of both conditions means the outcome is in the intersection (common part) designated by ∩. Thus the event referred to is

  E = (J ∪ K) ∩ (H ∪ S)    (1.1)

The notation of set theory thus makes possible a precise formulation of the event E (see the MATLAB sketch following this list).

• Sometimes we are interested in the situation in which the outcome does not have one of the characteristics. Thus the set of cards which does not have suit heart is the set of all those outcomes not in event H. In set terminology, this is the complementary set (event) H^c.

• The notion of the impossible event is useful. The impossible event is, in set terminology, the empty set ∅. Event ∅ cannot occur, since it has no members (contains no outcomes). One use of the empty set ∅ is to provide a simple way of indicating that two sets are mutually exclusive. To say AB = ∅ (here we use the alternate AB for A ∩ B) is to assert that events A and B have no outcome in common, hence cannot both occur on any given trial.

• Events are mutually exclusive iff not more than one can occur on any trial. This is the condition that the sets representing the events are disjoint (i.e., have no members in common).
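The set formulation lends itself directly to machine computation. The following MATLAB fragment is a minimal sketch, not one of the m-programs of this work: it represents the 52-card deck by face-value and suit arrays (the encoding is an assumption made here for illustration) and evaluates membership in the event E = (J ∪ K) ∩ (H ∪ S) with logical operations.

% Minimal sketch (illustrative, not one of the text's m-programs):
% encode each card by a face value (1..13, with 11 = jack, 13 = king)
% and a suit (1 = clubs, 2 = hearts, 3 = diamonds, 4 = spades).
face = repmat((1:13)', 4, 1);          % 52 face values
suit = kron((1:4)', ones(13, 1));      % 52 suits
J = (face == 11);                      % membership in "jack"
K = (face == 13);                      % membership in "king"
H = (suit == 2);                       % membership in "heart"
S = (suit == 4);                       % membership in "spade"
E = (J | K) & (H | S);                 % E = (J u K) n (H u S)
disp(sum(E))                           % four cards: JH, KH, JS, KS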
The language and notation of sets provide a precise language and notation for events and their combinations. We collect below some useful facts about logical (often called Boolean) combinations of events (as sets).

• Set inclusion provides a convenient way to designate the fact that event A implies event B, in the sense that the occurrence of A requires the occurrence of B. The set relation A ⊂ B signifies that every element (outcome) in A is also in B. If a trial results in an outcome in A (so that event A occurs), then that outcome is also in B (so that event B has occurred).

• The notion of Boolean combinations may be applied to arbitrary classes of sets. For this reason, it is sometimes useful to use an index set to designate membership. We say the index set J is countable if it is finite or countably infinite; otherwise it is uncountable. In the following it may be arbitrary. {A_i : i ∈ J} is the class of sets A_i, one for each index i in the index set J. For example, if J = {1, 2, 3} then {A_i : i ∈ J} is the class {A_1, A_2, A_3}, and

  ⋃_{i∈J} A_i = A_1 ∪ A_2 ∪ A_3,  ⋂_{i∈J} A_i = A_1 ∩ A_2 ∩ A_3    (1.2)

If J = {1, 2, ···} then {A_i : i ∈ J} is the sequence {A_i : 1 ≤ i}, and

  ⋃_{i∈J} A_i = ⋃_{i=1}^∞ A_i,  ⋂_{i∈J} A_i = ⋂_{i=1}^∞ A_i    (1.3)

If event E is the union of a class of events, then event E occurs iff at least one event in the class occurs. If event F is the intersection of a class of events, then event F occurs iff all events in the class occur on the trial.

The role of disjoint unions is so important in probability that it is useful to have a symbol indicating the union of a disjoint class. We use the cup with a bar across (rendered here as ⊎) to indicate that the sets combined in the union are disjoint. Thus, for example, we write

  A = ⊎_{i=1}^n A_i to signify A = ⋃_{i=1}^n A_i with the proviso that the A_i form a disjoint class    (1.4)

Example 1.1: Events derived from a class
Consider the class {E_1, E_2, E_3} of events. Let A_k be the event that exactly k occur on a trial and B_k be the event that k or more occur on a trial. Then

  A_0 = E_1^c E_2^c E_3^c,  A_1 = E_1 E_2^c E_3^c ⊎ E_1^c E_2 E_3^c ⊎ E_1^c E_2^c E_3    (1.5)

  A_2 = E_1 E_2 E_3^c ⊎ E_1 E_2^c E_3 ⊎ E_1^c E_2 E_3,  A_3 = E_1 E_2 E_3    (1.6)

The unions are disjoint since each pair of terms has E_i in one and E_i^c in the other, for at least one i. Now the B_k can be expressed in terms of the A_k. For example,

  B_2 = A_2 ⊎ A_3    (1.7)

The union in this expression for B_2 is disjoint, since we cannot have exactly two of the E_i occur and exactly three of them occur on the same trial. We may express B_2 directly in terms of the E_i as follows:

  B_2 = E_1 E_2 ∪ E_1 E_3 ∪ E_2 E_3    (1.8)
Here the union is not disjoint, in general. However, if one pair, say {E_1, E_3}, is disjoint, then E_1 E_3 = ∅ and the pair {E_1 E_2, E_2 E_3} is disjoint (draw a Venn diagram). Suppose C is the event the first two occur or the last two occur but no other combination. Then

  C = E_1 E_2 E_3^c ⊎ E_1^c E_2 E_3    (1.9)

Let D be the event that one or three of the events occur. Then

  D = A_1 ⊎ A_3 = E_1 E_2^c E_3^c ⊎ E_1^c E_2 E_3^c ⊎ E_1^c E_2^c E_3 ⊎ E_1 E_2 E_3    (1.10)

Two important patterns in set theory known as DeMorgan's rules are useful in the handling of events. For an arbitrary class {A_i : i ∈ J} of events,

  (⋃_{i∈J} A_i)^c = ⋂_{i∈J} A_i^c  and  (⋂_{i∈J} A_i)^c = ⋃_{i∈J} A_i^c    (1.11)

An outcome is not in the union (i.e., not in at least one) of the A_i iff it fails to be in all A_i, and it is not in the intersection (i.e., not in all) iff it fails to be in at least one of the A_i.

Example 1.2: Continuation of Example 1.1
Express the event of no more than one occurrence of the events in {E_1, E_2, E_3} as B_2^c:

  B_2^c = [E_1 E_2 ∪ E_1 E_3 ∪ E_2 E_3]^c = (E_1^c ∪ E_2^c)(E_1^c ∪ E_3^c)(E_2^c ∪ E_3^c) = E_1^c E_2^c ∪ E_1^c E_3^c ∪ E_2^c E_3^c    (1.12)

The last expression shows that not more than one of the E_i occurs iff at least two of them fail to occur.
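These Boolean identities are easy to check numerically. The following MATLAB fragment is a minimal sketch (illustrative, not one of the text's m-programs): it tabulates all eight occurrence patterns for {E_1, E_2, E_3} as logical vectors and confirms that the expression for B_2 agrees with "two or more occur," and that B_2^c agrees with the DeMorgan form.

% Minimal sketch (illustrative): check the identities of
% Examples 1.1 and 1.2 on all eight occurrence patterns.
[E1, E2, E3] = ndgrid([0 1], [0 1], [0 1]);    % all 2^3 patterns
E1 = logical(E1(:)); E2 = logical(E2(:)); E3 = logical(E3(:));
B2  = (E1 & E2) | (E1 & E3) | (E2 & E3);       % expression (1.8)
two = (E1 + E2 + E3) >= 2;                     % "two or more occur"
deM = (~E1 & ~E2) | (~E1 & ~E3) | (~E2 & ~E3); % expression (1.12)
disp(isequal(B2, two))                         % prints 1 (true)
disp(isequal(~B2, deM))                        % prints 1 (true)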
1.2 Probability Systems

(This content is available online at <http://cnx.org/content/m23244/1.7/>.)

1.2.1 Probability measures

In the module "Likelihood" (Section 1.1) we introduce the notion of a basic space Ω of all possible outcomes of a trial or experiment, events as subsets of the basic space determined by appropriate characteristics of the outcomes, and logical or Boolean combinations of the events (unions, intersections, and complements) corresponding to logical combinations of the defining characteristics.

Occurrence or nonoccurrence of an event is determined by characteristics or attributes of the outcome observed on a trial. Performing the trial is visualized as selecting an outcome from the basic set. An event occurs whenever the selected outcome is a member of the subset representing the event. As described so far, the selection process could be quite deliberate, with a prescribed outcome, or it could involve the uncertainties associated with chance. Probability enters the picture only in the latter situation. Before the trial is performed, there is uncertainty about which of these latent possibilities will be realized. Probability traditionally is a number assigned to an event indicating the likelihood of the occurrence of that event on any trial.

We begin by looking at the classical model which first successfully formulated probability ideas in mathematical form. We use modern terminology and notation to describe it.

Classical probability

1. The basic space Ω consists of a finite number N of possible outcomes.
   - There are thirty six possible outcomes of throwing two dice.
   - There are C(52,5) = 52!/(5!47!) = 2598960 different hands of five cards (order not important).
   - There are 2^5 = 32 results (sequences of heads or tails) of flipping five coins.
2. Each possible outcome is assigned a probability 1/N.
3. If event (subset) A has N_A elements, then the probability assigned event A is

  P(A) = N_A/N (i.e., the fraction favorable to A)    (1.13)

With this definition of probability, each event A is assigned a unique probability, which may be determined by counting N_A, the number of elements in A (in the classical language, the number of outcomes "favorable" to the event), and N, the total number of possible outcomes in the sure event Ω.

Example 1.3: Probabilities for hands of cards
Consider the experiment of drawing a hand of five cards from an ordinary deck of 52 playing cards. What is the probability of drawing a hand with exactly two aces? What is the probability of drawing a hand with two or more aces? What is the probability of not more than one ace?

SOLUTION
Let A be the event of exactly two aces, B be the event of exactly three aces, and C be the event of exactly four aces. In the first problem, we must count the number N_A of ways of drawing a hand with two aces. We select two aces from the four, and select the other three cards from the 48 non-aces. Thus

  N_A = C(4,2) C(48,3) = 103776, so that P(A) = N_A/N = 103776/2598960 ≈ 0.0399    (1.14)

There are two or more aces iff there are exactly two or exactly three or exactly four. Thus the event D of two or more is D = A ⊎ B ⊎ C. Since A, B, C are mutually exclusive,

  N_D = N_A + N_B + N_C = C(4,2) C(48,3) + C(4,3) C(48,2) + C(4,4) C(48,1) = 103776 + 4512 + 48 = 108336    (1.15)

so that P(D) ≈ 0.0417. There is one ace or none iff there are not two or more aces, so that the number we want is the number not in D, which is N − N_D. Thus

  P(D^c) = (N − N_D)/N = 1 − N_D/N = 1 − P(D) = 0.9583    (1.16)

This example illustrates several important properties of the classical probability.

1. P(A) = N_A/N is a nonnegative quantity.
2. P(Ω) = N/N = 1.
3. If A, B, C are mutually exclusive, then the number in the disjoint union is the sum of the numbers in the individual events, so that

  P(A ⊎ B ⊎ C) = P(A) + P(B) + P(C)    (1.17)

Several other elementary properties of the classical probability may be identified. It turns out that they can be derived from these three. Although the classical model is highly useful, and an extensive theory has been developed, it is not really satisfactory for many applications (the communications problem, for example). We seek a more general model which includes classical probability as a special case and is thus an extension of it. We adopt the Kolmogorov model (introduced by the Russian mathematician A.N. Kolmogorov), which captures the essential ideas in a remarkably successful way.
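The counts in Example 1.3 are easily reproduced with MATLAB's nchoosek function. The following is a minimal sketch (illustrative, not one of the text's m-programs).

% Minimal sketch (illustrative): the counts of Example 1.3 via nchoosek.
N  = nchoosek(52, 5);                  % 2598960 five-card hands
NA = nchoosek(4, 2) * nchoosek(48, 3); % exactly two aces
NB = nchoosek(4, 3) * nchoosek(48, 2); % exactly three aces
NC = nchoosek(4, 4) * nchoosek(48, 1); % exactly four aces
PA = NA / N                            % approx 0.0399
PD = (NA + NB + NC) / N                % approx 0.0417
Pnot = 1 - PD                          % approx 0.9583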
The Kolmogorov model is grounded in abstract measure theory. A full explication requires a level of mathematical sophistication inappropriate for a treatment such as this. But most of the concepts and many of the results are elementary and easily grasped. And many technical mathematical considerations are not important for applications at the level of this introductory treatment and may be disregarded. We borrow from measure theory a few key facts which are either very plausible or which can be understood at a practical level. This enables us to utilize a very powerful mathematical system for representing practical problems in a manner that leads to both insight and useful strategies of solution.

Our approach is to begin with the notion of events as sets introduced above, then to introduce probability as a number assigned to events subject to certain conditions which become definitive properties. Gradually we introduce and utilize additional concepts to build progressively a powerful and useful discipline. The fundamental properties needed are just those illustrated in Example 1.3 (Probabilities for hands of cards) for the classical case.

Definition. A probability system consists of a basic set Ω of elementary outcomes of a trial or experiment, a class of events as subsets of the basic space, and a probability measure P(·) which assigns values to the events in accordance with the following rules:

(P1): For any event A, the probability P(A) ≥ 0.
(P2): The probability of the sure event P(Ω) = 1.
(P3): Countable additivity. If {A_i : i ∈ J} is a mutually exclusive, countable class of events, then the probability of the disjoint union is the sum of the individual probabilities.

The necessity of the mutual exclusiveness (disjointedness) is illustrated in Example 1.3. If the sets were not disjoint, probability would be counted more than once in the sum.

A probability, as defined, is abstract: simply a number assigned to each set representing an event. But we can give it an interpretation which helps to visualize the various patterns and relationships encountered. We may think of probability as mass assigned to an event. The total unit mass is assigned to the basic set Ω. The additivity property for disjoint sets makes the mass interpretation consistent. We can use this interpretation as a precise representation. Repeatedly we refer to the probability mass assigned a given set. The mass is proportional to the weight, so sometimes we speak informally of the weight rather than the mass.

Now a mass assignment with three properties does not seem a very promising beginning. But we soon expand this rudimentary list of properties. We use the mass interpretation to help visualize the properties, but are primarily concerned to interpret them in terms of likelihoods.

(P4): P(A^c) = 1 − P(A). This follows from additivity and the fact that

  1 = P(Ω) = P(A ⊎ A^c) = P(A) + P(A^c)    (1.18)

(P5): P(∅) = 0. The empty set represents an impossible event. It has no members, hence cannot occur. It seems reasonable that it should be assigned zero probability (mass). Since ∅ = Ω^c, this follows logically from (P4) and (P2).
(P6): If A ⊂ B, then P(A) ≤ P(B). From the mass point of view, every point in A is also in B, so that B must have at least as much mass as A. Now the relationship A ⊂ B means that if A occurs, B must also. Hence B is at least as likely to occur as A. From a purely formal point of view, we have

  B = A ⊎ A^c B, so that P(B) = P(A) + P(A^c B) ≥ P(A), since P(A^c B) ≥ 0    (1.19)

(P7): P(A ∪ B) = P(A) + P(A^c B) = P(B) + P(AB^c) = P(AB^c) + P(AB) + P(A^c B) = P(A) + P(B) − P(AB)

The first three expressions follow from additivity and partitioning of A ∪ B as follows (see Figure 1.1):

  A ∪ B = A ⊎ A^c B = B ⊎ AB^c = AB^c ⊎ AB ⊎ A^c B    (1.20)

If we add the first two expressions and subtract the third, we get the last expression. In terms of probability mass, the first expression says the probability in A ∪ B is the probability mass in A plus the additional probability mass in the part of B which is not in A. A similar interpretation holds for the second expression. The third is the probability in the common part plus the extra in A and the extra in B. If we add the mass in A and B, we have counted the mass in the common part twice. The last expression shows that we correct this by taking away the extra common mass.

Figure 1.1: Partitions of the union A ∪ B. [Venn diagrams of the three partitions in (1.20).]

(P8): If {B_i : i ∈ J} is a countable, disjoint class and A is contained in the union, then

  A = ⊎_{i∈J} AB_i, so that P(A) = Σ_{i∈J} P(AB_i)    (1.21)

(P9): Subadditivity. If A = ⋃_{i=1}^∞ A_i, then P(A) ≤ Σ_{i=1}^∞ P(A_i). This follows from countable additivity, property (P6), and the fact (Partitions)

  A = ⋃_{i=1}^∞ A_i = ⊎_{i=1}^∞ B_i, where B_i = A_i A_1^c A_2^c ··· A_{i−1}^c ⊂ A_i    (1.22)

This includes as a special case the union of a finite number of events.

Some of these properties, such as (P4), (P5), and (P6), are so elementary that it seems they should be included in the defining statement. This would not be incorrect, but would be inefficient. If we have an assignment of numbers to the events, we need only establish (P1), (P2), and (P3) to be able to assert that the assignment constitutes a probability measure. And the other properties follow as logical consequences.
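On a finite basic space, a probability measure can be represented by a vector of point masses, and properties such as (P7) reduce to arithmetic on logical index vectors. The following MATLAB fragment is a minimal sketch (illustrative, not from the text's m-file collection); the mass vector pm and the events A, B are arbitrary example data.

% Minimal sketch (illustrative): check (P7) on a small finite space.
pm = [0.1 0.2 0.15 0.05 0.3 0.2];  % point masses, sum to 1 (example data)
A  = logical([1 1 0 1 0 0]);       % event A as an index vector
B  = logical([0 1 1 1 0 1]);       % event B
PA  = sum(pm(A));  PB = sum(pm(B));
PAB = sum(pm(A & B));              % P(AB)
PU  = sum(pm(A | B));              % P(A u B)
disp(abs(PU - (PA + PB - PAB)) < 1e-12)   % prints 1: (P7) holds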
Flexibility at a price

In moving beyond the classical model, we have gained great flexibility and adaptability of the model. It may be used for systems in which the number of outcomes is infinite (countably or uncountably). It does not require a uniform distribution of the probability mass among the outcomes. For example, the dice problem may be handled directly by assigning the appropriate probabilities to the various numbers of total spots, 2 through 12. As we see in the treatment of conditional probability (Section 3.1), we make new probability assignments (i.e., introduce new probability measures) when partial information about the outcome is obtained.

But this freedom is obtained at a price. In the classical case, the probability value to be assigned an event is clearly defined (although it may be very difficult to perform the required counting). In the general case, we must resort to experience, structure of the system studied, experiment, or statistical studies to assign probabilities.

The existence of uncertainty due to chance or randomness does not necessarily imply that the act of performing the trial is haphazard. The trial may be quite carefully planned; the contingency may be the result of factors beyond the control or knowledge of the experimenter. The mechanism of chance (i.e., the source of the uncertainty) may depend upon the nature of the actual process or system observed. For example, in taking an hourly temperature profile on a given day at a weather station, the principal variations are not due to experimental error but rather to unknown factors which converge to provide the specific weather pattern experienced. In the case of an uncorrected digital transmission error, the cause of uncertainty lies in the intricacies of the correction mechanisms and the perturbations produced by a very complex environment. A patient at a clinic may be self selected. Before his or her appearance and the result of a test, the physician may not know which patient with which condition will appear. In each case, from the point of view of the experimenter, the cause is simply attributed to chance. Whether one sees this as an act of the gods or simply the result of a configuration of physical or behavioral causes too complex to analyze, the situation is one of uncertainty, before the trial, about which outcome will present itself.
If there were complete uncertainty, the situation would be chaotic. But this is not usually the case. While there is an extremely large number of possible hourly temperature profiles, a substantial subset of these has very little likelihood of occurring. For example, profiles in which successive hourly temperatures alternate between very high then very low values throughout the day constitute an unlikely subset (event). One normally expects trends in temperatures over the 24 hour period. Although a traffic engineer does not know exactly how many vehicles will be observed in a given time period, experience provides some idea what range of values to expect. While there is uncertainty about which patient, with which symptoms, will appear at a clinic, a physician certainly knows approximately what fraction of the clinic's patients have the disease in question. In a game of chance, analyzed into equally likely outcomes, the assumption of equal likelihood is based on knowledge of symmetries and structural regularities in the mechanism by which the game is carried out. And the number of outcomes associated with a given event is known, or may be determined.

In each case, there is some basis in statistical data on past experience or knowledge of structure, regularity, and symmetry in the system under observation which makes it possible to assign likelihoods to the occurrence of various events. It is this ability to assign likelihoods to the various events which characterizes applied probability. However determined, probability is a number assigned to events to indicate their likelihood of occurrence. Since the probabilities are not built in, as in the classical case, a prime role of probability theory is to derive other probabilities from a set of given probabilities.
1.3 Interpretations

(This content is available online at <http://cnx.org/content/m23246/1.7/>.)

1.3.1 What is Probability?

The formal probability system is a model whose usefulness can only be established by examining its structure and determining whether patterns of uncertainty and likelihood in any practical situation can be represented adequately. With the exception of the sure event and the impossible event, the model does not tell us how to assign probability to any given event. The formal system is consistent with many probability assignments, just as the notion of mass is consistent with many different mass assignments to sets in the basic space.

The defining properties (P1), (P2), (P3) and derived properties (P4) through (P9) (plus others which may also be derived from these) provide consistency rules for making probability assignments. One cannot assign negative probabilities or probabilities greater than one. The sure event is assigned probability one. If two or more events are mutually exclusive, the total probability assigned to the union must equal the sum of the probabilities of the separate events. Any assignment of probability consistent with these conditions is allowed. One may not know the probability assignment to every event. Just as the defining conditions put constraints on allowable probability assignments, they also provide important structure. A typical applied problem provides the probabilities of members of a class of events (perhaps only a few) from which to determine the probabilities of other events of interest. We consider an important class of such problems in the next chapter.

There is a variety of points of view as to how probability should be interpreted. These impact the manner in which probabilities are assigned (or assumed). One important dichotomy among practitioners:
Establishing this result (which we do not do in this treatment) provides a formal validation of the intuitive notion that lay behind the Inveterate gamblers had noted longrun statistical regularities. One approach to characterizing an individual's degree of certainty is to equate his assessment of P (A) with the amount a he is willing to pay to play a game which returns one unit of money if A occurs. This approach has not been entirely successful mathematically and has not attracted much of a following among either theoretical or applied probabilists. In the model we adopt. From this point of view. 11). There is a variety of points of view as to how probability should be interpreted.3 Interpretations3 1. and returns zero if A does not occur. One may not know the probability assignment to every event. early attempts to formulate probabilities. These impact the manner in which probabilities are assigned (or assumed). there is a fundamental limit theorem.
The early work on probability began with a study of relative frequencies of occurrence of an event under repeated but independent trials. This idea is so imbedded in much intuitive thought about probability that some probabilists have insisted that it must be built into the definition of probability. This approach has not been entirely successful mathematically and has not attracted much of a following among either theoretical or applied probabilists. In the model we adopt, there is a fundamental limit theorem, known as Borel's theorem, which may be interpreted: if a trial is performed a large number of times in an independent manner, the fraction of times that event A occurs approaches as a limit the value P(A). Establishing this result (which we do not do in this treatment) provides a formal validation of the intuitive notion that lay behind the early attempts to formulate probabilities. Inveterate gamblers had noted such long-run statistical regularities, and sought explanations from their mathematically gifted friends. From this point of view, probability is meaningful only in repeatable situations. Those who hold this view usually assume an objective view of probability. It is a number determined by the nature of reality, to be discovered by repeated experiment.

There are many applications of probability in which the relative frequency point of view is not feasible. These are unique, nonrepeatable trials. Examples include predictions of the weather, the outcome of a game or a horse race, the performance of an individual on a particular job, the success of a newly designed computer. As the popular expression has it, "You only go around once." Sometimes, probabilities in these situations may be quite subjective. As a matter of fact, those who take a subjective view tend to think in terms of such problems, whereas those who take an objective view usually emphasize the frequency interpretation.

Example 1.4: Subjective probability and a football game
The probability that one's favorite football team will win the next Superbowl Game may well be only a subjective probability of the bettor. This is certainly not a probability that can be determined by a large number of repeated trials. The game is only played once. However, the subjective assessment of probabilities may be based on intimate knowledge of relative strengths and weaknesses of the teams involved, as well as factors such as weather, injuries, and experience. There may be a considerable objective basis for the subjective assignment of probability. There is an assessment (perhaps unrealized) that in similar situations the frequencies tend to coincide with the value subjectively assigned.
You only go around once. There is an assessment (perhaps unrealized) that in sub jectively assigned. and experience. as well as factors such as weather. meaningful only in repeatable situations. on an equally likely basis) will have a certain characteristic? . As the popular expression has it. and the time period. the performance of an individual on a particular job. probabilities in these situations may be quite subjective. there must be precision in determining an event. There are several diculties here. the determined by a large number of repeated trials.
Example 1.6: Empirical probability based on survey data
A survey asks two questions of 300 students: Do you live on campus? Are you satisfied with the recreational facilities in the student center? Answers to the latter question were categorized "reasonably satisfied," "unsatisfied," or "no definite opinion." Let C be the event "on campus," O be the event "off campus," S be the event "reasonably satisfied," U be the event "unsatisfied," and N be the event "no definite opinion." Data are shown in the following table.

Survey Data
        S      U      N
  C    127     31     42
  O     46     43     11

Table 1.1

If an individual is selected on an equally likely basis from this group of 300, the probability of any of the events is taken to be the relative frequency of respondents in each category corresponding to an event. There are 200 on-campus members in the population, so P(C) = 200/300 and P(O) = 100/300. The probability that a student selected is on campus and satisfied is taken to be P(CS) = 127/300. The probability a student is either on campus and satisfied or off campus and not satisfied is

P(CS ⋁ OU) = P(CS) + P(OU) = 127/300 + 43/300 = 170/300

If there is reason to believe that the population sampled is representative of the entire student body, then the same probabilities would be applied to any student selected at random from the entire student body.

It is fortunate that we do not have to declare a single position to be the correct viewpoint and interpretation. The formal model is consistent with any of the views set forth. We are free in any situation to make the interpretation most meaningful and natural to the problem at hand. It is not necessary to fit all problems into one conceptual mold; nor is it necessary to change the mathematical model each time a different point of view seems appropriate.

Probability and odds
Often we find it convenient to work with a ratio of probabilities. If A and B are events with positive probability, the odds favoring A over B is the probability ratio P(A)/P(B). If B is not specified, B is taken to be A^c and we speak of the odds favoring A:

O(A) = P(A)/P(A^c) = P(A)/(1 − P(A))

This expression may be solved algebraically to determine the probability from the odds:

P(A) = O(A)/(1 + O(A))

In particular, if O(A) = a/b then P(A) = (a/b)/(1 + a/b) = a/(a + b). If the odds favoring A is 5/3, then P(A) = 5/(5 + 3) = 5/8. Conversely, if P(A) = 0.7, then O(A) = 0.7/0.3 = 7/3.
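The conversion between odds and probability is easy to mechanize. The following is a minimal MATLAB sketch (our own, not one of the text's m-files); the handle names odds and prob are our choices.

  odds = @(p) p./(1 - p);    % O(A) = P(A)/(1 - P(A))
  prob = @(o) o./(1 + o);    % P(A) = O(A)/(1 + O(A))
  prob(5/3)                  % odds of 5/3 favoring A
  ans = 0.6250               % = 5/8
  odds(0.7)
  ans = 2.3333               % = 7/3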
1.3.3 Partitions and Boolean combinations of events
The countable additivity property (P3) places a premium on appropriate partitioning of events.

Definition. A partition is a mutually exclusive class {Ai : i ∈ J} such that Ω = ⋁_{i∈J} Ai. A partition of event A is a mutually exclusive class {Ai : i ∈ J} such that A = ⋁_{i∈J} Ai.

Remarks.
• A partition is a mutually exclusive class of events such that one (and only one) must occur on each trial.
• A partition of event A is a mutually exclusive class of events such that A occurs iff one (and only one) of the Ai occurs.
• A partition (no qualifier) is taken to be a partition of the sure event Ω.
• If class {Bi : i ∈ J} is mutually exclusive and A ⊂ B = ⋁_{i∈J} Bi, then the class {ABi : i ∈ J} is a partition of A and A = ⋁_{i∈J} ABi.

We may begin with a sequence {Ai : 1 ≤ i} and determine a mutually exclusive (disjoint) sequence {Bi : 1 ≤ i} as follows:

B1 = A1, and for any i > 1, Bi = Ai A1^c A2^c ··· A_{i−1}^c

Thus each Bi is the set of those elements of Ai not in any of the previous members of the sequence. This representation is used to show that subadditivity (P9) follows from countable additivity and property (P6). Since each Bi ⊂ Ai, by (P6) P(Bi) ≤ P(Ai). Now

P(⋃_{i=1}^∞ Ai) = P(⋁_{i=1}^∞ Bi) = Σ_{i=1}^∞ P(Bi) ≤ Σ_{i=1}^∞ P(Ai)

The representation of a union as a disjoint union points to an important strategy in the solution of probability problems. If an event can be expressed as a countable disjoint union of events, each of whose probabilities is known, then the probability of the combination is the sum of the individual probabilities. In the module on Partitions and Minterms (Section 2.1), we show that any Boolean combination of a finite class of events can be expressed as a disjoint union in a manner that often facilitates systematic determination of the probabilities.

1.3.4 The indicator function
One of the most useful tools for dealing with set combinations (and hence with event combinations) is the indicator function IE for a set E ⊂ Ω. It is defined very simply as follows:

IE(ω) = 1 for ω ∈ E and IE(ω) = 0 for ω ∈ E^c

Remark. Indicator functions may be defined on any domain. We have occasion in various cases to define them on the real line and on higher dimensional Euclidean spaces. For example, if M is the interval [a, b] on the real line, then IM(t) = 1 for each t in the interval (and is zero otherwise), so that we have a step function with unit value over the interval M. In the abstract basic space Ω we cannot draw a graph so easily. However, with the representation of sets on a Venn diagram, we can give a schematic representation, as in Figure 1.2.

[Figure 1.2: Representation of the indicator function IE for event E.]

Much of the usefulness of the indicator function comes from the following properties.

(IF1): IA ≤ IB iff A ⊂ B. If IA ≤ IB, then ω ∈ A implies IA(ω) = IB(ω) = 1, so ω ∈ B. If A ⊂ B, then IA(ω) = 1 implies ω ∈ A implies ω ∈ B implies IB(ω) = 1.
(IF2): IA = IB iff A = B, since A = B iff both A ⊂ B and B ⊂ A iff IA ≤ IB and IB ≤ IA iff IA = IB.
(IF3): IA^c = 1 − IA. This follows from the fact that IA^c(ω) = 1 iff IA(ω) = 0.
(IF4): IAB = IA IB = min{IA, IB} (the product rule extends to any class). An element ω belongs to the intersection iff it belongs to all the events iff the indicator function for each event is one iff the product of the indicator functions is one.
(IF5): IA∪B = IA + IB − IA IB = max{IA, IB} (the maximum rule extends to any class). The maximum rule follows from the fact that ω is in the union iff it is in any one or more of the events in the union iff any one or more of the individual indicator functions has value one iff the maximum is one. The sum rule for two events is established by DeMorgan's rule and properties (IF2), (IF3), and (IF4):

IA∪B = 1 − IA^cB^c = 1 − [1 − IA][1 − IB] = 1 − 1 + IB + IA − IA IB

(IF6): If the pair {A, B} is disjoint, IA⋁B = IA + IB (extends to any disjoint class).

The following example illustrates the use of indicator functions in establishing relationships between set combinations. Other uses and techniques are established in the module on Partitions and Minterms (Section 2.1).
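These identities are easy to check numerically. A quick illustrative sketch (our own, not from the text) evaluates the indicator functions on the four minterms generated by a pair {A, B}:

  IA = [0 0 1 1];  IB = [0 1 0 1];          % indicator values, one per minterm
  isequal(min(IA,IB), IA.*IB)               % (IF4): intersection
  ans = 1
  isequal(max(IA,IB), IA + IB - IA.*IB)     % (IF5): union, two ways
  ans = 1
  isequal(1 - IA, double(~IA))              % (IF3): complement
  ans = 1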
Example 1.7: Indicator functions and set combinations
Suppose {Ai : 1 ≤ i ≤ n} is a partition. If B = ⋁_{i=1}^n Ai Ci, then B^c = ⋁_{i=1}^n Ai Ci^c.
VERIFICATION
Utilizing properties of the indicator function established above, we have

IB = Σ_{i=1}^n IAi ICi

Note that since the Ai form a partition, Σ_{i=1}^n IAi = 1, so that the indicator function for the complementary event is

IB^c = 1 − Σ_{i=1}^n IAi ICi = Σ_{i=1}^n IAi − Σ_{i=1}^n IAi ICi = Σ_{i=1}^n IAi [1 − ICi] = Σ_{i=1}^n IAi ICi^c

The last sum is the indicator function for ⋁_{i=1}^n Ai Ci^c.

1.3.5 A technical comment on the class of events
The class of events plays a central role in the intuitive background, the application, and the formal mathematical structure. Events have been modeled as subsets of the basic space of all possible outcomes of the trial or experiment. In the case of a finite number of outcomes, any subset can be taken as an event. In the general theory, involving infinite possibilities, there are some technical mathematical reasons for limiting the class of subsets to be considered as events. The practical needs are these:
1. If A is an event, its complementary set must also be an event.
2. If {Ai : i ∈ J} is a finite or countable class of events, the union and the intersection of members of the class need to be events.
A simple argument based on DeMorgan's rules shows that if the class contains complements of all its sets and countable unions, then it contains countable intersections. Likewise, if it contains complements of all its sets and countable intersections, then it contains countable unions. A class of sets closed under complements and countable unions is known as a sigma algebra of sets. In a formal, measure-theoretic treatment, a basic assumption is that the class of events is a sigma algebra and the probability measure assigns probabilities to members of that class. Such a class is so general that it takes very sophisticated arguments to establish the fact that such a class does not contain all subsets. But precisely because the class is so general and inclusive, in ordinary applications we need not be concerned about which sets are permissible as events. The theoretical treatment shows that we may work with great freedom in forming events, with the assurance that in most applications a set so produced is a mathematically valid event. The so-called measurability question only comes into play in dealing with random processes with continuous parameters. Even there, under reasonable assumptions, the sets produced will be events. A primary task in formulating a probability problem is identifying the appropriate events and the relationships between them.

1.4 Problems on Probability Systems^4

Exercise 1.1 (Solution on p. 23.)
Let Ω consist of the set of positive integers. Consider the subsets

^4 This content is available online at <http://cnx.org/content/m24071/1.4/>.
A = {ω : ω ≤ 12}   B = {ω : ω < 8}   C = {ω : ω is even}
D = {ω : ω is a multiple of 3}   E = {ω : ω is a multiple of 4}

Describe in terms of A, B, C, D, E and their complements the following sets:
a. {1, 3, 5, 7}
b. {3, 6, 9}
c. {8, 10}
d. The even integers greater than 12.
e. The positive integers which are multiples of six.
f. The integers which are even and no greater than 6 or which are odd and greater than 12.

Exercise 1.2 (Solution on p. 23.)
Let Ω be the set of integers 0 through 10. Let A = {5, 6, 7, 8}, B = the odd integers in Ω, and C = the integers in Ω which are even or less than three. Describe the following sets by listing their elements:
a. AB
b. AC
c. AB^c ∪ C
d. ABC^c
e. A ∪ B^c
f. A ∪ BC^c
g. ABC
h. A^cBC^c

Exercise 1.3 (Solution on p. 23.)
Consider fifteen-word messages in English. Let A = the set of such messages which contain the word "bank" and let B = the set of messages which contain the word "bank" and the word "credit." Which event has the greater probability? Why?

Exercise 1.4 (Solution on p. 23.)
A group of five persons consists of two men and three women. They are selected one-by-one in a random manner. Let Ei be the event a man is selected on the ith selection. Write an expression for the event that both men have been selected by the third selection.

Exercise 1.5 (Solution on p. 23.)
Two persons play a game consecutively until one of them is successful or there are ten unsuccessful plays. Let Ei be the event of a success on the ith play of the game. Let A, B, C be the respective events that player one, player two, or neither wins. Write an expression for each of these events in terms of the events Ei, 1 ≤ i ≤ 10.

Exercise 1.6 (Solution on p. 23.)
Suppose the game in Exercise 1.5 could, in principle, be played an unlimited number of times. Write an expression for the event D that the game will be terminated with a success in a finite number of times. Write an expression for the event F that the game will never terminate.

Exercise 1.7 (Solution on p. 23.)
Find the (classical) probability that among three random digits, with each digit (0 through 9) being equally likely and each triple equally likely:
a. All three are alike.
b. No two are alike.
c. The first digit is 0.
d. Exactly two are alike.
Exercise 1.8 (Solution on p. 23.)
The classical probability model is based on the assumption of equally likely outcomes. Some care must be shown in analysis to be certain that this assumption is good. A well known example is the following. Two coins are tossed. One of three outcomes is observed: Let ω1 be the outcome both are heads, ω2 the outcome that both are tails, and ω3 the outcome that they are different. Is it reasonable to suppose these three outcomes are equally likely? What probabilities would you assign?

Exercise 1.9 (Solution on p. 23.)
A committee of five is chosen from a group of 20 people. What is the probability that a specified member of the group will be on the committee?

Exercise 1.10 (Solution on p. 23.)
Ten employees of a company drive their cars to the city each day and park randomly in ten spots. What is the (classical) probability that on a given day Jim will be in place three? There are n! equally likely ways to arrange n items (order important).

Exercise 1.11 (Solution on p. 23.)
An extension of the classical model involves the use of areas. A certain region L (say of land) is taken as a reference. For any subregion A, define P(A) = area(A)/area(L). Show that P(·) is a probability measure on the subregions of L.

Exercise 1.12 (Solution on p. 23.)
John thinks the probability the Houston Texans will win next Sunday is 0.3 and the probability the Dallas Cowboys will win is 0.7 (they are not playing each other). He thinks the probability both will win is somewhere in between, say 0.5. Is that a reasonable assumption? Justify your answer.

Exercise 1.13 (Solution on p. 24.)
Suppose P(A) = 0.5 and P(B) = 0.3. What is the largest possible value of P(AB)? Using the maximum value of P(AB), determine P(AB^c), P(A^cB), P(A^cB^c) and P(A ∪ B). Are these values determined uniquely?

Exercise 1.14 (Solution on p. 24.)
For each of the following probability "assignments", fill out the table. Which assignments are not permissible? Explain, in each case, why the assignment is or is not permissible.

  P(A)         0.3    0.2    0.3    0.3    0.3
  P(B)         0.7    0.1    0.7    0.5    0.8
  P(AB)        0.4    0.4    0.2    0      0
  P(A ∪ B)
  P(AB^c)
  P(A^cB)
  P(A) + P(B)

Table 1.2

Exercise 1.15 (Solution on p. 24.)
The class {A, B, C} of events is a partition. Event A is twice as likely as C, and event B is as likely as the combination A or C. Determine the probabilities P(A), P(B), P(C).

Exercise 1.16 (Solution on p. 24.)
Determine the probability P(A ∪ B ∪ C) in terms of the probabilities of the events A, B, C and their intersections.

Exercise 1.17 (Solution on p. 24.)
If occurrence of event A implies occurrence of B, show that P(A^cB) = P(B) − P(A).
Exercise 1.18 (Solution on p. 24.)
Show that P(AB) ≥ P(A) + P(B) − 1.

Exercise 1.19 (Solution on p. 24.)
The set combination A ⊕ B = AB^c ⋁ A^cB is known as the disjunctive union or the symmetric difference of A and B. This is the event that only one of the events A or B occurs on a trial. Determine P(A ⊕ B) in terms of P(A), P(B), and P(AB).

Exercise 1.20 (Solution on p. 24.)
Use fundamental properties of probability to show
a. P(AB) ≤ P(A) ≤ P(A ∪ B) ≤ P(A) + P(B)
b. P(⋂_{j=1}^∞ Ej) ≤ P(Ei) ≤ P(⋃_{j=1}^∞ Ej) ≤ Σ_{j=1}^∞ P(Ej)

Exercise 1.21 (Solution on p. 24.)
Suppose P1, P2 are probability measures and c1, c2 are positive numbers such that c1 + c2 = 1. Show that the assignment P(E) = c1 P1(E) + c2 P2(E) to the class of events is a probability measure. Such a combination of probability measures is known as a mixture. Extend this to

P(E) = Σ_{i=1}^n ci Pi(E), where the Pi are probability measures, ci > 0, and Σ_{i=1}^n ci = 1

Exercise 1.22 (Solution on p. 24.)
Suppose {A1, A2, ..., An} is a partition and {c1, c2, ..., cn} is a class of positive constants. For each event E, let

Q(E) = Σ_{i=1}^n ci P(EAi) / Σ_{i=1}^n ci P(Ai)

Show that Q(·) is a probability measure.
Solutions to Exercises in Chapter 1

Solution to Exercise 1.1 (p. 19)
a = BC^c, b = DAE^c, c = CAB^cD^c, d = CA^c, e = CD, f = BC ⋁ A^cC^c

Solution to Exercise 1.2 (p. 20)
AB = {5, 7}, AC = {6, 8}, AB^c ∪ C = C, ABC^c = AB = {5, 7}, A ∪ B^c = {0, 2, 4, 5, 6, 7, 8, 10}, A ∪ BC^c = {3, 5, 6, 7, 8, 9}, ABC = ∅, A^cBC^c = {3, 9}

Solution to Exercise 1.3 (p. 20)
B ⊂ A implies P(B) ≤ P(A).

Solution to Exercise 1.4 (p. 20)
A = E1E2 ⋁ E1E2^cE3 ⋁ E1^cE2E3

Solution to Exercise 1.5 (p. 20)
A = E1 ⋁ E1^cE2^cE3 ⋁ E1^cE2^cE3^cE4^cE5 ⋁ E1^cE2^cE3^cE4^cE5^cE6^cE7 ⋁ E1^cE2^cE3^cE4^cE5^cE6^cE7^cE8^cE9
B = E1^cE2 ⋁ E1^cE2^cE3^cE4 ⋁ E1^cE2^cE3^cE4^cE5^cE6 ⋁ E1^cE2^cE3^cE4^cE5^cE6^cE7^cE8 ⋁ E1^cE2^cE3^cE4^cE5^cE6^cE7^cE8^cE9^cE10
C = ⋂_{i=1}^{10} Ei^c

Solution to Exercise 1.6 (p. 20)
Let F0 = Ω and Fk = ⋂_{i=1}^k Ei^c for k ≥ 1. Then

D = ⋁_{n=1}^∞ F_{n−1} En and F = D^c = ⋂_{i=1}^∞ Ei^c

Solution to Exercise 1.7 (p. 20)
Each triple has probability 1/10^3 = 1/1000.
a. Ten triples, all alike: P = 10/1000.
b. 10 × 9 × 8 triples, all different: P = 720/1000.
c. 100 triples with first digit zero: P = 100/1000.
d. C(3, 2) = 3 ways to pick the two positions alike, 10 ways to pick the common value, 9 ways to pick the other: P = 270/1000.

Solution to Exercise 1.8 (p. 21)
P({ω1}) = P({ω2}) = 1/4, P({ω3}) = 1/2.

Solution to Exercise 1.9 (p. 21)
C(20, 5) committees; C(19, 4) have a designated member.

P = C(19, 4)/C(20, 5) = (19!/(4! 15!)) · (5! 15!/20!) = 5/20 = 1/4

Solution to Exercise 1.10 (p. 21)
10! permutations; 1 × 9! permutations with Jim in place 3. P = 9!/10! = 1/10.

Solution to Exercise 1.11 (p. 21)
Additivity follows from additivity of areas of disjoint regions.

Solution to Exercise 1.12 (p. 21)
P(AB) = 0.5 is not reasonable. It must be no greater than the minimum of P(A) = 0.3 and P(B) = 0.7.
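The counting arguments in Solution 1.7 are easy to confirm by brute-force enumeration. A small illustrative sketch (our own, using only MATLAB built-ins):

  % Enumerate all 1000 equally likely digit triples and check Solution 1.7
  [d1,d2,d3] = ndgrid(0:9, 0:9, 0:9);
  alike      = (d1==d2) & (d2==d3);
  allDiff    = (d1~=d2) & (d1~=d3) & (d2~=d3);
  firstZero  = (d1==0);
  exactlyTwo = ~alike & ~allDiff;
  disp([mean(alike(:)) mean(allDiff(:)) mean(firstZero(:)) mean(exactlyTwo(:))])
  %   0.0100    0.7200    0.1000    0.2700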
Solution to Exercise 1.13 (p. 21)
Draw a Venn diagram, or use algebraic expressions. The largest possible value is P(AB) = min{P(A), P(B)} = 0.3. Then

P(AB^c) = P(A) − P(AB) = 0.2, P(A^cB) = P(B) − P(AB) = 0, P(A^cB^c) = P(A^c) − P(A^cB) = 0.5, P(A ∪ B) = 0.5

Solution to Exercise 1.14 (p. 21)

  P(A)          0.3     0.2     0.3     0.3     0.3
  P(B)          0.7     0.1     0.7     0.5     0.8
  P(AB)         0.4     0.4     0.2     0       0
  P(A ∪ B)      0.6    −0.1     0.8     0.8     1.1
  P(AB^c)      −0.1    −0.2     0.1     0.3     0.3
  P(A^cB)       0.3    −0.3     0.5     0.5     0.8
  P(A) + P(B)   1.0     0.3     1.0     0.8     1.1

Table 1.3

Only the third and fourth assignments are permissible.

Solution to Exercise 1.15 (p. 21)
P(A) + P(B) + P(C) = 1, P(A) = 2P(C), and P(B) = P(A) + P(C) = 3P(C), which implies P(C) = 1/6, P(A) = 1/3, P(B) = 1/2.

Solution to Exercise 1.16 (p. 21)

P(A ∪ B ∪ C) = P(A ∪ B) + P(C) − P(AC ∪ BC) = P(A) + P(B) − P(AB) + P(C) − P(AC) − P(BC) + P(ABC)

Solution to Exercise 1.17 (p. 21)
Follows from P(AB) = P(A) and P(AB) + P(A^cB) = P(B), which implies P(A^cB) = P(B) − P(A).

Solution to Exercise 1.18 (p. 22)
Follows from P(A) + P(B) − P(AB) = P(A ∪ B) ≤ 1.

Solution to Exercise 1.19 (p. 22)
A Venn diagram shows P(A ⊕ B) = P(AB^c) + P(A^cB) = P(A) + P(B) − 2P(AB).

Solution to Exercise 1.20 (p. 22)
AB ⊂ A ⊂ A ∪ B implies P(AB) ≤ P(A) ≤ P(A ∪ B) = P(A) + P(B) − P(AB) ≤ P(A) + P(B). The general case follows similarly, with the last inequality determined by subadditivity.

Solution to Exercise 1.21 (p. 22)
Clearly P(E) ≥ 0 and P(Ω) = c1 P1(Ω) + c2 P2(Ω) = 1. E = ⋁_{i=1}^∞ Ei implies

P(E) = c1 Σ_{i=1}^∞ P1(Ei) + c2 Σ_{i=1}^∞ P2(Ei) = Σ_{i=1}^∞ P(Ei)

The pattern is the same for the general case, except that the sum of two terms is replaced by the sum of n terms ci Pi(E).

Solution to Exercise 1.22 (p. 22)
Clearly Q(E) ≥ 0, and since Ai Ω = Ai we have Q(Ω) = 1. If E = ⋁_{k=1}^∞ Ek, then P(EAi) = Σ_{k=1}^∞ P(Ek Ai) for all i. Interchanging the order of summation shows that Q is countably additive.
Chapter 2
Minterm Analysis

2.1 Minterms^1

2.1.1 Introduction
A fundamental problem in elementary probability is to find the probability of a logical (Boolean) combination of a finite class of events, when the probabilities of certain other combinations are known. If we partition an event F into component events whose probabilities can be determined, then the additivity property implies that the probability of F is the sum of these component probabilities. Frequently, the event F is a Boolean combination of members of a finite class, say {A, B, C} or {A, B, C, D}. For each such finite class, there is a fundamental partition determined by the class. The members of this partition are called minterms. Any Boolean combination of members of the class can be expressed as the disjoint union of a unique subclass of the minterms. If the probability of every minterm in this subclass can be determined, then by additivity the probability of the Boolean combination is determined. We examine these ideas in more detail.

2.1.2 Partitions and minterms
To see how the fundamental partition arises naturally, consider first the partition of the basic space produced by a single event A:

Ω = A ⋁ A^c

Now if B is a second event, then A = AB ⋁ AB^c and A^c = A^cB ⋁ A^cB^c, so that

Ω = A^cB^c ⋁ A^cB ⋁ AB^c ⋁ AB

The pair {A, B} has partitioned Ω into {A^cB^c, A^cB, AB^c, AB}. Continuation in this way leads systematically to a partition by three events {A, B, C} or four events {A, B, C, D}, etc. We illustrate the fundamental patterns in the case of four events {A, B, C, D}. We form the minterms as intersections of members of the class, with various patterns of complementation. For a class of four events, there are 2^4 = 16 such patterns, hence 16 minterms. These are, in a systematic arrangement:

  A^cB^cC^cD^c   A^cBC^cD^c   AB^cC^cD^c   ABC^cD^c
  A^cB^cC^cD     A^cBC^cD     AB^cC^cD     ABC^cD
  A^cB^cCD^c     A^cBCD^c     AB^cCD^c     ABCD^c
  A^cB^cCD       A^cBCD       AB^cCD       ABCD

Table 2.1

^1 This content is available online at <http://cnx.org/content/m23247/1.7/>.
for the generating class {A. it is customary to refer to these sets as variables. 2n An examination of the development above shows that if we begin with a class of minterms. in that order.2 We may view these quadruples of zeros and ones as binary representations of integers.1. No. Since the actual content of any minterm depends upon the sets A. (2. and D in the generating class. With this scheme. the minterm arrangement above becomes 0000 ∼ 0 0001 ∼ 1 0010 ∼ 2 0011 ∼ 3 0100 ∼ 4 0101 ∼ 5 0110 ∼ 6 0111 ∼ 7 1000 ∼ 8 1001 ∼ 9 1010 ∼ 10 1011 ∼ 11 1100 ∼ 12 1101 ∼ 13 1110 ∼ 14 1111 ∼ 15 Table 2. set A is the right half of the diagram and set C is the lower half. for example. Similar splits occur in the other cases. this causes no problems. ω in the basic space. The answers to the four questions above can be represented numerically by the scheme No ∼0 and Yes Thus. The membership of any minterm depends upon the membership of each generating set A. because each diers from the others by complementation of at least one member event. one of which is sure to occur on each trial. In a similar way. B. B. we introduce a simple numbering system for the minterms. four. D . No. the minterms represent mutually exclusive events. but set B is split. it is useful to refer to If the members of the generating class are treated in a xed order. MINTERM ANALYSIS Table 2. if ∼1 ω is in Ac B c C c Dc . D}. the answers are tabulated as 0 0 0 0. which we illustrate by considering again the four events A.26 CHAPTER 2. in that order. we can determine the membership of each partition. C. Then Suppose. . Remark. If ω is in AB c C c D. the answers are: ω is in the minterm AB c C c D. C. as shown in the table. B. C or D. and the relationships between them. B. In the threevariable case. Yes. C . we may designate Ac B c C c D c = M 0 (minterm 0) AB c C c D = M9 (minterm 9). and ve generating events.1 No element can be in more than one minterm. Thus. n events. which may also be represented by their decimal equivalents. then this is designated 1001 . there are To aid in systematic handling. for the cases of three. As we see below. etc. These are illustrated in Figure 2.3) We utilize this numbering scheme on special Venn diagrams called minterm maps. Each element answers to four questions: Is it in ω is assigned to exactly one of the minterms by determining the A? Is it in B ? Is it in C ? Is it in D ? Yes. so that it is the union of the second and fourth columns. then each minterm number arrived at in the manner above species a minterm uniquely. one or more of the minterms are empty (impossible events). Frequently. the minterms by number. For some classes. Thus. Other useful arrangements of minterm maps are employed in the analysis of switching circuits. the minterms form a That is.
f . Minterm expansion Each Boolean combination of the elements in a generating class may be expressed as the disjoint union of an appropriate subclass of the minterms. M i er m apsf t ee. or ve variables.1.and fve varabl . 2.1: Minterm maps for three. This representation is known as the minterm expansion for the .3 Minterm maps and the minterm expansion The signicance of the minterm partition of the basic space rests in large measure on the following fact. gur 1.27 A B 0 2 4 6 B 1 3 5 7 C Three variables A B 0 4 8 12 B 1 5 9 13 D 2 6 10 14 C D 3 7 11 15 Four variables A B C 0 1 E 2 D 3 E 7 11 15 19 23 27 31 6 10 14 18 22 26 30 4 5 8 9 C 12 13 16 17 20 21 C 24 25 28 29 B C Five variables Fi e 1. four. nt m or hr our i i es Figure 2.
Hence. MINTERM ANALYSIS {A. 7) = p (1) + p (3) + p (7) (2. A key to establishing the expansion is to note that each minterm is either a subset of the combination or is disjoint from it. 7) Minterm expansion for Example 2. we let let Mi represent the i th minterm (numbering from zero) and p (i) represent the probability of that minterm.4) Figure 2. In deriving an expression for a given Boolean combination which holds for any class events.1. 2.28 CHAPTER 2. 5. 4. 6. which we designate M (6. 4. B. E = M (1. 6. so that its complement (B ∪ C c ) = M (1. 7). 6. The existence and uniqueness of the expansion is made plausible by simple examples utilizing minterm maps to determine graphically the minterm content of various Boolean combinations. A general verication using indicator functions is sketched in the last section of this module. . 2.2: expansion) E = AB ∪ Ac (B ∪ C c )c = M (1. The combination c B∪C c = M (0. Examination of the minterm map in Figure 2. M (1. When we deal with a union of minterms in a minterm expansion. whether empty or not. C . 7). it is convenient to utilize the shorthand illustrated in the following. D} of four combination. Example 2. 3. 5). 7). Similarly. 6. 7) . its presence does not modify the set content or probability assignment for the Boolean combination. We consider several simple examples and illustrate the use of minterm maps in formulation and solution. The expansion is thus the union of those minterms included in the combination. This leaves the common c c c c part A (B ∪ C ) = M1 . M7 .1 ( Minterm Consider the following simple example. F = A ∪ B C = M (1.2 shows that AB consists of the union of minterms M6 .4 Use of minterm maps A typical problem seeks the probability of certain Boolean combinations of a class of events when the probabilities of various other combinations is given. 3. we include all possible minterms.1: Minterm expansion Suppose E = AB ∪ Ac (B ∪ C c ) c . 3. Using the arrangement and numbering system introduced above. 7) = M1 M3 M7 and p (1. If a minterm is empty for a given class.
2.1.4 Use of minterm maps
A typical problem seeks the probability of certain Boolean combinations of a class of events when the probabilities of various other combinations are given. We consider several simple examples and illustrate the use of minterm maps in formulation and solution.

Example 2.2: Survey on software
Statistical data are taken for a certain student population with personal computers. An individual is selected at random. Let A = the event the person selected has word processing, B = the event he or she has a spread sheet program, and C = the event the person has a data base program. The data imply:

• The probability is 0.80 that the person has a word processing program: P(A) = 0.8.
• The probability is 0.65 that the person has a spread sheet program: P(B) = 0.65.
• The probability is 0.30 that the person has a data base program: P(C) = 0.3.
• The probability is 0.10 that the person has all three: P(ABC) = 0.1.
• The probability is 0.05 that the person has neither word processing nor spread sheet: P(A^cB^c) = 0.05.
• The probability is 0.65 that the person has at least two: P(AB ∪ AC ∪ BC) = 0.65.
• The probability of word processor and data base, but no spread sheet, is twice the probability of spread sheet and data base, but no word processor: P(AB^cC) = 2P(A^cBC).

a. What is the probability that the person has exactly two of the programs?
b. What is the probability that the person has only the data base program?

Several questions arise:
• Are these data consistent?
• Are the data sufficient to answer the questions?
• How may the data be utilized to answer the questions?

SOLUTION
The data, expressed in terms of minterm probabilities, are:

P(A) = p(4, 5, 6, 7) = 0.80; hence P(A^c) = p(0, 1, 2, 3) = 0.20
P(B) = p(2, 3, 6, 7) = 0.65; hence P(B^c) = p(0, 1, 4, 5) = 0.35
P(C) = p(1, 3, 5, 7) = 0.30; hence P(C^c) = p(0, 2, 4, 6) = 0.70
P(ABC) = p(7) = 0.10; P(A^cB^c) = p(0, 1) = 0.05
P(AB ∪ AC ∪ BC) = p(3, 5, 6, 7) = 0.65
P(AB^cC) = p(5) = 2p(3) = 2P(A^cBC)

These data are shown on the minterm map in Figure 2.3(a). We use the patterns displayed in the minterm map to aid in an algebraic solution for the various minterm probabilities.

p(2, 3) = P(A^c) − p(0, 1) = 0.20 − 0.05 = 0.15
p(6, 7) = P(B) − p(2, 3) = 0.65 − 0.15 = 0.50, so p(6) = 0.50 − p(7) = 0.40
p(3, 5) = p(3, 5, 6, 7) − p(6, 7) = 0.65 − 0.50 = 0.15, so p(3) = 0.05 and p(5) = 0.10 (since p(5) = 2p(3))
p(1) = P(C) − p(3) − p(5) − p(7) = 0.30 − 0.05 − 0.10 − 0.10 = 0.05, so p(0) = 0.05 − p(1) = 0 and p(2) = 0.15 − p(3) = 0.10
p(4) = P(A) − p(5, 6, 7) = 0.80 − 0.60 = 0.20

Thus, all minterm probabilities are determined. They are displayed in Figure 2.3(b). From these we get

P(A^cBC ⋁ AB^cC ⋁ ABC^c) = p(3) + p(5) + p(6) = 0.55 and P(A^cB^cC) = p(1) = 0.05

[Figure 2.3: Minterm maps for the software survey, Example 2.2.]

Example 2.3: Survey on personal computers
A survey of 1000 students shows that 565 have PC compatible desktop computers, 515 have Macintosh desktop computers, and 151 have laptop computers. 51 have all three, 124 have both PC and laptop computers, 212 have at least two of the three, and twice as many own both PC and laptop as those who have both Macintosh desktop and laptop. A person is selected at random from this population. What is the probability he or she has at least one of these types of computer? What is the probability the person selected has only a laptop?

[Figure 2.4: Minterm probabilities for the computer survey, Example 2.3.]

SOLUTION
Let A = the event of owning a PC desktop, B = the event of owning a Macintosh desktop, and C = the event of owning a laptop. We utilize a minterm map for three variables to help determine minterm patterns. For example, the event AC = M5 ⋁ M7, so that P(AC) = p(5) + p(7) = p(5, 7).

The data, expressed in terms of minterm probabilities, are:

P(A) = p(4, 5, 6, 7) = 0.565; hence P(A^c) = p(0, 1, 2, 3) = 0.435
P(B) = p(2, 3, 6, 7) = 0.515; hence P(B^c) = p(0, 1, 4, 5) = 0.485
P(C) = p(1, 3, 5, 7) = 0.151; hence P(C^c) = p(0, 2, 4, 6) = 0.849
P(ABC) = p(7) = 0.051; P(AC) = p(5, 7) = 0.124
P(AB ∪ AC ∪ BC) = p(3, 5, 6, 7) = 0.212
P(AC) = p(5, 7) = 2p(3, 7) = 2P(BC)

We use the patterns displayed in the minterm map to aid in an algebraic solution for the various minterm probabilities.

p(5) = p(5, 7) − p(7) = 0.124 − 0.051 = 0.073
p(3, 7) = P(BC) = 0.124/2 = 0.062, so p(3) = 0.062 − 0.051 = 0.011
p(6) = p(3, 5, 6, 7) − p(3) − p(5, 7) = 0.212 − 0.011 − 0.124 = 0.077
p(4) = P(A) − p(6) − p(5, 7) = 0.565 − 0.077 − 0.124 = 0.364
p(1) = P(C) − p(3) − p(5, 7) = 0.151 − 0.011 − 0.124 = 0.016
p(2) = P(B) − p(3, 7) − p(6) = 0.515 − 0.062 − 0.077 = 0.376
p(0) = P(C^c) − p(2) − p(4) − p(6) = 0.849 − 0.376 − 0.364 − 0.077 = 0.032

We have determined the minterm probabilities, which are displayed on the minterm map in Figure 2.4. We may now compute the probability of any Boolean combination of the generating events A, B, C. Thus,

P(A ∪ B ∪ C) = 1 − P(A^cB^cC^c) = 1 − p(0) = 0.968 and P(A^cB^cC) = p(1) = 0.016
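As a quick sanity check (our own, not part of the text), the minterm probabilities just found should reproduce every item of data. A few lines of MATLAB confirm this; note MATLAB indexes from 1, so pm(k+1) holds p(k):

  % Consistency check for Example 2.3
  pm = [.032 .016 .376 .011 .364 .073 .077 .051];
  A = [0 0 0 0 1 1 1 1]; B = [0 0 1 1 0 0 1 1]; C = [0 1 0 1 0 1 0 1];
  disp([pm*A' pm*B' pm*C' pm(8) sum(pm([6 8])) sum(pm([4 6 7 8]))])
  %  0.5650  0.5150  0.1510  0.0510  0.1240  0.2120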
[Figure 2.5: Minterm probabilities for the opinion survey, Example 2.4.]

Example 2.4: Opinion survey
A survey of 1000 persons is made to determine their opinions on four propositions. Let A, B, C, D be the events a person selected agrees with the respective propositions. Survey results show the following probabilities for various combinations:

P(A) = 0.200, P(B) = 0.500, P(C) = 0.300, P(D) = 0.700
P(A(B ∪ C^c)D^c) = 0.055, P(A ∪ BC ∪ D^c) = 0.520, P(A^cBC^cD) = 0.200, P(ABCD) = 0.015
P(AB^cC) = 0.030, P(A^cB^cC^cD) = 0.195, P(A^cBC) = 0.120, P(ACD^c) = 0.120
P(AC^c) = 0.140, P(ABC^cD^c) = 0.025, P(A^cB^cD^c) = 0.020

Determine the probabilities for each minterm and for each of the following combinations:

A^c(BC^c ∪ B^cC), that is, not A and (B or C, but not both)
A ∪ BC^c, that is, A or (B and not C)

At the outset, it is not clear that the data are consistent or sufficient to determine the minterm probabilities. However, an examination of the data shows that there are sixteen items (including the fact that the sum of all minterm probabilities is one). Thus, there is hope, but no assurance, that a solution exists. A step elimination procedure, as in the previous examples, shows that all minterms can in fact be calculated. The results are displayed on the minterm map in Figure 2.5. It would be desirable to be able to analyze the problem systematically. The formulation above suggests a more systematic algebraic formulation which should make possible machine aided solution.
2.1.5 Systematic formulation
Use of a minterm map has the advantage of visualizing the minterm expansion in direct relation to the Boolean combination. The algebraic solutions of the previous problems involved ad hoc manipulations of the data minterm probability combinations to find the probability of the desired target combination. We seek a systematic formulation of the data as a set of linear algebraic equations with the minterm probabilities as unknowns, so that standard methods of solution may be employed.

Example 2.5: The software survey problem reformulated
The data, expressed in terms of minterm probabilities, are:

P(A) = p(4, 5, 6, 7) = 0.80
P(B) = p(2, 3, 6, 7) = 0.65
P(C) = p(1, 3, 5, 7) = 0.30
P(ABC) = p(7) = 0.10
P(A^cB^c) = p(0, 1) = 0.05
P(AB ∪ AC ∪ BC) = p(3, 5, 6, 7) = 0.65
P(AB^cC) = p(5) = 2p(3) = 2P(A^cBC), so that p(5) − 2p(3) = 0

We also have in any case

P(Ω) = P(A ∪ A^c) = p(0, 1, 2, 3, 4, 5, 6, 7) = 1

to complete the eight items of data needed for determining all eight minterm probabilities. The first datum can be expressed as an equation in minterm probabilities:

0·p(0) + 0·p(1) + 0·p(2) + 0·p(3) + 1·p(4) + 1·p(5) + 1·p(6) + 1·p(7) = 0.80

This is an algebraic equation in p(0), ..., p(7) with a matrix of coefficients [0 0 0 0 1 1 1 1]. The others may be written out accordingly, giving eight linear algebraic equations in the eight variables p(0) through p(7). Each equation has a matrix or vector of zero-one coefficients indicating which minterms are included. These may be written in matrix form as follows:

  [ 1  1  1  1  1  1  1  1 ]   [ p(0) ]   [ 1    ]   P(Ω)
  [ 0  0  0  0  1  1  1  1 ]   [ p(1) ]   [ 0.80 ]   P(A)
  [ 0  0  1  1  0  0  1  1 ]   [ p(2) ]   [ 0.65 ]   P(B)
  [ 0  1  0  1  0  1  0  1 ]   [ p(3) ] = [ 0.30 ]   P(C)
  [ 0  0  0  0  0  0  0  1 ]   [ p(4) ]   [ 0.10 ]   P(ABC)
  [ 1  1  0  0  0  0  0  0 ]   [ p(5) ]   [ 0.05 ]   P(A^cB^c)
  [ 0  0  0  1  0  1  1  1 ]   [ p(6) ]   [ 0.65 ]   P(AB ∪ AC ∪ BC)
  [ 0  0  0 -2  0  1  0  0 ]   [ p(7) ]   [ 0    ]   P(AB^cC) − 2P(A^cBC)

• The patterns in the coefficient matrix are determined with the aid of a minterm map.
• The solution utilizes an algebraic procedure, which could be carried out in a variety of ways, including several standard computer packages for matrix operations. We show in the module Minterms and MATLAB Calculations (Section 2.2) how we may use MATLAB for both aspects.
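For instance, the system above can be solved directly with MATLAB's matrix left division. This is a sketch of one of the "variety of ways" (our own); it reproduces the minterm probabilities found by hand in Example 2.2:

  % Direct solution of the software-survey system
  CM = [1 1 1 1 1 1 1 1;      % P(Omega)
        0 0 0 0 1 1 1 1;      % P(A)
        0 0 1 1 0 0 1 1;      % P(B)
        0 1 0 1 0 1 0 1;      % P(C)
        0 0 0 0 0 0 0 1;      % P(ABC)
        1 1 0 0 0 0 0 0;      % P(A^cB^c)
        0 0 0 1 0 1 1 1;      % P(AB u AC u BC)
        0 0 0 -2 0 1 0 0];    % p(5) - 2p(3)
  b = [1 0.80 0.65 0.30 0.10 0.05 0.65 0]';
  pm = (CM\b)'                % p(0), ..., p(7)
  pm = 0  0.0500  0.1000  0.0500  0.2000  0.1000  0.4000  0.1000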
2.1.6 Indicator functions and the minterm expansion
Previous discussion of the indicator function shows that the indicator function for a Boolean combination of sets is a numerical valued function of the indicator functions for the individual sets. As an indicator function, it takes on only the values zero and one. The value of the indicator function for any Boolean combination must be constant on each minterm. For example, for each ω in the minterm AB^cCD^c, we must have IA(ω) = 1, IB(ω) = 0, IC(ω) = 1, and ID(ω) = 0. Thus, any function of IA, IB, IC, ID must be constant over the minterm.

Consider a Boolean combination E of the generating sets. If ω is in E ∩ Mi, then IE(ω) = 1 for all ω ∈ Mi, so that Mi ⊂ E. Since each ω ∈ Mi for some i, E must be the union of those minterms sharing an ω with E. Let {Mi : i ∈ JE} be the subclass of those minterms on which IE has the value one. Then

E = ⋁_{i ∈ JE} Mi

which is the minterm expansion of E.

2.2 Minterms and MATLAB Calculations^2

The concepts and procedures in this unit play a significant role in many aspects of the analysis of probability topics and in the use of MATLAB throughout this work.

2.2.1 Minterm vectors and MATLAB
The systematic formulation in the previous module Minterms (Section 2.1) shows that each Boolean combination, as a union of minterms, can be designated by a vector of zero-one coefficients. A coefficient one in the ith position (numbering from zero) indicates the inclusion of minterm Mi in the union. We formulate this pattern carefully below and show how MATLAB logical operations may be utilized in problem setup and solution.

Suppose E is a Boolean combination of A, B, C. Then, by the minterm expansion,

E = ⋁_{i ∈ JE} Mi

where Mi is the ith minterm and JE is the set of indices for those Mi included in E. For example, consider

E = A(B ∪ C^c) ∪ A^c(B ∪ C^c)^c = M1 ⋁ M4 ⋁ M6 ⋁ M7 = M(1, 4, 6, 7)
F = A^cB^c ∪ AC = M0 ⋁ M1 ⋁ M5 ⋁ M7 = M(0, 1, 5, 7)

We may designate each set by a pattern of zeros and ones (e0, e1, ..., e7). The ones indicate which minterms are present in the set. In the pattern for set E, minterm Mi is included in E iff ei = 1. This is, in effect, another arrangement of the minterm map. In this form, it is convenient to use the same symbol for the name of the event and for the minterm vector or matrix representing it, written as a row matrix or row vector [e0 e1 ··· e7]. Thus, for the examples above,

E ∼ [0 1 0 0 1 0 1 1] and F ∼ [1 1 0 0 0 1 0 1]

^2 This content is available online at <http://cnx.org/content/m23248/1.8/>.
As a rst step. B . considered above E = A (B ∪ C c ) ∪ Ac (B ∪ C c ) ∼ [0 1 0 0 1 0 1 1] Then c and F = Ac B c ∪ AC ∼ [1 1 0 0 0 1 0 1] (2.20) MATLAB logical operations on zeroone matrices provide a convenient way of handling Boolean combinations of minterm vectors represented as matrices. This means the jth element of E∩F is the We illustrate for the case of the two combinations E and F of three generating sets.6: Minterms for the class . the class of generating sets is respectively. B. then the logical combination E = [1 0 1 0 1 0 1 1]. for example. E∩F E∪F has all the minterms in either set. second minterm vector for the family (i. Minterm vectors for Boolean combinations If E and of length 2n . E C is the matrix obtained by interchanging one and zero in each place in E.. E∪F is the maximum of the jth elements for the two vectors. Use MATLAB logical operations to obtain the minterm vector for any Boolean combination. B. F are combinations of n generating sets. This means the c has zeros and ones interchanged. 2. MATLAB logical operations E ∩ F ∼ [0 1 0 0 0 0 0 1] . Then the minterm vectors for A. C}. if EF is the minterm vector for E&F is the minterm vector for E ∩ F. 1. Suppose. For two zeroone matrices E. This suggests a general approach to determining minterm vectors for Boolean combinations. {A.1). then Thus. MATLAB implementation A key step in the procedure just outlined is to obtain the minterm vectors for the generating elements {A. The minterm expansion for of the vector for 3. we determine the minterm vector with the aid of a minterm map. The minterm expansion for the vector for E and F and want to obtain the minterm vector 2. C}. E. This means the j th element minimum of the jth elements for the two vectors.e. A = [0 0 0 0 1 1 1 1] If B = [0 0 1 1 0 0 1 1] E = (A&B)  C C = [0 1 0 1 0 1 0 1] of the matrices yields (2.21) E = AB ∪ C c . 1. the minterm vector for replicated twice to give B ). has only those minterms in both sets. Start with minterm vectors for the generating sets. we suppose we have minterm vectors for of Boolean combinations of these. E ∪ F. are {A. and C. and E =1−E is the minterm vector for Ec . Example 2. F of the same size : : : EF is the matrix obtained by taking the maximum element in each place. We have an mfunction to provide such fundamental vectors. c The minterm expansion for E has only those minterms not in the expansion for E.k) generates the k th minterm vector for a class of n generating sets. C}. B. The j th element of Ec is one i the corresponding vector for E element of E is zero. the basic zeroone pattern 0 0 1 1 is For example to produce the 0 The function 0 1 1 0 0 1 1 minterm(n. and E c ∼ [1 0 1 1 0 1 0 0] (2.35 It should be apparent that this formalization can be extended to sets generated by any nite class. We wish to develop a systematic way to determine these vectors. then each is represented by a unique minterm vector In the treatment in the module Minterms (Section 2. F are minterm vectors for sets by the same name.19) E ∪ F ∼ [1 1 0 0 1 1 1 1] . E&F is the matrix obtained by taking the minimum element in each place.
1). C = minterm(3.2) 0 0 1 1 = minterm(3.36 CHAPTER 2. > A = minterm(3. c. 5. F = (A&B)(∼B&C) F = 0 1 0 0 0 1 1 1 G = A(∼A&C) G = 0 1 0 1 1 1 1 1 islogical(A) % Test for logical array ans = 0 islogical(F) ans = 1 m = max(A. Logical operators to give new logical arrays. Zeroone arrays in MATLAB The treatment above hides the fact that a rectangular array of zeros and ones can have two quite dierent meanings and functions in MATLAB. 7) These basic minterm patterns are useful not only for Boolean combinations of events but also for many aspects of the analysis of those random variables which take on only a nite number of values.B) % A matrix operation m = 0 0 1 1 1 1 1 1 islogical(m) ans = 0 m1 = AB % A logical operation m1 = 0 0 1 1 1 1 1 1 islogical(m1) ans = 1 a = logical(A) % Converts 01 matrix into logical array a = 0 0 0 0 1 1 1 1 b = logical(B) m2 = ab .7: Minterm patterns for the Boolean combinations G = A ∪ Ac C F = (A&B)(∼B&C) F = 0 1 0 G = A(∼A&C) G = 0 1 0 JF = find(F)1 JF = 1 5 6 0 1 0 1 1 1 1 1 1 1 % Use of find to determine index set for F 7 % Shows F = M(1.3) 0 1 0 1 F = AB ∪ B c C A = B B = C C = 1 0 0 1 0 1 1 1 0 1 1 1 Example 2. A numerical matrix (or vector) subject to the usual operations on matrices. Array operations with numerical matrices to give numerical results.2). 1..3). MINTERM ANALYSIS A = minterm(3. Some simple examples will illustrate the principal properties. logical array whose elements are combined by a. 6. B = minterm(3. b. A 2. Array operations (element by element) to give numerical matrices.1) 0 0 0 0 = minterm(3.
37 m2 = 0 0 1 1 1 1 1 1 p = dot(A. p = A*B' p = 2 p1 = total(A. k = 0.05 0. C = program. simple for loop yields the following mfunction. If of these events occur on a trial can be that exactly characterized simply in terms of the minterm expansion. which gives the sum of each column. For more complicated cases. 3. we have an mfunction called csort (for sort and consolidate) to perform this operation.*b) p1 = 2 p3 = total(A. M = mintable(3) M = .10] where (2. the minterm probabilities are pm = [0 0. We count the number of successes corresponding to each minterm by using the MATLAB function sum.10 0.05 0. The function Use of the function minterm in a mintable(n) Generates a table of minterm vectors for n generating sets. it would be easy to determine each distinct value and add the probabilities on the minterms which yield this value.23) If we have the minterm probabilities. The event Akn k occur is given by Akn = the In the matrix event disjoint union of those minterms with exactly k positions uncomplemented (2. It is desired to get the probability an individual selected has SOLUTION k of these.*B) p3 = 2 p4 = a*b' % Cannot use matrix operations on logical arrays ??? Error using ==> mtimes % MATLAB error signal Logical inputs must be scalar. The following example in the case of three variables illustrates the procedure. 2. Example 2. pm = 0.9: The software survey (continued) In the software survey problem. Example 2.01*[0 5 10 5 20 10 40 10]. it is easy to pick out the appropriate minterms and combine the probabilities. B = event has spread sheet. Often it is desirable to have a table of the generating minterm vectors.B) % Equivalently. 1. The M = mintable(n) these are the minterms corresponding to columns with exactly k Bkn that k or more occur is given by n Bkn = r=k Arn (2.24) event has a data base A = event has word processor.10 0. the event that exactly k k of n events.20 0.8: Mintable for three variables M3 = 0 0 0 M3 = mintable(3) 0 0 0 0 1 1 1 0 1 1 0 0 1 0 1 1 1 0 1 1 1 As an application of mintable.22) ones.40 0. We form a mintable for three variables. In this case. consider the problem of determining the probability of {Ai : 1 ≤ i ≤ n} is any nite class of events.
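The same consolidation can be done without csort, using the standard accumarray built-in. A sketch (our own) for the survey data:

  % Distribution of the number of events occurring, via accumarray
  M  = [0 0 0 0 1 1 1 1; 0 0 1 1 0 0 1 1; 0 1 0 1 0 1 0 1];
  pm = 0.01*[0 5 10 5 20 10 40 10];
  T  = sum(M);
  pk = accumarray(T'+1, pm')'   % pk(j) = P(exactly j-1 of the events occur)
  pk = 0  0.3500  0.5500  0.1000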
we simply pick out the probabilities corresponding to the minterms in the expansion and add them.09.pk] = csort(T.0000 0. 10 These generate and name the minterm vector for each generating set and its complement.25) . if desired V = [AAc. A&B. minvec4. we have a family of minvec procedures sets. we cannot indicate the superscript.pm). Example 2. disp([k.07 (2. p4 = 0. p3 = 0. p2 = 0. The approach is much more useful in the case of Independent Events. In MATLAB. Ac&Cc].5500 3. determines distinct values in T and consolidates probabilities For three variables. A&B&C. For a larger number of variables. because of the ease of determining the minterm probabilities. and suppose the respective minterm probabilities are p0 = 0. C.11. it is easy enough to identify the various combinations by eye and make the combinations. however.. .pk]') 0 0 1. p7 = 0.1000 % 3 % % % % Column sums give number of successes on each minterm. A. Minvec procedures Use of the tilde ∼ to indicate the complement of an event is often awkward. we can determine the probability of any Boolean combination of the members of the class. respectively.38 CHAPTER 2.3500 2. To facilitate writing combinations. B. the minterm vector. ABC . . In the following example.29.0000 0. so we indicate the instead of ∼ E.14. A.21. % Logical combinations (one per % row) yield logical vectors disp(V) 1 1 1 1 1 1 1 1 % Mixed logical and 0 0 0 0 1 1 1 1 % numerical vectors 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 1 0 1 0 0 0 0 0 Minterm probabilities and Boolean combination If we have the probability of every minterm generated by a nite class. It is customary to indicate the complement of an event complement by E c E by Ec . p5 = 0. (minvec3. minvec3 % Call for the setup procedure Variables are A. p1 = 0. p6 = 0.. Bc. AB . we do this by hand then show how to do it with MATLAB . When we know the minterm expansion or. equivalently.03.. 5. C. · · · . this may become tedious.0000 0. MINTERM ANALYSIS 1 0 1 1 2 1 1 0 2 1 1 1 0 0 0 0 0 0 1 0 1 1 0 1 0 1 0 T = sum(M) T = 0 1 1 2 [k. minvec10) to expedite expressing Boolean combinations of n = 3.06.11 Consider E = A (B ∪ C c ) ∪ Ac (B ∪ C c ) c and F = Ac B c ∪ AC of the example above. Ac.10: Boolean combinations using minvec3 We wish to generate a matrix whose rows are the minterm vectors for C. and A C c c Ω = A ∪ Ac . Cc They may be renamed. 4. Example 2.
Use of a minterm map shows E = M(1, 4, 6, 7) and F = M(0, 1, 5, 7), so that

P(E) = p(1, 4, 6, 7) = 0.36 and P(F) = p(0, 1, 5, 7) = 0.37

This is easily handled in MATLAB.
• Use minvec3 to set the generating minterm vectors.
• Use logical matrix operations E = (A&(B|Cc))|(Ac&~(B|Cc)) and F = (Ac&Bc)|(A&C) to obtain the (logical) minterm vectors for E and F.
• If pm is the matrix of minterm probabilities, perform the algebraic dot product or scalar product of the pm matrix and the minterm vector for the combination. This can be called for by the MATLAB commands PE = E*pm' and PF = F*pm'.

The following is a transcript of the MATLAB operations.

  minvec3       % Call for the setup procedure
  Variables are A, B, C, Ac, Bc, Cc
  They may be renamed, if desired.
  E = (A&(B|Cc))|(Ac&~(B|Cc));
  F = (Ac&Bc)|(A&C);
  pm = 0.01*[21 6 29 11 9 3 14 7];
  PE = E*pm'    % Picks out and adds the minterm probabilities
  PE = 0.3600
  PF = F*pm'
  PF = 0.3700

Example 2.12: Solution of the software survey problem
We set up the matrix equations with the use of MATLAB and solve for the minterm probabilities. From these, we may solve for the desired target probabilities.

  minvec3
  Variables are A, B, C, Ac, Bc, Cc
  They may be renamed, if desired.
  % Data vector combinations are:
  DV = [A|Ac; A; B; C; A&B&C; Ac&Bc; (A&B)|(A&C)|(B&C); (A&Bc&C)-2*(Ac&B&C)]
  DV = 1  1  1  1  1  1  1  1     % Data mixed numerical
       0  0  0  0  1  1  1  1     % and logical vectors
       0  0  1  1  0  0  1  1
       0  1  0  1  0  1  0  1
       0  0  0  0  0  0  0  1
       1  1  0  0  0  0  0  0
       0  0  0  1  0  1  1  1
       0  0  0 -2  0  1  0  0
  DP = [1 0.8 0.65 0.3 0.1 0.05 0.65 0];   % Corresponding data probabilities
  pm = DV\DP'   % Solution for minterm probabilities
  pm = 0.0000   % Roundoff: 3.5 x 10^-17
       0.0500
       0.1000
       0.0500
       0.2000
       0.1000
       0.4000
       0.1000
  TV = [(A&B&Cc)|(A&Bc&C)|(Ac&B&C); Ac&Bc&C]   % Target combinations
  TV = 0  0  0  1  0  1  1  0     % Target vectors
       0  1  0  0  0  0  0  0
  PV = TV*pm    % Solution for target probabilities
  PV = 0.5500
       0.0500

Example 2.13: An alternate approach
The previous procedure first obtained all minterm probabilities, then used these to determine probabilities for the target combinations. Sometimes the data are not sufficient to calculate all minterm probabilities, yet are sufficient to allow determination of the target probabilities. The following procedure does not require calculation of the minterm probabilities. Suppose the data minterm vectors are linearly independent, and the target minterm vectors are linearly dependent upon the data vectors (i.e., the target vectors can be expressed as linear combinations of the data vectors). Now each target probability is the same linear combination of the data probabilities. To determine the linear combinations, solve the matrix equation

TV = CT * DV, which has the MATLAB solution CT = TV/DV

Then the matrix tp of target probabilities is given by tp = CT*DP'. Continuing the MATLAB procedure above, we have:

  CT = TV/DV;
  tp = CT*DP'
  tp = 0.5500
       0.0500

2.2.2 The procedure mincalc
The procedure mincalc performs calculations as in the preceding examples. The refinements consist of determining consistency and computability of various individual minterm probabilities and target probabilities. The consistency check is principally for negative minterm probabilities. The computability tests are tests for linear independence by means of calculation of ranks of various matrices. The procedure picks out the computable minterm probabilities and the computable target probabilities and calculates them.

To utilize the procedure, the problem must be formulated appropriately and precisely, as follows:
1. Use the MATLAB program minvecq to set minterm vectors for each of q basic events.
2. Data consist of Boolean combinations of the basic events and the respective probabilities of these combinations. These are organized into two matrices:
• The data vector matrix DV has the data Boolean combinations, one on each row. MATLAB translates each row into the minterm vector for the corresponding Boolean combination. The first entry (on the first row) is A|Ac (for A ∪ A^c), which is the whole space. Its minterm vector consists of a row of ones.
• The data probability matrix DP is a row matrix of the data probabilities. The first entry is one, the probability of the whole space.
3. The objective is to determine the probability of various target Boolean combinations. These are put into the target vector matrix TV, one on each row. MATLAB produces the minterm vector for each corresponding target Boolean combination.

Usual case. Suppose the data minterm vectors are linearly independent and the target vectors are each linearly dependent on the data minterm vectors. Then each target minterm vector is expressible as a linear combination of data minterm vectors. Thus, there is a matrix CT such that TV = CT*DV. MATLAB solves this with the command CT = TV/DV. The target probabilities are the same linear combinations of the data probabilities. These are obtained by the MATLAB operation tp = DP*CT'.

Computational note. In mincalc, it is necessary to turn the arrays DV and TV, consisting of zero-one patterns, into zero-one numerical matrices. Both the original and the transformed matrices have the same zero-one pattern, but MATLAB interprets them differently. This is accomplished for DV by the operation DV = ones(size(DV)).*DV, and similarly for TV.

Cautionary notes
The program mincalc depends upon the provision in MATLAB for solving equations when less than full data are available (based on the singular value decomposition). There are several situations which should be dealt with as special cases.
1. The Zero Problem. If the total probability of a group of minterms is zero, then it follows that the probability of each minterm in the group is zero. However, if mincalc does not have enough information to calculate the separate minterm probabilities in the case they are not zero, it will not pick up in the zero case the fact that the separate minterm probabilities are zero. It simply considers these minterm probabilities not computable.
2. Linear dependence. In the case of linear dependence, the operation called for by the command CT = TV/DV may not operate appropriately. The matrix may be singular, or it may not be able to decide which of the redundant data equations to use; apparently the approximation process does not always converge. Should it provide a solution, the result should be checked with the aid of a minterm map.
3. Consistency check. Since the consistency check is for negative minterms, there is no simple check on the consistency if there are not enough data to calculate the minterm probabilities. Sometimes the probability of a target vector included in another vector will actually exceed what should be the larger probability. Without considerable checking, it may be difficult to determine consistency. It is usually a good idea to check results by hand to determine whether they are consistent with data. The checking by hand is usually much easier than obtaining the solution unaided, so that use of MATLAB is advantageous even in questionable cases.
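A target probability is computable exactly when its minterm vector lies in the row space of the data matrix, which can be tested by comparing ranks. A sketch of this kind of test (our own; mincalc's internal implementation is not shown in the text):

  % Rank test for computability of a target
  DV = [ones(1,8); 0 0 0 0 1 1 1 1; 0 0 1 1 0 0 1 1; 0 1 0 1 0 1 0 1];
  t  = [0 0 0 0 0 0 1 1];                 % minterm vector for AB
  computable = (rank([DV; t]) == rank(DV))
  computable = 0   % P(AB) is not determined by P(Omega), P(A), P(B), P(C) alone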
MATLAB Solutions for examples using mincalc

Example 2.14: Software survey

  % file mcalc01.m   Data for software survey
  minvec3;
  DV = [A|Ac; A; B; C; A&B&C; Ac&Bc; (A&B)|(A&C)|(B&C); (A&Bc&C)-2*(Ac&B&C)];
  DP = [1 0.8 0.65 0.3 0.1 0.05 0.65 0];
  TV = [(A&B&Cc)|(A&Bc&C)|(Ac&B&C); Ac&Bc&C];
  disp('Call for mincalc')

  mcalc01             % Call for data
  Call for mincalc    % Prompt supplied in the data file
  mincalc
  Data vectors are linearly independent
  Computable target probabilities
      1.0000    0.5500
      2.0000    0.0500
  The number of minterms is 8
  The number of available minterms is 8
  Available minterm probabilities are in vector pma
  To view available minterm probabilities, call for PMA
  disp(PMA)           % Optional call for minterm probabilities
           0         0
      1.0000    0.0500
      2.0000    0.1000
      3.0000    0.0500
      4.0000    0.2000
      5.0000    0.1000
      6.0000    0.4000
      7.0000    0.1000

Example 2.15: Computer survey

  % file mcalc02.m   Data for computer survey
  minvec3;
  DV = [A|Ac; A; B; C; A&B&C; A&C; (A&B)|(A&C)|(B&C); 2*(B&C)-(A&C)];
  DP = 0.001*[1000 565 515 151 51 124 212 0];
  TV = [A|B|C; Ac&Bc&C];
  disp('Call for mincalc')

  mcalc02
  Call for mincalc
  mincalc
  Data vectors are linearly independent
  Computable target probabilities
      1.0000    0.9680
      2.0000    0.0160
  The number of minterms is 8
  The number of available minterms is 8
  Available minterm probabilities are in vector pma
  To view available minterm probabilities, call for PMA
  disp(PMA)
           0    0.0320
      1.0000    0.0160
      2.0000    0.3760
      3.0000    0.0110
      4.0000    0.3640
      5.0000    0.0730
      6.0000    0.0770
      7.0000    0.0510
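With the minterm probabilities in pma available, further combinations for the computer survey need no new data run. A small illustrative query (our own, mirroring the text's E*pm' usage):

  % An additional target from the available minterm probabilities
  minvec3;
  pma = [0.032 0.016 0.376 0.011 0.364 0.073 0.077 0.051];
  T = (A|B)&Cc;        % owns a desktop (either kind) but no laptop
  PT = T*pma'
  PT = 0.8170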
48. A&Bc&C.4800 The number of minterms is 16 The number of available minterms is 16 Available minterm probabilities are in vector pma To view available minterm probabilities. . Suppose the probability that at least one of the events A occurs is 0.0150 The procedure mincalct A useful modication.0800 0. A((B&C)Dc) . 2. A&B&Cc&Dc]. Ac&Bc&Dc. This procedure assumes a data le similar to that for mincalc.0850 0.0350 0. C..0200 0. B. ing and computing the minterm probabilities.001*[1000 200 500 300 700 55 520 200 15 30 195 120 120 . disp('Call for mincalc') mincalc03 Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.0850 0.org/content/m24171/1. without checkTV . We may simply use the procedure mincalct and input the desired Boolean combination at the prompt. Ac&B&Cc&D.0000 0.1950 0. D.2000 0..2850 (A&D)(B&Dc) Repeated calls for mcalct may be used to compute other target probabilities.43 % file mcalc03. A&B&C&D. Determine the probability that neither of the events A or C but at least one of the events B or D occurs.75 and the probability that at least one of the four events occurs is 0. C. DP = 0. except that it does not need the target matrix The procedure mincalct may be used after mincalc has performed its operations to calculate probabilities for additional target combinations. B. A..0200 0. It is not necessary to recalculate all the other quantities..0150 0.3 Problems on Minterm Analysis3 Exercise 2. computes the available target probabilities. . call for PMA disp(minmap(pma)) % Display arranged as on minterm map 0.0350 0. which we call mincalct. A&(BCc)&Dc. TV = [Ac&((B&Cc)(Bc&C)). Example 2.) C {A. mincalct Enter matrix of target Boolean combinations Computable target probabilities 1.0000 0.m Data for opinion survey minvec4 DV = [AAc. A&C&Dc.4000 2.0500 0.0500 0.90. 140 25 20].1 Consider the class or (Solution on p. Ac&B&C.4/>. Ac&Bc&Cc&D. D} of events. A(B&Cc)].0200 0.0100 0.0000 0. A&Cc. 3 This content is available online at <http://cnx. since it prompts for target Boolean combination inputs.0850 0.17: (continued) Additional target datum for the opinion survey Suppose mincalc has been applied to the data for the opinion survey and that it is desired to determine P (AD ∪ BDc ).
c c Determine P (A(BC ) ) and P (AB ∪ AC ∪ BC).0756 0. Repeat part (1) using the mprocedure minvec3 and MATLAB logical operations. 49.0216 0. 50. . P (Ac B) = 0.0432 0. P (AC ∪ A C). and P (ABC c ) = 0. P (C) = 0. c c c and P (A B C ) = 0. C.2. c.0216 0.0108 0.0672 0.) Exercise 2.6 Suppose (Solution on p.0336 0.) possible.0336 0.) 1.1.0504 0. B.0324 0. 3.4.2352 0.0756 0.65. P (AC) = 0.0144 0.0672 0.7 Suppose and P ((AB c ∪ Ac B) C) = 0. P (AC) = 0.1. Repeat part (1) using indicator functions (evaluated on minterms).0288 0.2.0224 0.0336 0. 48. (3) the mprocedure a. arranged as on a minterm map are 0.2.0144 0.) P (A ∪ B c C) = 0.3.0096 0.0392 0. if (Solution on p. arranged as on a minterm map are 0.2 (Solution on p. (Solution on p.0336 0.0216 0. C}: a. b. b.30.0252 0. C.6. MINTERM ANALYSIS Exercise 2.5.0336 What is the probability that three or more of the events occur on a trial? Of exactly four? Of three or fewer? Of either two or four? Exercise 2.25. P (A) = 0.0216 0. if possible.0072 0. A (B ∪ C c ) ∪ Ac BC ⊂ A (BC ∪ C c ) ∪ Ac B A ∪ Ac BC = AB ∪ BC ∪ AC ∪ AB c C c (Solution on p.4.0504 0.0588 0.1008 0. and P (Ac (B ∪ C c )).0144 0. if possible.0216 0. 49.) Use (1) minterm maps.0144 0. P (AB c C c ) = 0.25 P (Ac C c ) = 0. D. E}.5 Minterms for the events (Solution on p.1.) Exercise 2. P ((AB c ∪ Ac ) C c ).9 Suppose (Solution on p.3. P (AB) = 0.1568 0.0504 0.3. Exercise 2. c c c c c Then determine P ((AB ∪ A ) C ) and P (A (B ∪ C )).) {A.0216 0. A ∪ (BC) = A ∪ B ∪ B c C c c (A ∪ B) = Ac C ∪ B c C A ⊂ AB ∪ AC ∪ BC c 2.0324 0. Determine P (Ac C c ∪ AC). B.0252 0.) P (A) = 0. D}. P (Ac B) = 0. P (Ac C c ) = 0.3 minvec3 and MATLAB logical operations to show that (Solution on p.0504 0.0168 0.0504 0.0336 0.44 CHAPTER 2.0168 0. (2) indicator functions (evaluated on minterms).0144 0. c c c c c Determine P ((A ∪ B) C ).0144 0.4 Minterms for the events {A. P (C) = 0.5. Determine P ((AC c ∪ Ac C) B c ). B. 48.0096 0.0224 0. Use minterm maps to show which of the following statements are true for any class {A. Exercise 2.1008 What is the probability that three or more of the events occur on a trial? Of exactly two? Of two or fewer? Exercise 2. P (AB) = P (AC) = 0. 49.8 Suppose P (A) = 0.0504 0. and P (AC ∪ A B). 50. 48. Exercise 2. P (BC c ) = 0.6.
15 (2. Suggestion Express the numbers as a fraction.12 (Solution on p.05 P (Ac B c C c ) = 0. A= the event the student is male.2. 20 engineering students will take a foreign language and plan graduate study. c c Determine P (A B ∪ A (C ∪ D)). and P (ACDc ) = 0. 30 of whom do not identify with a political party. and treat as probabilities. and How many nonmembers of a political party did not vote? C A be the event of rain in Austin. 53.) Exercise 2. 52. What is the probability of rain in exactly one of the cities? Exercise 2.4. including all of the women.) Exercise 2.) be the event of rain be the event of rain in San Antonio. What is the probability of rain in all three cities? b. P (AB c ) = 0.) One hundred students are questioned about their course of study and plans for graduate study.29) P (BC) = 0.45 Then repeat with additional data P (Ac B c C c ) = 0. P (Ac B c ) = 0. What is the probability of rain in exactly two of the three cities? c. 75 students will take foreign language classes. 26 men and 19 women plan graduate study. Exercise 2. 52. 13 male engineering students and 8 women engineering students plan graduate study. What is the probability of selecting a student who plans foreign language classes and graduate study? . 23 engineering students. What is the probability of a female student active in sports? Exercise 2. and are active in sports 8 percent are male and live o campus 17 percent are male students inactive in sports a.10 Given: P (A) = 0. B= the event the student is studying engineering. 10 of whom are female. 10 of whom are women.1.20.6. on campus student who is not active in sports? c.13 During a period of unsettled weather.11 A survey of a represenative group of students yields the following information: • • • • • • • 52 percent are male 85 percent live on campus 78 percent are male or are active in intramural sports (or both) 30 percent live on campus but are not active in sports 32 percent are male. women students plan foreign language study and graduate study. C = the D = the event the student is planning graduate study (including professional school). 52. What is the probability that a randomly chosen student is male and lives on campus? b. (Solution on p.30) a. 50 are members of a political party. live on campus.1 and P (Ac BC) = 0.) A survey of 100 persons of voting age reveals that 60 are male.45 (2. P (AC c ) = 0. (Solution on p. 51. event the student plans at least one year of foreign language.30 P (B c C) = 0. a. 5 non engineering students plan graduate study but no foreign language courses.14 Let (Solution on p. 20 nonmembers of a party voted in the last election. 11 non engineering.15.35. What is the probability of a male. let in Houston. (Solution on p. The results of the survey are: There are 55 men students. P (AB c ∪ Ac B) = 0. Suppose: P (AB) = 0. B P (AC) = 0.05.
1. 55 students live on campus. What is the probability of selecting a women engineer who does not plan graduate study? c. b. 209 have seen at least b.49.) A customer enters the store. 35 of the reasonably active students read the newspaper regularly. and headlight adjustment 325 needed at least two of these three items 125 needed headlight and brake work 550 needed at wheel alignment a.45. and P (Ac B c C c ) = 0. P (Ac ) = 0. Let Exercise 2. printers. 62 have seen at least a and c. MINTERM ANALYSIS b. c c c c c Determine P (B ∪ C). 112 have seen at least c.) • • • • 100 needed wheel alignment. 54. 70 consider themselves reasonably active in student aairs50 of these live on campus. • • (a) How many have seen at least one special? (b) How many have seen only one special program? Exercise 2. brake repair.6.15 A survey of 100 students shows that: (Solution on p. How many inactive men are not regular readers? Exercise 2. and c. 54. 197 have seen at least two of the programs. which we call a. C be the respective events the customer buys a computer.16 area have watched three recent special programs. but not both is 0. c c Repeat the problem with the additional data P (A BC) = 0. 45 have seen all three. P (ABC c ) of buying both a computer and a monitor but not a printer is P (AC) of buying both a computer and a printer is 0.) Of the 1000 persons A television station runs a telephone survey to determine how many persons in its primary viewing 221 have seen at least a.3. a printer. non readers of the newspaper. and P (A (B ∪ C )).18 Suppose (Solution on p. 56. if possible. What is the probability of selecting a male student who either studies a foreign language but does not intend graduate study or will not study a foreign language but plans graduate study? Exercise 2.2 and P (A B) = 0.50. a monitor.3. the results are: (Solution on p. 10 of the oncampus women readers consider themselves active. B.17.) 60 are men students. (Solution on p.46 CHAPTER 2. a.39 P (AC c Ac C) of buying a computer or a printer. Assume the following probabilities: • • • • • The probability The probability 0. How many who do not need wheel alignment need one or none of the other items? Exercise 2. 5 men are active. as do 5 of the o campus women. P ((AB ∪ A B ) C ∪ AC). How many active men are either not readers or o campus? b. All women who live on campus and 5 who live o campus consider themselves to be active. A. The probability The probability The probability P (AB) of buying both a computer and a monitor is 0. 40 read the student newspaper regularly. How many needed only wheel alignment? b. 25 of whom are women. ocampus. the number having seen at least a and b is twice as large as the number who have seen at least b and c. . 25 of whom are women. P (BC) of buying both a monitor and a printer is 0. monitors. 55. 54.) P (A (B ∪ C)) = 0. surveyed.17 An automobile safety inspection station found that in 1000 cars tested: (Solution on p.19 A computer store sells computers.
P (Ac C c ) = 0.062.3. What is the probability of buying at least two? d.197 and P (BC) = 2P (AC).21 with the original data probability matrix.65.) AB replaced by AC in Does mincalc work in this case? Check results on a . c P (AB c ) (2.3. With only six items of data (including Try the additional data P (Ac B c ) .43. Repeat. 56.22 Repeat Exercise 2.228.1 and P (Ac B c ) = 0. P (AC) = 0. P (ABC) = 0.4.045. P (C) of buying each? b.) Exercise 2.31) P (Ω) = P (A A ) = 1). P (C) = 0.23 Repeat Exercise 2. Determine available minterm probabilities and the following. What is the result? Explain the reason Exercise 2.20 Data are (Solution on p. not all minterms are available. with the additional data P (C) = 0.21 Data are: P (A) = 0. (Solution on p. 58. Are these consistent and linearly (Solution on p.230.) P (A) = 0. but with the data vector matrix. c c Determine P (A ∪ B ∪ C) and P (A B C). minterm map.25. What is the probability of buying exactly two of the three items? c. but not both is 0.3.21 with for this result.47 • • The probability The probability P (AB c P (BC c Ac B) of buying a computer or a monitor. P (A ∪ B) . P (Ac BC c ) = 0. P (AB ∪ AC ∪ BC) = 0. 57. if computable: P (AC c ∪ Ac C) . but not both is 0. or a. B c C) of buying a monitor or a printer. What is the probability P (A). P (B).5.43. if possible. What is the result? (Solution on p.232. What is the probability of buying all three? Exercise 2. P (AB) changed from 0. P (ABC) = 0. P (B) = 0.3 to 0. 59.) independent? Are all minterm probabilities available? Exercise 2. P (AB) = 0.
0.75 = 0. H = (Ac&C)(Bc&C). F = (A&((B&C)Cc))(Ac&B).H]) 1 1 0 0 0 0 1 0 1 0 K = (A&B)(A&C)(B&C). Bc. 44) 0 0 1 1 1 1 % E subset of F 1 1 1 1 1 1 % G = H We use mintable(4) and determine positions with correct number(s) of ones (number of occurrences). E = (A&(BCc))(Ac&B&C). 44) so that P (Ac C c (B ∪ D)) = (2. disp([G.F]) 1 1 1 0 1 1 0 1 1 1 G = ∼(AB).15 Solution to Exercise 2. C. Cc They may be renamed.8. Ac. minvec3 Variables are A. Cc They may be renamed. H = (A&B)(B&C)(A&C)(A&Bc&Cc). F = AB(Bc&Cc). disp([G. c P (A ∪ C ∪ B ∪ D) = P (A ∪ C)+P (Ac C c (B ∪ D)) . An alternate is to use minvec4 and express the Boolean combinations which give the correct number(s) of ones. Ac.90 − 0. Use mintable(4) a = mintable(4). minvec3 Variables are A. disp([E. .3 (p.2 (p. which displays the essential patterns. B.4 (p.F]) 0 0 0 1 1 0 0 1 1 1 G = A(Ac&B&C). Bc. MINTERM ANALYSIS Solutions to Exercises in Chapter 2 Solution to Exercise 2. 43) Use the pattern P (E ∪ F ) = P (E) + P (E c F ) and (A ∪ C) = Ac C c .H]) 0 0 0 1 1 0 0 0 1 1 Solution to Exercise 2. if desired. which displays the essential patterns. disp([A. npr02_04 (Section~17. C. B.48 CHAPTER 2.32) We use the MATLAB procedure. if desired. E = A∼(B&C).1: npr02_04) Minterm probabilities are in pm.1 (p. disp([E. 44) 1 1 1 1 1 1 % Not equal 0 1 1 1 0 0 1 1 0 0 1 1 % Not equal % A not contained in K We use the MATLAB procedure.K]) 0 0 0 0 1 0 0 0 1 0 Solution to Exercise 2.
% Number of ones in each minterm position P1 = (s>=3)*pm' % Select and add minterm probabilities P1 = 0. Use mintable(5) a = mintable(5). The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities.3000 % The first and third target probability 3.2: npr02_05) Minterm probabilities are in pm. Ac.8.3: npr02_06) % Data for Exercise~2. npr02_05 (Section~17.7 .8. Ac&B. 44) We use mintable(5) and determine positions with correct number(s) of ones (number of occurrences). B. Call for mincalc mincalc Data vectors are linearly independent % Data file Computable target probabilities 1.25 0.25 0. Cc They may be renamed.3500 % is calculated.49 s = sum(a).6 (p. A(Bc&C).0000 0.3728 P3 = (s<=2)*pm' P3 = 0. A&C. 44) % file npr02_06.0000 0.1712 P3 = (s<=3)*pm' P3 = 0.6 minvec3 DV = [AAc.30]. Bc. disp('Call for mincalc') npr02_06 % Call for data Variables are A. Check with minterm map. C. s = sum(a).4: npr02_07) % Data for Exercise~2.5380 P2 = (s==4)*pm' P2 = 0. ((A&Bc)Ac)&Cc.5284 % Number of ones in each minterm position % Select and add minterm probabilities Solution to Exercise 2. P1 = (s>=3)*pm' P1 = 0.7952 P4 = ((s==2)(s==4))*pm' P4 = 0.65 0. call for PMA Solution to Exercise 2. DP = [1 0.7 (p. B&Cc].4716 P2 = (s==2)*pm' P2 = 0.5 (p. TV = [((A&Cc)(Ac&C))&Bc. Ac&(BCc)]. if desired. 44) % file npr02_07. Ac&Cc.m (Section~17.m (Section~17.4784 Solution to Exercise 2.8.20 0.
C. Cc They may be renamed.0000 0. Ac&Cc.4 0. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. Ac. 44) % file npr02_09.6 0. (A&Cc)(Ac&B)].3 0. B. disp('Call for mincalc') npr02_08 % Call for data Variables are A.4000 % even though not all minterms are available 3.8 minvec3 DV = [AAc. ((A&Bc)(Ac&B))&C.4 0. Bc. A&B. 44) % file npr02_08.50 CHAPTER 2.m (Section~17. A&Bc&Cc].5 0. disp('Call for mincalc') npr02_07 % Call for data Variables are A.m (Section~17. Ac&B.8. call for PMA Solution to Exercise 2. C. A. C. (A&Cc)(Ac&C).2 0. Cc They may be renamed. if desired. Ac&(BCc)]. TV = [(AB)&Cc. A. DP = [ 1 0.4000 The number of minterms is 8 The number of available minterms is 6 Available minterm probabilities are in vector pma To view available minterm probabilities.0000 0.2 0.0000 0.9 minvec3 .6 0. ((A&Bc)Ac)&Cc.1]. B.3 0. TV = [(Ac&Cc)(A&C).9 (p.6: npr02_09) % Data for Exercise~2. DP = [ 1 0.5: npr02_08) % Data for Exercise~2.0000 0. Ac. Bc.5000 % All target probabilities calculable 2. C. A&C.1].8. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.8 (p. call for PMA Solution to Exercise 2. Ac&Bc&Cc].0000 0.0000 0.4000 % even though not all minterms are available 3.5000 The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities. if desired.7000 % All target probabilities calculable 2. MINTERM ANALYSIS minvec3 DV = [AAc.
A. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. Ac.10 minvec4 DV = [AAc. B. npr02_09 % Call for data Variables are A. Ac&B&C]. (A&B)(A&C)(B&C)]. A&B&Cc].7: npr02_10) % Data for Exercise~2. 45) % file npr02_10. call for PMA DV = [DV.1]. A&B.0000 0. A.1]. % DP = [DP 0. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. if desired. A&C. mincalc Data vectors are linearly independent Computable target probabilities 1.4 0.51 DV = [AAc. Cc They may be renamed. C. call for PMA Solution to Exercise 2. disp('Call for mincalc') npr02_10 Variables are A.0000 0. DP = [ 1 0. C.1 0. DP = [1 0. Ac&B&C]. Ac&Bc.1 0.05]. Bc.3 0.4500 % even though not all minterms are available The number of minterms is 8 The number of available minterms is 6 Available minterm probabilities are in vector pma To view available minterm probabilities.m (Section~17. Bc. Ac. TV = [A&(∼(B&Cc)).8.5 0.4000 % Only the first target probability calculable The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities.05].7000 % Checks with minterm map solution The number of minterms is 16 The number of available minterms is 0 Available minterm probabilities are in vector pma .10 (p. % Modification of data DP = [DP 0.0000 0. disp('Call for mincalc') % Modification for part 2 % DV = [DV. B. A&C&Dc]. TV = [(Ac&B)(A&(CcD))].3 0. A&Cc. Cc.2 0. Ac&Bc&Cc.0000 0.6 0. Dc They may be renamed. Ac&Bc&Cc.4000 % Both target probabilities calculable 2. D. if desired.
Ac. call for PMA Solution to Exercise 2.0000 0.85 0. A&Cc]. Ac&C].32 0.m (Section~17. call for PMA Solution to Exercise 2. A&B&Cc.2600 The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities. disp('Call for mincalc') npr02_11 Variables are A.30 0.52 0.17]. 45) % file npr02_11.9: npr02_12) % Data for Exercise~2. A&Bc.4400 2. if desired. B&Cc. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.20 0. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. Cc They may be renamed. TV = [Bc&Cc].8.8.60 0. if desired.12 (p. Bc&C. B = party member.8: npr02_11) % Data for Exercise~2.11 (p.1200 3.m (Section~17.12 % A = male. A&Bc. B. C. 45) % file npr02_12. DP = [ 1 0.08 0.30 0.10]. Ac.50 0. C = voted last election minvec3 DV = [AAc. Cc They may be renamed. B. Bc. B = on campus.78 0. B. Ac&Bc&C]. call for PMA . A&B&C. Bc.52 CHAPTER 2. B.3000 The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities. C. C = active in sports minvec3 DV = [AAc.0000 0. A. MINTERM ANALYSIS To view available minterm probabilities. disp('Call for mincalc') npr02_12 Variables are A. DP = [ 1 0. A.11 % A = male. TV = [A&B.0000 0. AC.0000 0.
45) % file npr02_13. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.05 0.. A&Bc. Ac&C. if desired.0000 0. A&B. C.8. A&D.0000 0. Dc They may be renamed.05 0. % C = foreign language. TV = [A&B&C.15 0.m (Section~17.30 0. C. C.53 Solution to Exercise 2.3900 2. Ac&B&D. Cc They may be renamed. call for PMA . DP = [1 0. disp('Call for mincalc') npr02_13 Variables are A.26 0. 45) % file npr02_14.20 0. D = graduate study minvec4 DV = [AAc.0000 0. . Ac. Cc. B&C. disp('Call for mincalc') npr02_14 Variables are A.11: npr02_14) % Data for Exercise~2. Bc&C.19 0. (A&Bc)(Ac&B). A.45 0.13 0.10 0. if desired.75 0. Bc&Cc&D. TV = [C&D.13 (p.20 0. (A&B&Cc)(A&Bc&C)(Ac&B&C). B = engineering.10: npr02_13) % Data for Exercise~2. A&C. B. B.2500 3.23 0.13 % A = rain in Austin.55 0.45 0. DP = [ 1 0. D. call for PMA Solution to Exercise 2. Ac&D. Bc.8. A&((C&Dc)(Cc&D))].2600 % Third target probability not calculable The number of minterms is 16 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities. B = rain in Houston.14 (p.08 0. Ac&Dc.35 0. Bc. Ac&Bc&Cc]. B&C&D. B. (A&Bc&Cc)(Ac&B&Cc)(Ac&Bc&C)].4000 The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities. Ac&B..11]. % C = rain in San Antonio minvec3 DV = [AAc.2000 2. Ac&Bc&C&D].15]. A&B&D. Ac. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.14 % A = male.m (Section~17.0000 0.0000 0.
if desired. A&Bc&Cc&D]. A.55 0. Ac&Bc&C&D.17 . call for PMA Solution to Exercise 2. B. DP = [ 1 0. Ac&B&D. if desired. Bc. B = on campus.209 0. C.05 0. B.25 0.25 0. npr02_16 Variables are A. C. TV = [A&D&(CcBc). Dc They may be renamed.197 0. 46) % file npr02_16. Bc.13: npr02_16) % Data for Exercise~2. Ac. A.8. disp('Call for mincalc') npr02_15 Variables are A.0000 0. D.m (Section~17. Ac. A&C. (A&B)(A&C)(B&C).70 0.3000 2. C&D.12: npr02_15) % Data for Exercise~2.2500 The number of minterms is 16 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities.16 (p.54 CHAPTER 2.8.17 (p. D = active minvec4 DV = [AAc. (A&Bc&Cc)(Ac&B&Cc)(Ac&Bc&C)].. MINTERM ANALYSIS Solution to Exercise 2. . TV = [ABC. D.05 0.6 0.16 minvec3 DV = [AAc. C.112 0.05]. A&B&C.15 % A = men.m (Section~17. B&D. Ac&C.35 0. call for PMA Solution to Exercise 2. Ac&B. DP = [1 0. C. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.15 (p. B. Ac&B&C&D.0000 0.m (Section~17. B.1030 The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities. A&Dc&Cc]. 46) % file npr02_15..0000 0.10 0.0000 0. (A&B)2*(B&C)].40 0.25 0.50 0.3000 2.062 0]. Cc. C = readers. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.045 0.14: npr02_17) % Data for Exercise~2. 46) % file npr02_17. Cc They may be renamed. Ac&Bc&D.8.221 0.
DP = [ 1 0. 46) % file npr02_18.550].2 0. % DP = [DP 0.4000 . % Modified data DP = [DP 0. if desired.0000 0.0000 0. call for PMA DV = [DV. Cc They may be renamed. A&(BC). Ac&B&C. (((A&B)(Ac&Bc))&Cc)(A&C). Ac&Bc&Cc]. Ac&B&C. C = headlight minvec3 DV = [AAc. Cc They may be renamed. A ].55 % A = alignment.3 0.0000 0. Ac.0000 0. mincalc % New calculation Data vectors are linearly independent Computable target probabilities 1.8000 2.m (Section~17.4250 The number of minterms is 8 The number of available minterms is 3 Available minterm probabilities are in vector pma To view available minterm probabilities.1].15: npr02_18) % Date for Exercise~2. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. Ac&(∼(B&C))]. B&C. Bc. C. C. Bc. if desired. Ac&B]. Ac.325 0. TV = [BC. disp('Call for mincalc') % Modification % DV = [DV. TV = [A&Bc&Cc. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. Ac&(BCc)].2 0. B = brake work. Ac.8. A&B&C.3].18 minvec3 DV = [AAc. npr02_18 Variables are A.6 0.8000 2.3].4000 The number of minterms is 8 The number of available minterms is 2 Available minterm probabilities are in vector pma To view available minterm probabilities. call for PMA Solution to Exercise 2. DP = [ 1 0.0000 0.18 (p. Ac&B]. B. disp('Call for mincalc') npr02_17 Variables are A.100 0. B.125 0. (A&B)(A&C)(B&C).0000 0.2500 2.
B. if desired.3200 The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities.228 0.45 0..0000 0.19 % A = computer.0000 0. C.8000 2.232 0.8. TV = [ABC.17: npr02_20) Variables are A.17 0.20 (p.4000 The number of minterms is 8 The number of available minterms is 5 Available minterm probabilities are in vector pma To view available minterm probabilities.50 0. 46) % file npr02_19.8. A.39 0.49 0. A&B&C]. Ac. 47) % file npr02_20. (A&B)(A&C)(B&C). C.2*(A&C)].m (Section~17. disp('Call for mincalc') % Modification % DV = [DV. disp('Call for mincalc') npr02_19 Variables are A.17: npr02_20) % Data for Exercise~2. B = monitor.0000 0.20 minvec3 DV = [AAc. C. call for PMA Solution to Exercise 2. % DP = [DP 0. B&C . Ac&Bc&Cc]. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. A&B.43 0. (A&Bc)(Ac&B). B. B&C. C]. . A&C. TV = [A. call for PMA Solution to Exercise 2. if desired. A&B&C. (A&B&Cc)(A&Bc&C)(Ac&B&C).230 ].0000 0. DP = [1 0. npr02_20 (Section~17. (B&Cc)(Bc&C)]. .3700 5. Bc..16: npr02_19) % Data for Exercise~2. MINTERM ANALYSIS 3.43]. Cc They may be renamed. DP = [ 1 0. (A&Cc)(Ac&C).6100 3.56 CHAPTER 2.6900 6. C = printer minvec3 DV = [AAc. Cc They may be renamed.0000 0. Bc. A&C. A&B&Cc.0000 0.062 0.m (Section~17.8.045 0. Ac.0000 0.19 (p.6000 4. (A&B)(A&C)(B&C). B. B.197 0].
21 minvec3 DV = [AAc.0000 0.21 (p. call for PMA disp(PMA) 2.57 mincalc Data vectors are linearly independent Data probabilities are INCONSISTENT The number of minterms is 8 The number of available minterms is 6 Available minterm probabilities are in vector pma To view available minterm probabilities.18: npr02_21) % Data for Exercise~2. B. call for PMA Solution to Exercise 2.1 0. A&B. disp('Call for mincalc') % Modification % DV = [DV.1800 7.0000 0. mincalc Data vectors are linearly independent Data probabilities are INCONSISTENT The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities.18: npr02_21) Variables are A.0000 0.8. Ac&B&Cc.0000 0. C]. A&B&C. Ac&B&Cc.3500 4. C.0000 0. AB.1000 The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities.0000 0.3 0.m (Section~17. Ac&Bc].230]. A&Bc].0450 % Negative minterm probabilities indicate 4. Bc. .0480 3.3 ].4 0.65 0. DP = [DP 0.0000 0. npr02_21 (Section~17. A. Ac&Cc].3 ].25 0. Ac&Bc]. call for PMA DV = [DV.0000 0. Cc They may be renamed. C. 47) % file npr02_21. TV = [(A&Cc)(Ac&C). if desired. % DP = [DP 0. Ac&Bc. Ac. DP = [ 1 0.8.0100 % inconsistency of data 5.0170 6. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.0450 DV = [DV.
0000 0.0000 0.0000 0.3 ]. call pma for PMA pma for PMA . MINTERM ANALYSIS DP = [DP 0. Ac&B&Cc.2500 7.19: npr02_22) Variables are A. A&B&C. Bc.1 0.5 0. npr02_22 (Section~17.1 0. Ac.8. A&Bc]. if desired. C. Ac&Bc.m (Section~17.3 ]. Ac&B&Cc.2500 DV = [DV.1000 6. call for PMA Solution to Exercise 2.25 0.58 CHAPTER 2.19: npr02_22) % Data for Exercise~2. Ac&Cc]. mincalc Data vectors are linearly independent Computable target probabilities 1.3500 2.0000 0.0000 0.0000 0.2000 5. DP = [ 1 0. TV = [(A&Cc)(Ac&C). B.3 ]. A&B.1 0.1000 The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities. Call for mincalc mincalc Data vectors are linearly independent Data probabilities are INCONSISTENT The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector To view available minterm probabilities.22 minvec3 DV = [AAc. C.0000 0. disp('Call for mincalc') % Modification % DV = [DV.3000 3. % DP = [DP 0.8.0000 0. AB.7000 4. Ac&Bc]. Ac&Bc].22 (p. Cc They may be renamed. call disp(PMA) 4. mincalc Data vectors are linearly independent Data probabilities are INCONSISTENT The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector To view available minterm probabilities. A. DP = [DP 0.4 0. 47) % file npr02_22.3 ].65 0.
25 0.0000 4. Call for mincalc mincalc Data vectors are NOT linearly independent Warning: Rank deficient. Ac. % DP = [DP 0.65 0.2000 0. Ac&Bc]. mincalc Data vectors are NOT linearly independent Warning: Matrix is singular to working precision.2500 Solution to Exercise 2.1 0. DP = [ 1 0. Ac&Bc. DP = [DP 0. C.2000 0. 47) % file npr02_23. Ac&B&Cc. A. B. A&Bc].0243e15 Computable target probabilities 1. if desired. Ac&B&Cc. A&C. Ac&Bc]. AB.59 disp(PMA) 0 1. A&B&C.3 0.1 0. C.0000 2.23 (p.20: npr02_23) % Data for Exercise~2.4 0.0000 7. TV = [(A&Cc)(Ac&C). Cc They may be renamed.3 ].1000 0. Computable target probabilities 1 Inf % Note that p(4) and p(7) are given in data 2 Inf 3 Inf The number of minterms is 8 The number of available minterms is 6 Available minterm probabilities are in vector pma To view available minterm probabilities.8.0000 0. call for PMA . Ac&Cc].0000 5.1000 0.4500 The number of minterms is 8 The number of available minterms is 2 Available minterm probabilities are in vector pma To view available minterm probabilities.1000 0.2500 0.2000 0.0000 6.m (Section~17. call for PMA DV = [DV. disp('Call for mincalc') % Modification % DV = [DV.3 ]. Bc.0000 3. rank = 5 tol = 5. npr02_23 Variables are A.3 ].0000 0.23 minvec3 DV = [AAc.
60 CHAPTER 2. MINTERM ANALYSIS .
example • An applicant for a job as a manager of a service department is being interviewed. P (B). This leads to the notion of conditional probability. The probability P (A) For reassessment of the likelihood of event A. eats a low fat. because of the way it is derived from the original. She is somewhat overweight. His credit rating. is found to be quite low. She nds one that appears to be an excellent buy. has all the properties of the original probability measure. it may A. the conditional probability measure has a number of special properties which are important in applications. 11). His résumé shows adequate experience and other qualications. On the basis of this new information. 61 . this assignment to all events constitutes a new probability measure which C has occurred. He is considered a prospect highly likely to succeed.6/>.2 Conditional probability The original or prior probability measure utilizes all available information to make probability assignments P (A) . indicates the likelihood that event A will occur on any trial. new information is received which leads to a etc. reduced considerably. Then he discovers that she exercises regularly.1 Introduction The probability P (A) of an event A is a measure of the likelihood that the event will occur on any trial. Before He nds evidence that the car has been wrecked with The likelihood the car will be satisfactory is thus possible frame damage that has been repaired. Frequently. he reassesses the likelihood of 1 This content is available online at <http://cnx. and (P3) (p.1.Chapter 3 Conditional Probability 3. It looks clean. high ber. In addition. He suspects that she may be prone to heart problems.org/content/m23252/1. the likelihood that he is a satisfactory candidate changes radically. she has a mechanic friend look at it. has reasonable mileage. because of bad debts. He conducts himself with ease and is quite articulate in his interview. • A physician is conducting a routine physical examination on a patient in her seventies. heart problems. buying. • A young woman is seeking to purchase a used car. Given this information. Sometimes partial information determines that an event be necessary to reassign the likelihood for each event For a xed conditioning event C. and is a dependable model of a well known make. The interview is followed by an With extensive background check.1. sub ject to the dening conditions (P1). this information..1 Conditional Probability1 3. (P2). variagated diet. 3. and comes from a family in which survival well into their nineties is common.
event he or she is unfavorable. These considerations and experience with the classical case suggests the following procedure for reassignment. so that for xed probability measure. P (U ) = 150/250. N (negative). Let U be the event the student is an undergraduate and G be the event he or she is a graduate student.62 CHAPTER 3. the conditional probability of A. information determines a of event basic space. When we write has occurred. CONDITIONAL PROBABILITY conditioning event C . given C is P (AC) = For a xed conditioning event P (AC) P (C) (3. (3. this makes C a new The new unit of probability mass is New. with all the properties of an ordinary C Remark. For one thing. This is not the probability of a conditional event AC . which may call for reassessing the likelihood A. and Y be the event he or she is favorable to the plan. so the results can be taken as typical of the student body. C. but partial.1: Conditional probabilities from joint frequency data A survey of student opinion on a proposed national health care program included 250 students. Let A student is picked at random. P (ΩC) j = 1. P (AC) we are evaluating the likelihood of event A when it is known that event Conditional events have no meaning in the model we are developing. Now P (AC) j ≥ 0. and D (uncertain or no opinion). we have a new likelihood assignment to the event A. of whom 150 were undergraduates and 100 were graduate students. P (Y ) = (60 + 70) /250.1) C. and P j Aj C = P( W j Aj C ) P (C) = (3. (P2). and (P3) (p.3) P (Y U ) = P (Y U ) 60/250 60 = = P (U ) 150/250 150 (3. N be the D is the event of no opinion (or uncertain). Y U G 60 70 N 40 20 D 50 10 Table 3.4) . Although such a reassignment is not logically necessary. Eectively. Example 3. If C is an event having positive probability. subsequent developments give substantial evidence that this is the appropriate procedure.2) P (Aj C) /P (C) = P (Aj C) satises the three dening properties (P1). the new function P ( · C) probability. 11) for we have a Thus. Results are tabulated below. this means that A occurs i the event AC occurs. Then etc. made? One possibility is to make the new assignment to A How should the new probability assignments be proportional to the probability P (AC). Their responses were categorized Y (armative). P (C). P (Y U ) = 60/250. The data may reasonably be interpreted P (G) = 100/250. Denition. new probability measure.1 Suppose the sample is representative.
each sub ject is instructed to do 1. P (DU ) = 50/150. We wish to calculate probability the answer is yes given the response is honest. 2.. Let A be the event the subject is told to reply honestly.001.8) P (EA) = P (E) − P (B) P (A) (3. we can calculate P (N U ) = 40/150. P (GN ) = 20/60. etc. regardless of the truth in the matter. E P (A) selected randomly are told to the be the event the reply is yes. and C B be the event the subject is instructed The probabilities be the event the answer is to be no.5) We may also calculate directly P (U Y ) = 60/130. Example 3. More extensive treatment of the problem is given there. SOLUTION Since E = EA B. 3.2: Jet aircraft with two engines An aircraft has two jet engines.63 Similarly. and P (C) are determined by a chance mechanism (i.0003. so that if the other has already failed. to reply yes. P (EA).7) Thus reliability of any one engine may be less than satisfactory. Nonetheless. etc. 4. yet the overall reliability may be The following example is taken from the UMAP Module 576. P (A). Thus that P (F1 ) = 0. P (DG) = 10/100 (3. P (B). by Paul Mullenix.3: Responses to a sensitive question on a survey In a survey. we have P (E) = P (EA) + P (B) = P (EA) P (A) + P (B) which may be solved algebraically to give (3. Respond no regardless of the true answer. Once the rst engine fails. added P (F2 F1 ) = 0. the response given may be incorrect or misleading. Example 3. one engine fails on a long distance ight. Greenberg. It will y with only one engine operating. (3. (3. Now the second engine can fail only F2 ⊂ F1 so that P (F2 ) = P (F1 F2 ) = P (F1 ) P (F2 F1 ) = 3 × 10−7 quite high. G. if answering yes to a question may tend to incriminate or otherwise embarrass the sub ject. Respond with an honest answer to the question. P (N G) = 20/100. vol 2. no. it may be desirable to obtain correct responses for purposes of social analysis. and F2 Let F1 be the event the event the second fails. reprinted in UMAP Journal. P (Y G) = 70/100.e.6) Conditional probability often provides a natural way to deal with compound trials carried out in several steps. one of three things: By a chance process.9) . Respond yes to the question. a fraction Let answer honestly.). Experience indicates load is placed on the second. The following device for dealing with this problem is attributed to B.
1 ≤ i ≤ 5 be the event the i th of these is hired ( (for each event Richard is hired. the friend selected on an equally likely basis one of the three to be hired. etc. Jim. which we take to mean P (A) = 0. The friend tells Jim that Richard has been selected. is Thus. below) shows P (A1 B3 ) = 6 = P (A1 ) = P (A1 A3 ) 10 (3. According to the pattern above P (EA) = 27 62/250 − 14/100 = ≈ 0. Barry.4: What is the conditioning event? Five equally qualied candidates for a job. Three of these are to be selected at random. Jim analyzes the problem as follows. 3) 10 (3.14 and P (E) = 62/250. Under this assumption. computation (see Example 5. with results to be posted the next day. Jim asks the friend to tell him the name of one of those selected (other than himself ). This is usually due to some ambiguity or misunderstanding of the information provided.13) This is consistent with the fact that if Jim knows that Richard is hired. Jim. Richard. CONDITIONAL PROBABILITY Suppose there are 250 subjects. before receiving the P (A1 ) = 6 1 × C (4. Many feel that there must be information about somewhat as follows.10) The formulation of conditional probability assumes the conditioning event C is well dened. The chance mechanism is such that P (C) = 0. 3) 10 i=j (3. Jim's probability of being hired.16. ANALYSIS Let Ai . so that P (A1 A3 ) = 3 1 × C (3.7. Jim's name was The friend took the three names selected: removed and an equally likely choice among the other two was made. P (B) = 0. Many would make an assumption if Jim was one of them. One of them. and Evan. for any pair A3 has occurred. Example 3. P (A1 A3 ) = The conditional probability 3 C (3.154 70/100 175 (3. Sometimes there are subtle diculties. how the friend chose to name Richard. Now P (Ai ) i) A1 is the event Jim is hired. Hence. the information assumed is an event B3 which is not the same as A3 . C (5. 2) = = P (Ai ) . i=j the number of combinations of three from ve including these two is just the number of ways of picking one from the remaining three.64 CHAPTER 3.). In fact. 1) = = P (Ai Aj ) . then there are two to be selected from the four remaining nalists.14) Discussion Although this solution seems straightforward. it has been challenged as being incomplete. are identied on the basis of interviews and told that they are nalists. information about Richard. There are 62 responses yes. Paul.15) . 1) = = 1/2 C (4. is the probability that nalist i A3 is the is in one of the combinations of three from ve.11) The information that Richard is one of those hired is information that the event Also. It may not be entirely clear from the problem description what the conditioning event is. otherwise. 1 ≤ i ≤ 5 C (5. 2) 6 (3.12) P (A1 A3 ) = 3/10 P (A1 A3 ) = = 1/2 P (A3 ) 6/10 (3. has a friend in the personnel oce.
What is the probability that all four customes get good items? SOLUTION Let Ei be the event the i th customer receives a good item. Example 3.19) Three items are to be selected (on an equally likely basis at each step) from ten. which corresponds to the dierence in the information given (or assumed). 3. Also. Determine the probability that the rst and third selected are good.16) P (ABCD) = P (A) P (AB) P (ABC) P (ABCD) · · = P (A) P (BA) P (CAB) P (DABC) P (A) P (AB) P (ABC) (3.20) . so that P (E1 E2 E3 E4 ) = P (E1 ) P (E2 E1 ) P (E3 E1 E2 ) P (E4 E1 E2 E3 ) = 9 8 7 6 6 · · · = 10 9 8 7 10 (3.65 Both results are mathematically correct. The dierence is in the conditioning event. customers purchase one of the items. (CP1) Product rule Derivation If P (ABCD) > 0.6: A selection problem C (9. 4) 126 = = 3/5 C (10. Hence P (E1 E2 E3 E4 ) = Example 3.. P (AB) = P (A) P (BA). Then the rst chooses one of the nine out of ten good ones.1. etc. the selection is on an equally likely basis from those remaining. then P (ABCD) = P (A) P (BA) P (CAB) P (DABC). Likewise The dening expression may be written in product form: P (ABC) = P (A) and P (AB) P (ABC) · = P (A) P (BA) P (CAB) P (A) P (AB) (3. 2 By the product rule P (G1 G3 ) = P (G1 ) P (G2 G1 ) P (G3 G1 G2 ) + P (G1 ) P (Gc G1 ) P (G3 G1 Gc ) = 2 2 7 6 8 7 · 8 + 10 · 2 · 8 = 28 ≈ 0. Four successive Each time. 1 ≤ i ≤ 3 be the event the i th unit selected is good. two of which are defective. SOLUTION Let Gi . 4) 210 (3. the number of combinations of four good ones is the number of combinations of four of the nine.5: Selection of items from a lot An electronics store has ten items of a given type in stock. the second chooses one of the eight out of nine goood ones. One is defective.18) Note that this result could be determined by a combinatorial argument: under the assumptions. the events may be taken in any order.62 9 9 45 8 10 · (3. conditional probability has are consequences of the way it is related to the original probability measure special properties which P (·).17) This pattern may be extended to the intersection of any nite number of events. The following are easily derived from the denition of conditional probability and basic properties of the prior probability measure. each combination of four of ten is equally likely. Then G1 G3 = G1 G2 G3 G1 Gc G3 . and prove useful in a variety of problem situations.3 Some properties In addition to its properties as a probability measure.
5. A student passes the exam. Let Hi be the event that a student tested is from high school i. P (F H1 ) = 0.4 (What is the conditioning event?). Also A5 = A1 A5 Ac A5 . Example 3. so B 5 ⊂ A5 . 1 We therefore have B 5 = B 5 A5 = B 5 A1 A5 B 5 Ac A5 . This condition is examined more thoroughly in the chapter on "Independence of Events" (Section 4.1). P (H3 ) = 0.7 A compound experiment.23) A1 B5 = A1 A5 B5 . student is from high school P (F H2 ) = 0. One of cards in the box is drawn on an equally likely basis (from either two or three) Let Ai be the event the numbered i i th card is drawn on the rst selection and let Bi be the event the card and is drawn on the second selection (from the box). P (A1 B5 ). Three cards are selected without replacement.66 CHAPTER 3. Often in applications data lead to conditioning with respect to an event but the problem calls for conditioning in the opposite direction. Let F be the event the student fails the test. the other two are put in a box If card 1 is not drawn. a disjoint union.26) i. all three are put in a box 2.24) We thus have P (A1 B5 ) = Occurrence of event 3/20 6 = = P (A1 ) 5/20 10 (3. P (A1 B5 ). Thus. since 1 3 1 1 3 · + · = 2 10 3 10 4 3 1 3 · = 10 2 20 (3.02.25) B1 has no aect on the likelihood of the occurrence of A1 .10. Suppose data indicate P (H1 ) = 0.8: Reversal of conditioning Students in a freshman mathematics class come from three dierent high schools. they are given a diagnostic test.3. (3. 66).2. This implies P Ai Ac = P (Ai ) − P (Ai Aj ) = 3/10 j that (3. A twostep selection procedure is carried out as follows. P (H2 ) = 0. SOLUTION From Example 3. {Ai : 1 ≤ i ≤ n} of E = A1 E A2 E · · · events is mutually exclusive and An E . 1. P (B5 ) = P (B5 A1 A5 ) P (A1 A5 ) + P (B5 Ac A5 ) P (Ac A5 ) = 1 1 Also. on an equally likely basis. Their mathematical preparation varies.22) Now we can draw card ve on the second selection only if it is selected on the rst drawing.21) Five cards are numbered one through ve. • • If card 1 is drawn. Determine for each i the conditional probability P (Hi F c ) that the . CONDITIONAL PROBABILITY E Suppose the class (CP2) Law of total probability every outcome in is in one of these events. P (F H3 ) = 0. Then P (E) = P (EA1 ) P (A1 ) + P (EA2 ) P (A2 ) + · · · + P (EAn ) P (An ) Example 3. Determine P (B5 ). P (A1 B5 ) = P (A1 A5 B5 ) = P (A1 A5 ) P (B5 A1 A5 ) = (3. we have P (Ai ) = 6/10 and P (Ai Aj ) = 3/10.06 (3. In order to group them appropriately in class sections. 1 ≤ i ≤ 3. 1 By the law of total probability (CP2) (p.
a selection is then made from lot 2 (also on an equally likely basis). 67). P (G1 ) = 2/3. and probability P (T D) of a c c false negative. 2 4 1 3 11 · + · = 3 5 3 5 15 (3.31) By the law of total probability (CP2) (p. The desired probabilities are P (DT ) and P (D T ).29) The basic pattern utilized in the reversal is the following. What is the G1 be the event the rst item (from lot 1) was good. P (G2 Gc ) = 3/5 1 (3.5 + 0. probability P (T D ) of a false positive.9: A compound selection and reversal Begin with items in two lots: 1. One item is selected from lot 1 (on an equally likely basis).2962 P (F c ) 952 (3. P (F c H1 ) P (F c H1 ) P (H1 ) 180 = = = 0. then P (Ai E) = P (EAi ) P (Ai ) P (Ai E) = 1≤i≤n P (E) P (E) The law of total probability yields P (E) (3.1891 c) P (F P (F c ) 952 (3.27) P (H1 F c ) = Similarly.5147 P (F c ) 952 and P (H3 F c ) = P (F c H3 ) P (H3 ) 282 = = 0. Four items. probability the item selected from lot 1 was good? SOLUTION Let This second item is good. one defective. 66).67 SOLUTION P (F c ) = P (F c H1 ) P (H1 ) + P (F c H2 ) P (H2 ) + P (F c H3 ) P (H3 ) = 0. P (G2 G1 ) = 4/5. Three items.98 · 0.33) Example 3. one defective.2 + 0. Example 3. and G2 be the event the second item Now the data are interpreted (from the augmented lot 2) is good. this item is added to lot 2.30) Such reversals are desirable in a variety of practical situations. 2.952 Then (3. P (G2 ) = P (G1 ) P (G2 G1 ) + P (Gc ) P (G2 Gc ) = 1 1 By Bayes' rule (CP3) (p.73 P (G2 ) 11/15 11 (3.28) P (H2 F c ) = P (F c H2 ) P (H2 ) 590 = = 0. Data are usually of the form: prior probability .10: Additional problems requiring reversals • Medical tests. Suppose D is the event a patient has a certain disease and T is the event a P (D) (or prior c c c odds P (D) /P (D )). test for the disease is positive.32) P (G1 G2 ) = P (G2 G1 ) P (G1 ) 4/5 × 2/3 8 = = ≈ 0.3 = 0. n (CP3) Bayes' rule If E⊂ i=1 Ai (as in the law of total probability).94 · 0. We want to determine as P (G1 G2 ).90 · 0.
P(AnEc)] Enter matrix PEA of conditional probabilities PEA Enter matrix PA of probabilities PA P(E) = 0.2000 0.10 0. The desired probabilities P (Ai E) P (Ai E c ) are calculated and displayed Example 3. P ( · A) It is the ratio of the probabilities (or likelihoods) of C. The second fraction in the right hand member is the before knowledge of the occurrence of the conditioning event is known as the likelihood ratio. • Job success.5 0. The rst fraction in the right hand member C for the two dierent prior odds. PA = [0. named above The procedure displays the results in tabular form. high) and P (T Dc ). the various quantities are in the workspace in the matrices named.4167 0. The following variation of Bayes' rule is applicable in many practical situations. P(EAn)] and PA = [P(A1) P(A2) .g. bayes Requires input PEA = [P(EA1) P(EA2) . T If D is the event a dangerous condition exists (say a steam pressure is too is the event the safety alarm operates. The calculations.. then data are usually of the form P (D). the data are usually prior the characteristics as predictors in the form is P (HE).5147 0. We have an mprocedure called bayes to perform the calculations easily. P (DT ) and P (D T ). PA. (CP3*) Ratio form of Bayes' rule P (BC) The left hand member is called the P (AC) = posterior odds.68 CHAPTER 3. and P (EH c ).5000 0. as shown. Again.1891 0.3000 0. a rm usually examines the condition of a test market as an indicator of the national market. The probabilities P (Ai ) are put into a matrix PA and the conditional probabilities and P (EAi ) are put into matrix PEA.1000 0.. If Before launching a new product on the national market. the likelihood the national market is favorable.0600 0. which is the odds after P (AC) P (BC) = P (CA) P (CB) · P (A) P (B) of the conditioning event. and E is the event that an individual inter viewed has certain desirable characteristics.. as in Example 3. • Presence of oil.2083 0. event the national market is favorable and are a prior estimate and E H is the is the event the test market is favorable. What is desired is P (EH) P (HE) .2962 Various quantities are in the matrices PEA. desired and P (T c D). PAEc. which is the odds knowledge of the occurrence probability measures and P ( · B). data P (H) of the likelihood the national market is sound. and data P (EH c ) indicating the reliability of the test market. the c c correctly.. CONDITIONAL PROBABILITY • Safety alarm. or equivalently (e.. P(An)] Determines PAE = [P(A1E) P(A2E) .8 (Reversal of conditioning). P (EH).8 (Reversal of conditioning) PEA = [0.3]. The desired probability is P (HE)..11: MATLAB calculations for Example 3. is the event of success on a job. given the test market is favorable. and E is the event of certain geological structure (salt dome or fault). In addition. If H is the event of the presence of oil at a proposed well site. the data are usually P (H) (or the • Market condition. probabilities are that the safety alarms signals If H P (T c Dc ) and P (T D)).2 0.06].02 0.. P(AnE)] and PAEc = [P(A1Ec) P(A2Ec) . . are simple but can be tedious.3750 0. P (H) and reliability of The desired probability P (EH) and P (EH c ).048 P(EAi) P(Ai) P(AiE) P(AiEc) 0.0200 0. so that they may be used in further calculations without recopying.. odds). PAE..
(CP4) Some equivalent conditions If 0 < P (A) < 1 and 0 < P (B) < 1.69 Example 3. or > and is > . A test is performed. The test has probability 0.05 so that P (ST ) = 198 = 0.05 of a false positive and 0. and P (T c S) = 0. we examine the derivation of these results. (Ac B) > P (Ac ) i P (AB) < P (A). P (T S c ) = 0. b. =. The machine seems to be operating so well that the prior odds it is satisfactory are taken to be ten to one.12: A performance test As a part of a routine maintenance procedure.01 of a false negative. 1. CD.99 P (ST ) = · = · 10 = 198 c T ) c ) P (S c ) P (S P (T S 0. P P P P (AB) + P (Ac B) = 1. P (AB) + P (AB c ) = 1. PC (A) becomes PC (AD) = P (ACD) PC (AD) = PC (D) P (CD) (3.34) The following property serves to establish in the chapters on "Independence of Events" (Section 4. The result is positive. Reassign the conditional probabilities. We have a new conditioning event 1. P (S) /P (S c ) = 10.1) and "Conditional Independence" (Section 5. ≥. ≤.9950 199 (3. =.1. T be the event the test is Then by the ratio form of Bayes' rule P (T S) P (S) 0. VERIFICATION of (CP4) (p. VERIFICATION Exercises (see problem set) 3.05. c. (AB) > P (A) i P (Ac B c ) > P (Ac ).37) .35) P (AB) ∗ P (A) P (B) iﬀ P (Ac B c ) ∗ P (Ac ) P (B c ) iﬀ P (AB c ) P (A) P (B c ) where (3. d. a computer is given a performance test.01. ≤. 4. or < . respectively. in general. but. (AB) > P (A) i P (AB c ) < P (A). What are the posterior odds the device is operating properly? SOLUTION Let S be the event the computer is operating satisfactorily and let The data are favorable.4 Repeated conditioning Suppose conditioning by the event C has occurred. 2. Because of the role of this property in the theory of independence and conditional independence. 3.36) ∗is < .1) a number of important properties for the concept of independence and of conditional independence of events. then P (AB) ∗ P (A) iﬀ P (BA) ∗ P (B) iﬀ P (AB) ∗ P (A) P (B) and (3. 69) a. (AB) ∗ P (A) P (B) i P (AB) ∗ P (A) (divide by P (B) may exchange A and Ac ) (AB) ∗ P (A) P (B) i P (BA) ∗ P (B) (divide by P (A) may exchange B and Bc ) (AB) ∗ P (A) P (B) i [P (A) − P (AB c )] ∗ P (A) [1 − P (B c ) i −P (AB c ) ∗ −P (A) P (B c ) (AB c ) P (A) P (B c ) c We may use c to get P (AB) ∗ P (A) P (B) i P (AB ) P (A) P (B c ) i P (Ac B c ) ∗ P (Ac ) P (B c ) P P P P i A number of important and useful propositons may be derived from these. Additional information is then received that event There are two possibilities: D has occurred. ≥.
Reassign the total probabilities: P(A) becomes

P_CD(A) = P(A|CD) = P(ACD)/P(CD)    (3.38)

Basic result: P_C(A|D) = P(A|CD) = P_D(A|C). Thus repeated conditioning by two events may be done in any order, or may be done in one step. This result extends easily to repeated conditioning by any finite number of events. This result is important in extending the concept of "Independence of Events" (Section 4.1) to "Conditional Independence" (Section 5.1). These conditions are important for many problems of probable inference.

3.2 Problems on Conditional Probability²

Exercise 3.1    (Solution on p. 74.)
Given the following data:

P(A) = 0.55, P(AB) = 0.30, P(BC) = 0.20, P(A^c ∪ BC) = 0.55, P(A^c B C^c) = 0.15    (3.39)

Determine, if possible, the conditional probability P(A^c|B) = P(A^c B)/P(B).

Exercise 3.2    (Solution on p. 74.)
In Exercise 11 (Exercise 2.11) from "Problems on Minterm Analysis," we have the following data: A survey of a representative group of students yields the following information:

• 52 percent are male
• 85 percent live on campus
• 78 percent are male or are active in intramural sports (or both)
• 30 percent live on campus but are not active in sports
• 32 percent are male, live on campus, and are active in sports
• 8 percent are male and live off campus
• 17 percent are male students inactive in sports

Let A = male, B = on campus, C = active in sports.
(a) A student is selected at random. He is male and lives on campus. What is the (conditional) probability that he is active in sports?
(b) A student selected is active in sports. What is the (conditional) probability that she is a female who lives on campus?

Exercise 3.3    (Solution on p. 74.)
In a certain population, the probability a woman lives to at least seventy years is 0.70 and is 0.55 that she will live to at least eighty years. If a woman is seventy years old, what is the conditional probability she will survive to eighty years? Note that if A ⊂ B then P(AB) = P(A).

Exercise 3.4    (Solution on p. 74.)
From 100 cards numbered 00, 01, 02, · · ·, 99, one card is drawn. Suppose Ai is the event the sum of the two digits on a card is i, 0 ≤ i ≤ 18, and Bj is the event the product of the two digits is j. Determine P(Ai|B0) for each possible i.

Exercise 3.5    (Solution on p. 74.)
Two fair dice are rolled.
a. What is the (conditional) probability that one turns up two spots, given they show different numbers?
b. What is the (conditional) probability that the first turns up six, given that the sum is k, for each k from two through 12?
c. What is the (conditional) probability that at least one turns up six, given that the sum is k, for each k from two through 12?

Exercise 3.6    (Solution on p. 75.)
Four persons are to be selected from a group of 12 people, 7 of whom are women.
a. What is the probability that the first and third selected are women?
b. What is the probability that three of those selected are women?
c. What is the (conditional) probability that the first and third selected are women, given that three of those selected are women?

Exercise 3.7    (Solution on p. 75.)
Twenty percent of the paintings in a gallery are not originals. A collector buys a painting. He has probability 0.10 of buying a fake for an original but never rejects an original as a fake. What is the (conditional) probability the painting he purchases is an original?

Exercise 3.8    (Solution on p. 75.)
Five percent of the units of a certain type of equipment brought in for service have a common defect. Experience shows that 93 percent of the units with this defect exhibit a certain behavioral characteristic, while only two percent of the units which do not have this defect exhibit that characteristic. A unit is examined and found to have the characteristic symptom. What is the conditional probability that the unit has the defect, given this behavior?

Exercise 3.9    (Solution on p. 75.)
A shipment of 1000 electronic units is received. There is an equally likely probability that there are 0, 1, 2, or 3 defective units in the lot. If one is selected at random and found to be good, what is the probability of no defective units in the lot?

Exercise 3.10    (Solution on p. 75.)
Data on incomes and salary ranges for a certain population are analyzed as follows. S1 = event annual income is less than $25,000; S2 = event annual income is between $25,000 and $100,000; S3 = event annual income is greater than $100,000. E1 = event did not complete college education; E2 = event of completion of bachelor's degree; E3 = event of completion of graduate or professional degree program. Data may be tabulated as follows: P(E1) = 0.65, P(E2) = 0.30, and P(E3) = 0.05.

P(Si|Ej)    (3.40)

           S1      S2      S3
  E1      0.85    0.10    0.05
  E2      0.10    0.80    0.10
  E3      0.05    0.50    0.45
  P(Si)   0.50    0.40    0.10

Table 3.2

a. Determine P(E3 S3).
b. Suppose a person has a university education (no graduate study). What is the (conditional) probability that he or she will make $25,000 or more?
c. Find the total probability that a person's income category is at least as high as his or her educational level.

Exercise 3.11    (Solution on p. 75.)
In a survey, 85 percent of the employees say they favor a certain company policy. Previous experience indicates that 20 percent of those who do not favor the policy say that they do, out of fear of reprisal. What is the probability that an employee picked at random really does favor the company policy? It is reasonable to assume that all who favor say so.

Exercise 3.12    (Solution on p. 76.)
A quality control group is designing an automatic test procedure for compact disk players coming from a production line. Experience shows that one percent of the units produced are defective. The automatic test procedure has probability 0.05 of giving a false positive indication and probability 0.02 of giving a false negative. That is, if D is the event a unit tested is defective, and T is the event that it tests satisfactory, then P(T|D) = 0.05 and P(T^c|D^c) = 0.02. Determine the probability P(D^c|T) that a unit which tests good is, in fact, free of defects.

Exercise 3.13    (Solution on p. 76.)
Five boxes of random access memory chips have 100 units per box. They have respectively one, two, three, four, and five defective units. A box is selected at random, on an equally likely basis, and a unit is selected at random therefrom. It is defective. What are the (conditional) probabilities the unit was selected from each of the boxes?

Exercise 3.14    (Solution on p. 76.)
Two percent of the units received at a warehouse are defective. A nondestructive test procedure gives two percent false positive indications and five percent false negative. Units which fail to pass the inspection are sold to a salvage firm. This firm applies a corrective procedure which does not affect any good unit and which corrects 90 percent of the defective units. A customer buys a unit from the salvage firm. It is good. What is the (conditional) probability the unit was originally defective?

Exercise 3.15    (Solution on p. 76.)
At a certain stage in a trial, the judge feels the odds are two to one the defendant is guilty. It is determined that the defendant is left handed. An investigator convinces the judge this is six times more likely if the defendant is guilty than if he were not. What is the likelihood, given this evidence, that the defendant is guilty?

Exercise 3.16    (Solution on p. 76.)
Show that if P(A|C) > P(B|C) and P(A|C^c) > P(B|C^c), then P(A) > P(B). Is the converse true? Prove or give a counterexample.

Exercise 3.17    (Solution on p. 76.)
Since P(·|B) is a probability measure for a given B, we must have P(A|B) + P(A^c|B) = 1. Construct an example to show that in general P(A|B) + P(A|B^c) ≠ 1.

Exercise 3.18    (Solution on p. 76.)
Use property (CP4) (p. 69) to show
a. P(A|B) > P(A) iff P(A|B^c) < P(A)
b. P(A^c|B) > P(A^c) iff P(A|B) < P(A)
c. P(A|B) > P(A) iff P(A^c|B^c) > P(A^c)

Exercise 3.19    (Solution on p. 76.)
Show that P(A|B) ≥ (P(A) + P(B) − 1)/P(B).

Exercise 3.20    (Solution on p. 76.)
Show that P(A|B) = P(A|BC) P(C|B) + P(A|BC^c) P(C^c|B).

Exercise 3.21    (Solution on p. 77.)
An individual is to select from among n alternatives in an attempt to obtain a particular one. This might be selection from answers on a multiple choice question, when only one is correct. Let A be the event he makes a correct selection, and B be the event he knows which is correct before making the selection. We suppose P(B) = p and P(A|B^c) = 1/n. Determine P(B|A); show that P(B|A) ≥ P(B) and P(B|A) increases with n for fixed p.

Exercise 3.22    (Solution on p. 77.)
Polya's urn scheme for a contagious disease. An urn contains initially b black balls and r red balls (r + b = n). A ball is drawn on an equally likely basis from among those in the urn, then replaced along with c additional balls of the same color. The process is repeated. There are n balls on the first choice, n + c balls on the second choice, etc. Let Bk be the event of a black ball on the kth draw and Rk be the event of a red ball on the kth draw. Determine
a. P(B2|R1)
b. P(B1 B2)
c. P(R2)
d. P(B1|R2)

²This content is available online at <http://cnx.org/content/m24173/1.4/>.

Solutions to Exercises in Chapter 3

Solution to Exercise 3.1 (p. 70)

% file npr03_01.m (Section 17.8.21: npr03_01)
% Data for Exercise 3.1
minvec3
DV = [A|Ac, A, A&B, B&C, Ac|(B&C), Ac&B&Cc];
DP = [ 1 0.55 0.30 0.20 0.55 0.15];
TV = [Ac&B, B];
disp('Call for mincalc')

npr03_01
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
Call for mincalc
mincalc
Data vectors are linearly independent
Computable target probabilities
    1.0000    0.2500
    2.0000    0.5500
The number of minterms is 8
The number of available minterms is 4
PAcB_B = 0.25/0.55
PAcB_B = 0.4545

Solution to Exercise 3.2 (p. 70)

npr02_11 (Section 17.8.8: npr02_11)
mincalc
mincalct
Enter matrix of target Boolean combinations
Computable target probabilities
    1.0000    0.3200
    2.0000    0.4400
    3.0000    0.2300
    4.0000    0.6100
PC_AB = 0.32/0.44
PC_AB = 0.7273
PAcB_C = 0.23/0.61
PAcB_C = 0.3770

Solution to Exercise 3.3 (p. 70)
Let A = event she lives to seventy and B = event she lives to eighty. Since B ⊂ A, P(B|A) = P(AB)/P(A) = P(B)/P(A) = 55/70.

Solution to Exercise 3.4 (p. 70)
B0 is the event one of the first ten is drawn. Ai B0 is the event that the card with numbers 0i is drawn. P(Ai|B0) = (1/100)/(1/10) = 1/10 for each i, 0 through 9.

Solution to Exercise 3.5 (p. 70)
a. There are 6 × 5 ways to choose all different. There are 2 × 5 ways that they are different and one turns up two spots. The conditional probability is 2/6.
b. Let A6 = event first is a six and Sk = event the sum is k. Now A6 Sk = ∅ for k ≤ 6. A table of sums shows P(A6 Sk) = 1/36 and P(Sk) = 6/36, 5/36, 4/36, 3/36, 2/36, 1/36 for k = 7 through 12, respectively. Hence P(A6|Sk) = 1/6, 1/5, 1/4, 1/3, 1/2, 1, respectively.
c. If AB6 is the event at least one is a six, then P(AB6 Sk) = 2/36 for k = 7 through 11 and P(AB6 S12) = 1/36. Thus, the conditional probabilities are 2/6, 2/5, 2/4, 2/3, 1, 1, respectively.

Solution to Exercise 3.6 (p. 71)
P(W1 W3) = P(W1 W2 W3) + P(W1 W2^c W3) = (7/12)·(6/11)·(5/10) + (7/12)·(5/11)·(6/10) = 7/22

Solution to Exercise 3.7 (p. 71)
Let B = the event the collector buys, and G = the event the painting is original. Assume P(B|G) = 1 and P(B|G^c) = 0.1. If P(G) = 0.8, then
P(G|B) = P(B|G)P(G)/(P(B|G)P(G) + P(B|G^c)P(G^c)) = 0.8/(0.8 + 0.1 · 0.2) = 40/41

Solution to Exercise 3.8 (p. 71)
Let D = the event the unit is defective and C = the event it has the characteristic. Then P(D) = 0.05, P(C|D) = 0.93, and P(C|D^c) = 0.02.
P(D|C) = P(C|D)P(D)/(P(C|D)P(D) + P(C|D^c)P(D^c)) = (0.93 · 0.05)/(0.93 · 0.05 + 0.02 · 0.95) = 93/131

Solution to Exercise 3.9 (p. 71)
Let Dk = the event of k defective and G be the event a good one is chosen.
P(D0|G) = P(G|D0)P(D0)/(P(G|D0)P(D0) + P(G|D1)P(D1) + P(G|D2)P(D2) + P(G|D3)P(D3))
= (1/4) · 1/((1/4)(1 + 999/1000 + 998/1000 + 997/1000)) = 1000/3994

Solution to Exercise 3.10 (p. 71)
a. P(E3 S3) = P(S3|E3)P(E3) = 0.45 · 0.05 = 0.0225
b. P(S2 ∪ S3|E2) = 0.80 + 0.10 = 0.90
c. p = (0.85 + 0.10 + 0.05) · 0.65 + (0.80 + 0.10) · 0.30 + 0.45 · 0.05 = 0.9425

Solution to Exercise 3.11 (p. 72)
P(S) = 0.85, P(S|F) = 1, P(S|F^c) = 0.20. It is reasonable to assume
P(S) = P(S|F)P(F) + P(S|F^c)[1 − P(F)], so that P(F) = (P(S) − P(S|F^c))/(1 − P(S|F^c)) = 13/16

Solution to Exercise 3.12 (p. 72)
P(D^c|T) = P(T|D^c)P(D^c)/(P(T|D^c)P(D^c) + P(T|D)P(D)) = (0.98 · 0.99)/(0.98 · 0.99 + 0.05 · 0.01) = 9702/9707
P(D|T) = 1 − 9702/9707 = 5/9707

Solution to Exercise 3.13 (p. 72)
Let Hi = the event from box i, 1 ≤ i ≤ 5. Then P(Hi) = 1/5 and P(D|Hi) = i/100.
P(Hi|D) = P(D|Hi)P(Hi) / Σj P(D|Hj)P(Hj) = i/15, 1 ≤ i ≤ 5

Solution to Exercise 3.14 (p. 72)
Let T = event test indicates defective, D = event initially defective, and G = event unit purchased is good. Data are
P(D) = 0.02, P(T|D^c) = 0.05, P(T^c|D) = 0.02, P(G|T^c) = 0, P(G|DT) = 0.90, P(G|D^cT) = 1
P(GD) = P(GTD) = P(D)P(T|D)P(G|TD)
P(G) = P(GT) = P(GDT) + P(GD^cT) = P(D)P(T|D)P(G|TD) + P(D^c)P(T|D^c)P(G|TD^c)
P(D|G) = (0.02 · 0.98 · 0.90)/(0.02 · 0.98 · 0.90 + 0.98 · 0.05 · 1.00) = 441/1666 ≈ 0.2647

Solution to Exercise 3.15 (p. 72)
Let G = event the defendant is guilty, L = the event the defendant is left handed. Prior odds: P(G)/P(G^c) = 2. Result of testimony: P(L|G)/P(L|G^c) = 6.
P(G|L)/P(G^c|L) = [P(G)/P(G^c)] · [P(L|G)/P(L|G^c)] = 2 · 6 = 12, so that P(G|L) = 12/13

Solution to Exercise 3.16 (p. 72)
P(A) = P(A|C)P(C) + P(A|C^c)P(C^c) > P(B|C)P(C) + P(B|C^c)P(C^c) = P(B).
The converse is not true. Consider P(C) = P(C^c) = 0.5, P(A|C) = 1/4, P(A|C^c) = 3/4, P(B|C) = 1/2, and P(B|C^c) = 1/4. Then
1/2 = P(A) = (1/2)(1/4 + 3/4) > (1/2)(1/2 + 1/4) = P(B) = 3/8, but P(A|C) < P(B|C).

Solution to Exercise 3.17 (p. 72)
Suppose A ⊂ B with P(A) < P(B). Then P(A|B) = P(A)/P(B) < 1 and P(A|B^c) = 0, so the sum is less than one.

Solution to Exercise 3.18 (p. 72)
P(A|B) > P(A) iff P(AB) > P(A)P(B) iff P(AB^c) < P(A)P(B^c) iff P(A|B^c) < P(A)
P(A^c|B) > P(A^c) iff P(A^cB) > P(A^c)P(B) iff P(AB) < P(A)P(B) iff P(A|B) < P(A)
P(A|B) > P(A) iff P(AB) > P(A)P(B) iff P(A^cB^c) > P(A^c)P(B^c) iff P(A^c|B^c) > P(A^c)

Solution to Exercise 3.19 (p. 72)
1 ≥ P(A ∪ B) = P(A) + P(B) − P(AB) = P(A) + P(B) − P(A|B)P(B). Simple algebra gives the desired result.

Solution to Exercise 3.20 (p. 72)
P(A|B) = P(AB)/P(B) = [P(ABC) + P(ABC^c)]/P(B) = P(A|BC)P(C|B) + P(A|BC^c)P(C^c|B)

Solution to Exercise 3.21 (p. 73)
P(A|B) = 1, P(A|B^c) = 1/n, P(B) = p
P(B|A) = P(A|B)P(B)/(P(A|B)P(B) + P(A|B^c)P(B^c)) = p/(p + (1/n)(1 − p)) = np/((n − 1)p + 1)
P(B|A)/P(B) = n/(np + 1 − p), which increases from 1 to 1/p as n → ∞.

Solution to Exercise 3.22 (p. 73)
a. P(B2|R1) = b/(n + c)
b. P(B1 B2) = P(B1)P(B2|B1) = (b/n) · ((b + c)/(n + c))
c. P(R2) = P(R2|R1)P(R1) + P(R2|B1)P(B1) = ((r + c)/(n + c)) · (r/n) + (r/(n + c)) · (b/n) = r(r + c + b)/(n(n + c)) = r/n
d. P(B1|R2) = P(R2|B1)P(B1)/P(R2) = ((r/(n + c)) · (b/n))/(r/n) = b/(n + c). Using (c), we have b/(r + b + c) = b/(n + c).
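Several of these solutions are single applications of Bayes' rule, and the arithmetic is easy to check numerically. The following is a minimal MATLAB sketch, using only built-in operations; the variable names are ours, chosen to mirror the notation of Exercises 3.12 and 3.15, and are not part of the text's m-file package.

PD = 0.01;                       % P(D), unit defective (Exercise 3.12)
PT_D = 0.05;  PTc_Dc = 0.02;     % false positive and false negative rates
PDcT = (1-PTc_Dc)*(1-PD);        % P(T D^c) = P(T|D^c)P(D^c)
PDc_T = PDcT/(PDcT + PT_D*PD)    % 9702/9707, as in Solution 3.12
odds = 2*6;                      % Exercise 3.15: prior odds times likelihood ratio
PG_L = odds/(1 + odds)           % 12/13, as in Solution 3.15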
Chapter 4

Independence of Events

4.1 Independence of Events¹

Historically, the notion of independence has played a prominent role in probability. If events form an independent class, much less information is required to determine probabilities of Boolean combinations and calculations are correspondingly easier. In this unit, we give a precise formulation of the concept of independence in the probability sense. As in the case of all concepts which attempt to incorporate intuitive notions, the consequences must be evaluated for evidence that these ideas have been captured successfully.

4.1.1 Independence as lack of conditioning

There are many situations in which we have an operational independence.

• Suppose a deck of playing cards is shuffled and a card is selected at random then replaced with reshuffling. A second card picked on a repeated try should not be affected by the first choice.
• If customers come into a well stocked shop at different times, each unaware of the choice made by the others, the item purchased by one should not be affected by the choice made by the other.
• If two students are taking exams in different courses, the grade one makes should not affect the grade made by the other.

The list of examples could be extended indefinitely. In each case, we should expect to model the events as independent in some way. How should we incorporate the concept in our developing model of probability? We take our clue from the examples above. The operational independence described indicates that knowledge that one of the events has occurred does not affect the likelihood that the other will occur. For a pair of events {A, B}, this is the condition

P(A|B) = P(A)    (4.1)

Occurrence of the event A is not conditioned by occurrence of the event B. Our basic interpretation is that P(A) indicates the likelihood of the occurrence of event A. The development of conditional probability in the module Conditional Probability (Section 3.1) leads to the interpretation of P(A|B) as the likelihood that A will occur on a trial, given knowledge that B has occurred. If such knowledge of the occurrence of B does not affect the likelihood of the occurrence of A, we should be inclined to think of the events A and B as being independent in a probability sense.

¹This content is available online at <http://cnx.org/content/m23253/1.6/>.
4.1.2 Independent pairs

We take our clue from the condition P(A|B) = P(A). Property (CP4) (p. 69) for conditional probability (in the case of equality) yields sixteen equivalent conditions as follows.

P(A|B) = P(A)          P(B|A) = P(B)          P(AB) = P(A)P(B)
P(A|B^c) = P(A)        P(B^c|A) = P(B^c)      P(AB^c) = P(A)P(B^c)
P(A^c|B) = P(A^c)      P(B|A^c) = P(B)        P(A^cB) = P(A^c)P(B)
P(A^c|B^c) = P(A^c)    P(B^c|A^c) = P(B^c)    P(A^cB^c) = P(A^c)P(B^c)

Table 4.1

P(A|B) = P(A|B^c)    P(A^c|B) = P(A^c|B^c)    P(B|A) = P(B|A^c)    P(B^c|A) = P(B^c|A^c)

Table 4.2

These conditions are equivalent in the sense that if any one holds, then all hold. We may chose any one of these as the defining condition and consider the others as equivalents for the defining condition. Because of its simplicity and symmetry with respect to the two events, we adopt the product rule in the upper right hand corner of the table.

Definition. The pair {A, B} of events is said to be (stochastically) independent iff the following product rule holds:

P(AB) = P(A)P(B)    (4.2)

Remark. Although the product rule is adopted as the basis for definition, in many applications the assumptions leading to independence may be formulated more naturally in terms of one or another of the equivalent expressions. We are free to do this, for the effect of assuming any one condition is to assume them all.

The equivalences in the right-hand column of the upper portion of the table may be expressed as a replacement rule, which we augment and extend below: If the pair {A, B} is independent, so is any pair obtained by taking the complement of either or both of the events.

We note two relevant facts:
• Suppose event N has probability zero (is a null event). Then for any event A, we have 0 ≤ P(AN) ≤ P(N) = 0 = P(A)P(N), so that {N, A} is an independent pair for any event A.
• If event S has probability one (is an almost sure event), then its complement S^c is a null event. By the replacement rule and the fact just established, {S^c, A} is independent, so {S, A} is independent.

The replacement rule may thus be extended to:
Replacement Rule
If the pair {A, B} is independent, so is any pair obtained by replacing either or both of the events by their complements or by a null event or by an almost sure event.

CAUTION
1. Unless at least one of the events has probability one or zero, a pair cannot be both independent and mutually exclusive. Intuitively, if the pair is mutually exclusive, then the occurrence of one requires that the other does not occur. Formally: Suppose 0 < P(A) < 1 and 0 < P(B) < 1. {A, B} mutually exclusive implies P(AB) = P(∅) = 0 ≠ P(A)P(B). {A, B} independent implies P(AB) = P(A)P(B) > 0 = P(∅).
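The sixteen equivalent conditions above are easy to verify numerically for any particular pair. The following is a minimal MATLAB sketch using only built-ins; the probabilities 0.3 and 0.6 are arbitrary choices for illustration, not values from the text. Assuming only the product rule, the conditional forms all reduce to the unconditioned probabilities.

PA = 0.3;  PB = 0.6;            % arbitrary probabilities for illustration
PAB = PA*PB;                    % adopt the product rule
PABc = PA - PAB;                % P(AB^c)
PAcB = PB - PAB;                % P(A^cB)
PAcBc = 1 - PA - PB + PAB;      % P(A^cB^c)
[PAB/PB  PABc/(1-PB)]           % P(A|B), P(A|B^c): both 0.3000 = P(A)
[PAcB/PB  PAcBc/(1-PB)]         % P(A^c|B), P(A^c|B^c): both 0.7000 = P(A^c)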
2. Independence is not a property of events. Two non mutually exclusive events may be independent under one probability measure, but may not be independent for another. This can be seen by considering various probability distributions on a Venn diagram or minterm map.

4.1.3 Independent classes

Extension of the concept of independence to an arbitrary class of events utilizes the product rule.

Definition. A class of events is said to be (stochastically) independent iff the product rule holds for every finite subclass of two or more events in the class.

A class {A, B, C} is independent iff all four of the following product rules hold:

P(AB) = P(A)P(B)    P(AC) = P(A)P(C)    P(BC) = P(B)P(C)    P(ABC) = P(A)P(B)P(C)    (4.3)

If any one or more of these product expressions fail, the class is not independent. A similar situation holds for a class of four events: the product rule must hold for every pair, for every triple, and for the whole class. Note that we say "not independent" or "nonindependent" rather than dependent. The reason for this becomes clearer in dealing with independent random variables.

We consider some classical examples of nonindependent classes.

Example 4.1: Some nonindependent classes
1. Suppose {A1, A2, A3, A4} is a partition, with each P(Ai) = 1/4. Let

A = A1 ⋁ A2    B = A1 ⋁ A3    C = A1 ⋁ A4    (4.4)

Then the class {A, B, C} has P(A) = P(B) = P(C) = 1/2 and is pairwise independent, since P(AB) = P(A1) = 1/4 = P(A)P(B) and similarly for the other pairs, but it is not independent, since

P(ABC) = P(A1) = 1/4 ≠ P(A)P(B)P(C)    (4.5)

2. Consider the class {A, B, C, D} with AD = BD = ∅, C = AB ⋁ D, P(A) = P(B) = 1/4, P(AB) = 1/64, and P(D) = 15/64. Use of a minterm map shows these assignments are consistent. Elementary calculations show the product rule applies to the class {A, B, C}, but no two of these three events forms an independent pair.

As noted above, the replacement rule holds for any pair of events. It is easy to show, although somewhat cumbersome to write out, that if the rule holds for any finite number k of events in an independent class, it holds for any k + 1 of them. By the principle of mathematical induction, the rule must hold for any finite subclass. We may extend the replacement rule as follows.

General Replacement Rule
If a class is independent, we may replace any of the sets by its complement, by a null event, or by an almost sure event, and the resulting class is also independent. Such replacements may be made for any number of the sets in the class. One immediate and important consequence is the following.

Minterm Probabilities
If {Ai : 1 ≤ i ≤ n} is an independent class and the class {P(Ai) : 1 ≤ i ≤ n} of individual probabilities is known, then the probability of every minterm may be calculated.
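The failure of independence in Example 4.1 is easy to verify numerically. A minimal MATLAB sketch of part 1, using only built-ins (the variable names are ours):

p = [0.25 0.25 0.25 0.25];      % the partition {A1, A2, A3, A4}
PA = p(1)+p(2);  PB = p(1)+p(3);  PC = p(1)+p(4);
PAB = p(1);  PABC = p(1);       % AB = ABC = A1
[PAB  PA*PB]                    % 0.2500  0.2500: pairwise rule holds
[PABC  PA*PB*PC]                % 0.2500  0.1250: triple rule fails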
Example 4.2: Minterm probabilities for an independent class
Suppose the class {A, B, C} is independent with respective probabilities P(A) = 0.3, P(B) = 0.6, P(C) = 0.5. Then

{A^c, B^c, C^c} is independent and P(M0) = P(A^c)P(B^c)P(C^c) = 0.7 × 0.4 × 0.5 = 0.14
{A^c, B^c, C} is independent and P(M1) = P(A^c)P(B^c)P(C) = 0.7 × 0.4 × 0.5 = 0.14    (4.6)

Similarly, the probabilities of the other six minterms, in order, are 0.21, 0.21, 0.06, 0.06, 0.09, and 0.09. With these minterm probabilities, the probability of any Boolean combination of A, B, and C may be calculated. In general, eight appropriate probabilities must be specified to determine the minterm probabilities for a class of three events. In the independent case, three appropriate probabilities are sufficient.

Example 4.3: Three probabilities yield the minterm probabilities
Suppose {A, B, C} is independent with P(A ∪ BC) = 0.51, P(AC^c) = 0.15, and P(A) = 0.30. Then

P(C^c) = P(AC^c)/P(A) = 0.15/0.3 = 0.5 = P(C)  and  P(A) + P(A^c)P(B)P(C) = 0.51, so that P(B) = (0.51 − 0.30)/(0.7 · 0.5) = 0.6    (4.7)

With each of the basic probabilities determined, we may calculate the minterm probabilities, hence the probability of any Boolean combination of the events.

Example 4.4: MATLAB and the product rule
Frequently we have a large enough independent class {E1, E2, · · ·, En} that it is desirable to use MATLAB (or some other computational aid) to calculate the probabilities of various "and" combinations (intersections) of the events or their complements. Suppose the independent class {E1, E2, · · ·, E10} has respective probabilities

0.13 0.37 0.12 0.56 0.33 0.71 0.22 0.43 0.57 0.31    (4.8)

It is desired to calculate (a) P(E1 E2 E3^c E4 E5^c E6^c E7), and (b) P(E1^c E2 E3^c E4 E5^c E6 E7^c E8 E9^c E10). We may use the MATLAB function prod and the scheme for indexing a matrix.

p = 0.01*[13 37 12 56 33 71 22 43 57 31];
q = 1-p;
% First case
e = [1 2 4 7];                % Uncomplemented positions
f = [3 5 6];                  % Complemented positions
P = prod(p(e))*prod(q(f))     % p(e) probs of uncomplemented factors
P = 0.0010                    % q(f) probs of complemented factors
% Case of uncomplemented in even positions, complemented in odd positions
g = find(rem(1:10,2) == 0);   % The even positions
h = find(rem(1:10,2) ~= 0);   % The odd positions
P = prod(p(g))*prod(q(h))
P = 0.0034

In the unit on MATLAB and Independent Classes, we extend the use of MATLAB in the calculations for such classes.

4.2 MATLAB and Independent Classes²

²This content is available online at <http://cnx.org/content/m23255/1.6/>.

4.2.1 MATLAB and Independent Classes

In the unit on Minterms (Section 2.1), we show how to use minterm probabilities and minterm vectors to calculate probabilities of Boolean combinations of events. In Independence of Events we show that in the independent case, we may calculate all minterm probabilities from the probabilities of the basic events. While these calculations are straightforward, they may be tedious and subject to errors. Fortunately, in this case we have an m-function minprob which calculates all minterm probabilities from the probabilities of the basic or generating sets. This function uses the m-function mintable to set up the patterns of p's and q's for the various minterms and then takes the products to obtain the set of minterm probabilities.

Example 4.5
pm = minprob(0.1*[4 7 6])
pm = 0.0720  0.1080  0.1680  0.2520  0.0480  0.0720  0.1120  0.1680

It may be desirable to arrange these as on a minterm map. For this we have an m-function minmap which reshapes the row matrix pm, as follows:

t = minmap(pm)
t = 0.0720  0.1680  0.0480  0.1120
    0.1080  0.2520  0.0720  0.1680

Probability of occurrence of k of n independent events
In the unit on Minterms, we show how to use the m-functions mintable and csort to obtain the probability of the occurrence of k of n events, when minterm probabilities are available. In the case of an independent class, the minterm probabilities are calculated easily by minprob. It is only necessary to specify the probabilities for the n basic events and the numbers k of events. The size of the class, hence the mintable, is determined, and the minterm probabilities are calculated by minprob. We have two useful m-functions. If P is a matrix of the n individual event probabilities, and k is a matrix of integers less than or equal to n, then

function y = ikn(P, k) calculates individual probabilities that k of n occur
function y = ckn(P, k) calculates the probabilities that k or more occur

Example 4.6
p = 0.01*[13 37 12 56 33 71 22 43 57 31];
k = [2 5 7];
P = ikn(p,k)
P = 0.1401  0.1845  0.0225     % individual probabilities
Pc = ckn(p,k)
Pc = 0.9516  0.2921  0.0266    % cumulative probabilities

Reliability of systems with independent components
Suppose a system has n components which fail independently. Let Ei be the event the ith component survives the designated time period. Then Ri = P(Ei) is defined to be the reliability of that component. The reliability R of the complete system is a function of the component reliabilities. There are three basic configurations. General systems may be decomposed into subsystems of these types. The subsystems become components in the larger configuration. The three fundamental configurations are:
1. Series. The system operates iff all n components operate: R = ∏ Ri
2. Parallel. The system operates iff not all components fail: R = 1 − ∏ (1 − Ri)
3. k of n. The system operates iff k or more components operate. R may be calculated with the m-function ckn. If the component probabilities are all the same, it is more efficient to use the m-function cbinom (see Bernoulli trials and the binomial distribution, below).

MATLAB solution. Put the component reliabilities in matrix RC = [R1 R2 · · · Rn]
1. Series configuration:   R = prod(RC)       % prod is a built-in MATLAB function
2. Parallel configuration: R = parallel(RC)   % parallel is a user-defined function
3. k of n configuration:   R = ckn(RC,k)      % ckn is a user-defined function (in file ckn.m)

Example 4.7
There are eight components, numbered 1 through 8. Component 1 is in series with a parallel combination of components 2 and 3, followed by a 3 of 5 combination of components 4 through 8 (see Figure 4.1 for a schematic representation). Probabilities of the components in order are

0.95 0.90 0.92 0.80 0.83 0.91 0.85 0.85    (4.9)

The second and third probabilities are for the parallel pair, and the last five probabilities are for the 3 of 5 combination.

RC = 0.01*[95 90 92 80 83 91 85 85];          % Component reliabilities
Ra = RC(1)*parallel(RC(2:3))*ckn(RC(4:8),3)   % Solution
Ra = 0.9172

Figure 4.1: Schematic representation of the system in Example 4.7

Example 4.8
RC = 0.01*[95 90 92 80 83 91 85 85];                 % Component reliabilities
Rb = prod(RC(1:2))*parallel([RC(3),ckn(RC(4:8),3)])  % Solution
Rb = 0.8532

Figure 4.2: Schematic representation of the system in Example 4.8

A test for independence
It is difficult to look at a list of minterm probabilities and determine whether or not the generating events form an independent class. The m-function imintest has as argument a vector of minterm probabilities. It checks for feasible size, determines the number of variables, and performs a check for independence.

Example 4.9
pm = 0.01*[15 5 2 18 25 5 18 12];   % An arbitrary class
disp(imintest(pm))
The class is NOT independent
Minterms for which the product rule fails
     1     1     1     0
     1     1     1     0

Example 4.10
pm = [0.10 0.15 0.20 0.25 0.30];    % An improper number of probabilities
disp(imintest(pm))
The number of minterm probabilities incorrect

Example 4.11
pm = minprob(0.1*[4 7 6]);          % Minterm probabilities from an independent class
disp(imintest(pm))
The class is independent

4.2.2 Probabilities of Boolean combinations

As in the nonindependent case, we may utilize the minterm expansion and the minterm probabilities to calculate the probabilities of Boolean combinations of events. However, it is frequently more efficient to manipulate the expressions for the Boolean combination to be a disjoint union of intersections.

Example 4.12: A simple Boolean combination
Suppose the class {A, B, C} is independent, with respective probabilities 0.4, 0.6, 0.8. Determine P(A ∪ BC). The minterm expansion is

A ∪ BC = M(3, 4, 5, 6, 7), so that P(A ∪ BC) = p(3, 4, 5, 6, 7)    (4.10)

It is not difficult to use the product rule and the replacement theorem to calculate the needed minterm probabilities. Thus p(3) = P(A^c)P(B)P(C) = 0.6 · 0.6 · 0.8 = 0.2880. Similarly p(4) = 0.0320, p(5) = 0.1280, p(6) = 0.0480, p(7) = 0.1920. The desired probability is the sum of these, 0.6880.
As an alternate approach, we write

A ∪ BC = A ⋁ A^cBC, so that P(A ∪ BC) = 0.4 + 0.6 · 0.6 · 0.8 = 0.6880    (4.11)

Considerably fewer arithmetic operations are required in this calculation.

In larger problems, or in situations where probabilities of several Boolean combinations are to be determined, it may be desirable to calculate all minterm probabilities then use the minterm vector techniques introduced earlier to calculate probabilities for various Boolean combinations. As a larger example for which computational aid is highly desirable, consider again the class and the probabilities utilized in Example 4.4, above.

Example 4.13
Consider again the independent class {E1, E2, · · ·, E10} with respective probabilities {0.13 0.37 0.12 0.56 0.33 0.71 0.22 0.43 0.57 0.31}. We wish to calculate

P(F) = P(E1 ∪ E3(E4 ∪ E7^c) ∪ E2(E5^c ∪ E6E8) ∪ E9E10^c)    (4.12)

There are 2^10 = 1024 minterm probabilities to be calculated. Each requires the multiplication of ten numbers. The solution with MATLAB is easy, as follows:

minvec10
Vectors are A1 thru A10 and A1c thru A10c
They may be renamed, if desired.
P = 0.01*[13 37 12 56 33 71 22 43 57 31];
pm = minprob(P);
F = (A1|(A3&(A4|A7c)))|(A2&(A5c|(A6&A8)))|(A9&A10c);
PF = F*pm'
PF = 0.6636

Writing out the expression for F is tedious and error prone. We could simplify as follows:

A = A1|(A3&(A4|A7c));
B = A2&(A5c|(A6&A8));
C = A9&A10c;
F = A|B|C;     % This minterm vector is the same as for F above

This decomposition of the problem indicates that it may be solved as a series of smaller problems. First, however, we need some central facts about independence of Boolean combinations.

4.2.3 Independent Boolean combinations

Suppose we have a Boolean combination of the events in the class {Ai : 1 ≤ i ≤ n} and a second combination of the events in the class {Bj : 1 ≤ j ≤ m}. If the combined class {Ai, Bj : 1 ≤ i ≤ n, 1 ≤ j ≤ m} is independent, we would expect the combinations of the subclasses to be independent. It is important to see that this is in fact a consequence of the product rule, for it is further evidence that the product rule has captured the essence of the intuitive notion of independence. In the following discussion, we exhibit the essential structure which provides the basis for the following general proposition.

Proposition. Consider n distinct subclasses of an independent class of events. If for each i the event Ai is a Boolean (logical) combination of members of the ith subclass, then the class {A1, A2, · · ·, An} is an independent class.

Verification of this far reaching result rests on the minterm expansion and two elementary facts about the disjoint subclasses of an independent class. We state these facts and consider in each case an example which exhibits the essential structure. Formulation of the general result, in each case, is simply a matter of careful use of notation.

1. A class each of whose members is a minterm formed by members of a distinct subclass of an independent class is itself an independent class.

Example 4.14
Consider the independent class {A1, A2, A3, B1, B2, B3, B4}, with respective probabilities 0.4, 0.7, 0.3, 0.5, 0.8, 0.3, 0.6. Consider M3, minterm three for the class {A1, A2, A3}, and N5, minterm five for the class {B1, B2, B3, B4}. Then

P(M3) = P(A1^c A2 A3) = 0.6 · 0.7 · 0.3 = 0.126  and  P(N5) = P(B1^c B2 B3^c B4) = 0.5 · 0.8 · 0.7 · 0.6 = 0.168    (4.13)

Also

P(M3 N5) = P(A1^c A2 A3 B1^c B2 B3^c B4) = 0.6 · 0.7 · 0.3 · 0.5 · 0.8 · 0.7 · 0.6 = (0.6 · 0.7 · 0.3) · (0.5 · 0.8 · 0.7 · 0.6) = P(M3)P(N5) = 0.0212    (4.14)

The product rule shows the desired independence. Again, it should be apparent that the result holds for any number of Ai and Bj, and it can be extended to any number of distinct subclasses of an independent class.

2. Suppose each member of a class can be expressed as a disjoint union. If each auxiliary class formed by taking one member from each of the disjoint unions is an independent class, then the original class is independent.

Example 4.15
Suppose A = A1 ⋁ A2 ⋁ A3 and B = B1 ⋁ B2, with {Ai, Bj} independent for each pair i, j. Suppose

P(A1) = 0.3, P(A2) = 0.4, P(A3) = 0.1, P(B1) = 0.2, P(B2) = 0.5    (4.15)

We wish to show that the pair {A, B} is independent; i.e., the product rule P(AB) = P(A)P(B) holds.
COMPUTATION

P(A) = P(A1) + P(A2) + P(A3) = 0.3 + 0.4 + 0.1 = 0.8  and  P(B) = P(B1) + P(B2) = 0.2 + 0.5 = 0.7    (4.16)

Now

AB = (A1 ⋁ A2 ⋁ A3)(B1 ⋁ B2) = A1B1 ⋁ A1B2 ⋁ A2B1 ⋁ A2B2 ⋁ A3B1 ⋁ A3B2    (4.17)

By additivity and pairwise independence, we have

P(AB) = P(A1)P(B1) + P(A1)P(B2) + P(A2)P(B1) + P(A2)P(B2) + P(A3)P(B1) + P(A3)P(B2)
      = 0.3 · 0.2 + 0.3 · 0.5 + 0.4 · 0.2 + 0.4 · 0.5 + 0.1 · 0.2 + 0.1 · 0.5 = 0.56 = P(A)P(B)    (4.18)

The product rule can also be established algebraically from the expression for P(AB), as follows:

P(AB) = P(A1)[P(B1) + P(B2)] + P(A2)[P(B1) + P(B2)] + P(A3)[P(B1) + P(B2)]
      = [P(A1) + P(A2) + P(A3)][P(B1) + P(B2)] = P(A)P(B)    (4.19)

It should be clear that the pattern just illustrated can be extended to the general case. If

A = ⋁_{i=1}^{n} Ai and B = ⋁_{j=1}^{m} Bj, with each pair {Ai, Bj} independent    (4.20)

then the pair {A, B} is independent. Also, we may extend this rule to the triple {A, B, C}, where

A = ⋁_{i=1}^{n} Ai, B = ⋁_{j=1}^{m} Bj, and C = ⋁_{k=1}^{r} Ck, with each class {Ai, Bj, Ck} independent    (4.21)

and similarly for any finite number of such combinations.

Example 4.16: A hybrid approach
Consider again the independent class {E1, E2, · · ·, E10} with respective probabilities {0.13 0.37 0.12 0.56 0.33 0.71 0.22 0.43 0.57 0.31}. We wish to calculate

P(F) = P(E1 ∪ E3(E4 ∪ E7^c) ∪ E2(E5^c ∪ E6E8) ∪ E9E10^c)    (4.22)

In the previous solution, we use minprob to calculate the 2^10 = 1024 minterms for all ten of the Ei and determine the minterm vector for F. As we note in the alternate expansion of F,

F = A ∪ B ∪ C, where A = E1 ∪ E3(E4 ∪ E7^c), B = E2(E5^c ∪ E6E8), C = E9E10^c    (4.23)

We may calculate directly P(C) = 0.57 · 0.69 = 0.3933. Now A is a Boolean combination of {E1, E3, E4, E7} and B is a combination of {E2, E5, E6, E8}. By the result on independence of Boolean combinations, the class {A, B, C} is independent. We use the m-procedures to calculate P(A) and P(B). Then we deal with the independent class {A, B, C} to obtain the probability of F.

p = 0.01*[13 37 12 56 33 71 22 43 57 31];
pa = p([1 3 4 7]);       % Selection of probabilities for A
pb = p([2 5 6 8]);       % Selection of probabilities for B
pma = minprob(pa);       % Minterm probabilities for calculating P(A)
pmb = minprob(pb);       % Minterm probabilities for calculating P(B)
minvec4;
a = A|(B&(C|Dc));        % A corresponds to E1, B to E3, C to E4, D to E7
PA = a*pma'
PA = 0.2243
b = A&(Bc|(C&D));        % A corresponds to E2, B to E5, C to E6, D to E8
PB = b*pmb'
PB = 0.2852
PC = p(9)*(1 - p(10))
PC = 0.3933
pm = minprob([PA PB PC]);
minvec3                  % The problem becomes a three variable problem
F = A|B|C;               % with {A,B,C} an independent class
PF = F*pm'
PF = 0.6636              % Agrees with the result of Example 4.13

4.3 Composite Trials³

³This content is available online at <http://cnx.org/content/m23256/1.6/>.

4.3.1 Composite trials and component events

Often a trial is a composite one. That is, the fundamental trial is completed by performing several steps. In some cases, the steps are carried out sequentially in time. In other situations, the order of performance plays no significant role. Some of the examples in the unit on Conditional Probability (Section 3.1) involve such multistep trials. We examine more systematically how to model composite trials in terms of events determined by the components of the trials. In the subsequent section, we illustrate this approach in the important special case of Bernoulli trials, in which each outcome results in a success or failure to achieve a specified condition.

We call the individual steps in the composite trial component trials. For example, in the experiment of flipping a coin ten times, we refer to the ith toss as the ith component trial. In many cases, the component trials will be performed sequentially in time. But we may have an experiment in which ten coins are flipped simultaneously. For purposes of analysis, we impose an ordering, usually by assigning indices. The question is how to model these repetitions. Should they be considered as ten trials of a single simple experiment? It turns out that this is not a useful formulation. We need to consider the composite trial as a single outcome, i.e., represented by a single point in the basic space Ω.

Some authors give considerable attention to the nature of the basic space, describing it as a Cartesian product space, with each coordinate corresponding to one of the component outcomes. We find that unnecessary, and often confusing, in setting up the basic model. We simply suppose the basic space has enough elements to consider each possible outcome. For the experiment of flipping a coin ten times, there must be at least 2^10 = 1024 elements, one for each possible sequence of heads and tails.

Of more importance is describing the various events associated with the experiment. We begin by identifying the appropriate component events. A component event is determined by propositions about the outcomes of the corresponding component trial.

Example 4.17: Component events
• In the coin flipping experiment, consider the event H3 that the third toss results in a head. Each outcome ω of the experiment may be represented by a sequence of H's and T's, representing heads and tails. The event H3 consists of those outcomes represented by sequences with H in the third position. Suppose A is the event of a head on the third toss and a tail on the ninth toss. This consists of those outcomes corresponding to sequences with H in the third position and T in the ninth. Note that this event is the intersection H3 H9^c.
• A somewhat more complex example is as follows. Suppose there are two boxes, each containing some red and some blue balls. The experiment consists of selecting at random a ball from the first box, placing it in the second box, then making a random selection from the modified contents of the second box. The composite trial is made up of two component selections. We may let R1 be the event of selecting a red ball on the first component trial (from the first box), and R2 be the event of selecting a red ball on the second component trial. Clearly R1 and R2 are component events.

In the first case, it is reasonable to assume that the class {Hi : 1 ≤ i ≤ 10} is independent, and each component probability is usually taken to be 0.5. In the second case, the assignment of probabilities is somewhat more involved. For one thing, it is necessary to know the numbers of red and blue balls in each box before the composite trial begins. When these are known, the usual assumptions and the properties of conditional probability suffice to assign probabilities. This approach of utilizing component events is used tacitly in some of the examples in the unit on Conditional Probability.

When appropriate component events are determined, various Boolean combinations of these can be expressed as minterm expansions.

Example 4.18
Four persons take one shot each at a target. Let Ei be the event the ith shooter hits the target center. Let A3 be the event exactly three hit the target. Then A3 is the union of those minterms generated by the Ei which have three places uncomplemented.

A3 = E1E2E3E4^c ⋁ E1E2E3^cE4 ⋁ E1E2^cE3E4 ⋁ E1^cE2E3E4    (4.24)

Usually we would be able to assume the Ei form an independent class. If each P(Ei) is known, then all minterm probabilities can be calculated easily.

The following is a somewhat more complicated example of this type.

Example 4.19
Ten race cars are involved in time trials to determine pole positions for an upcoming race. To qualify, they must post an average speed of 125 mph or more on a trial run. Let Ei be the event the ith car makes qualifying speed. It seems reasonable to suppose the class {Ei : 1 ≤ i ≤ 10} is independent. If the respective probabilities for success are 0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84, what is the probability that k or more will qualify (k = 6, 7, 8, 9, 10)?
SOLUTION
Let Ak be the event exactly k qualify. The class {Ei : 1 ≤ i ≤ 10} generates 2^10 = 1024 minterms. The event Ak is the union of those minterms which have exactly k places uncomplemented. The event Bk that k or more qualify is given by

Bk = ⋃_{r=k}^{n} Ar    (4.25)

The task of computing and adding the minterm probabilities by hand would be tedious, to say the least. However, we may use the function ckn, introduced in the unit on MATLAB and Independent Classes and illustrated in Example 4.6, to determine the desired probabilities quickly and easily.

P = [0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84];
k = 6:10;
PB = ckn(P,k)
PB = 0.9938  0.9628  0.8472  0.5756  0.2114

An alternate approach is considered in the treatment of random variables.

4.3.2 Bernoulli trials and the binomial distribution

Many composite trials may be described as a sequence of success-failure trials. For each component trial in the sequence, the outcome is one of two kinds. One we designate a success and the other a failure. Examples abound: heads or tails in a sequence of coin flips, favor or disapprove of a proposition in a survey sample, and items from a production line meet or fail to meet specifications in a sequence of quality control checks. To represent the situation, we let Ei be the event of a success on the ith component trial in the sequence. The event of a failure on the ith component trial is thus Ei^c.

In many cases, we model the sequence as a Bernoulli sequence, in which the results on the successive component trials are independent and have the same probabilities. Thus, formally, a sequence of success-failure trials is Bernoulli iff
1. The class {Ei : 1 ≤ i} is independent.
2. The probability P(Ei) = p, invariant with i.

Simulation of Bernoulli trials
It is frequently desirable to simulate Bernoulli trials. By flipping coins, rolling a die with various numbers of sides (as used in certain games), or using spinners, it is relatively easy to carry this out physically. However, if the number of trials is large, say several hundred, the process may be time consuming. Also, there are limitations on the values of p, the probability of success. We have a convenient two-part m-procedure for simulating Bernoulli sequences. The first part, called btdata, sets the parameters. The second, called bt, uses the random number generator in MATLAB to produce a sequence of zeros and ones (for failures and successes). Repeated calls for bt produce new sequences.

Example 4.20

btdata
Enter n, the number of trials  10
Enter p, the probability of success on each trial  0.37
 Call for bt

bt
n = 10   p = 0.37     % n is kept small to save printout space
Frequency = 0.4
To view the sequence, call for SEQ
disp(SEQ)             % optional call for the sequence
     1     1
     2     1
     3     0
     4     0
     5     0
     6     0
     7     0
     8     0
     9     1
    10     1
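The listing above was produced by the book's m-procedures. For readers without the m-file package, the essential idea of bt can be sketched with MATLAB built-ins alone; n, p, and SEQ here mirror the printout, but this is a minimal sketch, not the actual bt source.

n = 10;  p = 0.37;       % parameters as set by btdata above
S = rand(1,n) <= p;      % ones mark successes, zeros failures
f = sum(S)/n             % relative frequency of success
SEQ = [(1:n)' S'];       % trial numbers paired with outcomes
disp(SEQ)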
Repeated calls for bt yield new sequences with the same parameters. To illustrate the power of the program, it was used to take a run of 100,000 component trials, with probability
p of success 0.37, as above.
Successive runs gave relative frequencies 0.37001 and 0.36999. Unless the random
number generator is seeded to make the same starting point each time, successive runs will give different sequences and usually different relative frequencies.
The binomial distribution
A basic problem in Bernoulli sequences is to determine the probability of k successes in n component trials. We let Sn be the number of successes in n trials. This is a special case of a simple random variable, which we study in more detail in the chapter on "Random Variables and Probabilities" (Section 6.1).

Let us characterize the events Akn = {Sn = k}, 0 ≤ k ≤ n. As noted above, the event Akn of exactly k successes is the union of the minterms generated by {Ei : 1 ≤ i} in which there are k successes (represented by k uncomplemented Ei) and n − k failures (represented by n − k complemented Ei^c). Simple combinatorics show there are C(n, k) ways to choose the k places to be uncomplemented. Hence, among the 2^n minterms, there are C(n, k) = n!/(k!(n − k)!) which have k places uncomplemented. Each such minterm has probability p^k (1 − p)^{n−k}. Since the minterms are mutually exclusive, their probabilities add. We conclude that

P(Sn = k) = C(n, k) p^k (1 − p)^{n−k} = C(n, k) p^k q^{n−k}, where q = 1 − p, for 0 ≤ k ≤ n    (4.26)

These probabilities and the corresponding values form the distribution for Sn. This distribution is known as the binomial distribution, with parameters (n, p). We shorten this to binomial (n, p), and often write Sn ∼ binomial (n, p). A related set of probabilities is P(Sn ≥ k) = P(Bkn), 0 ≤ k ≤ n. If the number n of component trials is small, direct computation of the probabilities is easy with hand calculators.

Example 4.21: A reliability problem
A remote device has five similar components which fail independently, with equal probabilities. The system remains operable if three or more of the components are operative. Suppose each unit remains active for one year with probability 0.8. What is the probability the system will remain operative for that long?
SOLUTION

P = C(5, 3) 0.8³ · 0.2² + C(5, 4) 0.8⁴ · 0.2 + C(5, 5) 0.8⁵ = 10 · 0.8³ · 0.2² + 5 · 0.8⁴ · 0.2 + 0.8⁵ = 0.9421    (4.27)

Because Bernoulli sequences are used in so many practical situations as models for success-failure trials, the probabilities P(Sn = k) and P(Sn ≥ k) have been calculated and tabulated for a variety of combinations of the parameters (n, p). Such tables are found in most mathematical handbooks. Tables of P(Sn = k) are usually given a title such as binomial distribution, individual terms. Tables of P(Sn ≥ k) have a designation such as binomial distribution, cumulative terms. Note, however, some tables for cumulative terms give P(Sn ≤ k). Care should be taken to note which convention is used.

Example 4.22: A reliability problem
Consider again the system of Example 4.21, above. Suppose we attempt to enter a table of Cumulative Terms, Binomial Distribution at n = 5, k = 3, and p = 0.8. Most tables will not have probabilities greater than 0.5. In this case, we may work with failures. We just interchange the role of Ei and Ei^c. Thus, the number of failures has the binomial (n, q) distribution. Now there are three or more successes iff there are not three or more failures. We go to the table of cumulative terms at n = 5, k = 3, and p = 0.2. The probability entry is 0.0579. The desired probability is 1 − 0.0579 = 0.9421.
In general, there are k or more successes in n trials iff there are not n − k + 1 or more failures.
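The computations in Examples 4.21 and 4.22 are easy to reproduce with the built-in MATLAB function nchoosek, which gives C(n, k); a quick sketch:

p = 0.8;  q = 1 - p;  n = 5;
P = sum(arrayfun(@(k) nchoosek(n,k)*p^k*q^(n-k), 3:5))    % 0.9421 (Example 4.21)
Pf = sum(arrayfun(@(k) nchoosek(n,k)*q^k*p^(n-k), 3:5))   % 0.0579, three or more failures
1 - Pf                                                    % 0.9421 again (Example 4.22)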
m-functions for binomial probabilities
Although tables are convenient for calculation, they impose serious limitations on the available parameter values, and when the values are found in a table, they must still be entered into the problem. Fortunately, we have convenient m-functions for these distributions. When MATLAB is available, it is much easier to generate the needed probabilities than to look them up in a table, and the numbers are entered directly into the MATLAB workspace. And we have great freedom in selection of parameter values. For example we may use n of a thousand or more, while tables are usually limited to n of 20, or at most 30. The two m-functions for calculating P(Akn) and P(Bkn) are:

P(Akn) is calculated by y = ibinom(n,p,k), where k is a row or column vector of integers between 0 and n. The result y is a row vector of the same size as k.
P(Bkn) is calculated by y = cbinom(n,p,k), where k is a row or column vector of integers between 0 and n. The result y is a row vector of the same size as k.
Example 4.23: Use of m-functions ibinom and cbinom
If n = 10 and p = 0.39, determine P(Akn) and P(Bkn) for k = 3, 5, 6, 8.

p = 0.39;
k = [3 5 6 8];
Pi = ibinom(10,p,k)    % individual probabilities
Pi = 0.2237  0.1920  0.1023  0.0090
Pc = cbinom(10,p,k)    % cumulative probabilities
Pc = 0.8160  0.3420  0.1500  0.0103
Note that we have used probability p = 0.39. It is quite unlikely that a table will have this probability. Although we use only n = 10, frequently it is desirable to use values of several hundred. The m-functions work well for n up to 1000 (and even higher for small values of p or for values very near to one). Hence, there is great freedom from the limitations of tables. If a table with a specific range of values is desired, an m-procedure called binomial produces such a table. The use of large n raises the question of cumulation of errors in sums or products. The level of precision in MATLAB calculations is sufficient that such roundoff errors are well below practical concerns.
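If MATLAB's Statistics Toolbox happens to be available, the cumulative-terms convention can also be cross-checked against the built-in binocdf; this is an alternative, not part of the book's m-file package, and rests on P(Sn ≥ k) = 1 − P(Sn ≤ k − 1).

n = 10;  p = 0.39;  k = [3 5 6 8];
Pc2 = 1 - binocdf(k-1,n,p)    % matches cbinom(n,p,k) in Example 4.23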
Example 4.24

binomial     % call for procedure
Enter n, the number of trials  13
Enter p, the probability of success  0.413
Enter row vector k of success numbers  0:4
   n         p
13.0000   0.4130
   k       P(X=k)    P(X>=k)
   0       0.0010    1.0000
1.0000     0.0090    0.9990
2.0000     0.0379    0.9900
3.0000     0.0979    0.9521
4.0000     0.1721    0.8542

Remark. While the m-procedure binomial is useful for constructing a table, it is usually not as convenient for problems as the m-functions ibinom or cbinom. The latter calculate the desired values and put them directly into the MATLAB workspace.
4.3.3 Joint Bernoulli trials

Bernoulli trials may be used to model a variety of practical problems. One such is to compare the results of two sequences of Bernoulli trials carried out independently. The following simple example illustrates the use of MATLAB for this.
Example 4.25: A joint Bernoulli trial
Bill and Mary take ten basketball free throws each. We assume the two sequences of trials are independent of each other, and each is a Bernoulli sequence.
Mary: Has probability 0.80 of success on each trial.
Bill: Has probability 0.85 of success on each trial.
What is the probability Mary makes more free throws than Bill?
SOLUTION
We have two Bernoulli sequences, operating independently.
Mary: n = 10, p = 0.80
Bill: n = 10, p = 0.85
Let
M be the event Mary wins
Mk be the event Mary makes k or more free throws
Bj be the event Bill makes exactly j free throws
Then Mary wins if Bill makes none and Mary makes one or more, or Bill makes one and Mary makes two or more, etc. Thus

M = B0 M1 ⋁ B1 M2 ⋁ · · · ⋁ B9 M10    (4.28)

and

P(M) = P(B0)P(M1) + P(B1)P(M2) + · · · + P(B9)P(M10)    (4.29)

We use cbinom to calculate the cumulative probabilities for Mary and ibinom to obtain the individual probabilities for Bill.

pm = cbinom(10,0.8,1:10);   % cumulative probabilities for Mary
pb = ibinom(10,0.85,0:9);   % individual probabilities for Bill
D = [pm; pb]'               % display: pm in the first column
D =                         %          pb in the second column
    1.0000    0.0000
    1.0000    0.0000
    0.9999    0.0000
    0.9991    0.0001
    0.9936    0.0012
    0.9672    0.0085
    0.8791    0.0401
    0.6778    0.1298
    0.3758    0.2759
    0.1074    0.3474

To find the probability P(M) that Mary wins, we need to multiply each of these pairs together, then sum. This is just the dot or scalar product, which MATLAB calculates with the command pm*pb'. We may combine the generation of the probabilities and the multiplication in one command:
95
The ease and simplicity of calculation with MATLAB make it feasible to consider the eect of dierent values of
n.
Is there an optimum number of throws for Mary? Why should there be an optimum?
An alternate treatment of this problem in the unit on Independent Random Variables utilizes techniques for independent simple random variables.
4.3.4 Alternate MATLAB implementations
Alternate implementations of the functions for probability calculations are found in the Statistical Package available as a supplementary package. We have utilized our formulation, so that only the basic MATLAB package is needed.
4.4 Problems on Independence of Events4
Exercise 4.1    (Solution on p. 101.)
The minterms generated by the class {A, B, C} have minterm probabilities

pm = [0.15 0.05 0.02 0.18 0.25 0.05 0.18 0.12]    (4.30)

Show that the product rule holds for all three, but the class is not independent.

Exercise 4.2    (Solution on p. 101.)
The class {A, B, C, D} is independent, with respective probabilities 0.65, 0.37, 0.48, 0.63. Use the m-function minprob to obtain the minterm probabilities. Use the m-function minmap to put them in a 4 by 4 table corresponding to the minterm map convention we use.

Exercise 4.3    (Solution on p. 101.)
The minterm probabilities for the software survey in Example 2 (Example 2.2: Survey on software) from "Minterms" are

pm = [0 0.05 0.10 0.05 0.20 0.10 0.40 0.10]    (4.31)

Show whether or not the class {A, B, C} is independent: (1) by hand calculation, and (2) by use of the m-function imintest.

Exercise 4.4    (Solution on p. 101.)
The minterm probabilities for the computer survey in Example 3 (Example 2.3: Survey on personal computers) from "Minterms" are

pm = [0.032 0.016 0.376 0.011 0.364 0.073 0.077 0.051]    (4.32)

Show whether or not the class {A, B, C} is independent: (1) by hand calculation, and (2) by use of the m-function imintest.
Exercise 4.5    (Solution on p. 101.)
Minterm probabilities p(0) through p(15) for the class {A, B, C, D} are, in order,

pm = [0.084 0.196 0.036 0.084 0.085 0.196 0.035 0.084 0.021 0.049 0.009 0.021 0.020 0.049 0.010 0.021]

Use the m-function imintest to show whether or not the class {A, B, C, D} is independent.

Exercise 4.6    (Solution on p. 102.)
Minterm probabilities p(0) through p(15) for the opinion survey in Example 4 (Example 2.4: Opinion survey) from "Minterms" are

pm = [0.085 0.195 0.035 0.085 0.080 0.200 0.035 0.085 0.020 0.050 0.010 0.020 0.020 0.050 0.015 0.015]

Show whether or not the class {A, B, C, D} is independent.

⁴This content is available online at <http://cnx.org/content/m24180/1.4/>.
Exercise 4.7    (Solution on p. 102.)
The class {A, B, C} is independent, with P(A) = 0.30, P(B^c C) = 0.32, and P(AC) = 0.12. Determine the minterm probabilities.

Exercise 4.8    (Solution on p. 102.)
The class {A, B, C} is independent, with P(A ∪ B) = 0.6, P(A ∪ C) = 0.7, and P(C) = 0.4. Determine the probability of each minterm.

Exercise 4.9    (Solution on p. 102.)
A pair of dice is rolled five times. What is the probability the first two results are sevens and the others are not?

Exercise 4.10    (Solution on p. 102.)
David, Mary, Joan, Hal, Sharon, and Wayne take an exam in their probability course. Their probabilities of making 90 percent or more are

0.72 0.83 0.75 0.92 0.65 0.79    (4.33)

respectively. Assume these are independent events. What is the probability three or more, four or more, five or more make grades of at least 90 percent?

Exercise 4.11    (Solution on p. 102.)
Two independent random numbers between 0 and 1 are selected (say by a random number generator on a calculator). What is the probability the first is no greater than 0.33 and the other is at least 0.57?
Exercise 4.12    (Solution on p. 102.)
Helen is wondering how to plan for the weekend. She will get a letter from home (with money) with probability 0.05. There is a probability of 0.85 that she will get a call from Jim at SMU in Dallas. There is also a probability of 0.5 that William will ask for a date. What is the probability she will get money and Jim will not call or that both Jim will call and William will ask for a date?

Exercise 4.13    (Solution on p. 103.)
A basketball player takes ten free throws in a contest. On her first shot she is nervous and has probability 0.3 of making the shot. She begins to settle down and probabilities on the next seven shots are 0.5, 0.6, 0.7, 0.8, 0.8, 0.8, and 0.85, respectively. Then she realizes her opponent is doing well, and becomes tense as she takes the last two shots, with probabilities reduced to 0.75, 0.65. Assuming independence between the shots, what is the probability she will make k or more for k = 2, 3, · · ·, 10?

Exercise 4.14    (Solution on p. 103.)
In a group there are M men and W women; m of the men and w of the women are college graduates. An individual is picked at random. Let A be the event the individual is a woman and B be the event he or she is a college graduate. Under what condition is the pair {A, B} independent?

Exercise 4.15    (Solution on p. 103.)
Consider the pair {A, B} of events. Let P(A) = p, P(A^c) = q = 1 − p, P(B|A) = p1, and P(B|A^c) = p2. Under what condition is the pair {A, B} independent?

Exercise 4.16    (Solution on p. 103.)
Show that if event A is independent of itself, then P(A) = 0 or P(A) = 1. (This fact is key to an important zero-one law.)

Exercise 4.17    (Solution on p. 103.)
Does {A, B} independent and {B, C} independent imply {A, C} independent? Justify your answer.
Exercise 4.18    (Solution on p. 103.)
Suppose event A implies B (i.e. A ⊂ B). Show that if the pair {A, B} is independent, then either P(A) = 0 or P(B) = 1.
Exercise 4.19    (Solution on p. 103.)
A company has three task forces trying to meet a deadline for a new device. The groups work independently, with respective probabilities 0.8, 0.9, 0.75 of completing on time. What is the probability at least one group completes on time? (Think. Then solve by hand.)
Exercise 4.20    (Solution on p. 103.)
Two salesmen work differently. Roland spends more time with his customers than does Betty, hence tends to see fewer customers. On a given day Roland sees five customers and Betty sees six. The customers make decisions independently. If the probabilities for success on Roland's customers are 0.7, 0.8, 0.8, 0.6, 0.7 and for Betty's customers are 0.6, 0.5, 0.4, 0.6, 0.6, 0.4, what is the probability Roland makes more sales than Betty? What is the probability that Roland will make three or more sales? What is the probability that Betty will make three or more sales?

Exercise 4.21    (Solution on p. 104.)
Two teams of students take a probability exam. The entire group performs individually and independently. Team 1 has five members and Team 2 has six members. They have the following individual probabilities of making an 'A' on the exam.
Team 1: 0.83 0.87 0.92 0.77 0.86
Team 2: 0.68 0.91 0.74 0.68 0.73 0.83
a. What is the probability team 1 will make at least as many A's as team 2?
b. What is the probability team 1 will make more A's than team 2?

Exercise 4.22    (Solution on p. 104.)
A system has five components which fail independently. Their respective reliabilities are 0.93, 0.91, 0.78, 0.88, 0.92. Units 1 and 2 operate as a series combination. Units 3, 4, 5 operate as a two of three subsystem. The two subsystems operate as a parallel combination to make the complete system. What is the reliability of the complete system?
Exercise 4.23
A system has eight components with respective probabilities
(Solution on p. 104.)
0.96 0.90 0.93 0.82 0.85 0.97 0.88 0.80
units 4 through 8. What is the reliability of the complete system?
(4.34)
Units 1 and 2 form a parallel subsytem in series with unit 3 and a three of ve combination of
Exercise 4.24
combination in series with the three of ve combination?
(Solution on p. 104.)
How would the reliability of the system in Exercise 4.23 change if units 1, 2, and 3 formed a parallel
Exercise 4.25
(Solution on p. 104.)
How would the reliability of the system in Exercise 4.23 change if the reliability of unit 3 were changed from 0.93 to 0.96? What change if the reliability of unit 2 were changed from 0.90 to 0.95 (with unit 3 unchanged)?
Exercise 4.26 Exercise 4.27
(Solution on p. 105.) (Solution on p. 105.)
If customers buy
Three fair dice are rolled. What is the probability at least one will show a six?
A hobby shop nds that 35 percent of its customers buy an electronic game.
independently, what is the probability that at least one of the next ve customers will buy an electronic game?
98
CHAPTER 4. INDEPENDENCE OF EVENTS
Exercise 4.28
is 0.1. Successive messages are acted upon independently by the noise.
(Solution on p. 105.)
Suppose the message is
Under extreme noise conditions, the probability that a certain message will be transmitted correctly
transmitted ten times. What is the probability it is transmitted correctly at least once?
Exercise 4.29
Suppose the class
(Solution on p. 105.)
{Ai : 1 ≤ i ≤ n}
is independent, with
P (Ai ) = pi , 1 ≤ i ≤ n.
What is the
probability that at least one of the events occurs? What is the probability that none occurs?
Exercise 4.30
what is the probility 8 or more are sevens?
(Solution on p. 105.)
In one hundred random digits, 0 through 9, with each possible digit equally likely on each choice,
Exercise 4.31
set, what is the probability the store will sell three or more?
(Solution on p. 105.)
Ten customers come into a store. If the probability is 0.15 that each customer will buy a television
Exercise 4.32
Seven similar units are put into service at time
(Solution on p. 105.)
t = 0.
The units fail independently. The probability
of failure of any unit in the rst 400 hours is 0.18. What is the probability that three or more units are still in operation at the end of 400 hours?
Exercise 4.33
(Solution on p. 105.)
A computer system has ten similar modules. The circuit has redundancy which ensures the system operates if any eight or more of the units are operative. Units fail independently, and the probability is 0.93 that any unit will survive between maintenance periods. What is the probability of no system failure due to these units?
Exercise 4.34
(Solution on p. 105.)
Only thirty percent of the items from a production line meet stringent requirements for a special job. Units from the line are tested in succession. Under the usual assumptions for Bernoulli trials, what is the probability that three satisfactory units will be found in eight or fewer trials?
Exercise 4.35
(Solution on p. 105.)
What is the
The probability is 0.02 that a virus will survive application of a certain vaccine. probability that in a batch of 500 viruses, fteen or more will survive treatment?
Exercise 4.36
In a shipment of 20,000 items, 400 are defective.
(Solution on p. 105.)
These are scattered randomly throughout the
entire lot. Assume the probability of a defective is the same on each choice. What is the probability that
1. Two or more will appear in a random sample of 35? 2. At most ve will appear in a random sample of 50?
Exercise 4.37
A device has probability
p is necessary to ensure the probability of successes on all of the rst four trials is 0.85? value of p, what is the probability of four or more successes in ve trials?
Exercise 4.38
A survey form is sent to 100 persons. and each has probability 1/4 of replying, what is the probability of
p
(Solution on p. 105.)
of operating successfully on any trial in a sequence. What probability With that
(Solution on p. 105.)
If they decide independently whether or not to reply,
k
or more replies, where
k = 15, 20, 25, 30, 35, 40?
Exercise 4.39
are less than or equal to 0.63?
(Solution on p. 105.)
Ten numbers are produced by a random number generator. What is the probability four or more
99
Exercise 4.40
i she scores an
(Solution on p. 105.)
A player rolls a pair of dice ve times. She scores a hit on any throw if she gets a 6 or 7. She wins
odd number of hits in the ve throws.
What is the probability a player wins on any
sequence of ve throws? Suppose she plays the game 20 successive times. What is the probability she wins at least 10 times? What is the probability she wins more than half the time?
Exercise 4.41
on various trials are independent. Each spins the wheel 10 times.
(Solution on p. 106.)
What is the probability Erica
Erica and John spin a wheel which turns up the integers 0 through 9 with equal probability. Results
turns up a seven more times than does John?
Exercise 4.42
(Solution on p. 106.)
Erica and John play a dierent game with the wheel, above. Erica scores a point each time she gets an integer 0, 2, 4, 6, or 8. John scores a point each time he turns up a 1, 2, 5, or 7. If Erica spins eight times; John spins 10 times. What is the probability John makes
more
points than Erica?
Exercise 4.43
A box contains 100 balls; 30 are red, 40 are blue, and 30 are green. random, with replacement and mixing after each selection. ball; Martha has a success if she selects a blue ball.
(Solution on p. 106.)
Martha and Alex select at Alex has a success if he selects a red
Alex selects seven times and Martha selects
ve times. What is the probability Martha has more successes than Alex?
Exercise 4.44
of sixes?
(Solution on p. 106.)
Two players roll a fair die 30 times each. What is the probability that each rolls the same number
Exercise 4.45
A warehouse has a stock of
n items of a certain kind, r
p = r/n
(Solution on p. 106.)
of which are defective. Two of the items are
chosen at random, without replacement. What is the probability that at least one is defective? Show that for large
n the number is very close to that for selection with replacement, which corresponds
of success on any trial.
to two Bernoulli trials with pobability
Exercise 4.46
terminate.
(Solution on p. 106.)
A coin is ipped repeatedly, until a head appears. Show that with probability one the game will
tip: The probability of not terminating in
n trials is qn .
(Solution on p. 106.)
Exercise 4.47
plays. Let
Two persons play a game consecutively until one of them is successful or there are ten unsuccesful
Ei
be the event of a success on the
independent class with rst player wins,
B
P (Ei ) = p1
for
i
i th
play of the game.
odd and
P (Ei ) = p2
be the event the second player wins, and
C
for
i
Suppose
even. Let
A
{Ei : 1 ≤ i}
is an
be the event the
be the event that neither wins.
a. Express
b. Determine
c.
d.
A, B , and C in terms of the Ei . P (A), P (B), and P (C) in terms of p1 , p2 , q1 = 1 − p1 , and q2 = 1 − p2 . Obtain numerical values for the case p1 = 1/4 and p2 = 1/3. Use appropriate facts about the geometric series to show that P (A) = P (B) i p1 = p2 / (1 + p2 ). Suppose p2 = 0.5. Use the result of part (c) to nd the value of p1 to make P (A) = P (B) and then determine P (A), P (B), and P (C).
(Solution on p. 106.)
Exercise 4.48
a success on the
Three persons play a game consecutively until one achieves his objective. Let
i th
Ei
be the event of
trial, and suppose for
i = 1, 4, 7, · · ·, P (Ei ) = p2
{Ei : 1 ≤ i} is an independent class, with P (Ei ) = p1 for i = 2, 5, 8, · · ·, and P (Ei ) = p3 for i = 3, 6, 9, · · ·. Let A, B, C be the
respective events the rst, second, and third player wins.
100
CHAPTER 4. INDEPENDENCE OF EVENTS
a. Express
A, B ,
and
C
in terms of the
Ei .
p1 , p2 , p3 ,
then obtain numerical values in the case
b. Determine the probabilities in terms of
p1 = 1/4, p2 = 1/3,
Exercise 4.49
and
p3 = 1/2.
What is the probability of a success on the given there are
r
i th trial in a Bernoulli sequence of n component trials,
(Solution on p. 107.)
(Solution on p. 107.)
successes?
Exercise 4.50
A device has
N
similar components which may fail independently, with probability
p
of failure of
any component. The device fails if one or more of the components fails. In the event of failure of the device, the components are tested sequentially. a. What is the probability the rst defective unit tested is the have failed? b. What is the probability the defective unit is the
nth, given one or more components
nth, given that exactly one has failed?
c. What is the probability that more than one unit has failed, given that the rst defective unit is the
nth?
101
Solutions to Exercises in Chapter 4
Solution to Exercise 4.1 (p. 95)
pm = [0.15 0.05 0.02 0.18 0.25 0.05 0.18 0.12]; y = imintest(pm) The class is NOT independent Minterms for which the product rule fails y = 1 1 1 0 1 1 1 0 % The product rule hold for M7 = ABC
Solution to Exercise 4.2 (p. 95)
P = [0.65 0.37 0.48 p = minmap(minprob(P)) p = 0.0424 0.0249 0.0722 0.0424 0.0392 0.0230 0.0667 0.0392
0.63]; 0.0788 0.1342 0.0727 0.1238 0.0463 0.0788 0.0427 0.0727
Solution to Exercise 4.3 (p. 95)
pm = [0 0.05 0.10 0.05 0.20 0.10 0.40 0.10]; y = imintest(pm) The class is NOT independent Minterms for which the product rule fails y = 1 1 1 1 % By hand check product rule for any minterm 1 1 1 1
Solution to Exercise 4.4 (p. 95)
npr04_04 (Section~17.8.22: npr04_04) Minterm probabilities for Exercise~4.4 are in pm y = imintest(pm) The class is NOT independent Minterms for which the product rule fails y = 1 1 1 1 1 1 1 1
Solution to Exercise 4.5 (p. 95)
npr04_05 (Section~17.8.23: npr04_05) Minterm probabilities for Exercise~4.5 are in pm imintest(pm) The class is NOT independent
102
CHAPTER 4. INDEPENDENCE OF EVENTS
Minterms for which the product rule fails ans = 0 1 0 1 0 0 0 0 0 1 0 1 0 0 0 0
Solution to Exercise 4.6 (p. 95)
npr04_06 Minterm probabilities for Exercise~4.6 are in pm y = imintest(pm) The class is NOT independent Minterms for which the product rule fails y = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Solution to Exercise 4.7 (p. 96)
P (C) = P (AC) /P (A) = 0.40
and
P (B) = 1 − P (B c C) /P (C) = 0.20.
pm = minprob([0.3 0.2 0.4]) pm = 0.3360 0.2240 0.0840 0.0560
Solution to Exercise 4.8 (p. 96)
0.1440
0.0960
0.0360
0.0240
P (Ac C c ) = P (Ac ) P (C c ) = 0.3 implies P (Ac ) = 0.3/0.6 = 0.5 = P (A). P (Ac B c ) = P (Ac ) P (B c ) = 0.4 implies P (B c ) = 0.4/0.5 = 0.8 implies P (B) = 0.2
P = [0.5 0.2 0.4]; pm = minprob(P) pm = 0.2400 0.1600 0.0600
P = (1/6) (5/6) = 0.0161.
2 3
0.0400
0.2400
0.1600
0.0600
0.0400
Solution to Exercise 4.9 (p. 96) Solution to Exercise 4.10 (p. 96)
P = 0.01*[72 83 75 92 65 79]; y = ckn(P,[3 4 5]) y = 0.9780 0.8756 0.5967
Solution to Exercise 4.11 (p. 96)
P = 0.33 • (1 − 0.57) = 0.1419
Solution to Exercise 4.12 (p. 96)
A∼
letter with money,
B∼
call from Jim,
C∼
William ask for date
P = 0.01*[5 85 50]; minvec3 Variables are A, B, C, Ac, Bc, Cc They may be renamed, if desired. pm = minprob(P); p = ((A&Bc)(B&C))*pm' p = 0.4325
103
Solution to Exercise 4.13 (p. 96)
P = 0.01*[30 50 60 70 80 80 80 85 75 65]; k = 2:10; p = ckn(P,k) p = Columns 1 through 7 0.9999 0.9984 0.9882 0.9441 0.8192 Columns 8 through 9 0.0966 0.0134
Solution to Exercise 4.14 (p. 96)
0.5859
0.3043
P (AB) = w/ (m + w) = W/ (W + M ) = P (A)
Solution to Exercise 4.15 (p. 96)
p1 = P (BA) = P (BAc ) = p2
(see table of equivalent conditions).
Solution to Exercise 4.16 (p. 96)
P (A) = P (A ∩ A) = P (A) P (A). x2 = x
Solution to Exercise 4.17 (p. 96)
i
x=0
or
x = 1.
% No. Consider for example the following minterm probabilities: pm = [0.2 0.05 0.125 0.125 0.05 0.2 0.125 0.125]; minvec3 Variables are A, B, C, Ac, Bc, Cc They may be renamed, if desired. PA = A*pm' PA = 0.5000 PB = B*pm' PB = 0.5000 PC = C*pm' PC = 0.5000 PAB = (A&B)*pm' % Product rule holds PAB = 0.2500 PBC = (B&C)*pm' % Product rule holds PBC = 0.2500 PAC = (A&C)*pm' % Product rule fails PAC = 0.3250
Solution to Exercise 4.18 (p. 97)
A ⊂ B implies P (AB) = P (A); P (B) = 1 or P (A) = 0.
independence implies
P (AB) = P (A) P (B). P (A) = P (A) P (B)
only if
Solution to Exercise 4.19 (p. 97)
At least one completes i not all fail.
P = 1 − 0.2 • 0.1 • 0.25 = 0.9950
Solution to Exercise 4.20 (p. 97)
PR = 0.1*[7 8 8 6 7]; PB = 0.1*[6 5 4 6 6 4]; PR3 = ckn(PR,3) PR3 = 0.8662 PB3 = ckn(PB,3) PB3 = 0.6906
0:4)*ckn(PR. 97) P1 = 0. 97) Rc = parallel(R(1:3)) Rc = 0.9960 = ckn(R(4:8).5065 Solution to Exercise 4. = parallel(R(1:2)) = 0.8463 = ckn(R(3:5).0:4)*ckn(P1.9818 Solution to Exercise 4. INDEPENDENCE OF EVENTS PRgB = ikn(PB.01*[93 91 78 88 92]. Ra = parallel(R1(1:2)) Ra = 0.9390 R2 = R.24 (p.5527 P1g = ikn(P2. P2 = 0.9997 Rss = prod([Rb Rc]) Rss = 0.1:5)' PRgB = 0.01*[68 91 74 68 73 83].0:5)' P1geq = 0.9821 Rs3 = prod([Ra R1(3) Rb]) Rs3 = 0. = prod(R(1:2)) = 0.104 CHAPTER 4.23 (p.01*[96 90 93 82 85 97 88 80].21 (p.96.22 (p.9924 Solution to Exercise 4.3) = 0. 97) Ra Ra Rb Rb Rs Rs R = 0. R1(3) =0. 97) R1 = R. . 97) Ra Ra Rb Rb Rs Rs R = 0.2) = 0.0:5)*ckn(P1.25 (p.9821 = prod([Ra R(3) Rb]) = 0.3) Rb = 0.9960 Rb = ckn(R1(4:8).01*[83 87 92 77 86].2561 Solution to Exercise 4.9097 Solution to Exercise 4.9506 = parallel([Ra Rb]) = 0.1:5)' P1g = 0. P1geq = ikn(P2.
0. 3) = 0.29 (p.11/36.10) = 0.9946 0.0164 0. 98) P = 1 − 0.8840 Solution to Exercise 4. 3) = 0. 8) = 0.26 (p. 4) = 0.82.02.1547.851/4 .910 = 0. 0.3) Rb = 0.38 (p. 8) = 0.37 (p.1. 98) n i=1 (1 − pi ) P = cbinom (100.33 (p.5383 0.36 (p. P = cbinom (5.9999 Solution to Exercise 4. 98) P = cbinom (10.28 (p.02.4482 Solution to Exercise 4. • P 2 = 1 − cbinom (35.95.[1 3 5])) = 0.39 (p. 6) = 0.93. 0.9971 Solution to Exercise 4.02. 99) Each roll yields a hit with probability p= 6 36 + 5 36 = 11 36 .6513 Solution to Exercise 4. 2) = 0.4956 = cbinom(20.34 (p.9644 Solution to Exercise 4.7939 Solution to Exercise 4. 0. p.3. 98) P = cbinom (10.30 (p. 98) P 1 = 1 − P 0.105 R2(2) = 0. 98) P = cbinom (500.1/4.0814 Solution to Exercise 4. 0.9854 Solution to Exercise 4.27 (p.0.PW.1798 Solution to Exercise 4. 98) • P 1 = cbinom (35.655 = 0. 0. 0. PW P2 P2 P3 P3 PW = sum(ibinom(5.9717 Solution to Exercise 4.15:5:40) 0.0007 Solution to Exercise 4. 98) P = cbinom (7.15. 97) 3 P = 1 − 0. 0. 97) P = 1 − (5/6) = 0.3963 . 98) P 1 = cbinom (10. 98) p = 0. Ra = parallel(R2(1:2)) Ra = 0. P0 = Solution to Exercise 4.9980 Rb = ckn(R2(4:8).9115 Solution to Exercise 4.9821 Rs4 = prod([Ra R2(3) Rb]) Rs4 = 0.PW.32 (p.5724 = cbinom(20. 98) P = P = cbinom(100.31 (p.40 (p.4213 Solution to Exercise 4. 98) P = cbinom (8.63.11) = 0. 3) = 0. 4) = 0.9005 0.1495 0.35 (p. 15) = 0.
99) ∞ a. 0. 0.36) Solution to Exercise 4. In this case P (A) = 1 31 • = 31/64 = 0.41) c. 0.3. Solution to Exercise 4. 99) P1 = r n−r n−r r (2n − 1) r − r2 r r−1 • + • + • = n n−1 n n−1 n n−1 n (n − 1) P2 = 1 − r n 2 (4. 1 : 9) ' = 0.4. 99) P = ibinom (7.3437 Solution to Exercise 4.42 (p.4. 1 : 10) ' = 0. 3k+2 i=1 k • P (A) = p1 k=0 (q1 q2 q3 ) = q • P (B) = 1−q11p2 q3 q2 ∞ .39) P (B) = q1 p2 For 1 − (q1 q2 ) 5 P (C) = (q1 q2 ) 1 − q1 q2 and (4. d. Then N ⊂ Nk for all k P (N ) = 0.4030 Solution to Exercise 4.4844 = P (B) . 0. 99) P = sum (ibinom (30.5. P (A) = p1 1 + q1 q2 + (q1 q2 ) + (q1 q2 ) + (q1 q2 ) 5 2 3 4 = p1 1 − (q1 q2 ) 1 − q1 q2 5 (4. Note that P (A) + P (B) + P (C) = 1.46 (p. p2 = 1/3. A = E1 c B = E1 E2 c c c c E1 E2 E3 E4 E5 c c c c c c E1 E2 E3 E4 E5 E6 E7 c c c c c c c c E1 E2 E3 E4 E5 E6 E7 E8 E9 (4.1.47 (p. 0 : 9) ∗ cbinom (10. 0 : 4) ∗ cbinom (5. • A = E1 k=1 c • B = E1 E2 c c • C = E1 E2 E3 ∞ 3k i=1 c Ei E3k+1 c Ei E3k+2 c Ei E3k+3 p1 1−q1 q2 q3 k=1 3k+1 i=1 ∞ k=1 b. c c E1 E2 E3 event does not terminate in We conclude k plays. 0 : 30) . 99) P = ibinom (10. 1/6. 99) P = ibinom (8. 0.45 (p. P (C) = 1/32 4 16 i (4.1.3613 Solution to Exercise 4.5 = 1/3 p1 = p2 / (1 + p2 ). 99) Let implies N = event never terminates and Nk = 0 ≤ P (N ) ≤ P (Nk ) = 1/2k for all k.41 (p. 0 : 8) ∗ cbinom (10.106 CHAPTER 4.38) b. 1 : 5) ' = 0. C= 10 i=1 c Ei . 99) a.40) p1 = 1/4.37) c c c E1 E2 E3 E4 c c c c c E1 E2 E3 E4 E5 E6 c c c c c c c E1 E2 E3 E4 E5 E6 E7 E8 c c c c c c c c c E1 E2 E3 E4 E5 E6 E7 E8 E9 E10 (4. 0.35) = 2nr − r2 n2 (4.43 (p. P (A) = P (B) i p1 = q1 p2 = (1 − p1 ) p2 p1 = 0. we have q1 q2 = 1/2 q1 p2 = 1/4.5/1.1386 Solution to Exercise 4.^2) = 0.44 (p.48 (p. INDEPENDENCE OF EVENTS Solution to Exercise 4. Solution to Exercise 4.
failures.50 (p. b. q n−1 p 1 − QN −1 P (B2 Fn ) = = 1 − q N −n P (Fn ) q n−1 p (4. r) = r/n. Solution to Exercise 4. r − 1) /C (n. Solution to Exercise 4. B2 = event of two or more failures. Since probability not all from nth are good is 1 − q N −n . 100) Let A1 = event one failure. B1 = event of one or more Fn = the event the rst defective unit found is the nth.107 q1 • P (C) = 1−qq2qp3 3 1 2q • For p1 = 1/4. r) pr q n−r . a. and Fn ⊂ B1 implies P (Fn B1 ) = P (Fn ) /P (B1 ) = P (Fn A1 ) = q n−1 p 1−q N P (Fn A1 ) 1 q n−1 pq N −n = = N −1 P (A1 ) N pq N (4. p2 = 1/3.42) (see Exercise 4.49 (p. Hence P (Ei AA rn) = C (n − 1. P (A) = P (B) = P (C) = 1/3. r − 1) pr−1 q n−r and P (Arn ) = C (n. 100) P (Arn Ei ) = pC (n − 1.49) c.43) P (B2 Fn ) = . p3 = 1/2.
INDEPENDENCE OF EVENTS .108 CHAPTER 4.
the grade one makes should not aect the grade made by the other. For one probability measure a pair may be independent while for another probability measure the pair may not be independent. Now we should think form an independent pair. since it has the three dening properties and all those properties derived therefrom.1 Conditional Independence1 The idea of stochastic (probabilistic) independence is explored in the unit Independence of Events (Section 4. The concept is approached as lack of conditioning: P (AB) = P (A).org/content/m23258/1.Chapter 5 Conditional Independence 5.1 The concept Examination of the independence concept reveals two important mathematical facts: • Independence of a class of non mutually exclusive events depends upon the probability measure.e. we should think the events weather on the purchases.7/>. unless probabilities are indicated. The weather has a decided eect on the likelihood of buying an umbrella..1: Buying umbrellas and the weather A department store has a nice stock of umbrellas. the the item purchased by one should not be aected by the choice made by the other.. Among the simple examples of operational independence" in the unit on independence of events. each unaware of the choice made by the other. If two students are taking exams in dierent courses. independence with respect to a conditional probability measure? In this chapter we explore that question in a fruitful way.e. B} be the event the weather is rainy (i. But consider the eect of P (AC) > P (AC c ) P (BC) > P (BC c ). is raining or threatening and Two customers come into the store indepen Normally. Let A be the event the rst buys an umbrella and B the event the second buys an umbrella. 109 . and not on the relationship between the events. This is equivalent to the product rule P (AB) = P (A) P (B) . which lead naturally to an assumption of probabilistic independence are the following: • • If customers come into a well stocked shop at dierent times. But given the fact the weather is rainy 1 This content is available online at <http://cnx. dently. Independence cannot be displayed on a Venn diagram. Example 5. Let to rain).1. 5. C {A. We consider an extension to conditional independence.1). • Conditional probability is a probability measure. This raises the question: is there a useful conditional independencei.
P (BC c ) = 0. Does this make the pair {A.4) P (A) P (B) = 0. so that the pair is (5. additional knowledge that A occurs.60.5) not independent. Then (5. P (BC) = 0. one would suppose their performances form an independent pair. Thus. Consider the following c c c and P (ABC ) = P (AC ) P (BC ) and numerical case.1110 As a result. Saturday. CONDITIONAL INDEPENDENCE (event C has occurred).2: Students and exams Two students take exams in dierent courses. we can see why this should be so.0816 = 0. An examination of the pattern of computation shows that independence would require very special probabilities which are not likely to be encountered. B} in spect to the conditional probability measure For this example. If the rst customer buys an umbrella. shows that we have independence of the pair {A. use of equivalent conditions shows that the situation may P (ABC) = P (AC) P (BC) and P (ABC c ) = P (AC c ) P (BC c ) P (AC) < P (AC c ) and (5.30 that there was a football game on It is the fall semester. in which case the second customer is likely to buy.30. Let Saturday. B} with rePC (·) = P (·C). Under normal circumstances.2) P (A) = P (AC) P (C) + P (AC c ) P (C c ) = 0. Let B A be the event the rst student makes grade 80 or be the event the second has a grade of 80 or better.1110 = P (AB) The product rule fails. Now it is reasonable to suppose C be the event of a game on the previous P (AC) = P (ABC) aect the lielihood that be expressed and P (AC c ) = P (ABC c ) (5. in another notation. Example 5. we should suppose that P (BC) < P (BC c ). (5. we should also expect that with respect to the conditional probability dependent (with respect to the prior probability measure P )? Some numerical examples make it Without calculations. this indicates a higher than normal likelihood that the weather is rainy. it may be reasonable to suppose P (AC) = P (ABC) replaced by probability measure or. Thus. withP (C) = 0. and both students are enthusiastic fans.3200 P (B) = P (BC) P (C) + P (BC c ) P (C c ) = 0. B Again. it would seem reasonable that purchase of an umbrella by one should not aect the likelihood of such a purchase by the other. P (AC c ) = P (ABC c ). with probability measure P PC . Suppose P (AC) = 0. If we knew that one did poorly on the exam.2550 P (AB) = P (ABC) P (C) + P (ABC c ) P (C c ) P (AC c ) P (BC c ) P (C c ) = 0.3) = P (AC) P (BC) P (C) + (5. better and morning. so that there is independence c measure P (·C ).1) An examination of the sixteen equivalent conditions for independence.110 CHAPTER 5.50. PC (A) = PC (AB) (5. The condition leads to P (ABC) = P (AC) P (BC) P (BA) > P (B). plain that only in the most unusual cases would the pair be independent.20. The exam is given on Monday There is a probability 0.15. P (ABC) = P (AC) P (BC). P (AC c ) = 0. this would increase the likelihoood there was a .7) Under these conditions.6) has occurred does not If we know that there was a Saturday game.
B} is said to be conditionally independent. Sixteen equivalent conditions P (ABC) = P (AC) P (AB C) = P (AC) P (Ac BC) = P (Ac C) P (Ac B c C) = P (Ac C) c P (BAC) = P (BC) P (B AC) = P (B C) P (BAc C) = P (BC) P (B c Ac C) = P (B c C) Table 5. or {Ac . In the hybrid notation we use for repeated conditioning.8400. As soon as then In a given problem. The failure to be Although their performances are operationally independent. we write PC (AB) = PC (A) This translates into or PC (AB) = PC (A) PC (B) (5.7400.111 Saturday game and hence increase the likelihood that the other did poorly. B c }. givenC. then translate as above. we have the following equivalent conditions. we take as the dening Denition. then so .1).4: Repeated conditioning) and the equivalent conditions for independence (Table 4.3 Straightforward calculations show (5.2 The patterns of conditioning in the examples above belong to this set.6300.8 P (C) = 0. 5.1 P (ABC) = P (AC) P (BC) P (AB c C) = P (AC) P (B c C) P (Ac BC) = P (Ac C) P (BC) P (Ac B c C) = P (Ac C) P (B c C) c c P (ABC) = P (AB c C) P (Ac BC) P (Ac B c C) = P (BAC) = P (BAc C) P (B c AC) P (B c Ac C) = Table 5.9) P (ABC) = P (AC) If it is known that likelihood of or P (ABC) = P (AC) P (BC) (5. independent arises from a common chance factor that aects both. designated {A. one or the of these patterns is recognized. The equivalence of the four entries in the right hand column of the upper part of the table. they are not independent in the probability sense. P (B) = 0.9 P (BC) = 0. suppose P (AC) = 0.10) A.1. B}. {A. we may produce a similar table of equivalent conditions for conditional independence.7 P (AC c ) = 0. B} ci C i the following product rule holds: P (ABC) = P (AC) P (BC).2 Sixteen equivalent conditions Using the facts on repeated conditioning (Section 3. condition the product rule P (ABC) = P (AC) P (BC). {A. other of these conditions may seem a reasonable assumption. then additional knowledge of the occurrence of B does not change the If we write the sixteen equivalent conditions for independence in terms of the conditional probability measure PC ( · ). given C. A pair of events Because of its simplicity and symmetry. all one are equally valid assumptions. P (AB) = 0.6 P (BC c ) = 0. C has occurred.8) Note that P (A) = 0. {A.8514 > P (A) as would be expected. P (AB) = 0.1. {Ac . As a numerical example. B}. establish The replacement rule If any of the pairs are the others. B c } is conditionally independent.
B} ci Dc . . The doctor orders two independent tests for conditions that indicate the disease. We consider several examples. event the rst completes on time to assume C be the event the library has two copies available. neither aware of the testing by the others. B} ci D and {A. • Two students are working on a term paper. given an event C. we may replace either or both by their complements and still have a conditionally independent pair. With neither cheating or collaborating on the test itself. Let But there are cases in which the eect of the conditioning event is asymmetric. to the detriment of the second. is conditionally independent. but some chance factor which aects both. give an adequate supply. Since conditional independence is ordinary independence with respect to a conditional probability measure. (indicates the conditions associated with the disease) and Then it would seem reasonable to suppose B A is the event the rst test is positive is the event the second test is positive. If they prepared separately. CONDITIONAL INDEPENDENCE This may be expressed by saying that if a pair is conditionally independent. They both need to borrow If A is the reasonable {A. the replacement rule extends. it should be clear how to extend the concept to larger classes of sets. if one does well. and conditional independence. if the tests are meaningful. since if the rst student completes on time. B} ci C and {A. The operational independence suggests probabilistic independence. Yet. However. given the complementary event. If they study together. event As in the case of simple independence. denoted {Ai : i ∈ J} ci C . • If the two contractors of the example above both need material which may be in scarce supply. Examples 1 and A is the event the rst contracter completes his job on time and B is the event the If C is the event of good weather. • Two contractors work quite independently on jobs in the same city. In the examples considered so far. then successful completion would be conditionally independent. where J is an arbitrary index set. an event must be sharply dened: on any trial it occurs or it does not. Did a trace of rain or thunder in the area constitute bad weather? Did rain delay on one day in a month long project constitute bad weather? Even with this ambiguity. then he or she must have been successful in getting the book. {A. A preliminary examination leads the doctor to think there is a thirty percent chance the patient has a certain disease. they must both be aected by the actual condition of the Suppose D is the event the patient has the disease. and B the event the second is successful. given C. To illustrate further the usefulness of this concept. then it seems P (BAC c ) < P (BC c ). Suppose second completes on time. B} ci C . • Two students in the same course take an exam. Remark. then the two conditions would In general not be conditionally independent.112 CHAPTER 5. Are results of these tests really independent? There is certainly operational independencethe tests may be done by dierent laboratories. They work quite separately. i the product rule holds for every nite subclass of two or more. if only one book is available. the other should also. given a short supply. the event of both getting good grades should be conditionally independent. The replacement rule If the class {Ai : i ∈ J} ci C . whereas they would not be conditionally independent. The event of good weather is not so clearly dened. a certain book from the library. However. • A patient goes to a doctor. 
then the likelihoods of good grades would not be independent. In formal probability theory. the pattern of probabilistic analysis may be useful. Denition. it has been reasonable to assume conditional independence. patient. then any or all of the events Ai may be replaced by their complements and still have a conditionally independent class. we note some other common examples in which similar conditions hold: there is operational independence. B} ci C . then arguments similar to those in c 2 make it seem reasonable to suppose {A. both jobs are outside and subject to delays due to bad weather. A class {Ai : i ∈ J}.
If the prospects are sound. Because of assumed conditional independence. Consider the trials to constitute a Bernoulli sequence. 0. In this case it is reasonable to assume We consider a simple example. PE~=~ckn(P1. F = the event three or more advisers are G = F c = the event three or more are negative.3 The use of independence techniques Since conditional independence is independence. P1~=~0. given that the venture is sound and also given that the venture is not sound. H. the respective probabilities are 0. determine of correct picks. If S is the number of successes in picking the winners in the ten races. Sharon makes the right choice if three or more of the ve advisors are positive. we may use independence techniques in the solution of problems. The following MATLAB solution of the investment problem is indicated. the probabilities are 0.85. 0.2. Let H be the event he is correct in his claim.75. P (E) = P (F H) + P (GH c ) = P (F H) P (H) + P (GH c ) P (H c ) Let form (5.113 5. and 0. given H. . if the venture is not sound. If the venture is unsound. the advisers are independent of one another in the sense that no one is aected by the others.3: Use of independence techniques Sharon is investigating a business venture which she thinks has probability 0. We may reasonably assume conditional independence of the advice.1. Example 5. 0. and E = the event of the correct decision.4: Test of a claim A race track regular claims he can pick the winning horse in any race 90 percent of the time. If he is simply guessing. positive. P (HS = k) for various numbers k Suppose it is equally likely that his claim is valid or that he is merely guessing. P2~=~0. i th adviser is positive.9. Then Let H = the event the pro ject is sound. We assume two conditional Bernoulli trials: Claim is valid: Ten trials. they are not independent. 0.7 of being successful. Given the quality of the project.9255 Often a Bernoulli sequence is related to some conditioning event the sequence {Ei : 1 ≤ i ≤ n} ci H and ci H c .6. She checks with ve independent advisers. Then P (F H) = the sum of probabilities of Mk are minterms generated by the class {Ei : 1 ≤ i ≤ 5}. and P (Mk H ). In order to test his claim. for they are all related to the soundness of the venture. Of course.2.3)*(1~~PH) PE~=~~~~0.3)*PH~+~ckn(P2. and 0. what is the probability she will make the right decision? SOLUTION If the project is sound.7 that the advice will be negative.12) P (Mk H) probability of three or more successes. the probability of success on each race is 0. Example 5.7. 0. PH~=~0.8.9. p = P (Ei H c ) = 0.01*[80~75~60~90~80]. c c c c P (E1 E2 E3 E4 E5 H) = P (E1 H) P (E2 H) P (E3 H) P (E4 H) P (E5 H) with similar expressions for each (5. We consider two types of problems: an inference problem and a conditional Bernoulli sequence. he picks a horse to win in each of ten races. c This means that if we want the we can use ckn with the matrix of conditional probabilities.7. 0. probability p = P (Ei H) = 0.9.11) the the Ei be the event the where P (Mk H).01*[75~85~70~90~70].75. If Sharon goes with the ma jority of advisers. she makes the right choice if three or more of the ve advisers are negative. There are ve horses in each race.8 that the advisers will advise her to proceed. probability Guessing at random: Ten trials.
~given~H^c OH~~=~Pk1.OH(e)]')) ~~~~~~~~~~~6~~~~~~~~~~~2~~~~~~%~Needs~at~least~six~to~have~creditability ~~~~~~~~~~~7~~~~~~~~~~73~~~~~~%~Seven~would~be~creditable. he would have to pick at least seven correctly to give reasonable validation of his claim.~given~H Pk2~=~ibinom(10.2. we have evidence for the condition which can never be absolutely certain. In general. . We simply use Bayes' rule to reverse the direction of conditioning. and P (EH c ).~~~~~~~~~~~~%~Conditional~odds~Assumes~P(H)/P(H^c)~=~1 e~~~=~OH~>~1. basis of the evidence. CONDITIONAL INDEPENDENCE Let S= number of correct picks in ten trials. 0 ≤ k ≤ 10 P (H c S = k) P (H c ) P (S = kH c ) Giving him the benet of the doubt.13) P (H) /P (H c ) = 1 and calculate the conditional k~=~0:10. Then P (H) P (S = kH) P (HS = k) = · .k)./Pk2.org/content/m23259/1. Pk1~=~ibinom(10.1 Some Patterns of Probable Inference We are concerned with the likelihood of some hypothesized condition. Some typical examples: We are forced to assess probabilities (likelihoods) on the HYPOTHESIS Job success Presence of oil Operation of a device Market condition Presence of a disease EVIDENCE Personal traits Geological structures Physical condition Test market condition Tests for symptoms 5.k). H is the event the hypothetical condition exists and E is the event the evidence occurs. the probabilities (5.9.0. What is desired is P (HE) or.3 If available are usually P (H) (or an odds value).0. c equivalently. P (EH). (5.~~~~~~~~~~~~~~%~Selects~favorable~odds disp(round([k(e).2.2 Patterns of Probable Inference2 Table 5.6/>.~~~~%~Probability~of~k~successes.114 CHAPTER 5.~~~~%~Probability~of~k~successes. 5. the odds P (HE) /P (H E). ~~~~~~~~~~~8~~~~~~~~2627~~~~~~%~even~if~P(H)/P(H^c)~=~0. we suppose odds. P (HE) P (EH) P (H) = · P (H c E) P (EH c ) P (H c ) No conditional independence is involved in this case.1 ~~~~~~~~~~~9~~~~~~~94585 ~~~~~~~~~~10~~~~~3405063 Under these assumptions.14) 2 This content is available online at <http://cnx.
115
Independent evidence for the hypothesized condition
Suppose there are two independent bits of evidence. Now obtaining this evidence may be operationally independent, but if the items both relate to the hypothesized condition, then they cannot be really independent. knowledge of Thus The condition assumed is usually of the form does not aect the likelihood of and
E2
{E1 , E2 } ci H
{E1 , E2 } ci H c .
E1 .
Similarly, we usually have
P (E1 H) = P (E1 HE2 ) if H occurs, then P (E1 H c ) = P (E1 H c E2 ).
Example 5.5: Independent medical tests
Suppose a doctor thinks the odds are 2/1 that a patient has a certain disease. independent tests. Let
H
be the event the patient has the disease and
E1
and
E2
She orders two be the events
the tests are positive. Suppose the rst test has probability 0.1 of a false positive and probability 0.05 of a false negative. The second test has probabilities 0.05 and 0.08 of false positive and false negative, respectively. If both tests are positive, what is the posterior probability the patient has the disease?
SOLUTION
Assuming
{E1 , E2 } ci H
and
ci H c , we work rst in terms of the odds, then convert to probability.
(5.15)
P (H) P (E1 E2 H) P (H) P (E1 H) P (E2 H) P (HE1 E2 ) = · = · P (H c E1 E2 ) P (H c ) P (E1 E2 H c ) P (H c ) P (E1 H c ) P (E2 H c )
The data are
P (H) /P (H c ) = 2, P (E1 H) = 0.95, P (E1 H c ) = 0.1, P (E2 H) = 0.92, P (E2 H c ) = 0.05
Substituting values, we get
(5.16)
0.95 · 0.92 1748 P (HE1 E2 ) =2· = P (H c E1 E2 ) 0.10 · 0.05 5
Evidence for a symptom
so that
P (HE1 E2 ) =
1748 5 =1− = 1 − 0.0029 1753 1753
(5.17)
Sometimes the evidence dealt with is not evidence for the hypothesized condition, but for some condition which is stochastically related.
symptom.
For purposes of exposition, we refer to this intermediary condition as a
Consider again the examples above.
HYPOTHESIS Job success Presence of oil Operation of a device Market condition Presence of a disease
SYMPTOM Personal traits Geological structures Physical condition Test market condition Physical symptom
EVIDENCE Diagnostic test results Geophysical survey results Monitoring report Market survey result Test for symptom
Table 5.4
We let
S
be the event the symptom is present. The usual case is that the evidence is directly related to
the symptom and not the hypothesized condition. The diagnostic test results can say something about an applicant's personal traits, but cannot deal directly with the hypothesized condition. The test results would be the same whether or not the candidate is successful in the job (he or she does not have the job yet). A geophysical survey deals with certain structural features beneath the surface. If a fault or a salt dome is The physical monitoring
present, the geophysical results are the same whether or not there is oil present.
report deals with certain physical characteristics. Its reading is the same whether or not the device will fail. A market survey treats only the condition in the test market. The results depend upon the test market, not
116
CHAPTER 5. CONDITIONAL INDEPENDENCE
A blood test may be for certain physical conditions which frequently are related (at
the national market.
least statistically) to the disease. But the result of the blood test for the physical condition is not directly aected by the presence or absence of the disease. Under conditions of this type, we may assume
P (ESH) = P (ESH c )
These imply
and
P (ES c H) = P (ES c H c )
(5.18)
{E, H} ci S
P (HE) P (H c E)
and
ci S c . =
Now
=
P (HE) P (H c E)
P (HES)+P (HES c ) P (HS)P (EHS)+P (HS c )P (EHS c ) P (H c ES)+P (H c ES c ) = P (H c S)P (EH c S)+P (H c S c )P (EH c S c ) c c = PP (HS)P (ES)+P (HSS )P (ES ) ) (H c S)P (ES)+P (H c c )P (ES c
(5.19)
It is worth noting that each term in the denominator diers from the corresponding term in the numerator by having
Hc
in place of
H.
Before completing the analysis, it is necessary to consider how
H
and
S
are
related stochastically in the data. Four cases may be considered.
a. Data are b. Data are c. Data are d. Data are
P P P P
(SH), (SH), (HS), (HS),
P P P P
(SH c ), (SH c ), (HS c ), (HS c ),
and and and and
P P P P
(H). (S). (S). (H).
Case a:
P (H) P (SH) P (ES) + P (H) P (S c H) P (ES c ) P (HE) = c E) P (H P (H c ) P (SH c ) P (ES) + P (H c ) P (S c H c ) P (ES c )
(5.20)
Example 5.6: Geophysical survey
Let
H
be the event of a successful oil well,
favorable to the presence of oil, and structure. We suppose
S be the event there is a geophysical structure E be the event the geophysical survey indicates a favorable
and
{H, E} ci S
ci S c .
Data are
P (H) /P (H c ) = 3, P (SH) = 0.92, P (SH c ) = 0.20, P (ES) = 0.95, P (ES c ) = 0.15
Then
(5.21)
P (HE) 0.92 · 0.95 + 0.08 · 0.15 1329 =3· = = 8.5742 c E) P (H 0.20 · 0.95 + 0.80 · 0.15 155
so that
(5.22)
P (HE) = 1 −
155 = 0.8956 1484
(5.23)
The geophysical result moved the prior odds of 3/1 to posterior odds of 8.6/1, with a corresponding change of probabilities from 0.75 to 0.90.
Case b:
Data are
P (S)P (SH), P (SH c ), P (ES).
and
P (ES c ).
If we can determine
P (H),
we can
proceed as in case a. Now by the law of total probability
P (S) = P (SH) P (H) + P (SH c ) [1 − P (H)]
which may be solved algebraically to give
(5.24)
P (H) =
P (S) − P (SH c ) P (SH) − P (SH c )
(5.25)
117
Example 5.7: Geophysical survey revisited
In many cases a better estimate of of previous geophysical data. Suppose the prior odds for
P (S) or the odds P (S) /P (S c ) can be made on the basis S are 3/1, so that P (S) = 0.75. P (H) = 55/17 P (H c )
Using the other data in Example 5.6 (Geophysical survey), we have
P (H) =
0.75 − 0.20 P (S) − P (SH c ) = = 55/72, P (SH) − P (SH c ) 0.92 − 0.20
so that
(5.26)
Using the pattern of case a, we have
P (HE) 55 0.92 · 0.95 + 0.08 · 0.15 4873 = · = = 9.2467 c E) P (H 17 0.20 · 0.95 + 0.80 · 0.15 527
so that
(5.27)
P (HE) = 1 −
527 = 0.9024 5400 P (ES)
and
(5.28)
Usually data relating test results to symptom are of the form
P (ES c ),
or equivalent.
Data relating the symptom and the hypothesized condition may go either way. In cases a and b, the data are in the form
P (SH)
and
P (SH c ),
and
or equivalent, derived from data showing the fraction of
times the symptom is noted when the hypothesized condition is identied. But these data may go in the opposite direction, yielding and d.
P (HS)
P (HS c ),
or equivalent. This is the situation in cases c
Case c:
Data are
P (ES) , P (ES c ) , P (HS) , P (HS c )
and
P (S). P (S)
A test A
Example 5.8: Evidence for a disease symptom with prior
time.
When a certain blood syndrome is observed, a given disease is indicated 93 percent of the The disease is found without this syndrome only three percent of the time.
for the syndrome has probability 0.03 of a false positive and 0.05 of a false negative.
preliminary examination indicates a probability 0.30 that a patient has the syndrome. A test is performed; the result is negative. What is the probability the patient has the disease?
SOLUTION
In terms of the notation above, the data are
P (S) = 0.30,
P (ES c ) = 0.03,
and
P (E c S) = 0.05,
(5.29)
P (HS) = 0.93,
We suppose
P (HS c ) = 0.03
(5.30)
{H, E} ci S
and
ci S c .
(5.31)
P (HE c ) P (S) P (HS) P (E c S) + P (S c ) P (HS c ) P (E c S c ) = c E c ) P (H P (S) P (H c S) P (E c S) + P (S c ) P (H c S c ) P (E c S c ) =
which implies
429 0.30 · 0.93 · 0.05 + 0.70 · 0.03 · 0.97 = 0.30 · 0.07 · 0.05 + 0.70 · 0.97 · 0.97 8246
(5.32)
P (HE c ) = 429/8675 ≈ 0.05.
Case d:
This diers from case c only in the fact that a prior probability for
determine the corresponding probability for
S
H
is assumed. In this case, we
by
P (S) =
and use the pattern of case c.
P (H) − P (HS c ) P (HS) − P (HS c )
(5.33)
118
CHAPTER 5. CONDITIONAL INDEPENDENCE
Example 5.9: Evidence for a disease symptom with prior
P (H)
Suppose for the patient in Example 5.8 (Evidence for a disease symptom with prior physician estimates the odds favoring the presence of the disease are 1/3, so that Again, the test result is negative. Determine the posterior odds, given
Ec .
P (S)) the P (H) = 0.25.
SOLUTION
First we determine
P (S) =
Then
0.25 − 0.03 P (H) − P (HS c ) = = 11/45 P (HS) − P (HS c ) 0.93 − 0.03
(5.34)
P (HE c ) (11/45) · 0.93 · 0.05 + (34/45) · 0.03 · 0.97 15009 = = = 0.047 P (H c E c ) (11/45) · 0.07 · 0.05 + (34/45) · 0.97 · 0.97 320291
The result of the test drops the prior odds of 1/3 to approximately 1/21.
(5.35)
Independent evidence for a symptom
In the previous cases, we consider only a single item of evidence for a symptom. But it may be desirable to have a second opinion. We suppose the tests are for the symptom and are not directly related to the hypothetical condition. If the tests are operationally independent, we could reasonably assume
c P (E1 SE2 ) = P (E1 SE2 )
{E1 , E2 } ci S {E1 , H} ci S {E2 , H} ci S {E1 E2 , H} ci S
(5.36)
P (E1 SH) = P (E1 SH ) P (E2 SH) = P (E2 SH ) P (E1 E2 SH) = P (E1 E2 SH c )
This implies
c
c
{E1 , E2 , H} ci S .
depending on the tie between
S
A similar condition holds for and
H. We consider a "case a" example.
Sc .
As for a single test, there are four cases,
Example 5.10: A market survey problem
A food company is planning to market nationally a new breakfast cereal. Its executives feel condent that the odds are at least 3 to 1 the product would be successful. Before launching the new product, the company decides to investigate a test market. Previous experience indicates that the reliability of the test market is such that if the national market is favorable, there is probability 0.9 that the test market is also. On the other hand, if the national market is unfavorable, there is a probability of only 0.2 that the test market will be favorable. These facts lead to the following analysis. Let
H be the event the national market is favorable (hypothesis) S be the event the test market is favorable (symptom) The initial data are the following probabilities, based on past experience:
(a) Prior odds: (b) Reliability
• •
P (H) /P (H c ) = 3 c of the test market: P (SH) = 0.9 P (SH ) = 0.2
If it were known that the test market is favorable, we should have
P (HS) P (SH) P (H) 0.9 = = · 3 = 13.5 P (H c S) P (SH c ) P (H c ) 0.2
(5.37)
Unfortunately, it is not feasible to know with certainty the state of the test market. The company decision makers engage two market survey companies to make independent surveys of the test market. The reliability of the companies may be expressed as follows. Let
: :
E1 E2
be the event the rst company reports a favorable test market. be the event the second company reports a favorable test market.
119
On the basis of previous experience, the reliability of the evidence about the test market (the symptom) is expressed in the following conditional probabilities.
P (E1 S) = 0.9 P (E1 S c ) = 0.3
national market is favorable, given this result?
P (E2 S) = 0.8 B (E2 S c ) = 0.2
(5.38)
Both survey companies report that the test market is favorable.
What is the probability the
SOUTION
The two survey rms work in an operationally independent manner. The report of either company is unaected by the work of the other. Also, each report is aected only by the condition of the
test market regardless of what the national market may be. According to the discussion above, we should be able to assume
{E1 , E2 , H} ci S
and
{E1 , E2 , H} ci S c
(5.39)
We may use a pattern similar to that in Example 2, as follows:
P (H) P (SH) P (E1 S) P (E2 S) + P (S c H) P (E1 S c ) P (E2 S c ) P (HE1 E2 ) = · c E E ) c ) P (SH c ) P (E S) P (E S) + P (S c H c ) P (E S c ) P (E S c ) P (H P (H 1 2 1 2 1 2 =3· 327 0.9 · 0.9 · 0.8 + 0.1 · 0.3 · 0.2 = ≈ 10.22 0.2 · 0.9 · 0.8 + 0.8 · 0.3 · 0.2 32
(5.40)
(5.41)
In terms of the posterior probability, we have
P (HE1 E2 ) =
We note that the odds favoring
327/32 327 32 = =1− ≈ 0.91 1 + 327/32 359 359
(5.42)
as compared with the odds favoring
H, given positive indications from both survey companies, is 10.2 H, given a favorable test market, of 13.5. The dierence
reects the residual uncertainty about the test market after the market surveys. Nevertheless, the results of the market surveys increase the odds favoring a satisfactory market from the prior 3 to 1 to a posterior 10.2 to 1. In terms of probabilities, the market surveys increase the likelihood
of a favorable market from the original
P (H) = 0.75
to the posterior
P (HE1 E2 ) = 0.91.
The
conditional independence of the results of the survey makes possible direct use of the data.
5.2.2 A classication problem
A population consists of members of two subgroups. It is desired to formulate a battery of questions to aid in identifying the subclass membership of randomly selected individuals in the population. The questions are designed so that for each individual the answers are independent, in the sense that the answers to any subset of these questions are not aected by and do not aect the answers to any other subset of the questions. The answers are, however, aected by the subgroup membership. Thus, our treatment of conditional idependence suggests that it is reasonable to supose the answers are conditionally independent, given the subgroup membership. Consider the following numerical example.
Example 5.11: A classication problem
A sample of 125 subjects is taken from a population which has two subgroups. membership of each sub ject in the sample is known. The subgroup Each individual is asked a battery of ten
questions designed to be independent, in the sense that the answer to any one is not aected by the answer to any other. The subjects answer independently. Data on the results are summarized in the following table:
120
CHAPTER 5. CONDITIONAL INDEPENDENCE
GROUP 1 (69 members)
Q 1 2 3 4 5 6 7 8 9 10 Yes 42 34 15 19 22 41 9 40 48 20 No 22 27 45 44 43 13 52 26 12 37 Unc. 5 8 9 6 4 15 8 3 9 12
GROUP 2 (56 members)
Yes 20 16 33 31 23 14 31 13 27 35 No 31 37 19 18 28 37 17 38 24 16 Unc. 5 3 4 7 5 5 8 5 5 5
Table 5.5
Assume the data represent the general population consisting of these two groups, so that the data may be used to calculate probabilities and conditional probabilities. Several persons are interviewed. The result of each interview is a prole of answers to the
questions. The goal is to classify the person in one of the two subgroups on the basis of the prole of answers. The following proles were taken.
• • •
Y, N, Y, N, Y, U, N, U, Y. U N, N, U, N, Y, Y, U, N, N, Y Y, Y, N, Y, U, U, N, N, Y, Y
Classify each individual in one of the subgroups.
SOLUTION
Let
G1 =
the event the person selected is from group 1, and
G2 = Gc = 1
the event the person
selected is from group 2. Let
Ai = the event the answer to the i th question is Yes Bi = the event the answer to the i th question is No Ci = the event the answer to the i th question is Uncertain The data are taken to mean P (A1 G1 ) = 42/69, P (B3 G2 ) = 19/56, etc. The prole Y, N, Y, N, Y, U, N, U, Y. U corresponds to the event E = A1 B2 A3 B4 A5 C6 B7 C8 A9 C10
We utilize the ratio form of Bayes' rule to calculate the posterior odds
P (G1 E) P (EG1 ) P (G1 ) = · P (G2 E) P (EG2 ) P (G2 )
(5.43)
If the ratio is greater than one, classify in group 1; otherwise classify in group 2 (we assume that a ratio exactly one is so unlikely that we can neglect it). Because of conditional independence, we are able to determine the conditional probabilities
P (EG1 ) =
42 · 27 · 15 · 44 · 22 · 15 · 52 · 3 · 48 · 12 6910 29 · 37 · 33 · 18 · 23 · 5 · 17 · 5 · 24 · 5 5610
and
(5.44)
P (EG2 ) =
(5.45)
121
The odds
P (G1 ) /P (G2 ) = 69/56.
We nd the posterior odds to be
P (G1 E) 42 · 27 · 15 · 44 · 22 · 15 · 52 · 3 · 48 · 12 569 = 5.85 = · P (G2 E) 29 · 37 · 33 · 18 · 23 · 5 · 17 · 5 · 24 · 5 699
The factor
(5.46)
569 /699
comes from multiplying
5610 /6910
by the odds
P (G1 ) /P (G2 ) = 69/56.
Since
the resulting posterior odds favoring Group 1 is greater than one, we classify the respondent in group 1. While the calculations are simple and straightforward, they are tedious and error prone. To
make possible rapid and easy solution, say in a situation where successive interviews are underway, we have several mprocedures for performing the calculations. Answers to the questions would
normally be designated by some such designation as Y for yes, N for no, and U for uncertain. In order for the mprocedure to work, these answers must be represented by numbers indicating the appropriate columns in matrices
A and B. Thus, in the example under consideration, each Y must
be translated into a 1, each N into a 2, and each U into a 3. The task is not particularly dicult, but it is much easier to have MATLAB make the translation as well as do the calculations. following twostage approach for solving the problem works well. The rst mprocedure The
oddsdf
sets up the frequency information.
The next mprocedure
odds
calculates the odds for a given prole. The advantage of splitting into two mprocedures is that we can set up the data once, then call repeatedly for the calculations for dierent proles. As always, it is necessary to have the data in an appropriate form. The following is an example in which the data are entered in terms of actual frequencies of response.
%~file~oddsf4.m %~Frequency~data~for~classification A~=~[42~22~5;~34~27~8;~15~45~9;~19~44~6;~22~43~4; ~~~~~41~13~15;~9~52~8;~40~26~3;~48~12~9;~20~37~12]; B~=~[20~31~5;~16~37~3;~33~19~4;~31~18~7;~23~28~5; ~~~~~14~37~5;~31~17~8;~13~38~5;~27~24~5;~35~16~5]; disp('Call~for~oddsdf')
Example 5.12: Classication using frequency data
oddsf4~~~~~~~~~~~~~~%~Call~for~data~in~file~oddsf4.m Call~for~oddsdf~~~~~%~Prompt~built~into~data~file oddsdf~~~~~~~~~~~~~~%~Call~for~mprocedure~oddsdf Enter~matrix~A~of~frequencies~for~calibration~group~1~~A Enter~matrix~B~of~frequencies~for~calibration~group~2~~B Number~of~questions~=~10 Answers~per~question~=~3 ~Enter~code~for~answers~and~call~for~procedure~"odds" y~=~1;~~~~~~~~~~~~~~%~Use~of~lower~case~for~easier~writing n~=~2; u~=~3; odds~~~~~~~~~~~~~~~~%~Call~for~calculating~procedure Enter~profile~matrix~E~~[y~n~y~n~y~u~n~u~y~u]~~~%~First~profile Odds~favoring~Group~1:~~~5.845 Classify~in~Group~1 odds~~~~~~~~~~~~~~~~%~Second~call~for~calculating~procedure Enter~profile~matrix~E~~[n~n~u~n~y~y~u~n~n~y]~~~%~Second~profile Odds~favoring~Group~1:~~~0.2383 Classify~in~Group~2
122
CHAPTER 5. CONDITIONAL INDEPENDENCE
odds~~~~~~~~~~~~~~~~%~Third~call~for~calculating~procedure Enter~profile~matrix~E~~[y~y~n~y~u~u~n~n~y~y]~~~%~Third~profile Odds~favoring~Group~1:~~~5.05 Classify~in~Group~1
The principal feature of the mprocedure odds is the scheme for selecting the numbers from the and
B
A
matrices.
If
E = [yynyuunnyy],
used internally.
then the coding translates this into the actual numerical
matrix
[1 1 2 1 3 3 2 2 1 1]
elements of
E. Thus
Then
A (:, E)
is a matrix with columns corresponding to
e~=~A(:,E) e~=~~~42~~~~42~~~~22~~~~42~~~~~5~~~~~5~~~~22~~~~22~~~~42~~~~42 ~~~~~~34~~~~34~~~~27~~~~34~~~~~8~~~~~8~~~~27~~~~27~~~~34~~~~34 ~~~~~~15~~~~15~~~~45~~~~15~~~~~9~~~~~9~~~~45~~~~45~~~~15~~~~15 ~~~~~~19~~~~19~~~~44~~~~19~~~~~6~~~~~6~~~~44~~~~44~~~~19~~~~19 ~~~~~~22~~~~22~~~~43~~~~22~~~~~4~~~~~4~~~~43~~~~43~~~~22~~~~22 ~~~~~~41~~~~41~~~~13~~~~41~~~~15~~~~15~~~~13~~~~13~~~~41~~~~41 ~~~~~~~9~~~~~9~~~~52~~~~~9~~~~~8~~~~~8~~~~52~~~~52~~~~~9~~~~~9 ~~~~~~40~~~~40~~~~26~~~~40~~~~~3~~~~~3~~~~26~~~~26~~~~40~~~~40 ~~~~~~48~~~~48~~~~12~~~~48~~~~~9~~~~~9~~~~12~~~~12~~~~48~~~~48 ~~~~~~20~~~~20~~~~37~~~~20~~~~12~~~~12~~~~37~~~~37~~~~20~~~~20
The
i th entry on the i th column is the count corresponding to the answer to the i th question. A.
The element on the diagonal in the third column of
For
example, the answer to the third question is N (no), and the corresponding count is the third entry in the N (second) column of
A (:, E)
is
the third element in that column, and hence the desired third entry of the N column. By picking out the elements on the diagonal by the command diag(A(:,E)), we have the desired set of counts corresponding to the prole. The same is true for diag(B(:,E)). Sometimes the data are given in terms of conditional probabilities and probabilities. A slight
modication of the procedure handles this case. For purposes of comparison, we convert the problem above to this form by converting the counts in matrices
A and B to conditional probabilities.
We do
this by dividing by the total count in each group (69 and 56 in this case). Also,
P (G1 ) = 69/125 =
0.552
and
P (G2 ) = 56/125 = 0.448.
GROUP 1
Q 1 2 3 4 5 6 7 8 9 10 Yes 0.6087 0.4928 0.2174 0.2754 0.3188 0.5942 0.1304 0.5797 0.6957 0.2899
P (G1 ) = 69/125
No 0.3188 0.3913 0.6522 0.6376 0.6232 0.1884 0.7536 0.3768 0.1739 0.5362 Unc. 0.0725 0.1159 0.1304 0.0870 0.0580 0.2174 0.1160 0.0435 0.1304 0.1739
GROUP 2
Yes 0.3571 0.2857 0.5893 0.5536 0.4107 0.2500 0.5536 0.2321 0.4821 0.6250 No
P (G2 ) = 56/125
Unc. 0.0893 0.0536 0.0714 0.1250 0.0893 0.0893 0.1428 0.0893 0.0893 0.0893
0.5536 0.6607 0.3393 0.3214 0.5000 0.6607 0.3036 0.6786 0.4286 0.2857
123
Table 5.6
These data are in an mle
oddsp4.m.
The modied setup mprocedure
oddsdp uses the condi
tional probabilities, then calls for the mprocedure odds.
Example 5.13: Calculation using conditional probability data
oddsp4~~~~~~~~~~~~~~~~~%~Call~for~converted~data~(probabilities) oddsdp~~~~~~~~~~~~~~~~~%~Setup~mprocedure~for~probabilities Enter~conditional~probabilities~for~Group~1~~A Enter~conditional~probabilities~for~Group~2~~B Probability~p1~individual~is~from~Group~1~~0.552 ~Number~of~questions~=~10 ~Answers~per~question~=~3 ~Enter~code~for~answers~and~call~for~procedure~"odds" y~=~1; n~=~2; u~=~3; odds Enter~profile~matrix~E~~[y~n~y~n~y~u~n~u~y~u] Odds~favoring~Group~1:~~5.845 Classify~in~Group~1
The slight discrepancy in the odds favoring Group 1 (5.8454 compared with 5.8452) can be attributed to rounding of the conditional probabilities to four places. The presentation above rounds the results to 5.845 in each case, so the discrepancy is not apparent. This is quite acceptable, since the discrepancy has no eect on the results.
5.3 Problems on Conditional Independence3
Exercise 5.1
Suppose
(Solution on p. 129.)
and
{A, B} ci C
{A, B} ci C c , P (C) = 0.7,
and
P (AC) = 0.4 P (BC) = 0.6 P (AC c ) = 0.3 P (BC c ) = 0.2
Show whether or not the pair
(5.47)
{A, B}
is independent.
Exercise 5.2
Suppose
(Solution on p. 129.)
and
{A1 , A2 , A3 } ci C
ci C c ,
with
P (C) = 0.4,
and
P (Ai C) = 0.90, 0.85, 0.80
Determine the posterior odds
P (Ai C c ) = 0.20, 0.15, 0.20 P (CA1 Ac A3 ) /P 2 (C
c
for
i = 1, 2, 3,
respectively
(5.48)
A1 Ac A3 ). 2
(Solution on p. 129.)
Each has a good chance to break
Exercise 5.3
Five world class sprinters are entered in a 200 meter dash.
the current track record. There is a thirty percent chance a late cold front will move in, bringing conditions that adversely aect the runners. Otherwise, conditions are expected to be favorable for an outstanding race. Their respective probabilities of breaking the record are:
•
3 This
Good weather (no front): 0.75, 0.80, 0.65, 0.70, 0.85
content is available online at <http://cnx.org/content/m24205/1.4/>.
124
CHAPTER 5. CONDITIONAL INDEPENDENCE
•
Poor weather (front in): 0.60, 0.65, 0.50, 0.55, 0.70
The performances are (conditionally) independent,
given good weather,
and also,
given poor
weather. What is the probability that three or more will break the track record?
Hint.
If
B3
is the event of three or more,
P (B3 ) = P (B3 W ) P (W ) + P (B3 W c ) P (W c ).
(Solution on p. 129.)
The alarm is given if three or more of
Exercise 5.4
A device has ve sensors connected to an alarm system.
the sensors trigger a switch. If a dangerous condition is present, each of the switches has high (but not unit) probability of activating; if the dangerous condition does not exist, each of the switches has low (but not zero) probability of activating (falsely). Suppose condition and Suppose suppose
D=
the event of the dangerous
A =
the event the alarm is activated.
Ei =
the event the
i th
Proper operation consists of
AD
Ac D c .
unit is activated.
Since the switches operate independently, we
{E1 , E2 , E3 , E4 , E5 } ci D
and
ci Dc
(5.49)
Assume the conditional probabilities of the E1 , given D, are 0.91, 0.93, 0.96, 0.87, 0.97, and given Dc , are 0.03, 0.02, 0.07, 0.04, 0.01, respectively. If P (D) = 0.02, what is the probability the alarm system acts properly? Suggestion. Use the conditional independence and the procedure ckn.
Exercise 5.5
dently; however, the likelihood of completion depends upon the weather.
(Solution on p. 129.)
If the weather is very
Seven students plan to complete a term paper over the Thanksgiving recess. They work indepen
pleasant, they are more likely to engage in outdoor activities and put o work on the paper. Let be the event the during the
Ei i th student completes his or her paper, Ak be the event that k or more complete recess, and W be the event the weather is highly conducive to outdoor activity. It is
{Ei : 1 ≤ i ≤ 7} ci W
and
reasonable to suppose
ci W c .
Suppose
P (Ei W ) = 0.4, 0.5, 0.3, 0.7, 0.5, 0.6, 0.2 P (Ei W c ) = 0.7, 0.8, 0.5, 0.9, 0.7, 0.8, 0.5
respectively, and their papers and
(5.50)
(5.51)
P (W ) = 0.8. P (A5 ) that ve
Determine the probability or more nish.
P (A4 )
that four our more complete
Exercise 5.6
(Solution on p. 129.)
A manufacturer claims to have improved the reliability of his product. Formerly, the product had probability 0.65 of operating 1000 hours without failure. The manufacturer claims this probability is now 0.80. A sample of size 20 is tested. Determine the odds favoring the new probability for How many
various numbers of surviving units under the assumption the prior odds are 1 to 1. survivors would be required to make the claim creditable?
Exercise 5.7
(Solution on p. 130.)
A real estate agent in a neighborhood heavily populated by auent professional persons is working with a customer. The agent is trying to assess the likelihood the customer will actually buy. His experience indicates the following: if
H
is the event the customer buys,
is a professional with good income, and
E
S
is the event the customer
is the event the customer drives a prestigious car, then
P (S) = 0.7 P (SH) = 0.90 P (SH c ) = 0.2 P (ES) = 0.95 P (ES c ) = 0.25
reasonable to suppose
(5.52)
Since buying a house and owning a prestigious car are not related for a given owner, it seems
P (EHS) = P (EH c S)
and
P (EHS c ) = P (EH c S c ).
The customer
drives a Cadillac. What are the odds he will buy a house?
125
Exercise 5.8
geophysical survey.
(Solution on p. 130.)
On the basis of past experience, the decision makers feel the odds are about Various other probabilities can be assigned on the basis of past
In deciding whether or not to drill an oil well in a certain location, a company undertakes a
four to one favoring success. experience. Let
• H be the event that a well would be successful • S be the event the geological conditions are favorable • E be the event the results of the geophysical survey are
The initial, or prior, odds are
positive
P (H) /P (H c ) = 4. P (SH c ) = 0.20
Previous experience indicates
P (SH) = 0.9
P (ES) = 0.95
P (ES c ) = 0.10
(5.53)
Make reasonable assumptions based on the fact that the result of the geophysical survey depends upon the geological formations and not on the presence or absence of oil. The result of the survey is favorable. Determine the posterior odds
P (HE) /P (H c E).
(Solution on p. 130.)
Exercise 5.9
A software rm is planning to deliver a custom package. Past experience indicates the odds are at least four to one that it will pass customer acceptance tests. As a check, the program is sub jected to two dierent benchmark runs. Both are successful. Given the following data, what are the odds favoring successful operation in practice? Let
• • • •
H be the event the performance is satisfactory S be the event the system satises customer acceptance tests E1 be the event the rst benchmark tests are satisfactory. E2 be the event the second benchmark test is ok.
{H, E1 , E2 } ci S
and
Under the usual conditions, we may assume
ci S c .
Reliability data show
P (HS) = 0.95, P (HS c ) = 0.45 P (E1 S) = 0.90 P (E1 S c ) = 0.25 P (E2 S) = 0.95 P (E2 S c ) = 0.20
Determine the posterior odds
(5.54)
(5.55)
P (HE1 E2 ) /P (H c E1 E2 ).
(Solution on p. 130.)
Exercise 5.10
A research group is contemplating purchase of a new software package to perform some specialized calculations. The systems manager decides to do two sets of diagnostic tests for signicant bugs that might hamper operation in the intended application. The tests are carried out in an operationally independent manner. The following analysis of the results is made.
• • • •
H = the event the program is satisfactory for the intended S = the event the program is free of signicant bugs E1 = the event the rst diagnostic tests are satisfactory E2 = the event the second diagnostic tests are satisfactory
application
Since the tests are for the presence of bugs, and are operationally independent, it seems reasonable to assume the
{H, E1 , E2 } ci S and {H, E1 , E2 } ci S c . Because of the reliability of the software company, manager thinks P (S) = 0.85. Also, experience suggests P (HS) = 0.95 P (HS c ) = 0.30 P (E1 S) = 0.90 P (E1 S c ) = 0.20 P (E2 S) = 0.95 P (E2 S c ) = 0.25
126
CHAPTER 5. CONDITIONAL INDEPENDENCE
Table 5.7
Determine the posterior odds favoring
H
if results of both diagnostic tests are satisfactory.
Exercise 5.11
A company is considering a new product now undergoing eld testing. Let
(Solution on p. 131.)
• H be the event the product is introduced and successful • S be the event the R&D group produces a product with the desired characteristics. • E be the event the testing program indicates the product is satisfactory
The company assumes
P (S) = 0.9
and the conditional probabilities
P (HS) = 0.90 P (HS c ) = 0.10
to suppose
P (ES) = 0.95 P (ES c ) = 0.15 P (HE) /P (H c E).
(5.56)
Since the testing of the merchandise is not aected by market success or failure, it seems reasonable
{H, E} ci S
and
ci S c .
The eld tests are favorable. Determine
Exercise 5.12
(Solution on p. 131.)
She
Martha is wondering if she will get a ve percent annual raise at the end of the scal year.
understands this is more likely if the company's net prots increase by ten percent or more. These will be inuenced by company sales volume. Let
• H = the event she will get the raise • S = the event company prots increase by ten percent or more • E = the event sales volume is up by fteen percent or more
Since the prospect of a raise depends upon prots, not directly on sales, she supposes and
{H, E} ci S
{H, E} ci S c .
She thinks the prior odds favoring suitable prot increase is about three to one.
Also, it seems reasonable to suppose
P (HS) = 0.80 P (HS c ) = 0.10
Martha will get her raise?
P (ES) = 0.95 P (ES c ) = 0.10
(5.57)
End of the year records show that sales increased by eighteen percent.
What is the probability
Exercise 5.13
independent advice of three specialists. Let
(Solution on p. 131.)
A physician thinks the odds are about 2 to 1 that a patient has a certain disease.
H
He seeks the
be the event the disease is present, and
A, B, C
be
the events the respective consultants agree this is the case. majority. suppose
The physician decides to go with the
Since the advisers act in an operationally independent manner, it seems reasonable to and
{A, B, C}ciH
ciH c .
Experience indicates
P (AH) = 0.8, P (BH) = 0.7, P (CH) = 0.75 P (Ac H c ) = 0.85, P (B c H c ) = 0.8, P (C c H c ) = 0.7
present, and does not if two or more think the disease is not present)?
(5.58)
(5.59)
What is the probability of the right decision (i.e., he treats the disease if two or more think it is
Exercise 5.14
(Solution on p. 131.)
A software company has developed a new computer game designed to appeal to teenagers and young adults. It is felt that there is good probability it will appeal to college students, and that if it appeals to college students it will appeal to a general youth market. To check the likelihood of appeal to college students, it is decided to test rst by a sales campaign at Rice and University of Texas, Austin. The following analysis of the situation is made.
Should the well be drilled? Exercise 5. E2 be the events the surveys indicate a salt dome. E2 } ci S and ci S c .16 membership of each sub ject in the sample is known. and the tests are carried out in an operationally independent manner. oil deposits are highly likely to be associated with underground H dome in the area. not the presence of oil.1. it seems reasonable to assume previous experience in game sales. Data on the reliability of the surveys yield the following probabilities P (E1 S) = 0.95 P (E2 S c ) = 0. S is the event of a salt Company executives believe the odds favoring oil in the area is at least 1 in 10.80.) is the event that an oil deposit is present in an area. in the sense that the answer to any one is not aected by the answer to any other.15 salt domes.10 (5. 131. and In a region in the Gulf Coast area. It decides to conduct two independent geophysical surveys for the presence of a salt dome.8 its P (HS) = 0. questions designed to be independent. (Solution on p. it seems reasonable to assume {H. Because the surveys are tests for the geological structure. Austin good Since the tests are for the reception are at two separate universities and are operationally independent. 131.90 P (E1 S ) = 0. Data on the results are summarized in the following table: GROUP 1 (84 members) Q 1 2 3 4 5 6 7 8 9 10 Yes 51 42 19 24 27 49 16 47 55 24 No 26 32 54 53 52 19 59 32 17 53 Unc 7 10 11 7 5 16 9 5 12 7 GROUP 2 (66 members) Yes 27 19 39 38 28 19 37 19 27 39 No 34 43 22 19 33 41 21 42 33 21 Unc 5 4 5 9 5 6 8 5 6 6 . E1 . E1 .05 Determine the posterior odds P (E2 S) = 0.9 and P (SH c ) = 0. the {H.95 P (HS ) = 0.30 c P (E2 S) = 0.127 • • • • H = the event the sales to the general market will be S = the event the game appeals to college students E1 = the event the sales are good at Rice E2 = the event the sales are good at UT.60) P (HE1 E2 ) P (H c E1 E2 ) . Also. experience suggests P (E1 S) = 0. E2 } ci S c . The subjects answer independently. experience indicates P (SH) = 0. E1 .) The subgroup Each individual is asked a battery of ten A sample of 150 sub jects is taken from a population which has two subgroups.25 c Determine the posterior odds favoring H if sales results are satisfactory at both schools.95 P (E1 S c ) = 0. Because of managers think P (S) = 0. E2 } ci S and {H. Exercise 5. Let E1 .90 P (E2 S c ) = 0. If (Solution on p.20 Table 5.
n. u. n. y. y.09 Table 5.14 0. n. y. u. n. y.59 No 0. u.50 0.20 0. u. y.08 0.38 0. y.09 0. n. y.06 0.23 0.62 0.42 0.08 0. u.12 0.63 0. n.17 follows (probabilities are rounded to two decimal places). n.08 0.64 0. n. y. as GROUP 1 Q 1 2 3 4 5 6 7 8 9 10 Yes 0. u. u. n.08 0.32 0.59 0.128 CHAPTER 5. n. y.16. CONDITIONAL INDEPENDENCE Table 5. y.41 0. (Solution on p. y.70 0.31 0. n. n. u.33 0.29 0.65 0.29 0. u ii.65 0. u ii.56 No 0.57 0. y. n. n.08 0. y. u. y . y.44 Unc 0. y.08 GROUP 2 Yes 0.38 0. y. n.32 0. so that the data may be used to calculate probabilities and conditional probabilities. u. y Exercise 5.50 0. n. y iii.08 0.12 0. are converted to conditional probabilities and probabilities. u.) The data of Exercise 5.29 0.23 0.63 Unc 0.06 0. n.41 0. y. n. i.19 0.19 0.56 0.63 0.15 0. n. Several persons are interviewed.61 0. above. n.11 0. n.58 0.10 For the following proles classify each individual in one of the subgroups. u.29 0. y iii.13 0.56 0.9 Assume the data represent the general population consisting of these two groups. n. The result of each interview is a prole of answers to the questions. The goal is to classify the person in one of the two subgroups For the following proles. y. y.09 0.06 0. classify each individual in one of the subgroups i. y.32 P (G2 ) = 0.29 0.51 0.62 0.50 0. 132.29 P (G1 ) = 0.
4)*0. 124) E1 be the event the probability is 0.85 • 0.9 • 0.3 (p.7 + 0.8 + ckn(PWc.5)*0. 124) PWc PA4 PA4 PA5 PA5 Let PW = 0.4*0.9997 Solution to Exercise 5.4 (p.6*0.3*0.6 (p.02 + (1 .6*0. not independent Solution to Exercise 5.63) . = 0.62) Solution to Exercise 5. and P (AB) = PA = 0. Assume P (E1 ) /P (E2 ) = 1. = ckn(PW.2 (p.1776 PAB = 0.8353 Solution to Exercise 5. P (AC) P (BC) P (C) + P (AC c ) P (BC c ) P (C c ).ckn(P2.5 (p. P = ckn(P1.2*0.3 PAB = 0.4*0.3)*0.98 P = 0.80 108 • = = 2. 123) PW = 0.01*[60 65 50 55 70].2 = 0.3700 PB = 0.7 + 0.01*[91 93 96 87 97].80 and E2 be the event the probability is 0.3 PA = 0.2 = 0.1*[4 5 3 7 5 6 2].129 Solutions to Exercises in Chapter 5 Solution to Exercise 5.1 (p.7 + ckn(PWc. P2 = 0.3)*0.1*[7 8 5 9 7 8 5]. P (E1 ) P (Sn = kE1 ) P (E1 Sn = k) = • P (E2 Sn = k) P (E2 ) P (Sn = kE2 ) (5.7 + 0.12 0.20 51 (5. P = ckn(PW.4800 PA*PB ans = 0.3)*0.3))*0.8 + ckn(PWc.5)*0. 123) P (A1 C) P (Ac C) P (A3 C) P (C) P (CA1 Ac A3 ) 2 2 c A ) = P (C c ) • P (A C c ) P (Ac C c ) P (A C c ) P (C c A1 A2 3 1 3 2 = 0.4993 = ckn(PW. 123) P (A) = P (AC) P (C) + P (AC c ) P (C c ) .2*0.65.61) (5.4 0. P (B) = P (BC) P (C) + P (BC c ) P (C c ).1860 % PAB not equal PA*PB.01*[3 2 7 4 1].3 P = 0.3*0.2482 Solution to Exercise 5. PWc = 0.4)*0.01*[75 80 65 70 85].20 • 0.3 PB = 0.6 0.15 • 0. 124) P1 = 0.
65) so that P (HS) 5 0.30 • 0.90•0./ibinom(20. E} ci S and ci S c .20 (5.6372 15.85 • 0..73) P (HE1 E2 ) 0.7 (p.95+0..9 45 = • = c S) P (H 2 0. 125) P (HE1 E2 ) P (HE1 E2 S) + P (HE1 E2 S c ) = c E E ) P (H P (H c E1 E2 S) + P (H c E1 E2 S c ) 1 2 P (HE1 E2 S) = P (S) P (HS) P (E1 SH) P (E2 SHE1 ) = P (S) P (HS) P (E1 S) P (E2 S) with similar expressions for the other terms.0000 0..95 + 0.13.7121 19.3663 18.10 (5.20•0.05 • 0.20 0.90 • 0.05•0.8148 0.20•0.95+0.70) = 0. (5. 125) P (HE) P (H) P (SH) P (ES) + P (S c H) P (ES c ) = • P (H c E) P (H c ) P (SH c ) P (ES) + P (S c H c ) P (ES c ) =4• Solution to Exercise 5.70 • 0.6111 Solution to Exercise 5.15 • 0.0000 2.71) Solution to Exercise 5..25 • 0.9558 17.66) Solution to Exercise 5.10 (p.0000 0.10 • 0.20 = 16.15 • 0.90•0.0000 13.3723 % Need at least 15 or 16 successes 16.45•0. 125) (5.0.65.64) P (H) P (SH) P (HS) = c S) P (H P (H c ) P (SH c ) P (S) = P (H) P (SH) + [1 − P (H)] P (SH c ) P (H) = P (S) − P (SH c ) = 5/7 P (SH) − P (SH c ) which implies (5.67) 0.95 + 0.25•0.odds]') ...95 + 0. 124) Assumptions amount to {H.5337 20.2958 14.69) = P (S)P (HS)P (E1 S)P (E2 S)+P (S c )P (HS c )P (E1 S c )P (E2 S c ) P (S)P (H c S)P (E1 S)P (E2 S)+P (S c )P (H c S c )P (E1 S c )P (E2 S c ) (5.0000 29. odds = ibinom(20.6555 P (H c E1 E2 ) 0.85 • 0.20 = = 16.8 (p.k).74) .80•0. disp([k.25 ∗ 0.95•0. (5.80 • 0. CONDITIONAL INDEPENDENCE k = 1:20.0000 1.80•0.72) (5...20 • 0.90 • 0.0000 63.55•0.64811 (5..68) P (HE1 E2 ) P (HE1 E2 S) + P (HE1 E2 S c ) = P (H c E1 E2 ) P (H c E1 E2 S) + P (H c E1 E2 S c ) (5..k).130 CHAPTER 5.9 (p.2 4 (5.10 = 12.95 + 0.0.80.90 • 0.95 • 0.25•0.0000 6.
25 • 0.25 0.75 • 0. 28 33 5.20•0.15 (p.05 • 0.95+0. 127) (5.01*[85 80 70].15 (5. 19 43 4. 24 53 7].90 • 0.78) Solution to Exercise 5.10 = • = 0. 42 32 10.8556 P (H c E1 E2 ) 10 0.05 • 0.95 + 0.90 • 0.75 • 0.90•0.10 = 3.95 • 0.10 • 0.30•0.90 • 0. 127) P (HE1 E2 ) P (HE1 E2 S) + P (HE1 E2 S c ) = P (H c E1 E2 ) P (H c E1 E2 S) + P (H c E1 E2 S c ) P (HE1 E2 S) = P (H) P (SH) P (E1 SH) P (E2 SHE1 ) = P (H) P (SH) P (E1 S) P (E2 S) with similar expressions for the other terms. 126) P (HE) P (S) P (HS) P (ES) + P (S c ) P (HS c ) P (ES c ) = P (H c E) P (S) P (H c S) P (ES) + P (S c ) P (H c S c ) P (ES c ) = 0.79) = P (S)P (HS)P (E1 S)P (E2 S)+P (S c )P (HS c )P (E1 S c )P (E2 S c ) P (S)P (H c S)P (E1 S)P (E2 S)+P (S c )P (H c S c )P (E1 S c )P (E2 S c ) (5.2)*(1 .9 • 0.10 • 0. .80 • 0.11 (p.90•0.81) Solution to Exercise 5.10 • 0.8. 19 54 11.76) Solution to Exercise 5. 55 17 12.4697 0.90 • 0.m (Section~17.20•0. 38 19 9. 27 52 5.10 Solution to Exercise 5. (5.10 • 0.90 • 0.20•0.15 = 7. B = [27 34 5.25 • 0.95 + 0.20•0.01*[80 70 75].95 + 0.95•0.10 • 0.83) P (HE1 E2 ) 1 0.25 = 15.1 • 0.131 Solution to Exercise 5. 126) P (HE) P (S) P (HS) P (ES) + P (S c ) P (HS c ) P (ES c ) = c E) P (H P (S) P (H c S) P (ES) + P (S c ) P (H c S c ) P (ES c ) = 0.14 (p.82) (5. 16 59 9. 126) PH = 0.20 • 0.90 + 0.05•0.75) (5. 47 32 5.13 (p.10 • 0.95 • 0. P = ckn(PH.16 A = [51 26 7.70•0.80) = 0.12 (p.16 (p.80•0.95+0.90 • 0.10 (5.25: mpr05_16) % Data for Exercise~5. PHc = 0.7879 0. 39 22 5. pH = 2/3.2)*pH + ckn(PHc.90 + 0.80•0.8577 Solution to Exercise 5. 49 19 16. 24 53 7.pH) P = 0.8447 (5.77) (5.84) % file npr05_16. 126) P (HE1 E2 S) + P (HE1 E2 S c ) P (HE1 E2 ) = P (H c E1 E2 ) P (H c E1 E2 S) + P (H c E1 E2 S c ) (5.95 + 0.
70 0.08 .65 0.31 0.33 0.09 0.64 0.19 0. u = 3.08].11 0.20 0.06 0.38 0.08 0.29 0.32 0.15 0.57 0.56 0.08 0.08 0. 37 21 8.19 0.56 0. n = 2.2693 Classify in Group 2 odds Enter profile matrix E [y y n y u u n n y y] Odds favoring Group 1: 5.42 0. B = [0. odds Enter profile matrix E [y n y n y u n u y u] Odds favoring Group 1: 3.12 0.743 Classify in Group 1 odds Enter profile matrix E [n n u n y y u n n y] Odds favoring Group 1: 0.06 0.17 (p.61 0.08 0.62 0.63 0. 128) npr05_17 (Section~17.23 0.65 0.29 0.13 0.59 0.51 0.29 0. CONDITIONAL INDEPENDENCE 19 41 6.08 0. 19 42 5.62 0.132 CHAPTER 5.06 0. disp('Call for oddsdf') npr05_16 (Section~17.50 0.m (Section~17.8.32 0.23 0.14 0. A = [0.26: npr05_17) % file npr05_17.50 0.8. 27 33 6.12 0.29 0.63 0.25: mpr05_16) Call for oddsdf oddsdf Enter matrix A of frequencies for calibration group 1 A Enter matrix B of frequencies for calibration group 2 B Number of questions = 10 Answers per question = 3 Enter code for answers and call for procedure "odds" y = 1.29 0. PG2 = 66/125.64 0. 39 21 6].17 PG1 = 84/150.58 0.286 Classify in Group 1 Solution to Exercise 5.38 0.29 0.41 0.26: npr05_17) % Data for Exercise~5.8.
50 0.2603 Classify in Group 2 odds Enter profile matrix E [y y n y u u n n y y] Odds favoring Group 1: 5.59 0.486 Classify in Group 1 odds Enter profile matrix E [n n u n y y u n n y] Odds favoring Group 1: 0.41 0. odds Enter profile matrix E [y n y n y u n u y u] Odds favoring Group 1: 3. disp('Call for oddsdp') Call for oddsdp oddsdp Enter matrix A of conditional probabilities for Group 1 Enter matrix B of conditional probabilities for Group 2 Probability p1 an individual is from Group 1 PG1 Number of questions = 10 Answers per question = 3 Enter code for answers and call for procedure "odds" y = 1.133 0.09].162 Classify in Group 1 A B . n = 2. u = 3.09 0.32 0.
CONDITIONAL INDEPENDENCE .134 CHAPTER 5.
Recall that a realvalued function on a on the real line) is characterized by the assignment of a real number (argument) in the domain.Chapter 6 Random Variables and Probabilities 6.org/content/m23260/1. In many nonnumerical cases. For example. For a realvalued function of a real variable. In the chapter "Random Vectors and Joint Distributions" (Section 8. 6. y to each Ω and whose range is a subset of the real line domain (say an interval element x I R. Random variable X. Except in special cases. although several ω The ob ject point ω is mapped. 135 . from the in visualizing the domain Ω to the codomain R. Each ω is mapped into exactly one t. In a Bernoulli trial. each outcome is characterized by a number. In each case.1). Often. The experiment is performed. it is often possible to write a formula or otherwise state a rule describing the assignment of the value to each argument. we extend the notion to vectorvalued random quantites. n component trials.e. point There are various ways of characterizing a function. the associated number becomes a property of the outcome. t = X (ω). 1 This content is available online at <http://cnx. The fundamental idea of a real random variable is the assignment of a real number to each elementary outcome whose domain is ω in the basic space Ω. Such an assignment amounts to determining a function X.1 Introduction on any trial. Probability associates with an event a number which indicates the likelihood of the occurrence of that event An event is modeled as the set of those possible outcomes of an experiment which satisfy a property or proposition characterizing the event. In a sequence of trials. Mappings and inverse mappings mapping R.1 Random Variables and Probabilities1 6. as a mapping from basic space Ω to the real line ω a value Probably the most useful for our purposes is as a assigns to each element t.1. the size of that quantity (in prescribed units) is the entity actually observed.8/>. it is convenient to assign a number to each outcome. in a coin ipping experiment. random variables share some important general properties of functions which play an essential role in determining their usefulness. However. If the outcome is observed as a physical quantity. a success may be represented by a 1 and a failure by a 0. realvalued random variables). One could assign a distinct number to each card in a deck Observations of the result of selecting a card could be recorded in terms of individual numbers.2 Random variables as functions We consider in this chapter real random variables (i. we may be interested in the number of successes in a sequence of of playing cards. We nd the mapping diagram of Figure 1 extremely useful essential patterns. a head may be represented by a 1 and a tail by a 0. into the image may have the same image point..1. or carried. we cannot write a formula for a random variable X.
is an event for each M. the inverse image is the entire basic space Ω. the inverse image is the empty set (impossible event). By the inverse image of M under the mapping X. If X does not take a value in M. Let M be a set of numbers on the real line. the results of measure theory ensure that we may make the assumption for any X and any subset M of the real line likely to be encountered in practice.2: E is the inverse image X −1 (M ). it may be assigned a probability. As an event.136 CHAPTER 6.1: The basic mapping diagram t = X (ω). (the set of all possible values of X ). A detailed examination of that measure theory. inverse mapping X −1 and the inverse images it produces. Fortunately. RANDOM VARIABLES AND PROBABILITIES Figure 6. The set X −1 (M ) is the event that X takes a value in M.1) X (M ). we mean the set of all those ω ∈ Ω which are mapped into M by X (see Figure 2). a subset of Ω. . If M includes the range of X. −1 assertion is a topic in Figure 6. Formally we write Associated with a function as a mapping are the X X −1 (M ) = {ω : X (ω) ∈ M } Now we assume the set (6.
Then. Preservation of set operations Let X be a mapping from to the real line R. and the sure event Ω. then X −1 (M c ) = E c . are sets of real numbers.3. .3) Before considering further examples. 2.1: Some illustrative examples. Ei . but 1 is not. k) pk (1 − p) n−k 0≤k≤n (6.2) P ({ω : X (ω) = 1}) = p. i ∈ J . We state it in terms of a random variable. P (X = 0) = 1−p. with respective inverse images E. X −1 i∈J Mi = i∈J Ei and X −1 i∈J Mi = i∈J Ei (6. This rather ungainly notation is shortened to P (X = 1) = p. 1. then X (M ) = Ω In this case the class of all events X −1 (M ) c consists of event E. X = IE where The event that E is an event with probability p. according to the analysis in the section "Bernoulli Trials and the Binomial Distribution" (Section 4. the impossible event ∅. then X −1 (M ) = E −1 If both 1 and 0 are in M. 0 and 1. Consider any set M. Let Sn be the random n component trials. with probability p of success. t and that the inverse image of M Central to the structure are the facts that each element is the set of all ω is mapped those ω which are mapped into Figure 6. Formal proofs amount to careful reading of the notation. but 0 is not. If neither 1 nor 0 is in M. then X (M ) = E c If 1 is in M. its complement E . If M.137 Example 6. Similarly. X take on the value 1 is the set Now X takes on only two values. Mi .2: Bernoulli trials and the binomial distribution) P (Sn = k) = C (n. then X −1 (M ) = ∅ −1 If 0 is in M. {ω : X (ω) = 1} = X −1 ({1}) = E so that (6. we note a general property of inverse images. Consider a sequence of variable whose value is the number of successes in the sequence of n Bernoulli trials. which maps Ω Ω to the real line (see Figure 3). into only one image point image points in M.3: Preservation of set operations by inverse images.4) Examination of simple graphical examples exhibits the plausibility of these patterns.
we are limited in the kinds of calculations that can be performed meaningfully with the probabilities on the basic space. RANDOM VARIABLES AND PROBABILITIES An easy. since we cannot have ω ∈ Ak and ω ∈ Aj . we consider useful ways to describe the probability distribution induced by a random variable. Let Ak = {ω : S10 (ω) = k}. R We have achieved a pointbypoint transfer of the probability apparatus to the real line in such a manner that we can make calculations about the random variable PX the probability measure induced byX. in some detail.1. When the structure and properties of simple random variables have .138 CHAPTER 6. for any ω .1). We call X. Its importance lies in the fact that to determine the likelihood that random quantity X will take on a value in set M. P (X ∈ M ) = PX (M ).33. We now think of the mapping from Ω to R as a producing a pointbypoint transfer of PX (M ) ≥ 0 ∞ and PX (R) = P (Ω) = 1. In the chapter "Distribution and Density Functions" (Section 7. technical sense. Thus. These are called simple random variables.4 Simple random variables We consider. We turn rst to a special class of random variables. N is a disjoint union of the separate Example 6. j = k (i.3 Mass transfer and induced probability distribution Because of the abstract nature of the basic space and the class of events.7) This means that PX has the properties of a probability measure dened on the subsets of the real line. Some results of measure theory show that this probability is dened uniquely on a class of subsets of that includes any set normally encountered in applications. By the additivity of probability. we cannot have two values for Sn (ω)). which we usually shorten to Ak = {S10 = k}. This implies that the inverse of a disjoint union of Mi M. consequence of the general patterns is that the inverse images of disjoint are also disjoint. Bernoulli trials. 8 P (2 < S10 ≤ 8) = P (S10 = k) = 0. For one thing.e. Thus the term simple is used in a special. ∞ And because of the preservation of set operations by the inverse mapping ∞ ∞ ∞ PX i=1 Mi =P X −1 i=1 Mi =P i=1 X −1 (Mi ) = i=1 P X −1 (Mi ) = i=1 PX (Mi ) (6. again.2: Events determined by a random variable n = 10 and p = 0.5) S10 takes on a value greater than 2 but no greater than 8 i it takes one of the integer values from 3 to 8. Now..6) 6. any random variable may be approximated as closely as pleased by a simple random variable. Now the Ak form a partition. In addition. inverse images. in practice we can distinguish only a nite set of possible values for any random variable. This may be done as follows: −1 To any set M on the real line assign probability mass PX (M ) = P X (M ) It is apparent that and minterm maps. We represent probability as mass distributed on the basic space and visualize this with the aid of general Venn diagrams the probability mass to the real line. Suppose we want to determine the probability P (2 < S10 ≤ 8). The importance of simple random variables rests on two facts. the random variable Sn which counts the number of successes in a sequence of {2 < S10 ≤ 8} = A3 since A4 A5 A6 A7 A8 (6. random variables which have only a nite set of possible values. but important.6927 k=3 (6. This transfer produces what is called the probability distribution for X. Let n Consider. we determine how much induced probability mass is in the set M. 6.1.
k) pk (1 − p) n−k . but m j=1 Cj = Ω. · · · . canonical form must be n Sn = k=0 k IAk with P (Ak ) = C (n. 1. displays directly the range (i. we include possible null values.e. Many properties of simple random variables extend to the general case via the approximation procedure. it {tk : 1 ≤ k ≤ n} paired pk = P (Ak ) = P (X = tk ). tn } and if with respective probabilities {p1 . Example 6. For some probability P (Ai ) = 0 for one or more of the ti . we must nd suitable ways to express them analytically. which displays the possible values and the corresponding events. and if the probabilities Note that in canonical form.8) Ai = {X = ti }. This is true for any ti .e. pn } (6.4: The indicator function). One of the advantages of the canonical form is that it displays the range (set of values). cm ICm . for they can only occur with probability zero. the distribution is determined. distributions it may be that ti has value zero. we may append the event Cm+1 = Cj and assign value zero to it. we include that term.4: Simple random variables in primitive form . since the representation is not unique. p2 .9) X (ω) = ti .139 been examined. · · · . A2 . if one of the null values. For one thing. In that case. Canonical form is a special primitive form. so the expression X are X (ω) for all ω . We do this with the aid of indicator functions (Section 1.10) For many purposes.. · · · . with the corresponding set of probabilities 2. t2 . set of values) of the random variable. The distinct set {t1 . An } one of these events occurs). Simple random variable X may be represented by a primitive form {Cj : 1 ≤ j ≤ m} is a partition (6. Canonical form is unique. then ω ∈ Ai . generated by) X. canonical form is desirable.11) X = c1 IC1 + c2 IC2 + · · · . Standard or canonical form.3. since they do not aect any probabilitiy calculations. both theoretical and practical. with the same value ci associated with each subset formed. Example 6. Probability pi = P (Ai ) and the corresponding calculations for made in terms of its distribution. The summation expression thus picks out the correct value represents ti . write We call this the partition determined by n is a partition (i. where Remarks • • • If {Cj : 1 ≤ j ≤ m} m j=1 c is a disjoint class. exactly (or. so that IAi (ω) = 1 and all the other indicator functions have value zero. These are not mutually exclusive representatons. If X takes on distinct values {t1 . · · · .. on any trial. 0≤k≤n (6. p2 . Any of the a primitive form. Representation with the aid of indicator functions In order to deal with simple random variables clearly and precisely. The set of values where distribution consists of the {pk : 1 ≤ k ≤ n}. We may X = t1 IA1 + t2 IA2 + · · · + tn IAn = i=1 If ti IAi (6. then {A1 . and in many ways normative. We say Ci may be partitioned. · · · .8). the general formulation.3: Successes in Bernoulli trials As the analysis of Bernoulli trials and the binomial distribution shows (see Section 4. t2 . for 1 ≤ i ≤ n. we call these values In are known. we turn to more general cases. and hence are practically impossible. tn } of the values probabilities {p1 . pn } constitute the distribution for X. Three basic forms of representation are encountered.
5.5IC8 3.0IC5 + 5. If Ei Sn which counts the number of successes in a sequence trial. $3. the player loses one dollar. 0.15. inverse images) determined by sure event Ω. the Any primitive form is a special ane form in which Ei often form an independent class. $5. $5. if the numbers 3. and with P (Ei ) = p 1 ≤ i ≤ n 1 ≤ i ≤ n. The prices are $3. Ai X consists of the impossible event in the partition determined by Hence.50. with IEi .e. ∅.12) A store has eight items for sale. D} is the partition determined X and the range of X is {−2.50.16) by {A. the integers 1 through 10. or 7 turn up. We commonly have (6. a linear combination of the indicator functions plus a constant. In fact. She purchases one of the items with probabilities 0. 0.00. the (6. the random variable of n Bernoulli trials.50. Ci be Each P (Ci ) = 0. The random variable expressing the amount of her purchase may be written X = 3.50. B..00. 0. In this case.0IC2 + 3.14) {E1 . Let the event the wheel stops at i. If the set J is the set of indices i such ti ∈ M . the player gains nothing. and $7. or 9 turn up. and the Ei form a partition. the class of events (i.e. $5.1.5IC7 + 7.10 0. since they form an independent class. . Consider any set then every point ω ∈ Ai maps into ti . Remark. RANDOM VARIABLES AND PROBABILITIES • A wheel is spun yielding. be expressed in primitive form as The random variable expressing the results may X = −10IC1 + 0IC2 + 10IC3 − 10IC4 + 0IC5 + 10IC6 − 10IC7 + 0IC8 + 10IC9 − IC10 • (6.15. $7.10.15. the class cj IEj (6.0IC6 + 3.05. if the numbers 2. respectively.5IC4 + 5. 4. the player loses ten dollars. If the numbers 1.20. or 8 turn up. 1 ≤ i ≤ 10. i th Example 6. 0.140 CHAPTER 6. Em } is not necessarily mutually exclusive.5IC1 + 5. on a equally likely basis. hence X in terms of the M of real numbers. · · · . if the number 10 turns up. the player gains ten dollars.5 Consider. A customer comes in.6: Events determined by a simple random variable Suppose simple random variable X is represented in canonical form by X = −2IA − IB + 0IC + 3ID Then the class (6. Only those points ω in AM = i∈J Ai map into M. E2 . which may be zero). −1. the Example 6.5IC3 + 7. 3}.15) c0 = 0 ci = 1 for Ei cannot form a mutually exclusive class.. 0. $3. {Ai : 1 ≤ i ≤ n} then it determines. again. m X = c0 + c1 IE1 + c2 IE2 + · · · + cm IEm = c0 + j=1 In this form. C. then one natural way to is the event of a success on the express the count is n Sn = i=1 This is ane form. and the union of any subclass of the X.13) X represented in ane form. and the coecients do c0 = 0 not display directly the set of possible values. 6. in which the random variable is represented as an ane combination of indicator functions (i. Events generated by a simple random variable: canonical form partition is in that We may characterize the class of all inverse images formed by a simple random M.10 0. 0.00. If ti in the range of X into M.50.
we nd four distinct values: value 1.07 2. where M = (−∞.00 is taken on for taken on for t1 = 1. C10 so that C5 C6 C10 and A2 = C2 Similarly P (A2 ) = P (C2 ) + P (C5 ) + P (C6 ) + P (C10 ) = 0.10 1.19) P (Ai ) = P (X = ti ) = j∈Ji P (Cj ) (6.5 Determination of the distribution Determining the partition generated by a simple random variable amounts to determining the canonical form.11 2.18) X is thus k P (X = k) 1. −1] ∪ [1.50 0.7: The distribution from a primitive form Suppose one item is selected at random from a group of ten items. The ω ∈ C7 . Ji where Ai = j∈Ji Cj .50 0. C5 . C. C6 . and t4 = 2.00. 5].00 0. For any possible value ti in the range. The distribution is then completed by determining the probabilities of each event Ak = {X = tk }. so that A1 = C7 and P (A1 ) = P (C7 ) = 0.08 1. Then the terms cj ICj = ti Ji and m ICj = ti IAi . t2 = 1.20) Examination of this procedure shows that there are two phases: . Suppose these are t1 < t2 < · · · < tn .09 1.50 0. respective probabilities are The values (in dollars) and cj P (Cj ) 2. 0 (−2.15 1.08 1. {X ≤ 1} = {X ∈ (−∞.40 2.50 is ω ∈ C2 .00 0. Example 6. and 0 are in M and X −1 (M ) = A (b) If B is the set (c) The event 2. 1. we identify the set of distinct values in the set {cj : 1 ≤ j ≤ m}.2 The general procedure If may be formulated as follows: X = j=1 cj ICj . 1].00 0.50.1 By inspection. 6. we consider an illustrative example.10 Table 6.1.14 2.50 0.50 0.50 0.00 0.14 1. 1.40 (6.141 (a) If M M is the interval [−2.00 0.23 (6.08 2.00. From a primitive form Before writing down the general pattern. 3 are in M and X −1 (M ) = B D. identify the index set Ji of those j such that cj = ti . (6. 1]} = X −1 (M ).14. the event {X ≤ 1} = A B C.17) P (A3 ) = P (C1 ) + P (C3 ) + P (C9 ) = 0. Since values are in M.23 Table 6.50 0.23 The distribution for and P (A4 ) = P (C4 ) + P (C8 ) = 0. then the values 2. t3 = 2. 1]. then the values 1. Value 1.50.50 0.00 0.23 2.
15 0.10 0.PX]') % Display of results 1.50 2. at a price of 18 dollars. Extension to the general case should be quite evident. tn Add all probabilities associated with each value ti to determine P (X = ti ) The software We use the mfunction csort which performs these two operations (see Example 4 (Example 2.00 2. First. we do the problem by hand in tabular form. But in many problems From ane form Suppose X is in ane form.142 CHAPTER 6.1400 1. We do this in a systematic way by utilizing minterm vectors and properties of indicator functions.50].21) X on each minterm generated by {Ej : 1 ≤ j ≤ m}.50 1.00 1.23) .08 0. Example 6. each indicator function the value si of X on each minterm Mi .50 2. use of a tool such as csort is not really needed. Example 6. 1. We determine in a special primitive form 2 −1 m X= k=0 si IMi .5.07 0. 0.2300 2.9: Finding the distribution from ane form A mail order house is featuring three items (limit one of each kind per customer).0000 0. Then we use the mprocedures to carry out the desired operations. with P (Mi ) = pi . 0 ≤ i ≤ 2m − 1 (6. Let X be the amount a customer who orders the special items spends on them plus mailing cost.3. in X = 10 IE1 + 18 IE2 + 10 IE3 + 3 (6. the event the customer orders item 2. We suppose {E1 .5000 0. RANDOM VARIABLES AND PROBABILITIES Select and sort the distinct values • • t1 .PX] = csort(C. E2 .22) 2.10]. with large sets of data the mfunction csort is very useful.00 1. There is a mailing charge of 3 dollars per order. X is constant on each minterm generated by the class ment of the minterm expansion. the event the customer orders item 3.2300 For a problem this small.5000 0. · · · .9: survey (continued)) from "Minterms and MATLAB Calculations").6.00 2.08 0. ane form. m X = c0 + c1 IE1 + c2 IE2 + · · · + cm IEm = c0 + j=1 We determine a particular primitive form by determining the value of the class cj IEj (6. Let E1 = E2 = E3 = the event the customer orders item 1. We apply the csort operation to the matrices of values and minterm probabilities to determine the distribution for X. at a price of 10 dollars. · · · .14 0. This describes X {E1 .08 0.09 0. Em } since.50 1. % Matrix of c_j pc = [0.0000 0. t2 . at a price of 10 dollars.50 1.8: Use of csort on Example 6. % The sorting and consolidating operation disp([X.11 0.pc). 0. Then.4000 2. % Matrix of P(C_j) [X. respectively. E3 } is independent with probabilities 0. We illustrate with a simple example.7 (The distribution from a primitive form) C = [2. as noted in the treatIEi is constant on each minterm. E2 .
143 We seek rst the primitive form. To complete the process of determining the distribution.21 0.09 Table 6.21 0.06 0.21 0.09 0.14 0.06 0.06 0. using the minterm probabilities. we list the sorted values and consolidate by adding together the probabilities of the minterms on which each value is taken. To obtain the value of X on each minterm we • • Multiply the minterm vector for each generating event by the coecient for that event Sum the values on each minterm and add the constant To complete the table. to expose more clearly the primitive Primitive form Values i 0 1 4 2 5 3 6 7 si 3 13 13 21 23 31 31 41 pmi 0.24) has the X = 3 IM0 + 13 IM1 + 13 IM4 + 21 IM2 + 23 IM5 + 31 IM3 + 31 IM6 + 41 IM7 We note that the value 13 is taken on on minterms value 13 is thus p (1) + p (4). as follows: . Similarly.3 We then sort on the form for X. the values on the various Mi . si .14 0. list the corresponding minterm probabilities. The probability has value 31 on minterms M3 and M6 .4 The primitive form of X is thus (6.14 0. 1. X M1 and M4 . i 0 1 2 3 4 5 6 7 10IEi 0 0 0 0 10 10 10 10 18IE2 0 0 18 18 0 0 18 18 10IE3 0 10 0 10 0 10 0 10 c 3 3 3 3 3 3 3 3 si 3 13 21 31 13 23 31 41 pmi 0. X 2.09 0.14 0. which may calculated in this case by using the mfunction minprob.06 0.09 Table 6.21 0.
The minterm probabilities are included in a row matrix..09 Table 6.14 + 0.25) X and PX describe the distribution for X. then uses a matrix of coecients.14 0. including the constant term (set to zero if absent). 6.5 The results may be put in a matrix probabilities that X X of possible values and a corresponding matrix PX of takes on each of these values.14 0.. RANDOM VARIABLES AND PROBABILITIES k 1 2 3 4 5 6 tk 3 13 21 23 31 41 pk 0.21 0.. pm = minprob(0.9 (Finding c = [10 18 10 3].06 + 0. and suppose we have available. Examination of the table shows that X = [3 13 21 23 31 41] Matrices and P X = [0.. M = mintable(3) M = 0 0 0 0 1 0 0 1 1 0 0 1 0 1 0 % .09] (6.35 0.06 0... then incorporate these in the mprocedure canonic for carrying out the transformation.15 0. The procedure uses mintable to set the basic minterm vector patterns. 2.C = colcopy(c(1:3).1.. to obtain the values on each minterm. Having obtained the values on each minterm.*M CM = % Constant term is listed last % Minterm vector pattern 1 0 1 1 1 1 1 0 1 % An approach mimicking ``hand'' calculation % Coefficients in position 10 10 18 18 10 10 % Minterm vector values 10 18 10 . the minterm probabilities. 1..21 = 0. or can calculate.15 0..06 0. Example 6.09 = 0.21 0.144 CHAPTER 6. We start with the random variable in ane form..1*[6 3 5]).8) C = 10 10 10 10 10 18 18 18 18 18 10 10 10 10 10 CM = C. the procedure performs the desired consolidation by using the mfunction csort.6 An mprocedure for determining the distribution from ane form We now consider suitable MATLAB steps in determining the distribution from ane form..10: Steps in determining the distribution for the distribution from ane form) X in Example 6.35 0..
0600 23.06 0.21 31 0.06 0.9 (Finding the distribution from ane form)) X c = [10 18 10 3]....10 (Steps in determining the distribution for in Example 6.06 0.21 0.const.0000 0.5]).09 % Extra zeros deleted const = c(4)*ones(1. which we use to solve the previous problem.14 13 0.0900 .PX]') 3 0..21 0.11: Use of canonic on the variables of Example 6.8)..0000 0.% Practical MATLAB procedure s = c(1:3)*M + c(4) s = 3 13 21 31 13 23 31 41 pm = 0.35 21 0.0000 0.PX] = csort(s.09 0..145 0 0 0 0 10 10 10 10 0 0 18 18 0 0 18 18 0 10 0 10 0 10 0 10 cM = sum(CM) + c(4) % Values on minterms cM = 3 13 21 31 13 23 31 41 % .s.09 0... canonic Enter row vector of coefficients c Enter row vector of minterm probabilities pm Use row matrices X and PX for calculations Call for XDBN to view the distribution disp(XDBN) 3.14 0.14 0.06 23 0.0000 0...0000 0.09 % Sorting on s.3500 21. disp([X.0000 0.14 % MATLAB gives four decimals 0.pm]') 0 0 0 3 3 0 0 10 3 13 0 18 0 3 21 0 18 10 3 31 10 0 0 3 13 10 0 10 3 23 10 18 0 3 31 10 18 10 3 41 [X.14 0.06 0.1400 13.pm).6 0.15 41 0.3 0.09 % Display of primitive form 0. Example 6..21 0. % Note that the constant term 3 must be included last pm = minprob([0. consolidation of pm % Display of final result The two basic steps are combined in the mprocedure canonic.2100 31.} disp([CM.1500 41.21 0.
6200 G = (X . Apply csort to the pair Z = g (X). 0.146 CHAPTER 6. Use the matrix G. or.93. 0. 0. We have two alternatives: a.0900 % Calculation using distribution for Z Example 6.11 (Use of canonic on the variables of Example 6.88. M = (X>=15)&(X<=35). If the respective probabilities for success are 0. 0. c = [ones(1. 10)? SOLUTION Let X= 10 i=1 IEi .84]. 0.PX) Z = 44 36 26 126 154 PZ = 0. Example 6.85. P (X − 20 ≤ 7).77.93. qualify. This distribution (in b. they must post an average speed of 125 mph or more on a trial run.PZ] = csort(G. 0. 0.20 <= 7 % Picks out and sums those minterm probs % Value of g(t_i) for each possible value % Total probability for those t_i such that % g(t_i) > 0 % Distribution for Z = g(X) 496 0. 7.10).96.83. 0. 0. 0.85.3800 % Ones for minterms on which 15 <= X <= 35 % Picks out and sums those minterm probs % Ones for minterms on which X .90.12: Continuation of Example 6.19) from "Composite Trials" Ten race cars are involved in time trials to determine pole positions for an upcoming race. Let the i th Ei To be the event is car makes qualifying speed.9 (Finding the distribution from ane and form))) it is desired to determine the probabilities P (15 ≤ X ≤ 35). canonic .10 X in Example 6. 0.3800 [Z.9 (Finding the distribution from ane form))) X Suppose for the random variable (Steps in determining the distribution for X in Example 6.90. what is the probability that k or more will qualify ( k = 6.0600 0. 0. which has values g (ti ) for each possible value ti (G.84. 0.91.11 (Use of canonic on the variables of Example 6.25) G = 154 36 44 26 126 496 P1 = (G>0)*PX' P1 = 0.1500 0. value and probability matrices) may be used in exactly the same manner as that for the original random variable X. M = 0 0 1 1 1 0 PM = M*PX' PM = 0. 2.13: Alternate formulation of Example 3 (Example 4. we may calculate a wide variety of quantities associated with the random variable. P X) to get the distribution for for X. P ((X − 10) (X − 25) > 0). It seems reasonable to suppose the class {Ei : 1 ≤ i ≤ 10} independent. 0. 0.4200 N = abs(X20)<=7.88. Determine X to determine a matrix PM = M*PX' M which has P (X ∈ M ): G = g (X) = [g (X1 ) g (X2 ) · · · g (Xn )] by using array operations on matrix X.10) 0]. We use two key devices: 1.83.10 (Steps in determining the distribution for in Example 6.72.96. 9.72.*(X . 0.1400 0. 8.77. Use relational and logical operations on the matrix of values ones for those values which meet a prescribed condition. RANDOM VARIABLES AND PROBABILITIES X (set of values) and With the distribution available in the matrices PX (set of probabilities). N = 0 1 1 1 0 0 PN = N*PX' PN = 0. 0. 0.91.2100 P2 = (Z>0)*PZ' P2 = 0. P = [0.3500 0.
e. but the class of Borel sets is general enough to include any set likely to be encountered in applicationscertainly at the level of this treatment. a great many other probabilities can be determined. disp([Z.9938 0. the class of those events of the form Ai = {X = ti } for each ti in the range). Example 6.0000 0. However.PZ]') 3. and countable intersections of Borel sets. for i = 1:length(k) Pk(i) = (X>=k(i))*PX'.1. but the distribution % matrices are now named Z and PZ 6.PZ] = canonicf(c.2114 This solution is not as convenient to write out.147 Enter row vector of coefficients c Enter row vector of minterm probabilities minprob(P) Use row matrices X and PX for calculations Call for XDBN to view the distribution k = 6:10. This is a type of class known as a sigma algebra of events..1*[6 3 5]). Or.1500 41. If the random variable takes on a continuum of values. countable unions. The Borel sets include any interval and any set that can be formed by complements. The class of events determined by for M X is the set of all inverse images X −1 (M ) any member of a general class of subsets of subsets of the real line known in the mathematical literature as the Borel sets.1).3500 21. There are technical mathematical reasons for not saying M is any subset. then the probability mass distribution may be spread smoothly on the line. A function form.0900 % Numbers as before. the class of events determined by random variable X is also a sigma algebra.2100 31.0000 0. which we call canonicf.0000 0. This is particularly the case when it is desired to compare the results of two independent races or heats. frequently it is desirable to use some other name for the random variable from the start.19) from "Composite Trials").0000 0. A function form for canonic One disadvantage of the procedure canonic is that it always names the output X and P X.9628 0.8472 0.7 General random variables The distribution for a simple random variable is easily visualized as point mass concentrations at the various values in the range. but the details are more complicated. pm = minprob(0. is useful in this case.5756 0.0000 0. The situation is conceptually the same for the general case. using canonicf c = [10 18 10 3]. [Z.pm). and is often designated σ (X).0000 0. We consider such problems in the study of Independent Classes of Random Variables (Section 9. end disp(Pk) 0.14: Alternate solution of Example 6. with the distribution for X as dened.0600 23. Because of the preservation of set operations by the inverse image.13 (Alternate formulation of Example 3 (Example 4. and the class of events determined by a simple random variable is described in terms of the partition generated by X (i. There are some technical questions . the distribution may be a mixture of point mass concentrations and smooth distributions on some intervals. While these can easily be renamed.1400 13.
50. in terms of Exercise 6. $5. She purchases one of the items with probabilities 0. 0. form Borel set M i for every semiinnite interval (−∞. Random variable Express X X (Solution on p. Another fact. B. (a) by hand. $3. 0. 2.2 Problems on Random Variables and Probabilities2 Exercise 6. the player loses one dollar.) A wheel is spun yielding on an equally likely basis the integers 1 through 10. 152.5/>. 2} (Solution on p.2 Random variable X. t]) is an event. 3)}. 3. or 7 turn up. if the numbers 3. $5. or 8 turn up.1. {X ∈ (−4. A customer comes in. Ci be the event Each P (Ci ) = 0. (Solution on p. These facts point to the importance of the distribution function introduced in the next chapter. The random variable expressing the results may be expressed in primitive form as X = −10IC1 + 0IC2 + 10IC3 − 10IC4 + 0IC5 + 10IC6 − 10IC7 + 0IC8 + 10IC9 − IC10 (6. (b) using MATLAB.09.06. X −1 (M ) is an event for every X −1 ((−∞.50.75IA − 1. t]. 3. C.6 A store has eight items for sale. 6.1 The following simple random variable is in canonical form: (Solution on p. D. Determine the distribution for X. {X 2 ≥ 4}. 2 This content is available online at <http://cnx. 0.) has values {1. 152. and (Solution on p. 2. and D. {X ∈ (−∞. B. respectively. These also are settled in such a manner that there is no need for concern at this level of analysis. 1]}. 5.3 C1 {Cj : 1 ≤ j ≤ 10} through C10 . The prices are $3. and $7.12.5 the wheel stops at (Solution on p. terms of A.00. $5. 152.50. Let i. 1.org/content/m24208/1. . 3. $7.09.08.4 The class {Cj : 1 ≤ j ≤ 10} in Exercise 6. alluded to above and discussed in some detail in the next chapter. or 9 turn up. 3]}. 4. {X − 1 ≥ 1}. 153.) in canonical form. However. Exercise 6. 6. Two facts provide the freedom we need to proceed with little concern for the technical details. 0. 152.6ID . if the numbers 2. 2]}.00.07.13. Express the events and {X ≥ 0} in Exercise 6. some of these questions become important in dealing with random processes and other advanced notions increasingly used in applications. Exercise 6. 1. {X ∈ (0. if the number 10 turns up. is given by X = −2IA − IB + IC + 2ID + 5IE .) X = −3. 0. Exercise 6. If the numbers 1. respectively. 1 ≤ i ≤ 10. the player loses ten dollars. in canonical form.) P (X < 0). 4. 0. {X − 2 ≤ 3}.11. C . {X ≤ 0}. is a partition. P (X > 0). 0. 0.50. the player gains nothing.148 CHAPTER 6. {X < 0}. 5. The class on and E.13IB + 0IC + 2.50. RANDOM VARIABLES AND PROBABILITIES PX induced by concerning the probability measure X. hence the distribution. $3.10. the player gains ten dollars. {X ∈ [2. The induced probability distribution is determined uniquely by its assignment to all intervals of the (−∞.26) • • Determine the distribution for Determine X. 152. is that any general random variable can be approximated as closely as pleased by a simple random variable.3 has respective probabilities 0. t] on the real line 2. 0.14.) Express the events A. We turn in the next chapter to a description of certain commonly encountered probability distributions and ways to describe them analytically.11.00.
3.2ID − 3. on A2 B3 . etc.15.20. Suppose the class {A. the Orlando Magic. or .6 0. in order. She loses ve if the Rockets win and gains ten if they lose $10 to 5 against the Magic even $5 to 5 on the Bulls. Let c.048 0. let and determine the distribution of W.168 0. Z = 3 + 3 each Ai Bj and determine the corresponding P (Ai Bj ). 0. B.2. 0.170 0.30) (Solution on p.018 0.75.) Exercise 6. determine the distribution for Z. P (Bj ) are 0. Exercise 6.028 0. D} are.8.062 0. W = XY . 153. d. 0.) in canonical form are X. 154. C} is independent. (b) using MATLAB. who does not like the Magic.) X be the minimum of the two numbers which turn up. Let Z be the sum of the two numbers. respectively. and the Consider the random variable independent. Determine the value of Z on From this. She decides to take him up on his bets as follows: • • • $10 to 5 on the Rockets i.6. Bj } is Z = X + Y .0IC2 + 3.040 0.) Exercise 6.149 0.5IB + 2.15.3IA − 2.28) P (Ai ) are 0.048 0.31) X. (Solution on p. (Solution on p.012 0.0IC6 + 3.) On a Tuesday evening. Each pair {Ai .9 A pair of dice is rolled.072 0. The random variable expressing the amount of her purchase may be written X = 3. Determine the distribution for Y.010 0.70 0. B. 154. Let W be the absolute value of the dierence.110 0.11 games (but not with one another).29) Determine the distribution for random variable X = −5.10 0.042 0.5IC1 + 5.7 Suppose (Solution on p.15. Determine the value of W on each Ai B j Exercise 6. Let b.112 0.1. He wants to bet on the games. Ellen's boyfriend is a rabid Rockets fan. breaks even. 0. C. 0. and (6.028 0.0IC5 + 5.5IC3 + 7. B be the event the Magic be the event the Bulls win. Determine its distribution. a. Determine the distribution for X Y be the maximum of the two numbers.2 0.10 0.3IC + 4. 0.27) X (a) by hand. 0. What are the probabilities Ellen loses money. the Houston Rockets.8 For the random variables in Exercise 6. 155. with respective probabilities 0.5IC8 Determine the distribution for (6. (Solution on p.032 (6. 153. Let win. 0. and the Chicago Bulls all have C A be the event the Rockets win.e. Y X = 2IA1 + 3IA2 + 5IA3 The Y = IB1 + 2IB2 + 3IB3 (6.5IC4 + 5. Then Z = 2 + 1 on A1 B1 .5IC7 + 7.7.10 Minterm probabilities p (0) through p (15) for the class {A. Determine the distribution for Z.7 Exercise 6. Ellen's winning may be expressed as the random variable X = −5IA + 10IAc + 10IB − 5IB c − 5IC + 5IC c = −15IA + 15IB − 10IC + 10 Determine the distribution for comes out ahead? (6.05.
157. four dollars if he wins the second bet. 0. The random variable X = IA + IB + IC + ID on a trial. RANDOM VARIABLES AND PROBABILITIES Exercise 6. Ei (Solution on p.) A marker is placed at a reference position on a line (taken to be the origin). an amount a is earned if there is a success on the trial and an amount b (usually negative) if there is a failure.17 repeatedly.33) X. and six dollars if he wins the third. a socket wrench set at $56. Exercise 6. He puts down two dollars for each bet. 156. C.9.3 (6. He decides independently on the a set of screwdrivers at $18.12 The class (Solution on p. a vise at $24. B.6. with respective probabilities 0.34) Assume the results of the games are independent. purchases of the individual items. Exercise 6.90. $26. Their arrivals are the A. Show that W = (a − b) Sn + bn. Let the i th component trial. P (B) = 0. X = 20IA + 26IB + 33IC represents the total amount received. if a tail turns up. Find the probability that one or three of these occur on a trial.15 Henry goes to a hardware store. James is expecting three checks in the mail. Let amount of his total purchases. the marker is moved one unit to the right. 158. b.13 events Then (Solution on p.5. what are the possible positions and the probabilities of being in each? . His net winning can be represented by the random variable X = 3IA + 4IB + 6IC − 6. and hammer at $8. D} pm = 0. 156. After ten tosses. a coin is tossed If a head turns up.16 A sequence of trials (not necessarily independent) is performed.80. B. P (C) = 0.150 CHAPTER 6. Find the distribution for X counts the number of the events which occur and determine the probability that two or more occur on a trial.) Assume the class is independent.75. Determine the distribution for X. Exercise 6. the marker is moved one unit to the left.5. a. We associate with each trial a payo function Xi = aIEi + bIEic .4. and $33 dollars. 155. 0. Thus. X be the Exercise 6. Determine the distribution for X. 0. Determine the distribution for (6. (Solution on p. C . for $20. Let Sn be the number of successes in the n trials and W be the net payo. 0.4. 0. 0.001 ∗ [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] (6. What is the probability he receives at least $50? Less than $30? Exercise 6. with respective probabilities 0. with P (A) = 0.) A gambler places three bets.32) • • Determine whether or not the class is independent.35) Ei is the event of a head on the i th toss and S10 is the number of heads in ten trials.) He considers a power drill at $35. He picks up three dollars (his original bet plus one dollar) if he wins the rst bet. Show that the position at the end of ten tosses is given by the random variable 10 10 10 c IEi = 2 X= i=1 where IEi − i=1 IEi − 10 = 2S10 − 10 i=1 (6. 158.) has minterm probabilities {A.) be the event of success on (Solution on p.14 (Solution on p.7.
Exercise 6.18 (Solution on p. 158.)
Margaret considers five purchases in the amounts 5, 17, 21, 8, 15 dollars, with respective probabilities 0.37, 0.22, 0.38, 0.81, 0.63. Anne contemplates six purchases in the amounts 8, 15, 12, 18, 15, 12 dollars, with respective probabilities 0.77, 0.52, 0.23, 0.41, 0.83, 0.58. Assume that all eleven possible purchases form an independent class.

a. Determine the distribution for X, the amount purchased by Margaret.
b. Determine the distribution for Y, the amount purchased by Anne.
c. Determine the distribution for Z = X + Y, the total amount the two purchase.

Suggestion for part (c). Let MATLAB perform the calculations.

[r,s] = ndgrid(X,Y);
[t,u] = ndgrid(PX,PY);
z = r + s;
pz = t.*u;
[Z,PZ] = csort(z,pz);
Solutions to Exercises in Chapter 6

Solution to Exercise 6.1 (p. 148)
T = [1 3 2 3 4 2 1 3 5 2];
[X,I] = sort(T)
X = 1  1  2  2  2  3  3  3  4  5
I = 1  7  3  6 10  2  4  8  5  9

Solution to Exercise 6.2 (p. 148)
• D  A  A  B  A  B  B  C  D  D  E  E

Solution to Exercise 6.3 (p. 148)
• A  D  A  B  B  D  C  C  C  C

X = IA + 2IB + 3IC + 4ID + 5IE, with A = C1 ∨ C7, B = C3 ∨ C6 ∨ C10, C = C2 ∨ C4 ∨ C8, D = C5, E = C9 (6.37)

Solution to Exercise 6.4 (p. 148)
T = [1 3 2 3 4 2 1 3 5 2];
pc = 0.01*[8 13 6 9 14 11 12 7 11 9];
[X,PX] = csort(T,pc);
disp([X;PX]')
    1.0000    0.2000
    2.0000    0.2600
    3.0000    0.2900
    4.0000    0.1400
    5.0000    0.1100

Solution to Exercise 6.5 (p. 148)
p = 0.1*ones(1,10);
c = [-10 0 10 -10 0 10 10 0 -10 -1];
[X,PX] = csort(c,p);
disp([X;PX]')
  -10.0000    0.3000
   -1.0000    0.1000
         0    0.3000
   10.0000    0.3000
Pneg = (X<0)*PX'
Pneg = 0.4000
Ppos = (X>0)*PX'
Ppos = 0.3000

Solution to Exercise 6.6 (p. 148)
p = 0.01*[10 15 15 20 10 5 10 15];
c = [3.5 5 3.5 7.5 5 5 3.5 7.5];
[X,PX] = csort(c,p);
disp([X;PX]')
    3.5000    0.3500
    5.0000    0.3000
    7.5000    0.3500

Solution to Exercise 6.7 (p. 149)
A = [2 3 5];  B = [1 2 3];        % X values, Y values
a = rowcopy(A,3);                 % X values in each position
b = colcopy(B,3);                 % Y values in each position
Z = a + b                         % possible values of the sum Z = X + Y
Z = 3  4  6
    4  5  7
    5  6  8
PA = [0.3 0.6 0.1];  PB = [0.2 0.6 0.2];
pa = rowcopy(PA,3);
pb = colcopy(PB,3);
P = pa.*pb                        % probabilities for the various pairs
P = 0.0600  0.1200  0.0200
    0.1800  0.3600  0.0600
    0.0600  0.1200  0.0200
[Z,PZ] = csort(Z,P);
disp([Z;PZ]')                     % distribution for Z = X + Y
    3.0000    0.0600
    4.0000    0.3000
    5.0000    0.4200
    6.0000    0.1400
    7.0000    0.0600
    8.0000    0.0200

Solution to Exercise 6.8 (p. 149)
XY = a.*b                         % XY values
XY =  2   3   5
      4   6  10
      6   9  15
[W,PW] = csort(XY,P);             % distribution for W = XY
disp([W;PW]')                     % distribution for W = XY
    2.0000    0.0600
    3.0000    0.1200
    4.0000    0.1800
    5.0000    0.0200
    6.0000    0.4200
    9.0000    0.1200
   10.0000    0.0600
   15.0000    0.0200

Solution to Exercise 6.9 (p. 149)
t = 1:6;
c = ones(6,6);
[x,y] = meshgrid(t,t);            % x values and y values in each position
m = min(x,y);                     % min in each position
M = max(x,y);                     % max in each position
s = x + y;                        % sum x + y in each position
d = abs(x - y);                   % |x - y| in each position
[X,fX] = csort(m,c)               % sorts values and counts occurrences
X  = 1  2  3  4  5  6             % PX = fX/36
fX = 11  9  7  5  3  1
[Y,fY] = csort(M,c)
Y  = 1  2  3  4  5  6             % PY = fY/36
fY = 1  3  5  7  9  11
[Z,fZ] = csort(s,c)
Z  = 2  3  4  5  6  7  8  9  10  11  12    % PZ = fZ/36
fZ = 1  2  3  4  5  6  5  4   3   2   1
[W,fW] = csort(d,c)
W  = 0  1  2  3  4  5             % PW = fW/36
fW = 6 10  8  6  4  2

Solution to Exercise 6.10 (p. 149)
% file npr06_10.m (Section 17.8.27: npr06_10)
% Data for Exercise 6.10
pm = [0.072 0.048 0.018 0.012 0.168 0.112 0.042 0.028 ...
      0.062 0.048 0.028 0.010 0.170 0.110 0.040 0.032];
c = [-5.3 -2.5 2.3 4.2 -3.7];
disp('Minterm probabilities are in pm, coefficients in c')
npr06_10 (Section 17.8.27: npr06_10)
Minterm probabilities are in pm, coefficients in c
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
XDBN
XDBN =
  -11.5000    0.1700
   -9.2000    0.0400
   -9.0000    0.0620
   -7.3000    0.1100
   -6.7000    0.0280
   -6.2000    0.1680
   -5.0000    0.0320
   -4.8000    0.0480
   -3.9000    0.0420
   -3.7000    0.0720
   -2.5000    0.0100
   -2.0000    0.1120
   -1.4000    0.0180
    0.3000    0.0280
    0.5000    0.0480
    2.8000    0.0120

Solution to Exercise 6.11 (p. 149)
P = 0.01*[75 70 80];
c = [-15 15 -10 10];
pm = minprob(P);
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
  -15.0000    0.1800
   -5.0000    0.0450
         0    0.4800
   10.0000    0.1200
   15.0000    0.1400
   25.0000    0.0350
PXneg = (X<0)*PX'
PXneg = 0.2250
PX0 = (X==0)*PX'
PX0 = 0.4800
PXpos = (X>0)*PX'
PXpos = 0.2950
Solution to Exercise 6.12 (p. 150)
npr06_12 (Section 17.8.28: npr06_12)
Minterm probabilities in pm, coefficients in c
a = imintest(pm)
The class is NOT independent
Minterms for which the product rule fails
a =
  1  1  1  1
  1  1  1  1
  1  1  1  1
  1  1  1  1
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
XDBN =
         0    0.0050
    1.0000    0.0430
    2.0000    0.2120
    3.0000    0.4380
    4.0000    0.3020
P2 = (X>=2)*PX'
P2 = 0.9520
P13 = ((X==1)|(X==3))*PX'
P13 = 0.4810

Solution to Exercise 6.13 (p. 150)
P = 0.01*[90 75 80];
c = [20 26 33 0];
pm = minprob(P);
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
         0    0.0050
   20.0000    0.0450
   26.0000    0.0150
   33.0000    0.0200
   46.0000    0.1350
   53.0000    0.1800
   59.0000    0.0600
   79.0000    0.5400
P50 = (X>=50)*PX'
P50 = 0.7800
P30 = (X<30)*PX'
P30 = 0.0650
Solution to Exercise 6.14 (p. 150)
P = 0.1*[5 4 3];
c = [3 4 6 -6];
pm = minprob(P);
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
   -6.0000    0.2100
   -3.0000    0.2100
   -2.0000    0.1400
         0    0.0900
    1.0000    0.1400
    3.0000    0.0900
    4.0000    0.0600
    7.0000    0.0600

Solution to Exercise 6.15 (p. 150)
P = 0.1*[5 6 7 4 9];
c = [35 56 18 24 8 0];
pm = minprob(P);
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
         0    0.0036
    8.0000    0.0324
   18.0000    0.0084
   24.0000    0.0024
   26.0000    0.0756
   32.0000    0.0216
   35.0000    0.0036
   42.0000    0.0056
   43.0000    0.0324
   50.0000    0.0504
   53.0000    0.0084
   56.0000    0.0054
   59.0000    0.0024
   61.0000    0.0756
   64.0000    0.0486
   67.0000    0.0216
   74.0000    0.0126
   77.0000    0.0056
   80.0000    0.0036
   82.0000    0.1134
   85.0000    0.0504
   88.0000    0.0324
   91.0000    0.0054
   98.0000    0.0084
   99.0000    0.0486
  106.0000    0.0756
  109.0000    0.0126
  115.0000    0.0036
  117.0000    0.1134
  123.0000    0.0324
  133.0000    0.0084
  141.0000    0.0756

Solution to Exercise 6.16 (p. 150)

Xi = aIEi + b (1 − IEi) = (a − b) IEi + b (6.38)

W = Σ(i=1 to n) Xi = (a − b) Σ(i=1 to n) IEi + bn = (a − b) Sn + bn (6.39)

Solution to Exercise 6.17 (p. 150)

Xi = IEi − IEic = IEi − (1 − IEi) = 2IEi − 1 (6.40)

X = Σ(i=1 to 10) Xi = 2 Σ(i=1 to 10) IEi − 10 = 2S10 − 10 (6.41)

S = 0:10;
PS = ibinom(10,0.5,0:10);
X = 2*S - 10;
disp([X;PS]')
  -10.0000    0.0010
   -8.0000    0.0098
   -6.0000    0.0439
   -4.0000    0.1172
   -2.0000    0.2051
         0    0.2461
    2.0000    0.2051
    4.0000    0.1172
    6.0000    0.0439
    8.0000    0.0098
   10.0000    0.0010

Solution to Exercise 6.18 (p. 151)
% file npr06_18.m (Section 17.8.29: npr06_18.m)
cx = [5 17 21 8 15 0];
pmx = minprob(0.01*[37 22 38 81 63]);
cy = [8 15 12 18 15 12 0];
pmy = minprob(0.01*[77 52 23 41 83 58]);
[X,PX] = canonicf(cx,pmx);
[Y,PY] = canonicf(cy,pmy);
[r,s] = ndgrid(X,Y);
[t,u] = ndgrid(PX,PY);
z = r + s;
pz = t.*u;
[Z,PZ] = csort(z,pz);
a = length(Z)
a = 125                       % 125 different values
plot(Z,cumsum(PZ))            % see figure; plotting details omitted
Chapter 7
Distribution and Density Functions

7.1 Distribution and Density Functions1

7.1.1 Introduction

In the unit on Random Variables and Probability (Section 6.1) we introduce real random variables as mappings from the basic space Ω to the real line. The mapping induces a transfer of the probability mass on the basic space to subsets of the real line in such a way that the probability that X takes a value in a set M is exactly the mass assigned to that set by the transfer. To perform probability calculations, we need to describe analytically the distribution on the line. For simple random variables this is easy. We have at each possible value of X a point mass equal to the probability X takes that value. For more general cases, we need a more useful description than that provided by the induced probability measure PX.

7.1.2 The distribution function

In the theoretical discussion on Random Variables and Probability, we note that the probability distribution induced by a random variable X is determined uniquely by a consistent assignment of mass to semiinfinite intervals of the form (−∞, t] for each real t. This suggests that a natural description is provided by the following.

Definition
The distribution function FX for random variable X is given by

FX (t) = P (X ≤ t) = P (X ∈ (−∞, t]) ∀ t ∈ R (7.1)

In terms of the mass distribution on the line, FX (t) is the probability mass at or to the left of the point t. As a consequence, FX has the following properties:

(F1): FX must be a nondecreasing function, for if t > s there must be at least as much probability mass at or to the left of t as there is for s.
(F2): FX is continuous from the right, with a jump in the amount p0 at t0 iff P (X = t0) = p0. If the point t approaches t0 from the left, the interval (−∞, t] does not include the probability mass at t0 until t reaches that value, at which point the amount at or to the left of t increases ("jumps") by the amount p0; on the other hand, if t approaches t0 from the right, the interval includes the mass p0 all the way to and including t0, but drops immediately as t moves to the left of t0.
(F3): Except in very unusual cases involving random variables which may take "infinite" values, the probability mass included in (−∞, t] must increase to one as t moves to the right, and must decrease to zero as t moves to the left, so that

1This content is available online at <http://cnx.org/content/m23267/1.6/>.
FX (−∞) = lim(t→−∞) FX (t) = 0 and FX (∞) = lim(t→∞) FX (t) = 1 (7.2)

A distribution function determines the probability mass in each semiinfinite interval (−∞, t]. According to the discussion referred to above, this determines uniquely the induced distribution.

The distribution function FX for a simple random variable is easily visualized. The distribution consists of point mass pi at each point ti in the range. To the left of the smallest value in the range, FX (t) = 0; as t increases to the smallest value t1, FX (t) remains constant at zero until it jumps by the amount p1. FX (t) remains constant at p1 until t increases to t2, where it jumps by an amount p2 to the value p1 + p2. This continues until the value of FX (t) reaches 1 at the largest value tn. The graph of FX is thus a step function, continuous from the right, with a jump in the amount pi at the corresponding point ti in the range. A similar situation exists for a discrete-valued random variable which may take on an infinity of values (e.g., the geometric distribution or the Poisson distribution considered below). In this case, there is always some probability at points to the right of any ti, but this must become vanishingly small as t increases, since the total probability mass is one.

The procedure ddbn may be used to plot the distribution function for a simple random variable from a matrix X of values and a corresponding matrix PX of probabilities.

Example 7.1: Graph of FX for a simple random variable

c = [10 18 10 3];                % Distribution for X in Example 6.5.1
pm = minprob(0.1*[6 3 5]);
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
ddbn                             % Circles show values at jumps
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX
% Printing details   See Figure 7.1
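Since FX (ti) is simply the accumulated probability at or to the left of ti, the jump values of the distribution function can also be obtained directly by a cumulative sum. The following is a minimal sketch of that calculation in plain MATLAB; the values and probabilities here are hypothetical, and stairs is the built-in step-plotting function, so the ddbn procedure is not required.

X = [-5 3 13 21 28];            % hypothetical values, in increasing order
PX = 0.1*[2 3 2 2 1];           % corresponding probabilities (sum to one)
FX = cumsum(PX);                % jump values FX(i) = P(X <= X(i))
disp([X;FX]')                   % distribution function at each jump point
stairs([X(1)-1 X],[0 FX])       % step function, continuous from the right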
Figure 7.1: Distribution function for Example 7.1 (Graph of FX for a simple random variable)

7.1.3 Description of some common discrete distributions

We make repeated use of a number of common distributions which are used in many practical situations. This collection includes several distributions which are studied in the chapter "Random Variables and Probabilities" (Section 6.1).

1. Indicator function. X = IE, P (X = 1) = P (E) = p, P (X = 0) = q = 1 − p. The distribution function has a jump in the amount q at t = 0 and an additional jump of p to the value 1 at t = 1.

2. Simple random variable X = Σ(i=1 to n) ti IAi (canonical form), P (X = ti) = P (Ai) = pi. (7.3) The distribution function is a step function, continuous from the right, with jump of pi at t = ti (see Figure 7.1 for a simple random variable).

3. Binomial (n, p). This random variable appears as the number of successes in a sequence of n Bernoulli trials with probability p of success. In its simplest form

X = Σ(i=1 to n) IEi with {Ei : 1 ≤ i ≤ n} independent (7.4)

P (Ei) = p, P (X = k) = C (n, k) p^k q^(n−k) (7.5)

As pointed out in the study of Bernoulli sequences in the unit on Composite Trials, two m-functions ibinom and cbinom are available for computing the individual and cumulative binomial probabilities. A small illustration of their use follows.
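The calling pattern, used repeatedly in the solutions later in this chapter, is ibinom(n,p,k) for P (X = k) and cbinom(n,p,k) for P (X ≥ k), where k may be a row of integers. A minimal sketch, tabulating both for a binomial (10, 0.3) random variable:

k = 0:10;
pk = ibinom(10,0.3,k);          % individual probabilities P(X = k)
Pk = cbinom(10,0.3,k);          % cumulative probabilities P(X >= k)
disp([k;pk;Pk]')                % one row per value of k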
4. Geometric (p). There are two related distributions, both arising in the study of continuing Bernoulli sequences. The first counts the number of failures before the first success. This is sometimes called the waiting time. The event {X = k} consists of a sequence of k failures, then a success. Thus

P (X = k) = q^k p, 0 ≤ k (7.6)

The second designates the component trial on which the first success occurs. The event {Y = k} consists of k − 1 failures, then a success on the k-th component trial. We have

P (Y = k) = q^(k−1) p, 1 ≤ k (7.7)

We say X has the geometric distribution with parameter (p), which we often designate by X ∼ geometric (p). Now Y = X + 1 or Y − 1 = X, so it is customary to refer to the distribution for the number of the trial for the first success by saying Y − 1 ∼ geometric (p). The probability of k or more failures before the first success is P (X ≥ k) = q^k. Also

P (X ≥ n + k | X ≥ n) = P (X ≥ n + k) / P (X ≥ n) = q^(n+k)/q^n = q^k = P (X ≥ k) (7.8)

This suggests that a Bernoulli sequence essentially "starts over" on each trial. If it has failed n times, the probability of failing an additional k or more times before the next success is the same as the initial probability of failing k or more times before the first success.

Example 7.2: The geometric distribution
A statistician is taking a random sample from a population in which two percent of the members own a BMW automobile. She takes a sample of size 100. What is the probability of finding no BMW owners in the sample?
SOLUTION
The sampling process may be viewed as a sequence of Bernoulli trials with probability p = 0.02 of success. The probability of 100 or more failures before the first success is 0.98^100 = 0.1326, or about 1/7.5.

5. Negative binomial (m, p). X is the number of failures before the m-th success. It is generally more convenient to work with Y = X + m, the number of the trial on which the m-th success occurs. An examination of the possible patterns and elementary combinatorics show that

P (Y = k) = C (k − 1, m − 1) p^m q^(k−m), m ≤ k (7.9)

There are m − 1 successes in the first k − 1 trials, then a success. Each combination has probability p^m q^(k−m). We have an m-function nbinom to calculate these probabilities.

Example 7.3: A game of chance
A player throws a single six-sided die repeatedly. He scores if he throws a 1 or a 6. What is the probability he scores five times in ten or fewer throws?

p = sum(nbinom(5,1/3,5:10))
p = 0.2131

An alternate solution is possible with the use of the binomial distribution. The m-th success comes not later than the k-th trial iff the number of successes in k trials is greater than or equal to m.

P = cbinom(10,1/3,5)
P = 0.2131
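The agreement of the two computations in Example 7.3 can also be checked against formula (7.9) directly, without the toolbox m-functions. A minimal sketch in plain MATLAB (nchoosek and arrayfun are built in):

m = 5; p = 1/3; q = 1 - p;
k = m:10;                                  % trial on which the fifth success occurs
pk = arrayfun(@(n) nchoosek(n-1,m-1),k).*p^m.*q.^(k-m);   % formula (7.9)
P = sum(pk)                                % P(Y <= 10), approx 0.2131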
6. Poisson (µ). This distribution is assumed in a wide variety of applications. It appears as a counting variable for items arriving with exponential interarrival times (see the relationship to the gamma distribution below). For large n and small p (which may not be a value found in a table), the binomial (n, p) distribution is approximately Poisson (np). Use of the generating function (see Transform Methods) shows the sum of independent Poisson random variables is Poisson. The Poisson distribution is integer valued, with

P (X = k) = e^(−µ) µ^k / k!, 0 ≤ k (7.10)

Although Poisson probabilities are usually easier to calculate with scientific calculators than binomial probabilities, the use of tables is often quite helpful. As in the case of the binomial distribution, we have two m-functions for calculating Poisson probabilities. These have advantages of speed and parameter range similar to those for ibinom and cbinom.

P (X = k) is calculated by P = ipoisson(mu,k), where k is a row or column vector of integers and the result P is a row matrix of the probabilities.
P (X ≥ k) is calculated by P = cpoisson(mu,k), where k is a row or column vector of integers and the result P is a row matrix of the probabilities.

Example 7.4: Poisson counting random variable
The number of messages arriving in a one minute period at a communications network junction is a random variable N ∼ Poisson (130). What is the probability the number of arrivals is greater than or equal to 110, 120, 130, 140, 150, 160?

p = cpoisson(130,110:10:160)
p = 0.9666  0.8209  0.5117  0.2011  0.0461  0.0060

The descriptions of these distributions, along with a number of other facts, are summarized in the table DATA ON SOME COMMON DISTRIBUTIONS in Appendix C (Section 17.3).

7.1.4 The density function

If the probability mass in the induced distribution is spread smoothly along the real line, with no point mass concentrations, there is a probability density function fX which satisfies

P (X ∈ M) = PX (M) = ∫_M fX (t) dt  (area under the graph of fX over M) (7.11)

At each t, fX (t) is the mass per unit length in the probability distribution. The density function has three characteristic properties:

(f1) fX ≥ 0   (f2) ∫_R fX = 1   (f3) FX (t) = ∫_(−∞)^t fX (7.12)

A random variable (or distribution) which has a density is called absolutely continuous. This term comes from measure theory. We often simply abbreviate as continuous distribution.

Remarks
1. There is a technical mathematical description of the condition "spread smoothly with no point mass concentrations." And strictly speaking the integrals are Lebesgue integrals rather than the ordinary Riemann kind. But for practical cases, the two agree, so that we are free to use ordinary integration techniques.
2. By the fundamental theorem of calculus

fX (t) = F'X (t) at every point of continuity of fX (7.13)
In the literature on probability, it is customary to omit the indication of the region of integration when integrating over the whole line. Thus

∫ g (t) fX (t) dt = ∫_R g (t) fX (t) dt (7.14)

The first expression is not an indefinite integral. In many situations, fX will be zero outside an interval. Thus, the integrand effectively determines the region of integration.

3. Any integrable, nonnegative function f with ∫ f = 1 determines a distribution function F, which in turn determines a probability distribution. An argument based on the Quantile Function shows the existence of a random variable with that distribution.
4. If ∫ f ≠ 1, multiplication by the appropriate positive constant gives a suitable f.

A numerical illustration of properties (f2) and (f3) follows the figure below.

Figure 7.2: The Weibull density for α = 2, λ = 0.25.
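As a concrete check on these properties, the following minimal sketch (plain MATLAB, with our own variable names) verifies (f2) and (f3) numerically for the density fX (t) = 3t² on [0, 1], a density which reappears in Example 7.11 below. trapz and cumtrapz are the built-in trapezoidal integration routines.

t = linspace(0,1,1001);
f = 3*t.^2;                   % density on [0,1], zero elsewhere
total = trapz(t,f)            % (f2): total probability mass, approx 1
F = cumtrapz(t,f);            % (f3): F approximates FX(t) = t^3
disp([t(501) F(501)])         % F(0.5) approx 0.125 = 0.5^3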
Figure 7.3: The Weibull density for α = 10, λ = 0.001.

7.1.5 Some common absolutely continuous distributions

1. Uniform (a, b). Mass is spread uniformly on the interval [a, b]. It is immaterial whether or not the end points are included, since probability associated with each individual point is zero. The probability of any subinterval is proportional to the length of the subinterval. The probability of being in any two subintervals of the same length is the same. This distribution is used to model situations in which it is known that X takes on values in [a, b] but is equally likely to be in any subinterval of a given length. The density must be constant over the interval (zero outside), and the distribution function increases linearly with t in the interval. Thus,

fX (t) = 1/(b − a), a < t < b (zero outside the interval) (7.15)

The graph of FX rises linearly, with slope 1/(b − a), from zero at t = a to one at t = b.

2. Symmetric triangular (−a, a).

fX (t) = (a + t)/a² for −a ≤ t < 0, and fX (t) = (a − t)/a² for 0 ≤ t ≤ a

This distribution is used frequently in instructional numerical examples because probabilities can be obtained geometrically. It can be shifted, with a shift of the graph, to different sets of values. It appears naturally (in shifted form) as the distribution for the sum or difference of two independent random variables uniformly distributed on intervals of the same length. This fact is established with the use of the moment generating function (see Transform Methods). More generally, the density may have a triangular graph which is not symmetric.
central role in many aspects of applied probability theory. Much of its importance comes from the central limit theorem (CLT). Failure is due to some stress of external origin.17) X Because of this property. X > t) then units of time. this means that the average interarrival time is 1/3 hour or 20 minutes. Many devices have the property that they do not wear out. Determine P (120 < X ≤ 250). or Gaussian µ.[4~6]) p~~=~~0. Using the fact that areas of similar triangles are proportional to the square of any side. P (X > t + hX > t) = P (X > t + h) /P (X > t) = e−λ(t+h) /e−λt = e−λh = P (X > h) (7. If the t (i. each exponential (λ).168 CHAPTER 7. Example 7. λ). If we suppose these are independent. we have Exponential note that P (X > t) = 1 − FX (t) = Since the exponential distribution. σ 2 fX (t) = 1 √ exp σ 2π We generally indicate that a random variable X ∼ N µ. represents the time to failure (i. Many solid state electronic devices behave essentially in this way.. DISTRIBUTION AND DENSITY FUNCTIONS Example 7. as is usually the case.9846 5. ~p~=~gammadbn(10. σ 2 X −1 2 t−µ 2 σ ∀t The gaussian distribution plays a has the normal or gaussian distribution by writing . hours? What is the probability of ten or more arrivals in four hours? In six SOLUTION The time for ten arrivals is the sum of ten interarrival times. h > 0 implies X > t. 300). it is immaterial whether the end point of the intervals are included or not. 4. Use of Cauchy's equation (Appendix B) shows that the exponential distribution is the only continuous distribution with this property. α = n.7576~~~~0. This leads to an extremely important property of X > t + h. SOLUTION To get the area under the triangle between 120 and 250.855 2 (7. which is a term applied to a number . this property says that if the device survives to time the (conditional) probability it will survive of surviving for h h more units of time is the same as the original probability t = 0. As we show in the chapter Mathematical Expectation (Section 11. We −λt e t ≥ 0. Note that in the continuous case. X∼ symmetric triangular (100. Gamma distribution We have an mfunction (α.1). (λ) fX (t) = λe−λt t ≥ 0 (zero elsewhere). we have P =1− 1 2 2 (20/100) + (50/100) = 0.e. −λt Integration shows FX (t) = 1 − e t ≥ 0 (zero elsewhere). the times (in hours) between arrivals in a hospital emergency unit may be represented by a random quantity which is exponential (λ = 3).6: An arrival problem On a Saturday night. the life duration) of a device put into service at Suppose distribution is exponential. a random variable X ∼ X ∼ gamma gamma A relation to the Poisson distribution is described in Sec 7. λ) fX (t) = gammadbn λα tα−1 e−λt Γ(α) t≥0 (zero elsewhere) to determine values of the distribution function for functions shows that for the sum of (α. 3). then the time for ten arrivals is gamma (10.. we take one minus the area of the right triangles between 100 and 120 and between 250 and 300. λ) has the same distribution as n independent random variables.e.16) 3. the exponential distribution is often used in reliability problems. Use of moment generating (n. particularly in the area of statistics.5: Use of a triangular distribution Suppose Remark.3. once initial burn in tests have removed defective units. putting in the actual values for the parameters.5. Normal.
1). We need only have a table of the distribution function for use X ∼ N (0. The greater the parameter σ2 . fX (t) is symmetric about its maximum at the smaller the maximum value and the more slowly the curve decreases with distance from µ. so that we have t = 1.18) Note that the area to the left of t = −1. it is known that the distribution function. We may use the symmetry for any case. Thus. The graph of the density function is the well known bell shaped curve. where the quantity observed is an additive combination of a large number of essentially independent quantities. Note that Φ (t) for t > 0 to Φ (0) = 1/2. as the integral of the density function. The symmetry about the origin contributes to its usefulness.5 is the same as the area to the right of Φ (−2) = 1 − Φ (2).5. Since there are two parameters.19) This indicates that we need only a table of values of any t. respectively. be able to determine Φ (t) for . The parameter µ is called the σ. symmetrical about the origin (see Figure 7. 2 √1 e−t /2 2π Standardized normalφ (t) so that the distribution function is Φ (t) = t −∞ φ (u) du. Essentially. the positive square root of the variance.169 of theorems in analysis. the gaussian distribution appears naturally in such topics as theory of errors or theory of noise. usual procedure is to use tables obtained by numerical integration. We φ and Φ for the standardized normal density and distribution functions. Examination of the expression shows that the graph for t = µ. this raises the question whether a separate table is needed for each pair of parameters. the CLT shows that the distribution for the sum of a suciently large number of independent random variables has approximately the gaussian distribution. The σ2 While we have an explicit formula for the density function.4). The same is true for any t. It is a remarkable fact that this is not the case. so that Φ (−t) = 1 − Φ (t) ∀ t (7. is a measure of the spread of mass about The parameter µ. is mean value and σ 2 is the called the standard deviation. = This is refered to as the standardized normal distribution. cannot be expressed in terms of elementary functions. Thus parameter µ locates the center of the mass distribution and variance. P (X ≤ t) = Φ (t) = area under the curve to the left of t (7.
Figure 7.4: The standardized normal distribution.

Example 7.7: Standardized normal calculations
Suppose X ∼ N (0, 1). Determine P (−1 ≤ X ≤ 2) and P (|X| > 1).
SOLUTION
1. P (−1 ≤ X ≤ 2) = Φ (2) − Φ (−1) = Φ (2) − [1 − Φ (1)] = Φ (2) + Φ (1) − 1
2. P (|X| > 1) = P (X > 1) + P (X < −1) = 1 − Φ (1) + Φ (−1) = 2 [1 − Φ (1)]

From a table of the standardized normal distribution function (see Appendix D (Section 17.4)) we find Φ (2) = 0.9772 and Φ (1) = 0.8413, which gives P (−1 ≤ X ≤ 2) = 0.8185 and P (|X| > 1) = 0.3174.

General gaussian distribution. For X ∼ N (µ, σ²), the density maintains the bell shape, but is shifted with different spread and height. Figure 7.5 shows the distribution function and density function for X ∼ N (2, 0.1). The density is centered about t = 2. It has height 1.2616, as compared with 0.3989 for the standardized normal density. Inspection shows that the graph is narrower than that for the standardized normal. The distribution function reaches 0.5 at the mean value 2.
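Although Φ has no elementary closed form, it can be evaluated without tables in plain MATLAB through the built-in error function, using the standard identity Φ (t) = (1 + erf (t/√2))/2. The anonymous function Phi below is our own, not one of the chapter's m-functions; the sketch reproduces the results of Example 7.7:

Phi = @(t) (1 + erf(t/sqrt(2)))/2;   % standardized normal distribution function
P1 = Phi(2) + Phi(1) - 1             % P(-1 <= X <= 2), approx 0.8186
P2 = 2*(1 - Phi(1))                  % P(|X| > 1),      approx 0.3173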
.21) FX (t) = −∞ φ (u) du = Φ t−µ σ (7. 16) P (X − 3 > 4).5: The normal density and distribution functions for X ∼ N (2.8: General gaussian calculation X ∼ N (3. 2. µ = 3 and σ 2 = 16)..20) Make the change of variable and corresponding formal changes u= to get x−µ 1 t−µ du = dx x = t ∼ u = σ σ σ (t−µ)/σ (7.171 Figure 7.e.7 (Standardized normal calculations) We have mfunctions gaussian and gaussdensity to calculate values of the distribution and density function for any reasonable value of the parameters. FX (t) = 1 √ σ 2π t exp − −∞ 1 x−µ 2 σ 2 t dx = −∞ φ x−µ σ 1 dx σ (7.8185 4 4 P (X − 3 < − 4)+P (X − 3 > 4) = FX (−1)+[1 − FX (7)] = Φ (−1)+1−Φ (1) = 0. 0.1). Determine P (−1 ≤ X ≤ 11) and FX (11) − FX (−1) = Φ 11−3 − Φ −1−3 = Φ (2) − Φ (−1) = 0. (i. Suppose SOLUTION 1. A change of variables in the integral shows that the table for standardized normal distribution function can be used for any case.22) Example 7.3174 In each case the problem reduces to that in Example 7.
We have m-functions gaussian and gaussdensity to calculate values of the distribution and density function for any reasonable value of the parameters. The following are solutions of Example 7.7 (Standardized normal calculations) and Example 7.8 (General gaussian calculation), using the m-function gaussian.

Example 7.9: Example 7.7 and Example 7.8 (continued)

P1 = gaussian(0,1,2) - gaussian(0,1,-1)
P1 = 0.8186
P2 = 2*(1 - gaussian(0,1,1))
P2 = 0.3173
P1 = gaussian(3,16,11) - gaussian(3,16,-1)
P1 = 0.8186
P2 = gaussian(3,16,-1) + 1 - gaussian(3,16,7)
P2 = 0.3173

The differences in these results and those above (which used tables) are due to the roundoff to four places in the tables.

6. Beta (r, s), r > 0, s > 0. fX (t) = (Γ(r + s)/(Γ(r)Γ(s))) t^(r−1) (1 − t)^(s−1), 0 < t < 1. Analysis is based on the integrals

∫_0^1 u^(r−1) (1 − u)^(s−1) du = Γ(r)Γ(s)/Γ(r + s), with Γ(t + 1) = tΓ(t) (7.23)

Figure 7.6 and Figure 7.7 show graphs of the densities for various values of r, s. The usefulness comes in approximating densities on the unit interval. By using scaling and shifting, these can be extended to other intervals. The special case r = s = 1 gives the uniform distribution on the unit interval. The Beta distribution is quite useful in developing the Bayesian statistics for the problem of sampling to determine a population proportion. If r, s are integers, the density function is a polynomial. For the general case we have two m-functions, beta and betadbn, to perform the calculations.
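Since the density is a polynomial for integer r, s, spot checks of (7.23) are easy in plain MATLAB. A minimal sketch (gamma here is MATLAB's built-in gamma function, not the gamma distribution; the parameter values are one of the cases plotted in Figure 7.6):

r = 2; s = 10;
t = linspace(0,1,1001);
f = gamma(r+s)/(gamma(r)*gamma(s))*t.^(r-1).*(1-t).^(s-1);
total = trapz(t,f)                 % approx 1, consistent with (7.23)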
Figure 7.6: The Beta(r,s) density for r = 2; s = 1, 2, 10.
Figure 7.7: The Beta(r,s) density for r = 5; s = 2, 5, 10.

7. Weibull (α, λ, ν). FX (t) = 1 − e^(−λ(t−ν)^α), α > 0, λ > 0, ν ≥ 0, t ≥ ν

The parameter ν is a shift parameter. Usually we assume ν = 0. Examination shows that for α = 1 the distribution is exponential (λ). The parameter α provides a distortion of the time scale for the exponential distribution. Figure 7.2 and Figure 7.3 show graphs of the Weibull density for some representative values of α and λ (ν = 0). The distribution is used in reliability theory. We do not make much use of it. However, we have m-functions weibull (density) and weibulld (distribution function) for shift parameter ν = 0 only. The shift can be obtained by subtracting a constant from the t values.

7.2 Distribution Approximations2

7.2.1 Binomial, Poisson, gamma, and Gaussian distributions

The Poisson approximation to the binomial distribution
The following approximation is a classical one. We wish to show that for small p and sufficiently large n

P (X = k) = C (n, k) p^k (1 − p)^(n−k) ≈ e^(−np) (np)^k / k! (7.24)

Suppose p = µ/n with n large and µ/n < 1. Then,

P (X = k) = C (n, k) (µ/n)^k (1 − µ/n)^(n−k) = [n (n − 1) ··· (n − k + 1) / n^k] (1 − µ/n)^(−k) (1 − µ/n)^n (µ^k / k!) (7.25)

2This content is available online at <http://cnx.org/content/m23313/1.7/>.
The first factor in the last expression is the ratio of polynomials in n of the same degree k, which must approach one as n becomes large. The second factor approaches one as n becomes large. According to a well known property of the exponential,

(1 − µ/n)^n → e^(−µ) as n → ∞ (7.26)

The result is that for large n, P (X = k) ≈ e^(−µ) µ^k / k!, where µ = np.

The Poisson and gamma distributions
Suppose X ∼ gamma (α, λ). Then

P (X ≤ t) = (λ^α / Γ (α)) ∫_0^t x^(α−1) e^(−λx) dx = (1/Γ (α)) ∫_0^t (λx)^(α−1) e^(−λx) d (λx) (7.27)

= (1/Γ (α)) ∫_0^(λt) u^(α−1) e^(−u) du (7.28)

A well known definite integral, obtained by integration by parts, is

∫_a^∞ t^(n−1) e^(−t) dt = Γ (n) e^(−a) Σ(k=0 to n−1) a^k / k!, with Γ (n) = (n − 1)! (7.29)

Noting that 1 = e^(−a) e^a = e^(−a) Σ(k=0 to ∞) a^k / k!, we find after some simple algebra that

(1/Γ (n)) ∫_0^a t^(n−1) e^(−t) dt = e^(−a) Σ(k=n to ∞) a^k / k! (7.30)

For a = λt and α = n, we have the following equality iff X ∼ gamma (n, λ):

P (X ≤ t) = (1/Γ (n)) ∫_0^(λt) u^(n−1) e^(−u) du = e^(−λt) Σ(k=n to ∞) (λt)^k / k! (7.31)

Now

P (Y ≥ n) = e^(−λt) Σ(k=n to ∞) (λt)^k / k! iff Y ∼ Poisson (λt) (7.32)
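The equality of (7.31) and (7.32) can be checked numerically against Example 7.6: the tenth arrival occurs by time t iff there are ten or more arrivals in [0, t]. A minimal sketch, assuming the m-functions gammadbn and cpoisson behave as described above:

t = [4 6];                                % the two times in Example 7.6
p1 = gammadbn(10,3,t)                     % P(tenth arrival by time t)
p2 = [cpoisson(3*t(1),10) cpoisson(3*t(2),10)]  % P(10 or more arrivals), Poisson(3t)
% both should display  0.7576    0.9846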
Now if X ∼ Poisson (µ). take the area under Use of mprocedures to compare We have two mprocedures to make the comparisons. It is apparent that the the probability X ≤ k . may be considered the sum of (µ/n). n = 300. it turns out that the best gaussian approximaton is obtained by making a continuity correction. the probability at t=k is represented by a rectangle of height pk and unit width. To get an approximation to a density for an integervalued random variable. To approximate the curve from k + 1/2. Since the binomial and Poisson distributions are integervalued.176 CHAPTER 7. each Poisson X is approximately It is generally best to compare distribution functions. First. Use of the generating function shows that the sum of independent Poisson random variables is Poisson.8: Gaussian approximation to the binomial. this is called the continuity correction. then X N (µ. p = 0. with k as the midpoint. Figure 1 shows a plot of the density and the corresponding gaussian density for gaussian density is oset by approximately 1/2. we consider approximation of the . µ). Since the mean value and the variance are both µ. it is reasonable to suppose that suppose that n independent random variables.1. DISTRIBUTION AND DENSITY FUNCTIONS Figure 7.
Figure 7.9: Gaussian approximation to the Poisson distribution function, µ = 10.
Poisson (µ) distribution. The m-procedure poissapp calls for a value of µ, selects a suitable range about k = µ, and plots the distribution function for the Poisson distribution (stairs) and the normal (gaussian) distribution (dash-dot) for N (µ, µ). In addition, the continuity correction is applied to the gaussian distribution at integer values (circles). Figure 7.9 shows plots for µ = 10. It is clear that the continuity correction provides a much better approximation. The plots in Figure 7.10 are for µ = 100. Here the continuity correction provides the better approximation, but not by as much as for the smaller µ.

Figure 7.10: Gaussian approximation to the Poisson distribution function, µ = 100.
.03.11: Poisson and Gaussian approximation to the binomial: n = 1000.179 Figure 7. p = 0.
Calculations based on this distribution approximate corresponding calculations on the continuous distribution. In this case. agreement of all three distribution functions is evident. Figure 7. A continuous random variable is characterized by a set of possible values spread continuously over an interval or collection of intervals. the probability is also spread smoothly. the Poisson distribution does not track very well. The diculty.1). p = 0.12: Poisson and Gaussian approximation to the binomial: n = 50. DISTRIBUTION AND DENSITY FUNCTIONS Figure 7. .12 shows plots for n = 1000. This describes the distribution of the random variable and makes possible calculations of event probabilities and parameters for the distribution. In the unit Random Variables (Section 6. A simple approximation is obtained by subdividing an interval which includes the range (the set of possible values) into small enough subintervals that the density is approximately constant over each subinterval. gaussian. However. selects suitable bincomp compares the binomial. of n The mprocedure and p. p = 0.180 CHAPTER 7. and Poisson distributions.03. we show how a simple random variable is determined by the set of points on the real line representing the possible values and the corresponding set of probabilities that each of these values is taken on. p = 0.2 Approximation of a real random variable by simple random variables Simple random variables play a signicant role.6. is the dierence in variances for the binomial as compared with np for the Poisson. whose value at any point indicates "the probability per unit length" near the point. subinterval. It calls for values k values.6. as we see in the unit Variance (Section 12. and continuity adjusted values of the gaussian distribution function at the integer values. and plots the distribution function for the binomial. The distribution is described by a probability density function. A point in each subinterval is selected and is assigned the probability mass in its The combination of the selected points and the corresponding probabilities describes the dis tribution of an approximating simple random variable. There is npq still good agreement of the binomial and adjusted gaussian. The good n = 50. a continuous Figure 7. both in theory and applications.11 shows plots for approximation to the distribution function for the Poisson. 7.2.1).
is negligible. Experiment indicates that n = µ + 6√µ (i.e., six standard deviations beyond the mean) is a reasonable value for 5 ≤ µ ≤ 200.

mu = [5 10 20 30 40 50 70 100 150 200];
K = zeros(1,length(mu));
p = zeros(1,length(mu));
for i = 1:length(mu)
    K(i) = floor(mu(i) + 6*sqrt(mu(i)));
    p(i) = cpoisson(mu(i),K(i));
end
disp([mu;K;p*1e6]')       % K is the value of k needed to achieve
                          % the residual shown; residual probabilities
                          % are 1e-6 times the numbers in the last column
    5.0000   18.0000    0.4540
   10.0000   28.0000    0.4163
   20.0000   46.0000    0.2535
   30.0000   62.0000    0.2140
   40.0000   77.0000    0.1354
   50.0000   92.0000    0.0668
   70.0000  120.0000    0.0359
  100.0000  160.0000    0.0205
  150.0000  223.0000    0.0159
  200.0000  284.0000    0.0133

An m-procedure for discrete approximation
If X is bounded, absolutely continuous with density function fX, the m-procedure tappr sets up the distribution for an approximating simple random variable. An interval containing the range of X is divided into a specified number of equal subdivisions. The probability mass for each subinterval is assigned to the midpoint. If dx is the length of the subintervals, then the integral of the density function over the subinterval is approximated by fX (ti) dx, where ti is the midpoint. In effect, the graph of the density over the subinterval is approximated by a rectangle of length dx and height fX (ti). Once the approximating simple distribution is established, calculations are carried out as for simple random variables.

Example 7.11: A numerical example
Suppose fX (t) = 3t², 0 ≤ t ≤ 1. Determine P (0.2 ≤ X ≤ 0.9).
SOLUTION
In this case, an analytical solution is easy. FX (t) = t³ on the interval [0, 1], so

P = 0.9³ − 0.2³ = 0.729 − 0.008 = 0.721

We use tappr as follows:

tappr
Enter matrix [a b] of x-range endpoints  [0 1]
Enter number of x approximation points  200
Enter density as a function of t  3*t.^2
Use row matrices X and PX as in the simple case
M = (X >= 0.2)&(X <= 0.9);
p = M*PX'
p = 0.7210

Because of the regularity of the density and the number of approximation points, the result agrees quite well with the theoretical value.
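The core of this midpoint scheme is only a few lines of plain MATLAB. The following sketch reproduces the calculation of Example 7.11 without the tappr procedure; the variable names and the normalization step are our own, not part of the published toolbox.

a = 0; b = 1; n = 200;
dx = (b - a)/n;
X = a + dx/2:dx:b - dx/2;     % midpoints of the n subintervals
PX = 3*X.^2*dx;               % approximate mass fX(ti)*dx in each subinterval
PX = PX/sum(PX);              % normalize so total probability is one
M = (X >= 0.2)&(X <= 0.9);
p = M*PX'                     % approx 0.7210 (theoretical 0.721)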
12 (Radial tire mileage).1910 % Theoretical value = (2/3)exp(5/4) = 0..191003 cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See Figure~7. However. Figure 7. (b/a)*exp(k*(at)). Example 7.*(t >= 40000) Use row matrices X and PX as in the simple case P = (X >= 45000)*PX' P = 0. In particular. DISTRIBUTION AND DENSITY FUNCTIONS The next example is a more complex one. it is easy to determine a bound beyond which the probability is negligible.14 for plot .^2/a^3). b = 20/3. the distribution is not bounded. 000). Determine P (X ≥ 45.12: Radial tire mileage The life (in miles) of a certain brand of radial tires may be represented by a random variable with density X fX (t) = { where t2 /a3 (b/a) e −k(t−a) for 0≤t<a a≤t (7. % Test shows cutoff point of 80000 should be satisfactory tappr Enter matrix [a b] of xrange endpoints [0 80000] Enter number of x approximation points 80000/20 Enter density as a function of t (t.*(t < 40000) + .182 CHAPTER 7.34) for a = 40. k = 1/4000.13: Distribution function for Example 7.. a = 40000. 000. b = 20/3. and k = 1/4000.
This random variable is in canonical form. ti−1 ≤ si < ti . we form a simple random variable Xs as follows. Suppose I is partitioned into n subintervals by points ti . we use a rather large number of approximation points. tn ] (see Figure 7. one having a nite set of possible values). Then the Ei form a partition of the basic space Ω. For the unbounded case. ti ) be the i th subinterval. then X (ω) ∈ Mi and Xs (ω) = si .14: Partition of the interval I including the range of X Figure 7. X may map into any point in the interval. designating a large number of approximating points usually causes no computer memory problem. in which the range of X is limited to a bounded interval Ei = X −1 (Mi ) be the set of points mapped into Mi by X. 1 ≤ i ≤ n − 1 and Mn = [tn−1 . absolute value of the dierence satises If ω ∈ Ei .35) . Let Mi = [ti−1 . b]. As a consequence. 1 ≤ i ≤ n − 1. and hence into any point in each subinterval Mi . Now the X (ω) − Xs (ω)  < ti − ti−1 the length of subinterval Mi (7. For the given subdivision. Consider the simple random variable Xs = i=1 si IEi .e.183 In this case. pick a n point si . In the singlevariable case. Now random variable We limit our discussion to the bounded case. Let Figure 7.. In each subinterval. The general approximation procedure We show now that any bounded real random variable may be approximated as closely as desired by a simple random variable (i. I = [a. the approximation is close except in a portion of the range having arbitrarily small total probability. with a = t0 and b = tn . the results are quite accurate.14).15: Renement of the partition by additional subdividion points.
While the choice of the si is arbitrary in each Mi, the selection of si = ti−1 (the left-hand endpoint) leads to the property Xs (ω) ≤ X (ω) ∀ ω (7.36). In this case, if we add subdivision points to decrease the size of some or all of the Mi, the new simple approximation Ys satisfies

Xs (ω) ≤ Ys (ω) ≤ X (ω) ∀ ω (7.37)

To see this, consider t*i ∈ Mi (see Figure 7.15). Mi is partitioned into Mi' and Mi'', and Ei is partitioned into Ei' and Ei''. X maps Ei' into Mi' and Ei'' into Mi''. Ys maps Ei' into ti and maps Ei'' into t*i > ti. Xs maps both Ei' and Ei'' into ti. Since this is true for each ω and the corresponding subinterval, the asserted inequality must hold for each ω. By taking a sequence of partitions in which each succeeding partition refines the previous (i.e., adds subdivision points) in such a way that the maximum length of subinterval goes to zero, we may form a nondecreasing sequence of simple random variables Xn which increase to X for each ω.

The latter result may be extended to random variables unbounded above. Simply let the N-th set of subdivision points extend from a to N, making the last subinterval [N, ∞). Subintervals from a to N are made increasingly shorter. The result is a nondecreasing sequence {XN : 1 ≤ N} of simple random variables, with XN (ω) → X (ω) as N → ∞, for each ω ∈ Ω. For probability calculations, we simply select an interval I large enough that the probability outside I is negligible and use a simple approximation over I.

7.3 Problems on Distribution and Density Functions3

Exercise 7.1 (Solution on p. 189.)
(See Exercise 6 (Exercise 6.6) from "Problems on Random Variables and Probabilities".) A store has eight items for sale. The prices are $3.50, $5.00, $3.50, $7.50, $5.00, $5.00, $3.50, and $7.50, respectively. A customer comes in. She purchases one of the items with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the amount of her purchase may be written

X = 3.5IC1 + 5.0IC2 + 3.5IC3 + 7.5IC4 + 5.0IC5 + 5.0IC6 + 3.5IC7 + 7.5IC8 (7.38)

Determine and plot the distribution function FX.

Exercise 7.2 (Solution on p. 189.)
(See Exercises 3 (Exercise 6.3) and 4 (Exercise 6.4) from "Problems on Random Variables and Probabilities".) The class {Cj : 1 ≤ j ≤ 10} is a partition. Random variable X has values {1, 3, 2, 3, 4, 2, 1, 3, 5, 2} on C1 through C10, respectively, with probabilities 0.08, 0.13, 0.06, 0.09, 0.14, 0.11, 0.12, 0.07, 0.11, 0.09. Determine and plot the distribution function for X.

Exercise 7.3 (Solution on p. 189.)
(See Exercise 12 (Exercise 6.12) from "Problems on Random Variables and Probabilities".) The class {A, B, C, D} has minterm probabilities

pm = 0.001 * [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] (7.39)

Determine and plot the distribution function for the random variable X = IA + IB + IC + ID, which counts the number of the events which occur on a trial.

3This content is available online at <http://cnx.org/content/m24209/1.4/>.
Exercise 7.4 (Solution on p. 189.)
Suppose a is a ten digit number. A wheel turns up the digits 0 through 9 with equal probability on each spin. On ten spins what is the probability of matching, in order, k or more of the ten digits of a, 0 ≤ k ≤ 10? Assume the initial digit may be zero.

Exercise 7.5 (Solution on p. 189.)
In a thunderstorm in a national park there are 127 lightning strikes. Experience shows that the probability of a lightning strike starting a fire is about 0.0017. What is the probability that k fires are started, k = 0, 1, 2, 3?

Exercise 7.6 (Solution on p. 189.)
A single six-sided die is rolled repeatedly until either a one or a six turns up. What is the probability that the first appearance of either of these numbers is achieved by the fifth trial or sooner?

Exercise 7.7 (Solution on p. 189.)
A residential College plans to raise money by selling chances on a board. Fifty chances are sold. A player pays $10 to play; he or she wins $30 with probability p = 0.2. The profit to the College is

X = 50 · 10 − 30N, where N is the number of winners (7.40)

Determine the distribution for X and calculate P (X > 0), P (X ≥ 200), and P (X ≥ 300).

Exercise 7.8 (Solution on p. 189.)
Two coins are flipped twenty times. What is the probability the results match (both heads or both tails) k times, 0 ≤ k ≤ 20?

Exercise 7.9 (Solution on p. 189.)
Thirty members of a class each flip a coin ten times. What is the probability that at least five of them get seven or more heads?

Exercise 7.10 (Solution on p. 189.)
A manufacturing plant has 350 special lamps on its production lines. On any day, each lamp could fail with probability p = 0.0017. These lamps are critical, and must be replaced as quickly as possible. It takes about one hour to replace a lamp, once it has failed. What is the probability that on any day the loss of production time due to lamp failures is k or fewer hours, k = 0, 1, 2, 3, 4, 5?

Exercise 7.11 (Solution on p. 189.)
For the system in Exercise 7.10, call a day in which one or more failures occur among the 350 lamps a "service day." Since a Bernoulli sequence "starts over" at any time, the sequence of service/nonservice days may be considered a Bernoulli sequence with probability p1, the probability of one or more lamp failures in a day.

a. Beginning on a Monday morning, what is the probability the first service day is the first, second, third, fourth, fifth day of the week?
b. What is the probability of no service days in a seven day week?

Exercise 7.12 (Solution on p. 190.)
For the system in Exercise 7.10, assume the plant works seven days a week. What is the probability the third service day occurs by the end of 10 days? Solve using the negative binomial distribution; repeat using the binomial distribution.

Exercise 7.13 (Solution on p. 190.)
Two hundred persons buy tickets for a drawing. Each ticket has probability 0.008 of winning. What is the probability of k or fewer winners, k = 2, 3, 4?

Exercise 7.14 (Solution on p. 190.)
Consider a Bernoulli sequence with probability p = 0.53 of success on any component trial. The probability the fourth success will occur no later than the tenth trial is determined by the negative binomial distribution. Use the procedure nbinom to calculate this probability, then calculate the same probability using the binomial distribution.
0. Exercise 7.18 Suppose (Solution on p. P (X ≥ 2/λ). Suppose the unit has operated successfully for 500 hours. Do this Compare P (Y ≤ k) for X ∼ binomial(5000. What is the (conditional) probability it will operate for another 500 hours? Exercise 7. what is the probability the third satisfactory unit will be found on six or fewer trials? Exercise 7. of a televesion set sub ject to random voltage surges has the exponential (0.) Items are selected Fifty percent of the components coming o an assembly line fail to meet specications for a special It is desired to select three units which meet the stringent specications.002) distribution.) The number of cars passing a certain trac count position in an hour has Poisson (53) distribution. The times to failure This means the expected life is 5000 hours.) and P (X ≤ k) 0 ≤ k ≤ 10.0002).) X∼ binomial (1000. 190. . Exercise 7. inclusive? What is the probability of no more than ten? Exercise 7. 10.375). determine P (X ≥ 1/λ). X ∼ Y ∼ Poisson (4. 15. DISTRIBUTION AND DENSITY FUNCTIONS a. will survive for 5000 hours. Use the procedure nbinom to calculate this probability . 0.19 (Solution on p.16 (Solution on p. 8.) X∼ exponential (λ). Determine the probabilities that at least k. 191. a. for k = 5.24 Twenty identical units are put into operation.) The number of customers arriving in a small specialty store in an hour is a random quantity having Poisson (5) distribution. in hours of operating time. Then use the mprocedure bincomp to obtain graphical results (including a comparison with the normal distribution). 0. Exercise 7.5).20 (Solution on p. (Solution on p. 191. Use the Poisson distribution to determine P (T ≤ 100. Exercise 7.22 (Solution on p. b. What is the probability the number of cars passing in an hour lies between 45 and 55 (inclusive)? What is the probability of more than 55? Exercise 7.1). 0. for directly with ibinom and ipoisson. The probability the fourth success will occur no later than the tenth trial is determined by the negative binomial distribution.17 (Solution on p.) The time to failure. 190. Calculate this probability using the binomial distribution. 191.) T ∼ gamma (20. and Z ∼ exponential (1/4.23 For (Solution on p.001) and Y ∼ Poisson (5).) They fail independently. 000). 191. random variable. 191. What is the probability of having at least 10 pulses in an hour? What is the probability of having at most 15 pulses in an hour? Exercise 7.21 Random variable (Solution on p.15 job. calculate and tabulate the probability of a value at least k. P (T ≤ 100. Determine P (X ≥ 80) .0002) be the total operating time for the units described in Exercise 7. 190. and tested in succession.186 CHAPTER 7. (Solution on p.24. Use the appropriate Poisson distribution to approximate these values. exponential (0. For each for integer values 3 ≤ k ≤ 8. Under the usual assumptions for Bernoulli trials. 000). P (X ≥ 120) b. What is the probability the number arriving in an hour will be between three and seven. a.25 Let (Solution on p. 191. Use the mfunction for the gamma distribution to determine b.5). Exercise 7. 191. 12. Exercise 7. (in hours) form an iid class. P (X ≥ 100) .) The number of noise pulses arriving on a power circuit in an hour is a random quantity having Poisson (7) distribution. 191.) binomial (12.
3). Exercise 7. determine (Solution on p.) Customers arrive at a service center with independent interarrival times in hours. Calculate the probabilities in part (a) with the mfunction gaussian. 193.) A random number generator produces a sequence of numbers between 0 and 1. utilize the relation of the gamma to the Poisson distribution to determine P (X ≤ 30).187 Exercise 7.05 of the mean value µ. 0. is made each fteen days. X (Solution on p. 191. 192. 193. 64). µ = 5 and σ 2 = 81.1). a table of That is.01).30 (Solution on p. 81).15). 1]. a. That is.) Exercise 7. exponential ( 0. X (Solution on p.28 exponential (3) distribution. P (1 < X < 9) and P (X − 3 ≤ 4). in days.26 (Solution on p. Do not make a discrete adjustment. Use X (Solution on p. What is the distribution for the total time present for the six calls? Use an appropriate Poisson distribution to determine Z from the P (Z ≤ 20). µ = 3 and σ 2 = 64. This means the time X λ= for the arrival of the fourth message is gamma(4. which have X for the third arrival is thus gamma (3. The units fail independently.) has gaussian distribution with of standardized normal distribution to determine results with the mfunction gaussian. 0.) A maintenance check Five identical electronic devices are installed at one time.) has gaussian distribution with mean µ = 4 and variance standardized normal distribution to determine P (2 < X < 8) and P (X − 4 ≤ 5).1). That is. Without using tables or mprograms. (Solution on p. can be considered an observed value of a random variable uniformly distributed on the interval [0. What is the probability that three or Items coming o an assembly line have a critical dimension which is represented by a random ∼ N(10. Each of these Exercise 7.) Ten items are selected at random.32 Suppose X ∼ N (4. exponential (1/3). 0. They assume their values independently. 192. Without using tables or mprograms. Without using P (X ≤ 2). Use a table Check your Exercise 7. The time tables or mprograms. P (3 < X < 9) and P (X − 5 ≤ 5). 193. Exercise 7. in seconds per month. 192. more are within 0. 192. that is normally distributed with .33 Suppose X ∼ N (5.35 variable (Solution on p. A sequence of 35 numbers is generated.) has gaussian distribution with of standardized normal distribution to determine results using the mfunction gaussian.) Exercise 7. maintenance check? (Solution on p. b.34 Suppose X ∼ N (3. currently in use by a sixth person.29 Five people wait to use a telephone.36 The result of extensive quality control sampling shows that a certain model of digital watches coming o a production line have accuracy. 192.31 time to failure.) Exercise 7. What is the probability 25 or more are less than or equal to 0. Suppose time for the six calls (in minutes) are iid.71? (Assume continuity. 192. 81).) Exercise 7. (Solution on p. σ 2 = 81. (Solution on p. and the What is the probability that at least four are still operating at the Exercise 7. of each is a random variable exponential (1/30).27 Interarrival times (in minutes) for fax messages on a terminal are independent.) The sum of the times to failure for ve independent units is a random variable X ∼ gamma (5. determine P (X ≤ 25). 192. Use a table Check your Exercise 7.
38 values of (Solution on p.5).01 to 0. a watch must have an accuracy within the range of 5 to +10 seconds per month.5).5 ≤ X < 1.188 CHAPTER 7.5).) has density 3 fX (t) = 2 t2 .2] (t) (2 − t) 5 5 (7. c. Exercise 7. Use the mprocedures tappr and cdbn to plot an approximation to the distribution function.7. P (X − 1 < 1/4). Use the mprocedures tappr and cdbn to plot an approximation to the distribution function. Determine P (X ≤ 0.41 Random variable X (Solution on p. Determine an expression for the distribution function. What is the probability a watch taken from the production line to be tested will achieve top grade? mfunction gaussian. µ from 10 to 500.) has density function fX (t) = { (6/5) t2 (6/5) (2 − t) for for 0≤t≤1 1<t≤2 6 6 = I [0. 194. 194. b.5). −1≤t≤1 (and zero elsewhere).5 ≤ X < 1. Exercise 7. using a standardized normal table. b.) from 10 to 500 and p from 0. 193. 0 ≤ t ≤ 2 (and zero elsewhere). c. Calculate. Determine an expression for the distribution function. b. 193. a. c. P (X > 0. P (X − 1 < 1/4).) has density function 3 fX (t) = t − 8 t2 . To achieve a top grade. P (0. Determine P (−0. 193. Determine an expression for the distribution function.5). . Use the mprocedures tappr and cdbn to plot an approximation to the distribution function. Exercise 7. to observe the approximation of the binomial distribution by the Poisson. Exercise 7.5).41) a.37 Use the mprocedure bincomp with various values of n (Solution on p.25 ≤ 0.5 ≤ X < 0. DISTRIBUTION AND DENSITY FUNCTIONS µ=5 and σ 2 = 300. 8). Check with the Exercise 7.40 Random variable X (Solution on p. P (X − 0. Determine P (X ≤ 0. P (0. a.39 Random variable X (Solution on p. 1] (t) t2 + I(1.) Use various Use the mprocedure poissapp to compare the Poisson and gaussian distributions.
[X.7838 0.1 (p.01*[8 13 6 9 14 11 12 7 11 9].PX] = csort(T.7) = 0. pc = 0.8799 0.1719 P = cbinom(30.(1 . . 0 : 10). % Alternate solution.3 (p.7 (p.p. 184) % See MATLAB plot npr06_12 (Section~17. Solution to Exercise 7.5.1945 0.pc).5 5 3.3470 0.9768 Solution to Exercise 7.0017)^350 = 0.0.0083. pc = 0.8.1. [X.6 (p.9968 0. 185) p1 = 1 .4 (p.cbinom(350. 0. 184) T = [1 3 2 3 4 2 1 3 5 2]. 185) P = cbinom (10.cbinom(200.28: npr06_12) Minterm probabilities in pm.5 7.5 7.6052 Solution to Exercise 7. ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot Solution to Exercise 7.2 (p.0.5].0. ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX Solution to Exercise 7.9775 0. See Exercise 12 (Exercise~6.4487 k = 1:5.01*[10 15 15 20 10 5 10 15].9220 0. 185) P = ibinom(20.9 (p.0.1/2.008.0.189 Solutions to Exercises in Chapter 7 Solution to Exercise 7.PX] = csort(T.5) = 0.5513 0.5 5 5 3.5 (p. ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX Solution to Exercise 7.pc).9996 1. (prob given day is a service day) a. 185) P = 1 .0017.pm). coefficients in c T = sum(mintable(4)).10 (p.0000 Solution to Exercise 7.PX] = csort(T.0678 = 0.1:6) P = ibinom(127.8 (p. 185) P = 1 .3688 0.12) from "Problems on Random Va [X.0:20) Solution to Exercise 7. 184) % See MATLAB plot T = [3.3:5) = 0. 185) p = cbinom(10.0:3) P = 0. 185) Solution to Exercise 7.
0. 186) P = cbinom(6. P = sum(nbinom(4.p1.8666 8.8990 • Pa = cbinom(10.3:10)) = 0.3) = 0.53.30*N.14 (p.4:10)) = 0.1034 Solution to Exercise 7. PN = ibinom(50.^(k1) = 0.0.0000 0. 186) k = 0:10.16 (p.7623 0.(2/3)^5 = 0.0000 0.2649 0. Pb = 1 .0. P0 = (1 .6562 Solution to Exercise 7.0.p1)^7 = 0.2.0000 0.0000 0.56) = 0.4405 5.1245 0.3) = 0.p1).4) = 0.3581 Solution to Exercise 7.4404 0.cpoisson(53. Pp = 1 .0:50).11 (p.0067 0.0000 0.0414 b.6160 0. 185) N = 0:50.0000 0.0.1247 3.4487 0.001.2650 4.8729 Solution to Exercise 7.0000 0.9320 0.0000 0.9864 0.8683 Solution to Exercise 7.12 (p.5836 P300 = (X>=300)*PN' P300 = 0.p1.5. DISTRIBUTION AND DENSITY FUNCTIONS P = p1*(1 .190 CHAPTER 7. 185) a.0000 0.0. b.45) .7622 7.2474 0.9856 P200 = (X>=200)*PN' P200 = 0.17 (p.cpoisson(5. 186) P1 = cpoisson(53.8667 0. 185) P = 1 .0155 Solution to Exercise 7.k+1).56) = 0.53.4487 • P = sum(nbinom(3.8990 Solution to Exercise 7. 185) p1 = 1 .8729 Pa = cbinom(10.9863 .k+1).13 (p.Pb.cbinom(5000.0067 1. disp([k.Pp]') 0 0.0404 0.5224 P2 = cpoisson(53.1364 0.0017)^350 = 0.0404 2. Ppos = (X>0)*PN' Ppos = 0.9319 9.(1 .0752 0.9682 0. X = 500 .9682 10.6160 6.15 (p.0000 0.
4679 6. cbinom(20.20) = 0. Pz = exp(k/4.100000) = 0.1695 P2 = 1 .0000 0.0000 0.4111 0.0294 0..4655 0.k).2709 0.7176 0. 186) k = [80 100 120].1689 8.25 (p.3) .8) = 0.0220 0.5297 P2 = cpoisson(0.0002*5000) 0.0.1178 0.5134 0.stairs Poisson.3679 [5 8 10 12 15]. 186) P (X > kλ) = e−λk/λ = e−k Solution to Exercise 7.cpoisson(5.0006 Solution to Exercise 7.. 186) k = 3:8.5. 186) P (X > 500 + 500X > 500) = P (X > 500) = e−0.3679 Solution to Exercise 7.18 (p.Py.2971 7. 186) P1 = gammadbn(20.9110 0.0390 0.11) = 0.0000 0. Px = cbinom(12. disp([k.cpoisson(7.p.16) = 0.8865 0.0002*100000.0866 Solution to Exercise 7.5133 0.k) P = 0.10) = 0.5297 .5154 P1 = cpoisson(100.1601 0.7420 P2 = 1 .21 (p.9867 0. P = cbinom(1000.5).002·500 = 0. 186) 0.k).1690 P1 = cpoisson(7.22 (p.2111 0.0.191 bincomp Enter the parameter n 5000 Enter the parameter p 0.3292 0.23 (p.9976 Solution to Exercise 7.0002.0000 0. 186) p k P P = = = = p = exp(0.0282 Solution to Exercise 7.0000 0.Pz]') 3. Py = cpoisson(4.0000 0.Px.9863 Solution to Exercise 7.6577 5.4897 0. 186) P1 = cpoisson(5.cpoisson(5.o o o gtext('Exercise 17') Solution to Exercise 7.375.9825 0. Adjusted Gaussian.2636 0.k) P1 = 0.1.19 (p.8264 4.k) 0.24 (p.001 Binomial.20 (p.0.
1) P2 = 0.32 (p.5875 − 1 = 0.3483 Solution to Exercise 7.2) P1 = 0.50) (7.2587 P (X − 5 ≤ 5) = 2Φ (5/9) − 1 = 1.84.75 + Solution to Exercise 7.28 (p.54) .2596 P2 = gaussian(4. 187) (7.192 CHAPTER 7.2 · 30 = 3) 32 33 + 2 3! = 0.46) P (Y ≥ 3) = 1 − P (Y ≤ 2) = 1 − e−6 (1 + 6 + 36/2) = 0. 187) a.31 (p. 187) P (X ≤ 25) = P (Y ≥ 5) .752 3. 187) p = exp(15/30) = 0. 187) (7.754 + + 2 3! 24 (7.25) = 0. Y ∼ poisson (3 · 2 = 6) (7.15 · 25 = 3.p. Y ∼ poisson (0.44) P (Y ≥ 4) = 1 − P (Y ≤ 3) = 1 − e−3 1 + 3 + Solution to Exercise 7.0.35 1 + 3.27 (p.71.43) P (X ≤ 30) = P (Y ≥ 4) .6547 Solution to Exercise 7.81. DISTRIBUTION AND DENSITY FUNCTIONS Solution to Exercise 7.30 (p.81.4212 − 1 = 0.753 3.45) P (X ≤ 2) = P (Y ≥ 3) .gaussian(4.gaussian(4.81.4181 Solution to Exercise 7. (7.4) = 0.9380 Solution to Exercise 7. 187) P (3 < X < 9) = Φ ((9 − 5) /9) − Φ ((3 − 5) /9) = Φ (4/9) + Φ (2/9) − 1 = 0.3528 (7.75) = 0.47) Z∼ gamma (6. P (2 < X < 8) = Φ ((8 − 4) /9) − Φ ((2 − 4) /9) = Φ (4/9) + Φ (2/9) − 1 = 0.33 (p.1/3). P (Z ≤ 20) = P (Y ≥ 6) .6712 + 0. 187) 3.6712 + 0. 187) (7.8) .4212 − 1 = 0.3225 (7.2587 P (X − 4 ≤ 5) = 2Φ (5/9) − 1 = 1.49) p = cbinom(35.52) P1 = gaussian(4.26 (p. 6) = 0.53) (7.42) P (Y ≥ 5) = 1 − P (Y ≤ 4) = 1 − e−3.4212 b.5620 Solution to Exercise 7.48) P (Y ≥ 6) = cpoisson (20/3.6065 P = cbinom(5.29 (p. Y ∼ poisson (0. Y ∼ poisson (1/3 · 20) (7.5875 − 1 = 0.51) (7.9) .4212 (7.
36 (p.8036 Solution to Exercise 7.1) P1 = 0.38 (p.01.75) = b.34 (p.gaussian(5.9.5) 3 1 = 0.25 ≤ X ≤ 0.3721 P2 = gaussian(3.3829 P = cbinom(10.3) P1 = 0.61) FX (t) = t −1 fX = 1 2 t3 + 1 .25 ≤ 0.95) p = 0.81.3185 P 2 = 2 0.64.39 (p.75) + Φ (0.3829 (7.gaussian(5. 188) Experiment with the mprocedure bincomp.10) .193 P1 = gaussian(5.56) (7.p.577) − 1 = 0.83 − (−0.55) (7.3829 − 1 = 0.0.3317 Solution to Exercise 7. 187) P (1 < X < 9) = Φ ((9 − 3) /8) − Φ ((1 − 3) /9) = Φ (0.0) P2 = 0. 10) − gaussian (5. 187) √ √ P (−5 < X < 10) = Φ 5/ 300 + Φ 10/ 300 − 1 = Φ (0.gaussian(3.01.3721 P (X − 3 ≤ 4) = 2Φ (4/8) − 1 = 1.64.0.05) . 188) 3 2 a. 300.58) Solution to Exercise 7.614 + 0.37 (p. 1 3 3 (3/4) − (−1/4) = 7/32 2 (7.gaussian(10. −5) = 0. (7.60) P 3 = P (X − 0. 187) p = gaussian(10.81.331 P = gaussian (5.717 − 1 = 0. c.84.25) − 1 = 0. t2 = t3 /2 (7.10.3829 Solution to Exercise 7.35 (p.5 ∗ 0.5) = P (−0.57) P1 = gaussian(3.5) = 7/8 2 (7. Solution to Exercise 7.9) .gaussian(3.7) . 188) Experiment with the mprocedure poissapp.64.64.5 3 2 3 t = 1 − (−0.5987 − 1 = 0.59) P 1 = 0.9) .4181 Solution to Exercise 7.2596 P2 = gaussian(5.1) P2 = 0.289) + Φ (0.81. 300.3) P = 0.7734 + 0.
65) t FX (t) = 0 c.*t. = t2 t3 − 2 8 (7. 188) a.^2 + .*(2 .. 0≤t≤2 tappr Enter matrix [a b] of xrange endpoints [0 2] Enter number of x approximation points 200 Enter density as a function of t t .5*t..53 /8 = 7/64 b. P1 = 6 5 1/2 t2 = 1/20 0 P2 = t2 + 6 5 6 5 5/4 1 t2 + 1/2 6 5 3/2 (2 − t) = 4/5 1 (7. P 2 = 1. c.40 (p. 188) 3 t − t2 8 a.64) P3 = b.1] (t) t3 + I(1.62) P 1 = 0.194 CHAPTER 7. 6 5 1 3/4 (2 − t) = 79/160 1 (7. (6/5)*(t>1). 2 7 6 fX = I[0.41 (p.63) FX (t) = t2 2 − t3 8.t) Use row matrices X and PX as in the simple case cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot . DISTRIBUTION AND DENSITY FUNCTIONS tappr Enter matrix [a b] of xrange endpoints [1 1] Enter number of x approximation points 200 Enter density as a function of t 1.2] (t) − + 5 5 5 2t − t2 2 (7.(3/8)*t.52 /2 − 0.^2 Use row matrices X and PX as in the simple case cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot Solution to Exercise 7.66) tappr Enter matrix [a b] of xrange endpoints [0 2] Enter number of x approximation points 400 Enter density as a function of t (6/5)*(t<=1).52 /2 − 1.53 /8 − 7/64 = 19/32 P 3 = 79/256 (7.^2 Use row matrices X and PX as in the simple case cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot Solution to Exercise 7.
the distribution may also be described by a probability density function The probability density is the linear density of the probability mass along the real line (i. Each combination of two is equally likely. mass per unit length). a simple approximation may be set up. For a simple random variable. That t = X (ω). Often we have more than one random variable. so it is decided to select them by chance. Determine the probability for each of the possible pairs.2 Random variables considered jointly. 1. which provides a means of making probability calculations. Each can be considered separately. The mapping induces a probability mass distribution on the real line. 0). 8. 2) . To simplify exposition and to keep calculations manageable. realvalued random variable is a function (mapping) from the basic space is. Example 8. They seem equally qualied. (1. random vectors As a starting point. but usually they have some probabilistic ties which must be taken into account when they are considered jointly.. or (2. A single.1. joint case by considering the individual random variables as coordinates of a random vector. with no point mass concentrations.1 Introduction fX . 1). 2) and Y X be the number be the number of seniors selected (possible values 0. we consider a pair of random variables as coordinates of a twodimensional random vector.org/content/m23318/1. 195 .1. Let of juniors selected (possible values 0. 1.Chapter 8 Random Vectors and joint Distributions 8. Various mprocedures and mfunctions aid calculations for simple distributions. In the absolutely continuous case. The concepts and results extend directly to any nite number of random variables considered jointly.7/>. In the absolutely continuous case. We treat the We extend the techniques for a single random variable to the multidimensional case.1 Random Vectors and Joint Distributions1 8. 2). Two juniors and three seniors apply. 1 This content is available online at <http://cnx. However there are only three possible pairs of values for (X.1: A selection problem Two campus jobs are open. consider a simple example in which the probabilistic interaction between two random quantities is evident. to each possible outcome ω of an experiment there corresponds a real value Ω to the real line. the probability distribution consists of a point mass pi at each possible value ti of the random variable. so that calculations for the random variable are approximated by calculations on this simple distribution. The distribution is described by a distribution function FX . Y ) : (0. Others have zero probability.e. since they are impossible. The density is thus the derivative of the distribution function.
Only one pair can be both C (3. We formalize as follows: A pair {X. we have a distribution function. Y } of random variables considered jointly is treated as the pair of coordinate functions for To each assigns the pair of real numbers a twodimensional random vector where then X (ω) = t and Y (ω) = u. A fundamental result from measure theory ensures W = (X. Y ) is a random vector i each of the coordinate functions X and Y is a random variable. all mapping ideas extend. W = (X. u) on the plane. Y ). we may assign the probability mass PXY (Q) = P W −1 (Q) = P (X. so that If we represent the pair of values {t.2: Induced distribution and probability calculations To determine P (1 ≤ X ≤ 3. Example 8. R2 . as they must. ω ∈ Ω. Y ) assignment determines −1 (Q) (8. RANDOM VECTORS AND JOINT DISTRIBUTIONS SOLUTION There are one of each. The argument parallels probability distribution induced by W = (X. juniors. random variables conisidered individually.1) These probabilities add to one. Y ) takes on a (vector) value in region determine the probability that the vectorvalued function Q. Y = 1) = 6/10. we model X (the number of juniors selected) Y (the number of seniors selected) as random variables. Y ). u). 2) = 3 ways to select pairs of seniors. Since to Q W −1 (Q) is an event for each reasonable set Q on the plane. To W = (X. Hence the vectorvalued function 8. (which we call t) we determine the region for which the rst coordinate value is between one and three and the second coordinate value (which we call greater than zero. Y ) : Ω → R2 is a mapping from the basic space inverse mapping (8. the mass PXY as a probability measure on the subsets of the plane The result is the that for the singlevariable case. since this exhausts the mutually exclusive possibilities. W (t. each Borel set) on the plane. Y = 2) = 3/10.196 CHAPTER 8. The probability of any other combination must be zero.2) joint distribution and two individual or marginal distributions. How this is acheived depends upon the nature of the distribution and how it is described.3) Ω to the plane W −1 R2 . and In the selection example above.1. i −1 (Q) is an event for each reasonable set (technically. Gometrically.3 Induced distribution and the joint distribution function In a manner parallel to that for the singlevariable case. W (ω) = (t. 2) = 10 equally likely pairs.4) Because of the preservation of set operations by inverse mappings as in the singlevariable case. Y > 0). The plays a role analogous to that of the inverse mapping A twodimensional vector W is a random vector X −1 for a real random variable. The problem is to determine how much probability mass lies in that strip. There are C (5. Y = 0) = 1/10 (8. we simply determine how much induced probability mass is in that region. u). Thus P (X = 1. we obtain a mapping of probability mass from the basic space to the plane. We also have the distributions for the X = [0 1 2] P X = [3/10 6/10 1/10] We thus have a Y = [0 1 2] P Y = [1/10 6/10 3/10] (8. this is the strip on the plane bounded by (but not including) the horizontal axis and by the vertical lines t = 1 and t = 3 (included). Six pairs can be P (X = 0. u} as the point (t. Denition . This corresponds to the set Q u) is of points on the plane with 1≤t≤3 and u > 0. P (X = 2. As in the singlevariable case. W = (X. W Since W is a function.
1: The region Qab for the value FXY (a. b). Y ) there in the general case. The theoretical result quoted in the real variable case extends to ensure that a distribution on the plane is determined uniquely by consistent assignments to the semiinnite intervals distribution is determined completely by the joint distribution function. . we FXY (t. the induced Figure 8. should there be mass concentrations on the boundary. uj ) pij = P [W = (ti . u = b). b). u) is equal to the probability mass in the region rst coordinate is less than or equal to may write t Qtu and the second coordinate is less than or equal to u. s ≤ u} (8.5) on the plane such that the FXY (t. Y ) ∈ Qtu ] .197 The joint distribution function FXY for W = (X. Thus.6) (a. case where there are point mass concentrations) one must be careful to note whether or not the boundaries are included in the region. Now for a given point of the vertical line through point where Qtu = {(r. s) : r ≤ t. u) = P (X ≤ t. u) = P [(X. is probability mass At point (ti . u) on the plane which are on or to the left (t. uj )] = P (X = ti . Y = uj ). 0)and on or below the horizontal line through (0. to determine In the discrete case (or in any P [(X. As in the range of W = (X. Y ) is given by FXY (t. u) (see Figure 1 for specic Qtu . u) ∈ R2 This means that (8. t = a. Formally. Distribution function for a discrete random vector The induced distribution consists of point masses. the region Qab is the set of points (t. Y ≤ u) ∀ (t. We refer to such regions as semiinnite intervals on the plane. Y ) ∈ Q] we determine how much probability mass is in the region.
and fourth quadrants. 6/10 at (1.1 (A selection problem) The probability distribution is quite simple. The situation in the other Distribution function for a mixed distribution Example 8.1).1) Mass 6/10 spread uniformly over the unit square with these vertices The joint distribution function is zero in the second. u).4: A mixed distribution The pair {X. third. Thus.198 CHAPTER 8.2.0). u) on the plane. and 1/10 at (2.3) Point masses 1/10 at points (0. The region Qtu is a giant sheet (t. function.1 (A selection problem)). If (t. u) is in the square or on the left and lower boundaries. This distribution is plotted in Figure 8. (t. u) is the amount of probability covered by the sheet. including the lefthand and lower boundariies. • If the point (t. then the sheet covers that probability.0) plus 0. no probability mass is covered by the sheet with corner in the cell.0). if is in any of the three squares on the lower left hand part of the diagram. u) = 6/10. u) is on or in the square having probability 6/10 at the lower lefthand corner.7) .2: Example 8. The value of FXY (t. think of moving the point corner at To determine (and visualize) the joint distribution with This (t. Y } produces a mixed distribution as follows (see Figure 8.3 (Distribution function for the selection problem in Example 8.0).3: Distribution function for the selection problem in Example 8. (1. RANDOM VECTORS AND JOINT DISTRIBUTIONS The joint distribution for Example 8. the sheet covers the point mass at (0. FXY (t. (0. Mass 3/10 at (0. u) value is constant over any grid cell. (1. and is the value taken on at the lower lefthand corner of the cell.2).6 times the area covered within the square. Figure 8. and the value of cells may be checked out by this procedure. Thus in this region FXY (t. u) = 1 (1 + 6tu) 10 (8.1).
the sheet covers two point masses plus the portion of the mass in the square to the left of the vertical line through FXY (t.8) (t. (8. u) is above and to the right of the square (i. to give FXY (t. the treatment in that case shows that the marginals determine the joint distribution.3: Mixed joint distribution for Example 8. if the component random variables form an independent pair.4 Marginal distributions If the joint distribution for a random vector is known.e. the converse However. In this case t = 1. These are known as is not true. u) = • If the point 1 (2 + 6t) 10 0 ≤ u < 1.9) mass is covered and (t. then all probability Figure 8.. In general. u).4 (A mixed distribution). u). u) is to the right of the square (including its boundary) with the sheet covers two point masses and the portion of the mass in the square below the horizontal line through (t. u) is above the square (including its upper boundary) but to the left of the line (t. u) = • If 1 (2 + 6u) 10 (8. both 1 ≤ t and 1 ≤ u). marginal distributions.199 • If the pont (t. then the distribution for each of the component random variables may be determined. 8. .1. FXY (t. u) = 1 in this region.
If we push the point up vertically.4.200 CHAPTER 8. marginal (t.12) This mass is projected onto the vertical axis to give the marginal distribution for Y. Y can take any of its possible values (8. This is the total probability that X ≤ t. u). u) = mass on or below horizontal line through (t. A FY (u) = P (Y ≤ u) = FXY (∞. Y < ∞) Thus i. u).e. The probability mass described by FX (t) is the same as the total joint probability mass on or to the left of the vertical line through mass in the half plane being projected onto the horizontal line to give the parallel argument holds for the marginal for Y. RANDOM VECTORS AND JOINT DISTRIBUTIONS To begin the investigation. the upper boundary of mass on or to the left of the vertical line through Now Qtu is pushed up until eventually all probability (t. u) (8. We may think of the distribution for X. ∞) = lim FXY (t. Figure 8. Marginals for a joint discrete distribution . u) u→∞ This may be interpreted with the aid of Figure 8.4: Construction for obtaining the marginal distribution for X. Consider the sheet for point (8. note that FX (t) = P (X ≤ t) = P (X ≤ t. u) is included..10) FX (t) = FXY (t. FX (t) describes probability mass on the line.11) (t.
(0. There t = 0. Then the mass increases linearly with t. (1. 4/10.5) The marginal distribution for respectively. X has masses 2/10. (2.0). 2/10 at points Similarly. slope 0. (1. onto the point uj Similarly.6 shows the graph of the marginal distribution function is a jump in the amount of 0. 1) (See Figure 8. 0) . (3. 2/10. . 0) . 0) is pro jected onto the point ti on a horizontal (0. 1. 2. 4/10. respectively. 3. 2. 2) . has masses 4/10.1). the marginal distribution for Y t = 0. uj ) is projected on a vertical line to give P (Y = uj ).201 Consider a joint simple distribution.6.4 (A mixed distribution). Figure 8.6 Consider again the joint distribution in Example 8.5: Marginals for a discrete distribution The pair {X. 1. produces a mixed distribution as follows: Point masses 1/10 at points (0. 2/10 at each of the ve points Example 8. (2.2 corresponding to the two point masses on the vertical line. until a nal jump at t = 1 in the amount of 0.13) Thus. m n P (X = ti ) = j=1 P (X = ti . Y = uj ) (8. 1) . (1. 2/10 at points u = 0. Y } produces a joint distribution that places mass (0. all the probability mass on the vertical line through line to give P (X = ti ).1) Mass 6/10 spread uniformly over the unit square with these vertices The construction in Figure 8. Y = uj ) and P (Y = uj ) = i=1 P (X = ti .0). Y } FX .5: Marginal distribution for Example 1. Example 8. all the probability mass on a horizontal line through (ti .2 at The pair {X.
8. 2 This content is available online at <http://cnx. RANDOM VECTORS AND JOINT DISTRIBUTIONS produced by the two point masses on the vertical line. Y . By symmetry. . the marginal distribution for Y is the same. the total mass is covered and FX (t) is constant at one for t ≥ 1. These are. considered jointly. calculations on a pair of simple random variables X.6. in W = (X. u)plane.1 mprocedures for a pair of simple random variables We examine. two components of a random vector The induced distribution is on the 8.org/content/m23320/1. Y ).202 CHAPTER 8.6: Marginal distribution for Example 8. Figure 8. Values on the horizontal axis ( axis) correspond to values t Ω to the plane.2 Random Vectors and MATLAB2 eect. which maps from the basic space (t. rst. At t = 1.6/>.2.
Thus P with elements arranged corresponding to the P has element P (X = ti . P (X = t2 ) .e. · · · . and row from the bottom) MATLAB array and logical operations t. uj ) position. t2 . MATLAB operations to determine the inner product of N This is accomplished by one of the and PX We extend these techniques and strategies to a pair of simple random variables. we form computational matrices (ti . is some prescribed set of values. The mprocedure takes care of this inversion. and P = [pij ]. Example 8. in a manner analogous to the operations in the singlevariable case. P (X = ti ) for which These will be in a zeroone Select the in the corresponding positions and sum..17) b. g (ti ) ∈ M . P perform the specied operations on P (X = ti . The data X Y = [u1 . Y } of random variables are the values of X and Y. Y = uj ) atthe (ti . u. Y values increasing upward. let us review the onevariable strategy. t2 . We arrange the joint probabilities as the right and for this procedure are in three matrices: X = [t1 . tn ] To perform calculations on and P X = [P (X = t1 ) . · · · . c. with ones in the desired positions. where M in matrix P X. X and on the plane. uj . these intermediate steps would not be displayed. utilizing the MATLAB function meshgrid. uj ) on position (i.16) X = [t1 t2 · · · tn ] andY = [u1 u2 · · · um ] and the joint probabilities graphically by putting probability mass P (X = ti . uj ) on the plane.m jcalc % Call for setup procedure Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X . at each point on the position (i. We extend the computational strategy used for a single random variable. probabilities In this case. P (X = tn )] (8. at each point on the jth t and u such that t has element ti at each i th column from the left) u has element uj at each (ti . in which values of the second variable increase downward. um ] is the set of values for random P (X = ti . Formation of the t and u matrices is achieved by a basic setup mprocedure called jcalc. with X values where pij = increasing to This is dierent from the usual arrangement in a matrix. Y = uj ). considered jointly.. uj ) position (8. and computes Y. First. u2 . we we use array operations on X to form a matrix G = [g (t1 ) g (t2 ) · · · g (tn )] which has (8. matrix g (ti ) in a position corresponding to Determine P (X = ti ) P (g (X) ∈ M ). Y = uj ) in a matrix P.203 of the rst coordinate random variable X and values on the vertical axis ( axis) correspond to values of u Y. To perform calculations. Ordinarily. · · · . The mprocedure forms the matrices the marginal distributions for t and u. · · · . Y = uj ) at the point (ti .15) Basic problem. In the following example. uj ) ti . The data for a pair matrices {X. This joint probability may is represented by the matrix mass points on the plane.e. • • Use relational operations to determine the positions N. tn ] is the set of values for random variable variable Y. data consist of values ti and corresponding P (X = ti ) arranged in matrices X = [t1 . Y = uj ) at each (ti . which we may put in row (8.14) Z = g (X). We usually represent the distribution P (X = ti . we display the various steps utilized in the setup procedure. a.7: Setup and basic calculations jdemo4 % Call for data in file jdemo4.
7336 % P(g(X.0589 0.u_j)] matrix = 3 6 5 3 19 6 3 2 6 22 9 0 1 9 25 15 6 7 15 31 M = G >= 1 % Positions where G >= 1 M = 1 0 0 1 1 1 0 0 1 1 1 0 1 1 1 1 1 1 1 1 pM = M.0132 PX % Optional call for display of PX PX = 0.1032 0. u.1244 [t.0372 0.*P % Selection of probabilities pM = 0.1800 0. Y.. % Formation of t.1161 0. PX.0372 0 0 0.3100 0.1356 0.0132 PM = total(pM) % Total of selected probabilities PM = 0.0360 0 0 0..0558 0.0516 0 0.2088 PY = fliplr(sum(P')) % Calculation of PY (note reversal) PY = 0.1356 0. uj )].0297 0. u matrices (note reversal) disp(t) % Display of calculating matrix t 3 0 1 3 5 % A row of Xvalues for each value of Y 3 0 1 3 5 3 0 1 3 5 3 0 1 3 5 disp(u) % Display of calculating matrix u 2 2 2 2 2 % A column of Yvalues (increasing 1 1 1 1 1 % upward) for each value of X 0 0 0 0 0 2 2 2 2 2 Suppose we wish to determine the probability and u.0270 0.0744 0.0744 0.^2 .1161 0.0180 0.1900 0.0774 0. RANDOM VECTORS AND JOINT DISTRIBUTIONS Enter row matrix of VALUES of Y Y Use array operations on matrices X.fliplr(Y)).u] = meshgrid(X.0180 0..3: Distribution function for the selection problem in Example 8.0516 0.1512 0.0837 0.% Steps performed by jcalc PX = sum(P) % Calculation of PX as performed by jcalc PX = 0.2088 PY % Optional call for display of PY PY = 0.0285 0.0264 0. In Example 3 (Example 8.4300 0.. PY.1800 0..0405 0. t.0589 0.0209 0.0817 0.2700 0. P X 2 − 3Y ≥ 1 .1032 0.3100 0.2700 0.0270 0.1512 0.0360 0. Using array operations on t G = t.204 CHAPTER 8.0264 0.3*u % Formation of G = [g(t_i.0405 0.. and P disp(P) % Optional call for display of P 0.0285 0. we obtain the matrix G = [g (ti .0817 0.1900 0.0198 0...0209 0.1244 .1 (A selection problem)) from "Random Vectors and Joint Distributions" we note that the joint distribution function .4300 0.Y) >= 1) G d.
1 (A selection problem)) from "Random Vectors and Joint Distributions" shows agreement with values obtained by hand. FXY = flipud(cumsum(flipud(P))) % Cumulative column sums upward FXY = 0.205 FXY is constant over any grid cell. • • Take cumulative sums upward of the columns of P. at the value taken These lower lefthand corner values may be obtained on at the lower lefthand corner of the cell.1000 FXY = cumsum(FXY')' % Cumulative row sums FXY = 0.6000 0.3000 0. By ipping the matrix and transposing.1 (A selection problem)) in "Random Vectors and Joint Distributions'. systematically from the joint probability matrix P by a two step operation. Example 8. 0 6 0.3: Distribution function for the selection problem in Example 8. which takes column cumulative sums downward.9000 1. Take cumulative sums of the rows of the resultant matrix.8: Calculation of FXY values for Example 3 (Example 8.1 (A selection problem)) from "Random Vectors and Joint Distributions" P = 0.3000 0.1000 0 0 0.7: The joint distribution for Example 3 (Example 8. This can be done with the MATLAB function cumsum. 0 0 1].1*[3 0 0.3: Distribution function for the selection problem in Example 8. Comparison with Example 3 (Example 8.6000 0. we can achieve the desired results. including the lefthand and lower boundaries.0000 0 0.3: Distribution function for the selection problem in Example 8.1000 0 0.1000 Figure 8.7000 0 0 0.6000 0. .
0000 0.3: Distribution function for the selection problem in Example 8.18) We have three properties analogous to those for the singlevariable case: t (f1) u fXY ≥ 0 (f2) fXY = 1 R2 (f3) FXY (t. u.1152 0.3312 0. u) = −∞ −∞ fXY (8.0780 0. return to the distribution in Example Example 8. s) dsdr (8.20) FX (t) = FXY (t.7 (Setup and basic calculations) Example 8. py. For any joint mapping to the plane which assigns zero probability to each set with zero area (discrete points. then there exists a joint density function fXY P [(X.y.19) At every continuity point for fXY .5157 0. ∞) = −∞ −∞ fXY (r. Denition If the joint probability distribution for the pair with zero area. the density is the second partial fXY (t.0534 0.6848 0.2. the condition that there are no point mass concentrations on the line ensures the existence of a probability density function.1824 0. u) = Now ∂ 2 FXY (t. As in the case of canonic for a single random variable.206 CHAPTER 8.t.Y.9: Joint distribution function for Example 8. e. Y } assigns zero probability to every set of points with the property fXY Q (8.p] may be given any desired names. As an example.0264 0.1 (A selection problem)) in "Random Vectors and Joint Distributions". and p function[x. Y ) ∈ Q] = {X. y.2 Joint absolutely continuous random variables In the singlevariable case.5656 0. useful in probability calculations.1224 0. it is often useful to have a function version of the procedure jcalc to provide the freedom to name the outputs conveniently. 8.1356 These values may be put on a grid.4492 0. line or curve segments.py.0939 0.2) for Example 3 (Example 8. in the same manner as in Figure 2 (Figure 8.P) The quantities x. call for FXY disp(FXY) 0.u.7912 1.8756 0.6012 0. and countable unions of these) there is a density function. t.21) . A similar situation exists for a joint distribution for two (or more) variables. RANDOM VECTORS AND JOINT DISTRIBUTIONS The two step procedure has been incorprated into an mprocedure jddbn. px.2754 0.1512 0. = jcalcf(X.px.7 (Setup and basic calculations) jddbn Enter joint probability matrix (as on the plane) P To view joint distribution function.3390 0. u) ∂t ∂u t ∞ (8.
75. 0 ≤ u ≤ 1 (8. 0 ≤ t ≤ 1 t dt = 4u 1 − u2 .5 ≤ X ≤ 0.10 (Marginal density functions). u = t.8: Distribution for Example 8.0977 (8.5 and t = 0. to obtain the marginal density for the rst variable. and t fXY (t. .75 and above the line u = 0. Example 8.25) Figure 8.23) fXY (t.22) Marginal densities. u) du (8.5) = P [(X. and t = 0.5. u = t. and similarly for the marginal for the second variable.8) fX (t) = fY (u) = This region is the triangle bounded by u = 0. u) = 8tu 0 ≤ u ≤ t ≤ 1.5. s) ds and fY (u) = −∞ fXY (r. u) du = 8t 0 1 u du = 4t3 . u) dt = 8u u (8. Y ) ∈ Q] where Q is the common part of the triangle with t = 0. ∞ Use of the fundamental theorem of calculus to obtain the derivatives ∞ fX (t) = −∞ fXY (t.75. Y > 0.10: Marginal density functions Let fXY (t. Thus. Thus the strip between 3/4 t p=8 1/2 1/2 tu dudt = 25/256 ≈ 0.207 A similar expression holds for gives the result FY (u). t = 1 (see Figure 8.24) P (0. integrate out the second variable in the joint density. This is the small triangle bounded by u = 0.
. 6 37 1 • 0≤t≤1 fX (t) = (t + 2u) du = 0 6 (t + 1) 37 (8. Suppose elsewhere.208 CHAPTER 8.26) • For 1<t≤2 fX (t) = 6 37 0 t (t + 2u) du = 12 2 t 37 (8. SOLUTION by t = 0. therefore express fX by fX (t) = IM (t) 12 2 6 (t + 1) + IN (t) t 37 37 (8.28) Figure 8. Y } has joint density fXY (t. 2].9). Then IM (t) = 1 for t ∈ M (i. Examination of the gure shows that we have dierent limits for the integral with respect to for u 0≤t≤1 For and for 1 < t ≤ 2. u) = 37 (t + 2u) on the region bounded u = 0. Likewise. M = [0. Determine the marginal density fX .9: Marginal distribution for Example 8..e. 0 ≤ t ≤ 1) and zero IN (t) = 1 for t ∈ N and zero elsewhere. 1] and N = (1.11 (Marginal distribution with compound expression). and u = max{1. We can. t = 2. RANDOM VECTORS AND JOINT DISTRIBUTIONS Example 8.27) We may combine these into a single expression in a manner used extensively in subsequent treatments.11: Marginal distribution with compound expression The pair 6 {X. t} (see Figure 8.
~PY.31) P (X ≤ 0.12: Approximation to a joint continuous distribution fXY (t. with constant increments dy . X. If we let the approximating pair {X ∗ . and we approximate the distribution in a manner similar to that for a single random variable.3355~~~~~~~~~~~~~~~~%~Maple~gives~0.8)&(u~>~0. u) = 3 Determine on 0 ≤ u ≤ t2 ≤ 1 (8. uj ).29) P ((X. We then utilize the techniques developed for a pair of simple random variables. from which it determines u as does the mprocess jcalc for simple random variables. as in the singlevariable case. Y ∗ = uj ) = P ((X. so that the point be (ti . uj ) is at the midpoint of the rectangle. with constant increments dx. .2.30) tuappr t and calls for endpoints of intervals which include the ranges of X and Y and for the numbers of subintervals on each. we then have n · m pairs (ti . we have a grid structure consisting of rectangles of size dx · dy .1). and the vertical axis for values of Y. Y } with joint density fXY . It calculates the marginal approximate distributions and sets up the Calculations are then carried out as for any joint simple pair. Y ) ∈ ij th The mprocedure rectangle) ≈ dx · dy · fXY (ti .8. Y ) As in the onevariable case. ~p~=~total(M. Y ∗ ) = (ti .3 Discrete approximation in the continuous case For a pair {X.^2) Use~array~operations~on~X. If we subdivide the horizontal axis for values of n approximating values ti for X its increment. uj ) (8. It then prompts for an expression for the joint probability distribution.~u. Example 8. Y ∗ }.209 8. Y > 0. in ij th rectangle)(8. u).*P)~~~~~~~~~~%~Evaluation~of~the~integral~with p~=~~~0. If we have m approximating values uj for Y.3352455531 The discrete approximation may be used to obtain approximate plots of marginal distribution and density functions. We select ti and uj at the midpoint of corresponding to points on the plane. uj )) = P (X ∗ = ti .~Y. we assign pij = P ((X ∗ .~t.1). if the increments are small enough.~and~P ~M~=~(t~<=~0. ~tuappr Enter~matrix~[a~b]~of~Xrange~endpoints~~[0~1] Enter~matrix~[c~d]~of~Yrange~endpoints~~[0~1] Enter~number~of~X~approximation~points~~200 Enter~number~of~Y~approximation~points~~200 Enter~expression~for~joint~density~~3*(u~<=~t.~PX. calculating matrices fXY (t.
~~~~~~~~~~~~~~~~%~Density~for~Y ~FX~=~cumsum(PX).10) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~%~Theoretical~(3/2)(1~~t)^2 ~fy~=~PY/dy. and conditional expectation and regression.1t)) Use~array~operations~on~X.13 (Approximate plots of marginal density and distribution functions).X.~and~P ~fx~=~PX/dx.FX)~~~~~~~~~~~~%~Plotting~details~omitted These approximation techniques useful in dealing with functions of random variables. u) = 3u on the triangle bounded by u = 0.~Y. expectations.fx. Figure 8.~PX.~u. . RANDOM VECTORS AND JOINT DISTRIBUTIONS Marginal density and distribution function for Example 8.~PY.13: Approximate plots of marginal density and distribution functions fXY (t.10: Example 8. u ≤ 1 + t.*(u<=min(1+t.210 CHAPTER 8.~~~~~~~~~~~~~~~~%~Density~for~X~~(see~Figure~8. ~tuappr Enter~matrix~[a~b]~of~Xrange~endpoints~~[1~1] Enter~matrix~[c~d]~of~Yrange~endpoints~~[0~1] Enter~number~of~X~approximation~points~~400 Enter~number~of~Y~approximation~points~~200 Enter~expression~for~joint~density~~3*u.~~~~~~~~~~~%~Distribution~function~for~Y ~plot(X.10) ~FY~=~cumsum(PY). and u ≤ 1 − t.~t.~~~~~~~~~~~%~Distribution~function~for~X~(Figure~8.
m (Section 17.) be the total number of spots which turn up. What is the probability of three or more sevens? Exercise 8.0551 0. . Let an additional X times. Exercise 8.) Two cards are selected at random.0493 0. determine the joint distribution and the marginals.4 the joint distribution for the pair As a variation of Exercise 8.34) content is available online at <http://cnx.0420 0.1] 0.) has the joint distribution (in mle npr08_07. (Solution on p.0620 0.2 (Solution on p.3 Problems On Random Vectors and Joint Distributions3 Exercise 8. Determine the joint distribution for the pair {X.1 number of aces and (Solution on p. Determine {X. Exercise 8.5 Suppose a pair of dice is rolled. Under the usual assumptions.9 5.37: npr08_06)): {X. It is decided to select two at random (each possible pair equally likely).3 A die is rolled. Y }. Y } X = [−2.1 3. from a standard deck.) has the joint distribution (in mle npr08_06. Y = u) 3 This (8. {X.211 8. Y } and from this determine the marginal distribution for Y. Arrange the joint matrix as on the plane.3 2.0437 P = 0. Y Let X be the be the number of spades. 215.8. Exercise 8.0483 0.6 The pair (Solution on p. A coin is ipped X (Solution on p.5 4. Suppose a pair of dice is rolled instead of a single die. Let Y X (Solution on p.0399 0. Let Y be the number of heads that turn up. Y } P (X = t. Determine the joint distribution for {X. Two sophomores.0380 0.3 − 0. (For a MATLAB based way to determine the joint distribution see A random number N of Bernoulli trials) from "Conditional Expectation.7: Regression") Y.38: npr08_07)): Exercise 8.33) Determine the marginal distributions and the corner values for P (X + Y > 2) P (X ≥ Y ).3. (Solution on p.) k.) Two positions for campus jobs are open. 216.0527 0. 218.3] 0.4/>. Determine (8. 216.0399 0. and three seniors apply. Determine the joint distribution for the pair and for each P (X = k) = 1/6 for 1≤k≤6 bution.0651 0.0667 and 0. P (Y = jX = k) has the Y increasing upward.0580 Y = [1. Y } and from this determine the marginals for each.0323 0. Let sophomores and the pair Y X be the number of be the number of juniors who are selected.0589 FXY .0609 0.0357 0.7 1.32) 0. three juniors. 217.1 5.0441 (8. Let X be the number that turns up.org/content/m24244/1. 218.7 The pair {X. Assume binomial (k. with values of the marginal distribution for Example 7 (Example 14. without replacement.m (Section 17. Y } and from this determine the marginal distribution for Y.0361 0.) times. 215.8. 1/2) distriDetermine Exercise 8. Roll the pair be the number of sevens that are thrown on the X rolls.0713 0.
0012 0.0594 0.0063 7 0.0260 0.0134 0.0070 0.049 0.5 4.039 0. 220.0038 0.051 0.0256 0. (Solution on p.012 0.0484 1.0091 0.0216 0.061 0.0084 0. (Solution on p.0060 0.0096 0.065 0.0026 15 0. RANDOM VECTORS AND JOINT DISTRIBUTIONS t = u = 7.4 0.0196 0.0396 0 0.) {X.0256 0.009 2 0.0056 0.0017 5 0.0182 0.0891 0.0108 0.0162 0.0223 Table 8. and Data were kept on the eect of training time on the time to perform a job on a production line.0234 0.010 0.017 .0040 0.40: npr08_09)): P (X = t. Y } has the joint distribution (in mle npr08_08.033 0.0156 0.0363 0.m (Section 17.0060 0.0324 0.0203 0.0495 0.0297 0 4.001 0.1 0.5 0.0167 11 0. 219.0182 0.0234 0.1 2.011 0.0172 17 0.0064 0.0072 3 0.0035 0.0189 0.0244 0.0180 0.0191 0.5 0.0126 0.0060 0.0140 0. The data are as follows (in mle npr08_09.0156 0.0044 0.5 0.0045 9 0.0140 0.0056 0.137 0.039 0.0096 0.005 0.050 0.015 0.0056 0.0084 0.0091 0. Y = u) (8.0112 0. Determine P (1 ≤ X ≤ 4.0217 19 0.0098 0.0070 0. Y X is the time to perform the task.0196 0.0081 0.0120 0.0104 0.0072 0.0080 0.0252 0.0160 0.0268 0.1320 0.0054 0.045 2.0126 0.2 0. in minutes.058 0.0200 0.0160 0.1089 0.0280 0.8 3.7 0.070 0.0060 0.001 0.8.0440 0.0726 2.0096 0.0108 0.1 Determine the marginal and distributions and the corner values for FXY .031 0.8.0168 0. Y > 4) Exercise 8. Determine FXY (10.9 is the amount of training.0064 0.0180 0.0090 13 0. Y = u) (8.39: npr08_08)): P (X = t.0112 0.m (Section 17.36) t = u = 5 4 3 2 1 1 0.0256 0.0120 0.025 3 0.35) t = u = 12 10 9 5 3 1 3 5 1 0.0510 0.0528 0.0231 0.0132 3.0090 0.) Exercise 8.0050 0.0204 0.9 0.0040 0.8 The pair P (X − Y  ≤ 2).212 CHAPTER 8. 6) and P (X > Y ).0 3.163 0.2 Determine the marginal distributions.0182 0.0056 0.003 1.0405 0.0077 Table 8. in hours.0049 0.
1) . 1/2 < Y < 3/4) . (2. (8.) on the square with vertices at fXY (t. Y ≤ 1/2) . 222. P (X − Y  < 1) Exercise 8. P (X < Y ) Exercise 8. 2) .) for fXY (t. P (X ≤ 1. Use a discrete approximation to plot the marginal density FX . Y > 1) . 1). Y > 1) . u) = 4ue−2t 0 ≤ t.) fXY (t. u) = 1 8 (t + u) for 0 ≤ t ≤ 2. Determine by discrete approximation the indicated probabilities. 0 ≤ u ≤ 1 (8.16 (Solution on p. P (X ≤ 1/2. 0) .10 (Solution on p. 221. 1) (8. Exercise 8. 223. P (X > 0. (0. 0) . P (X > 1/2.5. P (Y ≥ 1/2) . Sketch the region of denition and determine analytically the marginal density functions b. (8. (1.38) P (X > 1. and the marginal distribution function c. 1) .) on the parallelogram with vertices fXY (t. Calculate analytically the indicated probabilities. 223.) for fXY (t. 224.41) P (X ≤ 1. Y > 1/2) .213 Table 8. Y > 1) . Y > 1) . P (Y ≤ X) Exercise 8.14 (Solution on p.12 (Solution on p. P (0 ≤ X ≤ 1/2. Y > 1) .15 (Solution on p. 221. Y > 0) . u) = 3 88 2t + 3u2 for 0 ≤ t ≤ 2. P (X ≤ 1/2. P (X < 1/2. 220.43) P (X ≤ 1/2.25). For the joint densities in Exercises 1022 below a. 0 ≤ u ≤ 2 (1 − t). 0) . (0.39) P (1/2 < X < 3/4. 1) . 1 < Y ) .40) P (X > 1/2.3 Determine the marginal distributions. u) = 12t2 u (−1.) for fXY (t. P (Y ≤ X) Exercise 8. 0 ≤ u ≤ 1 + t. Y > 1/2) . Y > 1/2) . Determine FXY (2. 0 ≤ u ≤ 1. Y > 1/2) . fX fX and fY .37) Exercise 8. P (Y ≤ X) Exercise 8. P (Y ≤ X) (8. (0. (8. u) = 4t (1 − u) 0 ≤ t ≤ 1. 0 ≤ u ≤ 2.42) FXY (1.13 (Solution on p. P (0 ≤ X ≤ 1. (1.) fXY (t. (8. u) = 1/2 (1. 3) and P (Y /X ≥ 1. d. u) = 1 0 ≤ t ≤ 1.11 (Solution on p.
u) = 24 11 tu 0 ≤ t ≤ 2. t} (8.49) P (1/2 ≤ X ≤ 3/2. 0 ≤ u ≤ min{1. 228. 0 ≤ u ≤ max{2 − t. Y ≤ 1) . RANDOM VECTORS AND JOINT DISTRIBUTIONS Exercise 8.) for fXY (t. Y > 1) . Y ≥ 1) . 0 ≤ u ≤ min{1 + t.) fXY (t. P (X > 1) . for 0 ≤ t ≤ 2. 225.20 (Solution on p. P (Y < X) Exercise 8. Y ≤ 1) .214 CHAPTER 8.2] (t) 14 t2 u2 for (Solution on p. P (Y < X) Exercise 8.1] (t) 8 t2 + 2u + I(1. 228. Y ≥ 1) .) 0 ≤ u ≤ 1. 0 ≤ u ≤ min{2t. P (X < Y ) (8.45) P (X ≥ 1. 3 − t} (8. 3 − t} (8.19 (Solution on p. P (X ≤ 1. Y ≤ 1/2) . 227.47) P (X ≤ 1/2. P (Y ≤ X/2) Exercise 8. Y ≤ 1) .5. Y ≤ 3/2) . 0 ≤ u ≤ min{2.) fXY (t. u) = 2 13 (t + 2u) for 0 ≤ t ≤ 2. 224. 2} (8.) fXY (t.44) Exercise 8. u) = 3 23 (t + 2u) for 0 ≤ t ≤ 2. 2 − t} P (X ≤ 1. u) = I[0.22 9 3 fXY (t.) fXY (t.46) P (X ≥ 1.18 (Solution on p. P (Y ≤ X) Exercise 8. P (X ≥ 1. u) = 12 179 3t2 + u . u) = 12 227 (3t + 2tu) for 0 ≤ t ≤ 2.17 (Solution on p. P (Y ≤ 1) .48) P (X < 1) . 226.21 (Solution on p. (8. P (X ≤ 1.
5588 0. X. disp('Data in Pn. 211) Let X be the number of aces and Y be the number of spades. k) = P (X = i. i = 1. and neither on the i and Ni . Ci be the events of selecting a sophomore.8.0588 Solution to Exercise 8. Y = 0:2. 0) = P (A1 C2 ) + P (C1 A2 ) = 8 • 3 + 3 • 2 = 12 7 8 7 56 3 12 P (1. 2) = 0 P X = [30/56 24/56 2/56] P Y = [20/56 30/56 6/56] % file npr08_02. 0) = P (N1 N2 ) = 52 • 51 = 1260 2652 864 P (0. P = Pn/(52*51). 864 144 6. Y = k) 6 P (0.1448 0. or senior.32: npr08_01) % Solution for Exercise~8. Y') npr08_01 % Call for mfile Data in Pn.m (Section~17. 35 36 P (0.3824 0. junior. Pn = [132 24 0.2 X = 0:2. Let 3 51 + 1 52 • 36 51 + 36 52 • 1 51 = 144 2652 % type npr08_01 (Section~17. spade (other than the P (i. 2) = P (∅) = 0 ace). P. X.8. 1) = P (2.0045 PY = fliplr(sum(P')) PY = 0.2 (p. 2. 0) = P (C1 C2 ) = 3 • 2 = 56 8 7 18 P (0. other ace.8.8. Y = k).8507 0. k) = P (X = i. 0) = P (A1 A2 ) = 52 • 51 = 2652 1 3 3 1 6 P (2.1 (p. Ai . 6 12 2].m (Section~17. 1) = P (N1 S2 S1 N2 ) = 36 • 12 + 12 • 36 = 2652 52 51 52 51 11 132 12 P (0. Y. 211) Let Ai . Let X be the number of juniors selected. 1) = P (A1 B2 ) + P (B1 A2 ) = 2 • 3 + 8 • 2 = 56 8 7 7 2 1 2 P (2.1 X = 0:2.Pn. disp('Data are in X.33: npr08_02) . 1) = P (A1 S2 S1 A2 AS1 N2 N1 AS2 ) = 52 • 12 + 12 • 51 52 1 1 24 P (1. P = Pn/56. 0) = P (A1 A2 ) = 8 • 7 = 56 P (1. Y = 0:2. Set P (i. 1) = P (AS1 A2 A1 AS2 ) = 52 • 51 + 52 • 51 = 2652 P (2. P') npr08_02 (Section~17. Dene the events ASi . 2) = P (S1 S2 ) = 52 • 51 = 2652 3 36 3 216 P (1. respectively.215 Solutions to Exercises in Chapter 8 Solution to Exercise 8. Y % Result PX = sum(P) PX = 0.33: npr08_02) % Solution for Exercise~8. selection. 1260 216 6]. Si .32: npr08_01) % file npr08_01. 2) = P (B1 B2 ) = 8 • 7 = 56 2 P (1. 18 12 0. Bi . 1) = P (B1 C2 ) + P (C1 B2 ) = 3 • 3 + 3 • 3 = 56 8 7 8 7 3 2 6 P (0. Pn = [6 0 0. 0) = P (A1 N2 N1 S2 ) = 52 • 36 + 52 • 51 = 2652 51 3 P (1. on the be the number of sophomores and Y i th trial. P. 2) = P (AS1 S2 S1 AS2 ) = 52 • 12 + 12 • 51 = 2652 51 52 2 6 3 P (2. 2) = P (2. of drawing ace of spades.
0156 0.0417 0. Y.0625 0.0833 0. Y = 0:6. RANDOM VECTORS AND JOINT DISTRIBUTIONS Data are in X.1/2.1/2. Y.0:i).5357 0. P PX = sum(P) PX = 0.1667 0.0052 0.0833 0.3125 0. PY disp(P) 0 0 0 0 0 0.m (Section~17.3571 0.34: npr08_03) % Call for solution mfile Answers are in X.0260 0. PY = fliplr(sum(P')).8. PX = (1/36)*[1 2 3 4 5 6 5 4 3 2 1].3 X = 1:6.8.0417 0.0391 0 0 0.5357 0.1:i+1) = (1/6)*ibinom(i.0755 0. P') npr08_04 (Section~17.0104 0. PY.8.0521 0.8.3 (p. Y = 0:12.0026 Solution to Exercise 8.Pn.0156 0 0 0 0. % Initialize for i = 1:6 % Calculate rows of Y probabilities P0(i.0521 0.0208 0.0026 0 0 0 0 0.35: npr08_04) Answers are in X.0833 0. Y.m (Section~17. 211) P (X = i. P disp(P) Columns 1 through 7 0 0 0 0 0 0 0 .0391 0.2578 0.1641 0.0521 0 0. 211) % file npr08_04.4 X = 2:12. for i = 1:11 P0(i.7). PY. P.216 CHAPTER 8.0625 0.0208 0.35: npr08_04) % Solution for Exercise~8.1:i+2) = PX(i)*ibinom(i+1. % Reverse to put in normal order disp('Answers are in X. end P = rot90(P0). Y.1071 Solution to Exercise 8. Y. % file npr08_03.13).0:i+1). % Rotate to orient as on the plane PY = fliplr(sum(P')). P. P0 = zeros(6.0026 disp(PY) 0.34: npr08_03) % Solution for Exercise~8.0260 0.0357 0. P0 = zeros(11.4 (p.0052 0. end P = rot90(P0).0417 0. Y = k) = P (X = i) P (Y = kX = i) = (1/6) P (Y = kX = i). disp('Answers are in X.0625 0.0104 0.4286 PY = fliplr(sum(P')) PY = 0.0208 0.0417 0. PY') npr08_03 (Section~17.
0013 0 0 0 0.0326 0.0139 0.0208 0.0326 0.0054 0.m (Section~17.0034 0.13).0013 0.0054 0.217 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.0020 0.2152 0. PY') npr08_05 (Section~17.0273 0.0380 0.0171 0.0035 0.0174 0. for i = 1:11 P0(i.0230 Columns 8 through 13 0.0069 Columns 8 through 11 0 0 0 0 0 0.0456 0. P.0078 0.0000 0.0043 0.0098 0.0035 0.0001 0. Y.0000 0.0043 0.0002 0.2158 0.0001 0.0152 0.36: npr08_05) Answers are in X.0052 0 0.0048 0.0125 0.0205 0.0130 0.0304 0.0000 0.0171 0.0208 0.0091 0.5 PX = (1/36)*[1 2 3 4 5 6 5 4 3 2 1].0304 0.0020 0.0037 0.0002 0.0090 0. PY = fliplr(sum(P')).0022 0 0 0 0 0.0347 0.1954 0.0208 0. disp('Answers are in X.0001 0.0182 0.0037 0.0174 0.0130 0.0005 0.0001 0.0015 0.0004 0.3660 0.0045 0.0806 Solution to Exercise 8.5 (p. Y.3072 0.1:i+2) = PX(i)*ibinom(i+1.0022 0.0069 0.0001 disp(PY) Columns 1 through 7 0.0152 0.0273 0. X = 2:12.1025 Columns 8 through 13 0.0008 0.0005 0. P.1400 0.0456 0.0:i+1).1/6.0000 0.0078 0.0000 0.0063 0.0003 0.0828 0.0125 0.0008 0 0 0 0 0 0.0015 0.0091 0. Y = 0:12. end P = rot90(P0).0269 0.0434 0.0004 0.0052 0.0015 0.0008 0.0015 0.0098 0.0040 0 0 0 0 0 0 0.1823 0.0069 0.0273 0.0069 0.0140 0 0 0 0 0 0 0 0. PY disp(PY) Columns 1 through 7 0. P0 = zeros(11.8.0090 0.36: npr08_05) % Data and basic calculations for Exercise~8.8.0034 0.0208 0.0045 0.0312 0.0375 0.0273 0.0003 0.0182 0. 211) % file npr08_05.0347 0.0008 .
4000 0.0667 0.4740 0.7000 0.6361 0.0000 0.2980 P1 = total((t+u>2).218 CHAPTER 8.6000 0.1500 0.PY]') 1.5000 0.0000 Solution to Exercise 8.1000 0.2000 0.37: npr08_06) Data are in X.0000 0.3020 4.0000 0. Y. and P disp([X.8.8.3160 0.2400 0.PX]') 2. PX. Y. P jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.1740 0.0000 0.4860 0. PY.7163 P2 = total((t>=u). t.2100 jddbn Enter joint probability matrix (as on the plane) P To view joint distribution function.3000 0.1100 . P jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.2000 3.1160 0.6000 0.1700 1.38: npr08_07) Data are in X.*P) P1 = 0.1000 0.3600 0.1200 3.1380 0.3000 0. PX.1000 0.2200 1.6 (p. PY.2980 2.9000 0.7000 0.0001 0.8020 1.4000 0.1000 0.2799 Solution to Exercise 8. call for FXY disp(FXY) 0.2300 0. Y. 211) npr08_07 (Section~17.PX]') 3.2020 5. Y.0000 0.1900 5.3300 2.*P) P2 = 0. u. RANDOM VECTORS AND JOINT DISTRIBUTIONS 0.7900 0.3000 0. 211) npr08_06 (Section~17. t.1817 0. u.1980 disp([Y.5000 0. and P disp([X.2300 0.2391 0.7 (p.
0000 0.8200 0.219 4. Y.6904 0.1929 npr08_08 (Section~17.8061 0. Y.9000 0.1318 10.*P) F = 0.0000 0.5089 0.0500 9..0000 0.2706 7.0000 0.1929 2.7000 0.0000 0.PY]') 5.0700 disp([Y.0000 0.*P) P1 = 0.1222 9.0000 0.1400 15.3214 0.1300 5.0510 0.1768 1. P1 = total(M.3357 Solution to Exercise 8.5000 0.3426 4.0000 0..7564 0..1500 0.1092 3.1000 0.0000 0.0900 7.0000 0.0000 0.*P) P = 0.1720 0.7390 .PX]') 1. call for FXY disp(FXY) 0.8000 0.1364 3.0000 0.1432 5.2719 0.3700 0. u..5355 0.0886 12.0000 0.0000 0. and P disp([X.0000 0.39: npr08_08) Data are in X.0915 0.0918 F = total(((t<=10)&(u<=6)). PY. 212) 1.3230 P2 = total((abs(tu)<=2).PY]') 3.Use array operations on matrices X.1410 0.8.*P) P2 = 0.0000 0.0000 0.2982 P = total((t>u). PX.0000 0..0800 3. t.1939 jddbn Enter joint probability matrix (as on the plane) P To view joint distribution function..0000 0.0800 17.8 (p.4336 0.9300 0.1852 0.4792 0.1852 M = (1<=t)&(t<=4)&(u>4).1300 11.1300 19..1000 13.0700 disp([Y.5920 0.0994 0.0000 0. P jcalc .
plot(X. t.0).5000 0.PX]') 1. PX. 0 ≤ u ≤ 2 lies outside the triangle (8. 0 ≤ t ≤ 1 1−u/2 (8.51) M 1 = {(t. PY.4000 2.5000 0.10 (p. u) : t > 1/2. RANDOM VECTORS AND JOINT DISTRIBUTIONS Solution to Exercise 8..X.220 CHAPTER 8. Y. PY.5100 P = total((u.8.53) region in the triangle under u = t.2000 2.0000 0.1500 1.0000 0.. 213) Region is triangle with vertices (0.1500 3.fx.0000 0. 212) npr08_09 (Section~17. (1. and P disp([X. (0. u > 1/2} M 3 = the has area in the trangle (8. u.. which has area 1/3 (8. P jcalc . PX..0990 2.0570 F = total(((t<=2)&(u<=3)).0000 0.0000 0.0000 0.52) M 2 = {(t.*P) F = 0..2100 5.50) fY (u) = 0 dt = 1 − u/2.0000 0. t. Y ) ∈ M 1) = 0 = 1/2 (8.PY]') 1.2).Use array operations on matrices X... u) : 0 ≤ t ≤ 1/2..3130 4.FX) % Figure not reproduced .*P) P = 0. 2(1−t) fX (t) = 0 du = 2 (1 − t) . and P fx = PX/dx.54) tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 400 Enter expression for joint density (t<=1)&(u<=2*(1t)) Use array operations on X.. FX = cumsum(PX).5570 Solution to Exercise 8.9 (p.0). Y.40: npr08_09) Data are in X./t>=1. Y.25)..0000 0. u.1000 disp([Y. u > 1} P ((X.3210 3.
u) : t ≤ 1/2. = 1/8. u.t1)) Use array operations on X.5 fY (t) by symmetry 1+t 1−t du + I(1. 3−t t−1 and Solution to Exercise 8.5)&(u>0.*P) PM3 = 0.X.2] (t) 0. u) : u ≤ t} has area in the trangle = 1/2. PM3 = total(M3.2501 % Theoretical = 1/4 M2 = (t<=1/2)&(u>1). PX.5000 total((u<=t). PM1 = total(M1. u = 3 − t. and P fx = PX/dx.5023 % Theoretical = 1/2 Solution to Exercise 8.5*(u<=min(1+t.*P) PM1 = 0.58) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density 0.0631 % Theoretical = 1/16 = 0.1] (t) 0.FX) % Plot not shown M1 = (t>1)&(u>1). 1 fX (t) = 0 1 4t (1 − u) du = 2t. 213) The region is bounded by the lines u=t−1 fX (t) = I[0. u = 1 − t. u) : t > 1. 0 ≤ t ≤ 1 4t (1 − u) dt = 2 (1 − u) .59) fY (u) = 0 (8.11 (p. PM2 = total(M2.*P) 0..*P) PM2 = 0.12 (p. (u>=max(1t. P M 3 = 1/2 (8.3t))& . t.*P) 0. 0 ≤ u ≤ 1 (8.221 M1 P1 P1 M2 P2 P2 P3 P3 = = = = = = = = (t>0. PY.fx. u > 1} M 2 = {(t.5 du = I[0.5).2] (t) (2 − t) = (8.55) M 1 = {(t.0625 M3 = u<=t.*P) 0 (t<=0. Y.57) has area in the trangle = 1.3350 % Theoretical = 0 % Theoretical = 1/2 % Theoretical = 1/3 u = 1 + t. total(M1..60) . FX = cumsum(PX).1] (t) t + I(1. plot(X. total(M2. 213) Region is the unit square.5)&(u>1). so so P M 1 = 1/4 P M 2 = 1/16 (8. u > 1} M 3 = {(t.56) has area in the trangle so (8.
P1 = total(M1. t.0625 % Theoretical = 1/16 = 0. P3 = total(M3.*P) .*(1 . PY.8350 % Theoretical = 5/6 = 0.fx.0781 M2 = (t<=1/2)&(u>1/2).FX) M1 = (t>1/2)&(u>1/2).62) tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density 4*t. u. PX.13 (p. PX.63) P1 = 1/2 1/2 (t + u) dudt = 45/64 2 t P2 = 0 1 (t + u) dudt = 1/4 (8. plot(X. plot(X.X. P2 = total(M2.222 CHAPTER 8. t.*P) P3 = 0. 0 ≤ t ≤ 2 4 1 2 (8.u) Use array operations on X. Y.*P) P2 = 0.64) P3 = 0 0 (t + u) dudt = 1/2 (8.*P) P1 = 0. P1 = total(M1. PY. Y. RANDOM VECTORS AND JOINT DISTRIBUTIONS 3/4 1 1/2 1 P1 = 1/2 1/2 4t (1 − u) dudt = 5/64 1 t P2 = 0 1/2 4t (1 − u) dudt = 1/16 (8.0625 M3 = (u<=t). fX (t) = 2 2 1 8 2 (t + u) = 0 1 (t + 1) = fY (t) . FX = cumsum(PX).FX) % Plot not shown M1 = (1/2<t)&(t<3/4)&(u>1/2).61) P3 = 0 0 4t (1 − u) dudt = 5/6 (8. u. FX = cumsum(PX). 0 ≤ u ≤ 2. 213) Region is the square 0 ≤ t ≤ 2. and P fx = PX/dx.0781 % Theoretical = 5/64 = 0.fx.X.8333 Solution to Exercise 8.65) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (1/8)*(t+u) Use array operations on X. and P fx = PX/dx.
7031 % Theoretical = 1/4 % Theoretical = 1/2 t = 0.66) Solution to Exercise 8.*P) 0. fXY = fX fY ∞ 3/4 P 1 = 0.69) fX (t) = 3 88 1+t 0 fY (u) = I[0. total(M3. 0 ≤ t.3] (u) 3 + 2u + 8u2 − 3u3 88 88 1 1 (8. P2 = 0. 1) = 0 1 1+t 0 fXY (t. 213) Region bounded by t = 0.72) P1 = 0 1 fXY (t.5 1 1 2e−2t dt 1/2 2udu = e−1 5/16 3 −2 1 e + = 0. u) dudt = 3/44 1 1+t (8.3] (u) 0 3 88 2 2t + 3u2 dt = u−1 (8.15 (p. and P M2 = (t > 0.*P) 0. PY.71) FXY (1.1150 p3 = total((t<u).67) P3 = 4 0 t ue−2t dudt = (8.7030 Solution to Exercise 8. t.5)&(u > 0.5)&(u<3/4). t = 2.1] (u) 3 88 2t + 3u2 dt + I(1. u = 0.68) tuappr Enter matrix [a b] of Xrange endpoints [0 3] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density 4*u.73) .1139 % Theoretical = (5/16)exp(1) = 0.1] (u) I[0. u) dudt = 329/352 (8.7031 (t<=1)&(u>1). 0 ≤ t ≤ 2 88 88 2 (8. u = 1 (8.7030 2 2 (8. 0 ≤ u ≤ 1.7047 % Theoretical = 0. PX.5025 % Theoretical = 45/64 = 0. Y.*P) p3 = 0. 213) Region is strip bounded by fX (t) = 2e−2t . fY (u) = 2u. u = 1 + t 2t + 3u2 du = 3 3 (1 + t) 1 + 4t + t2 = 1 + 5t + 5t2 + t3 .2500 u<=t.*exp(2*t) Use array operations on X. u) dudt = 41/352 P2 = 0 1 fXY (t. u = 0. total(M2.223 P1 M2 P2 P2 M3 P3 P3 = = = = = = = 0.70) 3 3 6u2 + 4 + I(1.*P) p2 = 0.14 (p. p2 = total(M2. u.
and P p1 = total((t<=1/2).*P) p2 = 0.1875 P3 = total((u>=1/2). P 2 = 12 0 u−1 t2 udtdu = 3/16 (8. F = total(MF. PX. PY.*((u<=t+1)&(u>=t)) Use array operations on X.75) P 1 = 1 − 12 1/2 t t2 ududt = 33/80.224 CHAPTER 8.1856 % Theoretical = 3/16 = 0. 0 ≤ u ≤ 1 1/2 u (8.9347 Solution to Exercise 8.8144 % Theoretical = 13/16 = 0. PY.76) P 3 = 1 − P 2 = 13/16 (8.9297 % Theoretical = 329/352 = 0.X.4098 % Theoretical = 33/80 = 0.4125 M2 = (t<1/2)&(u<=1/2). t.*(u<=1+t) Use array operations on X. u = t. plot(X.0] (t) 12 0 t2 udu + I(0. P1 = total(M1.77) tuappr Enter matrix [a b] of Xrange endpoints [1 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density 12*u.^2.FX) MF = (t<=1)&(u<=1).^2).8125 . t.*P) P2 = 0. p2 = total(M2. P2 = total(M2. u.0682 M1 = (t<=1)&(u>1).*P) P1 = 0. PX. Y. 213) Region bounded by u = 0. RANDOM VECTORS AND JOINT DISTRIBUTIONS tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 3] Enter number of X approximation points 200 Enter number of Y approximation points 300 Enter expression for joint density (3/88)*(2*t+3*u. FX = cumsum(PX).*P) F = 0. Y.0] (t) 6t2 (t + 1) + I(0.*t. u = t + 1 t+1 1 fX (t) = I[−1.1172 % Theoretical = 41/352 = 0.16 (p. and P fx = PX/dx.74) fY (u) = 12 u−1 1 1 t2 udt + 12u3 − 12u2 + 4u. u = 1. u.1] (t) 12 t t t2 udu = I[−1.*P) p1 = 0.1] (t) 6t2 1 − t2 2 (8.*P) P3 = 0.1165 M2 = abs(tu)<1.0681 % Theoretical = 3/44 = 0.fx.
83) fY (u) = I[0.84) I[0.2727 Solution to Exercise 8.4545 P3 = total((t<u).*(u<=2t) Use array operations on X.*P) P1 = 0. u. and P M1 = (t<=1)&(u<=1).2] (t) 0 24 11 2−t tudu = 0 (8. 0 ≤ u ≤ 1 11 24 11 2 1 0 2−t (8.85) (t + 2u) dudt = 13/46.80) tududt = 6/11 P 2 = 24 11 1 0 t 1 tududt = 5/11 (8. P1 = total(M1.78) I[0.5447 % Theoretical = 6/11 = 0.82) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density (24/11)*t.*u. 3 23 2 0 0 t (t + 2u) dudt = 12/23 (8.1] (u) P1 = 3 23 2 1 1 t 6 3 (2u + 1) + I(1. PX.1] (u) 3 23 2 (t + 2u) dt + I(1. t = 2.81) P3 = tududt = 3/11 (8. PY.1] (t) 3 23 (t + 2u) du+I(1. u = 2 − t fX (t) = I[0. u = 2. u = 0.2] (t) t2 23 23 2 (8. 213) Region is bounded by t = 0. u = t (1 < t ≤ 2) 2−t fX (t) = I[0. u = 0.2] (t) 0 3 23 t (t + 2u) du = I[0.2] (u) 0 3 23 2−u (t + 2u) dt + 0 3 23 (t + 2u) dt = u (8. Y.4553 % Theoretical = 5/11 = 0.86) P3 = (t + 2u) dudt = 16/23 (8.1] (t) 24 11 1 tudu + I(1.2] (t) t(2 − t) 11 11 tudt = 12 2 u(u − 2) .2705 % Theoretical = 3/11 = 0.*P) P2 = 0. u = 2 − t (0 ≤ t ≤ 1) .1] (t) 0 6 6 (2 − t)+I(1. t.1] (t) fY (u) = P1 = 24 11 1 0 0 1 12 12 2 t + I(1. 214) Region is bounded by t = 0.225 Solution to Exercise 8.5455 P2 = total((t>1).*P) P3 = 0.18 (p.79) 24 11 2−u 0 (8.87) .17 (p.2] (u) 4 + 6u − 4u2 23 23 P2 = 3 23 2 0 0 1 (8.
226 CHAPTER 8.fx.*P) P2 = 0. P1 = total(M1.*(u<=max(2t. t.2] (u) 27 − 24u + 8u2 − u3 179 179 12 179 2 3/2 0 1 0 3−t 0 1 (8. RANDOM VECTORS AND JOINT DISTRIBUTIONS tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (3/23)*(t+2*u).*P) P1 = 2312 % Theoretical = 41/179 = 0.*P) P3 = 0. 214) Region has two parts: (1) 0 ≤ t ≤ 1.88) 6 24 3t2 + 1 + I(1. u. PY. Y. u. 0 ≤ u ≤ 2 12 179 2 (2) 1 < t ≤ 2.2] (t) 9 − 6t + 19t2 − 6t3 179 179 12 179 2 (8..X.1] (t) 3t2 + u du + I(1.2] (u) 0 12 179 3−u 3t2 + u dt = 0 (8.93) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (12/179)*(3*t..19 (p. t.2] (t) 0 3t2 + u du = 0 (8. plot(X. FX = cumsum(PX).*P) P1 = 0.89) fY (u) = I[0. Y.5217 P3 = total((u<=t). P1 = total(M1.5190 % Theoretical = 12/23 = 0.90) 24 12 (4 + u) + I(1. and P fx = PX/dx. (u<=min(2. .1] (u) I[0.1] (t) I[0. PY.6959 % Theoretical = 16/23 = 0.1] (u) P1 = 12 179 2 1 1 3−t 3t2 + u dt + I(1.2291 M2 = (t<=1)&(u<=1).t)) Use array operations on X.3t)) Use array operations on X. 0 ≤ u ≤ 3 − t 12 179 3−t fX (t) = I[0.91) 3t2 + u dudt = 41/179P 2 = 12 179 3/2 0 0 t 3t2 + u dudt = 18/179 3t2 + u dudt = 1001/1432 (8.6957 Solution to Exercise 8. and P M1 = (t>=1)&(u>=1).92) P3 = 3t2 + u dudt + 12 179 (8.^2+u). PX.2826 P2 = total((u<=1).* .2841 13/46 % Theoretical = 13/46 = 0. PX.FX) M1 = (t>=1)&(u>=1).
100) P3 = (3t + 2tu) dudt = 144/227 (8. 214) Region is in two parts: 1. PX.* . u) du + I(1.101) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (12/227)*(3*t+2*t. u) dt + I(1.6990 Solution to Exercise 8. P1 = total(M1.6308 % Theoretical = 144/227 = 0. t.7003 % Theoretical = 18/179 = 0.2] (t) t 227 227 2 2 (8.2] (u) 0 u−1 fXY (t. (u<=min(1+t.2] (u) (2u + 3) 3 + 2u − u2 227 227 24 6 (2u + 3) + I(1.96) I[0.2] (t) 0 fXY (t.2)) Use array operations on X. 0 ≤ u ≤ 2 1+t 2 fX (t) = I[0. P2 = total(M2. u) du = (8.95) fXY (t.99) P2 = (3t + 2tu) dudt + 12 227 2 0 1 t (3t + 2tu) dudt = 68/227 (8.*P) P2 = 0.2] (u) 9 + 12u + u2 − 2u3 227 227 1/2 0 0 1+t (8.*P) = 0.*P) P3 = 0.97) = I[0.0384 % Theoretical = 139/3632 = 0.1006 % Theoretical = 1001/1432 = 0. u) dt = (8. 0 ≤ u ≤ 1 + t 1 < t ≤ 2.227 P2 P2 M3 P3 P3 = total(M2. and P M1 = (t<=1/2)&(u<=3/2).1] (t) fY (u) = I[0. (2) 0 ≤ t ≤ 1.*P) = 0. P3 = total(M3.*P) P1 = 0.. = total(M3.1] (t) 0 fXY (t.1] (u) 6 24 (2u + 3) + I(1. Y.2996 M3 = u<t.1] (u) 12 3 120 t + 5t2 + 4t + I(1.98) 12 227 (3t + 2tu) dudt = 139/3632 12 227 3/2 1 1 2 (8. 2. u.1003 = u<=min(t.3t).3001 % Theoretical = 68/227 = 0.20 (p.6344 ..0383 M2 = (t<=3/2)&(u>1).*u).1] (u) P1 = 12 227 1 0 1 1+t (8.94) I[0. PY.
t.*P) P2 = 0.2] (t) 0 2 13 3−t (t + 2u) du = I[0.104) P1 = 0 0 (t + 2u) dudt = 4/13 P 2 = 1 2 t/2 0 (t + 2u) dudt = 5/13 (8.1] (t) 0 2 3 2 3 t + 1 + I(1. 3 − t (1 ≤ t ≤ 2) 2t Solution to Exercise 8.3076 % Theoretical = 4/13 = 0. u.*(u<=min(2*t.102) fY (u) = I[0.106) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 400 Enter number of Y approximation points 400 Enter expression for joint density (2/13)*(t+2*u). PY.3076 % Theoretical = 4/13 = 0. 214) Region is rectangle bounded by t = 0.1] (t) 2 13 (t + 2u) du + I(1.22 (p. u = 2t (0 ≤ t ≤ 1) . 214) Region bounded by fX (t) = I[0.3077 M2 = (t>=1)&(u<=1).109) P1 = t2 + 2u dudt + 9 14 3/2 1 0 t2 u2 dudt = 55/448 (8.1] (t) fX (t) = I[0. t = 2.107) fXY (t.110) .2] (t) t2 8 14 (8. u = 0. PX. RANDOM VECTORS AND JOINT DISTRIBUTIONS t = 2. u = 1 3 2 9 t + 2u + I(1.105) P3 = 0 0 (t + 2u) dudt = 4/13 (8.2] (t) (3 − t) 13 13 (8.3077 Solution to Exercise 8. Y.3t)) Use array operations on X.3844 % Theoretical = 5/13 = 0.*P) P1 = 0.2] (t) t2 u2 .*P) P3 = 0.2] (t) 0 t2 u2 du = I[0.1] (u) I[0. 0 ≤ u ≤ 1 8 14 9 14 9 14 1 (8.1] (t) 3 8 1 t2 + 2u du + I(1.2] (u) 2 9 6 21 + u − u2 13 13 52 1 (8.108) fY (u) = 3 8 3 8 1 t2 + 2u dt + 0 1 1/2 0 1/2 t2 u2 dt = 1 1 3 3 + u + u2 0 ≤ u ≤ 1 8 4 2 1/2 (8.2] (u) u/2 2 13 3−u (t + 2u) dt = u/2 (8.1] (t) 0 12 2 6 t + I(1. u) = I[0. P2 = total(M2.228 CHAPTER 8.1] (u) 1 2t 2 13 2 (t + 2u) dt + I(1. and P P1 = total((t<1).21 (p.103) 4 8 9 + u − u2 13 13 52 + I(1.3846 P3 = total((u<=t/2).
*P) P = 0. PY.^2+2*u)..1228 % Theoretical = 55/448 = 0. + (9/14)*(t. and P M = (1/2<=t)&(t<=3/2)&(u<=1/2). t. PX.229 tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density (3/8)*(t. P = total(M. u.*(t<=1) ..1228 .*u.*(t > 1) Use array operations on X.^2.^2). Y.
230 CHAPTER 8. RANDOM VECTORS AND JOINT DISTRIBUTIONS .
4) 231 . N } of the form M = (−∞.e. u]. We extend the notion of Recall that for a random variable independence to a pair of random variables by requiring independence of the events they determine. the inverse image Y −1 (N ) is an event determined by random variable Y for each reasonable set N. u 1 This content is available online at <http://cnx. the set of all outcomes ω ∈ Ω which are mapped into M by X ) is an event for each reasonable subset M on the real line. u for the inverse images of a special class of sets (9. Y −1 (N )} A of random variables is (stochastically) is independent. t] . X. N (9. Y } is independent i the following product rule holds FXY (t. u) = FX (t) FY (u) ∀ t. Thus we may assert {X. u) = P (X ∈ (−∞.org/content/m23321/1. Y } {X −1 (M ) .1 Independent Classes of Random Variables1 9.1) FXY (t. t] and N = (−∞. u]) = P (X ∈ (−∞. Y ∈ N ) = P (X ∈ M ) P (Y ∈ N ) Independence implies M.. (9.2 Independent pairs precisely.5/>. Similarly. N }.1. the inverse image X −1 (M ) (i. In this unit.3) Note that the product rule on the distribution function is equivalent to the condition the product rule holds {M. 9. we extend the concept to classes of random variables.Chapter 9 Independent Classes of Random Variables 9. independent i each pair of events This condition may be stated in terms of the product rule for all (Borel) sets P (X ∈ M.2) (9. t]) P (Y ∈ (−∞.1. u]) = FX (t) FY (u) ∀ t. Y ∈ (−∞. More Denition pair {X.1 Introduction The concept of independence for classes of events is developed in terms of a product rule. An important theorem from measure theory ensures that if the product rule holds for this special class it holds for the general class of The pair {M.
we employ the following terminology.7) fY (u) = Thus it follows that 1 (b − a) (d − c) dt = a 1 c≤u≤d d−c (9. u) such that a≤t≤b and c≤u≤d (i. Denition If M is a subset of the horizontal axis and N is a subset of the vertical axis.8) X is uniform on [a. and Taking limits shows FX (t) = lim FXY (t. u ∈ I2 ). Y } is therefore indepen If there is a joint density function. the constant value of du = c b 1 a≤t≤b b−a and (9.3: Rectangle with interval sides The rectangle in Example 9. then the cartesian product t ∈ M and M ×N u ∈ N. Since 1/ (b − a) (d − c).1: An independent pair Suppose FXY (t. Y is uniform on [c. That is. d]. u) = 1 − e−αt so that the product rule dent. u) = 1 − e−βu t→∞ holds. u) = (1 − e−αt ) 1 − e−βu u→∞ 0 ≤ t.232 CHAPTER 9.e. the pair is independent i fXY (t. Y } is uniform on a rect(b − a) (d − c). consisting of all those points (t. u) = fX (t) fY (u) ∀ t. t ∈ I1 and . b].3 The joint mass distribution It should be apparent that the independence condition puts restrictions on the character of the joint mass distribution on the plane. u) on the plane such that Example 9.6) fXY is I1 = [a. In order to describe this more succinctly. u) = fX (t) fY (u) for t. Simple integration gives fX (t) = 1 (b − a) (d − c) d the area is {X.2 (Joint uniform distribution on a rectangle) is the Cartesian product I1 × I2 . Y } is independent. b] and Y is uniform on [c. and fXY (t. so that the pair {X. INDEPENDENT CLASSES OF RANDOM VARIABLES Example 9.. The pair (9. u. The converse is also true: if the pair is independent with X uniform on [a. then the relationship to the joint distribution function makes it clear that the pair is independent i the product rule holds for the density. d]. the the pair has uniform joint distribution on I1 × I2 .1. d].5) FXY (t. is the (generalized) rectangle consisting of those points (t. b] and I2 = [c. FY (u) = lim FXY (t. all 9.2: Joint uniform distribution on a rectangle Suppose the joint probability mass distributions induced by the pair angle with sides (9. u Example 9. u) = FX (t) FY (u) {X. 0 ≤ u.
is the portion of the joint probability mass in the vertical strip. N are intervals on the horizontal and vertical M M ×N is the intersection of the vertical strip meeting the horizontal with the horizontal strip meeting the vertical axis in N.1 illustrates the basic pattern. The probability in the rectangle is the product of these marginal probabilities. axes. The probability X ∈ M rectangle test. respectively. the probability Y ∈N is the part of the joint probability in the horizontal strip. Y ∈ N ) = P ((X.233 Figure 9. Y ) ∈ M × N ) = P (X ∈ M ) P (Y ∈ N ) Reference to Figure 9. We restate the product rule for independence in terms of cartesian product sets.1: Joint distribution for an independent pair of random variables. We illustrate with a . then the rectangle axis in If (9. P (X ∈ M. This suggests a useful test for nonindependence which we call the simple example.9) M.
15 These values are shown in bold type on Figure 9. (9. (2. The following four items of information are available.5: Joint and marginal probabilities from partial information Suppose the pair {X. Yet P (X ∈ M ) > 0 and P (Y ∈ N ) > 0. Example 9. so that we would not expect independence of the pair. in many cases the complete joint and marginal distributions may be obtained with appropriate partial information. (1. Y = u2 ) = 0. but P ((X. Y } is independent and each has three possible values.0).11) A combination of the product rule and the fact that the total probability mass is one are used to calculate each of the marginal and joint probabilities. The product rule fails.1). Example 9.3. small rectangle To establish this.3.2. Y ) ∈ M × N ) = 0.08 implies P (Y = u2 ) = 0. Remark. And it does not provide a test for independence. consider the M × N shown on the gure.10) P (X = t2 .2). For example P (X = t1 ) = 0. P (X = t1 ) = 0. Y = u1 ) = 0.234 CHAPTER 9. In spite of these limitations.2: Rectangle test for nonindependence of a pair of random variables.2 that a value of X determines the possible values of Y and vice versa.08 (9. Then P (Y = u3 ) .4: The rectangle test for nonindependence Supose probability mass is uniformly distributed over the square with vertices at (1.1). INDEPENDENT CLASSES OF RANDOM VARIABLES Figure 9. The following is a simple example. Because of the information contained in the independence condition. P (Y = u1 ) = 0. hence the pair cannot be stochastically independent. so that P (X ∈ M ) P (Y ∈ N ) > 0. (0. There are nonindependent cases for which this test does not work. P (X = t1 . Y = u2 ) = P (X = t1 ) P (Y = u2 ) = 0. it is frequently useful. There is no probability mass in the region.4.2 and P (X = t1 . It is evident from Figure 9.
u) = 1 1 − ρ2 t − µX σX 2 − 2ρ t − µX σX u − µY σY . u) = where 1 2πσX σY (1 − ρ2 ) 1/2 e−Q(t.u)/2 (9. σX and 2 Y ∼ N µY . Y } has the joint normal distribution i the joint density is fXY (t. this. + u − µY σY 2 (9. And it has not seemed useful to develop MATLAB procedures to accomplish Figure 9. There is no unique procedure for solution.3. . While it is true that every independent pair of normally distributed random variables is joint normal.3: Joint and marginal probabilities from partial information. (9. The details are left as an exercise for the interested Remark.6: The joint normal distribution A pair {X.14) ρ = 0. Example 9. σY If the parameter ρ is set to zero.235 = 1 − P (Y = u1 ) − P (Y = u2 ) = 0.12) Q (t.13) The marginal densities are obtained with the aid of some algebraic tricks to integrate the joint density. The result is that 2 X ∼ N µX . Others are calculated similarly. u) = fX (t) fY (u) so that the pair is independent i reader. not every pair of normally distributed random variables has the joint normal distribution. the result is fXY (t.
· · · . n FX1 X2 ···Xn (t1 . tn ) = i=1 FXi (ti ) for all (t1 . t2 . Frequently we deal with independent classes in Such classes are referred to as which each random variable has the same marginal distribution. Absolutely continuous random variables If a class {Xi : i ∈ J} is independent and the individual variables are absolutely continuous (i.18) Similarly.236 CHAPTER 9.15) is the joint normal density for an independent pair ( Now dene the joint density for a pair ρ = 0) of standardized normal random variables. u) Both distribution is positive for all in the rst and third quadrants..5 Simple random variables Consider a pair {X.1. t2 . tn ) (9.17) Since we may obtain the joint distribution function for any nite subclass by letting the arguments for the others be ∞ (i. if each nite subclass is jointly absolutely continuous. 9. Examples are simple random samples from a given population. ti2 . {X. However. t2 . · · · .identically distributed). (an acronym for iid classes independent.e. Denition A class {Xi : i ∈ J} of random variables is (stochastically) independent i the product rule holds for every nite subclass of two or more. they cannot be joint normal. 1) and Y ∼ N (0. tn ) (9. A Bernoulli sequence is a simple example. by taking the limits as the appropriate ti increase without bound).4 Independent classes Since independence of random variables is independence of the events determined by the random variables. The function φ (t. Y } fXY (t. u). extension to general classes is simple and immediate. the single product rule suces to account for all nite subclasses. tim ) = k=1 fXik (tik ) for all (t1 . u) = 2φ (t.1. INDEPENDENT CLASSES OF RANDOM VARIABLES Example 9. The index set J in the denition may be nite or innite. independence is equivalent to the product rule For a nite class {Xi : 1 ≤ i ≤ n}. or the results of repetitive trials with the same distribution on the outcome of each component trial. u) = t2 u2 1 exp − − 2π 2 2 by (9. Y } of simple random variables in canonical form n m X= i=1 ti IAi Y = j=1 uj IBj (9. have densities).7: A normal pair not joint normally distributed We start with the distribution for a joint normal pair and derive a joint distribution for a normal pair which is not joint normal. · · · . then each individual variable is absolutely continuous and the product rule holds for the densities. Remark. (t. and zero elsewhere (9.19) .16) X ∼ N (0.. since the joint normal 9. 1).e. · · · . then any nite subclass is jointly absolutely continuous and the product rule holds for the densities of such subclasses m fXi1 Xi2 ···Xim (ti1 .
t PY and If the random variables are given in canonical form. respectivly.0264 ~~~~0. Since (n − 1) + (m − 1) = m + n − 2 values are X and Y are in ane form. then the uj as a coordinate has zero probability X has n distinct values and Y has m n + m marginal probabilities suce to determine the the marginal probabilities for each variable must add to one.0198~~~~0.20) ti or mass . we use an mprocedure we call probability matrix is simply a matter of determining all the joint probabilities icalc. Fj : 1 ≤ i ≤ n.01*[15~43~31~11].0372~~~~0. Independence of the minterm pairs is implied by independence of the combined class {Ei . Once we have both marginal distributions. icalc Enter~row~matrix~of~Xvalues~~X Enter~row~matrix~of~Yvalues~~Y Enter~X~probabilities~~PX Enter~Y~probabilities~~PY ~Use~array~operations~on~matrices~X.0558~~~~0. PX~=~0. Y. Y ).01*[12~18~27~19~24]. we need only the marginal distributions in matrices X. Y.~Y. 1 ≤ j ≤ m} MATLAB and independent simple random variables (9.0589~~~~0. Y~=~[0~1~2~4]. (ti .~PX. j) = P (X = ti . In the general case of pairs of joint simple random variables we have the mprocedure jcalc. PY~=~0.0209~~~~0. Formation of the joint p (i. Y = uj ) = P (X = ti ) P (Y = uj ) Once these are calculated. only needed.~t. P X . Y = uj ) = P (X = ti ) P (Y = uj ) According to the rectangle test. Nb } generated by the two classes.0744 . Bj } is W = (X. P (X = ti . If distinct values. Y } is independent i each pair of minterms {Ma . formation of the calculation matrices (9. the pair {X. which uses t information in matrices and u. The joint distribution has probability mass at each point Thus at every point on the grid.0132~~~~0. That is.22) Calculations in the joint simple case are readily handled by appropriate mfunctions and mprocedures.21) Ar = {X = tr } is the union of minterms generated by the minterms generated by the Ei and Bj = {Y = us } is the union of Fj .~and~P disp(P)~~~~~~~~~~~~~~~~~~~~~~~~%~Optional~display~of~the~joint~matrix ~~~~0.8: Use of icalc to set up for joint calculations X~=~[4~2~0~1~3].~u. and P to determine the marginal probabilities and the calculation matrices In the independent case.0297~~~~0. X. no gridpoint having one of the (9. and to determine the joint probability matrix (hence the joint distribution) and the calculation matrices u. Example 9. Y } is independent i each of the pairs independent. is independent.0837~~~~0.23) t and u is achieved exactly as in jcalc. The marginal distributions determine the joint distributions. n m X = a0 + i=1 Since ai IEi Y = b0 + j=1 bj IFj (9.~PY. we have the marginal distributions. Suppose m · n joint probabilities. we may use canonic (or the function form canonicf ) to obtain the marginal distributions.237 Since Ai = {X = ti } and Bj = {Y = uj } the pair {X. If they are in ane form. uj ) in the range of {Ai .
2738~~~~~~~~~~~~~~~~~~~~~%~Agrees~with~solution~in~Example 9 from "Composite Trials".7400 Q~=~M&N.0180~~~~0. Example 9.*P)~~~~~~~~~~~~%~Additional~information~is~more~easily Pe~=~~0.Y). X~=~0:10. Pe~=~total((u==t). consider again the problem of joint Bernoulli trials described in the treatment of Composite trials (Section 4.3).0774~~~~0.8) and Y ∼ binomial (10.2276~~~~~~~~~~~~~~~~~~~~~%~obtained~than~in~the~event~formulation Pm~=~total((t>=u).1161~~~~0.~2] PM~=~total(M.~Y.0.8.~~~~~~~~~~%~N~=~{u:~u~>~0. and each is a Bernoulli sequence.85).^2<=15).238 CHAPTER 9.0270~~~~0.80 of success on each trial.*P)~~~~~~~~~~~~~~~%~P(Y~in~N) PN~=~~~0.X) Enter~Y~probabilities~~PY~~~~~~~~%~Could~enter~ibinom(10.8.~~~~~~~~~~~~%~M~=~[3.9: The joint Bernoulli trial of Example 4. icalc Enter~row~matrix~of~Xvalues~~X~~%~Could~enter~0:10 Enter~row~matrix~of~Yvalues~~Y~~%~Could~enter~0:10 Enter~X~probabilities~~PX~~~~~~~~%~Could~enter~ibinom(10.0817~~~~0.~u.0.6400 N~=~(u>0)&(u. We assume the two seqences of trials are independent of each other.0360 disp(t)~~~~~~~~~~~~~~~~~~~~~~~~%~Calculation~matrix~t ~~~~4~~~~2~~~~~0~~~~~1~~~~~3 ~~~~4~~~~2~~~~~0~~~~~1~~~~~3 ~~~~4~~~~2~~~~~0~~~~~1~~~~~3 ~~~~4~~~~2~~~~~0~~~~~1~~~~~3 disp(u)~~~~~~~~~~~~~~~~~~~~~~~~%~Calculation~matrix~u ~~~~~4~~~~~4~~~~~4~~~~~4~~~~~4 ~~~~~2~~~~~2~~~~~2~~~~~2~~~~~2 ~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 M~=~(t>=3)&(t<=2).9.Y)~in~MxN) PQ~=~~~0. PX~=~ibinom(10.Y)~in~MxN)~=~P(X~in~M)P(Y~in~N) As an example.~t.~u^2~<=~15} PN~=~total(N.~and~P PM~=~total((t>u). Bill: Has probability 0.5014 .0285~~~~0. Mary: Has probability 0. Then X∼ binomial (10.0516~~~~0. 0.85 of success on each trial.0405~~~~0. 1 Bill and Mary take ten basketball free throws each.X).4736~~~~~~~~~~~~~~~~~~%~P((X.*P)~~~~~~~~~~~~~~~%~P(X~in~M) PM~=~~~0. PY~=~ibinom(10.*P)~~~~~~~~~~~~~~~%~P((X. Y~=~0:10. 0.1032 ~~~~0.~~~~~~~~~~~~~~~~~~~~~~~%~Rectangle~MxN PQ~=~total(Q.*P)~~~~~~~~~~~~%~of~Example 9 from "Composite Trials".Y) ~Use~array~operations~on~matrices~X.0. INDEPENDENT CLASSES OF RANDOM VARIABLES ~~~~0.*P) PM~=~~0.~PY.0.4736 p~=~PM*PN p~~=~~~0.85. Pm~=~~0.~PX. What is the probability Mary makes more free throws than Bill? SOLUTION Let X be the number of goals that Mary makes and Y be the number that Bill makes.85.
73 0.minprob(P2)).61 0. p] = icalcf (X.~Y.10: Sprinters time trials Twelve world class sprinters in a meet are running in two heats of six persons each.6)~0].1*[3~5~2].~PY. Each runner has a reasonable chance of breaking the track record.PX]~=~canonicf(c1.75~0. is independent.0800 ~~~~0. [X. then icalc to get the joint distribution. we have an mfunction version icalcf (9.73~0.81~0. py.62~0. t. P2~=~[0. Example 9. [Y.48 0. idbn for obtaining the joint probability matrix from the marginal probabilities.58 0. SOLUTION Let X be the number of successes in the rst heat and Y be the number who are successful in the second heat.8525 As in the case of jcalc. P1~=~[0. We use the mfunction canonicf to determine c1~=~[ones(1. We suppose results for individuals are independent.55~0.1200~~~~0. First heat probabilities: 0.1000~~~~0.24) [x.2000~~~~0.0400 .61~0.3986 Pm2~=~total((u>t).0750~~~~0.51].55 0.75 0.239 Example 9.1250~~~~0.66~0.~and~P Pm1~=~total((t>u).0300 ~~~~0.~PX.*P)~~~%~Prob~first~heat~has~most Pm1~=~~0. PY) We have a related mfunction Its formation of the joint matrix utilizes the same operations as icalc. PY~=~0.0500 ~~~~0.minprob(P1)). Then the pair the distributions for X {X.62 0.01*[20~15~40~25].8708 Py3~=~(Y>=3)*PY'~~~~~~~~%~Prob~second~has~3~or~more Py3~=~~0.66 0. P~~=~idbn(PX.6)~0].11: A numerical example PX~=~0.43 Second heat probabilities: 0. y.43].51 Compare the two heats for numbers who break the track record. c2~=~[ones(1.0600~~~~0.48~0.77 0.PY) P~= ~~~~0.3606 Peq~=~total((t==u).81 0.PY]~=~canonicf(c2. px.0450~~~~0.~t. u. PX.2408 Px3~=~(X>=3)*PX'~~~~~~~~%~Prob~first~has~3~or~more Px3~=~~0. icalc Enter~row~matrix~of~Xvalues~~X Enter~row~matrix~of~Yvalues~~Y Enter~X~probabilities~~PX Enter~Y~probabilities~~PY ~Use~array~operations~on~matrices~X.58~0. Y.0750~~~~0.*P)~~~%~Prob~second~heat~has~most Pm2~=~~0.~u.77~0. Y } and for Y.*P)~~%~Prob~both~have~the~same Peq~=~~0.
0128 ~~~~~0.1032 ~~~~~0.0169~~0.0049~~0.0161~~0.0184~~0.0208 ~~~~~0.0042~~0.0372~~~~0.0035~~0.0049~~0.0020~~0. We do not ordinarily exhibit the matrix P to be tested.0405~~~~0.~call~for~D disp(D)~~~~~~~~~~~~~~~~~~~~~~~~~~%~Optional~call~for~D ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 Next.0104~~0.240 CHAPTER 9.0126~~0.0264 ~~~~~0.0091~~0.0117~~0.0077~~0. then forming an independent joint test matrix.0084~~0.0028~~0.0104~~0.0095~~0.0128 ~~~~~0.0065~~0.0774~~~~0.0744 ~~~~~0.0092~~0.0040~~0.0207~~0.0112 ~~~~~0. Example 9.0056~~0.0209~~~~0.procedure itest checks a joint distribution for independence.0120~~0. we consider an example in which the pair is known to be independent.0168~~0.0096 ~~~~~0.0189~~0.0299~~0.0144 ~~~~~0.0135~~0.0030~~0.0063~~0.0035~~0.0837~~~~0.0065~~0.0176 itest Enter~matrix~of~joint~probabilities~~P The~pair~{X.0055~~0.0190~~0.0189~~0.0184~~0.0091~~0.0105~~0.0198~~~~0.0231~~0.0091~~0.1161~~~~0.0075~~0.0091~~0.0516~~~~0. which is compared with the original.0078~~0.0080 ~~~~~0. and it would be very dicult to pick out those for which it fails.0040~~0.0147~~0.0273~~0.0207~~0.0817~~~~0.0120~~0.0180~~~~0.0025~~0.0285~~~~0.0045~~0.0143~~0.0169~~0.0270~~~~0.0138~~0.0060~~0.0132~~~~0.0064 ~~~~~0.0052~~0.0165~~0.12 idemo1~~~~~~~~~~~~~~~~~~~~~~~~~~~%~Joint~matrix~in~datafile~idemo1 P~=~~0.Y}~is~NOT~independent~~~%~Result~of~test To~see~where~the~product~rule~fails. Example 9. INDEPENDENT CLASSES OF RANDOM VARIABLES An m.0558~~~~0.0112 ~~~~~0. However.0168~~0.0253~~0. It does this by calculating the marginals.0065~~0.0299~~0.0147~~0.0144 ~~~~~0.0208 ~~~~~0.0056~~0. The mprocedure simply checks all of them. this is a case in which the product rule holds for most of the minterms.0035~~0.0161~~0.0063~~0.0105~~0.13 jdemo3~~~~~~%~call~for~data~in~mfile disp(P)~~~~~%~call~to~display~P ~~~~~0.0589~~~~0.0117~~0.0360 ~ itest .0195~~0.0115~~0.0273~~0.0045~~0.0297~~~~0.0135~~0.0105~~0.
Example 9.~Y. . u ≥ 0 Since (9.25) e −12 ≈ 6 × 10 −6 .~~~~~~~~%~W~=~3X~+~2Y~4Z [W. We call the mprocedure icalc3.1. The following is a simple example of its use. yet have the approximated pair this is highly unlikely. so is any pair of approximating simple functions theoretically possible for the approximating pair {Xs . variations of the mfunction mgsum and the mfunction diidsum Also several are used for obtaining distributions for sums of independent random variables.1*[2~2~3~3]. Y } is an Now it is independent pair.~v.~~~~~~~~%~Distribution~for~W PG~=~total((G>0). Ys } of the type considered.3370 Pg~=~(W>0)*PW'~~~~~~~~~~~~%~P(Z~>~0) Pg~=~~0.3370 An mprocedure icalc4 to handle an independent class of four variables is also available.15: An independent pair Suppose X∼ exponential (3) and Y ∼ exponential (2) with fXY (t. For all practical purposes. PY~=~0. PX~=~0.1*[2~2~1~3~2]. u) = 6e−3t e−2u = 6e−(3t+2u) t ≥ 0.~PZ. icalc3 Enter~row~matrix~of~Xvalues~~X Enter~row~matrix~of~Yvalues~~Y Enter~row~matrix~of~Zvalues~~Z Enter~X~probabilities~~PX Enter~Y~probabilities~~PY Enter~Z~probabilities~~PZ Use~array~operations~on~matrices~X. But {Xs .~u. 9. Y~=~1:2:7. Y } {X. PX. we show that if {g (X) . Example 9. Also. Ys } is independent. we show that an approximating simple random variable the type we use is a function of the random variable is an independent pair. we approximate X for values up to 4 and Y for values up to 6. we may consider {X. Thus if {X. We consider them in various contexts in other units. h (Y )} for any reasonable functions g and h. consider a second pair of approximating simple functions with more subdivision points.PW]~=~csort(G.~PY.P). independence by the approximating random variables. so is X Xs of which is approximated.~Z.241 Enter~matrix~of~joint~probabilities~~P The~pair~{X.1*[1~3~2~3~1]. Y } not independent.~and~P G~=~3*t~+~2*u~~4*v.Y}~is~independent~~~~~~~%~Result~of~test The procedure icalc can be extended to deal with an independent class of three random variables. Z~=~0:3:12.Z)~>~0) PG~=~~0.14: Calculations for three independent random variables X~=~0:4.Y. When in doubt. Ys } to be independent.6 Approximation for the absolutely continuous case In the study of functions of random variables.~t. PZ~=~0. Y } to This decreases even further the likelihood of a false indication of be independent i {Xs . {X.*P)~~~~~~~~%~P(g(X.
0493 0.0441 (9.16: Test for independence The pair {X.~and~P itest Enter~matrix~of~joint~probabilities~~P The~pair~{X. u) = 4tu0 ≤ t ≤ 1.0551 0.2 Problems on Independent Classes of Random Variables2 Exercise 9. 2 This content is available online at <http://cnx.0399 0.5 4.0437 P = 0.242 CHAPTER 9.0527 0. they are the same.0380 0.1 3. Y } X = [−2.) has the joint distribution (in mle npr08_06.0420 0.0483 0.0589 (9.~t.7 1.0361 0.~and~P itest Enter~matrix~of~joint~probabilities~~P The~pair~{X. Y } is independent.0399 0.~PY.27) 0.Y}~is~independent 9.0609 0.26) fXY = fX fY which ensures the pair is independent.1] 0.*u Use~array~operations~on~X.~t.9 5.28) {X.~u.3] 0.0651 0.0667 Determine whether or not the pair 0.3 2.4/>. 0 ≤ t ≤ 1 (9.8.m (Section 17.~Y.0713 0.1 5. tuappr Enter~matrix~[a~b]~of~Xrange~endpoints~~[0~1] Enter~matrix~[c~d]~of~Yrange~endpoints~~[0~1] Enter~number~of~X~approximation~points~~100 Enter~number~of~Y~approximation~points~~100 Enter~expression~for~joint~density~~4*t. 1 It is easy enough to determine the marginals in this case.0357 0. fX (t) = 4t 0 so that u du = 2t.org/content/m24298/1. Y } has joint density fXY (t.0323 0. 247.3 − 0. INDEPENDENT CLASSES OF RANDOM VARIABLES tuappr Enter~matrix~[a~b]~of~Xrange~endpoints~~[0~4] Enter~matrix~[c~d]~of~Yrange~endpoints~~[0~6] Enter~number~of~X~approximation~points~~200 Enter~number~of~Y~approximation~points~~300 Enter~expression~for~joint~density~~6*exp((3*t~+~2*u)) Use~array~operations~on~X.~PY.1 The pair (Solution on p.~u.0580 Y = [1.0620 0.~PX.~Y.37: npr08_06)): {X. By symmetry. Consider the solution using tuappr and itest. 0 ≤ u ≤ 1. .Y}~is~independent Example 9.~PX.
0556 0.0495 0.) for fXY (t.5 0.0891 0.0324 0. 0 ≤ u ≤ 2 (see Exercise 13 (Exercise 8.13) from "Problems on Random Vectors and Joint Distributions"). For the distributions in Exercises 410 below a.0342 0.0304 0. 248.8.8 (Solution on p. (1.0308 (9.0961 P = 0.9 − 1. (Solution on p. 0) .m (Section 17. u) = 4t (1 − u) Exercise 9. 1) . Exercise 9.7 0 ≤ t ≤ 1.0405 0. u) = 1/π Exercise 9. u) = 4ue−2t Exercise 9.1 Determine whether or not the pair {X.0 3. fXY (t. 247.0589 0.9 0.5 (Solution on p. Y } P (X = t.) has the joint distribution (in mle npr09_02.243 Exercise 9.0077 Table 9.2 The pair (Solution on p.1320 0.0672 0.0209 (9.0682 0.) on the parallelogram with vertices fXY (t.4 (Solution on p. Exercise 9.1089 0. center at (0.) for fXY (t. 249.0594 0.0726 2.0216 0.8 3.0398 0.0504 0. (0. 1) (see Exercise 11 (Exercise 8.) on the square with vertices at fXY (t. Y } X = [−3.0242 0.1 0.0132 3.7 1.0).0484 1.0396 0 0.0189 0. u) = 12t2 u (−1.29) 0. 248. 247.0363 0.7 0.0440 0. Y } is independent. 1) . Exercise 9. 2) . Y } is independent.0297 0 4.4 0.1] 0. 0 ≤ u ≤ 1 (see Exercise 14 (Exercise 8.14) from "Problems on Random Vectors and Joint Distributions").3 The pair (Solution on p.38: npr08_07)): {X.0744 0.6 (1.m (Section 17. 0) . 0) . u) = 1/2 Exercise 9. (Solution on p.0528 0.2 0.12) from "Problems on Random Vectors and Joint Distributions"). Determine whether or not the pair is independent.41: npr09_02)): {X.0231 0.) has the joint distribution (in mle npr08_07.5 2.31) t = u = 7. (0.8 4. (0. Y = u) (9.0341 0.0350 0.11) from "Problems on Random Vectors and Joint Distributions").30) {X.8.) fXY (t. Use a discrete approximation and an independence test to verify results in part (a). 0 ≤ u ≤ 1 (see Exercise 12 (Exercise 8. (1.) on the circle with radius one.0528 0.6 5.1] 0.0203 0.1 2.0456 0. 1) .9 0 ≤ t.5 4.0510 0. (2. 249. (Solution on p. b.0498 0. 248.0868 Determine whether or not the pair 0. 247. u) = 1 8 (t + u) for 0 ≤ t ≤ 2.0090 0.0448 Y = [−2 1 2.
If successive units fail independently. are preparing a new business package in time anticipated completion time. in What is the probability both will complete on time. Payos are $570. 17. that neither will complete on time? Exercise 9.81. with probability of success on each call p = 0. 12 dollars. 250. INDEPENDENT CLASSES OF RANDOM VARIABLES (see Exercise 16 (Exercise 8. (Solution on p.15 (Solution on p. 250.17) from "Problems on Random Vectors and Joint Distributions").) The location of ten points along a line may be considered iid random variables with symmytric [1. what is the probability of no breakdown due to the module for one year? Exercise 9. unless failure occurs. The Exercise 9. 0. that at least one will complete on time. 0.) A critical module in a network server has time to failure (in hours of machine time) exponential (1/3000).38.) The times to failure are iid.) for fXY (t. exponential (1/130). The module is replaced routinely every 30 days (720 hours). The time to failure (in hours) of each unit is exponential (1/750). 3]. MicroWare and BusiCorp. u) = 24 11 tu 0 ≤ t ≤ 2.22. she makes three independent calls. 0. Exercise 9. 0.13 triangular distribution on 1/2 of the point (Solution on p. and $465.244 CHAPTER 9.58.16) from "Problems on Random Vectors and Joint Distributions"). display is on continuously for 750 hours (approximately one month). 185. 0.41. What is the probability Anne spends at least $30 more than Margaret? . 15.) Eight similar units are put into operation at a given time.) • • In the rst. 0 ≤ u ≤ min{1. What is the probability that three or more will lie within distance t = 2? (Solution on p.e. and 0. 249. 15 dollars with respective probabilities 0.57. with respective probabilities 0. 0. Assume the {X. except for brief times for maintenance or repair. $525. BusiCorp has time to completion. Determine the probability the number of lights which survive the entire period is at least 175.14 A Christmas display has 200 lights. what is the probability that ve or more units will be operating at the end of 500 hours? Exercise 9.41. Which alternative oers the maximum possible gain? b. $900. 190. 250. Let pair She realizes $150 prot on each successful sale. exponential (1/150).57.) Exercise 9. Exercise 9.35. Exercise 9.77. with In the second.) They work independently.37. What is the probability Anne spends at least twice as much as Margaret? b. 8. 18. exponential (1/10000). 180. respective probabilities of 0.12 (Solution on p. 12. X be the net prot on the rst alternative and is independent.17 Margaret considers ve purchases in the amounts 5.11 for a computer trade show 180 days in the future. she makes eight independent calls.83. days. 0.63. What is the probability the second exceeds the rst i. $1100. Assume that all eleven possible purchases form an independent class. 250. MicroWare has Two software companies. 0. 251. Anne contemplates six purchases in the amounts 8. The machine operates continuously. c. 21.10 (Solution on p. a.52. Y } a.23. If the units fail independently. Compare probabilities in the two schemes that total sales are at least $600. (Solution on p.16 Joan is trying to decide which of two sales opportunities to take. 2 − t} (see Exercise 17 (Exercise 8. 0. what is P (Y > X)? (Solution on p. 15. 0. 250.. in days. 250. $1000. Y be the net gain on the second.
0. (Solution on p. P (5 ≤ Z ≤ 25).) Exercise 9. 252.18 James is trying to decide which of two sales opportunities to take.33. $380. and 0. the respective probabilites {0.41}.22 throws more sevens than does Ronald? Two players. It is reasonable to suppose the pair {N1 . F } is independent. (Solution on p.) A residential College plans to raise money by selling chances on a board. 0. Assume the {X. and Z = 2X − 3Y (9. C. What is the probability the College will lose money? What is the probability the prot will be $400 or more.. win $20 with probability Pay $10 to play. 252. 254.21 The class (Solution on p. Determine the marginal distributions for X and Y then use icalc to obtain the joint distribution and the calculating matrices.05) and N2 ∼ binomial (50.57. 0.2 (one in twenty) (one in ve) Thirty chances are sold on Game 1 and fty chances are sold on Game 2. with respective probabilities of 0.46. he makes three independent calls. If on the respective games. There are two games: Game 1: Game 2: Pay $5 to play.e. Let pair He realizes $100 prot on each successful sale.57. and $350. $750. What is the probability that at least eight girls throw more heads than their partners? . Y } • • • Which alternative oers the maximum possible gain? What is the probability the second exceeds the rst i. P (Z > 0). It is reasonable to suppose N1 ∼ (30. (9. win $30 with probability p1 = 0.20 The class distribution (Solution on p.2). E. Do this with icalc. 254. 0. so that {X.) A class has fteen boys and fteen girls. then repeat with icalc3 and P (W > 0) Exercise 9. In the second.245 Exercise 9. The total prot for the College is Z = X +Y . N2 } is independent. Z} of random variables is iid (independent. 0. then X and Y are the prots X = 30 · 5 − 20N1 where binomial and Y = 50 · 10 − 30N2 (9. Determine the distribution for W and from P (−20 ≤ W ≤ 10). with probability of success on each call p = 0.01 ∗ [15 20 30 25 10] this determine compare results. Payos are $310. They pair up and each tosses a coin 20 times. 0. 253. throw a pair of dice 30 times each. Y } is independent. what is P (Y > X)? Compare probabilities in the two schemes that total sales are at least $600. he makes eight independent calls.) • • In the rst. B. less than $200. $700. Determine Y = −2ID + 6IE + 2IF − 3.47. identically distributed) with common X = [−5 − 1 3 4 7] Let and P X = 0.37.33) W = 3X − 4Y + 2Z .32) N1 .19 (Solution on p. N2 are the numbers of winners on the respective games. Ronald and Mike. D.23 (Solution on p.35.27. 0. Y.34) P (Y > X). Y be the net gain on the second. Exercise 9. between $200 and $450? Exercise 9. What is the probability Mike Exercise 9. Consider the simple random variables X = 3IA − 9IB + 4IC .41.) {X.) for these events are {A. 254.05 p2 = 0. 0. X be the net prot on the rst alternative and is independent.
246 CHAPTER 9. Mary.87. (Solution on p. 0. 255.52.80. Pay ten dollars for each play. 0. 0. of on the success Assume that all nine events form an independent class. Determine P (X ≥ 15). with probability 0. 254. Margaret makes four sales calls on the respective calls. Jim. (Solution on p.48.24 Glenn makes ve sales calls. P (W 1 > 0) and Compare the two games further by calculating P (W 2 > 0) Which game seems preferable? Exercise 9. 0.00 on each sale. Let She is trying to decide whether to play Game 1 ten times or Game 2 W1 and W2 be the respective net winnings (payo minus fee to play).) • • Let and Mike shoots 10 ordinary free throws. The probability of a win is 1/2. and Ellen have respective probabilities p = 0. If she wins. 0. Each takes ve free throws.37. 0. she loses her $10. The probability of a win on any play is Martha has $100 to bet.77. Exercise 9. scored by Mike and Harry. what is the probability Margaret's gain is at least $10.27 (Solution on p. and 0. twenty times. otherwise.63.75 of success on each shot. X. so the game is fair. Pay ve dollars to play.85 of making each shot tried. P (X ≥ Y ). 256. for a net gain of $10 on the play. What is the probability Mary and Ellen make a total number of free throws at least as great as the total made by the guys? . Harry shoots 12 three point shots. • • Determine P (W 2 ≥ W 1).71. receive $15 for a win.82. If Glenn realizes a prot of $18. with probabilities respective calls. respectively. Y be the number of points P (Y ≥ 15) .91.00 more than Glenn's? Exercise 9. 0.75. worth two points each. 255. (Solution on p. 0. Make the usual independence assumptions.) 0. 0.) Jim and Bill of the men's basketball team challenge women players Mary and Ellen to a free throw contest. of success with probabilities 0.00 on each sale and Margaret earns $20.) Game 1: Game 2: 1/3. Bill.25 Mike and Harry have a basketball shooting contest. with probability 0.26 Martha has the choice of two games. INDEPENDENT CLASSES OF RANDOM VARIABLES Exercise 9.40 of success on each shot.82. she receives $20.
243) npr09_02 (Section~17.8. P itest Enter matrix of joint probabilities P The pair {X. Y.8.8. call for D disp(D) 0 0 0 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 Solution to Exercise 9.2 (p. P itest Enter matrix of joint probabilities P The pair {X. 242) npr08_06 (Section~17.37: npr08_06) Data are in X.4 (p.Y} is NOT independent To see where the product rule fails.1 (p.Y} is NOT independent To see where the product rule fails.41: npr09_02) Data are in X.38: npr08_07) Data are in X.3 (p. 243) Not independent by the rectangle test. tuappr Enter matrix [a b] of Xrange endpoints [1 1] Enter matrix [c d] of Yrange endpoints [1 1] Enter number of X approximation points 100 . Y.247 Solutions to Exercises in Chapter 9 Solution to Exercise 9. call for D disp(D) 0 0 0 0 0 0 1 1 0 0 0 1 1 0 0 0 0 0 0 0 Solution to Exercise 9. P itest Enter matrix of joint probabilities P The pair {X. Y. 243) npr08_07 (Section~17.Y} is NOT independent To see where the product rule fails. call for D disp(D) 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Solution to Exercise 9.
. PY. 243) Not independent.7 (p. 243) P From the solution of Exercise 13 (Exercise 8. PX. PX.5 (p. (9.36) fXY = fX fY which implies the pair is not independent. tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (1/2)*(u<=min(1+t. PY.t1)) Use array operations on X. and P itest Enter matrix of joint probabilities P The pair {X. Y. u. PX. (u>=max(1t..248 CHAPTER 9.12) from "Problems on Random Vectors and Joint Distributions" we have fX (t) = 2t. u..6 (p.35) tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 100 Enter number of Y approximation points 100 Enter expression for joint density 4*t. call for D % Not practical. INDEPENDENT CLASSES OF RANDOM VARIABLES Enter number of Y approximation points 100 Enter expression for joint density (1/pi)*(t. fXY = fX fY so the pair is independent. 0 ≤ t ≤ 2 4 (9. t.too large Solution to Exercise 9. call for D Solution to Exercise 9.^2 + u.Y} is NOT independent To see where the product rule fails. Y. by the rectangle test. fY (u) = 2 (1 − u) .3t)).^2<=1) Use array operations on X.Y} is NOT independent To see where the product rule fails. and P itest Enter matrix of joint probabilities P The pair {X. u.Y} is independent Solution to Exercise 9.*(1u) Use array operations on X. Y. 0 ≤ t ≤ 1. t. t.13) from "Problems on Random Vectors and Joint Distributions" we have fX (t) = fY (t) = so 1 (t + 1) . and P itest Enter matrix of joint probabilities The pair {X.* . 0 ≤ u ≤ 1. 243) From the solution for Exercise 12 (Exercise 8. PY.
t)) Use array operations on X. t. Y. 0 ≤ t. fY (u) = 2u. PY.37) fXY = fX fY and the pair is independent. PY. Y. Y. tuappr Enter matrix [a b] of Xrange endpoints [1 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 100 Enter expression for joint density 12*t. PY. tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 . 0 ≤ u ≤ 1 so that (9. call for D Solution to Exercise 9. 243) From the solution for Exercise 14 (Exercise 8.*exp(2*t) Use array operations on X. call for D Solution to Exercise 9..10 (p.*(u<=min(t+1.^2.14) from "Problems on Random Vectors and Joint Distribution" we have fX (t) = 2e−2t . u.9 (p.8 (p.. 244) By the rectangle test. tuappr Enter matrix [a b] of Xrange endpoints [0 5] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 500 Enter number of Y approximation points 100 Enter expression for joint density 4*u. t. PX. u.249 tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 100 Enter number of Y approximation points 100 Enter expression for joint density (1/8)*(t+u) Use array operations on X. 243) Not independent by the rectangle test. and P itest Enter matrix of joint probabilities P The pair {X.* . PX. and P itest Enter matrix of joint probabilities P The pair {X. the pair is not independent.1)).Y} is independent % Product rule holds to within 10^{9} Solution to Exercise 9. PX. (u>=max(0. t.Y} is NOT independent To see where the product rule fails.Y} is NOT independent To see where the product rule fails.*u. u. and P itest Enter matrix of joint probabilities P The pair {X.
p2) Pneither = 0. Y. % Probability any one will survive P = cbinom(8. p = 3/4.056 Solution to Exercise 9. call for D Solution to Exercise 9. and P itest Enter matrix of joint probabilities P The pair {X.9277 k = 175:5:190. u. t.16 (p.35].6988 p2 = 1 . 244) p = exp(500/750).2t)) Use array operations on X. 244) p = exp(750/10000) p = 0.3930 Solution to Exercise 9.0000 0. PX. PY.15 (p.5238 Poneormore = 1 . with [P (A) P (B) P (C)] = [0.exp(180/130) p2 = 0.p2) % 1 . Y = 150S .*(u<=min(1. INDEPENDENT CLASSES OF RANDOM VARIABLES Enter number of Y approximation points 100 Enter expression for joint density (24/11)*t.12 (p.11 (p.p.9449 185.0000 0. where S∼ binomial (8.3) = 0.Pneither Poneormore = 0.k). Solution to Exercise 9.7496 Pboth = p1*p2 Pboth = 0.7866 % Probability any unit survives P = p^12 % Probability all twelve survive (assuming 12 periods) P = 0.250 CHAPTER 9.(1 .p1)*(1 .Y} is NOT independent To see where the product rule fails. so that P = cbinom(10. P = cbinom(200.13 (p. 244) X = 570IA + 525IB + 465IC 0.57).9996.1381 Solution to Exercise 9. 244) p = exp(720/3000) p = 0.5) % Probability five or more will survive P = 0.570.9973 180.14 (p.p1)*(1 .0754 Solution to Exercise 9.*u.exp(180/150) p1 = 0. disp([k.6263 190.0000 0. 244) p1 = 1 .9246 Pneither = (1 .P]') 175.410.p. 244) Geometrically. .0000 0.p.
0:8). t.0784 0. for i = 1:4 py(i) = (Y>=k(i))*PY'.pmy). pmx = minprob(0.py]') 0.57.4131 0. Y.7765 0. X Y .0. end disp([px. 41 83 58]). and P xmax = max(X) xmax = 1560 ymax = max(Y) ymax = 1200 k = [600 900 1000 1100].2560 0. u. px = zeros(1. 244) cx = [5 17 21 8 15 0].4131 0.PX] = canonicf(cx.35]). [Y.57 0.251 c = [570 525 465 0].41 0. pm = minprob([0. canonic % Distribution for X Enter row vector of coefficients c Enter row vector of minterm probabilities pm Use row matrices X and PX for calculations Call for XDBN to view the distribution Y = 150*[0:8]. PM = total(M. icalc Enter row matrix of Xvalues Enter row matrix of Yvalues Enter X probabilities PX 81 63]).PY] = canonicf(cy.*P) PM = 0. PY.0111 M = u > t.17 (p. % Distribution for Y PY = ibinom(8. for i = 1:4 px(i) = (X>=k(i))*PX'.4).0818 0.4).pmx).3514 0. end py = zeros(1. PX.5081 % P(Y>X) Solution to Exercise 9. icalc % Joint distribution Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X.01*[77 52 23 [X.01*[37 22 38 cy = [8 15 12 18 15 12 0]. pmy = minprob(0.
py]') 0.t >=30. u.0:8). Y.5081 k = [600 700 750]. PX.0111 Solution to Exercise 9.05. PY.01*[35 41 57]). 245) N1 = 0:30. for i = 1:3 px(i) = (X>=k(i))*PX'.2431 Solution to Exercise 9. PX.0. px = zeros(1. PN1 = ibinom(30. PM2 = total(M2. pmx = minprob(0.0. py = zeros(1.*P) PYgX = 0. end for i = 1:3 py(i) = (Y>=k(i))*PY'.3). u.0:30). Y. .2337 0. x = 150 . canonic Enter row vector of coefficients cx Enter row vector of minterm probabilities pmx Use row matrices X and PX for calculations Call for XDBN to view the distribution icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X. t.19 (p. and P M1 = u >= 2*t.*P) PM2 = 0. INDEPENDENT CLASSES OF RANDOM VARIABLES Enter Y probabilities PY Use array operations on matrices X.3448 M2 = u .252 CHAPTER 9.3).20*N1. and P xmax = max(X) xmax = 1040 ymax = max(Y) ymax = 800 PYgX = total((u>t). 245) cx = [310 380 350 0].18 (p. end disp([px.*P) PM1 = 0.0818 0.57. PM1 = total(M1. PY = ibinom(8.2560 0.4131 0. Y = 100*[0:8]. t.0784 0. PY.
N2 = 0:50.1957 Pl200 = total(Ml200. Y. [Y.5300 P2 = ((20<=W)&(W<=10))*PW' P2 = 0.5514 icalc3 % Alternate using icalc3 .PN1). Plose = total(Mlose. PY. icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X. u.PV] = csort(a.5249e04 Pm400 = total(Mm400.P). we avoid a renaming problem by using x and px for data vectors P X. P1 = (W>0)*PW' P1 = 0. t. PX.PW] = csort(b. Mm400 = G >= 400.PN2). u.*P) Pl200 = 0.PY] = csort(y. Ml200 = G < 200.P). Y.20 (p. icalc Enter row matrix of Xvalues V Enter row matrix of Yvalues 2*x Enter X probabilities PV Enter Y probabilities px Use array operations on matrices X. t. PX. u. [V. PX.*P) P200_450 = 0.2.0828 P200_450 = total(M200_450.01*[15 20 30 25 10]. and P b = t + u.PX] = csort(x. icalc Enter row matrix of Xvalues 3*x Enter row matrix of Yvalues 4*x Enter X probabilities px Enter Y probabilities px Use array operations on matrices X. x = [5 1 3 4 7].*P) Pm400 = 0. Mlose = G < 0.0.8636 Solution to Exercise 9. PY.253 [X. [W.*P) Plose = 3. Y. px = 0. M200_450 = (G>=200)&(G<=450). and P G = t + u. t. PN2 = ibinom(50. 245) Since icalc uses and X X and PX in its output. and P a = t + u.30*N2.0:50). y = 500 . PY.
icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X.PX] = canonicf(cx.8) P = 0. 246) cg = [18*ones(1. PYgX = total((u>t). [Z.23 (p. 245) cx = [3 9 4 0]. and P a = 3*t . 245) Solution to Exercise 9.24 (p.3752 PZpos = (Z>0)*PZ' PZpos = 0. PY.5300 P2 = ((20<=W)&(W<=10))*PW' P2 = 0.1/2. PX.PW] = csort(a. . Y. PZ.4307 pg = (ibinom(20.PY] = canonicf(cy.0:19))*(cbinom(20.5514 Solution to Exercise 9. 245) P = (ibinom(30. P1 = (W>0)*PW' P1 = 0.*P) PYgX = 0.5654 P5Z25 = ((5<=Z)&(Z<=25))*PZ' P5Z25 = 0.1:20))' pg = 0.1/6.4373 % Probability each girl throws more P = cbinom(15. cy = [2 6 2 3].pg.5) 0]. v.01*[47 37 41]). Y. pmy = minprob(0.1:30))' = 0.pmy).0:29))*(cbinom(30.4745 Solution to Exercise 9.PZ] = csort(G. [Y.22 (p.3100 % Probability eight or more girls throw more Solution to Exercise 9. Z. PY. PX. t.4*u + 2*v.1/6. pmx = minprob(0. INDEPENDENT CLASSES OF RANDOM VARIABLES Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter row matrix of Zvalues x Enter X probabilities px Enter Y probabilities px Enter Z probabilities px Use array operations on matrices X. and P G = 2*t . t. u. [W.4) 0]. [X. cm = [20*ones(1. u.3*u.pmx).P).21 (p.P).1/2.01*[42 27 33]).254 CHAPTER 9.
100. u.01*[37 52 48 71 63]). P1pos = (W1>0)*PW1' P1pos = 0. icalc Enter row matrix of Xvalues G Enter row matrix of Yvalues M Enter X probabilities PG Enter Y probabilities PM Use array operations on matrices X. and P G = u >= t.01*[77 82 75 91]).5207 icalc Enter row matrix of Xvalues W1 Enter row matrix of Yvalues W2 Enter X probabilities PW1 Enter Y probabilities PW2 Use array operations on matrices X.5197 Solution to Exercise 9.0:12). and P PX15 = (X>=15)*PX' PX15 = 0.0. Y. t. PY. PY = ibinom(12.0:20). PX = ibinom(10. t. t.3770 P2pos = (W2>0)*PW2' P2pos = 0.1/2.0:10). PX. 246) W1 = 20*[0:10] . .PM] = canonicf(cm. PW2 = ibinom(20.25 (p.*P) p1 = 0. icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X. u.pmg).*P) PG = 0. Y.75.26 (p. pmm = minprob(0.pmm).0:10). PX. and P H = ut>=10. [M. PX.5256 PY15 = (Y>=15)*PY' PY15 = 0. u.100. PW1 = ibinom(10.255 pmg = minprob(0.PG] = canonicf(cg.5811 Solution to Exercise 9. PG = total(G. Y = 3*[0:12]. PY. [G. 246) X = 2*[0:10]. Y.5618 G = t>=u. p1 = total(H. PY.40.0.1/3. W2 = 15*[0:20] .
0.Pm] = csort(H. v. [Tm.5746 icalc4 % Alternate using icalc4 Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter row matrix of Zvalues x Enter row matrix of Wvalues x Enter X probabilities PJ Enter Y probabilities PB Enter Z probabilities PM Enter W probabilities PE Use array operations on matrices X.27 (p. 246) PJ PB PM PE x = 0:5. icalc Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter X probabilities PM Enter Y probabilities PE Use array operations on matrices X. PY. Y.x).256 CHAPTER 9.80. u. = ibinom(5.x). [Tw. PY. PW t. and P Gw = u>=t. PGw = total(Gw. Z.P).*P) PG = 0. PX. t. = ibinom(5. INDEPENDENT CLASSES OF RANDOM VARIABLES PG = total(G. and P G = t+u. PY.x). PX. PZ.5182 Solution to Exercise 9. Y. and P H = v+w >= t+u. icalc Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter X probabilities PJ Enter Y probabilities PB Use array operations on matrices X.x). = ibinom(5.85.0. u. t.0. PX.*P) PH = 0.0. and P H = t+u. w. PH = total(H. Y. icalc Enter row matrix of Xvalues Tm Enter row matrix of Yvalues Tw Enter X probabilities Pm Enter Y probabilities Pw Use array operations on matrices X. PY.Pw] = csort(G.5746 . t.W PX. u. = ibinom(5.*P) PGw = 0.82.87. Y.P). u.
diameter is If we can assume true spherical shape and kw1/3 . but are really interested in a value derived Z (ω) = g (X (ω)).2: Price breaks The cultural committee of a student organization has arranged a special deal for tickets to a concert. schedule: • • • • 1120. and the density of the steel. If Thus Frequently.5/>. the Thus. a Borel function).org/content/m23329/1. 10.1 The problem.1 Functions of a Random Variable1 Introduction X is a random variable and g is a reasonable function (technically.1: A quality control problem In a quality control check on a production line for ball bearings it may be easier to weigh the balls than measure the diameters. if units of measurement. $16 each 3150. then Z = g (X) is a new random variable which has the value g (t) for any ω such that X (ω) = t. (10. $15 each 51100. the total cost (in dollars) is a random quantity (10.1. The agreement is that the organization will purchase ten tickets at $20 each (regardless Additional tickets are available according to the following of the number of individual buyers).1) Z = g (X) described by g (X) = 200 + 18IM 1 (X) (X − 10) + (16 − 18) IM 2 (X) (X − 20) + (15 − 16) IM 3 (X) (X − 30) + (13 − 15) IM 4 (X) (X − 50) 1 This content is available online at <http://cnx. X is the weight of the sampled ball. Example 10.2) 257 . where k w is the weight. rst. A wide variety of functions are utilized in practice. functions of a single random variable. $18 each 2130.Chapter 10 Functions of Random Variables 10. $13 each If the number of purchasers is a random variable X. we observe a value of some random variable. Example 10. an approach We consider. from this by a function rule. then is a factor depending upon the formula for the volume of a sphere. the desired random variable is D = kX 1/3 .
determine Once (10. N = {t : g (t) > 0} is the The X values −2 and 6 lie in this set. The mapping approach Determine P (Z > 0). Select those ti which do and add the associated probabilities. merely use the dening conditions.4) N is identied.4 (10. b. then X (ω) ∈ N . 1. Note that it is not necessary to describe geometrically the set M . 4.2 6 3 0. but the essential problem is the same. This is the approach we use in the MATLAB calculations. The discrete alternative X= PX = Z= Z>0 2 0.1. 0. Determine the set N of all those t which are mapped into M by the function g. and if g (X (ω)) ∈ M . To nd (a) P (X ∈ M ). Example 10. (b) Discrete alternative. g (t) = (t + 1) (t − 4). Now if X (ω) ∈ N . then Z = g (X) is a new random variable. The problem If for X is a random variable. above). Mapping approach. determine whether g (ti ) meets the dening condition for M. ∞) . ∞) . with respective probabilities 0. identify those values ti of X f .5) Second solution. 0. Simply nd the amount of probability mass mapped into the set random variable Mapping approach. The set P (X ∈ N ) in the usual manner (see part a. To nd (a) P (g (X) ∈ M ).2. X. M 3 = [30. M 4 = [50. then g (X (ω)) ∈ M . ∞) .2 = 0. Remark. the probability Z takes a value in the set M ? An approach to a solution the distribution We consider two equivalent approaches a.3: A discrete example X has values 2.2 14 1 6 1 0 0 .2 + 0. 3. M by the In the absolutely continuous case.3) The function rule is more complicated than in Example 10. ∞) where (10. • • probabilities.2 0 0. Suppose we have X. 0. Hence {ω : g (X (ω)) ∈ M } = {ω : X (ω) ∈ N } Since these are the same event.258 CHAPTER 10.1 4 1 0. 6. For each possible value ti of X. Hence set of points to the left of −1 or to the right of P (g (X) > 0) = P (X = −2) + P (X = 6) = 0.3 4 0 6 0. M 2 = [20. SOLUTION First solution.3 0. 0. calculate In the discrete case. M X which are in the set M and add the associated (b) Discrete alternative. Consider each value ti of X. they must have the same probability.2. Select those which meet the dening conditions for M and add the associated probabilities. Suppose N in the mapping approach is called the inverse image N = g −1 (M ). Consider Z = g (X) = (X + 1) (X − 4).2.1 (A quality control problem). FUNCTIONS OF RANDOM VARIABLES M 1 = [10. How can we determine P (Z ∈ M ).
FZ (v) = P (Z ≤ v) = P (Z ∈ M ) = P (X ∈ N ) = P (X ≤ σv + µ) = FX (σv + µ) Since the density is the derivative of the distribution function. for given t−µ ≤v σ i t ≤ σv + µ so that (10.259 Table 10.2 + 0.13) Thus fZ (v) = We conclude that σ 1 σv + µ − µ √ exp − 2 σ σ 2π 2 2 1 = √ e−v /2 = φ (v) (10.11) M = (−∞.9) Z 2 1 φ (t) = √ e−t /2 2π (10. 7]. Example 10.10) Now g (t) = Hence. Thus. 1) σ is (10.4: An absolutely continuous example Suppose X∼ uniform [−3.7) P (Z > 0). some important examples. ' ' fZ (v) = FZ (v) = FX (σv + µ) σ = σfX (σv + µ) (10.8) We consider. Then fX (t) = 0.2 = 0.6) In this case (and often for hand calculations) the mapping approach requires less calculation. v] the inverse image is N = (−∞. Let Z = g (X) = (X + 1) (X − 4) Determine (10. σ 2 then Z = g (X) = VERIFICATION We wish to show the denity function for X −µ ∼ N (0. next. 7] is 0. the desired for SOLUTION First we determine (t + 1) (t − 4) > 0 density over any probability is P (g (X) > 0) = 0. the integral of the subinterval of [−3.5: The normal distribution and standardized normal distribution To show that if X ∼ N µ.1 times the length of that subinterval.1. −3 ≤ t ≤ 7 (and zero elsewhere). σv + µ]. However. 1) . the discrete alternative is more readily implemented. Because of the uniform distribution.12) (10. As in Example 10.5 (10.4 (10. for MATLAB calculations (as we show below). Example 10. g (t) = t < − 1 or t > 4.14) 2π Z ∼ N (0.1 [(−1 − (−3)) + (7 − 4)] = 0. N = {t : g (t) > 0}. we have P (Z > 0) = 0.3 (A discrete example).1 Picking out and adding the indicated probabilities.
the corresponding density is Consider Z = aX + b (a = 0). 0). v−b a = 0. Thus for a > 1. FUNCTIONS OF RANDOM VARIABLES Example 10.6 (Ane functions) on ane functions shows that fX (t) = 1 φ σ t−µ σ = 1 t−µ 1 √ exp − 2 σ σ 2π 2 (10. −a = a. Since for t ≥ 0.8: Fractional power of a nonnegative random variable Suppose 0≤t 1/a X ≥ 0 and Z = g (X) = X 1/a ≤ v i 0 ≤ t ≤ v a .17) FZ (v) = 1 − FX For the absolutely continuous case.7: Completion of normal and standardized normal relationship Suppose Z ∼ N (0.20) Example 10. X has distribution function FX . we have FZ (v) = P (Z ≤ v) = P (X ≤ v a ) = FX (v a ) In the absolutely continuous case ' fZ (v) = FZ (v) = fX (v a ) av a−1 (10. +P X= v−b a (10. the two cases may be combined into one formula. Then Z = X 1/a ∼ Weibull (a. VERIFICATION Use of the result of Example 10. . 1).9: Fractional power of an exponentially distributed random variable Suppose X∼ exponential (λ).19) Example 10. σ 2 .16) X≥ v−b a =P X> v−b a +P X= (10.15) • a > 0: FZ (v) = P • a<0 FZ (v) = P So that X≤ v−b a = FX v−b a v−b a (10. fZ (v) = 1 fX a v−b a (10. t1/a is increasing. Show that X = σZ + µ (σ > 0) is N µ. Determine the distribution function for SOLUTION Z g (t) = at + b. λ. an ane function (linear plus a constant).18) P X= v−b a and by dierentiation • • for for 1 a > 0 fZ (v) = a fX v−b a 1 a < 0 fZ (v) = − a fX v−b a Since for a < 0. (and the density in the absolutely continuous case).22) Example 10. FZ (v) = P (Z ≤ v) = P (aX + b ≤ v) There are two cases (10.6: Ane functions Suppose fX .260 CHAPTER 10.21) (10. Here If it is absolutely continuous.
Now IEi = IMi (X).10: A simple approximation as a function of If X X X is limited to is a random variable. we use the discrete alternative approach.25) 10. In each subinterval. PM = M*PX' PM = 0.6.24) X to within the length of the largest subinterval i Mi . Suppose I is partitioned into n subintervals by points ti .0000 132.0000 1.*(X .0000 0. For the given subdivision.4800 disp([X.0000 0.11: Basic calculations for a function of a simple random variable X = 5:10.261 According to the result of Example 10.M. ti−1 ≤ si < ti . with a = t0 and b = tn .1.PX]') 5.0. • We perform array operations on vector X to obtain G = [g (t1 ) g (t2 ) · · · g (tn )] • (10. Let Mi = [ti−1 .0000 78. Suppose the distribution for X is expressed in the row vectors X and P X. We use of Example 10. We limit our discussion to the bounded case. ti ) be the i th subinterval. G = (X + 6).2 Use of MATLAB on simple random variables For simple random variables.23) Z∼ Weibull (a. a function of X (10. −1 Let Ei = X (Mi ) be the set of points mapped into Mi by X. we form a simple random variable Xs as follows.0000 120. a simple function approximation may be constructed (see Distribution Approximations).0000 4. b].0000 1.8).8 (Fractional power of a nonnegative random variable). FZ (t) = FX (ta ) = 1 − e−λt which is the distribution function for a (10. λ.100)&(G < 130).*(X .0:15). in which the range of a bounded interval I = [a. n We may thus write Xs = i=1 si IMi (X) .0003 . since this may be implemented easily with MATLAB. M = (G > . pick a point si . 0).26) relational and logical operations on G to obtain a matrix M which has ones for those ti (values X ) such that g (ti ) satises the desired condition (and zeros elsewhere). 1 ≤ i ≤ n−1 and Mn = [tn−1 . Example 10. 1 ≤ i ≤ n−1. since IEi (ω) = 1 i X (ω) ∈ Mi IMi (X (ω)) = 1.G. tn ].0000 0 % Values of X % Probabilities for X % Array operations on X matrix to get G = g(X) % Relational and logical operations on G % Sum of probabilities for selected values % Display of various matrices (as columns) 0. • The zeroone matrix M is used to select the the corresponding pi = P (X = ti ) and sum them by the taking the dot product of M and P X . The simple random variable n Xs = i=1 approximates si IEi (10.1). PX = ibinom(15. Then the Ei form a partition of the basic space Ω.0000 3.
PX % Alternate using Z.0000 78.0634 48.262 CHAPTER 10.30. B.0000 0 0 0 1.0000 132.0000 0.0832 48.1400 10.0005 % Sorting and consolidating to obtain % the distribution for Z = g(X) 2.0245 78.0000 1.0000 5.0000 120.0000 4.1181 0 0.0634 0.0000 8.PZ]') 132.0064 132.3334 90.0000 0 9.1859 p1 = (Z<120)*PZ' p1 = 0.60. canonic Enter row vector of coefficients c Enter row vector of minterm probabilities Use row matrices X and PX for calculations Call for XDBN to view the distribution disp(XDBN) 0 0.PZ] = csort(G.0000 0. (10.0000 48.0000 0. C} We calculate the distribution for X.0245 0.0000 3.0000 288.0000 90.0005 P1 = (G<120)*PX ' P1 = 0.0000 0 48.0000 0.0000 6.0000 120.1181 0.0000 120. disp([Z. pm = minprob(0.0000 0 2. PZ X = 10IA + 18IB + 10IC with {A.1771 78.0000 1.0000 1.0600 20.0000 0.1859 0.0000 0.0000 0.0000 7.1859 120.0219 0.0047 0.0000 0.0000 0.0000 90.1859 Example 10.0000 0 0.0000 1.5].PX).2066 0. FUNCTIONS OF RANDOM VARIABLES 1.0000 120.0000 0.0000 1.3500 18.0003 288.0000 [Z.1268 0.27) c = [10 18 10 0].0000 0.1771 0.0000 1. then determine the distribution for Z = X 1/2 − X + 50 independent and P = [0.0000 10.0074 0.0000 90.0074 120.0000 1.12 % Further calculation using G.0612 0.2100 pm .0000 0.0000 1.0000 1.0000 0.0016 0.0000 0.1*[6 3 5]).
263 28.8) = FX (0.2426 0.0000 0.PW] = csort(H.82 = 0. We may use the approximation mprocedure tappr to obtain an approximate discrete distribution.64) = 0.0600 43.0600 741 0. [Z.5800 % Formation of G matrix % Sorts distinct values of g(X) % consolidates probabilities 0 % Direct use of Z distribution 1 1 Remark. Suppose we want calculated to be P (Z ≤ 0.8 i X ≤ 0.^2 + 2*t)/2 Use row matrices X and PX as in the simple case G = X.^2 .org/content/m23332/1.3500 50. The desired probability may be P (Z ≤ 0.PZ]') 18.3359 Using the approximation procedure.2100 36.642 /2 = 0.4721 0.64.1500 38. FX (t) = 1 2 Example 10.28) tappr Enter matrix [a b] of xrange endpoints [0 1] Enter number of x approximation points 200 Enter density as a function of t (3*t. we may name the output as desired.1400 M = (Z < 20)(Z >= 40) M = 1 0 0 PZM = M*PZ' PZM = 0.13: Continuation of Example 10. .8).8.1500 34.12.2100 fX (t) = 1485 0. Example 10.0900 G = sqrt(X) . above.1400 0. [W. Now Z ≤ 0.0000 0. PM = M*PX' PM = 0.PX).5/>.1623 0.X + 50.14: A discrete approximation Suppose X has density function 3t2 + 2t for Then t3 + t2 . H = 2*X.PZ] = csort(G.3*X + 1.^(1/2). Then we work with the approximating random variable as a simple random variable.2915 0.PX) W = 1 171 595 PW = 0. M = G <= 0. Let Z =X 1/2 .643 + 0.1644 0. Note that with the mfunction csort.0900 27.0900 0 ≤ t ≤ 1.1500 1 2 2775 0.2 Function of Random Vectors2 Introduction 2 This content is available online at <http://cnx.0000 0. disp([Z.3359 % Agrees quite closely with the theoretical 10. we have (10.3500 0.
Select those (ti . To nd (a) Q.1 The general approach extended to a pair Consider a pair {X. uj ) (X. Y (ω)) ∈ Qig ((X (ω) .2. It is convenient to consider the case of two random variables. determine whether g (ti . plane. Y ) ∈ Q). Now Determine the set Q of all those (t. above). they must have the same probability. although the details are more complicated. Y ). function g. uj ) which do and add the associated proba We illustrate the mapping approach in the absolutely continuous case. Y } having joint distribution on the plane. Y ) ∈ M i (X. uj ) of (X. Select those which meet the and add the associated probabilities. nding the set by integrating Q A key element in the approach is The desired probability is obtained on the plane such that over fXY Q. (ti . It does not require that we describe geometrically the region b. a. g (X. f Q XY . This is the approach we use in the MATLAB calculations. g is real valued and M is a subset the real line. • • Q on the plane by the random vector W = (X. Extensions to more than two random variables are made similarly. The approach is analogous to that for a single random variable with distribution on the line. 10. To nd (a) P ((X. of In the absolutely continuous case. uj ) (ti . Y (ω)) ∈ M } = {ω : (X (ω) . FUNCTIONS OF RANDOM VARIABLES The general mapping approach for a single random variable and the discrete alternative extends to functions of more than one variable. . u) which are mapped into M by the W (ω) = (X (ω) . Y ) which are in the set Q and (b) Discrete alternative. calculate In the discrete case.29) {ω : g (X (ω) .30) Q is identied on the (b) Discrete alternative. Consider each vector value dening conditions for Q (ti . uj ) of (X. determine P ((X. For each possible vector value meets the dening condition for M. Y (ω)) ∈ M Hence(10. Y (ω)) ∈ Q} Since these are the same event. Simply nd the amount of probability mass mapped into the set Mapping approach. Y ) ∈ M ). Mapping approach. identify those vector values add the associated probabilities. considered jointly. Y ) ∈ Q) in the usual manner (see part a. Once (10. Y ) ∈ Q. bilities. Y ).264 CHAPTER 10. P (g (X. Y ).
Thus ∞ v−t FZ (v) = Qv fXY = −∞ −∞ fXY (t. Examination of the gure shows that for this region. The desired probability is 2 t 0 P (Y ≤ X) = 0 6 (t + 2u) du dt = 32/37 ≈ 0. t = 2. Y ) ∈ Qv ) {(t.32) FZ (v) = P (X + Y ≤ v) = P ((X. u) : u ≤ v − t} For any xed where Qv = {(t.2). u) = 37 (t + 2u) on the region bounded by t = 0.265 Figure 10.16: The density for the sum Suppose the pair {X.8649 37 X +Y fXY .1: Distribution for Example 10. u = 0. Determine (10. t} (see Figure 1). fXY is dierent from zero on the triangle bounded by t = 2. Here g (t.31) Example 10. u = 0. u) = t − u and M = [0. u) : u ≤ t} which is the region on the plane on or below the line u = t. u) : t + u ≤ v} = u = v−t (see (10. Y } has joint density fXY (t. ∞). Example 10. the region Qv is the portion of the plane on or below the line Figure 10. Determine P (Y ≤ X) = P (X − Y ≥ 0). u = max{1. Now Q = {(t.34) .15: A numerical example The pair 6 {X. and u = t.33) v. Y } has joint density the density for Z =X +Y SOLUTION (10.15 (A numerical example). u) : t − u ≥ 0} = {(t. u) du dt (10.
PXY 0≤v≤1 and FZ (v) = 1 − (2 − v) 2 2 for 1≤v≤2 [0. Now PXY (Qv ) = 1 − PXY (Qc ). Y } has joint uniform density on the unit square 0 ≤ t ≤ 1.37) With the use of indicator functions. v − t) dt (10.1] (v) v + I(1. where v As Figure 3 shows.266 CHAPTER 10. these may be combined into a single expression fZ (v) = I[0. Determine the density for SOLUTION Z =X +Y. 0 ≤ u ≤ 1. 2]. which has 2 c (and hence probability) v /2. Qv which has probability mass is the lower shaded triangular region on the gure.35) convolution integral. we get ∞ fZ (v) = ∞ This integral expresssion is known as a fXY (t.17: Sum of joint uniform random variables Suppose the pair {X.36) Dierentiation shows that Z has the symmetric triangular distribution on fZ (v) = v for 0≤v≤1 and fZ (v) = (2 − v) for 1≤v≤2 (10. FZ (v) = v2 2 for is the probability in the region Qv : u ≤ v − t. the region. for v ≤ 1.2] (v) (2 − v) (10. the complementary region Qv is the upper shaded (2 − v) /2. Example 10. so that 2 (Qv ) = 1 − (2 − v) /2. It has area 2 in this case. FUNCTIONS OF RANDOM VARIABLES Dierentiating with the aid of the fundamental theorem of calculus. Thus. For v > 1.2: Region Qv for X + Y ≤ v. c FZ (v) part of area the complementary set Qv is the set of points above the line. since (10.38) . Figure 10.
3: Geometry for sum of joint uniform random variables. 1] (v − t). u) = I[0. 1] (v) I[0. W = h (Y ). then the pair {Z. However. Thus. ALTERNATE SOLUTION Since v−t≤1 fXY (t. This is illustrated for simple random variables with the aid of the mprocedure jointzw at the end of the next section.39) t gives the result above. 1] (t) Integration with respect to (10. Now 0≤ fXY (t. the pair {Xs . Example 10. 2] (v) I[v−1.41) X Y. Independence of functions of independent random variables Suppose {X. v − t) = I[0. W } is independent. pair {IMi (X) . INj (Y )} is independent.40) {Z −1 (M ) . Y } is an independent pair and Z = g (X) . n n m Xs = i=1 As functions of ti IEi = i=1 and ti IMi (X) and Ys = j=1 uj IFj = j=1 is uj INj (Y ) Also each (10. the pair {Z. Y } is an independent pair. Let Z = g (X) . then in general {Z. so that we have fXY (t. W } is not independent. i v − 1 ≤ t ≤ v . Ys } independent. Y ). if Z = g (X. {X. Y } is an independent pair with simple approximations Xs m and Ys as described in Distribution Approximations. N }.18: Independence of simple approximations to an independent pair Suppose {X. Y ) and W = h (X. v − t) = I[0. 1] (t) I[0. and Since Z −1 (M ) = X −1 g −1 (M ) the pair If W −1 (N ) = Y −1 h−1 (N ) (10. W } is independent. W = h (Y ). respectively. v] (t) + I(1. 1] (t) I[0. W −1 (N )} is independent for each pair {M.267 Figure 10. 1] (u). .
0297 0.0744 4.0000 0.0837 0.0198 6.2. % Calculation using the XY distribution PM = total(M. use total((G>=1). PM = (Z>=1)*PZ' % Calculation using the Z distribution PM = 0. u.0000 0.1375 0 0.0000 0.0817 0. Z = g (X. u.PZ] = csort(G.0744. uj ).1032. we use array operations on the values of CHAPTER 10.0000 0.0000 0.PZ]') % Display of the Z distribution 12.0000 0.0180 % X Y P We extend the example above by considering a function W = h (X. we must use array operations on the calculating matrices a matrix G t and u to obtain whose elements are mfunction csort on G g (ti . = [0.*P) % Alternately.1032 9.0360].0000 0.0589 3.uj)] M = G >= 1.0405 0.19 % file jdemo3. 0.0405 1. This is illustrated in the following example.0297 11. % Formation of G = [g(ti.0000 0.0000 0.0837 5.0372 0.1161 0. = [0 1 2 4]. we may use the A rst step.268 10. FUNCTIONS OF RANDOM VARIABLES X to determine a matrix of values of g (X).0209 8.0000 0.0132 0.0285 0. PX.0516 16. and P G = t. jdemo3 % Call for data jcalc % Set up of calculating matrices t.1425 2.0000 0. 0.4665 disp([Z.0264. t. PY. 0. Y ).0774 0.0000 0. .P). Example 10.0360 10.0270 0.0589 0.m data for joint simple distribution = [4 2 0 1 3].2 Use of MATLAB on pairs of simple random variables In the singlevariable case.4665 [Z. In the twovariable case. Y.0558 0.1059 3.0000 0.0516 0.0180 0. Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.^2 3*u.0000 0. Y ) which has a composite denition.0402 6.0000 0.0198 0. To obtain the distribution for and the joint probability matrix P. then.0372 13.0209 0. is the use of jcalc or icalc to set up the joint distribution and the calculating matrices.*P) PM = 0.
2700 1.0000 0.19 Let W ={ X X +Y 2 2 for for X +Y ≥1 Determine the distribution for W (10.0000 0. disp([W.20: Continuation of Example 10.Y) % Plot of distribution function PW % See Figure~10.0000 0.0516 20.^2).0000 0.0372 32.0270 5.2400 4.*(t+u<1).0000 0.1900 3.0180 17.42) X +Y <1 H = t. % Specification of h(t.0774 8.u) [W.0198 0 0.0000 0.0000 0.0558 16.PW] = csort(H.0000 0.*(t+u>=1) + (t.PW]') 2.^2 + u.0000 0.4 .269 Example 10.0000 0.0132 ddbn Enter row matrix of values W Enter row matrix of probabilities print % Distribution for W = h(X.P).
value matrices and the 2. Assign to the position in the joint probability matrix for the probability PZW (i. P Z and P W marginal probability matrices. Y ) and distribution for a pair Suppose W = h (X.4: Distribution for random variable W in Example 10. corresponds to a set of pairs of P ZW for (Z. j) = total (P (a)) (10. Z = zj ).43) The joint distribution requires the probability of each pair. 4. If we select and add the probabilities corresponding to the latter pairs. FUNCTIONS OF RANDOM VARIABLES Figure 10. As special Z=X Z or W =Y.270 CHAPTER 10. For each pair (wi . has values [z1 z2 · · · zc ] and and W has values [w1 w2 · · · wr ] (10. Z = g (X. we may have (X. zj ). Z = zj ). use the MATLAB function nd to determine the positions a for which (10. Y ) In previous treatments. Z) values corresponds to one or more pairs of (Y. Each pair of (W.20 (Continuation of Example 10. Use csort to determine the Z and W G = [g (t. u)]. we have P (W = wi .45) . Y ). This may be values. Set up calculation matrices t and u as with jcalc. we assign to each position X Y P (W = wi . Z = zj ). Joint distributions for two functions of It is often desirable to have the cases. u)] and H = [h (t. with values of W increasing upward. Y ).19). we use csort to obtain the joint marginal distribution for a single function Z = g (X. X) values. Use array arithmetic to determine the matrices of values 3. j) P ZW (Z.44) (H == W (i)) & (G == Z (j)) (i. To determine the joint probability matrix arranged as on the plane. W ) P (W = wi . Each such pair of values (i. W ) 5. j) the probability accomplished as follows: 1.
u) positions disp([i j]) % Optional display of positions .027 0.026 0.m P = [0.013]. Z = 1 position.060 0.009.. Z = 1) is the total probability at these positions. Example 10.P) Z = 0 1 2 PZ = 0.3530 [Z.052 + 0.PZ] = csort(G. 0. Y = 2:3.012 0.001 = 0. (2.H = u. This may be achieved by MATLAB with the following operations. u) positions for which this pair of (W.052 0.050 0.3110 0. we need to determine the (t. we nd these to be (2. We put this probability in the joint probability matrix P ZW at the W = 4.054 0..018.. 0.001 0.Y) H = 9 9 9 9 9 4 4 4 4 4 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1 4 4 4 4 4 G = abs(t) % Matrix of values for Z = g(X. Then P (W = 4.3610 % Third value for W % Second value for Z P (W = 4.1870 % Determination of marginal for Z 0.2).023 0. u .015 0.1490 0.040 0.040 0. which are then implemented in the mprocedure jointzw.j] = find((H==W(3))&(G==Z(2))).001 + 0.058 0.013.. 0.P) W = 0 1 4 PW = 0. (6..2670 0.4).271 We rst examine the basic calculations.029 0.3720 r = W(3) r = 4 s = Z(2) s = 1 To determine 9 2 2 2 2 2 2 % Determination of marginal for W 0. jdemo7 % Call for data in jdemo7. and (6. Z = 1).032 0.PW] = csort(H.061 0.048 0. 0.013.4).039. This is 0. % Location of (t.058 0.Y) G = 2 1 0 1 2 1 0 1 2 1 0 1 2 1 0 1 2 1 0 1 2 1 0 1 [W.21: Illustration of the basic joint calculations % file jdemo7.112.061 0.^2 % Matrix of values for W = h(X. X = 2:2.030 0. 0..001 0. Z) values is taken on..058 + 0.m jcalc % Used to set up calculation matrices t. By inspection..004 0.053 0.2).060 0. [i.
0774 0.272 CHAPTER 10.m data for joint simple X = [4 2 0 1 3].0209 0.0520 0 PZW = zeros(length(W). P0(a) = P(a) P0 = 0 0 0 0 0. Y ) = X − Y  and W = h (X.1120 0 0 0 0 PZW = flipud(PZW) PZW = 0 0 0 0.2) = total(P(a)) PZW = 0 0 0 0 0 0 0 0.length(Z)) PZW(3. w.0405 0.1161 0.0264.0270 0.u): abs(t. FUNCTIONS OF RANDOM VARIABLES 2 2 6 2 2 4 6 4 a = find((H==W(3))&(G==Z(2))).Y): P Enter values for X: X Enter values for Y: Y Enter expression for g(t. P0 = zeros(size(P)).0558 0. Example 10. v.0589 0.0010 0 0 0 0 0 0 0 0 0 0 0 0. PW. disp(PZW) 0.1032. 0.0180 0.0132 0 0 0 0 0.0837 0.22: Joint distribution for Z = g (X. PZW 0 0 0 .0132 0.0580 0 0 0 0 0 0 0 0.0516 0.*u) Use array operations on Z. 0.0570 0 distribution 0. W.0264 0 0 0 0 0.0372 0.u): abs(abs(t)u) Enter expression for h(t.0817 0.1120 0 0 0 0 The procedure the % Location in more convenient form % Setup of zero matrix % Display of designated probabilities in P 0 0 0. Y = [0 1 2 4].0285 jdemo3 % Call for data jointzw % Call for mprogram Enter joint prob for (X.0198 0. Y ) = XY  % file jdemo3.0360]. 0.0297 0. P = [0.0010 0 % Initialization of PZW matrix % Assignment to PZW matrix with % W increasing downward % Assignment with W increasing upward 0 0 0 0 carries out this operation for each possible pair of flipud jointzw W and Z values (with operation coming only after all individual assignments are made). PZ.0744.
u): t+u % Z = g(X. using marginal dbn ew = 2.u): t.0360 0 0 0 0 0 0.Y) Use array operations on Z. However. using marginal dbn ez = 1. PZ. w.1446 EZ = total(v.0477 ez = Z*PZ' % Alternate.^2 .4398 0 0 0. then the pair {Z.Y} is independent jointzw Enter joint prob for (X. PZ. W } is not independent. % P(Z>W) PM = total(M.1363 0. Y } is an independent pair and Z = g (X). W.1032 0 0 0. W. Y ).Y): P Enter values for X: X Enter values for Y: Y Enter expression for g(t.1107 0 0. Y ) and W = h (X. W } is independent.Y} is independent % The pair {g(X). v.23: Functions of independent random variables jdemo3 itest Enter matrix of joint probabilities P The pair {X. PW.0725 0 0 0 0. if Z = g (X. PZW itest Enter matrix of joint probabilities PZW The pair {X.6075 ew = W*PW' % Alternate.4398 EW = total(w.h(Y)} is independent jdemo3 % Refresh data jointzw Enter joint prob for (X. PZW itest Enter matrix of joint probabilities PZW . v.273 0 0. W = h (Y ).3390 At noted in the previous section.3*t % Z = g(X) Enter expression for h(t. then in general the pair {Z.*u % W = h(X. PW.6075 M = v > w.0817 0 0.u): t.*PZW) EW = 2.Y): P Enter values for X: X Enter values for Y: Y Enter expression for g(t. We may illustrate of the mprocedure jointzw this with the aid Example 10.0558 0 0 0 0 0.Y} is independent % The pair {X.*PZW) PM = 0.Y) Enter expression for h(t.u): abs(u) + 3 % W = h(Y) Use array operations on Z.0744 0.*PZW) EZ = 1. w. if {X.0405 0.
Y ) ∈ Qv ). u ≥ v/t}} (10. In this section. Let Z = XY . SOLUTION (see Figure 10.274 CHAPTER 10. v ≥ 0. FUNCTIONS OF RANDOM VARIABLES The pair {X.Y} is NOT independent % The pair {g(X.5: Region Qv for product XY . u ≤ v/t} {(t.Y)} is not indep To see where the product rule fails. Example 10. Figure 10.Y).24: Distribution for a product Suppose the pair {X. u) : tu ≤ v} = {(t.2. u) : t < 0. u) : t > 0. Y } has joint density fXY . then obtain simple approximations.3 Absolutely continuous case: analysis and approximation As in the analysis Joint Distributions. we solve several examples analytically. call for D % Fails for all pairs 10.h(X. we may set up a simple approximation to the joint distribution and proceed as for simple random variables.5) Qv = {(t. Determine Qv such that P (Z ≤ v) = P ((X.46) .
PY. it must be expressed in terms of t.8466. % Note that although f = 1. [Z. 0 < v ≤ 1 (10.8466. tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (u>=0)&(t>=0) Use array operations on X. u. Y } ∼ uniform on unit square fXY (t.5)*PZ' p = 0. v/t}} (10. above . P (XY ≤ v) = Qv Integration shows Then (see Figure 10.47) FZ (v) = P (XY ≤ v) = v (1 − ln (v)) For so that fZ (v) = −ln (v) = ln (1/v) . Y with uniform joint distribution on the unit square. 0 ≤ t ≤ 1. FZ (0. 0 ≤ u ≤ 1. Example 10.6) 1 dudt where Qv = {(t.275 Figure 10.6: Product of X.8465 % Theoretical value 0.48) v = 0. u) : 0 ≤ t ≤ 1. u.*u.P).5. Y. 0 ≤ u ≤ min{1.PZ] = csort(G. p = (Z<=0. u) = 1.25 {X. and P G = t. t. PX.5) = 0.
PQ = total(Q. Y ) ∈ Q) Reference to Figure 10.4853 % Theoretical value 0. Y ) ∈ Q) = 18/37 ≈ 0. u) : u ≤ 1/t} (10. by t = 0.49) P ((X. above . FUNCTIONS OF RANDOM VARIABLES Example 10. u.*(u<=max(t. PX. t = 2.50) APPROXIMATE SOLUTION tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 300 Enter number of Y approximation points 300 Enter expression for joint density (6/37)*(t + 2*u).7 shows that where Q = {(t.26 (Continuation of Example 5 (Example 8. and P Q = t. Figure ANALYTIC SOLUTION P (Z ≤ 1) = P ((X. Determine P (Z ≤ 1).26: Continuation of Example 5 (Example 8.4865. u) = 37 (t + 2u) on the region bounded u = 0.4865 6 37 1 0 1 0 6 (t + 2u) dudt + 37 2 1 1/t 0 (t + 2u) dudt = 9/37 + 9/37 = (10. 10.5: Marginals for a discrete distribution) from "Random Vectors and Joint Distributions"). Y } has joint density fXY (t. Y. and u = max{1. Let Z = XY . t.5: Marginals for a discrete distribution) from "Random Vectors and Joint Distributions" The pair 6 {X. PY.*u<=1. t} (see Figure 7).1)) Use array operations on X.7: Area of integration for Example 10.276 CHAPTER 10.*P) PQ = 0.
u) : u ≤ 1/2. u ≤ t2 } P (Z ≤ 1/2) 2 3 1 t2 1/2 0 = 2 3 t1 0 1/2−t 0 (t + 2u) dudt + 2 3 t2 t1 t2 0 (t + 2u) dudt + (10.2322 APPROXIMATE SOLUTION . u > t2 }.277 G = t. using the distribution for Z In the following example. Reference to 2 Figure 10.27: A compound function The pair {X. it has a dierent rule for Figure 10. Y > X 2 = P (X. Y ) Y + IQc (X. Example 10.27 (A compound function). u) : u ≤ t2 }. t . [Z. u) : t + u ≤ 1/2. Y ) ∈ QA where QB (10.52) QB = {(t. 1/2 . % Alternate.53) (t + 2u) dudt = 0. Y ) (X + Y ) (10. ANALYTICAL SOLUTION P (Z ≤ 1/2) = P Y ≤ 1/2. 0 ≤ u ≤ 1. PZ1 = (Z<=1)*PZ' PZ1 = 0.8 shows that this is the part of the unit square for which u ≤ min max 1/2 − t. Z={ for Y X +Y for for = IQ (X.51) Q = {(t.PZ] = csort(G. Determine P (Z < = 0.8: Regions for P (Z ≤ 1/2) in Example 10. Let 1/2 − t1 = t1 and t2 = 1/2.P). That is.4853 dierent parts of the plane. the function g has a compound denition. u) = X2 − Y ≥ 0 X2 − Y < 0 2 3 (t + 2u) on the unit square 0 ≤ t ≤ 1. Y } has joint density fXY (t. Then and QA = {(t.*u.5). Y ≤ X 2 + P X + Y ≤ 1/2. 2 2 We may break up the integral into three parts.
Y. the associated quantile function Q is essentially an inverse F.*(1Q).^2. 10. If F is a probability distribution function. Example 10. expectation. and examples P (X ≤ t) = u. based on the MATLAB function ernv (inverse error function). with having the same (marginal) distribution. Quantile functions for simple random variables may be used to obtain an important Poisson approximation theorem (which we do not develop in this work). For F continuous and strictly increasing at t.3.3 The Quantile Function3 F is a probability distribution function. prob = total((G<=1/2). the approximation proceeds in a straightforward manner from the normal description of the problem. Once set up. G = u. We consider a general denition which applies to any probability distribution function. The quantile function is dened on the unit interval (0. If 10. u. above The setup of the integrals involves careful attention to the geometry of the system. the quantile function may be used to construct a The quantile function for a probability distribution has many uses in both the theory and application of random variable having F as its distributions function. 1).55) content is available online at <http://cnx. Thus. Denition: If F is a function quantile function for F is given by having the properties of a probability distribution function. 0) −ln (1 − u) /3 (10. then the Q (u) = inf {t : F (t) ≥ u} 3 This ∀ u ∈ (0. an and associated Yi independent class {Yi : 1 ≤ i ≤ n} may be constructed. The numerical result compares quite closely with the theoretical value and accuracy could be improved by taking more subdivision points. This fact serves as the basis of a method of simulating the sampling from an arbitrary distribution with the aid of a nite class random number generator.1 The Quantile Function probability. t. The restriction to the continuous case is not essential. FUNCTIONS OF RANDOM VARIABLES tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (2/3)*(t + 2*u) Use array operations on X. 1) (10.29: The Normal Distribution The mfunction values of Q norminv. Also. given any {Xi : 1 ≤ i ≤ n} each Xi of random variables. then Q (u) = t i F (t) = u.6/>. and P Q = u <= t. if u is a probability value. PY.2322.*P) prob = 0. On the other hand. calculates for the normal distribution.2328 % Theoretical is 0.54) u = F (t) = 1 − e−3t t ≥ 0 ⇒ t = Q (u) = Example 10.28: The Weibull distribution 2 (3. PX.278 CHAPTER 10.org/content/m23385/1.*Q + (t + u). . the evaluation is elementary but tedious. t = Q (u) is the value of t for which of The quantile function is used to derive a number of useful special forms for mathematical General conceptproperties. 2.
form an independent class with the same marginals as the Xi . then the random F. . has distribution function Q. F (t∗ ) < u∗. then X = Q (U ) FX (t) = P [Q (U ) ≤ t] = P [U ≤ F (t)] = F (t). 1 ≤ i ≤ n. 1) (marginals for the joint uniform distribution on the unit hypercube with sides (0. To see this. Property (Q2) implies that if variable FX = F . Example 10. has distribution function Hence. 1).279 We note If If F (t∗ ) ≥ u∗.30: Independent classes with prescribed distributions {Xi : 1 ≤ i ≤ n} is an arbitrary class of random variables with corresponding distribution {Fi : 1 ≤ i ≤ n}. 1)). 1). Then the random variables Yi = Qi (Ui ) . note that X = Q (U ). Suppose functions Several other important properties of the quantile function may be established. with quantile function uniformly distributed on (0. ≤t then then t∗ ≥ inf {t : F (t) ≥ u∗ } = Q (u∗ ) t∗ < inf {t : F (t) ≥ u∗ } = Q (u∗ ) ∀ u ∈ (0. with U F is any distribution function. we have the important property: (Q1)Q (u) i u ≤ F (t) The property (Q1) implies the following important property: (Q2)If U ∼ uniform (0. Let {Qi : 1 ≤ i ≤ n} be the respective quantile functions. 1). There is always an independent class {Ui : 1 ≤ i ≤ n} iid uniform (0.
2. 1.280 CHAPTER 10. FUNCTIONS OF RANDOM VARIABLES Figure 10. Q is leftcontinuous. whereas F is rightcontinuous. construction of the graph of obtained by the following two step procedure: u = Q (t) may be • • Invert the entire gure (including axes).9: Graph of quantile function from graph of distribution function. If jumps are represented by vertical line segments. then Rotate the resulting gure 90 degrees counterclockwise .
interval The quantile function is a leftcontinuous step function having value ti ti and on the (bi−1 .57) Figure 1 shows a plot of the distribution function FX .56) Example 10. If X is discrete with probability pi at ti .9. This may be stated F (ti ) = bi . . It is reected in the horizontal axis then Q (u) versus u.281 This is illustrated in Figure 10.2 0.4] (10. If jumps are represented by vertical line segments.1 0.3 0. bi ]. 1 ≤ i ≤ n. thenQ (u) = ti forF (ti−1 ) < u ≤ F (ti ) (10. then jumps go into at segments and at segments go into vertical segments. 3. where b0 = 0 If and bi = pj .31: Quantile function for a simple random variable Suppose simple random variable X has distribution X = [−2 0 1 3] rotated counterclockwise to give the graph of P X = [0. i j=1 then F has jumps in the amount pi at each is constant between.
.1 3. we use dquanplot which employs the stairs function and plots X vs the distribution function F X .282 CHAPTER 10.1 9. The basis for quantile function calculations for a simple random variable is the formula above. FUNCTIONS OF RANDOM VARIABLES Figure 10. We use the analytic characterization above in developing a number of mfunctions and mprocedures.4 7. The procedure dsample employs dquant to obtain a sample from a population with simple distribution and to calculate relative frequencies of the various values.31 (Quantile function for a simple random variable).10: Distribution and quantile functions for Example 10.8]. which is used as an element of several simulation procedures. This is Example 10.3 5.3 1. mprocedures for a simple random variable implemented in the mfunction dquant.32: Simple Random Variable X = [2. To plot the quantile function.
1000 0.1875 7.283 PX = 0.32 Population variance Var[X] = 16.01*[18 15 23 19 13 12].305 Sample variance = 16.1800 0.325 Population mean E[X] = 3. A pair of mprocedures are available for simulation of that problem.2320 5.11 for plot of results rand('seed'.1466 3.33 Figure 10.1805 1.1900 0.8000 0. The rst is called targetset.32 (Simple Random Variable).11: Quantile function for Example 10.1333 9.1300 0. It calls for the population distribution and then for the designation of a target set of possible values. The .3000 0. or one of a set of values.1200 0.3000 0.2300 0. Sometimes it is desirable to know how many trials are required to reach a certain value. dquanplot Enter VALUES for X X Enter PROBABILITIES for X PX % See Figure~10.1000 0.1201 Sample average ex = 3.4000 0.1500 0.0) % Reset random number generator for reference dsample Enter row matrix of values X Enter row matrix of probabilities PX Sample size n 10000 Value Prob Rel freq 2.
1].3 0.7 5.32 The standard deviation is 4.2 0.2000 3.3000 rand('seed'.2 3.7000 Enter the set of target values Call for targetrun % Population values % Population probabilities % Set of target states PX E 5. of members of the target set to be reached.4 shows the fraction of runs requiring t steps or less . PX = [0.3 0.3 0.6.0) % Seed set for possible comparison targetrun Enter the number of repetitions 1000 The target set is 1. FUNCTIONS OF RANDOM VARIABLES targetrun.1 0. E = [1.089 The minimum completion time is 2 The maximum completion time is 30 To view a detailed count. and asks for the number After the runs are made. call for D.33 X = [1.7000 Enter the number of target values to visit 2 The average completion time is 6.3000 3.284 CHAPTER 10. calls for the number of repetitions of the experiment. The first column shows the various completion times. targetset Enter population VALUES X Enter population PROBABILITIES The set of population values is 1. Example 10. calculated and displayed.3000 0.3 3.7]. the second column shows the numbers of trials yielding those times % Figure 10.5 7. various statistics on the runs are second procedure.3].5000 7.
34: Quantile function associated with a distribution function. dfsetup utilizes the distribution function to set up an approximate simple distribution.*(t >= 0)'. mprocedures for distribution functions Example 10. these are not displayed as in the discrete case.*(t < 0) + (0. The mprocedure qsample is used to obtain a sample from the population. F = '0. either defined previously or upon call Enter matrix [a b] of Xrange endpoints [1 1] Enter number of X approximation points 1000 Enter distribution function F as function of t F Distribution is in row matrices X and PX quanplot Enter row matrix of values X Enter row matrix of probabilities PX % String .4*(t + 1).12: Fraction of runs requiring t steps or less. A procedure The mprocedure Since there are so many possible values. except the ordinary plot function is used in the continuous case whereas the plotting function stairs is used in the discrete case.6 + 0. quanplot is used to plot the quantile function. This procedure is essentially the same as dquanplot.4*t).285 Figure 10. dfsetup Distribution function F is entered as a string variable.
2664 Figure 10.*(1<=t)' Distribution is in row matrices X and PX quanplot Enter row matrix of values X Enter row matrix of probabilities PX Probability increment h 0.25 Approximate population variance V(X) = 0.14 for plot . Enter matrix [a b] of xrange endpoints [0 3] Enter number of x approximation points 1000 Enter density as a function of t '(t. acsetup Density f is entered as a string variable. except that the density function is entered as a string variable. procedures quanplot and qsample are used as in the case of distribution functions.13: function.286 CHAPTER 10. either defined previously or upon call. FUNCTIONS OF RANDOM VARIABLES Probability increment h 0.01 % See Figure~10. This is essentially the Then the same as the procedure tuappr.01 % See Figure~10.procedure acsetup is used to obtain the simple approximate distribution.13 for plot qsample Enter row matrix of X values X Enter row matrix of X probabilities PX Sample size n 1000 Sample average ex = 0.004146 Approximate population mean E(X) = 0. Quantile function for Example 10. Example 10.34 (Quantile function associated with a distribution mprocedures for density functions An m.0004002 % Theoretical = 0 Sample variance vx = 0.t/3).35: Quantile function associated with a density function.^2).*(t<1) + (1.).
4/>.35 (Quantile function associated with a density func 10. .3622 Sample variance vx = 0.14: tion.org/content/m24315/1.) Let is a nonnegative.0) qsample Enter row matrix of values X Enter row matrix of probabilities PX Sample size n 1000 Sample average ex = 1. C > 0. a > 0.3242 Approximate population variance V(X) = 0.287 rand('seed'.352 Approximate population mean E(X) = 1. Use properties of the exponential and natural log function to show that FZ (v) = 1 − FX 4 This − ln (v/C) a for 0<v≤C (10. 0 < Z ≤ C.3474 Figure 10.). Quantile function for Example 10. 294. Then Z = g (X) = Ce−aX . absolutely continuous random variable.361 % Theoretical = 49/36 = 1.3474 % Theoretical = 0.58) content is available online at <http://cnx.1 Suppose where X (Solution on p.4 Problems on Functions of Random Variables4 Exercise 10.
g (t) = IM (t) [(p − r) t + (r − c) m] + IM (t) [(p − s) t + (s − c) m] = (p − s) t + (s − c) m + IM (t) (s − r) (t − m) Suppose (10.288 CHAPTER 10. Exercise 10.) X∼ λ/a exponential (λ). 294. Use the result of Exercise 10. m].62) Z = g (X) described by g (X) = 200 + 18IM 1 (X) (X − 10) + (16 − 18) IM 2 (X) (X − 20) + . If demand exceeds the number originally ordered.) Present value of future costs. 294. a = 0.2 Use the result of Exercise 10. 500. $13 each If the number of purchasers is a random variable X.) A merchant is planning for the Christmas season. eax at the end of x a.2 to determine the probability b.60) (10. 295. Then one dollar in hand now. compounded years. $15 each 51100.1 to show that if (Solution on p. Determine P (500 ≤ Z ≤ 1100).2: (Solution on p. Z ≤ 700.4 Optimal stocking of merchandise. has a value spent x years in the future has a present value e−ax . The agreement is that the organization will purchase ten tickets at $20 each (regardless of the number of individual buyers). they may be returned with recovery of r (µ). 294. Hence.59) (Solution on p. $16 each 3150.. Units are sold at a price p per unit. to stock (Solution on p. Additional tickets are available according to the following schedule: • • • • 1120. then FZ (v) = Exercise 10. g (t) = (p − c) t − (c − r) (m − t) = (p − r) t + (r − c) m t > m. Suppose λ = 1/10.000 approximation points. Experience indicates demand can be represented by a random variable D ∼ Poisson season. Suppose money may be invested at an annual rate continually. then t ≤ m. one dollar to failure (in years) X ∼ exponential the present value of the replacement (λ). Exercise 10. g (t) = (p − c) m + (t − m) (p − s) = (p − s) t + (s − c) m Then M = (−∞. If Z = g (D) • • Let For For is the gain from the sales. the total cost (in dollars) is a random quantity (10. m units of a certain item at a cost of c He intends per unit. $18 each 2130.07. extra units may be ordered at a cost of s each. 200. and C = $1000.5 (See Example 2 (Example 10. If units remain in stock at the end of the per unit. Use a discrete approximation for the exponential density to approximate the probabilities in part (a). Truncate X at 1000 and use 10. Suppose a device put into operation has time a. Approximate the Poisson random variable D by truncating at 100. FUNCTIONS OF RANDOM VARIABLES Exercise 10.61) µ = 50 m = 50 c = 30 p = 50 r = 20 s = 40. If the cost of replacement at failure is C dollars.3 v C 0<v≤C (10. then −aX is Z = Ce .) Price breaks) from "Functions of a Random Variable") The cultural committee of a student organization has arranged a special deal for tickets to a concert.
3 − 0.7 pair (See Exercise 2 (Exercise 9.8 (Solution on p.3 2.64) Determine Suppose X ∼ Poisson (75). P (Z ≥ 1300).0323 0.0437 P = 0. Y } has the joint distribution (in mle npr09_02.0350 0.0242 0.0342 0.0713 0.0868 Determine 0.0209 (10.6 (Solution on p. (Solution on p.0341 0. 295.37: npr08_06)): X = [−2.9 − 1.7 1. P (Z < 0) and P (−5 < Z ≤ 300).66) P (max{X.0483 0. P (X − Y  > 3).0551 0.1 3.) (See Exercise 7 (Exercise 8. M 3 = [30.7) from "Problems on Random Vectors and Joint Distributions".m (Section 17.5 2 8 4.0399 0.5 4.2) from "Problems on Independent Classes of Random Variables") The {X.0667 Determine Determine 0. (10. P X 2 + Y 2 ≤ 10 Exercise 10.69) .0682 0.0744 0. ∞) .0580 Let Y = [1.0672 . ∞) .0398 0.3) from "Problems on Independent Classes of Random Variables") The pair {X.0308 P ({X + Y ≥ 5} ∪ {Y ≤ 2}).0441 (10.65) 0.m (Section 17.289 (15 − 16) IM 3 (X) (X − 30) + (13 − 15) IM 4 (X) (X − 50) where (10.6) from "Problems on Random Vectors and Joint Distributions".0589 0.0304 0.0357 0. ∞) (10.0651 0.) (See Exercise 6 (Exercise 8. Y } ≤ 4) . Z = 3X 3 + 3X 2 Y − Y 3 . and P (900 ≤ Z ≤ 1400). Exercise 10.8.0961 P = 0. M 2 = [20.1] 0.0493 0.) Exercise 10.0589 (10.63) M 1 = [10. Y } has the joint distribution (in mle npr08_07.8.0498 0.1 5.67) 0.0448 Y = [−2 1 2. 295.0527 0.m (Section 17. Y = u) (10.0556 0. Approximate the Poisson distribution by truncating at 150.1) from "Problems on Independent Classes of Random Variables")) The pair {X.0504 0. and Exercise 3 (Exercise 9.0609 0. and Exercise 1 (Exercise 9.8. ∞) .7 1.68) 0.0620 0.3] 0. M 4 = [50.0361 0.41: npr09_02)): X = [−3.38: npr08_07)): P (X = t.0380 0. Y } has the joint distribution (in mle npr08_06.0399 0.1] 0.9 5. P (Z ≥ 1000) .0528 0.0420 0. 296.0456 0.1] 0.6 5.
0324 0.1 0.0132 3. Z = g (X.0396 0 0. For the distributions in Exercises 1015 below a.71) P (Z ≤ 2) . Determine and plot Exercise 10.) in Exercise 10.0216 0.1320 0.0405 0.10 For the pair (Solution on p. let {X.0891 0.4 0.290 CHAPTER 10.0077 Table 10.' Exercise 10.0090 0.0726 2.0440 0.) fXY (t.5 0. 0 ≤ u ≤ 1 + t (see Exercise 15 (Exercise 8.9 0. b. Y } the distribution function for Z. let {X.70) Determine and plot the distribution function for W.0203 0. Z = I[0. u) = 3 88 2t + 3u2 for 0 ≤ t ≤ 2.2 Determine P X 2 − 3X ≤ 0 .0510 0.0231 0. Determine analytically the indicated probabilities.11 (Solution on p. 296. P X 3 − 3Y  < 3 .) in Exercise 10.0297 0 4. Y ) X + IM c (X. 296.0 3. 296. Y ) 2Y (10.0594 0.7 0.0363 0. Y } W = g (X.0495 0. Y ) = { X 2Y for for X +Y ≤4 X +Y >4 = IM (X. FUNCTIONS OF RANDOM VARIABLES t = u = 7.8 3.15) from "Problems on Random Vectors and Joint Distributions").1089 0.9 For the pair (Solution on p.0528 0.2] (X) (X + Y ) Determine (10.0189 0.2 0.8.0484 1.1 2. Use a discrete approximation to calculate the same probablities. Y ) = 3X 2 + 2XY − Y 2 . Exercise 10.1] (X) 4X + I(1.5 4.8.
u) ≤ 1} Determine (10. Y ) (X + Y ) + IM c (X.291 Figure 10. 3 23 (10. Determine 1 Z = IM (X. 297.18) from "Problems on Random Vectors and Joint Distributions").15 Exercise 10. 2 − t} (see Exercise 17 (Exercise 8. Z = IM (X.) for fXY (t. Y ) Y 2 .13 (Solution on p. . M = {(t. u) : u > t} 2 P (Z ≤ 1/4). t} (see Exercise 18 (Exercise 8. u) : max (t. 297.17) from "Problems on Random Vectors and Joint Distributions"). 0 ≤ u ≤ max{2 − t. Y ) 2Y.73) P (Z ≤ 1). Y ) X + IM c (X. u) = 24 11 tu 0 ≤ t ≤ 2.72) Exercise 10. u) = (t + 2u) 0 ≤ t ≤ 2.) for fXY (t. M = {(t.12 (Solution on p. 0 ≤ u ≤ min{1.
) fXY (t. u) = (3t + 2tu). 3 − t} (see Exercise 19 (Exercise 8. Y ) Detemine M = {(t.16 Exercise 10. for 0 ≤ t ≤ 2. u) : u ≤ min (1. FUNCTIONS OF RANDOM VARIABLES Figure 10. X (see Exercise 20 (Exercise 8. u) = 12 179 3t2 + u . 0 ≤ u ≤ min{1 + t.19) from "Problems on Random Vectors and Joint Distributions"). 298.15 fXY (t. Z = IM (X. Y ) 2Y 2 . (Solution on p. .74) P (Z ≤ 2). 298. 2 − t)} (10. Y ) (X + Y ) + IM c (X. 2} Y .292 CHAPTER 10.20) from "Problems on Random Variables and Joint Distributions") Z = IM (X. 0 ≤ u ≤ min{2. u) : t ≤ 1. Determine M = {(t.14 (Solution on p. u ≥ 1} (10.75) P (Z ≤ 1). Y ) X + IM c (X. for 0 ≤ t ≤ 2.) 12 227 Exercise 10.
299.13 5. 000.22 0.76) {D. 299. Compare the relative frequency for each value with the probability that value is taken on.375 0. Z} is independent.025 0. b. Take a random sample of size n = 10.5 1.77) Z has distribution Value Probability 1.32 P (E) = 0.293 Figure 10.) probabilities are (in the usual order) {X. Minterm 0.12 1.15 0.40 (10.2 0. The class (10.1 − 0.) has distribution X = [−3. X = −2IA + IB + 3IC . Y.4 3.08 Table 10.3 0.012 0.9] P X = [0.3 Determine P X 2 + 3XY 2 > 3Z .4 0.018 Y = ID + 3IE + IF − 3.07] (10.17 Exercise 10. Plot the distribution function FX and the quantile function QX . Exercise 10.045 0.24 2. E.56 P (F ) = 0.8 0.17 The simple random variable X (Solution on p.12 0. .7 4.7 0.33 0.16 The class (Solution on p.162 0.2 2.78) a.255 0.108 0. F } is independent with P (D) = 0.43 3.11 0.
288) λ · ln (v/C) a = v C λ/a (10. M = (500<=G)&(G<=1100).1*exp(t/10) Use row matrices X and PX as in the simple case G = 1000*exp(0. so that FZ (v) = P (Z ≤ v) = P (X ≥ −ln (v/C) /a) = 1 − FX Solution to Exercise 10.1 (p.D). c = 30.80) P (Z ≤ v) = v 1000 10/7 (10. s = 40.79) FZ (v) = 1 − 1 − exp − Solution to Exercise 10.PZ] = csort(G.294 CHAPTER 10.1003 tappr Enter matrix [a b] of xrange endpoints [0 1000] Enter number of x approximation points 10000 Enter density as a function of t 0.81) v = [700 500 200]. m = 50.3 (p.s)*D + (s . 288) mu = 50.3716 PM3 = (G<=200)*PX' PM3 = 0. PM = M*PD' PM = 0. 287) Z = Ce−aX ≤ v i e−aX ≤ v/C i −aX ≤ ln (v/C) i X ≥ −ln (v/C) /a.2 (p.r)*(D .PD). PD = ipoisson(mu.07*t). PM1 = (G<=700)*PX' PM1 = 0.3715 0.9209 [Z. FUNCTIONS OF RANDOM VARIABLES Solutions to Exercises in Chapter 10 Solution to Exercise 10.^(10/7) P = 0.9209 % Alternate: use dbn for Z . 287) − ln (v/C) a (10.4 (p. p = 50. r = 20.m).c)*m +(s .*(D <= m).1003 Solution to Exercise 10. m = (500<=Z)&(Z<=1100). P = (v/1000). D = 0:100. G = (p . pm = m*PZ' pm = 0.6008 0.6005 PM2 = (G<=500)*PX' PM2 = 0.
% Alternate: use dbn for Z p1 = (Z>=1000)*PZ' p1 = 0.3713 Solution to Exercise 10. PX. PX = ipoisson(75.41: npr09_02) Data are in X.. P jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.*P) P3 = 0.295 Solution to Exercise 10.*P) P4 = 0. P3 = total((G<0).1142 P3 = ((900<=G)&(G<=1400))*PX' P3 = 0.16)*(X. % Alternate: use dbn for Z p4 = ((5<Z)&(Z<=300))*PZ' p4 = 0.PX).4860 P2 = total((abs(tu)>3). t.*P) P2 = 0.20).9742 [Z. P jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.. (15 . P1 = (G>=1000)*PX' P1 = 0.37: npr08_06) Data are in X. PY.7 (p. t. P1 = total(M1. PX.PZ] = csort(G.X). .8.*(X>=30) + (13 .PZ] = csort(G.*(X>=50). Y.8.5420 P4 = total(((5<G)&(G<=300)).50).^3.5 (p. and P P1 = total((max(t. Y.15)*(X .18)*(X .10).*P) P1 = 0.30). 289) npr08_06 (Section~17.^3 + 3*t. 288) X = 0:150.*P) P1 = 0.7054 M2 = t.^2.u.^2 <= 10. u. u.*(X>=20) + .6 (p.P). G = 200 + 18*(X .u)<=4).9288 Solution to Exercise 10. and P M1 = (t+u>=5)(u<=2).3713 [Z. PY. Y.*u .9288 P2 = (G>=1300)*PX' P2 = 0. Y. 289) npr09_02 (Section~17.^2 + u.4516 G = 3*t.*(X>=10) + (16 .
% Obtain dbn for Z = g(X. P jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.P).P).*P) P2 = 0. 0 ≤ u ≤ 1 + t} (10. 290) G = 3*t.11 (p.u. [W. FUNCTIONS OF RANDOM VARIABLES P2 = total(M2. 290) H = t.PW] = csort(H.Y) ddbn % Call for plotting mprocedure Enter row matrix of VALUES Z Enter row matrix of PROBABILITIES PZ % Plot not reproduced here Solution to Exercise 10. Y.85) tuappr Enter matrix [a b] of Xrange endpoints Enter matrix [c d] of Yrange endpoints [0 2] [0 3] .3*t <=0.8. P1 = total(M1.9 (p. 0 ≤ u ≤ 1 + t} Q1 = {(t.38: npr08_07) Data are in X.10 (p.^2.^2 + 2*t.*(t+u<=4) + 2*u. where M 1 = {(t. u) : 1 < t ≤ 2. PX.296 CHAPTER 10.82) M 2 = {(t.8 (p.*P) P2 = 0. u) : 0 ≤ t ≤ 1/2}. and P M1 = t.^3 .4500 M2 = t. ddbn Enter row matrix of VALUES W Enter row matrix of PROBABILITIES PW Solution to Exercise 10. t. u. 289) npr08_07 (Section~17. 290) % Plot not reproduced here P (Z ≤ 2) = P Z ∈ Q = Q1M 1 Q2M 2 .^2 .*P) P1 = 0.*u . % Determine g(X. u) : u ≤ 2 − t} 3 88 1/2 0 0 1+t (see gure) (10.3*abs(u) < 3.84) P = 2t + 3u2 dudt + 3 88 2 1 0 2−t 2t + 3u2 dudt = 563 5632 (10. P2 = total(M2. Q2 = {(t.7876 Solution to Exercise 10.3282 Solution to Exercise 10.*(t+u>4).83) (10.Y) [Z. PY. Y.PZ] = csort(G. u) : 0 ≤ t ≤ 1.
291) P (Z ≤ 1) = P (X.*(t>1).PZ] = csort(G. Y ) ∈ M1 Q1 M2 Q2 . [Z.*(t<=1) + (t+u). PX. PY. u) : 0 ≤ t ≤ u ≤ 1} (10. [Z.91) (10.88) P = tu dudt + 24 11 3/2 1/2 0 1/2 tu dudt + 24 11 2 3/2 0 2−t tu dudt = 85 176 (10.5*t. t.*(u>t) + u.*(u<=1+t) Use array operations on X. Y ) ∈ M1 Q1 M2 Q2 . u) : 1 ≤ t ≤ 2.2t)) Use array operations on X. Y.87) (10.4830 Solution to Exercise 10. 0 ≤ u ≤ 1 − t} (10.92) P = (t + 2u) dudt + 3 23 2 1 0 1/2 (t + 2u) dudt = 9 46 (10.1010 % Theoretical = 563/5632 = 0. u) : 0 ≤ t ≤ 2. M1 = {(t.P). u.1000 Solution to Exercise 10.*u. PZ2 = (Z<=2)*PZ' PZ2 = 0.93) tuappr Enter matrix Enter matrix Enter number Enter number [a [c of of b] of Xrange endpoints [0 2] d] of Yrange endpoints [0 2] X approximation points 300 Y approximation points 300 . 0 ≤ t ≤ min (t. and P G = 4*t. u) : u ≤ 1/2} 3 23 1 0 0 1−t (see gure) (10. 0 ≤ u ≤ t} Q1 = {(t. u) : u ≤ 1/2} 24 11 1/2 0 0 1 (see gure) (10. PY. pp = (Z<=1/4)*PZ' pp = 0.4844 % Theoretical = 85/176 = 0. u) : u ≤ 1 − t} Q2 = {(t. 2 − t)} Q1 = {(t. u) : t ≤ 1/2} Q2 = {(t. 291) P (Z ≤ 1/4) = P (X.PZ] = csort(G.^2).*(u<t).*(u<=min(1.P). u) : 0 ≤ t ≤ 1.297 Enter number of X approximation points 200 Enter number of Y approximation points 300 Enter expression for joint density (3/88)*(2*t + 3*u. PX.90) M2 = {(t.86) M2 = {(t. M1 = {(t. u.12 (p. Y.^2. and P G = 0. t.89) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density (24/11)*t.13 (p.
97) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 300 Enter number of Y approximation points 300 Enter expression for joint density (12/179)*(3*t.*P) p = 0.100) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 400 Enter number of Y approximation points 400 Enter expression for joint density (12/227)*(3*t+2*t. u) : u ≤ 1/2} P = 12 179 1 0 0 2−t (see gure) (10. G = M. u) : 0 ≤ t ≤ 1} Q2 = {(t.2)) Use array operations on X.14 (p. 0 ≤ u ≤ 1} M3 = {(t.*(t<=1) + (t>1).*P) P = 0. 292) P (Z ≤ 2) = P (. M1 = {(t.*u. and P M = max(t.*(u<=t). M2 = M c (see gure) (10. Y ) ∈ M1 Q1 M2 Q2 .15 (p.*(t + u) + (1 . Y. 1 ≤ u ≤ 2} (10.^2. PY. PX. u) : u ≤ 1 − t} Q2 = {(t.96) 3t2 + u dudt + 12 179 2 1 0 1 3t2 + u dudt = 119 179 (10.1957 Solution to Exercise 10. and P M = (t<=1)&(u>=1).98) Q1 = {(t. u.^2.6648 Solution to Exercise 10.*u. PY. M1 = M.M)*2. Y ) ∈ M1 Q1 M2 M3 Q2 . PX.*u). p = total((G<=1).*(u>=2t).298 CHAPTER 10. PX.t)) Use array operations on X. u) : 0 ≤ t ≤ 1.*(t + u) + (1 .94) M2 = {(t. Y.5463 .*(t + u) + (1 . Y. p = total((G<=2). G = M.3t)) Use array operations on X. PY.*(u<=min(1+t. u) : 1 ≤ t ≤ 2.^2 + u).*u. 0 ≤ u ≤ 3 − t} Q1 = {(t.M)*2. 292) P (Z ≤ 1) = P (X.*P) p = 0.M)*2. u) : u ≤ t} P = 12 227 1 0 0 1 (10.*(u<=max(2t. t.5478 % Theoretical = 124/227 = 0. Z = M.95) (10. and P Q = (u<=1). P = total(Q. t. u.99) (3t + 2tu) dudt + 12 227 2 1 t (3t + 2tu) dudt = 2−t 124 227 (10.6662 % Theoretical = 119/179 = 0. u.1960 % Theoretical = 9/46 = 0.*(u<=min(2. FUNCTIONS OF RANDOM VARIABLES Enter expression for joint density (3/23)*(t + 2*u). t. u) : 0 ≤ t ≤ 1.u) <= 1.
3340 2. Z. pmy = minprob(0. PX. 293) X = [3.pmx).0700 0.4000 0.859 .4 5. pmy.2 2. Z = [1.299 Solution to Exercise 10.17 (p. Z.1000 0.^2 > 3*v.*P) PM = 0. ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % Plot not reproduced here dquanplot Enter VALUES for X X Enter PROBABILITIES for X PX % Plot not reproduced here rand('seed'.1070 4.5 1.16 (p. pmx.2164 1.m (Section~17. PX = 0.8792 Population mean E[X] = 0.1184 3.0) % Reset random number generator dsample % for comparison purposes Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX Sample size n 10000 Value Prob Rel freq 3. cy. cy = [1 3 1 3].3587 Solution to Exercise 10.PY] = canonicf(cy.2000 0.7 4. PZ [X. pmx = 0.8.01*[15 22 33 12 11 7].2 2.3300 0. Y.PX] = canonicf(cx.1100 0.1490 0.1200 0.5000 0. PY. v.*u.1 0.42: npr10_16) Data for Exercise~10. PZ.7000 0.01*[32 56 40]). PM = total(M. 293) % file npr10_16.8]. and P M = t. pmx. icalc3 Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter row matrix of Zvalues Z Enter X probabilities PX Enter Y probabilities PY Enter Z probabilities PZ Use array operations on matrices X.9000 0. cy. Z. PZ = 0. t.001*[255 25 375 45 108 12 162 18]. u.^2 + 3*t.0752 Sample average ex = 0.16 cx = [2 1 3 0].7 3.2200 0.9]. [Y.3 1.pmy).1500 0.4 3. pmy. PZ') npr10_16 % Call for data Data are in cx.01*[12 24 43 13 8]. disp('Data are in cx.
112 .146 Population variance Var[X] = 5. FUNCTIONS OF RANDOM VARIABLES Sample variance vx = 5.300 CHAPTER 10.
identication of this relationship provides access to a rich and powerful set of properties which have far reaching consequences in both application and theory.1. The fi must add to n. which lies beyond the scope of this study. 5. 5.2 · 5 + 0. (11. three of the ten. has the value 1.1 · 1 + 0. · · · xn } to be averaged. then extend the denition and properties to the general case. we note the relationship of mathematical expectation to the Lebesque integral. 8. Although we do not develop this theory.1 · 4 + 0.2 Expectation for simple random variables The notion of mathematical expectation is closely related to the idea of a weighted mean. numbers have value t1 . 1/10 has the value 4. with m ≤ n distinct values have value t2 .1. One of the ten. this idea of likelihood is rooted X takes a value in a set M of real numbers is interpreted as the in the intuitive notion that if the experiment is repeated enough times the probability is approximately the fraction of times the value of the values taken on. t2 .5/>. Consider the arithmetic average 2. content is available online at <http://cnx.2) Multiply each possible value by the fraction The fractions are often referred to as the relative frequencies. Historically. 4. tm }.3 · 2 + 0. 8. 8. have the value 2. x2 . or the fraction 3/10 of them.1 Mathematical Expectation: Simple Random Variables1 11. used extensively in the handling of numerical data.Chapter 11 Mathematical Expectation 11. · · · . we could write x = (0. which is developed in abstract measure theory. 11. 1 This A sum of this sort is known as a In general. fm have value tm .3 · 8) The pattern in this last expression can be stated in words: of the numbers having that value and then sum these products. x= 1 (1 + 2 + 2 + 2 + 4 + 5 + 5 + 8 + 8 + 8) 10 (11. 2/10 have the value 5.org/content/m23387/1. Thus. In the process. {t1 . · · · . 2. f2 {x1 . X will fall in M. which is given by x of the following ten numbers: 1. 301 . 2. and 3/10 have the value 8.1) Examination of the ten numbers to be added shows that ve distinct values are included. suppose there are Suppose f1 n weighted average. We incorporate the concept of as an appropriate form of such averages.1 Introduction The probability that real random variable likelihood that the observed value X (ω) on any trial will lie in M. Associated with this interpretation is the notion of the average of mathematical expectation into the mathematical model We begin by studying the mathematical expectation of simple random variables. or the fraction 1/10 of them.
then the fraction pi is called the relative frequency of those ti .4) Note that the expectation is determined by the distribution. In symbols Denition. If mean. hence the same expectation. · · · . For 3. this average has been called the the mean value. or Example 11. is the probability weighted average of the values taken on by X. a constant c. Traditionally. so that E [c] = cP (Ω) = c. X = cIΩ .3) In probability theory.302 CHAPTER 11.1: Moment of a probability distribution about the origin. X with values {t1 . The average x of the n numbers may be written x= 1 n n m numbers in the set which If we set have the value xi = i=1 j=1 tj pj (11. 1 ≤ i ≤ m. MATHEMATICAL EXPECTATION pi = fi /n.1: Some special cases X = aIE = 0IE c + aIE . Two quite dierent random variables may have the same distribution. so that X n n E [aX] = i=1 ati P (Ai ) = a i=1 ti P (Ai ) = aE [X] (11. designated E [X]. . For a simple random variable the n n E [X] = i=1 ti P (X = ti ) = i=1 ti pi (11. Since 2. we have E [aIE ] = aP (E). of the random variable X. t2 . pi = P (X = ti ). n n X = i=1 ti IAi then aX = i=1 ati IAi . tn } and corresponding probabilities mathematical expectation. 1.5) Figure 11. we have a similar averaging process in which the relative frequencies of the various possible values of are replaced by the probabilities that those values are observed on any trial.
00 · 0. this moment is known to be the same as the product of the Example 11.7) Although the calculation is very simple in this case. The prices are $2. as shown in Figure 1. with is the Mechanically.50.303 Mechanical interpretation In order to aid in visualizing an essentially abstract system.5 3 3. respectively.00. and $3.00.8500 = X*PX' % Another alternate = 2.3. weightless rod with point mass pi attached at each value ti of X. Example 11. we have employed the notion of probability as mass.*PX) % An alternate calculation = 2.8500 . pi at The expectation is ti pi i Now (11. and 0. From physical theory. We suppose each number is equally likely.50.3 = 2.50 · 0.2.2. What is the expected cost of her selection? E [X] = 2. then the mean value µX is the point of balance. % Matrix of probabilities = dot(X.8500 = sum(X. Suppose the random variable X has values {ti : 1 ≤ i ≤ n}. the mathematical expectation is determined as the dot product of the value matrix with the probability matrix. third or fourth.85 (11. We utilize the mass distribution to give an important and helpful mechanical interpretation of the expectation or mean value. Example 11. it is really not necessary. second. of the probability mass distribution about the total mass times the number which locates the center of mass. with respective probabilities 0. with P (X = ti ) = pi . This is easily calculated using MATLAB.2: The number of spots on a die Let X be the number of spots which turn up on a throw of a simple sixsided die.4: MATLAB calculation for Example 3 PX EX EX Ex Ex ex ex X = [2 2. the sum of the products origin on the real line. This produces a probability mass distribution.3: A simple choice A child is told she may have one of four toys. Since the total mass is one.00 · 0.8) For a simple random variable. % Matrix of values (ordered) = 0. Thus the values are the integers one through six.3 + 3.2 + 3. ti pi moment pi to the left of the origin i ti is negative. 0. Often there are symmetries in the distribution which make it possible to determine the expectation without detailed calculation. The distribution induced by a real random variable on the line is visualized as a unit of probability mass actually distributed along the line. $2. we give an alternate interpretation in terms of meansquare estimation.6) ti  is the distance of point mass pi from the origin.50 · 0. The probability distribution places equal mass at each of the integer values one through six.16: Alternate interpretation of the mean value) in "Mathematical Expectation: General Random Variables".PX) % The usual MATLAB operation = 2. with point mass concentration in the amount of the point ti . $3.1*[2 2 3 3]. the is the mean value location of the center of mass. By denition E [X] = 1 1 1 1 1 1 7 1 · 1 + · 2 + · 3 + · 4 + · 5 + · 6 = (1 + 2 + 3 + 4 + 5 + 6) = 6 6 6 6 6 6 6 2 (11.2 + 2.5]. The center of mass is at the midpoint.3 of choosing the rst. In Example 6 (Example 11. 0. and each probability is 1/6. She choses one. If the real line is viewed as a sti.
identify the index set Ji of those j such that cj = ti .9) We wish to ease this restriction to canonical form.13) P (A1 ) = P (C1 ) + P (C3 ) . {C1 .19) .10) m X= j=1 We show that cj ICj . or 3. MATHEMATICAL EXPECTATION X canonical form. A2 = {X = 2} = C2 C5 C6 and A3 = {X = 3} = C4 (11. Then the terms cj ICj = ti Ji By the additivity of probability m ICj = ti IAi .12) Inspection shows the distinct possible values of X to be 1. C2 . Suppose simple random variable X is in a primitive form {Cj : 1 ≤ j ≤ m} is a partition (11. we begin with an example which exhibits the essential pattern. where Ai = {X = ti }.11) Before a formal verication. we have n n n m E [X] = i=1 ti P (Ai ) = i=1 ti j∈Ji P (Cj ) = i=1 j∈Ji cj P (Cj ) = j=1 cj P (Cj ) (11.5: Simple random variable X in primitive form with X = IC1 + 2IC2 + IC3 + 3IC4 + 2IC5 + 2IC6 .18) j ∈ Ji we have cj = ti .16) X = j=1 cj ICj . A1 = {X = 1} = C1 so that C3 .17) P (Ai ) = P (X = ti ) = j∈Ji Since for each P (Cj ) (11. E [X] = i=1 ti P (Ai ) (11. consider (11. in which case n implies Expectation and primitive form The denition and treatment above assumes is in n X= i=1 ti IAi . Also. Establishing Example 11.304 CHAPTER 11. Suppose these are t1 < t2 < · · · < tn . C5 . C4 . C6 } a partition (11. P (A2 ) = P (C2 ) + P (C5 ) + P (C6 ) . the general case is simply a matter of appropriate use of notation. where m E [X] = j=1 cj P (Cj ) (11. 2. We identify the distinct set of values contained in the set {cj : 1 ≤ j ≤ m}.15) (11. C3 . For any value ti in the range. Now and P (A3 ) = P (C4 ) (11.14) E [X] = P (A1 ) + 2P (A2 ) + 3P (A3 ) = P (C1 ) + P (C3 ) + 2 [P (C2 ) + P (C5 ) + P (C6 )] + 3P (C4 ) = P (C1 ) + 2P (C2 ) + P (C3 ) + 3P (C4 ) + 2P (C5 ) + 2P (C6 ) To establish the general pattern. Ji where Ai = j∈Ji Cj (11.
06. % Determination of dbn for X disp([X.PX] = csort(c.07.2000 Ex = X*PX' % E[X] from distribution Ex = 1. 0.13.7800 Linearity The result on primitive forms may be used to establish the linearity of mathematical expectation for simple random variables. we work through the verication in some detail.23) j=1 IAi IBj = IAi Bj and forms a partition.4200 2. 0.14.08. % Matrix of probabilities EX = c*pc' EX = 1. 0.3800 3.6: Alternate determinations of Suppose X E [X] in a primitive form is X = IC1 + 2IC2 + IC3 + 3IC4 + 2IC5 + 2IC6 + IC7 + 3IC8 + 2IC9 + IC10 with respective probabilities (11.12. 0. % Matrix of coefficients pc = 0.08.pc). 0. j) Z = X + Y in a primitive form.05.0000 0. 0.21) c = [1 2 1 3 2 2 1 3 2 1]. Y = uj }. X from the coecients and probabilities of the primitive form. Because of its fundamental importance. Since m IAi = i=1 we have IBj = 1 j=1 (11.16 (11.7800 % Direct solution [X. the dening expression for expectation thus holds for X in a primitive form.11.0000 0.305 Thus. Suppose X= n i=1 ti IAi and Y = m j=1 uj IBj n (both in canonical form).22) n ti IAi m IBj + m n n m X +Y = i=1 Note that u j I Bj j=1 i=1 IAi = i=1 j=1 (ti + uj ) IAi IBj (11. we have Ai Bj = {X = ti . 0. An alternate approach to obtaining the expectation from a primitive form is to use the csort operation to determine the distribution of Example 11. above. 0.25) . The class of these sets for all possible pairs (i. Thus. 0.01*[8 11 6 13 5 8 12 7 14 16]. the last summation expresses on primitive forms.20) P (Ci ) = 0. Because of the result n m n m n m E [X + Y ] = i=1 j=1 (ti + uj ) P (Ai Bj ) = i=1 j=1 n m m ti P (Ai Bj ) + i=1 j=1 n uj P (Ai Bj ) (11.PX]') 1.24) = i=1 ti j=1 P (Ai Bj ) + j=1 uj i=1 P (Ai Bj ) (11.0000 0.
28) X. so that with the aid of Example 11. It follows that E [aX + bY + cZ] = E [aX + bY ] + cE [Z] = aE [X] + bE [Y ] + cE [Z] simple random variables. The expectation of a linear combination of a nite number of simple random variables is that linear combination of the expectations of the individual random variables. (11.26) n m E [X + Y ] = i=1 Now have ti P (Ai ) + j=1 uj P (Bj ) = E [X] + E [Y ] (11.33) Some algebraic tricks may be used to show that the second form sums to of that. the dening expression holds for any ane combination of indicator functions.32) n E [X] = k=0 kC (n. MATHEMATICAL EXPECTATION i and for each We note that for each j m n P (Ai ) = j=1 Hence. k) pk q n−k .31) n X= k=0 so that kIAkn .7: Binomial distribution (n. then n n E [X] = E c0 + i=1 ci IEi = c0 + i=1 ci P (Ei ) (11. Y. p) This random variable appears as the number of successes in n Bernoulli trials with probability p of n success on each component trial. Example 11. but there is no need . q =1−p (11.27) aX and bY are simple if X and Y are.30) Thus. and cZ . this pattern may be extended to a linear combination of any nite number of Linearity.1 (Some special cases) we E [aX + bY ] = E [aX] + E [bY ] = aE [X] + bE [Y ] If (11. It is naturally expressed in ane form n X= i=1 Alternately. with pk = P (Akn ) = P (X = k) = C (n.306 CHAPTER 11. q =1−p np. we may write P (Ai Bj ) and P (Bj ) = i=1 P (Ai Bj ) (11. Thus we may assert (11. Expectation of a simple random variable in ane form As a direct consequence of linearity. whether in canonical form or not. whenever simple random variable X is in ane form. k) pk q n−k . The computation for the ane form is much simpler. then so are aX + bY . Z are simple.29) By an inductive argument. in canonical form IEi so that E [X] = i=1 p = np (11.
307
Example 11.8: Expected winnings
A bettor places three bets at $2.00 each. The rst bet pays $10.00 with probability 0.15, the second pays $8.00 with probability 0.20, and the third pays $20.00 with probability 0.10. What is the expected gain? SOLUTION The net gain may be expressed
X = 10IA + 8IB + 20IC − 6,
Then
with
P (A) = 0.15, P (B) = 0.20, P (C) = 0.10
(11.34)
E [X] = 10 · 0.15 + 8 · 0.20 + 20 · 0.10 − 6 = −0.90
These calculations may be done in MATLAB as follows:
(11.35)
c = [10 8 20 6]; p = [0.15 0.20 0.10 1.00]; % Constant a = aI_(Omega), with P(Omega) = 1 E = c*p' E = 0.9000
Functions of simple random variables
If then
X
is in a primitive form (including canonical form) and
g
is a real function dened on the range of
X,
m
Z = g (X) =
j=1
so that
g (cj ) ICj
a primitive form
(11.36)
m
E [Z] = E [g (X)] =
j=1
g (cj ) P (Cj )
(11.37)
Alternately, we may use csort to determine the distribution for
Caution.
If
X
Z
and work with that distribution.
is in ane form (but not a primitive form)
m
m
X = c0 +
j=1
so that
cj IEj
then
g (X) = g (c0 ) +
j=1
g (cj ) IEj
(11.38)
m
E [g (X)] = g (c0 ) +
j=1
g (cj ) P (Ej )
(11.39)
Example 11.9: Expectation of a function of
Suppose
X
X
(11.40)
in a primitive form is
X = −3IC1 − IC2 + 2IC3 − 3IC4 + 4IC5 − IC6 + IC7 + 2IC8 + 3IC9 + 2IC10
with probabilities Let
P (Ci ) = 0.08, 0.11, 0.06, 0.13, 0.05, 0.08, 0.12, 0.07, 0.14, 0.16. g (t) = t2 + 2t. Determine E [g (X)].
308
CHAPTER 11. MATHEMATICAL EXPECTATION
% Original coefficients % Probabilities for C_j % g(c_j)
c = [3 1 2 3 4 1 1 2 3 2]; pc = 0.01*[8 11 6 13 5 8 12 7 14 16]; G = c.^2 + 2*c G = 3 1 8 3 24 1 3 8 15 EG = G*pc' % EG = 6.4200 [Z,PZ] = csort(G,pc); % disp([Z;PZ]') % 1.0000 0.1900 3.0000 0.3300 8.0000 0.2900 15.0000 0.1400 24.0000 0.0500 EZ = Z*PZ' % EZ = 6.4200
distribution is available. Suppose
8 Direct computation
Distribution for Z = g(X) Optional display
E[Z] from distribution for Z
A similar approach can be made to a function of a pair of simple random variables, provided the joint
X=
n i=1 ti IAi
and
Y =
m
m j=1
u j I Bj
(both in canonical form). Then
n
Z = g (X, Y ) =
i=1 j=1
The
g (ti , uj ) IAi Bj
(11.41)
Ai Bj
form a partition, so
Z
is in a primitive form. We have the same two alternative possibilities: (1)
direct calculation from values of
g (ti , uj )
and corresponding probabilities
or (2) use of csort to obtain the distribution for
Z.
P (Ai Bj ) = P (X = ti , Y = uj ),
Example 11.10: Expectation for
calculations, we use jcalc.
Z = g (X, Y ) g (t, u) = t2 + 2tu − 3u.
To set up for
We use the joint distribution in le jdemo1.m and let
% file jdemo1.m X = [2.37 1.93 0.47 0.11 0 0.57 1.22 2.15 2.97 3.74]; Y = [3.06 1.44 1.21 0.07 0.88 1.77 2.01 2.84]; P = 0.0001*[ 53 8 167 170 184 18 67 122 18 12; 11 13 143 221 241 153 87 125 122 185; 165 129 226 185 89 215 40 77 93 187; 165 163 205 64 60 66 118 239 67 201; 227 2 128 12 238 106 218 120 222 30; 93 93 22 179 175 186 221 65 129 4; 126 16 159 80 183 116 15 22 113 167; 198 101 101 154 158 58 220 230 228 211]; jdemo1 % Call for data jcalc % Set up Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X, Y, PX, PY, t, u, and P G = t.^2 + 2*t.*u  3*u; % Calculation of matrix of [g(t_i, u_j)] EG = total(G.*P) % Direct calculation of expectation EG = 3.2529 [Z,PZ] = csort(G,P); % Determination of distribution for Z EZ = Z*PZ' % E[Z] from distribution
309
EZ =
3.2529
11.2 Mathematical Expectation; General Random Variables2
In this unit, we extend the denition and properties of mathematical expectation to the general case. In the process, we note the relationship of mathematical expectation to the Lebesque integral, which is developed in abstract measure theory. Although we do not develop this theory, which lies beyond the scope of this
study, identication of this relationship provides access to a rich and powerful set of properties which have far reaching consequences in both application and theory.
11.2.1 Extension to the General Case
In the unit on Distribution Approximations (Section 7.2), we show that a bounded random variable can be expressed as the dierence
X
can be
represented as the limit of a nondecreasing sequence of simple random variables. Also, a real random variable
X = X+ − X−
of two nonnegative random variables.
The extension of
mathematical expectation to the general case is based on these facts and certain basic properties of simple random variables, some of which are established in the unit on expectation for simple random variables. We list these properties and sketch how the extension is accomplished.
Denition.
hold
almost surely,
A condition on a random variable or on a relationship between random variables is said to abbreviated a.s. i the condition or relationship holds for all
ω
except possibly a set
with probability zero.
Basic properties of simple random variables (E0) (E1) (E2) (E3) : : : :
If X = Y a.s. then E [X] = E [Y ]. E [aIE ] = aP (E). Linearity. X = n ai Xi implies E [X] = i=1
Positivity; monotonicity
a. If b. If
n i=1
ai E [Xi ]
X ≥ 0 a.s. , then E [X] ≥ 0, with equality i X = 0 a.s.. X ≥ Y a.s. , then E [X] ≥ E [Y ], with equality i X = Y a.s.
If X ≥ 0 is bounded and {Xn : 1 ≤ n} is an a.s. nonnegative, limXn (ω) ≥ X (ω) for almost every ω , then limE [Xn ] ≥ E [X]. nondecreasing
(E4) : (E4a):
Fundamental lemma
n
If for all
sequence with
n, 0 ≤ Xn ≤ Xn+1 a.s.
n
and
Xn → X a.s. ,
then
E [Xn ] → E [X]
(i.e., the expectation of
the limit is the limit of the expectations).
Ideas of the proofs of the fundamental properties
• • • •
Modifying the random variable without changing
X
on a set of probability zero simply modies one or more of the
Ai
P (Ai ).
Such a modication does not change
E [X].
Properties (E1) ("(E1) ", p. 309) and (E2) ("(E2) ", p. 309) are established in the unit on expectation of simple random variables.. Positivity (E3a) (p. 309) is a simple property of sums of real numbers. Modication of sets of probability zero cannot aect the expectation. Monotonicity (E3b) (p. 309) is a consequence of positivity and linearity.
X≥Y
i
X − Y ≥ 0 a.s.
and
E [X] ≥ E [Y ]
i
E [X] − E [Y ] = E [X − Y ] ≥ 0
(11.42)
2 This
content is available online at <http://cnx.org/content/m23412/1.5/>.
310
CHAPTER 11. MATHEMATICAL EXPECTATION
The fundamental lemma (E4) ("(E4) ", p. expectation. 309) plays an essential role in extending the concept of
•
It involves elementary, but somewhat sophisticated, use of linearity and monotonicity,
limited to nonnegative random variables and positive coecients. We forgo a proof.
•
Monotonicity and the fundamental lemma provide a very simple proof of the monotone convergence theoem, often designated MC. Its role is essential in the extension.
Nonnegative random variables
There is a nondecreasing sequence of nonnegative simple random variables converging to
X. Monotonicity
E [X] =
implies the integrals of the nondecreasing sequence is a nondecreasing sequence of real numbers, which must have a limit or increase without bound (in which case we say the limit is innite). We dene
limE [Xn ].
n
Two questions arise.
1. Is the limit unique? The approximating sequences for a simple random variable are not unique, although their limit is the same. 2. Is the denition consistent? If the limit random variable with the old?
X
is simple, does the new denition coincide
The fundamental lemma and monotone convergence may be used to show that the answer to both questions is armative, so that the denition is reasonable. Also, the six fundamental properties survive the passage to the limit. As a simple applications of these ideas, consider discrete random variables such as the geometric Poisson
(p)
or
(µ),
which are integervalued but unbounded.
Example 11.11: Unbounded, nonnegative, integervalued random variables
The random variable
X
may be expressed
∞
X=
k=0
Let
kIEk ,
where
Ek = {X = k}
with
P (Ek ) = pk
(11.43)
n−1
Xn =
k=0
Then each for all Now
kIEk + nIBn ,
where
Bn = {X ≥ n}
If
(11.44)
Xn is a simple random variable with Xn ≤ Xn+1 .
Hence,
n ≥ k + 1.
Xn (ω) → X (ω)
for all
ω.
By monotone convergence,
X (ω) = k , then Xn (ω) = k = X (ω) E [Xn ] → E [X].
n−1
E [Xn ] =
k=1
If
kP (Ek ) + nP (Bn )
(11.45)
∞ k=0
kP (Ek ) < ∞,
then
∞
∞
0 ≤ nP (Bn ) = n
k=n
Hence
P (Ek ) ≤
k=n
kP (Ek ) → 0
as
n→∞
(11.46)
∞
E [X] = limE [Xn ] =
n k=0
kP (Ak )
(11.47)
We may use this result to establish the expectation for the geometric and Poisson distributions.
311
Example 11.12:
We have
X ∼ geometric (p) pk = P (X = k) = q k p, 0 ≤ k .
∞
By the result of Example 11.11 (Unbounded, nonnegative,
integervalued random variables)
∞
E [X] =
k=0
For
kpq k = pq
k=1
so that
kq k−1 =
pq (1 − q)
2
= q/p
(11.48)
Y −1∼
geometric
(p), pk = pq k−1
E [Y ] = 1 E [X] = 1/p q
Example 11.13:
We have
X ∼ Poisson (µ)
k
pk =
e−µ µ k!
.
By the result of Example 11.11 (Unbounded, nonnegative, integervalued
random variables)
∞
E [X] = e−µ
k=0
k
µk = µe−µ k!
∞
k=1
µk−1 = µe−µ eµ = µ (k − 1)!
(11.49)
The general case
We make use of the fact that
X = X + − X −,
where both
X
+
and
X

are nonnegative. Then
E [X] = E X + − E X −
Denition.
theory. If both
provided at least one of are nite,
E X+
,
E X−
is nite
(11.50)
E [X + ]
and
E [X − ]
X
is said to be
integrable.
The term integrable comes from the relation of expectation to the abstract Lebesgue integral of measure
Again, the basic properties survive the extension. The property (E0) ("(E0) ", p. 309) is subsumed in a more general uniqueness property noted in the list of properties discussed below.
Theoretical note
The development of expectation sketched above is exactly the development of the Lebesgue integral of the random variable
X
as a measurable function on the basic probability space
(Ω, F, P ),
so that
E [X] =
Ω
X dP
(11.51)
As a consequence, we may utilize the properties of the general Lebesgue integral. is not particularly useful for actual calculations. real line by random variable an integral on the real line.
In its abstract form, it
X
A careful use of the mapping of probability mass to the
produces a corresponding mapping of the integral on the basic space to
Although this integral is also a Lebesgue integral it agrees with the ordinary
Riemann integral of calculus when the latter exists, so that ordinary integrals may be used to compute expectations.
Additional properties
The fundamental properties of simple random variables which survive the extension serve as the basis
of an extensive and powerful list of properties of expectation of real random variables and real functions of random vectors. Some of the more important of these are listed in the table in Appendix E (Section 17.5). We often refer to these properties by the numbers used in that table.
Some basic forms
The mapping theorems provide a number of basic integral (or summation) forms for computation.
1. In general, if integral.
Z = g (X)
with distribution functions
FX
and
FZ , we have the expectation as a Stieltjes
uFZ (du)
(11.52)
E [Z] = E [g (X)] =
g (t) FX (dt) =
312
CHAPTER 11. MATHEMATICAL EXPECTATION X
and
2. If
g (X)
are absolutely continuous, the Stieltjes integrals are replaced by
E [Z] =
g (t) fX (t) dt =
ufZ (u) du
(11.53)
where limits of integration are determined by
fX
or
fY .
Justication for use of the density function is
provided by the RadonNikodym theoremproperty (E19) ("(E19)", p. 600). 3. If
X
is simple, in a primitive form (including canonical form), then
m
E [Z] = E [g (X)] =
j=1
If the distribution for
g (cj ) P (Cj )
(11.54)
Z = g (X)
is determined by a csort operation, then
n
E [Z] =
k=1
vk P (Z = vk )
(11.55)
4. The extension to unbounded, nonnegative, integervalued random variables is shown in Example 11.11 (Unbounded, nonnegative, integervalued random variables), above. innite series (provided they converge). 5. For The nite sums are replaced by
Z = g (X, Y ), E [Z] = E [g (X, Y )] = g (t, u) FXY (dtdu) = vFZ (dv)
(11.56)
6. In the absolutely continuous case
E [Z] = E [g (X, Y )] =
g (t, u) fXY (t, u) dudt =
vfZ (v) dv
(11.57)
7. For joint simple
X, Y
(Section on Expectation for Simple Random Variables (Section 11.1.2: Expec
tation for simple random variables))
n
m
E [Z] = E [g (X, Y )] =
i=1 j=1
g (ti , uj ) P (X = ti , Y = uj )
(11.58)
Mechanical interpretation and approximation procedures
In elementary mechanics, since the total mass is one, the quantity
E [X] =
tfX (t) dt
is the location of
the center of mass. This theoretically rigorous fact may be derived heuristically from an examination of the expectation for a simple approximating random variable. Recall the discussion of the mprocedure for discrete approximation in the unit on Distribution Approximations (Section 7.2) The range of
X
is divided into equal
subintervals. The values of the approximating random variable are at the midpoints of the subintervals. The associated probability is the probability mass in the subinterval, which is approximately
fX (ti ) dx,
where
dx
is the length of the subinterval. This approximation improves with an increasing number of subdivisions,
with corresponding decrease in
dx.
The expectation of the approximating simple random variable
Xs
is
E [Xs ] =
i
ti fX (ti ) dx ≈
tfX (t) dt
(11.59)
The approximation improves with increasingly ne subdivisions. The center of mass of the approximating distribution approaches the center of mass of the smooth distribution.
313
It should be clear that a similar argument for
g (X)
leads to the integral expression
E [g (X)] =
g (t) fX (t) dt
(11.60)
This argument shows that we should be able to use tappr to set up for approximating the expectation
E [g (X)]
as well as for approximating
P (g (X) ∈ M ),
etc.
We return to this in Section 11.2.2 (Properties
and computation).
Mean values for some absolutely continuous distributions
1.
Uniform
on
[a, b]fX (t) =
1 b−a ,
a≤t≤b
The center of mass is at
(a + b) /2.
To calculate the value
formally, we write
E [X] =
Symmetric triangular on[a,
interval 3.
tfX (t) dt = b]
1 b−a
b
t dt =
a
b2 − a 2 b+a = 2 (b − a) 2
(11.61)
2.
The graph of the density is an isoceles triangle with base on the
[a, b].
By symmetry, the center of mass, hence the expectation, is at the midpoint
(a + b) /2.
Exponential(λ).
fX (t) = λe−λt , 0 ≤ t E [X] =
Using a well known denite integral (see Appendix B
(Section 17.2)), we have
∞
tfX (t) dt =
0
λte−λt dt = 1/λ
(11.62)
4.
Gamma(α,
λ). fX (t) =
1 α−1 α −λt λ e , Γ(α) t
0≤t
Again we use one of the integrals in Appendix B
(Section 17.2) to obtain
E [X] =
tfX (t) dt =
1 Γ (α)
∞
λα tα e−λt dt =
0
Γ (α + 1) = α/λ λΓ (α)
1 0
(11.63)
The last equality comes from the fact that 5.
Beta(r,
Γ(r)Γ(s) Γ(r+s) ,
s). fX (t) =
Γ(r+s) r−1 (1 Γ(r)Γ(s) t
− t)
s−1
Γ (α + 1) = αΓ (α). , 0 < t < 1 We use
the fact that
ur−1 (1 − u)
s−1
du =
r > 0, s > 0. tfX (t) dt = Γ (r + s) Γ (r) Γ (s)
1 0
α
E [X] =
Weibull(α,
tr (1 − t)
s−1
dt =
Γ (r + s) Γ (r + 1) Γ (s) r · = Γ (r) Γ (s) Γ (r + s + 1) r+s
Dierentiation shows
(11.64)
6.
λ, ν). FX (t) = 1 − e−λ(t−ν)
α > 0, λ > 0, ν ≥ 0, t ≥ ν .
α−1 −λ(t−ν)α
fX (t) = αλ(t − ν)
First, consider
e
,
t≥ν
(11.65)
Y ∼
exponential
(λ).
For this random variable
∞
E [Y r ] =
0
If
tr λe−λt dt =
Γ (r + 1) λr
(11.66)
Y
is exponential (1), then techniques for functions of random variables show that
1/α 1 λY
+ν ∼
Weibull
(α, λ, ν).
Hence,
E [X] =
1 1 E Y 1/α + ν = 1/α Γ λ1/α λ
1 +1 +ν α
(11.67)
314
CHAPTER 11. MATHEMATICAL EXPECTATION
Normal
7.
µ, σ 2
The symmetry of the distribution about
t=µ
shows that
E [X] = µ.
This, of course,
may be veried by integration. A standard trick simplies the work.
∞
∞
E [X] =
−∞
We have used the fact that
tfX (t) dt =
−∞
(t − µ) fX (t) dt + µ
(11.68)
∞ −∞
fX (t) dt = 1.
If we make the change of variable
integral, the integrand becomes an odd function, so that the integral is zero. Thus,
x = t − µ in the E [X] = µ.
last
11.2.2 Properties and computation
The properties in the table in Appendix E (Section 17.5) constitute a powerful and convenient resource for the use of mathematical expectation. These are properties of the abstract Lebesgue integral, expressed in the notation for mathematical expectation.
E [g (X)] =
g (X) dP
(11.69)
In the development of additional properties, the four basic properties: (E1) ("(E1) ", p. 309) Expectation of indicator functions, (E2) ("(E2) ", p. 309) Linearity, (E3) ("(E3) ", p. 309) Positivity; monotonicity, and (E4a) ("(E4a)", p. 309) Monotone convergence play a foundational role. We utilize the properties in the
table, as needed, often referring to them by the numbers assigned in the table. In this section, we include a number of examples which illustrate the use of various properties. Some
are theoretical examples, deriving additional properties or displaying the basis and structure of some in the table. Others apply these properties to facilitate computation
Example 11.14: Probability as expectation
Probability may be expressed entirely in terms of expectation.
• • •
By properties (E1) ("(E1) ", p. 309) and positivity (E3a) ("(E3) ", p. 309),
P (A) = E [IA ] ≥
0.
As a special case of (E1) ("(E1) ", p. 309), we have
P (Ω) = E [IΩ ] = 1
By the countable sums property (E8) ("(E8) ", p. 600),
A=
i
Ai
implies
P (A) = E [IA ] = E
i
IAi =
i
E [IAi ] =
i
P (Ai )
(11.70)
Thus, the three dening properties for a probability measure are satised.
Remark.
There are treatments of probability which characterize mathematical expectation with properties
(E0) through (E4a) (p. 309), then dene has not been widely adopted.
P (A) = E [IA ].
Although such a development is quite feasible, it
Example 11.15: An indicator function pattern
Suppose
X
is a real random variable and
E = X −1 (M ) = {ω : X (ω) ∈ M }. IE = IM (X)
Then
(11.71)
To see this, note that Similarly, if ", p. 309),
X (ω) ∈ M i ω ∈ E , so that IE (ω) = 1 i IM (X (ω)) = 1. E = X −1 (M ) ∩ Y −1 (N ), then IE = IM (X) IN (Y ). We thus have,
by (E1) ("(E1)
P (X ∈ M ) = E [IM (X)]
and
P (X ∈ M, Y ∈ N ) = E [IM (X) IN (Y )]
(11.72)
315
Example 11.16: Alternate interpretation of the mean value
E (X − c)
2
is a minimum i
c = E [X] , X (ω) − c.
in which case
E (X − E [X])
2
= E X 2 − E 2 [X]
(11.73)
INTERPRETATION. If we approximate the random variable the error of approximation is (often called the
X
by a constant
c,
then for any
ω
The probability weighted average of the square of the error . This average squared error is smallest i
mean squared error ) is E (X − c)2 the approximating constant c is the mean value.
VERIFICATION We expand
(X − c)
2
and apply linearity to obtain
E (X − c)
2
= E X 2 − 2cX + c2 = E X 2 − 2E [X] c + c2
(11.74)
E X 2 and E [X] are constants). The usual calculus treatment shows the expression has a minimum for c = E [X]. Substitution of this value for c shows 2 the expression reduces to E X − E 2 [X].
The last expression is a quadratic in (since A number of inequalities are listed among the properties in the table. The basis for these inequalities is usually some standard analytical inequality on random variables to which the monotonicity property is applied. We illustrate with a derivation of the important Jensen's inequality.
c
Example 11.17: Jensen's inequality
If of
X is a real random variable and g X, then
is a convex function on an interval
I
which includes the range
g (E [X]) ≤ E [g (X)]
VERIFICATION The function
(11.75)
g
is convex on
I
i for each
t0 ∈ I = [a, b]
there is a number
λ (t0 )
such that
g (t) ≥ g (t0 ) + λ (t0 ) (t − t0 )
This means there is a line through
(11.76)
a ≤ X ≤ b,
by
then by monotonicity
(E11) ("(E11)", p. 600)). We
c, we have
(t0 , g (t0 )) such that the graph of g lies on or above it. If E [a] = a ≤ E [X] ≤ E [b] = b (this is the mean value property may choose t0 = E [X] ∈ I . If we designate the constant λ (E [X])
g (X) ≥ g (E [X]) + c (X − E [X])
Recalling that tonicity, to get
(11.77)
E [X]
is a constant, we take expectation of both sides, using linearity and mono
E [g (X)] ≥ g (E [X]) + c (E [X] − E [X]) = g (E [X])
(11.78)
Remark.
It is easy to show that the function
λ (·) is nondecreasing.
This fact is used in establishing Jensen's
inequality for conditional expectation.
The product rule for expectations of independent random variables
Example 11.18: Product rule for simple random variables
Consider an independent pair
{X, Y }
of simple random variables
n
m
X=
i=1
ti IAi Y =
j=1
uj IBj
(both in canonical form)
(11.79)
316
CHAPTER 11. MATHEMATICAL EXPECTATION
We know that each pair product
{Ai , Bj }
is independent, so that
P (Ai Bj ) = P (Ai ) P (Bj ).
Consider the
XY .
a function of
X ) from "Mathematical Expectation:
n m
According to the pattern described after Example 9 (Example 11.9: Expectation of Simple Random Variables."
n
m
XY =
i=1
ti IAi
j=1
uj IBj =
i=1 j=1
ti uj IAi Bj
(11.80)
The latter double sum is a primitive form, so that
E [XY ] (
n i=1 ti P
= (Ai ))
n m i=1 j=1 ti uj P m j=1 uj P (Bj )
(Ai Bj )
=
n i=1
m j=1 ti uj P
(Ai ) P (Bj )
=
(11.81)
= E [X] E [Y ]
Thus the product rule holds for independent simple random variables.
Example 11.19: Approximating simple functions for an independent pair
Suppose of
X
and
rule for
{X, Y } is an independent pair, with an approximating simple pair {Xs , Ys }. As functions Y, respectively, the pair {Xs , Ys } is independent. According to Example 11.18 (Product simple random variables), above, the product rule E [Xs Ys ] = E [Xs ] E [Ys ] must hold.
there exist nondecreasing sequences
Example 11.20: Product rule for an independent pair
For
X ≥ 0, Y ≥ 0,
simple random variables increasing to
X
and
Y,
{Xn : 1 ≤ n}
and
respectively.
The sequence
{Yn : 1 ≤ n} of {Xn Yn : 1 ≤ n} is
By the monotone
also a nondecreasing sequence of simple random variables, increasing to convergence theorem (MC)
XY .
E [Xn ] [U+2197]E [X] ,
Since In the general case,
E [Yn ] [U+2197]E [Y ] ,
for each
and
E [Xn Yn ] [U+2197]E [XY ]
(11.82)
E [Xn Yn ] = E [Xn ] E [Yn ] XY = X + − X −
n, we conclude E [XY ] = E [X] E [Y ]
(11.83)
Y + − Y − = X +Y + − X +Y − − X −Y + + X −Y −
Application of the product rule to each nonnegative pair and the use of linearity gives the product rule for the pair
{X, Y }
Remark.
It should be apparent that the product rule can be extended to any nite independent class.
Example 11.21: The joint distribution of three random variables
{X, Y, Z} is independent, with the marginal g (X, Y, Z) = 3X 2 + 2XY − 3XY Z . Determine E [W ].
The class
distributions shown below.
Let
W =
X = 0:4; Y = 1:2:7; Z = 0:3:12; PX = 0.1*[1 3 2 3 1]; PY = 0.1*[2 2 3 3]; PZ = 0.1*[2 2 1 3 2]; icalc3 Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter row matrix of Zvalues Z Enter X probabilities PX Enter Y probabilities PY % Setup for joint dbn for{X,Y,Z}
317
Enter Z probabilities PZ Use array operations on matrices X, Y, Z, PX, PY, PZ, t, u, v, and P EX = X*PX' % E[X] EX = 2 EX2 = (X.^2)*PX' % E[X^2] EX2 = 5.4000 EY = Y*PY' % E[Y] EY = 4.4000 EZ = Z*PZ' % E[Z] EZ = 6.3000 G = 3*t.^2 + 2*t.*u  3*t.*u.*v; % W = g(X,Y,Z) = 3X^2 + 2XY  3XYZ EG = total(G.*P) % E[g(X,Y,Z)] EG = 132.5200 [W,PW] = csort(G,P); % Distribution for W = g(X,Y,Z) EW = W*PW' % E[W] EW = 132.5200 ew = 3*EX2 + 2*EX*EY  3*EX*EY*EZ % Use of linearity and product rule ew = 132.5200
Example 11.22: A function with a compound denition: truncated exponential
Suppose
X∼
exponential (0.3). Let
Z={
Determine
X2 16
for for
X≤4 X>4
= I[0,4] (X) X 2 + I(4,∞] (X) 16
(11.84)
E [Z].
∞
ANALYTIC SOLUTION
E [g (X)] =
g (t) fX (t) dt =
0 4
I[0,4] (t) t2 0.3e−0.3t dt + 16E I(4,∞] (X)
(11.85)
=
0
APPROXIMATION
t2 0.3e−0.3t dt + 16P (X > 4) ≈ 7.4972
(by Maple)
(11.86)
To obtain a simple aproximation, we must approximate the exponential by a bounded random variable. Since
P (X > 50) = e−15 ≈ 3 · 10−7
we may safely truncate
X
at 50.
tappr Enter matrix [a b] of xrange endpoints [0 50] Enter number of x approximation points 1000 Enter density as a function of t 0.3*exp(0.3*t) Use row matrices X and PX as in the simple case M = X <= 4; G = M.*X.^2 + 16*(1  M); % g(X) EG = G*PX' % E[g(X)] EG = 7.4972 [Z,PZ] = csort(G,PX); % Distribution for Z = g(X) EZ = Z*PZ' % E[Z] from distribution EZ = 7.4972
318
CHAPTER 11. MATHEMATICAL EXPECTATION
Because of the large number of approximation points, the results agree quite closely with the theoretical value.
Example 11.23: Stocking for random demand (see Exercise 4 (Exercise 10.4) from "Problems on Functions of Random Variables")
The manager of a department store is planning for the holiday season. dollars per unit and sells for
p
A certain item costs
dollars per unit.
additional units can be special ordered for
s
If the demand exceeds the amount
m
c
ordered,
dollars per unit (
s > c).
If demand is less than
amount ordered, the remaining stock can be returned (or otherwise disposed of ) at unit (
r < c).
Demand
D
r
dollars per
for the season is assumed to be a random variable with Poisson What amount
distribution. Suppose
µ = 50, c = 30, p = 50, s = 40, r = 20.
m should the manager
(µ)
order to maximize the expected prot? PROBLEM FORMULATION Suppose
D
is the demand and
X
is the prot. Then
For For
D ≤ m, X = D (p − c) − (m − D) (c − r) = D (p − r) + m (r − c) D > m, X = m (p − c) + (D − m) (p − s) = D (p − s) + m (s − c)
It is convenient to write the expression for
X
in terms of
IM , where M = (−∞, m].
Thus
X = IM (D) [D (p − r) + m (r − c)] + [1 − IM (D)] [D (p − s) + m (s − c)] = D (p − s) + m (s − c) + IM (D) [D (p − r) + m (r − c) − D (p − s) − m (s − c)] = D (p − s) + m (s − c) + IM (D) (s − r) (D − m)
Then
(11.87)
(11.88)
(11.89)
E [X] = (p − s) E [D] + m (s − c) + (s − r) E [IM (D) D] − (s − r) mE [IM (D)]. D∼
Poisson
ANALYTIC SOLUTION For
(µ), E [D] = µ
m
and
E [IM (D)] = P (D ≤ m)
m
E [IM (D) D] = e−µ
k=1
Hence,
k
µk = µe−µ k!
k=1
µk−1 = µP (D ≤ m − 1) (k − 1)!
(11.90)
E [X] = (p − s) E [D] + m (s − c) + (s − r) E [IM (D) D] − (s − r) mE [IM (D)] = (p − s) µ + m (s − c) + (s − r) µP (D ≤ m − 1) − (s − r) mP (D ≤ m)
Because of the discrete nature of the problem, we cannot solve for the optimum calculus. We may solve for various
(11.91)
(11.92) by ordinary
m
m
about
m=µ
and determine the optimum.
We do so with
the aid of MATLAB and the mfunction cpoisson.
mu = 50; = 30; = 50; = 40; = 20; = 45:55; EX = (p  s)*mu + m*(s c) + (s  r)*mu*(1  cpoisson(mu,m)) ... (s  r)*m.*(1  cpoisson(mu,m+1)); disp([m;EX]') 45.0000 930.8604 c p s r m
0000 49. t = 0:n. u = 1 + t. solution may be obtained by MATLAB.2347 929.3886 % Optimum m = 50 A direct.1532 934.:) = t*(p .0000 47. pD = ipoisson(mu.*(t .2001e10 n = 100.5231 939.24 (A jointly distributed pair). . u = 1 − t (see Figure 11..M.0000 48. Y } has joint density fXY (t. u) = 3u on the triangular region bounded by u = 0. end EG = G*pD'. D.r) .0000 51.m(i)*(c .0699 938. the maximum and minimum gains. Y ) = X 2 + 2XY . Let Z = g (X. based on simple approximation to for each m could be studied e. is that the distribution of gain Example 11.r). for i = 1:length(m) % Step by step calculation for various m M = t > m(i).r).6750 942.0000 52.1895 941. Suppose the pair Figure 11. G(i. Determine E [Z].319 46.m(i))*(s .24: A jointly distributed pair {X.g.2: The density for Example 11.9247 941.2988 943. % Values agree with theoretical to four deicmals An advantage of the second solution.2). using nite approximation for the Poisson distri APPROXIMATION ptest = cpoisson(mu.0000 bution.0000 53.0000 55.100) % Check for suitable value of n ptest = 3. 935.0000 50.7962 943.t).0000 54.
and u = t − 1 (see Figure 11.^2 + 2*t. u) : max{t. u ≤ 1}.3).*(u<=min(1+t.94) Q = {(t.93) tuappr Enter matrix [a b] of Xrange endpoints [1 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density 3*u.25: A function with a compound denition The pair {X. X 2Y for for max{X. MATHEMATICAL EXPECTATION ANALYTIC SOLUTION E [Z] = t2 + 2tu fXY (t. Y } ≤ 1 max{X.320 CHAPTER 11.1006 % Theoretical value = 1/10 [Z. u} ≤ 1} = {(t. PY. u) = 1/2 u = 1 − t. Y ) X + IQc (X. % Distribution for Z EZ = Z*PZ' % E[Z] from distribution EZ = 0. t.1t)) Use array operations on X.PZ] = csort(G. E [W ].*P) % E[g(X. PX. u) : t ≤ 1. Y } > 1 = IQ (X.Y) = X^2 + 2XY EG = total(G. Y ) 2Y Determine (11.P).1006 Example 11. u) dudt 0 1+t 1 1−t =3 −1 APPROXIMATION t2 u + 2tu2 dudt + 3 0 0 0 t2 u + 2tu2 dudt = 1/10 (11.*u. u = 3 − t. u. % g(X. W ={ where on the square region bounded by u = 1 + t.Y)] EG = 0. Y } has joint density fXY (t. and P G = t. . Y.
*M + 2*u. (u>=max(1t.321 Figure 11.8333 t−1 (11.*(1 . PX.PZ] = csort(G.*P) % E[g(X.u)<=1. 1 0 1 1+t Reference to the gure shows three regions of integration.PZ) % E[Z] from distribution EZ = 1. % Z = g(X..95) APPROXIMATION tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density ((u<=min(t+1. Y. PY.8340 Special forms for expectation .8333 [Z. and P M = max(t.3t))& . E [W ] = 1 2 1 0 1 t dudt + 1−t 1 2 2u dudt + 1 2 2 1 3−t 2u dudt = 11/6 ≈ 1. u..Y)] EG = 1. ANALYTIC SOLUTION The intersection of the region Q and the square is the set for which 0 ≤ t ≤ 1 and 1 − t ≤ u ≤ 1. G = t. % Distribution for Z EZ = dot(Z.3: The density for Example 11.8340 % Theoretical 11/6 = 1.25 (A function with a compound denition).P).M). t.Y) EG = total(G.t1)))/2 Use array operations on X.
A and B are the same in the two ∞ Area 0 A= 0 [1 − F (t)] dt and Area B= −∞ F (t) dt (11. p. which we do not need. 600) are often useful.96) may be derived from (E20a) by use of integration by parts for Stieltjes integrals. However. the areas of the regions marked gures. Because of the construction.4).101) E [X] = 0 u1/a du = The same result could be obtained by using ' fX (t) = FX (t) tfX (t) dt. 601) (11. 601) ∞ E [X] = −∞ [u (t) − FX (t)] dt (11.97) 1 E [g (X)] = 0 VERIFICATION If g [Q (u)] du Y = Q (U ). a > 0.26: The property (E20f ) (list.99) ∞ E [X] = 0 R (t) dt (11. The special form (E20b) (list.100) Example 11. p. 601). MATHEMATICAL EXPECTATION The various special forms related to property (E20a) (list. where U∼ uniform on (0. if X is the life duration (time to failure) for a device. The general result.103) .28: Use of the quantile function Suppose FX (t) = ta . As may be seen. (11. we use the relationship between the graph of the distribution function and the graph of the quantile function to show the equivalence of (E20b) (list. p. a 1 = 1 + 1/a a+1 and evaluating (11. p.102) F is shown in Figure 3(b) (Figure 11. 0 ≤ u ≤ a.27: Reliability and expectation In reliability. Figure 3(a) shows 1 0 Q (u) du is the dierence in the shaded areas 1 Q (u) du = Area A 0 The corresponding graph of the distribution function . 601) For the special case g (X) = X . Hence. the reliability function is the probability at any time t the device is still operative. p. Example 11.98) 1 E [g (X)] = E [g (Q (U ))] = g (Q (u)) fU (u) du = 0 g (Q (u)) du Example 11. 1). 601) If Q is the quantile function for the distribution function FX . then (11.29: Equivalence of (E20b) (list. 601) and (E20f ) (list.322 CHAPTER 11. 601) and (E20f ) (list. then Y has the same distribution as X. p. 0 ≤ t ≤ 1.Area B (11. The latter property is readily established by elementary arguments. Example 11. Thus R (t) = P (X > t) = 1 − FX (t) According to property (E20b) (list. p. 1 Then Q (u) = u1/a . is usually derived by an argument which employs a general form of what is known as Fubini's theorem. p.
104) .Area B= −∞ [u (t) − F (t)] dt (11.323 Use of the unit step function u (t) = 1 for t > 0 and 0 for t < 0 (dened arbitrarily at t = 0) enables us to combine the two expressions to get 1 ∞ Q (u) du = Area A 0 .
601) is a direct result of linearity and (E20b) (list. functions cancelling out. p. MATHEMATICAL EXPECTATION Figure 11.324 CHAPTER 11. p. Property (E20c) (list. 601). p. with the unit step . 601). p.4: Equivalence of properties (E20b) (list. 601) and (E20f) (list.
we may assert b b P (X > t) dt = a For each nonnegative integer P (X ≥ t) dt a (11. Then ∞ ∞ ∞ P (X ≥ n + 1) ≤ E [X] ≤ n=0 VERIFICATION For P (X ≥ n) ≤ N n=0 k=0 P (X ≥ kN ) .108) P (X ≥ t) is decreasing with t En has unit length. by (E20b) (list. For integer valued random variables. 601) is more frequently used. The special case (E20e) (list. p. P (X ≥ t) = P (X ≥ n) on En and P (X ≥ t) = P (X > n) = P (X ≥ n + 1) on En+1 (11.31: Property (E20e) (list. p.111) The result follows as a special case of (E20d) (list.32: (E20e) (list. 601) for integervalued random variables By denition ∞ n E [X] = k=1 kP (X = k) = lim n k=1 kP (X = k) (11.107) n. then ∞ ∞ E [X] = k=1 VERIFICATION P (X ≥ k) = k=0 P (X > k) (11. 601).105) X ≥ 0. 601) ∞ ∞ E [X] = 0 Since [1 − F (t)] dt = 0 P (X > t) dt P (X > t) and (11.112) An elementary derivation of (E20e) (list. Example 11. 601) is used primarily for theoretical purposes.110) Remark.30: Property (E20d) (list. 601) Useful inequalities Suppose X ≥ 0. Example 11. X Property (E20d) (list. for all N ≥1 (11. p. p.109) (k+1)N P (X ≥ t) dt ≤ N kN EkN P (X ≥ t) dt ≤ N P (X ≥ kN ) (11. p.113) . integer valued. ∞ ∞ By the countable additivity of expectation E [X] = n=0 Since E [IEn X] = n=0 and each P (X ≥ t) dt En (11. let En = [n.106) F can have only a countable number of jumps on any interval and P (X ≥ t) dier only at jump points. p. p. p. 601) can be constructed as follows.325 Example 11. we have by the mean value theorem P (X ≥ n + 1) ≤ E [IEn X] ≤ P (X ≥ n) The third inequality follows from the fact that (11. n + 1). 601) If is nonnegative.
8. 3 This content is available online at <http://cnx.4/>. 0. $3. Then P (X ≥ k) = q k .3 (Solution on p. D} has minterm probabilities pm = 0. 0. In a thunderstorm in a national park there are 127 lightning strikes.10 0.50. 0. 0. .09. 0.12. Determine the expected number of res. Example 11. The class {A.00.14.10 0. MATHEMATICAL EXPECTATION Now for each nite n. 2.33: The geometric distribution Suppose X∼ geometric (p).5IC4 + 5.org/content/m24366/1. 334.115) 11. and Exercise 3 (Exercise 7.06.1 npr07_01. She purchases one of the items with probabilities 0.326 CHAPTER 11. 2.m (Section 17. is a partition. 2} C1 {Cj : 1 ≤ j ≤ 10} through C10 .20. B." mle npr06_12.5 are ipped twenty times. mle X has values {1. 0.5IC3 + 7.50. C. Exercise 11. A customer comes in.8.12) from "Problems on Random Variables and Probabilities". $5.31: npr07_02) ). A store has eight items for sale. The class on (Solution on p. $5. p. 3.m (Section 17.) Exercise 11.50.5IC8 Determine the expection (11.0IC2 + 3.2) from "Problems on Distribution and Density Functions". 0. $5. and $7. 4.8.28: npr06_12)). 0. 601) gives E [X] = k=1 qk = q k=0 qk = q = q/p 1−q (11.15.09.0083. n k n n n n kP (X = k) = k=1 Taking limits as P (X = k) = k=1 j=1 j=1 k=j P (X = k) = j=1 P (X ≥ j) (11.5) from "Problems on Distribution and Density Functions").00.50. 334.2 (See Exercise 2 (Exercise 7. 0. mle npr07_02.m (Section 17.117) which X = IA + IB + IC + ID .15. ∞ ∞ Use of (E20e) (list.) (See Exercise 8 (Exercise 7. (Solution on p. Let (Solution on p. 0. 335. 334. $3. 1. 0.3 Problems on Mathematical Expectation3 Exercise 11. 335.11.) (See Exercise 5 (Exercise 7.30: variable npr07_01)).3) from "Problems on Distribution and Density Functions. $7.0IC6 + 3. Experience shows that the probability of of a lightning strike starting a re is about 0.0IC5 + 5.8) from "Problems on Distribution and Density Functions").5IC1 + 5. The prices are $3. 5. 0.10.116) E [X] of the value of her purchase.5IC7 + 7.00.11. respectively.4 (Solution on p.05.) (See Exercise 1 (Exercise 7.15. Exercise 11. (11. 0.1) from "Problems on Distribution and Density Functions". Determine E [X]. Exercise 11. with probabilities 0. Random respectively. 0. 3. Two coins X be the number of matches (both heads or both tails). Determine E [X].07.001 ∗ [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] Determine the mathematical expection for the random variable counts the number of the events which occur on a trial.08. The random variable expressing the amount of her purchase may be written X = 3.50.) (See Exercise 12 (Exercise 6. 3.13.114) n→∞ yields the desired result.
The prot to the College is X = 50 · 10 − 30N. E [Y ]. m sophomores.) The Exercise 11.7 (See Exercise 19 (Exercise 7. Use the fact that ∞ te−λt dt = 0 to determine an expression for 1 λ2 ∞ and te−λt dt = a 1 −λa e (1 + λa) λ2 (11. without replacement.12 le npr08_02. Let who are selected. Determine the expected prot where N is the number of winners (11.) Two positions for campus jobs are open.327 Exercise 11.) A (See Exercise 12 (Exercise 7. Fifty chances are sold.8. A player pays $10 to play. (Solution on p. and usual assumptions. number of noise pulses arriving on a power circuit in an hour is a random quantity having Poisson (7) distribution.a] (X) X + I(a. 0.10 Truncated exponential. Y } and E [X]. E [XY ]. λ = 1/50. 335.119) What is the expected value E [X]? (Solution on p.8. Let X be the number of aces and Y be the number of spades. Two (See Exercise 2 (Exercise 8. Exercise 11. a = 30. 1] (t) t2 + I(1.8 (Solution on p. Use the approximation method. residential College plans to raise money by selling chances on a board.∞) (X) a. E Y2 .120) E [Y ]. What is the expected number of pulses in an hour? Exercise 11.2.2) from "Problems On Random Vectors and Joint Distributions".12) from "Problems on Distribution and Density Functions").6 (Solution on p. Approximate the exponential at Compare the approximate result with the theoretical result b. with 10. mle npr08_01.) (See Exercise 41 (Exercise 7. Determine E [X]. What is the expected operating time? Exercise 11.) (See Exercise 1 (Exercise 8. determine the joint distribution.) (See Exercise 24 (Exercise 7. E X 2 Y be the number of juniors . a. The total operating time for the units in Exercise 24 (Exercise 7. Exercise 11. 335.m (Section 17. possible pair equally likely). 335.m (Section 17.25) from "Problems on Distribution and Density Functions").24) is a random variable T ∼ gamma (20.24) and Exercise 25 (Exercise 7.000 points for of part (a). and X It is decided to select two at random (each be the number of sophomores and Determine the joint distribution for {X.19) from "Problems on Distribution and Density Functions"). Suppose X∼ exponential (λ) and Y = I[0. he or she wins $30 with probability p = 0.9 variable (Solution on p.1) from "Problems On Random Vectors and Joint Distributions".41) from "Problems on Distribution and Density Functions"). from a standard deck.2] (t) (2 − t) 5 5 (11. 0 ≤ t ≤ 1000. . 335. E [Y ]. Under the . Random X has density function fX (t) = { (6/5) t2 (6/5) (2 − t) for for 0≤t≤1 1<t≤2 6 6 = I [0. (Solution on p.118) E [X].32: npr08_01)). 336. E Y2 E [XY ].11 (Solution on p. Two cards are selected at random. and three seniors apply. 335. 335.0002).) Exercise 11.33: npr08_02) ). three juniors. E X 2 .
1 Determine E [X].m (Section 17.37: npr08_06)).0077 Table 11. E [Y ].) (See Exercise 6 (Exercise 8.2 0.1 2.0363 0.17 npr08_07.0399 0. Let (Solution on p.7 0.9 0.0231 0.4 0.0713 0.0 3.35: npr08_04) ). Y = u) (11.0589 (11.1320 0. Y } E [Y ].0551 0. Y } has the joint distribution: X = [−2. mle npr08_04.0380 0. Exercise 11.0396 0 0.36: npr08_05)). sevens that are thrown on the X Roll the pair an additional X times. with values of Y increasing upward.328 CHAPTER 11. A coin is ipped npr08_03) ).8 3. Exercise 11. suppose a pair of dice is rolled instead of a single die.0324 0.15 npr08_05.0297 0 4. P (Y = jX = k) has the binomial (k.0357 0.3 2.0437 P = 0.13.1089 0.1 5.0528 0.0132 3. and E [XY ].5 4.0405 0.0495 0. E Y2 Exercise 11.m (Section 17. E X 2 .0483 0.5 4. 337.0203 0. E [Y ].122) E [X]. times. 1/2) distribution.0527 0.1 3.0726 2.1] 0.6) from "Problems On Random Vectors and Joint Distributions".0651 0.0667 Determine 0.38: npr08_07)). mle {X. and Y = [1.5) from "Problems On Random Vectors and Joint Distributions". Let A die is rolled. E Y2 .8. Determine the expected value E [Y ].0891 0.16 npr08_06.0399 0.0594 0.m (Section 17.9 5. Let (See Exercise 5 (Exercise 8.0493 . Let Y X be the total be the number of and determine rolls.123) t = u = 7.0189 0.8.m (Section 17. (Solution on p.m (Section 17. The pair (Solution on p. Determine the joint distribution for {X.121) 0. mle {X.8.) (See Exercise 4 (Exercise 8. 337.0420 0.3] 0. Arrange the joint matrix as on the plane.0580 E [XY ]. Determine the joint distribution for {X. mle number of spots which turn up.8.13 npr08_03.1 0. .0510 0.0090 0. Y } and determine E [Y ]. 337. The pair (Solution on p.0484 1.0440 0.34: turn up. 0.) (See Exercise 3 (Exercise 8.3) from "Problems On Random Vectors and Joint Distributions".5 0.7) from "Problems On Random Vectors and Joint Distributions". E X 2 . 336.) (See Exercise 7 (Exercise 8. Assume P (X = k) = 1/6 for 1 ≤ k ≤ 6 and for each k.) Suppose a pair of dice is rolled. Y }.14 X Y X be the number of spots that Determine the be the number of heads that turn up. Exercise 11.3 − 0.0620 0. Y } has the joint distribution: P (X = t.0216 0. mle joint distribution for the pair {X.0361 0.7 1. 336.0609 0. As a variation of Exercise 11.8.4) from "Problems On Random Vectors and Joint Distributions".0323 0.0441 (11. MATHEMATICAL EXPECTATION Exercise 11. (Solution on p.
0167 11 0.049 0. 2 approximation for E [X].0182 0.039 0.0162 0.001 0.0070 0. The data are as follows: P (X = t.0156 0.001 0.0049 0.0072 0.0060 0.0168 0.0056 0. E X and .050 0.0196 0.0172 17 0.0182 0.012 0.0096 0.9) from "Problems On Random Vectors and Joint Distributions".017 Table 11.19 npr08_09.0108 0.8) from "Problems On Random Vectors and Joint Distributions".0268 0.010 0.0035 0.18 npr08_08.0040 0. and Y is the time to perform the task.m (Section 17.045 2.8.) Exercise 11.0204 0.0040 0.0140 0.0134 0. E Y2 .0108 0.0120 0. X is the amount of training.8. Y } has the joint distribution: P (X = t.0091 0.0200 0. in hours.033 0. and E [XY ].0063 7 0.0191 0.40: npr08_09)).0060 0.0056 0.0084 0. E [Y ]. E X 2 .061 0.0126 0. (See Exercise 9 (Exercise 8.125) t = u = 5 4 3 2 1 1 0. E Y2 .0091 0.0156 0.058 0.0180 0.0045 9 0.) (See Exercise 8 (Exercise 8. E [XY ].0180 0.025 3 0.0126 0. E Y 2 .0064 0. Y = u) (11. Y = u) (11.0081 0. The pair (Solution on p.329 Exercise 11.0196 0.003 1. 337. mle Data were kept on the eect of training time on the time to perform a job on a production line.0054 0. Determine analytically b.0096 0.0120 0.009 2 0.0056 0.070 0.0256 0.0104 0.0256 0.0060 0.39: npr08_08)). .163 0.0182 0. in minutes.3 Determine E [X].2 Determine E [X].137 0.039 0.051 0.0098 0.0012 0.0217 19 0.0160 0.0050 0.0234 0.0080 0.0044 0.0223 Table 11.0056 0.005 0.0060 0.0140 0.0252 0.0070 0. mle {X.124) t = u = 12 10 9 5 3 1 3 5 1 0. E X 2 .0026 15 0.0256 0. E X 2 .0096 0. E [Y ].0072 3 0.065 0.0160 0. and E [XY ]. For the joint densities in Exercises 2032 below a.0234 0. 338. E Y 2 .5 0.0112 0.m (Section 17.0260 0.0084 0.0280 0. (Solution on p. Use a discrete E [X].0038 0.0064 0.0090 13 0.5 0. E [Y ]. E [Y ].0017 5 0.031 0.015 0.0244 0. and E [XY ].0112 0.011 0.
1) fX (t) = I[−1.134) .131) 3 3 (1 + t) 1 + 4t + t2 = 1 + 5t + 5t2 + t3 .11) from "Problems On Random Vectors and Joint Distributions"). fXY (t. (11. fX (t) = 2e−2t . 0 ≤ t ≤ 1. u) = 4ue−2t 0 ≤ t. 339.20 (Solution on p. fXY (t. 0 ≤ t ≤ 2 4 (11.) (See Exercise 10 (Exercise 8. 0 ≤ u ≤ 1 (11. 0 ≤ u ≤ 1 + t. (11. 2) .13) from "Problems On Random Vectors and Joint Distributions"). (1. 0 ≤ u ≤ 2.126) Exercise 11. fY (u) = 2 (1 − u) . (1. 1). fX (t) = fY (t) = 1 (t + 1) . 0 ≤ u ≤ 1 Exercise 11. u) = 12t2 u (−1.25 (Solution on p. (11.23 (Solution on p. 0 ≤ t ≤ 2 88 88 3 3 6u2 + 4 + I(1.12) from "Problems On Random Vectors and Joint Distributions"). 338.) for (See Exercise 14 (Exercise 8.) (See Exercise 15 (Exercise 8.) for (See Exercise 13 (Exercise 8. 2 (11. u) = 3 88 2t + 3u2 fX (t) = for 0 ≤ t ≤ 2.133) fY (u) = 12u3 − 12u2 + 4u.132) (Solution on p. (0. (0. 339. 1) .3] (u) 3 + 2u + 8u2 − 3u3 88 88 fY (u) = I[0. fXY (t.14) from "Problems On Random Vectors and Joint Distributions"). fXY (t. fY (u) = 1 − u/2. 0 ≤ t ≤ 1. fXY (t.26 (11. MATHEMATICAL EXPECTATION Exercise 11. (2. fX (t) = 2 (1 − t) . 339.127) fX (t) = fY (t) = I[0. 0 ≤ t.21 (Solution on p.1] (t) 6t2 1 − t2 . 0 ≤ u ≤ 2 (11. 0 ≤ u ≤ 1. (0. 0 ≤ u ≤ 1.16) from "Problems On Random Vectors and Joint Distributions").1] (t) t + I(1.) on the parallelogram with vertices (See Exercise 16 (Exercise 8.130) Exercise 11.22 (Solution on p.) for (See Exercise 12 (Exercise 8.330 CHAPTER 11.128) fX (t) = 2t. 0) . 0) . 0 ≤ u ≤ 1 (11. 339.2] (t) (2 − t) Exercise 11.1] (u) Exercise 11. 0 ≤ u ≤ 2 (1 − t).129) Exercise 11. fXY (t. fY (u) = 2u. 338. fXY (t. u) = 4t (1 − u) 0 ≤ t ≤ 1. 1) .0] (t) 6t2 (t + 1) + I(0. u) = 1 for 0 ≤ t ≤ 1. u) = 1 8 (t + u) 0 ≤ t ≤ 2.24 (Solution on p. 339.10) from "Problems On Random Vectors and Joint Distributions").) on the square with vertices at (See Exercise 11 (Exercise 8. 0) .15) from "Problems On Random Vectors and Joint Distributions"). u) = 1/2 (1.
2] (t) (3 − t) 13 13 + I(1. fXY (t.18) from "Problems On Random Vectors and Joint Distributions"). fXY (t.2] (t) t(2 − t) . u) = 2 13 (t + 2u).135) fX (t) = I[0. u) = 24 11 tu 0 ≤ t ≤ 2. 340. 0 ≤ u ≤ max{2 − t. 8 . fY (u) = u(u − 2) . 0 ≤ u ≤ min{1. u) = 3 23 (t + 2u) 0 ≤ t ≤ 2. 0 ≤ t ≤ 2.331 Exercise 11.139) fX (t) = I[0.143) (Solution on p. 2}. u) = I[0.2] (u) 9 + 12u + u2 − 2u3 227 227 (11.2] (u) 27 − 24u + 8u2 − u3 179 179 (11. fX (t) = I[0.19) from "Problems On Random Vectors and Joint Distributions").1] (t) Exercise 11.30 (11.1] (u) Exercise 11. 340.1] (t) fY (u) = I[0.32 4 8 9 + u − u2 13 13 52 (11. 9 fXY (t.) (See Exercise 22 (Exercise 8.1] (t) fY (u) = I[0.2] (t) 9 − 6t + 19t2 − 6t3 179 179 24 12 (4 + u) + I(1.2] (t) t 227 227 (11.138) (Solution on p.1] (t) 12 2 6 t + I(1. 0 ≤ u ≤ min{2t.2] (t) 23 t2 .) for (See Exercise 18 (Exercise 8.) for (See Exercise 20 (Exercise 8. 0 ≤ t ≤ 2. u) = 12 179 3t2 + u . 339. 0 ≤ u ≤ min{2.29 (Solution on p.1] (t) 23 (2 − t) + I(1. 340. 340.2] (u) (2u + 3) 3 + 2u − u2 227 227 24 6 (2u + 3) + I(1. 0 ≤ u ≤ 1 11 11 11 (11.2] (u) 23 (4 + 6u − 4u2 ) = 6 I[0. 3 − t}. for 0 ≤ t ≤ 2.140) = I[0.) for (See Exercise 21 (Exercise 8.31 (11.141) (Solution on p.28 (Solution on p. 340. 6 6 fX (t) = I[0. 2 − t}.1] (u) Exercise 11.27 (Solution on p. 0 ≤ u ≤ min{1 + t.137) fX (t) = I[0. 3 − t}. fXY (t. u) = 12 227 (3t + 2tu).) for (See Exercise 17 (Exercise 8.142) fY (u) = I[0.2] (u) 9 6 21 + u − u2 13 13 52 (11. fXY (t.) (See Exercise 19 (Exercise 8.136) Exercise 11.21) from "Problems On Random Vectors and Joint Distributions"). fY (u) 3 I(1.17) from "Problems On Random Vectors and Joint Distributions"). 12 3 120 t + 5t2 + 4t + I(1.1] (t) 3 t2 + 2u + I(1. 24 6 3t2 + 1 + I(1.2] (t) 14 t2 u2 .22) from "Problems On Random Vectors and Joint Distributions").1] (u) Exercise 11. 12 12 12 2 2 t + I(1.1] (u) 23 (2u + 1) + (11.20) from "Problems On Random Vectors and Joint Distributions").1] (u) 24 6 (2u + 3) + I(1. fXY (t. t}.
m (Section 17. 341.7 0. Y.0495 0.1320 0. M 4 = [50. $13 each If the number of purchasers is a random variable quantity X.0077 Table 11. Do this using icalc.0231 0.1089 0. Determine E [W ].) The (See Exercise 5 (Exercise 10. Exercise 11.148) E [Z] and X ∼ Poisson (75). MATHEMATICAL EXPECTATION for 0 ≤ u ≤ 1. {X.5) from "Problems on Functions of Random Variables") The cultural committee of a student organization has arranged a special deal for tickets to a concert.147) M 1 = [10. M 2 = [20. identically distributed) with common X = [−5 − 1 3 4 7] Let P X = 0.0216 0.0090 0. M 3 = [30. {X. then repeat with icalc3 and compare results.1 2.0203 0.5 0.144) fX (t) = I[0. ∞) .4 Let Z = g (X.5 4.0363 0. ∞) .4 0.0891 0.0726 2.0594 0. (11.0484 1.) has the joint distribution (in mle npr08_07. $18 each.1 0. the total cost (in dollars) is a random Z = g (X) described by g (X) = 200 + 18IM 1 (X) (X − 10) + (16 − 18) IM 2 (X) (X − 20) + (15 − 16) IM 3 (X) (X − 30) + (13 − 15) IM 4 (X) (X − 50) where Suppose (11. 2130 $16 each.2] (t) t2 . Y } Approximate the Poisson distribution by truncating at 150.8 3.0189 0.0324 0. Z} of random variables is iid (independent. Y = { X 2Y for for X +Y ≤4 X +Y >4 = IM (X. ∞) .145) W = 3X − 4Y + 2Z .) W = g X. 3150. Y ) = 3X 2 + 2XY − Y 2 .149) t = u = 7.0132 3.150) . Y } in Exercise 11.1] (t) Exercise 11. 340. fY (u) = + u + u2 0 ≤ u ≤ 1 8 14 8 4 2 (Solution on p. Determine Exercise 11.146) (11. 51100.35. Exercise 11.2 0.35 The pair (Solution on p.) {X.0440 0. 342. Y = u) (11. $15 each.33 The class distribution 3 3 2 3 1 3 t + 1 + I(1. Additional tickets are available according to the following schedule: 1120.9 0.0510 0.0405 0. Y ) X + IM c (X. agreement is that the organization will purchase ten tickets at $20 each (regardless of the number of individual buyers).0396 0 0.0297 0 4.38: npr08_07)): P (X = t.0528 0.332 CHAPTER 11. Y ) 2Y (11.36 For the pair (Solution on p.34 (Solution on p.8. E Z2 . let Determine E [Z] and E Z2 . ∞) (11. 341.01 ∗ [15 20 30 25 10] (11.0 3.
108 0.8 0. u) = 3 88 2t + 3u2 for 0 ≤ t ≤ 2. 3 − t} (see Exercise 11. Y ) (X + Y ) + IM c (X. 0 ≤ u ≤ 1 + t (see Exercise 11.4 0. 342. Y ) X + IM c (X. 2 − t)} (11.) for fXY (t. For the distributions in Exercises 3741 below a. u) : u ≤ min (1.045 0.1] (X) 4X + I(1. Y ) XY. Y. Exercise 11. 343.39 3 23 (11.27).12 1. of Random Variables".) (See Exercise 16 (Exercise 10.) fXY (t.) fXY (t. 342.) fXY (t. 342. 1 Z = IM (X.28).333 Determine E [W ] and E W2 E [Z] . u) = (t + 2u) 0 ≤ t ≤ 2. 0 ≤ u ≤ min{1 + t. X = −2IA + IB + 3IC . 343. u) : u > t} 2 Exercise 11. Y ) (X + Y ) + IM c (X. 0 ≤ u ≤ min{2. The class (11. Z} is independent. u ≥ 1} (11.025 0. u) : max (t. 0 ≤ u ≤ max{2 − t.42 The class M = {(t. E.30).156) {D.8.16) from "Problems on Functions {X. b. 344.375 0.41 M = {(t. M = {(t. t} (see Exercise 11.5 W = X 2 + 3XY 2 − 3Z . mle npr10_16.012 0. Determine analytically and E Z2 .) for fXY (t.m (Section 17. Use a discrete approximation to calculate the same quantities.2] (X) (X + Y ) Exercise 11. F } is independent with P (D) = 0. u) = 12 227 (3t + 2tu).7 0.32 P (E) = 0. 2 − t} (see Exercise 11. Determine E [W ] and E W2 .157) Z has distribution Value Probability 1. Exercise 11. M = {(t. Z = I[0.153) (Solution on p. for 0 ≤ t ≤ 2.162 0.13 5. Y ) 2Y 2 . Y ) 2Y. 0 ≤ u ≤ min{1. u) ≤ 1} Exercise 11.56 P (F ) = 0. .42: npr10_16)) Minterm probabilities are (in the usual order) 0.29). u) = 12 179 3t2 + u . Y ) X + IM c (X. Y ) Y 2 . Z = IM (X.40 (11.40 (11.3 0.24 2. u) : t ≤ 1.25).38 (11. u) = 24 11 tu 0 ≤ t ≤ 2.43 3.08 Table 11. Z = IM (X.37 (Solution on p.152) (Solution on p. 2} (see Exercise 11.018 Y = ID + 3IE + IF − 3. Z = IM (X.154) (Solution on p. Exercise 11. for 0 ≤ t ≤ 2.255 0.2 0.155) (Solution on p.151) (Solution on p.
coefficients in c canonic Enter row vector of coefficients c Enter row vector of minterm probabilities pm Use row matrices X and PX for calculations Call for XDBN to view the distribution EX = X*PX' EX = 2. pc') npr07_02 Data are in T.PX] = csort(T. disp('Minterm probabilities in pm.2) from "Problems on Distribution and Density Functions" T = [3.PX] = csort(T.30: npr07_01) % Data for Exercise 1 (Exercise~7.5].px] = csort(T. c = [1 1 1 1 0].pc). pc = 0.m (Section~17.001*[5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302].12) from "Problems on Random Variables and Probabilities" pm = 0. 326) % file npr07_02. pc = 0. PX ex = X*PX' ex = 2.9890 T = sum(mintable(4)).3 (p.pm). disp('Data are in T and pc') npr07_01 Data are in T and pc EX = T*pc' EX = 2.5 7. [x.8. pc EX = T*pc' EX = 5.0 3.7000 [X.1 (p.5 5. coefficients in c') npr06_12 Minterm probabilities in pm.9890 .28: npr06_12) % Data for Exercise 12 (Exercise~6.m (Section~17. MATHEMATICAL EXPECTATION Solutions to Exercises in Chapter 11 Solution to Exercise 11.0 3.2 (p.31: npr07_02) % Data for Exercise 2 (Exercise~7.5 7.5 5.01*[ 8 13 6 9 14 11 12 7 11 9]. 326) % file npr06_12.8.3500 Solution to Exercise 11.1) from "Problems on Distribution and Density Functions" T = [1 3 2 3 4 2 1 3 5 2]. disp('Data are in T. ex = x*px ex = 2. ex = X*PX' ex = 5.7000 Solution to Exercise 11.8.01*[10 15 15 20 10 5 10 15]. % Alternate using X.0 5.334 CHAPTER 11.3500 [X.m (Section~17.pc). 326) % file npr07_01.
5594 Solution to Exercise 11.5000 EX2 = (X.2).335 Solution to Exercise 11. PY. 327) Poisson (7). Y.5 (p.11 (p. u.32: npr08_01) Data in Pn. X. E [X] = 7. 326) X∼ X∼ N∼ X∼ X∼ binomial (127.exp(30/50)) % Theoretical value ez = 22. 327) a E [Y ] = g (t) fX (t) dt = 0 tλe−λt dt + aP (X > a) = (11. 1 2 Solution to Exercise 11. and P EX = X*PX' EX = 0.*P) ex = 0. Solution to Exercise 11.4 (p. P. t.7 (p. 326) binomial (20.1538 EY = Y*PY' EY = 0. 327) E [X] = tfX (t) dt = 6 5 t3 dt + 0 6 5 2t − t2 dt = 1 11 10 (11.8.10 (p. E [X] = 127 · 0.1629 % Alternate . PX. Solution to Exercise 11.8 (p.*(X<=30) + 30*(X>30).^2)*PX' EX2 = 0.9 (p.158) Solution to Exercise 11. 000.2 = 10. E [N ] = 50 · 0.0541 Solution to Exercise 11.5594 ez = 50*(1 . E [X] = 500 − 30E [N ] = 200.1538 ex = total(t. 0.0002 = 100. 327) gamma (20. Y jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X. 0.160) tappr Enter matrix [a b] of xrange endpoints [0 1000] Enter number of x approximation points 10000 Enter density as a function of t (1/50)*exp(t/50) Use row matrices X and PX as in the simple case G = X.0083 = 1. 0. 1/2). 327) binomial (50.5 = 10. 327) npr08_01 (Section~17. EZ = G8PX' EZ = 22.159) λ 1 1 − e−λa 1 − e−λa (1 + λa) + ae−λa = λ2 λ (11. E [X] = 20 · 0.0083). E [X] = 20/0. Solution to Exercise 11.0002).6 (p.
P.6176 = total(t.9643 EXY = total(t. Y.8.1667 EY2 = (Y.5000 EY = Y*PY' EY = 1..7500 EX2 = (X..6667 EXY = total(t.^2)*PY' EY2 = 4. 327) npr08_02 (Section~17.. P jcalc .5000 EX2 = (X...^2)*PX' EX2 = 54.. Y...0769 EY2 EY2 EXY EXY Solution to Exercise 11.... PY jcalc .^2)*PY' EY2 = 0.5000 EY = Y*PY' EY = 0.EX = X*PX' EX = 7 EY = Y*PY' EY = 3..*u..*u..EX = X*PX' EX = 0...8..34: npr08_03) Answers are in X..8333 .12 (p.13 (p. 328) npr08_03 (Section~17.... MATHEMATICAL EXPECTATION = (Y.... 328) npr08_04 (Exercise~8..*P) EXY = 7.^2)*PX' EX2 = 0..7500 EX2 = (X.*P) EXY = 0.2143 Solution to Exercise 11..Pn.. P jcalc .4) Answers are in X.336 CHAPTER 11..*u.5714 EY2 = (Y.*P) = 0.5833 Solution to Exercise 11. Y.^2)*PY' = 0..14 (p.33: npr08_02) Data are in X.EX = X*PX' EX = 3.^2)*PX' EX2 = 15.
8590 EY = Y*PY' EY = 1.3696 EY = Y*PY' EY = 3.*u.1455 EX2 = (X.17 (p.8..6115 EXY = total(t.0000 EY = Y*PY' EY = 1. PY jcalc ..^2)*PX' EX2 = 9..4839 EXY = total(t..8495 EY2 = (Y.EX = X*PX' EX = 1.. P jcalc .6803 Solution to Exercise 11. Y.^2)*PY' EY2 = 15..337 EY2 = (Y.8. Y.18 (p......EX = X*PX' EX = 7.7644 EY2 = (Y.0344 EX2 = (X.. 328) npr08_06 (Section~17. 328) npr08_07 (Section~17. 329) ..37: npr08_06) Data are in X..^2)*PX' EX2 = 5.*P) EXY = 4.^2)*PY' EY2 = 19..1423 Solution to Exercise 11.15 (p.....^2)*PY' EY2 = 11.36: npr08_05) Answers are in X. Y. 328) npr08_05 (Section~17......1667 Solution to Exercise 11..8.. P jcalc . P.EX = X*PX' EX = 0.*P) EXY = 3.16 (p...38: npr08_07) Data are in X..4583 Solution to Exercise 11.*u..
3t))&(u>= max(1t.5*(u<=min(t+1. E X 2 = E Y 2 = 7/6 1+t 2 3−t (11.8.1667 (use t.20 (p.. P jcalc . 330) 1 E [X] = 0 2t (1 − t) dt = 1/3...5564 EXY = total(t..*P) EXY = 22.^2)*PX' EX2 = 4.1684 EY2 = 1.8... E [Y ] = 2/3.0000 EY = 1..8050 EX2 = (X.1000 EY = Y*PY' EY = 3. E X 2 = 1/6. u.*u..6667 EX2 = 0.1687 EXY = 1..t1)) EX = 1.19 (p.164) tuappr: [0 2] [0 2] 200 200 0. 329) npr08_09 (Section~17.0800 EY2 = (Y.. MATHEMATICAL EXPECTATION npr08_08 (Section~17.^2)*PY' EXY = total(t..EX = X*PX' EX = 10.39: npr08_08) Data are in X.3333 EY = 0.EX = X*PX' EX = 1... P) Solution to Exercise 11.9850 EXY = 5.^2)*PX' EX2 = 133...338 CHAPTER 11. E Y 2 = 2/3 1 2(1−t) (11.*u.6667 EXY = 0.2890 Solution to Exercise 11.162) tuappr: [0 1] [0 2] 200 400 u<=2*(1t) EX = 0..9250 EY = Y*PY' EY = 2.161) E [XY ] = 0 0 tu dudt = 1/6 (11...21 (p.*P) EY2 = 8.0016 EX2 = (X.0002 . Y.0002 EX2 = 1..^2)*PY' EY2 = 41. 330) 1 t E [X] = E [Y ] = 0 t2 dt + 1 1 2t − t2 dt = 1.0375 EY2 = (Y.40: npr08_09) Data are in X.1410 Solution to Exercise 11.1667 EY2 = 0.. Y.163) E [XY ] = (1/2) 0 1−t dudt + (1/2) 1 t−1 dudt = 1 (11. P jcalc ..
E Y2 = .2277 EY2 = 3.5822 EX2 = 1.5000 EY2 = 0. E Y 2 = .0368 EY2 = 0.1141 EXY = 2.1667 EX2 = 1. E [XY ] = 5 15 5 5 5 (11.*(u<1+t) EY = 1. 330) E [X] = 11 2 3 2 2 .165) tuappr: EX = 0.167) tuappr: EX = 1.166) E [XY ] = 1 8 2 0 0 2 t2 u + tu2 dudt = (11.5098 .171) tuappr: EX = 0. E Y 2 = .*(u<= min(1+t. E X 2 = . E X 2 = 1/2. E X2 = .^2). E [XY ] = 55 55 55 5 55 (11.*(1u) EX2 = 0.4004 EXY = 0.7342 EX2 = 0. E X2 = . 330) EXY = 0.6667 EY2 = 1.*u.5000 EY = 0.169) tuappr: EX = 1.6009 EXY = 0.168) tuappr: [0 6] [0 1] 600 200 4*u.9458 [0 2] [0 1] 400 200 (24/11)*t.4016 EY2 = 0.2t)) EY = 0. E [Y ] = . E [Y ] = 1/3. 330) E [X] = E [Y ] = 1 4 2 t2 + t dt = 0 7 .6667 [0 1] [0 1] EY = 0.3333 200 200 4*t.4229 [0 2] [0 3] 200 300 (3/88)*(2*t + 3*u.25 (p.1667 [0 2] [0 2] 200 200 (1/8)*(t+u) EY = 1. 330) ∞ E [X] = 0 2te−2t dt = 2 1 1 1 1 .5000 Solution to Exercise 11. E [Y ] = .26 (p.6202 EX2 = 2. 331) E [X] = 52 32 57 2 28 .1667 EXY = 0.4035 EY = 0.*(u<=min(1. E [Y ] = .4998 EY2 = 0.*u.27 (p.2222 Solution to Exercise 11. 330) E [X] = 2/3.4021 Solution to Exercise 11.23 (p.6667 EXY = 1. E [XY ] = 220 880 22 55 880 (11. E [Y ] = .4415 Solution to Exercise 11.24 (p. E Y 2 = 1/6 E [XY ] = 2/9 (11. E Y 2 = .3333 Solution to Exercise 11.3333 E [X] = 313 1429 49 172 2153 . E [XY ] = 2 3 2 2 3 (11.22 (p.6667 EX2 = 0.*(u>= max(0. E X 2 = E Y 2 = 5/3 6 4 3 (11.^2.170) tuappr: [1 1] [0 1] 400 200 12*t.t)).1)) EX = 0.*exp(2*t) EX = 0.339 Solution to Exercise 11. E X 2 = .
9596 EX2 = 1.174) tuappr: [0 2] [0 2] 400 400 (12/227)*(3*t + 2*t.7251 EY2 = 1. E [XY ] = 1135 2270 227 1135 3405 (11.*u. E Y2 = .1417 EXY = 1. x = [5 1 3 4 7].^2.01*[15 20 30 25 10]. E Y2 = . E [XY ] = 13 12 130 78 390 (11.6849 EY2 = 1.7745 Solution to Exercise 11.*(u<=max(2t.9169 EX2 = 1. icalc Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter X probabilities px Enter Y probabilities px Use array operations on matrices X. E X2 = . E Y2 = .0848 EY = 0. E [XY ] = 46 23 230 230 230 (11.5450 Solution to Exercise 11.2309 [0 2] [0 2] 400 400 (2/13)*(t + 2*u). E [Y ] = .0967 EY2 = 1.*u). 331) E [X] = 16 11 219 83 431 .*(u<=min(1+t.3t)) EX = 1.340 CHAPTER 11.33 (p.31 (p.32 (p. E [Y ] = .176) tuappr [0 2] [0 1] 400 200 (3/8)*(t. E [Y ] = . E [Y ] = .0944 Solution to Exercise 11. E [XY ] = 224 16 70 240 448 (11.5292 EXY = 0.^2+2*u).^2).0122 Solution to Exercise 11.3t)) EY = 0. E X2 = . u. 332) Use x and px to prevent renaming.*(t<=1) + (9/14)*(t.2923 EY = 0.0974 EX2 = 2.172) tuappr: EX = 1. E [XY ] = 1790 895 895 895 1790 (11. t.9119 EY2 = 1.0239 EXY = 1. 331) E [X] = 243 11 107 127 347 .29 (p.173) tuappr: [0 2] [0 2] 400 400 (12/179)*(3*t.5286 EY2 = 0. E Y2 = .*(u<=min(2*t. 331) E [X] = 2313 778 1711 916 1811 .2)) EX = 1. E [Y ] = .*(t > 1) EX = 1. PX. and P G = 3*t 4*u.5120 EXY = 1.t)) EY = 0.30 (p.0647 EXY = 1.1518 [0 2] [0 2] 200 200 (3/23)*(t + 2*u).6875 EX2 = 1.1056 Solution to Exercise 11. E X2 = . MATHEMATICAL EXPECTATION Solution to Exercise 11.*(u<=min(2.28 (p. PY.8695 EX2 = 1. px = 0.175) tuappr: EX = 1. 331) E [X] = 53 22 397 261 251 . . E X2 = .3805 EY = 1. Y. 331) E [X] = 2491 476 1716 5261 1567 .^2 + u). E X2 = . E Y2 = .
10).^2.16)*(X. PY.P)..50). v. [Z..6500 icalc3 % Solution with icalc3 Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter row matrix of Zvalues x Enter X probabilities px Enter Y probabilities px Enter Z probabilities px Use array operations on matrices X.*u . (15 . Y.18)*(X ..20).4*u + 2*v.*P) EG2 = 1. PX.PZ] = csort(G.PZ] = csort(G. 332) npr08_07 (Section~17.. Z. % Alternate EZ = Z*PZ' .*P) EG = 5. icalc Enter row matrix of Xvalues R Enter row matrix of Yvalues x Enter X probabilities PR Enter Y probabilities px Use array operations on matrices X. EK = total(K. % Alternate EW = W*PW' EW = 1. PX = ipoisson(75.. PY.. t.3699e+06 Solution to Exercise 11. u.. and P K = 3*t .^2)*PZ' EZ2 = 1.^2.u.X).*P) EH = 1. EG = total(G.30). t. PX.341 [R. and P H = t + 2*u.6500 [W.G = 3*t.8.*(X>=20) + .0868e+03 [Z. PZ.2975 ez2 = total(G.^2 + 2*t. u.PR] = csort(G.PW] = csort(H..*P) EK = 1.PX).P).1650e+03 EZ2 = (Z. G = 200 + 18*(X .34 (p. EZ = Z*PZ' EZ = 1. Y.P). EH = total(H.*(X>=10) + (16 .6500 Solution to Exercise 11. P jcalc . 332) X = 0:150.35 (p.*(X>=30) + (13 .. Y.15)*(X .*(X>=50).38: npr08_07) Data are in X.
1278 Solution to Exercise 11. % Alternate EW = W*PW' EW = 4.2975 EZ2 = (Z. 333) E [Z] = E Z2 = 3 88 1 0 1 0 0 0 1+t 4t 2t + 3u2 dudt + 1+t 2 3 88 3 88 2 1 2 1 0 1+t (t + u) 2t + 3u2 dudt = 1+t 0 2 5649 1760 4881 440 (11.*(t+u<=4) + 2*u.*(u<=min(1.*u.37 (p.*(t<=1) + (t + u).0868e+03 Solution to Exercise 11. EG = total(G. 333) E [Z] = E Z2 = 12 11 6 11 1 0 1 0 t t 1 t2 u dudt + 1 24 11 24 11 1 0 1 0 0 t tu3 dudt + t 24 11 24 11 2 1 2 1 0 2−t tu3 dudt = 2−t 16 55 39 308 (11. EZ = 0.^2. 333) E [Z] 3 23 2 1 3 = (t + u) (t + 2u) dudt + 23 0 0 t 2u (t + 2u) dudt = 175 92 1 1 1 3 23 1 0 2−t 1 2u (t + 2u) dudt + (11.*(u<=t).181) .38 (p.180) tuappr: [0 2] [0 1] 400 200 (24/11)*t. 332) H = t.^2)*PZ' EZ2 = 1.178) tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t+3*u.*(t+u>4).36 (p.177) 3 88 (4t) 2t + 3u2 dudt + (t + u) 2t + 3u2 dudt = (11.0872 Solution to Exercise 11.7379 EW2 = (W. MATHEMATICAL EXPECTATION EZ = 5.2086 EG2 = total(G.*(u<=1+t) G = 4*t.^2)*PW' EW2 = 61.PW] = csort(H.*(t>1).4351 Solution to Exercise 11.39 (p.*P) EH = 4.342 CHAPTER 11.^2.^2).7379 EH2 = total(H.*P) EH2 = 61.*P) EG2 = 11.2920 EZ2 = 0.^2. EH = total(H.2t)) G = (1/2)*t.4351 [W.179) t3 u dudt + tu5 dudt + 0 tu5 dudt = 0 (11.P).*P) EG = 3.*(u>t) + u.
EZ = total(G.5224 Solution to Exercise 11.*P) EZ = 1.*(u<=max(2t.*u.2)) M = u <= min(1.188) E Z2 = 56673 15890 (11.5898 EZ2 = total(G.^2.u)<=1.*u).187) tu (3t + 2tu) dudt + 1 tu (3t + 2tu) dudt = 2−t (11.^2.184) E Z2 = 12 179 1 0 1 2 (t + u) 12 179 12 179 1 1 (11.*(1M).*(u <= min(2.3t)) M = (t<=1)&(u>=1).6955 EZ2 = total(G.^2.183) 2u2 3t2 + u dudt = 3t2 + u dudt + 2 1 0 3−t (11.2t).M). G = (t + u).9048 EZ2 = total(G.*P) EZ2 = 4.*(1 .*P) EZ = 1.*P) EZ2 = 4.*P) EZ2 = 3. EZ = total(G.4963 Solution to Exercise 11.^2.186) tuappr: [0 2] [0 2] 400 400 (12/179)*(3*t.185) 4u4 3t2 + u dudt = 28296 6265 (11. G = (t+u). 333) E [Z] = 12 179 1 0 1 2 (t + u) 3t2 + u dudt + 12 179 2 1 2 0 3−t 12 179 1 0 0 1 2u2 3t2 + u dudt+ 1422 895 4u4 3t2 + u dudt+ 0 0 (11.*(u <= min(1+t.*(1 . 333) E [Z] = 12 227 1 0 12 227 1+t 1 0 0 1 t (3t + 2tu) dudt + 12 227 12 227 2 1 2 1 2 0 2−t t (3t + 2tu) dudt + 5774 3405 (11.189) tuappr: [0 2] [0 2] 400 400 (12/227)*(3*t + 2*t.*P) EZ = 1.40 (p. EZ = total(G.343 E [Z 2 ] 3 23 2 1 3 = (t + u)2 (t + 2u) dudt + 23 0 0 t 4u2 (t + 2u) dudt = 2063 460 1 1 1 3 23 1 0 2−t 1 4u2 (t + 2u) dudt + (11.*M + t.*M + 2*u.41 (p.5659 .*M + 2*u.M).t)) M = max(t.182) tuappr: [0 2] [0 2] 400 400 (3/23)*(t+2*u). G = t.^2 + u).
^2 + 3*t.3*v.8..42 (p. PZ [X. PY. [W. PZ . [Y. v. Z.PX] = canonicf(cx. t.42: npr10_16) Data are in cx. u. EW = W*PW' EW = 1. PX.pmy). and P G = t. 333) npr10_16 (Section~17. PY. Z.8673 EW2 = (W. Z.pmx).. cy. pmy. pmx. MATHEMATICAL EXPECTATION Solution to Exercise 11.PY] = canonicf(cy.PW] = csort(G.. PZ. Y. PX.344 CHAPTER 11.P).^2 . Y. icalc3 input: X.8529 ..Use array operations on matrices X.*u..^2)*PW' EW2 = 426.
The variance is the probability weighted average of the square of these variations. Covariance. spread of the probability mass about the mean. 1 This content is available online at <http://cnx. and it weights large variations more heavily than smaller ones. We show below that the standard deviation is a natural measure of the variation from the mean. As in the case of mean value.1. the variance is a property of the distribution. and linear regression how it may be used to characterize the distribution for a pair considered jointly with the concepts 12. in which case E (X − E [X]) 2 = E X 2 − E 2 [X] (12.2) This shows that the mean value is the constant which best approximates the random variable. have been found particularly useful. In particular. {X. we show that E (X − c) 2 is a minimum i c = E [X] . It would be helpful to have a measure of the Among the possibilities. its variation from the mean is X (ω) − µX . the standard deviation.Chapter 12 Variance. we deal In subsequent units.1 Variance1 In the treatment of the mathematical expection of a real random variable locates the center of the probability mass distribution induced by X X.6/>. Linear Regression 12. 345 . but provides limited information.org/content/m23441/1.1 Variance Denition. Y } X.1) standard deviation for X If is the positive square root σX of the variance. In the treatment of mathematical expectation. rather than of the random variable. the variance and its square root. we examine how expectation may be used for further characterization of the distribution for with the concept of variance and its square root the standard deviation. we show covariance. in the mean square sense. value: Location of the center of mass for a distribution is important. In this unit. The square of the error treats positive and negative variations alike. Two markedly dierent random variables may have the same mean value. The variance of a random variable X is the mean square of its variation about the mean 2 Var [X] = σX = E (X − µX ) The 2 where µX = E [X] (12. we note that the mean value on the real line. Remarks • • • • • X (ω) is the observed value of X.
3).346 CHAPTER 12. More generally. sinceE IAi IAj = 0i = j . Indicator function X = IE P (E) = p. Var k=1 The term ak Xk = k=1 a2 Var [Xk ] + 2 k i<j ai aj (E [Xi Xj ] − E [Xi ] E [Xj ]) (12.2). Some Mathematical Aids. The result quoted above gives an alternate (V1): (V2): Calculating formula. If the cij = E [Xi Xj ] − E [Xi ] E [Xj ] is the covariance of the pair {Xi . unshifted distribution about the original center of mass. For one thing. expression. The variation of the shifted distribution about the shifted center of mass is the same as the variation of the original. Xj }. Var [X] = E X 2 − E 2 [X]. Var [aX] = a2 Var [X]. In addition. (12. cij are all zero. as examples in the next section show. Simple random variable X = Var [X] = n i=1 ti IAi n (primitive form) P (Ai ) = pi . Var [X + b] = Var [X]. Multiplication of factor The squares of the variations are multiplied by a2 . q = 1 − p E [X] = p 2 E X 2 − E 2 [X] = E IE − p2 = E [IE ] − p2 = p − p2 = p (1 − p) = pq (12. Var [aX ± bY ] = a2 Var [X] + b2 Var [Y ] ± 2ab (E [XY ] − E [X] E [Y ]) n n b. If the ai = ±1 and all pairs are uncorrelated. Adding a constant b to X shifts the distribution (hence its center of mass) by that amount. COVARIANCE. (Section 17.5) 2. whose role we study Remarks • • If the pair {X. The converse is not true. it is uncorrelated. We calculate variances for some common distributions.3) in the unit on that topic. Y } is independent. then n n Var k=1 ai Xi = k=1 Var [Xi ] (12. VARIANCE. Some details are omittedusually details of algebraic manipulation or the straightforward evaluation of integrals. a.4) The variance add even if the coecients are negative. Basic patterns for variance Since variance is the expectation of a function of the random variable tation in computations. innite series or values of denite integrals.6) t2 pi qi − 2 i i=1 i<j ti tj pi pj . Shift property. (V3): Change of scale. In some cases we use well known sums of A number of pertinent facts are summarized in Appendix The results below are included in the table in Appendix C Variances of some discrete distributions 1. we utilize properties of expecE (X − µX ) 2 . B (Section 17. LINEAR REGRESSION X. while the variance is dened as this is usually not the most convenient form for computation. (V4): Linear combinations a. we nd it expedient to identify several patterns for variance which are frequently useful in performing calculations. X by constant a changes the scale by a So also is the mean of the squares of the variations. we say the class is uncorrelated.
c c Var [X] = E X 2 = −c Now. λ)fX (t) = 1 α α−1 −λt e t Γ(α) λ t E X2 = Hence 1 Γ (α) λα tα+1 e−λt dt = 0 Γ (α + 2) α (α + 1) = λ2 Γ (α) λ2 (12. c). Then the distribution is symmetric triangular (−c. Uniform on (a. p.347 3. 2 =e −µ k=2 µk k (k − 1) + µ = e−µ µ2 k! ∞ k=2 µk−2 + µ = µ2 + µ (k − 2)! (12. q2 2 + q/p − (q/p) = q/p2 p2 (12. Binomial (n.12) fX (t) = 3. t2 fX (t) dt = 2 0 t2 fX (t) dt (12.14) Gamma(α. 346).7) Geometric (p). c = (b − a) /2.9) Poisson(µ)P (X = k) = e−µ µ ∀k ≥ 0 k! k Using E X 2 = E [X (X − 1)] + E [X]. Some absolutely continuous distributions 1. b)fX (t) = E X2 = 1 b−a 1 b−a a b < t < bE [X] = a+b 2 2 2 (12. c−t 0 ≤ t ≤ cso c2 that E X2 = 2 c2 c ct2 − t3 dt = 0 c2 (b − a) = 6 24 2 (12.10) Var [X] = µ2 + µ − µ2 = µ. t ≥ 0E [X] = 1/λ ∞ E X2 = 0 4.15) Var [X] = α/λ2 . ∞ we have E X Thus. Note that both the mean and the variance have common value µ.13) Exponential (λ) fX (t) = λe−λt .11) t2 dt = a b3 − a3 (a + b) (b − a) b3 − a3 soVar [X] = − = 3 (b − a) 3 (b − a) 4 12 2. b) Because of the symmetry Because of the shift property (V2) ("(V2)". . Var [IEi ] = i=1 pq = npq (12. λt2 e−λt dt = ≥ 0E [X] = ∞ 2 2 sothatVar [X] = 1/λ λ2 α λ (12. we may center the where distribution at the origin. P (X = k) = pq k ∀k ≥ 0E [X] = q/p We use a trick: E X 2 = E [X (X − 1)] + E [X] ∞ ∞ E X2 = p k=0 k (k − 1) q k +q/p = pq 2 k=2 k (k − 1) q k−2 +q/p = pq 2 2 (1 − q) 3 +q/p = 2 q2 +q/p p2 (12.8) Var [X] = 2 5. X = n i=1 IEi with{IEi : 1 ≤ i ≤ n}iidP (Ei ) = p n n Var [X] = i=1 4. p). Symmetric triangular (a. in this case.
SOLUTION The net gain may be expressed X = 10IA + 8IB + 20IC − 6. VARIANCE.9: Expectation of a function of ) from "Mathematical Expectation: Simple Random Variables") X X Suppose X in a primitive form is X = −3IC1 − IC2 + 2IC3 − 3IC4 + 4IC5 − IC6 + IC7 + 2IC8 + 3IC9 + 2IC10 with probabilities Let (12. VX = sum(c. 0. 0.19) P (Ci ) = 0.4200 VG = (G.EG^2 VG = 40.18) c = [10 8 20].9900 Example 12.13. 0.pc). We may use MATLAB to perform the arithmetic.8: Expected winnings) from "Mathematical Expectation: Simple Random Variables") A bettor places three bets at $2.*q) VX = 58.20.1: Expected winnings (Example 8 (Example 11. pc = 0.EZ^2 VZ = 40.01*[8 11 6 13 5 8 12 7 14 16]. g (t) = t2 + 2t. We may reasonbly suppose the class in computing the mean).8036 [Z.12. C} is independent (this assumption is not necessary Var [X] = 102 P (A) [1 − P (A)] + 82 P (B) [1 − P (B)] + 202 P (C) [1 − P (C)] Calculation is straightforward.06. E [Y ] = 0. Var [Y ] = √2 2π ∞ 2 −t2 /2 t e dt 0 = 1.15. COVARIANCE. and the third $20. the second $8.11.05. (12.07.^2)*pc' .10 (12.17) {A. examples and calculate the variances. The rst pays $10.20.2: A function of (Example 9 (Example 11.15. Determine E [g (X)] and Var [g (X)] c = [3 1 2 3 4 1 1 2 3 2].08. G = c.01*[15 20 10]. B. σ 2 E [X] = µ Consider 5.00 with probability 0. 0. P (B) = 0.14. We revisit some of those Example 12. 0.^2.16) X = σY + µimpliesVar [X] = σ 2 Var [Y ] = σ 2 Extensions of some previous examples In the unit on expectations.00 with probability 0. EZ = Z*PZ' EZ = 6.PZ] = csort(G.10.00 each.16. p = 0. P (C) = 0.00 with probability 0.8036 % Original coefficients % Probabilities for C_j % g(c_j) % Direct calculation E[g(X)] % Direct calculation Var[g(X)] % Distribution for Z = g(X) % E[Z] % Var[Z] . 0.4200 VZ = (Z.348 CHAPTER 12. we calculate the mean for a variety of cases.*p. 0. 0. Y ∼ N (0.^2 + 2*c EG = G*pc' EG = 6.p. LINEAR REGRESSION Normal µ. Then with P (A) = 0. 0. (12.08. q = 1 . 1) .^2)*PZ' .
4972 (by Maple) (12.Y)] VG = 80. u) = t2 + To set up for calculations.2529 VG = total(G.4] (t) t2 0.22: A function with a compound denition: truncated exponential) from "Mathematical Expectation.*P) . and P G = t. .Y)] EG = 3. % Determination of distribution for Z EZ = Z*PZ' % E[Z] from distribution EZ = 3.∞] (X) = 0 t4 0.4] (X) X 2 + I(4. PX.4] (X) X 4 + I(4. Y )) Expectation for g (X. % Calculation of matrix of [g(t_i.0562 (by Maple) (12.3e−0. from "Mathematical Expectation: Simple Random Variables" and let Z = g (t.10: Expectation for from "Mathematical Expectation: Simple Random Variables") We use the same joint distribution as for Example 10 (Example 11.3).PZ] = csort(G.10: Z = g (X.∞] (X) (12.349 Example 12. Y )) 2tu − 3u.4] (t) t4 0.25) To obtain a simple aproximation. we use jcalc.∞] (X) 16 (12.22) Z 2 = I[0. Let Z={ Determine X2 16 for for X≤4 X>4 = I[0. Y ) (Example 10 (Example 11.^2.24) Var [Z] = E Z 2 − E 2 [Z] ≈ 43. jdemo1 % Call for data jcalc % Set up Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.3e−0. General Random Variables") Suppose X∼ exponential (0.3e−0.20) E [Z] and Var [Z].EG^2 % Direct calculation of Var[g(X.23) E Z2 = 0 I[0. PY.EZ^2 % Var[Z] from distribution VZ = 80.*u . Y.3e−0. we must approximate by a bounded random variable.8486 APPROXIMATION (12.^2)*PZ' .2529 VZ = (Z.3t dt + 16P (X > 4) ≈ 7.P).*P) % Direct calculation of E[g(X. t.3t dt + 256e−1.^2 + 2*t. u.4: A function with compound denition (Example 12 (Example 11.21) = 0 t2 0.3t dt + 16E I(4. ∞ ANALYTIC SOLUTION E [g (X)] = g (t) fX (t) dt = 0 4 I[0.2 ≈ 100.3*u.2133 Example 12.3: Z = g (X.3t dt + 256E I(4. Since P (X > 50) = e−15 ≈ 3 · 10−7 we may safely truncate X at 50. u_j)] EG = total(G.2133 [Z.∞] (X) 256 ∞ 4 (12.
EZ^2 % Var[Z] VZ = 43. % Distribution for Z = g(X) EZ = Z*PZ' % E[Z] from distribution EZ = 7. VARIANCE.8486 [Z. additional units can be special ordered for s If the demand exceeds the amount m c ordered.28) E [X] = (p − s) E [D] + m (s − c) + (s − r) E [IM (D) D] − (s − r) mE [IM (D)] We use the discrete approximation. If demand is less than the amount ordered. X = D (p − c) − (m − D) (c − r) = D (p − r) + m (r − c) D > m.3*t) Use row matrices X and PX as in the simple case M = X <= 4.350 CHAPTER 12. Thus X = IM (D) [D (p − r) + m (r − c)] + [1 − IM (D)] [D (p − s) + m (s − c)] = D (p − s) + m (s − c) + IM (D) [D (p − r) + m (r − c) − D (p − s) − m (s − c)] = D (p − s) + m (s − c) + IM (D) (s − r) [D − m] Then (12. dollars per unit ( s > c). COVARIANCE.^2 + 16*(1 .*X.23: Stocking for random demand (see Exercise 4 (Exercise 10.PZ] = csort(G. LINEAR REGRESSION tappr Enter matrix [a b] of xrange endpoints [0 50] Enter number of x approximation points 1000 Enter density as a function of t 0. p = 50. c = 30. G = M. dollars per unit and sells for p A certain item costs dollars per unit.27) (12. n = 100.^2)*PX' .4972 VG = (G. s = 40. m]. % Approximate distribution for D .4) from "Problems on Functions of Random Variables")) from "Mathematical Expectation.PX).8472 % Theoretical = 43. Demand D r dollars per for the season is assumed to be a random variable with Poisson What amount distribution. where M = (−∞. Suppose µ = 50.8472 Example 12.4972 VZ = (Z.5: Stocking for random demand (Example 13 (Example 11. m should the manager (µ) order to maximize the expected prot? PROBLEM FORMULATION Suppose D is the demand and X is the prot. pD = ipoisson(mu.3*exp(0. % g(X) EG = G*PX' % E[g(X)] EG = 7.M). t = 0:n. APPROXIMATION (12.26) (12.t).EG^2 % Var[g(X)] VG = 43. r = 20. the remaining stock can be returned (or otherwise disposed of ) at unit ( r < c).^2)*PZ' . Then For For D ≤ m. X = m (p − c) + (D − m) (p − s) = D (p − s) + m (s − c) It is convenient to write the expression for X in terms of IM .29) mu = 50. General Random Variables") The manager of a department store is planning for the holiday season.
0944 2. Y ) = X 2 + 2XY .0153 0. VG = (G. m = 45:55. ANALYTIC SOLUTION on the triangular region bounded by E [Z] = (t2 + 2tu) fXY (t.m(i)). Let Z = g (X.0941 2.*u.0108 0.24: A jointly distributed pair) from "Mathematical Expectation.8880 0.0936 1.0939 1.VG'.4869 0. Y.0929 3.0130 0.0174 0.3343 0.7908 0.3117 0.*(t . PY. u) = 3u u = 0.^2.0145 0.Y) = X^2 + 2XY . and P G = t.32) tuappr Enter matrix [a b] of Xrange endpoints [1 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density 3*u.351 c = 30. u) dudt 1 1−t 3 0 0 u (t2 + 2tu) dudt = 1/10 0 1+t = 3 0 −1 1+t 0 u (t2 + 2tu) dudt + (12. end EG = G*pD'. r = 20. for i = 1:length(m) % Step by step calculation for various m M = t<=m(i).SG']') 1. Y } has joint density fXY (t.0942 1. t.1561 0.EG.0931 1. % g(X.0167 0. Determine E [Z] and Var [Z]. p = 50.^2 + 2*t. u = 1 + t. G(i.0943 2.*(u<=min(1+t.0137 0.0943 1.0179 Example 12.0934 3. PX. General Random Variables") Suppose the pair {X.0115 0. SG =sqrt(VG). s = 40. disp([EG'.6799 0.0112 0.0122 0. u.6: A jointly distributed pair (Example 14 (Example 11.0e+04 * 0.0938 2. u = 1 − t.^2)*pD' .31) Var [Z] = E Z 2 − E 2 [Z] = 53/700 ≈ 0.1075 0.1t)) Use array operations on X.30) E Z2 = 3 −1 0 u t2 + 2tu 2 1 1−t dudt + 3 0 0 u t2 + 2tu 2 dudt = 3/35 (12.0757 APPROXIMATION (12.:) = (ps)*t + m(i)*(sc) + (sr)*M.0160 0.5637 0.2206 0.
^2)*PZ' . Y } ≤ 1 max{X. and u = t − 1.^2.*P) % E[g(X.9306 (12. Y ) X + IQc (X.1006 % Theoretical value = 1/10 VG = total(G. X 2Y for for max{X.u)<=1. u) : t ≤ 1. E [W ] and Var [W ]. (u$gt.8333 t−1 (12.35) Var [W ] = 103/24 − (11/6) = 67/72 ≈ 0. ANALYTIC SOLUTION The intersection of the region Q and the square is the set for which 0 ≤ t ≤ 1 and 1 − t ≤ u ≤ 1.=max(1t.P).Y)] EG = 1. PY. G = t.8333 VG = total(G. 1 0 1 1 0 1 2 1+t 1+t Reference to Figure 11.352 CHAPTER 12. PX.9306 [Z. u) = 1/2 u = 1 − t. LINEAR REGRESSION EG = total(G.Y) EG = total(G..8340 VZ = (Z. Y } has joint density fXY (t.^2.33) Determine Q = {(t. Y ) 2Y (12.*M + 2*u.3t))& .0765 % Theoretical value 53/700 = 0.9368 .^2*PZ' . W ={ where on the square region bounded by u = 1 + t.36) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density ((u<=min(t+1. General Random Variables") The pair {X.. E [W ] = 1 2 1 0 1 t dudt + 1−t 1 0 1 1 2 2u dudt + 1 2 1 2 2 1 3−t 2u dudt = 11/6 ≈ 1. % Z = g(X.EZ^2 VZ = 0. Y.25: A function with a compound denition) from "Mathematical Expectation.8340 % Theoretical 11/6 = 1.34) E W2 = 1 2 t2 dudt + 1−t 4u2 dudt + 1 2 2 1 3−t 4u2 dudt = 103/24 t−1 (12. t.*(1 . VARIANCE. COVARIANCE.*P) .3.PZ] = csort(G.1006 VZ = Z. u.2 shows three regions of integration. Y } > 1 = IQ (X.Y)] EG = 0.7: A function with compound denition (Example 15 (Example 11.9368 % Theoretical 67/72 = 0.0757 [Z.0765 Example 12.*P) % E[g(X.M). u ≤ 1}.EG^2 VG = 0.EG^2 VG = 0. u} ≤ 1} = {(t. u = 3 − t. % Distribution for Z EZ = Z*PZ' % E[Z] from distribution EZ = 0.PZ] = csort(G.EZ^2 VZ = 0. u) : max{t. and P M = max(t.t1)))/2 Use array operations on X.P). % Distribution for Z EZ = Z*PZ' % E[Z] from distribution EZ = 1.*P) .
PZ] = csort(G. EG = total(G.39) t0 1 1−t 1 t2 E [Z] = 3 0 For t 0 dudt + 3 t0 t 0 dudt + 3 t0 1−t dudt = E Z 2 replace t by t2 in the integrands to get E Z 2 = (25t0 − 1) /20. we have the . the standard deviation from the mean.8176 VG = total(G.0540 % Theoretical = 0. E [X] = µ .40) For the normal distribution.0540 [Z.8176 EZ2 = (25*t0 .0540 tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density 3*(u <= t. Y ) X + IQc (X.*P) EG = 0." we show that if and X ∼ N µ. and P G = (t+u <= 1). EZ = Z*PZ' EZ = 0.*t + (t+u > 1).^2. Y.353 Example 12. u) : u + t ≤ 1} meet satises (12. Y ) The value Q = {(t. t.7225 VZ = (2125*t0 .1309)/80 VZ = 0. σ 2 then Z = Var [X] = σ 2 X−µ σ ∼ N (0.0540 Standard deviation and the Chebyshev inequality In Example 5 (Example 10. 1).37) Z = IQ (X.8: A function with compound denition fXY (t.8169 VZ = (Z. 0 3 (5t0 − 2) 4 (12.^2)*PZ' .0540. we get Var [Z] = (2125t0 − 1309) /80 ≈ 0. APPROXIMATION % Theoretical values t0 = (sqrt(5) .8169 % Theoretical = 0. Thus P X − µ ≤t σ = P (X − µ ≤ tσ) = 2Φ (t) − 1 σ (12.EZ^2 VZ = 0.EG^2 VG = 0.P). √ Using t0 = 5 − 1 /2 ≈ 0.38) t0 where the line u=1−t t2 and the curve u = t2 t2 = 1 − t0 . Also.5: The normal distribution and standardized normal distribution) from "Functions of a Random Variable.^2) Use array operations on X. For a general distribution with mean seems to be a natural measure of the variation away µ and variance σ2 .*P) .6180.6180 EZ = (3/4)*(5*t0 2) EZ = 0.1)/2 t0 = 0.1)/20 EZ2 = 0. u. u) = 3 on 0 ≤ u ≤ t2 ≤ 1 for (12.
1554 DERIVATION OF THE CHEBYSHEV INEQUALITY Let A = {X − µ ≥ aσ} = {(X − µ) ≥ a2 σ 2 }. COVARIANCE.4444 0.0124 12. A random variable X is centered i E [X] = 0.gaussian(0.1. A random variable is always centered. (12.t)).4945 2. Y }.1111 0. It may be instructive to compare the bound on the probability given by the Chebyshev inequality with the actual probability for the normal distribution. This inequality is useful in many theoretical applications as well as some practical ones. the standard deviation appears as a measure of the variation from the mean value. Upon taking expectations of both sides and using monotonicity.c.45) {X. p = 2*(1 .2500 0.0000 0. t = 1:0. Denition.44) X∗ = Denition.1600 0.1336 3.0027 41.5000 0. m = [t. where X∗ = X − µX Y − µY .^2). LINEAR REGRESSION Chebyshev inequality P X − µ ≥a σ ≤ 1 a2 or P (X − µ ≥ aσ) ≤ 1 a2 (12. since it must hold for any distribution which has a variance. Y∗ = σX σY (12. (12.1515 1. h = [' t Chebyshev Prob Ratio'].0000 0. A pair X' X −µ = σ σ is standardized {X. disp(h) t Chebyshev Prob Ratio disp(m) 1. r = c.354 CHAPTER 12.42) We consider three concepts which are useful in many situations. i X is standardized E [X] = 0 and Var [X] = 1./(t.3263 2. we have a2 σ 2 P (A) ≤ E (X − µ) from which the Chebyshev inequality follows immediately.41) In this general case.8831 3.0000 0.5000 0.43) X' = X − µ Denition.0455 5. Y } of random variables is uncorrelated i E [XY ] − E [X] E [Y ] = 0 It is always possible to derive an uncorrelated pair as a function of a pair variances. VARIANCE. the bound is not a particularly tight. Consider (12.p.5:3.3173 3.46) . c = ones(1. However.r]'.length(t)). both of which have nite U = (X ∗ + Y ∗ ) V = (X ∗ − Y ∗ ) .0000 1./p. = σ2 (12. 2 Then a2 σ 2 IA ≤ (X − µ) 2 2 .
*vv.Y} not uncorrelated VX = total(t.48) jdemo1 jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.EY^2 VY = 3.EX*EY c = 0. Y )) from "Mathematical Expectation: Simple Random Variables")). Y.EX)/SX.*u.9755e06 % Differs from zero because of roundoff .10: Expectation for from "Mathematical Expectation: (Example 10 (Example 11.3 ( Z = g (X.*P) .355 Now E [U ] = E [V ] = 0 and E [U V ] = E (X ∗ + Y ∗ ) (X ∗ − Y ∗ ) = E (X ∗ ) so the pair is uncorrelated.EY)/SY.10: Simple Random Variables" and Example 12.*P) EX = 0.*P) .*P) EXY = 0. EUV = total(uu.3016 VY = total(u. and P EX = total(t. Y )) Z = g (X. Y ) Expectation for Z = g (X. % Standardized random variables y = (u .^2.9: Determining an uncorrelated pair We use the distribution for Examples Example 10 (Example 11.6420 EY = total(u.^2.8170 SY = sqrt(VY) SY = 1. u. uu = x + y.6566 SX = sqrt(VX) SX = 1. for which E [XY ] − E [X] E [Y ] = 0 (12. PY.*P) % Check for uncorrelated condition EUV = 9.*P) EY = 0.y. t. PX. 2 − E (Y ∗ ) 2 =1−1=0 (12.1130 c = EXY .0783 EXY = total(t.EX^2 VX = 3.1633 % {X. % Uncorrelated random variables vv = x .47) Example 12.9122 x = (t .
≤s (12.1 Covariance and the Correlation Coecient The mean value µX = E [X] and the variance distribution for real random variable X. Can the expectation of an appropriate function of (X. If we let The quantity X ' = X − µX and Cov [X. (12. LINEAR REGRESSION 12.51) Y (ω) − µY is the variation of Y ω . Now for given its mean and used.52) Note that the variance of If we standardize. Y ] = E X ' Y ' of X and Y. Y ] = E [X ∗ Y ∗ ] = X is the covariance of X with itself.2.5/>. . with Denition. we have ρ2 = E 2 [X ∗ Y ∗ ] ≤ E (X ∗ ) Now equality holds i 2 E (Y ∗ ) 2 =1 with equality i Y ∗ = cX ∗ (12. Y ] = E [(X − µX ) (Y − µY )] is called the covariance Y ' = Y − µY be the centered random variables. correlation coecientρ = ρ [X. By Schwarz' inequality (E15). Y ∗ ) X−µX σX Y −µY σY • • We consider rst the distribution for the standardized pair Since P (X ∗ ≤ r. expand the expression (X − µX ) (Y − µY ) and use linearity to get E [(X − µX ) (Y − µY )] = E [XY − µY X − µX Y + µX µY ] = E [XY ] − µY E [X] − µX E [Y ] + µX µY which reduces directly to the desired expression.54) 1 = c2 E 2 (X ∗ ) We conclude 2 = c2 which implies c = ±1 and ρ = ±1 (12. Y ∗ ≤ s) = P ≤ r.org/content/m23460/1. Y ] /σX σY .56) = P (X ≤ t = σX r + µX . the following terminology is Denition. we have E [(X − µX ) (Y − µY )] σX σY (12.55) −1 ≤ ρ ≤ 1. Y ] is the quantity ρ [X.356 CHAPTER 12.49) We note 2 σX = E (X − µX ) 2 give important information about the information about the joint distribution? A clue to one possibility is given in the expression Var [X ± Y ] = Var [X] + Var [Y ] ± 2 (E [XY ] − E [X] E [Y ]) The expression also that for E [XY ] − E [X] E [Y ] vanishes if the pair is independent (and in some other cases). Y ≤ u = σY s + µY ) 2 This content is available online at <http://cnx. Y ) give useful (12. with Relationship between ρ = ± 1 i Y ∗ = ± X ∗ ρ and the joint distribution (X ∗ . We examine these concepts for information on the joint distribution. µX = E [X] and µY = E [Y ] E [(X − µX ) (Y − µY )] = E [XY ] − µX µY (12. VARIANCE.53) Thus ρ = Cov [X. X (ω) − µX is the variation of X from from its mean. For this reason. (12. COVARIANCE.2 Covariance and the Correlation Coecient2 12.50) To see this. The X ∗ = (X − µX ) /σX and Y ∗ = (Y − µY ) /σY . then Cov [X.
s) to the line s = r. ρ=1 ρ = −1 If i i X∗ = Y ∗ X ∗ = −Y ∗ i all probability mass is on the line i all probability mass is on the line −1 < ρ < 1. Y ∗ ) (ω) from the line s = r (i.. the variance W = Y + X . Figure 12.357 we obtain the results for the distribution for (X.e. The ρ = ±1 lines for the (X.57) (X ∗ .1 shows this is the average of the square of the distances of the points about the line Similarly for (r. then at least some of the mass must fail to be on these lines.1: Distance from point (r. s) = (X . s = r). Y ) by the mapping t = σX r + µX u = σY s + µY Joint distribution for the standardized variables (12. Then E 1 2 2Z = ∗ 1 2E (Y ∗ − X ∗ ) ∗ Reference to Figure 12.58) Z = Y ∗ − X ∗. Y ∗ ) (ω) s = r. (12. s = −r.60) .59) ∗ 1 1 2 2 2 E (Y ∗ ± X ∗ ) = {E (Y ∗ ) + E (X ∗ ) ± 2E [X ∗ Y ∗ ]} = 1 ± ρ 2 2 Thus 1−ρ 1+ρ Now since is the variance about is the variance about s = r (the ρ = 1 line) s = −r (the ρ = −1 line) E (Y ∗ − X ∗ ) 2 = E (Y ∗ + X ∗ ) 2 i ρ = E [X ∗ Y ∗ ] = 0 (12. Now (12. (r. Y ∗ ). E W 2 /2 is the variance about s = −r. Y ) distribution are: u − µY t − µX =± σY σX Consider or u=± 2 σY (t − µX ) + µY σX . s) = (X ∗ .
358 CHAPTER 12. ρ = 0 i the variances about both are the same.62) ρ = −1 line is: u − µY t − µX =− σY σX or u=− (12. {X.2: Uniform marginals but dierent correlation coecients. This fact can be veried by calculation. the pair cannot be independent.11: Uniform marginal distributions Figure 12. Y ] = 0.61) t = σX r + µX The ρ=1 line is: u − µY t − µX = σY σX The or u= σY (t − µX ) + µY σX σY (t − µX ) + µY σX (12. By the rectangle ρ = 1 line is u = t and the ρ = −1 line is the variance about each of these lines is the same. Example 12. . also. VARIANCE. if desired. Example 12. By symmetry. Thus ρ = 0.10: Uncorrelated but not independent Suppose the joint density for test. the condition Transformation to the (X.63) 1 − ρ is proportional to the variance abut the ρ = 1 line and 1 + ρ ρ = −1 line. Y ) plane u = σY s + µY r= t − µX σX s= u − µY σY (12. the is proportional to the variance about the u = −t. COVARIANCE. which is true i Cov [X. Y } is constant on the unit circle about the origin. By symmetry. LINEAR REGRESSION ρ=0 is the condition for equality of the two variances.
Again. In case (a).4216 σX σY 30 (12. The marginals are uniform on (1. In case (c) the two squares are in the second and fourth quadrants. Y ] = 10 ≈ 0.65) ρ = −0.9: Determining an uncorrelated pair) in "Variance.4 . (1. The actual value may be calculated to give ρ = 3/4. and u = 1. (1. (in fact the pair is independent) and E [XY ] = 0 Example 12. σX = 1. Since 1 − ρ < 1 + ρ.66) To complete the picture we need E [XY ] = Then 6 5 t2 u + 2tu2 dudt = 8/25 (12.1). and Var [Y ] = 3/80.0). E [XY ] < 0 and ρ < 0.*(u>=t) Use array operations on X. Var [X] = 3/50. and P EX = total(t. the two signs must be the same. examination of the gure conrms this. By symmetry. (1.9122 (12.1). 0 ≤ u ≤ 1 5 E [X] = 2/5. (1. Y. We use the joint distribution for Example 9 (Example 12. t. Y } has joint density function fXY (t. For every pair of possible values.1). (1. we have fX (t) = From this we obtain 6 1 + t − 2t2 .1633 = Cov [X. distribution is uniform over two squares. Since 1 + ρ < 1 − ρ. This is evident from the gure.1) and (0. 0 ≤ t ≤ 1 and fY (u) = 3u2 .04699.0). Example 12. E [XY ] > 0 which implies ρ > 0. (0. a. 1 0 t 1 (12. the distribution is uniform over In case (b).*P) EX = 0. Y ] .13: An absolutely continuous pair The pair by 6 {X. ρ = 0.1). PY.4012 % Theoretical = 0.0). the variance about the ρ = −1 line is less than that about the ρ = 1 line.67) Cov [X.1).64) ρ=1 line is u=t and the ρ = −1 u = −t.359 Consider the three distributions in Figure 12. (1. (0. b.1) in each case. in the rst and third quadrants with vertices (0. E [Y ] = 3/4. u) = 5 (t + 2u) on the triangular region bounded t = 0.2.68) tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (6/5)*(t + 2*u). u. so c. PX.1). (1. By the usual integration techniques. the the square centered at the origin with vertices at (1. so that in each case E [X] = E [Y ] = 0 This means the and Var [X] = Var [Y ] = 1/3 line is (12. the variance about the ρ = 1 line is less than that about the ρ = −1 line.0).12: A pair of simple random variables With the aid of mfunctions and MATLAB we can easily caluclate the covariance and the correlation coecient.8170 so that and σY = 1. u = t.1). Y ] = E [XY ] − E [X] E [Y ] = 2/100 APPROXIMATION and ρ= 4√ Cov [X." In that example calculations show E [XY ] − E [X] E [Y ] = −0.
4212 Coecient of linear correlation The parameter of linear correlation.14: Suppose X∼ uniform (1. Note that g could be any even function dened on (1.02 % Theoretical = 0. so that the value of the integral is zero. the value of Y is completely determined by the value of X ).0603 VY = total(u. Y = g (X) ρ is usually called the correlation coecient.72) Y. Xj ) = i=1 a2 Cov (Xi .71) n n n n X' = i=1 and similarly for ai Xi − i=1 ai µXi = i=1 ai (Xi − µXi ) = i=1 ' ai Xi (12. 346) on linear combinations. Xi ) + i i=j ai aj Cov (Xi . yet ρ = 0.0375 % Theoretical = 0.360 CHAPTER 12. so that (i.*P) .74) . Y ] = E [XY ] = Thus tcos t dt = 0 −1 (12.^2. Since by linearity of expectation. coecient The following example shows that all probability mass may be on a curve. Y ] and Var [X].*P) . Yj ) i. Y ) = E X ' Y ' = E i. VARIANCE.e.06 % Theoretical = 0.73) i. Xj ) (12.j ai aj Cov (Xi . COVARIANCE.*P) EY = 0. − 1 < t < 1 1 2 1 and Let Y = g (X) = cosX . so that Y = g (X) but ρ = 0 fX (t) = 1/2. n m with the centered random variables µX = i=1 we have ai µXi and µY = j=1 bj µYj (12.1).j n Var (X) = Cov (X. Example 12. p.EX^2 VX = 0. ' By denition Cov (X.. In this case the integrand tg (t) is odd.EY^2 VY = 0.j (12.70) X ' = X − µX and Cov [X. Variance and covariance for linear combinations We generalize the property (V4) ("(V4)". E [X] = 0.75 % Theoretical = 0.0201 rho = CV/sqrt(VX*VY) rho = 0.*u. It is convenient to work Y ' = Y − µy .^2.*P) .EX*EY CV = 0.7496 VX = total(t.4216 A more descriptive name would be EY = total(u. Consider the linear combinations n m X= i=1 We wish to determine ai Xi and Y = j=1 bj Yj (12.1).j In particular ' ai bj Xi Yj' = ' ai bj E Xi Yj' = ai bj Cov (Xi .0376 CV = total(t. X) = i. LINEAR REGRESSION % Theoretical = 0.69) ρ = 0. Then Cov [X.
3. Xj ) = aj ai Cov (Xj .3 Linear Regression3 12. A value X (ω) is observed. Error = Y − aX − b = (Y − µY ) − a (X − µX ) + µY − aµX − b = (Y − µY ) − a (X − µX ) − β (12. It is desired to estimate the corresponding value Y is a function of X. The problem is to determine the coecients a. The most common measure of error is the mean of the square of the error E Y−Y ^ 2 (12. Obviously there is no rule for determining Y (ω) unless on the average of some function of the errors. given X. then square and take the expectation.5: The regression problem). Xi ). That is. b such that E (Y − aX − b) 2 is a minimum (12.80) Error squared = (Y 3 This − µY )2 + a2 (X − µX )2 + β 2 − 2β (Y − µY ) + 2aβ (X − µX ) − 2a (Y − µY ) (X − µX ) content is available online at <http://cnx. The regression line of Y on X r We seek the best straight line function for minimizing the mean squared error. the expression for variance reduces to n Var [X] = i=1 a2 Var [Xi ] i (12.75) ai 2 does not depend upon the sign of ai . or is observed. That is. we seek a function r such that E (Y − r (X)) 2 is a minimum. we seek a function of the form u = r (t) = at + b. and by some rule an estimate Y (ω).76) 12. or are otherwise uncorrelated. (12. (12.81) .1.5/>. The error of the estimate is Y (ω) − Y (ω). If the Xi form an independent class. (Section 14.361 Using the fact that ai aj Cov (Xi . n we have Var [X] = i=1 Note that a2 Var [Xi ] + 2 i i<j ai aj Cov (Xi .77) The choice of the mean square has two important properties: it treats positive and negative errors alike.org/content/m23468/1.78) In the unit on Regression The problem of determining such a function is known as the regression problem. and it weights large errors more heavily than smaller ones. The best that can be hoped for is some estimate based on an average of the errors. we seek an important partial solution. we seek a rule (function) the estimate r such that Y (ω) ^ is r (X (ω)).79) We write the error in a special form. Suppose ^ X (ω) Y (ω) ^ is returned. In general. we show that this problem is solved by the conditional expectation of Y.1 Linear Regression Suppose that a pair {X. Y } of random variables has a joint distribution. Xj ) (12. At this point.
1−u 1 The regression curve is u = E [Y ] = 3 0 u2 u−1 dtdu = 6 0 u2 (1 − u) du = 1/2 (12. the preferred form. COVARIANCE.0495 b = EY . ANALYTIC SOLUTION By symmetry. Y ] Var [X] b = µY − aµX regression line of Y on X.15: The simple pair of Example 3 (Example 12.83) a= Thus the optimum line. VARIANCE. Y ] = 0.3016 CV = total(t.11 Example 12. Only the covariance (which requres both means) and the variance of X are needed.362 CHAPTER 12. this is the rst form is usually the more convenient. There is no need to determine Var [Y ] or ρ. Example 12.1633 a = CV/VX a = 0. PY. region bounded by E [X] = E [XY ] = 0.0783 VX = total(t. is (12.*P) . u = 1 + t.6420 EY = total(u.^2.0495t + 0.16: The pair in Example 6 (Example 12. t.a*EX b = 0. . Y )) from "Mathematical Expectation: Simple Random Variables")) from "Variance" jdemo1 jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X. u) = 3u on the triangular u = 0. But for calculation. and P EX = total(t. For certain theoretical purposes. Y ] (12. LINEAR REGRESSION E (Y − aX − b) 2 2 2 = σY + a2 σX + β 2 − 2aCov [X. Determine the regression line of Y on X.*P) EY = 0.3: Z = g (X. Y ) (Example 10 (Example 11.6: A jointly distributed pair (Example 14 (Example 11. Y ] σY (t − µX ) + µY = ρ (t − µX ) + µY = α (t) Var [X] σX The second form is commonly used to dene the regression line.*P) . PX.10: Expectation for Z = g (X.82) Standard procedures for determining a minimum (with respect to a) show that this occurs for (12. u. the approximation procedure is not very satisfactory unless a very large number of approximation points are employed. Y } has joint density fXY (t.EX^2 VX = 3. General Random Variables")) from "Variance" Suppose the pair {X. called the Cov [X.*P) EX = 0. but by the rectangle test is not independent.EX*EY CV = 0.85) Note that the pair is uncorrelated.*u.1100 % The regression line is u = 0.84) u= Cov [X. 1 so Cov [X. u = 1 − t. With zero values of E [X] and E [XY ]. Y.24: A jointly distributed pair) from "Mathematical Expectation.
17 (Distribution of Example 5 (Example 8.26: Continuation of Example 5 (Example 8. Figure 12.3).5: Marginals for a discrete distribution) from "Random Vectors and Joint Distributions") from "Function of Random Vectors").17: Distribution of Example 5 (Example 8. u) = 37 (t + 2u) on the max{1.26: Continuation of Example 5 (Example 8. Y } has joint density fXY (t.363 Example 12.11: Marginal distribution with compound expression) from "Random Vectors and MATLAB" and Example 12 (Example 10.3: ANALYTIC SOLUTION E [X] = 6 37 1 0 0 1 t2 + 2tu dudt + 6 37 2 1 0 t t2 + 2tu dudt = 50/37 (12.5: Marginals for a discrete distribution) from "Random Vectors and Joint Distributions") from "Function of Random Vectors" The pair 6 {X. what is the best meansquare linear estimate of Y (ω)? region on X.11: Marginal distribution with compound expression) from "Random Vectors and MATLAB" and Example 12 (Example 10.7 0 ≤ t ≤ 2. as follows: Quantity Integrand Value 779/370 127/148 E X 2 t + 2t u tu + 2u2 t u + 2tu 2 2 3 2 E [Y ] E [XY ] 232/185 Table 12.86) The other quantities involve integrals over the same regions with appropriate integrands. t} (see Figure Figure 12. Determine the regression line of Y is observed. If the value X (ω) = 1. 0 ≤ u ≤ Regression line for Example 12.1 .
9760 APPROXIMATION tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 400 Enter number of Y approximation points 400 Enter expression for joint density (6/37)*(t+2*u).^2.7*a + b y = 0.7a + b = 0.90) 2 2 = σY .8581 VX = total(t. COVARIANCE.3514 EY = total(u. PY.89) The analysis above shows the minimum mean squared error is given by E Y−Y ^ =E 2 Y −ρ σY (X − µX ) − µY σX 2 2 = σY E (Y ∗ ) − 2ρX ∗ Y ∗ + ρ2 (X ∗ ) ^ 2 2 = σY 1 − 2ρ2 + ρ2 = σY 1 − ρ2 (12. LINEAR REGRESSION Then Var [X] = and 779 − 370 50 37 2 = 3823 13690 Cov [X. but is a common interpretation. u. Then. the best linear estimate (in the mean (see Figure 12. Y ] /Var [X] = The regression line is ^ sense) is 1293 6133 ≈ 0. More general linear regression . then E Y−Y ρ2 is interpreted as the fraction of uncertainty removed by the linear rule and X.*P) . and P EX = total(t.4011 y = 1. PX.0944 a = CV/VX a = 0.3 for an approximate plot). (12.3382.4011 3823 15292 u = at + b.a*EX b = 0. often found in the discussion of observations or experimental results. Y.EX*EY CV = 0.3394 % Theoretical = 0.8594 % Theoretical = 0.*(u<=max(t. This interpretation should not be pushed too far.2790 % Theoretical = 0.87) a = Cov [X. If X (ω) = 1.EX^2 VX = 0.3382 b = EY .*P) EY = 0. Y ] = 232 50 127 1293 − · = 185 37 148 13690 (12.364 CHAPTER 12.3517 % Theoretical = 1.*u.1)) Use array operations on X.4006 % Theoretical = 0.88) square Y (ω) = 1.9760 An interpretation of ρ2 2 2 2 = σY E (Y ∗ − ρX ∗ ) 2 (12. the mean squared error in the case of zero linear correlation. t.0947 % Theoretical = 0.*P) EX = 1. b = E [Y ] − aE [X] = ≈ 0.9776 % Theoretical = 0.*P) . If ρ = 0. VARIANCE.2793 CV = total(t.7.
X2 . this condition holds if the class (12.3)). we have ai = Cov [Y. 1. X2 . with X0 = 1. n.91) U satises this minimum condition. E (Y − V ) is a minimum when V = U . equivalently n E [Y V ] = E [U V ] To see this. with zero value i If we take V to be U − V = 0 a. a1 . Xi ] = a1 Cov [X1 . Xn }. Hence. . · · · . as follows. 2.s.93) d2 ≤ E (W + αV ) If we select the special = d2 + 2αE [W V ] + α2 E V 2 α=− This implies E [W V ] E [V 2 ] then 0≤− 2E[W V ] E[W V ] 2 + 2 E V E [V 2 ] E[V 2 ] E [W V ] = 0. Xi ] These In the important special case that the (12. The second term is nonnegative. 2 which can only be satised by E [Y V ] = E [U V ] On the other hand. X1 .95) E [(Y − U ) V ] = 0 for all V of the form above.365 Consider a jointly distributed class. Xi ] + · · · + an Cov [Xn .96) U −V V. 1. so that 2 2 (12. Xi ] Var [Xi ] 1≤i≤n (12. Xi are uncorrelated (i. X1 .92) W =Y −U and let d2 = E W 2 2 . set V of the form V = i=0 ci Xi (12. The rst term is xed.. we obtain n + 1 linear equations in the n + 1 unknowns a0 . Xj ] = 0 for i = j ). Xi ] + a2 Cov [X2 .99) {Xi : 1 ≤ i ≤ n} is iid as in the case of a simple random sample (see the section on "Simple Random Samples and Statistics (Section 13. Xn . We wish to deterimine a function U of the n U= i=0 If ai Xi . if Consider (12. then E [(Y − U ) V ] = 0.98) and a0 = E [Y ] − a1 E [X1 ] − a2 E [X2 ] − · · · − an E [Xn ] In particular. the last term is zero. E [Y ] = a0 + a1 E [X1 ] + · · · + an E [Xn ] E [Y Xi ] = a0 E [Xi ] + a1 E [X1 Xi ] + · · · + an E [Xn Xi ] i = 1.e. Now. successively.97) n equations plus equation (1) may be solved alagebraically for the ai . for all or. for any α (12. such that E (Y − U ) 2 is a minimum (12. E (Y − V ) Since 2 = E (Y − U + U − V ) is of the same form as 2 = E (Y − U ) 2 + E (U − V ) 2 2 + 2E [(Y − U ) (U − V )] (12. an . 2. · · · . · · · . Cov [Xi . form {Y. · · · . then E (Y − U ) 2 is a minimum.94) E[W V ] ≤ 0. we take for 1≤i≤n For each (2) − E [Xi ] · (1) and use the calculating expressions for variance and covariance to get Cov [Y.
0.2) from "Problems on Distribution and Density Functions ". 4.10 0. 2. 0. 374. and $7. 0.1) from "Problems on Mathematical Expectation". Example 12. 0.9565. and Exercise 2 (Exercise 11. Var [X1 ] = 3.8. $7. 0. E [X1 ] = 2.m (Section 17. $5.4/>.08.15.09.4 in a national park there are 127 lightning strikes. Covariance.0IC6 + 3.100) Solution of these simultaneous linear equations with MATLAB gives the results a2 = 0.) (See Exercise 1 (Exercise 7.102) X = IA + IB + IC + ID .5IC3 + 7.org/content/m24379/1. Determine Var [X]. VARIANCE. Then the three equations are a0 0 0 a0 = −1.00. and a1 = a.5IC7 + 7.8. 0. The class on {1.2 (See Exercise 2 (Exercise 7. COVARIANCE. Random variable X has respectively. 3.) Exercise 12. The class {A. Linear Regression4 Exercise 12. X1 ] = 5. a1 = 1. 374. 0. and Cov [X1 .11.50. and Exercise 1 (Exercise 11.0083. with probabilities 0. is a partition.00.12) from "Problems on Random Variables and Probabilities". .4 Problems on Variance.m (Section 17.00.4348. respectively.) Experience shows that the probability of each (See Exercise 4 (Exercise 11. D} has minterm probabilities pm = 0. Suppose E [Y ] = 3.28: npr06_12)). 0. B. Exercise 12.31: npr07_02)). Cov [Y. with X1 = X .05. Var [X].13.15. A customer comes in. 0. 1. She purchases one of the items with probabilities 0.2) from "Problems on Mathematical Expectation". mle npr07_01. and + + + 2a2 3a1 1a1 + + + 3a3 1a2 8a2 = = = 3 5 7 (12.10 0. (Solution on p. $3. 0.6957. a0 = b.12.09.15.50. The prices are $3. $3. mle npr06_12.0IC2 + 3.5IC1 + 5. mle npr07_02.3 (See Exercise 12 (Exercise 6. A store has eight items for sale. E [X2 ] = 3.m (Section 17.10. 5.) Exercise 12. 2. LINEAR REGRESSION Examination shows that for n = 1.1 (Solution on p.101) Var [X]. C. 0.14. (Solution on p.06. 4 This content is available online at <http://cnx. Determine Var [X]. Exercise 3 (Exercise 11. above.1) from "Problems on Distribution and Density Functions ".50. 374. 0.001 ∗ [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] Consider Determine (12. 0. Var [X2 ] = 8.4) from "Problems on Mathematical Expectation").366 CHAPTER 12. 0. $5. 3. Cov [Y.5IC4 + 5.3) from "Problems on Mathematical Expectation".5IC8 Determine (12. (Solution on p. $5. The random variable expressing the amount of her purchase may be written X = 3.50. 2} C1 {Cj : 1 ≤ j ≤ 10} through C10 .18: Linear regression with two variables.11. In a thunderstorm lightning strike starting a re is about 0.30: values npr07_01)). X2 ] = 1. 3.20. the result agrees with that obtained in the treatment of the regression line.07. 12. which counts the number of these events which occur on a trial.50.8.0IC5 + 5. 374. X2 ] = 7.
he or she wins $30 with probability p = 0. D.0430 a. with respective probabilities 0. 0. N The prot to the College is X = 50 · 10 − 30N. 0. Exercise 12.1125 0.) (See Exercise 5 (Exercise 11.21 0.1680 0.1120 0.0120 0. and Var [Y ]. Fifty chances are sold. and Exercise 8 (Exercise 11. 374.0270 0. 374.10. Var [X].0170 0.4. Calculate b.) Exercise 12. Use properties of expectation and variance to obtain that it is b.06 0. {A.) The total operating Exercise 12. (Solution on p.14 0.105) 0. 0. 0.107) .14 0.6) from "Problems on Mathematical Expectation"). Determine E [Z].0130 0.09 0. Determine Var [T ].m (Section 17.0280 0.43. Var [X]. 374.1675 0.0720 0.21] (12. The number of X having Poisson (7) Var [X]. 375. E.37. D} has minterm probabilities (data are in mle npr12_10. (Solution on p.53. Let (12. F } is independent. Determine (See Exercise 7 (Exercise 11. (Solution on p.3IA − 1.) The class Exercise 12.06 0.0002).11 Consider a second random variable The class Y = 10IE +17IF +20IG −10 in addition to that in Exercise 12.) A residential (See Exercise 6 (Exercise 11. B. X Two coins are Determine be the number of matches (both heads or both tails).8) from "Problems on Mathematical Expectation"). 0.3IC + 7. C.0475 0.2.0180 0.5) from "Problems on Mathematical Expectation"). 374.43: npr12_10)) pmx = [0.6ID − 3.46.0420 · · · (12. Let X = 6IA + 13IB − 8IC .0480 0.367 Exercise 12. Let (Solution on p. A player pays $10 to play.m (Section 17.5 ipped twenty times.0725 0.6 (Solution on p.7 noise pulses arriving on a power circuit in an hour is a random quantity distribution. C.106) E [X] and Var [X].8.43: npr12_10)) pmy = [0.8 (See Exercise 24 (Exercise 7. time for the units in Exercise 24 (Exercise 7.9 The class {A. 374.7IB + 2. B. F.103) Var [X]. (Solution on p. Calculate E [W ] and Var [W ]. Z = 3Y − 2X . W = 2X 2 − 3X + 2. 375.) Exercise 12. Y = −3ID + 4IE + IF − 7 (12.10 Consider X = −3.24) from "Problems on Distribution and Density Functions" is a random variable T ∼ gamma (20. Note necessary to obtain the distributions for X or Y. Determine where is the number of winners (12.8.7) from "Problems on Mathematical Expectation"). G} has minterm probabilities (in mle npr12_10.104) a.45. and Var [Z]. (Solution on p. {E.39. E [Y ]. Let not E [X]. College plans to raise money by selling chances on a board.09 0. 0.) Exercise 12.24) from "Problems on Distribution and Density Functions".
108) Var [3X − 2Y ]. Calculate E [Z] and Var [Z]. 377. Determine E [Y ] = 4.) has joint distribution. (Solution on p. (Solution on p.) Exercise 12. Use properties of expectation and variance to obtain b.) Exercise 12. Determine E [X n ]. Determine E X 2 = 11. (Solution on p.16 The pair {X. E [Y ] = 1.33. Var [X]. 378. and Var [Y ].368 CHAPTER 12. E [Y ] = 10.0. 0. D. F } 0. with X∼ gamma (3. Var [Y ]. E [XY ] = 15. 0. Y } has joint distribution. 0.112) Var [3X + 2Y ].41. E [Y ]. B. Y } E [X] = 2. andZ = 3Y − 2X (12. with respective probabilities Exercise 12. (Solution on p. a. Determine E [X].47. and E [Z]. and Var [Z].111) Var [15X − 2Y ]. Let Z = 2X − 5Y . E X 2 = 11.) is independent. Let E [Y ] and Var [Y ]. E [Z].46. E [XY ] = 30 (12.) is independent. Use the result of part (a) to determine Var [X]. Use appropriate mprograms to obtain Compare with results of parts (a) and (b). Exercise 12.) Exercise 12. 377. Var [X]. E Y 2 = 2. E Y 2 = 101. c. E [X] and b. Y } is jointly distributed with the following parameters: E [X] = 3. (r. s) a. Suppose E [X] = 3. Y } is independent.17 The pair {X. Y } E [X] = 2. VARIANCE. Z = X 2 + 2XY − Y . 0.) distribution. COVARIANCE. Var [Z]. Var [Y ] = 4 (12.14 The class {A. C. Determine E [Z] and Var [Z]. E. where n is a positive integer.109) X = 8IA + 11IB − 7IC .15 For the Beta (Solution on p.1) and Y ∼ Poisson (13).110) a. E [X].27. with Exercise 12.37 Let (12. Var [Y ] = 5 (12.113) . 376. 0. Y } is independent.18 The pair {X. (Solution on p. (Solution on p. E [XY ] = 1 (12. Y = −3ID + 5IE + IF − 3. 377. E [Y ]. Calculate b. 376. Determine E X 2 = 5. 376. LINEAR REGRESSION The pair {X.12 Suppose the pair {X. Var [X] = 6. Suppose Exercise 12.13 The pair {X. E [Y ] = 1.
0090 13 0.19 X (See Exercise 9 (Exercise 11.369 Let Determine Z = 2X 2 + XY 2 − 3Y + 4.0126 0.0049 0.0160 0.0260 0.0064 0.8) from "Problems On Random Vectors and Joint Distributions".0134 0.0050 0.4 0.5 4.0160 0.9) from "Problems on Mathematical Expectation"). (Solution on p.0063 7 0.0072 0.0012 0.0056 0.5 0. Y } P (X = t.0112 0. 378.0200 0.0104 0.8 3.39: npr08_08)): {X.m (Section 17. Cov [X.0297 0 4.115) t = u = 7.0108 0.116) t = u = 12 10 9 5 3 1 3 5 1 0.0070 0.0223 .38: npr08_07)): {X. has the joint distribution (in le npr08_08.0280 0.m (Section 17. and Exercise 17 (Exercise 11.18) from "Problems on Mathematical Expectation").0594 0.0324 0.8.0072 3 0.) Exercise 12.0196 0.2] (t) (2 − t) 5 5 (12.0084 0.0363 0.0484 1.0234 0.0056 0.0077 Table 12.) The pair (See Exercise 8 (Exercise 8.0203 0.0234 0.0191 0. has the joint distribution (in le npr08_07.0054 0.0244 0.0510 0.0231 0.0091 0.0156 0.0256 0.0080 0. 378.17) from "Problems on Mathematical Expectation").21 (Solution on p.0112 0.0060 0. 378.0162 0.0081 0. Y } P (X = t.1 2.0084 0.8.2 0.) The pair Exercise 12.0167 11 0.0 3.0268 0.0140 0.2 Exercise 12.1320 0.7) from "Problems On Random Vectors and Joint Distributions".20 (See Exercise 7 (Exercise 8.0495 0.0044 0.0204 0.0060 0. and the regression line of For the distributions in Exercises 2022 Var [X].0017 5 0.0096 0.0091 0.0168 0.0120 0. Y on X.0405 0.0182 0.0189 0.0726 2.0108 0.0090 0.0126 0.0096 0.0180 0.0256 0.0528 0. 1] (t) t2 + I(1.0060 0.114) Var [X].1 0.0040 0.0182 0.0180 0. Y ].1089 0. Y = u) (12.0217 19 0.0064 0. (Solution on p.0891 0..0256 0.0098 0.0096 0.0120 0.0132 3. E [Z] . Y = u) (12. and Exercise 18 (Exercise 11.0216 0.0056 0.0038 0.0156 0.0060 0. Random variable has density function fX (t) = { E [X] = 11/10.0026 15 0.0070 0.0045 9 0.0182 0.0040 0.7 0.9 0. Determine (6/5) t2 (6/5) (2 − t) Determine for for 0≤t≤1 1<t≤2 6 6 = I [0.0196 0.0172 17 0.0056 0.0140 0.0440 0.0035 0.0396 0 0.0252 0.
Check these with a discrete approximation. training.025 3 0.005 0. and Exercise 23 (Exercise 11.23) from "Problems on Mathematical Expectation").031 0.051 0.065 0.8. E [X] = 313 1429 49 .15) from "Problems On Random Vectors and Joint Distributions". 380.9) from "Problems On Random Vectors and Joint Distributions". Data were kept on the eect of training time on the time to perform a job on a production line. Determine analytically Var [X].24 (Solution on p.010 0. and Exercise 20 (Exercise 11.058 0.039 0. u) = 0 ≤ t ≤ 2. fXY (t. E [Y ] = . Exercise 12.015 0.5 0.) (See Exercise 10 (Exercise 8. in hours. Y = u) (12.4 For the joint densities in Exercises 2330 below a.40: npr08_09)): P (X = t. E X2 = 6 3 1 8 (t + u) (12.033 0.3 Exercise 12.012 0.118) Exercise 12. for fXY (t. u) = 3 88 2t + 3u2 for 0 ≤ t ≤ 2. and the regression line of Y on X. E X2 = .22 (Solution on p.070 0.19) from "Problems on Mathematical Expectation").10) from "Problems On Random Vectors and Joint Distributions".) (See Exercise 13 (Exercise 8. LINEAR REGRESSION Table 12. VARIANCE.049 0. 379.137 0.) (See Exercise 9 (Exercise 8. and Exercise 19 (Exercise 11. E [X] = 1 1 .25) from "Problems on Mathematical Expectation").370 CHAPTER 12.001 0. The data are as follows (in le npr08_09.050 0. 379. Y ]. E X2 = 220 880 22 (12.061 0.5 0.009 2 0.001 0.23 (Solution on p. b.163 0.045 2. 379.20) from "Problems on Mathematical Expectation"). 0 ≤ u ≤ 2 (1 − t). 0 ≤ u ≤ 1 + t.003 1. u) = 1 0 ≤ t ≤ 1. and Y X is the amount of is the time to perform the task.m (Section 17. E [X] = E [Y ] = 7 5 . and Exercise 25 (Exercise 11.) (See Exercise 15 (Exercise 8.039 0.017 Table 12.117) t = u = 5 4 3 2 1 1 0. COVARIANCE. 0 ≤ u ≤ 2. for fXY (t. Cov [X.120) .011 0.25 (Solution on p. in minutes.13) from "Problems On Random Vectors and Joint Distributions".119) Exercise 12. 3 6 E [Y ] = 2 3 (12.
22) from "Problems On Random Vectors and Joint Distributions". and Exercise 28 (Exercise 11.121) 11 2 2 .) (See Exercise 16 (Exercise 8. 2 − t}. E X2 = 46 23 5290 (12. identically distributed) with common X = [−5 − 1 3 4 7] Let P X = 0.371 Exercise 12. E X2 = 55 55 605 24 11 tu (12. (0. E [Y ] = . Y. 8 for 0 ≤ u ≤ 1. E [X] = 53 22 9131 .123) Exercise 12. 1) .27) from "Problems on Mathematical Expectation"). u) = 2 13 (t + 2u).31 The class distribution 243 11 107 .21) from "Problems On Random Vectors and Joint Distributions".127) W = 3X − 4Y + 2Z .) (See Exercise 21 (Exercise 8. fXY (t.28) from "Problems on Mathematical Expectation"). Determine E [W ] Var [W ]. 380. and Exercise 26 (Exercise 11. and Exercise 32 (Exercise 11.30 (Solution on p. (1.32) from "Problems on Mathematical Expectation"). 0 ≤ u ≤ min{2t.18) from "Problems On Random Vectors and Joint Distributions".) (See Exercise 17 (Exercise 8. u) = 0 ≤ t ≤ 2.27 (12. u) = 3 23 (t + 2u) for 0 ≤ t ≤ 2. . (0.) (See Exercise 18 (Exercise 8. E [X] = 16 11 2847 .126) E [X] = Exercise 12. E X2 = 224 16 70 (Solution on p.) {X. 381.122) (Solution on p. E X2 = 5 15 5 (12. E [Y ] = .16) from "Problems On Random Vectors and Joint Distributions". 0) . for fXY (t.17) from "Problems On Random Vectors and Joint Distributions". u) = 12t2 u (−1. u) = 9 I[0. fXY (t. (12.01 ∗ [15 20 30 25 10] and (12.) (See Exercise 22 (Exercise 8.124) Exercise 12. on the parallelogram with vertices fXY (t. 1) E [X] = Exercise 12. 0 ≤ u ≤ max{2 − t.125) Exercise 12. E X2 = 13 12 1690 (12.29 (Solution on p. Z} of random variables is iid (independent.2] (t) 14 t2 u2 .26) from "Problems on Mathematical Expectation"). 0) .26 (Solution on p.1] (t) 3 t2 + 2u + I(1. E [Y ] = . E [X] = 32 627 52 . E [Y ] = . and Exercise 27 (Exercise 11. 381. for 0 ≤ t ≤ 2. t}. 3 − t}. then repeat with icalc3 and compare results. 380. 381. and Exercise 31 (Exercise 11. 380. 0 ≤ u ≤ min{1. Do this using icalc. fXY (t.31) from "Problems on Mathematical Expectation"). E [Y ] = .28 (Solution on p.
383.39) from "Problems on Mathematical Expectation").33 (Solution on p. E Z2 = 1135 3405 15890 Cov [X. 3 − t} (see Exercise 29 (Exercise 11. 0 ≤ u ≤ min{1. 2 − t} (see Exercise 27 (Exercise 11. u) = (t + 2u) 0 ≤ t ≤ 2. 55 E [Z] = 16 .128) (12. M = {(t. u ≥ 1} (12. LINEAR REGRESSION Exercise 12. .28) and Exercise 39 (Exercise 11. Y ) X + IM c (X.34 (Solution on p.) fXY (t.) for fXY (t. u) = 0 ≤ t ≤ 2. Y ) XY. u) ≤ 1} E [X] = Determine (12.35 (Solution on p. for Check with discrete approximation. Z].137) . u) : u > t} 2 E [X] = Determine (12. Y ) (X + Y ) + IM c (X. 0 ≤ u ≤ 1 + t (see Exercise 25 (Exercise 11. 0 ≤ u ≤ min{1 + t. VARIANCE. Y ) (X + Y ) + IM c (X. E [X] = M = {(t.2] (X) (X + Y ) 313 5649 4881 . Z]. Exercise 12. Exercise 12. Z = IM (X. u) : u ≤ min (1. 55 E Z2 = 39 308 (12.25) and Exercise 37 (Exercise 11. 1 Z = IM (X. 382. 92 E Z2 = 2063 460 (12. Y ) 2Y. Y ) X + IM c (X. u) = (3t + 2tu).129) Determine Var [Z] 24 11 tu and Exercise 12. 0 ≤ u ≤ min{2. u) = 3 88 2t + 3u2 for 0 ≤ t ≤ 2. u) = 3t2 + u 0 ≤ t ≤ 2. 382. 2 − t)} (12. for Check with discrete approximation. E [X] = M = {(t. 383.40) from "Problems on Mathematical Expectation").132) 53 . Y ) Y 2 .38) from "Problems on Mathematical Expectation"). 383.) fXY (t. Z = IM (X.36 (Solution on p. E [Z] = . for (12. E Z2 = 1790 895 6265 Cov [X. (12. M = {(t. u) : t ≤ 1. t} (see Exercise 28 (Exercise 11. Z = IM (X.41) from "Problems on Mathematical Expectation"). E Z2 = 220 1760 440 Cov [X. Z = I[0. 0 ≤ t ≤ 2. u) : max (t.131) Var [Z] 3 23 and Cov [X. 2} (see Exercise 30 (Exercise 11.372 CHAPTER 12. 46 E [Z] = 175 . Check with discrete approximation.133) Var [Z] 12 179 and Cov [Z]. COVARIANCE.) fXY (t. Check with discrete approximation.32 (Solution on p.) fXY (t.27) and Exercise 38 (Exercise 11. Y ) 2Y 2 .130) 52 . 0 ≤ u ≤ max{2 − t.29) and Exercise 40 (Exercise 11.1] (X) 4X + I(1. Check with discrete approximation. E [Z] = . Z]. E [Z] = .136) Determine Var [Z] and 1567 5774 56673 .37) from "Problems on Mathematical Expectation"). E [X] = (12. Z].135) Exercise 12.134) Determine Var [Z] 12 227 and 2313 1422 28296 .30) and Exercise 41 (Exercise 11.
) (See Exercise 12.139) Determine the joint distribution for the pair {Z. .10) from "Problems on {X. W } W on Z. Y } in Exercise 12.9) and 10 (Exercise 10. 384.138) W = h (X. Y ) X + IM c (X.37 Functions of Random Variables").20. let Z = g (X.20. Y ) 2Y and determine the regression line of (12. Y ) = { X +Y ≤4 X +Y >4 = IM (X. and Exercises 9 (Exercise 10.373 Exercise 12. Y ) = 3X 2 + 2XY − Y 2 X 2Y for for (12. For the pair (Solution on p.
% Alternate Ex = X*PX' Ex = 2.EX^2 VX = 2.31: npr07_02) Data are in T. 366) X∼ X∼ N∼ X∼ T ∼ binomial (127. 366) npr07_01 (Section~17.pc).1 (p.7000 VX = (T. VARIANCE. Var [X] = 302 Var [N ] = 7200. Var [X] = 20 · (1/2) = 5.0083). 367) binomial (50.2 (p.1/2).30: npr07_01) Data are in T and pc EX = T*pc' EX = 2.9 (p.8. Var [X] = µ = 7.0083 · (1 − 0.6 (p.00022 = 500.28: npr06_12) Minterm probabilities in pm.7 (p. 367) cx = [6 13 8 0].8.^2)*PX' .8 (p. Solution to Exercise 12.0002).0.5500 [X.0083) = 1. cy = [3 4 1 7]. coefficients in c canonic Enter row vector of coefficients c Enter row vector of minterm probabilities pm Use row matrices X and PX for calculations Call for XDBN to view the distribution VX = (X. LINEAR REGRESSION Solutions to Exercises in Chapter 12 Solution to Exercise 12. 366) npr06_12 (Section~17. VX = (T. Var [N ] = 50 · 0.0.PX] = csort(T. 2 Solution to Exercise 12. pc EX = T*pc'.^2)*PX' . 000.2 · 0.8525 Solution to Exercise 12. Var [T ] = 20/0. Var [X] = 127 · 0.EX^2 Vx = 1. 367) Poisson (7).2).5 (p.374 CHAPTER 12. 367) binomial (20. .4 (p.^2)*pc' . 367) gamma (20. 000.^2)*pc' . COVARIANCE.0. 366) npr07_02 (Section~17.7309 Solution to Exercise 12. Solution to Exercise 12.8.7000 Vx = (X. Solution to Exercise 12.8 = 8.0454.5500 Solution to Exercise 12. Solution to Exercise 12.3 (p.(X*PX')^2 VX = 0.EX^2 VX = 1.
43: npr12_10) Data are in cx. u. PY.7900 dot(cy.8659e+03 Solution to Exercise 12.px) 5. t. and P H = t.EX^2 VX = 18. pmx and pmy canonic Enter row vector of coefficients cx Enter row vector of minterm probabilities Use row matrices X and PX for calculations Call for XDBN to view the distribution EX = dot(X.PW] = csort(G.EW^2 VW = 2.^2 .375 px py EX EX EY EY VX VX VY VY EZ EZ VZ VZ = = = = = = = = = = = = = = 0.*(1py)) 6.01*[37 45 39 100].9200 sum(cx.^2 + 2*t.11 (p. 0. 367) (Continuation of Exercise 12.^2.PZ) .PY) EY = 19.2958 3*EY .6874 VW = dot(W.2000 VY = dot(Y.pmy).10 (p.8.9386 Solution to Exercise 12.u.*py.3400 9*VY + 4*VX 323. 367) npr12_10 (Section~17.^2.PX) EX = 1.PY] = canonicf(cy.2200 VX = dot(X. EY = dot(Y. EZ = dot(Z.P).2*EX 29.EY^2 VY = 178.^2.8191 sum(cy. [Z. [W.py) 5.PW) .0253 G = 2*X. cy.PZ] = csort(H. dot(cx.3600 icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X.01*[43 53 46 100]. EW = dot(W.PY) . Y.PW) EW = 44.3*X + 2.^2. PX.PX) .*u .10) pmx [Y.*px.*(1px)) 66.PX).^2.
*(1 .01*[27 41 37 100]. LINEAR REGRESSION EZ = 46. VARIANCE. 368) X ∼ gamma (3.EZ^2 VZ = 3.EX^2 VX = 54.^2. icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X.*px. py = 0. E [Z] = 2 · 30 − 5 · 13 = −5.7165e+04 Solution to Exercise 12.^2. ex = dot(cx.px)) vx = 54.376 CHAPTER 12.1) implies E [X] = 30 and Var [X] = 300.3900 VX = dot(X.EX^2 VX = 2 CV = EXY . PY.PY] = canonicf(cy. Then 0. EY = 4.8671 vy = sum(cy.0545 [X. Y.12 (p. 368) px = 0. EXY = 15.minprob(px(1:3))).5343 VZ = dot(Z. cx = [8 11 7 0].px) ex = 4.140) EX = 3.8671 .*py.PX) EX = 4.PY) EY = 1. and P EX = dot(X. PX. VX = EX2 .PX) .^2.PZ) . COVARIANCE.py) ey = 1.1700 EY = dot(Y.minprob(py(1:3))). t.01*[47 33 46 100].EX*EY CV = 3 VZ = 9*VX + 4*VY .3900 vx = sum(cx.13 (p. cy = [3 5 1 3]. Y ∼ Poisson (13) implies E [Y ] = Var [Y ] = 13. [Y.*(1py)) vy = 8.^2. Solution to Exercise 12. 368) Var [Z] = 4 · 300 + 25 · 13 = 1525 (12. VY = 5.PX] = canonicf(cx.6*2*CV VZ = 2 Solution to Exercise 12.1700 ey = dot(cy. EX2 = 11. u.14 (p.
368) E [X n ] = Γ (r + s) Γ (r) Γ (s) 1 0 tr+n−1 (1 − t) s−1 dt = Γ (r + s) Γ (r + n) Γ (s) · = Γ (r) Γ (s) Γ (r + s + n) (12.9589 Solution to Exercise 12.17 (p.144) EX = 3.EY^2 VY = 1 CV = EXY .EY^2 8.EX*EY CV = 1 . EXY = 1. EXY = 30.16 (p. 368) (12.EY^2 VY = 1 CV = EXY .142) Γ (x + 1) = xΓ (x) we have E [X] = r r (r + 1) .15 (p.5100 9*VY + 4*VX 291. EY2 = 101. EX2 = 5.PY) .EX*EY CV = 0 VZ = 15^2*VX + 2^2*VY VZ = 454 Solution to Exercise 12. VX = EX2 .2*EX 12.^2.EX^2 VX = 2 VY = EY2 . EX2 = 11. E X2 = r+s (r + s) (r + s + 1) rs (r + s) (r + s + 1) 2 (12. VX = EX2 . EY = 1. 368) EX = 2.141) Γ (r + n) Γ (r + s) Γ (r + s + n) Γ (r) Using (12.EX^2 VX = 1 VY = EY2 .377 VY VY EZ EZ VZ VZ = = = = = = dot(Y.0545 3*EY . EY2 = 2.143) Some algebraic manipulations show that Var [X] = E X 2 − E 2 [X] = Solution to Exercise 12. EY = 10.
20 (p. 368) EX = 2.. VX = dot(X.145) Var [X] = E X 2 − E 2 [X] = Solution to Exercise 12.19 (p.a*EX b = 0..EX*EY CV = 2. 369) npr08_08 (Section~17.EX^2 VX = 5.. P jcalc . VX = dot(X..PY). LINEAR REGRESSION VZ = 9*VX + 4*VY + 2*6*CV VZ = 1 Solution to Exercise 12.EX*EY . VY = 4.18 (p.38: npr08_07) Data are in X..8..6924 % Regression line: u = at + b Solution to Exercise 12. EY = dot(Y. VX = 6.8.. VARIANCE.PX) ..5275 b = EY .^2. Y... EY = 1. P jcalc .EX = dot(X.*u. Y.1116 CV = total(t.146) npr08_07 (Section~17...*P) .3*EY + 4 EZ = 31 Solution to Exercise 12..PY).. 369) E X2 = t2 fX (t) dt = 6 5 1 t4 dt + 0 6 5 2 2t2 − t3 dt = 1 67 50 (12.378 CHAPTER 12. EX2 = VX + EX^2 EX2 = 10 EY2 = VY + EY^2 EY2 = 5 EZ = 2*EX2 + EX*EY2 .^2. COVARIANCE. EY = dot(Y.0700 CV = total(t.. 369) 13 100 (12...EX^2 VX = 31..EX = dot(X.21 (p.*P) .PX) .*u.PX).PX).6963 a = CV/VX a = 0..39: npr08_08) Data are in X.
EX^2 VX = 0.EX*EY CV = 0.*P) . Y.147) Cov [X.PX) .0556 CV = total(t. VX = dot(X. Y ] = 1 1 2 2 − · = −1/18 Var [X] = 1/6 − (1/3) = 1/18 6 3 3 (12. b = E [Y ] − aE [X] = 14/11 (12..22 (p.a*EX b = 4.PX). EY = dot(Y.23 (p.*u.379 CV = 8.PY).24 (p.40: npr08_09) Data are in X.PY).a*EX b = 1..EX = dot(X.150) (12. 370) npr08_09 (Section~17. EY = dot(Y. Y ] /Var [X] = −1 b = E [Y ] − aE [X] = 1 (12. P jcalc .8.^2.EX^2 VX = 0.*P) .a*EX b = 5.149) tuappr: [0 1] [0 2] 200 400 EX = dot(X.0556 a = CV/VX a = 1..EX*EY CV = 0.0000 u<=2*(1t) Solution to Exercise 12.^2.0272 a = CV/VX a = 0..3319 CV = total(t.77937/6..0000 b = EY ..151) . Y ] /Var [X] = −1/11.2584 b = EY . b = EY .*u.PX) .. Cov [X.148) a = Cov [X. 370) E [XY ] = 1 8 2 0 0 2 tu (t + u) dudt = 4/3..6110 % Regression line: u = at + b Solution to Exercise 12.PX)..3051 % Regression line: u = at + b Solution to Exercise 12.2586 a = CV/VX a = 0. Var [X] = 11/36 a = Cov [X. Y ] = −1/36.. 370) 1 2(1−t) E [XY ] = 0 0 tu dudt = 1/6 (12. VX = dot(X.
*(u>= max(0. Var [X] = 75 25 (12.155) a = Cov [X.2t)) VX = 0.154) Cov [X.162) .1425 CV =0.*(u<=min(1.158) a = Cov [X.1056 a = 0. 371) E [XY ] = 3 23 1 0 0 2−t tu (t + 2u) dudt + Cov [X.t)).6700 b = 0.156) tuappr: [1 1] [0 1] 400 200 12*t.380 CHAPTER 12.0909 b = 1. Y ] = − 3 23 2 1 0 t tu (t + 2u) dudt = 251 230 (12. 370) E [XY ] = 3 88 2 0 0 1+t tu 2t + 3u2 dudt = 2153 26383 9831 Cov [X.*u. Y ] /Var [X] = − (12.152) a = Cov [X.3055 CV = 0.2727 Solution to Exercise 12. COVARIANCE.153) tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t + 3*u.*u.4432 b = 0. Y ] = 8 6 .5553 Solution to Exercise 12. Var [X] = 880 1933600 48400 26321 26383 b = E [Y ] − aE [X] = 39324 39324 (12. LINEAR REGRESSION tuappr: [0 2] [0 2] 200 200 (1/8)*(t+u) VX = 0. Y ] = .27 (p. Var [X] = 5290 10580 114 4165 b = E [Y ] − aE [X] = 4217 4217 (12.^2. Var [X] = 3025 3025 124 368 b = E [Y ] − aE [X] = 431 431 (12.2867 b = 0.*(u<=1+t) VX = 0.*(u<= min(1+t.1)) VX = 0.^2).157) Cov [XY ] = − 124 431 . 371) E [XY ] = 24 11 1 0 0 1 t2 u2 dudt + 24 11 2 1 0 2−t t2 u2 dudt = 28 55 (12.2383 CV = 0.28 (p.6736 Solution to Exercise 12.25 (p.2036 CV = 0. Y ] /Var [X] = (12.0409 a = 0.159) tuappr: [0 2] [0 1] 400 200 (24/11)*t.161) a = Cov [X.26 (p.8535 Solution to Exercise 12. Y ] /Var [X] = − (12.0278 a = 0.160) 57 4217 . VARIANCE.1364 a = 0. Y ] /Var [X] = 4/9 b = E [Y ] − aE [X] = 5/9 (12. 371) 0 t+1 1 1 E [XY ] = 12 −1 0 t3 u2 dudt + 12 0 t t3 u2 dudt = 2 5 (12.
371) E [XY ] = 2 13 1 0 0 3−t tu (t + 2u) dudt + Cov [X.8275 EW = (3 . t.167) a = Cov [X. 371) E [XY ] = 3 8 1 0 0 1 tu t2 + 2u dudt + Cov [X.5989 Solution to Exercise 12.*(t<=1) + (9/14)*t.0287 a = 0.165) tuappr: [0 2] [0 2] 400 400 (2/13)*(t + 2*u). Y ] /Var [X] = − (12.6500 VW = (3^2 + 4^2 + 2^2)*VX VW = 371. [R.3984 CV = 0.PR] = csort(G. Var [X] = 3584 250880 7210 105691 b = E [Y ] − aE [X] = 88243 176486 (12.1350 b = 1. and P G = 3*t .px) % Use of properties EX = 1.168) tuappr: [0 2] [0 1] 400 200 (3/8)*(t.t)) VX = 0.31 (p.0108 a = 0.px) .164) a = Cov [X.0229 a = 0. Y ] = − 2 13 2 1 0 2t tu (t + 2u) dudt = 431 390 (12. u.30 (p.^2.*(u<=max(2t.0272 b = 0. Y.381 tuappr: [0 2] [0 2] 200 200 (3/23)*(t + 2*u).9975 icalc % Iterated use of icalc Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter X probabilities px Enter Y probabilities px Use array operations on matrices X.6500 VX = dot(x.163) 287 3 Var [X] = 130 1690 39 3733 b = E [Y ] − aE [X] = 297 3444 (12. 371) x = [5 1 3 4 7]. Y ] = 9 14 2 1 0 1 t3 u3 dudt = 347 448 (12.*(t>1) VX = 0.4*u. EX = dot(x.*(u<=min(2*t.0839 Solution to Exercise 12. PY.*u.P).166) 103 88243 .^2 + 2*u).1698 CV = 0.EX^2 VX = 12.29 (p. Y ] /Var [X] = (12. PX.0817 b = 0.^2.9909 Solution to Exercise 12.3t)) VX = 0. px = 0.01*[15 20 30 25 10].4+ 2)*EX EW = 1.3517 CV = 0.^2. icalc .
2435 VZ = total(G.7934 % Theoretical 0.169) Var [Z] = E Z 2 − E 2 [Z] = 94273 2451039 Cov [X. PX.pw] = csort(S. EW = dot(W. X] = E [XZ] − E [X] E [Z] = − 84700 42350 (12.EZ^2 VZ = 0.PW) EW = 1. Z. 372) E [XZ] = 24 11 1 0 t 1 t (t/2) tu dudt + 24 11 1 0 0 t tu2 tu dudt + 24 11 2 1 0 2−t ttu2 tu dudt = 211 770 (12. t. Y.6500 VW = dot(W.2445 % Theoretical 0.^2. [w.pw) .6500 Vw = dot(w. 372) E [XZ] = 3 88 1 0 0 1+t 4t2 2t + 3u2 dudt + 3 88 2 1 0 1+t t (t + u) 2t + 3u2 dudt = 16931 3520 (12. Ew = dot(w. u. Y.P). PY.4220 CV = total(G.171) Var [Z] = E Z 2 − E 2 [Z] = 3557 43 Cov [Z.*t.^2.32 (p.33 (p.172) .7913 Solution to Exercise 12. v. and P S = 3*t .9975 icalc3 % Use of icalc3 Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter row matrix of Zvalues x Enter X probabilities px Enter Y probabilities px Enter Z probabilities px Use array operations on matrices X. [W. LINEAR REGRESSION Enter row matrix of Xvalues R Enter row matrix of Yvalues x Enter X probabilities PR Enter Y probabilities px Use array operations on matrices X. VARIANCE.170) tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t+3*u.PW] = csort(H. u.pw) Ew = 1.EW^2 VW = 371.382 CHAPTER 12.^2).Ew^2 Vw = 371. Z] = E [XZ] − E [X] E [Z] = 3097600 387200 (12.9975 Solution to Exercise 12. and P H = t + 2*u.PW) . EZ = total(G. PX.*P) .P).*P) EZ = 3. PZ.4*u + 2*v.PX) EX = 1. COVARIANCE.*(t<=1) + (t+u). t.*(t>1).EX*EZ CV = 0.2110 EX = dot(X.*(u<=1+t) G = 4*t. PY.*P) .^2.
t)) M = max(t.^2.PX).*(u<=max(2t.0425 CV = total(t.*G.2t)) G = (t/2). 372) 1 0 1 2 1 0 0 1 E [ZX] = 12 179 t (t + u) 3t2 + u dudt + 12 179 2 1 0 3−t 12 179 2tu2 3t2 + u dudt+ 24029 12530 (12.EZ*dot(X.1347 Solution to Exercise 12.*(u<=t).^2 + u). 372) 1 0 0 1 2 1 0 2−t E [ZX] = 12 227 1 0 1 12 227 1+t t2 (3t + 2tu) dudt + 12 227 12 227 2 2 t2 (3t + 2tu) dudt + t2 u (3t + 2tu) dudt = 20338 7945 (12.*(u<=min(1. G = (t + u).*M + 2*u. G = (t+u).*(1M).EZ*EX CV = 0.383 tuappr: [0 2] [0 1] 400 200 (24/11)*t.176) 2tu2 3t2 + u dudt = (12.EZ^2 VZ = 0.*(u <= min(2.EX*EZ CV = 0.*(1 .180) 1 2−t Var [Z] = E Z 2 − E 2 [Z] = 112167631 5915884 Cov [Z.181) .M). X] = E [ZX] − E [Z] E [X] = 162316350 27052725 (12.36 (p. CV = total(t.174) Var [Z] = E Z 2 − E 2 [Z] = 39 36671 Cov [Z.*G.PX).173) 2tu (t + 2u) dudt = (12. VZ = total(G. EZ = total(G. 372) 1 0 0 1 1 0 1 2−t E [ZX] = 3 23 t (t + u) (t + 2u) dudt + 3 23 2 1 1 t 3 23 2tu (t + 2u) dudt + 1009 460 (12.35 (p. EX = dot(X.*P) .2940e04 Solution to Exercise 12.^2.*P) .3t)) M = (t<=1)&(u>=1). EX = dot(X. CV = total(t.*M + 2*u.34 (p.*G.^2.*P) .0017 Solution to Exercise 12.*P) .*u.*P). X] = E [ZX] − E [Z] E [X] = − 5607175 11214350 (12.PX) CV = 9.177) Var [Z] = E Z 2 − E 2 [Z] = 11170332 1517647 Cov [Z.178) tuappr: [0 2] [0 2] 400 400 (12/179)*(3*t.u)<=1.175) tuappr: [0 2] [0 2] 400 400 (3/23)*(t+2*u). X] = E [ZX] − E [Z] E [X] = 42320 21160 (12. EZ = total(G.*P).179) t2 u (3t + 2tu) dudt + (12.*(u>t) + u.
COVARIANCE.*(t+u<=4) + 2*u.2)) EX = dot(X.*(u <= min(1+t.PZ) EZ = 5. W.Y) P Enter values for X X Enter values for Y Y Enter expression for g(t.EZ^2 VZ = 0.8.0588e+03 CZW = total(v.*u.384 CHAPTER 12.*P) EZX = 2.2975 EW = dot(W.7988 % Regression line: w = av + b .*(t+u>4) Use array operations on Z.*P) .5597 CV = EZX .EX*EZ CV = 0.6907 Solution to Exercise 12.38: npr08_07) Data are in X.*u .*w.2t).EZ*EW CZW = 12.PX).*u). M = u <= min(1. VARIANCE.PW) EW = 4.2188 VZ = total(G.*P). 373) npr08_07 (Section~17. EZX = total(t. LINEAR REGRESSION tuappr: [0 2] [0 2] 400 400 (12/227)*(3*t + 2*t.u) 3*t. PZW EZ = dot(Z.*PZW) . v.^2 Enter expression for h(t. Y.u.7379 VZ = dot(Z. w.^2.*G. G = t. P jointzw Enter joint prob for (X.0115 b = EW .EZ^2 VZ = 1.u) t.^2 + 2*t.*M + t.a*EZ b = 4.*(1 . PZ.1697 a = CZW/VZ a = 0.37 (p.^2.PZ) .M). PW. EZ = total(G.
Denition. which makes it possible to utilize the considerable literature on these transforms.1) and Variance (Section 12. In the section on integral transforms (Section 13.1.4) 1 This content is available online at <http://cnx.2: Integral transforms). the mathematical expectation E [X] = µX of a random variable X locates the center of mass for the induced distribution.2) The characteristic functionφX for random variable X is φX (u) = E eiuX The i2 = −1. as the mean (moment) of information. for its distribution) is the s is a real or complex parameter) (13.Chapter 13 Transform Methods 13.1). respectively.1 Transform Methods1 As pointed out in the units on Expectation (Section 11. we show their relationship to well known integral transforms.7/>..1.e. or asymetry. transforms. determine some key properties. We investigate further along these lines by exam Each of these functions involves a parameter. integervalued random variable X gX (s) = E sX = k sk P (X = k) (13. and use them to study various probability distributions associated with random variables. These quantities are also known. ining the expectation of certain functions of X. We 13. X and the second moment of X about the mean.org/content/m23473/1. u is a real parameter)(13.3) is generating functiongX (s) for a nonnegative. the third moment about the mean E (X − µX ) 3 gives information about the skew.1 Three basic transforms We dene each of three transforms. Other moments give added For example. 385 . and the expectation E [g (X)] = E (X − E [X]) 2 2 = Var [X] = σX (13. These have been studied extensively and used in many other applications. function The moment generating functionMX MX (s) = E esX ( for random variable X (i. of the distribution about the mean. in a manner that completely determines the distribution. For reasons noted below. we refer to these as consider three of the most useful of these.1) measures the spread of the distribution about its center of mass.
We note two which are MX (s) = E esX = E (es ) X = gX (es ) and φX (u) = E eiuX = MX (iu) (13. TRANSFORM METHODS E sX has meaning for more general random variables. we may convert the generating function to the moment generating function by replacing s with es . particularly useful.6) Since expectation is an integral and because of the regularity of the integrand. but its usefulness is greatest The generating function for nonnegative.5) Because of the latter relationship. The corresponding result for the characteristic function is Example 13. we have ' MX (0) = E [X]. ∞ ∞ The generating function does not lend itself readily to computing moments.11) For higher order moments. (k) provided the k th moment exists (13. we ordinarily use the moment generating function instead of the characteristic function to avoid writing the complex unit variable. When desirable. the integral with respect to the parameter. k! so that gX (s) = e−µ k=0 sk µk = e−µ k! k=0 (sµ) = e−µ eµs = eµ(s−1) k! (13. i. we may dierentiate inside ' MX (s) = d sX d E esX = E e = E XesX ds ds φ(k) (0) = ik E X k . ∞ MX (s) = E esX = 0 ' MX (s) = λe−(λ−s)t dt = '' MX (s) = λ λ−s 3 (13. and we limit our consideration to that case. we convert easily by the change of Moments The name and some of the importance of the moment generating function arise from the fact that the derivatives of MX evaluateed at s=0 are the moments about the origin. integervalued variables. Example 13. The dening expressions display similarities which show useful relationships. except that ' gX (s) = ksk−1 P (X = k) k=1 so that ' gX (1) = kP (X = k) = E [X] k=1 (13.7) Upon setting s = 0. Repeated dierentiation gives the general result.2: The Poisson (µ) distribution ∞ ∞ k P (X = k) = e−µ µ . The integral transform character of these entities implies that there is essentially a onetoone relationship between the transform and the distribution.12) . k ≥ 0.8) λ (λ − s) 2 2λ (λ − s) (13. then work with MX k and its derivatives.9) ' E [X] = MX (0) = λ 1 2λ 2 '' = E X 2 = MX (0) = 3 = 2 λ2 λ λ λ (13.10) From this we obtain Var [X] = 2/λ2 − 1/λ2 = 1/λ2 .1: The exponential distribution The density function is fX (t) = λe−λt for t ≥ 0. Specically MX (0) = E X k . (13.386 CHAPTER 13.
Some discrete distributions 1. Operational properties We refer to the following as operational properties.14) These results agree. A partial converse for (T2) is as follows: (13. then (13. and Var [X] = µ2 + µ − µ2 = µ (13. Y } is uncorrelated. gZ (s) = sb gX (sa ) (T1): If Z = aX + b.15) MZ (s) = ebs MX (as) .13) so that ' E [X] = MX (0) = µ. Then ' MX (s) = eµ(e −1) '' µes MX (s) = eµ(e −1) µ2 e2s + µes (13.21) E [XY ] = E [X] E [Y ]. with those found by direct computation with the distribution.19) '' '' '' ' ' MX+Y (s) = [MX (s) MY (s)] = MX (s) MY (s) + MX (s) MY (s) + 2MX (s) MY (s) (13.387 We convert to MX by replacing s with s es to get MX (s) = eµ(e s s −1) . We utilize these properties in determining the moment generating and generating functions for several of our common distributions. To show this. Y } is independent. Indicator function X = IE P (E) = p gX (s) = s0 q + s1 p = q + ps MX (s) = gX (es ) = q + pes (13. Note that we have not shown that being uncorrelated implies the product rule.20) On setting s=0 and using the fact that MX (0) = MY (0) = 1.17) s. we obtain two expressions for . E (X + Y ) 2 then the pair {X. For the moment generating function.18) (T3): If MX+Y (s) = MX (s) MY (s). For the moment generating function. one by direct expansion and use of linearity. esY form an independent pair for each value of the By the product rule for expectation E es(X+Y ) = E esX esY = E esX E esY Similar arguments are used for the other two transforms. '' E X 2 = MX (0) = µ2 + µ.22) .16) (T2): If the pair {X. then MX+Y (s) = MX (s) MY (s) . (13. φZ (u) = eiub φX (au) . esX and gX+Y (s) = gX (s) gY (s) (13. and the other by taking the second derivative of the moment generating function. this pattern follows from E e(aX+b)s = sbs E e(as)X Similar arguments hold for the other two. of course. we have E (X + Y ) which implies the equality 2 = E X 2 + E Y 2 + 2E [X] E [Y ] (13. parameter φX+Y (u) = φX (u) φY (u) . E (X + Y ) '' 2 = E X 2 + E Y 2 + 2E [XY ] (13.
27) The power series expansion about (1 + t) Hence −m = 1 + C (−m. Thus Xm are m times the expectation and variance for the E [Xm ] = mq/pandVar [Xm ] = mq/p2 6.388 CHAPTER 13.31) .25) Negative binomial (m. esti pi Binomial (n. 2) t2 + · · · for − 1 < t < 1 ∞ (13. each geometric Xm = (p). 1) t + C (−m. This also shows that the expectation and variance of geometric. b)fX (t) = 1 b−a a<t<b est fX (t) dt = 1 b−a b MX (s) = est dt = a esb − esa s (b − a) (13. p).24) Geometric (p).26) where C (−m. and If Ym is the number of the trial in a Bernoulli sequence on which the is the number of failures before the Xm = Ym − m mth success. Uniform on (a.29) Comparison with the moment generating function for the geometric distribution shows that Ym − m has the same distribution as the sum of m iid random variables. successive waiting times to success. with X ∼ Poisson Y ∼ Poisson (µ). pq s = p k=0 k k (qs) = k p p MX (s) = 1 − qs 1 − qes (13. p) success occurs. above. we s gX (s) = eµ(s−1) and MX (s) = eµ(e −1) .30) Poisson(µ)P (X = k) = e−µ µ ∀k ≥ 0 k! k establish (λ) and In Example 13. Follows from (T1) and product of exponentials. Y } is an independent pair.2 (The Poisson (µ) distribution). TRANSFORM METHODS Simple random variable X = n i=1 ti IAi (primitive form) 2. k) (−q) pm −m (−m − 1) (−m − 2) · · · (−m − k + 1) k! t = 0 shows that (13. If {X. X = indicator function. (13. P (X = k) = pq k ∀k ≥ 0E [X] = q/p We use the formula for the geometric series to get ∞ ∞ gX (s) = k=0 5. k) = (13. n i=1 IEi with {IEi : 1 ≤ i ≤ n} iid P (Ei ) = p We use the product rule for sums of independent random variables and the generating function for the n gX (s) = i=1 4. (q + ps) = (q + ps) n MX (s) = (q + pes ) n (13. then k mth P (Xm = k) = P (Ym − m = k) = C (−m. This suggests that the sequence is characterized by independent.28) MXm (s) = pm k=0 C (−m. then Z = X + Y ∼ Poisson (λ + µ).23) n MX (s) = i=1 3. P (Ai ) = pi (13. Some absolutely continuous distributions 1. k) (−q) esk = k p 1 − qes m (13.
36) X has the distribution of the sum of n independent random variables 5. est e−t −∞ 2 /2 dt (13. • X = σZ + µ implies by property (T1) MX (s) = esµ eσ 2 2 s /2 = exp σ 2 s2 + sµ 2 (13. Thus. above. Then Z is normal. 4. we show that Exponential (λ)fX (t) = λe−λt .38) since the integrand (including the constant √ 1/ 2π ) is the density for N (s. c). Let Z = µZ = aµX + bµY + c and 2 2 2 σZ = a2 σX + b2 σY (13.39) Example 13. σX and Y ∼ N µY . a positive integer. Z ∼ N (0. Symmetric triangular (−c. t ≥ 0 MX (s) = For λα Γ (α) ∞ tα−1 e−(λ−s)t dt = 0 λ λ−s α (13. 1 Gamma(α. (λ).40) and by the operational properties for the moment generating function MZ (s) = esc MX (as) MY (bs) = exp 2 2 a2 σX + b2 σY s2 + s (aµX + bµY + c) 2 (13. σ 2 • each exponential . c) fX (t) = I[−c. MX (s) = which shows that in this case λ λ−s n (13.33) X 3. where MY is the ecs − 1 1 − e−cs · = MY (s) MZ (−s) = MY (s) M−Z (s) cs cs moment generating function for Y ∼ uniform (0.41) . c) and similarly (13. for by properties of expectation and variance Suppose .35) α = n.32) MX (s) = = (c + t) est dt + −c 1 c2 (c − t) est dt = 0 ecs + e−cs − 2 c2 s2 (13.37) Now st − t2 2 = s2 2 1 − 2 (t − s) 2 so that MZ (s) = es 2 /2 1 √ 2π ∞ e−(t−s) −∞ 2 /2 dt = es 2 /2 (13.3: Ane combination of independent normal random variables 2 2 {X. has the same distribution as the dierence of two independent random variables. λ)fX (t) = Γ(α) λα tα−1 e−λt t ≥ 0 In example 1.0) (t) 1 c2 0 c+t c−t + I[0. Normal µ.c] (t) 2 2 c c c (13. 1) 1 MZ (s) = √ 2π ∞ The standardized normal. λ MX (s) = λ−s . each uniform on (0. . Y } is an independent pair with X ∼ N µX .34) for MZ . 1). σY aX + bY + c.389 2.
5 0.1600 6. u j . of pairs (ti .0900 10. Ai is the event {X = ti } for each of the distinct Then the moment generating function for X is n pi esti i=1 (13. we need to sort the values.0000 0. Example 13.2] (13. pi πj Since more than one pair sum may have the same value. MX (s) = in canonical form.390 CHAPTER 13. uj ) of values.1] P Y = [0. Y = 2:4.0000 0. [Z.42) MZ shows that Z is normally distributed. we have n pi esti m πj esuj = pi πj es(ti +uj ) i.0000 0. with P (Y = uj ) = πj . Y } is independent with distributions X = [1 3 5 7] Y = [2 3 4] P X = [0.1000 5.44) MX (s) MY (s) = i=1 The various values are sums j=1 for the values corresponding to ti + uj ti . It produces the pairproducts for Although not directly the probabilities and the pairsums for the values. That is.PY).0000 0. X = [1 3 5 7].2000 7.0600 4.1*[3 5 2].45) Determine the distribution for Z =X +Y. Moment generating function and simple random variables Suppose X = n i=1 ti IAi values in the range of X. disp([Z.0500 11.1500 9.Y. it produces the same result as that produced by multiplying moment generating functions.0000 0. dependent upon the moment generating function analysis.43) The moment generating function variable X.0000 0.3 0. MX is thus related directly and simply to the distribution for random Consider the problem of determining the sum of an independent pair {X. Y = m j=1 uj IBj .2 0.1*[2 4 3 1].4: Distribution for a sum of independent simple random variables Suppose the pair {X. Now if The moment generating function for the sum is the product of the moment generating functions.4 0.PZ] = mgsum(X. TRANSFORM METHODS 2 σZ s2 + sµZ 2 = exp The form of (13.PX.1700 8. PX = 0.0000 0.0200 . Y } of simple random variables.0000 0. We have an mfunction mgsum for achieving this directly. consolidate like values and add the probabilties for like values to achieve the distribution for the sum. then performs a csort operation. with pi = P (Ai ) = P (X = ti ).3 0.j Each of these sums has probability (13. PY = 0.0000 0.PZ]') 3.
The generating function The form of the generating function for a nonnegative. integervalued. the MATLAB convolution function may be used (see Example 13.7 (Sum of independent simple random variables)). which has the advantage that other functions of X and Y may be handled. The mfunctions mgsum3 and mgsum4 utilize this strategy.391 This could.X. The techniques for simple random variables may be used with the simple approximations to absolutely continuous random variables.PX.1).1: Density for the dierence of an independent pair.PZ/d) % Divide by d to recover f(t) % plotting details . we may obtain the distribution for the sum of more than two simple random variables. By repeated use of the function mgsum. have been achieved by using icalc and csort. uniform (0.see Figure~13. integervalued random variable exhibits a number of important properties. Example 13.5: Dierence of uniform distribution The moment generating functions for the uniform and the symmetric triangular show that the latter appears naturally as the dierence of two uniformly distributed random variables.PX). of course.1 Figure 13. since the random variables are nonnegative. We consider and Y X iid. Also.PZ] = mgsum(X.46) . uniform on [0.1]. tappr Enter matrix [a b] of xrange endpoints [0 1] Enter number of x approximation points 200 Enter density as a function of t t<=1 Use row matrices X and PX as in the simple case [Z. ∞ ∞ X= k=0 kIAi (canonical form) pk = P (Ak ) = P (X = k) gX (s) = k=0 sk pk (13. plot(Z.
. 387). pk = 0 for k > n). function gX (s) = em(s−1) = e−m ems From the known power series for the exponential. that form characterizes the distribution. we simply encounter the generating X ∼ Poisson (µ) from the distribution.1*[2 3 3 0 0 2].1*[0 2 4 4]. all powers of s must be accounted for by including zeros gx = 0.e. If the generating function is known in closed form. disp(a) Z PZ % Distribution for Z = X + Y disp(b) 0 0 1. b = [0:8. If the power series converges to a known closed form. gX is a polynomial. k the probability pk = P (X = k) 3. The power series expansion about the origin of an analytic function is unique. Y } is independent.49) For simple. 4 gy = 0.7: Sum of independent simple random variables Suppose the pair {X. (13.47) ∞ gX (s) = e−m k=0 We conclude that (ms) = e−m k! k ∞ sk k=0 mk k! (13. TRANSFORM METHODS s with nonnegative coecients whose partial sums converge to one.48) P (X = k) = e−m which is the Poisson distribution with parameter mk . The coecients of the power series display the distribution: for value is the coecient of sk . the unique power series expansion about the origin determines the distribution. a = [' Z PZ']. 1 2 + 3s + 3s2 + 2s5 10 gY (s) = 1 2s + 4s2 + 4s3 10 (13.50) In the MATLAB function convolution. 2. 0≤k k! µ = m.2 (The Poisson (µ) distribution). % Zeros for missing powers 3. Example 13. the generating functions are polynomials. nonnegative.6: The Poisson distribution In Example 13. above.0400 2. p.0000 0.0000 0. % Zero for missing power 0 gz = conv(gx.gz]'. the series 1.0000 0. we establish the generating function for Suppose.gy). For a simple random variable (i. This may be done quickly and easily with the MATLAB convolution function.392 CHAPTER 13.1400 3. integervalued random variables. the problem of determining the distribution for the sum of independent random variables may be handled by the process of multiplying polynomials. 4. Example 13. however. Because of the product rule (T2) ("(T2)". we get (13.2600 4.2400 5. with gX (s) = for the missing powers.0000 0. As a power series in converges at least for s ≤ 1.0000 0.1200 .
0000 7. we may Example 13.0000 0. then MX (−s) = −∞ In this case. This is particularly useful when the random variable X is nonnegative. fX .55) MX (−s) is the bilateral Laplace transform of use ordinary tables of the Laplace transform to recover fX .0400 0.1.0000 8. it would not be necessary to be concerned about missing powers and the corresponding zero coecients. ∞ is absolutely continuous.2 Integral transforms We consider briey the relationship of the moment generating function and the characteristic function with well known integral transforms (hence the name of this chapter). The bilateral Laplace transform for FX FX is a probability is given by ∞ e−st FX (t) dt −∞ The LaplaceStieltjes transform for (13.56) . then MX (−s) is the LaplaceStieltjes transform for The theory of LaplaceStieltjes transforms shows that under conditions suciently general to include all practical distribution functions ∞ ∞ MX (−s) = −∞ Hence e−st FX (dt) = s −∞ e−st FX (t) dt (13. We may use tables of Laplace transforms is known.0800 If mgsum were used. X. Suppose distribution function with FX (−∞) = 0.393 6. Moment generating function and the Laplace transform When we examine the integral forms of the moment generating function. equivalently. for FX ).53) 1 MX (−s) = s to recover so that If ∞ e−st FX (t) dt −∞ (13. we see that they represent forms of the Laplace transform.52) X Thus. e−st fX (t) dt (13.8: Use of Laplace transform Suppose nonnegative X has moment generating function MX (s) = 1 (1 − s) (13. 13. For nonnegative random variable X. if MX is the moment generating function for (or. X FX (t) = 0 for t < 0. widely used in engineering and applied mathematics.54) The right hand expression is the bilateral Laplace transform of FX when MX FX .0800 0.51) FX is ∞ e−st FX (dt) −∞ (13.
Example 13. in keeping with the determination. An expansion theorem If E [Xn ] < ∞. TRANSFORM METHODS We know that this is the moment generating function for the exponential (1) distribution.58) From a table of Laplace transforms. so that FX (t) = 1 − e−t t ≥ 0. 1. The characteristic function Since this function diers from the moment generating function by the interchange of parameter iu. If φn (u) → φ (u) for all u and φ is continuous at 0. If F is a distribution function such that characteristic function for F. t≥0 Γ (α) (13. we nd (13. s and The the integral expressions make that change of parameter. and φ is the (13. above.57) 1/s is the transform for the constant 1 (for t ≥ 0) and 1/ (1 + s) is the transform for e−t . Not only do we have the operational properties (T1) ("(T1)". of the moment generating function for that distribution. then φ is the characteristic function for distribution function F such that Fn (t) → F (t) at each point of continuity of F (13. tα−1 eat t ≥ 0 (13. where i is the imaginary unit. result is that Laplace transforms become Fourier transforms. λ).394 CHAPTER 13. Now.62) ∀u 2.9: Laplace transform and the density Suppose the moment generating function for a nonnegative random variable is MX (s) = λ λ−s α (13.61) We note one limit theorem which has very important consequences. but there is an important expansion for the characteristic function. p. 1 1 1 1 MX (−s) = = − s s (1 + s) s 1+s From a table of Laplace transforms. i2 = −1. 387) and the result on moments as derivatives at the origin. A fundamental limit theorem Suppose {Fn : 1 ≤ n} is a sequence of probability distribution functions and {φn : 1 ≤ n} is the corresponding sequence of characteristic functions. we nd that for α > 0. λα tα−1 e−λt . 387) and (T2) ("(T2)".60) X ∼ gamma (α. for 0≤k≤n and φ (u) = k=0 (iu) E X k + o (un ) k! k as u→0 (13. The theoretical and applied literature is even more extensive for the characteristic function. then Fn (t) → F (t) φn (u) → φ (u) at every point of continuity for F. φ (k) k then n (0) = i E X k . we nd after some algebraic manipulations fX (t) = Thus.63) .59) Γ (α) α (s − a) If we put is the Laplace transform of a = −λ. as expected. t ≥ 0. p.
5/>. We have characteristic function for the Xi . the sample average is a constant times the sum of the random variables in the sampling process . Let be the characteristic function for φ be the common ∗ Sn . In much of the theory of errors of measurement. for all t (13. {Xn : 1 ≤ n} is iid.64) ∗ Sn be the standardized sum and let appropriate conditions. the noise signal is the sum of In such situations. illustrates the kind of argument used in more sophisticated proofs required for more general cases. We sketch a proof of this version of the CLT. We consider a form of the CLT under hypotheses which are reasonable assumptions in many practical situations. Thus. . Similarly.69) √ nφ t/σ n − 1 − t2 /2n  → 0 2 This as n→∞ (13. which utilizes the limit theorem on characteristic functions.65) Fn (t) → Φ (t) IDEAS OF A PROOF There is no loss of generality in assuming and for each as n → ∞. In the statistics of large samples.70) content is available online at <http://cnx.2 Convergence and the central Limit Theorem2 13. The CLT asserts that under Fn (t) → Φ (t) as n → ∞ for all t.66) n let φn µ = 0. the sample average is approximately normalwhether or not the population distribution is normal.2. Consider an independent sequence It {Xn : 1 ≤ n} with of random variables.org/content/m23475/1. and ∗ Sn = Sn − nµ √ σ n (13. for large samples. then central limit theorem (CLT) asserts that if random variable X is the sum of a large class of independent X is approximately normally distributed.67) φ about the origin noted above. normal population distribution is frequently quite appropriate. this theorem serves as the basis of an extraordinary amount of applied work. then Var [Xi ] = σ 2 . condition the Central Limit Theorem (LindebergLévy form) If Xi ∗ Fn be the distribution function for Sn . along with certain elementary facts from analysis. with E [Xi ] = µ. the assumption of a a large number of random components. On the other hand. the observed error is the sum of a large number of independent random quantities which contribute additively to the result. each with reasonable distributions. in the theory of noise. independently produced.68) √ √ φ t/σ n − 1 − t2 /2n  = β t/σ n  = o t2 /σ 2 n so that (13. Form the sequence of partial sums n n n Sn = i=1 Let Xi ∀ n ≥ 1 E [Sn ] = i=1 E [Xi ] and Var [Sn ] = i=1 Var [Xi ] (13. φ (t) = E eitX Using the power series expansion of and √ ∗ φn (t) = E eitSn = φn t/σ n (13. We sketch a proof of the theorem under the form an iid class. above. we have φ (t) = 1 − This implies σ 2 t2 + β (t) 2 where β (t) = o t2 as t→0 (13.395 13. known as the LindebergLévy theorem.1 The Central Limit Theorem The random variables. This celebrated theorem has been the ob ject of extensive theoretical research directed toward the discovery of the most general conditions under which it is valid.
F) % Stair step plot hold on plot(x. % Distribution for the sum of 5 iid rv F = cumsum(px).72) √ 2 φ t/σ n → e−t /2 as n→∞ for all t (13.3 7.PX.x).2 1.05 2.') % Plot of gaussian distribution function % Plotting details (see Figure~13. The theorem says that the distribution functions for sums of increasing numbers of the Xi converge to the normal distribution function. It is instructive to consider some examples.2) . diidsum (sum of discrete iid random variables).6 5.2].PX) .5*VX. but the gaussian approximation shifted one half unit (the so called continuity correction for integervalues random variables). which are easily worked out with the aid of our mfunctions.'.px] = diidsum(X. The rst variable has six distinct values. EX = X*PX' EX = 1. It uses a designated Example 13.1 4. Fn (t) → Φ (t). PX = 0.5).1*[2 2 1 3 1 1].10: First random variable X = [3. but it does not tell how fast. Discrete examples Demonstration of the central limit theorem We take the sum of ve iid simple random The discrete We rst examine the gaussian approximation in two cases.396 CHAPTER 13. A principal tool is the mfunction number of iterations of mgsum. above.^2. variables in each case.9900 VX = dot(X. The t is remarkably good in either case with only ve terms.71) It is a well known property of the exponential that 1− so that t2 2n n → e−t 2 /2 as n→∞ (13. the second has only three.73) By the convergence theorem on characteristic functions. TRANSFORM METHODS A standard lemma of analysis ensures √ φn t/σ n − 1 − t2 /2n n √  ≤ nφ t/σ n − 1 − t2 /2n  → 0 as n→∞ (13. % Distribution function for the sum stairs(x.gaussian(5*EX.0904 [x. Here we use not only the gaussian approximation. character of the sum is more evident in the second case.EX^2 VX = 13.
5).x+0.397 Figure 13.') % Plot of gaussian distribution function plot(x. Example 13. % Distribution for the sum of 5 iid rv F = cumsum(px).x).'o') % Plot with continuity correction % Plotting details (see Figure~13.11: Second random variable X = 1:3.PX.1000 VX = EX2 .F) % Stair step plot hold on plot(x.2: Distribution for the sum of ve iid random variables.5). EX = X*PX' EX = 1.3) .2]. % Distribution function for the sum stairs(x.gaussian(5*EX.'. PX = [0.gaussian(5*EX.5*VX.EX^2 VX = 0.4900 [x.3 0.^2*PX' EX2 = 4.9000 EX2 = X.px] = diidsum(X.5 0.5*VX.
3000 VX = dot(X. We examine only part of the distribution function where most of the probability is concentrated.px] = diidsum(X. FG = gaussian(21*EX.x).PX) . TRANSFORM METHODS Figure 13. Example 13. stairs(40:90.1*[1 2 3 2 2]. This eectively enlarges the xscale.21*VX.EX^2 VX = 4.2100 [x.3: Distribution for the sum of ve iid random variables.4) .PX) EX = 3.PX. so that the nature of the approximation is more readily apparent.FG(40:90)) % Plotting details (see Figure~13. F = cumsum(px).12: Sum of twentyone iid random variables X = [0 1 3 5 6].21). we take the sum of twenty one iid simple random variables with integer values. EX = dot(X. PX = 0.F(40:90)) hold on plot(40:90.398 CHAPTER 13. As another example.^2.
pz] = diidsum(X.FG(a).3*VX. 1).4: Distribution for the sum of twenty one iid random variables. Suppose X∼ uniform (0. Then E [X] = 0. The results on discrete variables indicate that the more values the more quickly the conversion seems to occur.399 Figure 13. uniform random variables. tappr Enter matrix [a b] of xrange endpoints [0 1] Enter number of x approximation points 100 Enter density as a function of t t<=1 Use row matrices X and PX as in the simple case EX = 0. we start with a random variable uniform on (0.F(a).5 and Var [X] = 1/12.3). % Plot every fifth point plot(z(a). we may get approximations to the sums of absolutely continuous random variables. FG = gaussian(3*EX.5) .PX. 1). Example 13.z(a). Absolutely continuous examples By use of the discrete approximation.'o') % Plotting details (see Figure~13. length(z) ans = 298 a = 1:5:296. [z.z). VX = 1/12. F = cumsum(pz).13: Sum of three iid. In our next example.5.
TRANSFORM METHODS Figure 13. Calculations show Var [X] = E X 2 = 7/12.5) and (0. Other distributions may take many more Example 13.PX. the t is remarkably good.F.5)(t>=0. FG = gaussian(0.z. since the sum of two gives a symmetric triangular distribution on terms to get a good t.z).14: Sum of eight iid random variables Suppose the density is one on the intervals (−1. −0. 1). (0.pz] = diidsum(X. plot(z. E [X] = 0. VX = 7/12. This is not entirely surprising.5: Distribution for the sum of three iid uniform random variables. Although the density is sym metric.8*VX.8). F = cumsum(pz). Consider the following example. From symmetry. The MATLAB computations are: tappr Enter matrix [a b] of xrange endpoints [1 1] Enter number of x approximation points 200 Enter density as a function of t (t<=0.400 CHAPTER 13. it has two separate regions of probability.5) Use row matrices X and PX as in the simple case [z. 2).5.6) .FG) % Plottting details (see Figure~13. For the sum of only three random variables.
The convergence of the sample average is a form of the socalled weak law of large numbers.2 Convergence phenomena in probability theory The central limit theorem exhibits one of several kinds of convergence important in probability theory. Convergent sequences are characterized by the fact that for large enough N. Although the sum of eight random variables is used.2. the distance an − am  between any two terms is arbitrarily small for all n. the convergence is remarkable fastonly a few terms are needed for good approximation. uniform random variables. Such a sequence is said to be fundamental (or Cauchy ). 13.13: Sum of three iid. This unique number L is called the limit of the sequence. namely convergence values of the sample average random variable n illustrates convergence in probability. in distribution (sometimes called An weak convergence).74) is a sequence of real numbers.).6: Distribution for the sum of eight iid uniform random variables. For large enough n the probability that An lies within a given distance of the population mean can be made as near one as desired. we say N suciently large an approximates arbitrarily closely some number L for all n ≥ N . The increasing concentration of E An − µ2 → 0 In the calculus. the t to the gaussian is not as good as that for the sum of three in Example 4 (Example 13. m ≥ N . if we let ε > 0 be the error of approximation. we deal with sequences of numbers. To be precise.75) . In either case. then the sequence is {an : 1 ≤ n} • Convergent i there exists a number L such that for any ε > 0 there is an N L − an  ≤ ε for all such that n≥N (13. If the sequence converges i for as n→∞ (13. The fact that the variance of An becomes small for large n illustrates convergence in the mean (of with increasing order 2).401 Figure 13.
the convergence of the sample average is almost sure (i. Rather The notion of convergence than deal with the sequence on a pointwise basis. the closeness to a limit is expressed in terms of the probability that the observed value Xn (ω) follows: should lie close the the value of the limiting random variable.e. The notion of uniform convergence also applies. It is not dicult to construct examples for which there is convergence in probability but pointwise convergence for no ω . convergence (a strong law of large numbers). m ≥ N (13.402 CHAPTER 13. we have a sequence {fn (x) : 1 ≤ n} of real numbers. Xn (ω) → X (ω). Suppose {Xn : 1 ≤ n} is a sequence of real random variables. . then there is an set of This means any For all other tapes. such tape. the situation may be quite dierent. and limits of products are the products of the limits. which are realvalued functions with we have a sequence A somewhat more restrictive condition (and often a more desirable one) for sequences of functions is ω Ω and argument ω . Here the uniformity is over values of the argument x. X3 (ω) . we say the seqeunce domain converges almost surely ω (abbreviated a. exists an uniform convergence. • If the sequence converges in probability. consider for each possible outcome the sequence of values ω a tape on which there is X1 (ω) . (13. For each argument {Xn (ω) : 1 ≤ n} of real numbers.. For n suciently large. Instead of balls. This is the case that the sequence converges except for a set of arbitrarily small probability. In setting up the basic probability model. We may state this precisely as A sequence {Xn : 1 ≤ n} converges to P Xin probability. we think in terms of balls drawn from a jar or box. In the case of sample average. As a matter of fact. The notion of convergent and fundamental sequences applies to sequences of realvalued functions with a common domain.76) As a result of the completeness of the real numbers. designated Xn → X i for any ε > 0. It is quite possible that such a sequence converges for some ω and diverges (fails to converge) for others. TRANSFORM METHODS ε>0 • Fundamental i for any there is an N such that an − am  ≤ ε for all n.s. it deals with the random variables as such. The kind of convergence noted for the sample average is convergence in probability (a weak law of large numbers). that by going far enough out on prescribed distance of the value X. This says nothing about the values Xm (ω) on the selected tape for any In fact.). the values Xn (ω) beyond that point all lie within a X (ω) of the limit random variable. X2 (ω) . The following schematic representation may help to visualize the dierence between almostsure convergence and convergence in probability.e. For example the limit of a linear combination of sequences is that linear combination of the separate limits.s. exceptional tapes which has zero probability. For each x in the domain. It turns out that for a sampling process of the kind used in simple statistics. has a limit). the strong law holds). A tape is selected. X (ω). These concepts may be applied to a sequence of random variables. in many important cases the sequence converges for all ω except possibly a set (event) of probability zero. the probability is arbitrarily near one that the observed value Xn (ω) lies within a prescribed distance of larger m. In probability theory we have the notion of uniformly for all almost uniform in probability X (ω) convergence. What is really desired in most cases is a. the sequence on the selected tape may very well diverge. 
to a random variable • If the sequence of random variable converges a. · · ·. It is easy to confuse these two types of convergence. In this case.. In this case.77) limP (X − Xn  > ε) = 0 n There is a corresponding notion of a sequence fundamental in probability. The sequence may converge for some x and fail to converge for others. And such convergence has certain desirable properties.s. it is true that any fundamental sequence converges (i. noted above is a quite dierent kind of convergence. for any ε > 0 there N which works for all x (or for some suitable prescribed set of x ).
For as n→∞ (13. The notion of mean convergence illustrated by the reduction of expressed more generally and more precisely as follows. Various chains of implication can be traced. convergence. 2. A somewhat more detailed summary is given in PA.i.79) We use Roughly speaking. The relationships between types of convergence are important. 3. A sequence order p to X Var [An ] with increasing n {Xn : 1 ≤ n} converges in the Lp may be mean of i E [X − Xn p ] → 0 as n→∞ designated Xn → X. then it converges in probability. If it converges in probability. Also. to be integrable a random variable cannot be too large on too large a set. It converges in mean. order p. convergence in mean p. Chapter 17. 600) for integrals is integrable i E I{Xt >a} Xt  → 0 as a→∞ (13. What conditions imply the various kinds of convergence? 4.80) This condition plays a key role in many aspects of theoretical probability. weakly). 2. we consider one important condition known as uniform integrability. it may be easier to establish one type which implies another of more immediate interest. An arbitrary class probability measure P {Xt : t ∈ T } is uniformly integrable as (abbreviated u. While much of it could be treated with elementary ideas. since they are frequently utilized in advanced treatments of applied probability and of statistics. X According to the property (E9b) (list. Sometimes only one kind can be established. a However. p. Relationships between types of convergence for probability measures Consider a sequence {Xn : 1 ≤ n} of random variables. 1. We do not develop the underlying theory. If it converges almost surely. Do the various types of limits have the usual properties of limits? Is the limit of a linear combination of sequences the linear combination of the limits? Is the limit of products the product of the limits? 3. i it is uniformly integrable and converges in probability. it is complete treatment requires considerable development of the underlying measure theory. we speak of meansquare The introduction of a new type of convergence raises a number of questions. this characterization of the integrability of a single random variable to dene the notion of the uniform integrability of a class. What is the relation between the various kinds of convergence? Before sketching briey some of the relationships between convergence types. . But for a complete treatment it is necessary to consult more advanced treatments of probability and measure.) with respect to i supE I{Xt >a} Xt  → 0 t∈T a→∞ (13. We simply state informally some of the important relationships. For example • • Almost sure convergence implies convergence in probability Almost sure convergence and uniform integrability implies implies convergence in distribution.78) If the order p is one. 1.e. p = 2. important to be aware of these various types of convergence. 4.403 To establish this requires much more detailed and sophisticated analysis than we are prepared to make in this treatment. It converges almost surely i it converges almost uniformly. then it converges in distribution (i. we simply say the sequence converges in the mean. There is the question of fundamental (or Cauchy) sequences and convergent sequences. Denition.
or realization. we select at random a subset of the population and observe how the quantity varies over the sample. they have According to the central limit theorem. Once formulated. Now if the population mean E [X] is µ and the population variance Var [X] is σ 2 . that is not always feasible. we may apply probability theory to exhibit several basic ideas of statistical analysis.org/content/m23496/1. However. A population may be most any collection of Associated with each member is a quantity or a feature that can be assigned a number. The goal is to determine as much as possible about the character of the population. Also. · · · . random sample is an observation. population distribution.404 CHAPTER 13. then the sample mean and the sample variance should approximate the population quantities. • • The A sampling process is the iid class {Xi : 1 ≤ i ≤ n}. sample distribution will give a useful approximation to the population distribution. As such. and is not aected by. the others. If each member could be observed. x= 1 n n i=1 ti . Two important We want the population mean and the population variance. the observed value of these quantities would probably be dierent. We model the sampling process Let as follows: Xi . The population distribution is the distribution of that quantity among the members of the population. with each member having the population distribution. which means we select n members of the population and observe the quantity The selection is done in such a manner that on any trial each member is equally likely to be selected. It appears that we are describing a composite trial.7/>. the sampling is done in such a way that the result of any one selection does not aect. tn ) of the sampling process. (t1 .82) 3 This content is available online at <http://cnx. As the examples demonstrating the central limit theorem show. for any reasonable sized sample they should be approximately normally distributed. This provides a model for sampling either from a very large population (often referred to as an innite population) or sampling with replacement from a small population.1 Simple Random Samples and Statistics We formulate the notion of a (simple) random sample. (the common distribution of the Xi ). the The sampling process We take a sample of size associated with each. If the sample is representative of the population. Hopefully. This is an observation of the The sample average and the population mean sample average Consider the numerical average of the values in the sample An = The sample sum 1 n n Xi = i=1 1 Sn n Now (13. TRANSFORM METHODS 13. parameters are the mean and the variance.3 Simple Random Samples and Statistics3 13. then n n E [Sn ] = i=1 E [Xi ] = nE [X] = nµ and Var [Sn ] = i=1 Var [Xi ] = nVar [X] = nσ 2 (13.3. The quantity varies throughout the population. the population distribution could be determined completely. In order to obtain information about the population distribution. An Sn and are functions of the random variables distributions related to the population distribution {Xi : 1 ≤ i ≤ n} in the sampling process.81) Sn and the sample average An are random variables. We begin with the notion of a individuals or entities. . the sample size need not be large in many cases. If another observation were made (another sample taken). which is basic to much of classical statistics. t2 . 1 ≤ i ≤ n be the random variable for the i th component trial. n. Then the class {Xi : 1 ≤ i ≤ n} is iid.
15.9800 2.3263 5. p = P (An − µ ≤ a) = P √ a n An − µ √ ≤ σ σ/ n √ = 2Φ a n/σ − 1 (13.2816 1. 1. but the variance of the sample average is An is the same as Thus. probability specied.2) 3.15: Sample size Suppose a population has mean complementary questions: 1/n times the population variance.405 so that E [An ] = 1 E [Sn ] = µ n and Var [An ] = 1 Var [Sn ] = σ 2 /n n2 (13.x.x.8415 = 384.^2]') 0. √ 2Φ a n/σ − 1 ≥ p i √ p+1 Φ a n/σ ≥ 2 √ x = a n/σ (13.8415 0.8000 1. If the sample size n is reasonably large.7055 0.8 0. 2 uncertainty about the actual σ 2 Use at least 385 or perhaps 400 because The idea of a statistic .5758 6. factor for large enough sample.9000 1. If n is given. The population standard deviation. depending on the population distribution (as seen in the previous demonstrations). Sample size to be determined. p = [0.86) norminv to calculate values of x for various p.85) Find from a table or by use of the inverse normal function the value of to make required Φ (x) at least (p + 1) /2. There are 1. σ 2 /n . then An is approximately N µ.83) Herein lies the key to the usefulness of a large sample. x = norminv(0.95 0. as a measure of the variation is reduced by a √ 1/ n. what is the probability the sample average lies within distance a from the population mean? 2. n ≥ (2/0.98 0.4119 0. The mean of the sample average the population mean. µ and variance σ2 .9900 2.2.1. Then n ≥ σ 2 (x/a) = We may use the MATLAB function 2 σ a 2 x2 (13. probability to be determined.9600 3. A sample of size n is to be taken. Example 13. a = 0. Sample size given.6424 0.9500 1. σ = 2.(1+p)/2).6449 2. disp([p.99]. the probability is high that the observed value of the sample average will be close to the population mean. What value of within distance n is required to ensure a probability of at least p a from the population mean? that the sample average lies SOLUTION Suppose the sample variance is known or can be approximated reasonably.6349 For of p = 0.9 0.84) 2.95.
93) members. · · · .406 CHAPTER 13. Denition. we use the last expression to show ∗ E [Vn ] = The quantity has a 1 n σ 2 + µ2 − n σ2 + µ2 n = n−1 2 σ n (13. i=1 2 where µ = E [X] However. tN ) represent the complete set of values in a population of would be given by (13.16: Statistics as functions of the sampling process The random variable W = is 1 n n (Xi − µ) . VERIFICATION Consider the statistic ∗ Vn = 1 n n (Xi − An ) = i=1 2 1 n n 2 Xi − A2 n i=1 (13.87) not a statistic.92) Vn 1/ (n − 1) rather than 1/n is often called the sample variance to distinguish it from the population variance. . However. the following result shows that a slight modication is desirable. If the set of numbers (t1 . Example 13. i=1 2 then E [Vn ] = n n−1 2 σ = σ2 n−1 n (13. t2 . Example 13. the following (13.94) 1/N rather than 1/ (N − 1).88) It would appear that ∗ Vn might be a reasonable estimate of the population variance.17: An estimator for the population variance The statistic Vn = 1 n−1 n (Xi − An ) i=1 2 (13. the sample average is an example of a statistic. since it uses the unknown parameter µ.91) bias in the average. If we consider Vn = The quantity n 1 V∗ = n−1 n n−1 with n (Xi − An ) .89) is an estimator for the population variance. ∗ Vn = 1 n n (Xi − An ) = i=1 2 1 n 2 Xi − A2 n i=1 (13.90) Noting that E X 2 = σ 2 + µ2 . the variance for the population N 1 σ = N 2 Here we use N t2 i i=1 − 1 N N 2 ti i=1 (13. TRANSFORM METHODS As a function of the random variables in the sampling process. n is a statistic. A statistic is a function of the class {Xi : 1 ≤ i ≤ n} which uses explicitly no unknown parameters of the population.
it seems a reasonable candidate for an estimator of the population variance.000 Sample average ex = 0.m. An evaluation similar to that for the mean. % fraction of elements <= each value pg = gaussian(0. we need to consider its variance.7) . S100 .'k'. we plot the fraction of observed sums less than or equal to that value. As a random variable.t.0) % Seeds random number generator for later comparison tappr % Approximation setup Enter matrix [a b] of xrange endpoints [1 1] Enter number of x approximation points 100 Enter density as a function of t 0.ones(1.') % Comparative plot % Plotting details (see Figure~13. Y ∼ N (0. which has mean zero and variance 100/3. a = reshape(T. This yields an experimental distribution function for the distribution function for a random variable which is compared with rand('seed'. it has a variance. % Gaussian dbn for sample sum values plot(t. Then E [X] = 0 Var [X] = 1/3. 1]. % Matrix A of sample sums [t.407 Since the statistic Vn has mean value σ2 . Var [Vn ] is small. but more complicated in detail.m)).100/3.p.3333 m = 100.18: A sampling demonstration of the CLT Consider a population random variable X ∼ uniform [1. This gives a sample of size 100 of the sample sum random variable S100 . For each observed value of the sample sum random variable.m). % Forms 100 samples of size 100 A = sum(a). shows that Var [Vn ] = For large 1 n µ4 − n−3 4 σ n−1 where µ4 = E (X − µ) 4 (13. % Sorts A and determines cumulative p = cumsum(f)/m.'k.pg. If we ask how good is it.95) n. and Example 13.561e17 Sample variance vx = 0. so that Vn is a good largesample estimator for σ2 .t). 100/3).3344 Approximate population variance V(X) = 0. and determine the sample sums.5*(t<=1) Use row matrices X and PX as in the simple case qsample % Creates sample Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX Sample size n = 10000 % Master sample size 10.003746 Approximate population mean E(X) = 1.f] = csort(A. We take 100 samples of size 100.
The unit will be replaced immediately upon failure or at 60 hours.) X ∼ exponential (1/50).) has distribution X = [−3 − 2 0 1 4] P X = [0.96) a.408 CHAPTER 13. Show by direct calculation the ' MX (0) = E [X] X.30 0.) gX (s) gX (s) for the geometric (p) distribution.4 Problems on Transform Methods4 Exercise 13.) (µ) distribution. 412.5/>. Exercise 13.3 A pro jection bulb has life (in hours) represented by (Solution on p.2 Calculate directly the generating function for the Poisson (Solution on p. generating function for the time Y Determine the moment to replacement.7: The central limit theorem for sample sums.10] (13. and '' MX (0) = E X 2 . 4 This content is available online at <http://cnx.1 Calculate directly the generating function (Solution on p. 412. TRANSFORM METHODS Figure 13.org/content/m24424/1. 412.15 0. . Exercise 13. Determine the moment generating function for b. whichever comes rst. 13.25 0. 412.4 Simple random variable X (Solution on p. Exercise 13.20 0.
b. Y } is iid with common moment generating function generating function for Z = 2X − 4Y + 3. (Solution on p. Recognize the distribution from the form and compare part (a). moment generating function for Z = 5X + 2Y .13 Suppose the class Consider (Solution on p. λ) Normal µ. Determine the moment generating functions for IA . Obtain an expression for the symmetric triangular distribution on (a. p.5 Exponential (Solution on p. E [X] and b. IB . 414. Exercise 13. 0.409 Exercise 13.) Determine Exercise 13. Determine the Exercise 13.99) a.) λ3 . (1−qes )2 a.12 Random variable X (Solution on p.11 Random variable X (Solution on p. Use the moment generating function to determine the distribution for X. 0. . Use derivatives to determine E [X] and Var [X].) {A. determine E [X] (13. σ 2 (Solution on p.6 The pair {X.) is independent. (−c. (λ−s)3 Determine the moment Exercise 13. 412.2. c) as derived in the section "Three Basic Transforms".9 Random variable X has moment generating function p2 .5. 413. Y } is iid with common moment generating function MX (s) =(0.6 + 0.4es ).3). with respective probabilities 0. Poisson (4) and Y ∼ geometric (0. 413. Y } generating function gZ for X ∼ Z = 3X + 2Y . B. Use the result of part (a) to show that the sum of two independent random variables uniform on (a. b) has symmetric triangular distribution on (2a. 2b).) Use the moment generating function to obtain the variances for the following distributions (λ) Gamma (α. Var [X] with the result of Exercise 13. (Solution on p. 413. 413. b) for any a < b.) has moment generating function MX (s) = By recognizing forms and using exp (3 (es − 1)) · exp 16s2 /2 + 3s 1 − 5s rules of combinations. Exercise 13. a.10 The pair (Solution on p.97) By recognizing forms and using rules of combinations.) has moment generating function MX (s) = 1 · exp 16s2 /2 + 3s 1 − 3s E [X] and (13. C} of events is independent.3.) 388) on Use the moment generating function for the symmetric triangular distribution (list.98) and Var [X]. determine Var [X].8 (Solution on p. X.7 The pair the {X. 413. IC and use properties of moment generating functions to determine the moment generating function for b. {X.) Exercise 13. Exercise 13. 413. 413. X = −3IA + 2IB + 4IC (13.
X −Y? Justify your answer. with X ∼ binomial (5. Show that the k th derivative at s = 1 is gX (1) = E [X (X − 1) (X − 2) · · · (X − k + 1)] (k) (13. 416. 414.5. Compare with result (b).) {X.17 Suppose the pair (Solution on p.33) Y ∼ binomial (7. 414.16 Suppose the pair {X.3.100) a. 415. . 2 2 Let G = g (X) = 3X − 2X and H = h (Y ) = 2Y + Y + 3.19 Suppose the pair (Solution on p. d. determine the distribution for the sum with mgsum3.410 CHAPTER 13.18 Suppose the pair (Solution on p. Exercise 13. Use icalc and csort to obtain the distribution for (a). Use distributions for the separate terms.2.51). Use mgsum to obtain the distribution for U +V. Exercise 13. Compare with result (b). Use generating functions to show under what condition. − 0. with both and a. 0. express the generating function as a power series. 0. Use icalc and csort to obtain the distribution for (a). Y } is independent. a.3.101) b. What about X +Y is Poisson. TRANSFORM METHODS c.) {X.) Y ∼ uniform {X. Y } is independent. G + H. 0. Let U = 3X 2 − 2X + 1 and and V = Y 3 + 2Y − 3 (13.) Poisson. 415.) binomial. binomial (6. Does Z have the binomial distribution? Is the result surprising? Examine the rst few possible values for the generating function for Z .) is Poisson and X +Y is Poisson. Y } is independent.5}. Use the generating functions to show that Y X (Solution on p. Use mgsum to obtain the distribution for Z = 2X + 4Y . Exercise 13. {X. Use canonic to determine the distribution.14 Suppose the pair {X.14 X +Y is binomial.39) on {−1. Use generating functions to show under what condition b. and compare with the result of part Exercise 13.) is a nonnegative integervalued random variable. Write Exercise 13. X +Y is binomial. 414. 3. G+H a. with X ∼ binomial (8. does it have the form for the binomial distribution? and Z. Y } is independent. Use this to show the '' ' ' Var [X] = gX (1) + gX (1) − gX (1) 2 . Exercise 13. Y is nonnegative integervalued. Exercise 13. Y } is independent. if any. with both X X and Y Y (Solution on p.47). 1. 414. By the result of Exercise 13. is Poisson. U +V and compare with the result of part b. 0.15 Suppose the pair (Solution on p. Y } is iid. Use the mgsum to obtain the distribution for b. 2.20 If X (Solution on p.
p) Exercise 13. (Solution on p. N µ.3: Ane combination of independent normal random variables) from "Transform Methods" for general result). e−sµ MX (s) evaluated 2 then Var [X] = σ .411 Exercise 13. (Solution on p.) a. σ 2 . Use the Chebyshev inequality to estimate the needed sample size.) A population has standard deviation approximately three. Use this fact to show that if X ∼ N µ. at s = 0. . Show that Var [X] is the second derivative of b.) MXm (s) to obtain the mean and variance of the negative binomial (m.) Use the central limit theorem to show that for large enough sample size (usually 20 or more). 5). the An = is approximately variance 1 n n Xi i=1 (13. Use the normal approximation to estimate n (see Example 1 (Example 13. b. Exercise 13. 416.23 dent random variables. Y } is iid N (3.) Use the moment generating function to show that {X.21 Let MX (·) be the moment generating function for X.24 The pair (Solution on p. It is desired to determine the sample n needed to ensure that with probability 0. 416.5 of the mean value. Exercise 13. 416. σ 2 /n for any reasonable population distribution having mean value µ and Exercise 13.102) σ2 .95 the sample average will be within 0.15: Sample size) from "Simple Random Samples and Statistics").25 sample average (Solution on p.26 size (Solution on p. (Solution on p. a. 417. Z = 3X − 2Y + 3 is is normal (see Example 3 (Example 13. 416. 417.) Use moment generating functions to show that variances add for the sum or dierence of indepen Exercise 13.22 Use derivatives of distribution.
412 CHAPTER 13.∞) (X) eas ∞ (13.15e−3s + 0.25es + 42 · 0. Solution to Exercise 13.5 (p.a] (X) X + I(a.∞) (X) a a esY = I[0. Gamma λ 1 2λ 2 = E X2 = 3 = 2 λ2 λ λ λ Var [X] = 2 − λ2 = 1 λ2 (13.10e4s (13.111) E [X] = b.30 + 0.20e−2s + 0 + 0.106) = λ 1 − e−(λ−s)a + e−(λ−s)a λ−s (13. 408) ∞ ∞ gX (s) = E sX = k=0 pk sk = p k=0 q k sk = p 1 − qs (geometric series) (13.25es + 0.15e−3s + (−2) · 0.4 (p.113) '' MX (s) = α2 λ λ−s 1 1 λ +α λ−sλ−s λ−s 1 (λ − s) α λ2 2 (13.1 (p.107) Solution to Exercise 13.a] (X) esX + I(a. 408) Y = I[0.109) '' MX (s) = (−3) · 0.2 (p. 408) ∞ ∞ gX (s) = E sX = k=0 pk sk = e−µ k=0 µk sk = e−µ eµs = eµ(s−1) k! (13. λ): MX (s) = λ λ−s α ' MX (s) = α λ λ−s α α−1 λ (λ − s) 2 =α λ λ−s α α 1 λ−s (13.103) Solution to Exercise 13.112) (α. 409) a.108) (13. TRANSFORM METHODS Solutions to Exercises in Chapter 13 Solution to Exercise 13.15e−3s − 2 · 0.115) .20e−2s + 0 + 0. Exponential: MX (s) = λ λ ' MX (s) = 2 λ−s (λ − s) '' MX (s) = 2λ (λ − s) 1 λ 3 2 (13.25es + 4 · 0.104) Solution to Exercise 13.105) MY (s) = 0 est λe−λt dt + esa a λe−λt dt (13.10e4s ' MX (s) = −3 · 0. 408) MX (s) = 0.114) E [X] = α α2 + α E X2 = λ λ2 Var [X] = (13.3 (p.20e−2s + 0.10e4s 2 2 (13.110) Setting s=0 and using e0 = 1 give the desired results.
X2 ∼ exponential (1/5) . 409) (13.9 (p. 16) (13.122) s2 (b − a) p2 (1 − qes ) p2 (1 − qes ) −2 '' −2 ' = 2p2 qes (1 − qes ) + (1 − 3 so that E [X] = 2q/p E X2 = 6q 2 2q + 2 p p (13.6 + 0. σ): MX (s) = exp σ 2 s2 + µs 2 ' MX (s) = MX (s) · σ 2 s + µ (13.118) MZ (s) = e3s Solution to Exercise 13. 409) λ λ − 2s 3 λ λ + 4s 3 (13. If Y ∼ (m − c. which has E [X] = 2q/p Var [X] = 2q/p2 .121) MX+Y (s) = Solution to Exercise 13.6 + 0. 409) 3 −1) · (13.126) X = X1 + X2 with X1 ∼ exponential(1/3) X2 ∼ N (3. 409) gZ (s) = gX s3 gY s2 = e4(s Solution to Exercise 13. 409) e −e s (b − a) sb sa 2 = e s2b +e s2a − 2e 2 s(b+a) (13.127) E [X] = 3 + 3 = 6 Var [X] = 9 + 16 = 25 Solution to Exercise 13.6 (p. 0. X3 ∼ N (3. 409) (13.3 1 − qs2 Solution to Exercise 13.117) E [X] = µ E X 2 = µ2 + σ 2 Var [X] = σ 2 Solution to Exercise 13.130) .7 (p. with X1 ∼ Poisson (3) .128) X = X1 + X2 + X3 .125) (2. then (13.12 (p.116) '' MX (s) = MX (s) · σ 2 s + µ 2 + MX (s) σ 2 (13.10 (p. 16) (13. m + c) = (a.119) MZ (s) = 0.8 (p. c).4e2s (−c.413 c.123) = 6p2 q 2 es (1 − 4 qes ) 2p2 qes 3 qes ) so that (13.120) m = (a + b) /2 and c = (b − a) /2.129) E [X] = 3 + 5 + 3 = 11 Var [X] = 3 + 25 + 16 = 44 (13.11 (p. Normal (µ.124) Var [X] = X∼ negative binomial 2 q 2 + pq 2q 2 2q 2q + = = 2 p2 p p2 p and (13.4e5s Solution to Exercise 13. 409) Let triangular on 0. p). b) and symetric triangular on X = Y +m is symmetric MX (s) = ems MY (s) = e(m+c)s + e(m−c)s − 2ems ebs + eas − 2e ecs + e−cs − 2 ms e = = 2 s2 2 s2 b−a 2 2 c c s 2 a+b 2 s (13.
X2 = [0 2].0000 0.0300 2.15 (p.0700 Solution to Exercise 13.P2.135) Y −X could have negative values.0000 0.131) 0.12 0.px] = mgsum3(X1.2800 2.07 0.0000 0. X3 = [0 4].16 (p.12e−3s + 0. canonic Enter row vector of coefficients c Enter row vector of minterm probabilities Use row matrices X and PX for calculations Call for XDBN to view the distribution P1 = [0. Solution to Exercise 13. disp([X.28 0.7 + 0.2800 0 0. n m gX+Y (s) = (q1 + p1 s) (q2 + p2 s) Solution to Exercise 13.8 0.1200 0 0. gX (s) = eµ(s−1) gX (s) gives gY (s) = eν(s−1) . P2 = [0.2]. where ν = E [Y ] > 0.17 (p.5 0.07e6s The distribution is (13.0300 4.x.414 CHAPTER 13.0000 0.0000 0.5e2s 0. (13. as the argument below shows.134) gX+Y (s) = eµ(s−1) eν(s−1) = e(µ+ν)(s−1) However.0300 1.3].13 (p.1200 1. P3 = [0. 410) Always Poisson.X3. 410) .2800 3.0000 0.03 0.07] (13.0000 0.7 0.0700 6.X2.132) X = [−3 − 1 0 1 2 3 4 6] P X = [0.0000 0.0000 0.P1.0000 0.133) c = [3 2 4 0].5 + 0.0000 0.07e4s + 0.2800 1.0700 4.0000 0.28 0.1200 3.2e4s = (13. 410) Division by E [X + Y ] = µ + ν .03e3s + 0.28 + 0. X1 = [0 3].px]') 3.12 0.03 0. 409) MX (s) = 0.3e−3s 0.0000 0.14 (p.1200 1.1*[3 5 2].P3).0300 3.12e−s + 0. P = 0.8 + 0.PX.5].03es + 0.0700 6. [x. Solution to Exercise 13. = (q + ps) n+m i p1 = p2 (13. and gX+Y (s) = gX (s) gY (s) = e(µ+ν)(s−1) .28e2s + 0. as shown below.0000 0. TRANSFORM METHODS Solution to Exercise 13. 410) Binomial i both have same minprob(P) p.
[Z.0000 0.5].. and P M = 3*t.18 (p.^2 .^3 + 2*Y .39.. PY = (1/5)*ones(1. Y = [1.PZ] = mgsum(U..0000 0.5 1.PZ)) % Comparison of p values e = 0 Solution to Exercise 13.2*t + 1 + u. since odd values missing 2. Y. e = max(abs(pz . Y.51s4 6 (13. disp([Z(1:5)..pz] = csort(M.PZ(1:5)]') 0 0.PZ] = mgsum(2*x.Y).x). U = 3*X.gX (s) = gY (s) = (0.2*X.5).P).415 x = 0:6.0043 6. PY. PY. PX. Y = 0:7.3 2.PZ] = mgsum(G. [z.PX.0259 .^2 .33. t. G = 3*X..0118 8. . PX = ibinom(8. [Z.^2 + Y + 3.51s) Solution to Exercise 13.^2 + u + 3.px).49 + 0.3 0.0. [Z..0000 0. 410) 6 gZ (s) = 0.136) X = 0:5.0. PY = ibinom(7.^3 + 2*u . 410) X = 0:8.51s2 6 0.X).2*X + 1.0012 4.2 3.^2 .PX.0002 % Cannot be binomial. and P M = 3*t.PY).3. PX = ibinom(5.2*t + 2*u.3. icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X. u.47.V. icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X. H = 2*Y. px = ibinom(6.^2 .X).4*x. V = Y. t.0. PX. u.49 + 0.H.49 + 0.19 (p.0.PY).px.0000 0.51.
143) Var [X] = Solution to Exercise 13.139) Solution to Exercise 13. h (s) = MX (s) MY (s) (13.pz] = csort(M.416 CHAPTER 13.P).147) g (−s) shows Var [X + Y ] = Var [X] + Var [Y ].PZ)) e = 0 Solution to Exercise 13. Var [X − Y ] = Var [X] + Var [Y ].20 (p.145) h' (s) = f ' (s) g (s) + f (s) g ' (s) h'' (s) = f '' (s) g (s) + f ' (s) g ' (s) + f ' (s) g ' (s) + f (s) g '' (s) Setting s=0 yields E [X + Y ] = E [X] + E [Y ] E (X + Y ) 2 = E X 2 + 2E [X] E [Y ] + E Y 2 E 2 [X + Y ] = (13. 411) M3X (s) = MX (3s) = exp M−2Y (s) = MY (−2s) = exp = exp 4 · 5s2 − 2 · 3s 2 (13. e = max(abs(pz . TRANSFORM METHODS [z. 411) ' '' ' f (s) = e−sµ MX (s) f '' (s) = e−sµ −µMX (s) + µ2 MX (s) + MX (s) − µMX (s) (13.144) f (s) = MX (s). mpm qes (1 − m+1 qes ) f (s) = pm m (1 − qes ) f ' (s) = f '' (s) = mpm qes (1 − m+1 qes ) + m (m + 1) pm q 2 e2s (1 − qes ) m+2 (13. 9 · 5s2 + 3 · 3s 2 A similar treatment with g (s) replaced by Solution to Exercise 13. 411) To simplify writing use (13. set mq m (m + 1) q 2 m2 q 2 mq + − 2 = 2 2 p p p p and (13.24 (p.22 (p.138) '' ' ' Var [X] = E X 2 − E 2 [X] = E [X (X − 1)] + E [X] − E 2 [X] = gX (1) + gX (1) − gX (1) (13. 411) To simplify writing.137) gX (1) = k=n k (k − 1) · · · (k − n + 1) pk = E [X (X − 1) · · · (X − n + 1)] 2 (13.23 (p.141) f (s) for MX (S). 410) Since power series may be dierentiated term by term ∞ gX (s) = k=n ∞ (n) (n) k (k − 1) · · · (k − n + 1) pk sk−n so that (13.149) .140) Setting s=0 and using the result on moments gives f '' (0) = −µ2 + µ2 + E X 2 − µ2 = Var [X] Solution to Exercise 13.148) MZ (s) = e3s exp (45 + 20) s2 + (9 − 6) s 2 65s2 + 6s 2 (13.146) E 2 [X] + 2E [X] E [Y ] + E 2 [Y ] Taking the dierence gives (13.21 (p.142) E [X] = mpm q (1 − q) m+1 = mq m (m + 1) pm q 2 mq E X2 = + m+2 p p (1 − q) (13. g (s) = MY (s).
05 0. An is approximately Solution to Exercise 13. 411) normal.52 n implies n ≥ 720 (13.5 n An − µ √ ≥ 3 σ/ n ≤ 32 ≤ 0. with the mean and variance above.417 Solution to Exercise 13.25 (p. Chebyshev inequality: P Normal approximation: √ 0.5) 3.150) By the central limit theorem.152) . 411) E [An ] = 1 n n µ = µ Var [An ] = i=1 1 n2 n σ2 = i=1 σ2 n (13.84 = 128 2 (13.26 (p.151) Use of the table in Example 1 (Example 13.15: Sample size) from "Simple Random Samples and Statistics" shows n ≥ (3/0.
TRANSFORM METHODS .418 CHAPTER 13.
1 Conditioning by an event If a conditioning event bility measure C occurs. given a random vector. The expectation E [X] is the probability X.1 Conditional Expectation. Regression 14. 14. We discover that conditional expectation is a random quantity. in turn. Two possibilities for making the modication are suggested. 419 . We rst consider an elementary form of conditional expectation with respect to an event. Various types of conditioning characterize some of the more important random sequences and processes. The basic property for conditional expectation and properties of ordinary expectation are used to obtain four fundamental properties which imply the expectationlike character of conditional expectation. In examining these.1. Regression1 Conditional expectation. Conditional independence plays an essential role in the theory of Markov processes and in much of decision theory. Then we consider two highly intuitive special cases of conditional expectation.1) . given a random variable. The notion of conditional independence is expressed in terms of conditional expectation. gives an alternate interpretation of conditional expectation.We limit the possible outcomes to event C P (C) as the new unit .5/>. We could continue to use the prior probability measure follows: P (·) and modify the averaging process as 1 This content is available online at <http://cnx. It seems reasonable to make a corresponding modication of mathematical expectation when the occurrence weighted average of the values taken on by • • We could replace the prior probability measure P (·) with the conditional probability measure P (·C) and take the weighted average with respect to these new weights.org/content/m23634/1. In making the change from P (A) we eectively do two things: to P (AC) = P (AC) P (C) (14. plays a fundamental role in much of modern probability theory. we modify the original probabilities by introducing the conditional proba P (·C).We normalize the probability mass by taking of event C is known.Chapter 14 Conditional Expectation. An extension of the fundamental property leads directly to the solution of the regression problem which. we identify a fundamental property which provides the basis for a very general extension.
1: A numerical example Suppose X ∼ [1/λ.2) The nal sum is expectation with respect to the conditional probability measure. We suppose P (Ai ) = P (X = ti ) > 0 and P (X = ti .1. The notion of a conditional distribution. 2/λ]. CONDITIONAL EXPECTATION.8) .5) E [XC] = 2e−1 − 3e−2 1. we adopt the following Denition.6) 14.7) We take the expectation relative to the conditional probability m E [g (Y ) X = ti ] = j=1 g (uj ) P (Y = uj X = ti ) = e (ti ) (14. given event C with positive probability. X (ω) for only those ω ∈ C . The conditional expectation of X. and taking weighted averages with respect to the conditional probability is intuitive and natural in this case.420 CHAPTER 14. The product form E [XC] P (C) = E [IC X] (λ) and Example 14. Arguments using basic theorems on expectation and the approximation of general random variables by simple random variables allow an extension to a general random variable X.418 ≈ λ (e−1 − e−2 ) λ (14.4) E [IC X] = Thus IM (t) tλe−λt dt = 1/λ tλe−λt dt = 1 2e−1 − 3e−2 λ (14. In order to display a natural relationship with more the general concept of conditioning with repspect to a random vector.Consider the values IC X which has value probability weighted . Now P (Y = uj X = ti ) = n m form. This may be done by using the random variable X (ω) for ω ∈ C and zero elsewhere.The weighted sum of those values taken on in C. j .3) Remark.2 Conditioning by a random vectordiscrete case Suppose X = i=1 ti IAi and Y = j=1 uj IBj in canonical P (Bj ) = P (Y = uj ) > 0. for each permissible i. Y = uj ) P (X = ti ) P (·X = ti ) to get (14. is the quantity E [XC] = E [IC X] E [IC X] = P (C) E [IC ] is often useful. this point of view is limited. average is obtained by dividing by P (C). The expectation E [IC X] is the These two approaches are equivalent. Now IC = IM (X) where M = P (C) = P (X ≥ 1/λ) − P (X > 2/λ) = e−1 − e−2 2/λ and (14. (14. REGRESSION . For a simple random variable X= n k=1 tk IAk n in canonical form n n E [IC X] /P (C) = k=1 E [tk IC IAk ] /P (C) = k=1 tk P (CAk ) /P (C) = k=1 tk P (Ak C) (14. given C. exponential C = {1/λ ≤ X ≤ 2/λ}. However.
2: Basic calculations and interpretation Suppose the pair {X. The following interpretation helps visualize the conditional expectation and points to an important result in the general case.15 0. But rst.01 0.01 + 2 · 0.30 PX Table 14. the function e (·) is dened on the range of X.10 + 0 · 0. Example 14.10 E [Y X = 0] = −1 0.10) j=1 n = i=1 We have the pattern IM (ti ) e (ti ) P (X = ti ) = E [IM (X) e (X)] (14.83 The pattern of operation in each case can be described as follows: • For the i th column.20 + 0 0.05 0. sum.21) /0.20 = (−1 · 0.30 E [Y X = 4] = (−1 · 0.09 + 2 · 0. Y = uj ) g (uj ) P (Y = uj X = ti ) P (X = ti ) (14.10 + 2 · 0.11) (A) for all E [IM (X) g (Y )] = E [IM (X) e (X)] where e (ti ) = E [g (Y ) X = ti ] (14.05) /0.05 + 2 0. concept.05 + 0 · 0. then divide by P (X = ti ). consider an example to display the nature of the We return to examine this property later.10 = 0.40 = 0.04 0.12) ti in the range of X.15) /0.10 0. Y } has the joint distribution P (X = ti .40 9 0.05 0.10 + 0 · 0.20 = 0 E [Y X = 1] = (−1 · 0.421 Since we have a value for each consider any reasonable set M ti in the range of X.9) IM (ti ) m = i=1 (14.04) /0.09 0.20 0. Now on the real line and determine the expectation n m E [IM (X) g (Y )] = i=1 j=1 n IM (ti ) g (uj ) P (X = ti .05 0.10 0. Y = uj ).10 4 0. .05 0.10 = 0.1 Calculate E [Y X = ti ] for each possible value ti taken on by X 0.20 1 0.80 E [Y X = 9] = (−1 · 0.05 0.05 + 2 · 0. multiply each value uj by P (X = ti .13) X= Y =2 0 1 0 0.10 0.21 0. Y = uj ) (14.05 + 0 · 0.
disp([X.14) elsewhere fX (t) > 0 of a density for each xed t eectively determines the range of for which X. As in the case of ordinary expectations. P (X = t) = 0 for adopted in the discrete case. in the meansquare sense.0000 1./sum(P). Instead Z = g (X. The result of the computation is to determine the center of conditional distribution above t = ti . This is particularly useful in dealing with the simple approximation to an absolutely continuous pair.3000 4. making the handling of much larger problems quite easy. 5 1 9 10. Y } X = t.*P).15) fY X (ut) ≥ 0.1.1. 10 5 10 5]. t. Although the calculations are not dicult for a problem of this size. has joint density function The fact that fXY . we consider the t requires a conditional density each for We seek to use the concept of a conditional modication of the approach fY X (ut) = { The condition fXY (t.0000 0. EZX = sum(G. be the best estimate. of values of uj we use values of g (ti . % Data for the joint distribution Y = [1 0 2].0500 9.5000 1. This mass is distributed along a vertical mass line at values for the this should uj taken on by Y. jcalc % Setup for calculations Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.8000 9.^2 .0000 12. We examine that possibility in the treatment of the regression problem in Section 14. REGRESSION • For each ti we use the mass distributed above it. u) /fX (t) 0 fX (t) > 0 (14. and P EYX = sum(u. 1 fX (t) fXY (t. Y ) X = ti ].2*t. j 0 0 1.Y) = Y^2 . P = 0.0000 0.*P = u_j P(X = t_i.01*[ 5 4 21 15.*P). of Y when X = ti . u. % sum(P) = PX (operation sum yields column sums) disp([X.8333 The calculations extend to the calculations.3 Conditioning by a random vector absolutely continuous case Suppose the pair distribution. Suppose E [g (X.*u. given {X. The function fY X (·t) has the properties fX (t) > 0.5 (The regression problem). Y ) = Y 2 − 2XY . PY. Intuitively./sum(P).EYX]') % u. X = [0 1 4 9]. Y.5000 4.0000 4. uj ) in G = u.8333 % Z = g(X. CONDITIONAL EXPECTATION.2XY % E[ZX=x] 14.422 CHAPTER 14.EZX]') 0 1.0000 0. PX. fY X (ut) du = . u) du = fX (t) /fX (t) = 1 (14. Y = u_j) for all i. the basic pattern can be implemented simply with MATLAB.
This causes . E [g (Y ) X = t] = The function g (u) fY X (ut) du = e (t) (14.17) e (t) = E [g (Y ) X = t] (14.3: Basic calculation and interpretation Suppose the pair by {X. Then fX (t) = 6 5 1 6 5 (t + 2u) on the triangular region bounded (t + 2u) du = t 6 1 + t − 2t2 . For any reasonable set M on the real line. 0 ≤ t ≤ 1 5 (14. e (t) = E [g (Y ) X = t] (14. hence eectively on the range of X. t=1 since the denominator is zero for that value of t. we must rule out no problem in practice.423 We dene. u = 1. where IM (t) g (u) fY X (ut) du fX (t) dt (14. Y } has joint density fXY (t. then. we postpone examination of this pattern until we consider a more general case.1).18) Thus we have.20) By denition.22) Theoretically. and u = t (see Figure 14. as in the discrete case. in this case. u) = t = 0.16) e (·) is dened for fX (t) > 0. u) dudt = IM (t) e (t) fX (t) dt.21) E [Y X = t] = ufY X (ut) du = 1 1 + t − 2t2 1 tu + 2u2 du = t 4 + 3t − 7t3 0≤t<1 6 (1 + t − 2t2 ) (14.19) (A) E [IM (X) g (Y )] = E [IM (X) e (X)] where Again. fY X (ut) = We thus have t + 2u 1 + t − 2t2 on the triangle (zero elsewhere) (14. E [IM (X) g (Y )] = = IM (t) g (u) fXY (t. for each t in the range of X. Example 14.
consider a narrow vertical strip of width ∆t • with the vertical line through t not vary appreciably with for any u.5: The regression problem). . u) does The mass in the strip is approximately Mass ≈ ∆t fXY (t. We are able to make an interpretation quite analogous to that for the discrete case.25) This interpretation points the way to the use of MATLAB in approximating the conditional expectation. • For any t in the range of X t (between 0 and 1 in this case). This also points the way to practical MATLAB calculations. Consider the MATLAB treatment of the example under consideration. In the MATLAB handling of joint absolutely continuous random variables. we divide the region into narrow vertical strips. Also.424 CHAPTER 14. at its center. u) du = ∆tfX (t) u=0 is approximately (14. then fXY (t.24) • The center of mass in the strip is Moment Mass Center of mass = ≈ ∆t ufXY (t. The success of the discrete approach in approximating the theoretical value in turns supports the validity of the interpretation. u) du (14. structure. Then we deal with each of these by dividing the vertical strips to form the grid The center of mass of the discrete distribution over one of the t chosen for the approximation must lie close to the actual center of mass of the probability in the strip.1: The density function for Example 14. "The Regression Problem" (Section 14. CONDITIONAL EXPECTATION.1. REGRESSION Figure 14.3 (Basic calculation and interpretation). u) du = ∆tfX (t) ufY X (ut) du = e (t) (14. If the strip is narrow enough.23) • The moment of the mass in the strip about the line Moment ≈ ∆t ufXY (t. this points to the general result on regression in the section.
14.EYx.*P).2: Theoretical and approximate conditional expectation for above (p. since the approximation determines the center of mass of the discretized mass which approximates the center of the actual mass in each vertical strip. t.2) Figure 14. % Density as string variable tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density eval(f) % Evaluation of string variable Use array operations on X.^2)). Y.2*X. % Approximate values eYx = (4 + 3*X . these cases and this interpretation are rather limited and do not provide the basis for the range of applicationstheoretical and ./sum(P).425 f = '(6/5)*(t + 2*u). PX. % Theoretical expression plot(X. 424). It also indicates that the interpretation is reasonable. u. The agreement of the theoretical and approximate values is quite good enough for practical purposes./(6*(1 + X . Analysis of However.7*X. PY.eYx) % Plotting details (see Figure~14.4 Extension to the general case Most examples for which we make numerical calculations will be one of the types above.1.^3). and P EYx = sum(u.*(u>=t)'.X. these cases is built upon the intuitive notion of conditional distributions.
. REGRESSION practicalwhich characterize modern probability theory.28) is always a constant. b). The condition (A) For a detailed treatment and proofs. e (X) = E [g (Y ) X] a. The next condition is just the property (B) noted above. where the t≥0 E [X] (14. Frequently.30) of the If the parameter random variable device. By the special case of the Radon Nikodym theorem (E19) ("(E19)".26) t in the range of X. This extension to possible is extremely important. the range of X The conditional expectationE [g (Y ) X = t] = e (t) E [IM (X) g (Y )] = E [IM (X) e (X)] e (·) is a function. (CE1) Dening condition. At this point it has little apparent signicance.s. Thus fXH (tu) = ue−ut for X ∼ exponential (u). has positive we have We have a tie to the simple case of conditioning with respect to an event.e. By the uniqueness property (E5) ("(E5) ". We make a denition based on these facts. except that it must include the two special cases studied in the previous sections. 600).s. any of a number of books on measuretheoretic probability may be consulted. e (X) is a random variable and E [g (Y )] The concept is abstract.s. determine the expected life . (i. the function exists e (·)always and is such that random variable e (X) is unique a.29) X and vector valued (CE1a) If X Y do not need to be real valued. E [IM (X) e (X)] = E [g (Y ) X ∈ M ] P (X ∈ M ) The special case which is obtained by setting for all M to include the entire range of X so that IM (X (ω)) = 1 ω is useful in many theoretical and applied problems. SOLUTION H ∼ uniform (a. we have the property (A) for all E [IM (X) g (Y )] = E [IM (X) e (X)] where e (t) = E [g (Y ) X = t] C = {X ∈ M } (14. 600). except for a set of ω of probability zero). the data It may seem strange that we should complicate the problem of determining conditional expectation e (X) = E [g (Y ) X] supplied in a problem makes this the expedient procedure. it is not clear why the term conditional expectation should be used. Denition. In each case examined above. CONDITIONAL EXPECTATION. although and Y g (Y ) is real valued. Also. In Appendix F we tabulate a number of key properties of conditional expectation. then P (X ∈ M ) > 0. 2. p. then using IC = IM (X) (B) E [IM (X) g (Y )] = E [g (Y ) X ∈ M ] P (X ∈ M ) (14. E [g (Y )] = E{E [g (Y ) X]} E [g (Y )] by rst getting the then taking expectation of that function.s. We examine several of these properties. If probability. since (A) holds for all reasonable (Borel) sets.27) Two properties of expectation are crucial here: 1. (CE1b) Law of total probability.426 CHAPTER 14. is the a.4: Use of the law of total probability Suppose the time to failure of a device is a random quantity parameter u is the value of a parameter random variable H. Example 14. unique function dened on such that (A) Note that for all Borel sets Expectation M (14. The justication rests in certain formal properties which are based on the dening condition (A) and other properties of expectation. is called property (CE1) (p. i E [IM (X) g (Y )] = E [IM (X) e (X)] Note that for each Borel set M on the codomain of X (14. p. then e (X) is unique a. We seek a basis for extension (which includes the special cases). 426).
35) is easily established by extension to any nite linear combination Independence. the uniqueness property (E5) ("(E5) ". 600) and the product rule (E18) ("(E18)". with some reservation for the fact that A similar development holds for conditional expectation. mathematical induction. These properties for expectation yield most The next three properties. 427) forms the basis for the solution of the regresson problem in the next section.s. e (X) is a random variable.33) a = 1/100. E{IM (X) [ae1 (X) + be2 (X)]} a. 427) provides another condition for independence. Since the equalities hold for any Borel implies M. (CE6) e (X) = E [g (Y ) X] a. of the other essential properties for expectation.s. b = 2/100. and e (X) = E [ag (Y ) + bh (Z) X] a.s.31) E [XH = u] = 1/u Thus and fH (u) = 1 . E [X] = 100ln (2) ≈ 69. . linearity. we examine one of them. A formal proof utilizes uniqueness (E5) ("(E5) ". The resulting constant value of the conditional expec E [g (Y )] in order for the law of total probability to hold. by (CE1) by linearity of expectation by (CE1) by linearity of expectation 600) for expectation E [IM (X) e (X)] E{IM (X) [ag (Y ) + bh (Z)]} a.s. {X. p. along with the dening condition provide the expectation like character.s. = aE [IM (X) e1 (X)] + bE [IM (X) e2 (X)] a. For any constants a. Property (CE6) (p.s.s. i E [h (X) g (Y )] = E [h (X) e (X)] a.s. Property (CE5) (p. An (14. (CE5) 427).s. (CE2) Linearity. b (14. = = = e2 (X) = E [h (Z) X] .32) E [X] = For 1 b−a b a 1 ln (b/a) du = u b−a (14. b−a a<u<b (14.s. for all E [IN (Y ) X] = E [IN (Y )] a.s. 600) for expectation. e (X) = ae1 (X) + be2 (X) a.31. for Borel functions all Borel sets N g on the codomain of Y Since knowledge of X does not aect the likelihood that expectation should not be aected by the value of tation must be Y will take on any set of values. aE [IM (X) g (Y )] + bE [IM (X) h (Z)] a. This is property (CE2) (p. then conditional X. p. This restriction causes little problem for applications at the level of this treatment.427 We use the law of total probability: E [X] = E{E [XH]} = Now by assumption E [XH = u] fH (u) du (14. and monotone convergence. positivity/monotonicity. p. for any Borel function h . In order to get some sense of how these properties root in basic properties of expectation. VERIFICATION Let e1 (X) = E [g (Y ) X] . Y } is an independent pair i i E [g (Y ) X] = E [g (Y )] a.34) E [ag (Y ) + bh (Z) X] = aE [g (Y ) X] + bE [h (Z) X] a. unique a.s.
u) = 6 (t + 2u) 5 on the triangular region bounded by t = 0. IDEAS OF A PROOF OF (CE6) (p. We list them here (as well as in Appendix F) for reference. n h (X) = i=1 ai IMi (X). 427) 1. then we should be able to replace It certainly seem reasonable to suppose that if X by t in {X.s. the limits are equal. Y } has density fXY (t. For h ≥ 0. They are essential in the study of Markov sequences and of a class of random sequences known as submartingales. so that E [g (t. The pair {X. Determine E [ZX = t]. b. (CE9) Repeated conditioning If X = h (W ) . e (X) ≥ 0. (14. to get some insight into how the various properties arise. Properties (CE8) (p.5: Use of property (CE10) (p. there is a seqence of nonnegative.s.38) We show in Example 14. h (X) = IM (X). u = 1. For h = h+ − h− .428 CHAPTER 14.36) Since corresponding terms in the sequences are equal. 427). 428) are peculiar to conditional expectation. CONDITIONAL EXPECTATION. Property (CE10) (p. (CE2) (p. Y )] a. (CE10) Under conditions on g that are nearly always met in practice a. the result follows by linearity g = g + − g − . then E{E [g (Y ) X] W } = E{E [g (Y ) W ] X} = E [g (Y ) X] a. It is easy to establish in the two elementary cases developed in previous sections. They play an essential role in many theoretical developments. Example 14. If E [g (X.s.s. E [hn (X) g (Y )] [U+2197]E [h (X) g (Y )] Now by positivity. Y ) X = t]. and u=t (14. and E [hn (X) e (X)] [U+2197]E [h (X) e (X)] (14. SOLUTION . By monotone convergence (CE4). g ≥ 0. g ≥ 0. For 3. 428) below provide useful aids to computation. the result follows by linearity. Y ) X = t] = E [g (t. for any Borel function h This property says that any function of the conditioning random vector may be treated as a constant factor. If E [g (X. then the value of X should not aect the value of Y. we sketch the ideas of a proof of (CE6) (p.3 (Basic calculation and interpretation) that E [Y X = t] = Let 4 + 3t − 7t3 0≤t<1 6 (1 + t − 2t2 ) (14. For 2. Y ) X = t] = E [g (t. 427). 428) Consider again the distribution for Example 14. This combined with (CE10) (p.37) This somewhat formal property is highly useful in many theoretical developments. We provide an interpretation after the development of regression theory in the next section. The next property is highly intuitive and very useful. . Y } is independent. then E [g (X. 426).3 (Basic calculation and interpretation). (CE8) E [h (X) g (Y ) X] = h (X) E [g (Y ) X] a.39) Z = 3X 2 + 2XY . 4. 428) assures this. [PX ] X = t. Its proof in the general case is quite sophisticated. Y ) X = t] = E [g (t. Again. REGRESSION Examination shows this to be the result of replacing IM (X) in (CE1) (p. Y )] a. Y } is an independent pair. Y ) X = t] a. 428) and (CE9) (p. [PX ] {X. For 5. 426) with arbitrary h (X). this is (CE1) (p.s. the result again follows by linearity. Y ) X = t] to get E [g (t. simple hn [U+2197]h.
6: The conditional distribution function As in Example 14. measuretheoretic treatment shows that it may for all t in the range of not be true that FY X (· t) is a distribution function X. If the parameter random variable H ∼ uniform (a. b).44) FY (u) = E FY X (uX) = If there is a conditional density FY X (ut) FX (dt) (14.41) E [IE C] = Denition. Example 14. 428) E [ZX = t] = 3t2 + 2tE [Y X = t] = 3t2 + Conditional probability 4t + 3t2 − 7t4 3 (1 + t − 2t2 ) (14. SOLUTON As in Example 14. where the fXH (tu) = ue−ut t ≥ 0 Then (14.4 (Use of the law of total probability).47) A careful. distribution function FY X (uX) = P (Y ≤ uX) = E I(−∞. this is seldom a problem. The conditional probability E. P (EC) E [IE IC ] = = P (EC) P (C) P (C) of event (14. by the law of total probability (CE1b) (p.48) t FXH (tu) = 0 ue−us ds = 1 − e−ut 0 ≤ t (14. 426). 428). (CE8) (p.u] (Y ) X Then.42) In this manner. take the assumption on the conditional distribution to mean X∼ exponential (u). and (CE10) (p. we have (14. given X. there is no need for a separate theory of conditional probability. determine the distribution function FX .429 By linearity. we extend the concept conditional expectation.43) We may dene the conditional P (EX) = E [IE X] Thus. given an event. (14. Modeling assumptions often start with such a family of distribution functions or density functions.49) .46) u FY X (ut) = −∞ fY X (rt) dr so that fY X (ut) = ∂ FY X (ut) ∂u (14. is (14. suppose parameter u is the value of a parameter random variable H. However.45) fY X such that P (Y ∈ M X = t) = M then fY X (rt) dr (14.4 (Use of the law of total probability).40) In the treatment of mathematical expectation. we note that probability may be expressed as an expectation P (E) = E [IE ] For conditional probability. in applications.
52) The following example uses a discrete conditional distribution and marginal distribution to obtain the joint distribution for the pair. Let is chosen by a random selection from the integers from 1 through 20 (say by drawing a card from a box).m+1). both ones.51) t yields the expression for fX (t) = 1 b−a b 1 + t2 t e−bt − a 1 + t2 t (14. end PS = sum(P).54) The following MATLAB procedure calculates the joint probabilities and arranges them as on the plane.50) =1− Dierentiation with respect to 1 e−bt − e−at t (b − a) fX (t) e−at t>0 (14. disp('Joint distribution N. S. for N ∼ uniform on the integers 1 through 20. given N = i. P = zeros(n. 1/6). for i = 1:n P(i.1:N(i)+1) = PN(i)*ibinom(N(i). 0 ≤ j ≤ i. we have S conditionally binomial (i. so that E [S] = 1 1 · 6 20 20 i= i=1 20 · 21 7 = = 1. Example 14. j).0:N(i)).. the probability of a match on any throw is 1/6. A pair of dice is thrown S be the number of matches (i.m p = input('Enter the probability of success '). P (N = i) = 1/20 1 ≤ i ≤ 20. S}. PN = input('Enter PROBABILITIES for N ').430 CHAPTER 14. P. REGRESSION By the law of total probability FX (t) = FXH (tu) fH (u) du = 1 b−a b 1 − e−ut du = 1 − a 1 b−a b e−ut du a (14. N = input('Enter VALUES of N '). CONDITIONAL EXPECTATION. m = max(N). n = length(N).). S = j) = P (S = jN = i) P (N = i) Now (14. P = rot90(P). and marginal PS') randbern % Call for the procedure Enter the probability of success 1/6 Enter VALUES of N 1:20 .p. both twos. etc. S = 0:m. Since there are 36 pairs of numbers for the two dice and six possible matches. P (N = i.e. % file randbern.53) E [SN = i] = i/6. For any pair (i.7: A random number A number N N of Bernoulli trials N times. Determine the joint distribution for SOLUTION {N.75 6 · 20 · 2 4 (14. Since the i throws of the dice constitute a Bernoulli sequence with probability 1/6 of a success (a match).
.2.1 for the discrete case. we seek a function r taken to be that for which the average square of the error is a minimum. which is the center of mass for the distribution. In the treatment of expectation. 427) proves to be key to obtaining the result.56) 2 + E (e (X) − r (X)) + 2E [(Y − e (X)) (r (X) − e (X))] h (X) = r (X) − e (X) to assert that (14.1.2 (Basic calculations and interpretation) for the absolutely continuous e (t) = E [Y X = t] might be the best estimate for Y when the value X (ω) = t Let is observed. (14. line of Y on X. In the interpretive Example 14.58) E (Y − r (X)) r (X) = e (X) a. 2 = E (Y − e (X)) 2 + E (e (X) − r (X)) X (ω) = t 2 (14.59) The rst term on the right hand side is xed. The optimum We have some hints of possibilities. then Y (ω) − r (X (ω)) is the error of estimate. of Thus. is observed. with a minimum at zero i Y r=e is the best rule. is found in Example 14. the second term is nonnegative. A pair Here we are concerned with A value {X. e (X) = E [Y X]. We now turn to the problem of nding the best u = at + b. The best That is. case. we nd that the best constant to approximate a random variable in the mean square sense is the mean value. This is dened for argument t in N such that P (X ∈ N ) = 0.20) Joint distribution N.s. r. The property (CE6) (p.) and for any choice of r we may take E [(Y − e (X)) (r (X) − e (X))] = E [(Y − e (X)) h (X)] = 0 Thus (14. which may in some cases be an ane function. we nd the conditional expectation E [Y X = ti ] is the center of mass for the conditional distribution at X = ti . For a given value the best mean square estimate is u = e (t) = E [Y X = t] The graph of the range of (14. but more often is not.55) In the treatment of linear regression.5 The regression problem We introduce the regression problem in the treatment of linear regression. We investigate this possibility. and marginal PS ES = S*PS' ES = 1.60) X. u = e (t) vs t is known as the and is unique except possibly on a set regression curve of Y on X. more general regression. If X (ω) Y (ω) is the actual value and estimation rule (function) r (X (ω)) r (·) is is the estimate.431 Enter PROBABILITIES for N 0.05*ones(1. such that E (Y − r (X)) function of this form denes the regression function 2 is a minimum. We may write (CE6) (p.57) e (X) is xed (a. P. We desire a rule for obtaining the best estimate of the corresponding value Y (ω). 427) in the form E [h (X) (Y − e (X))] = 0 2 for any Consider E (Y − r (X)) = E (Y − e (X)) Now 2 = E (Y − e (X) + e (X) − r (X)) 2 (14. considering thin vertical strips. we determine the best ane function.7500 % Agrees with the theoretical value 14. Determination of the regression curve is thus determination of the conditional expectation. reasonable function h. Y } of real random variables has a joint distribution.s. S. This suggests the possibility that A similar result.
then is the horizontal line through and the since Cov [X. E [ZX = t] = E IM (X) X 2 X = t + E [IN (X) 2Y X = t] = IM (t) t2 + IN (t) 2E [Y X = t] Now (14. REGRESSION Example 14. u) = 60t2 u for 0 ≤ t ≤ 1. Suppose distribution. 0 ≤ u ≤ 1 − t. Y ). Then the X = t is E [ZX = t]. Y } fXY (t. Y }). (14. Y } has joint density the triangular region bounded by that {X. 1]. and the best mean square estimate of Z given Z = g (X. 0 ≤ t < 1 3 (1 − t) 3 (14. agrees with regression line is u = 0 + E [Y ]. Y } is independent. pair {X.9 (Estimate of a function of {X. SOLUTION By linearity and (CE8) (p. Z} has a joint Example 14. 428). regression curve of the regression line. Determine Figure 14.63) E [Y X = t] = ufY X (ut) du = 1 (1 − t) 2 0 1−t 2u2 du = 2 (1 − t) 2 · 2 = 3 (1 − t) .9: Estimate of a function of Suppose the pair {X. Y ] = 0 u = E [Y X = t] = E [Y ]. The result extends to functions of X and Y. u = 0. so that the u = E [Y ].3). 1/2] and N = (1/2.3: The density function for Example 14.8: Regression curve for an independent pair If the pair Y on X {X. Integration shows 2u (1 − t) 2 fX (t) = 30t2 (1 − t) .64) . This.61) Z={ where X2 2Y for for X ≤ 1/2 X > 1/2 = IM (X) X 2 + IN (X) 2Y E [ZX = t]. This is t = 0. 0 ≤ t ≤ 1 Consider 2 and fY X (ut) = on the triangle (14. of course.62) M = [0. and u = 1 − t (see Figure 14. CONDITIONAL EXPECTATION.432 CHAPTER 14.
for this would give an expression incorrect for all t in the range of Note that the indicator functions separate the two expressions. PX. APPROXIMATION tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 100 Enter number of Y approximation points 100 Enter expression for joint density 60*t.') % Plotting details % See Figure~14. 1].*t.5). 1/2] and the second holds on the interval N = (1/2. EZx = sum(G. The rst holds on the interval X.4 The t is quite sucient for practical purposes.^2 + 2*(t>0. % Theoretical plot(X.'k'.65) M = [0.^2.EZx.^2 + (4/3)*(X>0. PY.*u.5).433 so that E [ZX = t] = IM (t) t2 + IN (t) 4 (1 − t) 3 (14. and P G = (t<=0. Y.*(1X). The two expressions t2 and (4/3) (1 − t)must not be added.X. in spite of the moderate number of approximation points.*(u<=1t) Use array operations on X.5).eZx. . The dierence in expressions for the two intervals of X values is quite clear./sum(P). % Approximation eZx = (X<=0.'k.*u.*P). u. t.*X.5).
CONDITIONAL EXPECTATION. (see Figure 14. u) : u ≥ t} (14.4: of {X.10: Estimate of a function of Suppose the pair {X. where Q = {(t. 0 ≤ t ≤ 1. Y }). Theoretical and approximate regression curves for Example 14. Y ) 3XY. u) fY X (ut) du + 3t 6t +1 t IQc (t. REGRESSION Figure 14. Y } fXY (t.66) Z={ 2X 2 3XY for for X≤Y X>Y = IQ (X.67) Determine E [ZX = t]. Y ) 2X 2 + IQc (X.68) = t2 + u du + t 2t2 t2 u + u2 du = 0 (14. The usual integration shows fX (t) = Consider 3 2t2 + 1 .69) . on the unit square 0 ≤ t ≤ 1.434 CHAPTER 14. 5 and fY X (ut) = 2 t2 + u 2t2 + 1 on the square (14. SOLUTION E [ZX = t] = 2t2 4t2 2t2 + 1 1 IQ (t. u) = 0≤u≤1 6 5 t2 + u .9 (Estimate of a function Example 14. u) ufY X (ut) du −t5 + 4t4 + 2t2 . 0≤t≤1 2t2 + 1 (14.5). Y } has joint density {X.
. but sum of the two parts is needed for each t. Here they serve to set the eective limits of integration. There they provide a separation of two parts of the result.10 (Estimate of a function of {X.9 (Estimate of a function of {X.5: The density and regions for Example 14. Y }). Y }) Note the dierent role of the indicator functions than in Example 14.435 Figure 14.
Figure 14. Y. Although the same number of approximation points are use as in Figure 14. plot(X.X.'k'.^2 Use array operations on X. t.4 (Example 14.^5 + 4*X.^2).'k.eZx.*P).*u. Y })). An alternate approach is simply to dene .10 (Estimate of a function APPROXIMATION tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (6/5)*(t. EZx = sum(G.*(u<t). PX.^4 + 2*X. The theoretical and approximate are barely distinguishable on the plot. Given our approach to conditional expectation.') % Plotting details + u) P % Approximate % Theoretical % See Figure~14.436 CHAPTER 14.*(u>=t) + 3*t.6 {X. u. eZx = (X. Y }). and G = 2*t.EZx.6: Theoretical and approximate regression curves for Example 14. PY.^2. REGRESSION of {X. CONDITIONAL EXPECTATION.9 (Estimate of a function of the fact that the entire region is included in the grid means a larger number of eective points in this example./sum(P).^2 + 1)./(2*X. the fact that it solves the regression problem is a matter that requires proof using properties of of conditional expectation.
2 Problems on Conditional Expectation.0396 0 0. in particular. then determine its properties.) (See Exercise 17 (Exercise 11. 444. the two concepts.4 0. the best mean square estimator of g (Y ). given (X.0363 0. we do not change the estimate of g (Y ). We examine the special form (CE9a) Put E{E [g (Y ) X] X.0510 0. Regression2 For the distributions in Exercises 13 a.8.70) g (Y ). .0216 0.5 4. The pair {X.2 0. given X. 426). those ordinarily used in applications at the level of this treatment will have a variance.5 0. There are some technical dierences which do not aect most applications.38: npr08_07)): (Solution on p.1 0.s. Z] = e (X) a. given X. Z). However. and then take the best mean square estimate of that.6924.s.0077 Table 14. On the other hand if we rst get the best mean sqare estimate of g (Y ).1 has the joint distribution (in le npr08_07. we get the best mean square estimate of g (Y ).1 2. Not all random variables have this property. given (X. Z] X} = E [g (Y ) X] e1 (X. to interpret the formal property (CE9) (p.7 0. In words. (14.0297 0 4.4/>. if we take the best estimate of given and E [e1 (X. properties of 600)) show the essential equivalence of expectation (including the uniqueness property (E5) ("(E5) ". given X. Once that is established.5275t + 0. then take the best mean sqare estimate of that.1089 0.0594 0.0528 0.0189 0.0891 0.0484 1. Z].9 0. Y } P (X = t. on X. Z) = E [g (Y ) X.0203 0. Z). We use the interpretation of e (X) = E [g (Y ) X] as the best mean square estimator of g (Y ). determine the regression curve of Z X.2 The regression line of Y on X is u = 0. Then (CE9b) can be expressed E [e (X) X. Exercise 14.0324 0.0726 2. our dening condition (CE1) (p. Z) X] = e (X) a.0231 0.0440 0. For the function curve of Y on X and compare with the regression line of Y on Z = g (X.0495 0.0 3. 14.0090 0. hence a nite second moment.0132 3. given X.71) t = u = 7.1320 0. 437). p. Y ) indicated in each case.17) from "Problems on Mathematical Expectation"). This yields.72) 2 This content is available online at <http://cnx. (X.org/content/m24441/1.m (Section 17. The alternate approach assumes the second moment E X2 is nite.437 the conditional expectation to be the solution to the regression problem. Determine the regression b. Z} = E{E [g (Y ) X. Y = u) (14. Z = X 2 Y + X + Y  (14.0405 0. Z).8 3.
0017 5 0. in minutes.19) from "Problems on Mathematical Expectation").001 0.0167 11 0.76) For the joint densities in Exercises 411 below .0056 0.0056 0. The data are as follows (in le npr08_09.0182 0.0244 0.0084 0. 445.049 0.0063 7 0.73) t = u = 12 10 9 5 3 1 3 5 1 0. in hours.012 0.0049 0.0234 0.40: npr08_09)): P (X = t.011 0.0160 0.0182 0.8.015 0.0050 0.0172 17 0.0168 0.39: npr08_08)): P (X = t.0026 15 0.0134 0.009 2 0.0098 0.0256 0.0044 0.0252 0.0156 0. Y ) X (Y − 4) + IQc (X.0104 0. Z = (Y − 2.0072 3 0.18) from "Problems on Mathematical Expectation").) The pair (See Exercise 18 (Exercise 11.0140 0. of training.0070 0.010 0.0040 0.0084 0.0140 0.74) √ Z = IQ (X.051 0.003 1.0096 0.0180 0.3 The regression line of Y on X is u = −0.8. Y ) XY 2 Exercise 14.0217 19 0. {X.5 0.0060 0.050 0.045 2.065 0.058 0.017 Table 14.2584t + 5.0056 0.3 (Solution on p.0120 0.137 0.0112 0.0080 0.0112 0.163 0.0182 0. and Y X is the amount is the time to perform the task.031 0.0256 0.0196 0. Y = u) (14.3051.0090 13 0.0180 0.0160 0.0091 0.0070 0.005 0. u) : u ≤ t} (14.0196 0.0156 0.0234 0.0072 0. Y = u) (14.m (Section 17. Y } has the joint distribution (in le npr08_08.0126 0.0108 0.0012 0.0035 0.7793t + 4.025 3 0. Q = {(t.0120 0.0162 0.0064 0.0096 0.0260 0.0040 0.0060 0.0223 Table 14.75) t = u = 5 4 3 2 1 1 0.0108 0.0126 0.0060 0.039 0.0045 9 0.0280 0. CONDITIONAL EXPECTATION.8) /X (14. 444.0256 0.070 0.0200 0.0054 0.033 0.0038 0. Data were kept on the eect of training time on the time to perform a job on a production line.4 The regression line of Y on X is u = −0.0191 0.m (Section 17.0096 0.001 0.061 0.039 0.0091 0.438 CHAPTER 14.) (See Exercise 19 (Exercise 11. REGRESSION Exercise 14.0081 0.0060 0.6110.5 0.0064 0.0268 0.0204 0.0056 0.2 (Solution on p.
0) .6 (Solution on p. 0 ≤ t ≤ 2 4 (14. 0 ≤ t ≤ 2 88 88 (14. The regression line of Y on X is u = 1 − t.23) from "Problems on Variance. 446. Covariance.) (See Exercise 16 (Exercise 8. and Exercise 24 (Exercise 12.1] (t) 6t2 1 − t2 Exercise 14. Exercise 25 (Exercise 11. 0 ≤ u ≤ 2.27) from "Problems on Mathematical Expectation".17) from " Problems On Random Vectors and Joint Distributions". and Exercise 27 (Exercise 12. 2 (14. and Exercise 23 (Exercise 12. for fXY (t.0] (t) 6t2 (t + 1) + I(0.) (See Exercise 17 (Exercise 8.81) fX (t) = I[−1. 1) The regression line of (14.7 3 3 (1 + t) 1 + 4t + t2 = 1 + 5t + 5t2 + t3 .25) from "Problems on Variance.80) Y on X is u = (4t + 5) /9. 0 ≤ u ≤ min{1. Covariance. (0. Exercise 26 (Exercise 11.79) (Solution on p.439 a. Linear Regression"). regression line of Y on X is u = 0. u) = 12t2 u on (−1. for fX (t) = Exercise 14.77) fX (t) = 2 (1 − t) .4 (Solution on p.23) from "Problems on Mathematical Expectation". 0 ≤ u ≤ 2 (1 − t).) (See Exercise 13 (Exercise 8. u) = 0 ≤ t ≤ 2.13) from " Problems On Random Vectors and Joint Distributions". 2 − t}. 0 ≤ t ≤ 1 Exercise 14.8 (Solution on p. Check these with a discrete approximation. (1. u) = 3 88 2t + 3u2 The 0 ≤ t ≤ 2.10) from "Problems On Random Vectors and Joint Distributions". (Exercise 12. 24 11 tu for .) (See Exercise 10 (Exercise 8.15) from "Problems On Random Vectors and Joint Distributions". curve of Y on X and compare with the regression line of Y on b. fXY (t. fXY (t. Covariance.4876. u) = 1 for 0 ≤ t ≤ 1. fXY (t. 445.16) from " Problems On Random Vectors and Joint Distributions".27) from "Problems on Variance. 0 ≤ u ≤ 1 + t.0958t + 1.) and Exercise 25 (See Exercise 15 (Exercise 8.26) from "Problems on Mathematical Expectation". and Exercise 26 (Exercise 12. 0) . Linear Regression"). 1 (t + 1) .5 (Solution on p. u) = 0 ≤ t ≤ 2. Covariance. Linear Regression"). Linear Regression"). Linear Regression"). 1) . Exercise 20 (Exercise 11. 445. Exercise 23 (Exercise 11. the parallelogram with vertices fXY (t. Determine analytically the regression X. The regression line of 1 8 (t + u) Y on X is u = −t/11 + 35/33.20) from "Problems on Mathematical Expectation". (14.26) from "Problems on Variance. (0. Exercise 14.24) from "Problems on Variance. 446.25) from "Problems on Mathematical Expectation".78) fX (t) = Exercise 14. Exercise 27 (Exercise 11. 446. Covariance.
The regression line of Y on X is u = (−124t + 368)/431.

Exercise 14.9 (Solution on p. 447.)
(See Exercise 18 (Exercise 8.18) from "Problems On Random Vectors and Joint Distributions", Exercise 28 (Exercise 11.28) from "Problems on Mathematical Expectation", and Exercise 28 (Exercise 12.28) from "Problems on Variance, Covariance, Linear Regression".)

f_{XY}(t, u) = (3/23)(t + 2u) for 0 ≤ t ≤ 2, 0 ≤ u ≤ max{2 − t, t}   (14.83)

f_X(t) = I_{[0,1]}(t) (6/23)(2 − t) + I_{(1,2]}(t) (6/23) t^2

The regression line of Y on X is u = 1.0561t − 0.2603.

Exercise 14.10 (Solution on p. 447.)
(See Exercise 21 (Exercise 8.21) from "Problems On Random Vectors and Joint Distributions", Exercise 31 (Exercise 11.31) from "Problems on Mathematical Expectation", and Exercise 29 (Exercise 12.29) from "Problems on Variance, Covariance, Linear Regression".)

f_{XY}(t, u) = (2/13)(t + 2u) for 0 ≤ t ≤ 2, 0 ≤ u ≤ min{2t, 3 − t}   (14.84)

f_X(t) = I_{[0,1]}(t) (12/13) t^2 + I_{(1,2]}(t) (6/13)(3 − t)

The regression line of Y on X is u = −0.1359t + 1.0839.

Exercise 14.11 (Solution on p. 447.)
(See Exercise 22 (Exercise 8.22) from "Problems On Random Vectors and Joint Distributions", Exercise 32 (Exercise 11.32) from "Problems on Mathematical Expectation", and Exercise 30 (Exercise 12.30) from "Problems on Variance, Covariance, Linear Regression".)

f_{XY}(t, u) = I_{[0,1]}(t) (3/8)(t^2 + 2u) + I_{(1,2]}(t) (9/14) t^2 u^2, for 0 ≤ u ≤ 1   (14.85)

f_X(t) = I_{[0,1]}(t) (3/8)(t^2 + 1) + I_{(1,2]}(t) (3/14) t^2

The regression line of Y on X is u = 0.0817t + 0.5989.

For the distributions in Exercises 12-16 below:

a. Determine analytically E[Z|X = t].
b. Use a discrete approximation to calculate the same functions.

Exercise 14.12 (Solution on p. 448.)
(See Exercise 37 (Exercise 11.37) from "Problems on Mathematical Expectation".)

f_{XY}(t, u) = (3/88)(2t + 3u^2), for 0 ≤ t ≤ 2, 0 ≤ u ≤ 1 + t   (14.86)

Z = I_{[0,1]}(X) 4X + I_{(1,2]}(X)(X + Y)   (14.87)
Exercise 14.13 (Solution on p. 448.)
(See Exercise 38 (Exercise 11.38) from "Problems on Mathematical Expectation".)

f_{XY}(t, u) = (24/11) t u for 0 ≤ t ≤ 2, 0 ≤ u ≤ min{1, 2 − t};  f_X(t) = I_{[0,1]}(t) (12/11) t + I_{(1,2]}(t) (12/11) t (2 − t)^2   (14.88)

Z = I_M(X, Y)(1/2)X + I_{M^c}(X, Y) Y^2,  M = {(t, u) : u > t}   (14.89)

Exercise 14.14 (Solution on p. 448.)
(See Exercise 39 (Exercise 11.39) from "Problems on Mathematical Expectation".)

f_{XY}(t, u) = (3/23)(t + 2u) for 0 ≤ t ≤ 2, 0 ≤ u ≤ max{2 − t, t};  f_X(t) = I_{[0,1]}(t) (6/23)(2 − t) + I_{(1,2]}(t) (6/23) t^2   (14.90)

Z = I_M(X, Y)(X + Y) + I_{M^c}(X, Y) 2Y,  M = {(t, u) : max(t, u) ≤ 1}   (14.91)

Exercise 14.15 (Solution on p. 449.)

f_{XY}(t, u) = (2/13)(t + 2u) for 0 ≤ t ≤ 2, 0 ≤ u ≤ min{2t, 3 − t};  f_X(t) = I_{[0,1]}(t) (12/13) t^2 + I_{(1,2]}(t) (6/13)(3 − t)   (14.92)

Z = I_M(X, Y)(X + Y) + I_{M^c}(X, Y) 2Y^2,  M = {(t, u) : t ≤ 1, u ≥ 1}   (14.93)

Exercise 14.16 (Solution on p. 449.)

f_{XY}(t, u) = I_{[0,1]}(t) (3/8)(t^2 + 2u) + I_{(1,2]}(t) (9/14) t^2 u^2, for 0 ≤ u ≤ 1;  f_X(t) = I_{[0,1]}(t) (3/8)(t^2 + 1) + I_{(1,2]}(t) (3/14) t^2   (14.94)

Z = I_M(X, Y) X + I_{M^c}(X, Y) X Y,  M = {(t, u) : u ≤ min(1, 2 − t)}   (14.95)

Exercise 14.17 (Solution on p. 450.)
Suppose X ∼ uniform on 0 through n and Y ∼ conditionally uniform on 0 through i, given X = i.

a. Determine E[Y] from E[Y|X = i].
b. Determine the joint distribution for {X, Y} for n = 50 (see Example 7 (Example 14.7: A random number N of Bernoulli trials) from "Conditional Expectation, Regression" for a possible approach). Use jcalc to determine E[Y]; compare with the theoretical value.

Exercise 14.18 (Solution on p. 450.)
Suppose X ∼ uniform on 1 through n and Y ∼ conditionally uniform on 1 through i, given X = i.

a. Determine E[Y] from E[Y|X = i].
b. Determine the joint distribution for {X, Y} for n = 50 (see Example 7 (Example 14.7: A random number N of Bernoulli trials) from "Conditional Expectation, Regression" for a possible approach). Use jcalc to determine E[Y]; compare with the theoretical value.
Exercise 14.19 (Solution on p. 451.)
Suppose X ∼ uniform on 1 through n and Y ∼ conditionally binomial (i, p), given X = i.

a. Determine E[Y] from E[Y|X = i].
b. Determine the joint distribution for {X, Y} for n = 50 and p = 0.3. Use jcalc to determine E[Y]; compare with the theoretical value.

Exercise 14.20 (Solution on p. 451.)
A number X is selected randomly from the integers 1 through 100. A pair of dice is thrown X times. Let Y be the number of sevens thrown on the X tosses.

a. Determine E[Y] from E[Y|X = i].
b. Determine the joint distribution and use jcalc to determine E[Y]; compare with the theoretical value.

Exercise 14.21 (Solution on p. 451.)
A number X is selected randomly from the integers 1 through 100. Each of two people draw X times, independently and randomly, a number from 1 to 10. Let Y be the number of matches (i.e., both draw ones, both draw twos, etc.). Determine E[Y] from E[Y|X = k], and then determine the joint distribution and use jcalc to determine E[Y].

Exercise 14.22 (Solution on p. 452.)
Suppose X has density function f_X(t) = 4 − 2t for 1 ≤ t ≤ 2, and E[Y|X = t] = 10t. Determine E[Y].

Exercise 14.23 (Solution on p. 452.)
Suppose X has density function f_X(t) = 30t^2 (1 − t)^2 for 0 ≤ t ≤ 1, and E[Y|X = t] = (2/3)(1 − t) for 0 ≤ t < 1. Determine E[Y].   (14.96)

Exercise 14.24 (Solution on p. 452.)
Suppose X has density function f_X(t) = (15/16) t^2 (2 − t)^2 for 0 ≤ t ≤ 2, and E[Y|X = t] = (2/3)(2 − t) for 0 ≤ t < 2. Determine E[Y].   (14.97)

Exercise 14.25 (Solution on p. 452.)
Suppose the pair {X, Y} is independent, with X ∼ Poisson (µ) and Y ∼ Poisson (λ). Show that, given X + Y = n,

P(X = k|X + Y = n) = C(n, k) p^k (1 − p)^{n−k}, 0 ≤ k ≤ n, for p = µ/(µ + λ)   (14.98)

Exercise 14.26 (Solution on p. 452.)
Use the fact that g(X, Y) = g*(X, Z, Y), where g*(t, v, u) does not vary with v. Extend property (CE10) ("(CE10)", p. 601) to show

E[g(X, Y)|X = t, Z = v] = E[g(t, Y)|X = t, Z = v] a.s. [P_{XZ}]

Exercise 14.27 (Solution on p. 452.)
Use the result of Exercise 14.26 and properties (CE9a) ("(CE9a)", p. 601) and (CE10) to show that

E[g(X, Y)|Z = v] = ∫ E[g(t, Y)|X = t, Z = v] F_{X|Z}(dt|v) a.s. [P_Z]

Exercise 14.28 (Solution on p. 452.)
A shop which works past closing time to complete jobs on hand tends to speed up service on any job received during the last hour before closing. Suppose the arrival time of a job in hours before closing time is a random variable T ∼ uniform [0, 1]. Service time Y for a unit received in that period is conditionally exponential β(2 − u), given T = u. Determine the distribution function for Y.
Exercise 14.29 (Solution on p. 453.)
Time to failure X of a manufactured unit has an exponential distribution. The parameter is dependent upon the manufacturing process. Suppose the parameter is the value of random variable H ∼ uniform on [0.005, 0.01], and X is conditionally exponential (u), given H = u. Determine P(X > 150). Determine E[X|H = u] and use this to determine E[X].

Exercise 14.30 (Solution on p. 453.)
A system has n components. Time to failure of the i-th component is X_i and the class {X_i : 1 ≤ i ≤ n} is iid exponential (λ). The system fails if any one or more of the components fails. Let W be the time to system failure. What is the probability the failure is due to the i-th component?

Suggestion. Note that W = X_i iff X_j > X_i for all j ≠ i. Thus

{W = X_i} = {(X_1, X_2, ···, X_n) ∈ Q},  Q = {(t_1, t_2, ···, t_n) : t_k > t_i, ∀ k ≠ i}   (14.99)

P(W = X_i) = E[I_Q(X_1, X_2, ···, X_n)] = E{E[I_Q(X_1, X_2, ···, X_n)|X_i]}   (14.100)
Solutions to Exercises in Chapter 14

Solution to Exercise 14.1 (p. 437)
The regression line of Y on X is u = 0.5275t + 0.6924.

npr08_07 (Section 17.38: npr08_07)   % Data are in X, Y, P
jcalc
EYx = sum(u.*P)./sum(P);             % E[Y|X = t]
EZx = sum(G.*P)./sum(P);             % E[Z|X = t], G the matrix of values of Z
disp([X;EYx;EZx]')

Solution to Exercise 14.2 (p. 437)
The regression line of Y on X is u = −0.2584t + 5.6110.

npr08_08 (Section 17.39: npr08_08)   % Data are in X, Y, P
jcalc
EYx = sum(u.*P)./sum(P);
M = u <= t;
G = sqrt(t).*(u - 4).*M + t.*u.^2.*(1 - M);
EZx = sum(G.*P)./sum(P);
disp([X;EYx;EZx]')
Solution to Exercise 14.3 (p. 438)
The regression line of Y on X is u = −0.7793t + 4.3051.

npr08_09 (Section 17.40: npr08_09)   % Data are in X, Y, P
jcalc
EYx = sum(u.*P)./sum(P);
G = (u - 2.8)./t;
EZx = sum(G.*P)./sum(P);
disp([X;EYx;EZx]')

Solution to Exercise 14.4 (p. 439)
The regression line of Y on X is u = 1 − t.

f_{Y|X}(u|t) = 1/(2(1 − t)), 0 ≤ u ≤ 2(1 − t)   (14.101)

E[Y|X = t] = (1/(2(1 − t))) ∫_0^{2(1−t)} u du = 1 − t, 0 ≤ t ≤ 1   (14.102)

tuappr: [0 1] [0 2] 200 400 u<=2*(1-t)
EYx = sum(u.*P)./sum(P);
plot(X,EYx)   % Straight line thru (0,1), (1,0)

Solution to Exercise 14.5 (p. 439)
The regression line of Y on X is u = −t/11 + 35/33.

f_{Y|X}(u|t) = (t + u)/(2(t + 1)), 0 ≤ t ≤ 2, 0 ≤ u ≤ 2   (14.103)

E[Y|X = t] = (1/(2(t + 1))) ∫_0^2 (tu + u^2) du = 1 + 1/(3t + 3), 0 ≤ t ≤ 2   (14.104)
tuappr: [0 2] [0 2] 200 200 (1/8)*(t+u)
EYx = sum(u.*P)./sum(P);
eyx = 1 + 1./(3*X + 3);
plot(X,EYx,X,eyx)   % Plots nearly indistinguishable
Solution to Exercise 14.6 (p. 439)
The regression line of Y on X is u = 0.0958t + 1.4876.

f_{Y|X}(u|t) = (2t + 3u^2)/[(1 + t)(1 + 4t + t^2)], 0 ≤ u ≤ 1 + t   (14.105)

E[Y|X = t] = (1/[(1 + t)(1 + 4t + t^2)]) ∫_0^{1+t} (2tu + 3u^3) du   (14.106)

= (t + 1)(t + 3)(3t + 1)/(4(1 + 4t + t^2)), 0 ≤ t ≤ 2   (14.107)

tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t + 3*u.^2).*(u<=1+t)
EYx = sum(u.*P)./sum(P);
eyx = (X+1).*(X+3).*(3*X+1)./(4*(1 + 4*X + X.^2));
plot(X,EYx,X,eyx)   % Plots nearly indistinguishable
Solution to Exercise 14.7 (p. 439)
The regression line of Y on X is u = (4t + 5)/9.

f_{Y|X}(u|t) = I_{[−1,0]}(t) 2u/(t + 1)^2 + I_{(0,1]}(t) 2u/(1 − t^2) on the parallelogram   (14.108)

E[Y|X = t] = I_{[−1,0]}(t) (1/(t + 1)^2) ∫_0^{t+1} 2u^2 du + I_{(0,1]}(t) (1/(1 − t^2)) ∫_t^1 2u^2 du   (14.109)

= I_{[−1,0]}(t) (2/3)(t + 1) + I_{(0,1]}(t) (2/3)(t^2 + t + 1)/(t + 1)   (14.110)

tuappr: [-1 1] [0 1] 200 100 12*t.^2.*u.*((u<=min(t+1,1))&(u>=max(0,t)))
EYx = sum(u.*P)./sum(P);
M = X <= 0;
eyx = (2/3)*(X+1).*M + (2/3)*(1-M).*(X.^2 + X + 1)./(X + 1);
plot(X,EYx,X,eyx)   % Plots quite close
Solution to Exercise 14.8 (p. 439)
The regression line of Y on X is u = (−124t + 368)/431.

f_{Y|X}(u|t) = I_{[0,1]}(t) 2u + I_{(1,2]}(t) 2u/(2 − t)^2   (14.111)

E[Y|X = t] = I_{[0,1]}(t) ∫_0^1 2u^2 du + I_{(1,2]}(t) (1/(2 − t)^2) ∫_0^{2−t} 2u^2 du   (14.112)

= I_{[0,1]}(t) (2/3) + I_{(1,2]}(t) (2/3)(2 − t)   (14.113)

tuappr: [0 2] [0 1] 200 100 (24/11)*t.*u.*(u<=min(1,2-t))
EYx = sum(u.*P)./sum(P);
M = X <= 1;
eyx = (2/3)*M + (2/3)*(2 - X).*(1-M);
plot(X,EYx,X,eyx)   % Plots quite close
Solution to Exercise 14.9 (p. 440)
The regression line of Y on X is u = 1.0561t − 0.2603.

f_{Y|X}(u|t) = I_{[0,1]}(t) (t + 2u)/(2(2 − t)) + I_{(1,2]}(t) (t + 2u)/(2t^2), 0 ≤ u ≤ max(2 − t, t)   (14.114)

E[Y|X = t] = I_{[0,1]}(t) (1/(2(2 − t))) ∫_0^{2−t} (tu + 2u^2) du + I_{(1,2]}(t) (1/(2t^2)) ∫_0^t (tu + 2u^2) du   (14.115)

= I_{[0,1]}(t) (1/12)(t − 2)(t − 8) + I_{(1,2]}(t) (7/12) t   (14.116)

tuappr: [0 2] [0 2] 200 200 (3/23)*(t+2*u).*(u<=max(2-t,t))
EYx = sum(u.*P)./sum(P);
M = X <= 1;
eyx = (1/12)*(X-2).*(X-8).*M + (7/12)*X.*(1-M);
plot(X,EYx,X,eyx)   % Plots quite close
Solution to Exercise 14.10 (p. 440)
The regression line of Y on X is u = −0.1359t + 1.0839.

f_{Y|X}(u|t) = I_{[0,1]}(t) (t + 2u)/(6t^2) + I_{(1,2]}(t) (t + 2u)/(3(3 − t)), 0 ≤ u ≤ min(2t, 3 − t)   (14.117)

E[Y|X = t] = I_{[0,1]}(t) (1/(6t^2)) ∫_0^{2t} (tu + 2u^2) du + I_{(1,2]}(t) (1/(3(3 − t))) ∫_0^{3−t} (tu + 2u^2) du   (14.118)

= I_{[0,1]}(t) (11/9) t + I_{(1,2]}(t) (1/18)(t^2 − 15t + 36)   (14.119)

tuappr: [0 2] [0 2] 200 200 (2/13)*(t+2*u).*(u<=min(2*t,3-t))
EYx = sum(u.*P)./sum(P);
M = X <= 1;
eyx = (11/9)*X.*M + (1/18)*(X.^2 - 15*X + 36).*(1-M);
plot(X,EYx,X,eyx)   % Plots quite close
Solution to Exercise 14.11 (p. 440)
The regression line of Y on X is u = 0.0817t + 0.5989.

f_{Y|X}(u|t) = I_{[0,1]}(t) (t^2 + 2u)/(t^2 + 1) + I_{(1,2]}(t) 3u^2, 0 ≤ u ≤ 1   (14.120)

E[Y|X = t] = I_{[0,1]}(t) (1/(t^2 + 1)) ∫_0^1 (t^2 u + 2u^2) du + I_{(1,2]}(t) ∫_0^1 3u^3 du   (14.121)

= I_{[0,1]}(t) (3t^2 + 4)/(6(t^2 + 1)) + I_{(1,2]}(t) (3/4)   (14.122)

tuappr: [0 2] [0 1] 200 100 (3/8)*(t.^2 + 2*u).*(t<=1) + (9/14)*t.^2.*u.^2.*(t>1)
EYx = sum(u.*P)./sum(P);
M = X <= 1;
eyx = M.*(3*X.^2 + 4)./(6*(X.^2 + 1)) + (3/4)*(1 - M);
plot(X,EYx,X,eyx)   % Plots quite close
Solution to Exercise 14.12 (p. 440)
Z = I_M(X) 4X + I_N(X)(X + Y), with M = [0, 1] and N = (1, 2]. Use of linearity, (CE8) ("(CE8)", p. 601), and (CE10) ("(CE10)", p. 601) gives

E[Z|X = t] = I_M(t) 4t + I_N(t)(t + E[Y|X = t])   (14.123)

= I_M(t) 4t + I_N(t) (t + (t + 1)(t + 3)(3t + 1)/(4(1 + 4t + t^2)))   (14.124)

% Continuation of Exercise 14.6
G = 4*t.*(t<=1) + (t + u).*(t>1);
EZx = sum(G.*P)./sum(P);
M = X <= 1;
ezx = 4*X.*M + (X + (X+1).*(X+3).*(3*X+1)./(4*(1 + 4*X + X.^2))).*(1-M);
plot(X,EZx,X,ezx)   % Plots nearly indistinguishable
Solution to Exercise 14.13 (p. 440)

Z = I_M(X, Y)(1/2)X + I_{M^c}(X, Y) Y^2,  M = {(t, u) : u > t}   (14.125)

I_M(t, u) = I_{[0,1]}(t) I_{[t,1]}(u);  I_{M^c}(t, u) = I_{[0,1]}(t) I_{[0,t]}(u) + I_{(1,2]}(t) I_{[0,2−t]}(u)   (14.126)

E[Z|X = t] = I_{[0,1]}(t) [ (t/2) ∫_t^1 2u du + ∫_0^t u^2 · 2u du ] + I_{(1,2]}(t) ∫_0^{2−t} u^2 · 2u/(2 − t)^2 du   (14.127)

= I_{[0,1]}(t) (1/2) t (1 − t^2 + t^3) + I_{(1,2]}(t) (1/2)(2 − t)^2   (14.128)

% Continuation of Exercise 14.8
Q = u > t;
G = (1/2)*t.*Q + u.^2.*(1-Q);
EZx = sum(G.*P)./sum(P);
M = X <= 1;
ezx = (1/2)*X.*(1 - X.^2 + X.^3).*M + (1/2)*(2 - X).^2.*(1-M);
plot(X,EZx,X,ezx)   % Plots nearly indistinguishable
Solution to Exercise 14.14 (p. 441)

Z = I_M(X, Y)(X + Y) + I_{M^c}(X, Y) 2Y,  M = {(t, u) : max(t, u) ≤ 1}   (14.129)

I_M(t, u) = I_{[0,1]}(t) I_{[0,1]}(u);  I_{M^c}(t, u) = I_{[0,1]}(t) I_{(1,2−t]}(u) + I_{(1,2]}(t) I_{[0,t]}(u)   (14.130)

E[Z|X = t] = I_{[0,1]}(t) [ (1/(2(2 − t))) ∫_0^1 (t + u)(t + 2u) du + (1/(2 − t)) ∫_1^{2−t} u(t + 2u) du ] + I_{(1,2]}(t) 2E[Y|X = t]   (14.131)

= I_{[0,1]}(t) (1/12)(2t^3 − 30t^2 + 69t − 60)/(t − 2) + I_{(1,2]}(t) (7/6) t   (14.132)

% Continuation of Exercise 14.9
M = X <= 1;
Q = (t<=1)&(u<=1);
G = (t+u).*Q + 2*u.*(1-Q);
EZx = sum(G.*P)./sum(P);
ezx = (1/12)*M.*(2*X.^3 - 30*X.^2 + 69*X - 60)./(X-2) + (7/6)*X.*(1-M);
plot(X,EZx,X,ezx)
Solution to Exercise 14.15 (p. 441)

Z = I_M(X, Y)(X + Y) + I_{M^c}(X, Y) 2Y^2,  M = {(t, u) : t ≤ 1, u ≥ 1}   (14.133)

I_M(t, u) = I_{[0,1]}(t) I_{[1,2]}(u);  I_{M^c}(t, u) = I_{[0,1]}(t) I_{[0,1)}(u) + I_{(1,2]}(t) I_{[0,3−t]}(u)   (14.134)

E[Z|X = t] = I_{[0,1/2]}(t) (1/(6t^2)) ∫_0^{2t} 2u^2 (t + 2u) du   (14.135)

+ I_{(1/2,1]}(t) (1/(6t^2)) [ ∫_0^1 2u^2 (t + 2u) du + ∫_1^{2t} (t + u)(t + 2u) du ] + I_{(1,2]}(t) (1/(3(3 − t))) ∫_0^{3−t} 2u^2 (t + 2u) du   (14.136)

= I_{[0,1/2]}(t) (32/9) t^2 + I_{(1/2,1]}(t) (1/36)(80t^3 − 6t^2 − 5t + 2)/t^2 + I_{(1,2]}(t) (1/9)(−t^3 + 15t^2 − 63t + 81)   (14.137)

tuappr: [0 2] [0 2] 200 200 (2/13)*(t + 2*u).*(u<=min(2*t,3-t))
M = (t<=1)&(u>=1);
Q = (t+u).*M + 2*(1-M).*u.^2;
EZx = sum(Q.*P)./sum(P);
N1 = X <= 1/2;
N2 = (X > 1/2)&(X<=1);
N3 = X > 1;
ezx = (32/9)*N1.*X.^2 + (1/36)*N2.*(80*X.^3 - 6*X.^2 - 5*X + 2)./X.^2 ...
    + (1/9)*N3.*(-X.^3 + 15*X.^2 - 63*X + 81);
plot(X,EZx,X,ezx)
Solution to Exercise 14.16 (p. 441)

Z = I_M(X, Y) X + I_{M^c}(X, Y) X Y,  M = {(t, u) : u ≤ min(1, 2 − t)}   (14.138)

E[Z|X = t] = I_{[0,1]}(t) ∫_0^1 t (t^2 + 2u)/(t^2 + 1) du + I_{(1,2]}(t) [ ∫_0^{2−t} 3t u^2 du + ∫_{2−t}^1 3t u^3 du ]   (14.139)

= I_{[0,1]}(t) t + I_{(1,2]}(t) ( −(13/4) t + 12t^2 − 12t^3 + 5t^4 − (3/4) t^5 )   (14.140)

tuappr: [0 2] [0 1] 200 100 (t<=1).*(t.^2 + 2*u)./(t.^2 + 1) + 3*u.^2.*(t>1)
M = u <= min(1,2-t);
G = M.*t + (1-M).*t.*u;
EZx = sum(G.*P)./sum(P);
N = X <= 1;
ezx = X.*N + (1-N).*(-(13/4)*X + 12*X.^2 - 12*X.^3 + 5*X.^4 - (3/4)*X.^5);
plot(X,EZx,X,ezx)
Solution to Exercise 14.17 (p. 441)

a. E[Y|X = i] = i/2, so

E[Y] = Σ_{i=0}^n E[Y|X = i] P(X = i) = (1/(n + 1)) Σ_{i=0}^n i/2 = n/4   (14.141)

b. P(X = i) = 1/(n + 1), 0 ≤ i ≤ n; P(Y = k|X = i) = 1/(i + 1), 0 ≤ k ≤ i; hence

P(X = i, Y = k) = 1/((n + 1)(i + 1)), 0 ≤ i ≤ n, 0 ≤ k ≤ i

n = 50;
X = 0:n; Y = 0:n;
P0 = zeros(n+1,n+1);
for i = 0:n
    P0(i+1,1:i+1) = (1/((n+1)*(i+1)))*ones(1,i+1);
end
P = rot90(P0);
jcalc: X Y P
- - - - - - - - - - -
EY = dot(Y,PY)
EY = 12.5000   % Comparison with part (a): 50/4 = 12.5
Solution to Exercise 14.18 (p. 441)

a. E[Y|X = i] = (i + 1)/2, so

E[Y] = Σ_{i=1}^n E[Y|X = i] P(X = i) = (1/n) Σ_{i=1}^n (i + 1)/2 = (n + 3)/4   (14.142)

b. P(X = i) = 1/n, 1 ≤ i ≤ n; P(Y = k|X = i) = 1/i, 1 ≤ k ≤ i; hence

P(X = i, Y = k) = 1/(ni), 1 ≤ i ≤ n, 1 ≤ k ≤ i

n = 50;
X = 1:n; Y = 1:n;
P0 = zeros(n,n);
for i = 1:n
    P0(i,1:i) = (1/(n*i))*ones(1,i);
end
P = rot90(P0);
jcalc: P X Y
- - - - - - - - - - -
EY = dot(Y,PY)
EY = 13.2500   % Comparison with part (a): 53/4 = 13.25
Solution to Exercise 14.19 (p. 442)

a. E[Y|X = i] = ip, so

E[Y] = Σ_{i=1}^n E[Y|X = i] P(X = i) = (p/n) Σ_{i=1}^n i = p(n + 1)/2   (14.143)

b. P(X = i) = 1/n, 1 ≤ i ≤ n; P(Y = k|X = i) = ibinom(i,p,0:i), 0 ≤ k ≤ i.

n = 50; p = 0.3;
X = 1:n; Y = 0:n;
P0 = zeros(n,n+1);   % Could use randbern
for i = 1:n
    P0(i,1:i+1) = (1/n)*ibinom(i,p,0:i);
end
P = rot90(P0);
jcalc: X Y P
- - - - - - - - - - -
EY = dot(Y,PY)
EY = 7.6500   % Comparison with part (a): 0.3*51/2 = 7.65

Solution to Exercise 14.20 (p. 442)

a. P(X = i) = 1/n and E[Y|X = i] = i/6, so

E[Y] = (1/6) Σ_{i=1}^n i/n = (n + 1)/12   (14.144)

b.
n = 100; p = 1/6;
X = 1:n; Y = 0:n;
PX = (1/n)*ones(1,n);
P0 = zeros(n,n+1);   % Could use randbern
for i = 1:n
    P0(i,1:i+1) = (1/n)*ibinom(i,p,0:i);
end
P = rot90(P0);
jcalc
EY = dot(Y,PY)
EY = 8.4167   % Comparison with part (a): 101/12 = 8.4167
Solution to Exercise 14.21 (p. 442)
Same as Exercise 14.20, except p = 1/10, so that E[Y] = (n + 1)/20.

n = 100; p = 0.1;
X = 1:n; Y = 0:n;
PX = (1/n)*ones(1,n);
P0 = zeros(n,n+1);   % Could use randbern
for i = 1:n
    P0(i,1:i+1) = (1/n)*ibinom(i,p,0:i);
end
P = rot90(P0);
jcalc
- - - - - - - - - - -
EY = dot(Y,PY)
EY = 5.0500   % Comparison with part (a): EY = 101/20 = 5.05
Solution to Exercise 14.22 (p. 442)

E[Y] = ∫ E[Y|X = t] f_X(t) dt = ∫_1^2 10t (4 − 2t) dt = 40/3   (14.145)
Solution to Exercise 14.23 (p. 442)

E[Y] = ∫ E[Y|X = t] f_X(t) dt = ∫_0^1 20t^2 (1 − t)^3 dt = 1/3   (14.146)
Solution to Exercise 14.24 (p. 442)

E[Y] = ∫ E[Y|X = t] f_X(t) dt = (5/8) ∫_0^2 t^2 (2 − t)^3 dt = 2/3   (14.147)
Solution to Exercise 14.25 (p. 442)
Use of property (T1) ("(T1)", p. 387) and generating functions shows that X ∼ Poisson (µ) and Y ∼ Poisson (λ) imply X + Y ∼ Poisson (µ + λ). Then

P(X = k|X + Y = n) = P(X = k, X + Y = n)/P(X + Y = n) = P(X = k, Y = n − k)/P(X + Y = n)   (14.148)

= [e^{−µ} µ^k/k!][e^{−λ} λ^{n−k}/(n − k)!] / [e^{−(µ+λ)} (µ + λ)^n/n!] = (n!/(k!(n − k)!)) µ^k λ^{n−k}/(µ + λ)^n   (14.149)

Put p = µ/(µ + λ) and q = 1 − p = λ/(µ + λ) to get the desired result.
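The identity is easy to check numerically. The following MATLAB sketch is an added illustration, not part of the original solution; the values µ = 2, λ = 3, n = 6 are assumed for the example.

mu = 2; lambda = 3; n = 6;                           % illustrative parameters
k = 0:n;
pXk = exp(-mu)*mu.^k./factorial(k);                  % P(X = k)
pYnk = exp(-lambda)*lambda.^(n-k)./factorial(n-k);   % P(Y = n - k)
pS = exp(-(mu+lambda))*(mu+lambda)^n/factorial(n);   % P(X + Y = n)
pcond = pXk.*pYnk/pS;                                % P(X = k|X + Y = n)
p = mu/(mu + lambda);
pbin = arrayfun(@(j) nchoosek(n,j), k).*p.^k.*(1-p).^(n-k);
disp(max(abs(pcond - pbin)))                         % agreement to rounding error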
Solution to Exercise 14.26 (p. 442)

E[g(X, Y)|X = t, Z = v] = E[g*(X, Z, Y)|(X, Z) = (t, v)] = E[g*(t, v, Y)|(X, Z) = (t, v)]   (14.150)

= E[g(t, Y)|X = t, Z = v] a.s. [P_{XZ}]   (14.151)
Solution to Exercise 14.27 (p. 442)
By (CE9) ("(CE9)", p. 601), By (CE10), by (CE10)
(14.150)
(14.151)
E [g (X, Y ) Z] = E{E [g (X, Y ) X, Z] Z} = E [e (X, Z) Z] a.s.
E [e (X, Z) Z = v] = E [e (X, v) Z = v] = e (t, v) FXZ (dtv) a.s.
By Exercise 14.26,
(14.152)
(14.153)
E [g (X, Y ) X = t, Z = v] FXZ (dtv) = E [g (t, Y ) X = t, Z = v] FXZ (dtv) a.s. [PZ ]
Solution to Exercise 14.28 (p. 442)

F_Y(v) = ∫_0^1 F_{Y|T}(v|u) f_T(u) du = ∫_0^1 (1 − e^{−β(2−u)v}) du   (14.156)

= 1 − e^{−2βv} (e^{βv} − 1)/(βv) = 1 − e^{−βv} (1 − e^{−βv})/(βv), 0 < v   (14.157)
Solution to Exercise 14.29 (p. 443)

F_{X|H}(t|u) = 1 − e^{−ut};  f_H(u) = 1/0.005 = 200, 0.005 ≤ u ≤ 0.01   (14.158)

F_X(t) = 1 − 200 ∫_{0.005}^{0.01} e^{−ut} du = 1 − (200/t)(e^{−0.005t} − e^{−0.01t})   (14.159)

P(X > 150) = (200/150)(e^{−0.75} − e^{−1.5}) ≈ 0.3323   (14.160)

E[X|H = u] = 1/u, so E[X] = 200 ∫_{0.005}^{0.01} (1/u) du = 200 ln 2 ≈ 138.63   (14.161)
Solution to Exercise 14.30 (p. 443)
Let Q = {(t_1, t_2, ···, t_n) : t_k > t_i, k ≠ i}. Then

P(W = X_i) = E[I_Q(X_1, X_2, ···, X_n)] = E{E[I_Q(X_1, X_2, ···, X_n)|X_i]}   (14.162)

E[I_Q(X_1, X_2, ···, t, ···, X_n)] = Π_{k≠i} P(X_k > t) = [1 − F_X(t)]^{n−1}   (14.163)

so that

P(W = X_i) = ∫ [1 − F_X(t)]^{n−1} F_X(dt)   (14.164)

If F_X is continuous, strictly increasing, and zero for t < 0, put u = F_X(t), du = f_X(t) dt. Then t = 0 ∼ u = 0 and t = ∞ ∼ u = 1, so that

P(W = X_i) = ∫_0^1 (1 − u)^{n−1} du = ∫_0^1 u^{n−1} du = 1/n   (14.165)
Chapter 15
Random Selection
15.1 Random Selection [1]
15.1.1 Introduction
The usual treatments deal with a single random variable or a fixed, finite number of random variables, considered jointly. However, there are many common applications in which we select at random a member of a class of random variables and observe its value, or select a random number of random variables and obtain some function of those selected. This is formulated with the aid of a counting or selecting random variable N, which is nonnegative, integer valued. It may be independent of the class selected, or may be related in some sequential way to members of the class. We consider only the independent case. Many important problems require optional random variables, sometimes called Markov times. These involve more theory than we develop in this treatment.

Some common examples:

1. Total demand of N customers; N independent of the individual demands.
2. Total service time for N units; N independent of the individual service times.
3. Net gain in N plays of a game; N independent of the individual gains.
4. Extreme values of N random variables; N independent of the individual values.
5. Random sample of size N; N is usually determined by properties of the sample observed.
6. Decide when to play on the basis of past results; N dependent on past.
units
15.1.2 A useful modelrandom sums
As a basic model, we consider the sum of a random number of members of an iid class. In order to have a concrete interpretation to help visualize the formal patterns, we think of the demand of a random number of customers. We suppose the number of customers a model to be used for a variety of applications.
N
is independent of the individual demands. We formulate
A
An
basic sequence {Xn : 0 ≤ n} [Demand of n customers] incremental sequence {Yn : 0 ≤ n} [Individual demands]
n
These are related as follows:
Xn =
k=0
Yk
for
n≥0
and
Xn = 0
for
n<0
Yn = Xn − Xn−1
for all
n
(15.1)
1 This
content is available online at <http://cnx.org/content/m23652/1.5/>.
455
456
CHAPTER 15. RANDOM SELECTION
A
D
counting random variable N.
(the random sum)
If
N =n Yk =
then
n
of the
Yk
are added to give the
compound demand
(15.2)
N
∞
∞
D=
k=0
I{N =k} Xk =
k=0 k=0
I{k} (N ) Xk ∞.
Note.
In some applications the counting random variable may take on the idealized value
For example,
in a game that is played until some specied result occurs, this may never happen, so that no nite value can be assigned to independent of the
We assume throughout, unless specically stated otherwise, that:
1. 2. 3.
Independent selection from an iid incremental sequence
N. In such a case, it is necessary to decide what value X∞ is to Yn (hence of the Xn ), we rarely need to consider this possibility.
be assigned.
For
N
X0 = Y0 = 0 {Yk : 1 ≤ k} is iid {N, Yk : 0 ≤ k} is
an independent class
We utilize repeatedly two important propositions:

1. E[h(D)|N = n] = E[h(X_n)], n ≥ 0.
2. M_D(s) = g_N[M_Y(s)]. If the Y_n are nonnegative integer valued, then so is D, and g_D(s) = g_N[g_Y(s)].
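Before deriving these, note that for simple, nonnegative integer-valued Y the composition g_D(s) = g_N[g_Y(s)] can be carried out numerically on coefficient vectors, since multiplication of generating functions corresponds to convolution of coefficients. The following MATLAB sketch is an added illustration; the two coefficient vectors are assumed values for the example.

gN = [0.2 0.5 0.3];       % P(N = 0), P(N = 1), P(N = 2) -- assumed example
gY = [0.1 0.6 0.3];       % P(Y = 0), P(Y = 1), P(Y = 2) -- assumed example
gD = gN(1);               % term for N = 0
yk = 1;                   % coefficients of gY^0
for k = 1:length(gN)-1
    yk = conv(yk,gY);     % coefficients of gY^k
    gD = [gD zeros(1,length(yk)-length(gD))] + gN(k+1)*yk;
end
disp(gD)                  % distribution of D; the entries sum to one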
DERIVATION

We utilize properties of generating functions, moment generating functions, and conditional expectation.

1. E[I_{{n}}(N) h(D)] = E[h(D)|N = n] P(N = n) by definition of conditional expectation, given an event. Now, I_{{n}}(N) h(D) = I_{{n}}(N) h(X_n) and E[I_{{n}}(N) h(X_n)] = P(N = n) E[h(X_n)]. Hence E[h(D)|N = n] P(N = n) = P(N = n) E[h(X_n)]. Division by P(N = n) gives the desired result.
2. By the law of total probability (CE1b), M_D(s) = E[e^{sD}] = E{E[e^{sD}|N]}. By proposition 1 and the product rule for moment generating functions,

E[e^{sD}|N = n] = E[e^{sX_n}] = Π_{k=1}^n E[e^{sY_k}] = M_Y^n(s)   (15.3)

Hence

M_D(s) = Σ_{n=0}^∞ M_Y^n(s) P(N = n) = g_N[M_Y(s)]   (15.4)

A parallel argument holds for g_D in the integer-valued case.
Remark. The result on M_D and g_D may be developed without use of conditional expectation.

M_D(s) = E[e^{sD}] = Σ_{n=0}^∞ E[I_{{N=n}} e^{sX_n}] = Σ_{n=0}^∞ P(N = n) E[e^{sX_n}]   (15.5)

= Σ_{n=0}^∞ P(N = n) M_Y^n(s) = g_N[M_Y(s)]   (15.6)
Example 15.1: A service shop
Suppose the number N of jobs brought to a service shop in a day is Poisson (8). One fourth of these are items under warranty for which no charge is made. Others fall in one of two categories. One half of the arriving jobs are charged for one hour of shop time; the remaining one fourth are charged for two hours of shop time. Thus, the individual shop hour charges Y_k have the common distribution

Y = [0 1 2] with probabilities PY = [1/4 1/2 1/4]   (15.7)

Make the basic assumptions of our model. Determine P(D ≤ 4).

SOLUTION

g_N(s) = e^{8(s−1)};  g_Y(s) = (1/4)(1 + 2s + s^2)   (15.8)

According to the formula developed above,

g_D(s) = g_N[g_Y(s)] = exp((8/4)(1 + 2s + s^2) − 8) = e^{4s} e^{2s^2} e^{−6}   (15.9)

Expand the exponentials in power series about the origin, multiply out to get enough terms. The result of straightforward but somewhat tedious calculations is

g_D(s) = e^{−6} (1 + 4s + 10s^2 + (56/3)s^3 + (86/3)s^4 + ···)   (15.10)

Taking the coefficients of the generating function, we get

P(D ≤ 4) ≈ e^{−6} (1 + 4 + 10 + 56/3 + 86/3) = e^{−6} · 187/3 ≈ 0.1545   (15.11)
Example 15.2: A result on Bernoulli trials
Suppose the counting random variable N ∼ binomial (n, p) and Y_i = I_{E_i}, with P(E_i) = p_0. Then

g_N(s) = (q + ps)^n and g_Y(s) = q_0 + p_0 s   (15.12)

By the basic result on random selection, we have

g_D(s) = g_N[g_Y(s)] = [q + p(q_0 + p_0 s)]^n = [(1 − p p_0) + p p_0 s]^n   (15.13)

so that D ∼ binomial (n, p p_0).
In the next section we establish useful m-procedures for determining the generating function g_D and the moment generating function M_D for the compound demand for simple random variables, hence for determining the complete distribution. Obviously, these will not work for all problems. It may be helpful, if not entirely sufficient, in such cases to be able to determine the mean value E[D] and variance Var[D]. To this end, we establish the following expressions for the mean and variance.
Example 15.3: Mean and variance of the compound demand

E[D] = E[N] E[Y] and Var[D] = E[N] Var[Y] + Var[N] E^2[Y]   (15.14)

DERIVATION

E[D] = E[Σ_{n=0}^∞ I_{{N=n}} X_n] = Σ_{n=0}^∞ P(N = n) E[X_n]   (15.15)

= E[Y] Σ_{n=0}^∞ n P(N = n) = E[Y] E[N]   (15.16)

E[D^2] = Σ_{n=0}^∞ P(N = n) E[X_n^2] = Σ_{n=0}^∞ P(N = n) {Var[X_n] + E^2[X_n]}   (15.17)
= Σ_{n=0}^∞ P(N = n) {n Var[Y] + n^2 E^2[Y]} = E[N] Var[Y] + E[N^2] E^2[Y]   (15.18)

Hence

Var[D] = E[N] Var[Y] + E[N^2] E^2[Y] − E^2[N] E^2[Y] = E[N] Var[Y] + Var[N] E^2[Y]   (15.19)

Example 15.4: Mean and variance for Example 15.1 (A service shop)
E[N] = Var[N] = 8. By symmetry E[Y] = 1, and Var[Y] = 0.25(0 + 2 + 4) − 1 = 0.5. Hence

E[D] = 8 · 1 = 8;  Var[D] = 8 · 0.5 + 8 · 1 = 12   (15.20)

15.1.3 Calculations for the compound demand

We have m-procedures for performing the calculations necessary to determine the distribution for a composite demand D when the counting random variable N and the individual demands Y_k are simple random variables with not too many values. In some cases, such as for a Poisson counting random variable, we are able to approximate by a simple random variable.

The procedure gend

If the Y_i are nonnegative, integer valued, then so is D, and there is a generating function. We examine a strategy for computation which is implemented in the m-procedure gend. Suppose

g_N(s) = p_0 + p_1 s + p_2 s^2 + ··· + p_n s^n   (15.21)

g_Y(s) = π_0 + π_1 s + π_2 s^2 + ··· + π_m s^m   (15.22)

The coefficients of g_N and g_Y are the probabilities of the values of N and Y, respectively. We enter these and calculate the coefficients for powers of g_Y:

gN = [p0 p1 ··· pn]    1 × (n+1)   Coefficients of gN
y  = [π0 π1 ··· πm]    1 × (m+1)   Coefficients of gY
y2 = conv(y,y)         1 × (2m+1)  Coefficients of gY^2
y3 = conv(y,y2)        1 × (3m+1)  Coefficients of gY^3
···
yn = conv(y,y(n−1))    1 × (nm+1)  Coefficients of gY^n   (15.23)

We wish to generate a matrix P whose rows contain the joint probabilities. The probabilities in the i-th row consist of the coefficients for the appropriate power of g_Y multiplied by the probability N has that value. To achieve this, we need a matrix, each of whose n+1 rows has nm + 1 elements, the length of yn. We begin by preallocating zeros to the rows; that is, we set P = zeros(n + 1, n*m + 1). The replacement probabilities for the i-th row are obtained by the convolution of gY and the power of gY for the previous row. When the matrix P is completed, we remove zero rows and columns, corresponding to missing values of N and D (i.e., values with zero probability). To orient the joint probabilities as on the plane, we rotate P ninety degrees counterclockwise. With the joint distribution, we may then calculate any desired quantities.
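The following minimal sketch implements the strategy just described; it is an added illustration, not the library gend procedure itself, and it uses the data of Example 15.5 below.

gN = (1/3)*[0 1 1 1];     % coefficients of gN (Example 15.5 below)
gY = 0.1*[5 4 1];         % coefficients of gY
n = length(gN) - 1; m = length(gY) - 1;
P = zeros(n+1,n*m+1);     % preallocate the rows
P(1,1) = gN(1);           % N = 0 puts all probability at D = 0
yk = 1;
for i = 1:n
    yk = conv(yk,gY);     % coefficients of gY^i
    P(i+1,1:length(yk)) = gN(i+1)*yk;   % i-th row of joint probabilities
end
PD = sum(P,1)             % marginal distribution of D over values 0:n*m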
Example 15.5: A compound demand
The number of customers in a major appliance store is equally likely to be 1, 2, or 3. Each customer buys 0, 1, or 2 items with respective probabilities 0.5, 0.4, 0.1. Customers buy independently, regardless of the number of customers. First we determine the matrices representing g_N and g_Y. The coefficients are the probabilities that each integer value is observed. Note that the zero coefficients for any missing powers must be included.

gN = (1/3)*[0 1 1 1];   % Note the zero coefficient in the zero position
gY = 0.1*[5 4 1];       % All powers 0 thru 2 have positive coefficients
gend
Do not forget zero coefficients for missing powers
Enter the gen fn COEFFICIENTS for gN  gN   % Coefficient matrix named gN
Enter the gen fn COEFFICIENTS for gY  gY   % Coefficient matrix named gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view distribution for D, call for gD
disp(gD)                % Optional display of complete distribution
         0    0.2917
    1.0000    0.3667
    2.0000    0.2250
    3.0000    0.0880
    4.0000    0.0243
    5.0000    0.0040
    6.0000    0.0003
EN = N*PN'
EN =   2
EY = Y*PY'
EY =   0.6000
ED = D*PD'
ED =   1.2000           % Agrees with theoretical EN*EY
P3 = (D>=3)*PD'
P3 =   0.1167
VD = (D.^2)*PD' - ED^2
VD =   1.1200           % Agrees with theoretical EN*VY + VN*EY^2
jcalc: N D P
EDn = sum(u.*P)./sum(P);
disp([N;EDn]')
    1.0000    0.6000    % Agrees with theoretical E[D|N = n] = n*EY
    2.0000    1.2000
    3.0000    1.8000
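A simulation provides an independent cross-check of both the gend results and the formulas of Example 15.3. This sketch is an addition; 100000 replications is an illustrative choice.

m = 100000;
cN = cumsum((1/3)*[1 1 1]);       % N uniform on {1,2,3}
cY = cumsum([0.5 0.4 0.1]);
D = zeros(m,1);
for r = 1:m
    N = 1 + sum(rand > cN);       % N = 1, 2 or 3
    D(r) = sum(sum(rand(N,1) > cY(1:2),2));   % each demand is 0, 1 or 2
end
[mean(D) var(D)]                  % should be near ED = 1.2 and VD = 1.12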
Example 15.6: A numerical example

g_N(s) = (1/5)(1 + s + s^2 + s^3 + s^4);  g_Y(s) = 0.1(5s + 3s^2 + 2s^3)   (15.24)

Note that the zero power is missing from gY, corresponding to the fact that P(Y = 0) = 0. The zero coefficients must be included.

gN = 0.2*[1 1 1 1 1];
gY = 0.1*[0 5 3 2];     % Note zero coefficient for missing zero power
gend
Do not forget zero coefficients for missing powers
Enter the gen fn COEFFICIENTS for gN  gN
Enter the gen fn COEFFICIENTS for gY  gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view distribution for D, call for gD
disp(gD)                % Optional display of complete distribution
         0    0.2000
    1.0000    0.1000
    2.0000    0.1100
    3.0000    0.1250
    4.0000    0.1155
    5.0000    0.1110
    6.0000    0.0964
    7.0000    0.0696
    8.0000    0.0424
    9.0000    0.0203
   10.0000    0.0075
   11.0000    0.0019
   12.0000    0.0003
p3 = (D == 3)*PD'       % P(D = 3)
P3 =  0.1250
P4_12 = ((D >= 4)&(D <= 12))*PD'
P4_12 = 0.4650          % P(4 <= D <= 12)

Example 15.7: Number of successes for random number N of trials
We are interested in the number of successes in N trials for a general counting random variable. This is a generalization of the Bernoulli case in Example 15.2 (A result on Bernoulli trials). Suppose, as in Example 15.5 (A compound demand), the number of customers in a major appliance store is equally likely to be 1, 2, or 3, and each buys at least one item with probability p = 0.6. Determine the distribution for the number D of buying customers.

SOLUTION
We use gN, gY, and gend.

gN = (1/3)*[0 1 1 1];   % Note zero coefficient for missing zero power
gY = [0.4 0.6];         % Generating function for the indicator function
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  gN
Enter gen fn COEFFICIENTS for gY  gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view distribution for D, call for gD
disp(gD)
         0    0.2080
    1.0000    0.4560
    2.0000    0.2640
    3.0000    0.0720
The procedure gend is limited to simple N and Y_k, with nonnegative integer values. Sometimes, a random variable with unbounded range may be approximated by a simple random variable. The following example utilizes such an approximation procedure for the counting random variable N.

Example 15.8: Solution of the shop time Example 15.1 (A service shop)
The number N of jobs brought to a service shop in a day is Poisson (8). The individual shop hour charges Y_k have the common distribution Y = [0 1 2] with probabilities PY = [1/4 1/2 1/4]. Under the basic assumptions of our model, determine P(D ≤ 4).

SOLUTION
Since Poisson N is unbounded, we need to check for a sufficient number of terms in a simple approximation. Then we proceed as in the simple case.

pa = cpoisson(8,10:5:30)    % Check for sufficient number of terms
pa =   0.2834    0.0173    0.0003    0.0000    0.0000
p25 = cpoisson(8,25)        % Check on choice of n = 25
p25 =  1.1722e-06
gN = ipoisson(8,0:25);      % Approximate gN
gY = 0.25*[1 2 1];
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  gN
Enter gen fn COEFFICIENTS for gY  gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view distribution for D, call for gD
disp(gD(D<=20,:))           % Display for D <= 20
         0    0.0025
    1.0000    0.0099
    2.0000    0.0248
    3.0000    0.0463
    4.0000    0.0711
    5.0000    0.0939
    6.0000    0.1099
    7.0000    0.1165
    8.0000    0.1132
    9.0000    0.1021
   10.0000    0.0861
   11.0000    0.0684
   12.0000    0.0515
   13.0000    0.0369
   14.0000    0.0253
   15.0000    0.0166
   16.0000    0.0105
   17.0000    0.0064
   18.0000    0.0037
   19.0000    0.0021
   20.0000    0.0012
sum(PD)                     % Check on sufficiency of approximation
ans =  1.0000
P4 = (D<=4)*PD'
P4 =   0.1545               % Theoretical value (4 places) = 0.1545
ED = D*PD'
ED =   8.0000               % Theoretical = 8 (Example 15.4)
VD = (D.^2)*PD' - ED^2
VD =  11.9999               % Theoretical = 12 (Example 15.4)
The m-procedures mgd and jmgd

The next example shows a fundamental limitation of the gend procedure. The values for the individual demands are not limited to integers, and there are considerable gaps between the values. In this case, we need to implement the moment generating function M_D rather than the generating function g_D. In the generating function case, it is as easy to develop the joint distribution for {N, D} as to develop the marginal distribution for D. For the moment generating function, the joint distribution requires considerably more computation. As a consequence, we find it convenient to have two m-procedures: mgd for the marginal distribution and jmgd for the joint distribution.

Instead of the convolution procedure used in gend to determine the distribution for the sums of the individual demands, the m-procedure mgd utilizes the m-function mgsum to obtain these distributions. The distributions for the various sums are concatenated into two row vectors, to which csort is applied to obtain the distribution for the compound demand. The procedure requires as input the generating function for N and the actual distribution, Y and PY, for the individual demands. For g_N, it is necessary to treat the coefficients as in gend. However, the actual values and probabilities in the distribution for Y are put into a pair of row matrices. If Y is integer valued, there are no zeros in the probability matrix for missing values.

Example 15.9: Noninteger values
A service shop has three standard charges for a certain class of warranty services it performs: $10, $12.50, and $15. The number of jobs received in a normal work day can be considered a random variable N which takes on values 0, 1, 2, 3, 4 with equal probabilities 0.2. The job types for arrivals may be represented by an iid class {Y_i : 1 ≤ i ≤ 4}, independent of the arrival process. The Y_i take on values 10, 12.5, 15 with respective probabilities 0.5, 0.3, 0.2. Let C be the total amount of services rendered in a day. Determine the distribution for C.

SOLUTION

gN = 0.2*[1 1 1 1 1];   % Enter data
Y = [10 12.5 15];
PY = 0.1*[5 3 2];
mgd                     % Call for procedure
Enter gen fn COEFFICIENTS for gN  gN
Enter VALUES for Y  Y
Enter PROBABILITIES for Y  PY
Values are in row matrix D; probabilities are in PD.
To view the distribution, call for mD.
disp(mD)                % Optional display of distribution
         0    0.2000
   10.0000    0.1000
   12.5000    0.0600
   15.0000    0.0400
   20.0000    0.0500
   22.5000    0.0600
   25.0000    0.0580
   27.5000    0.0240
   30.0000    0.0330
   32.5000    0.0450
   35.0000    0.0570
   37.5000    0.0414
   40.0000    0.0353
   42.5000    0.0372
   45.0000    0.0486
   47.5000    0.0468
   50.0000    0.0352
   52.5000    0.0187
   55.0000    0.0075
   57.5000    0.0019
   60.0000    0.0003
We next recalculate Example 15.6 (A numerical example), using mgd rather than gend.

Example 15.10: Recalculation of Example 15.6 (A numerical example)
In Example 15.6 (A numerical example), we have

g_N(s) = (1/5)(1 + s + s^2 + s^3 + s^4);  g_Y(s) = 0.1(5s + 3s^2 + 2s^3)   (15.25)

This means that the distribution for Y is Y = [1 2 3] and PY = 0.1*[5 3 2]. We use the same expression for gN as in Example 15.6, but use the actual distribution for Y.

gN = 0.2*ones(1,5);
Y = 1:3;
PY = 0.1*[5 3 2];
mgd
Enter gen fn COEFFICIENTS for gN  gN
Enter VALUES for Y  Y
Enter PROBABILITIES for Y  PY
Values are in row matrix D; probabilities are in PD.
To view the distribution, call for mD.
disp(mD)
         0    0.2000
    1.0000    0.1000
    2.0000    0.1100
    3.0000    0.1250
    4.0000    0.1155
    5.0000    0.1110
    6.0000    0.0964
    7.0000    0.0696
    8.0000    0.0424
    9.0000    0.0203
   10.0000    0.0075
   11.0000    0.0019
   12.0000    0.0003
P3 = (D==3)*PD'
P3 =   0.1250
P7 = (D>=7)*PD'
P7 =   0.1421
ED = D*PD'
ED =   3.4000

As expected, the results are the same as those obtained with gend.

If it is desired to obtain the joint distribution for {N, D}, we use a modification of mgd called jmgd. The complications come in placing the probabilities in the P matrix in the desired positions. This requires some calculations to determine the appropriate size of the matrices used as well as a procedure to put each probability in the position corresponding to its D value. Actual operation is quite similar to the operation of mgd, and requires the same data format.
A principal use of the joint distribution is to demonstrate features of the model, such as E[D|N = n] = nE[Y], etc. This, of course, is utilized in obtaining the expressions for M_D(s) in terms of g_N(s) and M_Y(s). These results guide the development of the computational procedures. In general, if the use of gend is appropriate, it is faster and more efficient than mgd (or jmgd). But both m-procedures work quite well for problems of moderate size, and mgd will handle somewhat larger problems.

15.2 Some Random Selection Problems [2]

In the unit on Random Selection (Section 15.1), we develop some general theoretical results and computational procedures using MATLAB. In this unit, we extend the treatment to a variety of problems. We establish some useful theoretical results and in some cases use MATLAB procedures, including those in the unit on random selection.

15.2.1 The Poisson decomposition

In many problems, the individual demands may be categorized in one of m types. If the random variable T_i is the type of the i-th arrival and the class {T_i : 1 ≤ i} is iid, we have multinomial trials. For m = 2 we have the Bernoulli or binomial case, in which one type is called a success and the other a failure.

Multinomial trials

We analyze such a sequence of trials as follows. Suppose there are m types, which we number 1 through m. Let E_ki be the event that type k occurs on the i-th component trial. For each i, the class {E_ki : 1 ≤ k ≤ m} is a partition, since on each component trial exactly one of the m types will occur. The type on the i-th trial may be represented by the type random variable

T_i = Σ_{k=1}^m k I_{E_ki}   (15.26)

We assume

{T_i : 1 ≤ i} is iid, with P(T_i = k) = P(E_ki) = p_k invariant with i   (15.27)

In a sequence of n trials, we let N_kn be the number of occurrences of type k. Then

N_kn = Σ_{i=1}^n I_{E_ki} with Σ_{k=1}^m N_kn = n   (15.28)

Now each N_kn ∼ binomial (n, p_k). The class {N_kn : 1 ≤ k ≤ m} cannot be independent, since it sums to n; if the values of m − 1 of them are known, the value of the other is determined. If n_1 + n_2 + ··· + n_m = n, the event

{N_1n = n_1, N_2n = n_2, ···, N_mn = n_m}   (15.29)

is one of the C(n; n_1, n_2, ···, n_m) = n!/(n_1! n_2! ··· n_m!) ways of arranging n_1 of the E_1i, n_2 of the E_2i, ···, n_m of the E_mi. Each such arrangement has probability p_1^{n_1} p_2^{n_2} ··· p_m^{n_m}, so that

P(N_1n = n_1, N_2n = n_2, ···, N_mn = n_m) = n! Π_{k=1}^m p_k^{n_k}/n_k!   (15.30)

[2] This content is available online at <http://cnx.org/content/m23664/1.6/>.
This set of joint probabilities constitutes the multinomial distribution. For m = 2, and type 1 a success, this is the binomial distribution with parameter (n, p_1).

A random number of multinomial trials

We consider, in particular, the case of a random number N of multinomial trials, where N ∼ Poisson (µ). Let N_k be the number of results of type k in a random number N of multinomial trials.

N_k = Σ_{i=1}^N I_{E_ki} = Σ_{n=1}^∞ I_{{N=n}} N_kn with Σ_{k=1}^m N_k = N   (15.32)

Poisson decomposition

Suppose

1. N ∼ Poisson (µ)
2. {T_i : 1 ≤ i} is iid with P(T_i = k) = p_k, 1 ≤ k ≤ m
3. {N, T_i : 1 ≤ i} is independent

Then

a. Each N_k ∼ Poisson (µ p_k)
b. {N_k : 1 ≤ k ≤ m} is independent.

The usefulness of this remarkable result is enhanced by the fact that the sum of independent Poisson random variables is also Poisson, with µ for the sum the sum of the µ_i for the variables added. This is readily established with the aid of the generating function. Before verifying the propositions above, we consider some examples.

Example 15.11: A shipping problem
The number N of orders per day received by a mail order house is Poisson (300). Orders are shipped by next day express, by second day priority, or by regular parcel mail. Suppose 4/10 of the customers want next day express, 5/10 want second day priority, and 1/10 require regular mail. Make the usual assumptions on compound demand. What is the probability that fewer than 150 want next day express? What is the probability that fewer than 300 want one or the other of the two faster deliveries?

SOLUTION
Model as a random number of multinomial trials, with three outcome types: Type 1 is next day express, Type 2 is second day priority, and Type 3 is regular mail, with respective probabilities p_1 = 0.4, p_2 = 0.5, and p_3 = 0.1. Then N_1 ∼ Poisson (0.4 · 300 = 120), N_2 ∼ Poisson (0.5 · 300 = 150), and N_3 ∼ Poisson (0.1 · 300 = 30). Also N_1 + N_2 ∼ Poisson (120 + 150 = 270).

P1 = 1 - cpoisson(120,150)
P1 = 0.9954
P12 = 1 - cpoisson(270,300)
P12 = 0.9620

Example 15.12: Message routing
A junction point in a network has two incoming lines and two outgoing lines. The number of incoming messages N_1 on line one in one hour is Poisson (50); on line 2 the number is N_2 ∼ Poisson (45). On incoming line 1 the messages have probability p_1a = 0.33 of leaving on outgoing line a and 1 − p_1a of leaving on line b. The messages coming in on line 2 have probability p_2a = 0.47 of leaving on line a. Under the usual independence assumptions, what is the distribution of outgoing messages on line a? What are the probabilities of at least 30, 35, 40 outgoing messages on line a?

SOLUTION
By the Poisson decomposition, N_a ∼ Poisson (50 · 0.33 + 45 · 0.47 = 37.65).
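A simulation illustrates the decomposition for the shipping problem. The sketch below is an addition; 10000 replications and the cap of 450 interarrival draws are illustrative choices.

m = 10000; mu = 300; p1 = 0.4;
N1 = zeros(m,1);
for r = 1:m
    N = sum(cumsum(-log(rand(450,1))/mu) < 1);   % N ~ Poisson(300) via exponential gaps
    N1(r) = sum(rand(N,1) < p1);                 % type-1 (next day express) count
end
[mean(N1) var(N1)]    % both near mu*p1 = 120, consistent with N1 ~ Poisson(120)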
ma = 50*0.33 + 45*0.47
ma = 37.6500
Pa = cpoisson(ma,30:5:40)
Pa =   0.9119    0.6890    0.3722

VERIFICATION of the Poisson decomposition

a. N_k = Σ_{i=1}^N I_{E_ki}. This is composite demand with Y_k = I_{E_ki}, so that g_Yk(s) = q_k + s p_k = 1 + p_k(s − 1). Therefore,

g_Nk(s) = g_N[g_Yk(s)] = e^{µ(1 + p_k(s−1) − 1)} = e^{µ p_k (s−1)}   (15.33)

which is the generating function for N_k ∼ Poisson (µ p_k).

b. For any n_1, n_2, ···, n_m, let n = n_1 + n_2 + ··· + n_m, and consider

A = {N_1 = n_1, N_2 = n_2, ···, N_m = n_m} = {N = n} ∩ {N_1n = n_1, N_2n = n_2, ···, N_mn = n_m}   (15.34)

Since N is independent of the class of I_{E_ki}, the class {{N = n}, {N_1n = n_1, N_2n = n_2, ···, N_mn = n_m}} is independent. By the product rule and the multinomial distribution

P(A) = e^{−µ} (µ^n/n!) · n! Π_{k=1}^m p_k^{n_k}/n_k! = Π_{k=1}^m e^{−µ p_k} (µ p_k)^{n_k}/n_k! = Π_{k=1}^m P(N_k = n_k)   (15.35-15.36)

The second product uses the fact that

e^µ = e^{µ(p_1 + p_2 + ··· + p_m)} = Π_{k=1}^m e^{µ p_k}   (15.37)

Thus, the product rule holds for the class {N_k : 1 ≤ k ≤ m}, so that it is independent.

15.2.2 Extreme values

Consider an iid class {Y_i : 1 ≤ i} of nonnegative random variables. For any positive integer n we let

V_n = min{Y_1, Y_2, ···, Y_n} and W_n = max{Y_1, Y_2, ···, Y_n}   (15.38)

Then

P(V_n > t) = P^n(Y > t) and P(W_n ≤ t) = P^n(Y ≤ t)   (15.39)

Now consider a random number N of the Y_i. The minimum and maximum random variables are

V_N = Σ_{n=0}^∞ I_{{N=n}} V_n and W_N = Σ_{n=0}^∞ I_{{N=n}} W_n   (15.40)

Computational formulas

If we set V_0 = W_0 = 0, then

a. F_V(t) = P(V ≤ t) = 1 + P(N = 0) − g_N[P(Y > t)]
b. F_W(t) = g_N[P(Y ≤ t)]
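For a simple counting variable, both formulas reduce to evaluating the polynomial g_N. The following sketch is an added illustration; the coefficients for N and the exponential (1/3) distribution for Y are assumed values, echoing the service times of Example 15.13 below.

gN = [0 0.2 0.5 0.3];            % P(N = 0..3); assumed example, gN(1) is P(N = 0)
t = 1:5;
FYt = 1 - exp(-t/3);             % P(Y <= t)
FW = polyval(fliplr(gN),FYt);    % FW(t) = gN[P(Y <= t)]
FV = 1 + gN(1) - polyval(fliplr(gN),1 - FYt);   % FV(t) = 1 + P(N=0) - gN[P(Y > t)]
disp([t; FV; FW])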
These results are easily established as follows. {V_N > t} = ∪_{n=0}^∞ {N = n}{V_n > t}. By additivity and independence of {N, V_n} for each n,

P(V_N > t) = Σ_{n=0}^∞ P(N = n) P(V_n > t) = Σ_{n=1}^∞ P(N = n) P^n(Y > t), since P(V_0 > t) = 0   (15.41)

If we add into the last sum the term P(N = 0) P^0(Y > t) = P(N = 0), then subtract it, we have

P(V_N > t) = Σ_{n=0}^∞ P(N = n) P^n(Y > t) − P(N = 0) = g_N[P(Y > t)] − P(N = 0)   (15.42)

A similar argument holds for proposition (b). In this case, we do not have the extra term for {N = 0}, since P(W_0 ≤ t) = 1.

F_W(t) = Σ_{n=0}^∞ P(W_n ≤ t) P(N = n) = Σ_{n=0}^∞ P^n(Y ≤ t) P(N = n) = g_N[P(Y ≤ t)]   (15.43)

Special case. In some cases, N = 0 does not correspond to an admissible outcome (see Example 15.14 (Lowest Bidder) on lowest bidder and Example 15.16 (Batch testing)). In that case

F_V(t) = Σ_{n=1}^∞ P(V_n ≤ t) P(N = n) = Σ_{n=1}^∞ [1 − P^n(Y > t)] P(N = n) = Σ_{n=1}^∞ P(N = n) − Σ_{n=1}^∞ P^n(Y > t) P(N = n)   (15.44)

Add P(N = 0) = P^0(Y > t) P(N = 0) to each of the sums to get

F_V(t) = 1 − Σ_{n=0}^∞ P^n(Y > t) P(N = n) = 1 − g_N[P(Y > t)]   (15.45)

Example 15.13: Maximum service time
The number N of jobs coming into a service center in a week is a random quantity having a Poisson (20) distribution. Suppose the service times (in hours) for individual units are iid, with common distribution exponential (1/3). What is the probability the maximum service time for the units is no greater than 6, 9, 12, 15, 18 hours?

SOLUTION

P(W_N ≤ t) = g_N[P(Y ≤ t)] = e^{20[F_Y(t) − 1]} = exp(−20 e^{−t/3})

t = 6:3:18;
PW = exp(-20*exp(-t/3));
disp([t;PW]')
    6.0000    0.0668
    9.0000    0.3694
   12.0000    0.6933
   15.0000    0.8739
   18.0000    0.9516

Example 15.14: Lowest Bidder
A manufacturer seeks bids on a modification of one of his processing units. Twenty contractors are invited to bid. They bid with probability 0.3, so that the number of bids N ∼ binomial (20, 0.3). Assume the bids Y_i (in thousands of dollars) form an iid class. The market is such that the bids have a common distribution symmetric triangular on (150, 250). What is the probability of at least one bid no greater than 170, 180, 190, 200, 210? Note that no bid is not a low bid of zero, hence we must use the special case.

SOLUTION

P(V ≤ t) = 1 − g_N[P(Y > t)] = 1 − (0.7 + 0.3p)^{20} where p = P(Y > t)

Solving graphically for p = P(Y > t), we get

p = [23/25 41/50 17/25 1/2 8/25] for t = [170 180 190 200 210]

Now g_N(s) = (0.7 + 0.3s)^{20}. We use MATLAB to obtain

t = [170 180 190 200 210];
p = [23/25 41/50 17/25 1/2 8/25];
PV = 1 - (0.7 + 0.3*p).^20;
disp([t;PV]')
  170.0000    0.3848
  180.0000    0.6705
  190.0000    0.8671
  200.0000    0.9612
  210.0000    0.9896
Example 15.15: Example 15.14 (Lowest Bidder) with a general counting variable
Suppose the number of bids is 1, 2 or 3 with probabilities 0.3, 0.5, 0.2, respectively. Determine P(V ≤ t) in each case.

SOLUTION
The minimum of the selected Y's is no greater than t if and only if there is at least one Y less than or equal to t. We determine in each case probabilities for the number of bids satisfying Y ≤ t. For each t, we are interested in the probability of one or more occurrences of the event Y ≤ t. This is essentially the problem in Example 7 (Example 15.7: Number of successes for random number N of trials) from "Random Selection", with probability p = P(Y ≤ t).

t = [170 180 190 200 210];
p = [23/25 41/50 17/25 1/2 8/25];    % Probabilities Y > t
gN = [0 0.3 0.5 0.2];                % Zero for missing value N = 0
PV = zeros(1,length(t));
for i = 1:length(t)
    gY = [p(i) 1-p(i)];              % Probabilities Y <= t are 1 - p
    [d,pd] = gendf(gN,gY);
    PV(i) = (d>0)*pd';               % Selects positions for d > 0 and
end                                  % adds corresponding probabilities
disp([t;PV]')
  170.0000    0.1451
  180.0000    0.3075
  190.0000    0.5019
  200.0000    0.7000
  210.0000    0.8462

Example 15.14 (Lowest Bidder) may be worked in this manner by using gN = ibinom(20,0.3,0:20). The results, of course, are the same as in the previous solution. The fact that the probabilities in this example are lower for each t than in Example 15.14 (Lowest Bidder) reflects the fact that there are probably fewer bids in each case.

Example 15.16: Batch testing
Electrical units from a production line are first inspected for operability. However, experience indicates that a fraction p of those passing the initial operability test are defective. All operable units are subsequently tested in a batch under continuous operation (a burn in test). Statistical data indicate the defective units have times to failure Y_i iid, exponential (λ), whereas good units have very long life (infinite from the point of view of the test). A batch of n units is tested. Let V be the time of the first failure and N be the number of defective units in the batch. If the test goes t units of time with no failure (i.e., V > t), what is the probability of no defective units?

SOLUTION
Since no defective units implies no failures in any reasonable test time, we have

{N = 0} ⊂ {V > t} so that P(N = 0|V > t) = P(N = 0)/P(V > t)   (15.46)

Since N = 0 does not yield a minimum value, we have P(V > t) = g_N[P(Y > t)]. Now under the condition above, N ∼ binomial (n, p), so that g_N(s) = (q + ps)^n. If n is large and p is reasonably small, N is approximately Poisson (np) with g_N(s) = e^{np(s−1)} and P(N = 0) = e^{−np}. Now P(Y > t) = e^{−λt}; for large n

P(N = 0|V > t) = e^{−np}/e^{np[P(Y>t)−1]} = e^{−np P(Y>t)} = e^{−np e^{−λt}}   (15.47)
For n = 5000, p = 0.001, λ = 2, and t = 1, 2, 3, 4, 5, MATLAB calculations give

n = 5000; p = 0.001; lambda = 2;
t = 1:5;
P = exp(-n*p*exp(-lambda*t));
disp([t;P]')
    1.0000    0.5083
    2.0000    0.9125
    3.0000    0.9877
    4.0000    0.9983
    5.0000    0.9998

It appears that a test of three to five hours should give reliable results. In actually designing the test, one should probably make calculations with a number of different assumptions on the fraction of defective units and the life duration of defective units. These calculations are relatively easy to make with MATLAB.

15.2.3 Bernoulli trials with random execution times or costs

Consider a Bernoulli sequence with probability p of success on any component trial. Let N be the number of the trial on which the first success occurs. Let Y_i be the time (or cost) to execute the i-th trial. Then the total time (or cost) from the beginning to the completion of the first success is

T = Σ_{i=1}^N Y_i (composite demand with N − 1 ∼ geometric (p))   (15.48)

We suppose the Y_i form an iid class, independent of N. Now N − 1 ∼ geometric (p) implies g_N(s) = ps/(1 − qs), so that

M_T(s) = g_N[M_Y(s)] = p M_Y(s)/(1 − q M_Y(s))   (15.49)

There are two useful special cases:
1. Y_i ∼ exponential (λ), so that M_Y(s) = λ/(λ − s).

M_T(s) = (pλ/(λ − s))/(1 − qλ/(λ − s)) = pλ/(pλ − s)   (15.50)

which implies T ∼ exponential (pλ).

2. Y_i − 1 ∼ geometric (p_0), so that g_Y(s) = p_0 s/(1 − q_0 s).

g_T(s) = (p p_0 s/(1 − q_0 s))/(1 − q p_0 s/(1 − q_0 s)) = p p_0 s/(1 − (1 − p p_0)s)   (15.51)

so that T − 1 ∼ geometric (p p_0).

Example 15.17: Job interviews
Suppose a prospective employer is interviewing candidates for a job from a pool in which twenty percent are qualified. Interview times (in hours) Y_i are presumed to form an iid class, each exponential (3). Thus, the average interview time is 1/3 hour (twenty minutes). We take the probability for success on any interview to be p = 0.2. What is the probability a satisfactory candidate will be found in four hours or less? What is the probability the maximum interview time will be no greater than 0.5, 0.75, 1, 1.25, 1.5 hours?

SOLUTION
T ∼ exponential (0.2 · 3 = 0.6), so that P(T ≤ 4) = 1 − e^{−0.6·4} = 1 − e^{−2.4} = 0.9093.

P(W ≤ t) = g_N[P(Y ≤ t)] = 0.2(1 − e^{−3t})/(1 − 0.8(1 − e^{−3t})) = (1 − e^{−3t})/(1 + 4e^{−3t})   (15.52)

t = 0.5:0.25:1.5;
PWt = (1 - exp(-3*t))./(1 + 4*exp(-3*t));
disp([t;PWt]')
    0.5000    0.4105
    0.7500    0.6293
    1.0000    0.7924
    1.2500    0.8925
    1.5000    0.9468

The average interview time is 1/3 hour; with probability 0.63 the maximum is 3/4 hour or less, and with probability 0.79 the maximum is one hour or less.
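A simulation check of the first special case, using the interview parameters, is given by the following added sketch; 100000 replications is an illustrative choice.

m = 100000; p = 0.2; lambda = 3;
T = zeros(m,1);
for r = 1:m
    N = ceil(log(rand)/log(1 - p));      % P(N = n) = p*(1-p)^(n-1), n >= 1
    T(r) = sum(-log(rand(N,1))/lambda);  % sum of N exponential (3) interview times
end
[mean(T) mean(T <= 4)]   % near 1/(p*lambda) = 5/3 and 1 - exp(-2.4) = 0.9093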
In the general case, solving for the distribution of T requires transform theory, and may be handled best by a program such as Maple or Mathematica. For the case of simple Y_i, we may use approximation procedures based on properties of the geometric series. Since N − 1 ∼ geometric (p),

g_N(s) = ps/(1 − qs) = ps Σ_{k=0}^∞ (qs)^k = ps [Σ_{k=0}^n (qs)^k + Σ_{k=n+1}^∞ (qs)^k]   (15.53)

= ps Σ_{k=0}^n (qs)^k + (qs)^{n+1} ps Σ_{k=0}^∞ (qs)^k = g_n(s) + (qs)^{n+1} g_N(s)   (15.54)

where

g_n(s) = ps + pqs^2 + pq^2 s^3 + ··· + pq^n s^{n+1}   (15.55)

Note that g_n(s) has the form of the generating function for a simple approximation N_n which matches values and probabilities with N up to k = n. Now

g_T(s) = g_n[g_Y(s)] + (qs)^{n+1} g_N[g_Y(s)]   (15.56)

The evaluation involves convolution of coefficients, which effectively sets s = 1. Since g_N(1) = g_Y(1) = 1, the term (qs)^{n+1} g_N[g_Y(s)] for s = 1 reduces to q^{n+1} = P(N > n), which is negligible if n is large enough. Suitable n may be determined in each case. With such an n, if the Y_i are nonnegative, integer valued, we may use the gend procedure on g_n[g_Y(s)], where

g_n(s) = ps + pqs^2 + pq^2 s^3 + ··· + pq^n s^{n+1}   (15.57)

For the integer-valued case, as in the general case of simple Y_i, we could use mgd. However, gend is usually faster and more efficient for the integer-valued case. Unless q is small, the number of terms needed to approximate g_n is likely to be too great.

Example 15.18: Approximating the generating function
Let p = 0.3 and Y be uniformly distributed on {1, 2, ···, 10}. Determine the distribution for

T = Σ_{k=1}^N Y_k   (15.58)

SOLUTION

p = 0.3; q = 1 - p;
a = [30 35 40];              % Check for suitable n
b = q.^a
b = 1.0e-04 * [0.2254  0.0379  0.0064]
n = 40;                      % Use n = 40
k = 1:n;
gN = p*[0 q.^(k-1)];         % Probabilities, 0 <= k <= 40
gY = 0.1*[0 ones(1,10)];     % Note zero coefficient for missing zero power
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  gN
Enter gen fn COEFFICIENTS for gY  gY
Values are in row matrix D; probabilities are in PD.
To view the distribution, call for gD.
sum(PD)                      % Check on sufficiency of approximation
ans =  1.0000
FD = cumsum(PD);             % Distribution function for D
plot(0:100,FD(1:101))        % See Figure 15.1
P50 = (D<=50)*PD'
P50 = 0.9497
P30 = (D<=30)*PD'
P30 = 0.8263
but use the actual distribution for 15. the failures of a system. . with 0 = S0 < S1 < · · · a.2. separated by random waiting or interarrival times. gN as in Example 15. • Arrival times : {Sn : 0 ≤ n}.1: Execution Time Distribution Function FD . (incremental sequence) The strict inequalities imply that with probability one there are no simultaneous arrivals. RANDOM SELECTION Figure 15. (basic sequence) • Interarrival times : {Wi : 1 ≤ i}. Sn = i=1 Wi and Wn = Sn − Sn−1 for all n≥1 (15.472 CHAPTER 15. These may be arrivals of customers in a store.1. The stream of arrivals may be described in a third way.59) The formulation indicates the essential equivalence of the problem with that of the compound demand (Section 15.4 Arrival times and counting processes Suppose we have phenomena which take place at discrete instants of time. The same results may be achieved with mgd.18 ( Approximating the generating function).3: Calculations for the compound demand). with each Wi > 0 a. The notation and terminology are changed to correspond to that customarily used in the treatment of arrival and counting processes. although at the cost of more computing time. between the two sequences are simply The relations n S0 = 0. use Y.s. In that case. of noise pulses on a communications line. arrivals and designate the times of occurrence as arrival times. etc.s. We refer to these occurrences as A stream of arrivals may be described in three equivalent ways. vehicles passing a position on a road.
62) P (Nt = n) = P (Sn ≤ t) − P (Sn+1 ≤ t) = P (Sn+1 > t) − P (Sn > t) Although there are many possibilities for the interarrival time distributions. In this case the random process is a counting process. it is common for the random variable and the to count an arrival at the convention above. For a given t. rightcontinuous. We use Exponential iid interarrival times The case of exponential interarrival times is natural in many applications and leads to important mathematical results.61) ω . 1. . N (t + h) − N (t) counts the arrivals h > 0. t = 0. More advanced treatments show that the process has independent.60) N 1.s. Sn . λ) for all n ≥ 1. N (0. It should be clear that We thus have three equivalent descriptions for the stream of arrivals. For N (t + h) − N (t) = N (h) for all t. h > 0. and S0 = 0. b.t] (Si ) = max{n : Sn ≤ t} = min{n : Sn+1 > t} (15. ω) is a nondecreasing. renewal process Nt In the literature on renewal processes. The essential relationships between the three ways of describing the stream of arrivals is displayed in Wn = Sn − Sn−1 . ω) = 0. the interarrival times Wi . (15. h > 0. in the discussion of the connection between the gamma distribution and the exponential distribution. the counting process is often referred to as a renewal times. N (tm ) − N (tm−1 )} is independent. This follows the result in the unit DISTRIBUTION APPROXI9MATIONS on the relationship between the Poisson and gamma distributions. 2. For any given with I(0. Sn ∼ gamma (n. ω).64) and the interrarival Under such assumptions. N (t4 ) − N (t3 ) . ω the value is N (t. Nt ∼ Poisson Remark. a. This requires an adjustment of the expressions relating Si . We utilize the following propositions about the arrival times and the counting process N. so that N (t + h) ≥ N (t) for Nt = i=1 3. integervalued function dened on [0. That is 2. t]. {Nt ≥ n} = {Sn ≤ t}. · · · . {Nt = n} = {Sn ≤ t < Sn+1 } This imples (15. along with the fact that {Nt ≥ n} = {Sn ≤ t}. ∞). N (·. Such a family of random variables constitutes a random process. t + h]. N0 = 0 and for t > 0 we have ∞ in the interval (t. The counting process is a Poisson process in the sense that (λt) for all t > 0.473 • Counting processes : Nt = N (t) t. we assume (15. stationary increments. If {Wi : 1 ≤ i} is iid exponential (λ). this is a random quantity for each nonnegative is the number of arrivals in time period (0.63) {Wi : 1 ≤ i} times are called is iid. i Nt ∼ Poisson (λt) for all t > 0. {Sn : 0 ≤ n} Several properties of the counting process {Wn : 1 ≤ n} should be noted: {Nt : 0 ≤ t} (15. n ≥ 1. λ) for all This is worked out in the unit on TRANSFORM METHODS. the class {N (t2 ) − N (N1 ) . then Sn ∼ gamma (n. with Wi > 0 a. and t1 < t2 ≤ t3 < t4 ≤ · · · ≤ tm−1 < tm .
5243 0. then a dollar spent is the time of replacement of the t units of time in the future has a nth unit. then Sn ∼ gamma (n.70) . Upon failure the unit is replaced immediately by a similar unit.474 CHAPTER 15.cpoisson(8*15.67) We may apply (E8b) (list.20: Gamma arrival times Suppose the arrival times Sn ∼ ∞ gamma (n.[101 121 141]) = 0. 120. (λt) = λe−λt eλt = λ (n − 1)! n=1 ∞ n−1 (15. p.69) Example 15.9669 Sn ∼ gamma We develop next a simple computational result for arrival processes for which (n. If Sn α. What is the probability the number of calls in an eight hour shift is no more than 100.68) ∞ fn (t) = λe−λt n=1 the proposition is established.19: Emergency calls Emergency calls arrive at a police switchboard with interarrival times (in hours) exponential (15). Thus.65) ∞ ∞ E n=1 VERIFICATION g (Sn ) = λ 0 g (15. λ) and and g is such that ∞ g < ∞ 0 Then E n=1 g (Sn )  < ∞ (15.66) We use the countable sums property (E8b) (list. Example 15. 140? p p = 1 . and the numbers of arrivals in nonoverlapping time intervals are independent. 600) to assert ∞ 0 ∞ ∞ ∞ gfn = n=1 Since g 0 n=1 fn (15. RANDOM SELECTION In words. p. Example 15.21: Discounted replacement costs A critical unit in a production system has life duration exponential (λ). If money is discounted at a rate current value e−αt . λ) and the present value of all future replacements is ∞ C= n=1 ce−αSn (15.0347 0. 600) for expectation and the corresponding property for integrals to assert ∞ ∞ ∞ ∞ E n=1 g (Sn ) = n=1 E [g (Sn )] = n=1 0 g (t) fn (t) dt where fn (t) = λe−λt (λt) (n − 1)! n−1 (15. λ). Units fail independently. the average interarrival time is 1/15 hour (four minutes). Cost of replacement of a unit is c dollars. the number of arrivals in any time interval depends upon the length of the interval and not its location in time.
75) VERIFICATION Since Nt ≥ n i Sn ≤ t. t] (Sn ) g (Sn ).71) ∞ E [C] = λ 0 Suppose unit replacement cost discount rate per year ce−αt dt = λc α 1/λ = 1/4. Then ∞ ∞ ∞ is a E [C] = E n=1 Cn e−αSn = n=1 E [Cn ] E e−αSn = n=1 cE e−αSn = λc α (15. t].73) Example 15.76) Example 15.t] g ∞ ∞ n=0 I(0. The result of Example 15. t In the result of Example 15. costs).37.77) Thus.21 ( Discounted replacement costs) Cn . with {Cn .08 (15.16 = 0. the expected cost for the innite horizon λc/α 1 − e−αt . 871. SOLUTION t E [C] = λc 0 e−αu du = λc 1 − e−αt α is reduced by the factor (15.21 ( Discounted replacement 1 − e−0.21 ( Discounted replacement costs). t Nt If Zt = n=1 g (Sn ) . the reduction factor is . average time (in years) to failure and the α = 0.74) The analysis to this point assumes the process will continue endlessly into the future. 000 0.20 ( Gamma arrival times). (15.23: Finite horizon Under the conditions assumed in Example 15. above. often referred to as a nite horizon. Example 15. Often. Nt n=1 ( Gamma arrival times). invariant with n.24: Replacement costs.08 (eight percent). nite period.72) c = 1200. it is desirable to plan for a specic. Sn } independent and E [Cn ] = c. For t = 2 and the numbers in Example 15. nite horizon Under the conditions of Example 15.20 ( Gamma arrival times) may be modied easily to account for a nite period.20 and note that I(0. then E [Zt ] = λ 0 g (u) du (15. consider the replacement costs over a twoyear period. Then E [C] = 1200 · 4 = 60. let counting random variable for arrivals in the interval Nt be the (0.1479 to give E [C] = 60000 · 0.475 The expected replacement cost is ∞ E [C] = E n=1 Hence g (Sn ) where g (t) = ce−αt (15.1479 = 8.t] (u) g (u) du = 0 0 g (u) du (15. replace g (Sn ) = g by I(0.22: Random costs Suppose the cost of the random quantity nth replacement in Example 15.
the expression for In the important special case that E[ ∞ n=1 g (Sn )] may be put into a form which does not require the interarrival times to be exponential. X times.4/>. c = 100. ∞ g iid. where {Wi : 1 ≤ i} is pair {Vn . c = 100 α = 0. 482. Determine the distribution for Y.81) a = 1. RANDOM SELECTION g (u) = ce−αu .) (See Exercise 3 (Exercise 8. b).08.4) from "Problems on Random Variables and Joint Distributions") As a variation of Exercise 15. MW (−α) = Let e−aα − e−bα α (b − a) and so that E [C] = c · e−aα − e−bα α (b − a) − [e−aα − e−bα ] (15.MW) EC = 376.80) α>0 and W > 0.exp(b*A))/(A*(b . 1 − MW (−α) so that provided MW (−α)  < 1 (15.1 die is rolled.79) ∞ E [C] = c n=1 Since n MW (−α) = MW (−α) . Then (see Appendix C (Section 17. a = 1. Then.) (See Exercise 4 (Exercise 8. 3 This content is available online at <http://cnx. Let (Solution on p. suppose a pair of dice is rolled instead of a single die.78) MW is the moment generating function for W.26: Uniformly distributed interarrival times Suppose each Wi ∼ uniform (a. by properties of expectation and the geometric series (15. Sn } is independent.3) from "Problems on Random Variables and Joint Distributions") A X be the number of spots that turn up. we have 0 < e−αW < 1. A = 0.1.2 (Solution on p. Let Y be the Exercise 15.08.3)). DERIVATION First we note that n E Vn e−αSn = cMSn (−α) = cMW (−α) Hence.3 Problems on Random Selection3 Exercise 15. b = 5.org/content/m24531/1. Let Then for {Vn : 1 ≤ n} α>0 be a class such E [C] = E n=1 where Vn e−αSn = c · MW (−α) 1 − MW (−α) (15. exponential Suppose that S0 = 0 and Sn = each E [Vn ] = c and each n i=1 Wi .a)) MW = 0. MW = (exp(a*A) . Example 15.7900 EC = c*MW/(1 . Example 15.476 CHAPTER 15.25: General interarrival. . b = 5. Determine the distribution for Y. MW (−α) = E e−αW < 1.1643 15. 482. A coin is ipped number of heads that turn up.
20 40. Determine E [Y ] and P (Y ≤ 10). Let an additional (Solution on p. 50.6 number (See Exercise 21 (Exercise 14. with probabilities (Solution on p.e. Determine the distribution for X times. Determine P (10 ≤ D ≤ 30).) Game wardens are making an aerial survey of the number of deer in a park.7 Suppose the number of entries in a contest is Let (Solution on p. The table below shows the values of the items and the Value Probability 12. 484.4 0. Exercise 15.05 9 0.1] E [D] .10 .5). Suppose the contestant. etc.e. (Solution on p. Let both draw ones.3 0. 484. P (15 ≤ D ≤ 25).3 Suppose a pair of dice is rolled.. 485. Exercise 15. Determine Exercise 15. 482.) A supply house stocks seven popular items.1 Let D be the number of deer sighted under this model.4). Yi are iid.477 Exercise 15.15 30. Roll the pair be the number of sevens that are thrown on the X rolls. 485.10 7 0.20 5 0.) Yi N ∼ binomial (20.10 3 0.50 0. both twos. Determine the distribution for Y.) (See Exercise 5 (Exercise 8.7: Expectation. 0.8 to be sighted is assumed to be a random variable be from 1 to 10 in size. Regression") A number X N (Solution on p..4 (See Example 7 (Example 14.15 4 0. both draw twos. Exercise 15. Y be the number of Exercise 15. The number of herds N∼ binomial (20. What is the probability of three or more sevens? A random number Y X be the total number of spots which turn up.20 42. with common distribution Y = [1 2 3 4] Let P Y = [0.05 2 0. Determine E [Y ] P (Y ≤ 20). A pair of dice is thrown X times. Let matches (i.00 0. (15. A pair of dice is thrown be the number of sevens thrown on the and X tosses. Var [D].10 25.) (See Exercise 20 (Exercise 14.15 50.) of Bernoulli trials) from "Conditional is chosen by a random selection from the integers 1 through 20 (say by drawing a card from a box).) Each of two people draw Exercise 15. Y.50 0.9 probability of each being selected by a customer. times independently and randomly a number from 1 to 10.82) and D be the total number of correct answers.05 10 0.15 6 0.00 0. 483. 75.2 0. both ones. 100 and P (D ≥ 90).00 0. Regression") A X is selected randomly from the integers 1 through 100. be the number of questions answered correctly by the i th There are four questions. Determine P (D ≤ t) for t = 25.20) from "Problems on Conditional Expectation.21) from "Problems on Conditional Expectation. (Solution on p.00 0. Each herd is assumed to Value Probability 1 0.10 60.).10 8 0.5 number Let (Solution on p.). Regression") A Y X is selected randomly from the integers 1 through 100. 0. Y X be the number of matches (i. Determine the distribution for Y.5) from "Problems on Random Variables and Joint Distributions") X times. 484. Let the distribution for Y.50 0. etc.05 Table 15.
Customers who buy do so (in dollars) with common distribution: Y = [200 220 240 260 280 300] Let P Y = [0. Determine Determine E [D] and P (D ≤ t) P (100 < D ≤ 200).5). N ∼ If each respondent has probability p = 0.10 0. 488.15 0. 250. The probability that a reply will be favorable is 0. The number of points made is the total of the numbers turned up on the sequence of throws of the die. 0. P (D ≥ 1500). and order an amount p1 = 0. (Solution on p. A wheel is spun. binomial (20. Var [X]. 155. E [X].12 and P (D ≥ 750). what is the probability all will pass? Ten or more will pass? Exercise 15. D2 . RANDOM SELECTION Table 15. P (X > = 16). and the number of customers in a day is binomial (10. 487. 487. on line 2 the number is N2 ∼ Poisson p1a = 0.7 of making 70 or more. p = 0. The probability of a reply is 0.83) D1 be the total sales for Marvin and D2 Determine the distribution and mean and variance for D1 . If 25 percent of the Japanese and 20 percent of the Germans visit Disney World. 300.0. 488.25 0. Determine the distribution for the total demand D. P (D ≥ 1000). Exercise 15.3).10 A game is played as follows: (Solution on p.6 he makes a sale in each case. 250.10] the total sales for Geraldine.) N1 ∼ Poisson (500) and the N2 ∼ Poisson (300). A single die is thrown the number of times indicated by the result of the spin of the wheel.14 (Solution on p. Let Y represent the number of points made and X = Y − 16 be the net gain (possibly negative) of the player.33 On incoming line 1 the messages have probability of leaving on outgoing line a . How many dierent possible values are there? What is the maximum possible total sales? b.5 of a sale in each case.) Five hundred questionnaires are sent out. a. 0. 150. 250 favorable replies? Exercise 15.16 incoming messages (45).15 Suppose the number of Japanese visitors to Florida in a week is number of German visitors is (Solution on p. what is the probability of ten or more favorable replies? Of fteen or more? Exercise 15.478 CHAPTER 15.8 of favoring a certain proposition.) 1.) Exercise 15. Let (15. If each student has probability N ∼ binomial (20.2 Suppose the purchases of customers are iid. Geraldine Yi p2 = 0.25 0.) The number of Exercise 15. 486.75.6. a dollar is returned for each point made.) of students take a qualifying exam. A junction point in a network has two incoming lines and two outgoing lines. 487.7).11 Marvin calls on four customers.15 0. P (X > 0). (Solution on p. 200. 3. Determine P (D1 ≥ D2 ) and (Solution on p. for t = 100. With probability calls on ve customers. giving one of the integers 0 through 9 on an equally likely basis. What is the probability of at least 200. what is the distribution for the total number Japanese visitors to the park? Determine D of German and P (D ≥ k) for k = 150. 225.) D = D1 + D2 . The number who reply is a random number A questionnaire is sent to twenty persons.13 A random number Suppose N (Solution on p. and D. P (X > = 10). A player pays sixteen dollars to play. 2. · · · . N1 on line one in one hour is Poisson (50). with probability on an iid basis. 486. A grade of 70 or more earns a pass. Determine the maximum value of X. 245. Exercise 15.
489. N1 ∼ Poisson (30).3.50 boxes is not more than $50.00 boxes? What is the probability the cost of the $2. HP.15 2.15 2.00 boxes? Suggestion.50 boxes is no greater than $475? What is the probability the cost of the $2. 489.25 0. N2 ∼ Poisson (65).50 0. the composite demand assumptions are reasonable. others The number of nonuniversity buyers Mac. • • • The number of university buyers are 0.10 4.21 (Solution on p.75 0.) One car in 5 in a certain community is a Volvo.47 of leaving on line a.3 What is the probability the total cost of the $2.50 boxes is greater than the cost of the $3. What is the probability of 10 or more sales? What is the probability that the number of sales is at least half the number of information requests (use suitable simple approximations)? Exercise 15.) A service center on an interstate highway experiences customers in a onehour period as follows: • • Northbound: Total vehicles: Poisson (200).17 has two ma jor sources of customers: (Solution on p. Suppose the following assumptions are reasonable for monthly purchases. which with packing costs have distribution Cost (dollars) Probability 0. What is the distribution for the number of Mac sales? What is the distribution for the total number of Mac and Dell sales? Exercise 15. 0.18 The number N (Solution on p. is 0. 488. .2. 488. what is the distribution of outgoing messages on line a? What are the probabilities of at least 30.60 that the inquirer just browses but does not identify an interest. The respective probabilities for For each group. others are 0.20 (Solution on p.00 greater than the cost of the $3.50 0. what is the expected number of Volvos? What is the probability of at least 30 Volvos? What is the probability the number of Volvos is between 16 and 40 (inclusive)? Exercise 15. respectively. 35. and various other IBM compatible personal computers. Exercise 15.00 0.4. If the number of cars passing a trac check point in an hour is Poisson (130). Orders require one of seven kinds of boxes.10 that any hit results in a sale. Students and faculty from a nearby university 2. 0.) of hits in a day on a Web site on the internet is Poisson (80). HP.19 The number N (Solution on p.05 Table 15.) A computer store sells Macintosh.25 3. It 1. The messages coming in on line 2 have probability p2a = 0. Suppose the probability is 0. Under the usual independence assumptions. The probabilities for Mac.00 0. 0.30 that the result is a request for information.2. and the two groups buy independently. Truncate the Poisson distributions at about twice the mean value. 0.00 0. Southbound: Total vehicles: Poisson (180). General customers for home and business computing.) of orders sent to the shipping department of a mail order house is Poisson (700). HP.4.479 and 1 − p1a of leaving on line b. Twenty ve percent are trucks.10 1. 489. Twenty percent are trucks.5. and is 0.20 3. 40 outgoing messages on line a? Exercise 15.
RANDOM SELECTION .2.6. with probabilities 0.05 7 0. distribution P N = 0. Determine P (N2 > N3 ). . respectively. Determine the probabilities that the maximum withdrawal is less than or equal to t for t = 100.500 or that no bid is not a bid of 0. the nember of orders NB ∼ Poisson (40). 300.20 5 0.05 1 0. let D be the number of persons to be served.3.25 with distribution (Solution on p.22 N of customers in a shop in a given day is Poisson (120). 490.3. uniform [100. 0.) The number of customers during the noon hour at a bank teller's station is a random number N N = 1 : 10. or 5 persons.25. Customers pay with cash or by MasterCard or Visa charge cards. and for b is 0. 5]. Experience indicates the number values 0 through 8.15 2 0.) Two items. the probability an order for a is 0.Each car has 1.01).10 6 0.) Note Y N ∼ binomial (7.15 3 0. 200.4.05 8 0.03 Table 15.05 9 0.000 or less. number of orders in a day from store A is is (Solution on p.480 CHAPTER 15.Each truck has one or two persons.) of bids is a random variable Value Probability 0 0.05 1 0. Let N1 . Var [D]. 500.07 8 0. 0. with respective probabilities 0. The number and the generating function gD (s).26 A job is put out for bids. and card charges.4 The market is such that bids (in thousands of dollars) are iid. Customer requests NA ∼ Poisson (30). Determine E [D]. (Solution on p.7. 0. 4.40.27 A property is oered for sale. Experience indicates the number having values 0 through 10.05 . N2 . with respective probabilities N (Solution on p.10 7 0. Exercise 15. What is the probability of at least one bid of $3.20 4 0. 491. The A discount retail store has two outlets in Houston. Determine Exercise 15. 200]. respectively Under the usual independence assumptions. 490. the probability of at least one bid of $125. the probability an order for a is 0. P (N2 ≥ 60).1. 490. Exercise 15. and for b is 0. Make the usual independence assumptions.10 6 0.84) The amounts they want to withdraw can be represented by an iid class having the common Y ∼ exponential (0. 0. MasterCard charges.6).35. Visa P (N1 ≥ 30). For store A. 0. a and b. P (N3 ≥ 50). What is the probability the total order for item b in a day is 50 or more? Exercise 15. 491.3.1.7 and 0. with respective probabilities N (Solution on p. N3 be the numbers of cash sales. 0.10 2 0. 491. are featured in a special sale. with a common warehouse. 0.24 The number of bids on a job is a random variable dollars) are iid with less? (Solution on p.) of bids is a random variable having Value Probability 0 0. Exercise 15. from store B.20 4 0. with respective probabilties 0.23 are phoned to the warehouse for pickup.01 ∗ [5 7 10 11 12 13 12 11 10 9] (15.10 5 0. 2. Bids (in thousands of uniform on [3.3. 3.05 10 0.15 3 0.) Exercise 15. For store B. 400.
Under the usual independence assumptions. t] is Poisson (µt). 492. Under the usual independence assumptions. Exercise 15. with the class (0.28 A property is oered for sale. 492. 1. The bid amounts (in thousands of form an iid class.15 2 0. in hours. uniform on [10. 491. between sales qualifying for a drawing is exponential (4). with respective probabilities N (Solution on p. Exercise 15. what is the expected time between a winning draw? What is the probability of three or more winners in a ten hour day? Of ve or more? Exercise 15. 3 or 4 students come in during oce hours on a given day.) of noise bursts on a data transmission line in a period number of digit errors caused by the geometric Suppose i th burst is Yi .000 or more.0025).) What is the probability that the maximum amount bid is greater than $5. Let V be the minimum Determine P (V > t) for integer values from 10 to 20.) The Noise pulses arrrive on a data phone line according to an arrival process such that for each Yi for Nt of arrivals in time interval (0.6 The market is such that bids (in thousands of dollars) are iid symmetric triangular on [150 250].34 number (Solution on p. Exercise 15. A computer store oers each customer who makes a purchase of $500 or more a free chance Suppose the times. 2.33 at a drawing for a prize.) of bids on a painting is binomial (10.20 4 0.31 Twelve solidstate modules are installed in a control system. is Poisson (7t). {Yi : 1 ≤ i} i th pulse has an intensity FY (u) = 1 − e−2u 2 t > 0 the such that the class is iid.05 Table 15. in hours. An error correcting system is capable or correcting ve or fewer errors µ = 12 and p = 0.10 6 0.481 Table 15.05 any unit could have a defect which results in a lifetime (in hours) exponential (0. 0.005 (37 − 2t)2 ≤ t ≤ 10.05.29 Suppose of the N N ∼ binomial values of the Yi .) are iid. Exercise 15. what is the probability that no visit will last more than 20 minutes.3) and the Yi (Solution on p.35 The number N (Solution on p. If the lengths of the individual visits. Determine the probability that in an eighthour day the intensity will not exceed two. 493. Experience indicates the number having values 0 through 8. two hours of (p). in minutes.000? Exercise 15.05 1 0. with probability (Solution on p. 491. what is the probability the unit does not fail because of a defective module in the rst 500 hours after installation? Exercise 15. 200] Determine the probability of at least one bid of $180. What is the probability of no uncorrected error in operation? .3). with common density function fY (t) = 0. 493.32 The number dollars) Yi N (Solution on p. The {Yi : 1 ≤ i} iid.) Suppose a teacher is equally likely to have 0.15 3 0. However.) If the modules are not defective. Exercise 15.) of bids is a random variable Number Probability 0 0. Yi − 1 ∼ in any burst. 20].1).30 (Solution on p. (10.35.10 7 0.15 5 0. (Solution on p. are iid exponential (0. 493. The probability of winning on a draw is 0.05 8 0. Determine the probability of at least one bid of $210. 0. uniform [150. with the common distribution function u ≥ 0.000 or more. they have practically unlimited life.5 The market is such that bids (in thousands of dollars) are iid. p = 0. 492. t].
1400 6. PN.0000 0. P To view the distribution.0000 0.0026 Solution to Exercise 15. P May use jcalc or jcalcf on N. Y. disp(gD) % Compare with P83 0 0.0208 6.1823 3.5]. disp(gD) 0 0.1641 1.1667 4.1025 2. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN PX Enter gen fn COEFFICIENTS for gY PY Results are in N.0008 11.3125 2.2158 4.0001 12. PY.5 0. PY = [0. PY. RANDOM SELECTION Solutions to Exercises in Chapter 15 Solution to Exercise 15.0140 % (Continued next page) 9. P To view the distribution. PN.0000 0. PD.0269 1.0000 0.3 (p.1 (p. D.2578 3. Y. . 477) PX = (1/36)*[0 0 1 2 3 4 5 6 5 4 3 2 1]. D.0000 0. D. call for gD.0000 0.0000 0. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN PN Enter gen fn COEFFICIENTS for gY PY Results are in N.6)].0000 0.0000 0. call for gD.0755 5.0000 0.0000 Solution to Exercise 15. PY = [0.0000 0. P May use jcalc or jcalcf on N. 476) PX = [0 (1/6)*ones(1.5 0.482 CHAPTER 15.0806 7.0000 0.5].0000 0.2 (p.0000 0.0040 10.0000 0.0375 8. 476) PN = (1/36)*[0 0 1 2 3 4 5 6 5 4 3 2 1].0000 0.0000 0.1954 5.0000 0. D. PD.
0000 0. D.0000 12. D.0000 9.0000 0.0000 P = (D>=3)*PD' P = 0.0000 7.2152 3.20)].0000 2.2113 0.0003 0.0047 0.0000 10.4 (p.483 PY = [5/6 1/6].0000 0.0000 5. PY. PD.2435 0.0000 0.0000 0. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N.0000 12. PD. disp(gD) 0 1.0000 0. call for gD.1116 Solution to Exercise 15.0000 0.0000 11. P To view the distribution. gY = [5/6 1/6]. D. Y.3660 2.0000 . disp(gD) 0 0.0000 0.0000 4. PN.0000 8.0828 4.0000 0. P To view the distribution.0000 11.0000 3.0000 10.0000 0.0000 0.3072 1.0000 13. Y.1419 0.0000 0. P May use jcalc or jcalcf on N. PN.0000 0. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN PX Enter gen fn COEFFICIENTS for gY PY Results are in N.0795 0. 477) gN = (1/20)*[0 ones(1.0001 8.0230 5.0000 0.0048 6.0008 7.0000 6.0370 0.0000 9.0000 0. PY.0001 0. P May use jcalc or jcalcf on N.0144 0.0013 0.2661 0. D. call for gD.
0000 20. Y.4. call for gD.1*[0 2 4 3 1].0. call for gD. PY.6 (p.0000 16.0000 18. gY = 0. Y.0000 15. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N. EY = dot(D. 477) gN = 0.0000 0. D. D. PY.0000 19. P May use jcalc or jcalcf on N. D.0000 0.5 (p. D. PY. P To view the distribution.0000 0. call for gD. PN. P To view the distribution.0000 0.01*[0 ones(1.484 CHAPTER 15.7 (p. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N.0000 0.100)].100)].9837 Solution to Exercise 15. P To view the distribution. . gY = [5/6 1/6]. PD.0:20). PN.01*[0 ones(1. D. EY = dot(D. Y.1]. gY = [0.9188 Solution to Exercise 15. RANDOM SELECTION 14.PD) EY = 5. PD. PD. P May use jcalc or jcalcf on N.0000 0.0000 Solution to Exercise 15.0000 0. PN.0500 P10 = (D<=10)*PD' P10 = 0. P May use jcalc or jcalcf on N. 477) gN = 0. 477) gN = ibinom(20.PD) EY = 8.9 0.4167 P20 = (D<=20)*PD' P20 = 0. D.0000 17. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N.
PN. gY = 0. 477) gN = ibinom(10.ED^2 31.01*[10 15 20 20 15 10 10].01*[0 5 10 15 20 15 10 10 5 5 5].1012 0. for i = 1:4 P(i) = (D<=k(i))*PD'.0.8720 ((15<=D)&(D<=25))*PD' 0.5.4). mgd Enter gen fn COEFFICIENTS for gN gN Enter VALUES for Y Y Enter PROBABILITIES for Y PY Values are in row matrix D. PD. end disp(P) 0.0. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N. P = zeros(1. Y. P May use jcalc or jcalcf on N.9725 0.6386 ((10<=D)&(D<=30))*PD' 0.9290 Solution to Exercise 15.6156 0. D.5 25 30.8497 0.9 (p. call for mD.0:10).5578 0. PY = 0.^2)*PD' . PY. k = [25 50 75 100].0:20).5 50 60]. P To view the distribution.PD) 18.8 (p. call for gD.485 ED ED VD VD P1 P1 P2 P2 = = = = = = = = dot(D. 477) gN = ibinom(20. Y = [12. probabilities are in PD.9614 P1 = ((100<D)&(D<=200))*PD' P1 = 0.5). To view the distribution.4000 (D.9998 Solution to Exercise 15.5 40 42.5. P = zeros(1. D.5144 . for i = 1:5 P(i) = (D<=t(i))*PD'.3184 0. s = size(D) s = 1 839 M = max(D) M = 590 t = [100 150 200 250 300].0310 0. end disp(P) 0.
PX) .PD) .PD1.PD1) . PX.PD2] = mgdf(gnG. PY.0000 % Check: ED1 = EnM*EY = 2.1968e+04 ED2 = dot(D2.ED1^2 VD1 = 6. [D2. Y = 200:20:300.4*250 VD1 = dot(D1.0.4214e+05 P1g2 = total((t>u).^2.PD) ED = 1.PX] = csort(Y16. 478) gnM = ibinom(4.PY).PY] = gendf(gn.10 (p.5*250 VD2 = dot(D2. M = max(X) M = 38 EX = dot(X.4214e+05 vD = VD1 + VD2 vD = 1.3).0803 % Check EX = En*Ey .PD1] = mgdf(gnM.16 = 0.P).0:5). % Check: VD = VD1 + VD2 .1*ones(1. t.PD2) ED2 = 625.2147 P16 = (X>=16)*PX' P16 = 0. 478) gn = 0. gy = (1/6)*[0 ones(1. Y.PD2.4612 k = [1500 1000 750].PD2).ED2^2 VD2 = 8.Y.D2.P] = icalcf(D1.ED^2 VD = 1.5.1875 Ppos = (X>0)*PX' Ppos = 0.16 = 4. ED = dot(D.^2.486 CHAPTER 15.5 % 4.PY).10).6)]. PY = 0.Y. u. gnG = ibinom(5.0.11 (p.2250e+03 % (Continued next page) VD = dot(D. RANDOM SELECTION Solution to Exercise 15.01*[10 15 25 25 15 10]. PDk = zeros(1.6. ED1 = dot(D1.5 .D2.PD1.gy).PX) EX = 0. Use array opertions on matrices X.0175e+04 [D1.EX^2 VX = 114.*P) P1g2 = 0.2250e+03 eD = ED1 + ED2 % Check: ED = ED1 + ED2 eD = 1. [X.0000 % Check: ED2 = EnG*EY = 2. [D1.0:4).4667 P10 = (X>=10)*PX' P10 = 0.PD] = csort(t+u.5*3.u.25 Solution to Exercise 15.^2.^2.PY). and P [D.t.PD2) .PD1) ED1 = 600.2500 VX = dot(X. [Y.5*3.
P To view the distribution.2556 0. D.7*0. gY = [0.7788 P15 = (D>=15)*PD' P15 = 0. D. D.7.3.0. 478) gN = ibinom(20. Pall = (D==20)*PD' Pall = 2.0:20).0660 Solution to Exercise 15. 478) n = 500.2 0.0660 pD = ibinom(20.0. PY.8872 Solution to Exercise 15. Y.12 (p.75. P10 = (D>=10)*PD' P10 = 0. D. p10 = (D>=10)*pD' p10 = 0. end disp(PDk) 0. call for gD.0:20). P May use jcalc or jcalcf on N.0:20).6. % Alternate: use D binomial (pp0) D = 0:20. gY = [0.7)^20 % Alternate: use D binomial (pp0) pall = 2. Y.3 0.7822e14 pall = (0.14 (p.3*0.7822e14 P10 = (D >= 10)*PD' P10 = 0.7326 0. .7].7788 p15 = (D>=15)*pD' p15 = 0.487 for i = 1:3 PDk(i) = (D>=k(i))*PD'. P To view the distribution. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N. PD. 478) gN = ibinom(20. PY.8.8]. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N. PD. call for gD. p0 = 0. PN. P May use jcalc or jcalcf on N.0. p = 0.0038 Solution to Exercise 15. PN.13 (p.
k = [200 225 250].2 = 25).0000 0.p*p0. Y = 0:80. PD = cpoisson(185. RANDOM SELECTION D = 0:500.9892 160.0001 245.16 (p.0000 0.47.3722 Solution to Exercise 15. D∼ Poisson (185). PNa = cpoisson(ma.4 + 65*0.0000 0. m2a = 45*0.5173 0.18 (p.0000 0.2 + 65*0.7785 180.0000 0.X).0000 250.8736 175.9893 0. 479) X = 0:30.3). PX = ipoisson(80*0.9362 170.0000 0. 479) Mac sales Poisson (30*0.0000 0.20 = 60).0000 0.488 CHAPTER 15.PD]') 150. . PD = ibinom(500. disp([k.0024 230.0067 225.1.17 (p.0776 210.25 = 125). Solution to Exercise 15.0000 0.0000 0.5).3663 195. 478) m1a = 50*0.0000 0.0000 0.D).0000 0. k = 150:5:250.0000 Solution to Exercise 15.9119 0. HP sales Poisson (30*0.[30 35 40]) PNa = 0.5098 190.0140 Solution to Exercise 15.6890 0.15 (p.0000 0.3 = 25. 478) JD ∼ Poisson (500*0.0000 0.0000 0.0000 0. end disp(P) 0. ma = m1a + m2a. for i = 1:3 P(i) = (D>=k(i))*PD'. total Mac plus HP sales Poisson(50.2405 200.0002 240.0000 0.6532 185.0167 220.k).0000 0.33.5).9718 165.0379 215.1435 205. GD ∼ Poisson (300*0.0000 0.0000 0. P = zeros(1.9964 155.0008 235.
1572 Solution to Exercise 15.^2.Y). and P P1 = (2..20 (p.2407 P2 = cpoisson(26.2100 YP = 1:5.^2.2834 M = t>=0.75 b = 295 YT = [1 2].25. P ∼ Poisson (200*0.5*X<=475)*PX' P1 = 0.2. 479) T ∼ Poisson (200*0. 479) P1 = cpoisson(130*0... icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X. Y. a = 85 b = 200*0. PM = total(M.8785 M = 2. icalc: X Y PX PY . EYT = dot(YT.*P) PM = 0..EYT^2 VYT = 0.PYT) .16) .6400 .PYT) EYT = 1.19 (p.2834 pX10 = cpoisson(8.489 PY = ipoisson(80*0.Y).3000 VYT = dot(YT.3].. Y = 0:300.7500 Solution to Exercise 15. PM = total(M.41) = 0.8 + 180*0.X). t. PYP = 0. PY = ipoisson(700*0.21 (p.5*u.2 + 180*0.7 0. PX = ipoisson(700*0..PX10 = (X>=10)*PX' % Approximate calculation PX10 = 0. EYP = dot(YP. u.cpoisson(26.25 = 85).4000 VYP = dot(YP.1*[3 3 2 1 1].9819 Solution to Exercise 15.75 = 295). PY. 479) X = 0:400.PYP) ..PYP) EYP = 2.10) % Direct calculation pX10 = 0.8 + 180*0.20. PYT = [0.30) = 0.3. PX.EYP^2 VYP = 1...5*t<=(3*u + 50).*P) PM = 0..
gYP). 480) (15. PM = total(M.NT).2468 . PX = ipoisson(120*0. PY. PY = ipoisson(120*0.3s2 − 1 gDP (s) = exp 295 0. PX.86) X = 0:120.85) (15. EDT = dot(DT.5000 EDP = 295*EYP EDP = 708. u. 480) Solution to Exercise 15.NP).1 3s + 3s2 2s3 + s4 + s5 − 1 gD (s) = gDT (s) gDP (s) Solution to Exercise 15.6. RANDOM SELECTION EDT = 85*EYT EDT = 110. gYP = 0. Y = 0:120.50) = 0.7s + 0. and P M = t > u.7190 Solution to Exercise 15.1*[0 3 2 2 1 1].22 (p. t. gNP = ipoisson(295.EDT^2 VDT = 161. gYT = 0.*P) PM = 0. icalc Enter row matrix of X values X Enter row matrix of Y values Y Enter X probabilities PX Enter Y probabilities PY Use array opertions on matrices X.PDT) EDT = 110.24 (p.490 CHAPTER 15.23 (p. % Possible alternative gNT = ipoisson(85.0000 ED = EDT + EDP ED = 818.5000 VP = 295*(VYP + EYP^2) VP = 2183 VD = VT + VP VD = 2.PDT) .PDT] = gendf(gNT. 480) P = cpoisson(30*0.Y).gYT).1*[0 7 3].5000 VDT = dot(DT.5000 NP = 0:500.PDP] = gendf(gNP. Y.X).7+40*0.^2. % Requires too much memory gDT (s) = exp 85 0.4.35. [DP.5000 VT = 85*(VYT + EYT^2) VT = 161. [DT.2705e+03 NT = 0:180.
[D. PY = 1 .4598 0.PD] = gendf(gN.7490 0.gY).PY) P = 0.gY). 480) Consider a sequence of N trials with probabiliy p = (180 − 150) /50 = 0.25 PY =0.exp(0. [D.6794 % Second solution .PY) % fliplr puts coeficients in % descending order of powers FW = 0. gN = 0. % D is number of successes Pa = (D>0)*PD' % D>0 means at least one successful bid Pa = 0. P = (D>0)*PD' P = 0.6800 PW = 1 .01*[5 10 15 20 20 10 10 7 3].polyval(fliplr(gN). P = 1 .6536 Solution to Exercise 15.75)^7 P = 0.27 (p. % i. gY = [0.25 pN = ibinom(7.26 (p.0. FW = polyval(fliplr(gN).25 (p.1330 0.0:7).Positive number of satisfactory bids.8989 0.6794 Solution to Exercise 15.5 + 0.6]. gY = [3/4 1/4].01*[0 5 7 10 11 12 13 12 11 10 9].6536 %alternate gY = [0. t = 100:100:500. PY = 0.6*0.(4/5)^2) PY = 0. P = (D>0)*PD' P = 0.25.9116 Solution to Exercise 15.FY(t) = 1 . 481) gN = 0. 480) Use FW (t) = gN [P (Y ≤ T )] gN = 0.PD] = gendf(gN.29 (p.32].4 0.polyval(fliplr(gN).gY). with P(E) = 0.9615 Solution to Exercise 15.68 0.6. the outcome is indicator for event E. gN = 0.PD] = gendf(pN. 480) Probability of a successful bid P Y = (125 − 100) /100 = 0.01*[5 15 15 20 10 10 5 5 5 5 5].01*[5 15 15 20 15 10 10 5 5].PY) PW = 0.gN[P(Y>t)] P = 1(0.491 % First solution .28 (p.5*(1 . 481) .4 + 0.8493 Solution to Exercise 15. % Generator function for indicator [D.6.01*t).e.
1686 Columns 8 through 11 0.5104 0.0.PD] = gendf(gN.7 + 0.30 (p.45 (15. PW = (D==0)*PD' PW = 0.8410 Solution to Exercise 15.31 (p. = 0.0. p = 1 .32 (p.0:10).7635 Solution to Exercise 15.9718 0.0360 0. % Alternate [D.7635 gY = [p 1p].0147 0 t p P P Solution to Exercise 15.0664 0.5104 0.05*p)^12 FW = 0.0360 0. = 10:20.7092 0. [D. RANDOM SELECTION gN = ibinom(10.1092 0.3612 0.9718 0.8352 .gY). [D. % D is number of "successes" Pa = (D>0)*PD' Pa = 0.0. 481) 5 P (Y ≤ 5) = 0.0.(0. FW = polyval(fliplr(gN).3*p)^10 P = 0. gY = [p 1p].exp(2).2503 0.1686 Columns 8 through 11 0.^10 .3.3*p).PD] = gendf(gN.0:10). PW = (D==0)*PD' PW = 0. FW = (0.0:12).7 + 0.1092 gN = 0.005 2 (37 − 2t) dt = 0.8352 gN = ibinom(10.7^10 = Columns 1 through 7 0.gY).p) FW = 0.exp(0.1*(20 .5).45.7092 0.3. gY = [p 1p].PD] = gendf(gN.2*ones(1.7^10 % Alternate form of gN Pa = Columns 1 through 7 0.0147 0 Pa = (0.0664 0.87) p = 0.3612 0.t).8410 gN = ibinom(12.95 + 0.492 CHAPTER 15. P = 1 .p) .0. 481) 0.2503 0.gY). 481) p = 1 . = polyval(fliplr(gN).05.0025*500).
WDt exponential (λp).35. p = 0.3233 0.33 (p.exp(t^2) .3586 Solution to Exercise 15.05. FW = exp(mu*t*(1 .0527 Solution to Exercise 15. t = 2. FW2 = exp(56*(1 .1)) FW = 0.34 (p. t = 10.0138 . mu = 12.1)) FW2 = 0. lambda = 4.[3 5]) PND10 = 0. 481) N8 is Poisson (7*8 = 56) gN (s) = e56(s−1) . 481) FW (k) = gN [P (Y ≤ k)]P (Y ≤ k) − 1 − q k−1 Nt ∼ Poisson (12t) q = 1 .0. t = 2.493 Solution to Exercise 15. 481) Nt ∼ Poisson (λt). NDt ∼ Poisson (λpt). EW = 1/(lambda*p) EW = 5 PND10 = cpoisson(lambda*p*t. k = 5.35 (p.q^(k1) .
RANDOM SELECTION .494 CHAPTER 15.
given an event. designated {X. Denition. N (16. and of the theory of Markov systems. the concept of conditional independence of events is examined and used to model a variety of common situations.2) holds for all reasonable M and N is (technically. Y . given a random vector. X and Y. Given a Random Vector 16. 495 . This suggests a possible extension to conditional expectation.3) Remark. Y } conditionally independent.Chapter 16 Conditional Independence. In this unit.5/>. or Z be real valued. of many topics in decision theory. for all Borel sets M.org/content/m23813/1.2). given C. the rst of these.1 The concept The denition of conditional independence of events is based on a product rule which may be expressed in terms of conditional expectation. given event C. X. We examine the following concept. As in the case of other concepts. i the product rule E [IM (X) IN (Y ) C] = E [IM (X) C] E [IN (Y ) C] (16. We examine in this unit. For example. Y } conditionally independent. Given a Random Vector1 In the unit on Conditional Independence (Section 5. it is useful to identify some key properties.1) be reasonable to A = X −1 (M ) and B = Y −1 (N ). This concept lies at the foundations of Bayesian statistics.1) . The pair {X.s. if X M and N is a three dimensional random vector. For example. we understand that the sets respectively. we investigate a more general concept of conditional independence. then Since it is not necessary that are on the codomains for M is a subset of R3 . the following (CI1) 1 This E [IM (X) IN (Y ) Z] = E [IM (X) Z] E [IN (Y ) Z] a. Y } ci Z . The pair {A. based on the theory of conditional expectation.1 Conditional Independence. In the unit on Markov Sequences (Section 16. B} is conditionally independent. which we refer to by the numbers used in the table in Appendix G. We note two kinds of equivalences. givenZ. N content is available online at <http://cnx. then IA = IM (X) and IB = IN (Y ). very briey. are equivalent. It would the pair {X. all Borel M and N ). 16.1. we provide an introduction to the third. i for all Borel E [IM (X) IN (Y ) Z] = E [IM (X) Z] E [IN (Y ) Z] M. i E [IA IB C] = P (ABC) = P (AC) P (BC) = E [IA C] E [IB C] If we let consider (16.
we have E{IN (Y ) IQ (Z) E [IM (X) Z. (16. Y } ci Z implies that additional knowledge about Y does not modify that best estimate. Q (16. 496). (CI2) (p. Z) = E [IM (X) Z]. 428) shows that (CI2) (p. Z) Z. property (CI4) (p. Y ] = E [IM (X) IQ (Z) Z] a. Y ] = E [g (X.9) (16. 426) yields E{IN (Y ) IQ (Z) E [IM (X) Z]} = E{IQ (Z) E [IN (Y ) E [IM (X) Z] Z]} = E{IQ (Z) E [IM (X) Z] E [IN (Y ) Z]} = E{IQ (Z) E [IM (X) IN (Y ) Z]} = E [IN (Y ) IQ (Z) IM (X)] which establishes the desired equality.s.4) 600) for expectation we may assert e1 (Y. 496). Z) Y ] = E{E [g (X. Z) h (Y. h Because the indicator functions are special Borel functions. Z) = E [IM (X) Z. E [IM (X) IN (Y ) Z] = E [IM (X) Z] E [IN (Y ) Z] a. Y ] = E [IM (X) Z] a. 496) provides an important tation is often the most useful as a modeling assumption.7) are useful in a variety of contexts. Z) a. Y } ci Z . To show that (CI1) (p. 496) being the dening condition for (CI1) (CI2) (CI3) (CI4) {X. use of (CE1) (p.5) e2 (Y. 496) is equivalent to (16.8) • (CI2) (p. Z) Z] E [h (Y. (CE8) (p. Z) = (16. for all Borel E [IN (Y ) IQ (Z) e1 (Y. CONDITIONAL INDEPENDENCE. 426). . we show the equivalence of (CI1) (p. Y ] Z} = E{IN (Y ) E [IM (X) Z. Q As an example of the kinds of argument needed to verify these equivalences. 496) and (CI3) (p. 496) is equivalent to (CI6) E [g (X. Property (CI9) Property (CI7) ("(CI7) ". The additional properties in Appendix G (Section 17. Set If we show e1 (Y. Similarly. monotonicity. so also (CI3) (p. 495). 496). with (CI1) (p.6) (16. Y ] Z} = E{IN (Y ) E [IM (X) Z] Z} = E [IM (X) Z] E [IN (Y ) Z] (CI1) (p. we have E [IM (X) IN (Y ) Z] = E{E [IM (X) IN (Y ) Z.s.s. 496) and (CI2) (p. for all Borel sets M. Z) Z] = E [g (X. 602) is just a convenient way of expressing the other conditions. 496) is a special case of (CI5) 495). 426) for conditional expectation. 602) is an alternate way of expressing (CI6) (p. The condition {X. 496). 428). given knowledge of Z. for all Borel sets M E [IM (X) IQ (Z) Z. (CI3) (p. GIVEN A RANDOM VECTOR (CI5) E [g (X.11) Use of property (CE8) (p.s. for all Borel sets M. and (CI4) (p. Y ] and e2 (Y. N.s. 427) for conditional expectation.10) (16. 428).s.s.7) (16. We refer to them as needed. for all Borel functions g 496). The properties (CI1) (p.s. for all Borel functions g. (CE8) (p. Z)] then by the uniqueness property (E5b) (list. 496) extends to (CI5) (p.496 CHAPTER 16. Using (CE9) (p. 495). Z) Z] a. Z) Z] a. 428). 496). • (CI1) (p. ("(CI9) ". (CI2) (p. p. Z) Z] is the best meansquare estimator for g (X. 496) implies (CI5) (p. (CI1) (p. for all Borel functions g (CI8) E [g (X. 496) are equivalent. p. for all Borel sets M. A second kind of equivalence involves various patterns. (p. 496). 496). and 426) to (CE6) monotone convergence in a manner similar to that used in extending properties (CE1) (p. 428). Z) Z] Y } a. we need to use linearity. Now just as interpretation of conditional independence: E [g (X. and (CE8) (p. p. particularly in establishing properties of Markov systems. Y ]} = E [IN (Y ) IQ (Z) IM (X)] On the other hand. (CI1) (p. N E [IM (X) Z. (p. 496) implies (CI1) (p. Q E [IM (X) IQ (Z) Y ] = E{E [IM (X) IQ (Z) Z] Y } a. Using the dening property (CE1) (p. and (CE1) (p. 496) are equivalent. 496) implies (CI2) (p. This interpreProperty (CI6) (p. Z). 496). Z)] = E [IN (Y ) IQ (Z) e2 (Y.
The fraction of items from a production line which meet specications. but most important. approach makes a fundamental change of viewpoint. For each 2. u in the range of H. given H = u. To see this. · · · . The population distribution Since the population mean is a by experiment. consider Xi = IEi .12) =E i=1 If I(−∞. · · · Xn ≤ tn H = u) n n n (16. given H. X2 . The value of the parameter (say µ in the discussion above) is a realization of a parameter random variable H. · · · . In this view. The fraction of women between the ages eighteen and fty ve who hold full time jobs. The Bayesian This has no meaning. it is modeled as a random variable whose value is to be determined state of nature.. If two or more parameters are sought (say the mean and variance). H is the parameter The Bayesian model If X is a random variable whose distribution is the population distribution and random variable.14) is binomial (n. a fundamental problem is to obtain information about the population There is an inherent diculty with this Suppose it is desired to determine the population mean distribution from the distribution in a simple random sample. then the conditional distribution for Sn . The proportion of a given population which has a certain disease. 4. The mean value is thus a random variable whose value is determined by this selection. prior distribution for H. X2 ≤ t2 . If Sn is the number of successes in a sample of size H is the parameter random variable and n.ti ] (Xi ) H = u = i=1 E I(−∞. statistical problems: determining the proportion of a population which has a particular characteristic. The population distribution is a conditional distribution.e. The proportion of a population of voters who plan to vote for a certain candidate.2 The Bayesian approach to statistics In the approach. 2. 3. {Xi : 1 ≤ i ≤ n} is conditionally and consider the joint conditional distribution function iid. 2. We assume a 3. given the value of H. tn u) = P (X1 ≤ t1 . 1. that of We Population proportion We illustrate these ideas with one of the simplest. with P (Ei H = u) = E [Xi H = u] = e (u) = u . Let W = (X1 . (16. To implement this point of view. 1. The parameter in this case is the proportion p who meet the criterion. we have a conditional distribution for X. mention only a few to indicate the importance. then a similar product rule holds. This is based on previous experience. we assume 1. classical approach to statistics. µ. given H : i. then the sampling process is equivalent to a sequence of Bernoulli trials. given H = u. we cannot assign a probability such as P (a < µ ≤ b).13) X has conditional density. then {X.497 16. since it is a constant. about which there is uncertainty. We have a random sampling process.ti ] (Xi ) H = u = i=1 FXH (ti u) (16. H} have a joint distribution. given H.1. quantity about which there is uncertainty. Examples abound. Now µ is an unknown quantity However. t2 . One way of expressing this idea is to refer to a has been selected by nature from a class of distributions. If sampling is at random. they may be considered components of a parameter random vector. the population distribution is conceived as randomly selected from a class of such distributions. u). Xn ) FW H (t1 .
proves to be a natural choice for this Its range is the unit interval.2)). given an event. we must assume a prior distribution for any. Sn = k .17) fH (u) du (16. Two steps must be taken: H = u. s. k) uk (1 − u) n−k and E [Sn H = u] = nu (16. we know an exression for E [Sn H = u] = nu. s) distribution (see Appendix G (Section 17. E [HSn = k]. 426) to obtain E [HSn = k] = E{HE I{k} (Sn ) H } E HI{k} (Sn ) = = E I{k} (Sn ) E{E I{k} (Sn ) H } = C (n. if The Bayesian reversal Since {Sn = k} is an event with positive probability. (r.498 CHAPTER 16.1) and 2 (Figure 16. . GIVEN A RANDOM VECTOR u as in the ordinary Bernoulli case.16) The objective We seek to determine the best meansquare estimate of 1. H on the basis of prior knowledge.7)). we use the denition of the conditional expectation. k) uk (1 − u) n−k n−k uE I{k} (Sn ) H = u fH (u) du E I{k} (Sn ) H = u fH (u) du fH (u) du (16. CONDITIONAL INDEPENDENCE.15) E I{k} (Si ) H = u = P (Sn = kH = u) = C (n. Sampling gives We make a Bayesian reversal to get 2. and the law of total probability (CE1b) (p. k) uk+1 (1 − u) C (n. If H. the density function can be given a variety of forms (see Figures 1 (Figure 16. given Sn = k. is the number of successes in If Anaysis is carried out for each xed n n Sn = i=1 we have the result Xi = i=1 IEi n component trials (16. To complete the task. and by proper choice of parameters r.18) A prior distribution for H The beta purpose.
2.1: The Beta(r. . s = 1.s) density for r = 2. 10.499 Figure 16.
using the integral formula above. s) A (r. 1]. fH (16.23) given Sn = k .t] (H) . s). s) . fH has a [0. shows E [H] = If the prior distribution for r r+s and Var [H] = rs (r + s) (r + s + 1) 1 k+r n+s−k−1 u (1 − u) du 0 1 k+r−1 n+s−k−1 u (1 − u) du 0 2 (16.19) (r.22) du We may adapt the Γ (r + k + 1) Γ (n + s − k) Γ (r + s + n) k+r · = Γ (r + s + n + 1) Γ (r + k) Γ (n + s − k) n+r+s analysis above to show that H is conditionally beta (r + k. s) = uk+1 (1 − u) uk (1 − u) u (1 − u) − u) s−1 du 1 0 n−k r−1 u (1 s−1 = (16. For r. s ≥ 2.2: The Beta(r.24) It (H) = I[0. s = 2. FHS (tk) = E It (H) I{k} (Sn ) E I{k} (Sn ) where (16. CONDITIONAL INDEPENDENCE. so that determination Γ (r + s) r−1 s−1 s−1 t (1 − t) = A (r.21) H 1 0 is beta (r.s) density for r = 5. s + n − k) .500 CHAPTER 16. 5. we may complete the determination of E [HSn = k] as follows. s) tr−1 (1 − t) 0<t<1 Γ (r) Γ (s) maximum at (r − 1) / (r + s − 2).20) is a polynomial of the distribution function is easy. GIVEN A RANDOM VECTOR Figure 16. Its analysis is based on the integrals 1 0 For ur−1 (1 − u) H∼ beta s−1 du = Γ (r) Γ (s) Γ (r + s) with Γ (a + 1) = aΓ (a) (16. In any case. the density is given by fH (t) = For on r ≥ 2. straightforward integration. n−k r−1 E [HSn = k] = A (r. 10. s positive integers. (16.
we get In the integral H∼ (r. Conditional densities for repeated sampling. the conditional distribution given size Assuming prior ignorance (i. Fourteen H ∼ beta (1. with thirteen favorable responses.25) The integrand is the density for beta (r + k. what is S20 = 14? After the rst sample is taken. Example 16. is (r + k) / (r + s + n) = 15/22 ≈ 0. The value is as likely to be in any subinterval of a given length as in any other of the same length. 7).1 (Population proportion with a SOLUTION For above.3: beta prior). except that For H is replaced by beta It (H).501 The analysis goes through exactly as for expression for the numerator. n + s − k) = (15. however. A sample of size respond in favor of the increase. According the treatment (k + r. distribution for the rst sample as the prior for the second.6818.e. If there is no prior r = 1. one factor u is replaced by It (u). t FHS (tk) = Γ (r + s + n) Γ (r + k) Γ (n + s − k) uk+r−1 (1 − u) 0 n+s−k−1 du = 0 fHS (uk) du (16.1)). The density has a maximum at (r + k − 1) / (r + k + n + s − k − 2) = k/n. which corresponds to H∼ uniform on (0. s. s = 1. 1). s). The information in the sample serves to modify the distribution for that information. conditional upon Example 16. Make a new estimate of H. t E [HSn = k].1: Population proportion with a beta prior It is desired to estimate the portion of the student body which favors a proposed increase in the student blanket tax to fund the campus radio station. we simply take H can be utilized to select suitable r. The conditional expecation. a second sample of Analysis is made using the conditional n = 20 is taken. n + s − k). H the rst sample the parameters is conditionally beta .. that n = 20 is taken. are r = s = 1. Any prior information on the distribution for information. H. Figure 16.
the estimator for µ is the sample average.502 CHAPTER 16. GIVEN A RANDOM VECTOR For the second sample. with the conditional distribution as the new prior. The same result is obtained if the two samples are combined into one sample of size 40. the maximum for the second is somewhat larger and occurs at a slightly smaller reecting the smaller k. we should expect more sharpening of the density about the new meansquare estimate. And the density in the second case shows less spread. although there is nothing The sampling procedure For small samples. Since. given Sn .'k') As expected.6750.30) .t). the estimate is the conditional expectation of H.t. and are absolutely continuous. k = 13.beta(28.7.t). u) du = fW H (tu) fH (u) du (16. In any event. for the Bayesian case with beta prior.01:1. The Bayesian estimate is often referred to as the small sample estimate.26) For large sample size n.14. the dierence may be quite in the Bayesian procedure which calls for small samples.beta(15. case prior information is not utilized. To determine E [HW = t] = we need ufHW (ut) du (16. H} is jointly absolutely continuous. Bayesian reversal in the analysis. t = 0:0.27) fHW (ut). Suppose fW H (tu) and fH (u) are specied.'k'. It may be well to compare the result of Bayesian analysis with that for classical statistics. u) fW (t) and fW H (t. This requires a Bayesian reversal of the conditional densities. these do not dier signicantly. H ∼ beta (15. In the treatment above. we make the comparison with the case of no prior knowledge ( r = s = 1). The new conditional distribution has parameters r∗ = 15 + 13 = 28 ∗ and s = 20 + 7 − 13 = 14. plot(t. next. The best estimate of H is 28/ (28 + 14) = 2/3. we utilize the fact that the conditioning random variable the pair Sn is discrete. u) = fW H (tu) fH (u) (16. For the classical case. resulting from the fact that prior information from the rst sample is incorporated into the analysis of the second sample. The density has a maximum at t = (28 − 1) / (28 + 14 − 2) = 27/40 = 0.28) Since by the rule for determining the marginal density fW (t) = we have fW H (t. name Bayesian comes from the role of The application of Bayesian analysis to the population proportion required Bayesian reversal in the case of discrete Sn . The idea of the Bayesian approach is the view that an unknown parameter about which there is uncertainty is modeled as the value of a random variable. in the latter. the Bayesian estimate seems preferable for small samples. Now by denition fHW (ut) = fW H (t. CONDITIONAL INDEPENDENCE. and the prior n = 20. important. 7). and it has the advantage that upgrades the prior distribution. The conditonal densities in the two cases may be plotted with MATLAB (see Figure 1). We consider. For the new sample. t. The essential prior information may be utilized.29) fHW (ut) = fW H (tu) fH (u) fW H (tu) fH (u) du and E [HW = t] = ufW H (tu) fH (u) du fW H (tu) fH (u) du (16. If Sn = k : Classical estimate = k/n Bayesian estimate = (k + 1) / (n + 2) (16. this reversal process when all random variables The Bayesian reversal for a joint absolutely continuous pair {W.
tn ). This should provide a probabilistic perspective for a more complete study of the algebraic analysis. Each (16. · · · . The essential ChapmanKolmogorov In the usual timehomogeneous case with nite state space. t1 . and the SOLUTION n fXi H (ti u) = ue−uti Hence so that fW H (tu) = i=1 ue−uti = un e−ut ∗ (16.33) 16. · · · } trial." given the present. and t∗ = t1 + t2 + meansquare estimate of H. We consider only a nite state space and identify the states by integers from 1 to M. Determine the best Xi are conditionally iid. This is the The past inuences the future only through the present. X2 . At discrete instants of time state to the succeeding one (which may be the same). Markov sequences We wish to model a system characterized by a sequence of we call transition times. the move to the next state is characterized by a distribution. utilizing algebraic tools such as matrix analysis. whose value is one of the members of a set E. . We suppose that the system is memoryless. Put · · · + tn .31) E [HW = t] = ∞ n+1 −(λ+t∗ )u u e du 0 ∞ n −(λ+t∗ )u u e du 0 ufHW (ut) du = (λ + t∗ ) · n! n+1 ∞ n+1 −ut∗ u e λe−λu du 0 ∞ n −ut∗ u e λe−λu du 0 (16.2: A Bayesian reversal Suppose H ∼ exponential sample of size n (λ) is taken. 2. the ChapmanKolmogorov equation leads to the algebraic formulation that is widely studied at a variety of levels of mathematical sophistication. equation is seen as a consequence of this conditional independence. there is either a change to a new state or a renewal Each state is maintained unchanged during the of the state immediately before the transition. In this section.32) = = (n + 1)! (λ + t∗ ) n+2 n+1 = (λ + t∗ ) n where t = i=1 ∗ ti (16. exponential (u). We view an observation of the system as a where N = {0. the state is represented by a value of a random variable Xi . 1.2. We suppose the system is evolving in time. we only sketch some of the more common results. For period i. t = (t1 . or a trajectory.34) yields a sequence of states {X0 (ω) . X1 (ω) .org/content/m23824/1. t2 . or stage period between transitions.503 Example 16. in the sense that the transition probabilities are dependent upon the current state (and perhaps the period number). · · · } which is referred to as a realization composite ω of the sequence. Markov property. but not upon the manner in which that state was reached. states taken on at discrete instants which At each transition time. A W = (X1 . conditional transition probability At any transition time. given W = t. known as the state space.5/>. we show that the fundamental Markov property is an expression of conditional independence of past and future. With the background laid. given H = u.1 Elements of Markov Sequences Markov sequences (Markov chains) are often studied at a very elementary level. · · · .2 Elements of Markov Sequences2 16. We thus have a sequence XN = {Xn : n ∈ N}. which we model in terms of conditional independence. Xn ). t2 . · · · the system makes a transition from one 2 This content is available online at <http://cnx.
The evolution of the system may be pictured as follows:

    Initial period:  t ∈ [0, t_1), state is X_0(ω); at t_1 the transition is to X_1(ω)
    Period one:      t ∈ [t_1, t_2), state is X_1(ω); at t_2 the transition is to X_2(ω)
    ···
    Period k:        t ∈ [t_k, t_{k+1}), state is X_k(ω); at t_{k+1} the move is to X_{k+1}(ω)

Table 16.1

The parameter n indicates the period t ∈ [t_n, t_{n+1}). If the periods are of unit length, then t_n = n. At t_{n+1}, there is a transition from the state X_n(ω) to the state X_{n+1}(ω) for the next period. To simplify writing, we adopt the following convention:

    U_n = (X_0, X_1, ···, X_n) ∈ E^n,   U_{m,n} = (X_m, ···, X_n),   U^n = (X_{n+1}, X_{n+2}, ···)     (16.35)

The random vector U_n is called the past at n of the sequence X_N, and U^n is the future at n. In order to capture the notion that the system is without memory, so that the future is affected by the present, but not by how the present is reached, we utilize the notion of conditional independence, given a random vector, in the following definition.

Definition. The sequence X_N is Markov iff

    (M)  {X_{n+1}, U_n} ci X_n  for all n ≥ 0     (16.36)

Remark. The state in the next period is conditioned by the past only through the present state, and not by the manner in which the present state is reached. The statistics of the process are determined by the initial state probabilities and the transition probabilities

    P(X_{n+1} = k|X_n = j),  for all j, k ∈ E, n ≥ 0     (16.37)

We note first that (M) is equivalent to

    P(X_{n+1} = k|X_n = j, U_{n−1} ∈ Q) = P(X_{n+1} = k|X_n = j)  for all j, k ∈ E and Q ⊂ E^{n−1}     (16.38)

Several conditions equivalent to the Markov condition (M) may be obtained with the aid of properties of conditional independence. The following examples exhibit a pattern which implies the Markov condition and which can be exploited to obtain the transition probabilities.

Example 16.3: One-dimensional random walk
An object starts at a given initial position. At discrete instants t_1, t_2, ··· the object moves a random distance along a line. The various moves are independent of each other. Let Y_0 be the initial position, and let Y_k be the amount the object moves at time t = t_k, with {Y_k : 1 ≤ k} iid. Then X_n = ∑_{k=0}^n Y_k is the position after n moves, and X_{n+1} = g(X_n, Y_{n+1}) = X_n + Y_{n+1}. Since the position after the transition at t_{n+1} is affected by the past only through the value of the position X_n, and not by the sequence of positions which led to this position, it is reasonable to suppose that the process X_N is Markov. We verify this below; a short simulation of such a walk is sketched next.
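The following minimal simulation sketch (an addition, not from the original text) illustrates the driving-variable form X_{n+1} = X_n + Y_{n+1} of Example 16.3; the step distribution, uniform on {−1, 0, 1}, is an assumption made only for illustration.

    % Sketch (added): one realization of a one-dimensional random walk
    rng(0)                        % assumed seed, for reproducibility
    n = 100;                      % number of moves
    Y0 = 0;                       % initial position
    Y = randi([-1 1],1,n);        % iid moves Y_1,...,Y_n
    X = Y0 + cumsum(Y);           % X_k = Y_0 + Y_1 + ... + Y_k
    plot(0:n,[Y0 X],'k')          % the trajectory {X_n(omega)}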
Example 16.4: A class of branching processes
Each member of a population is able to reproduce. For simplicity, we suppose that at certain discrete instants the entire next generation is produced. Some mechanism limits each generation to a maximum population of M members. Let Z_{in} be the number propagated by the ith member of the nth generation, where Z_{in} = 0 indicates death and no offspring, and Z_{in} = k indicates a net of k propagated by the ith member (either death and k offspring, or survival and k − 1 offspring). The population in generation n + 1 is given by

    X_{n+1} = min{M, ∑_{i=1}^{X_n} Z_{in}}     (16.39)

We suppose the class {Z_{in} : 1 ≤ i ≤ M, 0 ≤ n} is iid. Let Y_{n+1} = (Z_{1n}, Z_{2n}, ···, Z_{Mn}). Then {Y_{n+1}, U_n} is independent for each n ≥ 0, and X_{n+1} = g_{n+1}(X_n, Y_{n+1}). It seems reasonable to suppose the sequence X_N is Markov.

Example 16.5: An inventory problem
A certain item is stocked according to an (m, M) inventory policy, as follows:

• If stock at the end of a period is less than m, order up to M.
• If stock at the end of a period is m or greater, do not order.

Let X_0 be the initial stock, X_n be the stock at the end of the nth period (before restocking), and D_n be the demand during the nth period. Then for n ≥ 0,

    X_{n+1} = { max{M − D_{n+1}, 0}   if 0 ≤ X_n < m
                max{X_n − D_{n+1}, 0} if m ≤ X_n }   = g(X_n, D_{n+1})     (16.40)

If we suppose {D_n : 1 ≤ n} is iid, then {D_{n+1}, U_n} is independent for each n ≥ 0, and the Markov condition seems to be indicated.

Remark. In this case, the actual transition takes place throughout the period. However, for purposes of analysis, we examine the state only at the end of the period (before restocking). Thus, the transitions are dispersed in time, but the observations are at discrete instants.

Example 16.6: Remaining lifetime
A piece of equipment has a lifetime which is an integral number of units of time. When a unit fails, it is replaced immediately with another unit of the same type. Suppose X_n is the remaining lifetime of the unit in service at time n, and Y_{n+1} is the lifetime of the unit installed at time n, with {Y_n : 1 ≤ n} iid. Then

    X_{n+1} = { X_n − 1      if X_n ≥ 1
                Y_{n+1} − 1  if X_n = 0 }   = g(X_n, Y_{n+1})     (16.41)

A pattern yielding Markov sequences
Each of these four examples exhibits the same pattern. Suppose {Y_n : 0 ≤ n} is independent (call these the driving random variables). Set

    X_0 = g_0(Y_0)  and  X_{n+1} = g_{n+1}(X_n, Y_{n+1})  for all n ≥ 0     (16.42)

A short simulation sketch of this pattern for the inventory example is given below.
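As promised above, here is a short sketch (added; not in the original) of the driving-variable pattern for the inventory policy of Example 16.5, with m = 1, M = 3 and Poisson(1) demand assumed; poissrnd is from the MATLAB Statistics Toolbox.

    % Sketch (added): X_{n+1} = g(X_n, D_{n+1}) for the (m,M) policy
    m = 1; M = 3;
    g = @(x,d) (x < m).*max(M - d,0) + (x >= m).*max(x - d,0);
    N = 20;                          % number of periods
    D = poissrnd(1,1,N);             % iid demands D_1,...,D_N
    X = zeros(1,N+1);  X(1) = M;     % initial stock X_0 = M
    for n = 1:N
      X(n+1) = g(X(n),D(n));         % end-of-period stock (before restocking)
    end
    disp(X)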
Under this pattern:

a. X_N is Markov.
b. P(X_{n+1} ∈ Q|X_n = u) = P[g_{n+1}(u, Y_{n+1}) ∈ Q], for all n, all u ∈ E, and any Borel set Q.

VERIFICATION

a. Since {Y_n : 0 ≤ n} is independent and U_n = h_n(Y_0, Y_1, ···, Y_n), the pair {Y_{n+1}, U_n} is independent. By property (CI13) ("(CI13)", p. 599), with X = Y_{n+1}, Y = X_n, and Z = U_{n−1}, we have

    {Y_{n+1}, U_{n−1}} ci X_n     (16.43)

Since X_{n+1} = g_{n+1}(Y_{n+1}, X_n) and U_n = h_n(X_n, U_{n−1}), property (CI9) ("(CI9)", p. 602) ensures

    {X_{n+1}, U_n} ci X_n  ∀ n ≥ 0     (16.44)

which is the Markov property.

b. P(X_{n+1} ∈ Q|X_n = u) = E{I_Q[g_{n+1}(X_n, Y_{n+1})]|X_n = u} = E{I_Q[g_{n+1}(u, Y_{n+1})]} a.s. [P_X] by (CE10b) ("(CE10)", p. 603) = P[g_{n+1}(u, Y_{n+1}) ∈ Q] by (E1a) ("(E1a)", p. 601)     (16.45)

From these propositions, it is apparent that each of the four examples is Markov. We may utilize part (b) to obtain the one-step transition probabilities.

Homogeneous Markov sequences
If {Y_n : 1 ≤ n} is iid and g_{n+1} = g for all n, then

    P(X_{n+1} ∈ Q|X_n = u) = P[g(u, Y_{n+1}) ∈ Q],  invariant with n     (16.46)

This case is important enough to warrant separate classification. As a matter of fact, this is the only case usually treated in elementary texts.

Definition. If P(X_{n+1} ∈ Q|X_n = u) is invariant with n, for all Borel sets Q and all u ∈ E, the process X_N is said to be homogeneous.

In the homogeneous case, the transition probabilities are invariant with n, and we write

    P(X_{n+1} = j|X_n = i) = p(i, j)  or  p_{ij}  (invariant with n)     (16.47)

These are called the (one-step) transition probabilities. The transition probabilities may be arranged in a matrix P, usually referred to as the transition probability matrix, or simply the transition matrix,

    P = [p(i, j)]     (16.48)

The element p(i, j) on row i and column j is the probability P(X_{n+1} = j|X_n = i). Thus, the elements on the ith row constitute the conditional distribution for X_{n+1}, given X_n = i. The transition matrix therefore has the property that each row sums to one. Such a matrix is called a stochastic matrix.

We return to the examples. Since in each the function g is the same for all n and the driving random variables corresponding to the Y_i form an iid class, the sequences must be homogeneous.

Example 16.7: Random walk continued
g_n(u, Y_{n+1}) = u + Y_{n+1}, so that

    P(X_{n+1} = k|X_n = j) = P(j + Y = k) = P(Y = k − j) = p_{k−j},  where  p_k = P(Y = k)

A small tabulation of these probabilities is sketched below.
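A small tabulation sketch (added) of the rule p(j, k) = p_{k−j}: for a simple-valued Y, the transition probability depends only on the difference k − j. The step distribution and the truncation to a finite window of states are assumptions for display only; the true random-walk state space is infinite.

    % Sketch (added): tabulating p(j,k) = P(Y = k - j) on a finite window
    vY = [-1 0 1];  PY = [0.3 0.4 0.3];   % assumed step distribution
    states = -5:5;                        % truncated window of states
    ns = length(states);
    P = zeros(ns);
    for i = 1:ns
      for j = 1:ns
        d = states(j) - states(i);        % required step k - j
        if any(vY == d)
          P(i,j) = PY(vY == d);           % p_{k-j}
        end
      end
    end
    % Interior rows sum to one; boundary rows lose mass to the truncation.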
Example 16.8: Branching process continued
g(j, Y_{n+1}) = min{M, ∑_{i=1}^j Z_{in}} and E = {0, 1, ···, M}. Put W_{jn} = ∑_{i=1}^j Z_{in}. Then {Z_{in} : 1 ≤ i ≤ M, 1 ≤ n} iid ensures {W_{jn} : 1 ≤ n} is iid for each j ∈ E (16.50). We thus have

    P(X_{n+1} = k|X_n = j) = { P(W_{jn} = k)   for 0 ≤ k < M
                               P(W_{jn} ≥ M)   for k = M }     for 0 ≤ j ≤ M     (16.49)

With the aid of moment generating functions, one may determine distributions for

    W_1 = Z_1,  W_2 = Z_1 + Z_2,  ···,  W_M = Z_1 + ··· + Z_M     (16.51)

These calculations are implemented in an m-procedure called branchp. We simply need the distribution for the iid Z_{in}.

    % file branchp.m
    % Calculates transition matrix for a simple branching
    % process with specified maximum population.
    disp('Do not forget zero probabilities for missing values of Z')
    PZ = input('Enter PROBABILITIES for individuals ');
    M = input('Enter maximum allowable population ');
    mz = length(PZ) - 1;
    EZ = dot(0:mz,PZ);
    disp(['The average individual propagation is ',num2str(EZ)])
    P = zeros(M+1,M+1);
    Z = zeros(M,M*mz+1);
    k = 0:M*mz;
    a = min(M,k);          % population values, capped at M
    z = 1;
    P(1,1) = 1;            % zero population is absorbing
    for i = 1:M            % Operation similar to genD
      z = conv(PZ,z);      % distribution of W_i by repeated convolution
      Z(i,1:i*mz+1) = z;
      [t,p] = csort(a,Z(i,:));   % consolidate probabilities on capped values
      P(i+1,:) = p;
    end
    disp('The transition matrix is P')
    disp('To study the evolution of the process, call for branchdbn')

    PZ = 0.01*[15 45 25 10 5];   % Probability distribution for individuals
    branchp                      % Call for procedure
    Do not forget zero probabilities for missing values of Z
    Enter PROBABILITIES for individuals  PZ
    Enter maximum allowable population  10
    The average individual propagation is 1.45
    The transition matrix is P
    To study the evolution of the process, call for branchdbn
0222 0.0002 0. k) = P [g (j.0345 0.1000 0. CONDITIONAL INDEPENDENCE.0500 0.54) P (Xn+1 = kXn = j) = p (j.8799 0. g (j. Because of the invariance with n.1080 0.0483 0. Also.0006 0 0 0.0284 0.0068 0. g (0.0000 0.1381 0.0590 0.0020 0.1910 0.1293 0. 0} g (1.0074 0. there is a high probability of returning to that value and very small probability of becoming extinct (reaching zero state).0022 0. it is extinct and no more births can occur.0350 0. D) = 0 D) = 1 D) = 2 D) = 3 i i i i D D D D ≥3 =2 =1 =0 implies implies implies implies p (0.1852 0.0119 0.0294 0 0.0307 0.0075 0.0987 0.0406 0.1623 0.0000 0.1227 0.53) D for Dn . GIVEN A RANDOM VECTOR % Optional display of generated P 0 0. 0) = P (D ≥ 1) . 0) = P 1) = P 2) = P 3) = P (D (D (D (D ≥ 3) = 2) = 1) = 0) g (1.1350 0.0079 0.0000 Columns 8 through 11 0 0 0 0 0.0001 0.7591 0.2500 0.0000 0 0 0 0.0001 0 0 0. Dn+1 ) = k] The various cases yield g (0.0559 0. p (0.0705 0.0000 0 0.1412 0.0061 0.0100 0.0022 disp(P) Columns 1 through 7 1.0078 0. 0} g (0. use M =3 Dn is Poisson (1) (16.0000 0. Dn+1 ) = { max{M − Dn+1 .0034 0.0225 0.1010 0.0025 0.2775 0.0017 0.1625 0. 0} max{j − Dn+1 .0440 0.0062 0. 0) = 1.0304 0. p (0. 0} for for 0≤j<m m≤j≤M (16.0147 Note that p (0.0579 0.1534 0.1003 0.0000 0.0000 0.9468 0 0. set (16.3585 0.2550 0.5771 0. D) = max{1 − D. Example 16.0000 0 0 0 0.0005 0.0169 0.0000 0.2239 0.0000 0. If the population ever reaches zero.0001 0.0253 0. g (0.0194 0.0698 0.508 CHAPTER 16.1991 0.0864 0.52) Numerical example m=1 To simplify writing. D) = 0 i D≥1 implies p (1.1179 0.1500 0. g (0.0000 0.0950 0.0001 0.0021 0.4500 0.1528 0.1730 0.0005 0.0000 0.0011 0.1179 0.0043 0.1574 0.0832 0.1545 0.0315 0.0003 0.1675 0.1879 0.9: Inventory problem (continued) In this case.0005 0.1481 0. D) = max{3 − D. p (0.0702 0. if the maximum population (10 in this case) is reached.
Example 16.9: Inventory problem (continued)
In this case,

    g(j, D_{n+1}) = { max{M − D_{n+1}, 0}  for 0 ≤ j < m
                      max{j − D_{n+1}, 0}  for m ≤ j ≤ M }     (16.52)

Numerical example: m = 1, M = 3, and D_n is Poisson(1) (16.53). To simplify writing, write D for D_n and set

    p(j, k) = P[g(j, D) = k]     (16.54)

The various cases yield

    g(0, D) = max{3 − D, 0}:
      g(0, D) = 0 iff D ≥ 3, which implies p(0, 0) = P(D ≥ 3)
      g(0, D) = 1 iff D = 2, which implies p(0, 1) = P(D = 2)
      g(0, D) = 2 iff D = 1, which implies p(0, 2) = P(D = 1)
      g(0, D) = 3 iff D = 0, which implies p(0, 3) = P(D = 0)
    g(1, D) = max{1 − D, 0}:
      g(1, D) = 0 iff D ≥ 1, which implies p(1, 0) = P(D ≥ 1)
      g(1, D) = 1 iff D = 0, which implies p(1, 1) = P(D = 0)
    g(2, D) = max{2 − D, 0}:
      g(2, D) = 0 iff D ≥ 2, which implies p(2, 0) = P(D ≥ 2)
      g(2, D) = 1 iff D = 1, which implies p(2, 1) = P(D = 1)
      g(2, D) = 2 iff D = 0, which implies p(2, 2) = P(D = 0)
      g(2, D) = 3 is impossible
    g(3, D) = max{3 − D, 0} = g(0, D), so that p(3, k) = p(0, k)

The various probabilities for D may be obtained from a table (or may be calculated easily with cpoisson) to give the transition probability matrix

    P = [ 0.0803  0.1839  0.3679  0.3679
          0.6321  0.3679  0       0
          0.2642  0.3679  0.3679  0
          0.0803  0.1839  0.3679  0.3679 ]     (16.55)

The calculations are carried out by hand in this case, to exhibit the nature of the calculations. This is a standard problem in inventory theory, involving costs and rewards. An m-procedure inventory1 has been written to implement the function g.

    % file inventory1.m    % Version of 1/27/97
    % Data for transition probability calculations
    % for (m,M) inventory policy
    M = input('Enter value M of maximum stock ');      % Maximum stock
    m = input('Enter value m of reorder point ');      % Reorder point
    Y = input('Enter row vector of demand values ');   % Truncated set of demand values
    PY = input('Enter demand probabilities ');         % Demand probabilities
    states = 0:M;
    ms = length(states);
    my = length(Y);
    % Calculations for determining P
    [y,s] = meshgrid(Y,states);
    T = max(0,M-y).*(s < m) + max(0,s-y).*(s >= m);
    P = zeros(ms,ms);
    for i = 1:ms
      [a,b] = meshgrid(T(i,:),states);
      P(i,:) = PY*(a==b)';
    end
    P

We consider the case M = 5, the reorder point m = 3, and demand Poisson(3). We approximate the Poisson distribution with values up to 20.

    inventory1
    Enter value M of maximum stock  5
    Enter value m of reorder point  3
    Enter row vector of demand values  0:20
    Enter demand probabilities  ipoisson(3,0:20)
    P =
        0.1847    0.1680    0.2240    0.2240    0.1494    0.0498
        0.1847    0.1680    0.2240    0.2240    0.1494    0.0498
        0.1847    0.1680    0.2240    0.2240    0.1494    0.0498
        0.5768    0.2240    0.1494    0.0498         0         0
        0.3528    0.2240    0.2240    0.1494    0.0498         0
        0.1847    0.1680    0.2240    0.2240    0.1494    0.0498
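As a self-contained check (added) of the earlier hand calculation for m = 1, M = 3 with Poisson(1) demand, the rows may be assembled directly; poisspdf and poisscdf from the Statistics Toolbox stand in for the textbook's ipoisson and cpoisson.

    % Sketch (added): the 4-state inventory matrix from Poisson(1) demand
    pd = @(k) poisspdf(k,1);                      % P(D = k)
    row0 = [1-poisscdf(2,1) pd(2) pd(1) pd(0)];   % p(0,.): order up to 3
    row1 = [1-poisscdf(0,1) pd(0) 0     0    ];   % p(1,.)
    row2 = [1-poisscdf(1,1) pd(1) pd(0) 0    ];   % p(2,.)
    P = [row0; row1; row2; row0]                  % rows 0 and 3 agree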
Example 16.10: Remaining lifetime (continued)
g(0, Y) = Y − 1, so that p(0, k) = P(Y − 1 = k) = P(Y = k + 1); and g(j, Y) = j − 1 for j ≥ 1, so that p(j, k) = δ_{j−1,k} for j ≥ 1 (16.56). The resulting transition probability matrix is

    P = [ p_1  p_2  p_3  ···
          1    0    0    ···
          0    1    0    ···
          ···  ···  ···  ··· ]     (16.57)

The matrix is an infinite matrix, unless Y is simple. If the range of Y is {1, 2, ···, M}, then the state space E is {0, 1, ···, M − 1}.

Various properties of conditional independence, particularly (CI9) ("(CI9)", p. 602), (CI10) ("(CI10)", p. 602), and (CI12) ("(CI12)", p. 602), may be used to establish the following.

Extended Markov property. X_N is Markov iff

    (M*)  {U^n, U_n} ci X_n  for all n ≥ 0     (16.58)

The immediate future X_{n+1} may be replaced by any finite future U_{n,n+p}, and the present X_n may be replaced by any extended present U_{m,n}. Some results of abstract measure theory show that the finite future U_{n,n+p} may be replaced by the entire future U^n. Thus, we may assert

    X_N is Markov iff  {U^n, U_m} ci U_{m,n}  ∀ 0 ≤ m ≤ n     (16.59)

The Chapman-Kolmogorov equation and the transition matrix
As a special case of the extended Markov property, we have

    {U^{n+k}, U_n} ci X_{n+k}  for all n ≥ 0, k ≥ 1     (16.60)

Setting g(U^{n+k}, X_{n+k}) = X_{n+k+m} and h(U_n, X_n) = X_n in (CI9) ("(CI9)", p. 602), we get

    {X_{n+k+m}, X_n} ci X_{n+k}  for all n ≥ 0, k, m ≥ 1     (16.61)

By the iterated conditioning rule for conditional independence, it follows that

    (CK)  E[g(X_{n+k+m})|X_n] = E{E[g(X_{n+k+m})|X_{n+k}]|X_n}  ∀ n ≥ 0, k, m ≥ 1

This is the Chapman-Kolmogorov equation, which plays a central role in the study of Markov sequences. For a discrete state space E, with P(X_n = j|X_m = i) = p_{m,n}(i, j), this equation takes the form

    (CK')  p_{m,q}(i, k) = ∑_{j∈E} p_{m,n}(i, j) p_{n,q}(j, k),  0 ≤ m < n < q     (16.62)
To see that this is so, consider

    p_{m,q}(i, k) = P(X_q = k|X_m = i) = E[I_{k}(X_q)|X_m = i] = E{E[I_{k}(X_q)|X_n]|X_m = i}
                  = ∑_j E[I_{k}(X_q)|X_n = j] p_{m,n}(i, j) = ∑_j p_{n,q}(j, k) p_{m,n}(i, j)     (16.63)

Homogeneous case
For this case, we may put (CK') in a useful matrix form. The conditional probabilities of the form

    p_m(i, k) = P(X_{n+m} = k|X_n = i),  invariant in n     (16.64)

are known as the m-step transition probabilities (16.65). The Chapman-Kolmogorov equation in this case becomes

    (CK'')  p_{m+n}(i, k) = ∑_{j∈E} p_m(i, j) p_n(j, k)  ∀ i, k ∈ E     (16.66)

In terms of the m-step transition matrix P(m) = [p_m(i, k)], this set of sums is equivalent to the matrix product

    P(m+n) = P(m) P(n)     (16.67)

Now P(2) = P(1) P(1) = PP = P^2, P(3) = P(2) P(1) = P^3, etc. A simple inductive argument based on (CK'') establishes

The product rule for transition matrices. The m-step probability matrix P(m) = P^m, the mth power of the transition matrix P (16.68).

Example 16.11: The inventory problem (continued)
For the inventory problem in Example 16.9 (Inventory problem (continued)), the three-step transition probability matrix P(3) is obtained by raising P to the third power to get

    P(3) = P^3 = [ 0.2930  0.2917  0.2629  0.1524
                   0.2619  0.2730  0.2753  0.1898
                   0.2993  0.2854  0.2504  0.1649
                   0.2930  0.2917  0.2629  0.1524 ]     (16.69)
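A one-line check (added) of Example 16.11 with MATLAB: raising the one-step matrix of Example 16.9 to the third power reproduces P(3).

    % Sketch (added): the product rule P(3) = P^3 for the inventory chain
    P = [0.0803 0.1839 0.3679 0.3679
         0.6321 0.3679 0      0
         0.2642 0.3679 0.3679 0
         0.0803 0.1839 0.3679 0.3679];
    P3 = P^3       % agrees with P(3) above to the displayed precision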
We consider next the state probabilities for the various stages. To simplify writing, we consider a finite state space E = {1, ···, M}. For any n ≥ 0, let p_k(n) = P(X_n = k) for each k ∈ E, and use π(n) for the row matrix

    π(n) = [p_1(n) p_2(n) ··· p_M(n)]     (16.70)

Probability distributions for any period
For a homogeneous Markov sequence, the distribution for any X_n is determined by the initial distribution (i.e., that for X_0) and the transition probability matrix P.

VERIFICATION
π(0) is the initial probability distribution, and as a consequence of the product rule,

    π(1) = π(0) P,  ···,  π(n) = π(n−1) P = π(0) P(n) = π(0) P^n = the nth-period distribution     (16.71)

The last expression is an immediate consequence of the product rule (16.72). Thus, the stochastic behavior of a homogeneous chain is determined completely by the probability distribution for the initial state and the one-step transition probabilities.

Example 16.12: Inventory problem (continued)
In the inventory system of Examples 16.5 (An inventory problem), 16.9 (Inventory problem (continued)), and 16.11 (The inventory problem (continued)), suppose the initial stock is M = 3. This means that

    π(0) = [0 0 0 1]     (16.73)

The product of π(0) and P^3 is the fourth row of P^3, so that the distribution for X_3 is

    π(3) = [p_0(3) p_1(3) p_2(3) p_3(3)] = [0.2930 0.2917 0.2629 0.1524]

Thus, given a stock of M = 3 at startup, the probability is 0.2917 that X_3 = 1. This is the probability of one unit in stock at the end of period number three.
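The corresponding check (added) of Example 16.12: with initial distribution π(0) = [0 0 0 1], the third-period distribution is π(0)P^3.

    % Sketch (added): third-period state distribution from the product rule
    P = [0.0803 0.1839 0.3679 0.3679
         0.6321 0.3679 0      0
         0.2642 0.3679 0.3679 0
         0.0803 0.1839 0.3679 0.3679];
    pi0 = [0 0 0 1];             % initial stock M = 3 (state value 3)
    pi3 = pi0*P^3                % = [0.2930 0.2917 0.2629 0.1524]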
Remarks
• A similar treatment shows that for the nonhomogeneous case the distribution at any stage is determined by the initial distribution and the class of one-step transition matrices; in the nonhomogeneous case, the transition probabilities p_{n,n+1}(i, j) depend on the stage n.
• A discrete-parameter Markov process, or Markov sequence, is characterized by the fact that each member X_{n+1} of the sequence is conditioned by the value of the previous member of the sequence. This one-step stochastic linkage has made it customary to refer to a Markov sequence as a Markov chain. In the discrete-parameter Markov case, we use the terms process, sequence, or chain interchangeably.

The transition diagram and the transition matrix
The previous examples suggest that a Markov chain is a dynamic system, evolving in time. On the other hand, the stochastic behavior of a homogeneous chain is determined completely by the probability distribution for the initial state and the one-step transition probabilities p(i, j) as presented in the transition matrix. The time-invariant transition matrix may convey a static impression of the system. However, a simple geometric representation, known as the transition diagram, makes it possible to link the unchanging structure, represented by the transition matrix, with the dynamics of the evolving system behavior.

Definition. A transition diagram for a homogeneous Markov chain is a linear graph with one node for each state and one directed edge for each possible one-step transition between states (nodes).

The system can be viewed as an object jumping from state to state (node to node) at the successive transition times. As we follow the trajectory of this object, we achieve a sense of the evolution of the system. We ignore, as essentially impossible, any transition which has zero transition probability. Thus, the edges on the diagram correspond to positive one-step transition probabilities between the nodes connected. Since for some pair (i, j) of states we may have p(i, j) > 0 but p(j, i) = 0, we may have a connecting edge between two nodes in one direction, but none in the other.

Example 16.13: Transition diagram for inventory example
Consider, again, the transition matrix P for the inventory problem (rounded to three decimals):

    P = [ 0.080  0.184  0.368  0.368
          0.632  0.368  0      0
          0.264  0.368  0.368  0
          0.080  0.184  0.368  0.368 ]     (16.74)

Figure 1 shows the transition diagram for this system. At each node corresponding to one of the possible states, the state value is shown. In this example, the state value is one less than the state number. For convenience, we refer to the node for state k + 1, which has state value k, as node k. If the state value is zero, there are four possibilities: remain in that condition with probability 0.080, move to node 1 with probability 0.184, move to node 2 with probability 0.368, or move to node 3 with probability 0.368. These are represented by the self loop and a directed edge to each of the other nodes. Each of these directed edges is marked with the (conditional) transition probability. On the other hand, the probabilities of reaching state value 0 from each of the others are represented by directed edges into the node for state value 0. A similar situation holds for each other node. Note that the probabilities on edges leaving a node (including a self loop) must total to one, since these correspond to the transition probability distribution from that node. There is no directed edge from node 2 to node 3, since the probability of a transition from value 2 to value 3 is zero. Similarly, there is no directed edge from node 1 to either node 2 or node 3.

Figure 16.4: Transition diagram for the inventory system of Example 16.13 (Transition diagram for inventory example)
There is a one-one relation between the transition diagram and the transition matrix P. The transition diagram not only aids in visualizing the dynamic evolution of a chain, but also displays certain structural properties. Often a chain may be decomposed usefully into subchains. Questions of communication and recurrence may be answered in terms of the transition diagram. Some subsets of states are essentially closed, in the sense that if the system arrives at any one state in the subset it can never reach a state outside the subset. Periodicities can sometimes be seen, although it is usually easier to use the diagram to show that periodicities cannot occur.

Classification of states
Many important characteristics of a Markov chain can be studied by considering the number of visits to an arbitrarily chosen, but fixed, state.

Definition. For a fixed state j, let T_1 = the time (stage number) of the first visit to state j (after the initial period). For state i, the probability of reaching state j for the first time from state i in k steps is

    F_k(i, j) = P(T_1 = k|X_0 = i)

By including j reaches j in all cases, F(i, j) = P(T_1 < ∞|X_0 = i) = ∑_{k=1}^∞ F_k(i, j) is the probability of ever reaching state j from state i. A number of important theorems may be developed for F_k and F, although we do not develop them in this treatment. We simply quote them as needed.

Definition. State j is said to be transient iff F(j, j) < 1, and is said to be recurrent iff F(j, j) = 1. A recurrent state is one to which the system eventually returns, hence is visited an infinity of times. It is called absorbing iff p(j, j) = 1; if the state is absorbing, then once it is reached it returns each step (i.e., never leaves).

Remark. If the state space is infinite, recurrent states fall into one of two subclasses: positive or null. Only the positive case is common in the infinite case, and that is the only possible case for systems with finite state space.

Definition. State i reaches j, denoted i → j, iff p_n(i, j) > 0 for some n > 0. States i and j communicate, denoted i ↔ j, iff both i reaches j and j reaches i. The relation ↔ is an equivalence relation (i.e., is reflexive, symmetric, and transitive). With this relationship, we can define important classes.

Definition. A class of states is communicating iff every state in the class may be reached from every other state in the class (i.e., every pair communicates). A class is closed if no state outside the class can be reached from within the class.

Sometimes there is a regularity in the structure of a Markov sequence that results in periodicities.

Definition. For state j, let δ = greatest common divisor of {n : p_n(j, j) > 0} (16.75). If δ > 1, then state j is periodic with period δ; otherwise, state j is aperiodic.

Usually, if there are any self loops in the transition diagram (positive probabilities on the diagonal of the transition matrix P), the system is aperiodic. Unless stated otherwise, we limit consideration to the aperiodic case.

Definition. A state j is called ergodic iff it is positive, recurrent, and aperiodic.

The following important conditions are intuitive and may be established rigorously:

    i ↔ j implies: i is recurrent iff j is recurrent
    i → j and i recurrent implies j recurrent
    i → j and i recurrent implies i ↔ j
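Communication may also be checked numerically. The sketch below (an addition, not from the original text) computes a reachability matrix for a finite chain: with M states, state j is reachable from i iff the (i, j) entry of (I + P)^{M−1} is positive (the identity term adopts the convention that each state reaches itself). Mutual reachability then identifies the communicating classes; the inventory matrix is used for illustration.

    % Sketch (added): reachability and communicating classes
    P = [0.0803 0.1839 0.3679 0.3679
         0.6321 0.3679 0      0
         0.2642 0.3679 0.3679 0
         0.0803 0.1839 0.3679 0.3679];
    M = size(P,1);
    R = (eye(M) + P)^(M-1) > 0;   % R(i,j) true iff i reaches j
    C = R & R';                   % C(i,j) true iff i and j communicate
    % Here every entry of C is true: a single communicating class,
    % so the chain is irreducible.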
81) λ is the largest absolute value of the eigenvalues for P λ = 1. and not all states are transient. k) j.78) π1 π2 π2 ··· ··· πm (16.All states are aperiodic . is closed. P = 0. irreducible. then .14: The long run distribution for the inventory example We use MATLAB to check the eigenvalues for the transition probability P and to powers of P. F (i.1839 0.has no proper closed subsets).3679 0 0 0. n P0 = limPn n is the long run distribution Denition. j) = F (i.77) π (n) = π (0) Pn → π (0) P0 where (16. The convergence process is readily evident.3679 .515 Limit theorems for nite state space sequences The following propositions may be established for Markov sequences with nite state space: • • There are no null states.. j) (16.3679 0 0.e. as above. then M M limpn (i. A distribution is stationary i π = πP generating function analysis shows the convergence is exponential in the following sense (16.0803 0. positive..All states are recurrent .0803 0. recurrent.6321 0. A Pn − P0  ≤ aλ where n other than (16.3679 0. k ∈ C .2642 0.3679 0.If a class then C or all are periodic with the same period.76) π (n) = [p1 (n) p1 (n) · · · pM (n)] the result above may be written so that π (n) = π (0) Pn (16. and for all i is a transient state (necessarily not in C ).1839 E = abs(eig(P)) E = 0. If a class of states is irreducible (i. A limit theorem If the states in a Markov chain are ergodic (i.80) The result above may be stated by saying that the longrun distribution is the stationary distribution.e.79) π 1 P0 = ··· π1 Each row of πm ··· ··· ··· π2 · · · πm π = limπ (n). aperiodic).3679 0. j) = πj > 0 n j=1 If. we let πj = 1 πj = i=1 πi p (i. obtain increasing Example 16.3679 0.
Example 16.14: The long run distribution for the inventory example
We use MATLAB to check the eigenvalues for the transition probability matrix P of Example 16.9 and to obtain increasing powers of P. The convergence process is readily evident.

    E = abs(eig(P))
    E =
        1.0000
        0.2602
        0.2602
        0.0000
    format long
    N = E(2).^[4 8 12]
    N  =  0.00458242348096   0.00002099860496   0.00000009622450
    P4 = P^4;  P8 = P^8;  P12 = P^12;   % powers of P (long-format displays
                                        % not reproduced here)
    error4 = max(max(abs(P^16 - P4)))   % Use P^16 for P_0
    error4  =  0.00441148012334         % Compare with 0.00458242348096
    error8 = max(max(abs(P^16 - P8)))
    error8  =  2.984007206519035e-05    % Compare with 0.00002099860496
    error12 = max(max(abs(P^16 - P12)))
    error12 =  1.005660185959822e-07    % Compare with 0.00000009622450

The convergence process is clear, and we have approximated the long run matrix P_0 with P^16. We have not determined the factor a, but the agreement of the errors with the predicted rate λ^n is close. For example, if we consider agreement to four decimal places sufficient, then

    P10 = P^10
    P10 =
        0.2858    0.2847    0.2632    0.1663
        0.2858    0.2847    0.2632    0.1663
        0.2858    0.2847    0.2632    0.1663
        0.2858    0.2847    0.2632    0.1663

shows that n = 10 is quite sufficient. This exhibits a practical criterion for sufficient convergence: if the rows of P^n agree within acceptable precision, then n is sufficiently large.
16.2.2 Simulation of finite homogeneous Markov sequences
In the section "The Quantile Function" (Section 10.3), the quantile function is used with a random number generator to obtain a simple random sample from a given population distribution. In this section, we adapt that procedure to the problem of simulating a trajectory for a homogeneous Markov sequence with finite state space.

Elements and terminology
1. States and state numbers. We suppose there are m states, usually carrying a numerical value. For purposes of analysis and simulation, we number the states 1 through m. Computation is carried out with state numbers; if desired, these can be translated into the actual state values after computation is completed.
2. Stages, transitions, period numbers, trajectories and time. We use the terms stage and period interchangeably. It is customary to number the periods or stages beginning with zero for the initial stage. The period number is the number of transitions to reach that stage from the initial one: one transition is required to reach the next stage (period one), so time or period number is one less than the number of stages. We call the sequence of states encountered as the system evolves a trajectory or a chain. The terms sample path or realization of the process are also used in the literature.

The fundamental simulation strategy
1. A fundamental strategy for sampling from a given population distribution is developed in the unit on the Quantile Function. If Q is the quantile function for the population distribution and U is a random variable distributed uniformly on the interval [0, 1], then X = Q(U) has the desired distribution. To obtain a sample from the uniform distribution, use a random number generator; this sample is transformed by the quantile function into a sample from the desired distribution.
2. For a homogeneous chain, for each state there is a conditional transition probability distribution for the next state. The ith row of the transition matrix consists of the transition distribution for selecting the next-period state when the current state number is i. If we use the quantile function for that distribution and a number produced by a random number generator, we make a selection of the next state based on that distribution. A succession of these choices, with the selection of the next state made in each case from the distribution for the current state, constitutes a valid simulation of a trajectory. A sketch of this procedure is given below.
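Here is the sketch promised above (an addition, not one of the textbook's m-procedures): each step applies the quantile function of the current row of P, implemented with cumulative sums and a uniform random number; the inventory matrix, initial state, and trajectory length are assumptions for illustration.

    % Sketch (added): simulating a trajectory of a homogeneous chain
    P = [0.0803 0.1839 0.3679 0.3679
         0.6321 0.3679 0      0
         0.2642 0.3679 0.3679 0
         0.0803 0.1839 0.3679 0.3679];
    F = cumsum(P,2);               % row-wise cumulative distributions
    s0 = 4;  ns = 20;              % assumed initial state and no. of stages
    X = zeros(1,ns);  X(1) = s0;   % state numbers 1..m; X(1) is stage zero
    for n = 2:ns
      u = rand;                          % uniform random number
      X(n) = find(u <= F(X(n-1),:),1);   % quantile function of current row
    end
    disp(X)                        % one realization (a chain of state numbers)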
Arrival times and recurrence times
The basic simulation produces one or more trajectories of a specified length. Sometimes we are interested in continuing until first arrival at (or visit to) a specific target state or any one of a set of target states. In this case, we speak of the arrival time. The time (in transitions) to reach a target state is one less than the number of stages in the trajectory which begins with the initial state and ends with the target state reached.

• If the initial state is not in the target set, the arrival time is the time of first visit to the target set; if the initial state is in the target set, the arrival time would be zero.
• In some instances, we do not stop at zero but continue until the next visit to a target state (possibly the same as the initial state). We call the number of transitions in this case the recurrence time. Again, there is a choice of treatment in the case the initial state is in the target set.
• In some instances, it may be desirable to know the time to complete visits to a prescribed nu