Applied Probability
By: Paul E Pfeiffer
Online: <http://cnx.org/content/col10708/1.4/>
CONNEXIONS
Rice University, Houston, Texas
©2008
Paul E Pfeiffer
This selection and arrangement of content is licensed under the Creative Commons Attribution License: http://creativecommons.org/licenses/by/3.0/
Table of Contents

Preface to Pfeiffer Applied Probability

1 Probability Systems
  1.1 Likelihood
  1.2 Probability Systems
  1.3 Interpretations
  1.4 Problems on Probability Systems
  Solutions

2 Minterm Analysis
  2.1 Minterms
  2.2 Minterms and MATLAB Calculations
  2.3 Problems on Minterm Analysis
  Solutions

3 Conditional Probability
  3.1 Conditional Probability
  3.2 Problems on Conditional Probability
  Solutions

4 Independence of Events
  4.1 Independence of Events
  4.2 MATLAB and Independent Classes
  4.3 Composite Trials
  4.4 Problems on Independence of Events
  Solutions

5 Conditional Independence
  5.1 Conditional Independence
  5.2 Patterns of Probable Inference
  5.3 Problems on Conditional Independence
  Solutions

6 Random Variables and Probabilities
  6.1 Random Variables and Probabilities
  6.2 Problems on Random Variables and Probabilities
  Solutions

7 Distribution and Density Functions
  7.1 Distribution and Density Functions
  7.2 Distribution Approximations
  7.3 Problems on Distribution and Density Functions
  Solutions

8 Random Vectors and Joint Distributions
  8.1 Random Vectors and Joint Distributions
  8.2 Random Vectors and MATLAB
  8.3 Problems on Random Vectors and Joint Distributions
  Solutions

9 Independent Classes of Random Variables
  9.1 Independent Classes of Random Variables
  9.2 Problems on Independent Classes of Random Variables
  Solutions

10 Functions of Random Variables
  10.1 Functions of a Random Variable
  10.2 Functions of Random Vectors
  10.3 The Quantile Function
  10.4 Problems on Functions of Random Variables
  Solutions

11 Mathematical Expectation
  11.1 Mathematical Expectation: Simple Random Variables
  11.2 Mathematical Expectation: General Random Variables
  11.3 Problems on Mathematical Expectation
  Solutions

12 Variance, Covariance, Linear Regression
  12.1 Variance
  12.2 Covariance and the Correlation Coefficient
  12.3 Linear Regression
  12.4 Problems on Variance, Covariance, Linear Regression
  Solutions

13 Transform Methods
  13.1 Transform Methods
  13.2 Convergence and the Central Limit Theorem
  13.3 Simple Random Samples and Statistics
  13.4 Problems on Transform Methods
  Solutions

14 Conditional Expectation, Regression
  14.1 Conditional Expectation, Regression
  14.2 Problems on Conditional Expectation, Regression
  Solutions

15 Random Selection
  15.1 Random Selection
  15.2 Some Random Selection Problems
  15.3 Problems on Random Selection
  Solutions

16 Conditional Independence, Given a Random Vector
  16.1 Conditional Independence, Given a Random Vector
  16.2 Elements of Markov Sequences
  16.3 Problems on Conditional Independence, Given a Random Vector
  Solutions

17 Appendices
  17.1 Appendix A to Applied Probability: Directory of m-functions and m-procedures
  17.2 Appendix B to Applied Probability: Some mathematical aids
  17.3 Appendix C: Data on some common distributions
  17.4 Appendix D to Applied Probability: The standard normal distribution
  17.5 Appendix E to Applied Probability: Properties of mathematical expectation
  17.6 Appendix F to Applied Probability: Properties of conditional expectation, given a random vector
  17.7 Appendix G to Applied Probability: Properties of conditional independence, given a random vector
  17.8 MATLAB files for "Problems" in "Applied Probability"
  Solutions

Index
Attributions
Preface to Pfeiffer Applied Probability [1]
The course
This is a "first course" in the sense that it presumes no previous course in probability. The units are modules taken from the unpublished text: Paul E. Pfeiffer, ELEMENTS OF APPLIED PROBABILITY, USING MATLAB. The units are numbered as they appear in the text, although of course they may be used in any desired order. For those who wish to use the order of the text, an outline is provided, with indication of which modules contain the material.

The mathematical prerequisites are ordinary calculus and the elements of matrix algebra. A few standard series and integrals are used, and double integrals are evaluated as iterated integrals. The reader who can evaluate simple integrals can learn quickly from the examples how to deal with the iterated integrals used in the theory of expectation and conditional expectation. Appendix B (Section 17.2) provides a convenient compendium of mathematical facts used frequently in this work. And the symbolic toolbox, implementing MAPLE, may be used to evaluate integrals, if desired.

In addition to an introduction to the essential features of basic probability in terms of a precise mathematical model, the work describes and employs user defined MATLAB procedures and functions (which we refer to as m-programs, or simply programs) to solve many important problems in basic probability. This should make the work useful as a stand-alone exposition as well as a supplement to any of several current textbooks.

Most of the programs developed here were written in earlier versions of MATLAB, but have been revised slightly to make them quite compatible with MATLAB 7. In a few cases, alternate implementations are available in the Statistics Toolbox, but the solutions are implemented here directly from the basic MATLAB program, so that students need only that program (and the symbolic mathematics toolbox, if they desire its aid in evaluating integrals).

Since machine methods require precise formulation of problems in appropriate mathematical form, it is necessary to provide some supplementary analytical material, principally the so-called minterm analysis. This material is not only important for computational purposes, but is also useful in displaying some of the structure of the relationships among events.
A probability model
Much of "real world" probabilistic thinking is an amalgam of intuitive, plausible reasoning and of statistical knowledge and insight. Mathematical probability attempts to lend precision to such probability analysis by employing a suitable mathematical model, which embodies the central underlying principles and structure. A successful model serves as an aid (and sometimes corrective) to this type of thinking.

Certain concepts and patterns have emerged from experience and intuition. The mathematical formulation (the mathematical model) which has most successfully captured these essential ideas is rooted in measure theory, and is known as the Kolmogorov model, after the brilliant Russian mathematician A.N. Kolmogorov (1903-1987).

[1] This content is available online at <http://cnx.org/content/m23242/1.7/>.
One cannot prove that a model is correct. Only experience may show whether it is useful (and not incorrect). The usefulness of the Kolmogorov model is established by examining its structure and showing that patterns of uncertainty and likelihood in any practical situation can be represented adequately. Developments, such as in this course, have given ample evidence of such usefulness. The most fruitful approach is characterized by an interplay of

• A formulation of the problem in precise terms of the model and careful mathematical analysis of the problem so formulated.
• A grasp of the problem based on experience and insight. This underlies both problem formulation and interpretation of analytical results of the model. Often such insight suggests approaches to the analytical solution process.
MATLAB: A tool for learning
In this work, we make extensive use of MATLAB as an aid to analysis. I have tried to write the MATLAB programs in such a way that they constitute useful, ready-made tools for problem solving. Once the user understands the problems they are designed to solve, the solution strategies used, and the manner in which these strategies are implemented, the collection of programs should provide a useful resource.

However, my primary aim in exposition and illustration is to aid the learning process and to deepen insight into the structure of the problems considered and the strategies employed in their solution. Several features contribute to that end.

1. Application of machine methods of solution requires precise formulation. The data available and the fundamental assumptions must be organized in an appropriate fashion. The requisite discipline for such formulation often contributes to enhanced understanding of the problem.
2. The development of a MATLAB program for solution requires careful attention to possible solution strategies. One cannot instruct the machine without a clear grasp of what is to be done.
3. I give attention to the tasks performed by a program, with a general description of how MATLAB carries out the tasks. The reader is not required to trace out all the programming details. However, it is often the case that available MATLAB resources suggest alternative solution strategies. Hence, for those so inclined, attention to the details may be fruitful. I have included, as a separate collection, the m-files written for this work. These may be used as patterns for extensions as well as programs in MATLAB for computations. Appendix A (Section 17.1) provides a directory of these m-files.
4. Some of the details in the MATLAB scripts are presentation details. These are refinements which are not essential to the solution of the problem. But they make the programs more readily usable. And they provide illustrations of MATLAB techniques for those who may wish to write their own programs. I hope many will be inclined to go beyond this work, modifying current programs or writing new ones.
An Invitation to Experiment and Explore
Because the programs provide considerable freedom from the burden of computation and the tyranny of tables (with their limited ranges and parameter values), standard problems may be approached with a new spirit of experiment and discovery. When a program is selected (or written), it embodies one method of solution. There may be others which are readily implemented. The reader is invited, even urged, to explore! The user may experiment to whatever degree he or she finds useful and interesting. The possibilities are endless.
Acknowledgments
After many years of teaching probability, I have long since lost track of all those authors and books which have contributed to the treatment of probability in this work. I am aware of those contributions and am most eager to acknowledge my indebtedness, although necessarily without specific attribution.

The power and utility of MATLAB must be attributed to the long-time commitment of Cleve Moler, who made the package available in the public domain for several years. The appearance of the professional versions, with extended power and improved documentation, led to further appreciation and utilization of its potential in applied probability. The MathWorks continues to develop MATLAB and many powerful "toolboxes," and to provide leadership in many phases of modern computation. They have generously made available MATLAB 7 to aid in checking for compatibility the programs written with earlier versions. I have not utilized the full potential of this version for developing professional quality user interfaces, since I believe the simpler implementations used herein bring the student closer to the formulation and solution of the problems studied.
CONNEXIONS
The development and organization of the CONNEXIONS modules has been achieved principally by two people: C.S. (Sid) Burrus, a former student and later a faculty colleague, then Dean of Engineering, and most importantly a long time friend; and Daniel Williamson, a music major whose keyboard skills have enabled him to set up the text (especially the mathematical expressions) with great accuracy, and whose dedication to the task has led to improvements in presentation. I thank them and others of the CONNEXIONS team who have contributed to the publication of this work.

Paul E. Pfeiffer
Rice University
Chapter 1: Probability Systems

1.1 Likelihood [1]

1.1.1 Introduction

Probability models and techniques permeate many important areas of modern life. A variety of types of random processes, reliability models and techniques, and statistical considerations in experimental work play a significant role in engineering and the physical sciences. The solutions of management decision problems use as aids decision analysis, waiting line theory, inventory theory, time series, cost analysis under uncertainty, all rooted in applied probability theory. Methods of statistical analysis employ probability analysis as an underlying discipline.

Modern probability developments are increasingly sophisticated mathematically. To utilize these, the practitioner needs a sound conceptual basis which, fortunately, can be attained at a moderate level of mathematical sophistication. There is need to develop a feel for the structure of the underlying mathematical model, for the role of various types of assumptions, and for the principal strategies of problem formulation and solution.

Probability has roots that extend far back into antiquity. The notion of chance played a central role in the ubiquitous practice of gambling. But chance acts were often related to magic or religion. For example, there are numerous instances in the Hebrew Bible in which decisions were made by lot or some other chance mechanism, with the understanding that the outcome was determined by the will of God. In the New Testament, the book of Acts describes the selection of a successor to Judas Iscariot as one of the Twelve. Two names, Joseph Barsabbas and Matthias, were put forward. The group prayed, then drew lots, which fell on Matthias.

Early developments of probability as a mathematical discipline, freeing it from its religious and magical overtones, came as a response to questions about games of chance played repeatedly. The mathematical formulation owes much to the work of Pierre de Fermat and Blaise Pascal in the seventeenth century. The game is described in terms of a well defined trial (a play); the result of any trial is one of a specific set of distinguishable outcomes. Although the result of any play is not predictable, certain statistical regularities of results are observed. The possible results are described in ways that make each result seem equally likely. If there are N such possible equally likely results, each is assigned a probability 1/N.

The developers of mathematical probability also took cues from early work on the analysis of statistical data. The pioneering work of John Graunt in the seventeenth century was directed to the study of "vital statistics," such as records of births, deaths, and various diseases. Graunt determined the fractions of people in London who died from various diseases during a period in the early seventeenth century. Some thirty years later, in 1693, Edmond Halley (for whom the comet is named) published the first life insurance tables. To apply these results, one considers the selection of a member of the population on a chance basis. One then assigns the probability that such a person will have a given disease.

[1] This content is available online at <http://cnx.org/content/m23243/1.6/>.
The trial here is the selection of a person, but the interest is in certain characteristics. We may speak of the event that the person selected will die of a certain disease, say consumption. Although it is a person who is selected, it is death from consumption which is of interest. Out of this statistical formulation came an interest not only in probabilities as fractions or relative frequencies but also in averages or expectations. These averages play an essential role in modern probability.

We do not attempt to trace this history, which was long and halting, though marked by flashes of brilliance. Certain concepts and patterns which emerged from experience and intuition called for clarification. We move rather directly to the mathematical formulation (the mathematical model) which has most successfully captured these essential ideas. This model, rooted in the mathematical system known as measure theory, is called the Kolmogorov model, after the brilliant Russian mathematician A.N. Kolmogorov (1903-1987). Kolmogorov succeeded in bringing together various developments begun at the turn of the century, principally in the work of E. Borel and H. Lebesgue on measure theory. Kolmogorov published his epochal work in German in 1933. It was translated into English and published in 1956 by Chelsea Publishing Company.

1.1.2 Outcomes and events

Probability applies to situations in which there is a well defined trial whose possible outcomes are found among those in a given basic set. The following are typical.

• A pair of dice is rolled; the outcome is viewed in terms of the numbers of spots appearing on the top faces of the two dice. If the outcome is viewed as an ordered pair, there are thirty six equally likely outcomes. If the outcome is characterized by the total number of spots on the two dice, then there are eleven possible outcomes (not equally likely).

• A poll of a voting population is taken. Outcomes are characterized by responses to a question. For example, the responses may be categorized as positive (or favorable), negative (or unfavorable), or uncertain (or no opinion).

• A measurement is made. The outcome is described by a number representing the magnitude of the quantity in appropriate units. In some cases, the possible values fall among a finite set of integers. In other cases, the possible values may be any real number (usually in some specified interval).

• Much more sophisticated notions of outcomes are encountered in modern theory. For example, in communication or control theory, a communication system experiences only one signal stream in its life. But a communication system is not designed for a single signal stream. It is designed for one of an infinite set of possible signals. The likelihood of encountering a certain kind of signal is important in the design.

These considerations show that our probability model must deal with

• A trial which results in (selects) an outcome from a set of conceptually possible outcomes.
• An event to which a probability may be assigned as a measure of the likelihood the event will occur on any trial.

Associated with each outcome is a certain characteristic (or combination of characteristics) pertinent to the problem at hand. In polling for political opinions, it is a person who is selected. That person has many features and characteristics (race, gender, age, occupation, religious preference, preferences for food, etc.). But the primary feature, which characterizes the outcome, is the political opinion on the question asked. Of course, some of the other features may be of interest for analysis of the poll.

Inherent in informal thought, as well as in precise analysis, is this notion of an event and of the likelihood of its occurrence on any trial. A successful mathematical model must formulate these notions with precision. An event is identified in terms of the characteristic of the outcome observed. The event "a favorable response to a polling question" occurs if the outcome observed has that characteristic, i.e., iff (if and only if) the respondent replies in the affirmative. A hand of five cards is drawn. The event "one or more aces" occurs iff the hand actually drawn has at least one ace. If that same hand has two cards of the suit of clubs, then the event "two clubs" has occurred. These considerations lead to the following definition.

Definition. The event determined by some characteristic of the possible outcomes is the set of those outcomes having this characteristic. The event occurs whenever the outcome of the trial is a member of that set (i.e., has the characteristic determining the event).
• The event of throwing a "seven" with a pair of dice (which we call the event SEVEN) consists of the set of those possible outcomes with a total of seven spots turned up. The event SEVEN occurs iff the outcome is one of those combinations with a total of seven spots (i.e., belongs to the event SEVEN). This could be represented as follows. Suppose the two dice are distinguished (say by color) and a picture is taken of each of the thirty six possible combinations. On the back of each picture, write the number of spots. Now the event SEVEN consists of the set of all those pictures with seven on the back. Throwing the dice is equivalent to selecting randomly one of the thirty six pictures. The event SEVEN occurs iff the picture selected is one of the set of those pictures with seven on the back. (A numerical check of this example follows below.)

• Observing for a very long (theoretically infinite) time the signal passing through a communication channel is equivalent to selecting one of the conceptually possible signals. Now such signals have many characteristics: the maximum peak value, the frequency spectrum, the degree of differentiability, the average value over a given time period, etc. If the signal has a peak absolute value less than ten volts, a frequency spectrum essentially limited from 60 hertz to 10,000 hertz, with peak rate of change 10,000 volts per second, then it is one of the set of signals with those characteristics. The event "the signal has these characteristics" has occurred. This set (event) consists of an uncountable infinity of such signals.
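The dice illustration is easy to check numerically. The following is a minimal MATLAB sketch (our own illustration, not one of the book's m-programs) that enumerates the thirty six equally likely ordered pairs and counts those belonging to the event SEVEN:

% Enumerate the 36 equally likely ordered outcomes of two dice
[d1, d2] = meshgrid(1:6, 1:6);   % d1, d2 hold the face values of each die
total = d1(:) + d2(:);           % column vector of the 36 totals
SEVEN = (total == 7);            % logical indicator of the event SEVEN
P7 = sum(SEVEN)/numel(total)     % 6/36 = 1/6, approximately 0.1667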
One of the advantages of this formulation of an event as a subset of the basic set of possible outcomes is that we can use elementary set theory as an aid to formulation. And tools for studying event combinations, such as Venn diagrams and indicator functions (Section 1.3.4), provide powerful aids to establishing and visualizing relationships between events. We formalize these ideas as follows:

• Let Ω be the set of all possible outcomes of the basic trial or experiment. We call this the basic space or the sure event, since if the trial is carried out successfully the outcome will be in Ω; hence, the event Ω is sure to occur on any trial. We must specify unambiguously what outcomes are possible. In flipping a coin, the only accepted outcomes are "heads" and "tails." Should the coin stand on its edge, say by leaning against a wall, we would ordinarily consider that to be the result of an improper trial.

• As we note above, each outcome may have several characteristics which are the basis for describing events. Suppose we are drawing a single card from an ordinary deck of playing cards. Each card is characterized by a face value (two through ten, jack, queen, king, ace) and a suit (clubs, hearts, diamonds, spades). An ace is drawn (the event ACE occurs) iff the outcome (card) belongs to the set (event) of four cards with ace as face value. A heart is drawn iff the card belongs to the set of thirteen cards with heart as suit. Now it may be desirable to specify events which involve various logical combinations of the characteristics. Thus, we may be interested in the event "the face value is jack or king and the suit is heart or spade." The set for "jack or king" is represented by the union J ∪ K and the set for "heart or spade" is the union H ∪ S. The occurrence of both conditions means the outcome is in the intersection (common part), designated by ∩. Thus the event referred to is

E = (J ∪ K) ∩ (H ∪ S)    (1.1)

The notation of set theory thus makes possible a precise formulation of the event E. (A computational version of this event appears after this list.)

• Sometimes we are interested in the situation in which the outcome does not have one of the characteristics. Thus the set of cards which does not have suit heart is the set of all those outcomes not in event H. In set theory, this is the complementary set (event) H^c.

• Events are mutually exclusive iff not more than one can occur on any trial. This is the condition that the sets representing the events are disjoint (i.e., have no members in common).

• The notion of the impossible event is useful. The impossible event is, in set terminology, the empty set ∅. Event ∅ cannot occur, since it has no members (contains no outcomes). One use of ∅ is to provide a simple way of indicating that two sets are mutually exclusive. To say AB = ∅ (here we use the alternate AB for A ∩ B) is to assert that events A and B have no outcome in common, hence cannot both occur on any given trial.
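These set combinations translate directly into logical (0-1) vectors over the 52 outcomes. The sketch below is an illustrative arrangement of our own (the face and suit encodings are assumptions, not the book's); it forms E = (J ∪ K) ∩ (H ∪ S) with MATLAB's elementwise logical operators:

% 52 outcomes, each with a face value and a suit
face = repmat(1:13, 1, 4);        % 1 = ace, ..., 11 = jack, 12 = queen, 13 = king
suit = kron(1:4, ones(1,13));     % 1 = clubs, 2 = hearts, 3 = diamonds, 4 = spades
J = (face == 11);  K = (face == 13);   % events "jack", "king" as logical vectors
H = (suit == 2);   S = (suit == 4);    % events "heart", "spade"
E = (J | K) & (H | S);            % E = (J u K) n (H u S)
sum(E)                            % 4 outcomes: JH, KH, JS, KS
sum(E)/52                         % probability 4/52 under equally likely draws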
The language and notation of sets provide a precise language and notation for events and their combinations. We collect below some useful facts about logical (often called Boolean) combinations of events (as sets).

• Set inclusion provides a convenient way to designate the fact that event A implies event B, in the sense that the occurrence of A requires the occurrence of B. The set relation A ⊂ B signifies that every element (outcome) in A is also in B. If a trial results in an outcome in A (so that event A occurs), then that outcome is also in B (so that event B has occurred).

The notion of Boolean combinations may be applied to arbitrary classes of sets. For this reason, it is sometimes useful to use an index set to designate membership in a class {Ai : i ∈ J}, with one set Ai for each index i in the index set J. We say the index set J is countable if it is finite or countably infinite; otherwise it is uncountable. In the following it may be arbitrary.

If J = {1, 2, 3}, then {Ai : i ∈ J} is the class {A1, A2, A3}, with

⋃_{i∈J} Ai = A1 ∪ A2 ∪ A3 and ⋂_{i∈J} Ai = A1 ∩ A2 ∩ A3    (1.2)

If J = {1, 2, ···}, then {Ai : i ∈ J} is the sequence {Ai : 1 ≤ i}, with

⋃_{i∈J} Ai = ⋃_{i=1}^∞ Ai and ⋂_{i∈J} Ai = ⋂_{i=1}^∞ Ai    (1.3)

If event E is the union of a class of events, then event E occurs iff at least one event in the class occurs. If event F is the intersection of a class of events, then event F occurs iff all events in the class occur on the trial.

The role of disjoint unions is so important in probability that it is useful to have a symbol indicating the union of a disjoint class. We use the symbol ⋁ (printed in the text as a cup with a bar across) to indicate that the sets combined in a union are disjoint. Thus, for example, we write

A = ⋁_{i=1}^n Ai to signify A = ⋃_{i=1}^n Ai with the proviso that the Ai form a disjoint class    (1.4)

Example 1.1: Events derived from a class
Consider the class {E1, E2, E3} of events. Let Ak be the event that exactly k of them occur on a trial, and Bk the event that k or more occur on a trial. Then

A0 = E1^c E2^c E3^c,  A1 = E1 E2^c E3^c ⋁ E1^c E2 E3^c ⋁ E1^c E2^c E3,
A2 = E1 E2 E3^c ⋁ E1 E2^c E3 ⋁ E1^c E2 E3,  A3 = E1 E2 E3    (1.5)

The unions are disjoint, since each pair of terms has Ei in one and Ei^c in the other, for at least one i. Now the Bk can be expressed in terms of the Ak. For example,

B2 = A2 ⋁ A3    (1.6)

The union in this expression for B2 is disjoint, since we cannot have exactly two of the Ei and exactly three of them occur on the same trial. We may express B2 directly in terms of the Ei as follows:

B2 = E1E2 ∪ E1E3 ∪ E2E3    (1.7)

Here the union is not disjoint, in general. However, if one pair, say {E1, E3}, is disjoint, then E1E3 = ∅ and the pair {E1E2, E2E3} is disjoint (draw a Venn diagram).

Suppose C is the event the first two occur or the last two occur, but no other combination. Then

C = E1 E2 E3^c ⋁ E1^c E2 E3    (1.8)

Let D be the event that one or three of the events occur. Then

D = A1 ⋁ A3 = E1 E2^c E3^c ⋁ E1^c E2 E3^c ⋁ E1^c E2^c E3 ⋁ E1 E2 E3    (1.9)

Two important patterns in set theory known as DeMorgan's rules are useful in the handling of events. For an arbitrary class {Ai : i ∈ J} of events,

[⋃_{i∈J} Ai]^c = ⋂_{i∈J} Ai^c and [⋂_{i∈J} Ai]^c = ⋃_{i∈J} Ai^c    (1.10)

An outcome is not in the union (i.e., not in at least one) of the Ai iff it fails to be in all the Ai, and it is not in the intersection (i.e., not in all) iff it fails to be in at least one of the Ai.

Example 1.2: Continuation of Example 1.1 (Events derived from a class)
Express the event of no more than one occurrence of the events in {E1, E2, E3} as B2^c. By DeMorgan's rules,

B2^c = [E1E2 ∪ E1E3 ∪ E2E3]^c = (E1^c ∪ E2^c)(E1^c ∪ E3^c)(E2^c ∪ E3^c) = E1^c E2^c ∪ E1^c E3^c ∪ E2^c E3^c    (1.11)

The last expression shows that not more than one of the Ei occurs iff at least two of them fail to occur.
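The Boolean relations in Examples 1.1 and 1.2 can be verified by brute force in MATLAB, in the spirit of the minterm analysis developed in Chapter 2. The following sketch (an illustrative script of our own, not one of the m-programs in Appendix A) enumerates the eight possible occurrence patterns of {E1, E2, E3} and checks that B2 in (1.7) agrees with "two or more occur" and that its complement agrees with (1.11):

% All 2^3 = 8 occurrence patterns for E1, E2, E3 (rows of 0s and 1s)
T = dec2bin(0:7) - '0';            % 8x3 numeric array of 0s and 1s
E1 = T(:,1); E2 = T(:,2); E3 = T(:,3);
k  = E1 + E2 + E3;                 % number of the three events that occur
B2 = (E1 & E2) | (E1 & E3) | (E2 & E3);   % E1E2 u E1E3 u E2E3, as in (1.7)
isequal(B2, k >= 2)                % true: B2 is "two or more occur"
noMoreThanOne = (~E1 & ~E2) | (~E1 & ~E3) | (~E2 & ~E3);  % as in (1.11)
isequal(noMoreThanOne, ~B2)        % true: agrees with the complement of B2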
We say the index J is countable if it is nite or countably uncountable. The set relation an outcome in that the occurrence of element (outcome) in outcome is also in A (event A ⊂ B signies that every A occurs).7) and The union in this expression for B2 is disjoint since we cannot have exactly two of the exactly three of them occur on the same trial.8) . intersection of a class of events. If a trial results in B (so that event B has occurred).8 CHAPTER 1. A3 = E1 E2 E3 The unions are disjoint since each pair of terms has one c c E1 E2 E3 c c E1 E2 E3 A2 = (1. We may express B2 Ei occur directly in terms of the Ei as follows: B2 = E1 E2 ∪ E1 E3 ∪ E2 E3 (1.4) F E is the union of a class of events. Ai = i∈J and ∞ ∞ Ai = i∈J i=1 If event is the Ai . Now the Bk can be expressed in terms of the Ak . if Ai . one for each index i in the index set and J (1. If The role of disjoint unions is so important in probability that it is useful to have a symbol indicating the union of a disjoint class. We use the cup with a bar across to indicate that the sets combined in the union are disjoint. A2 . Then c c c c c A0 = E1 E2 E3 . use the alternate To say AB = ∅ (here we AB for A ∩ B) is to assert that events A and B have no outcome in common. A1 = E1 E2 E3 c c c E1 E2 E3 E1 E2 E3 E1 E2 E3 . For this reason. • Set inclusion provides a convenient way to designate the fact that event A implies event B . it is sometimes useful to use an innite. for at least For example B 2 = A2 A3 (1. i∈J (1. A is also in B . The notion of Boolean combinations may be applied to arbitrary classes of sets. Ei in one and Ei c in the other. hence cannot both occur on any given trial. i∈J If Ai = A1 ∩ A2 ∩ A3 . E2 . We collect below some useful facts about logical (often called Boolean) combinations of events (as sets). Ai i=1 (1.5) Example 1. then that The language and notaton of sets provide a precise language and notation for events and their combinations. · · · } then {Ai : i ∈ J} is the sequence {A1 : 1 ≤ i}. then event F occurs i all events in the class occur on the trial.6) i. 2. in the sense A requires the occurrence of B .2) J = {1. A3 }.3) J = {1. Thus. E3 } be the event that k of events. for example. is the class of sets then {Ai : i ∈ J} For example. otherwise it is index set to designate membership. we write n n A= i=1 Ai to signify A= i=1 Ai with the proviso that the Ai form a disjoint class (1.1: Events derived from a class Bk Consider the class {E1 . Ai = A1 ∪ A2 ∪ A3 . PROBABILITY SYSTEMS provide a simple way of indicating that two sets are mutually exclusive. Let Ak be the event that exactly k occur on a trial and or more occur on a trial. 3} {Ai : i ∈ J} is the class {A1 . 2. In the following it may be arbitrary. then event E occurs i at least one event in the class occurs.
.2 Probability Systems2 1.1) we introduce the notion of a basic space Ω of all possible outcomes of a trial or experiment.11) An outcome is not in the union (i.. with a prescribed outcome.9) D be the event that one or three of the events occur. An event occurs whenever the selected outcome is a member of the subset representing the event.e.7/>. E2 E3 } is disjoint (draw a Venn diagram).9 Here the union is not disjoint. E3 } is disjoint.1 Probability measures observed on a trial.10) Two important patterns in set theory known as an arbitrary class DeMorgan's rules are useful in the handling of events.1 (Events derived from a class) Express the event of no more than one occurrence of the events in {E1 . and complements) corresponding to logical combinations of the dening characteristics. As described so far.There are thirty six possible outcomes of throwing two dice. .12) Example 1. not in at least one) of the Ai i it fails to be in all the intersection (i. intersections. Ai . events as subsets of the basic space determined by appropriate characteristics of the outcomes. Classical probability 1. Occurrence or nonoccurrence of an event is determined by characteristics or attributes of the outcome Performing the trial is visualized as selecting an outcome from the basic set. E2 . In the module "Likelihood" (Section 1.2. then E1 E3 = ∅ and the pair {E1 E2 .org/content/m23244/1. We begin by looking at the likelihood Probability of the occurrence of that event on classical model which rst successfully formulated probability ideas in math ematical form. not in all) i it fails to be in at least one of the Ai . the selection process could be quite deliberate. We use modern terminology and notation to describe it. Probability enters the picture only in the latter situation.2: Continuation of Example 1. there is uncertainty about which of these latent possibilities will be realized. E3 } as c c c c c 3 c c c c c c c B2 = [E1 E2 ∪ E1 E3 ∪ E2 E3 ] = (E1 ∪ E2 ) (E1 ∪ E3 ) E2 E3 = E1 E2 ∪ E1 E3 ∪ E2 E3 The last expression shows that not more than one of the occur. Suppose C is the event the rst two occur or the last two occur but no other combination.e. The basic space Ω consists of a nite number N of possible outcomes. and logical or Boolean combinations of the events (unions. say {E1 . in general. D = A1 c c A3 = E1 E2 E3 c c E1 E2 E3 c c E1 E2 E3 E1 E2 E3 (1. 2 This content is available online at <http://cnx. and it is not in B2 c . However. traditionally is a number assigned to an event indicating the any trial. Then c C = E1 E2 E3 Let c E1 E2 E3 (1. (1. or it could involve the uncertainties associated with chance. For {Ai : i ∈ J} of events. c c Ai i∈J = i∈J Ac i and Ai i∈J = i∈J Ac i (1. Before the trial is performed. c Ei occurs i at least two of them fail to 1. if one pair.
In the rst problem. so that N − ND ND =1− = 1 − P (D) = 0. 5) = 2598960. Kolmogorov) which .. Each possible outcome is assigned a probability 3. and select the other three cards from the 48 non aces. the fraction favorable to A is assigned a unique probability. 1. ND = NA + NB + NC = C (4. then the the individual events. the number of outcomes "favorable" to the event) and N the total number of possible outcomes in the sure event Ω. 3) + C (4.17) Several other elementary properties of the classical probability may be identied. P (A) = NA /N is a nonnegative quantity. The number of outcomes. 4) C (48. 2) C (48.9583 N N (1. and of exactly four aces. Since A. Although the classical model is highly useful. important). and an extensive theory has been developed.3: Probabilities for hands of cards Consider the experiment of drawing a hand of ve cards from an ordinary deck of 52 playing cards. Thus NA = C (4. B be the event of exactly three aces. then the probability assigned event A is A) (1. we must count the number C be the event NA of ways of drawing a hand with two aces. If event (subset) A has NA 1/N elements.There are . We thus is the number not in D which is N − ND . C are mutually exclusive. We select two aces from the four.0417. the number of elements in A (in the classical language. 2) + C (4. Now the number in Dc P (Dc ) = one ace or none i there are not two or more aces. PROBABILITY SYSTEMS . 3) C (48. is N = C (52.15) P (D) ≈ 0. 1) = 103776 + 4512 + 48 = 108336 so that want (1. We adopt the Kolmogorov model (introduced by the Russian mathematician A. 3. by counting Example 1. C are mutually exclusive. We seek a more general model which includes classical probability as a special case and is thus an extension of it.13) P (A) = NA /N With this denition of probability. each event (i. B. so that number in the disjoint union is the sum of the numbers in P A B C = P (A) + P (B) + P (C) (1. it is not really satisfactory for many applications (the communications problem. B. N. What is the probability of drawing a hand with exactly two aces? What is the probability of drawing a hand with two or more aces? What is the probability of not more than one ace? SOLUTION Let A be the event of exactly two aces. which may be determined NA . event so that P (A) = 103776 NA = ≈ 0. 3) = 103776. 5) = 5!47! = 2598960 dierent hands of ve cards (order not 25 = 32 results (sequences of heads or tails) of ipping ve coins. as noted above. There is P (Dc ). 2) C (48.0399 N 2598960 (1.16) This example illustrates several important properties of the classical probability. 2. for example). It turns out that they can be derived from these three. D of two or more is D=A B C.14) Thus the There are two or more aces i there are exactly two or exactly three or exactly four. 2.e.There are 52! C (52.10 CHAPTER 1. P (Ω) = N/N = 1 If A.
We may think of probability as mass assigned to an event. A full explication requires a level of mathematical sophistication inappropriate for a treatment such as this. Repeatedly we refer to the probability mass assigned a given set. It has no members. no model is ever completely successful. The empty set represents an impossible event. and a Ω of elementary outcomes of a trial or experiment. We borrow from measure theory a few key facts which are either very plausible or which can be understood at a practical This enables us to utilize a very powerful mathematical system for representing practical problems in a manner that leads to both insight and useful strategies of solution. a class of probability measure P (·) which assigns values to the events in accordance with the following rules: For any event A. Reality always seems to escape our logical nets. we introduce and utilize additional concepts to build progressively a powerful and useful discipline. probability would be counted more than once in the sum. ∅ = Ωc . If the sets were not disjoint. as dened. so sometimes we speak informally of the weight rather than the mass. P (Ω) = 1. Gradually fundamental properties needed for the classical case. this follows .3 (Probabilities for hands of cards). are just those illustrated in Example 1. the probability P (A) ≥ 0. The necessity of the mutual exclusiveness (disjointedness) is illustrated in Example 1. then to introduce probability as a number assigned to events sub ject to certain conditions which become denitive properties. The mass Now is proportional to the weight. hence cannot occur. p. (P4): P (Ac ) = 1 − P (A). But we soon expand this rudimentary list of properties. countable class of events. Since It seems reasonable that it should be assigned zero probability (mass). is abstractsimply a number assigned to each set representing an event.3 (Probabilities for hands of cards) Denition events (P1): (P2): (P3): A probability system consists of a basic set as subsets of the basic space. logically from (P4) ("(P4)". a mass assignment with three properties does not seem a very promising beginning. If The probability of the sure event Countable additivity. The total unit mass is assigned to the basic set Ω. And many technical mathematical considerations are not important for applications at the level of this introductory treatment and may be disregarded.18) P (∅) = 0. then the probability of the disjoint union is the sum of the individual probabilities. level. 11). This follows from additivity and the fact that 1 = P (Ω) = P A (P5): Ac = P (A) + P (Ac ) (1. A probability. The Kolmogorov model is grounded in abstract measure theory. but are primarily concerned to interpret them in terms of likelihoods. 11) and (P2) ("(P2)". Of course. The Our approach is to begin with the notion of events as sets introduced above. The additivity property for disjoint sets makes the mass interpretation consistent. But most of the concepts and many of the results are elementary and easily grasped. {Ai : 1 ∈ J} is a mutually exclusive.11 captures the essential ideas in a remarkably successful way. We can use this interpretation as a precise representation. p. We use the mass interpretation to help visualize the properties. But we can give it an interpretation which helps to visualize the various patterns and relationships encountered.
we get the last expression. is the probability mass in A plus A similar interpretation holds for B. disjoint class and A is contained in the union. B must (1. The rst three expressions follow from additivity and partitioning of A∪B =A Ac B = B AB c = AB c AB Ac B (1.21) . every point in must have at least as much mass as also. The last expression shows that we correct this by taking away the extra common mass. so that B A occurs. Now the relationship is at least as likely to occur as A.1: Partitions of the union A ∪ B .1). the rst expression says the probability in the additional probability mass in the part of the second expression. PROBABILITY SYSTEMS Figure 1. (P6): If A ⊂ B. extra in B A∪B which is not in A. Hence B A. (P8): If {Bi : i ∈ J} is a countable.19) B=A Ac B so that P (B) = P (A) + P (Ac B) ≥ P (A) P (Ac B) ≥ 0 (P7): P (A ∪ B) = P (A) + P (Ac B) = P (B) + P (AB c ) = P (AB c ) + P (AB) + P (Ac B) = P (A) + P (B) − P (AB) A∪B as follows (see Figure 1.20) In terms of If we add the rst two expressions and subtract the third. then P (A) ≤ P (B). then P (A) = i∈J A= i∈J ABi so that P (ABi ) (1. From a purely formal point of view. From the mass point of view.12 CHAPTER 1. probability mass. we have since A⊂B means that if A is also in B. If we add the mass in A and B The third is the probability in the common part plus the extra in A and the we have counted the mass in the common part twice.
such as (P4) ("(P4)". the probability value to be assigned an event is clearly dened (although it may be very dicult to perform the required counting). 11). Whether one sees this as an act of the gods or simply the result of a conguration of physical or behavioral causes too complex to analyze. p. Before his or her appearance and the result of a test. If we have an assignment of numbers to the events. In the case of an uncorrected digital transmission error. or statistical studies to assign probabilities. (P5) ("(P5)". there is some basis in statistical data on past experience or knowledge of structure. introduce new probability measures) when partial information about the outcome is obtained. analyzed into equally likely outcomes.13 (P9): Subadditivity. experiment.. structure of the system studied. before the trial. p. from the point of view of the experimenter. The trial may be quite carefully planned. the principal variations are not due to experimental error but rather to unknown factors which converge to provide the specic weather pattern experienced.1). 2 through 12. While there is an extremely large number of possible hourly temperature proles. but would be inecient. or may be determined. The mechanism of chance (i. ("(P1)". the physician may not know which patient with which condition will appear. the cause is simply attributed to chance. we make new probability assignments (i. with which symptoms. proles in which successive hourly temperatures alternate between very high then very low values throughout the day constitute an unlikely subset (event). While there is uncertainty about which patient. p. and (P3) ("(P3)". In a game of chance. we need only establish (P1) 11). the source of the uncertainty) may depend upon the nature of the actual process or system observed.. Some of these properties. But this is not usually the case. we must resort to experience. It For example. where B i = Ai Ac Ac · · · Ac ⊂ Ai 1 2 i−1 (1. p. property (P6) ("(P6)". the dice problem may be handled directly by assigning the appropriate probabilities to the various numbers of total spots. and (P6) ("(P6)". the assumption of equal likelihood is based on knowledge of symmetries and structural regularities in the mechanism by which the game is carried out. the contingency may be the result of factors beyond the control or knowledge of the experimenter. 12). It may be used for systems in which the number of outcomes is innite (countably or uncountably).22) This includes as a special case the union of a nite number of events. If there were complete uncertainty. 11) to be able to assert that the assignment 11). For example. In each case. And the other properties follow as logical consequences. And the number of outcomes associated with a given event is known.e. the cause of uncertainty lies in the intricacies of the correction mechanisms and the perturbations produced by a very complex environment. experience provides some idea what range of values to expect. a substantial subset of these has very little likelihood of occurring. then P (A) ≤ ∞ i=1 P (Ai ). The existence of uncertainty due to chance or randomness does not necessarily imply that the act of performing the trial is haphazard. p. constitutes a probability measure. about which outcome will present itself. A patient at a clinic may be self selected. (P2) ("(P2)".e. a physician certainly knows approximately what fraction of the clinic's patients have the disease in question. the situation would be chaotic. 
in taking an hourly temperature prole on a given day at a weather station. does not require a uniform distribution of the probability mass among the outcomes. This would not be incorrect. and symmetry in the system under observation which makes it possible to assign likelihoods to the occurrence of various events. In the general case. 12). regularity. will appear at a clinic. are so elementary that it seems they should be included in the dening statement. In each case. For example. One normally expects trends in temperatures over the 24 hour period. Flexibility at a price In moving beyond the classical model. and the fact (Partitions) ∞ ∞ A= i=1 Ai = i=1 Bi . If A= ∞ i=1 Ai . Although a trac engineer does not know exactly how many vehicles will be observed in a given time period. It is this ability to assign likelihoods to the various events which characterizes applied . As we see in the treatment of conditional probability (Section 3. p. In the classical case. But this freedom is obtained at a price. we have gained great exibility and adaptability of the model. This follows from countable additivity. the situation is one of uncertainty. p. 11).
and returns zero if A does not occur. the total probability assigned to the union must equal the sum of the probabilities of the separate events. 1. it is there. . With the exception of the sure event and the impossible event. 11). the fraction of times that event A occurs approaches as a limit the value P (A). (P2) probability. p. The assignments must be consistent with the dening properties (P1) ("(P1)". properties provide consistency rules 11). The dening properties (P1) ("(P1)". a prime role of probability theory is to derive other probabilities from a set of given probabilites.7/>. 11) (plus others which may also be derived from these). there is a fundamental limit theorem. (P2) ("(P2)". in which the expected or average gain is zero. Whether we can determine it or not. There is a variety of points of view as to how probability should be interpreted.14 CHAPTER 1.1 What is Probability? The formal probability system is a model whose usefulness can only be established by examining its structure and determining whether patterns of uncertainty and likelihood in any practical situation can be represented adequately. the laws of probability simply impose rational consistency upon the way one assigns probabilities to events. Various attempts have been made to nd objective ways to measure the P (A) expresses the degree of certainty one feels that event A will occur. The sure event is assigned probability one. for a gain of (1 − a). (P3) ("(P3)". p. which may be interpreted if a trial is performed a large number of times in an independent manner. just as the notion of mass is consistent with many dierent mass assignments to sets in the basic space. One important dichotomy among practitioners. 11) and derived for making probability assignments. as in the classical case. they also provide important structure. Establishing this result (which we do not do in this treatment) provides a formal validation of the intuitive notion that lay behind the Inveterate gamblers had noted longrun statistical regularities. p. From this point of view. occurrence.org/content/m23246/1. • Another group insists that probability is a condition of the mind of the person making the probability assessment. 11) along with derived properties (P4) through (P9) (p. This idea is so imbedded in much intuitive thought about probability that some probabilists have insisted that it must be built into the denition of probability. the model does not tell us how to assign probability to any given event. If two or more events are mutually exclusive.3 Interpretations3 1. p. known as Borel's theorem. However determined. for a gain of −a. These impact the manner in which probabilities are assigned (or assumed). (P3) ("(P3)". The formal system is consistent with many probability assignments. In the model we adopt. It is to be discovered. This approach has not been entirely successful mathematically and has not attracted much of a following among either theoretical or applied probabilists. by analysis and experiment. The early work on probability began with a study of strength of one's belief or degree of certainty that an event will occur. ("(P2)". The probability relative frequencies of occurrence of an event under repeated but independent trials. p. early attempts to formulate probabilities. 3 This content is available online at <http://cnx. Behind this formulation is the notion of a fair game. 11). 
Just as the defining conditions put constraints on allowable probability assignments (one cannot assign negative probabilities or probabilities greater than one), so any assignment of probability consistent with these conditions is allowed. One may not know the probability assignment to every event. A typical applied problem provides the probabilities of members of a class of events (perhaps only a few) from which to determine the probabilities of other events of interest. We consider an important class of such problems in the next chapter.

Probability is a number assigned to events to indicate their likelihood of occurrence (p. 11). But what does that number signify? There are two major points of view.

• One group believes probability is objective, in the sense that it is something inherent in the nature of things. It is a number determined by the nature of reality, to be discovered by repeated experiment. From this point of view, probability is meaningful only in repeatable situations: the probability of an event is the relative frequency of its occurrence determined by a large number of repeated trials. Those who take an objective view usually emphasize the frequency interpretation.

• Others consider probability subjective: a measure of an individual's degree of certainty. One approach to characterizing an individual's degree of certainty is to equate his assessment of P(A) with the amount a he is willing to pay to play a game which returns one unit of money if A occurs. Since the probabilities are not built into the nature of things, they must be assigned on the basis of knowledge, judgment, and experience. Those who take a subjective view tend to think in terms of problems for which repeated trials are not possible.

There are many applications of probability in which the relative frequency point of view is not feasible. These are unique, nonrepeatable trials. Examples include predictions of the weather, the outcome of a game or a horse race, the performance of an individual on a particular job, the success of a newly designed computer. Probabilities in these situations may be quite subjective. Nevertheless, there is often a hidden frequentist element in the subjective evaluation: an assessment (perhaps unrealized) that in similar situations the frequencies tend to coincide with the value subjectively assigned.

Example 1.4: Subjective probability and a football game
The probability that one's favorite football team will win the next Superbowl Game may well be only a subjective probability of the bettor. This is certainly not a probability that can be determined by a large number of repeated trials. The game is only played once. As the popular expression has it, "You only go around once." However, the subjective assessment of probabilities may be based on intimate knowledge of the relative strengths and weaknesses of the teams involved, as well as factors such as weather, injuries, and experience. There may be a considerable objective basis for the subjective assignment of probability.

Example 1.5: The probability of rain
Newscasts often report that the probability of rain is 20 percent or 60 percent or some other figure. There are several difficulties here.

• To use the formal mathematical model, there must be precision in determining an event. An event either occurs or it does not. How do we determine whether it has rained or not? Must there be a measurable amount? Where must this rain fall to be counted? During what time period? Even if there is agreement on the area, the amount, and the time period, there remains ambiguity: one cannot always say with logical certainty that the event did occur or it did not occur. Nevertheless, use of the concept of an event may be helpful even if the description is not definitive. There is usually enough practical agreement for the concept to be useful.

• There is some difficulty with knowing how to interpret the probability figure. What does a 30 percent probability of rain mean? Does it mean that if the prediction is correct, 30 percent of the area indicated will get rain (in an agreed amount) during the specified time period? Or does it mean that on 30 percent of the occasions on which such a prediction is made there will be significant rainfall in the area during the specified time period? Does the statement mean that it rains 30 percent of the times when conditions are similar to current conditions? The latter alternatives may well hide a frequency interpretation. Regardless of the interpretation, it is generally useful to know whether the conditions lead to a 20 percent or a 30 percent or a 40 percent probability assignment. While the precise meaning of a 30 percent probability of rain may be difficult to determine, the figure conveys useful information, and there is no doubt that as weather forecasting technology and methodology continue to improve, the weather probability assessments will become increasingly useful.

Another common type of probability situation involves determining the distribution of some characteristic over a population, usually by a survey. These data are used to answer the question: What is the probability (likelihood) that a member of the population, chosen at random (i.e., on an equally likely basis), will have a certain characteristic?
It is fortunate that we do not have to declare a single position to be the correct viewpoint and interpretation. We are free in any situation to make the interpretation most meaningful and natural to the problem at hand. It is not necessary to fit all problems into one conceptual mold; nor is it necessary to change the mathematical model each time a different point of view seems appropriate. The formal model is consistent with any of the views set forth.

Example 1.6: Empirical probability based on survey data
A survey asks two questions of 300 students: Do you live on campus? Are you satisfied with the recreational facilities in the student center? Answers to the latter question were categorized reasonably satisfied, unsatisfied, or no definite opinion. Let C be the event on campus, O be the event off campus, S be the event reasonably satisfied, U be the event unsatisfied, and N be the event no definite opinion. Data are shown in the following table.

Survey Data
        C     O
  S   127    46
  U    31    43
  N    42    11

Table 1.1

There are 200 on-campus members in the population, so P(C) = 200/300 and P(O) = 100/300. If an individual is selected on an equally likely basis from this group of 300, the probability of any of the events is taken to be the relative frequency of respondents in each category corresponding to an event. The probability that a student selected is on campus and satisfied is taken to be P(CS) = 127/300. The probability a student is either on campus and satisfied or off campus and unsatisfied is

P(CS ⋁ OU) = P(CS) + P(OU) = 127/300 + 43/300 = 170/300   (1.23)

If there is reason to believe that the population sampled is representative of the entire student body, then the same probabilities would be applied to any student selected at random from the entire student body.

1.3.2 Probability and odds
Often we find it convenient to work with a ratio of probabilities. If A and B are events with positive probability, the odds favoring A over B is the probability ratio P(A)/P(B). If not otherwise specified, B is taken to be A^c and we speak of the odds favoring A:

O(A) = P(A)/P(A^c) = P(A)/(1 − P(A))   (1.24)

This expression may be solved algebraically to determine the probability from the odds:

P(A) = O(A)/(1 + O(A))   (1.25)

In particular, if O(A) = a/b, then P(A) = (a/b)/(1 + a/b) = a/(a + b). Thus, if the odds favoring A is 5/3, then P(A) = 5/(5 + 3) = 5/8. Conversely, if P(A) = 0.7, the odds favoring A is O(A) = 0.7/0.3 = 7/3.
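The odds-probability conversion of (1.24) and (1.25) is a one-liner in MATLAB. The following is a minimal sketch; the function handles odds2p and p2odds are our own names for illustration, not part of the textbook's m-files.

odds2p = @(O) O./(1 + O);    % P(A) from the odds O(A), eq. (1.25)
p2odds = @(P) P./(1 - P);    % odds O(A) from P(A), eq. (1.24)
odds2p(5/3)                  % returns 0.6250 = 5/8
p2odds(0.7)                  % returns 2.3333 = 7/3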
1.3.3 Partitions and Boolean combinations of events
The countable additivity property (P3) ("(P3)", p. 11) places a premium on appropriate partitioning of events.

Definition. A partition is a mutually exclusive class {Ai : i ∈ J} such that

Ω = ⋁_{i ∈ J} Ai   (1.26)

A partition of event A is a mutually exclusive class {Ai : i ∈ J} such that

A = ⋁_{i ∈ J} Ai   (1.27)

Remarks.
• A partition is a mutually exclusive class of events such that one (and only one) must occur on each trial.
• A partition of event A is a mutually exclusive class of events such that A occurs iff one (and only one) of the Ai occurs.
• A partition (no qualifier) is taken to be a partition of the sure event Ω.
• If class {Bi : i ∈ J} is mutually exclusive and A ⊂ B = ⋁_{i ∈ J} Bi, then the class {ABi : i ∈ J} is a partition of A and A = ⋁_{i ∈ J} ABi.

We may begin with a sequence {Ai : 1 ≤ i} and determine a mutually exclusive (disjoint) sequence {Bi : 1 ≤ i} as follows:

B1 = A1, and for any i > 1, Bi = Ai A1^c A2^c ⋯ A_{i−1}^c   (1.28)

Thus each Bi is the set of those elements of Ai not in any of the previous members of the sequence. This representation is used to show that subadditivity (P9) ("(P9)", p. 12) follows from countable additivity and property (P6) ("(P6)", p. 12). Since each Bi ⊂ Ai, we have P(Bi) ≤ P(Ai) by (P6), so that

P(⋁_{i=1}^∞ Ai) = P(⋁_{i=1}^∞ Bi) = Σ_{i=1}^∞ P(Bi) ≤ Σ_{i=1}^∞ P(Ai)   (1.29)

The representation of a union as a disjoint union points to an important strategy in the solution of probability problems. If an event can be expressed as a countable disjoint union of events, each of whose probabilities is known, then the probability of the combination is the sum of the individual probabilities. In the module on Partitions and Minterms (Section 2.1.2: Partitions and minterms), we show that any Boolean combination of a finite class of events can be expressed as a disjoint union in a manner that often facilitates systematic determination of the probabilities.
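The disjointification (1.28) is easy to carry out numerically when events are represented as indicator (zero-one) vectors over a finite basic space. The following is a minimal sketch under that representation; it is our own illustration, not one of the textbook's m-functions.

% Rows of A are indicator vectors for events A1, A2, A3 over a finite Omega
A = [1 1 0 0 1;      % A1
     0 1 1 0 1;      % A2
     1 0 1 1 0];     % A3
B = A;                            % B1 = A1
for i = 2:size(A,1)
    prev = max(A(1:i-1,:),[],1);  % indicator of A1 u ... u A(i-1)
    B(i,:) = A(i,:).*(1 - prev);  % Bi = Ai minus previous union, eq. (1.28)
end
disp(B)     % rows are disjoint and have the same union as the Ai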
1.3.4 The indicator function
One of the most useful tools for dealing with set combinations (and hence with event combinations) is the indicator function IE for a set E ⊂ Ω. It is defined very simply as follows:

IE(ω) = 1 for ω ∈ E and IE(ω) = 0 for ω ∈ E^c   (1.30)

Remark. Indicator functions may be defined on any domain. We have occasion in various cases to define them on the real line and on higher-dimensional Euclidean spaces. For example, if M is the interval [a, b] on the real line, then IM(t) = 1 for each t in the interval (and is zero otherwise). Thus we have a step function with unit value over the interval M. In the abstract basic space Ω we cannot draw a graph so easily. However, with the representation of sets on a Venn diagram, we can give a schematic representation, as in Figure 1.2.

Figure 1.2: Representation of the indicator function IE for event E.

Much of the usefulness of the indicator function comes from the following properties.

(IF1): IA ≤ IB iff A ⊂ B. If IA ≤ IB, then ω ∈ A implies IA(ω) = IB(ω) = 1, so ω ∈ B. If A ⊂ B, then ω ∈ A implies ω ∈ B, so IA(ω) = 1 implies IB(ω) = 1.
(IF2): IA = IB iff A = B, since

A = B iff both A ⊂ B and B ⊂ A iff IA ≤ IB and IB ≤ IA iff IA = IB   (1.31)

(IF3): IA^c = 1 − IA. This follows from the fact that IA^c(ω) = 1 iff IA(ω) = 0.
(IF4): IAB = IA IB = min{IA, IB} (extends to any class). An element ω belongs to the intersection iff it belongs to all the sets iff the indicator function for each is one iff the product of the indicator functions is one.
(IF5): IA∪B = IA + IB − IA IB = max{IA, IB} (the maximum rule extends to any class). The maximum rule follows from the fact that ω is in the union iff it is in any one or more of the events in the union iff any one or more of the individual indicator functions has value one iff the maximum is one. The sum rule for two events is established by DeMorgan's rule and properties (IF2), (IF3), and (IF4):

IA∪B = 1 − IA^cB^c = 1 − [1 − IA][1 − IB] = 1 − 1 + IB + IA − IA IB   (1.32)

(IF6): If the pair {A, B} is disjoint, IA⋁B = IA + IB (extends to any disjoint class).

Other uses and techniques are established in the module on Partitions and Minterms (Section 2.1.2: Partitions and minterms).
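Properties (IF3), (IF4), and (IF5) are easy to check numerically by representing sets as zero-one indicator vectors on a small finite space. A minimal sketch in plain MATLAB; the events chosen here are arbitrary illustrations.

% Omega = {1,...,10}; A = even integers, B = multiples of 3
w  = 1:10;
IA = double(mod(w,2) == 0);      % indicator of A
IB = double(mod(w,3) == 0);      % indicator of B
IAc  = 1 - IA;                   % (IF3): complement
IAB  = IA.*IB;                   % (IF4): intersection as a product
IAuB = IA + IB - IA.*IB;         % (IF5): union by the sum rule
isequal(IAB,  min(IA,IB))        % returns 1: product = minimum
isequal(IAuB, max(IA,IB))        % returns 1: sum rule = maximum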
The following example illustrates the use of indicator functions in establishing relationships between set combinations.

Example 1.7: Indicator functions and set combinations
Suppose {Ai : 1 ≤ i ≤ n} is a partition. If

B = ⋁_{i=1}^n Ai Ci, then B^c = ⋁_{i=1}^n Ai Ci^c   (1.33)

VERIFICATION
Utilizing properties of the indicator function established above, we have

IB = Σ_{i=1}^n IAi ICi   (1.34)

Note that since the Ai form a partition, Σ_{i=1}^n IAi = 1, so that the indicator function for the complementary event is

IB^c = 1 − Σ_{i=1}^n IAi ICi = Σ_{i=1}^n IAi − Σ_{i=1}^n IAi ICi = Σ_{i=1}^n IAi [1 − ICi] = Σ_{i=1}^n IAi ICi^c   (1.35)

The last sum is the indicator function for ⋁_{i=1}^n Ai Ci^c.

1.3.5 A technical comment on the class of events
The class of events plays a central role in the intuitive background, the application, and the formal mathematical structure. Events have been modeled as subsets of the basic space of all possible outcomes of the trial or experiment. In the case of a finite number of outcomes, any subset can be taken as an event. In the general theory, involving infinite possibilities, there are some technical mathematical reasons for limiting the class of subsets to be considered as events. The practical needs are these:

1. If A is an event, its complementary set must also be an event.
2. If {Ai : i ∈ J} is a finite or countable class of events, the union and the intersection of members of the class need to be events.

A simple argument based on DeMorgan's rules shows that if the class contains complements of all its sets and countable unions, then it contains countable intersections. Likewise, if it contains complements of all its sets and countable intersections, then it contains countable unions. A class of sets closed under complements and countable unions is known as a sigma algebra of sets. In a formal, measure-theoretic treatment, a basic assumption is that the class of events is a sigma algebra and the probability measure assigns probabilities to members of that class. Such a class is so general that it takes very sophisticated arguments to establish the fact that it does not contain all subsets. But precisely because the class is so general and inclusive, in ordinary applications we need not be concerned about which sets are permissible as events. The theoretical treatment shows that we may work with great freedom in forming events, with the assurance that in most applications a set so produced is a mathematically valid event. The so-called measurability question comes into play only in dealing with random processes with continuous parameters. Even there, under reasonable assumptions, the sets produced will be events.

A primary task in formulating a probability problem is identifying the appropriate events and the relationships between them.

1.4 Problems on Probability Systems^4

^4 This content is available online at <http://cnx.org/content/m24071/1.4/>.

Exercise 1.1 (Solution on p. 23.)
Let Ω consist of the set of positive integers. Consider the subsets

A = {ω : ω ≤ 12}, B = {ω : ω < 8}, C = {ω : ω is even}, D = {ω : ω is a multiple of 3}, E = {ω : ω is a multiple of 4}

Describe in terms of A, B, C, D, E and their complements the following sets:

a. {1, 3, 5, 7}
b. {3, 6, 9}
c. {8, 10}
d. The even integers greater than 12.
e. The positive integers which are multiples of six.
f. The integers which are even and no greater than 6 or which are odd and greater than 12.

Exercise 1.2 (Solution on p. 23.)
Let Ω be the set of integers 0 through 10. Let A = {5, 6, 7, 8}, B = the odd integers in Ω, and C = the integers in Ω which are even or less than three. Describe the following sets by listing their elements:

a. AB
b. AC
c. AB^c ∪ C
d. ABC^c
e. A ∪ B^c
f. A ∪ BC^c
g. ABC
h. A^cBC^c

Exercise 1.3 (Solution on p. 23.)
Consider fifteen-word messages in English. Let A = the set of such messages which contain the word bank and let B = the set of messages which contain the word bank and the word credit. Which event has the greater probability? Why?

Exercise 1.4 (Solution on p. 23.)
A group of five persons consists of two men and three women. They are selected one-by-one in a random manner. Let Ei be the event a man is selected on the i th selection. Write an expression for the event that both men have been selected by the third selection.

Exercise 1.5 (Solution on p. 23.)
Two persons play a game consecutively until one of them is successful or there are ten unsuccessful plays. Let Ei be the event of a success on the i th play of the game. Let A, B, C be the respective events that player one wins, player two wins, or neither wins. Write an expression for each of these events in terms of the events Ei, 1 ≤ i ≤ 10.

Exercise 1.6 (Solution on p. 23.)
Suppose the game in Exercise 1.5 could, in principle, be played an unlimited number of times. Write an expression for the event D that the game will be terminated with a success in a finite number of plays. Write an expression for the event F that the game will never terminate.

Exercise 1.7 (Solution on p. 23.)
Find the (classical) probability that among three random digits, with each digit (0 through 9) being equally likely and each triple equally likely:

a. All three are alike.
b. No two are alike.
c. The first digit is 0.
d. Exactly two are alike.
Exercise 1.8 (Solution on p. 23.)
Two coins are tossed. One of three outcomes is observed: Let ω1 be the outcome both are heads, ω2 the outcome that both are tails, and ω3 be the outcome that they are different. Is it reasonable to suppose these three outcomes are equally likely? What probabilities would you assign?

Exercise 1.9 (Solution on p. 23.)
A committee of five is chosen from a group of 20 people. What is the probability that a specified member of the group will be on the committee?

Exercise 1.10 (Solution on p. 23.)
Ten employees of a company drive their cars to the city each day and park randomly in ten spots. What is the (classical) probability that on a given day Jim will be in place three? There are n! equally likely ways to arrange n items (order important).

Exercise 1.11 (Solution on p. 23.)
John thinks the probability the Houston Texans will win next Sunday is 0.3 and the probability the Dallas Cowboys will win is 0.7 (they are not playing each other). He thinks the probability both will win is somewhere in between, say 0.5. Is that a reasonable assumption? Justify your answer.

Exercise 1.12 (Solution on p. 23.)
An extension of the classical model involves the use of areas. A well-known example is the following. A certain region L (say of land) is taken as a reference. For any subregion A, define P(A) = area(A)/area(L). Show that P(·) is a probability measure on the subregions of L.

Exercise 1.13 (Solution on p. 24.)
Suppose P(A) = 0.5 and P(B) = 0.3. What is the largest possible value of P(AB)? Using the maximum value of P(AB), determine P(AB^c), P(A^cB), P(A^cB^c), and P(A ∪ B). Are these values determined uniquely?

Exercise 1.14 (Solution on p. 24.)
For each of the following probability assignments, fill out the table. Which assignments are not permissible? Explain, in each case, why the assignment is or is not permissible.

P(A)           0.3   0.2   0.3   0.3   0.3
P(B)           0.7   0.1   0.7   0.5   0.8
P(AB)          0.4   0.4   0.2   0     0
P(A ∪ B)
P(AB^c)
P(A^cB)
P(A) + P(B)

Table 1.2

Exercise 1.15 (Solution on p. 24.)
The class {A, B, C} of events is a partition. Event A is twice as likely as C and event B is as likely as the combination A or C. Determine the probabilities P(A), P(B), P(C).

Exercise 1.16 (Solution on p. 24.)
Determine the probability P(A ∪ B ∪ C) in terms of the probabilities of the events A, B, C and their intersections.

Exercise 1.17 (Solution on p. 24.)
If occurrence of event A implies occurrence of B, show that P(A^cB) = P(B) − P(A).

Exercise 1.18 (Solution on p. 24.)
Show that P(AB) ≥ P(A) + P(B) − 1.

Exercise 1.19 (Solution on p. 24.)
The set combination A ⊕ B = AB^c ⋁ A^cB is known as the disjunctive union or the symmetric difference of A and B. This is the event that only one of the events A or B occurs on a trial. Determine P(A ⊕ B) in terms of P(A), P(B), and P(AB).

Exercise 1.20 (Solution on p. 24.)
Use fundamental properties of probability to show

a. P(AB) ≤ P(A) ≤ P(A ∪ B) ≤ P(A) + P(B)
b. P(⋂_{j=1}^∞ Ej) ≤ P(Ei) ≤ P(⋃_{j=1}^∞ Ej) ≤ Σ_{j=1}^∞ P(Ej)

Exercise 1.21 (Solution on p. 24.)
Suppose P1, P2 are probability measures and c1, c2 are positive numbers such that c1 + c2 = 1. Show that the assignment P(E) = c1 P1(E) + c2 P2(E) to the class of events is a probability measure. Such a combination of probability measures is known as a mixture. Extend this to

P(E) = Σ_{i=1}^n ci Pi(E), where the Pi are probability measures, ci > 0, and Σ_{i=1}^n ci = 1   (1.36)

Exercise 1.22 (Solution on p. 24.)
Suppose {A1, A2, ···, An} is a partition and {c1, c2, ···, cn} is a class of positive constants. For each event E, let

Q(E) = Σ_{i=1}^n ci P(EAi) / Σ_{i=1}^n ci P(Ai)   (1.37)

Show that Q(·) is a probability measure.
Solutions to Exercises in Chapter 1

Solution to Exercise 1.1 (p. 19)
a = BC^c, b = DAE^c, c = CAB^cD^c, d = CA^c, e = CD, f = BC ⋁ A^cC^c

Solution to Exercise 1.2 (p. 20)
AB = {5, 7}, AC = {6, 8}, AB^c ∪ C = C, ABC^c = AB, A ∪ B^c = {0, 2, 4, 5, 6, 7, 8, 10}, A ∪ BC^c = {3, 5, 6, 7, 8, 9}, ABC = ∅, A^cBC^c = {3, 9}

Solution to Exercise 1.3 (p. 20)
B ⊂ A implies P(B) ≤ P(A), so event A has the greater probability.

Solution to Exercise 1.4 (p. 20)
A = E1E2 ⋁ E1E2^cE3 ⋁ E1^cE2E3   (1.38)

Solution to Exercise 1.5 (p. 20)
A = E1 ⋁ E1^cE2^cE3 ⋁ E1^cE2^cE3^cE4^cE5 ⋁ E1^cE2^cE3^cE4^cE5^cE6^cE7 ⋁ E1^cE2^cE3^cE4^cE5^cE6^cE7^cE8^cE9
B = E1^cE2 ⋁ E1^cE2^cE3^cE4 ⋁ E1^cE2^cE3^cE4^cE5^cE6 ⋁ E1^cE2^cE3^cE4^cE5^cE6^cE7^cE8 ⋁ E1^cE2^cE3^cE4^cE5^cE6^cE7^cE8^cE9^cE10
C = ⋂_{i=1}^{10} Ei^c   (1.39)

Solution to Exercise 1.6 (p. 20)
Let F0 = Ω and Fk = ⋂_{i=1}^k Ei^c for k ≥ 1. Then

D = ⋁_{n=1}^∞ F_{n−1}En and F = D^c = ⋂_{i=1}^∞ Ei^c   (1.40)

Solution to Exercise 1.7 (p. 20)
Each triple has probability 1/10^3 = 1/1000.
a. Ten triples, all alike: P = 10/1000.
b. 10 × 9 × 8 triples, all different: P = 720/1000.
c. 100 triples with first digit zero: P = 100/1000.
d. C(3, 2) = 3 ways to pick the two positions alike, 10 ways to pick the common value, 9 ways to pick the other: P = 270/1000.

Solution to Exercise 1.8 (p. 21)
The equally likely assumption is not reasonable. P({ω1}) = P({ω2}) = 1/4, P({ω3}) = 1/2.

Solution to Exercise 1.9 (p. 21)
C(20, 5) committees; C(19, 4) have a designated member.

P = C(19, 4)/C(20, 5) = (19!/(4! 15!)) · (5! 15!/20!) = 5/20 = 1/4

Solution to Exercise 1.10 (p. 21)
10! permutations; 1 × 9! permutations with Jim in place 3. P = 9!/10! = 1/10.

Solution to Exercise 1.11 (p. 21)
P(AB) = 0.5 is not reasonable. It must be no greater than the minimum of P(A) = 0.3 and P(B) = 0.7.

Solution to Exercise 1.12 (p. 21)
Additivity follows from additivity of areas of disjoint regions.

Solution to Exercise 1.13 (p. 21)
The maximum value is P(AB) = 0.3. Draw a Venn diagram, or use algebraic expressions:

P(AB^c) = P(A) − P(AB) = 0.2, P(A^cB) = P(B) − P(AB) = 0
P(A^cB^c) = P(A^c) − P(A^cB) = 0.5, P(A ∪ B) = 0.5   (1.41)

Given the maximum value of P(AB), these values are determined uniquely.

Solution to Exercise 1.14 (p. 21)

P(A)           0.3    0.2    0.3    0.3    0.3
P(B)           0.7    0.1    0.7    0.5    0.8
P(AB)          0.4    0.4    0.2    0      0
P(A ∪ B)       0.6   −0.1    0.8    0.8    1.1
P(AB^c)       −0.1   −0.2    0.1    0.3    0.3
P(A^cB)        0.3   −0.3    0.5    0.5    0.8
P(A) + P(B)    1.0    0.3    1.0    0.8    1.1

Table 1.3

Only the third and fourth assignments are permissible.

Solution to Exercise 1.15 (p. 21)
P(A) + P(B) + P(C) = 1, P(A) = 2P(C), and P(B) = P(A) + P(C) = 3P(C), which implies

P(C) = 1/6, P(A) = 1/3, P(B) = 1/2   (1.42)

Solution to Exercise 1.16 (p. 21)

P(A ∪ B ∪ C) = P(A ∪ B) + P(C) − P(AC ∪ BC)
= P(A) + P(B) − P(AB) + P(C) − P(AC) − P(BC) + P(ABC)   (1.43)

Solution to Exercise 1.17 (p. 21)
Follows from P(AB) = P(A) and P(AB) + P(A^cB) = P(B), which implies P(A^cB) = P(B) − P(A).

Solution to Exercise 1.18 (p. 22)
P(A) + P(B) − P(AB) = P(A ∪ B) ≤ 1.

Solution to Exercise 1.19 (p. 22)
A Venn diagram shows P(A ⊕ B) = P(AB^c) + P(A^cB) = P(A) + P(B) − 2P(AB).

Solution to Exercise 1.20 (p. 22)
AB ⊂ A ⊂ A ∪ B implies

P(AB) ≤ P(A) ≤ P(A ∪ B) = P(A) + P(B) − P(AB) ≤ P(A) + P(B)

The general case follows similarly, with the last inequality determined by subadditivity.

Solution to Exercise 1.21 (p. 22)
Clearly P(E) ≥ 0 and P(Ω) = c1 P1(Ω) + c2 P2(Ω) = 1. If

E = ⋁_{i=1}^∞ Ei, then P(E) = c1 Σ_{i=1}^∞ P1(Ei) + c2 Σ_{i=1}^∞ P2(Ei) = Σ_{i=1}^∞ P(Ei)   (1.44)

The pattern is the same for the general case, except that the sum of two terms is replaced by the sum of n terms Σ ci Pi(E).

Solution to Exercise 1.22 (p. 22)
Clearly Q(E) ≥ 0, and since Ai Ω = Ai we have Q(Ω) = 1. If

E = ⋁_{k=1}^∞ Ek, then P(EAi) = Σ_{k=1}^∞ P(Ek Ai) for all i   (1.45)

Interchanging the order of summation shows that Q is countably additive.
Chapter 2

Minterm Analysis

2.1 Minterms^1

^1 This content is available online at <http://cnx.org/content/m23247/1.7/>.

2.1.1 Introduction

A fundamental problem in elementary probability is to find the probability of a logical (Boolean) combination of a finite class of events, when the probabilities of certain other combinations are known. If we partition an event F into component events whose probabilities can be determined, then the additivity property implies the probability of F is the sum of these component probabilities. Frequently, the event F is a Boolean combination of members of a finite class, say, {A, B, C} or {A, B, C, D}. For each such finite class, there is a fundamental partition determined by the class. The members of this partition are called minterms. Any Boolean combination of members of the class can be expressed as the disjoint union of a unique subclass of the minterms. If the probability of every minterm in this subclass can be determined, then by additivity the probability of the Boolean combination is determined. We examine these ideas in more detail.

2.1.2 Partitions and minterms

To see how the fundamental partition arises naturally, consider first the partition of the basic space produced by a single event A:

Ω = A ⋁ A^c   (2.1)

Now if B is a second event, then A = AB ⋁ AB^c and A^c = A^cB ⋁ A^cB^c, so that

Ω = A^cB^c ⋁ A^cB ⋁ AB^c ⋁ AB   (2.2)

The pair {A, B} has partitioned Ω into {A^cB^c, A^cB, AB^c, AB}. Continuation in this way leads systematically to a partition by three events {A, B, C}, four events {A, B, C, D}, etc. We illustrate the fundamental patterns in the case of four events {A, B, C, D}. We form the minterms as intersections of members of the class, with various patterns of complementation. For a class of four events, there are 2^4 = 16 such patterns, hence 16 minterms. These are, in a systematic arrangement,

A^cB^cC^cD^c   A^cBC^cD^c   AB^cC^cD^c   ABC^cD^c
A^cB^cC^cD     A^cBC^cD     AB^cC^cD     ABC^cD
A^cB^cCD^c     A^cBCD^c     AB^cCD^c     ABCD^c
A^cB^cCD       A^cBCD       AB^cCD       ABCD

Table 2.1

No element can be in more than one minterm, because each differs from the others by complementation of at least one member event. Each element ω in the basic space is assigned to exactly one of the minterms by determining the answers to four questions: Is it in A? Is it in B? Is it in C? Is it in D? Since each ω ∈ Mi for some i, the minterms form a partition. That is, the minterms represent mutually exclusive events, one of which is sure to occur on each trial. The membership of any minterm depends upon the membership of each generating set A, B, C, or D, and the relationships between them. For some classes, one or more of the minterms are empty (impossible events). As we see below, this causes no problems.

An examination of the development above shows that if we begin with a class of n events, there are 2^n minterms. To aid in systematic handling, we introduce a simple numbering system for the minterms, which we illustrate by considering again the four events A, B, C, D, in that order. The answers to the four questions above can be represented numerically by the scheme

No ∼ 0 and Yes ∼ 1

Thus, if ω is in A^cB^cC^cD^c, the answers are tabulated as 0 0 0 0. If ω is in AB^cC^cD, then the answers are Yes, No, No, Yes, and this is designated 1 0 0 1. With this scheme, the minterm arrangement above becomes

0000 ∼ 0    0001 ∼ 1    0010 ∼ 2    0011 ∼ 3
0100 ∼ 4    0101 ∼ 5    0110 ∼ 6    0111 ∼ 7
1000 ∼ 8    1001 ∼ 9    1010 ∼ 10   1011 ∼ 11
1100 ∼ 12   1101 ∼ 13   1110 ∼ 14   1111 ∼ 15

Table 2.2

We may view these quadruples of zeros and ones as binary representations of integers, which may also be represented by their decimal equivalents, as shown in the table. Frequently, it is useful to refer to the minterms by number. If the members of the generating class are treated in a fixed order, then each minterm number arrived at in the manner above specifies a minterm uniquely. Thus, for the generating class {A, B, C, D}, in that order, we may designate

A^cB^cC^cD^c = M0 (minterm 0), AB^cC^cD = M9 (minterm 9), etc.   (2.3)

We utilize this numbering scheme on special Venn diagrams called minterm maps. These are illustrated in Figure 2.1, for the cases of three, four, and five generating events. Since the actual content of any minterm depends upon the sets A, B, C, and D in the generating class, it is customary to refer to these sets as variables. In the three-variable case, set A is the right half of the diagram and set C is the lower half, but set B is split, so that it is the union of the second and fourth columns. Similar splits occur in the other cases. Remark. Other useful arrangements of minterm maps are employed in the analysis of switching circuits.

Figure 2.1: Minterm maps for three, four, and five variables.
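The numbering is just binary representation, so MATLAB can generate the zero-one pattern for any minterm number directly. A minimal sketch for the four-variable case; this is our own illustration, not one of the textbook's m-functions.

k = 0:15;                % minterm numbers for four variables
pats = dec2bin(k, 4);    % each row is the 0-1 answer pattern
pats(10,:)               % returns '1001': minterm 9, the pattern for A B^c C^c D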
2.1.3 Minterm maps and the minterm expansion

The significance of the minterm partition of the basic space rests in large measure on the following fact.

Minterm expansion. Each Boolean combination of the elements in a generating class may be expressed as the disjoint union of an appropriate subclass of the minterms. This representation is known as the minterm expansion for the combination.

A key to establishing the expansion is to note that each minterm is either a subset of the combination or is disjoint from it. The expansion is thus the union of those minterms included in the combination. A general verification using indicator functions is sketched in the last section of this module. The existence and uniqueness of the expansion is made plausible by simple examples utilizing minterm maps to determine graphically the minterm content of various Boolean combinations. Consider the following simple example.

Example 2.1: Minterm expansion
Suppose E = AB ∪ A^c(B ∪ C^c)^c. Examination of the minterm map in Figure 2.2 shows that AB consists of the union of minterms M6, M7, which we designate M(6, 7). The combination B ∪ C^c = M(0, 2, 3, 4, 6, 7), so that its complement (B ∪ C^c)^c = M(1, 5). This leaves the common part A^c(B ∪ C^c)^c = M1. Hence, E = M(1, 6, 7). Similarly, F = A ∪ B^cC = M(1, 4, 5, 6, 7).

Figure 2.2: Minterm expansion for Example 2.1 (Minterm expansion).

When we deal with a union of minterms in a minterm expansion, it is convenient to utilize the shorthand illustrated in the following. We let Mi represent the i th minterm (numbering from zero) and p(i) represent the probability of that minterm. Then

M(1, 3, 7) = M1 ⋁ M3 ⋁ M7 and p(1, 3, 7) = p(1) + p(3) + p(7)   (2.4)

In deriving an expression for a given Boolean combination which holds for any class of events, we include all possible minterms, whether empty or not. If a minterm is empty for a given class, its presence does not modify the set content or probability assignment for the Boolean combination.

2.1.4 Use of minterm maps

A typical problem seeks the probability of certain Boolean combinations of a class of events when the probabilities of various other combinations are given. We consider several simple examples and illustrate the use of minterm maps in formulation and solution.
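Once the minterm probabilities p(0), ..., p(7) are known, any such probability is a simple sum. A minimal MATLAB sketch for the expansions of Example 2.1, using an assumed (hypothetical) minterm probability vector pm:

pm = [0.21 0.06 0.29 0.11 0.09 0.03 0.14 0.07];  % assumed p(0),...,p(7)
PE = sum(pm([1 6 7] + 1))      % P(E) for E = M(1,6,7); +1: MATLAB indexes from 1
                               % = 0.27 with these assumed values
PF = sum(pm([1 4 5 6 7] + 1))  % P(F) for F = M(1,4,5,6,7); = 0.39 here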
Example 2.2: Survey on software
Statistical data are taken for a certain student population with personal computers. An individual is selected at random. Let A = the event the person selected has word processing, B = the event he or she has a spread sheet program, and C = the event the person has a data base program. The data imply:

• The probability is 0.80 that the person has a word processing program: P(A) = 0.8
• The probability is 0.65 that the person has a spread sheet program: P(B) = 0.65
• The probability is 0.30 that the person has a data base program: P(C) = 0.3
• The probability is 0.10 that the person has all three: P(ABC) = 0.1
• The probability is 0.05 that the person has neither word processing nor spread sheet: P(A^cB^c) = 0.05
• The probability is 0.65 that the person has at least two: P(AB ∪ AC ∪ BC) = 0.65
• The probability of word processor and data base, but no spread sheet, is twice the probability of spread sheet and data base, but no word processor: P(AB^cC) = 2P(A^cBC)

a. What is the probability that the person has exactly two of the programs?
b. What is the probability that the person has only the data base program?

Several questions arise:
• Are these data consistent?
• Are the data sufficient to answer the questions?
• How may the data be utilized to answer the questions?

SOLUTION
The data, expressed in terms of minterm probabilities, are:

P(A) = p(4, 5, 6, 7) = 0.80; hence P(A^c) = p(0, 1, 2, 3) = 0.20
P(B) = p(2, 3, 6, 7) = 0.65; hence P(B^c) = p(0, 1, 4, 5) = 0.35
P(C) = p(1, 3, 5, 7) = 0.30; hence P(C^c) = p(0, 2, 4, 6) = 0.70
P(ABC) = p(7) = 0.10
P(A^cB^c) = p(0, 1) = 0.05
P(AB ∪ AC ∪ BC) = p(3, 5, 6, 7) = 0.65
P(AB^cC) = p(5) = 2p(3) = 2P(A^cBC)

These data are shown on the minterm map in Figure 3a (Figure 2.3). We use the patterns displayed in the minterm map to aid in an algebraic solution for the various minterm probabilities.

p(2, 3) = P(A^c) − P(A^cB^c) = 0.20 − 0.05 = 0.15
p(6, 7) = P(B) − p(2, 3) = 0.65 − 0.15 = 0.50 ⇒ p(6) = 0.50 − 0.10 = 0.40
p(3, 5) = P(AB ∪ AC ∪ BC) − p(6, 7) = 0.65 − 0.50 = 0.15; with p(5) = 2p(3), this gives p(3) = 0.05 and p(5) = 0.10
p(2) = p(2, 3) − p(3) = 0.15 − 0.05 = 0.10
p(1) = P(C) − p(3, 5) − p(7) = 0.30 − 0.15 − 0.10 = 0.05 ⇒ p(0) = 0
p(4) = P(A) − p(5) − p(6, 7) = 0.80 − 0.10 − 0.50 = 0.20

Thus, all minterm probabilities are determined. They are displayed in Figure 3b (Figure 2.3). From these we get

P(A^cBC ⋁ AB^cC ⋁ ABC^c) = p(3, 5, 6) = 0.05 + 0.10 + 0.40 = 0.55 and P(A^cB^cC) = p(1) = 0.05   (2.5)

Figure 2.3: Minterm maps for software survey, Example 2.2 (Survey on software).

Example 2.3: Survey on personal computers
A survey of 1000 students shows that 565 have PC compatible desktop computers, 515 have Macintosh desktop computers, and 151 have laptop computers. 51 have all three, 124 have both PC and laptop computers, and 212 have at least two of the three. Twice as many own both PC and laptop as those who have both Macintosh desktop and laptop. A person is selected at random from this population. What is the probability he or she has at least one of these types of computer? What is the probability the person selected has only a laptop?
Figure 2.4: Minterm probabilities for computer survey, Example 2.3 (Survey on personal computers).

SOLUTION
Let A = the event of owning a PC desktop, B = the event of owning a Macintosh desktop, and C = the event of owning a laptop. We utilize a minterm map for three variables to help determine minterm patterns. The data, expressed in terms of minterm probabilities, are:

P(A) = p(4, 5, 6, 7) = 0.565; hence P(A^c) = p(0, 1, 2, 3) = 0.435
P(B) = p(2, 3, 6, 7) = 0.515; hence P(B^c) = p(0, 1, 4, 5) = 0.485
P(C) = p(1, 3, 5, 7) = 0.151; hence P(C^c) = p(0, 2, 4, 6) = 0.849
P(ABC) = p(7) = 0.051
P(AC) = p(5, 7) = 0.124
P(AB ∪ AC ∪ BC) = p(3, 5, 6, 7) = 0.212
P(AC) = p(5, 7) = 2p(3, 7) = 2P(BC)

We use the patterns displayed in the minterm map to aid in an algebraic solution for the various minterm probabilities:

p(3, 7) = P(BC) = 0.124/2 = 0.062, so p(3) = 0.062 − 0.051 = 0.011
p(5) = P(AC) − p(7) = 0.124 − 0.051 = 0.073
p(6) = P(AB ∪ AC ∪ BC) − p(3) − p(5, 7) = 0.212 − 0.011 − 0.124 = 0.077
p(4) = P(A) − p(5) − p(6) − p(7) = 0.565 − 0.073 − 0.077 − 0.051 = 0.364
p(1) = P(C) − p(3) − p(5, 7) = 0.151 − 0.011 − 0.124 = 0.016
p(2) = P(B) − p(3) − p(6) − p(7) = 0.515 − 0.011 − 0.077 − 0.051 = 0.376
p(0) = P(C^c) − p(2) − p(4) − p(6) = 0.849 − 0.376 − 0.364 − 0.077 = 0.032

We have determined the minterm probabilities, which are displayed on the minterm map in Figure 2.4. We may now compute the probability of any Boolean combination of the generating events A, B, C. Thus,

P(A ∪ B ∪ C) = 1 − P(A^cB^cC^c) = 1 − p(0) = 0.968 and P(A^cB^cC) = p(1) = 0.016   (2.6)
Figure 2.5: Minterm probabilities for opinion survey, Example 2.4 (Opinion survey).

Example 2.4: Opinion survey
A survey of 1000 persons is made to determine their opinions on four propositions. Let A, B, C, D be the events a person selected agrees with the respective propositions. Survey results show the following probabilities for various combinations:

P(A) = 0.200, P(B) = 0.500, P(C) = 0.300, P(D) = 0.700, P(A(B ∪ C^c)D^c) = 0.055   (2.7)

P(A ∪ BC ∪ D^c) = 0.520, P(A^cBC^cD) = 0.200, P(ABCD) = 0.015, P(AB^cC) = 0.030   (2.8)

P(A^cB^cC^cD) = 0.195, P(A^cBC) = 0.120, P(A^cB^cD^c) = 0.120, P(AC^c) = 0.140   (2.9)

P(ACD^c) = 0.025, P(ABC^cD^c) = 0.020

Determine the probabilities for each minterm and for each of the following combinations:

A^c(BC^c ∪ B^cC), that is, not A and (B or C, but not both)
A ∪ BC^c, that is, A or (B and not C)   (2.10)

SOLUTION
At the outset, it is not clear that the data are consistent or sufficient to determine the minterm probabilities. However, an examination of the data shows that there are sixteen items (including the fact that the sum of all minterm probabilities is one). Thus, there is hope, but no assurance, that a solution exists. A step elimination procedure, as in the previous examples, shows that all minterms can in fact be calculated. The results are displayed on the minterm map in Figure 2.5. It would be desirable to be able to analyze the problem systematically. The formulation above suggests a more systematic algebraic formulation which should make possible machine-aided solution.
2.1.5 Systematic formulation

Use of a minterm map has the advantage of visualizing the minterm expansion in direct relation to the Boolean combination. The algebraic solutions of the previous problems involved ad hoc manipulations of the data minterm probability combinations to find the probability of the desired target combination. We seek a systematic formulation of the data as a set of linear algebraic equations with the minterm probabilities as unknowns, so that standard methods of solution may be employed.

Example 2.5: The software survey problem reformulated
Consider again the software survey of Example 2.2 (Survey on software). The data, expressed in terms of minterm probabilities, are:

P(A) = p(4, 5, 6, 7) = 0.80
P(B) = p(2, 3, 6, 7) = 0.65
P(C) = p(1, 3, 5, 7) = 0.30
P(ABC) = p(7) = 0.10
P(A^cB^c) = p(0, 1) = 0.05
P(AB ∪ AC ∪ BC) = p(3, 5, 6, 7) = 0.65
P(AB^cC) = p(5) = 2p(3) = 2P(A^cBC), so that p(5) − 2p(3) = 0

We also have in any case

P(Ω) = P(A ∪ A^c) = p(0, 1, 2, 3, 4, 5, 6, 7) = 1   (2.11)

to complete the eight items of data needed for determining all eight minterm probabilities. The first datum can be expressed as an equation in minterm probabilities:

0·p(0) + 0·p(1) + 0·p(2) + 0·p(3) + 1·p(4) + 1·p(5) + 1·p(6) + 1·p(7) = 0.80   (2.12)

This is an algebraic equation in p(0), ..., p(7) with a matrix of coefficients [0 0 0 0 1 1 1 1]. The others may be written out accordingly, giving eight linear algebraic equations in the eight variables p(0) through p(7). These may be written in matrix form as follows:

  1  1  1  1  1  1  1  1     p(0)     1      P(Ω)
  0  0  0  0  1  1  1  1     p(1)     0.80   P(A)
  0  0  1  1  0  0  1  1     p(2)     0.65   P(B)
  0  1  0  1  0  1  0  1  ×  p(3)  =  0.30   P(C)
  0  0  0  0  0  0  0  1     p(4)     0.10   P(ABC)
  1  1  0  0  0  0  0  0     p(5)     0.05   P(A^cB^c)
  0  0  0  1  0  1  1  1     p(6)     0.65   P(AB ∪ AC ∪ BC)
  0  0  0 −2  0  1  0  0     p(7)     0      P(AB^cC) − 2P(A^cBC)

   (2.13)

The patterns in the coefficient matrix are determined with the aid of a minterm map. The solution utilizes an algebraic procedure, which could be carried out in a variety of ways, including several standard computer packages for matrix operations. We show in the module Minterm Vectors and MATLAB (Section 2.2.1: Minterm vectors and MATLAB) how we may use MATLAB for this purpose.
2.1.6 Indicator functions and the minterm expansion

Previous discussion of the indicator function shows that the indicator function for a Boolean combination of sets is a numerical-valued function of the indicator functions for the individual sets. As an indicator function, it takes on only the values zero and one. Consider a Boolean combination E of the generating sets. The value of the indicator function IE must be constant on each minterm, since any function of IA, IB, IC, ID must be constant over the minterm. For example, for each ω in the minterm AB^cCD^c, we must have

IA(ω) = 1, IB(ω) = 0, IC(ω) = 1, and ID(ω) = 0   (2.14)

Let {Mi : i ∈ JE} be the subclass of those minterms on which IE has the value one. If ω is in E ∩ Mi, then IE(ω) = 1 for all ω ∈ Mi, so that Mi ⊂ E. Since each ω ∈ Mi for some i, E must be the union of those minterms sharing an ω with E. Then

E = ⋁_{i ∈ JE} Mi   (2.15)

which is the minterm expansion of E, where Mi is the i th minterm and JE is the set of indices for those Mi included in E.

2.2 Minterms and MATLAB Calculations^2

^2 This content is available online at <http://cnx.org/content/m23248/1.8/>.

The concepts and procedures in this unit play a significant role in many aspects of the analysis of probability topics and in the use of MATLAB throughout this work.

2.2.1 Minterm vectors and MATLAB

The systematic formulation in the previous module Minterms (Section 2.1) shows that each Boolean combination, as a union of minterms, can be designated by a vector of zero-one coefficients. A coefficient one in the i th position (numbering from zero) indicates the inclusion of minterm Mi in the union. We formulate this pattern carefully below and show how MATLAB logical operations may be utilized in problem setup and solution. For example, for the combinations considered in the module on Minterms,

E = A(B ∪ C^c) ∪ A^c(B ∪ C^c)^c = M1 ⋁ M4 ⋁ M6 ⋁ M7 = M(1, 4, 6, 7)
F = A^cB^c ∪ AC = M0 ⋁ M1 ⋁ M5 ⋁ M7 = M(0, 1, 5, 7)   (2.17)

We may designate each set by a pattern of zeros and ones (e0, e1, ..., e7). In the pattern for set E, a one in the i th position indicates that minterm Mi is included: Mi is included in E iff ei = 1. The pattern may be represented by a row matrix or row vector [e0 e1 ··· e7]. Thus,

E ∼ [0 1 0 0 1 0 1 1] and F ∼ [1 1 0 0 0 1 0 1]   (2.18)

We find it convenient to use the same symbol for the name of the event and for the minterm vector or matrix representing it.
The minterm vector for E^c has zeros and ones interchanged:

E^c ∼ [1 0 1 1 0 1 0 0]   (2.19)

The j th element of E^c is one iff the corresponding element of E is zero.

Minterm vectors for Boolean combinations
If E and F are combinations of n generating sets, then each is represented by a unique minterm vector of length 2^n.

1. The minterm expansion for E ∪ F has all the minterms in either set. This means the j th element of the vector for E ∪ F is the maximum of the j th elements for the two vectors.
2. The minterm expansion for E ∩ F has only those minterms in both sets. This means the j th element of the vector for E ∩ F is the minimum of the j th elements for the two vectors.
3. The minterm expansion for E^c has only those minterms not in the expansion for E.

Thus, for the examples above,

E ∩ F ∼ [0 1 0 0 0 0 0 1], E ∪ F ∼ [1 1 0 0 1 1 1 1], and E^c ∼ [1 0 1 1 0 1 0 0]   (2.20)

MATLAB logical operations on zero-one matrices provide a convenient way of handling Boolean combinations of minterm vectors represented as matrices. For two zero-one matrices E, F of the same size:

• E|F is the matrix obtained by taking the maximum element in each place, so if E, F are minterm vectors for sets by the same name, then E|F is the minterm vector for E ∪ F.
• E&F is the matrix obtained by taking the minimum element in each place, so E&F is the minterm vector for E ∩ F.
• ~E (equivalently, 1 − E) is the matrix obtained by interchanging one and zero in each place in E, so it is the minterm vector for E^c.

This suggests a general approach to determining minterm vectors for Boolean combinations:

1. Start with minterm vectors for the generating sets.
2. Use MATLAB logical operations to obtain the minterm vector for any Boolean combination.

MATLAB implementation
A key step in the procedure just outlined is to obtain the minterm vectors for the generating elements. As a first step, we suppose we have minterm vectors for A, B, C and want to obtain the minterm vector for Boolean combinations of these. Suppose, for example, the class of generating sets is {A, B, C}. Then the minterm vectors for A, B, and C, respectively, are

A = [0 0 0 0 1 1 1 1], B = [0 0 1 1 0 0 1 1], C = [0 1 0 1 0 1 0 1]   (2.21)

Then, for example, the logical combination E = (A&B)|~C for the event E = AB ∪ C^c yields the minterm vector

E = [1 0 1 0 1 0 1 1]

We have an m-function to provide such fundamental vectors. The function minterm(n,k) generates the k th minterm vector for a class of n generating sets. For example, to produce the second minterm vector for the family (i.e., the vector for B), the basic zero-one pattern 0 0 1 1 is replicated twice to give 0 0 1 1 0 0 1 1.
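The replication pattern suggests how such a generator works. The following is a minimal sketch of the idea, not the distributed m-function's actual code (which may differ in details such as argument checking); it is named minterm_ to avoid clashing with the real minterm.

function y = minterm_(n, k)
% MINTERM_  kth minterm vector for n generating sets (sketch).
% Pattern for set k: 2^(k-1) blocks, each 2^(n-k) zeros then 2^(n-k) ones.
  block = [zeros(1, 2^(n-k)), ones(1, 2^(n-k))];
  y = repmat(block, 1, 2^(k-1));
end

For example, minterm_(3,2) returns 0 0 1 1 0 0 1 1, matching minterm(3,2) above.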
Example 2.6: Minterms for the class {A, B, C}

A = minterm(3,1)
A = 0 0 0 0 1 1 1 1
B = minterm(3,2)
B = 0 0 1 1 0 0 1 1
C = minterm(3,3)
C = 0 1 0 1 0 1 0 1

Example 2.7: Minterm patterns for the Boolean combinations F = AB ∪ B^cC and G = A ∪ A^cC

F = (A&B)|(~B&C)
F = 0 1 0 0 0 1 1 1
G = A|(~A&C)
G = 0 1 0 1 1 1 1 1
JF = find(F)-1       % Use of find to determine index set for F
JF = 1 5 6 7         % Shows F = M(1, 5, 6, 7)

These basic minterm patterns are useful not only for Boolean combinations of events but also for many aspects of the analysis of those random variables which take on only a finite number of values.

Zero-one arrays in MATLAB
The treatment above hides the fact that a rectangular array of zeros and ones can have two quite different meanings and functions in MATLAB:

1. A numerical matrix (or vector) subject to the usual operations on matrices.
2. A logical array whose elements are combined by:
   a. Logical operators to give new logical arrays;
   b. Array operations (element by element) to give numerical matrices;
   c. Array operations with numerical matrices to give numerical results.

Some simple examples will illustrate the principal properties.

A = minterm(3,1);
B = minterm(3,2);
C = minterm(3,3);
F = (A&B)|(~B&C)
F = 0 1 0 0 0 1 1 1
G = A|(~A&C)
G = 0 1 0 1 1 1 1 1
islogical(A)        % Test for logical array
ans = 0
islogical(F)
ans = 1
m = max(A,B)        % A matrix operation
m = 0 0 1 1 1 1 1 1
islogical(m)
ans = 0
m1 = A|B            % A logical operation
m1 = 0 0 1 1 1 1 1 1
islogical(m1)
ans = 1
a = logical(A)      % Converts 0-1 matrix into logical array
a = 0 0 0 0 1 1 1 1
b = logical(B)
m2 = a|b
m2 = 0 0 1 1 1 1 1 1
24) event has a data base A = event has word processor.01*[0 5 10 5 20 10 40 10]. B = event has spread sheet. consider the problem of determining the probability of {Ai : 1 ≤ i ≤ n} is any nite class of events. the minterm probabilities are pm = [0 0.05 0.10 0.*B) p3 = 2 p4 = a*b' % Cannot use matrix operations on logical arrays ??? Error using ==> mtimes % MATLAB error signal Logical inputs must be scalar. The function Use of the function minterm in a mintable(n) Generates a table of minterm vectors for n generating sets. If of these events occur on a trial can be that exactly characterized simply in terms of the minterm expansion. p = A*B' p = 2 p1 = total(A. 3. We form a mintable for three variables. the event that exactly k k of n events.20 0.B) % Equivalently.*b) p1 = 2 p3 = total(A. We count the number of successes corresponding to each minterm by using the MATLAB function sum. simple for loop yields the following mfunction.10 0.23) If we have the minterm probabilities. we have an mfunction called csort (for sort and consolidate) to perform this operation. In this case.10] where (2.37 m2 = 0 0 1 1 1 1 1 1 p = dot(A. Often it is desirable to have a table of the generating minterm vectors.9: The software survey (continued) In the software survey problem. pm = 0. The following example in the case of three variables illustrates the procedure. it would be easy to determine each distinct value and add the probabilities on the minterms which yield this value. 1. For more complicated cases. 2.05 0. The event Akn k occur is given by Akn = the In the matrix event disjoint union of those minterms with exactly k positions uncomplemented (2.22) ones. Example 2. It is desired to get the probability an individual selected has SOLUTION k of these. k = 0. M = mintable(3) M = . C = program. Example 2. it is easy to pick out the appropriate minterms and combine the probabilities. which gives the sum of each column.40 0.8: Mintable for three variables M3 = 0 0 0 M3 = mintable(3) 0 0 0 0 1 1 1 0 1 1 0 0 1 0 1 1 1 0 1 1 1 As an application of mintable. The M = mintable(n) these are the minterms corresponding to columns with exactly k Bkn that k or more occur is given by n Bkn = r=k Arn (2.
% Logical combinations (one per % row) yield logical vectors disp(V) 1 1 1 1 1 1 1 1 % Mixed logical and 0 0 0 0 1 1 1 1 % numerical vectors 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 1 0 1 0 0 0 0 0 Minterm probabilities and Boolean combination If we have the probability of every minterm generated by a nite class. we do this by hand then show how to do it with MATLAB .29. MINTERM ANALYSIS 1 0 1 1 2 1 1 0 2 1 1 1 0 0 0 0 0 0 1 0 1 1 0 1 0 1 0 T = sum(M) T = 0 1 1 2 [k. p3 = 0.0000 0. Ac&Cc]. C.. 10 These generate and name the minterm vector for each generating set and its complement.21. A. p4 = 0.10: Boolean combinations using minvec3 We wish to generate a matrix whose rows are the minterm vectors for C. A&B. determines distinct values in T and consolidates probabilities For three variables. Cc They may be renamed. A&B&C.06. .09. and A C c c Ω = A ∪ Ac . disp([k.11. respectively.03. The approach is much more useful in the case of Independent Events. AB .1000 % 3 % % % % Column sums give number of successes on each minterm.14. When we know the minterm expansion or. Example 2. we have a family of minvec procedures sets. In MATLAB. · · · . minvec4. so we indicate the instead of ∼ E. It is customary to indicate the complement of an event complement by E c E by Ec . it is easy enough to identify the various combinations by eye and make the combinations. we simply pick out the probabilities corresponding to the minterms in the expansion and add them. p1 = 0. minvec3 % Call for the setup procedure Variables are A.07 (2.. C. if desired V = [AAc. we can determine the probability of any Boolean combination of the members of the class. Example 2. p5 = 0. 4.0000 0. Minvec procedures Use of the tilde ∼ to indicate the complement of an event is often awkward.3500 2. because of the ease of determining the minterm probabilities. B. To facilitate writing combinations. p2 = 0. we cannot indicate the superscript. (minvec3.25) . p6 = 0. ABC . minvec10) to expedite expressing Boolean combinations of n = 3.pk]') 0 0 1. For a larger number of variables.5500 3.pk] = csort(T. this may become tedious. 5.pm). .11 Consider E = A (B ∪ C c ) ∪ Ac (B ∪ C c ) c and F = Ac B c ∪ AC of the example above.. A. Bc. the minterm vector. and suppose the respective minterm probabilities are p0 = 0. Ac. p7 = 0. however.0000 0. In the following example.38 CHAPTER 2. equivalently.
Use of a minterm map shows E = M(1, 4, 6, 7) and F = M(0, 1, 5, 7), so that

P(E) = p(1, 4, 6, 7) = 0.36 and P(F) = p(0, 1, 5, 7) = 0.37   (2.26)

This is easily handled in MATLAB.

• Use minvec3 to set the generating minterm vectors.
• Use logical matrix operations

E = (A&(B|Cc))|(Ac&~(B|Cc)) and F = (Ac&Bc)|(A&C)   (2.27)

to obtain the (logical) minterm vectors for E and F.
• If pm is the matrix of minterm probabilities, perform the algebraic dot product or scalar product of the pm matrix and the minterm vector for the combination. This can be called for by the MATLAB commands PE = E*pm' and PF = F*pm'.

The following is a transcript of the MATLAB operations.

minvec3                   % Call for the setup procedure
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
pm = 0.01*[21 6 29 11 9 3 14 7];
E = (A&(B|Cc))|(Ac&~(B|Cc));
F = (Ac&Bc)|(A&C);
PE = E*pm'                % Picks out and adds the minterm probabilities
PE = 0.3600
PF = F*pm'
PF = 0.3700

Example 2.12: Solution of the software survey problem
We set up the matrix equations with the use of MATLAB and solve for the minterm probabilities. From these, we may solve for the desired target probabilities.

minvec3                   % Call for the setup procedure
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
DV = [AAc; A; B; C; A&B&C; Ac&Bc; (A&B)|(A&C)|(B&C); (A&Bc&C)-2*(Ac&B&C)]
DV =
  1 1 1 1 1 1 1 1         % Data mixed numerical
  0 0 0 0 1 1 1 1         % and logical vectors
  0 0 1 1 0 0 1 1
  0 1 0 1 0 1 0 1
  0 0 0 0 0 0 0 1
  1 1 0 0 0 0 0 0
  0 0 0 1 0 1 1 1
  0 0 0 -2 0 1 0 0
DP = [1 0.8 0.65 0.3 0.1 0.05 0.65 0];  % Corresponding data probabilities
pm = DV\DP'               % Solution for minterm probabilities
pm =
  0.0000                  % Roundoff -3.5 x 10^-17
  0.0500
  0.1000
  0.0500
  0.2000
  0.1000
  0.4000
  0.1000
TV = [(A&B&Cc)|(A&Bc&C)|(Ac&B&C); Ac&Bc&C]   % Target combinations
TV =
  0 0 0 1 0 1 1 0         % Target vectors
  0 1 0 0 0 0 0 0
PV = TV*pm                % Solution for target probabilities
PV =
  0.5500                  % Target probabilities
  0.0500
Example 2.13: An alternate approach
The previous procedure first obtained all minterm probabilities, then used these to determine probabilities for the target combinations. The following procedure does not require calculation of the minterm probabilities. Suppose the data minterm vectors are linearly independent, and the target minterm vectors are linearly dependent upon the data vectors (i.e., the target vectors can be expressed as linear combinations of the data vectors). Then each target minterm vector is expressible as a linear combination of data minterm vectors. To determine the linear combinations, solve the matrix equation

TV = CT*DV, which has the MATLAB solution CT = TV/DV   (2.28)

Now each target probability is the same linear combination of the data probabilities, so the matrix tp of target probabilities is given by tp = CT*DP'. Continuing the MATLAB procedures above, we have:

CT = TV/DV;
tp = CT*DP'
tp =
  0.5500
  0.0500
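The linear-dependence idea also yields a simple computability test: a target probability is computable exactly when its minterm vector lies in the row space of the data minterm vectors. The following is a minimal sketch of that idea, our own illustration rather than the mincalc code itself; DV and TV are assumed already converted to numerical zero-one matrices.

% Targets computable iff rank is unchanged by appending target rows
if rank([DV; TV]) == rank(DV)
    CT = TV/DV;          % each target row as a combination of data rows
    tp = CT*DP'          % the same combinations of the data probabilities
else
    disp('Some target probabilities are not computable from these data')
end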
2.2.2 The procedure mincalc
The procedure mincalc performs calculations as in the preceding examples. The refinements consist of determining consistency and computability of various individual minterm probabilities and target probabilities. The consistency check is principally for negative minterm probabilities. The computability tests are tests for linear independence by means of calculation of ranks of various matrices. Sometimes the data are not sufficient to calculate all minterm probabilities, yet are sufficient to allow determination of the target probabilities. The procedure picks out the computable minterm probabilities and the computable target probabilities and calculates them.

To utilize the procedure, the problem must be formulated appropriately and precisely, as follows:

1. Use the MATLAB program minvecq to set minterm vectors for each of q basic events.
2. Data consist of Boolean combinations of the basic events and the respective probabilities of these combinations. These are organized into two matrices:
   • The data vector matrix DV has the data Boolean combinations, one on each row. MATLAB translates each row into the minterm vector for the corresponding Boolean combination. The first entry (on the first row) is AAc (for A ⋁ A^c), which is the whole space. Its minterm vector consists of a row of ones.
   • The data probability matrix DP is a row matrix of the data probabilities. The first entry is one, the probability of the whole space.
3. The objective is to determine the probability of various target Boolean combinations. These are put into the target vector matrix TV, one on each row.

Computational note. In mincalc, it is necessary to turn the arrays DV and TV consisting of zero-one patterns into zero-one matrices. This is accomplished for DV by the operation DV = ones(size(DV)).*DV, and similarly for TV. Both the original and the transformed matrices have the same zero-one pattern, but MATLAB interprets them differently.

Usual case
Suppose the data minterm vectors are linearly independent and the target vectors are each linearly dependent on the data minterm vectors. Then each target minterm vector is expressible as a linear combination of data minterm vectors, so that there is a matrix CT such that TV = CT*DV. The target probabilities are the same linear combinations of the data probabilities. These are obtained by the MATLAB operation tp = DP*CT'.

Cautionary notes
The program mincalc depends upon the provision in MATLAB for solving equations when less than full data are available (based on the singular value decomposition). There are several situations which should be dealt with as special cases.

1. The Zero Problem. If the total probability of a group of minterms is zero, then it follows that the probability of each minterm in the group is zero. However, if mincalc does not have enough information to calculate the separate minterm probabilities in the case they are not zero, it will not pick up in the zero case the fact that the separate minterm probabilities are zero. It simply considers these minterm probabilities not computable.
2. Linear dependence. In the case of linear dependence, the operation called for by the command CT = TV/DV may not be able to solve the equations. The matrix may be singular, or it may not be able to decide which of the redundant data equations to use. Should it provide a solution, there is no simple check on the consistency.
3. Consistency check. Since the consistency check is for negative minterms, if there are not enough data to calculate the minterm probabilities, there is no simple check on the consistency. Sometimes the probability of a target vector included in another vector will actually exceed what should be the larger probability. Without considerable checking, it may be difficult to determine consistency. It is usually a good idea to check results by hand to determine whether they are consistent with data. The checking by hand is usually much easier than obtaining the solution unaided, so that use of MATLAB is advantageous even in questionable cases.
4. In a few unusual cases, the command CT = TV/DV does not operate appropriately, even though the data should be adequate for the problem at hand. Apparently the approximation process does not converge.

MATLAB Solutions for examples using mincalc

Example 2.14: Software survey

% file mcalc01.m    Data for software survey
minvec3;
DV = [AAc; A; B; C; A&B&C; Ac&Bc; (A&B)|(A&C)|(B&C); (A&Bc&C)-2*(Ac&B&C)];
DP = [1 0.8 0.65 0.3 0.1 0.05 0.65 0];
TV = [(A&B&Cc)|(A&Bc&C)|(Ac&B&C); Ac&Bc&C];
disp('Call for mincalc')

mcalc01               % Call for data
Call for mincalc      % Prompt supplied in the data file
mincalc
Data vectors are linearly independent
Computable target probabilities
  1.0000   0.5500
  2.0000   0.0500
The number of minterms is 8
The number of available minterms is 8
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA

disp(PMA)             % Optional call for minterm probabilities
  0        0
  1.0000   0.0500
  2.0000   0.1000
  3.0000   0.0500
  4.0000   0.2000
  5.0000   0.1000
  6.0000   0.4000
  7.0000   0.1000

Example 2.15: Computer survey

% file mcalc02.m    Data for computer survey
minvec3;
DV = [AAc; A; B; C; A&B&C; A&C; (A&B)|(A&C)|(B&C); 2*(B&C)-(A&C)];
DP = 0.001*[1000 565 515 151 51 124 212 0];
TV = [A|B|C; Ac&Bc&C];
disp('Call for mincalc')

mcalc02
Call for mincalc
mincalc
Data vectors are linearly independent
Computable target probabilities
  1.0000   0.9680
  2.0000   0.0160
The number of minterms is 8
The number of available minterms is 8
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA

disp(PMA)
  0        0.0320
  1.0000   0.0160
  2.0000   0.3760
  3.0000   0.0110
  4.0000   0.3640
  5.0000   0.0730
  6.0000   0.0770
  7.0000   0.0510

Example 2.16: Opinion survey
% file mcalc03.m  Data for opinion survey
minvec4
DV = [A|Ac; A; B; C; D; A&(B|Cc)&Dc; A&B&C&D; A&C&Dc; ...
      A&B&Cc&Dc; Ac&B&C; Ac&Bc&Dc; Ac&B&Cc&D; Ac&Bc&Cc&D; A&Bc&C];
DP = 0.001*[1000 200 500 300 700 55 520 200 15 30 195 120 ...
      120 140 25 20];
TV = [Ac&((B&Cc)|(Bc&C)); A|((B&C)|Dc)];
disp('Call for mincalc')

mcalc03
Call for mincalc
mincalc
Data vectors are linearly independent
Computable target probabilities
    1.0000    0.4000
    2.0000    0.4800
The number of minterms is 16
The number of available minterms is 16
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA
disp(minmap(pma))             % Display arranged as on minterm map
    0.0850    0.0800    0.0200    0.0500
    0.0350    0.0100    0.0150    0.0850
    0.0500    0.0200    0.0150    0.0350
    0.1950    0.0850    0.0200    0.2000

The procedure mincalct

A useful modification, which we call mincalct, computes the available target probabilities without checking and computing the minterm probabilities. This procedure assumes a data file similar to that for mincalc, except that it does not need the target matrix TV, since it prompts for target Boolean combination inputs. The procedure mincalct may be used after mincalc has performed its operations to calculate probabilities for additional target combinations. It is not necessary to recalculate all the other quantities.

Example 2.17: (continued) Additional target datum for the opinion survey
Suppose mincalc has been applied to the data for the opinion survey and that it is desired to determine P(AD ∪ BDc). We may simply use the procedure mincalct and input the desired Boolean combination at the prompt.

mincalct
Enter matrix of target Boolean combinations  (A&D)|(B&Dc)
Computable target probabilities
    1.0000    0.2850

Repeated calls for mincalct may be used to compute other target probabilities.

2.3 Problems on Minterm Analysis3

Exercise 2.1  (Solution on p. 48.)
Consider the class {A, B, C, D} of events. Suppose the probability that at least one of the events A or C occurs is 0.75 and the probability that at least one of the four events occurs is 0.90. Determine the probability that neither of the events A or C but at least one of the events B or D occurs.

3 This content is available online at <http://cnx.org/content/m24171/1.4/>.
Exercise 2.2  (Solution on p. 48.)
Use minterm maps to show which of the following statements are true for any class {A, B, C}:

a. A ∪ (BC)c = A ∪ B ∪ BcCc
b. (A ∪ B)c = AcC ∪ BcC
c. A ⊂ AB ∪ AC ∪ BC

Exercise 2.3  (Solution on p. 48.)
Show that

a. A(B ∪ Cc) ∪ AcBC ⊂ A(BC ∪ Cc) ∪ AcB
b. A ∪ AcBC = AB ∪ BC ∪ AC ∪ ABcCc

1. Use minterm maps.
2. Repeat part (1) using indicator functions (evaluated on minterms).
3. Repeat part (1) using the m-procedure minvec3 and MATLAB logical operations.

Exercise 2.4  (Solution on p. 48.)
Minterms for the events {A, B, C, D}, arranged as on a minterm map, are

0.0072  0.0168  0.0252  0.0588
0.0392  0.0108  0.0168  0.0252
0.0672  0.0288  0.1568  0.0432
0.1008  0.0672  0.1008  0.2352

What is the probability that three or more of the events occur on a trial? Of exactly two? Of two or fewer?

Exercise 2.5  (Solution on p. 49.)
Minterms for the events {A, B, C, D, E}, arranged as on a minterm map, are

0.0216  0.0504  0.0144  0.0336  0.0216  0.0504  0.0144  0.0336
0.0144  0.0336  0.0096  0.0224  0.0144  0.0336  0.0096  0.0224
0.0324  0.0756  0.0216  0.0504  0.0324  0.0756  0.0216  0.0504
0.0216  0.0504  0.0144  0.0336  0.0216  0.0504  0.0144  0.0336

What is the probability that three or more of the events occur on a trial? Of exactly four? Of three or fewer? Of either two or four?

Exercise 2.6  (Solution on p. 49.)
Suppose P(A ∪ BcC) = 0.65, P(AC) = 0.20, P(AcB) = 0.25, P(AcCc) = 0.25, and P(BCc) = 0.30.
Determine P((ACc ∪ AcC)Bc). Then determine P((ABc ∪ Ac)Cc) and P(Ac(B ∪ Cc)), if possible.

Exercise 2.7  (Solution on p. 49.)
Suppose P(A) = 0.5, P(AC) = 0.3, P(AcB) = 0.2, P(AcCc) = 0.4, and P((ABc ∪ AcB)C) = 0.4.
Determine, if possible, P(AcCc ∪ AC) and P(ACc ∪ AcB).

Exercise 2.8  (Solution on p. 50.)
Suppose P(A) = 0.6, P(C) = 0.4, P(AC) = 0.3, P(AcB) = 0.2, and P(AcBcCc) = 0.1.
Determine, if possible, P((A ∪ B)Cc), P(ACc ∪ AcC), and P(Ac(B ∪ Cc)).

Exercise 2.9  (Solution on p. 50.)
Suppose P(A) = 0.5, P(AB) = P(AC) = 0.3, and P(ABC) = 0.1.
Determine, if possible, P(A(BC)c) and P(AB ∪ AC ∪ BC).
Exercise 2.10  (Solution on p. 51.)
Given: P(A) = 0.6, P(AcBc) = 0.2, P(AC) = 0.3, and P(ACDc) = 0.1.
Determine P(AcB ∪ A(Cc ∪ D)). Then repeat with the additional datum P(AcBC) = 0.05.

Exercise 2.11  (Solution on p. 51.)
A survey of a representative group of students yields the following information:

• 52 percent are male
• 85 percent live on campus
• 78 percent are male or are active in intramural sports (or both)
• 30 percent live on campus but are not active in sports
• 32 percent are male, live on campus, and are active in sports
• 8 percent are male and live off campus
• 17 percent are male students inactive in sports

a. What is the probability that a randomly chosen student is male and lives on campus?
b. What is the probability of a male, on campus student who is not active in sports?
c. What is the probability of a female student active in sports?

Exercise 2.12  (Solution on p. 52.)
A survey of 100 persons of voting age reveals that 60 are male, 30 of whom do not identify with a political party; 50 are members of a political party; and 20 nonmembers of a party voted in the last election. How many nonmembers of a political party did not vote?
Suggestion. Express the numbers as a fraction, and treat as probabilities.

Exercise 2.13  (Solution on p. 52.)
During a period of unsettled weather, let A be the event of rain in Austin, B be the event of rain in Houston, and C be the event of rain in San Antonio. Suppose:

P(AB) = 0.35, P(ABc) = 0.15, P(AC) = 0.20, P(ABc ∪ AcB) = 0.45, and P(BC) = 0.30

a. What is the probability of rain in all three cities?
b. What is the probability of rain in exactly two of the three cities?
c. What is the probability of rain in exactly one of the cities?

Exercise 2.14  (Solution on p. 53.)
One hundred students are questioned about their course of study and plans for graduate study. Let A = the event the student is male; B = the event the student is studying engineering; C = the event the student plans at least one year of foreign language; D = the event the student is planning graduate study (including professional school). The results of the survey are:

There are 55 men students; 23 engineering students, 10 of whom are women; 75 students will take foreign language classes, including all of the women; 26 men and 19 women plan graduate study; 13 male engineering students and 8 women engineering students plan graduate study; 20 engineering students will take a foreign language and plan graduate study; 5 non engineering students plan graduate study but no foreign language courses; 11 non engineering, women students plan foreign language study and graduate study.

a. What is the probability of selecting a student who plans foreign language classes and graduate study?
b. What is the probability of selecting a women engineer who does not plan graduate study?
c. What is the probability of selecting a male student who either studies a foreign language but does not intend graduate study or will not study a foreign language but plans graduate study?

Exercise 2.15  (Solution on p. 54.)
A survey of 100 students shows that: 60 are men students; 55 students live on campus, 25 of whom are women; 40 read the student newspaper regularly, 25 of whom are women; 70 consider themselves reasonably active in student affairs, 50 of these live on campus; 35 of the reasonably active students read the newspaper regularly. All women who live on campus and 5 who live off campus consider themselves to be active. 10 of the on campus women readers consider themselves active, as do 5 of the off campus women. 5 men are active, off campus, non readers of the newspaper.

a. How many active men are either not readers or off campus?
b. How many inactive men are not regular readers?

Exercise 2.16  (Solution on p. 54.)
A television station runs a telephone survey to determine how many persons in its primary viewing area have watched three recent special programs, which we call a, b, and c. Of the 1000 persons surveyed, the results are:

• 221 have seen at least a; 209 have seen at least b; 112 have seen at least c; 197 have seen at least two of the programs; 45 have seen all three; 62 have seen at least a and c; the number having seen at least a and b is twice as large as the number who have seen at least b and c.

(a) How many have seen at least one special?
(b) How many have seen only one special program?

Exercise 2.17  (Solution on p. 54.)
An automobile safety inspection station found that in 1000 cars tested:

• 100 needed wheel alignment, brake repair, and headlight adjustment
• 325 needed at least two of these three items
• 125 needed headlight and brake work
• 550 needed wheel alignment

a. How many needed only wheel alignment?
b. How many who do not need wheel alignment need one or none of the other items?

Exercise 2.18  (Solution on p. 55.)
Suppose P(A(B ∪ C)) = 0.3, P(Ac) = 0.6, and P(AcBcCc) = 0.1.
Determine P(B ∪ C), if possible.
Repeat the problem with the additional data P(AcBC) = 0.2 and P(AcB) = 0.3.

Exercise 2.19  (Solution on p. 56.)
A computer store sells computers, monitors, printers. Let A, B, C be the respective events the customer buys a computer, a monitor, a printer. Assume the following probabilities:

• The probability P(AB) of buying both a computer and a monitor is 0.49.
• The probability P(ABCc) of buying both a computer and a monitor but not a printer is 0.17.
• The probability P(AC) of buying both a computer and a printer is 0.45.
• The probability P(BC) of buying both a monitor and a printer is 0.39.
• The probability P(ACc ∪ AcC) of buying a computer or a printer, but not both, is 0.50.
• The probability P(ABc ∪ AcB) of buying a computer or a monitor, but not both, is 0.43.
• The probability P(BCc ∪ BcC) of buying a monitor or a printer, but not both, is 0.43.

a. What is the probability P(A), P(B), or P(C) of buying each?
b. What is the probability of buying exactly two of the three items?
c. What is the probability of buying at least two?
d. What is the probability of buying all three?

Exercise 2.20  (Solution on p. 56.)
Data are P(A) = 0.232, P(B) = 0.228, P(ABC) = 0.045, P(ABCc ∪ ABcC ∪ AcBC) = 0.062, P(ABc ∪ AcB) = 0.197, and P(BC) = 2P(AC). Are these consistent and linearly independent? Are all minterm probabilities available?

Exercise 2.21  (Solution on p. 57.)
Data are: P(A) = 0.4, P(AB) = 0.3, P(ABC) = 0.25, P(C) = 0.65, P(AcCc) = 0.3. Determine available minterm probabilities and the following, if computable: P(ACc ∪ AcC), P(AcBc), P(A ∪ B), P(ABc).
With only six items of data (including P(Ω) = P(A ⋁ Ac) = 1), not all minterms are available. Try the additional data P(AcBCc) = 0.1 and P(AcBc) = 0.3. What is the result? Explain the reason for this result.

Exercise 2.22  (Solution on p. 58.)
Repeat Exercise 2.21 with P(AB) changed from 0.3 to 0.5. What is the result? Explain the reason for this result.

Exercise 2.23  (Solution on p. 59.)
Repeat Exercise 2.21 with the original data probability matrix, but with AB replaced by AC in the data vector matrix. Does mincalc work in this case? Check results on a minterm map.
B.75 = 0. if desired. H = (Ac&C)(Bc&C).H]) 0 0 0 1 1 0 0 0 1 1 Solution to Exercise 2.8. F = AB(Bc&Cc).K]) 0 0 0 0 1 0 0 0 1 0 Solution to Exercise 2. disp([E. npr02_04 (Section~17. . Ac. disp([G. c P (A ∪ C ∪ B ∪ D) = P (A ∪ C)+P (Ac C c (B ∪ D)) .1 (p. Use mintable(4) a = mintable(4).F]) 0 0 0 1 1 0 0 1 1 1 G = A(Ac&B&C). disp([G. disp([E.4 (p.15 Solution to Exercise 2. MINTERM ANALYSIS Solutions to Exercises in Chapter 2 Solution to Exercise 2. minvec3 Variables are A. 44) 1 1 1 1 1 1 % Not equal 0 1 1 1 0 0 1 1 0 0 1 1 % Not equal % A not contained in K We use the MATLAB procedure.F]) 1 1 1 0 1 1 0 1 1 1 G = ∼(AB).H]) 1 1 0 0 0 0 1 0 1 0 K = (A&B)(A&C)(B&C). 43) Use the pattern P (E ∪ F ) = P (E) + P (E c F ) and (A ∪ C) = Ac C c . 44) so that P (Ac C c (B ∪ D)) = (2.3 (p. Cc They may be renamed. minvec3 Variables are A. disp([A.2 (p. Bc. An alternate is to use minvec4 and express the Boolean combinations which give the correct number(s) of ones.90 − 0. F = (A&((B&C)Cc))(Ac&B).48 CHAPTER 2. which displays the essential patterns. H = (A&B)(B&C)(A&C)(A&Bc&Cc). 0. which displays the essential patterns.32) We use the MATLAB procedure. E = (A&(BCc))(Ac&B&C). 44) 0 0 1 1 1 1 % E subset of F 1 1 1 1 1 1 % G = H We use mintable(4) and determine positions with correct number(s) of ones (number of occurrences). Ac. B.1: npr02_04) Minterm probabilities are in pm. Bc. if desired. Cc They may be renamed. C. E = A∼(B&C). C.
7 .5284 % Number of ones in each minterm position % Select and add minterm probabilities Solution to Exercise 2. Call for mincalc mincalc Data vectors are linearly independent % Data file Computable target probabilities 1.20 0.65 0. 44) % file npr02_06. s = sum(a).m (Section~17.4: npr02_07) % Data for Exercise~2.8.8. Ac&(BCc)].25 0.1712 P3 = (s<=3)*pm' P3 = 0.4784 Solution to Exercise 2. C. B&Cc].30].5 (p. npr02_05 (Section~17. Ac&Cc. 44) We use mintable(5) and determine positions with correct number(s) of ones (number of occurrences). if desired.3500 % is calculated.25 0. Check with minterm map.3728 P3 = (s<=2)*pm' P3 = 0. Bc. Use mintable(5) a = mintable(5).8. A&C. Ac&B.49 s = sum(a). call for PMA Solution to Exercise 2.7952 P4 = ((s==2)(s==4))*pm' P4 = 0. The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities. B. A(Bc&C).0000 0.3000 % The first and third target probability 3. 44) % file npr02_07.0000 0. % Number of ones in each minterm position P1 = (s>=3)*pm' % Select and add minterm probabilities P1 = 0.3: npr02_06) % Data for Exercise~2.6 (p.m (Section~17.4716 P2 = (s==2)*pm' P2 = 0. DP = [1 0.7 (p.2: npr02_05) Minterm probabilities are in pm.5380 P2 = (s==4)*pm' P2 = 0. P1 = (s>=3)*pm' P1 = 0.6 minvec3 DV = [AAc. ((A&Bc)Ac)&Cc. Ac. Cc They may be renamed. TV = [((A&Cc)(Ac&C))&Bc. disp('Call for mincalc') npr02_06 % Call for data Variables are A.
0000 0. if desired.9 minvec3 . Cc They may be renamed. call for PMA Solution to Exercise 2. Cc They may be renamed.m (Section~17.m (Section~17. B.4 0. 44) % file npr02_09.8. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. A&C.6 0.5 0.8 (p. Ac&(BCc)].0000 0. C. DP = [ 1 0. C.0000 0. (A&Cc)(Ac&C).0000 0. disp('Call for mincalc') npr02_08 % Call for data Variables are A. Ac.2 0. Ac. call for PMA Solution to Exercise 2. Ac&B.8.7000 % All target probabilities calculable 2. MINTERM ANALYSIS minvec3 DV = [AAc. A&Bc&Cc]. Ac&Bc&Cc]. Bc.9 (p. C.4000 The number of minterms is 8 The number of available minterms is 6 Available minterm probabilities are in vector pma To view available minterm probabilities.1]. A.5000 % All target probabilities calculable 2.8 minvec3 DV = [AAc. ((A&Bc)(Ac&B))&C.4000 % even though not all minterms are available 3. TV = [(AB)&Cc. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. A. 44) % file npr02_08. DP = [ 1 0.0000 0.5000 The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities. (A&Cc)(Ac&B)]. TV = [(Ac&Cc)(A&C). Ac&Cc. if desired.0000 0.4000 % even though not all minterms are available 3. ((A&Bc)Ac)&Cc. B.3 0.50 CHAPTER 2.3 0.4 0. Bc.6 0.2 0. C. disp('Call for mincalc') npr02_07 % Call for data Variables are A.5: npr02_08) % Data for Exercise~2.6: npr02_09) % Data for Exercise~2.1]. A&B.
1]. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. Cc They may be renamed. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. A&C.0000 0.4500 % even though not all minterms are available The number of minterms is 8 The number of available minterms is 6 Available minterm probabilities are in vector pma To view available minterm probabilities. 45) % file npr02_10.1 0.7: npr02_10) % Data for Exercise~2. if desired. Cc. Ac.8. npr02_09 % Call for data Variables are A.2 0. A&B. Ac&B&C]. TV = [(Ac&B)(A&(CcD))]. C.0000 0. call for PMA DV = [DV.7000 % Checks with minterm map solution The number of minterms is 16 The number of available minterms is 0 Available minterm probabilities are in vector pma . Ac&Bc. call for PMA Solution to Exercise 2. TV = [A&(∼(B&Cc)). Dc They may be renamed. mincalc Data vectors are linearly independent Computable target probabilities 1.3 0.0000 0. A.4000 % Only the first target probability calculable The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities. (A&B)(A&C)(B&C)]. A&B&Cc].1].6 0.3 0. A&Cc. C. DP = [1 0. A&C&Dc].0000 0.10 (p.05]. Bc.4 0.51 DV = [AAc. Bc.1 0. B.05]. Ac&Bc&Cc. DP = [ 1 0. if desired. Ac&Bc&Cc.4000 % Both target probabilities calculable 2. D. Ac. % DP = [DP 0. disp('Call for mincalc') npr02_10 Variables are A. % Modification of data DP = [DP 0.10 minvec4 DV = [AAc.5 0. A. Ac&B&C].m (Section~17. disp('Call for mincalc') % Modification for part 2 % DV = [DV. B.
Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. call for PMA Solution to Exercise 2. 45) % file npr02_12. 45) % file npr02_11.10].9: npr02_12) % Data for Exercise~2. A&Cc]. disp('Call for mincalc') npr02_11 Variables are A. DP = [ 1 0. A&B&C.0000 0.4400 2.32 0.08 0.12 (p.8: npr02_11) % Data for Exercise~2. B. TV = [Bc&Cc].11 (p. B = on campus. Bc.60 0. DP = [ 1 0. A&Bc. B. C = voted last election minvec3 DV = [AAc.0000 0. Bc.85 0. A. call for PMA Solution to Exercise 2. A.17]. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.2600 The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities. A&B&Cc. AC.0000 0. B. Ac&Bc&C]. if desired.3000 The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities.52 CHAPTER 2.50 0. disp('Call for mincalc') npr02_12 Variables are A. C. Ac. B&Cc. A&Bc.30 0. B = party member.78 0. Cc They may be renamed.m (Section~17. Ac. C.30 0. Ac&C].52 0. Bc&C.0000 0.12 % A = male. TV = [A&B. MINTERM ANALYSIS To view available minterm probabilities.20 0.8.8. Cc They may be renamed. call for PMA .1200 3. B.11 % A = male. C = active in sports minvec3 DV = [AAc. if desired.m (Section~17.
. Ac&B. TV = [C&D. Bc.14 % A = male. C. Bc.23 0. B&C&D. (A&Bc)(Ac&B). B. TV = [A&B&C. disp('Call for mincalc') npr02_13 Variables are A.19 0.13 0.14 (p.45 0.13 % A = rain in Austin. if desired. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. disp('Call for mincalc') npr02_14 Variables are A. D = graduate study minvec4 DV = [AAc.3900 2. Ac&Dc. Bc&C. Ac&C.20 0.20 0.0000 0. Cc They may be renamed. % C = foreign language. C. C. if desired.08 0.30 0.05 0. A.13 (p. Ac&B&D. 45) % file npr02_14. B&C..10 0.4000 The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities. Ac.0000 0.8. A&Bc. Ac&D.2500 3.35 0. call for PMA . .05 0. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. A&D. Ac.55 0. B = engineering. Ac&Bc&Cc].26 0.15].m (Section~17. B. call for PMA Solution to Exercise 2.11: npr02_14) % Data for Exercise~2. Dc They may be renamed. B.2600 % Third target probability not calculable The number of minterms is 16 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities. DP = [1 0. A&B&D. B = rain in Houston. Cc. D. A&B. DP = [ 1 0.11]. A&C.75 0.10: npr02_13) % Data for Exercise~2. A&((C&Dc)(Cc&D))]. (A&Bc&Cc)(Ac&B&Cc)(Ac&Bc&C)]. % C = rain in San Antonio minvec3 DV = [AAc.53 Solution to Exercise 2.0000 0.0000 0.8.15 0. Ac&Bc&C&D].0000 0.45 0.m (Section~17. (A&B&Cc)(A&Bc&C)(Ac&B&C).2000 2. Bc&Cc&D. 45) % file npr02_13.
MINTERM ANALYSIS Solution to Exercise 2.25 0. Ac&Bc&D.55 0. Cc. Bc.3000 2. B.197 0. (A&B)2*(B&C)]. 46) % file npr02_15.m (Section~17. if desired. Dc They may be renamed.10 0. (A&B)(A&C)(B&C).17 .. TV = [A&D&(CcBc).70 0.0000 0. C = readers. Ac&Bc&C&D. npr02_16 Variables are A. 46) % file npr02_17. C&D.112 0. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.16 minvec3 DV = [AAc.m (Section~17. Ac. A. C.209 0.0000 0. Ac&C.13: npr02_16) % Data for Exercise~2. Ac. C. 46) % file npr02_16. DP = [1 0.6 0. A&B&C. A&Dc&Cc]. call for PMA Solution to Exercise 2. A&C. C. if desired. A&Bc&Cc&D].2500 The number of minterms is 16 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities. D. B = on campus.17 (p. B. B.05 0.54 CHAPTER 2.0000 0.m (Section~17.25 0. Ac&B&D.. disp('Call for mincalc') npr02_15 Variables are A.12: npr02_15) % Data for Exercise~2.0000 0.221 0.05]. Ac&B.50 0.1030 The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities.8. B&D. D.40 0.8. Cc They may be renamed.14: npr02_17) % Data for Exercise~2.16 (p.15 % A = men. TV = [ABC. C.3000 2. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. (A&Bc&Cc)(Ac&B&Cc)(Ac&Bc&C)]. D = active minvec4 DV = [AAc. A. call for PMA Solution to Exercise 2.05 0.045 0. B.8.15 (p. .25 0. DP = [ 1 0. Ac&B&C&D.35 0.062 0]. Bc.
A&B&C. (((A&B)(Ac&Bc))&Cc)(A&C).125 0. TV = [A&Bc&Cc.100 0.550]. Bc.8000 2. C = headlight minvec3 DV = [AAc. Ac. Ac&B&C. if desired. TV = [BC.0000 0. Ac&B].3]. Ac&Bc&Cc].6 0.2 0.18 (p. call for PMA Solution to Exercise 2. Ac. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1. DP = [ 1 0. Ac. B. Cc They may be renamed. B = brake work.0000 0.1]. % DP = [DP 0.0000 0. disp('Call for mincalc') npr02_17 Variables are A. 46) % file npr02_18. A ]. B.2 0. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.0000 0. Ac&B&C. (A&B)(A&C)(B&C). C.55 % A = alignment.3 0.3]. DP = [ 1 0. if desired. disp('Call for mincalc') % Modification % DV = [DV. Ac&(BCc)]. % Modified data DP = [DP 0. Bc. Ac&B]. npr02_18 Variables are A.18 minvec3 DV = [AAc.m (Section~17.0000 0.325 0. call for PMA DV = [DV. Ac&(∼(B&C))]. mincalc % New calculation Data vectors are linearly independent Computable target probabilities 1.4000 .4000 The number of minterms is 8 The number of available minterms is 2 Available minterm probabilities are in vector pma To view available minterm probabilities.8.2500 2.8000 2.15: npr02_18) % Date for Exercise~2. B&C.4250 The number of minterms is 8 The number of available minterms is 3 Available minterm probabilities are in vector pma To view available minterm probabilities. C. Cc They may be renamed.0000 0. A&(BC).
B&C. Bc.8. (B&Cc)(Bc&C)]. DP = [ 1 0. (A&Cc)(Ac&C). Bc. . (A&B)(A&C)(B&C). if desired. C].230 ].17 0.. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.6900 6.0000 0.0000 0.228 0. A&B&C]. Ac.8000 2. C.6100 3.20 minvec3 DV = [AAc. A. if desired. (A&B&Cc)(A&Bc&C)(Ac&B&C).8.0000 0. MINTERM ANALYSIS 3. B. .m (Section~17. (A&Bc)(Ac&B).20 (p.39 0. TV = [A.0000 0.m (Section~17.50 0.45 0. npr02_20 (Section~17. 47) % file npr02_20. disp('Call for mincalc') npr02_19 Variables are A. B&C . C. C.8. B.. B = monitor.43]. A&C.0000 0. B. A&C. % DP = [DP 0. Cc They may be renamed. A&B&C. A&B.062 0.0000 0. A&B&Cc. DP = [1 0.43 0.4000 The number of minterms is 8 The number of available minterms is 5 Available minterm probabilities are in vector pma To view available minterm probabilities. Cc They may be renamed.17: npr02_20) % Data for Exercise~2.197 0].0000 0. TV = [ABC.3700 5.56 CHAPTER 2.3200 The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities.16: npr02_19) % Data for Exercise~2. (A&B)(A&C)(B&C). Ac&Bc&Cc].6000 4.045 0. B. 46) % file npr02_19.19 % A = computer. call for PMA Solution to Exercise 2.232 0. call for PMA Solution to Exercise 2.17: npr02_20) Variables are A.49 0. disp('Call for mincalc') % Modification % DV = [DV. C = printer minvec3 DV = [AAc.19 (p.2*(A&C)]. Ac.
TV = [(A&Cc)(Ac&C). A&B&C.0000 0.m (Section~17.0170 6. Ac&Bc.4 0. A&B. Ac&Cc]. DP = [DP 0. if desired. C].0480 3. DP = [ 1 0.1 0.0000 0.230].8.1800 7. Bc.21 (p. 47) % file npr02_21. disp('Call for mincalc') % Modification % DV = [DV.18: npr02_21) % Data for Exercise~2.18: npr02_21) Variables are A. npr02_21 (Section~17. AB.0100 % inconsistency of data 5.0000 0. Call for mincalc mincalc Data vectors are linearly independent Computable target probabilities 1.0000 0.1000 The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector pma To view available minterm probabilities. call for PMA Solution to Exercise 2. call for PMA DV = [DV. .0000 0.0450 DV = [DV. Cc They may be renamed.0000 0.3 ].3 ]. Ac&Bc]. % DP = [DP 0. B.65 0. Ac.21 minvec3 DV = [AAc. call for PMA disp(PMA) 2.0000 0. Ac&Bc]. A&Bc].0000 0.3 0. Ac&B&Cc. C. A.25 0.57 mincalc Data vectors are linearly independent Data probabilities are INCONSISTENT The number of minterms is 8 The number of available minterms is 6 Available minterm probabilities are in vector pma To view available minterm probabilities. Ac&B&Cc.0450 % Negative minterm probabilities indicate 4.3500 4. mincalc Data vectors are linearly independent Data probabilities are INCONSISTENT The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities.8. C.
1 0.65 0.5 0. Ac&B&Cc.0000 0. Ac&Bc].3 ]. call pma for PMA pma for PMA . disp('Call for mincalc') % Modification % DV = [DV. call disp(PMA) 4.2500 7. A&Bc].58 CHAPTER 2.3500 2. Bc. TV = [(A&Cc)(Ac&C).3 ]. 47) % file npr02_22. B. DP = [DP 0.3000 3. AB. Ac&Bc. if desired. C.2000 5. npr02_22 (Section~17. mincalc Data vectors are linearly independent Computable target probabilities 1.m (Section~17.2500 DV = [DV. Cc They may be renamed.25 0. Ac&Cc]. A&B. call for PMA Solution to Exercise 2.1 0.3 ]. MINTERM ANALYSIS DP = [DP 0.1000 6.0000 0. % DP = [DP 0. A&B&C.0000 0.0000 0. mincalc Data vectors are linearly independent Data probabilities are INCONSISTENT The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector To view available minterm probabilities. Ac&Bc]. Ac. Call for mincalc mincalc Data vectors are linearly independent Data probabilities are INCONSISTENT The number of minterms is 8 The number of available minterms is 4 Available minterm probabilities are in vector To view available minterm probabilities.1 0.0000 0. Ac&B&Cc.1000 The number of minterms is 8 The number of available minterms is 8 Available minterm probabilities are in vector pma To view available minterm probabilities.22 minvec3 DV = [AAc.0000 0.3 ].19: npr02_22) Variables are A.7000 4. C.0000 0.8.4 0.8. A.22 (p.0000 0. DP = [ 1 0.19: npr02_22) % Data for Exercise~2.
0000 4.m (Section~17.3 ].0000 5. Bc. Ac&Bc]. Computable target probabilities 1 Inf % Note that p(4) and p(7) are given in data 2 Inf 3 Inf The number of minterms is 8 The number of available minterms is 6 Available minterm probabilities are in vector pma To view available minterm probabilities. A&C.4 0. B. Call for mincalc mincalc Data vectors are NOT linearly independent Warning: Rank deficient. Ac.2500 0.3 ].3 0.1 0.0000 0. Ac&B&Cc.0000 0.0000 2.23 minvec3 DV = [AAc. TV = [(A&Cc)(Ac&C). Ac&Bc]. Ac&B&Cc.1 0. DP = [DP 0. rank = 5 tol = 5. call for PMA DV = [DV. 47) % file npr02_23. A&B&C. disp('Call for mincalc') % Modification % DV = [DV.2000 0.23 (p.25 0. % DP = [DP 0.2000 0. C.59 disp(PMA) 0 1.4500 The number of minterms is 8 The number of available minterms is 2 Available minterm probabilities are in vector pma To view available minterm probabilities.3 ].1000 0.0243e15 Computable target probabilities 1. Cc They may be renamed. Ac&Bc.1000 0. mincalc Data vectors are NOT linearly independent Warning: Matrix is singular to working precision.20: npr02_23) % Data for Exercise~2.0000 6. A. A&Bc].1000 0.65 0. DP = [ 1 0.8.2500 Solution to Exercise 2. npr02_23 Variables are A. C.0000 7. if desired. AB.2000 0.0000 3. call for PMA . Ac&Cc].
Chapter 3  Conditional Probability

3.1 Conditional Probability1

3.1.1 Introduction

The probability P(A) of an event A is a measure of the likelihood that the event will occur on any trial. Sometimes partial information determines that an event C has occurred. Given this information, it may be necessary to reassign the likelihood for each event A. This leads to the notion of conditional probability. For a fixed conditioning event C, this assignment to all events constitutes a new probability measure which has all the properties of the original probability measure. In addition, because of the way it is derived from the original, the conditional probability measure has a number of special properties which are important in applications.

3.1.2 Conditional probability

The original or prior probability measure utilizes all available information to make probability assignments P(A), P(B), etc., subject to the defining conditions (P1), (P2), and (P3). The probability P(A) indicates the likelihood that event A will occur on any trial.

Frequently, new information is received which leads to a reassessment of the likelihood of event A. For example:

• An applicant for a job as a manager of a service department is being interviewed. His résumé shows adequate experience and other qualifications. He conducts himself with ease and is quite articulate in his interview. He is considered a prospect highly likely to succeed. The interview is followed by an extensive background check. His credit rating, because of bad debts, is found to be quite low. With this information, the likelihood that he is a satisfactory candidate changes radically.
• A young woman is seeking to purchase a used car. She finds one that appears to be an excellent buy. It looks clean, has reasonable mileage, and is a dependable model of a well known make. Before buying, she has a mechanic friend look at it. He finds evidence that the car has been wrecked, with possible frame damage that has been repaired. The likelihood the car will be satisfactory is thus reduced considerably.
• A physician is conducting a routine physical examination on a patient in her seventies. She is somewhat overweight. He suspects that she may be prone to heart problems. Then he discovers that she exercises regularly, eats a low fat, high fiber, variegated diet, and comes from a family in which survival well into their nineties is common. On the basis of this new information, he reassesses the likelihood of heart problems.

1 This content is available online at <http://cnx.org/content/m23252/1.6/>.
New, but partial, information determines a conditioning event C, which may call for reassessing the likelihood of event A. For one thing, this means that A occurs iff the event AC occurs. Effectively, this makes C a new basic space. The new unit of probability mass is P(C). How should the new probability assignments be made? One possibility is to make the new assignment to A proportional to the probability P(AC). These considerations and experience with the classical case suggest the following procedure for reassignment. Although such a reassignment is not logically necessary, subsequent developments give substantial evidence that this is the appropriate procedure.

Definition. If C is an event having positive probability, the conditional probability of A, given C, is

P(A|C) = P(AC)/P(C)

For a fixed conditioning event C, we have a new likelihood assignment to each event A. Now

P(A|C) ≥ 0,  P(Ω|C) = 1,  and  P(⋁j Aj |C) = P(⋁j Aj C)/P(C) = Σj P(Aj C)/P(C) = Σj P(Aj |C)

Thus, the new function P(·|C) satisfies the three defining properties (P1), (P2), and (P3) for probability, so that for fixed C, we have a new probability measure, with all the properties of an ordinary probability measure.

Remark. When we write P(A|C) we are evaluating the likelihood of event A when it is known that event C has occurred. This is not the probability of a conditional event A|C. Conditional events have no meaning in the model we are developing.

Example 3.1: Conditional probabilities from joint frequency data
A survey of student opinion on a proposed national health care program included 250 students, of whom 150 were undergraduates and 100 were graduate students. Their responses were categorized Y (affirmative), N (negative), and D (uncertain or no opinion). Results are tabulated below.

       Y     N     D
U     60    40    50
G     70    20    10

Table 3.1

Suppose the sample is representative, so the results can be taken as typical of the student body. A student is picked at random. Let Y be the event he or she is favorable to the plan, N be the event he or she is unfavorable, and D be the event of no opinion (or uncertain). Let U be the event the student is an undergraduate and G be the event he or she is a graduate student. The data may reasonably be interpreted

P(G) = 100/250,  P(U) = 150/250,  P(Y) = (60 + 70)/250,  P(Y U) = 60/250,  etc.

Then

P(Y|U) = P(Y U)/P(U) = (60/250)/(150/250) = 60/150
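Such calculations from a joint frequency table are easily checked numerically. The following is a minimal sketch for the table of Example 3.1 (all numbers are taken from that example).

F = [60 40 50; 70 20 10];   % joint frequencies: rows U, G; columns Y, N, D
P = F/sum(F(:));            % joint probabilities (divide by 250)
PU = sum(P(1,:));           % P(U) = 150/250
PYU = P(1,1)/PU             % P(Y|U) = 60/150 = 0.4
PUY = P(1,1)/sum(P(:,1))    % P(U|Y) = 60/130, computed the same way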
Similarly, we can calculate

P(N|U) = 40/150,  P(D|U) = 50/150,  P(Y|G) = 70/100,  P(N|G) = 20/100,  P(D|G) = 10/100

We may also calculate directly P(U|Y) = 60/130, etc.

Conditional probability often provides a natural way to deal with compound trials carried out in several steps.

Example 3.2: Jet aircraft with two engines
An aircraft has two jet engines. It will fly with only one engine operating. Let F1 be the event one engine fails on a long distance flight, and F2 the event the second fails. Experience indicates that P(F1) = 0.0003. Once the first engine fails, added load is placed on the second, so that P(F2|F1) = 0.001. Now the second engine can fail only if the other has already failed. Thus F2 ⊂ F1, so that

P(F2) = P(F1 F2) = P(F1) P(F2|F1) = 3 × 10^(-7)

Thus the reliability of any one engine may be less than satisfactory, yet the overall reliability may be quite high.

The following example is taken from the UMAP Module 576, by Paul Mullenix, reprinted in UMAP Journal, vol 2, no. 4. More extensive treatment of the problem is given there.

Example 3.3: Responses to a sensitive question on a survey
In a survey, if answering yes to a question may tend to incriminate or otherwise embarrass the subject, the response given may be incorrect or misleading. Nonetheless, it may be desirable to obtain correct responses for purposes of social analysis. The following device for dealing with this problem is attributed to B. G. Greenberg. By a chance process, each subject is instructed to do one of three things:

1. Respond with an honest answer to the question.
2. Respond yes to the question, regardless of the truth in the matter.
3. Respond no, regardless of the true answer.

Let A be the event the subject is told to reply honestly, B be the event the subject is instructed to reply yes, and C be the event the answer is to be no, regardless. The probabilities P(A), P(B), and P(C) are determined by a chance mechanism (i.e., a fraction P(A) selected randomly are told to answer honestly, etc.). Let E be the event the reply is yes. We wish to calculate P(E|A), the probability the answer is yes, given the response is honest.

SOLUTION
Since E = EA ⋁ B, we have

P(E) = P(EA) + P(B) = P(E|A) P(A) + P(B)

which may be solved algebraically to give

P(E|A) = (P(E) − P(B))/P(A)
Suppose there are 250 subjects. The chance mechanism is such that P(A) = 0.7, P(B) = 0.14 and P(C) = 0.16. There are 62 responses yes, which we take to mean P(E) = 62/250. According to the pattern above

P(E|A) = (62/250 − 14/100)/(7/10) = 27/175 ≈ 0.154

The formulation of conditional probability assumes the conditioning event C is well defined. Sometimes there are subtle difficulties. It may not be entirely clear from the problem description what the conditioning event is. This is usually due to some ambiguity or misunderstanding of the information provided.

Example 3.4: What is the conditioning event?
Five equally qualified candidates for a job, Jim, Paul, Richard, Barry, and Evan, are identified on the basis of interviews and told that they are finalists. Three of these are to be selected at random, with results to be posted the next day. One of them, Jim, has a friend in the personnel office. Jim asks the friend to tell him the name of one of those selected (other than himself). The friend tells Jim that Richard has been selected. Jim analyzes the problem as follows.

ANALYSIS
Let Ai, 1 ≤ i ≤ 5, be the event the ith of these is hired (A1 is the event Jim is hired, A3 is the event Richard is hired, etc.). Now P(Ai) (for each i) is the probability that finalist i is in one of the combinations of three from five. Thus, Jim's probability of being hired, before receiving the information about Richard, is

P(A1) = 1 × C(4,2)/C(5,3) = 6/10 = P(Ai),  1 ≤ i ≤ 5

The information that Richard is one of those hired is information that the event A3 has occurred. Also, for any pair i ≠ j, the number of combinations of three from five including these two is just the number of ways of picking one from the remaining three. Hence,

P(A1 A3) = C(3,1)/C(5,3) = 3/10 = P(Ai Aj),  i ≠ j

The conditional probability

P(A1|A3) = P(A1 A3)/P(A3) = (3/10)/(6/10) = 1/2

This is consistent with the fact that if Jim knows that Richard is hired, then there are two to be selected from the four remaining finalists, so that

P(A1|A3) = 1 × C(3,1)/C(4,2) = 3/6 = 1/2

Discussion
Although this solution seems straightforward, it has been challenged as being incomplete. Many feel that there must be information about how the friend chose to name Richard. Many would make an assumption somewhat as follows. The friend took the three names selected: if Jim was one of them, Jim's name was removed and an equally likely choice among the other two was made; otherwise, the friend selected on an equally likely basis one of the three to be hired. Under this assumption, the information assumed is an event B3 which is not the same as A3. In fact, computation (see Example 3.7 (A compound experiment), below) shows

P(A1|B3) = 6/10 = P(A1) ≠ P(A1|A3)
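The combinatorial probabilities in the analysis may be checked directly with MATLAB's nchoosek function (all values are those of the analysis above).

PA1 = nchoosek(4,2)/nchoosek(5,3)    % P(A1) = 6/10
PA1A3 = nchoosek(3,1)/nchoosek(5,3)  % P(A1A3) = 3/10
PA1gA3 = PA1A3/PA1                   % P(A1|A3) = 1/2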
18) Note that this result could be determined by a combinatorial argument: under the assumptions.17) This pattern may be extended to the intersection of any nite number of events. One is defective. conditional probability has are consequences of the way it is related to the original probability measure special properties which P (·). What is the probability that all four customes get good items? SOLUTION Let Ei be the event the i th customer receives a good item. the second chooses one of the eight out of nine goood ones.19) Three items are to be selected (on an equally likely basis at each step) from ten.3 Some properties In addition to its properties as a probability measure. then P (ABCD) = P (A) P (BA) P (CAB) P (DABC). Hence P (E1 E2 E3 E4 ) = Example 3.. The following are easily derived from the denition of conditional probability and basic properties of the prior probability measure. etc. so that P (E1 E2 E3 E4 ) = P (E1 ) P (E2 E1 ) P (E3 E1 E2 ) P (E4 E1 E2 E3 ) = 9 8 7 6 6 · · · = 10 9 8 7 10 (3. Likewise The dening expression may be written in product form: P (ABC) = P (A) and P (AB) P (ABC) · = P (A) P (BA) P (CAB) P (A) P (AB) (3. the number of combinations of four good ones is the number of combinations of four of the nine. Then G1 G3 = G1 G2 G3 G1 Gc G3 .5: Selection of items from a lot An electronics store has ten items of a given type in stock.6: A selection problem C (9. the events may be taken in any order. Four successive Each time. each combination of four of ten is equally likely.62 9 9 45 8 10 · (3.1. Then the rst chooses one of the nine out of ten good ones.16) P (ABCD) = P (A) P (AB) P (ABC) P (ABCD) · · = P (A) P (BA) P (CAB) P (DABC) P (A) P (AB) P (ABC) (3. 4) 126 = = 3/5 C (10. which corresponds to the dierence in the information given (or assumed).20) . (CP1) Product rule Derivation If P (ABCD) > 0. 3. the selection is on an equally likely basis from those remaining. 4) 210 (3. Also. Example 3. and prove useful in a variety of problem situations. customers purchase one of the items. The dierence is in the conditioning event. two of which are defective. 2 By the product rule P (G1 G3 ) = P (G1 ) P (G2 G1 ) P (G3 G1 G2 ) + P (G1 ) P (Gc G1 ) P (G3 G1 Gc ) = 2 2 7 6 8 7 · 8 + 10 · 2 · 8 = 28 ≈ 0. Determine the probability that the rst and third selected are good.65 Both results are mathematically correct. P (AB) = P (A) P (BA). 1 ≤ i ≤ 3 be the event the i th unit selected is good. SOLUTION Let Gi .
Then P (E) = P (EA1 ) P (A1 ) + P (EA2 ) P (A2 ) + · · · + P (EAn ) P (An ) Example 3.24) We thus have P (A1 B5 ) = Occurrence of event 3/20 6 = = P (A1 ) 5/20 10 (3.7 A compound experiment. Let F be the event the student fails the test. P (A1 B5 ) = P (A1 A5 B5 ) = P (A1 A5 ) P (B5 A1 A5 ) = (3.21) Five cards are numbered one through ve.2. Let Hi be the event that a student tested is from high school i. P (F H3 ) = 0. Thus.02. P (B5 ) = P (B5 A1 A5 ) P (A1 A5 ) + P (B5 Ac A5 ) P (Ac A5 ) = 1 1 Also.1). 1 ≤ i ≤ 3. Example 3. all three are put in a box 2. Determine P (B5 ).06 (3. P (F H1 ) = 0.5. P (A1 B5 ). 1 We therefore have B 5 = B 5 A5 = B 5 A1 A5 B 5 Ac A5 . (3. This condition is examined more thoroughly in the chapter on "Independence of Events" (Section 4. P (A1 B5 ). on an equally likely basis.3. 66). Also A5 = A1 A5 Ac A5 . {Ai : 1 ≤ i ≤ n} of E = A1 E A2 E · · · events is mutually exclusive and An E . A student passes the exam. so B 5 ⊂ A5 . Three cards are selected without replacement. This implies P Ai Ac = P (Ai ) − P (Ai Aj ) = 3/10 j that (3. 1 By the law of total probability (CP2) (p. P (H3 ) = 0. student is from high school P (F H2 ) = 0. Their mathematical preparation varies.23) A1 B5 = A1 A5 B5 . they are given a diagnostic test. A twostep selection procedure is carried out as follows. CONDITIONAL PROBABILITY E Suppose the class (CP2) Law of total probability every outcome in is in one of these events.25) B1 has no aect on the likelihood of the occurrence of A1 . Often in applications data lead to conditioning with respect to an event but the problem calls for conditioning in the opposite direction. since 1 3 1 1 3 · + · = 2 10 3 10 4 3 1 3 · = 10 2 20 (3.66 CHAPTER 3. In order to group them appropriately in class sections. • • If card 1 is drawn. the other two are put in a box If card 1 is not drawn. we have P (Ai ) = 6/10 and P (Ai Aj ) = 3/10.4 (What is the conditioning event?). Determine for each i the conditional probability P (Hi F c ) that the . 1.10. One of cards in the box is drawn on an equally likely basis (from either two or three) Let Ai be the event the numbered i i th card is drawn on the rst selection and let Bi be the event the card and is drawn on the second selection (from the box). SOLUTION From Example 3. P (H2 ) = 0. Suppose data indicate P (H1 ) = 0.22) Now we can draw card ve on the second selection only if it is selected on the rst drawing.26) i.8: Reversal of conditioning Students in a freshman mathematics class come from three dierent high schools. a disjoint union.
Often in applications data lead to conditioning with respect to an event but the problem calls for conditioning in the opposite direction.

Example 3.8: Reversal of conditioning
Students in a freshman mathematics class come from three different high schools. Their mathematical preparation varies. In order to group them appropriately in class sections, they are given a diagnostic test. Let Hi be the event that a student tested is from high school i, 1 ≤ i ≤ 3. Let F be the event the student fails the test. Suppose data indicate

P(H1) = 0.2, P(H2) = 0.5, P(H3) = 0.3, P(F|H1) = 0.10, P(F|H2) = 0.02, P(F|H3) = 0.06

A student passes the exam. Determine for each i the conditional probability P(Hi|Fc) that the student is from high school i.

SOLUTION

P(Fc) = P(Fc|H1) P(H1) + P(Fc|H2) P(H2) + P(Fc|H3) P(H3) = 0.90 · 0.2 + 0.98 · 0.5 + 0.94 · 0.3 = 0.952

Then

P(H1|Fc) = P(Fc H1)/P(Fc) = P(Fc|H1) P(H1)/P(Fc) = 180/952 = 0.1891

Similarly,

P(H2|Fc) = P(Fc|H2) P(H2)/P(Fc) = 490/952 = 0.5147  and  P(H3|Fc) = P(Fc|H3) P(H3)/P(Fc) = 282/952 = 0.2962

Such reversals are desirable in a variety of practical situations. The basic pattern utilized in the reversal is the following.

(CP3) Bayes' rule  If E ⊂ ⋁(i=1 to n) Ai (as in the law of total probability), then

P(Ai|E) = P(Ai E)/P(E) = P(E|Ai) P(Ai)/P(E),  1 ≤ i ≤ n

The law of total probability yields P(E).

Example 3.9: A compound selection and reversal
Begin with items in two lots:

1. Three items, one defective.
2. Four items, one defective.

One item is selected from lot 1 (on an equally likely basis); this item is added to lot 2; a selection is then made from lot 2 (also on an equally likely basis). This second item is good. What is the probability the item selected from lot 1 was good?

SOLUTION
Let G1 be the event the first item (from lot 1) was good, and G2 be the event the second item (from the augmented lot 2) is good. Now the data are interpreted as

P(G1) = 2/3,  P(G2|G1) = 4/5,  P(G2|G1c) = 3/5

By the law of total probability (CP2),

P(G2) = P(G1) P(G2|G1) + P(G1c) P(G2|G1c) = (2/3)·(4/5) + (1/3)·(3/5) = 11/15

By Bayes' rule (CP3),

P(G1|G2) = P(G2|G1) P(G1)/P(G2) = (4/5 × 2/3)/(11/15) = 8/11 ≈ 0.73
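The two-step pattern of Example 3.9, total probability followed by Bayes' rule, may be checked numerically (all values are from the solution above):

PG1 = 2/3; PG2gG1 = 4/5; PG2gG1c = 3/5;
PG2 = PG1*PG2gG1 + (1 - PG1)*PG2gG1c   % P(G2) = 11/15
PG1gG2 = PG2gG1*PG1/PG2                % P(G1|G2) = 8/11, approximately 0.7273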
Example 3.10: Additional problems requiring reversals

• Medical tests. Suppose D is the event a patient has a certain disease and T is the event a test for the disease is positive. Data are usually of the form: prior probability P(D) (or prior odds P(D)/P(Dc)), probability P(T|Dc) of a false positive, and probability P(Tc|D) of a false negative. The desired probabilities are P(D|T) and P(Dc|Tc).
• Safety alarm. If D is the event a dangerous condition exists (say a steam pressure is too high) and T is the event the safety alarm operates, then data are usually of the form P(D), P(T|Dc), and P(Tc|D), or equivalently (e.g., P(Tc|Dc) and P(T|D)). Again, the desired probabilities are that the safety alarm signals correctly, P(D|T) and P(Dc|Tc).
• Job success. If H is the event of success on a job, and E is the event that an individual interviewed has certain desirable characteristics, the data are usually prior P(H) and reliability of the characteristics as predictors in the form P(E|H) and P(E|Hc). The desired probability is P(H|E).
• Presence of oil. If H is the event of the presence of oil at a proposed well site, and E is the event of certain geological structure (salt dome or fault), the data are usually P(H) (or the odds), P(E|H), and P(E|Hc). The desired probability is P(H|E).
• Market condition. Before launching a new product on the national market, a firm usually examines the condition of a test market as an indicator of the national market. If H is the event the national market is favorable and E is the event the test market is favorable, data are a prior estimate P(H) of the likelihood the national market is sound, and data P(E|H) and P(E|Hc) indicating the reliability of the test market. What is desired is P(H|E), the likelihood the national market is favorable, given the test market is favorable.

The calculations, as in Example 3.8 (Reversal of conditioning), are simple but can be tedious. We have an m-procedure called bayes to perform the calculations easily. The probabilities P(Ai) are put into a matrix PA and the conditional probabilities P(E|Ai) are put into matrix PEA. The desired probabilities P(Ai|E) and P(Ai|Ec) are calculated and displayed.

Example 3.11: MATLAB calculations for Example 3.8 (Reversal of conditioning)

PEA = [0.10 0.02 0.06];
PA = [0.2 0.5 0.3];
bayes
Requires input PEA = [P(EA1) P(EA2) ... P(EAn)]
 and PA = [P(A1) P(A2) ... P(An)]
Determines PAE = [P(A1E) P(A2E) ... P(AnE)]
 and PAEc = [P(A1Ec) ... P(AnEc)]
Enter matrix PEA of conditional probabilities  PEA
Enter matrix PA of probabilities  PA
P(E) = 0.048
    P(EAi)    P(Ai)     P(AiE)    P(AiEc)
    0.1000    0.2000    0.4167    0.1891
    0.0200    0.5000    0.2083    0.5147
    0.0600    0.3000    0.3750    0.2962
The various quantities are in the matrices PEA, PA, PAE, PAEc, named above

The procedure displays the results in tabular form, as shown. In addition, the various quantities are in the workspace in the matrices named, so that they may be used in further calculations without recopying.
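The reversal performed by bayes amounts to a few lines of matrix arithmetic. The following minimal sketch reproduces the table above; it is not the book's m-file itself, whose details may differ.

PEA = [0.10 0.02 0.06];           % P(E|Ai), data from Example 3.11
PA = [0.2 0.5 0.3];               % P(Ai)
PE = PEA*PA'                      % P(E) by the law of total probability
PAE = (PEA.*PA)/PE;               % P(Ai|E) by Bayes' rule
PAEc = ((1 - PEA).*PA)/(1 - PE);  % P(Ai|Ec) by the same pattern
disp([PEA' PA' PAE' PAEc'])       % compare with the table in Example 3.11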
The following variation of Bayes' rule is applicable in many practical situations.

(CP3*) Ratio form of Bayes' rule

P(A|C)/P(B|C) = P(AC)/P(BC) = [P(C|A)/P(C|B)] · [P(A)/P(B)]

The left hand member is called the posterior odds, which is the odds after knowledge of the occurrence of the conditioning event C. The second fraction in the right hand member is the prior odds, which is the odds before knowledge of the occurrence of the conditioning event. The first fraction in the right hand member is known as the likelihood ratio. It is the ratio of the probabilities (or likelihoods) of C for the two different probability measures P(·|A) and P(·|B).

Example 3.12: A performance test
As a part of a routine maintenance procedure, a computer is given a performance test. The machine seems to be operating so well that the prior odds it is satisfactory are taken to be ten to one. The test has probability 0.05 of a false positive and 0.01 of a false negative. A test is performed. The result is positive. What are the posterior odds the device is operating properly?

SOLUTION
Let S be the event the computer is operating satisfactorily and let T be the event the test is favorable. The data are P(S)/P(Sc) = 10, P(T|Sc) = 0.05, and P(Tc|S) = 0.01. Then by the ratio form of Bayes' rule

P(S|T)/P(Sc|T) = [P(T|S)/P(T|Sc)] · [P(S)/P(Sc)] = (0.99/0.05) · 10 = 198,  so that  P(S|T) = 198/199 = 0.9950
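A direct numerical check of Example 3.12 (all values are from the solution above):

LR = 0.99/0.05;                    % likelihood ratio P(T|S)/P(T|Sc)
post_odds = LR*10                  % posterior odds = 198
PSgT = post_odds/(1 + post_odds)   % P(S|T) = 198/199 = 0.9950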
The following property serves to establish in the chapters on "Independence of Events" (Section 4.1) and "Conditional Independence" (Section 5.1) a number of important properties for the concept of independence and of conditional independence of events.

(CP4) Some equivalent conditions  If 0 < P(A) < 1 and 0 < P(B) < 1, then

P(A|B) ∗ P(A) iff P(B|A) ∗ P(B) iff P(AB) ∗ P(A) P(B)

and

P(AB) ∗ P(A) P(B) iff P(AcBc) ∗ P(Ac) P(Bc) iff P(ABc) ⋄ P(A) P(Bc)

where ∗ is <, ≤, =, ≥, or >, and ⋄ is the reversed relation >, ≥, =, ≤, or <, respectively.

Because of the role of this property in the theory of independence and conditional independence, we examine the derivation of these results.

VERIFICATION of (CP4)

a. P(AB) ∗ P(A) P(B) iff P(A|B) ∗ P(A) (divide by P(B); we may exchange A and Ac)
b. P(AB) ∗ P(A) P(B) iff P(B|A) ∗ P(B) (divide by P(A); we may exchange B and Bc)
c. P(AB) ∗ P(A) P(B) iff [P(A) − P(ABc)] ∗ P(A)[1 − P(Bc)] iff −P(ABc) ∗ −P(A) P(Bc) iff P(ABc) ⋄ P(A) P(Bc)
d. We may use c to get P(AB) ∗ P(A) P(B) iff P(ABc) ⋄ P(A) P(Bc) iff P(AcBc) ∗ P(Ac) P(Bc)

A number of important and useful propositions may be derived from these.

1. P(A|B) + P(Ac|B) = 1, but, in general, P(A|B) + P(A|Bc) ≠ 1.
2. P(A|B) > P(A) iff P(A|Bc) < P(A).
3. P(Ac|B) > P(Ac) iff P(A|B) < P(A).
4. P(A|B) > P(A) iff P(Ac|Bc) > P(Ac).

VERIFICATION — Exercises (see problem set)

3.1.4 Repeated conditioning

Suppose conditioning by the event C has occurred. Additional information is then received that event D has occurred. We have a new conditioning event CD. There are two possibilities:

1. Reassign the conditional probabilities. PC(A) becomes

PC(A|D) = PC(AD)/PC(D) = P(ACD)/P(CD)

2. Reassign the total probabilities: P(A) becomes

PCD(A) = P(A|CD) = P(ACD)/P(CD)

Basic result: PC(A|D) = P(A|CD) = PD(A|C). Thus repeated conditioning by two events may be done in any order, or may be done in one step. This result extends easily to repeated conditioning by any finite number of events. This result is important in extending the concept of "Independence of Events" (Section 4.1) to "Conditional Independence" (Section 5.1). These conditions are important for many problems of probable inference.

3.2 Problems on Conditional Probability2

Exercise 3.1  (Solution on p. 74.)
Given the following data:

P(A) = 0.55, P(AB) = 0.30, P(BC) = 0.20, P(Ac ∪ BC) = 0.55, P(AcBCc) = 0.15

Determine, if possible, the conditional probability P(Ac|B) = P(AcB)/P(B).

Exercise 3.2  (Solution on p. 74.)
In Exercise 11 (Exercise 2.11) from "Problems on Minterm Analysis," we have the following data: A survey of a representative group of students yields the following information:

• 52 percent are male
• 85 percent live on campus
• 78 percent are male or are active in intramural sports (or both)
• 30 percent live on campus but are not active in sports
• 32 percent are male, live on campus, and are active in sports
• 8 percent are male and live off campus
• 17 percent are male students inactive in sports

Let A = male, B = on campus, C = active in sports.

(a) A student is selected at random. He is male and lives on campus. What is the (conditional) probability that he is active in sports?
(b) A student selected is active in sports. What is the (conditional) probability that she is a female who lives on campus?

Exercise 3.3  (Solution on p. 74.)
In a certain population, the probability a woman lives to at least seventy years is 0.70 and is 0.55 that she will live to at least eighty years. If a woman is seventy years old, what is the conditional probability she will survive to eighty years? Note that if A ⊂ B then P(AB) = P(A).

Exercise 3.4  (Solution on p. 74.)
From 100 cards numbered 00, 01, 02, ···, 99, one card is drawn. Suppose Ai is the event the sum of the two digits on a card is i, 0 ≤ i ≤ 18, and Bj is the event the product of the two digits is j. Determine P(Ai|B0) for each possible i.

Exercise 3.5  (Solution on p. 74.)
Two fair dice are rolled.

a. What is the (conditional) probability that one turns up two spots, given they show different numbers?

2 This content is available online at <http://cnx.org/content/m24173/1.4/>.
b. What is the (conditional) probability that the first turns up six, given that the sum is k, for each k from two through 12?
c. What is the (conditional) probability that at least one turns up six, given that the sum is k, for each k from two through 12?

Exercise 3.6  (Solution on p. 75.)
Data on incomes and salary ranges for a certain population are analyzed as follows. S1 = event annual income is less than $25,000; S2 = event annual income is between $25,000 and $100,000; S3 = event annual income is greater than $100,000. E1 = event did not complete college education; E2 = event of completion of bachelor's degree; E3 = event of completion of graduate or professional degree program. Data may be tabulated as follows: P(E1) = 0.65, P(E2) = 0.30, and P(E3) = 0.05.

P(Si|Ej):

           S1      S2      S3
E1        0.85    0.10    0.05
E2        0.10    0.80    0.10
E3        0.05    0.50    0.45
P(Si)     0.50    0.40    0.10

Table 3.2

a. Determine P(E3|S3).
b. Suppose a person has a university education (no graduate study). What is the (conditional) probability that he or she will make $25,000 or more?
c. Find the total probability that a person's income category is at least as high as his or her educational level.

Exercise 3.7  (Solution on p. 75.)
A shipment of 1000 electronic units is received. There is an equally likely probability that there are 0, 1, 2, or 3 defective units in the lot. If one is selected at random and found to be good, what is the probability of no defective units in the lot?

Exercise 3.8  (Solution on p. 75.)
Four persons are to be selected from a group of 12 people, 7 of whom are women.

a. What is the probability that the first and third selected are women?
b. What is the probability that three of those selected are women?
c. What is the (conditional) probability that the first and third selected are women, given that three of those selected are women?

Exercise 3.9  (Solution on p. 75.)
Twenty percent of the paintings in a gallery are not originals. A collector buys a painting. He has probability 0.10 of buying a fake for an original but never rejects an original as a fake. What is the (conditional) probability the painting he purchases is an original?

Exercise 3.10  (Solution on p. 75.)
Five percent of the units of a certain type of equipment brought in for service have a common defect. Experience shows that 93 percent of the units with this defect exhibit a certain behavioral characteristic, while only two percent of the units which do not have this defect exhibit that characteristic. A unit is examined and found to have the characteristic symptom. What is the conditional probability that the unit has the defect, given this behavior?
c. Find the total probability that a person's income category is at least as high as his or her educational level.

Exercise 3.11    (Solution on p. 75.)
In a survey, 85 percent of the employees say they favor a certain company policy. Previous experience indicates that 20 percent of those who do not favor the policy say that they do, out of fear of reprisal. What is the probability that an employee picked at random really does favor the company policy? It is reasonable to assume that all who favor say so.

Exercise 3.12    (Solution on p. 75.)
A quality control group is designing an automatic test procedure for compact disk players coming from a production line. Experience shows that one percent of the units produced are defective. The automatic test procedure has probability 0.05 of giving a false positive indication and probability 0.02 of giving a false negative. That is, if D is the event a unit tested is defective, and T is the event that it tests satisfactory, then P(T|D) = 0.05 and P(Tc|Dc) = 0.02. Determine the probability P(Dc|T) that a unit which tests good is, in fact, free of defects.

Exercise 3.13    (Solution on p. 76.)
Five boxes of random access memory chips have 100 units per box. They have respectively one, two, three, four, and five defective units. A box is selected at random, on an equally likely basis, and a unit is selected at random therefrom. It is defective. What are the (conditional) probabilities the unit was selected from each of the boxes?

Exercise 3.14    (Solution on p. 76.)
Two percent of the units received at a warehouse are defective. A nondestructive test procedure gives two percent false positive indications and five percent false negative. Units which fail to pass the inspection are sold to a salvage firm. This firm applies a corrective procedure which does not affect any good unit and which corrects 90 percent of the defective units. A customer buys a unit from the salvage firm. It is good. What is the (conditional) probability the unit was originally defective?

Exercise 3.15    (Solution on p. 76.)
At a certain stage in a trial, the judge feels the odds are two to one the defendant is guilty. It is determined that the defendant is left handed. An investigator convinces the judge this is six times more likely if the defendant is guilty than if he were not. What is the likelihood, given this evidence, that the defendant is guilty?

Exercise 3.16    (Solution on p. 76.)
Show that if P(A|C) > P(B|C) and P(A|Cc) > P(B|Cc), then P(A) > P(B). Is the converse true? Prove or give a counterexample.

Exercise 3.17    (Solution on p. 76.)
Since P(·|B) is a probability measure for a given B, we must have P(A|B) + P(Ac|B) = 1. Construct an example to show that in general P(A|B) + P(A|Bc) ≠ 1.

Exercise 3.18    (Solution on p. 76.)
Use property (CP4) (p. 69) to show

a. P(A|B) > P(A) iff P(A|Bc) < P(A)
b. P(Ac|B) > P(Ac) iff P(A|B) < P(A)
c. P(A|B) > P(A) iff P(Ac|Bc) > P(Ac)

Exercise 3.19    (Solution on p. 76.)
Show that P(A|B) ≥ (P(A) + P(B) − 1)/P(B).

Exercise 3.20    (Solution on p. 76.)
Show that P(A|B) = P(A|BC)P(C|B) + P(A|BCc)P(Cc|B).
Exercise 3.21    (Solution on p. 77.)
An individual is to select from among n alternatives in an attempt to obtain a particular one. This might be selection from answers on a multiple choice question, when only one is correct. Let A be the event he makes a correct selection, and B be the event he knows which is correct before making the selection. We suppose P(B) = p and P(A|Bc) = 1/n. Determine P(B|A); show that P(B|A) ≥ P(B) and that P(B|A) increases with n for fixed p.

Exercise 3.22    (Solution on p. 77.)
Polya's urn scheme for a contagious disease. An urn contains initially b black balls and r red balls (r + b = n). A ball is drawn on an equally likely basis from among those in the urn, then replaced along with c additional balls of the same color. The process is repeated. There are n balls on the first choice, n + c balls on the second choice, etc. Let Bk be the event of a black ball on the kth draw and Rk be the event of a red ball on the kth draw. Determine

a. P(B2|R1)
b. P(B1B2)
c. P(R2)
d. P(B1|R2)
Solutions to Exercises in Chapter 3

Solution to Exercise 3.1 (p. 70)

% file npr03_01.m (Section 17.8.21: npr03_01)
% Data for Exercise 3.1
minvec3
DV = [A|Ac; A; A&B; B&C; Ac|(B&C); Ac&B&Cc];
DP = [1 0.55 0.30 0.20 0.55 0.15];
TV = [Ac&B; B];
disp('Call for mincalc')

npr03_01
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
Call for mincalc

mincalc
Data vectors are linearly independent
Computable target probabilities
    1.0000    0.2500
    2.0000    0.5500
The number of minterms is 8
The number of available minterms is 4

P(Ac|B) = P(AcB)/P(B) = 0.25/0.55 = 0.4545

Solution to Exercise 3.2 (p. 70)

npr02_11 (Section 17.8.8: npr02_11)
mincalc
mincalct
Enter matrix of target Boolean combinations  [A&B&C; A&B; Ac&B&C; C]
Computable target probabilities
    1.0000    0.3200
    2.0000    0.4400
    3.0000    0.2300
    4.0000    0.6100
PC_AB = 0.32/0.44
PC_AB = 0.7273
PAcB_C = 0.23/0.61
PAcB_C = 0.3770

Solution to Exercise 3.3 (p. 70)
Let A = event she lives to seventy and B = event she lives to eighty. Since B ⊂ A, P(B|A) = P(AB)/P(A) = P(B)/P(A) = 55/70.

Solution to Exercise 3.4 (p. 70)
B0 is the event one of the first ten is drawn. AiB0 is the event that the card with numbers 0i is drawn. P(Ai|B0) = (1/100)/(1/10) = 1/10 for each i, 0 through 9.
Solution to Exercise 3.5 (p. 70)

a. There are 6 × 5 ways to choose all different. There are 2 × 5 ways that they are different and one turns up two spots. The conditional probability is 2/6.
b. Let A6 = event the first is a six and Sk = event the sum is k. Now A6Sk = ∅ for k ≤ 6. A table of sums shows P(A6Sk) = 1/36 and P(Sk) = 6/36, 5/36, 4/36, 3/36, 2/36, 1/36 for k = 7 through 12, respectively. Hence P(A6|Sk) = 1/6, 1/5, 1/4, 1/3, 1/2, 1, respectively.
c. If AB6 is the event at least one is a six, then P(AB6Sk) = 2/36 for k = 7 through 11 and P(AB6S12) = 1/36. Thus, the conditional probabilities are 2/6, 2/5, 2/4, 2/3, 1, 1, respectively.

Solution to Exercise 3.6 (p. 71)
P(W1W3) = P(W1W2W3) + P(W1W2cW3) = (7/12)·(6/11)·(5/10) + (7/12)·(5/11)·(6/10) = 7/22

Solution to Exercise 3.7 (p. 71)
Let B = the event the collector buys, and G = the event the painting is original. Assume P(B|G) = 1 and P(B|Gc) = 0.1. If P(G) = 0.8, then

P(G|B) = P(B|G)P(G) / [P(B|G)P(G) + P(B|Gc)P(Gc)] = 0.8/(0.8 + 0.1 · 0.2) = 40/41

Solution to Exercise 3.8 (p. 71)
Let D = the event the unit is defective and C = the event it has the characteristic. Then P(D) = 0.05, P(C|D) = 0.93, and P(C|Dc) = 0.02.

P(D|C) = P(C|D)P(D) / [P(C|D)P(D) + P(C|Dc)P(Dc)] = (0.93 · 0.05)/(0.93 · 0.05 + 0.02 · 0.95) = 93/131

Solution to Exercise 3.9 (p. 71)
Let Dk = the event of k defective and G be the event a good one is chosen.

P(D0|G) = P(G|D0)P(D0) / [P(G|D0)P(D0) + P(G|D1)P(D1) + P(G|D2)P(D2) + P(G|D3)P(D3)]
= (1 · 1/4) / [(1/4)(1 + 999/1000 + 998/1000 + 997/1000)] = 1000/3994

Solution to Exercise 3.10 (p. 72)

a. P(E3|S3) = P(S3|E3)P(E3)/P(S3) = (0.45 · 0.05)/0.10 = 0.2250
b. P(S2 ∪ S3|E2) = 0.80 + 0.10 = 0.90
c. p = (0.85 + 0.10 + 0.05) · 0.65 + (0.80 + 0.10) · 0.30 + 0.45 · 0.05 = 0.9425

Solution to Exercise 3.11 (p. 72)
P(S) = P(S|F)P(F) + P(S|Fc)[1 − P(F)], with P(S) = 0.85, P(S|F) = 1, and P(S|Fc) = 0.20, so that

P(F) = [P(S) − P(S|Fc)] / [1 − P(S|Fc)] = 13/16

Solution to Exercise 3.12 (p. 72)

P(Dc|T) = P(T|Dc)P(Dc) / [P(T|Dc)P(Dc) + P(T|D)P(D)] = (0.98 · 0.99)/(0.98 · 0.99 + 0.05 · 0.01) = 9702/9707 = 1 − 5/9707
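The Bayes' rule calculations above are easy to check numerically. The following is an added MATLAB sketch, not part of the original solutions, using the data of Exercise 3.8:

PD    = 0.05;      % P(D), unit has the defect
PC_D  = 0.93;      % P(C|D), characteristic given the defect
PC_Dc = 0.02;      % P(C|D^c), characteristic given no defect
PD_C  = PC_D*PD/(PC_D*PD + PC_Dc*(1 - PD))
PD_C  = 0.7099     % agrees with 93/131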
Solution to Exercise 3.13 (p. 72)
Let Hi = the event from box i, 1 ≤ i ≤ 5. P(Hi) = 1/5 and P(D|Hi) = i/100.

P(Hi|D) = P(D|Hi)P(Hi) / Σj P(D|Hj)P(Hj) = i/15, 1 ≤ i ≤ 5

Solution to Exercise 3.14 (p. 72)
Let T = event test indicates defective, D = event initially defective, and G = event unit purchased is good. Data are

P(D) = 0.02, P(Tc|D) = 0.02, P(T|Dc) = 0.05, P(G|DT) = 0.90, P(G|DcT) = 1

P(D|G) = P(GD)/P(G), with P(GD) = P(GTD) = P(D)P(T|D)P(G|TD) and
P(G) = P(GT) = P(GDT) + P(GDcT), so that

P(D|G) = (0.02 · 0.98 · 0.90)/(0.02 · 0.98 · 0.90 + 0.98 · 0.05 · 1.00) = 441/1666 = 0.2647

Solution to Exercise 3.15 (p. 72)
Let G = event the defendant is guilty, and L = the event the defendant is left handed. Prior odds: P(G)/P(Gc) = 2. Result of testimony: P(L|G)/P(L|Gc) = 6.

P(G|L)/P(Gc|L) = [P(G)/P(Gc)] · [P(L|G)/P(L|Gc)] = 2 · 6 = 12, so that P(G|L) = 12/13

Solution to Exercise 3.16 (p. 72)
P(A) = P(A|C)P(C) + P(A|Cc)P(Cc) > P(B|C)P(C) + P(B|Cc)P(Cc) = P(B).
The converse is not true. Consider P(C) = P(Cc) = 0.5, P(A|C) = 1/4, P(A|Cc) = 3/4, P(B|C) = 1/2, and P(B|Cc) = 1/4. Then

1/2 = P(A) = (1/2)(1/4 + 3/4) > (1/2)(1/2 + 1/4) = P(B) = 3/8

But P(A|C) < P(B|C).

Solution to Exercise 3.17 (p. 72)
Suppose A ⊂ B with P(A) < P(B). Then P(A|B) = P(A)/P(B) < 1 and P(A|Bc) = 0, so the sum is less than one.

Solution to Exercise 3.18 (p. 72)

a. P(A|B) > P(A) iff P(AB) > P(A)P(B) iff P(ABc) < P(A)P(Bc) iff P(A|Bc) < P(A)
b. P(Ac|B) > P(Ac) iff P(AcB) > P(Ac)P(B) iff P(AB) < P(A)P(B) iff P(A|B) < P(A)
c. P(A|B) > P(A) iff P(AB) > P(A)P(B) iff P(AcBc) > P(Ac)P(Bc) iff P(Ac|Bc) > P(Ac)

Solution to Exercise 3.19 (p. 72)
1 ≥ P(A ∪ B) = P(A) + P(B) − P(AB) = P(A) + P(B) − P(A|B)P(B). Simple algebra gives the desired result.

Solution to Exercise 3.20 (p. 72)
P(A|B) = P(AB)/P(B) = [P(ABC) + P(ABCc)]/P(B) = [P(A|BC)P(BC) + P(A|BCc)P(BCc)]/P(B) = P(A|BC)P(C|B) + P(A|BCc)P(Cc|B)
Solution to Exercise 3.21 (p. 73)
P(A|B) = 1, P(A|Bc) = 1/n, P(B) = p.

P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|Bc)P(Bc)] = p / [p + (1/n)(1 − p)] = np / [(n − 1)p + 1]

P(B|A)/P(B) = n/(np + 1 − p), which increases from 1 to 1/p as n → ∞.

Solution to Exercise 3.22 (p. 73)

a. P(B2|R1) = b/(n + c)
b. P(B1B2) = P(B1)P(B2|B1) = (b/n) · (b + c)/(n + c)
c. P(R2) = P(R2|R1)P(R1) + P(R2|B1)P(B1) = [(r + c)/(n + c)] · (r/n) + [r/(n + c)] · (b/n) = r(r + c + b)/[n(n + c)] = r/n
d. P(B1|R2) = P(R2|B1)P(B1)/P(R2), with P(R2|B1)P(B1) = [r/(n + c)] · (b/n). Using (c), we have P(B1|R2) = b/(n + c).
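The urn results of Exercise 3.22 may be checked by simulation. The following sketch is an added illustration, not part of the original text; the values b = 3, r = 7, c = 2 are arbitrary.

b = 3; r = 7; c = 2; n = b + r;    % arbitrary illustrative values
N = 100000;                        % number of simulated composite trials
B1 = rand(1,N) < b/n;              % black ball on the first draw
pB2 = (b + c*B1)/(n + c);          % P(black on second draw|first draw)
B2 = rand(1,N) < pB2;
R2 = ~B2;
est = sum(B1&R2)/sum(R2)           % relative frequency estimate of P(B1|R2)
exact = b/(n + c)                  % theoretical value from part (d)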
Chapter 4
Independence of Events

4.1 Independence of Events1

Historically, the notion of independence has played a prominent role in probability. If events form an independent class, much less information is required to determine probabilities of Boolean combinations and calculations are correspondingly easier. In this unit, we give a precise formulation of the concept of independence in the probability sense. As in the case of all concepts which attempt to incorporate intuitive notions, the consequences must be evaluated for evidence that these ideas have been captured successfully.

4.1.1 Independence as lack of conditioning

There are many situations in which we have an operational independence.

• Suppose a deck of playing cards is shuffled and a card is selected at random then replaced with reshuffling. A second card picked on a repeated try should not be affected by the first choice.
• If customers come into a well stocked shop at different times, each unaware of the choice made by the others, the item purchased by one should not be affected by the choice made by the other.
• If two students are taking exams in different courses, the grade one makes should not affect the grade made by the other.

The list of examples could be extended indefinitely. In each case, the operational independence described indicates that knowledge that one of the events has occurred does not affect the likelihood that the other will occur. In each case, we should expect to model the events as independent in some way. How should we incorporate the concept in our developing model of probability? We take our clue from the examples above.

The development of conditional probability in the module Conditional Probability (Section 3.1) leads to the interpretation of P(A|B) as the likelihood that A will occur on a trial, given knowledge that B has occurred. If such knowledge of the occurrence of B does not affect the likelihood of the occurrence of A, we should be inclined to think of the events A and B as being independent in a probability sense. For a pair of events {A, B}, this is the condition

P(A|B) = P(A)    (4.1)

Occurrence of the event A is not conditioned by occurrence of the event B. Our basic interpretation is that P(A) indicates the likelihood of the occurrence of event A.

1This content is available online at <http://cnx.org/content/m23253/1.6/>.
4.1.2 Independent pairs

We take our clue from the condition P(A|B) = P(A). Property (CP4) (p. 69) for conditional probability (in the case of equality) yields sixteen equivalent conditions, as follows:

P(A|B) = P(A)        P(B|A) = P(B)        P(AB) = P(A)P(B)
P(A|Bc) = P(A)       P(Bc|A) = P(Bc)      P(ABc) = P(A)P(Bc)
P(Ac|B) = P(Ac)      P(B|Ac) = P(B)       P(AcB) = P(Ac)P(B)
P(Ac|Bc) = P(Ac)     P(Bc|Ac) = P(Bc)     P(AcBc) = P(Ac)P(Bc)

Table 4.1

P(A|B) = P(A|Bc)     P(Ac|B) = P(Ac|Bc)     P(B|A) = P(B|Ac)     P(Bc|A) = P(Bc|Ac)

Table 4.2

These conditions are equivalent in the sense that if any one holds, then all hold. We may choose any one of these as the defining condition and consider the others as equivalents for the defining condition. Because of its simplicity and symmetry with respect to the two events, we adopt the product rule in the upper right hand corner of the table.

Definition. The pair {A, B} of events is said to be (stochastically) independent iff the following product rule holds:

P(AB) = P(A)P(B)    (4.2)

Remark. Although the product rule is adopted as the basis for definition, in many applications the assumptions leading to independence may be formulated more naturally in terms of one or another of the equivalent expressions. We are free to do this, for the effect of assuming any one condition is to assume them all.

We note two relevant facts, which we augment and extend below:

• If the pair {A, B} is independent, so is any pair obtained by taking the complement of either or both of the events.
• Suppose event N has probability zero (is a null event). Then for any event A, we have 0 ≤ P(AN) ≤ P(N) = 0 = P(A)P(N), so that the product rule holds. Thus {N, A} is an independent pair for any event A. If event S has probability one (is an almost sure event), then Sc is a null event, so by the replacement rule and the fact just established, {S, A} is independent for any event A.

The replacement rule may thus be extended to:

Replacement Rule
If the pair {A, B} is independent, so is any pair obtained by replacing either or both of the events by their complements or by a null event or by an almost sure event.
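Since all sixteen conditions follow from the product rule, they may be checked numerically from any assignment of P(A) and P(B) with P(AB) = P(A)P(B). The following added sketch, with arbitrarily chosen probabilities, displays four of the conditional probability conditions:

PA = 0.3; PB = 0.6;           % arbitrary probabilities
PAB = PA*PB;                  % impose the product rule
PABc = PA - PAB;              % P(AB^c)
PAcB = PB - PAB;              % P(A^cB)
PAcBc = 1 - PA - PB + PAB;    % P(A^cB^c)
disp([PAB/PB PA])             % P(A|B) = P(A)
disp([PABc/(1-PB) PA])        % P(A|B^c) = P(A)
disp([PAcB/PB 1-PA])          % P(A^c|B) = P(A^c)
disp([PAcBc/(1-PB) 1-PA])     % P(A^c|B^c) = P(A^c)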
CAUTION
1. Unless at least one of the events has probability one or zero, a pair cannot be both independent and mutually exclusive. Intuitively, if the pair is mutually exclusive, then the occurrence of one requires that the other does not occur. Formally: suppose 0 < P(A) < 1 and 0 < P(B) < 1. {A, B} mutually exclusive implies P(AB) = P(∅) = 0 ≠ P(A)P(B). {A, B} independent implies P(AB) = P(A)P(B) > 0 = P(∅).
2. Independence is not a property of events. Two non mutually exclusive events may be independent under one probability measure, but may not be independent for another. This can be seen by considering various probability distributions on a Venn diagram or minterm map.

4.1.3 Independent classes

Extension of the concept of independence to an arbitrary class of events utilizes the product rule.

Definition. A class of events is said to be (stochastically) independent iff the product rule holds for every finite subclass of two or more events in the class.

A class {A, B, C} is independent iff all four of the following product rules hold:

P(AB) = P(A)P(B)    P(AC) = P(A)P(C)    P(BC) = P(B)P(C)    P(ABC) = P(A)P(B)P(C)    (4.3)

If any one or more of these product expressions fail, the class is not independent. A similar situation holds for a class of four events: the product rule must hold for every pair, for every triple, and for the whole class. Note that we say not independent or nonindependent rather than dependent. The reason for this becomes clearer in dealing with independent random variables.

We consider some classical examples of nonindependent classes.

Example 4.1: Some nonindependent classes
1. Suppose {A1, A2, A3, A4} is a partition, with each P(Ai) = 1/4. Let

A = A1 ⋁ A2    B = A1 ⋁ A3    C = A1 ⋁ A4    (4.4)

Then the class {A, B, C} has P(A) = P(B) = P(C) = 1/2 and is pairwise independent, since

P(AB) = P(A1) = 1/4 = P(A)P(B), and similarly for the other pairs,    (4.5)

but it is not independent, since

P(ABC) = P(A1) = 1/4 ≠ P(A)P(B)P(C)    (4.6)

2. Consider the class {A, B, C, D} with AD = BD = ∅, C = AB ⋁ D, P(A) = P(B) = 1/4, P(AB) = 1/64, and P(D) = 15/64. Use of a minterm map shows these assignments are consistent. Elementary calculations show the product rule applies to the class {A, B, C}, but no two of these three events forms an independent pair.

As noted above, the replacement rule holds for any pair of events. It is easy to show, although somewhat cumbersome to write out, that if the rule holds for any k of the events in an independent class, it holds for any k + 1 of them. By the principle of mathematical induction, the rule must hold for any finite subclass. We may extend the replacement rule as follows.

General Replacement Rule
If a class is independent, we may replace any of the sets by its complement, by a null event, or by an almost sure event, and the resulting class is also independent. Such replacements may be made for any number of the sets in the class.

One immediate and important consequence is the following.

Minterm Probabilities
If {Ai : 1 ≤ i ≤ n} is an independent class and the class {P(Ai) : 1 ≤ i ≤ n} of individual probabilities is known, then the probability of every minterm may be calculated.
Example 4.2: Minterm probabilities for an independent class
Suppose the class {A, B, C} is independent with P(A) = 0.3, P(B) = 0.6, and P(C) = 0.5. Then
{Ac, Bc, Cc} is independent and P(M0) = P(Ac)P(Bc)P(Cc) = 0.7 × 0.4 × 0.5 = 0.14
{Ac, Bc, C} is independent and P(M1) = P(Ac)P(Bc)P(C) = 0.14    (4.7)
Similarly, the probabilities of the other six minterms, in order, are 0.21, 0.21, 0.06, 0.06, 0.09, and 0.09. With each of the basic probabilities determined, we may calculate the minterm probabilities, hence the probability of any Boolean combination of A, B, and C.

In general, eight appropriate probabilities must be specified to determine the minterm probabilities for a class of three events. In the independent case, three appropriate probabilities are sufficient.

Example 4.3: Three probabilities yield the minterm probabilities
Suppose {A, B, C} is independent with P(A ∪ BC) = 0.51, P(ACc) = 0.15, and P(C) = 0.5. Then
P(A) = P(ACc)/P(Cc) = 0.15/0.5 = 0.3, and
P(A ∪ BC) = P(A) + P(Ac)P(B)P(C) = 0.51, so that P(B) = (0.51 − 0.30)/(0.7 × 0.5) = 0.6
With these three basic probabilities, the probability of any Boolean combination of A, B, and C may be calculated.

Example 4.4: MATLAB and the product rule
Frequently we have a large enough independent class {E1, E2, ..., En} that it is desirable to use MATLAB (or some other computational aid) to calculate the probabilities of various "and" combinations (intersections) of the events or their complements. Suppose the independent class {E1, E2, ..., E10} has respective probabilities

0.13 0.37 0.12 0.56 0.33 0.71 0.22 0.43 0.57 0.31    (4.8)

It is desired to calculate (a) P(E1E2E3cE4E5cE6cE7), and (b) P(E1cE2E3cE4E5cE6E7cE8E9cE10).
We may use the MATLAB function prod and the scheme for indexing a matrix.

p = 0.01*[13 37 12 56 33 71 22 43 57 31];
q = 1 - p;
% First case
e = [1 2 4 7];                % Uncomplemented positions
f = [3 5 6];                  % Complemented positions
P = prod(p(e))*prod(q(f))     % p(e) probs of uncomplemented factors
P = 0.0010                    % q(f) probs of complemented factors
% Case of uncomplemented in even positions, complemented in odd positions
g = find(rem(1:10,2) == 0);   % The even positions
h = find(rem(1:10,2) ~= 0);   % The odd positions
P = prod(p(g))*prod(q(h))
P = 0.0034

In the unit on MATLAB and Independent Classes, we extend the use of MATLAB in the calculations for such classes.
4.2 MATLAB and Independent Classes2

4.2.1 MATLAB and Independent Classes

In the unit on Minterms (Section 2.1), we show how to use minterm probabilities and minterm vectors to calculate probabilities of Boolean combinations of events. In Independence of Events we show that in the independent case, we may calculate all minterm probabilities from the probabilities of the basic events. While these calculations are straightforward, they may be tedious and subject to errors. Fortunately, in this case we have an m-function minprob which calculates all minterm probabilities from the probabilities of the basic or generating sets. This function uses the m-function mintable to set up the patterns of p's and q's for the various minterms and then takes the products to obtain the set of minterm probabilities.

Example 4.5
pm = minprob(0.1*[4 7 6])
pm = 0.0720  0.1080  0.1680  0.2520  0.0480  0.0720  0.1120  0.1680

It may be desirable to arrange these as on a minterm map. For this we have an m-function minmap which reshapes the row matrix pm, as follows:

t = minmap(pm)
t = 0.0720    0.1680    0.0480    0.1120
    0.1080    0.2520    0.0720    0.1680

Probability of occurrence of k of n independent events
In Example 2, we show how to use the m-functions mintable and csort to obtain the probability of the occurrence of k of n events. In the case of an independent class, the minterm probabilities are calculated easily by minprob. It is only necessary to specify the probabilities for the n basic events and the numbers k of events. The size of the class, hence the mintable, is determined, and the minterm probabilities are calculated by minprob. We have two useful m-functions. If P is a matrix of the n individual event probabilities, and k is a matrix of integers less than or equal to n, then

function y = ikn(P, k) calculates individual probabilities that k of n occur
function y = ckn(P, k) calculates the probabilities that k or more occur

Example 4.6
p = 0.01*[13 37 12 56 33 71 22 43 57 31];
k = [2 5 7];
P = ikn(p,k)
P  = 0.1401    0.1845    0.0225    % individual probabilities
Pc = ckn(p,k)
Pc = 0.9516    0.2921    0.0266    % cumulative probabilities

Reliability of systems with independent components
Suppose a system has n components which fail independently. Let Ei be the event the ith component survives the designated time period. Then Ri = P(Ei) is defined to be the reliability of that component. The reliability R of the complete system is a function of the component reliabilities. There are three basic configurations. General systems may be decomposed into subsystems of these types. The subsystems become components in the larger configuration. The three fundamental configurations are:

1. Series. The system operates iff all n components operate: R = ∏(i=1 to n) Ri
2. Parallel. The system operates iff not all components fail: R = 1 − ∏(i=1 to n) (1 − Ri)

2This content is available online at <http://cnx.org/content/m23255/1.6/>.
3. k of n. The system operates iff k or more components operate. R may be calculated with the m-function ckn. If the component probabilities are all the same, it is more efficient to use the m-function cbinom (see Bernoulli trials and the binomial distribution, below).

MATLAB solution. Put the component reliabilities in matrix RC = [R1 R2 ··· Rn]

1. Series Configuration: R = prod(RC)        % prod is a built-in MATLAB function
2. Parallel Configuration: R = parallel(RC)  % parallel is a user defined function
3. k of n Configuration: R = ckn(RC,k)       % ckn is a user defined function (in file ckn.m)

Example 4.7
There are eight components, numbered 1 through 8. Component 1 is in series with a parallel combination of components 2 and 3, followed by a 3 of 5 combination of components 4 through 8 (see Figure 4.1 for a schematic representation). Probabilities of the components in order are

0.95 0.90 0.92 0.80 0.83 0.91 0.85 0.85    (4.9)

The second and third probabilities are for the parallel pair, and the last five probabilities are for the 3 of 5 combination.

RC = 0.01*[95 90 92 80 83 91 85 85];           % Component reliabilities
Ra = RC(1)*parallel(RC(2:3))*ckn(RC(4:8),3)    % Solution
Ra = 0.9172

Figure 4.1: Schematic representation of the system in Example 4.7
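The function parallel used in the solution above is one of the user defined m-functions distributed with this material. From the parallel reliability formula R = 1 − ∏(1 − Ri), its essential computation may be sketched as follows; this is an added sketch of an assumed minimal implementation, not a copy of the distributed file:

function R = parallel(RC)
% PARALLEL  Reliability of a parallel combination of independent components
%   RC is a matrix of component reliabilities.
%   The combination fails iff all components fail.
R = 1 - prod(1 - RC);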
Example 4.8
RC = 0.01*[95 90 92 80 83 91 85 85];                 % Component reliabilities
Rb = prod(RC(1:2))*parallel([RC(3),ckn(RC(4:8),3)])  % Solution
Rb = 0.8532

Figure 4.2: Schematic representation of the system in Example 4.8

A test for independence
It is difficult to look at a list of minterm probabilities and determine whether or not the generating events form an independent class. The m-function imintest has as argument a vector of minterm probabilities. It checks for feasible size, determines the number of variables, and performs a check for independence.

Example 4.9
pm = 0.01*[15 5 2 18 25 5 18 12];   % An arbitrary class
disp(imintest(pm))
The class is NOT independent
Minterms for which the product rule fails
     1     1     1     0
     1     1     1     0

Example 4.10
pm = [0.20 0.25 0.30];    % An improper number of probabilities
disp(imintest(pm))
The number of minterm probabilities incorrect

Example 4.11
pm = minprob([0.5 0.3 0.7]);
disp(imintest(pm))
The class is independent

4.2.2 Probabilities of Boolean combinations

As in the nonindependent case, we may utilize the minterm expansion and the minterm probabilities to calculate the probabilities of Boolean combinations of events. However, in the independent case, it is frequently more efficient to manipulate the expressions for the Boolean combination to be a disjoint union of intersections.

Example 4.12: A simple Boolean combination
Suppose the class {A, B, C} is independent, with respective probabilities 0.4, 0.6, 0.8. Determine P(A ∪ BC). The minterm expansion is

A ∪ BC = M(3, 4, 5, 6, 7), so that P(A ∪ BC) = p(3) + p(4) + p(5) + p(6) + p(7)    (4.10)

It is not difficult to use the product rule and the replacement theorem to calculate the needed minterm probabilities. Thus p(3) = P(Ac)P(B)P(C) = 0.6 · 0.6 · 0.8 = 0.2880. Similarly p(4) = 0.0320, p(5) = 0.1280, p(6) = 0.0480, and p(7) = 0.1920, so that P(A ∪ BC) = 0.6880.
As an alternate approach, we write

A ∪ BC = A ⋁ AcBC, so that P(A ∪ BC) = 0.4 + 0.6 · 0.6 · 0.8 = 0.6880    (4.11)

Considerably fewer arithmetic operations are required in this calculation.

In larger problems, or in situations where probabilities of several Boolean combinations are to be determined, it may be desirable to calculate all minterm probabilities then use the minterm vector techniques introduced earlier to calculate probabilities for various Boolean combinations. As a larger example for which computational aid is highly desirable, consider again the class and the probabilities utilized in Example 4.4, above.

Example 4.13
Consider again the independent class {E1, E2, ..., E10} with respective probabilities {0.13 0.37 0.12 0.56 0.33 0.71 0.22 0.43 0.57 0.31}. We wish to calculate

P(F) = P(E1 ∪ E3(E4 ∪ E7c) ∪ E2(E5c ∪ E6E8) ∪ E9E10c)    (4.12)

There are 2^10 = 1024 minterm probabilities to be calculated. Each requires the multiplication of ten numbers. The solution with MATLAB is easy, as follows:

minvec10
Vectors are A1 thru A10 and A1c thru A10c
They may be renamed, if desired.
P = 0.01*[13 37 12 56 33 71 22 43 57 31];
pm = minprob(P);
F = (A1|(A3&(A4|A7c)))|(A2&(A5c|(A6&A8)))|(A9&A10c);
PF = F*pm'
PF = 0.6636

Writing out the expression for F is tedious and error prone. We could simplify as follows:
A = A1|(A3&(A4|A7c));
B = A2&(A5c|(A6&A8));
C = A9&A10c;
F = A|B|C;    % This minterm vector is the same as for F above

This decomposition of the problem indicates that it may be solved as a series of smaller problems. First, however, we need some central facts about independence of Boolean combinations.

4.2.3 Independent Boolean combinations

Suppose we have a Boolean combination of the events in the class {Ai : 1 ≤ i ≤ n} and a second combination of the events in the class {Bj : 1 ≤ j ≤ m}. If the combined class {Ai, Bj : 1 ≤ i ≤ n, 1 ≤ j ≤ m} is independent, we should expect the combinations of the subclasses to be independent. It is important to see that this is in fact a consequence of the product rule, for it is further evidence that the product rule has captured the essence of the intuitive notion of independence. In the following discussion, we exhibit the essential structure which provides the basis for the following general proposition.

Proposition. Consider n distinct subclasses of an independent class of events. If for each i the event Ai is a Boolean (logical) combination of members of the ith subclass, then the class {A1, A2, ..., An} is an independent class.

Verification of this far reaching result rests on the minterm expansion and two elementary facts about the disjoint subclasses of an independent class. We state these facts and consider in each case an example which exhibits the essential structure. Formulation of the general result, in each case, is simply a matter of careful use of notation.

1. A class each of whose members is a minterm formed by members of a distinct subclass of an independent class is itself an independent class.

Example 4.14
Consider the independent class {A1, A2, A3, B1, B2, B3, B4}, with respective probabilities 0.4, 0.7, 0.3, 0.5, 0.8, 0.3, 0.6. Consider M3, minterm three for the class {A1, A2, A3}, and N5, minterm five for the class {B1, B2, B3, B4}. Then

P(M3) = P(A1cA2A3) = 0.6 · 0.7 · 0.3 = 0.126 and P(N5) = P(B1cB2B3cB4) = 0.5 · 0.8 · 0.7 · 0.6 = 0.168    (4.13)

Also

P(M3N5) = P(A1cA2A3B1cB2B3cB4) = 0.6 · 0.7 · 0.3 · 0.5 · 0.8 · 0.7 · 0.6
= (0.6 · 0.7 · 0.3) · (0.5 · 0.8 · 0.7 · 0.6) = P(M3)P(N5) = 0.0212    (4.14)

The product rule shows the desired independence, and it can be extended to any number of distinct subclasses of an independent class.

2. Suppose each member of a class can be expressed as a disjoint union. If each auxiliary class formed by taking one member from each of the disjoint unions is an independent class, then the original class is independent.

Example 4.15
Suppose A = A1 ⋁ A2 ⋁ A3 and B = B1 ⋁ B2, with {Ai, Bj} independent for each pair i, j. Suppose P(A1) = 0.4, P(A2) = 0.2, P(A3) = 0.1, P(B1) = 0.3, and P(B2) = 0.5.    (4.15)
We wish to show that the pair {A, B} is independent; i.e., that the product rule P(AB) = P(A)P(B) holds.

COMPUTATION
P(A) = P(A1) + P(A2) + P(A3) = 0.4 + 0.2 + 0.1 = 0.7 and P(B) = P(B1) + P(B2) = 0.3 + 0.5 = 0.8    (4.16)

Now

AB = (A1 ⋁ A2 ⋁ A3)(B1 ⋁ B2) = A1B1 ⋁ A1B2 ⋁ A2B1 ⋁ A2B2 ⋁ A3B1 ⋁ A3B2    (4.17)

By additivity and pairwise independence, we have

P(AB) = P(A1)P(B1) + P(A1)P(B2) + P(A2)P(B1) + P(A2)P(B2) + P(A3)P(B1) + P(A3)P(B2)
= 0.4 · 0.3 + 0.4 · 0.5 + 0.2 · 0.3 + 0.2 · 0.5 + 0.1 · 0.3 + 0.1 · 0.5 = 0.56 = P(A)P(B)    (4.18)

The product rule can also be established algebraically from the expression for P(AB), as follows:

P(AB) = P(A1)[P(B1) + P(B2)] + P(A2)[P(B1) + P(B2)] + P(A3)[P(B1) + P(B2)]
= [P(A1) + P(A2) + P(A3)][P(B1) + P(B2)] = P(A)P(B)    (4.19)

It should be clear that the pattern just illustrated can be extended to the general case. If

A = A1 ⋁ A2 ⋁ ··· ⋁ An and B = B1 ⋁ B2 ⋁ ··· ⋁ Bm, with each pair {Ai, Bj} independent    (4.20)

then the pair {A, B} is independent. Also, we may extend this rule to the triple {A, B, C}

A = ⋁(i=1 to n) Ai, B = ⋁(j=1 to m) Bj, and C = ⋁(k=1 to r) Ck, with each class {Ai, Bj, Ck} independent    (4.21)

and similarly for any finite number of such combinations.

3. Use of the minterm expansion for each of these Boolean combinations and the two propositions just illustrated shows that the class of Boolean combinations is independent. To illustrate, we return to Example 4.13, which involves an independent class of ten events.

Example 4.16: A hybrid approach
Consider again the independent class {E1, E2, ..., E10} with respective probabilities {0.13 0.37 0.12 0.56 0.33 0.71 0.22 0.43 0.57 0.31}. We wish to calculate

P(F) = P(E1 ∪ E3(E4 ∪ E7c) ∪ E2(E5c ∪ E6E8) ∪ E9E10c)    (4.22)

In the previous solution, we use minprob to calculate the 2^10 = 1024 minterms for all ten of the Ei and determine the minterm vector for F. As we note in the alternate expansion of F,

F = A ∪ B ∪ C, where A = E1 ∪ E3(E4 ∪ E7c), B = E2(E5c ∪ E6E8), C = E9E10c    (4.23)

Now A is a Boolean combination of {E1, E3, E4, E7}, B is a combination of {E2, E5, E6, E8}, and C is a combination of {E9, E10}. By the result on independence
of Boolean combinations, the class {A, B, C} is independent. We use the m-procedures to calculate P(A) and P(B). We may calculate P(C) directly. Then we deal with the independent class {A, B, C} to obtain the probability of F.

p = 0.01*[13 37 12 56 33 71 22 43 57 31];
pa = p([1 3 4 7]);     % Selection of probabilities for A
pb = p([2 5 6 8]);     % Selection of probabilities for B
pma = minprob(pa);     % Minterm probabilities for calculating P(A)
pmb = minprob(pb);     % Minterm probabilities for calculating P(B)
minvec4;
a = A|(B&(C|Dc));      % A corresponds to E1, B to E3, C to E4, D to E7
PA = a*pma'
PA = 0.2243
b = A&(Bc|(C&D));      % A corresponds to E2, B to E5, C to E6, D to E8
PB = b*pmb'
PB = 0.2852
PC = p(9)*(1 - p(10))
PC = 0.3933
minvec3                % The problem becomes a three variable problem
F = A|B|C;             % with {A,B,C} an independent class
pm = minprob([PA PB PC]);
PF = F*pm'
PF = 0.6636            % Agrees with the result of Example 4.13

4.3 Composite Trials3

4.3.1 Composite trials and component events

Often a trial is a composite one. That is, the fundamental trial is completed by performing several steps. In some cases, the steps are carried out sequentially in time. In other situations, the order of performance plays no significant role. Some of the examples in the unit on Conditional Probability (Section 3.1) involve such multistep trials. We examine more systematically how to model composite trials in terms of events determined by the components of the trials. In the subsequent section, we illustrate this approach in the important special case of Bernoulli trials, in which each outcome results in a success or failure to achieve a specified condition.

We call the individual steps in the composite trial component trials. For example, in the experiment of flipping a coin ten times, we refer to the ith toss as the ith component trial. In many cases, the component trials will be performed sequentially in time. But we may have an experiment in which ten coins are flipped simultaneously. For purposes of analysis, we impose an ordering, usually by assigning indices. The question is how to model these repetitions. Should they be considered as ten trials of a single simple experiment? It turns out that this is not a useful formulation. We need to consider the composite trial as a single outcome, i.e., represented by a single point in the basic space Ω.

Some authors give considerable attention to the nature of the basic space, describing it as a Cartesian product space, with each coordinate corresponding to one of the component outcomes. We find that unnecessary, and often confusing, in setting up the basic model. We simply suppose the basic space has enough elements to consider each possible outcome. For the experiment of flipping a coin ten times, there must be at least 2^10 = 1024 elements, one for each possible sequence of heads and tails.

3This content is available online at <http://cnx.org/content/m23256/1.6/>.
Of more importance is describing the various events associated with the experiment. We begin by identifying the appropriate component events. A component event is determined by propositions about the outcomes of the corresponding component trial.

Example 4.17: Component events
• In the coin flipping experiment, consider the event H3 that the third toss results in a head. Each outcome ω of the experiment may be represented by a sequence of H's and T's, representing heads and tails. The event H3 consists of those outcomes represented by sequences with H in the third position. Suppose A is the event of a head on the third toss and a tail on the ninth toss. This consists of those outcomes corresponding to sequences with H in the third position and T in the ninth. Note that this event is the intersection H3H9c.
• A somewhat more complex example is as follows. Suppose there are two boxes, each containing some red and some blue balls. The experiment consists of selecting at random a ball from the first box, placing it in the second box, then making a random selection from the modified contents of the second box. The composite trial is made up of two component selections. We may let R1 be the event of selecting a red ball on the first component trial (from the first box), and R2 be the event of selecting a red ball on the second component trial. Clearly R1 and R2 are component events.

In the first example, it is reasonable to assume that the class {Hi : 1 ≤ i ≤ 10} is independent, and each component probability is usually taken to be 0.5. In the second case, the assignment of probabilities is somewhat more involved. For one thing, it is necessary to know the numbers of red and blue balls in each box before the composite trial begins. When these are known, the usual assumptions and the properties of conditional probability suffice to assign probabilities. This approach of utilizing component events is used tacitly in some of the examples in the unit on Conditional Probability.

When appropriate component events are determined, various Boolean combinations of these can be expressed as minterm expansions.

Example 4.18
Four persons take one shot each at a target. Let Ei be the event the ith shooter hits the target center. Let A3 be the event exactly three hit the target. Then A3 is the union of those minterms generated by the Ei which have three places uncomplemented.

A3 = E1E2E3E4c ⋁ E1E2E3cE4 ⋁ E1E2cE3E4 ⋁ E1cE2E3E4    (4.24)

Usually we would be able to assume the Ei form an independent class. If each P(Ei) is known, then all minterm probabilities can be calculated easily.

The following is a somewhat more complicated example of this type.

Example 4.19
Ten race cars are involved in time trials to determine pole positions for an upcoming race. To qualify, they must post an average speed of 125 mph or more on a trial run. Let Ei be the event the ith car makes qualifying speed. It seems reasonable to suppose the class {Ei : 1 ≤ i ≤ 10} independent. If the respective probabilities for success are 0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84, what is the probability that k or more will qualify (k = 6, 7, 8, 9, 10)?

SOLUTION
Let Ak be the event exactly k qualify. The class {Ei : 1 ≤ i ≤ 10} generates 2^10 = 1024 minterms. The event Ak is the union of those minterms which have exactly k places uncomplemented. The event Bk that k or more qualify is given by

Bk = ⋁(r=k to n) Ar    (4.25)

The task of computing and adding the minterm probabilities by hand would be tedious, to say the least. However, we may use the m-function ckn, introduced in the unit on MATLAB and Independent Classes, to determine the desired probabilities quickly and easily.
P = [0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84];
k = 6:10;
PB = ckn(P,k)
PB = 0.9938    0.9628    0.8472    0.5756    0.2114

An alternate approach is considered in the treatment of random variables.

4.3.2 Bernoulli trials and the binomial distribution

Many composite trials may be described as a sequence of success-failure trials. For each component trial in the sequence, the outcome is one of two kinds. One we designate a success and the other a failure. Examples abound: heads or tails in a sequence of coin flips, favor or disapprove of a proposition in a survey sample, and items from a production line meet or fail to meet specifications in a sequence of quality control checks. To represent the situation, we let Ei be the event of a success on the ith component trial in the sequence. The event of a failure on the ith component trial is thus Eic.

In many cases, we model the sequence as a Bernoulli sequence, in which the results on the successive component trials are independent and have the same probabilities. Thus, formally, a sequence of success-failure trials is Bernoulli iff

1. The class {Ei : 1 ≤ i} is independent.
2. The probability P(Ei) = p, invariant with i.

Simulation of Bernoulli trials
It is frequently desirable to simulate Bernoulli trials. By flipping coins, rolling a die with various numbers of sides (as used in certain games), or using spinners, it is relatively easy to carry this out physically. However, if the number of trials is large, say several hundred, the process may be time consuming. Also, there are limitations on the values of p, the probability of success. We have a convenient two-part m-procedure for simulating Bernoulli sequences. The first part, called btdata, sets the parameters. The second, called bt, uses the random number generator in MATLAB to produce a sequence of zeros and ones (for failures and successes). Repeated calls for bt produce new sequences.

Example 4.20
btdata
Enter n, the number of trials  10
Enter p, the probability of success on each trial  0.37
Call for bt

bt
n = 10   p = 0.37    % n is kept small to save printout space
Frequency = 0.4
To view the sequence, call for SEQ

disp(SEQ)            % optional call for the sequence
 1   1
 2   1
 3   0
 4   0
 5   0
 6   0
 7   0
 8   0
 9   1
10   1
Repeated calls for bt yield new sequences with the same parameters. To illustrate the power of the program, it was used to take a run of 100,000 component trials, with probability
p of success 0.37, as above.
Successive runs gave relative frequencies 0.37001 and 0.36999. Unless the random
number generator is seeded to make the same starting point each time, successive runs will give dierent sequences and usually dierent relative frequencies.
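The essential idea behind bt may be sketched in a few lines. The following is an added illustration, not the distributed m-procedure:

n = 10; p = 0.37;                 % parameters as in Example 4.20
SEQ = [(1:n)' (rand(1,n) < p)'];  % trial numbers paired with 0-1 outcomes
freq = sum(SEQ(:,2))/n            % relative frequency of successes
disp(SEQ)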
The binomial distribution
A basic problem in Bernoulli sequences is to determine the probability of k successes in n component trials. We let Sn be the number of successes in n trials. This is a special case of a simple random variable, which we study in more detail in the chapter on "Random Variables and Probabilities" (Section 6.1).

Let us characterize the events Akn = {Sn = k}, 0 ≤ k ≤ n. As noted above, the event Akn of exactly k successes is the union of the minterms generated by {Ei : 1 ≤ i} in which there are k successes (represented by k uncomplemented Ei) and n − k failures (represented by n − k complemented Eic). Simple combinatorics show there are C(n, k) ways to choose the k places to be uncomplemented. Hence, among the 2^n minterms, there are C(n, k) = n!/(k!(n − k)!) which have k places uncomplemented. Each such minterm has probability p^k (1 − p)^(n−k). Since the minterms are mutually exclusive, their probabilities add. We conclude that

P(Sn = k) = C(n, k) p^k (1 − p)^(n−k) = C(n, k) p^k q^(n−k), where q = 1 − p, for 0 ≤ k ≤ n    (4.26)

These probabilities and the corresponding values form the distribution for Sn. This distribution is known as the binomial distribution, with parameters (n, p). We shorten this to binomial (n, p), and often write Sn ∼ binomial (n, p). A related set of probabilities is P(Sn ≥ k) = P(Bkn), 0 ≤ k ≤ n. If the number n of component trials is small, direct computation of the probabilities is easy with hand calculators.
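The defining expression translates directly into MATLAB. As an added check, not part of the original text, the individual probabilities may be computed from the formula and compared with the m-function ibinom introduced below:

n = 10; p = 0.37; q = 1 - p;
k = 0:n;
Pk = arrayfun(@(j) nchoosek(n,j), k).*p.^k.*q.^(n-k);
disp([k' Pk'])    % should agree with ibinom(n,p,k)
disp(sum(Pk))     % the probabilities sum to one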
Example 4.21: A reliability problem
A remote device has five similar components which fail independently, with equal probabilities. The system remains operable if three or more of the components are operative. Suppose each unit remains active for one year with probability 0.8. What is the probability the system will remain operative for that long?
SOLUTION
P = C(5,3) 0.8^3 · 0.2^2 + C(5,4) 0.8^4 · 0.2 + C(5,5) 0.8^5 = 10 · 0.8^3 · 0.2^2 + 5 · 0.8^4 · 0.2 + 0.8^5 = 0.9421    (4.27)
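The hand calculation is easily checked with the m-function cbinom described below. The following line is an added check, not part of the original example:

P = cbinom(5,0.8,3)   % P(three or more of the five components operate)
P = 0.9421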
Because Bernoulli sequences are used in so many practical situations as models for success-failure trials, the probabilities P(Sn = k) and P(Sn ≥ k) have been calculated and tabulated for a variety of combinations of the parameters (n, p). Such tables are found in most mathematical handbooks. Tables of P(Sn = k) are usually given a title such as binomial distribution, individual terms. Tables of P(Sn ≥ k) have a designation such as binomial distribution, cumulative terms. Note, however, some tables for cumulative terms give P(Sn ≤ k). Care should be taken to note which convention is used.
Example 4.22: A reliability problem
Consider again the system of Example 4.21, above. Suppose we attempt to enter a table of Cumulative Terms, Binomial Distribution at n = 5, k = 3, and p = 0.8. Most tables will not have probabilities greater than 0.5. In this case, we may work with failures. We just interchange the role of Ei and Eic. Thus, the number of failures has the binomial (n, q) distribution. Now there are three or more successes iff there are not three or more failures. We go to the table of cumulative terms at n = 5, k = 3, and p = 0.2. The probability entry is 0.0579. The desired probability is 1 − 0.0579 = 0.9421.
In general, there are k or more successes in n trials iff there are not n − k + 1 or more failures.
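This duality is easily verified with the m-functions of the next section. For the system above, the following is an added check, not part of the original text:

n = 5; p = 0.8; k = 3;
P1 = cbinom(n,p,k)             % direct: three or more successes
P2 = 1 - cbinom(n,1-p,n-k+1)   % via failures: not (n-k+1) or more failures
% both yield 0.9421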
m-functions for binomial probabilities
Although tables are convenient for calculation, they impose serious limitations on the available parameter values, and when the values are found in a table, they must still be entered into the problem. Fortunately, we have convenient m-functions for these distributions. When MATLAB is available, it is much easier to generate the needed probabilities than to look them up in a table, and the numbers are entered directly into the MATLAB workspace. And we have great freedom in selection of parameter values. For example, we may use n of a thousand or more, while tables are usually limited to n of 20, or at most 30.
The two m-functions for calculating P(Akn) and P(Bkn) are:

P(Akn) is calculated by y = ibinom(n,p,k), where k is a row or column vector of integers between 0 and n. The result y is a row vector of the same size as k.
P(Bkn) is calculated by y = cbinom(n,p,k), where k is a row or column vector of integers between 0 and n. The result y is a row vector of the same size as k.
Example 4.23: Use of m-functions ibinom and cbinom
If n = 10 and p = 0.39, determine P(Akn) and P(Bkn) for k = 3, 5, 6, 8.
p = 0.39;
k = [3 5 6 8];
Pi = ibinom(10,p,k)   % individual probabilities
Pi = 0.2237    0.1920    0.1023    0.0090
Pc = cbinom(10,p,k)   % cumulative probabilities
Pc = 0.8160    0.3420    0.1500    0.0103
Note that we have used probability p = 0.39. It is quite unlikely that a table will have this probability. Although we use only n = 10, frequently it is desirable to use values of several hundred. The m-functions work well for n up to 1000 (and even higher for small values of p or for values very near to one). Hence, there is great freedom from the limitations of tables. If a table with a specific range of values is desired, an m-procedure called binomial produces such a table. The use of large n raises the question of cumulation of errors in sums or products. The level of precision in MATLAB calculations is sufficient that such roundoff errors are well below practical concerns.
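For instance, a quick consistency check at large n, an added illustration: the individual probabilities must sum to one, and the computed sum shows no significant roundoff.

Ptot = sum(ibinom(1000,0.3,0:1000))
Ptot = 1.0000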
Example 4.24
binomial                                  % call for procedure
Enter n, the number of trials  13
Enter p, the probability of success  0.413
Enter row vector k of success numbers  0:4
      n         p
  13.0000    0.4130
      k      P(X=k)    P(X>=k)
          0    0.0010    1.0000
     1.0000    0.0090    0.9990
     2.0000    0.0379    0.9900
     3.0000    0.0979    0.9521
     4.0000    0.1721    0.8542
Remark. While the m-procedure binomial is useful for constructing a table, it is usually not as convenient for problems as the m-functions ibinom or cbinom. The latter calculate the desired values and put them directly into the MATLAB workspace.
4.3.3 Joint Bernoulli trials
Bernoulli trials may be used to model a variety of practical problems. One such is to compare the results of two sequences of Bernoulli trials carried out independently. The following simple example illustrates the use of MATLAB for this.
Example 4.25: A joint Bernoulli trial
Bill and Mary take ten basketball free throws each. We assume the two sequences of trials are independent of each other, and each is a Bernoulli sequence.
Mary: Has probability 0.80 of success on each trial.
Bill: Has probability 0.85 of success on each trial.
What is the probability Mary makes more free throws than Bill?
SOLUTION
We have two Bernoulli sequences, operating independently.
Mary: n = 10, p = 0.80
Bill: n = 10, p = 0.85
Let
M be the event Mary wins
Mk be the event Mary makes k or more free throws
Bj be the event Bill makes exactly j free throws
Then Mary wins if Bill makes none and Mary makes one or more, or Bill makes one and Mary makes two or more, etc. Thus

M = B0M1 ⋁ B1M2 ⋁ ··· ⋁ B9M10    (4.28)

and

P(M) = P(B0)P(M1) + P(B1)P(M2) + ··· + P(B9)P(M10)    (4.29)

We use cbinom to calculate the cumulative probabilities for Mary and ibinom to obtain the individual probabilities for Bill.
pm = cbinom(10,0.8,1:10);   % cumulative probabilities for Mary
pb = ibinom(10,0.85,0:9);   % individual probabilities for Bill
D = [pm; pb]'               % display: pm in the first column
D =                         %          pb in the second column
    1.0000    0.0000
    1.0000    0.0000
    0.9999    0.0000
    0.9991    0.0001
    0.9936    0.0012
    0.9672    0.0085
    0.8791    0.0401
    0.6778    0.1298
    0.3758    0.2759
    0.1074    0.3474
To find the probability P(M) that Mary wins, we need to multiply each of these pairs together, then sum. This is just the dot or scalar product, which MATLAB calculates with the command pm*pb'.
We may combine the generation of the probabilities and the multiplication in one command:
P = cbinom(10,0.8,1:10)*ibinom(10,0.85,0:9)'
P = 0.2738
The ease and simplicity of calculation with MATLAB make it feasible to consider the effect of different values of n.
Is there an optimum number of throws for Mary? Why should there be an optimum?
An alternate treatment of this problem in the unit on Independent Random Variables utilizes techniques for independent simple random variables.
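A short script, an added sketch following the pattern of the computation above, explores the effect of the number of throws n:

nset = 1:40;                     % candidate numbers of throws
PM = zeros(size(nset));
for i = 1:length(nset)
    n = nset(i);
    PM(i) = cbinom(n,0.8,1:n)*ibinom(n,0.85,0:n-1)';
end
disp([nset' PM'])                % scan for the n giving the largest P(M)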
4.3.4 Alternate MATLAB implementations
Alternate implementations of the functions for probability calculations are found in the Statistical Package available as a supplementary package. We have utilized our formulation, so that only the basic MATLAB package is needed.
4.4 Problems on Independence of Events4
Exercise 4.1    (Solution on p. 101.)
The minterms generated by the class {A, B, C} have minterm probabilities

pm = [0.15 0.05 0.02 0.18 0.25 0.05 0.18 0.12]    (4.30)

Show that the product rule holds for all three, but the class is not independent.
Exercise 4.2    (Solution on p. 101.)
The class {A, B, C, D} is independent, with respective probabilities 0.65, 0.37, 0.48, 0.63. Use the m-function minprob to obtain the minterm probabilities. Use the m-function minmap to put them in a 4 by 4 table corresponding to the minterm map convention we use.
Exercise 4.3    (Solution on p. 101.)
The minterm probabilities for the software survey in Example 2 (Example 2.2: Survey on software) from "Minterms" are

pm = [0 0.05 0.10 0.05 0.20 0.10 0.40 0.10]    (4.31)

Show whether or not the class {A, B, C} is independent: (1) by hand calculation, and (2) by use of the m-function imintest.
Exercise 4.4    (Solution on p. 101.)
The minterm probabilities for the computer survey in Example 3 (Example 2.3: Survey on personal computers) from "Minterms" are

pm = [0.032 0.016 0.376 0.011 0.364 0.073 0.077 0.051]    (4.32)

Show whether or not the class {A, B, C} is independent: (1) by hand calculation, and (2) by use of the m-function imintest.
Exercise 4.5    (Solution on p. 101.)
Minterm probabilities p(0) through p(15) for the class {A, B, C, D} are, in order,

pm = [0.084 0.196 0.036 0.084 0.085 0.196 0.035 0.084 0.021 0.049 0.009 0.021 0.020 0.049 0.010 0.021]

Use the m-function imintest to show whether or not the class {A, B, C, D} is independent.
Exercise 4.6    (Solution on p. 102.)
Minterm probabilities p(0) through p(15) for the opinion survey in Example 4 (Example 2.4: Opinion survey) from "Minterms" are

pm = [0.085 0.195 0.035 0.085 0.080 0.200 0.035 0.085 0.020 0.050 0.010 0.020 0.020 0.050 0.015 0.015]

Show whether or not the class {A, B, C, D} is independent.
4This content is available online at <http://cnx.org/content/m24180/1.4/>.
Exercise 4.7    (Solution on p. 102.)
The class {A, B, C} is independent, with P(A) = 0.30, P(BcC) = 0.32, and P(AC) = 0.12. Determine the minterm probabilities.
Exercise 4.8    (Solution on p. 102.)
The class {A, B, C} is independent, with P(A ∪ B) = 0.6, P(A ∪ C) = 0.7, and P(C) = 0.4. Determine the probability of each minterm.
Exercise 4.9    (Solution on p. 102.)
A pair of dice is rolled five times. What is the probability the first two results are sevens and the others are not?
Exercise 4.10    (Solution on p. 102.)
David, Mary, Joan, Hal, Sharon, and Wayne take an exam in their probability course. Their probabilities of making 90 percent or more are

0.72 0.83 0.75 0.92 0.65 0.79    (4.33)

respectively. Assume these are independent events. What is the probability three or more, four or more, five or more make grades of at least 90 percent?
Exercise 4.11    (Solution on p. 102.)
Two independent random numbers between 0 and 1 are selected (say by a random number generator on a calculator). What is the probability the first is no greater than 0.33 and the other is at least 0.57?
Exercise 4.12    (Solution on p. 102.)
Helen is wondering how to plan for the weekend. She will get a letter from home (with money) with probability 0.05. There is a probability of 0.85 that she will get a call from Jim at SMU in Dallas. There is also a probability of 0.5 that William will ask for a date. What is the probability she will get money and Jim will not call, or that both Jim will call and William will ask for a date?
Exercise 4.13    (Solution on p. 103.)
A basketball player takes ten free throws in a contest. On her first shot she is nervous and has probability 0.3 of making the shot. She begins to settle down and probabilities on the next seven shots are 0.5, 0.6, 0.7, 0.8, 0.8, 0.8, and 0.85, respectively. Then she realizes her opponent is doing well, and becomes tense as she takes the last two shots, with probabilities reduced to 0.75, 0.65. Assuming independence between the shots, what is the probability she will make k or more for k = 2, 3, ···, 10?
Exercise 4.14    (Solution on p. 103.)
In a group there are M men and W women; m of the men and w of the women are college graduates. An individual is picked at random. Let A be the event the individual is a woman and B be the event he or she is a college graduate. Under what condition is the pair {A, B} independent?
Exercise 4.15    (Solution on p. 103.)
Consider the pair {A, B} of events. Let P(A) = p, P(Ac) = q = 1 − p, P(B|A) = p1, and P(B|Ac) = p2. Under what condition is the pair {A, B} independent?

Exercise 4.16    (Solution on p. 103.)
Show that if event A is independent of itself, then P(A) = 0 or P(A) = 1. (This fact is key to an important zero-one law.)

Exercise 4.17    (Solution on p. 103.)
Does {A, B} independent and {B, C} independent imply {A, C} is independent? Justify your answer.
Exercise 4.18    (Solution on p. 103.)
Suppose event A implies B (i.e., A ⊂ B). Show that if the pair {A, B} is independent, then either P(A) = 0 or P(B) = 1.
Exercise 4.19    (Solution on p. 103.)
A company has three task forces trying to meet a deadline for a new device. The groups work independently, with respective probabilities 0.8, 0.9, 0.75 of completing on time. What is the probability at least one group completes on time? (Think. Then solve by hand.)
Exercise 4.20    (Solution on p. 103.)
Two salesmen work differently. Roland spends more time with his customers than does Betty, hence tends to see fewer customers. On a given day Roland sees five customers and Betty sees six. The customers make decisions independently. If the probabilities for success on Roland's customers are 0.7, 0.8, 0.8, 0.6, 0.7 and for Betty's customers are 0.6, 0.5, 0.4, 0.6, 0.6, 0.4, what is the probability Roland makes more sales than Betty? What is the probability that Roland will make three or more sales? What is the probability that Betty will make three or more sales?
Exercise 4.21
Two teams of students take a probability exam. independently. Team 1 has ve members and Team 2 has six members.
(Solution on p. 104.)
The entire group performs individually and They have the following
indivudal probabilities of making an `A on the exam. Team 1: 0.83 0.87 0.92 0.77 0.86 Team 2: 0.68 0.91 0.74 0.68 0.73 0.83
a. What is the probability team 1 will make b. What is the probability team 1 will make
at least as many A's as team 2? more A's than team 2?
(Solution on p. 104.)
Exercise 4.22
0.78, 0.88, 0.92. Units 1 and 2 operate as a series combination.
A system has ve components which fail independently. Their respective reliabilities are 0.93, 0.91, Units 3, 4, 5 operate as a two
of three subsytem.
The two subsystems operate as a parallel combination to make the complete
system. What is reliability of the complete system?
Exercise 4.23
A system has eight components with respective probabilities
(Solution on p. 104.)
0.96 0.90 0.93 0.82 0.85 0.97 0.88 0.80
units 4 through 8. What is the reliability of the complete system?
(4.34)
Units 1 and 2 form a parallel subsytem in series with unit 3 and a three of ve combination of
Exercise 4.24
combination in series with the three of ve combination?
(Solution on p. 104.)
How would the reliability of the system in Exercise 4.23 change if units 1, 2, and 3 formed a parallel
Exercise 4.25
(Solution on p. 104.)
How would the reliability of the system in Exercise 4.23 change if the reliability of unit 3 were changed from 0.93 to 0.96? What change if the reliability of unit 2 were changed from 0.90 to 0.95 (with unit 3 unchanged)?
Exercise 4.26 Exercise 4.27
(Solution on p. 105.) (Solution on p. 105.)
If customers buy
Three fair dice are rolled. What is the probability at least one will show a six?
A hobby shop nds that 35 percent of its customers buy an electronic game.
independently, what is the probability that at least one of the next ve customers will buy an electronic game?
98
CHAPTER 4. INDEPENDENCE OF EVENTS
Exercise 4.28
is 0.1. Successive messages are acted upon independently by the noise.
(Solution on p. 105.)
Suppose the message is
Under extreme noise conditions, the probability that a certain message will be transmitted correctly
transmitted ten times. What is the probability it is transmitted correctly at least once?
Exercise 4.29
Suppose the class
(Solution on p. 105.)
{Ai : 1 ≤ i ≤ n}
is independent, with
P (Ai ) = pi , 1 ≤ i ≤ n.
What is the
probability that at least one of the events occurs? What is the probability that none occurs?
Exercise 4.30
what is the probility 8 or more are sevens?
(Solution on p. 105.)
In one hundred random digits, 0 through 9, with each possible digit equally likely on each choice,
Exercise 4.31
set, what is the probability the store will sell three or more?
(Solution on p. 105.)
Ten customers come into a store. If the probability is 0.15 that each customer will buy a television
Exercise 4.32
Seven similar units are put into service at time
(Solution on p. 105.)
t = 0.
The units fail independently. The probability
of failure of any unit in the rst 400 hours is 0.18. What is the probability that three or more units are still in operation at the end of 400 hours?
Exercise 4.33
(Solution on p. 105.)
A computer system has ten similar modules. The circuit has redundancy which ensures the system operates if any eight or more of the units are operative. Units fail independently, and the probability is 0.93 that any unit will survive between maintenance periods. What is the probability of no system failure due to these units?
Exercise 4.34
(Solution on p. 105.)
Only thirty percent of the items from a production line meet stringent requirements for a special job. Units from the line are tested in succession. Under the usual assumptions for Bernoulli trials, what is the probability that three satisfactory units will be found in eight or fewer trials?
Exercise 4.35
(Solution on p. 105.)
What is the
The probability is 0.02 that a virus will survive application of a certain vaccine. probability that in a batch of 500 viruses, fteen or more will survive treatment?
Exercise 4.36
In a shipment of 20,000 items, 400 are defective.
(Solution on p. 105.)
These are scattered randomly throughout the
entire lot. Assume the probability of a defective is the same on each choice. What is the probability that
1. Two or more will appear in a random sample of 35? 2. At most ve will appear in a random sample of 50?
Exercise 4.37
A device has probability
p is necessary to ensure the probability of successes on all of the rst four trials is 0.85? value of p, what is the probability of four or more successes in ve trials?
Exercise 4.38
A survey form is sent to 100 persons. and each has probability 1/4 of replying, what is the probability of
p
(Solution on p. 105.)
of operating successfully on any trial in a sequence. What probability With that
(Solution on p. 105.)
If they decide independently whether or not to reply,
k
or more replies, where
k = 15, 20, 25, 30, 35, 40?
Exercise 4.39
are less than or equal to 0.63?
(Solution on p. 105.)
Ten numbers are produced by a random number generator. What is the probability four or more
99
Exercise 4.40
i she scores an
(Solution on p. 105.)
A player rolls a pair of dice ve times. She scores a hit on any throw if she gets a 6 or 7. She wins
odd number of hits in the ve throws.
What is the probability a player wins on any
sequence of ve throws? Suppose she plays the game 20 successive times. What is the probability she wins at least 10 times? What is the probability she wins more than half the time?
Exercise 4.41
on various trials are independent. Each spins the wheel 10 times.
(Solution on p. 106.)
What is the probability Erica
Erica and John spin a wheel which turns up the integers 0 through 9 with equal probability. Results
turns up a seven more times than does John?
Exercise 4.42
(Solution on p. 106.)
Erica and John play a dierent game with the wheel, above. Erica scores a point each time she gets an integer 0, 2, 4, 6, or 8. John scores a point each time he turns up a 1, 2, 5, or 7. If Erica spins eight times; John spins 10 times. What is the probability John makes
more
points than Erica?
Exercise 4.43
A box contains 100 balls; 30 are red, 40 are blue, and 30 are green. random, with replacement and mixing after each selection. ball; Martha has a success if she selects a blue ball.
(Solution on p. 106.)
Martha and Alex select at Alex has a success if he selects a red
Alex selects seven times and Martha selects
ve times. What is the probability Martha has more successes than Alex?
Exercise 4.44
of sixes?
(Solution on p. 106.)
Two players roll a fair die 30 times each. What is the probability that each rolls the same number
Exercise 4.45
A warehouse has a stock of
n items of a certain kind, r
p = r/n
(Solution on p. 106.)
of which are defective. Two of the items are
chosen at random, without replacement. What is the probability that at least one is defective? Show that for large
n the number is very close to that for selection with replacement, which corresponds
of success on any trial.
to two Bernoulli trials with pobability
Exercise 4.46
terminate.
(Solution on p. 106.)
A coin is ipped repeatedly, until a head appears. Show that with probability one the game will
tip: The probability of not terminating in
n trials is qn .
(Solution on p. 106.)
Exercise 4.47
plays. Let
Two persons play a game consecutively until one of them is successful or there are ten unsuccesful
Ei
be the event of a success on the
independent class with rst player wins,
B
P (Ei ) = p1
for
i
i th
play of the game.
odd and
P (Ei ) = p2
be the event the second player wins, and
C
for
i
Suppose
even. Let
A
{Ei : 1 ≤ i}
is an
be the event the
be the event that neither wins.
a. Express
b. Determine
c.
d.
A, B , and C in terms of the Ei . P (A), P (B), and P (C) in terms of p1 , p2 , q1 = 1 − p1 , and q2 = 1 − p2 . Obtain numerical values for the case p1 = 1/4 and p2 = 1/3. Use appropriate facts about the geometric series to show that P (A) = P (B) i p1 = p2 / (1 + p2 ). Suppose p2 = 0.5. Use the result of part (c) to nd the value of p1 to make P (A) = P (B) and then determine P (A), P (B), and P (C).
(Solution on p. 106.)
Exercise 4.48
a success on the
Three persons play a game consecutively until one achieves his objective. Let
i th
Ei
be the event of
trial, and suppose for
i = 1, 4, 7, · · ·, P (Ei ) = p2
{Ei : 1 ≤ i} is an independent class, with P (Ei ) = p1 for i = 2, 5, 8, · · ·, and P (Ei ) = p3 for i = 3, 6, 9, · · ·. Let A, B, C be the
respective events the rst, second, and third player wins.
100
CHAPTER 4. INDEPENDENCE OF EVENTS
a. Express
A, B ,
and
C
in terms of the
Ei .
p1 , p2 , p3 ,
then obtain numerical values in the case
b. Determine the probabilities in terms of
p1 = 1/4, p2 = 1/3,
Exercise 4.49
and
p3 = 1/2.
What is the probability of a success on the given there are
r
i th trial in a Bernoulli sequence of n component trials,
(Solution on p. 107.)
(Solution on p. 107.)
successes?
Exercise 4.50
A device has
N
similar components which may fail independently, with probability
p
of failure of
any component. The device fails if one or more of the components fails. In the event of failure of the device, the components are tested sequentially. a. What is the probability the rst defective unit tested is the have failed? b. What is the probability the defective unit is the
nth, given one or more components
nth, given that exactly one has failed?
c. What is the probability that more than one unit has failed, given that the rst defective unit is the
nth?
101
Solutions to Exercises in Chapter 4
Solution to Exercise 4.1 (p. 95)
pm = [0.15 0.05 0.02 0.18 0.25 0.05 0.18 0.12]; y = imintest(pm) The class is NOT independent Minterms for which the product rule fails y = 1 1 1 0 1 1 1 0 % The product rule hold for M7 = ABC
Solution to Exercise 4.2 (p. 95)
P = [0.65 0.37 0.48 p = minmap(minprob(P)) p = 0.0424 0.0249 0.0722 0.0424 0.0392 0.0230 0.0667 0.0392
0.63]; 0.0788 0.1342 0.0727 0.1238 0.0463 0.0788 0.0427 0.0727
Solution to Exercise 4.3 (p. 95)
pm = [0 0.05 0.10 0.05 0.20 0.10 0.40 0.10]; y = imintest(pm) The class is NOT independent Minterms for which the product rule fails y = 1 1 1 1 % By hand check product rule for any minterm 1 1 1 1
Solution to Exercise 4.4 (p. 95)
npr04_04 (Section~17.8.22: npr04_04) Minterm probabilities for Exercise~4.4 are in pm y = imintest(pm) The class is NOT independent Minterms for which the product rule fails y = 1 1 1 1 1 1 1 1
Solution to Exercise 4.5 (p. 95)
npr04_05 (Section~17.8.23: npr04_05) Minterm probabilities for Exercise~4.5 are in pm imintest(pm) The class is NOT independent
102
CHAPTER 4. INDEPENDENCE OF EVENTS
Minterms for which the product rule fails ans = 0 1 0 1 0 0 0 0 0 1 0 1 0 0 0 0
Solution to Exercise 4.6 (p. 95)
npr04_06 Minterm probabilities for Exercise~4.6 are in pm y = imintest(pm) The class is NOT independent Minterms for which the product rule fails y = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Solution to Exercise 4.7 (p. 96)
P (C) = P (AC) /P (A) = 0.40
and
P (B) = 1 − P (B c C) /P (C) = 0.20.
pm = minprob([0.3 0.2 0.4]) pm = 0.3360 0.2240 0.0840 0.0560
Solution to Exercise 4.8 (p. 96)
0.1440
0.0960
0.0360
0.0240
P (Ac C c ) = P (Ac ) P (C c ) = 0.3 implies P (Ac ) = 0.3/0.6 = 0.5 = P (A). P (Ac B c ) = P (Ac ) P (B c ) = 0.4 implies P (B c ) = 0.4/0.5 = 0.8 implies P (B) = 0.2
P = [0.5 0.2 0.4]; pm = minprob(P) pm = 0.2400 0.1600 0.0600
P = (1/6) (5/6) = 0.0161.
2 3
0.0400
0.2400
0.1600
0.0600
0.0400
Solution to Exercise 4.9 (p. 96) Solution to Exercise 4.10 (p. 96)
P = 0.01*[72 83 75 92 65 79]; y = ckn(P,[3 4 5]) y = 0.9780 0.8756 0.5967
Solution to Exercise 4.11 (p. 96)
P = 0.33 • (1 − 0.57) = 0.1419
Solution to Exercise 4.12 (p. 96)
A∼
letter with money,
B∼
call from Jim,
C∼
William ask for date
P = 0.01*[5 85 50]; minvec3 Variables are A, B, C, Ac, Bc, Cc They may be renamed, if desired. pm = minprob(P); p = ((A&Bc)(B&C))*pm' p = 0.4325
103
Solution to Exercise 4.13 (p. 96)
P = 0.01*[30 50 60 70 80 80 80 85 75 65]; k = 2:10; p = ckn(P,k) p = Columns 1 through 7 0.9999 0.9984 0.9882 0.9441 0.8192 Columns 8 through 9 0.0966 0.0134
Solution to Exercise 4.14 (p. 96)
0.5859
0.3043
P (AB) = w/ (m + w) = W/ (W + M ) = P (A)
Solution to Exercise 4.15 (p. 96)
p1 = P (BA) = P (BAc ) = p2
(see table of equivalent conditions).
Solution to Exercise 4.16 (p. 96)
P (A) = P (A ∩ A) = P (A) P (A). x2 = x
Solution to Exercise 4.17 (p. 96)
i
x=0
or
x = 1.
% No. Consider for example the following minterm probabilities: pm = [0.2 0.05 0.125 0.125 0.05 0.2 0.125 0.125]; minvec3 Variables are A, B, C, Ac, Bc, Cc They may be renamed, if desired. PA = A*pm' PA = 0.5000 PB = B*pm' PB = 0.5000 PC = C*pm' PC = 0.5000 PAB = (A&B)*pm' % Product rule holds PAB = 0.2500 PBC = (B&C)*pm' % Product rule holds PBC = 0.2500 PAC = (A&C)*pm' % Product rule fails PAC = 0.3250
Solution to Exercise 4.18 (p. 97)
A ⊂ B implies P (AB) = P (A); P (B) = 1 or P (A) = 0.
independence implies
P (AB) = P (A) P (B). P (A) = P (A) P (B)
only if
Solution to Exercise 4.19 (p. 97)
At least one completes i not all fail.
P = 1 − 0.2 • 0.1 • 0.25 = 0.9950
Solution to Exercise 4.20 (p. 97)
PR = 0.1*[7 8 8 6 7]; PB = 0.1*[6 5 4 6 6 4]; PR3 = ckn(PR,3) PR3 = 0.8662 PB3 = ckn(PB,3) PB3 = 0.6906
0:4)*ckn(PR.9506 = parallel([Ra Rb]) = 0. P1geq = ikn(P2. 97) Rc = parallel(R(1:3)) Rc = 0.104 CHAPTER 4.9097 Solution to Exercise 4.25 (p.9821 Rs3 = prod([Ra R1(3) Rb]) Rs3 = 0. 97) Ra Ra Rb Rb Rs Rs R = 0.01*[83 87 92 77 86].9821 = prod([Ra R(3) Rb]) = 0.22 (p.96.3) = 0. 97) R1 = R.9997 Rss = prod([Rb Rc]) Rss = 0.8463 = ckn(R(3:5).01*[68 91 74 68 73 83]. = parallel(R(1:2)) = 0. R1(3) =0.9960 Rb = ckn(R1(4:8). P2 = 0.1:5)' P1g = 0.9818 Solution to Exercise 4. INDEPENDENCE OF EVENTS PRgB = ikn(PB.01*[96 90 93 82 85 97 88 80]. Ra = parallel(R1(1:2)) Ra = 0.3) Rb = 0. . 97) P1 = 0. = prod(R(1:2)) = 0.01*[93 91 78 88 92].5527 P1g = ikn(P2.9960 = ckn(R(4:8).9390 R2 = R. 97) Ra Ra Rb Rb Rs Rs R = 0.5065 Solution to Exercise 4.21 (p.24 (p.23 (p.1:5)' PRgB = 0.0:4)*ckn(P1.9924 Solution to Exercise 4.0:5)*ckn(P1.0:5)' P1geq = 0.2561 Solution to Exercise 4.2) = 0.
98) P = cbinom (10. 98) P = cbinom (500. 4) = 0. 3) = 0.0.93.39 (p. P = cbinom (5. 97) P = 1 − (5/6) = 0.9821 Rs4 = prod([Ra R2(3) Rb]) Rs4 = 0. 98) P = P = cbinom(100.9999 Solution to Exercise 4. Ra = parallel(R2(1:2)) Ra = 0.1495 0.28 (p.5383 0.11/36.15:5:40) 0.1547. 8) = 0. 0.9717 Solution to Exercise 4. PW P2 P2 P3 P3 PW = sum(ibinom(5. 97) 3 P = 1 − 0.9980 Rb = ckn(R2(4:8).31 (p. 3) = 0.PW. 0.851/4 .105 R2(2) = 0.02.63. • P 2 = 1 − cbinom (35.36 (p.11) = 0.1.910 = 0.35 (p. 99) Each roll yields a hit with probability p= 6 36 + 5 36 = 11 36 .4956 = cbinom(20.655 = 0.7939 Solution to Exercise 4.9115 Solution to Exercise 4.9854 Solution to Exercise 4.1/4. 8) = 0.95. 0. 98) P = cbinom (8.40 (p.30 (p.37 (p.1798 Solution to Exercise 4.3. 98) P 1 = cbinom (10.38 (p. 98) P 1 = 1 − P 0. 98) P = 1 − 0.PW. 98) n i=1 (1 − pi ) P = cbinom (100. P0 = Solution to Exercise 4.10) = 0. 6) = 0.8840 Solution to Exercise 4.3963 .26 (p.34 (p. 4) = 0. 98) P = cbinom (7. 98) p = 0.6513 Solution to Exercise 4.9946 0.[1 3 5])) = 0. 0.5724 = cbinom(20.27 (p.9644 Solution to Exercise 4. 98) P = cbinom (10.15.9971 Solution to Exercise 4.4213 Solution to Exercise 4.02. 0. 0.29 (p. 15) = 0.33 (p. 3) = 0.4482 Solution to Exercise 4. 98) • P 1 = cbinom (35.9005 0.0164 0.32 (p. p.82. 2) = 0.0007 Solution to Exercise 4.3) Rb = 0.0814 Solution to Exercise 4.02. 0. 0.
3613 Solution to Exercise 4.41 (p.37) c c c E1 E2 E3 E4 c c c c c E1 E2 E3 E4 E5 E6 c c c c c c c E1 E2 E3 E4 E5 E6 E7 E8 c c c c c c c c c E1 E2 E3 E4 E5 E6 E7 E8 E9 E10 (4. 99) a.^2) = 0. 3k+2 i=1 k • P (A) = p1 k=0 (q1 q2 q3 ) = q • P (B) = 1−q11p2 q3 q2 ∞ . P (A) = p1 1 + q1 q2 + (q1 q2 ) + (q1 q2 ) + (q1 q2 ) 5 2 3 4 = p1 1 − (q1 q2 ) 1 − q1 q2 5 (4. 0. 99) P = ibinom (10. 99) ∞ a.40) p1 = 1/4. 0 : 8) ∗ cbinom (10.41) c.1. 0 : 30) .46 (p. 0. d.3437 Solution to Exercise 4. P (A) = P (B) i p1 = q1 p2 = (1 − p1 ) p2 p1 = 0. 0 : 4) ∗ cbinom (5. P (C) = 1/32 4 16 i (4. 0 : 9) ∗ cbinom (10.5. 99) Let implies N = event never terminates and Nk = 0 ≤ P (N ) ≤ P (Nk ) = 1/2k for all k. Solution to Exercise 4. Then N ⊂ Nk for all k P (N ) = 0. 0.4.48 (p. we have q1 q2 = 1/2 q1 p2 = 1/4. 1/6. 99) P = ibinom (8.1. Solution to Exercise 4.39) P (B) = q1 p2 For 1 − (q1 q2 ) 5 P (C) = (q1 q2 ) 1 − q1 q2 and (4. Note that P (A) + P (B) + P (C) = 1.3.45 (p.5/1.5 = 1/3 p1 = p2 / (1 + p2 ).1386 Solution to Exercise 4.43 (p. c c E1 E2 E3 event does not terminate in We conclude k plays.38) b. p2 = 1/3.47 (p. 0. In this case P (A) = 1 31 • = 31/64 = 0.106 CHAPTER 4.36) Solution to Exercise 4. INDEPENDENCE OF EVENTS Solution to Exercise 4. C= 10 i=1 c Ei . A = E1 c B = E1 E2 c c c c E1 E2 E3 E4 E5 c c c c c c E1 E2 E3 E4 E5 E6 E7 c c c c c c c c E1 E2 E3 E4 E5 E6 E7 E8 E9 (4.4. 1 : 5) ' = 0.44 (p. 1 : 10) ' = 0. 1 : 9) ' = 0. 99) P = sum (ibinom (30. 0. 99) P1 = r n−r n−r r (2n − 1) r − r2 r r−1 • + • + • = n n−1 n n−1 n n−1 n (n − 1) P2 = 1 − r n 2 (4. 0. • A = E1 k=1 c • B = E1 E2 c c • C = E1 E2 E3 ∞ 3k i=1 c Ei E3k+1 c Ei E3k+2 c Ei E3k+3 p1 1−q1 q2 q3 k=1 3k+1 i=1 ∞ k=1 b.4844 = P (B) .42 (p. 99) P = ibinom (7.35) = 2nr − r2 n2 (4.4030 Solution to Exercise 4.
107 q1 • P (C) = 1−qq2qp3 3 1 2q • For p1 = 1/4. Hence P (Ei AA rn) = C (n − 1. 100) P (Arn Ei ) = pC (n − 1.50 (p. a. q n−1 p 1 − QN −1 P (B2 Fn ) = = 1 − q N −n P (Fn ) q n−1 p (4. failures.43) P (B2 Fn ) = . and Fn ⊂ B1 implies P (Fn B1 ) = P (Fn ) /P (B1 ) = P (Fn A1 ) = q n−1 p 1−q N P (Fn A1 ) 1 q n−1 pq N −n = = N −1 P (A1 ) N pq N (4. 100) Let A1 = event one failure. Solution to Exercise 4. Solution to Exercise 4.42) (see Exercise 4. r − 1) pr−1 q n−r and P (Arn ) = C (n. p3 = 1/2. b. B1 = event of one or more Fn = the event the rst defective unit found is the nth. r − 1) /C (n. r) = r/n.49) c. B2 = event of two or more failures. P (A) = P (B) = P (C) = 1/3. r) pr q n−r . Since probability not all from nth are good is 1 − q N −n .49 (p. p2 = 1/3.
108 CHAPTER 4. INDEPENDENCE OF EVENTS .
1: Buying umbrellas and the weather A department store has a nice stock of umbrellas.org/content/m23258/1. is raining or threatening and Two customers come into the store indepen Normally. 109 . the the item purchased by one should not be aected by the choice made by the other. If two students are taking exams in dierent courses.1. 5. Among the simple examples of operational independence" in the unit on independence of events. • Conditional probability is a probability measure. Example 5.1).7/>. we should think the events weather on the purchases. This raises the question: is there a useful conditional independencei. But given the fact the weather is rainy 1 This content is available online at <http://cnx. Let A be the event the rst buys an umbrella and B the event the second buys an umbrella. Let to rain). each unaware of the choice made by the other. and not on the relationship between the events.e. But consider the eect of P (AC) > P (AC c ) P (BC) > P (BC c ). dently. Now we should think form an independent pair. the grade one makes should not aect the grade made by the other.Chapter 5 Conditional Independence 5.. This is equivalent to the product rule P (AB) = P (A) P (B) . The concept is approached as lack of conditioning: P (AB) = P (A). Independence cannot be displayed on a Venn diagram. For one probability measure a pair may be independent while for another probability measure the pair may not be independent. since it has the three dening properties and all those properties derived therefrom. which lead naturally to an assumption of probabilistic independence are the following: • • If customers come into a well stocked shop at dierent times.. B} be the event the weather is rainy (i. The weather has a decided eect on the likelihood of buying an umbrella.1 Conditional Independence1 The idea of stochastic (probabilistic) independence is explored in the unit Independence of Events (Section 4. unless probabilities are indicated.e.1 The concept Examination of the independence concept reveals two important mathematical facts: • Independence of a class of non mutually exclusive events depends upon the probability measure. C {A. independence with respect to a conditional probability measure? In this chapter we explore that question in a fruitful way. We consider an extension to conditional independence.
0816 = 0. additional knowledge that A occurs. CONDITIONAL INDEPENDENCE (event C has occurred). P (BC c ) = 0. If the rst customer buys an umbrella. An examination of the pattern of computation shows that independence would require very special probabilities which are not likely to be encountered. The condition leads to P (ABC) = P (AC) P (BC) P (BA) > P (B). this would increase the likelihoood there was a . we can see why this should be so. this indicates a higher than normal likelihood that the weather is rainy.1110 As a result.6) has occurred does not If we know that there was a Saturday game. P (AC c ) = P (ABC c ).4) P (A) P (B) = 0. The exam is given on Monday There is a probability 0.3) = P (AC) P (BC) P (C) + (5. Then (5. in which case the second customer is likely to buy. Now it is reasonable to suppose C be the event of a game on the previous P (AC) = P (ABC) aect the lielihood that be expressed and P (AC c ) = P (ABC c ) (5. (5.7) Under these conditions. P (AC c ) = 0. better and morning.1) An examination of the sixteen equivalent conditions for independence. it may be reasonable to suppose P (AC) = P (ABC) replaced by probability measure or. plain that only in the most unusual cases would the pair be independent. If we knew that one did poorly on the exam.30. Thus.60. we should suppose that P (BC) < P (BC c ).3200 P (B) = P (BC) P (C) + P (BC c ) P (C c ) = 0. we should also expect that with respect to the conditional probability dependent (with respect to the prior probability measure P )? Some numerical examples make it Without calculations. and both students are enthusiastic fans. Example 5. with probability measure P PC .2) P (A) = P (AC) P (C) + P (AC c ) P (C c ) = 0. PC (A) = PC (AB) (5. use of equivalent conditions shows that the situation may P (ABC) = P (AC) P (BC) and P (ABC c ) = P (AC c ) P (BC c ) P (AC) < P (AC c ) and (5. Saturday. P (BC) = 0.2550 P (AB) = P (ABC) P (C) + P (ABC c ) P (C c ) P (AC c ) P (BC c ) P (C c ) = 0.5) not independent. Thus. B Again. Suppose P (AC) = 0. Consider the following c c c and P (ABC ) = P (AC ) P (BC ) and numerical case. so that there is independence c measure P (·C ).1110 = P (AB) The product rule fails. B} with rePC (·) = P (·C). Does this make the pair {A. Let Saturday.2: Students and exams Two students take exams in dierent courses. P (ABC) = P (AC) P (BC).20. it would seem reasonable that purchase of an umbrella by one should not aect the likelihood of such a purchase by the other. so that the pair is (5. B} in spect to the conditional probability measure For this example.50. one would suppose their performances form an independent pair.110 CHAPTER 5.15. withP (C) = 0. shows that we have independence of the pair {A. in another notation.30 that there was a football game on It is the fall semester. Under normal circumstances. Let B A be the event the rst student makes grade 80 or be the event the second has a grade of 80 or better.
{A. 5. we have the following equivalent conditions. suppose P (AC) = 0. independent arises from a common chance factor that aects both.2 Sixteen equivalent conditions Using the facts on repeated conditioning (Section 3.3 Straightforward calculations show (5. establish The replacement rule If any of the pairs are the others. {Ac . Sixteen equivalent conditions P (ABC) = P (AC) P (AB C) = P (AC) P (Ac BC) = P (Ac C) P (Ac B c C) = P (Ac C) c P (BAC) = P (BC) P (B AC) = P (B C) P (BAc C) = P (BC) P (B c Ac C) = P (B c C) Table 5.111 Saturday game and hence increase the likelihood that the other did poorly. B c } is conditionally independent.1).2 The patterns of conditioning in the examples above belong to this set. B}.9) P (ABC) = P (AC) If it is known that likelihood of or P (ABC) = P (AC) P (BC) (5. condition the product rule P (ABC) = P (AC) P (BC). A pair of events Because of its simplicity and symmetry.4: Repeated conditioning) and the equivalent conditions for independence (Table 4. then so .1 P (ABC) = P (AC) P (BC) P (AB c C) = P (AC) P (B c C) P (Ac BC) = P (Ac C) P (BC) P (Ac B c C) = P (Ac C) P (B c C) c c P (ABC) = P (AB c C) P (Ac BC) P (Ac B c C) = P (BAC) = P (BAc C) P (B c AC) P (B c Ac C) = Table 5.6 P (BC c ) = 0. designated {A. or {Ac . As a numerical example. In the hybrid notation we use for repeated conditioning.8) Note that P (A) = 0. we may produce a similar table of equivalent conditions for conditional independence. The failure to be Although their performances are operationally independent. {A.8 P (C) = 0. The equivalence of the four entries in the right hand column of the upper part of the table. B c }.8514 > P (A) as would be expected. B} ci C i the following product rule holds: P (ABC) = P (AC) P (BC). all one are equally valid assumptions.1. they are not independent in the probability sense. we write PC (AB) = PC (A) This translates into or PC (AB) = PC (A) PC (B) (5.1. then additional knowledge of the occurrence of B does not change the If we write the sixteen equivalent conditions for independence in terms of the conditional probability measure PC ( · ).7400.9 P (BC) = 0.7 P (AC c ) = 0. P (AB) = 0. then translate as above. givenC. one or the of these patterns is recognized. B}. P (B) = 0.6300. given C.10) A. we take as the dening Denition. C has occurred. P (AB) = 0. {A. B} is said to be conditionally independent. other of these conditions may seem a reasonable assumption.8400. As soon as then In a given problem.
B} ci Dc . {A. an event must be sharply dened: on any trial it occurs or it does not.112 CHAPTER 5. They both need to borrow If A is the reasonable {A. The doctor orders two independent tests for conditions that indicate the disease. • Two students in the same course take an exam. given a short supply. • Two contractors work quite independently on jobs in the same city. A class {Ai : i ∈ J}. Since conditional independence is ordinary independence with respect to a conditional probability measure. given C. However. B} ci C and {A. it has been reasonable to assume conditional independence. both jobs are outside and subject to delays due to bad weather. and B the event the second is successful. The replacement rule If the class {Ai : i ∈ J} ci C . Remark. a certain book from the library. . B} ci C . we note some other common examples in which similar conditions hold: there is operational independence. Denition. then the two conditions would In general not be conditionally independent. patient. the pattern of probabilistic analysis may be useful. they must both be aected by the actual condition of the Suppose D is the event the patient has the disease. They work quite separately. A preliminary examination leads the doctor to think there is a thirty percent chance the patient has a certain disease. where J is an arbitrary index set. In formal probability theory. However. the other should also. to the detriment of the second. If they study together. the event of both getting good grades should be conditionally independent. then successful completion would be conditionally independent. is conditionally independent. give an adequate supply. then the likelihoods of good grades would not be independent. it should be clear how to extend the concept to larger classes of sets. and conditional independence. To illustrate further the usefulness of this concept. given an event C. If they prepared separately. if only one book is available. the replacement rule extends. neither aware of the testing by the others. The operational independence suggests probabilistic independence. whereas they would not be conditionally independent. event the rst completes on time to assume C be the event the library has two copies available. B} ci C . Yet. given the complementary event. CONDITIONAL INDEPENDENCE This may be expressed by saying that if a pair is conditionally independent. In the examples considered so far. Did a trace of rain or thunder in the area constitute bad weather? Did rain delay on one day in a month long project constitute bad weather? Even with this ambiguity. then arguments similar to those in c 2 make it seem reasonable to suppose {A. Are results of these tests really independent? There is certainly operational independencethe tests may be done by dierent laboratories. then it seems P (BAC c ) < P (BC c ). event As in the case of simple independence. (indicates the conditions associated with the disease) and Then it would seem reasonable to suppose B A is the event the rst test is positive is the event the second test is positive. B} ci D and {A. we may replace either or both by their complements and still have a conditionally independent pair. We consider several examples. The event of good weather is not so clearly dened. since if the rst student completes on time. if one does well. Let But there are cases in which the eect of the conditioning event is asymmetric. but some chance factor which aects both. • Two students are working on a term paper. Suppose second completes on time. 
i the product rule holds for every nite subclass of two or more. denoted {Ai : i ∈ J} ci C . then any or all of the events Ai may be replaced by their complements and still have a conditionally independent class. if the tests are meaningful. • If the two contractors of the example above both need material which may be in scarce supply. With neither cheating or collaborating on the test itself. then he or she must have been successful in getting the book. Examples 1 and A is the event the rst contracter completes his job on time and B is the event the If C is the event of good weather. • A patient goes to a doctor.
and P (Mk H ). they are not independent.01*[75~85~70~90~70].6.7 of being successful. If he is simply guessing. he picks a horse to win in each of ten races. given H. and 0. the probabilities are 0. Let H be the event he is correct in his claim. Example 5. c c c c P (E1 E2 E3 E4 E5 H) = P (E1 H) P (E2 H) P (E3 H) P (E4 H) P (E5 H) with similar expressions for each (5. If S is the number of successes in picking the winners in the ten races. the probability of success on each race is 0. probability Guessing at random: Ten trials. c This means that if we want the we can use ckn with the matrix of conditional probabilities.11) the the Ei be the event the where P (Mk H).3: Use of independence techniques Sharon is investigating a business venture which she thinks has probability 0.9. PH~=~0. positive. We assume two conditional Bernoulli trials: Claim is valid: Ten trials. If the venture is unsound. i th adviser is positive. the advisers are independent of one another in the sense that no one is aected by the others.75. P (E) = P (F H) + P (GH c ) = P (F H) P (H) + P (GH c ) P (H c ) Let form (5.8 that the advisers will advise her to proceed.3 The use of independence techniques Since conditional independence is independence.12) P (Mk H) probability of three or more successes. Of course. probability p = P (Ei H) = 0.9. The following MATLAB solution of the investment problem is indicated.85.9255 Often a Bernoulli sequence is related to some conditioning event the sequence {Ei : 1 ≤ i ≤ n} ci H and ci H c . for they are all related to the soundness of the venture.7. If Sharon goes with the ma jority of advisers. 0. Then Let H = the event the pro ject is sound. 0. the respective probabilities are 0. F = the event three or more advisers are G = F c = the event three or more are negative.3)*(1~~PH) PE~=~~~~0. determine of correct picks. If the prospects are sound. given that the venture is sound and also given that the venture is not sound. Then P (F H) = the sum of probabilities of Mk are minterms generated by the class {Ei : 1 ≤ i ≤ 5}. p = P (Ei H c ) = 0. 0. 0.2.4: Test of a claim A race track regular claims he can pick the winning horse in any race 90 percent of the time. Because of assumed conditional independence. she makes the right choice if three or more of the ve advisers are negative. There are ve horses in each race. . P1~=~0.3)*PH~+~ckn(P2. what is the probability she will make the right decision? SOLUTION If the project is sound.01*[80~75~60~90~80]. PE~=~ckn(P1. She checks with ve independent advisers. H. Example 5. Consider the trials to constitute a Bernoulli sequence. 0. Given the quality of the project. we may use independence techniques in the solution of problems. We consider two types of problems: an inference problem and a conditional Bernoulli sequence. In order to test his claim.7 that the advice will be negative. and E = the event of the correct decision.1. P2~=~0. and 0.113 5.2. P (HS = k) for various numbers k Suppose it is equally likely that his claim is valid or that he is merely guessing.7.9. In this case it is reasonable to assume We consider a simple example. if the venture is not sound.75. We may reasonably assume conditional independence of the advice. Sharon makes the right choice if three or more of the ve advisors are positive. 0.8.
Then P (H) P (S = kH) P (HS = k) = · .2 Patterns of Probable Inference2 Table 5. 0 ≤ k ≤ 10 P (H c S = k) P (H c ) P (S = kH c ) Giving him the benet of the doubt. he would have to pick at least seven correctly to give reasonable validation of his claim. CONDITIONAL INDEPENDENCE Let S= number of correct picks in ten trials.14) 2 This content is available online at <http://cnx. and P (EH c ).k).~~~~%~Probability~of~k~successes.~given~H Pk2~=~ibinom(10. P (EH). Pk1~=~ibinom(10.0.9.OH(e)]')) ~~~~~~~~~~~6~~~~~~~~~~~2~~~~~~%~Needs~at~least~six~to~have~creditability ~~~~~~~~~~~7~~~~~~~~~~73~~~~~~%~Seven~would~be~creditable. (5. the probabilities (5. the odds P (HE) /P (H E). basis of the evidence.~~~~~~~~~~~~~~%~Selects~favorable~odds disp(round([k(e). What is desired is P (HE) or. c equivalently. .1 Some Patterns of Probable Inference We are concerned with the likelihood of some hypothesized condition. P (HE) P (EH) P (H) = · P (H c E) P (EH c ) P (H c ) No conditional independence is involved in this case. 5. we suppose odds.2.3 If available are usually P (H) (or an odds value).~~~~%~Probability~of~k~successes.~given~H^c OH~~=~Pk1. H is the event the hypothetical condition exists and E is the event the evidence occurs. Some typical examples: We are forced to assess probabilities (likelihoods) on the HYPOTHESIS Job success Presence of oil Operation of a device Market condition Presence of a disease EVIDENCE Personal traits Geological structures Physical condition Test market condition Tests for symptoms 5.~~~~~~~~~~~~%~Conditional~odds~Assumes~P(H)/P(H^c)~=~1 e~~~=~OH~>~1. We simply use Bayes' rule to reverse the direction of conditioning.13) P (H) /P (H c ) = 1 and calculate the conditional k~=~0:10.0.6/>.114 CHAPTER 5.2. In general. ~~~~~~~~~~~8~~~~~~~~2627~~~~~~%~even~if~P(H)/P(H^c)~=~0. we have evidence for the condition which can never be absolutely certain.k)./Pk2.1 ~~~~~~~~~~~9~~~~~~~94585 ~~~~~~~~~~10~~~~~3405063 Under these assumptions.org/content/m23259/1.
115
Independent evidence for the hypothesized condition
Suppose there are two independent bits of evidence. Now obtaining this evidence may be operationally independent, but if the items both relate to the hypothesized condition, then they cannot be really independent. knowledge of Thus The condition assumed is usually of the form does not aect the likelihood of and
E2
{E1 , E2 } ci H
{E1 , E2 } ci H c .
E1 .
Similarly, we usually have
P (E1 H) = P (E1 HE2 ) if H occurs, then P (E1 H c ) = P (E1 H c E2 ).
Example 5.5: Independent medical tests
Suppose a doctor thinks the odds are 2/1 that a patient has a certain disease. independent tests. Let
H
be the event the patient has the disease and
E1
and
E2
She orders two be the events
the tests are positive. Suppose the rst test has probability 0.1 of a false positive and probability 0.05 of a false negative. The second test has probabilities 0.05 and 0.08 of false positive and false negative, respectively. If both tests are positive, what is the posterior probability the patient has the disease?
SOLUTION
Assuming
{E1 , E2 } ci H
and
ci H c , we work rst in terms of the odds, then convert to probability.
(5.15)
P (H) P (E1 E2 H) P (H) P (E1 H) P (E2 H) P (HE1 E2 ) = · = · P (H c E1 E2 ) P (H c ) P (E1 E2 H c ) P (H c ) P (E1 H c ) P (E2 H c )
The data are
P (H) /P (H c ) = 2, P (E1 H) = 0.95, P (E1 H c ) = 0.1, P (E2 H) = 0.92, P (E2 H c ) = 0.05
Substituting values, we get
(5.16)
0.95 · 0.92 1748 P (HE1 E2 ) =2· = P (H c E1 E2 ) 0.10 · 0.05 5
Evidence for a symptom
so that
P (HE1 E2 ) =
1748 5 =1− = 1 − 0.0029 1753 1753
(5.17)
Sometimes the evidence dealt with is not evidence for the hypothesized condition, but for some condition which is stochastically related.
symptom.
For purposes of exposition, we refer to this intermediary condition as a
Consider again the examples above.
HYPOTHESIS Job success Presence of oil Operation of a device Market condition Presence of a disease
SYMPTOM Personal traits Geological structures Physical condition Test market condition Physical symptom
EVIDENCE Diagnostic test results Geophysical survey results Monitoring report Market survey result Test for symptom
Table 5.4
We let
S
be the event the symptom is present. The usual case is that the evidence is directly related to
the symptom and not the hypothesized condition. The diagnostic test results can say something about an applicant's personal traits, but cannot deal directly with the hypothesized condition. The test results would be the same whether or not the candidate is successful in the job (he or she does not have the job yet). A geophysical survey deals with certain structural features beneath the surface. If a fault or a salt dome is The physical monitoring
present, the geophysical results are the same whether or not there is oil present.
report deals with certain physical characteristics. Its reading is the same whether or not the device will fail. A market survey treats only the condition in the test market. The results depend upon the test market, not
116
CHAPTER 5. CONDITIONAL INDEPENDENCE
A blood test may be for certain physical conditions which frequently are related (at
the national market.
least statistically) to the disease. But the result of the blood test for the physical condition is not directly aected by the presence or absence of the disease. Under conditions of this type, we may assume
P (ESH) = P (ESH c )
These imply
and
P (ES c H) = P (ES c H c )
(5.18)
{E, H} ci S
P (HE) P (H c E)
and
ci S c . =
Now
=
P (HE) P (H c E)
P (HES)+P (HES c ) P (HS)P (EHS)+P (HS c )P (EHS c ) P (H c ES)+P (H c ES c ) = P (H c S)P (EH c S)+P (H c S c )P (EH c S c ) c c = PP (HS)P (ES)+P (HSS )P (ES ) ) (H c S)P (ES)+P (H c c )P (ES c
(5.19)
It is worth noting that each term in the denominator diers from the corresponding term in the numerator by having
Hc
in place of
H.
Before completing the analysis, it is necessary to consider how
H
and
S
are
related stochastically in the data. Four cases may be considered.
a. Data are b. Data are c. Data are d. Data are
P P P P
(SH), (SH), (HS), (HS),
P P P P
(SH c ), (SH c ), (HS c ), (HS c ),
and and and and
P P P P
(H). (S). (S). (H).
Case a:
P (H) P (SH) P (ES) + P (H) P (S c H) P (ES c ) P (HE) = c E) P (H P (H c ) P (SH c ) P (ES) + P (H c ) P (S c H c ) P (ES c )
(5.20)
Example 5.6: Geophysical survey
Let
H
be the event of a successful oil well,
favorable to the presence of oil, and structure. We suppose
S be the event there is a geophysical structure E be the event the geophysical survey indicates a favorable
and
{H, E} ci S
ci S c .
Data are
P (H) /P (H c ) = 3, P (SH) = 0.92, P (SH c ) = 0.20, P (ES) = 0.95, P (ES c ) = 0.15
Then
(5.21)
P (HE) 0.92 · 0.95 + 0.08 · 0.15 1329 =3· = = 8.5742 c E) P (H 0.20 · 0.95 + 0.80 · 0.15 155
so that
(5.22)
P (HE) = 1 −
155 = 0.8956 1484
(5.23)
The geophysical result moved the prior odds of 3/1 to posterior odds of 8.6/1, with a corresponding change of probabilities from 0.75 to 0.90.
Case b:
Data are
P (S)P (SH), P (SH c ), P (ES).
and
P (ES c ).
If we can determine
P (H),
we can
proceed as in case a. Now by the law of total probability
P (S) = P (SH) P (H) + P (SH c ) [1 − P (H)]
which may be solved algebraically to give
(5.24)
P (H) =
P (S) − P (SH c ) P (SH) − P (SH c )
(5.25)
117
Example 5.7: Geophysical survey revisited
In many cases a better estimate of of previous geophysical data. Suppose the prior odds for
P (S) or the odds P (S) /P (S c ) can be made on the basis S are 3/1, so that P (S) = 0.75. P (H) = 55/17 P (H c )
Using the other data in Example 5.6 (Geophysical survey), we have
P (H) =
0.75 − 0.20 P (S) − P (SH c ) = = 55/72, P (SH) − P (SH c ) 0.92 − 0.20
so that
(5.26)
Using the pattern of case a, we have
P (HE) 55 0.92 · 0.95 + 0.08 · 0.15 4873 = · = = 9.2467 c E) P (H 17 0.20 · 0.95 + 0.80 · 0.15 527
so that
(5.27)
P (HE) = 1 −
527 = 0.9024 5400 P (ES)
and
(5.28)
Usually data relating test results to symptom are of the form
P (ES c ),
or equivalent.
Data relating the symptom and the hypothesized condition may go either way. In cases a and b, the data are in the form
P (SH)
and
P (SH c ),
and
or equivalent, derived from data showing the fraction of
times the symptom is noted when the hypothesized condition is identied. But these data may go in the opposite direction, yielding and d.
P (HS)
P (HS c ),
or equivalent. This is the situation in cases c
Case c:
Data are
P (ES) , P (ES c ) , P (HS) , P (HS c )
and
P (S). P (S)
A test A
Example 5.8: Evidence for a disease symptom with prior
time.
When a certain blood syndrome is observed, a given disease is indicated 93 percent of the The disease is found without this syndrome only three percent of the time.
for the syndrome has probability 0.03 of a false positive and 0.05 of a false negative.
preliminary examination indicates a probability 0.30 that a patient has the syndrome. A test is performed; the result is negative. What is the probability the patient has the disease?
SOLUTION
In terms of the notation above, the data are
P (S) = 0.30,
P (ES c ) = 0.03,
and
P (E c S) = 0.05,
(5.29)
P (HS) = 0.93,
We suppose
P (HS c ) = 0.03
(5.30)
{H, E} ci S
and
ci S c .
(5.31)
P (HE c ) P (S) P (HS) P (E c S) + P (S c ) P (HS c ) P (E c S c ) = c E c ) P (H P (S) P (H c S) P (E c S) + P (S c ) P (H c S c ) P (E c S c ) =
which implies
429 0.30 · 0.93 · 0.05 + 0.70 · 0.03 · 0.97 = 0.30 · 0.07 · 0.05 + 0.70 · 0.97 · 0.97 8246
(5.32)
P (HE c ) = 429/8675 ≈ 0.05.
Case d:
This diers from case c only in the fact that a prior probability for
determine the corresponding probability for
S
H
is assumed. In this case, we
by
P (S) =
and use the pattern of case c.
P (H) − P (HS c ) P (HS) − P (HS c )
(5.33)
118
CHAPTER 5. CONDITIONAL INDEPENDENCE
Example 5.9: Evidence for a disease symptom with prior
P (H)
Suppose for the patient in Example 5.8 (Evidence for a disease symptom with prior physician estimates the odds favoring the presence of the disease are 1/3, so that Again, the test result is negative. Determine the posterior odds, given
Ec .
P (S)) the P (H) = 0.25.
SOLUTION
First we determine
P (S) =
Then
0.25 − 0.03 P (H) − P (HS c ) = = 11/45 P (HS) − P (HS c ) 0.93 − 0.03
(5.34)
P (HE c ) (11/45) · 0.93 · 0.05 + (34/45) · 0.03 · 0.97 15009 = = = 0.047 P (H c E c ) (11/45) · 0.07 · 0.05 + (34/45) · 0.97 · 0.97 320291
The result of the test drops the prior odds of 1/3 to approximately 1/21.
(5.35)
Independent evidence for a symptom
In the previous cases, we consider only a single item of evidence for a symptom. But it may be desirable to have a second opinion. We suppose the tests are for the symptom and are not directly related to the hypothetical condition. If the tests are operationally independent, we could reasonably assume
c P (E1 SE2 ) = P (E1 SE2 )
{E1 , E2 } ci S {E1 , H} ci S {E2 , H} ci S {E1 E2 , H} ci S
(5.36)
P (E1 SH) = P (E1 SH ) P (E2 SH) = P (E2 SH ) P (E1 E2 SH) = P (E1 E2 SH c )
This implies
c
c
{E1 , E2 , H} ci S .
depending on the tie between
S
A similar condition holds for and
H. We consider a "case a" example.
Sc .
As for a single test, there are four cases,
Example 5.10: A market survey problem
A food company is planning to market nationally a new breakfast cereal. Its executives feel condent that the odds are at least 3 to 1 the product would be successful. Before launching the new product, the company decides to investigate a test market. Previous experience indicates that the reliability of the test market is such that if the national market is favorable, there is probability 0.9 that the test market is also. On the other hand, if the national market is unfavorable, there is a probability of only 0.2 that the test market will be favorable. These facts lead to the following analysis. Let
H be the event the national market is favorable (hypothesis) S be the event the test market is favorable (symptom) The initial data are the following probabilities, based on past experience:
(a) Prior odds: (b) Reliability
• •
P (H) /P (H c ) = 3 c of the test market: P (SH) = 0.9 P (SH ) = 0.2
If it were known that the test market is favorable, we should have
P (HS) P (SH) P (H) 0.9 = = · 3 = 13.5 P (H c S) P (SH c ) P (H c ) 0.2
(5.37)
Unfortunately, it is not feasible to know with certainty the state of the test market. The company decision makers engage two market survey companies to make independent surveys of the test market. The reliability of the companies may be expressed as follows. Let
: :
E1 E2
be the event the rst company reports a favorable test market. be the event the second company reports a favorable test market.
119
On the basis of previous experience, the reliability of the evidence about the test market (the symptom) is expressed in the following conditional probabilities.
P (E1 S) = 0.9 P (E1 S c ) = 0.3
national market is favorable, given this result?
P (E2 S) = 0.8 B (E2 S c ) = 0.2
(5.38)
Both survey companies report that the test market is favorable.
What is the probability the
SOUTION
The two survey rms work in an operationally independent manner. The report of either company is unaected by the work of the other. Also, each report is aected only by the condition of the
test market regardless of what the national market may be. According to the discussion above, we should be able to assume
{E1 , E2 , H} ci S
and
{E1 , E2 , H} ci S c
(5.39)
We may use a pattern similar to that in Example 2, as follows:
P (H) P (SH) P (E1 S) P (E2 S) + P (S c H) P (E1 S c ) P (E2 S c ) P (HE1 E2 ) = · c E E ) c ) P (SH c ) P (E S) P (E S) + P (S c H c ) P (E S c ) P (E S c ) P (H P (H 1 2 1 2 1 2 =3· 327 0.9 · 0.9 · 0.8 + 0.1 · 0.3 · 0.2 = ≈ 10.22 0.2 · 0.9 · 0.8 + 0.8 · 0.3 · 0.2 32
(5.40)
(5.41)
In terms of the posterior probability, we have
P (HE1 E2 ) =
We note that the odds favoring
327/32 327 32 = =1− ≈ 0.91 1 + 327/32 359 359
(5.42)
as compared with the odds favoring
H, given positive indications from both survey companies, is 10.2 H, given a favorable test market, of 13.5. The dierence
reects the residual uncertainty about the test market after the market surveys. Nevertheless, the results of the market surveys increase the odds favoring a satisfactory market from the prior 3 to 1 to a posterior 10.2 to 1. In terms of probabilities, the market surveys increase the likelihood
of a favorable market from the original
P (H) = 0.75
to the posterior
P (HE1 E2 ) = 0.91.
The
conditional independence of the results of the survey makes possible direct use of the data.
5.2.2 A classication problem
A population consists of members of two subgroups. It is desired to formulate a battery of questions to aid in identifying the subclass membership of randomly selected individuals in the population. The questions are designed so that for each individual the answers are independent, in the sense that the answers to any subset of these questions are not aected by and do not aect the answers to any other subset of the questions. The answers are, however, aected by the subgroup membership. Thus, our treatment of conditional idependence suggests that it is reasonable to supose the answers are conditionally independent, given the subgroup membership. Consider the following numerical example.
Example 5.11: A classication problem
A sample of 125 subjects is taken from a population which has two subgroups. membership of each sub ject in the sample is known. The subgroup Each individual is asked a battery of ten
questions designed to be independent, in the sense that the answer to any one is not aected by the answer to any other. The subjects answer independently. Data on the results are summarized in the following table:
120
CHAPTER 5. CONDITIONAL INDEPENDENCE
GROUP 1 (69 members)
Q 1 2 3 4 5 6 7 8 9 10 Yes 42 34 15 19 22 41 9 40 48 20 No 22 27 45 44 43 13 52 26 12 37 Unc. 5 8 9 6 4 15 8 3 9 12
GROUP 2 (56 members)
Yes 20 16 33 31 23 14 31 13 27 35 No 31 37 19 18 28 37 17 38 24 16 Unc. 5 3 4 7 5 5 8 5 5 5
Table 5.5
Assume the data represent the general population consisting of these two groups, so that the data may be used to calculate probabilities and conditional probabilities. Several persons are interviewed. The result of each interview is a prole of answers to the
questions. The goal is to classify the person in one of the two subgroups on the basis of the prole of answers. The following proles were taken.
• • •
Y, N, Y, N, Y, U, N, U, Y. U N, N, U, N, Y, Y, U, N, N, Y Y, Y, N, Y, U, U, N, N, Y, Y
Classify each individual in one of the subgroups.
SOLUTION
Let
G1 =
the event the person selected is from group 1, and
G2 = Gc = 1
the event the person
selected is from group 2. Let
Ai = the event the answer to the i th question is Yes Bi = the event the answer to the i th question is No Ci = the event the answer to the i th question is Uncertain The data are taken to mean P (A1 G1 ) = 42/69, P (B3 G2 ) = 19/56, etc. The prole Y, N, Y, N, Y, U, N, U, Y. U corresponds to the event E = A1 B2 A3 B4 A5 C6 B7 C8 A9 C10
We utilize the ratio form of Bayes' rule to calculate the posterior odds
P (G1 E) P (EG1 ) P (G1 ) = · P (G2 E) P (EG2 ) P (G2 )
(5.43)
If the ratio is greater than one, classify in group 1; otherwise classify in group 2 (we assume that a ratio exactly one is so unlikely that we can neglect it). Because of conditional independence, we are able to determine the conditional probabilities
P (EG1 ) =
42 · 27 · 15 · 44 · 22 · 15 · 52 · 3 · 48 · 12 6910 29 · 37 · 33 · 18 · 23 · 5 · 17 · 5 · 24 · 5 5610
and
(5.44)
P (EG2 ) =
(5.45)
121
The odds
P (G1 ) /P (G2 ) = 69/56.
We nd the posterior odds to be
P (G1 E) 42 · 27 · 15 · 44 · 22 · 15 · 52 · 3 · 48 · 12 569 = 5.85 = · P (G2 E) 29 · 37 · 33 · 18 · 23 · 5 · 17 · 5 · 24 · 5 699
The factor
(5.46)
569 /699
comes from multiplying
5610 /6910
by the odds
P (G1 ) /P (G2 ) = 69/56.
Since
the resulting posterior odds favoring Group 1 is greater than one, we classify the respondent in group 1. While the calculations are simple and straightforward, they are tedious and error prone. To
make possible rapid and easy solution, say in a situation where successive interviews are underway, we have several mprocedures for performing the calculations. Answers to the questions would
normally be designated by some such designation as Y for yes, N for no, and U for uncertain. In order for the mprocedure to work, these answers must be represented by numbers indicating the appropriate columns in matrices
A and B. Thus, in the example under consideration, each Y must
be translated into a 1, each N into a 2, and each U into a 3. The task is not particularly dicult, but it is much easier to have MATLAB make the translation as well as do the calculations. following twostage approach for solving the problem works well. The rst mprocedure The
oddsdf
sets up the frequency information.
The next mprocedure
odds
calculates the odds for a given prole. The advantage of splitting into two mprocedures is that we can set up the data once, then call repeatedly for the calculations for dierent proles. As always, it is necessary to have the data in an appropriate form. The following is an example in which the data are entered in terms of actual frequencies of response.
%~file~oddsf4.m %~Frequency~data~for~classification A~=~[42~22~5;~34~27~8;~15~45~9;~19~44~6;~22~43~4; ~~~~~41~13~15;~9~52~8;~40~26~3;~48~12~9;~20~37~12]; B~=~[20~31~5;~16~37~3;~33~19~4;~31~18~7;~23~28~5; ~~~~~14~37~5;~31~17~8;~13~38~5;~27~24~5;~35~16~5]; disp('Call~for~oddsdf')
Example 5.12: Classication using frequency data
oddsf4~~~~~~~~~~~~~~%~Call~for~data~in~file~oddsf4.m Call~for~oddsdf~~~~~%~Prompt~built~into~data~file oddsdf~~~~~~~~~~~~~~%~Call~for~mprocedure~oddsdf Enter~matrix~A~of~frequencies~for~calibration~group~1~~A Enter~matrix~B~of~frequencies~for~calibration~group~2~~B Number~of~questions~=~10 Answers~per~question~=~3 ~Enter~code~for~answers~and~call~for~procedure~"odds" y~=~1;~~~~~~~~~~~~~~%~Use~of~lower~case~for~easier~writing n~=~2; u~=~3; odds~~~~~~~~~~~~~~~~%~Call~for~calculating~procedure Enter~profile~matrix~E~~[y~n~y~n~y~u~n~u~y~u]~~~%~First~profile Odds~favoring~Group~1:~~~5.845 Classify~in~Group~1 odds~~~~~~~~~~~~~~~~%~Second~call~for~calculating~procedure Enter~profile~matrix~E~~[n~n~u~n~y~y~u~n~n~y]~~~%~Second~profile Odds~favoring~Group~1:~~~0.2383 Classify~in~Group~2
122
CHAPTER 5. CONDITIONAL INDEPENDENCE
odds~~~~~~~~~~~~~~~~%~Third~call~for~calculating~procedure Enter~profile~matrix~E~~[y~y~n~y~u~u~n~n~y~y]~~~%~Third~profile Odds~favoring~Group~1:~~~5.05 Classify~in~Group~1
The principal feature of the mprocedure odds is the scheme for selecting the numbers from the and
B
A
matrices.
If
E = [yynyuunnyy],
used internally.
then the coding translates this into the actual numerical
matrix
[1 1 2 1 3 3 2 2 1 1]
elements of
E. Thus
Then
A (:, E)
is a matrix with columns corresponding to
e~=~A(:,E) e~=~~~42~~~~42~~~~22~~~~42~~~~~5~~~~~5~~~~22~~~~22~~~~42~~~~42 ~~~~~~34~~~~34~~~~27~~~~34~~~~~8~~~~~8~~~~27~~~~27~~~~34~~~~34 ~~~~~~15~~~~15~~~~45~~~~15~~~~~9~~~~~9~~~~45~~~~45~~~~15~~~~15 ~~~~~~19~~~~19~~~~44~~~~19~~~~~6~~~~~6~~~~44~~~~44~~~~19~~~~19 ~~~~~~22~~~~22~~~~43~~~~22~~~~~4~~~~~4~~~~43~~~~43~~~~22~~~~22 ~~~~~~41~~~~41~~~~13~~~~41~~~~15~~~~15~~~~13~~~~13~~~~41~~~~41 ~~~~~~~9~~~~~9~~~~52~~~~~9~~~~~8~~~~~8~~~~52~~~~52~~~~~9~~~~~9 ~~~~~~40~~~~40~~~~26~~~~40~~~~~3~~~~~3~~~~26~~~~26~~~~40~~~~40 ~~~~~~48~~~~48~~~~12~~~~48~~~~~9~~~~~9~~~~12~~~~12~~~~48~~~~48 ~~~~~~20~~~~20~~~~37~~~~20~~~~12~~~~12~~~~37~~~~37~~~~20~~~~20
The
i th entry on the i th column is the count corresponding to the answer to the i th question. A.
The element on the diagonal in the third column of
For
example, the answer to the third question is N (no), and the corresponding count is the third entry in the N (second) column of
A (:, E)
is
the third element in that column, and hence the desired third entry of the N column. By picking out the elements on the diagonal by the command diag(A(:,E)), we have the desired set of counts corresponding to the prole. The same is true for diag(B(:,E)). Sometimes the data are given in terms of conditional probabilities and probabilities. A slight
modication of the procedure handles this case. For purposes of comparison, we convert the problem above to this form by converting the counts in matrices
A and B to conditional probabilities.
We do
this by dividing by the total count in each group (69 and 56 in this case). Also,
P (G1 ) = 69/125 =
0.552
and
P (G2 ) = 56/125 = 0.448.
GROUP 1
Q 1 2 3 4 5 6 7 8 9 10 Yes 0.6087 0.4928 0.2174 0.2754 0.3188 0.5942 0.1304 0.5797 0.6957 0.2899
P (G1 ) = 69/125
No 0.3188 0.3913 0.6522 0.6376 0.6232 0.1884 0.7536 0.3768 0.1739 0.5362 Unc. 0.0725 0.1159 0.1304 0.0870 0.0580 0.2174 0.1160 0.0435 0.1304 0.1739
GROUP 2
Yes 0.3571 0.2857 0.5893 0.5536 0.4107 0.2500 0.5536 0.2321 0.4821 0.6250 No
P (G2 ) = 56/125
Unc. 0.0893 0.0536 0.0714 0.1250 0.0893 0.0893 0.1428 0.0893 0.0893 0.0893
0.5536 0.6607 0.3393 0.3214 0.5000 0.6607 0.3036 0.6786 0.4286 0.2857
123
Table 5.6
These data are in an mle
oddsp4.m.
The modied setup mprocedure
oddsdp uses the condi
tional probabilities, then calls for the mprocedure odds.
Example 5.13: Calculation using conditional probability data
oddsp4~~~~~~~~~~~~~~~~~%~Call~for~converted~data~(probabilities) oddsdp~~~~~~~~~~~~~~~~~%~Setup~mprocedure~for~probabilities Enter~conditional~probabilities~for~Group~1~~A Enter~conditional~probabilities~for~Group~2~~B Probability~p1~individual~is~from~Group~1~~0.552 ~Number~of~questions~=~10 ~Answers~per~question~=~3 ~Enter~code~for~answers~and~call~for~procedure~"odds" y~=~1; n~=~2; u~=~3; odds Enter~profile~matrix~E~~[y~n~y~n~y~u~n~u~y~u] Odds~favoring~Group~1:~~5.845 Classify~in~Group~1
The slight discrepancy in the odds favoring Group 1 (5.8454 compared with 5.8452) can be attributed to rounding of the conditional probabilities to four places. The presentation above rounds the results to 5.845 in each case, so the discrepancy is not apparent. This is quite acceptable, since the discrepancy has no eect on the results.
5.3 Problems on Conditional Independence3
Exercise 5.1
Suppose
(Solution on p. 129.)
and
{A, B} ci C
{A, B} ci C c , P (C) = 0.7,
and
P (AC) = 0.4 P (BC) = 0.6 P (AC c ) = 0.3 P (BC c ) = 0.2
Show whether or not the pair
(5.47)
{A, B}
is independent.
Exercise 5.2
Suppose
(Solution on p. 129.)
and
{A1 , A2 , A3 } ci C
ci C c ,
with
P (C) = 0.4,
and
P (Ai C) = 0.90, 0.85, 0.80
Determine the posterior odds
P (Ai C c ) = 0.20, 0.15, 0.20 P (CA1 Ac A3 ) /P 2 (C
c
for
i = 1, 2, 3,
respectively
(5.48)
A1 Ac A3 ). 2
(Solution on p. 129.)
Each has a good chance to break
Exercise 5.3
Five world class sprinters are entered in a 200 meter dash.
the current track record. There is a thirty percent chance a late cold front will move in, bringing conditions that adversely aect the runners. Otherwise, conditions are expected to be favorable for an outstanding race. Their respective probabilities of breaking the record are:
•
3 This
Good weather (no front): 0.75, 0.80, 0.65, 0.70, 0.85
content is available online at <http://cnx.org/content/m24205/1.4/>.
124
CHAPTER 5. CONDITIONAL INDEPENDENCE
•
Poor weather (front in): 0.60, 0.65, 0.50, 0.55, 0.70
The performances are (conditionally) independent,
given good weather,
and also,
given poor
weather. What is the probability that three or more will break the track record?
Hint.
If
B3
is the event of three or more,
P (B3 ) = P (B3 W ) P (W ) + P (B3 W c ) P (W c ).
(Solution on p. 129.)
The alarm is given if three or more of
Exercise 5.4
A device has ve sensors connected to an alarm system.
the sensors trigger a switch. If a dangerous condition is present, each of the switches has high (but not unit) probability of activating; if the dangerous condition does not exist, each of the switches has low (but not zero) probability of activating (falsely). Suppose condition and Suppose suppose
D=
the event of the dangerous
A =
the event the alarm is activated.
Ei =
the event the
i th
Proper operation consists of
AD
Ac D c .
unit is activated.
Since the switches operate independently, we
{E1 , E2 , E3 , E4 , E5 } ci D
and
ci Dc
(5.49)
Assume the conditional probabilities of the E1 , given D, are 0.91, 0.93, 0.96, 0.87, 0.97, and given Dc , are 0.03, 0.02, 0.07, 0.04, 0.01, respectively. If P (D) = 0.02, what is the probability the alarm system acts properly? Suggestion. Use the conditional independence and the procedure ckn.
Exercise 5.5
dently; however, the likelihood of completion depends upon the weather.
(Solution on p. 129.)
If the weather is very
Seven students plan to complete a term paper over the Thanksgiving recess. They work indepen
pleasant, they are more likely to engage in outdoor activities and put o work on the paper. Let be the event the during the
Ei i th student completes his or her paper, Ak be the event that k or more complete recess, and W be the event the weather is highly conducive to outdoor activity. It is
{Ei : 1 ≤ i ≤ 7} ci W
and
reasonable to suppose
ci W c .
Suppose
P (Ei W ) = 0.4, 0.5, 0.3, 0.7, 0.5, 0.6, 0.2 P (Ei W c ) = 0.7, 0.8, 0.5, 0.9, 0.7, 0.8, 0.5
respectively, and their papers and
(5.50)
(5.51)
P (W ) = 0.8. P (A5 ) that ve
Determine the probability or more nish.
P (A4 )
that four our more complete
Exercise 5.6
(Solution on p. 129.)
A manufacturer claims to have improved the reliability of his product. Formerly, the product had probability 0.65 of operating 1000 hours without failure. The manufacturer claims this probability is now 0.80. A sample of size 20 is tested. Determine the odds favoring the new probability for How many
various numbers of surviving units under the assumption the prior odds are 1 to 1. survivors would be required to make the claim creditable?
Exercise 5.7
(Solution on p. 130.)
A real estate agent in a neighborhood heavily populated by auent professional persons is working with a customer. The agent is trying to assess the likelihood the customer will actually buy. His experience indicates the following: if
H
is the event the customer buys,
is a professional with good income, and
E
S
is the event the customer
is the event the customer drives a prestigious car, then
P (S) = 0.7 P (SH) = 0.90 P (SH c ) = 0.2 P (ES) = 0.95 P (ES c ) = 0.25
reasonable to suppose
(5.52)
Since buying a house and owning a prestigious car are not related for a given owner, it seems
P (EHS) = P (EH c S)
and
P (EHS c ) = P (EH c S c ).
The customer
drives a Cadillac. What are the odds he will buy a house?
125
Exercise 5.8
geophysical survey.
(Solution on p. 130.)
On the basis of past experience, the decision makers feel the odds are about Various other probabilities can be assigned on the basis of past
In deciding whether or not to drill an oil well in a certain location, a company undertakes a
four to one favoring success. experience. Let
• H be the event that a well would be successful • S be the event the geological conditions are favorable • E be the event the results of the geophysical survey are
The initial, or prior, odds are
positive
P (H) /P (H c ) = 4. P (SH c ) = 0.20
Previous experience indicates
P (SH) = 0.9
P (ES) = 0.95
P (ES c ) = 0.10
(5.53)
Make reasonable assumptions based on the fact that the result of the geophysical survey depends upon the geological formations and not on the presence or absence of oil. The result of the survey is favorable. Determine the posterior odds
P (HE) /P (H c E).
(Solution on p. 130.)
Exercise 5.9
A software rm is planning to deliver a custom package. Past experience indicates the odds are at least four to one that it will pass customer acceptance tests. As a check, the program is sub jected to two dierent benchmark runs. Both are successful. Given the following data, what are the odds favoring successful operation in practice? Let
• • • •
H be the event the performance is satisfactory S be the event the system satises customer acceptance tests E1 be the event the rst benchmark tests are satisfactory. E2 be the event the second benchmark test is ok.
{H, E1 , E2 } ci S
and
Under the usual conditions, we may assume
ci S c .
Reliability data show
P (HS) = 0.95, P (HS c ) = 0.45 P (E1 S) = 0.90 P (E1 S c ) = 0.25 P (E2 S) = 0.95 P (E2 S c ) = 0.20
Determine the posterior odds
(5.54)
(5.55)
P (HE1 E2 ) /P (H c E1 E2 ).
(Solution on p. 130.)
Exercise 5.10
A research group is contemplating purchase of a new software package to perform some specialized calculations. The systems manager decides to do two sets of diagnostic tests for signicant bugs that might hamper operation in the intended application. The tests are carried out in an operationally independent manner. The following analysis of the results is made.
• • • •
H = the event the program is satisfactory for the intended S = the event the program is free of signicant bugs E1 = the event the rst diagnostic tests are satisfactory E2 = the event the second diagnostic tests are satisfactory
application
Since the tests are for the presence of bugs, and are operationally independent, it seems reasonable to assume the
{H, E1 , E2 } ci S and {H, E1 , E2 } ci S c . Because of the reliability of the software company, manager thinks P (S) = 0.85. Also, experience suggests P (HS) = 0.95 P (HS c ) = 0.30 P (E1 S) = 0.90 P (E1 S c ) = 0.20 P (E2 S) = 0.95 P (E2 S c ) = 0.25
126
CHAPTER 5. CONDITIONAL INDEPENDENCE
Table 5.7
Determine the posterior odds favoring
H
if results of both diagnostic tests are satisfactory.
Exercise 5.11
A company is considering a new product now undergoing eld testing. Let
(Solution on p. 131.)
• H be the event the product is introduced and successful • S be the event the R&D group produces a product with the desired characteristics. • E be the event the testing program indicates the product is satisfactory
The company assumes
P (S) = 0.9
and the conditional probabilities
P (HS) = 0.90 P (HS c ) = 0.10
to suppose
P (ES) = 0.95 P (ES c ) = 0.15 P (HE) /P (H c E).
(5.56)
Since the testing of the merchandise is not aected by market success or failure, it seems reasonable
{H, E} ci S
and
ci S c .
The eld tests are favorable. Determine
Exercise 5.12
(Solution on p. 131.)
She
Martha is wondering if she will get a ve percent annual raise at the end of the scal year.
understands this is more likely if the company's net prots increase by ten percent or more. These will be inuenced by company sales volume. Let
• H = the event she will get the raise • S = the event company prots increase by ten percent or more • E = the event sales volume is up by fteen percent or more
Since the prospect of a raise depends upon prots, not directly on sales, she supposes and
{H, E} ci S
{H, E} ci S c .
She thinks the prior odds favoring suitable prot increase is about three to one.
Also, it seems reasonable to suppose
P (HS) = 0.80 P (HS c ) = 0.10
Martha will get her raise?
P (ES) = 0.95 P (ES c ) = 0.10
(5.57)
End of the year records show that sales increased by eighteen percent.
What is the probability
Exercise 5.13
independent advice of three specialists. Let
(Solution on p. 131.)
A physician thinks the odds are about 2 to 1 that a patient has a certain disease.
H
He seeks the
be the event the disease is present, and
A, B, C
be
the events the respective consultants agree this is the case. majority. suppose
The physician decides to go with the
Since the advisers act in an operationally independent manner, it seems reasonable to and
{A, B, C}ciH
ciH c .
Experience indicates
P (AH) = 0.8, P (BH) = 0.7, P (CH) = 0.75 P (Ac H c ) = 0.85, P (B c H c ) = 0.8, P (C c H c ) = 0.7
present, and does not if two or more think the disease is not present)?
(5.58)
(5.59)
What is the probability of the right decision (i.e., he treats the disease if two or more think it is
Exercise 5.14
(Solution on p. 131.)
A software company has developed a new computer game designed to appeal to teenagers and young adults. It is felt that there is good probability it will appeal to college students, and that if it appeals to college students it will appeal to a general youth market. To check the likelihood of appeal to college students, it is decided to test rst by a sales campaign at Rice and University of Texas, Austin. The following analysis of the situation is made.
20 Table 5.) The subgroup Each individual is asked a battery of ten A sample of 150 sub jects is taken from a population which has two subgroups. Should the well be drilled? Exercise 5. E1 .) is the event that an oil deposit is present in an area.80. and In a region in the Gulf Coast area.60) P (HE1 E2 ) P (H c E1 E2 ) . (Solution on p. not the presence of oil. experience suggests P (E1 S) = 0.90 P (E2 S c ) = 0.1. Because of managers think P (S) = 0.15 salt domes.25 c Determine the posterior odds favoring H if sales results are satisfactory at both schools. E2 } ci S and ci S c . S is the event of a salt Company executives believe the odds favoring oil in the area is at least 1 in 10. Let E1 . The subjects answer independently. E2 } ci S c . E1 . and the tests are carried out in an operationally independent manner. the {H. E1 .9 and P (SH c ) = 0. 131. Austin good Since the tests are for the reception are at two separate universities and are operationally independent. Exercise 5. oil deposits are highly likely to be associated with underground H dome in the area.30 c P (E2 S) = 0.127 • • • • H = the event the sales to the general market will be S = the event the game appeals to college students E1 = the event the sales are good at Rice E2 = the event the sales are good at UT. in the sense that the answer to any one is not aected by the answer to any other. 131.05 Determine the posterior odds P (E2 S) = 0.16 membership of each sub ject in the sample is known.95 P (HS ) = 0.95 P (E2 S c ) = 0. Data on the results are summarized in the following table: GROUP 1 (84 members) Q 1 2 3 4 5 6 7 8 9 10 Yes 51 42 19 24 27 49 16 47 55 24 No 26 32 54 53 52 19 59 32 17 53 Unc 7 10 11 7 5 16 9 5 12 7 GROUP 2 (66 members) Yes 27 19 39 38 28 19 37 19 27 39 No 34 43 22 19 33 41 21 42 33 21 Unc 5 4 5 9 5 6 8 5 6 6 . Data on the reliability of the surveys yield the following probabilities P (E1 S) = 0. It decides to conduct two independent geophysical surveys for the presence of a salt dome. it seems reasonable to assume previous experience in game sales.95 P (E1 S c ) = 0. questions designed to be independent. E2 } ci S and {H. If (Solution on p. Also.10 (5. E2 be the events the surveys indicate a salt dome. Because the surveys are tests for the geological structure.8 its P (HS) = 0. it seems reasonable to assume {H. experience indicates P (SH) = 0.90 P (E1 S ) = 0.
132. y.63 0.06 0.32 P (G2 ) = 0. u. n.38 0. y.29 P (G1 ) = 0. n.38 0. y.58 0. u.15 0. u.42 0. n.23 0. n. above.17 follows (probabilities are rounded to two decimal places). n.41 0.29 0. n.08 0.) The data of Exercise 5.06 0. y. u. n.61 0.08 0.32 0. u.31 0. as GROUP 1 Q 1 2 3 4 5 6 7 8 9 10 Yes 0. u. n. classify each individual in one of the subgroups i. y. y.65 0.57 0. y.56 0.33 0.62 0.128 CHAPTER 5.62 0. u. n. n. n.50 0. The result of each interview is a prole of answers to the questions. y. y.09 0. u.59 No 0.09 0. y.29 0. n. i. y Exercise 5. u.14 0.06 0. n.19 0. n. Several persons are interviewed.08 0.50 0. u.12 0. n.20 0.56 0.29 0.19 0.12 0. n.63 0.29 0.63 Unc 0.08 0. are converted to conditional probabilities and probabilities. y. CONDITIONAL INDEPENDENCE Table 5.44 Unc 0.09 Table 5. u ii. (Solution on p.08 0. y.56 No 0.16. n.9 Assume the data represent the general population consisting of these two groups.11 0.70 0. y. u ii. n.41 0. u. y.65 0. n. y .50 0.51 0. y. u. y. y. y iii.23 0. The goal is to classify the person in one of the two subgroups For the following proles.13 0. y iii.64 0. y.08 GROUP 2 Yes 0. y. n.10 For the following proles classify each individual in one of the subgroups.29 0. y.59 0. n. n.32 0. so that the data may be used to calculate probabilities and conditional probabilities.08 0.
2*0.4 (p.4 0.3))*0.1*[7 8 5 9 7 8 5]. 123) P (A1 C) P (Ac C) P (A3 C) P (C) P (CA1 Ac A3 ) 2 2 c A ) = P (C c ) • P (A C c ) P (Ac C c ) P (A C c ) P (C c A1 A2 3 1 3 2 = 0.6 0. 123) PW = 0.01*[75 80 65 70 85].63) .5)*0.80 and E2 be the event the probability is 0. Assume P (E1 ) /P (E2 ) = 1.01*[60 65 50 55 70].4)*0.61) (5.15 • 0.98 P = 0.2*0.3 P = 0. 124) P1 = 0.2 (p.1*[4 5 3 7 5 6 2].6 (p. = 0.3)*0.2482 Solution to Exercise 5.4993 = ckn(PW.9997 Solution to Exercise 5. PWc = 0.2 = 0.3)*0.4800 PA*PB ans = 0.5)*0.3700 PB = 0. 124) E1 be the event the probability is 0.5 (p.12 0.3)*0. 124) PWc PA4 PA4 PA5 PA5 Let PW = 0.01*[3 2 7 4 1].2 = 0.ckn(P2.02 + (1 .3 PA = 0.20 51 (5. 123) P (A) = P (AC) P (C) + P (AC c ) P (C c ) .20 • 0.4*0.3 PAB = 0.80 108 • = = 2.3 PB = 0.7 + 0.129 Solutions to Exercises in Chapter 5 Solution to Exercise 5.65.8 + ckn(PWc.1 (p. P (AC) P (BC) P (C) + P (AC c ) P (BC c ) P (C c ). P (E1 ) P (Sn = kE1 ) P (E1 Sn = k) = • P (E2 Sn = k) P (E2 ) P (Sn = kE2 ) (5. P (B) = P (BC) P (C) + P (BC c ) P (C c ).01*[91 93 96 87 97]. P = ckn(PW.7 + ckn(PWc. and P (AB) = PA = 0.4)*0.8353 Solution to Exercise 5.1776 PAB = 0.3 (p.1860 % PAB not equal PA*PB.9 • 0.4*0. = ckn(PW. not independent Solution to Exercise 5.6*0.7 + 0.7 + 0.8 + ckn(PWc.85 • 0. P2 = 0.62) Solution to Exercise 5.6*0.3*0.3*0. P = ckn(P1.
95 • 0.10 (5.72) (5.15 • 0.64) P (H) P (SH) P (HS) = c S) P (H P (H c ) P (SH c ) P (S) = P (H) P (SH) + [1 − P (H)] P (SH c ) P (H) = P (S) − P (SH c ) = 5/7 P (SH) − P (SH c ) which implies (5.95•0.66) Solution to Exercise 5.69) = P (S)P (HS)P (E1 S)P (E2 S)+P (S c )P (HS c )P (E1 S c )P (E2 S c ) P (S)P (H c S)P (E1 S)P (E2 S)+P (S c )P (H c S c )P (E1 S c )P (E2 S c ) (5.0..20 • 0.95 + 0.20•0. CONDITIONAL INDEPENDENCE k = 1:20...05 • 0.71) Solution to Exercise 5.10 = 12.9 45 = • = c S) P (H 2 0.5337 20.0000 6..45•0.80•0.20 = = 16.6372 15.80 • 0.130 CHAPTER 5.0000 1. 125) P (HE) P (H) P (SH) P (ES) + P (S c H) P (ES c ) = • P (H c E) P (H c ) P (SH c ) P (ES) + P (S c H c ) P (ES c ) =4• Solution to Exercise 5.10 • 0.74) .6111 Solution to Exercise 5.0000 13.95+0.64811 (5.70) = 0..8148 0.90•0.k). (5.7121 19.2958 14.3663 18.20 0. odds = ibinom(20.0000 0.85 • 0.25 ∗ 0.65. E} ci S and ci S c .25•0. 124) Assumptions amount to {H.7 (p.0000 29.odds]') .10 (p.55•0....73) P (HE1 E2 ) 0.70 • 0./ibinom(20. 125) P (HE1 E2 ) P (HE1 E2 S) + P (HE1 E2 S c ) = c E E ) P (H P (H c E1 E2 S) + P (H c E1 E2 S c ) 1 2 P (HE1 E2 S) = P (S) P (HS) P (E1 SH) P (E2 SHE1 ) = P (S) P (HS) P (E1 S) P (E2 S) with similar expressions for the other terms.20 (5.k).20 = 16.0000 63.85 • 0.80•0.9558 17..95+0. 125) (5.0.95 + 0.67) 0.6555 P (H c E1 E2 ) 0.3723 % Need at least 15 or 16 successes 16.95 + 0.80.8 (p.90 • 0.15 • 0.90 • 0.25•0.90•0.13.90 • 0.0000 2.95 + 0.25 • 0.05•0.65) so that P (HS) 5 0.20•0.68) P (HE1 E2 ) P (HE1 E2 S) + P (HE1 E2 S c ) = P (H c E1 E2 ) P (H c E1 E2 S) + P (H c E1 E2 S c ) (5.2 4 (5. disp([k.30 • 0. (5..9 (p.0000 0.
7879 0.78) Solution to Exercise 5.75 • 0.20•0. PHc = 0.10 • 0.15 = 7. 126) P (HE1 E2 S) + P (HE1 E2 S c ) P (HE1 E2 ) = P (H c E1 E2 ) P (H c E1 E2 S) + P (H c E1 E2 S c ) (5. 55 17 12.84) % file npr05_16.95•0.8447 (5.76) Solution to Exercise 5.10 Solution to Exercise 5.25: mpr05_16) % Data for Exercise~5.05 • 0. 19 43 4.2)*(1 .20 • 0. 24 53 7].30•0.10 = • = 0.90 • 0.20•0.90 • 0.25 0. .25 = 15.81) Solution to Exercise 5. B = [27 34 5. 38 19 9.25 • 0. 27 52 5.2)*pH + ckn(PHc.8556 P (H c E1 E2 ) 10 0.95+0. 19 54 11.90•0.82) (5. 42 32 10.95 + 0. 127) (5. 24 53 7.90 + 0.70•0.131 Solution to Exercise 5.10 • 0.95 + 0.95 + 0. 16 59 9.77) (5.90•0.15 (p.14 (p.pH) P = 0.10 = 3.10 • 0.75 • 0.79) = P (S)P (HS)P (E1 S)P (E2 S)+P (S c )P (HS c )P (E1 S c )P (E2 S c ) P (S)P (H c S)P (E1 S)P (E2 S)+P (S c )P (H c S c )P (E1 S c )P (E2 S c ) (5.8.15 (5.10 • 0. P = ckn(PH.90 • 0.80•0.10 (5.01*[80 70 75].8577 Solution to Exercise 5. 39 22 5. 28 33 5.16 (p. 49 19 16.95+0. (5.13 (p.05•0. 126) P (HE) P (S) P (HS) P (ES) + P (S c ) P (HS c ) P (ES c ) = c E) P (H P (S) P (H c S) P (ES) + P (S c ) P (H c S c ) P (ES c ) = 0.90 • 0.20•0.90 • 0.05 • 0.10 • 0.80•0.01*[85 80 70].95 • 0.90 • 0.4697 0.83) P (HE1 E2 ) 1 0.20•0.75) (5. 127) P (HE1 E2 ) P (HE1 E2 S) + P (HE1 E2 S c ) = P (H c E1 E2 ) P (H c E1 E2 S) + P (H c E1 E2 S c ) P (HE1 E2 S) = P (H) P (SH) P (E1 SH) P (E2 SHE1 ) = P (H) P (SH) P (E1 S) P (E2 S) with similar expressions for the other terms.80) = 0.95 + 0.11 (p.80 • 0.9 • 0. 47 32 5.m (Section~17.1 • 0.10 • 0.95 • 0.90 + 0. pH = 2/3.12 (p. 126) P (HE) P (S) P (HS) P (ES) + P (S c ) P (HS c ) P (ES c ) = P (H c E) P (S) P (H c S) P (ES) + P (S c ) P (H c S c ) P (ES c ) = 0. 126) PH = 0.25 • 0.16 A = [51 26 7.
57 0.59 0.32 0.62 0. 27 33 6.41 0.32 0.62 0.743 Classify in Group 1 odds Enter profile matrix E [n n u n y y u n n y] Odds favoring Group 1: 0.12 0.286 Classify in Group 1 Solution to Exercise 5.8. disp('Call for oddsdf') npr05_16 (Section~17.25: mpr05_16) Call for oddsdf oddsdf Enter matrix A of frequencies for calibration group 1 A Enter matrix B of frequencies for calibration group 2 B Number of questions = 10 Answers per question = 3 Enter code for answers and call for procedure "odds" y = 1.31 0.63 0.06 0.19 0. 19 42 5. 37 21 8.12 0.63 0.56 0.70 0.8.29 0.50 0.26: npr05_17) % Data for Exercise~5.11 0.14 0.2693 Classify in Group 2 odds Enter profile matrix E [y y n y u u n n y y] Odds favoring Group 1: 5.42 0.13 0.m (Section~17. A = [0.08 0.29 0.56 0.33 0.51 0.08 0. odds Enter profile matrix E [y n y n y u n u y u] Odds favoring Group 1: 3.17 PG1 = 84/150.65 0.64 0.132 CHAPTER 5.8.38 0.29 0. u = 3.29 0.06 0.65 0.38 0. n = 2.19 0.15 0. 128) npr05_17 (Section~17.29 0.08 0.29 0.17 (p. PG2 = 66/125.08 0.58 0.64 0.20 0.23 0. 39 21 6].23 0.08 .08].50 0.08 0. CONDITIONAL INDEPENDENCE 19 41 6. B = [0.09 0.26: npr05_17) % file npr05_17.06 0.61 0.
disp('Call for oddsdp') Call for oddsdp oddsdp Enter matrix A of conditional probabilities for Group 1 Enter matrix B of conditional probabilities for Group 2 Probability p1 an individual is from Group 1 PG1 Number of questions = 10 Answers per question = 3 Enter code for answers and call for procedure "odds" y = 1.09].486 Classify in Group 1 odds Enter profile matrix E [n n u n y y u n n y] Odds favoring Group 1: 0. u = 3. n = 2.50 0.09 0.162 Classify in Group 1 A B . odds Enter profile matrix E [y n y n y u n u y u] Odds favoring Group 1: 3.59 0.133 0.32 0.2603 Classify in Group 2 odds Enter profile matrix E [y y n y u u n n y y] Odds favoring Group 1: 5.41 0.
CONDITIONAL INDEPENDENCE .134 CHAPTER 5.
Chapter 6 Random Variables and Probabilities 6.e. or carried. Probability associates with an event a number which indicates the likelihood of the occurrence of that event An event is modeled as the set of those possible outcomes of an experiment which satisfy a property or proposition characterizing the event. we cannot write a formula for a random variable X.1. Random variable X. in a coin ipping experiment.2 Random variables as functions We consider in this chapter real random variables (i. as a mapping from basic space Ω to the real line ω a value Probably the most useful for our purposes is as a assigns to each element t. If the outcome is observed as a physical quantity.8/>. 6. In a Bernoulli trial.1). point There are various ways of characterizing a function. it is convenient to assign a number to each outcome. Recall that a realvalued function on a on the real line) is characterized by the assignment of a real number (argument) in the domain. we extend the notion to vectorvalued random quantites.1. In many nonnumerical cases. Mappings and inverse mappings mapping R. The experiment is performed. random variables share some important general properties of functions which play an essential role in determining their usefulness. One could assign a distinct number to each card in a deck Observations of the result of selecting a card could be recorded in terms of individual numbers. We nd the mapping diagram of Figure 1 extremely useful essential patterns. the associated number becomes a property of the outcome. The fundamental idea of a real random variable is the assignment of a real number to each elementary outcome whose domain is ω in the basic space Ω. In the chapter "Random Vectors and Joint Distributions" (Section 8. into the image may have the same image point. 135 . realvalued random variables).org/content/m23260/1. we may be interested in the number of successes in a sequence of of playing cards. a head may be represented by a 1 and a tail by a 0. n component trials. In each case. For example. although several ω The ob ject point ω is mapped.1 Introduction on any trial. 1 This content is available online at <http://cnx. it is often possible to write a formula or otherwise state a rule describing the assignment of the value to each argument. In a sequence of trials. y to each Ω and whose range is a subset of the real line domain (say an interval element x I R. the size of that quantity (in prescribed units) is the entity actually observed. Except in special cases. a success may be represented by a 1 and a failure by a 0. Often. each outcome is characterized by a number. Such an assignment amounts to determining a function X. However.1 Random Variables and Probabilities1 6. Each ω is mapped into exactly one t. t = X (ω). For a realvalued function of a real variable. from the in visualizing the domain Ω to the codomain R..
136 CHAPTER 6.2: E is the inverse image X −1 (M ). If X does not take a value in M. it may be assigned a probability.1: The basic mapping diagram t = X (ω). RANDOM VARIABLES AND PROBABILITIES Figure 6. inverse mapping X −1 and the inverse images it produces. (the set of all possible values of X ). the results of measure theory ensure that we may make the assumption for any X and any subset M of the real line likely to be encountered in practice. the inverse image is the entire basic space Ω. is an event for each M. Let M be a set of numbers on the real line. we mean the set of all those ω ∈ Ω which are mapped into M by X (see Figure 2). Formally we write Associated with a function as a mapping are the X X −1 (M ) = {ω : X (ω) ∈ M } Now we assume the set (6. a subset of Ω.1) X (M ). A detailed examination of that measure theory. . As an event. The set X −1 (M ) is the event that X takes a value in M. By the inverse image of M under the mapping X. −1 assertion is a topic in Figure 6. If M includes the range of X. the inverse image is the empty set (impossible event). Fortunately.
Consider a sequence of variable whose value is the number of successes in the sequence of n Bernoulli trials. . and the sure event Ω. 1. X take on the value 1 is the set Now X takes on only two values. X −1 i∈J Mi = i∈J Ei and X −1 i∈J Mi = i∈J Ei (6.3: Preservation of set operations by inverse images. the impossible event ∅. but 0 is not. Mi . t and that the inverse image of M Central to the structure are the facts that each element is the set of all ω is mapped those ω which are mapped into Figure 6. Formal proofs amount to careful reading of the notation.3. are sets of real numbers. with probability p of success.4) Examination of simple graphical examples exhibits the plausibility of these patterns.1: Some illustrative examples. then X −1 (M c ) = E c . we note a general property of inverse images. Ei . {ω : X (ω) = 1} = X −1 ({1}) = E so that (6. 2. Consider any set M. then X (M ) = E c If 1 is in M. P (X = 0) = 1−p.3) Before considering further examples. We state it in terms of a random variable.137 Example 6.2: Bernoulli trials and the binomial distribution) P (Sn = k) = C (n. k) pk (1 − p) n−k 0≤k≤n (6. but 1 is not. which maps Ω Ω to the real line (see Figure 3).2) P ({ω : X (ω) = 1}) = p. If M. This rather ungainly notation is shortened to P (X = 1) = p. Similarly. then X −1 (M ) = E −1 If both 1 and 0 are in M. If neither 1 nor 0 is in M. according to the analysis in the section "Bernoulli Trials and the Binomial Distribution" (Section 4. X = IE where The event that E is an event with probability p. Then. Preservation of set operations Let X be a mapping from to the real line R. then X −1 (M ) = ∅ −1 If 0 is in M. into only one image point image points in M. Let Sn be the random n component trials. 0 and 1. i ∈ J . with respective inverse images E. then X (M ) = Ω In this case the class of all events X −1 (M ) c consists of event E. its complement E .
When the structure and properties of simple random variables have .5) S10 takes on a value greater than 2 but no greater than 8 i it takes one of the integer values from 3 to 8. random variables which have only a nite set of possible values. In the chapter "Distribution and Density Functions" (Section 7. again. N is a disjoint union of the separate Example 6. This implies that the inverse of a disjoint union of Mi M. This transfer produces what is called the probability distribution for X. Bernoulli trials. Its importance lies in the fact that to determine the likelihood that random quantity X will take on a value in set M.7) This means that PX has the properties of a probability measure dened on the subsets of the real line. Now.1.33.138 CHAPTER 6.6927 k=3 (6.2: Events determined by a random variable n = 10 and p = 0. we cannot have two values for Sn (ω)).1). We represent probability as mass distributed on the basic space and visualize this with the aid of general Venn diagrams the probability mass to the real line.3 Mass transfer and induced probability distribution Because of the abstract nature of the basic space and the class of events. which we usually shorten to Ak = {S10 = k}. for any ω . The importance of simple random variables rests on two facts. technical sense. j = k (i.. Let Ak = {ω : S10 (ω) = k}. These are called simple random variables. inverse images. we are limited in the kinds of calculations that can be performed meaningfully with the probabilities on the basic space. RANDOM VARIABLES AND PROBABILITIES An easy.6) 6. we determine how much induced probability mass is in the set M. 6.1. We call X. Suppose we want to determine the probability P (2 < S10 ≤ 8). but important. By the additivity of probability. We turn rst to a special class of random variables. 8 P (2 < S10 ≤ 8) = P (S10 = k) = 0. P (X ∈ M ) = PX (M ). R We have achieved a pointbypoint transfer of the probability apparatus to the real line in such a manner that we can make calculations about the random variable PX the probability measure induced byX. we consider useful ways to describe the probability distribution induced by a random variable. Some results of measure theory show that this probability is dened uniquely on a class of subsets of that includes any set normally encountered in applications. any random variable may be approximated as closely as pleased by a simple random variable. Let n Consider. We now think of the mapping from Ω to R as a producing a pointbypoint transfer of PX (M ) ≥ 0 ∞ and PX (R) = P (Ω) = 1. In addition. This may be done as follows: −1 To any set M on the real line assign probability mass PX (M ) = P X (M ) It is apparent that and minterm maps. Thus the term simple is used in a special. ∞ And because of the preservation of set operations by the inverse mapping ∞ ∞ ∞ PX i=1 Mi =P X −1 i=1 Mi =P i=1 X −1 (Mi ) = i=1 P X −1 (Mi ) = i=1 PX (Mi ) (6. Now the Ak form a partition. in some detail.4 Simple random variables We consider. Thus. For one thing. the random variable Sn which counts the number of successes in a sequence of {2 < S10 ≤ 8} = A3 since A4 A5 A6 A7 A8 (6.e. since we cannot have ω ∈ Ak and ω ∈ Aj . in practice we can distinguish only a nite set of possible values for any random variable. consequence of the general patterns is that the inverse images of disjoint are also disjoint.
An } one of these events occurs). we include possible null values. canonical form is desirable. we call these values In are known. where Remarks • • • If {Cj : 1 ≤ j ≤ m} m j=1 c is a disjoint class. for 1 ≤ i ≤ n. · · · . since they do not aect any probabilitiy calculations. so the expression X are X (ω) for all ω . These are not mutually exclusive representatons. canonical form must be n Sn = k=0 k IAk with P (Ak ) = C (n. and if the probabilities Note that in canonical form. we turn to more general cases. and in many ways normative. Canonical form is a special primitive form. The set of values where distribution consists of the {pk : 1 ≤ k ≤ n}. In that case. We do this with the aid of indicator functions (Section 1. we may append the event Cm+1 = Cj and assign value zero to it. k) pk (1 − p) n−k . then ω ∈ Ai .4: The indicator function)..8) Ai = {X = ti }.e. exactly (or. · · · . Example 6. For some probability P (Ai ) = 0 for one or more of the ti . with the corresponding set of probabilities 2. on any trial.. which displays the possible values and the corresponding events. If X takes on distinct values {t1 .139 been examined. Any of the a primitive form. we must nd suitable ways to express them analytically. the distribution is determined. One of the advantages of the canonical form is that it displays the range (set of values). displays directly the range (i. write We call this the partition determined by n is a partition (i. The distinct set {t1 . We say Ci may be partitioned. tn } of the values probabilities {p1 . distributions it may be that ti has value zero. Representation with the aid of indicator functions In order to deal with simple random variables clearly and precisely. pn } (6. but m j=1 Cj = Ω.3. For one thing. generated by) X.e.10) For many purposes. pn } constitute the distribution for X.8). This is true for any ti . We may X = t1 IA1 + t2 IA2 + · · · + tn IAn = i=1 If ti IAi (6. Canonical form is unique. · · · . Three basic forms of representation are encountered. Standard or canonical form. 1. t2 . the general formulation. set of values) of the random variable. with the same value ci associated with each subset formed. tn } and if with respective probabilities {p1 . · · · . Probability pi = P (Ai ) and the corresponding calculations for made in terms of its distribution. p2 . p2 . then {A1 . A2 . so that IAi (ω) = 1 and all the other indicator functions have value zero. The summation expression thus picks out the correct value represents ti . t2 . 0≤k≤n (6. since the representation is not unique. we include that term.4: Simple random variables in primitive form . for they can only occur with probability zero. Example 6. and hence are practically impossible. if one of the null values. it {tk : 1 ≤ k ≤ n} paired pk = P (Ak ) = P (X = tk ). · · · . Simple random variable X may be represented by a primitive form {Cj : 1 ≤ j ≤ m} is a partition (6.9) X (ω) = ti . both theoretical and practical. Many properties of simple random variables extend to the general case via the approximation procedure. cm ICm .11) X = c1 IC1 + c2 IC2 + · · · .3: Successes in Bernoulli trials As the analysis of Bernoulli trials and the binomial distribution shows (see Section 4.
Events generated by a simple random variable: canonical form partition is in that We may characterize the class of all inverse images formed by a simple random M. and the union of any subclass of the X. 0. on a equally likely basis. and the Ei form a partition. hence X in terms of the M of real numbers. In this case. 0.50. {Ai : 1 ≤ i ≤ n} then it determines. or 7 turn up.10. or 9 turn up. the player loses ten dollars. since they form an independent class.1. If Ei Sn which counts the number of successes in a sequence trial.00. 0. 0. · · · .5IC3 + 7. the Example 6. the class cj IEj (6. or 8 turn up. m X = c0 + c1 IE1 + c2 IE2 + · · · + cm IEm = c0 + j=1 In this form. inverse images) determined by sure event Ω.00. $5. 6.5IC7 + 7. then one natural way to is the event of a success on the express the count is n Sn = i=1 This is ane form. 1 ≤ i ≤ 10.10 0.0IC6 + 3. if the numbers 2. with IEi . Remark. again.12) A store has eight items for sale. $5. $3. a linear combination of the indicator functions plus a constant. Ci be Each P (Ci ) = 0. which may be zero).13) X represented in ane form. the player loses one dollar.15. be expressed in primitive form as The random variable expressing the results may X = −10IC1 + 0IC2 + 10IC3 − 10IC4 + 0IC5 + 10IC6 − 10IC7 + 0IC8 + 10IC9 − IC10 • (6.e.15. Only those points ω in AM = i∈J Ai map into M. and the coecients do c0 = 0 not display directly the set of possible values. if the number 10 turns up. Em } is not necessarily mutually exclusive.0IC5 + 5. and with P (Ei ) = p 1 ≤ i ≤ n 1 ≤ i ≤ n.5IC4 + 5..5IC1 + 5. the player gains nothing.05. She purchases one of the items with probabilities 0. in which the random variable is represented as an ane combination of indicator functions (i. 0. ∅. the integers 1 through 10.50. We commonly have (6. $7.20. i th Example 6.15.14) {E1 .0IC2 + 3. If the numbers 1. RANDOM VARIABLES AND PROBABILITIES • A wheel is spun yielding. the player gains ten dollars. If the set J is the set of indices i such ti ∈ M .5IC8 3. 0. The prices are $3.50.50.6: Events determined by a simple random variable Suppose simple random variable X is represented in canonical form by X = −2IA − IB + 0IC + 3ID Then the class (6. Let the event the wheel stops at i. respectively. E2 .00. the Any primitive form is a special ane form in which Ei often form an independent class. and $7. Consider any set then every point ω ∈ Ai maps into ti . $3.140 CHAPTER 6. the (6.e. $5.. .10 0. Ai X consists of the impossible event in the partition determined by Hence. 5. In fact. if the numbers 3. the random variable of n Bernoulli trials.5 Consider. B.50. 3}. the class of events (i. C. The random variable expressing the amount of her purchase may be written X = 3.15) c0 = 0 ci = 1 for Ei cannot form a mutually exclusive class. −1.16) by {A. D} is the partition determined X and the range of X is {−2. If ti in the range of X into M. A customer comes in. 4.
50 0. (6. 6.00.50 0. we identify the set of distinct values in the set {cj : 1 ≤ j ≤ m}.00 0.7: The distribution from a primitive form Suppose one item is selected at random from a group of ten items.23 Table 6. and 0 are in M and X −1 (M ) = A (b) If B is the set (c) The event 2.00.40 2. we consider an illustrative example. From a primitive form Before writing down the general pattern.50 0. 0 (−2.08 1.07 2. C.50 0.5 Determination of the distribution Determining the partition generated by a simple random variable amounts to determining the canonical form.14 2. For any possible value ti in the range. 1]. the event {X ≤ 1} = A B C.23 (6.00 0.19) P (Ai ) = P (X = ti ) = j∈Ji P (Cj ) (6.40 (6. then the values 1.141 (a) If M M is the interval [−2.50.08 2. 1].08 1. respective probabilities are The values (in dollars) and cj P (Cj ) 2.50 0.00 0.15 1. The ω ∈ C7 . 1.50 is ω ∈ C2 .00 0. 5]. The distribution is then completed by determining the probabilities of each event Ak = {X = tk }. 1]} = X −1 (M ).50 0. C5 . 3 are in M and X −1 (M ) = B D. Ji where Ai = j∈Ji Cj . {X ≤ 1} = {X ∈ (−∞.17) P (A3 ) = P (C1 ) + P (C3 ) + P (C9 ) = 0. then the values 2.00 0. Then the terms cj ICj = ti Ji and m ICj = ti IAi . t2 = 1.11 2. 1.23 The distribution for and P (A4 ) = P (C4 ) + P (C8 ) = 0.18) X is thus k P (X = k) 1. and t4 = 2.20) Examination of this procedure shows that there are two phases: . −1] ∪ [1.1.10 1.09 1. C6 . where M = (−∞. Since values are in M.50.23 2.14 1.50 0. t3 = 2.00 0. Suppose these are t1 < t2 < · · · < tn .50 0.10 Table 6. identify the index set Ji of those j such that cj = ti . Example 6. so that A1 = C7 and P (A1 ) = P (C7 ) = 0.14. Value 1. C10 so that C5 C6 C10 and A2 = C2 Similarly P (A2 ) = P (C2 ) + P (C5 ) + P (C6 ) + P (C10 ) = 0.2 The general procedure If may be formulated as follows: X = j=1 cj ICj . we nd four distinct values: value 1.1 By inspection.00 is taken on for taken on for t1 = 1.
in X = 10 IE1 + 18 IE2 + 10 IE3 + 3 (6.09 0.23) . First. We suppose {E1 . 1.10 0. with large sets of data the mfunction csort is very useful.50 2. the event the customer orders item 3. X is constant on each minterm generated by the class ment of the minterm expansion.5000 0. Let E1 = E2 = E3 = the event the customer orders item 1.00 1.50 2. There is a mailing charge of 3 dollars per order.9: survey (continued)) from "Minterms and MATLAB Calculations"). m X = c0 + c1 IE1 + c2 IE2 + · · · + cm IEm = c0 + j=1 We determine a particular primitive form by determining the value of the class cj IEj (6. % Matrix of P(C_j) [X.15 0.22) 2.142 CHAPTER 6.21) X on each minterm generated by {Ej : 1 ≤ j ≤ m}.14 0.1400 1. Example 6.5000 0. each indicator function the value si of X on each minterm Mi . Let X be the amount a customer who orders the special items spends on them plus mailing cost. Example 6. Then. 0 ≤ i ≤ 2m − 1 (6. t2 .7 (The distribution from a primitive form) C = [2. % The sorting and consolidating operation disp([X. as noted in the treatIEi is constant on each minterm.9: Finding the distribution from ane form A mail order house is featuring three items (limit one of each kind per customer). E2 .3. use of a tool such as csort is not really needed. E2 . This describes X {E1 . We illustrate with a simple example.4000 2. at a price of 10 dollars.2300 For a problem this small.5.07 0. Extension to the general case should be quite evident. 0.00 2. We do this in a systematic way by utilizing minterm vectors and properties of indicator functions.50 1.50 1.0000 0. Then we use the mprocedures to carry out the desired operations.6.50 1. · · · .08 0. we do the problem by hand in tabular form. We determine in a special primitive form 2 −1 m X= k=0 si IMi . We apply the csort operation to the matrices of values and minterm probabilities to determine the distribution for X. But in many problems From ane form Suppose X is in ane form. respectively.2300 2.PX] = csort(C. % Matrix of c_j pc = [0.PX]') % Display of results 1. ane form.8: Use of csort on Example 6.10].50]. Em } since. at a price of 10 dollars. at a price of 18 dollars. 0.08 0. RANDOM VARIABLES AND PROBABILITIES Select and sort the distinct values • • t1 . E3 } is independent with probabilities 0.0000 0.pc). tn Add all probabilities associated with each value ti to determine P (X = ti ) The software We use the mfunction csort which performs these two operations (see Example 4 (Example 2.00 2. the event the customer orders item 2.11 0.08 0. with P (Mi ) = pi .00 1. · · · .
21 0.21 0.143 We seek rst the primitive form. X M1 and M4 .24) has the X = 3 IM0 + 13 IM1 + 13 IM4 + 21 IM2 + 23 IM5 + 31 IM3 + 31 IM6 + 41 IM7 We note that the value 13 is taken on on minterms value 13 is thus p (1) + p (4).4 The primitive form of X is thus (6. si . the values on the various Mi . 1.09 Table 6.3 We then sort on the form for X. we list the sorted values and consolidate by adding together the probabilities of the minterms on which each value is taken. To complete the process of determining the distribution. list the corresponding minterm probabilities.06 0. i 0 1 2 3 4 5 6 7 10IEi 0 0 0 0 10 10 10 10 18IE2 0 0 18 18 0 0 18 18 10IE3 0 10 0 10 0 10 0 10 c 3 3 3 3 3 3 3 3 si 3 13 21 31 13 23 31 41 pmi 0.06 0. as follows: . To obtain the value of X on each minterm we • • Multiply the minterm vector for each generating event by the coecient for that event Sum the values on each minterm and add the constant To complete the table. The probability has value 31 on minterms M3 and M6 .21 0.09 0. using the minterm probabilities. to expose more clearly the primitive Primitive form Values i 0 1 4 2 5 3 6 7 si 3 13 13 21 23 31 31 41 pmi 0.14 0.14 0.09 0.14 0.21 0.14 0.09 Table 6.06 0.06 0. X 2. Similarly. which may calculated in this case by using the mfunction minprob.
21 = 0.06 0.10: Steps in determining the distribution for the distribution from ane form) X in Example 6.9 (Finding c = [10 18 10 3]. The minterm probabilities are included in a row matrix.06 0.. the minterm probabilities.. pm = minprob(0.25) X and PX describe the distribution for X.14 0. The procedure uses mintable to set the basic minterm vector patterns. 1.09 Table 6.35 0.35 0. 6..8) C = 10 10 10 10 10 18 18 18 18 18 10 10 10 10 10 CM = C..14 + 0...144 CHAPTER 6.1.*M CM = % Constant term is listed last % Minterm vector pattern 1 0 1 1 1 1 1 0 1 % An approach mimicking ``hand'' calculation % Coefficients in position 10 10 18 18 10 10 % Minterm vector values 10 18 10 .. or can calculate.21 0. RANDOM VARIABLES AND PROBABILITIES k 1 2 3 4 5 6 tk 3 13 21 23 31 41 pk 0.1*[6 3 5]).. then uses a matrix of coecients.06 + 0.09] (6.C = colcopy(c(1:3)... Having obtained the values on each minterm. to obtain the values on each minterm. Example 6.09 = 0. We start with the random variable in ane form. and suppose we have available.15 0.14 0..15 0. Examination of the table shows that X = [3 13 21 23 31 41] Matrices and P X = [0. including the constant term (set to zero if absent).21 0. 2. the procedure performs the desired consolidation by using the mfunction csort.5 The results may be put in a matrix probabilities that X X of possible values and a corresponding matrix PX of takes on each of these values.. then incorporate these in the mprocedure canonic for carrying out the transformation.6 An mprocedure for determining the distribution from ane form We now consider suitable MATLAB steps in determining the distribution from ane form. M = mintable(3) M = 0 0 0 0 1 0 0 1 1 0 0 1 0 1 0 % .
s.10 (Steps in determining the distribution for in Example 6.5]).06 0.06 23 0.9 (Finding the distribution from ane form)) X c = [10 18 10 3].1400 13.const.14 0.14 0. which we use to solve the previous problem.09 0.09 % Sorting on s...15 41 0.06 0.06 0..pm).PX]') 3 0. disp([X.PX] = csort(s. consolidation of pm % Display of final result The two basic steps are combined in the mprocedure canonic.09 % Display of primitive form 0.09 % Extra zeros deleted const = c(4)*ones(1. % Note that the constant term 3 must be included last pm = minprob([0..14 13 0..3 0. Example 6.21 0.14 % MATLAB gives four decimals 0.11: Use of canonic on the variables of Example 6.} disp([CM.14 0.0000 0.0000 0.0600 23.09 0.0000 0.. canonic Enter row vector of coefficients c Enter row vector of minterm probabilities pm Use row matrices X and PX for calculations Call for XDBN to view the distribution disp(XDBN) 3.06 0..0000 0.2100 31.21 31 0.0900 .6 0.0000 0.21 0.145 0 0 0 0 10 10 10 10 0 0 18 18 0 0 18 18 0 10 0 10 0 10 0 10 cM = sum(CM) + c(4) % Values on minterms cM = 3 13 21 31 13 23 31 41 % .3500 21....8).35 21 0.1500 41.21 0.0000 0.21 0..% Practical MATLAB procedure s = c(1:3)*M + c(4) s = 3 13 21 31 13 23 31 41 pm = 0.pm]') 0 0 0 3 3 0 0 10 3 13 0 18 0 3 21 0 18 10 3 31 10 0 0 3 13 10 0 10 3 23 10 18 0 3 31 10 18 10 3 41 [X.
146 CHAPTER 6.4200 N = abs(X20)<=7.85. P ((X − 10) (X − 25) > 0). N = 0 1 1 1 0 0 PN = N*PX' PN = 0.84]. 0. RANDOM VARIABLES AND PROBABILITIES X (set of values) and With the distribution available in the matrices PX (set of probabilities). P (X − 20 ≤ 7). 0. M = 0 0 1 1 1 0 PM = M*PX' PM = 0.72.2100 P2 = (Z>0)*PZ' P2 = 0.PZ] = csort(G. value and probability matrices) may be used in exactly the same manner as that for the original random variable X. 0.20 <= 7 % Picks out and sums those minterm probs % Value of g(t_i) for each possible value % Total probability for those t_i such that % g(t_i) > 0 % Distribution for Z = g(X) 496 0.93.77. 0.77. M = (X>=15)&(X<=35). 0.9 (Finding the distribution from ane and form))) it is desired to determine the probabilities P (15 ≤ X ≤ 35).6200 G = (X .13: Alternate formulation of Example 3 (Example 4.10) 0]. Apply csort to the pair Z = g (X). We use two key devices: 1.10 (Steps in determining the distribution for in Example 6.25) G = 154 36 44 26 126 496 P1 = (G>0)*PX' P1 = 0. If the respective probabilities for success are 0.19) from "Composite Trials" Ten race cars are involved in time trials to determine pole positions for an upcoming race.1400 0.83. 0. P X) to get the distribution for for X.88.84.10 X in Example 6.96.11 (Use of canonic on the variables of Example 6.1500 0.85.72. which has values g (ti ) for each possible value ti (G. or. c = [ones(1. Use the matrix G. qualify. 0.PX) Z = 44 36 26 126 154 PZ = 0. 0. 2.90.*(X . 7. It seems reasonable to suppose the class {Ei : 1 ≤ i ≤ 10} independent.3500 0. Let the i th Ei To be the event is car makes qualifying speed.91.83. 0. Use relational and logical operations on the matrix of values ones for those values which meet a prescribed condition. 9.0900 % Calculation using distribution for Z Example 6. 0. 0. P = [0. canonic .10). 0. 10)? SOLUTION Let X= 10 i=1 IEi .93. 0.88. Determine X to determine a matrix PM = M*PX' M which has P (X ∈ M ): G = g (X) = [g (X1 ) g (X2 ) · · · g (Xn )] by using array operations on matrix X.12: Continuation of Example 6. 0. 0. they must post an average speed of 125 mph or more on a trial run.90. 0. 0. we may calculate a wide variety of quantities associated with the random variable.0600 0.3800 % Ones for minterms on which 15 <= X <= 35 % Picks out and sums those minterm probs % Ones for minterms on which X . Example 6. This distribution (in b. what is the probability that k or more will qualify ( k = 6. We have two alternatives: a. 8.9 (Finding the distribution from ane form))) X Suppose for the random variable (Steps in determining the distribution for X in Example 6.96. 0.11 (Use of canonic on the variables of Example 6.91.3800 [Z.
. A function form for canonic One disadvantage of the procedure canonic is that it always names the output X and P X.7 General random variables The distribution for a simple random variable is easily visualized as point mass concentrations at the various values in the range. and is often designated σ (X).5756 0.0600 23. the distribution may be a mixture of point mass concentrations and smooth distributions on some intervals. [Z.e. Example 6. using canonicf c = [10 18 10 3].0000 0. pm = minprob(0.13 (Alternate formulation of Example 3 (Example 4.9938 0. This is a type of class known as a sigma algebra of events. The situation is conceptually the same for the general case. disp([Z.147 Enter row vector of coefficients c Enter row vector of minterm probabilities minprob(P) Use row matrices X and PX for calculations Call for XDBN to view the distribution k = 6:10. We consider such problems in the study of Independent Classes of Random Variables (Section 9.9628 0. There are some technical questions .2100 31.1500 41. countable unions.0000 0.0000 0. and the class of events determined by a simple random variable is described in terms of the partition generated by X (i. but the distribution % matrices are now named Z and PZ 6.0000 0.1). If the random variable takes on a continuum of values.pm).1.19) from "Composite Trials"). for i = 1:length(k) Pk(i) = (X>=k(i))*PX'.1400 13. with the distribution for X as dened. There are technical mathematical reasons for not saying M is any subset. which we call canonicf. frequently it is desirable to use some other name for the random variable from the start. then the probability mass distribution may be spread smoothly on the line. A function form. end disp(Pk) 0.0000 0. Or. While these can easily be renamed. This is particularly the case when it is desired to compare the results of two independent races or heats. the class of those events of the form Ai = {X = ti } for each ti in the range). the class of events determined by random variable X is also a sigma algebra. Because of the preservation of set operations by the inverse image.3500 21.2114 This solution is not as convenient to write out. is useful in this case. However.0900 % Numbers as before.1*[6 3 5]). and countable intersections of Borel sets. but the details are more complicated. The class of events determined by for M X is the set of all inverse images X −1 (M ) any member of a general class of subsets of subsets of the real line known in the mathematical literature as the Borel sets.14: Alternate solution of Example 6.8472 0.PZ]') 3. but the class of Borel sets is general enough to include any set likely to be encountered in applicationscertainly at the level of this treatment. The Borel sets include any interval and any set that can be formed by complements. a great many other probabilities can be determined.PZ] = canonicf(c.0000 0.
The prices are $3.) A wheel is spun yielding on an equally likely basis the integers 1 through 10. A customer comes in.) Express the events A. hence the distribution. some of these questions become important in dealing with random processes and other advanced notions increasingly used in applications. 3)}.148 CHAPTER 6. {X ∈ (0.11. However. the player gains nothing. If the numbers 1. the player loses ten dollars. Exercise 6.3 C1 {Cj : 1 ≤ j ≤ 10} through C10 . and $7. the player gains ten dollars.) has values {1.org/content/m24208/1.07.3 has respective probabilities 0. 3. 1]}.50. 0. 5.09.1. 1. is that any general random variable can be approximated as closely as pleased by a simple random variable. if the numbers 2.2 Problems on Random Variables and Probabilities2 Exercise 6. RANDOM VARIABLES AND PROBABILITIES PX induced by concerning the probability measure X. Let i.50. or 7 turn up. respectively. 152. {X − 1 ≥ 1}. {X − 2 ≤ 3}. $5.00. {X ∈ [2. She purchases one of the items with probabilities 0.) X = −3. 3.14.11. 6. 2. 2. t]) is an event. (Solution on p. P (X > 0). {X ∈ (−4. the player loses one dollar. Express the events and {X ≥ 0} in Exercise 6.5 the wheel stops at (Solution on p.4 The class {Cj : 1 ≤ j ≤ 10} in Exercise 6. 152. 3. Exercise 6.13. 152.08.06.09. 0. t] on the real line 2. if the number 10 turns up. The random variable expressing the results may be expressed in primitive form as X = −10IC1 + 0IC2 + 10IC3 − 10IC4 + 0IC5 + 10IC6 − 10IC7 + 0IC8 + 10IC9 − IC10 (6. B. Ci be the event Each P (Ci ) = 0. or 8 turn up.50. $5. 1. 4. is a partition. C .00.1 The following simple random variable is in canonical form: (Solution on p. $7.50. 5. C. 2 This content is available online at <http://cnx. {X ∈ (−∞. 1 ≤ i ≤ 10. Random variable Express X X (Solution on p. if the numbers 3. The induced probability distribution is determined uniquely by its assignment to all intervals of the (−∞. 152.75IA − 1. $3.50. B.) P (X < 0). or 9 turn up. and (Solution on p. X −1 (M ) is an event for every X −1 ((−∞. These also are settled in such a manner that there is no need for concern at this level of analysis. in terms of Exercise 6. These facts point to the importance of the distribution function introduced in the next chapter. 4. Another fact. form Borel set M i for every semiinnite interval (−∞. is given by X = −2IA − IB + IC + 2ID + 5IE .13IB + 0IC + 2.2 Random variable X. Determine the distribution for X. and D. 3]}.26) • • Determine the distribution for Determine X.6 A store has eight items for sale. t]. The class on and E.00. . 153. 6. {X 2 ≥ 4}.5/>. 0. Exercise 6. terms of A. 0. 2]}. D.10. alluded to above and discussed in some detail in the next chapter. 0. 2} (Solution on p. (b) using MATLAB. 0. $5. in canonical form. Two facts provide the freedom we need to proceed with little concern for the technical details.12. respectively. 0.) in canonical form. (a) by hand. We turn in the next chapter to a description of certain commonly encountered probability distributions and ways to describe them analytically. 0.6ID . {X < 0}. 0. 152. $3. {X ≤ 0}.
10 0.29) Determine the distribution for random variable X = −5. (b) using MATLAB. Determine the distribution for Z. who does not like the Magic. on A2 B3 . 155. 154.5IB + 2. Ellen's winning may be expressed as the random variable X = −5IA + 10IAc + 10IB − 5IB c − 5IC + 5IC c = −15IA + 15IB − 10IC + 10 Determine the distribution for comes out ahead? (6. 0. a. determine the distribution for Z. Determine its distribution.048 0.028 0. and the Chicago Bulls all have C A be the event the Rockets win. 153. Determine the distribution for X Y be the maximum of the two numbers. Determine the distribution for Y.9 A pair of dice is rolled.) X be the minimum of the two numbers which turn up. B.5IC3 + 7.10 Minterm probabilities p (0) through p (15) for the class {A. Z = 3 + 3 each Ai Bj and determine the corresponding P (Ai Bj ). Y X = 2IA1 + 3IA2 + 5IA3 The Y = IB1 + 2IB2 + 3IB3 (6.7.042 0. 0.15.170 0.) in canonical form are X.028 0.20.2 0.10 0. W = XY .) On a Tuesday evening.062 0. D} are.27) X (a) by hand. Determine the value of Z on From this. What are the probabilities Ellen loses money. She loses ve if the Rockets win and gains ten if they lose $10 to 5 against the Magic even $5 to 5 on the Bulls. P (Bj ) are 0. the Houston Rockets. Let Z be the sum of the two numbers. 0. (Solution on p. (Solution on p.3IA − 2. 0. 153. She decides to take him up on his bets as follows: • • • $10 to 5 on the Rockets i.5IC8 Determine the distribution for (6. Ellen's boyfriend is a rabid Rockets fan.0IC5 + 5.149 0. the Orlando Magic. He wants to bet on the games. respectively. 154.5IC4 + 5.018 0.5IC7 + 7. 0. or . and (6.15. Let W be the absolute value of the dierence. let and determine the distribution of W.e. Then Z = 2 + 1 on A1 B1 .31) X. breaks even.28) P (Ai ) are 0. Each pair {Ai .3IC + 4. Let b.30) (Solution on p.048 0.75. 0.6 0.040 0. C} is independent.11 games (but not with one another).112 0.70 0. Bj } is Z = X + Y . Determine the value of W on each Ai B j Exercise 6.3. The random variable expressing the amount of her purchase may be written X = 3. Suppose the class {A. Exercise 6. Let c.7 Suppose (Solution on p. Let win.010 0.5IC1 + 5. 0. (Solution on p.0IC6 + 3.15. etc. B be the event the Magic be the event the Bulls win.0IC2 + 3.) Exercise 6. d.2.8 For the random variables in Exercise 6. and the Consider the random variable independent. 0. C. in order.2ID − 3.6. with respective probabilities 0.7 Exercise 6.1.168 0.05. B.110 0.) Exercise 6.072 0.8.012 0.032 (6.
Thus. Show that the position at the end of ten tosses is given by the random variable 10 10 10 c IEi = 2 X= i=1 where IEi − i=1 IEi − 10 = 2S10 − 10 i=1 (6. a coin is tossed If a head turns up. 0. Exercise 6. Exercise 6. 0. Exercise 6. a. $26. and six dollars if he wins the third. if a tail turns up.35) Ei is the event of a head on the i th toss and S10 is the number of heads in ten trials.5. X be the Exercise 6. Let the i th component trial.90.001 ∗ [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] (6.5. the marker is moved one unit to the right. His net winning can be represented by the random variable X = 3IA + 4IB + 6IC − 6. D} pm = 0. RANDOM VARIABLES AND PROBABILITIES Exercise 6. a socket wrench set at $56. 0.3 (6.7. what are the possible positions and the probabilities of being in each? . What is the probability he receives at least $50? Less than $30? Exercise 6. P (B) = 0. Determine the distribution for X.) has minterm probabilities {A. an amount a is earned if there is a success on the trial and an amount b (usually negative) if there is a failure.75.) A marker is placed at a reference position on a line (taken to be the origin). We associate with each trial a payo function Xi = aIEi + bIEic .150 CHAPTER 6. Let amount of his total purchases. Ei (Solution on p. C. Determine the distribution for (6. He puts down two dollars for each bet. with respective probabilities 0. 0. B. B.16 A sequence of trials (not necessarily independent) is performed. a vise at $24. four dollars if he wins the second bet. the marker is moved one unit to the left.34) Assume the results of the games are independent. Let Sn be the number of successes in the n trials and W be the net payo.33) X.32) • • Determine whether or not the class is independent.9. Find the probability that one or three of these occur on a trial. C .) Assume the class is independent.14 (Solution on p. After ten tosses. James is expecting three checks in the mail.13 events Then (Solution on p. 158.4. The random variable X = IA + IB + IC + ID on a trial. 156. 158. He picks up three dollars (his original bet plus one dollar) if he wins the rst bet. and hammer at $8.17 repeatedly.15 Henry goes to a hardware store. 0. (Solution on p. b. for $20.) A gambler places three bets. purchases of the individual items.6. X = 20IA + 26IB + 33IC represents the total amount received.80. 0. P (C) = 0. and $33 dollars. Their arrivals are the A. Determine the distribution for X. 156. with P (A) = 0. 155. Find the distribution for X counts the number of the events which occur and determine the probability that two or more occur on a trial.12 The class (Solution on p.) He considers a power drill at $35.) be the event of success on (Solution on p. Show that W = (a − b) Sn + bn. He decides independently on the a set of screwdrivers at $18.4. with respective probabilities 0. 157.
Z =X +Y.*u.38. 0. Assume that all eleven a.s] = ndgrid(X. Determine the distribution for c. Y. pz = t. the amount purchased by Anne. 0. Let MATLAB perform the calculations.u] = ndgrid(PX. 0. [t. 158.PY).52.83.23.41. [r.63. 12 dollars. 0.37. 18. possible purchases form an independent class. 0.) Margaret considers ve purchases in the amounts 5. z = r + s.pz). Determine the distribution for b.58. 0.PZ] = csort(z. 8. 12. . 17. Determine the distribution for X. 15 dollars with respective probabilities 0. 15. Suggestion for part (c).151 Exercise 6. 21. Anne contemplates six purchases in the amounts 8. the amount purchased by Margaret.18 (Solution on p. with respective probabilities 0. 0.22. 0. [Z.81. the total amount the two purchase.Y). 0.77. 15.
148) T [X.PX]') 1.0000 0. pc = 0.37) Solution to Exercise 6.0000 0.1*ones(1. 148) p = 0. 148) • • • • • D A A B A B B C D D E E Solution to Exercise 6.PX] = csort(T. E = C9 (6. RANDOM VARIABLES AND PROBABILITIES Solutions to Exercises in Chapter 6 Solution to Exercise 6.0000 0. [X.2900 4. [X.10).3 (p.pc).0000 0.1000 0 0.2600 3. B = C3 C6 C10 .2000 2.I] X = I = = [1 3 2 3 4 2 1 3 5 2].3000 10.3000 1.1100 Solution to Exercise 6.36) X = IA + 2IB + 3IC + 4ID + 5IE A = C1 C7 .1400 5. D = C5 .p).0000 0.5 (p.2 (p.PX]') 10.0000 0.01*[8 13 6 9 14 11 12 7 11 9]. = sort(T) 1 1 2 2 2 3 3 1 7 3 6 10 2 4 3 8 4 5 5 9 (6. C = C2 C4 C8 .0000 0.PX] = csort(c. 148) T = [1 3 2 3 4 2 1 3 5 2].152 CHAPTER 6. disp([X. disp([X.0000 0.4 (p. c = [10 0 10 10 0 10 10 0 10 1]. 148) • • • • • A D A B B D C C C C Solution to Exercise 6.1 (p.3000 .
0000 0.6 (p. 149) XY = a. pb = colcopy(PB.PX]') 3.PZ] = csort(Z. 148) p = 0.0200 [Z.3500 Solution to Exercise 6.3000 5.7 (p.0200 0.5000 0.0600 4. P = pa.8 (p.0000 0.0200 B a b Z Z Solution to Exercise 6.300 Solution to Exercise 6. c = [3.4000 = (X>0)*PX' = 0. PB = [0.153 Pneg Pneg Ppos Ppos = (X<0)*PX' = 0.0000 0.0600 0.5000 0.5 7.3).5 7.3 0.3600 0. disp([X. 149) A = [2 3 5].3).0000 0.0000 0.1800 0.5].1]. = colcopy(B.3). disp([Z.1200 0.PX] = csort(c.*pb % Probabilities for various values P = 0.01*[10 15 15 20 10 5 10 15].0600 8.2 0.3500 5.5 5 3.0000 0.0600 0. [X.P).3).*b XY = 2 3 4 6 6 9 W 5 10 15 PW % XY values % Distribution for W = XY .3000 7.0000 0. =a + b % Possible values of sum Z = X + Y = 3 4 6 4 5 7 5 6 8 PA = [0.PZ]') % Distribution for Z = X + Y 3. = [1 2 3].2]. = rowcopy(A.p).5 5 5 3.6 0.1400 7.6 0.4200 6.1200 0. pa= rowcopy(PA.0600 0.
m (Section~17.y in each position sorts values and counts occurrences % PX = fX/36 % PY = fY/36 8 5 9 4 10 3 11 2 12 1 %PZ = fZ/36 % PW = fW/36 Solution to Exercise 6.0000 5. coefficients in c') . [x.4200 0.0600 0.y).110 0.012 0.0000 15.1200 0.048 0.7].y] = meshgrid(t.fW] = csort(d.018 0.0000 6.y). c = ones(6.040 0.c) W = 0 1 2 fW = 6 10 8 4 4 4 4 4 4 1 2 3 4 5 6 5 5 5 5 5 5 1 2 3 4 5 6 6 6 6 6 6 6 1 2 3 4 5 6 % xvalues in each position % yvalues in each position 4 5 4 7 5 4 3 6 5 3 5 9 6 5 4 4 6 1 6 11 7 6 5 2 % % % % % min in each position max in each position sum x+y in each position x .154 CHAPTER 6.0600 0.10 (p.168 0.c) Y = 1 2 3 fY = 1 3 5 [Z. 0.6).9 (p.8.fX] = csort(m. [X.0000 9.062 0. M = max(x.042 0. s = x + y.t) x = 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 y = 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 m = min(x.. d = abs(x .0000 0.0200 0. c = [5.c) Z = 2 3 4 fZ = 1 2 3 [W. RANDOM VARIABLES AND PROBABILITIES 2.y).048 0.3 4.0000 3.170 0.0200 Solution to Exercise 6.072 0.2 3. 149) t = 1:6.3 2.5 2.fY] = csort(M.0000 4.032].0000 10. 149) % file npr06_10.028 .c) X = 1 2 3 fX = 11 9 7 [Y.028 0.fZ] = csort(s..27: npr06_10) % Data for Exercise~6.010 0. disp('Minterm probabilities are in pm.1800 0.10 pm = [ 0.112 0.1200 0.
1800 5.0000 0.0480 3.155 npr06_10 (Section~17.0280 0.8000 0.8.12 (p.0120 Solution to Exercise 6.0000 0.2000 0. coefficients in c canonic Enter row vector of coefficients c Enter row vector of minterm probabilities pm Use row matrices X and PX for calculations Call for XDBN to view the distribution XDBN XDBN = 11.0480 2.0350 PXneg = (X<0)*PX' PXneg = 0.27: npr06_10) Minterm probabilities are in pm.0000 0.2250 PX0 = (X==0)*PX' PX0 = 0.4000 0.0180 0. 149) P = 0.7000 0.0400 9.1100 6.0420 3.0620 7.01*[75 70 80].3000 0.0000 0.3000 0.1680 5.0280 6.0000 0. 150) minprob(P) .2950 Solution to Exercise 6.5000 0.7000 0.8000 0.1400 25.0000 0.4800 PXpos = (X>0)*PX' PXpos = 0.0000 0.2000 0. c = [15 15 10 10].0720 2.0450 0 0.1700 9.1200 15.5000 0.5000 0.1120 1.9000 0.0100 2.0320 4.0000 0.11 (p.4800 10. canonic Enter row vector of coefficients c Enter row vector of minterm probabilities Use row matrices X and PX for calculations Call for XDBN to view the distribution disp(XDBN) 15.
0200 46.01*[90 75 80].0150 33.13 (p.0450 26. canonic Enter row vector of coefficients c Enter row vector of minterm probabilities Use row matrices X and PX for calculations Call for XDBN to view the distribution disp(XDBN) 0 0.0050 1.0600 79.0000 0. coefficients in c a = imintest(pm) The class is NOT independent Minterms for which the product rule fails a = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 canonic Enter row vector of coefficients c Enter row vector of minterm probabilities pm Use row matrices X and PX for calculations Call for XDBN to view the distribution XDBN = 0 0.0000 0. 150) minprob(P) .2120 3.0000 0.4380 4.156 CHAPTER 6.1800 59.0000 0.0050 20.3020 P2 = (X>=2)*PX' P2 = 0.0000 0.28: npr06_12) Minterm probabilities in pm.0430 2.0000 0. P = 0.4810 Solution to Exercise 6.9520 P13 = ((X==1)(X==3))*PX' P13 = 0.0000 0.1350 53.7800 P30 = (X <30)*PX' P30 = 0. RANDOM VARIABLES AND PROBABILITIES npr06_12 (Section~17.0000 0.5400 P50 = (X>=50)*PX' P50 = 0.8.0000 0.14 (p. 150) c = [20 26 33 0].0000 0.0000 0.0650 Solution to Exercise 6.
0000 0.0000 0.0000 0.0000 0.0324 18.0000 0. canonic Enter row vector of coefficients c Enter row vector of minterm probabilities Use row matrices X and PX for calculations Call for XDBN to view the distribution dsp(XDBN) 6.157 c = [3 4 6 6]. 150) minprob(P) c = [35 56 18 24 8 0].0084 24.0000 0.0486 67.1400 3.0000 0.0000 0.0000 0.0324 91.0000 0.0036 42.1*[5 6 7 4 9].0084 99.0486 minprob(P) .0084 56.0000 0.0600 Solution to Exercise 6.0054 98.0504 88. P = 0.15 (p.0756 32.0000 0.0000 0.1400 0 0.0000 0.2100 2.0216 35.0216 74.0054 59.0000 0.0000 0.0000 0.0000 0.0000 0.0056 43.0504 53. canonic Enter row vector of coefficients c Enter row vector of minterm probabilities Use row matrices X and PX for calculations Call for XDBN to view the distribution disp(XDBN) 0 0.0126 77.0036 8.0000 0.0000 0.0056 80.0000 0.0000 0.0000 0.1*[5 4 3].1134 85.0000 0.0000 0.0756 64.0600 7.0324 50.0000 0.0000 0.0024 61.0900 4. P = 0.2100 3.0000 0.0036 82.0000 0.0024 26.0000 0.0900 1.
*u. pz = t.10.0000 0.0439 8. [r.16 (p. disp([X.pmy). PS = ibinom(10.0000 117.0000 0.0084 0. npr06_18 (Section~17. 150) Xi = aIEi + b (1 − IEi ) = (a − b) IEi + b n n (6.0098 6.PZ] = csort(z.m (Section~17.0000 0.29: npr06_18.17 (p.8.41) S = 0:10.0000 0. RANDOM VARIABLES AND PROBABILITIES 0.s] = ndgrid(X.0000 Solution to Exercise 6.38) W = i=1 Xi = (a − b) i=1 IEi + bn = (a − b) Sn + bn (6.m) cx = [5 17 21 8 15 0].0000 109.0:10). cy = [8 15 12 18 15 12 0].pmx). pmx = minprob(0. [Y.0000 0.PX] = canonicf(cx.m) [X.0010 Solution to Exercise 6. 150) c Xi = IEi − IEi = IEi − (1 − IEi ) = 2IEi − 1 (6.18 (p.158 CHAPTER 6.01*[37 22 38 81 63]).0756 0.PY).0000 0.0000 115. X = 2*S .01*[77 52 23 41 83 58]).0439 4. .0000 0.2461 2.0000 123.PS]') 10.0126 0.0000 0.2051 0 0. z = r + s.0000 0. [t. pmy = minprob(0.0.u] = ndgrid(PX.1172 6.29: npr06_18. 151) % file npr06_18.39) Solution to Exercise 6.pz).0756 106.Y).0000 0.0324 0.0000 141.40) 10 n X= i=1 Xi = 2 e=1 IEi − 10 (6.5.2051 4.1134 0.0000 133.1172 2.0010 8.PY] = canonicf(cy.8.0036 0.0098 10. [Z.
159 a = length(Z) a = 125 plot(Z.cumsum(PZ)) % 125 different values % See figure Plotting details omitted .
RANDOM VARIABLES AND PROBABILITIES .160 CHAPTER 6.
but drops immediately as t moves to the left of t0 . For more general cases. X is determined uniquely by a consistent assignment of mass to This suggests that a natural description is provided Denition The distribution function FX for random variable X is given by FX (t) = P (X ≤ t) = P (X ∈ (−∞. the probability mass included in at or to the left of t as there is for s. this is the probability mass a consequence. for if (F3) : FX is continuous from the right. t approaches Except in very unusual cases involving random variables which may take innite values. PX . t] for each real t. has the following properties: (F1) : (F2) : FX must be a nondecreasing function.1) As FX at or to the left of the point t.1) we introduce real random variables as mappings from the basic space Ω to the real line. In the theoretical discussion on Random Variables and Probability (Section 6. at which point the amount at or to the left of t increases ("jumps") by amount p0 . the interval includes the mass p0 all the way to and including t0 .6/>. with a jump in the amount p0 at t0 i P (X = t0 ) = p0 . If the point t0 from the left.1.1. The mapping induces a transfer of the probability mass on the basic space to subsets of the real line in such a way that the probability that X takes a value in a set is exactly the mass assigned to that set by the transfer. we need a more useful description than that provided by the induced probability measure 7.1 Introduction M possible value of In the unit on Random Variables and Probability (Section 6.1 Distribution and Density Functions1 7. on the other hand.Chapter 7 Distribution and Density Functions 7.2 The distribution function distribution induced by a random variable semiinnite intervals of the form by the following. To perform probability calculations. We have at each X a point mass equal to the probability X takes that value. For simple random variables this is easy. the interval does not include the probability mass at t0 until t reaches that value. t] must increase to one as t moves to the right.1). t>s there must be at least as much probability mass (−∞.org/content/m23267/1. as t moves to the left. 1 This content is available online at <http://cnx. t]) ∀ t ∈ R In terms of the mass distribution on the line. (7. we note that the probability (−∞. we need to describe analytically the distribution on the line. 161 . if t approaches t0 from the right.
where it jumps by an amount p2 to the value p1 + p2 . FX (t) remains constant at zero until it jumps by the amount p1 .1 pm = minprob(0. The distribution consists ti in the range. so that FX (−∞) = lim FX (t) = 0 t→−∞ and FX (∞) = lim FX (t) = 1 t→∞ (7. FX (t) remains constant at p1 until t increases to t2 .. To the left of the smallest value in the range.1*[6 3 5]). DISTRIBUTION AND DENSITY FUNCTIONS the probability mass included must decrease to zero. this determines uniquely the induced distribution. According to FX for a simple random variable is easily visualized. with a jump in the amount pi at the corresponding point ti in the range. the geometric distribution or the Poisson distribution considered below).1 . canonic Enter row vector of coefficients c Enter row vector of minterm probabilities pm Use row matrices X and PX for calculations Call for XDBN to view the distribution ddbn % Circles show values at jumps Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % Printing details See Figure~7.162 CHAPTER 7. A of point mass pi at each point similar situation exists for a discretevalued random variable which may take on an innity of values (e.1: Graph of FX for a simple random variable c = [10 18 10 3]. t]. The procedure matrix ti . In this case. The graph of FX is thus a step function. there is always some probability at points to the right of any total probability mass is one. as t increases to the smallest value t1 . % Distribution for X in Example 6. increases.2) A distribution function determines the probability mass in each semiinnite interval the discussion referred to above. FX (t) = 0.g.. The distribution function (−∞. since the X ddbn may be used to plot the distributon function for a simple random variable from a of values and a corresponding matrix Example 7. but this must become vanishingly small as t PX of probabilities. continuous from the right.5. This continues until the value of FX (t)reaches 1 at the largest value tn .
k) pk q n−k (7. function has a jump in the amount 2.1: Distribution function for Example 7.1 (Graph of 3.1 for Example 7.1. two mfunctions .163 Figure 7. X = IE P (X = 1) = P (E) = pP (X = 0) = q = 1 − p.3 Description of some common discrete distributions We make repeated use of a number of common distributions which are used in many practical situations. Indicator function. continuous from the right. 1. The q at t = 0 and an additional jump of p to the value 1 n Simple random variable X = i=1 ti IAi (canonical form) P (X = ti ) = P (Ai ) = p1 The distribution function is a step function.5) ibinom andcbinom are available for computing the individual and cumulative binomial probabilities.1 (Graph of FX for a simple random variable) 7.1). Binomial FX pi at t = ti (See for a simple random variable)) (n. (7. with jump of distribution at t = 1. This collection includes several distributions which are studied in the chapter "Random Variables and Probabilities" (Section 6. This random variable appears as the number of successes in a sequence of trials with probability p n Bernoulli of success. In its simplest form n X= i=1 I Ei with {Ei : 1 ≤ i ≤ n} independent (7. p). As pointed out in the study of Bernoulli sequences in the unit on Composite Trials.3) Figure 7.4) P (Ei ) = p P (X = k) = C (n.
consists of (7.1/3. An P (Y = k) = C (k − 1. Thus P (X = k) = q k p.6) k−1 failures. both arising in the study of continuing Bernoulli The rst counts the number of failures before the rst success. P (X ≥ n + kX ≥ n) = P (X ≥ n + k) = q n+k /q n = q k = P (X ≥ k) P (X ≥ n) (7.5) P~~=~~0.5. 1 ≤ k We say (7. Negative binomial convenient to work with (m.98100 = 0. X is the number of failures before the mth success. If it has failed the probability of failing an additional probability of failing k n times. p).2: The geometric distribution A statistician is taking a random sample from a population in which two percent of the members own a BMW automobile. k−1 trials. He scores if he throws a 1 or a 6. The event {Y = k} We have P (Y = k) = q k−1 p. m − 1) pm q k−m . m≤k (7.164 CHAPTER 7. She takes a sample of size 100. The event {X = k} consists of a sequence of k This is sometimes called failures. then a success on the k th component trial. 4. The mth success k trials is greater than or ~P~=~cbinom(10. (p) There are two related distributions. Example 7. the waiting time.02 0.7) X has the geometric distribution with parameter Now (p).3: A game of chance A player throws a single sixsided die repeatedly. Y = X+1 or Y − 1 = X.2131 . then a success. DISTRIBUTION AND DENSITY FUNCTIONS Geometric sequences. It is generally more Y = X + m. Also geometric (p).9) examination of the possible patterns and elementary combinatorics show that There are m−1 successes in the rst pm q k−m . p = 0. What is the probability of nding no BMW owners in the sample? SOLUTION The sampling process may be viewed as a sequence of Bernoulli trials with probability of success. We have an mfunction nbinom to calculate these probabilities. For this reason. k or more times before the next success is the same as the initial or more times before the rst success. 0 ≤ k The second designates the component trial on which the rst success occurs.8) This suggests that a Bernoulli sequence essentially "starts over" on each trial. The probability of 100 or more failures before the rst success is or about 1/7. which we often designate by X ∼ The geometric (p).1326 5. What is the probability he scores ve times in ten or fewer throws? ~p~=~sum(nbinom(5. the number of the trial on which the mth success occurs.2131 An alternate solution m. then a success. Each combination has probability Example 7. it is customary to refer to the distribution for the number of the trial for the rst success by saying probability of k or more failures before the rst success is Y −1 ∼ P (X ≥ k) = q k . is possible with the use of the comes not later than the equal to k th trial i the number of successes in binomial distribution.1/3.5:10)) p~~=~~0.
These have advantages of speed and parameter range similar to those for ibinom and cbinom. with P (X = k) = e−µ µk k! 0≤k (7. But for practical cases. P = cpoisson(mu. we have two mfunctions for calculating Poisson probabilities. 130. And strictly speaking the integrals are Lebesgue integrals rather than the ordinary Riemann kind. where is calculated by is a row or column vector of integers and the result is a row matrix of the probabilities. What is the probability the number of arrivals is greater than equal to 110.4 The density function If the probability mass in the induced distribution is spread smoothly along the real line. along with a number of other facts.8209~~0. fX ≥ 0 The density function has three characteristic properties: t (f1) (f2) fX = 1 R (f3) FX (t) = −∞ fX (7.2011~~0.k). fX (t) is the mass per unit length in the probability distribution. there is a probability density M function fX which satises P (X ∈ M ) = PX (M ) = At each fX (t) dt (area under the graph of fX over M) (7. 120.k).1. distribution is approximately Poisson Use of the generating function (see Transform Methods) shows the sum of independent Poisson random variables is Poisson. This distribution is assumed in a wide variety of applications. Example 7.3).12) A random variable (or distribution) which has a density is called from measure theory. 150. 7. : : P (X = k) P (X ≥ k) is calculated by the result P P P = ipoisson(mu. We often simply abbreviate as continuous absolutely continuous. the binomial (np). the use of tables is often quite helpful. 160 ? ~p~=~cpoisson(130. are summarized in the table DATA ON SOME COMMON DISTRIBUTIONS in Appendix C (Section 17.0461~~0. This term comes distribution.13) .11) t. Remarks 1. so that we are free to use ordinary integration techniques. Poisson (µ). By the fundamental theorem of calculus ' fX (t) = FX (t) at every point of continuity of fX (7. It appears as a counting variable for items arriving with exponential interarrival times (see the relationship to the gamma distribution below).10) Although Poisson probabilities are usually easier to calculate with scientic calculators than binomial probabilities.0060 The descriptions of these distributions. 2. the two agree. There is a technical mathematical description of the condition spread smoothly with no point mass concentrations.4: Poisson counting random variable The number of messages arriving in a one minute period at a communications network junction is a random variable N ∼ Poisson (130). The Poisson distribution is integer valued. 140. with no point mass concentrations. For large n and small p (which may not be a value found in a table).165 6. As in the case of the binomial distribution. where k k is a row or column vector of integers and is a row matrix of the probabilities.9666~~0.110:10:160) p~~=~~0.5117~~0.
(7. Thus. it is customary to omit the indication of the region of integration when integrating over the whole line.166 CHAPTER 7.14) not an indenite integral. Figure 7. fX will be zero outside an interval. 4.2: The Weibull density for α = 2.25. constant gives a suitable If f = 1 determines a distribution function F . multiplication by the appropriate positive f. DISTRIBUTION AND DENSITY FUNCTIONS f with 3. 1. An argument based on the Quantile Function shows the existence of a random variable with that distribution. . Thus g (t) fX (t) dt = R The rst expression is g (t) fX (t) dt In many situations. nonnegative function in turn determines a probability distribution. Any integrable. In the literature on probability. the integrand eectively determines the region of integration. 4. λ = 0. which f = 1.
The density must be constant over the interval (zero outside). the density may have a triangular graph which is not symmetric.3: The Weibull density for α = 10. and the distribution function increases linearly with in the interval. 7. a). fX (t) = The graph of 1 a<t<b b−a (zero outside the interval) (7. This fact is established with the use of the moment generating function (see Transform Methods). It can be shifted. [a. More generally. . since probability associated with each individual point is zero.5 Some common absolutely continuous distributions 1. 1000.167 Figure 7.15) FX rises linearly. The probability of being in any two subintervals of the same length is the same. appears naturally (in shifted form) as the distribution for the sum or dierence of two independent random variables uniformly distributed on intervals of the same length. The probability of any subinterval is proportional to the length of the subinterval. b]. 1. b). 2. It is immaterial whether or not the end points are Mass is spread uniformly on the interval included. Thus. with slope 1/ (b − a) from zero at t=a to one at t = b.001. b] but is equally likely to be in any subinterval of a given length. fX (t) = { (a + t) /a (a − t) /a 2 2 −a ≤ t < 0 0≤t≤a It This distribution is used frequently in instructional numerical examples because probabilities can be obtained geometrically. Uniform (a. This distribution is used to model situations in which it is known that X t takes on values in [a. λ = 0. Symmetric triangular (−a. to dierent sets of values.1. with a shift of the graph.
4. represents the time to failure (i. This leads to an extremely important property of X > t + h. λ) fX (t) = gammadbn λα tα−1 e−λt Γ(α) t≥0 (zero elsewhere) to determine values of the distribution function for functions shows that for the sum of (α. the life duration) of a device put into service at Suppose distribution is exponential. each exponential (λ). we have Exponential note that P (X > t) = 1 − FX (t) = Since the exponential distribution.17) X Because of this property. We −λt e t ≥ 0. Use of Cauchy's equation (Appendix B) shows that the exponential distribution is the only continuous distribution with this property. SOLUTION To get the area under the triangle between 120 and 250. Many solid state electronic devices behave essentially in this way. the exponential distribution is often used in reliability problems. σ 2 fX (t) = 1 √ exp σ 2π We generally indicate that a random variable X ∼ N µ. X > t) then units of time. which is a term applied to a number .5: Use of a triangular distribution Suppose Remark.3. Using the fact that areas of similar triangles are proportional to the square of any side.9846 5. we take one minus the area of the right triangles between 100 and 120 and between 250 and 300.168 CHAPTER 7. α = n. σ 2 X −1 2 t−µ 2 σ ∀t The gaussian distribution plays a has the normal or gaussian distribution by writing . or Gaussian µ. central role in many aspects of applied probability theory. putting in the actual values for the parameters.16) 3. then the time for ten arrivals is gamma (10. Note that in the continuous case. If the t (i. a random variable X ∼ X ∼ gamma gamma A relation to the Poisson distribution is described in Sec 7. As we show in the chapter Mathematical Expectation (Section 11.855 2 (7. we have P =1− 1 2 2 (20/100) + (50/100) = 0. hours? What is the probability of ten or more arrivals in four hours? In six SOLUTION The time for ten arrivals is the sum of ten interarrival times. this means that the average interarrival time is 1/3 hour or 20 minutes.5. X∼ symmetric triangular (100. it is immaterial whether the end point of the intervals are included or not.e.7576~~~~0.[4~6]) p~~=~~0. Many devices have the property that they do not wear out. Gamma distribution We have an mfunction (α.. 3). DISTRIBUTION AND DENSITY FUNCTIONS Example 7.. Use of moment generating (n. λ) has the same distribution as n independent random variables. P (X > t + hX > t) = P (X > t + h) /P (X > t) = e−λ(t+h) /e−λt = e−λh = P (X > h) (7. this property says that if the device survives to time the (conditional) probability it will survive of surviving for h h more units of time is the same as the original probability t = 0. the times (in hours) between arrivals in a hospital emergency unit may be represented by a random quantity which is exponential (λ = 3). as is usually the case. once initial burn in tests have removed defective units. λ). particularly in the area of statistics. h > 0 implies X > t.6: An arrival problem On a Saturday night. Example 7. Normal. If we suppose these are independent. 300). ~p~=~gammadbn(10.e. Much of its importance comes from the central limit theorem (CLT). (λ) fX (t) = λe−λt t ≥ 0 (zero elsewhere).1). −λt Integration shows FX (t) = 1 − e t ≥ 0 (zero elsewhere). Failure is due to some stress of external origin. Determine P (120 < X ≤ 250).
It is a remarkable fact that this is not the case. the gaussian distribution appears naturally in such topics as theory of errors or theory of noise. We may use the symmetry for any case. this raises the question whether a separate table is needed for each pair of parameters. Note that Φ (t) for t > 0 to Φ (0) = 1/2.5. respectively. The graph of the density function is the well known bell shaped curve. cannot be expressed in terms of elementary functions.19) This indicates that we need only a table of values of any t. is mean value and σ 2 is the called the standard deviation. The symmetry about the origin contributes to its usefulness. where the quantity observed is an additive combination of a large number of essentially independent quantities.5 is the same as the area to the right of Φ (−2) = 1 − Φ (2). Thus. We φ and Φ for the standardized normal density and distribution functions. P (X ≤ t) = Φ (t) = area under the curve to the left of t (7. usual procedure is to use tables obtained by numerical integration. Examination of the expression shows that the graph for t = µ. symmetrical about the origin (see Figure 7. as the integral of the density function. The same is true for any t. = This is refered to as the standardized normal distribution. 1). fX (t) is symmetric about its maximum at the smaller the maximum value and the more slowly the curve decreases with distance from µ. so that we have t = 1.18) Note that the area to the left of t = −1. The σ2 While we have an explicit formula for the density function.169 of theorems in analysis. it is known that the distribution function. so that Φ (−t) = 1 − Φ (t) ∀ t (7. The greater the parameter σ2 . Essentially. the positive square root of the variance. 2 √1 e−t /2 2π Standardized normalφ (t) so that the distribution function is Φ (t) = t −∞ φ (u) du. The parameter µ is called the σ. Thus parameter µ locates the center of the mass distribution and variance. Since there are two parameters. the CLT shows that the distribution for the sum of a suciently large number of independent random variables has approximately the gaussian distribution. We need only have a table of the distribution function for use X ∼ N (0. be able to determine Φ (t) for .4). is a measure of the spread of mass about The parameter µ.
4)). we nd Φ (2) = 0.5 shows the distribution function and density function for X ∼ N (2.8185 and P (X > 1) = General gaussian distribution X ∼ N µ. P (−1 ≤ X ≤ 2) = Φ (2) − Φ (−1) = Φ (2) − [1 − Φ (1)] = Φ (2) + Φ (1) − 1 P (X > 1) = P (X > 1) + P (X < − 1) = 1 − Φ (1) + Φ (−1) = 2 [1 − Φ (1)] From a table of standardized normal distribution function (see Appendix D (Section 17. The distribution function reaches 0. but is shifted with dierent spread and height.1). 1). Determine P (−1 ≤ X ≤ 2) and P (X > 1). 2. Example 7.3174 For and Φ (1) = 0.2616 as compared with 0.7: Standardized normal calculations Suppose X ∼ N (0. t = 2. Figure 7. the density maintains the bell shape.4: The standardized normal distribution. 0.3989 for the standardized Inspection shows that the graph is narrower than that for the standardized normal. SOLUTION 1. . DISTRIBUTION AND DENSITY FUNCTIONS Figure 7.5 at the mean value 2.9772 0. It has height 1.170 CHAPTER 7. σ 2 . The density is centered about normal density.8413 which gives P (−1 ≤ X ≤ 2) = 0.
(i.21) FX (t) = −∞ φ (u) du = Φ t−µ σ (7. Determine P (−1 ≤ X ≤ 11) and FX (11) − FX (−1) = Φ 11−3 − Φ −1−3 = Φ (2) − Φ (−1) = 0. A change of variables in the integral shows that the table for standardized normal distribution function can be used for any case.8: General gaussian calculation X ∼ N (3.. 16) P (X − 3 > 4). 2. . µ = 3 and σ 2 = 16).171 Figure 7.5: The normal density and distribution functions for X ∼ N (2. FX (t) = 1 √ σ 2π t exp − −∞ 1 x−µ 2 σ 2 t dx = −∞ φ x−µ σ 1 dx σ (7.1).22) Example 7.20) Make the change of variable and corresponding formal changes u= to get x−µ 1 t−µ du = dx x = t ∼ u = σ σ σ (t−µ)/σ (7.7 (Standardized normal calculations) We have mfunctions gaussian and gaussdensity to calculate values of the distribution and density function for any reasonable value of the parameters. 0.8185 4 4 P (X − 3 < − 4)+P (X − 3 > 4) = FX (−1)+[1 − FX (7)] = Φ (−1)+1−Φ (1) = 0. Suppose SOLUTION 1.3174 In each case the problem reduces to that in Example 7.e.
172 CHAPTER 7.3173 The dierences in these results and those above (which used tables) are due to the roundo to four places in the tables. the density function is a polynomial.(gaussian(3. fX (t) = 1 0 Γ(r+s) r−1 (1 Γ(r)Γ(s) t − t) s−1 0<t<1 Analysis is based on the integrals ur−1 (1 − u) s−1 du = Γ (r) Γ (s) Γ (r + s) with Γ (t + 1) = tΓ (t) r. DISTRIBUTION AND DENSITY FUNCTIONS The following are solutions of Example 7. s gives the uniform distribution on the unit interval.1) 0. Example 7.16.7 (Standardized normal calculations) and Example 7. The Beta distribution is quite useful in developing the Bayesian statistics for the problem of sampling to determine a population proportion. By using scaling and shifting.1) 0. The special case r=s=1 r. s > 0. s) .1.3173 = gaussian(3. these can be extended to other intervals.6 and Figure 7.7 show graphs of the densities for various values of The usefulness comes in approximating densities on the unit interval. 6. If general case we have two mfunctions. are integers. (7. r > 0.gaussian(0.23) Figure 7.1.1)) 0.8186 = 2*(1 .1. using the mfunction gaussian.16.11) .8186 = gaussian(3.7) 0.9: Example 7. s.16.2) .8 (General gaussian calculation) (continued) P1 = P2 P2 = P1 P2 = P2 P2 = P1 = gaussian(0.gaussian(0.7 (Standardized normal calculations) and Example 7. For the .1)) + 1 .gaussian(3. beta and betadbn to perform the calculatons.8 (General gaussian calculation).16. Beta(r.
s) density for r = 2. 2. 10.6: The Beta(r.173 Figure 7. . s = 1.
7. 10. k) (µ/n) (1 − µ/n) 2 This n−k = n (n − 1) · · · (n − k + 1) µ 1− nk n 1− µ n n µk k! (7. Weibull(α. However. ν ≥ 0. 7.6 and Figure 7.7/>. gamma.org/content/m23313/1.2. k np k! −k Then.s) density for r = 5. k) pk (1 − p) Suppose n−k ≈ e−np p = µ/n with n large and µ/n < 1. much use of it. Examination shows α = 1 the distribution is exponential (λ). DISTRIBUTION AND DENSITY FUNCTIONS Figure 7. λ. The shift can be obtained by subtracting a constant from the t values.1 Binomial.7 show graphs of the Weibull density for some The distribution is used in reliability theory. Poisson.2 Distribution Approximations2 p and suciently large n (7. . we have mfunctions for shift parameter weibull (density) and weibulld (distribution function) ν=0 only. λ > 0. The parameter α provides a distortion of the time the exponential distribution. representative values of α that for scale for Figure 7. and Gaussian distributions The Poisson approximation to the binomial distribution The following approximation is a classical one. P (X = k) = C (n.7: The Beta(r. We do not make α and λ (ν = 0). Usually we assume ν = 0. We wish to show that for small 7. ν) FX (t) = 1 − e−λ(t−ν) α > 0.174 CHAPTER 7. s = 2. 5.25) content is available online at <http://cnx. t ≥ ν The parameter ν is a shift parameter.24) P (X = k) = C (n.
26) known property of the exponential 1− The result is that for large µ n n → e−µ k as n→∞ The Poisson and gamma distributions Suppose n. P (X = k) ≈ e−µ µ . The number of successes in n trials has the binomial (n.33) X is np and the variance is npq . we have the following equality i X∼ gamma P (X ≤ t) = Now 1 Γ (n) λt un−1 d−u du = e−λt 0 k=n (λt) k! k (7. According to a well (7. referred to in the discussion of the gaussian or normal distribution above. . npq).175 The rst factor in the last expression is the ratio of polynomials in approach one as n becomes large.32) The gaussian (normal) approximation The central limit theorem. obtained by integration by parts. X∼ gamma (α.27) = 1 Γ (α) λt uα−1 e−u du 0 (7. The second factor approaches one as n of the same degree k. p) distribution. is ∞ n−1 tn−1 e−t dt = Γ (n) e−a a Noting that k=0 ∞ ak k=0 k! ak k! with Γ (n) = (n − 1)! (7. λ).28) A well known denite integral. the distribution should be approximately N (np. which must n becomes large. k! Now Y ∼ Poisson (λt). n This random variable may be expressed X= i=1 Since the mean value of IEi where the I Ei constitute an independent class (7.29) 1 = e−a ea = e−a we nd after some simple algebra that 1 Γ (n) For a ∞ tn−1 e−t dt = e−a 0 k=n ak k! (α.31) ∞ P (Y ≥ n) = e−λt k=n (λt) k! k i Y ∼ Poisson (λt) (7. λ) i P (X ≤ t) = λα Γ (α) t xα−1 e−λx dx = 0 1 Γ (α) t 0 (λx) α−1 −λx e d (λx) (7. ∞ (7. where µ = np.30) a = λt and α = n. suggests that the binomial and Poisson distributions should be approximated by the gaussian.
DISTRIBUTION AND DENSITY FUNCTIONS Figure 7. each Poisson X is approximately It is generally best to compare distribution functions. n = 300. the probability at t=k is represented by a rectangle of height pk and unit width. we consider approximation of the . it is reasonable to suppose that suppose that n independent random variables. p = 0. µ). To approximate the curve from k + 1/2. Since the binomial and Poisson distributions are integervalued. with k as the midpoint. take the area under Use of mprocedures to compare We have two mprocedures to make the comparisons.8: Gaussian approximation to the binomial. Figure 1 shows a plot of the density and the corresponding gaussian density for gaussian density is oset by approximately 1/2.1. It is apparent that the the probability X ≤ k . this is called the continuity correction. Now if X ∼ Poisson (µ). then X N (µ. it turns out that the best gaussian approximaton is obtained by making a continuity correction. Since the mean value and the variance are both µ. Use of the generating function shows that the sum of independent Poisson random variables is Poisson. First.176 CHAPTER 7. may be considered the sum of (µ/n). To get an approximation to a density for an integervalued random variable.
.9: Gaussian approximation to the Poisson distribution function µ = 10.177 Figure 7.
The mprocedure poissapp calls for a value of µ. selects a suitable range about k = µ and plots the distribution function for the Poisson distribution (stairs) and the normal (gaussian) distribution (dash dot) for N (µ. the continuity correction is applied to the gaussian distribu tion at integer values (circles). the continuity . µ).10 shows plots for provides a much better approximation. DISTRIBUTION AND DENSITY FUNCTIONS Figure 7. In addition. It is clear that the continuity correction The plots in Figure 7. Here µ. but not by as much as for the smaller µ = 100. Figure 7.11 are for correction provides the better approximation. Poisson (µ) distribution.178 CHAPTER 7. µ = 10.10: Gaussian approximation to the Poisson distribution function µ = 100.
p = 0. .03.11: Poisson and Gaussian approximation to the binomial: n = 1000.179 Figure 7.
the probability is also spread smoothly.12: Poisson and Gaussian approximation to the binomial: n = 50. A simple approximation is obtained by subdividing an interval which includes the range (the set of possible values) into small enough subintervals that the density is approximately constant over each subinterval. The distribution is described by a probability density function. . and continuity adjusted values of the gaussian distribution function at the integer values. p = 0. is the dierence in variances for the binomial as compared with np for the Poisson. agreement of all three distribution functions is evident. whose value at any point indicates "the probability per unit length" near the point. p = 0. as we see in the unit Variance (Section 12.03. This describes the distribution of the random variable and makes possible calculations of event probabilities and parameters for the distribution. of n The mprocedure and p. and plots the distribution function for the binomial. both in theory and applications. gaussian. we show how a simple random variable is determined by the set of points on the real line representing the possible values and the corresponding set of probabilities that each of these values is taken on.2. the Poisson distribution does not track very well.6.180 CHAPTER 7.1). A point in each subinterval is selected and is assigned the probability mass in its The combination of the selected points and the corresponding probabilities describes the dis tribution of an approximating simple random variable. DISTRIBUTION AND DENSITY FUNCTIONS Figure 7. There is npq still good agreement of the binomial and adjusted gaussian. The diculty. A continuous random variable is characterized by a set of possible values spread continuously over an interval or collection of intervals. and Poisson distributions. Figure 7. It calls for values k values. In the unit Random Variables (Section 6.1). In this case. The good n = 50. selects suitable bincomp compares the binomial.12 shows plots for n = 1000. 7. subinterval.6. p = 0.11 shows plots for approximation to the distribution function for the Poisson. a continuous Figure 7.2 Approximation of a real random variable by simple random variables Simple random variables play a signicant role. However. Calculations based on this distribution approximate corresponding calculations on the continuous distribution.
We use tappr as follows: tappr Enter matrix [a b] of xrange endpoints [0 1] Enter number of x approximation points 200 Enter density as a function of t 3*t.1354 50.0000 160. 0 ≤ t ≤ 1. An interval containing the range of into a specied number of equal subdivisions. p(i) = cpoisson(mu(i).K.0159 200.0205 150. 3 3 Determine P (0.K(i)). we consider some illustrative examples.0000 5.length(mu)). 1].0000 223. Example 7.9 − 0. mu = [5 10 20 30 40 50 70 100 150 200].0000 18.0000 0. calculations are carried out as for simple random variables.0133 An mprocedure for discrete approximation If X is bounded.7210 Because of the regularity of the density and the number of approximation points.0000 0. Experiment indicates √ n = µ+6 µ (i.0000 0.2 = 0. K = zeros(1.0000 0.0000 92.2)&(X <= 0.0000 2.9). n suciently large. the result agrees quite well with the theoretical value. for a given parameter value µ. If X sets up the is divided The probability mass for each subinterval is assigned to dx is the length of the subintervals.0000 0.0359 100.0000 77. the probability for k ≥ n.0000 0. 40.10: Simple approximation to Poisson A random variable with the Poisson distribution is unbounded. an analytical solution is easy. then the integral of the density function over the subinterval is approximated by fX (ti ) dx.0000 284.181 Before examining a general approximation procedure which has signicant consequences for later treatments. end disp([mu.p*1e6]') 5. for i = 1:length(mu) K(i) = floor(mu(i)+ 6*sqrt(mu(i))). [0. However.4540 % K is the value of k needed to achieve 30.0000 0. the midpoint.0668 70.e.2 ≤ X ≤ 0.2140 % the residual shown. the mprocedure tappr distribution for an approximating simple random variable..0000 120. In eect.^2 Use row matrices X and PX as in the simple case M = (X >= 0. the graph of the density over the subinterval is approximated by a rectangle of length dx and height fX (ti ). FX (t) = t3 on the interval SOLUTION In this case.0000 28. six standard deviations beyond the mean) is a reasonable value for 5 ≤ µ ≤ 200.000001 10. p = M*PX' p = 0.4163 % Residual probabilities are 0.9).11: A numerical example Suppose fX (t) = 3t2 .0000 0. where ti is the midpoint.7210. .0000 62. Once the approximating simple distribution is established. Example 7.0000 46.length(mu)). absolutely continuous with density functon fX . 20.2535 % times the numbers in the last column. so P = 0. is negligible. p = zeros(1.
b = 20/3.14 for plot . and k = 1/4000..191003 cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See Figure~7.34) for a = 40. b = 20/3. Figure 7.1910 % Theoretical value = (2/3)exp(5/4) = 0. a = 40000. k = 1/4000. 000).^2/a^3). 000. it is easy to determine a bound beyond which the probability is negligible. Example 7.*(t >= 40000) Use row matrices X and PX as in the simple case P = (X >= 45000)*PX' P = 0. DISTRIBUTION AND DENSITY FUNCTIONS The next example is a more complex one. the distribution is not bounded. In particular.182 CHAPTER 7. However.12: Radial tire mileage The life (in miles) of a certain brand of radial tires may be represented by a random variable with density X fX (t) = { where t2 /a3 (b/a) e −k(t−a) for 0≤t<a a≤t (7. % Test shows cutoff point of 80000 should be satisfactory tappr Enter matrix [a b] of xrange endpoints [0 80000] Enter number of x approximation points 80000/20 Enter density as a function of t (t.12 (Radial tire mileage).13: Distribution function for Example 7. Determine P (X ≥ 45.*(t < 40000) + .. (b/a)*exp(k*(at)).
14: Partition of the interval I including the range of X Figure 7. 1 ≤ i ≤ n − 1 and Mn = [tn−1 . Let Mi = [ti−1 . tn ] (see Figure 7. the approximation is close except in a portion of the range having arbitrarily small total probability. pick a n point si . ti ) be the i th subinterval. we use a rather large number of approximation points. Then the Ei form a partition of the basic space Ω. As a consequence. Consider the simple random variable Xs = i=1 si IEi . In the singlevariable case. ti−1 ≤ si < ti . with a = t0 and b = tn .35) . Now random variable We limit our discussion to the bounded case.15: Renement of the partition by additional subdividion points. then X (ω) ∈ Mi and Xs (ω) = si . the results are quite accurate. This random variable is in canonical form. Now the X (ω) − Xs (ω)  < ti − ti−1 the length of subinterval Mi (7. Suppose I is partitioned into n subintervals by points ti . For the given subdivision. Let Figure 7. For the unbounded case. The general approximation procedure We show now that any bounded real random variable may be approximated as closely as desired by a simple random variable (i. and hence into any point in each subinterval Mi . In each subinterval.. b]. designating a large number of approximating points usually causes no computer memory problem. in which the range of X is limited to a bounded interval Ei = X −1 (Mi ) be the set of points mapped into Mi by X. I = [a. X may map into any point in the interval.183 In this case.14). absolute value of the dierence satises If ω ∈ Ei .e. one having a nite set of possible values). we form a simple random variable Xs as follows. 1 ≤ i ≤ n − 1.
0.39) X = IA + IB + IC + ID .1 Probabilities"). Xs (ω) ≤ Ys (ω) ≤ X (ω) ∀ ω To see this. 2} Exercise 7. Subintervals from a to N are made {XN : 1 ≤ N } of simple random variables.14. Simply let a to N. the new simple approximation Ys t∗ i Xs (ω) ≤ X (ω) ∀ω . the maximum length of subinterval goes to zero. with probabilities 0. 3.00. 4. 0. The result is a nondecreasing sequence XN (ω) → X (ω) as N → ∞.5IC8 Determine and plot the distribution function for (7. $7.12) from "Problems on Random Variables and Probabilities").e. we simply select an interval negligible and use a simple approximation over I. D} has minterm probabilities pm = 0. 0. DISTRIBUTION AND DENSITY FUNCTIONS ω and the corresponding subinterval. variables Xn which increase to X for each ω . Ys maps Ei into ti and maps Ei into t∗ > ti .36) By making the subintervals small enough by increasing the number of subdivision points.3) and 4 (Exercise 6.37) ∈ Mi (see Figure 7.) The prices are $3.07. 5. 0. I large enough that the probability outside I is 7.6) from "Problems on Random Variables and Probabilities"). ∞). 0. 189. Determine and plot the distribution function FX . we have the important fact Since this is true for each X (ω) − Xs (ω)  < maximum dierence as small as we please. (See Exercise 6 (Exercise 6. 189. 0. 0. B.3 class (See Exercise 12 (Exercise 6.001 ∗ [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] Determine and plot the distribution function for the random variable which counts the number of the events which occur on a trial. $3.) (See Exercises 3 (Exercise 6. (7. The random variable expressing the amount of her purchase may be written X = 3. Thus. .5IC1 + 5.4/>.00.09. 2. and $7.15. X maps Ei into Mi' and Ei into Mi'' . 0. 2. While the choice of the to the property all of the length of the Mi (7. She purchases one of the items with probabilities 0.0IC2 + 3. The class on (Solution on p. For probability calculations.50. vision points extend from we may form a nondecreasing sequence of simple random [N.11.38) X. 3 This content is available online at <http://cnx.20. the asserted inequality must hold for each ω By taking a sequence of partitions in which each succeeding partition renes the previous (i. for each ω ∈ Ω.12. is a partition.10.50. 3.4) from "Problems on Random Variables and {1.) The Exercise 7. 0. A store has eight items for sale.2 C1 {Cj : 1 ≤ j ≤ 10} through C10 .08.15). 189.05.0IC6 + 3.06. Random variable X has values respectively. 0.13. 3.50.15. consider ' Ei ' Ei ' '' (7. if we add subdivision points to decrease the size of some or satises Mi .5IC7 + 7. {A. respectively. 0.11. 0.00. (Solution on p. the selection of si = ti−1 (the lefthand endpoint) leads In this case.15. C. (Solution on p. we can make the si is arbitrary in each Mi . Xs maps both i '' and Ei into ti . $5.50.50.10 0.org/content/m24209/1. 1. 0.5IC4 + 5. Mi is partitioned into Mi Mi and Ei is partitioned into '' ' '' ' '' Ei . N th set of subdi making the last subinterval increasingly shorter.3 Problems on Distribution and Density Functions3 Exercise 7. $3. $5. with adds subdivision points) in such a way that The latter result may be extended to random variables unbounded above.184 CHAPTER 7.5IC3 + 7. $5.09. 0. A customer comes in.0IC5 + 5.10 0.
What is the probability that the rst appearance of either of these numbers is achieved by the fth trial or sooner? Exercise 7. p = 0.11 is the probability the third service day occurs by the end of 10 days? binomial distribution. 190. Since a Bernoulli sequence starts over at any time. p1 .6. Beginning on a Monday morning.) A residential College plans to raise money by selling chances on a board.185 Exercise 7. 189. each lamp Exercise 7.4 Suppose a (Solution on p. fourth.6 and Exercise 7. On ten spins what is the probability of matching.53 of success on any component trial. 2. P (X ≥ 200). in order. in a.0017. second. 189.) Exercise 7. the probability of one or more a. or fewer winners.10 (Solution on p. repeat using the binomial distribution. 189.12 A player pays $10 to play. What is the probability that at least ve of Exercise 7. Exercise 7. 4. What is the probability that on any day the loss of production time due to lamp failaures is k or fewer hours. 4? (Solution on p. 189. 5? (Solution on p.) Assume the initial digit may be zero. 1.) Solve using the negative For the system in Exercise 7. 0 ≤ k ≤ 20? (Solution on p.) For the system in Exercise 7.) A single sixsided die is rolled repeatedly until either a one or a six turns up.13 (Solution on p. 190. A wheel turns up the digits 0 through 9 with equal probability on each spin. It takes about one hour to replace a lamp. once it has failed.2.7 Two hundred persons buy tickets for a drawing. What Exercise 7.) Exercise 7. call a day in which one or more failures occur among the 350 lamps a service day. 0 ≤ k ≤ 10? k or more of the ten digits (Solution on p.008 of winning.) On any day. What is the probability the results match (both heads or both k times. 3. what is the probability the rst service day is the rst. What is the probability of k Each ticket has probability 0.) Exercise 7.) is a ten digit number. could fail with probability p = 0. Fifty chances are sold. 3? (Solution on p.) p = 0.5 In a thunderstorm in a national park there are 127 lightning strikes. 189. Exercise 7. fth day of the week? b.10 assume the plant works seven days a week.9 them get seven or more heads? Thirty members of a class each ip a coin ten times. 1. 190. k = 2. and must be replaced as quickly as possible. (Solution on p. 189.14 Consider a Bernoulli sequence with probability (Solution on p. Determine the distribution for where N is the number of winners and (7. What is the probability of no service days in a seven day week? Exercise 7. probability of of a lightning strike starting a re is about 0. These lamps are critical.8 tails) Two coins are ipped twenty times. res are started. 189.6 A manufacturing plant has 350 special lamps on its production lines.40) X and calculate P (X > 0). . third. 2. k = 0. 3. Experience shows that the What is the probability that k k = 0. The prot to the College is X = 50 · 10 − 30N.0083. he or she wins $30 with probability (Solution on p. the sequence of service/nonservice days may be considered a Bernoulli sequence with probability lamp failures in a day. P (X ≥ 300). 190.
Use the procedure nbinom to calculate this probability . Exercise 7. a.) X∼ exponential (λ). 0. random variable.) The number of customers arriving in a small specialty store in an hour is a random quantity having Poisson (5) distribution.) The number of cars passing a certain trac count position in an hour has Poisson (53) distribution. Determine P (X ≥ 80) . determine P (X ≥ 1/λ).5). a. 0. and Z ∼ exponential (1/4.1). For each for integer values 3 ≤ k ≤ 8.001) and Y ∼ Poisson (5). 191.17 (Solution on p. Exercise 7.002) distribution.19 (Solution on p.22 (Solution on p. 0.16 (Solution on p.375). 191. 000).) T ∼ gamma (20.23 For (Solution on p. DISTRIBUTION AND DENSITY FUNCTIONS a. for k = 5. 190.186 CHAPTER 7. 190.15 job. 191.) Items are selected Fifty percent of the components coming o an assembly line fail to meet specications for a special It is desired to select three units which meet the stringent specications.24.) binomial (12. (Solution on p.) They fail independently. b. Exercise 7.24 Twenty identical units are put into operation. Calculate this probability using the binomial distribution. P (X ≥ 120) b. 0. calculate and tabulate the probability of a value at least k. Under the usual assumptions for Bernoulli trials. What is the (conditional) probability it will operate for another 500 hours? Exercise 7. inclusive? What is the probability of no more than ten? Exercise 7. P (T ≤ 100. 191.) The number of noise pulses arriving on a power circuit in an hour is a random quantity having Poisson (7) distribution. P (X ≥ 100) .) and P (X ≤ k) 0 ≤ k ≤ 10. 190. and tested in succession. 8. for directly with ibinom and ipoisson. What is the probability the number arriving in an hour will be between three and seven. Use the Poisson distribution to determine P (T ≤ 100. Suppose the unit has operated successfully for 500 hours. what is the probability the third satisfactory unit will be found on six or fewer trials? Exercise 7.21 Random variable (Solution on p. 15. will survive for 5000 hours. What is the probability of having at least 10 pulses in an hour? What is the probability of having at most 15 pulses in an hour? Exercise 7. 191. P (X ≥ 2/λ). Determine the probabilities that at least k. Exercise 7. Do this Compare P (Y ≤ k) for X ∼ binomial(5000. 10. What is the probability the number of cars passing in an hour lies between 45 and 55 (inclusive)? What is the probability of more than 55? Exercise 7. (Solution on p. Then use the mprocedure bincomp to obtain graphical results (including a comparison with the normal distribution). (in hours) form an iid class. 12. of a televesion set sub ject to random voltage surges has the exponential (0. . 191.) The time to failure. Exercise 7. 000).0002). Use the mfunction for the gamma distribution to determine b.20 (Solution on p. The probability the fourth success will occur no later than the tenth trial is determined by the negative binomial distribution. X ∼ Y ∼ Poisson (4. The times to failure This means the expected life is 5000 hours. exponential (0.25 Let (Solution on p. 191. in hours of operating time.18 Suppose (Solution on p.0002) be the total operating time for the units described in Exercise 7. Exercise 7.) X∼ binomial (1000. Use the appropriate Poisson distribution to approximate these values.5). 191.
32 Suppose X ∼ N (4.31 time to failure. b. That is. (Solution on p. Calculate the probabilities in part (a) with the mfunction gaussian. Without using tables or mprograms.71? (Assume continuity. (Solution on p. 193. that is normally distributed with . Use a table Check your Exercise 7. 0. of each is a random variable exponential (1/30). P (1 < X < 9) and P (X − 3 ≤ 4). determine (Solution on p. Exercise 7. 192. σ 2 = 81. Suppose time for the six calls (in minutes) are iid. 81).33 Suppose X ∼ N (5. is made each fteen days. Without using tables or mprograms. exponential (1/3). which have X for the third arrival is thus gamma (3. Use X (Solution on p. (Solution on p. Without using P (X ≤ 2). This means the time X λ= for the arrival of the fourth message is gamma(4. X (Solution on p. 1].) Customers arrive at a service center with independent interarrival times in hours. 0. Use a table Check your Exercise 7.) Exercise 7. 0. 192. and the What is the probability that at least four are still operating at the Exercise 7. 3). A sequence of 35 numbers is generated. 192.30 (Solution on p. 81). Exercise 7. They assume their values independently.) Ten items are selected at random.) has gaussian distribution with of standardized normal distribution to determine results using the mfunction gaussian. Each of these Exercise 7. 192.01). more are within 0.) has gaussian distribution with of standardized normal distribution to determine results with the mfunction gaussian. 192.36 The result of extensive quality control sampling shows that a certain model of digital watches coming o a production line have accuracy. a table of That is.28 exponential (3) distribution. Do not make a discrete adjustment. in seconds per month. in days. currently in use by a sixth person.1). can be considered an observed value of a random variable uniformly distributed on the interval [0.27 Interarrival times (in minutes) for fax messages on a terminal are independent.29 Five people wait to use a telephone.) Exercise 7. a. What is the probability that three or Items coming o an assembly line have a critical dimension which is represented by a random ∼ N(10. That is. 191. P (3 < X < 9) and P (X − 5 ≤ 5).1). The time tables or mprograms. 192. utilize the relation of the gamma to the Poisson distribution to determine P (X ≤ 30).) A maintenance check Five identical electronic devices are installed at one time.15).34 Suppose X ∼ N (3. 193. 192. 193. µ = 5 and σ 2 = 81. exponential ( 0.26 (Solution on p. maintenance check? (Solution on p. determine P (X ≤ 25). X (Solution on p. 64). What is the probability 25 or more are less than or equal to 0. The units fail independently.187 Exercise 7.) has gaussian distribution with mean µ = 4 and variance standardized normal distribution to determine P (2 < X < 8) and P (X − 4 ≤ 5).) Exercise 7.) Exercise 7.) The sum of the times to failure for ve independent units is a random variable X ∼ gamma (5.35 variable (Solution on p.05 of the mean value µ. What is the distribution for the total time present for the six calls? Use an appropriate Poisson distribution to determine Z from the P (Z ≤ 20). µ = 3 and σ 2 = 64.) A random number generator produces a sequence of numbers between 0 and 1.
Determine an expression for the distribution function. . −1≤t≤1 (and zero elsewhere). Use the mprocedures tappr and cdbn to plot an approximation to the distribution function. Determine P (X ≤ 0. 194.188 CHAPTER 7.38 values of (Solution on p. P (0.) has density 3 fX (t) = 2 t2 . c. P (X − 1 < 1/4). 8). Exercise 7. Use the mprocedures tappr and cdbn to plot an approximation to the distribution function. P (X − 0. 0 ≤ t ≤ 2 (and zero elsewhere). What is the probability a watch taken from the production line to be tested will achieve top grade? mfunction gaussian. Determine P (−0.01 to 0. 194. b. Exercise 7.) has density function 3 fX (t) = t − 8 t2 . Determine P (X ≤ 0.40 Random variable X (Solution on p. P (0.5 ≤ X < 0. Calculate.5).41 Random variable X (Solution on p. P (X − 1 < 1/4).) Use various Use the mprocedure poissapp to compare the Poisson and gaussian distributions.5). Exercise 7. Exercise 7.5). Use the mprocedures tappr and cdbn to plot an approximation to the distribution function. To achieve a top grade. c. to observe the approximation of the binomial distribution by the Poisson. b. P (X > 0.39 Random variable X (Solution on p.7.5).25 ≤ 0.5 ≤ X < 1. Determine an expression for the distribution function. a watch must have an accuracy within the range of 5 to +10 seconds per month.5).) from 10 to 500 and p from 0. 1] (t) t2 + I(1.2] (t) (2 − t) 5 5 (7. 193. DISTRIBUTION AND DENSITY FUNCTIONS µ=5 and σ 2 = 300. 193. b.37 Use the mprocedure bincomp with various values of n (Solution on p. µ from 10 to 500.5 ≤ X < 1.41) a. 193. Determine an expression for the distribution function. using a standardized normal table. Check with the Exercise 7. c. a.) has density function fX (t) = { (6/5) t2 (6/5) (2 − t) for for 0≤t≤1 1<t≤2 6 6 = I [0.5). a.
4487 k = 1:5. 185) p1 = 1 .3 (p.7) = 0.2 (p.1719 P = cbinom(30. (prob given day is a service day) a. 184) % See MATLAB plot T = [3.28: npr06_12) Minterm probabilities in pm. 185) P = 1 .9775 0. 185) p = cbinom(10.0.1:6) P = ibinom(127.3470 0.0. ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX Solution to Exercise 7.pc). 0 : 10).5 (p.5. 185) P = 1 .1. pc = 0.1 (p.8. % Alternate solution.1945 0.5 7.0:3) P = 0.p. [X. [X. See Exercise 12 (Exercise~6.PX] = csort(T.4 (p.1/2.PX] = csort(T.cbinom(200.9 (p. Solution to Exercise 7. .0.0678 = 0.189 Solutions to Exercises in Chapter 7 Solution to Exercise 7.0.0:20) Solution to Exercise 7.7 (p.8799 0. 185) P = ibinom(20.5) = 0. 184) % See MATLAB plot npr06_12 (Section~17.9220 0.01*[10 15 15 20 10 5 10 15]. 184) T = [1 3 2 3 4 2 1 3 5 2].PX] = csort(T.01*[8 13 6 9 14 11 12 7 11 9].5513 0.pm).3:5) = 0.cbinom(350.(1 .8 (p.5].10 (p. 185) P = cbinom (10.12) from "Problems on Random Va [X.6052 Solution to Exercise 7.5 5 3.7838 0.9996 1.0017)^350 = 0.9968 0. coefficients in c T = sum(mintable(4)).3688 0. ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX Solution to Exercise 7. 0.pc).0083.5 5 5 3. ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot Solution to Exercise 7.0.6 (p.0000 Solution to Exercise 7.008.0017. pc = 0. 185) Solution to Exercise 7.9768 Solution to Exercise 7.5 7.
53.0000 0.9682 0.17 (p. b.^(k1) = 0.6160 0.p1).0000 0.0155 Solution to Exercise 7.0.7623 0. 185) P = 1 .1034 Solution to Exercise 7.3) = 0.2. X = 500 .(1 . DISTRIBUTION AND DENSITY FUNCTIONS P = p1*(1 .0000 0. Pb = 1 .8666 8.5.45) .1245 0.8729 Pa = cbinom(10.9320 0.p1)^7 = 0.9856 P200 = (X>=200)*PN' P200 = 0.9319 9. 186) P = cbinom(6. 186) k = 0:10.190 CHAPTER 7.8683 Solution to Exercise 7.0067 0.5224 P2 = cpoisson(53.4405 5.2649 0.cpoisson(5.0000 0.3581 Solution to Exercise 7.7622 7.cpoisson(53.Pb. Ppos = (X>0)*PN' Ppos = 0.0000 0.k+1).0017)^350 = 0.13 (p. 185) N = 0:50.4) = 0.4404 0.6160 6.14 (p.11 (p.0000 0.9863 .8667 0.0.4487 • P = sum(nbinom(3.cbinom(5000.0752 0.8990 • Pa = cbinom(10. PN = ibinom(50.53.4:10)) = 0.Pp]') 0 0.1247 3.0414 b.0:50). Pp = 1 .4487 0.0000 0. P0 = (1 .001.2650 4.0000 0.30*N.0067 1.p1.(2/3)^5 = 0.0000 0.3:10)) = 0.0.p1.2474 0.16 (p.0. disp([k.1364 0.15 (p.8729 Solution to Exercise 7.0404 2.9864 0.0000 0. 186) P1 = cpoisson(53.9682 10.0.5836 P300 = (X>=300)*PN' P300 = 0.0404 0. 185) p1 = 1 .0.3) = 0. 185) a.6562 Solution to Exercise 7. P = sum(nbinom(4.12 (p.56) = 0.8990 Solution to Exercise 7.56) = 0.k+1).
5154 P1 = cpoisson(100.0000 0.0000 0.8865 0.1601 0.375. Pz = exp(k/4. 186) P (X > kλ) = e−λk/λ = e−k Solution to Exercise 7.k) P = 0.191 bincomp Enter the parameter n 5000 Enter the parameter p 0.3679 Solution to Exercise 7.k).2111 0.4679 6.0002.0000 0.0.cpoisson(7.2709 0.5).0866 Solution to Exercise 7.1689 8.20 (p. 186) P1 = cpoisson(5.0002*5000) 0.0.4897 0.16) = 0.o o o gtext('Exercise 17') Solution to Exercise 7.5297 .1178 0.7420 P2 = 1 .8) = 0.24 (p.3292 0.0000 0.0. Px = cbinom(12.10) = 0.Pz]') 3.2971 7.21 (p. cbinom(20.5134 0. Adjusted Gaussian.. 186) k = [80 100 120].0220 0. 186) p k P P = = = = p = exp(0.4111 0.cpoisson(5. 186) P (X > 500 + 500X > 500) = P (X > 500) = e−0.8264 4.9825 0.0294 0.9867 0.20) = 0. P = cbinom(1000.0000 0.19 (p.cpoisson(5.k).9110 0.1.001 Binomial.2636 0.11) = 0.0006 Solution to Exercise 7.1690 P1 = cpoisson(7.0390 0.6577 5. Py = cpoisson(4.100000) = 0.0282 Solution to Exercise 7.0002*100000.4655 0.5.Py.5297 P2 = cpoisson(0. 186) k = 3:8.002·500 = 0.p.23 (p.18 (p. 186) P1 = gammadbn(20.5133 0.9976 Solution to Exercise 7.7176 0.3) .22 (p.0000 0.Px.k) P1 = 0. 186) 0.3679 [5 8 10 12 15].k) 0.stairs Poisson.9863 Solution to Exercise 7.25 (p. disp([k..1695 P2 = 1 .
51) (7.5875 − 1 = 0.753 3.6712 + 0.46) P (Y ≥ 3) = 1 − P (Y ≤ 2) = 1 − e−6 (1 + 6 + 36/2) = 0.5620 Solution to Exercise 7. Y ∼ poisson (3 · 2 = 6) (7.75 + Solution to Exercise 7.3528 (7. 187) (7.4) = 0.54) .4212 (7. 187) P (3 < X < 9) = Φ ((9 − 5) /9) − Φ ((3 − 5) /9) = Φ (4/9) + Φ (2/9) − 1 = 0.6712 + 0.6547 Solution to Exercise 7.75) = 0.4212 b. Y ∼ poisson (1/3 · 20) (7.48) P (Y ≥ 6) = cpoisson (20/3.81.42) P (Y ≥ 5) = 1 − P (Y ≤ 4) = 1 − e−3.p.47) Z∼ gamma (6.81.81. 187) P (X ≤ 25) = P (Y ≥ 5) .2596 P2 = gaussian(4.27 (p. Y ∼ poisson (0.84.29 (p.52) P1 = gaussian(4.2) P1 = 0.2 · 30 = 3) 32 33 + 2 3! = 0.31 (p.752 3.4181 Solution to Exercise 7.43) P (X ≤ 30) = P (Y ≥ 4) .2587 P (X − 5 ≤ 5) = 2Φ (5/9) − 1 = 1.71. DISTRIBUTION AND DENSITY FUNCTIONS Solution to Exercise 7. P (2 < X < 8) = Φ ((8 − 4) /9) − Φ ((2 − 4) /9) = Φ (4/9) + Φ (2/9) − 1 = 0.754 + + 2 3! 24 (7.2587 P (X − 4 ≤ 5) = 2Φ (5/9) − 1 = 1. 187) (7.4212 − 1 = 0.5875 − 1 = 0.1) P2 = 0.3225 (7.0.26 (p.45) P (X ≤ 2) = P (Y ≥ 3) . P (Z ≤ 20) = P (Y ≥ 6) .15 · 25 = 3.35 1 + 3. 187) 3. 6) = 0.25) = 0.9380 Solution to Exercise 7.44) P (Y ≥ 4) = 1 − P (Y ≤ 3) = 1 − e−3 1 + 3 + Solution to Exercise 7.50) (7.3483 Solution to Exercise 7.33 (p. Y ∼ poisson (0.192 CHAPTER 7. 187) a. 187) p = exp(15/30) = 0.8) . 187) (7.6065 P = cbinom(5.49) p = cbinom(35. (7.30 (p.gaussian(4.gaussian(4.9) .32 (p.1/3).53) (7.4212 − 1 = 0.28 (p.
25 ≤ X ≤ 0.61) FX (t) = t −1 fX = 1 2 t3 + 1 .56) (7. 300. 187) √ √ P (−5 < X < 10) = Φ 5/ 300 + Φ 10/ 300 − 1 = Φ (0.60) P 3 = P (X − 0.3) P = 0.35 (p.64. Solution to Exercise 7. 187) p = gaussian(10.95) p = 0.10) .75) + Φ (0.81.gaussian(10.577) − 1 = 0.p.36 (p. 188) Experiment with the mprocedure poissapp.75) = b.5) = P (−0.5987 − 1 = 0.331 P = gaussian (5.3721 P2 = gaussian(3.0) P2 = 0.83 − (−0.3317 Solution to Exercise 7.7734 + 0.3) P1 = 0.01. 187) P (1 < X < 9) = Φ ((9 − 3) /8) − Φ ((1 − 3) /9) = Φ (0.0.0.717 − 1 = 0.9) .1) P2 = 0.01.84.7) .8036 Solution to Exercise 7.289) + Φ (0. t2 = t3 /2 (7.57) P1 = gaussian(3.05) .64.81.614 + 0.5) = 7/8 2 (7. 10) − gaussian (5.2596 P2 = gaussian(5.25 ≤ 0.39 (p.9) .38 (p.193 P1 = gaussian(5. 188) 3 2 a.3185 P 2 = 2 0.81. 300.64.3829 P = cbinom(10.34 (p.3829 (7.64. (7.3721 P (X − 3 ≤ 4) = 2Φ (4/8) − 1 = 1.59) P 1 = 0.gaussian(5.3829 − 1 = 0.1) P1 = 0. 1 3 3 (3/4) − (−1/4) = 7/32 2 (7.37 (p.9.5 3 2 3 t = 1 − (−0.58) Solution to Exercise 7.4181 Solution to Exercise 7. −5) = 0. c.25) − 1 = 0.gaussian(5.gaussian(3.5) 3 1 = 0.5 ∗ 0.3829 Solution to Exercise 7.10.55) (7.gaussian(3. 188) Experiment with the mprocedure bincomp.
= t2 t3 − 2 8 (7.41 (p.. DISTRIBUTION AND DENSITY FUNCTIONS tappr Enter matrix [a b] of xrange endpoints [1 1] Enter number of x approximation points 200 Enter density as a function of t 1.53 /8 − 7/64 = 19/32 P 3 = 79/256 (7.52 /2 − 1. 188) 3 t − t2 8 a.*t.63) FX (t) = t2 2 − t3 8.40 (p. P 2 = 1.5*t.^2 Use row matrices X and PX as in the simple case cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot Solution to Exercise 7.^2 + .t) Use row matrices X and PX as in the simple case cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot .52 /2 − 0.. c.62) P 1 = 0. 6 5 1 3/4 (2 − t) = 79/160 1 (7.(3/8)*t. P1 = 6 5 1/2 t2 = 1/20 0 P2 = t2 + 6 5 6 5 5/4 1 t2 + 1/2 6 5 3/2 (2 − t) = 4/5 1 (7. 188) a.1] (t) t3 + I(1.66) tappr Enter matrix [a b] of xrange endpoints [0 2] Enter number of x approximation points 400 Enter density as a function of t (6/5)*(t<=1).64) P3 = b.^2 Use row matrices X and PX as in the simple case cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot Solution to Exercise 7.65) t FX (t) = 0 c. 0≤t≤2 tappr Enter matrix [a b] of xrange endpoints [0 2] Enter number of x approximation points 200 Enter density as a function of t t . (6/5)*(t>1).*(2 .2] (t) − + 5 5 5 2t − t2 2 (7.53 /8 = 7/64 b.194 CHAPTER 7. 2 7 6 fX = I[0.
1.1. To simplify exposition and to keep calculations manageable. mass per unit length). Each can be considered separately. random vectors As a starting point. In the absolutely continuous case. Various mprocedures and mfunctions aid calculations for simple distributions. the distribution may also be described by a probability density function The probability density is the linear density of the probability mass along the real line (i. Y ) : (0. For a simple random variable. we consider a pair of random variables as coordinates of a twodimensional random vector.7/>.. 8. consider a simple example in which the probabilistic interaction between two random quantities is evident. a simple approximation may be set up. but usually they have some probabilistic ties which must be taken into account when they are considered jointly.1: A selection problem Two campus jobs are open. the probability distribution consists of a point mass pi at each possible value ti of the random variable. Let of juniors selected (possible values 0. 1. realvalued random variable is a function (mapping) from the basic space is. or (2.org/content/m23318/1. They seem equally qualied. 1). In the absolutely continuous case. so it is decided to select them by chance. The mapping induces a probability mass distribution on the real line. Each combination of two is equally likely.2 Random variables considered jointly. to each possible outcome ω of an experiment there corresponds a real value Ω to the real line.1. We treat the We extend the techniques for a single random variable to the multidimensional case. 195 .1 Random Vectors and Joint Distributions1 8. The distribution is described by a distribution function FX . joint case by considering the individual random variables as coordinates of a random vector. 0). (1. Two juniors and three seniors apply. since they are impossible. That t = X (ω). Determine the probability for each of the possible pairs. Example 8. 2). Others have zero probability. Often we have more than one random variable. 1 This content is available online at <http://cnx. with no point mass concentrations. so that calculations for the random variable are approximated by calculations on this simple distribution. The concepts and results extend directly to any nite number of random variables considered jointly.1 Introduction fX . 2) and Y X be the number be the number of seniors selected (possible values 0. 2) . which provides a means of making probability calculations. However there are only three possible pairs of values for (X.e.Chapter 8 Random Vectors and joint Distributions 8. The density is thus the derivative of the distribution function. A single.
we simply determine how much induced probability mass is in that region. ω ∈ Ω. i −1 (Q) is an event for each reasonable set (technically. Y ).3 Induced distribution and the joint distribution function In a manner parallel to that for the singlevariable case. so that If we represent the pair of values {t. RANDOM VECTORS AND JOINT DISTRIBUTIONS SOLUTION There are one of each. as they must. There are C (5. Y = 2) = 3/10. A fundamental result from measure theory ensures W = (X. The plays a role analogous to that of the inverse mapping A twodimensional vector W is a random vector X −1 for a real random variable. (which we call t) we determine the region for which the rst coordinate value is between one and three and the second coordinate value (which we call greater than zero. Gometrically. We also have the distributions for the X = [0 1 2] P X = [3/10 6/10 1/10] We thus have a Y = [0 1 2] P Y = [1/10 6/10 3/10] (8. The argument parallels probability distribution induced by W = (X. Six pairs can be P (X = 0. As in the singlevariable case. How this is acheived depends upon the nature of the distribution and how it is described. we have a distribution function. W = (X. Thus P (X = 1. W Since W is a function. W = (X. u} as the point (t. Y = 0) = 1/10 (8. 2) = 10 equally likely pairs. since this exhausts the mutually exclusive possibilities. This corresponds to the set Q u) is of points on the plane with 1≤t≤3 and u > 0. u) on the plane.4) Because of the preservation of set operations by inverse mappings as in the singlevariable case. Y ). random variables conisidered individually. We formalize as follows: A pair {X. Y ) assignment determines −1 (Q) (8. 2) = 3 ways to select pairs of seniors. To W = (X. W (t. we obtain a mapping of probability mass from the basic space to the plane.196 CHAPTER 8. Y = 1) = 6/10. u). each Borel set) on the plane. W (ω) = (t. this is the strip on the plane bounded by (but not including) the horizontal axis and by the vertical lines t = 1 and t = 3 (included). we may assign the probability mass PXY (Q) = P W −1 (Q) = P (X. Y } of random variables considered jointly is treated as the pair of coordinate functions for To each assigns the pair of real numbers a twodimensional random vector where then X (ω) = t and Y (ω) = u. and In the selection example above. all mapping ideas extend.1. Only one pair can be both C (3. Y ) : Ω → R2 is a mapping from the basic space inverse mapping (8. Since to Q W −1 (Q) is an event for each reasonable set Q on the plane. The probability of any other combination must be zero.2) joint distribution and two individual or marginal distributions. Y ) takes on a (vector) value in region determine the probability that the vectorvalued function Q. Hence the vectorvalued function 8. R2 . juniors. we model X (the number of juniors selected) Y (the number of seniors selected) as random variables. Y > 0). the mass PXY as a probability measure on the subsets of the plane The result is the that for the singlevariable case. u). Example 8.1) These probabilities add to one. P (X = 2.2: Induced distribution and probability calculations To determine P (1 ≤ X ≤ 3. Y ) is a random vector i each of the coordinate functions X and Y is a random variable. The problem is to determine how much probability mass lies in that strip. Denition .3) Ω to the plane W −1 R2 .
1: The region Qab for the value FXY (a. uj )] = P (X = ti . Y ) there in the general case. b). Thus. should there be mass concentrations on the boundary. to determine In the discrete case (or in any P [(X. u = b). Y ) ∈ Q] we determine how much probability mass is in the region. s ≤ u} (8. u) ∈ R2 This means that (8. the induced Figure 8. t = a. Now for a given point of the vertical line through point where Qtu = {(r. We refer to such regions as semiinnite intervals on the plane. Y = uj ).6) (a. uj ) pij = P [W = (ti . u) = P (X ≤ t. u) is equal to the probability mass in the region rst coordinate is less than or equal to may write t Qtu and the second coordinate is less than or equal to u. Distribution function for a discrete random vector The induced distribution consists of point masses. u) on the plane which are on or to the left (t. the region Qab is the set of points (t. u) (see Figure 1 for specic Qtu . Formally. 0)and on or below the horizontal line through (0. u) = P [(X. we FXY (t.197 The joint distribution function FXY for W = (X. . b). s) : r ≤ t. Y ≤ u) ∀ (t. The theoretical result quoted in the real variable case extends to ensure that a distribution on the plane is determined uniquely by consistent assignments to the semiinnite intervals distribution is determined completely by the joint distribution function. Y ) ∈ Qtu ] . Y ) is given by FXY (t. is probability mass At point (ti .5) on the plane such that the FXY (t. As in the range of W = (X. case where there are point mass concentrations) one must be careful to note whether or not the boundaries are included in the region.
and fourth quadrants. (1.2: Example 8. (0. The value of FXY (t.1 (A selection problem) The probability distribution is quite simple. if is in any of the three squares on the lower left hand part of the diagram.198 CHAPTER 8. u) is in the square or on the left and lower boundaries.1). u) value is constant over any grid cell.2). Mass 3/10 at (0. u) = 1 (1 + 6tu) 10 (8. think of moving the point corner at To determine (and visualize) the joint distribution with This (t.0). (t. • If the point (t.3: Distribution function for the selection problem in Example 8. Figure 8. u) is on or in the square having probability 6/10 at the lower lefthand corner. the sheet covers the point mass at (0.0). The region Qtu is a giant sheet (t. Y } produces a mixed distribution as follows (see Figure 8. u) = 6/10. and the value of cells may be checked out by this procedure. Thus. Thus in this region FXY (t. If (t. third. This distribution is plotted in Figure 8.3) Point masses 1/10 at points (0. The situation in the other Distribution function for a mixed distribution Example 8. u) is the amount of probability covered by the sheet.4: A mixed distribution The pair {X. function.0).1 (A selection problem)). including the lefthand and lower boundariies.0) plus 0.6 times the area covered within the square.2.7) . and is the value taken on at the lower lefthand corner of the cell. (1. RANDOM VECTORS AND JOINT DISTRIBUTIONS The joint distribution for Example 8. 6/10 at (1.1) Mass 6/10 spread uniformly over the unit square with these vertices The joint distribution function is zero in the second. then the sheet covers that probability. FXY (t.3 (Distribution function for the selection problem in Example 8.1). no probability mass is covered by the sheet with corner in the cell. u) on the plane. and 1/10 at (2. u).
u) = • If 1 (2 + 6u) 10 (8. In this case t = 1. u) = • If the point 1 (2 + 6t) 10 0 ≤ u < 1. u) is to the right of the square (including its boundary) with the sheet covers two point masses and the portion of the mass in the square below the horizontal line through (t.3: Mixed joint distribution for Example 8. In general. then the distribution for each of the component random variables may be determined.1. FXY (t. u) is above the square (including its upper boundary) but to the left of the line (t. the converse However. 8. if the component random variables form an independent pair. These are known as is not true. both 1 ≤ t and 1 ≤ u). the treatment in that case shows that the marginals determine the joint distribution. u).4 (A mixed distribution). to give FXY (t. u). u) is above and to the right of the square (i. u) = 1 in this region. .e.4 Marginal distributions If the joint distribution for a random vector is known.9) mass is covered and (t. (8..8) (t. marginal distributions.199 • If the pont (t. then all probability Figure 8. the sheet covers two point masses plus the portion of the mass in the square to the left of the vertical line through FXY (t.
the upper boundary of mass on or to the left of the vertical line through Now Qtu is pushed up until eventually all probability (t. u) is included.12) This mass is projected onto the vertical axis to give the marginal distribution for Y. note that FX (t) = P (X ≤ t) = P (X ≤ t.4.10) FX (t) = FXY (t. Marginals for a joint discrete distribution . ∞) = lim FXY (t. u).11) (t.200 CHAPTER 8. Consider the sheet for point (8. The probability mass described by FX (t) is the same as the total joint probability mass on or to the left of the vertical line through mass in the half plane being projected onto the horizontal line to give the parallel argument holds for the marginal for Y. u) (8. u) = mass on or below horizontal line through (t. marginal (t. Figure 8. Y can take any of its possible values (8. We may think of the distribution for X.e. u) u→∞ This may be interpreted with the aid of Figure 8. If we push the point up vertically. This is the total probability that X ≤ t.. Y < ∞) Thus i. u). RANDOM VECTORS AND JOINT DISTRIBUTIONS To begin the investigation.4: Construction for obtaining the marginal distribution for X. A FY (u) = P (Y ≤ u) = FXY (∞. FX (t) describes probability mass on the line.
0). Y = uj ) and P (Y = uj ) = i=1 P (X = ti . until a nal jump at t = 1 in the amount of 0. 4/10. has masses 4/10. respectively. the marginal distribution for Y t = 0. (1. Y } produces a joint distribution that places mass (0. (3.13) Thus. 3.5: Marginals for a discrete distribution The pair {X. (1. onto the point uj Similarly. Figure 8. Y = uj ) (8.1) Mass 6/10 spread uniformly over the unit square with these vertices The construction in Figure 8. 0) . uj ) is projected on a vertical line to give P (Y = uj ).0). 2. (2. 2/10 at points Similarly. (0. 2/10. X has masses 2/10. slope 0. 2.5) The marginal distribution for respectively. 1.1).201 Consider a joint simple distribution. There t = 0. 2/10 at each of the ve points Example 8.6 Consider again the joint distribution in Example 8. m n P (X = ti ) = j=1 P (X = ti . 2/10 at points u = 0. produces a mixed distribution as follows: Point masses 1/10 at points (0.6 shows the graph of the marginal distribution function is a jump in the amount of 0. Y } FX . 1) . 1) (See Figure 8. (1. all the probability mass on a horizontal line through (ti . 1. (2. 0) is pro jected onto the point ti on a horizontal (0. .2 corresponding to the two point masses on the vertical line. 4/10.4 (A mixed distribution).5: Marginal distribution for Example 1. Then the mass increases linearly with t.6. Example 8.2 at The pair {X. 2) . all the probability mass on the vertical line through line to give P (X = ti ). 0) .
the total mass is covered and FX (t) is constant at one for t ≥ 1.1 mprocedures for a pair of simple random variables We examine. in W = (X. These are.6: Marginal distribution for Example 8. . 8.2 Random Vectors and MATLAB2 eect. Values on the horizontal axis ( axis) correspond to values t Ω to the plane. two components of a random vector The induced distribution is on the 8. Y .202 CHAPTER 8. RANDOM VECTORS AND JOINT DISTRIBUTIONS produced by the two point masses on the vertical line. Y ). 2 This content is available online at <http://cnx. the marginal distribution for Y is the same. u)plane. Figure 8.2. By symmetry.org/content/m23320/1. which maps from the basic space (t. At t = 1. calculations on a pair of simple random variables X. rst.6/>.6. considered jointly.
we display the various steps utilized in the setup procedure. matrix g (ti ) in a position corresponding to Determine P (X = ti ) P (g (X) ∈ M ). um ] is the set of values for random P (X = ti . Formation of the t and u matrices is achieved by a basic setup mprocedure called jcalc. Y = uj ).14) Z = g (X). with X values where pij = increasing to This is dierent from the usual arrangement in a matrix.17) b. In the following example. t2 .. The data for a pair matrices {X. tn ] is the set of values for random variable variable Y. The mprocedure takes care of this inversion. We usually represent the distribution P (X = ti . we form computational matrices (ti . with ones in the desired positions. we we use array operations on X to form a matrix G = [g (t1 ) g (t2 ) · · · g (tn )] which has (8. uj . considered jointly. · · · . Y = uj ) atthe (ti .e. Y = uj ) at the point (ti . t2 .e. · · · . The data X Y = [u1 . where M in matrix P X. uj ) on the plane. Y values increasing upward. X and on the plane. First. probabilities In this case. and row from the bottom) MATLAB array and logical operations t. is some prescribed set of values. data consist of values ti and corresponding P (X = ti ) arranged in matrices X = [t1 . uj ) position (8. Y } of random variables are the values of X and Y. tn ] To perform calculations on and P X = [P (X = t1 ) . Example 8.203 of the rst coordinate random variable X and values on the vertical axis ( axis) correspond to values of u Y. these intermediate steps would not be displayed. • • Use relational operations to determine the positions N. let us review the onevariable strategy. uj ) position. P perform the specied operations on P (X = ti . u.7: Setup and basic calculations jdemo4 % Call for data in file jdemo4. · · · .m jcalc % Call for setup procedure Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X . We arrange the joint probabilities as the right and for this procedure are in three matrices: X = [t1 . P (X = t2 ) . Thus P with elements arranged corresponding to the P has element P (X = ti . in which values of the second variable increase downward.. Ordinarily. u2 . which we may put in row (8.15) Basic problem. utilizing the MATLAB function meshgrid. a. c. g (ti ) ∈ M . at each point on the jth t and u such that t has element ti at each i th column from the left) u has element uj at each (ti . We extend the computational strategy used for a single random variable. P (X = tn )] (8. To perform calculations. at each point on the position (i. in a manner analogous to the operations in the singlevariable case. and P = [pij ]. P (X = ti ) for which These will be in a zeroone Select the in the corresponding positions and sum. Y = uj ) at each (ti . and computes Y. uj ) on position (i. uj ) ti . · · · . Y = uj ) in a matrix P.16) X = [t1 t2 · · · tn ] andY = [u1 u2 · · · um ] and the joint probabilities graphically by putting probability mass P (X = ti . The mprocedure forms the matrices the marginal distributions for t and u. MATLAB operations to determine the inner product of N This is accomplished by one of the and PX We extend these techniques and strategies to a pair of simple random variables. This joint probability may is represented by the matrix mass points on the plane.
1161 0.7336 % P(g(X.0744 0.1244 .2088 PY % Optional call for display of PY PY = 0.3*u % Formation of G = [g(t_i.1032 0..0516 0.0774 0. % Formation of t.1900 0.1800 0..1244 [t.3: Distribution function for the selection problem in Example 8..0744 0.0516 0 0.1512 0. PX... RANDOM VECTORS AND JOINT DISTRIBUTIONS Enter row matrix of VALUES of Y Y Use array operations on matrices X.fliplr(Y)).0180 0..0817 0. Y.0817 0.u_j)] matrix = 3 6 5 3 19 6 3 2 6 22 9 0 1 9 25 15 6 7 15 31 M = G >= 1 % Positions where G >= 1 M = 1 0 0 1 1 1 0 0 1 1 1 0 1 1 1 1 1 1 1 1 pM = M.2700 0.0264 0.2700 0.0589 0.*P % Selection of probabilities pM = 0.0180 0.1512 0. P X 2 − 3Y ≥ 1 .0405 0. Using array operations on t G = t.4300 0.0558 0.0209 0. PY.0270 0. In Example 3 (Example 8.2088 PY = fliplr(sum(P')) % Calculation of PY (note reversal) PY = 0.1032 0.0285 0.0837 0.4300 0.0372 0.1356 0..3100 0.0360 0. u. u matrices (note reversal) disp(t) % Display of calculating matrix t 3 0 1 3 5 % A row of Xvalues for each value of Y 3 0 1 3 5 3 0 1 3 5 3 0 1 3 5 disp(u) % Display of calculating matrix u 2 2 2 2 2 % A column of Yvalues (increasing 1 1 1 1 1 % upward) for each value of X 0 0 0 0 0 2 2 2 2 2 Suppose we wish to determine the probability and u..0270 0.Y) >= 1) G d. t.u] = meshgrid(X.1161 0.0264 0.0198 0. uj )].0132 PM = total(pM) % Total of selected probabilities PM = 0.0209 0.1800 0.204 CHAPTER 8.1900 0.1 (A selection problem)) from "Random Vectors and Joint Distributions" we note that the joint distribution function . and P disp(P) % Optional call for display of P 0.% Steps performed by jcalc PX = sum(P) % Calculation of PX as performed by jcalc PX = 0.^2 .0285 0.0372 0 0 0.0132 PX % Optional call for display of PX PX = 0.0297 0.0360 0 0 0. we obtain the matrix G = [g (ti .3100 0.1356 0.0405 0.0589 0.
systematically from the joint probability matrix P by a two step operation.3: Distribution function for the selection problem in Example 8.6000 0. By ipping the matrix and transposing. Example 8.6000 0. • • Take cumulative sums upward of the columns of P. 0 6 0.1000 0 0 0. Take cumulative sums of the rows of the resultant matrix. including the lefthand and lower boundaries. This can be done with the MATLAB function cumsum. 0 0 1].7: The joint distribution for Example 3 (Example 8. FXY = flipud(cumsum(flipud(P))) % Cumulative column sums upward FXY = 0.205 FXY is constant over any grid cell.1*[3 0 0. at the value taken These lower lefthand corner values may be obtained on at the lower lefthand corner of the cell.1 (A selection problem)) in "Random Vectors and Joint Distributions'. which takes column cumulative sums downward.1000 Figure 8.3000 0.3000 0.9000 1.0000 0 0.1000 0 0. we can achieve the desired results. Comparison with Example 3 (Example 8.1000 FXY = cumsum(FXY')' % Cumulative row sums FXY = 0.6000 0.3: Distribution function for the selection problem in Example 8.3: Distribution function for the selection problem in Example 8.8: Calculation of FXY values for Example 3 (Example 8.1 (A selection problem)) from "Random Vectors and Joint Distributions" P = 0. .1 (A selection problem)) from "Random Vectors and Joint Distributions" shows agreement with values obtained by hand.7000 0 0 0.
0000 0. Y } assigns zero probability to every set of points with the property fXY Q (8.3390 0. y. ∞) = −∞ −∞ fXY (r.2) for Example 3 (Example 8. RANDOM VECTORS AND JOINT DISTRIBUTIONS The two step procedure has been incorprated into an mprocedure jddbn.19) At every continuity point for fXY .0939 0.1224 0. As an example.6012 0. useful in probability calculations.21) .20) FX (t) = FXY (t.3: Distribution function for the selection problem in Example 8.6848 0.P) The quantities x. and countable unions of these) there is a density function.u.1356 These values may be put on a grid.3312 0.206 CHAPTER 8. u) ∂t ∂u t ∞ (8.1 (A selection problem)) in "Random Vectors and Joint Distributions".7912 1.1824 0. Denition If the joint probability distribution for the pair with zero area.t.py. 8.8756 0.0264 0. px. s) dsdr (8. A similar situation exists for a joint distribution for two (or more) variables. Y ) ∈ Q] = {X. As in the case of canonic for a single random variable. u) = −∞ −∞ fXY (8. and p function[x.7 (Setup and basic calculations) jddbn Enter joint probability matrix (as on the plane) P To view joint distribution function. u.2 Joint absolutely continuous random variables In the singlevariable case. return to the distribution in Example Example 8. py.5656 0.px.9: Joint distribution function for Example 8. line or curve segments. = jcalcf(X. u) = Now ∂ 2 FXY (t.y.0534 0. the density is the second partial fXY (t.p] may be given any desired names. For any joint mapping to the plane which assigns zero probability to each set with zero area (discrete points. in the same manner as in Figure 2 (Figure 8. the condition that there are no point mass concentrations on the line ensures the existence of a probability density function.18) We have three properties analogous to those for the singlevariable case: t (f1) u fXY ≥ 0 (f2) fXY = 1 R2 (f3) FXY (t.Y.7 (Setup and basic calculations) Example 8.2.0780 0. e. then there exists a joint density function fXY P [(X.1512 0.2754 0.1152 0. call for FXY disp(FXY) 0.4492 0. t. it is often useful to have a function version of the procedure jcalc to provide the freedom to name the outputs conveniently.5157 0.
t = 1 (see Figure 8. u) du (8. Example 8.5) = P [(X.8: Distribution for Example 8.5.23) fXY (t.22) Marginal densities.24) P (0. and similarly for the marginal for the second variable.75 and above the line u = 0. Thus the strip between 3/4 t p=8 1/2 1/2 tu dudt = 25/256 ≈ 0. u) du = 8t 0 1 u du = 4t3 . .75. integrate out the second variable in the joint density. s) ds and fY (u) = −∞ fXY (r.10 (Marginal density functions). Y ) ∈ Q] where Q is the common part of the triangle with t = 0. Thus.5 and t = 0. Y > 0. to obtain the marginal density for the rst variable.8) fX (t) = fY (u) = This region is the triangle bounded by u = 0.25) Figure 8. and t fXY (t. u) dt = 8u u (8.10: Marginal density functions Let fXY (t. u = t.207 A similar expression holds for gives the result FY (u). This is the small triangle bounded by u = 0. u = t. and t = 0.5 ≤ X ≤ 0.5.75. u) = 8tu 0 ≤ u ≤ t ≤ 1. 0 ≤ t ≤ 1 t dt = 4u 1 − u2 . ∞ Use of the fundamental theorem of calculus to obtain the derivatives ∞ fX (t) = −∞ fXY (t.0977 (8. 0 ≤ u ≤ 1 (8.
. Examination of the gure shows that we have dierent limits for the integral with respect to for u 0≤t≤1 For and for 1 < t ≤ 2. SOLUTION by t = 0. Likewise.208 CHAPTER 8. therefore express fX by fX (t) = IM (t) 12 2 6 (t + 1) + IN (t) t 37 37 (8.28) Figure 8. 1] and N = (1. RANDOM VECTORS AND JOINT DISTRIBUTIONS Example 8. t} (see Figure 8.e. Determine the marginal density fX . 2]. . 6 37 1 • 0≤t≤1 fX (t) = (t + 2u) du = 0 6 (t + 1) 37 (8.11 (Marginal distribution with compound expression). u) = 37 (t + 2u) on the region bounded u = 0. t = 2. M = [0. Y } has joint density fXY (t.26) • For 1<t≤2 fX (t) = 6 37 0 t (t + 2u) du = 12 2 t 37 (8.9).27) We may combine these into a single expression in a manner used extensively in subsequent treatments. We can. Then IM (t) = 1 for t ∈ M (i. Suppose elsewhere. and u = max{1.11: Marginal distribution with compound expression The pair 6 {X. 0 ≤ t ≤ 1) and zero IN (t) = 1 for t ∈ N and zero elsewhere.9: Marginal distribution for Example 8.
3 Discrete approximation in the continuous case For a pair {X. We then utilize the techniques developed for a pair of simple random variables. ~tuappr Enter~matrix~[a~b]~of~Xrange~endpoints~~[0~1] Enter~matrix~[c~d]~of~Yrange~endpoints~~[0~1] Enter~number~of~X~approximation~points~~200 Enter~number~of~Y~approximation~points~~200 Enter~expression~for~joint~density~~3*(u~<=~t.^2) Use~array~operations~on~X.*P)~~~~~~~~~~%~Evaluation~of~the~integral~with p~=~~~0. if the increments are small enough. uj ) (8.29) P ((X.~t.8.~and~P ~M~=~(t~<=~0. uj )) = P (X ∗ = ti . If we let the approximating pair {X ∗ . .1).3352455531 The discrete approximation may be used to obtain approximate plots of marginal distribution and density functions. Y > 0. we then have n · m pairs (ti . in ij th rectangle)(8. If we subdivide the horizontal axis for values of n approximating values ti for X its increment. with constant increments dy . we have a grid structure consisting of rectangles of size dx · dy . Y ) As in the onevariable case. uj ) is at the midpoint of the rectangle. It then prompts for an expression for the joint probability distribution. Y ∗ }. Y } with joint density fXY .209 8.31) P (X ≤ 0.12: Approximation to a joint continuous distribution fXY (t.3355~~~~~~~~~~~~~~~~%~Maple~gives~0. We select ti and uj at the midpoint of corresponding to points on the plane.2.~u. calculating matrices fXY (t. with constant increments dx. If we have m approximating values uj for Y. u). Y ∗ = uj ) = P ((X. and the vertical axis for values of Y.~PY. Example 8.~PX.1). uj ). It calculates the marginal approximate distributions and sets up the Calculations are then carried out as for any joint simple pair. we assign pij = P ((X ∗ . so that the point be (ti . X.~Y. ~p~=~total(M.8)&(u~>~0. from which it determines u as does the mprocess jcalc for simple random variables. u) = 3 Determine on 0 ≤ u ≤ t2 ≤ 1 (8. Y ∗ ) = (ti . as in the singlevariable case. and we approximate the distribution in a manner similar to that for a single random variable. Y ) ∈ ij th The mprocedure rectangle) ≈ dx · dy · fXY (ti .30) tuappr t and calls for endpoints of intervals which include the ranges of X and Y and for the numbers of subintervals on each.
210 CHAPTER 8.~~~~~~~~~~~~~~~~%~Density~for~Y ~FX~=~cumsum(PX). expectations.fx.FX)~~~~~~~~~~~~%~Plotting~details~omitted These approximation techniques useful in dealing with functions of random variables. RANDOM VECTORS AND JOINT DISTRIBUTIONS Marginal density and distribution function for Example 8. and conditional expectation and regression.~t. u ≤ 1 + t.*(u<=min(1+t.13 (Approximate plots of marginal density and distribution functions). ~tuappr Enter~matrix~[a~b]~of~Xrange~endpoints~~[1~1] Enter~matrix~[c~d]~of~Yrange~endpoints~~[0~1] Enter~number~of~X~approximation~points~~400 Enter~number~of~Y~approximation~points~~200 Enter~expression~for~joint~density~~3*u.~u.10: Example 8.~PY. . u) = 3u on the triangle bounded by u = 0.~PX.~Y.~and~P ~fx~=~PX/dx. and u ≤ 1 − t.10) ~FY~=~cumsum(PY).~~~~~~~~~~~%~Distribution~function~for~X~(Figure~8.13: Approximate plots of marginal density and distribution functions fXY (t.~~~~~~~~~~~%~Distribution~function~for~Y ~plot(X.X.10) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~%~Theoretical~(3/2)(1~~t)^2 ~fy~=~PY/dy.~~~~~~~~~~~~~~~~%~Density~for~X~~(see~Figure~8. Figure 8.1t)) Use~array~operations~on~X.
7 1.0651 0.8.0620 0.org/content/m24244/1.) times. {X. .33) Determine the marginal distributions and the corner values for P (X + Y > 2) P (X ≥ Y ).0323 0. A coin is ipped X (Solution on p.0589 FXY .5 4. Y = u) 3 This (8.0609 0. Y } and from this determine the marginal distribution for Y.0361 0.34) content is available online at <http://cnx. 216.8. Y } and from this determine the marginals for each. with values of the marginal distribution for Example 7 (Example 14.0551 0.0713 0.) Two positions for campus jobs are open.0667 and 0. 217.0380 0. Suppose a pair of dice is rolled instead of a single die.0399 0.38: npr08_07)): Exercise 8. from a standard deck.2 (Solution on p.1 number of aces and (Solution on p.3 − 0.3 Problems On Random Vectors and Joint Distributions3 Exercise 8. Determine the joint distribution for the pair and for each P (X = k) = 1/6 for 1≤k≤6 bution. Determine {X. Let sophomores and the pair Y X be the number of be the number of juniors who are selected.1 5. without replacement.1] 0. Y } P (X = t. (Solution on p.0580 Y = [1. Two sophomores. (For a MATLAB based way to determine the joint distribution see A random number N of Bernoulli trials) from "Conditional Expectation.7 The pair {X. determine the joint distribution and the marginals. 218. Arrange the joint matrix as on the plane. Assume binomial (k.0493 0. P (Y = jX = k) has the Y increasing upward. Exercise 8. Determine the joint distribution for the pair {X.3 2.) be the total number of spots which turn up. (Solution on p. 218.37: npr08_06)): {X.3. It is decided to select two at random (each possible pair equally likely). Let an additional X times.4/>. Under the usual assumptions.0357 0.0399 0. Exercise 8. Y Let X be the be the number of spades.) has the joint distribution (in mle npr08_06.0527 0. What is the probability of three or more sevens? Exercise 8. Y }.3] 0. 215.3 A die is rolled. Determine the joint distribution for {X.7: Regression") Y.1 3. Let X be the number that turns up. Exercise 8. 1/2) distriDetermine Exercise 8. Determine (8.0483 0. Y } and from this determine the marginal distribution for Y.9 5.6 The pair (Solution on p.0420 0.211 8. and three seniors apply.4 the joint distribution for the pair As a variation of Exercise 8. 215. 216.m (Section 17. Y } X = [−2.m (Section 17.) has the joint distribution (in mle npr08_07.32) 0.) k. Let Y be the number of heads that turn up.5 Suppose a pair of dice is rolled.) Two cards are selected at random. Let Y X (Solution on p. Roll the pair be the number of sevens that are thrown on the X rolls.0441 (8.0437 P = 0. three juniors.
6) and P (X > Y ).0156 0. Determine P (1 ≤ X ≤ 4.0196 0.0203 0.039 0.0035 0.9 0.017 .0056 0.0594 0.001 0.8 3.0260 0.065 0.0495 0.0528 0.0060 0.5 0.0064 0.0070 0.0256 0.0038 0.0049 0.045 2.1 2.0026 15 0.) Exercise 8.0189 0.0077 Table 8.5 0.010 0.0440 0.0108 0.039 0.m (Section 17.0191 0.163 0. 219.050 0.0132 3.0120 0.009 2 0. in minutes.0112 0.0297 0 4. The data are as follows (in mle npr08_09.0172 17 0.1 0.0234 0.0012 0.0108 0.0126 0.0040 0.0054 0.5 0. Y } has the joint distribution (in mle npr08_08.0081 0.2 Determine the marginal distributions.058 0.0091 0.0244 0. RANDOM VECTORS AND JOINT DISTRIBUTIONS t = u = 7. and Data were kept on the eect of training time on the time to perform a job on a production line.005 0.9 is the amount of training.1320 0.0182 0.0096 0.0126 0.0256 0.40: npr08_09)): P (X = t.1 Determine the marginal and distributions and the corner values for FXY .0216 0.1089 0.39: npr08_08)): P (X = t.001 0.049 0.0070 0. Determine FXY (10.0064 0.025 3 0.0268 0.0063 7 0.0084 0.0252 0.0160 0.7 0.0045 9 0.0056 0.0510 0. (Solution on p.0104 0.8.0280 0.0234 0.015 0.0120 0. Y > 4) Exercise 8.0044 0.0072 0.0204 0.0091 0.070 0.031 0.003 1.051 0.35) t = u = 12 10 9 5 3 1 3 5 1 0.0098 0.0 3.0396 0 0.0217 19 0.4 0.137 0.0160 0.0180 0. Y X is the time to perform the task.0196 0.0050 0.0096 0.011 0.0017 5 0.0162 0.0484 1.2 0.0060 0.0040 0.0056 0.0096 0.033 0.36) t = u = 5 4 3 2 1 1 0.0256 0.0182 0.0090 13 0.012 0. (Solution on p. Y = u) (8.0156 0.0140 0.0405 0.0090 0.0167 11 0.0231 0. Y = u) (8.0134 0.0140 0.0180 0.5 4.0056 0.0112 0.0182 0.8 The pair P (X − Y  ≤ 2).m (Section 17.0080 0.8.0168 0.061 0.0324 0.0726 2.212 CHAPTER 8.0060 0.0084 0.0363 0.0891 0.0223 Table 8.0060 0. 220. in hours.0072 3 0.0200 0.) {X.
) on the parallelogram with vertices fXY (t. 220. Y > 1) . Y ≤ 1/2) . 1) . P (X > 0. Y > 1/2) .37) Exercise 8. Exercise 8. 1) .) fXY (t. 0 ≤ u ≤ 2. Determine FXY (2.15 (Solution on p. Y > 1/2) . P (X ≤ 1/2.13 (Solution on p. (0. Y > 1) . P (Y ≤ X) Exercise 8.213 Table 8. Calculate analytically the indicated probabilities.) for fXY (t. 1). P (Y ≥ 1/2) . 0 ≤ u ≤ 1 (8. Sketch the region of denition and determine analytically the marginal density functions b. fX fX and fY .16 (Solution on p. Y > 1) . (0.41) P (X ≤ 1. u) = 3 88 2t + 3u2 for 0 ≤ t ≤ 2. 3) and P (Y /X ≥ 1. P (Y ≤ X) Exercise 8. P (X ≤ 1/2. Y > 1) . u) = 4t (1 − u) 0 ≤ t ≤ 1. For the joint densities in Exercises 1022 below a. P (X < 1/2. 221. P (X < Y ) Exercise 8. 2) . u) = 1 0 ≤ t ≤ 1. and the marginal distribution function c. (8. Use a discrete approximation to plot the marginal density FX . P (X − Y  < 1) Exercise 8. 1 < Y ) . 0 ≤ u ≤ 2 (1 − t). P (X ≤ 1. (1. (0. (1. P (0 ≤ X ≤ 1. 0) .5. 1) . 224. Y > 1/2) .42) FXY (1.10 (Solution on p. 1) (8. 0 ≤ u ≤ 1 + t. 223.) for fXY (t.) for fXY (t.) fXY (t.25). Y > 0) .) on the square with vertices at fXY (t. d. u) = 12t2 u (−1. 221.39) P (1/2 < X < 3/4. 0) .43) P (X ≤ 1/2. 222.3 Determine the marginal distributions. (8. 0 ≤ u ≤ 1. P (0 ≤ X ≤ 1/2. P (Y ≤ X) Exercise 8. u) = 1 8 (t + u) for 0 ≤ t ≤ 2. 223. 1/2 < Y < 3/4) . u) = 1/2 (1.14 (Solution on p.12 (Solution on p. u) = 4ue−2t 0 ≤ t. P (X > 1/2. (8. (2.40) P (X > 1/2.38) P (X > 1. Y > 1) .11 (Solution on p. 0) . (8. P (Y ≤ X) (8. Y > 1/2) . Determine by discrete approximation the indicated probabilities.
) fXY (t.49) P (1/2 ≤ X ≤ 3/2.) fXY (t. 225. (8. 0 ≤ u ≤ min{1 + t. Y ≤ 1) . 0 ≤ u ≤ min{2. u) = 12 227 (3t + 2tu) for 0 ≤ t ≤ 2. u) = 3 23 (t + 2u) for 0 ≤ t ≤ 2. P (X ≥ 1. Y ≥ 1) .) fXY (t. P (Y ≤ X) Exercise 8.48) P (X < 1) . 0 ≤ u ≤ max{2 − t. Y > 1) .21 (Solution on p. 3 − t} (8.44) Exercise 8. Y ≤ 1) . P (X ≤ 1.214 CHAPTER 8. u) = 24 11 tu 0 ≤ t ≤ 2. P (Y < X) Exercise 8.) 0 ≤ u ≤ 1.) for fXY (t.19 (Solution on p. 226. u) = 2 13 (t + 2u) for 0 ≤ t ≤ 2. P (X < Y ) (8. Y ≥ 1) . Y ≤ 3/2) . 0 ≤ u ≤ min{2t. u) = I[0. 227.17 (Solution on p. P (Y ≤ X/2) Exercise 8.) fXY (t.46) P (X ≥ 1. 228. RANDOM VECTORS AND JOINT DISTRIBUTIONS Exercise 8. 2} (8. P (Y < X) Exercise 8. P (X > 1) . 0 ≤ u ≤ min{1. Y ≤ 1/2) . 224. t} (8. 3 − t} (8.1] (t) 8 t2 + 2u + I(1. P (Y ≤ 1) .18 (Solution on p.47) P (X ≤ 1/2. 228. P (X ≤ 1. for 0 ≤ t ≤ 2.2] (t) 14 t2 u2 for (Solution on p.45) P (X ≥ 1. Y ≤ 1) .20 (Solution on p. u) = 12 179 3t2 + u .5.22 9 3 fXY (t. 2 − t} P (X ≤ 1.
18 12 0. Ai . 0) = P (A1 A2 ) = 8 • 7 = 56 P (1. 211) Let Ai . Dene the events ASi . P = Pn/56. Let 3 51 + 1 52 • 36 51 + 36 52 • 1 51 = 144 2652 % type npr08_01 (Section~17. disp('Data in Pn. X.2 (p. 1) = P (A1 B2 ) + P (B1 A2 ) = 2 • 3 + 8 • 2 = 56 8 7 7 2 1 2 P (2. Y = 0:2. on the be the number of sophomores and Y i th trial.8. 2) = P (∅) = 0 ace). selection. Y = 0:2.8. Y % Result PX = sum(P) PX = 0. 864 144 6. 1) = P (A1 S2 S1 A2 AS1 N2 N1 AS2 ) = 52 • 12 + 12 • 51 52 1 1 24 P (1. 2) = 0 P X = [30/56 24/56 2/56] P Y = [20/56 30/56 6/56] % file npr08_02. Si . 0) = P (C1 C2 ) = 3 • 2 = 56 8 7 18 P (0. 0) = P (A1 A2 ) = 52 • 51 = 2652 1 3 3 1 6 P (2.1448 0. 0) = P (A1 N2 N1 S2 ) = 52 • 36 + 52 • 51 = 2652 51 3 P (1. respectively. 6 12 2].5588 0. Let X be the number of juniors selected. and neither on the i and Ni . spade (other than the P (i. X. junior. P') npr08_02 (Section~17.1 (p. 35 36 P (0.33: npr08_02) . Y') npr08_01 % Call for mfile Data in Pn.215 Solutions to Exercises in Chapter 8 Solution to Exercise 8. other ace.1 X = 0:2. Bi .8.m (Section~17. or senior. disp('Data are in X. Set P (i. P. 1) = P (AS1 A2 A1 AS2 ) = 52 • 51 + 52 • 51 = 2652 P (2.m (Section~17. 211) Let X be the number of aces and Y be the number of spades. 2) = P (B1 B2 ) = 8 • 7 = 56 2 P (1. Ci be the events of selecting a sophomore. k) = P (X = i.0045 PY = fliplr(sum(P')) PY = 0. P = Pn/(52*51).33: npr08_02) % Solution for Exercise~8. Y = k).Pn. Pn = [6 0 0.3824 0. Y = k) 6 P (0. 0) = P (N1 N2 ) = 52 • 51 = 1260 2652 864 P (0. 2) = P (2.0588 Solution to Exercise 8. 2) = P (S1 S2 ) = 52 • 51 = 2652 3 36 3 216 P (1. 0) = P (A1 C2 ) + P (C1 A2 ) = 8 • 3 + 3 • 2 = 12 7 8 7 56 3 12 P (1. 2.32: npr08_01) % file npr08_01. i = 1. Y. Pn = [132 24 0. 1) = P (B1 C2 ) + P (C1 B2 ) = 3 • 3 + 3 • 3 = 56 8 7 8 7 3 2 6 P (0. 1) = P (N1 S2 S1 N2 ) = 36 • 12 + 12 • 36 = 2652 52 51 52 51 11 132 12 P (0. 2) = P (AS1 S2 S1 AS2 ) = 52 • 12 + 12 • 51 = 2652 51 52 2 6 3 P (2.2 X = 0:2. of drawing ace of spades.8. P.32: npr08_01) % Solution for Exercise~8. k) = P (X = i.8507 0. 1) = P (2. 1260 216 6].
Y. disp('Answers are in X. % Initialize for i = 1:6 % Calculate rows of Y probabilities P0(i.0521 0.3 (p.34: npr08_03) % Call for solution mfile Answers are in X.0026 disp(PY) 0.0104 0.4 (p.0417 0. Y. PX = (1/36)*[1 2 3 4 5 6 5 4 3 2 1]. PY disp(P) 0 0 0 0 0 0.1:i+2) = PX(i)*ibinom(i+1. 211) % file npr08_04.0833 0.35: npr08_04) Answers are in X. Y = k) = P (X = i) P (Y = kX = i) = (1/6) P (Y = kX = i).0833 0.34: npr08_03) % Solution for Exercise~8. PY = fliplr(sum(P')).0833 0.0755 0. Y.8. PY') npr08_03 (Section~17. end P = rot90(P0).0417 0.m (Section~17.1/2. for i = 1:11 P0(i.8. % file npr08_03.0026 0 0 0 0 0.2578 0.0357 0. % Rotate to orient as on the plane PY = fliplr(sum(P')).13).0391 0.0104 0. end P = rot90(P0). P0 = zeros(6.0625 0.0391 0 0 0.0208 0. PY.0417 0.4286 PY = fliplr(sum(P')) PY = 0.0:i+1).1071 Solution to Exercise 8.1:i+1) = (1/6)*ibinom(i. Y. P disp(P) Columns 1 through 7 0 0 0 0 0 0 0 .0417 0. P PX = sum(P) PX = 0.0156 0 0 0 0. 211) P (X = i.0:i).1641 0.4 X = 2:12.3 X = 1:6.0208 0.1/2.m (Section~17.3571 0. P') npr08_04 (Section~17.0521 0.7). P.8.0208 0.0260 0. Y.8. RANDOM VECTORS AND JOINT DISTRIBUTIONS Data are in X.3125 0.216 CHAPTER 8.35: npr08_04) % Solution for Exercise~8.0260 0.Pn.0052 0. Y = 0:6. P0 = zeros(11.0625 0.1667 0. % Reverse to put in normal order disp('Answers are in X.0521 0 0.5357 0.0026 Solution to Exercise 8. Y = 0:12.5357 0.0156 0.0052 0. PY. P.0625 0.
0304 0.0208 0.36: npr08_05) % Data and basic calculations for Exercise~8.0069 Columns 8 through 11 0 0 0 0 0 0. PY = fliplr(sum(P')).0208 0.0005 0.0828 0.0000 0.0001 0.0182 0.0380 0.0090 0.0125 0.0001 0.0273 0.1400 0.0000 0.3660 0. end P = rot90(P0).0269 0.0003 0.2158 0.0001 disp(PY) Columns 1 through 7 0.0069 0. PY') npr08_05 (Section~17.0015 0.0171 0.0208 0.13).0054 0.0090 0.0130 0.0098 0.0347 0.5 PX = (1/36)*[1 2 3 4 5 6 5 4 3 2 1].0002 0.8. PY disp(PY) Columns 1 through 7 0.0273 0. for i = 1:11 P0(i.0015 0.0022 0.0000 0.0078 0.0045 0.0174 0. P. 211) % file npr08_05.36: npr08_05) Answers are in X.0140 0 0 0 0 0 0 0 0.0230 Columns 8 through 13 0.0048 0.0034 0.0022 0 0 0 0 0.1823 0.0139 0.0035 0.0069 0.0008 .0152 0.0304 0.0045 0.0008 0. Y.0005 0.217 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.0312 0.0098 0.0375 0.0078 0.0130 0.0020 0.0125 0.0174 0.2152 0.0326 0.0456 0.1/6.0091 0.0208 0.0015 0.0052 0 0.0013 0 0 0 0.0043 0.0034 0.0001 0.0434 0. P.1025 Columns 8 through 13 0.0000 0.0171 0.3072 0.0004 0.0054 0.0152 0.0273 0.0040 0 0 0 0 0 0 0.m (Section~17.1954 0.0013 0.0:i+1).0043 0.0456 0.0035 0.0015 0.0069 0. disp('Answers are in X.0002 0.0037 0.0008 0.0001 0. Y = 0:12.0037 0.5 (p.0020 0. X = 2:12.0008 0 0 0 0 0 0.1:i+2) = PX(i)*ibinom(i+1.0806 Solution to Exercise 8.0347 0.0273 0. Y.0003 0.0052 0.0000 0.0004 0.0205 0.0326 0.8. P0 = zeros(11.0091 0.0182 0.0063 0.
4000 0.1000 0. Y. t.0000 0.1740 0.2799 Solution to Exercise 8.2300 0. PY.7 (p.6 (p.4000 0.1160 0.218 CHAPTER 8.*P) P1 = 0.1100 . P jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.4860 0. 211) npr08_06 (Section~17.8020 1.3000 0.2000 3.2100 jddbn Enter joint probability matrix (as on the plane) P To view joint distribution function.0001 0.38: npr08_07) Data are in X.3300 2.6000 0.2300 0.1200 3. PY.PX]') 2. and P disp([X. u.5000 0.7163 P2 = total((t>=u).PY]') 1. t. u.4740 0.0667 0.2200 1.3020 4. and P disp([X.6000 0.2000 0.6361 0. Y. Y.3000 0. P jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.7000 0. PX.*P) P2 = 0. PX.PX]') 3.3000 0.1000 0.1900 5. 211) npr08_07 (Section~17.1980 disp([Y.2020 5.37: npr08_06) Data are in X.3600 0.2980 2.0000 0.0000 0.3160 0.1380 0.8. Y.0000 Solution to Exercise 8.1500 0.5000 0.2400 0.1000 0.9000 0.1000 0. RANDOM VECTORS AND JOINT DISTRIBUTIONS 0. call for FXY disp(FXY) 0.7900 0.2980 P1 = total((t+u>2).2391 0.1817 0.8.0000 0.7000 0.0000 0.1700 1.
8061 0. 212) 1.*P) P = 0..0918 F = total(((t<=10)&(u<=6)).0886 12...2719 0.1400 15.*P) P2 = 0.0915 0..1300 11.0700 disp([Y.1318 10.0000 0.4792 0.0000 0.0000 0.*P) F = 0.0800 3.8 (p. u.0000 0.9000 0.5355 0.PY]') 3.1222 9.3700 0.Use array operations on matrices X.PX]') 1.1852 M = (1<=t)&(t<=4)&(u>4).2982 P = total((t>u).1939 jddbn Enter joint probability matrix (as on the plane) P To view joint distribution function.0000 0.5920 0.0000 0. PX. PY.1364 3.0000 0.*P) P1 = 0.1432 5.0000 0.1500 0.5089 0.1929 2..1929 npr08_08 (Section~17. P1 = total(M.1768 1.3426 4.0700 disp([Y.1300 19.0000 0.0000 0.0994 0.0000 0. P jcalc .3214 0.1000 0.7000 0.0000 0.9300 0.0000 0.1000 13.4336 0.2706 7. Y.5000 0.3230 P2 = total((abs(tu)<=2).0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1300 5. and P disp([X. Y. t.0500 9.0900 7.7564 0.0510 0.0000 0.1092 3.8.1720 0.7390 .39: npr08_08) Data are in X. call for FXY disp(FXY) 0.PY]') 5.1852 0.6904 0.3357 Solution to Exercise 8.219 4.1410 0..0800 17.8000 0..8200 0.
.25).2100 5.4000 2.2).0000 0./t>=1.10 (p. 0 ≤ u ≤ 2 lies outside the triangle (8.5000 0..5570 Solution to Exercise 8. u.3210 3. u > 1/2} M 3 = the has area in the trangle (8.51) M 1 = {(t.0). which has area 1/3 (8. (0. PY.1500 1.0000 0.fx. PY.1000 disp([Y.. u > 1} P ((X.0000 0. RANDOM VECTORS AND JOINT DISTRIBUTIONS Solution to Exercise 8. Y. 212) npr08_09 (Section~17. Y. P jcalc .PY]') 1..*P) P = 0.0990 2.. FX = cumsum(PX). PX. u) : t > 1/2. u.*P) F = 0.Use array operations on matrices X. Y ) ∈ M 1) = 0 = 1/2 (8. u) : 0 ≤ t ≤ 1/2.8. (1..9 (p.53) region in the triangle under u = t. Y.0000 0.5000 0.1500 3.3130 4.PX]') 1..52) M 2 = {(t.. and P fx = PX/dx.0000 0. plot(X. t.0000 0.FX) % Figure not reproduced .2000 2.54) tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 400 Enter expression for joint density (t<=1)&(u<=2*(1t)) Use array operations on X.220 CHAPTER 8.0)..5100 P = total((u. 213) Region is triangle with vertices (0. t.X. 2(1−t) fX (t) = 0 du = 2 (1 − t) .50) fY (u) = 0 dt = 1 − u/2.0570 F = total(((t<=2)&(u<=3)).40: npr08_09) Data are in X. 0 ≤ t ≤ 1 1−u/2 (8.0000 0.0000 0. and P disp([X. PX..
and P fx = PX/dx.59) fY (u) = 0 (8.5 fY (t) by symmetry 1+t 1−t du + I(1. so so P M 1 = 1/4 P M 2 = 1/16 (8.fx. (u>=max(1t. = 1/8. t.1] (t) t + I(1.*P) PM3 = 0. u > 1} M 3 = {(t.58) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density 0.55) M 1 = {(t.2] (t) 0.12 (p. total(M1. PY.2501 % Theoretical = 1/4 M2 = (t<=1/2)&(u>1).0631 % Theoretical = 1/16 = 0. u.3t))& . 213) The region is bounded by the lines u=t−1 fX (t) = I[0.60) . PM3 = total(M3.*P) PM2 = 0. u) : t > 1. plot(X. 0 ≤ u ≤ 1 (8.*P) PM1 = 0. 3−t t−1 and Solution to Exercise 8. PM1 = total(M1. PX. Y. u = 1 − t.5000 total((u<=t). 1 fX (t) = 0 1 4t (1 − u) du = 2t.221 M1 P1 P1 M2 P2 P2 P3 P3 = = = = = = = = (t>0. u) : t ≤ 1/2.5 du = I[0.5).0625 M3 = u<=t.57) has area in the trangle = 1.5023 % Theoretical = 1/2 Solution to Exercise 8.1] (t) 0. total(M2.*P) 0. 213) Region is the unit square.56) has area in the trangle so (8. PM2 = total(M2. u > 1} M 2 = {(t.t1)) Use array operations on X.*P) 0 (t<=0. P M 3 = 1/2 (8. FX = cumsum(PX)..2] (t) (2 − t) = (8. u = 3 − t.11 (p.*P) 0.5*(u<=min(1+t.3350 % Theoretical = 0 % Theoretical = 1/2 % Theoretical = 1/3 u = 1 + t.X.5)&(u>0. 0 ≤ t ≤ 1 4t (1 − u) dt = 2 (1 − u) .5)&(u>1).FX) % Plot not shown M1 = (t>1)&(u>1).. u) : u ≤ t} has area in the trangle = 1/2.
and P fx = PX/dx. RANDOM VECTORS AND JOINT DISTRIBUTIONS 3/4 1 1/2 1 P1 = 1/2 1/2 4t (1 − u) dudt = 5/64 1 t P2 = 0 1/2 4t (1 − u) dudt = 1/16 (8.64) P3 = 0 0 (t + u) dudt = 1/2 (8.61) P3 = 0 0 4t (1 − u) dudt = 5/6 (8. PY.*P) .0781 M2 = (t<=1/2)&(u>1/2). 213) Region is the square 0 ≤ t ≤ 2.X.13 (p. PX.0781 % Theoretical = 5/64 = 0. FX = cumsum(PX). PX.fx.fx.65) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (1/8)*(t+u) Use array operations on X. 0 ≤ u ≤ 2. 0 ≤ t ≤ 2 4 1 2 (8.*(1 .*P) P3 = 0. P2 = total(M2.u) Use array operations on X.X. P3 = total(M3.62) tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density 4*t. u.0625 M3 = (u<=t). t.*P) P2 = 0. and P fx = PX/dx.222 CHAPTER 8. P1 = total(M1.8333 Solution to Exercise 8. fX (t) = 2 2 1 8 2 (t + u) = 0 1 (t + 1) = fY (t) .FX) % Plot not shown M1 = (1/2<t)&(t<3/4)&(u>1/2). plot(X. PY.8350 % Theoretical = 5/6 = 0.63) P1 = 1/2 1/2 (t + u) dudt = 45/64 2 t P2 = 0 1 (t + u) dudt = 1/4 (8. t.*P) P1 = 0. Y.FX) M1 = (t>1/2)&(u>1/2).0625 % Theoretical = 1/16 = 0. plot(X. FX = cumsum(PX). u. P1 = total(M1. Y.
14 (p.15 (p.3] (u) 3 + 2u + 8u2 − 3u3 88 88 1 1 (8.7047 % Theoretical = 0. total(M3. 0 ≤ u ≤ 1.5 1 1 2e−2t dt 1/2 2udu = e−1 5/16 3 −2 1 e + = 0.72) P1 = 0 1 fXY (t. u = 0.1] (u) 3 88 2t + 3u2 dt + I(1. u = 1 + t 2t + 3u2 du = 3 3 (1 + t) 1 + 4t + t2 = 1 + 5t + 5t2 + t3 .*P) p3 = 0. PY.1139 % Theoretical = (5/16)exp(1) = 0. t = 2.7031 (t<=1)&(u>1). fY (u) = 2u.1] (u) I[0.66) Solution to Exercise 8.3] (u) 0 3 88 2 2t + 3u2 dt = u−1 (8. 1) = 0 1 1+t 0 fXY (t. u) dudt = 41/352 P2 = 0 1 fXY (t. 213) Region is strip bounded by fX (t) = 2e−2t . u. p2 = total(M2. 0 ≤ t ≤ 2 88 88 2 (8. and P M2 = (t > 0. u) dudt = 329/352 (8.7030 Solution to Exercise 8. u = 1 (8.223 P1 M2 P2 P2 M3 P3 P3 = = = = = = = 0.5)&(u > 0. total(M2. fXY = fX fY ∞ 3/4 P 1 = 0.68) tuappr Enter matrix [a b] of Xrange endpoints [0 3] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density 4*u. t.*P) p2 = 0.67) P3 = 4 0 t ue−2t dudt = (8. PX. 213) Region bounded by t = 0.5025 % Theoretical = 45/64 = 0.2500 u<=t.7030 2 2 (8. u = 0. P2 = 0.7031 % Theoretical = 1/4 % Theoretical = 1/2 t = 0.1150 p3 = total((t<u).73) .*P) 0.70) 3 3 6u2 + 4 + I(1.69) fX (t) = 3 88 1+t 0 fY (u) = I[0. Y.*P) 0.71) FXY (1. u) dudt = 3/44 1 1+t (8.*exp(2*t) Use array operations on X. 0 ≤ t.5)&(u<3/4).
*(u<=1+t) Use array operations on X.1] (t) 12 t t t2 udu = I[−1.*P) F = 0.1172 % Theoretical = 41/352 = 0.*t.8144 % Theoretical = 13/16 = 0.*P) p1 = 0. u. PY.*P) p2 = 0.*P) P2 = 0. Y. 0 ≤ u ≤ 1 1/2 u (8. t. u = 1.1875 P3 = total((u>=1/2). P 2 = 12 0 u−1 t2 udtdu = 3/16 (8.0681 % Theoretical = 3/44 = 0.4098 % Theoretical = 33/80 = 0. PX. Y. u. P1 = total(M1.*P) P1 = 0. t.4125 M2 = (t<1/2)&(u<=1/2). and P p1 = total((t<=1/2).FX) MF = (t<=1)&(u<=1).9297 % Theoretical = 329/352 = 0.9347 Solution to Exercise 8. plot(X. FX = cumsum(PX).77) tuappr Enter matrix [a b] of Xrange endpoints [1 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density 12*u. u = t + 1 t+1 1 fX (t) = I[−1.74) fY (u) = 12 u−1 1 1 t2 udt + 12u3 − 12u2 + 4u.1165 M2 = abs(tu)<1.75) P 1 = 1 − 12 1/2 t t2 ududt = 33/80. u = t. PX.1] (t) 6t2 1 − t2 2 (8.^2). F = total(MF.1856 % Theoretical = 3/16 = 0.*((u<=t+1)&(u>=t)) Use array operations on X. RANDOM VECTORS AND JOINT DISTRIBUTIONS tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 3] Enter number of X approximation points 200 Enter number of Y approximation points 300 Enter expression for joint density (3/88)*(2*t+3*u. 213) Region bounded by u = 0. p2 = total(M2.*P) P3 = 0.16 (p.^2.X.224 CHAPTER 8.0] (t) 12 0 t2 udu + I(0. PY.76) P 3 = 1 − P 2 = 13/16 (8.0682 M1 = (t<=1)&(u>1).0] (t) 6t2 (t + 1) + I(0.fx. P2 = total(M2.8125 . and P fx = PX/dx.
4545 P3 = total((t<u).4553 % Theoretical = 5/11 = 0.2] (t) t(2 − t) 11 11 tudt = 12 2 u(u − 2) .17 (p.2727 Solution to Exercise 8.*P) P3 = 0.2] (t) 0 3 23 t (t + 2u) du = I[0. u = 2 − t (0 ≤ t ≤ 1) . P1 = total(M1.18 (p. Y.1] (t) 3 23 (t + 2u) du+I(1.1] (u) P1 = 3 23 2 1 1 t 6 3 (2u + 1) + I(1.2] (t) 0 24 11 2−t tudu = 0 (8.80) tududt = 6/11 P 2 = 24 11 1 0 t 1 tududt = 5/11 (8. u = 0.83) fY (u) = I[0. PX.5455 P2 = total((t>1).85) (t + 2u) dudt = 13/46.*P) P1 = 0. PY.2705 % Theoretical = 3/11 = 0. u.1] (t) 0 6 6 (2 − t)+I(1.2] (u) 4 + 6u − 4u2 23 23 P2 = 3 23 2 0 0 1 (8. 0 ≤ u ≤ 1 11 24 11 2 1 0 2−t (8. t = 2.*u.79) 24 11 2−u 0 (8.1] (u) 3 23 2 (t + 2u) dt + I(1.2] (u) 0 3 23 2−u (t + 2u) dt + 0 3 23 (t + 2u) dt = u (8. u = 2 − t fX (t) = I[0. t.2] (t) t2 23 23 2 (8. 3 23 2 0 0 t (t + 2u) dudt = 12/23 (8.81) P3 = tududt = 3/11 (8.87) . u = 0. and P M1 = (t<=1)&(u<=1).1] (t) fY (u) = P1 = 24 11 1 0 0 1 12 12 2 t + I(1.225 Solution to Exercise 8. u = t (1 < t ≤ 2) 2−t fX (t) = I[0.86) P3 = (t + 2u) dudt = 16/23 (8. u = 2.82) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density (24/11)*t.84) I[0.5447 % Theoretical = 6/11 = 0.1] (t) 24 11 1 tudu + I(1.*(u<=2t) Use array operations on X. 213) Region is bounded by t = 0.78) I[0. 214) Region is bounded by t = 0.*P) P2 = 0.
5217 P3 = total((u<=t).93) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (12/179)*(3*t.6959 % Theoretical = 16/23 = 0. PX.2] (t) 9 − 6t + 19t2 − 6t3 179 179 12 179 2 (8.1] (t) 3t2 + u du + I(1.1] (t) I[0. (u<=min(2. 0 ≤ u ≤ 2 12 179 2 (2) 1 < t ≤ 2.* . PX.2] (u) 0 12 179 3−u 3t2 + u dt = 0 (8.*(u<=max(2t.89) fY (u) = I[0.. PY.2] (t) 0 3t2 + u du = 0 (8.92) P3 = 3t2 + u dudt + 12 179 (8.90) 24 12 (4 + u) + I(1. and P fx = PX/dx. and P M1 = (t>=1)&(u>=1). PY. u.2] (u) 27 − 24u + 8u2 − u3 179 179 12 179 2 3/2 0 1 0 3−t 0 1 (8..*P) P3 = 0.3t)) Use array operations on X. RANDOM VECTORS AND JOINT DISTRIBUTIONS tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (3/23)*(t+2*u).^2+u).6957 Solution to Exercise 8.*P) P1 = 2312 % Theoretical = 41/179 = 0.2826 P2 = total((u<=1). Y. u.*P) P1 = 0.91) 3t2 + u dudt = 41/179P 2 = 12 179 3/2 0 0 t 3t2 + u dudt = 18/179 3t2 + u dudt = 1001/1432 (8. P1 = total(M1. t. P1 = total(M1.2291 M2 = (t<=1)&(u<=1). FX = cumsum(PX). 0 ≤ u ≤ 3 − t 12 179 3−t fX (t) = I[0.FX) M1 = (t>=1)&(u>=1). Y.226 CHAPTER 8. 214) Region has two parts: (1) 0 ≤ t ≤ 1.t)) Use array operations on X.1] (u) I[0.*P) P2 = 0. t.fx.5190 % Theoretical = 12/23 = 0.88) 6 24 3t2 + 1 + I(1.X. plot(X.1] (u) P1 = 12 179 2 1 1 3−t 3t2 + u dt + I(1.19 (p.2841 13/46 % Theoretical = 13/46 = 0. .
0383 M2 = (t<=3/2)&(u>1).2996 M3 = u<t.1003 = u<=min(t.. P2 = total(M2.1] (u) 6 24 (2u + 3) + I(1. PX. = total(M3.2] (u) 0 u−1 fXY (t.20 (p.1006 % Theoretical = 1001/1432 = 0. 0 ≤ u ≤ 2 1+t 2 fX (t) = I[0. u) du + I(1.1] (u) 12 3 120 t + 5t2 + 4t + I(1.. t.3t).99) P2 = (3t + 2tu) dudt + 12 227 2 0 1 t (3t + 2tu) dudt = 68/227 (8. (u<=min(1+t.97) = I[0. 2. u) du = (8. 0 ≤ u ≤ 1 + t 1 < t ≤ 2.101) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (12/227)*(3*t+2*t.94) I[0.* .1] (t) fY (u) = I[0.*P) P2 = 0. 214) Region is in two parts: 1.100) P3 = (3t + 2tu) dudt = 144/227 (8. (2) 0 ≤ t ≤ 1.*P) P1 = 0. P3 = total(M3.96) I[0. u.227 P2 P2 M3 P3 P3 = total(M2.1] (u) P1 = 12 227 1 0 1 1+t (8.2] (u) (2u + 3) 3 + 2u − u2 227 227 24 6 (2u + 3) + I(1.0384 % Theoretical = 139/3632 = 0.2)) Use array operations on X.6344 . Y. and P M1 = (t<=1/2)&(u<=3/2).7003 % Theoretical = 18/179 = 0. u) dt = (8. P1 = total(M1.*P) P3 = 0.6308 % Theoretical = 144/227 = 0.*P) = 0.*u).2] (u) 9 + 12u + u2 − 2u3 227 227 1/2 0 0 1+t (8.2] (t) 0 fXY (t.2] (t) t 227 227 2 2 (8.95) fXY (t.*P) = 0.1] (t) 0 fXY (t.3001 % Theoretical = 68/227 = 0.6990 Solution to Exercise 8. u) dt + I(1.98) 12 227 (3t + 2tu) dudt = 139/3632 12 227 3/2 1 1 2 (8. PY.
u = 2t (0 ≤ t ≤ 1) . u) = I[0.*P) P3 = 0. and P P1 = total((t<1).2] (t) 0 t2 u2 du = I[0.1] (u) I[0.2] (t) 0 2 13 3−t (t + 2u) du = I[0. 0 ≤ u ≤ 1 8 14 9 14 9 14 1 (8.3076 % Theoretical = 4/13 = 0. 214) Region is rectangle bounded by t = 0. PX.3077 Solution to Exercise 8.102) fY (u) = I[0. t. t = 2.1] (t) 3 8 1 t2 + 2u du + I(1.2] (u) 2 9 6 21 + u − u2 13 13 52 1 (8.1] (t) 0 2 3 2 3 t + 1 + I(1. u = 1 3 2 9 t + 2u + I(1. P2 = total(M2.1] (t) fX (t) = I[0.104) P1 = 0 0 (t + 2u) dudt = 4/13 P 2 = 1 2 t/2 0 (t + 2u) dudt = 5/13 (8.107) fXY (t. 3 − t (1 ≤ t ≤ 2) 2t Solution to Exercise 8.22 (p.1] (t) 0 12 2 6 t + I(1.108) fY (u) = 3 8 3 8 1 t2 + 2u dt + 0 1 1/2 0 1/2 t2 u2 dt = 1 1 3 3 + u + u2 0 ≤ u ≤ 1 8 4 2 1/2 (8.105) P3 = 0 0 (t + 2u) dudt = 4/13 (8.3077 M2 = (t>=1)&(u<=1).1] (u) 1 2t 2 13 2 (t + 2u) dt + I(1.3844 % Theoretical = 5/13 = 0.110) .228 CHAPTER 8.2] (t) t2 8 14 (8.3t)) Use array operations on X.3846 P3 = total((u<=t/2).109) P1 = t2 + 2u dudt + 9 14 3/2 1 0 t2 u2 dudt = 55/448 (8.*P) P2 = 0. 214) Region bounded by fX (t) = I[0.103) 4 8 9 + u − u2 13 13 52 + I(1. u.2] (u) u/2 2 13 3−u (t + 2u) dt = u/2 (8.1] (t) 2 13 (t + 2u) du + I(1.2] (t) t2 u2 .21 (p. u = 0.2] (t) (3 − t) 13 13 (8. Y.*P) P1 = 0. PY.106) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 400 Enter number of Y approximation points 400 Enter expression for joint density (2/13)*(t+2*u).3076 % Theoretical = 4/13 = 0. RANDOM VECTORS AND JOINT DISTRIBUTIONS t = 2.*(u<=min(2*t.
229 tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density (3/8)*(t.^2). P = total(M. + (9/14)*(t..*P) P = 0.*(t > 1) Use array operations on X.*u. PY.1228 % Theoretical = 55/448 = 0. u. PX. Y. t. and P M = (1/2<=t)&(t<=3/2)&(u<=1/2).^2+2*u)..^2.*(t<=1) .1228 .
230 CHAPTER 8. RANDOM VECTORS AND JOINT DISTRIBUTIONS .
Y ∈ N ) = P (X ∈ M ) P (Y ∈ N ) Independence implies M. the inverse image Y −1 (N ) is an event determined by random variable Y for each reasonable set N.e. An important theorem from measure theory ensures that if the product rule holds for this special class it holds for the general class of The pair {M.1 Introduction The concept of independence for classes of events is developed in terms of a product rule. the set of all outcomes ω ∈ Ω which are mapped into M by X ) is an event for each reasonable subset M on the real line.3) Note that the product rule on the distribution function is equivalent to the condition the product rule holds {M. u]) = FX (t) FY (u) ∀ t. In this unit. u]) = P (X ∈ (−∞.1. t]) P (Y ∈ (−∞. u for the inverse images of a special class of sets (9. 9.1 Independent Classes of Random Variables1 9. the inverse image X −1 (M ) (i. N (9. t] . Y ∈ (−∞. u) = FX (t) FY (u) ∀ t.5/>.Chapter 9 Independent Classes of Random Variables 9. N }.4) 231 . Y −1 (N )} A of random variables is (stochastically) is independent. Y } {X −1 (M ) . u]..org/content/m23321/1. N } of the form M = (−∞. independent i each pair of events This condition may be stated in terms of the product rule for all (Borel) sets P (X ∈ M.1) FXY (t. Y } is independent i the following product rule holds FXY (t. u 1 This content is available online at <http://cnx. More Denition pair {X.2) (9.2 Independent pairs precisely. X. t] and N = (−∞. (9. We extend the notion of Recall that for a random variable independence to a pair of random variables by requiring independence of the events they determine. Thus we may assert {X. Similarly.1. u) = P (X ∈ (−∞. we extend the concept to classes of random variables.
b]. u) = fX (t) fY (u) ∀ t. Y } is independent. Y is uniform on [c. In order to describe this more succinctly. b] and Y is uniform on [c. u Example 9.6) fXY is I1 = [a. u) = FX (t) FY (u) {X.3 The joint mass distribution It should be apparent that the independence condition puts restrictions on the character of the joint mass distribution on the plane. u) = fX (t) fY (u) for t. 0 ≤ u.3: Rectangle with interval sides The rectangle in Example 9. u) = 1 − e−βu t→∞ holds. the the pair has uniform joint distribution on I1 × I2 .1. Since 1/ (b − a) (d − c). Simple integration gives fX (t) = 1 (b − a) (d − c) d the area is {X. u) = 1 − e−αt so that the product rule dent. and fXY (t. then the cartesian product t ∈ M and M ×N u ∈ N. Y } is uniform on a rect(b − a) (d − c). consisting of all those points (t. d]. u) = (1 − e−αt ) 1 − e−βu u→∞ 0 ≤ t. Y } is therefore indepen If there is a joint density function. INDEPENDENT CLASSES OF RANDOM VARIABLES Example 9. The converse is also true: if the pair is independent with X uniform on [a. b] and I2 = [c. u) such that a≤t≤b and c≤u≤d (i.232 CHAPTER 9.8) X is uniform on [a. d].e. d]. then the relationship to the joint distribution function makes it clear that the pair is independent i the product rule holds for the density. Denition If M is a subset of the horizontal axis and N is a subset of the vertical axis. The pair (9. the constant value of du = c b 1 a≤t≤b b−a and (9. the pair is independent i fXY (t..7) fY (u) = Thus it follows that 1 (b − a) (d − c) dt = a 1 c≤u≤d d−c (9. and Taking limits shows FX (t) = lim FXY (t. FY (u) = lim FXY (t. That is.1: An independent pair Suppose FXY (t. u. u ∈ I2 ). we employ the following terminology.2: Joint uniform distribution on a rectangle Suppose the joint probability mass distributions induced by the pair angle with sides (9.5) FXY (t. so that the pair {X. t ∈ I1 and . is the (generalized) rectangle consisting of those points (t.2 (Joint uniform distribution on a rectangle) is the Cartesian product I1 × I2 . u) on the plane such that Example 9. all 9.
1 illustrates the basic pattern. This suggests a useful test for nonindependence which we call the simple example. We illustrate with a . The probability X ∈ M rectangle test. Y ∈ N ) = P ((X. We restate the product rule for independence in terms of cartesian product sets. Y ) ∈ M × N ) = P (X ∈ M ) P (Y ∈ N ) Reference to Figure 9.233 Figure 9.1: Joint distribution for an independent pair of random variables. P (X ∈ M. axes. The probability in the rectangle is the product of these marginal probabilities. the probability Y ∈N is the part of the joint probability in the horizontal strip. is the portion of the joint probability mass in the vertical strip. then the rectangle axis in If (9.9) M. N are intervals on the horizontal and vertical M M ×N is the intersection of the vertical strip meeting the horizontal with the horizontal strip meeting the vertical axis in N. respectively.
The following is a simple example.3. Then P (Y = u3 ) .10) P (X = t2 . (9.08 implies P (Y = u2 ) = 0. Y = u2 ) = P (X = t1 ) P (Y = u2 ) = 0.2 that a value of X determines the possible values of Y and vice versa. In spite of these limitations.1). Yet P (X ∈ M ) > 0 and P (Y ∈ N ) > 0. it is frequently useful. (1.4: The rectangle test for nonindependence Supose probability mass is uniformly distributed over the square with vertices at (1. (0.3. The following four items of information are available.2. There are nonindependent cases for which this test does not work.1). but P ((X. There is no probability mass in the region.08 (9. Y = u1 ) = 0. P (X = t1 . Y ) ∈ M × N ) = 0. in many cases the complete joint and marginal distributions may be obtained with appropriate partial information. And it does not provide a test for independence. so that we would not expect independence of the pair. The product rule fails.4. Y = u2 ) = 0. so that P (X ∈ M ) P (Y ∈ N ) > 0.11) A combination of the product rule and the fact that the total probability mass is one are used to calculate each of the marginal and joint probabilities. Y } is independent and each has three possible values.2).234 CHAPTER 9. Because of the information contained in the independence condition.5: Joint and marginal probabilities from partial information Suppose the pair {X. INDEPENDENT CLASSES OF RANDOM VARIABLES Figure 9. Example 9. consider the M × N shown on the gure.2: Rectangle test for nonindependence of a pair of random variables. Remark.2 and P (X = t1 . (2. It is evident from Figure 9. small rectangle To establish this. P (X = t1 ) = 0. Example 9.15 These values are shown in bold type on Figure 9. For example P (X = t1 ) = 0. P (Y = u1 ) = 0. hence the pair cannot be stochastically independent.0).
235 = 1 − P (Y = u1 ) − P (Y = u2 ) = 0.u)/2 (9. + u − µY σY 2 (9.12) Q (t. While it is true that every independent pair of normally distributed random variables is joint normal. this. not every pair of normally distributed random variables has the joint normal distribution. The result is that 2 X ∼ N µX . (9. σX and 2 Y ∼ N µY . Y } has the joint normal distribution i the joint density is fXY (t.13) The marginal densities are obtained with the aid of some algebraic tricks to integrate the joint density. u) = fX (t) fY (u) so that the pair is independent i reader.6: The joint normal distribution A pair {X. The details are left as an exercise for the interested Remark. There is no unique procedure for solution.3. the result is fXY (t. . And it has not seemed useful to develop MATLAB procedures to accomplish Figure 9.3: Joint and marginal probabilities from partial information. u) = where 1 2πσX σY (1 − ρ2 ) 1/2 e−Q(t. Example 9.14) ρ = 0. u) = 1 1 − ρ2 t − µX σX 2 − 2ρ t − µX σX u − µY σY . σY If the parameter ρ is set to zero. Others are calculated similarly.
Examples are simple random samples from a given population.17) Since we may obtain the joint distribution function for any nite subclass by letting the arguments for the others be ∞ (i. u) = t2 u2 1 exp − − 2π 2 2 by (9.e. Denition A class {Xi : i ∈ J} of random variables is (stochastically) independent i the product rule holds for every nite subclass of two or more. then any nite subclass is jointly absolutely continuous and the product rule holds for the densities of such subclasses m fXi1 Xi2 ···Xim (ti1 .19) . Y } fXY (t. tim ) = k=1 fXik (tik ) for all (t1 . u). A Bernoulli sequence is a simple example. Y } of simple random variables in canonical form n m X= i=1 ti IAi Y = j=1 uj IBj (9. then each individual variable is absolutely continuous and the product rule holds for the densities. 1) and Y ∼ N (0. ti2 . by taking the limits as the appropriate ti increase without bound). 1). and zero elsewhere (9. Absolutely continuous random variables If a class {Xi : i ∈ J} is independent and the individual variables are absolutely continuous (i.16) X ∼ N (0. tn ) (9. · · · . they cannot be joint normal. if each nite subclass is jointly absolutely continuous.1. tn ) (9.4 Independent classes Since independence of random variables is independence of the events determined by the random variables. (t. u) Both distribution is positive for all in the rst and third quadrants. (an acronym for iid classes independent. or the results of repetitive trials with the same distribution on the outcome of each component trial. · · · .15) is the joint normal density for an independent pair ( Now dene the joint density for a pair ρ = 0) of standardized normal random variables. tn ) = i=1 FXi (ti ) for all (t1 . · · · . t2 . INDEPENDENT CLASSES OF RANDOM VARIABLES Example 9. extension to general classes is simple and immediate.identically distributed). independence is equivalent to the product rule For a nite class {Xi : 1 ≤ i ≤ n}.e. · · · . the single product rule suces to account for all nite subclasses. t2 .7: A normal pair not joint normally distributed We start with the distribution for a joint normal pair and derive a joint distribution for a normal pair which is not joint normal.18) Similarly. have densities). Remark. since the joint normal 9. t2 . n FX1 X2 ···Xn (t1 . However. The function φ (t.236 CHAPTER 9. {X.. Frequently we deal with independent classes in Such classes are referred to as which each random variable has the same marginal distribution. u) = 2φ (t.5 Simple random variables Consider a pair {X..1. The index set J in the denition may be nite or innite. 9.
~t. Y = uj ) = P (X = ti ) P (Y = uj ) According to the rectangle test.237 Since Ai = {X = ti } and Bj = {Y = uj } the pair {X.23) t and u is achieved exactly as in jcalc. Nb } generated by the two classes. is independent.01*[15~43~31~11].0264 ~~~~0. PY~=~0. PX~=~0. The marginal distributions determine the joint distributions.0209~~~~0. (ti . icalc Enter~row~matrix~of~Xvalues~~X Enter~row~matrix~of~Yvalues~~Y Enter~X~probabilities~~PX Enter~Y~probabilities~~PY ~Use~array~operations~on~matrices~X. no gridpoint having one of the (9. Independence of the minterm pairs is implied by independence of the combined class {Ei .01*[12~18~27~19~24]. Formation of the joint p (i. P (X = ti . n m X = a0 + i=1 Since ai IEi Y = b0 + j=1 bj IFj (9. Y = uj ) = P (X = ti ) P (Y = uj ) Once these are calculated.0837~~~~0.~PY.0558~~~~0. the pair {X.8: Use of icalc to set up for joint calculations X~=~[4~2~0~1~3].~PX. we use an mprocedure we call probability matrix is simply a matter of determining all the joint probabilities icalc.0372~~~~0. and to determine the joint probability matrix (hence the joint distribution) and the calculation matrices u. we may use canonic (or the function form canonicf ) to obtain the marginal distributions.0297~~~~0.0589~~~~0. Fj : 1 ≤ i ≤ n. In the general case of pairs of joint simple random variables we have the mprocedure jcalc. Example 9. If they are in ane form. formation of the calculation matrices (9. Y.~u. Y~=~[0~1~2~4]. Suppose m · n joint probabilities. we need only the marginal distributions in matrices X. Once we have both marginal distributions. That is.0744 . Y. j) = P (X = ti . P X .22) Calculations in the joint simple case are readily handled by appropriate mfunctions and mprocedures. The joint distribution has probability mass at each point Thus at every point on the grid. Y ). Y } is independent i each of the pairs independent.0132~~~~0. Bj } is W = (X. which uses t information in matrices and u. X. then the uj as a coordinate has zero probability X has n distinct values and Y has m n + m marginal probabilities suce to determine the the marginal probabilities for each variable must add to one. 1 ≤ j ≤ m} MATLAB and independent simple random variables (9. we have the marginal distributions. If distinct values.0198~~~~0.20) ti or mass .~and~P disp(P)~~~~~~~~~~~~~~~~~~~~~~~~%~Optional~display~of~the~joint~matrix ~~~~0.~Y.21) Ar = {X = tr } is the union of minterms generated by the minterms generated by the Ei and Bj = {Y = us } is the union of Fj . Y } is independent i each pair of minterms {Ma . and P to determine the marginal probabilities and the calculation matrices In the independent case. Since (n − 1) + (m − 1) = m + n − 2 values are X and Y are in ane form. uj ) in the range of {Ai . respectivly. t PY and If the random variables are given in canonical form. only needed.
0817~~~~0.0270~~~~0.*P)~~~~~~~~~~~~%~of~Example 9 from "Composite Trials". PX~=~ibinom(10.8) and Y ∼ binomial (10.0516~~~~0.238 CHAPTER 9.X) Enter~Y~probabilities~~PY~~~~~~~~%~Could~enter~ibinom(10.~~~~~~~~~~%~N~=~{u:~u~>~0.X).~PX. INDEPENDENT CLASSES OF RANDOM VARIABLES ~~~~0.4736~~~~~~~~~~~~~~~~~~%~P((X. Y~=~0:10.1161~~~~0.85.*P)~~~~~~~~~~~~~~~%~P(Y~in~N) PN~=~~~0.~t.Y).8. What is the probability Mary makes more free throws than Bill? SOLUTION Let X be the number of goals that Mary makes and Y be the number that Bill makes.*P)~~~~~~~~~~~~~~~%~P((X.85 of success on each trial.0360 disp(t)~~~~~~~~~~~~~~~~~~~~~~~~%~Calculation~matrix~t ~~~~4~~~~2~~~~~0~~~~~1~~~~~3 ~~~~4~~~~2~~~~~0~~~~~1~~~~~3 ~~~~4~~~~2~~~~~0~~~~~1~~~~~3 ~~~~4~~~~2~~~~~0~~~~~1~~~~~3 disp(u)~~~~~~~~~~~~~~~~~~~~~~~~%~Calculation~matrix~u ~~~~~4~~~~~4~~~~~4~~~~~4~~~~~4 ~~~~~2~~~~~2~~~~~2~~~~~2~~~~~2 ~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 M~=~(t>=3)&(t<=2).~2] PM~=~total(M. icalc Enter~row~matrix~of~Xvalues~~X~~%~Could~enter~0:10 Enter~row~matrix~of~Yvalues~~Y~~%~Could~enter~0:10 Enter~X~probabilities~~PX~~~~~~~~%~Could~enter~ibinom(10.2276~~~~~~~~~~~~~~~~~~~~~%~obtained~than~in~the~event~formulation Pm~=~total((t>=u).~~~~~~~~~~~~~~~~~~~~~~~%~Rectangle~MxN PQ~=~total(Q.0.~and~P PM~=~total((t>u). Bill: Has probability 0. Mary: Has probability 0. 0.3). We assume the two seqences of trials are independent of each other.~PY. Then X∼ binomial (10.9: The joint Bernoulli trial of Example 4.0285~~~~0.Y)~in~MxN)~=~P(X~in~M)P(Y~in~N) As an example. PY~=~ibinom(10.0405~~~~0.0. 1 Bill and Mary take ten basketball free throws each.9.85.7400 Q~=~M&N.*P)~~~~~~~~~~~~%~Additional~information~is~more~easily Pe~=~~0.~~~~~~~~~~~~%~M~=~[3.~u.85). Pe~=~total((u==t).8.0.6400 N~=~(u>0)&(u.1032 ~~~~0. 0. and each is a Bernoulli sequence.*P)~~~~~~~~~~~~~~~%~P(X~in~M) PM~=~~~0.Y)~in~MxN) PQ~=~~~0.~Y.Y) ~Use~array~operations~on~matrices~X.2738~~~~~~~~~~~~~~~~~~~~~%~Agrees~with~solution~in~Example 9 from "Composite Trials". Example 9.0774~~~~0.4736 p~=~PM*PN p~~=~~~0.0. X~=~0:10.*P) PM~=~~0.^2<=15).0180~~~~0. consider again the problem of joint Bernoulli trials described in the treatment of Composite trials (Section 4.80 of success on each trial.5014 .~u^2~<=~15} PN~=~total(N. Pm~=~~0.
P~~=~idbn(PX. t.0750~~~~0.0400 . then icalc to get the joint distribution.01*[20~15~40~25].0800 ~~~~0.8525 As in the case of jcalc.~t.66~0. is independent. We suppose results for individuals are independent.0750~~~~0. Then the pair the distributions for X {X. We use the mfunction canonicf to determine c1~=~[ones(1.62 0. Y } and for Y.PY]~=~canonicf(c2.8708 Py3~=~(Y>=3)*PY'~~~~~~~~%~Prob~second~has~3~or~more Py3~=~~0.11: A numerical example PX~=~0.2408 Px3~=~(X>=3)*PX'~~~~~~~~%~Prob~first~has~3~or~more Px3~=~~0.73~0. [X.43 Second heat probabilities: 0.61~0.51].2000~~~~0.0450~~~~0. PY~=~0. we have an mfunction version icalcf (9.75 0.239 Example 9.PY) P~= ~~~~0. Y. c2~=~[ones(1.*P)~~%~Prob~both~have~the~same Peq~=~~0.minprob(P1)).61 0.48 0.~u.6)~0].24) [x. px. p] = icalcf (X.73 0.81 0.51 Compare the two heats for numbers who break the track record. Example 9.0500 ~~~~0.3986 Pm2~=~total((u>t).minprob(P2)).58 0.1*[3~5~2]. P2~=~[0. PX.77 0.~PX.*P)~~~%~Prob~first~heat~has~most Pm1~=~~0. y.58~0.43].~Y. PY) We have a related mfunction Its formation of the joint matrix utilizes the same operations as icalc. u.75~0.PX]~=~canonicf(c1.*P)~~~%~Prob~second~heat~has~most Pm2~=~~0. SOLUTION Let X be the number of successes in the rst heat and Y be the number who are successful in the second heat. icalc Enter~row~matrix~of~Xvalues~~X Enter~row~matrix~of~Yvalues~~Y Enter~X~probabilities~~PX Enter~Y~probabilities~~PY ~Use~array~operations~on~matrices~X.1000~~~~0.55~0.62~0.1250~~~~0.~PY.0300 ~~~~0.55 0.3606 Peq~=~total((t==u). Each runner has a reasonable chance of breaking the track record.~and~P Pm1~=~total((t>u).10: Sprinters time trials Twelve world class sprinters in a meet are running in two heats of six persons each. First heat probabilities: 0. idbn for obtaining the joint probability matrix from the marginal probabilities.66 0.48~0.0600~~~~0.77~0.81~0. P1~=~[0. py.1200~~~~0. [Y.6)~0].
0055~~0. However.0105~~0.0112 ~~~~~0.0169~~0.0169~~0.0176 itest Enter~matrix~of~joint~probabilities~~P The~pair~{X.1161~~~~0. and it would be very dicult to pick out those for which it fails.0035~~0.0056~~0.0817~~~~0. Example 9.0297~~~~0.0060~~0.0253~~0.0025~~0.13 jdemo3~~~~~~%~call~for~data~in~mfile disp(P)~~~~~%~call~to~display~P ~~~~~0.0035~~0.0189~~0.0138~~0.0144 ~~~~~0.0105~~0.0558~~~~0.0091~~0.0168~~0.0096 ~~~~~0.0104~~0.0120~~0.240 CHAPTER 9. we consider an example in which the pair is known to be independent.0589~~~~0.0132~~~~0.0049~~0.0516~~~~0.0208 ~~~~~0.0208 ~~~~~0.0285~~~~0.0147~~0.0774~~~~0.0161~~0.0273~~0.0144 ~~~~~0.0165~~0.0052~~0.0143~~0.0231~~0.0091~~0.0049~~0.0045~~0.0065~~0.0115~~0.0207~~0.0128 ~~~~~0.0030~~0.0126~~0. then forming an independent joint test matrix. The mprocedure simply checks all of them.0040~~0.0299~~0.0065~~0.0147~~0.procedure itest checks a joint distribution for independence.0168~~0.0161~~0.0135~~0.0078~~0.0270~~~~0.0028~~0.Y}~is~NOT~independent~~~%~Result~of~test To~see~where~the~product~rule~fails.0065~~0. We do not ordinarily exhibit the matrix P to be tested.0064 ~~~~~0.0092~~0.0080 ~~~~~0.0045~~0.0184~~0. It does this by calculating the marginals.0128 ~~~~~0.0120~~0.0075~~0.0095~~0.0372~~~~0.0180~~~~0. which is compared with the original.0190~~0. Example 9.0091~~0.0195~~0.0207~~0.0405~~~~0.0077~~0. this is a case in which the product rule holds for most of the minterms.0056~~0.0184~~0.1032 ~~~~~0.0117~~0.0112 ~~~~~0.0264 ~~~~~0.0091~~0.0105~~0.0063~~0.0084~~0. INDEPENDENT CLASSES OF RANDOM VARIABLES An m.0063~~0.0042~~0.0189~~0.0744 ~~~~~0.0198~~~~0.0117~~0.0837~~~~0.0020~~0.0040~~0.0273~~0.0209~~~~0.0104~~0.0360 ~ itest .0035~~0.12 idemo1~~~~~~~~~~~~~~~~~~~~~~~~~~~%~Joint~matrix~in~datafile~idemo1 P~=~~0.~call~for~D disp(D)~~~~~~~~~~~~~~~~~~~~~~~~~~%~Optional~call~for~D ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1~~~~~1 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 ~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0~~~~~0 Next.0299~~0.0135~~0.
u ≥ 0 Since (9.Z)~>~0) PG~=~~0. variations of the mfunction mgsum and the mfunction diidsum Also several are used for obtaining distributions for sums of independent random variables.3370 Pg~=~(W>0)*PW'~~~~~~~~~~~~%~P(Z~>~0) Pg~=~~0.~and~P G~=~3*t~+~2*u~~4*v. Thus if {X.3370 An mprocedure icalc4 to handle an independent class of four variables is also available.P). Y~=~1:2:7.1*[2~2~3~3]. Example 9.14: Calculations for three independent random variables X~=~0:4. we may consider {X. When in doubt. so is any pair of approximating simple functions theoretically possible for the approximating pair {Xs .25) e −12 ≈ 6 × 10 −6 .6 Approximation for the absolutely continuous case In the study of functions of random variables. For all practical purposes. h (Y )} for any reasonable functions g and h.~PY. Z~=~0:3:12. 9.~u. Also.1*[1~3~2~3~1]. PZ~=~0. u) = 6e−3t e−2u = 6e−(3t+2u) t ≥ 0. PY~=~0. Y } is an Now it is independent pair.241 Enter~matrix~of~joint~probabilities~~P The~pair~{X. We consider them in various contexts in other units. PX.~~~~~~~~%~W~=~3X~+~2Y~4Z [W.1*[2~2~1~3~2].~Z. Ys } of the type considered.~~~~~~~~%~Distribution~for~W PG~=~total((G>0). yet have the approximated pair this is highly unlikely.Y}~is~independent~~~~~~~%~Result~of~test The procedure icalc can be extended to deal with an independent class of three random variables. But {Xs . {X. The following is a simple example of its use. Ys } is independent.~t.~PZ.PW]~=~csort(G. We call the mprocedure icalc3. Y } {X. Example 9. Ys } to be independent.15: An independent pair Suppose X∼ exponential (3) and Y ∼ exponential (2) with fXY (t.1. . so is X Xs of which is approximated.Y. Y } to This decreases even further the likelihood of a false indication of be independent i {Xs . icalc3 Enter~row~matrix~of~Xvalues~~X Enter~row~matrix~of~Yvalues~~Y Enter~row~matrix~of~Zvalues~~Z Enter~X~probabilities~~PX Enter~Y~probabilities~~PY Enter~Z~probabilities~~PZ Use~array~operations~on~matrices~X. consider a second pair of approximating simple functions with more subdivision points. independence by the approximating random variables. we show that if {g (X) .~Y. we show that an approximating simple random variable the type we use is a function of the random variable is an independent pair. PX~=~0.~v. we approximate X for values up to 4 and Y for values up to 6.*P)~~~~~~~~%~P(g(X. Y } not independent.
1 3.0551 0.1 The pair (Solution on p.0589 (9. .2 Problems on Independent Classes of Random Variables2 Exercise 9.0399 0.0483 0.37: npr08_06)): {X.4/>.8.0527 0.3 − 0.Y}~is~independent 9.~PY.0609 0. 0 ≤ u ≤ 1.1 5.0493 0. u) = 4tu0 ≤ t ≤ 1. Y } X = [−2. 1 It is easy enough to determine the marginals in this case.0399 0.0380 0. By symmetry.26) fXY = fX fY which ensures the pair is independent. they are the same.*u Use~array~operations~on~X.~Y.0323 0.9 5.0420 0.0441 (9.0437 P = 0.Y}~is~independent Example 9.3] 0.3 2.~PY.28) {X. tuappr Enter~matrix~[a~b]~of~Xrange~endpoints~~[0~1] Enter~matrix~[c~d]~of~Yrange~endpoints~~[0~1] Enter~number~of~X~approximation~points~~100 Enter~number~of~Y~approximation~points~~100 Enter~expression~for~joint~density~~4*t. 0 ≤ t ≤ 1 (9.0580 Y = [1.0713 0.1] 0.7 1. Y } is independent.5 4.0357 0.~PX.242 CHAPTER 9. fX (t) = 4t 0 so that u du = 2t.) has the joint distribution (in mle npr08_06.16: Test for independence The pair {X.org/content/m24298/1. INDEPENDENT CLASSES OF RANDOM VARIABLES tuappr Enter~matrix~[a~b]~of~Xrange~endpoints~~[0~4] Enter~matrix~[c~d]~of~Yrange~endpoints~~[0~6] Enter~number~of~X~approximation~points~~200 Enter~number~of~Y~approximation~points~~300 Enter~expression~for~joint~density~~6*exp((3*t~+~2*u)) Use~array~operations~on~X.~t.0620 0. Consider the solution using tuappr and itest.0667 Determine whether or not the pair 0.~u.~PX.27) 0.~and~P itest Enter~matrix~of~joint~probabilities~~P The~pair~{X.0651 0. 2 This content is available online at <http://cnx. 247.0361 0.m (Section 17.~t.~and~P itest Enter~matrix~of~joint~probabilities~~P The~pair~{X.~u. Y } has joint density fXY (t.~Y.
3 The pair (Solution on p.38: npr08_07)): {X. b. 0 ≤ u ≤ 2 (see Exercise 13 (Exercise 8.0242 0. Exercise 9. (Solution on p. 0) .7 0 ≤ t ≤ 1. center at (0.11) from "Problems on Random Vectors and Joint Distributions"). Determine whether or not the pair is independent. u) = 4ue−2t Exercise 9. 0 ≤ u ≤ 1 (see Exercise 12 (Exercise 8.1089 0.0682 0.9 0 ≤ t.) has the joint distribution (in mle npr09_02. 249. Y } is independent.0961 P = 0.6 (1.0398 0.9 0.0297 0 4.0203 0.0350 0.8. 248.0744 0.0891 0. (Solution on p.7 1.0132 3.0209 (9.0405 0.0484 1. 2) . 249.5 2.0341 0. 0) .0363 0. Exercise 9.1320 0.1] 0.5 (Solution on p.9 − 1.1] 0.0).) on the square with vertices at fXY (t.0528 0.29) 0.6 5. 1) .) on the circle with radius one. Exercise 9. 248. 247.8 4.8 3.0498 0.1 Determine whether or not the pair {X.4 0.m (Section 17.243 Exercise 9.0510 0.0324 0. (2.0556 0.) for fXY (t.14) from "Problems on Random Vectors and Joint Distributions").1 0. u) = 4t (1 − u) Exercise 9.0189 0.) fXY (t.0868 Determine whether or not the pair 0. 1) .12) from "Problems on Random Vectors and Joint Distributions"). 0) .0396 0 0.5 4.0 3.m (Section 17. 1) (see Exercise 11 (Exercise 8.0342 0.8 (Solution on p. 1) . Y = u) (9.) on the parallelogram with vertices fXY (t.8. For the distributions in Exercises 410 below a.0726 2.0231 0.0589 0. Use a discrete approximation and an independence test to verify results in part (a). 247. u) = 12t2 u (−1.0504 0.2 0.0304 0.) for fXY (t.0308 (9.5 0. (0.0528 0.0495 0.41: npr09_02)): {X.) has the joint distribution (in mle npr08_07. (Solution on p. (1.13) from "Problems on Random Vectors and Joint Distributions").4 (Solution on p.0440 0. Y } X = [−3. u) = 1 8 (t + u) for 0 ≤ t ≤ 2.0672 0. u) = 1/π Exercise 9. 0 ≤ u ≤ 1 (see Exercise 14 (Exercise 8.0594 0.0456 0. Y } P (X = t.0448 Y = [−2 1 2.2 The pair (Solution on p.30) {X. (0. 247. 248. (1.0077 Table 9. Y } is independent. u) = 1/2 Exercise 9.0090 0. (0.31) t = u = 7.0216 0.7 0.1 2. fXY (t.
$1000. with respective probabilities 0.22.15 (Solution on p. Determine the probability the number of lights which survive the entire period is at least 175.) for fXY (t. Payos are $570. that at least one will complete on time. unless failure occurs. X be the net prot on the rst alternative and is independent. a.13 triangular distribution on 1/2 of the point (Solution on p. respective probabilities of 0. 3].58. u) = 24 11 tu 0 ≤ t ≤ 2. except for brief times for maintenance or repair. 250. with probability of success on each call p = 0.57.52. 0. MicroWare and BusiCorp.83. 0. 2 − t} (see Exercise 17 (Exercise 8. 15. Exercise 9. 250. 15. Anne contemplates six purchases in the amounts 8. $1100.12 (Solution on p. If successive units fail independently. 21. Exercise 9.81. $525. BusiCorp has time to completion. what is P (Y > X)? (Solution on p.17 Margaret considers ve purchases in the amounts 5. 0. 0.38.41. INDEPENDENT CLASSES OF RANDOM VARIABLES (see Exercise 16 (Exercise 8. 0 ≤ u ≤ min{1. 15 dollars with respective probabilities 0. 190. c. display is on continuously for 750 hours (approximately one month). (Solution on p. Compare probabilities in the two schemes that total sales are at least $600. $900. What is the probability Anne spends at least $30 more than Margaret? . Exercise 9. What is the probability the second exceeds the rst i.16) from "Problems on Random Vectors and Joint Distributions"). exponential (1/150). exponential (1/130).) The location of ten points along a line may be considered iid random variables with symmytric [1. 250. 0. The Exercise 9. (Solution on p.) • • In the rst.17) from "Problems on Random Vectors and Joint Distributions"). Assume the {X.244 CHAPTER 9. 12 dollars.) A critical module in a network server has time to failure (in hours of machine time) exponential (1/3000). Assume that all eleven possible purchases form an independent class. If the units fail independently.41. what is the probability of no breakdown due to the module for one year? Exercise 9. Y be the net gain on the second. Y } a. 18. in days. 0.10 (Solution on p.) They work independently. What is the probability that three or more will lie within distance t = 2? (Solution on p. and $465. are preparing a new business package in time anticipated completion time. The time to failure (in hours) of each unit is exponential (1/750). 185.77.) Exercise 9. 180. she makes eight independent calls. that neither will complete on time? Exercise 9.57.) The times to failure are iid. What is the probability Anne spends at least twice as much as Margaret? b. 251. The module is replaced routinely every 30 days (720 hours). 250. 0. The machine operates continuously. and 0. 12. 250.) Eight similar units are put into operation at a given time.37. Which alternative oers the maximum possible gain? b. MicroWare has Two software companies.e.63. days..16 Joan is trying to decide which of two sales opportunities to take. exponential (1/10000). 249.14 A Christmas display has 200 lights.23. in What is the probability both will complete on time. with In the second. 0.11 for a computer trade show 180 days in the future. she makes three independent calls. 250. 0.35. Let pair She realizes $150 prot on each successful sale. what is the probability that ve or more units will be operating at the end of 500 hours? Exercise 9. 8. 17. 0.
22 throws more sevens than does Ronald? Two players. Determine Y = −2ID + 6IE + 2IF − 3. 0.) Exercise 9. win $30 with probability p1 = 0. Do this with icalc. 254.2).47.05 p2 = 0.23 (Solution on p. What is the probability the College will lose money? What is the probability the prot will be $400 or more. then X and Y are the prots X = 30 · 5 − 20N1 where binomial and Y = 50 · 10 − 30N2 (9. It is reasonable to suppose the pair {N1 . less than $200. B. Consider the simple random variables X = 3IA − 9IB + 4IC . and $350. so that {X. D.) • • In the rst.32) N1 . Y } is independent. 0.2 (one in twenty) (one in ve) Thirty chances are sold on Game 1 and fty chances are sold on Game 2. and Z = 2X − 3Y (9. Y.05) and N2 ∼ binomial (50. Exercise 9. $380.41.33) W = 3X − 4Y + 2Z .) A class has fteen boys and fteen girls. $750. Y } • • • Which alternative oers the maximum possible gain? What is the probability the second exceeds the rst i. throw a pair of dice 30 times each. Let pair He realizes $100 prot on each successful sale. N2 } is independent.21 The class (Solution on p. and 0. If on the respective games.37. The total prot for the College is Z = X +Y .) {X. (Solution on p. 0. Determine the marginal distributions for X and Y then use icalc to obtain the joint distribution and the calculating matrices. 0.e. What is the probability Mike Exercise 9. It is reasonable to suppose N1 ∼ (30. with probability of success on each call p = 0. 254. 252. the respective probabilites {0. 254.20 The class distribution (Solution on p. They pair up and each tosses a coin 20 times. between $200 and $450? Exercise 9. Ronald and Mike. N2 are the numbers of winners on the respective games. 252. Z} of random variables is iid (independent. 0.01 ∗ [15 20 30 25 10] this determine compare results. Assume the {X. There are two games: Game 1: Game 2: Pay $5 to play. win $20 with probability Pay $10 to play. 253. Determine the distribution for W and from P (−20 ≤ W ≤ 10). P (5 ≤ Z ≤ 25).46.57.18 James is trying to decide which of two sales opportunities to take.19 (Solution on p. identically distributed) with common X = [−5 − 1 3 4 7] Let and P X = 0. What is the probability that at least eight girls throw more heads than their partners? . (9. Y be the net gain on the second.27..245 Exercise 9. with respective probabilities of 0.35. $700. he makes eight independent calls. 0. he makes three independent calls.) for these events are {A. C. In the second. P (Z > 0).41}.33.57. what is P (Y > X)? Compare probabilities in the two schemes that total sales are at least $600. E. F } is independent. Payos are $310.) A residential College plans to raise money by selling chances on a board. 0.34) P (Y > X). X be the net prot on the rst alternative and is independent. 0. then repeat with icalc3 and P (W > 0) Exercise 9. (Solution on p.
26 Martha has the choice of two games. Pay ve dollars to play. of on the success Assume that all nine events form an independent class.246 CHAPTER 9. 0. worth two points each.71. otherwise.75. Bill. 0.82. P (W 1 > 0) and Compare the two games further by calculating P (W 2 > 0) Which game seems preferable? Exercise 9. The probability of a win on any play is Martha has $100 to bet. with probabilities respective calls. P (X ≥ Y ). and 0.00 more than Glenn's? Exercise 9. Let She is trying to decide whether to play Game 1 ten times or Game 2 W1 and W2 be the respective net winnings (payo minus fee to play).) Jim and Bill of the men's basketball team challenge women players Mary and Ellen to a free throw contest.91.) Game 1: Game 2: 1/3. Y be the number of points P (Y ≥ 15) . so the game is fair. with probability 0.48.63. The probability of a win is 1/2. scored by Mike and Harry. (Solution on p. receive $15 for a win. 0. • • Determine P (W 2 ≥ W 1).00 on each sale.37.82. what is the probability Margaret's gain is at least $10. Margaret makes four sales calls on the respective calls. Harry shoots 12 three point shots. 256. she receives $20. INDEPENDENT CLASSES OF RANDOM VARIABLES Exercise 9. with probability 0. Determine P (X ≥ 15).00 on each sale and Margaret earns $20. 255. Pay ten dollars for each play. Each takes ve free throws. twenty times. Jim. 0. of success with probabilities 0. 0. 255.85 of making each shot tried. 0.77. If Glenn realizes a prot of $18.) 0. 0. 0. for a net gain of $10 on the play. and Ellen have respective probabilities p = 0.80.25 Mike and Harry have a basketball shooting contest. (Solution on p. 0.52.24 Glenn makes ve sales calls. she loses her $10. 254.75 of success on each shot. X.87. Exercise 9. (Solution on p.) • • Let and Mike shoots 10 ordinary free throws.27 (Solution on p. respectively. If she wins. What is the probability Mary and Ellen make a total number of free throws at least as great as the total made by the guys? .40 of success on each shot. Mary. Make the usual independence assumptions.
1 (p.41: npr09_02) Data are in X.4 (p. 243) npr09_02 (Section~17. Y. Y. P itest Enter matrix of joint probabilities P The pair {X.8. P itest Enter matrix of joint probabilities P The pair {X. Y.Y} is NOT independent To see where the product rule fails.8. tuappr Enter matrix [a b] of Xrange endpoints [1 1] Enter matrix [c d] of Yrange endpoints [1 1] Enter number of X approximation points 100 .3 (p.38: npr08_07) Data are in X.Y} is NOT independent To see where the product rule fails. call for D disp(D) 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Solution to Exercise 9. call for D disp(D) 0 0 0 0 0 0 1 1 0 0 0 1 1 0 0 0 0 0 0 0 Solution to Exercise 9. call for D disp(D) 0 0 0 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 Solution to Exercise 9.8. P itest Enter matrix of joint probabilities P The pair {X. 243) npr08_07 (Section~17.2 (p. 242) npr08_06 (Section~17.Y} is NOT independent To see where the product rule fails. 243) Not independent by the rectangle test.37: npr08_06) Data are in X.247 Solutions to Exercises in Chapter 9 Solution to Exercise 9.
by the rectangle test.. PX. u. fY (u) = 2 (1 − u) . (u>=max(1t. t. and P itest Enter matrix of joint probabilities P The pair {X. u. Y.12) from "Problems on Random Vectors and Joint Distributions" we have fX (t) = 2t.t1)) Use array operations on X. PY.too large Solution to Exercise 9.^2 + u. t. call for D % Not practical. t. and P itest Enter matrix of joint probabilities The pair {X.^2<=1) Use array operations on X. 0 ≤ t ≤ 1. PY.6 (p.Y} is independent Solution to Exercise 9. 243) Not independent. 0 ≤ t ≤ 2 4 (9. INDEPENDENT CLASSES OF RANDOM VARIABLES Enter number of Y approximation points 100 Enter expression for joint density (1/pi)*(t. u.*(1u) Use array operations on X.7 (p. call for D Solution to Exercise 9.248 CHAPTER 9.3t)). PX. fXY = fX fY so the pair is independent.* . . PY. (9.35) tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 100 Enter number of Y approximation points 100 Enter expression for joint density 4*t. Y.Y} is NOT independent To see where the product rule fails.Y} is NOT independent To see where the product rule fails.. and P itest Enter matrix of joint probabilities P The pair {X. PX.13) from "Problems on Random Vectors and Joint Distributions" we have fX (t) = fY (t) = so 1 (t + 1) . tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (1/2)*(u<=min(1+t. 243) P From the solution of Exercise 13 (Exercise 8. Y. 0 ≤ u ≤ 1.36) fXY = fX fY which implies the pair is not independent.5 (p. 243) From the solution for Exercise 12 (Exercise 8.
Y.* .9 (p. and P itest Enter matrix of joint probabilities P The pair {X.Y} is NOT independent To see where the product rule fails. Y.Y} is NOT independent To see where the product rule fails.37) fXY = fX fY and the pair is independent.10 (p.. 243) Not independent by the rectangle test. Y.^2.*u. call for D Solution to Exercise 9. fY (u) = 2u.1)). the pair is not independent.*exp(2*t) Use array operations on X. tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 . u. t. 0 ≤ t. PX. u.t)) Use array operations on X. PX. tuappr Enter matrix [a b] of Xrange endpoints [0 5] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 500 Enter number of Y approximation points 100 Enter expression for joint density 4*u.249 tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 100 Enter number of Y approximation points 100 Enter expression for joint density (1/8)*(t+u) Use array operations on X. call for D Solution to Exercise 9. and P itest Enter matrix of joint probabilities P The pair {X.Y} is independent % Product rule holds to within 10^{9} Solution to Exercise 9. 0 ≤ u ≤ 1 so that (9. 244) By the rectangle test. 243) From the solution for Exercise 14 (Exercise 8. u. PY. (u>=max(0. PY.14) from "Problems on Random Vectors and Joint Distribution" we have fX (t) = 2e−2t . tuappr Enter matrix [a b] of Xrange endpoints [1 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 100 Enter expression for joint density 12*t. and P itest Enter matrix of joint probabilities P The pair {X.. t.*(u<=min(t+1.8 (p. t. PX. PY.
p.Pneither Poneormore = 0.p. p = 3/4.9996. Solution to Exercise 9. Y.250 CHAPTER 9. u.9246 Pneither = (1 .0000 0. PX. 244) p1 = 1 .6988 p2 = 1 .6263 190.0000 0.k). PY. so that P = cbinom(10.Y} is NOT independent To see where the product rule fails.3930 Solution to Exercise 9. .5) % Probability five or more will survive P = 0.7866 % Probability any unit survives P = p^12 % Probability all twelve survive (assuming 12 periods) P = 0.p1)*(1 .5238 Poneormore = 1 .1381 Solution to Exercise 9.*(u<=min(1. and P itest Enter matrix of joint probabilities P The pair {X.57). call for D Solution to Exercise 9. 244) X = 570IA + 525IB + 465IC 0.13 (p.16 (p.0000 0.0000 0.410.9449 185.12 (p. 244) Geometrically.15 (p.9973 180.35].(1 . 244) p = exp(500/750). where S∼ binomial (8. % Probability any one will survive P = cbinom(8. t. with [P (A) P (B) P (C)] = [0.p2) % 1 .11 (p.3) = 0.9277 k = 175:5:190.0754 Solution to Exercise 9.exp(180/150) p1 = 0. 244) p = exp(720/3000) p = 0.p2) Pneither = 0.056 Solution to Exercise 9.14 (p. 244) p = exp(750/10000) p = 0.570. P = cbinom(200.exp(180/130) p2 = 0.p. Y = 150S .*u.p1)*(1 .P]') 175. disp([k. INDEPENDENT CLASSES OF RANDOM VARIABLES Enter number of Y approximation points 100 Enter expression for joint density (24/11)*t.2t)) Use array operations on X.7496 Pboth = p1*p2 Pboth = 0.
canonic % Distribution for X Enter row vector of coefficients c Enter row vector of minterm probabilities pm Use row matrices X and PX for calculations Call for XDBN to view the distribution Y = 150*[0:8].35]).57.4131 0.0.4). for i = 1:4 py(i) = (Y>=k(i))*PY'.01*[77 52 23 [X. pmx = minprob(0.251 c = [570 525 465 0]. Y.pmx). pmy = minprob(0. % Distribution for Y PY = ibinom(8.3514 0. pm = minprob([0. and P xmax = max(X) xmax = 1560 ymax = max(Y) ymax = 1200 k = [600 900 1000 1100].17 (p.py]') 0.0818 0. u.4131 0.57 0.PY] = canonicf(cy. X Y . icalc Enter row matrix of Xvalues Enter row matrix of Yvalues Enter X probabilities PX 81 63]).4).0784 0. for i = 1:4 px(i) = (X>=k(i))*PX'. end disp([px. t.01*[37 22 38 cy = [8 15 12 18 15 12 0]. 244) cx = [5 17 21 8 15 0].*P) PM = 0.0111 M = u > t. PM = total(M. PX.pmy). px = zeros(1. end py = zeros(1. [Y.41 0.5081 % P(Y>X) Solution to Exercise 9.PX] = canonicf(cx. 41 83 58]). icalc % Joint distribution Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X. PY.0:8).2560 0.7765 0.
end disp([px.18 (p. u.0818 0.2560 0.57.2431 Solution to Exercise 9. PX. and P xmax = max(X) xmax = 1040 ymax = max(Y) ymax = 800 PYgX = total((u>t). INDEPENDENT CLASSES OF RANDOM VARIABLES Enter Y probabilities PY Use array operations on matrices X.4131 0.3). u.0.20*N1.05. end for i = 1:3 py(i) = (Y>=k(i))*PY'.3448 M2 = u . and P M1 = u >= 2*t.0784 0. for i = 1:3 px(i) = (X>=k(i))*PX'. t. PM1 = total(M1.2337 0.0:8). canonic Enter row vector of coefficients cx Enter row vector of minterm probabilities pmx Use row matrices X and PX for calculations Call for XDBN to view the distribution icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X. x = 150 .01*[35 41 57]).0111 Solution to Exercise 9. 245) N1 = 0:30.19 (p.*P) PM2 = 0. py = zeros(1.5081 k = [600 700 750]. PY.py]') 0. px = zeros(1. PN1 = ibinom(30. PM2 = total(M2. Y. PY = ibinom(8.252 CHAPTER 9.0:30).t >=30. 245) cx = [310 380 350 0]. t. . pmx = minprob(0.*P) PM1 = 0.*P) PYgX = 0. Y = 100*[0:8]. Y. PY. PX.0.3).
5514 icalc3 % Alternate using icalc3 . PX. PY. u. px = 0. u. 245) Since icalc uses and X X and PX in its output. PY.P). icalc Enter row matrix of Xvalues 3*x Enter row matrix of Yvalues 4*x Enter X probabilities px Enter Y probabilities px Use array operations on matrices X. we avoid a renaming problem by using x and px for data vectors P X. Plose = total(Mlose.253 [X. icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X. PX.PN2).0828 P200_450 = total(M200_450. [V. u.*P) Pm400 = 0. PN2 = ibinom(50. t.0. [Y. t. and P a = t + u. PX. and P b = t + u.5249e04 Pm400 = total(Mm400.8636 Solution to Exercise 9.1957 Pl200 = total(Ml200.0:50).01*[15 20 30 25 10]. Y.*P) Pl200 = 0.P). Y.*P) Plose = 3.2. Mm400 = G >= 400. and P G = t + u. y = 500 . icalc Enter row matrix of Xvalues V Enter row matrix of Yvalues 2*x Enter X probabilities PV Enter Y probabilities px Use array operations on matrices X. N2 = 0:50.5300 P2 = ((20<=W)&(W<=10))*PW' P2 = 0.PW] = csort(b. Mlose = G < 0. PY. M200_450 = (G>=200)&(G<=450).20 (p.PY] = csort(y. P1 = (W>0)*PW' P1 = 0. Y. Ml200 = G < 200.*P) P200_450 = 0.PN1).PX] = csort(x.PV] = csort(a. x = [5 1 3 4 7]. [W. t.30*N2.
Y. [Z. 245) P = (ibinom(30.3100 % Probability eight or more girls throw more Solution to Exercise 9.21 (p.24 (p. PY.4373 % Probability each girl throws more P = cbinom(15.5514 Solution to Exercise 9. [Y. cy = [2 6 2 3].3*u.5300 P2 = ((20<=W)&(W<=10))*PW' P2 = 0.PW] = csort(a.PZ] = csort(G.0:29))*(cbinom(30. PYgX = total((u>t). [X.PX] = canonicf(cx. 245) cx = [3 9 4 0]. pmx = minprob(0.1/6. .5654 P5Z25 = ((5<=Z)&(Z<=25))*PZ' P5Z25 = 0.3752 PZpos = (Z>0)*PZ' PZpos = 0.P).5) 0].1/2. INDEPENDENT CLASSES OF RANDOM VARIABLES Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter row matrix of Zvalues x Enter X probabilities px Enter Y probabilities px Enter Z probabilities px Use array operations on matrices X.1/2.01*[47 37 41]). t.254 CHAPTER 9. cm = [20*ones(1.22 (p.1:20))' pg = 0. [W.8) P = 0. PY. P1 = (W>0)*PW' P1 = 0.P).1:30))' = 0. Y. and P a = 3*t . t. PZ. u.pg. Z.23 (p.PY] = canonicf(cy.4) 0]. 246) cg = [18*ones(1. PX.*P) PYgX = 0. v. icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X.4*u + 2*v.1/6. PX. pmy = minprob(0. 245) Solution to Exercise 9. u.4745 Solution to Exercise 9. and P G = 2*t .pmy).0:19))*(cbinom(20.pmx).4307 pg = (ibinom(20.01*[42 27 33]).
P1pos = (W1>0)*PW1' P1pos = 0. .75. W2 = 15*[0:20] .5618 G = t>=u. PY = ibinom(12.26 (p.*P) p1 = 0. t. PG = total(G. PX.5256 PY15 = (Y>=15)*PY' PY15 = 0. 246) W1 = 20*[0:10] . PX = ibinom(10. PY.5207 icalc Enter row matrix of Xvalues W1 Enter row matrix of Yvalues W2 Enter X probabilities PW1 Enter Y probabilities PW2 Use array operations on matrices X. pmm = minprob(0. t. [M.PG] = canonicf(cg.5197 Solution to Exercise 9.3770 P2pos = (W2>0)*PW2' P2pos = 0. PX. and P PX15 = (X>=15)*PX' PX15 = 0.0:20). PX. icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X. PW2 = ibinom(20. Y = 3*[0:12].25 (p.0:12).0. icalc Enter row matrix of Xvalues G Enter row matrix of Yvalues M Enter X probabilities PG Enter Y probabilities PM Use array operations on matrices X. 246) X = 2*[0:10].pmm). PY. u.0:10).100.PM] = canonicf(cm.1/3. [G. u. p1 = total(H.5811 Solution to Exercise 9. Y. and P H = ut>=10. PY.0. u. Y.255 pmg = minprob(0. t.0:10).01*[77 82 75 91]). Y.pmg).01*[37 52 48 71 63]). PW1 = ibinom(10. and P G = u >= t.40.1/2.*P) PG = 0.100.
P). = ibinom(5. icalc Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter X probabilities PJ Enter Y probabilities PB Use array operations on matrices X.x). PX. and P H = t+u.0. t.80. PY. and P Gw = u>=t.*P) PG = 0. = ibinom(5.5182 Solution to Exercise 9. Z. u.5746 . 246) PJ PB PM PE x = 0:5. u. t. PZ. Y. v.0.0.85. and P H = v+w >= t+u.x).27 (p.5746 icalc4 % Alternate using icalc4 Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter row matrix of Zvalues x Enter row matrix of Wvalues x Enter X probabilities PJ Enter Y probabilities PB Enter Z probabilities PM Enter W probabilities PE Use array operations on matrices X. = ibinom(5. u. PX.P). Y.Pm] = csort(H. Y. PH = total(H.82.W PX. INDEPENDENT CLASSES OF RANDOM VARIABLES PG = total(G. PX. icalc Enter row matrix of Xvalues Tm Enter row matrix of Yvalues Tw Enter X probabilities Pm Enter Y probabilities Pw Use array operations on matrices X. PY. PY. [Tm.x). = ibinom(5. Y. PGw = total(Gw. and P G = t+u.*P) PGw = 0. icalc Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter X probabilities PM Enter Y probabilities PE Use array operations on matrices X. PW t.87. w.0.256 CHAPTER 9. u.*P) PH = 0.x). [Tw. t.Pw] = csort(G. PY.
we observe a value of some random variable. and the density of the steel. The agreement is that the organization will purchase ten tickets at $20 each (regardless Additional tickets are available according to the following of the number of individual buyers). $16 each 3150.5/>. $13 each If the number of purchasers is a random variable X.2) 257 .1 Functions of a Random Variable1 Introduction X is a random variable and g is a reasonable function (technically. 10. a Borel function). then Z = g (X) is a new random variable which has the value g (t) for any ω such that X (ω) = t. an approach We consider. rst. the desired random variable is D = kX 1/3 . If Thus Frequently. $15 each 51100. the total cost (in dollars) is a random quantity (10. (10. Example 10. from this by a function rule.1) Z = g (X) described by g (X) = 200 + 18IM 1 (X) (X − 10) + (16 − 18) IM 2 (X) (X − 20) + (15 − 16) IM 3 (X) (X − 30) + (13 − 15) IM 4 (X) (X − 50) 1 This content is available online at <http://cnx. $18 each 2130. the Thus. where k w is the weight. A wide variety of functions are utilized in practice. if units of measurement. but are really interested in a value derived Z (ω) = g (X (ω)).1.2: Price breaks The cultural committee of a student organization has arranged a special deal for tickets to a concert. then is a factor depending upon the formula for the volume of a sphere. X is the weight of the sampled ball.org/content/m23329/1.1: A quality control problem In a quality control check on a production line for ball bearings it may be easier to weigh the balls than measure the diameters.Chapter 10 Functions of Random Variables 10. functions of a single random variable. schedule: • • • • 1120. Example 10.1 The problem. diameter is If we can assume true spherical shape and kw1/3 .
merely use the dening conditions. N = {t : g (t) > 0} is the The X values −2 and 6 lie in this set. For each possible value ti of X. 3. M 3 = [30. Hence set of points to the left of −1 or to the right of P (g (X) > 0) = P (X = −2) + P (X = 6) = 0. with respective probabilities 0. This is the approach we use in the MATLAB calculations. 1. Example 10. ∞) . identify those values ti of X f . Simply nd the amount of probability mass mapped into the set random variable Mapping approach. g (t) = (t + 1) (t − 4).2.2. Consider Z = g (X) = (X + 1) (X − 4). M 4 = [50.3 0. M by the In the absolutely continuous case. Remark. determine Once (10.3 4 0 6 0. The mapping approach Determine P (Z > 0). Suppose N in the mapping approach is called the inverse image N = g −1 (M ). ∞) . Consider each value ti of X. but the essential problem is the same. Hence {ω : g (X (ω)) ∈ M } = {ω : X (ω) ∈ N } Since these are the same event. then Z = g (X) is a new random variable. b. Select those which meet the dening conditions for M and add the associated probabilities. Determine the set N of all those t which are mapped into M by the function g. calculate In the discrete case. Now if X (ω) ∈ N . M X which are in the set M and add the associated (b) Discrete alternative. 0. Mapping approach. Note that it is not necessary to describe geometrically the set M . The problem If for X is a random variable.3) The function rule is more complicated than in Example 10.1. then X (ω) ∈ N . ∞) where (10.2 + 0. SOLUTION First solution.1 (A quality control problem). Suppose we have X. The set P (X ∈ N ) in the usual manner (see part a. 0.2 = 0.4) N is identied.2 14 1 6 1 0 0 . above).258 CHAPTER 10. FUNCTIONS OF RANDOM VARIABLES M 1 = [10.2 6 3 0. 0. • • probabilities. X. and if g (X (ω)) ∈ M . M 2 = [20. 4. they must have the same probability. (b) Discrete alternative.4 (10.5) Second solution. determine whether g (ti ) meets the dening condition for M. 6. 0.1 4 1 0.2 0 0. The discrete alternative X= PX = Z= Z>0 2 0. then g (X (ω)) ∈ M . the probability Z takes a value in the set M ? An approach to a solution the distribution We consider two equivalent approaches a.3: A discrete example X has values 2. ∞) . To nd (a) P (X ∈ M ). To nd (a) P (g (X) ∈ M ). Select those ti which do and add the associated probabilities.2. How can we determine P (Z ∈ M ).
Example 10. next.4: An absolutely continuous example Suppose X∼ uniform [−3. 7]. Example 10.14) 2π Z ∼ N (0. 1) .5 (10. 7] is 0. g (t) = t < − 1 or t > 4. N = {t : g (t) > 0}.1 Picking out and adding the indicated probabilities. σv + µ]. As in Example 10.1 [(−1 − (−3)) + (7 − 4)] = 0.8) We consider.12) (10. Then fX (t) = 0. ' ' fZ (v) = FZ (v) = FX (σv + µ) σ = σfX (σv + µ) (10. for given t−µ ≤v σ i t ≤ σv + µ so that (10.9) Z 2 1 φ (t) = √ e−t /2 2π (10.2 + 0.4 (10. the discrete alternative is more readily implemented.2 = 0. Thus. the integral of the subinterval of [−3.10) Now g (t) = Hence.5: The normal distribution and standardized normal distribution To show that if X ∼ N µ. −3 ≤ t ≤ 7 (and zero elsewhere).1 times the length of that subinterval.7) P (Z > 0). v] the inverse image is N = (−∞. 1) σ is (10.13) Thus fZ (v) = We conclude that σ 1 σv + µ − µ √ exp − 2 σ σ 2π 2 2 1 = √ e−v /2 = φ (v) (10. σ 2 then Z = g (X) = VERIFICATION We wish to show the denity function for X −µ ∼ N (0.11) M = (−∞. we have P (Z > 0) = 0. for MATLAB calculations (as we show below). Let Z = g (X) = (X + 1) (X − 4) Determine (10.1.3 (A discrete example). Because of the uniform distribution. However.259 Table 10. some important examples. FZ (v) = P (Z ≤ v) = P (Z ∈ M ) = P (X ∈ N ) = P (X ≤ σv + µ) = FX (σv + µ) Since the density is the derivative of the distribution function.6) In this case (and often for hand calculations) the mapping approach requires less calculation. the desired for SOLUTION First we determine (t + 1) (t − 4) > 0 density over any probability is P (g (X) > 0) = 0.
t1/a is increasing. 0). 1). −a = a. VERIFICATION Use of the result of Example 10.7: Completion of normal and standardized normal relationship Suppose Z ∼ N (0. an ane function (linear plus a constant). the two cases may be combined into one formula.15) • a > 0: FZ (v) = P • a<0 FZ (v) = P So that X≤ v−b a = FX v−b a v−b a (10.6 (Ane functions) on ane functions shows that fX (t) = 1 φ σ t−µ σ = 1 t−µ 1 √ exp − 2 σ σ 2π 2 (10.19) Example 10. FUNCTIONS OF RANDOM VARIABLES Example 10.16) X≥ v−b a =P X> v−b a +P X= (10. λ. Here If it is absolutely continuous. v−b a = 0.22) Example 10. we have FZ (v) = P (Z ≤ v) = P (X ≤ v a ) = FX (v a ) In the absolutely continuous case ' fZ (v) = FZ (v) = fX (v a ) av a−1 (10. (and the density in the absolutely continuous case). Then Z = X 1/a ∼ Weibull (a.17) FZ (v) = 1 − FX For the absolutely continuous case.18) P X= v−b a and by dierentiation • • for for 1 a > 0 fZ (v) = a fX v−b a 1 a < 0 fZ (v) = − a fX v−b a Since for a < 0. the corresponding density is Consider Z = aX + b (a = 0). X has distribution function FX .21) (10. Show that X = σZ + µ (σ > 0) is N µ. +P X= v−b a (10. Determine the distribution function for SOLUTION Z g (t) = at + b. σ 2 .8: Fractional power of a nonnegative random variable Suppose 0≤t 1/a X ≥ 0 and Z = g (X) = X 1/a ≤ v i 0 ≤ t ≤ v a .9: Fractional power of an exponentially distributed random variable Suppose X∼ exponential (λ).6: Ane functions Suppose fX . Thus for a > 1. FZ (v) = P (Z ≤ v) = P (aX + b ≤ v) There are two cases (10.20) Example 10. .260 CHAPTER 10. Since for t ≥ 0. fZ (v) = 1 fX a v−b a (10.
b]. 1 ≤ i ≤ n−1.8).11: Basic calculations for a function of a simple random variable X = 5:10.0000 0 % Values of X % Probabilities for X % Array operations on X matrix to get G = g(X) % Relational and logical operations on G % Sum of probabilities for selected values % Display of various matrices (as columns) 0. • The zeroone matrix M is used to select the the corresponding pi = P (X = ti ) and sum them by the taking the dot product of M and P X .1). G = (X + 6). For the given subdivision. since this may be implemented easily with MATLAB. 0).6.0000 3. FZ (t) = FX (ta ) = 1 − e−λt which is the distribution function for a (10.0000 0.8 (Fractional power of a nonnegative random variable).24) X to within the length of the largest subinterval i Mi .*(X . ti ) be the i th subinterval. pick a point si .PX]') 5. we form a simple random variable Xs as follows. Let Mi = [ti−1 .M. a simple function approximation may be constructed (see Distribution Approximations). ti−1 ≤ si < ti . we use the discrete alternative approach. in which the range of a bounded interval I = [a. a function of X (10. Now IEi = IMi (X).0000 132.2 Use of MATLAB on simple random variables For simple random variables. 1 ≤ i ≤ n−1 and Mn = [tn−1 . Suppose the distribution for X is expressed in the row vectors X and P X.26) relational and logical operations on G to obtain a matrix M which has ones for those ti (values X ) such that g (ti ) satises the desired condition (and zeros elsewhere).0. tn ].*(X . The simple random variable n Xs = i=1 approximates si IEi (10.10: A simple approximation as a function of If X X X is limited to is a random variable. n We may thus write Xs = i=1 si IMi (X) . Then the Ei form a partition of the basic space Ω. since IEi (ω) = 1 i X (ω) ∈ Mi IMi (X (ω)) = 1. Example 10. We limit our discussion to the bounded case.261 According to the result of Example 10. In each subinterval.0000 0. M = (G > .0000 4. with a = t0 and b = tn .23) Z∼ Weibull (a.100)&(G < 130).0000 1. • We perform array operations on vector X to obtain G = [g (t1 ) g (t2 ) · · · g (tn )] • (10. PM = M*PX' PM = 0.1. λ.25) 10.0000 1. Suppose I is partitioned into n subintervals by points ti . We use of Example 10. PX = ibinom(15.0000 120.0003 . −1 Let Ei = X (Mi ) be the set of points mapped into Mi by X.0:15).G.0000 78.4800 disp([X.
0000 0 0.0000 132.27) c = [10 18 10 0].0000 5.0005 % Sorting and consolidating to obtain % the distribution for Z = g(X) 2.1400 10.0000 120.0000 1.0074 0.0000 0.0000 1.0000 120.0000 0.PX).1181 0 0.0000 1.0000 [Z. pm = minprob(0.0000 120.0000 0 0 0 1.0000 3.1*[6 3 5]).0000 0.0000 0.0000 1.0245 78.0000 8. FUNCTIONS OF RANDOM VARIABLES 1. (10.0074 120.0000 0 9. B.0832 48.0000 0.0000 0.0000 1.0612 0.0047 0.0634 0.0000 10. PX % Alternate using Z. disp([Z.1859 120.12 % Further calculation using G.0064 132. then determine the distribution for Z = X 1/2 − X + 50 independent and P = [0.0000 90.0000 4.PZ] = csort(G.0000 0.0000 7.0000 0.0600 20.0000 0.0000 0 2.1771 0.0000 288.1859 0.262 CHAPTER 10.0000 1.0000 120.1859 Example 10.1859 p1 = (Z<120)*PZ' p1 = 0.60.0000 0.0000 1.1268 0.30.0016 0.0245 0. canonic Enter row vector of coefficients c Enter row vector of minterm probabilities Use row matrices X and PX for calculations Call for XDBN to view the distribution disp(XDBN) 0 0.0219 0.0000 1.0000 48.0005 P1 = (G<120)*PX ' P1 = 0.0634 48.0000 0.0000 0.0003 288.0000 78.PZ]') 132.2066 0.3500 18.0000 90.1771 78.0000 90. PZ X = 10IA + 18IB + 10IC with {A.3334 90.0000 6. C} We calculate the distribution for X.0000 0.2100 pm .0000 0 48.0000 1.0000 0.1181 0.5].
Note that with the mfunction csort.2100 fX (t) = 1485 0.3500 50. M = G <= 0.642 /2 = 0.0000 0. We may use the approximation mprocedure tappr to obtain an approximate discrete distribution.3359 % Agrees quite closely with the theoretical 10.1400 M = (Z < 20)(Z >= 40) M = 1 0 0 PZM = M*PZ' PZM = 0. [Z.3359 Using the approximation procedure.643 + 0.PZ]') 18.0000 0.org/content/m23332/1.1500 38. disp([Z.0900 27.0600 43.PX) W = 1 171 595 PW = 0.64. Now Z ≤ 0. above.PZ] = csort(G.1500 34.0900 0 ≤ t ≤ 1.8) = FX (0.1644 0.13: Continuation of Example 10.5800 % Formation of G matrix % Sorts distinct values of g(X) % consolidates probabilities 0 % Direct use of Z distribution 1 1 Remark.28) tappr Enter matrix [a b] of xrange endpoints [0 1] Enter number of x approximation points 200 Enter density as a function of t (3*t. FX (t) = 1 2 Example 10.12. we have (10.2426 0.2915 0. Then we work with the approximating random variable as a simple random variable. H = 2*X.1623 0.^(1/2).82 = 0.PW] = csort(H.^2 .^2 + 2*t)/2 Use row matrices X and PX as in the simple case G = X. Let Z =X 1/2 .8.X + 50. we may name the output as desired.3*X + 1.263 28.64) = 0.1400 0.8 i X ≤ 0.0000 0.5/>. PM = M*PX' PM = 0. The desired probability may be P (Z ≤ 0.1500 1 2 2775 0.3500 0. Example 10.0900 G = sqrt(X) .2100 36.8). .2 Function of Random Vectors2 Introduction 2 This content is available online at <http://cnx.4721 0. Suppose we want calculated to be P (Z ≤ 0. [W.0600 741 0.14: A discrete approximation Suppose X has density function 3t2 + 2t for Then t3 + t2 .PX).
determine whether g (ti . Y (ω)) ∈ M Hence(10. a. Select those which meet the and add the associated probabilities. For each possible vector value meets the dening condition for M. g is real valued and M is a subset the real line.264 CHAPTER 10. f Q XY . of In the absolutely continuous case. Y ) ∈ M ). The approach is analogous to that for a single random variable with distribution on the line. bilities. plane. Y (ω)) ∈ Q} Since these are the same event. Y ) which are in the set Q and (b) Discrete alternative. To nd (a) P ((X. Y ). g (X. P (g (X. Now Determine the set Q of all those (t. Once (10.1 The general approach extended to a pair Consider a pair {X. Y ) ∈ M i (X. • • Q on the plane by the random vector W = (X. Y ) ∈ Q) in the usual manner (see part a. . uj ) which do and add the associated proba We illustrate the mapping approach in the absolutely continuous case.30) Q is identied on the (b) Discrete alternative. Simply nd the amount of probability mass mapped into the set Mapping approach. function g.29) {ω : g (X (ω) . Y } having joint distribution on the plane. uj ) (X. although the details are more complicated. FUNCTIONS OF RANDOM VARIABLES The general mapping approach for a single random variable and the discrete alternative extends to functions of more than one variable. determine P ((X. above). To nd (a) Q. uj ) of (X. Extensions to more than two random variables are made similarly. u) which are mapped into M by the W (ω) = (X (ω) . Y ) ∈ Q). Y ) ∈ Q. Consider each vector value dening conditions for Q (ti . Y (ω)) ∈ Qig ((X (ω) . Y (ω)) ∈ M } = {ω : (X (ω) . Y ). Y ). This is the approach we use in the MATLAB calculations.2. uj ) of (X. uj ) (ti . Select those (ti . Mapping approach. calculate In the discrete case. nding the set by integrating Q A key element in the approach is The desired probability is obtained on the plane such that over fXY Q. identify those vector values add the associated probabilities. 10. (ti . they must have the same probability. It does not require that we describe geometrically the region b. considered jointly. It is convenient to consider the case of two random variables.
fXY is dierent from zero on the triangle bounded by t = 2.2). u) : u ≤ v − t} For any xed where Qv = {(t. t} (see Figure 1).16: The density for the sum Suppose the pair {X. t = 2.31) Example 10. Here g (t. u = max{1.265 Figure 10. ∞). u) : t + u ≤ v} = u = v−t (see (10. the region Qv is the portion of the plane on or below the line Figure 10.34) . u) = 37 (t + 2u) on the region bounded by t = 0.1: Distribution for Example 10. Now Q = {(t.32) FZ (v) = P (X + Y ≤ v) = P ((X. Example 10. and u = t.33) v. Thus ∞ v−t FZ (v) = Qv fXY = −∞ −∞ fXY (t. u) : t − u ≥ 0} = {(t.15: A numerical example The pair 6 {X. u) : u ≤ t} which is the region on the plane on or below the line u = t. Y ) ∈ Qv ) {(t. u) du dt (10. Examination of the gure shows that for this region.15 (A numerical example). u = 0. u = 0. The desired probability is 2 t 0 P (Y ≤ X) = 0 6 (t + 2u) du dt = 32/37 ≈ 0. Determine P (Y ≤ X) = P (X − Y ≥ 0). Determine (10. Y } has joint density fXY (t.8649 37 X +Y fXY . u) = t − u and M = [0. Y } has joint density the density for Z =X +Y SOLUTION (10.
FZ (v) = v2 2 for is the probability in the region Qv : u ≤ v − t. Y } has joint uniform density on the unit square 0 ≤ t ≤ 1. v − t) dt (10. the complementary region Qv is the upper shaded (2 − v) /2. Determine the density for SOLUTION Z =X +Y.1] (v) v + I(1. It has area 2 in this case.38) .2: Region Qv for X + Y ≤ v. Thus.37) With the use of indicator functions. for v ≤ 1. since (10. we get ∞ fZ (v) = ∞ This integral expresssion is known as a fXY (t.17: Sum of joint uniform random variables Suppose the pair {X.266 CHAPTER 10. For v > 1. Figure 10.2] (v) (2 − v) (10. these may be combined into a single expression fZ (v) = I[0. PXY 0≤v≤1 and FZ (v) = 1 − (2 − v) 2 2 for 1≤v≤2 [0. Qv which has probability mass is the lower shaded triangular region on the gure. the region. Now PXY (Qv ) = 1 − PXY (Qc ). 2]. c FZ (v) part of area the complementary set Qv is the set of points above the line. so that 2 (Qv ) = 1 − (2 − v) /2. where v As Figure 3 shows.35) convolution integral. 0 ≤ u ≤ 1.36) Dierentiation shows that Z has the symmetric triangular distribution on fZ (v) = v for 0≤v≤1 and fZ (v) = (2 − v) for 1≤v≤2 (10. Example 10. FUNCTIONS OF RANDOM VARIABLES Dierentiating with the aid of the fundamental theorem of calculus. which has 2 c (and hence probability) v /2.
Y ). so that we have fXY (t. Thus. i v − 1 ≤ t ≤ v .41) X Y. Y } is an independent pair with simple approximations Xs m and Ys as described in Distribution Approximations. Independence of functions of independent random variables Suppose {X. respectively. n n m Xs = i=1 As functions of ti IEi = i=1 and ti IMi (X) and Ys = j=1 uj IFj = j=1 is uj INj (Y ) Also each (10. 1] (u). W } is not independent. 1] (v) I[0.3: Geometry for sum of joint uniform random variables. 2] (v) I[v−1. v] (t) + I(1. W = h (Y ). then in general {Z. if Z = g (X. then the pair {Z. This is illustrated for simple random variables with the aid of the mprocedure jointzw at the end of the next section. ALTERNATE SOLUTION Since v−t≤1 fXY (t. 1] (t) I[0. 1] (t) I[0.39) t gives the result above. Let Z = g (X) . 1] (t) Integration with respect to (10. W } is independent. Y } is an independent pair and Z = g (X) .18: Independence of simple approximations to an independent pair Suppose {X. Y ) and W = h (X. v − t) = I[0.40) {Z −1 (M ) . Now 0≤ fXY (t. W −1 (N )} is independent for each pair {M. . Y } is an independent pair. INj (Y )} is independent. the pair {Z. Ys } independent.267 Figure 10. and Since Z −1 (M ) = X −1 g −1 (M ) the pair If W −1 (N ) = Y −1 h−1 (N ) (10. 1] (v − t). u) = I[0. the pair {Xs . W = h (Y ). Example 10. pair {IMi (X) . N }. v − t) = I[0. W } is independent. {X. However.
0.0000 0.0000 0.0000 0.0180 % X Y P We extend the example above by considering a function W = h (X.0000 0.0209 8. PY.0837 5.0402 6.0198 0.*P) PM = 0.4665 disp([Z. % Calculation using the XY distribution PM = total(M. % Formation of G = [g(ti.0000 0.268 10.PZ] = csort(G. uj ).0516 16. and P G = t.0270 0.0297 11.0360 10. Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.0000 0.PZ]') % Display of the Z distribution 12.0405 0.0000 0. u. u. t. we may use the A rst step. Y.0744 4. jdemo3 % Call for data jcalc % Set up of calculating matrices t. This is illustrated in the following example.1032.0000 0.0209 0.0817 0.0000 0.0000 0.*P) % Alternately.0180 0.0132 0. Example 10.0589 3. 0.0372 0.^2 3*u. FUNCTIONS OF RANDOM VARIABLES X to determine a matrix of values of g (X). To obtain the distribution for and the joint probability matrix P.0405 1.0372 13.0516 0. Z = g (X. In the twovariable case.0000 0.0744.4665 [Z.19 % file jdemo3. Y ) which has a composite denition. .0000 0.0837 0.0000 0.0000 0.0589 0. = [0 1 2 4].0774 0.0360].0285 0.1059 3. Y ).uj)] M = G >= 1.0297 0. PX.2 Use of MATLAB on pairs of simple random variables In the singlevariable case.P).0198 6.m data for joint simple distribution = [4 2 0 1 3].1032 9.0264.1375 0 0. we must use array operations on the calculating matrices a matrix G t and u to obtain whose elements are mfunction csort on G g (ti .1161 0.1425 2. use total((G>=1).0000 0. then. PM = (Z>=1)*PZ' % Calculation using the Z distribution PM = 0. is the use of jcalc or icalc to set up the joint distribution and the calculating matrices. we use array operations on the values of CHAPTER 10. 0. = [0.0558 0.2.
u) [W.0000 0.4 .0000 0.269 Example 10.0558 16.*(t+u<1).0180 17.2700 1.PW]') 2.19 Let W ={ X X +Y 2 2 for for X +Y ≥1 Determine the distribution for W (10.0000 0.2400 4.0000 0.0000 0.PW] = csort(H.^2).0000 0.^2 + u.0132 ddbn Enter row matrix of values W Enter row matrix of probabilities print % Distribution for W = h(X.0774 8.*(t+u>=1) + (t. % Specification of h(t.Y) % Plot of distribution function PW % See Figure~10. disp([W.0372 32.0000 0.0000 0.0270 5.0516 20.1900 3.P).20: Continuation of Example 10.42) X +Y <1 H = t.0000 0.0000 0.0198 0 0.
we may have (X. Y ). Each such pair of values (i. As special Z=X Z or W =Y. Joint distributions for two functions of It is often desirable to have the cases. use the MATLAB function nd to determine the positions a for which (10. Z = zj ). we use csort to obtain the joint marginal distribution for a single function Z = g (X.4: Distribution for random variable W in Example 10. Z = zj ). value matrices and the 2. Use csort to determine the Z and W G = [g (t.270 CHAPTER 10. Y ). j) the probability accomplished as follows: 1. zj ). W ) P (W = wi .43) The joint distribution requires the probability of each pair. Use array arithmetic to determine the matrices of values 3.44) (H == W (i)) & (G == Z (j)) (i.45) . If we select and add the probabilities corresponding to the latter pairs. has values [z1 z2 · · · zc ] and and W has values [w1 w2 · · · wr ] (10. P Z and P W marginal probability matrices. To determine the joint probability matrix arranged as on the plane.20 (Continuation of Example 10. 4. corresponds to a set of pairs of P ZW for (Z. FUNCTIONS OF RANDOM VARIABLES Figure 10. Z) values corresponds to one or more pairs of (Y. u)]. Each pair of (W. W ) 5. For each pair (wi . j) = total (P (a)) (10. This may be values. X) values. u)] and H = [h (t. with values of W increasing upward. Z = zj ). Y ) and distribution for a pair Suppose W = h (X. Z = g (X. Assign to the position in the joint probability matrix for the probability PZW (i. Y ) In previous treatments. j) P ZW (Z.19). we assign to each position X Y P (W = wi . Set up calculation matrices t and u as with jcalc. we have P (W = wi .
PZ] = csort(G.030 0.^2 % Matrix of values for W = h(X.P) Z = 0 1 2 PZ = 0. (2. We put this probability in the joint probability matrix P ZW at the W = 4.048 0. [i.009.061 0..040 0.. we need to determine the (t.PW] = csort(H.112.058 0.2670 0. u) positions for which this pair of (W. Y = 2:3. which are then implemented in the mprocedure jointzw.004 0.052 0. 0.032 0.013. Z) values is taken on.060 0.1490 0.P) W = 0 1 4 PW = 0.1870 % Determination of marginal for Z 0.m P = [0. Z = 1 position. 0.060 0.Y) G = 2 1 0 1 2 1 0 1 2 1 0 1 2 1 0 1 2 1 0 1 2 1 0 1 [W.3530 [Z.H = u.3720 r = W(3) r = 4 s = Z(2) s = 1 To determine 9 2 2 2 2 2 2 % Determination of marginal for W 0.m jcalc % Used to set up calculation matrices t.013].3110 0.058 + 0..012 0.001 0. 0. and (6. 0. (6. By inspection.001 + 0.023 0.Y) H = 9 9 9 9 9 4 4 4 4 4 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1 4 4 4 4 4 G = abs(t) % Matrix of values for Z = g(X. Z = 1) is the total probability at these positions.. Z = 1). 0.013. Example 10. % Location of (t... X = 2:2.027 0.015 0.061 0.018.j] = find((H==W(3))&(G==Z(2))).4)..4).21: Illustration of the basic joint calculations % file jdemo7.053 0. we nd these to be (2.026 0.039.040 0.052 + 0.271 We rst examine the basic calculations. jdemo7 % Call for data in jdemo7..2). This may be achieved by MATLAB with the following operations.058 0.u) positions disp([i j]) % Optional display of positions .3610 % Third value for W % Second value for Z P (W = 4.001 0. u .001 = 0.2).054 0. Then P (W = 4. This is 0.050 0.029 0.
w. 0.u): abs(abs(t)u) Enter expression for h(t.0558 0. Example 10. Y ) = XY  % file jdemo3. P = [0.2) = total(P(a)) PZW = 0 0 0 0 0 0 0 0.m data for joint simple X = [4 2 0 1 3].22: Joint distribution for Z = g (X. v.0270 0. Y ) = X − Y  and W = h (X. 0.0132 0.0010 0 % Initialization of PZW matrix % Assignment to PZW matrix with % W increasing downward % Assignment with W increasing upward 0 0 0 0 carries out this operation for each possible pair of flipud jointzw W and Z values (with operation coming only after all individual assignments are made).0264.0285 jdemo3 % Call for data jointzw % Call for mprogram Enter joint prob for (X.0297 0.0372 0.0570 0 distribution 0.Y): P Enter values for X: X Enter values for Y: Y Enter expression for g(t. PW.0774 0.0837 0.0589 0. 0.1161 0.272 CHAPTER 10. P0(a) = P(a) P0 = 0 0 0 0 0.0132 0 0 0 0 0. Y = [0 1 2 4].0209 0.0580 0 0 0 0 0 0 0 0.0360]. disp(PZW) 0.0520 0 PZW = zeros(length(W).0405 0.length(Z)) PZW(3. W.0198 0.*u) Use array operations on Z.0516 0. PZ.0180 0.1120 0 0 0 0 PZW = flipud(PZW) PZW = 0 0 0 0.0744. PZW 0 0 0 .0264 0 0 0 0 0.1120 0 0 0 0 The procedure the % Location in more convenient form % Setup of zero matrix % Display of designated probabilities in P 0 0 0.1032. P0 = zeros(size(P)). FUNCTIONS OF RANDOM VARIABLES 2 2 6 2 2 4 6 4 a = find((H==W(3))&(G==Z(2))).u): abs(t.0010 0 0 0 0 0 0 0 0 0 0 0 0.0817 0.
3390 At noted in the previous section.273 0 0.Y} is independent jointzw Enter joint prob for (X.0817 0 0.4398 0 0 0. PZW itest Enter matrix of joint probabilities PZW .u): t. W. PZ.0744 0. However. then the pair {Z.u): t+u % Z = g(X.0477 ez = Z*PZ' % Alternate. using marginal dbn ew = 2.0725 0 0 0 0. then in general the pair {Z. v. PZ. % P(Z>W) PM = total(M. w. W = h (Y ).1107 0 0.*u % W = h(X.4398 EW = total(w.Y) Use array operations on Z.1446 EZ = total(v. if Z = g (X. W } is not independent.h(Y)} is independent jdemo3 % Refresh data jointzw Enter joint prob for (X.*PZW) PM = 0.0360 0 0 0 0 0 0.*PZW) EZ = 1.u): t.6075 M = v > w.Y} is independent % The pair {g(X). Y ) and W = h (X. if {X.1032 0 0 0. PZW itest Enter matrix of joint probabilities PZW The pair {X. Y } is an independent pair and Z = g (X).u): abs(u) + 3 % W = h(Y) Use array operations on Z. w. W. W } is independent. PW.*PZW) EW = 2.0558 0 0 0 0 0. using marginal dbn ez = 1.Y) Enter expression for h(t.^2 .Y): P Enter values for X: X Enter values for Y: Y Enter expression for g(t. v. We may illustrate of the mprocedure jointzw this with the aid Example 10.23: Functions of independent random variables jdemo3 itest Enter matrix of joint probabilities P The pair {X. PW.1363 0.3*t % Z = g(X) Enter expression for h(t.6075 ew = W*PW' % Alternate.0405 0. Y ).Y): P Enter values for X: X Enter values for Y: Y Enter expression for g(t.Y} is independent % The pair {X.
Y ) ∈ Qv ).Y)} is not indep To see where the product rule fails.5: Region Qv for product XY . In this section.274 CHAPTER 10. u ≤ v/t} {(t. we may set up a simple approximation to the joint distribution and proceed as for simple random variables.24: Distribution for a product Suppose the pair {X. v ≥ 0.Y). Example 10. Let Z = XY . Determine Qv such that P (Z ≤ v) = P ((X.46) . u) : t < 0. we solve several examples analytically.Y} is NOT independent % The pair {g(X. then obtain simple approximations. FUNCTIONS OF RANDOM VARIABLES The pair {X. Figure 10.5) Qv = {(t. u) : t > 0.2. Y } has joint density fXY . SOLUTION (see Figure 10.h(X. u ≥ v/t}} (10. call for D % Fails for all pairs 10.3 Absolutely continuous case: analysis and approximation As in the analysis Joint Distributions. u) : tu ≤ v} = {(t.
8465 % Theoretical value 0. it must be expressed in terms of t.275 Figure 10.6) 1 dudt where Qv = {(t.6: Product of X. Y. tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (u>=0)&(t>=0) Use array operations on X. P (XY ≤ v) = Qv Integration shows Then (see Figure 10.47) FZ (v) = P (XY ≤ v) = v (1 − ln (v)) For so that fZ (v) = −ln (v) = ln (1/v) . 0 ≤ u ≤ min{1. u.5) = 0. Y with uniform joint distribution on the unit square. Example 10. v/t}} (10. PY.25 {X. PX.5)*PZ' p = 0.5. Y } ∼ uniform on unit square fXY (t. u) = 1.*u. above . u. u) : 0 ≤ t ≤ 1. 0 ≤ t ≤ 1. FZ (0. t. [Z. 0 ≤ u ≤ 1. 0 < v ≤ 1 (10. p = (Z<=0.8466.48) v = 0.P). and P G = t.PZ] = csort(G.8466. % Note that although f = 1.
u) = 37 (t + 2u) on the region bounded u = 0. PQ = total(Q. PX.4865.*u<=1.26: Continuation of Example 5 (Example 8.*(u<=max(t. above . Y ) ∈ Q) = 18/37 ≈ 0.4853 % Theoretical value 0.26 (Continuation of Example 5 (Example 8. Figure ANALYTIC SOLUTION P (Z ≤ 1) = P ((X. t.4865 6 37 1 0 1 0 6 (t + 2u) dudt + 37 2 1 1/t 0 (t + 2u) dudt = 9/37 + 9/37 = (10. Y } has joint density fXY (t.7: Area of integration for Example 10.50) APPROXIMATE SOLUTION tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 300 Enter number of Y approximation points 300 Enter expression for joint density (6/37)*(t + 2*u). FUNCTIONS OF RANDOM VARIABLES Example 10. u) : u ≤ 1/t} (10. Y ) ∈ Q) Reference to Figure 10. t} (see Figure 7). Let Z = XY . PY. and P Q = t. t = 2. Y. by t = 0. Determine P (Z ≤ 1).1)) Use array operations on X.*P) PQ = 0.49) P ((X.7 shows that where Q = {(t. 10.5: Marginals for a discrete distribution) from "Random Vectors and Joint Distributions" The pair 6 {X. and u = max{1.5: Marginals for a discrete distribution) from "Random Vectors and Joint Distributions").276 CHAPTER 10. u.
Then and QA = {(t.*u. Reference to 2 Figure 10. Y ) ∈ QA where QB (10. u > t2 }.52) QB = {(t.27: A compound function The pair {X. u ≤ t2 } P (Z ≤ 1/2) 2 3 1 t2 1/2 0 = 2 3 t1 0 1/2−t 0 (t + 2u) dudt + 2 3 t2 t1 t2 0 (t + 2u) dudt + (10. 1/2 .51) Q = {(t. Y ) Y + IQc (X. Y ≤ X 2 + P X + Y ≤ 1/2.5).PZ] = csort(G.27 (A compound function). % Alternate. Y } has joint density fXY (t. PZ1 = (Z<=1)*PZ' PZ1 = 0.8: Regions for P (Z ≤ 1/2) in Example 10. Example 10. the function g has a compound denition. ANALYTICAL SOLUTION P (Z ≤ 1/2) = P Y ≤ 1/2. Y ) (X + Y ) (10. 2 2 We may break up the integral into three parts. That is. Z={ for Y X +Y for for = IQ (X. using the distribution for Z In the following example. it has a dierent rule for Figure 10. u) : u ≤ 1/2.4853 dierent parts of the plane. u) : t + u ≤ 1/2. Let 1/2 − t1 = t1 and t2 = 1/2. u) = X2 − Y ≥ 0 X2 − Y < 0 2 3 (t + 2u) on the unit square 0 ≤ t ≤ 1.2322 APPROXIMATE SOLUTION .277 G = t.8 shows that this is the part of the unit square for which u ≤ min max 1/2 − t. [Z. t . Determine P (Z < = 0.53) (t + 2u) dudt = 0. Y > X 2 = P (X.P). u) : u ≤ t2 }. 0 ≤ u ≤ 1.
if u is a probability value. an and associated Yi independent class {Yi : 1 ≤ i ≤ n} may be constructed. Y. We consider a general denition which applies to any probability distribution function.28: The Weibull distribution 2 (3. Quantile functions for simple random variables may be used to obtain an important Poisson approximation theorem (which we do not develop in this work).3 The Quantile Function3 F is a probability distribution function. then the Q (u) = inf {t : F (t) ≥ u} 3 This ∀ u ∈ (0.*Q + (t + u).*P) prob = 0. u. PX. with having the same (marginal) distribution.1 The Quantile Function probability. t = Q (u) is the value of t for which of The quantile function is used to derive a number of useful special forms for mathematical General conceptproperties. the associated quantile function Q is essentially an inverse F.2322. FUNCTIONS OF RANDOM VARIABLES tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (2/3)*(t + 2*u) Use array operations on X. 1) (10. On the other hand. the evaluation is elementary but tedious. For F continuous and strictly increasing at t. 1). and examples P (X ≤ t) = u. Thus.6/>. If 10. Once set up. calculates for the normal distribution. 0) −ln (1 − u) /3 (10. The quantile function is dened on the unit interval (0.*(1Q). This fact serves as the basis of a method of simulating the sampling from an arbitrary distribution with the aid of a nite class random number generator.org/content/m23385/1. G = u. Denition: If F is a function quantile function for F is given by having the properties of a probability distribution function.29: The Normal Distribution The mfunction values of Q norminv. then Q (u) = t i F (t) = u.54) u = F (t) = 1 − e−3t t ≥ 0 ⇒ t = Q (u) = Example 10.3. the approximation proceeds in a straightforward manner from the normal description of the problem. Example 10.2328 % Theoretical is 0. Also. If F is a probability distribution function.278 CHAPTER 10. above The setup of the integrals involves careful attention to the geometry of the system. given any {Xi : 1 ≤ i ≤ n} each Xi of random variables. PY.^2. The numerical result compares quite closely with the theoretical value and accuracy could be improved by taking more subdivision points. expectation. based on the MATLAB function ernv (inverse error function). prob = total((G<=1/2). 2. the quantile function may be used to construct a The quantile function for a probability distribution has many uses in both the theory and application of random variable having F as its distributions function. . t.55) content is available online at <http://cnx. and P Q = u <= t. The restriction to the continuous case is not essential. 10.
F (t∗ ) < u∗. then X = Q (U ) FX (t) = P [Q (U ) ≤ t] = P [U ≤ F (t)] = F (t). There is always an independent class {Ui : 1 ≤ i ≤ n} iid uniform (0. Then the random variables Yi = Qi (Ui ) . 1 ≤ i ≤ n. Property (Q2) implies that if variable FX = F . with quantile function uniformly distributed on (0.30: Independent classes with prescribed distributions {Xi : 1 ≤ i ≤ n} is an arbitrary class of random variables with corresponding distribution {Fi : 1 ≤ i ≤ n}.279 We note If If F (t∗ ) ≥ u∗. Let {Qi : 1 ≤ i ≤ n} be the respective quantile functions. . form an independent class with the same marginals as the Xi . has distribution function Q. has distribution function Hence. 1). 1). note that X = Q (U ). 1). Example 10. ≤t then then t∗ ≥ inf {t : F (t) ≥ u∗ } = Q (u∗ ) t∗ < inf {t : F (t) ≥ u∗ } = Q (u∗ ) ∀ u ∈ (0. To see this. with U F is any distribution function. then the random F. Suppose functions Several other important properties of the quantile function may be established. 1) (marginals for the joint uniform distribution on the unit hypercube with sides (0. 1)). we have the important property: (Q1)Q (u) i u ≤ F (t) The property (Q1) implies the following important property: (Q2)If U ∼ uniform (0.
then Rotate the resulting gure 90 degrees counterclockwise .280 CHAPTER 10. 2. If jumps are represented by vertical line segments. 1. Q is leftcontinuous. whereas F is rightcontinuous.9: Graph of quantile function from graph of distribution function. FUNCTIONS OF RANDOM VARIABLES Figure 10. construction of the graph of obtained by the following two step procedure: u = Q (t) may be • • Invert the entire gure (including axes).
281 This is illustrated in Figure 10.2 0. bi ]. 1 ≤ i ≤ n. It is reected in the horizontal axis then Q (u) versus u. . This may be stated F (ti ) = bi . 3. then jumps go into at segments and at segments go into vertical segments. where b0 = 0 If and bi = pj . If X is discrete with probability pi at ti .1 0.9. If jumps are represented by vertical line segments.57) Figure 1 shows a plot of the distribution function FX .31: Quantile function for a simple random variable Suppose simple random variable X has distribution X = [−2 0 1 3] rotated counterclockwise to give the graph of P X = [0. i j=1 then F has jumps in the amount pi at each is constant between. interval The quantile function is a leftcontinuous step function having value ti ti and on the (bi−1 .3 0.4] (10.56) Example 10. thenQ (u) = ti forF (ti−1 ) < u ≤ F (ti ) (10.
We use the analytic characterization above in developing a number of mfunctions and mprocedures. which is used as an element of several simulation procedures. The basis for quantile function calculations for a simple random variable is the formula above. mprocedures for a simple random variable implemented in the mfunction dquant. .3 5.1 9.8].4 7. FUNCTIONS OF RANDOM VARIABLES Figure 10.282 CHAPTER 10. To plot the quantile function.3 1. This is Example 10. we use dquanplot which employs the stairs function and plots X vs the distribution function F X .1 3.32: Simple Random Variable X = [2. The procedure dsample employs dquant to obtain a sample from a population with simple distribution and to calculate relative frequencies of the various values.31 (Quantile function for a simple random variable).10: Distribution and quantile functions for Example 10.
1875 7.4000 0.32 (Simple Random Variable). It calls for the population distribution and then for the designation of a target set of possible values. or one of a set of values.1300 0.3000 0.3000 0.11: Quantile function for Example 10. Sometimes it is desirable to know how many trials are required to reach a certain value.0) % Reset random number generator for reference dsample Enter row matrix of values X Enter row matrix of probabilities PX Sample size n 10000 Value Prob Rel freq 2.2300 0. A pair of mprocedures are available for simulation of that problem.1800 0. The .305 Sample variance = 16.01*[18 15 23 19 13 12]. The rst is called targetset.325 Population mean E[X] = 3.1500 0.283 PX = 0.1466 3.1000 0.32 Population variance Var[X] = 16.1200 0.1805 1.1000 0.11 for plot of results rand('seed'.8000 0.1900 0.33 Figure 10.1201 Sample average ex = 3.2320 5.1333 9. dquanplot Enter VALUES for X X Enter PROBABILITIES for X PX % See Figure~10.
of members of the target set to be reached.4 shows the fraction of runs requiring t steps or less . E = [1.3000 0. call for D.5000 7.7 5.089 The minimum completion time is 2 The maximum completion time is 30 To view a detailed count. calls for the number of repetitions of the experiment.2 3.3 0.1]. The first column shows the various completion times. targetset Enter population VALUES X Enter population PROBABILITIES The set of population values is 1.7000 Enter the number of target values to visit 2 The average completion time is 6. PX = [0.5 7. the second column shows the numbers of trials yielding those times % Figure 10.3000 3.3 0.0) % Seed set for possible comparison targetrun Enter the number of repetitions 1000 The target set is 1.3 3.284 CHAPTER 10.7].1 0.6.7000 Enter the set of target values Call for targetrun % Population values % Population probabilities % Set of target states PX E 5. various statistics on the runs are second procedure.33 X = [1.2000 3. calculated and displayed. Example 10. FUNCTIONS OF RANDOM VARIABLES targetrun.3].2 0. and asks for the number After the runs are made.3 0.32 The standard deviation is 4.3000 rand('seed'.
6 + 0.*(t >= 0)'. these are not displayed as in the discrete case. A procedure The mprocedure Since there are so many possible values. mprocedures for distribution functions Example 10.285 Figure 10. except the ordinary plot function is used in the continuous case whereas the plotting function stairs is used in the discrete case. quanplot is used to plot the quantile function. dfsetup utilizes the distribution function to set up an approximate simple distribution.4*t). F = '0. The mprocedure qsample is used to obtain a sample from the population.34: Quantile function associated with a distribution function. dfsetup Distribution function F is entered as a string variable. either defined previously or upon call Enter matrix [a b] of Xrange endpoints [1 1] Enter number of X approximation points 1000 Enter distribution function F as function of t F Distribution is in row matrices X and PX quanplot Enter row matrix of values X Enter row matrix of probabilities PX % String .*(t < 0) + (0. This procedure is essentially the same as dquanplot.4*(t + 1).12: Fraction of runs requiring t steps or less.
34 (Quantile function associated with a distribution mprocedures for density functions An m.0004002 % Theoretical = 0 Sample variance vx = 0.13 for plot qsample Enter row matrix of X values X Enter row matrix of X probabilities PX Sample size n 1000 Sample average ex = 0.004146 Approximate population mean E(X) = 0. except that the density function is entered as a string variable.01 % See Figure~10. This is essentially the Then the same as the procedure tuappr.*(t<1) + (1. Example 10. FUNCTIONS OF RANDOM VARIABLES Probability increment h 0.t/3).13: function.01 % See Figure~10.procedure acsetup is used to obtain the simple approximate distribution.14 for plot .*(1<=t)' Distribution is in row matrices X and PX quanplot Enter row matrix of values X Enter row matrix of probabilities PX Probability increment h 0. either defined previously or upon call.286 CHAPTER 10. acsetup Density f is entered as a string variable.). Quantile function for Example 10.^2).2664 Figure 10.25 Approximate population variance V(X) = 0.35: Quantile function associated with a density function. Enter matrix [a b] of xrange endpoints [0 3] Enter number of x approximation points 1000 Enter density as a function of t '(t. procedures quanplot and qsample are used as in the case of distribution functions.
C > 0.4/>.). 294.4 Problems on Functions of Random Variables4 Exercise 10.org/content/m24315/1.14: tion.58) content is available online at <http://cnx.361 % Theoretical = 49/36 = 1. Use properties of the exponential and natural log function to show that FZ (v) = 1 − FX 4 This − ln (v/C) a for 0<v≤C (10.352 Approximate population mean E(X) = 1.1 Suppose where X (Solution on p.3474 Figure 10. 0 < Z ≤ C.35 (Quantile function associated with a density func 10.) Let is a nonnegative.3474 % Theoretical = 0. a > 0. .0) qsample Enter row matrix of values X Enter row matrix of probabilities PX Sample size n 1000 Sample average ex = 1. Quantile function for Example 10.287 rand('seed'.3622 Sample variance vx = 0.3242 Approximate population variance V(X) = 0. absolutely continuous random variable. Then Z = g (X) = Ce−aX .
295. g (t) = IM (t) [(p − r) t + (r − c) m] + IM (t) [(p − s) t + (s − c) m] = (p − s) t + (s − c) m + IM (t) (s − r) (t − m) Suppose (10. $16 each 3150. If the cost of replacement at failure is C dollars. extra units may be ordered at a cost of s each. Units are sold at a price p per unit. Experience indicates demand can be represented by a random variable D ∼ Poisson season. $18 each 2130. m]. Then one dollar in hand now.61) µ = 50 m = 50 c = 30 p = 50 r = 20 s = 40. Approximate the Poisson random variable D by truncating at 100. Suppose λ = 1/10.2 to determine the probability b. $15 each 51100. $13 each If the number of purchasers is a random variable X. eax at the end of x a.2 Use the result of Exercise 10.62) Z = g (X) described by g (X) = 200 + 18IM 1 (X) (X − 10) + (16 − 18) IM 2 (X) (X − 20) + .) X∼ λ/a exponential (λ). The agreement is that the organization will purchase ten tickets at $20 each (regardless of the number of individual buyers). then FZ (v) = Exercise 10.288 CHAPTER 10. Truncate X at 1000 and use 10. 200.) Price breaks) from "Functions of a Random Variable") The cultural committee of a student organization has arranged a special deal for tickets to a concert. If Z = g (D) • • Let For For is the gain from the sales. has a value spent x years in the future has a present value e−ax . If demand exceeds the number originally ordered. 294.2: (Solution on p. Use the result of Exercise 10. then t ≤ m.. Exercise 10. they may be returned with recovery of r (µ). Determine P (500 ≤ Z ≤ 1100). Additional tickets are available according to the following schedule: • • • • 1120. Use a discrete approximation for the exponential density to approximate the probabilities in part (a). then −aX is Z = Ce . Suppose money may be invested at an annual rate continually.1 to show that if (Solution on p.) Present value of future costs. one dollar to failure (in years) X ∼ exponential the present value of the replacement (λ). 294.000 approximation points. a = 0. Exercise 10. 500. Suppose a device put into operation has time a. compounded years. 294.5 (See Example 2 (Example 10. g (t) = (p − c) t − (c − r) (m − t) = (p − r) t + (r − c) m t > m. g (t) = (p − c) m + (t − m) (p − s) = (p − s) t + (s − c) m Then M = (−∞.60) (10.4 Optimal stocking of merchandise.59) (Solution on p. If units remain in stock at the end of the per unit. the total cost (in dollars) is a random quantity (10.07.) A merchant is planning for the Christmas season. Z ≤ 700. and C = $1000. m units of a certain item at a cost of c He intends per unit. Hence.3 v C 0<v≤C (10. FUNCTIONS OF RANDOM VARIABLES Exercise 10. to stock (Solution on p.
0304 0.0493 0.0341 0.7) from "Problems on Random Vectors and Joint Distributions".7 1.0308 P ({X + Y ≥ 5} ∪ {Y ≤ 2}).0398 0.6 5.) Exercise 10.0399 0.0323 0. Y } has the joint distribution (in mle npr08_07.0242 0. M 3 = [30.63) M 1 = [10.1] 0. Y } ≤ 4) .m (Section 17.0399 0. and P (900 ≤ Z ≤ 1400).0209 (10.69) . and Exercise 1 (Exercise 9.6) from "Problems on Random Vectors and Joint Distributions".1 5.0528 0.3) from "Problems on Independent Classes of Random Variables") The pair {X.0420 0.0342 0. P (Z ≥ 1300). ∞) .1] 0.0589 0.0609 0.0682 0.0504 0. 295.3 − 0.m (Section 17.38: npr08_07)): P (X = t.0448 Y = [−2 1 2.0380 0.9 5. P X 2 + Y 2 ≤ 10 Exercise 10.1 3. 296. 295.m (Section 17.8.0456 0.8 (Solution on p.0744 0.3 2. P (Z ≥ 1000) .0868 Determine 0.5 2 8 4.5 4.1] 0.65) 0. Y } has the joint distribution (in mle npr09_02.3] 0.0361 0.0667 Determine Determine 0.41: npr09_02)): X = [−3.1) from "Problems on Independent Classes of Random Variables")) The pair {X. Y = u) (10.8. and Exercise 3 (Exercise 9.0556 0. P (Z < 0) and P (−5 < Z ≤ 300).0620 0.2) from "Problems on Independent Classes of Random Variables") The {X. ∞) (10. P (X − Y  > 3).8.37: npr08_06)): X = [−2.0589 (10. ∞) .64) Determine Suppose X ∼ Poisson (75).6 (Solution on p.0651 0.0961 P = 0.68) 0.289 (15 − 16) IM 3 (X) (X − 30) + (13 − 15) IM 4 (X) (X − 50) where (10. M 2 = [20.0713 0. Exercise 10. Y } has the joint distribution (in mle npr08_06.0350 0.) (See Exercise 6 (Exercise 8.0527 0.0580 Let Y = [1.0441 (10. ∞) .) (See Exercise 7 (Exercise 8.0498 0.7 pair (See Exercise 2 (Exercise 9.7 1. Approximate the Poisson distribution by truncating at 150.0672 . (Solution on p.0437 P = 0.67) 0. M 4 = [50.0483 0.66) P (max{X.0551 0.9 − 1. Z = 3X 3 + 3X 2 Y − Y 3 .0357 0. (10.
0510 0.0405 0.1089 0.0132 3.4 0. Z = g (X. let {X. For the distributions in Exercises 1015 below a.8. Y ) = 3X 2 + 2XY − Y 2 .11 (Solution on p.0495 0. 296.0090 0.290 CHAPTER 10.) in Exercise 10. Use a discrete approximation to calculate the same probablities.2 0.1320 0.1] (X) 4X + I(1. Determine analytically the indicated probabilities. 0 ≤ u ≤ 1 + t (see Exercise 15 (Exercise 8.2] (X) (X + Y ) Determine (10.2 Determine P X 2 − 3X ≤ 0 . 296. let {X.71) P (Z ≤ 2) .0203 0.10 For the pair (Solution on p. Y ) = { X 2Y for for X +Y ≤4 X +Y >4 = IM (X.0396 0 0.8.0363 0.7 0. Exercise 10.0726 2.0484 1. Y ) 2Y (10. Y } W = g (X.5 4. b. Determine and plot Exercise 10.70) Determine and plot the distribution function for W.0189 0.8 3.0297 0 4.0528 0.0216 0.0594 0.9 For the pair (Solution on p.0324 0.0077 Table 10.9 0.0 3. 296.1 0.0440 0.15) from "Problems on Random Vectors and Joint Distributions"). FUNCTIONS OF RANDOM VARIABLES t = u = 7. Z = I[0. Y } the distribution function for Z. Y ) X + IM c (X.5 0.1 2.' Exercise 10.) in Exercise 10. P X 3 − 3Y  < 3 . u) = 3 88 2t + 3u2 for 0 ≤ t ≤ 2.) fXY (t.0891 0.0231 0.
3 23 (10. Y ) (X + Y ) + IM c (X. Y ) 2Y. u) = 24 11 tu 0 ≤ t ≤ 2. Determine 1 Z = IM (X.73) P (Z ≤ 1).291 Figure 10. u) : max (t. M = {(t. Z = IM (X. 0 ≤ u ≤ min{1.17) from "Problems on Random Vectors and Joint Distributions").72) Exercise 10. 297. t} (see Exercise 18 (Exercise 8. u) ≤ 1} Determine (10.13 (Solution on p. Y ) X + IM c (X. . u) : u > t} 2 P (Z ≤ 1/4). 297.12 (Solution on p. u) = (t + 2u) 0 ≤ t ≤ 2. 0 ≤ u ≤ max{2 − t. M = {(t.18) from "Problems on Random Vectors and Joint Distributions"). 2 − t} (see Exercise 17 (Exercise 8. Y ) Y 2 .) for fXY (t.15 Exercise 10.) for fXY (t.
20) from "Problems on Random Variables and Joint Distributions") Z = IM (X. 298.) fXY (t. u) = 12 179 3t2 + u . Y ) X + IM c (X.15 fXY (t. Y ) (X + Y ) + IM c (X. 2} Y . u) : t ≤ 1. 0 ≤ u ≤ min{1 + t.74) P (Z ≤ 2). Y ) Detemine M = {(t. X (see Exercise 20 (Exercise 8. u) = (3t + 2tu). (Solution on p. u) : u ≤ min (1. u ≥ 1} (10. . Determine M = {(t. for 0 ≤ t ≤ 2. 298. Y ) 2Y 2 . for 0 ≤ t ≤ 2.75) P (Z ≤ 1). 3 − t} (see Exercise 19 (Exercise 8. 2 − t)} (10.) 12 227 Exercise 10. Z = IM (X.19) from "Problems on Random Vectors and Joint Distributions").16 Exercise 10.14 (Solution on p.292 CHAPTER 10. 0 ≤ u ≤ min{2. FUNCTIONS OF RANDOM VARIABLES Figure 10.
Z} is independent.12 0.375 0. Plot the distribution function FX and the quantile function QX .32 P (E) = 0.76) {D.5 1.17 Exercise 10. Take a random sample of size n = 10.16 The class (Solution on p.108 0.293 Figure 10.018 Y = ID + 3IE + IF − 3. X = −2IA + IB + 3IC . .8 0. 000.045 0.) has distribution X = [−3. E.1 − 0.3 0.77) Z has distribution Value Probability 1.40 (10.4 0.22 0. 299. F } is independent with P (D) = 0.17 The simple random variable X (Solution on p.3 Determine P X 2 + 3XY 2 > 3Z . Exercise 10. Compare the relative frequency for each value with the probability that value is taken on.12 1. 299.7 4. Y.11 0.4 3. Minterm 0.2 0.9] P X = [0.025 0.33 0.78) a.08 Table 10.255 0.24 2. b.07] (10.7 0.162 0. The class (10.) probabilities are (in the usual order) {X.15 0.13 5.012 0.2 2.56 P (F ) = 0.43 3.
*(D <= m).r)*(D .2 (p. 287) Z = Ce−aX ≤ v i e−aX ≤ v/C i −aX ≤ ln (v/C) i X ≥ −ln (v/C) /a. D = 0:100. s = 40.^(10/7) P = 0. r = 20.80) P (Z ≤ v) = v 1000 10/7 (10. m = 50.3716 PM3 = (G<=200)*PX' PM3 = 0.s)*D + (s .9209 [Z.81) v = [700 500 200].6008 0. pm = m*PZ' pm = 0. P = (v/1000).9209 % Alternate: use dbn for Z .1003 tappr Enter matrix [a b] of xrange endpoints [0 1000] Enter number of x approximation points 10000 Enter density as a function of t 0. PM = M*PD' PM = 0. p = 50. FUNCTIONS OF RANDOM VARIABLES Solutions to Exercises in Chapter 10 Solution to Exercise 10.79) FZ (v) = 1 − 1 − exp − Solution to Exercise 10.6005 PM2 = (G<=500)*PX' PM2 = 0.1003 Solution to Exercise 10. 287) − ln (v/C) a (10.294 CHAPTER 10.3715 0. 288) mu = 50.m). 288) λ · ln (v/C) a = v C λ/a (10.D).4 (p.PZ] = csort(G. so that FZ (v) = P (Z ≤ v) = P (X ≥ −ln (v/C) /a) = 1 − FX Solution to Exercise 10.3 (p. m = (500<=Z)&(Z<=1100). G = (p .c)*m +(s .07*t).1 (p. PD = ipoisson(mu. PM1 = (G<=700)*PX' PM1 = 0.1*exp(t/10) Use row matrices X and PX as in the simple case G = 1000*exp(0.PD). M = (500<=G)&(G<=1100). c = 30.
t.6 (p.P).295 Solution to Exercise 10. 289) npr08_06 (Section~17. Y.3713 Solution to Exercise 10.*P) P4 = 0.50). and P M1 = (t+u>=5)(u<=2).. P1 = total(M1.41: npr09_02) Data are in X. and P P1 = total((max(t. P3 = total((G<0).*(X>=50).3713 [Z.18)*(X .PZ] = csort(G.16)*(X.PX).*P) P1 = 0.5 (p. (15 . Y.4860 P2 = total((abs(tu)>3).^3..15)*(X . 288) X = 0:150.20). PX. G = 200 + 18*(X .10). % Alternate: use dbn for Z p4 = ((5<Z)&(Z<=300))*PZ' p4 = 0. PX = ipoisson(75. u. u.8. P jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.8.PZ] = csort(G.*(X>=30) + (13 .^2. Y.*P) P2 = 0. t. PY.*(X>=10) + (16 .37: npr08_06) Data are in X.*P) P1 = 0. PY.7 (p.*(X>=20) + .*P) P3 = 0.9288 Solution to Exercise 10. PX.X).^2 + u. Y.^3 + 3*t.5420 P4 = total(((5<G)&(G<=300)).u)<=4).9742 [Z.7054 M2 = t.4516 G = 3*t. 289) npr09_02 (Section~17.*u .1142 P3 = ((900<=G)&(G<=1400))*PX' P3 = 0. P jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.u.30). % Alternate: use dbn for Z p1 = (Z>=1000)*PZ' p1 = 0. .9288 P2 = (G>=1300)*PX' P2 = 0.^2 <= 10. P1 = (G>=1000)*PX' P1 = 0.
85) tuappr Enter matrix [a b] of Xrange endpoints Enter matrix [c d] of Yrange endpoints [0 2] [0 3] .^2 .38: npr08_07) Data are in X. [W. t.^3 .82) M 2 = {(t.PZ] = csort(G.3*t <=0.8 (p. 290) G = 3*t.11 (p. % Determine g(X. u) : 0 ≤ t ≤ 1/2}.P).PW] = csort(H.u.P).*P) P2 = 0. PY.3*abs(u) < 3.296 CHAPTER 10.83) (10. 290) H = t. Y. P jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.84) P = 2t + 3u2 dudt + 3 88 2 1 0 2−t 2t + 3u2 dudt = 563 5632 (10.8.7876 Solution to Exercise 10. ddbn Enter row matrix of VALUES W Enter row matrix of PROBABILITIES PW Solution to Exercise 10. 0 ≤ u ≤ 1 + t} Q1 = {(t.Y) ddbn % Call for plotting mprocedure Enter row matrix of VALUES Z Enter row matrix of PROBABILITIES PZ % Plot not reproduced here Solution to Exercise 10. Q2 = {(t.*(t+u<=4) + 2*u. Y. FUNCTIONS OF RANDOM VARIABLES P2 = total(M2. P1 = total(M1.*P) P1 = 0. PX.*u .^2. and P M1 = t.Y) [Z.10 (p. P2 = total(M2. 0 ≤ u ≤ 1 + t} (10.3282 Solution to Exercise 10. 290) % Plot not reproduced here P (Z ≤ 2) = P Z ∈ Q = Q1M 1 Q2M 2 . 289) npr08_07 (Section~17.4500 M2 = t. where M 1 = {(t. u) : 1 < t ≤ 2.*P) P2 = 0.^2 + 2*t. u.9 (p. u) : 0 ≤ t ≤ 1.*(t+u>4). u) : u ≤ 2 − t} 3 88 1/2 0 0 1+t (see gure) (10. % Obtain dbn for Z = g(X.
89) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density (24/11)*t. Y.92) P = (t + 2u) dudt + 3 23 2 1 0 1/2 (t + 2u) dudt = 9 46 (10. u) : u ≤ 1/2} 24 11 1/2 0 0 1 (see gure) (10. pp = (Z<=1/4)*PZ' pp = 0. 2 − t)} Q1 = {(t.86) M2 = {(t. PX. 0 ≤ u ≤ 1 − t} (10. and P G = 4*t. u) : 0 ≤ t ≤ u ≤ 1} (10. Y.12 (p.^2). u. PY.*(u<=min(1.^2. u) : 1 ≤ t ≤ 2.*(t>1).*(u>t) + u.*(t<=1) + (t+u).4844 % Theoretical = 85/176 = 0.*(u<t). u) : t ≤ 1/2} Q2 = {(t. t. PY. Y ) ∈ M1 Q1 M2 Q2 . 0 ≤ t ≤ min (t. PZ2 = (Z<=2)*PZ' PZ2 = 0.5*t.91) (10.PZ] = csort(G. and P G = 0. t. PX.P).13 (p. u) : u ≤ 1 − t} Q2 = {(t. [Z.297 Enter number of X approximation points 200 Enter number of Y approximation points 300 Enter expression for joint density (3/88)*(2*t + 3*u. [Z.88) P = tu dudt + 24 11 3/2 1/2 0 1/2 tu dudt + 24 11 2 3/2 0 2−t tu dudt = 85 176 (10. 291) P (Z ≤ 1) = P (X. M1 = {(t. u.*u.P). u) : 0 ≤ t ≤ 1.1000 Solution to Exercise 10. Y ) ∈ M1 Q1 M2 Q2 . 291) P (Z ≤ 1/4) = P (X. u) : u ≤ 1/2} 3 23 1 0 0 1−t (see gure) (10.93) tuappr Enter matrix Enter matrix Enter number Enter number [a [c of of b] of Xrange endpoints [0 2] d] of Yrange endpoints [0 2] X approximation points 300 Y approximation points 300 . M1 = {(t.90) M2 = {(t.4830 Solution to Exercise 10.PZ] = csort(G.87) (10. u) : 0 ≤ t ≤ 2.1010 % Theoretical = 563/5632 = 0. 0 ≤ u ≤ t} Q1 = {(t.2t)) Use array operations on X.*(u<=1+t) Use array operations on X.
*(t<=1) + (t>1).5463 .6648 Solution to Exercise 10. p = total((G<=2).*u.5478 % Theoretical = 124/227 = 0.94) M2 = {(t.98) Q1 = {(t. PX. PX.^2. u) : 1 ≤ t ≤ 2. u) : 0 ≤ t ≤ 1. t.*(u>=2t). and P M = (t<=1)&(u>=1).6662 % Theoretical = 119/179 = 0.*u.*u). u) : u ≤ t} P = 12 227 1 0 0 1 (10. u. PX. 292) P (Z ≤ 1) = P (X. and P Q = (u<=1).1957 Solution to Exercise 10. M1 = {(t. and P M = max(t. u) : 0 ≤ t ≤ 1.2)) Use array operations on X.*(t + u) + (1 . 0 ≤ u ≤ 3 − t} Q1 = {(t. G = M. Z = M.*(t + u) + (1 . G = M. Y. u) : u ≤ 1/2} P = 12 179 1 0 0 2−t (see gure) (10. u) : u ≤ 1 − t} Q2 = {(t.95) (10.M)*2.*P) p = 0. M2 = M c (see gure) (10. p = total((G<=1). PY. 1 ≤ u ≤ 2} (10. u) : 0 ≤ t ≤ 1} Q2 = {(t. t.M)*2. u.t)) Use array operations on X.^2.1960 % Theoretical = 9/46 = 0. Y ) ∈ M1 Q1 M2 M3 Q2 .*u.*(u<=t). Y.M)*2. 0 ≤ u ≤ 1} M3 = {(t.*(u<=max(2t. u.15 (p.14 (p.298 CHAPTER 10. P = total(Q.*(t + u) + (1 . 292) P (Z ≤ 2) = P (.99) (3t + 2tu) dudt + 12 227 2 1 t (3t + 2tu) dudt = 2−t 124 227 (10.3t)) Use array operations on X.*P) P = 0.*(u<=min(2. t. Y ) ∈ M1 Q1 M2 Q2 .^2 + u).*(u<=min(1+t.96) 3t2 + u dudt + 12 179 2 1 0 1 3t2 + u dudt = 119 179 (10. M1 = M. Y.100) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 400 Enter number of Y approximation points 400 Enter expression for joint density (12/227)*(3*t+2*t. FUNCTIONS OF RANDOM VARIABLES Enter expression for joint density (3/23)*(t + 2*u).97) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 300 Enter number of Y approximation points 300 Enter expression for joint density (12/179)*(3*t. PY.*P) p = 0.u) <= 1. PY.
pmy. pmy = minprob(0. PX = 0. cy = [1 3 1 3].17 (p. Y.16 cx = [2 1 3 0].1490 0. icalc3 Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter row matrix of Zvalues Z Enter X probabilities PX Enter Y probabilities PY Enter Z probabilities PZ Use array operations on matrices X.7 3.2 2.1 0.0752 Sample average ex = 0. u.01*[15 22 33 12 11 7].4 5.001*[255 25 375 45 108 12 162 18].8]. ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % Plot not reproduced here dquanplot Enter VALUES for X X Enter PROBABILITIES for X PX % Plot not reproduced here rand('seed'.7 4.3 1. PM = total(M.9].3340 2.2 2. cy.m (Section~17.*P) PM = 0.2164 1.8.0700 0.2200 0.1500 0. and P M = t. PY. [Y.5000 0. pmy. pmx.42: npr10_16) Data for Exercise~10. cy.5 1.299 Solution to Exercise 10. Z = [1. Z.1000 0. v.PY] = canonicf(cy.16 (p. PZ') npr10_16 % Call for data Data are in cx.^2 > 3*v.^2 + 3*t. PZ.3300 0.1200 0.pmy).9000 0. PZ [X. disp('Data are in cx.1100 0. pmx.pmx).4 3. 293) % file npr10_16. PZ = 0. Z.*u.2000 0.7000 0.01*[32 56 40]). t. Z.4000 0. PX.1184 3.1070 4.8792 Population mean E[X] = 0.859 .3587 Solution to Exercise 10.PX] = canonicf(cx. 293) X = [3.0) % Reset random number generator dsample % for comparison purposes Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX Sample size n 10000 Value Prob Rel freq 3. pmx = 0.01*[12 24 43 13 8].
112 .146 Population variance Var[X] = 5.300 CHAPTER 10. FUNCTIONS OF RANDOM VARIABLES Sample variance vx = 5.
One of the ten. 301 . this idea of likelihood is rooted X takes a value in a set M of real numbers is interpreted as the in the intuitive notion that if the experiment is repeated enough times the probability is approximately the fraction of times the value of the values taken on. which lies beyond the scope of this study. x2 . then extend the denition and properties to the general case. The fi must add to n.2 · 5 + 0. 8. · · · . fm have value tm . In the process. We incorporate the concept of as an appropriate form of such averages. three of the ten. has the value 1. 8. 5.5/>. tm }. we could write x = (0.1. · · · . have the value 2. 5.3 · 2 + 0. X will fall in M. and 3/10 have the value 8.1 Mathematical Expectation: Simple Random Variables1 11. 2. numbers have value t1 .2) Multiply each possible value by the fraction The fractions are often referred to as the relative frequencies. or the fraction 3/10 of them.1 Introduction The probability that real random variable likelihood that the observed value X (ω) on any trial will lie in M. f2 {x1 . Historically. we note the relationship of mathematical expectation to the Lebesque integral. t2 . which is developed in abstract measure theory. 11. · · · xn } to be averaged.1 · 1 + 0. 4. used extensively in the handling of numerical data.1) Examination of the ten numbers to be added shows that ve distinct values are included. Thus.1 · 4 + 0.1.2 Expectation for simple random variables The notion of mathematical expectation is closely related to the idea of a weighted mean. Consider the arithmetic average 2.org/content/m23387/1. identication of this relationship provides access to a rich and powerful set of properties which have far reaching consequences in both application and theory. x= 1 (1 + 2 + 2 + 2 + 4 + 5 + 5 + 8 + 8 + 8) 10 (11. with m ≤ n distinct values have value t2 . Associated with this interpretation is the notion of the average of mathematical expectation into the mathematical model We begin by studying the mathematical expectation of simple random variables. 2.Chapter 11 Mathematical Expectation 11. 2/10 have the value 5. content is available online at <http://cnx. which is given by x of the following ten numbers: 1.3 · 8) The pattern in this last expression can be stated in words: of the numbers having that value and then sum these products. 8. or the fraction 1/10 of them. 1/10 has the value 4. suppose there are Suppose f1 n weighted average. Although we do not develop this theory. 1 This A sum of this sort is known as a In general. {t1 . (11.
X with values {t1 . so that X n n E [aX] = i=1 ati P (Ai ) = a i=1 ti P (Ai ) = aE [X] (11. Two quite dierent random variables may have the same distribution. we have E [aIE ] = aP (E). n n X = i=1 ti IAi then aX = i=1 ati IAi . this average has been called the the mean value. Traditionally. or Example 11. of the random variable X. so that E [c] = cP (Ω) = c.1: Some special cases X = aIE = 0IE c + aIE .4) Note that the expectation is determined by the distribution. a constant c. MATHEMATICAL EXPECTATION pi = fi /n.302 CHAPTER 11. t2 . The average x of the n numbers may be written x= 1 n n m numbers in the set which If we set have the value xi = i=1 j=1 tj pj (11.1: Moment of a probability distribution about the origin. 1 ≤ i ≤ m. For a simple random variable the n n E [X] = i=1 ti P (X = ti ) = i=1 ti pi (11. is the probability weighted average of the values taken on by X. 1. we have a similar averaging process in which the relative frequencies of the various possible values of are replaced by the probabilities that those values are observed on any trial.3) In probability theory. designated E [X]. then the fraction pi is called the relative frequency of those ti . Since 2. · · · . If mean. . tn } and corresponding probabilities mathematical expectation. For 3. hence the same expectation.5) Figure 11. In symbols Denition. pi = P (X = ti ). X = cIΩ .
She choses one. % Matrix of values (ordered) = 0.8) For a simple random variable. This is easily calculated using MATLAB. we give an alternate interpretation in terms of meansquare estimation. The prices are $2. The distribution induced by a real random variable on the line is visualized as a unit of probability mass actually distributed along the line.16: Alternate interpretation of the mean value) in "Mathematical Expectation: General Random Variables".PX) % The usual MATLAB operation = 2. we have employed the notion of probability as mass.00 · 0.3. Since the total mass is one.1*[2 2 3 3]. with is the Mechanically.8500 .00 · 0. respectively.3 of choosing the rst. In Example 6 (Example 11.2. of the probability mass distribution about the total mass times the number which locates the center of mass.85 (11. then the mean value µX is the point of balance.50.50 · 0. $2.5 3 3.00.50 · 0. By denition E [X] = 1 1 1 1 1 1 7 1 · 1 + · 2 + · 3 + · 4 + · 5 + · 6 = (1 + 2 + 3 + 4 + 5 + 6) = 6 6 6 6 6 6 6 2 (11. 0. as shown in Figure 1. If the real line is viewed as a sti. This produces a probability mass distribution.50.303 Mechanical interpretation In order to aid in visualizing an essentially abstract system.4: MATLAB calculation for Example 3 PX EX EX Ex Ex ex ex X = [2 2. Thus the values are the integers one through six.2 + 2. weightless rod with point mass pi attached at each value ti of X.2 + 3. pi at The expectation is ti pi i Now (11. The probability distribution places equal mass at each of the integer values one through six. % Matrix of probabilities = dot(X. it is really not necessary. with P (X = ti ) = pi .6) ti  is the distance of point mass pi from the origin. and $3.3 + 3. We utilize the mass distribution to give an important and helpful mechanical interpretation of the expectation or mean value. and each probability is 1/6.*PX) % An alternate calculation = 2. We suppose each number is equally likely. and 0.5].3 = 2. the mathematical expectation is determined as the dot product of the value matrix with the probability matrix. From physical theory. 0.00. The center of mass is at the midpoint.3: A simple choice A child is told she may have one of four toys. the is the mean value location of the center of mass. with respective probabilities 0.8500 = X*PX' % Another alternate = 2. the sum of the products origin on the real line. What is the expected cost of her selection? E [X] = 2. Example 11. ti pi moment pi to the left of the origin i ti is negative. with point mass concentration in the amount of the point ti . Suppose the random variable X has values {ti : 1 ≤ i ≤ n}.8500 = sum(X. Often there are symmetries in the distribution which make it possible to determine the expectation without detailed calculation. $3. this moment is known to be the same as the product of the Example 11. Example 11.7) Although the calculation is very simple in this case. second.2: The number of spots on a die Let X be the number of spots which turn up on a throw of a simple sixsided die.2. third or fourth.
identify the index set Ji of those j such that cj = ti . Also. P (A2 ) = P (C2 ) + P (C5 ) + P (C6 ) .19) .14) E [X] = P (A1 ) + 2P (A2 ) + 3P (A3 ) = P (C1 ) + P (C3 ) + 2 [P (C2 ) + P (C5 ) + P (C6 )] + 3P (C4 ) = P (C1 ) + 2P (C2 ) + P (C3 ) + 3P (C4 ) + 2P (C5 ) + 2P (C6 ) To establish the general pattern.17) P (Ai ) = P (X = ti ) = j∈Ji Since for each P (Cj ) (11. Then the terms cj ICj = ti Ji By the additivity of probability m ICj = ti IAi . Suppose these are t1 < t2 < · · · < tn . C6 } a partition (11. A2 = {X = 2} = C2 C5 C6 and A3 = {X = 3} = C4 (11.13) P (A1 ) = P (C1 ) + P (C3 ) . Suppose simple random variable X is in a primitive form {Cj : 1 ≤ j ≤ m} is a partition (11.16) X = j=1 cj ICj . the general case is simply a matter of appropriate use of notation. Ji where Ai = j∈Ji Cj (11. where m E [X] = j=1 cj P (Cj ) (11. we have n n n m E [X] = i=1 ti P (Ai ) = i=1 ti j∈Ji P (Cj ) = i=1 j∈Ji cj P (Cj ) = j=1 cj P (Cj ) (11. We identify the distinct set of values contained in the set {cj : 1 ≤ j ≤ m}. For any value ti in the range. C3 .304 CHAPTER 11. {C1 . Now and P (A3 ) = P (C4 ) (11. where Ai = {X = ti }.11) Before a formal verication.10) m X= j=1 We show that cj ICj .5: Simple random variable X in primitive form with X = IC1 + 2IC2 + IC3 + 3IC4 + 2IC5 + 2IC6 . 2. E [X] = i=1 ti P (Ai ) (11. C4 . we begin with an example which exhibits the essential pattern. Establishing Example 11.18) j ∈ Ji we have cj = ti . consider (11. or 3.15) (11.12) Inspection shows the distinct possible values of X to be 1.9) We wish to ease this restriction to canonical form. A1 = {X = 1} = C1 so that C3 . in which case n implies Expectation and primitive form The denition and treatment above assumes is in n X= i=1 ti IAi . C5 . C2 . MATHEMATICAL EXPECTATION X canonical form.
% Matrix of probabilities EX = c*pc' EX = 1. 0.24) = i=1 ti j=1 P (Ai Bj ) + j=1 uj i=1 P (Ai Bj ) (11.21) c = [1 2 1 3 2 2 1 3 2 1].11.23) j=1 IAi IBj = IAi Bj and forms a partition. Suppose X= n i=1 ti IAi and Y = m j=1 uj IBj n (both in canonical form).PX]') 1. 0.07.0000 0.12. X from the coecients and probabilities of the primitive form. 0. % Determination of dbn for X disp([X.25) .pc).6: Alternate determinations of Suppose X E [X] in a primitive form is X = IC1 + 2IC2 + IC3 + 3IC4 + 2IC5 + 2IC6 + IC7 + 3IC8 + 2IC9 + IC10 with respective probabilities (11. Since m IAi = i=1 we have IBj = 1 j=1 (11. above.0000 0.08. 0. Thus. j) Z = X + Y in a primitive form.3800 3. we have Ai Bj = {X = ti . 0.14. the last summation expresses on primitive forms.01*[8 11 6 13 5 8 12 7 14 16]. % Matrix of coefficients pc = 0.0000 0.7800 % Direct solution [X. 0. Because of its fundamental importance.05. we work through the verication in some detail. An alternate approach to obtaining the expectation from a primitive form is to use the csort operation to determine the distribution of Example 11. 0. the dening expression for expectation thus holds for X in a primitive form.PX] = csort(c. Y = uj }.22) n ti IAi m IBj + m n n m X +Y = i=1 Note that u j I Bj j=1 i=1 IAi = i=1 j=1 (ti + uj ) IAi IBj (11.06.13.4200 2.16 (11.20) P (Ci ) = 0. 0.305 Thus.7800 Linearity The result on primitive forms may be used to establish the linearity of mathematical expectation for simple random variables.2000 Ex = X*PX' % E[X] from distribution Ex = 1.08. Because of the result n m n m n m E [X + Y ] = i=1 j=1 (ti + uj ) P (Ai Bj ) = i=1 j=1 n m m ti P (Ai Bj ) + i=1 j=1 n uj P (Ai Bj ) (11. The class of these sets for all possible pairs (i. 0.
28) X.33) Some algebraic tricks may be used to show that the second form sums to of that. The computation for the ane form is much simpler.29) By an inductive argument.1 (Some special cases) we E [aX + bY ] = E [aX] + E [bY ] = aE [X] + bE [Y ] If (11. Z are simple. Example 11. q =1−p (11. q =1−p np. MATHEMATICAL EXPECTATION i and for each We note that for each j m n P (Ai ) = j=1 Hence. and cZ . whenever simple random variable X is in ane form.26) n m E [X + Y ] = i=1 Now have ti P (Ai ) + j=1 uj P (Bj ) = E [X] + E [Y ] (11. but there is no need .27) aX and bY are simple if X and Y are.30) Thus. in canonical form IEi so that E [X] = i=1 p = np (11. we may write P (Ai Bj ) and P (Bj ) = i=1 P (Ai Bj ) (11.32) n E [X] = k=0 kC (n.306 CHAPTER 11.31) n X= k=0 so that kIAkn . the dening expression holds for any ane combination of indicator functions. The expectation of a linear combination of a nite number of simple random variables is that linear combination of the expectations of the individual random variables. Thus we may assert (11. so that with the aid of Example 11. k) pk q n−k . It follows that E [aX + bY + cZ] = E [aX + bY ] + cE [Z] = aE [X] + bE [Y ] + cE [Z] simple random variables. Y. then so are aX + bY . p) This random variable appears as the number of successes in n Bernoulli trials with probability p of n success on each component trial. with pk = P (Akn ) = P (X = k) = C (n.7: Binomial distribution (n. then n n E [X] = E c0 + i=1 ci IEi = c0 + i=1 ci P (Ei ) (11. Expectation of a simple random variable in ane form As a direct consequence of linearity. this pattern may be extended to a linear combination of any nite number of Linearity. (11. It is naturally expressed in ane form n X= i=1 Alternately. whether in canonical form or not. k) pk q n−k .
307
Example 11.8: Expected winnings
A bettor places three bets at $2.00 each. The rst bet pays $10.00 with probability 0.15, the second pays $8.00 with probability 0.20, and the third pays $20.00 with probability 0.10. What is the expected gain? SOLUTION The net gain may be expressed
X = 10IA + 8IB + 20IC − 6,
Then
with
P (A) = 0.15, P (B) = 0.20, P (C) = 0.10
(11.34)
E [X] = 10 · 0.15 + 8 · 0.20 + 20 · 0.10 − 6 = −0.90
These calculations may be done in MATLAB as follows:
(11.35)
c = [10 8 20 6]; p = [0.15 0.20 0.10 1.00]; % Constant a = aI_(Omega), with P(Omega) = 1 E = c*p' E = 0.9000
Functions of simple random variables
If then
X
is in a primitive form (including canonical form) and
g
is a real function dened on the range of
X,
m
Z = g (X) =
j=1
so that
g (cj ) ICj
a primitive form
(11.36)
m
E [Z] = E [g (X)] =
j=1
g (cj ) P (Cj )
(11.37)
Alternately, we may use csort to determine the distribution for
Caution.
If
X
Z
and work with that distribution.
is in ane form (but not a primitive form)
m
m
X = c0 +
j=1
so that
cj IEj
then
g (X) = g (c0 ) +
j=1
g (cj ) IEj
(11.38)
m
E [g (X)] = g (c0 ) +
j=1
g (cj ) P (Ej )
(11.39)
Example 11.9: Expectation of a function of
Suppose
X
X
(11.40)
in a primitive form is
X = −3IC1 − IC2 + 2IC3 − 3IC4 + 4IC5 − IC6 + IC7 + 2IC8 + 3IC9 + 2IC10
with probabilities Let
P (Ci ) = 0.08, 0.11, 0.06, 0.13, 0.05, 0.08, 0.12, 0.07, 0.14, 0.16. g (t) = t2 + 2t. Determine E [g (X)].
308
CHAPTER 11. MATHEMATICAL EXPECTATION
% Original coefficients % Probabilities for C_j % g(c_j)
c = [3 1 2 3 4 1 1 2 3 2]; pc = 0.01*[8 11 6 13 5 8 12 7 14 16]; G = c.^2 + 2*c G = 3 1 8 3 24 1 3 8 15 EG = G*pc' % EG = 6.4200 [Z,PZ] = csort(G,pc); % disp([Z;PZ]') % 1.0000 0.1900 3.0000 0.3300 8.0000 0.2900 15.0000 0.1400 24.0000 0.0500 EZ = Z*PZ' % EZ = 6.4200
distribution is available. Suppose
8 Direct computation
Distribution for Z = g(X) Optional display
E[Z] from distribution for Z
A similar approach can be made to a function of a pair of simple random variables, provided the joint
X=
n i=1 ti IAi
and
Y =
m
m j=1
u j I Bj
(both in canonical form). Then
n
Z = g (X, Y ) =
i=1 j=1
The
g (ti , uj ) IAi Bj
(11.41)
Ai Bj
form a partition, so
Z
is in a primitive form. We have the same two alternative possibilities: (1)
direct calculation from values of
g (ti , uj )
and corresponding probabilities
or (2) use of csort to obtain the distribution for
Z.
P (Ai Bj ) = P (X = ti , Y = uj ),
Example 11.10: Expectation for
calculations, we use jcalc.
Z = g (X, Y ) g (t, u) = t2 + 2tu − 3u.
To set up for
We use the joint distribution in le jdemo1.m and let
% file jdemo1.m X = [2.37 1.93 0.47 0.11 0 0.57 1.22 2.15 2.97 3.74]; Y = [3.06 1.44 1.21 0.07 0.88 1.77 2.01 2.84]; P = 0.0001*[ 53 8 167 170 184 18 67 122 18 12; 11 13 143 221 241 153 87 125 122 185; 165 129 226 185 89 215 40 77 93 187; 165 163 205 64 60 66 118 239 67 201; 227 2 128 12 238 106 218 120 222 30; 93 93 22 179 175 186 221 65 129 4; 126 16 159 80 183 116 15 22 113 167; 198 101 101 154 158 58 220 230 228 211]; jdemo1 % Call for data jcalc % Set up Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X, Y, PX, PY, t, u, and P G = t.^2 + 2*t.*u  3*u; % Calculation of matrix of [g(t_i, u_j)] EG = total(G.*P) % Direct calculation of expectation EG = 3.2529 [Z,PZ] = csort(G,P); % Determination of distribution for Z EZ = Z*PZ' % E[Z] from distribution
309
EZ =
3.2529
11.2 Mathematical Expectation; General Random Variables2
In this unit, we extend the denition and properties of mathematical expectation to the general case. In the process, we note the relationship of mathematical expectation to the Lebesque integral, which is developed in abstract measure theory. Although we do not develop this theory, which lies beyond the scope of this
study, identication of this relationship provides access to a rich and powerful set of properties which have far reaching consequences in both application and theory.
11.2.1 Extension to the General Case
In the unit on Distribution Approximations (Section 7.2), we show that a bounded random variable can be expressed as the dierence
X
can be
represented as the limit of a nondecreasing sequence of simple random variables. Also, a real random variable
X = X+ − X−
of two nonnegative random variables.
The extension of
mathematical expectation to the general case is based on these facts and certain basic properties of simple random variables, some of which are established in the unit on expectation for simple random variables. We list these properties and sketch how the extension is accomplished.
Denition.
hold
almost surely,
A condition on a random variable or on a relationship between random variables is said to abbreviated a.s. i the condition or relationship holds for all
ω
except possibly a set
with probability zero.
Basic properties of simple random variables (E0) (E1) (E2) (E3) : : : :
If X = Y a.s. then E [X] = E [Y ]. E [aIE ] = aP (E). Linearity. X = n ai Xi implies E [X] = i=1
Positivity; monotonicity
a. If b. If
n i=1
ai E [Xi ]
X ≥ 0 a.s. , then E [X] ≥ 0, with equality i X = 0 a.s.. X ≥ Y a.s. , then E [X] ≥ E [Y ], with equality i X = Y a.s.
If X ≥ 0 is bounded and {Xn : 1 ≤ n} is an a.s. nonnegative, limXn (ω) ≥ X (ω) for almost every ω , then limE [Xn ] ≥ E [X]. nondecreasing
(E4) : (E4a):
Fundamental lemma
n
If for all
sequence with
n, 0 ≤ Xn ≤ Xn+1 a.s.
n
and
Xn → X a.s. ,
then
E [Xn ] → E [X]
(i.e., the expectation of
the limit is the limit of the expectations).
Ideas of the proofs of the fundamental properties
• • • •
Modifying the random variable without changing
X
on a set of probability zero simply modies one or more of the
Ai
P (Ai ).
Such a modication does not change
E [X].
Properties (E1) ("(E1) ", p. 309) and (E2) ("(E2) ", p. 309) are established in the unit on expectation of simple random variables.. Positivity (E3a) (p. 309) is a simple property of sums of real numbers. Modication of sets of probability zero cannot aect the expectation. Monotonicity (E3b) (p. 309) is a consequence of positivity and linearity.
X≥Y
i
X − Y ≥ 0 a.s.
and
E [X] ≥ E [Y ]
i
E [X] − E [Y ] = E [X − Y ] ≥ 0
(11.42)
2 This
content is available online at <http://cnx.org/content/m23412/1.5/>.
310
CHAPTER 11. MATHEMATICAL EXPECTATION
The fundamental lemma (E4) ("(E4) ", p. expectation. 309) plays an essential role in extending the concept of
•
It involves elementary, but somewhat sophisticated, use of linearity and monotonicity,
limited to nonnegative random variables and positive coecients. We forgo a proof.
•
Monotonicity and the fundamental lemma provide a very simple proof of the monotone convergence theoem, often designated MC. Its role is essential in the extension.
Nonnegative random variables
There is a nondecreasing sequence of nonnegative simple random variables converging to
X. Monotonicity
E [X] =
implies the integrals of the nondecreasing sequence is a nondecreasing sequence of real numbers, which must have a limit or increase without bound (in which case we say the limit is innite). We dene
limE [Xn ].
n
Two questions arise.
1. Is the limit unique? The approximating sequences for a simple random variable are not unique, although their limit is the same. 2. Is the denition consistent? If the limit random variable with the old?
X
is simple, does the new denition coincide
The fundamental lemma and monotone convergence may be used to show that the answer to both questions is armative, so that the denition is reasonable. Also, the six fundamental properties survive the passage to the limit. As a simple applications of these ideas, consider discrete random variables such as the geometric Poisson
(p)
or
(µ),
which are integervalued but unbounded.
Example 11.11: Unbounded, nonnegative, integervalued random variables
The random variable
X
may be expressed
∞
X=
k=0
Let
kIEk ,
where
Ek = {X = k}
with
P (Ek ) = pk
(11.43)
n−1
Xn =
k=0
Then each for all Now
kIEk + nIBn ,
where
Bn = {X ≥ n}
If
(11.44)
Xn is a simple random variable with Xn ≤ Xn+1 .
Hence,
n ≥ k + 1.
Xn (ω) → X (ω)
for all
ω.
By monotone convergence,
X (ω) = k , then Xn (ω) = k = X (ω) E [Xn ] → E [X].
n−1
E [Xn ] =
k=1
If
kP (Ek ) + nP (Bn )
(11.45)
∞ k=0
kP (Ek ) < ∞,
then
∞
∞
0 ≤ nP (Bn ) = n
k=n
Hence
P (Ek ) ≤
k=n
kP (Ek ) → 0
as
n→∞
(11.46)
∞
E [X] = limE [Xn ] =
n k=0
kP (Ak )
(11.47)
We may use this result to establish the expectation for the geometric and Poisson distributions.
311
Example 11.12:
We have
X ∼ geometric (p) pk = P (X = k) = q k p, 0 ≤ k .
∞
By the result of Example 11.11 (Unbounded, nonnegative,
integervalued random variables)
∞
E [X] =
k=0
For
kpq k = pq
k=1
so that
kq k−1 =
pq (1 − q)
2
= q/p
(11.48)
Y −1∼
geometric
(p), pk = pq k−1
E [Y ] = 1 E [X] = 1/p q
Example 11.13:
We have
X ∼ Poisson (µ)
k
pk =
e−µ µ k!
.
By the result of Example 11.11 (Unbounded, nonnegative, integervalued
random variables)
∞
E [X] = e−µ
k=0
k
µk = µe−µ k!
∞
k=1
µk−1 = µe−µ eµ = µ (k − 1)!
(11.49)
The general case
We make use of the fact that
X = X + − X −,
where both
X
+
and
X

are nonnegative. Then
E [X] = E X + − E X −
Denition.
theory. If both
provided at least one of are nite,
E X+
,
E X−
is nite
(11.50)
E [X + ]
and
E [X − ]
X
is said to be
integrable.
The term integrable comes from the relation of expectation to the abstract Lebesgue integral of measure
Again, the basic properties survive the extension. The property (E0) ("(E0) ", p. 309) is subsumed in a more general uniqueness property noted in the list of properties discussed below.
Theoretical note
The development of expectation sketched above is exactly the development of the Lebesgue integral of the random variable
X
as a measurable function on the basic probability space
(Ω, F, P ),
so that
E [X] =
Ω
X dP
(11.51)
As a consequence, we may utilize the properties of the general Lebesgue integral. is not particularly useful for actual calculations. real line by random variable an integral on the real line.
In its abstract form, it
X
A careful use of the mapping of probability mass to the
produces a corresponding mapping of the integral on the basic space to
Although this integral is also a Lebesgue integral it agrees with the ordinary
Riemann integral of calculus when the latter exists, so that ordinary integrals may be used to compute expectations.
Additional properties
The fundamental properties of simple random variables which survive the extension serve as the basis
of an extensive and powerful list of properties of expectation of real random variables and real functions of random vectors. Some of the more important of these are listed in the table in Appendix E (Section 17.5). We often refer to these properties by the numbers used in that table.
Some basic forms
The mapping theorems provide a number of basic integral (or summation) forms for computation.
1. In general, if integral.
Z = g (X)
with distribution functions
FX
and
FZ , we have the expectation as a Stieltjes
uFZ (du)
(11.52)
E [Z] = E [g (X)] =
g (t) FX (dt) =
312
CHAPTER 11. MATHEMATICAL EXPECTATION X
and
2. If
g (X)
are absolutely continuous, the Stieltjes integrals are replaced by
E [Z] =
g (t) fX (t) dt =
ufZ (u) du
(11.53)
where limits of integration are determined by
fX
or
fY .
Justication for use of the density function is
provided by the RadonNikodym theoremproperty (E19) ("(E19)", p. 600). 3. If
X
is simple, in a primitive form (including canonical form), then
m
E [Z] = E [g (X)] =
j=1
If the distribution for
g (cj ) P (Cj )
(11.54)
Z = g (X)
is determined by a csort operation, then
n
E [Z] =
k=1
vk P (Z = vk )
(11.55)
4. The extension to unbounded, nonnegative, integervalued random variables is shown in Example 11.11 (Unbounded, nonnegative, integervalued random variables), above. innite series (provided they converge). 5. For The nite sums are replaced by
Z = g (X, Y ), E [Z] = E [g (X, Y )] = g (t, u) FXY (dtdu) = vFZ (dv)
(11.56)
6. In the absolutely continuous case
E [Z] = E [g (X, Y )] =
g (t, u) fXY (t, u) dudt =
vfZ (v) dv
(11.57)
7. For joint simple
X, Y
(Section on Expectation for Simple Random Variables (Section 11.1.2: Expec
tation for simple random variables))
n
m
E [Z] = E [g (X, Y )] =
i=1 j=1
g (ti , uj ) P (X = ti , Y = uj )
(11.58)
Mechanical interpretation and approximation procedures
In elementary mechanics, since the total mass is one, the quantity
E [X] =
tfX (t) dt
is the location of
the center of mass. This theoretically rigorous fact may be derived heuristically from an examination of the expectation for a simple approximating random variable. Recall the discussion of the mprocedure for discrete approximation in the unit on Distribution Approximations (Section 7.2) The range of
X
is divided into equal
subintervals. The values of the approximating random variable are at the midpoints of the subintervals. The associated probability is the probability mass in the subinterval, which is approximately
fX (ti ) dx,
where
dx
is the length of the subinterval. This approximation improves with an increasing number of subdivisions,
with corresponding decrease in
dx.
The expectation of the approximating simple random variable
Xs
is
E [Xs ] =
i
ti fX (ti ) dx ≈
tfX (t) dt
(11.59)
The approximation improves with increasingly ne subdivisions. The center of mass of the approximating distribution approaches the center of mass of the smooth distribution.
313
It should be clear that a similar argument for
g (X)
leads to the integral expression
E [g (X)] =
g (t) fX (t) dt
(11.60)
This argument shows that we should be able to use tappr to set up for approximating the expectation
E [g (X)]
as well as for approximating
P (g (X) ∈ M ),
etc.
We return to this in Section 11.2.2 (Properties
and computation).
Mean values for some absolutely continuous distributions
1.
Uniform
on
[a, b]fX (t) =
1 b−a ,
a≤t≤b
The center of mass is at
(a + b) /2.
To calculate the value
formally, we write
E [X] =
Symmetric triangular on[a,
interval 3.
tfX (t) dt = b]
1 b−a
b
t dt =
a
b2 − a 2 b+a = 2 (b − a) 2
(11.61)
2.
The graph of the density is an isoceles triangle with base on the
[a, b].
By symmetry, the center of mass, hence the expectation, is at the midpoint
(a + b) /2.
Exponential(λ).
fX (t) = λe−λt , 0 ≤ t E [X] =
Using a well known denite integral (see Appendix B
(Section 17.2)), we have
∞
tfX (t) dt =
0
λte−λt dt = 1/λ
(11.62)
4.
Gamma(α,
λ). fX (t) =
1 α−1 α −λt λ e , Γ(α) t
0≤t
Again we use one of the integrals in Appendix B
(Section 17.2) to obtain
E [X] =
tfX (t) dt =
1 Γ (α)
∞
λα tα e−λt dt =
0
Γ (α + 1) = α/λ λΓ (α)
1 0
(11.63)
The last equality comes from the fact that 5.
Beta(r,
Γ(r)Γ(s) Γ(r+s) ,
s). fX (t) =
Γ(r+s) r−1 (1 Γ(r)Γ(s) t
− t)
s−1
Γ (α + 1) = αΓ (α). , 0 < t < 1 We use
the fact that
ur−1 (1 − u)
s−1
du =
r > 0, s > 0. tfX (t) dt = Γ (r + s) Γ (r) Γ (s)
1 0
α
E [X] =
Weibull(α,
tr (1 − t)
s−1
dt =
Γ (r + s) Γ (r + 1) Γ (s) r · = Γ (r) Γ (s) Γ (r + s + 1) r+s
Dierentiation shows
(11.64)
6.
λ, ν). FX (t) = 1 − e−λ(t−ν)
α > 0, λ > 0, ν ≥ 0, t ≥ ν .
α−1 −λ(t−ν)α
fX (t) = αλ(t − ν)
First, consider
e
,
t≥ν
(11.65)
Y ∼
exponential
(λ).
For this random variable
∞
E [Y r ] =
0
If
tr λe−λt dt =
Γ (r + 1) λr
(11.66)
Y
is exponential (1), then techniques for functions of random variables show that
1/α 1 λY
+ν ∼
Weibull
(α, λ, ν).
Hence,
E [X] =
1 1 E Y 1/α + ν = 1/α Γ λ1/α λ
1 +1 +ν α
(11.67)
314
CHAPTER 11. MATHEMATICAL EXPECTATION
Normal
7.
µ, σ 2
The symmetry of the distribution about
t=µ
shows that
E [X] = µ.
This, of course,
may be veried by integration. A standard trick simplies the work.
∞
∞
E [X] =
−∞
We have used the fact that
tfX (t) dt =
−∞
(t − µ) fX (t) dt + µ
(11.68)
∞ −∞
fX (t) dt = 1.
If we make the change of variable
integral, the integrand becomes an odd function, so that the integral is zero. Thus,
x = t − µ in the E [X] = µ.
last
11.2.2 Properties and computation
The properties in the table in Appendix E (Section 17.5) constitute a powerful and convenient resource for the use of mathematical expectation. These are properties of the abstract Lebesgue integral, expressed in the notation for mathematical expectation.
E [g (X)] =
g (X) dP
(11.69)
In the development of additional properties, the four basic properties: (E1) ("(E1) ", p. 309) Expectation of indicator functions, (E2) ("(E2) ", p. 309) Linearity, (E3) ("(E3) ", p. 309) Positivity; monotonicity, and (E4a) ("(E4a)", p. 309) Monotone convergence play a foundational role. We utilize the properties in the
table, as needed, often referring to them by the numbers assigned in the table. In this section, we include a number of examples which illustrate the use of various properties. Some
are theoretical examples, deriving additional properties or displaying the basis and structure of some in the table. Others apply these properties to facilitate computation
Example 11.14: Probability as expectation
Probability may be expressed entirely in terms of expectation.
• • •
By properties (E1) ("(E1) ", p. 309) and positivity (E3a) ("(E3) ", p. 309),
P (A) = E [IA ] ≥
0.
As a special case of (E1) ("(E1) ", p. 309), we have
P (Ω) = E [IΩ ] = 1
By the countable sums property (E8) ("(E8) ", p. 600),
A=
i
Ai
implies
P (A) = E [IA ] = E
i
IAi =
i
E [IAi ] =
i
P (Ai )
(11.70)
Thus, the three dening properties for a probability measure are satised.
Remark.
There are treatments of probability which characterize mathematical expectation with properties
(E0) through (E4a) (p. 309), then dene has not been widely adopted.
P (A) = E [IA ].
Although such a development is quite feasible, it
Example 11.15: An indicator function pattern
Suppose
X
is a real random variable and
E = X −1 (M ) = {ω : X (ω) ∈ M }. IE = IM (X)
Then
(11.71)
To see this, note that Similarly, if ", p. 309),
X (ω) ∈ M i ω ∈ E , so that IE (ω) = 1 i IM (X (ω)) = 1. E = X −1 (M ) ∩ Y −1 (N ), then IE = IM (X) IN (Y ). We thus have,
by (E1) ("(E1)
P (X ∈ M ) = E [IM (X)]
and
P (X ∈ M, Y ∈ N ) = E [IM (X) IN (Y )]
(11.72)
315
Example 11.16: Alternate interpretation of the mean value
E (X − c)
2
is a minimum i
c = E [X] , X (ω) − c.
in which case
E (X − E [X])
2
= E X 2 − E 2 [X]
(11.73)
INTERPRETATION. If we approximate the random variable the error of approximation is (often called the
X
by a constant
c,
then for any
ω
The probability weighted average of the square of the error . This average squared error is smallest i
mean squared error ) is E (X − c)2 the approximating constant c is the mean value.
VERIFICATION We expand
(X − c)
2
and apply linearity to obtain
E (X − c)
2
= E X 2 − 2cX + c2 = E X 2 − 2E [X] c + c2
(11.74)
E X 2 and E [X] are constants). The usual calculus treatment shows the expression has a minimum for c = E [X]. Substitution of this value for c shows 2 the expression reduces to E X − E 2 [X].
The last expression is a quadratic in (since A number of inequalities are listed among the properties in the table. The basis for these inequalities is usually some standard analytical inequality on random variables to which the monotonicity property is applied. We illustrate with a derivation of the important Jensen's inequality.
c
Example 11.17: Jensen's inequality
If of
X is a real random variable and g X, then
is a convex function on an interval
I
which includes the range
g (E [X]) ≤ E [g (X)]
VERIFICATION The function
(11.75)
g
is convex on
I
i for each
t0 ∈ I = [a, b]
there is a number
λ (t0 )
such that
g (t) ≥ g (t0 ) + λ (t0 ) (t − t0 )
This means there is a line through
(11.76)
a ≤ X ≤ b,
by
then by monotonicity
(E11) ("(E11)", p. 600)). We
c, we have
(t0 , g (t0 )) such that the graph of g lies on or above it. If E [a] = a ≤ E [X] ≤ E [b] = b (this is the mean value property may choose t0 = E [X] ∈ I . If we designate the constant λ (E [X])
g (X) ≥ g (E [X]) + c (X − E [X])
Recalling that tonicity, to get
(11.77)
E [X]
is a constant, we take expectation of both sides, using linearity and mono
E [g (X)] ≥ g (E [X]) + c (E [X] − E [X]) = g (E [X])
(11.78)
Remark.
It is easy to show that the function
λ (·) is nondecreasing.
This fact is used in establishing Jensen's
inequality for conditional expectation.
The product rule for expectations of independent random variables
Example 11.18: Product rule for simple random variables
Consider an independent pair
{X, Y }
of simple random variables
n
m
X=
i=1
ti IAi Y =
j=1
uj IBj
(both in canonical form)
(11.79)
316
CHAPTER 11. MATHEMATICAL EXPECTATION
We know that each pair product
{Ai , Bj }
is independent, so that
P (Ai Bj ) = P (Ai ) P (Bj ).
Consider the
XY .
a function of
X ) from "Mathematical Expectation:
n m
According to the pattern described after Example 9 (Example 11.9: Expectation of Simple Random Variables."
n
m
XY =
i=1
ti IAi
j=1
uj IBj =
i=1 j=1
ti uj IAi Bj
(11.80)
The latter double sum is a primitive form, so that
E [XY ] (
n i=1 ti P
= (Ai ))
n m i=1 j=1 ti uj P m j=1 uj P (Bj )
(Ai Bj )
=
n i=1
m j=1 ti uj P
(Ai ) P (Bj )
=
(11.81)
= E [X] E [Y ]
Thus the product rule holds for independent simple random variables.
Example 11.19: Approximating simple functions for an independent pair
Suppose of
X
and
rule for
{X, Y } is an independent pair, with an approximating simple pair {Xs , Ys }. As functions Y, respectively, the pair {Xs , Ys } is independent. According to Example 11.18 (Product simple random variables), above, the product rule E [Xs Ys ] = E [Xs ] E [Ys ] must hold.
there exist nondecreasing sequences
Example 11.20: Product rule for an independent pair
For
X ≥ 0, Y ≥ 0,
simple random variables increasing to
X
and
Y,
{Xn : 1 ≤ n}
and
respectively.
The sequence
{Yn : 1 ≤ n} of {Xn Yn : 1 ≤ n} is
By the monotone
also a nondecreasing sequence of simple random variables, increasing to convergence theorem (MC)
XY .
E [Xn ] [U+2197]E [X] ,
Since In the general case,
E [Yn ] [U+2197]E [Y ] ,
for each
and
E [Xn Yn ] [U+2197]E [XY ]
(11.82)
E [Xn Yn ] = E [Xn ] E [Yn ] XY = X + − X −
n, we conclude E [XY ] = E [X] E [Y ]
(11.83)
Y + − Y − = X +Y + − X +Y − − X −Y + + X −Y −
Application of the product rule to each nonnegative pair and the use of linearity gives the product rule for the pair
{X, Y }
Remark.
It should be apparent that the product rule can be extended to any nite independent class.
Example 11.21: The joint distribution of three random variables
{X, Y, Z} is independent, with the marginal g (X, Y, Z) = 3X 2 + 2XY − 3XY Z . Determine E [W ].
The class
distributions shown below.
Let
W =
X = 0:4; Y = 1:2:7; Z = 0:3:12; PX = 0.1*[1 3 2 3 1]; PY = 0.1*[2 2 3 3]; PZ = 0.1*[2 2 1 3 2]; icalc3 Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter row matrix of Zvalues Z Enter X probabilities PX Enter Y probabilities PY % Setup for joint dbn for{X,Y,Z}
317
Enter Z probabilities PZ Use array operations on matrices X, Y, Z, PX, PY, PZ, t, u, v, and P EX = X*PX' % E[X] EX = 2 EX2 = (X.^2)*PX' % E[X^2] EX2 = 5.4000 EY = Y*PY' % E[Y] EY = 4.4000 EZ = Z*PZ' % E[Z] EZ = 6.3000 G = 3*t.^2 + 2*t.*u  3*t.*u.*v; % W = g(X,Y,Z) = 3X^2 + 2XY  3XYZ EG = total(G.*P) % E[g(X,Y,Z)] EG = 132.5200 [W,PW] = csort(G,P); % Distribution for W = g(X,Y,Z) EW = W*PW' % E[W] EW = 132.5200 ew = 3*EX2 + 2*EX*EY  3*EX*EY*EZ % Use of linearity and product rule ew = 132.5200
Example 11.22: A function with a compound denition: truncated exponential
Suppose
X∼
exponential (0.3). Let
Z={
Determine
X2 16
for for
X≤4 X>4
= I[0,4] (X) X 2 + I(4,∞] (X) 16
(11.84)
E [Z].
∞
ANALYTIC SOLUTION
E [g (X)] =
g (t) fX (t) dt =
0 4
I[0,4] (t) t2 0.3e−0.3t dt + 16E I(4,∞] (X)
(11.85)
=
0
APPROXIMATION
t2 0.3e−0.3t dt + 16P (X > 4) ≈ 7.4972
(by Maple)
(11.86)
To obtain a simple aproximation, we must approximate the exponential by a bounded random variable. Since
P (X > 50) = e−15 ≈ 3 · 10−7
we may safely truncate
X
at 50.
tappr Enter matrix [a b] of xrange endpoints [0 50] Enter number of x approximation points 1000 Enter density as a function of t 0.3*exp(0.3*t) Use row matrices X and PX as in the simple case M = X <= 4; G = M.*X.^2 + 16*(1  M); % g(X) EG = G*PX' % E[g(X)] EG = 7.4972 [Z,PZ] = csort(G,PX); % Distribution for Z = g(X) EZ = Z*PZ' % E[Z] from distribution EZ = 7.4972
318
CHAPTER 11. MATHEMATICAL EXPECTATION
Because of the large number of approximation points, the results agree quite closely with the theoretical value.
Example 11.23: Stocking for random demand (see Exercise 4 (Exercise 10.4) from "Problems on Functions of Random Variables")
The manager of a department store is planning for the holiday season. dollars per unit and sells for
p
A certain item costs
dollars per unit.
additional units can be special ordered for
s
If the demand exceeds the amount
m
c
ordered,
dollars per unit (
s > c).
If demand is less than
amount ordered, the remaining stock can be returned (or otherwise disposed of ) at unit (
r < c).
Demand
D
r
dollars per
for the season is assumed to be a random variable with Poisson What amount
distribution. Suppose
µ = 50, c = 30, p = 50, s = 40, r = 20.
m should the manager
(µ)
order to maximize the expected prot? PROBLEM FORMULATION Suppose
D
is the demand and
X
is the prot. Then
For For
D ≤ m, X = D (p − c) − (m − D) (c − r) = D (p − r) + m (r − c) D > m, X = m (p − c) + (D − m) (p − s) = D (p − s) + m (s − c)
It is convenient to write the expression for
X
in terms of
IM , where M = (−∞, m].
Thus
X = IM (D) [D (p − r) + m (r − c)] + [1 − IM (D)] [D (p − s) + m (s − c)] = D (p − s) + m (s − c) + IM (D) [D (p − r) + m (r − c) − D (p − s) − m (s − c)] = D (p − s) + m (s − c) + IM (D) (s − r) (D − m)
Then
(11.87)
(11.88)
(11.89)
E [X] = (p − s) E [D] + m (s − c) + (s − r) E [IM (D) D] − (s − r) mE [IM (D)]. D∼
Poisson
ANALYTIC SOLUTION For
(µ), E [D] = µ
m
and
E [IM (D)] = P (D ≤ m)
m
E [IM (D) D] = e−µ
k=1
Hence,
k
µk = µe−µ k!
k=1
µk−1 = µP (D ≤ m − 1) (k − 1)!
(11.90)
E [X] = (p − s) E [D] + m (s − c) + (s − r) E [IM (D) D] − (s − r) mE [IM (D)] = (p − s) µ + m (s − c) + (s − r) µP (D ≤ m − 1) − (s − r) mP (D ≤ m)
Because of the discrete nature of the problem, we cannot solve for the optimum calculus. We may solve for various
(11.91)
(11.92) by ordinary
m
m
about
m=µ
and determine the optimum.
We do so with
the aid of MATLAB and the mfunction cpoisson.
mu = 50; = 30; = 50; = 40; = 20; = 45:55; EX = (p  s)*mu + m*(s c) + (s  r)*mu*(1  cpoisson(mu,m)) ... (s  r)*m.*(1  cpoisson(mu,m+1)); disp([m;EX]') 45.0000 930.8604 c p s r m
r) .5231 939. D.m(i)*(c . using nite approximation for the Poisson distri APPROXIMATION ptest = cpoisson(mu.7962 943. u = 1 + t.0000 53.0000 bution.0000 52. u = 1 − t (see Figure 11. solution may be obtained by MATLAB.*(t . .r).:) = t*(p .9247 941. Determine E [Z]. % Values agree with theoretical to four deicmals An advantage of the second solution. end EG = G*pD'.2001e10 n = 100.24 (A jointly distributed pair).0000 55. pD = ipoisson(mu.3886 % Optimum m = 50 A direct.0000 50.M.2: The density for Example 11. based on simple approximation to for each m could be studied e. t = 0:n.0000 54..0699 938.0000 49.g.1532 934.1895 941.6750 942.2988 943. for i = 1:length(m) % Step by step calculation for various m M = t > m(i). Y ) = X 2 + 2XY . is that the distribution of gain Example 11.319 46.2).0000 48. G(i.r).0000 51. the maximum and minimum gains.t).2347 929. 935.0000 47.100) % Check for suitable value of n ptest = 3. u) = 3u on the triangular region bounded by u = 0.m(i))*(s . Let Z = g (X. Y } has joint density fXY (t. Suppose the pair Figure 11.24: A jointly distributed pair {X.
W ={ where on the square region bounded by u = 1 + t.P).1t)) Use array operations on X. t. Y } ≤ 1 max{X. MATHEMATICAL EXPECTATION ANALYTIC SOLUTION E [Z] = t2 + 2tu fXY (t. X 2Y for for max{X.PZ] = csort(G.3). u) = 1/2 u = 1 − t. Y } > 1 = IQ (X. PY.1006 Example 11. u) dudt 0 1+t 1 1−t =3 −1 APPROXIMATION t2 u + 2tu2 dudt + 3 0 0 0 t2 u + 2tu2 dudt = 1/10 (11. % g(X.25: A function with a compound denition The pair {X. u) : max{t.93) tuappr Enter matrix [a b] of Xrange endpoints [1 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density 3*u. u ≤ 1}.94) Q = {(t.^2 + 2*t.*u. and u = t − 1 (see Figure 11. % Distribution for Z EZ = Z*PZ' % E[Z] from distribution EZ = 0. Y } has joint density fXY (t.Y) = X^2 + 2XY EG = total(G. Y.*(u<=min(1+t. and P G = t. . u) : t ≤ 1. Y ) X + IQc (X. PX. u.1006 % Theoretical value = 1/10 [Z. u} ≤ 1} = {(t.Y)] EG = 0. u = 3 − t.*P) % E[g(X. Y ) 2Y Determine (11.320 CHAPTER 11. E [W ].
u. (u>=max(1t. ANALYTIC SOLUTION The intersection of the region Q and the square is the set for which 0 ≤ t ≤ 1 and 1 − t ≤ u ≤ 1.321 Figure 11. t.*P) % E[g(X.95) APPROXIMATION tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density ((u<=min(t+1.8340 % Theoretical 11/6 = 1.8340 Special forms for expectation .Y) EG = total(G. and P M = max(t. E [W ] = 1 2 1 0 1 t dudt + 1−t 1 2 2u dudt + 1 2 2 1 3−t 2u dudt = 11/6 ≈ 1.M)...PZ) % E[Z] from distribution EZ = 1. % Z = g(X.8333 t−1 (11. G = t.*M + 2*u. % Distribution for Z EZ = dot(Z.8333 [Z.Y)] EG = 1.PZ] = csort(G. 1 0 1 1+t Reference to the gure shows three regions of integration.3: The density for Example 11.u)<=1. PX.t1)))/2 Use array operations on X.*(1 . Y.25 (A function with a compound denition).P). PY.3t))& .
MATHEMATICAL EXPECTATION The various special forms related to property (E20a) (list. if X is the life duration (time to failure) for a device. 601) For the special case g (X) = X . where U∼ uniform on (0.102) F is shown in Figure 3(b) (Figure 11.Area B (11. Figure 3(a) shows 1 0 Q (u) du is the dierence in the shaded areas 1 Q (u) du = Area A 0 The corresponding graph of the distribution function . 601) and (E20f ) (list. Example 11. Because of the construction.99) ∞ E [X] = 0 R (t) dt (11.26: The property (E20f ) (list. 601). 0 ≤ t ≤ 1. a > 0.97) 1 E [g (X)] = 0 VERIFICATION If g [Q (u)] du Y = Q (U ). 600) are often useful. As may be seen.98) 1 E [g (X)] = E [g (Q (U ))] = g (Q (u)) fU (u) du = 0 g (Q (u)) du Example 11. is usually derived by an argument which employs a general form of what is known as Fubini's theorem. The general result. then Y has the same distribution as X. 601) (11. then (11. Example 11. p. 1 Then Q (u) = u1/a . p. 601) ∞ E [X] = −∞ [u (t) − FX (t)] dt (11. The latter property is readily established by elementary arguments. the reliability function is the probability at any time t the device is still operative. The special form (E20b) (list.27: Reliability and expectation In reliability.4). p.103) . 1).322 CHAPTER 11.96) may be derived from (E20a) by use of integration by parts for Stieltjes integrals. 601) If Q is the quantile function for the distribution function FX . p.101) E [X] = 0 u1/a du = The same result could be obtained by using ' fX (t) = FX (t) tfX (t) dt. the areas of the regions marked gures. 0 ≤ u ≤ a. Hence. p. 601) and (E20f ) (list. p. a 1 = 1 + 1/a a+1 and evaluating (11. However. p. (11.29: Equivalence of (E20b) (list. A and B are the same in the two ∞ Area 0 A= 0 [1 − F (t)] dt and Area B= −∞ F (t) dt (11. p.100) Example 11.28: Use of the quantile function Suppose FX (t) = ta . Thus R (t) = P (X > t) = 1 − FX (t) According to property (E20b) (list. we use the relationship between the graph of the distribution function and the graph of the quantile function to show the equivalence of (E20b) (list. which we do not need.
104) .323 Use of the unit step function u (t) = 1 for t > 0 and 0 for t < 0 (dened arbitrarily at t = 0) enables us to combine the two expressions to get 1 ∞ Q (u) du = Area A 0 .Area B= −∞ [u (t) − F (t)] dt (11.
with the unit step . MATHEMATICAL EXPECTATION Figure 11. 601) is a direct result of linearity and (E20b) (list.4: Equivalence of properties (E20b) (list. Property (E20c) (list. 601). 601). p.324 CHAPTER 11. functions cancelling out. p. p. p. 601) and (E20f) (list.
n + 1). p. 601) Useful inequalities Suppose X ≥ 0. for all N ≥1 (11.30: Property (E20d) (list.111) The result follows as a special case of (E20d) (list. 601) is used primarily for theoretical purposes. 601) If is nonnegative. Example 11. 601).113) .325 Example 11. p. we may assert b b P (X > t) dt = a For each nonnegative integer P (X ≥ t) dt a (11.105) X ≥ 0. 601) can be constructed as follows.110) Remark. p.31: Property (E20e) (list. p.107) n. P (X ≥ t) = P (X ≥ n) on En and P (X ≥ t) = P (X > n) = P (X ≥ n + 1) on En+1 (11. X Property (E20d) (list. 601) for integervalued random variables By denition ∞ n E [X] = k=1 kP (X = k) = lim n k=1 kP (X = k) (11. The special case (E20e) (list.106) F can have only a countable number of jumps on any interval and P (X ≥ t) dier only at jump points. 601) is more frequently used.32: (E20e) (list.109) (k+1)N P (X ≥ t) dt ≤ N kN EkN P (X ≥ t) dt ≤ N P (X ≥ kN ) (11.108) P (X ≥ t) is decreasing with t En has unit length. then ∞ ∞ E [X] = k=1 VERIFICATION P (X ≥ k) = k=0 P (X > k) (11. we have by the mean value theorem P (X ≥ n + 1) ≤ E [IEn X] ≤ P (X ≥ n) The third inequality follows from the fact that (11. Then ∞ ∞ ∞ P (X ≥ n + 1) ≤ E [X] ≤ n=0 VERIFICATION For P (X ≥ n) ≤ N n=0 k=0 P (X ≥ kN ) . ∞ ∞ By the countable additivity of expectation E [X] = n=0 Since E [IEn X] = n=0 and each P (X ≥ t) dt En (11. 601) ∞ ∞ E [X] = 0 Since [1 − F (t)] dt = 0 P (X > t) dt P (X > t) and (11. p. p. p. Example 11. by (E20b) (list.112) An elementary derivation of (E20e) (list. p. let En = [n. For integer valued random variables. integer valued.
m (Section 17. Random respectively.) Exercise 11.8. 334. The random variable expressing the amount of her purchase may be written X = 3. Experience shows that the probability of of a lightning strike starting a re is about 0. 0.114) n→∞ yields the desired result.) (See Exercise 8 (Exercise 7.31: npr07_02) ). 0. A customer comes in.m (Section 17. mle X has values {1. Exercise 11. 0. 5.10 0.0IC5 + 5. 0. C. is a partition.5IC7 + 7. 335.10 0. The prices are $3.5IC1 + 5. ∞ ∞ Use of (E20e) (list. 3. Exercise 11. Then P (X ≥ k) = q k . 335.13.326 CHAPTER 11. mle npr07_02. (11.org/content/m24366/1. $7. $5.001 ∗ [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] Determine the mathematical expection for the random variable counts the number of the events which occur on a trial.30: variable npr07_01)). The class on (Solution on p.00. 2. 2.50.15. 0. Exercise 11.0083.50. (Solution on p.4 (Solution on p.15.5IC3 + 7.00. A store has eight items for sale.0IC2 + 3.116) E [X] of the value of her purchase. Determine the expected number of res. 4.14. 0.3 (Solution on p.8) from "Problems on Distribution and Density Functions").m (Section 17. $3. 3. .05.09. The class {A. 0.5) from "Problems on Distribution and Density Functions").12) from "Problems on Random Variables and Probabilities".5 are ipped twenty times. $5.) (See Exercise 5 (Exercise 7. with probabilities 0.00.1) from "Problems on Distribution and Density Functions".2) from "Problems on Distribution and Density Functions".8.50. B. $5.09. 601) gives E [X] = k=1 qk = q k=0 qk = q = q/p 1−q (11.06.07. 1. $3.08.50. 3. and Exercise 3 (Exercise 7. Determine E [X].5IC4 + 5.11.0IC6 + 3. Two coins X be the number of matches (both heads or both tails).12. Let (Solution on p. D} has minterm probabilities pm = 0.8. In a thunderstorm in a national park there are 127 lightning strikes.15.50.2 (See Exercise 2 (Exercise 7.3 Problems on Mathematical Expectation3 Exercise 11. 0. respectively.5IC8 Determine the expection (11. p.3) from "Problems on Distribution and Density Functions. 2} C1 {Cj : 1 ≤ j ≤ 10} through C10 .4/>. 3 This content is available online at <http://cnx. 0. She purchases one of the items with probabilities 0.11.10.33: The geometric distribution Suppose X∼ geometric (p).115) 11. 0. 0. MATHEMATICAL EXPECTATION Now for each nite n. 334.) (See Exercise 12 (Exercise 6.) (See Exercise 1 (Exercise 7. Determine E [X]. n k n n n n kP (X = k) = k=1 Taking limits as P (X = k) = k=1 j=1 j=1 k=j P (X = k) = j=1 P (X ≥ j) (11. Example 11.20. and $7. 0. 0.1 npr07_01." mle npr06_12.28: npr06_12)). 334. 0.117) which X = IA + IB + IC + ID .
336.41) from "Problems on Distribution and Density Functions").) The Exercise 11. possible pair equally likely).) (See Exercise 1 (Exercise 8. The prot to the College is X = 50 · 10 − 30N. mle npr08_01. Determine E [X]. 1] (t) t2 + I(1. E [Y ].∞) (X) a. with 10.1) from "Problems On Random Vectors and Joint Distributions". Let X be the number of aces and Y be the number of spades. E [XY ]. E [Y ]. What is the expected number of pulses in an hour? Exercise 11. 335.9 variable (Solution on p. number of noise pulses arriving on a power circuit in an hour is a random quantity having Poisson (7) distribution.11 (Solution on p. Fifty chances are sold. a.12 le npr08_02.2] (t) (2 − t) 5 5 (11.) (See Exercise 24 (Exercise 7.10 Truncated exponential. 335.000 points for of part (a).m (Section 17. and three seniors apply. Under the . residential College plans to raise money by selling chances on a board.8. Determine the expected prot where N is the number of winners (11.12) from "Problems on Distribution and Density Functions").25) from "Problems on Distribution and Density Functions").) A (See Exercise 12 (Exercise 7. What is the expected operating time? Exercise 11.0002).2. Use the fact that ∞ te−λt dt = 0 to determine an expression for 1 λ2 ∞ and te−λt dt = a 1 −λa e (1 + λa) λ2 (11.) (See Exercise 41 (Exercise 7. Exercise 11. Two cards are selected at random. E X 2 .119) What is the expected value E [X]? (Solution on p.6 (Solution on p.32: npr08_01)).m (Section 17. Random X has density function fX (t) = { (6/5) t2 (6/5) (2 − t) for for 0≤t≤1 1<t≤2 6 6 = I [0.) Exercise 11. m sophomores. three juniors. 335. Y } and E [X]. 335.120) E [Y ]. . E X 2 Y be the number of juniors . (Solution on p.118) E [X]. Two (See Exercise 2 (Exercise 8.8.327 Exercise 11. and X It is decided to select two at random (each be the number of sophomores and Determine the joint distribution for {X. 0. and usual assumptions.24) and Exercise 25 (Exercise 7. Use the approximation method.a] (X) X + I(a. Suppose X∼ exponential (λ) and Y = I[0.24) is a random variable T ∼ gamma (20. 0 ≤ t ≤ 1000. Approximate the exponential at Compare the approximate result with the theoretical result b. determine the joint distribution. without replacement. he or she wins $30 with probability p = 0.7 (See Exercise 19 (Exercise 7.8 (Solution on p. a = 30. E Y2 . λ = 1/50. (Solution on p. A player pays $10 to play. from a standard deck.33: npr08_02) ). Exercise 11.19) from "Problems on Distribution and Density Functions"). E Y2 E [XY ]. 335.2) from "Problems On Random Vectors and Joint Distributions". Let who are selected. The total operating time for the units in Exercise 24 (Exercise 7.) Two positions for campus jobs are open. 335.
) (See Exercise 6 (Exercise 8.0609 0.0363 0.0396 0 0.38: npr08_07)).0713 0. E [Y ].0891 0.0357 0.35: npr08_04) ).0726 2.3 − 0. MATHEMATICAL EXPECTATION Exercise 11. Y = u) (11. 336. The pair (Solution on p. E X 2 .5 0.1 2.0527 0.0231 0. E Y2 .1 3. and Y = [1.7 0.0580 E [XY ].4) from "Problems On Random Vectors and Joint Distributions".0594 0.0551 0.36: npr08_05)).5) from "Problems On Random Vectors and Joint Distributions". Let A die is rolled.m (Section 17.0203 0. (Solution on p.7 1. Exercise 11. E Y2 Exercise 11.0216 0. (Solution on p.0440 0.7) from "Problems On Random Vectors and Joint Distributions".1 0.1 5.34: turn up.0651 0.0495 0.8 3. The pair (Solution on p.3 2.15 npr08_05. times.328 CHAPTER 11.8.0399 0.0077 Table 11. 0.37: npr08_06)). A coin is ipped npr08_03) ).3) from "Problems On Random Vectors and Joint Distributions".8. Let (Solution on p.0420 0. mle joint distribution for the pair {X.1] 0. Determine the joint distribution for {X. Y }.121) 0. mle npr08_04.) Suppose a pair of dice is rolled.8.0437 P = 0.m (Section 17.8.9 5.0528 0.0323 0.4 0. Y } and determine E [Y ].0132 3. Y } E [Y ].17 npr08_07.0441 (11. Arrange the joint matrix as on the plane.13 npr08_03. Exercise 11. Exercise 11. Determine the joint distribution for {X.) (See Exercise 4 (Exercise 8.0510 0.0297 0 4. Let (See Exercise 5 (Exercise 8.0667 Determine 0. mle number of spots which turn up. Determine the expected value E [Y ]. 337.0589 (11. 337. 337.8.5 4.m (Section 17.0493 .1320 0.1089 0. sevens that are thrown on the X Roll the pair an additional X times.14 X Y X be the number of spots that Determine the be the number of heads that turn up. and E [XY ].0620 0.16 npr08_06.0 3.5 4.2 0. 1/2) distribution.0090 0. P (Y = jX = k) has the binomial (k. Assume P (X = k) = 1/6 for 1 ≤ k ≤ 6 and for each k.0484 1. Y } has the joint distribution: X = [−2.0399 0.0189 0.3] 0.m (Section 17.0483 0. .123) t = u = 7. As a variation of Exercise 11. mle {X. 336. Y } has the joint distribution: P (X = t.0324 0.13. mle {X. suppose a pair of dice is rolled instead of a single die. E [Y ].0380 0.1 Determine E [X].) (See Exercise 7 (Exercise 8. E X 2 .0405 0. Let Y X be the total be the number of and determine rolls.0361 0.122) E [X].) (See Exercise 3 (Exercise 8.9 0. with values of Y increasing upward.m (Section 17.6) from "Problems On Random Vectors and Joint Distributions".
0063 7 0.0060 0.0056 0.058 0.045 2.0104 0.0140 0. Y = u) (11.0223 Table 11.0160 0.0064 0. E X 2 .0234 0.0112 0.0196 0.0060 0.0056 0.0180 0.009 2 0.0172 17 0. Determine analytically b. For the joint densities in Exercises 2032 below a.0064 0.0140 0.0081 0.0091 0.0234 0.0204 0. mle {X.137 0.0126 0.3 Determine E [X].0038 0.0072 3 0. Y = u) (11.0096 0.0280 0. (Solution on p.001 0. E [XY ].0156 0.0084 0.m (Section 17. and E [XY ].0217 19 0.005 0.0268 0. E Y2 . E [Y ].8) from "Problems On Random Vectors and Joint Distributions". The pair (Solution on p.8.0168 0. mle Data were kept on the eect of training time on the time to perform a job on a production line.0160 0. E [Y ].0056 0.0044 0. E [Y ].0060 0.5 0.0096 0.0200 0.0098 0.0126 0.0045 9 0. Use a discrete E [X].40: npr08_09)).0049 0.0070 0.0084 0.0056 0.0040 0.0196 0.0096 0. E X 2 .39: npr08_08)).0091 0.0050 0.031 0.049 0.0120 0.015 0.0108 0.061 0. .125) t = u = 5 4 3 2 1 1 0. 338. and E [XY ]. The data are as follows: P (X = t. E [Y ].039 0.065 0.2 Determine E [X].0191 0.0167 11 0.070 0.0090 13 0.0054 0.003 1.0060 0.0256 0.025 3 0.0244 0.18 npr08_08. 337. E X and . X is the amount of training. and Y is the time to perform the task.m (Section 17. in minutes.0162 0.0252 0. E Y 2 .0120 0.0072 0.033 0.010 0.0035 0.050 0.0182 0.001 0. E Y 2 . and E [XY ].124) t = u = 12 10 9 5 3 1 3 5 1 0.0260 0.0070 0.0080 0.5 0.0134 0.0040 0.012 0.8.0026 15 0.0156 0.) (See Exercise 8 (Exercise 8.017 Table 11.329 Exercise 11.039 0.0256 0. (See Exercise 9 (Exercise 8.0108 0.051 0. in hours.163 0.0012 0. E Y2 .0182 0.0182 0.) Exercise 11.0180 0.011 0.0256 0.19 npr08_09. Y } has the joint distribution: P (X = t.0017 5 0. 2 approximation for E [X]. E X 2 .9) from "Problems On Random Vectors and Joint Distributions".0112 0.
15) from "Problems On Random Vectors and Joint Distributions"). 0 ≤ t ≤ 1. u) = 4ue−2t 0 ≤ t.131) 3 3 (1 + t) 1 + 4t + t2 = 1 + 5t + 5t2 + t3 . 0 ≤ t. fXY (t. (11.) for (See Exercise 13 (Exercise 8.22 (Solution on p. 0 ≤ u ≤ 1 (11.25 (Solution on p.21 (Solution on p. (11. 0 ≤ u ≤ 1 (11.132) (Solution on p.134) .14) from "Problems On Random Vectors and Joint Distributions").0] (t) 6t2 (t + 1) + I(0.128) fX (t) = 2t. 0 ≤ t ≤ 2 88 88 3 3 6u2 + 4 + I(1.127) fX (t) = fY (t) = I[0.1] (u) Exercise 11.10) from "Problems On Random Vectors and Joint Distributions").) on the parallelogram with vertices (See Exercise 16 (Exercise 8. fXY (t. MATHEMATICAL EXPECTATION Exercise 11. fXY (t. fX (t) = fY (t) = 1 (t + 1) . 0) . (0.12) from "Problems On Random Vectors and Joint Distributions"). fY (u) = 1 − u/2.13) from "Problems On Random Vectors and Joint Distributions"). 339.1] (t) 6t2 1 − t2 . 0 ≤ u ≤ 1 + t. u) = 12t2 u (−1. 0 ≤ u ≤ 2. 0 ≤ u ≤ 2 (11. 2) . 0 ≤ u ≤ 1. 339.129) Exercise 11. (1. 0 ≤ t ≤ 2 4 (11. (1.126) Exercise 11. fY (u) = 2 (1 − u) . 339. (11. 0 ≤ t ≤ 1. 339. 1) . u) = 3 88 2t + 3u2 fX (t) = for 0 ≤ t ≤ 2.) on the square with vertices at (See Exercise 11 (Exercise 8. fXY (t.23 (Solution on p.26 (11. 1). fX (t) = 2 (1 − t) . 0 ≤ u ≤ 1.3] (u) 3 + 2u + 8u2 − 3u3 88 88 fY (u) = I[0. fY (u) = 2u.130) Exercise 11. 0 ≤ u ≤ 2 (1 − t).330 CHAPTER 11. 338. 0) . 1) . u) = 1 for 0 ≤ t ≤ 1. fXY (t. 1) fX (t) = I[−1.133) fY (u) = 12u3 − 12u2 + 4u.16) from "Problems On Random Vectors and Joint Distributions").24 (Solution on p. 0 ≤ u ≤ 1 Exercise 11. u) = 1 8 (t + u) 0 ≤ t ≤ 2. (2.) (See Exercise 10 (Exercise 8. u) = 1/2 (1. fXY (t.2] (t) (2 − t) Exercise 11. 0) . 338.) for (See Exercise 14 (Exercise 8. 2 (11.1] (t) t + I(1. (0. u) = 4t (1 − u) 0 ≤ t ≤ 1.11) from "Problems On Random Vectors and Joint Distributions"). fX (t) = 2e−2t .) (See Exercise 15 (Exercise 8.) for (See Exercise 12 (Exercise 8. 339. fXY (t.20 (Solution on p. (0.
142) fY (u) = I[0. 340. 0 ≤ u ≤ 1 11 11 11 (11.1] (t) 23 (2 − t) + I(1. 0 ≤ t ≤ 2.331 Exercise 11.2] (t) (3 − t) 13 13 + I(1. fXY (t.1] (u) 23 (2u + 1) + (11.136) Exercise 11.138) (Solution on p.140) = I[0.1] (t) 3 t2 + 2u + I(1.32 4 8 9 + u − u2 13 13 52 (11. u) = 2 13 (t + 2u).27 (Solution on p.2] (u) (2u + 3) 3 + 2u − u2 227 227 24 6 (2u + 3) + I(1. 0 ≤ u ≤ min{1. 340. 6 6 fX (t) = I[0. 0 ≤ u ≤ max{2 − t. 340. 12 3 120 t + 5t2 + 4t + I(1.) for (See Exercise 21 (Exercise 8. 9 fXY (t.1] (u) 24 6 (2u + 3) + I(1. fXY (t.1] (u) Exercise 11. 340.1] (u) Exercise 11.139) fX (t) = I[0.18) from "Problems On Random Vectors and Joint Distributions").28 (Solution on p.20) from "Problems On Random Vectors and Joint Distributions"). fY (u) 3 I(1.) for (See Exercise 20 (Exercise 8.2] (t) 9 − 6t + 19t2 − 6t3 179 179 24 12 (4 + u) + I(1.22) from "Problems On Random Vectors and Joint Distributions").2] (t) 14 t2 u2 .2] (u) 9 6 21 + u − u2 13 13 52 (11.17) from "Problems On Random Vectors and Joint Distributions"). fXY (t. u) = I[0.141) (Solution on p.1] (t) fY (u) = I[0. u) = 24 11 tu 0 ≤ t ≤ 2.31 (11.2] (u) 23 (4 + 6u − 4u2 ) = 6 I[0.) (See Exercise 22 (Exercise 8.143) (Solution on p. 2}.137) fX (t) = I[0. 3 − t}.) for (See Exercise 18 (Exercise 8.2] (u) 27 − 24u + 8u2 − u3 179 179 (11. 2 − t}.19) from "Problems On Random Vectors and Joint Distributions"). fXY (t. 12 12 12 2 2 t + I(1. 0 ≤ u ≤ min{2t. u) = 12 227 (3t + 2tu). t}. 3 − t}.2] (u) 9 + 12u + u2 − 2u3 227 227 (11.) for (See Exercise 17 (Exercise 8. 0 ≤ t ≤ 2. 340. 8 .2] (t) t(2 − t) .30 (11. 0 ≤ u ≤ min{2.29 (Solution on p. 24 6 3t2 + 1 + I(1. for 0 ≤ t ≤ 2.1] (t) Exercise 11.2] (t) 23 t2 . 0 ≤ u ≤ min{1 + t.2] (t) t 227 227 (11. fY (u) = u(u − 2) . fXY (t.1] (t) 12 2 6 t + I(1.1] (u) Exercise 11.) (See Exercise 19 (Exercise 8. u) = 3 23 (t + 2u) 0 ≤ t ≤ 2. 339.1] (t) fY (u) = I[0.135) fX (t) = I[0. u) = 12 179 3t2 + u .21) from "Problems On Random Vectors and Joint Distributions"). fX (t) = I[0.
35 The pair (Solution on p. 3150. Y = u) (11.0440 0.0405 0.0396 0 0.m (Section 17.0594 0.0495 0. Y.0090 0. ∞) .148) E [Z] and X ∼ Poisson (75). 341. 51100. Additional tickets are available according to the following schedule: 1120. E Z2 . M 4 = [50. {X.) has the joint distribution (in mle npr08_07.0726 2.8. let Determine E [Z] and E Z2 .1 0. ∞) . 341. Exercise 11. (11.2] (t) t2 .7 0.147) M 1 = [10. then repeat with icalc3 and compare results. Determine Exercise 11. 342. $18 each. identically distributed) with common X = [−5 − 1 3 4 7] Let P X = 0. Y = { X 2Y for for X +Y ≤4 X +Y >4 = IM (X.1320 0. ∞) (11.0231 0.0484 1.0203 0.2 0.8 3. Exercise 11.144) fX (t) = I[0.150) .1089 0. 340. Z} of random variables is iid (independent.0510 0.5 4. the total cost (in dollars) is a random Z = g (X) described by g (X) = 200 + 18IM 1 (X) (X − 10) + (16 − 18) IM 2 (X) (X − 20) + (15 − 16) IM 3 (X) (X − 30) + (13 − 15) IM 4 (X) (X − 50) where Suppose (11.) The (See Exercise 5 (Exercise 10.4 0.0077 Table 11.0189 0.1 2.0132 3.145) W = 3X − 4Y + 2Z .149) t = u = 7.4 Let Z = g (X.146) (11. Y } Approximate the Poisson distribution by truncating at 150.0297 0 4.1] (t) Exercise 11. MATHEMATICAL EXPECTATION for 0 ≤ u ≤ 1.) W = g X.) {X. agreement is that the organization will purchase ten tickets at $20 each (regardless of the number of individual buyers).01 ∗ [15 20 30 25 10] (11. fY (u) = + u + u2 0 ≤ u ≤ 1 8 14 8 4 2 (Solution on p. Y ) = 3X 2 + 2XY − Y 2 . $15 each.0891 0.0 3.34 (Solution on p.0216 0. M 2 = [20.332 CHAPTER 11.35. M 3 = [30.5 0. Do this using icalc. Y ) X + IM c (X. Determine E [W ].9 0. 2130 $16 each.33 The class distribution 3 3 2 3 1 3 t + 1 + I(1. Y } in Exercise 11.0324 0. ∞) .36 For the pair (Solution on p.38: npr08_07)): P (X = t.0528 0.5) from "Problems on Functions of Random Variables") The cultural committee of a student organization has arranged a special deal for tickets to a concert. $13 each If the number of purchasers is a random variable quantity X. {X.0363 0. Y ) 2Y (11.
157) Z has distribution Value Probability 1.37 (Solution on p.32 P (E) = 0. 2} (see Exercise 11. 343.29).2] (X) (X + Y ) Exercise 11.162 0.156) {D.8. Z} is independent.13 5. u) = 3 88 2t + 3u2 for 0 ≤ t ≤ 2. Determine E [W ] and E W2 .155) (Solution on p. u ≥ 1} (11.40 (11. X = −2IA + IB + 3IC . Y ) Y 2 . u) = 12 227 (3t + 2tu).152) (Solution on p.333 Determine E [W ] and E W2 E [Z] . 342.30). The class (11.) (See Exercise 16 (Exercise 10. 3 − t} (see Exercise 11. Exercise 11.108 0.42: npr10_16)) Minterm probabilities are (in the usual order) 0.56 P (F ) = 0. 344. F } is independent with P (D) = 0.) fXY (t. Z = IM (X.39 3 23 (11.7 0.025 0.m (Section 17. 0 ≤ u ≤ min{2.3 0. Determine analytically and E Z2 . 2 − t} (see Exercise 11.2 0.8 0. Z = IM (X. Y ) X + IM c (X.) fXY (t. . u) = (t + 2u) 0 ≤ t ≤ 2.25).41 M = {(t. of Random Variables". Y ) 2Y 2 . mle npr10_16.4 0. 342. u) : max (t. Y ) 2Y. Exercise 11. 2 − t)} (11. for 0 ≤ t ≤ 2.43 3.) for fXY (t.154) (Solution on p. 0 ≤ u ≤ 1 + t (see Exercise 11.27). Z = I[0.42 The class M = {(t. For the distributions in Exercises 3741 below a. Y ) (X + Y ) + IM c (X.018 Y = ID + 3IE + IF − 3.12 1. u) : t ≤ 1. 0 ≤ u ≤ min{1. Y ) XY. t} (see Exercise 11. 343. E. Use a discrete approximation to calculate the same quantities.375 0.1] (X) 4X + I(1. 0 ≤ u ≤ max{2 − t.151) (Solution on p. Z = IM (X.153) (Solution on p.38 (11.40 (11.24 2. Exercise 11.) for fXY (t.28). u) : u > t} 2 Exercise 11. b.012 0.045 0. Y ) (X + Y ) + IM c (X. M = {(t. u) = 12 179 3t2 + u . u) = 24 11 tu 0 ≤ t ≤ 2. 0 ≤ u ≤ min{1 + t. M = {(t.) fXY (t. 342.255 0. 1 Z = IM (X. Y. u) : u ≤ min (1.5 W = X 2 + 3XY 2 − 3Z .16) from "Problems on Functions {X. u) ≤ 1} Exercise 11. Y ) X + IM c (X. for 0 ≤ t ≤ 2.08 Table 11.
326) % file npr07_01. coefficients in c') npr06_12 Minterm probabilities in pm. MATHEMATICAL EXPECTATION Solutions to Exercises in Chapter 11 Solution to Exercise 11. disp('Data are in T and pc') npr07_01 Data are in T and pc EX = T*pc' EX = 2.PX] = csort(T. coefficients in c canonic Enter row vector of coefficients c Enter row vector of minterm probabilities pm Use row matrices X and PX for calculations Call for XDBN to view the distribution EX = X*PX' EX = 2.1) from "Problems on Distribution and Density Functions" T = [1 3 2 3 4 2 1 3 5 2]. pc = 0.5 7.9890 T = sum(mintable(4)).PX] = csort(T.5 5.01*[ 8 13 6 9 14 11 12 7 11 9].01*[10 15 15 20 10 5 10 15].pm). 326) % file npr07_02.28: npr06_12) % Data for Exercise 12 (Exercise~6.001*[5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302].30: npr07_01) % Data for Exercise 1 (Exercise~7.0 3.8.5 7. ex = x*px ex = 2.8.7000 Solution to Exercise 11. disp('Minterm probabilities in pm. PX ex = X*PX' ex = 2. ex = X*PX' ex = 5.5].3500 [X.pc).2) from "Problems on Distribution and Density Functions" T = [3.9890 .1 (p.m (Section~17.0 5. [x.3500 Solution to Exercise 11. 326) % file npr06_12.334 CHAPTER 11.8.3 (p. pc') npr07_02 Data are in T. pc = 0.0 3.px] = csort(T.m (Section~17.2 (p. disp('Data are in T. % Alternate using X.m (Section~17.pc).31: npr07_02) % Data for Exercise 2 (Exercise~7. c = [1 1 1 1 0].7000 [X.5 5. pc EX = T*pc' EX = 5.12) from "Problems on Random Variables and Probabilities" pm = 0.
E [X] = 7. P. 327) Poisson (7).0083).5 = 10. u.1629 % Alternate . 327) gamma (20.5594 ez = 50*(1 .2 = 10. E [X] = 500 − 30E [N ] = 200.32: npr08_01) Data in Pn. E [N ] = 50 · 0. E [X] = 20/0.159) λ 1 1 − e−λa 1 − e−λa (1 + λa) + ae−λa = λ2 λ (11.1538 ex = total(t.8 (p.0541 Solution to Exercise 11. 326) X∼ X∼ N∼ X∼ X∼ binomial (127.335 Solution to Exercise 11.5000 EX2 = (X.9 (p.8. X.6 (p.^2)*PX' EX2 = 0. 0. 1/2).*(X<=30) + 30*(X>30). t.0083 = 1.10 (p.0002 = 100. 326) binomial (20. 000.1538 EY = Y*PY' EY = 0.exp(30/50)) % Theoretical value ez = 22. 327) E [X] = tfX (t) dt = 6 5 t3 dt + 0 6 5 2t − t2 dt = 1 11 10 (11. Y jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X. PY. 327) binomial (50.4 (p.2).5594 Solution to Exercise 11.*P) ex = 0.11 (p. Solution to Exercise 11.7 (p. Solution to Exercise 11. and P EX = X*PX' EX = 0. 327) npr08_01 (Section~17. E [X] = 127 · 0. PX. EZ = G8PX' EZ = 22.158) Solution to Exercise 11.0002). E [X] = 20 · 0.5 (p. 327) a E [Y ] = g (t) fX (t) dt = 0 tλe−λt dt + aP (X > a) = (11. Y. 1 2 Solution to Exercise 11. Solution to Exercise 11. 0.160) tappr Enter matrix [a b] of xrange endpoints [0 1000] Enter number of x approximation points 10000 Enter density as a function of t (1/50)*exp(t/50) Use row matrices X and PX as in the simple case G = X. 0.
5833 Solution to Exercise 11. Y. P jcalc .336 CHAPTER 11.. 328) npr08_03 (Section~17....^2)*PY' EY2 = 0.7500 EX2 = (X.7500 EX2 = (X.... P jcalc .34: npr08_03) Answers are in X.. 327) npr08_02 (Section~17.. Y.^2)*PY' EY2 = 4.^2)*PX' EX2 = 54..5000 EY = Y*PY' EY = 1.14 (p..*P) = 0.5714 EY2 = (Y.12 (p... Y.1667 EY2 = (Y.EX = X*PX' EX = 7 EY = Y*PY' EY = 3..6176 = total(t... MATHEMATICAL EXPECTATION = (Y.Pn.*u. PY jcalc .5000 EY = Y*PY' EY = 0. 328) npr08_04 (Exercise~8...*u.5000 EX2 = (X...8..8.^2)*PX' EX2 = 15..*P) EXY = 0.*u.33: npr08_02) Data are in X.13 (p.4) Answers are in X.EX = X*PX' EX = 0.^2)*PY' = 0..*P) EXY = 7.^2)*PX' EX2 = 0.2143 Solution to Exercise 11..8333 .6667 EXY = total(t...EX = X*PX' EX = 3. P..9643 EXY = total(t.0769 EY2 EY2 EXY EXY Solution to Exercise 11....
...^2)*PY' EY2 = 11.EX = X*PX' EX = 7. 328) npr08_05 (Section~17.1423 Solution to Exercise 11.^2)*PY' EY2 = 19.... PY jcalc . Y.8.8. P.8. Y.EX = X*PX' EX = 1...EX = X*PX' EX = 0.. 328) npr08_07 (Section~17.3696 EY = Y*PY' EY = 3..^2)*PX' EX2 = 5.8590 EY = Y*PY' EY = 1..^2)*PY' EY2 = 15.8495 EY2 = (Y..15 (p........36: npr08_05) Answers are in X.*P) EXY = 4.0344 EX2 = (X..^2)*PX' EX2 = 9.1667 Solution to Exercise 11..1455 EX2 = (X.17 (p....6115 EXY = total(t. P jcalc .7644 EY2 = (Y.*u.16 (p.37: npr08_06) Data are in X..0000 EY = Y*PY' EY = 1.4839 EXY = total(t.6803 Solution to Exercise 11. P jcalc ...*P) EXY = 3. Y. 329) . 328) npr08_06 (Section~17.337 EY2 = (Y.38: npr08_07) Data are in X.*u..18 (p.4583 Solution to Exercise 11...
.3333 EY = 0. 329) npr08_09 (Section~17. P jcalc .5*(u<=min(t+1.0016 EX2 = (X.0000 EY = 1. E X 2 = E Y 2 = 7/6 1+t 2 3−t (11.338 CHAPTER 11.8050 EX2 = (X. P jcalc .19 (p.EX = X*PX' EX = 10.....^2)*PX' EX2 = 133.20 (p.9850 EXY = 5.... E [Y ] = 2/3..*P) EXY = 22. P) Solution to Exercise 11.1410 Solution to Exercise 11. E X 2 = 1/6.*P) EY2 = 8.39: npr08_08) Data are in X...8.21 (p.^2)*PY' EY2 = 41.8.6667 EXY = 0... Y.162) tuappr: [0 1] [0 2] 200 400 u<=2*(1t) EX = 0.*u..^2)*PY' EXY = total(t.3t))&(u>= max(1t.t1)) EX = 1..0002 .40: npr08_09) Data are in X. u.. E Y 2 = 2/3 1 2(1−t) (11.0375 EY2 = (Y.EX = X*PX' EX = 1.0800 EY2 = (Y.1687 EXY = 1.. MATHEMATICAL EXPECTATION npr08_08 (Section~17.^2)*PX' EX2 = 4.161) E [XY ] = 0 0 tu dudt = 1/6 (11.*u..1000 EY = Y*PY' EY = 3.0002 EX2 = 1.5564 EXY = total(t. Y.2890 Solution to Exercise 11. 330) 1 E [X] = 0 2t (1 − t) dt = 1/3..1684 EY2 = 1.9250 EY = Y*PY' EY = 2. 330) 1 t E [X] = E [Y ] = 0 t2 dt + 1 1 2t − t2 dt = 1.6667 EX2 = 0..163) E [XY ] = (1/2) 0 1−t dudt + (1/2) 1 t−1 dudt = 1 (11.164) tuappr: [0 2] [0 2] 200 200 0.1667 EY2 = 0.1667 (use t..
330) E [X] = 11 2 3 2 2 .*u. E Y 2 = . E [Y ] = .*(1u) EX2 = 0. E X2 = .*(u<=min(1.1)) EX = 0. 331) E [X] = 52 32 57 2 28 .6202 EX2 = 2. E Y 2 = .170) tuappr: [1 1] [0 1] 400 200 12*t. E [Y ] = .0368 EY2 = 0.*(u<= min(1+t.*(u>= max(0.25 (p.24 (p.1141 EXY = 2. 330) E [X] = 2/3. E X 2 = .6667 EXY = 1. E X 2 = 1/2.1667 [0 2] [0 2] 200 200 (1/8)*(t+u) EY = 1. E Y 2 = .1667 EXY = 0.171) tuappr: EX = 0.2277 EY2 = 3. E [Y ] = .5098 .5000 EY2 = 0.166) E [XY ] = 1 8 2 0 0 2 t2 u + tu2 dudt = (11.*u. E Y 2 = 1/6 E [XY ] = 2/9 (11.4415 Solution to Exercise 11. E [XY ] = 5 15 5 5 5 (11.^2.5000 EY = 0.167) tuappr: EX = 1. E X2 = .t)).*(u<1+t) EY = 1.*exp(2*t) EX = 0. E X 2 = . 330) ∞ E [X] = 0 2te−2t dt = 2 1 1 1 1 .5822 EX2 = 1.^2). E X 2 = E Y 2 = 5/3 6 4 3 (11.4016 EY2 = 0. E Y2 = .4021 Solution to Exercise 11.168) tuappr: [0 6] [0 1] 600 200 4*u. E [Y ] = .6667 [0 1] [0 1] EY = 0. 330) E [X] = E [Y ] = 1 4 2 t2 + t dt = 0 7 .169) tuappr: EX = 1.1667 EX2 = 1.6009 EXY = 0.4998 EY2 = 0.27 (p.6667 EX2 = 0.6667 EY2 = 1.5000 Solution to Exercise 11.4035 EY = 0. E [XY ] = 2 3 2 2 3 (11.9458 [0 2] [0 1] 400 200 (24/11)*t.4004 EXY = 0. E [XY ] = 220 880 22 55 880 (11.23 (p. E [XY ] = 55 55 55 5 55 (11.4229 [0 2] [0 3] 200 300 (3/88)*(2*t + 3*u.3333 Solution to Exercise 11.339 Solution to Exercise 11.7342 EX2 = 0. E [Y ] = 1/3.26 (p.165) tuappr: EX = 0.2222 Solution to Exercise 11.22 (p.2t)) EY = 0.3333 E [X] = 313 1429 49 172 2153 .3333 200 200 4*t. 330) EXY = 0.
E Y2 = .2923 EY = 0.*(u<=max(2t.172) tuappr: EX = 1. 331) E [X] = 2491 476 1716 5261 1567 .176) tuappr [0 2] [0 1] 400 200 (3/8)*(t.^2+2*u). E X2 = .3t)) EY = 0.6875 EX2 = 1. 332) Use x and px to prevent renaming.t)) EY = 0. x = [5 1 3 4 7].01*[15 20 30 25 10].2309 [0 2] [0 2] 400 400 (2/13)*(t + 2*u).0944 Solution to Exercise 11.5450 Solution to Exercise 11.32 (p. E [Y ] = .5120 EXY = 1.174) tuappr: [0 2] [0 2] 400 400 (12/227)*(3*t + 2*t.173) tuappr: [0 2] [0 2] 400 400 (12/179)*(3*t. MATHEMATICAL EXPECTATION Solution to Exercise 11.33 (p. icalc Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter X probabilities px Enter Y probabilities px Use array operations on matrices X.0967 EY2 = 1.31 (p.*(t<=1) + (9/14)*(t.7251 EY2 = 1.^2.175) tuappr: EX = 1.0239 EXY = 1. 331) E [X] = 53 22 397 261 251 . E X2 = .*(u<=min(2*t.^2). Y.*(u<=min(1+t.0848 EY = 0.1518 [0 2] [0 2] 200 200 (3/23)*(t + 2*u). E Y2 = .*u. 331) E [X] = 243 11 107 127 347 . E X2 = . E Y2 = . . E [Y ] = .8695 EX2 = 1. E X2 = . E Y2 = . E X2 = .5286 EY2 = 0.^2 + u). E [XY ] = 13 12 130 78 390 (11. u. E [XY ] = 1135 2270 227 1135 3405 (11. E [Y ] = . PX.9119 EY2 = 1. 331) E [X] = 2313 778 1711 916 1811 .29 (p.9169 EX2 = 1. PY.5292 EXY = 0. E [XY ] = 46 23 230 230 230 (11. 331) E [X] = 16 11 219 83 431 .9596 EX2 = 1.*(t > 1) EX = 1. E [Y ] = .28 (p.1417 EXY = 1.2)) EX = 1.3t)) EX = 1. and P G = 3*t 4*u.0122 Solution to Exercise 11.0647 EXY = 1.*u). px = 0.*(u<=min(2.30 (p. t. E [XY ] = 1790 895 895 895 1790 (11. E [Y ] = .1056 Solution to Exercise 11. E [XY ] = 224 16 70 240 448 (11.340 CHAPTER 11. E Y2 = .7745 Solution to Exercise 11.0974 EX2 = 2.6849 EY2 = 1.3805 EY = 1.
. % Alternate EZ = Z*PZ' . t.1650e+03 EZ2 = (Z.. EG = total(G.. Y. EH = total(H. icalc Enter row matrix of Xvalues R Enter row matrix of Yvalues x Enter X probabilities PR Enter Y probabilities px Use array operations on matrices X.PR] = csort(G.P). PY.6500 Solution to Exercise 11. Y. 332) X = 0:150. EK = total(K.20).*u . Y.PX).X).38: npr08_07) Data are in X.PW] = csort(H.15)*(X .*(X>=30) + (13 .G = 3*t.P)..16)*(X.10). and P K = 3*t .30).^2..PZ] = csort(G. u.*P) EG = 5. and P H = t + 2*u..^2 + 2*t..4*u + 2*v.2975 ez2 = total(G. PY.*(X>=50). v.0868e+03 [Z.. t. PZ.6500 icalc3 % Solution with icalc3 Enter row matrix of Xvalues x Enter row matrix of Yvalues x Enter row matrix of Zvalues x Enter X probabilities px Enter Y probabilities px Enter Z probabilities px Use array operations on matrices X.*P) EK = 1.PZ] = csort(G. 332) npr08_07 (Section~17. P jcalc . % Alternate EW = W*PW' EW = 1..*P) EH = 1.341 [R. (15 . PX = ipoisson(75.3699e+06 Solution to Exercise 11.34 (p. EZ = Z*PZ' EZ = 1.35 (p.u. [Z.*(X>=20) + .50).*(X>=10) + (16 . Z.^2)*PZ' EZ2 = 1.6500 [W.P).18)*(X . PX. G = 200 + 18*(X .8.^2.*P) EG2 = 1. PX. u.
^2). 332) H = t.7379 EW2 = (W.P).*P) EH = 4.*P) EG2 = 11.38 (p.181) .39 (p.2t)) G = (1/2)*t. EH = total(H. EG = total(G.0872 Solution to Exercise 11.4351 [W. EZ = 0.2920 EZ2 = 0.*u.*(t+u<=4) + 2*u.^2. MATHEMATICAL EXPECTATION EZ = 5.*(u<=1+t) G = 4*t.*(t<=1) + (t + u).*(t>1). 333) E [Z] = E Z2 = 12 11 6 11 1 0 1 0 t t 1 t2 u dudt + 1 24 11 24 11 1 0 1 0 0 t tu3 dudt + t 24 11 24 11 2 1 2 1 0 2−t tu3 dudt = 2−t 16 55 39 308 (11.*(u<=min(1.*(u>t) + u.180) tuappr: [0 2] [0 1] 400 200 (24/11)*t.2086 EG2 = total(G.179) t3 u dudt + tu5 dudt + 0 tu5 dudt = 0 (11.^2)*PZ' EZ2 = 1.7379 EH2 = total(H. % Alternate EW = W*PW' EW = 4.*P) EH2 = 61.*(u<=t).^2.37 (p.^2.177) 3 88 (4t) 2t + 3u2 dudt + (t + u) 2t + 3u2 dudt = (11.4351 Solution to Exercise 11. 333) E [Z] 3 23 2 1 3 = (t + u) (t + 2u) dudt + 23 0 0 t 2u (t + 2u) dudt = 175 92 1 1 1 3 23 1 0 2−t 1 2u (t + 2u) dudt + (11. 333) E [Z] = E Z2 = 3 88 1 0 1 0 0 0 1+t 4t 2t + 3u2 dudt + 1+t 2 3 88 3 88 2 1 2 1 0 1+t (t + u) 2t + 3u2 dudt = 1+t 0 2 5649 1760 4881 440 (11.178) tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t+3*u.^2)*PW' EW2 = 61.*P) EG = 3.1278 Solution to Exercise 11.36 (p.342 CHAPTER 11.PW] = csort(H.*(t+u>4).2975 EZ2 = (Z.0868e+03 Solution to Exercise 11.
^2.2)) M = u <= min(1.183) 2u2 3t2 + u dudt = 3t2 + u dudt + 2 1 0 3−t (11.*P) EZ = 1.9048 EZ2 = total(G.2t).^2 + u).M).5659 . G = (t + u).182) tuappr: [0 2] [0 2] 400 400 (3/23)*(t+2*u).6955 EZ2 = total(G.t)) M = max(t.*M + 2*u.40 (p.*(1M). G = t.*(u <= min(2. EZ = total(G.189) tuappr: [0 2] [0 2] 400 400 (12/227)*(3*t + 2*t.*M + 2*u. EZ = total(G.^2.*M + t.185) 4u4 3t2 + u dudt = 28296 6265 (11.*(1 .3t)) M = (t<=1)&(u>=1).^2.41 (p.4963 Solution to Exercise 11.*P) EZ2 = 4.*P) EZ = 1.*u. 333) E [Z] = 12 227 1 0 12 227 1+t 1 0 0 1 t (3t + 2tu) dudt + 12 227 12 227 2 1 2 1 2 0 2−t t (3t + 2tu) dudt + 5774 3405 (11.^2.184) E Z2 = 12 179 1 0 1 2 (t + u) 12 179 12 179 1 1 (11.187) tu (3t + 2tu) dudt + 1 tu (3t + 2tu) dudt = 2−t (11. 333) E [Z] = 12 179 1 0 1 2 (t + u) 3t2 + u dudt + 12 179 2 1 2 0 3−t 12 179 1 0 0 1 2u2 3t2 + u dudt+ 1422 895 4u4 3t2 + u dudt+ 0 0 (11.5224 Solution to Exercise 11. G = (t+u).u)<=1.*(1 .*u).*P) EZ2 = 3.*P) EZ = 1.343 E [Z 2 ] 3 23 2 1 3 = (t + u)2 (t + 2u) dudt + 23 0 0 t 4u2 (t + 2u) dudt = 2063 460 1 1 1 3 23 1 0 2−t 1 4u2 (t + 2u) dudt + (11.186) tuappr: [0 2] [0 2] 400 400 (12/179)*(3*t.*P) EZ2 = 4.188) E Z2 = 56673 15890 (11.*(u <= min(1+t.M).*(u<=max(2t. EZ = total(G.5898 EZ2 = total(G.
. [W. PZ. Z. MATHEMATICAL EXPECTATION Solution to Exercise 11.42 (p.3*v. and P G = t.*u. PX.. cy.8673 EW2 = (W. pmx.PX] = canonicf(cx.^2)*PW' EW2 = 426.P).pmy).pmx).PY] = canonicf(cy. PZ . pmy.^2 .8.^2 + 3*t.42: npr10_16) Data are in cx. Z. Y.8529 . u.. icalc3 input: X.PW] = csort(G. EW = W*PW' EW = 1. PZ [X.. 333) npr10_16 (Section~17. v. PX. PY. t. [Y. PY. Z.344 CHAPTER 11.Use array operations on matrices X.. Y.
Chapter 12

Variance, Covariance, Linear Regression

12.1 Variance¹

In the treatment of the mathematical expectation of a real random variable X, we note that the mean value locates the center of the probability mass distribution induced by X on the real line. In this unit, we examine how expectation may be used for further characterization of the distribution for X. In particular, we deal with the concept of variance and its square root, the standard deviation. In subsequent units, we show how it may be used to characterize the distribution for a pair {X, Y} considered jointly, with the concepts covariance and linear regression.

12.1.1 Variance

Location of the center of mass for a distribution is important, but provides limited information. Two markedly different random variables may have the same mean value. It would be helpful to have a measure of the spread of the probability mass about the mean. Among the possibilities, the variance and its square root, the standard deviation, have been found particularly useful.

Definition. The variance of a random variable X is the mean square of its variation about the mean value:

Var[X] = σ²_X = E[(X − µ_X)²]  where µ_X = E[X]   (12.1)

The standard deviation for X is the positive square root σ_X of the variance.

Remarks
• If X(ω) is the observed value of X, its variation from the mean is X(ω) − µ_X. The variance is the probability weighted average of the square of these variations.
• The square of the error treats positive and negative variations alike, and it weights large variations more heavily than smaller ones.
• As in the case of mean value, the variance is a property of the distribution, rather than of the random variable.
• We show below that the standard deviation is a natural measure of the variation from the mean.
• In the treatment of mathematical expectation, we show that E[(X − c)²] is a minimum iff c = E[X], in which case

E[(X − E[X])²] = E[X²] − E²[X]   (12.2)

This shows that the mean value is the constant which best approximates the random variable, in the mean square sense.

¹This content is available online at <http://cnx.org/content/m23441/1.6/>.
Basic patterns for variance

Since variance is the expectation of a function of the random variable X, we utilize properties of expectation in computations. In addition, we find it expedient to identify several patterns for variance which are frequently useful in performing calculations. For one thing, while the variance is defined as E[(X − µ_X)²], this is usually not the most convenient form for computation. The result quoted above gives an alternate, calculating expression.

(V1) Calculating formula. Var[X] = E[X²] − E²[X].

(V2) Shift property. Var[X + b] = Var[X]. Adding a constant b to X shifts the distribution (hence its center of mass) by that amount. The variation of the shifted distribution about the shifted center of mass is the same as the variation of the original, unshifted distribution about the original center of mass.

(V3) Change of scale. Var[aX] = a²Var[X]. Multiplication of X by the constant a changes the scale. The squares of the variations are multiplied by a². So also is the mean of the squares of the variations.

(V4) Linear combinations.

a. Var[aX ± bY] = a²Var[X] + b²Var[Y] ± 2ab(E[XY] − E[X]E[Y])   (12.3)

b. More generally,

Var[Σ_{k=1}^{n} a_k X_k] = Σ_{k=1}^{n} a_k² Var[X_k] + 2 Σ_{i<j} a_i a_j (E[X_i X_j] − E[X_i]E[X_j])   (12.4)

The term c_ij = E[X_i X_j] − E[X_i]E[X_j] is the covariance of the pair {X_i, X_j}, whose role we study in the unit on that topic. If the c_ij are all zero, we say the class is uncorrelated.

Remarks
• If the pair {X, Y} is independent, it is uncorrelated. The converse is not true, as examples in the next section show.
• If the a_i = ±1 and all the pairs are uncorrelated, then

Var[Σ_{k=1}^{n} a_i X_i] = Σ_{k=1}^{n} Var[X_i]   (12.5)

The variances add even if the coefficients are negative.

We calculate variances for some common distributions. Some details are omitted, usually details of algebraic manipulation or the straightforward evaluation of integrals. In some cases we use well known sums of infinite series or values of definite integrals. A number of pertinent facts are summarized in Appendix B, Some Mathematical Aids. The results below are included in the table in Appendix C.

Variances of some discrete distributions

1. Indicator function X = I_E, with P(E) = p, q = 1 − p, so that E[X] = p. Then

Var[X] = E[X²] − E²[X] = E[I_E²] − p² = E[I_E] − p² = p − p² = p(1 − p) = pq

2. Simple random variable X = Σ_{i=1}^{n} t_i I_Ai (primitive form), with P(A_i) = p_i. Since E[I_Ai I_Aj] = 0 for i ≠ j,

Var[X] = Σ_{i=1}^{n} t_i² p_i q_i − 2 Σ_{i<j} t_i t_j p_i p_j   (12.6)
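A quick numerical check of the calculating formula (V1) against this primitive-form expression can be run in MATLAB. The values and probabilities below are illustrative only, not taken from the text.

% Two computations of Var[X] for a simple random variable in primitive form.
t = [1 3 2 3 4];                % illustrative values on a partition {A_1,...,A_5}
p = [0.20 0.10 0.30 0.25 0.15]; % illustrative P(A_i), summing to one
q = 1 - p;
EX  = t*p';
VX1 = (t.^2)*p' - EX^2;         % (V1): Var[X] = E[X^2] - E^2[X]
VX2 = sum(t.^2.*p.*q);          % primitive-form expression (12.6)
for i = 1:4
    for j = i+1:5
        VX2 = VX2 - 2*t(i)*t(j)*p(i)*p(j);
    end
end
disp([VX1 VX2])                 % the two values agree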
11) t2 dt = a b3 − a3 (a + b) (b − a) b3 − a3 soVar [X] = − = 3 (b − a) 3 (b − a) 4 12 2. 2 =e −µ k=2 µk k (k − 1) + µ = e−µ µ2 k! ∞ k=2 µk−2 + µ = µ2 + µ (k − 2)! (12. Uniform on (a. Binomial (n. Var [IEi ] = i=1 pq = npq (12. p). Then the distribution is symmetric triangular (−c.10) Var [X] = µ2 + µ − µ2 = µ. t ≥ 0E [X] = 1/λ ∞ E X2 = 0 4. λt2 e−λt dt = ≥ 0E [X] = ∞ 2 2 sothatVar [X] = 1/λ λ2 α λ (12. q2 2 + q/p − (q/p) = q/p2 p2 (12. t2 fX (t) dt = 2 0 t2 fX (t) dt (12. c c Var [X] = E X 2 = −c Now.7) Geometric (p). λ)fX (t) = 1 α α−1 −λt e t Γ(α) λ t E X2 = Hence 1 Γ (α) λα tα+1 e−λt dt = 0 Γ (α + 2) α (α + 1) = λ2 Γ (α) λ2 (12.347 3. in this case. p.9) Poisson(µ)P (X = k) = e−µ µ ∀k ≥ 0 k! k Using E X 2 = E [X (X − 1)] + E [X].14) Gamma(α. c = (b − a) /2. X = n i=1 IEi with{IEi : 1 ≤ i ≤ n}iidP (Ei ) = p n n Var [X] = i=1 4. Note that both the mean and the variance have common value µ. Symmetric triangular (a.15) Var [X] = α/λ2 . we may center the where distribution at the origin. 346).8) Var [X] = 2 5. Some absolutely continuous distributions 1. c). b) Because of the symmetry Because of the shift property (V2) ("(V2)". c−t 0 ≤ t ≤ cso c2 that E X2 = 2 c2 c ct2 − t3 dt = 0 c2 (b − a) = 6 24 2 (12. P (X = k) = pq k ∀k ≥ 0E [X] = q/p We use a trick: E X 2 = E [X (X − 1)] + E [X] ∞ ∞ E X2 = p k=0 k (k − 1) q k +q/p = pq 2 k=2 k (k − 1) q k−2 +q/p = pq 2 2 (1 − q) 3 +q/p = 2 q2 +q/p p2 (12.13) Exponential (λ) fX (t) = λe−λt .12) fX (t) = 3. ∞ we have E X Thus. b)fX (t) = E X2 = 1 b−a 1 b−a a b < t < bE [X] = a+b 2 2 2 (12. .
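These closed forms are easy to check numerically. The sketch below uses a midpoint Riemann sum to approximate the exponential case; the grid spacing and truncation point are illustrative choices, and the gamma case may be checked the same way.

% Riemann-sum check of Var[X] = 1/lambda^2 for X exponential(lambda).
lambda = 0.3;
dt = 0.001;
t  = dt/2:dt:100;               % P(X > 100) is negligible for this lambda
w  = lambda*exp(-lambda*t)*dt;  % approximate probability weights
EX = t*w';
VX = (t.^2)*w' - EX^2;
disp([VX 1/lambda^2])           % both approximately 11.1111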
00 each.10. The rst pays $10.p.14.01*[8 11 6 13 5 8 12 7 14 16].16. Determine E [g (X)] and Var [g (X)] c = [3 1 2 3 4 1 1 2 3 2].8036 [Z. we calculate the mean for a variety of cases.pc).8036 % Original coefficients % Probabilities for C_j % g(c_j) % Direct calculation E[g(X)] % Direct calculation Var[g(X)] % Distribution for Z = g(X) % E[Z] % Var[Z] .348 CHAPTER 12. LINEAR REGRESSION Normal µ.19) P (Ci ) = 0.05. σ 2 E [X] = µ Consider 5. Then with P (A) = 0.EZ^2 VZ = 40. We may use MATLAB to perform the arithmetic. C} is independent (this assumption is not necessary Var [X] = 102 P (A) [1 − P (A)] + 82 P (B) [1 − P (B)] + 202 P (C) [1 − P (C)] Calculation is straightforward. 1) . 0.4200 VZ = (Z. 0.01*[15 20 10]. examples and calculate the variances.20. 0. We revisit some of those Example 12. 0.*q) VX = 58.9: Expectation of a function of ) from "Mathematical Expectation: Simple Random Variables") X X Suppose X in a primitive form is X = −3IC1 − IC2 + 2IC3 − 3IC4 + 4IC5 − IC6 + IC7 + 2IC8 + 3IC9 + 2IC10 with probabilities Let (12. 0.1: Expected winnings (Example 8 (Example 11.00 with probability 0.^2)*pc' .PZ] = csort(G.07. Y ∼ N (0. and the third $20.9900 Example 12. EZ = Z*PZ' EZ = 6. pc = 0.8: Expected winnings) from "Mathematical Expectation: Simple Random Variables") A bettor places three bets at $2. SOLUTION The net gain may be expressed X = 10IA + 8IB + 20IC − 6.*p. We may reasonbly suppose the class in computing the mean). Var [Y ] = √2 2π ∞ 2 −t2 /2 t e dt 0 = 1. E [Y ] = 0.4200 VG = (G. 0.11. (12. 0. 0. P (C) = 0.08. B.EG^2 VG = 40. p = 0. the second $8. P (B) = 0.^2.20.00 with probability 0.15. COVARIANCE. g (t) = t2 + 2t.12. VARIANCE.15. q = 1 .00 with probability 0. G = c. VX = sum(c.^2 + 2*c EG = G*pc' EG = 6.18) c = [10 8 20].10 (12.17) {A.06.08.13.^2)*PZ' .2: A function of (Example 9 (Example 11. 0.16) X = σY + µimpliesVar [X] = σ 2 Var [Y ] = σ 2 Extensions of some previous examples In the unit on expectations. (12.
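The function csort used above is part of the m-function library distributed with the text; it consolidates duplicate values of g(X) and adds the corresponding probabilities. For readers without the library, a base-MATLAB stand-in (a sketch only; the distributed version may differ in detail) reproduces the distribution for Z. The signed coefficients follow the indicator expansion X = −3I_C1 − I_C2 + 2I_C3 − 3I_C4 + 4I_C5 − I_C6 + I_C7 + 2I_C8 + 3I_C9 + 2I_C10 given in the example.

% Base-MATLAB stand-in for [Z,PZ] = csort(G,pc) in Example 12.2.
c  = [-3 -1 2 -3 4 -1 1 2 3 2];   % coefficients from the primitive form of X
pc = 0.01*[8 11 6 13 5 8 12 7 14 16];
G  = c.^2 + 2*c;                  % g(c_j) = c_j^2 + 2c_j
[Z,~,ix] = unique(G);             % distinct values of Z = g(X)
PZ = accumarray(ix(:), pc(:))';   % consolidated probabilities
EZ = Z*PZ';                       % reproduces EZ = 6.4200
VZ = (Z.^2)*PZ' - EZ^2            % reproduces VZ = 40.8036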
and P G = t. ∞ ANALYTIC SOLUTION E [g (X)] = g (t) fX (t) dt = 0 4 I[0.^2)*PZ' .22: A function with a compound denition: truncated exponential) from "Mathematical Expectation.3e−0.0562 (by Maple) (12. Y )) 2tu − 3u.10: Z = g (X.3t dt + 16E I(4.2 ≈ 100. t.349 Example 12.*P) % Direct calculation of E[g(X.3e−0.4: A function with compound denition (Example 12 (Example 11.EG^2 % Direct calculation of Var[g(X.∞] (X) 16 (12.2529 VZ = (Z.P). we must approximate by a bounded random variable.3). . PY.2133 Example 12.24) Var [Z] = E Z 2 − E 2 [Z] ≈ 43. Y )) Expectation for g (X. Y ) (Example 10 (Example 11.3: Z = g (X.22) Z 2 = I[0.3t dt + 256e−1.4] (X) X 4 + I(4.2133 [Z.Y)] EG = 3. u_j)] EG = total(G.*u .4] (t) t4 0.3t dt + 256E I(4. we use jcalc. % Determination of distribution for Z EZ = Z*PZ' % E[Z] from distribution EZ = 3.25) To obtain a simple aproximation. jdemo1 % Call for data jcalc % Set up Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.∞] (X) (12.4] (X) X 2 + I(4.23) E Z2 = 0 I[0.2529 VG = total(G.∞] (X) 256 ∞ 4 (12.^2. Y. General Random Variables") Suppose X∼ exponential (0. % Calculation of matrix of [g(t_i.3e−0.∞] (X) = 0 t4 0.EZ^2 % Var[Z] from distribution VZ = 80.Y)] VG = 80. u) = t2 + To set up for calculations.21) = 0 t2 0.3t dt + 16P (X > 4) ≈ 7. from "Mathematical Expectation: Simple Random Variables" and let Z = g (t. u.4] (t) t2 0.3*u.^2 + 2*t. PX. Since P (X > 50) = e−15 ≈ 3 · 10−7 we may safely truncate X at 50. Let Z={ Determine X2 16 for for X≤4 X>4 = I[0.*P) .20) E [Z] and Var [Z].PZ] = csort(G.4972 (by Maple) (12.10: Expectation for from "Mathematical Expectation: Simple Random Variables") We use the same joint distribution as for Example 10 (Example 11.8486 APPROXIMATION (12.3e−0.
% Distribution for Z = g(X) EZ = Z*PZ' % E[Z] from distribution EZ = 7.EG^2 % Var[g(X)] VG = 43.4) from "Problems on Functions of Random Variables")) from "Mathematical Expectation.29) mu = 50.8472 Example 12.3*t) Use row matrices X and PX as in the simple case M = X <= 4.3*exp(0. X = D (p − c) − (m − D) (c − r) = D (p − r) + m (r − c) D > m.PZ] = csort(G.t). c = 30. LINEAR REGRESSION tappr Enter matrix [a b] of xrange endpoints [0 50] Enter number of x approximation points 1000 Enter density as a function of t 0.^2)*PZ' . Thus X = IM (D) [D (p − r) + m (r − c)] + [1 − IM (D)] [D (p − s) + m (s − c)] = D (p − s) + m (s − c) + IM (D) [D (p − r) + m (r − c) − D (p − s) − m (s − c)] = D (p − s) + m (s − c) + IM (D) (s − r) [D − m] Then (12.8486 [Z.EZ^2 % Var[Z] VZ = 43.4972 VG = (G.M). % Approximate distribution for D . APPROXIMATION (12. pD = ipoisson(mu. If demand is less than the amount ordered.8472 % Theoretical = 43. p = 50. the remaining stock can be returned (or otherwise disposed of ) at unit ( r < c). dollars per unit ( s > c). where M = (−∞. X = m (p − c) + (D − m) (p − s) = D (p − s) + m (s − c) It is convenient to write the expression for X in terms of IM . % g(X) EG = G*PX' % E[g(X)] EG = 7. Then For For D ≤ m.*X.350 CHAPTER 12.28) E [X] = (p − s) E [D] + m (s − c) + (s − r) E [IM (D) D] − (s − r) mE [IM (D)] We use the discrete approximation. COVARIANCE. t = 0:n.4972 VZ = (Z. n = 100. General Random Variables") The manager of a department store is planning for the holiday season. Demand D r dollars per for the season is assumed to be a random variable with Poisson What amount distribution. r = 20.PX). G = M.5: Stocking for random demand (Example 13 (Example 11.26) (12. dollars per unit and sells for p A certain item costs dollars per unit. s = 40.27) (12. m should the manager (µ) order to maximize the expected prot? PROBLEM FORMULATION Suppose D is the demand and X is the prot.^2)*PX' . additional units can be special ordered for s If the demand exceeds the amount m c ordered.^2 + 16*(1 . VARIANCE. Suppose µ = 50. m].23: Stocking for random demand (see Exercise 4 (Exercise 10.
1075 0. u = 1 − t.0938 2.0757 APPROXIMATION (12.3343 0. ANALYTIC SOLUTION on the triangular region bounded by E [Z] = (t2 + 2tu) fXY (t.0153 0. u) = 3u u = 0.0167 0.30) E Z2 = 3 −1 0 u t2 + 2tu 2 1 1−t dudt + 3 0 0 u t2 + 2tu 2 dudt = 3/35 (12.7908 0.0160 0.31) Var [Z] = E Z 2 − E 2 [Z] = 53/700 ≈ 0.0174 0. SG =sqrt(VG).1t)) Use array operations on X. disp([EG'. Let Z = g (X. VG = (G.0179 Example 12.^2)*pD' .0130 0.3117 0. G(i. PX. General Random Variables") Suppose the pair {X.4869 0.m(i)).0943 1. PY.6799 0.0931 1.0145 0.351 c = 30.24: A jointly distributed pair) from "Mathematical Expectation. s = 40. p = 50.0122 0.0936 1.0112 0.0115 0. and P G = t.0929 3. u = 1 + t.2206 0.1561 0. Y.5637 0. m = 45:55. Y ) = X 2 + 2XY .32) tuappr Enter matrix [a b] of Xrange endpoints [1 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 400 Enter number of Y approximation points 200 Enter expression for joint density 3*u.6: A jointly distributed pair (Example 14 (Example 11.*(u<=min(1+t. u.*u. Y } has joint density fXY (t. for i = 1:length(m) % Step by step calculation for various m M = t<=m(i).^2 + 2*t.Y) = X^2 + 2XY .EG. Determine E [Z] and Var [Z].:) = (ps)*t + m(i)*(sc) + (sr)*M. % g(X.0137 0.0e+04 * 0. r = 20. end EG = G*pD'.VG'.0934 3.0108 0.0943 2.^2.0941 2. u) dudt 1 1−t 3 0 0 u (t2 + 2tu) dudt = 1/10 0 1+t = 3 0 −1 1+t 0 u (t2 + 2tu) dudt + (12.*(t .8880 0. t.0939 1.0942 1.0944 2.SG']') 1.
*P) . u) : max{t. G = t.*P) % E[g(X.^2. 1 0 1 1 0 1 2 1+t 1+t Reference to Figure 11.^2*PZ' .^2. u} ≤ 1} = {(t.PZ] = csort(G.*M + 2*u. E [W ] = 1 2 1 0 1 t dudt + 1−t 1 0 1 1 2 2u dudt + 1 2 1 2 2 1 3−t 2u dudt = 11/6 ≈ 1.*P) . and P M = max(t.=max(1t. u = 3 − t.9306 [Z..t1)))/2 Use array operations on X.EG^2 VG = 0.0765 Example 12. u) : t ≤ 1.34) E W2 = 1 2 t2 dudt + 1−t 4u2 dudt + 1 2 2 1 3−t 4u2 dudt = 103/24 t−1 (12. W ={ where on the square region bounded by u = 1 + t.8333 VG = total(G.^2)*PZ' . Y } > 1 = IQ (X. % Z = g(X. PX.9306 (12. E [W ] and Var [W ].P). % Distribution for Z EZ = Z*PZ' % E[Z] from distribution EZ = 1.33) Determine Q = {(t. PY.EG^2 VG = 0.352 CHAPTER 12. u ≤ 1}. u) = 1/2 u = 1 − t.2 shows three regions of integration.3t))& .Y)] EG = 1. (u$gt.*P) % E[g(X.. VARIANCE. LINEAR REGRESSION EG = total(G.M).*(1 .35) Var [W ] = 103/24 − (11/6) = 67/72 ≈ 0.7: A function with compound denition (Example 15 (Example 11. Y ) X + IQc (X.u)<=1.36) tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density ((u<=min(t+1. t.8340 VZ = (Z.PZ] = csort(G.8333 t−1 (12. Y. and u = t − 1. Y } has joint density fXY (t.0757 [Z.Y)] EG = 0.8340 % Theoretical 11/6 = 1. X 2Y for for max{X.3. u. General Random Variables") The pair {X. COVARIANCE.25: A function with a compound denition) from "Mathematical Expectation.1006 VZ = Z. Y ) 2Y (12.EZ^2 VZ = 0.1006 % Theoretical value = 1/10 VG = total(G.P).0765 % Theoretical value 53/700 = 0. ANALYTIC SOLUTION The intersection of the region Q and the square is the set for which 0 ≤ t ≤ 1 and 1 − t ≤ u ≤ 1. % Distribution for Z EZ = Z*PZ' % E[Z] from distribution EZ = 0.9368 .9368 % Theoretical 67/72 = 0.EZ^2 VZ = 0.Y) EG = total(G. Y } ≤ 1 max{X.
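The tuappr sessions above rely on the text's m-function library. As a sketch of what such a session does internally (grid sizes as in the text; details of the distributed tuappr may differ), the computation for Example 12.7 can be hand-rolled in base MATLAB.

% Hand-rolled grid approximation for Example 12.7.
nt = 200; nu = 200;
dt = 2/nt; du = 2/nu;
[t,u] = meshgrid(dt/2:dt:2, du/2:du:2);         % midpoints of the grid cells
f = 0.5*((u <= min(t+1,3-t)) & (u >= max(1-t,t-1)));
P = f*dt*du;  P = P/sum(P(:));                  % cell probabilities, normalized
G = t.*(max(t,u) <= 1) + 2*u.*(max(t,u) > 1);   % W with compound definition
EW = sum(sum(G.*P));                            % approximately 11/6 = 1.8333
VW = sum(sum(G.^2.*P)) - EW^2                   % approximately 67/72 = 0.9306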
we get Var [Z] = (2125t0 − 1309) /80 ≈ 0. and P G = (t+u <= 1). APPROXIMATION % Theoretical values t0 = (sqrt(5) .*P) EG = 0. Y ) The value Q = {(t.6180 EZ = (3/4)*(5*t0 2) EZ = 0.8169 VZ = (Z.0540 [Z.7225 VZ = (2125*t0 . EZ = Z*PZ' EZ = 0. E [X] = µ . σ 2 then Z = Var [X] = σ 2 X−µ σ ∼ N (0. EG = total(G.^2)*PZ' .6180.8176 EZ2 = (25*t0 . u) : u + t ≤ 1} meet satises (12. 0 3 (5t0 − 2) 4 (12.1)/20 EZ2 = 0.0540 Standard deviation and the Chebyshev inequality In Example 5 (Example 10. the standard deviation from the mean. t.*t + (t+u > 1).1)/2 t0 = 0.37) Z = IQ (X.*P) .EZ^2 VZ = 0.EG^2 VG = 0.39) t0 1 1−t 1 t2 E [Z] = 3 0 For t 0 dudt + 3 t0 t 0 dudt + 3 t0 1−t dudt = E Z 2 replace t by t2 in the integrands to get E Z 2 = (25t0 − 1) /20. Y.P).40) For the normal distribution. √ Using t0 = 5 − 1 /2 ≈ 0.38) t0 where the line u=1−t t2 and the curve u = t2 t2 = 1 − t0 .0540 % Theoretical = 0.^2) Use array operations on X. Also.0540. u.0540 tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density 3*(u <= t. we have the .8169 % Theoretical = 0.353 Example 12.PZ] = csort(G.5: The normal distribution and standardized normal distribution) from "Functions of a Random Variable.8176 VG = total(G. u) = 3 on 0 ≤ u ≤ t2 ≤ 1 for (12.1309)/80 VZ = 0.^2. For a general distribution with mean seems to be a natural measure of the variation away µ and variance σ2 . Y ) X + IQc (X.8: A function with compound denition fXY (t. Thus P X − µ ≤t σ = P (X − µ ≤ tσ) = 2Φ (t) − 1 σ (12. 1)." we show that if and X ∼ N µ.
Chebyshev inequality

P(|X − µ|/σ ≥ a) ≤ 1/a²,  or equivalently,  P(|X − µ| ≥ aσ) ≤ 1/a²   (12.41)

In this general case, the standard deviation appears as a measure of the variation from the mean value. This inequality is useful in many theoretical applications as well as some practical ones. However, since it must hold for any distribution which has a variance, the bound is not particularly tight. It may be instructive to compare the bound on the probability given by the Chebyshev inequality with the actual probability for the normal distribution.

t = 1:0.5:3;
p = 2*(1 - gaussian(0,1,t));
c = ones(1,length(t))./(t.^2);
r = c./p;
h = ['    t    Chebyshev   Prob     Ratio'];
m = [t;c;p;r]';
disp(h)
    t    Chebyshev   Prob     Ratio
disp(m)
   1.0000   1.0000   0.3173   3.1515
   1.5000   0.4444   0.1336   3.3263
   2.0000   0.2500   0.0455   5.4945
   2.5000   0.1600   0.0124  12.8831
   3.0000   0.1111   0.0027  41.1554

DERIVATION OF THE CHEBYSHEV INEQUALITY

Let A = {|X − µ| ≥ aσ} = {(X − µ)² ≥ a²σ²}. Then a²σ² I_A ≤ (X − µ)². Upon taking expectations of both sides and using monotonicity, we have

a²σ² P(A) ≤ E[(X − µ)²] = σ²   (12.42)

from which the Chebyshev inequality follows immediately.

We consider three concepts which are useful in many situations.

Definition. A random variable X is centered iff E[X] = 0.

X' = X − µ is always centered.   (12.43)

Definition. A random variable X is standardized iff E[X] = 0 and Var[X] = 1.

X* = (X − µ)/σ = X'/σ is standardized.   (12.44)

Definition. A pair {X, Y} of random variables is uncorrelated iff

E[XY] − E[X]E[Y] = 0   (12.45)

It is always possible to derive an uncorrelated pair as a function of a pair {X, Y}, both of which have finite variances. Consider

U = X* + Y*,  V = X* − Y*,  where X* = (X − µ_X)/σ_X, Y* = (Y − µ_Y)/σ_Y   (12.46)
Y} not uncorrelated VX = total(t.48) jdemo1 jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.10: Simple Random Variables" and Example 12.*P) EXY = 0.6566 SX = sqrt(VX) SX = 1.9755e06 % Differs from zero because of roundoff .*vv.EX*EY c = 0.*P) .1130 c = EXY .EY^2 VY = 3.EX)/SX. Y. uu = x + y. Y ) Expectation for Z = g (X. 2 − E (Y ∗ ) 2 =1−1=0 (12.^2. for which E [XY ] − E [X] E [Y ] = 0 (12.*P) % Check for uncorrelated condition EUV = 9.EX^2 VX = 3. PX.3016 VY = total(u. and P EX = total(t.3 ( Z = g (X.0783 EXY = total(t. Y )) from "Mathematical Expectation: Simple Random Variables")). % Uncorrelated random variables vv = x .*P) . Y )) Z = g (X. u.6420 EY = total(u.EY)/SY.*u. t.^2.1633 % {X.*P) EY = 0.9122 x = (t . PY.9: Determining an uncorrelated pair We use the distribution for Examples Example 10 (Example 11. % Standardized random variables y = (u .y.*P) EX = 0.355 Now E [U ] = E [V ] = 0 and E [U V ] = E (X ∗ + Y ∗ ) (X ∗ − Y ∗ ) = E (X ∗ ) so the pair is uncorrelated.47) Example 12.10: Expectation for from "Mathematical Expectation: (Example 10 (Example 11. EUV = total(uu.8170 SY = sqrt(VY) SY = 1.
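A simulation gives a feel for how conservative the Chebyshev bound is for a skewed distribution. The sketch below samples an exponential distribution by the inverse-cdf method; the sample size is an illustrative choice.

% Monte Carlo check of P(|X - mu| >= a*sigma) <= 1/a^2, X exponential(1).
lambda = 1;                          % mu = sigma = 1/lambda here
X  = -log(rand(1,100000))/lambda;    % exponential sample by inverse cdf
mu = 1/lambda;  sigma = 1/lambda;
a  = 1:0.5:3;
pemp = zeros(size(a));
for k = 1:length(a)
    pemp(k) = mean(abs(X - mu) >= a(k)*sigma);
end
disp([a; 1./a.^2; pemp]')            % the bound (column 2) dominates column 3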
12.2 Covariance and the Correlation Coefficient²

12.2.1 Covariance and the Correlation Coefficient

The mean value µ_X = E[X] and the variance σ²_X = E[(X − µ_X)²] give important information about the distribution for real random variable X. Can the expectation of an appropriate function of (X, Y) give useful information about the joint distribution? A clue to one possibility is given in the expression

Var[X ± Y] = Var[X] + Var[Y] ± 2(E[XY] − E[X]E[Y])   (12.49)

The expression E[XY] − E[X]E[Y] vanishes if the pair is independent (and in some other cases). We note also that for µ_X = E[X] and µ_Y = E[Y]

E[(X − µ_X)(Y − µ_Y)] = E[XY] − µ_X µ_Y   (12.50)

To see this, expand the expression (X − µ_X)(Y − µ_Y) and use linearity to get

E[(X − µ_X)(Y − µ_Y)] = E[XY − µ_Y X − µ_X Y + µ_X µ_Y] = E[XY] − µ_Y E[X] − µ_X E[Y] + µ_X µ_Y   (12.51)

which reduces directly to the desired expression. Now for given ω, X(ω) − µ_X is the variation of X from its mean and Y(ω) − µ_Y is the variation of Y from its mean. For this reason, the following terminology is used.

Definition. The quantity Cov[X, Y] = E[(X − µ_X)(Y − µ_Y)] is called the covariance of X and Y.

If we let X' = X − µ_X and Y' = Y − µ_Y be the centered random variables, then

Cov[X, Y] = E[X'Y']   (12.52)

Note that the variance of X is the covariance of X with itself.

If we standardize, with X* = (X − µ_X)/σ_X and Y* = (Y − µ_Y)/σ_Y, we have

Definition. The correlation coefficient ρ = ρ[X, Y] is the quantity

ρ[X, Y] = E[X*Y*] = E[(X − µ_X)(Y − µ_Y)]/(σ_X σ_Y)   (12.53)

Thus ρ = Cov[X, Y]/(σ_X σ_Y). We examine these concepts for information on the joint distribution.

By Schwarz' inequality (E15), we have

ρ² = E²[X*Y*] ≤ E[(X*)²] E[(Y*)²] = 1, with equality iff Y* = cX*   (12.54)

Now equality holds iff

1 = c² E²[(X*)²] = c², which implies c = ±1 and ρ = ±1   (12.55)

We conclude −1 ≤ ρ ≤ 1, with ρ = ±1 iff Y* = ±X*.

Relationship between ρ and the joint distribution

We consider first the distribution for the standardized pair (X*, Y*). Since

P(X* ≤ r, Y* ≤ s) = P((X − µ_X)/σ_X ≤ r, (Y − µ_Y)/σ_Y ≤ s) = P(X ≤ t = σ_X r + µ_X, Y ≤ u = σ_Y s + µ_Y)   (12.56)

²This content is available online at <http://cnx.org/content/m23460/1.5/>.
Y ) distribution are: u − µY t − µX =± σY σX Consider or u=± 2 σY (t − µX ) + µY σX . Then E 1 2 2Z = ∗ 1 2E (Y ∗ − X ∗ ) ∗ Reference to Figure 12.60) . s) = (X ∗ .1 shows this is the average of the square of the distances of the points about the line Similarly for (r. Figure 12.e. Y ) by the mapping t = σX r + µX u = σY s + µY Joint distribution for the standardized variables (12. (12.57) (X ∗ . s) = (X . Y ∗ ).357 we obtain the results for the distribution for (X.58) Z = Y ∗ − X ∗.59) ∗ 1 1 2 2 2 E (Y ∗ ± X ∗ ) = {E (Y ∗ ) + E (X ∗ ) ± 2E [X ∗ Y ∗ ]} = 1 ± ρ 2 2 Thus 1−ρ 1+ρ Now since is the variance about is the variance about s = r (the ρ = 1 line) s = −r (the ρ = −1 line) E (Y ∗ − X ∗ ) 2 = E (Y ∗ + X ∗ ) 2 i ρ = E [X ∗ Y ∗ ] = 0 (12. Y ∗ ) (ω) s = r. Now (12. E W 2 /2 is the variance about s = −r. the variance W = Y + X .s) to the line s = r. Y ∗ ) (ω) from the line s = r (i. The ρ = ±1 lines for the (X. (r. s = r). s = −r. ρ=1 ρ = −1 If i i X∗ = Y ∗ X ∗ = −Y ∗ i all probability mass is on the line i all probability mass is on the line −1 < ρ < 1..1: Distance from point (r. then at least some of the mass must fail to be on these lines.
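The identities 1 − ρ = (1/2)E[(Y* − X*)²] and 1 + ρ = (1/2)E[(Y* + X*)²] are easy to illustrate by simulation. In the sketch below the correlation value and sample size are illustrative choices.

% Half mean square distances of (X*,Y*) from the lines s = r and s = -r.
rho = 0.6;  n = 100000;
X = randn(1,n);
Y = rho*X + sqrt(1-rho^2)*randn(1,n);   % standardized pair with E[X*Y*] = rho
d1 = mean((Y-X).^2)/2;                  % variance about the s = r line
d2 = mean((Y+X).^2)/2;                  % variance about the s = -r line
disp([d1 1-rho; d2 1+rho])              % each pair agrees approximately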
By symmetry.63) 1 − ρ is proportional to the variance abut the ρ = 1 line and 1 + ρ ρ = −1 line. the pair cannot be independent. if desired. . Y ] = 0. ρ = 0 i the variances about both are the same.10: Uncorrelated but not independent Suppose the joint density for test. LINEAR REGRESSION ρ=0 is the condition for equality of the two variances. By the rectangle ρ = 1 line is u = t and the ρ = −1 line is the variance about each of these lines is the same.11: Uniform marginal distributions Figure 12. Y ) plane u = σY s + µY r= t − µX σX s= u − µY σY (12. Example 12. This fact can be veried by calculation.62) ρ = −1 line is: u − µY t − µX =− σY σX or u=− (12. Y } is constant on the unit circle about the origin. COVARIANCE.358 CHAPTER 12. VARIANCE.61) t = σX r + µX The ρ=1 line is: u − µY t − µX = σY σX The or u= σY (t − µX ) + µY σX σY (t − µX ) + µY σX (12. which is true i Cov [X.2: Uniform marginals but dierent correlation coecients. the condition Transformation to the (X. By symmetry. also. {X. Example 12. the is proportional to the variance about the u = −t. Thus ρ = 0.
(in fact the pair is independent) and E [XY ] = 0 Example 12.8170 so that and σY = 1. so that in each case E [X] = E [Y ] = 0 This means the and Var [X] = Var [Y ] = 1/3 line is (12. (1.4216 σX σY 30 (12.0).04699. Example 12. Y. and Var [Y ] = 3/80.1). By symmetry. 0 ≤ u ≤ 1 5 E [X] = 2/5.9122 (12.66) To complete the picture we need E [XY ] = Then 6 5 t2 u + 2tu2 dudt = 8/25 (12.67) Cov [X.13: An absolutely continuous pair The pair by 6 {X.4 . Since 1 − ρ < 1 + ρ.1).359 Consider the three distributions in Figure 12.*P) EX = 0.64) ρ=1 line is u=t and the ρ = −1 u = −t. t.4012 % Theoretical = 0." In that example calculations show E [XY ] − E [X] E [Y ] = −0. the variance about the ρ = 1 line is less than that about the ρ = −1 line. PX. in the rst and third quadrants with vertices (0. u) = 5 (t + 2u) on the triangular region bounded t = 0.12: A pair of simple random variables With the aid of mfunctions and MATLAB we can easily caluclate the covariance and the correlation coecient. Y ] = 10 ≈ 0. distribution is uniform over two squares.1). the variance about the ρ = −1 line is less than that about the ρ = 1 line. E [XY ] < 0 and ρ < 0. Y ] .1) and (0. so c. u = t. (0.65) ρ = −0.1).0).2. (0. we have fX (t) = From this we obtain 6 1 + t − 2t2 . ρ = 0. Var [X] = 3/50. the the square centered at the origin with vertices at (1. (1.1). For every pair of possible values.0). (1. 1 0 t 1 (12.9: Determining an uncorrelated pair) in "Variance. σX = 1.1). b. In case (c) the two squares are in the second and fourth quadrants. The actual value may be calculated to give ρ = 3/4. (1. the distribution is uniform over In case (b). The marginals are uniform on (1. 0 ≤ t ≤ 1 and fY (u) = 3u2 . Y } has joint density function fXY (t. (1. E [XY ] > 0 which implies ρ > 0. E [Y ] = 3/4.1). PY.1633 = Cov [X. and P EX = total(t.1) in each case. u. the two signs must be the same. a. (1.*(u>=t) Use array operations on X. Again.68) tuappr Enter matrix [a b] of Xrange endpoints [0 1] Enter matrix [c d] of Yrange endpoints [0 1] Enter number of X approximation points 200 Enter number of Y approximation points 200 Enter expression for joint density (6/5)*(t + 2*u). Since 1 + ρ < 1 − ρ.0). (1. Y ] = E [XY ] − E [X] E [Y ] = 2/100 APPROXIMATION and ρ= 4√ Cov [X. We use the joint distribution for Example 9 (Example 12. and u = 1. In case (a). By the usual integration techniques. This is evident from the gure. examination of the gure conrms this.
69) ρ = 0.0603 VY = total(u.4212 Coecient of linear correlation The parameter of linear correlation. coecient The following example shows that all probability mass may be on a curve.4216 A more descriptive name would be EY = total(u. Example 12.06 % Theoretical = 0.75 % Theoretical = 0.^2.e. − 1 < t < 1 1 2 1 and Let Y = g (X) = cosX .72) Y.j ai aj Cov (Xi . so that Y = g (X) but ρ = 0 fX (t) = 1/2. X) = i. p. In this case the integrand tg (t) is odd.j n Var (X) = Cov (X. Consider the linear combinations n m X= i=1 We wish to determine ai Xi and Y = j=1 bj Yj (12.j (12. Xj ) (12. Y ) = E X ' Y ' = E i.02 % Theoretical = 0. LINEAR REGRESSION % Theoretical = 0.0375 % Theoretical = 0. E [X] = 0.*P) . the value of Y is completely determined by the value of X ).71) n n n n X' = i=1 and similarly for ai Xi − i=1 ai µXi = i=1 ai (Xi − µXi ) = i=1 ' ai Xi (12. It is convenient to work Y ' = Y − µy . Y ] and Var [X]. Then Cov [X. yet ρ = 0.0376 CV = total(t.*P) . Note that g could be any even function dened on (1. Xi ) + i i=j ai aj Cov (Xi .1). Y = g (X) ρ is usually called the correlation coecient. Variance and covariance for linear combinations We generalize the property (V4) ("(V4)".7496 VX = total(t.^2. 346) on linear combinations.14: Suppose X∼ uniform (1.EX*EY CV = 0.73) i. VARIANCE.70) X ' = X − µX and Cov [X. Y ] = E [XY ] = Thus tcos t dt = 0 −1 (12. Yj ) i.*u.360 CHAPTER 12. so that the value of the integral is zero. Xj ) = i=1 a2 Cov (Xi .*P) EY = 0.EX^2 VX = 0. so that (i. Since by linearity of expectation.74) .1). COVARIANCE. ' By denition Cov (X.j In particular ' ai bj Xi Yj' = ' ai bj E Xi Yj' = ai bj Cov (Xi .0201 rho = CV/sqrt(VX*VY) rho = 0. n m with the centered random variables µX = i=1 we have ai µXi and µY = j=1 bj µYj (12..EY^2 VY = 0.*P) .
Using the fact that a_i a_j Cov(X_i, X_j) = a_j a_i Cov(X_j, X_i), we have

Var[X] = Σ_{i=1}^{n} a_i² Var[X_i] + 2 Σ_{i<j} a_i a_j Cov(X_i, X_j)   (12.75)

Note that a_i² does not depend upon the sign of a_i. If the X_i form an independent class, or are otherwise uncorrelated, the expression for variance reduces to

Var[X] = Σ_{i=1}^{n} a_i² Var[X_i]   (12.76)

12.3 Linear Regression³

12.3.1 Linear Regression

Suppose that a pair {X, Y} of random variables has a joint distribution. A value X(ω) is observed. It is desired to estimate the corresponding value Y(ω). Obviously there is no rule for determining Y(ω) unless Y is a function of X. The best that can be hoped for is some estimate based on an average of the errors, or on the average of some function of the errors.

Suppose X(ω) is observed, and by some rule an estimate Ŷ(ω) is returned. The error of the estimate is Y(ω) − Ŷ(ω). The most common measure of error is the mean of the square of the error

E[(Y − Ŷ)²]   (12.77)

The choice of the mean square has two important properties: it treats positive and negative errors alike, and it weights large errors more heavily than smaller ones. In general, we seek a rule (function) r such that the estimate Ŷ(ω) is r(X(ω)). That is, we seek a function r such that

E[(Y − r(X))²] is a minimum   (12.78)

The problem of determining such a function is known as the regression problem. In the unit on Regression (Section 14.1.5: The regression problem), we show that this problem is solved by the conditional expectation of Y, given X. At this point, we seek an important partial solution.

The regression line of Y on X

We seek the best straight line function for minimizing the mean squared error. That is, we seek a function of the form u = r(t) = at + b. The problem is to determine the coefficients a, b such that

E[(Y − aX − b)²] is a minimum   (12.79)

We write the error in a special form, then square and take the expectation.

Error = Y − aX − b = (Y − µ_Y) − a(X − µ_X) + µ_Y − aµ_X − b = (Y − µ_Y) − a(X − µ_X) − β, where β = b − (µ_Y − aµ_X)   (12.80)

Error squared = (Y − µ_Y)² + a²(X − µ_X)² + β² − 2β(Y − µ_Y) + 2aβ(X − µ_X) − 2a(Y − µ_Y)(X − µ_X)   (12.81)

³This content is available online at <http://cnx.org/content/m23468/1.5/>.
1100 % The regression line is u = 0. For certain theoretical purposes.24: A jointly distributed pair) from "Mathematical Expectation. Y ] Var [X] b = µY − aµX regression line of Y on X.0495 b = EY .11 Example 12. PY. Y ) (Example 10 (Example 11. There is no need to determine Var [Y ] or ρ. region bounded by E [X] = E [XY ] = 0. PX. and P EX = total(t.1633 a = CV/VX a = 0. this is the rst form is usually the more convenient.84) u= Cov [X. General Random Variables")) from "Variance" Suppose the pair {X.3016 CV = total(t.83) a= Thus the optimum line. COVARIANCE. u = 1 + t.85) Note that the pair is uncorrelated.*P) .a*EX b = 0.82) Standard procedures for determining a minimum (with respect to a) show that this occurs for (12. but by the rectangle test is not independent. 1 so Cov [X. is (12.^2. the approximation procedure is not very satisfactory unless a very large number of approximation points are employed.3: Z = g (X. Only the covariance (which requres both means) and the variance of X are needed. the preferred form.6420 EY = total(u. Y. 1−u 1 The regression curve is u = E [Y ] = 3 0 u2 u−1 dtdu = 6 0 u2 (1 − u) du = 1/2 (12. Example 12. Y ] = 0. Y } has joint density fXY (t. Y )) from "Mathematical Expectation: Simple Random Variables")) from "Variance" jdemo1 jcalc Enter JOINT PROBABILITIES (as on the plane) P Enter row matrix of VALUES of X X Enter row matrix of VALUES of Y Y Use array operations on matrices X.10: Expectation for Z = g (X.0783 VX = total(t.EX^2 VX = 3. u. Determine the regression line of Y on X. LINEAR REGRESSION E (Y − aX − b) 2 2 2 = σY + a2 σX + β 2 − 2aCov [X. With zero values of E [X] and E [XY ]. ANALYTIC SOLUTION By symmetry. t. But for calculation. .*P) EY = 0.*P) EX = 0. called the Cov [X. u) = 3u on the triangular u = 0.*P) . Y ] (12. u = 1 − t.EX*EY CV = 0.362 CHAPTER 12. Y ] σY (t − µX ) + µY = ρ (t − µX ) + µY = α (t) Var [X] σX The second form is commonly used to dene the regression line.*u.15: The simple pair of Example 3 (Example 12.0495t + 0.6: A jointly distributed pair (Example 14 (Example 11. VARIANCE.16: The pair in Example 6 (Example 12.
what is the best meansquare linear estimate of Y (ω)? region on X. 0 ≤ u ≤ Regression line for Example 12.11: Marginal distribution with compound expression) from "Random Vectors and MATLAB" and Example 12 (Example 10.17 (Distribution of Example 5 (Example 8. Determine the regression line of Y is observed. t} (see Figure Figure 12. Y } has joint density fXY (t.26: Continuation of Example 5 (Example 8.5: Marginals for a discrete distribution) from "Random Vectors and Joint Distributions") from "Function of Random Vectors" The pair 6 {X.86) The other quantities involve integrals over the same regions with appropriate integrands.1 . as follows: Quantity Integrand Value 779/370 127/148 E X 2 t + 2t u tu + 2u2 t u + 2tu 2 2 3 2 E [Y ] E [XY ] 232/185 Table 12.3).17: Distribution of Example 5 (Example 8. Figure 12. If the value X (ω) = 1.11: Marginal distribution with compound expression) from "Random Vectors and MATLAB" and Example 12 (Example 10. u) = 37 (t + 2u) on the max{1.7 0 ≤ t ≤ 2.5: Marginals for a discrete distribution) from "Random Vectors and Joint Distributions") from "Function of Random Vectors").3: ANALYTIC SOLUTION E [X] = 6 37 1 0 0 1 t2 + 2tu dudt + 6 37 2 1 0 t t2 + 2tu dudt = 50/37 (12.26: Continuation of Example 5 (Example 8.363 Example 12.
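The approximate computation that follows uses tuappr from the text's m-function library. For readers without it, the same regression-line calculation can be set up directly in base MATLAB; the sketch below uses the grid sizes of the session that follows, and its details are an assumption about what tuappr does internally.

% Hand-rolled grid approximation of the regression line for Example 12.17.
nt = 400; nu = 400;
dt = 2/nt; du = 2/nu;
[t,u] = meshgrid(dt/2:dt:2, du/2:du:2);
f = (6/37)*(t + 2*u).*(u <= max(t,1));    % density on the region
P = f*dt*du;  P = P/sum(P(:));            % cell probabilities, normalized
EX = sum(sum(t.*P));  EY = sum(sum(u.*P));
VX = sum(sum(t.^2.*P)) - EX^2;
CV = sum(sum(t.*u.*P)) - EX*EY;
a = CV/VX;  b = EY - a*EX;                % approximately a = 0.3382, b = 0.4011
y = 1.7*a + b                             % best linear estimate at X = 1.7, about 0.9760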
This interpretation should not be pushed too far.8581 VX = total(t. u.3 for an approximate plot).87) a = Cov [X. but is a common interpretation. If ρ = 0. then E Y−Y ρ2 is interpreted as the fraction of uncertainty removed by the linear rule and X. COVARIANCE. t.*P) EY = 0.*P) . More general linear regression . often found in the discussion of observations or experimental results.4011 y = 1.EX^2 VX = 0.8594 % Theoretical = 0. Y ] = 232 50 127 1293 − · = 185 37 148 13690 (12.3394 % Theoretical = 0.^2.a*EX b = 0. (12.1)) Use array operations on X.9776 % Theoretical = 0.0947 % Theoretical = 0.EX*EY CV = 0.*(u<=max(t. Y ] /Var [X] = The regression line is ^ sense) is 1293 6133 ≈ 0.89) The analysis above shows the minimum mean squared error is given by E Y−Y ^ =E 2 Y −ρ σY (X − µX ) − µY σX 2 2 = σY E (Y ∗ ) − 2ρX ∗ Y ∗ + ρ2 (X ∗ ) ^ 2 2 = σY 1 − 2ρ2 + ρ2 = σY 1 − ρ2 (12.3382 b = EY . If X (ω) = 1.7a + b = 0.7*a + b y = 0.3382.9760 An interpretation of ρ2 2 2 2 = σY E (Y ∗ − ρX ∗ ) 2 (12. Y. b = E [Y ] − aE [X] = ≈ 0. Then.*u.0944 a = CV/VX a = 0.7. PY.90) 2 2 = σY .2790 % Theoretical = 0. and P EX = total(t.*P) .3517 % Theoretical = 1.3514 EY = total(u. the mean squared error in the case of zero linear correlation.364 CHAPTER 12. VARIANCE.*P) EX = 1.4006 % Theoretical = 0.4011 3823 15292 u = at + b. the best linear estimate (in the mean (see Figure 12. PX. LINEAR REGRESSION Then Var [X] = and 779 − 370 50 37 2 = 3823 13690 Cov [X.2793 CV = total(t.88) square Y (ω) = 1.9760 APPROXIMATION tuappr Enter matrix [a b] of Xrange endpoints [0 2] Enter matrix [c d] of Yrange endpoints [0 2] Enter number of X approximation points 400 Enter number of Y approximation points 400 Enter expression for joint density (6/37)*(t+2*u).
1. · · · . so that 2 2 (12. E [Y ] = a0 + a1 E [X1 ] + · · · + an E [Xn ] E [Y Xi ] = a0 E [Xi ] + a1 E [X1 Xi ] + · · · + an E [Xn Xi ] i = 1. Cov [Xi .97) n equations plus equation (1) may be solved alagebraically for the ai . with X0 = 1. · · · . 1. we have ai = Cov [Y. if Consider (12. Xi are uncorrelated (i. .s.e. n. we obtain n + 1 linear equations in the n + 1 unknowns a0 . as follows.95) E [(Y − U ) V ] = 0 for all V of the form above. Now.92) W =Y −U and let d2 = E W 2 2 . then E (Y − U ) 2 is a minimum.93) d2 ≤ E (W + αV ) If we select the special = d2 + 2αE [W V ] + α2 E V 2 α=− This implies E [W V ] E [V 2 ] then 0≤− 2E[W V ] E[W V ] 2 + 2 E V E [V 2 ] E[V 2 ] E [W V ] = 0. equivalently n E [Y V ] = E [U V ] To see this. E (Y − V ) Since 2 = E (Y − U + U − V ) is of the same form as 2 = E (Y − U ) 2 + E (U − V ) 2 2 + 2E [(Y − U ) (U − V )] (12. Hence. 2. X1 . form {Y. successively. X2 . Xi ] + a2 Cov [X2 .3)). Xi ] Var [Xi ] 1≤i≤n (12. set V of the form V = i=0 ci Xi (12.. 2 which can only be satised by E [Y V ] = E [U V ] On the other hand.365 Consider a jointly distributed class. Xi ] + · · · + an Cov [Xn . a1 .91) U satises this minimum condition. for any α (12.99) {Xi : 1 ≤ i ≤ n} is iid as in the case of a simple random sample (see the section on "Simple Random Samples and Statistics (Section 13.98) and a0 = E [Y ] − a1 E [X1 ] − a2 E [X2 ] − · · · − an E [Xn ] In particular. Xn . We wish to deterimine a function U of the n U= i=0 If ai Xi . Xn }. this condition holds if the class (12. then E [(Y − U ) V ] = 0.96) U −V V. X1 . with zero value i If we take V to be U − V = 0 a. E (Y − V ) is a minimum when V = U . · · · .94) E[W V ] ≤ 0. for all or. The second term is nonnegative. · · · . such that E (Y − U ) 2 is a minimum (12. Xj ] = 0 for i = j ). Xi ] = a1 Cov [X1 . we take for 1≤i≤n For each (2) − E [Xi ] · (1) and use the calculating expressions for variance and covariance to get Cov [Y. 2. The rst term is xed. Xi ] These In the important special case that the (12. X2 . an . the last term is zero.
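Once the normal equations are written in matrix form, MATLAB solves them with the backslash operator. Example 12.18 below reports the results of such a solution without displaying the code; the following is a sketch of the computation for that system.

% Normal equations of Example 12.18 as a linear system A*[a0;a1;a2] = y.
A = [1 2 3;    % (1):  a0 + a1*E[X1] + a2*E[X2] = E[Y]
     0 3 1;    % a1*Var[X1] + a2*Cov[X1,X2] = Cov[Y,X1]
     0 1 8];   % a1*Cov[X1,X2] + a2*Var[X2] = Cov[Y,X2]
y = [3; 5; 7];
a = A\y        % gives a0 = -1.9565, a1 = 1.4348, a2 = 0.6957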
) (See Exercise 1 (Exercise 7. 1.1) from "Problems on Distribution and Density Functions ".50. $5.12) from "Problems on Random Variables and Probabilities". C. E [X1 ] = 2.08. 0.100) Solution of these simultaneous linear equations with MATLAB gives the results a2 = 0. 0. Suppose E [Y ] = 3.14. above.0083. with probabilities 0. is a partition. (Solution on p. $5. The class {A.30: values npr07_01)). 0. 3. 0. Var [X]. B.8. 0.09.4 in a national park there are 127 lightning strikes.366 CHAPTER 12.00. 3.5IC8 Determine (12. Exercise 12.8. with X1 = X .0IC6 + 3.4) from "Problems on Mathematical Expectation"). and $7. COVARIANCE. A customer comes in.0IC5 + 5. the result agrees with that obtained in the treatment of the regression line. a0 = b. Random variable X has respectively. which counts the number of these events which occur on a trial. Example 12.09.50. 0. and Exercise 2 (Exercise 11. 0. 4. (Solution on p. .50.1) from "Problems on Mathematical Expectation".3) from "Problems on Mathematical Expectation". (Solution on p. The random variable expressing the amount of her purchase may be written X = 3. mle npr07_01.) Exercise 12.m (Section 17. respectively.13. Determine Var [X]. Cov [Y. 12. and a1 = a.) Exercise 12. 0.00.05. 0.org/content/m24379/1.50. X2 ] = 7. Cov [Y. 2} C1 {Cj : 1 ≤ j ≤ 10} through C10 . mle npr07_02.9565.15.11. 0. and Exercise 1 (Exercise 11.2 (See Exercise 2 (Exercise 7.) Experience shows that the probability of each (See Exercise 4 (Exercise 11. 4 This content is available online at <http://cnx.4/>. X2 ] = 1. 0.15. $5. D} has minterm probabilities pm = 0. The class on {1.5IC3 + 7. 5. 374.18: Linear regression with two variables. VARIANCE.31: npr07_02)). The prices are $3.5IC1 + 5.6957.5IC4 + 5. 2.12.50.8.11.10 0.28: npr06_12)).5IC7 + 7. Then the three equations are a0 0 0 a0 = −1. X1 ] = 5. 374.m (Section 17. Var [X1 ] = 3.4348. 0.20. 3. $3.3 (See Exercise 12 (Exercise 6.102) X = IA + IB + IC + ID . 0. Linear Regression4 Exercise 12. 374. and + + + 2a2 3a1 1a1 + + + 3a3 1a2 8a2 = = = 3 5 7 (12. a1 = 1.06.2) from "Problems on Mathematical Expectation".07. 2.1 (Solution on p. She purchases one of the items with probabilities 0.15.00. and Cov [X1 . 374. $3. Determine Var [X].10.10 0. In a thunderstorm lightning strike starting a re is about 0. E [X2 ] = 3.2) from "Problems on Distribution and Density Functions ".4 Problems on Variance. 0.m (Section 17.001 ∗ [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] Consider Determine (12.101) Var [X]. Covariance. $7. mle npr06_12.0IC2 + 3. LINEAR REGRESSION Examination shows that for n = 1. A store has eight items for sale. Exercise 3 (Exercise 11. Var [X2 ] = 8.
Determine (See Exercise 7 (Exercise 11.) Exercise 12.4.1675 0.0420 · · · (12.5) from "Problems on Mathematical Expectation"). (Solution on p.7) from "Problems on Mathematical Expectation").5 ipped twenty times.0475 0. 0. F } is independent. {E.106) E [X] and Var [X].6) from "Problems on Mathematical Expectation"). F. 374.14 0. and Var [Z]. Calculate b. 374. C.367 Exercise 12.1125 0. A player pays $10 to play. Var [X].06 0. Y = −3ID + 4IE + IF − 7 (12.) Exercise 12.3IA − 1.43.m (Section 17.103) Var [X]. Calculate E [W ] and Var [W ].105) 0.8 (See Exercise 24 (Exercise 7.0480 0.11 Consider a second random variable The class Y = 10IE +17IF +20IG −10 in addition to that in Exercise 12.6ID − 3.2.0120 0.09 0.) (See Exercise 5 (Exercise 11.1680 0. (Solution on p. Use properties of expectation and variance to obtain that it is b.37.) A residential (See Exercise 6 (Exercise 11. 0. 0. E.0720 0.0130 0.21 0.46. G} has minterm probabilities (in mle npr12_10.104) a.24) from "Problems on Distribution and Density Functions" is a random variable T ∼ gamma (20. 374. College plans to raise money by selling chances on a board. Fifty chances are sold.10 Consider X = −3. E [Y ]. The number of X having Poisson (7) Var [X]. Var [X].8. 0.) The total operating Exercise 12.107) .6 (Solution on p.0280 0.) The class Exercise 12. he or she wins $30 with probability p = 0.8. D} has minterm probabilities (data are in mle npr12_10. 0. (Solution on p.21] (12. and Var [Y ].14 0. 374. {A.0430 a.3IC + 7. Let not E [X].m (Section 17.9 The class {A.) Exercise 12.0180 0.0002).7IB + 2. with respective probabilities 0. Z = 3Y − 2X . Note necessary to obtain the distributions for X or Y.7 noise pulses arriving on a power circuit in an hour is a random quantity distribution. Determine where is the number of winners (12. Determine Var [T ].06 0. (Solution on p.45. Let (12. and Exercise 8 (Exercise 11. Exercise 12. 375.09 0.8) from "Problems on Mathematical Expectation").24) from "Problems on Distribution and Density Functions".1120 0. D.53.0270 0. 375. W = 2X 2 − 3X + 2. time for the units in Exercise 24 (Exercise 7.43: npr12_10)) pmy = [0. Determine E [Z]. C.10. (Solution on p. B. 0. Let X = 6IA + 13IB − 8IC . X Two coins are Determine be the number of matches (both heads or both tails). Let (Solution on p. 374. B.39.0170 0. N The prot to the College is X = 50 · 10 − 30N.43: npr12_10)) pmx = [0.0725 0.
with Exercise 12. 0. Let Z = 2X − 5Y . Use the result of part (a) to determine Var [X]. F } 0. Use appropriate mprograms to obtain Compare with results of parts (a) and (b).46.) is independent. Determine E [X]. Determine E [Y ] = 4. 376. 0. E [XY ] = 30 (12. where n is a positive integer. E. E [Y ] = 1.0. Var [Y ]. with respective probabilities Exercise 12. 0. 377. Y } has joint distribution. with X∼ gamma (3.27. and Var [Y ]. a.112) Var [3X + 2Y ]. Suppose Exercise 12.47.110) a. VARIANCE.17 The pair {X. and E [Z].109) X = 8IA + 11IB − 7IC . E [Z].) distribution. 0.1) and Y ∼ Poisson (13). E [X]. Y = −3ID + 5IE + IF − 3. E [Y ]. Exercise 12. 376. E Y 2 = 2. E X 2 = 11. (Solution on p. Var [X] = 6.) Exercise 12. COVARIANCE.41.) has joint distribution. 377. E [X] and b. Y } E [X] = 2.368 CHAPTER 12. Z = X 2 + 2XY − Y .12 Suppose the pair {X. E [Y ] = 1.33.113) .) Exercise 12. E [Y ] = 10. (Solution on p.16 The pair {X. (r. Var [Y ] = 4 (12. and Var [Z]. Use properties of expectation and variance to obtain b. C. andZ = 3Y − 2X (12. E [Y ].13 The pair {X. (Solution on p.18 The pair {X. D. Calculate b. Y } is independent. E [XY ] = 15. LINEAR REGRESSION The pair {X.14 The class {A. 0. E [XY ] = 1 (12.) Exercise 12. Y } E [X] = 2. B. 377. Var [X]. E Y 2 = 101.15 For the Beta (Solution on p. 376. Calculate E [Z] and Var [Z]. Var [Z]. Determine E [X n ]. Var [Y ] = 5 (12. Determine E X 2 = 11. Let E [Y ] and Var [Y ]. (Solution on p. (Solution on p. Var [X]. Determine E X 2 = 5. c.37 Let (12. Determine E [Z] and Var [Z].) is independent. 378. Suppose E [X] = 3. Y } is jointly distributed with the following parameters: E [X] = 3. s) a.108) Var [3X − 2Y ]. Y } is independent. (Solution on p.111) Var [15X − 2Y ].
Y ].0112 0.0108 0.0256 0.0054 0.0050 0.0090 13 0.21 (Solution on p.7) from "Problems On Random Vectors and Joint Distributions".0049 0.0891 0.0172 17 0.0182 0.0084 0. (Solution on p.0072 0.0132 3.0189 0.1089 0. Random variable has density function fX (t) = { E [X] = 11/10.0252 0.0072 3 0.1320 0.0040 0.0096 0.8.0160 0.0096 0. Y = u) (12.114) Var [X].0204 0.39: npr08_08)): {X.0038 0.0200 0.0126 0.0045 9 0.116) t = u = 12 10 9 5 3 1 3 5 1 0.19 X (See Exercise 9 (Exercise 11.9 0.0223 .0196 0.8 3.0126 0.m (Section 17. 1] (t) t2 + I(1.0098 0. and Exercise 17 (Exercise 11.0096 0.0156 0.0077 Table 12. 378.2 Exercise 12.0191 0.0012 0.0168 0. Y on X.0 3.0040 0.8) from "Problems On Random Vectors and Joint Distributions".0260 0.0528 0.0091 0.9) from "Problems on Mathematical Expectation").5 0.0726 2.0396 0 0. and the regression line of For the distributions in Exercises 2022 Var [X].0056 0.0234 0.0180 0.0035 0.0064 0.0268 0.4 0.1 0.0112 0.0196 0.0026 15 0.0120 0.) Exercise 12. and Exercise 18 (Exercise 11. Y } P (X = t.0156 0. has the joint distribution (in le npr08_08.0120 0. Y } P (X = t.0405 0.17) from "Problems on Mathematical Expectation").0084 0.0495 0. Cov [X.8.0217 19 0.2] (t) (2 − t) 5 5 (12.0056 0. Determine (6/5) t2 (6/5) (2 − t) Determine for for 0≤t≤1 1<t≤2 6 6 = I [0.0063 7 0.369 Let Determine Z = 2X 2 + XY 2 − 3Y + 4.0244 0.0216 0.0017 5 0.0160 0.0256 0.0104 0.0162 0.0324 0.0080 0.0510 0.0440 0.0060 0. (Solution on p.0091 0.5 4.0081 0.1 2.0070 0. E [Z] .0594 0.0203 0. 378.0060 0.) The pair (See Exercise 8 (Exercise 8.0256 0. 378.2 0. Y = u) (12.0182 0.0060 0.0234 0.0056 0.0134 0.0090 0.38: npr08_07)): {X.0140 0.0044 0.0231 0.0064 0.m (Section 17. has the joint distribution (in le npr08_07.0363 0.0140 0.) The pair Exercise 12.0070 0.0056 0.18) from "Problems on Mathematical Expectation")..20 (See Exercise 7 (Exercise 8.0280 0.0180 0.0484 1.7 0.115) t = u = 7.0108 0.0060 0.0297 0 4.0182 0.0167 11 0.
Exercise 12.22 (Solution on p. 379.)
(See Exercise 9 (Exercise 8.9) from "Problems On Random Vectors and Joint Distributions", and Exercise 19 (Exercise 11.19) from "Problems on Mathematical Expectation"). Data were kept on the effect of training time on the time to perform a job on a production line. X is the amount of training, in hours, and Y is the time to perform the task, in minutes. The data are as follows (in file npr08_09.m (Section 17.8.40: npr08_09)).    (12.117)

Table 12.3: P(X = t, Y = u), with the values of t and u and the probability matrix as given in the data file npr08_09.m.

For the joint densities in Exercises 23-30 below

a. Determine analytically Var[X], Cov[X, Y], and the regression line of Y on X.
b. Check these with a discrete approximation.

Exercise 12.23 (Solution on p. 379.)
(See Exercise 10 (Exercise 8.10) from "Problems On Random Vectors and Joint Distributions", and Exercise 20 (Exercise 11.20) from "Problems on Mathematical Expectation"). f_XY(t, u) = (1/8)(t + u) for 0 <= t <= 2, 0 <= u <= 2,

E[X] = E[Y] = 7/6,  E[X^2] = 5/3    (12.118)

Exercise 12.24 (Solution on p. 379.)
(See Exercise 13 (Exercise 8.13) from "Problems On Random Vectors and Joint Distributions", and Exercise 23 (Exercise 11.23) from "Problems on Mathematical Expectation"). f_XY(t, u) = 1 for 0 <= t <= 1, 0 <= u <= 2(1 - t),

E[X] = 1/3,  E[Y] = 2/3,  E[X^2] = 1/6    (12.119)

Exercise 12.25 (Solution on p. 380.)
(See Exercise 15 (Exercise 8.15) from "Problems On Random Vectors and Joint Distributions", and Exercise 25 (Exercise 11.25) from "Problems on Mathematical Expectation"). f_XY(t, u) = (3/88)(2t + 3u^2) for 0 <= t <= 2, 0 <= u <= 1 + t,

E[X] = 313/220,  E[Y] = 1429/880,  E[X^2] = 49/22    (12.120)
Exercise 12.26 (Solution on p. 380.)
(See Exercise 16 (Exercise 8.16) from "Problems On Random Vectors and Joint Distributions", and Exercise 26 (Exercise 11.26) from "Problems on Mathematical Expectation"). f_XY(t, u) = (24/11) tu for 0 <= t <= 2, 0 <= u <= min{1, 2 - t},

E[X] = 52/55,  E[Y] = 32/55,  E[X^2] = 627/605    (12.121)

Exercise 12.27 (Solution on p. 380.)
(See Exercise 18 (Exercise 8.18) from "Problems On Random Vectors and Joint Distributions", and Exercise 28 (Exercise 11.28) from "Problems on Mathematical Expectation"). f_XY(t, u) = 12 t^2 u on the parallelogram with vertices (-1, 0), (0, 0), (1, 1), (0, 1),

E[X] = 2/5,  E[Y] = 11/15,  E[X^2] = 2/5    (12.122)

Exercise 12.28 (Solution on p. 380.)
(See Exercise 17 (Exercise 8.17) from "Problems On Random Vectors and Joint Distributions", and Exercise 27 (Exercise 11.27) from "Problems on Mathematical Expectation"). f_XY(t, u) = (3/23)(t + 2u) for 0 <= t <= 2, 0 <= u <= max{2 - t, t},

E[X] = 53/46,  E[Y] = 22/23,  E[X^2] = 9131/5290    (12.123)

Exercise 12.29 (Solution on p. 381.)
(See Exercise 21 (Exercise 8.21) from "Problems On Random Vectors and Joint Distributions", and Exercise 31 (Exercise 11.31) from "Problems on Mathematical Expectation"). f_XY(t, u) = (2/13)(t + 2u) for 0 <= t <= 2, 0 <= u <= min{2t, 3 - t},

E[X] = 16/13,  E[Y] = 11/12,  E[X^2] = 2847/1690    (12.124)

Exercise 12.30 (Solution on p. 381.)
(See Exercise 22 (Exercise 8.22) from "Problems On Random Vectors and Joint Distributions", and Exercise 32 (Exercise 11.32) from "Problems on Mathematical Expectation"). f_XY(t, u) = I_[0,1](t) (3/8)(t^2 + 2u) + I_(1,2](t) (9/14) t^2 u^2, for 0 <= u <= 1,

E[X] = 243/224,  E[Y] = 11/16,  E[X^2] = 107/70    (12.125)

Exercise 12.31 (Solution on p. 381.)
The class {X, Y, Z} of random variables is iid (independent, identically distributed) with common distribution

X = [-5 -1 3 4 7],  PX = 0.01*[15 20 30 25 10]    (12.126)

Let W = 3X - 4Y + 2Z. Determine E[W] and Var[W]. Do this using icalc, then repeat with icalc3 and compare results.    (12.127)
Exercise 12.32 (Solution on p. 382.)
f_XY(t, u) = (3/88)(2t + 3u^2) for 0 <= t <= 2, 0 <= u <= 1 + t (see Exercise 25 (Exercise 11.25) and Exercise 37 (Exercise 11.37) from "Problems on Mathematical Expectation").

Z = I_[0,1](X) 4X + I_(1,2](X) (X + Y)    (12.128)

E[X] = 313/220,  E[Z] = 5649/1760,  E[Z^2] = 4881/440    (12.129)

Determine Var[Z] and Cov[X, Z]. Check with discrete approximation.

Exercise 12.33 (Solution on p. 382.)
f_XY(t, u) = (24/11) tu for 0 <= t <= 2, 0 <= u <= min{1, 2 - t} (see Exercise 27 (Exercise 11.27) and Exercise 38 (Exercise 11.38) from "Problems on Mathematical Expectation").

Z = I_M(X, Y) (1/2)X + I_Mc(X, Y) Y^2,  M = {(t, u) : u > t}    (12.130)

E[X] = 52/55,  E[Z] = 16/55,  E[Z^2] = 39/308    (12.131)

Determine Var[Z] and Cov[Z, X]. Check with discrete approximation.

Exercise 12.34 (Solution on p. 383.)
f_XY(t, u) = (3/23)(t + 2u) for 0 <= t <= 2, 0 <= u <= max{2 - t, t} (see Exercise 28 (Exercise 11.28) and Exercise 39 (Exercise 11.39) from "Problems on Mathematical Expectation").

Z = I_M(X, Y) (X + Y) + I_Mc(X, Y) 2Y,  M = {(t, u) : max(t, u) <= 1}    (12.132)

E[X] = 53/46,  E[Z] = 175/92,  E[Z^2] = 2063/460    (12.133)

Determine Var[Z] and Cov[Z, X]. Check with discrete approximation.

Exercise 12.35 (Solution on p. 383.)
f_XY(t, u) = (12/179)(3t^2 + u) for 0 <= t <= 2, 0 <= u <= min{2, 3 - t} (see Exercise 29 (Exercise 11.29) and Exercise 40 (Exercise 11.40) from "Problems on Mathematical Expectation").

Z = I_M(X, Y) (X + Y) + I_Mc(X, Y) 2Y^2,  M = {(t, u) : t <= 1, u >= 1}    (12.134)

E[X] = 2313/1790,  E[Z] = 1422/895,  E[Z^2] = 28296/6265    (12.135)

Determine Var[Z] and Cov[Z, X]. Check with discrete approximation.

Exercise 12.36 (Solution on p. 383.)
f_XY(t, u) = (12/227)(3t + 2tu) for 0 <= t <= 2, 0 <= u <= min{1 + t, 2} (see Exercise 30 (Exercise 11.30) and Exercise 41 (Exercise 11.41) from "Problems on Mathematical Expectation").

Z = I_M(X, Y) X + I_Mc(X, Y) XY,  M = {(t, u) : u <= min(1, 2 - t)}    (12.136)

E[X] = 1567/1135,  E[Z] = 5774/3405,  E[Z^2] = 56673/15890    (12.137)

Determine Var[Z] and Cov[Z, X]. Check with discrete approximation.
Exercise 12.37 (Solution on p. 384.)
(See Exercise 12.20, and Exercises 9 (Exercise 10.9) and 10 (Exercise 10.10) from "Problems on Functions of Random Variables"). For the pair {X, Y} in Exercise 12.20, let

Z = g(X, Y) = 3X^2 + 2XY - Y^2    (12.138)

W = h(X, Y) = { X for X + Y <= 4;  2Y for X + Y > 4 } = I_M(X, Y) X + I_Mc(X, Y) 2Y    (12.139)

Determine the joint distribution for the pair {Z, W}, and determine the regression line of W on Z.
Solutions to Exercises in Chapter 12

Solution to Exercise 12.1 (p. 366)
npr06_12 (Section 17.8.28: npr06_12)
Minterm probabilities in pm, coefficients in c
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
VX = (X.^2)*PX' - (X*PX')^2
VX = 0.7309

Solution to Exercise 12.2 (p. 366)
npr07_01 (Section 17.8.30: npr07_01)
Data are in T and pc
EX = T*pc'
EX = 2.7000
VX = (T.^2)*pc' - EX^2
VX = 1.5500
[X,PX] = csort(T,pc);    % Alternate
Ex = X*PX'
Ex = 2.7000
Vx = (X.^2)*PX' - EX^2
Vx = 1.5500

Solution to Exercise 12.3 (p. 366)
npr07_02 (Section 17.8.31: npr07_02)
Data are in T, pc
EX = T*pc';
VX = (T.^2)*pc' - EX^2
VX = 2.8525

Solution to Exercise 12.4 (p. 366)
X ~ binomial (20, 1/2). Var[X] = 20 * (1/2) * (1/2) = 5.

Solution to Exercise 12.5 (p. 367)
X ~ Poisson (7). mu = 7, so Var[X] = mu = 7.

Solution to Exercise 12.6 (p. 367)
T ~ gamma (20, 0.0002). Var[T] = 20/0.0002^2 = 500,000,000.

Solution to Exercise 12.7 (p. 367)
N ~ binomial (50, 0.2). Var[N] = 50 * 0.2 * 0.8 = 8.

Solution to Exercise 12.8 (p. 367)
X ~ binomial (127, 0.0083). Var[X] = 127 * 0.0083 * (1 - 0.0083) = 1.0454.

Solution to Exercise 12.9 (p. 367)
cx = [6 13 -8 0];
cy = [3 4 -1 7];
px = 0.01*[43 53 46 100];
py = 0.01*[37 45 39 100];
EX = dot(cx,px)                 % EX = 5.9200
VX = sum(cx.^2.*px.*(1-px))     % VX = 66.8191
EY = dot(cy,py)
VY = sum(cy.^2.*py.*(1-py))
EZ = 3*EY - 2*EX
VZ = 9*VY + 4*VX                % VZ = 323.2200

Solution to Exercise 12.10 (p. 367)
npr12_10 (Section 17.8.43: npr12_10)
Data are in cx, cy, pmx and pmy
canonic
Enter row vector of coefficients  cx
Enter row vector of minterm probabilities  pmx
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
[Y,PY] = canonicf(cy,pmy);
icalc
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter X probabilities  PX
Enter Y probabilities  PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
G = t.^2 + 2*t.*u - u;
[Z,PZ] = csort(G,P);
EZ = dot(Z,PZ)
EZ = 46.5343
VZ = dot(Z.^2,PZ) - EZ^2
VZ = 3.7165e+04

Solution to Exercise 12.11 (p. 367)
(Continuation of Exercise 12.10)
H = t.*u;
[W,PW] = csort(H,P);
EW = dot(W,PW)
EW = 44.6874
VW = dot(W.^2,PW) - EW^2
VW = 2.8659e+03

Solution to Exercise 12.12 (p. 368)
X ~ gamma (3, 0.1) implies E[X] = 30 and Var[X] = 300. Y ~ Poisson (13) implies E[Y] = Var[Y] = 13. Then

E[Z] = 2*30 - 5*13 = -5,  Var[Z] = 4*300 + 25*13 = 1525    (12.140)

Solution to Exercise 12.13 (p. 368)
Var[X] = E[X^2] - E^2[X] = 11 - 9 = 2 and Cov[X, Y] = E[XY] - E[X]E[Y] = 15 - 12 = 3, so that
Var[3X - 2Y] = 9*Var[X] + 4*Var[Y] - 12*Cov[X, Y] = 18 + 20 - 36 = 2.

Solution to Exercise 12.14 (p. 368)
px = 0.01*[47 33 46 100];
py = 0.01*[27 41 37 100];
cx = [8 11 -7 0];
cy = [-3 5 1 -3];
ex = dot(cx,px)
ex = 4.1700
ey = dot(cy,py)
ey = -1.3900
vx = sum(cx.^2.*px.*(1-px))
vx = 54.8671
vy = sum(cy.^2.*py.*(1-py))
vy = 8.0545
EZ = 3*ey - 2*ex
EZ = -12.5100
VZ = 9*vy + 4*vx
VZ = 291.9589
[X,PX] = canonicf(cx,minprob(px(1:3)));
[Y,PY] = canonicf(cy,minprob(py(1:3)));
icalc
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter X probabilities  PX
Enter Y probabilities  PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
G = 3*u - 2*t;
[Z,PZ] = csort(G,P);
EZ = dot(Z,PZ)
EZ = -12.5100
VZ = dot(Z.^2,PZ) - EZ^2
VZ = 291.9589

Solution to Exercise 12.15 (p. 368)
E[X^n] = [Gamma(r+s)/(Gamma(r)Gamma(s))] Integral(0 to 1) t^(r+n-1) (1-t)^(s-1) dt = [Gamma(r+s)/(Gamma(r)Gamma(s))] * [Gamma(r+n)Gamma(s)/Gamma(r+s+n)] = Gamma(r+s)Gamma(r+n)/(Gamma(r)Gamma(r+s+n))    (12.141)

Using Gamma(x+1) = x Gamma(x) we have

E[X] = r/(r+s),  E[X^2] = r(r+1)/((r+s)(r+s+1))    (12.142)

Some algebraic manipulations show that

Var[X] = E[X^2] - E^2[X] = rs/((r+s)^2 (r+s+1))    (12.143)

Solution to Exercise 12.16 (p. 368)
EX = 3; EX2 = 11; EY = 10; EY2 = 101; EXY = 30.    (12.144)
VX = EX2 - EX^2
VX = 2
VY = EY2 - EY^2
VY = 1
CV = EXY - EX*EY
CV = 0
VZ = 15^2*VX + 2^2*VY
VZ = 454

Solution to Exercise 12.17 (p. 368)
EX = 2; EX2 = 5; EY = 1; EY2 = 2; EXY = 1.
VX = EX2 - EX^2
VX = 1
VY = EY2 - EY^2
VY = 1
CV = EXY - EX*EY
CV = -1
VZ = 9*VX + 4*VY + 2*3*2*CV
VZ = 1

Solution to Exercise 12.18 (p. 368)
EX = 2; EY = 1; VX = 6; VY = 4.
EX2 = VX + EX^2
EX2 = 10
EY2 = VY + EY^2
EY2 = 5
EZ = 2*EX2 + EX*EY2 - 3*EY + 4
EZ = 31

Solution to Exercise 12.19 (p. 369)
E[X^2] = Integral t^2 f_X(t) dt = (6/5) Integral(0 to 1) t^4 dt + (6/5) Integral(1 to 2) (2t^2 - t^3) dt = 67/50    (12.145)
Var[X] = E[X^2] - E^2[X] = 67/50 - (11/10)^2 = 13/100    (12.146)

Solution to Exercise 12.20 (p. 369)
npr08_07 (Section 17.8.38: npr08_07)
Data are in X, Y, P
jcalc
- - - - - - - - - - -
EX = dot(X,PX);
EY = dot(Y,PY);
VX = dot(X.^2,PX) - EX^2
VX = 5.1116
CV = total(t.*u.*P) - EX*EY
CV = 2.6963
a = CV/VX
a = 0.5275
b = EY - a*EX
b = 0.6924            % Regression line: u = at + b

Solution to Exercise 12.21 (p. 369)
npr08_08 (Section 17.8.39: npr08_08)
Data are in X, Y, P
jcalc
- - - - - - - - - - -
EX = dot(X,PX);
EY = dot(Y,PY);
VX = dot(X.^2,PX) - EX^2
VX = 31.0700
CV = total(t.*u.*P) - EX*EY
CV = 8.0272
a = CV/VX
a = 0.2584
b = EY - a*EX
b = 5.6110            % Regression line: u = at + b

Solution to Exercise 12.22 (p. 370)
npr08_09 (Section 17.8.40: npr08_09)
Data are in X, Y, P
jcalc
- - - - - - - - - - -
EX = dot(X,PX);
EY = dot(Y,PY);
VX = dot(X.^2,PX) - EX^2;
CV = total(t.*u.*P) - EX*EY;
a = CV/VX;
b = EY - a*EX;        % Regression line: u = at + b

Solution to Exercise 12.23 (p. 370)
E[XY] = (1/8) Integral(0 to 2) Integral(0 to 2) tu(t + u) du dt = 4/3,  Cov[X, Y] = 4/3 - (7/6)^2 = -1/36,  Var[X] = 5/3 - (7/6)^2 = 11/36    (12.147)
a = Cov[X, Y]/Var[X] = -1/11,  b = E[Y] - aE[X] = 14/11    (12.148)

tuappr:  [0 2] [0 2] 200 200  (1/8)*(t+u)
VX = 0.3055
CV = -0.0278
a = -0.0909
b = 1.2727            % Regression line: u = at + b

Solution to Exercise 12.24 (p. 370)
E[XY] = Integral(0 to 1) Integral(0 to 2(1-t)) tu du dt = 1/6,  Cov[X, Y] = 1/6 - (1/3)(2/3) = -1/18,  Var[X] = 1/6 - (1/3)^2 = 1/18    (12.149)
a = Cov[X, Y]/Var[X] = -1,  b = E[Y] - aE[X] = 1    (12.150)

tuappr:  [0 1] [0 2] 200 400  u<=2*(1-t)
EX = 0.3333
VX = 0.0556
CV = -0.0556
a = -1.0000
b = 1.0000

Solution to Exercise 12.25 (p. 370)
E[XY] = (3/88) Integral(0 to 2) Integral(0 to 1+t) tu(2t + 3u^2) du dt = 2153/880    (12.151)
Cov[X, Y] = 2153/880 - (313/220)(1429/880) = 26383/193600,  Var[X] = 49/22 - (313/220)^2 = 9831/48400    (12.152)
a = Cov[X, Y]/Var[X] = 26383/39324,  b = E[Y] - aE[X] = 26321/39324    (12.153)

tuappr:  [0 2] [0 3] 200 300  (3/88)*(2*t + 3*u.^2).*(u<=1+t)
VX = 0.2036
CV = 0.1364
a = 0.6700
b = 0.6736

Solution to Exercise 12.26 (p. 371)
E[XY] = (24/11) Integral(0 to 1) Integral(0 to 1) t^2 u^2 du dt + (24/11) Integral(1 to 2) Integral(0 to 2-t) t^2 u^2 du dt = 28/55    (12.154)
Cov[X, Y] = 28/55 - (52/55)(32/55) = -124/3025,  Var[X] = 627/605 - (52/55)^2 = 431/3025    (12.155)
a = Cov[X, Y]/Var[X] = -124/431,  b = E[Y] - aE[X] = 368/431    (12.156)

tuappr:  [0 2] [0 1] 400 200  (24/11)*t.*u.*(u<=min(1,2-t))
VX = 0.1425
CV = -0.0409
a = -0.2867
b = 0.8535

Solution to Exercise 12.27 (p. 371)
E[XY] = 12 Integral(-1 to 0) Integral(0 to t+1) t^3 u^2 du dt + 12 Integral(0 to 1) Integral(t to 1) t^3 u^2 du dt = 2/5    (12.157)
Cov[X, Y] = 2/5 - (2/5)(11/15) = 8/75,  Var[X] = 2/5 - (2/5)^2 = 6/25    (12.158)
a = Cov[X, Y]/Var[X] = 4/9,  b = E[Y] - aE[X] = 5/9    (12.159)

tuappr:  [-1 1] [0 1] 400 200  12*t.^2.*u.*(u>=max(0,t)).*(u<=min(1,1+t))
VX = 0.2383
CV = 0.1056
a = 0.4432
b = 0.5553

Solution to Exercise 12.28 (p. 371)
E[XY] = (3/23) Integral(0 to 1) Integral(0 to 2-t) tu(t + 2u) du dt + (3/23) Integral(1 to 2) Integral(0 to t) tu(t + 2u) du dt = 251/230    (12.160)
Cov[X, Y] = 251/230 - (53/46)(22/23) = -57/5290,  Var[X] = 9131/5290 - (53/46)^2 = 4217/10580    (12.161)
a = Cov[X, Y]/Var[X] = -114/4217,  b = E[Y] - aE[X] = 4165/4217    (12.162)

tuappr:  [0 2] [0 2] 200 200  (3/23)*(t+2*u).*(u<=max(2-t,t))
VX = 0.3984
CV = -0.0108
a = -0.0272
b = 0.9909

Solution to Exercise 12.29 (p. 371)
E[XY] = (2/13) Integral(0 to 1) Integral(0 to 2t) tu(t + 2u) du dt + (2/13) Integral(1 to 2) Integral(0 to 3-t) tu(t + 2u) du dt = 431/390    (12.163)
Cov[X, Y] = 431/390 - (16/13)(11/12) = -3/130,  Var[X] = 2847/1690 - (16/13)^2 = 287/1690    (12.164)
a = Cov[X, Y]/Var[X] = -39/287,  b = E[Y] - aE[X] = 3733/3444    (12.165)

tuappr:  [0 2] [0 2] 400 400  (2/13)*(t+2*u).*(u<=min(2*t,3-t))
VX = 0.1698
CV = -0.0229
a = -0.1350
b = 1.0839

Solution to Exercise 12.30 (p. 371)
E[XY] = (3/8) Integral(0 to 1) Integral(0 to 1) tu(t^2 + 2u) du dt + (9/14) Integral(1 to 2) Integral(0 to 1) t^3 u^3 du dt = 347/448    (12.166)
Cov[X, Y] = 347/448 - (243/224)(11/16) = 103/3584,  Var[X] = 107/70 - (243/224)^2 = 88243/250880    (12.167)
a = Cov[X, Y]/Var[X] = 7210/88243,  b = E[Y] - aE[X] = 105691/176486    (12.168)

tuappr:  [0 2] [0 1] 400 200  (3/8)*(t.^2 + 2*u).*(t<=1) + (9/14)*t.^2.*u.^2.*(t>1)
VX = 0.3517
CV = 0.0287
a = 0.0817
b = 0.5989

Solution to Exercise 12.31 (p. 371)
x = [-5 -1 3 4 7];
px = 0.01*[15 20 30 25 10];
EX = dot(x,px)                % Use of properties
EX = 1.6500
VX = dot(x.^2,px) - EX^2
VX = 12.8275
EW = (3 - 4 + 2)*EX
EW = 1.6500
VW = (3^2 + 4^2 + 2^2)*VX
VW = 371.9975
icalc                         % Iterated use of icalc
Enter row matrix of X-values  x
Enter row matrix of Y-values  x
Enter X probabilities  px
Enter Y probabilities  px
Use array operations on matrices X, Y, PX, PY, t, u, and P
G = 3*t - 4*u;
[R,PR] = csort(G,P);
icalc
Enter row matrix of X-values  R
Enter row matrix of Y-values  x
Enter X probabilities  PR
Enter Y probabilities  px
Use array operations on matrices X, Y, PX, PY, t, u, and P
H = t + 2*u;
[W,PW] = csort(H,P);
EW = dot(W,PW)
EW = 1.6500
VW = dot(W.^2,PW) - EW^2
VW = 371.9975
icalc3                        % Use of icalc3
Enter row matrix of X-values  x
Enter row matrix of Y-values  x
Enter row matrix of Z-values  x
Enter X probabilities  px
Enter Y probabilities  px
Enter Z probabilities  px
Use array operations on matrices X, Y, Z, PX, PY, PZ, t, u, v, and P
S = 3*t - 4*u + 2*v;
[w,pw] = csort(S,P);
Ew = dot(w,pw)
Ew = 1.6500
Vw = dot(w.^2,pw) - Ew^2
Vw = 371.9975

Solution to Exercise 12.32 (p. 372)
E[XZ] = (3/88) Integral(0 to 1) Integral(0 to 1+t) 4t^2 (2t + 3u^2) du dt + (3/88) Integral(1 to 2) Integral(0 to 1+t) t(t + u)(2t + 3u^2) du dt = 16931/3520    (12.169)
Var[Z] = E[Z^2] - E^2[Z] = 4881/440 - (5649/1760)^2 = 2451039/3097600    (12.170)
Cov[X, Z] = E[XZ] - E[X]E[Z] = 16931/3520 - (313/220)(5649/1760) = 94273/387200    (12.171)

tuappr:  [0 2] [0 3] 200 300  (3/88)*(2*t+3*u.^2).*(u<=1+t)
G = 4*t.*(t<=1) + (t+u).*(t>1);
EZ = total(G.*P)
EZ = 3.2110
VZ = total(G.^2.*P) - EZ^2
VZ = 0.7934            % Theoretical 0.7913
CV = total(t.*G.*P) - EZ*dot(X,PX)
CV = 0.2445            % Theoretical 0.2435

Solution to Exercise 12.33 (p. 372)
E[XZ] = (24/11) Integral(0 to 1) Integral(t to 1) t(t/2) tu du dt + (24/11) Integral(0 to 1) Integral(0 to t) t u^2 tu du dt + (24/11) Integral(1 to 2) Integral(0 to 2-t) t u^2 tu du dt = 211/770    (12.172)
Var[Z] = E[Z^2] - E^2[Z] = 39/308 - (16/55)^2 = 3557/84700    (12.173)
Cov[Z, X] = E[XZ] - E[X]E[Z] = 211/770 - (52/55)(16/55) = -43/42350    (12.174)

tuappr:  [0 2] [0 1] 400 200  (24/11)*t.*u.*(u<=min(1,2-t))
M = u > t;
G = (t/2).*M + u.^2.*(1-M);
EZ = total(G.*P);      % approx 16/55 = 0.2909
VZ = total(G.^2.*P) - EZ^2
VZ = 0.0425            % Theoretical 0.0420
CV = total(t.*G.*P) - EZ*dot(X,PX)
CV = -9.2940e-04       % Theoretical -0.0010

Solution to Exercise 12.34 (p. 372)
E[ZX] = (3/23) Integral(0 to 1) Integral(0 to 1) t(t + u)(t + 2u) du dt + (3/23) Integral(0 to 1) Integral(1 to 2-t) 2tu(t + 2u) du dt + (3/23) Integral(1 to 2) Integral(0 to t) 2tu(t + 2u) du dt = 1009/460    (12.175)
Var[Z] = E[Z^2] - E^2[Z] = 2063/460 - (175/92)^2 = 36671/42320    (12.176)
Cov[Z, X] = E[ZX] - E[Z]E[X] = 1009/460 - (175/92)(53/46) = 39/21160    (12.177)

tuappr:  [0 2] [0 2] 400 400  (3/23)*(t+2*u).*(u<=max(2-t,t))
M = max(t,u) <= 1;
G = (t+u).*M + 2*u.*(1-M);
EZ = total(G.*P);      % approx 175/92 = 1.9022
VZ = total(G.^2.*P) - EZ^2;    % approx 36671/42320 = 0.8665
CV = total(t.*G.*P) - EZ*dot(X,PX)
CV = 0.0017            % Theoretical 0.0018

Solution to Exercise 12.35 (p. 372)
E[ZX] = (12/179) Integral(0 to 1) Integral(1 to 2) t(t + u)(3t^2 + u) du dt + (12/179) Integral(0 to 1) Integral(0 to 1) 2tu^2 (3t^2 + u) du dt + (12/179) Integral(1 to 2) Integral(0 to 3-t) 2tu^2 (3t^2 + u) du dt = 24029/12530    (12.178)
Var[Z] = E[Z^2] - E^2[Z] = 28296/6265 - (1422/895)^2 = 11170332/5607175    (12.179)
Cov[Z, X] = E[ZX] - E[Z]E[X] = 24029/12530 - (1422/895)(2313/1790) = -1517647/11214350    (12.180)

tuappr:  [0 2] [0 2] 400 400  (12/179)*(3*t.^2 + u).*(u<=min(2,3-t))
M = (t<=1)&(u>=1);
G = (t+u).*M + 2*u.^2.*(1-M);
EZ = total(G.*P);      % approx 1422/895 = 1.5888
VZ = total(G.^2.*P) - EZ^2;    % approx 11170332/5607175 = 1.9922
CV = total(t.*G.*P) - EZ*dot(X,PX)
CV = -0.1347           % Theoretical -0.1353

Solution to Exercise 12.36 (p. 372)
E[ZX] = (12/227) Integral(0 to 1) Integral(0 to 1) t^2 (3t + 2tu) du dt + (12/227) Integral(1 to 2) Integral(0 to 2-t) t^2 (3t + 2tu) du dt + (12/227) Integral over the rest of the region of t^2 u (3t + 2tu) du dt = 20338/7945    (12.181)
Var[Z] = E[Z^2] - E^2[Z] = 56673/15890 - (5774/3405)^2 = 112167631/162316350    (12.182)
Cov[Z, X] = E[ZX] - E[Z]E[X] = 20338/7945 - (5774/3405)(1567/1135) = 5915884/27052725    (12.183)

tuappr:  [0 2] [0 2] 400 400  (12/227)*(3*t + 2*t.*u).*(u<=min(1+t,2))
M = u <= min(1,2-t);
G = t.*M + t.*u.*(1-M);
EZX = total(t.*G.*P)
EZX = 2.5597
EX = dot(X,PX);
EZ = total(G.*P);
VZ = total(G.^2.*P) - EZ^2
VZ = 0.6907
CV = EZX - EX*EZ
CV = 0.2188

Solution to Exercise 12.37 (p. 373)
npr08_07 (Section 17.8.38: npr08_07)
Data are in X, Y, P
jointzw
Enter joint prob for (X,Y)  P
Enter values for X  X
Enter values for Y  Y
Enter expression for g(t,u)  3*t.^2 + 2*t.*u - u.^2
Enter expression for h(t,u)  t.*(t+u<=4) + 2*u.*(t+u>4)
Use array operations on Z, W, v, w, PZ, PW, PZW
EZ = dot(Z,PZ)
EZ = 5.2975
EW = dot(W,PW)
EW = 4.7379
VZ = dot(Z.^2,PZ) - EZ^2
VZ = 1.0588e+03
CZW = total(v.*w.*PZW) - EZ*EW
CZW = -12.1697
a = CZW/VZ
a = -0.0115
b = EW - a*EZ
b = 4.7988             % Regression line: w = av + b
Chapter 13
Transform Methods

13.1 Transform Methods
(This content is available online at <http://cnx.org/content/m23473/1.7/>.)

As pointed out in the units on Expectation (Section 11.1) and Variance (Section 12.1), the mathematical expectation E[X] = mu_X of a random variable X locates the center of mass for the induced distribution, and the expectation

E[g(X)] = E[(X - E[X])^2] = Var[X] = sigma_X^2    (13.1)

measures the spread of the distribution about its center of mass. These quantities are also known, respectively, as the mean (moment) of X and the second moment of X about the mean. Other moments give added information. For example, the third moment about the mean, E[(X - mu_X)^3], gives information about the skew, or asymmetry, of the distribution about the mean. We investigate further along these lines by examining the expectation of certain functions of X. Each of these functions involves a parameter, in a manner that completely determines the distribution. For reasons noted below, we refer to these as transforms. We consider three of the most useful of these.

13.1.1 Three basic transforms

We define each of three transforms, determine some key properties, and use them to study various probability distributions associated with random variables. In the section on integral transforms (Section 13.1.2: Integral transforms), we show their relationship to well known integral transforms. These have been studied extensively and used in many other applications, which makes it possible to utilize the considerable literature on these transforms.

Definition. The moment generating function M_X for random variable X (i.e., for its distribution) is the function

M_X(s) = E[e^{sX}]   (s is a real or complex parameter)    (13.2)

The characteristic function phi_X for random variable X is

phi_X(u) = E[e^{iuX}]   (i^2 = -1, u is a real parameter)    (13.3)

The generating function g_X(s) for a nonnegative, integer-valued random variable X is

g_X(s) = E[s^X] = sum_k s^k P(X = k)    (13.4)
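Since g_X for an integer-valued X is just a power series in s with the probabilities as coefficients, it is easy to evaluate numerically. The following sketch assumes only plain MATLAB; the distribution shown is illustrative, not from the text.

% Evaluating g_X(s) = sum_k P(X = k) s^k as a polynomial in s.
p = [0.2 0.5 0.3];                % P(X = 0), P(X = 1), P(X = 2) (illustrative)
gX = @(s) polyval(fliplr(p), s);  % polyval expects highest power first
gX(1)                             % = 1, since the probabilities sum to one
gX(0.5)                           % g_X evaluated at s = 0.5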
The expectation E[s^X] has meaning for more general random variables, but its usefulness is greatest for nonnegative, integer-valued variables, and we limit our consideration to that case.

The defining expressions display similarities which show useful relationships. We note two which are particularly useful.

M_X(s) = E[e^{sX}] = E[(e^s)^X] = g_X(e^s)   and   phi_X(u) = E[e^{iuX}] = M_X(iu)    (13.5)

Because of the latter relationship, we ordinarily use the moment generating function instead of the characteristic function to avoid writing the complex unit i. When desirable, we convert easily by the change of parameter. The integral transform character of these entities implies that there is essentially a one-to-one relationship between the transform and the distribution.

Moments
The name and some of the importance of the moment generating function arise from the fact that the derivatives of M_X evaluated at s = 0 are the moments about the origin. Specifically,

M_X^(k)(0) = E[X^k], provided the kth moment exists    (13.6)

Since expectation is an integral and because of the regularity of the integrand, we may differentiate inside the integral with respect to the parameter.

M'_X(s) = (d/ds) E[e^{sX}] = E[(d/ds) e^{sX}] = E[X e^{sX}]    (13.7)

Upon setting s = 0, we have M'_X(0) = E[X]. Repeated differentiation gives the general result. The corresponding result for the characteristic function is phi^(k)(0) = i^k E[X^k].

The generating function does not lend itself readily to computing moments, except that

g'_X(s) = sum_{k=1} k s^{k-1} P(X = k),   so that   g'_X(1) = sum_{k=1} k P(X = k) = E[X]    (13.8)

For higher order moments, we may convert the generating function to the moment generating function by replacing s with e^s, then work with M_X and its derivatives.

Example 13.1: The exponential distribution
The density function is f_X(t) = lambda e^{-lambda t} for t >= 0.

M_X(s) = E[e^{sX}] = Integral(0 to infinity) lambda e^{-(lambda - s)t} dt = lambda/(lambda - s)    (13.9)

M'_X(s) = lambda/(lambda - s)^2,   M''_X(s) = 2 lambda/(lambda - s)^3    (13.10)

E[X] = M'_X(0) = 1/lambda,   E[X^2] = M''_X(0) = 2 lambda/lambda^3 = 2/lambda^2    (13.11)

From this we obtain Var[X] = 2/lambda^2 - 1/lambda^2 = 1/lambda^2.
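Where a closed form for M_X is available, the differentiation in (13.6) can be checked symbolically. The sketch below is a hedged illustration assuming the Symbolic Math Toolbox (syms, diff, subs); the rate parameter shown is illustrative.

% Moments of the exponential distribution from derivatives of its MGF,
% assuming the Symbolic Math Toolbox is available.
syms s
lambda = 2;                        % illustrative rate parameter
M = lambda/(lambda - s);           % MGF from Example 13.1
EX  = subs(diff(M, s, 1), s, 0)    % 1/lambda
EX2 = subs(diff(M, s, 2), s, 0)    % 2/lambda^2
VX  = EX2 - EX^2                   % 1/lambda^2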
Example 13.2: The Poisson (mu) distribution
P(X = k) = e^{-mu} mu^k/k!, k >= 0, so that

g_X(s) = e^{-mu} sum_{k=0}^infinity s^k mu^k/k! = e^{-mu} sum_{k=0}^infinity (s mu)^k/k! = e^{-mu} e^{mu s} = e^{mu(s-1)}    (13.12)

We convert to M_X by replacing s with e^s to get M_X(s) = e^{mu(e^s - 1)}. Then

M'_X(s) = e^{mu(e^s - 1)} mu e^s,   M''_X(s) = e^{mu(e^s - 1)} [mu^2 e^{2s} + mu e^s]    (13.13)

so that

E[X] = M'_X(0) = mu,   E[X^2] = M''_X(0) = mu^2 + mu,   and   Var[X] = mu^2 + mu - mu^2 = mu    (13.14)

These results agree, of course, with those found by direct computation with the distribution.

Operational properties
We refer to the following as operational properties.

(T1): If Z = aX + b, then

M_Z(s) = e^{bs} M_X(as),   phi_Z(u) = e^{iub} phi_X(au),   g_Z(s) = s^b g_X(s^a)    (13.15)

For the moment generating function, this pattern follows from

E[e^{(aX+b)s}] = e^{bs} E[e^{(as)X}]    (13.16)

Similar arguments hold for the other two.

(T2): If the pair {X, Y} is independent, then

M_{X+Y}(s) = M_X(s) M_Y(s),   phi_{X+Y}(u) = phi_X(u) phi_Y(u),   g_{X+Y}(s) = g_X(s) g_Y(s)    (13.17)

For the moment generating function, e^{sX} and e^{sY} form an independent pair for each value of the parameter s. By the product rule for expectation

E[e^{s(X+Y)}] = E[e^{sX} e^{sY}] = E[e^{sX}] E[e^{sY}]    (13.18)

Similar arguments are used for the other two transforms.

A partial converse for (T2) is as follows:

(T3): If M_{X+Y}(s) = M_X(s) M_Y(s), then the pair {X, Y} is uncorrelated. To show this, we obtain two expressions for E[(X+Y)^2], one by direct expansion and use of linearity, and the other by taking the second derivative of the moment generating function.

E[(X+Y)^2] = E[X^2] + E[Y^2] + 2E[XY]    (13.19)

M''_{X+Y}(s) = [M_X(s) M_Y(s)]'' = M''_X(s) M_Y(s) + M_X(s) M''_Y(s) + 2M'_X(s) M'_Y(s)    (13.20)

On setting s = 0 and using the fact that M_X(0) = M_Y(0) = 1, we have

E[(X+Y)^2] = E[X^2] + E[Y^2] + 2E[X] E[Y]    (13.21)

which implies the equality E[XY] = E[X] E[Y]. Note that we have not shown that being uncorrelated implies the product rule.

We utilize these properties in determining the moment generating and generating functions for several of our common distributions.

Some discrete distributions
1. Indicator function X = I_E, P(E) = p

g_X(s) = s^0 q + s^1 p = q + ps,   M_X(s) = g_X(e^s) = q + p e^s    (13.22)
2. Simple random variable X = sum_{i=1}^n t_i I_{A_i} (primitive form), P(A_i) = p_i

M_X(s) = sum_{i=1}^n p_i e^{s t_i}    (13.23)

3. Binomial (n, p). X = sum_{i=1}^n I_{E_i} with {I_{E_i} : 1 <= i <= n} iid, P(E_i) = p
We use the product rule for sums of independent random variables and the generating function for the indicator function.

g_X(s) = prod_{i=1}^n (q + ps) = (q + ps)^n,   M_X(s) = (q + p e^s)^n    (13.24)

4. Geometric (p). P(X = k) = p q^k for all k >= 0, E[X] = q/p. We use the formula for the geometric series to get

g_X(s) = sum_{k=0}^infinity p q^k s^k = p sum_{k=0}^infinity (qs)^k = p/(1 - qs),   M_X(s) = p/(1 - q e^s)    (13.25)

5. Negative binomial (m, p). If Y_m is the number of the trial in a Bernoulli sequence on which the mth success occurs, and X_m = Y_m - m is the number of failures before the mth success, then

P(X_m = k) = P(Y_m - m = k) = C(-m, k) (-q)^k p^m    (13.26)

where C(-m, k) = (-m)(-m-1)(-m-2)...(-m-k+1)/k!    (13.27)

The power series expansion about t = 0 shows that

(1 + t)^{-m} = 1 + C(-m, 1) t + C(-m, 2) t^2 + ...   for -1 < t < 1    (13.28)

Hence

M_{X_m}(s) = p^m sum_{k=0}^infinity C(-m, k) (-q)^k e^{sk} = [p/(1 - q e^s)]^m    (13.29)

Comparison with the moment generating function for the geometric distribution shows that X_m = Y_m - m has the same distribution as the sum of m iid random variables, each geometric (p). This suggests that the sequence is characterized by independent, successive waiting times to success. This also shows that the expectation and variance of X_m are m times the expectation and variance for the geometric. Thus

E[X_m] = mq/p   and   Var[X_m] = mq/p^2

(A numerical check of this correspondence is sketched after this list.)

6. Poisson (mu). P(X = k) = e^{-mu} mu^k/k! for all k >= 0.    (13.30)
In Example 13.2 (The Poisson (mu) distribution), above, we establish g_X(s) = e^{mu(s-1)} and M_X(s) = e^{mu(e^s - 1)}. If {X, Y} is an independent pair, with X ~ Poisson (lambda) and Y ~ Poisson (mu), then Z = X + Y ~ Poisson (lambda + mu). This follows from (T2) and the product of exponentials.
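The geometric/negative binomial correspondence in item 5 is easy to check empirically. The sketch below is illustrative (parameter values are not from the text); it uses the standard inversion floor(log(U)/log(q)) to generate geometric variables on {0, 1, 2, ...}.

% Empirical check that the sum of m iid geometric(p) variables matches
% the negative binomial (m, p) mean and variance. Illustrative values.
p = 0.3; q = 1 - p; m = 4; reps = 100000;
G = floor(log(rand(m, reps))/log(q));   % geometric(p) on 0,1,2,...
S = sum(G, 1);                          % distributed negative binomial (m, p)
[mean(S), m*q/p]                        % sample vs m q/p
[var(S),  m*q/p^2]                      % sample vs m q/p^2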
Some absolutely continuous distributions

1. Uniform on (a, b). f_X(t) = 1/(b - a), a < t < b

M_X(s) = Integral e^{st} f_X(t) dt = [1/(b - a)] Integral(a to b) e^{st} dt = (e^{sb} - e^{sa})/(s(b - a))    (13.31)

2. Symmetric triangular (-c, c). f_X(t) = I_[-c,0)(t) (c + t)/c^2 + I_[0,c](t) (c - t)/c^2

M_X(s) = (1/c^2) Integral(-c to 0) (c + t) e^{st} dt + (1/c^2) Integral(0 to c) (c - t) e^{st} dt = (e^{cs} + e^{-cs} - 2)/(c^2 s^2)    (13.32)

= [(e^{cs} - 1)/(cs)] * [(1 - e^{-cs})/(cs)] = M_Y(s) M_Z(-s) = M_Y(s) M_{-Z}(s)    (13.33)

where M_Y is the moment generating function for Y ~ uniform (0, c), and similarly for M_Z. Thus, X has the same distribution as the difference of two independent random variables, each uniform on (0, c).    (13.34)

3. Exponential (lambda). f_X(t) = lambda e^{-lambda t}, t >= 0. In Example 13.1, above, we show that M_X(s) = lambda/(lambda - s).

4. Gamma (alpha, lambda). f_X(t) = [1/Gamma(alpha)] lambda^alpha t^{alpha - 1} e^{-lambda t}, t >= 0

M_X(s) = [lambda^alpha/Gamma(alpha)] Integral(0 to infinity) t^{alpha - 1} e^{-(lambda - s)t} dt = [lambda/(lambda - s)]^alpha    (13.35)

For alpha = n, a positive integer,

M_X(s) = [lambda/(lambda - s)]^n    (13.36)

which shows that in this case X has the distribution of the sum of n independent random variables, each exponential (lambda).

5. Normal (mu, sigma^2).
- The standardized normal, Z ~ N(0, 1):

M_Z(s) = [1/sqrt(2 pi)] Integral(-infinity to infinity) e^{st} e^{-t^2/2} dt    (13.37)

Now st - t^2/2 = s^2/2 - (1/2)(t - s)^2, so that

M_Z(s) = e^{s^2/2} [1/sqrt(2 pi)] Integral(-infinity to infinity) e^{-(t-s)^2/2} dt = e^{s^2/2}    (13.38)

since the integrand (including the constant 1/sqrt(2 pi)) is the density for N(s, 1).

- X = sigma Z + mu implies, by property (T1),

M_X(s) = e^{s mu} e^{sigma^2 s^2/2} = exp(sigma^2 s^2/2 + s mu)    (13.39)

Example 13.3: Affine combination of independent normal random variables
Suppose {X, Y} is an independent pair with X ~ N(mu_X, sigma_X^2) and Y ~ N(mu_Y, sigma_Y^2). Let Z = aX + bY + c. Then, by properties of expectation and variance,

mu_Z = a mu_X + b mu_Y + c   and   sigma_Z^2 = a^2 sigma_X^2 + b^2 sigma_Y^2    (13.40)

and by the operational properties for the moment generating function

M_Z(s) = e^{cs} M_X(as) M_Y(bs) = exp[(a^2 sigma_X^2 + b^2 sigma_Y^2) s^2/2 + s(a mu_X + b mu_Y + c)]    (13.41)

= exp(sigma_Z^2 s^2/2 + s mu_Z)    (13.42)

The form of M_Z shows that Z is normally distributed.
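A quick simulation makes the conclusion of Example 13.3 concrete. The parameter values below are illustrative only, not from the text; randn supplies standard normal samples.

% Simulation check of Example 13.3 with illustrative parameters.
n = 100000;
muX = 2; sigX = 3; muY = -1; sigY = 2; a = 2; b = -3; c = 1;
X = muX + sigX*randn(1, n);
Y = muY + sigY*randn(1, n);
Z = a*X + b*Y + c;
[mean(Z), a*muX + b*muY + c]          % sample vs theoretical mean
[var(Z),  a^2*sigX^2 + b^2*sigY^2]    % sample vs theoretical variance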
Moment generating function and simple random variables
Suppose X = sum_{i=1}^n t_i I_{A_i} in canonical form. That is, A_i is the event {X = t_i} for each of the distinct values in the range of X, with p_i = P(A_i) = P(X = t_i). Then the moment generating function for X is

M_X(s) = sum_{i=1}^n p_i e^{s t_i}    (13.43)

The moment generating function M_X is thus related directly and simply to the distribution for random variable X.

Consider the problem of determining the sum of an independent pair {X, Y} of simple random variables. The moment generating function for the sum is the product of the moment generating functions. Now if Y = sum_{j=1}^m u_j I_{B_j}, with P(Y = u_j) = pi_j, we have

M_X(s) M_Y(s) = (sum_{i=1}^n p_i e^{s t_i})(sum_{j=1}^m pi_j e^{s u_j}) = sum_{i,j} p_i pi_j e^{s(t_i + u_j)}    (13.44)

The various values are sums t_i + u_j of pairs (t_i, u_j) of values. Each of these sums has probability p_i pi_j for the values corresponding to t_i, u_j. Since more than one pair sum may have the same value, we need to sort the values, consolidate like values and add the probabilities for like values to achieve the distribution for the sum. We have an m-function mgsum for achieving this directly. It produces the pair-products for the probabilities and the pair-sums for the values, then performs a csort operation. Although not directly dependent upon the moment generating function analysis, it produces the same result as that produced by multiplying moment generating functions.

Example 13.4: Distribution for a sum of independent simple random variables
Suppose the pair {X, Y} is independent with distributions

X = [1 3 5 7],  Y = [2 3 4],  PX = [0.2 0.4 0.3 0.1],  PY = [0.3 0.5 0.2]    (13.45)

Determine the distribution for Z = X + Y.

X = [1 3 5 7];
Y = 2:4;
PX = 0.1*[2 4 3 1];
PY = 0.1*[3 5 2];
[Z,PZ] = mgsum(X,Y,PX,PY);
disp([Z;PZ]')
    3.0000    0.0600
    4.0000    0.1000
    5.0000    0.1600
    6.0000    0.2000
    7.0000    0.1700
    8.0000    0.1500
    9.0000    0.0900
   10.0000    0.0500
   11.0000    0.0200
This could, of course, have been achieved by using icalc and csort, which has the advantage that other functions of X and Y may be handled. Also, since the random variables are nonnegative and integer-valued, the MATLAB convolution function may be used (see Example 13.7 (Sum of independent simple random variables)). By repeated use of the function mgsum, we may obtain the distribution for the sum of more than two simple random variables. The m-functions mgsum3 and mgsum4 utilize this strategy.

The techniques for simple random variables may be used with the simple approximations to absolutely continuous random variables.

Example 13.5: Difference of uniform distributions
The moment generating functions for the uniform and the symmetric triangular show that the latter appears naturally as the difference of two uniformly distributed random variables. We consider X and Y iid, uniform on [0, 1].

tappr
Enter matrix [a b] of x-range endpoints  [0 1]
Enter number of x approximation points  200
Enter density as a function of t  t<=1
Use row matrices X and PX as in the simple case
[Z,PZ] = mgsum(X,-X,PX,PX);
plot(Z,PZ/d)              % Divide by d to recover f(t)
% plotting details (see Figure 13.1)

Figure 13.1: Density for the difference of an independent pair, uniform (0, 1).

The generating function
The form of the generating function for a nonnegative, integer-valued random variable exhibits a number of important properties.

X = sum_{k=0}^infinity k I_{A_k} (canonical form),  p_k = P(A_k) = P(X = k),  g_X(s) = sum_{k=0}^infinity s^k p_k    (13.46)
1. As a power series in s with nonnegative coefficients whose partial sums converge to one, the series converges at least for |s| <= 1.
2. The coefficients of the power series display the distribution: for value k the probability p_k = P(X = k) is the coefficient of s^k.
3. The power series expansion about the origin of an analytic function is unique. If the generating function is known in closed form, the unique power series expansion about the origin determines the distribution. If the power series converges to a known closed form, that form characterizes the distribution.
4. For a simple random variable (i.e., p_k = 0 for k > n), g_X is a polynomial.

Example 13.6: The Poisson distribution
In Example 13.2 (The Poisson (mu) distribution), above, we establish the generating function for X ~ Poisson (mu) from the distribution. Suppose, however, we simply encounter the generating function

g_X(s) = e^{m(s-1)} = e^{-m} e^{ms}    (13.47)

From the known power series for the exponential, we get

g_X(s) = e^{-m} sum_{k=0}^infinity (ms)^k/k! = sum_{k=0}^infinity s^k e^{-m} m^k/k!    (13.48)

We conclude that

P(X = k) = e^{-m} m^k/k!, 0 <= k    (13.49)

which is the Poisson distribution with parameter mu = m.

For simple, nonnegative, integer-valued random variables, the generating functions are polynomials. Because of the product rule (T2), the problem of determining the distribution for the sum of independent random variables may be handled by the process of multiplying polynomials. This may be done quickly and easily with the MATLAB convolution function.

Example 13.7: Sum of independent simple random variables
Suppose the pair {X, Y} is independent, with

g_X(s) = (1/10)(2 + 3s + 3s^2 + 2s^5),   g_Y(s) = (1/10)(2s + 4s^2 + 4s^3)    (13.50)

In the MATLAB function convolution, all powers of s must be accounted for by including zeros for the missing powers.

gx = 0.1*[2 3 3 0 0 2];     % Zeros for missing powers 3, 4
gy = 0.1*[0 2 4 4];         % Zero for missing power 0
gz = conv(gx,gy);
b = [0:8; gz];
a = ['    Z     PZ'];
disp(a)
disp(b')                    % Distribution for Z = X + Y
         0         0
    1.0000    0.0400
    2.0000    0.1400
    3.0000    0.2600
    4.0000    0.2400
    5.0000    0.1200
    6.0000    0.0400
    7.0000    0.0800
    8.0000    0.0800

If mgsum were used, it would not be necessary to be concerned about missing powers and the corresponding zero coefficients.

13.1.2 Integral transforms
We consider briefly the relationship of the moment generating function and the characteristic function with well known integral transforms (hence the name of this chapter).

Moment generating function and the Laplace transform
When we examine the integral forms of the moment generating function, we see that they represent forms of the Laplace transform, widely used in engineering and applied mathematics. Suppose F_X is a probability distribution function with F_X(-infinity) = 0. The bilateral Laplace transform for F_X is given by

Integral(-infinity to infinity) e^{-st} F_X(t) dt    (13.51)

The Laplace-Stieltjes transform for F_X is

Integral(-infinity to infinity) e^{-st} F_X(dt)    (13.52)

Thus, if M_X is the moment generating function for X, then M_X(-s) is the Laplace-Stieltjes transform for X (or, equivalently, for F_X).

The theory of Laplace-Stieltjes transforms shows that under conditions sufficiently general to include all practical distribution functions

M_X(-s) = Integral(-infinity to infinity) e^{-st} F_X(dt) = s Integral(-infinity to infinity) e^{-st} F_X(t) dt    (13.53)

Hence

(1/s) M_X(-s) = Integral(-infinity to infinity) e^{-st} F_X(t) dt    (13.54)

The right hand expression is the bilateral Laplace transform of F_X. We may use tables of Laplace transforms to recover F_X when M_X is known. This is particularly useful when the random variable X is nonnegative, so that F_X(t) = 0 for t < 0.

If X is absolutely continuous, then

M_X(-s) = Integral(-infinity to infinity) e^{-st} f_X(t) dt    (13.55)

In this case, M_X(-s) is the bilateral Laplace transform of f_X. For nonnegative random variable X, we may use ordinary tables of the Laplace transform to recover f_X.

Example 13.8: Use of Laplace transform
Suppose nonnegative X has moment generating function

M_X(s) = 1/(1 - s)    (13.56)

We know that this is the moment generating function for the exponential (1) distribution. Now,

(1/s) M_X(-s) = 1/(s(1 + s)) = 1/s - 1/(1 + s)    (13.57)

From a table of Laplace transforms, we find 1/s is the transform for the constant 1 (for t >= 0) and 1/(1 + s) is the transform for e^{-t}, t >= 0, so that F_X(t) = 1 - e^{-t}, t >= 0, as expected.
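For those with access to the Symbolic Math Toolbox, the table lookup in Example 13.8 can be automated. The sketch below is a hedged illustration using ilaplace; it is not one of the text's m-procedures.

% Recovering the exponential(1) density from M_X(-s) symbolically,
% assuming the Symbolic Math Toolbox (syms, ilaplace) is available.
syms s t
M  = 1/(1 - s);                       % MGF from Example 13.8
fX = ilaplace(subs(M, s, -s), s, t)   % returns exp(-t), the exponential(1) density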
Example 13.9: Laplace transform and the density
Suppose the moment generating function for a nonnegative random variable is

M_X(s) = [lambda/(lambda - s)]^alpha    (13.58)

From a table of Laplace transforms, we find that for alpha > 0,

Gamma(alpha)/(s - a)^alpha   is the Laplace transform of   t^{alpha - 1} e^{at}, t >= 0    (13.59)

If we put a = -lambda, we find after some algebraic manipulations

f_X(t) = lambda^alpha t^{alpha - 1} e^{-lambda t}/Gamma(alpha), t >= 0    (13.60)

Thus, X ~ gamma (alpha, lambda), in keeping with the determination, above, of the moment generating function for that distribution.

The characteristic function
Since this function differs from the moment generating function by the interchange of parameter s and iu, where i is the imaginary unit, i^2 = -1, the integral expressions make that change of parameter. The result is that Laplace transforms become Fourier transforms. The theoretical and applied literature is even more extensive for the characteristic function.

Not only do we have the operational properties (T1) and (T2) and the result on moments as derivatives at the origin, but there is an important expansion for the characteristic function.

An expansion theorem
If E[|X|^n] < infinity, then

phi^(k)(0) = i^k E[X^k] for 0 <= k <= n   and   phi(u) = sum_{k=0}^n [(iu)^k/k!] E[X^k] + o(u^n) as u -> 0    (13.61)

We note one limit theorem which has very important consequences.

A fundamental limit theorem
Suppose {F_n : 1 <= n} is a sequence of probability distribution functions and {phi_n : 1 <= n} is the corresponding sequence of characteristic functions.

1. If F is a distribution function such that F_n(t) -> F(t) at every point of continuity for F, and phi is the characteristic function for F, then

phi_n(u) -> phi(u) for all u    (13.62)

2. If phi_n(u) -> phi(u) for all u and phi is continuous at 0, then phi is the characteristic function for a distribution function F such that

F_n(t) -> F(t) at each point of continuity of F    (13.63)
13.2 Convergence and the Central Limit Theorem
(This content is available online at <http://cnx.org/content/m23475/1.5/>.)

13.2.1 The Central Limit Theorem

The central limit theorem (CLT) asserts that if random variable X is the sum of a large class of independent random variables, each with reasonable distributions, then X is approximately normally distributed. This celebrated theorem has been the object of extensive theoretical research directed toward the discovery of the most general conditions under which it is valid. On the other hand, this theorem serves as the basis of an extraordinary amount of applied work. In the statistics of large samples, the sample average is a constant times the sum of the random variables in the sampling process. Thus, for large samples, the sample average is approximately normal, whether or not the population distribution is normal. In much of the theory of errors of measurement, the observed error is the sum of a large number of independent random quantities which contribute additively to the result. Similarly, in the theory of noise, the noise signal is the sum of a large number of random components, independently produced. In such situations, the assumption of a normal population distribution is frequently quite appropriate.

We consider a form of the CLT under hypotheses which are reasonable assumptions in many practical situations. We sketch a proof of this version of the CLT, known as the Lindeberg-Levy theorem, which utilizes the limit theorem on characteristic functions, above, along with certain elementary facts from analysis. It illustrates the kind of argument used in more sophisticated proofs required for more general cases.

Consider an independent sequence {X_n : 1 <= n} of random variables. Form the sequence of partial sums

S_n = sum_{i=1}^n X_i for all n >= 1, with E[S_n] = sum_{i=1}^n E[X_i] and Var[S_n] = sum_{i=1}^n Var[X_i]    (13.64)

Let S*_n be the standardized sum and let F_n be the distribution function for S*_n. The CLT asserts that under appropriate conditions, F_n(t) -> Phi(t) as n -> infinity for all t. We sketch a proof of the theorem under the condition that the X_i form an iid class.

Central Limit Theorem (Lindeberg-Levy form)
If {X_n : 1 <= n} is iid, with E[X_i] = mu, Var[X_i] = sigma^2, and

S*_n = (S_n - n mu)/(sigma sqrt(n))    (13.65)

then

F_n(t) -> Phi(t) as n -> infinity, for all t    (13.66)

IDEAS OF A PROOF
There is no loss of generality in assuming mu = 0. Let phi be the common characteristic function for the X_i, and for each n let phi_n be the characteristic function for S*_n. We have

phi(t) = E[e^{itX}]   and   phi_n(t) = E[e^{itS*_n}] = phi^n(t/(sigma sqrt(n)))    (13.67)

Using the power series expansion of phi about the origin noted above, we have

phi(t) = 1 - sigma^2 t^2/2 + beta(t)   where beta(t) = o(t^2) as t -> 0    (13.68)

This implies

|phi(t/(sigma sqrt(n))) - (1 - t^2/2n)| = |beta(t/(sigma sqrt(n)))| = o(t^2/(sigma^2 n))    (13.69)

so that

n |phi(t/(sigma sqrt(n))) - (1 - t^2/2n)| -> 0 as n -> infinity    (13.70)
A standard lemma of analysis ensures

|phi^n(t/(sigma sqrt(n))) - (1 - t^2/2n)^n| <= n |phi(t/(sigma sqrt(n))) - (1 - t^2/2n)| -> 0 as n -> infinity    (13.71)

It is a well known property of the exponential that

(1 - t^2/2n)^n -> e^{-t^2/2} as n -> infinity    (13.72)

so that

phi^n(t/(sigma sqrt(n))) -> e^{-t^2/2} as n -> infinity, for all t    (13.73)

By the convergence theorem on characteristic functions, above, F_n(t) -> Phi(t).

The theorem says that the distribution functions for sums of increasing numbers of the X_i converge to the normal distribution function, but it does not tell how fast. It is instructive to consider some examples, which are easily worked out with the aid of our m-functions.

Demonstration of the central limit theorem
Discrete examples
We first examine the gaussian approximation in two cases. We take the sum of five iid simple random variables in each case. The first variable has six distinct values; the second has only three. The discrete character of the sum is more evident in the second case. Here we use not only the gaussian approximation, but the gaussian approximation shifted one half unit (the so called continuity correction for integer-valued random variables). The fit is remarkably good in either case with only five terms.

A principal tool is the m-function diidsum (sum of discrete iid random variables). It uses a designated number of iterations of mgsum.

Example 13.10: First random variable

X = [-3.2 -1.05 2.1 4.3 5.6 7.2];
PX = 0.1*[2 2 1 3 1 1];
EX = X*PX'
EX = 1.9900
VX = (X.^2)*PX' - EX^2
VX = 13.0904
[x,px] = diidsum(X,PX,5);    % Distribution for the sum of 5 iid rv
F = cumsum(px);              % Distribution function for the sum
stairs(x,F)                  % Stair step plot
hold on
plot(x,gaussian(5*EX,5*VX,x),'-.')   % Plot of gaussian distribution function
% Plotting details (see Figure 13.2)
Figure 13.2: Distribution for the sum of five iid random variables.

Example 13.11: Second random variable

X = 1:3;
PX = [0.3 0.5 0.2];
EX = X*PX'
EX = 1.9000
EX2 = X.^2*PX'
EX2 = 4.1000
VX = EX2 - EX^2
VX = 0.4900
[x,px] = diidsum(X,PX,5);    % Distribution for the sum of 5 iid rv
F = cumsum(px);              % Distribution function for the sum
stairs(x,F)                  % Stair step plot
hold on
plot(x,gaussian(5*EX,5*VX,x),'-.')     % Plot of gaussian distribution function
plot(x,gaussian(5*EX,5*VX,x+0.5),'o')  % Plot with continuity correction
% Plotting details (see Figure 13.3)
Figure 13.3: Distribution for the sum of five iid random variables.

As another example, we take the sum of twenty-one iid simple random variables with integer values. We examine only part of the distribution function where most of the probability is concentrated. This effectively enlarges the x-scale, so that the nature of the approximation is more readily apparent.

Example 13.12: Sum of twenty-one iid random variables

X = [0 1 3 5 6];
PX = 0.1*[1 2 3 2 2];
EX = dot(X,PX)
EX = 3.3000
VX = dot(X.^2,PX) - EX^2
VX = 4.2100
[x,px] = diidsum(X,PX,21);
F = cumsum(px);
FG = gaussian(21*EX,21*VX,x);
stairs(40:90,F(40:90))
hold on
plot(40:90,FG(40:90))
% Plotting details (see Figure 13.4)
Figure 13.4: Distribution for the sum of twenty-one iid random variables.

Absolutely continuous examples
By use of the discrete approximation, we may get approximations to the sums of absolutely continuous random variables. The results on discrete variables indicate that the more values there are, the more quickly the convergence seems to occur. In our next example, we start with a random variable uniform on (0, 1).

Example 13.13: Sum of three iid, uniform random variables
Suppose X ~ uniform (0, 1). Then E[X] = 0.5 and Var[X] = 1/12.

tappr
Enter matrix [a b] of x-range endpoints  [0 1]
Enter number of x approximation points  100
Enter density as a function of t  t<=1
Use row matrices X and PX as in the simple case
EX = 0.5;
VX = 1/12;
[z,pz] = diidsum(X,PX,3);
F = cumsum(pz);
FG = gaussian(3*EX,3*VX,z);
length(z)
ans = 298
a = 1:5:296;                  % Plot every fifth point
plot(z(a),F(a),z(a),FG(a),'o')
% Plotting details (see Figure 13.5)
Figure 13.5: Distribution for the sum of three iid uniform random variables.

For the sum of only three random variables, the fit is remarkably good. This is not entirely surprising, since the sum of two gives a symmetric triangular distribution on (0, 2). Other distributions may take many more terms to get a good fit. Consider the following example.

Example 13.14: Sum of eight iid random variables
Suppose the density is one on the intervals (-1, -0.5) and (0.5, 1). Although the density is symmetric, it has two separate regions of probability. From symmetry, E[X] = 0. Calculations show Var[X] = E[X^2] = 7/12. The MATLAB computations are:

tappr
Enter matrix [a b] of x-range endpoints  [-1 1]
Enter number of x approximation points  200
Enter density as a function of t  (t<=-0.5)|(t>=0.5)
Use row matrices X and PX as in the simple case
VX = 7/12;
[z,pz] = diidsum(X,PX,8);
F = cumsum(pz);
FG = gaussian(0,8*VX,z);
plot(z,F,z,FG)
% Plotting details (see Figure 13.6)
Figure 13.6: Distribution for the sum of eight iid uniform random variables.

Although the sum of eight random variables is used, the fit to the gaussian is not as good as that for the sum of three in Example 13.13 (Sum of three iid, uniform random variables). In either case, the convergence is remarkably fast: only a few terms are needed for good approximation.

13.2.2 Convergence phenomena in probability theory
The central limit theorem exhibits one of several kinds of convergence important in probability theory, namely convergence in distribution (sometimes called weak convergence). The increasing concentration of values of the sample average random variable A_n with increasing n illustrates convergence in probability. The convergence of the sample average is a form of the so-called weak law of large numbers. For large enough n the probability that A_n lies within a given distance of the population mean can be made as near one as desired. The fact that the variance of A_n becomes small for large n illustrates convergence in the mean (of order 2).

E[|A_n - mu|^2] -> 0 as n -> infinity    (13.74)

In the calculus, we deal with sequences of numbers. If {a_n : 1 <= n} is a sequence of real numbers, we say the sequence converges iff for N sufficiently large a_n approximates arbitrarily closely some number L for all n >= N. This unique number L is called the limit of the sequence. Convergent sequences are characterized by the fact that for large enough N, the distance |a_n - a_m| between any two terms is arbitrarily small for all n, m >= N. Such a sequence is said to be fundamental (or Cauchy). To be precise, if we let epsilon > 0 be the error of approximation, then the sequence is

- Convergent iff there exists a number L such that for any epsilon > 0 there is an N such that

|L - a_n| <= epsilon for all n >= N    (13.75)

- Fundamental iff for any epsilon > 0 there is an N such that

|a_n - a_m| <= epsilon for all n, m >= N    (13.76)
As a result of the completeness of the real numbers, it is true that any fundamental sequence converges (i.e., has a limit). And such convergence has certain desirable properties. For example, the limit of a linear combination of sequences is that linear combination of the separate limits, and limits of products are the products of the limits.

The notion of convergent and fundamental sequences applies to sequences of real-valued functions with a common domain. For each x in the domain, we have a sequence {f_n(x) : 1 <= n} of real numbers. The sequence may converge for some x and fail to converge for others. A somewhat more restrictive condition (and often a more desirable one) for sequences of functions is uniform convergence. Here the uniformity is over values of the argument x. In this case, for any epsilon > 0 there exists an N which works for all x (or for some suitable prescribed set of x).

These concepts may be applied to a sequence of random variables, which are real-valued functions with domain Omega and argument omega. For each argument omega we have a sequence {X_n(omega) : 1 <= n} of real numbers. It is quite possible that such a sequence converges for some omega and diverges (fails to converge) for others. As a matter of fact, in many important cases the sequence converges for all omega except possibly a set (event) of probability zero. In this case, we say the sequence converges almost surely (abbreviated a.s.). The notion of uniform convergence also applies. In probability theory we have the notion of almost uniform convergence. This is the case that the sequence converges uniformly for all omega except for a set of arbitrarily small probability.

The notion of convergence in probability noted above is a quite different kind of convergence. Rather than deal with the sequence on a pointwise basis, it deals with the random variables as such. In the case of the sample average, the closeness to a limit is expressed in terms of the probability that the observed value X_n(omega) should lie close to the value X(omega) of the limiting random variable. We may state this precisely as follows:

A sequence {X_n : 1 <= n} converges to X in probability, designated X_n ->(P) X, iff for any epsilon > 0,

lim_n P(|X - X_n| > epsilon) = 0    (13.77)

There is a corresponding notion of a sequence fundamental in probability.

The kind of convergence noted for the sample average is convergence in probability (a weak law of large numbers). What is really desired in most cases is a.s. convergence (a strong law of large numbers). It turns out that for a sampling process of the kind used in simple statistics, the convergence of the sample average is almost sure (i.e., the strong law holds). To establish this requires much more detailed and sophisticated analysis than we are prepared to make in this treatment.

The following schematic representation may help to visualize the difference between almost-sure convergence and convergence in probability. In setting up the basic probability model, we think in terms of balls drawn from a jar or box. Instead of balls, consider for each possible outcome omega a tape on which there is the sequence of values X_1(omega), X_2(omega), X_3(omega), ...

- If the sequence of random variables converges a.s. to a random variable X, then there is a set of exceptional tapes which has zero probability. For all other tapes, X_n(omega) -> X(omega). This means that by going far enough out on any such tape, the values X_n(omega) beyond that point all lie within a prescribed distance of the value X(omega) of the limit random variable.
- If the sequence converges in probability, the situation may be quite different. A tape is selected. For n sufficiently large, the probability is arbitrarily near one that the observed value X_n(omega) lies within a prescribed distance of X(omega).
This says nothing about the values X_m(omega) on the selected tape for any larger m. In fact, the sequence on the selected tape may very well diverge. It is not difficult to construct examples for which there is convergence in probability but pointwise convergence for no omega. It is easy to confuse these two types of convergence; a small numerical illustration of convergence in probability is sketched below.
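The following sketch is illustrative only (the population and parameter values are not from the text). It exhibits convergence in probability for the sample average of an exponential population: the probability of a deviation larger than epsilon shrinks as n grows.

% Convergence in probability of the sample average A_n (illustrative).
mu = 1; eps = 0.1; reps = 10000;
for n = [10 100 1000]
    A = mean(-log(rand(n, reps)));   % reps averages of n exponential(1) values
    fprintf('n = %5d   P(|A_n - mu| > eps) approx %6.4f\n', ...
            n, mean(abs(A - mu) > eps))
end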
Suppose {X_n : 1 <= n} is a sequence of real random variables.

Definition. A sequence {X_n : 1 <= n} converges in the mean of order p to X iff

E[|X - X_n|^p] -> 0 as n -> infinity, designated X_n ->(L^p) X as n -> infinity    (13.78)

If the order p is one, we simply say the sequence converges in the mean. For p = 2, we speak of mean-square convergence.

The introduction of a new type of convergence raises a number of questions.

1. There is the question of fundamental (or Cauchy) sequences and convergent sequences.
2. Do the various types of limits have the usual properties of limits? Is the limit of a linear combination of sequences the linear combination of the limits? Is the limit of products the product of the limits?
3. What conditions imply the various kinds of convergence?
4. What is the relation between the various kinds of convergence?

Before sketching briefly some of the relationships between convergence types, we consider one important condition known as uniform integrability. According to the property (E9b) (list, p. 600) for integrals, a random variable X is integrable iff

E[I_{|X| > a} |X|] -> 0 as a -> infinity    (13.79)

Roughly speaking, to be integrable a random variable cannot be too large on too large a set. We use this characterization of the integrability of a single random variable to define the notion of the uniform integrability of a class.

Definition. An arbitrary class {X_t : t in T} is uniformly integrable (abbreviated u.i.) with respect to probability measure P iff

sup_{t in T} E[I_{|X_t| > a} |X_t|] -> 0 as a -> infinity    (13.80)

This condition plays a key role in many aspects of theoretical probability.

The relationships between types of convergence are important. Sometimes only one kind can be established. Also, it may be easier to establish one type which implies another of more immediate interest. It is important to be aware of these various types of convergence, since they are frequently utilized in advanced treatments of applied probability and of statistics. We simply state informally some of the important relationships. A somewhat more detailed summary is given in PA, Chapter 17. But for a complete treatment it is necessary to consult more advanced treatments of probability and measure.

Relationships between types of convergence for probability measures
Consider a sequence {X_n : 1 <= n} of random variables.

1. It converges almost surely iff it converges almost uniformly.
2. If it converges almost surely, then it converges in probability.
3. It converges in mean, order p, iff it is uniformly integrable and converges in probability.
4. If it converges in probability, then it converges in distribution (i.e., weakly).

Various chains of implication can be traced. For example

- Almost sure convergence implies convergence in probability implies convergence in distribution.
- Almost sure convergence and uniform integrability implies convergence in the mean.

We do not develop the underlying theory. While much of it could be treated with elementary ideas, a complete treatment requires considerable development of the underlying measure theory.
13.3 Simple Random Samples and Statistics
(This content is available online at <http://cnx.org/content/m23496/1.7/>.)

13.3.1 Simple Random Samples and Statistics

We formulate the notion of a (simple) random sample, which is basic to much of classical statistics. Once formulated, we may apply probability theory to exhibit several basic ideas of statistical analysis.

We begin with the notion of a population distribution. A population may be most any collection of individuals or entities. Associated with each member is a quantity or a feature that can be assigned a number. The quantity varies throughout the population. The population distribution is the distribution of that quantity among the members of the population.

If each member could be observed, the population distribution could be determined completely. However, that is not always feasible. In order to obtain information about the population distribution, we select at random a subset of the population and observe how the quantity varies over the sample. Hopefully, the sample distribution will give a useful approximation to the population distribution.

The sampling process
We take a sample of size n, which means we select n members of the population and observe the quantity associated with each. The selection is done in such a manner that on any trial each member is equally likely to be selected. Also, the sampling is done in such a way that the result of any one selection does not affect, and is not affected by, the others. This provides a model for sampling either from a very large population (often referred to as an infinite population) or sampling with replacement from a small population.

The goal is to determine as much as possible about the character of the population. Two important parameters are the mean and the variance. We want the population mean and the population variance. If the sample is representative of the population, then the sample mean and the sample variance should approximate the population quantities. We model the sampling process as follows:

- Let X_i, 1 <= i <= n, be the random variable for the ith component trial. Then the class {X_i : 1 <= i <= n} is iid, with each member having the population distribution.
- A sampling process is the iid class {X_i : 1 <= i <= n}.
- A random sample is an observation, or realization, (t_1, t_2, ..., t_n) of the sampling process.

The sample average and the population mean
Consider the numerical average of the values in the sample, x-bar = (1/n) sum_{i=1}^n t_i. This is an observation of the sample average

A_n = (1/n) sum_{i=1}^n X_i = (1/n) S_n    (13.81)

The sample sum S_n and the sample average A_n are random variables. If another observation were made (another sample taken), the observed value of these quantities would probably be different. Now S_n and A_n are functions of the random variables {X_i : 1 <= i <= n} in the sampling process. As such, they have distributions related to the population distribution (the common distribution of the X_i). According to the central limit theorem, for any reasonable sized sample they should be approximately normally distributed. As the examples demonstrating the central limit theorem show, the sample size need not be large in many cases. Now if the population mean E[X] is mu and the population variance Var[X] is sigma^2, then

E[S_n] = sum_{i=1}^n E[X_i] = nE[X] = n mu   and   Var[S_n] = sum_{i=1}^n Var[X_i] = nVar[X] = n sigma^2    (13.82)

so that

E[A_n] = (1/n) E[S_n] = mu   and   Var[A_n] = (1/n^2) Var[S_n] = sigma^2/n    (13.83)
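These two facts are easy to see numerically. The following sketch is illustrative, not from the text: a Poisson population with mean 7 is assumed (so that sigma^2 = mu), and poissrnd requires the Statistics Toolbox.

% Empirical check that E[A_n] = mu and Var[A_n] = sigma^2/n.
mu = 7; n = 25; reps = 50000;           % Poisson population: sigma^2 = mu
A = mean(poissrnd(mu, n, reps));        % reps sample averages of size n
[mean(A), mu]                           % approx mu
[var(A),  mu/n]                         % approx sigma^2/n = 0.28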
so that

E[A_n] = (1/n) E[S_n] = µ  and  Var[A_n] = (1/n²) Var[S_n] = σ²/n    (13.83)

Herein lies the key to the usefulness of a large sample. The mean of the sample average A_n is the same as the population mean, but the variance of the sample average is 1/n times the population variance. Thus the standard deviation, as a measure of the variation, is reduced by a factor 1/√n for large enough sample, and the probability is high that the observed value of the sample average will be close to the population mean.

Example 13.15: Sample size
Suppose a population has mean µ and variance σ². A sample of size n is to be taken. There are complementary questions:

1. If n is given, what is the probability the sample average lies within distance a from the population mean?
2. What value of n is required to ensure a probability of at least p that the sample average lies within distance a from the population mean?

SOLUTION
Suppose the sample variance is known or can be approximated reasonably. If the sample size n is reasonably large, depending on the population distribution (as seen in the previous demonstrations), then A_n is approximately N(µ, σ²/n).

1. Sample size given, probability to be determined.

p = P(|A_n − µ| ≤ a) = P(|A_n − µ|/(σ/√n) ≤ a√n/σ) = 2Φ(a√n/σ) − 1    (13.84)

2. Sample size to be determined, probability specified.

2Φ(a√n/σ) − 1 ≥ p  iff  Φ(a√n/σ) ≥ (p + 1)/2    (13.85)

Find from a table or by use of the inverse normal function the value of x = a√n/σ required to make Φ(x) at least (p + 1)/2. Then

n ≥ σ²(x/a)² = (σ/a)² x²    (13.86)

We may use the MATLAB function norminv to calculate values of x for various p.

p = [0.8 0.9 0.95 0.98 0.99];
x = norminv(0,1,(1+p)/2);
disp([p;x;x.^2]')
    0.8000    1.2816    1.6424
    0.9000    1.6449    2.7055
    0.9500    1.9600    3.8415
    0.9800    2.3263    5.4119
    0.9900    2.5758    6.6349

For p = 0.95, σ = 2, a = 0.2, we get n ≥ (2/0.2)² · 3.8415 = 384.15. Use at least 385, or perhaps 400, because of uncertainty about the actual σ².
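Both questions of Example 13.15 are one-liners in MATLAB. The following sketch uses the Statistics Toolbox functions normcdf and norminv, whose argument convention (norminv(p) = Φ⁻¹(p)) differs from the norminv m-function used in this text; the values σ = 2 and a = 0.2 are the illustrative numbers from the example.

sigma = 2;  a = 0.2;
% Question 1: probability for a given sample size
n = 385;
p1 = 2*normcdf(a*sqrt(n)/sigma) - 1     % P(|A_n - mu| <= a), about 0.95
% Question 2: sample size for a given probability
p = 0.95;
x = norminv((1 + p)/2);                 % x = a*sqrt(n)/sigma
n2 = ceil((sigma/a)^2*x^2)              % smallest integer n, gives 385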
The idea of a statistic

As a function of the random variables in the sampling process, the sample average is an example of a statistic.

Definition. A statistic is a function of the class {X_i : 1 ≤ i ≤ n} which uses explicitly no unknown parameters of the population.

Example 13.16: Statistics as functions of the sampling process
The random variable

W = (1/n) Σ_{i=1}^n (X_i − µ)²,  where  µ = E[X]    (13.87)

is not a statistic, since it uses the unknown parameter µ. However, the following is a statistic:

V_n* = (1/n) Σ_{i=1}^n (X_i − A_n)² = (1/n) Σ_{i=1}^n X_i² − A_n²    (13.88)

It would appear that V_n* might be a reasonable estimate of the population variance. However, the following result shows that a slight modification is desirable.

Example 13.17: An estimator for the population variance
The statistic

V_n = (1/(n − 1)) Σ_{i=1}^n (X_i − A_n)²    (13.89)

is an estimator for the population variance.

VERIFICATION
Consider the statistic

V_n* = (1/n) Σ_{i=1}^n (X_i − A_n)² = (1/n) Σ_{i=1}^n X_i² − A_n²    (13.90)

Noting that E[X²] = σ² + µ², we use the last expression to show

E[V_n*] = (σ² + µ²) − (σ²/n + µ²) = ((n − 1)/n) σ²    (13.91)

The quantity has a bias in the average. If we consider

V_n = (n/(n − 1)) V_n* = (1/(n − 1)) Σ_{i=1}^n (X_i − A_n)²    (13.92)

then

E[V_n] = (n/(n − 1)) E[V_n*] = (n/(n − 1)) · ((n − 1)/n) σ² = σ²    (13.93)

The quantity V_n, with 1/(n − 1) rather than 1/n, is often called the sample variance to distinguish it from the population variance. If the set of numbers (t_1, t_2, ..., t_N) represent the complete set of values in a population of N members, the variance for the population would be given by

σ² = (1/N) Σ_{i=1}^N t_i² − ((1/N) Σ_{i=1}^N t_i)²    (13.94)

Here we use 1/N rather than 1/(N − 1).
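A quick simulation makes the bias of V_n* and the unbiasedness of V_n in Example 13.17 visible. This is a minimal sketch in base MATLAB; the uniform population and the replication count are illustrative assumptions, not part of the text.

rng(0);
n = 5;  nrep = 100000;           % small n makes the bias factor (n-1)/n = 0.8 visible
X = rand(n,nrep);                % each column is one sample; sigma^2 = 1/12
Vstar = var(X,1);                % var(X,1) normalizes by 1/n, as in Vn*
V = var(X);                      % var(X) normalizes by 1/(n-1), as in Vn
fprintf('E[Vn*] approx %.5f (theory %.5f)\n', mean(Vstar), (n-1)/(12*n));
fprintf('E[Vn]  approx %.5f (theory %.5f)\n', mean(V), 1/12);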
Since the statistic V_n has mean value σ², it seems a reasonable candidate for an estimator of the population variance. If we ask how good it is, we need to consider its variance. As a random variable, it has a variance. An evaluation similar to that for the mean, but more complicated in detail, shows that

Var[V_n] = (1/n) (µ_4 − ((n − 3)/(n − 1)) σ⁴),  where  µ_4 = E[(X − µ)⁴]    (13.95)

For large n, Var[V_n] is small, so that V_n is a good large-sample estimator for σ².

Example 13.18: A sampling demonstration of the CLT
Consider a population random variable X ∼ uniform [−1, 1]. Then E[X] = 0 and Var[X] = 1/3. We take 100 samples of size 100, and determine the sample sums. This gives a sample of size 100 of the sample sum random variable S_100, which has mean zero and variance 100/3. For each observed value of the sample sum random variable, we plot the fraction of observed sums less than or equal to that value. This yields an experimental distribution function for S_100, which is compared with the distribution function for a random variable Y ∼ N(0, 100/3).

rand('seed',0)        % Seeds random number generator for later comparison
tappr                 % Approximation setup
Enter matrix [a b] of x-range endpoints  [-1 1]
Enter number of x approximation points  100
Enter density as a function of t  0.5*(t<=1)
Use row matrices X and PX as in the simple case
qsample               % Creates sample
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX
Sample size n = 10000           % Master sample size 10,000
Sample average ex = 0.003746
Approximate population mean E(X) = 1.561e-17
Sample variance vx = 0.3344
Approximate population variance V(X) = 0.3333
m = 100;
a = reshape(T,m,m);             % Forms 100 samples of size 100
A = sum(a);                     % Matrix A of sample sums
[t,f] = csort(A,ones(1,m));     % Sorts A and determines cumulative
p = cumsum(f)/m;                % Fraction of elements <= each value
pg = gaussian(0,100/3,t);       % Gaussian dbn for sample sum values
plot(t,p,'k-.',t,pg,'k')        % Comparative plot
% Plotting details (see Figure 13.7)
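For readers without the m-procedures tappr, qsample, csort, and gaussian on the path, essentially the same demonstration can be sketched in base MATLAB (normcdf is from the Statistics Toolbox). This is an assumed equivalent, not the procedure used in the text; it samples the continuous uniform distribution directly rather than the discretized one.

rng(0);
m = 100;                          % 100 samples of size 100
a = 2*rand(100,m) - 1;            % X ~ uniform[-1,1]; columns are samples
A = sort(sum(a));                 % the 100 observed sample sums, sorted
p = (1:m)/m;                      % experimental distribution function values
pg = normcdf(A,0,sqrt(100/3));    % N(0,100/3) distribution function
stairs(A,p), hold on              % experimental dbn function
plot(A,pg), hold off              % comparison with the normal dbn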
Figure 13.7: The central limit theorem for sample sums.

13.4 Problems on Transform Methods

Exercise 13.1 (Solution on p. 412.)
Calculate directly the generating function g_X(s) for the geometric (p) distribution.

Exercise 13.2 (Solution on p. 412.)
Calculate directly the generating function g_X(s) for the Poisson (µ) distribution.

Exercise 13.3 (Solution on p. 412.)
A projection bulb has life (in hours) represented by X ∼ exponential (1/50). The unit will be replaced immediately upon failure or at 60 hours, whichever comes first. Determine the moment generating function for the time Y to replacement.

Exercise 13.4 (Solution on p. 412.)
Simple random variable X has distribution

X = [−3 −2 0 1 4]   PX = [0.15 0.20 0.30 0.25 0.10]    (13.96)

a. Determine the moment generating function for X.
b. Show by direct calculation that M'_X(0) = E[X] and M''_X(0) = E[X²].

This content is available online at <http://cnx.org/content/m24424/1.5/>.
Exercise 13.5 (Solution on p. 412.)
Use the moment generating function to obtain the variances for the following distributions:
a. Exponential (λ)
b. Gamma (α, λ)
c. Normal (µ, σ²)

Exercise 13.6 (Solution on p. 413.)
The pair {X, Y} is iid with common moment generating function λ³/(λ − s)³. Determine the moment generating function for Z = 2X − 4Y + 3.

Exercise 13.7 (Solution on p. 413.)
The pair {X, Y} is iid with common moment generating function

M_X(s) = (0.6 + 0.4e^s)    (13.97)

Determine the moment generating function for Z = 5X + 2Y.

Exercise 13.8 (Solution on p. 413.)
Use the moment generating function for the symmetric triangular distribution on (−c, c) as derived in the section "Three Basic Transforms" (list, p. 388).
a. Obtain an expression for the symmetric triangular distribution on (a, b) for any a < b.
b. Use the result of part (a) to show that the sum of two independent random variables uniform on (a, b) has symmetric triangular distribution on (2a, 2b).

Exercise 13.9 (Solution on p. 413.)
Random variable X has moment generating function p²/(1 − qe^s)².
a. Use derivatives to determine E[X] and Var[X].
b. Recognize the distribution from the form and compare E[X] and Var[X] with the result of part (a).

Exercise 13.10 (Solution on p. 413.)
The pair {X, Y} is independent, with X ∼ Poisson (4) and Y ∼ geometric (0.3). Determine the generating function g_Z for Z = 3X + 2Y.

Exercise 13.11 (Solution on p. 413.)
Random variable X has moment generating function

M_X(s) = (1/(1 − 3s)) · exp(16s²/2 + 3s)    (13.98)

By recognizing forms and using rules of combinations, determine E[X] and Var[X].

Exercise 13.12 (Solution on p. 413.)
Random variable X has moment generating function

M_X(s) = exp(3(e^s − 1)) · exp(16s²/2 + 3s) · (1/(1 − 5s))

By recognizing forms and using rules of combinations, determine E[X] and Var[X].

Exercise 13.13 (Solution on p. 414.)
Suppose the class {A, B, C} of events is independent, with respective probabilities 0.3, 0.5, 0.2. Consider

X = −3I_A + 2I_B + 4I_C    (13.99)

a. Determine the moment generating functions for I_A, I_B, I_C and use properties of moment generating functions to determine the moment generating function for X.
b. Use the moment generating function to determine the distribution for X.
c. Determine the distribution for the sum with mgsum3, and compare with the result of part (b).

Exercise 13.14 (Solution on p. 414.)
Suppose the pair {X, Y} is independent, with both X and Y binomial. Use generating functions to show under what condition, if any, X + Y is binomial.

Exercise 13.15 (Solution on p. 414.)
Suppose the pair {X, Y} is independent, with both X and Y Poisson.
a. Use generating functions to show under what condition X + Y is Poisson.
b. What about X − Y? Justify your answer.

Exercise 13.16 (Solution on p. 414.)
Suppose the pair {X, Y} is independent, Y is nonnegative integer-valued, X is Poisson, and X + Y is Poisson. Use the generating functions to show that Y is Poisson.

Exercise 13.17 (Solution on p. 414.)
Suppose the pair {X, Y} is independent, with X ∼ binomial (5, 0.33) and Y ∼ binomial (7, 0.47).
a. Use mgsum to obtain the distribution for Z = 2X + 4Y.
b. Does Z have the binomial distribution? Is the result surprising? Examine the first few possible values and the generating function for Z; does it have the form for the binomial distribution?

Exercise 13.18 (Solution on p. 415.)
Suppose the pair {X, Y} is independent, with X ∼ binomial (6, 0.51) and Y ∼ uniform on {−1.3, 0.5, 1.5, 2.5, 3.3}. Let

U = 3X² − 2X + 1  and  V = Y³ + 2Y − 3    (13.100)

a. Use the distributions for the separate terms to determine the distributions for U and V.
b. Use icalc and csort to obtain the distribution for U + V.
c. Use mgsum to obtain the distribution for U + V; compare with result (b).

Exercise 13.19 (Solution on p. 415.)
Suppose the pair {X, Y} is independent, with X ∼ binomial (8, 0.39) and Y ∼ uniform on {−1, 1, 3, 5, 7}. Let G = g(X) = 3X² − 2X and H = h(Y) = 2Y² + Y + 3.
a. Use mgsum to obtain the distribution for G + H.
b. Use icalc and csort to obtain the distribution for G + H, and compare with the result of part (a).

Exercise 13.20 (Solution on p. 416.)
If X is a nonnegative integer-valued random variable, express the generating function as a power series.
a. Show that the kth derivative at s = 1 is

g_X^(k)(1) = E[X(X − 1)(X − 2) ··· (X − k + 1)]    (13.101)

b. Use this to show that Var[X] = g''_X(1) + g'_X(1) − [g'_X(1)]².
Exercise 13.21 (Solution on p. 416.)
Let M_X(·) be the moment generating function for X.
a. Show that Var[X] is the second derivative of e^{−sµ} M_X(s), evaluated at s = 0.
b. Use this fact to show that if X ∼ N(µ, σ²), then Var[X] = σ².

Exercise 13.22 (Solution on p. 416.)
Use derivatives of M_X(s) to obtain the mean and variance of the negative binomial (m, p) distribution.

Exercise 13.23 (Solution on p. 416.)
Use moment generating functions to show that variances add for the sum or difference of independent random variables.

Exercise 13.24 (Solution on p. 416.)
The pair {X, Y} is iid N(3, 5). Use the moment generating function to show that Z = 3X − 2Y + 3 is normal (see Example 3 (Example 13.3: Affine combination of independent normal random variables) from "Transform Methods" for the general result).

Exercise 13.25 (Solution on p. 417.)
Use the central limit theorem to show that for large enough sample size (usually 20 or more), the sample average

A_n = (1/n) Σ_{i=1}^n X_i    (13.102)

is approximately N(µ, σ²/n) for any reasonable population distribution having mean value µ and variance σ².

Exercise 13.26 (Solution on p. 417.)
A population has standard deviation approximately three. It is desired to determine the sample size n needed to ensure that with probability 0.95 the sample average will be within 0.5 of the mean value.
a. Use the Chebyshev inequality to estimate the needed sample size.
b. Use the normal approximation to estimate n (see Example 1 (Example 13.15: Sample size) from "Simple Random Samples and Statistics").
Solutions to Exercises in Chapter 13

Solution to Exercise 13.1 (p. 408)
g_X(s) = E[s^X] = Σ_{k=0}^∞ p_k s^k = p Σ_{k=0}^∞ q^k s^k = p/(1 − qs)  (geometric series)    (13.103)

Solution to Exercise 13.2 (p. 408)
g_X(s) = E[s^X] = Σ_{k=0}^∞ p_k s^k = e^{−µ} Σ_{k=0}^∞ µ^k s^k/k! = e^{−µ} e^{µs} = e^{µ(s−1)}    (13.104)

Solution to Exercise 13.3 (p. 408)
Y = I_{[0,a]}(X) X + I_{(a,∞)}(X) a, so that e^{sY} = I_{[0,a]}(X) e^{sX} + I_{(a,∞)}(X) e^{as}    (13.105)

M_Y(s) = ∫_0^a e^{st} λe^{−λt} dt + e^{sa} ∫_a^∞ λe^{−λt} dt    (13.106)
  = (λ/(λ − s)) (1 − e^{−(λ−s)a}) + e^{−(λ−s)a}    (13.107)

Solution to Exercise 13.4 (p. 408)
M_X(s) = 0.15e^{−3s} + 0.20e^{−2s} + 0.30 + 0.25e^{s} + 0.10e^{4s}    (13.108)
M'_X(s) = −3 · 0.15e^{−3s} − 2 · 0.20e^{−2s} + 0 + 0.25e^{s} + 4 · 0.10e^{4s}    (13.109)
M''_X(s) = (−3)² · 0.15e^{−3s} + (−2)² · 0.20e^{−2s} + 0 + 0.25e^{s} + 4² · 0.10e^{4s}    (13.110)
Setting s = 0 and using e^0 = 1 give the desired results.

Solution to Exercise 13.5 (p. 409)
a. Exponential:
M_X(s) = λ/(λ − s)   M'_X(s) = λ/(λ − s)²   M''_X(s) = 2λ/(λ − s)³    (13.111)
E[X] = 1/λ   E[X²] = 2/λ²   Var[X] = 2/λ² − 1/λ² = 1/λ²    (13.112)

b. Gamma (α, λ):
M_X(s) = (λ/(λ − s))^α   M'_X(s) = α (λ/(λ − s))^{α−1} · λ/(λ − s)² = α (λ/(λ − s))^α · 1/(λ − s)    (13.113)
M''_X(s) = α² (λ/(λ − s))^α · 1/(λ − s)² + α (λ/(λ − s))^α · 1/(λ − s)²    (13.114)
E[X] = α/λ   E[X²] = (α² + α)/λ²   Var[X] = α/λ²    (13.115)
c. Normal (µ, σ):
M_X(s) = exp(σ²s²/2 + µs)   M'_X(s) = M_X(s) · (σ²s + µ)    (13.116)
M''_X(s) = M_X(s) · (σ²s + µ)² + M_X(s) σ²    (13.117)
E[X] = µ   E[X²] = µ² + σ²   Var[X] = σ²    (13.118)

Solution to Exercise 13.6 (p. 409)
M_Z(s) = e^{3s} M_X(2s) M_Y(−4s) = e^{3s} (λ/(λ − 2s))³ (λ/(λ + 4s))³    (13.119)

Solution to Exercise 13.7 (p. 409)
M_Z(s) = M_X(5s) M_Y(2s) = (0.6 + 0.4e^{5s}) (0.6 + 0.4e^{2s})    (13.120)

Solution to Exercise 13.8 (p. 409)
a. Let Y ∼ symmetric triangular on (−c, c), so that
M_Y(s) = (e^{cs} + e^{−cs} − 2)/(c²s²)    (13.121)
If X = Y + m, then X is symmetric triangular on (m − c, m + c) = (a, b), where m = (a + b)/2 and c = (b − a)/2, and
M_X(s) = e^{ms} M_Y(s) = (e^{(m+c)s} + e^{(m−c)s} − 2e^{ms})/(c²s²) = (e^{bs} + e^{as} − 2e^{s(a+b)/2}) / (((b − a)/2)² s²)    (13.122)
b. For X, Y independent and uniform on (a, b),
M_{X+Y}(s) = ((e^{sb} − e^{sa})/(s(b − a)))² = (e^{2sb} + e^{2sa} − 2e^{s(b+a)})/(s²(b − a)²)    (13.123)
which is the form in part (a) for the symmetric triangular distribution on (2a, 2b).

Solution to Exercise 13.9 (p. 409)
a. M_X(s) = p²(1 − qe^s)^{−2}, so that
M'_X(s) = 2p²qe^s(1 − qe^s)^{−3}   M''_X(s) = 2p²qe^s(1 − qe^s)^{−3} + 6p²q²e^{2s}(1 − qe^s)^{−4}    (13.124)
so that E[X] = 2q/p and E[X²] = 2q/p + 6q²/p², giving
Var[X] = 2q/p + 6q²/p² − 4q²/p² = (2pq + 2q²)/p² = 2q/p²    (13.125)
b. X ∼ negative binomial (2, p), which has E[X] = 2q/p and Var[X] = 2q/p², agreeing with part (a).

Solution to Exercise 13.10 (p. 409)
g_Z(s) = g_X(s³) g_Y(s²) = e^{4(s³−1)} · 0.3/(1 − 0.7s²)    (13.126)

Solution to Exercise 13.11 (p. 409)
X = X1 + X2, with X1 ∼ exponential (1/3) and X2 ∼ N(3, 16), so that
E[X] = 3 + 3 = 6   Var[X] = 9 + 16 = 25    (13.127)

Solution to Exercise 13.12 (p. 409)
X = X1 + X2 + X3, with X1 ∼ Poisson (3), X2 ∼ exponential (1/5), and X3 ∼ N(3, 16), so that
E[X] = 3 + 5 + 3 = 11   Var[X] = 3 + 25 + 16 = 44    (13.130)
Solution to Exercise 13.13 (p. 409)
a. M_X(s) = (0.7 + 0.3e^{−3s}) (0.5 + 0.5e^{2s}) (0.8 + 0.2e^{4s})    (13.131)
  = 0.12e^{−3s} + 0.12e^{−s} + 0.28 + 0.03e^{s} + 0.28e^{2s} + 0.03e^{3s} + 0.07e^{4s} + 0.07e^{6s}
b. The distribution is
X = [−3 −1 0 1 2 3 4 6]   PX = [0.12 0.12 0.28 0.03 0.28 0.03 0.07 0.07]    (13.132)

c = [-3 2 4 0];                  % coefficients of X = -3IA + 2IB + 4IC
P = 0.1*[3 5 2];
canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
X1 = [0 -3];  X2 = [0 2];  X3 = [0 4];
P1 = [0.7 0.3];  P2 = [0.5 0.5];  P3 = [0.8 0.2];
[x,px] = mgsum3(X1,X2,X3,P1,P2,P3);
disp([X;PX;x;px]')
   -3.0000    0.1200   -3.0000    0.1200
   -1.0000    0.1200   -1.0000    0.1200
         0    0.2800         0    0.2800
    1.0000    0.0300    1.0000    0.0300
    2.0000    0.2800    2.0000    0.2800
    3.0000    0.0300    3.0000    0.0300
    4.0000    0.0700    4.0000    0.0700
    6.0000    0.0700    6.0000    0.0700

Solution to Exercise 13.14 (p. 410)
Binomial iff both have the same p, since
g_{X+Y}(s) = (q1 + p1 s)^n (q2 + p2 s)^m = (q + ps)^{n+m}  iff  p1 = p2    (13.134)

Solution to Exercise 13.15 (p. 410)
a. Always Poisson, since
g_{X+Y}(s) = e^{µ(s−1)} e^{ν(s−1)} = e^{(µ+ν)(s−1)}    (13.135)
b. X − Y could have negative values, so it cannot be Poisson.

Solution to Exercise 13.16 (p. 410)
g_X(s) = e^{µ(s−1)} and g_{X+Y}(s) = g_X(s) g_Y(s) = e^{(µ+ν)(s−1)}, where ν = E[Y] > 0, since E[X + Y] = µ + ν. Division by g_X(s) gives g_Y(s) = e^{ν(s−1)}, so Y is Poisson.

Solution to Exercise 13.17 (p. 410)
X = 0:5;  PX = ibinom(5,0.33,X);
Y = 0:7;  PY = ibinom(7,0.47,Y);
[Z,PZ] = mgsum(2*X,4*Y,PX,PY);
disp([Z(1:5);PZ(1:5)]')
         0    0.0002
    2.0000    0.0012
    4.0000    0.0043
    6.0000    0.0118
    8.0000    0.0259
% Cannot be binomial, since odd values missing

Solution to Exercise 13.18 (p. 410)
x = 0:6;  px = ibinom(6,0.51,x);
Y = [-1.3 0.5 1.5 2.5 3.3];  PY = (1/5)*ones(1,5);
U = 3*x.^2 - 2*x + 1;  V = Y.^3 + 2*Y - 3;
[Z,PZ] = mgsum(U,V,px,PY);
icalc
 Enter row matrix of X-values  x
 Enter row matrix of Y-values  Y
 Enter X probabilities  px
 Enter Y probabilities  PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
M = 3*t.^2 - 2*t + 1 + u.^3 + 2*u - 3;
[z,pz] = csort(M,P);
e = max(abs(pz - PZ))    % Comparison of p values
e = 0

Solution to Exercise 13.19 (p. 410)
X = 0:8;  PX = ibinom(8,0.39,X);
Y = [-1 1 3 5 7];  PY = (1/5)*ones(1,5);
G = 3*X.^2 - 2*X;  H = 2*Y.^2 + Y + 3;
[Z,PZ] = mgsum(G,H,PX,PY);
icalc
 Enter row matrix of X-values  X
 Enter row matrix of Y-values  Y
 Enter X probabilities  PX
 Enter Y probabilities  PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
M = 3*t.^2 - 2*t + 2*u.^2 + u + 3;
[z,pz] = csort(M,P);
e = max(abs(pz - PZ))
e = 0

Solution to Exercise 13.20 (p. 410)
Since power series may be differentiated term by term,
g_X^(n)(s) = Σ_{k=n}^∞ k(k − 1) ··· (k − n + 1) p_k s^{k−n}, so that    (13.137)
g_X^(n)(1) = Σ_{k=n}^∞ k(k − 1) ··· (k − n + 1) p_k = E[X(X − 1) ··· (X − n + 1)]    (13.138)
Hence
Var[X] = E[X²] − E²[X] = E[X(X − 1)] + E[X] − E²[X] = g''_X(1) + g'_X(1) − [g'_X(1)]²    (13.139)

Solution to Exercise 13.21 (p. 411)
To simplify writing, let f(s) = e^{−sµ} M_X(s). Then
f'(s) = e^{−sµ} [M'_X(s) − µ M_X(s)]   f''(s) = e^{−sµ} [µ² M_X(s) − 2µ M'_X(s) + M''_X(s)]    (13.140)
Setting s = 0 and using the results on moments gives
f''(0) = µ² − 2µ² + E[X²] = E[X²] − µ² = Var[X]    (13.141)
For X ∼ N(µ, σ²), f(s) = e^{−sµ} exp(σ²s²/2 + µs) = exp(σ²s²/2), so that f''(0) = σ².

Solution to Exercise 13.22 (p. 411)
To simplify writing, set f(s) = M_X(s) = p^m (1 − qe^s)^{−m}. Then
f'(s) = m p^m qe^s (1 − qe^s)^{−(m+1)}
f''(s) = m p^m qe^s (1 − qe^s)^{−(m+1)} + m(m + 1) p^m q²e^{2s} (1 − qe^s)^{−(m+2)}    (13.142)
Setting s = 0,
E[X] = mq/p   E[X²] = mq/p + m(m + 1)q²/p²
Var[X] = mq/p + m(m + 1)q²/p² − m²q²/p² = mq/p + mq²/p² = mq(p + q)/p² = mq/p²    (13.143)

Solution to Exercise 13.23 (p. 411)
To simplify writing, let f(s) = M_X(s), g(s) = M_Y(s), and h(s) = M_X(s) M_Y(s). Then
h'(s) = f'(s)g(s) + f(s)g'(s)   h''(s) = f''(s)g(s) + 2f'(s)g'(s) + f(s)g''(s)    (13.145)
Setting s = 0 yields
E[X + Y] = E[X] + E[Y]   E[(X + Y)²] = E[X²] + 2E[X]E[Y] + E[Y²]   E²[X + Y] = E²[X] + 2E[X]E[Y] + E²[Y]    (13.146)
Taking the difference gives Var[X + Y] = Var[X] + Var[Y]. A similar treatment with g(s) replaced by g(−s) shows Var[X − Y] = Var[X] + Var[Y].    (13.147)

Solution to Exercise 13.24 (p. 411)
M_{3X}(s) = M_X(3s) = exp((9 · 5s²)/2 + 3 · 3s)   M_{−2Y}(s) = M_Y(−2s) = exp((4 · 5s²)/2 − 2 · 3s)    (13.148)
M_Z(s) = e^{3s} M_X(3s) M_Y(−2s) = exp(((45 + 20)s²)/2 + (9 − 6 + 3)s) = exp(65s²/2 + 6s)    (13.149)
so Z is normal.
Solution to Exercise 13.25 (p. 411)
E[A_n] = (1/n) Σ_{i=1}^n µ = µ   Var[A_n] = (1/n²) Σ_{i=1}^n σ² = σ²/n    (13.150)
By the central limit theorem, A_n is approximately normal, with the mean and variance above.

Solution to Exercise 13.26 (p. 411)
a. Chebyshev inequality:
P(|A_n − µ| ≥ 0.5) ≤ σ²/(0.5² n) = 3²/(0.5² n) ≤ 0.05  implies  n ≥ 720    (13.151)
b. Normal approximation: using the table in Example 1 (Example 13.15: Sample size) from "Simple Random Samples and Statistics",
n ≥ (3/0.5)² · 3.84 = 138.24,  so take n ≥ 139    (13.152)
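The two estimates in the solution of Exercise 13.26 are easily computed. The following is a minimal sketch; norminv here is the Statistics Toolbox inverse standard normal, an assumption about the reader's installation (the text's own norminv m-function uses a different argument order).

sigma = 3;  a = 0.5;  p = 0.95;
nCheb = ceil(sigma^2/(a^2*(1 - p)))     % Chebyshev bound 36/n <= 0.05, gives 720
x = norminv((1 + p)/2);
nNorm = ceil((sigma/a)^2*x^2)           % normal approximation, gives 139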
TRANSFORM METHODS .418 CHAPTER 13.
Chapter 14: Conditional Expectation, Regression

14.1 Conditional Expectation, Regression

Conditional expectation, given a random vector, plays a fundamental role in much of modern probability theory. Various types of conditioning characterize some of the more important random sequences and processes. The notion of conditional independence is expressed in terms of conditional expectation. Conditional independence plays an essential role in the theory of Markov processes and in much of decision theory.

We first consider an elementary form of conditional expectation with respect to an event. Then we consider two highly intuitive special cases of conditional expectation, given a random variable. In examining these, we identify a fundamental property which provides the basis for a very general extension. We discover that conditional expectation is a random quantity. The basic property for conditional expectation and properties of ordinary expectation are used to obtain four fundamental properties which imply the expectation-like character of conditional expectation. An extension of the fundamental property leads directly to the solution of the regression problem which, in turn, gives an alternate interpretation of conditional expectation.

14.1.1 Conditioning by an event

If a conditioning event C occurs, we modify the original probabilities by introducing the conditional probability measure P(·|C). In making the change from P(A) to

P(A|C) = P(AC)/P(C)    (14.1)

we effectively do two things:
• We limit the possible outcomes to event C.
• We normalize the probability mass by taking P(C) as the new unit.

The expectation E[X] is the probability-weighted average of the values taken on by X. It seems reasonable to make a corresponding modification of mathematical expectation when the occurrence of event C is known. Two possibilities for making the modification are suggested.
• We could replace the prior probability measure P(·) with the conditional probability measure P(·|C) and take the weighted average with respect to these new weights.
• We could continue to use the prior probability measure P(·) and modify the averaging process as follows:

This content is available online at <http://cnx.org/content/m23634/1.5/>.
• Consider the values X(ω) for only those ω ∈ C. This may be done by using the random variable I_C X, which has value X(ω) for ω ∈ C and zero elsewhere. The expectation E[I_C X] is the probability-weighted sum of those values taken on in C. The weighted average is obtained by dividing by P(C).

These two approaches are equivalent. For a simple random variable X = Σ_{k=1}^n t_k I_{A_k} in canonical form,

E[I_C X]/P(C) = Σ_{k=1}^n E[t_k I_C I_{A_k}]/P(C) = Σ_{k=1}^n t_k P(C A_k)/P(C) = Σ_{k=1}^n t_k P(A_k|C)    (14.2)

The final sum is expectation with respect to the conditional probability measure. Arguments using basic theorems on expectation and the approximation of general random variables by simple random variables allow an extension to a general random variable X. The notion of a conditional distribution, given C, and taking weighted averages with respect to the conditional probability is intuitive and natural in this case. However, this point of view is limited. In order to display a natural relationship with the more general concept of conditioning with respect to a random vector, we adopt the following

Definition. The conditional expectation of X, given event C with positive probability, is the quantity

E[X|C] = E[I_C X]/P(C) = E[I_C X]/E[I_C]    (14.3)

Remark. The product form E[X|C] P(C) = E[I_C X] is often useful.

Example 14.1: A numerical example
Suppose X ∼ exponential (λ) and C = {1/λ ≤ X ≤ 2/λ}. Now I_C = I_M(X) where M = [1/λ, 2/λ], and

P(C) = P(X ≥ 1/λ) − P(X > 2/λ) = e^{−1} − e^{−2}    (14.4)

E[I_C X] = ∫ I_M(t) t λe^{−λt} dt = ∫_{1/λ}^{2/λ} t λe^{−λt} dt = (1/λ)(2e^{−1} − 3e^{−2})    (14.5)

Thus

E[X|C] = (2e^{−1} − 3e^{−2}) / (λ(e^{−1} − e^{−2})) ≈ 1.418/λ    (14.6)
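The result of Example 14.1 is easy to check numerically with the built-in integrator integral. This is a minimal base-MATLAB sketch; the choice λ = 1 is an illustrative assumption (the product λE[X|C] does not depend on λ).

lambda = 1;
f = @(t) lambda*exp(-lambda*t);                      % exponential density
PC = integral(f,1/lambda,2/lambda);                  % P(C) = e^-1 - e^-2
EICX = integral(@(t) t.*f(t),1/lambda,2/lambda);     % E[I_C X]
EXC = EICX/PC                                        % approx 1.4180
check = (2*exp(-1) - 3*exp(-2))/(exp(-1) - exp(-2))  % theoretical value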
14.1.2 Conditioning by a random vector — discrete case

Suppose X = Σ_{i=1}^n t_i I_{A_i} and Y = Σ_{j=1}^m u_j I_{B_j} in canonical form. We suppose P(A_i) = P(X = t_i) > 0 and P(B_j) = P(Y = u_j) > 0, and P(X = t_i, Y = u_j) > 0 for each permissible i, j. Now

P(Y = u_j | X = t_i) = P(X = t_i, Y = u_j)/P(X = t_i)    (14.7)

We take the expectation relative to the conditional probability P(·|X = t_i) to get

E[g(Y)|X = t_i] = Σ_{j=1}^m g(u_j) P(Y = u_j|X = t_i) = e(t_i)    (14.8)

Since we have a value for each t_i in the range of X, the function e(·) is defined on the range of X. Now consider any reasonable set M on the real line and determine the expectation

E[I_M(X) g(Y)] = Σ_i Σ_j I_M(t_i) g(u_j) P(X = t_i, Y = u_j)    (14.9)
  = Σ_i I_M(t_i) Σ_j g(u_j) P(Y = u_j|X = t_i) P(X = t_i)    (14.10)
  = Σ_i I_M(t_i) e(t_i) P(X = t_i) = E[I_M(X) e(X)]    (14.11)

We have the pattern

(A)  E[I_M(X) g(Y)] = E[I_M(X) e(X)]  where  e(t_i) = E[g(Y)|X = t_i]    (14.12)

for all t_i in the range of X. We return to examine this property later. But first, consider an example to display the nature of the concept.

Example 14.2: Basic calculations and interpretation
Suppose the pair {X, Y} has the joint distribution P(X = t_i, Y = u_j):    (14.13)

X =        0      1      4      9
Y = 2    0.05   0.04   0.21   0.15
    0    0.05   0.01   0.09   0.10
   -1    0.10   0.05   0.10   0.05
PX       0.20   0.10   0.40   0.30

Table 14.1

Calculate E[Y|X = t_i] for each possible value t_i taken on by X.

E[Y|X = 0] = (−1 · 0.10 + 0 · 0.05 + 2 · 0.05)/0.20 = 0
E[Y|X = 1] = (−1 · 0.05 + 0 · 0.01 + 2 · 0.04)/0.10 = 0.30
E[Y|X = 4] = (−1 · 0.10 + 0 · 0.09 + 2 · 0.21)/0.40 = 0.80
E[Y|X = 9] = (−1 · 0.05 + 0 · 0.10 + 2 · 0.15)/0.30 = 0.83

The pattern of operation in each case can be described as follows:

• For the ith column, multiply each value u_j by P(X = t_i, Y = u_j), sum, then divide by P(X = t_i).

The following interpretation helps visualize the conditional expectation and points to an important result in the general case.

• For each t_i we use the mass distributed above it. This mass is distributed along a vertical line at the values u_j taken on by Y. The result of the computation is to determine the center of mass for the conditional distribution above t = t_i. Intuitively, this should be the best estimate, in the mean-square sense, of Y when X = t_i. We examine that possibility in the treatment of the regression problem in Section 14.1.5 (The regression problem).
Although the calculations are not difficult for a problem of this size, the basic pattern can be implemented simply with MATLAB, making the handling of much larger problems quite easy. This is particularly useful in dealing with the simple approximation to an absolutely continuous pair.

X = [0 1 4 9];                    % Data for the joint distribution
Y = [-1 0 2];
P = 0.01*[ 5 4 21 15; 5 1 9 10; 10 5 10 5];
jcalc                             % Setup for calculations
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
EYX = sum(u.*P)./sum(P);          % sum(P) = PX (operation sum yields column sums)
disp([X;EYX]')                    % u.*P = u_j P(X = t_i, Y = u_j) for all i, j
         0         0
    1.0000    0.3000
    4.0000    0.8000
    9.0000    0.8333

The calculations extend to E[g(X,Y)|X = t_i]. Instead of values of u_j we use values of g(t_i, u_j) in the calculations. Suppose Z = g(X,Y) = Y² − 2XY.

G = u.^2 - 2*t.*u;                % Z = g(X,Y) = Y^2 - 2XY
EZX = sum(G.*P)./sum(P);
disp([X;EZX]')                    % E[Z|X = x]
         0    1.5000
    1.0000    1.5000
    4.0000   -4.0500
    9.0000  -12.8333

14.1.3 Conditioning by a random vector — absolutely continuous case

Suppose the pair {X, Y} has joint density function fXY. We seek to use the concept of a conditional distribution, given X = t. The fact that P(X = t) = 0 for each t requires a modification of the approach adopted in the discrete case. Intuitively, we consider the conditional density

fY|X(u|t) = { fXY(t,u)/fX(t)  for  fX(t) > 0;  0 elsewhere }    (14.14)

The condition fX(t) > 0 effectively determines the range of X. The function fY|X(·|t) has the properties of a density for each fixed t:

fY|X(u|t) ≥ 0,   ∫ fY|X(u|t) du = ∫ fXY(t,u) du / fX(t) = fX(t)/fX(t) = 1    (14.15)
We define, as in the discrete case,

E[g(Y)|X = t] = ∫ g(u) fY|X(u|t) du = e(t)    (14.16)

The function e(·) is defined for fX(t) > 0, hence effectively on the range of X. For any reasonable set M on the real line,

E[I_M(X) g(Y)] = ∫∫ I_M(t) g(u) fXY(t,u) du dt = ∫ I_M(t) [∫ g(u) fY|X(u|t) du] fX(t) dt    (14.17)
  = ∫ I_M(t) e(t) fX(t) dt    (14.18)

Thus we have, as in the discrete case,

(A)  E[I_M(X) g(Y)] = E[I_M(X) e(X)]  where  e(t) = E[g(Y)|X = t]    (14.19)

Again, we postpone examination of this pattern until we consider a more general case.

Example 14.3: Basic calculation and interpretation
Suppose the pair {X, Y} has joint density fXY(t,u) = (6/5)(t + 2u) on the triangular region bounded by t = 0, u = 1, and u = t (see Figure 14.1). Then

fX(t) = (6/5) ∫_t^1 (t + 2u) du = (6/5)(1 + t − 2t²),  0 ≤ t ≤ 1    (14.21)

By definition, then,

fY|X(u|t) = (t + 2u)/(1 + t − 2t²)  on the triangle (zero elsewhere)    (14.22)

We thus have

E[Y|X = t] = ∫ u fY|X(u|t) du = (1/(1 + t − 2t²)) ∫_t^1 (tu + 2u²) du = (4 + 3t − 7t³)/(6(1 + t − 2t²)),  0 ≤ t < 1

Theoretically, we must rule out t = 1, since the denominator is zero for that value of t. This causes no problem in practice.
Figure 14.1: The density function for Example 14.3 (Basic calculation and interpretation).

We are able to make an interpretation quite analogous to that for the discrete case. This also points the way to practical MATLAB calculations.

• For any t in the range of X (between 0 and 1 in this case), consider a narrow vertical strip of width ∆t with the vertical line through t at its center. If the strip is narrow enough, fXY(t,u) does not vary appreciably with t for any u.
• The mass in the strip is approximately

Mass ≈ ∆t ∫ fXY(t,u) du = ∆t fX(t)    (14.23)

• The moment of the mass in the strip about the line u = 0 is approximately

Moment ≈ ∆t ∫ u fXY(t,u) du = ∆t fX(t) ∫ u fY|X(u|t) du    (14.24)

• The center of mass in the strip is

Center of mass = Moment/Mass ≈ ∆t fX(t) ∫ u fY|X(u|t) du / (∆t fX(t)) = e(t)    (14.25)

This interpretation points the way to the use of MATLAB in approximating the conditional expectation. In the MATLAB handling of joint absolutely continuous random variables, we divide the region into narrow vertical strips. Then we deal with each of these by dividing the vertical strips to form the grid structure. The center of mass of the discrete distribution over one of the t values chosen for the approximation must lie close to the actual center of mass of the probability in the strip. The success of the discrete approach in approximating the theoretical value in turn supports the validity of the interpretation. Also, this points to the general result on regression in Section 14.1.5 (The regression problem). Consider the MATLAB treatment of the example under consideration.
f = '(6/5)*(t + 2*u).*(u>=t)';    % Density as string variable
tuappr                            % Approximation setup
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  eval(f)   % Evaluation of string variable
Use array operations on X, Y, PX, PY, t, u, and P
EYx = sum(u.*P)./sum(P);          % Approximate values
eYx = (4 + 3*X - 7*X.^3)./(6*(1 + X - 2*X.^2));   % Theoretical expression
plot(X,EYx,X,eYx)                 % Plotting details (see Figure 14.2)

Figure 14.2: Theoretical and approximate conditional expectation for Example 14.3.

The agreement of the theoretical and approximate values is quite good enough for practical purposes. It also indicates that the interpretation is reasonable, since the approximation determines the center of mass of the discretized mass which approximates the center of the actual mass in each vertical strip.

14.1.4 Extension to the general case

Most examples for which we make numerical calculations will be one of the types above. Analysis of these cases is built upon the intuitive notion of conditional distributions. However, these cases and this interpretation are rather limited and do not provide the basis for the range of applications — theoretical and practical — which characterize modern probability theory. We seek a basis for extension (which includes the special cases). In each case examined above, we have the property

(A)  E[I_M(X) g(Y)] = E[I_M(X) e(X)]  where  e(t) = E[g(Y)|X = t]    (14.26)

Two properties of expectation are crucial here:
1. By the uniqueness property (E5) ("(E5)", p. 600), since (A) holds for all reasonable (Borel) sets M, the random variable e(X) is unique a.s. (i.e., except for a set of ω of probability zero).
2. By the special case of the Radon-Nikodym theorem (E19) ("(E19)", p. 600), the function e(·) always exists and is such that the random variable e(X) is unique a.s.

We make a definition based on these facts.

Definition. The conditional expectation E[g(Y)|X = t] = e(t) is the a.s. unique function defined on the range of X such that

(CE1) Defining condition.  e(X) = E[g(Y)|X] a.s.  iff  E[I_M(X) g(Y)] = E[I_M(X) e(X)]  for all Borel sets M on the codomain of X    (14.27)

Note that e(X) is a random variable and e(·) is a function, while the expectation E[g(Y)] is always a constant. The concept is abstract. At this point it has little apparent significance, and it is not clear why the term conditional expectation should be used. The justification rests in certain formal properties which are based on the defining condition (A) and other properties of expectation. In Appendix F we tabulate a number of key properties of conditional expectation. For a detailed treatment and proofs, any of a number of books on measure-theoretic probability may be consulted. We examine several of these properties.

(CE1a) If P(X ∈ M) > 0, then E[I_M(X) e(X)] = E[g(Y)|X ∈ M] P(X ∈ M). We have a tie to the simple case of conditioning with respect to an event: setting C = {X ∈ M} and using I_C = I_M(X),

(B)  E[I_M(X) g(Y)] = E[g(Y)|X ∈ M] P(X ∈ M)    (14.29)

Note that X and Y do not need to be real valued, although g(Y) is real valued. X may be vector valued, and the Borel sets are on the codomain of X.

(CE1b) Law of total probability.  E[g(Y)] = E{E[g(Y)|X]}. This is the special case obtained by setting M to include the entire range of X, so that I_M(X(ω)) = 1 for all ω. It is useful in many theoretical and applied problems. It may seem strange that we should complicate the problem of determining E[g(Y)] by first getting the conditional expectation E[g(Y)|X] and then taking the expectation of that function. Frequently, however, the data supplied in a problem make this the expedient procedure.

Example 14.4: Use of the law of total probability
Suppose the time to failure of a device is a random quantity X ∼ exponential (u), where the parameter u is the value of a parameter random variable H. Thus

fX|H(t|u) = u e^{−ut}  for  t ≥ 0    (14.30)

If the parameter random variable H ∼ uniform (a, b), determine the expected life E[X] of the device.

SOLUTION
We use the law of total probability:

E[X] = E{E[X|H]} = ∫ E[X|H = u] fH(u) du    (14.31)

Now by assumption E[X|H = u] = 1/u and fH(u) = 1/(b − a), a < u < b. Thus

E[X] = (1/(b − a)) ∫_a^b (1/u) du = ln(b/a)/(b − a)    (14.32)

For a = 1/100, b = 2/100, E[X] = 100 ln(2) ≈ 69.3    (14.33)
(CE2) Linearity. For any constants a, b,

E[ag(Y) + bh(Z)|X] = aE[g(Y)|X] + bE[h(Z)|X] a.s.    (14.34)

This is easily extended by mathematical induction to any finite linear combination.    (14.35)

These properties, along with the defining condition, provide the expectation-like character of conditional expectation. In order to get some sense of how these properties root in basic properties of expectation, we examine one of them.

VERIFICATION of (CE2)
Let e(X) = E[ag(Y) + bh(Z)|X] a.s., e1(X) = E[g(Y)|X], and e2(X) = E[h(Z)|X]. Then, for any Borel set M,

E[I_M(X) e(X)] = E{I_M(X)[ag(Y) + bh(Z)]} a.s.        by (CE1)
  = aE[I_M(X) g(Y)] + bE[I_M(X) h(Z)] a.s.            by linearity of expectation
  = aE[I_M(X) e1(X)] + bE[I_M(X) e2(X)] a.s.          by (CE1)
  = E{I_M(X)[ae1(X) + be2(X)]} a.s.                    by linearity of expectation

Since the equalities hold for any Borel set M, the uniqueness property (E5) ("(E5)", p. 600) for expectation implies

e(X) = ae1(X) + be2(X) a.s.

Similar developments hold for positivity/monotonicity and monotone convergence. These properties for expectation yield most of the other essential properties for conditional expectation, with some reservation for the fact that e(X) = E[g(Y)|X] is a random variable, unique a.s.

(CE5) Independence. {X, Y} is an independent pair
  iff E[g(Y)|X] = E[g(Y)] a.s. for all Borel functions g on the codomain of Y
  iff E[I_N(Y)|X] = E[I_N(Y)] a.s. for all Borel sets N

Since knowledge of X does not affect the likelihood that Y will take on any set of values, conditional expectation should not be affected by the value of X. The resulting constant value of the conditional expectation must be E[g(Y)] in order for the law of total probability to hold. Property (CE5) thus provides another condition for independence.

(CE6)  e(X) = E[g(Y)|X] a.s.  iff  E[h(X) g(Y)] = E[h(X) e(X)] a.s., for any Borel function h

Examination shows this to be the result of replacing I_M(X) in (CE1) with an arbitrary h(X). Property (CE6) forms the basis for the solution of the regression problem in the next section. A formal proof utilizes uniqueness (E5) and the product rule (E18) ("(E18)", p. 600) for expectation, along with linearity, positivity, and monotone convergence. To get some insight into how the various properties arise, we sketch the ideas of a proof of (CE6).
IDEAS OF A PROOF OF (CE6)

1. For h(X) = I_M(X), this is (CE1).
2. For h(X) = Σ_{i=1}^n a_i I_{M_i}(X), the result follows by linearity (CE2).
3. For h ≥ 0, g ≥ 0, there is a sequence of nonnegative, simple h_n ↗ h. Now by positivity, e(X) ≥ 0. By monotone convergence (CE4),

E[h_n(X) g(Y)] ↗ E[h(X) g(Y)]  and  E[h_n(X) e(X)] ↗ E[h(X) e(X)]    (14.36)

Since corresponding terms in the sequences are equal, the limits are equal.
4. For h = h⁺ − h⁻, g ≥ 0, the result again follows by linearity.
5. For g = g⁺ − g⁻, the result follows by linearity.

Properties (CE8) and (CE9) are peculiar to conditional expectation. They play an essential role in many theoretical developments. They are essential in the study of Markov sequences and of a class of random sequences known as submartingales. We list them here (as well as in Appendix F) for reference.

(CE8)  E[h(X) g(Y)|X] = h(X) E[g(Y)|X] a.s., for any Borel function h

This property says that any function of the conditioning random vector may be treated as a constant factor. This combined with (CE10) below provides useful aids to computation.

(CE9) Repeated conditioning. If X = h(W), then

E{E[g(Y)|X]|W} = E{E[g(Y)|W]|X} = E[g(Y)|X] a.s.    (14.37)

This somewhat formal property is highly useful in many theoretical developments. We provide an interpretation after the development of regression theory in the next section.

The next property is highly intuitive and very useful. It is easy to establish in the two elementary cases developed in previous sections. Its proof in the general case is quite sophisticated.

(CE10) Under conditions on g that are nearly always met in practice,
a. E[g(X,Y)|X = t] = E[g(t,Y)|X = t] a.s. [PX]
b. If {X, Y} is an independent pair, then E[g(X,Y)|X = t] = E[g(t,Y)] a.s. [PX]

It certainly seems reasonable to suppose that if X = t, then we should be able to replace X by t in E[g(X,Y)|X = t] to get E[g(t,Y)|X = t]. Property (CE10a) assures this. If {X, Y} is independent, then the value of X should not affect the value of Y, so that E[g(t,Y)|X = t] = E[g(t,Y)] a.s. [PX].

Example 14.5: Use of property (CE10)
Consider again the distribution for Example 14.3. The pair {X, Y} has density

fXY(t,u) = (6/5)(t + 2u)  on the triangular region bounded by  t = 0, u = 1, and u = t    (14.38)

We show in Example 14.3 (Basic calculation and interpretation) that

E[Y|X = t] = (4 + 3t − 7t³)/(6(1 + t − 2t²)),  0 ≤ t < 1    (14.39)

Let Z = 3X² + 2XY. Determine E[Z|X = t].

SOLUTION
By linearity, (CE8), and (CE10),

E[Z|X = t] = 3t² + 2tE[Y|X = t] = 3t² + (4t + 3t² − 7t⁴)/(3(1 + t − 2t²))    (14.40)

Conditional probability
In the treatment of mathematical expectation, we note that probability may be expressed as an expectation:

P(E) = E[I_E]    (14.41)

For conditional probability, given an event, we have

E[I_E|C] = E[I_E I_C]/P(C) = P(EC)/P(C) = P(E|C)    (14.42)

In this manner, we extend the concept of conditional probability.

Definition. We may define the conditional probability of event E, given X, as

P(E|X) = E[I_E|X]    (14.43)

Thus, there is no need for a separate theory of conditional probability. We may define the conditional distribution function, given X, as

FY|X(u|X) = P(Y ≤ u|X) = E[I_{(−∞,u]}(Y)|X]    (14.44)

Then, by the law of total probability (CE1b),

FY(u) = E[FY|X(u|X)] = ∫ FY|X(u|t) FX(dt)    (14.45)

If there is a conditional density fY|X such that

P(Y ∈ M|X = t) = ∫_M fY|X(r|t) dr    (14.46)

then

FY|X(u|t) = ∫_{−∞}^u fY|X(r|t) dr,  so that  fY|X(u|t) = ∂FY|X(u|t)/∂u    (14.47)

A careful, measure-theoretic treatment shows that it may not be true that FY|X(·|t) is a distribution function for all t in the range of X. However, in applications this is seldom a problem. Modeling assumptions often start with such a family of distribution functions or density functions.
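A grid approximation in the style used earlier confirms the formula (14.40) of Example 14.5. This sketch is in base MATLAB (no m-procedures required); the 200-point grid is an arbitrary choice, and the approximation degrades near t = 1, where the conditional mass in a strip is small.

n = 200;
t = 1/(2*n):1/n:1;                   % strip centers in (0,1)
[T,U] = meshgrid(t,t);               % T(i,j) = t(j), U(i,j) = t(i)
P = (6/5)*(T + 2*U).*(U >= T);       % density weights on the triangle
G = 3*T.^2 + 2*T.*U;                 % g(t,u) for Z = 3X^2 + 2XY
EZx = sum(G.*P)./sum(P);             % approximate E[Z|X = t] (column sums)
eZx = 3*t.^2 + (4*t + 3*t.^2 - 7*t.^4)./(3*(1 + t - 2*t.^2));  % theoretical
disp(max(abs(EZx(t<0.9) - eZx(t<0.9))))    % small on [0, 0.9]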
Example 14.6: The conditional distribution function
As in Example 14.4 (Use of the law of total probability), suppose X ∼ exponential (u), where the parameter u is the value of a parameter random variable H. If the parameter random variable H ∼ uniform (a, b), determine the distribution function FX.

SOLUTION
As in Example 14.4, the conditional distribution function is

FX|H(t|u) = ∫_0^t u e^{−us} ds = 1 − e^{−ut},  0 ≤ t    (14.49)

By the law of total probability,

FX(t) = ∫ FX|H(t|u) fH(u) du = 1 − (1/(b − a)) ∫_a^b e^{−ut} du = 1 − (e^{−at} − e^{−bt})/(t(b − a))    (14.50)

Differentiation with respect to t yields the expression for fX(t):

fX(t) = (1/(b − a)) [ ((1 + at)/t²) e^{−at} − ((1 + bt)/t²) e^{−bt} ],  t > 0    (14.52)

The following example uses a discrete conditional distribution and marginal distribution to obtain the joint distribution for the pair.

Example 14.7: A random number N of Bernoulli trials
A number N is chosen by a random selection from the integers from 1 through 20 (say by drawing a card from a box). A pair of dice is thrown N times. Let S be the number of matches (i.e., both ones, both twos, etc.). Determine the joint distribution for {N, S}.

SOLUTION
N ∼ uniform on the integers 1 through 20, so that P(N = i) = 1/20, 1 ≤ i ≤ 20. Since there are 36 pairs of numbers for the two dice and six possible matches, the probability of a match on any throw is 1/6. The i throws of the dice constitute a Bernoulli sequence with probability 1/6 of a success (a match), so S is conditionally binomial (i, 1/6), given N = i. For any pair (i, j), 0 ≤ j ≤ i,

P(N = i, S = j) = P(S = j|N = i) P(N = i)    (14.53)

Now E[S|N = i] = i/6, so that

E[S] = (1/6)(1/20) Σ_{i=1}^{20} i = (20 · 21)/(6 · 20 · 2) = 7/4 = 1.75    (14.54)

The following MATLAB procedure calculates the joint probabilities and arranges them as on the plane.

% file randbern.m
p  = input('Enter the probability of success  ');
N  = input('Enter VALUES of N  ');
PN = input('Enter PROBABILITIES for N  ');
n  = length(N);
m  = max(N);
S  = 0:m;
P  = zeros(n,m+1);
for i = 1:n
   P(i,1:N(i)+1) = PN(i)*ibinom(N(i),p,0:N(i));
end
P = rot90(P);
PS = sum(P);
disp('Joint distribution N, S, P, and marginal PS')

randbern            % Call for the procedure
Enter the probability of success  1/6
Enter VALUES of N  1:20
Enter PROBABILITIES for N  0.05*ones(1,20)
Joint distribution N, S, P, and marginal PS
ES = S*PS'
ES =  1.7500        % Agrees with the theoretical value

14.1.5 The regression problem

We introduce the regression problem in the treatment of linear regression. Here we are concerned with more general regression. A pair {X, Y} of real random variables has a joint distribution. A value X(ω) is observed. We desire a rule for obtaining the best estimate of the corresponding value Y(ω). If Y(ω) is the actual value and r(X(ω)) is the estimate, then Y(ω) − r(X(ω)) is the error of estimate. The best estimation rule (function) r(·) is taken to be that for which the average square of the error is a minimum. That is, we seek a function r such that

E[(Y − r(X))²] is a minimum.    (14.55)

In the treatment of linear regression, we determine the best affine function, u = at + b. The optimum function of this form defines the regression line of Y on X. We now turn to the problem of finding the best function r, which may in some cases be an affine function, but more often is not.

We have some hints of possibilities. In the treatment of expectation, we find that the best constant to approximate a random variable in the mean square sense is the mean value. In the interpretive Example 14.2 for the discrete case, we find the conditional expectation E[Y|X = t_i] is the center of mass for the conditional distribution at X = t_i. A similar result, considering thin vertical strips, is found in Example 14.3 for the absolutely continuous case. This suggests the possibility that e(t) = E[Y|X = t] might be the best estimate for Y when the value X(ω) = t is observed. We investigate this possibility. The property (CE6) proves to be key to obtaining the result.

Let e(X) = E[Y|X]. We may write (CE6) in the form

E[h(X)(Y − e(X))] = 0 for any reasonable function h    (14.56)

Consider

E[(Y − r(X))²] = E[(Y − e(X) + e(X) − r(X))²]
  = E[(Y − e(X))²] + E[(e(X) − r(X))²] + 2E[(Y − e(X))(r(X) − e(X))]    (14.57)

Now e(X) is fixed (a.s.), and for any choice of r we may take h(X) = r(X) − e(X) to assert that

E[(Y − e(X))(r(X) − e(X))] = E[(Y − e(X)) h(X)] = 0    (14.58)

Thus

E[(Y − r(X))²] = E[(Y − e(X))²] + E[(e(X) − r(X))²]    (14.59)

The first term on the right hand side is fixed; the second term is nonnegative, with a minimum at zero iff r(X) = e(X) a.s. Thus r = e is the best rule. For a given value X(ω) = t, the best mean square estimate is

u = e(t) = E[Y|X = t]    (14.60)

The graph of u = e(t) vs. t is known as the regression curve of Y on X. This is defined for argument t in the range of X, and is unique except possibly on a set N such that P(X ∈ N) = 0. Determination of the regression curve is thus determination of the conditional expectation.
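The optimality expressed in (14.59) can be seen numerically. The following minimal base-MATLAB sketch compares the mean square error of the regression curve e(t) with that of a competing rule for the discrete distribution of Example 14.2; the affine competitor u = 0.1t + 0.4 is an arbitrary illustrative choice, not from the text.

X = [0 1 4 9];
P = 0.01*[5 4 21 15; 5 1 9 10; 10 5 10 5];   % rows correspond to Y = 2, 0, -1
T = repmat(X,3,1);  U = repmat([2;0;-1],1,4);
eX = sum(U.*P)./sum(P);                      % e(t_i) = [0 0.30 0.80 0.8333]
mseCurve = sum(sum((U - repmat(eX,3,1)).^2.*P))   % E[(Y - e(X))^2]
r = 0.1*T + 0.4;                             % an arbitrary competing rule
mseLine = sum(sum((U - r).^2.*P))            % larger, as (14.59) requires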
The result extends to functions of X and Y. Suppose Z = g(X,Y). Then the pair {X, Z} has a joint distribution, and the best mean square estimate of Z given X = t is E[Z|X = t].

Example 14.8: Regression curve for an independent pair
If the pair {X, Y} is independent, then u = E[Y|X = t] = E[Y], so that the regression curve of Y on X is the horizontal line through u = E[Y]. This, of course, agrees with the regression line, since Cov[X,Y] = 0 and the regression line is u = 0 · t + E[Y].    (14.61)

Example 14.9: Estimate of a function of {X, Y}
Suppose the pair {X, Y} has joint density fXY(t,u) = 60t²u on the triangular region bounded by t = 0, u = 0, and u = 1 − t (see Figure 14.3). Suppose

Z = { X² for X ≤ 1/2;  2Y for X > 1/2 } = I_M(X) X² + I_N(X) 2Y    (14.62)

where M = [0, 1/2] and N = (1/2, 1]. Determine E[Z|X = t].

Figure 14.3: The density function for Example 14.9 (Estimate of a function of {X, Y}).

SOLUTION
By linearity and (CE8),

E[Z|X = t] = E[I_M(X) X²|X = t] + E[I_N(X) 2Y|X = t] = I_M(t) t² + I_N(t) 2E[Y|X = t]    (14.63)

Integration shows that

fX(t) = 30t²(1 − t)², 0 ≤ t ≤ 1,  and  fY|X(u|t) = 2u/(1 − t)²  on the triangle    (14.64)

so that

E[Y|X = t] = ∫ u fY|X(u|t) du = (1/(1 − t)²) ∫_0^{1−t} 2u² du = (2/3)(1 − t),  0 ≤ t < 1

Hence

E[Z|X = t] = I_M(t) t² + I_N(t) (4/3)(1 − t)    (14.65)

Note that the indicator functions separate the two expressions. The first holds on the interval M = [0, 1/2] and the second holds on the interval N = (1/2, 1]. The two expressions t² and (4/3)(1 − t) must not be added, for this would give an expression incorrect for all t in the range of X.

APPROXIMATION

tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  100
Enter number of Y approximation points  100
Enter expression for joint density  60*t.^2.*u.*(u<=1-t)
Use array operations on X, Y, PX, PY, t, u, and P
G = (t<=0.5).*t.^2 + 2*(t>0.5).*u;
EZx = sum(G.*P)./sum(P);                          % Approximation
eZx = (X<=0.5).*X.^2 + (4/3)*(X>0.5).*(1-X);      % Theoretical
plot(X,EZx,'k-',X,eZx,'k.')                       % Plotting details
% See Figure 14.4

Figure 14.4: Theoretical and approximate regression curves for Example 14.9 (Estimate of a function of {X, Y}).

The fit is quite sufficient for practical purposes, in spite of the moderate number of approximation points. The difference in expressions for the two intervals of X values is quite clear.

Example 14.10: Estimate of a function of {X, Y}
Suppose the pair {X, Y} has joint density fXY(t,u) = (6/5)(t² + u) on the unit square 0 ≤ t ≤ 1, 0 ≤ u ≤ 1. The usual integration shows

fX(t) = (3/5)(2t² + 1), 0 ≤ t ≤ 1,  and  fY|X(u|t) = 2(t² + u)/(2t² + 1)  on the square    (14.66)

Consider

Z = { 2X² for X ≤ Y;  3XY for X > Y } = I_Q(X,Y) 2X² + I_{Qc}(X,Y) 3XY,  where  Q = {(t,u) : u ≥ t}    (14.67)

Determine E[Z|X = t].

Figure 14.5: The density and regions for Example 14.10 (Estimate of a function of {X, Y}).

Note the different role of the indicator functions here than in Example 14.9. There they provide a separation of two parts of the result; here they serve to set the effective limits of integration, and the sum of the two parts is needed for each t.

SOLUTION

E[Z|X = t] = 2t² ∫ I_Q(t,u) fY|X(u|t) du + 3t ∫ I_{Qc}(t,u) u fY|X(u|t) du    (14.68)
  = (2t²/(2t² + 1)) · 2 ∫_t^1 (t² + u) du + (3t/(2t² + 1)) · 2 ∫_0^t (t²u + u²) du
  = (−t⁵ + 4t⁴ + 2t²)/(2t² + 1),  0 ≤ t ≤ 1    (14.69)

APPROXIMATION

tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  (6/5)*(t.^2 + u)
Use array operations on X, Y, PX, PY, t, u, and P
G = 2*t.^2.*(u>=t) + 3*t.*u.*(u<t);
EZx = sum(G.*P)./sum(P);                            % Approximate
eZx = (-X.^5 + 4*X.^4 + 2*X.^2)./(2*X.^2 + 1);      % Theoretical
plot(X,EZx,'k-',X,eZx,'k.')                         % Plotting details
% See Figure 14.6

Figure 14.6: Theoretical and approximate regression curves for Example 14.10 (Estimate of a function of {X, Y}).

Although the same number of approximation points is used as in Figure 14.4 (Example 14.9), the fact that the entire region is included in the grid means a larger number of effective points in this example. The theoretical and approximate are barely distinguishable on the plot.

Given our approach to conditional expectation, the fact that it solves the regression problem is a matter that requires proof using properties of conditional expectation. An alternate approach is simply to define the conditional expectation to be the solution to the regression problem, and then determine its properties.
There are some technical differences which do not affect most applications. The alternate approach assumes the second moment E[X²] is finite. Not all random variables have this property; however, those ordinarily used in applications at the level of this treatment will have a variance, hence a finite second moment. Once the defining condition (CE1) is established, properties of expectation (including the uniqueness property (E5) ("(E5)", p. 600)) show the essential equivalence of the two concepts.

We use the interpretation of e(X) = E[g(Y)|X] as the best mean square estimator of g(Y), given X, to interpret the formal property (CE9). We examine the special form

(CE9a)  E{E[g(Y)|X]|X, Z} = E{E[g(Y)|X, Z]|X} = E[g(Y)|X]    (14.70)

Put e1(X, Z) = E[g(Y)|X, Z], the best mean square estimator of g(Y), given (X, Z). Then (CE9b) can be expressed

E[e(X)|X, Z] = e(X) a.s.  and  E[e1(X, Z)|X] = e(X) a.s.

In words: if we take the best estimate of g(Y), given X, and then take the best mean square estimate of that, given (X, Z), we do not change the estimate of g(Y), given X. On the other hand, if we first get the best mean square estimate of g(Y), given (X, Z), and then take the best mean square estimate of that, given X, we get the best mean square estimate of g(Y), given X.

14.2 Problems on Conditional Expectation, Regression

For the distributions in Exercises 1-3 below:
a. Determine the regression curve of Y on X and compare with the regression line of Y on X.
b. For the function Z = g(X, Y) indicated in each case, determine the regression curve of Z on X.

Exercise 14.1 (Solution on p. 444.)
(See Exercise 17 (Exercise 11.17) from "Problems on Mathematical Expectation".) The pair {X, Y} has the joint distribution P(X = t, Y = u) in file npr08_07.m (Section 17.8.38: npr08_07). The regression line of Y on X is u = 0.5275t + 0.6924, and

Z = X²Y + X + Y    (14.72)

The values P(X = t, Y = u) are tabulated below:
0060 0. Q = {(t.0108 0.0080 0.0026 15 0.0017 5 0.8) /X (14.061 0.163 0.0156 0.0040 0. in hours.001 0.039 0.070 0.0060 0.75) t = u = 5 4 3 2 1 1 0. Y = u) (14.8.0091 0.) The pair (See Exercise 18 (Exercise 11.0180 0.0126 0.0120 0.m (Section 17.015 0.0012 0.0167 11 0.0252 0. Z = (Y − 2. of training.m (Section 17.025 3 0.0217 19 0.0038 0.0081 0.0049 0. 445.0090 13 0.0256 0.0060 0.0096 0.0050 0. 444.0040 0. Y ) XY 2 Exercise 14.0156 0.0120 0.7793t + 4. Y = u) (14.74) √ Z = IQ (X.0191 0.050 0.5 0.2584t + 5.4 The regression line of Y on X is u = −0.031 0.033 0.) (See Exercise 19 (Exercise 11.0196 0.2 (Solution on p.0072 3 0.0234 0.0196 0. Y } has the joint distribution (in le npr08_08.0070 0.0234 0. The data are as follows (in le npr08_09. CONDITIONAL EXPECTATION.0256 0.0104 0.8.19) from "Problems on Mathematical Expectation").049 0.051 0.3 (Solution on p.0204 0.0134 0.012 0.0223 Table 14.137 0. Y ) X (Y − 4) + IQc (X.0072 0.0084 0.0056 0. in minutes.76) For the joint densities in Exercises 411 below .0096 0.0108 0.0280 0.0054 0.0180 0.001 0.0064 0.003 1.0112 0.0045 9 0. and Y X is the amount is the time to perform the task.0044 0.0168 0.0160 0.438 CHAPTER 14.18) from "Problems on Mathematical Expectation").0244 0.0126 0.010 0.0060 0.0035 0.0056 0.39: npr08_08)): P (X = t.011 0.045 2.0098 0.017 Table 14.0084 0.0140 0.0140 0.0056 0.009 2 0.6110.73) t = u = 12 10 9 5 3 1 3 5 1 0.0182 0.005 0. {X.0268 0.3051.5 0.3 The regression line of Y on X is u = −0.0182 0.039 0.0256 0.0064 0.0070 0.0112 0. Data were kept on the eect of training time on the time to perform a job on a production line.058 0.0063 7 0.0056 0. u) : u ≤ t} (14.065 0.0162 0.0182 0.0200 0.0096 0.0260 0.40: npr08_09)): P (X = t.0172 17 0.0091 0. REGRESSION Exercise 14.0160 0.
24 11 tu for . Exercise 20 (Exercise 11. Covariance. 0 ≤ u ≤ min{1. Exercise 26 (Exercise 11. 0 ≤ u ≤ 2 (1 − t).0958t + 1.) and Exercise 25 (See Exercise 15 (Exercise 8. fXY (t. Covariance.80) Y on X is u = (4t + 5) /9.26) from "Problems on Mathematical Expectation". The regression line of Y on X is u = 1 − t. Exercise 14. the parallelogram with vertices fXY (t.6 (Solution on p.79) (Solution on p. 1) The regression line of (14.27) from "Problems on Variance.78) fX (t) = Exercise 14. Covariance.7 3 3 (1 + t) 1 + 4t + t2 = 1 + 5t + 5t2 + t3 . and Exercise 26 (Exercise 12. 446. The regression line of 1 8 (t + u) Y on X is u = −t/11 + 35/33. Linear Regression").10) from "Problems On Random Vectors and Joint Distributions". for fX (t) = Exercise 14.439 a. Check these with a discrete approximation. 1) . 2 − t}.5 (Solution on p.25) from "Problems on Mathematical Expectation". fXY (t. 0 ≤ u ≤ 2. (Exercise 12. 0 ≤ u ≤ 1 + t. 0 ≤ t ≤ 1 Exercise 14. 0 ≤ t ≤ 2 4 (14.15) from "Problems On Random Vectors and Joint Distributions". 445. Exercise 27 (Exercise 11.0] (t) 6t2 (t + 1) + I(0. 446.8 (Solution on p. 0 ≤ t ≤ 2 88 88 (14. Covariance. Linear Regression").16) from " Problems On Random Vectors and Joint Distributions". u) = 12t2 u on (−1. and Exercise 24 (Exercise 12.23) from "Problems on Mathematical Expectation".4 (Solution on p.17) from " Problems On Random Vectors and Joint Distributions". for fXY (t. (0.4876.27) from "Problems on Mathematical Expectation". curve of Y on X and compare with the regression line of Y on b.1] (t) 6t2 1 − t2 Exercise 14.77) fX (t) = 2 (1 − t) .23) from "Problems on Variance.) (See Exercise 10 (Exercise 8. 445. and Exercise 23 (Exercise 12. (1. 446.24) from "Problems on Variance. 0) . u) = 3 88 2t + 3u2 The 0 ≤ t ≤ 2.20) from "Problems on Mathematical Expectation". Linear Regression"). Linear Regression"). (14. and Exercise 27 (Exercise 12.25) from "Problems on Variance. (0. 1 (t + 1) .81) fX (t) = I[−1.) (See Exercise 17 (Exercise 8. Exercise 23 (Exercise 11.) (See Exercise 16 (Exercise 8. u) = 0 ≤ t ≤ 2. 0) . fXY (t. Covariance.) (See Exercise 13 (Exercise 8. u) = 1 for 0 ≤ t ≤ 1. Exercise 25 (Exercise 11.26) from "Problems on Variance. Linear Regression"). regression line of Y on X is u = 0. Determine analytically the regression X. u) = 0 ≤ t ≤ 2.13) from " Problems On Random Vectors and Joint Distributions". 2 (14.
21) from " Problems On Random Vectors and Joint Distributions". 447.1] (X) 4X + I(1.440 CHAPTER 14.86) (14.31) from "Problems on Mathematical Expectation". fX (t) = 3 3 (1 + t) 1 + 4t + t2 = 1 + 5t + 5t2 + t3 . Exercise 31 (Exercise 11. for fXY (t.87) . (14. u) = 9 I[0. CONDITIONAL EXPECTATION.84) Exercise 14. and Exercise 29 (Exercise 12.1] (t) 3 2 3 t + 1 + I(1. Linear Regression"). u) = 0 ≤ t ≤ 2. fX (t) = I[0.9 (Solution on p. The regression line of Y on X is u = −0. 0 ≤ u ≤ min{2t.1] (t) 6 6 (2 − t) + I(1. 0 ≤ u ≤ max{2 − t.2] (t) 14 t2 u2 .2] (t) (3 − t) 13 13 2 13 (t + 2u).22) from " Problems On Random Vectors and Joint Distributions".1] (t) 12 2 6 t + I(1.37) from "Problems on Mathematical Expectation".83) Exercise 14.) (See Exercise 18 (Exercise 8. Determine analytically E [ZX = t] (Solution on p. fX (t) = I[0.28) from "Problems on Mathematical Expectation". 447. Use a discrete approximation to calculate the same functions.29) from "Problems on Variance. Exercise 28 (Exercise 11.2603. 3 − t}.) (See Exercise 21 (Exercise 8.12 fXY (t.2] (t) t2 8 14 (14. for fXY (t. and Exercise 14.0561t − 0. Covariance. 8 for 0 ≤ u ≤ 1. t}. (Exercise 12.30) from "Problems on Variance. and Exercise 28 (Exercise 12.82) fX (t) = I[0.11 (Solution on p. fX (t) = I[0. REGRESSION The regression line of Y on X is u = (−124t + 368) /431 12 12 2 t + I(1. u) = 0 ≤ t ≤ 2.2] (t) t2 23 23 3 23 (t + 2u) (14. 0 ≤ u ≤ 1 + t (see Exercise 37 (Exercise 11. 0 ≤ t ≤ 2 88 88 Z = I[0.) b.18) from " Problems On Random Vectors and Joint Distributions".28) from "Problems on Variance.1359t + 1. 448. Linear Regression").10 (Solution on p.1] (t) 3 t2 + 2u + I(1. The regression line of Y on X is u = 0.) and Exercise 30 (See Exercise 22 (Exercise 8. fXY (t.1] (t) Exercise 14. 447.0839.5989.32) from "Problems on Mathematical Expectation". Linear Regression").0817t + 0.2] (X) (X + Y ) (14. Covariance. Covariance. Exercise 14.2] (t) t(2 − t) 11 11 (14.6).85) For the distributions in Exercises 1216 below a. Exercise 32 (Exercise 11. u) = 3 88 2t + 3u2 for 0 ≤ t ≤ 2. The regression line of Y on X is u = 1.
n = 50 (see Example 7 (Example 14.7: A b. Determine the joint distribution for random number N of Bernoulli trials) from "Conditional Expectation. 0 ≤ u ≤ min{2t. fX (t) = I[0.) fXY (t.10). compare with the theoretical value. Exercise 14. u) = I[0. and Exercise 14. fX (t) = I[0. u) : max (t.94) Z = IM (X. Exercise 14. M = {(t. Y } for (Solution on p. u) : u ≤ min (1. Y ) 2Y. 449. Y ) X + IM c (X. Regression" for a possible approach).15 (14.1] (t) 12 2 6 t + I(1.1] (t) 12 12 2 t + I(1.) and fXY (t.13 (Solution on p.32) from "Problems on Mathematical Expectation". (see Exercise 31 (Exercise 11. Determine the joint distribution for random number N of Bernoulli trials) from "Conditional Expectation.39). 450. {X. u) : t ≤ 1. Use jcalc to determine E [Y ].7: A b. Y ) (X + Y ) + IM c (X. 8 for (14.2] (t) 14 t2 u2 . u) ≤ 1} Exercise 14.) a.18 Suppose X∼ uniform on 1 through n and Y ∼ conditionally uniform on 1 through i.) for fXY (t. t} (see Exercise 39 (Exercise 11.91) (Solution on p.11). Determine E [Y ] from E [Y X = i].93) (Solution on p. 2 − t} (see Exercise 38 (Exercise 11. 6 6 (2 − t) + I(1. given X = i.89) (Solution on p. 0 ≤ u ≤ max{2 − t. {X. n = 50 (see Example 7 (Example 14.) a. 448. (see Exercise 32 (Exercise 11.14 (14.90) Z = IM (X. Y } for (Solution on p.) 0 ≤ u ≤ 1.31) from "Problems on Mathematical Expectation". 0 ≤ u ≤ min{1.88) 1 Z = IM (X. 449. given X = i. Determine E [Y ] from E [Y X = i]. Exercise 14. for 0 ≤ t ≤ 2.17 Suppose (14.2] (t) t(2 − t) 11 11 (14. and Exercise 14.2] (t) t2 8 14 M = {(t. u) : u > t} 2 Exercise 14.9).2] (t) (3 − t) 13 13 M = {(t. 450. Y ) 2Y 2 . Y ) Y 2 .1] (t) 3 2 3 t + 1 + I(1. u) = 24 11 tu 0 ≤ t ≤ 2.441 Exercise 14. compare with the theoretical value.16 9 fXY (t. u ≥ 1} (14. u) = 2 13 (t + 2u). M = {(t. . u) = 3 23 (t + 2u) for 0 ≤ t ≤ 2.2] (t) t2 23 23 Exercise 14. 3 − t}.92) Z = IM (X. fX (t) = I[0.1] (t) 3 t2 + 2u + I(1. 2 − t)} (14. Use jcalc to determine E [Y ]. Y ) X + IM c (X.38) from "Problems on Mathematical Expectaton". Y ) XY. fX (t) = I[0. 448.8). Y ) (X + Y ) + IM c (X.1] (t) (14. Exercise 14. Regression" for a possible approach).95) X∼ uniform on 0 through n and Y ∼ conditionally uniform on 0 through i.
Y } E [Y ]. Determine the joint distribution for p = 0.) 100.20 A number times. [PXZ ] Exercise 14. 2 3 (2 − t) and X has density function fX (t) = 15 2 16 t (2 − t) 2 (Solution on p. Y } for (Solution on p. show that n−k Show that P (X = kX + Y = n) = C (n. a number from 1 to 10.e. for Let X is selected randomly from the integers 1 through Y be the number of sevens thrown on the X tosses.) a.21 A number X is selected randomly from the integers 1 through 100.25 (Solution on p. Z = v] = E [g (t. A pair of dice is thrown X Determine the joint distribution {X. Service time Y for a unit received in that β (2 − u).) has density function fX (t) = 4 − 2t for 1 ≤ t ≤ 2. 452. Exercise 14.s. Determine E [Y ]. REGRESSION Exercise 14.98) (Solution on p.) has density function fX (t) = 30t2 (1 − t) E [Y X = t] = E [Y ].23 and X (Solution on p. 452.3. Use jcalc to determine E [Y ]. Y. for p = µ/ (µ + λ) (14. both draw twos.. 0 ≤ k ≤ n.96) (Solution on p. Determine E [Y ] from E [Y X = k].24 and X (Solution on p. and then determine (Solution on p. Z).) X Suppose the pair {X. etc. Y ) X = t. Suppose the arrival time of a job in hours before closing time is a random variable Y. 2 for E [Y X = t] = 2 (1 − t) for 0 ≤ t < 1 3 0 ≤ t ≤ 1. given T = u. independently and randomly. 451. [PZ ] (14. 452. times. k) pk (1 − p) Exercise 14. Let Y X be the number of matches (i. {X. 452. Determine Exercise 14. 451. Z = v] FXZ (dtv) a. Y ) X = t. that (14. That is.27 Use the result of Exercise 14.442 CHAPTER 14. p.97) (Solution on p.26 Use the fact that (CE10) to show .19 Suppose X∼ uniform on 1 through n and Y ∼ conditionally binomial (i. (Solution on p. Determine E [Y ]. Z = v] a. Determine the joint distribution and then determine E [Y ]. period is conditionally exponential T ∼ uniform [0.) g (X. (n. u.) Each of two people draw Exercise 14. with is conditionally binomial X ∼ Poisson (µ) and Y ∼ Poisson (λ). Exercise 14. compare with the theoretical value. v) does not vary with v. where g ∗ (t.) A shop which works past closing time to complete jobs on hand tends to speed up service on any job received during the last hour before closing.28 E [g (t. Extend property E [g (X. Determine the distribution function for .s.) 0 ≤ t < 2. 452. Y ) Z = v] = Exercise 14.). given X = i.22 E [Y X = t] = 10t Exercise 14. CONDITIONAL EXPECTATION. 451. given X + Y = n. Y } is independent. 452. Y ) = g ∗ (X. 452. p).) 601) and (CE10) to show E [g (X. both draw ones.26 and properties (CE9a) ("(CE9a)". Exercise 14. 1]. µ/ (µ + λ)). n = 50 and b. Y ) X = t.
Time to failure of the i th component is Xi (Solution on p. Exercise 14. and X is conditionally exponential (u). 453. Xn ) ∈ Q}. Xn ) Xi ]} (14. Let W is iid exponential (λ). Q = {(t1 .29 Time to failure X (Solution on p. Suppose the parameter is the value of random variable H ∼ uniform on[0.30 A system has given H = u.100) . Determine E [XH = u] and use this to determine E [X]. The system fails if any one or more of the components What is the probability the failure is due to the be the time to system failure. i th component? Suggestion. 0. Thus {W = Xi } = {(X1 . ∀ k = i} (14. The parameter is dependent upon the manufacturing process.99) P (W = Xi ) = E [IQ (X1 . Xn )] = E{E [IQ (X1 . · · · tn ) : tk > ti .01]. X2 . Note that W = Xi i Xj > Xi for all j = i. X2 . Determine n components. 453.005. · · · . P (X > 150).) and the class {Xi : 1 ≤ i ≤ n} fails.443 Exercise 14. X2 .) of a manufactured unit has an exponential distribution. · · · . · · · . t2 .
5275t + 0. EZx = sum(G.39: npr08_08) Data are in X.3050 3.1960 3.0000 1.9100 17..5345 1.1757 Solution to Exercise 14..^2.EYx = sum(u.1957 15.7000 59.5000 0...0383 0.. disp([X.2584t + 5.*u + abs(t+u)./sum(P)..8130 4.... disp([X.7269 5.0000 0.9100 13.5254 19.3100 9.0000 58. disp([X. REGRESSION Solutions to Exercises in Chapter 14 Solution to Exercise 14.4000 2. CONDITIONAL EXPECTATION. G = (u4). Y..8.9869 5.5700 G = t.^2.2000 6.0000 3.7000 3.*sqrt(t).8.7130 4. 437) The regression line of Y on X is u = −0. Y.*P).0000 175.5530 3.0000 5.*u. P jcalc .38: npr08_07) Data are in X../sum(P)./sum(P).0000 2. npr08_07 (Section~17.0000 3.0000 2..EYx]') 3.4000 17..6110..6860 1.5350 3.0000 0.6924.0290 0.0000 5.EYx = sum(u. P jcalc ./sum(P). npr08_08 (Section~17.9100 M = u<=t.*(1M).EZx]') 1.*M + t.3270 2. disp([X...*P).9322 .0254 11.EYx]') 1. EZx = sum(G.1000 4.2000 1.0139 2.0000 2.9000 69.0000 166..*P).444 CHAPTER 14.1 (p.1000 0.*P).2 (p.EZx]') 3.9000 2...5000 3.6500 7. 437) The regression line of Y on X is u = 0.
P jcalc .1412 2. (1... 438) The regression line of Y on X is u = −0.4 (p.3 (p...4690 Solution to Exercise 14.103) fY X (ut) = E [Y X = t] = 1 2 (t + 1) tu + u2 du = 1 + 0 1 0≤t≤2 3t + 3 (14.8)... EZx = sum(G.102) tuappr: [0 1] [0 2] 200 400 u<=2*(1t) .2..5000 0. 0 ≤ u ≤ 2 2 (t + 1) 2 (14./sum(P).7896 119.0000 3.0000 9.1367 Solution to Exercise 14.1627 3.EYx = sum(u..0000 1..4076 2..40: npr08_09) Data are in X.0000 0.0000 185./sum(P). 439) The regression line of Y on X is u = 1 − t.2167 2.5175 2.445 7. 0 ≤ u ≤ 2 (1 − t) 2 (1 − t) 1 2 (1 − t) 2(1−t) (14.*P).0000 13.7793t + 4. Y..5000 0.2031 13.1250 2./t..8999 11./sum(P).0) Solution to Exercise 14.8333 1..*P).7531 105...5000 2. 439) The regression line of Y on X is u = −t/11 + 35/33.0000 11.0000 2.0000 15.EZx]') 1.104) .0000 2. disp([X.3933 3..9675 10.101) fY X (ut) = E [Y X = t] = udu = 1 − t.EYx) % Straight line thru (0. disp([X. plot(X.1). 1 .8.5000 3.5 (p. 0 ≤ t ≤ 1 0 (14. 0 ≤ t ≤ 1.0000 19.3051..EYx = sum(u. npr08_09 (Section~17.0000 17. (t + u) 0 ≤ t ≤ 2.EYx]') 1.3900 G = (u .0333 1..*P)..0000 0...
446
CHAPTER 14. CONDITIONAL EXPECTATION, REGRESSION
tuappr: [0 2] [0 2] 200 200 (1/8)*(t+u) EYx = sum(u.*P)./sum(P); eyx = 1 + 1./(3*X+3); plot(X,EYx,X,eyx) % Plots nearly indistinguishable
Solution to Exercise 14.6 (p. 439)
The regression line of
Y
on
X
is
u = 0.0958t + 1.4876. 2t + 3u2 0≤u≤1+t (1 + t) (1 + 4t + t2 )
1+t
(14.105)
fY X (ut) = E [Y X = t] = =
1 (1 + t) (1 + 4t + t2 )
2tu + 3u3 du
0
(14.106)
(t + 1) (t + 3) (3t + 1) , 0≤t≤2 4 (1 + 4t + t2 )
(14.107)
tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t + 3*u.^2).*(u<=1+t) EYx = sum(u.*P)./sum(P); eyx = (X+1).*(X+3).*(3*X+1)./(4*(1 + 4*X + X.^2)); plot(X,EYx,X,eyx) % Plots nearly indistinguishable
Solution to Exercise 14.7 (p. 439)
The regression line of
Y
on
X
is
u = (23t + 4) /18. 2u (t + 1) 1 (t + 1)
2 0 2
fY X (ut) = I[−1,0] (t)
+ I(0,1] (t)
t+1
2u (1 − t2 )
on the parallelogram
(14.108)
E [Y X = t] = I[−1,0] (t)
2u du + I(0,1] (t)
1 (1 − t2 )
1
2u du
t
(14.109)
= I[−1,0] (t)
2 2 t2 + t + 1 (t + 1) + I(0,1] (t) 3 3 t+1
(14.110)
tuappr: [1 1] [0 1] 200 100 12*t.^2.*u.*((u<= min(t+1,1))&(u>=max(0,t))) EYx = sum(u.*P)./sum(P); M = X<=0; eyx = (2/3)*(X+1).*M + (2/3)*(1M).*(X.^2 + X + 1)./(X + 1); plot(X,EYx,X,eyx) % Plots quite close
Solution to Exercise 14.8 (p. 439)
The regression line of
Y
on
X
is
u = (−124t + 368) /431. 2u (2 − t) 1 (2 − t)
2 0
(14.113)
fY X (ut) = I[0,1] (t) 2u + I(1,2] (t)
1
2 2−t
(14.111)
E [Y X = t] = I[0,1] (t)
0
2u2 du + I(1,2] (t)
2u2 du
(14.112)
= I[0,1] (t)
2 2 + I(1,2] (t) (2 − t) 3 3
447
tuappr: [0 2] [0 1] 200 100 (24/11)*t.*u.*(u<=min(1,2t)) EYx = sum(u.*P)./sum(P); M = X <= 1; eyx = (2/3)*M + (2/3).*(2  X).*(1M); plot(X,EYx,X,eyx) % Plots quite close
Solution to Exercise 14.9 (p. 440)
The regression line of
Y
on
X
is
u = 1.0561t − 0.2603. t + 2u t + 2u 0 ≤ u ≤ max (2 − t, t) + I(1,2] (t) 2 (2 − t) 2t2
2−t
(14.114)
fY X (ut) = I[0,1] (t) E [Y X = t] = I[0,1] (t) 1 2 (2 − t)
tu + 2u2 du + I(1,2] (t)
0
1 2t2
t
tu + 2u2 du
0
(14.115)
= I[0,1] (t)
1 7 (t − 2) (t − 8) + I(1,2] (t) t 12 12
(14.116)
tuappr: [0 2] [0 2] 200 200 (3/23)*(t+2*u).*(u<=max(2t,t)) EYx = sum(u.*P)./sum(P); M = X<=1; eyx = (1/12)*(X2).*(X8).*M + (7/12)*X.*(1M); plot(X,EYx,X,eyx) % Plots quite close
Solution to Exercise 14.10 (p. 440)
The regression line of
Y
on
X
is
u = −0.1359t + 1.0839. t + 2u t + 2u + I(1,2] (t) 0 ≤ u ≤ max (2t, 3 − t) 2 6t 3 (3 − t) tu + 2u2 du + I(1,2] (t)
0
(14.117)
fY X (tu) = I[0,1] (t) E [Y X = t] = I[0,1] (t) 1 6t2
t
1 3 (3 − t)
3−t
tu + 2u2 du
0
(14.118)
= I[0,1] (t)
11 1 2 t + I(1,2] (t) t − 15t + 36 9 18
(14.119)
tuappr: [0 2] [0 2] 200 200 (2/13)*(t+2*u).*(u<=min(2*t,3t)) EYx = sum(u.*P)./sum(P); M = X<=1; eyx = (11/9)*X.*M + (1/18)*(X.^2  15*X + 36).*(1M); plot(X,EYx,X,eyx) % Plots quite close
Solution to Exercise 14.11 (p. 440)
The regression line of
Y
on
X
is
u = 0.0817t + 0.5989. t2 + 2u + I(1,2] (t) 3u2 0 ≤ u ≤ 1 t2 + 1
1 1
(14.120)
fY X (tu) = I[0,1] (t) E [Y X = t] = I[0,1] (t)
1 t2 + 1
t2 u + 2u2 du + I(1,2] (t)
0 0
3u3 du
(14.121)
= I[0,1] (t)
3t2 + 4 3 + I(1,2] (t) 2 + 1) 6 (t 4
(14.122)
448
CHAPTER 14. CONDITIONAL EXPECTATION, REGRESSION
tuappr: [0 2] [0 1] 200 100 (3/8)*(t.^2 + 2*u).*(t<=1) + ... (9/14)*t.^2.*u.^2.*(t>1) EYx = sum(u.*P)./sum(P); M = X<=1; eyx = M.*(3*X.^2 + 4)./(6*(X.^2 + 1)) + (3/4)*(1  M); plot(X,EYx,X,eyx) % Plots quite close
Solution to Exercise 14.12 (p. 440)
Z = IM (X) 4X + IN (X) (X + Y ),
601) gives
Use of linearity, (CE8) ("(CE8)", p.
601), and (CE10) ("(CE10)", p.
E [ZX = t] = IM (t) 4t + IN (t) (t + E [Y X = t]) = IM (t) 4t + IN (t) t + (t + 1) (t + 3) (3t + 1) 4 (1 + 4t + t2 )
(14.123)
(14.124)
% Continuation of Exercise~14.6 G = 4*t.*(t<=1) + (t + u).*(t>1); EZx = sum(G.*P)./sum(P); M = X<=1; ezx = 4*X.*M + (X + (X+1).*(X+3).*(3*X+1)./(4*(1 + 4*X + X.^2))).*(1M); plot(X,EZx,X,ezx) % Plots nearly indistinguishable
Solution to Exercise 14.13 (p. 440)
1 Z = IM (X, Y ) X + IM c (X, Y ) Y 2 , M = {(t, u) : u > t} 2 IM (t, u) = I[0,1] (t) I[t,1] (u) IM c (t, u) = I[0,1] (t) I[0,t] (u) + I(1,2] (t) I[0,2−t] (u) E [ZX = t] = I[0,1] (t) t 2
1 t 2−t
(14.125)
(14.126)
2u du +
t 0
u2 · 2u du + I(1,2] (t)
0
u2 ·
2u (2 − t)
2
du
(14.127)
1 1 2 = I[0,1] (t) t 1 − t2 + t3 + I(1,2] (t) (2 − t) 2 2
(14.128)
% Continuation of Exercise~14.6 Q = u>t; G = (1/2)*t.*Q + u.^2.*(1Q); EZx = sum(G.*P)./sum(P); M = X <= 1; ezx = (1/2)*X.*(1X.^2+X.^3).*M + (1/2)*(2X).^2.*(1M); plot(X,EZx,X,ezx) % Plots nearly indistinguishable
Solution to Exercise 14.14 (p. 441)
Z = IM (X, Y ) (X + Y ) + IM c (X, Y ) 2Y, M = {(t, u) : max (t, u) ≤ 1} IM (t, u) = I[0,1] (t) I[0,1] (u) IM c (t, u) = I[0,1] (t) I[1,2−t] (u) + I(1,2] (t) I[0,t] (u)
(14.129)
(14.130)
449
1 E [ZX = t] = I[0,1] (t) 2(2−t) I(1,2] (t) 2E [Y X = t]
1 0
(t + u) (t + 2u) du +
1 (2−t)
2−t 1
u (t + 2u) du +
(14.131)
= I[0,1] (t)
1 2t3 − 30t2 + 69t − 60 7 · + I(1,2] (t) 2t 12 t−2 6
(14.132)
% Continuation of Exercise~14.9 M = X <= 1; Q = (t<=1)&(u<=1); G = (t+u).*Q + 2*u.*(1Q); EZx = sum(G.*P)./sum(P); ezx = (1/12)*M.*(2*X.^3  30*X.^2 + 69*X 60)./(X2) + (7/6)*X.*(1M); plot(X,EZx,X,ezx)
Solution to Exercise 14.15 (p. 441)
Z = IM (X, Y ) (X + Y ) + IM c (X, Y ) 2Y 2 ,
M = {(t, u) : t ≤ 1, u ≥ 1}
(14.133)
IM (t, u) = I[0,1] (t) I[1,2] (u) IM c (t, u) = I[0,1] (t) I[0,1) (u) + I(1,2] (t) I[0,3−t] (u) E [ZX = t] = I[0,1/2] (t) I(1/2,1] (t) 1 6t2
1
(14.134)
1 6t2
2t
2u2 (t + 2u) du+
0
(14.135)
2u2 (t + 2u) du +
0
1 6t2
2t
(t + u) (t + 2u) du + I(1,2] (t)
1
1 3 (3 − t)
3−t
2u2 (t + 2u) du
0
(14.136)
= I[0,1/2] (t)
32 2 1 80t3 − 6t2 − 5t + 2 1 t + I(1/2,1] (t) · + I(1,2] (t) −t3 + 15t2 − 63t + 81 9 36 t2 9
(14.137)
tuappr: [0 2] [0 2] 200 200 (2/13)*(t + 2*u).*(u<=min(2*t,3t)) M = (t<=1)&(u>=1); Q = (t+u).*M + 2*(1M).*u.^2; EZx = sum(Q.*P)./sum(P); N1 = X <= 1/2; N2 = (X > 1/2)&(X<=1); N3 = X > 1; ezx = (32/9)*N1.*X.^2 + (1/36)*N2.*(80*X.^3  6*X.^2  5*X + 2)./X.^2 ... + (1/9)*N3.*(X.^3 + 15*X.^2  63.*X + 81); plot(X,EZx,X,ezx)
Solution to Exercise 14.16 (p. 441)
Z = IM (X, Y ) X + IM c (X, Y ) XY,
1 3
M = {(t, u) : u ≤ min (1, 2 − t)}
2−t 1
(14.138)
E [X = t] = I[0,1] (t)
0
t + 2tu du + I(1,2] (t) t2 + 1
3tu2 du +
0 2−t
3tu3 du
(14.139)
450
CHAPTER 14. CONDITIONAL EXPECTATION, REGRESSION
13 3 t + 12t2 − 12t3 + 5t4 − t5 4 4
= I[0,1] (t) t + I(1,2] (t) −
(14.140)
tuappr: [0 2] [0 1] 200 100 (t<=1).*(t.^2 + 2*u)./(t.^2 + 1) +3*u.^2.*(t>1) M = u<=min(1,2t); G = M.*t + (1M).*t.*u; EZx = sum(G.*P)./sum(P); N = X<=1; ezx = X.*N + (1N).*((13/4)*X + 12*X.^2  12*X.^3 + 5*X.^4  (3/4)*X.^5); plot(X,EZx,X,ezx)
Solution to Exercise 14.17 (p. 441)
a.
E [Y X = i] = i/2,
so
1 E [Y ] = E [Y X = i] P (X = i) = n+1 i=0
b.
n
n
i/2 = n/4
i=1
(14.141)
P (X = i) = 1/ (n + 1) , 0 ≤ i ≤ n, P (Y = kX = i) = 1/ (i + 1) , 0 ≤ k ≤ i; P (X = i, Y = k) = 1/ (n + 1) (i + 1) 0 ≤ i ≤ n, 0 ≤ k ≤ i.
hence
n = 50; X = 0:n; Y = 0:n; P0 = zeros(n+1,n+1); for i = 0:n P0(i+1,1:i+1) = (1/((n+1)*(i+1)))*ones(1,i+1); end P = rot90(P0); jcalc: X Y P           EY = dot(Y,PY) EY = 12.5000 % Comparison with part (a): 50/4 = 12.5
Solution to Exercise 14.18 (p. 441)
a.
E [Y X = i] = (i + 1) /2,
so
n
E [Y ] =
i=1
b.
E [Y X = i] P (X = i) =
1 n
n
i=1
i+1 n+3 = 2 4
hence
(14.142)
P (X = i) = 1/n, 1 ≤ i ≤ n, P (Y = kX = i) = 1/i, 1 ≤ k ≤ i; i ≤ n, 1 ≤ k ≤ i.
P (X = i, Y = k) = 1/ni 1 ≤
n = 50; X = 1:n; Y = 1:n; P0 = zeros(n,n); for i = 1:n P0(i,1:i) = (1/(n*i))*ones(1,i); end P = rot90(P0); jcalc: P X Y            EY = dot(Y,PY) EY = 13.2500 % Comparison with part (a):
53/4 = 13.25
451
Solution to Exercise 14.19 (p. 442)
a.
E [Y X = i] = ip,
so
n
E [Y ] =
i=1
b.
E [Y X = i] P (X = i) =
p n
n
i=
i=1
p (n + 1) 2
(14.143)
P (X = i) = 1/n, 1 ≤ i ≤ n, P (Y = kX = i) = ibinom (i, p, 0 : i) , 0 ≤ k ≤ i.
n = 50; p = 0.3; X = 1:n; Y = 0:n; P0 = zeros(n,n+1); % Could use randbern for i = 1:n P0(i,1:i+1) = (1/n)*ibinom(i,p,0:i); end P = rot90(P0); jcalc: X Y P           EY = dot(Y,PY) EY = 7.6500 % Comparison with part (a):
Solution to Exercise 14.20 (p. 442)
a.
0.3*51/2 = 0.765
P (X = i) = 1/n, E [Y X = i] = i/6,
so
E [Y ] =
1 6
n
i/n =
i=0
(n + 1) 12
(14.144)
b.
n = 100; p = 1/6; X = 1:n; Y = 0:n; PX = (1/n)*ones(1,n); P0 = zeros(n,n+1); % Could use randbern for i = 1:n P0(i,1:i+1) = (1/n)*ibinom(i,p,0:i); end P = rot90(P0); jcalc EY = dot(Y,PY) EY = 8.4167 % Comparison with part (a): 101/12 = 8.4167
Solution to Exercise 14.21 (p. 442)
Same as Exercise 14.20, except
p = 1/10. E [Y ] = (n + 1) /20.
n = 100; p = 0.1; X = 1:n; Y = 0:n; PX = (1/n)*ones(1,n); P0 = zeros(n,n+1); % Could use randbern for i = 1:n P0(i,1:i+1) = (1/n)*ibinom(i,p,0:i); end P = rot90(P0); jcalc          EY = dot(Y,PY) EY = 5.0500 % Comparison with part (a): EY = 101/20 = 5.05
452
CHAPTER 14. CONDITIONAL EXPECTATION, REGRESSION
Solution to Exercise 14.22 (p. 442)
2
E [Y ] =
E [Y X = t] fX (t) dt =
1
10t (4 − 2t) dt = 40/3
(14.145)
Solution to Exercise 14.23 (p. 442)
1
E [Y ] =
E [Y X = t] fX (t) dt =
0
20t2 (1 − t) dt = 1/3
3
(14.146)
Solution to Exercise 14.24 (p. 442)
E [Y ] =
E [Y X = t] fX (t) dt =
5 8
2 0
t2 (2 − t) dt = 2/3
3
(14.147)
Solution to Exercise 14.25 (p. 442)
that
X ∼ Poisson (µ), Y ∼ Poisson (λ), X + Y ∼ Poisson (µ + λ)
Use of property (T1) ("(T1)", p. 387) and generating functions shows
P (X = kX + Y = n) =
k
P (X = k, X + Y = n) P (X = k, Y = n − k) = P (X + Y = n) P (X + Y = n)
n−k
(14.148)
=
Put
λ e−µ µ e−λ (n−k)! k!
e−(µ+λ) (µ+λ) n!
n
=
n! µk λn−k k! (n − k)! (µ + λ)n
(14.149)
p = µ/ (µ + λ)
and
q = 1 − p = λ/ (µ + λ)
to get the desired result.
Solution to Exercise 14.26 (p. 442)
E [g (X, Y ) X = t, Z = v] = E [g ∗ (X, Z, Y )  (X, Z) = (t, v)] = E [g ∗ (t, v, Y )  (X, Z) = (t, v)] = E [g (t, Y ) X = t, Z = v] a.s. [PXZ ]
Solution to Exercise 14.27 (p. 442)
By (CE9) ("(CE9)", p. 601), By (CE10), by (CE10)
(14.150)
(14.151)
E [g (X, Y ) Z] = E{E [g (X, Y ) X, Z] Z} = E [e (X, Z) Z] a.s.
E [e (X, Z) Z = v] = E [e (X, v) Z = v] = e (t, v) FXZ (dtv) a.s.
By Exercise 14.26,
(14.152)
(14.153)
E [g (X, Y ) X = t, Z = v] FXZ (dtv) = E [g (t, Y ) X = t, Z = v] FXZ (dtv) a.s. [PZ ]
Solution to Exercise 14.28 (p. 442)
1
(14.154)
(14.155)
FY (v) =
FY T (vu) fT (u) du =
0
1 − e−β(2−u)v
du =
(14.156)
1 − e−2βv
eβv − 1 1 − e−βv = 1 − e−βv , 0<v βv βv
(14.157)
453
Solution to Exercise 14.29 (p. 443)
FXH (tu) = 1 − e−ut fH (u) =
0.01
1 = 200, 0.005 ≤ u ≤ 0.01 0.005 200 −0.005t e − e−0.01t t ≈ 0.3323 du = 200ln2 u
(14.158)
FX (t) = 1 − 200
0.005
e−ut du = 1 −
(14.159)
P (X > 150) =
200 −0.75 e − e−1.5 150
0.01
(14.160)
E [XH = u] = 1/u E [X] = 200
0.005
(14.161)
Solution to Exercise 14.30 (p. 443)
Let
Q = {(t1 , t2 , · · · , tn ) : tk > ti , k = i}.
Then
P (W = Xi ) = E [IQ (X1 , X2 , · · · , Xn )] = E{E [IQ (X1 , X2 , · · · , Xn ) Xi ]} = E [IQ (X1 , X2 , · · · , ti , · · · Xn )] FX (dt) P (Xk > t) = [1 − FX (t)]
k=i
If put
(14.162)
(14.163)
E [IQ (X1 , X2 , · · · , ti , · · · Xn )] =
n−1
(14.164)
FX
is continuous, strictly increasing, zero for Then
t < 0,
u = FX (t), du = fX (t) dt. t = 0 ∼ u = 0,
1
t = ∞ ∼ u = 1.
1
P (W = Xi ) =
0
(1 − u)
n−1
du =
0
un−1 du = 1/n
(14.165)
454
CHAPTER 14. CONDITIONAL EXPECTATION, REGRESSION
Chapter 15
Random Selection
15.1 Random Selection1
15.1.1 Introduction
N,
The usual treatments deal with a single random variable or a xed, nite number of random variables, considered jointly. However, there are many common applications in which we select at random a member of a class of random variables and observe its value, or select a random number of random variables and obtain some function of those selected. This is formulated with the aid of a which is nonegative, integer valued.
counting
or
selecting
random variable
It may be independent of the class selected, or may be related in We consider only the independent case.
some sequential way to members of the class. problems require
optional
random variables, sometimes called
Markov times.
Many important
These involve more theory
than we develop in this treatment. Some common examples:
1. Total demand of
2. Total service time for 3. 4. 5. 6.
N independent of the individual demands. N independent of the individual service times. Net gain in N plays of a game N independent of the individual gains. Extreme values of N random variables N independent of the individual values. Random sample of size N N is usually determined by propereties of the sample observed. Decide when to play on the basis of past results N dependent on past.
customers
N
N
units
15.1.2 A useful modelrandom sums
As a basic model, we consider the sum of a random number of members of an iid class. In order to have a concrete interpretation to help visualize the formal patterns, we think of the demand of a random number of customers. We suppose the number of customers a model to be used for a variety of applications.
N
is independent of the individual demands. We formulate
A
An
basic sequence {Xn : 0 ≤ n} [Demand of n customers] incremental sequence {Yn : 0 ≤ n} [Individual demands]
n
These are related as follows:
Xn =
k=0
Yk
for
n≥0
and
Xn = 0
for
n<0
Yn = Xn − Xn−1
for all
n
(15.1)
1 This
content is available online at <http://cnx.org/content/m23652/1.5/>.
455
456
CHAPTER 15. RANDOM SELECTION
A
D
counting random variable N.
(the random sum)
If
N =n Yk =
then
n
of the
Yk
are added to give the
compound demand
(15.2)
N
∞
∞
D=
k=0
I{N =k} Xk =
k=0 k=0
I{k} (N ) Xk ∞.
Note.
In some applications the counting random variable may take on the idealized value
For example,
in a game that is played until some specied result occurs, this may never happen, so that no nite value can be assigned to independent of the
We assume throughout, unless specically stated otherwise, that:
1. 2. 3.
Independent selection from an iid incremental sequence
N. In such a case, it is necessary to decide what value X∞ is to Yn (hence of the Xn ), we rarely need to consider this possibility.
be assigned.
For
N
X0 = Y0 = 0 {Yk : 1 ≤ k} is iid {N, Yk : 0 ≤ k} is
an independent class
We utilize repeatedly two important propositions:
1. 2.
E [h (D) N = n] = E [h (Xn )] , n ≥ 0. MD (s) = gN [MY (s)]. If the Yn are nonnegative
integer valued, then so is
D
and
gD (s) = gN [gY (s)]
DERIVATION We utilize properties of generating functions, moment generating functions, and conditional expectation.
1.
2.
E I{n} (N ) h (D) = E [h (D) N = n] P (N = n) by denition of conditional expectation, given an event. Now, I{n} (N ) h (D) = I{n} (N ) h (Xn ) and E I{n} (N ) h (Xn ) = P (N = n) E [h (Xn )]. Hence E [h (D) N = n] P (N = n) = P (N = n) E [h (Xn )]. Division by P (N = n) gives the desired result. sD By the law of total probability (CE1b), MD (s) = E e = E{E esD N }. By proposition 1 and the
product rule for moment generating functions,
n
E esD N = n = E esXn =
k=1
Hence
n E esYk = MY (s)
(15.3)
∞
MD (s) =
n=0
A parallel argument holds for
n MY (s) P (N = n) = gN [MY (s)]
(15.4)
gD
in the integervalued case.
Remark.
The result on
MD
and
gD
may be developed without use of conditional expectation.
∞
∞
MD (s) = E esD =
k=0 ∞
E I{N =n} esXn =
k=0
P (N = n) E esXn
(15.5)
=
k=0
n P (N = n) MY (s) = gN [MY (s)]
(15.6)
Example 15.1: A service shop
Suppose the number
N
of jobs brought to a service shop in a day is Poisson (8).
One fourth of
these are items under warranty for which no charge is made. Others fall in one of two categories. One half of the arriving jobs are charged for one hour of shop time; the remaining one fourth are
457
charged for two hours of shop time. Thus, the individual shop hour charges distribution
Yk
have the common
Y = [0 1 2]
SOLUTION
with probabilities
P Y = [1/4 1/2 1/4] P (D ≤ 4). 1 1 + 2s + s2 4
(15.7)
Make the basic assumptions of our model. Determine
gN (s) = e8(s−1) gY (s) =
According to the formula developed above,
(15.8)
gD (s) = gN [gY (s)] = exp (8/4) 1 + 2s + s2 − 8 = e4s e2s e−6
result of straightforward but somewhat tedious calculations is
2
(15.9)
Expand the exponentials in power series about the origin, multiply out to get enough terms. The
gD (s) = e−6 1 + 4s + 10s2 +
56 3 86 4 s + s + ··· 3 3
(15.10)
Taking the coecients of the generating function, we get
P (D ≤ 4) ≈ e−6 1 + 4 + 10 +
56 86 + 3 3
= e−6
187 ≈ 0.1545 3
(15.11)
Example 15.2: A result on Bernoulli trials
Suppose the counting random variable
N∼
n
binomial
(n, p)
and
Yi = IEi ,
with
P (Ei ) = p0 .
Then
gN = (q + ps)
and
gY (s) = q0 + p0 s
(15.12)
By the basic result on random selection, we have
gD (s) = gN [gY (s)] = [q + p (q0 + p0 s)] = [(1 − pp0 ) + pp0 s]
so that
n
n
(15.13)
D∼
binomial
(n, pp0 ).
In the next section we establish useful mprocedures for determining the generating function ment generating function
MD
gD and the moTo this end, we
for the compound demand for simple random variables, hence for determining It may helpful, if not entirely
the complete distribution.
Obviously, these will not work for all problems.
sucient, in such cases to be able to determine the mean value establish the following expressions for the mean and variance.
E [D]
and variance
Var [D].
Example 15.3: Mean and variance of the compound demand
E [D] = E [N ] E [Y ]
DERIVATION
and
Var [D] = E [N ] Var [Y ] + Var [N ] E 2 [Y ]
(15.14)
∞
∞
E [D] = E
n=0
I{N =n} Xn =
n=0 ∞
P (N = n) E [Xn ]
(15.15)
= E [Y ]
n=0 ∞
nP (N = n) = E [Y ] E [N ]
∞
(15.16)
E D
2
=
n=0
P (N = n) E
2 Xn
=
n=0
P (N = n) {Var [Xn ] + E 2 [Xn ]}
(15.17)
5. we may then calculate any desired quantities. That is. E [D] = 8 · 1 = 8.5 + 8 · 1 = 12 Hence. We examine a strategy for computation which is implemented in the mprocedure gend.4: Mean and variance for Example 15. We enter these gN = [p0 p1 · · · pn ] y = [π0 π1 · · · πm ] ··· y2 = conv (y.21) (15. To probabilities as on the plane. y2) ··· yn = conv (y.1 ( A service shop) E [N ] = Var [N ] = 8. P = zeros (n + 1. With the joint i th row are obtained by distribution. We then replace appropriate elements of the successive rows. n+1 rows has begin by preallocating zeros to the rows.20) 15. When the matrix P is completed. n ∗ m + 1). corresponding to missing values of orient the joint gY and the power of gY for the previous row. each of whose gY multiplied by the probability N i th row We the has that value. Var [D] = 8 · 0.19) Example 15. and there is a generating function.25 (0 + 2 + 4) − 1 = 0. we are able to approximate by a simple random variable. the length of yn.23) Coecients of Coecients of Coecients of n gY P whose rows contain the joint probabilities.458 CHAPTER 15. RANDOM SELECTION ∞ = n=0 Hence P (N = n) {nVar [Y ] + n2 E 2 [Y ]} = E [N ] Var [Y ] + E N 2 E 2 [Y ] (15.. integer valued. we need a matrix.18) Var [D] = E [N ] Var [Y ] + E N 2 E 2 [Y ] − E[N ] E 2 [Y ] = E [N ] Var [Y ] + Var [N ] E 2 [Y ] 2 (15.3 Calculations for the compound demand We have mprocedures for performing the calculations necessary to determine the distribution for a composite demand D when the counting random variable N and the individual demands Yk are simple random variables with not too many values. then so is D. (15. By symmetry E [Y ] = 1. we rotate P ninety degrees counterclockwise. The probabilities in the consist of the coecients for the appropriate power of To achieve this. respectively. The procedure gend If the Yi are nonnegative.e. such as for a Poisson counting random variable. . In some cases.22) gN and gY are the probabilities of the values of and calculate the coecients for powers of gY : N and Y. Suppose gN (s) = p0 + p1 s + p2 s2 + · · · + pn sn gY (s) = π0 + π1 s + π2 s2 + · · · + πm sm The coecients of (15. we set nm + 1 elements. y) y3 = conv (y. we remove N and D (i. Var [Y ] = 0. The replacement probabilities for the the convolution of zero rows and columns. y (n − 1)) We wish to generate a matrix 1 × (n + 1) 1 × (m + 1) 1 × (2m + 1) 1 × (3m + 1) 1 × (nm + 1) Coecients of Coecients of gN gY 2 gY 3 gY (15. values with zero probability).1.
PD.0000~~~~0.u.^2)*PD'~~ED^2 VD~=~~1.~PD./sum(P).~~~~~~%~Note~the~zero~coefficient~in~the~zero~position gend Do~not~forget~zero~coefficients~for~missing~powers Enter~the~gen~fn~COEFFICIENTS~for~gN~~gN .6: A numerical example gN (s) = Note that the zero 1 1 + s + s2 + s3 + s4 gY (s) = 0.6000~~~~~~~~%~Agrees~with~theoretical~E[DN=n]~=~n*EY ~~~~2.~Y.0000~~~~0. disp([N.1 5s + 3s2 + 2s3 5 power is missing from gY .24) gN~=~0.~~~~%~Note~zero~coefficient~for~missing~zero~power gY~=~0. 1.1. gN~=~(1/3)*[0~1~1~1]. 2.~PY.~call~for~gD disp(gD)~~~~~~~~~~~~~~~~~~%~Optional~display~of~complete~distribution ~~~~~~~~~0~~~~0.PL]~=~jcalcf(N. (15. 0.5. or 2 items with respective probabilities 0.D.~~~~~~~~%~All~powers~0~thru~2~have~positive~coefficients gend ~Do~not~forget~zero~coefficients~for~missing~powers Enter~the~gen~fn~COEFFICIENTS~for~gN~gN~~~~%~Coefficient~matrix~named~gN Enter~the~gen~fn~COEFFICIENTS~for~gY~gY~~~~%~Coefficient~matrix~named~gY Results~are~in~N.2250 ~~~~3.459 Example 15.~D. First we determine the matrices representing coecients are the probabilities that each integer value is observed.0000~~~~1.0000~~~~0.8000 VD~=~(D. or 3.0000~~~~0. The zero coecients must be included.2000~~~~~~~~~~~~~~~~%~Agrees~with~theoretical~EN*EY P3~=~(D>=3)*PD' P3~~=~0.~P May~use~jcalc~or~jcalcf~on~N.5: A compound demand The number of customers in a ma jor appliance store is equally likely to be 1.0000~~~~1.0000~~~~0.0000~~~~0.2917 ~~~~1.t. EDn~=~sum(u.1200~~~~~~~~~~~~~~~~%~Agrees~with~theoretical~EN*VY~+~VN*EY^2 Example 15.2*[1~1~1~1~1].1*[5~4~1].6000 ED~=~D*PD' ED~=~~1.4.0003 EN~=~N*PN' EN~=~~~2 EY~=~Y*PY' EY~=~~0. Note that the for any missing powers gN and gY .2000 ~~~~3.1*[0~5~3~2]. corresponding to the fact that P (Y = 0) = 0.*P).~P To~view~distribution~for~D.0000~~~~0.EDn]') ~~~~1. Each customer buys 0. Customers buy independently.3667 ~~~~2.0243 ~~~~5.P).~D.1167~~~~~~~~~~~~~~~~ [N. regardless of the number of customers.0040 ~~~~6. gY~=~0.0880 ~~~~4.D.~PN.PN. 0.
~P May~use~jcalc~or~jcalcf~on~N.0000~~~~0.~P To~view~distribution~for~D. Sometimes.~PY.~PD.~PN. Suppose.1250 P4_12~=~((D~>=~4)&(D~<=~12))*PD' P4_12~=~0. RANDOM SELECTION Enter~the~gen~fn~COEFFICIENTS~for~gY~~gY Results~are~in~N.0000~~~~0. p = 0. following example utilizes such an approximation procedure for the counting random variable N.0075 ~~~11.~%~Note~zero~coefficient~for~missing~zero~power gY~=~[0.~PD. or 3.~D.0000~~~~0.~D. .1155 ~~~~5.2000 ~~~~1.0424 ~~~~9. the number of customers in a major appliance store is equally likely to be 1.0000~~~~0.1250 ~~~~4.~D.~PN.~D.0000~~~~0.2640 ~~~~3.0000~~~~0.1110 ~~~~6. 2.1000 ~~~~2.0003 p3~=~(D~==~3)*PD'~~~~~~~~%~P(D=3) P3~=~~0.~Y.~call~for~gD disp(gD) ~~~~~~~~~0~~~~0.0000~~~~0.~P To~view~distribution~for~D. gN.1100 ~~~~3.2080 ~~~~1. This is a generalization of the Bernoulli case in Example 15.4~0. and gend.~Y.~call~for~gD disp(gD)~~~~~~~~~~~~~~~~~%~Optional~display~of~complete~distribution ~~~~~~~~~0~~~~0.460 CHAPTER 15.7: Number of successes for random number We are interested in the number of successes in N trials for a general counting random variable.0000~~~~0.0203 ~~~10.4650~~~~~~~~~~~%~P(4~<=~D~<=~12) Example 15.0000~~~~0.~~~~~~~%~Generating~function~for~the~indicator~function gend Do~not~forget~zero~coefficients~for~missing~powers Enter~gen~fn~COEFFICIENTS~for~gN~~gN Enter~gen~fn~COEFFICIENTS~for~gY~~gY Results~are~in~N.0720 The procedure gend is limited to simple N and Yk . with nonnegative integer values.0964 ~~~~7.0000~~~~0. a random The solution in the variable with unbounded range may be approximated by a simple random variable.0000~~~~0.6. gN~=~(1/3)*[0~1~1~1].6].2 ( A result on Bernoulli trials).0000~~~~0.~P May~use~jcalc~or~jcalcf~on~N.~PY. and each buys at least one item with probability the distribution for the number SOLUTION We use D Determine of buying customers.0000~~~~0. as in Example 15.4560 ~~~~2.0000~~~~0.0019 ~~~12.0696 ~~~~8. N of trials.5 (A compound demand).0000~~~~0. gY .
25*[1~2~1].1 ( VD~=~(D.0000~~~~0. gend Do~not~forget~zero~coefficients~for~missing~powers Enter~gen~fn~COEFFICIENTS~for~gN~~gN Enter~gen~fn~COEFFICIENTS~for~gY~~gY Results~are~in~N.0000~~~~0.0000~~~~0. The individual shop hour have the common distribution Y = [012] with probabilities P Y = [1/41/21/4]. we need to check for a sucient number of terms in a simple approximation.1165 ~~~~8.4 (Mean and variance for Example~15.0711 ~~~~5.1722e06 gN~=~ipoisson(8.~~~~~~~%~Approximate~gN gY~=~0.0000~~~~0.:))~~~~~~~~~~~~%~Calculated~values~to~D~=~50 ~~~~~~~~~0~~~~0.0000~~~~0.~PN.0000~~~~0.1545~~~~~~~~~~~~~~~~~%~Theoretical~value~(4~~places)~=~0. determine SOLUTION Since Poisson P (D ≤ 4).0105 ~~~17.1132 ~~~~9.0021 ~~~20.0000~~~~0.1 ( A service shop) The number charges Yk N of jobs brought to a service shop in a day is Poisson (8).0025~~~~~~~~~%~Display~for~D~<=~20 ~~~~1. N is unbounded.0000 p25~=~cpoisson(8.0463 ~~~~4.0000~~~~0. Then we proceed as in the simple case.0248 ~~~~3.1 ( A A .0:25).461 Example 15.0000 P4~=~(D<=4)*PD' P4~=~~~0.0369 ~~~14.0000~~~~0.8: Solution of the shop time Example 15.0064 ~~~18.~call~for~gD disp(gD(D<=20.~Y.0684 ~~~12.0000~~~~0.0000~~~~0.1099 ~~~~7.0000~~~~0.0861 ~~~11.0000~~~~0.4 (Mean and variance for Example~15.~P To~view~distribution~for~D.~P May~use~jcalc~or~jcalcf~on~N.0099 ~~~~2.1021 ~~~10.~PY.0000~~~~0.10:5:30)~~~~~%~Check~for~sufficient~number~of~terms pa~=~~~0.0166 ~~~16.25)~~~~~~~~~%~Check~on~choice~of~n~=~25 p25~=~~1.~D.0000~~~~0.0000~~~~0.2834~~~~0.0000~~~~0.0003~~~~0.0012 sum(PD)~~~~~~~~~~~~~~~~~~~~~~~%~Check~on~sufficiency~of~approximation ans~=~~1.~PD.1545 ED~=~D*PD' ED~=~~~8.0000~~~~0.0000~~~~0.9999~~~~~~~~~~~~~~~~~%~Theoretical~=~12~(Example~15.0253 ~~~15.0000~~~~0.^2)*PD'~~ED^2 VD~=~~11.0515 ~~~13.0939 ~~~~6.0000~~~~0.0000~~~~~~~~~~~~~~~~~%~Theoretical~=~8~~(Example~15. pa~=~cpoisson(8.0037 ~~~19.0173~~~~0. Under the basic assumptions of our model.~D.
0000~~~~0. For gN . it is necessary to treat the Example 15.5000~~~~0. For the moment generating function.9: Noninteger values A service shop has three standard charges for a certain class of warranty services it performs: $10. and $15.5000~~~~0. The distributions for the various sums are concatenated into two row vectors.0353 ~~~42.1000 ~~~12.0570 ~~~37. mgd~~~~~~~~~~~~~~~~~~~~~~~~~~~%~Call~for~procedure Enter~gen~fn~COEFFICIENTS~for~gN~~gN Enter~VALUES~for~Y~~Y Enter~PROBABILITIES~for~Y~~PY Values~are~in~row~matrix~D. the mprocedure mgd utilizes the mfunction mgsum to obtain these distributions.0000~~~~0. 4 with equal probabilities 0.2000 ~~~10.5~15].0187 .0240 ~~~30.3. need to implement the moment generating function In this case.1*[5~3~2]. we nd it convenient to have two mprocedures: distribution and jmgd D. 12.0330 ~~~32. D} as to develop the for the joint distribution. 0. take on values 10.5.0468 ~~~50.0000~~~~0. to which csort is applied to obtain the distribution for the compound demand.5000~~~~0.5000~~~~0.2. the joint distribution requires considerably mgd for the marginal {N. Let services rendered in a day. 15 with respective probabilities 0.5000~~~~0.0580 ~~~27.0600 ~~~25. Determine the distribution for SOLUTION C. disp(mD)~~~~~~~~~~~~~~~~~~~~~~%~Optional~display~of~distribution ~~~~~~~~~0~~~~0.~probabilities~are~in~PD. The procedure requires as input the generating function for and the actual distribution.5000~~~~0. To~view~the~distribution.462 CHAPTER 15. coecients as in gend. for the individual demands.0000~~~~0. and there are considerable gaps between the values. The number of jobs received in a normal work day can be considered a random variable N which takes on values 0. The job types for arrivals may be represented by an iid class {Yi : 1 ≤ i ≤ 4}. 0.0000~~~~0.5000~~~~0. The values for the individual demands are not limited to integers. $12.~call~for~mD. Y are put into a If Y is integer valued.0450 ~~~35. it is as easy to develop the joint distribution for marginal distribution for more computation. the actual values and probabilities in the distribution for pair of row matrices.0372 ~~~45. independent of the arrival process.5.0486 ~~~47. In the generating function case.0352 ~~~52. As a consequence.0414 ~~~40.2. RANDOM SELECTION The mprocedures mgd and jmgd The next example shows a fundamental limitation of the gend procedure. 3. Y N and PY . C The Yi be the total amount of gN~=~0. 1. PY~=~0.~~~~~~~~~%~Enter~data Y~=~[10~12.0600 ~~~15.0000~~~~0. we MD rather than the generating function gD .50.5000~~~~0.0500 ~~~22. Instead of the convolution procedure used in gend to determine the distribution for the sums of the individual demands. However.2*[1~1~1~1~1].0400 ~~~20. there are no zeros in the probability matrix for missing values.0000~~~~0. 2.0000~~~~0.0000~~~~0.
D}. the results are the same as those obtained with gend. complications come in placing the probabilities in the P we use a modication of mgd called matrix in the desired positions. and requires the same data format.6 (A numerical example). To~view~the~distribution.2000 ~~~~1.6 (A numerical example) In Example 15. disp(mD) ~~~~~~~~~0~~~~0.0000~~~~0.0000~~~~0. Y~=~1:3. gN (s) = (15.1421 As expected.25) gN~=~0. D value.0019 ~~~60.0003 We next recalculate Example 15. Actual operation is quite similar to the operation .0003 P3~=~(D==3)*PD' P3~=~~~0.0424 ~~~~9. PY~=~0.4000 P_4_12~=~((D>=4)&(D<=12))*PD' P_4_12~=~~0.1*[5~3~2].1250 ~~~~4.1100 ~~~~3.0075 ~~~11.6 (A numerical example).0000~~~~0.0000~~~~0. using mgd rather than gend. The This requires some calculations to determine the appropriate size of the matrices used as well as a procedure to put each probability in the position corresponding to its of mgd.0964 ~~~~7.1155 ~~~~5.0000~~~~0.1 ∗ [532].463 ~~~55. above. If it is desired to obtain the joint distribution for {N.~probabilities~are~in~PD.0000~~~~0.5).1250 ED~=~D*PD' ED~=~~~3.6 (A numerical example).0000~~~~0.1110 ~~~~6.5000~~~~0.0203 ~~~10. jmgd. we have 1 1 + s + s2 + s3 + s4 gY (s) = 0. mgd Enter~gen~fn~COEFFICIENTS~for~gN~~gN Enter~VALUES~for~Y~~Y Enter~PROBABILITIES~for~Y~~PY Values~are~in~row~matrix~D.0000~~~~0. We use the same expression for gN as in Example 15. Example 15.2*ones(1.0000~~~~0.0075 ~~~57.1000 ~~~~2.1 5s + 3s2 + 2s3 5 This means that the distribution for Y is Y = [123] and P Y = 0.0000~~~~0.4650 P7~=~(D>=7)*PD' P7~=~~~0.10: Recalculation of Example 15.0696 ~~~~8.0000~~~~0.0000~~~~0.0000~~~~0.~call~for~mD.0000~~~~0.0019 ~~~12.
Multinomial trials Let We analyze such a sequence of trials as follows. The class {Nkn : 1 ≤ k ≤ m} cannot be independent.29) C (n. If the random variable Ti i th arrival and the class {Ti : 1 ≤ i} is iid. in which one type is called a success and the other a failure. The type on the i th trial the Bernoulli or binomial case. m 1 2 n1 of the E1i . N2n = n2 . But both mprocedures work quite well for problems of And it will handle somewhat larger problems. However. but these do not depend upon this result. if the use of gend is appropriate. · · · . Nmn = nm } is one of the (15.31) 2 This content is available online at <http://cnx. the value of the other is determined. which we number 1 through m. Each such arrangement has probability so that m P (N1n = n1 . For each i. We establish some useful theoretical results and in some cases use MATLAB procedures. If n1 + n2 + · · · + nm = n. nm ) = n!/ (n1 !n2 ! · · · nm !) ways of arranging (15. the individual demands may be categorized in one of m types. · · · Nmn = nm ) = n! k=1 pnk k nk ! (15. and are convenient tools for solving various compound demand type problems. we let Nkn be the number of occurrences of type k. etc. the class {Eki : 1 ≤ k ≤ m} types will occur. moderate size. · · · . since on each component trial exactly one of the may be represented by the type random variable m Ti = k=1 We assume kIEki (15. it is faster and more ecient than mgd (or jmgd). n2 of the E2i . n2 . it is usually helpful to demonstrate the validity of the assumptions in typical examples. pk ). Then Nkn = i=1 IEki with Nkn = n k=1 (15. RANDOM SELECTION A principle use of the joint distribution is to demonstrate features of the model. since it sums to If the values of m−1 of them are known. Remark. such as nE [Y ]. n1 . Now each Nkn ∼ binomial (n. This.27) n trials.28) n.30) pn1 pn2 · · · pnm .2 Some Random Selection Problems2 In the unit on Random Selection (Section 15. 15.1). In general.2. the event {N1n = n1 . with P (Ti = k) = P (Eki ) = pk n m invariant with i (15. . is a partition. In this unit. we extend the treatment to a variety of problems. nm of the Emi . including those in the unit on random selection.org/content/m23664/1. of course. N2n = n2 . Suppose there are Eki be the event that type k occurs on the i th component trial.464 CHAPTER 15. we have multinomial trials.26) {Ti : 1 ≤ i} In a sequence of is iid. 15. MY (s).1 The Poisson decomposition is the type of the In many problems. we develop some general theoretical results and computational procedures using MATLAB.6/>. · · · . is utilized in obtaining the expressions for MD (s) in terms of E [DN = n] = gN (s) and This result guides the development of the computational procedures. For m = 2 we have m types.
The messages coming in on line 2 have probability leaving on line a. this is the binomial distribution with parameter A random number of multinomial trials We consider. p1 ). what is the distribution of outgoing messages on line a? What are the probabilities of at least 30. Na ∼ Poisson (50 · 0. and 1/10 require regular mail. 40 outgoing messages on line a? SOLUTION By the Poisson decomposition. by second day priority. we consider some examples. Ti : 1 ≤ i} is with P (Ti = k) = pk . P1 = 1 . Make the usual assumptions on compound demand. 2.5 · 300 = 150). Before verifying the propositions above. Also N1 + N2 ∼ Poisson (120 + 150 = 270). Under the usual independence assumptions. This is readily established with the aid of the generating function. or by regular parcel mail. 5/10 want second day priority. on line 2 the number is N2 ∼ Poisson p1a = 0. Then a. with respective probabilities p1 = 0. Example 15. What is the probability that fewer than 150 want next day express? What is the probability that fewer than 300 want one or the other of the two faster deliveries? SOLUTION Model as a random number of multinomial trials. Each b. and Type 3 is regular mail. 1 ≤ k ≤ m independent Nk ∼ Poisson (µpk ) {Nk : 1 ≤ k ≤ m} is independent.32) Poisson decomposition Suppose 1. 3. . For m = 2.1 · 300 = 30).4 · 300 = 120). incoming messages (45).465 This set of joint probabilities constitutes the multinomial distribution.9954 P12 = 1 . 35.33 p2a = 0.9620 Example 15. Let Nk be the number of results of type k N of multinomial trials. Suppose 4/10 of the customers want next day express. and N3 ∼ Poisson (0. in particular. N ∞ m Nk = i=1 IEki = n=1 I{N =n} Nkn with Nk = N k=1 (15. and N1 on line one in one hour is Poisson (50). the case of a random number (µ). µ for the sum the sum of the µi for the variables added. The usefulness of this remarkable result is enhanced by the fact that the sum of independent Poisson random variables is also Poisson. p2 = 0.11: A shipping problem The number N of orders per day received by a mail order house is Poisson (300). where in a random number N N ∼ Poisson of multinomial trials.4.5.300) P12 = 0.47 The number of On incoming line 1 the messages have probability of leaving on outgoing line a of 1 − p1a of leaving on line b. Type 2 is second day priority.33 + 45 · 0. N2 ∼ Poisson (0. (n.cpoisson(120. Orders are shipped by next day express.150) P1 = 0.12: Message routing A junction point in a network has two incoming lines and two outgoing lines. with N ∼ Poisson (µ) {Ti : 1 ≤ i} is iid {N.1.65).cpoisson(270. with three outcome types: Type 1 is next day express.47 = 37. and p3 = 0. Then N1 ∼ Poisson (0. and type 1 a success.
15. This is composite demand with Yk = IEki . the product rule holds for the class eµpk so that it is independent. N2 = n2 . Nmn = nm } Since N (15.30:5:40) Pa = 0.3722 VERIFICATION of the Poisson decomposition a. b. · · · .38) Vn = min{Y1 . (15.33) n1 . N2n = n2 . gNk (s) = gN [gYk (s)] = eµ(1+pk (s−1)−1) = eµpk (s−1) which is the generating function for b.39) N of the Yi . The minimum and maximum random variables are ∞ ∞ VN = n=0 I{N =n} Vn and WN = n=0 I{N =n} Wn (15. n = n1 + n2 + · · · + nm . · · · . Y2 .6890 0. {N1n = n1 . Therefore. n 2 .2 Extreme values Consider an iid class {Yi : 1 ≤ i} of nonnegative random variables. For any positive integer n we let (15.2.9119 0. let Nk ∼ Poisson (µpk ).466 CHAPTER 15.35) P (A) = e−µ µn · n! n! m k=1 pnk k = (nk )! m e−µpk k=1 pnk k = nk ! m P (Nk = nk ) k=1 (15. Yn } Then and Wn = max{Y1 .33 + 45*0. so that gYk (s) = qk + spk = 1 + pk (s − 1).34) is independent of the class of IEki . · · · . N2n = n2 . n m . the class {{N = n}. and consider A = {N1 = n1 . · · · . FV (t) = P (V ≤ t) = 1 + P (N = 0) − gN [P (Y > t)] FW (t) = gN [P (Y ≤ t)] . then a. Nmn = nm }} is independent.36) The second product uses the fact that m eµ = eµ(p1 +p2 +···+pm ) = k=1 Thus. For any (15. · · · . · · · . By the product rule and the multinomial distribution (15. RANDOM SELECTION ma = 50*0.40) Computational formulas If we set V0 = W0 = 0. Nm = nm } = {N = n} ∩ {N1n = n1 . Yn } P (Vn > t) = P n (Y > t) Now consider a random number and P (Wn ≤ t) = P n (Y ≤ t) (15. Y2 .37) {Nk : 1 ≤ k ≤ m}. Nk = N i=1 IEki .6500 Pa = cpoisson(ma.47 ma = 37.
on lowest bidder and Example 15. 15. Vn } ∞ for each n ∞ P (VN > t) = n=0 P (N = n) P (Vn > t) = n=1 P (N = n) P n (Y > t) .14: Lowest Bidder A manufacturer seeks bids on a modication of one of his processing units.PW]') 6. distribution exponential (1/3). we do not have the extra term for (15. Twenty contractors are invited to bid.467 ∞ These results are easily established as follows. 9.3).0668 9.8739 18. below.250). n=0 By additivity and {N. we have ∞ P (VN > t) = n=0 P (N = n) P n (Y > t) − P (N = 0) = gN [P (Y > t)] − P (N = 0) In this case. P (W0 ≤ t) = 1.41) If we add into the last sum the term P (N = 0) P 0 (Y > t) = P (N = 0) then subtract it.44) Example 15. PW = exp(20*exp(t/3)).16 (Batch testing)).0000 0.42) A similar argument holds for proposition (b). In some cases. Special case. so that the number of bids Assume the bids Yi N ∼ binomial (20. 12.43) P (N = 0) = P 0 (Y > t) P (N = 0) ∞ to each of the sums to get FV (t) = 1 − n=0 P n (Y > t) P (N = n) = 1 − gN [P (Y > t)] (15. since P (V0 > t) = 0 (15.45) t = 6:3:18.13: Maximum service time The number N of jobs coming into a service center in a week is a random quantity having a Poisson Suppose the service times (in hours) for individual units are iid.3.14 (Lowest Bidder ). with common (20) distribution.3694 12.0. What is the probability the maximum service time for the units is no greater than 6.0000 0. The market is such that the bids have a common distribution symmetric triangular on (150. They bid with probability 0. In that case ∞ ∞ n FV (t) = n=1 [1 − P (Y > t)] P (N = n) = n=1 P (Vn ≤ t) P (N = n) = ∞ ∞ n n=1 P (Y > t) P (N = n) n=1 P (N = n) − Add (15.0000 0. disp([t. independence of {VN > t} = {N = n}{Vn > t}. 18 hours? SOLUTION SOLUTION P (WN ≤ t) = gN [P (Y ≤ t)] = e20[FY (t)−1] = exp −20e−t/3 (15. since {N = 0}.6933 15. N =0 does not correspond to an admissible outcome (see Example 15. What is the probability of at least .9516 Example 15.0000 0.0000 0. (in thousands of dollars) form an iid class.
7000 210.3075 190.gY). are the same as in the previous solution.1 . PV = 1 .3848 180.468 CHAPTER 15. 0. SOLUTION. we are interested in the probability of one or more occurrences of the event p = P (Y ≤ t).(0.3p) where p = P (Y > t) Solving graphically for p = P (V > t) .7: Number of successes for random number of trials.3200 0.9612 210.3*p).14 (Lowest Bidder ) reects the fact that there are probably fewer bids in each case. of course.6705 190. This is essentially the problem in Example 7 (Example 15.14 (Lowest Bidder ) may be worked in this manner by using ibinom(20.0.PV]') 170.5.) from "Random Selection". Example 15.pd] = gendf(gN.5000 0. Y ≤ t.0000 0. indicates that a fraction p However. 180. 0.16: Batch testing Electrical units from a production line are rst inspected for operability.7 + 0.3. hence Solution P (V ≤ t) = 1 − gN [P (Y > t)] = 1 − (0. 200.7 + 0. 2 or 3 with probabilities 0. [d. experience All operable of those passing the initial operability test are defective. fact that the probabilities in this example are lower for each t than in Example 15.0:20). with probability N t = [170 180 190 200 210]. We use MATLAB to obtain 20 t = [170 180 190 200 210].p(i)].0000 0.2]. PV(i) = (d>0)*pd'.0000 0.p. not a low bid of zero. for i=1:length(t) gY = [p(i). gN = The The results.0000 0.2. % Zero for missing value PV = zeros(1. RANDOM SELECTION one bid no greater than 170.0000 0. p = [23/25 41/50 17/25 1/2 8/25].14 (Lowest Bidder ) with a general counting variable Suppose the number of bids is 1.8671 200.3.length(t)).5019 200. % Selects positions for d > 0 and end % adds corresponding probabilities disp([t.9200 0.0000 0.0000 0. Determine P (V ≤ t) in each case.0000 0. we get p = [23/25 41/50 17/25 1/2 8/25] for t = [170 180 190 200 210] 20 Now gN (s) = (0.9896 Example 15.0000 0.^20.7 + 0. 210? Note that no bid is we must use the special case. The minimum of the selected than or equal to For each t. p = [23/25 41/50 17/25 1/2 8/25]. respectively.3s) .8200 0.0000 0.PV]') 170.15: Example 15.8462 Example 15. % Probabilities Y <= t are 1 . t.p gN = [0 0.3 0.6800 0. 190.5 0. disp([t.1451 180. . Y 's is no greater than t if and only if there is at least one Y less We determine in each case probabilities for the number of bids satisfying Y ≤ t.
001. Then the total time (or cost) from the beginning to the completion of the rst success is N T = i=1 We suppose the Yi (composite demand with N −1∼ geometric p (15.469 units are subsequenly tested in a batch under continuous operation ( a burn in test).46) N =0 does not yield a minimum value. Let Yi of success on any component trial.3 Bernoulli trials with random execution times or costs Consider a Bernoulli sequence with probability p of the trial on which the rst success occurs. 15. MT (s) = gN [MY (s)] = There are two useful special cases: (15.47) n = 5000. disp([t. Now N − 1 ∼ geometric (p) implies pMY (s) 1 − qMY (s) gN (s) = ps/ (1 − qs). n = 5000. so that gN (s) = (q + ps) .0000 0. Statistical data indicate the defective units have times to failure Yi iid. P = exp(n*p*exp(lambda*t)). we have {N = 0} ⊂ {V > t} Since so that P (N = 0V > t) = P (N = 0) P (V > t) (15.001. and t = 1. N is P (V > t) = gN [P (Y > t)]. If np(s−1) approximately Poisson (np) with gN (s) = e and N∼ binomial for large n P (N = 0V > t) = For e−np enp[P (Y >t)−1] = e−npP (Y >t) = e−npe −λt (15. 5. Now P (Y > t) = e−λt .9983 5. p). p = 0.2. have very long life (innite from the point of view of the test). 4. lambda = 2.9125 3.P]') 1.5083 2. A batch of be the time of the rst failure and t N n units is tested. λ = 2.0000 0. exponential (λ). Since no defective units implies no failures in any reasonable test time.49) . MATLAB calculations give t = 1:5. If the test goes what is the probability of no defective units? units of time with no failure (i.0000 0. In actually designing the test.e. we have condition above. the number of defective units P (N = 0) = e−np . p = 0. one should probably make calculations with a number of dierent assumptions on the fraction of defective units and the life duration of defective units.9998 It appears that a test of three to ve hours should give reliable results.48) Yi form an iid class.. whereas good units Let V be the number of defective units in the batch. Now under the n (n. SOLUTION V > t).0000 0.9877 4.0000 0. Let be the time (or cost) to execute the N be the number i th trial. 3. 2. These calculations are relatively easy to make with MATLAB. independent of so that N. N is large and p is reasonably small.
solving for the distribution of a program such as Maple or Mathematica.25:1.50) T ∼ exponential Yi − 1 ∼ geometric (p0 ).470 CHAPTER 15.2.exp(3*t)).9093. Example 15.79 the maximum is one hour or less. 0./(1 + 4*exp(3*t)). In the general case.5000 0. 1.2 · 3 = 0.4105 0.7500 0.51) T −1∼ geometric (pp0 ). 1.63 the maximum is 3/4 hour or less.25. Since T requires transform theory.54) .6). gN (s) ps = n k=0 ps 1−qs = ps ∞ k=0 ∞ k=0 n (qs)k = ps n k=0 (qs)k + ∞ k=n+1 (qs)k = (15.5:0. 0.52) P (W ≤ t) = gN [P (Y ≤ t)] = MATLAB computations give t = 0. For the case of series. Interview times (in hours) Yi are presumed to form an iid class.5. We take the probability for success on any interview to be be found in four hours or less? p = 0. with probability 0. MT (s) = which implies 2.PWt]') 0. and may be handled best by simple Yi . so that (pλ).53) (qs)k + (qs)n+1 (qs)k k n+1 n+1 = ps k=0 (qs) + (qs) gN (s) = gn (s) + (qs) gN (s) (15. etc. Thus. so that MY (s) = λ λ−s .5. (λ). PWt = (1 .17: Job interviews Suppose a prospective employer is interviewing candidates for a job from a pool in which twenty percent are qualied.75. pλ/ (λ − s) pλ = 1 − qλ/ (λ − s) pλ − s (15.7924 1.6293 1.8 (1 − e−3t ) 1 + 4e−3t (15. so that P (T ≤ 4) = 1 − e−0.8925 1. What is the probability a satisfactory candidate will What is the probability the maximum interview time will be no greater than 0. 1. with probability 0.0000 0. RANDOM SELECTION Yi ∼ exponential 1.5 hours? SOLUTION T ∼ exponential (0. we may use approximation procedures based on properties of the geometric N −1∼ geometric (p).2 1 − e−3t 1 − e−3t = 1 − 0. the average interview time is 1/3 hour (twenty minutes).5000 0.2500 0. disp([t. gY (s) = p0 s 1−q0 s gT (s) = so that pp0 s/ (1 − q0 s) pp0 s = 1 − pp0 s/ (1 − q0 s) 1 − (1 − pp0 ) s (15.6·4 = 0.9468 The average interview time is 1/3 hour. each expo nential (3).
% Distribution function for D plot(0:100.56) (qs) n+1 gN [gY (s)] for s=1 reduces to q n+1 = P (N > n) With such an Yi which is negligible if n is large enough. probabilities are in PD.3. as in the general case of simple faster and more ecient for the integervalued case. 10}. To view the distribution. = [30 35 40]. integervalued.^(k1)]. (15.9497 P30 = (D<=30)*PD' P30 = 0.58) p = 0. n. we may use the gend procedure on where gn (s) = ps + pqs2 + pq 2 s3 + · · · + pq n sn+1 For the integervalued case. gn [gY (s)]. 2. gN = p*[0 q.^a = 1. if the (15. Suitable n may be determined in each case.0000 FD = cumsum(PD).8263 q a b b .FD(1:101)) % See Figure~15.0064 n = 40.1 P50 = (D<=50)*PD' P50 = 0.10)]. Example 15. gend is usually q is small.57) are nonnegative. gY = 0. N Determine the distribution for T = k=1 SOLUTION Yk (15. 0 <= k <= 40 gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Values are in row matrix D.0e04 * % Use n = 40 0.0379 0. Now gT (s) = gn [gY (s)] + (qs) n+1 gN [gY (s)] s = 1. sum(PD) % Check sum of probabilities ans = 1. Since (15. k = 1:n.2254 0. % Probabilities.471 Note that gn (s) has the form of the generating function for a simple approximation values and probabilities with N Nn which matches up to k = n.3 and Y be uniformly distributed on {1. = 1 .p. call for gD.1*[0 ones(1. % Check for suitable n = q. However. we could use mgd. the number of terms needed to is likely to be too great.55) The evaluation involves convolution of coecients which eectively sets gN (1) = gY (1) = 1. approximate Unless gn Yi . · · · .18: Approximating the generating function Let p = 0.
(basic sequence) • Interarrival times : {Wi : 1 ≤ i}.s. between the two sequences are simply The relations n S0 = 0.s. (incremental sequence) The strict inequalities imply that with probability one there are no simultaneous arrivals. . The stream of arrivals may be described in a third way. RANDOM SELECTION Figure 15. but use the actual distribution for 15. separated by random waiting or interarrival times. • Arrival times : {Sn : 0 ≤ n}. arrivals and designate the times of occurrence as arrival times.2.4 Arrival times and counting processes Suppose we have phenomena which take place at discrete instants of time.1: Execution Time Distribution Function FD . Sn = i=1 Wi and Wn = Sn − Sn−1 for all n≥1 (15. the failures of a system. with 0 = S0 < S1 < · · · a.59) The formulation indicates the essential equivalence of the problem with that of the compound demand (Section 15. In that case. use Y. gN as in Example 15.18 ( Approximating the generating function). vehicles passing a position on a road.472 CHAPTER 15. We refer to these occurrences as A stream of arrivals may be described in three equivalent ways. etc. with each Wi > 0 a. although at the cost of more computing time. The notation and terminology are changed to correspond to that customarily used in the treatment of arrival and counting processes.3: Calculations for the compound demand).1. These may be arrivals of customers in a store. of noise pulses on a communications line. The same results may be achieved with mgd.
t] (Si ) = max{n : Sn ≤ t} = min{n : Sn+1 > t} (15. renewal process Nt In the literature on renewal processes. the class {N (t2 ) − N (N1 ) . {Nt = n} = {Sn ≤ t < Sn+1 } This imples (15. This follows the result in the unit DISTRIBUTION APPROXI9MATIONS on the relationship between the Poisson and gamma distributions. . Sn ∼ gamma (n. i Nt ∼ Poisson (λt) for all t > 0. · · · . {Sn : 0 ≤ n} Several properties of the counting process {Wn : 1 ≤ n} should be noted: {Nt : 0 ≤ t} (15. That is 2. it is common for the random variable and the to count an arrival at the convention above. 2. t]. Sn . ω) = 0. integervalued function dened on [0.473 • Counting processes : Nt = N (t) t. For any given with I(0. a. b. If {Wi : 1 ≤ i} is iid exponential (λ). {Nt ≥ n} = {Sn ≤ t}. (15. h > 0.61) ω . More advanced treatments show that the process has independent. t + h]. along with the fact that {Nt ≥ n} = {Sn ≤ t}. stationary increments. We use Exponential iid interarrival times The case of exponential interarrival times is natural in many applications and leads to important mathematical results. t = 0. N (t + h) − N (t) counts the arrivals h > 0. N (tm ) − N (tm−1 )} is independent. we assume (15. this is a random quantity for each nonnegative is the number of arrivals in time period (0. rightcontinuous.s. h > 0.62) P (Nt = n) = P (Sn ≤ t) − P (Sn+1 ≤ t) = P (Sn+1 > t) − P (Sn > t) Although there are many possibilities for the interarrival time distributions. N (0. ω). For N (t + h) − N (t) = N (h) for all t. and S0 = 0. Such a family of random variables constitutes a random process. so that N (t + h) ≥ N (t) for Nt = i=1 3. 1. in the discussion of the connection between the gamma distribution and the exponential distribution. the interarrival times Wi . This requires an adjustment of the expressions relating Si . It should be clear that We thus have three equivalent descriptions for the stream of arrivals. Nt ∼ Poisson Remark. For a given t. with Wi > 0 a.64) and the interrarival Under such assumptions. λ) for all This is worked out in the unit on TRANSFORM METHODS. N0 = 0 and for t > 0 we have ∞ in the interval (t. ω) is a nondecreasing. ∞).63) {Wi : 1 ≤ i} times are called is iid.60) N 1. λ) for all n ≥ 1. The essential relationships between the three ways of describing the stream of arrivals is displayed in Wn = Sn − Sn−1 . ω the value is N (t. We utilize the following propositions about the arrival times and the counting process N. the counting process is often referred to as a renewal times. n ≥ 1. then Sn ∼ gamma (n. N (·. and t1 < t2 ≤ t3 < t4 ≤ · · · ≤ tm−1 < tm . The counting process is a Poisson process in the sense that (λt) for all t > 0. In this case the random process is a counting process. N (t4 ) − N (t3 ) .
20: Gamma arrival times Suppose the arrival times Sn ∼ ∞ gamma (n. the number of arrivals in any time interval depends upon the length of the interval and not its location in time.68) ∞ fn (t) = λe−λt n=1 the proposition is established. λ) and the present value of all future replacements is ∞ C= n=1 ce−αSn (15. λ). then a dollar spent is the time of replacement of the t units of time in the future has a nth unit. 140? p p = 1 . 600) for expectation and the corresponding property for integrals to assert ∞ ∞ ∞ ∞ E n=1 g (Sn ) = n=1 E [g (Sn )] = n=1 0 g (t) fn (t) dt where fn (t) = λe−λt (λt) (n − 1)! n−1 (15. Example 15.69) Example 15.9669 Sn ∼ gamma We develop next a simple computational result for arrival processes for which (n. If money is discounted at a rate current value e−αt . Units fail independently.0347 0.67) We may apply (E8b) (list. and the numbers of arrivals in nonoverlapping time intervals are independent. p.65) ∞ ∞ E n=1 VERIFICATION g (Sn ) = λ 0 g (15. Thus. (λt) = λe−λt eλt = λ (n − 1)! n=1 ∞ n−1 (15. 120. If Sn α.19: Emergency calls Emergency calls arrive at a police switchboard with interarrival times (in hours) exponential (15).70) .[101 121 141]) = 0. then Sn ∼ gamma (n. 600) to assert ∞ 0 ∞ ∞ ∞ gfn = n=1 Since g 0 n=1 fn (15. RANDOM SELECTION In words.66) We use the countable sums property (E8b) (list. Example 15.21: Discounted replacement costs A critical unit in a production system has life duration exponential (λ). λ) and and g is such that ∞ g < ∞ 0 Then E n=1 g (Sn )  < ∞ (15. p. What is the probability the number of calls in an eight hour shift is no more than 100.cpoisson(8*15. Upon failure the unit is replaced immediately by a similar unit.5243 0. Cost of replacement of a unit is c dollars.474 CHAPTER 15. the average interarrival time is 1/15 hour (four minutes).
The result of Example 15. t] (Sn ) g (Sn ). replace g (Sn ) = g by I(0. 871.77) Thus. with {Cn .08 (eight percent).24: Replacement costs. Example 15. (15. Often. 000 0. Sn } independent and E [Cn ] = c. let counting random variable for arrivals in the interval Nt be the (0. Nt n=1 ( Gamma arrival times). often referred to as a nite horizon.76) Example 15. the reduction factor is .t] (u) g (u) du = 0 0 g (u) du (15. above. invariant with n. nite period.t] g ∞ ∞ n=0 I(0.22: Random costs Suppose the cost of the random quantity nth replacement in Example 15. Then ∞ ∞ ∞ is a E [C] = E n=1 Cn e−αSn = n=1 E [Cn ] E e−αSn = n=1 cE e−αSn = λc α (15.1479 = 8.20 ( Gamma arrival times). it is desirable to plan for a specic.73) Example 15. nite horizon Under the conditions of Example 15. the expected cost for the innite horizon λc/α 1 − e−αt .75) VERIFICATION Since Nt ≥ n i Sn ≤ t.1479 to give E [C] = 60000 · 0.16 = 0. costs).20 ( Gamma arrival times) may be modied easily to account for a nite period. average time (in years) to failure and the α = 0. t Nt If Zt = n=1 g (Sn ) .21 ( Discounted replacement 1 − e−0. Then E [C] = 1200 · 4 = 60.23: Finite horizon Under the conditions assumed in Example 15. then E [Zt ] = λ 0 g (u) du (15.72) c = 1200.21 ( Discounted replacement costs) Cn . t In the result of Example 15.21 ( Discounted replacement costs).37.08 (15. SOLUTION t E [C] = λc 0 e−αu du = λc 1 − e−αt α is reduced by the factor (15.74) The analysis to this point assumes the process will continue endlessly into the future. consider the replacement costs over a twoyear period. For t = 2 and the numbers in Example 15.71) ∞ E [C] = λ 0 Suppose unit replacement cost discount rate per year ce−αt dt = λc α 1/λ = 1/4.20 and note that I(0. t].475 The expected replacement cost is ∞ E [C] = E n=1 Hence g (Sn ) where g (t) = ce−αt (15.
81) a = 1.4/>. the expression for In the important special case that E[ ∞ n=1 g (Sn )] may be put into a form which does not require the interarrival times to be exponential. 482.3)). 3 This content is available online at <http://cnx.08. 482.4) from "Problems on Random Variables and Joint Distributions") As a variation of Exercise 15. RANDOM SELECTION g (u) = ce−αu . .1.08.26: Uniformly distributed interarrival times Suppose each Wi ∼ uniform (a. b = 5. c = 100. X times.7900 EC = c*MW/(1 . Example 15. Then (see Appendix C (Section 17. MW (−α) = Let e−aα − e−bα α (b − a) and so that E [C] = c · e−aα − e−bα α (b − a) − [e−aα − e−bα ] (15. MW (−α) = E e−αW < 1. A coin is ipped number of heads that turn up.) (See Exercise 3 (Exercise 8. Then. where {Wi : 1 ≤ i} is pair {Vn .79) ∞ E [C] = c n=1 Since n MW (−α) = MW (−α) .1643 15.80) α>0 and W > 0. b).org/content/m24531/1. Example 15.exp(b*A))/(A*(b .476 CHAPTER 15. exponential Suppose that S0 = 0 and Sn = each E [Vn ] = c and each n i=1 Wi . c = 100 α = 0. suppose a pair of dice is rolled instead of a single die.78) MW is the moment generating function for W. ∞ g iid.1 die is rolled. Let Then for {Vn : 1 ≤ n} α>0 be a class such E [C] = E n=1 where Vn e−αSn = c · MW (−α) 1 − MW (−α) (15.3) from "Problems on Random Variables and Joint Distributions") A X be the number of spots that turn up. Let (Solution on p. DERIVATION First we note that n E Vn e−αSn = cMSn (−α) = cMW (−α) Hence. a = 1. b = 5.3 Problems on Random Selection3 Exercise 15.2 (Solution on p.) (See Exercise 4 (Exercise 8. MW = (exp(a*A) . 1 − MW (−α) so that provided MW (−α)  < 1 (15. Sn } is independent. Determine the distribution for Y. we have 0 < e−αW < 1.25: General interarrival.MW) EC = 376. Let Y be the Exercise 15. by properties of expectation and the geometric series (15.a)) MW = 0. Determine the distribution for Y. A = 0.
5) from "Problems on Random Variables and Joint Distributions") X times.00 0. times independently and randomly a number from 1 to 10.20) from "Problems on Conditional Expectation. Exercise 15. Exercise 15.15 6 0.10 3 0.15 50.5).1 Let D be the number of deer sighted under this model.).477 Exercise 15. Determine P (10 ≤ D ≤ 30). Y be the number of Exercise 15. 0. Var [D]. The number of herds N∼ binomial (20. 483.05 2 0.21) from "Problems on Conditional Expectation. Determine the distribution for X times. 484.50 0. Let matches (i.6 number (See Exercise 21 (Exercise 14.2 0. Exercise 15. etc.00 0.) (See Exercise 5 (Exercise 8.05 10 0. with common distribution Y = [1 2 3 4] Let P Y = [0. Determine Exercise 15.4). with probabilities (Solution on p. (Solution on p.9 probability of each being selected by a customer.15 4 0.00 0.20 42. Determine E [Y ] and P (Y ≤ 10).8 to be sighted is assumed to be a random variable be from 1 to 10 in size. Determine the distribution for Y. Determine E [Y ] P (Y ≤ 20). both twos.20 40. both ones.50 0.) of Bernoulli trials) from "Conditional is chosen by a random selection from the integers 1 through 20 (say by drawing a card from a box). Let the distribution for Y.e.4 (See Example 7 (Example 14. Regression") A Y X is selected randomly from the integers 1 through 100.10 25. be the number of questions answered correctly by the i th There are four questions. Regression") A X is selected randomly from the integers 1 through 100. 0. 75.50 0. 484.3 0. A pair of dice is thrown X times..) Game wardens are making an aerial survey of the number of deer in a park.10 60. etc.). 485. Y X be the number of matches (i. The table below shows the values of the items and the Value Probability 12.00 0. What is the probability of three or more sevens? A random number Y X be the total number of spots which turn up. 482. 484.15 30. Regression") A number X N (Solution on p. both draw twos. 485.05 Table 15. (Solution on p. 50.05 9 0.7: Expectation.) Yi N ∼ binomial (20. 100 and P (D ≥ 90). P (15 ≤ D ≤ 25).7 Suppose the number of entries in a contest is Let (Solution on p. Y. Suppose the contestant. Each herd is assumed to Value Probability 1 0.5 number Let (Solution on p.10 7 0.. Let both draw ones. Let an additional (Solution on p.82) and D be the total number of correct answers. Determine P (D ≤ t) for t = 25. (15.3 Suppose a pair of dice is rolled.10 8 0. A pair of dice is thrown be the number of sevens thrown on the and X tosses.4 0. Roll the pair be the number of sevens that are thrown on the X rolls.) A supply house stocks seven popular items.10 . Determine the distribution for Y.1] E [D] .) Each of two people draw Exercise 15.) (See Exercise 20 (Exercise 14.20 5 0.e. Yi are iid.
) Five hundred questionnaires are sent out. Determine the distribution for the total demand D. A single die is thrown the number of times indicated by the result of the spin of the wheel. The number of points made is the total of the numbers turned up on the sequence of throws of the die. (Solution on p. P (X > = 10). 486. Let (15.16 incoming messages (45). Exercise 15. The probability of a reply is 0. 487.478 CHAPTER 15. 486. The probability that a reply will be favorable is 0. binomial (20. 3.15 0. 0. Customers who buy do so (in dollars) with common distribution: Y = [200 220 240 260 280 300] Let P Y = [0.25 0.) Exercise 15.) N1 ∼ Poisson (500) and the N2 ∼ Poisson (300). 200. 0.0.) of students take a qualifying exam. 2.2 Suppose the purchases of customers are iid. · · · . on line 2 the number is N2 ∼ Poisson p1a = 0.7 of making 70 or more.) D = D1 + D2 . 488. What is the probability of at least 200. Determine P (D1 ≥ D2 ) and (Solution on p.10] the total sales for Geraldine. what is the probability all will pass? Ten or more will pass? Exercise 15. 155. and D.10 A game is played as follows: (Solution on p. and the number of customers in a day is binomial (10.15 0. a.14 (Solution on p. 488. The number who reply is a random number A questionnaire is sent to twenty persons. If each student has probability N ∼ binomial (20.12 and P (D ≥ 750). N ∼ If each respondent has probability p = 0. A wheel is spun. 250. what is the probability of ten or more favorable replies? Of fteen or more? Exercise 15. A grade of 70 or more earns a pass. Exercise 15. Let Y represent the number of points made and X = Y − 16 be the net gain (possibly negative) of the player.75.13 A random number Suppose N (Solution on p. P (D ≥ 1500). Determine the maximum value of X. 250 favorable replies? Exercise 15. 487.5 of a sale in each case. a dollar is returned for each point made.3). 487.33 On incoming line 1 the messages have probability of leaving on outgoing line a . A junction point in a network has two incoming lines and two outgoing lines.6 he makes a sale in each case.6. for t = 100. RANDOM SELECTION Table 15.25 0. and order an amount p1 = 0. 225. With probability calls on ve customers. 250. E [X]. p = 0. with probability on an iid basis. Var [X].8 of favoring a certain proposition.5).) The number of Exercise 15. P (X > = 16).11 Marvin calls on four customers. giving one of the integers 0 through 9 on an equally likely basis. what is the distribution for the total number Japanese visitors to the park? Determine D of German and P (D ≥ k) for k = 150. P (X > 0). How many dierent possible values are there? What is the maximum possible total sales? b. Geraldine Yi p2 = 0. (Solution on p. N1 on line one in one hour is Poisson (50).10 0. 300. D2 .15 Suppose the number of Japanese visitors to Florida in a week is number of German visitors is (Solution on p. 150. 245. A player pays sixteen dollars to play. If 25 percent of the Japanese and 20 percent of the Germans visit Disney World. P (D ≥ 1000).) 1.83) D1 be the total sales for Marvin and D2 Determine the distribution and mean and variance for D1 . Determine Determine E [D] and P (D ≤ t) P (100 < D ≤ 200).7).
Twenty ve percent are trucks.3 What is the probability the total cost of the $2.) of orders sent to the shipping department of a mail order house is Poisson (700). The messages coming in on line 2 have probability p2a = 0. 0. Suppose the probability is 0.50 boxes is greater than the cost of the $3. respectively. others The number of nonuniversity buyers Mac.00 greater than the cost of the $3.) A service center on an interstate highway experiences customers in a onehour period as follows: • • Northbound: Total vehicles: Poisson (200). What is the probability of 10 or more sales? What is the probability that the number of sales is at least half the number of information requests (use suitable simple approximations)? Exercise 15.05 Table 15. Southbound: Total vehicles: Poisson (180). which with packing costs have distribution Cost (dollars) Probability 0. The probabilities for Mac. Orders require one of seven kinds of boxes. N1 ∼ Poisson (30). 488.4.5. Exercise 15. 489.20 3. Suppose the following assumptions are reasonable for monthly purchases. HP. The respective probabilities for For each group.18 The number N (Solution on p. General customers for home and business computing.00 boxes? Suggestion.4.) of hits in a day on a Web site on the internet is Poisson (80). 0. and various other IBM compatible personal computers. the composite demand assumptions are reasonable.50 0.25 3. . what is the distribution of outgoing messages on line a? What are the probabilities of at least 30. If the number of cars passing a trac check point in an hour is Poisson (130).00 0.50 boxes is not more than $50.60 that the inquirer just browses but does not identify an interest.2.50 boxes is no greater than $475? What is the probability the cost of the $2.10 1.3. Truncate the Poisson distributions at about twice the mean value. 0. HP.17 has two ma jor sources of customers: (Solution on p. and the two groups buy independently.479 and 1 − p1a of leaving on line b. 489. Twenty percent are trucks. HP. Students and faculty from a nearby university 2.20 (Solution on p.10 4. It 1.19 The number N (Solution on p. What is the distribution for the number of Mac sales? What is the distribution for the total number of Mac and Dell sales? Exercise 15. N2 ∼ Poisson (65). • • • The number of university buyers are 0. 489. and is 0.30 that the result is a request for information.2.00 0.) A computer store sells Macintosh.50 0. is 0.25 0.) One car in 5 in a certain community is a Volvo.00 boxes? What is the probability the cost of the $2. 0. 35.15 2.00 0.75 0. 40 outgoing messages on line a? Exercise 15.47 of leaving on line a.10 that any hit results in a sale.21 (Solution on p. what is the expected number of Volvos? What is the probability of at least 30 Volvos? What is the probability the number of Volvos is between 16 and 40 (inclusive)? Exercise 15. 488. others are 0. Under the usual independence assumptions.15 2.
Exercise 15. Determine P (N2 > N3 ). Customer requests NA ∼ Poisson (30).1.) Two items. 491.) of bids is a random variable Value Probability 0 0.05 1 0.01).27 A property is oered for sale. Visa P (N1 ≥ 30). 400.4. The A discount retail store has two outlets in Houston. with respective probabilities N (Solution on p. 3. The number and the generating function gD (s).03 Table 15. the nember of orders NB ∼ Poisson (40).1. For store B. the probability an order for a is 0. 200. Determine the probabilities that the maximum withdrawal is less than or equal to t for t = 100.Each car has 1.10 2 0.10 6 0. from store B.3. 0.10 7 0.05 10 0. Let N1 .15 3 0.) Note Y N ∼ binomial (7. Determine E [D]. with respective probabilities 0. 2. N3 be the numbers of cash sales. number of orders in a day from store A is is (Solution on p.000 or less.10 5 0.20 4 0. Exercise 15. respectively Under the usual independence assumptions. 491. 491.01 ∗ [5 7 10 11 12 13 12 11 10 9] (15.25. 0.15 2 0. What is the probability the total order for item b in a day is 50 or more? Exercise 15. with respective probabilities N (Solution on p.480 CHAPTER 15. RANDOM SELECTION . P (N3 ≥ 50). 4. 200]. respectively. 490. 5]. 0. 500.05 .22 N of customers in a shop in a given day is Poisson (120). 490.) The number of customers during the noon hour at a bank teller's station is a random number N N = 1 : 10. What is the probability of at least one bid of $3.) Exercise 15. the probability of at least one bid of $125. uniform [100.500 or that no bid is not a bid of 0. Experience indicates the number values 0 through 8. Experience indicates the number having values 0 through 10. Bids (in thousands of uniform on [3. 0. and for b is 0.20 4 0. with respective probabilties 0. N2 . Make the usual independence assumptions. Exercise 15.10 6 0.25 with distribution (Solution on p. 0.05 8 0. MasterCard charges.6).20 5 0.Each truck has one or two persons. 0. (Solution on p. or 5 persons. let D be the number of persons to be served. with probabilities 0.3. .4 The market is such that bids (in thousands of dollars) are iid.05 7 0. Var [D]. distribution P N = 0. P (N2 ≥ 60).40. For store A.35.07 8 0.6.24 The number of bids on a job is a random variable dollars) are iid with less? (Solution on p. are featured in a special sale. the probability an order for a is 0. 0.05 1 0.26 A job is put out for bids.23 are phoned to the warehouse for pickup. 490.7 and 0.05 9 0. Customers pay with cash or by MasterCard or Visa charge cards. and for b is 0. 300.2.84) The amounts they want to withdraw can be represented by an iid class having the common Y ∼ exponential (0.3. Determine Exercise 15.15 3 0. a and b.7. with a common warehouse. and card charges.3.) of bids is a random variable having Value Probability 0 0.
2. p = 0. with the common distribution function u ≥ 0. A computer store oers each customer who makes a purchase of $500 or more a free chance Suppose the times.) of noise bursts on a data transmission line in a period number of digit errors caused by the geometric Suppose i th burst is Yi . However.1). Exercise 15. with probability (Solution on p.32 The number dollars) Yi N (Solution on p. 493. uniform on [10.15 3 0.33 at a drawing for a prize. is Poisson (7t). Exercise 15.000 or more.20 4 0.) Suppose a teacher is equally likely to have 0. 3 or 4 students come in during oce hours on a given day.05 Table 15. The bid amounts (in thousands of form an iid class. in minutes.05.) are iid. 492.6 The market is such that bids (in thousands of dollars) are iid symmetric triangular on [150 250]. in hours.) The Noise pulses arrrive on a data phone line according to an arrival process such that for each Yi for Nt of arrivals in time interval (0. The probability of winning on a draw is 0. t] is Poisson (µt). {Yi : 1 ≤ i} i th pulse has an intensity FY (u) = 1 − e−2u 2 t > 0 the such that the class is iid. If the lengths of the individual visits. Exercise 15.000? Exercise 15.10 7 0.29 Suppose of the N N ∼ binomial values of the Yi .28 A property is oered for sale. Experience indicates the number having values 0 through 8.5 The market is such that bids (in thousands of dollars) are iid. t]. Under the usual independence assumptions. Determine the probability that in an eighthour day the intensity will not exceed two.000 or more. 492. Let V be the minimum Determine P (V > t) for integer values from 10 to 20.10 6 0. 491.) of bids is a random variable Number Probability 0 0.481 Table 15.31 Twelve solidstate modules are installed in a control system. Under the usual independence assumptions.) What is the probability that the maximum amount bid is greater than $5. 493.) of bids on a painting is binomial (10.30 (Solution on p. what is the probability the unit does not fail because of a defective module in the rst 500 hours after installation? Exercise 15.3) and the Yi (Solution on p. 493.35 The number N (Solution on p. 20]. uniform [150. what is the probability that no visit will last more than 20 minutes. (10. in hours. (Solution on p.15 2 0. with respective probabilities N (Solution on p. two hours of (p). An error correcting system is capable or correcting ve or fewer errors µ = 12 and p = 0.3). Determine the probability of at least one bid of $210.05 8 0. 491.0025).15 5 0. 1.) If the modules are not defective. 200] Determine the probability of at least one bid of $180. are iid exponential (0.34 number (Solution on p.05 any unit could have a defect which results in a lifetime (in hours) exponential (0. what is the expected time between a winning draw? What is the probability of three or more winners in a ten hour day? Of ve or more? Exercise 15. between sales qualifying for a drawing is exponential (4). 492.005 (37 − 2t)2 ≤ t ≤ 10. What is the probability of no uncorrected error in operation? . The {Yi : 1 ≤ i} iid. with common density function fY (t) = 0. with the class (0.35.05 1 0. 0. 0. Exercise 15. Yi − 1 ∼ in any burst. they have practically unlimited life. Exercise 15.
477) PX = (1/36)*[0 0 1 2 3 4 5 6 5 4 3 2 1]. Y. D.0000 0.5 0.1667 4.0000 0.0000 0. P To view the distribution.0000 0.5].0140 % (Continued next page) 9.1025 2.6)].0026 Solution to Exercise 15. P May use jcalc or jcalcf on N.0000 0. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN PN Enter gen fn COEFFICIENTS for gY PY Results are in N.0000 0.0000 0. P May use jcalc or jcalcf on N.0040 10.0000 0. D. P To view the distribution. . PD.0008 11.3 (p.0000 0.0755 5. PN.0000 0. PD. PY.0000 0.482 CHAPTER 15. disp(gD) 0 0.0375 8. call for gD.0000 Solution to Exercise 15.0000 0. PY = [0.0208 6.5 0. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN PX Enter gen fn COEFFICIENTS for gY PY Results are in N.0000 0.1400 6. D.5].0001 12. PY = [0.0000 0.1 (p. PY. call for gD.2 (p.1954 5. RANDOM SELECTION Solutions to Exercises in Chapter 15 Solution to Exercise 15. PN.1641 1.0806 7. disp(gD) % Compare with P83 0 0. 476) PN = (1/36)*[0 0 1 2 3 4 5 6 5 4 3 2 1].0000 0. 476) PX = [0 (1/6)*ones(1.2578 3. D.0269 1.3125 2.0000 0.0000 0.1823 3.0000 0.2158 4. Y.
PY.0230 5.0000 0.0144 0. Y.2661 0.0000 0.0047 0.2113 0. disp(gD) 0 1. PY.4 (p.0000 9. D.0370 0.0013 0.0000 10.3660 2. call for gD.0000 0. D.0000 0.0000 5.2152 3.0008 7.0000 11.0000 0.0000 0. 477) gN = (1/20)*[0 ones(1.0828 4.0001 8.0000 0.0003 0.0000 2. gY = [5/6 1/6].0048 6.0000 11. PN.0000 6.0000 0.3072 1. disp(gD) 0 0. PN. PD.0000 10. PD. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N.0000 0. D.0000 7.20)].0000 0.0795 0.0000 P = (D>=3)*PD' P = 0.0000 0.1116 Solution to Exercise 15.0000 .0000 3.0001 0. Y.0000 0.0000 4. D.0000 0.0000 9.2435 0.0000 12.0000 8. P May use jcalc or jcalcf on N. P To view the distribution. P May use jcalc or jcalcf on N.1419 0.0000 13.0000 0.0000 0. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN PX Enter gen fn COEFFICIENTS for gY PY Results are in N. call for gD.483 PY = [5/6 1/6]. P To view the distribution.0000 12.
0000 0. . gY = [5/6 1/6].9837 Solution to Exercise 15.1*[0 2 4 3 1]. PN. call for gD.01*[0 ones(1. P May use jcalc or jcalcf on N.1].7 (p. Y.0500 P10 = (D<=10)*PD' P10 = 0.0000 18. 477) gN = ibinom(20. Y. PN.0000 0. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N. 477) gN = 0. gY = 0.100)].0000 0. PN. EY = dot(D. call for gD.0000 0.0000 16. PY.0000 15. PY.6 (p.5 (p. PD. D. PY. D. gY = [0. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N. Y.0000 20. PD. P To view the distribution.484 CHAPTER 15.0000 17. D.0000 0. D.0000 0. RANDOM SELECTION 14.0000 19.9 0. P To view the distribution.0000 Solution to Exercise 15. D. P May use jcalc or jcalcf on N. 477) gN = 0.4167 P20 = (D<=20)*PD' P20 = 0. PD.0:20). call for gD. P May use jcalc or jcalcf on N.4.0.9188 Solution to Exercise 15. P To view the distribution.01*[0 ones(1.0000 0. EY = dot(D.PD) EY = 8. D.PD) EY = 5.100)].
9998 Solution to Exercise 15. PD. P May use jcalc or jcalcf on N. s = size(D) s = 1 839 M = max(D) M = 590 t = [100 150 200 250 300].9290 Solution to Exercise 15.5 25 30.5.5 50 60]. 477) gN = ibinom(20. P = zeros(1. Y = [12.01*[0 5 10 15 20 15 10 10 5 5 5].6156 0.485 ED ED VD VD P1 P1 P2 P2 = = = = = = = = dot(D. end disp(P) 0.0:20). end disp(P) 0.8720 ((15<=D)&(D<=25))*PD' 0. D. for i = 1:4 P(i) = (D<=k(i))*PD'.0:10). for i = 1:5 P(i) = (D<=t(i))*PD'.^2)*PD' .5578 0. To view the distribution. P = zeros(1.4). PN. PY.01*[10 15 20 20 15 10 10].PD) 18. mgd Enter gen fn COEFFICIENTS for gN gN Enter VALUES for Y Y Enter PROBABILITIES for Y PY Values are in row matrix D. Y. call for gD. D.9 (p.9614 P1 = ((100<D)&(D<=200))*PD' P1 = 0.5). probabilities are in PD.5. k = [25 50 75 100].ED^2 31. 477) gN = ibinom(10.3184 0. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N.0310 0.5144 .8 (p.8497 0. P To view the distribution.4000 (D.9725 0.6386 ((10<=D)&(D<=30))*PD' 0. gY = 0.1012 0.0. call for mD.0. PY = 0.5 40 42.
5 % 4.PD1. u. RANDOM SELECTION Solution to Exercise 15.5*3.PD1) . 478) gn = 0.4*250 VD1 = dot(D1.0000 % Check: ED1 = EnM*EY = 2.u.PD] = csort(t+u.t.PD1.PD2] = mgdf(gnG.1*ones(1.PD2.0000 % Check: ED2 = EnG*EY = 2.2147 P16 = (X>=16)*PX' P16 = 0. % Check: VD = VD1 + VD2 .11 (p.P).ED1^2 VD1 = 6.PY).4214e+05 P1g2 = total((t>u).486 CHAPTER 15.1968e+04 ED2 = dot(D2.0.Y.0175e+04 [D1.PD2) ED2 = 625.2250e+03 % (Continued next page) VD = dot(D.PD1) ED1 = 600.^2.PD) . [Y.ED^2 VD = 1.0.PD) ED = 1.10).2250e+03 eD = ED1 + ED2 % Check: ED = ED1 + ED2 eD = 1. [X.Y.D2.6.4612 k = [1500 1000 750].5 . ED = dot(D.10 (p. Y. and P [D.3).gy). [D1.4667 P10 = (X>=10)*PX' P10 = 0. Y = 200:20:300. Use array opertions on matrices X.*P) P1g2 = 0.16 = 4.PY).16 = 0.PD2).5*250 VD2 = dot(D2.5*3.25 Solution to Exercise 15.0803 % Check EX = En*Ey .4214e+05 vD = VD1 + VD2 vD = 1.PY] = gendf(gn. gy = (1/6)*[0 ones(1.ED2^2 VD2 = 8. [D2.PY).PX) . PY.^2. PY = 0. t.0:4). M = max(X) M = 38 EX = dot(X.^2.1875 Ppos = (X>0)*PX' Ppos = 0. PX. 478) gnM = ibinom(4. gnG = ibinom(5.5.EX^2 VX = 114.P] = icalcf(D1.D2.PD2) .PD1] = mgdf(gnM. ED1 = dot(D1.6)].0:5).01*[10 15 25 25 15 10].2500 VX = dot(X.PX] = csort(Y16. PDk = zeros(1.^2.PX) EX = 0.
0038 Solution to Exercise 15.0.3. P To view the distribution. call for gD. PD.14 (p.0:20).7326 0.8. gY = [0. PY.13 (p. Pall = (D==20)*PD' Pall = 2.0.6. PN.0:20). D. D.7822e14 pall = (0.7. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N. % Alternate: use D binomial (pp0) D = 0:20. call for gD.12 (p. PY. end disp(PDk) 0.0660 pD = ibinom(20. 478) n = 500. D.2556 0. D. P May use jcalc or jcalcf on N.487 for i = 1:3 PDk(i) = (D>=k(i))*PD'.7788 p15 = (D>=15)*pD' p15 = 0. Y.2 0.0:20). p0 = 0. 478) gN = ibinom(20. P To view the distribution. gend Do not forget zero coefficients for missing powers Enter gen fn COEFFICIENTS for gN gN Enter gen fn COEFFICIENTS for gY gY Results are in N. PD. gY = [0. p10 = (D>=10)*pD' p10 = 0. P May use jcalc or jcalcf on N.7822e14 P10 = (D >= 10)*PD' P10 = 0.0660 Solution to Exercise 15.7*0.3 0. P10 = (D>=10)*PD' P10 = 0. p = 0.7].7788 P15 = (D>=15)*PD' P15 = 0.8872 Solution to Exercise 15.8]. PN. 478) gN = ibinom(20.7)^20 % Alternate: use D binomial (pp0) pall = 2. Y. .75.0.3*0.
0379 215.0167 220. 478) m1a = 50*0. disp([k.5098 190. HP sales Poisson (30*0. 479) Mac sales Poisson (30*0.0776 210.0140 Solution to Exercise 15.0000 0.18 (p.8736 175.9892 160. GD ∼ Poisson (300*0.0008 235.0000 0.9964 155.0000 0.488 CHAPTER 15.0000 250. RANDOM SELECTION D = 0:500.17 (p. m2a = 45*0. P = zeros(1.15 (p. end disp(P) 0.3663 195. 478) JD ∼ Poisson (500*0.0067 225.25 = 125).PD]') 150.3).47.6532 185.0000 0. . for i = 1:3 P(i) = (D>=k(i))*PD'. Solution to Exercise 15.1.5).k). total Mac plus HP sales Poisson(50. PD = cpoisson(185.5).0000 0.0000 0.0000 0. ma = m1a + m2a.9362 170.0000 Solution to Exercise 15.9893 0.5173 0.0000 0. PD = ibinom(500.16 (p. k = [200 225 250].p*p0.0000 0.0000 0. k = 150:5:250. 479) X = 0:30.0000 0.0001 245.0024 230.9718 165.D).0000 0.2 + 65*0.7785 180.X).0000 0. PNa = cpoisson(ma.0002 240.9119 0.0000 0.33.[30 35 40]) PNa = 0. PX = ipoisson(80*0.6890 0.3 = 25. D∼ Poisson (185).0000 0. Y = 0:80.0000 0.0000 0.3722 Solution to Exercise 15.0000 0.4 + 65*0.0000 0.1435 205.2405 200.0000 0.20 = 60).0000 0.2 = 25).
icalc: X Y PX PY .19 (p.41) = 0.10) % Direct calculation pX10 = 0. 479) T ∼ Poisson (200*0.7500 Solution to Exercise 15.25. PYP = 0.20 (p.3000 VYT = dot(YT..16) ... PYT = [0.*P) PM = 0..PX10 = (X>=10)*PX' % Approximate calculation PX10 = 0.EYT^2 VYT = 0. PY. PX.6400 .75 b = 295 YT = [1 2]. 479) P1 = cpoisson(130*0. Y.EYP^2 VYP = 1.^2. EYP = dot(YP. PX = ipoisson(700*0.2834 M = t>=0.PYT) .25 = 85).5*u..3.2834 pX10 = cpoisson(8.4000 VYP = dot(YP.9819 Solution to Exercise 15.8 + 180*0. EYT = dot(YT..PYP) . PY = ipoisson(700*0.8785 M = 2. and P P1 = (2.75 = 295).5*X<=475)*PX' P1 = 0..Y).2 + 180*0. Y = 0:300.7 0..489 PY = ipoisson(80*0.1572 Solution to Exercise 15..3].20.1*[3 3 2 1 1]. u. t. PM = total(M. P ∼ Poisson (200*0. a = 85 b = 200*0.21 (p..PYP) EYP = 2.2100 YP = 1:5.2.Y).2407 P2 = cpoisson(26. PM = total(M.30) = 0.PYT) EYT = 1.X).*P) PM = 0.cpoisson(26. 479) X = 0:400.8 + 180*0.5*t<=(3*u + 50).^2. icalc Enter row matrix of Xvalues X Enter row matrix of Yvalues Y Enter X probabilities PX Enter Y probabilities PY Use array operations on matrices X.
2705e+03 NT = 0:180.490 CHAPTER 15. PX = ipoisson(120*0.EDT^2 VDT = 161. icalc Enter row matrix of X values X Enter row matrix of Y values Y Enter X probabilities PX Enter Y probabilities PY Use array opertions on matrices X. EDT = dot(DT.7s + 0.PDP] = gendf(gNP. PY.gYP). 480) Solution to Exercise 15.5000 VT = 85*(VYT + EYT^2) VT = 161. Y = 0:120. gYP = 0. [DT.5000 NP = 0:500.NP). 480) P = cpoisson(30*0.24 (p.85) (15. RANDOM SELECTION EDT = 85*EYT EDT = 110.*P) PM = 0.X).22 (p. u. Y.3s2 − 1 gDP (s) = exp 295 0.23 (p.2468 .4. % Requires too much memory gDT (s) = exp 85 0. gNP = ipoisson(295.^2. t.7+40*0.1 3s + 3s2 2s3 + s4 + s5 − 1 gD (s) = gDT (s) gDP (s) Solution to Exercise 15. PM = total(M.86) X = 0:120.Y).50) = 0.PDT] = gendf(gNT.1*[0 3 2 2 1 1].5000 VP = 295*(VYP + EYP^2) VP = 2183 VD = VT + VP VD = 2.PDT) .5000 EDP = 295*EYP EDP = 708. 480) (15.1*[0 7 3].35. PY = ipoisson(120*0.5000 VDT = dot(DT.6.0000 ED = EDT + EDP ED = 818.gYT). PX. gYT = 0.NT). and P M = t > u.PDT) EDT = 110. % Possible alternative gNT = ipoisson(85. [DP.7190 Solution to Exercise 15.
26 (p.6536 %alternate gY = [0.0.gN[P(Y>t)] P = 1(0.4 0.PD] = gendf(pN.8493 Solution to Exercise 15.6794 Solution to Exercise 15.6*0. with P(E) = 0.68 0.25 pN = ibinom(7.27 (p.7490 0. [D.Positive number of satisfactory bids. [D.6].6794 % Second solution . 480) Probability of a successful bid P Y = (125 − 100) /100 = 0.polyval(fliplr(gN). % Generator function for indicator [D. gY = [0. P = (D>0)*PD' P = 0. gN = 0.491 % First solution .25.9116 Solution to Exercise 15.4 + 0. P = (D>0)*PD' P = 0. 481) .28 (p.01*[5 15 15 20 15 10 10 5 5].01*[5 15 15 20 10 10 5 5 5 5 5]. PY = 1 .exp(0.PY) P = 0.6536 Solution to Exercise 15. PY = 0. the outcome is indicator for event E. % i.25 (p.4598 0.75)^7 P = 0.32].0:7).PY) PW = 0. 480) Use FW (t) = gN [P (Y ≤ T )] gN = 0. gY = [3/4 1/4].8989 0.gY). P = 1 .01*t).6800 PW = 1 .PD] = gendf(gN. t = 100:100:500.1330 0.5 + 0.29 (p. % D is number of successes Pa = (D>0)*PD' % D>0 means at least one successful bid Pa = 0.PY) % fliplr puts coeficients in % descending order of powers FW = 0.gY). 481) gN = 0. gN = 0.FY(t) = 1 .9615 Solution to Exercise 15.01*[0 5 7 10 11 12 13 12 11 10 9].01*[5 10 15 20 20 10 10 7 3].gY).(4/5)^2) PY = 0.5*(1 .25 PY =0.polyval(fliplr(gN). 480) Consider a sequence of N trials with probabiliy p = (180 − 150) /50 = 0. FW = polyval(fliplr(gN).PD] = gendf(gN.6.e.6.
1092 0.45 (15.0664 0.7^10 = Columns 1 through 7 0.exp(0.2503 0.1686 Columns 8 through 11 0.p) FW = 0.3612 0.1092 gN = 0.(0.0147 0 Pa = (0.05.0:10).PD] = gendf(gN. 481) 0. = 0. P = 1 .gY). = polyval(fliplr(gN).9718 0. RANDOM SELECTION gN = ibinom(10.0360 0.PD] = gendf(gN.3.t).5104 0.3*p)^10 P = 0.1686 Columns 8 through 11 0.05*p)^12 FW = 0.3*p).p) .7092 0. [D.0.7 + 0.1*(20 .2*ones(1. FW = polyval(fliplr(gN).7092 0.0:12).32 (p.95 + 0.0147 0 t p P P Solution to Exercise 15. gY = [p 1p]. % Alternate [D.8352 . gY = [p 1p]. p = 1 . [D.87) p = 0. PW = (D==0)*PD' PW = 0. 481) p = 1 .8352 gN = ibinom(10.7635 gY = [p 1p].0.exp(2).PD] = gendf(gN.0.2503 0.7635 Solution to Exercise 15.45.0360 0.5).^10 . 481) 5 P (Y ≤ 5) = 0.3.7 + 0.3612 0.gY).8410 gN = ibinom(12.7^10 % Alternate form of gN Pa = Columns 1 through 7 0.0.0:10).005 2 (37 − 2t) dt = 0. = 10:20.5104 0.9718 0.492 CHAPTER 15.8410 Solution to Exercise 15.30 (p.0.0025*500).31 (p. FW = (0.gY). % D is number of "successes" Pa = (D>0)*PD' Pa = 0. PW = (D==0)*PD' PW = 0.0664 0.
FW2 = exp(56*(1 .35 (p.1)) FW = 0. FW = exp(mu*t*(1 .3233 0.0138 .33 (p.0.34 (p. k = 5.[3 5]) PND10 = 0. p = 0.q^(k1) . 481) Nt ∼ Poisson (λt).1)) FW2 = 0.exp(t^2) . NDt ∼ Poisson (λpt). t = 2. t = 10. 481) FW (k) = gN [P (Y ≤ k)]P (Y ≤ k) − 1 − q k−1 Nt ∼ Poisson (12t) q = 1 . EW = 1/(lambda*p) EW = 5 PND10 = cpoisson(lambda*p*t. WDt exponential (λp).3586 Solution to Exercise 15.493 Solution to Exercise 15.0527 Solution to Exercise 15. t = 2. 481) N8 is Poisson (7*8 = 56) gN (s) = e56(s−1) .35. lambda = 4.05. mu = 12.
494 CHAPTER 15. RANDOM SELECTION .
s. As in the case of other concepts. B} is conditionally independent.Chapter 16 Conditional Independence. Denition. are equivalent. For example. Y } conditionally independent. given event C.1) . We note two kinds of equivalences. givenZ. and of the theory of Markov systems.2) holds for all reasonable M and N is (technically. In the unit on Markov Sequences (Section 16. 16. X. the following (CI1) 1 This E [IM (X) IN (Y ) Z] = E [IM (X) Z] E [IN (Y ) Z] a. Given a Random Vector1 In the unit on Conditional Independence (Section 5. the concept of conditional independence of events is examined and used to model a variety of common situations. the rst of these. N (16.5/>. i E [IA IB C] = P (ABC) = P (AC) P (BC) = E [IA C] E [IB C] If we let consider (16. We examine in this unit.2). It would the pair {X. Given a Random Vector 16. i for all Borel E [IM (X) IN (Y ) Z] = E [IM (X) Z] E [IN (Y ) Z] M. i the product rule E [IM (X) IN (Y ) C] = E [IM (X) C] E [IN (Y ) C] (16.3) Remark. In this unit. given an event.1) be reasonable to A = X −1 (M ) and B = Y −1 (N ). very briey. for all Borel sets M. or Z be real valued. This suggests a possible extension to conditional expectation. 495 . then IA = IM (X) and IB = IN (Y ).org/content/m23813/1. The pair {A. all Borel M and N ). For example. we investigate a more general concept of conditional independence.1 The concept The denition of conditional independence of events is based on a product rule which may be expressed in terms of conditional expectation. The pair {X. X and Y. if X M and N is a three dimensional random vector. This concept lies at the foundations of Bayesian statistics. then Since it is not necessary that are on the codomains for M is a subset of R3 . N content is available online at <http://cnx. given C. given a random vector.1. Y . we understand that the sets respectively. which we refer to by the numbers used in the table in Appendix G. Y } conditionally independent. We examine the following concept. based on the theory of conditional expectation.1 Conditional Independence. we provide an introduction to the third. designated {X. Y } ci Z . of many topics in decision theory. it is useful to identify some key properties.
Z)] = E [IN (Y ) IQ (Z) e2 (Y. 496) and (CI2) (p.7) are useful in a variety of contexts. p. 496) implies (CI5) (p. (CI1) (p. ("(CI9) ". Using (CE9) (p. Y ] Z} = E{IN (Y ) E [IM (X) Z.10) (16. with (CI1) (p. (CI2) (p. Y } ci Z implies that additional knowledge about Y does not modify that best estimate. Y ] and e2 (Y. Similarly. Using the dening property (CE1) (p. The properties (CI1) (p. Y ]} = E [IN (Y ) IQ (Z) IM (X)] On the other hand.11) Use of property (CE8) (p. 602) is just a convenient way of expressing the other conditions.6) (16. 496) is a special case of (CI5) 495). 496) implies (CI1) (p. 496) is equivalent to (CI6) E [g (X. (CE8) (p. and 426) to (CE6) monotone convergence in a manner similar to that used in extending properties (CE1) (p. 496). for all Borel sets M E [IM (X) IQ (Z) Z. Z) h (Y. Z) Z] = E [g (X.s. and (CE8) (p. Z) = E [IM (X) Z]. 428). Z) Y ] = E{E [g (X. use of (CE1) (p. and (CE1) (p. CONDITIONAL INDEPENDENCE. 495). given knowledge of Z. 496). Y ] Z} = E{IN (Y ) E [IM (X) Z] Z} = E [IM (X) Z] E [IN (Y ) Z] (CI1) (p. Z)] then by the uniqueness property (E5b) (list. 428). 496). we show the equivalence of (CI1) (p. property (CI4) (p. Z) Z] E [h (Y. p. for all Borel sets M. 428). GIVEN A RANDOM VECTOR (CI5) E [g (X. Z) a. 496) is equivalent to (16. Y ] = E [g (X. Z) Z] is the best meansquare estimator for g (X. Now just as interpretation of conditional independence: E [g (X.4) 600) for expectation we may assert e1 (Y.496 CHAPTER 16. (CE8) (p. for all Borel sets M. To show that (CI1) (p. The additional properties in Appendix G (Section 17.s. for all Borel functions g. 428) shows that (CI2) (p. Q E [IM (X) IQ (Z) Y ] = E{E [IM (X) IQ (Z) Z] Y } a. Z) Z] a. Y ] = E [IM (X) IQ (Z) Z] a. 426). p. (CI1) (p. (CI3) (p. 496). Set If we show e1 (Y. N E [IM (X) Z. Z) = (16. Y ] = E [IM (X) Z] a. for all Borel functions g 496). Q As an example of the kinds of argument needed to verify these equivalences.5) e2 (Y. . 426) for conditional expectation. 496) being the dening condition for (CI1) (CI2) (CI3) (CI4) {X. Z) Z] Y } a.7) (16. Z). This interpreProperty (CI6) (p. A second kind of equivalence involves various patterns. monotonicity. we need to use linearity. we have E{IN (Y ) IQ (Z) E [IM (X) Z. and (CI4) (p. • (CI1) (p. Z) Z. h Because the indicator functions are special Borel functions.8) • (CI2) (p. N. (p. 427) for conditional expectation.s. 496) are equivalent. 496). 496). We refer to them as needed. (CI2) (p. for all Borel E [IN (Y ) IQ (Z) e1 (Y.s. 496) extends to (CI5) (p. 426) yields E{IN (Y ) IQ (Z) E [IM (X) Z]} = E{IQ (Z) E [IN (Y ) E [IM (X) Z] Z]} = E{IQ (Z) E [IM (X) Z] E [IN (Y ) Z]} = E{IQ (Z) E [IM (X) IN (Y ) Z]} = E [IN (Y ) IQ (Z) IM (X)] which establishes the desired equality. Z) Z] a. 496). so also (CI3) (p. for all Borel sets M. The condition {X. (16. 496) and (CI3) (p. for all Borel functions g (CI8) E [g (X. we have E [IM (X) IN (Y ) Z] = E{E [IM (X) IN (Y ) Z. 496) are equivalent.s. 496) implies (CI2) (p. Z) = E [IM (X) Z. 496). 602) is an alternate way of expressing (CI6) (p. 428). Y } ci Z . Property (CI9) Property (CI7) ("(CI7) ". Q (16. particularly in establishing properties of Markov systems. E [IM (X) IN (Y ) Z] = E [IM (X) Z] E [IN (Y ) Z] a.9) (16.s.s. 496) provides an important tation is often the most useful as a modeling assumption. 495).s. (p.
we have a conditional distribution for X. The Bayesian This has no meaning. In this view.e. The fraction of items from a production line which meet specications. The value of the parameter (say µ in the discussion above) is a realization of a parameter random variable H. 2.1. we assume 1. To implement this point of view.497 16. given H : i. then the sampling process is equivalent to a sequence of Bernoulli trials. about which there is uncertainty.12) =E i=1 If I(−∞.13) X has conditional density. u in the range of H. the population distribution is conceived as randomly selected from a class of such distributions. consider Xi = IEi . Xn ) FW H (t1 . 4. · · · . that of We Population proportion We illustrate these ideas with one of the simplest. t2 . 1. 1. quantity about which there is uncertainty. This is based on previous experience. The parameter in this case is the proportion p who meet the criterion. The proportion of a population of voters who plan to vote for a certain candidate. For each 2. X2 ≤ t2 . mention only a few to indicate the importance. H is the parameter The Bayesian model If X is a random variable whose distribution is the population distribution and random variable. a fundamental problem is to obtain information about the population There is an inherent diculty with this Suppose it is desired to determine the population mean distribution from the distribution in a simple random sample. they may be considered components of a parameter random vector. with P (Ei H = u) = E [Xi H = u] = e (u) = u . Examples abound. H} have a joint distribution. given H. then the conditional distribution for Sn . approach makes a fundamental change of viewpoint. then a similar product rule holds. µ.ti ] (Xi ) H = u = i=1 FXH (ti u) (16. 2. given H = u. Now µ is an unknown quantity However. given H. given the value of H. · · · . One way of expressing this idea is to refer to a has been selected by nature from a class of distributions. since it is a constant. (16. To see this. statistical problems: determining the proportion of a population which has a particular characteristic.14) is binomial (n. classical approach to statistics. X2 . We assume a 3. The fraction of women between the ages eighteen and fty ve who hold full time jobs. given H = u. tn u) = P (X1 ≤ t1 . The population distribution Since the population mean is a by experiment. prior distribution for H. we cannot assign a probability such as P (a < µ ≤ b). {Xi : 1 ≤ i ≤ n} is conditionally and consider the joint conditional distribution function iid. · · · Xn ≤ tn H = u) n n n (16. Let W = (X1 . The population distribution is a conditional distribution. 3. u). The proportion of a given population which has a certain disease. The mean value is thus a random variable whose value is determined by this selection. If sampling is at random. We have a random sampling process.ti ] (Xi ) H = u = i=1 E I(−∞. but most important. If Sn is the number of successes in a sample of size H is the parameter random variable and n. If two or more parameters are sought (say the mean and variance)..2 The Bayesian approach to statistics In the approach. then {X. it is modeled as a random variable whose value is to be determined state of nature.
k) uk+1 (1 − u) C (n. GIVEN A RANDOM VECTOR u as in the ordinary Bernoulli case.2)). (r.18) A prior distribution for H The beta purpose. k) uk (1 − u) n−k and E [Sn H = u] = nu (16. Sampling gives We make a Bayesian reversal to get 2. s) distribution (see Appendix G (Section 17. and by proper choice of parameters r. proves to be a natural choice for this Its range is the unit interval. if The Bayesian reversal Since {Sn = k} is an event with positive probability. we must assume a prior distribution for any. To complete the task.7)).16) The objective We seek to determine the best meansquare estimate of 1. H on the basis of prior knowledge. E [HSn = k]. we know an exression for E [Sn H = u] = nu.498 CHAPTER 16. given an event.15) E I{k} (Si ) H = u = P (Sn = kH = u) = C (n.17) fH (u) du (16. Two steps must be taken: H = u. If H. we use the denition of the conditional expectation. . is the number of successes in If Anaysis is carried out for each xed n n Sn = i=1 we have the result Xi = i=1 IEi n component trials (16. given Sn = k. 426) to obtain E [HSn = k] = E{HE I{k} (Sn ) H } E HI{k} (Sn ) = = E I{k} (Sn ) E{E I{k} (Sn ) H } = C (n. s. CONDITIONAL INDEPENDENCE. Sn = k .1) and 2 (Figure 16. and the law of total probability (CE1b) (p. the density function can be given a variety of forms (see Figures 1 (Figure 16. k) uk (1 − u) n−k n−k uE I{k} (Sn ) H = u fH (u) du E I{k} (Sn ) H = u fH (u) du fH (u) du (16.
2. s = 1.s) density for r = 2. . 10.499 Figure 16.1: The Beta(r.
t] (H) .2: The Beta(r. so that determination Γ (r + s) r−1 s−1 s−1 t (1 − t) = A (r.19) (r. For r. s + n − k) . s) . shows E [H] = If the prior distribution for r r+s and Var [H] = rs (r + s) (r + s + 1) 1 k+r n+s−k−1 u (1 − u) du 0 1 k+r−1 n+s−k−1 u (1 − u) du 0 2 (16.s) density for r = 5. 5. CONDITIONAL INDEPENDENCE. we may complete the determination of E [HSn = k] as follows.22) du We may adapt the Γ (r + k + 1) Γ (n + s − k) Γ (r + s + n) k+r · = Γ (r + s + n + 1) Γ (r + k) Γ (n + s − k) n+r+s analysis above to show that H is conditionally beta (r + k. fH has a [0. s positive integers. (16. s = 2. FHS (tk) = E It (H) I{k} (Sn ) E I{k} (Sn ) where (16.20) is a polynomial of the distribution function is easy. using the integral formula above. straightforward integration. s ≥ 2. s). the density is given by fH (t) = For on r ≥ 2. Its analysis is based on the integrals 1 0 For ur−1 (1 − u) H∼ beta s−1 du = Γ (r) Γ (s) Γ (r + s) with Γ (a + 1) = aΓ (a) (16.21) H 1 0 is beta (r. In any case.500 CHAPTER 16. GIVEN A RANDOM VECTOR Figure 16. n−k r−1 E [HSn = k] = A (r. s) tr−1 (1 − t) 0<t<1 Γ (r) Γ (s) maximum at (r − 1) / (r + s − 2). s) = uk+1 (1 − u) uk (1 − u) u (1 − u) − u) s−1 du 1 0 n−k r−1 u (1 s−1 = (16. fH (16. 10. 1].24) It (H) = I[0.23) given Sn = k . s) A (r.
t FHS (tk) = Γ (r + s + n) Γ (r + k) Γ (n + s − k) uk+r−1 (1 − u) 0 n+s−k−1 du = 0 fHS (uk) du (16. Figure 16. we get In the integral H∼ (r. which corresponds to H∼ uniform on (0. s.e. 7). however.1 (Population proportion with a SOLUTION For above. H. Fourteen H ∼ beta (1. Example 16. t E [HSn = k]. what is S20 = 14? After the rst sample is taken. If there is no prior r = 1. the conditional distribution given size Assuming prior ignorance (i. we simply take H can be utilized to select suitable r. Conditional densities for repeated sampling. H the rst sample the parameters is conditionally beta . According the treatment (k + r. are r = s = 1. s = 1. s). with thirteen favorable responses. that n = 20 is taken. Make a new estimate of H. Any prior information on the distribution for information.3: beta prior). A sample of size respond in favor of the increase. a second sample of Analysis is made using the conditional n = 20 is taken. n + s − k). The density has a maximum at (r + k − 1) / (r + k + n + s − k − 2) = k/n.6818.1: Population proportion with a beta prior It is desired to estimate the portion of the student body which favors a proposed increase in the student blanket tax to fund the campus radio station. The conditional expecation. 1). is (r + k) / (r + s + n) = 15/22 ≈ 0.501 The analysis goes through exactly as for expression for the numerator. except that For H is replaced by beta It (H)..1)). The information in the sample serves to modify the distribution for that information. n + s − k) = (15. distribution for the rst sample as the prior for the second. conditional upon Example 16. The value is as likely to be in any subinterval of a given length as in any other of the same length.25) The integrand is the density for beta (r + k. one factor u is replaced by It (u).
the estimate is the conditional expectation of H. The conditonal densities in the two cases may be plotted with MATLAB (see Figure 1). The new conditional distribution has parameters r∗ = 15 + 13 = 28 ∗ and s = 20 + 7 − 13 = 14. H ∼ beta (15. we should expect more sharpening of the density about the new meansquare estimate. We consider. For the classical case. the maximum for the second is somewhat larger and occurs at a slightly smaller reecting the smaller k. t. This requires a Bayesian reversal of the conditional densities. plot(t. u) du = fW H (tu) fH (u) du (16. with the conditional distribution as the new prior. The essential prior information may be utilized.'k') As expected.502 CHAPTER 16. given Sn . in the latter. for the Bayesian case with beta prior. the Bayesian estimate seems preferable for small samples. The Bayesian estimate is often referred to as the small sample estimate.t). And the density in the second case shows less spread. although there is nothing The sampling procedure For small samples.beta(15. Suppose fW H (tu) and fH (u) are specied. k = 13. resulting from the fact that prior information from the rst sample is incorporated into the analysis of the second sample. this reversal process when all random variables The Bayesian reversal for a joint absolutely continuous pair {W.t.beta(28. and it has the advantage that upgrades the prior distribution. important. The best estimate of H is 28/ (28 + 14) = 2/3. In the treatment above.t).28) Since by the rule for determining the marginal density fW (t) = we have fW H (t. and are absolutely continuous. we utilize the fact that the conditioning random variable the pair Sn is discrete. the estimator for µ is the sample average. The density has a maximum at t = (28 − 1) / (28 + 14 − 2) = 27/40 = 0. In any event.30) . the dierence may be quite in the Bayesian procedure which calls for small samples. Now by denition fHW (ut) = fW H (t. To determine E [HW = t] = we need ufHW (ut) du (16. H} is jointly absolutely continuous. t = 0:0.7. 7). The same result is obtained if the two samples are combined into one sample of size 40.6750. If Sn = k : Classical estimate = k/n Bayesian estimate = (k + 1) / (n + 2) (16. name Bayesian comes from the role of The application of Bayesian analysis to the population proportion required Bayesian reversal in the case of discrete Sn . next.14. these do not dier signicantly. The idea of the Bayesian approach is the view that an unknown parameter about which there is uncertainty is modeled as the value of a random variable. CONDITIONAL INDEPENDENCE. u) fW (t) and fW H (t.01:1. GIVEN A RANDOM VECTOR For the second sample.29) fHW (ut) = fW H (tu) fH (u) fW H (tu) fH (u) du and E [HW = t] = ufW H (tu) fH (u) du fW H (tu) fH (u) du (16. case prior information is not utilized. It may be well to compare the result of Bayesian analysis with that for classical statistics. Bayesian reversal in the analysis.26) For large sample size n.'k'. and the prior n = 20. we make the comparison with the case of no prior knowledge ( r = s = 1). For the new sample. u) = fW H (tu) fH (u) (16. Since.27) fHW (ut).
" given the present. there is either a change to a new state or a renewal Each state is maintained unchanged during the of the state immediately before the transition.5/>.2: A Bayesian reversal Suppose H ∼ exponential sample of size n (λ) is taken.2. Markov sequences We wish to model a system characterized by a sequence of we call transition times. the move to the next state is characterized by a distribution. We thus have a sequence XN = {Xn : n ∈ N}. We view an observation of the system as a where N = {0. Determine the best Xi are conditionally iid. in the sense that the transition probabilities are dependent upon the current state (and perhaps the period number). · · · .2 Elements of Markov Sequences2 16. and t∗ = t1 + t2 + meansquare estimate of H. Xn ). X2 . t2 . . conditional transition probability At any transition time. t2 . · · · the system makes a transition from one 2 This content is available online at <http://cnx.org/content/m23824/1.31) E [HW = t] = ∞ n+1 −(λ+t∗ )u u e du 0 ∞ n −(λ+t∗ )u u e du 0 ufHW (ut) du = (λ + t∗ ) · n! n+1 ∞ n+1 −ut∗ u e λe−λu du 0 ∞ n −ut∗ u e λe−λu du 0 (16. t1 . we show that the fundamental Markov property is an expression of conditional independence of past and future. Each (16. We suppose the system is evolving in time. A W = (X1 . In this section.33) 16.1 Elements of Markov Sequences Markov sequences (Markov chains) are often studied at a very elementary level. given W = t.503 Example 16. X1 (ω) . At discrete instants of time state to the succeeding one (which may be the same).34) yields a sequence of states {X0 (ω) . For period i. We suppose that the system is memoryless. but not upon the manner in which that state was reached. or a trajectory. 1. exponential (u). utilizing algebraic tools such as matrix analysis. With the background laid. The essential ChapmanKolmogorov In the usual timehomogeneous case with nite state space. and the SOLUTION n fXi H (ti u) = ue−uti Hence so that fW H (tu) = i=1 ue−uti = un e−ut ∗ (16. Put · · · + tn . This should provide a probabilistic perspective for a more complete study of the algebraic analysis. equation is seen as a consequence of this conditional independence. t = (t1 . 2.32) = = (n + 1)! (λ + t∗ ) n+2 n+1 = (λ + t∗ ) n where t = i=1 ∗ ti (16. · · · . the ChapmanKolmogorov equation leads to the algebraic formulation that is widely studied at a variety of levels of mathematical sophistication. whose value is one of the members of a set E. the state is represented by a value of a random variable Xi . states taken on at discrete instants which At each transition time. we only sketch some of the more common results. · · · } which is referred to as a realization composite ω of the sequence. This is the The past inuences the future only through the present. or stage period between transitions. known as the state space. tn ). · · · } trial. We consider only a nite state space and identify the states by integers from 1 to M. given H = u. which we model in terms of conditional independence. Markov property.
16.2 Elements of Markov Sequences
(This content is available online at <http://cnx.org/content/m23824/1.5/>.)

16.2.1 Elements of Markov Sequences
Markov sequences (Markov chains) are often studied at a very elementary level, utilizing algebraic tools such as matrix analysis. In this section, we show that the fundamental Markov property is an expression of conditional independence of past and future, given the present. The essential Chapman-Kolmogorov equation is seen as a consequence of this conditional independence. In the usual time-homogeneous case with finite state space, the Chapman-Kolmogorov equation leads to the algebraic formulation that is widely studied at a variety of levels of mathematical sophistication. With the background laid, we only sketch some of the more common results. This should provide a probabilistic perspective for a more complete study of the algebraic analysis.

Markov sequences
We wish to model a system characterized by a sequence of states taken on at discrete instants which we call transition times. At each transition time, there is either a change to a new state or a renewal of the state immediately before the transition. Each state is maintained unchanged during the period, or stage, between transitions. At any transition time, the move to the next state is characterized by a conditional transition probability distribution. We suppose the system is memoryless, in the sense that the transition probabilities are dependent upon the current state (and perhaps the period number), but not upon the manner in which that state was reached. This is the Markov property: the past influences the future only through the present. This we model in terms of conditional independence.

For period i, the state is represented by a value of a random variable Xi, whose value is one of the members of a set E, known as the state space. We consider only a finite state space and identify the states by integers from 1 to M. We thus have a sequence

XN = {Xn : n ∈ N}, where N = {0, 1, 2, ···}   (16.34)

We view an observation of the system as a composite trial. Each ω yields a sequence of states {X0 (ω), X1 (ω), X2 (ω), ···}, which is referred to as a realization of the sequence, or a trajectory. We suppose the system is evolving in time. At discrete instants of time t1, t2, ··· the system makes a transition from one state to the succeeding one (which may be the same).

Initial period:   n = 0    t ∈ [0, t1)        state is X0 (ω)
Period one:       n = 1    t ∈ [t1, t2)       at t1 the transition is to X1 (ω)
Period two:       n = 2    t ∈ [t2, t3)       at t2 the transition is to X2 (ω)
···
Period k:         n = k    t ∈ [tk, tk+1)     at tk the transition is to Xk (ω)
Period k + 1:     n = k+1  t ∈ [tk+1, tk+2)   at tk+1 the move is to Xk+1 (ω)
···

Table 16.1

The parameter n indicates the period t ∈ [tn, tn+1). If the periods are of unit length, then tn = n. At transition time tn+1 there is a transition from the state Xn (ω) to the state Xn+1 (ω) for the next period.

To simplify writing, we adopt the following convention:

Un = (X0, X1, ···, Xn),   U^n = (Xn, Xn+1, ···),   Um,n = (Xm, ···, Xn)   (16.35)

The random vector Un is called the past at n of the sequence XN, and U^n is the future at n. In order to capture the notion that the system is without memory, so that the future is affected by the present but not by the manner in which the present is reached, we utilize the notion of conditional independence, given a random vector, in the following definition.

Definition. The sequence XN is Markov iff

(M) {Xn+1, Un} ci Xn  for all n ≥ 0   (16.36)

The state in the next period is conditioned by the past only through the present state, and not by how the present is reached. The statistics of the process are determined by the initial state probabilities and the transition probabilities

P(Xn+1 = k|Xn = j)  ∀ j, k ∈ E, n ≥ 0   (16.37)

Several conditions equivalent to the Markov condition (M) may be obtained with the aid of properties of conditional independence. We note first that (M) is equivalent to

P(Xn+1 = k|Xn = j, Un-1 ∈ Q) = P(Xn+1 = k|Xn = j)  for all n, j, k ∈ E, and Q ⊂ E^n   (16.38)

We verify this below. The following examples exhibit a pattern which implies the Markov condition and which can be exploited to obtain the transition probabilities.

Example 16.3: One-dimensional random walk
An object starts at a given initial position. At discrete instants t1, t2, ··· the object moves a random distance along a line. The various moves are independent of each other. Let Y0 be the initial position, and let Yk be the amount the object moves at time t = tk, with {Yk : 1 ≤ k} iid. Then the position after n moves is

Xn = Σ_{k=0}^{n} Yk,  so that  Xn+1 = g(Xn, Yn+1) = Xn + Yn+1

Since the position after the transition at tn+1 is affected by the past only through the value of the position Xn, and not by the sequence of positions which led to this position, it is reasonable to suppose that the process XN is Markov.
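A trajectory of the walk is easy to generate, since the defining relation is a cumulative sum of the driving variables. The following is a minimal sketch in base MATLAB; the particular move distribution on {-1, 0, 1} is an illustrative assumption, not part of the example.

N = 100;                          % number of moves
Y0 = 0;                           % initial position (illustrative)
moves = [-1 0 1];                 % possible moves (illustrative)
PY = [0.3 0.4 0.3];               % their probabilities (illustrative)
edges = cumsum(PY);               % breakpoints for the quantile transform
U = rand(1,N);                    % iid uniform samples
Y = zeros(1,N);
for k = 1:N
   Y(k) = moves(sum(U(k) > edges) + 1);   % quantile transform of U(k)
end
X = Y0 + cumsum(Y);               % positions X1, ..., XN
plot(0:N,[Y0 X])                  % one realization (trajectory) of the walk

The selection of each Yk by transforming a uniform sample anticipates the general simulation strategy described at the end of this section.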
Example 16.4: A class of branching processes
Each member of a population is able to reproduce. For simplicity, we suppose that at certain discrete instants the entire next generation is produced. Some mechanism limits each generation to a maximum population of M members. Let Zin be the number propagated by the ith member of the nth generation, with Zin = 0 indicating death and no offspring, and Zin = k indicating a net of k propagated by the ith member (either death and k offspring, or survival and k - 1 offspring). The population in generation n + 1 is given by

Xn+1 = min{M, Σ_{i=1}^{Xn} Zin}   (16.39)

We suppose the class {Zin : 1 ≤ i ≤ M, 0 ≤ n} is iid. Let Yn+1 = (Z1n, Z2n, ···, ZMn). Then {Yn+1, Un} is independent for each n ≥ 0. It seems reasonable to suppose the sequence XN is Markov.

Example 16.5: An inventory problem
A certain item is stocked according to an (m, M) inventory policy, as follows:
• If stock at the end of a period is less than m, order up to M.
• If stock at the end of a period is m or greater, do not order.
Let X0 be the initial stock, Xn be the stock at the end of the nth period (before restocking), and Dn be the demand during the nth period. Then for n ≥ 0,

Xn+1 = { max{M - Dn+1, 0}  if 0 ≤ Xn < m;   max{Xn - Dn+1, 0}  if m ≤ Xn }  = g(Xn, Dn+1)   (16.40)

If we suppose {Dn : 1 ≤ n} is iid, then {Dn+1, Un} is independent for each n ≥ 0, and the Markov condition seems to be indicated.

Remark. In this case, the actual transition takes place throughout the period. However, for purposes of analysis, we examine the state only at the end of the period (before restocking). Thus, the transitions are dispersed in time, but the observations are at discrete instants.

Example 16.6: Remaining lifetime
A piece of equipment has a lifetime which is an integral number of units of time. When a unit fails, it is replaced immediately with another unit of the same type. Suppose Xn is the remaining lifetime of the unit in service at time n, and Yn+1 is the lifetime of the unit installed at time n, with {Yn : 1 ≤ n} iid. Then

Xn+1 = { Xn - 1  if Xn ≥ 1;   Yn+1 - 1  if Xn = 0 }  = g(Xn, Yn+1)   (16.41)

Remark. Each of these four examples exhibits the pattern: {X0, Yn : 1 ≤ n} is independent, and Xn+1 = gn+1 (Xn, Yn+1) for all n ≥ 0. We now verify the Markov condition and obtain a method for determining the transition probabilities; first, the sketch below shows the pattern in action for the remaining-lifetime example.
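The fragment below simulates the remaining-lifetime recursion of Example 16.6 in base MATLAB. The particular lifetime distribution (uniform on {1, 2, 3, 4}) and the initial condition are assumptions made only for illustration; any iid lifetime sampler could be substituted.

N = 50;                           % number of periods to simulate
X = zeros(1,N+1);                 % X(n+1) holds the state Xn
X(1) = ceil(4*rand) - 1;          % X0: remaining life of the unit in service (illustrative)
Y = ceil(4*rand(1,N));            % iid replacement lifetimes, uniform on {1,2,3,4}
for n = 1:N
   if X(n) >= 1
      X(n+1) = X(n) - 1;          % unit survives; remaining life decreases by one
   else
      X(n+1) = Y(n) - 1;          % unit has failed; a new unit is installed
   end
end
stairs(0:N,X)                     % one trajectory of the chain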
A pattern yielding Markov sequences
Suppose {Yn : 0 ≤ n} is independent (call these the driving random variables). Set

X0 = g0 (Y0)   and   Xn+1 = gn+1 (Xn, Yn+1)  for all n ≥ 0   (16.42)

Then
a. XN is Markov
b. P(Xn+1 ∈ Q|Xn = u) = E{IQ [gn+1 (u, Yn+1)]} = P[gn+1 (u, Yn+1) ∈ Q] a.s. [PXn], for all Borel sets Q   (16.43)

VERIFICATION
a. Since Xn+1 = gn+1 (Xn, Yn+1) and Un = hn (Y0, Y1, ···, Yn), the pair {Yn+1, Un} is independent. By property (CI13) ("(CI13)", p. 603), with X = Yn+1, Y = Xn, and Z = Un-1, we have {Yn+1, Un-1} ci Xn. Since Xn+1 = gn+1 (Yn+1, Xn) and Un = hn (Xn, Un-1), property (CI9) ("(CI9)", p. 602) ensures {Xn+1, Un} ci Xn ∀ n ≥ 0, which is the Markov property.
b. For any Borel set Q,

P(Xn+1 ∈ Q|Xn = u) = E{IQ [gn+1 (Xn, Yn+1)]|Xn = u} a.s.   (16.44)
 = E{IQ [gn+1 (u, Yn+1)]} a.s. [PX] by (CE10b) ("(CE10)", p. 601)
 = P[gn+1 (u, Yn+1) ∈ Q] by (E1a)

Homogeneous Markov sequences
In this regard, we note the following special case of the proposition above. If {Yn : 1 ≤ n} is iid and gn+1 = g for all n, so that g is invariant with n, then

P(Xn+1 ∈ Q|Xn = u) = P[g(u, Yn+1) ∈ Q], invariant with n   (16.45)

Since the function g is the same for all n and the driving random variables Yi form an iid class, the transition probabilities are invariant with n, and the sequence must be homogeneous in the following sense. This case is important enough to warrant separate classification.

Definition. If P(Xn+1 ∈ Q|Xn = u) is invariant with n, for all u ∈ E and any Borel set Q, the Markov process XN is said to be homogeneous.

As a matter of fact, this is the only case usually treated in elementary texts. In the homogeneous case, we write

P(Xn+1 = j|Xn = i) = p(i, j)  or  pij  (invariant with n)   (16.46)

These are called the (one-step) transition probabilities. The transition probabilities may be arranged in a matrix

P = [p(i, j)]   (16.47)

called the transition probability matrix, usually referred to as the transition matrix. The element p(i, j) on row i and column j is the probability P(Xn+1 = j|Xn = i). Thus, the elements on the ith row constitute the conditional distribution for Xn+1, given Xn = i. The transition matrix therefore has the property that each row sums to one. Such a matrix is called a stochastic matrix.

The application of this proposition to the previous examples shows that each is Markov and, since in each case the driving class is iid with g invariant with n, each sequence is homogeneous. We return to the examples and utilize part (b) of the proposition to obtain the one-step transition probabilities.

Example 16.7: Random walk continued
Here g(u, Yn+1) = u + Yn+1, invariant with n. Since {Yn : 1 ≤ n} is iid,

P(Xn+1 = k|Xn = j) = P(j + Y = k) = P(Y = k - j) = pk-j, where pk = P(Y = k)   (16.48)
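For a finite state space, part (b) translates directly into a computation: row i of the transition matrix is the distribution of g(ui, Y). The sketch below implements this recipe by enumerating the range of Y. The choice of g (the remaining-lifetime map of Example 16.6) and the distribution of Y (uniform on {1, ..., 4}) are illustrative assumptions; the m-procedures used later (such as inventory1) carry out essentially this computation for their particular g.

VY = 1:4;  PY = 0.25*ones(1,4);       % range and distribution of Y (illustrative)
states = 0:3;                         % corresponding state space E = {0,1,2,3}
g = @(u,y) (u >= 1).*(u - 1) + (u == 0).*(y - 1);   % map from Example 16.6
ms = length(states);
P = zeros(ms);
for i = 1:ms
   T = g(states(i)*ones(1,length(VY)),VY);    % next states over the range of Y
   for j = 1:ms
      P(i,j) = sum(PY(T == states(j)));       % P(g(u_i, Y) = u_j)
   end
end
P                                     % each row is a distribution; rows sum to one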
Example 16.8: Branching process continued
Here

g(j, Yn+1) = min{M, Σ_{i=1}^{j} Zin}  and  E = {0, 1, ···, M}   (16.49)

If {Zin : 1 ≤ i ≤ M, 1 ≤ n} is iid, then setting Wjn = Σ_{i=1}^{j} Zin ensures {Wjn : 1 ≤ n} is iid for each j ∈ E. We thus have

P(Xn+1 = k|Xn = j) = { P(Wjn = k)  for 0 ≤ k < M;   P(Wjn ≥ M)  for k = M },  0 ≤ j ≤ M   (16.50)

With the aid of moment generating functions, one may determine the distributions for

W1 = Z1,  W2 = Z1 + Z2,  ···,  WM = Z1 + ··· + ZM   (16.51)

These calculations are implemented in an m-procedure called branchp. We simply need the distribution for the iid Zin.

% file branchp.m
% Calculates transition matrix for a simple branching
% process with specified maximum population.
disp('Do not forget zero probabilities for missing values of Z')
PZ = input('Enter PROBABILITIES for individuals ');
M  = input('Enter maximum allowable population ');
mz = length(PZ) - 1;
EZ = dot(0:mz,PZ);
disp(['The average individual propagation is ',num2str(EZ)])
P = zeros(M+1,M+1);
Z = zeros(M,M*mz+1);
k = 0:M*mz;
a = min(M,k);
z = 1;
P(1,1) = 1;
for i = 1:M                  % Operation similar to genD
   z = conv(PZ,z);
   Z(i,1:i*mz+1) = z;
   [t,p] = csort(a,Z(i,:));
   P(i+1,:) = p;
end
disp('The transition matrix is P')
disp('To study the evolution of the process, call for branchdbn')

PZ = 0.01*[15 45 25 10 5];   % Probability distribution for individuals
branchp                      % Call for procedure
Do not forget zero probabilities for missing values of Z
Enter PROBABILITIES for individuals  PZ
Enter maximum allowable population  10
The average individual propagation is 1.45
The transition matrix is P
To study the evolution of the process, call for branchdbn
disp(P)     % Optional display of the generated P
[The 11 by 11 transition matrix P, for population values 0 through 10, is displayed here; the individual entries are not reproduced.]

Note that p(0, 0) = 1. If the population ever reaches zero, it is extinct and no more births can occur. Also, if the maximum population (10 in this case) is reached, there is a high probability of returning to that value and a very small probability of becoming extinct (reaching the zero state).
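The key step in branchp is the repeated convolution z = conv(PZ,z), which produces the distributions of the partial sums Wj of (16.51): the distribution of a sum of independent simple random variables is the convolution of their distributions. A minimal standalone illustration for two members, using the propagation distribution entered above:

PZ = 0.01*[15 45 25 10 5];     % P(Z = 0), ..., P(Z = 4), as above
PW2 = conv(PZ,PZ);             % distribution of W2 = Z1 + Z2, on {0,...,8}
disp([0:8; PW2])               % values of W2 with their probabilities

Row j + 1 of the transition matrix is then obtained from the distribution of Wj by lumping the probability for all values M or greater onto the value M, which is what the csort call with a = min(M,k) accomplishes.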
Example 16.9: Inventory problem (continued)
In this case,

g(j, Dn+1) = { max{M - Dn+1, 0}  for 0 ≤ j < m;   max{j - Dn+1, 0}  for m ≤ j ≤ M }   (16.52)

Numerical example. To simplify writing, write D for Dn+1 and take

m = 1, M = 3, with Dn Poisson (1)   (16.53)

Because of the invariance with n, set

p(j, k) = P(Xn+1 = k|Xn = j) = P[g(j, D) = k]   (16.54)

The various cases yield
g(0, D) = max{3 - D, 0}:
  g(0, D) = 0 iff D ≥ 3, which implies p(0, 0) = P(D ≥ 3)
  g(0, D) = 1 iff D = 2, which implies p(0, 1) = P(D = 2)
  g(0, D) = 2 iff D = 1, which implies p(0, 2) = P(D = 1)
  g(0, D) = 3 iff D = 0, which implies p(0, 3) = P(D = 0)
g(1, D) = max{1 - D, 0}:
  g(1, D) = 0 iff D ≥ 1, which implies p(1, 0) = P(D ≥ 1)
  g(1, D) = 1 iff D = 0, which implies p(1, 1) = P(D = 0)
g(2, D) = max{2 - D, 0}:
  g(2, D) = 0 iff D ≥ 2, which implies p(2, 0) = P(D ≥ 2)
  g(2, D) = 1 iff D = 1, which implies p(2, 1) = P(D = 1)
  g(2, D) = 2 iff D = 0, which implies p(2, 2) = P(D = 0)
  g(2, D) = 3 is impossible, so that p(2, 3) = 0
g(3, D) = max{3 - D, 0} = g(0, D)

The various probabilities for D may be obtained from a table (or may be calculated easily with cpoisson) to give the transition probability matrix

P = [0.0803  0.1839  0.3679  0.3679
     0.6321  0.3679  0       0
     0.2642  0.3679  0.3679  0
     0.0803  0.1839  0.3679  0.3679]   (16.55)

The calculations are carried out by hand in this case, to exhibit the nature of the calculations. This is a standard problem in inventory theory, involving costs and rewards. An m-procedure inventory1 has been written to implement the function g.

% file inventory1.m   % Version of 1/27/97
% Data for transition probability calculations
% for (m,M) inventory policy
M  = input('Enter value M of maximum stock ');      % Maximum stock
m  = input('Enter value m of reorder point ');      % Reorder point
Y  = input('Enter row vector of demand values ');   % Truncated set of demand values
PY = input('Enter demand probabilities ');          % Demand probabilities
states = 0:M;
ms = length(states);
my = length(Y);
% Calculations for determining P
[y,s] = meshgrid(Y,states);
T = max(0,M-y).*(s < m) + max(0,s-y).*(s >= m);
P = zeros(ms,ms);
for i = 1:ms
   [a,b] = meshgrid(T(i,:),states);
   P(i,:) = PY*(a==b)';
end
P

We consider the case M = 5, the reorder point m = 3, and demand Poisson (3). We approximate the Poisson distribution with values up to 20.
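The call ipoisson(3,0:20) in the run below uses one of the textbook's m-functions to supply the row of individual Poisson (3) probabilities P(D = k), k = 0, ···, 20. If that function is not at hand, the same vector may be computed in base MATLAB (a sketch; the truncation at 20 matches the choice above):

k = 0:20;
PY = exp(-3)*3.^k./factorial(k);   % P(D = k) = e^(-3) 3^k / k!
sum(PY)                            % nearly 1; the truncation error is negligible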
inventory1
Enter value M of maximum stock  5
Enter value m of reorder point  3
Enter row vector of demand values  0:20
Enter demand probabilities  ipoisson(3,0:20)
P =
    0.1847    0.1680    0.2240    0.2240    0.1494    0.0498
    0.1847    0.1680    0.2240    0.2240    0.1494    0.0498
    0.1847    0.1680    0.2240    0.2240    0.1494    0.0498
    0.5768    0.2240    0.1494    0.0498         0         0
    0.3528    0.2240    0.2240    0.1494    0.0498         0
    0.1847    0.1680    0.2240    0.2240    0.1494    0.0498

Example 16.10: Remaining lifetime (continued)
Here g(0, Y) = Y - 1, so that p(0, k) = P(Y - 1 = k) = P(Y = k + 1); and g(j, Y) = j - 1 for j ≥ 1, so that p(j, k) = δ_{j-1,k} for j ≥ 1. The resulting transition probability matrix is

P = [p1  p2  p3  ···
     1   0   0   ···
     0   1   0   ···
     ··· ··· ··· ···]   (16.56)

The matrix is an infinite matrix, unless Y is simple. If the range of Y is {1, 2, ···, M}, then the state space E is {0, 1, ···, M - 1}.

The Chapman-Kolmogorov equation and the transition matrix
The Markov property extends considerably. The immediate future Xn+1 may be replaced by any finite future Un,n+p, and the present Xn may be replaced by any extended present Um,n. Some results of abstract measure theory show that the finite future Un,n+p may be replaced by the entire future U^n. Various properties of conditional independence, particularly (CI9) ("(CI9)", p. 602), (CI10) ("(CI10)", p. 602), and (CI12) ("(CI12)", p. 603), may be used to establish the following.

Extended Markov property. XN is Markov iff

(M*) {U^n, Um} ci Um,n  ∀ 0 ≤ m ≤ n   (16.57)

As a special case of the extended Markov property, we have

{U^{n+k}, Un} ci Xn+k  for all n ≥ 0, k ≥ 1   (16.58)

Setting

g(U^{n+k}) = Xn+k+m  and  h(Un, Xn+k) = Xn   (16.59)

and using the iterated conditioning rule (CI9) for conditional independence, we get

(CK) E[g(Xn+k+m)|Xn] = E{E[g(Xn+k+m)|Xn+k]|Xn}  ∀ n ≥ 0, k, m ≥ 1   (16.60)

This is the Chapman-Kolmogorov equation, which plays a central role in the study of Markov sequences. For a discrete state space E, with

pm,n (i, j) = P(Xn = j|Xm = i)   (16.61)

this equation takes the form

(CK') pm,q (i, k) = Σ_{j∈E} pm,n (i, j) pn,q (j, k),  0 ≤ m < n < q   (16.62)

To see that this is so, consider

P(Xq = k|Xm = i) = E[I{k} (Xq)|Xm = i] = E{E[I{k} (Xq)|Xn]|Xm = i}   (16.63)
 = Σ_j E[I{k} (Xq)|Xn = j] pm,n (i, j) = Σ_j pn,q (j, k) pm,n (i, j)   (16.64)

Homogeneous case
For this case, we may put (CK') in a useful matrix form. The conditional probabilities

pm (i, k) = P(Xn+m = k|Xn = i), invariant in n   (16.65)

are known as the m-step transition probabilities. The Chapman-Kolmogorov equation in this case becomes

(CK'') pm+n (i, k) = Σ_{j∈E} pm (i, j) pn (j, k)  ∀ i, k ∈ E   (16.66)

In terms of the m-step transition matrix P(m) = [pm (i, k)], this set of sums is equivalent to the matrix product

P(m+n) = P(m) P(n)   (16.67)

Now

P(2) = P(1) P(1) = PP = P^2,  P(3) = P(2) P(1) = P^3,  etc.   (16.68)

A simple inductive argument based on (CK'') establishes

The product rule for transition matrices. The m-step probability matrix P(m) = P^m, the mth power of the transition matrix P.

Example 16.11: The inventory problem (continued)
For the inventory problem in Example 16.9 (Inventory problem (continued)), the three-step transition probability matrix P(3) is obtained by raising P to the third power to get

P(3) = P^3 = [0.2930  0.2917  0.2629  0.1524
              0.2619  0.2730  0.2753  0.1898
              0.2993  0.2854  0.2504  0.1649
              0.2930  0.2917  0.2629  0.1524]   (16.69)
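The product rule is easily checked numerically. With the four-state inventory matrix entered from its rounded values, a sketch:

P = [0.0803 0.1839 0.3679 0.3679
     0.6321 0.3679 0      0
     0.2642 0.3679 0.3679 0
     0.0803 0.1839 0.3679 0.3679];
P3 = P^3                        % reproduces the three-step matrix above
max(max(abs(P^5 - P^2*P^3)))    % P(5) = P(2)P(3); zero to round-off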
We consider next the state probabilities for the various stages. To simplify writing, we consider a finite state space E = {1, ···, M}. We use π(n) for the row matrix

π(n) = [p1 (n) p2 (n) ··· pM (n)], where pk (n) = P(Xn = k) for each k ∈ E   (16.70)

As a consequence of the product rule, we have the following.

Probability distributions for any period. For a homogeneous Markov sequence, the distribution for any Xn is determined by the initial distribution (i.e., the distribution for X0) and the transition probability matrix P.

VERIFICATION
Suppose the homogeneous sequence XN has finite state space E = {1, 2, ···, M}. For any n ≥ 0, let pj (n) = P(Xn = j) for each j ∈ E, and put π(n) = [p1 (n) p2 (n) ··· pM (n)]. Then

π(0) = the initial probability distribution
π(1) = π(0) P
···
π(n) = π(n - 1) P = π(0) P(n) = π(0) P^n = the nth-period distribution   (16.71)

The last expression is an immediate consequence of the product rule.

Example 16.12: Inventory problem (continued)
In the inventory system of Examples 16.5, 16.9, and 16.11, suppose the initial stock is M = 3. This means that

π(0) = [0 0 0 1]   (16.72)

The product of π(0) and P^3 is the fourth row of P^3, so that the distribution for X3 is

π(3) = [p0 (3) p1 (3) p2 (3) p3 (3)] = [0.2930 0.2917 0.2629 0.1524]   (16.73)

Thus, given a stock of M = 3 at startup, the probability is 0.2917 that X3 = 1. This is the probability of one unit in stock at the end of period number three.

Remarks
• A similar treatment shows that for the nonhomogeneous case the distribution at any stage is determined by the initial distribution and the class of one-step transition matrices. In the nonhomogeneous case, the transition probabilities pn,n+1 (i, j) depend on the stage n.
• A discrete-parameter Markov process, or Markov sequence, is characterized by the fact that each member Xn+1 of the sequence is conditioned by the value of the previous member of the sequence. This one-step stochastic linkage has made it customary to refer to a Markov sequence as a Markov chain. In the discrete-parameter Markov case, we use the terms process, sequence, or chain interchangeably.

The transition diagram and the transition matrix
The previous examples suggest that a Markov chain is a dynamic system, evolving in time. On the other hand, the time-invariant transition matrix may convey a static impression of the system. However, a simple geometric representation, known as the transition diagram, makes it possible to link the unchanging structure, represented by the transition matrix, with the dynamics of the evolving system behavior.

Definition. A transition diagram for a homogeneous Markov chain is a linear graph with one node for each state and one directed edge for each possible one-step transition between states (nodes).

We ignore, as essentially impossible, any transition which has zero transition probability. Thus, the edges on the diagram correspond to positive one-step transition probabilities between the nodes connected. Since for some pair (i, j) of states we may have p(i, j) > 0 but p(j, i) = 0, we may have a connecting edge between two nodes in one direction, but none in the other. The system can be viewed as an object jumping from state to state (node to node) at the successive transition times. As we follow the trajectory of this object, we achieve a sense of the evolution of the system.
Example 16.13: Transition diagram for inventory example
Consider, again, the transition matrix P for the inventory problem (rounded to three decimals):

P = [0.080  0.184  0.368  0.368
     0.632  0.368  0      0
     0.264  0.368  0.368  0
     0.080  0.184  0.368  0.368]   (16.74)

Figure 16.4 shows the transition diagram for this system. At each node the corresponding state value is shown. In this example, the state value is one less than the state number. For convenience, we refer to the node for state k + 1, which has state value k, as node k. If the state value is zero, there are four possibilities: remain in that condition with probability 0.080, move to node 1 with probability 0.184, move to node 2 with probability 0.368, or move to node 3 with probability 0.368. These are represented by the self loop and a directed edge to each of the nodes representing the other states. Each of these directed edges is marked with the (conditional) transition probability. On the other hand, the probability of reaching state value 0 from each of the others is represented by directed edges into the node for state value 0. A similar situation holds for each other node. Note that the probabilities on edges leaving a node (including a self loop) must total one, since these correspond to the transition probability distribution from that node. There is no directed edge from node 2 to node 3, since the probability of a transition from value 2 to value 3 is zero. Similarly, there is no directed edge from node 1 to either node 2 or node 3.

Figure 16.4: Transition diagram for the inventory system of Example 16.13 (Transition diagram for inventory example). [The diagram itself is not reproduced here.]
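The construction of the diagram from the matrix may be mechanized: the directed edges are exactly the positions of the positive entries of P. A sketch, assuming the four-state inventory matrix P of (16.74) is in the workspace (as entered, for instance, in the fragment following Example 16.11); the subtraction of one converts state numbers to state values:

[i,j] = find(P > 0);     % state-number pairs with positive transition probability
edges = [i-1 j-1]        % directed edges of the diagram, by state value
sum(P,2)                 % probabilities leaving each node; each total is one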
There is a one-one relation between the transition diagram and the transition matrix P. The transition diagram not only aids in visualizing the dynamic evolution of a chain, but also displays certain structural properties. Often a chain may be decomposed usefully into subchains. Questions of communication and recurrence may be answered in terms of the transition diagram. Some subsets of states are essentially closed, in the sense that if the system arrives at any one state in the subset it can never reach a state outside the subset. Periodicities can sometimes be seen, although it is usually easier to use the diagram to show that periodicities cannot occur.

Classification of states
Many important characteristics of a Markov chain can be studied by considering the number of visits to an arbitrarily chosen, but fixed, state.

Definition. For a fixed state j, let T1 be the time (stage number) of the first visit to state j (after the initial period). For states i and j, let

Fk (i, j) = P(T1 = k|X0 = i), the probability of reaching state j for the first time from state i in k steps, and
F(i, j) = P(T1 < ∞|X0 = i) = Σ_{k=1}^{∞} Fk (i, j), the probability of ever reaching state j from state i   (16.75)

A number of important theorems may be developed for Fk and F, although we do not develop them in this treatment. We simply quote them as needed. An important classification of states is made in terms of F.

Definition. State j is said to be transient iff F(j, j) < 1, and is said to be recurrent iff F(j, j) = 1.

Remark. If the state space is infinite, recurrent states fall into one of two subclasses: positive or null. Only the positive case is common in the infinite case, and that is the only possible case for systems with finite state space. Unless stated otherwise, we limit consideration to the positive case.

Sometimes there is a regularity in the structure of a Markov sequence that results in periodicities.

Definition. For state j, let δ be the greatest common divisor of {n : pn (j, j) > 0}. If δ > 1, then state j is periodic with period δ; otherwise, state j is aperiodic.

Usually, if there are any self loops in the transition diagram (positive probabilities on the diagonal of the transition matrix P), the system is aperiodic. Unless stated otherwise, we limit consideration to the aperiodic case.

Definition. A state j is called ergodic iff it is positive, recurrent, and aperiodic. It is called absorbing iff p(j, j) = 1.

A recurrent state is one to which the system eventually returns, hence is visited an infinity of times. If it is absorbing, then once it is reached, it returns each step (i.e., it never leaves).

An arrow notation is used to indicate important relations between states.

Definition. We say state i reaches j, denoted i → j, iff pn (i, j) > 0 for some n > 0. States i and j communicate, denoted i ↔ j, iff both i reaches j and j reaches i. By including "j reaches j" in all cases, the relation ↔ is an equivalence relation (i.e., reflexive, symmetric, and transitive). With this relationship, we can define important classes.

Definition. A class of states is communicating iff every state in the class may be reached from every other state in the class (i.e., every pair communicates). A class is closed iff no state outside the class can be reached from within the class.

The following important conditions are intuitive and may be established rigorously:
• i ↔ j and i recurrent implies j is recurrent
• i → j and i recurrent implies i ↔ j
• i → j and i recurrent implies j is recurrent

Limit theorems for finite state space sequences
The following propositions may be established for Markov sequences with finite state space:
• There are no null states, and not all states are transient.
• If a class of states is irreducible (i.e., has no proper closed subsets), then
  - All states are recurrent
  - All states are aperiodic, or all are periodic with the same period
• If a class C is closed, irreducible, and i is a transient state (necessarily not in C), then F(i, j) = F(i, k) for all j, k ∈ C.

A limit theorem
If the states in a Markov chain are ergodic (i.e., positive, recurrent, aperiodic), then

lim_n pn (i, j) = πj > 0,  Σ_{j=1}^{M} πj = 1,  πj = Σ_{i=1}^{M} πi p(i, j)   (16.76)

If, as above, π(n) = [p1 (n) p2 (n) ··· pM (n)], the result above may be written

π(n) = π(0) P^n → π(0) P0   (16.77)

where

P0 = [π1 π2 ··· πM
      π1 π2 ··· πM
      ···
      π1 π2 ··· πM]   (16.78)

Each row of P0 = lim_n P^n is the long-run distribution

π = lim_n π(n)   (16.79)

Definition. A distribution is stationary iff

π = πP   (16.80)

The result above may be stated by saying that the long-run distribution is the stationary distribution. A generating function analysis shows the convergence is exponential in the following sense:

||P^n - P0|| ≤ a λ^n   (16.81)

where λ is the largest absolute value of the eigenvalues for P other than λ = 1.

Example 16.14: The long-run distribution for the inventory example
We use MATLAB to check the eigenvalues for the transition probability matrix P and to obtain increasing powers of P. The convergence process is readily evident.

P =
    0.0803    0.1839    0.3679    0.3679
    0.6321    0.3679         0         0
    0.2642    0.3679    0.3679         0
    0.0803    0.1839    0.3679    0.3679
E = abs(eig(P))
E =
    1.0000
    0.2602
    0.2602
    0.0000
format long
N = E(2).^[4 8 12]
N =
   0.00458242348096   0.00002099860496   0.00000009622450
error4 = max(max(abs(P^16 - P^4)))     % Use P^16 for P0
error4 =
   0.00441148012334                    % Compare with 0.0045824
error8 = max(max(abs(P^16 - P^8)))
error8 =
   2.984007206519035e-05               % Compare with 0.00002099
error12 = max(max(abs(P^16 - P^12)))
error12 =
   1.005660185959822e-07               % Compare with 0.00000009622450

[The long-format displays of P4 = P^4, P8 = P^8, and P12 = P^12 are omitted here; their entries agree with those of P^16 to within the errors just computed.]

The convergence process is clear, and we have approximated the long-run matrix P0 with P^16. We have not determined the factor a, but the agreement of the computed errors with the predicted rate λ^n is close. This exhibits a practical criterion for sufficient convergence: if the rows of P^n agree within acceptable precision, then n is sufficiently large. For example, if we consider agreement to four decimal places sufficient, then

format short
P10 = P^10
P10 =
    0.2858    0.2847    0.2632    0.1663
    0.2858    0.2847    0.2632    0.1663
    0.2858    0.2847    0.2632    0.1663
    0.2858    0.2847    0.2632    0.1663

shows that n = 10 is quite sufficient.
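Rather than raising P to a high power, the stationary distribution may be computed directly from the defining condition π = πP together with the normalization Σ πj = 1, by solving a consistent overdetermined linear system. A sketch in base MATLAB, for the inventory matrix P above:

M = size(P,1);
A = [P' - eye(M); ones(1,M)];   % pi*P = pi (transposed) and sum(pi) = 1
b = [zeros(M,1); 1];
ppi = (A\b)'                    % approximately [0.2858 0.2847 0.2632 0.1663]

The result agrees with the common row of P10 above, as the limit theorem requires.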
The simulation of Markov sequences
States and state numbers. We suppose there are m states, each usually carrying a numerical value. For purposes of analysis and simulation, we number the states 1 through m. Computation is carried out with state numbers; when desired, these may be translated into the corresponding state values.

Stages, transitions, and time. Zero transitions are required to reach the original stage (period zero), one transition to reach the next (period one), two transitions to reach period two, etc. Thus, the number of transitions is a measure of the elapsed time since the chain originated, and we find it convenient to refer to stages, transitions, and time in this interchangeable fashion. Time or period number is one less than the number of stages.

The transition matrix and the transition distributions. For each state, there is a conditional transition probability distribution for the next state. These are arranged in a transition matrix: the ith row consists of the transition distribution for selecting the next-period state when the current state number is i. The transition matrix P thus has nonnegative elements, with each row summing to one.

The fundamental simulation strategy
1. If we are in state k, we have a distribution for selecting the next state: the kth row of the transition matrix.
2. We make a selection of the next state based on that distribution.
A succession of these choices, with the selection of the next state made in each case from the distribution for the current state, yields a trajectory of the chain.

Arrival times and recurrence times
The basic simulation produces one or more trajectories of a specified length. Sometimes we are interested in continuing until first arrival at (or visit to) a specific target state, or any one of a set of target states. The time (in transitions) to reach a target state is one less than the number of stages in the trajectory which begins with the initial state and ends with the target state reached.
• If the initial state is not in the target set, the arrival time is the number of transitions required to reach the target set for the first time; a trajectory which reaches the target at the kth transition is k + 1 stages long.
• If the initial state is in the target set, the arrival time would be zero. There is a choice of treatment in this case: we do not stop at zero, but continue until the next visit to a target state (possibly the same as the initial state). We call the number of transitions in this case the recurrence time.
• In some instances, it may be desirable to know the time to complete visits to a prescribed number of the target states.

The selection of each next state is carried out by the standard sampling procedure: to obtain a sample from the uniform distribution, use a random number generator; this sample is transformed by the quantile function into a sample from the desired distribution. We adapt that procedure to the problem of simulating a trajectory for a homogeneous Markov sequence with finite state space. If Q is the quantile function for the population distribution and U is a random vari