Lecture Notes in Applied Probability

Wayne F. Bialas
Department of Industrial Engineering
University at Buffalo
2005


Copyright © 2005 Wayne F. Bialas
Department of Industrial Engineering
University at Buffalo
301 Bell Hall
Buffalo, New York 14260 USA
E-mail: bialas@buffalo.edu
Web: http://www.buffalo.edu/~bialas

All rights reserved. No part of this work may be reproduced by any means without the express written permission of the author.

Date printed: May 12, 2005


CONTENTS

Acknowledgements
To the student

1 Sets, Sample Spaces and Events
  Mathematical Models
    Deterministic models
    Nondeterministic models
  Sets
    Basic definitions
    Venn diagrams
    Set operations
    Set identities
    Proving statements about sets
    A proof of the distributive law for sets
    Elementary sets
    Cardinality of sets
    Sample spaces and events
    Sets of events
  Self-Test Exercises for Chapter 1
  Questions for Chapter 1

2 Enumeration
  Fundamental Rule
  Sampling
  Permutation of n items distinguished by groups
  Occupancy
  The Birthday Problem
    The birthday problem on the planet Yak
  Self-Test Exercises for Chapter 2
  Questions for Chapter 2

3 Probability
  Probability Spaces
    Measuring sets
    Interpretation of probability
    Examples using logical probability
    Relationships between English and set notation
    Deductions from the axioms
  Conditional Probability and Independence
    Conditional probability
    Independence
    Independence of more than two events
    Bayes' theorem
  Self-Test Exercises for Chapter 3
  Questions for Chapter 3

4 Random Variables
  Basic Concepts
  Discrete Random Variables
    Properties of any probability mass function
    Examples of discrete distributions
      Binomial distribution
      Geometric distribution
      Poisson distribution
  Continuous Random Variables
    Properties of any probability density function
    Differential mass interpretation of the pdf
    Examples of continuous distributions
      Exponential distribution
  Cumulative Distribution Function
    Properties of any cumulative distribution function
  Expectation
    Finding E(X) for some specific distributions
    Other measures of central tendency
  Self-Test Exercises for Chapter 4
  Questions for Chapter 4

5 Functions of Random Variables
  Basic Concepts
    Some examples
    Discrete case
    Continuous case
  Invertible Functions of Continuous Random Variables
  Expected values for functions of random variables
    The k-th Moment of X
    Variance
    Approximating E(h(X)) and Var(h(X))
  Chebychev's Inequality
  The Law of Large Numbers
  The Normal Distribution
    Tools for using the standard normal tables
  Self-Test Exercises for Chapter 5
  Questions for Chapter 5

6 Random Vectors
  Basic Concepts
    Definition
  Joint Distributions
    Discrete Distributions
    Continuous Distributions
  Marginal Distributions
  Functions of Random Vectors
  Independence of Random Variables
    Testing for independence
  Expectation and random vectors
  Random vectors and conditional probability
    Conditional distributions
  Self-Test Exercises for Chapter 6
  Questions for Chapter 6

7 Transforms and Sums
  Transforms
    Notation
    Probability generating functions
    Moment generating functions
    Other transforms of random variables
  Linear combinations of normal random variables
  The distribution of the sample mean
  The Central Limit Theorem
    The DeMoivre-Laplace Theorem
    The distribution of the sample mean for large n
  Self-Test Exercises for Chapter 7
  Questions for Chapter 7

8 Estimation
  Basic Concepts
    Statistics
    Random samples
  Point Estimation
    The problem
    Unbiased estimators
  Finding Estimators
    Method of maximum likelihood
  Self-Test Exercises for Chapter 8
  Questions for Chapter 8

9 Probability Models
  Coin Tossing (Binary) Models
    The binomial distribution
    The geometric distribution
    The Poisson distribution
  Random Particle Models
  Self-Test Exercises for Chapter 9
  Questions for Chapter 9

10 Approximations and Computations
  Approximation for the Normal Integral
  Approximation for the Inverse Normal Integral
  Questions for Chapter 10

A Tables
  Table I: Standard Normal Cumulative Distribution Function
  Table II: Some Common Probability Distributions

B Answers to Self-Test Exercises

C Sample Examination 1
  Part I: Multiple Choice Questions
  Part II

D Sample Examination 2
  Part I: Multiple Choice Questions
  Part II

E Sample Final Examination
  Part I: Multiple Choice Questions
  Part II


ACKNOWLEDGEMENTS

Contained herein is a bit of the wisdom of the greatest scholars in probability theory. Some of them were my professors. These notes have benefited the most from the many contributions of my students and teaching assistants. In particular, I am grateful to Mark Chew, Angelo Morgante, Mona Khanna, Lindsay Koslowski, Nguk Ching Ling, Derek Hoiem, Judah Eisenhandler, Kevin Cleary, Harold T. Gerber, Nikola Petrovic, Steven Kieffer, Dawn Holmes, Andrew Lenkei, Shiju David, Rajarshi Chakraborty, Lynne Makrina Tkaczevski, Amol Salunkhe, Michael Holender, Xiaofeng Nie, Simin Huang, Satyaki Ghosh-Dastidar, James Pasquini, Taneka Lawson, Keith Klopfle, Elif Tokar and Min Zhang for finding numerous mistakes in previous editions. I must also thank Yu-Shuan Lee Brown, Alfred Guiffrida and Nishant Mishra for several contributions to these notes. Any remaining errors and omissions are mine.

This manual was typeset using Donald Knuth's TeX with Leslie Lamport's LaTeX enhancements as implemented by Eberhard Mattes. A LaTeX style file developed by H. Partel was modified and used to format this document. The Adobe Acrobat driver dvipdfm, written by Mark A. Wicks, and AFPL Ghostscript were used to produce the printed output. I would not have dared to prepare these notes without these tools.

WFB
Buffalo, New York
May 12, 2005


TO THE STUDENT

A fool must now and then be right by chance.
– WILLIAM COWPER, Conversation (c. 1790)

These notes are for a course devoted to describing those processes around us that appear to produce random outcomes. What is certain is that a course in probability asks that you change your view of everything around you. The universe is not a deterministic clockwork, but a harmony of random possibilities.

Probability theory provides a way to mathematically represent chance, and the goal of any good mathematical model is to serve as a tool for reaching conclusions about the processes being modeled. We will show you how sets can be used to express events, and then assign, in a systematic way, a measure of uncertainty to each of those sets. That measure will be called probability. For that purpose, we will use the mathematical theory of measure.

For some problems, we can assign probabilities to events by simply counting equally likely outcomes. For that, we provide some counting techniques. When outcomes are not equally likely, or when they arise from a continuum of possible values, we will introduce the concept of random variables and random vectors to map our abstract probability problems into Euclidean real space. As engineers and scientists, we are always more comfortable working with real numbers.

There we will quickly discover analogies between allocating probability and allocating physical mass. Point masses and continuous mass densities arise as a natural consequence. The concepts of center of mass (center of gravity) and moment of inertia become the mean and variance of our probability distributions. We will do that, too, with examples ranging from coin tossing to proving one of the most important results in all of science: the Central Limit Theorem.


1 SETS, SAMPLE SPACES AND EVENTS

There are no little events in life.
– AMELIA E. BARR, All the Days of My Life (1913)

Mathematical Models

What is a mathematical model? ____________________

Observation: ____________________

Application: ____________________

Deterministic models

For specified values of an experiment's input parameters, the result is a known constant.

Example: V = IR, established experimentally by Georg Simon Ohm in 1827.¹

¹ To place this date in perspective, the University at Buffalo was founded in 1846.

Nondeterministic models

Also called probabilistic models or stochastic models. Examples:

• the number of α-particles emitted from a piece of radioactive material in one minute
• the number of students who will earn an A in EAS305
• the outcome of a single toss of a fair die

Sets

Basic definitions

We will need to use set theory to describe possible outcomes and scenarios in our model of uncertainty.

Definition 1.1. A set is a collection of elements.

Notation: Sets are denoted by capital letters (A, B, Ω, ...) while their elements are denoted by small letters (a, b, ω, ...). If x is an element of a set A, we write x ∈ A. If x is not an element of A, we write x ∉ A.

A set can be described

• by listing the set's elements:
  A = {1, 2, 3, 4}

• by describing the set in words:
  "A is the set of all real numbers between 0 and 1, inclusive."

• by using the notation {ω : specification for ω}:
  A = {x : 0 ≤ x ≤ 1} or A = {0 ≤ x ≤ 1}

There are two sets of special importance:

Definition 1.2. The universe is the set containing all points under consideration and is denoted by Ω.

Definition 1.3. The empty set (the set containing no elements) is denoted by Ø.

Question: What is the difference (if any) between Ø and {Ø}?
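These ways of describing a set, and the question just posed, can be explored directly on a computer. The following sketch is my own illustration in Python (not part of the original notes), using a small finite stand-in for the universe:

```python
# A minimal sketch (not from the notes): the styles of describing a set,
# mirrored in Python for a small finite universe.
A_listed = {1, 2, 3, 4}                              # by listing the elements
A_rule = {x for x in range(-10, 10) if 1 <= x <= 4}  # by a rule, as in {x : 1 <= x <= 4}

print(A_listed == A_rule)     # True: two descriptions, one set
print(3 in A_listed)          # True: 3 is an element of A
print(7 not in A_listed)      # True: 7 is not an element of A

# Food for thought on the question above: how many elements in each?
print(len(set()), len({frozenset()}))
```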

Definition 1.4. If every point of set A belongs to set B, then we say that A is a subset of B (B is a superset of A). We write

A ⊆ B or B ⊇ A.

Definition 1.5. A is a proper subset of B if A ⊆ B and A ≠ B. We write A ⊊ B.

Note: Some authors use the symbol ⊂ for subset and others use ⊂ for proper subset. We avoid the ambiguity here by using ⊆ for subset and ⊊ for proper subset.

Example: Let Ω = R, the set of all real numbers. Define the sets

A = {x : x^2 + x − 2 = 0}
B = {x : (x − 3)(x^2 + x − 2) = 0}
C = {−2, 1, 3}

Then A ⊊ B and B = C.

Now we have a way to say A is identical to B:

Definition 1.6. A = B if and only if A ⊆ B and B ⊆ A.

Venn diagrams

[Venn diagram: the universe Ω, drawn as a rectangle, containing a set A and a set B]

Set operations

Complementation

A^c = {ω ∈ Ω : ω ∉ A}

Example: Here is the region representing A^c in a Venn diagram:

[Venn diagram: Ω with the region outside A shaded, representing A^c]

Union

A ∪ B = {ω ∈ Ω : ω ∈ A or ω ∈ B (or both)}

Example: Consider the following Venn diagram:

[Venn diagram: Ω containing two overlapping sets A and B]

Then the region representing A ∪ B is:

[Venn diagram: the combined region of A and B shaded]

Intersection

A ∩ B = {ω ∈ Ω : ω ∈ A and ω ∈ B}

Example: Using the same Venn diagram as above, the region representing A ∩ B is given by:

[Venn diagram: the overlap of A and B shaded]

Note that some textbooks use the notation AB instead of A ∩ B to denote the intersection of two sets.

Definition 1.7. Two sets A and B are disjoint (or mutually exclusive) if A ∩ B = Ø.

These basic operations can be extended to any finite number of sets:

A ∪ B ∪ C = A ∪ (B ∪ C) = (A ∪ B) ∪ C
A ∩ B ∩ C = A ∩ (B ∩ C) = (A ∩ B) ∩ C

You can show that

(a) A ∪ B = B ∪ A
(b) A ∩ B = B ∩ A
(c) A ∪ (B ∪ C) = (A ∪ B) ∪ C
(d) A ∩ (B ∩ C) = (A ∩ B) ∩ C

Note: (a) and (b) are the commutative laws.
Note: (c) and (d) are the associative laws.

Set identities

(e) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
(f) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
(g) A ∩ Ø = Ø

(h) A ∪ Ø = A
(i) (A ∪ B)^c = A^c ∩ B^c
(j) (A ∩ B)^c = A^c ∪ B^c
(k) (A^c)^c = A

Note: (e) and (f) are called the Distributive Laws.
Note: (i) and (j) are called DeMorgan's Laws.

Proving statements about sets

Venn diagrams can only illustrate set operations and basic results. You cannot prove a true mathematical statement regarding sets using Venn diagrams. But they can be used to disprove a false mathematical statement. Here is an example of a simple set theorem and its proof.

Theorem 1.1. If A ⊆ B and B ⊆ C, then A ⊆ C.

Proof.
(1) If x ∈ A then x ∈ B. (Definition of A ⊆ B)
(2) Since x ∈ B then x ∈ C. (Definition of B ⊆ C)
(3) If x ∈ A then x ∈ C. (By statements (1) and (2))
(4) Therefore A ⊆ C. (Definition of A ⊆ C) ∎

Note: A proof is often concluded with the symbol ∎ (as in the above proof) or the letters QED.

Many mathematicians prefer to write proofs in paragraph form. For example, the proof of Theorem 1.1 would become:

Proof. Choose an x ∈ A. Since A ⊆ B, x ∈ A implies x ∈ B. Furthermore, since B ⊆ C, x ∈ B implies x ∈ C. Therefore, every element x ∈ A is also an element of C. Hence, from the definition of subset, A ⊆ C. ∎
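Although no computation can prove an identity for all sets, a computation can disprove a false one by exhibiting a counterexample. Here is a small Python check of DeMorgan's laws on random subsets of a finite universe; it is my own illustration, not part of the original notes:

```python
import random

# A sketch (not from the notes): searching random subsets of a finite
# universe for counterexamples to DeMorgan's laws, identities (i) and (j).
universe = set(range(10))

def complement(s):
    """Complement relative to the finite universe."""
    return universe - s

for _ in range(1000):
    A = {x for x in universe if random.random() < 0.5}
    B = {x for x in universe if random.random() < 0.5}
    assert complement(A | B) == complement(A) & complement(B)
    assert complement(A & B) == complement(A) | complement(B)

print("no counterexample to DeMorgan's laws in 1000 random trials")
```

A single failed assertion would disprove an identity; many successes merely fail to disprove it, which is exactly the limitation of Venn-diagram "proofs" described above.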

A proof of the distributive law for sets

Here is a proof of one of the distributive laws for sets. The proof is from a textbook on set theory by Flora Dinkines.²

Theorem 1.2. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

Proof.
(1) If x ∈ A ∩ (B ∪ C), then x ∈ A and x ∈ B ∪ C. (Definition of ∩)
(2) Since x ∈ B ∪ C, then x ∈ B or x ∈ C. (Definition of ∪)
(3) Therefore x ∈ A and x ∈ B, or x ∈ A and x ∈ C. (By (1) and (2))
(4) Hence x ∈ A ∩ B or x ∈ A ∩ C, so x ∈ (A ∩ B) ∪ (A ∩ C). (Definition of ∩ and ∪)
(5) Therefore every element of A ∩ (B ∪ C) is also an element of (A ∩ B) ∪ (A ∩ C), giving us A ∩ (B ∪ C) ⊆ (A ∩ B) ∪ (A ∩ C).

We now prove the inclusion in the opposite direction:

(6) If y ∈ (A ∩ B) ∪ (A ∩ C), then y ∈ (A ∩ B) or y ∈ (A ∩ C). (Definition of ∪)
(7) If y ∈ A ∩ B then y ∈ A and y ∈ B. (Definition of ∩)
(8) If y ∈ A ∩ C then y ∈ A and y ∈ C. (Same as (7))
(9) In either case (7) or (8), y ∈ A and y is an element of one of the sets B or C. (Statements (7) and (8))
(10) Since y is in B or C, it is in the union, so y ∈ B ∪ C. Therefore y ∈ A and y ∈ B ∪ C, so y ∈ A ∩ (B ∪ C). (Definition of ∪ and ∩)
(11) Therefore every element of (A ∩ B) ∪ (A ∩ C) is also an element of A ∩ (B ∪ C), giving us A ∩ (B ∪ C) ⊇ (A ∩ B) ∪ (A ∩ C).

Therefore, using the definition of set equality, we have A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C). ∎

² Dinkines, F., Elementary Theory of Sets, Meredith Publishing, 1964.

Elementary sets

Let Ω be a universe with three subsets A, B and C. Consider the following subsets of Ω:

A B C
0 0 0   A^c ∩ B^c ∩ C^c = S0
0 0 1   A^c ∩ B^c ∩ C   = S1
0 1 0   A^c ∩ B ∩ C^c   = S2
0 1 1   A^c ∩ B ∩ C     = S3
1 0 0   A ∩ B^c ∩ C^c   = S4
1 0 1   A ∩ B^c ∩ C     = S5
1 1 0   A ∩ B ∩ C^c     = S6
1 1 1   A ∩ B ∩ C       = S7

The sets S0, S1, ..., S7 are called the elementary sets of A, B and C.

Figure 1.1 illustrates all of the 2^3 = 8 elementary sets for A, B and C using a Venn diagram. For example, the binary sequence (0, 1, 1) corresponds to S3 = A^c ∩ B ∩ C. This is the subset of elements that are

0: not in A, and
1: in B, and
1: in C.

[Figure 1.1: An example of elementary sets for three subsets. The Venn diagram of Ω shows the eight regions A ∩ B^c ∩ C^c, A ∩ B ∩ C^c, A ∩ B ∩ C, A^c ∩ B ∩ C^c, A ∩ B^c ∩ C, A^c ∩ B ∩ C, A^c ∩ B^c ∩ C and A^c ∩ B^c ∩ C^c.]

Elementary sets have two very useful properties:

• Elementary sets are pairwise disjoint. That is, Si ∩ Sj = Ø for any i ≠ j.

• Any set operation involving A, B, and C can always be represented as a union of the elementary sets for A, B and C.

For example,

(A ∪ B) ∩ C = S3 ∪ S5 ∪ S7

Cardinality of sets

The cardinality of a set A (denoted by |A|) is the number of elements in A. For some (but not all) experiments,

• |Ω| is finite, and
• each of the outcomes of the experiment is equally likely.

In such cases, it is important to be able to enumerate (i.e., count) the number of elements of subsets of Ω. This will be discussed in a later chapter.

Sample spaces and events

Using set notation, we can now introduce the first components of our probability model.

Definition 1.8. A sample space, Ω, is the set of all possible outcomes of an experiment.

Definition 1.9. An event is any subset of a sample space.

Example: Consider the experiment of rolling a die once.

✓ What should be Ω for this experiment?

Ω = ____________________

If you thought the answer was Ω = {1, 2, 3, 4, 5, 6}, then you just (re)invented the concept of a random variable.

✓ Identify the subsets of Ω that represent the following events:

1. The die turns up even.
   A = ____________________

2. The die turns up 6.
   B = ____________________

3. The die turns up less than 5.
   C = ____________________

Example: Consider the experiment of burning a light bulb until it burns out. The outcome of the experiment is the life length of the bulb.

✓ What should be Ω for this experiment?

Ω = {x ∈ R : x ≥ 0}

For the lack of a better representation, we resorted to assigning a nonnegative real number to each possible outcome of the experiment.

✓ Identify the subsets of Ω that represent the following events:

1. The bulb burns out within one and one-half hours.
   A = {x ∈ R : 0 ≤ x < 1.5}

2. The bulb lasts at least 5 hours.
   B = {x ∈ R : x ≥ 5}

3. The bulb lasts exactly 24 hours.
   C = {24}

Sets of events

It is sometimes useful to be able to talk about the set of all possible events, F. Since an event is a set, the set of all events is really a set of sets. Each element of F is a subset of Ω. The set F is very different than the set of all possible outcomes, Ω.

For example, suppose we toss a two-sided coin twice. The sample space is Ω = {HH, HT, TH, TT}. If you were asked to list all of the possible events for this experiment, you would need to list all possible subsets of Ω, namely

A1 = {HH}
A2 = {HT}
A3 = {TH}
A4 = {TT}
A5 = {HH, HT}
A6 = {HH, TH}

3 The set (A ∩ B) ∪ (A ∩ B c ) is equivalent to (A) Ø (B) Ω (C) A (D) B (E) none of the above. B or C occurs is (A) (A ∩ B ∩ C) (B) (A ∩ B ∩ C)c (C) (A ∪ B ∪ C) (D) (A ∩ B c ∩ C c ) ∪ (Ac ∩ B ∩ C c ) ∪ (Ac ∩ B c ∩ C) (E) none of the above.Self-Test Exercises for Chapter 1 (D) (A ∪ B ∪ C)c (E) none of the above 13 S1.2 The event that corresponds to the statement that exactly one of the events A. S1. (A) (A ∩ B) ⊆ B (B) A ⊆ (A ∪ B) (C) Ac ⊆ (A ∩ B)c (D) all of the above are true. S1.5 For any sets A and B . .4 The set (A ∩ B) ∪ A is equivalent to (A) (A ∪ B) ∩ A (B) B (C) A ∩ B (D) Ø (E) none of the above. S1.

3. and B = {x : 0. 4}.7 If A ∩ B = Ø then (Ac ∪ B c )c equals (A) Ø (B) A (C) B (D) Ω (E) none of the above. B = {3. the event (A ∪ B c ) ∩ B equals (A) Ø (B) Ω (C) A ∪ B (D) A ∩ B (E) none of the above. is given by Ω = {x : 0 ≤ x ≤ 2}.14 Questions for Chapter 1 (E) none of the above are true. Ω. Questions for Chapter 1 1. 4.25 ≤ x ≤ 1.5}. 7}.6 For any events A and B . Let A = {2.1 Suppose that the universe consists of the positive integers from 1 through 10. Let A = {x : 0. S1.2 Suppose that the universe. S1. Describe the following sets: (a) (A ∪ B)c (b) A ∪ B c (c) (A ∩ B)c . 6.5 ≤ x ≤ 1}. List the elements of the following sets: (a) Ac ∩ B (b) Ac ∪ B (c) (Ac ∪ B c )c (d) A ∪ (B ∩ C)c c 1. 5} and C = {5.

3 Which of the following relationships are true? In each case. B = set of all integers that are integral multiples of 3.Questions for Chapter 1 (d) Ac ∩ B 15 1. (a) (A ∪ B) ∩ (A ∪ C) = A ∪ (B ∩ C) (b) (A ∪ B) = (A ∩ B c ) ∪ B (c) Ac ∩ B = A ∪ B (d) (A ∪ B)c ∩ C c = Ac ∩ B c ∩ C c (e) (A ∩ B) ∩ (B c ∩ C) = Ø 1. (a) A ∩ Ac (b) A ∪ Ac (c) A ∩ Ø (d) A ∪ Ø (e) Ω ∩ Ø (f) Ω ∪ Ø (g) A ∩ Ω (h) A ∪ Ω 1. where Ω is the universe and A is a subset of Ω. C = set of all integers that are integral multiples of 4. and J = set of all integers Describe the following sets: (a) A ∩ B (b) A ∪ B . either prove your result using basic set relationships or disprove by providing a counterexample.4 Identify the set obtained by the following set operations.5 Let A = set of all even integers.

3. a switch is placed in the “ON” position. the following holds: A ∪ (A ∩ B) = A ∩ (A ∪ B) = A.e. until all defectives have been found. Assume that X and Y are measured in hours on the continuous time axis with the beginning of the time period as the origin. show that for any sets A and B . 1. each of whose quality. one by one. These bulbs are tested. 1. is determined at the end of the day. defective or nondefective. The circuit is on at time z where z is some instant of time during the 24–hour period. (b) Describe and sketch in the XY –plane the following events: 1. The outcome of the experiment consists of the pair of numbers (X. The circuit is on for less than one hour. one by one.9 During the course of a day. i.7 A box of N light bulbs has r (r ≤ N ) bulbs with broken filaments. at some time.6 Prove the Law of Absorption. 1. a machine produces three items. Y .8 During a 24–hour period.16 Questions for Chapter 1 (c) B ∩ C (d) A ∪ J (e) A ∪ C 1. Describe the sample space for this experiment. (still during the same 24– hour period) the switch is placed in the “OFF” position. until a defective bulb is found. at some future time. Describe the sample space generated by a day’s production. The circuit is on exactly three times as long as it is off. 4. X . (b) Suppose that the above bulbs are tested. (a) Describe the sample space. . 2. The circuit is turned on before time r and turned off after time s (r < s). Y ). (a) Describe the sample space for the above experiment.. Subsequently.

1. Do not add other attributes of elements of R2 such as the distance between points. 1.11 Let A. If the second probe fails. Your theorem should only use sets and basic set operations. Express the following verbal statements in set notation: (a) At least one of the events occurs. If it fails.Questions for Chapter 1 17 1. another probe is sent. This process continues until a successful probe is launched. (d) At least one event occurs.12 (†) (provided by Nishant Mishra) Provide a theorem that is true for Venn diagrams but false for an arbitrary universe Ω.10 A space probe is sent to Mars. (b) Exactly one of the events occurs. but no more than two of the events occur. another follows. Describe the sample space for this experiment. (c) Exactly two of the events occurs. B and C be three events associated with an experiment. .

18 Questions for Chapter 1 .

2 ENUMERATION

Do not count your chickens before they are hatched.
– AESOP, Fables (600 BC)

This section is based on Chapter 3 of a text by K. L. Chung.¹

So you think you know how to count? Consider the following problem: Suppose you are the chief engineer at the Ellicott Dominoes domino factory. Dominoes are rectangular tiles used in some parlor games, and they look like Figure 2.1.

[Figure 2.1: A single domino]

Typically, dominoes have each half of a tile painted with 1 to 9 dots (called “pips”) or a blank, for a total of 10 possible choices. Of course, a domino painted (1,4) can be turned 180 degrees to look like a domino painted (4,1). So there is no need to have one assembly line producing (1,4) dominoes and another producing (4,1) dominoes.² How many different dominoes does your factory actually produce? (Try this problem on your friends and see how many different answers you get.)

¹ Chung, K. L., Elementary probability theory with stochastic processes, Springer-Verlag, New York, 1975.
² Just have a person at the end of the assembly line rotating half of the (4,1) dominoes by 180 degrees.
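Once you have tried the problem yourself, a brute-force enumeration settles it. This short sketch is my own (not part of the original notes); it identifies each tile with the unordered pair of its two halves:

```python
# A brute-force sketch (not from the notes): each half shows one of 10
# marks (blank, or 1..9 pips); rotating a tile 180 degrees swaps the
# halves, so (1, 4) and (4, 1) are the same piece.
marks = range(10)
tiles = {tuple(sorted((a, b))) for a in marks for b in marks}
print(len(tiles))   # the number of different dominoes the factory produces
```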

In this chapter, we will present a systematic way to approach counting problems. We will classify counting problems into four categories, and solve each type. In practice, most counting problems can be approached using one or a combination of these four basic types.

Fundamental Rule

The following rule forms the foundation for solving any counting problem:

Multiplication Principle: Suppose there are several multiple choices to be made. There are m1 alternatives for the first choice, m2 alternatives for the second, m3 for the third, and so on. The total number of alternatives for the whole set of choices is equal to (m1)(m2)(m3)···.

Example: Suppose you give a quiz with 5 questions, each with a possible answer of True or False. There are two ways to answer the first question, two ways to answer the second question, etc. Therefore, the total number of possible ways to answer all of the questions on the exam is 2 · 2 · 2 · 2 · 2 = 2^5 = 32.

Example: Suppose an engineering class has five women and four men. You would like to select a research team consisting of one man and one woman. There are four choices for selecting the man and five choices for selecting the woman. The total number of possible research teams is, therefore, 4 · 5 = 20.

Sampling

The Urn Model: Imagine an urn that contains n balls marked 1 to n. A total of m balls are drawn sequentially. There are four disciplines under which the balls can be drawn from the urn and examined.

I. Sampling with replacement and with ordering. After a ball is drawn, it is replaced in the urn and may be drawn again. The order of the balls is noted. Hence the sequence 1, 2, 3 is distinguished from 3, 2, 1. Since there are n possible ways to draw the first ball, n possible ways to draw the second, etc., the total number of possible sequences is n^m.
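For small n and m, Case I can be checked by listing every ordered sequence of draws. A quick sketch of mine (not from the notes), with illustrative values:

```python
from itertools import product

# Sampling with replacement and with ordering: the outcomes are exactly
# the ordered m-tuples over n symbols, so there are n**m of them.
n, m = 3, 4   # 3 balls in the urn, 4 draws (illustrative values)
sequences = list(product(range(1, n + 1), repeat=m))
assert len(sequences) == n ** m
print(len(sequences))   # 81
```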

II. Sampling without replacement and with ordering. A total of m balls are drawn sequentially with m ≤ n. After a ball is drawn, it is not replaced in the urn and hence cannot be drawn again. The order of the balls is noted. Since there are n possible ways to draw the first ball, n − 1 possible ways to draw the second, n − 2 possible ways to draw the third, etc., the total number of possible sequences is

n(n − 1)(n − 2) · · · (n − m + 1).

Computational Hint: We can compute

n(n − 1)(n − 2) · · · (n − m + 1) = \frac{n!}{(n − m)!}

By definition, 0! ≡ 1, so that when n = m we have

n(n − 1)(n − 2) · · · (n − n + 1) = \frac{n!}{(n − n)!} = \frac{n!}{0!} = n!.

For the special case when m = n, this is equivalent to the problem of counting the number of ways you can arrange n different balls in a row: the total number of possible ways to draw all n balls is n(n − 1)(n − 2) · · · (2)(1) ≡ n!.

Example: Suppose you have five different birthday cards, and you have three friends (named Bill, Mary, and Sue) with birthdays this week. You want to send one card to each friend. The number of possible choices for Bill's card is 5. The card for Mary must come from the remaining 4 cards. Finally, the number of possible choices for Sue's card is 3. Hence the total number of ways you can assign five birthday cards to your three friends is 5 · 4 · 3 = 60.

It's interesting to note that the sequence of friends (Bill, then Mary, then Sue) is arbitrary and does not affect the answer.
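The birthday-card count is a Case II computation, and it can be enumerated directly. The following check is my own sketch (not from the notes):

```python
from itertools import permutations

# Sampling without replacement, with ordering: ordered selections of
# m = 3 cards from n = 5, matching 5 * 4 * 3 = 5!/(5 - 3)! = 60.
cards = "abcde"                             # five different birthday cards
assignments = list(permutations(cards, 3))  # ordered choice for (Bill, Mary, Sue)
assert len(assignments) == 5 * 4 * 3 == 60
print(assignments[0])                       # e.g. ('a', 'b', 'c')
```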

Here is another way to look at this problem: Consider the empty mailboxes of your three friends:

?   ?   ?
B   M   S

If the cards are labelled a, b, c, d, e, then one possible assignment is

c   d   a
B   M   S

There are 5 ways to fill the first position (i.e., Bill's mailbox), 4 ways to fill the second position and 3 ways to fill the third.

Example: Using the above approach, we can compute the number of ways one can arrange the three letters x, y, z. This time we have 3 positions to fill with 3 different items:

?   ?   ?
1   2   3

So a possible arrangement is

y   z   x
1   2   3

and there are 3 · 2 · 1 = 3! = 6 possible assignments.

III. Sampling without replacement and without ordering. A total of m balls (m ≤ n) are drawn from the urn. The balls are not replaced after each drawing and the order of the drawings is not observed. Combined, this case can be thought of as reaching into the urn and drawing m balls simultaneously. The total number of ways a subset of m items can be drawn from a set of n elements is

(1)   \binom{n}{m} ≡ \frac{n!}{m!(n − m)!}

This number is often referred to as the binomial coefficient.

To derive the above formula, let x denote the number of ways to choose m items from n items without replacement and without ordering. We can find the value of x by considering Case II as a two-stage process:

Stage 1: Grab m items from n without replacement and without ordering.
Stage 2: Given the m items selected in Stage 1, arrange the m items in order.

Combined, the two stages implement Case II (sampling without replacement and with ordering). Stage 1 (selecting the m balls) can be performed x ways, our unknown quantity. Stage 2 (ordering the m balls) can be performed m! ways (using Case II for m items drawn from m items). Furthermore, using Case II (with m items drawn from n items), we know that the entire process of Stage 1 followed by Stage 2 can be performed in \frac{n!}{(n−m)!} ways. Hence,

[Stage 1] · [Stage 2] = [Case II]

so we must have

x · m! = \frac{n!}{(n − m)!}

Solving for x then yields our result.

Computational Hint: Note that

\binom{n}{m} = \frac{n!}{m!(n − m)!} = \frac{n(n − 1)(n − 2) · · · (n − m + 1)}{m!}

with only m terms in both the numerator and the denominator. Hence,

\binom{10}{2} = \frac{10!}{2! 8!} = \frac{10 · 9}{2!} = 45.

Also note that if n is large, brute force computation of all three factorial terms of \binom{n}{m} can be a challenge. To overcome this, note that

\binom{n}{m} = \frac{n!}{m!(n − m)!} = \frac{n!}{(n − m)! m!} = \binom{n}{n − m}

So, for example,

\binom{5}{3} = \binom{5}{2} and \binom{9}{7} = \binom{9}{2}.

Hence,

\binom{100}{3} = \frac{100!}{3! 97!} = \frac{100 · 99 · 98}{3!} = 161700.

Example: A consumer survey asks: Select exactly two vegetables from the following list:

◦ peas   ◦ carrots   ◦ lima beans   ◦ corn

The total number of ways you can select two items from the four is \binom{4}{2} = 6.
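Both the binomial-coefficient formula and the two-stage argument can be verified numerically. The sketch below is my own (not from the notes) and uses math.comb, available in Python 3.8 and later:

```python
from itertools import combinations, permutations
from math import comb, factorial

# Case III on the vegetable survey: unordered pairs from four items.
veggies = ["peas", "carrots", "lima beans", "corn"]
assert len(list(combinations(veggies, 2))) == comb(4, 2) == 6

# The two-stage identity [Stage 1] * [Stage 2] = [Case II],
# i.e. C(n, m) * m! = n!/(n - m)!
n, m = 5, 3
assert comb(n, m) * factorial(m) == len(list(permutations(range(n), m)))

# The symmetry used in the computational hint: C(n, m) = C(n, n - m).
assert comb(100, 3) == comb(100, 97) == 161700
```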

Sampling 23 our unknown quantity. note that n m = n! n(n − 1)(n − 2) · · · (n − m + 1) = . Solving for x then yields our result. [Stage 1] · [Stage 2] = [Case II]. 97 3 3!97! 3! Example: A consumer survey asks: Select exactly two vegetables from the following list: ◦ peas ◦ carrots ◦ lima beans ◦ corn 4 2 The total number of ways you can select two items from the four is = 6. Furthermore. m!(n − m)! (n − m)!m! n−m So. for example. m!(n − m)! m! with only m terms in both the numerator and the denominator. 5 5 = 3 2 and 9 9 = . Hence. Hence. Stage 2 (ordering the m balls) can be performed m! ways (using Case II for m items drawn from m items). Computational Hint: Note that n m = n! n n! = = . we must have x · m! = n! (n − m)! In other words. To overcome this. we know. 10 10! 10 · 9 = = = 45 2 2!8! 2! and 100 100 100! 100 · 99 · 98 = = = = 161700. brute force computation of all three factorial terms n of m can be a challenge. . 4 5 Also note that if n is large. that the entire process of Stage 1 followed by Stage 2 can be performed in n! (n−m)! ways. using Case II (with m items drawn from n items).

. 3. we only need to know how many times each ball was drawn. the sequence of the draws is ignored. 3. 2. Sampling with replacement and without ordering. 3. 3 or 2. A total of m balls are drawn sequentially from the urn. the arrangements 2. 3. to describe 3. note that to describe a possible outcome from this scenario. 1 √ 2 3 √ √√ √√√ √ √√√√ || 1 2 2 2 3 2 3 3 Coded Representation √ √ √√ | | √√√ √ | | So. For example. 3 we only need to know that Ball 1 was drawn once. drawing 3.24 Sampling IV. The number of ways that m balls can be drawn from an urn containing n balls in this fashion is m+n−1 . 1. However. 2 and their counterparts would be represented as 1 2 2 2 3 2 √√√ 3 √ Now notice that any of these outcomes can be represented by only specifying √ the arrangement of the four check marks ( ) among the inner two vertical lines (|). 2. 1. 2 or 1. 2 is the same as drawing 1. would represent 3. 3. 3. 3. m Let’s prove that this formula works. for example. This case is perhaps the most difficult to perceive as a sampling discipline. 2. and Ball 3 was drawn twice. This arrangement can be represented as follows: 1 √ 2 √ 3 √√ 1 2 3 3 For another example using this scheme. Therefore. 3. 2. For example. First. 3. yet it has important applications. Ball 2 was drawn once. Each ball is replaced after being drawn. 2.

To represent all possible outcomes. So the following arrangement: Peanut Butter √√ Chocolate Chip √√√ Sugar √√√√√√√ Coded Representation √√ √√√ √√√√√√√ | | would represent the outcome when Mary selects 2 peanut butter. The number of ways we can select 4 positions from 6 positions is 6 = 15. In general. 1 √√ 2 √ ··· ··· n √ k1 k2 · · · kn Coded Representation √√ √ √ | |···| Therefore. Therefore. our table will have n columns. She likes peanut butter cookies. there are 15 4 possible ways we can select 4 balls from an urn containing 3 balls if we allow replacement and disregard ordering. note that our coded representation has six positions ? ? ? ? ? ? √ Any four of these can be selected for the check marks ( ) and the remaining two will be filled with the two vertical lines (|). our table has three columns with twelve check marks. chocolate chip cookies and sugar cookies. In this case. 3 chocolate chip and 7 sugar cookies. In this case.Sampling 25 To count the total number of possible outcomes. we are arranging 14 symbols in a row: . our coded representation will have m check marks and n − 1 vertical lines. So we need (n − 1) + m positions for both check marks and vertical lines ? ? ? ··· ? (n − 1) + m positions The number of ways we can select m positions for check marks from (n − 1) + m positions is n+m−1 m Example: Mary wants to buy a dozen cookies. we will need m check marks √ ( ). if there are n balls and m drawings. The baker will allow her to select any number of each type to make a dozen.

we are given n boxes marked 1 to n. n1 !n2 ! · · · nk ! This quantity is a generalization of the binomial coefficient and is called the multinomial coefficient. and m tokens numbered 1 to m. There are four disciplines for placing the tokens in the boxes.. . A. The number of distinct arrangements of these seven letters in a row is 7! = 210.26 Occupancy 2 vertical lines and 12 check marks. . = = 12 2·1 12 2 Permutation of n items distinguished by groups An extension of case III is the situation where one has n balls with n1 of them painted with color number 1. A. So the number of possible different boxes of one dozen cookies is 12 + 3 − 1 14 14 14 · 13 = = 91. The reason for introducing occupancy problems becomes clear as one encounters different counting problems. In such models. . we think of n balls placed in an urn from which m balls are drawn. C . 3!2!2! Occupancy The Token Model: In sampling problems. this situation is the same as case III. . B. Example: Consider the seven characters. Each of these four scenarios correspond to one of the four sampling cases. Nevertheless.nk of them painted with color number k . n2 of them painted with color number 2. We have k ni = n. C. Note that when k = 2. A direct analogy can be made with so-called occupancy problems. A. the number of distinguishable ways in which the tokens may occupy the boxes is computed in precisely the same manner as in the sampling models. i=1 If the balls are arranged in a row. B. the number of distinguishable arrangements is given by n! . Some may be easier to conceptualize as occupancy problems rather than sampling problems.

Occupancy

The Token Model: In sampling problems, we think of n balls placed in an urn from which m balls are drawn. A direct analogy can be made with so-called occupancy problems. In such models, we are given n boxes marked 1 to n, and m tokens numbered 1 to m. There are four disciplines for placing the tokens in the boxes. Each of these four scenarios corresponds to one of the four sampling cases.

The reason for introducing occupancy problems becomes clear as one encounters different counting problems. Some may be easier to conceptualize as occupancy problems rather than sampling problems. Nevertheless, the number of distinguishable ways in which the tokens may occupy the boxes is computed in precisely the same manner as in the sampling models.

The connection between the two formulations is easy to remember if you note the following relationships:

Sampling Problems            Occupancy Problems
ball                         box
number of drawing            number on token
j-th draw yields ball k      j-th token is placed in box k

For example, suppose we have a sampling problem with 3 balls (labelled a, b, c) being drawn 4 times. This would correspond to an occupancy problem with 3 boxes and 4 tokens (see Figure 2.2).

[Figure 2.2: The occupancy problem model. Tokens 1, 2, 3, 4 are dropped into boxes a, b, c; panel (a) shows token labels observed, panel (b) shows token labels not observed.]

Moreover, the following outcomes from each model would be equivalent:

Sampling                     Occupancy
Drawing   Ball               Box   Token(s)
1         b                  a     2
2         a                  b     1, 3
3         b                  c     4
4         c

Sampling cases I to IV can now be viewed as


I*. Each box may contain any number of tokens and the labels on the tokens are observed.

II*. No box may contain more than one token and the labels on the tokens are observed.

III*. No box may contain more than one token and the labels on the tokens are not observed.

IV*. Each box may contain any number of tokens and the labels on the tokens are not observed.

The scheme with check marks and vertical bars that we used for Case IV is now readily apparent. Each column of the table represents a box, and each check mark represents a token.

Example: Suppose a school has five classrooms (with different room numbers) and three teachers, Ms. Smith, Mr. Thomas and Ms. Warren. For some courses, more than one teacher might be assigned to the same classroom.

I*. If more than one teacher can be assigned to a classroom, and it matters which teacher teaches in a particular room, then the number of ways you can assign teachers to classrooms is
n^m = 5^3 = 125.

II*. If only one teacher can be assigned to a classroom, and it matters which teacher teaches in a particular room, then the number of ways you can assign teachers to classrooms is
n(n − 1) · · · (n − m + 1) = 5 · 4 · 3 = 60.

III*. If only one teacher can be assigned to a classroom, and it does not matter which teacher teaches in a particular room, then the number of ways you can assign teachers to classrooms is
\binom{n}{m} = \binom{5}{3} = 10.

IV*. If more than one teacher can be assigned to a classroom, and it does not matter which teacher teaches in a particular room, then the number of ways you can assign teachers to classrooms is

\binom{m + n − 1}{m} = \binom{3 + 5 − 1}{3} = \binom{7}{3} = 35.
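All four disciplines are one-line enumerations with Python's itertools. The following sketch is my own (not from the notes) and reproduces the four classroom counts for n = 5 rooms and m = 3 teachers:

```python
from itertools import combinations, combinations_with_replacement, permutations, product

rooms = range(5)   # five classrooms; each of the 3 teachers gets one room

print(len(list(product(rooms, repeat=3))))                 # I*:   5**3      = 125
print(len(list(permutations(rooms, 3))))                   # II*:  5 * 4 * 3 = 60
print(len(list(combinations(rooms, 3))))                   # III*: C(5, 3)   = 10
print(len(list(combinations_with_replacement(rooms, 3))))  # IV*:  C(7, 3)   = 35
```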

Self-Test Exercises for Chapter 2
For each of the following multiple-choice questions, choose the best response among those provided. Answers can be found in Appendix B.

S2.1 Bill's Donut Shop bakes 10 varieties of doughnuts. How many different boxes of 12 doughnuts can a customer select from the 10 varieties? Note: Assume that Bill's Shop has at least 12 of each type of donut.

(A) 12!
(B) \binom{21}{12}
(C) 10!
(D) 12^10
(E) none of the above.

S2.2 A teacher has a class of 30 students. Each day, she selects a team of 6 students to present their homework solutions to the rest of the class. How many different teams of 6 students can she select?

(A) \binom{30}{6}
(B) 30^6
(C) 30!/24!
(D) 6^30
(E) none of the above

S2.3 A box is partitioned into 3 numbered cells. Two indistinguishable balls are dropped into the box. Each cell may contain at most one ball. The total number of distinguishable arrangements of the balls in the numbered cells is

(A) 1
(B) 3
(C) 6


(D) 9
(E) none of the above.

S2.4 A professor wants to use three colors (red, green and blue) to draw an equilateral triangle on a sheet of paper. Each edge of the triangle is a single color, but any color can be used more than once. How many different triangles can be drawn?

(A) 3^3
(B) 3!
(C) \binom{5}{3}
(D) \binom{5}{3} + 1
(E) none of the above.

S2.5 A professor has an equilateral triangle cut from a sheet of cardboard. He wants to use three colors (red, green and blue) to color the edges. Each edge of the triangle is a single color, but any color can be used more than once. How many different triangles can be produced?

(A) 3^3
(B) 3!
(C) \binom{5}{3}
(D) \binom{5}{3} + 1
(E) none of the above.

S2.6 Two white balls are dropped randomly into four labelled boxes, one at a time. A box may contain more than one ball. The number of distinguishable arrangements of the balls in the boxes is

(A) \binom{4}{2}
(B) \binom{7}{2}
(C) 4
(D) 4^2
(E) none of the above.


S2.7 One white ball and one black ball are dropped randomly into four labelled boxes, one at a time. A box may contain more than one ball. The number of distinguishable arrangements of the balls in the boxes is

(A) \binom{4}{2}
(B) \binom{7}{2}
(C) 4
(D) 4^2
(E) none of the above.

S2.8 A class has ten students. The teacher must select three students to form a committee. How many different possible committees can be formed?

(A) 10^3
(B) 3^10
(C) 10!/7!
(D) \binom{10}{3}
(E) none of the above

S2.9 A student club has ten members. The club must elect a president, vice-president and treasurer from the membership. How many ways can these three positions be filled?

(A) 10^3
(B) 3^10
(C) 10!/7!
(D) \binom{10}{3}
(E) none of the above

S2.10 A candy company is introducing a new, bite-sized, chocolate candy called "XYZ's". Each candy is covered with a sugar coat that is colored a single color: either red, green, or blue. Also, each candy is marked with a single letter: either “X,” “Y” or “Z”. How many different possible candies are there?

(A) 9
(B) 3^3
(C) \binom{9}{3}
(D) 3!
(E) none of the above.

S2.11 An ice cream shop offers 20 different flavors of ice cream. Their menu says that their “Big Dish” contains five scoops of ice cream. You may order the five scoops in any way, ranging from five different flavors to five scoops of the same flavor. How many different “Big Dishes” are possible?

(A) 20^5
(B) \binom{20}{5}
(C) \binom{24}{5}
(D) 20!/5!
(E) none of the above

S2.12 There are 4 machines in a factory. If each worker is assigned to one machine, how many ways can 4 workers be assigned to the machines?

(A) 4^2
(B) 4!
(C) \binom{7}{4}
(D) 4^4
(E) none of the above

Questions for Chapter 2

2.1 A domino piece is marked by two numbers. The pieces are symmetrical so that the number pair is not ordered. How many different pieces can be made using the numbers 1, 2, ..., n?

2.2 At a bridge table, the 52 cards are partitioned into four equal groups. How many different playing situations are there?

2.3 How many ways can a poker hand of 5 cards be drawn from a 52 card deck so that each card is a different number or face (i.e., ignoring suits)?

2.4 Throw three indistinguishable dice. How many distinguishable results of the throw are there?

2.5 From eight persons, how many committees of three members may be chosen? Suppose that among the eight persons, exactly four are women. How many three-person committees can be chosen so that at least one member is female?

2.6 A ballot lists 10 candidates. Each voter is allowed 4 votes. According to the “bullet” voting system, a voter must place 4 check marks on the ballot, and may assign more than one check mark to any candidate(s) up to a total of four marks. How many different ways can the ballot be marked?

2.7 An election ballot lists 10 candidates. You may vote for any three of them. If you vote for exactly three candidates, how many different ways can you mark your ballot?

2.8 A multiple choice test contains 10 questions with each question having 4 possible answers. How many different ways could a student answer the entire test?

2.9 (provided by Mark Chew) Jack Dukey enters the Dip-'n-Donut Shop to buy a dozen doughnuts for himself and his co-workers in the Parks Department. He notices that they have the following inventory of his favorite doughnuts: Bavarian Creme (21), Strawberry (30), Peanut Krunch (11) and Powdered Jelly (15).

(a) How many different ways can Jack order a dozen doughnuts? Assume that ordering 6 Bavarian Creme and 6 Peanut Krunch is the same as ordering 6 Peanut Krunch and 6 Bavarian Creme.

(b) If Jack wants 2 Peanut Krunch for himself, and his boss in the Parks Department wants 3 Bavarian Creme for his coffee break, how many different ways can Jack order a dozen doughnuts?

2.10 Consider a box partitioned into 3 cells. Two balls are dropped into the box, each independently of the other, in such a way that every cell has an equal chance of receiving a ball. If the balls are indistinguishable, list all possible arrangements. Reanswer the question if no cell can have more than one ball.

2.11 A classroom in the Bell Elementary School contains 16 seats with four rows of four seats each. A class is composed of 4 girls and 12 boys. How many ways can the 4 girls be seated among the 12 boys if the students are distinguished only by sex?

2.12 Fifty students have rented 3 busses for a class trip. Each bus can hold 50 riders. How many different ways can the students sign up for busses?

2.13 (See also Question 2.6.) A ballot lists 10 candidates. You must pick three candidates and rank them according to your preference, A, B, or C, with A being your most preferred candidate. How many different ways can you mark your ballot?

2.14 Prove the binomial theorem: For any real numbers x and y and integer n ≥ 1,

(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n−k}

2.15 You are given n balls of different colors. How many distinguishable ways can you

(a) arrange all of them in a row?
(b) choose m of them simultaneously?
(c) divide them (perhaps unequally) between two identical urns?
(d) divide them (perhaps unequally) between two children (and each child cares about the colors he or she receives)?

3 PROBABILITY

The scientific imagination always restrains itself within the limits of probability.
– THOMAS HENRY HUXLEY, Reflection (c. 1880)

Probability Spaces

In Chapter 1, we showed how you can use sample spaces Ω and events A ⊆ Ω to describe an experiment. In this chapter, we will define what we mean by the probability of an event, P(A), and the mathematics necessary to consistently compute P(A) for all events A ⊆ Ω. For infinite sample spaces (for example, Ω = R), we often cannot reasonably compute P(A) for all A ⊆ Ω. In those cases, we restrict our definition of P(A) to some smaller subset of events, F, that does not necessarily include every possible subset of Ω. The mathematical details are usually covered in advanced courses in probability theory.

Measuring sets

How would you measure a set?

• finite sets
• countable sets
• uncountable sets

Definition 3.1. A sample space, Ω, is the set of all possible outcomes of an experiment.

Definition 3.2. An event is any subset of the sample space.

Definition 3.3. A probability measure, P(·), is a function, defined on subsets of Ω, that assigns to each event E a real number P(E) such that the following three probability axioms hold:

1. P(E) ≥ 0 for all events E

2. P(Ω) = 1

3. If E ∩ F = Ø (i.e., if the events E and F are mutually exclusive), then P(E ∪ F) = P(E) + P(F)

The three specifications in the above definition are often called the Kolmogorov axioms for probability measures.

In calculus, you worked with real-valued functions f(·), f : R → R. Such functions assign each number in R a single number from R. For example, the function f(x) = x^3 maps x = 2 into f(2) = 8. Here, P(·) is a function whose argument is a subset of Ω (namely, an event) rather than a number. Note that every event A ⊆ Ω is assigned one and only one number P(A). That is, P(·) is a function that assigns to each member of F, the set of events, a single number from the interval [0, 1]. That is,

P : F → [0, 1]

In mathematics we call functions that assign real numbers to subsets measures. That's why we refer to P(·) as a probability measure. Under "Deductions from the axioms," below, we will show that 0 ≤ P(A) ≤ 1 for every A ⊆ Ω. A great deal of probability theory has its foundations in a mathematical field called measure theory, with important results from many mathematicians, including Emile Borel, Henri Lebesgue and Eugene Dynkin.

Interpretation of probability

How do you assign the real numbers in Definition 3.3?

Logical probability

This can be done when Ω is a finite set

Ω = {ω1, ω2, ..., ωn}

and each of the outcomes in Ω is equally likely. That is,

P({ωi}) = 1/n for i = 1, ..., n

In that case, for any A ⊆ Ω, assign

P(A) = |A|/n.

As an example, consider the outcome of rolling a fair six-sided die once. In that case Ω = {ω1, ..., ω6}, where

ωi = the die shows i pips, for i = 1, ..., 6.

So we can assign, for example,

P({ωi}) = 1/6 for i = 1, ..., 6
P({The die is even}) = P({ω2, ω4, ω6}) = 3/6

and, in general,

P(A) = |A|/6 for any A ⊆ Ω.

Logical probability is only useful when the mechanism producing the random outcomes has symmetry, so that one could argue that there is no reason for one outcome to occur rather than another. Some examples where logical probability can be used are:

tossing a fair coin: The coin is symmetrical and there is no reason for heads to turn up rather than tails.

rolling a fair die: The die is symmetrical and there is no reason for one side to turn up rather than another.

drawing a card randomly from a deck of playing cards: Each card is physically the same, and there is no reason for one card to be drawn rather than another.

As you can see, the notion of logical probability has limited applications.

Experimental probability

This interpretation requires that the experiment can be identically reproduced over and over again, indefinitely. Each replicate is called an experimental trial. We also must be able to assume that the outcome of any one of the trials does not affect the probabilities associated with the outcomes of any of the other trials. As an example, suppose we

examine transistors being produced by an assembly line, with each transistor an experimental trial. The outcome of one of our experimental trials is whether that transistor is either good or bad. In this case, Ω = {bad, good}. Let A = {good}. Consider the sequence of outcomes for the repeated experiment, T1, T2, ..., with

Ti = 0 if transistor i is bad
Ti = 1 if transistor i is good

and assign

P(A) = \lim_{n→∞} \frac{1}{n} \sum_{i=1}^{n} T_i

There are some important mathematical issues associated with defining P(A) in this way. In particular, the type of limit that you defined in calculus cannot be used. In calculus, for a prespecified small number δ > 0, there always existed some K, determined in advance, such that for all n > K the quantity (1/n) Σ_{i=1}^{n} Ti was arbitrarily close to P(A). That concept is not suitable here. It is possible, although not likely, that you could examine billions of transistors and find that all of them are bad. Suppose, for instance, that P(A) equaled 1/2. In that case, the quantity (1/n) Σ_{i=1}^{n} Ti would be zero, and quite far away from the eventual limit of 1/2, even as n gets bigger than billions.

We can, however, say that P(K < ∞) = 1. That is, it can be proven that there does exist some integer K such that for all n > K,

\left| \frac{1}{n} \sum_{i=1}^{n} T_i − P(A) \right| < δ.

The value of K is, itself, random and cannot be predetermined. The proof requires measure theory, and is typically covered in a graduate-level probability course.

Some examples where experimental probability can be used are:

quality assurance: Specifying the probability that a single microprocessor from a semiconductor fabrication facility will be defective.

interarrival times for queues: Stating the probability that the time between any two consecutive customers is greater than five minutes.

hitting a home run: Specifying the probability that a baseball player will hit a home run on any attempt.

Subjective probability

There are some cases where we would like to use probability theory, but it is impossible to argue that the experiment can be replicated. For example, consider the experiment of taking EAS305, with the outcome being the grade that you get in the course. Let

A = {you get an "A" in EAS305}

The procedure for determining the subjective probability of A is to construct an equivalent lottery consisting of an urn, and an unlimited quantity of red and white balls. The decision maker is asked to place in the urn any number of red balls and white balls he or she desires, with the goal of making the event of drawing a red ball equivalent to getting an "A" in EAS305. Once the decision maker has decided how many balls of each type to place in the urn, we simply count the balls in the urn and assign

P(A) = (number of red balls in the urn)/(total number of balls in the urn)

Not surprisingly, subjective probability has a lot of critics. There is a mathematical theory (complete with axioms and deductions) regarding the existence of such equivalent lotteries. And sometimes it is the only way we can give meaning to the number P(A). Some examples where subjective probability can be used are:

weather: Specifying the probability that it will be warm and sunny tomorrow.

football: Specifying the probability that the Buffalo Bills will win the Super Bowl next year.

auctions: Stating the probability that a painting by Monet will sell for one million dollars at an art auction.

Examples using logical probability

Example: A box contains three light bulbs, and one of them is bad. If two light bulbs are chosen at random from the box, what is the probability that one of the two bulbs is bad?

Solution 1: Assume that the bulbs are labelled A, B, and C and that light bulb A is bad. If the two bulbs are selected one at a time, there are (3)(2) = 6 possible outcomes of the experiment:

AB  BA  AC  CA  CB  BC

Note that 4 of the 6 outcomes result in choosing A, the bad light bulb. If these 6 outcomes are equally likely (are they?), then the probability of selecting the bad bulb is 4/6 = 2/3.

Solution 2: If the two bulbs are selected simultaneously (i.e., without ordering), there are C(3,2) = 3 possible outcomes of the experiment:

AB  AC  CB

and 2 of the 3 outcomes result in choosing bad bulb A. If these 3 outcomes are equally likely (are they?), then the probability of selecting the bad bulb is 2/3.

Example: A box contains two light bulbs, and one of them is bad. Suppose a first bulb is chosen, tested, and then placed back in the box. Then a second bulb is chosen and tested. Note that the same bulb might be selected twice. What is the probability that exactly one of the two tested bulbs is bad?

"Solution:" Suppose the two bulbs are labelled A and B and bulb A is the bad light bulb. There are three possible outcomes:

Both bulbs are good
Both bulbs are bad
Exactly one bulb is bad

So the probability that exactly one bulb is bad is 1/3. (Actually, this solution is wrong. Why is it wrong?)
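Both light-bulb examples can be checked by brute-force enumeration. The following sketch (ours, not from the notes) lists the equally likely outcomes directly and shows, in particular, that the correct answer to the second example is 1/2 rather than 1/3:

    from itertools import permutations, product

    # Example 1: bulbs A, B, C with A bad; two drawn one at a time.
    ordered = list(permutations("ABC", 2))                 # 6 equally likely outcomes
    print(sum("A" in pair for pair in ordered) / len(ordered))   # -> 0.666... = 2/3

    # Example 2: bulbs A, B with A bad; two draws WITH replacement.
    # The equally likely outcomes are the ordered pairs AA, AB, BA, BB.
    pairs = list(product("AB", repeat=2))
    exactly_one_bad = sum(pair.count("A") == 1 for pair in pairs)
    print(exactly_one_bad / len(pairs))                    # -> 0.5, not 1/3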

Relationships between English and set notation

• P(not E) = P(E^c)
• P(E or F) = P(E ∪ F)
• P(E and F) = P(E ∩ F)

Deductions from the axioms

Theorem 3.1. P(A^c) = 1 − P(A)

Proof.
P(Ω) = 1              from Axiom 2
P(A ∪ A^c) = 1        since Ω = A ∪ A^c
P(A) + P(A^c) = 1     since A ∩ A^c = Ø (Axiom 3)
P(A^c) = 1 − P(A)

Theorem 3.2. P(A) ≤ 1 for all A ⊆ Ω

Proof.
P(A) = 1 − P(A^c)     from Theorem 3.1
P(A) ≤ 1              since P(A^c) ≥ 0 (Axiom 1)

Theorem 3.3. A ⊆ B ⇒ P(A) ≤ P(B)

Proof.
A ∪ B = B                       since A ⊆ B
(A ∪ B) ∩ (A ∪ A^c) = B         since A ∪ A^c = Ω
A ∪ (B ∩ A^c) = B               Distributive law for sets
P(A ∪ (B ∩ A^c)) = P(B)
P(A) + P(B ∩ A^c) = P(B)        from Axiom 3, since A ∩ (B ∩ A^c) = Ø
P(A) ≤ P(B)                     since P(B ∩ A^c) ≥ 0 (Axiom 1)

DeMorgan's law: See Question 3.31.

Example: In a class of 100 students it is known that 80 of the students like hamburger. Consider the experiment of picking a student at random from the class of 100. Let A be the event that the selected student is one of those who likes hamburger.

In this case |A| = 80 and |Ω| = 100, and we can use logical probability to determine P(A) = 80/100 = 0.80. Let B be the event that the selected student is one of those who likes broccoli. Suppose |B| is unknown.

✓ Show that A ∩ B ⊆ A.

Solution: Pick an element ω ∈ A ∩ B. That implies that ω ∈ A and ω ∈ B. Hence, ω ∈ A. Therefore, A ∩ B ⊆ A.

✓ Describe the set A ∩ B.

Solution: A ∩ B is the event that the student selected likes hamburger and broccoli.

✓ Even if |B| is unknown, we know that P(A ∩ B) ≤ 0.80. Why?

Solution: For any events E and F, E ⊆ F implies P(E) ≤ P(F). Since A ∩ B ⊆ A, we have P(A ∩ B) ≤ P(A) = 0.80.

Theorem 3.4. For any finite number of pairwise disjoint sets A1, A2, ..., An we have

P(A1 ∪ A2 ∪ ··· ∪ An) = P(A1) + P(A2) + ··· + P(An)

Proof. (By induction) Is it true for n = 2? Yes, that's Axiom 3. Assume it's true for n = k; is it true for n = k + 1? Assume for k pairwise disjoint sets A1, A2, ..., Ak we have

P(A1 ∪ ··· ∪ Ak) = P(A1) + ··· + P(Ak)

Let A_{k+1} be an event such that A_i ∩ A_{k+1} = Ø for all i = 1, ..., k. Then

(A1 ∪ ··· ∪ Ak) ∩ A_{k+1} = Ø
        since A_i ∩ A_{k+1} = Ø for all i
P((A1 ∪ ··· ∪ Ak) ∪ A_{k+1}) = P(A1 ∪ ··· ∪ Ak) + P(A_{k+1})
        since (A1 ∪ ··· ∪ Ak) ∩ A_{k+1} = Ø
P((A1 ∪ ··· ∪ Ak) ∪ A_{k+1}) = (P(A1) + ··· + P(Ak)) + P(A_{k+1})
        from the induction hypothesis
P(A1 ∪ ··· ∪ Ak ∪ A_{k+1}) = P(A1) + ··· + P(A_{k+1})
        Associative law

Theorem 3.5. For any two events A and B,

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Proof.
A ∪ B = A ∪ (A^c ∩ B)           Distributive law for sets
B = (A ∩ B) ∪ (A^c ∩ B)         Distributive law for sets
Therefore
P(A ∪ B) = P(A) + P(A^c ∩ B)    from Axiom 3
P(B) = P(A ∩ B) + P(A^c ∩ B)    from Axiom 3
Subtracting the second equation from the first produces the result.

Example: A hydraulic piston can have only two types of defects:
1. a surface defect, and/or
2. a structural defect.
A single piston may have one, both or neither defect. From quality control records we see that
• The probability that a piston has a surface defect is 0.5

• The probability that a piston has a structural defect is 0.4
• The probability that a piston has some type of defect is 0.6

To find the probability that a piston has both a structural and a surface defect, first define the following events:

S = {a piston has a surface defect}
T = {a piston has a structural defect}

We want to find P(S ∩ T). We know that

P(S) = 0.5    P(T) = 0.4    P(S ∪ T) = 0.6

Therefore, using Theorem 3.5,

P(S ∪ T) = P(S) + P(T) − P(S ∩ T)
0.6 = 0.5 + 0.4 − P(S ∩ T)
P(S ∩ T) = 0.5 + 0.4 − 0.6 = 0.3

Theorem 3.6. P(Ø) = 0

Proof.
P(Ø) = 1 − P(Ø^c)    from Theorem 3.1
P(Ø) = 1 − P(Ω)      since Ø^c = Ω
P(Ø) = 1 − 1         from Axiom 2
P(Ø) = 0

In order to allow the extension of Axiom (3) in Definition 3.3 to apply to a countable collection of disjoint sets (rather than just a finite collection), the third probability axiom is replaced by

3*. If A1, A2, ... is a countable sequence of disjoint sets, then

P(∪_{i=1}^{∞} A_i) = Σ_{i=1}^{∞} P(A_i)

Example: Consider the experiment of counting the number of accidents in a factory in a month. Let E_i be the event that exactly i accidents occur in a month, with i = 0, 1, 2, .... Suppose that

P(E_i) = e^{−1}/i!    for i = 0, 1, 2, ...

and we want to find the probability that the factory has at least one accident in a month. That is, we want to compute P(∪_{i=1}^{∞} E_i). Since Σ_{i=0}^{∞} α^i/i! = e^α for any α, we have

Σ_{i=0}^{∞} P(E_i) = Σ_{i=0}^{∞} e^{−1}/i! = e^{−1} Σ_{i=0}^{∞} 1/i! = e^{−1} e^{1} = 1

Using axiom 3* and the fact that E_i ∩ E_j = Ø for any i ≠ j, we get

P(∪_{i=0}^{∞} E_i) = Σ_{i=0}^{∞} P(E_i) = 1
P(E_0 ∪ (∪_{i=1}^{∞} E_i)) = 1
P(E_0) + P(∪_{i=1}^{∞} E_i) = 1    since E_0 ∩ (∪_{i=1}^{∞} E_i) = Ø
P(∪_{i=1}^{∞} E_i) = 1 − P(E_0) = 1 − e^{−1}

THE BIRTHDAY PROBLEM

Suppose k randomly selected individuals are seated in a room. What is the probability that at least two of them will have the same birthday? How large must k be so that the probability is 0.50 that at least two of them will have the same birthday?

To solve this problem, we will make two assumptions:

1. there are n = 365 days in a year
2. each person is assigned a birthday with any of the n = 365 days being equally likely.

Define the events

E = {at least 2 birthdays are the same}
F = {all of the birthdays are different}

and notice that E = F^c. The question asks us to compute P(E). But P(E) = 1 − P(F), and it turns out that P(F) is easier to compute. To compute P(F), we'll use an approach that is common for most problems involving logical probability:

Step 1  Count the number of equally likely outcomes in Ω.
Step 2  Count the number of elements in F ⊆ Ω.
Step 3  Divide the number obtained in Step 2 by the number in Step 1.

It sounds simple, that is until you begin identifying and counting equally likely elements in Ω. Some textbooks¹ say something like "If we imagine the k people to be ordered, there are (365)^k corresponding permutations of their k birthdays." Why ordered? Why not? To illustrate the issues, we'll first solve the problem for k = 3 and n = 365. Later, we'll extend our solution for any k and any n.

Suppose k = 3, and let's suppose the individuals are ordered. Then every element of the sample space can be written as an ordered triple ω = (d1, d2, d3), where d1, d2 and d3 are the birthdays of the three individuals. We can begin to write down all of the elements of Ω as follows:

Ω = {(Jan01, Jan01, Jan01), (Jan01, Jan01, Jan02), (Jan01, Jan01, Jan03), (Jan01, Jan01, Jan04), ...}

The assignment of birthdays with ordered individuals can be represented as a sampling problem. The number of ways we can draw three balls from 365 with replacement and with ordering is (365)³.

¹ For just one example of many, see Richard J. Larsen and Morris Marx, An introduction to probability and its applications, Prentice-Hall, 1985.

Continuing with this reasoning, the number of ways that we can assign birthdays to individuals with everyone getting a different birthday is equal to the number of ways we can draw three balls from 365 without replacement and with ordering. And that number is (365)!/(365 − 3)!. So for k = 3 we get

P(E) = 1 − P(F) = 1 − [(365)!/(365 − 3)!] / (365)³

This turns out to be the correct answer. Furthermore, if we perform the same computations for any n and k ≤ n, the general correct answer is

P(E) = 1 − P(F) = 1 − [(n)!/(n − k)!] / (n)^k

To answer the question "how large must k be?," the following table lists P(E) for several different values of k with n = 365:

k     P(E)
15    0.253
22    0.475
23    0.507
40    0.891
50    0.970
70    0.999

and we see that we need at least 23 persons to insure that P(E) is at least 0.50.
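The table is easy to reproduce. Here is a short sketch (ours) that evaluates P(E) = 1 − [(n)!/(n − k)!]/(n)^k as a running product, which avoids computing the huge factorials directly:

    from math import prod

    def p_match(k, n=365):
        """P(at least two of k people share a birthday), ordered model."""
        p_all_different = prod((n - i) / n for i in range(k))
        return 1 - p_all_different

    for k in (15, 22, 23, 40, 50, 70):
        print(k, round(p_match(k), 3))   # 0.253, 0.475, 0.507, 0.891, 0.970, 0.999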

But why would we get the wrong answer if we don't order the k individuals? Let's see what this would mean for k = 3. Identifying the elements of Ω without ordering the individuals is equivalent to listing all of the ways we can draw 3 balls from an urn containing 365 balls, with replacement and without ordering. We can do this by creating a table with 365 columns (one for each calendar day) and asking the question, "how many ways can we arrange 3 check marks among the 365 columns?"

Jan01   Jan02   Jan03   ...
✓✓✓
✓✓      ✓
✓✓              ✓
...

Using this approach, each element of Ω is distinguishable by the number of times someone in the group is assigned the birthday Jan01, Jan02, etc., not who in the group is assigned Jan01, Jan02, etc. There is nothing wrong with identifying the elements of the sample space Ω in this way. But we would be wrong to assume that such elements are equally likely. In this case, using the unordered individual assumption, the total number of elements in Ω is

C(n + k − 1, k) = (n + k − 1)! / (k!(n − 1)!)

and the number of elements in F is

C(n, k) = n! / (k!(n − k)!)

so that

P(E) = 1 − P(F) = 1 − [n!(n − 1)!] / [(n + k − 1)!(n − k)!]

To show why the unordered individual assumption is wrong, let's use all of our formulas to solve a slightly modified version of the birthday problem:

The birthday problem on the planet Yak

There are two aliens from outer space in a room. They are from a planet called Yak. Yak has only two days in its calendar year. Once again, assume that birthdays are randomly assigned, with each day equally likely. We can again compute P(E), but this time with the values n = 2 and k = 2. Assigning birthdays at random to the aliens is the same as each alien tossing a fair coin to determine its birthday. The names of the days are "Heads" and "Tails".

Under the ordered individual model there are four elements in Ω, that is Ω = {HH, HT, TH, TT}, and

P(E) = 1 − [(2)!/(2 − 2)!] / (2)² = 1 − 2/4 = 1/2
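Before working through the unordered computation, it is worth confirming the ordered-model answer P(E) = 1/2 empirically. A quick simulation sketch (ours):

    import random

    random.seed(2)
    n = 100_000
    same = sum(random.choice("HT") == random.choice("HT") for _ in range(n))
    print(same / n)   # about 0.5: the two aliens share a birthday half the time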

Under the unordered individual model there are three elements in Ω, that is,

Ω = {2H, 2T, 1H1T}

that is, "two heads", "two tails" and "one head one tail." And the corresponding computation for P(E) yields

P(E) = 1 − P(F) = 1 − [2!(2 − 1)!] / [(2 + 2 − 1)!(2 − 2)!] = 1 − 2/6 = 2/3

And we see that the second (and faulty) formulation results in the same mistake that Jean d'Alembert² made in 1754.

Moral of the story: Sampling Case IV (with replacement and without ordering) usually cannot be used to enumerate equally likely outcomes resulting from multiple, repeated random assignments.³

² Jean d'Alembert, Encyclopédie ou dictionnaire raisonné des sciences, des arts et des métiers, Volume 4, 1754.
³ For an important exception, see the work of Bose and Einstein in statistical mechanics.

Conditional Probability and Independence

Conditional probability

Example: Roll a die twice. The sample space can be represented as in Figure 3.1.

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

Figure 3.1: Sample Space for Rolling a Die Twice

Define the events

A = {first roll is less than or equal to the second roll}
B = {the sum of the two rolls is 4}

Since each of the outcomes is equally likely, we can use logical probability to get P(B) = 3/36 = 1/12. Suppose you are told that the first roll is less than or equal to the second. With this additional information, what is the probability that the two rolls sum to 4?

Definition 3.4. The probability of an event A given the event B, denoted P(A | B), is the value that solves the equation

P(A ∩ B) = P(A | B)P(B)

Note: If P(B) > 0, this is equivalent to

P(A | B) = P(A ∩ B) / P(B)

Notation: Sometimes we write P(A ∩ B) = P(AB).

Theorem 3.7. For any events A1, A2, ..., An,

P(A1 A2 ··· An) = P(A1)P(A2 | A1)P(A3 | A1 A2) ··· P(An | A1 A2 ··· An−1)
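Definition 3.4 can be checked against brute-force enumeration of the 36 outcomes. A sketch (ours) for the two-roll example, using exact fractions:

    from fractions import Fraction

    omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
    A = [w for w in omega if w[0] <= w[1]]        # first roll <= second roll
    B = [w for w in omega if sum(w) == 4]         # the two rolls sum to 4

    def p(event):
        return Fraction(len(event), len(omega))

    A_and_B = [w for w in A if w in B]
    print(p(B))                # 1/12
    print(p(A_and_B) / p(A))   # P(B | A) = P(A ∩ B)/P(A) -> 2/21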

P(· | B) is a probability measure

Remember that P(·) is a probability measure if and only if

1. P(E) ≥ 0 for all events E
2. P(Ω) = 1
3. If E ∩ F = Ø then P(E ∪ F) = P(E) + P(F)

Suppose P(B) > 0 and consider P(· | B).

• For any event E ⊆ Ω,

P(E | B) = P(E ∩ B)/P(B) = [something ≥ 0]/[something > 0] ≥ 0

Therefore P(E | B) ≥ 0 for all events E.

• Also, we have

P(Ω | B) = P(Ω ∩ B)/P(B) = P(B)/P(B) = 1

Therefore P(Ω | B) = 1.

• And for any two events E and F with E ∩ F = Ø,

P(E ∪ F | B) = P((E ∪ F) ∩ B)/P(B)
             = P((E ∩ B) ∪ (F ∩ B))/P(B)
             = [P(E ∩ B) + P(F ∩ B)]/P(B)
             = P(E ∩ B)/P(B) + P(F ∩ B)/P(B)
             = P(E | B) + P(F | B)

Therefore if E ∩ F = Ø then P(E ∪ F | B) = P(E | B) + P(F | B).

• Hence P(· | B) is a probability measure.

• We can think of P(· | B) as the updated version of P(·) given B.

• In fact, every event A ⊆ Ω produces an updated probability measure P(· | A).

• When interpreting P(E | A) we are asking the "what if" question: "what is the probability of E if the event A occurs?"

• We can evaluate and use P(E | A) and P(E | B) simultaneously for different events A and B.

Independence

Definition 3.5. The events A and B are independent if and only if P(A ∩ B) = P(A)P(B).

Note: If P(B) > 0 this is equivalent to saying that A and B are independent if and only if P(A | B) = P(A).

Theorem 3.8. If A and B are independent events then

• A and B^c are independent events,
• A^c and B are independent events, and
• A^c and B^c are independent events.

Proof. Suppose A and B are independent events. Then

P(A ∩ (B ∪ B^c)) = P(A)
P((A ∩ B) ∪ (A ∩ B^c)) = P(A)
P(A ∩ B) + P(A ∩ B^c) = P(A)
P(A)P(B) + P(A ∩ B^c) = P(A)
P(A ∩ B^c) = P(A) − P(A)P(B)
P(A ∩ B^c) = P(A)(1 − P(B))
P(A ∩ B^c) = P(A)P(B^c)

Therefore A and B^c are independent events. The other two results follow by interchanging the roles of A and B.

Bayes' theorem

Definition 3.6. A complete set of alternatives is a collection of events B1, B2, ..., Bn such that

1. B_i ∩ B_j = Ø for any i ≠ j, and
2. B1 ∪ B2 ∪ ··· ∪ Bn = Ω.

Theorem 3.9. (Law of Total Probability) Given an event A and a complete set of alternatives B1, B2, ..., Bn, then

P(A) = Σ_{i=1}^{n} P(A ∩ B_i) = Σ_{i=1}^{n} P(A | B_i)P(B_i)

Example: (from Meyer⁴) Suppose that among six bolts, two are shorter than a specified length. If two bolts are chosen at random, what is the probability that the two short bolts are picked? Let

A_i = {the ith chosen bolt is short}

The common but incorrect solution is obtained by writing

P(A1 ∩ A2) = P(A2)P(A1) = (1/5)(2/6) = 1/15

Although the answer is correct, the value of P(A2) is not 1/5. But the value of P(A2 | A1) is 1/5. The proper solution is obtained by writing

P(A1 ∩ A2) = P(A2 | A1)P(A1) = (1/5)(2/6) = 1/15

To evaluate P(A2) properly, we note that {A1, A1^c} is a complete set of alternatives. We can then use the law of total probability, and write

P(A2) = P(A1)P(A2 | A1) + P(A1^c)P(A2 | A1^c) = (2/6)(1/5) + (4/6)(2/5) = 1/3

⁴ P. Meyer, Introductory probability theory and statistical applications, Addison-Wesley, Reading MA, 1965.

Theorem 3.10. (Bayes' Theorem) Let B1, B2, ..., Bn be a complete set of alternatives. Then

P(B_k | A) = P(A | B_k)P(B_k) / Σ_{i=1}^{n} P(A | B_i)P(B_i)
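The distinction between P(A2) and P(A2 | A1) in the bolt example shows up clearly in simulation. A sketch (ours):

    import random

    random.seed(3)
    n = 200_000
    both_short = second_short = 0
    for _ in range(n):
        first, second = random.sample(["S", "S", "L", "L", "L", "L"], 2)
        if second == "S":
            second_short += 1
            if first == "S":
                both_short += 1
    print(both_short / n)     # about 1/15 = P(A1 ∩ A2)
    print(second_short / n)   # about 1/3  = P(A2); note that it is not 1/5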

Example: Thousands of little snails live in the Snail Garden between the University at Buffalo Natural Sciences Complex and Talbert Hall. A concrete stairway divides the garden into two sections. When it rains, the snails bravely travel from one section of the garden to the other by crawling across the steps of the concrete stairway. It is a treacherous journey. Many of the snails are crushed by the feet of unthinking students and faculty walking up and down the stairway.

There are two types of snails, smart snails and not-so-smart snails. A smart snail crawls along the vertical portion of a step to avoid being crushed. A not-so-smart snail crawls on the horizontal part of a step. The probability that a smart snail will be crushed is 0.10. The probability that a not-so-smart snail will be crushed is 0.40. Suppose that half of the snails are smart snails, and the other half are not-so-smart snails. The event that any one snail will be crushed is independent from all others.

✓ To find the probability that a single snail (chosen at random) will successfully cross the concrete steps uncrushed, define the events

S = {The snail is a smart snail}
S^c = {The snail is a not-so-smart snail}
A = {The snail is successful}

Note that S ∪ S^c = Ω and S ∩ S^c = Ø. Therefore, using the Law of Total Probability,

P(A) = P(A | S)P(S) + P(A | S^c)P(S^c) = (0.90)(0.50) + (0.60)(0.50) = 0.75

✓ Suppose that after the end of a rainstorm, you select, at random, a snail that has successfully crossed the concrete steps. What is the probability that it is a smart snail? In this case, we want to find

P(S | A) = P(A | S)P(S) / [P(A | S)P(S) + P(A | S^c)P(S^c)]
         = (0.90)(0.50) / [(0.90)(0.50) + (0.60)(0.50)] = 0.60

The probability of selecting a smart snail before the rainstorm was 0.50. The probability after the rainstorm is 0.60. What is the probability of finding a smart snail after two rainstorms?
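The posterior computation is a single application of Bayes' theorem, and the same update can be applied again with 0.60 as the new prior. A sketch for experimenting with repeated rainstorms (ours; the function name is hypothetical):

    def posterior_smart(prior, p_success_smart=0.90, p_success_dumb=0.60):
        """P(smart | successful crossing), starting from the given prior."""
        numerator = p_success_smart * prior
        return numerator / (numerator + p_success_dumb * (1 - prior))

    p = 0.50
    p = posterior_smart(p)       # after one rainstorm -> 0.60
    print(p)
    print(posterior_smart(p))    # after a second rainstorm (compare with your answer)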

Independence of more than two events

The definition of independence can be extended to more than two events.

Definition 3.7. A finite collection of events A1, ..., An are independent if and only if

P(A_{k1} ∩ ··· ∩ A_{kj}) = P(A_{k1}) ··· P(A_{kj})

for all 2 ≤ j ≤ n and all 1 ≤ k1 < ··· < kj ≤ n.

Note: For n = 3, the events A1, A2, A3 are independent if and only if

(1) P(A1 ∩ A2) = P(A1)P(A2)
(2) P(A1 ∩ A3) = P(A1)P(A3)
(3) P(A2 ∩ A3) = P(A2)P(A3)
(4) P(A1 ∩ A2 ∩ A3) = P(A1)P(A2)P(A3)

All of the above must hold for A1, A2, A3 to be independent. If only the first three conditions hold, then we say that A1, A2, A3 are pairwise independent. Three events A, B and C can be pairwise independent, but not independent, as shown by the following example:

Example: Toss a fair coin twice, and assign probability 1/4 to each of the outcomes HH, HT, TH, TT. Let

A = {first toss is a head}
B = {second toss is a head}
C = {first toss equals second toss}

Then

P(A ∩ B) = 1/4 = P(A)P(B)
P(A ∩ C) = 1/4 = P(A)P(C)
P(B ∩ C) = 1/4 = P(B)P(C)
P(A ∩ B ∩ C) = 1/4 ≠ 1/8 = P(A)P(B)P(C)

Hence A, B, C are pairwise independent but not independent.

It is also possible to have three events A1, A2 and A3 where equation (4) holds, but equations (1), (2) and (3) all fail. This is demonstrated by the following example:

Example: Suppose that two dice are tossed, and let Ω be all ordered pairs (i, j) for i, j = 1, 2, ..., 6, with probability 1/36 assigned to each point. Let

A = {second die is 1, 2 or 5}
B = {second die is 4, 5 or 6}
C = {the sum of the faces is 9}

Then

P(A ∩ B) = 1/6 ≠ 1/4 = P(A)P(B)
P(A ∩ C) = 1/36 ≠ 1/18 = P(A)P(C)
P(B ∩ C) = 1/12 ≠ 1/18 = P(B)P(C)
P(A ∩ B ∩ C) = 1/36 = P(A)P(B)P(C)

Example: Suppose you roll a fair die repeatedly until a six appears. Assume all the outcomes from each roll are independent. Let

S_k = {toss k is a six}
A_n = {the first six occurs on toss n}

Then

P(S_k) = 1/6    P(S_k^c) = 5/6    for k = 1, 2, ...

Furthermore,

A_n = S_1^c ∩ S_2^c ∩ ··· ∩ S_{n−1}^c ∩ S_n
P(A_n) = P(S_1^c)P(S_2^c) ··· P(S_{n−1}^c)P(S_n) = (5/6)^{n−1} (1/6)    for n = 1, 2, ...

We can also compute, for k = 0, 1, 2, ...,

P(first 6 occurs after toss k) = Σ_{n=k+1}^{∞} P(A_n) = Σ_{n=k+1}^{∞} (5/6)^{n−1} (1/6)

Letting n′ = n − (k + 1), this becomes

P(first 6 occurs after toss k) = Σ_{n′=0}^{∞} (5/6)^{k+n′} (1/6)
                               = (1/6)(5/6)^k Σ_{n′=0}^{∞} (5/6)^{n′}
                               = (1/6)(5/6)^k · 6
                               = (5/6)^k    for k = 0, 1, 2, ...
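A simulation sketch (ours) confirms both formulas before the intuitive argument that follows:

    import random

    random.seed(4)
    trials = 100_000

    def first_six():
        n = 1
        while random.randint(1, 6) != 6:
            n += 1
        return n

    results = [first_six() for _ in range(trials)]
    print(sum(r == 3 for r in results) / trials)   # about (5/6)**2 * (1/6) = 0.1157
    print(sum(r > 4 for r in results) / trials)    # about (5/6)**4         = 0.4823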

Which makes sense, since the first 6 occurs after toss k if and only if you get k non-6's in a row. So

{first 6 occurs after toss k} = S_1^c ∩ S_2^c ∩ ··· ∩ S_k^c
P(first 6 occurs after toss k) = P(S_1^c)P(S_2^c) ··· P(S_k^c) = (5/6)^k    for k = 0, 1, 2, ...

Self-Test Exercises for Chapter 3

For each of the following multiple-choice questions, choose the best response among those provided. Answers can be found in Appendix B.

S3.1 If P(A) = 0.3, P(B) = 0.3 and P(A | B) = 0.3 then
(A) A and B are mutually exclusive
(B) P(A ∩ B) = 0
(C) A and B are independent
(D) all of the above
(E) none of the above.

S3.2 Let P(A) = P(B) = 2p and let P(A ∪ B) = 3p. Which of the following are always true?
(A) p = 0
(B) A and B are independent
(C) P((A ∩ B)^c) = 1 − p
(D) A and B are mutually exclusive
(E) none of the above.

S3.3 If A ⊆ B, then which of the following statements are always true?
(A) A and B are independent
(B) P(A) > P(B)
(C) A and B are mutually exclusive
(D) A ∪ B = A
(E) none of the above

S3.4 If P(A) = p, P(B) = p and P(A ∩ B) = p² then
(A) A ∪ B = Ω
(B) A = B
(C) A and B are independent
(D) all of the above
(E) none of the above.

S3.5 If P(A) = p > 0, P(B) = p > 0 and P(A ∪ B) = 2p then P(B | A ∪ B) equals
(A) 1
(B) p
(C) 1/2
(D) p²
(E) none of the above

S3.6 Two fair coins are tossed once. One coin is painted red, and the other is painted blue. Let R denote the outcome of the red coin (Heads or Tails). Let B denote the outcome of the blue coin. Then P(R = Heads | R = B) is equal to
(A) 1/4
(B) 1/3
(C) 1/2

(D) 1
(E) none of the above.

S3.7 Two indistinguishable six-sided fair dice are rolled. What is the probability that they both turn up the same value?
(A) 1/72
(B) 1/36
(C) 1/12
(D) 1/6
(E) none of the above.

S3.8 Suppose P(A) = 0. Then
(A) A = Ø
(B) P(A ∩ A) = P(A)P(A)
(C) P(A ∪ B) = 0 for any B ⊆ Ω.
(D) all of the above are true
(E) none of the above are true.

S3.9 Which of the following statements are always true?
(A) P(Ω) = 1
(B) If P(A) < P(B) then A ⊆ B
(C) If P(A) = 0 then A = Ø
(D) all of the above
(E) none of the above.

S3.10 Suppose P(A) = p, P(B) = q. If A ⊆ B and A ≠ B then
(A) p ≥ q
(B) p ≤ q
(C) p > q
(D) p < q

(E) none of the above.

S3.11 There is one lock on a door and the key is among the six different ones you usually carry in your pocket. Someone places a seventh, useless key, in your pocket. What is the probability that the first key you try will open the door?
(A) 1/6
(B) 1/2
(C) 1/3
(D) 2/3
(E) none of the above.

S3.12 A multiple choice test has 12 questions, with each question having 5 possible answers. If a student randomly picks the answer to each question, what is the probability that the student will answer all questions correctly?
(A) 1/(12⁵)
(B) 1/(5¹²)
(C) 1/C(12,5)
(D) 1/C(16,5)
(E) none of the above.

S3.13 Two balls are dropped randomly into four labelled boxes, one at a time. A box may contain more than one ball. The probability that both balls fall in the same box is
(A) 4/C(7,2)
(B) 1/4
(C) 1/16
(D) it depends on whether or not the balls are different colors
(E) none of the above.

S3.14 A student club has four officers; three of them are seniors and one of them is a junior. Three of the students will be picked at random to go on a trip. What is the probability that the junior stays home?

(A) 1/4
(B) 1/C(6,3)
(C) 1/4!
(D) 1/4³
(E) none of the above.

S3.15 Suppose that five tiles labelled with the letters C, C, C, A, T are placed in a container. If three tiles are selected from the container (without replacement), what is the probability that they can be arranged to spell the word CAT?
(A) 1
(B) 1/3
(C) 1/C(5,3)
(D) 3/C(5,2)
(E) none of the above.

S3.16 Suppose A is any event. Then P(A) + P(A^c) equals
(A) 0
(B) P(A)
(C) P(A^c)
(D) 1
(E) none of the above.

S3.17 If A ⊆ B with P(A) = P(B) > 0 then
(A) P(A^c ∩ B) = 0
(B) A = B
(C) A^c ∩ B = Ø
(D) all of the above are true
(E) none of the above are true.

S3.18 Which of the following statements are always true?

(A) P(Ω) = 1
(B) If A ⊆ B then P(A) ≤ P(B)
(C) P(A ∩ B^c) ≤ P(A) for any event B
(D) all of the above are true.
(E) none of the above are true.

S3.19 Suppose P(A) = 1 and P(B) = 0.25. Which of the following statements are always true?
(A) A and B are independent
(B) A ∩ B = B
(C) B ⊆ A
(D) all of the above are true.
(E) none of the above are true.

S3.20 A multiple choice test has 5 questions, with each question having 4 possible answers. If a student randomly picks the answer to each question, what is the probability that the student will answer all questions correctly?
(A) 1/(4⁵)
(B) 1/4
(C) 1/C(5,4)
(D) 1/C(8,4)
(E) none of the above.

S3.21 A fair die is tossed twice. Given that a 2 appears on the first toss, the probability that the sum of the two tosses is 7 equals
(A) 1/36
(B) 1/18
(C) 1/9
(D) 1/6
(E) none of the above.

S3.22 If A ∩ B ∩ C = Ø then which of the following must be true
(A) A ∩ B = Ø
(B) P(A ∩ B) = 0
(C) P(A ∩ B ∩ C) = 0
(D) all of the above are true
(E) none of the above.

S3.23 Four fair dice are rolled simultaneously. The probability that exactly two of them turn up with a value less than or equal to 4 is
(A) 24/81
(B) 1/3
(C) 4/9
(D) 1/2
(E) none of the above

S3.24 If P(A) ≤ P(A^c) then
(A) 0 ≤ P(A) ≤ 1/2
(B) P(A) = 0
(C) A ⊆ A^c
(D) P(A) can be any value between 0 and 1
(E) none of the above.

S3.25 If P(A) = 1/4 and P(B) = 1/2 then
(A) P(A ∪ B) = 1/2
(B) P(A ∪ B) ≤ 1/2
(C) A ∩ B = Ø
(D) A = B
(E) none of the above.

S3.26 The probability that the birthdays of twelve people will fall in twelve different calendar months (assuming equal probabilities for the twelve months) is
(A) 12!/(12¹²)
(B) 1/(12!)
(C) 1/12
(D) 1/(12¹²)
(E) none of the above.

S3.27 In a screw factory, there are three machines A, B and C that produce screws. Machine A produces 20% of the daily output. Machine B produces 30% of the daily output. Machine C produces the remaining 50% of the daily output. Of their output, 5%, 4% and 2% of the screws are defective, respectively. A screw is chosen at random from a day's production and found to be defective. What is the probability that it came from machine A?
1. 0.01
2. 0.032
3. 0.2
4. 0.3125
5. none of the above.

S3.28 A box contains fifty light bulbs, exactly five of which are defective. If five bulbs are chosen at random from the box, the probability that all of them are good is equal to
(A) (0.01)⁵
(B) 1/C(50,5)
(C) 5/C(50,5)
(D) (45!)²/(40!50!)
(E) none of the above.

S3.29 Suppose A and B are independent events with P(A) = p and P(B) = q. Then P(A ∩ B^c) equals

(A) pq
(B) 1 − pq
(C) p − pq
(D) q − pq
(E) none of the above.

S3.30 Suppose A and B are independent events with P(A) = p and P(B) = q. Then which of the following statements are true:
(A) P(A ∩ B) = pq
(B) P(A ∩ B^c) = p − pq
(C) P(A^c ∩ B) = q − pq
(D) all of the above are true
(E) none of the above are true.

S3.31 An urn contains 5 red balls and 5 black balls. One ball is drawn from the urn at random and then thrown away. A second ball is then drawn from the urn. The probability that the second ball is red equals
(A) 1/3
(B) 4/9
(C) 1/2
(D) 5/9
(E) none of the above.

S3.32 Suppose birthdays are assigned to people randomly, with each of the 365 days of the year equally likely. A classroom has 364 students. What is the probability that they all have different birthdays?
(A) 0
(B) 365!/(365)³⁶⁴
(C) 1/365
(D) (365!364!)/728!
(E) none of the above.

S3.33 A box contains 10 incandescent lamps. Five of them are good, five are bad. The following circuit contains two such incandescent lamps in series and a (good) battery sufficient to light both lamps:

[Figure: two lamps wired in series with a battery]

Both lamps must be good in order for either to light. Two lamps are randomly chosen from the box. The probability that the lamps will light is
(A) 2/9
(B) 1/4
(C) 1/2
(D) 3/4
(E) none of the above.

S3.34 Which of the following statements (A), (B) or (C) are not always true?
(A) P(E ∩ F) = P(E)P(F) for any events E and F
(B) P(Ω) = 1
(C) P(E) ≥ 0 for any event E
(D) all of the above are always true.
(E) none of the above are always true.

S3.35 Suppose
P(A | B) = 0.4
P(A | B^c) = 0.8
P(B) = 0.5
Then P(B | A) equals
(A) 0.2
(B) 1/3
(C) 0.5
(D) 0.6
(E) none of the above.

S3.36 A study of automobile accidents produced the following data:

Model year    Proportion of all vehicles    Probability of involvement in an accident
2002          0.10                          0.02
2003          0.20                          0.01
Other         0.70                          0.05

You know that an automobile has been involved in an accident. What is the probability that the model year for this automobile is 2003?
(A) 0.0200
(B) 0.0390
(C) 0.0513
(D) 0.3333
(E) none of the above.

Questions for Chapter 3

3.1 Consider the sample space Ω = {1, 2, ..., n}. Let |A| denote the number of elements in the set A. For any event, A, define P(A) ≡ |A|/n. Show that P(·) satisfies the three probability axioms.

3.2 Let P be a probability measure. Show that if A ⊂ B then P(B \ A) = P(B) − P(A). Hint: B = A ∪ (B \ A).

3.3 There is one lock on the door and the correct key is among the six different ones you usually carry in your pocket. However, in haste, you've lost a key at random. What is the probability that you can still open the door? What is the probability that the first key you try will open the door?

3.4 There are two locks on the door and the keys are among the six different ones you usually carry in your pocket. However, in haste, you've lost one. What is the probability that you can still open the door? What is the probability

with events A. Show that V is a probability measure. What is the value of P (S ∪ T | T )? 3. but only one leads to the outside.5 P (B) = 0. For any piece of land. show that V satisfies the three probability axioms. the relative value with respect to that of the kingdom is then given by the formula V (S) = P (S ∩ A) + 3P (S ∩ B) + 2P (S ∩ C) 2 where P (D) = (area of D)/(area of Ω) for any region D.8 Suppose that the land of a square kingdom. i. never reentering an unsuccessful manhole. B and C of equal area.3 P (A ∪ B) = 0.2 P (A ∪ C) = 0. is divided into three strips. There are 4 manholes in the prison. what is the value of P (S ∩ T | S)? 3. 3. B and C and probability measure P (·).7 Suppose P (S) = p and P (T ) = q with 0 < p < 1 and 0 < q < 1.6 P (C) = 0.5 P (A ∪ B ∪ C) = 0. such that P (A) = 0..68 Questions for Chapter 3 that the first two keys you try will open the door? (Note: There are two interpretations. Ω.5 A convict is planning to escape prison by leaving through the prison sewer system.) 3. A.9 Consider a sample space. and suppose that the value per unit is in the ratio of 1:3:2. What is the probability that the prisoner must try exactly 1 manhole? 2 manholes? 3 manholes? 4 manholes? 3.4 (a) Compute each of the following.3 P (B ∪ C) = 0. and the events S and T are independent. The prisoner plans to enter each of the manholes at random. . justifying your answer: P (A ∩ B ∩ C) P (Ac ∪ B c ) P (A ∩ B ∩ C c ) P (A ∩ B) P (A | B) (b) Are A and B mutually exclusive? Justify your answer.6 If P (S) = p and P (T ) = q with p > 0. in this kingdom. Ω.e. S .

Monty Hall: Now box A and box C are on the table and here is box B (contestant grips box B tightly). Monty provided a short.)6 This problem has appeared in various forms. 6 To Monty Hall’s credit.10 (from Selvin5 ) It’s “Let’s Make a Deal” – the famous TV show starring Monty Hall! Monty Hall: One of the three boxes labelled A. 29. I’ll give you $10.000. If you choose the box containing the keys.” The American Statistician. Monty Hall: I’ll give you $2000. “On The Monty Hall Problem.5? Why does the contestant want to trade boxes? (Hint: The contestant is a former EAS305 student. and there are several claims for its origin. I think I’ll keep this box. The keys might be in your box! I’ll give you $500 for the box. Vol. August 1975. it is claimed that Monty wrote a letter to Steve Selvin. statement of 5 . No. you win the car.Questions for Chapter 3 69 3. Contestant: I’ll take box B. B and C contains the keys to that new car worth $50.000 cash for your 2 box. Since there are two boxes left. Contestant: No. In that letter. It’s empty! (Audience: applause). Contestant: No. 3. The other two are empty. the original problem appeared in the paper by Steve Selvin. is the contestant’s probability of winning the car 0. once Selvin’s paper appeared in American Statistician. Contestant: Gasp! Monty Hall: Select one of these boxes. To my knowledge. but I’ll trade you my box B for box C on the table. Contestant: No thank you. but clear. Now either box C or your box B contains the car keys. Monty Hall: I’ll do you a favor and open one of the remaining boxes on the table (he opens box A). Is Monty right? Assuming Monty knows which of the three boxes contains the keys. the probability of your box containing the keys is now 1 .

A and B .11 Given two mutually exclusive events A and B .13 A computer can send any one of three characters (A. what is the probability that a B was transmitted? 3. it is possible that the printer will not print the same character as transmitted. The probability that an A will be incorrectly printed as a B is 0. but without any verification of its authenticity. What is the probability that it is correctly printed? (b) Suppose that two characters are transmitted independently.12 A warehouse receives the same model of a coffee maker from two different factories.15 A person gives you a six-sided die and tells you that it is a fair die. (a) You know that two particular coffee makers came from the same factory. B or C ) to a printer. past history has shown that 0. The text of Monty’s letter is available on several web sites. due to transmission errors. Monty also cautioned that there is a difference between the actual probability that the keys are in B and the contestant’s perceived probability.2.14 John throws six dice and wins if he scores at least one ace.70 Questions for Chapter 3 3.10 of the coffee makers from A are defective while 0. can A and B be independent? Why? 3. You test it and discover it is defective. What is the probability that the other one is also defective? (b) You chose another coffee maker at random from those in the warehouse. One is defective.50 of the coffee makers from B are defective. and a probability measure P (·) such that P (A) > 0 and P (B) > 0. You roll the die 100 times and 99 of the 100 rolls turns up 4. Other printer errors are impossible. However. What is the probability that it came from Factory B ? 3.5. What is the probability that two A’s will be printed? (c) Suppose that a single character is transmitted. During each transmission. each character has an equal chance of being sent. The probability that an C will be incorrectly printed as a B is 0. (a) Suppose that a single A is transmitted. If a B is printed. It is known that any one coffee maker has an equal chance of originating from A or B . Who has the advantage? 3. Should you accept the claim that die is a fair die? Why or why not? the correct answer to this question. However. . Mary throws twelve dice and wins if she scores at least two aces.

what is the probability of such an event? 3. he is clumsy. Find the probability of this event assuming tickets are given out on random days. Every one of the 12 tickets was given on either a Tuesday or Thursday. four boys take turns at washing dishes. there are 6 cars parked. in such a way that every cell has an equal chance of receiving a ball.Questions for Chapter 3 71 3.19 Consider a box partitioned into 3 cells.16 A professor parks illegally on campus every night and has received twelve tickets for overnight parking. what is the probability of incorrectly announcing that the youngest is clumsy.18 A parking lot attendant noticed that in a section where there are 10 parking spaces arranged in a row. adopt the following decision rule: If the youngest brother breaks three or more of four broken dishes. What is the probability that each cell is occupied? 3. each independently of the other. Reanswer (a) and (b). Using this decision rule. Two balls are dropped into the box. thus.21 de M´ r´ ’s paradox: Which is more probable: (a) to get at least one ace with ee four dice. 1950.” Was he justified in attributing the frequency of his breakage to chance? Hint: Before the four dishes are broken. but in such a way so that the four empty places were adjacent to each other. 3. Wiley. Out of a total of four broken dishes. or (b) to get at least one double ace in 24 throws of two dice? 7 This is a problem from the text: William Feller. three were caused by the youngest and he was thereafter called “clumsy. we will announce that the four dishes were not randomly broken and. . list all possible arrangements. A cell may contain more than one ball.20 (from Feller7 ) In a certain family.17 Let n balls be placed “randomly” into n cells. Feller is often regarded as the first person to organize probability theory as a self-contained subject. (a) If the balls are indistinguishable. when the process of breaking dishes is actually random? 3. An introduction to probability theory and its applications. Assuming that car owners pick a parking place at random among the spaces available. (b) What is the probability of each arrangement? (c) Suppose that no cell can have more than one ball. Was his renting a garage only for Tuesdays and Thursdays justified? 3.

2: Example of a ‘Block’ for Bell Elementary School 3. (a) How many different ways could the postal carrier deliver the letters? (b) What is the probability that everyone gets the correct letter? (c) What is the probability that at least two people get the wrong letters? 3. If no new passengers are picked up.2 for an example of such a block).72 Questions for Chapter 3 B B B B G G B B G G B B B B B B (B = Boy. What is the probability that the four girls will be seated together in a block of two rows and two columns somewhere in the classroom.25 Fifty students have rented 3 busses for a class trip. (b) If the seats are assigned randomly. in his state-of-mind. (See Figure 3. what is the probability that each row contains exactly one girl? 3. the carrier places the letters randomly into the n mailboxes.24 A classroom in the Bell Elementary School contains 16 seats with four rows of four seats each. However. A class is composed of 4 girls and 12 boys.23 After a night on the town. Answer the following questions assuming that each student chooses his bus for the trip at random: (a) What is the probability that every bus has at least one rider? .22 A bus starts with r passengers and stops at n bus stops (r ≤ n). (a) The students in the class are assigned seats at random. what is the probability that no two passengers leave at the same stop? What assumptions are you making? 3. the neighborhood postal carrier decides to deliver n letters addressed to n different individuals. Each bus can hold 50 riders. one letter per box. G = Girl) Figure 3.

Provide an example to show that the converse is not true. B and C P (A ∪ B ∪ C) = P (A) + P (B) + P (C) −P (A ∩ B) − P (A ∩ C) − P (B ∩ C) +P (A ∩ B ∩ C) 3. A = B ⇐ P (A) = P (B) 3. If each student casts his vote independently and at random for one of the two candidates. 5.31 We know that A = B ⇒ P (A) = P (B) Note that this is used in many of the proofs in this chapter. A ⊂ B and A = B ). n boys take turns at washing dishes. 000 voters is composed of 10.26 If an ordinary deck of 52 cards is dealt to four hands of 13 cards each. what is the probability that the player who gets the ace of spades will also get the ace of hearts? 3. 000 Democrats and 5. That is. From a recent survey.3 is still true even if we replace ⊆ by proper subset (i.28 In a family. 3. 000 Republicans.29 Let Ω be a sample space with probability measure P (·).e.32 A community of 20. Out of a total of d broken dishes. What is the probability that the youngest child breaks exactly x dishes? 3.Questions for Chapter 3 73 (b) What is the probability that all students sign up for the same bus? 3.27 A class of 100 students vote for class president among two candidates. while 5% of the Democrats favor JK. it is known that 80% of the Republicans favor Presidential candidate JK. (a) What is the probability that a voter chosen at random from the community will favor JK? . 000 independents. Provide an example where A ⊂ B (with A = B ) and P (A) = P (B).30 Show that Theorem 3. Show that for any events A.. what is the probability of a tied election? 3. and 20% of the independents favor him.

Suppose the machine has the probability 0. LED’s are packaged and shipped in bulk using large cardboard barrels. In the dark.36 The Colorful LED Company manufacturers both green and red light emitting diodes (LED’s). each character being produced by a unique impulse. What is the probability that the impulse chosen was the one designed to produce a C ? 3. Suppose that P (A) = 0. What is the probability that the impulse chosen was the one designed to produce a C ? (b) Suppose that one of the 26 impulses is chosen at random and fed into the machine twice. a second barrel was filled with a mixture of LED’s. the probabilities that any of the 25 other characters will actually appear are equal. can it be determined whether the LED gives off green or red light. and the character C is printed both times. Let P (B) = p. 4 red socks and 2 green socks. (a) Suppose that one of the 26 impulses is chosen at random and fed into the machine. she picks two socks at random from the basket. has a basket that contains 5 blue socks. the red and green LED’s look alike.4 while P (A∪B) = 0. What’s the probability that both socks are the same color? 3. What is the probability that the voter is a Republican? 3. If it prints the wrong character.35 Mary.74 Questions for Chapter 3 (b) Suppose a voter is chosen from the community and it is learned to favor JK. Only when an electrical current is applied. one barrel of red LED’s was produced. The barrels have no distinguishing markings. (a) For what choice(s) of p is P (A ∩ B) = 0? (b) For what choice(s) of p are A and B indpendent? 3. accidentally.9 of correctly producing the character corresponding to the impulse received.34 Let A and B be two events. On a busy production day. and that the character C is printed. When turned off. independent of past behavior. In addition. a university student. one fourth green and three quarters red. .7.33 A printing machine capable of printing any of the 26 letters (A through Z ) is operated by electrical impulses.

(a) You decide to try each door (one at a time) at random. Let Ek denote the event that you require exactly k escape attempts. What is the probability that you need more than three attempts? (d) Assume that the aliens zap you with their anti-memory beam (as in part (b)). you know how to expertly pilot the spacecraft back to Earth. (b) Anticipating your plan to escape. you may need more than three attempts to escape.Questions for Chapter 3 75 (a) Suppose that one of the two barrels is chosen at random. Because of your engineering degree. (c) Assuming that the aliens zap you with their anti-memory beam (as in part (b)). Two of the doors lead to corridors that return you back to your one-room cell. each time you return to your cell. every time you attempt to escape.37 You have been captured by aliens from outer space and held in a one-room cell on their planet. What is the probability that it is the barrel with a mixture of green and red LED’s? (b) Suppose that one of the two barrels is chosen at random and then an LED is tested. Hence. you choose randomly from all three doors. What is the probability that it is a green LED? (c) Suppose that one of the two barrels is chosen at random. Assume that after a failed escape attempt you never choose a previously chosen door. Only one of the three doors leads to an escape spacecraft. P (E2 ) and P (E3 ). Find the values of P (E1 ). Find a formula to compute the value of P (Ek ) for any positive integer k . an LED is tested. B and C . the space aliens zap you with a electromagnetic beam that causes you to forget which doors you have previously chosen. Find the values of P (E1 ). The room has three doors labelled A. P (E2 ) and P (E3 ). What is the probability that it is the barrel with a mixture of green and red LED’s? 3. . and it turns out to be RED.

39 Jelly beans are packaged in one-pound bags.)). and the fourth barrel contains a mixture of 10% red petunia seeds and 90% white petunia seeds. from the barrel that was randomly chosen in part (a. Then you will choose a bowl at random and then pick a jelly at random . experimental or subjective probability? (b) You choose a single petunia seed at random from the randomly chosen barrel. of course. you are given one bag each of red. Two of the barrels contain only white petunia seeds.76 Questions for Chapter 3 3. (a) Suppose you choose a barrel at random. Just before leaving the company. The green jelly beans are likewise placed in a white bowl. green.38 The Belchee Seed Company sells flower seeds and specializes in red petunias and white petunias. What is the probability you selected the barrel containing a mixture of red and white seeds? 3. The beans in both bowls are thoroughly mixed together. you will be blindfolded.). In order to conduct a probability experiment. purple and yellow jelly beans. you perform a DNA test on the seeds). red and white petunia seeds are indistinguishable except through DNA testing. The purple and yellow jelly beans are placed in a blue bowl. What is the probability that it is the barrel with the mixture of red and white seeds? Are you using logical. (a) In the experiment. Exactly one half of the red jelly beans (one-half pound) are placed in the blue bowl and the other half are placed in the white bowl. one contains only red petunia seeds. Before they are grown. an angry employee removed the labels from all four barrels of seeds so that they are now indistinguishable (unless. At the end of the season. What is the probability that it is a red petunia seed? (c) You now perform a DNA test on that single petunia seed (the one chosen at random in part (b. You determine that it is a white petunia seed. the company has only four barrels of petunia seeds remaining. with each bag containing jelly beans that are all of the same color.

42 (†) We define P (A | B) to be the number x that solves the equation P (A ∩ B) = xP (B) . What is the probability that it is a RED jelly bean? (c) A friend (who is not blindfolded) tells you that you did. choose a RED jelly bean? What is the probability that you selected the BLUE bowl? (d) You are still blindfolded. (b) While blindfolded. G. you choose a bowl at random and then pick a jelly bean from that bowl at random. How should you initially distribute the balls so as to maximize the probability of drawing a white ball? 3. Let B = {the BLUE bowl is chosen} W = {the WHITE bowl is chosen} Also let R = {a RED jelly bean is chosen} G = {a GREEN jelly bean is chosen} P Y = {a PURPLE jelly bean is chosen} = {a YELLOW jelly bean is chosen} Do the events B and W form a complete set of alternatives? Briefly explain why. Do the events R. 3. what is the probability that the second jelly bean is also RED? 3. Given that the first jelly bean was RED.40 Show that if P (A) = P (B) = 1 then P (A ∩ B) = 1. After you have distributed all of the balls between the two boxes. a box will be picked at random and a ball randomly chosen from the selected box. indeed. P and Y form a complete set of alternatives? Briefly explain why. So you do not know precisely which bowl you selected in part (b) and (c).Questions for Chapter 3 77 from that bowl.41 (†) Suppose that you have n white balls and m red balls which you plan to distribute between two boxes. You draw out a second jelly bean from the same bowl as you selected in parts (b) and (c).

B and C .78 Questions for Chapter 3 We do not define P (A | B) as (5) P (A | B) = P (A ∩ B) P (B) Here is one reason why. assume the following: • Two sets of three numbers are drawn twice each day. . That is. the lottery game is played twice a day. 2002. must match a sequence of three numbers. Mr. once mid-day and once in the evening. This was regarded by many as a “miraculous” outcome. and he gets a lot of “spam” messages.43 (†) The 9-1-1 New York Lottery Problem. Why? 3.4. or adult content mail that is promoting web sites containing explicit adult material. On September 11. the numbers drawn were 9-1-1. What is the probability that the lottery numbers will match the calendar date at least once in a year? 3. But if we try to use (5) to define conditional probabilities. the same as the date of the drawing. . Each message he receives belongs to one of the following three categories B1 = {the message is “junk mail”} B2 = {the message is “adult content mail”} B3 = (B1 ∪ B2 )c = {the message is regular E-mail} . Consider the following equation (6) P (A ∩ B ∩ C) = P (A)P (B | A)P (C | A ∩ B) Show that if we use Definition 3. For this question. Jackpot winners of the daily New York Lottery numbers game. with each number independently selected at random from the digits 0 through 9. then Equation (6) doesn’t always hold. • The numbers can match the calendar date only in the months of January to September. Pop has a computer with an E-mail account. The “spam” E-mail is of two types: junk mail that is advertising to sell a product.44 Mr. then Equation (6) holds for any events A. Pop has carefully analyzed his E-mail messages.

Pop defined the events A$ = {the message contains ‘$’} AX = {the message contains “XXX”} and determined that P (A$ | B1 ) = 0.8 P (A$ | B2 ) = 0.2 P (A$ | B3 ) = 0.3 That is. 60% of his E-mail is “junk mail.” 10% is “adult content mail. that is. Mr.” . Being a former EAS305 student.Questions for Chapter 3 79 Note that every message is assigned to one and only one category. and that almost all of the “adult mail” contains the text string ‘’XXX”.1 P (B3 ) = 0. (b) An E-mail message arrives and contains the character ‘$’.6 P (B2 ) = 0. the probability that an E-mail message chosen at random contains the character ‘$’. Use Bayes’ theorem to find the probability that the message is “adult content mail. Use Bayes’ theorem to find the probability that the message is “junk mail.” and the remaining 30% is useful. Mr.1 P (AX | B1 ) = 0.0 (a) Use the law of total probability to find P (A$ ).0 P (AX | B2 ) = 0. Mr. Pop estimates that the probability that a single message is assigned to a category is given by P (B1 ) = 0. Upon further analysis.” (c) An E-mail message arrives and contains the character string “XXX”. regular E-mail. Pop has determined that most of the “junk mail” contains the text character ‘$’. Regular mail usually contains neither string.9 P (AX | B3 ) = 0.


4 RANDOM VARIABLES

He stares in vain for what awaits him,
And sees in Love a coin to toss.
– EDWIN ARLINGTON ROBINSON, The Man Against the Sky (1921)

Basic Concepts

Definition 4.1. A random variable is a function X(·) which assigns to each element ω of Ω a real value X(ω).

Notation:

(1) P{ω ∈ Ω : X(ω) = a} ≡ P{X(ω) = a} ≡ P(X = a)

Cumulative Distribution Function

Definition 4.2. For any random variable X, we define the cumulative distribution function (CDF), FX(a), as

FX(a) ≡ P(X ≤ a).

Properties of any cumulative distribution function

1. FX(a) is a nondecreasing function of a.
2. lim_{a→∞} FX(a) = 1
3. lim_{a→−∞} FX(a) = 0
4. FX(x) is a right-continuous function of x. In other words,

lim_{x↓a} FX(x) = FX(a)

[Figure 4.1: An example of a CDF for a discrete random variable]

Theorem 4.1. For any random variable X and real values a < b,

P(a < X ≤ b) = FX(b) − FX(a)

Discrete Random Variables

Definition 4.3. The probability mass function (pmf) of a discrete random variable X is given by

pX(xi) ≡ P(X = xi).

Properties of any probability mass function

1. pX(xi) ≥ 0 for every xi
2. Σ_{all xi} pX(xi) = 1
3. P(E) = Σ_{xi ∈ E} pX(xi)

Examples of discrete distributions

The binomial distribution

Consider a sequence of n independent, repeated trials, with each trial having two possible outcomes, success or failure. Let p be the probability of a success for any single trial, and let X denote the number of successes on n trials. The random variable X is said to have a binomial distribution and has probability mass function

(2) pX(k) = C(n,k) p^k (1 − p)^{n−k}    for k = 0, 1, ..., n.

Let's check to make sure that if X has a binomial distribution, then Σ_{k=0}^{n} pX(k) = 1. We will need the binomial expansion for any polynomial:

(a + b)^n = Σ_{k=0}^{n} C(n,k) a^k b^{n−k}

So

Σ_{k=0}^{n} C(n,k) p^k (1 − p)^{n−k} = (p + (1 − p))^n = (1)^n = 1

The geometric distribution

Recall the "success/failure" model that was used to derive the binomial distribution. Let p be the probability of a success for any single trial, but, this time, we want to find the probability that the first success occurs on the kth trial. Let Y denote the number of trials needed to obtain the first success. Then the probability mass function for Y is given by

pY(k) = (1 − p)^{k−1} p    for k = 1, 2, ...

and the random variable Y is said to have a geometric distribution with parameter p, where 0 ≤ p ≤ 1.
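Equation (2) and the geometric pmf translate directly into code. A sketch (ours) of both, together with the sanity check that each sums to one:

    from math import comb

    def binom_pmf(k, n, p):
        """pX(k) from equation (2)."""
        return comb(n, k) * p**k * (1 - p)**(n - k)

    def geom_pmf(k, p):
        """pY(k) = (1 - p)**(k - 1) * p."""
        return (1 - p)**(k - 1) * p

    n, p = 10, 0.3
    print(binom_pmf(3, n, p))                              # P(X = 3) -> about 0.2668
    print(sum(binom_pmf(k, n, p) for k in range(n + 1)))   # -> 1.0
    print(sum(geom_pmf(k, p) for k in range(1, 200)))      # -> about 1.0 (truncated sum)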

Some facts about geometric series

When working with the geometric distribution, the following results are often helpful:

Theorem 4.2. For any 0 < θ < 1,

Σ_{j=0}^{∞} θ^j = 1/(1 − θ)

Proof.

(1 − θ) Σ_{j=0}^{∞} θ^j = Σ_{j=0}^{∞} θ^j − Σ_{j=0}^{∞} θ^{j+1}
                        = Σ_{j=0}^{∞} θ^j − Σ_{j=1}^{∞} θ^j
                        = θ^0
                        = 1

Therefore Σ_{j=0}^{∞} θ^j = 1/(1 − θ).

Theorem 4.3. For any 0 < θ < 1,

Σ_{j=0}^{∞} j θ^j = θ/(1 − θ)²

Proof. Using Theorem 4.2,

Σ_{j=0}^{∞} θ^j = 1/(1 − θ)

(d/dθ) Σ_{j=0}^{∞} θ^j = (d/dθ) [1/(1 − θ)]

. . 2. then ∞ ∞ pY (k) = k=1 (1 − p)k−1 p (1 − p)j p k=1 ∞ = = j=0 1 p=1 1 − (1 − p) The Poisson distribution The random variable X is said to have a Poisson distribution with parameter α > 0 if X has a probability mass function given by (3) pX (k) = e−α αk k! for k = 0.Continuous Random Variables ∞ 85 d j θ dθ j=0 ∞ = 1 (1 − θ)2 1 (1 − θ)2 1 (1 − θ)2 θ (1 − θ)2 θ (1 − θ)2 jθj−1 = jθj jθj jθj = = = θ j=1 ∞ −1 j=1 ∞ j=1 ∞ j=0 Example: Using Theorem 4. Suppose pY (k) = (1 − p)k−1 p for k = 1. Continuous Random Variables .2. . . 2. . . we can show that the probability mass function for a geometric random variable must sum to one.. 1.

Continuous Random Variables

[Figure 4.2: An example of a CDF for a continuous random variable]

Definition 4.4. The probability density function (pdf) of a continuous random variable X is given by

fX(a) ≡ (d/da) FX(a).

Theorem 4.4. If X is a continuous random variable, then

(4) FX(a) = ∫_{−∞}^{a} fX(x) dx.

Properties of any probability density function

1. fX(x) ≥ 0 for every x
2. ∫_{−∞}^{∞} fX(x) dx = 1
3. P(E) = ∫_E fX(x) dx

Theorem 4.5. If X is a continuous random variable then P(X < a) = P(X ≤ a).

Differential mass interpretation of the pdf

[Figure 4.3: The differential mass interpretation of fX(x) dx, showing a thin strip of width dx and area fX(x) dx under the density curve]

(5) P(X ∈ [x, x + dx)) = fX(x) dx

Comparison of pmf's and pdf's

A comparison of the properties of the probability mass function and probability density function:

pmf                                 pdf
pX(x) ≥ 0                           fX(x) ≥ 0
P(X = x) = pX(x)                    P(x ≤ X < x + dx) = fX(x) dx
Σ_{all x} pX(x) = 1                 ∫_{−∞}^{∞} fX(x) dx = 1
FX(a) = Σ_{x≤a} pX(x)               FX(a) = ∫_{−∞}^{a} fX(x) dx
P(E) = Σ_{x∈E} pX(x)                P(E) = ∫_E fX(x) dx

Examples of continuous distributions

The uniform distribution

The random variable X is said to have a uniform distribution if it has the probability density function given by

(6) fX(x) = 1/(b − a) if a ≤ x ≤ b;  0 otherwise

The exponential distribution

The random variable X is said to have an exponential distribution with parameter α > 0 if it has the probability density function given by

(7) fX(x) = α e^{−αx} if x ≥ 0;  0 otherwise

and E(X) = +∞ −∞ xfX (x) dx for continuous X .5 6 6 6 6 Example: Let X be a continuous random variable with probability density function fX (x) = 2x 0 ≤ x ≤ 1 0 otherwise Then E(X) = = = 0 +∞ −∞ 0 −∞ 1 xfX (x) dx 1 0 x0 dx + x2x dx + 1 1 +∞ x0 dx 2x2 dx = 2x3 3 = 0 2 3 −0= 2 3 . Note: • E(X) is a number.5. Let X equal the number of pips.” • The value of E(X) can be interpreted as the center of mass or center of gravity of the probability mass distribution for X . • The expected value of X is often called the “mean of X . Then E(X) = all x xpX (x) = (1)pX (1) + (2)pX (2) + (3)pX (3) + (4)pX (4) + (5)pX (5) + (6)pX (6) 1 1 = (1) 6 + (2) 1 + (3) 1 + (4) 1 + (5) 1 + (6) 6 = 3. The expected value of a random variable X is given by E(X) = all x xpX (x) + for discrete X .Expectation 89 Expectation Definition 4. Example: Roll a fair die once.

We have pX (k) = n k p (1 − p)n−k k for k = 0. The expected value for X can be found as follows: ∞ E(X) = k=0 ∞ ke−α e−α αk k! = k=1 αk (k − 1)! . . . . 2. . 1. We have pX (k) = e−α αk k! for k = 0. 1. . We can find E(X) as follows: n E(X) = k=0 n k n k p (1 − p)n−k k = n! pk (1 − p)n−k (k − 1)!(n − k)! k=1 Let s = k − 1 n−1 E(X) = = np = np n s=0 n−1 s=0 n − 1 s+1 p (1 − p)(n−s−1) s n−1 s p (1 − p)(n−s−1) s Poisson distribution Let X be a Poisson random variable with parameter α > 0. n. .90 Finding E(X) for some specific distributions Finding E(X) for some specific distributions Binomial distribution Let X be a binomial random variable with parameters n and 0 ≤ p ≤ 1. .

Hence v = −e−αx and du = dx. letting dv = αe−αx dx and x = u. To find E(X) let q = (1 − p) and then. We have pX (k) = (1 − p)k−1 p for k = 1.. This produces E(X) = −xe−αx ∞ 0 + 0 ∞ e−αx dx = 1 α . ∞ E(X) = = p kq k−1 p kq k−1 k=1 ∞ k=1 = p 1 1 = (1 − q)2 p Exponential distribution Let X be an exponential random variable with parameter α > 0. .Finding E(X) for some specific distributions 91 Let s = k − 1 ∞ E(X) = = α = α e−α s=0 ∞ s=0 αs+1 s! αs s! e−α Geometric distribution Let X be a geometric random variable with parameter 0 ≤ p ≤ 1. . We have fX (x) = αe−αx if x ≥ 0 0 otherwise We can get E(X) as follows: E(X) = 0 ∞ xαe−αx dx Integrate this by parts. 2. .

denoted by m 1 (X). E(X).50) equals (A) 0 .92 Self-Test Exercises for Chapter 4 Other measures of central tendency Definition 4. S4. denoted by Mode(X). 1} (E) none of the above are true.1 If X is a discrete random variable with P (X = −1) = P (X = 0) = P (X = 1) = 1/3 Then the expected value of X . The median of a random variable X . S4. The mode of a random variable X . Answers can be found in Appendix B. choose the best response among those provided. equals (A) 0 (B) 1/3 (C) 1/2 (D) any of the values in the set {−1.7.6. 2 2 Definition 4.2 Suppose X is a continuous random variable with cumulative distribution function   0 if x < 0  2 x if 0 ≤ x ≤ 1 FX (x) =   1 if x > 1 Then P (X < 0. 0. is given by pX (Mode(X)) = max pX (x) if X is discrete x fX (Mode(X)) = sup fX (x) x if X is continuous Self-Test Exercises for Chapter 4 For each of the following multiple-choice questions. is given 2 by 1 P (X ≤ m 1 (X)) = .

5 Let X be a continuous random variable with probability density function fX (x) = λe−2x 0 if x > 0 otherwise The value of λ must equal . S4.3 Suppose X is a random variable with cumulative distribution function 1/3 FX (x) =  2/3    1 Then P (X = 2) equals (A) 0 (B) 1/3 (C) 2/3 (D) 1 (E) none of the above. S4. S4.50 √ (D) 0.Self-Test Exercises for Chapter 4 93 (B) 0.25 (C) 0.50 (E) none of the above.4 Suppose X is a continuous random variable with probability density function fX (x) =   0    if x < 0 if 0 ≤ x < 1 if 1 ≤ x < 2 if x ≥ 2 1/2 0 if 0 ≤ x ≤ c otherwise Then the value of c is (A) 1/2 (B) 1 (C) 2 (D) +∞ (E) none of the above.

Which of the following statements are always true: (A) lim FX (a) = 0 a→∞ (B) P (X = a) = FX (a) − FX (a − 1) (C) 0 ≤ FX (a) ≤ 1 for every a (D) all of the above are true (E) none of the above are true.7 Let X be a random variable with cumulative distribution function FX (a). The probability that exactly two heads turn up is (A) (B) (C) (D) 1 2 1 4 3 8 2 3 (E) none of the above. the probability that exactly one is defective is (A) 1/4 .94 Self-Test Exercises for Chapter 4 (A) 1/2 (B) 1 (C) 2 (D) e2x (E) none of the above.6 Suppose a fair coin is flipped 3 times. S4.8 The length X (in inches) of a steel rod is a random variable with probability density function given by   x−9 fX (x) =  0 2 if 9 ≤ x ≤ 11 otherwise A rod is considered defective if its length is less than 10 inches. If 2 independent steel rods are examined. S4. S4.

01 that a person making an airline reservation will never show up for his or her flight. S4.10 Let X be a continuous random variable with probability density function fX (x) = αe−αx if x ≥ 0 0 otherwise If P (X > 1) = e−2 . S4.9 An airplane can hold 100 passengers.Self-Test Exercises for Chapter 4 95 (B) 3/8 (C) 1/16 (D) 3/16 (E) none of the above. If the airline permits 101 persons to make reservations for a given flight. It is known that the probability is 0.99)101 (D) 1 (E) none of the above. then α must equal (A) −1 (B) 1/2 (C) 1 (D) 2 (E) none of the above.01 (C) (0. S4. Passenger behavior has been extensively studied over the years.11 Let X be a continuous random variable with probability density function fX (x) = x/2 if 0 ≤ x ≤ 2 0 otherwise Then P (X ≤ 1) equals . the probability that someone will be denied a seat is (A) 0 (B) 0. Assume that a person decides independently whether or not to show up for a flight.

.13 You have two light bulbs. S4. S4. 2. Then P (X > 1) equals (A) e−1 (B) 1 − e−1 (C) 1 − 2e−1 (D) 1 (E) none of the above. . . The life length of each bulb (in hours) is a continuous random variable X with probability density function fX (x) = 2e−2x if x > 0 0 otherwise The two bulbs are allowed to burn independently of each other. The probability that both bulbs will each last at least one hour is (A) 0 (B) (1 − e−2 )2 (C) e−4 (D) 2e−2 (E) none of the above.96 Self-Test Exercises for Chapter 4 (A) 1/8 (B) 1/4 (C) 1/3 (D) 1/2 (E) none of the above. .12 Let X be a discrete random variable with probability mass function P (X = k) = e−1 k! for k = 0. 1.

Suppose FX (0) = 0. The value of fX (0) is (A) 0.15 Let X be a random variable with cumulative distribution function FX (x).Self-Test Exercises for Chapter 4 97 S4. The value of E(X) is (A) 0 (B) 3/8 (C) 3/4 (D) 3/2 (E) none of the above.00 (B) 0. S4.14 Let X be a continuous random variable with cumulative distribution function   0  FX (x) =   1 1 2 (x + 1) if x < −1 if −1 ≤ x ≤ 1 if x > 1 Let fX (x) denote the probability density function for X .75 (D) 1.00 (E) none of the above.16 Let X be a continuous random variable with probability density function fX (x) = 3x2 0 if 0 ≤ x ≤ 1 otherwise Let E(X) denote the expected value of X . Then P (0 < X ≤ 1) equals (A) 0 (B) 0.3 (D) 0.5.2 (C) 0.50 (C) 0. .5 (E) none of the above. S4.3 and FX (1) = 0.

A batch of 100 integrated circuits are tested.8 .18 Let X be a discrete random variable with cumulative distribution function.98 Self-Test Exercises for Chapter 4 S4.4 (C) 0.4 0.8 0. S4. F (a) ≡ P (X ≤ a).2 0 -2 -1 0 1 2 3 FX(a) a The value of P (−1 ≤ X ≤ 1) is (A) 0.6 0.6 (D) 0. The life lengths of any two integrated circuits are independent random variables.2 (B) 0.17 Suppose that an integrated circuit has a life length X (in units of 1000 hours) that is a continuous random variable with probability density function fX (x) = 3e−3x if x ≥ 0 0 otherwise An integrated circuit considered defective if its life length is less than 3000 hours. The expected number of defective items is (A) e−3 (B) (1 − e−3 ) (C) (1 − e−9 ) (D) 100(1 − e−9 ) (E) none of the above. given by the following graph: 1 0.

Questions for Chapter 4 4. What is the probability of getting exactly two heads on 4 tosses? (A) (B) (C) 4 2 1 4 1 4 1 4 2 2 3 4 2 2 3 4 2 (D) 1 − 3 4 2 (E) none of the above.2 The angle of ascent of an erratic rocket is a random variable X having the cumulative distribution function given by   1  FX (x) = if x > π/2 sin x if 0 ≤ x ≤ π/2   0 if x < 0 (a) Compute P (X ≤ π/8).1 Let X be a continuous random variable having the probability density function 3 2 2 (1 − x ) if 0 ≤ x ≤ 1 fX (x) = 0 otherwise (a) Compute and plot the cumulative distribution function for X. The probability of a head on one toss is 1/4. 4. (b) Find P (X ≥ 0.Questions for Chapter 4 99 (E) none of the above. (b) Find the probability density function fX (x) for X . (c) Find E(X).5). (c) Find E(X).19 An unfair coin is tossed 4 times. S4. .

4. Identify all possible values for p.5 Let Y be a random variable with probability density function fY (y) = 3y 2 if 0 ≤ y ≤ 1 0 otherwise (a) Find the cumulative distribution function for Y . (c) Find E(X). 4.100 Questions for Chapter 4 4.25). (a) Compute F (0. and P 1 3 ≤Y ≤ 1 2 . 4. 4. (b) Compute P (−1 ≤ X ≤ 0. Find the value of c.4 Let X be a discrete random variable with probability mass function pX (−1) = p pX (0) = 1 − 2p pX (1) = p The value of p is unknown. and F (−0.P Y ≥ 1 2 . F (1. (b) Compute P Y ≤ (c) Find E(Y ).3).25).25).6 Let X be a continuous random variable with probability density function fX (x) = cx if 0 ≤ x ≤ 2 0 otherwise 1 2 .7 Let X be a continuous random variable with probability density function fX (x) = 1/θ if 0 ≤ x ≤ α 0 otherwise What relationship must hold between θ and α? . P (Y ≤ 0).3 Let X be a random variable with probability density function fX (x) = 3√ 2 x 0 if 0 ≤ x ≤ 1 otherwise Let F denote the cumulative distribution function of X .

3. 4.8 The angle of ascent of an erratic rocket is a random variable X having the cumulative distribution function given by   1  FX (x) = if x > π/4 sin x if 0 ≤ x ≤ π/4   0 if x < 0 Note: This is an example of a mixed probability distribution with both discrete and continuous probability mass.2 pX (+2) = 0. (a) Evaluate the constant c. what is the expected number of weekly accidents? 4. 4. then P (X = 1) = e−1 . .4 pX (0) = 0.2 pX (−1) = 0.9 Suppose that D. the daily demand for an item. is a random variable with the following probability distribution: P (D = k) = c2k k! k = 1. (b) Compute the expected demand. If the number of weekly accidents in this factory is distributed with a Poisson distribution. (b) Find the probability density function fX (x) for X and the probability mass at π/4. 4.Questions for Chapter 4 101 4.11 True or False: If X has a Poisson distribution and P (X = 0) = e−1 . (a) Compute P (X ≤ π/8).10 The probability is p that there will be at least one accident in a certain factory in any given week.12 Let X be a discrete random variable with probability mass function pX (−2) = 0. 2.2 Draw a plot of the cumulative distribution function (CDF) for X . . There is a positive point mass at π/4. (c) Find E(X). . .

(c) Compute P (−0.14 The Classey Gassey gasoline company produces gasoline at two plants (A and B ) and then delivers the gasoline by truck to gasoline stations where it is sold.13 Suppose that X is a random variable with the following probability density function: 2x if 0 < x < 1 fX (x) = 0 otherwise (a) Compute P (X > 0.102 Questions for Chapter 4 4.5). If the octane rating is r. then the gasoline is Low grade if High grade if r < 85 r ≥ 85 When gasoline is delivered.5 ≤ X ≤ 0. A truck has arrived with today’s shipment of gasoline.5). what is the probability that the shipment is Low grade gasoline? . The octane rating of gasoline produced at Plant A is a continuous random variable X with probability density function fX (x) = 0.02(y − 80) 0 if 80 < y < 90 otherwise The grade of the gasoline is determined by its octane rating. 4. there is an equal probability that it was produced at Plant A or Plant B .005(x − 80) 0 if 80 < x < 100 otherwise The octane rating of gasoline produced at Plant B is a continuous random variable Y with probability density function fY (y) = 0. Find FX (0. You are the owner of a Classey Gassey gasoline station. (a) Given only the above information. (b) Let FX (x) denote the cumulative distribution function for X .5).

17 Let X be a discrete random variable with probability mass function pX (−4) = 1/4 pX (4) = 1/8 pX (−π) = 1/4 pX (π) = 1/8 pX (0) = 1/4 Let F denote the cumulative distribution function of X . 4. and F (14). in fact. (a) Compute F (−5).5).00 on each cake that he sells and a loss (due to spoilage) of $0. F ( 2 ). (b) Compute E(X). Low grade gasoline. 4 and 5. F (π) and F (2π). Assuming that each cake can be sold only on the day it is made. What is the probability that the shipment came from Plant A? 4. find the baker’s expected profit for . He also knows that there is a profit of $1. 3 (a) Compute F (−11). F (−3).16 Let X be a random variable with probability density function fX (x) = 2/x2 if 1 ≤ x ≤ 2 0 otherwise Let F denote the cumulative distribution function of X . 3. 2.18 The manager of a bakery knows that the number of chocolate cakes he can sell on any given day is a random variable with probability mass function p(x) = 1 6 for x = 0. (a) What is the value of the constant c? (b) Find P (X < 0. 4. 4.40 on each cake that he does not sell. (b) Find E(X).Questions for Chapter 4 103 (b) You actually test today’s shipment of gasoline and discover that it is. 1.15 Suppose that X is a random variable with the following probability density function: cx if 0 < x < 1 fX (x) = 0 otherwise where c is a constant.

Hint: Define the random variable Y as the actual daily profit (or loss). 6.. circle and star. Be careful to consider the possibility that the demand for your cakes. You can assume that there is an infinite number of pretzels in the bag with equal proportions of each shape. . what is her expected profit? 4. may exceed your supply. X . (a) Find P (X = k) for k = 4. (b) Find E(X). (c) How many chocolate cakes should he bake to maximize his expected profit? 4. square. (b) a day when he bakes 4 chocolate cakes.20 (†) A large bag contains pretzels with 4 different shapes: heart. You can then answer the question using the random variable Y . 5.19 If a contractor’s profit on a construction job is a random variable with probability density function fX (x) = 1 18 (x + 1) 0 −1 ≤ x ≤ 5 otherwise where the units are $1000.104 Questions for Chapter 4 (a) a day when he bakes 5 chocolate cakes. Using the probability distribution for X (the number of customer asking for cakes) find the probability distribution for Y . Let the random variable X denote the number of pretzels that Billy will draw from the bag. Billy plans to reach into the bag and draw out one pretzel at a time. stopping only when he has at least one pretzel of each type. . . The probability distribution for Y will depend on the number of cakes you decide to bake that day. .

representations of experimental outcomes. still so. As an example. at the point y = 20. let X be the random variable representing the Buffalo. There is a third. – WILLIAM SHAKESPEARE. namely FX (x).5 FUNCTIONS OF RANDOM VARIABLES When you do dance. and probably more important reason for using random variables: 3. New York temperature at 4:00 pm next Monday in degrees Fahrenheit. They allow us to restrict our attention to sample spaces that are subsets of the real line. The Winter’s Tale (c 1600) Basic Concepts Two of the primary reasons for considering random variables are: 1. they are interested in the event A = {ω ∈ Ω : Y (ω) ≤ 20} where Ω is the sample space of the underlying experiment. but our neighbors in Fort Erie. This random variable might be suitable for the use of Buffalonians. I wish you A wave o’the sea. They are convenient. and often obvious. Ontario prefer the random variable Y representing the same Buffalo temperature. This chapter deals with this third “feature” of random variables. move still. And own no other function. Note that Ω is being mapped by both of the random variables X and Y into the real line. Can we still find P (A)? 105 . FY (y). except in degrees Celsius. we could evaluate the cumulative distribution function for Y . But suppose we only know the cumulative distribution function for X . 2. They permit algebraic manipulation of event definitions. that you might ever do Nothing but that. In order to compute P (A). Furthermore.

mathematicians usually will drop the (ω). Note that we should have really written this as Y (ω) = h(X(ω)). if we know the probability distribution for X . 9 which is nothing more than the equation of a straight line.1. If X is a random variable. we also have Y = h(X) as a random variable. To show you what we mean by this. When finding the probability of the event A. we can use the shortcut notation presented in Chapter 5: P (A) = P (Y ≤ 20) = P ( 5 (X − 32) ≤ 20) 9 = P (X − 32 ≤ 36) = P (X ≤ 68) = FX (68). notice that for any ω ∈ Ω. So. let the function h(x) be defined by h(x) = 5 (x − 32). we can compute probabilities of events expressed in terms of the random variable Y . 1 . (1) Y (ω) = 5 9 Basic Concepts X(ω) − 32 . knowing a relationship like Equation 1 between X(ω) and Y (ω) for every ω ∈ Ω.” First. allows a representation of the event A in terms of either the random variable X or the random variable Y .106 The answer is “yes. The general concept surrounding the above example is the notion of a function of a random variable. In order to simplify the notation. We can then redefine the event A in terms of X as follows: A = {ω ∈ Ω : Y (ω) ≤ 20} = ω∈Ω : 5 9 (X(ω) − 32) ≤ 20 = {ω ∈ Ω : X(ω) − 32 ≤ 36} = {ω ∈ Ω : X(ω) ≤ 68} .1 The function h(x) maps merely points in the real line (the sample space for X ) into points in the real line (the sample space for Y ) as in Figure 5. So.

more obvious to define Y as mapping directly from Ω to the real numbers.e.1. Referring to Figure 5. “What are all of the possible outcomes of the experiment (i. suppose we are given a random variable X . In this case. Using this.3. Under those circumstances. Also assume that we are given the cumulative . you can see that what is happening is really quite simple. But consider the case where we have already defined X and then are given a functional relationship between X and Y . we can compute P (X ≤ t) for any value of t. Suppose we are given a cumulative distribution function for the random variable X . we ask ourselves.. the folks in Fort Erie are asking us to compute P (Y ≤ 20). it would be very handy to have a tool that would enable us to compute probabilities involving Y using the facts we already know about X . Returning to our example. It may seem. measuring Monday’s 4 pm temperature) that would result in announcing that the temperature in degrees Celsius in 20 or less?” Call this event A and compute the probability of A from the distribution of the random variable X. at first. the Buffalo temperature in degrees Fahrenheit.1: The mapping Y (ω) = h(X(ω)) The random variable X serves as a “middleman” in the mapping shown in Figure 5. An even more general situation appears in Figure 5.Basic Concepts 107 Ω ω X( ) Real line Y( ) Real line 212 100 Figure 5. To oblige. This gives us the cumulative distribution function FX (t) and allows us to compute the probability of many other events.2.

2: Representing the event {Y ≤ 20} y y= h(x) B x = h-1(B) Figure 5.3: A more complicated example .108 Basic Concepts y 20 y= ) h(x x 68 A Figure 5.

Solution: There is a pleasant fact that helps when working with discrete random variables: Theorem 5. we cannot make such a statement for continuous random variables. what values can Y assume?” We see that the function h(x) = x2 maps the point x = −2 into the point 4. To solve the problem. −1. 0. x = −1 into 1. also the probability of observing Y taking on a value in B . Definition 5.Some examples 109 distribution function for X so that we are capable of computing the probability measure of any subset of the sample space for X . So all of the probability mass for the random variable Y is assigned to the set {0. “Since X can take on the values −2. therefore. . x = 0 into 0. Unfortunately. A function of any discrete random variable is also a discrete random variable. 1 or 2. Some examples Discrete case Example: Let X be a discrete random variable with probability mass function pX (−2) = pX (−1) = pX (0) = pX (1) = pX (2) = 1 5 1 5 1 5 1 5 1 5 Let Y = X 2 . Then A is called the support of the distribution for the random variable Y . The probability of any of these particular outcomes of X is. 1. as we will see in a later example. x = 1 into 1 and x = 2 into 4. If we then define the random variable Y = h(X) (where h is the function shown in Figure 5. we should first ask ourselves.1. We would like to find the probability distribution for Y .3) we can compute the probability measure of any subset B in the sample space for Y by determining which outcomes of X will be mapped by h into B . Let A be the smallest subset of R such that P (Y ∈ A) = 1. Such a set is called the support for the random variable Y . 4}.1.

  0  . we can say that 2 pY (4) = pX (−2) + pX (2) = 5 . pY (1) = 2 and pY (0) = 1 . First.” With no elaborate computation. Find the distribution for the random variable Y = X 2 . Similarly. Let’s try to find FY (y). it is easier to deal with the cumulative distribution functions. the answer is “Y can assume values from 0 to 1..110 Some examples Since X will take on the value −2 with probability 1 and the value 2 with probability 5 1 . what values can Y take on?” Since Y = X 2 . 1). the density for X is given by fX (x) = if −1 ≤ x ≤ 1 0 otherwise 1 2 and the cumulative distribution function is x+1 FX (x) =  2   1   0   if x < −1 if −1 ≤ x ≤ 1 if x > 1. Solution: For continuous random variables. i.e. we should ask ourselves. Hence the probability distribution for Y is 5 5 given by the probability mass function: pY (0) = pY (1) = pY (4) = 1 5 2 5 2 5 Continuous case Example: Let X be a uniformly distributed random variable on the interval (−1. and since each of these mutually exclusive events will result in Y taking on the 5 value 4. “Since X can take on only values from −1 to +1. we now know that if y < 0 ? if 0 ≤ y ≤ 1 FY (y) =   1 if y > 1.

We immediately know that the random variable Y is discrete. We know that Y takes on the value 0 when X takes on any value from 0 to 1 (but not including . Solution: Since X takes on the values ranging continuously from 0 to 2. the density for X is given by fX (x) = if 0 ≤ x ≤ 2 0 otherwise   0   1 2 and the cumulative distribution function is x FX (x) =  2   1 if x < 0 if 0 ≤ x ≤ 2 if x > 2 Let h(x) be the greatest integer less than or equal to x. and we have our first example of a continuous random variable being mapped into a discrete random variable. Find the distribution for Y = h(X). that is 0 ≤ y ≤ 1. FY (y) =   0  √ if y < 0 y if 0 ≤ y ≤ 1   1 if y > 1 Finding the probability density function for Y is now easy. 2). 1 and 2 (why 2?). let h be given by the step function shown in Figure 5... To find the probability mass function for Y .4 again and try to visualize why this happened. Y can take on only the integers 0.e. first let’s try to find pY (0). That task is left to you.e. Example: Let X be a uniformly distributed random variable on the interval (0. i. Look at Figure 5.Some examples 111 Now suppose that y is equal to a value in the only remaining unknown case. So. We then have FY (y) = P (Y ≤ y ) = P X2 ≤ y √ √ = P − y≤X≤ y √ √ = FX y − FX − y √ √ y+1 − y+1 = − 2 √ 2 = y.4. i.

112 Invertible Functions of Continuous Random Variables y 3 2 1 x -2 -1 1 2 3 4 A Figure 5. pY (1) = 1 1 dx = 2 2 and 1 dx = 0. A function h : R → R is called onto. 2 Similarly.4: The greatest integer function 1). Hence. 1 pY (0) = pY (1) = 2 . A function h : R → R is called one-to-one.2. if h(a) = h(b) implies a = b. Invertible Functions of Continuous Random Variables First we need a few concepts about inverse functions from calculus: Definition 5. . pY (0) = 0 1 1 x dx = 2 2 2 1 1 0 1 = . pY (2) = 2 So we have the answer. if for every y ∈ R there exists some x ∈ R such that y = h(x). 2 2 which is good thing because we ran out of probability mass after the first two computations.

2. If one considers the function Y = h(X) as a transformation (x) → (y). take the first derivative of both sides with respect to y and use the chain rule: d FY (y) = dy d FX (h−1 (y)) dy d fY (y) = fX (h−1 (y)) h−1 (y) dy We have just proven the following theorem: Theorem 5. Suppose X is a continuous random variable with probability density function fX (x). Then. Suppose h(x) has an inverse and is increasing (or decreasing). for any a ≤ b. If h(x) is a strictly increasing function with a differentiable inverse h−1 (·). then the random variable Y = h(X) has probability density function d fY (y) = fX (h−1 (y)) h−1 (y) dy We will leave it to you to find a similar result for the case when h(x) is decreasing rather than increasing. then we can define the inverse function h−1 : R → R such that h−1 (y) = x if and only if h(x) = y . This produces the following relationship between the cumulative distribution function for X and the cumulative distribution function for Y : FY (y) = P [Y ≤ y] = P [h(X) ≤ y] FY (y) = P [h−1 (h(X)) ≤ h−1 (y)] FY (y) = P [X ≤ h−1 (y)] FY (y) = FX (h−1 (y)) To get the relationship between the densities. If h(x) a strictly increasing function and h(x) has an inverse. if X is a continuous random variable. Therefore. Note: This transformation technique is actually a special case of a more general technique from calculus. and Y = h(X).Invertible Functions of Continuous Random Variables 113 If a function h(x) is both one-to-one and onto. then h−1 (x) is nondecreasing. we can develop a general relationship between the probability density function for X and the probability density function for Y . we have h−1 (a) ≤ h−1 (b). the monotonicity property of h allows us to uniquely solve .

114 Invertible Functions of Continuous Random Variables y = h(x). 2]. when solving transformation problems with n variables. That solution is x = h−1 (y). In calculus. When −8 ≤ y ≤ 8 fY (y) = = = 1 1 1 −2 3 3 4 |y |( 3 )y 1 1 1 3 √ 12 |y | 3 y 2 1 −1 3| 12 |y 1 1 1 . we can use Theorem 5. 8]. fX (y 3 ) = 0 hence fY (y) is zero. has a nonnegative first derivative with respect to y . So the support for Y is [−8. Since h−1 (y) = y 3 . Find the probability density function for Y without explicitly finding the cumulative distribution function for Y . In those cases. Example: Let X be a continuous random variable with density for X given by fX (x) = |x|/4 if −2 ≤ x ≤ 2 0 otherwise Let Y = h(X) = X 3 . thus. Solution: The support for X is [−2. the resulting n × n determinant is called the Jacobian of the transformation. We can then express the probability density function of Y as fY (y) = fX (h−1 (y))J(y) where J(y) is the following 1 × 1 determinant: J(y) = ∂ −1 ∂x ∂ −1 = h (y) = h (y) ∂y ∂y ∂y since h−1 (y) is also an increasing function of y and.2 to get fY (y) = fX (h−1 (y)) = fX (y 3 ) 1 1 1 d −1 h (y) dy d 1 y3 dy 2 = fX (y 3 )( 1 )y − 3 3 When y < −8 or y > 8 we have y 3 < −2 or y 3 > 2.

2. Let X = 1 if heads. be a continuous random variable with probability density function fX (x) = 2e−2x x ≥ 0 0 otherwise Suppose you earn ln(x) + x2 − 3 dollars if the transistor lasts exactly x hours. . What are your expected earnings? Theorem 5. Question: Let X .Expected values for functions of random variables 115 Summarizing fY (y) = |y − 3 |/12 if −8 ≤ y ≤ 8 0 otherwise 1 Expected values for functions of random variables Theorem 5. We will provide proofs for both the discrete and continuous cases. Suppose X is a discrete random variable with probability mass function pX (xk ) = P (X = xk ) = pk for k = 1. your expected winnings.3. . −5 x = 0 h(x) = 10 x = 1 Find E(h(X)). then E(Y ) = E(h(X)) = all x h(x)pX (x) If X is a continuous random variable with probability density function fX (x). the life length of a transistor. . then E(Y ) = E(h(X)) = +∞ −∞ h(x)fX (x) dx Question: Toss a coin.4. If X is a discrete random variable with probability mass function pX (x). Proof. X = 0 if tails. Let h(X) denote your winnings. Let Y = h(X). . E(a) = a if a is a constant. We are really finding E(h(X)) where h(x) = a for all x..

5. Then E(h(X)) = E(aX) = ∞ −∞ ∞ h(x)fX (x) dx axfX (x) dx xfX (x) dx = a(E(X)) = a −∞ ∞ −∞ . Proof. 2. . Then E(h(X)) = k h(xk )pk axk pk k E(aX) = = a xk pk = a(E(X)) k Suppose X is a continuous random variable with probability density function fX (x).116 Expected values for functions of random variables Then E(h(X)) = k h(xk )pk apk k E(a) = = a pk = a(1) = a k Suppose X is a continuous random variable with probability density function fX (x). Suppose X is a discrete random variable with probability mass function pX (xk ) = P (X = xk ) = pk for k = 1. Then E(h(X)) = E(a) = ∞ −∞ ∞ h(x)fX (x) dx afX (x) dx fX (x) dx = a(1) = a = a −∞ ∞ −∞ Theorem 5. We are finding E(h(X)) where h(x) = ax.. . . E(aX) = aE(X) if a is a constant.

. Then E(h(X)) = k h(xk )pk . . Suppose X is a discrete random variable with probability mass function pX (xk ) = P (X = xk ) = pk for k = 1.Expected values for functions of random variables 117 Theorem 5.7.6. . . . . Suppose X is a discrete random variable with probability mass function pX (xk ) = P (X = xk ) = pk for k = 1. . Proof. E[h1 (X) + h2 (X)] = E[h1 (X)] + E[h2 (X)] for any functions h1 (X) and h2 (X) Proof. Then E(h(X)) = k h(xk )pk (xk + b)pk = k k E(X + b) = = k (xk pk + bpk ) xk pk + k bpk pk = E(X) + b(1) k = k xk pk + b Suppose X is a continuous random variable with probability density function fX (x). 2. E(X + b) = E(X) + b if b is a constant.. Then E(h(X)) = E(X + b) = = ∞ −∞ ∞ −∞ ∞ −∞ h(x)fX (x) dx (x + b)fX (x) dx = xfX (x) dx + ∞ −∞ ∞ −∞ (xfX (x) + bfX (x)) dx bfX (x) dx = E(X) + b(1) Theorem 5. 2. We are finding E(h(X)) where h(x) = h1 (x) + h2 (x). We are finding E(h(X)) where h(x) = x + b.

The quantity E(X k ) is defined as the kth moment about the origin of the random variable X . The kth Moment of X If we let h(x) = xk and consider E[h(X)]. Then E(h(X)) = E[h1 (X) + h2 (X)] = = = ∞ −∞ ∞ −∞ ∞ −∞ ∞ −∞ h(x)fX (x) dx (h1 (x) + h2 (x))fX (x) dx (h1 (x)fX (x) + h2 (fX (x)) dx h1 (x)fX (x) dx + ∞ −∞ h2 (x)fX (x) dx = E[h1 (X)] + E[h2 (X)] Notice in the all of the above proofs the close relationship of Discrete X and Continuous X ∞ and k −∞ P (X = k) = pX (xk ) = pk and P (x ≤ X ≤ x + dx) = fX (x) dx Also notice that Theorem 5.7 with h1 (x) = x and h2 (x) = b and you get Theorem 5.7. we get the following definition: Definition 5.118 The k th Moment of X E[h1 (X) + h2 (X)] = k [h1 (xk ) + h2 (xk )]pk [h1 (xk )pk + h2 (xk )pk ] k = = k h1 (xk )pk + k h2 (xk )pk = E[h1 (X)] + E[h2 (X)] Suppose X is a continuous random variable with probability density function fX (x).3.6.6 is really only a corollary to Theorem 5. . Simply use Theorem 5.

8. The standard deviation of a random variable X is given by σX ≡ Var(X). Var(X) = E(X 2 ) − [E(X)]2 Proof.5 2 = E[X ] + E[−2XE(X)] + E[[E(X)]2 ] = E[X ] − 2E(X)E[X] + E[[E(X)] ] . Note that Var(X) = for a discrete random variable X . Definition 5.4. The variance of a random variable X is Var(X) ≡ E[(X − E(X))2 ] .7 from Theorem 5. and Var(X) = +∞ −∞ (x − E(X))2 pX (x) all x (x − E(X))2 fX (x) dx for a continuous random variable X . This can be interpreted as the moment of inertia of the probability mass distribution for X . If we let h(x) = (x − E(X))2 and compute E(h(X)) we get a very useful result: Definition 5. Using the properties of expected value.5. Note: The value of Var(X) is the second moment of the probability distribution about the expected value of X . we get Var(X) = E[(X − E(X))2 ] = E[X 2 − 2XE(X) + [E(X)]2 ] 2 2 expand the square from Theorem 5.Variance 119 Variance Recall that E(X) is simply a number. Notation: We often write E(X) = µX 2 Var(X) = σX Theorem 5.

Let Y = h(X) then Var(Y ) = E [h(X) − E(h(X))]2 Example: Suppose X is a continuous random variable with probability density function 3x2 if 0 ≤ x ≤ 1 fX (x) = 0 otherwise .5 = a2 (E[X 2 ] − [E(X)]2 ) Using Theorem 5.4 Theorem 5. Using Theorem 5.8 Theorem 5. Var(aX) = a2 Var(X) Proof.8. we get Var(X + b) = E[((X + b) − E[X + b])2 ] = E[(X + b − E[X] − b)2 ] = E[(X − E[X])2 ] = Var(X) The proof of the following result is left to the reader: Theorem 5.9.11.10.4.120 Variance = E[X 2 ] − 2E(X)E[X] + [E(X)]2 = E[X ] − 2[E[X]] + [E(X)] = E[X 2 ] − [E[X]]2 2 2 2 from Theorem 5. Using Definition 5. Var(X + b) = Var(X) Proof. we get Var(aX) = E[(aX)2 ] − [E(aX)]2 = E[a2 X 2 ] − [E(aX)]2 = a2 E[X 2 ] − [aE(X)]2 = a E[X ] − a [E(X)] = a2 Var(X) 2 2 2 2 Using Theorem 5.

Then E(Y ) = 0 1 h(x)(3x2 ) dx = = 1 (3x2 ) dx 0 x 3 21 3 x = 2 0 2 3 2 1 Then we can use E(Y ) = E(h(X)) = Var(Y ) = 0 1 to compute (h(x) − E(h(X)))2 (3x2 ) dx = = = 1 3 2 − (3x2 ) dx x 2 0 1 1 3 9 (3x2 ) dx − + 2 x x 4 0 1 1 1 27 3 3 dx − 9x dx + x2 dx = 4 0 0 0 4 1 Approximating E(h(X)) and Var(h(X)) This section is based on material in a text by P. Addison-Wesley. if the function h(·) is very complicated. Reading MA. Often. P. we can get good results using the following approximations: Theorem 5. Then (2) (3) 2 h (µ) 2 σ 2 Var(Y ) ≈ [h (µ)]2 σ 2 E(Y ) ≈ h(µ) + Meyer. However.. This avoids the need for explicitly finding the probability distribution for the random variable Y = h(X) if we are only computing expected values. Meyer. given a random variable X and function Y = h(X). then even using these short-cut methods for computing E(h(X)) and Var(h(X)) may result in difficult integrals. Let X be a random variable with E(X) = µ and Var(X) = σ 2 . L.2 We have already shown that. . 1965.Approximating E(h(X)) and Var(h(X)) 121 Let Y = h(X) = 1/X . Suppose Y = h(X).12. we can compute E(Y ) = E(h(X)) directly using only the probability distribution for X . Introductory probability theory and statistical applications.

(This is only a sketch of the proof. We obtain. believing the result is the expected profit. Equation (2) in Theorem 5. this will not be the same as h(µ) = h(E(X)).12 provides an approximation for the error if you incorrectly use h(µ) rather than E(h(X)). the number h(µ) is incorrectly interpreted as the expected profit. the expected sales figure is “plugged into” the profit function. suppose we are selling a product. Now suppose. For the second approximation. h (µ) 2 σ 2 h (µ) 2 h(µ) − E(Y ) ≈ − σ 2 E(Y ) ≈ h(µ) + . Y = h(µ) + (X − µ)h (µ) + (X − µ)2 h (µ) + R1 2 where R1 is the remainder of the expansion. and the random variable X represents the monthly demand (in terms of the number of items sold). In fact. the expected profit is actually E(Y ) = E(h(X)). For example. Y = h(X) is the actual profit from selling X items in a month. Unless h is linear. Discard the remainder term R1 and take the expected value of both sides to get Equation (2). Often.122 Approximating E(h(X)) and Var(h(X)) Proof. Then E(X) = µ represents the expected number of items sold in a month.) Expand the function h in a Taylor series about x = µ to two terms. This is not generally true. Discard the term R2 and take the variance of both sides to get Equation (3). expand h in a Taylor series about x = µ to only one term to get Y = h(µ) + (X − µ)h (µ) + R2 where R2 is the remainder. That is. A common mistake when using expected values is to assume that E(h(X)) = h(E(X)).

Suppose T is a continuous random variable with probability density function fX (x) = 3000t−4 if t ≥ 10 0 otherwise Therefore.6 and 5. the surface tension of a liquid (dyn/cm) is given by the formula S = 2(1 − 0. results in E(S) ≈ h(15) + 75h (15) = 1.82 (dyne/cm) Var(S) ≈ 75[h (15)] = 0.000012(1 − 0.005t)−0.Approximating E(h(X)) and Var(h(X)) 123 Therefore.87 2 (dyne/cm)2 . This result for the linear case is also provided by Theorems 5.8 Since µ = 15.5.2 h (t) = 0.2 . Example: (Meyer) Under certain conditions. 5.82 h (15) = 0.01 h (15) = 0. we will use h(t) = 2(1 − 0.005(15))−0. your answer will be wrong by approximately ∆=− h (µ) 2 σ 2 Note that if h is linear. where T is the temperature of the liquid (degrees centigrade).005t)0.12.012(1 − 0. if you incorrectly state that h(µ) is your expected monthly profit.005T )1.000012(1 − 0. we have h(15) = 1.7. E(T ) = ∞ 10 3000t−3 dt = 15 (degrees centigrade) Var(T ) = E(T 2 ) − (15)2 = ∞ 10 3000t−2 dt − 225 = 75 (degrees centigrade)2 In order to use Equations (2) and (3).8 ≈ 0+ and. and ∆ = 0. h (x) = 0. using Theorem 5.2 h (t) = −0.005t)1.

13. Therefore. . the backwoods. µ−a −∞ +∞ µ+a (x − µ)2 fX (x) dx ≥ (x − µ)2 fX (x) dx ≥ µ−a −∞ +∞ µ+a a2 fX (x) dx a2 fX (x) dx. Hence. Theorem 5. and fX (x) dx replaced by the probability mass function for X . Then for any a > 0 P (|X − µ| ≥ a) ≤ σ2 a2 Proof.124 Chebychev’s Inequality Chebychev’s Inequality Chebychev’s inequality. For a given Var(X). Rural country. Houghton Mifflin Company. for all x over this region of integration. We will prove this result when X is a continuous random variable. and 3 boon·docks (b¯ on’d˘ ks’) plural noun. µ − a] ∪ [µ + a. the inequality establishes a limit on the amount of probability mass that can be found at distances far away from E(X) (the boondocks3 of the probability distribution).13.” Source: The American Heritage R Dictionary of the English Language. Note that (x − µ)2 ≥ a2 for every x ∈ (−∞. we have (x − µ)2 fX (x) ≥ a2 fX (x). Example usage: The o¯ o 1965 Billy Joe Royal Song: “Down in the Boondocks. given in the following Theorem 5.) We have for any a > 0 +∞ σ2 = = (x − µ)2 fX (x) dx −∞ µ−a µ+a +∞ (x − µ)2 fX (x) dx + −∞ µ−a µ−a (x − µ)2 fX (x) dx + µ+a +∞ (x − µ)2 fX (x) dx = −∞ µ−a (x − µ)2 fX (x) dx + something ≥ 0 + µ+a +∞ (x − µ)2 fX (x) dx ≥ −∞ (x − µ)2 fX (x) dx + µ+a (x − µ)2 fX (x) dx µ+a since µ−a (x − µ)2 fX (x) dx ≥ 0. Fourth Edition. is applicable for any random variable X with finite variance. (The proof for X discrete is similar using summations rather than integrals. Let X be a discrete or continuous random variable with E(X) = µ and Var(X) = σ 2 . +∞).

Linda Chattin at Arizona State University for this example.25 respectively. the manager. so P (8 < X < 28) = P (8 − 18 < X − 18 < 28 − 18) = P (−10 < X − 18 < 10) = P (|X − 18| < 10) = P (|X − µ| < 10) ≥ 1 − 6.Chebychev’s Inequality 125 So. Then for any a > 0 P (|X − µ| < a) ≥ 1 − σ2 a2 Example:4 The number of customers who visit a car dealer’s showroom on a Saturday morning is a random variable.25 = 0. we get σ2 ≥ ≥ µ−a −∞ µ−a −∞ (x − µ)2 fX (x) dx + a2 fX (x) dx + +∞ µ+a +∞ µ+a (x − µ)2 fX (x) dx a2 fX (x) dx = a2 P (X ≤ µ − a) + a2 (P (X ≥ µ + a) = a2 P (|X − µ| ≥ a) Note: X can be discrete. An equivalent way to state Chebychev’s inequality is Let X be a discrete or continuous random variable with E(X) = µ and Var(X) = σ 2 .9375 102 This is the best statement we can make without knowing more about the distribution for X . 4 Thanks to Dr. a former EAS305 student.25. What is the probability that there will be between 8 and 28 customers on any given Saturday? Solution We are given E(X) = µ = 18 and Var(X) = σ 2 = 6. Using extensive prior experience. has determined that the mean and variance of X are 18 and 6. X . .

The Normal Distribution Definition 5. we often say that the sequence { n Xn } converges to p almost surely. as n gets large. If X ∼ N (µ.14. Let Xn ≡ number of times the event A occurs p ≡ P (A) Then for any > 0 n→∞ lim P Xn −p < n = 1. σ 2 ) .} 2 3 The strong law states that.6. this sequence converges to p. 1 X2 . . The random variable X is said to have a normal distribution if it has probability density function 1 1 fX (x) = √ e− 2 σ 2π x−µ 2 σ for −∞ < x < ∞ Note that X has two parameters. with each outcome of the experiment independent of all others. µ and σ . The theorem guarantees that. . There is a related result called the strong law of large numbers that makes a similar (but stronger statement) about the entire sequence of random variables 1 {X1 . We often say that “X is distributed normally with mean µ and variance σ 2 ” We write X ∼ N (µ. Let an experiment be conducted n times. 4 X4 . Theorem 5. and Var(X) = σ 2 . Theorem 5. σ 2 ) then E(X) = µ.15. with probability one. In 1 this case.126 The Normal Distribution The Law of Large Numbers This is an important application of Chebychev’s inequality. The above result is the weak law of large numbers. . 1 X3 . the probability that the ratio Xn /n is observed to be “close” to p approaches one.

σ 2 ). Theorem 5.The Normal Distribution 127 One of the amazing facts about the normal distribution is that the cumulative distribution function (4) FX (a) = a −∞ 1 1 √ e− 2 σ 2π x−µ 2 σ dx does not exist in closed form. 1). We can then refer to a table of N (0. to an equivalent probability statement involving only a normal random variable with mean zero and variance one.e. σ Proof.7. there is an important property of the normal distribution that allows us to convert any probability statement. If X ∼ N (µ. the probability density function for X is 1 x−µ 2 1 for −∞ < x < ∞ fX (x) = √ e− 2 σ σ 2π Define the function x−µ h(x) = σ . such as P (X ≤ a). σ 2 ). Z ∼ N (0.16.. Z . The standard normal random variable. 1). We denote the cumulative distribution function for Z by Φ(a) = P (Z ≤ a). σ 2 ). If X ∼ N (µ. 1) probabilities. and thus avoid numerically integrating Equation 4 for every conceivable pair of values µ and σ . Definition 5. Suppose X ∼ N (µ. has probability density function 1 2 1 φ(z) = √ e− 2 z 2π i. Such a table is provided in Table I of Appendix A. In other words. there exists no function in closed form whose derivative is 1 1 fX (x) = √ e− 2 σ 2π x−µ 2 σ The only way to compute probabilities for normally distributed random variables is to evaluate the cumulative distribution function in Equation 4 using numerical integration. then the random variable Z= X −µ ∼ N (0. Therefore.

σ 2 ). Define the random variable Z= X −µ . X −µ If X ∼ N (µ. Theorem 5.16 tells us three things about Z = σ 1. We have E(Z) = E X −µ σ . Z has a normal distribution Actually.17.2. the first two results would be true even if X did not have a normal distribution: Lemma 5. σ Then E(Z) = 0 and Var(Z) = 1. increasing function of x with h−1 (z) = σz + µ d −1 h (z) = σ dz Using Theorem 5. Suppose X is any random variable with finite mean E(X) = µ and finite.128 The Normal Distribution Note that z = h(x) is an invertible. and 3. Var(Z) = 1. the probability density function for the random variable Z = h(X) is fZ (z) = fX (h−1 (z)) = = d −1 h (z) dz 1 (σz+µ)−µ 2 1 σ √ e− 2 σ for −∞ < σz + µ < ∞ σ 2π 1 2 1 √ e− 2 z for −∞ < z < ∞ 2π Note that we couldn’t use the cumulative distribution function of X to find the cumulative distribution function of Z since neither cumulative distribution function exists in closed form. E(Z) = 0 2. nonzero variance Var(X) = σ 2 . Proof.

18. in Theorem 5. σ 2 ). . then the linear transformation of X .) With a little additional work we can show that id X is normally distributed. namely. E(Z) = 0 and Var(Z) = 1. then the function Y = h(X) is strictly increasing and if a < 0. then for any constants a = 0 and b. it is no surprise that. Proof.. When we perform the above transformation (i. a2 σ 2 ). then any linear function h(X) = aX + b is also normally distributed as long as a = 0. the function Y = h(X) is strictly decreasing. Z = (X − µ)/σ ).e. This is just a special case of Theorem 5. If a > 0. the random variable Y = aX + b ∼ N (aµ + b. What is important in Theorem 5. X −µ Z= σ is also normally distributed.The Normal Distribution 129 = E = = = 1 (X − µ) σ 1 E[X − µ] σ 1 (E[X] − µ) σ 1 (0) = 0 σ and Var(Z) = Var X −µ σ 1 = Var (X − µ) σ 1 Var[X − µ] = σ2 1 = Var[X] = 1 σ2 So. This fact is provided by the following theorem: Theorem 5. If X ∼ N (µ.16. we say we are standardizing the distribution of X . A distribution that has zero mean and unit variance is said to be in standard form.16 is that if X is normally distributed.2.

130

The Normal Distribution

We know

1 1 fX (x) = √ e− 2 σ 2π

x−µ 2 σ

Suppose a > 0, then
FY (y) = P (Y ≤ y) = P (aX + b ≤ y) y−b = P X≤ a y−b = FX a

Take derivatives on both sides and use the chain rule to obtain d y−b d fY (y) = FY (y) = FX dy dy a y−b 1 = fX a a
= =

1 √ e σ 2π 1 1 √ e− 2 aσ 2π

−1 2

y−b a −µ σ

2

1 a

y−(aµ+b) 2 aσ

Hence, Y ∼ N (aµ + b, a2 σ 2 ). Now suppose a < 0, then
FY (y) = P (Y ≤ y) = P (aX + b ≤ y) y−b = P X≥ a y−b y−b = P X> +P X = a a y−b y−b +P X = = 1 − FX a a = and since X is continuous. . . y−b +0 = 1 − FX a

Take derivatives on both sides and use the chain rule to obtain d d y−b fY (y) = FY (y) = 1 − FX dy dy a

The Normal Distribution
y−b 1 a a and remembering that a < 0 . . .
1 −2 y−b a −µ σ 2

131

= −fX

=

1 1 √ e −a σ 2π 1 y−(aµ+b) 2 1 aσ √ e− 2 = −aσ 2π and since a2 = (−a)2 . . . 1 y−(aµ+b) 2 1 √ e− 2 −aσ = −aσ 2π

So, again, we have Y ∼ N (aµ + b, a2 σ 2 ). Question: What happens to the probability distribution of Y = aX + b when a = 0? Linear transformations rarely preserve the “shape” of the probability distribution. For example, if X is exponentially distributed, Z would not be exponentially distributed. If X was a Poisson random variable, Z would not be a Poisson random variable. But, as provided by Theorem 5.16, if X is normally distributed, then Z is also normally distributed. (See Question 5.6 to see that the uniform distribution is preserved under linear transformations.) Tools for using the standard normal tables
P (Z ≤ a) = Φ(a) is given in the tables for various values of a. It can be shown that since Z is a continuous random variable and the probability density function for Z is symmetric about zero, we have

Φ(a) = 1 − Φ(−a). Example: Suppose X ∼ N (0, 4). Let’s find P (X ≤ 1).
P (X ≤ 1) = P (X − 0 ≤ 1 − 0) X −0 1−0 = P ≤ 2 2

Since X ∼ N (0, 4) we know from Theorem 5.16 that
Z= X −0 2 ∼ N (0, 1)

132

The Normal Distribution

so, using Table I in Appendix A, we have
P (X ≤ 1) = P Z ≤ = P Z≤

1−0 2
1 2

= 0.6915

This is the area shown in the following figure:
0.5 0.4 0.3

φ(z ) φ( )

0.2 0.1 0.0 -2.0 -1.8 -1.6 -1.4 -1.2 -1.1 -0.9 -0.7 -0.5 -0.3 -0.1 0.1 0.3 0.5 0.7 0.9 1.0 1.2 1.4 1.6 1.8 2.0

z

Example: Suppose X ∼ N (2, 9). Let’s find P (0 ≤ X ≤ 3).
P (0 ≤ X ≤ 3) = P (0 − 2 ≤ X − 2 ≤ 3 − 2) X −2 3−2 0−2 ≤ ≤ = P 3 3 3

Since X ∼ N (2, 9) we know that
Z= X −2 3 ∼ N (0, 1)

Hence,
2 P (0 ≤ X ≤ 3) = P − 3 ≤ Z ≤ 1 3

= P Z≤

1 3

− P Z < −2 3

Since the N (0, 1) distribution is symmetric about zero,
P Z < −2 = P Z > 3
2 3

=1−P Z ≤

2 3

Finally, from Table I, we get
P (0 ≤ X ≤ 3) = P Z ≤
1 3

− 1−P Z ≤

2 3

= 0.63 − (1 − 0.75) = 0.38

Self-Test Exercises for Chapter 5

133

which is the area shown in the following figure:
0.5 0.4 0.3

φ(z ) φ( )

0.2 0.1 0.0 -2.0 -1.8 -1.6 -1.4 -1.2 -1.1 -0.9 -0.7 -0.5 -0.3 -0.1 0.1 0.3 0.5 0.7 0.9 1.0 1.2 1.4 1.6 1.8 2.0

z

Self-Test Exercises for Chapter 5
For each of the following multiple-choice questions, choose the best response among those provided. Answers can be found in Appendix B. S5.1 Let X be a continuous random variable with probability density function
fX (x) =
1 2

0

if −1 ≤ x ≤ 1 otherwise.

The value of P (X 2 ≥ 1 ) is 4 (A) 0 (B) (C)
1 4 1 2

(D) 1 (E) none of the above. S5.2 Let X be a random variable with E(X) = 0 and E(X 2 ) = 2. Then the value of E[X(1 − X)] is (A) −2 (B) 0 (C) 1 (D) 2 (E) Var(X)

134

Self-Test Exercises for Chapter 5

(F) none of the above. S5.3 Let X be a random variable with E(X) = 1 and E(X 2 ) = 2. Then Var(X) equals (A) −1 (B) 0 (C) 1 (D) 3 (E) none of the above. S5.4 Suppose that X is a continuous random variable with P (X < −2) = P (X < 2) = 1 . Then P (X 2 ≤ 4) equals. 2 (A) (B) (C)
1 4 1 2 3 4 1 4

and

(D) 1 (E) none of the above. S5.5 Let X be a random variable with E(X) = 2. Then E[(X − 2)2 ] equals (A) 0 (B) 2 (C) Var(X) (D) E(X 2 ) (E) none of the above. S5.6 Let X be a random variable with cumulative distribution function if x < 0 x if 0 ≤ x ≤ 1 FX (x) =   1 if x > 1 Then P (3X ≤ 1) equals (A) 1/6
  0 

−1. S5. Then P (Y ≤ 2 ) equals .Self-Test Exercises for Chapter 5 135 (B) 1/4 (C) 1/3 (D) 1/2 (E) none of the above.8 Let X be a discrete random variable with probability mass function pX (x) = 1/4 if x = −2. S5. Then P (Y ≤ 2 ) equals (A) 0 (B) 1/4 (C) 3/4 (D) 1 (E) none of the above. S5.7 Let X be a discrete random variable with probability mass function pX (−1) = p pX (0) = 1 − 2p pX (1) = p where 0 < p < 1 . 1 or 2 0 otherwise 1 Let Y = 1/X .9 Let X be a continuous random variable with probability density function fX (x) = 1/4 if −2 ≤ x ≤ 2 0 otherwise 1 Let Y = 1/X . The value of P (X 2 = 1) is 2 (A) p2 (B) 2p (C) (1 − 2p)2 (D) 1 (E) none of the above.

Then P (X 2 ≥ 0) equals (A) 0 .12 Suppose that X is a random variable with P (X > 0) = r where 0 < r < 1.10 Let X be a random variable with P (X < −1) = 1/4 P (X ≤ +1) = 3/4 Then P (|X| > 1) equals (A) 1/4 (B) 1/2 (C) 3/4 (D) 1 (E) none of the above.136 Self-Test Exercises for Chapter 5 (A) 0 (B) 1/2 (C) 3/4 (D) 1 (E) none of the above. S5. S5.11 Let X be a continuous random variable with probability density function fX (x) = if −1 ≤ x ≤ 1 0 otherwise 1 2 Then P (|X| ≤ 1 ) equals 4 (A) 0 (B) 1/4 (C) 1/2 (D) 1 (E) none of the above. S5.

15 The daily demand for chocolate cake in a neighborhood bakery is a random variable X with probability mass function given by pX (0) = 0.5 pX (3) = 0.13 Suppose X is a random variable with E(X) = α and Var(X) = α. Then E(X 2 ) equals (A) 0 (B) α (C) α2 (D) α + α2 (E) none of the above. with each outcome independent of the others.00 a cake and all unsold chocolate cakes are thrown away at the end of the day. Let Y denote your actual net amount won (which may be negative). Suppose you win $2 for every head and lose $1 for every tail. Then P (Y ≥ 0) equals (A) 2/3 (B) 848/1024 (C) 904/1024 (D) 1 (E) none of the above.2 pX (2) = 0. The cake is sold for $5.14 A fair coin is tossed 10 times. S5. S5. The expected daily profit from chocolate cake is equal to . the bakery owner bakes two chocolate cakes at a cost of $2. S5.00 per cake.Self-Test Exercises for Chapter 5 √ r 137 (B) (C) r (D) 1 (E) none of the above.1 where X represents the number of cakes requested by customers.2 pX (1) = 0. At the beginning of the day. where α > 0.

17 Suppose that an electronic device has a life length X (in units of 1000 hours) that is considered a continuous random variable with probability density function fX (x) = e−x 0 if x > 0 otherwise Suppose that the cost of manufacturing one such item is $2. The manufacturer sells the item for $5. the store owner bakes two loaves of rye bread at a cost of 20 cents per loaf.2 pX (1) = 0.2 pX (2) = 0. The expected daily profit from rye bread is equal to (A) 0 (B) 20 cents (C) 30 cents (D) 40 cents (E) none of the above.00.50 (C) $2.00 (D) $3. but guarantees a total refund if X ≤ 1.6 At the beginning of the day.00. What is the manufacturer’s expected profit per item? (A) 5e−1 (B) 2e−1 (C) −2 + 5e−1 .00 (E) none of the above.16 The daily demand for rye bread in a corner grocery store is a random variable X with probability mass function given by pX (0) = 0. S5. S5. The bread is sold for 50 cents a loaf and all unsold loaves are thrown away at the end of the day.138 Self-Test Exercises for Chapter 5 (A) 0 (B) $1.

and you lose $1 for each tail that appears. S5.18 Let X be a random variable with E(X) = 2 and E(X 2 ) = 5. You win $2 for each head that appears. S5.Self-Test Exercises for Chapter 5 139 (D) 5 − 2e−1 (E) none of the above. S5. Then Var(X) equals (A) 1 (B) 3 (C) 7 (D) 9 (E) none of the above.19 Let X be a random variable with E(X) = 1 and Var(X) = 4 then E((X − 1)2 ) equals (A) 0 (B) 1 (C) 4 (D) 16 (E) none of the above. . The value of E(Y ) is (A) 0 (B) 3/2 (C) 1 (D) 3 (E) none of the above.20 A fair coin is tossed 3 times. Let Y denote your earnings after 3 tosses.

5 and P (X = +1) = 0.24 Let X be a random variable with P (X = −1) = 0.5.1587 (C) 0.5 (D) 0. (E) none of the above are true. Y = σ Which of the following are always true: (A) E(Y ) = 0 and Var(Y ) = 1 (B) Y has a normal distribution (C) P (Y > 0) = 1/2 (D) all of the above are true. Then P (X < 0) equals (A) 0 (B) 0.5 (D) 0 (E) none of the above. Then the value of E(|X|) is (A) 0 .21 Let X be any random variable with E(X) = µ and Var(X) = σ 2 .140 Self-Test Exercises for Chapter 5 S5. Consider the random variable X −µ . S5.8413 (E) none of the above S5. then P (−2Y < 0) equals (A) p (B) 1 − p (C) 0. S5.22 Let X be a normally distributed random variable with E(X) = 3 and Var(X) = 9.23 If Y is a random variable with P (Y ≥ 0) = p.

26 Suppose that the random variable X has a normal distribution with mean 1 and variance 4. The value of E(Y ). Let Y denote the reward for any one customer. P (|X| ≤ 1) equals (A) 0.1826 (C) 0.3413 (E) none of the above.5 (C) 1 (D) 2 (E) none of the above are true. S5. S5. Then. the expected reward.1528 (B) 0.Self-Test Exercises for Chapter 5 141 (B) 0. is (A) 0 (B) 5 (C) 10 (D) 10e−25 (E) none of the above.3174 (D) 0. S5.27 Suppose X is a continuous random variable with cumulative distribution function   0 if x < 0  2 x if 0 ≤ x ≤ 1 FX (x) =   1 if x > 1 .25 The time to wait for service in the El Gasso Mexican Restaurant is a continuous random variable X (in minutes) with probability density function fX (x) = 5e−5x 0 x≥0 otherwise The restaurant will pay a customer a $10 reward if they wait more than 5 minutes for service.

Define the random variable Y = ln(X).71 (B) $7357.50 √ (D) 0.50 (E) none of the above.41 (D) $17. If the manufacturer sells 100 printers. .1 Let X be the random variable with probability density function fX (x) = 3 2 (1 0 − x2 ) if 0 ≤ x ≤ 1 otherwise Let h(x) = 1/x. how much should he expect to pay in refunds? (A) $2706.59 (C) $12. S5.28 The life length X (in years) of a printer costing $200 is a continuous random variable with probability density function fX (x) = e−x if x ≥ 0 0 otherwise The manufacturer agrees to pay a full refund to the buyer if the printer fails within the first year. Questions for Chapter 5 5. E(h(X)) and h(E(X)). (b) Find E(X). Then P (Y < ln(0.293. (a) Find the probability density function for the random variable Y = h(X) = 1/X .642.25 (C) 0.29 (E) none of the above.142 Questions for Chapter 5 Let ln(x) denote the natural logarithm of x.50)) equals (A) 0 (B) 0.

(b) Determine the cumulative distribution function for X . E(h(X)) and h(E(X)). 5.2 Let X be the random variable with probability density function fX (x) = 2/x2 if 1 ≤ x ≤ 2 0 otherwise Let h(x) = ln x. (c) Find Var(X). (c) Compute P (2X 2 ≤ 0. (b) Find E(X).Questions for Chapter 5 143 (c) Find Var(X). 9 (a) Find the probability density function for the random variable Y = h(X) = 4 X . . of a commodity is a continuous random variable with the following probability density function   ax if 0 ≤ x ≤ 1  fX (x) =   0 a if 1 < x ≤ 2 otherwise (a) Determine the value of a.4 The price. X . (c) Find Var(X). E(h(X)) and h(E(X)). (d) Compute E(X) and Var(X).5).3 Let X be the random variable with probability density function fX (x) = 3√ 2 x 0 if 0 ≤ x ≤ 1 otherwise Let h(x) = 4 x. (a) Find the probability density function for the random variable Y = h(X) = ln X . 5. 9 (b) Find E(X). 5.

5 using the actual probability density function for Y . b].5 Let X be a random variable with the following probability density function: 1 2 fX (x) = √ e−x /2 2π for −∞ < x < ∞ The is the normal distribution with parameters µ = 0 and σ 2 = 1. 5. suppose the cumulative distribution function of X is given by  if x < a  0   x−a if a ≤ x ≤ b FX (x) =  b−a   1 if x > b. Z= Let 5. (b) Using the probability density function for Y find E(Y ) and Var(Y ). The results should be the same as Example 5.5. Find the probability density function for the random variable Y = X 3 .9. In other words.00.144 Questions for Chapter 5 5. The manufacturer sells the item for $5. Show that Z also has a uniform distribution.7 Verify the results of Example 5. What is the manufacturer’s expected profit per item? . let X be the random variable with probability density function 3x2 if 0 ≤ x ≤ 1 fX (x) = 0 otherwise (a) Find the probability density function for the random variable Y = h(X) = 1/X .8 Suppose that an electronic device has a life length X (in units of 1000 hours) that is considered as a continuous random variable with probability density function: e−x x>0 fX (x) = 0 otherwise Suppose that the cost of manufacturing one such item is $2. X −r s Where r and s are non-zero constants. Note: The integral for fX (x) does not exist in closed form. 5.6 (†) Let X be a uniformly distributed random variable on the interval [a.00 but guarantees a total refund if X ≤ 0. Note: Be careful of the case where s < 0. That is. where a < b.

9 Suppose that the continuous random variable X has a probability density function given by   √1 e−x2 fX (x) = 2π  −∞ ≤ x ≤ +∞ 0 otherwise Let Y = X 2 . 5.10 Let X be a random variable with density   2 fX (x) = x2  0 d dx 1≤x≤2 otherwise 1 x. 5.. an industrial engineering student. (a) What is the expected number of tosses required to obtain the first head? (b) What is Mary’s entry fee? Is this fair? Why? 5. Mary’s entry fee will be E(2X ). to obtain the first head. Mary will pay an “entry fee” in advance to John equal to her expected winnings.Questions for Chapter 5 145 5.11 John and Mary play a game by tossing a fair coin repeatedly and betting on the number of tosses required. when the first head appears). was not able to study for an examination. The examination consists of 4 multiple choice questions. ln(x) = compute E(X) and (b) Compute E(Y ). Billy plans to answer the questions randomly by rolling a fair die for each question and choosing the response associated with the number rolled. Evaluate E(Y ). Each question has 6 possible answers from which to choose. where Y = X 2 + 1. (a) What is the probability mass function for the random variable X? (b) If the professor assigns a credit of ten points for every correct answer and deducts one point for every incorrect answer. Let X denote the number of questions correctly answered by Billy. Bill Bell.12 Due to pressing social commitments. only one of which is correct. To make the game fair. John will pay Mary 2X dollars when the game is over (i. what is Billy’s expected score? . X . Therefore. (a) Using the fact that Var(X).e.

it is sold for $12. Let Y = eX . Y is log-normal if and only if ln(Y ) is normal. What is the expected selling price of an article? 5.. if the store owner decides to bake one loaf (i. If its weight is greater than 52 ounces. Justify your answer.15 If X is a random variable with P (X = −2) = what is the value of E(|X|)? 1 2 and P (X = +2) = 1 . The bread is sold for 30 cents a loaf and all unsold loaves are thrown away at the end of the day. 5. then E(X(X − 1)) equals zero. If its weight is greater than 44 ounces and less than 52 ounces.16 The daily demand for pumpernickel bread in a corner grocery store is a random variable X with probability mass function given by pX (0) = 0. Suppose that X ∼ N (µX . it is considered unsatisfactory and thrown away. and has a uniform distribution on the interval (36. For what number α is E(X(α − X)) zero? 5. it is sold for $15. σX ).13 Let X be the number of dots obtained when a fair. . the store owner bakes k loaves of bread at a cost of 20 cents for the first loaf and 10 cents for each additional loaf.00. pX (1) = 0. Then Y is said to have a log-normal distribution. If the weight of the article is less than 44 ounces.146 Questions for Chapter 5 (c) The professor deducts points for wrong answers in order to discourage random guessing.17 A random variable X represents the weight (in ounces) of an article. common six-sided die is tossed. If 10 points are credited for each correct answer.6 At the beginning of the day.e.1. (b) How many loaves should the grocer bake per day to maximize his expected profit? 5. how many points should be deducted for each wrong answer so that Billy’s expected score from guessing is zero? 5. 56).3. R.00.14 (True or False) If X is a random variable with E(X) = 1. 2 5. pX (2) = 0. (a) Determine the probability distribution for the daily profit. Find the probability density function for Y . That is. k = 1).18 A distribution closely related to the normal distribution is the log-normal 2 distribution.

The relationship between the moments of X and the moments of Y is given by µX 2 σX = ln(µY ) − = ln 2 1 σY 1 2 ln + 1 = ln(µY ) − σX 2 2 2 µY 2 σY +1 µ2 Y Examples of random variables that may be represented by the log-normal distribution are river flows. The specifications call for X to be between 30 and 35 percent. If a resistor has a resistance between 950 and 1050 ohms. 9).05 per gallon   -$0. the size of an organism subject to a number of small impulses.10 per gallon  T (X) = $0. your answer is in terms of µX and σX . E(T ).20 The Youbee Resistor Company uses a machine to produce resistors. The manufacturer will make a net profit on the fuel (per gallon) which is the following function of X :   $0. evaluate E(T ). the diameter of a small particle after a crushing process. The cost of producing one resistor is 10 cents. (b) Suppose that the manufacturer wants to increase his expected profit.19 A rocket fuel is to contain a certain percent (say X ) of a particular additive.Questions for Chapter 5 147 Note: For this question. (a) What proportion of the resistors must be scrapped? (b) The company is considering the purchase of a new highprecision machine that will reduce the above production . These are not the expected value and standard deviation of Y . 5.10 per gallon if 30 < X < 35 if 35 < X < 40 or 25 < X ≤ 30 otherwise (a) If X ∼ N (33. He intends to do this by increasing his profit (per gallon) on those batches of fuel meeting the specification 30 < X < 35. it can be sold for 25 cents. and the length of life of certain items. by 50%. What must his new profit function be in order to achieve this goal? 5. The actual resistance of each resistor is a normally distributed random variable with mean resistance of 1000 ohms and a variance of 625 ohms2 . otherwise it is scrapped.

What would Var(X) have to equal so that P (X < 0) = 0. If the company wishes to maximize expected profit. What is the probability that a sample of fabric. you will say only noise is present. say X . It is obvious that X . cannot take on negative values.95. If only noise is present. 5. . the reading.975. Otherwise. That is P (X < 0) ≈ 0. You settle on the following signal detection rule: If the reading is 7 or greater. What would E(X) have to be so that P (X < 0) = 0. X . σ 2 ). Nevertheless.99. If a signal is present when the button is pressed. will be defective? Note: An immediate objection to the use of the normal distribution may be raised here. In the above example. should the new machine be purchased? 5. Compute this probability to verify this property for yourself in the above example. will have a normal distribution with mean 10 and variance 9. you will announce that a signal is present. Find k such that P (X > µ − kσ) = 0.10? Suppose.841. the cost of producing each resistor will increase to 15 cents if the company decides to purchase the machine (the new machine’s capital cost is included in this production cost). that we hold Var(X) fixed at 9. Although the selling price of resistors would remain the same. on the other hand.23 A measuring device provides a reading on a continuous scale whenever a button is pressed. Assume furthermore that a sample of this fabric is considered to be defective if X ≤ 162. the reading will have a normal distribution with mean 5 and variance 9. a normally distributed random variable may be observed to take on any positive or negative value. As long as the parameters µ and σ 2 are chosen so that P (X < 0) is essentially zero. = 0. the strength of cotton fabric. = 0. the above model (apparently not valid in view of the objection just raised) assigns negligible probability to the event {X < 0}.148 Questions for Chapter 5 variance to 400 ohms2 (the mean remains the same). is normally distributed with E(X) = 165 and Var(X) = 9.21 Suppose that the breaking strength of cotton fabric (in pounds). chosen at random.10? 5. suppose we hold E(X) fixed at 165. However. such a representation may work.22 Let Xi ∼ N (µ. This situation will occur frequently: A certain random variable that we know cannot take on negative values will be assumed to have a normal distribution anyway. = 0.

Questions for Chapter 5

149

(a) If a signal is truly present, and you use the above rule, what is the probability that your announcement is wrong? If only noise is really present, what is the probability that your announcement is wrong? (b) Suppose that the probability that a signal is truly present when the button is pressed is 0.25. If your reading is greater than 7, what is the probability that a signal is really present? (c) The value, 7, is called the threshold. What should this value be changed to so that your two answers for question (a) are equal? 5.24 An electronic device has a life length X (in years) that is considered a continuous random variable with probability density function
fX (x) = x/2 0

if 0 < x < 2 otherwise

(a) Find P (X > 1). (b) Find E(X). (c) Suppose that the cost of manufacturing one such item is $2.00. The manufacturer sells the item for $10.00, but guarantees a total refund if X ≤ 1. What is the manufacturer’s expected profit per item? (d) Suppose that the cost of manufacturing one such item is $2.00 and the customer is guaranteed a total refund if X ≤ 1. What should be the manufacturer’s selling price in order to provide an expected profit of $4.00 per item? 5.25 The reading of a water level meter is a continuous random variable X with probability density function
fX (x) =

1/4 0

if −2 < x < 2 otherwise

(a) Find P (X ≤ 1). (b) Find P (|X| ≤ 1). (c) Find E(X).

150

Questions for Chapter 5

(d) Find Var(X). (e) Let Y = X 2 . Find the cumulative distribution function for Y. 5.26 The daily demand for pumpkin pies at a corner grocery store is a random variable with probability mass function
  1 

pX (k) =

4 16 k   0

for k = 0, 1, 2, 3, 4 otherwise

The cost of baking one pumpkin pie is $1.00. The store sells the pies for $3.00, each. At the beginning of every day, the store manager must decide how many pies to bake. At the end of the day, any unsold pies are thrown away. If the demand for pies exceeds the supply, you can only sell those pies that you have already baked for that day. The additional customers leave unsatisfied. (a) Find P (X > 0). (b) Find E(X). (c) Find Var(X). (d) If the manager decides to bake three pies, what is the expected profit for that day? 5.27 The input voltage for an electronic component is a continuous random variable X with probability density function
fX (x) =

1/3 0

if −1 < x < 2 otherwise

(a) Find P (X ≤ 1). (b) Find P (X 2 ≤ 0.25). (c) Find E(X 2 ). (d) Let Y = 2X . Find the cumulative distribution function for Y. 5.28 The Youbee Mine has been used to mine gold ore for several years. The mass (in grams) of actual gold in one cubic centimeter of ore is a random variable

Questions for Chapter 5

151

X with probability density function fX (x) = cx3 0

if 0 < x < 1 otherwise

(a) Find the value of the constant c. (b) Find the cumulative distribution function for X and sketch a graph of the cumulative distribution function. (c) Each cubic centimeter or ore is worth $5 if it has more than 1 2 gram of gold. It is worth $0 otherwise. Let Y be a random variable denoting the value of a cubic centimeter of ore selected at random. What is the probability mass function for Y ? 5.29 Let X be a continuous random variable with probability density function
fX (x) =

1/3 0

if −1 < x < 2 otherwise

Let Y = X 2 . Find the cumulative distribution function for Y . 5.30 Let X be a continuous random variable with probability density function
fX (x) =

1 0

if −1 < x < 0 otherwise

Let Y = X 2 . Find P (Y ≤ 0.25). 5.31 Let X be a continuous random variable with probability density function
  a(1 + x) 

fX (x) =

  0

a(1 − x)

if −1 ≤ x < 0 if 0 ≤ x ≤ 1 otherwise

(a) Prove that the value of a must be 1. Use a clear mathematical presentation to show why this must be true. (b) Find P (X ≤ 0.5). (c) Find P (X 2 ≤ 0.25). (d) Find the cumulative distribution function for X .

152

Questions for Chapter 5

5.32 A person accused of a crime is being questioned by the police. A new “lie detector” device is being used that measures the amount of stress in a person’s voice. When the person speaks, the device gives a reading, X . If the person is telling the truth, then X ∼ N (2, 1). If the person is lying, then X ∼ N (4, 1). Therefore, the mean (expected) stress level of a person’s voice increases if he is lying. (a) The police decide on the following rule: If a person’s voice stress exceeds 4 when he is answering a question, conclude that he is lying. What is the probability of incorrectly accusing person of lying when he is telling the truth? (b) When the suspect was originally brought in for questioning, the police thought there was only a 50% chance that he was guilty of the crime. The police ask him the question, “Did you commit the crime?” The suspect replies, “No.” The reading on the voice stress meter exceeds 4. What is the probability that he committed the crime? 5.33 Consider a random variable X with possible outcomes, 0, 1, 2, . . . Suppose that P (X = k) = (1 − a)(ak ) for k = 0, 1, 2, . . . (a) Identify all permissible values for the constant a. (b) Show that for any two positive integers, m and n,
P (X > m + n | X > m) = P (X ≥ n).

5.34 Check whether the mean and the variance of the following distributions exist: (a)
fX (x) =

2x−3 0

x≥1 otherwise

(b) (The Cauchy distribution) a is a positive constant.
fX (x) = a π(a2 + x2 ) −∞ < x < ∞.

Questions for Chapter 5

153

5.35 Diameters of bearings in mass production are normally distributed with mean 0.25 inch and standard deviation 0.01 inch. Bearing specifications call for diameters of 0.24 ± 0.02 inch. What is the probability of a defective bearing being produced? 5.36 Suppose that the continuous random variable X has a probability density function given by
  √2 e−x2 fX (x) = 2π 

x≥0

0

otherwise

Let Y = X 2 . Evaluate E(Y ). 5.37 Suppose that the life lengths of two electronic devices, D1 and D2 , have distributions N (40, 36) and N (45, 9), respectively. If the electronic device is to be used for a 45-hour period, which device is to be preferred? If it is to be used for a 50-hour period, which device is to be preferred? 5.38 (†) Suppose random variable Z has a standard normal density function
φ(z) =
1 2 √1 e− 2 z 2π

for −∞ < z < ∞

Let Y = Z 2 . Find the probability density function for Y . (We say that Y has a Chi-square distribution with one degree of freedom.) Warning: Y is not strictly increasing or strictly decreasing.

154 Questions for Chapter 5 .

It would be reasonable to ask for the probability that tomorrow’s temperature is greater than 70◦ and tomorrow’s total rainfall is less than 0.1 inch. For example. we would need to compute the probability of the event B = {T > 70}. “What is the probability that the temperature will be greater than 70◦ regardless of the rainfall?” To answer this question. we wish to determine the probability of the event A = {T > 70. It would not be either easy or desirable to attempt to combine all of this information into a single measurement. we will build on our probability model and extend our definition of a random variable to permit such calculations. relative humidity and temperature. For example. Definition The first thing we must do is to precisely define what we mean by “an array of random variables. In other words.” 155 . wind direction.6 RANDOM VECTORS Basic Concepts Often. We would like to extend the notion of a random variable to deal with an experiment that results in several observations each time the experiment is run. let T be a random variable representing tomorrow’s maximum temperature and let R be a random variable representing tomorrow’s total rainfall. a single random variable cannot adequately provide all of the information needed about the outcome of an experiment. R < 0. atmospheric pressure. In this chapter. Another question that we might like to have answered is.1}. tomorrow’s weather is really best described by an array of random variables that includes wind speed.

156

Definition

y
ω
3 2 1 1 2 3 (4,1)

4

5

x

Figure 6.1: A mapping using (X1 , X2 ) Definition 6.1. Let Ω be a sample space. An n-dimensional random variable or random vector, X1 (·), X2 (·), . . . , Xn (·) , is a vector of functions that assigns to each point ω ∈ Ω a point X1 (ω), X2 (ω), . . . , Xn (ω) in n-dimensional Euclidean space. Example: Consider an experiment where a die is rolled twice. Let X1 denote the number of the first roll, and X2 the number of the second roll. Then (X1 , X2 ) is a two-dimensional random vector. A possible sample point in Ω is

that is mapped into the point (4, 1) as shown in Figure 6.1.

Joint Distributions

157

(x,y)

Figure 6.2: The region representing the event P (X ≤ x, Y ≤ y).

Joint Distributions
Now that we know the definition of a random vector, we can begin to use it to assign probabilities to events. For any random vector, we can define a joint cumulative distribution function for all of the components as follows: Definition 6.2. Let (X1 , X2 , . . . , Xn ) be a random vector. The joint cumulative distribution function for this random vector is given by
FX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ) ≡ P (X1 ≤ x1 , X2 ≤ x2 , . . . , Xn ≤ xn ).

In the two-dimensional case, the joint cumulative distribution function for the random vector (X, Y ) evaluated at the point (x, y), namely
FX,Y (x, y),

is the probability that the experiment results in a two-dimensional value within the shaded region shown in Figure 6.2. Every joint cumulative distribution function must posses the following properties: 1.
all xi →−∞

lim

FX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ) = 0

158

Discrete Distributions

x2 6 5 4 3 2 1 1 2 3 4 5 6 x1

Figure 6.3: The possible outcomes from rolling a die twice. 2. lim
FX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ) = 1

all xi →+∞

3. As xi varies, with all other xj ’s (j = i) fixed,
FX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn )

is a nondecreasing function of xi . As in the case of one-dimensional random variables, we shall identify two major classifications of vector-valued random variables: discrete and continuous. Although there are many common properties between these two types, we shall discuss each separately.

Discrete Distributions
A random vector that can only assume at most a countable collection of discrete values is said to be discrete. As an example, consider once again the example on page 156 where a die is rolled twice. The possible values for either X1 or X2 are in the set {1, 2, 3, 4, 5, 6}. Hence, the random vector (X1 , X2 ) can only take on one of the 36 values shown in Figure 6.3. If the die is fair, then each of the points can be considered to have a probability

Continuous Distributions

159

1 mass of 36 . This prompts us to define a joint probability mass function for this type of random vector, as follows:

Definition 6.3. Let (X1 , X2 , . . . Xn ) be a discrete random vector. Then
pX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ) ≡ P (X1 = x1 , X2 = x2 , . . . , Xn = xn ).

is the joint probability mass function for the random vector (X1 , X2 , . . . , Xn ). Referring again to the example on page 156, we find that the joint probability mass function for (X1 , X2 ) is given by
pX1 ,X2 (x1 , x2 ) =
1 36

for x1 = 1, 2, . . . , 6 and x2 = 1, 2, . . . , 6

Note that for any probability mass function,
FX1 ,X2 ,...,Xn (b1 , b2 , . . . , bn ) =
x1 ≤b1 x2 ≤b2

···
xn ≤bn

pX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ).

Therefore, if we wished to evaluate FX1 ,X2 (3.0, 4.5) we would sum all of the probability mass in the shaded region shown in Figure 6.3, and obtain
1 FX1 ,X2 (3.0, 4.5) = 12 36 = 1 . 3

This is the probability that the first roll is less than or equal to 3 and the second roll is less than or equal to 4.5. Every joint probability mass function must have the following properties: 1. pX1 ,X2 ,...,Xn (x1 , x2 , . . . xn ) ≥ 0 for any (x1 , x2 , . . . , xn ). 2.
all x1

...
all xn

pX1 ,X2 ,...,Xn (x1 , x2 , . . . xn ) = 1 pX1 ,X2 ,...,Xn (x1 , x2 , . . . xn ) for any event E .
(x1 ,...,xn )∈E

3. P (E) =

You should compare these properties with those of probability mass functions for single-valued discrete random variables given in Chapter 5.

Continuous Distributions
Extending our notion of probability density functions to continuous random vectors is a bit tricky and the mathematical details of the problem is beyond the scope of

160

Continuous Distributions

an introductory course. In essence, it is not possible to define the joint density as a derivative of the cumulative distribution function as we did in the one-dimensional case. Let Rn denote n-dimensional Euclidean space. We sidestep the problem by defining the density of an n-dimensional random vector to be a function that when integrated over the set
{(x1 , . . . , xn ) ∈ Rn : x1 ≤ b1 , x2 ≤ b2 , . . . , xn ≤ bn }

will yield the value for the cumulative distribution function evaluated at
(b1 , b2 , . . . , bn ).

More formally, we have the following: Definition 6.4. Let (X1 , . . . , Xn ) be a continuous random vector with joint cumulative distribution function
FX1 ,...,Xn (x1 , . . . , xn ).

The function fX1 ,...,Xn (x1 , . . . , xn ) that satisfies the equation
FX1 ,...,Xn (b1 , . . . , bn ) =
b1 −∞

···

bn −∞

fX1 ,...,Xn (t1 , . . . , tn ) dt1 · · · dtn

for all (b1 , . . . , bn ) is called the joint probability density function for the random vector (X1 , . . . , Xn ). Now, instead of obtaining a derived relationship between the density and the cumulative distribution function by using integrals as anti-derivatives, we have enforced such a relationship by the above definition. Every joint probability density function must have the following properties: 1. fX1 ,X2 ,...,Xn (x1 , x2 , . . . xn ) ≥ 0 for any (x1 , x2 , . . . , xn ).
+∞ +∞ −∞

2.

−∞

···

fX1 ,...,Xn (x1 , . . . , xn ) dx1 · · · dxn = 1 fX1 ,...,Xn (x1 , . . . , xn ) dx1 · · · dxn for any event E .

3. P (E) =

···

E

You should compare these properties with those of probability density functions for single-valued continuous random variables given in Chapter 5.

Continuous Distributions 161 x2 b2 a2 a1 b1 x1 Figure 6. The situation in the multidimensional case is a little more complicated.X2 (a1 . Y ) be a two-dimensional random variable with the following joint probability density function (see Figure 6. a2 < X2 ≤ b2 ). b2 ) − FX1 . y) = 2−y 0 2 1 0 2 if 0 ≤ x ≤ 2 and 1 ≤ y ≤ 2 otherwise Note that (2 − y) dx dy = 1. .4: Computing P (a1 < X1 ≤ b1 . a2 < X2 ≤ b2 ) = FX1 . In the one-dimensional case. a2 ) + FX1 .Y (x. Example: Let (X.4. we had the handy formula P (a < X ≤ b) = FX (b) − FX (a).X2 (b1 . You should be able to verify this formula for yourself by accounting for all of the probability masses in the regions shown in Figure 6.X2 (a1 . b2 ) − FX1 .X2 (b1 . with a comparable formula given by P (a1 < X1 ≤ b1 . a2 ). This worked for any type of probability distribution.5): fX.

162 Marginal Distributions fXY(x.0. To do this.0 −∞ −∞ 1.Y (x. Y ≤ 1. Marginal Distributions Given the probability distribution for a vector-valued random variable (X1 .0 0 fX. Y ≤ y) .5 1. y) over the region {(x. y) = P (X ≤ x. disregarding the other components?” The answer is yes. FX (x)? We know that FX. in the two-dimensional case. Y ≤ 1.Y (x. we might ask the question. i. Performing the integration.5}. y) dx dy 3 (2 − y) dx dy = 8 . Xn ).5: A two-dimensional probability density function Suppose we would like to compute P (X ≤ 1. y) : x ≤ 1.5) = = 1.Y (x. Y ) with joint cumulative distribution function FX. . y).y) 1 1 2 y 1 2 x Figure 6. For example. we may be given a random vector (X.. y ≤ 1. .5 1.5). Suppose we would like to find the cumulative distribution function for X alone.0 1. we calculate the volume under the surface fX. This region of integration is shown shaded (in green) in Figure 6.Y (x.0.e. and the solution requires the careful use of English rather than mathematics. P (X ≤ 1. “Can we determine the distribution of X1 . . we get.5.

To find pX (x).Y (x. So we can use the above definition to find the marginal cumulative distribution function for any of the Xi ’s. for example.5 is a nice definition. . y). we compute 6 pX1 (2) = P (X1 = 2) = k=1 1 pX1 .Y (x. . we shall define what we will call the marginal cumulative distribution function: Definition 6.. . k) = 6 .X2 .5. xn →∞ Notice that we can renumber the components of the random vector and call any one of them X1 .. xn ). y). . .. . Let (X1 . . .. we ask “What is the probability that X = x regardless of the value that Y takes on? This can be written as pX (x) = P (X = x) = P (X = x.Y (x. . y). . Using this idea..Xn (x1 . . .Xn (x1 . expression 1 can be read: “the probability that X takes on a value less than or equal to x and Y takes on any value. .” Therefore. Y ) with joint probability mass function pX. The marginal cumulative distribution function for X1 is given by FX1 (x1 ) = lim x2 →∞ x3 →∞ lim · · · lim FX1 .. x2 ) = 1 36 for x1 = 1. But in terms of both X and Y .X2 (2.. Y = any value) = all y pX. 6 and x2 = 1. . it would make sense to say FX (x) = P (X ≤ x) = P (X ≤ x. . xn ).. .X2 (x1 . Example: In the die example (on page 156) pX1 . Xn ) be a random vector with joint cumulative distribution function FX1 . it is more useful to examine marginal probability mass functions and marginal probability density functions. Y ≤ ∞) = y→∞ lim FX. . 2. 2. .Marginal Distributions 163 and we are asking for (1) FX (x) = P (X ≤ x). 6 To find pX1 (2). Although Definition 6. x2 . . suppose we have a discrete random vector (X. . . For example.

Y (3.1. But notice that in these computations.15. To find the marginal probability mass function for the total daily production. 1) + pX. 3) = 0. for example. Recall that a density.Y (3. Using this joint distribution.15 .Y (1.10 1 . we are simply adding the entries all columns for each row of Table 6. We simply look for the result in the margin1 for the entry Y = 2.2 So.Y (3. Doing this for Y as well as X we can obtain Table 6. Suppose that the probability mass function for (X. we sum the probabilities over all possible values of Y for each fixed x: pX (1) = pX.05 + 0. X .20 etc. 1) = 0.275. is. we can see that the probability of producing 2 items with exactly 1 of those items being defective is pX.05 .1: Joint pmf for daily production pX. fX (x). The procedure is similar for obtaining marginal probability density functions. but it brings some comfort to know we can get it from all of the mathematical machinery we’ve developed thus far.05 .Y (2.05 + 0.10 = 0.025 .10 0 . 2) = 0. y) 1 2 X 3 4 5 Y 2 3 0 0 .10 4 0 0 0 0 . but fX (x)dx.15 + 0. P (Y = 2) = pY (2) = 0.Y (2.10 5 0 0 0 0 0 This is hardly a surprising result.10 = 0.25 pX (3) = pX. Example: Let X be the total number of items produced in a day’s work at a factory.05 .Y (2.10 .164 Marginal Distributions Table 6.05 pX (2) = pX. 2) + pX. itself is not a probability measure.10 . and let Y be the number of defective items produced.Y (x. So 1 Would you believe that this is why they are called marginal distributions? . 1) = 0.1.025 . Y ) is given by Table 6.05 . 1) + pX.

40 1 .10 5 0 0 0 0 0 0 with a little loose-speaking integration notation we should be able to compute fX (x) dx = P (x ≤ X < x + dx) = P (x ≤ X < x + dx.Xn (x1 .05 ... Y = any value) = +∞ −∞ fX. ..05 .Y (x.2: Marginal pmf’s for daily production pX.05 .6.Marginal Distributions 165 Table 6.05 ..10 .Xn . y) 1 2 X 3 4 5 pY (y) Y 2 3 0 0 .Y (x. .275 .. y) dy dx where y is the variable of integration in the above integral. Looking at this relationship as fX (x) dx = +∞ −∞ fX.10 0. . .05 . we sum (or integrate) over all possible values of the unwanted random variable components. .225 pX (x) .10 . .025 .25 .10 .15 .10 .4 4 0 0 0 0 .Y (x. . y) dy dx it would seem reasonable to define fX (x) ≡ +∞ −∞ fX.. Notice that in both the discrete and continuous cases. The marginal probability density function for the random variable X1 is given by fX1 (x1 ) = +∞ −∞ ··· +∞ −∞ fX1 . x2 . y) dy and we therefore offer the following: Definition 6.10 .20 . xn ) dx2 · · · dxn .10 0 .Y (x. . Xn ) be a continuous random variable with joint probability density function fX1 ..X2 . .025 . Let (X1 ..

6 indicates a particular value for x (call it a). we are actually determining how much probability mass has been placed along the line x = a. Example: Let (X. We compute fX (x) = +∞ −∞ fX. the picture of the support would be as shown in Figure 6. y) = x+y 0 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise Find the marginal probability density function for X . Y ) be a two-dimensional continuous random variable with joint probability density function fX.166 Marginal Distributions y 1 0 a 1 x Figure 6.6. Y ) in the example The trick in such problems is to insure that your limits of integration are correct.6: The support for the random vector (X. Repeating this process for each x yields the desired probability density function. Solution: Let’s consider the case where we fix x so that 0 ≤ x ≤ 1. y) dy . For the above example.Y (x. Drawing a picture of the region where there is positive probability mass (the support of the distribution) often helps. The integration process assigns all of that mass to x = a in one-dimension.Y (x. by integrating over all values of y . If the dotted line in Figure 6.

Y (x. So. 1]. y) = 2−y 0 if 0 ≤ x ≤ 2 and 1 ≤ y ≤ 2 otherwise Find the marginal probability density function for X and the marginal probability density function for Y .5 on page 162) fX. Therefore. Y ) be the two-dimensional random variable with the following joint probability density function (see Figure 6. y) dy (2 − y) dy y2 2 y=2 y=1 2y − 1 2 If x is outside the interval [0.Y (x. fX (x) = 1 2 0 0≤x≤2 otherwise . in fact. 2]. summarizing these computations we find that fX (x) = x+ 0 1 2 0≤x≤1 otherwise We will leave it to you to check that fX (x) is. we have fX (x) = 0. Solution: Let’s first find the marginal probability density function for X . we have fX (x) = 0. a probability density +∞ functionby making sure that fX (x) ≥ 0 for all x and that −∞ fX (x) dx = 1. Consider the case where we fix x so that 0 ≤ x ≤ 2.Marginal Distributions 167 = 0 1 (x + y) dy y2 2 y=1 y=0 = xy + = x+ 1 2 If x is outside the interval [0. Example: Let (X. We compute fX (x) = = = = +∞ −∞ 2 1 fX.

. . “What are the possible values that Z can take on?” The answer: “The integers 2. 6. . . Z equals 4 precisely when any one of the mutually exclusive events {X1 = 1. the sum of the two rolls. or {X1 = 3.X2 (x1 . So. consider the case where we fix y so that 1 ≤ y ≤ 2.X2 (2. X2 = 2}. . 4.” For example. Therefore. {X1 = 2. . . . 1) = 3 36 .Y (x. Find the probability mass function for the random variable Z = X1 +X2 . y) = 0 giving us fY (y) = 0. 12. We ask the question. . 2]. Most of these problems just require a little careful reasoning. X2 = 1} occurs. x2 ) = 36 for x1 = 1. 3) + pX1 . we have fX. .X2 (1. 6 and x2 = 1. .168 Functions of Random Vectors To find the marginal probability density function for Y .Y (x. X2 ) be the die-tossing random vector of the example on page 156. Functions of Random Vectors The technique for computing functions of one-dimensional random variables carries over to the multi-dimensional case. 1 Solution: We already know that pX1 . 3. 4 − 2y 1≤y≤2 fY (x) = 0 otherwise You should also double check that fX and fY are both probability density functions. X2 = 3}. . We compute fY (y) = = = +∞ −∞ 2 0 x=2 (2 − y)x|x=0 fX. y) dx (2 − y) dx = 4 − 2y If y is outside the interval [1. pZ (4) = pX1 .X2 (3. . 2) + pX1 . Example: Let (X1 .

Solution: The support for the random variable Z is the interval [0. 6 pZ (7) = 36 . . Example: Let (X. so   0  FZ (z) =   1 ? if z < 0 if 0 ≤ z ≤ 2 if z > 2 For the case 0 ≤ z ≤ 2. we are computing the probability mass assigned to the shaded set in Figure 6. 2]. 4 pZ (5) = 36 .Functions of Random Vectors 169 Continuing in this manner. we wish to evaluate FZ (z) = P (Z ≤ z) = P (X + Y ≤ z). Find the cumulative distribution function for the random variable Z =X +Y. In establishing limits of integration. In other words. we notice that there are two cases to worry about as shown in Figure 6. 2 pZ (11) = 36 . 3 pZ (4) = 36 . 4 pZ (9) = 36 .7 as z varies from 0 to 2. 2 pZ (3) = 36 . 5 pZ (6) = 36 . you should be able to verify that 1 pZ (2) = 36 . 1 pZ (12) = 36 . 5 pZ (8) = 36 . Y ) be the two-dimensional random variable given in the example on page 166.8: Case I (z ≤ 1): FZ (z) = 0 z 0 z−y (x + y) dx dy = 1 3 3z . Case II (z > 1): FZ (z) = 0 z−1 0 1 (x + y) dx dy + 1 z−1 0 z−y (x + y) dx dy 1 1 = z2 − 3 z3 − 3 . 3 pZ (10) = 36 .

8: Two cases to worry about for the example . y z 1 z y 1 0 z 1 x 0 1 z x Case I Case II Figure 6.7: The event {Z ≤ z}.170 Functions of Random Vectors y 1 z x+ y= z 0 z 1 x Figure 6.

. Recall: An event A is independent of an event B if and only if P (A ∩ B) = P (A)P (B). X2 . .8. Theorem 6. X2 . . y) = [pX (x)][pY (y)] for all x and y . Testing for independence Case I: Discrete A discrete random variable X is independent of a discrete random variable Y if and only if pX. each Xi has the same marginal distribution. .d. . Independence of Random Variables Definition 6..e. . b2 . If X and Y are independent random variables then any event A involving X alone is independent of any event B involving Y alone. A sequence of n random variables X1 . . Xn is a random sample if and only if 1. X1 . Xn are independent.1. . . and FX1 . b2 . F (x)). A sequence of n random variables X1 . . FXi (x) = F (x) for all x and for all i (i.Independence of Random Variables 171 These two cases can be summarized as FZ (z) =   0   1 3  z 3 1  z2 − 1 z3 − 3  3   1 if z < 0 if 0 ≤ z ≤ 1 if 1 < z ≤ 2 if z > 2 Notice what we thought at first to be one case (0 ≤ z ≤ 2) had to be divided into two cases (0 ≤ z ≤ 1 and 1 < z ≤ 2).. .Xn (b1 . and 2.. ..7. .X2 . bn .i. .Y (x. Definition 6. . . We say that a random sample is a vector of independent and identically distributed (i. X2 . . . . .) random variables. . . bn ) = FX1 (b1 )FX2 (b2 ) · · · FXn (bn ) for all values b1 . Xn is independent if and only if.

Y (x.Y (0. grade A and grade B.1 0.Y (x.Y (x. Y = y) is given by the following table: y pX. y) = P (X = x.5) = 0.1 The random variables X and Y are not independent. Let X be the time to failure of transistor 1 and let Y be the time to failure of transistor 2.2 1 0.1 0. Let Y denote the number of grade B compressors produced on the same day. fX. y ≥ 0 0 otherwise The marginal densities are fX (x) = 2e−2x x ≥ 0 0 otherwise fY (y) = 2e−2y y ≥ 0 0 otherwise .3 0.Y (x. Suppose that the joint probability mass function pX. Example: A company produces two types of compressors. Note that pX.2 Example: Suppose an electronic circuit contains two transistors. Let X denote the number of grade A compressors produced on a given day. y) 0 x 1 2 0 0.4)(0.172 Independence of Random Variables Case II: Continuous A continuous random variable X is independent of a continuous random variable Y if and only if fX. y) = [fX (x)][fY (y)] for all x and y .1 = pX (0)pY (0) = (0.2 0. 0) = 0. y) = 4e−2(x+y) x ≥ 0.

Y (x. y) = 0 = fX (x)fY (y) = 2e−2x (0) For x < 0 and y ≥ 0: fX. Expectation and random vectors Suppose we are given a random vector (X. Y ) is discrete E(g(X. y) = x + y 0 ≤ x ≤ 1. Can we find E(g(X. E(g(X.Y (x. y) dy dx if (X.Y (x. Y )) = all x all y ∞ ∞ g(x. y) = 4e−2(x+y) = fX (x)fY (y) = 2e−2x 2e−2y For x ≥ 0 and y < 0: fX. Y ) and a function g(x. Y ) is continuous Example: Suppose X and Y have joint probability density function fX. Y )) = −∞ −∞ g(x. y) = 0 = fX (x)fY (y) = (0)2e−2y For x < 0 and y < 0: fX. y)pX. y) = 0 = fX (x)fY (y) = (0)(0) So the random variables X and Y are independent.Y (x. For x ≥ 0 and y ≥ 0: fX. X and Y for all values of (x.Y (x. Y ))? Theorem 6. Y ). y). y).2. y)fX. 0 ≤ y ≤ 1 0 otherwise . y) if (X.Expectation and random vectors 173 We must check the probability density functions for (X.Y (x.Y (x.

174 Expectation and random vectors Let Z = XY . y) dx dy xfX. Using Theorem 6.3. Y ) is a continuous random vector. The proofs for the discrete case are similar using summations rather than integrals.Y (x. then E[h(X)g(Y )] = E[h(X)]E[g(Y )] . If X and Y are independent.Y (x. probability mass functions rather than probability density functions.Y (x.Y (x.4.Y (x.Y (x. Theorem 6. To find E(Z) = E(XY ) use Theorem 6.2: E(X + Y ) = = = = = ∞ ∞ −∞ −∞ ∞ ∞ −∞ −∞ ∞ ∞ (x + y)fX. y) dy dx + ∞ −∞ y fX.Y (x. y) dx dy x fX (x) dx + y fY (y) dy = E(X) + E(Y ) Theorem 6. y) dx dy ∞ −∞ −∞ −∞ ∞ ∞ −∞ ∞ −∞ x −∞ fX.Y (x.2 to get E(XY ) = = = 0 ∞ ∞ −∞ −∞ 1 1 0 1 0 1 0 1 3 3x y 1 3y 0 1 xyfX. y) dx dy yfX. E(X + Y ) = E(X) + E(Y ) Proof. y) dx dy xy(x + y) dx dy x2 y + xy 2 dx dy 1 + 2 x2 y 2 1 0 = = 0 dy 1 + 1 y 2 dy 2 1 0 = 1 2 6y 1 + 6 y3 = 1 3 We will prove the following results for the case when (X. y) dy dx + ∞ ∞ −∞ −∞ ∞ ∞ −∞ −∞ ∞ −∞ yfX. y) dx dy + xfX.

9.4. .2: E[h(X)g(Y )] = ∞ ∞ −∞ −∞ h(x)g(y)fX. = E(h(X)) ∞ −∞ g(y)fY (y) dy = E(h(X))E(g(Y )) Corollary 6. Y ) . = = = ∞ ∞ −∞ −∞ ∞ −∞ ∞ −∞ h(x)g(y)fX (x)fY (y) dx dy ∞ −∞ g(y)fY (y) h(x)fX (x) dx dy g(y)fY (y) E(h(X)) dy since E(h(X)) is a constant. Note that Cov(X. X) = Var(X).Expectation and random vectors 175 Proof. If X and Y are independent.5. For any random variables X and Y Var(X + Y ) = Var(X) + Var(Y ) + 2Cov(X. . then E(XY ) = E(X)E(Y ) Proof. The covariance of the random variables X and Y is Cov(X. Using Theorem 6. . Y ) ≡ E[(X − E(X))(Y − E(Y ))].6.Y (x. Using Theorem 6. y) dx dy since X and Y are independent. Theorem 6. . set h(x) = x and g(y) = y to get E[h(X)g(Y )] = E(h(X))E(g(Y )) E[XY ] = E(X)E(Y ) Definition 6.

. let b = {Y − E(Y )}.5. If X and Y are independent then Cov(X. and E[XY ] are all constants. Y ) = E(XY ) − E(X)E(Y ) = 0 . .7: Cov(X. If X and Y are independent then. Y ) = E[(X − E(X))(Y − E(Y ))] = E[XY − XE(Y ) − E(X)Y + E(X)E(Y )] = E[XY ] + E[−XE(Y )] + E[−E(X)Y ] + E[E(X)E(Y )]] Since E[X]. We can then use Theorem 6. . Proof. Y ) Theorem 6. Y ) = 0. . . Using Definition 6. and expand (a + b)2 to get. Remember that the variance of a random variable W is defined as Var(W ) = E[(W − E(W ))2 ] Now let W = X + Y . we get Cov(X.176 Expectation and random vectors Proof. E(XY ) = E(X)E(Y ).7. E[Y ]. .8. Cov(X.9. Y ) = E(XY ) − E(X)E(Y ) Proof. = E[XY ] − E(Y )E[X] − E(X)E[Y ] + E(X)E(Y ) = E[XY ] − E(X)E(Y ) Corollary 6. = E[{X − E(X)}2 + {Y − E(Y )}2 + 2{X − E(X)}{Y − E(Y )}] = E[{X − E(X)}2 ] + E[{Y − E(Y )}2 ] +2E[{X − E(X)}{Y − E(Y )}] = Var(X) + Var(Y ) + 2Cov(X. Var(X + Y ) = E[(X + Y − E(X + Y ))2 ] = E[(X + Y − E(X) − E(Y ))2 ] = E[({X − E(X)} + {Y − E(Y )})2 ] Now let a = {X − E(X)}. from Corollary 6.

Y = aX + b for some constants a and b. then ρ(X.8: Var(X + Y ) = Var(X) + Var(Y ) + 2Cov(X. Then (with probability one). The converse of this theorem is also true: Theorem 6. 6. Suppose that X and Y are two random variables. Y ) ≡ . Y ) = −1. Y )| = 1. Y ) ≡ E (X − E(X))(Y − E(Y )) . If X and Y are independent then Var(X+Y ) = Var(X)+Var(Y ). If a < 0.12 are stated without proof.11 and 6.11. In other words: If the correlation coefficient ρ is ±1. 1965.Expectation and random vectors 177 Corollary 6.2 Theorem 6. Proof. Reading MA. Then |ρ(X. If X and Y are independent then. Definition 6.11. then ρ(X.10. Suppose that |ρ(X. Y ) Var(X + Y ) = Var(X) + Var(Y ) + 2(0) Var(X + Y ) = Var(X) + Var(Y ) Definition 6. Y ) ρ(X. Introductory probability theory and statistical applications.9. The covariance of two random variables X and Y is given by Cov(X. from Theorem 6. the Y is a linear function of X (with probability one). For any random variables X and Y . Meyer. such that Y = aX + b where a and b are constants. Addison-Wesley. If a > 0. Y ) = +1. Y )| = 1.10. Var(X)Var(Y ) Theorems 6. P.6 and Corollary 6. 2 . The correlation coefficient for two random variables X and Y is given by Cov(X.. |ρ(X. Proofs of these results may be found in the book by Meyer. Theorem 6. Y )| ≤ 1.10.12.

what is the probability that 2 grade A compressors were produced? Solution: P (X = 2 | Y = 0) = = P (X = 2. what is the probability that one of them is a grade B compressor? Solution: P (Y = 1 | X + Y = 2) = = = P (Y = 1. . . y ≥ 0 0 otherwise Given that the total life time for the two transistors is less than two hours. Y = 1) P (X + Y = 2) 1 0. X + Y = 2) P (X + Y = 2) P (X = 1. y pX.178 Expectation and random vectors Random vectors and conditional probability Example: Consider the compressor problem again.3 3 Example: And.1 0.3 0.5 5 Given that 2 compressors were produced on a given day.Y (x. fX. what is the probability that the first transistor lasted more than one hour? .1 Given that no grade B compressors were produced on a given day.2 1 0. y) 0 x 1 2 0 0. consider the two transistors.1 = 0.Y (x. Y = 0) P (Y = 0) 2 0.2 0.2 = 0. y) = 4e−2(x+y) x ≥ 0. again.1 0.

y) and let pY (y) be the marginal probability mass function for Y .Y (x.Y (x. . Case II: Continuous Let X and Y be random variables with joint probability density function fX. y) fY (y) whenever fY (y) > 0. We define the conditional probability density function of X given Y as fX|Y (x | y) = fX.Y (x.Expectation and random vectors 179 Solution: P (X > 1 | X + Y ≤ 2) = P (X > 1. y) and let fY (y) be the marginal probability density function for Y . X + Y ≤ 2) = = e 4e−2(x+y) dy dx − 3e−4 and P (X + Y ≤ 2) = 0 2 0 2−x 4e−2(x+y) dy dx = 1 − 5e−4 to get P (X > 1 | X + Y ≤ 2) = e−2 − 3e−4 = 0.Y (x. X + Y ≤ 2) P (X + Y ≤ 2) 2 1 0 −2 2−x We then compute P (X > 1. We define the conditional probability mass function of X given Y as pX|Y (x | y) = pX.0885 1 − 5e−4 Conditional distributions Case I: Discrete Let X and Y be random variables with joint probability mass function pX. y) pY (y) whenever pY (y) > 0.

X3 . 0) = 1/4 pX. Answers can be found in Appendix B. Let P (X1 + X2 + 2 2 X3 + X4 = 3) = r. S6. each with P (Xi = 1) = 1 and P (Xi = 0) = 1 . 1) = 1/2 Then P (Y = 1) equals (A) 1/3 (B) 1/4 (C) 1/2 .Y (0. X2 . X4 be independent and identically distributed random variables.2 Let (X.Y (0. Y ) be a discrete random vector with joint probability mass function given by pX.1 Let X1 .180 Self-Test Exercises for Chapter 6 Law of total probability Case I: Discrete pX (x) = y pX|Y (x | y)pY (y) Case II: Continuous fX (x) = +∞ −∞ fX|Y (x | y)fY (y) dy Self-Test Exercises for Chapter 6 For each of the following multiple-choice questions. S6. then the value of r is (A) (B) (C) 1 16 1 4 1 2 (D) 1 (E) none of the above. choose the best response among those provided.Y (1. 1) = 1/4 pX.

Y (0. S6.3 Let (X. Y ) be a random vector with joint probability mass function given by pX. 1) = 1 2 pX. Y ) be a continuous random vector with joint probability density function fX. Y ) be a random vector with joint probability mass function given by pX.Y (1. 0) = 1 4 Let pX (x) denote the marginal probability mass function for X . 1) = 1 2 pX.Y (1. S6. 0) = 1 4 The value of P (XY > 0) is (A) 0 (B) (C) 1 4 1 2 (D) 1 (E) none of the above.Self-Test Exercises for Chapter 6 181 (D) 3/4 (E) none of the above. The value of pX (1) is (A) 0 (B) (C) 1 4 1 2 (D) 1 (E) none of the above.Y (x. S6. −1) = 1 4 pX. y) = 1/2 if −1 ≤ x ≤ 1 and 0 ≤ y ≤ 1 0 otherwise Then P (X > 0) equals (A) 0 (B) 1/4 .Y (1.Y (0. −1) = 1 4 pX.5 Let (X.4 Let (X.Y (1.

y) = 0 otherwise Then P (X > 1 ) equals 2 (A) 0 (B) 1/2 (C) e−1/2 (D) 1 −y 2e (E) none of the above. Y ) is a continuous random vector with joint probability density function e−y if 0 ≤ x ≤ 1 and y ≥ 0 fX.182 Self-Test Exercises for Chapter 6 (C) 1/2 (D) 1 (E) none of the above. 1) = 1/3 pX.Y (0. 0) = 1/3 pX.Y (x.Y (0.Y (1. 1) = 1/3 pX.6 Suppose (X.8 Let (X. 0) = 1/3 pX. S6.Y (0. Y ) be a discrete random vector with joint probability mass function given by pX. S6.Y (1.7 Let (X. Y ) be a discrete random vector with joint probability mass function given by pX. 0) = 1/3 Then P (X + Y = 1) equals (A) 0 (B) 1/3 (C) 2/3 (D) 1 (E) none of the above.Y (0. S6. 0) = 1/3 .

. Y < 0. Y ) is a continuous random vector with joint probability density function   y if 0 ≤ x ≤ 2 and  0≤y≤1 fX.Y (x. y) =   0 otherwise Then P (X < 1. Y ) is a continuous random vector with joint probability density function   y if 0 ≤ x ≤ 2 and  0≤y≤1 fX. Then P (W = 1) equals (A) 0 (B) 1/3 (C) 2/3 (D) 1 (E) none of the above.5) equals (A) 0 (B) 1/8 (C) 1/4 (D) 1/2 (E) none of the above.10 Suppose (X. y) =   0 otherwise Then E(X 2 ) equals (A) 0 (B) 1 (C) 4/3 (D) 8/3 (E) none of the above. S6.Y (x.9 Suppose (X. Y }. S6.Self-Test Exercises for Chapter 6 183 Let W = max{X.

then Var(X + Y ) must equal (A) 2Var(X) √ (B) 2(Var(X)) (C) Var(2X) (D) 4Var(X) (E) none of the above.13 If X and Y are random variables with Var(X) = Var(Y ).Y (x. S6. Y ) be a continuous random vector with joint probability density function 1/π if x2 + y 2 ≤ 1 fX. The variance of Y is equal to √ (A) 4 n (B) 4n (C) 4n2 (D) 4 (E) none of the above. then the value of p is (A) (B) (C) 1 3 √ 3 r 1 2 (D) r3 (E) none of the above.11 The weights of individual oranges are independent random variables.184 Self-Test Exercises for Chapter 6 S6. each having an expected value of 6 ounces and standard deviation of 2 ounces. y) = 0 otherwise Then P (X > 0) equals . each with P (Xi = 1) = p and P (Xi = 0) = 1 − p. S6. Let Y denote the total net weight (in ounces) of a basket of n oranges. X2 .14 Let (X.12 Let X1 . S6. If P (X1 + X2 + X3 = 3) = r. X3 be independent and identically distributed random variables.

S6.17 Toss a fair die twice. Let the random variable X represent the outcome of the first toss and the random variable Y represent the outcome of the second toss. What is the probability that X is odd and Y is even. Y and Z be independent random variables with E(X) = 1 E(Y ) = 0 E(Z) = 1 Let W = X(Y + Z).15 Let X be a random variable.Self-Test Exercises for Chapter 6 185 (A) 1/(4π) (B) 1/(2π) (C) 1/2 (D) 1 (E) none of the above. Then E(W ) equals (A) 0 (B) E(X 2 ) (C) 1 (D) 2 (E) none of the above S6. Then E[(X + 1)2 ] − (E[X + 1])2 equals (A) Var(X) (B) 2E[(X + 1)2 ] (C) 0 (D) 1 (E) none of the above. S6. (A) 1/4 (B) 1/3 (C) 1/2 .16 Let X .

Y ) is random vector with joint probability density function fX.Y (x. 0) = 1/4 pX. y) = k for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 and y ≤ x 0 otherwise where k is a some positive constant. Y ) is a random vector with joint probability mass function given by pX.Y (0. y + w} (E) none of the above.186 Self-Test Exercises for Chapter 6 (D) 1 (E) none of the above S6. 0) = 1/2 pX. The limit of integration that should appear in position D is (A) min{1.19 Assume that (X.Y (0. To find the cumulative distribution function for W we can compute FW (w) = B A D C k dx dy for 0 ≤ w ≤ 1.Y (−1. Let W = X −Y . S6. y − w} (C) min{1. . x − w} (B) min{1.18 Assume that (X. 1) = 1/4 Define the random variable W = XY . The value of P (W = 0) is (A) 0 (B) 1/4 (C) 1/2 (D) 3/4 (E) none of the above. x − y} (D) min{1.

21 Suppose (X. y) = 2e−(x+2y) if x ≥ 0 and y ≥ 0 0 otherwise Then P (X + Y ≤ 1) equals (A) (B) (C) 1 1 −(x+2y) dx dy 0 0 2e 1 1−y 0 0 2e−(x+2y) dx dy 1 1 −(x+2y) dx dy 0 y 2e . S6. Then E(X 2 + Y 2 ) equals (A) 0 (B) 1 (C) 3 (D) 5 (E) none of the above.Y (x. The joint probability density function of (X.20 Let X be a random variable with E(X) = 0 and Var(X) = 1. y) = 4xy 0 if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise Then E 1 XY equals (A) +∞ (B) 0 (C) 1/2 (D) 4 (E) none of the above.Self-Test Exercises for Chapter 6 187 S6. Y ) is a continuous random vector with joint probability density function fX. Y ) is given by fX.Y (x. Let Y be a random variable with E(Y ) = 0 and Var(Y ) = 4. Y ) where X is the life length of transistor 1 and Y is the life length of transistor 2.22 The life lengths of two transistors in an electronic circuit is a random vector (X. S6.

20. .10. ω3 .Y (1. Z(ω3 ) = 3.188 Questions for Chapter 6 (D) 1 1 0 1−y 2e−(x+2y) dx dy (E) none of the above.10. Questions for Chapter 6 6. 1) = 0. (c) Find the probability mass function for X . Y and Z are defined as X(ω1 ) = 0. Y (ω3 ) = 1. 6. 3) = 0.Y (2.20. 1) = 0. pX.05. P (X ≤ 2. P (·) assigns probabilities.Y (1. pX. pX. (c) Find the distribution for the random variable Z = X + Y which is the total daily production of gizmos. as follows: P ({ω1 }) = P ({ω3 }) = 1 2 1 4 P ({ω2 }) = P ({ω4 }) = 1 8 1 8 Random variables X . pX. pX.2 A sample space Ω is a set consisting for four points. ω2 . Z).1 A factory can produce two types of gizmos. 2) = 0.Y (3. and let Y be the number of Type 2 gizmos produced on the same day.20.Y (3.Y (1. Z(ω1 ) = 1. Z(ω2 ) = 2. (b) Find the marginal probability mass functions for X and for Y. X(ω4 ) = 1 Y (ω4 ) = 0 Z(ω4 ) = 4 (a) Find the joint probability mass function for (X. Y (ω2 ) = 1. A probability measure. {ω1 . pX. The joint probability mass function for X and Y is given by pX. 1) = 0. Let X be a random variable denoting the number of Type 1 gizmos produced on a given day. Y.05. Y ).10.Y (1. (a) Compute P (X ≤ 2. 0) = 0. pX. ω4 }. 2) = 0. Y (ω1 ) = 0. Y = 2). Type 1 and Type 2. Y = 2) and P (Y > 0). X(ω3 ) = 1. (b) Find the joint probability mass function for (X. 2) = 0.Y (2. X(ω2 ) = 0.

Y (x. In other words. Y ) have a uniform distribution over the rectangle (0. 0) . The first one to fail burns out at time X and the second one at time Y . (e) Find the probability mass function for the random variable T = XY Z . y)-space for which the above probability density function assigns positive mass.Y (x. 1). 1).3 Let (X. (f) Find the joint probability mass function for the random variable (W. (1.4 Two light bulbs are burned starting at time 0. 6. i. y) = 2 0 if x > 0. (c) Let Z denote the excess life of the second bulb. y) = 1 0 if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise Find the cumulative distribution function for the random variable Z = X+Y . Find the probability density function for Z . (b) Find the marginal probability density functions for X and for Y . Y ) is given by fX. The joint probability density function for (X. (1. 6. let Z = Y − X . (g) Find the probability mass function for the random variable V = max{X. Y ) is given by fX. (0. y > 0 and x + y < 1 otherwise . (d) Compute P (Z > 1) and P (Y > 1).. T ).Y (x. Y }.e. 0). Obviously X ≤ Y .Questions for Chapter 6 189 (d) Find the probability mass function for the random variable W = X + Y + Z. y) = 2e−(x+y) 0 if 0 < x < y < ∞ otherwise (a) Sketch the region in (x.5 If two random variables X and Y have a joint probability density function given by fX. the probability density function for (X. 6.

7 Suppose that the random vector (X.Z (x. Y. Y ). (d) Find the marginal probability density function for X . (b) Evaluate the constant k . (b) Find the probability that X will take on a value less than and Y and Z will both take on values less than 1. Z) is a random vector with joint probability density function fX.Y. (c) Find the marginal probability density function for Y . 3 (d) Find the marginal probability density functions for X and for Y . 0 < y < 2 and 0 < z < 2 otherwise (a) Find the value of the constant α. z) = α(xyz 2 ) 0 if 0 < x < 1. 6.190 Questions for Chapter 6 (a) Find the probability that both random variables will take on a value less than 1 . 6. Y ) has joint probability density function fX. (e) Find the probability density function for Z = X + Y . 2 (b) Find the probability that X will take on a value less than and Y will take on a value greater than 1 . .Y (x. y. 2 1 4 (c) Find the probability that the sum of the values taken on by the two random variables will exceed 2 . (e) Find the marginal probability density function for Z = X + Y. 1 2 (c) Find the probability density function for the random vector (X.6 Suppose (X. Y ) in the xy plane. y) = kx(x − y) 0 if 0 ≤ x ≤ 2 and |y| ≤ x otherwise (a) Sketch the support of the distribution for (X.

and the width. and the joint probability density function of X and Y is given by fX. 000 units) is given by fS.Y (x. X . P .8 The length. p) = 5pe−ps 0 0. Y ) with joint probability density function fX.20 < p < 0. S . of a commodity (in dollars) and total sales. y) = x 0 if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 2 otherwise (a) Find the marginal probability density functions for X and for Y .Y (x.Questions for Chapter 6 191 6.10 If X is the proportion of persons who will respond to one kind of mail-order solicitation. (d) Find the probability that there will be at least a 20% response to the second type of mail-order solicitation. Y . 6. (b) Find the conditional probability density function for X given that Y takes on the value y . Y is proportion of persons who will respond to a second type of mail-order solicitation.25. 000 units given P = 0. (c) Find the conditional probability density function for Y given that X takes on the value x.P (s.9 If the joint probability density function of the price. . (c) Find the probability that sales will exceed 20. (b) Find the conditional probability density function for S given that P takes on the value p. y) = 2 5 (x + 4y) 0 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise (a) Find the marginal probability density functions for X and for Y .40 and s > 0 otherwise (a) Find the marginal probability density functions for P and for S . (in 10. of salt crystals form a random variable (X. 6.

(e) Find the conditional distribution for X given Y = y . 0). 6. let pX|Y be the conditional probability mass function of X given Y .25).192 Questions for Chapter 6 (e) Find the probability that there will be at least a 20% response to the second type of mail-order solicitation given that there has only been a 10% response to the first kind of mail-order solicitation.12 The coordinates of a laser on a circular target are given by the random vector (X. X . the time when fuse 1 blows is a random variable. (Note: Be careful of the case |y| = 1.) 6.Y (x. Y ) has a uniform distribution on a disk of radius one centered at (0. 6. and the time when fuse 2 blows is a random variable. (X.13 Let X and Y be discrete random variables with the support of X denoted by Θ and the support of Y denoted by Φ.11 An electronic component has two fuses. Y ) is given by fX.Y (x. Y ) with the following probability density function:   1 fX. If an overload occurs. Let pX be the marginal probability mass function for X . (b) Find the marginal distributions for X and Y . Y > 0). (b) Are X and Y independent? Justify your answer. (a) Compute P (X 2 + Y 2 ≤ 0. The joint probability density function of the random vector (X. Y . y) = e−y 0 0 ≤ x ≤ 1 and y ≥ 0 otherwise (a) Compute P (X + Y ≥ 1). (c) Compute P (X ≥ 0. and let pY |X be the conditional probability mass function of . y) =  π 0 x2 + y 2 ≤ 1 otherwise Hence. (d) Compute P (X ≥ 0 | Y > 0).

0 < z < 2 otherwise Find E(X + Y + Z). . P . Z) is a random vector with joint probability density function fX.P (s.16 Suppose (X. p) = 5pe−ps 0 0. 000 units) is given by fS.Questions for Chapter 6 193 Y given X .5). Show that pX|Y (x|y) = pY |X (y|x)pX (x) pY |X (y|x)pX (x) x∈Θ for any x ∈ Θ and y ∈ Φ.40 and s > 0 otherwise Find E(S | P = 0.Z (x. (d) Compute P (X < 0. (c) Compute P (X < 0. 6. (in 10.15 Suppose (X. S .5). Y.5 | Y = 0.Y (x. y) = 2 0 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 − x otherwise (a) Determine the marginal probability density functions for X and for Y . Y ) is a continuous random vector with joint probability density function fX. respectively. 6. Y ) has joint probability density function fX. 0 < y < 2. (b) Are X and Y independent random variables? Justify your answer.14 The volumetric fractions of each of two compounds in a mixture are random variables X and Y .Y (x. 6. of a commodity (in dollars) and total sales.5 | Y < 0. y > 0 and x + y < 1 otherwise Find E(X + Y ).25). 6.20 < p < 0. z) = 3 2 8 (xyz ) 0 if 0 < x < 1. y) = 2 0 if x > 0.Y. y. The random vector (X.17 Suppose the joint probability density function of the price.

it has been determined that the random vector (X.03 . (c) Find E(X).25 .Y (x.04 . Plans are being made for a party.Y (x. y) = 2xe−y 0 if 0 ≤ x ≤ 1 and y ≥ 0 otherwise (a) Find P (X > 0. After an extensive survey was conducted.05 . with 2 juniors and 3 seniors.20 Suppose that (X.20 0 1 2 3 . and let Y be the proportion of persons who will respond to a second type of mail-order solicitation.Y (x. y) = 2x 0 if 0 < x < 1 and 0 < y < 1 otherwise Find the marginal probability density function for Y .194 Questions for Chapter 6 6. 6.5). 6. Y ) is a random vector with the following joint probability density function: fX.15 . Y ) has the following joint probability mass function: pX.05 .05 .01 .19 Suppose that (X. Let X denote the number of juniors who will attend the party.01 .Y (x. (d) Are X and Y independent random variables? Justify your answer. and let Y denote the number of seniors who will attend.21 The Department of Metaphysics of Podunk University has 5 students. y) X (juniors) Y (seniors) 0 1 2 .04 . 6. y) = 2 5 (x + 4y) 0 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise Find E(X | Y = y) for each 0 < y < 1. Suppose the joint probability density function of X and Y is given by fX.18 Let X be the proportion of persons who will respond to one kind of mailorder solicitation. (b) Find the marginal probability density function for Y .12 . Y ) is a random vector with the following joint probability density function: fX.

2 (d) Are X and Y independent random variables? Justify your answer. Y ) be a continuous random vector with joint probability density function given by fX. y ≥ 0 and x + y ≤ 1 . otherwise (a) Compute P (X < Y ).5 | Y ≤ 0.23 Let (X. 6. (d) Find P (X > 0. Y ) be a continuous random vector with joint probability density function given by fX. (b) Are X and Y independent random variables? Justify your answer.Y (x. what is the expected amount of money that will be collected from all students? (d) The two juniors arrive early at the party. y) = 4xy 0 for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 .Questions for Chapter 6 195 (a) What is the probability that 3 or more students (juniors and/or seniors) will show up at the party? (b) What is the probability that more seniors than juniors will show up at the party? (c) If juniors are charged $5 for attending the party and seniors are charged $10.22 Let (X. . What is the probability that no seniors will show up? 6. (c) Find the conditional probability density function for Y given that X = 1 . y) = 2 0 if x ≥ 0.5).Y (x. otherwise (a) Find the marginal probability density function for X and the marginal probability density function for Y . (b) Find the marginal probability density function for X and the marginal probability density function for Y . (c) Find P (X + Y ≤ 1).

5. 2. and let Y be the number on the second ball. 3 and 4.5 | Y < 1. Y ). y) = 6x2 y 0 for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise (a) Find the marginal probability density function for X and the marginal probability density function for Y . 6. 6. Y ) be a continuous random vector with joint probability density function given by fX. (b) Compute P (Y ≥ 3 | X = 1). (a) Find the joint probability mass function for the random vector (X.25 A box contains four balls numbered 1.24 Let (X. y) = if 0 < x < 1 and y > 0 otherwise  0    (a) Compute P (X > 0.Y (x.196 Questions for Chapter 6 6. (b) Are X and Y independent random variables? Justify your answer.Y (x. Let X be the number on the first ball drawn. A second ball is then drawn at random from the box. (c) Find the marginal probability density function for X .0). Y < 1.0). (c) Find P (X + Y ≤ 1). A game consists of drawing one of the balls at random from the box. It is not replaced. .26 Suppose that (X. (d) Find E(X + Y ). Y ) is a random vector with the following joint probability density function:   2xe−y    fX. (d) Find E(X + Y ). (b) Compute P (X > 0.

6.27 Past studies have shown that juniors taking a particular probability course receive grades according to a normal distribution with mean 65 and variance 36. Seniors taking the same course receive grades normally distributed with a mean of 70 and a variance of 36. A probability class is composed of 75 juniors and 25 seniors.
(a) What is the probability that a student chosen at random from the class will receive a grade in the 70's (i.e., between 70 and 80)?
(b) If a student is chosen at random from the class, what is the student's expected grade?
(c) A student is chosen at random from the class and you are told that the student has a grade in the 70's. What is the probability that the student is a junior?

6.28 (†) Suppose X and Y are positive, independent continuous random variables with probability density functions f_X(x) and f_Y(y), respectively. Let Z = X/Y.
(a) Express the probability density function for Z in terms of an integral that involves only f_X and f_Y.
(b) Now suppose that X and Y can be positive and/or negative. Express the probability density function for Z in terms of an integral that involves only f_X and f_Y. Compare your answer to the case where X and Y are both positive.


7 TRANSFORMS AND SUMS

Transforms

Notation

Throughout this section we will use the following notation:

Definition 7.1. For any differentiable function h(u) and constant c,

    h^(k)(c) ≡ d^k h(u)/du^k evaluated at u = c.

In other words, h^(k)(c) is the kth derivative of h evaluated at u = c.

Example: Let h(x) = x³. Then

    h^(2)(4) = d²h(x)/dx² |_{x=4} = d²(x³)/dx² |_{x=4} = 6x |_{x=4} = 6(4) = 24.

Probability generating functions

Definition 7.2. Let X be a discrete random variable with support on the nonnegative integers. Then the probability generating function for X is given by

    g_X(z) ≡ E(z^X)   for |z| < 1.

Example: Let X have a Poisson distribution with parameter α > 0. Then

    p_X(k) = e^{-α} α^k / k!   for k = 0, 1, 2, ....

Recalling that e^x = Σ_{k=0}^∞ x^k/k!, we get

    g_X(z) = E(z^X) = Σ_{k=0}^∞ z^k e^{-α} α^k / k! = e^{-α} Σ_{k=0}^∞ (αz)^k / k! = e^{-α} e^{αz} = e^{α(z-1)}.

Theorem 7.1. If g_X(z) is the probability generating function of X, then

    P(X = k) = g_X^(k)(0) / k!.

Proof. (Informal proof) Let g^(k)(z) denote the kth derivative of g with respect to z. We have

    g_X(z) = E(z^X) = Σ_{k=0}^∞ z^k P(X = k).

Writing out the sum explicitly,

    g_X(z) = z⁰ P(X = 0) + z¹ P(X = 1) + z² P(X = 2) + z³ P(X = 3) + ⋯

so that g_X(0) = (1)P(X = 0). Now take the first derivative of g_X(z),

    g_X^(1)(z) = P(X = 1) + 2z P(X = 2) + 3z² P(X = 3) + 4z³ P(X = 4) + ⋯

and evaluate it at zero: g_X^(1)(0) = P(X = 1). Another derivative produces

    g_X^(2)(z) = 2 P(X = 2) + 3(2)z P(X = 3) + 4(3)z² P(X = 4) + 5(4)z³ P(X = 5) + ⋯

so that g_X^(2)(0) = 2 P(X = 2). And once more, to see the pattern:

    g_X^(3)(z) = 3(2) P(X = 3) + 4(3)(2)z P(X = 4) + 5(4)(3)z² P(X = 5) + ⋯

so that g_X^(3)(0) = 3(2) P(X = 3). And, in general,

    g_X^(n)(0) = n! P(X = n).

Theorem 7.2. If g_X(z) is the probability generating function of X, then E(X) = g_X^(1)(1), where g_X^(1)(1) is the first derivative of g_X(z) with respect to z, evaluated at z = 1.

Proof. As above,

    g_X^(1)(z) = P(X = 1) + 2z P(X = 2) + 3z² P(X = 3) + 4z³ P(X = 4) + ⋯

Therefore,

    g_X^(1)(1) = P(X = 1) + 2 P(X = 2) + 3 P(X = 3) + 4 P(X = 4) + ⋯ = Σ_{k=0}^∞ k P(X = k) = E(X).

Theorem 7.3. If X_1, X_2, ..., X_n are independent random variables with probability generating functions g_{X_1}(z), g_{X_2}(z), ..., g_{X_n}(z), respectively, then the probability generating function of the random variable Y = X_1 + X_2 + ⋯ + X_n is

    g_Y(z) = Π_{i=1}^n g_{X_i}(z).

Proof. Recall that if X and Y are independent random variables, then E(h(X)g(Y)) = E(h(X))E(g(Y)).

Therefore,

    g_Y(z) = E(z^Y) = E(z^{X_1 + X_2 + ⋯ + X_n}) = E(z^{X_1} z^{X_2} ⋯ z^{X_n})
           = E(z^{X_1}) E(z^{X_2}) ⋯ E(z^{X_n}) = g_{X_1}(z) g_{X_2}(z) ⋯ g_{X_n}(z).

Theorem 7.4. If g_X(z) is the probability generating function of X, and a is a nonnegative, integer constant, then the probability generating function of the random variable Y = a + X is g_Y(z) = z^a g_X(z).

Proof. We have

    g_Y(z) = E(z^Y) = E(z^{a+X}) = E(z^a z^X) = z^a E(z^X) = z^a g_X(z).

We offer Theorem 7.5 without proof. Except for some technical details, it is a direct result of Theorem 7.1.

Theorem 7.5. The probability generating function of a random variable uniquely determines its probability distribution.

Example: Let T have a Bernoulli distribution with parameter p. That is,

    P(T = 1) = p
    P(T = 0) = 1 − p = q

Then the probability generating function of T is given by g_T(z) = z⁰q + z¹p = q + zp.
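Theorems 7.1 and 7.2 are easy to check with a computer algebra system. The following short sketch (an illustration, not part of the original notes; it assumes the sympy package is available) recovers the Poisson probability mass function and mean from the generating function g_X(z) = e^{α(z-1)} derived above.

```python
# Check Theorems 7.1 and 7.2 for the Poisson pgf using sympy.
import sympy as sp

z, alpha = sp.symbols('z alpha', positive=True)
g = sp.exp(alpha * (z - 1))          # pgf of a Poisson(alpha) random variable

# Theorem 7.1: P(X = k) = g^(k)(0) / k!
for k in range(4):
    p_k = sp.diff(g, z, k).subs(z, 0) / sp.factorial(k)
    print(k, sp.simplify(p_k))       # prints exp(-alpha)*alpha**k / k!

# Theorem 7.2: E(X) = g'(1)
print(sp.diff(g, z).subs(z, 1))      # prints alpha
```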

Example: Now suppose T_1, T_2, ..., T_n are n independent Bernoulli random variables, each with

    P(T_i = 1) = p
    P(T_i = 0) = 1 − p = q

We know (from Chapter 4) that X = T_1 + T_2 + ⋯ + T_n has a Binomial distribution with parameters n and p. That is,

    P(X = k) = C(n,k) p^k q^{n−k}   for k = 0, 1, ..., n.

One could obtain the probability generating function for X by computing

    g_X(z) = E(z^X) = Σ_{k=0}^n z^k C(n,k) p^k q^{n−k}.

But we can use Theorem 7.3 with the T_i's to easily get

    g_X(z) = g_{T_1}(z) g_{T_2}(z) ⋯ g_{T_n}(z) = (q + zp)^n.

As an exercise, use Theorem 7.2 to get E(X) from g_X(z).

Moment generating functions

Definition 7.3. The kth moment about the origin of a random variable X is defined as E(X^k).

Definition 7.4. Let X be (almost) any random variable. Then the moment generating function for X is given by

    M_X(t) ≡ E(e^{tX}).

Example: Let X have a uniform distribution on the interval (a, b). Then

    f_X(x) = 1/(b−a) if a ≤ x ≤ b; 0 otherwise.

Therefore,

    M_X(t) = E(e^{tX}) = ∫_{−∞}^{∞} e^{tx} f_X(x) dx
           = ∫_a^b e^{tx}/(b−a) dx
           = [1/(t(b−a))] ∫_a^b t e^{tx} dx
           = [1/(t(b−a))] e^{tx} |_a^b
           = (e^{tb} − e^{ta}) / (t(b−a))   for t ≠ 0.

Example: Let X have a Poisson distribution with parameter α > 0. Then

    p_X(k) = e^{−α} α^k / k!   for k = 0, 1, 2, ....

Recalling that e^x = Σ_{k=0}^∞ x^k/k!, we get

    M_X(t) = E(e^{tX}) = Σ_{k=0}^∞ e^{tk} e^{−α} α^k / k! = e^{−α} Σ_{k=0}^∞ (αe^t)^k / k! = e^{−α} e^{αe^t} = e^{α(e^t − 1)}.

Note that since a Poisson random variable X has support {0, 1, 2, ...}, it can have both a probability generating function and a moment generating function. When a random variable has both a probability generating function and a moment generating function, we can derive the following relationship:

    g_X(z) = E[z^X]

Substituting z = e^t, we obtain

    g_X(e^t) = E[(e^t)^X] = E[e^{tX}] = M_X(t).
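The relation g_X(e^t) = M_X(t) is simple to spot-check numerically. The following minimal sketch (not part of the original notes; the values of α and t are arbitrary choices for illustration) compares the two sides for the Poisson distribution.

```python
# Numeric spot-check of g_X(e^t) = M_X(t) for a Poisson random variable.
import math

alpha, t = 2.5, 0.3
g = lambda z: math.exp(alpha * (z - 1))            # Poisson pgf
M = lambda t: math.exp(alpha * (math.exp(t) - 1))  # Poisson mgf
print(g(math.exp(t)), M(t))                        # the two values are identical
```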

Since the probability generating function for a Poisson random variable is g_X(z) = e^{α(z−1)}, the moment generating function for a Poisson random variable must be

    M_X(t) = g_X(e^t) = e^{α(e^t − 1)}.

It's important to remember that this relationship will only hold for random variables with support {0, 1, 2, ...} (that is, those random variables with probability generating functions).

Example: Let X have an exponential distribution with parameter α > 0. Then

    f_X(x) = αe^{−αx} if x ≥ 0; 0 otherwise.

Therefore,

    M_X(t) = E(e^{tX}) = ∫_0^∞ e^{tx} αe^{−αx} dx = ∫_0^∞ α e^{x(t−α)} dx.

Note that this integral only converges if t < α. Continuing,

    M_X(t) = α e^{x(t−α)}/(t−α) |_0^∞ = α/(α−t)   for t < α.

Example: (From Meyer¹) Let X ∼ N(µ, σ²). Then

    f_X(x) = (1/(σ√2π)) exp(−½((x−µ)/σ)²).

Therefore,

    M_X(t) = (1/(σ√2π)) ∫_{−∞}^{+∞} e^{tx} exp(−½((x−µ)/σ)²) dx.

Let (x − µ)/σ = s, thus x = σs + µ and dx = σ ds. Therefore,

    M_X(t) = (1/√2π) ∫_{−∞}^{+∞} exp[t(σs + µ)] e^{−s²/2} ds.

¹ Meyer, P., Introductory Probability Theory and Statistical Applications, Addison-Wesley, Reading MA, 1965.

Continuing,

    M_X(t) = e^{tµ} (1/√2π) ∫_{−∞}^{+∞} exp(−½[s² − 2σts]) ds
           = e^{tµ} (1/√2π) ∫_{−∞}^{+∞} exp{−½[(s − σt)² − σ²t²]} ds
           = e^{tµ + σ²t²/2} (1/√2π) ∫_{−∞}^{+∞} exp(−½[s − σt]²) ds.

Now let s − σt = v; then ds = dv and we obtain

    M_X(t) = e^{tµ + σ²t²/2} (1/√2π) ∫_{−∞}^{+∞} e^{−v²/2} dv = e^{tµ + σ²t²/2}.

Theorem 7.6. If M_X(t) is the moment generating function of X, then

    E(X^k) = M_X^(k)(0)   for k = 0, 1, 2, ....

Proof. (Informal proof) Let M_X^(k)(t) denote the kth derivative of M_X with respect to t. We have

    M_X(t) = E(e^{tX}) = E(Σ_{k=0}^∞ (tX)^k / k!).

Writing out the sum explicitly, we get

    M_X(t) = E(1 + tX + (tX)²/2! + (tX)³/3! + (tX)⁴/4! + ⋯).

If the sum on the right hand side converges, we get

    M_X(t) = 1 + tE(X) + t²E(X²)/2! + t³E(X³)/3! + t⁴E(X⁴)/4! + ⋯.

Take the first derivative of both sides to get

    M_X^(1)(t) = E(X) + tE(X²) + t²E(X³)/2! + t³E(X⁴)/3! + ⋯

and when evaluated at t = 0,

    M_X^(1)(0) = E(X).

Take the second derivative of both sides to get

    M_X^(2)(t) = E(X²) + tE(X³) + t²E(X⁴)/2! + t³E(X⁵)/3! + ⋯

and when evaluated at t = 0,

    M_X^(2)(0) = E(X²).

Continuing, we see that

    M_X^(k)(t) = E(X^k) + tE(X^{k+1}) + t²E(X^{k+2})/2! + t³E(X^{k+3})/3! + ⋯

and when evaluated at t = 0,

    M_X^(k)(0) = E(X^k).

Theorem 7.7. The moment generating function of a random variable uniquely determines its probability distribution.

Theorem 7.8. If M_X(t) is the moment generating function of X, and a and b are constants, then the moment generating function of the random variable Y = aX + b is

    M_Y(t) = e^{bt} M_X(at).

Theorem 7.9. If X_1, X_2, ..., X_n are independent random variables with moment generating functions M_{X_1}(t), M_{X_2}(t), ..., M_{X_n}(t), respectively, then the moment generating function of the random variable Y = X_1 + X_2 + ⋯ + X_n is

    M_Y(t) = Π_{i=1}^n M_{X_i}(t).

Example: If X ∼ N(µ, σ²) then the moment generating function for X is

    M_X(t) = e^{tµ + ½t²σ²}.
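Theorem 7.6 can be illustrated directly with symbolic differentiation. The sketch below (not from the original notes; it assumes sympy is available) differentiates the normal moment generating function at t = 0 to read off the first two moments.

```python
# Moments of N(mu, sigma^2) from its mgf, via Theorem 7.6.
import sympy as sp

t, mu, sigma = sp.symbols('t mu sigma', positive=True)
M = sp.exp(mu * t + sigma**2 * t**2 / 2)

EX  = sp.diff(M, t, 1).subs(t, 0)    # first moment  -> mu
EX2 = sp.diff(M, t, 2).subs(t, 0)    # second moment -> mu**2 + sigma**2
print(sp.simplify(EX), sp.simplify(EX2))
print(sp.simplify(EX2 - EX**2))      # variance: sigma**2
```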

So, suppose you are given that the random variable X has a normal distribution with mean µ and variance σ². So the moment generating function for X is

    M_X(t) = e^{tµ + ½t²σ²}.

Consider the random variable

    Z = (X − µ)/σ = (1/σ)X − µ/σ.

Using Theorem 7.8 with a = 1/σ and b = −µ/σ, we find that the moment generating function for Z is

    M_Z(t) = e^{bt} M_X(at) = e^{−tµ/σ} M_X(t/σ)
           = e^{−tµ/σ} e^{(t/σ)µ + (t²/σ²)(σ²/2)}
           = e^{−tµ/σ + tµ/σ + t²/2}
           = e^{t²/2}

which is the moment generating function of a normal random variable with expected value equal to zero and variance equal to one. Therefore, we have shown that if X ∼ N(µ, σ²) then Z = (X − µ)/σ ∼ N(0, 1). And, if you encounter a random variable Y with moment generating function

    M_Y(t) = e^{t²/2}

you can immediately say that Y ∼ N(0, 1). Now, if someone tells you that the random variable W has moment generating function

    M_W(t) = e^{−2t + 4t²}

then the distribution of W must be N(−2, 8).

Other transforms of random variables

Laplace transform

Definition 7.5. Let X be any random variable. Then the Laplace transform for X is given by

    L_X(t) ≡ E(e^{−tX}).

Note that the moment generating function for a continuous random variable X with probability density function f_X(x) is

    M_X(t) = ∫_{−∞}^{∞} e^{tx} f_X(x) dx

and the Laplace transform for the same density function is

    L_X(t) = ∫_{−∞}^{∞} e^{−tx} f_X(x) dx.

So the moment generating function of a continuous random variable is simply the Laplace transform of the probability density function with t replaced by −t everywhere in the formula for the transform.

Characteristic function

Definition 7.6. Let X be (almost) any random variable. Then the characteristic function for X is given by

    ψ_X(t) ≡ E(e^{itX})

where i ≡ √−1.

Linear combinations of normal random variables

Theorem 7.10. If X_1, X_2, ..., X_n are independent normal random variables with X_i ∼ N(µ_i, σ_i²), and a_1, a_2, ..., a_n and b are constants, then

    Y = b + Σ_{i=1}^n a_i X_i ∼ N(µ_Y, σ_Y²)

where

    µ_Y = b + Σ_{i=1}^n a_i µ_i

and

    σ_Y² = Σ_{i=1}^n a_i² σ_i².

Example: The weights of snails are normally distributed with mean 6 gm and variance 4 gm². It is reasonable to assume that the weights of n snails chosen at random from the population of all snails are n independent identically distributed random variables. Suppose that 4 snails are selected at random, and we would like to find the probability that the total weight, T = X_1 + X_2 + X_3 + X_4, will exceed 28 gm. Since X_1, X_2, X_3, X_4 are iid with each X_i ∼ N(6, 4), we have T = Σ_{i=1}^4 X_i ∼ N(24, 16). Therefore,

    P(T > 28) = P((T − 24)/4 > (28 − 24)/4) = P(Z > 1)   where Z ∼ N(0, 1)
              = 1 − P(Z ≤ 1) = 1 − 0.8413 = 0.1587.

Once again, suppose that 4 snails are selected at random. Let X̄ = ¼(X_1 + X_2 + X_3 + X_4) denote the average weight of the 4 snails. Let's find the probability that the observed average weight deviates from the expected weight by more than 2 gm. That is, we want to find P(|X̄ − 6| > 2). Since X_1, ..., X_4 are iid with each X_i ∼ N(6, 4), we have X̄ = ¼ Σ_{i=1}^4 X_i ∼ N(6, 1). Therefore,

    P(|X̄ − 6| > 2) = 1 − P(|X̄ − 6| ≤ 2)
                    = 1 − P(−2 ≤ X̄ − 6 ≤ 2)
                    = 1 − P(−2/1 ≤ (X̄ − 6)/1 ≤ 2/1)
                    = 1 − P(−2 ≤ Z ≤ 2)   where Z ∼ N(0, 1)
                    = 1 − [P(Z ≤ 2) − P(Z ≤ −2)]
                    = 1 − [P(Z ≤ 2) − (1 − P(Z ≤ 2))]
                    = 2 − 2P(Z ≤ 2) = 2 − 2(0.9772) = 0.0456.
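Both snail probabilities are easy to confirm with a statistics library. The following sketch (illustration only, not part of the notes; it assumes scipy is available) evaluates the same two normal tail probabilities.

```python
# Checking the two snail-weight probabilities with scipy.stats.
from scipy.stats import norm

# Total weight T ~ N(24, 16), i.e., standard deviation 4.
print(norm.sf(28, loc=24, scale=4))   # P(T > 28), about 0.1587

# Sample mean Xbar ~ N(6, 1).
print(2 * norm.sf(2))                 # P(|Xbar - 6| > 2), about 0.0456
```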

The distribution of the sample mean

Let X_1, X_2, ..., X_n be a random sample from a N(µ, σ²) distribution. Let X̄ = (1/n) Σ_{i=1}^n X_i. Remember that X̄, the sample mean, is a random variable. Then

    X̄ ∼ N(µ, σ²/n).

The Central Limit Theorem

Theorem 7.11. (Central Limit Theorem) If

1. X_1, X_2, ..., X_n is a random sample with E(X_i) = µ and Var(X_i) = σ² for all i, and
2. S_n = X_1 + X_2 + ⋯ + X_n, and
3. Z_n = (S_n − nµ)/(σ√n),

then

    lim_{n→∞} P(Z_n ≤ z) = Φ(z).

Proof. We will get the result by showing that the moment generating function for Z_n must approach the moment generating function of a N(0, 1) random variable. First note that E(S_n) = nµ and Var(S_n) = nσ². Then

    M_{Z_n}(t) = M_{(S_n − nµ)/(σ√n)}(t) = M_{S_n/(σ√n) − √n µ/σ}(t).

Since M_{aY+b}(t) = e^{tb} M_Y(at) for any random variable Y and any constants a and b, we get

    (1)   M_{Z_n}(t) = e^{−µ√n t/σ} M_{S_n}(t/(σ√n)).

Since each X_i has the same moment generating function for all i, we can let M(t) denote the common moment generating function of each X_1, X_2, ..., X_n. Therefore,

    M_{S_n}(t) = [M(t)]^n

and

    M_{Z_n}(t) = e^{−µ√n t/σ} [M(t/(σ√n))]^n.

We are going to need the following two facts:

(A)  M(t) = 1 + Σ_{k=1}^∞ (E(X^k)/k!) t^k

(B)  ln(1 + ξ) = Σ_{j=1}^∞ ((−1)^{j−1}/j) ξ^j   for |ξ| < 1.

Now, take the natural log of both sides of equation (1) and use fact (A):

    ln M_{Z_n}(t) = −µ√n t/σ + n ln M(t/(σ√n))
                  = −µ√n t/σ + n ln[1 + Σ_{k=1}^∞ (E(X^k)/k!)(t/(σ√n))^k].

Note that the natural log term looks like fact (B) with

    ξ = Σ_{k=1}^∞ (E(X^k)/k!)(t/(σ√n))^k.

As n → ∞, we can make this summation as small as we want. Hence, for suitably large n, the summation (i.e., ξ) will eventually be less than one, permitting us to use fact (B). Therefore,

    ln M_{Z_n}(t) = −µ√n t/σ + n Σ_{j=1}^∞ ((−1)^{j−1}/j) [Σ_{k=1}^∞ (E(X^k)/k!)(t/(σ√n))^k]^j
                  = −µ√n t/σ + n { Σ_{k=1}^∞ (E(X^k)/k!)(t/(σ√n))^k − ½[Σ_{k=1}^∞ (E(X^k)/k!)(t/(σ√n))^k]² + ⋯ }.

Collect similar powers of t to obtain

    ln M_{Z_n}(t) = −µ√n t/σ + n [ (E(X)/(σ√n)) t + ( E(X²)/(2!σ²n) − E(X)²/(2σ²n) ) t² + ⋯ ]
                  = ((−µ + E(X))√n/σ) t + ( (E(X²) − E(X)²)/(2σ²) ) t² + ⋯.

But E(X) = µ and E(X²) − E(X)² = σ². Hence,

    ln M_{Z_n}(t) = 0·t + ½t² + [other terms].

But the "[other terms]" for t^n with n ≥ 3 go to zero as n → ∞, because there is a power of n in the denominator of each of the "[other terms]" (expand the series further if you need convincing). Therefore,

    lim_{n→∞} ln M_{Z_n}(t) = t²/2   and so   lim_{n→∞} M_{Z_n}(t) = e^{t²/2},

which is the moment generating function for N(0, 1). So lim_{n→∞} P(Z_n ≤ a) = Φ(a).

The DeMoivre-Laplace Theorem

A proof of the Central Limit Theorem was provided by Liapounoff in 1902. But DeMoivre (1733) and Laplace (1812) proved a similar result for the following special case: Let T_1, T_2, ..., T_n be a random sample from a Bernoulli distribution, that is, let

    P(T_i = 1) = p
    P(T_i = 0) = 1 − p.

Then E(T_i) = p and Var(T_i) = p(1 − p). We already know that S_n = Σ_{i=1}^n T_i has a Binomial distribution with parameters n and p, that is,

    P(S_n = k) = C(n,k) p^k (1 − p)^{n−k}   for k = 0, 1, ..., n.

Hence, because of the Central Limit Theorem, we get the following:

Theorem 7.12. (DeMoivre-Laplace Theorem) As n gets large (i.e., as the number of Bernoulli trials increases) the distribution of

    Z_n = (S_n − np)/√(np(1 − p))

approaches the distribution of a N(0, 1) random variable.

Of course, DeMoivre and Laplace didn't know about the Central Limit Theorem and had to prove this in a different way.

Another (more casual) way to state the DeMoivre-Laplace Theorem is to say that as n → ∞,

the distribution of the random variable S_n approaches the distribution of a N(np, np(1 − p)) random variable. In the limit, this is not very useful (and not precise), because as n → ∞ both the mean np and variance np(1 − p) go to ∞. But for finite n, we get what is often called the "normal approximation to the binomial," as follows:

Corollary 7.13. Suppose n is an integer and 0 < p < 1. Then for "large" values of n,

    C(n,k) p^k (1 − p)^{n−k} ≈ (1/√(2πnp(1−p))) e^{−(k−np)²/(2np(1−p))}.

This approximation usually works well when np(1 − p) > 9.

The distribution of the sample mean for large n

Let X_1, X_2, ..., X_n be a random sample from (just about any) distribution with E(X_i) = µ and Var(X_i) = σ². And recall that the sample mean is defined as

    X̄ = (1/n) Σ_{i=1}^n X_i.

Simply using the rules for expected value and variance, we get

    E(X̄) = µ
    Var(X̄) = σ²/n.

But, furthermore, since X̄ is composed of a sum of independent random variables, we can use the Central Limit Theorem to say that the probability distribution of X̄ approaches a N(µ, σ²/n) distribution. Combining this fact with the case discussed on page 211, we can summarize both results as follows:

• If X_1, X_2, ..., X_n is a random sample from a normal distribution with E(X_i) = µ and Var(X_i) = σ², then the probability distribution of the sample mean X̄ = (1/n) Σ_{i=1}^n X_i is N(µ, σ²/n).

• If X_1, X_2, ..., X_n is a random sample with E(X_i) = µ and Var(X_i) = σ² but the distribution of the X_i is not normal, then for "large" values of n, we can still say that the probability distribution of the sample mean X̄ = (1/n) Σ_{i=1}^n X_i is approximately N(µ, σ²/n).
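Corollary 7.13 is easy to see numerically. The sketch below (illustration only, not part of the notes; the choice n = 100, p = 0.4, k = 45 is arbitrary) compares an exact binomial probability against the normal approximation when np(1 − p) > 9.

```python
# Normal approximation to the binomial (Corollary 7.13), checked numerically.
import math
from scipy.stats import binom

n, p, k = 100, 0.4, 45
v = n * p * (1 - p)                  # 24 > 9, so the rule of thumb applies
exact = binom.pmf(k, n, p)
approx = math.exp(-(k - n * p) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
print(exact, approx)                 # the two values agree closely
```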

Self-Test Exercises for Chapter 7

For each of the following multiple-choice questions, choose the best response among those provided. Answers can be found in Appendix B.

S7.1 Let X be a random variable with probability generating function

    g(z) = (z + z⁵ + z⁶)/3.

The value of E(X) equals (A) 0 (B) 1/3 (C) 1 (D) 4 (E) none of the above.

S7.2 Suppose X ∼ N(2, 9) and Y ∼ N(2, 16) are independent random variables. Then P(X + Y < 9) equals (approximately) (A) 0.1587 (B) 0.5199 (C) 0.8413 (D) 0.9772 (E) none of the above.

S7.3 Suppose X is a random variable with moment generating function M_X(t) = E(e^{tX}) = e^{0.5t²}. Let X_1 and X_2 be two independent random variables, each with the same probability distribution as X. Then the moment generating function for Y = X_1 + X_2 is (A) e^{0.5t²} (B) 2e^{0.5t²} (C) e^{t²}

(D) e^{0.5t⁴} (E) none of the above.

S7.4 An automated assembly line produces incandescent light bulbs. Let T_1, T_2, ..., T_n be n independent identically distributed random variables with {T_i = 1} if light bulb i is good, and {T_i = 0} if light bulb i is bad. If p is the probability that any one light bulb is good, then each T_i has probability mass function

    p_{T_i}(k) = 1 − p if k = 0; p if k = 1

where 0 < p < 1. Suppose that a batch of n bulbs are selected at random. Then S = Σ_{i=1}^n T_i is the total number of good light bulbs in the batch. Then P(S ≥ 1) equals (A) 1 − p^n (B) 1 − (1 − p)^n (C) ∫_1^∞ (1/√2π) e^{−x²/2} dx (D) 1 (E) none of the above.

S7.5 If X and Y are independent, normally distributed random variables each with mean 0 and variance 1, then 3X − Y (A) has a normal distribution with mean 2 and variance 1 (B) has a normal distribution with mean 0 and variance 10 (C) has a normal distribution with mean 0 and variance 2 (D) has a normal distribution with mean 2 and variance 4 (E) none of the above.

S7.6 Let X and Y be independent random variables with X ∼ N(0, 1) and Y ∼ N(0, 2). The value of P(X > Y) is (A) 0 (B) 0.05 (C) 0.50

(D) 0.95 (E) none of the above.

S7.7 Let X be a random variable with probability generating function

    g(z) = z(1 + z)/2.

The value of P(X = 0) is (A) 0 (B) 1/4 (C) 1/2 (D) 1 (E) none of the above.

S7.8 Let X_1, X_2, X_3, X_4 be independent random variables, each with E(X_i) = 0 and Var(X_i) = 1 for i = 1, 2, 3, 4. Let Y = X_1 + X_2 + X_3 + X_4. Then Var(Y) equals (A) 1/16 (B) 1/4 (C) 4 (D) 16 (E) none of the above.

S7.9 Let X and Y be independent, normally distributed random variables each with mean 0 and variance 1. Then the random variables

    W_1 = X + Y
    W_2 = X − Y

(A) have the same mean (B) have the same variance (C) are both normally distributed

(D) all of the above (E) none of the above.

S7.10 Let X be a random variable with moment generating function

    M_X(t) = e^t(1 + e^{2t})/2.

The value of E(X) is (A) 0 (B) 1 (C) 2 (D) 3 (E) none of the above.

S7.11 Let X be a random variable with moment generating function M_X(t) = (1 − t)^{−1}. Let Y be a random variable with moment generating function M_Y(s) = (1 − s)^{−1}. If X and Y are independent, then X + Y + 1 has a moment generating function equal to (A) e^t(1 − t)^{−2} (B) (1 − t)^{−2} (C) 2e^t(1 − t)^{−1} (D) e^s(1 − t)(1 − s) (E) none of the above.

S7.12 Jack is attempting to fill a 100 gallon water tank using a one gallon bucket. Jack repeatedly goes to a well at the top of a hill, fills his one gallon bucket with water and brings that water down to the 100 gallon tank. The amount of water he carries on any one trip from the well to the tank is normally distributed with mean 1 gallon and standard deviation 0.1 gallon. The amount of water carried on any trip is independent of all other trips. The probability that Jack will need to make more than 100 trips to the well is (A) 0.05 (B) 0.0636

(C) 0.6915 (D) 1.0 (E) none of the above.

S7.13 Let X and Y be independent random variables with X ∼ N(0, 1) and Y ∼ N(0, 2). The value of P(X − Y < 1) equals (A) 0 (B) 0.5 (C) 1 (D) P(X + Y < 1) (E) none of the above.

S7.14 Let X and Y be independent random variables with X ∼ N(1, 2) and Y ∼ N(1, 2). Then P(|X − Y| ≤ 1) equals (A) 0 (B) 0.3830 (C) 0.2580 (D) 0.5 (E) none of the above.

S7.15 Let X be a discrete random variable with probability mass function given by

    P(X = 0) = 1/3
    P(X = 1) = 2/3

Then the moment generating function for X is (A) e^{2t/3} (B) (1/3)(1 + 2e^t) (C) (1/2)(1 + e^t) (D) (1/2)e^t(1 + e^{2t}) (E) none of the above.

S7.16 Suppose X_1, X_2, ..., X_n are independent random variables with each X_i ∼ N(2, 4) for i = 1, 2, ..., n. Let W = Σ_{i=1}^n X_i and define the random variable

    Z = (W − 2n)/c.

Then Z ∼ N(0, 1) if and only if (A) c = 1 (B) c = n (C) c = 2√n (D) c = 2/√n (E) none of the above.

S7.17 Let X be a random variable with moment generating function

    M_X(t) = e^t(1 + e^t)/2.

The value of E(X) is (A) 0 (B) 2/3 (C) 1 (D) 3/2 (E) none of the above.

S7.18 Let X_1, X_2, ..., X_n be independent, identically distributed random variables with each X_i having a probability mass function given by

    P(X_i = 0) = 1 − p
    P(X_i = 1) = p

where 0 ≤ p ≤ 1. Define the random variable Y = (1/n) Σ_{i=1}^n X_i. Then E(Y) equals (A) 0

(B) p (C) p² (D) 1 (E) none of the above.

Questions for Chapter 7

7.1 The probability generating function for a random variable X is given by

    g(z) = (z¹⁰ + z²⁰)/2.

(a) Find P(X > 18 | X > 13).
(b) Find E(X).
(c) Find E(|X − 15|).
(d) Find the moment generating function for X.

7.2 A random variable has the moment generating function

    M(t) = 2/(2 − t).

What is the expected value for this random variable?

7.3 The moment generating function for a random variable X is given by

    M(t) = (0.5)e^{26t} + (0.25)e^{27t} + (0.25)e^{49t}.

(a) Compute P(X ≤ 30).
(b) Find E(X) and E(X² + 1).

7.4 Let X be a random variable with nonnegative, integer support and moment generating function

    M(t) = p/(1 − e^t(1 − p)).

What is the value of P(X = 0)?

7.5 Suppose that X, Y, and Z are independent random variables with X ∼ N(1, 4), Y ∼ N(2, 9) and Z ∼ N(3, 16).

(a) Compute P(X + Y ≤ 1).
(b) Find the number a such that P(3 − 4a < Z < 3 + 4a) = 0.95.
(c) Compute P[(X + Y + Z)² ≤ 25].
(d) Compute P[(3X − Z)² > 52].

7.6 Let X_1, X_2, ..., X_n be independent and identically distributed random variables, each with E(X_i) = 0 and Var(X_i) = 1 for all i = 1, 2, ..., n.
(a) Using Chebychev's inequality, find a number a (that will depend on n), such that P(|X̄_n| ≤ a) ≥ 0.95, where X̄_n denotes the sample mean of X_1, X_2, ..., X_n.
(b) Solve question (a) again using the additional information that each of the X_i is normally distributed. Hint: Don't use Chebychev's inequality. You should be able to get this answer knowing the exact distribution of X̄_n.
(c) Compare the answers to (a) and (b). Explain the difference in your answers.

7.7 A large forklift has a 0.5 ton (1000 pound) capacity. Cartons of items have individual weights that are normally distributed with mean 25 pounds and standard deviation 2 pounds. These weights can be assumed to be independent and identically distributed. Approximately how many cartons can the forklift carry so that the probability that its total load exceeds its capacity is only 0.01?

7.8 A fair coin is tossed 5000 times. Approximately what is the probability that at least 1965 of the tosses will turn up heads?

7.9 A one-pound box of breakfast cereal never contains exactly one pound of product. The net weight (in pounds) is normally distributed with variance 0.01 and mean µ that can be varied by adjusting the packaging equipment.
(a) What value should µ be adjusted to so that the producer can guarantee that the probability that the net weight of a box is less than one pound is only 0.05?

(b) What is the probability that a carton of 16 cereal boxes produced according to the specifications determined in part (a) will not exceed a total net weight of 19.28 pounds?

7.10 The distribution of test scores on an examination is approximately normal with mean 70 and standard deviation 7.
(a) What is the probability of receiving a grade less than 80?
(b) If 49 students take the test, what is the probability that the average of their scores lies between 68 and 72?
(c) What is the probability that the difference in the scores of two students (picked at random) lies between −5 and +5?

7.11 True or False: If X and Y are normally distributed random variables with mean 0 and variance 1, then 3X − Y has a normal distribution with mean 0 and variance 10.

7.12 The Phizzi Soft Drink Company's best selling product is Phizzi-Soda that it sells in 12-ounce bottles. When the automated filling machine is working properly, the quantity of Phizzi-Soda in a single bottle is a random variable X. Assume that X is normally distributed with mean 12.1 ounces and variance 0.0025 (ounces)².
(a) Let X_1 denote the fill-volume of a single bottle of Phizzi-Soda. Find P(X_1 < 12).
(b) Suppose that two bottles of Phizzi-Soda are chosen at random. Let X_1 and X_2 be independent random variables denoting the fill-volume of each of the two bottles. Find P(X_1 > X_2). Hint: P(X_1 > X_2) = P(X_1 − X_2 > 0).
(c) Let X̄ = (X_1 + X_2)/2 be a random variable representing the average fill-volume of the two bottles selected in question (b). Find P(X̄ < 12).

7.13 Let X be a continuous random variable with probability density function given by

    f_X(x) = 1/θ for 0 ≤ x ≤ θ; 0 otherwise

where θ is a constant with θ > 0. Your answers to the following questions might depend on the constant θ.
(a) Find Var(X).
(b) Find the moment generating function for X.

7.14 Individual packages of writing paper have weights that are normally distributed, each with mean 2 kgm and variance 0.25 (kgm)². The weights of all packages can be considered independent.
(a) Suppose that one package of paper is weighed. What is the probability that its weight is greater than 2.5 kgm?
(b) Suppose that ten packages of paper are weighed together. What is the probability that their total weight is greater than 25 kgm?
(c) Suppose that ten packages of paper are weighed together. What is the probability that their average weight is greater than 2.1 kgm?

7.15 Prove that if g_X(z) ≡ E(z^X) is a probability generating function for X, then lim_{z→1} g_X(z) = 1.

7.16 Let X and Y be independent, normally distributed random variables with X ∼ N(1, 9) and Y ∼ N(6, 16).
(a) Find the number a having the property that P(Y > a) = 0.975.
(b) Find P(X > Y).
(c) Compute P(|X − 1| < 4).
(d) Compute P(0 ≤ X < 2).

7.17 Referring to the previous question, show that if Y_1, Y_2, ..., Y_n are independent log-normal random variables, then

    W = Π_{i=1}^n Y_i

is a log-normal random variable.

7.18 At one stage in the assembly of an article, a cylindrical rod with a circular cross-section has to fit into a circular socket. Quality control measurements show that the distribution of rod diameter and socket diameter are independent and normal with parameters:

    Rod diameter:    µ = 5.01 cm, σ = 0.03 cm.
    Socket diameter: µ = 5.10 cm, σ = 0.04 cm.

If the components are selected at random for the assembly, what proportion of rods will not fit?

7.19 Bill Gutts, one of the wealthiest men on earth, wants to fill one of the swimming pools at his mountain-top home. The elevation of the pool (and his home) is very high, so that it is impossible to pump water up the mountain. For that reason, Mr. Gutts plans to transport the water to the mountain-top pool using a fleet of 200 helicopters, with each helicopter carrying a small water-tank. The amount of water transported varies from helicopter to helicopter. But it is well-known that the amount of water (in gallons) carried by a single helicopter can be represented by a random variable that is normally distributed with mean 1000 and variance 1600. The amount of water carried by the 200 individual helicopters can be assumed to be a collection of independent and identically distributed random variables X_1, X_2, ..., X_n with n = 200. Let Y = Σ_{i=1}^n X_i be a random variable denoting the total water carried by all of the n = 200 helicopters. The capacity of the swimming pool is 198869 gallons.
(a) Find E(Y) and Var(Y).
(b) What is the probability that Mr. Gutts can fill his swimming pool using the 200 helicopters?

(c) Helicopters are very expensive, even for a wealthy guy like Mr. Gutts. Mr. Gutts would be happy if the probability that his swimming pool is completely filled is only 0.50. How many helicopters should he use?
(d) In an E-mail message, Mr. Gutts is given the bad news that the amount of water carried by each helicopter does not have a N(1000, 1600) distribution. Instead it has a uniform distribution on the interval (930.718, 1069.282). Use the Central Limit Theorem to find the approximate probability that Mr. Gutts can fill his swimming pool if he uses 200 helicopters.

7.20 (†) Suppose X and Y are independent random variables. The continuous random variable X has a uniform distribution on the interval (0, 1). That is, X has density function

    f_X(x) = 1 if 0 < x < 1; 0 otherwise.

The discrete random variable Y is a Bernoulli random variable with p = 1/2. That is,

    P(Y = k) = 1/2 if k = 0; 1/2 if k = 1.

Find the probability distribution of the random variable W = X + Y.

8 ESTIMATION

Good and bad luck are often mistaken for good and bad judgment.
 - Anonymous, Salada™ tea bag tag

Basic Concepts

Random samples

Definition 8.1. A random sample is any sequence, X_1, X_2, ..., X_n, of independent, identically distributed random variables.

If a continuous random variable X has an unknown parameter θ, we often write its probability density function as f_X(x, θ).

Example: Let X have an exponential distribution with probability density function

    f_X(x, θ) = (1/θ) e^{−x/θ} if x > 0; 0 otherwise

where θ is unknown. Then a random sample, X_1, X_2, ..., X_n, representing n independent observations from this random variable has a joint probability density function given by

    f_{X_1,X_2,...,X_n}(x_1, x_2, ..., x_n, θ) = (1/θ^n) e^{−(x_1 + x_2 + ⋯ + x_n)/θ} if x_i > 0 for all i; 0 otherwise.

Statistics

Definition 8.2. A statistic is any function, h(X_1, X_2, ..., X_n), of a random sample X_1, X_2, ..., X_n.

Note: Any function of a random sample can be called a statistic. You can think of the random variables X_1, X_2, ..., X_n as representing the future data values that will be obtained by making n independent observations of our experiment.

Example: Let X_1, X_2, ..., X_n be a random sample. Then the following are statistics:

1. θ̂(X_1, X_2, ..., X_n) = (1/n) Σ_{i=1}^n X_i. The random variable θ̂ is the average of the observations.
2. θ̂(X_1, X_2, ..., X_n) = Π_{i=1}^n X_i. The random variable θ̂ is the product of the observations.
3. θ̂(X_1, X_2, ..., X_n) = max(X_1, X_2, ..., X_n). The random variable θ̂ is the largest of the observations.
4. θ̂(X_1, X_2, ..., X_n) = X_1. The random variable θ̂ is the first observation, with the others discarded.

There are two important statistics that arise so frequently that they are given special symbols:

Definition 8.3. Let X_1, X_2, ..., X_n be a random sample. The sample mean is the statistic

    X̄ ≡ (1/n) Σ_{i=1}^n X_i

and the sample variance is the statistic

    s²_X ≡ (1/(n−1)) Σ_{i=1}^n (X_i − X̄)².
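As a small illustration of Definition 8.3 (not part of the original notes; it assumes numpy is available and uses made-up observations), note that the sample variance uses the n − 1 divisor:

```python
# Sample mean and sample variance of a small data set.
import numpy as np

x = np.array([4.1, 5.3, 3.8, 6.0, 5.1])   # hypothetical observations
xbar = x.mean()
s2 = x.var(ddof=1)    # ddof=1 gives the (n-1)-divisor sample variance
print(xbar, s2)       # same as ((x - xbar)**2).sum() / (len(x) - 1)
```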

Point Estimation

The problem

Let X be a random variable with cumulative distribution function F_X(x, θ) that is completely determined except for the unknown constant θ. A random sample X_1, X_2, ..., X_n is to be drawn from this distribution. Let x_1, x_2, ..., x_n denote the actual observed values taken on by X_1, X_2, ..., X_n. Suppose that a statistic θ̂(X_1, X_2, ..., X_n) has already been selected for use. Applying the function θ̂ to these observations yields the number θ̂(x_1, x_2, ..., x_n). This number can be thought of as an actual observation of the random variable θ̂(X_1, X_2, ..., X_n). We will call the number θ̂(x_1, x_2, ..., x_n) our estimate for the value of θ. We call the random variable θ̂(X_1, X_2, ..., X_n) an estimator for θ. Since any statistic can be called an estimator of any parameter, we will have to find ways to distinguish good estimators from poor estimators.

Unbiased estimators

Definition 8.4. The estimator θ̂ is an unbiased estimator of θ if its expected value equals θ, i.e.,

    E(θ̂) = θ for all θ.

Example: Let X be a discrete random variable with probability mass function given by p_X(0) = 1 − θ and p_X(1) = θ. If X_1, X_2, ..., X_n is a random sample from X then

    θ̂(X_1, X_2, ..., X_n) = (1/n) Σ_{i=1}^n X_i

is an unbiased estimator for θ.

Finding Estimators

Method of maximum likelihood

Definition 8.5. Let X_1, X_2, ..., X_n be a random sample from a discrete random variable X with probability mass function p_X(x, θ), where θ is unknown. The likelihood function for X_1, X_2, ..., X_n is given by

    L(x_1, x_2, ..., x_n, θ) = Π_{i=1}^n p_X(x_i, θ).

If, instead, X is a continuous random variable with probability density function f_X(x, θ), then the likelihood function is given by

    L(x_1, x_2, ..., x_n, θ) = Π_{i=1}^n f_X(x_i, θ).

Maximum likelihood estimators are obtained by finding that value of θ that maximizes L for a given set of observations x_1, x_2, ..., x_n. Since the value of θ that does this will usually vary with x_1, x_2, ..., x_n, θ̂ can be thought of as a function of x_1, x_2, ..., x_n, namely θ̂(x_1, x_2, ..., x_n).

To evaluate the properties of θ̂, we can look at its performance prior to actually making the observations x_1, x_2, ..., x_n. That is, we can substitute X_i for x_i in the specification for θ̂ and look at its properties as the statistic θ̂(X_1, X_2, ..., X_n). For example, one of the properties that we might like to check for is whether θ̂ is an unbiased estimator for θ (i.e., check to see if E(θ̂) = θ).
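For the Bernoulli model in the earlier example, the maximum likelihood estimate is the sample mean. The sketch below (an illustration, not part of the original notes; it assumes numpy and scipy are available and uses made-up 0/1 observations) maximizes the log-likelihood numerically and compares the result with that closed form.

```python
# Maximum likelihood for the Bernoulli parameter theta, two ways.
import numpy as np
from scipy.optimize import minimize_scalar

x = np.array([1, 0, 1, 1, 0, 1, 0, 1])    # hypothetical 0/1 observations

def neg_log_lik(theta):
    # negative log of L = prod theta^{x_i} (1-theta)^{1-x_i}
    return -np.sum(x * np.log(theta) + (1 - x) * np.log(1 - theta))

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method='bounded')
print(res.x, x.mean())                     # both are about 0.625
```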
Self-Test Exercises for Chapter 8

For the following multiple-choice question, choose the best response among those provided. The answer can be found in Appendix B.

S8.1 Suppose that X_1, X_2, ..., X_n are independent identically distributed random variables, each with marginal probability density function

    f_{X_i}(x) = (1/(σ√2π)) e^{−½((x−µ)/σ)²}   for −∞ < x < +∞

where σ > 0. Then an unbiased estimator for µ is (A) (X_1)(X_2)⋯(X_n) (B) (X_1 + X_2)²/2 (C) (1/n) Σ_{i=1}^n X_i
(D) σ (E) none of the above.

Questions for Chapter 8

8.1 Let X be a random variable with a binomial distribution, i.e.,

    p_X(k, θ) = C(n,k) θ^k (1 − θ)^{n−k}   for k = 0, 1, ..., n.

Let X_1 be a random sample of size 1 from X.
(a) Show that θ̂ = X_1/n is an unbiased estimator for θ.
(b) Show that θ̂ = X_1/n is the maximum likelihood estimator for θ.

8.2 Let Y be an estimator for θ based on the random sample X_1, X_2, ..., X_n. Suppose that E(X_i) = θ and Y = Σ_{i=1}^n a_i X_i where a_1, a_2, ..., a_n are constants. What constraint must be placed on a_1, a_2, ..., a_n in order for Y to be an unbiased estimator for θ?

8.3 The life of a light bulb is a random variable X which has probability density function

    f_X(x, θ) = (1/θ) e^{−x/θ} if x > 0; 0 otherwise.

Let X_1, X_2, ..., X_n be a random sample from X.
(a) Find an estimator for θ using the method of maximum likelihood.
(b) Is the estimator for part (a) unbiased? Justify your answer.
(c) Find the maximum likelihood estimator for η = 1/θ.

8.4 The number of typographical errors, X, on a page of text has a Poisson distribution with parameter λ, i.e.,

    p_X(k, λ) = λ^k e^{−λ} / k!   for k = 0, 1, 2, ....

A random sample of n pages is observed.

(a) Find an estimator for λ using the method of maximum likelihood.
(b) Is the estimator for part (a) unbiased? Justify your answer.

8.5 Given the random sample X_1, X_2, ..., X_n, consider the statistic d²_X formed by averaging the squared differences of all possible pairings of {X_i, X_j}. There are C(n,2) such pairs. That statistic can be represented as

    d²_X ≡ (1/C(n,2)) Σ_{i>j} (X_i − X_j)².

Prove that d²_X = 2s²_X.

9 PROBABILITY MODELS

Coin Tossing (Binary) Models

Definition 9.1. A Bernoulli trial is an experiment with only two outcomes. These outcomes are conventionally called success and failure.

Definition 9.2. A Bernoulli random variable, T, is a random variable with a probability mass function given by

    p_T(0) = 1 − p
    p_T(1) = p

Definition 9.3. Let T_1, T_2, ..., T_n be a sequence of n independent and identically distributed Bernoulli random variables, with P(T_i = 1) = p for all i. Let X = Σ_{i=1}^n T_i. Then X is said to have a binomial distribution.

Note: X denotes the total number of successes from n independent and identically distributed Bernoulli trials.

Theorem 9.1. If X is a binomial random variable with parameters n and p, then

    p_X(k) = C(n,k) p^k (1 − p)^{n−k}   k = 0, 1, 2, ..., n.

The binomial distribution

Support: 0, 1, 2, ..., n
Probability mass function: p_X(k) = C(n,k) p^k (1 − p)^{n−k}, k = 0, 1, 2, ..., n

.2)3 (0. 2. . What is the probability that it breaks down exactly 3 days out of a week? Solution: Let X denote the number of breakdowns. The geometric distribution Support: Probability mass function: Probability generating function: Moment generating function: Moments: Note X denotes the number of independent and identically distributed Bernoulli trials required until the first success (counting the successful trial).234 Coin Tossing (Binary) Models g(z) = (zp + (1 − p))n MX (t) = (et p + (1 − p))n E(X) = np Probability generating function: Moment generating function: Moments: Notation Var(X) = np(1 − p) If X has a binomial distribution with parameters n and p. 1. .287 3 2. 1. p).2)4 = 0. 1 zp for |z| < g(z) = 1 − z(1 − p) 1−p tp e MX (t) = 1 − et (1 − p) E(X) = 1/p Var(X) = (1 − p)/p2 Example: A computer breaks down at least once a day with probability 0.8). P (X = 3) = 7 (0. What is the probability that it is operational for 3 days in a row before a breakdown? Solution: Let Y denote the number of days until a breakdown. P (Y = 4) = (0. .8) = 0. 0.8)3 (0. we often write X ∼ B(n.2. pX (k) = p(1 − p)k−1 k = 1. . Hence. It therefore remains operational for an entire day with probability 0.8. X ∼ B(7.0064 . . 2.

The Poisson distribution

Support: 0, 1, 2, ...
Probability mass function: p_X(k) = e^{−α} α^k / k!, k = 0, 1, 2, ...
Probability generating function: g(z) = e^{α(z−1)} for |z| < 1
Moment generating function: M_X(t) = e^{α(e^t − 1)}
Moments: E(X) = α, Var(X) = α

Theorem 9.2. If X is a binomial random variable with parameters n and p, then for "large" n and "small" p,

    P(X = k) = C(n,k) p^k (1 − p)^{n−k} ≈ e^{−np} (np)^k / k!.

Random Particle Models

The expected number of particles per interval of length one is equal to α. Suppose the interval is divided into n equal segments with at most one particle placed in any segment. What is the probability that there are exactly k particles in the unit interval?

Let X denote the number of particles in the unit interval. Therefore, X is a binomial random variable with parameters n and p = α/n, and

    P(X = k) = C(n,k) (α/n)^k (1 − α/n)^{n−k}

    = (α^k / k!) [n(n−1)⋯(n−k+1)/n^k] (1 − α/n)^n (1 − α/n)^{−k}

for k = 0, 1, 2, ..., n. Taking the limit as n gets large (the segment widths decrease) we obtain

    lim_{n→∞} P(X = k) = (1)(1)⋯(1) e^{−α} α^k / k! = (α^k / k!) e^{−α}   for k = 0, 1, 2, ....

This limiting particle process is called the Poisson Process with

    α = expected number of occurrences per unit interval
    X = number of occurrences in a unit interval

and the probability mass function for X is given by

    p_X(k) = α^k e^{−α} / k!   k = 0, 1, 2, ....

Let Y_i denote the interarrival time between occurrences i − 1 and i (see Figure 9.1). Each of the Y_i are independent and identically distributed random variables with probability density function given by

    f_{Y_i}(y) = αe^{−αy} if y ≥ 0; 0 otherwise.

Question: Suppose a person begins to observe the process at time t_0, chosen independently of the future of the process. What is the distribution of the waiting time to the next occurrence, W (see Figure 9.1)?

Theorem 9.3.

    f_W(w) = αe^{−αw} if w ≥ 0; 0 otherwise.
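The description above suggests a quick simulation check (not part of the original notes; it assumes numpy is available): exponential interarrivals at rate α should produce counts per unit interval whose mean and variance both equal α.

```python
# Simulating a Poisson process via exponential interarrival times.
import numpy as np

rng = np.random.default_rng(0)
alpha, n_units = 3.0, 100_000

# Enough interarrival draws to cover n_units time units with high probability.
arrivals = np.cumsum(rng.exponential(1 / alpha, size=int(alpha * n_units * 2)))
counts = np.histogram(arrivals[arrivals < n_units], bins=n_units,
                      range=(0, n_units))[0]
print(counts.mean(), counts.var())   # both should be close to alpha = 3
```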

Figure 9.1: Interarrival and waiting times

Self-Test Exercises for Chapter 9

For each of the following multiple-choice questions, choose the best response among those provided. Answers can be found in Appendix B.

S9.1 It costs the Cheapy Flashbulb Company c cents to manufacture a single flashbulb. Flashbulbs are packed n to a box. The company sells each bulb for r cents. The probability that any one bulb is good is p > 0, and it is independent of all other bulbs. If any of the bulbs in a box is bad, Cheapy will refund the purchase price of the entire box to the customer. What is the probability that every bulb in a box is good? (A) 0 (B) p (C) 1 − p (D) p^n (E) none of the above.

S9.2 Using the information in the previous question, let Y denote the profit from selling a box of Cheapy flashbulbs. Note that Y may be negative. The value of E(Y) is (A) 0

(B) n(rp − c) (C) nrp^n − nc (D) nrp^n + nc − 2ncp^n (E) none of the above.

S9.3 The number of knots in a piece of lumber is a random variable, X. Suppose that X has a Poisson distribution with E(X) = α > 0. If four independent pieces of lumber are examined, the probability that they all have no knots is (A) 1/4 (B) 4e^{−α} (C) e^{−4α} (D) α⁴e^{−4α} (E) none of the above.

Questions for Chapter 9

9.1 A complicated piece of machinery, when running properly, can bring a profit of c dollars per hour (c < 2) to a firm. However, this machine breaks down at unexpected and unpredictable times. The number of breakdowns during any period of length t hours is a random variable with a Poisson distribution with parameter t. If the machine breaks down x times during the t hours, the loss incurred (from the shutdown plus repair costs) equals (x² + x). Let P be the random variable representing the total profit from running the machine for t hours.
(a) Find E(P) as a function of t.
(b) Find the length of time the machine should be operated to maximize the expected profit.

9.2 The number of oil tankers, N, arriving at a refinery each day has a Poisson distribution with parameter α = 3. Present port facilities can service 2 tankers each day. If more than 2 tankers arrive in a day, the excess tankers must be sent to another port.
(a) On any given day, what is the probability of having to send tankers away?

(b) How much must the present facilities be increased to permit handling all tankers on approximately 80% of the days?
(c) What is the expected number of tankers arriving per day?
(d) What is the expected number of tankers serviced daily?
(e) What is the expected number of tankers turned away daily?
(f) If a fee of $10000 is collected from each tanker serviced, what is the expected total fees collected in one day?
(g) How much must the present facilities be increased to assure that the expected total fees collected in one day is at least $30000?

9.3 Suppose cars on a very long test track travel at exactly 50 miles per hour and pass a check point on the track according to a Poisson process at a rate of 5.2 cars/hour.
(a) What is the average distance between cars?
(b) What is the probability that the distance between any two cars is greater than 10 miles?

9.4 Suppose that a book of 620 pages contains 55 typographical errors. If these errors are randomly distributed throughout the book, what is the probability that 10 pages, selected at random, will be free of errors?

9.5 In forming binary numbers with 32 binary digits, the probability that an incorrect digit will appear is 0.002. If the errors are independent, what is the probability of finding zero, one, or more than one incorrect digit in a binary number? If a computer forms 10⁶ such 32-digit numbers per second, what is the probability that an incorrect number is formed during any one-second time period?

9.6 Let X_1, X_2, X_3, X_4, X_5 be independent and identically distributed Bernoulli random variables, each with P(X_i = 1) = p. If P(Σ_{i=1}^5 X_i = 5) = r, then what is the value of p in terms of r?

9.7 A chromosome can be considered a string of molecules, with a simple link connecting exactly two molecules. Suppose a section of a chromosome consists of 1000 such links (and 1001 molecules). In recombinant DNA experiments, some of these links are broken, thus breaking the section into smaller chains. Each link has a probability 0.005 of breaking, and each link breaks independently of the others.

(a) Let N be the number of links broken. What is the actual probability mass function for N?
(b) Compute the approximate probability that exactly 4 smaller chains result from the breakup.
(c) Consider the number of molecules on the smaller chains that result from the breakup. Approximately what proportion of these chains will contain at least 200 molecules?

9.8 A computer transmits 1000 words per minute. The probability that any one word has a parity error is 0.001. Parity errors are generated independently from one word to the next.
(a) Approximately what is the probability of generating at least 2 words with parity errors in a one minute period?
(b) An error cycle is the time between successive words with parity errors. Approximately what is the probability that an error cycle will be longer than two minutes?

(c) A shift change in computer personnel takes place at 8:00 am each day. What is the probability that the computer operator on the new shift will wait at least two minutes for the first parity error on that shift?
(d) What is the probability that no parity error occurs between 10:00 and 10:02 am?

9.9 The number of knots in a piece of lumber has a Poisson distribution with an average of 2 knots per piece. If a piece has any knots, it is considered unsatisfactory. The number of knots from piece to piece can be assumed to be independent.
(a) What is the probability that a piece of lumber is unsatisfactory?
(b) What is the expected number of unsatisfactory pieces in a load of 500 pieces?
(c) What is the probability that all of the 500 pieces are satisfactory?

9.10 Glass rods are manufactured by cutting a continuous glass rod leaving a production machine. Small imperfections appear in the continuous rod according to a Poisson process with an average of two imperfections per foot of glass rod. To produce "perfect" glass rods, the continuous rod is cut at each imperfection.
(a) What is the average length of the cut glass rods?
(b) What fraction of the rods will have a length greater than one foot?

9.11 (†) Suppose X_1 and X_2 are independent Binomial random variables with

    P(X_1 = k) = C(n,k) p_1^k (1 − p_1)^{n−k}   for k = 0, 1, ..., n
    P(X_2 = k) = C(n,k) p_2^k (1 − p_2)^{n−k}   for k = 0, 1, ..., n.

(a) Prove or give a counterexample: Suppose 0 < p_i < 1 for i = 1, 2. If p_1 = p_2 then Y = X_1 + X_2 has a Binomial distribution.

(b) Prove or give a counterexample: Suppose 0 < p_i < 1 for i = 1, 2. If p_1 ≠ p_2 then Y = X_1 + X_2 cannot have a Binomial distribution.
(c) Suppose 0 < p_i < 1 for i = 1, 2 and let Y = X_1 + X_2. Let W = Y − n. Find the limiting distribution for each of the following random variables: (i) Y as p_2 → 0; (ii) W as p_2 → 1.

9.12 (†) (from Meyer¹) In the manufacturing of glass bottles, small particles in the molten glass cause individual bottles to be defective. A defective bottle may have more than one particle. A single bottle requires 1 kg of molten glass. On average, there are x particles in a batch of 100 kg of molten glass. Assume that: (a) every particle may appear in any bottle with equal probability, and (b) the distribution of any particle among the bottles is independent of all other particles. Note that the molten glass required for 100 bottles contains, on average, x particles. If we wanted to compute the expected percentage of defective bottles, the quick (and wrong) answer would be x percent.

(a) Show that, for large M, the number of particles Z in a randomly chosen bottle from an M 100-kg batch of bottles has probability distribution

    P(Z = k) = C(n,k) (α/n)^k (1 − α/n)^{n−k}   for k = 0, 1, 2, ..., n

where n = xM and α = x/100.
(b) Show that as M → ∞,

    P(Z = k) ≈ e^{−α} α^k / k!   for k = 0, 1, 2, ....

(c) Show that the expected percentage of defective bottles equals

    d(x) = 100(1 − e^{−x/100}).

(d) Finally, show that for small x, we have d(x) ≈ x, but for larger x (say x = 100) the "quick" solution is very wrong.

¹ Meyer, P., Introductory Probability and Statistical Applications, Addison-Wesley, 1965.

9.13 The Youbee Light Bulb Company manufactures electric light bulbs. Each bulb has a probability of being defective equal to 0.1. Defects from one bulb to the next can be assumed to be independent random variables. Let the Bernoulli random variable T_i equal 1 if bulb i is defective and 0 if the bulb is non-defective. Therefore, for a box of 10 bulbs, T_1, T_2, ..., T_10 are independent, identically distributed random variables with

    P(T_i = 0) = 0.9
    P(T_i = 1) = 0.1

for all i = 1, 2, ..., 10.
(a) Compute E(T_i) and Var(T_i) for every i.
(b) Let X = T_1 + T_2 + ⋯ + T_10 denote the total number of defective bulbs in a box of 10 bulbs. Compute E(X) and Var(X).
(c) Find P(X = 0).

9.14 (†) Mr. Bonkers is selling his house. He decides he will accept the first offer exceeding c dollars, where c is a positive constant. Suppose the offers X_1, X_2, ... are independent identically distributed random variables (in dollars) each with cumulative distribution function F_X(x). Let N be the number of offers needed to sell the house. Let Y be the actual selling price of the house.
(a) Find the probability mass function for N. Clearly justify each step of your answer.
(b) Find E(N).
(c) Find the cumulative distribution function for Y, that is, find F_Y(y) = P(Y ≤ y).

9.15 (†) Pip's Pizza Shoppe sells pepperoni and plain pizza by the slice. There are four slices in each pizza. Since one fourth of the slices sold by Pip are pepperoni, every pizza that Pip prepares has pepperoni on only one of the

four slices. The remainder of the pizza is plain. Let the independent random variables X_1, X_2, ... represent the slices sold by Pip, with X_i = 1 if slice i is pepperoni, and X_i = 0 if it's plain.
(a) Suppose 20 customers visit Pip's Pizza Shoppe on a particular day. Let the random variable Y denote the number of pizzas that Pip must bake that day. Assume that Pip never throws away any slices of pizza during the day. What is the probability distribution for the random variable Y? Hint: There is only one pepperoni slice per pie.
(b) Suppose Pip has just taken a freshly baked pizza out of the oven and suppose that there are no other unsold slices. What is the probability that Pip must bake another pizza before selling all of the slices in the current pizza? Hint: Find the probability that he doesn't need to bake another pizza. Note that a sequence such as X_1 = 1, X_2 = 0, X_3 = 1, X_4 = 0, X_5 = 0, ... would require that Pip bake a second pizza before selling all of the slices in the first pizza.
(c) (††) What is the probability that Pip must bake a pizza to satisfy the request of the current customer? That is, the current customer wants pepperoni but Pip has no pepperoni slices available, or the current customer wants plain but Pip has no plain slices available.

9.16 (†) In a radio advertisement, a major oil company says that one fourth of all cars have at least one underinflated tire. Each car has 4 tires. Suppose that any tire becomes underinflated independently and with the same probability for each tire on every car.

(a) Suppose we select a car at random, and then select a tire on that car at random. What is the probability that the tire is underinflated?
(b) Suppose a parking lot contains 10 cars. We plan on testing the air pressure for every tire on every car. Let the random variable X represent the number of tires (out of the 40) that are underinflated. What is the probability distribution for X?


10 APPROXIMATIONS AND COMPUTATIONS

These numerical approximations are from a book by Abramowitz and Stegun.¹ This handbook is being published as a reference source on the World Wide Web at http://dlmf.nist.gov/.

Approximation for the Normal Integral

Let Z be a standard normal random variable; that is, the probability density function for Z is given by

    φ(z) = (1/√2π) e^{−½z²}   −∞ < z < ∞.

Let Q(z) ≡ P(Z > z), that is,

    Q(z) = ∫_z^∞ (1/√2π) e^{−½t²} dt.

For z ≥ 0 the following approximation can be used to compute Q(z):

    Q(z) = φ(z)(b_1 t + b_2 t² + b_3 t³ + b_4 t⁴ + b_5 t⁵) + ε(z)

where

    t = 1/(1 + rz)        r = 0.2316419
    b_1 = 0.31938153      b_4 = −1.821255978
    b_2 = −0.356563782    b_5 = 1.330274429
    b_3 = 1.781477937

With this approximation, |ε(z)| < 7.5 × 10⁻⁸.

¹ Abramowitz, M. and Stegun, I. (1970). Handbook of Mathematical Functions. Dover Publications, Inc., New York.
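The approximation translates directly into code. The following sketch (an illustration, not part of the original notes) implements Q(z) with the constants given above and compares it against an exact tail probability computed with the standard library's erfc:

```python
# Abramowitz-Stegun style approximation for Q(z) = P(Z > z), z >= 0.
import math

def Q(z):
    b = (0.31938153, -0.356563782, 1.781477937, -1.821255978, 1.330274429)
    t = 1.0 / (1.0 + 0.2316419 * z)
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return phi * sum(bk * t ** (k + 1) for k, bk in enumerate(b))

for z in (0.0, 1.0, 1.96, 3.0):
    exact = 0.5 * math.erfc(z / math.sqrt(2.0))   # exact upper tail
    print(z, Q(z), exact)                         # agreement within ~7.5e-8
```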

Approximation for the Inverse Normal Integral

Let Z be a standard normal random variable; that is, the probability density function for Z is given by

    φ(z) = (1/√2π) e^{−½z²}   −∞ < z < ∞.

Let Q = P(Z > z), that is,

    Q = ∫_z^∞ (1/√2π) e^{−½t²} dt.

The problem is, for a given 0 < Q < 0.5, find the corresponding z that satisfies the above integral equation. For a given 0 < Q < 0.5, the following approximation can be used to compute z:

    z = t − (c_0 + c_1 t + c_2 t²)/(1 + d_1 t + d_2 t² + d_3 t³) + ε(Q)

where

    t = √(ln(1/Q²))
    c_0 = 2.515517    d_1 = 1.432788
    c_1 = 0.802853    d_2 = 0.189269
    c_2 = 0.010328    d_3 = 0.001308

With this approximation, |ε(Q)| < 4.5 × 10⁻⁴.
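A direct implementation follows (illustration only, not part of the original notes; the sample tail areas are arbitrary choices):

```python
# Inverse of the normal tail: find z with P(Z > z) = Q, for 0 < Q < 0.5.
import math

def z_from_Q(Q):
    t = math.sqrt(math.log(1.0 / Q ** 2))
    num = 2.515517 + 0.802853 * t + 0.010328 * t ** 2
    den = 1.0 + 1.432788 * t + 0.189269 * t ** 2 + 0.001308 * t ** 3
    return t - num / den

for Q in (0.25, 0.05, 0.01):
    print(Q, z_from_Q(Q))   # about 0.6745, 1.6449, 2.3263 (to ~3-4 decimals)
```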
Questions for Chapter 10

10.1 Write a computer program to generate a table for the cumulative distribution function of the standard normal distribution for −3.00 ≤ z ≤ 3.00 in increments of 0.01.

10.2 Let Z ∼ N(0, 1). Write a computer program to find z when P(Z ≤ z) = 0.01, 0.05, 0.25, 0.50, 0.95 and 0.99.
Appendix A

TABLES

Table I: Standard Normal Cumulative Distribution Function
Table II: Some Common Probability Distributions

TABLE I: STANDARD NORMAL CUMULATIVE DISTRIBUTION FUNCTION

    \Phi(a) = \int_{-\infty}^{a} \phi(z)\, dz = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz

[Figure: the standard normal density \phi(z), with the area to the left of a shaded to illustrate \Phi(a).]

[Tabulated values of \Phi(a) for a = 0.00 through 3.09, with rows stepping by 0.1 and columns by 0.01; the cell alignment did not survive this copy, so the values are not reproduced here.]
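Since the tabulated values did not survive this copy, they are easy to regenerate. A minimal sketch in Python, using the standard library's error function and the identity \Phi(a) = \frac{1}{2}(1 + \mathrm{erf}(a/\sqrt{2})); the row-and-column layout mirrors the original table:

```python
import math

def Phi(a):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

# Column header: the second decimal digit 0.00 through 0.09.
print("  a  " + "  ".join("%4.2f" % (c / 100.0) for c in range(10)))

# Rows a = 0.0, 0.1, ..., 3.0, matching the layout of Table I.
for r in range(31):
    a = r / 10.0
    row = "  ".join("%.4f" % Phi(a + c / 100.0) for c in range(10))
    print("%.1f  %s" % (a, row))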

TABLE II: SOME COMMON PROBABILITY DISTRIBUTIONS

Bernoulli (discrete; parameter 0 ≤ p ≤ 1)
    p_X(k) = p^k (1-p)^{1-k},  k = 0, 1
    E(X) = p,  Var(X) = p(1-p),  mgf: e^t p + (1-p)

binomial (discrete; parameters n and 0 ≤ p ≤ 1)
    p_X(k) = \binom{n}{k} p^k (1-p)^{n-k},  k = 0, 1, ..., n
    E(X) = np,  Var(X) = np(1-p),  mgf: (e^t p + (1-p))^n

geometric (discrete; parameter 0 ≤ p ≤ 1)
    p_X(k) = (1-p)^{k-1} p,  k = 1, 2, ...
    E(X) = 1/p,  Var(X) = (1-p)/p^2,  mgf: \frac{p e^t}{1 - (1-p)e^t}

Poisson (discrete; parameter α > 0)
    p_X(k) = e^{-\alpha} \alpha^k / k!,  k = 0, 1, ...
    E(X) = α,  Var(X) = α,  mgf: e^{\alpha(e^t - 1)}

uniform (continuous; parameters a < b)
    f_X(x) = \frac{1}{b-a},  a ≤ x ≤ b
    E(X) = (a+b)/2,  Var(X) = (b-a)^2/12,  mgf: \frac{e^{bt} - e^{at}}{(b-a)t}

exponential (continuous; parameter α > 0)
    f_X(x) = \alpha e^{-\alpha x},  x ≥ 0
    E(X) = 1/α,  Var(X) = 1/α^2,  mgf: \frac{\alpha}{\alpha - t}

gamma (continuous; parameters r ≥ 1, α > 0)
    f_X(x) = \frac{\alpha (\alpha x)^{r-1} e^{-\alpha x}}{\Gamma(r)},  x ≥ 0
    E(X) = r/α,  Var(X) = r/α^2,  mgf: \left(\frac{\alpha}{\alpha - t}\right)^r

normal (continuous; parameters µ and σ > 0)
    f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2},  −∞ < x < ∞
    E(X) = µ,  Var(X) = σ^2,  mgf: e^{t\mu + \sigma^2 t^2/2}
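The E(X) and Var(X) columns of Table II are easy to spot-check by simulation. A minimal sketch in Python for two of the rows; the parameter values and sample size below are arbitrary choices of ours, not part of the original table:

```python
import random

N = 200000  # number of simulated draws (arbitrary)

def sample_moments(xs):
    """Return the sample mean and the sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v

# Exponential with rate alpha: Table II gives E(X) = 1/alpha, Var(X) = 1/alpha^2.
alpha = 2.0
m, v = sample_moments([random.expovariate(alpha) for _ in range(N)])
print("exponential: mean %.3f vs %.3f, var %.3f vs %.3f"
      % (m, 1 / alpha, v, 1 / alpha ** 2))

# Binomial(n, p) simulated as a sum of n Bernoulli(p) trials:
# Table II gives E(X) = np, Var(X) = np(1 - p).
n, p = 10, 0.3
m, v = sample_moments([sum(random.random() < p for _ in range(n))
                       for _ in range(N)])
print("binomial: mean %.3f vs %.3f, var %.3f vs %.3f"
      % (m, n * p, v, n * p * (1 - p)))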


Appendix B

ANSWERS TO SELF-TEST EXERCISES

[The letter-answer tables for the self-test exercises of Chapters 1 through 8 appeared here; their row-and-column alignment did not survive this copy, so only the accompanying remarks are reproduced.]

Remarks:

S2.5  Answer C is acceptable if you allow the triangle to be flipped.
S3.5  Must have p = P(A) = P(B) = P(A ∩ B) = 0. Hence, P(A ∩ B) = P(A)P(B).
S5.27 Note that P(ln(X) < ln(0.50)) = P(X < 0.50) since ln(x) is an increasing function.

Chapter 9
S9.1 D   S9.2 C   S9.3 C

Appendix C

SAMPLE EXAMINATION 1

PART I: MULTIPLE CHOICE QUESTIONS

Instructions: For each of the questions in Part I, choose the best answer among those provided. In all of the following questions, Ω is a sample space with events A, B and C and probability measure P(·).

1. (5 points) Four of your friends are having birthdays on the same date. You purchased four birthday cards. How many different ways can you send cards to your friends so that each gets one card?
(a) 4!
(b) \binom{4}{4}
(c) 4
(d) 4^4
(e) none of the above.

2. (5 points) Three of your friends are having birthdays on the same date. You purchased four birthday cards. How many different ways can you send cards to your friends so that each gets one card? (You will have one card left over.)
(a) 4!
(b) \binom{4}{3}
(c) 3!
(d) 4^3
(e) none of the above.

3. (5 points) If P(A) = 0.7, P(B) = 0.3 and P(A | B) = 0.7, then
(a) A and B are mutually exclusive
(b) P(A ∩ B) = 0
(c) A and B are independent
(d) all of the above
(e) none of the above.

4. (5 points) If A ⊆ B, then
(a) P(A ∩ B) = 0
(b) A = B
(c) A ∩ B = Ø
(d) A and B are independent
(e) none of the above.

5. (5 points) The event which corresponds to the statement, "all of the events A, B and C occur simultaneously," is
(a) A ∪ B ∪ C
(b) A ∩ B ∩ C
(c) (A ∩ B) ∪ (A ∩ C) ∪ (B ∩ C)
(d) (A ∪ B ∪ C)^c
(e) none of the above.

6. (5 points) If P(A) = p > 0, P(B) = q > 0 and A ∩ B = Ø, then P(A ∪ B | A) equals
(a) 1
(b) p
(c) (p + q)/p
(d) (p + q − pq)/p
(e) none of the above.

For questions 7 and 8: Assume that X is a continuous random variable with probability density function

    f_X(x) = cx for 0 ≤ x ≤ 2, and 0 otherwise,

where c is a constant.

7. (5 points) The value of the constant c is
(a) 0.5
(b) 1
(c) 2
(d) √2
(e) none of the above.

8. (5 points) P(X > 1) equals
(a) 0.25
(b) 0.5
(c) 0.75
(d) 1
(e) none of the above.

PART II

Instructions: Show your work in the space provided. Circle your final answer.

9. After getting your engineering degree, you are now the owner of the most successful restaurant in Buffalo, New York. Your chicken wings can be either hot or mild, and usually you keep the two different types of chicken wings in separate containers. Your chicken wing recipe is so famous that the White House (in Washington DC) has asked that you send a large order of chicken wings for an important dinner. Unfortunately, an engineering intern (who is working for you as a cook) accidentally mixed some hot wings with some mild ones. To make matters worse, the containers are unmarked. You now have only three containers of chicken wings:

    Container A: All hot wings
    Container B: All mild wings
    Container C: One half hot and one half mild wings

The White House wants only hot chicken wings, and you need to ship the order immediately. You don't want to accidentally ship Container B or C. The only way you can determine the contents of the containers is by sampling the wings.

(a) (10 points) You plan to choose a container at random, and then sample one of the chicken wings from that container. What is the probability that the wing will be mild? Justify your answer.

(b) (10 points) You actually select a container at random, and sample a chicken wing from that container. It's a hot wing! What is the probability that you chose Container C (the one with both hot and mild wings)? Justify your answer using Bayes' theorem.

(c) (10 points) You plan to sample one more wing from the container chosen in part (b). Given that the first wing that you sampled was hot, what is the probability that if you sample a second wing from the same container, it will also be hot? Justify your answer.

10. Let X be a random variable with the following probability mass function:

    p_X(−1) = 0.2,  p_X(0) = 0.4,  p_X(+1) = 0.4

(a) (10 points) Sketch the cumulative distribution function for X.
(b) (5 points) Compute P(X < 0).
(c) (5 points) Compute P(X > 1).
(d) (10 points) Compute P(X = 0 | X ≤ 1).

Appendix D

SAMPLE EXAMINATION 2

PART I: MULTIPLE CHOICE QUESTIONS

Instructions: For each of the questions in Part I, choose the best answer among those provided.

1. (5 points) Let X be a continuous random variable with probability density function

    f_X(x) = \lambda x^n for 0 ≤ x ≤ 1, and 0 otherwise,

where λ is a constant and n is a nonnegative, integer constant. A relationship which must exist between λ and n is
(a) λn = 1
(b) λ = n + 1
(c) λ = n
(d) λ = 2n
(e) none of the above.

2. (5 points) Let X be a discrete random variable with a probability distribution given by

    P(X = k) = \left(\frac{1}{4}\right)^k \left(1 - \frac{1}{4}\right)  for k = 0, 1, 2, ...

Then P(X ≥ k) equals
(a) (1/4)^k
(b) (1/4)^{k+1}
(c) (3/4)^k
(d) (3/4)^{k+1}
(e) none of the above.

3. (5 points) Let X be a continuous random variable with probability density function

    f_X(x) = 4e^{-4x} for x > 0, and 0 otherwise.

The value of P(X^2 ≥ 16) is
(a) 1 − e^{−4}
(b) e^{−16}
(c) 1 − 4e^{−16}
(d) e^{−4}
(e) none of the above.

4. (5 points) Let X and Y be random variables with E(X) = 2 and E(Y) = 2. Then E[(X − E(Y))^2] equals
(a) 0
(b) Var(X)
(c) Var(Y)
(d) Cov(X, Y)
(e) none of the above.

5. (5 points) If X is a random variable with P(X ≤ 0) = α, then P(−3X < 0) equals
(a) 0
(b) (1/3)α
(c) α
(d) 1 − α
(e) none of the above.

6. (5 points) Let X be a continuous random variable with probability density function

    f_X(x) = \frac{x + 1}{2} for −1 ≤ x ≤ 1, and 0 otherwise.

Then P(X < 0) equals
(a) 0
(b) 1/8
(c) 1/4
(d) 1/2
(e) none of the above.

For questions 7 and 8: Assume that (X, Y) is a random vector with joint probability mass function given by

    p_{X,Y}(1, 0) = 1/3,  p_{X,Y}(0, −1) = 1/3,  p_{X,Y}(−1, 1) = 1/3.

7. (5 points) Define the random variable W = XY. The value of P(W = 1) is
(a) 0
(b) 1/9
(c) 1/3
(d) 2/3
(e) none of the above.

8. (5 points) The value of E(X + Y) is
(a) −1
(b) 0
(c) 1/3
(d) 1
(e) none of the above.

PART II

Instructions: Show your work in the space provided. Circle your final answer.

9. The Youbee School of Engineering has a graduating class of 2 men and 3 women. Plans are being made for a graduation party. Let X denote the number of men from the graduating class who will attend the party, and let Y denote the number of women from the class who will attend. After an extensive survey was conducted, it has been determined that the random vector (X, Y) has the following joint probability mass function:

[Table: the joint probability mass function p_{X,Y}(x, y) over x = 0, 1, 2 (men) and y = 0, 1, 2, 3 (women); the cell alignment did not survive this copy, so the entries are not reproduced.]

(a) (5 points) What is the probability that 4 or more members of the graduating class will show up at the party?
(b) (5 points) What is the probability that more women than men from the graduating class will show up at the party?
(c) (10 points) If women are charged $1 for attending the party and men are charged $2, what is the expected amount of money that will be collected from members of the graduating class?
(d) (10 points) All three women graduates arrive early at the party. What is the probability that no men will show up?

10. Let (X, Y) be a continuous random vector with joint probability density function given by

    f_{X,Y}(x, y) = 2 if x ≥ 0, y ≥ 0 and x + y ≤ 1, and 0 otherwise.

(a) (10 points) Compute P(X + Y < 1/2).
(b) (10 points) Find the marginal probability density function for X.
(c) (10 points) Find the conditional probability density function for Y given that X = 1/2.

Appendix E

SAMPLE FINAL EXAMINATION

PART I: MULTIPLE CHOICE QUESTIONS

Instructions: For each of the questions in Part I, choose the best answer among those provided. In all of the following questions, Ω is a sample space with events A, B and C and probability measure P(·).

1. (3 points) Let P(A) = r and P(B) = s with A ∩ B = Ø. Then P(A | B^c) equals
(a) 0
(b) r
(c) r/(1 − s)
(d) 1
(e) none of the above.

2. (3 points) Let X and Y be independent random variables with X ∼ N(0, 1) and Y ∼ N(0, 2). The value of P(X > Y) is
(a) 0
(b) 0.05
(c) 0.50
(d) 0.95
(e) none of the above.

For questions 3 and 4: A game is played by repeatedly tossing a fair coin. If the coin turns up heads, you win $1, and if the coin turns up tails, you lose $1. Let Y_n denote your net winnings (i.e., your winnings minus your losses) after n tosses. Note that Y_n may be negative.

3. (3 points) The value of E(Y_n) is
(a) 0
(b) −n
(c) n/2
(d) n
(e) none of the above.

4. (3 points) The value of Var(Y_n) is
(a) 0
(b) √n
(c) n
(d) n^2
(e) none of the above.

5. (3 points) In a class of 10 students, 5 students received an A on the last examination. Two students are chosen at random from the class. The probability that both of the selected students received a grade of A is
(a) 1/4
(b) 2/9
(c) 1/3
(d) 1/2
(e) none of the above.

6. (3 points) Let X be a continuous random variable with probability density function

    f_X(x) = \alpha e^{-x} for x ≥ 0, and 0 otherwise.

The value of α
(a) must be 1
(b) can be any positive real number
(c) can be any positive integer
(d) can be any real number
(e) none of the above.

7. (3 points) The set (A ∪ B) ∩ (A ∪ B^c) is equivalent to
(a) Ø
(b) Ω
(c) A
(d) B
(e) none of the above.

8. (3 points) The weight of a gum drop (in ounces) is normally distributed with mean 1/2 and standard deviation 1/4. A bag contains 10 independent gum drops. The probability that the total weight of the gum drops in the bag exceeds 20 ounces is
(a) 0
(b) 0.5
(c) 1
(d) 1 − (1/4)^{16}
(e) none of the above.

9. (3 points) Suppose a chain link has a probability of 0.001 of breaking, independently of all other links. Approximately, what is the probability that a chain with 2000 such links will break? (Note: The chain will break if one or more links break.)
(a) e^{−2}
(b) 1 − e^{−2}
(c) 1/2
(d) (0.001)^{2000}
(e) none of the above.

10. (3 points) Let X ∼ N(1, 8). If Y = X^3, then P(Y > 1) equals
(a) 0.1587
(b) 0.5000
(c) 0.8413
(d) 1.0000
(e) none of the above.

11. (3 points) Let X be a random variable with probability generating function

    g(z) = \frac{z(1 + z)}{2}

The value of E(X) is
(a) 0
(b) 2/3
(c) 1
(d) 3/2
(e) none of the above.

12. (3 points) Let X be a random variable with moment generating function

    M_X(t) = e^{(e^t - 1)}

The moment generating function of the random variable Y = 2X + 1 is
(a) e^t
(b) e^{2t}
(c) e^{(2e^t - 2)}
(d) e^{(t + e^{2t} - 1)}
(e) none of the above.

13. (3 points) Let X be a random variable with E(X) = 2 and E(X^2) = 5. Then Var(X) equals
(a) 1
(b) 3
(c) 7
(d) 9
(e) none of the above.

14. (3 points) An airplane can hold 100 passengers. It is known that, over many years, 1% of all persons making airline reservations never show up for their flight. Assume that a person decides independently whether or not to show up for a flight. If the airline permits 101 persons to make reservations for a given flight, the probability that someone will be denied a seat is
(a) 0
(b) 0.01
(c) (0.99)^{101}
(d) 1
(e) none of the above.

15. (3 points) The weight X (in pounds) of a roll of steel is a random variable with probability density function given by

    f_X(x) = \frac{1}{2}(x - 499) for 499 ≤ x ≤ 501, and 0 otherwise.

A roll is considered defective if its weight is less than 500 pounds. If 3 independent rolls of steel are examined, the probability that exactly one is defective is
(a) 1/4
(b) 3/8
(c) 9/64
(d) 27/64
(e) none of the above.

For questions 16 and 17: Suppose that X1 and X2 are independent random variables representing the lifetimes of two lightbulbs, each having marginal probability density function

    f(x) = 1/2 for 0 ≤ x ≤ 2, and 0 otherwise.

Let W = X1 + X2 represent the total lifetime of the two lightbulbs.

16. (3 points) The value of P(W < 2) is
(a) 0
(b) 1/4
(c) 1/2
(d) 1
(e) none of the above.

17. (3 points) The value of E(W) is
(a) 1/2
(b) 1
(c) 2
(d) X1 + X2
(e) none of the above.

18. (3 points) Suppose A ⊆ B. Which of the following statements are always true?
(a) P(A) ≤ P(B)
(b) P(A ∪ B) = P(B)
(c) P(A ∩ B) = P(A)
(d) all of the above
(e) none of the above.

19. (3 points) Let X1, . . . , Xn be a random sample with each Xi having a probability mass function given by

    P(Xi = 0) = 1 − p,  P(Xi = 1) = p,  where 0 ≤ p ≤ 1.

An unbiased estimator for p is given by
(a) \hat{p} = \frac{1}{n} \sum_{i=1}^{n} X_i
(b) \hat{p} = \sum_{i=1}^{n} X_i
(c) \hat{p} = \sum_{i=1}^{n} X_i^2
(d) all of the above
(e) none of the above.

20. (3 points) A box is partitioned into 3 numbered cells. Two indistinguishable balls are dropped into the box. Any cell may contain more than one ball. The total number of distinguishable arrangements of the balls in the numbered cells is
(a) \binom{4}{2}
(b) \binom{3}{2}
(c) \binom{3}{1}
(d) 4
(e) none of the above.

PART II

Instructions: Show your work in the space provided. Circle your final answer.

21. A sociologist is interested in determining the percentage of college students who like classical music. In order to avoid embarrassing any students and help assure a truthful response to the survey, she decides on the following technique: A person is given a bowl which contains 600 red beads and 400 blue beads. The person is instructed to go to a secluded corner, draw out a bead at random, and write down the answer to one of the following questions depending on the color of the bead:

• If the bead is red, answer "Yes" or "No" to the question: Do you like classical music?
• If the bead is blue, answer "Yes" or "No" to the question: Does your student number end in an odd digit?

The person simply responds "Yes" or "No." No one, except the person being surveyed, knows which question was actually answered. Therefore, the person is likely to be truthful.

(a) (5 points) Suppose p is the unknown fraction of students who like classical music. What is the unconditional probability that a person surveyed will answer "Yes"? Your answer should depend on p.
(b) (5 points) Let X denote the number of "Yes" responses from a survey of 100 individuals. Determine P(X = k) as a function of k and p.
(c) (5 points) Let Y = (X/60) − (1/3). Show that E(Y) = p.
(d) (5 points) Suppose 40 of the 100 students answered "Yes" to the survey. What is your best guess for the value of p? (Hint: Use an unbiased estimator for p.)

22. Let (X, Y) be a continuous random vector with joint probability density function given by

    f_{X,Y}(x, y) = 2 if x ≥ 0, y ≥ 0 and x + y ≤ 1, and 0 otherwise.

(a) (5 points) Compute P(X + Y < 1/2).
(b) (5 points) Find the marginal probability density function for X.
(c) (5 points) Find the marginal probability density function for Y.
(d) (5 points) Find the conditional probability density function for Y given that X = 1/2.
