P. 1
3500317 Applied Probability

# 3500317 Applied Probability

|Views: 501|Likes:

See more
See less

08/12/2012

pdf

text

original

## Sections

• Mathematical Models
• Deterministic models
• Nondeterministic models
• Sets
• Basic deﬁnitions
• Set operations
• Set identities
• A proof of the distributive law for sets
• Elementary sets
• Cardinality of sets
• Sample spaces and events
• Sets of events
• Self-Test Exercises for Chapter 1
• Questions for Chapter 1
• ENUMERATION
• Fundamental Rule
• Permutation of n items distinguished by groups
• Self-Test Exercises for Chapter 2
• Questions for Chapter 2
• PROBABILITY
• Probability Spaces
• Measuring sets
• Interpretation of probability
• Examples using logical probability
• Relationships between English and set notation
• Deductions from the axioms
• THE BIRTHDAY PROBLEM
• The birthday problem on the planet Yak
• Conditional Probability and Independence
• Conditional probability
• Independence
• Bayes’ theorem
• Independence of more than two events
• Self-Test Exercises for Chapter 3
• Questions for Chapter 3
• Cumulative Distribution Function
• Properties of any cumulative distribution function
• Discrete Random Variables
• Properties of any probability mass function
• Examples of discrete distributions
• Properties of any probability density function
• Differential mass interpretation of the pdf
• Examples of continuous distributions
• Expectation
• Finding E(X) for some speciﬁc distributions
• Binomial distribution
• Poisson distribution
• Geometric distribution
• Exponential distribution
• Other measures of central tendency
• Self-Test Exercises for Chapter 4
• Questions for Chapter 4
• Some examples
• Discrete case
• Continuous case
• Invertible Functions of Continuous Random Variables
• Expected values for functions of random variables
• The kth Moment of X
• Variance
• Approximating E(h(X)) and Var(h(X))
• Chebychev’s Inequality
• The Law of Large Numbers
• The Normal Distribution
• Tools for using the standard normal tables
• Self-Test Exercises for Chapter 5
• Questions for Chapter 5
• RANDOM VECTORS
• Deﬁnition
• Joint Distributions
• Discrete Distributions
• Continuous Distributions
• Marginal Distributions
• Functions of Random Vectors
• Independence of Random Variables
• Testing for independence
• Expectation and random vectors
• Random vectors and conditional probability
• Conditional distributions
• Self-Test Exercises for Chapter 6
• Questions for Chapter 6
• Transforms
• Notation
• Probability generating functions
• Moment generating functions
• Other transforms of random variables
• Linear combinations of normal random variables
• The distribution of the sample mean
• The Central Limit Theorem
• The DeMoivre-Laplace Theorem
• The distribution of the sample mean for large n
• Self-Test Exercises for Chapter 7
• Questions for Chapter 7
• Basic Concepts
• Random samples
• Statistics
• Point Estimation
• The problem
• Unbiased estimators
• Finding Estimators
• Method of maximum likelihood
• Self-Test Exercises for Chapter 8
• Questions for Chapter 8
• Coin Tossing (Binary) Models
• The binomial distribution
• The geometric distribution
• The Poisson distribution
• Random Particle Models
• Self-Test Exercises for Chapter 9
• Questions for Chapter 9
• Approximation for the Normal Integral
• Approximation for the Inverse Normal Integral
• Questions for Chapter 10
• TABLE I: STANDARD NORMAL CUMULATIVE DISTRIBUTION FUNCTION
• TABLE II: SOME COMMON PROBABILITY DISTRIBUTIONS
• SAMPLE EXAMINATION 1
• PART I: MULTIPLE CHOICE QUESTIONS
• PART II
• SAMPLE EXAMINATION 2

# Lecture Notes in Applied Probability

2005 Wayne F. Bialas Department of Industrial Engineering University at Buffalo

i

Copyright c 2005 Wayne F. Bialas Department of Industrial Engineering University at Buffalo 301 Bell Hall Buffalo, New York 14260 USA E-mail: bialas@buffalo.edu Web: http://www.buffalo.edu/˜bialas All rights reserved. No part of this work may be reproduced by any means without the express written permission of the author.

Date printed: May 12, 2005

ii

CONTENTS

Acknowledgements To the student 1 Sets, Sample Spaces and Events Mathematical Models . . . . . . . . . . . . . . . . . . . . . . . . . . Deterministic models . . . . . . . . . . . . . . . . . . . . . . . . Nondeterministic models . . . . . . . . . . . . . . . . . . . . . . Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Basic deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . Venn diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . Set operations . . . . . . . . . . . . . . . . . . . . . . . . . . . Set identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . Proving statements about sets . . . . . . . . . . . . . . . . . . . A proof of the distributive law for sets . . . . . . . . . . . . . . . Elementary sets . . . . . . . . . . . . . . . . . . . . . . . . . . Cardinality of sets . . . . . . . . . . . . . . . . . . . . . . . . . Sample spaces and events . . . . . . . . . . . . . . . . . . . . . Sets of events . . . . . . . . . . . . . . . . . . . . . . . . . . . Self-Test Exercises for Chapter 1 . . . . . . . . . . . . . . . . . . . . Questions for Chapter 1 . . . . . . . . . . . . . . . . . . . . . . . . .

xi xiii 1 1 1 1 2 2 3 3 5 6 7 8 9 10 11 12 14

iii

. Conditional probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Probability Probability Spaces . . . . . . . Self-Test Exercises for Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . Deductions from the axioms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 20 20 26 26 29 32 35 35 35 36 40 41 41 45 48 49 49 51 52 54 57 67 Independence . . . . . . . . . . The birthday problem on the planet Yak . . . . . . . . . . . . . . Permutation of n items distinguished by groups . . . . . . . . . . . . . . . . . . . . . . . . Bayes’ theorem . . . . . . . . . . . . . . . . . Independence of more than two events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Relationships between English and set notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Self-Test Exercises for Chapter 2 . . . . . . . . . . Conditional Probability and Independence . Examples using logical probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .iv CONTENTS 2 Enumeration Fundamental Rule . Measuring sets . . . . . . . . . . . . . . . . . . . . . . The Birthday Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Questions for Chapter 2 . Occupancy . . . . . . . . . . . . . Interpretation of probability . . . . . . . . . . . . . . . . Sampling . . . . Questions for Chapter 3 .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 Continuous case . . . . . . . . . . . . . Properties of any probability density function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Examples of continuous distributions . . . . . . . . . . . . . . . . . Questions for Chapter 4 . . . . . Differential mass interpretation of the pdf . . . . . . . . . . . . . . . . . . . . 109 Discrete case . . . . . . . . . . . Discrete Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .CONTENTS v 4 Random Variables Basic Concepts . . . . . . . . . . . . . . . . . . Properties of any cumulative distribution function . . . . . . . . . . Examples of discrete distributions . . . . . . . . Properties of any probability mass function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Poisson distribution . . . . . Expectation . . . . Binomial distribution . . . . . . . . . . . . . . . . . . . . . . . Exponential distribution . . . . . . Other measures of central tendency . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 . . . . . . . . . . . . . . . . . . Continuous Random Variables . . . . Finding E(X) for some speciﬁc distributions . . . . . . . . . . . . . . . . . . . . . . . . . . Self-Test Exercises for Chapter 4 . . . . 81 81 81 81 82 82 83 85 86 87 88 89 90 90 90 91 91 92 92 99 105 5 Functions of Random Variables Basic Concepts . . . . . 105 Some examples . . . . . . . . . . . Geometric distribution . . . Cumulative Distribution Function . . . . . . . . . . .

. . . . . . . . . . . . . 112 Expected values for functions of random variables . . . . . . .vi CONTENTS Invertible Functions of Continuous Random Variables . . . . . . . . . . . . . . . . . . . . . . . 133 Questions for Chapter 5 . . . . . . . . . 159 Marginal Distributions . . . . . . . . . . . 155 Joint Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 126 The Normal Distribution . . . . . . . . . . . 157 Discrete Distributions . . . . . . 155 Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 The Law of Large Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 The k th Moment of X . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 Variance . . . . . . . . . . . . . . 121 Chebychev’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . 168 Independence of Random Variables . . . . . . . . . . . . . . . . . . . . 126 Tools for using the standard normal tables . . . . . . . . . . . . . . . . . . . . . . . . 180 . . 162 Functions of Random Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 Approximating E(h(X)) and Var(h(X)) . . . . . . . . . . . . . . . . . . . 178 Conditional distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 Expectation and random vectors . . . . 171 Testing for independence . . . . . . . . . . . . . . . . . . 173 Random vectors and conditional probability . . . . . . . . . . . . . . . . . . . 158 Continuous Distributions . . . . . . . . . . . . . . . . . . . . . . . . 142 6 Random Vectors 155 Basic Concepts . . . . . . . . . . . . . . . 179 Self-Test Exercises for Chapter 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 Self-Test Exercises for Chapter 5 . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229 The problem . . . . . . . . . 209 The distribution of the sample mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 Questions for Chapter 8 . . . . . . . . . 229 Unbiased estimators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229 Self-Test Exercises for Chapter 8 . . 213 The distribution of the sample mean for large n . . . . . . . . . . . . . . . . . . . . . . . . . . 229 Finding Estimators . . . . . . . . . . . . . . . 214 Self-Test Exercises for Chapter 7 . . . . . . . . . . . . . . . . . . . . . . . 215 Questions for Chapter 7 . . . . . . . . . . . . 199 Moment generating functions . . . . . . . . . . . . . . . . . 227 Random samples . . . . . . . . . . . . . . . . . . 227 Statistics . . . . . . . . . . . . 199 Notation . . . . . . . 231 . . . . . . . . . . . . . . . . . . . . . . . . .CONTENTS vii Questions for Chapter 6 . . 211 The Central Limit Theorem . 227 Point Estimation . . . . . . . . . . . . . . . . . . 229 Method of maximum likelihood . . . . . . . . . . . . . . . . . . . . . 209 Linear combinations of normal random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 8 Estimation 227 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 The DeMoivre-Laplace Theorem . . . . . . . . . . . . . . . . . 203 Other transforms of random variables . . . . . . . . . . . . 188 7 Transforms and Sums 199 Transforms . . . . . 199 Probability generating functions .

. . . . . . 233 The geometric distribution . . . . . . . . . . . . 248 A Tables 249 Table I: Standard Normal Cumulative Distribution Function . . . 234 The Poisson distribution . . . . . . . . . . . . . . 259 D Sample Examination 2 261 Part I: Multiple Choice Questions . . . . . . . . . . . . . . . . . . . . . . . . 235 Random Particle Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247 Approximation for the Inverse Normal Integral . . . . 238 10 Approximations and Computations 247 Approximation for the Normal Integral . . . . . . . . . . . 250 Table II: Some Common Probability Distributions . . . . . . . . . . . . . . . . . . 261 Part II . . . . . . . . 237 Questions for Chapter 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235 Self-Test Exercises for Chapter 9 . . . . . . . . . . . . . . . . . . . . . . . 257 Part II . . . . . . . . . .viii CONTENTS 9 Probability Models 233 Coin Tossing (Binary) Models . . . . . . . . . . . 233 The binomial distribution . . . . . . . . . . . . . 264 . . . . . . . . . . . . 251 B Answers to Self-Test Exercises C Sample Examination 1 253 257 Part I: Multiple Choice Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248 Questions for Chapter 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . 271 . . . . . . . . . . . . . . .CONTENTS ix E Sample Final Examination 265 Part I: Multiple Choice Questions . . . 265 Part II . . . . . . . . . . . . . . . . . . . . . .

x CONTENTS .

Some of them were my professors. Amol Salunkhe. Dawn Holmes. James Pasquini. Satyaki Ghosh-Dastidar. These notes have beneﬁted the most from the many contributions of my students and teaching assistants.ACKNOWLEDGEMENTS Contained herein is a bit of the wisdom of the greatest scholars in probability theory. WFB Buffalo. Kevin Cleary. I must also thank Yu-Shuan Lee Brown. written by Mark A. Any remaining errors and omissions are mine. Derek Hoiem. dvipdfm. Partel was modiﬁed and used to format this document. In particular. Alfred Guiffrida and Nishant Mishra for several contributions to these notes. Keith Klopﬂe. Taneka Lawson. Elif Tokar and Min Zhang for ﬁnding numerous mistakes in previous editions. I am grateful to Mark Chew. Rajarshi Chakraborty. Shiju David. Steven Kieffer. Lynne Makrina Tkaczevski. A L by H. Nguk Ching Ling. Mona Khanna. Xiaofeng Nie. New York May 12. Lindsay Koslowski. Judah Eisenhandler. and AFPL Ghostscript were used to produce the printed output. a This manual was typeset using Donald Knuth’s TEX with Leslie Lamport’s L TEX aTEX style ﬁle developed enhancements as implemented by Eberhard Mattes. 2005 xi . Wicks. Angelo Morgante. The Adobe AcrobatTM driver. Michael Holender. Harold T. I would not have dared to prepare these notes without these tools. Gerber. Simin Huang. Nikola Petrovic. Andrew Lenkei.

xii CONTENTS .

The concepts of center of mass (center of gravity) and moment of inertia. That measure will be called probability. A fool must now and then be right by chance. the goal of any good mathematical model is to serve as a tool for reaching conclusions about the processes being modeled. Point masses and continuous mass densities arise as a natural consequence. – WILLIAM COWPER. or when they arise from a continuum of possible values.TO THE STUDENT These notes are for a course devoted to describing those processes around us that appear to produce random outcomes. in a systematic way. Conversation (c. we are always more comfortable working with real numbers. The universe is not a deterministic clockwork. but a harmony of random possibilities. 1790) xiii . For some problems. with examples ranging from coin tossing to proving one of the most important results in all of science. We will do that. For that. Probability theory provides a way to mathematically represent chance. But. too. a measure of uncertainty to each of those sets. become the mean and variance of our probability distributions. we will introduce the concept of random variables and random vectors to map our abstract probability problems into Euclidean real space. we will use the mathematical theory of measure. For that purpose. and then assign. As engineers and scientists. is that a course in probability asks that you change your view of everything around you. we can assign probabilities to events by simply counting equally likely outcomes. the Central Limit Theorem. When outcomes are not equally likely. ultimately. we provide some counting techniques. There we will quickly discover analogies between allocating probability and allocating physical mass. What is certain. We will show you how sets can be used to express events.

xiv CONTENTS .

1 SETS. – AMELIA E. All the Days of My Life (1913) Mathematical Models What is a mathematical model?    . SAMPLE SPACES AND EVENTS There are no little events in life. BARR.

.

 Observation  .

Examples: 1 To place this date in perspective. Application    Deterministic models For speciﬁed values of an experiment’s input parameters. 1 . the result is a known constant. University at Buffalo was founded in 1846. Example: V = IR Established experimentally by George Simon Ohm in 18271 Nondeterministic models Also called probabilistic models or stochastic models.

we write x ∈ A. A set is a collection of elements. Deﬁnition 1. . inclusive. . . Notation: Sets are denoted by capital letters (A. . b. 2. Ω. Sets Basic deﬁnitions We will need to use set theory to describe possible outcomes and scenarios in our model of uncertainty. . Question: What is the difference (if any) between Ø and {Ø}? Deﬁnition 1. 3.” • by using the notation {ω : speciﬁcation for ω} A = {x : 0 ≤ x ≤ 1} or A = {0 ≤ x ≤ 1} There are two sets of special importance: Deﬁnition 1.).3. If x is an element of a set A. If x is not an element of A. we write x ∈ A. The empty set (the set containing no elements) is denoted by Ø. A set can be described • by listing the set’s elements A = {1. • the outcome of a single toss of a fair die. 4} • by describing the set in words “A is the set of all real numbers between 0 and 1.) while their elements are denoted by small letters (a. . .2. • the number of students who will earn an A in EAS305.1. ω. B.2 Sets • the number of α-particles emitted from a piece of radioactive material in one minute. The universe is the set containing all points under consideration and is denoted by Ω.

We write A ⊆B . Deﬁne the sets A = {x : x2 + x − 2 = 0} B = {x : (x − 3)(x2 + x − 2) = 0} C = {−2. / Note: Some authors use the symbol ⊂ for subset and other use ⊂ for proper subset.5. A = B if and only if A ⊆ B and B ⊆ A.Sets 3 Deﬁnition 1. 1. then we say that A is a subset of B (B is a superset of A). . . Deﬁnition 1. Deﬁnition 1. 3} Then A ⊆ B and B = C . Example: Let Ω = R.4. / Venn diagrams Ω B A Set operations Complementation Ac = {ω ∈ Ω : ω ∈ A} . A is a proper subset of B if A ⊆ B and A = B . If every point of set A belongs to set B .6. We avoid the ambiguity here by using ⊆ for subset and ⊆ for proper subset. the set of all real numbers. We write A⊆B B⊇A Now we have a way to say A is identical to B .

4 Example: Here is the region representing Ac in a Venn diagram: Ω Sets A Ac Union A ∪ B = {ω ∈ Ω : ω ∈ A or ω ∈ B (or both)} Example: Consider the following Venn diagram: Ω A B then the region representing A ∪ B is: Intersection A ∩ B = {ω ∈ Ω : ω ∈ A and ω ∈ B} .

Deﬁnition 1.7. A ∪ B ∪ C = A ∪ (B ∪ C) = (A ∪ B) ∪ C and A ∩ B ∩ C = A ∩ (B ∩ C) = (A ∩ B) ∩ C You can show that (a) A ∪ B = B ∪ A (b) A ∩ B = B ∩ A (c) A ∪ (B ∪ C) = (A ∪ B) ∪ C (d) A ∩ (B ∩ C) = (A ∩ B) ∩ C Note: (a) and (b) are the commutative laws Note: (c) and (d) are the associative laws Set identities (e) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) (f) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) (g) A ∩ Ø = Ø . These basic operations can be extended to any ﬁnite number of sets.Sets 5 Example: Using the same Venn diagram as above the region representing A ∩ B is given by: Note that some textbooks use the notation AB instead of A ∩ B to denote the intersection of two sets. Two sets A and B are disjoint (or mutually exclusive) if A∩B = Ø.

Proof. (1) (2) (3) (4) Discussion If x ∈ A then x ∈ B Since x ∈ B then x ∈ C If x ∈ A then x ∈ C Therefore A ⊆ C Reasons Deﬁnition of A ⊆ B Deﬁnition of B ⊆ C By statements (1) and (2) Deﬁnition of A ⊆ C (as in the above proof) or the Note: A proof is often concluded with the symbol letters QED. x ∈ B implies x ∈ C . Furthermore. Hence A ⊆ C . x ∈ A implies x ∈ B .6 Sets (h) A ∪ Ø = A (i) (A ∪ B)c = Ac ∩ B c (j) (A ∩ B)c = Ac ∪ B c (k) (Ac )c = A Note: (e) and (f) are called the Distributive Laws Note: (i) and (j) are called DeMorgan’s Laws Proving statements about sets Venn diagrams can only illustrate set operations and basic results.1. Many mathematicians prefer to write proofs in paragraph form. Since A ⊆ B . Here is an example of a simple set theorem and its proof. . For example.1 would become: Proof. Therefore. But they can be used to disprove a false mathematical statement. If A ⊆ B and B ⊆ C . Theorem 1. since B ⊆ C . Choose an x ∈ A. from the deﬁnition of subset. then A ⊆ C . every element x ∈ A is also an element of C . the proof of Theorem 1. You cannot prove a true mathematical statement regarding sets using Venn diagrams.

F. then x ∈ A and x ∈ B ∪ C . Reasons Deﬁnition of ∩. 1964. giving us A ∩ (B ∪ C) ⊆ (A ∩ B) ∪ (A ∩ C).. Since x ∈ B ∪ C . or x ∈ A and x ∈ C .Sets 7 A proof of the distributive law for sets Here is a proof of one of the distributive laws for sets. Deﬁnition of ∪. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C). . Deﬁnition of ∪.2. then x ∈ B or x ∈ C . We now prove the inclusion in the opposite direction: 2 Dinkines. Meredith Publishing. x ∈ (A ∩ B) ∪ (A ∩ C). Elementary Theory of Sets. Therefore x ∈ A and x ∈ B . By (1) and (2). Discussion If x ∈ A ∩ (B ∪ C). The proof is from a textbook on set theory by Flora Dinkines2 Theorem 1. (1) (2) (3) (4) (5) Therefore every element of A ∩ (B ∪ C) is also an element of (A ∩ B) ∪ (A ∩ C). Hence x ∈ A ∩ B or x ∈ A ∩ C. Deﬁnition of ∩. Proof.

giving us A ∩ (B ∪ C) ⊇ (A ∩ B) ∪ (A ∩ C). . Therefore every element of (A ∩ B) ∪ (A ∩ C) is also an element of A ∩ (B ∪ C). . it is in the union. If y ∈ A ∩ B then y ∈ A and y ∈ B . Consider the following subsets of Ω: A B C 0 0 0 Ac ∩ B c ∩ C c = S0 0 0 1 Ac ∩ B c ∩ C = S1 0 1 0 Ac ∩ B ∩ C c = S2 0 1 1 Ac ∩ B ∩ C = S3 1 0 0 A ∩ Bc ∩ C c = S4 1 0 1 A ∩ Bc ∩ C = S5 1 1 0 A ∩ B ∩ Cc = S6 1 1 1 A∩B∩C = S7 The sets S0 . Deﬁnition of ∩. S7 are called the elementary sets of A. Deﬁnition of ∩. Therefore y ∈ A and y ∈ B ∪ C. Therefore y ∈ A ∩ (B ∪ C). In either case (7) or (8). Same as (7). Therefore. If y ∈ A ∩ C then y ∈ A and y ∈ C . B . B and C . then y ∈ (A ∩ B) or y ∈ (A ∩ C). . Elementary sets Let Ω be a universe with three subsets A. S1 . . y ∈ A and y is an element of one of the sets B or C .8 Sets (6) (7) (8) (9) (10) (11) If y ∈ (A ∩ B) ∪ (A ∩ C). using the deﬁnition of set equality. and C . Statements (7) and (8) Since y is in B or C . . Deﬁnition of ∪. we have A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

1: An example of elementary sets for three subsets For example. we have                            A Bc C Ac B C Ac Bc C C . • Any set operation involving A. For example (A ∪ B) ∩ C = S3 ∪ S5 ∪ S7 Cardinality of sets The cardinality of a set A (denoted by |A|) is the number of elements in A. That is.Sets 9 Ω A B A Bc Cc A B Cc A B C Ac B Cc Ac Bc Cc Figure 1. and C can always be represented as a union of the elementary sets for A.1 illustrates all of the 23 = 8 elementary sets for A. This is the subset of elements that are 0: 1: 1: not in A in B in C and and Figure 1. For some (but not all) experiments.1) corresponds to S3 = Ac ∩ B ∩ C .1. B and C using a Venn diagram. Si ∩ Sj = Ø for any i = j . B . Elementary sets have two very useful properties: • Elementary sets are pairwise disjoint. B and C . the binary sequence (0.

Example: Consider the experiment of rolling a die once. C= Example: Consider the experiment of burning a light bulb until it burns out. 3 What should be Ω for this experiment? Ω= If you thought the answer was Ω = {1. we can now introduce the ﬁrst components of our probability model.9. B= 3.10 • |Ω| is ﬁnite. Ω. 2. Sample spaces and events Using set notation. A= 2. count) the number of elements of subsets of Ω. Deﬁnition 1. then you just (re)invented the concept of a random variable. The die turns up even. 3 Identify the subsets of Ω that represent the following events: 1. . A sample space. This will be discussed in a later chapter. 6}. 5. • Each of the outcomes of the experiment are equally likely. Sets In such cases. Deﬁnition 1. 3. The die turns up less than 5. The die turns up 6. 4.e. The outcome of the experiment is the life length of the bulb. it is important to be able to enumerate (i. is the set of all possible outcomes of an experiment.8. An event is any subset of a sample space..

A = {x ∈ R : 0 ≤ x < 1. C = {24} Sets of events It is sometimes useful to be able to talk about the set of all possible events F. the set of all events is really a set of sets. T H. B = {x ∈ R : x ≥ 5} 3. Since an event is a set. HT. {HT }} A6 = {{HH}. The set F is very different than the set of all possible outcomes. T T }. namely A1 = {{HH}} A2 = {{HT }} A3 = {{T H}} A4 = {{T T }} A5 = {{HH}. 3 Identify the subsets of Ω that represent the following events: 1. we resorted to assigning a nonnegative real number to each possible outcome of the experiment. The bulb lasts at least 5 hours.Sets 3 What should be Ω for this experiment? 11 Ω = {x ∈ R : x ≥ 0} For the lack of a better representation. {T H}} . The bulb lasts exactly 24 hours. The bulb burns out within one and one-half hours. suppose we toss a two-sided coin twice. If you were asked to list all of the possible events for this experiment.5} 2. Ω. The sample space is Ω = {HH. Each element of F is a subset of Ω. you would need to list all possible subsets of Ω. For example.

{T H}. and is denoted by 2A . to represent those events we actually need to consider. then we often use the set of all possible events F = 2Ω . {T T }} A8 = {{HT }. {T H}} A12 = {{HH}. {HT }. Answers can be found in Appendix B. In those cases. {T T }} A10 = {{T H}. F. choose the best response among those provided. Self-Test Exercises for Chapter 1 For each of the following multiple-choice questions. If A is a ﬁnite set. {T H}. {HT }. Left to reader.3. . {T T }} A11 = {{HH}. Ω contains an inﬁnite number of outcomes (either countable or uncountable) and F = 2Ω is much too large. . this often works well. How to properly select a smaller F. {T H}} A9 = {{HT }. then |2A | = 2|A| Proof. For a given set A. {T T }} = Ω A16 = {} = Ø In this case F = {A1 .1 The event that corresponds to the statement. {T H}. is found in many graduate level probability courses." is (A) A ∪ B ∪ C (B) A ∩ B ∩ C (C) (A ∩ B ∩ C)c . Theorem 1. . A2 . However. {HT }.12 Self-Test Exercises for Chapter 1 A7 = {{HH}. When Ω contains a ﬁnite number of outcomes. the set of all subsets of A is called the power set of A. . {T T }} A13 = {{HH}. we select a smaller collection of sets. {T T }} A14 = {{HT }. {T T }} A15 = {{HH}. If Ω is a sample space. S1. A16 } Deﬁnition 1. B and C occurs.10. "at least one of the events A. for some experiments.

(A) (A ∩ B) ⊆ B (B) A ⊆ (A ∪ B) (C) Ac ⊆ (A ∩ B)c (D) all of the above are true.5 For any sets A and B . S1. S1.Self-Test Exercises for Chapter 1 (D) (A ∪ B ∪ C)c (E) none of the above 13 S1. B or C occurs is (A) (A ∩ B ∩ C) (B) (A ∩ B ∩ C)c (C) (A ∪ B ∪ C) (D) (A ∩ B c ∩ C c ) ∪ (Ac ∩ B ∩ C c ) ∪ (Ac ∩ B c ∩ C) (E) none of the above.2 The event that corresponds to the statement that exactly one of the events A.3 The set (A ∩ B) ∪ (A ∩ B c ) is equivalent to (A) Ø (B) Ω (C) A (D) B (E) none of the above.4 The set (A ∩ B) ∪ A is equivalent to (A) (A ∪ B) ∩ A (B) B (C) A ∩ B (D) Ø (E) none of the above. . S1.

Questions for Chapter 1 1.25 ≤ x ≤ 1. S1. Let A = {2. the event (A ∪ B c ) ∩ B equals (A) Ø (B) Ω (C) A ∪ B (D) A ∩ B (E) none of the above. List the elements of the following sets: (a) Ac ∩ B (b) Ac ∪ B (c) (Ac ∪ B c )c (d) A ∪ (B ∩ C)c c 1.5}. B = {3. and B = {x : 0.2 Suppose that the universe. Ω. 3.6 For any events A and B .7 If A ∩ B = Ø then (Ac ∪ B c )c equals (A) Ø (B) A (C) B (D) Ω (E) none of the above. 5} and C = {5. 7}.5 ≤ x ≤ 1}. 4}. Describe the following sets: (a) (A ∪ B)c (b) A ∪ B c (c) (A ∩ B)c .1 Suppose that the universe consists of the positive integers from 1 through 10. 4. 6.14 Questions for Chapter 1 (E) none of the above are true. Let A = {x : 0. S1. is given by Ω = {x : 0 ≤ x ≤ 2}.

(a) A ∩ Ac (b) A ∪ Ac (c) A ∩ Ø (d) A ∪ Ø (e) Ω ∩ Ø (f) Ω ∪ Ø (g) A ∩ Ω (h) A ∪ Ω 1. either prove your result using basic set relationships or disprove by providing a counterexample.Questions for Chapter 1 (d) Ac ∩ B 15 1.3 Which of the following relationships are true? In each case. B = set of all integers that are integral multiples of 3. C = set of all integers that are integral multiples of 4.4 Identify the set obtained by the following set operations. and J = set of all integers Describe the following sets: (a) A ∩ B (b) A ∪ B . where Ω is the universe and A is a subset of Ω. (a) (A ∪ B) ∩ (A ∪ C) = A ∪ (B ∩ C) (b) (A ∪ B) = (A ∩ B c ) ∪ B (c) Ac ∩ B = A ∪ B (d) (A ∪ B)c ∩ C c = Ac ∩ B c ∩ C c (e) (A ∩ B) ∩ (B c ∩ C) = Ø 1.5 Let A = set of all even integers.

the following holds: A ∪ (A ∩ B) = A ∩ (A ∪ B) = A.16 Questions for Chapter 1 (c) B ∩ C (d) A ∪ J (e) A ∪ C 1. i. is determined at the end of the day. The outcome of the experiment consists of the pair of numbers (X. 1.7 A box of N light bulbs has r (r ≤ N ) bulbs with broken ﬁlaments. The circuit is turned on before time r and turned off after time s (r < s). a machine produces three items.9 During the course of a day. at some time. 2. . Assume that X and Y are measured in hours on the continuous time axis with the beginning of the time period as the origin. X . defective or nondefective. (b) Suppose that the above bulbs are tested. each of whose quality. 3. Describe the sample space generated by a day’s production. Y ). 1.6 Prove the Law of Absorption. The circuit is on exactly three times as long as it is off. (b) Describe and sketch in the XY –plane the following events: 1. 1. until all defectives have been found. Describe the sample space for this experiment. until a defective bulb is found. The circuit is on at time z where z is some instant of time during the 24–hour period. Y .8 During a 24–hour period. Subsequently. 4. The circuit is on for less than one hour.. (a) Describe the sample space. at some future time. one by one. one by one.e. a switch is placed in the “ON” position. (still during the same 24– hour period) the switch is placed in the “OFF” position. (a) Describe the sample space for the above experiment. show that for any sets A and B . These bulbs are tested.

12 (†) (provided by Nishant Mishra) Provide a theorem that is true for Venn diagrams but false for an arbitrary universe Ω. If it fails. B and C be three events associated with an experiment.Questions for Chapter 1 17 1. 1. (b) Exactly one of the events occurs. Do not add other attributes of elements of R2 such as the distance between points. (d) At least one event occurs. (c) Exactly two of the events occurs. another probe is sent. This process continues until a successful probe is launched.10 A space probe is sent to Mars. 1. Describe the sample space for this experiment. Express the following verbal statements in set notation: (a) At least one of the events occurs. If the second probe fails. Your theorem should only use sets and basic set operations. . but no more than two of the events occur.11 Let A. another follows.

18 Questions for Chapter 1 .

4) can be turned 180 degrees to look like a domino painted (4.. L. 1975.4) dominoes and another producing (4.) Chung.1) dominoes.2 ENUMERATION Do not count your chickens before they are hatched. L. Of course.1). – AESOP.1) the dominoes by 180 degrees. So there is no need to have one assembly line producing (1.2 How many different dominoes does your factory actually produce? (Try this problem on your friends and see how many different answers you get. Dominoes are rectangular tiles used in some parlor games and they look like Figure 2. Figure 2.1: A single domino dominoes have each half of a tile painted with 1 to 9 dots (called “pips”) or a blank for a total of 10 possible choices.1. a domino painted (1. 2 Just have a person at the end of the assembly line rotating half of the (4. Springer-Verlag. K. New York. Elementary probability theory with stochastic processes. 1 19 . Typically. Chung1 So you think you know how to count? Consider the following problem: Suppose you are the chief engineer at the Ellicott Dominoes domino factory. Fables (600 BC) This section is based on Chapter 3 of a text by K.

The total number of possible ways to answer all of the questions on the exam is 2 · 2 · 2 · 2 · 2 = 25 = 32. Example: Suppose a engineering class has ﬁve women and four men.20 Sampling In this chapter. Since there are n possible ways to draw the ﬁrst ball. Sampling with replacement and with ordering. I. You would like to select a research team consisting of one man and one woman. The order of the balls is noted. Example: Suppose you give a quiz with 5 questions. In practice. two ways to answer the second question. 3. we will present a systematic way to approach counting problems. each with a possible answer of True or False. 2. Hence the sequence 1. There are m1 alternatives for the ﬁrst choice. m2 alternatives for the second. etc. and solve each type. there are two ways to answer the ﬁrst question.. 4 · 5 = 20 Sampling The Urn Model: Imagine an urn that contains n balls marked 1 to n. There are four disciplines under which the balls can be drawn from the urn and examined. A total of m balls are drawn sequentially. There are four choices for selecting the man and ﬁve choices for selecting the woman. n possible ways to draw the second. therefore. etc. . the total number of possible sequences is nm . The total number of alternatives for the whole set of choices is equal to (m1 )(m2 )(m3 ) · · · . 3 is distinguished from 3. Fundamental Rule The following rule forms the foundation for solving any counting problem: Multiplication Principle: Suppose there are several multiple choices to be made. 1. The total number of possible research teams is. After a ball is drawn. Therefore. 2. it is replaced in the urn and may be drawn again. We will classify counting problems into four categories. m3 for the third. most counting problems can be approached using one or a combination of these four basic types. and so on. 3.

After a ball is drawn. Here is another way to look at this problem: Consider the empty mailboxes of your three friends. then Sue) is arbitrary and does not affect the answer. it is not replaced in the urn and hence cannot be drawn again. The card for Mary’s must come from the remaining 4 cards. n − 2 possible ways to draw the third. and you have three friends (named Bill. The number of possible choices for Bill’s card is 5. ? ? ? B M S . 0! ≡ 1 so that when n = m. then Mary. and Sue) with birthdays this week. Computational Hint: We can compute n(n − 1)(n − 2) · · · (n − m + 1) = n! . etc. Mary. the total number of possible ways to draw n balls is n(n − 1)(n − 2) · · · (2)(1) ≡ n!. Hence the total number of ways you can assign ﬁve birthday cards to your three friends is 5 · 4 · 3 = 60.Sampling 21 II. You want to send one card to each friend. n − 1 possible ways to draw the second. we have n(n − 1)(n − 2) · · · (n − n + 1) = n! n! = = n !. Note that when m = n. A total of m balls are drawn sequentially with m ≤ n. Finally. Since there are n possible ways to draw the ﬁrst ball. It’s interesting to note that the sequence of friends (Bill. (n − m)! By deﬁnition. Sampling without replacement and with ordering. the total number of possible sequences is n(n − 1)(n − 2) · · · (n − m + 1). the number of possible choices for Sue’s card is 3. The order of the balls is noted. (n − n)! 0! Example: Suppose you have ﬁve different birthday cards. For the special case when m = n. this is equivalent to the problem of counting the number of ways you can arrange n different balls in a row..

. We can ﬁnd the value of x by considering Case II as a two-stage process: Stage 1 Grab m items from n without replacement and without ordering.e. A total of m balls (m ≤ n) are drawn from the urn.b. z.c. The balls are not replaced after each drawing and the order of the drawings is not observed. y.22 Sampling If the cards are labelled a. then one possible assignment is c d B M a S There are 5 ways to ﬁll the ﬁrst position (i.. we can compute the number of ways one can arrange the three letters x. the two stages implement Case II (sampling without replacement and with ordering). Sampling without replacement and without ordering. Hence. Example: Using the above example.e. ? ? ? 1 2 3 So a possible arrangement is y z x 1 2 3 and there are 3 · 2 · 1 = 3! = 6 possible assignments.d. let x denote the number of ways to choose m items from n items without replacement and without ordering. Bill’s mailbox). arrange the m items in order. Stage 1 (selecting the m balls) can be performed x ways. Combined. III. m This number is often referred to as the binomial coefﬁcient. To derive the above formula. Stage 2 Given the m items selected in Stage 1. 4 ways to ﬁll the second position and 3 ways to ﬁll the third. this case can be thought of as reaching into the urn and drawing m balls simultaneously. This time we have 3 positions to ﬁll with 3 different items. The total number of ways a subset of m items can be drawn from a set of n elements is (1) n! ≡ m!(n − m)! n .

we must have x · m! = n! (n − m)! In other words. To overcome this. m!(n − m)! (n − m)!m! n−m So. Computational Hint: Note that n m = n! n n! = = . Hence.Sampling 23 our unknown quantity. . 5 5 = 3 2 and 9 9 = . note that n m = n! n(n − 1)(n − 2) · · · (n − m + 1) = . that the entire process of Stage 1 followed by Stage 2 can be performed in n! (n−m)! ways. 10 10! 10 · 9 = = = 45 2 2!8! 2! and 100 100 100! 100 · 99 · 98 = = = = 161700. 97 3 3!97! 3! Example: A consumer survey asks: Select exactly two vegetables from the following list: ◦ peas ◦ carrots ◦ lima beans ◦ corn 4 2 The total number of ways you can select two items from the four is = 6. Hence. Stage 2 (ordering the m balls) can be performed m! ways (using Case II for m items drawn from m items). 4 5 Also note that if n is large. Furthermore. for example. we know. Solving for x then yields our result. [Stage 1] · [Stage 2] = [Case II]. using Case II (with m items drawn from n items). brute force computation of all three factorial terms n of m can be a challenge. m!(n − m)! m! with only m terms in both the numerator and the denominator.

Ball 2 was drawn once. 3. 3 we only need to know that Ball 1 was drawn once. 2 or 1. yet it has important applications. 3 or 2. 2. 1 √ 2 3 √ √√ √√√ √ √√√√ || 1 2 2 2 3 2 3 3 Coded Representation √ √ √√ | | √√√ √ | | So. 2. 2.24 Sampling IV. for example. and Ball 3 was drawn twice. 2 is the same as drawing 1. The number of ways that m balls can be drawn from an urn containing n balls in this fashion is m+n−1 . 3. . drawing 3. the arrangements 2. Therefore. 1. 2. First. 3. For example. would represent 3. Sampling with replacement and without ordering. This arrangement can be represented as follows: 1 √ 2 √ 3 √√ 1 2 3 3 For another example using this scheme. 3. Each ball is replaced after being drawn. note that to describe a possible outcome from this scenario. 3. However. For example. 1. to describe 3. the sequence of the draws is ignored. 2 and their counterparts would be represented as 1 2 2 2 3 2 √√√ 3 √ Now notice that any of these outcomes can be represented by only specifying √ the arrangement of the four check marks ( ) among the inner two vertical lines (|). This case is perhaps the most difﬁcult to perceive as a sampling discipline. we only need to know how many times each ball was drawn. 3. m Let’s prove that this formula works. 3. 2. 3. 3. A total of m balls are drawn sequentially from the urn.

our table has three columns with twelve check marks. In general. there are 15 4 possible ways we can select 4 balls from an urn containing 3 balls if we allow replacement and disregard ordering. In this case. we are arranging 14 symbols in a row: . our table will have n columns. we will need m check marks √ ( ). In this case. 1 √√ 2 √ ··· ··· n √ k1 k2 · · · kn Coded Representation √√ √ √ | |···| Therefore. note that our coded representation has six positions ? ? ? ? ? ? √ Any four of these can be selected for the check marks ( ) and the remaining two will be ﬁlled with the two vertical lines (|). The baker will allow her to select any number of each type to make a dozen. So we need (n − 1) + m positions for both check marks and vertical lines ? ? ? ··· ? (n − 1) + m positions The number of ways we can select m positions for check marks from (n − 1) + m positions is n+m−1 m Example: Mary wants to buy a dozen cookies. Therefore. chocolate chip cookies and sugar cookies. if there are n balls and m drawings. She likes peanut butter cookies. 3 chocolate chip and 7 sugar cookies.Sampling 25 To count the total number of possible outcomes. The number of ways we can select 4 positions from 6 positions is 6 = 15. our coded representation will have m check marks and n − 1 vertical lines. To represent all possible outcomes. So the following arrangement: Peanut Butter √√ Chocolate Chip √√√ Sugar √√√√√√√ Coded Representation √√ √√√ √√√√√√√ | | would represent the outcome when Mary selects 2 peanut butter.

3!2!2! Occupancy The Token Model: In sampling problems. . A direct analogy can be made with so-called occupancy problems. we are given n boxes marked 1 to n. In such models. Note that when k = 2. n2 of them painted with color number 2. and m tokens numbered 1 to m. Example: Consider the seven characters. the number of distinguishable ways in which the tokens may occupy the boxes is computed in precisely the same manner as in the sampling models. . A. this situation is the same as case III. C.26 Occupancy 2 vertical lines and 12 check marks. we think of n balls placed in an urn from which m balls are drawn.. B. n1 !n2 ! · · · nk ! This quantity is a generalization of the binomial coefﬁcient and is called the multinomial coefﬁcient. The number of distinct arrangements of these seven letters in a row is 7! = 210. . B. We have k ni = n. = = 12 2·1 12 2 Permutation of n items distinguished by groups An extension of case III is the situation where one has n balls with n1 of them painted with color number 1. C . Nevertheless. the number of distinguishable arrangements is given by n! . . So the number of possible different boxes of one dozen cookies is 12 + 3 − 1 14 14 14 · 13 = = 91.nk of them painted with color number k . A. There are four disciplines for placing the tokens in the boxes. i=1 If the balls are arranged in a row. A. The reason for introducing occupancy problems becomes clear as one encounters different counting problems. Some may be easier to conceptualize as occupancy problems rather than sampling problems. Each of these four scenarios correspond to one of the four sampling cases.

Occupancy

27

The connection between the two formulations is easy to remember if you note the following relationships: Sampling Problems ball number of drawing j th draw yields ball k Occupancy Problems box number on token j th token is placed in box k

For example, suppose we have a sampling problem with 3 balls (labelled a, b, c) being drawn 4 times. This would correspond to an occupancy problem with 3 boxes and 4 tokens (See Figure 2.2). Moreover, the following outcomes from each

Tokens 1 2 3 4

Tokens 1 2 3 4

2

1

3

4

a

b

c

a

b

c

(a) Token labels observed

(b) Token labels not observed

Figure 2.2: The occupancy problem model would be equivalent: Sampling Drawing Ball 1 b 2 a 3 b 4 c Occupancy Box Token(s) a 2 b 1,3 c 4

Sampling cases I to IV can now be viewed as

28

Occupancy

I*. Each box may contain any number of tokens and the labels on the tokens are observed. II*. No box may contain more than one token and the labels on the tokens are observed. III*. No box may contain more than one token and the labels on the tokens are not observed. IV*. Each box may contain any number of tokens and the labels on the tokens are not observed. The scheme with check marks and vertical bars that we used for Case IV is now readily apparent. Each column of the table represents a box, and each check mark represents a token. Example: Suppose a school has ﬁve classrooms (with different room numbers) and three teachers, Ms. Smith, Mr. Thomas and Ms. Warren. For some courses, more than one teacher might be assigned to the same classroom. I*. If more than one teacher can be assigned to a classroom, and it matters which teacher teaches in a particular room, then the number of ways you can assign teachers to classrooms is
nm = 53 = 125.

II*. If only one teacher can be assigned to a classroom, and it matters which teacher teaches in a particular room, then the number of ways you can assign teachers to classrooms is
n(n − 1) · · · (n − m + 1) = 5 · 4 · 3· = 60.

III*. If only one teacher can be assigned to a classroom, and it does not matter which teacher teaches in a particular room, then the number of ways you can assign teachers to classrooms is
n m =

5 = 10. 3

IV*. If more than one teacher can be assigned to a classroom, and it does not matter which teacher teaches in a particular room, then the number of ways

Self-Test Exercises for Chapter 2

29

you can assign teachers to classrooms is
m+n−1 m =

3+5−1 = 35. 3

Self-Test Exercises for Chapter 2
For each of the following multiple-choice questions, choose the best response among those provided. Answers can be found in Appendix B. S2.1 Bill’s Donut Shop bakes 10 varieties of doughnuts. How many different boxes of 12 doughnuts can a customer select from the 10 varieties? Note: Assume that Bill’s Shop has at least 12 of each type of donut. (A) 12! (B)
21 12

(C) 10! (D) 1210 (E) none of the above. S2.2 A teacher has a class of 30 students. Each day, she selects a team of 6 students to present their homework solutions to the rest of the class. How many different teams of 6 students can she select? (A)
30 6

(B) 306 (C) 30!/24! (D) 630 (E) none of the above S2.3 A box is partitioned into 3 numbered cells. Two indistinguishable balls are dropped into the box. Each cell may contain at most one ball. The total number of distinguishable arrangements of the balls in the numbered cells is (A) 1 (B) 3 (C) 6

30

Self-Test Exercises for Chapter 2

(D) 9 (E) none of the above. S2.4 A professor wants to use three colors (red, green and blue) to draw an equilateral triangle on a sheet of paper. Each edge of the triangle is a single color, but any color can be used more than once. How many different triangles can be drawn? (A) 33 (B) 3! (C) (D)
5 3 5 3

+1

(E) none of the above. S2.5 A professor has an equilateral triangle cut from a sheet of cardboard. He wants to use three colors (red, green and blue) to color the edges. Each edge of the triangle is a single color, but any color can be used more than once. How many different triangles can be produced? (A) 33 (B) 3! (C) (D)
5 3 5 3

+1

(E) none of the above. S2.6 Two white balls are dropped randomly into four labelled boxes, one at a time. A box may contain more than one ball. The number of distinguishable arrangements of the balls in the boxes is (A) (B)
4 2 7 2

(C) 4 (D) 42 (E) none of the above.

Self-Test Exercises for Chapter 2

31

S2.7 One white ball and one black ball are dropped randomly into four labelled boxes, one at a time. A box may contain more than one ball. The number of distinguishable arrangements of the balls in the boxes is (A) (B)
4 2 7 2

(C) 4 (D) 42 (E) none of the above. S2.8 A class has ten students. The teacher must select three students to form a committee. How many different possible committees can be formed? (A) 103 (B) 310 (C) 10!/7! (D)
10 3

(E) none of the above S2.9 A student club has ten members. The club must elect a president, vicepresident and treasurer from the membership. How many ways can these three positions be ﬁlled? (A) 103 (B) 310 (C) 10!/7! (D)
10 3

(E) none of the above S2.10 A candy company is introducing a new, bite-sized, chocolate candy called "XYZ’s". Each candy is covered with a sugar coat that is colored a single color: either red, green, or blue. Also each candy is marked with a single letter: either “X,” “Y” or “Z’. How many different possible candies are there?

Their menu says that their “Big Dish” contains ﬁve scoops of ice cream. how many ways can 4 workers be assigned to the machines? (A) 42 (B) 4! (C) 7 4 (D) 44 (E) none of the above Questions for Chapter 2 2. You may order the ﬁve scoops in any way.11 An ice cream shop offers 20 different ﬂavors of ice cream.12 There are 4 machines in a factory. How many different pieces can be made using the numbers 1. S2. 2. . . the 52 cards are partitioned into four equal groups. How many different “Big Dishes” are possible? (A) 205 (B) (C) 20 5 24 5 (D) 20!/5! (E) none of the above S2. . How many different playing situations are there? . n? 2.1 A domino piece is marked by two numbers. The pieces are symmetrical so that the number pair is not ordered. .2 At a bridge table. ranging from ﬁve different ﬂavors to ﬁve scoops of the same ﬂavor. If each is worker assigned to one machine.32 Questions for Chapter 2 (A) 9 (B) 33 (C) 9 3 (D) 3! (E) none of the above.

How many different ways could a student answer the entire test? 2. If the balls are indistinguishable. Strawberry (30). How many different ways can the ballot be marked? 2. A cell may contain more than one ball.5 From eight persons.3 How many ways can a poker hand of 5 cards be drawn from a 52 card deck so that each card is a different number or face (i.6 A ballot lists 10 candidates.e. list all possible arrangements. How many distinguishable results of the throw are there? 2. how many committees of three members may be chosen? Suppose that among the eight persons. how many different ways can you mark your ballot? 2.9 (provided by Mark Chew) Jack Dukey enters the Dip-’n-Donut Shop to buy a dozen doughnuts for himself and his co-workers in the Parks Department. You may vote for any three of them. different. According to the “bullet” voting system.Questions for Chapter 2 33 2. Two balls are dropped into the box. a voter must place 4 check marks on the ballot.10 Consider a box partitioned into 3 cells.7 An election ballot lists 10 candidates. (a) How many different ways can Jack order a dozen doughnuts? Assume that ordering 6 Bavarian Creme and 6 Peanut Krunch is the same as ordering 6 Peanut Krunch and 6 Bavarian Creme. exactly four are women.8 A multiple choice test contains 10 questions with each question having 4 possible answers. in such a way that every cell has an equal chance of receiving a ball. Peanut Krunch (11) and Powdered Jelly (15). (b) If Jack wants 2 Peanut Krunch for himself. Each voter is allowed 4 votes. how many different ways can Jack order a dozen doughnuts? 2. He notices that they have the following inventory of his favorite doughnuts: Bavarian Creme (21). and his boss in the Parks Department wants 3 Bavarian Creme for his coffee break. ignoring suits)? 2. and may assign more than one check mark to any candidate(s) up to a total of four marks. each independently of the other..4 Throw three indistinguishable dice. If you vote for exactly three candidates. How many three-person committees can be chosen so that at least one member is female? 2. .

A. How many ways can the 4 girls be seated among the 12 boys if the students are distinguished only by sex? 2. .13 (See also Question 2. Each bus can hold 50 riders.14 Prove the binomial theorem: For any real numbers x and y and integer n ≥ 1 (x + y)n = n k=0 n k n−k x y k 2. .11 A classroom in the Bell Elementary School contains 16 seats with four rows of four seats each.) A ballot lists 10 candidates.15 You are given n balls of different colors.6. You must pick three candidates and rank them according to your preference. 2.34 Questions for Chapter 2 Reanswer the question if no cell can have more than one ball. How many different ways can you mark your ballot? 2. B . or C with A being your most preferred candidate.12 Fifty students have rented 3 busses for a class trip. (a) arrange all of them in a row? (b) choose m of them simultaneously? (c) divide them (perhaps unequally) between two identical urns? (d) divide them (perhaps unequally) between two children (and each child cares about the colors he or she receives)? . How many distinguishable ways can you. A class is composed of 4 girls and 12 boys. How many different ways can the students sign up for busses? 2.

we showed how you can use sample spaces Ω and events A ⊆ Ω to describe an experiment. (for example. For inﬁnite sample spaces. P (·). deﬁned on subsets of Ω. Deﬁnition 3. In those cases. Reﬂection (c 1880) Probability Spaces In Chapter 1. F. A probability measure. that does not necessarily include every possible subset of Ω.2. – THOMAS HENRY HUXLEY. Ω = R) we often cannot reasonably compute P (A) for all A ⊆ Ω.3 PROBABILITY The scientiﬁc imagination always restrains itself within the limits of probability. P (A) and the mathematics necessary to consistently compute P (A) for all events A ⊆ Ω. The sample space.3. is a function. Ω.1. The mathematical details are usually covered in advanced courses in probability theory. we will deﬁne what we mean by the probability of an event. we restrict our deﬁnition of P (A) to some smaller subset of events. Ω. is the set of all possible outcomes of an experiment. that assigns to each event E a real number P (E) such that the following three probability axioms hold: 35 . In this chapter. How would you measure a set? • ﬁnite sets • countable sets • uncountable sets Deﬁnition 3. An event is any subset of the sample space. Measuring sets Deﬁnition 3.

That is P : F → [0. Such functions assign each number in R a single number from R. . 1]. Here. measures. However. P (·). the set of events. 1]. That’s why we refer to P (·) as a probability measure. n .. we will show that 0 ≤ P (A) ≤ 1 for every A ⊆ Ω.3? Logical probability This can be done when Ω is a ﬁnite set Ω = {ω1 .36 Probability Spaces 1. . That is. If E ∩ F = Ø (i. That is P ({ωi }) = 1 n for i = 1. Interpretation of probability How do you assign the real numbers in Deﬁnition 3. ω2 .e. ωn } and each of the outcomes in Ω is equally likely. In mathematical terms. f (x) = x3 . a great deal of probability theory has its foundations in a mathematical ﬁeld called measure theory with important results from many mathematicians. . a single number from the interval [0. the function. if the events E and F are mutually exclusive ). . Henri Lebesgue and Eugene Dynkin. is a function whose argument is a subset of Ω (namely an event) rather than a number. . . Note that every event A ⊆ Ω is assigned one and only one number P (A). In mathematics we call functions that assign real numbers to subsets. then P (E ∪ F ) = P (E) + P (F ) The three speciﬁcations in the above deﬁnition are often called the Kolmogorov axioms for probability measures. . P (Ω) = 1 3. For example. . In calculus. In section 3. P (E) ≥ 0 for all events E 2. you worked with real-valued functions f (·). f : R → R. including Emile Borel. maps x = 2 into f (2) = 8. P (·) is a function that assigns to each member of F.

2. in general. and there is no reason for one card to be drawn rather than another. . . In that case Ω = {ω1 . So we can assign. . As you can see. . P ({ωi }) = 1 6 for i = 1. ω6 } where ωi = the die shows i pips for i = 1. We also must be able to assume that the outcome of any one of the trials does not affect the probabilities associated with the outcomes of any of the other trials. consider the outcome of rolling a fair six-sided die once. Experimental probability This interpretation requires that the experiment can be identically reproduced over and over again. . 6 3 6 P ({The die is even}) = P ({ω2 . . . . . suppose we .Probability Spaces In that case. rolling a fair die The die is symmetrical and there is no reason for one side to turn up rather than another. . . for example. drawing a card randomly from a deck of playing cards Each card is physically the same. Some examples where logical probability can be used are: P (A) = and. ω4 . Each replicate is called an experimental trial. indeﬁnitely. 6. ω2 . ω6 }) = |A| for any A ⊆ Ω 6 Logical probability is only useful when the mechanism producing the random outcomes has symmetry so that one could argue that there is no reason for one outcome to occur rather than other. for any A ⊆ Ω assign P (A) = |A| n 37 For example. As an example. . the notion of logical probability has limited applications. tossing a fair coin The coin is symmetrical and there is no reason for heads to turn up rather than tails.

it can be proven that for a prespeciﬁed small number δ > 0. and quite far away from the n eventual limit of 1 as n gets bigger than billions. however. It is possible. . T2 . random and cannot be predetermined. i=1 would be zero. with each transistor an experimental trial. although not likely. and assign P (A) = lim n→∞ n i=1 Ti n There are some important mathematical issues associated with deﬁning P (A) in this way. the type of limit that you deﬁned in calculus cannot be used. consider the inﬁnite sequence of experimental outcomes T 1 . That concept is not suitable here. In that case. that P (A) equaled 2 . We can. such that for all n > K the quantity n i=1 Ti n was arbitrarily close to P (A). good}. there does exist some integer K such that for all n > K n i=1 Ti Ti n − P (A) < δ. 2 In fact. . with Ti = 0 Ti = 1 if transistor i is bad if transistor i is good In this case. To assign P (A) using logical probability. The proof requires measure theory. for instance. . determined in advance. Let A = {good}. say that P (K < ∞) = 1. Some examples where experimental probability can be used are: . The outcome of one of our experimental trials is whether that transistor is either good or bad. itself. In calculus. 1 Suppose.38 Probability Spaces examine transistors being produced by an assembly line. there always existed some K . Ω = {bad. and is typically covered in a graduate-level probability course. that you could examine billions of transistors and ﬁnd that all of them n are bad. The value of K is. In particular. Consider the sequence of outcomes for the repeated experiment.

with the outcome being the grade that you get in the course. The decision maker is asked to place in the urn any number of red balls and white balls he or she desires. subjective probability has a lot of critics. For example. is to construct an equivalent lottery consisting of an urn. Once the decision maker has decided how many balls of each type to place in the urn. we simply count the balls in the urn and assign P (A) = number of red balls in the urn total number of balls in the urn Not surprisingly. but it is impossible to argue that the experiment can be replicated. auctions Stating the probability that painting by Monet will sell for one million dollars at an art auction. football Specifying the probability that the Buffalo Bills will win the Super Bowl next year. consider the experiment of taking EAS305. interarrival times for queues Stating the probability that the time between any two consecutive customers is greater than ﬁve minutes. There is a mathematical theory (complete with axioms and deductions) regarding the existence of such equivalent lotteries. Let A = {you get an “A” in EAS305} The procedure of determining the subjective probability of A. Some examples where subjective probability can be used are: weather Specifying the probability that it will be warm and sunny tomorrow. . with the goal of making the event of drawing a red ball equivalent to getting an “A” in EAS305. But sometimes it is the only way we can give meaning to the number P (A). and an unlimited quantity of red and white balls. hitting a home run Specifying the probability that a baseball player will hit a home run on any attempt.Probability Spaces 39 quality assurance Specifying the probability that a single microprocessor from a semiconductor fabrication facility will be defective. Subjective probability There are some cases where we would like to use probability theory.

A ⊆ B ⇒ P (A) ≤ P (B) Proof.31 A ∩ (B ∩ Ac ) = Ø since P (B ∩ Ac ) ≥ 0 (Axiom 1) Example: In a class of 100 students it is known that 80 of the students like hamburger. A∪B (A ∪ B) ∩ (A ∪ Ac ) A ∪ (B ∩ Ac ) P (A ∪ (B ∩ Ac )) P (A) + P (B ∩ Ac ) P (A) = = = = = ≤ B B B P (B) P (B) P (B) since A ⊆ B since A ∪ Ac = Ω DeMorgan’s law See Question 3.Probability Spaces 41 Relationships between English and set notation • P (not E) = P (E c ) • P (E or F ) = P (E ∪ F ) • P (E and F ) = P (E ∩ F ) Deductions from the axioms Theorem 3.1. P (A) ≤ 1 for all A ⊆ Ω Proof. Let A be the event that the selected student is one of those who likes . P (Ac ) = 1 − P (A) Proof. P (Ω) P (A ∪ Ac ) P (A) + P (Ac ) P (Ac ) = = = = 1 1 1 1 − P (A) from Axiom 2 since Ω = A ∪ Ac since A ∩ Ac = Ø Theorem 3.2.3. Consider the experiment of picking a student at random from the class of 100.1 since P (Ac ) ≥ 0 (Axiom 1) Theorem 3. P (A) = 1 − P (Ac ) P (A) ≤ 1 from Theorem 3.

Let B be the event that the selected student is one of those who likes broccoli. Suppose |B| is unknown. Assume it’s true for n = k .80. A2 . 3 Even if |B| is unknown we know that P (A ∩ B) ≤ 0. Solution: A ∩ B is the event that the student selected likes hamburger and broccoli.4. . ω ∈ A. is it true for n = k + 1? Assume for k pairwise disjoint sets A1 .80.42 Probability Spaces hamburger. Since A ∩ B ⊆ A. A ∩ B ⊆ A. That implies that ω ∈ A and ω ∈ B . Why? Solution: For any events E and F . 3 Show that A ∩ B ⊆ A. In this case |A| = 80 and |Ω| = 100 and we can use logical probability to determine P (A) = 80/100 = 0. Ak we have P (A1 ∪ · · · ∪ Ak ) = P (A1 ) + · · · + P (Ak ) . . For any ﬁnite number of pairwise disjoint sets A1 . A2 . An we have P (A1 ∪ A2 ∪ · · · ∪ An ) = P (A1 ) + P (A2 ) + · · · + P (An ) Proof. Therefore. Solution: Pick an element ω ∈ A ∩ B . . . . that’s Axiom 3. . E ⊆ F implies P (E) ≤ P (F ). (By induction) Is it true for n = 2? Yes. .80. 3 Describe the set A ∩ B . we have P (A ∩ B) ≤ P (A) = 0. Hence. . Theorem 3.

. . a structural defect A single piston may have one. . Then (A1 ∪ · · · ∪ Ak ) ∩ Ak+1 = Ø since Ai ∩ Ak+1 = Ø for all i P ((A1 ∪ · · · ∪ Ak ) ∪ Ak+1 ) = P (A1 ∪ · · · ∪ Ak ) + P (Ak+1 ) since (A1 ∪ · · · ∪ Ak ) ∩ Ak+1 = Ø P ((A1 ∪ · · · ∪ Ak ) ∪ Ak+1 ) = (P (A1 ) + . k . From quality control records we see that • The probability that a piston has a surface defect is 0. . + P (Ak )) + P (Ak+1 ) from induction hypothesis P (A1 ∪ · · · ∪ Ak ∪ Ak+1 ) = P (A1 ) + . . and/or 2. . A ∪ B = A ∪ (Ac ∩ B) Distributive law for sets and B = (A ∩ B) ∪ (Ac ∩ B) Distributive law for sets Therefore P (A ∪ B) = P (A) + P (Ac ∩ B) from Axiom 3 P (B) = P (A ∩ B) + P (Ac ∩ B) from Axiom 3 Subtracting the second equation from the ﬁrst produces the result.5.5.Probability Spaces 43 Let Ak+1 be an event such that Ai ∩ Ak+1 = Ø for all i = 1. For any two events A and B P (A ∪ B) = P (A) + P (B) − P (A ∩ B) Proof. . Example: A hydraulic piston can have only two types of defects: 1. . . + P (Ak+1 ) Associative law Theorem 3. both or neither defect. a surface defect.

5.4 + 0. P (Ø) P (Ø) P (Ø) P (Ø) = = = = 1 − P (Øc ) 1 − P (Ω) 1−1 0 from Theorem 3.6 = 0.1 since Øc = Ω from Axiom 2 In order to allow the extension of Axiom (3) in Deﬁnition 3.6 To ﬁnd the probability that a piston has both a structural and a surface defect. We know that P (S) = 0. is a countable sequence of disjoint sets.6 = 0.5 P (T ) = 0.5 − 0. ﬁrst deﬁne the following events: S = {a piston has a surface defect} T = {a piston has a structural defect} We want to ﬁnd P (S ∩ T ). P (S ∪ T ) = P (S) + P (T ) − P (S ∩ T ) 0. • The probability that a piston has some type of defect is 0. .4. the third probability axiom is replaced by 3. using Theorem 3. .3 to apply to a countable collection of disjoint sets (rather than just a ﬁnite collection). then ∞ ∞ P i=1 Ai = i=1 P (Ai ) . P (Ø) = 0 Proof.3 Theorem 3.6 Therefore.6.44 Probability Spaces • The probability that a piston has a structural defect is 0.5 − P (S ∩ T ) P (S ∩ T ) = 0.4 P (S ∪ T ) = 0.* If A1 . A2 .4 + 0. .

2. . 1. we get ∞ ∞ P i=0 ∞ Ei Ei i=1 ∞ = i=0 P (Ei ) = 1 P E0 ∪ = 1 = 1 P (E0 ) + P i=1 ∞ Ei Ei i=1 Since E0 ∩ ∪∞ Ei = Ø i=1 P = 1 − P (E0 ) = 1 − e−1 THE BIRTHDAY PROBLEM Suppose k randomly selected individuals are seated in a room. i=1 Since ∞ αi i=0 i! = eα for any α. 1. That is.* and the fact that Ei ∩ Ej = Ø for any i = j . . . 2. Let Ei be the event that exactly i accidents occur in a month with i = 0. we will make two assumptions: . . What is the probability that at least two of them will have the same birthday? How large must k be so that the probability is 0. Suppose that P (Ei ) = e−1 i! for i = 0..The Birthday Problem 45 Example: Consider the experiment of counting the number of accidents in a factory in a month. . and we want to ﬁnd the probability that the factory has at least one accident in a month. . we want to compute P ∪∞ Ei . we have ∞ ∞ P (Ei ) = i=0 i=0 e−1 i! ∞ i=0 = e −1 1 i! = e−1 e1 = 1 Using axiom 3.50 that at least two of them will have the same birthday? To solve this problem.

} The assignment of birthdays with ordered individuals can be represented as a sampling problem.Jan01). An introduction to probability and its applications. To compute P (F ). . there are (365)k corresponding permutations of their k birthdays. d2 .Jan04). Larsen and Morris Marx. . (Jan01. To illustrate the issues. But P (E) = 1 − P (F ) and it turns out that P (F ) is easier to compute. each person is assigned a birthday with any of the n = 365 days being equally likely. d3 ).Jan02). where d1 .Jan01. Deﬁne the events E = {at least 2 birthdays are the same} F = {all of the birthdays are different} and notice that E = F c . we’ll ﬁrst solve the problem for k = 3 and n = 365. Let’s suppose the individuals are ordered. we’ll extend our solution for any k . For just one example of many. there are n = 365 days in a year 2. We can begin to write down all of the elements of Ω as follows: Ω = {(Jan01. Later. we’ll use an approach that is common for most problems involving logical probability: Step 1 Count the number of equally likely outcomes in Ω. The number of ways we can draw three balls from 365 with replacement and with ordering is (365)3 . (Jan01. Some textbooks1 say something like “If we imagine the k people to be ordered.Jan01. that is until you begin identifying and counting equally likely elements in Ω. It sounds simple.Jan03). 1985. Then every element of the sample space can be written as an ordered triple ω = (d1 .Jan01.” Why ordered? Why not? Suppose k = 3. (Jan01. see Richard J.Jan01. Prentice-Hall. and any n. Step 2 Count the number of elements in F ⊆ Ω. The question asks us to compute P (E). . 1 .46 The Birthday Problem 1. d2 and d3 are the birthdays of the three individuals. Step 3 Divide the number obtained in Step 2 by the number in Step 1.

But why would we get the wrong answer if we don’t order the k individuals? Let’s see what this would mean for k = 3. .507 0.891 0. “how many ways can we arrange 3 check marks among the 365 columns?” Jan01 √√√ √√ √√ .” the following table lists P (E) for several different values of k with n = 365: k 15 22 23 40 50 70 P (E) 0.. So for k = 3 we get P (E) = 1 − P (F ) = 1 − (365)!/(365 − 3)! (365)3 This turns out to be the correct answer. the number of ways that we can assign birthdays to individuals with everyone getting a different birthday is equal to the number of ways we can draw three balls from 365 without replacement and with ordering.. . Furthermore. We can do this by creating a table with 365 columns (one for each calendar day) and asking the question.The Birthday Problem 47 Continuing with this reasoning.970 0.999 and we see that we need at least 23 persons to insure that P (E) is at least 0. if we perform the same computations for any n and k ≤ n.475 0. And that number is (365)!/(365 − 3)!. the general correct answer is P (E) = 1 − P (F ) = 1 − (n)!/(n − k)! (n)k To answer the question “how large must k be?.253 0. Identifying the elements of Ω without ordering the individuals is equivalent to listing all of the ways we can draw 3 balls from an urn containing 365 balls.50. etc. Jan02 √ √ Jan03 . with replacement and without ordering. .

assume that birthdays are randomly assigned. Jan02. We can again compute P (E). let’s use all of our formulas to solve a slightly modiﬁed version of the birthday problem: The birthday problem on the planet Yak There are two aliens from outer space in a room. They are from a planet called Yak. etc. So we can use our formulas with values n = 2 and k = 2.. that is Ω = {HH. etc. Once again. the total number of elements in Ω is n+k−1 k = (n + k − 1)! k !(n − 1)! and the number of elements in F is n k = n! k !(n − k)! (n)!(n − 1)! (n + k − 1)!(n − k)! So that P (E) = 1 − P (F ) = 1 − To show why the unordered individual assumption is wrong. There is nothing wrong with identifying the elements of the sample space Ω in this way. Yak has only two days in its calendar year. In this case. T T }. and P (E) = 1 − 1 (2)!/(2 − 2)! = (2)2 2 . with each day equally likely. The names of the days are “Heads” and “Tails”. using the unordered individual assumption. But we would wrong to assume that such elements are equally likely. Under the ordered individual model there are four elements in Ω. HT. Jan02.. each element of Ω is distinguishable by the number of times someone in the group is assigned the birthday Jan01. but this time. Assigning birthdays at random to the aliens is the same as each alien tossing a fair coin to determine it’s birthday. not who in the group is assigned Jan01.48 The Birthday Problem Using this approach. T H.

des arts and des m´ tiers. 3) (4. 3) (3. “two heads”. Deﬁne the events A = {ﬁrst roll is less than or equal to the second roll} B = {the sum of the two rolls is 4} Jean d’Alembert Encyclop´ die ou dictionnaire raisonn´ des sciences. 4) (3. 2) (1. 2) (4.3 Conditional Probability and Independence Conditional probability Example: Roll a die twice. Ω = {2H. 4) (4. repeated random assignments. 6) (6. 3) (6. 4) (6. 1754. 4) (1. 1) (1.” And the corresponding computation for P (E) yields P (E) = 1 − P (F ) = 1 − 2 (2)!(2 − 1)! = (2 + 2 − 1)!(2 − 2)! 3 And we see that the second (and faulty) formulation results in the same mistake that Jean d‘Alembert2 made in 1754. 5) (4. 2T. 5) (1. 1) (5. 5) (5. 3 For an important exception. see the work of Bose and Einstein in statistical mechanics. 2) (6. 1) (6. 1) (3. Moral of the story: Sampling Case IV (with replacement and without ordering) usually cannot be used to enumerate equally likely outcomes resulting from multiple. 1) (4. 5) (6. 6) (2. 3) (5. 3) (2. 6) Figure 3.1: Sample Space for Rolling a Die Twice Under the unordered individual model there are three elements in Ω. 6) (3. 1H1T} that is. 2) (5. “two tails” and “one head one tail. 2) (2. 6) (5. 5) (2. 4) (2. e e e Volume 4. 2) (3. 2 . The sample space can be represented as in Figure 3. 5) (3. 6) (4. 3) (1. 4) (5.1.Conditional Probability and Independence 49 (1. 1) (2.

4. An P (A1 A2 · · · An ) = P (A1 )P (A2 | A1 )P (A3 | A1 A2 ) · · · P (An | A1 A2 · · · An−1 ) P (· | B) is a probability measure Remember that P (·) is a probability measure if and only if 1. denoted P (A | B). . . With this additional information. Suppose you are told that the ﬁrst roll is less than or equal to the second. what is the probability that the two rolls sum to 4? Deﬁnition 3.7. P (Ω) = 1 3. . . The probability of an event A given the event B . A2 . For any events A1 .50 Conditional Probability and Independence Since each of the outcomes is equally likely. we can use logical probability to get P (B) = 1/12. Theorem 3. P (E) ≥ 0 for all events E 2. is the value that solves the equation P (A ∩ B) = P (A | B)P (B) Note: If P (B) > 0. If E ∩ F = Ø then P (E ∪ F ) = P (E) + P (F ) Suppose P (B) > 0 and consider P (· | B) • For any event E ⊆ Ω P (E | B) = = P (E ∩ B) P (B) [something ≥ 0] ≥0 [something > 0] Therefore P (E | B) ≥ 0 for all events E . this is equivalent to P (A | B) = P (A ∩ B) P (B) Notation: Sometimes we write P (A ∩ B) = P (AB).

• We can think of P (· | B) as the updated version of P (·) given B . . every event A ⊆ Ω produces an updated probability measure P (· | A). • When interpreting P (E | A) we are asking the “what if” question.” • We can evaluate and use P (E | A) and P (E | B) simultaneously for different events. “what is the probability of E if the event A occurs. A and B . Independence Deﬁnition 3. • In fact.Conditional Probability and Independence 51 • Also. we have P (Ω | B) = = P (Ω ∩ B) P (B) P (B) =1 P (B) Therefore P (Ω | B) = 1 • And for any two events E and F with E ∩ F = Ø P (E ∪ F | B) = = = = = P ((E ∪ F ) ∩ B) P (B) P ((E ∩ B) ∪ (F ∩ B) P (B) P (E ∩ B) + P (F ∩ B) P (B) P (E ∩ B) P (F ∩ B) + P (B) P (B) P (E | B) + P (F | B) Therefore if E ∩ F = Ø then P (E ∪ F | B) = P (E | B) + P (F | B) • Hence P (· | B) is a probability measure.5. The events A and B are independent if and only if P (A ∩ B) = P (A)P (B).

Bn then n n P (A) = i=1 P (A ∩ Bi ) = i=1 P (A | Bi )P (Bi ) . . Then P (A ∩ (B ∪ B c )) = P (A) P ((A ∩ B) ∪ (A ∩ B c )) = P (A) P (A ∩ B) + P (A ∩ B c ) = P (A) P (A)P (B) + P (A ∩ B c ) = P (A) P (A ∩ B c ) = P (A) − P (A)P (B) P (A ∩ B c ) = P (A)(1 − P (B)) P (A ∩ B c ) = P (A)P (B c ) Therefore A and B c are independent events. Proof.6. B1 ∪ B2 ∪ · · · ∪ Bn = Ω. . . B2 .52 Conditional Probability and Independence Note: If P (B) > 0 this is equivalent to saying that A and B are independent if and only if P (A | B) = P (A).9. Bayes’ theorem Deﬁnition 3.8. . (Law of Total Probability) Given an event A and a complete set of alternatives B1 . B2 . and • Ac and B c are independent events. Theorem 3. . Bi ∩ Bj = Ø for any i = j . Suppose A and B are independent events. The other two results follow by interchanging the roles of A and B . Theorem 3. . • Ac and B are independent events. A complete set of alternatives is a collection of events B1 . and 2. If A and B are independent events then • A and B c are independent events. . . Bn such that 1.

1965. (Bayes’ Theorem) Let B1 . 4 . 1 We can the use the law of total probability. A concrete stairway divides the garden into two sections. The proper solution is obtained by writing P (A1 ∩ A2 ) = P (A2 | A1 )P (A1 ) = 1 5 · 2 6 = 1 15 . two are shorter than a speciﬁed length. P. the value of P (A2 ) is not 5 . the snails bravely travel from one section of the garden to the other by crawling across the steps of the concrete stairway. we note that {A1 . Ac } is a complete set of alternatives. It is a treacherous journey. But the value of 1 P (A2 | A1 ) is 5 . . The common but incorrect solution is obtained by writing P (A1 ∩ A2 ) = P (A2 )P (A1 ) = 1 5 · 2 6 = 1 15 . B2 . 1 Although the answer is correct. Addison-Wesley. 3 Theorem 3. what is the probability that the two short bolts are picked? Let Ai = {the ith chosen bolt is short}. To evaluate P (A2 ) properly. When it rains. .10. Many of the snails are crushed by the feet of unthinking students and faculty walking up and down the stairway. .Conditional Probability and Independence 53 Example: (from Meyer4 ) Suppose that among six bolts.. . then P (A | Bk )P (Bk ) P (Bk | A) = n i=1 P (A | Bi )P (Bi ) Example: Thousands of little snails live in the Snail Garden between the University at Buffalo Natural Sciences Complex and Talbert Hall. Introductory probability theory and statistical applications. Bn be a complete set of alternatives. and write P (A2 ) = P (A1 )P (A2 | A1 ) + P (Ac )P (A2 | Ac ) 1 1 = 1 5 · 2 6 + 2 5 · 4 6 = 1. Meyer. Reading MA. If two bolts are chosen at random.

50. The probability that a smart snail will be crushed is 0. and the other half are not-so-smart snails. What is the probability of ﬁnding a smart snails after two rainstorms? Independence of more than two events The deﬁnition of independence can be extended to more than two events.50) The probability of selecting a smart snail before the rainstorm was 0. The probability that a not-so-smart snail will be crushed is 0.40.60)(0. Suppose that half of the snails are smart snails. we want to ﬁnd P (S | A) = = P (A | S)P (S) P (A | S)P (S) + P (A | S c )P (S c ) (0. A smart snail crawls along the vertical portion of a step to avoid being crushed. 3 To ﬁnd the probability that a single snail (chosen at random) will successfully cross the concrete steps uncrushed deﬁne the events S = {The snail is a smart snail} S c = {The snail is a not-so-smart snail} A = {The snail is successful} Note that S ∪ S c = Ω and S ∩ S c = Ø. A not-so-smart snail crawls on the horizontal part of a step.50) + (0. . The probability after the rainstorm is 0.50) + (0. a snail that has successfully crossed the concrete steps. Therefore.50) = 0. smart snails and not-so-smart snails.75 3 Suppose that after the end of a rainstorm. at random. The event that any one snail will be crushed is independent from all others.60.90)(0.60 (0.90)(0. you select. using the Law of Total Probability P (A) = P (A | S)P (S) + P (A | S c )P (S c ) = (0. What is the probability that it is a smart snail? In this case.60)(0.50) = 0.90)(0.10.54 Conditional Probability and Independence There are two types of snails.

Three events A. but not independent. . This is demonstrated by the following example: Example: Suppose that two dice are tossed. j) 1 for i. . (2) and (3) all fail. the events A1 . .7. and let Ω be all ordered pairs (i. A2 . TT. . A ﬁnite collection of events A1 . A2 . If only the ﬁrst three conditions hold. B and C can be pairwise independent. TH. A2 and A3 where equation (4) holds. 5 or 6} {the sum of the faces is 9} . B. An are independent if and only if P (Ak1 ∩ · · · ∩ Akj ) = P (Ak1 ) · · · P (Akj ) for all 2 ≤ j ≤ n and all 1 ≤ k1 < · · · < kj ≤ n. then we say that A1 . 6 with probability of 36 assigned to each point. . . A3 are pairwise independent. HT. It is also possible to have three events A1 . as shown by the following example: Example: Toss a fair coin twice. A3 to be independent. C are pairwise independent but not independent. Let A = {ﬁrst toss is a head} B = {second toss is a head} C = {ﬁrst toss equals second toss} 1 4 to each of the outcomes Then P (A ∩ B) = P (A ∩ C) = P (B ∩ C) = P (A ∩ B ∩ C) = 1 4 1 4 1 4 1 4 = P (A)P (B) = P (A)P (C) = P (B)P (C) = P (A)P (B)P (C) Hence A. 2 or 5} {second die is 4. Let A B C = = = {second die is 1. . A2 . . but equations (1). j = 1. A2 . 2.Conditional Probability and Independence 55 Deﬁnition 3. and assign probability HH. Note: For n = 3. A3 are independent if and only if (1) (2) (3) (4) P (A1 ∩ A2 ) = P (A1 )P (A2 ) P (A1 ∩ A3 ) = P (A1 )P (A3 ) P (A2 ∩ A3 ) = P (A2 )P (A3 ) P (A1 ∩ A2 ∩ A3 ) = P (A1 )P (A2 )P (A3 ) All of the above must hold for A1 .

. Furthermore. Assume all the outcomes from each roll are independent. . 2. . We can also compute. Let Sk = {toss k is a six} An = {the ﬁrst six occurs on toss n} Then P (Sk ) = 1 6 c P (Sk ) = 5 6 for k = 1. . . . . P (ﬁrst 6 occurs after toss k) ∞ = n=k+1 ∞ P (An ) 5 6 n=k+1 n−1 1 6 = letting n = n − (k + 1) ∞ = n =0 5 6 k+n 1 6 . .. ..56 Conditional Probability and Independence Then P (A ∩ B) = P (A ∩ C) = P (B ∩ C) = P (A ∩ B ∩ C) = 1 6 1 36 1 12 1 36 = P (A)P (B) = = P (A)P (C) = = P (B)P (C) = 1 4 1 18 1 18 = P (A)P (B)P (C) Example: Suppose you roll a fair die repeatedly until a six appears. 2. 2.. c c c P (An ) = P (S1 )P (S2 ) · · · P (Sn−1 )P (Sn ) c c c An = S1 ∩ S2 ∩ · · · ∩ Sn−1 ∩ Sn = 5 6 n−1 1 6 for n = 1. for n = 1.

3.3 and P (A | B) = 0. Answers can be found in Appendix B. . choose the best response among those provided. P (B) = 0. 1. . S3. since the ﬁrst 6 occurs after toss k if and only if you get k non-6’s in a row. Self-Test Exercises for Chapter 3 For each of the following multiple-choice questions. 2.Self-Test Exercises for Chapter 3 57 = = = 1 6 1 6 5 6 5 6 5 6 k k k ∞ n =0 5 6 n ·6 for k = 0. Which makes sense. . So c c c {ﬁrst 6 occurs after toss k} = S1 ∩ S2 ∩ · · · ∩ Sk c c c P (ﬁrst 6 occurs after toss k) = P (S1 )P (S2 ) · · · P (Sk ) k = 5 6 for k = 0. Which of the following are always true? (A) p = 0 (B) A and B are independent (C) P ((A ∩ B)c ) = 1 − p (D) A and B are mutually exclusive (E) none of the above. 1. ..3 then (A) A and B are mutually exclusive (B) P (A ∩ B) = 0 (C) A and B are independent (D) all of the above (E) none of the above.1 If P (A) = 0. S3..2 Let P (A) = P (B) = 2p and let P (A ∪ B) = 3p. . . 2. .

Then P (R = Heads | R = B) is equal to (A) 1/4 (B) 1/3 (C) 1/2 . S3. Let B denote the outcome of the blue coin. Let R denote the outcome of the red coin (Heads or Tails).58 Self-Test Exercises for Chapter 3 S3.4 If P (A) = p > 0. P (B) = p > 0 and P (A ∪ B) = 2p then P (B | A ∪ B) equals (A) 1 (B) p (C) 1/2 (D) p2 (E) none of the above S3. then which of the following statements are always true? (A) A and B are independent (B) P (A) > P (B) (C) A and B are mutually exclusive (D) A ∪ B = A (E) none of the above S3.3 If A ⊆ B . One coin is painted red. and the other is painted blue.6 Two fair coins are tossed once. P (B) = p and P (A ∩ B) = p2 then (A) A ∪ B = Ω (B) A = B (C) A and B are independent (D) all of the above (E) none of the above.5 If P (A) = p.

Self-Test Exercises for Chapter 3 59 (D) 1 (E) none of the above. S3. S3. S3.8 Suppose P (A) = 0. S3.7 Two indistinguishable six-sided fair dice are rolled. Then (A) A = Ø (B) P (A ∩ A) = P (A)P (A) (C) P (A ∪ B) = 0 for any B ⊆ Ω. P (B) = q . What is the probability that they both turn up the same value? (A) 1/72 (B) 1/36 (C) 1/12 (D) 1/6 (E) none of the above. (D) all of the above are true (E) none of the above are true.9 Which of the following statements are always true? (A) P (Ω) = 1 (B) If P (A) < P (B) then A ⊆ B (C) If P (A) = 0 then A = Ø (D) all of the above (E) none of the above.10 Suppose P (A) = p. If A ⊆ B and A = B then (A) p ≥ q (B) p ≤ q (C) p > q (D) p < q .

The probability that both balls fall in the same box is (A) 4/ 7 2 (B) 1/4 (C) 1/16 (D) it depends on whether or not the balls are different colors (E) none of the above.14 A student club has four ofﬁcers. three of them are seniors and one of them is a junior. What is the probability that the junior stays home? .12 A multiple choice test has 12 questions. A box may contain more than one ball. useless key. Three of the students will be picked at random to go on a trip. What is the probability that the ﬁrst key you try will open the door? (A) 1/6 (B) 1/2 (C) 1/3 (D) 2/3 (E) none of the above.13 Two balls are dropped randomly into four labelled boxes. S3. in your pocket. one at a time. S3.60 Self-Test Exercises for Chapter 3 (E) none of the above. S3. S3.11 There is one lock on a door and the key is among the six different ones you usually carry in your pocket. If a student randomly picks the answer to each question. Someone places a seventh. what is the probability that the student will answer all questions correctly? (A) 1/(125 ) (B) 1/(512 ) (C) 1/ (D) 1/ 12 5 16 5 (E) none of the above. with each question having 5 possible answers.

16 Suppose A is any event. A.18 Which of the following statements are always true? .15 Suppose that ﬁve tiles labelled with the letters C. Then P (A) + P (Ac ) equals (A) 0 (B) P (A) (C) P (Ac ) (D) 1 (E) none of the above. T are placed in a container. S3. what is the probability that they can be arranged to spell the word CAT? (A) 1 (B) 1/3 (C) 1/ (D) 3/ 5 3 5 2 (E) none of the above. S3. S3. S3.Self-Test Exercises for Chapter 3 61 (A) 1/4 (B) 1/ 6 3 (C) 1/4! (D) 1/43 (E) none of the above. C.17 If A ⊆ B with P (A) = P (B) > 0 then (A) P (Ac ∩ B) = 0 (B) A = B (C) Ac ∩ B = Ø (D) all of the above are true (E) none of the above are true. If three tiles are selected from the container (without replacement). C.

(E) none of the above are true. (E) none of the above are true. Which of the following statements are always true? (A) A and B are independent (B) A ∩ B = B (C) B ⊆ A (D) all of the above are true. S3. what is the probability that the student will answer all questions correctly? (A) 1/(45 ) (B) 1/4 (C) 1/ (D) 1/ 5 4 8 4 (E) none of the above.62 (A) P (Ω) = 1 (B) If A ⊆ B then P (A) ≤ P (B) Self-Test Exercises for Chapter 3 (C) P (A ∩ B c ) ≤ P (A) for any event B (D) all of the above are true.20 A multiple choice test has 5 questions. If a student randomly picks the answer to each question. Given that a 2 appears on the ﬁrst toss. S3.25.19 Suppose P (A) = 1 and P (B) = 0. S3. with each question having 4 possible answers. the probability that the sum of the two tosses is 7 equals (A) 1/36 (B) 1/18 (C) 1/9 (D) 1/6 (E) none of the above.21 A fair die is tossed twice. .

22 If A ∩ B ∩ C = Ø then which of the following must be true (A) A ∩ B = Ø (B) P (A ∩ B) = 0 (C) P (A ∩ B ∩ C) = 0 (D) all of the above are true (E) none of the above.24 If P (A) ≤ P (Ac ) then (A) 0 ≤ P (A) ≤ (B) P (A) = 0 (C) A ⊆ Ac (D) P (A) can be any value between 0 and 1 (E) none of the above.25 If P (A) = 1 4 1 2 and P (B) = 1 2 1 2 1 4 then (A) P (A ∪ B) = (B) P (A ∪ B) ≤ (C) A ∩ B = Ø (D) A = B (E) none of the above. S3.23 Four fair dice are rolled simultaneously.Self-Test Exercises for Chapter 3 63 S3. . The probability that exactly two of them turn up with a value less than or equal to 4 is (A) 24/81 (B) 1/3 (C) 4/9 (D) 1/2 (E) none of the above S3. S3.

If ﬁve bulbs are chosen at random from the box. S3. 0. 0.032 3. Machine B produces 30% of the daily output.3125 5.27 In a screw factor. the probability that all of them are good is equal to (A) (0. 0. B and C that produce screws. 5%.2 4. S3.01)5 (B) 1/ (C) 5/ 50 5 50 5 (D) (45!)2 /(40!50!) (E) none of the above. A screw is chosen at random from a day’s production and found to be defective.64 Self-Test Exercises for Chapter 3 S3. none of the above. Machine A produces 20% of the daily output.01 2. Machine C produces the remaining 50% of the daily output. 0. 4% and 2% of the screws are defective. there are three machines A.26 The probability that the birthdays of twelve people will fall in twelve different calendar months (assuming equal probabilities for the twelve months) is (A) 12!/(1212 ) (B) 1/(12!) (C) 1/12 (D) 1/(1212 ) (E) none of the above. S3. What is the probability that it came from machine A? 1. respectively. Of their output. exactly ﬁve of which are defective.28 A box contains ﬁfty light bulbs.29 Suppose A and B are independent events with P (A) = p and P (B) = q . Then P (A ∩ B c ) equals .

S3. . A classroom has 364 students. One ball is drawn from the urn at random and then thrown away. What is the probability that they all have different birthdays? (A) 0 (B) 365!/(365)364 (C) 1/365 (D) (365!364!)/728! (E) none of the above.30 Suppose A and B are independent events with P (A) = p and P (B) = q .Self-Test Exercises for Chapter 3 65 (A) pq (B) 1 − pq (C) p − pq (D) q − pq (E) none of the above. Then which of the following statements are true: (A) P (A ∩ B) = pq (B) P (A ∩ B c ) = p − pq (C) P (Ac ∩ B) = q − pq (D) all of the above are true (E) none of the above are true. A second ball is then drawn from the urn. S3. S3.31 An urn contains 5 red balls and 5 black balls.32 Suppose birthdays are assigned to people randomly with each of the 365 days of the year equally likely. The probability that the second ball is red equals (A) 1/3 (B) 4/9 (C) 1/2 (D) 5/9 (E) none of the above.

4 P (A | B c ) = 0.66 Self-Test Exercises for Chapter 3 S3. The probability that the lamps will light is (A) 2/9 (B) 1/4 (C) 1/2 (D) 3/4 (E) none of the above. ﬁve are bad. S3. (E) none of the above are always true.5 (D) 0.6 . Two lamps are randomly chosen from the box.34 Which of the following statements (A) (B) or (C) are not always true? (A) P (E ∩ F ) = P (E)P (F ) for any events E and F (B) P (Ω) = 1 (C) P (E) ≥ 0 for any event E (D) all of the above are always true.33 A box contains 10 incandescent lamps. S3. Five of them are good.2 (B) 1/3 (C) 0.5 Then P (B | A) equals (A) 0.8 P (B) = 0.35 Suppose P (A | B) = 0. The following circuit contains two such incandescent lamps in series and a (good) battery sufﬁcient to light both lamps: + Both lamps must be good in order for either to light.

.Questions for Chapter 3 67 (E) none of the above. Hint: B = A ∪ (B \ A). 2.1 Consider the sample space Ω = {1. Questions for Chapter 3 3. However.0200 (B) 0. . However.4 There are two locks on the door and the keys are among the six different ones you usually carry in your pocket. Show that P (·) satisﬁes the three probability axioms.01 0. in haste. you’ve lost one.10 0. . in haste. you’ve lost a key at random. deﬁne P (A) ≡ |A|/n.2 Let P be a probability measure.3 There is one lock on the door and the correct key is among the six different ones you usually carry in your pocket. For any event. S3.02 0. Show that if A ⊂ B then P (B \ A) = P (B) − P (A). . 3.3333 (E) none of the above.0390 (C) 0. What is the probability that you can still open the door? What is the probability .05 Model year 2002 2003 Other Proportion of all vehicles 0. A. n}. What is the probability that the model year for this automobile is 2003? (A) 0. 3. What is the probability that you can still open the door? What is the probability that the ﬁrst key you try will open the door? 3.36 A study of automobile accidents produced the following data: Probability of involvement in an accident 0.70 You know that an automobile has been involved in an accident.20 0. Let |A| denote the number of elements in the set A.0513 (D) 0.

Ω. S . justifying your answer: P (A ∩ B ∩ C) P (Ac ∪ B c ) P (A ∩ B ∩ C c ) P (A ∩ B) P (A | B) (b) Are A and B mutually exclusive? Justify your answer.5 P (B) = 0. but only one leads to the outside. . and the events S and T are independent. is divided into three strips. in this kingdom. A.6 If P (S) = p and P (T ) = q with p > 0. what is the value of P (S ∩ T | S)? 3. B and C of equal area.68 Questions for Chapter 3 that the ﬁrst two keys you try will open the door? (Note: There are two interpretations. Ω. What is the value of P (S ∪ T | T )? 3.5 A convict is planning to escape prison by leaving through the prison sewer system.9 Consider a sample space.6 P (C) = 0. and suppose that the value per unit is in the ratio of 1:3:2. 3. For any piece of land.e. There are 4 manholes in the prison. the relative value with respect to that of the kingdom is then given by the formula V (S) = P (S ∩ A) + 3P (S ∩ B) + 2P (S ∩ C) 2 where P (D) = (area of D)/(area of Ω) for any region D.3 P (A ∪ B) = 0.5 P (A ∪ B ∪ C) = 0. What is the probability that the prisoner must try exactly 1 manhole? 2 manholes? 3 manholes? 4 manholes? 3.3 P (B ∪ C) = 0. Show that V is a probability measure.8 Suppose that the land of a square kingdom.7 Suppose P (S) = p and P (T ) = q with 0 < p < 1 and 0 < q < 1. The prisoner plans to enter each of the manholes at random. never reentering an unsuccessful manhole.. i. such that P (A) = 0.4 (a) Compute each of the following. show that V satisﬁes the three probability axioms.2 P (A ∪ C) = 0.) 3. with events A. B and C and probability measure P (·).

B and C contains the keys to that new car worth \$50.Questions for Chapter 3 69 3.000 cash for your 2 box.10 (from Selvin5 ) It’s “Let’s Make a Deal” – the famous TV show starring Monty Hall! Monty Hall: One of the three boxes labelled A. To my knowledge. I think I’ll keep this box. Contestant: No thank you. Is Monty right? Assuming Monty knows which of the three boxes contains the keys. the original problem appeared in the paper by Steve Selvin. Monty Hall: Now box A and box C are on the table and here is box B (contestant grips box B tightly). once Selvin’s paper appeared in American Statistician. but I’ll trade you my box B for box C on the table. Monty Hall: I’ll do you a favor and open one of the remaining boxes on the table (he opens box A). but clear.” The American Statistician. In that letter. No. I’ll give you \$10. and there are several claims for its origin. Since there are two boxes left. 29.)6 This problem has appeared in various forms. Contestant: No. Now either box C or your box B contains the car keys. Contestant: No.000. The other two are empty. August 1975. “On The Monty Hall Problem. Monty Hall: I’ll give you \$2000. is the contestant’s probability of winning the car 0. It’s empty! (Audience: applause). the probability of your box containing the keys is now 1 . Contestant: I’ll take box B. 3.5? Why does the contestant want to trade boxes? (Hint: The contestant is a former EAS305 student. Vol. statement of 5 . If you choose the box containing the keys. you win the car. Monty provided a short. 6 To Monty Hall’s credit. Contestant: Gasp! Monty Hall: Select one of these boxes. it is claimed that Monty wrote a letter to Steve Selvin. The keys might be in your box! I’ll give you \$500 for the box.

The probability that an C will be incorrectly printed as a B is 0. Mary throws twelve dice and wins if she scores at least two aces. You roll the die 100 times and 99 of the 100 rolls turns up 4. Should you accept the claim that die is a fair die? Why or why not? the correct answer to this question.2. You test it and discover it is defective. What is the probability that it came from Factory B ? 3. However. However. Monty also cautioned that there is a difference between the actual probability that the keys are in B and the contestant’s perceived probability. .14 John throws six dice and wins if he scores at least one ace. If a B is printed. but without any veriﬁcation of its authenticity. (a) Suppose that a single A is transmitted. past history has shown that 0. due to transmission errors.5.10 of the coffee makers from A are defective while 0.11 Given two mutually exclusive events A and B . Other printer errors are impossible. What is the probability that it is correctly printed? (b) Suppose that two characters are transmitted independently.12 A warehouse receives the same model of a coffee maker from two different factories. It is known that any one coffee maker has an equal chance of originating from A or B .70 Questions for Chapter 3 3. One is defective. (a) You know that two particular coffee makers came from the same factory. The text of Monty’s letter is available on several web sites. What is the probability that the other one is also defective? (b) You chose another coffee maker at random from those in the warehouse.50 of the coffee makers from B are defective. Who has the advantage? 3. The probability that an A will be incorrectly printed as a B is 0. can A and B be independent? Why? 3. B or C ) to a printer. what is the probability that a B was transmitted? 3.13 A computer can send any one of three characters (A. each character has an equal chance of being sent. it is possible that the printer will not print the same character as transmitted. A and B . During each transmission. and a probability measure P (·) such that P (A) > 0 and P (B) > 0. What is the probability that two A’s will be printed? (c) Suppose that a single character is transmitted.15 A person gives you a six-sided die and tells you that it is a fair die.

21 de M´ r´ ’s paradox: Which is more probable: (a) to get at least one ace with ee four dice. each independently of the other. Assuming that car owners pick a parking place at random among the spaces available. 1950. An introduction to probability theory and its applications.” Was he justiﬁed in attributing the frequency of his breakage to chance? Hint: Before the four dishes are broken. Reanswer (a) and (b). Using this decision rule. what is the probability of such an event? 3. What is the probability that each cell is occupied? 3.20 (from Feller7 ) In a certain family. Two balls are dropped into the box. in such a way that every cell has an equal chance of receiving a ball. Every one of the 12 tickets was given on either a Tuesday or Thursday. what is the probability of incorrectly announcing that the youngest is clumsy. (b) What is the probability of each arrangement? (c) Suppose that no cell can have more than one ball.19 Consider a box partitioned into 3 cells.17 Let n balls be placed “randomly” into n cells. he is clumsy. but in such a way so that the four empty places were adjacent to each other. Feller is often regarded as the ﬁrst person to organize probability theory as a self-contained subject. when the process of breaking dishes is actually random? 3. . there are 6 cars parked. Was his renting a garage only for Tuesdays and Thursdays justiﬁed? 3. we will announce that the four dishes were not randomly broken and. A cell may contain more than one ball. adopt the following decision rule: If the youngest brother breaks three or more of four broken dishes.16 A professor parks illegally on campus every night and has received twelve tickets for overnight parking. four boys take turns at washing dishes. Wiley. three were caused by the youngest and he was thereafter called “clumsy. list all possible arrangements. (a) If the balls are indistinguishable. or (b) to get at least one double ace in 24 throws of two dice? 7 This is a problem from the text: William Feller.18 A parking lot attendant noticed that in a section where there are 10 parking spaces arranged in a row.Questions for Chapter 3 71 3. Out of a total of four broken dishes. 3. Find the probability of this event assuming tickets are given out on random days. thus.

(a) The students in the class are assigned seats at random. If no new passengers are picked up. what is the probability that no two passengers leave at the same stop? What assumptions are you making? 3.25 Fifty students have rented 3 busses for a class trip.2 for an example of such a block). Each bus can hold 50 riders. in his state-of-mind.2: Example of a ‘Block’ for Bell Elementary School 3. what is the probability that each row contains exactly one girl? 3. (See Figure 3. (a) How many different ways could the postal carrier deliver the letters? (b) What is the probability that everyone gets the correct letter? (c) What is the probability that at least two people get the wrong letters? 3. one letter per box.72 Questions for Chapter 3 B B B B G G B B G G B B B B B B (B = Boy. the neighborhood postal carrier decides to deliver n letters addressed to n different individuals. What is the probability that the four girls will be seated together in a block of two rows and two columns somewhere in the classroom. Answer the following questions assuming that each student chooses his bus for the trip at random: (a) What is the probability that every bus has at least one rider? .22 A bus starts with r passengers and stops at n bus stops (r ≤ n).24 A classroom in the Bell Elementary School contains 16 seats with four rows of four seats each. (b) If the seats are assigned randomly. G = Girl) Figure 3.23 After a night on the town. However. A class is composed of 4 girls and 12 boys. the carrier places the letters randomly into the n mailboxes.

and 20% of the independents favor him. That is.. n boys take turns at washing dishes. 000 independents. 000 Republicans. Out of a total of d broken dishes. 3. (a) What is the probability that a voter chosen at random from the community will favor JK? .Questions for Chapter 3 73 (b) What is the probability that all students sign up for the same bus? 3.e.32 A community of 20. A ⊂ B and A = B ). what is the probability that the player who gets the ace of spades will also get the ace of hearts? 3.30 Show that Theorem 3.31 We know that A = B ⇒ P (A) = P (B) Note that this is used in many of the proofs in this chapter. A = B ⇐ P (A) = P (B) 3. Provide an example to show that the converse is not true. What is the probability that the youngest child breaks exactly x dishes? 3. 5. If each student casts his vote independently and at random for one of the two candidates.27 A class of 100 students vote for class president among two candidates. what is the probability of a tied election? 3.28 In a family. while 5% of the Democrats favor JK. 000 voters is composed of 10.26 If an ordinary deck of 52 cards is dealt to four hands of 13 cards each. B and C P (A ∪ B ∪ C) = P (A) + P (B) + P (C) −P (A ∩ B) − P (A ∩ C) − P (B ∩ C) +P (A ∩ B ∩ C) 3. Provide an example where A ⊂ B (with A = B ) and P (A) = P (B).3 is still true even if we replace ⊆ by proper subset (i. it is known that 80% of the Republicans favor Presidential candidate JK. From a recent survey. Show that for any events A. 000 Democrats and 5.29 Let Ω be a sample space with probability measure P (·).

In the dark. . Suppose that P (A) = 0. Only when an electrical current is applied. one fourth green and three quarters red. the probabilities that any of the 25 other characters will actually appear are equal. Let P (B) = p. she picks two socks at random from the basket. (a) Suppose that one of the 26 impulses is chosen at random and fed into the machine. one barrel of red LED’s was produced. a second barrel was ﬁlled with a mixture of LED’s.74 Questions for Chapter 3 (b) Suppose a voter is chosen from the community and it is learned to favor JK. When turned off. What is the probability that the impulse chosen was the one designed to produce a C ? (b) Suppose that one of the 26 impulses is chosen at random and fed into the machine twice.34 Let A and B be two events. 4 red socks and 2 green socks. a university student. and that the character C is printed. the red and green LED’s look alike. each character being produced by a unique impulse. (a) For what choice(s) of p is P (A ∩ B) = 0? (b) For what choice(s) of p are A and B indpendent? 3. On a busy production day. accidentally. has a basket that contains 5 blue socks. independent of past behavior.36 The Colorful LED Company manufacturers both green and red light emitting diodes (LED’s). Suppose the machine has the probability 0. The barrels have no distinguishing markings. can it be determined whether the LED gives off green or red light. What is the probability that the voter is a Republican? 3.9 of correctly producing the character corresponding to the impulse received. LED’s are packaged and shipped in bulk using large cardboard barrels. In addition. What is the probability that the impulse chosen was the one designed to produce a C ? 3.33 A printing machine capable of printing any of the 26 letters (A through Z ) is operated by electrical impulses.7. If it prints the wrong character.4 while P (A∪B) = 0. What’s the probability that both socks are the same color? 3. and the character C is printed both times.35 Mary.

each time you return to your cell. Only one of the three doors leads to an escape spacecraft. (c) Assuming that the aliens zap you with their anti-memory beam (as in part (b)). P (E2 ) and P (E3 ). (b) Anticipating your plan to escape. . Let Ek denote the event that you require exactly k escape attempts. What is the probability that you need more than three attempts? (d) Assume that the aliens zap you with their anti-memory beam (as in part (b)). an LED is tested. you know how to expertly pilot the spacecraft back to Earth. B and C .37 You have been captured by aliens from outer space and held in a one-room cell on their planet. Find the values of P (E1 ). Assume that after a failed escape attempt you never choose a previously chosen door. you choose randomly from all three doors. What is the probability that it is the barrel with a mixture of green and red LED’s? 3. every time you attempt to escape. What is the probability that it is a green LED? (c) Suppose that one of the two barrels is chosen at random. Two of the doors lead to corridors that return you back to your one-room cell. and it turns out to be RED. Find the values of P (E1 ). (a) You decide to try each door (one at a time) at random. Because of your engineering degree. What is the probability that it is the barrel with a mixture of green and red LED’s? (b) Suppose that one of the two barrels is chosen at random and then an LED is tested. you may need more than three attempts to escape. P (E2 ) and P (E3 ). Hence. Find a formula to compute the value of P (Ek ) for any positive integer k . The room has three doors labelled A. the space aliens zap you with a electromagnetic beam that causes you to forget which doors you have previously chosen.Questions for Chapter 3 75 (a) Suppose that one of the two barrels is chosen at random.

The purple and yellow jelly beans are placed in a blue bowl. The green jelly beans are likewise placed in a white bowl. Just before leaving the company. Exactly one half of the red jelly beans (one-half pound) are placed in the blue bowl and the other half are placed in the white bowl.38 The Belchee Seed Company sells ﬂower seeds and specializes in red petunias and white petunias. At the end of the season. you will be blindfolded.)). Then you will choose a bowl at random and then pick a jelly at random . experimental or subjective probability? (b) You choose a single petunia seed at random from the randomly chosen barrel. The beans in both bowls are thoroughly mixed together. from the barrel that was randomly chosen in part (a. green. What is the probability that it is a red petunia seed? (c) You now perform a DNA test on that single petunia seed (the one chosen at random in part (b. You determine that it is a white petunia seed. of course. red and white petunia seeds are indistinguishable except through DNA testing. with each bag containing jelly beans that are all of the same color. you are given one bag each of red. one contains only red petunia seeds. you perform a DNA test on the seeds). purple and yellow jelly beans. Two of the barrels contain only white petunia seeds. Before they are grown.76 Questions for Chapter 3 3. the company has only four barrels of petunia seeds remaining. In order to conduct a probability experiment. an angry employee removed the labels from all four barrels of seeds so that they are now indistinguishable (unless.). (a) Suppose you choose a barrel at random.39 Jelly beans are packaged in one-pound bags. and the fourth barrel contains a mixture of 10% red petunia seeds and 90% white petunia seeds. What is the probability you selected the barrel containing a mixture of red and white seeds? 3. (a) In the experiment. What is the probability that it is the barrel with the mixture of red and white seeds? Are you using logical.

So you do not know precisely which bowl you selected in part (b) and (c). Do the events R. what is the probability that the second jelly bean is also RED? 3. P and Y form a complete set of alternatives? Brieﬂy explain why. 3. indeed.Questions for Chapter 3 77 from that bowl. you choose a bowl at random and then pick a jelly bean from that bowl at random. Let B = {the BLUE bowl is chosen} W = {the WHITE bowl is chosen} Also let R = {a RED jelly bean is chosen} G = {a GREEN jelly bean is chosen} P Y = {a PURPLE jelly bean is chosen} = {a YELLOW jelly bean is chosen} Do the events B and W form a complete set of alternatives? Brieﬂy explain why. G. a box will be picked at random and a ball randomly chosen from the selected box. (b) While blindfolded. What is the probability that it is a RED jelly bean? (c) A friend (who is not blindfolded) tells you that you did. How should you initially distribute the balls so as to maximize the probability of drawing a white ball? 3. After you have distributed all of the balls between the two boxes. You draw out a second jelly bean from the same bowl as you selected in parts (b) and (c). choose a RED jelly bean? What is the probability that you selected the BLUE bowl? (d) You are still blindfolded.42 (†) We deﬁne P (A | B) to be the number x that solves the equation P (A ∩ B) = xP (B) . Given that the ﬁrst jelly bean was RED.40 Show that if P (A) = P (B) = 1 then P (A ∩ B) = 1.41 (†) Suppose that you have n white balls and m red balls which you plan to distribute between two boxes.

43 (†) The 9-1-1 New York Lottery Problem. On September 11. assume the following: • Two sets of three numbers are drawn twice each day. Jackpot winners of the daily New York Lottery numbers game. For this question. But if we try to use (5) to deﬁne conditional probabilities. the numbers drawn were 9-1-1. The “spam” E-mail is of two types: junk mail that is advertising to sell a product. This was regarded by many as a “miraculous” outcome. B and C . then Equation (6) doesn’t always hold. .78 Questions for Chapter 3 We do not deﬁne P (A | B) as (5) P (A | B) = P (A ∩ B) P (B) Here is one reason why. Mr. once mid-day and once in the evening. Each message he receives belongs to one of the following three categories B1 = {the message is “junk mail”} B2 = {the message is “adult content mail”} B3 = (B1 ∪ B2 )c = {the message is regular E-mail} . and he gets a lot of “spam” messages. then Equation (6) holds for any events A. or adult content mail that is promoting web sites containing explicit adult material. the same as the date of the drawing. That is. What is the probability that the lottery numbers will match the calendar date at least once in a year? 3. Pop has a computer with an E-mail account.44 Mr. must match a sequence of three numbers. .4. Why? 3. • The numbers can match the calendar date only in the months of January to September. the lottery game is played twice a day. Consider the following equation (6) P (A ∩ B ∩ C) = P (A)P (B | A)P (C | A ∩ B) Show that if we use Deﬁnition 3. Pop has carefully analyzed his E-mail messages. with each number independently selected at random from the digits 0 through 9. 2002.

Questions for Chapter 3 79 Note that every message is assigned to one and only one category.3 That is.9 P (AX | B3 ) = 0.8 P (A\$ | B2 ) = 0.0 P (AX | B2 ) = 0. (b) An E-mail message arrives and contains the character ‘\$’. the probability that an E-mail message chosen at random contains the character ‘\$’. 60% of his E-mail is “junk mail.” . regular E-mail.” and the remaining 30% is useful. Use Bayes’ theorem to ﬁnd the probability that the message is “junk mail.1 P (AX | B1 ) = 0. Mr.” 10% is “adult content mail. Mr. Upon further analysis. Pop deﬁned the events A\$ = {the message contains ‘\$’} AX = {the message contains “XXX”} and determined that P (A\$ | B1 ) = 0. Being a former EAS305 student.” (c) An E-mail message arrives and contains the character string “XXX”.6 P (B2 ) = 0. Pop has determined that most of the “junk mail” contains the text character ‘\$’.0 (a) Use the law of total probability to ﬁnd P (A\$ ). Regular mail usually contains neither string. and that almost all of the “adult mail” contains the text string ‘’XXX”. Pop estimates that the probability that a single message is assigned to a category is given by P (B1 ) = 0.1 P (B3 ) = 0.2 P (A\$ | B3 ) = 0. that is. Mr. Use Bayes’ theorem to ﬁnd the probability that the message is “adult content mail.

80 Questions for Chapter 3 .

lim FX (x) = x↓a FX (a) 81 . FX (a) is a nondecreasing function of a. lim FX (a) = 1 a→∞ 2. A random variable is a function X(·) which assigns to each element ω of Ω a real value X(ω). And sees in Love a coin to toss. as FX (a) ≡ P (X ≤ a). FX (a). FX (x) is a right-continuous function of a. For any random variable X . The Man Against the Sky (1921) Basic Concepts Deﬁnition 4. a→−∞ lim FX (a) = 0 3.2. we deﬁne the cumulative distribution function (CDF).4 RANDOM VARIABLES He stares in vain for what awaits him. 4. Properties of any cumulative distribution function 1.1. – EDWIN ARLINGTON ROBINSON. Notation: (1) P {ω ∈ Ω : X(ω) = a} ≡ P {X(ω) = a} ≡ P (X = a) Cumulative Distribution Function Deﬁnition 4. In other words.

P (E) = xi ∈E pX (xi ) = 1 pX (xi ) . pX (x) ≥ 0 for every x 2. Properties of any probability mass function 1. For any random variable X and real values a < b P (a < X ≤ b) = FX (b) − FX (a) Discrete Random Variables Deﬁnition 4.6 F(x) 0.82 Discrete Random Variables 1 0.2 0 0 1 2 3 4 5 x Figure 4. The probability mass function (pmf) of a discrete random variable X is given by pX (xi ) ≡ P (X = xi ).8 0.3.1.1: An example of a CDF for a discrete random variable Theorem 4. all xi 3.4 0.

but.Discrete Random Variables 83 Examples of discrete distributions The binomial distribution Consider a sequence of n independent. . this time. n.. and the random variable Y is said to have a geometric distribution with parameter p where 0 ≤ p ≤ 1. . we want to ﬁnd the probability that the ﬁrst success occurs on the k th trial. with each trial having two possible outcomes. 1. 2. repeated trials. . success or failure. . . then 1. n k=0 pX (k) Let’s check to make sure that if X has a binomial distribution. Let X denote the number of successes on n trials. . . Let p be the probability of a success for any single trial. Let p be the probability of a success for any single trial and let Y denote the number of trials needed to obtain the ﬁrst success. The random variable X is said to have a binomial distribution and has probability mass function (2) pX (k) = n k p (1 − p)(n−k) k for k = 0. Then the probability mass function for Y is given by pY (k) = (1 − p)k−1 p for k = 1. We will need the binomial expansion for any polynomial: (a + b)n = n k=0 = n k (n−k) a b k So n k=0 n k p (1 − p)(n−k) = (p + (1 − p))n k = (1)n = 1 The geometric distribution Recall the “success/failure” model that was used to derive the binomial distribution. .

2. ∞ j=0 θj = 1 1−θ Proof. For any 0 < θ < 1. Using Theorem 4. For any 0 < θ < 1.84 Discrete Random Variables Some facts about geometric series When working with the geometric distribution.3.2 ∞ j=0 ∞ j=0 θj θj = = 1 1−θ d 1 dθ 1 − θ d dθ . ∞ j=0 ∞ j j=0 ∞ j=0 ∞ θ − θ0 = θ − θ j θj = 1 = 1 = 1 = 1 = j=1 ∞ j+1 j=0 ∞ j=0 ∞ j=0 ∞ j=0 θj − θ θj θj θj (1 − θ) 1 1−θ Theorem 4. ∞ j=0 jθj = θ (1 − θ)2 Proof. the following results are often helpful: Theorem 4.

. 1. . 2.2. 2. then ∞ ∞ pY (k) = k=1 (1 − p)k−1 p (1 − p)j p k=1 ∞ = = j=0 1 p=1 1 − (1 − p) The Poisson distribution The random variable X is said to have a Poisson distribution with parameter α > 0 if X has a probability mass function given by (3) pX (k) = e−α αk k! for k = 0. . Suppose pY (k) = (1 − p)k−1 p for k = 1. . . . we can show that the probability mass function for a geometric random variable must sum to one. Continuous Random Variables ..Continuous Random Variables ∞ 85 d j θ dθ j=0 ∞ = 1 (1 − θ)2 1 (1 − θ)2 1 (1 − θ)2 θ (1 − θ)2 θ (1 − θ)2 jθj−1 = jθj jθj jθj = = = θ j=1 ∞ −1 j=1 ∞ j=1 ∞ j=0 Example: Using Theorem 4.

fX (x) ≥ 0 for every x ∞ 2. If X is a continuous random variable then P (X < a) = P (X ≤ a).4.86 Continuous Random Variables 1 0. . The probability density function (pdf) of a continuous random variable X is given by d FX (a).4 0.4. −∞ fX (x) dx = 1 fX (x) dx 3. If X is a continuous random variable.2: An example of a CDF for a continuous random variable Deﬁnition 4.2 0 0 1 2 3 4 5 x Figure 4.5. then (4) FX (a) = a −∞ fX (x) dx.6 F(x) 0. fX (a) ≡ da Properties of any probability density function 1. Theorem 4. P (E) = E Theorem 4.8 0.

3: The differential mass interpretation of fX (x) dx Differential mass interpretation of the pdf (5) P (X ∈ [x. x + dx)) = fX (x)dx .Continuous Random Variables 87 Area = f(x)dx f(x) x x+dx Figure 4.

pmf pX (x) ≥ 0 P (X = x) = pX (x) pX (x) = 1 all x pdf fX (x) ≥ 0 P (x ≤ X < x + dx) = fX (x) dx ∞ −∞ fX (x) dx = 1 a −∞ FX (a) = x≤a pX (x) FX (a) = fX (x) dx pX (x) = P (E) x∈E E fX (x) dx = P (E) Examples of continuous distributions The uniform distribution The random variable X is said to have a uniform distribution if it has the probability density function given by (6) 1 fX (x) = b−a  0   if a ≤ x ≤ b otherwise The exponential distribution The random variable X is said to have an exponential distribution with parameter α > 0 if it has the probability density function given by (7) fX (x) = αe−αx if x ≥ 0 0 otherwise .88 Continuous Random Variables Comparison of pmf’s and pdf’s A comparison of the properties of the probability mass function and probability density function.

The expected value of a random variable X is given by E(X) = all x xpX (x) + for discrete X .5. • The expected value of X is often called the “mean of X . Note: • E(X) is a number. and E(X) = +∞ −∞ xfX (x) dx for continuous X .5 6 6 6 6 Example: Let X be a continuous random variable with probability density function fX (x) = 2x 0 ≤ x ≤ 1 0 otherwise Then E(X) = = = 0 +∞ −∞ 0 −∞ 1 xfX (x) dx 1 0 x0 dx + x2x dx + 1 1 +∞ x0 dx 2x2 dx = 2x3 3 = 0 2 3 −0= 2 3 . Let X equal the number of pips.” • The value of E(X) can be interpreted as the center of mass or center of gravity of the probability mass distribution for X .Expectation 89 Expectation Deﬁnition 4. Example: Roll a fair die once. Then E(X) = all x xpX (x) = (1)pX (1) + (2)pX (2) + (3)pX (3) + (4)pX (4) + (5)pX (5) + (6)pX (6) 1 1 = (1) 6 + (2) 1 + (3) 1 + (4) 1 + (5) 1 + (6) 6 = 3.

n. . We can ﬁnd E(X) as follows: n E(X) = k=0 n k n k p (1 − p)n−k k = n! pk (1 − p)n−k (k − 1)!(n − k)! k=1 Let s = k − 1 n−1 E(X) = = np = np n s=0 n−1 s=0 n − 1 s+1 p (1 − p)(n−s−1) s n−1 s p (1 − p)(n−s−1) s Poisson distribution Let X be a Poisson random variable with parameter α > 0. 1.90 Finding E(X) for some speciﬁc distributions Finding E(X) for some speciﬁc distributions Binomial distribution Let X be a binomial random variable with parameters n and 0 ≤ p ≤ 1. We have pX (k) = e−α αk k! for k = 0. . . 1. . The expected value for X can be found as follows: ∞ E(X) = k=0 ∞ ke−α e−α αk k! = k=1 αk (k − 1)! . . . 2. . We have pX (k) = n k p (1 − p)n−k k for k = 0.

We have pX (k) = (1 − p)k−1 p for k = 1.. ∞ E(X) = = p kq k−1 p kq k−1 k=1 ∞ k=1 = p 1 1 = (1 − q)2 p Exponential distribution Let X be an exponential random variable with parameter α > 0. This produces E(X) = −xe−αx ∞ 0 + 0 ∞ e−αx dx = 1 α . We have fX (x) = αe−αx if x ≥ 0 0 otherwise We can get E(X) as follows: E(X) = 0 ∞ xαe−αx dx Integrate this by parts. Hence v = −e−αx and du = dx.Finding E(X) for some speciﬁc distributions 91 Let s = k − 1 ∞ E(X) = = α = α e−α s=0 ∞ s=0 αs+1 s! αs s! e−α Geometric distribution Let X be a geometric random variable with parameter 0 ≤ p ≤ 1. 2. To ﬁnd E(X) let q = (1 − p) and then. . . . letting dv = αe−αx dx and x = u.

7. S4.92 Self-Test Exercises for Chapter 4 Other measures of central tendency Deﬁnition 4.6. The mode of a random variable X .50) equals (A) 0 . 2 2 Deﬁnition 4. is given by pX (Mode(X)) = max pX (x) if X is discrete x fX (Mode(X)) = sup fX (x) x if X is continuous Self-Test Exercises for Chapter 4 For each of the following multiple-choice questions. 0. equals (A) 0 (B) 1/3 (C) 1/2 (D) any of the values in the set {−1. choose the best response among those provided.1 If X is a discrete random variable with P (X = −1) = P (X = 0) = P (X = 1) = 1/3 Then the expected value of X .2 Suppose X is a continuous random variable with cumulative distribution function   0 if x < 0  2 x if 0 ≤ x ≤ 1 FX (x) =   1 if x > 1 Then P (X < 0. 1} (E) none of the above are true. The median of a random variable X . E(X). Answers can be found in Appendix B. S4. denoted by Mode(X). is given 2 by 1 P (X ≤ m 1 (X)) = . denoted by m 1 (X).

Self-Test Exercises for Chapter 4 93 (B) 0.50 √ (D) 0. S4.4 Suppose X is a continuous random variable with probability density function fX (x) =   0    if x < 0 if 0 ≤ x < 1 if 1 ≤ x < 2 if x ≥ 2 1/2 0 if 0 ≤ x ≤ c otherwise Then the value of c is (A) 1/2 (B) 1 (C) 2 (D) +∞ (E) none of the above.25 (C) 0. S4.5 Let X be a continuous random variable with probability density function fX (x) = λe−2x 0 if x > 0 otherwise The value of λ must equal . S4.3 Suppose X is a random variable with cumulative distribution function 1/3 FX (x) =  2/3    1 Then P (X = 2) equals (A) 0 (B) 1/3 (C) 2/3 (D) 1 (E) none of the above.50 (E) none of the above.

8 The length X (in inches) of a steel rod is a random variable with probability density function given by   x−9 fX (x) =  0 2 if 9 ≤ x ≤ 11 otherwise A rod is considered defective if its length is less than 10 inches.94 Self-Test Exercises for Chapter 4 (A) 1/2 (B) 1 (C) 2 (D) e2x (E) none of the above. The probability that exactly two heads turn up is (A) (B) (C) (D) 1 2 1 4 3 8 2 3 (E) none of the above.6 Suppose a fair coin is ﬂipped 3 times. S4. S4. S4.7 Let X be a random variable with cumulative distribution function FX (a). Which of the following statements are always true: (A) lim FX (a) = 0 a→∞ (B) P (X = a) = FX (a) − FX (a − 1) (C) 0 ≤ FX (a) ≤ 1 for every a (D) all of the above are true (E) none of the above are true. If 2 independent steel rods are examined. the probability that exactly one is defective is (A) 1/4 .

11 Let X be a continuous random variable with probability density function fX (x) = x/2 if 0 ≤ x ≤ 2 0 otherwise Then P (X ≤ 1) equals . then α must equal (A) −1 (B) 1/2 (C) 1 (D) 2 (E) none of the above. S4. Passenger behavior has been extensively studied over the years.Self-Test Exercises for Chapter 4 95 (B) 3/8 (C) 1/16 (D) 3/16 (E) none of the above. S4. the probability that someone will be denied a seat is (A) 0 (B) 0. It is known that the probability is 0.10 Let X be a continuous random variable with probability density function fX (x) = αe−αx if x ≥ 0 0 otherwise If P (X > 1) = e−2 .01 (C) (0.01 that a person making an airline reservation will never show up for his or her ﬂight.99)101 (D) 1 (E) none of the above. S4. If the airline permits 101 persons to make reservations for a given ﬂight. Assume that a person decides independently whether or not to show up for a ﬂight.9 An airplane can hold 100 passengers.

S4. Then P (X > 1) equals (A) e−1 (B) 1 − e−1 (C) 1 − 2e−1 (D) 1 (E) none of the above.96 Self-Test Exercises for Chapter 4 (A) 1/8 (B) 1/4 (C) 1/3 (D) 1/2 (E) none of the above. S4. The probability that both bulbs will each last at least one hour is (A) 0 (B) (1 − e−2 )2 (C) e−4 (D) 2e−2 (E) none of the above. . .12 Let X be a discrete random variable with probability mass function P (X = k) = e−1 k! for k = 0. . . 2. 1.13 You have two light bulbs. The life length of each bulb (in hours) is a continuous random variable X with probability density function fX (x) = 2e−2x if x > 0 0 otherwise The two bulbs are allowed to burn independently of each other.

Self-Test Exercises for Chapter 4 97 S4.3 and FX (1) = 0.5. Then P (0 < X ≤ 1) equals (A) 0 (B) 0.5 (E) none of the above.2 (C) 0. Suppose FX (0) = 0.50 (C) 0. The value of E(X) is (A) 0 (B) 3/8 (C) 3/4 (D) 3/2 (E) none of the above.3 (D) 0.00 (B) 0. The value of fX (0) is (A) 0. S4.75 (D) 1.15 Let X be a random variable with cumulative distribution function FX (x).00 (E) none of the above. .16 Let X be a continuous random variable with probability density function fX (x) = 3x2 0 if 0 ≤ x ≤ 1 otherwise Let E(X) denote the expected value of X .14 Let X be a continuous random variable with cumulative distribution function   0  FX (x) =   1 1 2 (x + 1) if x < −1 if −1 ≤ x ≤ 1 if x > 1 Let fX (x) denote the probability density function for X . S4.

6 (D) 0.2 0 -2 -1 0 1 2 3 FX(a) a The value of P (−1 ≤ X ≤ 1) is (A) 0.98 Self-Test Exercises for Chapter 4 S4. The expected number of defective items is (A) e−3 (B) (1 − e−3 ) (C) (1 − e−9 ) (D) 100(1 − e−9 ) (E) none of the above.2 (B) 0.18 Let X be a discrete random variable with cumulative distribution function.17 Suppose that an integrated circuit has a life length X (in units of 1000 hours) that is a continuous random variable with probability density function fX (x) = 3e−3x if x ≥ 0 0 otherwise An integrated circuit considered defective if its life length is less than 3000 hours. S4. given by the following graph: 1 0. A batch of 100 integrated circuits are tested.8 .8 0.4 (C) 0.4 0. F (a) ≡ P (X ≤ a). The life lengths of any two integrated circuits are independent random variables.6 0.

4.2 The angle of ascent of an erratic rocket is a random variable X having the cumulative distribution function given by   1  FX (x) = if x > π/2 sin x if 0 ≤ x ≤ π/2   0 if x < 0 (a) Compute P (X ≤ π/8). (c) Find E(X). (b) Find P (X ≥ 0. Questions for Chapter 4 4. The probability of a head on one toss is 1/4.1 Let X be a continuous random variable having the probability density function 3 2 2 (1 − x ) if 0 ≤ x ≤ 1 fX (x) = 0 otherwise (a) Compute and plot the cumulative distribution function for X. (b) Find the probability density function fX (x) for X .19 An unfair coin is tossed 4 times. What is the probability of getting exactly two heads on 4 tosses? (A) (B) (C) 4 2 1 4 1 4 1 4 2 2 3 4 2 2 3 4 2 (D) 1 − 3 4 2 (E) none of the above. S4.5). (c) Find E(X). .Questions for Chapter 4 99 (E) none of the above.

P Y ≥ 1 2 . Find the value of c.7 Let X be a continuous random variable with probability density function fX (x) = 1/θ if 0 ≤ x ≤ α 0 otherwise What relationship must hold between θ and α? .4 Let X be a discrete random variable with probability mass function pX (−1) = p pX (0) = 1 − 2p pX (1) = p The value of p is unknown.3 Let X be a random variable with probability density function fX (x) = 3√ 2 x 0 if 0 ≤ x ≤ 1 otherwise Let F denote the cumulative distribution function of X .5 Let Y be a random variable with probability density function fY (y) = 3y 2 if 0 ≤ y ≤ 1 0 otherwise (a) Find the cumulative distribution function for Y .25). P (Y ≤ 0). (b) Compute P Y ≤ (c) Find E(Y ). 4. (a) Compute F (0. Identify all possible values for p. and F (−0. 4. (c) Find E(X). and P 1 3 ≤Y ≤ 1 2 .25).25). F (1.6 Let X be a continuous random variable with probability density function fX (x) = cx if 0 ≤ x ≤ 2 0 otherwise 1 2 .100 Questions for Chapter 4 4.3). 4. (b) Compute P (−1 ≤ X ≤ 0. 4.

the daily demand for an item.9 Suppose that D. 3. is a random variable with the following probability distribution: P (D = k) = c2k k! k = 1. . .2 Draw a plot of the cumulative distribution function (CDF) for X . (b) Compute the expected demand. (b) Find the probability density function fX (x) for X and the probability mass at π/4. what is the expected number of weekly accidents? 4. 4.11 True or False: If X has a Poisson distribution and P (X = 0) = e−1 . then P (X = 1) = e−1 .10 The probability is p that there will be at least one accident in a certain factory in any given week. There is a positive point mass at π/4. (a) Evaluate the constant c. 4. 4. If the number of weekly accidents in this factory is distributed with a Poisson distribution. (a) Compute P (X ≤ π/8). (c) Find E(X). .12 Let X be a discrete random variable with probability mass function pX (−2) = 0.Questions for Chapter 4 101 4.4 pX (0) = 0. 2.2 pX (−1) = 0.2 pX (+2) = 0.8 The angle of ascent of an erratic rocket is a random variable X having the cumulative distribution function given by   1  FX (x) = if x > π/4 sin x if 0 ≤ x ≤ π/4   0 if x < 0 Note: This is an example of a mixed probability distribution with both discrete and continuous probability mass. .

5).14 The Classey Gassey gasoline company produces gasoline at two plants (A and B ) and then delivers the gasoline by truck to gasoline stations where it is sold. what is the probability that the shipment is Low grade gasoline? . then the gasoline is Low grade if High grade if r < 85 r ≥ 85 When gasoline is delivered. (b) Let FX (x) denote the cumulative distribution function for X .13 Suppose that X is a random variable with the following probability density function: 2x if 0 < x < 1 fX (x) = 0 otherwise (a) Compute P (X > 0. Find FX (0.005(x − 80) 0 if 80 < x < 100 otherwise The octane rating of gasoline produced at Plant B is a continuous random variable Y with probability density function fY (y) = 0. The octane rating of gasoline produced at Plant A is a continuous random variable X with probability density function fX (x) = 0.5). (a) Given only the above information. If the octane rating is r.5 ≤ X ≤ 0.5). (c) Compute P (−0. A truck has arrived with today’s shipment of gasoline.02(y − 80) 0 if 80 < y < 90 otherwise The grade of the gasoline is determined by its octane rating. You are the owner of a Classey Gassey gasoline station. 4. there is an equal probability that it was produced at Plant A or Plant B .102 Questions for Chapter 4 4.

He also knows that there is a proﬁt of \$1. 4. 2. in fact. Low grade gasoline. 3 (a) Compute F (−11).15 Suppose that X is a random variable with the following probability density function: cx if 0 < x < 1 fX (x) = 0 otherwise where c is a constant. ﬁnd the baker’s expected proﬁt for .16 Let X be a random variable with probability density function fX (x) = 2/x2 if 1 ≤ x ≤ 2 0 otherwise Let F denote the cumulative distribution function of X .17 Let X be a discrete random variable with probability mass function pX (−4) = 1/4 pX (4) = 1/8 pX (−π) = 1/4 pX (π) = 1/8 pX (0) = 1/4 Let F denote the cumulative distribution function of X .Questions for Chapter 4 103 (b) You actually test today’s shipment of gasoline and discover that it is. (a) What is the value of the constant c? (b) Find P (X < 0.00 on each cake that he sells and a loss (due to spoilage) of \$0. (a) Compute F (−5). F ( 2 ). (b) Compute E(X). Assuming that each cake can be sold only on the day it is made.18 The manager of a bakery knows that the number of chocolate cakes he can sell on any given day is a random variable with probability mass function p(x) = 1 6 for x = 0. 1. 4. What is the probability that the shipment came from Plant A? 4. 4 and 5. and F (14). 4.5). F (π) and F (2π). F (−3). 3. (b) Find E(X).40 on each cake that he does not sell.

. X .. may exceed your supply. Billy plans to reach into the bag and draw out one pretzel at a time. what is her expected proﬁt? 4.104 Questions for Chapter 4 (a) a day when he bakes 5 chocolate cakes. (b) Find E(X). . 5. Let the random variable X denote the number of pretzels that Billy will draw from the bag. . (b) a day when he bakes 4 chocolate cakes. The probability distribution for Y will depend on the number of cakes you decide to bake that day. 6. Using the probability distribution for X (the number of customer asking for cakes) ﬁnd the probability distribution for Y . (a) Find P (X = k) for k = 4. You can assume that there is an inﬁnite number of pretzels in the bag with equal proportions of each shape. stopping only when he has at least one pretzel of each type. (c) How many chocolate cakes should he bake to maximize his expected proﬁt? 4.20 (†) A large bag contains pretzels with 4 different shapes: heart. . Be careful to consider the possibility that the demand for your cakes. circle and star.19 If a contractor’s proﬁt on a construction job is a random variable with probability density function fX (x) = 1 18 (x + 1) 0 −1 ≤ x ≤ 5 otherwise where the units are \$1000. Hint: Deﬁne the random variable Y as the actual daily proﬁt (or loss). You can then answer the question using the random variable Y . square.

5 FUNCTIONS OF RANDOM VARIABLES When you do dance. we could evaluate the cumulative distribution function for Y . but our neighbors in Fort Erie. They are convenient. they are interested in the event A = {ω ∈ Ω : Y (ω) ≤ 20} where Ω is the sample space of the underlying experiment. that you might ever do Nothing but that. Can we still ﬁnd P (A)? 105 . at the point y = 20. move still. still so. This chapter deals with this third “feature” of random variables. and often obvious. Note that Ω is being mapped by both of the random variables X and Y into the real line. FY (y). New York temperature at 4:00 pm next Monday in degrees Fahrenheit. The Winter’s Tale (c 1600) Basic Concepts Two of the primary reasons for considering random variables are: 1. And own no other function. They allow us to restrict our attention to sample spaces that are subsets of the real line. 2. But suppose we only know the cumulative distribution function for X . and probably more important reason for using random variables: 3. This random variable might be suitable for the use of Buffalonians. There is a third. They permit algebraic manipulation of event deﬁnitions. Furthermore. As an example. representations of experimental outcomes. In order to compute P (A). I wish you A wave o’the sea. namely FX (x). – WILLIAM SHAKESPEARE. except in degrees Celsius. Ontario prefer the random variable Y representing the same Buffalo temperature. let X be the random variable representing the Buffalo.

The general concept surrounding the above example is the notion of a function of a random variable. 9 which is nothing more than the equation of a straight line. knowing a relationship like Equation 1 between X(ω) and Y (ω) for every ω ∈ Ω.1. notice that for any ω ∈ Ω.1 The function h(x) maps merely points in the real line (the sample space for X ) into points in the real line (the sample space for Y ) as in Figure 5. To show you what we mean by this.” First. (1) Y (ω) = 5 9 Basic Concepts X(ω) − 32 . In order to simplify the notation. Note that we should have really written this as Y (ω) = h(X(ω)). if we know the probability distribution for X . mathematicians usually will drop the (ω). So. allows a representation of the event A in terms of either the random variable X or the random variable Y . So. we can use the shortcut notation presented in Chapter 5: P (A) = P (Y ≤ 20) = P ( 5 (X − 32) ≤ 20) 9 = P (X − 32 ≤ 36) = P (X ≤ 68) = FX (68). If X is a random variable. 1 . we also have Y = h(X) as a random variable. we can compute probabilities of events expressed in terms of the random variable Y . When ﬁnding the probability of the event A.106 The answer is “yes. We can then redeﬁne the event A in terms of X as follows: A = {ω ∈ Ω : Y (ω) ≤ 20} = ω∈Ω : 5 9 (X(ω) − 32) ≤ 20 = {ω ∈ Ω : X(ω) − 32 ≤ 36} = {ω ∈ Ω : X(ω) ≤ 68} . let the function h(x) be deﬁned by h(x) = 5 (x − 32).

it would be very handy to have a tool that would enable us to compute probabilities involving Y using the facts we already know about X . the folks in Fort Erie are asking us to compute P (Y ≤ 20). Under those circumstances. But consider the case where we have already deﬁned X and then are given a functional relationship between X and Y . we can compute P (X ≤ t) for any value of t.3. An even more general situation appears in Figure 5. Returning to our example.e.2. Suppose we are given a cumulative distribution function for the random variable X . To oblige. measuring Monday’s 4 pm temperature) that would result in announcing that the temperature in degrees Celsius in 20 or less?” Call this event A and compute the probability of A from the distribution of the random variable X. at ﬁrst. Also assume that we are given the cumulative . This gives us the cumulative distribution function FX (t) and allows us to compute the probability of many other events..1. suppose we are given a random variable X .1: The mapping Y (ω) = h(X(ω)) The random variable X serves as a “middleman” in the mapping shown in Figure 5. the Buffalo temperature in degrees Fahrenheit. we ask ourselves. It may seem. more obvious to deﬁne Y as mapping directly from Ω to the real numbers.Basic Concepts 107 Ω ω X( ) Real line Y( ) Real line 212 100 Figure 5. Referring to Figure 5. you can see that what is happening is really quite simple. Using this. “What are all of the possible outcomes of the experiment (i. In this case.

2: Representing the event {Y ≤ 20} y y= h(x) B x = h-1(B) Figure 5.3: A more complicated example .108 Basic Concepts y 20 y= ) h(x x 68 A Figure 5.

Let A be the smallest subset of R such that P (Y ∈ A) = 1. Some examples Discrete case Example: Let X be a discrete random variable with probability mass function pX (−2) = pX (−1) = pX (0) = pX (1) = pX (2) = 1 5 1 5 1 5 1 5 1 5 Let Y = X 2 . Then A is called the support of the distribution for the random variable Y .1. also the probability of observing Y taking on a value in B . x = −1 into 1.1. . A function of any discrete random variable is also a discrete random variable. −1. as we will see in a later example. 4}. The probability of any of these particular outcomes of X is. 0. x = 0 into 0. Such a set is called the support for the random variable Y .Some examples 109 distribution function for X so that we are capable of computing the probability measure of any subset of the sample space for X . “Since X can take on the values −2. Deﬁnition 5. We would like to ﬁnd the probability distribution for Y . what values can Y assume?” We see that the function h(x) = x2 maps the point x = −2 into the point 4. Solution: There is a pleasant fact that helps when working with discrete random variables: Theorem 5.3) we can compute the probability measure of any subset B in the sample space for Y by determining which outcomes of X will be mapped by h into B . To solve the problem. we cannot make such a statement for continuous random variables. therefore. x = 1 into 1 and x = 2 into 4. Unfortunately. So all of the probability mass for the random variable Y is assigned to the set {0. If we then deﬁne the random variable Y = h(X) (where h is the function shown in Figure 5. 1. 1 or 2. we should ﬁrst ask ourselves.

  0  . we should ask ourselves. First.” With no elaborate computation. Find the distribution for the random variable Y = X 2 .110 Some examples Since X will take on the value −2 with probability 1 and the value 2 with probability 5 1 . Let’s try to ﬁnd FY (y). 1). and since each of these mutually exclusive events will result in Y taking on the 5 value 4. what values can Y take on?” Since Y = X 2 . we now know that if y < 0 ? if 0 ≤ y ≤ 1 FY (y) =   1 if y > 1. i. we can say that 2 pY (4) = pX (−2) + pX (2) = 5 . it is easier to deal with the cumulative distribution functions. the density for X is given by fX (x) = if −1 ≤ x ≤ 1 0 otherwise 1 2 and the cumulative distribution function is x+1 FX (x) =  2   1   0   if x < −1 if −1 ≤ x ≤ 1 if x > 1. Solution: For continuous random variables.. the answer is “Y can assume values from 0 to 1. pY (1) = 2 and pY (0) = 1 . Hence the probability distribution for Y is 5 5 given by the probability mass function: pY (0) = pY (1) = pY (4) = 1 5 2 5 2 5 Continuous case Example: Let X be a uniformly distributed random variable on the interval (−1.e. Similarly. “Since X can take on only values from −1 to +1.

i... Look at Figure 5. Example: Let X be a uniformly distributed random variable on the interval (0. We then have FY (y) = P (Y ≤ y ) = P X2 ≤ y √ √ = P − y≤X≤ y √ √ = FX y − FX − y √ √ y+1 − y+1 = − 2 √ 2 = y. let h be given by the step function shown in Figure 5. 2). and we have our ﬁrst example of a continuous random variable being mapped into a discrete random variable. Find the distribution for Y = h(X). We immediately know that the random variable Y is discrete.4 again and try to visualize why this happened.e. That task is left to you. that is 0 ≤ y ≤ 1. Solution: Since X takes on the values ranging continuously from 0 to 2.e. i. 1 and 2 (why 2?). ﬁrst let’s try to ﬁnd pY (0). Y can take on only the integers 0. So. FY (y) =   0  √ if y < 0 y if 0 ≤ y ≤ 1   1 if y > 1 Finding the probability density function for Y is now easy.Some examples 111 Now suppose that y is equal to a value in the only remaining unknown case. We know that Y takes on the value 0 when X takes on any value from 0 to 1 (but not including . To ﬁnd the probability mass function for Y .4. the density for X is given by fX (x) = if 0 ≤ x ≤ 2 0 otherwise   0   1 2 and the cumulative distribution function is x FX (x) =  2   1 if x < 0 if 0 ≤ x ≤ 2 if x > 2 Let h(x) be the greatest integer less than or equal to x.

Hence. . A function h : R → R is called one-to-one. pY (1) = 1 1 dx = 2 2 and 1 dx = 0. if for every y ∈ R there exists some x ∈ R such that y = h(x). 2 2 which is good thing because we ran out of probability mass after the ﬁrst two computations. 2 Similarly. pY (0) = 0 1 1 x dx = 2 2 2 1 1 0 1 = .112 Invertible Functions of Continuous Random Variables y 3 2 1 x -2 -1 1 2 3 4 A Figure 5. if h(a) = h(b) implies a = b. A function h : R → R is called onto.2. 1 pY (0) = pY (1) = 2 . Invertible Functions of Continuous Random Variables First we need a few concepts about inverse functions from calculus: Deﬁnition 5. pY (2) = 2 So we have the answer.4: The greatest integer function 1).

Suppose X is a continuous random variable with probability density function fX (x). If one considers the function Y = h(X) as a transformation (x) → (y). take the ﬁrst derivative of both sides with respect to y and use the chain rule: d FY (y) = dy d FX (h−1 (y)) dy d fY (y) = fX (h−1 (y)) h−1 (y) dy We have just proven the following theorem: Theorem 5. If h(x) a strictly increasing function and h(x) has an inverse.2. Suppose h(x) has an inverse and is increasing (or decreasing). If h(x) is a strictly increasing function with a differentiable inverse h−1 (·). Therefore. Note: This transformation technique is actually a special case of a more general technique from calculus. then we can deﬁne the inverse function h−1 : R → R such that h−1 (y) = x if and only if h(x) = y . and Y = h(X). Then.Invertible Functions of Continuous Random Variables 113 If a function h(x) is both one-to-one and onto. the monotonicity property of h allows us to uniquely solve . we have h−1 (a) ≤ h−1 (b). This produces the following relationship between the cumulative distribution function for X and the cumulative distribution function for Y : FY (y) = P [Y ≤ y] = P [h(X) ≤ y] FY (y) = P [h−1 (h(X)) ≤ h−1 (y)] FY (y) = P [X ≤ h−1 (y)] FY (y) = FX (h−1 (y)) To get the relationship between the densities. we can develop a general relationship between the probability density function for X and the probability density function for Y . if X is a continuous random variable. then h−1 (x) is nondecreasing. for any a ≤ b. then the random variable Y = h(X) has probability density function d fY (y) = fX (h−1 (y)) h−1 (y) dy We will leave it to you to ﬁnd a similar result for the case when h(x) is decreasing rather than increasing.

2].2 to get fY (y) = fX (h−1 (y)) = fX (y 3 ) 1 1 1 d −1 h (y) dy d 1 y3 dy 2 = fX (y 3 )( 1 )y − 3 3 When y < −8 or y > 8 we have y 3 < −2 or y 3 > 2. fX (y 3 ) = 0 hence fY (y) is zero. Since h−1 (y) = y 3 . That solution is x = h−1 (y). has a nonnegative ﬁrst derivative with respect to y . Find the probability density function for Y without explicitly ﬁnding the cumulative distribution function for Y . when solving transformation problems with n variables. So the support for Y is [−8.114 Invertible Functions of Continuous Random Variables y = h(x). We can then express the probability density function of Y as fY (y) = fX (h−1 (y))J(y) where J(y) is the following 1 × 1 determinant: J(y) = ∂ −1 ∂x ∂ −1 = h (y) = h (y) ∂y ∂y ∂y since h−1 (y) is also an increasing function of y and. we can use Theorem 5. Example: Let X be a continuous random variable with density for X given by fX (x) = |x|/4 if −2 ≤ x ≤ 2 0 otherwise Let Y = h(X) = X 3 . In calculus. When −8 ≤ y ≤ 8 fY (y) = = = 1 1 1 −2 3 3 4 |y |( 3 )y 1 1 1 3 √ 12 |y | 3 y 2 1 −1 3| 12 |y 1 1 1 . thus. 8]. the resulting n × n determinant is called the Jacobian of the transformation. Solution: The support for X is [−2. In those cases.

Expected values for functions of random variables 115 Summarizing fY (y) = |y − 3 |/12 if −8 ≤ y ≤ 8 0 otherwise 1 Expected values for functions of random variables Theorem 5. What are your expected earnings? Theorem 5. be a continuous random variable with probability density function fX (x) = 2e−2x x ≥ 0 0 otherwise Suppose you earn ln(x) + x2 − 3 dollars if the transistor lasts exactly x hours.4. Proof. then E(Y ) = E(h(X)) = all x h(x)pX (x) If X is a continuous random variable with probability density function fX (x). the life length of a transistor. your expected winnings. . If X is a discrete random variable with probability mass function pX (x). Let X = 1 if heads. . .3. then E(Y ) = E(h(X)) = +∞ −∞ h(x)fX (x) dx Question: Toss a coin. Let h(X) denote your winnings. Question: Let X . Let Y = h(X). We are really ﬁnding E(h(X)) where h(x) = a for all x. . −5 x = 0 h(x) = 10 x = 1 Find E(h(X)). E(a) = a if a is a constant. We will provide proofs for both the discrete and continuous cases.. Suppose X is a discrete random variable with probability mass function pX (xk ) = P (X = xk ) = pk for k = 1. 2. X = 0 if tails.

. We are ﬁnding E(h(X)) where h(x) = ax. Then E(h(X)) = E(aX) = ∞ −∞ ∞ h(x)fX (x) dx axfX (x) dx xfX (x) dx = a(E(X)) = a −∞ ∞ −∞ . E(aX) = aE(X) if a is a constant.5.. Then E(h(X)) = E(a) = ∞ −∞ ∞ h(x)fX (x) dx afX (x) dx fX (x) dx = a(1) = a = a −∞ ∞ −∞ Theorem 5. 2. . Proof. Suppose X is a discrete random variable with probability mass function pX (xk ) = P (X = xk ) = pk for k = 1.116 Expected values for functions of random variables Then E(h(X)) = k h(xk )pk apk k E(a) = = a pk = a(1) = a k Suppose X is a continuous random variable with probability density function fX (x). Then E(h(X)) = k h(xk )pk axk pk k E(aX) = = a xk pk = a(E(X)) k Suppose X is a continuous random variable with probability density function fX (x). .

2. Then E(h(X)) = k h(xk )pk (xk + b)pk = k k E(X + b) = = k (xk pk + bpk ) xk pk + k bpk pk = E(X) + b(1) k = k xk pk + b Suppose X is a continuous random variable with probability density function fX (x). Suppose X is a discrete random variable with probability mass function pX (xk ) = P (X = xk ) = pk for k = 1.. Then E(h(X)) = E(X + b) = = ∞ −∞ ∞ −∞ ∞ −∞ h(x)fX (x) dx (x + b)fX (x) dx = xfX (x) dx + ∞ −∞ ∞ −∞ (xfX (x) + bfX (x)) dx bfX (x) dx = E(X) + b(1) Theorem 5.7.Expected values for functions of random variables 117 Theorem 5. We are ﬁnding E(h(X)) where h(x) = h1 (x) + h2 (x).. Suppose X is a discrete random variable with probability mass function pX (xk ) = P (X = xk ) = pk for k = 1. 2. . E(X + b) = E(X) + b if b is a constant. Proof. . Then E(h(X)) = k h(xk )pk . . .6. . . We are ﬁnding E(h(X)) where h(x) = x + b. E[h1 (X) + h2 (X)] = E[h1 (X)] + E[h2 (X)] for any functions h1 (X) and h2 (X) Proof.

3.7 with h1 (x) = x and h2 (x) = b and you get Theorem 5.118 The k th Moment of X E[h1 (X) + h2 (X)] = k [h1 (xk ) + h2 (xk )]pk [h1 (xk )pk + h2 (xk )pk ] k = = k h1 (xk )pk + k h2 (xk )pk = E[h1 (X)] + E[h2 (X)] Suppose X is a continuous random variable with probability density function fX (x).7. we get the following deﬁnition: Deﬁnition 5. The kth Moment of X If we let h(x) = xk and consider E[h(X)].6 is really only a corollary to Theorem 5. . Then E(h(X)) = E[h1 (X) + h2 (X)] = = = ∞ −∞ ∞ −∞ ∞ −∞ ∞ −∞ h(x)fX (x) dx (h1 (x) + h2 (x))fX (x) dx (h1 (x)fX (x) + h2 (fX (x)) dx h1 (x)fX (x) dx + ∞ −∞ h2 (x)fX (x) dx = E[h1 (X)] + E[h2 (X)] Notice in the all of the above proofs the close relationship of Discrete X and Continuous X ∞ and k −∞ P (X = k) = pX (xk ) = pk and P (x ≤ X ≤ x + dx) = fX (x) dx Also notice that Theorem 5.6. The quantity E(X k ) is deﬁned as the kth moment about the origin of the random variable X . Simply use Theorem 5.

8. Deﬁnition 5. The variance of a random variable X is Var(X) ≡ E[(X − E(X))2 ] . The standard deviation of a random variable X is given by σX ≡ Var(X). Using the properties of expected value. we get Var(X) = E[(X − E(X))2 ] = E[X 2 − 2XE(X) + [E(X)]2 ] 2 2 expand the square from Theorem 5. If we let h(x) = (x − E(X))2 and compute E(h(X)) we get a very useful result: Deﬁnition 5. Notation: We often write E(X) = µX 2 Var(X) = σX Theorem 5. Var(X) = E(X 2 ) − [E(X)]2 Proof.5 2 = E[X ] + E[−2XE(X)] + E[[E(X)]2 ] = E[X ] − 2E(X)E[X] + E[[E(X)] ] . and Var(X) = +∞ −∞ (x − E(X))2 pX (x) all x (x − E(X))2 fX (x) dx for a continuous random variable X . This can be interpreted as the moment of inertia of the probability mass distribution for X .Variance 119 Variance Recall that E(X) is simply a number. Note: The value of Var(X) is the second moment of the probability distribution about the expected value of X .7 from Theorem 5.5.4. Note that Var(X) = for a discrete random variable X .

Var(aX) = a2 Var(X) Proof.8 Theorem 5. Var(X + b) = Var(X) Proof.10. we get Var(X + b) = E[((X + b) − E[X + b])2 ] = E[(X + b − E[X] − b)2 ] = E[(X − E[X])2 ] = Var(X) The proof of the following result is left to the reader: Theorem 5. Using Theorem 5.11.120 Variance = E[X 2 ] − 2E(X)E[X] + [E(X)]2 = E[X ] − 2[E[X]] + [E(X)] = E[X 2 ] − [E[X]]2 2 2 2 from Theorem 5.8.4.4 Theorem 5. Let Y = h(X) then Var(Y ) = E [h(X) − E(h(X))]2 Example: Suppose X is a continuous random variable with probability density function 3x2 if 0 ≤ x ≤ 1 fX (x) = 0 otherwise . Using Deﬁnition 5.5 = a2 (E[X 2 ] − [E(X)]2 ) Using Theorem 5. we get Var(aX) = E[(aX)2 ] − [E(aX)]2 = E[a2 X 2 ] − [E(aX)]2 = a2 E[X 2 ] − [aE(X)]2 = a E[X ] − a [E(X)] = a2 Var(X) 2 2 2 2 Using Theorem 5.9.

. Suppose Y = h(X).2 We have already shown that. Meyer. . given a random variable X and function Y = h(X). Then (2) (3) 2 h (µ) 2 σ 2 Var(Y ) ≈ [h (µ)]2 σ 2 E(Y ) ≈ h(µ) + Meyer. However. Addison-Wesley.12. then even using these short-cut methods for computing E(h(X)) and Var(h(X)) may result in difﬁcult integrals. we can get good results using the following approximations: Theorem 5. L. Let X be a random variable with E(X) = µ and Var(X) = σ 2 . Reading MA. Introductory probability theory and statistical applications. This avoids the need for explicitly ﬁnding the probability distribution for the random variable Y = h(X) if we are only computing expected values. 1965. Then E(Y ) = 0 1 h(x)(3x2 ) dx = = 1 (3x2 ) dx 0 x 3 21 3 x = 2 0 2 3 2 1 Then we can use E(Y ) = E(h(X)) = Var(Y ) = 0 1 to compute (h(x) − E(h(X)))2 (3x2 ) dx = = = 1 3 2 − (3x2 ) dx x 2 0 1 1 3 9 (3x2 ) dx − + 2 x x 4 0 1 1 1 27 3 3 dx − 9x dx + x2 dx = 4 0 0 0 4 1 Approximating E(h(X)) and Var(h(X)) This section is based on material in a text by P. if the function h(·) is very complicated. we can compute E(Y ) = E(h(X)) directly using only the probability distribution for X . Often. P.Approximating E(h(X)) and Var(h(X)) 121 Let Y = h(X) = 1/X .

and the random variable X represents the monthly demand (in terms of the number of items sold). the expected sales ﬁgure is “plugged into” the proﬁt function. A common mistake when using expected values is to assume that E(h(X)) = h(E(X)). Y = h(µ) + (X − µ)h (µ) + (X − µ)2 h (µ) + R1 2 where R1 is the remainder of the expansion. Then E(X) = µ represents the expected number of items sold in a month. Unless h is linear. For example. h (µ) 2 σ 2 h (µ) 2 h(µ) − E(Y ) ≈ − σ 2 E(Y ) ≈ h(µ) + .12 provides an approximation for the error if you incorrectly use h(µ) rather than E(h(X)).122 Approximating E(h(X)) and Var(h(X)) Proof. For the second approximation.) Expand the function h in a Taylor series about x = µ to two terms. believing the result is the expected proﬁt. Equation (2) in Theorem 5. Discard the remainder term R1 and take the expected value of both sides to get Equation (2). Discard the term R2 and take the variance of both sides to get Equation (3). (This is only a sketch of the proof. This is not generally true. suppose we are selling a product. this will not be the same as h(µ) = h(E(X)). Often. the expected proﬁt is actually E(Y ) = E(h(X)). expand h in a Taylor series about x = µ to only one term to get Y = h(µ) + (X − µ)h (µ) + R2 where R2 is the remainder. We obtain. In fact. the number h(µ) is incorrectly interpreted as the expected proﬁt. Now suppose. That is. Y = h(X) is the actual proﬁt from selling X items in a month.

where T is the temperature of the liquid (degrees centigrade).005t)0. the surface tension of a liquid (dyn/cm) is given by the formula S = 2(1 − 0.8 Since µ = 15.82 (dyne/cm) Var(S) ≈ 75[h (15)] = 0. Example: (Meyer) Under certain conditions.005t)1. your answer will be wrong by approximately ∆=− h (µ) 2 σ 2 Note that if h is linear.Approximating E(h(X)) and Var(h(X)) 123 Therefore.012(1 − 0.87 2 (dyne/cm)2 .5.6 and 5.12.005t)−0.2 .000012(1 − 0. and ∆ = 0. using Theorem 5. Suppose T is a continuous random variable with probability density function fX (x) = 3000t−4 if t ≥ 10 0 otherwise Therefore. we will use h(t) = 2(1 − 0. h (x) = 0. if you incorrectly state that h(µ) is your expected monthly proﬁt. 5.82 h (15) = 0.2 h (t) = 0.2 h (t) = −0.01 h (15) = 0.005(15))−0. results in E(S) ≈ h(15) + 75h (15) = 1. E(T ) = ∞ 10 3000t−3 dt = 15 (degrees centigrade) Var(T ) = E(T 2 ) − (15)2 = ∞ 10 3000t−2 dt − 225 = 75 (degrees centigrade)2 In order to use Equations (2) and (3). we have h(15) = 1. This result for the linear case is also provided by Theorems 5.7.000012(1 − 0.005T )1.8 ≈ 0+ and.

Rural country.13. +∞). and 3 boon·docks (b¯ on’d˘ ks’) plural noun. µ−a −∞ +∞ µ+a (x − µ)2 fX (x) dx ≥ (x − µ)2 fX (x) dx ≥ µ−a −∞ +∞ µ+a a2 fX (x) dx a2 fX (x) dx. (The proof for X discrete is similar using summations rather than integrals. .” Source: The American Heritage R Dictionary of the English Language. the inequality establishes a limit on the amount of probability mass that can be found at distances far away from E(X) (the boondocks3 of the probability distribution). µ − a] ∪ [µ + a. given in the following Theorem 5. For a given Var(X). Fourth Edition. for all x over this region of integration.) We have for any a > 0 +∞ σ2 = = (x − µ)2 fX (x) dx −∞ µ−a µ+a +∞ (x − µ)2 fX (x) dx + −∞ µ−a µ−a (x − µ)2 fX (x) dx + µ+a +∞ (x − µ)2 fX (x) dx = −∞ µ−a (x − µ)2 fX (x) dx + something ≥ 0 + µ+a +∞ (x − µ)2 fX (x) dx ≥ −∞ (x − µ)2 fX (x) dx + µ+a (x − µ)2 fX (x) dx µ+a since µ−a (x − µ)2 fX (x) dx ≥ 0. the backwoods. Houghton Mifﬂin Company. Hence.13. We will prove this result when X is a continuous random variable. Theorem 5.124 Chebychev’s Inequality Chebychev’s Inequality Chebychev’s inequality. is applicable for any random variable X with ﬁnite variance. Therefore. Note that (x − µ)2 ≥ a2 for every x ∈ (−∞. and fX (x) dx replaced by the probability mass function for X . Let X be a discrete or continuous random variable with E(X) = µ and Var(X) = σ 2 . Example usage: The o¯ o 1965 Billy Joe Royal Song: “Down in the Boondocks. we have (x − µ)2 fX (x) ≥ a2 fX (x). Then for any a > 0 P (|X − µ| ≥ a) ≤ σ2 a2 Proof.

25.25 = 0. 4 Thanks to Dr. Then for any a > 0 P (|X − µ| < a) ≥ 1 − σ2 a2 Example:4 The number of customers who visit a car dealer’s showroom on a Saturday morning is a random variable. has determined that the mean and variance of X are 18 and 6. a former EAS305 student. Using extensive prior experience. we get σ2 ≥ ≥ µ−a −∞ µ−a −∞ (x − µ)2 fX (x) dx + a2 fX (x) dx + +∞ µ+a +∞ µ+a (x − µ)2 fX (x) dx a2 fX (x) dx = a2 P (X ≤ µ − a) + a2 (P (X ≥ µ + a) = a2 P (|X − µ| ≥ a) Note: X can be discrete. the manager. . so P (8 < X < 28) = P (8 − 18 < X − 18 < 28 − 18) = P (−10 < X − 18 < 10) = P (|X − 18| < 10) = P (|X − µ| < 10) ≥ 1 − 6. An equivalent way to state Chebychev’s inequality is Let X be a discrete or continuous random variable with E(X) = µ and Var(X) = σ 2 .9375 102 This is the best statement we can make without knowing more about the distribution for X . What is the probability that there will be between 8 and 28 customers on any given Saturday? Solution We are given E(X) = µ = 18 and Var(X) = σ 2 = 6. Linda Chattin at Arizona State University for this example.25 respectively.Chebychev’s Inequality 125 So. X .

and Var(X) = σ 2 . Theorem 5. . with probability one. µ and σ . The Normal Distribution Deﬁnition 5. we often say that the sequence { n Xn } converges to p almost surely.15. The theorem guarantees that.} 2 3 The strong law states that. In 1 this case. The above result is the weak law of large numbers. Let an experiment be conducted n times.14. Let Xn ≡ number of times the event A occurs p ≡ P (A) Then for any > 0 n→∞ lim P Xn −p < n = 1. the probability that the ratio Xn /n is observed to be “close” to p approaches one. σ 2 ) then E(X) = µ.126 The Normal Distribution The Law of Large Numbers This is an important application of Chebychev’s inequality. The random variable X is said to have a normal distribution if it has probability density function 1 1 fX (x) = √ e− 2 σ 2π x−µ 2 σ for −∞ < x < ∞ Note that X has two parameters. 1 X2 . .6. as n gets large. 1 X3 . . We often say that “X is distributed normally with mean µ and variance σ 2 ” We write X ∼ N (µ. this sequence converges to p. There is a related result called the strong law of large numbers that makes a similar (but stronger statement) about the entire sequence of random variables 1 {X1 . with each outcome of the experiment independent of all others. Theorem 5. σ 2 ) . If X ∼ N (µ. 4 X4 .

1). Suppose X ∼ N (µ.e.The Normal Distribution 127 One of the amazing facts about the normal distribution is that the cumulative distribution function (4) FX (a) = a −∞ 1 1 √ e− 2 σ 2π x−µ 2 σ dx does not exist in closed form. If X ∼ N (µ. there exists no function in closed form whose derivative is 1 1 fX (x) = √ e− 2 σ 2π x−µ 2 σ The only way to compute probabilities for normally distributed random variables is to evaluate the cumulative distribution function in Equation 4 using numerical integration. 1). the probability density function for X is 1 x−µ 2 1 for −∞ < x < ∞ fX (x) = √ e− 2 σ σ 2π Deﬁne the function x−µ h(x) = σ .. to an equivalent probability statement involving only a normal random variable with mean zero and variance one. Z . In other words. σ 2 ). σ Proof. We can then refer to a table of N (0. 1) probabilities. then the random variable Z= X −µ ∼ N (0. We denote the cumulative distribution function for Z by Φ(a) = P (Z ≤ a).7. and thus avoid numerically integrating Equation 4 for every conceivable pair of values µ and σ . Theorem 5. there is an important property of the normal distribution that allows us to convert any probability statement. If X ∼ N (µ. σ 2 ).16. Such a table is provided in Table I of Appendix A. such as P (X ≤ a). Therefore. The standard normal random variable. Deﬁnition 5. Z ∼ N (0. has probability density function 1 2 1 φ(z) = √ e− 2 z 2π i. σ 2 ).

Deﬁne the random variable Z= X −µ . the probability density function for the random variable Z = h(X) is fZ (z) = fX (h−1 (z)) = = d −1 h (z) dz 1 (σz+µ)−µ 2 1 σ √ e− 2 σ for −∞ < σz + µ < ∞ σ 2π 1 2 1 √ e− 2 z for −∞ < z < ∞ 2π Note that we couldn’t use the cumulative distribution function of X to ﬁnd the cumulative distribution function of Z since neither cumulative distribution function exists in closed form.128 The Normal Distribution Note that z = h(x) is an invertible. Var(Z) = 1. X −µ If X ∼ N (µ. Theorem 5. Proof.17. Suppose X is any random variable with ﬁnite mean E(X) = µ and ﬁnite. σ 2 ). Z has a normal distribution Actually. nonzero variance Var(X) = σ 2 . σ Then E(Z) = 0 and Var(Z) = 1. We have E(Z) = E X −µ σ . increasing function of x with h−1 (z) = σz + µ d −1 h (z) = σ dz Using Theorem 5. the ﬁrst two results would be true even if X did not have a normal distribution: Lemma 5. E(Z) = 0 2.2. and 3.16 tells us three things about Z = σ 1.

This is just a special case of Theorem 5. If a > 0. then any linear function h(X) = aX + b is also normally distributed as long as a = 0. When we perform the above transformation (i. Proof. then the function Y = h(X) is strictly increasing and if a < 0. we say we are standardizing the distribution of X ..16 is that if X is normally distributed. E(Z) = 0 and Var(Z) = 1. If X ∼ N (µ.) With a little additional work we can show that id X is normally distributed.16.e. σ 2 ). This fact is provided by the following theorem: Theorem 5. A distribution that has zero mean and unit variance is said to be in standard form.2. What is important in Theorem 5. . the random variable Y = aX + b ∼ N (aµ + b. then for any constants a = 0 and b. a2 σ 2 ). Z = (X − µ)/σ ).The Normal Distribution 129 = E = = = 1 (X − µ) σ 1 E[X − µ] σ 1 (E[X] − µ) σ 1 (0) = 0 σ and Var(Z) = Var X −µ σ 1 = Var (X − µ) σ 1 Var[X − µ] = σ2 1 = Var[X] = 1 σ2 So. the function Y = h(X) is strictly decreasing. then the linear transformation of X . it is no surprise that. in Theorem 5.18. X −µ Z= σ is also normally distributed. namely.

130

The Normal Distribution

We know

1 1 fX (x) = √ e− 2 σ 2π

x−µ 2 σ

Suppose a > 0, then
FY (y) = P (Y ≤ y) = P (aX + b ≤ y) y−b = P X≤ a y−b = FX a

Take derivatives on both sides and use the chain rule to obtain d y−b d fY (y) = FY (y) = FX dy dy a y−b 1 = fX a a
= =

1 √ e σ 2π 1 1 √ e− 2 aσ 2π

−1 2

y−b a −µ σ

2

1 a

y−(aµ+b) 2 aσ

Hence, Y ∼ N (aµ + b, a2 σ 2 ). Now suppose a < 0, then
FY (y) = P (Y ≤ y) = P (aX + b ≤ y) y−b = P X≥ a y−b y−b = P X> +P X = a a y−b y−b +P X = = 1 − FX a a = and since X is continuous. . . y−b +0 = 1 − FX a

Take derivatives on both sides and use the chain rule to obtain d d y−b fY (y) = FY (y) = 1 − FX dy dy a

The Normal Distribution
y−b 1 a a and remembering that a < 0 . . .
1 −2 y−b a −µ σ 2

131

= −fX

=

1 1 √ e −a σ 2π 1 y−(aµ+b) 2 1 aσ √ e− 2 = −aσ 2π and since a2 = (−a)2 . . . 1 y−(aµ+b) 2 1 √ e− 2 −aσ = −aσ 2π

So, again, we have Y ∼ N (aµ + b, a2 σ 2 ). Question: What happens to the probability distribution of Y = aX + b when a = 0? Linear transformations rarely preserve the “shape” of the probability distribution. For example, if X is exponentially distributed, Z would not be exponentially distributed. If X was a Poisson random variable, Z would not be a Poisson random variable. But, as provided by Theorem 5.16, if X is normally distributed, then Z is also normally distributed. (See Question 5.6 to see that the uniform distribution is preserved under linear transformations.) Tools for using the standard normal tables
P (Z ≤ a) = Φ(a) is given in the tables for various values of a. It can be shown that since Z is a continuous random variable and the probability density function for Z is symmetric about zero, we have

Φ(a) = 1 − Φ(−a). Example: Suppose X ∼ N (0, 4). Let’s ﬁnd P (X ≤ 1).
P (X ≤ 1) = P (X − 0 ≤ 1 − 0) X −0 1−0 = P ≤ 2 2

Since X ∼ N (0, 4) we know from Theorem 5.16 that
Z= X −0 2 ∼ N (0, 1)

132

The Normal Distribution

so, using Table I in Appendix A, we have
P (X ≤ 1) = P Z ≤ = P Z≤

1−0 2
1 2

= 0.6915

This is the area shown in the following ﬁgure:
0.5 0.4 0.3

φ(z ) φ( )

0.2 0.1 0.0 -2.0 -1.8 -1.6 -1.4 -1.2 -1.1 -0.9 -0.7 -0.5 -0.3 -0.1 0.1 0.3 0.5 0.7 0.9 1.0 1.2 1.4 1.6 1.8 2.0

z

Example: Suppose X ∼ N (2, 9). Let’s ﬁnd P (0 ≤ X ≤ 3).
P (0 ≤ X ≤ 3) = P (0 − 2 ≤ X − 2 ≤ 3 − 2) X −2 3−2 0−2 ≤ ≤ = P 3 3 3

Since X ∼ N (2, 9) we know that
Z= X −2 3 ∼ N (0, 1)

Hence,
2 P (0 ≤ X ≤ 3) = P − 3 ≤ Z ≤ 1 3

= P Z≤

1 3

− P Z < −2 3

Since the N (0, 1) distribution is symmetric about zero,
P Z < −2 = P Z > 3
2 3

=1−P Z ≤

2 3

Finally, from Table I, we get
P (0 ≤ X ≤ 3) = P Z ≤
1 3

− 1−P Z ≤

2 3

= 0.63 − (1 − 0.75) = 0.38

Self-Test Exercises for Chapter 5

133

which is the area shown in the following ﬁgure:
0.5 0.4 0.3

φ(z ) φ( )

0.2 0.1 0.0 -2.0 -1.8 -1.6 -1.4 -1.2 -1.1 -0.9 -0.7 -0.5 -0.3 -0.1 0.1 0.3 0.5 0.7 0.9 1.0 1.2 1.4 1.6 1.8 2.0

z

Self-Test Exercises for Chapter 5
For each of the following multiple-choice questions, choose the best response among those provided. Answers can be found in Appendix B. S5.1 Let X be a continuous random variable with probability density function
fX (x) =
1 2

0

if −1 ≤ x ≤ 1 otherwise.

The value of P (X 2 ≥ 1 ) is 4 (A) 0 (B) (C)
1 4 1 2

(D) 1 (E) none of the above. S5.2 Let X be a random variable with E(X) = 0 and E(X 2 ) = 2. Then the value of E[X(1 − X)] is (A) −2 (B) 0 (C) 1 (D) 2 (E) Var(X)

134

Self-Test Exercises for Chapter 5

(F) none of the above. S5.3 Let X be a random variable with E(X) = 1 and E(X 2 ) = 2. Then Var(X) equals (A) −1 (B) 0 (C) 1 (D) 3 (E) none of the above. S5.4 Suppose that X is a continuous random variable with P (X < −2) = P (X < 2) = 1 . Then P (X 2 ≤ 4) equals. 2 (A) (B) (C)
1 4 1 2 3 4 1 4

and

(D) 1 (E) none of the above. S5.5 Let X be a random variable with E(X) = 2. Then E[(X − 2)2 ] equals (A) 0 (B) 2 (C) Var(X) (D) E(X 2 ) (E) none of the above. S5.6 Let X be a random variable with cumulative distribution function if x < 0 x if 0 ≤ x ≤ 1 FX (x) =   1 if x > 1 Then P (3X ≤ 1) equals (A) 1/6
  0 

−1. Then P (Y ≤ 2 ) equals .7 Let X be a discrete random variable with probability mass function pX (−1) = p pX (0) = 1 − 2p pX (1) = p where 0 < p < 1 .Self-Test Exercises for Chapter 5 135 (B) 1/4 (C) 1/3 (D) 1/2 (E) none of the above. 1 or 2 0 otherwise 1 Let Y = 1/X . S5.8 Let X be a discrete random variable with probability mass function pX (x) = 1/4 if x = −2. The value of P (X 2 = 1) is 2 (A) p2 (B) 2p (C) (1 − 2p)2 (D) 1 (E) none of the above. Then P (Y ≤ 2 ) equals (A) 0 (B) 1/4 (C) 3/4 (D) 1 (E) none of the above. S5.9 Let X be a continuous random variable with probability density function fX (x) = 1/4 if −2 ≤ x ≤ 2 0 otherwise 1 Let Y = 1/X . S5.

10 Let X be a random variable with P (X < −1) = 1/4 P (X ≤ +1) = 3/4 Then P (|X| > 1) equals (A) 1/4 (B) 1/2 (C) 3/4 (D) 1 (E) none of the above.12 Suppose that X is a random variable with P (X > 0) = r where 0 < r < 1. S5.11 Let X be a continuous random variable with probability density function fX (x) = if −1 ≤ x ≤ 1 0 otherwise 1 2 Then P (|X| ≤ 1 ) equals 4 (A) 0 (B) 1/4 (C) 1/2 (D) 1 (E) none of the above. S5. S5.136 Self-Test Exercises for Chapter 5 (A) 0 (B) 1/2 (C) 3/4 (D) 1 (E) none of the above. Then P (X 2 ≥ 0) equals (A) 0 .

13 Suppose X is a random variable with E(X) = α and Var(X) = α. The cake is sold for \$5. Suppose you win \$2 for every head and lose \$1 for every tail. Then E(X 2 ) equals (A) 0 (B) α (C) α2 (D) α + α2 (E) none of the above.5 pX (3) = 0.Self-Test Exercises for Chapter 5 √ r 137 (B) (C) r (D) 1 (E) none of the above.2 pX (2) = 0.1 where X represents the number of cakes requested by customers. Then P (Y ≥ 0) equals (A) 2/3 (B) 848/1024 (C) 904/1024 (D) 1 (E) none of the above. The expected daily proﬁt from chocolate cake is equal to .00 a cake and all unsold chocolate cakes are thrown away at the end of the day. Let Y denote your actual net amount won (which may be negative). S5. the bakery owner bakes two chocolate cakes at a cost of \$2. S5.00 per cake.2 pX (1) = 0.15 The daily demand for chocolate cake in a neighborhood bakery is a random variable X with probability mass function given by pX (0) = 0. At the beginning of the day. S5. with each outcome independent of the others.14 A fair coin is tossed 10 times. where α > 0.

What is the manufacturer’s expected proﬁt per item? (A) 5e−1 (B) 2e−1 (C) −2 + 5e−1 .00 (E) none of the above. S5.2 pX (1) = 0.6 At the beginning of the day. S5.16 The daily demand for rye bread in a corner grocery store is a random variable X with probability mass function given by pX (0) = 0. The expected daily proﬁt from rye bread is equal to (A) 0 (B) 20 cents (C) 30 cents (D) 40 cents (E) none of the above.00. but guarantees a total refund if X ≤ 1.17 Suppose that an electronic device has a life length X (in units of 1000 hours) that is considered a continuous random variable with probability density function fX (x) = e−x 0 if x > 0 otherwise Suppose that the cost of manufacturing one such item is \$2. the store owner bakes two loaves of rye bread at a cost of 20 cents per loaf.138 Self-Test Exercises for Chapter 5 (A) 0 (B) \$1.2 pX (2) = 0. The bread is sold for 50 cents a loaf and all unsold loaves are thrown away at the end of the day.00.50 (C) \$2. The manufacturer sells the item for \$5.00 (D) \$3.

Let Y denote your earnings after 3 tosses.18 Let X be a random variable with E(X) = 2 and E(X 2 ) = 5.20 A fair coin is tossed 3 times. The value of E(Y ) is (A) 0 (B) 3/2 (C) 1 (D) 3 (E) none of the above. S5.Self-Test Exercises for Chapter 5 139 (D) 5 − 2e−1 (E) none of the above. S5. Then Var(X) equals (A) 1 (B) 3 (C) 7 (D) 9 (E) none of the above.19 Let X be a random variable with E(X) = 1 and Var(X) = 4 then E((X − 1)2 ) equals (A) 0 (B) 1 (C) 4 (D) 16 (E) none of the above. S5. and you lose \$1 for each tail that appears. . You win \$2 for each head that appears.

Consider the random variable X −µ . then P (−2Y < 0) equals (A) p (B) 1 − p (C) 0.5 (D) 0 (E) none of the above. Y = σ Which of the following are always true: (A) E(Y ) = 0 and Var(Y ) = 1 (B) Y has a normal distribution (C) P (Y > 0) = 1/2 (D) all of the above are true.23 If Y is a random variable with P (Y ≥ 0) = p.8413 (E) none of the above S5.21 Let X be any random variable with E(X) = µ and Var(X) = σ 2 . S5.140 Self-Test Exercises for Chapter 5 S5. S5. Then the value of E(|X|) is (A) 0 .5 and P (X = +1) = 0.5 (D) 0.5. Then P (X < 0) equals (A) 0 (B) 0. (E) none of the above are true.22 Let X be a normally distributed random variable with E(X) = 3 and Var(X) = 9.1587 (C) 0.24 Let X be a random variable with P (X = −1) = 0.

27 Suppose X is a continuous random variable with cumulative distribution function   0 if x < 0  2 x if 0 ≤ x ≤ 1 FX (x) =   1 if x > 1 . S5.25 The time to wait for service in the El Gasso Mexican Restaurant is a continuous random variable X (in minutes) with probability density function fX (x) = 5e−5x 0 x≥0 otherwise The restaurant will pay a customer a \$10 reward if they wait more than 5 minutes for service.3174 (D) 0. Then.1826 (C) 0. The value of E(Y ). is (A) 0 (B) 5 (C) 10 (D) 10e−25 (E) none of the above. P (|X| ≤ 1) equals (A) 0.Self-Test Exercises for Chapter 5 141 (B) 0.1528 (B) 0. S5. Let Y denote the reward for any one customer. the expected reward. S5.5 (C) 1 (D) 2 (E) none of the above are true.3413 (E) none of the above.26 Suppose that the random variable X has a normal distribution with mean 1 and variance 4.

. Then P (Y < ln(0. If the manufacturer sells 100 printers.50 √ (D) 0. Deﬁne the random variable Y = ln(X). how much should he expect to pay in refunds? (A) \$2706.25 (C) 0.142 Questions for Chapter 5 Let ln(x) denote the natural logarithm of x. S5. (a) Find the probability density function for the random variable Y = h(X) = 1/X .59 (C) \$12.71 (B) \$7357. Questions for Chapter 5 5.293.1 Let X be the random variable with probability density function fX (x) = 3 2 (1 0 − x2 ) if 0 ≤ x ≤ 1 otherwise Let h(x) = 1/x.28 The life length X (in years) of a printer costing \$200 is a continuous random variable with probability density function fX (x) = e−x if x ≥ 0 0 otherwise The manufacturer agrees to pay a full refund to the buyer if the printer fails within the ﬁrst year.41 (D) \$17.642.29 (E) none of the above.50 (E) none of the above.50)) equals (A) 0 (B) 0. (b) Find E(X). E(h(X)) and h(E(X)).

5). (c) Find Var(X). 5.4 The price. 5. E(h(X)) and h(E(X)). of a commodity is a continuous random variable with the following probability density function   ax if 0 ≤ x ≤ 1  fX (x) =   0 a if 1 < x ≤ 2 otherwise (a) Determine the value of a. (b) Find E(X). (d) Compute E(X) and Var(X). 9 (a) Find the probability density function for the random variable Y = h(X) = 4 X . X . (c) Compute P (2X 2 ≤ 0.3 Let X be the random variable with probability density function fX (x) = 3√ 2 x 0 if 0 ≤ x ≤ 1 otherwise Let h(x) = 4 x.Questions for Chapter 5 143 (c) Find Var(X). (a) Find the probability density function for the random variable Y = h(X) = ln X . (c) Find Var(X). . 5. 9 (b) Find E(X).2 Let X be the random variable with probability density function fX (x) = 2/x2 if 1 ≤ x ≤ 2 0 otherwise Let h(x) = ln x. E(h(X)) and h(E(X)). (b) Determine the cumulative distribution function for X .

b]. The results should be the same as Example 5. What is the manufacturer’s expected proﬁt per item? . Z= Let 5. The manufacturer sells the item for \$5.5 Let X be a random variable with the following probability density function: 1 2 fX (x) = √ e−x /2 2π for −∞ < x < ∞ The is the normal distribution with parameters µ = 0 and σ 2 = 1.00 but guarantees a total refund if X ≤ 0. X −r s Where r and s are non-zero constants.00. (b) Using the probability density function for Y ﬁnd E(Y ) and Var(Y ). where a < b. That is. Note: The integral for fX (x) does not exist in closed form. 5. Note: Be careful of the case where s < 0. In other words.5.144 Questions for Chapter 5 5.8 Suppose that an electronic device has a life length X (in units of 1000 hours) that is considered as a continuous random variable with probability density function: e−x x>0 fX (x) = 0 otherwise Suppose that the cost of manufacturing one such item is \$2.5 using the actual probability density function for Y .7 Verify the results of Example 5. let X be the random variable with probability density function 3x2 if 0 ≤ x ≤ 1 fX (x) = 0 otherwise (a) Find the probability density function for the random variable Y = h(X) = 1/X . Find the probability density function for the random variable Y = X 3 . 5. suppose the cumulative distribution function of X is given by  if x < a  0   x−a if a ≤ x ≤ b FX (x) =  b−a   1 if x > b. Show that Z also has a uniform distribution.6 (†) Let X be a uniformly distributed random variable on the interval [a.9.

what is Billy’s expected score? .e. where Y = X 2 + 1. 5. Therefore. Mary’s entry fee will be E(2X ). Bill Bell. an industrial engineering student. To make the game fair. to obtain the ﬁrst head. (a) What is the probability mass function for the random variable X? (b) If the professor assigns a credit of ten points for every correct answer and deducts one point for every incorrect answer..9 Suppose that the continuous random variable X has a probability density function given by   √1 e−x2 fX (x) = 2π  −∞ ≤ x ≤ +∞ 0 otherwise Let Y = X 2 .11 John and Mary play a game by tossing a fair coin repeatedly and betting on the number of tosses required. The examination consists of 4 multiple choice questions. X .12 Due to pressing social commitments. John will pay Mary 2X dollars when the game is over (i. was not able to study for an examination. Let X denote the number of questions correctly answered by Billy. Mary will pay an “entry fee” in advance to John equal to her expected winnings. 5. only one of which is correct. (a) Using the fact that Var(X).Questions for Chapter 5 145 5. when the ﬁrst head appears). ln(x) = compute E(X) and (b) Compute E(Y ). Each question has 6 possible answers from which to choose. (a) What is the expected number of tosses required to obtain the ﬁrst head? (b) What is Mary’s entry fee? Is this fair? Why? 5.10 Let X be a random variable with density   2 fX (x) = x2  0 d dx 1≤x≤2 otherwise 1 x. Billy plans to answer the questions randomly by rolling a fair die for each question and choosing the response associated with the number rolled. Evaluate E(Y ).

What is the expected selling price of an article? 5. and has a uniform distribution on the interval (36. it is sold for \$12.17 A random variable X represents the weight (in ounces) of an article.1. Let Y = eX . 56). it is sold for \$15.e. pX (1) = 0. For what number α is E(X(α − X)) zero? 5. R.13 Let X be the number of dots obtained when a fair. Y is log-normal if and only if ln(Y ) is normal. Suppose that X ∼ N (µX .3. σX ). . Find the probability density function for Y . k = 1). pX (2) = 0. The bread is sold for 30 cents a loaf and all unsold loaves are thrown away at the end of the day. it is considered unsatisfactory and thrown away. (a) Determine the probability distribution for the daily proﬁt.16 The daily demand for pumpernickel bread in a corner grocery store is a random variable X with probability mass function given by pX (0) = 0. Then Y is said to have a log-normal distribution. common six-sided die is tossed.00. If 10 points are credited for each correct answer.6 At the beginning of the day.00.. then E(X(X − 1)) equals zero. 5. if the store owner decides to bake one loaf (i. If the weight of the article is less than 44 ounces.18 A distribution closely related to the normal distribution is the log-normal 2 distribution. the store owner bakes k loaves of bread at a cost of 20 cents for the ﬁrst loaf and 10 cents for each additional loaf. 2 5.146 Questions for Chapter 5 (c) The professor deducts points for wrong answers in order to discourage random guessing. If its weight is greater than 52 ounces. how many points should be deducted for each wrong answer so that Billy’s expected score from guessing is zero? 5.15 If X is a random variable with P (X = −2) = what is the value of E(|X|)? 1 2 and P (X = +2) = 1 . That is. Justify your answer. If its weight is greater than 44 ounces and less than 52 ounces.14 (True or False) If X is a random variable with E(X) = 1. (b) How many loaves should the grocer bake per day to maximize his expected proﬁt? 5.

What must his new proﬁt function be in order to achieve this goal? 5. These are not the expected value and standard deviation of Y .20 The Youbee Resistor Company uses a machine to produce resistors. by 50%. 5. E(T ). He intends to do this by increasing his proﬁt (per gallon) on those batches of fuel meeting the speciﬁcation 30 < X < 35. The relationship between the moments of X and the moments of Y is given by µX 2 σX = ln(µY ) − = ln 2 1 σY 1 2 ln + 1 = ln(µY ) − σX 2 2 2 µY 2 σY +1 µ2 Y Examples of random variables that may be represented by the log-normal distribution are river ﬂows.19 A rocket fuel is to contain a certain percent (say X ) of a particular additive. the size of an organism subject to a number of small impulses. your answer is in terms of µX and σX . 9). (b) Suppose that the manufacturer wants to increase his expected proﬁt. and the length of life of certain items. The cost of producing one resistor is 10 cents. (a) What proportion of the resistors must be scrapped? (b) The company is considering the purchase of a new highprecision machine that will reduce the above production . The manufacturer will make a net proﬁt on the fuel (per gallon) which is the following function of X :   \$0. The actual resistance of each resistor is a normally distributed random variable with mean resistance of 1000 ohms and a variance of 625 ohms2 . evaluate E(T ).05 per gallon   -\$0. The speciﬁcations call for X to be between 30 and 35 percent.10 per gallon if 30 < X < 35 if 35 < X < 40 or 25 < X ≤ 30 otherwise (a) If X ∼ N (33. it can be sold for 25 cents. otherwise it is scrapped.10 per gallon  T (X) = \$0.Questions for Chapter 5 147 Note: For this question. If a resistor has a resistance between 950 and 1050 ohms. the diameter of a small particle after a crushing process.

Assume furthermore that a sample of this fabric is considered to be defective if X ≤ 162. you will say only noise is present.99. cannot take on negative values. a normally distributed random variable may be observed to take on any positive or negative value.22 Let Xi ∼ N (µ. What would Var(X) have to equal so that P (X < 0) = 0. Find k such that P (X > µ − kσ) = 0. the reading will have a normal distribution with mean 5 and variance 9. such a representation may work. that we hold Var(X) ﬁxed at 9.975. will have a normal distribution with mean 10 and variance 9. This situation will occur frequently: A certain random variable that we know cannot take on negative values will be assumed to have a normal distribution anyway. X . If a signal is present when the button is pressed. the reading. As long as the parameters µ and σ 2 are chosen so that P (X < 0) is essentially zero. suppose we hold E(X) ﬁxed at 165. = 0. σ 2 ). It is obvious that X . That is P (X < 0) ≈ 0. the above model (apparently not valid in view of the objection just raised) assigns negligible probability to the event {X < 0}.95. say X . Compute this probability to verify this property for yourself in the above example. However. will be defective? Note: An immediate objection to the use of the normal distribution may be raised here. .23 A measuring device provides a reading on a continuous scale whenever a button is pressed.841. You settle on the following signal detection rule: If the reading is 7 or greater. the cost of producing each resistor will increase to 15 cents if the company decides to purchase the machine (the new machine’s capital cost is included in this production cost).10? 5. on the other hand. In the above example.10? Suppose. 5.148 Questions for Chapter 5 variance to 400 ohms2 (the mean remains the same). Otherwise. If only noise is present. chosen at random. is normally distributed with E(X) = 165 and Var(X) = 9. should the new machine be purchased? 5. If the company wishes to maximize expected proﬁt. What is the probability that a sample of fabric. Although the selling price of resistors would remain the same. you will announce that a signal is present. the strength of cotton fabric. Nevertheless. What would E(X) have to be so that P (X < 0) = 0. = 0. = 0.21 Suppose that the breaking strength of cotton fabric (in pounds).

Questions for Chapter 5

149

(a) If a signal is truly present, and you use the above rule, what is the probability that your announcement is wrong? If only noise is really present, what is the probability that your announcement is wrong? (b) Suppose that the probability that a signal is truly present when the button is pressed is 0.25. If your reading is greater than 7, what is the probability that a signal is really present? (c) The value, 7, is called the threshold. What should this value be changed to so that your two answers for question (a) are equal? 5.24 An electronic device has a life length X (in years) that is considered a continuous random variable with probability density function
fX (x) = x/2 0

if 0 < x < 2 otherwise

(a) Find P (X > 1). (b) Find E(X). (c) Suppose that the cost of manufacturing one such item is \$2.00. The manufacturer sells the item for \$10.00, but guarantees a total refund if X ≤ 1. What is the manufacturer’s expected proﬁt per item? (d) Suppose that the cost of manufacturing one such item is \$2.00 and the customer is guaranteed a total refund if X ≤ 1. What should be the manufacturer’s selling price in order to provide an expected proﬁt of \$4.00 per item? 5.25 The reading of a water level meter is a continuous random variable X with probability density function
fX (x) =

1/4 0

if −2 < x < 2 otherwise

(a) Find P (X ≤ 1). (b) Find P (|X| ≤ 1). (c) Find E(X).

150

Questions for Chapter 5

(d) Find Var(X). (e) Let Y = X 2 . Find the cumulative distribution function for Y. 5.26 The daily demand for pumpkin pies at a corner grocery store is a random variable with probability mass function
  1 

pX (k) =

4 16 k   0

for k = 0, 1, 2, 3, 4 otherwise

The cost of baking one pumpkin pie is \$1.00. The store sells the pies for \$3.00, each. At the beginning of every day, the store manager must decide how many pies to bake. At the end of the day, any unsold pies are thrown away. If the demand for pies exceeds the supply, you can only sell those pies that you have already baked for that day. The additional customers leave unsatisﬁed. (a) Find P (X > 0). (b) Find E(X). (c) Find Var(X). (d) If the manager decides to bake three pies, what is the expected proﬁt for that day? 5.27 The input voltage for an electronic component is a continuous random variable X with probability density function
fX (x) =

1/3 0

if −1 < x < 2 otherwise

(a) Find P (X ≤ 1). (b) Find P (X 2 ≤ 0.25). (c) Find E(X 2 ). (d) Let Y = 2X . Find the cumulative distribution function for Y. 5.28 The Youbee Mine has been used to mine gold ore for several years. The mass (in grams) of actual gold in one cubic centimeter of ore is a random variable

Questions for Chapter 5

151

X with probability density function fX (x) = cx3 0

if 0 < x < 1 otherwise

(a) Find the value of the constant c. (b) Find the cumulative distribution function for X and sketch a graph of the cumulative distribution function. (c) Each cubic centimeter or ore is worth \$5 if it has more than 1 2 gram of gold. It is worth \$0 otherwise. Let Y be a random variable denoting the value of a cubic centimeter of ore selected at random. What is the probability mass function for Y ? 5.29 Let X be a continuous random variable with probability density function
fX (x) =

1/3 0

if −1 < x < 2 otherwise

Let Y = X 2 . Find the cumulative distribution function for Y . 5.30 Let X be a continuous random variable with probability density function
fX (x) =

1 0

if −1 < x < 0 otherwise

Let Y = X 2 . Find P (Y ≤ 0.25). 5.31 Let X be a continuous random variable with probability density function
  a(1 + x) 

fX (x) =

  0

a(1 − x)

if −1 ≤ x < 0 if 0 ≤ x ≤ 1 otherwise

(a) Prove that the value of a must be 1. Use a clear mathematical presentation to show why this must be true. (b) Find P (X ≤ 0.5). (c) Find P (X 2 ≤ 0.25). (d) Find the cumulative distribution function for X .

152

Questions for Chapter 5

5.32 A person accused of a crime is being questioned by the police. A new “lie detector” device is being used that measures the amount of stress in a person’s voice. When the person speaks, the device gives a reading, X . If the person is telling the truth, then X ∼ N (2, 1). If the person is lying, then X ∼ N (4, 1). Therefore, the mean (expected) stress level of a person’s voice increases if he is lying. (a) The police decide on the following rule: If a person’s voice stress exceeds 4 when he is answering a question, conclude that he is lying. What is the probability of incorrectly accusing person of lying when he is telling the truth? (b) When the suspect was originally brought in for questioning, the police thought there was only a 50% chance that he was guilty of the crime. The police ask him the question, “Did you commit the crime?” The suspect replies, “No.” The reading on the voice stress meter exceeds 4. What is the probability that he committed the crime? 5.33 Consider a random variable X with possible outcomes, 0, 1, 2, . . . Suppose that P (X = k) = (1 − a)(ak ) for k = 0, 1, 2, . . . (a) Identify all permissible values for the constant a. (b) Show that for any two positive integers, m and n,
P (X > m + n | X > m) = P (X ≥ n).

5.34 Check whether the mean and the variance of the following distributions exist: (a)
fX (x) =

2x−3 0

x≥1 otherwise

(b) (The Cauchy distribution) a is a positive constant.
fX (x) = a π(a2 + x2 ) −∞ < x < ∞.

Questions for Chapter 5

153

5.35 Diameters of bearings in mass production are normally distributed with mean 0.25 inch and standard deviation 0.01 inch. Bearing speciﬁcations call for diameters of 0.24 ± 0.02 inch. What is the probability of a defective bearing being produced? 5.36 Suppose that the continuous random variable X has a probability density function given by
  √2 e−x2 fX (x) = 2π 

x≥0

0

otherwise

Let Y = X 2 . Evaluate E(Y ). 5.37 Suppose that the life lengths of two electronic devices, D1 and D2 , have distributions N (40, 36) and N (45, 9), respectively. If the electronic device is to be used for a 45-hour period, which device is to be preferred? If it is to be used for a 50-hour period, which device is to be preferred? 5.38 (†) Suppose random variable Z has a standard normal density function
φ(z) =
1 2 √1 e− 2 z 2π

for −∞ < z < ∞

Let Y = Z 2 . Find the probability density function for Y . (We say that Y has a Chi-square distribution with one degree of freedom.) Warning: Y is not strictly increasing or strictly decreasing.

154 Questions for Chapter 5 .

It would not be either easy or desirable to attempt to combine all of this information into a single measurement. tomorrow’s weather is really best described by an array of random variables that includes wind speed.” 155 . we will build on our probability model and extend our deﬁnition of a random variable to permit such calculations. For example. It would be reasonable to ask for the probability that tomorrow’s temperature is greater than 70◦ and tomorrow’s total rainfall is less than 0. Deﬁnition The ﬁrst thing we must do is to precisely deﬁne what we mean by “an array of random variables. For example. wind direction. atmospheric pressure. let T be a random variable representing tomorrow’s maximum temperature and let R be a random variable representing tomorrow’s total rainfall. R < 0.1}. In this chapter. Another question that we might like to have answered is. we would need to compute the probability of the event B = {T > 70}. “What is the probability that the temperature will be greater than 70◦ regardless of the rainfall?” To answer this question. a single random variable cannot adequately provide all of the information needed about the outcome of an experiment. We would like to extend the notion of a random variable to deal with an experiment that results in several observations each time the experiment is run. we wish to determine the probability of the event A = {T > 70.6 RANDOM VECTORS Basic Concepts Often. relative humidity and temperature.1 inch. In other words.

156

Deﬁnition

y
ω
3 2 1 1 2 3 (4,1)

4

5

x

Figure 6.1: A mapping using (X1 , X2 ) Deﬁnition 6.1. Let Ω be a sample space. An n-dimensional random variable or random vector, X1 (·), X2 (·), . . . , Xn (·) , is a vector of functions that assigns to each point ω ∈ Ω a point X1 (ω), X2 (ω), . . . , Xn (ω) in n-dimensional Euclidean space. Example: Consider an experiment where a die is rolled twice. Let X1 denote the number of the ﬁrst roll, and X2 the number of the second roll. Then (X1 , X2 ) is a two-dimensional random vector. A possible sample point in Ω is

that is mapped into the point (4, 1) as shown in Figure 6.1.

Joint Distributions

157

(x,y)

Figure 6.2: The region representing the event P (X ≤ x, Y ≤ y).

Joint Distributions
Now that we know the deﬁnition of a random vector, we can begin to use it to assign probabilities to events. For any random vector, we can deﬁne a joint cumulative distribution function for all of the components as follows: Deﬁnition 6.2. Let (X1 , X2 , . . . , Xn ) be a random vector. The joint cumulative distribution function for this random vector is given by
FX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ) ≡ P (X1 ≤ x1 , X2 ≤ x2 , . . . , Xn ≤ xn ).

In the two-dimensional case, the joint cumulative distribution function for the random vector (X, Y ) evaluated at the point (x, y), namely
FX,Y (x, y),

is the probability that the experiment results in a two-dimensional value within the shaded region shown in Figure 6.2. Every joint cumulative distribution function must posses the following properties: 1.
all xi →−∞

lim

FX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ) = 0

158

Discrete Distributions

x2 6 5 4 3 2 1 1 2 3 4 5 6 x1

Figure 6.3: The possible outcomes from rolling a die twice. 2. lim
FX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ) = 1

all xi →+∞

3. As xi varies, with all other xj ’s (j = i) ﬁxed,
FX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn )

is a nondecreasing function of xi . As in the case of one-dimensional random variables, we shall identify two major classiﬁcations of vector-valued random variables: discrete and continuous. Although there are many common properties between these two types, we shall discuss each separately.

Discrete Distributions
A random vector that can only assume at most a countable collection of discrete values is said to be discrete. As an example, consider once again the example on page 156 where a die is rolled twice. The possible values for either X1 or X2 are in the set {1, 2, 3, 4, 5, 6}. Hence, the random vector (X1 , X2 ) can only take on one of the 36 values shown in Figure 6.3. If the die is fair, then each of the points can be considered to have a probability

Continuous Distributions

159

1 mass of 36 . This prompts us to deﬁne a joint probability mass function for this type of random vector, as follows:

Deﬁnition 6.3. Let (X1 , X2 , . . . Xn ) be a discrete random vector. Then
pX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ) ≡ P (X1 = x1 , X2 = x2 , . . . , Xn = xn ).

is the joint probability mass function for the random vector (X1 , X2 , . . . , Xn ). Referring again to the example on page 156, we ﬁnd that the joint probability mass function for (X1 , X2 ) is given by
pX1 ,X2 (x1 , x2 ) =
1 36

for x1 = 1, 2, . . . , 6 and x2 = 1, 2, . . . , 6

Note that for any probability mass function,
FX1 ,X2 ,...,Xn (b1 , b2 , . . . , bn ) =
x1 ≤b1 x2 ≤b2

···
xn ≤bn

pX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ).

Therefore, if we wished to evaluate FX1 ,X2 (3.0, 4.5) we would sum all of the probability mass in the shaded region shown in Figure 6.3, and obtain
1 FX1 ,X2 (3.0, 4.5) = 12 36 = 1 . 3

This is the probability that the ﬁrst roll is less than or equal to 3 and the second roll is less than or equal to 4.5. Every joint probability mass function must have the following properties: 1. pX1 ,X2 ,...,Xn (x1 , x2 , . . . xn ) ≥ 0 for any (x1 , x2 , . . . , xn ). 2.
all x1

...
all xn

pX1 ,X2 ,...,Xn (x1 , x2 , . . . xn ) = 1 pX1 ,X2 ,...,Xn (x1 , x2 , . . . xn ) for any event E .
(x1 ,...,xn )∈E

3. P (E) =

You should compare these properties with those of probability mass functions for single-valued discrete random variables given in Chapter 5.

Continuous Distributions
Extending our notion of probability density functions to continuous random vectors is a bit tricky and the mathematical details of the problem is beyond the scope of

160

Continuous Distributions

an introductory course. In essence, it is not possible to deﬁne the joint density as a derivative of the cumulative distribution function as we did in the one-dimensional case. Let Rn denote n-dimensional Euclidean space. We sidestep the problem by deﬁning the density of an n-dimensional random vector to be a function that when integrated over the set
{(x1 , . . . , xn ) ∈ Rn : x1 ≤ b1 , x2 ≤ b2 , . . . , xn ≤ bn }

will yield the value for the cumulative distribution function evaluated at
(b1 , b2 , . . . , bn ).

More formally, we have the following: Deﬁnition 6.4. Let (X1 , . . . , Xn ) be a continuous random vector with joint cumulative distribution function
FX1 ,...,Xn (x1 , . . . , xn ).

The function fX1 ,...,Xn (x1 , . . . , xn ) that satisﬁes the equation
FX1 ,...,Xn (b1 , . . . , bn ) =
b1 −∞

···

bn −∞

fX1 ,...,Xn (t1 , . . . , tn ) dt1 · · · dtn

for all (b1 , . . . , bn ) is called the joint probability density function for the random vector (X1 , . . . , Xn ). Now, instead of obtaining a derived relationship between the density and the cumulative distribution function by using integrals as anti-derivatives, we have enforced such a relationship by the above deﬁnition. Every joint probability density function must have the following properties: 1. fX1 ,X2 ,...,Xn (x1 , x2 , . . . xn ) ≥ 0 for any (x1 , x2 , . . . , xn ).
+∞ +∞ −∞

2.

−∞

···

fX1 ,...,Xn (x1 , . . . , xn ) dx1 · · · dxn = 1 fX1 ,...,Xn (x1 , . . . , xn ) dx1 · · · dxn for any event E .

3. P (E) =

···

E

You should compare these properties with those of probability density functions for single-valued continuous random variables given in Chapter 5.

a2 ). with a comparable formula given by P (a1 < X1 ≤ b1 .X2 (b1 . b2 ) − FX1 .4: Computing P (a1 < X1 ≤ b1 . a2 ) + FX1 . a2 < X2 ≤ b2 ). In the one-dimensional case. This worked for any type of probability distribution.4. a2 < X2 ≤ b2 ) = FX1 . y) = 2−y 0 2 1 0 2 if 0 ≤ x ≤ 2 and 1 ≤ y ≤ 2 otherwise Note that (2 − y) dx dy = 1.X2 (b1 . The situation in the multidimensional case is a little more complicated.X2 (a1 . we had the handy formula P (a < X ≤ b) = FX (b) − FX (a). b2 ) − FX1 . You should be able to verify this formula for yourself by accounting for all of the probability masses in the regions shown in Figure 6. Example: Let (X. Y ) be a two-dimensional random variable with the following joint probability density function (see Figure 6.X2 (a1 .Continuous Distributions 161 x2 b2 a2 a1 b1 x1 Figure 6.5): fX. .Y (x.

e. For example. “Can we determine the distribution of X1 . y).5 1.Y (x. y ≤ 1. we calculate the volume under the surface fX. Y ≤ 1. . Performing the integration.5).162 Marginal Distributions fXY(x.5: A two-dimensional probability density function Suppose we would like to compute P (X ≤ 1.. Y ) with joint cumulative distribution function FX.5.5}.0. To do this. Y ≤ y) . y) = P (X ≤ x.5 1.0 0 fX.Y (x. we might ask the question. in the two-dimensional case. . we may be given a random vector (X. and the solution requires the careful use of English rather than mathematics. Marginal Distributions Given the probability distribution for a vector-valued random variable (X1 .y) 1 1 2 y 1 2 x Figure 6. Suppose we would like to ﬁnd the cumulative distribution function for X alone. disregarding the other components?” The answer is yes.0. . we get. y) over the region {(x. P (X ≤ 1. y) dx dy 3 (2 − y) dx dy = 8 . This region of integration is shown shaded (in green) in Figure 6.0 1.Y (x. y) : x ≤ 1.0 −∞ −∞ 1.5) = = 1.Y (x. i. FX (x)? We know that FX. Xn ). Y ≤ 1.

it is more useful to examine marginal probability mass functions and marginal probability density functions..X2 (x1 .5 is a nice deﬁnition. for example. x2 . 2. To ﬁnd pX (x). But in terms of both X and Y . . 6 To ﬁnd pX1 (2). . So we can use the above deﬁnition to ﬁnd the marginal cumulative distribution function for any of the Xi ’s. The marginal cumulative distribution function for X1 is given by FX1 (x1 ) = lim x2 →∞ x3 →∞ lim · · · lim FX1 . xn ). suppose we have a discrete random vector (X. .. xn →∞ Notice that we can renumber the components of the random vector and call any one of them X1 . Y ≤ ∞) = y→∞ lim FX.Xn (x1 .X2 .Marginal Distributions 163 and we are asking for (1) FX (x) = P (X ≤ x). .Xn (x1 . . . . we shall deﬁne what we will call the marginal cumulative distribution function: Deﬁnition 6. . Although Deﬁnition 6.. . we ask “What is the probability that X = x regardless of the value that Y takes on? This can be written as pX (x) = P (X = x) = P (X = x. x2 ) = 1 36 for x1 = 1. For example.X2 (2. . Using this idea.5. 6 and x2 = 1. .. it would make sense to say FX (x) = P (X ≤ x) = P (X ≤ x. .. . . Example: In the die example (on page 156) pX1 .Y (x. y). k) = 6 . 2. Y ) with joint probability mass function pX. . Xn ) be a random vector with joint cumulative distribution function FX1 .Y (x. expression 1 can be read: “the probability that X takes on a value less than or equal to x and Y takes on any value. . we compute 6 pX1 (2) = P (X1 = 2) = k=1 1 pX1 .. Y = any value) = all y pX. . y).Y (x..” Therefore. Let (X1 . xn ).. . . . . y).

05 + 0. and let Y be the number of defective items produced. So 1 Would you believe that this is why they are called marginal distributions? .025 .10 5 0 0 0 0 0 This is hardly a surprising result.Y (x. Example: Let X be the total number of items produced in a day’s work at a factory. To ﬁnd the marginal probability mass function for the total daily production.05 pX (2) = pX.10 = 0. 1) = 0. we sum the probabilities over all possible values of Y for each ﬁxed x: pX (1) = pX.15 + 0.1.Y (2. Y ) is given by Table 6. 1) = 0. but it brings some comfort to know we can get it from all of the mathematical machinery we’ve developed thus far. But notice that in these computations.05 .10 = 0.1: Joint pmf for daily production pX.Y (3.Y (2. The procedure is similar for obtaining marginal probability density functions.10 1 . we are simply adding the entries all columns for each row of Table 6. for example.05 .Y (3.Y (3. 1) + pX. but fX (x)dx. 2) + pX. y) 1 2 X 3 4 5 Y 2 3 0 0 .25 pX (3) = pX.1. Using this joint distribution.Y (2.05 .20 etc.10 0 . P (Y = 2) = pY (2) = 0.025 .275. itself is not a probability measure. we can see that the probability of producing 2 items with exactly 1 of those items being defective is pX. Suppose that the probability mass function for (X.Y (1. 2) = 0.164 Marginal Distributions Table 6. 3) = 0.05 .15 . X .10 4 0 0 0 0 . fX (x). 1) + pX.05 + 0.10 .10 . We simply look for the result in the margin1 for the entry Y = 2.15. Recall that a density. Doing this for Y as well as X we can obtain Table 6.2 So. is.

05 ..X2 .. .225 pX (x) . The marginal probability density function for the random variable X1 is given by fX1 (x1 ) = +∞ −∞ ··· +∞ −∞ fX1 .10 0 .275 . .10 .10 .10 . Let (X1 . x2 .10 . ..025 . .05 .2: Marginal pmf’s for daily production pX. Xn ) be a continuous random variable with joint probability density function fX1 . Looking at this relationship as fX (x) dx = +∞ −∞ fX. we sum (or integrate) over all possible values of the unwanted random variable components. y) 1 2 X 3 4 5 pY (y) Y 2 3 0 0 . .05 .4 4 0 0 0 0 . y) dy dx it would seem reasonable to deﬁne fX (x) ≡ +∞ −∞ fX.Y (x. xn ) dx2 · · · dxn .Y (x. Notice that in both the discrete and continuous cases.Xn (x1 . y) dy dx where y is the variable of integration in the above integral.Marginal Distributions 165 Table 6. .05 . y) dy and we therefore offer the following: Deﬁnition 6.Y (x..Xn .10 0.15 .25 .20 . Y = any value) = +∞ −∞ fX..40 1 .025 . . . .10 ...Y (x..10 5 0 0 0 0 0 0 with a little loose-speaking integration notation we should be able to compute fX (x) dx = P (x ≤ X < x + dx) = P (x ≤ X < x + dx.05 .6.

166 Marginal Distributions y 1 0 a 1 x Figure 6.Y (x. Example: Let (X. The integration process assigns all of that mass to x = a in one-dimension. y) dy .6 indicates a particular value for x (call it a). Y ) in the example The trick in such problems is to insure that your limits of integration are correct.6. the picture of the support would be as shown in Figure 6. Repeating this process for each x yields the desired probability density function. Solution: Let’s consider the case where we ﬁx x so that 0 ≤ x ≤ 1. by integrating over all values of y .Y (x. y) = x+y 0 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise Find the marginal probability density function for X .6: The support for the random vector (X. If the dotted line in Figure 6. We compute fX (x) = +∞ −∞ fX. Drawing a picture of the region where there is positive probability mass (the support of the distribution) often helps. For the above example. we are actually determining how much probability mass has been placed along the line x = a. Y ) be a two-dimensional continuous random variable with joint probability density function fX.

we have fX (x) = 0. in fact. 1]. a probability density +∞ functionby making sure that fX (x) ≥ 0 for all x and that −∞ fX (x) dx = 1. Example: Let (X. we have fX (x) = 0.Y (x. 2]. Consider the case where we ﬁx x so that 0 ≤ x ≤ 2. summarizing these computations we ﬁnd that fX (x) = x+ 0 1 2 0≤x≤1 otherwise We will leave it to you to check that fX (x) is. So. We compute fX (x) = = = = +∞ −∞ 2 1 fX.Marginal Distributions 167 = 0 1 (x + y) dy y2 2 y=1 y=0 = xy + = x+ 1 2 If x is outside the interval [0.5 on page 162) fX. Therefore. y) = 2−y 0 if 0 ≤ x ≤ 2 and 1 ≤ y ≤ 2 otherwise Find the marginal probability density function for X and the marginal probability density function for Y .Y (x. fX (x) = 1 2 0 0≤x≤2 otherwise . y) dy (2 − y) dy y2 2 y=2 y=1 2y − 1 2 If x is outside the interval [0. Y ) be the two-dimensional random variable with the following joint probability density function (see Figure 6. Solution: Let’s ﬁrst ﬁnd the marginal probability density function for X .

4 − 2y 1≤y≤2 fY (x) = 0 otherwise You should also double check that fX and fY are both probability density functions. or {X1 = 3. . pZ (4) = pX1 . X2 = 1} occurs. y) = 0 giving us fY (y) = 0. the sum of the two rolls. y) dx (2 − y) dx = 4 − 2y If y is outside the interval [1. x2 ) = 36 for x1 = 1. {X1 = 2.X2 (3. Most of these problems just require a little careful reasoning. 1 Solution: We already know that pX1 . . “What are the possible values that Z can take on?” The answer: “The integers 2. we have fX.X2 (1.Y (x. Z equals 4 precisely when any one of the mutually exclusive events {X1 = 1. . X2 ) be the die-tossing random vector of the example on page 156. 6.” For example. Functions of Random Vectors The technique for computing functions of one-dimensional random variables carries over to the multi-dimensional case. Find the probability mass function for the random variable Z = X1 +X2 . X2 = 2}. 2) + pX1 . X2 = 3}. . . 3) + pX1 . . . . consider the case where we ﬁx y so that 1 ≤ y ≤ 2. . 12. Example: Let (X1 . Therefore.X2 (2. . 1) = 3 36 . So. We compute fY (y) = = = +∞ −∞ 2 0 x=2 (2 − y)x|x=0 fX. . We ask the question. 4. 2]. 3. . .168 Functions of Random Vectors To ﬁnd the marginal probability density function for Y .X2 (x1 .Y (x. 6 and x2 = 1.

In establishing limits of integration. 5 pZ (6) = 36 . Example: Let (X. 2 pZ (3) = 36 . 2]. we wish to evaluate FZ (z) = P (Z ≤ z) = P (X + Y ≤ z). Find the cumulative distribution function for the random variable Z =X +Y.7 as z varies from 0 to 2. 5 pZ (8) = 36 . we notice that there are two cases to worry about as shown in Figure 6. you should be able to verify that 1 pZ (2) = 36 . 4 pZ (5) = 36 . 1 pZ (12) = 36 . Y ) be the two-dimensional random variable given in the example on page 166. 4 pZ (9) = 36 . . 3 pZ (4) = 36 . In other words. so   0  FZ (z) =   1 ? if z < 0 if 0 ≤ z ≤ 2 if z > 2 For the case 0 ≤ z ≤ 2.8: Case I (z ≤ 1): FZ (z) = 0 z 0 z−y (x + y) dx dy = 1 3 3z . Solution: The support for the random variable Z is the interval [0. 2 pZ (11) = 36 .Functions of Random Vectors 169 Continuing in this manner. we are computing the probability mass assigned to the shaded set in Figure 6. 6 pZ (7) = 36 . Case II (z > 1): FZ (z) = 0 z−1 0 1 (x + y) dx dy + 1 z−1 0 z−y (x + y) dx dy 1 1 = z2 − 3 z3 − 3 . 3 pZ (10) = 36 .

y z 1 z y 1 0 z 1 x 0 1 z x Case I Case II Figure 6.170 Functions of Random Vectors y 1 z x+ y= z 0 z 1 x Figure 6.7: The event {Z ≤ z}.8: Two cases to worry about for the example .

. each Xi has the same marginal distribution. Testing for independence Case I: Discrete A discrete random variable X is independent of a discrete random variable Y if and only if pX. b2 .. and FX1 . Theorem 6. . Independence of Random Variables Deﬁnition 6. . . Recall: An event A is independent of an event B if and only if P (A ∩ B) = P (A)P (B). FXi (x) = F (x) for all x and for all i (i.. Xn is independent if and only if. y) = [pX (x)][pY (y)] for all x and y .X2 . .8. ....i.1.Y (x. Xn is a random sample if and only if 1. .7. X2 . A sequence of n random variables X1 . . Xn are independent. . . X2 .d. . bn ) = FX1 (b1 )FX2 (b2 ) · · · FXn (bn ) for all values b1 .e. . . . .Independence of Random Variables 171 These two cases can be summarized as FZ (z) =   0   1 3  z 3 1  z2 − 1 z3 − 3  3   1 if z < 0 if 0 ≤ z ≤ 1 if 1 < z ≤ 2 if z > 2 Notice what we thought at ﬁrst to be one case (0 ≤ z ≤ 2) had to be divided into two cases (0 ≤ z ≤ 1 and 1 < z ≤ 2). .) random variables. We say that a random sample is a vector of independent and identically distributed (i. A sequence of n random variables X1 . X2 . X1 . . . . . bn . If X and Y are independent random variables then any event A involving X alone is independent of any event B involving Y alone. b2 .Xn (b1 . F (x)). Deﬁnition 6. and 2. .

y) = P (X = x. Let Y denote the number of grade B compressors produced on the same day.1 0. Note that pX.1 0. y ≥ 0 0 otherwise The marginal densities are fX (x) = 2e−2x x ≥ 0 0 otherwise fY (y) = 2e−2y y ≥ 0 0 otherwise . y) = [fX (x)][fY (y)] for all x and y . Y = y) is given by the following table: y pX.Y (x.Y (0.1 = pX (0)pY (0) = (0.172 Independence of Random Variables Case II: Continuous A continuous random variable X is independent of a continuous random variable Y if and only if fX.2 1 0. Let X be the time to failure of transistor 1 and let Y be the time to failure of transistor 2. Example: A company produces two types of compressors.3 0.2 Example: Suppose an electronic circuit contains two transistors. y) = 4e−2(x+y) x ≥ 0.Y (x. Suppose that the joint probability mass function pX.5) = 0.1 The random variables X and Y are not independent. Let X denote the number of grade A compressors produced on a given day. fX. y) 0 x 1 2 0 0. grade A and grade B.4)(0.Y (x.Y (x.2 0. 0) = 0.

Expectation and random vectors Suppose we are given a random vector (X.Y (x. Y ) is continuous Example: Suppose X and Y have joint probability density function fX. y). Y ). Y ) and a function g(x. E(g(X. y) = x + y 0 ≤ x ≤ 1. y) = 0 = fX (x)fY (y) = 2e−2x (0) For x < 0 and y ≥ 0: fX.Y (x. Y )) = −∞ −∞ g(x.Y (x. y) = 0 = fX (x)fY (y) = (0)(0) So the random variables X and Y are independent.Y (x. y)fX. y)pX.2. 0 ≤ y ≤ 1 0 otherwise . X and Y for all values of (x.Y (x.Y (x. y). Y )) = all x all y ∞ ∞ g(x.Y (x. y) if (X. y) dy dx if (X. Y ) is discrete E(g(X. y) = 4e−2(x+y) = fX (x)fY (y) = 2e−2x 2e−2y For x ≥ 0 and y < 0: fX.Expectation and random vectors 173 We must check the probability density functions for (X. Y ))? Theorem 6. Can we ﬁnd E(g(X. For x ≥ 0 and y ≥ 0: fX. y) = 0 = fX (x)fY (y) = (0)2e−2y For x < 0 and y < 0: fX.

To ﬁnd E(Z) = E(XY ) use Theorem 6. The proofs for the discrete case are similar using summations rather than integrals. If X and Y are independent. y) dx dy x fX (x) dx + y fY (y) dy = E(X) + E(Y ) Theorem 6.Y (x. Y ) is a continuous random vector.3. y) dx dy xy(x + y) dx dy x2 y + xy 2 dx dy 1 + 2 x2 y 2 1 0 = = 0 dy 1 + 1 y 2 dy 2 1 0 = 1 2 6y 1 + 6 y3 = 1 3 We will prove the following results for the case when (X. then E[h(X)g(Y )] = E[h(X)]E[g(Y )] .Y (x. y) dx dy ∞ −∞ −∞ −∞ ∞ ∞ −∞ ∞ −∞ x −∞ fX. y) dx dy yfX. y) dx dy xfX.2: E(X + Y ) = = = = = ∞ ∞ −∞ −∞ ∞ ∞ −∞ −∞ ∞ ∞ (x + y)fX.Y (x.Y (x. probability mass functions rather than probability density functions.Y (x. Using Theorem 6.174 Expectation and random vectors Let Z = XY .4. y) dx dy + xfX. E(X + Y ) = E(X) + E(Y ) Proof. y) dy dx + ∞ ∞ −∞ −∞ ∞ ∞ −∞ −∞ ∞ −∞ yfX.2 to get E(XY ) = = = 0 ∞ ∞ −∞ −∞ 1 1 0 1 0 1 0 1 3 3x y 1 3y 0 1 xyfX.Y (x.Y (x.Y (x. Theorem 6. y) dy dx + ∞ −∞ y fX.

.9. Using Theorem 6. X) = Var(X). Y ) ≡ E[(X − E(X))(Y − E(Y ))]. set h(x) = x and g(y) = y to get E[h(X)g(Y )] = E(h(X))E(g(Y )) E[XY ] = E(X)E(Y ) Deﬁnition 6.Expectation and random vectors 175 Proof.2: E[h(X)g(Y )] = ∞ ∞ −∞ −∞ h(x)g(y)fX.Y (x. = E(h(X)) ∞ −∞ g(y)fY (y) dy = E(h(X))E(g(Y )) Corollary 6.5.6. For any random variables X and Y Var(X + Y ) = Var(X) + Var(Y ) + 2Cov(X. The covariance of the random variables X and Y is Cov(X. Note that Cov(X. Using Theorem 6. . then E(XY ) = E(X)E(Y ) Proof. Theorem 6. Y ) . . = = = ∞ ∞ −∞ −∞ ∞ −∞ ∞ −∞ h(x)g(y)fX (x)fY (y) dx dy ∞ −∞ g(y)fY (y) h(x)fX (x) dx dy g(y)fY (y) E(h(X)) dy since E(h(X)) is a constant.4. y) dx dy since X and Y are independent. If X and Y are independent. .

Proof. we get Cov(X.176 Expectation and random vectors Proof. . If X and Y are independent then. and expand (a + b)2 to get. from Corollary 6. let b = {Y − E(Y )}. Y ) = E(XY ) − E(X)E(Y ) Proof. We can then use Theorem 6. Using Deﬁnition 6. Y ) = 0. Y ) = E(XY ) − E(X)E(Y ) = 0 . Remember that the variance of a random variable W is deﬁned as Var(W ) = E[(W − E(W ))2 ] Now let W = X + Y . Cov(X. If X and Y are independent then Cov(X. = E[{X − E(X)}2 + {Y − E(Y )}2 + 2{X − E(X)}{Y − E(Y )}] = E[{X − E(X)}2 ] + E[{Y − E(Y )}2 ] +2E[{X − E(X)}{Y − E(Y )}] = Var(X) + Var(Y ) + 2Cov(X. Y ) Theorem 6. = E[XY ] − E(Y )E[X] − E(X)E[Y ] + E(X)E(Y ) = E[XY ] − E(X)E(Y ) Corollary 6. .9.7: Cov(X.8. E(XY ) = E(X)E(Y ). Y ) = E[(X − E(X))(Y − E(Y ))] = E[XY − XE(Y ) − E(X)Y + E(X)E(Y )] = E[XY ] + E[−XE(Y )] + E[−E(X)Y ] + E[E(X)E(Y )]] Since E[X]. E[Y ]. and E[XY ] are all constants.5. . . . Var(X + Y ) = E[(X + Y − E(X + Y ))2 ] = E[(X + Y − E(X) − E(Y ))2 ] = E[({X − E(X)} + {Y − E(Y )})2 ] Now let a = {X − E(X)}. .7.

Suppose that X and Y are two random variables. If a > 0. Y ) ρ(X.8: Var(X + Y ) = Var(X) + Var(Y ) + 2Cov(X.2 Theorem 6. Suppose that |ρ(X. Y )| ≤ 1.12. 2 .11. 1965. |ρ(X. Theorem 6. then ρ(X.6 and Corollary 6. Y )| = 1. Y )| = 1. Reading MA. The converse of this theorem is also true: Theorem 6. If X and Y are independent then.10. The correlation coefﬁcient for two random variables X and Y is given by Cov(X. Y = aX + b for some constants a and b. Proofs of these results may be found in the book by Meyer. If X and Y are independent then Var(X+Y ) = Var(X)+Var(Y ). Deﬁnition 6.12 are stated without proof.9. If a < 0. Then (with probability one).Expectation and random vectors 177 Corollary 6. For any random variables X and Y .11.10. P. Proof. Y ) ≡ E (X − E(X))(Y − E(Y )) . Y ) Var(X + Y ) = Var(X) + Var(Y ) + 2(0) Var(X + Y ) = Var(X) + Var(Y ) Deﬁnition 6. from Theorem 6. Y ) = +1. Meyer. Var(X)Var(Y ) Theorems 6. the Y is a linear function of X (with probability one). Y ) = −1. The covariance of two random variables X and Y is given by Cov(X. In other words: If the correlation coefﬁcient ρ is ±1. Y ) ≡ . 6.11 and 6.10. Addison-Wesley. such that Y = aX + b where a and b are constants.. Then |ρ(X. Introductory probability theory and statistical applications. then ρ(X.

178 Expectation and random vectors Random vectors and conditional probability Example: Consider the compressor problem again.Y (x.2 = 0. y) 0 x 1 2 0 0.3 0. what is the probability that the ﬁrst transistor lasted more than one hour? . . fX.Y (x. what is the probability that 2 grade A compressors were produced? Solution: P (X = 2 | Y = 0) = = P (X = 2. again. X + Y = 2) P (X + Y = 2) P (X = 1. Y = 0) P (Y = 0) 2 0.1 Given that no grade B compressors were produced on a given day.5 5 Given that 2 compressors were produced on a given day. consider the two transistors. .1 0.1 = 0.2 0. what is the probability that one of them is a grade B compressor? Solution: P (Y = 1 | X + Y = 2) = = = P (Y = 1. y) = 4e−2(x+y) x ≥ 0. y ≥ 0 0 otherwise Given that the total life time for the two transistors is less than two hours. y pX.1 0.2 1 0.3 3 Example: And. Y = 1) P (X + Y = 2) 1 0.

y) and let fY (y) be the marginal probability density function for Y . y) and let pY (y) be the marginal probability mass function for Y .Expectation and random vectors 179 Solution: P (X > 1 | X + Y ≤ 2) = P (X > 1. We deﬁne the conditional probability density function of X given Y as fX|Y (x | y) = fX.Y (x. .Y (x. y) fY (y) whenever fY (y) > 0.Y (x. X + Y ≤ 2) = = e 4e−2(x+y) dy dx − 3e−4 and P (X + Y ≤ 2) = 0 2 0 2−x 4e−2(x+y) dy dx = 1 − 5e−4 to get P (X > 1 | X + Y ≤ 2) = e−2 − 3e−4 = 0. Case II: Continuous Let X and Y be random variables with joint probability density function fX.Y (x. We deﬁne the conditional probability mass function of X given Y as pX|Y (x | y) = pX.0885 1 − 5e−4 Conditional distributions Case I: Discrete Let X and Y be random variables with joint probability mass function pX. y) pY (y) whenever pY (y) > 0. X + Y ≤ 2) P (X + Y ≤ 2) 2 1 0 −2 2−x We then compute P (X > 1.

1) = 1/4 pX. choose the best response among those provided. X3 . Answers can be found in Appendix B.Y (0.Y (0. S6. then the value of r is (A) (B) (C) 1 16 1 4 1 2 (D) 1 (E) none of the above. each with P (Xi = 1) = 1 and P (Xi = 0) = 1 .1 Let X1 . X4 be independent and identically distributed random variables. S6.180 Self-Test Exercises for Chapter 6 Law of total probability Case I: Discrete pX (x) = y pX|Y (x | y)pY (y) Case II: Continuous fX (x) = +∞ −∞ fX|Y (x | y)fY (y) dy Self-Test Exercises for Chapter 6 For each of the following multiple-choice questions. X2 .2 Let (X. Y ) be a discrete random vector with joint probability mass function given by pX. Let P (X1 + X2 + 2 2 X3 + X4 = 3) = r. 0) = 1/4 pX.Y (1. 1) = 1/2 Then P (Y = 1) equals (A) 1/3 (B) 1/4 (C) 1/2 .

Y (1.Y (x. S6. 0) = 1 4 Let pX (x) denote the marginal probability mass function for X .Y (1. The value of pX (1) is (A) 0 (B) (C) 1 4 1 2 (D) 1 (E) none of the above.Self-Test Exercises for Chapter 6 181 (D) 3/4 (E) none of the above. 1) = 1 2 pX. Y ) be a continuous random vector with joint probability density function fX.Y (1. −1) = 1 4 pX. Y ) be a random vector with joint probability mass function given by pX.3 Let (X. Y ) be a random vector with joint probability mass function given by pX.4 Let (X. y) = 1/2 if −1 ≤ x ≤ 1 and 0 ≤ y ≤ 1 0 otherwise Then P (X > 0) equals (A) 0 (B) 1/4 .Y (0. S6. S6. 1) = 1 2 pX.Y (0.Y (1. −1) = 1 4 pX. 0) = 1 4 The value of P (XY > 0) is (A) 0 (B) (C) 1 4 1 2 (D) 1 (E) none of the above.5 Let (X.

Y (0.Y (0.8 Let (X.Y (0.6 Suppose (X.182 Self-Test Exercises for Chapter 6 (C) 1/2 (D) 1 (E) none of the above. 1) = 1/3 pX. y) = 0 otherwise Then P (X > 1 ) equals 2 (A) 0 (B) 1/2 (C) e−1/2 (D) 1 −y 2e (E) none of the above. S6. Y ) is a continuous random vector with joint probability density function e−y if 0 ≤ x ≤ 1 and y ≥ 0 fX. 0) = 1/3 Then P (X + Y = 1) equals (A) 0 (B) 1/3 (C) 2/3 (D) 1 (E) none of the above. S6. 0) = 1/3 pX. S6. 0) = 1/3 pX. Y ) be a discrete random vector with joint probability mass function given by pX. 0) = 1/3 . Y ) be a discrete random vector with joint probability mass function given by pX.Y (1.Y (0.Y (1.Y (x. 1) = 1/3 pX.7 Let (X.

Y ) is a continuous random vector with joint probability density function   y if 0 ≤ x ≤ 2 and  0≤y≤1 fX.5) equals (A) 0 (B) 1/8 (C) 1/4 (D) 1/2 (E) none of the above. S6. Y }. S6.10 Suppose (X.Y (x. y) =   0 otherwise Then E(X 2 ) equals (A) 0 (B) 1 (C) 4/3 (D) 8/3 (E) none of the above.Self-Test Exercises for Chapter 6 183 Let W = max{X.Y (x. Then P (W = 1) equals (A) 0 (B) 1/3 (C) 2/3 (D) 1 (E) none of the above. y) =   0 otherwise Then P (X < 1. . Y ) is a continuous random vector with joint probability density function   y if 0 ≤ x ≤ 2 and  0≤y≤1 fX. Y < 0.9 Suppose (X.

X3 be independent and identically distributed random variables. If P (X1 + X2 + X3 = 3) = r. S6. Y ) be a continuous random vector with joint probability density function 1/π if x2 + y 2 ≤ 1 fX.184 Self-Test Exercises for Chapter 6 S6. then the value of p is (A) (B) (C) 1 3 √ 3 r 1 2 (D) r3 (E) none of the above.Y (x. each with P (Xi = 1) = p and P (Xi = 0) = 1 − p. y) = 0 otherwise Then P (X > 0) equals . S6. Let Y denote the total net weight (in ounces) of a basket of n oranges.12 Let X1 .11 The weights of individual oranges are independent random variables.14 Let (X. X2 . S6. then Var(X + Y ) must equal (A) 2Var(X) √ (B) 2(Var(X)) (C) Var(2X) (D) 4Var(X) (E) none of the above. each having an expected value of 6 ounces and standard deviation of 2 ounces. The variance of Y is equal to √ (A) 4 n (B) 4n (C) 4n2 (D) 4 (E) none of the above.13 If X and Y are random variables with Var(X) = Var(Y ).

Y and Z be independent random variables with E(X) = 1 E(Y ) = 0 E(Z) = 1 Let W = X(Y + Z). Then E(W ) equals (A) 0 (B) E(X 2 ) (C) 1 (D) 2 (E) none of the above S6. S6.17 Toss a fair die twice. (A) 1/4 (B) 1/3 (C) 1/2 . Then E[(X + 1)2 ] − (E[X + 1])2 equals (A) Var(X) (B) 2E[(X + 1)2 ] (C) 0 (D) 1 (E) none of the above. Let the random variable X represent the outcome of the ﬁrst toss and the random variable Y represent the outcome of the second toss. S6. What is the probability that X is odd and Y is even.Self-Test Exercises for Chapter 6 185 (A) 1/(4π) (B) 1/(2π) (C) 1/2 (D) 1 (E) none of the above.15 Let X be a random variable.16 Let X .

The limit of integration that should appear in position D is (A) min{1. 0) = 1/2 pX. 0) = 1/4 pX. Let W = X −Y . y + w} (E) none of the above. x − w} (B) min{1.Y (−1.Y (0.186 Self-Test Exercises for Chapter 6 (D) 1 (E) none of the above S6. Y ) is a random vector with joint probability mass function given by pX. x − y} (D) min{1. y) = k for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 and y ≤ x 0 otherwise where k is a some positive constant. Y ) is random vector with joint probability density function fX.Y (x. y − w} (C) min{1.19 Assume that (X.18 Assume that (X. S6. The value of P (W = 0) is (A) 0 (B) 1/4 (C) 1/2 (D) 3/4 (E) none of the above. 1) = 1/4 Deﬁne the random variable W = XY . .Y (0. To ﬁnd the cumulative distribution function for W we can compute FW (w) = B A D C k dx dy for 0 ≤ w ≤ 1.

21 Suppose (X.20 Let X be a random variable with E(X) = 0 and Var(X) = 1.Y (x.22 The life lengths of two transistors in an electronic circuit is a random vector (X. S6. y) = 2e−(x+2y) if x ≥ 0 and y ≥ 0 0 otherwise Then P (X + Y ≤ 1) equals (A) (B) (C) 1 1 −(x+2y) dx dy 0 0 2e 1 1−y 0 0 2e−(x+2y) dx dy 1 1 −(x+2y) dx dy 0 y 2e .Self-Test Exercises for Chapter 6 187 S6. Y ) is given by fX. Let Y be a random variable with E(Y ) = 0 and Var(Y ) = 4. Then E(X 2 + Y 2 ) equals (A) 0 (B) 1 (C) 3 (D) 5 (E) none of the above. Y ) where X is the life length of transistor 1 and Y is the life length of transistor 2. S6. y) = 4xy 0 if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise Then E 1 XY equals (A) +∞ (B) 0 (C) 1/2 (D) 4 (E) none of the above. Y ) is a continuous random vector with joint probability density function fX. The joint probability density function of (X.Y (x.

1) = 0. ω2 . pX. pX. 3) = 0. Y (ω1 ) = 0.Y (3. The joint probability mass function for X and Y is given by pX. (c) Find the distribution for the random variable Z = X + Y which is the total daily production of gizmos. .20. Z). Y = 2). ω3 .05. ω4 }. Z(ω3 ) = 3. 0) = 0. (c) Find the probability mass function for X . X(ω4 ) = 1 Y (ω4 ) = 0 Z(ω4 ) = 4 (a) Find the joint probability mass function for (X.Y (3. (b) Find the joint probability mass function for (X. 2) = 0. {ω1 .05. 2) = 0.2 A sample space Ω is a set consisting for four points.Y (2.Y (2. 1) = 0. pX. Z(ω1 ) = 1.1 A factory can produce two types of gizmos.20. Questions for Chapter 6 6. (b) Find the marginal probability mass functions for X and for Y. Y. X(ω2 ) = 0. Type 1 and Type 2. Y (ω3 ) = 1. 1) = 0.10. Y = 2) and P (Y > 0).20. A probability measure. 6. Y ). pX. P (X ≤ 2.10. Z(ω2 ) = 2. Y (ω2 ) = 1.Y (1. as follows: P ({ω1 }) = P ({ω3 }) = 1 2 1 4 P ({ω2 }) = P ({ω4 }) = 1 8 1 8 Random variables X .Y (1. 2) = 0.10. Let X be a random variable denoting the number of Type 1 gizmos produced on a given day.Y (1. pX. (a) Compute P (X ≤ 2. and let Y be the number of Type 2 gizmos produced on the same day. X(ω3 ) = 1.Y (1.188 Questions for Chapter 6 (D) 1 1 0 1−y 2e−(x+2y) dx dy (E) none of the above. Y and Z are deﬁned as X(ω1 ) = 0. pX. pX. P (·) assigns probabilities.

(g) Find the probability mass function for the random variable V = max{X.Y (x. Obviously X ≤ Y . 1).Y (x. 0). y) = 1 0 if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise Find the cumulative distribution function for the random variable Z = X+Y . let Z = Y − X .Y (x. 6. (d) Compute P (Z > 1) and P (Y > 1). (c) Let Z denote the excess life of the second bulb. y) = 2 0 if x > 0. Y }. (f) Find the joint probability mass function for the random variable (W. (e) Find the probability mass function for the random variable T = XY Z . Y ) is given by fX. i. y > 0 and x + y < 1 otherwise .3 Let (X. the probability density function for (X. Find the probability density function for Z . Y ) have a uniform distribution over the rectangle (0. The ﬁrst one to fail burns out at time X and the second one at time Y . The joint probability density function for (X.e. Y ) is given by fX. 1). (0. T ). (1. (1.Questions for Chapter 6 189 (d) Find the probability mass function for the random variable W = X + Y + Z. 6. (b) Find the marginal probability density functions for X and for Y . y) = 2e−(x+y) 0 if 0 < x < y < ∞ otherwise (a) Sketch the region in (x. 6. 0) ..5 If two random variables X and Y have a joint probability density function given by fX. In other words.4 Two light bulbs are burned starting at time 0. y)-space for which the above probability density function assigns positive mass.

2 (b) Find the probability that X will take on a value less than and Y will take on a value greater than 1 . Y ) has joint probability density function fX. 3 (d) Find the marginal probability density functions for X and for Y .Y (x. 6.Y. 6. 0 < y < 2 and 0 < z < 2 otherwise (a) Find the value of the constant α. 1 2 (c) Find the probability density function for the random vector (X. . (e) Find the marginal probability density function for Z = X + Y. y) = kx(x − y) 0 if 0 ≤ x ≤ 2 and |y| ≤ x otherwise (a) Sketch the support of the distribution for (X. (c) Find the marginal probability density function for Y .Z (x. (d) Find the marginal probability density function for X . (b) Find the probability that X will take on a value less than and Y and Z will both take on values less than 1. Y ) in the xy plane. z) = α(xyz 2 ) 0 if 0 < x < 1. (b) Evaluate the constant k . Y ). y. Z) is a random vector with joint probability density function fX.7 Suppose that the random vector (X.6 Suppose (X. 2 1 4 (c) Find the probability that the sum of the values taken on by the two random variables will exceed 2 . (e) Find the probability density function for Z = X + Y .190 Questions for Chapter 6 (a) Find the probability that both random variables will take on a value less than 1 . Y.

25. P .Y (x. .40 and s > 0 otherwise (a) Find the marginal probability density functions for P and for S . 6. y) = x 0 if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 2 otherwise (a) Find the marginal probability density functions for X and for Y . (b) Find the conditional probability density function for X given that Y takes on the value y . of a commodity (in dollars) and total sales. Y is proportion of persons who will respond to a second type of mail-order solicitation.8 The length. (in 10. Y ) with joint probability density function fX.Y (x. (c) Find the probability that sales will exceed 20. 6. and the width. X . p) = 5pe−ps 0 0. S . 000 units) is given by fS. y) = 2 5 (x + 4y) 0 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise (a) Find the marginal probability density functions for X and for Y . 000 units given P = 0.Questions for Chapter 6 191 6.10 If X is the proportion of persons who will respond to one kind of mail-order solicitation.P (s. (c) Find the conditional probability density function for Y given that X takes on the value x. and the joint probability density function of X and Y is given by fX. of salt crystals form a random variable (X. (b) Find the conditional probability density function for S given that P takes on the value p.9 If the joint probability density function of the price. (d) Find the probability that there will be at least a 20% response to the second type of mail-order solicitation. Y .20 < p < 0.

(a) Compute P (X 2 + Y 2 ≤ 0. If an overload occurs. and let pY |X be the conditional probability mass function of . Y ) is given by fX. and the time when fuse 2 blows is a random variable.25). Y ) with the following probability density function:   1 fX. The joint probability density function of the random vector (X. Y . (Note: Be careful of the case |y| = 1.Y (x. 6.) 6.11 An electronic component has two fuses.192 Questions for Chapter 6 (e) Find the probability that there will be at least a 20% response to the second type of mail-order solicitation given that there has only been a 10% response to the ﬁrst kind of mail-order solicitation. Y > 0). (b) Find the marginal distributions for X and Y . let pX|Y be the conditional probability mass function of X given Y . X . y) = e−y 0 0 ≤ x ≤ 1 and y ≥ 0 otherwise (a) Compute P (X + Y ≥ 1). (X. (d) Compute P (X ≥ 0 | Y > 0).Y (x. y) =  π 0 x2 + y 2 ≤ 1 otherwise Hence. (e) Find the conditional distribution for X given Y = y .13 Let X and Y be discrete random variables with the support of X denoted by Θ and the support of Y denoted by Φ. Y ) has a uniform distribution on a disk of radius one centered at (0. Let pX be the marginal probability mass function for X . 0).12 The coordinates of a laser on a circular target are given by the random vector (X. the time when fuse 1 blows is a random variable. (c) Compute P (X ≥ 0. 6. (b) Are X and Y independent? Justify your answer.

Y. of a commodity (in dollars) and total sales.17 Suppose the joint probability density function of the price. 0 < z < 2 otherwise Find E(X + Y + Z).5 | Y = 0.P (s.5 | Y < 0. Z) is a random vector with joint probability density function fX. 6. p) = 5pe−ps 0 0. Y ) is a continuous random vector with joint probability density function fX. 000 units) is given by fS. (in 10. The random vector (X.5). y. 6. y) = 2 0 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 − x otherwise (a) Determine the marginal probability density functions for X and for Y .5). y) = 2 0 if x > 0.25).14 The volumetric fractions of each of two compounds in a mixture are random variables X and Y . S .15 Suppose (X. 6. y > 0 and x + y < 1 otherwise Find E(X + Y ). Y. Show that pX|Y (x|y) = pY |X (y|x)pX (x) pY |X (y|x)pX (x) x∈Θ for any x ∈ Θ and y ∈ Φ.Y (x.Questions for Chapter 6 193 Y given X . Y ) has joint probability density function fX. respectively. (c) Compute P (X < 0. P .20 < p < 0. . 6. (d) Compute P (X < 0.16 Suppose (X. (b) Are X and Y independent random variables? Justify your answer.Y (x.40 and s > 0 otherwise Find E(S | P = 0.Z (x. 0 < y < 2. z) = 3 2 8 (xyz ) 0 if 0 < x < 1.

and let Y be the proportion of persons who will respond to a second type of mail-order solicitation.18 Let X be the proportion of persons who will respond to one kind of mailorder solicitation.15 .05 . After an extensive survey was conducted.25 . it has been determined that the random vector (X.04 .Y (x.5). y) = 2xe−y 0 if 0 ≤ x ≤ 1 and y ≥ 0 otherwise (a) Find P (X > 0. 6.Y (x.01 . y) X (juniors) Y (seniors) 0 1 2 .01 . Plans are being made for a party. (c) Find E(X).20 0 1 2 3 . 6.03 .Y (x.21 The Department of Metaphysics of Podunk University has 5 students.04 . Y ) is a random vector with the following joint probability density function: fX.12 . Let X denote the number of juniors who will attend the party. with 2 juniors and 3 seniors. (b) Find the marginal probability density function for Y .05 . (d) Are X and Y independent random variables? Justify your answer.20 Suppose that (X. and let Y denote the number of seniors who will attend. y) = 2x 0 if 0 < x < 1 and 0 < y < 1 otherwise Find the marginal probability density function for Y .19 Suppose that (X. Y ) has the following joint probability mass function: pX. y) = 2 5 (x + 4y) 0 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise Find E(X | Y = y) for each 0 < y < 1.05 .Y (x. Suppose the joint probability density function of X and Y is given by fX. Y ) is a random vector with the following joint probability density function: fX.194 Questions for Chapter 6 6. 6.

(d) Find P (X > 0. what is the expected amount of money that will be collected from all students? (d) The two juniors arrive early at the party.22 Let (X.Y (x. Y ) be a continuous random vector with joint probability density function given by fX. 6. What is the probability that no seniors will show up? 6. Y ) be a continuous random vector with joint probability density function given by fX.23 Let (X.5 | Y ≤ 0. y ≥ 0 and x + y ≤ 1 . otherwise (a) Compute P (X < Y ).5). (b) Are X and Y independent random variables? Justify your answer. y) = 4xy 0 for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 . (c) Find the conditional probability density function for Y given that X = 1 . y) = 2 0 if x ≥ 0. otherwise (a) Find the marginal probability density function for X and the marginal probability density function for Y . (b) Find the marginal probability density function for X and the marginal probability density function for Y .Questions for Chapter 6 195 (a) What is the probability that 3 or more students (juniors and/or seniors) will show up at the party? (b) What is the probability that more seniors than juniors will show up at the party? (c) If juniors are charged \$5 for attending the party and seniors are charged \$10. (c) Find P (X + Y ≤ 1). 2 (d) Are X and Y independent random variables? Justify your answer. .Y (x.

Y ) be a continuous random vector with joint probability density function given by fX. (a) Find the joint probability mass function for the random vector (X.25 A box contains four balls numbered 1.24 Let (X. and let Y be the number on the second ball.26 Suppose that (X. (c) Find the marginal probability density function for X . A game consists of drawing one of the balls at random from the box. (b) Compute P (X > 0.Y (x. (d) Find E(X + Y ). (c) Find P (X + Y ≤ 1). Y < 1. Y ). Y ) is a random vector with the following joint probability density function:   2xe−y    fX. y) = 6x2 y 0 for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 otherwise (a) Find the marginal probability density function for X and the marginal probability density function for Y . 3 and 4.0). 2. It is not replaced.0). (d) Find E(X + Y ). y) = if 0 < x < 1 and y > 0 otherwise  0    (a) Compute P (X > 0. (b) Compute P (Y ≥ 3 | X = 1).Y (x.5. . 6.5 | Y < 1. 6. Let X be the number on the ﬁrst ball drawn. (b) Are X and Y independent random variables? Justify your answer.196 Questions for Chapter 6 6. A second ball is then drawn at random from the box.

e. (a) Express the probability density function for Z in terms of an integral that involves only fX and fY . independent continuous random variables with probability density functions fX (x) and fY (y). respectively. Let Z = X/Y . What is the probability that the student is a junior? 6. what is the student’s expected grade? (c) A student is chosen at random from the class and you are told that the student has a grade in the 70’s. (b) Now suppose that X and Y can be positive and/or negative.Questions for Chapter 6 197 6.27 Past studies have shown that juniors taking a particular probability course receive grades according to a normal distribution with mean 65 and variance 36. Seniors taking the same course receive grades normally distributed with a mean of 70 and a variance of 36. Express the probability density function for Z in terms of an integral that involves only fX and fY . A probability class is composed of 75 juniors and 25 seniors. .. between 70 and 80)? (b) If a student is chosen at random from the class. Compare your answer to the case where X and Y are both positive. (a) What is the probability that a student chosen at random from the class will receive a grade in the 70’s (i.28 (†) Suppose X and Y are positive.

198 Questions for Chapter 6 .

2.7 TRANSFORMS AND SUMS Transforms Notation Throughout this section we will use the following notation: Deﬁnition 7. u=c In other words.1. and constant c h(k) (c) ≡ dk h(u) duk . 199 . Example: Let h(x) = x3 then h(2) (4) = = = d2 h(x) dx2 d2 3 x dx2 x=4 x=4 3x 2 x=4 = 3(4)2 = 48 Probability generating functions Deﬁnition 7. Then the probability generating function for X is given by gX (z) ≡ E(z X ) for |z| < 1. Let X be a discrete random variable with support on the nonnegative integers. For any differentiable function h(u). h(k) (c) is the k th derivative of h evaluated at u = c.

. . . .1. gX (0) = (1)P (X = 0) Now take the ﬁrst derivative of gX (z). .200 Transforms Example: Let X have a Poisson distribution with parameter α > 0 Then pX (k) = e−α αk k! for k = 0. We have gX (z) = E(z X ) ∞ = k=0 z k P (X = k) Writing out the sum explicitly. . . Recalling that ex = ∞ k k=0 (x /k !). we get. . (Informal proof) Let g (k) (z) denote the k th derivative of g with respect to z . If gX (z) is the probability generating function of X . we get ∞ k=0 gX (z) = E(z X ) = zk e−α αk k! = e−α (αz)k k! k=0 ∞ = e−α eαz = eα(z−1) Theorem 7. gX (z) = P (X = 1) + 2zP (X = 2) + 3z 2 P (X = 3) + 4x3 P (X = 4) + . 1. . . . gX (z) = z 0 P (X = 0) + z 1 P (X = 1) + z 2 P (X = 2) + z 3 P (X = 3) + . . 2. . gX (0) = P (X = 1) (1) (1) Another derivative produces. then P (X = k) = gX (0) k! (k) Proof. .

gX (1) = P (X = 1) + 2P (X = 2) + 3P (X = 3) + 4P (X = 4) + . . . . gXn (z). . in general gX (0) = n!P (X = n) (n) Theorem 7. gX (z) = 3(2)P (X = 3) + 4(3)(2)zP (X = 4) + 5(4)(3)z 2 P (X = 5) + . . . . gX (0) = 3(2)P (X = 3) (3) (3) And. evaluated at z = 1. .3. . . Recall that if X and Y are independent random variables. . then the probability generating function of the random variable Y = X1 + X2 + · · · + Xn is n gY (z) = i=1 gXi (z). Xn are independent random variables with probability generating functions gX1 (z). . . . gX2 (z). ∞ (1) (1) (1) = k=0 kP (X = k) = E(X) Theorem 7. As above. X2 . Proof. . .Transforms (2) 201 gX (z) = 2P (X = 2) + 3(2)zP (X = 3) + 4(3)z 2 P (X = 4) + 5(4)z 3 P (X = 5) + . Proof. respectively. .2. If gX (z) is the probability generating function of X . then E(h(X)g(Y )) = E(h(X))E(g(Y )) . then E(X) = gX (1) (1) where gX (1) is the ﬁrst derivative of gX (z) with respect to z . If X1 . . gX (z) = P (X = 1) + 2zP (X = 2) + 3z 2 P (X = 3) + 4x3 P (X = 4) + . gX (0) = 2P (X = 2) (2) And once more to see the pattern.

it is a direct result of Theorem 7.1.. We have. and a is a nonnegative. The probability generating function of a random variable uniquely determines its probability distribution. then the probability generating function of the random variable Y = a + X is gY (z) = z a gX (z).4. Except for some technical details. Theorem 7.202 Transforms Therefore. P (T = 1) = p P (T = 0) = 1 − p = q Then the probability generating function of T is given by gT (z) = z 0 q + z 1 p = q + zp . integer constant.+Xn ) = E(z X1 z X2 · · · z Xn ) = E(z X1 )E(z X2 ) · · · E(z Xn ) = gX1 (z)gX2 (z) · · · gXn (z) Theorem 7.. If gX (z) is the probability generating function of X . gY (z) = E(z Y ) = E(z a+X ) = E(z a z X ) = z a E(z X ) = z a gX (z) We offer Theorem 7. Proof.5.5 without proof. Example: Let T have a Bernoulli distribution with parameter p. gY (z) = E(z Y ) = E(z X1 +X2 +.

T2 . Example: Let X have a uniform distribution on the interval (a.4. T1 . One could obtain the probability generating function for X is given by computing gX (z) = E(z X ) = n k=0 zk n k (n−k) p q k But we can use Theorem 7. b). . . 1. Deﬁnition 7.3. . use Theorem 7. . . Then the moment generating function for X is given by MX (t) ≡ E(etX ). The kth moment about the origin of a random variable X is deﬁned as E(X k ). Moment generating functions Deﬁnition 7. Let X be (almost) any random variable.Transforms 203 Example: Now suppose. Tn are n independent Bernoulli random variables each with P (Ti = 1) = p P (Ti = 0) = 1 − p = q We know (from Chapter 4) that X = T1 + T2 + · · · + Tn has a Binomial distribution with parameters n and p. Then   fX (x) = 1 b−a  0 if a ≤ x ≤ b otherwise . . .3 with the Ti ’s to easily get gX (z) = gT1 (z)gT2 (z) · · · gTn (z) = (q + zp)n As an exercise.2 to get E(X) from gX (z). n. . That is P (X = k) = n k (n−k) p q k for k = 0.

we get tX ∞ MX (t) = E(e ) = k=0 etk ∞ e−α αk k! = e−α (αet )k k! k=0 t t −1) = e−α eαe = eα(e Note that since a Poisson random variable X has support {0. . When a random variable has both a probability generating function and a moment generating function.204 Transforms Therefore. 2. . . 1. 1. . 2. Recalling that ex = ∞ k k=0 (x /k !). . we can derive the following relationship: gX (z) = E[z X ] substituting z = et . . we obtain gX (et ) = E[(et )X ] = E[etX ] = MX (t) Since the probability generating function for a Poisson random variable is gX (z) = eα(z−1) . MX (t) = E(etX ) = = = = = ∞ −∞ b a etx fX (x) dx etx 1 dx b−a b a 1 t(b − a) tetx dx b a 1 etx t(b − a) etb − eta t(b − a) for t = 0 Example: Let X have a Poisson distribution with parameter α > 0 Then pX (k) = e−α αk k! for k = 0. it can have a probability generating function and a moment generating function.}.

Introductory probability theory and statistical applications. .. P. . . for every t. Continuing. Therefore.1 ) Let X ∼ N (µ. Reading MA. σ 2 ). .Transforms 205 the moment generating function for a Poisson random variable must be MX (t) = gX (et ) = eα(e t −1) It’s important to remember that this relationship will only hold for random variables with support {0. thus x = σs + µ and dx = σds. 2. MX (t) = 1 1 √ 2π +∞ −∞ exp[t(σs + µ)]e−s 2 /2 ds Meyer. those random variables with probability generating functions). Example: Let X have an exponential distribution with parameter α > 0. 1. 1 MX (t) = √ 2πσ +∞ −∞ 2 etx exp − 1 x−µ 2 σ 2 dx Let (x − µ)/σ = s. 1965.} (that is. Then 1 1 x−µ fX (x) = √ exp − 2 σ 2πσ Therefore. MX (t) = E(etX ) = 0 ∞ ∞ etx αe−αx dx αex(t−α) dx = 0 Note that this integral only converges if t < α. Then fX (x) = αe−αx if x ≥ 0 0 otherwise Therefore. Addison-Wesley. MX (t) = = α x(t−α) ∞ e t−α 0 α for t < α α−t Example: (From Meyer.

. 2. then E(X k ) = MX (0) (k) for k = 0. we get. .206 +∞ −∞ +∞ −∞ Transforms 1 = etµ √ 2π 1 = etµ √ 2π = etµ+σ 2 t2 /2 1 exp(− 2 [s2 − 2σts]) ds exp{− 1 [(s − σt)2 − σ 2 t2 ]} ds 2 +∞ −∞ 1 √ 2π exp(− 1 [s − σt]2 ) ds 2 Now let s − σt = v . MX (t) = E 1 + tX + (tX)2 (tX)3 (tX)4 + + + ··· 2! 3! 4! If the sum in the right hand side converges.6. . (Informal proof) Let MX (t) denote the k th derivative of MX with respect to t. Proof. . we get . We have MX (t) = E(etX ) = E (tX)k k! k=0 ∞ (k) Writing out the sum explicitly. If MX (t) is the moment generating function of X . . . . . +∞ −∞ e−v 2 /2 dv = e(tµ+σ 2 t2 /2) Theorem 7. . MX (t) = E(X) + tE(X 2 ) + (1) t2 E(X 3 ) t3 E(X 4 ) + + ··· 2! 3! . MX (t) = 1 + tE(X) + t2 E(X 2 ) t3 E(X 3 ) t4 E(X 4 ) + + + ··· 2! 3! 4! Take the ﬁrst derivative of both sides to get . then ds = dv and we obtain MX (t) = etµ+σ 2 t2 /2 1 √ 2π . 1.

we see that. . MX (0) = E(X) (1) Take the second derivative of both sides to get . .8. . MX (t) = E(X 2 ) + tE(X 3 ) + (2) t2 E(X 4 ) t3 E(X 5 ) + + ··· 2! 3! and when evaluated at t = 0. If MX (t) is the moment generating function of X . Theorem 7. respectively. . The moment generating function of a random variable uniquely determines its probability distribution.9. MX (t) = E(X k ) + tE(X k+1 ) + (k) t2 E(X k+2 ) t3 E(X k+3 ) + + ··· 2! 3! and when evaluated at t = 0. MX (0) = E(X k ) (k) Theorem 7. . MX (0) = E(X 2 ) (2) Continuing. σ 2 ) then the moment generating function for X is MX (t) = etµ+ 2 t 1 2 2 σ . . . Xn are independent random variables with moment generating functions MX1 (t). .Transforms 207 and when evaluated at t = 0. If X1 . MXn (t). . . X2 .7. Theorem 7. . . MX2 (t). . . . . . then the moment generating function of the random variable Y = aX +b is MY (t) = ebt MX (at). then the moment generating function of the random variable Y = X1 + X2 + · · · + Xn is n MY (t) = i=1 MXi (t). . Example: If X ∼ N (µ. and a and b are constants.

8). . suppose you are given that the random variable X has a normal distribution with mean µ and variance σ 2 .208 Transforms So. Therefore. if someone tells you that the random variable W has moment generating function 2 MW (t) = e−2t+4t then the distribution of W must be N (−2. And. 1). we have shown that if X ∼ N (µ.8 with a= 1 σ and b=− µ σ we ﬁnd that the moment generating function for Z is MZ (t) = ebt MX (at) = e−tµ/σ MX (t/σ) = e[− σ ] e = e = et 2 /2 tµ t [ σ µ+ t2 σ2 ] 2σ 2 [− tµ + tµ + σ σ t2 σ2 ] 2σ 2 which is the moment generating function of a normal random variable with expected value equal to zero and variance equal to one. σ 2 ) then Z = (X − µ)/σ ∼ N (0. Now. 1). if you encounter a random Y with moment generating function MY (t) = et 2 /2 you can immediately say that Y ∼ N (0. So the moment generating function for X is MX (t) = etµ+ 2 t 1 2 2 σ Consider the random variable Z= 1 µ X −µ = X− σ σ σ Using Theorem 7.

Linear combinations of normal random variables Theorem 7. an and b are constants. σi ) and a1 . X2 . Note that the moment generating function for a continuous random variable X with probability density function fX (x) is MX (t) = ∞ −∞ etx fX (x) dx and the Laplace transform for the same density function is LX (t) = ∞ −∞ e−tx fX (x) dx So the moment generating function of a continuous random variable is simply the Laplace transform of the probability density function with t replaced by −t everywhere in the formula for the transform.10.5. Characteristic function Deﬁnition 7.Linear combinations of normal random variables 209 Other transforms of random variables Laplace transform Deﬁnition 7. . . a2 . Xn are independent normal random variables with 2 Xi ∼ N (µi . Then the Laplace transform for X is given by LX (t) ≡ E(e−tX ). . Then the characteristic function for X is given by ψX (t) ≡ E(eitX ) √ where i ≡ −1. then n Y =b+ i=1 2 ai Xi ∼ N (µY . σY ) where µY = b + n ai µi i=1 . Let X be (almost) any random variable. . Let X be any random variable.6. . . if X1 . . .

1) = 1 − [P (Z ≤ 2) − P (Z ≤ −2)] = 1 − [P (Z ≤ 2) − (1 − P (Z ≤ 2))] = 2 − 2P (Z ≤ 2) = 2 − 2(0.210 Linear combinations of normal random variables and 2 σY = n 2 a2 σi . it is reasonable to assume that the weights of n snails chosen at random from the population of all snails are n independent identically distributed random variables. T = X1 + X2 + X3 + X4 . Therefore. . 16).8413 = 0. Xn . That is. . . i i=1 Example: The weights of snails are normally distributed with mean 6 gm. . are iid with each Xi ∼ N (6. 1) = 1 − P (Z ≤ 1) = 1 − 0. 1). Xn . X2 . 4). . X2 . Therefore.? Since X1 . 4). . are iid with each Xi ∼ N (6. P (T > 28) = P 28 − 24 T − 24 > 4 4 = P (Z > 1) where Z ∼ N (0. .0456 .1587 3 Once again. . suppose that 4 snails are selected at random. Let’s ﬁnd the probability that the observed average weight deviates from the expected weight by more than 2 gm. 4). . X1 . Therefore. we have X = 4 1 i=1 Xi ∼ N (6. we want to ﬁnd P (|X − 6| > 2). with each Xi ∼ N (6. Xn . X2 . 3 Suppose that 4 snails are selected at random. Let X = 1 (X1 + 4 X2 + X3 + X4 ) denote the average weight of the 4 snails. 4 P (|X − 6| > 2) = 1 − P (|X − 6| ≤ 2) = 1 − P (−2 ≤ X − 6 ≤ 2) = 1−P −2 X −6 2 ≤ ≤ 1 1 1 = 1 − P (−2 ≤ Z ≤ 2) where Z ∼ N (0. . and variance 4 gm2 . and we would like to ﬁnd the probability that the total weight. . . we have T = 4 i=1 Xi ∼ N (24. Since X1 .9772) = 0. will exceed 28 gm.

. . . X2 . . . we can let M (t) denote the common moment generating function of each X1 . . . Then n X ∼ N (µ. and MZn (t) = e−µ √ nt/σ √ M (t/(σ n)) n We are going to need the following two facts: . we get √ √ (1) MZn (t) = e−µ nt/σ MSn (t/(σ n)) Since each Xi has the same moment generating function for all i. . is a random variable. We will get the result by showing that the moment generating function for Zn must approach the moment generating function of a N (0. Xn be a random sample from a N (µ. Zn = then n→∞ Sn − nµ √ . Then MZn (t) = M Sn√ (t) = M −nµ σ n Sn √ − σ n √ nµ σ (t) Since MaY +b (t) = etb MY (at) for any random variable Y and any constants a and b.The Central Limit Theorem 211 The distribution of the sample mean Let X1 . First note that E(Sn ) = nµ and Var(Sn ) = nσ 2 .11. X2 . The Central Limit Theorem Theorem 7. MSn (t) = [M (t)]n . . . σ n lim P (Zn ≤ z) = Φ(z). . and 2. X2 . 1) random variable. . X1 . and 3. Remember that X . Since each Xi has the same moment generating function for all i. (Central Limit Theorem) If 1. Xn . we can let M (t) denote the common moment generating function of each X1 . Let X = n 1 i=1 Xi . . σ 2 /n). Therefore. Xn is a random sample with E(Xi ) = µ and Var(Xi ) = σ 2 for all i. X2 . Xn . . Proof. Sn = X1 + X2 + · · · + Xn . σ 2 ) distribution. . the sample mean. .

Now. Therefore.e.. for suitably large n.212 E(X k ) k t k! k=1 (−1)j−1 j ξ j j=1 ∞ ∞ The Central Limit Theorem (A) M (t) = 1 + (B) ln(1 + ξ) = for |ξ| < 1. we can make this summation as small as we want. E(X 2 ) − E(X)2 2σ 2 t2 + [other terms] But E(X) = µ and E(X 2 ) − E(X)2 = σ 2 . As n → ∞. Collect similar powers of t to obtain √ −µ nt E(X) √ +n t+ ln MZn (t) = σ σ n Hence. ln MZn (t) = √ (−µ + E(X)) n t+ σ E(X) 1 E(X)2 − 2!σ 2 n 2 σ 2 n t2 + · · · . permitting us to use fact (B). so ln MZn (t)   √ ∞  −µ nt (−1)j−1 = + n  σ j j=1 =  √ −µ nt + n σ ∞ j E(X k ) k! k=1 t √ σ n k t √ σ n ∞ k    E(X k ) k! k=1 ∞ 1 − 2 E(X k ) k! k=1 t √ σ n k 2  + · · · . 2 . the summation (i. Hence. take the natural log of both sides of equation (1) and use fact (A): √ t −µ nt ln MZn (t) = + n ln M √ σ σ n √ ∞ k −µ nt E(X k ) t √ = + n ln 1 + . 1 ln MZn (t) = 0t + t2 + [other terms]. σ k! σ n k=1 Note that the natural log term looks like fact (B) with ξ= E(X k ) k! k=1 ∞ t √ σ n k . ξ ) will eventually be less than one.

We already know that Sn = has a Binomial distribution with parameters n and p. lim ln MZn (t) = lim MZn (t) = et t2 . . The DeMoivre-Laplace Theorem A proof of the Central Limit Theorem was provided by Liapounoff in 1902. . n i=1 Ti Then E(Ti ) = p and Var(Ti ) = p(1 − p). 1. that is let P (Ti = 1) = p P (Ti = 0) = 1 − p.e.12. T2 . . 1). Tn be a random sample from a Bernoulli distribution. . as the number of Bernoulli trials increases) the distribution of Zn = Sn − np np(1 − p) approaches the distribution of a N (0. So limn→∞ P (Zn ≤ a) = Φ(a). 2 . that is P (Sn = k) = n k p (1 − p)n−k k for k = 0. because of the Central Limit Theorem. 1) random variable.2 we get the following Theorem 7. Hence. . 2 n→∞ Therefore. . 2 /2 n→∞ which is the moment generating function for N (0. . (DeMoivre-Laplace Theorem) As n gets large (i. .. However. DeMoivre and Laplace didn’t know about the Central Limit Theorem and had to prove this in a different way. Of course. But DeMoivre (1733) and Laplace (1812) proved a similar result for the following special case: Let T1 . n.The Central Limit Theorem 213 But the “[other terms]” for tn with n ≥ 3 go to zero as n → ∞ because there is a power of n in the denominator of each of the “[other terms]” (expand the series further if you need convincing).

But for ﬁnite n. X2 .” as follows: Corollary 7. we can use the Central Limit Theorem to say that the probability distribution of X approaches a N (µ. . furthermore. Then the probability distribution of the 1 sample mean X = n n Xi is N (µ. since X is composed of a sum of independent random variables. . i=1 • If X1 . this is not very useful (and not precise). X2 . Then. In the limit.13. np(1 − p)) random variable. . . the distribution of the random variable Sn approaches the distribution of a N (np. Suppose n is an integer and 0 < p < 1. σ 2 /n). . i=1 . . (k−np)2 n k −1 e 2 np(1−p) p (1 − p)n−k ≈ √ 1 2πnp(1−p) k This approximation usually works well when np(1 − p) > 9. because as n → ∞ both the mean np and variance np(1 − p) go to ∞. X2 . we can summarize both results as follows: • If X1 . we get E(X) = µ Var(X) = σ 2 /n But. Xn is a random sample with E(Xi ) = µ and Var(Xi ) = σ 2 but the distribution of the Xi is not normal. . . . Combining this fact with the case discussed on page 211. σ 2 /n) distribution. we can still say that the probability distribution of the sample 1 mean X = n n Xi is approximately N (µ. then for “large” values of n. we get what is often called the “normal approximation to the binomial. . The distribution of the sample mean for large n Let X1 . And recall that the sample mean is deﬁned as X= 1 n n Xi i=1 Simply using the rules for expected value and variance.214 The Central Limit Theorem Another (more casual) way to state the DeMoivre-Laplace Theorem is to say that as n → ∞. Xn be a random sample from (just about any) distribution with E(Xi ) = µ and Var(Xi ) = σ 2 . because of the Central Limit Theorem. σ 2 /n). Xn is a random sample from a normal distribution with E(Xi ) = µ and Var(Xi ) = σ 2 . . .

16) are independent random variables. Answers can be found in Appendix B. Then the moment generating function for Y = X1 + X2 is (A) e0.9772 (E) none of the above. 9) and Y ∼ N (2.5t 2 Let X1 and X2 be two independent random variables.Self-Test Exercises for Chapter 7 215 Self-Test Exercises for Chapter 7 For each of the following multiple-choice questions.5t (C) et 2 . S7. choose the best response among those provided. S7.5199 (C) 0. S7. each with the same probability distribution as X .8413 (D) 0.1587 (B) 0. Then P (X + Y < 9) equals (approximately) (A) 0.1 Let X be a random variable with probability generating function g(z) = z + z5 + z6 3 The value of E(X) equals (A) 0 (B) 1 3 (C) 1 (D) 4 (E) none of the above.3 Suppose X is a random variable with moment generating function MX (t) = E(etX ) = e0.2 Suppose X ∼ N (2.5t 2 2 (B) 2e0.

. normally distributed random variables each with mean 0 and variance 1. The value of P (X > Y ) is (A) 0 (B) 0.5t 4 (E) none of the above. Then S = n Ti is the i=1 total number of good light bulbs in the batch. then 3X − Y (A) has a normal distribution with mean 2 and variance 1 (B) has a normal distribution with mean 0 and variance 10 (C) has a normal distribution with mean 0 and variance 2 (D) has a normal distribution with mean 2 and variance 4 (E) none of the above. 1) and Y ∼ N (0.216 Self-Test Exercises for Chapter 7 (D) e0. S7. and {Ti = 0} if light bulb i is bad.6 Let X and Y be independent random variables with X ∼ N (0. T2 . Then P (S ≥ 1) equals (A) 1 − pn (B) 1 − (1 − p)n (C) 2 ∞ √1 e−x /2 dx 1 2π (D) 1 (E) none of the above. . If p is the probability that any one light bulb is good. S7. then each Ti has probability mass function pTi (k) = 1−p p if k = 0 if k = 1 where 0 < p < 1. 2). Let T1 .4 An automated assembly line produces incandescent light bulbs.50 .05 (C) 0. .5 If X and Y are independent. Tn be n independent identically distributed random variables with {Ti = 1} if light bulb i is good. Suppose that a batch of n bulbs are selected at random. S7. .

95 (E) none of the above. X3 . Let Y = X1 + X2 + X3 + X4 . 4. Then Var(Y ) equals (A) 1/16 (B) 1/4 (C) 4 (D) 16 (E) none of the above.7 Let X be a random variable with probability generating function g(z) = z(1 + z) 2 The value of P (X = 0) is (A) 0 (B) 1/4 (C) 1/2 (D) 1 (E) none of the above. X2 . Then the random variables W1 = X + Y W2 = X − Y (A) have the same mean (B) have the same variance (C) are both normally distributed . normally distributed random variables each with mean 0 and variance 1.Self-Test Exercises for Chapter 7 217 (D) 0. S7.9 Let X and Y be independent. 2. S7. 3. X4 be independent random variables. each with E(Xi ) = 0 and Var(Xi ) = 1 for i = 1. S7.8 Let X1 .

S7. The probability that Jack will need to make more than 100 trips to the well is (A) 0.0636 .11 Let X be a random variable with moment generating function MX (t) = (1 − t)−1 .218 Self-Test Exercises for Chapter 7 (D) all of the above (E) none of the above. The amount of water carried on any trip is independent of all other trips. If X and Y are independent.12 Jack is attempting to ﬁll a 100 gallon water tank using a one gallon bucket. then X + Y + 1 has a moment generating function equal to (A) et (1 − t)−2 (B) (1 − t)−2 (C) 2et (1 − t)−1 (D) es (1 − t)(1 − s) (E) none of the above. Jack repeatedly goes to a well at the top of a hill. Let Y be a random variable with moment generating function MY (s) = (1 − s)−1 .10 Let X be a random variable with moment generating function MX (t) = et (1 + e2t ) 2 The value of E(X) is (A) 0 (B) 1 (C) 2 (D) 3 (E) none of the above. ﬁlls his one gallon bucket with water and brings that water down to the 100 gallon tank. The amount of water he carries on any one trip from the well to the tank is normally distributed with mean 1 gallon and standard deviation 0.05 (B) 0.1 gallon. S7. S7.

. 2).0 (E) none of the above.13 Let X and Y be independent random variables with X ∼ N (0.2580 (D) 0.3830 (C) 0.15 Let X be a discrete random variable with probability mass function given by 1 P (X = 0) = 3 P (X = 1) = 2 3 Then the moment generating function for X is (A) e2t/3 (B) (C) (D) 1 3 (1 1 2 (1 + 2et ) + et ) + e2t ) 1 t 2 e (1 (E) none of the above. 2) and Y ∼ N (1.6915 (D) 1.5 (C) 1 (D) P (X + Y < 1) (E) none of the above. S7.14 Let X and Y be independent random variables with X ∼ N (1.Self-Test Exercises for Chapter 7 219 (C) 0. The value of P (X − Y < 1) equals (A) 0 (B) 0. Then P (|X − Y | ≤ 1) equals (A) 0 (B) 0. 1) and Y ∼ N (0. 2).5 (E) none of the above. S7. S7.

Let W = n Xi and deﬁne the random i=1 variable W − 2n Z= c Then Z ∼ N (0. 4) for i = 1.220 Self-Test Exercises for Chapter 7 S7. . . . . Deﬁne the random variable Y = 1 n n Xi i=1 Then E(Y ) equals (A) 0 . . 2. .18 Let X1 .16 Suppose X1 . Xn are independent random variables with each Xi ∼ N (2.17 Let X be a random variable with moment generating function MX (t) = et (1 + et ) 2 The value of E(X) is (A) 0 (B) 2 3 (C) 1 (D) 3 2 (E) none of the above. . 1) if and only if (A) c = 1 (B) c = n √ (C) c = 2 n √ (D) c = 2/ n (E) none of the above. identically distributed random variables with each Xi having a probability mass function given by P (Xi = 0) = 1 − p P (Xi = 1) = p where 0 ≤ p ≤ 1. . X2 . . . . . S7. n. S7. Xn be independent.

2 (a) Find P (X > 18 | X > 13). Y .25)e27t + (0. and Z are independent random variables with X ∼ N (1. (b) Find E(X). (d) Find the moment generating function for X . (a) Compute P (X ≤ 30).4 Let X be a random variable with nonnegative.25)e49t . 4). .2 A random variable has the moment generating function M (t) = 2 . Questions for Chapter 7 7. Y ∼ N (2.3 The moment generating function for a random variable X is given by M (t) = (0.1 The probability generating function for a random variable X is given by g(z) = z 10 + z 20 . 2−t What is the expected value for this random variable? 7.Questions for Chapter 7 221 (B) p (C) p2 (D) 1 (E) none of the above. (c) Find E(|X − 15|). 9) and Z ∼ N (3. M (t) = t (1 − p) 1−e What is the value of P (X = 0)? 7. 7. 7. integer support and moment generating function p .5)e26t + (0. 16). (b) Find E(X) and E(X 2 + 1).5 Suppose that X .

. .01 and mean µ that can be varied by adjusting the packaging equipment.01? 7. Cartons of items have individual weights that are normally distributed with mean 25 pounds and standard deviation 2 pounds. Hint: Don’t use Chebychev’s inequality.5 ton (1000 pound) capacity. .95 where X n denotes the sample mean of X1 . . . 0. (c) Compute P [(X + Y + Z)2 ≤ 25]. n. Explain the difference in your answers.7 A large forklift has a 0. (b) Find the number a such that P (3−4a < Z < 3+4a) = 0. each with E(Xi ) = 0 and Var(Xi ) = 1 for all i = 1. . You should be able to get this answer knowing the exact distribution of Xn . (d) Compute P [(3X − Z)2 > 52]. . . X2 . (a) What value should µ be adjusted to so that the producer can guarantee that the probability that the net weight of a box is less than one pound is only 0. (b) Solve question (a) again using the additional information that each of the Xi is normally distributed. The net weight (in pounds) is normally distributed with variance 0. . (c) Compare the answers to (a) and (b).3174 7. These weights can be assumed to be independent and identically distributed. Approximately how many cartons can the forklift carry so that the probability that its total load exceeds its capacity is only 0.6 Let X1 .222 Questions for Chapter 7 (a) Compute P (X + Y ≤ 1). Xn be independent and identically distributed random variables. 2.9 A one-pound box of breakfast cereal never contains exactly one pound of product.8 A fair coin is tossed 5000 times. (a) Using Chebychev’s inequality. X2 . . such that P (|X n | ≤ a) ≥ 0. . ﬁnd a number a (that will depend on n).05? . .95. 7. Approximately what is the probability that at least 1965 of the tosses will turn up heads? 7. Xn .

Hint: P (X1 > X2 ) = P (X1 − X2 > 0).0025 (ounces)2 .10 The distribution of test scores on an examination is approximately normal with mean 70 and standard deviation 7. . (b) Suppose that two bottles of Phizzi-Soda are chosen at random. Find P (X < 12). (a) Let X1 denote the ﬁll-volume of a single bottle of PhizziSoda. Assume that X is normally distributed with mean 12. the quantity of Phizzi-Soda in a single bottle is a random variable X . Find P (X1 < 12). what is the probability that the average of their scores lies between 68 and 72? (c) What is the probability that the difference in the scores of two students (picked at random) lies between −5 and +5? 7.12 The Phizzi Soft Drink Company’s best selling product is Phizzi-Soda that it sells in 12-ounce bottles. (a) What is the probability of receiving a grade less than 80? (b) If 49 students take the test.11 True or False: If X and Y are normally distributed random variables with mean 0 and variance 1. (c) Let X = (X1 +X2 )/2 be a random variable representing the average ﬁll-volume of the two bottles selected in question (b).Questions for Chapter 7 223 (b) What is the probability that a carton of 16 cereal boxes produced according to the speciﬁcations determined in part (a) will not exceed a total net weight of 19. 7. then 3X − Y has a normal distribution with mean 0 and variance 10.28 pounds? 7. When the automated ﬁlling machine is working properly. Find P (X1 > X2 ). Let X1 and X2 be independent random variables denoting the ﬁll-volume of each of the two bottles.1 ounces and variance 0.

7. identically distributed random variables.5 kgm? (b) Suppose that ten packages of paper are weighed together. (b) Find P (X > Y ). (a) Find Var(X).15 Prove that if gX (z) ≡ E(z x ) is a probability generating function for X . (d) Compute P (0 ≤ X < 2).25(kgm)2 . 0 otherwise where θ is a constant with θ > 0. .16 Let X and Y be independent. (c) Compute P (|X − 1| < 4).14 Individual packages of writing paper have weights that are normally distributed each with mean 2 kgm and variance 0. What is the probability that their average weight is greater than 2. (a) Find the number a having the property that P (Y > a) = 0. The weights of all packages can be considered independent. What is the probability that its weight is greater than 2. 9) and Y ∼ N (6. (b) Find the moment generating function for X . normally distributed random variables with X ∼ N (1. then limz→1 gX (z) = 1. What is the probability that their total weight is greater than 25 kgm? (c) Suppose that ten packages of paper are weighed together. (a) Suppose that one package of paper is weighed. 7.975.224 Questions for Chapter 7 7.1 kgm? 7.13 Let X be a continuous random vector with probability density function given by 1/θ for 0 ≤ x ≤ θ fX (x) = . Your answers to the following questions might depend on the constant θ. 16).

10 cm. σ = 0. But it is well-known that the amount of water (in gallons) carried by a single helicopter can be represented by a random variable that is normally distributed with mean 1000 and variance 1600. . (a) Find E(Y ) and Var(Y ) (b) What is the probability that Mr.03 cm.18 At one stage in the assembly of an article. The amount of water carried the 200 individual helicopters can be assumed to be a collection of independent and identically distributed random variables X1 . Xn with n = 200. wants to ﬁll one of the swimming pools at his mountain-top home. If the components are selected at random for the assembly. a cylindrical rod with a circular cross-section has to ﬁt into a circular socket. Let Y = n Xi be a random variable denoting the total water carried by i=1 all of the n = 200 helicopters.17 Referring to the previous question. Y2 . Quality control measurements show that the distribution of rod diameter and socket diameter are independent and normal with parameters: Rod diameter: µ = 5.19 Bill Gutts. . . . one of the wealthiest men on earth.01 cm. show that if Y1 .04 cm. The amount of water transported varies from helicopter to helicopter. The elevation of the pool (and his home) is very high so that it is impossible to pump water up the mountain. σ = 0. . Gutts plans to transport the water to the mountain-top pool using a ﬂeet of 200 helicopters with each helicopter carrying a small water-tank. what proportion of rods will not ﬁt? 7. X2 . then n W = i=1 Yi is a log-normal random variable. For that reason. The capacity of the swimming pool is 198869 gallons. Gutts can ﬁll his swimming pool using the 200 helicopters? . . Mr.Questions for Chapter 7 225 7. Yn are independent log-normal random variables. Socket diameter: µ = 5. . . 7.

Gutts would be happy if the probability that his swimming pool is completely ﬁlled is only 0. 7. even for a wealthy guy like Mr. X has density function fX (x) = 1 if 0 < x < 1 0 otherwise The discrete random variable Y is a Bernoulli random variable with p = 1/2. Gutts. Mr.226 Questions for Chapter 7 (c) Helicopters are very expensive.282). Gutts is given the bad news that the amount of water carried by each helicopter does not have a N (1000. The continuous random variable X has a uniform distribution on the interval (0. How many helicopters should he use? (d) In an E-mail message. Use the Central Limit Theorem to ﬁnd the approximate probability that Mr. 1).718. That is. That is. 1600) distribution.50. Suppose Mr. 1/2 if k = 0 P (Y = k) = 1/2 if k = 1 Find the probability distribution of the random variable W = X + Y . . 1069. Instead it is has a uniform distribution on the interval (930.20 (†) Suppose X and Y are independent random varaibles. Gutts can ﬁll his swimming pool if he uses 200 helicopters.

. θ). we often write it’s probability density function as fX (x. A random sample is any sequence. 227 . . X2 .. Xn representing n independent observations from this random variable has a joint probability density function given by fX1 . X2 . Example: Let X have an exponential distribution with probability density function fX (x. xn . . Xn ). .X2 . Salada TM tea bag tag Basic Concepts Random samples Deﬁnition 8. θ) = 1 1 − θ (x1 +x2 +···+xn ) θe 0 xi > 0 for all i otherwise Statistics Deﬁnition 8.8 ESTIMATION Good and bad luck are often mistaken for good and bad jusment – ANONYMOUS. of independent.1. x2 . . .. . If a continuous random variable X has an unknown parameter θ. identically distributed random variables.Xn (x1 . . . h(X1 . identically distributed random variables. θ) = 1 1 −θx θe 0 x>0 otherwise where θ is unknown. A statistic is any function. . . . X2 . . . X1 . . Then a random sample. .2. X1 . Xn of independent...

. Xn ) = X1 ˆ The random variable θ is the ﬁrst observation. . Xn be a random sample. . . . ˆ 3. θ(X1 . . There are two important statistics that arise so frequently that they are given special symbols: Deﬁnition 8. . Xn as representing the future data values that will be obtained by making n independent observations of our experiment. . with the others discarded. X2 . . X2 . Let X1 . X2 . . .3. X2 . X2 . . . . You can think of the random variables X1 . . . then it is often written as θ(X1 . . .228 Basic Concepts If a statistic is being used to establish statistical inferences about a particular ˆ unknown parameter θ. . . θ(X1 . . X2 . . Xn ). . . X2 . . . θ(X1 . Example: Let X1 . ˆ 4. θ(X1 . . . Xn ) = n n Xi i=1 ˆ The random variable θ is the average of the observations. . X2 . i=1 . Xn ) ˆ The random variable θ is the largest of the observations. . . . The sample mean is the statistic 1 n X≡ Xi n i=1 and the sample variance is the statistic s2 ≡ X n 1 n−1 (Xi − X)2 . Xn ) = i=1 Xi ˆ The random variable θ is the product of the observations. . . . Note: Any function of a random sample can be called a statistic. Then the following are statistics: 1 ˆ 1. Xn ) = max(X1 . Xn be a random sample. n ˆ 2. X2 . .

. . . ˆ E(θ) = θ for all θ. X2 . xn ). This number can be thought of as an actual observation of the random variable ˆ ˆ θ(X1 . X2 . . Xn is to be drawn from this distribution. If X1 . A random sample X1 . θ). . Xn ) has been already been selected for use.5. . We call the random variable θ(X Since any statistic can be called an estimator of any parameter.Point Estimation 229 Point Estimation The problem Let X be a random variable with cumulative distribution function FX (x. ˆ ˆ Applying the function θ to these observations yields the number θ(x1 . we will have to ﬁnd ways to distinguish good estimators from poor estimators. . . . where θ is unknown. . . Xn ) = n is an unbiased estimator for θ. value of θ. x2 . . Xn be a random sample from a discrete random variable X with probability mass function pX (x. . . X2 . . We will call the number θ(x1 . . n Xi i=1 Finding Estimators Method of maximum likelihood Deﬁnition 8. .e. . . Let x1 . . xn denote the actual observed values taken on by X1 . . . . . X2 . X2 . .. Xn ) an estimator for θ. . xn ) our estimate for the ˆ 1 . . . . X2 . . .4. . Let X1 . i. Suppose that a statistic ˆ θ(X1 . . Xn ). . . . The likelihood function for . . Example: Let X be a discrete random variable with probability mass function given by pX (0) = 1 − θ and pX (1) = θ. X2 . . θ) that is completely determined except for the unknown constant θ. x2 . The estimator θ is an unbiased estimator of θ if its expected value equals θ. . . Unbiased estimators ˆ Deﬁnition 8. . . x2 . Xn is a random sample from X then 1 ˆ θ(X1 . . X2 . . Xn .

. xn . . θ can be thought of as a function of ˆ x1 . then the likelihood function is given by n L(x1 . . . That is we can substitute Xi for xi in the ˆ speciﬁcation for θ and look at its properties as the statistic ˆ θ(X1 . x2 . .e. xn . an unbiased estimator for θ (i. . x2 . . . ˆ For example. Since the value of θ that does this will usually vary with x1 . . θ). xn . x2 . . . . xn . . . . . . . . we can look at its performance prior to actually making the observations x1 . Xn ). . X2 . x2 . . . . . . . . . If. where σ > 0. X2 . choose the best response among those provided. S8.. Then an unbiased estimator for µ is (A) (X1 )(X2 ) · · · (Xn ) (B) (X1 + X2 )2 /2 (C) 1 n n i=1 Xi . xn . The answer can be found in Appendix B. x2 . one of the properties that we might like to check for is whether θ is ˆ = θ). Xn are independent identically distributed random variables each with marginal probability density function 1 1 fXi (x) = √ e− 2 σ 2π x−µ 2 σ for −∞ < x < +∞. xn . instead. . . . Xn is given by n L(x1 . x2 . xn ). θ). . X2 .230 Self-Test Exercises for Chapter 8 X1 . . . Maximum likelihood estimators are obtained by ﬁnding that value of θ that maximizes L for a given set of observations x1 . . namely θ(x1 . . θ) = i=1 pX (xi . . θ). x2 . .1 Suppose that X1 . . θ) = i=1 fX (xi . ˆ To evaluate the properties of θ. . check to see if E(θ) Self-Test Exercises for Chapter 8 For the following multiple-choice question. X is a continuous random variable with probability density function fX (x.

pX (k .Questions for Chapter 8 231 (D) σ (E) none of the above. . . Questions for Chapter 8 8. 1. a2 . an in order for Y to be an unbiased estimator for θ? 8.2 Let Y be an estimator for θ based on the random sample X1 . θ) = n k θ (1 − θ)n−k k for k = 0.1 Let X be a random variable with a binomial distribution. . . . . i. .e. (b) Is the estimator for part (a) unbiased? Justify your answer. . . . . . X . . k! for k = 0. (c) Find the maximum likelihood estimator for η = 1/θ. i. . λ) = λk e−λ . Xn be a random sample from X . an are i=1 constants. Xn . . . . A random sample of n pages are observed . . What constraint must be placed on a1 . pX (k . . . Let X1 be a random sample of size 1 from X . ˆ (b) Show that θ = X1 /n is the maximum likelihood estimator for θ. X2 .. θ) = 0 otherwise Let X1 .. 1. Suppose that E(Xi ) = θ and Y = n ai Xi where a1 . ˆ (a) Show that θ = X1 /n is an unbiased estimator for θ.e. n. (a) Find an estimator for θ using the method of maximum likelihood. . 2. a2 . 8.. on a page of text has a Poisson distribution with parameter λ.4 The number of typographical errors. . . X2 .3 The life of a light bulb is a random variable X which has probability density function 1 1 −θx x>0 θe fX (x. 8.

.232 Questions for Chapter 8 (a) Find an estimator for λ using the method of maximum likelihood. Xj }. X X .5 Given the random sample X1 . . Xn consider the statistic d2 formed by averaging the squared differences of all possible pairings of {Xi . That statistic can be represented as 2 d2 ≡ X 1 n 2 i>j (Xi − Xj )2 Prove that d2 = 2s2 . . 8. X2 . There are n such pairs. (b) Is the estimator for part (a) unbiased? Justify your answer. .

9 PROBABILITY MODELS Coin Tossing (Binary) Models Deﬁnition 9. Deﬁnition 9. T is a random variable with a probability mass function given by pT (0) = 1 − p pT (1) = p Deﬁnition 9. . Let X = n i=1 Ti . A Bernoulli trial is an experiment with only two outcomes. then pX (k) = n k p (1 − p)n−k k k = 0. 1. . . Tn be a sequence of n independent and identically distributed Bernoulli random variables. A Bernoulli random variable.1. Note: X denotes the total number of successes from n independent and identically distributed Bernoulli trials. n 233 . 2. . with P (Ti = 1) = p for all i. Let T1 . . Theorem 9. . . 1. Then X is said to have a binomial distribution. If X is a binomial random variable with parameters n and p. . These outcomes are conventionally called success and failure. T2 . . . . 2.1. n pX (k) = n k p (1 − p)n−k k k = 0. . n The binomial distribution Support: Probability mass function: 0. 2. .3. . . 1. .2.

1. 1. . 0. What is the probability that it breaks down exactly 3 days out of a week? Solution: Let X denote the number of breakdowns. The geometric distribution Support: Probability mass function: Probability generating function: Moment generating function: Moments: Note X denotes the number of independent and identically distributed Bernoulli trials required until the ﬁrst success (counting the successful trial). What is the probability that it is operational for 3 days in a row before a breakdown? Solution: Let Y denote the number of days until a breakdown. 2.2)4 = 0. pX (k) = p(1 − p)k−1 k = 1.8).8)3 (0. 2. It therefore remains operational for an entire day with probability 0.8. . . Hence. P (X = 3) = 7 (0. . P (Y = 4) = (0. . .234 Coin Tossing (Binary) Models g(z) = (zp + (1 − p))n MX (t) = (et p + (1 − p))n E(X) = np Probability generating function: Moment generating function: Moments: Notation Var(X) = np(1 − p) If X has a binomial distribution with parameters n and p. 1 zp for |z| < g(z) = 1 − z(1 − p) 1−p tp e MX (t) = 1 − et (1 − p) E(X) = 1/p Var(X) = (1 − p)/p2 Example: A computer breaks down at least once a day with probability 0.287 3 2. p). we often write X ∼ B(n.8) = 0.2.2)3 (0. X ∼ B(7.0064 .

. X is a binomial random variable with parameters n and p = α/n. .Random Particle Models 235 The Poisson distribution Support: Probability mass function: Probability generating function: Moment generating function: Moments: 0. . then for “large” n and “small” p P (X = k) = n k e−np (np)k p (1 − p)n−k ≈ k! k Random Particle Models n The expected number of particles per interval of length one is equal to α. pX (k) = e−α αk k = 0. . 1. Therefore. 2. 1. . k! g(z) = eα(z−1) for |z| < 1 MX (t) = eα(e E(X) = α t −1) Var(X) = α Theorem 9.2. Suppose the interval is divided into n equal segments with at most one particle placed in any segment. . P (X = k) = n k α n k  1 1− α n n−k . 2. If X is a binomial random variable with parameters n and p. What is the probability that there are exactly k particles in the unit interval? Let X denote the number of particles in the unit interval.

. . This limiting particle process is called the Poisson Process with α = expected number of occurrences per unit interval X = number of occurrences in a unit interval and the probability mass function for X is given by pX (k) = αk e−α k! k = 0. 1. . 1. W (see Figure 9.236 −k Random Particle Models αk k! n = 1− α n n n n−1 n−k+1 ··· n n 1− α n for k = 0. . n. .. . 2. 1.1)? Theorem 9. 2.1). fW (w) = αe−αw 0 if w ≥ 0 otherwise . . Taking the limit as n gets large (the segment widths decrease) we obtain lim P (X = k) = 1(1)(1) · · · (1) = e−α αk k! αk −α e k! n→∞ for k = 0. 2. . . Each of the Yi are independent and identically distributed random variables with probability density function given by fYi (y) = αe−αy 0 if y ≥ 0 otherwise Question: Suppose a person begins to observe the process at time t0 chosen independently of the future of the process? What is the distribution of the waiting time to the next occurrence.3. . Let Yi denote the interarrival time between occurrences i − 1 and i (see Figure 9.

If any of the bulbs in a box is bad.1: Interarrival and waiting times Self-Test Exercises for Chapter 9 For each of the following multiple-choice questions. let Y denote the proﬁt from selling a box of Cheapy ﬂashbulbs. What is the probability that every bulb in a box is good? (A) 0 (B) p (C) 1 − p (D) pn (E) none of the above. The company sells each bulb for r cents. Answers can be found in Appendix B. Note that Y may be negative. The probability that any one bulb is good is p > 0 and it is independent of all other bulbs.2 Using the information in the previous question. S9.1 It costs the Cheapy Flashbulb Company c cents to manufacture a single ﬂashbulb. The value of E(Y ) is (A) 0 . choose the best response among those provided.Self-Test Exercises for Chapter 9 237 Yi t i-1 W t t0 i Figure 9. Cheapy will refund the purchase price of the entire box to the customer. S9. Flashbulbs are packed n to a box.

S9. If the machine breaks down x times during the t hours. X . Let P be the random variable representing the total proﬁt from running the machine for t hours. (a) On any given day. If more than 2 tankers arrive in a day. the loss incurred (from the shutdown plus repair costs) equals (x2 + x).3 The number of knots in a piece of lumber is a random variable. the probability that they all have no knots is (A) 1/4 (B) 4e−α (C) e−4α (D) α4 e−4α (E) none of the above. the excess tankers must be sent to another port.1 A complicated piece of machinery. If four independent pieces of lumber are examined. can bring a proﬁt of c dollars per hour (c < 2) to a ﬁrm. However. what is the probability of having to send tankers away? . Present port facilities can service 2 tankers each day. Suppose that X has a Poisson distribution with E(X) = α > 0. 9. this machine breaks down at unexpected and unpredictable times. (a) Find E(P ) as a function of t. (b) Find the length of time the machine should be operated to maximize the expected proﬁt. The number of breakdowns during any period of length t hours is a random variable with a Poisson distribution with parameter t.238 Questions for Chapter 9 (B) n(rp − c) (C) nrpn − nc (D) nrpn + nc − 2ncpn (E) none of the above. arriving at a reﬁnery each day has a Poisson distribution with parameter α = 3. Questions for Chapter 9 9. N . when running properly.2 The number of oil tankers.

X3 . what is the probability that 10 pages. each with P (Xi = 1) = p. what is the probability that an incorrect number is formed during any one-second time period? 9. then what is the value of p in terms of r? . the probability that an incorrect digit will appear is 0. what is the expected total fees collected in one day? (g) How much must the present facilities be increased to assure that the expected total fees collected in one day is at least \$30000? 9. or more than one incorrect digit in a binary number? If a computer forms 106 such 32-digit numbers per second. X4 .Questions for Chapter 9 239 (b) How much must the present facilities be increased to permit handling all tankers on approximately 80% on the days? (c) What is the expected number of tankers arriving per day? (d) What is the expected number of tankers serviced daily? (e) What is the expected number of tankers turned away daily? (f) If a fee of \$10000 is collected from each tanker serviced. If 5 P i=1 Xi = 5 = r. (a) What is the average distance between cars? (b) What is the probability that the distance between any two cars is greater than 10 miles? 9. one.4 Suppose that a book of 620 pages contains 55 typographical errors. X5 be independent and identically distributed Bernoulli random variables.5 In forming binary numbers with 32 binary digits. what is the probability of ﬁnding zero.3 Suppose cars on a very long test track travel at exactly 50 miles per hour and pass a check point on the track according to a Poisson process at a rate of 5. X2 .002. selected at random. If these are randomly distributed throughout the book. If the errors are independent.2 cars/hour. will be free of errors? 9.6 Let X1 .

240 Questions for Chapter 9 9. with a simple link connecting exactly two molecules (see diagram below) .7 A chromosome can be considered a string of molecules.

What is the actual probability mass function for N ? (b) Compute the approximate probability that exactly 4 smaller chains result from the breakup. some of these links are broken.8 A computer transmits 1000 words per minute. thus breaking the section into smaller chains. and each link breaks independently of the others. Approximately what proportion of these chains will contain at least 200 molecules? 9. Approximately what is the probability that an error cycle will be longer than two minutes? . (a) Let N be the number of links broken. Parity errors are generated independently from one word to the next.001. (a) Approximately what is the probability of generating at least 2 words with parity errors in a one minute period? (b) An error cycle is the time between successive words with parity errors. Each link has a probability 0. (c) Consider the number of molecules on the smaller chains that result from the breakup.    Suppose a section of a chromosome consists of 1000 such links (and 1001 molecules). In recombinant DNA experiments. The probability that any one word has a parity error is 0.005 of breaking.

.10 Glass rods are manufactured by cutting a continuous glass rod leaving a production machine. If a piece has any knots. . . the continuous rod is cut at each imperfection. The number of knots from piece to piece can be assumed to be independent. . What is the probability that the computer operator on the new shift will wait at least two minutes for the ﬁrst parity error on that shift? (d) What is the probability that no parity error occurs between 10:00 and 10:02 am? 9. (a) What is the probability that a piece of lumber is unsatisfactory? (b) What is the expected number of unsatisfactory pieces in a load of 500 pieces? (c) What is the probability that all of the 500 pieces are satisfactory? 9. . it is considered unsatisfactory. If p1 = p2 then Y = X1 + X2 has a Binomial distribution. n (a) Prove or give a counterexample: Suppose 0 < pi < 1 for i = 1. 1. Small imperfections appear in the continuous rod according to a Poisson process with an average of two imperfections per foot of glass rod.11 (†) Suppose X1 and X2 are independent Binomial random variables with P (X1 = k) = P (X2 = k) = n k p (1 − p1 )n−k k 1 n k p (1 − p2 )n−k k 2 for k = 0. To produce “perfect” glass rods. . . . 2. .9 The number of knots in a piece of lumber has a Poisson distribution with an average of 2 knots per piece.Questions for Chapter 9 241 (c) A shift change in computer personnel takes place at 8:00 am each day. n for k = 0. 1. (a) What is the average length of the cut glass rods? (b) What fraction of the rods will have a length greater than one foot? 9.

(a) Show that for M 100 kg batches of bottles. small particles in the molten glass cause individual bottles to be defective. . Z . Assume that: (a) every particle may appear in any bottle with equal probability. . 1. 2. (ii) W as p2 → 1. If we wanted to compute the expected percentage of defective bottles. the expected percentage of defective bottles equals d(x) = 100(1 − ex/100 ). Addison-Wesley. 2 and let Y = X1 + X2 . (c) Suppose 0 < pi < 1 for i = 1.242 Questions for Chapter 9 (b) Prove or give a counterexample: Suppose 0 < pi < 1 for i = 1. for large M . we have d(x) ≈ x. (d) Finally. and (b) the distribution of any particle among the bottles is independent of all other particles. 1. On average. on average. If p1 = p2 then Y = X1 + X2 cannot have a Binomial distribution. 1965 . (c) Show that. Introductory Probability and Statistical Applications. show that for small x. P. 9. . Find the limiting distribution for each of the following random variables: (i) Y as p2 → 0.12 (†) (from Meyer1 ) In the manufacturing of glass bottles. Note that the molten glass required for 100 bottles contains. Let W = Y − n. n where n = xM and α = x/100. A defective bottle may have more than one particle. 1 Meyer. . the quick (and wrong) answer would be x percent. . . there are x particles in a batch of 100 kg of molten glass. A single bottle requires 1 kg of molten glass. has probability distribution P (Z = k) = n k α n k 1− α n n−k for k = 0.. . but for larger x (say x = 100) the “quick” solution is very wrong. the number of particles in a randomly chosen bottle. (b) Show that as M → ∞ P (Z = k) ≈ e−α ak k! for k = 0. x particles.

2. (c) Find P (X = 0).14 (†) Mr. Let the Bernoulli random variable Ti equal 1 if bulb i is defective and 0 if the bulb is non-defective. every pizza that Pip prepares has pepperoni on only one of the . Clearly justify each step of your answer. 9. identically distributed random variables with P (Ti = 0) = 0. . . Defects from one bulb to the next can be assumed to be independent random variables.15 (†) Pip’s Pizza Shoppe sells pepperoni and plain pizza by the slice.1 for all i = 1. Each bulb has a probability be being defective equal to 0. Compute E(X) and Var(X). . for a box of 10 bulbs T1 . (b) Let X = T1 + T2 + · · · + T10 denote the total number of defective bulbs in a box of 10 bulbs. where c is a positive constant. X2 .9 P (Ti = 1) = 0. ﬁnd FY (y) = P (Y ≤ y). (a) Find the probability mass function for N . Bonkers is selling his house.Questions for Chapter 9 243 9. Let Y be the actual selling price of the house. . T10 are independent.1. . 10. . Let N be the number of offers needed to sell the house. That is. (a) Compute E(Ti ) and Var(Ti ) for every i. . .13 The Youbee Light Bulb Company manufactures electric light bulbs. . Therefore. Suppose the offers X1 . (b) Find E(N ) (c) Find the cumulative distribution function for Y . T2 . . There are four slices in each pizza. Since one fourth of the slices sold by Pip are pepperoni. are independent identically distributed random variables (in dollars) each with cumulative distribution function FX (x). 9. He decides he will accept the ﬁrst offer exceeding c dollars.

Note that a sequence such as X1 = 1.16 (†) In a radio advertisement. . . . X2 . (c) (††) What is the probability that Pip must bake a pizza to satisfy the request of the current customer? That is.244 Questions for Chapter 9 four slices. a major oil company says that one fourth of all cars have at least one underinﬂated tire. Suppose that any tire becomes underinﬂated independently and with the same probability for each tire on every car. . What is the probability that Pip must bake a another pizza before selling all of the slices in the current pizza? Hint: Find the probability that he doesn’t need to bake another pizza. Let the random variable Y denote the number of pizzas that Pip must bake that day. and Xi = 0 if it’s plain. See the following diagram: Let the independent random variables X1 . the current customer wants pepperoni but Pip has no pepperoni slices available or the current customer wants plain but Pip has no plain slices available. would require that Pip bake a second pizza before selling all of the slices in the ﬁrst pizza. . X5 = 0. Assume that Pip never throws away any slices of pizza during the day. X3 = 1. . 9. . X2 = 0. (b) Suppose Pip has just taken a freshly baked pizza out of the oven and suppose that there are no other unsold slices. Each car has 4 tires. represent the slices sold by Pip with Xi = 1 if slice i is pepperoni. The remainder of the pizza is plain. What is the probability distribution for the random variable Y ? Hint: There is only one pepperoni slice per pie. (a) Suppose 20 customers visit Pip’s Pizza Shoppe on a particular day. X4 = 0.

What is the probability that the tire in underinﬂated? (b) Suppose a parking lot contains 10 cars. and then select a tire on that car at random. Let the random variable X represent the number of tires (out of the 40) that are underinﬂated. What is the probability distribution for X? .Questions for Chapter 9 245 (a) Suppose we select a car at random. We plan on testing the air pressure for every tire on every car.

246 Questions for Chapter 9 .

Handbook of Mathematical Functions.31938153 b2 = −0. that is the probability density function for Z is given by 1 2 1 φ(z) = √ e− 2 z 2π −∞<z <∞ Let Q(z) ≡ P (Z > z).330274429 t= b3 = 1. 1 Abramowitz.781477937 With this approximation.5 × 10−8 ..10 APPROXIMATIONS AND COMPUTATIONS These numerical approximations are from a book by Abramowitz and Stegun.gov/. and Stegun. This handbook is being published as a reference source on the World Wide Web at http://dlmf.2316419 1 + rz b1 = 0. (1970). that is Q(z) = ∞ z 1 2 1 √ e− 2 t dt 2π For z ≥ 0 the following approximation can be used to compute Q(z): Q(z) = φ(z)(b1 t + b2 t2 + b3 t3 + b4 t4 + b5 t5 ) + (z) where 1 r = 0. New York (1970).1 Approximation for the Normal Integral Let Z be a standard normal random variable.356563782 b4 = −1. Dover Publications.821255978 b5 = 1. Inc. | (z)| < 7. 247 . I.nist. M.

5.189269 t= ln c0 = 2.00 in increments of 0.802853 d1 = 0.25.432788 c2 = 0. the following approximation can be used to compute z : z =t− c0 + c1 t + c2 t2 + (Q) 1 + d1 t + d2 t2 + d3 t3 where 1 Q2 c1 = 0.5 × 10−4 . 0. for a given 0 < Q < 0. 0. 0. 1). . Questions for Chapter 10 10.1 Write a computer program to generate a table for the cumulative distribution function of the standard normal distribution for −3. 10.95 and 0.010328 d2 = 0.00 ≤ z ≤ 3.50. ﬁnd the corresponding z that satisﬁes the above integral equation. that is Q= ∞ z 1 2 1 √ e− 2 t dt 2π The problem is. | (Q)| < 4. that is the probability density function for Z is given by 1 2 1 φ(z) = √ e− 2 z 2π −∞<z <∞ Let Q = P (Z > z).05.5. Write a computer program to ﬁnd z when P (Z ≤ z) = 0.248 Questions for Chapter 10 Approximation for the Inverse Normal Integral Let Z be a standard normal random variable.99.01. For a given 0 < Q < 0.01.001308 With this approximation. 0.2 Let Z ∼ N (0.515517 d0 = 1.

Appendix A TABLES Page Table I Table II Standard Normal Cumulative Distribution Function Some Common Probability Distributions 250 251 249 .

7291 0.6103 0.05 0.9909 0.5793 0.9608 0.9345 0.9099 0.7910 0.5517 0.9969 0.9916 0.5557 0.9761 0.9599 0.9861 0.9934 0.9987 1 0.3 0.9975 0.7190 0.4 -0.9798 0.5 1.6 0.9441 0.7054 0.6844 0.9913 0.5319 0.7611 0.9732 0.9989 7 0.8830 0.9505 0.8849 0.4 0.7549 0.9904 0.9706 0.1 0.9920 0.1 1.9974 0.9452 0.5398 0.9965 0.250 Table I: Standard Normal Cumulative Distribution Function TABLE I: STANDARD NORMAL CUMULATIVE DISTRIBUTION FUNCTION 0.7734 0.3 1.7580 0.6879 0.7704 0.5 0.7967 0.9980 0.9463 0.0 1.5000 0.8944 0.5636 0.9564 0.7123 0.1 2.9756 0.8643 0.7 1.3 -1.9699 0.9616 0.6772 0.10 0.9961 0.9 2.9147 0.9929 0.9875 0.9906 0.9943 0.9803 0.7357 0.9495 0.9788 0.7642 0.9911 0.9265 0.9878 0.9970 0.9964 0.8962 0.8023 0.9292 0.25 0.9893 0.6950 0.6217 0.9979 0.6443 0.7 0.9953 0.9896 0.9582 0.5714 0.6 1.9 -0.9957 0.8212 0.9830 0.9986 0.15 0.9968 0.7224 0.5753 0.9726 0.7454 0.8365 0.5478 0.8438 0.9418 0.7995 0.9941 0.9982 0.3 0.9545 0.9963 0.5040 0.8770 0.9573 0.9864 0.9633 0.7 2.9857 0.9082 0.2 2.6480 0.1 -0.9 1.9838 0.9382 0.8238 0.9976 0.2 1.7673 0.9370 0.9656 0.9985 0.7157 0.9713 0.5359 0.9945 0.9515 0.6915 0.6808 0.9826 0.6141 0.6591 0.9162 0.9987 2 0.9649 0.5239 0.9918 0.8159 0.9678 0.7019 0.9981 0.9319 0.8869 0.30 0.9664 0.8289 0.0 -1.9890 0.9015 0.9049 0.7794 0.6517 0.9750 0.8554 0.7257 0.8264 0.8729 0.9956 0.6 -0.6700 0.8 0.8 2.5596 0.5832 0.9977 0.9525 0.9131 0.8577 0.9207 0.9986 0.8 1.9332 0.7 a 1.5160 0.9925 0.9693 0.1 0.5 2.9115 0.2 z z a 0.9868 0.6293 0.9988 4 0.9357 0.5 1.9936 0.8 -1.9719 0.6664 0.9808 0.9032 0.9982 0.9429 0.9484 0.8749 0.8413 0.9981 0.9984 0.6 2.8810 0.2 1.9279 0.5199 0.9946 0.9222 0.9932 0.7823 0.9251 0.5279 0.9979 0.7324 0.9960 0.8531 0.9 Φ(a) = a a ϕ(z) dz = −∞ −∞ 1 2 √1 e− 2 z 2π dz -2.9535 0.9 3.9783 0.8389 0.9949 0.8665 0.9744 0.9940 0.9850 0.8051 0.9955 0.9938 0.6628 0.9834 0.8186 0.7422 0.8106 0.9989 6 0.6368 0.9931 0.9686 0.4 2.9984 0.7881 0.9738 0.8888 0.9978 0.8997 0.7517 0.9406 0.6554 0.8599 0.6406 0.9971 0.9842 0.9812 0.5948 0.9871 0.9394 0.6255 0.5987 0.00 0.9988 5 0.9927 0.9985 0.3 2.20 0.7764 0.6064 0.9952 0.9306 0.4 1.40 Φ(a) ϕ(z) z 0.8340 0.9192 0.9778 0.9846 0.5675 0.9898 0.9959 0.8907 0.9974 0.7389 0.8485 0.9977 0.9236 0.0 0.8 1.9990 9 0.9554 0.0 1.8708 0.9884 0.35 0.9772 0.8686 0.7486 0.9817 0.6985 0.9887 0.5438 0.9962 0.9177 0.6736 0.9990 .9983 0.7852 0.2 0.9767 0.8315 0.9966 0.9821 0.8980 0.9066 0.8078 0.0 0 0.9641 0.8621 0.8790 0.6026 0.9591 0.9474 0.5 0.8133 0.9987 3 0.9671 0.9625 0.9989 8 0.9854 0.7939 0.6331 0.8461 0.9901 0.0 2.5 -1.8508 0.9881 0.9972 0.6179 0.9948 0.7088 0.5910 0.8925 0.9951 0.9793 0.9922 0.9967 0.5120 0.5871 0.5080 0.9973 0.

n np np(1 − p) (et p + (1 − p))n geometric disc 0≤p≤1 pX (k) = (1 − p)k−1 p k = 1. . α > 0 fX (x) = α (αx)r−1 e−αx Γ(r) x−µ 2 σ x≥0 r α −∞ < x < ∞ µ r α2 σ2 α α−t e(tµ+σ normal cont µ. 1 p p(1 − p) et p + (1 − p) binomial disc n. 0 ≤ p ≤ 1 pX (k) = n k pk (1 − p)n−k k = 0. . . . a≤x≤b uniform cont a<b fX (x) = 1 b−a a+b 2 (b − a)2 12 ebt − eat (b − a)t TABLE II: SOME COMMON PROBABILITY DISTRIBUTIONS exponential cont α>0 fX (x) = αe−αx x≥0 1 α 1 α2 α α−t r gamma cont r ≥ 1. .Name Type Parameters Deﬁnition Support E(X) Var(X) mgf Bernoulli disc 0≤p≤1 pX (k) = pk (1 − p)1−k k = 0. . 1. 2. 1. . σ > 0 1 1 fX (x) = √ e− 2 σ 2π 2 t2 /2) 251 . . . . 1 p α α 1−p p2 pet 1 − (1 − p)et eα(e t −1) Table II: Some Common Probability Distributions Poisson disc α>0 pX (k) = e−α αk k! k = 0.

252 Table II: Some Common Probability Distributions .

11 Remarks S2.24 S3.2 S3.5 Answer C is acceptable if you allow the triangle to be ﬂipped S2.6 The correct answer is Chapter 3 S3.29 S3.22 S3.16 S3.6 Chapter 2 S2.18 S3.1 S2.1 S3.21 S3.7 S3.34 C A A A A C A S3.6 S2.32 C D B A C D B S3.19 S3.5 Must have p = P (A) = P (B) = P (A ∩ B) = 0.Appendix B ANSWERS TO SELF-TEST EXERCISES Chapter 1 S1.36 Remarks S3.9 S3.15 S3. C C E D D A C C S3.14 S3.17 S3.28 S3.30 S3.23 S3.7 D A S1.27 S3.5 S3.3 S3.10 S3.10 D A 253 .4 S2.8 B D S2.8 S3.6 S3.9 A C S2.5 D B E C S2.2 S1.35 C B D A B D B 5 2 A D S1. P (A ∩ B) = P (A)P (B).1 S1.26 S3.25 S3.4 S3.7 S2.3 C S1.20 S3.33 E B B D A D A S3.4 A S1.12 S3.5 S2.12 A D B S2.31 S3.3 S2.13 S3.2 S2.11 S3. Hence.

17 S6.1 C D C A C S7.19 C C C E S6.18 E C E D S6.14 B D B S7.2 S5.14 S6.28 C C D A E C S5.26 Remarks S5.7 S7.19 C C B A S4.15 B C B B B B C D S6.14 S5.10 S6.3 S7.16 S5.14 S4.21 Remarks S6.27 A B D C B B S5.13 S5.5 S5.24 A B B C C S5.12 S5.10 S5.9 S6.1 S7.4 S5.17 S5.19 S5.13 S6.20 C B A D C C B C A D S5.8 S7.12 S4.50)) = P (X < 0.6 S6.7 S4.5 S6.6 S5.11 S4.8 S4.11 S5.15 C D B .8 S5.8 S6.2 S7.3 S4.10 S4.12 S6.23 S5.7 S6.1 S6.11 S7.4 S7.16 Chapter 8 S8.6 S4.50) since ln(x) is an increasing function.10 S7.9 S5.17 C A D D S7.3 S6.27 Note that P (ln(X) < ln(0.4 S6.5 S7.25 C B B B D A C B C S4.7 S5.3 S5. ANSWERS TO SELF-TEST EXERCISES Chapter 4 S4.2 S4.254 APPENDIX B.16 S6.20 S5.18 C C D B S7.15 S5.15 S6.16 Chapter 5 S5.18 B B C C S4.21 S5.22 D C C A B S6.22 S5.17 B C C D S4.5 S4.13 S7.18 S5.11 S6.1 S4.4 S4.19 The correct answer is 1 Chapter 7 S7.9 S4. Chapter 6 S6.13 S4.2 S6.12 S7.9 S7.6 S7.1 S5.

1 D S9.2 C S9.255 Chapter 9 S9.3 C .

256 APPENDIX B. ANSWERS TO SELF-TEST EXERCISES .

(5 points) Four of your friends are having birthdays on the same date. B and C and probability measure P (·). 2. 1. You purchased four birthday cards. choose the best answer among those provided. In all of the following questions.Appendix C SAMPLE EXAMINATION 1 PART I: MULTIPLE CHOICE QUESTIONS Instructions: For each of the questions in Part I. You purchased four birthday cards. (5 points) Three of your friends are having birthdays on the same date. 257 .) (a) 4! (b) 4 3 (c) 3! (d) 43 (e) none of the above. Ω is a sample space with events A. How many different ways can you send cards to your friends so that each gets one card? (a) 4! (b) 4 4 (c) 4 (d) 44 (e) none of the above. How many different ways can you send cards to your friends so that each gets one card? (You will have one card left over.

258 Part I: Multiple Choice Questions 3. (5 points) The event which corresponds to the statement.3 then (a) A and B are mutually exclusive (b) P (A ∩ B) = 0 (c) A and B are independent (d) all of the above (e) none of the above. B and C occur simultaneously. P (A) = 0.7.7 and P (B) = 0. 5. 4.” is (a) A ∪ B ∪ C (b) A ∩ B ∩ C (c) (A ∩ B) ∪ (A ∩ C) ∪ (B ∩ C) (d) (A ∪ B ∪ C)c (e) none of the above. then (a) P (A ∩ B) = 0 (b) A = B (c) A ∩ B = Ø (d) A and B are independent (e) none of the above.3. P (B) = 0. (5 points) If A ⊆ B . “all of the events A. 6. (5 points) If P (A) = p > 0. P (B) = q > 0 and A∩B = Ø then P (A∪B | A) equals (a) 1 (b) p (c) (p + q)/p (d) (p + q − pq)/p (e) none of the above .3 and P (A | B) = 0. (5 points) If P (A) = 0.

7. (5 points) P (X > 1) equals (a) 0. Your chicken wing recipe is so famous that the White House (in Washington DC) has asked that you send a large order of chicken wings for a important dinner. (5 points) The value of the constant c is (a) 0.5 (c) 0. New York.5 (b) 1 (c) 2 √ (d) 2 (e) none of the above. Circle your ﬁnal answer. PART II Instructions: Show your work in the space provided.25 (b) 0.75 (d) 1 (e) none of the above. 8. accidentally mixed . After getting your engineering degree. Your chicken wings can be either hot or mild. Unfortunately.Part II 259 For questions 7 and 8: Assume that X is a continuous random variable with probability density function fX (x) = cx if 0 ≤ x ≤ 2 0 otherwise where c is a constant. you are now the owner of the most successful restaurant in Buffalo. an engineering intern (who is working for you as a cook). 9. and usually you keep the two different types of chicken wings in separate containers.

(c) (10 points) You plan to sample one more wing from the container chosen in part (b). (a) (10 points) You plan to choose a container at random. The only way you can determine the contents of the containers is by sampling the wings.260 Part II some hot wings with some mild ones. It’s a hot wing! What is the probability that you chose Container C (the one with both hot and mild wings). What is the probability that the wing will be mild? Justify your answer. 10.2 pX (0) = 0. Justify your answer using Bayes’ theorem.4 (a) (10 points) Sketch the cumulative distribution function for X . what is the probability that if you sample a second wing from the same container. Given that the ﬁrst wing that you sampled was hot. To make matters worse. and sample a chicken wing from that container. (b) (5 points) Compute P (X < 0). You now have only three containers of chicken wings: Container A All hot wings Container B All mild wings Container C One half hot and one half mild wings The White House wants only hot chicken wings. the containers are unmarked. . You don’t want to accidentally ship Container B or C. and you need to ship the order immediately. (c) (5 points) Compute P (X > 1). Let X be a random variable with the following probability mass function: pX (−1) = 0. (b) (10 points) You actually select a container at random. it will also be hot? Justify your answer. and then sample one of the chicken wings from that container.4 pX (+1) = 0. (d) (10 points) Compute P (X = 0 | X ≤ 1).

2. 1. (5 points) Let X be a discrete random variable with a probability distribution given by P (X = k) = 1 4 k 1− 1 4 for k = 0. Then P (X ≥ k) equals (a) (b) 1 4 1 4 k k+1 261 . . integer constant. (5 points) Let X be a continuous random variable with probability density function λxn if 0 ≤ x ≤ 1 fX (x) = 0 otherwise where λ is a constant and n is a nonnegative. 1. A relationship which must exist between λ and n is (a) λn = 1 (b) λ = n + 1 (c) λ = n (d) λ = 2n (e) none of the above. choose the best answer among those provided.Appendix D SAMPLE EXAMINATION 2 PART I: MULTIPLE CHOICE QUESTIONS Instructions: For each of the questions in Part I. 2. .. .

5. Then. E[(X − E(Y ))2 ] equals (a) 0 (b) Var(X) (c) Var(Y ) (d) Cov(X. (5 points) If X is a random variable with P (X ≤ 0) = α. then P (−3X < 0) equals (a) 0 (b) 1 3α (c) α (d) 1 − α (e) none of the above. 4.262 Part I: Multiple Choice Questions (c) (d) 3 4 3 4 k (e) none of the above. The value of P (X 2 ≥ 16) is (a) 1 − e−4 (b) e−16 (c) 1 − 4e−16 (d) e−4 (e) none of the above. (5 points) Let X and Y be random variables with E(X) = 2 and E(Y ) = 2. (5 points) Let X be a continuous random variable with probability density function 4e−4x if x > 0 fX (x) = 0 otherwise. 3. . Y ) (e) none of the above.

Y ) is a random vector with joint probability mass function given by pX. −1) = 1/3 pX. (5 points) Let X be a continuous random variable with probability density function   x+1 if −1 ≤ x ≤ 1 fX (x) = . .  0 2 otherwise Then P (X < 0) equals (a) 0 (b) (c) (d) 1 8 1 4 1 2 (e) none of the above For questions 7 and 8: Assume that (X. (5 points) Deﬁne the random variable W = XY .Y (−1.Part I: Multiple Choice Questions 263 6.Y (1. 8. 1) = 1/3 7. The value of P (W = 1) is (a) 0 (b) 1/9 (c) 1/3 (d) 2/3 (e) none of the above. (5 points) The value of E(X + Y ) is (a) −1 (b) 0 (c) 1/3 (d) 1 (e) none of the above. 0) = 1/3 pX.Y (0.

you lose \$1. In all of the following questions. (3 points) Let P (A) = r and P (B) = s with A ∩ B = Ø. 1) and Y ∼ N (0.05 (c) 0. If the coin turns up heads. 265 .Appendix E SAMPLE FINAL EXAMINATION PART I: MULTIPLE CHOICE QUESTIONS Instructions: For each of the questions in Part I.50 (d) 0. Then P (A | B c ) equals (a) 0 (b) r (c) r/(1 − s) (d) 1 (e) none of the above. you win \$1. 1. The value of P (X > Y ) is (a) 0 (b) 0. 2. B and C and probability measure P (·). choose the best answer among those provided. Ω is a sample space with events A. (3 points) Let X and Y be independent random variables with X ∼ N (0. 2). For questions 3 and 4: A game is played by repeatedly tossing a fair coin.95 (e) none of the above. and if the coin turns up tails.

3. (3 points) In a class of 10 students. 4.e. Note that Yn may be negative.266 Part I: Multiple Choice Questions Let Yn denote your net winnings (i. (3 points) Let X be a continuous random variable with probability density function αe−x if x ≥ 0 fX (x) = 0 otherwise The value of α (a) must be 1 . your winnings minus your losses) after n tosses. Two students are chosen at random from the class. 5 students received an A on the last examination. The probability that both of the selected students received a grade of A is (a) 1/4 (b) 2/9 (c) 1/3 (d) 1/2 (e) none of the above. 5. (3 points) The value of E(Yn ) is (a) 0 (b) −n (c) n/2 (d) n (e) none of the above.. (3 points) The value of Var(Yn ) is (a) 0 √ (b) n (c) n (d) n2 (e) none of the above. 6.

A bag contains 10 independent gum drops.5 (c) 1 (d) 1 − ( 1 )16 4 (e) none of the above.001 of breaking. independently of all other links. (3 points) The weight of a gum drop (in ounces) is normally distributed with 1 mean 2 and standard deviation 4 . Approximately. (Note: The chain will break if one or more links break. 7. 9.) (a) e−2 (b) 1 − e−2 (c) 1/2 (d) (0. The probability that the total weight of the gum drops in the bag exceeds 20 ounces is (a) 0 (b) 0. (3 points) Suppose a chain link has a probability of 0. 8. (3 points) The set (A ∪ B) ∩ (A ∪ B c ) is equivalent to (a) Ø (b) Ω (c) A (d) B (e) none of the above.001)2000 (e) none of the above. what is the probability that a chain with 2000 such links will break. .Part I: Multiple Choice Questions 267 (b) can be any positive real number (c) can be any positive integer (d) can be any real number (e) none of the above.

12.268 Part I: Multiple Choice Questions 10. 11.0000 (e) none of the above. 13. then P (Y > 1) equals (a) 0.5000 (c) 0. If Y = X 3 . (3 points) Let X be a random variable with E(X) = 2 and E(X 2 ) = 5. (3 points) Let X ∼ N (1. Then Var(X) equals (a) 1 .8413 (d) 1.1587 (b) 0. (3 points) Let X be a random variable with probability generating function g(z) = z(1 + z) 2 The value of E(X) is (a) 0 (b) 2/3 (c) 1 (d) 3/2 (e) none of the above. (3 points) Let X be a random variable with moment generating function MX (t) = e(e t −1) The moment generating function of the random variable Y = 2X + 1 is (a) et (b) e2t (c) e(2e t −2) 2t −1) (d) e(t+e (e) none of the above. 8).

If 3 independent rolls of steel are examined.99)101 (d) 1 (e) none of the above. For questions 16 and 17: Suppose that X1 and X2 are independent random variables representing the lifetimes of two lightbulbs. If the airline permits 101 persons to make reservations for a given ﬂight. 15. the probability that exactly one is defective is (a) 1/4 (b) 3/8 (c) 9/64 (d) 27/64 (e) none of the above.Part I: Multiple Choice Questions 269 (b) 3 (c) 7 (d) 9 (e) none of the above.01 (c) (0. each having marginal probability . It is know that. (3 points) The weight X (in pounds) of a roll of steel is a random variable with probability density function given by fX (x) = 1 2 (x − 499) 0 if 499 ≤ x ≤ 501 otherwise A roll is considered defective if its weight is less than 500 pounds. 1% of all persons making airline reservations never show up for their ﬂight. 14. over many years. the probability that someone will be denied a seat is (a) 0 (b) 0. (3 points) An airplane can hold 100 passengers. Assume that a person decides independently whether or not to show up for a ﬂight.

16. (3 points) The value of E(W ) is (a) 1/2 (b) 1 (c) 2 (d) X1 + X2 (e) none of the above. 17. Xn be a random sample with each Xi having a probability mass function given by P (Xi = 0) = 1 − p P (Xi = 1) = p . . Which of the following statements are always true? (a) P (A) ≤ P (B) (b) P (A ∪ B) = P (B) (c) P (A ∩ B) = P (A) (d) all of the above (e) none of the above. . 18. (3 points) Suppose A ⊆ B .270 Part I: Multiple Choice Questions density function f (x) = 1/2 0 if 0 ≤ x ≤ 2 otherwise Let W = X1 + X2 represent the total lifetime of the two lightbulbs. (3 points) Let X1 . . . (3 points) The value of P (W < 2) is (a) 0 (b) 1/4 (c) 1/2 (d) 1 (e) none of the above. 19.

The total number of distinguishable arrangements of the balls in the numbered cells is (a) (b) (c) 4 2 3 2 3 1 (d) 4 (e) none of the above. (3 points) A box is partitioned into 3 numbered cells. 20. 21. Circle your ﬁnal answer. answer “Yes” or “No” to the question: . Two indistinguishable balls are dropped into the box. A sociologist is interested in determining the percentage of college students who like classical music.Part II 271 where 0 ≤ p ≤ 1. The person is instructed to go to a secluded corner. and write down the answer to one of the following questions depending on the color of the bead: • If the bead is red. An unbiased estimator for p is given by (a) p = ˆ (b) p = ˆ i=1 n 1 n n n Xi i=1 Xi Xi2 i=1 (c) p = ˆ (d) all of the above (e) none of the above. draw out a bead at random. she decides on the following technique: A person is given a bowl which contains 600 red beads and 400 blue beads. PART II Instructions: Show your work in the space provided. In order to avoid embarrassing any students and help assure a truthful response to the survey. Any cell may contain more than one ball.

Determine P (X = k) as a function of k and p.Y (x. (b) (5 points) Find the marginal probability density function for X . . Therefore. y ≥ 0 and x + y ≤ 1 . answer “Yes” or “No” to the question: Does your student number end in an odd digit? The person simply responds “Yes” or “No. except the person being surveyed. Show that E(Y ) = p.” No one.) 22. (a) (5 points) Suppose p is the unknown fraction of students who like classical music. otherwise 1 2 (a) (5 points) Compute P X + Y < . knows which question was actually answered. What is the unconditional probability that a person surveyed will answer “Yes?” Your answer should depend on p. the person is likely to be truthful.272 Part II Do you like classical music? • If the bead is blue. (b) (5 points) Let X denote the number of “Yes” responses from a survey of 100 individuals. (c) (5 points) Find the marginal probability density function for Y . What is your best guess for the value of p? (Hint: Use an unbiased estimator for p. Let (X. (d) (5 points) Find the conditional probability density function for Y given 1 that X = 2 . Y ) be a continuous random vector with joint probability density function given by fX. y) = 2 0 if x ≥ 0. (d) (5 points) Suppose 40 of the 100 students answered “Yes” to the survey. (c) (5 points) Let Y = (X/60) − (1/3).

scribd
/*********** DO NOT ALTER ANYTHING BELOW THIS LINE ! ************/ var s_code=s.t();if(s_code)document.write(s_code)//-->