This material is provided for the educational use of students in CSE2400 at FIT. No further use or reproduction is permitted. Copyright G.A.Marin, 2008, All rights reserved.

Applied Statistics: Probability

1-1

**Permutations and Combinations**

Suppose that we have $n$ objects $O_1, O_2, \ldots, O_n$. A permutation of order $k$ is an "ordered" selection of $k$ of these for $1 \le k \le n$. A combination of order $k$ is an "unordered" selection of $k$ of these.

Common notation:

$$P(n,k) \text{ or } P_k^n = n(n-1)\cdots(n-k+1) = n^{\underline{k}}$$

$$C(n,k) = C_k^n = \binom{n}{k} = \frac{n^{\underline{k}}}{k!}$$

Example: Given the 5 letters a, b, c, d, e, how many ways can we list 3 of the 5 when order is important? Answer: $P_3^5 = 5^{\underline{3}} = 5 \cdot 4 \cdot 3 = 60$. Note that each choice of 3 letters (such as a, c, e) results in 6 different results: ace, aec, cae, cea, eac, eca.

Example: Given the 5 letters above, how many ways can we choose 3 of the 5 when order is NOT important? Answer: $\binom{5}{3} = \frac{5^{\underline{3}}}{3!} = \frac{5 \cdot 4 \cdot 3}{3 \cdot 2 \cdot 1} = 10$. In this case we have the 60 that result when we care about order divided by 6 (the number of orderings of 3 fixed letters).
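The two letter examples can be checked by brute-force enumeration. The following is a minimal Python sketch, assuming Python 3.8 or later (for `math.perm` and `math.comb`); the variable names are illustrative and are not part of the original slides.

```python
import math
from itertools import combinations, permutations

letters = "abcde"

# Ordered selections of 3 of the 5 letters (order matters)
ordered = list(permutations(letters, 3))
# Unordered selections of 3 of the 5 letters (order does not matter)
unordered = list(combinations(letters, 3))

# The enumerated counts should agree with the closed-form counts
print(len(ordered), math.perm(5, 3))
print(len(unordered), math.comb(5, 3))

# Each unordered choice corresponds to 3! = 6 orderings
print(len(ordered) // len(unordered), math.factorial(3))
```

Enumeration only works for small cases; the closed-form functions are the practical choice once n grows.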


Definition

$n^{\underline{k}} = n \times (n-1) \times \cdots \times (n-k+1)$ for any positive integer $n$ and for integers $k$ such that $1 \le k \le n$. This symbol is pronounced "n to the k falling."

Examples: $6^{\underline{3}} = 6 \times 5 \times 4 = 120$. $3^{\underline{6}}$ is not defined (for our purposes). $5^{\underline{5}} = 5!$

Again: $P_k^n = n^{\underline{k}}$ and $\binom{n}{k} = \frac{n^{\underline{k}}}{k!}$.

Permutations of Multiple Types

The number of permutations of $n = n_1 + n_2 + \cdots + n_r$ objects of which $n_1$ are of one type, $n_2$ are of a second type, ..., and $n_r$ are of an $r$th type is

$$\frac{n!}{n_1!\, n_2! \cdots n_r!}.$$

Example: Suppose we have 2 red buttons, 3 white buttons, and 4 blue buttons. How many different orderings (permutations) are there?

Answer: $\frac{9!}{2!\,3!\,4!} = 1260$.
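As a quick check of the button example, here is a small Python sketch of the multiple-types formula. The function name `multiset_permutations` is a hypothetical helper chosen for illustration.

```python
from math import factorial

def multiset_permutations(counts):
    """Number of distinct orderings of objects with the given type counts.

    Computes n! / (n1! * n2! * ... * nr!) where n is the sum of the counts.
    """
    total = factorial(sum(counts))
    for c in counts:
        # Each division is exact because the final quotient is an integer
        total //= factorial(c)
    return total

# 2 red, 3 white, and 4 blue buttons
buttons = multiset_permutations([2, 3, 4])
print(buttons)
```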

Try These

There are 12 marbles in an urn. 8 are white and 4 are red. The white marbles are numbered w1, w2, ..., w8 and the red ones are numbered r1, r2, r3, r4. For (a) through (d): Without looking into the urn you draw out 5 marbles.

(a) How many unique choices can you get if order matters? $12^{\underline{5}} = 95{,}040$

(b) How many unique choices can you get if order does not matter? $\binom{12}{5} = 792$

(c) How many ways can you choose 3 white marbles and 2 red marbles if order matters? You will fill 5 "slots" by drawing. First determine which two slots (positions) will be occupied by 2 red marbles: $\binom{5}{2} = 10$. Next multiply by orderings of 3 white and 2 red: $10 \cdot 8^{\underline{3}} \cdot 4^{\underline{2}} = 40{,}320$.

(d) How many ways can you choose 3 white marbles and 2 red marbles if order does not matter? $\binom{8}{3}\binom{4}{2} = 336$

(e) How many marbles must you draw to be sure of getting two red ones? 10

Complex Combinations

How many ways are there to create a "full house" (3-of-a-kind plus a pair) using a standard deck of 52 playing cards?

$$\binom{13}{1}\binom{4}{3}\binom{12}{1}\binom{4}{2} = 13 \cdot 4 \cdot 12 \cdot 6 = 3{,}744.$$

(choose denomination) x (choose 3 of 4 of given denomination) x (choose one of the remaining denominations) x (choose 2 of 4 of this second denomination). This follows from the multiplication principle (Theorem 2.3.1 in text).

Try these…

Suppose $\binom{n}{11} = \binom{n}{7}$. What is $n$?

Suppose $\binom{18}{r} = \binom{18}{r-2}$. What is $r$?

Examples*

Consider a machining operation in which a piece of sheet metal needs two identical diameter holes drilled and two identical size notches cut. We denote a drilling operation as d and a notching operation as n. In determining a schedule for a machine shop, we might be interested in the number of different possible sequences of the four operations. The number of possible sequences for two drilling operations and two notching operations is

$$\frac{4!}{2!\,2!} = 6.$$

The six sequences are easily summarized: ddnn, dndn, dnnd, nddn, ndnd, nndd.

*Applied Statistics and Probability for Engineers, Douglas C. Montgomery, George C. Runger, John Wiley & Sons, Inc., 2006

Example*

A printed circuit board has eight different locations in which a component can be placed. If five identical components are to be placed on the board, how many different designs are possible? Each design is a subset of the eight locations that are to contain the components. The number of possible designs is, therefore,

$$\binom{8}{5} = \binom{8}{3} = \frac{8^{\underline{3}}}{3!} = \frac{8 \cdot 7 \cdot 6}{3 \cdot 2 \cdot 1} = 56.$$

*Applied Statistics and Probability for Engineers, Douglas C. Montgomery, George C. Runger, John Wiley & Sons, Inc., 2006

Sample Space

Definition: The totality of the possible outcomes of a random experiment is called the Sample Space, $\Omega$.

Finite: Outcome from one roll of one die $\Rightarrow \Omega = \{1, 2, 3, 4, 5, 6\}$.

Countable: The number of attempts until a message is transmitted successfully when the probability of success on any one attempt is $p$ $\Rightarrow \Omega = \{1, 2, 3, 4, 5, 6, \ldots\} = \mathbb{Z}^{+}$.

Continuous: The time (in seconds) until a lightbulb burns out $\Rightarrow \Omega = \{t \in \mathbb{R} : t \ge 0\}$, where $\mathbb{R}$ is the set of all real numbers.

(We begin with the discrete cases.)

Events

Definition: An event is a collection of points from the sample space. Example: the result of one throw of a die is odd.

We use sets to describe events. From the die example let the set of "even" outcomes be $E = \{2, 4, 6\}$. Let the set of "odd" outcomes be $O = \{1, 3, 5\}$.

If $\Omega$ is finite or countable, then a "simple" event is an event that contains only one point from the sample space. For the die example the simple events are $S_1 = \{1\}, S_2 = \{2\}, \ldots, S_6 = \{6\}$.

Suppose we toss a coin until the first Head appears. What are the simple events?

Unless stated otherwise, ALL SUBSETS of a sample space are included as possible events. (Generally we will not be interested in most of these, and many events will have probability zero.)

Describe the sample space and events

Each of 3 machine parts is classified as either above or below spec. Event: At least one part is below spec.

An order for an automobile can specify either an automatic or standard transmission, premium or standard stereo, V6 or V8 engine, leather or cloth interior, and colors: red, blue, black, green, white. Event: Orders have premium stereo, leather interior, and a V8 engine.

Describe: sample space and events

The number of hours of normal use of a lightbulb. Event: Lightbulbs that last between 1500 and 1800 hours.

The individual weights of automobiles crossing a bridge measured in tons to nearest hundredth of a ton. Event: Autos crossing that weigh more than 3,000 pounds.

A message is transmitted repeatedly until transmission is successful. Event: Those messages transmitted 3 or fewer times.

Operations on Events

Because the sample space is a set, $\Omega$, and any event is a subset $A \subset \Omega$, we form new events from existing events by using the usual set theory operations.

$A \cup B \Rightarrow$ At least one of A or B occurs.
$A \cap B \Rightarrow$ Both A and B occur.
$A' \Rightarrow$ A does not occur.
$S \cap A' = S - A \Rightarrow$ S occurs and A does not occur.
$\emptyset \Rightarrow$ the empty set (a set that contains no elements).
$A \cap B = \emptyset \Rightarrow$ A and B are "mutually exclusive."
$A \subset B \Rightarrow$ Every element of A is an element of B, or, if A occurs, B occurs.

Review Venn diagrams (in text).

Example

Four bits are transmitted over a digital communications channel. Each bit is either distorted or received without distortion. Let $A_i$ denote the event that the $i$th bit is distorted, $i = 1, 2, 3, 4$.

(a) Describe the sample space.
(b) What is the event $A_1$?
(c) What is the event $A_1 \cup A_2$?
(d) What is the event $A_1 \cap A_2$?
(e) What is the event $A_1'$?

Venn Diagrams

[Figure: Venn diagram of three events A, B, and C]

Identify the following events:
(a) $A'$
(b) $A \cap B$
(c) $(A \cap B) \cup C$
(d) $(B \cup C)'$
(e) $(A \cap B)' \cup C$

Mutually Exclusive & Collectively Exhaustive

A collection of events $A_1, A_2, \ldots$ is said to be mutually exclusive if

$$A_i \cap A_j = \begin{cases} \emptyset & \text{if } i \ne j \\ A_i = A_j & \text{if } i = j. \end{cases}$$

A collection of events is collectively exhaustive if $\bigcup_i A_i = \Omega$.

A collection of events forms a partition of $\Omega$ if they are mutually exclusive and collectively exhaustive.

A collection of mutually exclusive events forms a partition of an event $E$ if $\bigcup_i A_i = E$.

Partition of Ω

[Figure: $\Omega$ divided into regions $A_1, A_2, \ldots, A_{n-1}, A_n$]

The sets $A_i$ are "events." No two of them intersect (mutually exclusive) and their union covers the entire sample space.

Probability measure

We use a probability measure to represent the relative likelihood that a random event will occur. The probability of an event $A$ is denoted $P(A)$.

Axioms:
A1. For every event $A$, $P(A) \ge 0$.
A2. $P(\Omega) = 1$.
A3. If $A$ and $B$ are mutually exclusive, then $P(A \cup B) = P(A) + P(B)$.
A4. If the events $A_1, A_2, \ldots$ are mutually exclusive, then

$$P\left[\bigcup_{n=1}^{\infty} A_n\right] = \sum_{n=1}^{\infty} P(A_n).$$

Theorem: Given a sample space, $\Omega$, a "well-defined" collection of events, $\mathcal{F}$, and a probability measure, $P$, defined on these events, then the following hold:

(a) $P(\emptyset) = 0$.
(b) $P[A] = 1 - P[A']$, $\forall A \in \mathcal{F}$.
(c) $P[A \cup B] = P[A] + P[B] - P[A \cap B]$, $\forall A, B \in \mathcal{F}$.
(d) $A \subset B \Rightarrow P[A] \le P[B]$, $\forall A, B \in \mathcal{F}$.

You must "know" these and be able to use them to solve problems. Don't worry about proving them.

Applying the Theorem

We roll 1 die and obtain one of the numbers 1 through 6 with equal probability.

(a) What is the probability that we obtain a 7? The event we want is $\emptyset$, thus, the probability is 0.

(b) What is the probability that we do NOT get a 1? The event we want is $\{1\}'$ or $\Omega \sim \{1\}$, and $P[\{1\}'] = 1 - P[\{1\}] = \frac{5}{6}$.

(c) What is the probability that we get a 1 or a 3? $P[\{1\} \cup \{3\}] = P[\{1\}] + P[\{3\}] = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$.

(d) If $E = \{1, 4, 6\}$ and $E \subset G$, what might the event $G$ be? $G = \{1, 4, 5, 6\}$, $G = \{1, 3, 4, 6\}$, $G = \{1, 2, 4, 5, 6\}$, $G = \{1, 2, 3, 4, 5, 6\}$, or $G = E$. Note that in all of these cases $P[E] \le P[G]$.

Assigning Discrete Probabilities

When there are exactly $n$ possible outcomes of an experiment, $x_1, x_2, \ldots, x_n$, then the assigned probabilities, $p(x_i)$, $i = 1, 2, \ldots, n$, must satisfy the following:

(1) $0 \le p(x_i) \le 1$, $i = 1, 2, \ldots, n$.
(2) $\sum_{i=1}^{n} p(x_i) = 1$.

If all of the outcomes have equal probability, then each $p(x_i) = \frac{1}{n}$. Thus, the probability of any particular outcome on the roll of a fair die is $\frac{1}{6}$.

Suppose, however, we have a biased die on which a 4 is 3 times as likely as any other outcome. This implies that $p(1) = p(2) = p(3) = p(5) = p(6) = a$ (for example) and $p(4) = 3a$. It follows that $8a = 1 \Rightarrow a = \frac{1}{8}$, thus, $p(i) = \frac{1}{8}$, $i \ne 4$, and $p(4) = \frac{3}{8}$.

Complex Combinations

How many ways are there to create a "full house" (3-of-a-kind plus a pair) using a standard deck of 52 playing cards?

$$\binom{13}{1}\binom{4}{3}\binom{12}{1}\binom{4}{2} = 13 \cdot 4 \cdot 12 \cdot 6 = 3{,}744.$$

(choose denomination) x (choose 3 of 4 of given denomination) x (choose one of the remaining denominations) x (choose 2 of 4 of this second denomination). This follows from the multiplication principle (Theorem 2.3.1 in text).

What is the probability of a "full house"?

In discrete problems we interpret probability as a ratio:

$$\frac{\text{number of successful outcomes}}{\text{total number of outcomes}} = \frac{\text{successes}}{\text{successes} + \text{failures}}.$$

In this case the number of successful outcomes is the number of ways to get a full house (3,744). The total number of outcomes is:

$$\binom{52}{5} = \frac{52^{\underline{5}}}{5!} = \frac{52 \cdot 51 \cdot 50 \cdot 49 \cdot 48}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 2{,}598{,}960.$$

Thus, the probability of getting a full house is $\frac{3{,}744}{2{,}598{,}960} = 0.00144$. This is an example of a hypergeometric distribution; we'll study this soon.

A full house happens about once in every 694 hands! This is why people invented wild cards.
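The full-house ratio can be reproduced with exact arithmetic. This is a minimal Python sketch, assuming Python 3.8 or later for `math.comb`; the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

# (denomination of the triple) x (3 of its 4 suits)
# x (denomination of the pair) x (2 of its 4 suits)
full_houses = comb(13, 1) * comb(4, 3) * comb(12, 1) * comb(4, 2)

# All 5-card hands from a 52-card deck, order ignored
hands = comb(52, 5)

p_full_house = Fraction(full_houses, hands)
print(full_houses, hands)
print(float(p_full_house))      # probability as a decimal
print(float(1 / p_full_house))  # average number of hands per full house
```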

Conditional Probability

The conditional probability of A given that B has occurred is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad \text{provided } P(B) \ne 0.$$

Q1: What is the probability of obtaining a total of 8 when rolling two dice?

Q2: Suppose you roll two dice that you cannot see. Someone tells you that the sum is greater than 6. What is the probability that the sum is 8?

Dice Problem

Let A be the event of getting 8 on the roll of two dice. Let B be the event that the sum of the two dice is greater than 6. The first question is: Find $P(A)$. Here is the sample space:

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

Sum = 8: $P(A) = \frac{5}{36}$ because we are assuming that each outcome pair has the same probability, 1/36.

Dice Problem (conditional)

Here we roll the dice and learn that the sum is greater than 6. Let B represent the event that the sum is greater than 6. With this knowledge the sample space becomes the following:

(1,6)
(2,5) (2,6)
(3,4) (3,5) (3,6)
(4,3) (4,4) (4,5) (4,6)
(5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

It follows that $P(A \mid B) = \frac{5}{21}$.

Alternatively, by definition of conditional probability, we have

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)}{P(B)} \quad \text{because } A \cap B = A \text{ (note well!).}$$

Furthermore,

$$\frac{P(A)}{P(B)} = \frac{5/36}{21/36} = \frac{5}{21}.$$

So…the definition makes sense!
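Both routes to $P(A \mid B)$ can be confirmed by enumerating the 36 equally likely outcomes. The sketch below is one possible way to do this in Python; the helper `prob` and the lambda names are illustrative assumptions, not notation from the slides.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of two dice
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] + w[1] == 8   # sum is 8
B = lambda w: w[0] + w[1] > 6    # sum is greater than 6

p_A = prob(A)
p_B = prob(B)
p_A_and_B = prob(lambda w: A(w) and B(w))

# Definition of conditional probability
p_A_given_B = p_A_and_B / p_B
print(p_A, p_B, p_A_given_B)
```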

Try this.

A university has 600 freshmen, 500 sophomores, and 400 juniors. 80 of the freshmen, 60 of the sophomores, and 50 of the juniors are Computer Science majors. For this problem assume there are NO seniors.

What is the probability that a student, selected at random, is a freshman or a CS major (or both)?

If a student is a CS major, what is the probability he/she is a sophomore?

Use these steps to solve previous slide

1. What is the sample space?
2. What are the events (subsets) of interest?
3. What are the probabilities of the events of interest?
4. What is the answer to the problem?

Alternate Form

We have seen that the conditional probability of event A given that event B has occurred is:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad \text{provided } P(B) \ne 0.$$

Clearly this implies that $P(A \cap B) = P(A \mid B) P(B)$. This is referred to as the "multiplication rule," and holds even when $P(B) = 0$. Notice that we could also write $P(A \cap B) = P(B \mid A) P(A)$.

Both these equations always hold for any two events. But there is a special case where the conditional probabilities above are not needed.

Note: memorize these conditional probability equations TODAY. They are extremely important.

Independent Events

Two events A and B are independent iff $P(A \cap B) = P(A) P(B)$.

Example (dice)

Q1: If one die is rolled twice, is the probability of getting a 3 on the first roll independent of the probability of getting a 3 on the second roll?

Q2: If one die is rolled twice, is the probability that their sum is greater than 5 independent of the probability that the first roll produces a 1?

Dice sample spaces (Q1)

The sample space associated with one roll of a die: $\Omega_1 = \{1, 2, 3, 4, 5, 6\}$. Unless otherwise stated we assume the die is fair so that the probability of any one of the simple events is $\frac{1}{6}$.

The sample space associated with two rolls of one die (or with one roll of a pair of dice):

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

Clearly $P(3,3) = \frac{1}{36}$. P(3 on first roll) = P(3 on second roll) = $\frac{1}{6}$. Because $\frac{1}{6} \times \frac{1}{6} = \frac{1}{36}$, the two events are independent.

Dice: Q2

The probability of getting a 1 on the first die is $\frac{1}{6}$. Let G5 be the event that the sum of the two dice is greater than 5 and F1 be the event that the first roll produces a 1. The sample space is:

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

$F1 \cap G5 = \{(1,5), (1,6)\} \Rightarrow P[F1 \cap G5] = \frac{2}{36} = \frac{1}{18}$.

$P[G5] = \frac{26}{36} = \frac{13}{18}$.

$P[F1] = \frac{1}{6}$.

$P[F1 \cap G5] = \frac{1}{18} \ne P[F1]\,P[G5] = \frac{1}{6} \times \frac{13}{18} = \frac{13}{108}$.

Thus, these two events are NOT independent.
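The independence test $P(A \cap B) = P(A)P(B)$ is easy to apply mechanically to both dice questions. The following Python sketch does so by enumeration; the helpers `prob` and `independent` are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# Two rolls of one fair die: 36 equally likely ordered pairs
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def independent(a, b):
    """True iff P(a and b) equals P(a) * P(b) exactly."""
    return prob(lambda w: a(w) and b(w)) == prob(a) * prob(b)

first_is_3 = lambda w: w[0] == 3
second_is_3 = lambda w: w[1] == 3
F1 = lambda w: w[0] == 1          # first roll produces a 1
G5 = lambda w: w[0] + w[1] > 5    # sum is greater than 5

q1_independent = independent(first_is_3, second_is_3)
q2_independent = independent(F1, G5)
print(q1_independent, q2_independent)
print(prob(lambda w: F1(w) and G5(w)), prob(F1) * prob(G5))
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.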

Practice Quiz 1

Explain your work as you have been taught in class.

1. A university has 600 freshmen, 500 sophomores, and 400 juniors. 80 of the freshmen, 60 of the sophomores, and 50 of the juniors are Computer Science majors. For this problem assume there are NO seniors. If a student is a CS major, what is the probability that he/she is a Junior?

2. Evaluate $\sum_{i=3}^{\infty} \left(\frac{1}{4}\right)^{i}$.

3. What is the probability of drawing 2 pairs in a draw of 5 cards from a standard deck of 52 cards? (A pair is two cards of the same denomination, such as two aces, two sixes, or two kings.)

Multiplication and Total Probability Rules*

Multiplication Rule

*This slide from Applied Statistics and Probability for Engineers, 3rd Ed., by Douglas C. Montgomery and George C. Runger, John Wiley & Sons, Inc., 2006


Multiplication and Total Probability Rules*

Total Probability Rule

[Figures: Partitioning an event into two mutually exclusive subsets. Partitioning an event into several mutually exclusive subsets.]

*This slide from Applied Statistics and Probability for Engineers, 3rd Ed., by Douglas C. Montgomery and George C. Runger, John Wiley & Sons, Inc., 2006

Problem 2-97a

A batch of 25 injection-molded parts contains 5 that have suffered excessive shrinkage. If two parts are selected at random, and without replacement, what is the probability that the second part selected is one with excessive shrinkage?

S = {pairs (f, s) of first-selected, second-selected taken from 25 total with 5 defects}
SD = {second selected (no replace) is a defect}
FD = {first selected is a defect}
FN = {first selected is not a defect}

Problem Solution

We seek $P[SD] = P[SD \cap FD] + P[SD \cap FN]$. This becomes

$$P[SD \mid FN]\,P[FN] + P[SD \mid FD]\,P[FD] = \frac{5}{24} \cdot \frac{4}{5} + \frac{4}{24} \cdot \frac{5}{25} = \frac{1}{6} + \frac{1}{30} = \frac{1}{5} = 0.2.$$
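The total probability computation for this problem can be cross-checked against a direct count of ordered pairs. The Python sketch below does both; labeling parts 0 through 4 as the defective ones is an arbitrary assumption made only to set up the enumeration.

```python
from fractions import Fraction
from itertools import permutations

# Label the 25 parts 0..24; assume (arbitrarily) parts 0..4 are the defects
defective = set(range(5))

# Ordered pairs (first, second) drawn without replacement
pairs = list(permutations(range(25), 2))

# Direct count: second part selected is defective
p_sd_count = Fraction(sum(1 for f, s in pairs if s in defective), len(pairs))

# Total probability rule: condition on the first part selected
p_fd = Fraction(5, 25)    # first is a defect
p_fn = Fraction(20, 25)   # first is not a defect
p_sd_total = Fraction(5, 24) * p_fn + Fraction(4, 24) * p_fd

print(p_sd_count, p_sd_total)
```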

Multiplication and Total Probability Rules*

Total Probability Rule (multiple events)

*This slide from Applied Statistics and Probability for Engineers, 3rd Ed., by Douglas C. Montgomery and George C. Runger, John Wiley & Sons, Inc., 2006

Total Probability Example

A semiconductor manufacturer has the following data regarding the effect of contaminants on the probability that chips fail.

| Probability of Failure | Level of Contamination |
|---|---|
| 0.1 | High |
| 0.01 | Medium |
| 0.001 | Low |

In a particular production run 20% of the chips have high-level, 30% have medium-level, and 50% have low-level contamination. What is the probability that one of the resulting chips fails?
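One way to organize this calculation is to apply the total probability rule over the three contamination levels, which partition the production run. The Python sketch below is an illustration under that reading of the problem; the dictionary layout and names are assumptions made for the example.

```python
from fractions import Fraction

# level: (P[level], P[fail | level]), taken from the table and the run data
levels = {
    "high":   (Fraction("0.20"), Fraction("0.1")),
    "medium": (Fraction("0.30"), Fraction("0.01")),
    "low":    (Fraction("0.50"), Fraction("0.001")),
}

# The levels partition the run, so their probabilities must sum to 1
total_weight = sum(p_level for p_level, _ in levels.values())

# Total probability rule: sum of P[fail | level] * P[level]
p_chip_fails = sum(p_level * p_given for p_level, p_given in levels.values())
print(total_weight, p_chip_fails, float(p_chip_fails))
```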

Bernoulli Trials

"Consider an experiment that has two possible outcomes, success and failure. Let the probability of success be $p$ and the probability of failure be $q$ where $p + q = 1$. Now consider the compound experiment consisting of a sequence of $n$ independent repetitions of this experiment. Such a sequence is known as a sequence of Bernoulli Trials."

The probability of obtaining exactly $k$ successes in a sequence of $n$ Bernoulli trials is the binomial probability

$$p(k) = \binom{n}{k} p^{k} q^{n-k}.$$

Note that the sum of the probabilities $\sum_k p(k) = 1$. Thus they are said to form a probability distribution.

Probability Distribution

When we take a discrete or countable sample space $\Omega = \{s_1, s_2, \ldots\}$ and assign probabilities to each of the possible simple events: $P(\{s_1\}) = p_1$, $P(\{s_2\}) = p_2, \ldots$, we have created a probability distribution. (Think that you have "distributed" all of the probability over all possible events.)

As an example, if I toss a coin one time then $P(H) = \frac{1}{2}$ and $P(T) = \frac{1}{2}$ represents a probability distribution.

The single coin toss distribution also is an example of a Bernoulli trial because it has only two possible outcomes (generally called "success" or "failure").

Binomial Probability Distribution

The binomial probabilities are defined by $p(k) = \binom{n}{k} p^{k} q^{n-k}$, where $p$ is the probability of success and $q$ is the probability of failure in $n$ Bernoulli trials.

Suppose we toss a coin 10 times and we want the total number of heads. Then $p = P[H]$, $q = P[T]$, $n = 10$. Using the above formula we obtain the probabilities:

[Figure: bar chart of binomial probabilities for k = 0, 1, ..., 10 with n = 10, p = 0.5]
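The plotted probabilities can be regenerated directly from the formula. This is a minimal Python sketch, assuming a fair coin ($p = q = 0.5$) as in the chart; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli trials."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

n, p = 10, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, value in enumerate(pmf):
    print(k, round(value, 4))

# The probabilities form a distribution, so they should sum to 1
print(sum(pmf))
```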

Regarding Parameters

Notice that the binomial distribution is completely defined by the formula for its probabilities, $p(k) = \binom{n}{k} p^{k} q^{n-k}$, and by its "parameters" $p$ and $n$. The binomial probability equation never changes so we regard a binomial distribution as being defined by its parameters. This is typical of all probability distributions (using their own parameters, of course).

One of the problems we often face in statistics is estimating the parameters after collecting data that we know (or believe) comes from a particular probability distribution (such as the $p$ and $n$ for the binomial). Alternatively, we may choose to estimate "statistics" such as mean and variance that are functions of these parameters. We'll get to this after we consider random variables and the continuous sample space.

Example (from Trivedi*)

Consider a binary communication channel transmitting coded words of $n$ bits each. Assume that the probability of successful transmission of a single bit is $p$ and that the probability of an error is $q = 1 - p$. Assume also that the code is capable of correcting up to $e$ errors, where $e \ge 0$. If we assume that the transmission of successive bits is independent, then the probability of successful word transmission is:

$$P_w = P[e \text{ or fewer errors in } n \text{ trials}] = \sum_{i=0}^{e} \binom{n}{i} q^{i} p^{n-i}.$$

Notice that a "success for the Binomial distribution" means getting an error, which has probability $q$.

*Probability and Statistics with Reliability, Queuing and Computer Science Applications, 2nd Ed., Kishor S. Trivedi, J. Wiley & Sons, NY 2002
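The sum for $P_w$ translates directly into code. The sketch below is an illustration only: the word length of 8 bits and the bit success probability of 0.99 are hypothetical values chosen for the example and do not come from the source.

```python
from math import comb

def word_success_probability(n, e, p):
    """P[e or fewer bit errors among n independent bits].

    p is the probability that a single bit is transmitted successfully,
    so an individual bit error has probability q = 1 - p.
    """
    q = 1 - p
    return sum(comb(n, i) * q**i * p**(n - i) for i in range(e + 1))

# Hypothetical parameters, for illustration only
n_bits, bit_success = 8, 0.99

for correctable in range(4):
    print(correctable, word_success_probability(n_bits, correctable, bit_success))
```

With $e = 0$ the sum reduces to $p^n$ (every bit must arrive intact), and with $e = n$ it reduces to 1.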

Example

A communications network is being shared by 100 workstations. Time is divided into intervals that are 100 ms long. One and only one workstation may transmit during one of these time intervals. When a workstation is ready to transmit, it will wait until the beginning of the next 100 ms time interval before attempting to transmit. If more than one workstation is ready at that moment, a collision occurs, and each of the $k$ ready workstations waits a random amount of time before trying again. If $k = 1$, then transmission is successful. Suppose the probability of a workstation being ready to transmit is $p$. Show how probability of collision varies as $p$ varies between 0 and 0.1.
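One way to approach this example is to model the number of ready workstations as binomial with $n = 100$, which assumes the workstations become ready independently of one another (an assumption not stated explicitly in the problem). A collision is then the event that two or more are ready. The Python sketch below tabulates that probability; the function name and the grid of $p$ values are illustrative choices.

```python
def collision_probability(p, n=100):
    """P[2 or more of n workstations are ready], assuming independence.

    Computed as 1 - P[none ready] - P[exactly one ready].
    """
    none_ready = (1 - p) ** n
    one_ready = n * p * (1 - p) ** (n - 1)
    return 1 - none_ready - one_ready

# Sweep p from 0 to 0.1 in steps of 0.01
p_values = [i / 100 for i in range(11)]
curve = [collision_probability(p) for p in p_values]

for p, c in zip(p_values, curve):
    print(f"p = {p:.2f}  P[collision] = {c:.4f}")
```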

Practice Quiz 2

A partial deck of playing cards (fewer than 52 cards) contains some spades, hearts, diamonds, and clubs (NOT 13 of each "suit"). If a card is drawn at random, then the probability that it is a spade is 0.2. We write this as P[Spade]=0.2. Similarly, P[Heart]=0.3, P[Diamond]=0.25, P[Club]=0.25.

Each of the 4 suits has some number of "face" cards (King, Queen, Jack). If the drawn card is a spade, the probability is 0.25 that it is a face card. If it is a heart, the probability is 0.25 that it is a face card. If it is a diamond, the probability is 0.2 that it is a face card. If it is a club, the probability is 0.1 that it is a face card.

1. What is the probability that the randomly drawn card is a face card?
2. What is the probability that the card is a Heart and a face card?
3. If the card is a face card, what is the probability that it is a spade?

**Discrete Random Variables**

G. A. Marin


Review of “function”

Defn: A function is a set of ordered pairs such that no two pairs have the same first element (unless they also have the same second element).

Example: $g = \{(1, 2), (3, 5), (5, 12)\}$ defines a function, $g$, whose "domain" consists of the real numbers 1, 3, 5 and whose "range" consists of the numbers 2, 5, 12. All functions are said to "map" values in their domain to values in their range.

Example: $f(x) = x^2 + 5$. Here a function is defined using a formula. This actually implies that the function is $f = \{(x, x^2 + 5) : x \text{ is a real number}\}$.

Notice the following:
(a) The function has a "name." Here that name is $f$.
(b) The implied domain of the function includes all real numbers, $x$, that can be plugged into the formula. In this case that includes all real numbers.
(c) Every number $x$ in the domain (all reals) is "mapped" to the number $x^2 + 5$. Thus $f(1) = 6$, $f(-5) = 30$, $f(\pi) = \pi^2 + 5$.
(d) Sometimes we write this as $1 \to 6$, $-5 \to 30$, $\pi \to \pi^2 + 5$.
(e) The range of $f$ is $\{x : x \ge 5\}$.


Random Variable

Definition: A random variable $X$ on a sample space $\Omega$ is a function that assigns a real number $X(s) = x$ to each sample point $s \in \Omega$. The inverse image of $x$ under $X$ is the set of all points in $\Omega$ that the random variable $X$ maps to the value $x$. It is denoted $A_x = \{s \in \Omega \mid X(s) = x\}$.

Ω discrete ⇒ X discrete
Ω continuous ⇒ X continuous


Random Variable

[Figure: $X$ maps the sample space $\Omega = \{s_1, s_2, s_3, \ldots\}$ into $\mathbb{R} = (-\infty, \infty)$]

We write $X(s) = x$, where $s \in \Omega$, and $x \in \mathbb{R}$. We define $A_x$ as the set of all points in $\Omega$ that "map" into the value $x \in \mathbb{R}$. Sometimes we write $A_x = X^{-1}(x)$ and state that $A_x$ is the "inverse image" of the value $x$ under the random variable $X$.

For discrete random variables, we then define the probability of the value $x$ to equal $P[A_x]$: $p_X(x) = P[A_x]$.

OR we may be given a discrete (continuous later) random variable, a description of the values it can produce and the probability of each value. For example, $P[X = k] = p_k$, for $k = 1, 2, \ldots, n$. In this case we need not know what the underlying experiment really is.

The role of a random variable

Experiment 1: Roll 1 fair die and determine the outcome.

Experiment 2: Spin an arrow that lands with equal probability on one of the numbers 1 through 6.

Experiment 3: You have 6 cards numbered 1 through 6. Shuffle them and draw one at random. Replace the card and reshuffle to repeat.

Notice that we'd represent the sample space of each of these as $\{1, 2, 3, 4, 5, 6\}$, usually without drawing dice or arrows or cards, but the sample spaces really include dice, arrows, or cards. For each probability distribution let $p_i = \frac{1}{6}$ for $i = 1, 2, \ldots, 6$.

The importance of the random variable is that it lets us deal with such an experimental setup without thinking about dice, arrows, or cards. We say: "Let $X$ be a random variable that takes on the discrete values $1, 2, \ldots, 6$. Its probability mass function is given as: the probability that $X = i$ is 1/6." We write this as $p_X(i) = \frac{1}{6}$ for $i = 1, 2, \ldots, 6$.

Probability Mass Function

If $X$ is a discrete random variable, then its probability mass function (pmf) is given by:

$$p_X(x) = P(X = x) = P(A_x) = \sum_{s \in A_x} P(s).$$

The pmf satisfies the following properties:

(p1) $0 \le p_X(x) \le 1$ for all values, $x$, such that $P[X = x]$ is defined.

(p2) If $X$ is a discrete random variable then $\sum_i p_X(x_i) = 1$, where the set $\{x_1, x_2, \ldots\}$ includes all real numbers, $x$, such that $p_X(x) \ne 0$.

Note: you cannot define a pmf without first defining a random variable. You can, however, define a probability distribution directly on a sample space with no random variable defined.

Discrete RV Example 1

Let the sample space $\Omega$ represent all possible outcomes of a roll of one die; thus, $\Omega = \{1, 2, 3, 4, 5, 6\}$. We define the random variable $X$ on this sample space as follows:

$$X(i) = \begin{cases} 1 & \text{if } i = 1, 2 \\ 0 & \text{if } i = 3, 4, 5, 6. \end{cases}$$

Because the probability of rolling a 1 or 2 is $\frac{1}{3}$, we define X's probability mass function as

$$p_X(i) = \begin{cases} \frac{1}{3} & \text{if } i = 1 \\ \frac{2}{3} & \text{if } i = 0. \end{cases}$$

$X$ has a Bernoulli distribution. Alternatively, we could just write that $p_X(1) = \frac{1}{3}$ and $p_X(0) = \frac{2}{3}$, or we could define the pmf using a table:

pmf of X

| Value | Prob |
|---|---|
| 0 | 2/3 |
| 1 | 1/3 |

Discrete RV Example 2

A die is tossed until the occurrence of the first 6. Let the random variable $X = k$ if the first 6 occurs on the $k$th roll for integer $k > 0$. What is the probability mass function (pmf) for $X$?

In order for the first 6 to occur on the 5th toss, for example, we must have the event AAAA6 occur where A means any result other than 6. Clearly, these represent a sequence of 5 Bernoulli trials where success = 6 and failure = 1 through 5. Each trial is independent, thus, the probability of this particular result is

$$\left(\frac{5}{6}\right)^{4}\left(\frac{1}{6}\right) = \frac{625}{7776} = 0.08.$$

Similarly, the probability of the first 6 on the $k$th roll is $\left(\frac{5}{6}\right)^{k-1}\left(\frac{1}{6}\right)$. This defines the pmf,

$$p_X(k) = \left(\frac{5}{6}\right)^{k-1}\left(\frac{1}{6}\right).$$

This is a particular instance of the geometric distribution.
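The geometric pmf above is simple to evaluate exactly. The following Python sketch uses exact fractions to reproduce the 5th-toss value and to check the partial sums of the pmf; the function name `geometric_pmf` is an illustrative choice.

```python
from fractions import Fraction

def geometric_pmf(k, p=Fraction(1, 6)):
    """P[first success occurs on trial k], for k = 1, 2, 3, ..."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return (1 - p) ** (k - 1) * p

# First 6 on the 5th toss: four failures followed by one success
p_fifth = geometric_pmf(5)
print(p_fifth, float(p_fifth))

# Partial sums of the pmf approach 1 as more terms are included
partial = sum(geometric_pmf(k) for k in range(1, 51))
print(float(partial))
```

The partial sum over the first $n$ terms equals $1 - (5/6)^n$, which is the probability that a 6 appears within $n$ tosses.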

Useful "die" illustrations

(1) Roll a die once and the probability of getting any one number (choose one of six) is 1/6. The uniform distribution for $X$:

$$p_X(k) = \frac{1}{6}, \quad k = 1, 2, \ldots, 6.$$

(2) Roll a die $n$ times and count the number of times, $k$, that you get, say, a 2. This is $\binom{n}{k}\left(\frac{1}{6}\right)^{k}\left(\frac{5}{6}\right)^{n-k}$. The binomial distribution for $X$:

$$p_X(k) = \binom{n}{k}\left(\frac{1}{6}\right)^{k}\left(\frac{5}{6}\right)^{n-k}, \quad k = 0, 1, \ldots, n.$$

(3) Roll a die once, twice, ..., until you get, say, a 2 for the first time. Suppose that the first time you get the 2 is on the $k$th roll. The probability of this is $\left(\frac{5}{6}\right)^{k-1}\frac{1}{6}$. The geometric distribution for $X$:

$$p_X(k) = \left(\frac{5}{6}\right)^{k-1}\frac{1}{6}, \quad k = 1, 2, \ldots$$

Probability of Sets & Intervals

For a discrete RV, $X$, and any set of real numbers, $A$, we can write:

$$P(X \in A) = \sum_{x_i \in A} p_X(x_i).$$

If $A = (a, b)$, we write: $P(X \in A) = P(a < X < b)$.
If $A = (a, b]$, we write: $P(X \in A) = P(a < X \le b)$, etc.

For any real number $x$ the probability that the random variable $X$ takes a value in the interval $(-\infty, x]$ is especially important and is denoted as:

$$F_X(x) = P(-\infty < X \le x) = P(X \le x) = \sum_{t \le x} p_X(t),$$

where the last equality holds only for discrete RVs $X$. The function $F$ is called the cumulative distribution function (or just the distribution function) of $X$.

**Simple cdf example**

Let $X$ be a random variable with pmf given by:

$$p_X(1) = \frac{1}{8}, \quad p_X(2) = \frac{2}{8}, \quad p_X(3) = \frac{5}{8}.$$

Then the cdf, $F_X$, is given by:

$$F_X(x) = \begin{cases} 0 & \text{for } x < 1 \\ \frac{1}{8} & \text{for } 1 \le x < 2 \\ \frac{3}{8} & \text{for } 2 \le x < 3 \\ 1 & \text{for } x \ge 3. \end{cases}$$

The "meaning" of $F$ is that "$F_X(x)$ is the probability that $X \le x$." The function $F$ is defined for ALL REAL NUMBERS. (Its domain is all reals.) The range of $F$ is always between 0 and 1. For a discrete random variable the graph of $F_X$ is a step function. $F$ starts at 0 (for a discrete random variable like this one) and adds up the probabilities until it ends at 1. NOTICE that $F$ simply adds up the probability mass function's range values as it gets to them (starting from $-\infty$ and moving towards $+\infty$).
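The "adding up as it gets to them" description of the cdf can be sketched in a few lines. This is a minimal illustration for the pmf of this example; the names `PMF` and `cdf` are chosen here and are not part of the course material.

```python
from fractions import Fraction

# pmf of the simple example: p(1) = 1/8, p(2) = 2/8, p(3) = 5/8
PMF = {1: Fraction(1, 8), 2: Fraction(2, 8), 3: Fraction(5, 8)}

def cdf(x: float) -> Fraction:
    """F_X(x) = sum of p_X(t) over all support points t <= x."""
    return sum((p for t, p in PMF.items() if t <= x), Fraction(0))

# The cdf is defined for every real x and only changes at 1, 2 and 3.
for x in (0.5, 1, 1.9, 2, 2.5, 3, 10):
    print(x, cdf(x))
```

Evaluating at points between the jumps shows the step-function behaviour directly.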

**Cumulative Distribution Function Properties**

Important: $P(a < X \le b) = F_X(b) - F_X(a)$.

(F1) $0 \le F_X(x) \le 1$.

(F2) $F(x)$ is a nondecreasing function of $x$.

(F3) $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to +\infty} F(x) = 1$.

(F4) For discrete $X$ that has positive probability only at the values $x_1, x_2, \ldots$, $F$ has a positive jump at $x_i$ equal to $p_X(x_i)$ and takes a constant value in the interval $[x_{i-1}, x_i)$. Thus, it graphs a step function.

Cumulative distribution functions of discrete RVs grow only by jumps, and cumulative distribution functions of continuous RVs have no jumps. A RV is said to be of mixed type if it has continuous intervals plus jumps.

**Bernoulli Distribution**

The RV, $X$, is Bernoulli (or has a Bernoulli distribution) if its pmf is given by $p_0 = p_X(0) = q$ and $p_1 = p_X(1) = p$, where $p + q = 1$. The corresponding CDF is given by:

$$F(x) = \begin{cases} 0 & \text{for } x < 0 \\ q & \text{for } 0 \le x < 1 \\ 1 & \text{for } x \ge 1. \end{cases}$$

Example: Roll a die once. Let $X = 1$ if the result is 1 or 2. Let $X = 0$ otherwise. This is a Bernoulli trial with $p = 1/3$ and $q = 2/3$.

**Bernoulli pmf**

[Figure: bar chart "Bernoulli Distribution p=0.5", pmf values at 0 and 1.]

**Bernoulli cdf p=0.5**

Write as:

$$F(x) = \begin{cases} 0 & \text{for } x < 0 \\ 0.5 & \text{for } 0 \le x < 1 \\ 1 & \text{otherwise.} \end{cases}$$

Notice that the cdf is defined for all real numbers, $x$.

**Discrete Uniform Distribution**

Let $X$ be a random variable that can take any of $n$ values $x_1, x_2, \ldots, x_n$ with equal probability $\frac{1}{n}$. The RV $X$ is said to have a Discrete Uniform Distribution, and has pmf given by:

$$p_X(x_i) = \begin{cases} \frac{1}{n} & \text{for } i = 1, 2, \ldots, n \\ 0 & \text{otherwise.} \end{cases}$$

If we let $X$ take on the integer values $1, 2, \ldots, n$, then its distribution function is given by

$$F_X(x) = \begin{cases} 0 & \text{for } x < 1 \\ \sum_{i=1}^{\lfloor x \rfloor} \frac{1}{n} = \frac{\lfloor x \rfloor}{n} & \text{for } 1 \le x \le n \\ 1 & \text{for } x > n. \end{cases}$$

**Discrete Uniform pmf n=10**

[Figure: bar chart of the discrete uniform pmf for n = 10, values 1 through 10.]

**Discrete Uniform cdf n=10**

$$F_X(x) = \begin{cases} 0 & \text{for } x < 1 \\ \sum_{i=1}^{\lfloor x \rfloor} p_X(x_i) = \frac{\lfloor x \rfloor}{10} & \text{for } 1 \le x \le 10 \\ 1 & \text{otherwise.} \end{cases}$$

**Binomial Distribution**

Let $Y_n$ denote the number of successes in $n$ Bernoulli trials. The random variable $Y_n$ is said to have a binomial distribution. The pmf of $Y_n$ is given by:

$$p_k = P(Y_n = k) = p_{Y_n}(k) = \begin{cases} \binom{n}{k} p^k (1-p)^{n-k} & \text{for } 0 \le k \le n,\ k \text{ an integer} \\ 0 & \text{otherwise.} \end{cases}$$

The cdf is:

$$P[Y_n \le t] = F_{Y_n}(t) = \begin{cases} 0 & \text{for } t < 0 \\ \sum_{i=0}^{\lfloor t \rfloor} \binom{n}{i} p^i (1-p)^{n-i} & \text{for } 0 \le t \le n \\ 1 & \text{for } t > n. \end{cases}$$

Example: Toss a coin 10 times and count the total number of heads. This is binomial with $p = 0.5$.

**Probability Mass Function Binomial n=10 p=0.5**

[Figure: bar chart of the binomial probabilities for k = 0 through 10.]

**Binomial cdf n=10 p=0.5**

$$P[Y_n \le t] = F_{Y_n}(t) = \begin{cases} 0 & \text{for } t < 0 \\ \sum_{i=0}^{\lfloor t \rfloor} \binom{10}{i} p^i (1-p)^{10-i} & \text{for } 0 \le t \le 10 \\ 1 & \text{otherwise.} \end{cases}$$
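The binomial pmf and cdf for the coin example (n = 10, p = 0.5) can be evaluated directly. The following is a small sketch using only the standard library; the function names `binom_pmf` and `binom_cdf` are illustrative choices, not names from the course.

```python
from math import comb, floor

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(Y_n = k) = C(n, k) p^k (1-p)^(n-k) for integer 0 <= k <= n, else 0."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(t: float, n: int, p: float) -> float:
    """P(Y_n <= t): sum the pmf from 0 up to floor(t)."""
    if t < 0:
        return 0.0
    if t >= n:
        return 1.0
    return sum(binom_pmf(i, n, p) for i in range(floor(t) + 1))

print(binom_pmf(5, 10, 0.5))    # probability of exactly 5 heads
print(binom_cdf(5.7, 10, 0.5))  # same as the cdf at 5, since floor(5.7) = 5
```

Because the cdf only depends on the floor of its argument, it is constant between integers, which is the step shape in the cdf slide.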

**Geometric Distribution**

Consider any arbitrary sequence of Bernoulli trials and let $Z$ be the number of trials up to and including the first success. $Z$ is said to have a geometric distribution with pmf given by

$$p_Z(i) = q^{i-1} p \quad \text{for } i = 1, 2, \ldots \text{ and probabilities } p + q = 1.$$

This is well-defined because

$$\sum_{i=1}^{\infty} p q^{i-1} = \frac{p}{1-q} = 1.$$

The distribution function of $Z$ is given by

$$F_Z(t) = \begin{cases} 0 & \text{for } t < 1 \\ \sum_{i=1}^{\lfloor t \rfloor} p(1-p)^{i-1} = 1 - (1-p)^{\lfloor t \rfloor} & \text{for } t \ge 1. \end{cases}$$

Example: See the previous example concerning rolling 1 die until a 6 occurs.

**Geometric pmf p=0.5**

[Figure: bar chart "Geometric pmf Example", geom p=0.5, for values 1 through 10.]

**Geometric cdf p=0.5**

$$F_Z(t) = \begin{cases} 0 & \text{for } t < 1 \\ \sum_{i=1}^{\lfloor t \rfloor} p(1-p)^{i-1} = 1 - (1-p)^{\lfloor t \rfloor} & \text{for } t \ge 1. \end{cases}$$

Poisson Distribution

A random variable, $X_t$, has a Poisson Distribution with parameter $\alpha > 0$ if its pmf is given by:

$$P(X_t = k) = \frac{(\alpha t)^k e^{-\alpha t}}{k!} \quad \text{for } k = 0, 1, \ldots \text{ and } t \ge 0.$$

(A distinct RV for each $t$.)

NOTE: The Poisson is typically used to model the number of jobs arriving during time $t$ in a time-share system, the arrival of calls at a switchboard, the arrival of messages at a terminal, etc. The parameter $\alpha$ is then interpreted as an arrival rate "per unit time." That is, if $t$ is in seconds, then $\alpha$ must be the average arrivals per second. (In our text the parameter is given as $\lambda$.)

The cumulative distribution function is:

$$F_{X_t}(x) = \begin{cases} 0 & \text{for } x < 0 \\ \sum_{k=0}^{\lfloor x \rfloor} \frac{(\alpha t)^k e^{-\alpha t}}{k!} & \text{for } x \ge 0. \end{cases}$$

Notice that in mathematical notation $\alpha$ does not typically appear on the left-hand side even though the function is unspecified without it.

Applied Statistics: Probability 1-73

**Packet Arrival Example**

[Figure: time line of packet arrivals with counts $X_1, X_2, X_3, X_4, \ldots, X_N$.]

Each of the random variables $X_1, X_2, \ldots, X_N$ has a Poisson distribution; thus,

$$P(X_i = k) = \frac{\alpha^k e^{-\alpha}}{k!} \quad \text{for } i = 1, 2, \ldots, N.$$

If $Y_t$ represents the total number of arrivals during any time $t$, then

$$P(Y_t = k) = \frac{(\alpha t)^k e^{-\alpha t}}{k!},$$

per the previous slide.

**Poisson pmf**

[Figure: bar chart of the Poisson pmf for alpha = 3 and t = 1, values 0 through 10.]

Poisson cdf α = 3 and t = 1. Applied Statistics: Probability 1-76 .

**Poisson Example**

Connections arrive at a switch at a rate of 11 per ms. The arrival distribution is Poisson.

(a) What is the probability that exactly 11 calls arrive in one ms?
(b) What is the probability that exactly 100 calls arrive in 10 ms?
(c) What is the probability that the number of calls arriving in 2 ms is greater than 7 and less than or equal to 10?

Let $X_t$ be the random variable giving the number of arrivals during $t$ ms. We know that $X_t$ has a Poisson distribution, which implies that

$$P[X_t = k] = \frac{(\alpha t)^k e^{-\alpha t}}{k!}.$$

The arrival rate is $11/\text{ms}$; thus,

$$P[X_t = k] = \frac{(11t)^k e^{-11t}}{k!} \quad \text{with } t \text{ in ms.}$$

(a) Probability of exactly 11 arrivals in one ms is

$$P[X_1 = 11] = \frac{11^{11} e^{-11}}{11!} = 0.119.$$

(b) Probability of 100 calls in 10 ms is

$$P[X_{10} = 100] = \frac{(11 \times 10)^{100} e^{-11 \times 10}}{100!} = 0.025.$$

(c)

$$P[7 < X_2 \le 10] = \sum_{k=8}^{10} \frac{(11 \times 2)^k e^{-11 \times 2}}{k!} = \frac{22^8}{e^{22}}\left(\frac{1}{8!} + \frac{22}{9!} + \frac{22^2}{10!}\right) = \frac{22^8}{e^{22}}\left(\frac{794}{10!}\right) = 0.003.$$
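The three answers of the switch example can be reproduced numerically. The sketch below is one possible way to do it: the helper name `poisson_pmf` is an assumption of this example, and the pmf is evaluated in logarithms (via `math.lgamma`) so that terms such as $110^{100}$ do not overflow.

```python
from math import exp, lgamma, log

def poisson_pmf(k: int, rate: float, t: float) -> float:
    """P(X_t = k) = (rate*t)^k e^(-rate*t) / k!, evaluated in log space."""
    mean = rate * t
    return exp(k * log(mean) - mean - lgamma(k + 1))

a = poisson_pmf(11, 11, 1)                            # exactly 11 arrivals in 1 ms
b = poisson_pmf(100, 11, 10)                          # exactly 100 arrivals in 10 ms
c = sum(poisson_pmf(k, 11, 2) for k in range(8, 11))  # 7 < X_2 <= 10

print(round(a, 3), round(b, 3), round(c, 3))
```

Note that `range(8, 11)` covers k = 8, 9, 10, matching the strict inequality on the left and the non-strict one on the right.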

**Summary for Discrete Random Variable, X**

To define a pmf when $X$ takes integer values write the following:

"The pmf is $p_X(k) = $ an expression often involving $k$, such as $\binom{n}{k} p^k (1-p)^{n-k}$."

This means the probability $X = k$. Be sure to specify all possible values of $k$, such as "for integers $k = 1, 2, \ldots, n$." Be sure to use the values of other parameters ($p$, $n$, $\alpha$, ...) that are correct for this particular problem.

To define a cdf write the following:

"The cdf is:

$$F_X(x) = \begin{cases} 0 & \text{for } x < \text{whatever min} \\ \sum_{k=1}^{\lfloor x \rfloor} \left(\text{the expression for } p_X(k)\right) & \text{for } 1 \le x \le n \\ 1 & \text{for } x > n." \end{cases}$$

This means the probability $X \le x$.

**Practice Quiz 3**

A college student phones his girlfriend once each night for three nights. The probability that he reaches her is $\frac{1}{3}$ anytime he calls. Suppose that the random variable $X$ equals the number of nights (out of three) that he is able to reach her.

1. What is the pmf for $X$? What is the name of $X$'s distribution?

2. What is the cdf for $X$?

3. Now suppose that this student will phone once each night until the first night that he is able to reach his girlfriend. Let $Y$ be a random variable that equals the number of nights that it takes him to reach her for the first time. (For example, $Y = 2$ if she doesn't answer the first night but does answer the second night.) What is the pmf for $Y$? What is the name of $Y$'s distribution?

**Suppose values of X are not integers.**

You may have to list each possible value and its probability. For example, suppose that $X = -\frac{1}{2}$ with probability $\frac{1}{6}$, $X = 1$ with probability $\frac{1}{3}$, and $X = \pi$ with probability $\frac{1}{2}$. You can define the pmf in a table:

| $x$ | $-\frac{1}{2}$ | $1$ | $\pi$ |
|---|---|---|---|
| $p_X(x)$ | $\frac{1}{6}$ | $\frac{1}{3}$ | $\frac{1}{2}$ |

This means the probability $X = x$.

**The cdf for this example…**

$$F_X(x) = \begin{cases} 0 & \text{for } x < -\frac{1}{2} \\ \frac{1}{6} & \text{for } -\frac{1}{2} \le x < 1 \\ \frac{1}{2} & \text{for } 1 \le x < \pi \\ 1 & \text{for } x \ge \pi. \end{cases}$$

This means the probability $X \le x$.

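**Mean and Variance of a Discrete Random Variable**

Definition: The mean (expected value) of a discrete random variable $X$ with pmf $p_X$ is $\mu = E(X) = \sum_x x\, p_X(x)$, and its variance is $\sigma^2 = Var(X) = E(X - \mu)^2 = \sum_x (x - \mu)^2 p_X(x)$.

Working formula: $Var(X) = E(X^2) - E^2(X)$.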

Mean and Variance of a Discrete Random Variable Figure 3-5 A probability distribution can be viewed as a loading with the mean equal to the balance point. Parts (a) and (b) illustrate equal means. but Part (a) illustrates a larger variance. Applied Statistics: Probability 1-83 .

**Mean and Variance of a Discrete Random Variable**

Figure 3-6 The probability distributions illustrated in Parts (a) and (b) differ even though they have equal means and equal variances.

Example 3-11 Applied Statistics: Probability 1-85 .

**Properties of E(X) and Var(X)**

If $X$ and $Y$ are discrete random variables and $a$ and $b$ are real numbers, then

* $E(aX) = aE(X)$
* $E(X + Y) = E(X) + E(Y)$ and $E(aX + bY) = aE(X) + bE(Y)$
* $Var(aX) = a^2 Var(X)$
* $Var(X + Y) = Var(X) + Var(Y)$, only when $X$ and $Y$ are independent, and $Var(aX + bY) = a^2 Var(X) + b^2 Var(Y)$, only when $X$ and $Y$ are independent.

Continuing the example from previous slide:

$$E(5X) = 5(12.5) = 62.5$$

$$Var(5X) = 25(1.85) = 46.25$$

**Expected Value vs Average**

Suppose we take 10 playing cards numbered Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10 and arrange them randomly, face-down, on a table. If we choose one at random (and then replace and reshuffle), it is equally likely that we get any value between 1 and 10. If the random variable $V$ gives the value obtained, then $V$ has a discrete uniform distribution on the integers 1 through 10. If we're asked "What is the expected value of $V$?", we find

$$1\left(\tfrac{1}{10}\right) + 2\left(\tfrac{1}{10}\right) + \cdots + 10\left(\tfrac{1}{10}\right) = 5.5.$$

If we were asked, "What is the average face value of these 10 cards?", we would compute

$$\frac{1+2+3+4+5+6+7+8+9+10}{10} = 5.5.$$

In fact, for any random variable with discrete uniform distribution (like our "die") the expected value is the same as the average of all possible values. This is NOT TRUE for other distributions.

If we actually perform the experiment by drawing 10 times, we are not likely to get each value exactly once. For example, we might draw ten values whose average is 4.9, NOT 5.5. The more times we repeat the draw, the closer we are likely to get to the expected average of 5.5. So you might think of the expected value of a random variable as the value expected from averaging many outcomes.

**Binomial Mean and Variance**

The mean of a binomial distribution with parameters $n$ and $p$ is

$$E(X) = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k} = \sum_{k=1}^{n} k \binom{n}{k} p^k (1-p)^{n-k}.$$

Let $m = k - 1$ to get

$$E(X) = \sum_{m=0}^{n-1} (m+1) \binom{n}{m+1} p^{m+1} (1-p)^{n-1-m} = \sum_{m=0}^{n-1} (m+1) \frac{n^{\underline{m+1}}}{(m+1)!} p^{m+1} (1-p)^{n-1-m} = np \sum_{m=0}^{n-1} \frac{(n-1)^{\underline{m}}}{m!} p^m (1-p)^{n-1-m} = np,$$

where $n^{\underline{k}} = n(n-1)\cdots(n-k+1)$. The final sum equals 1 because it adds up the binomial pmf with parameters $n-1$ and $p$.

Similar work will show that $Var(X) = np(1-p)$, but there are much easier ways to show this.

**Exercise**

The interactive computer system at Gnu Glue has 20 communication lines to the central computer system. The lines operate independently and the probability that any particular line is in use is 0.6. What is the probability that 10 or more lines are in use? What is the expected number of lines in use? What is the standard deviation of lines in use?

Binomial Revisited

Recall that the Binomial RV $X = X_1 + X_2 + \cdots + X_n$, where each $X_i$ has a Bernoulli distribution and is mutually independent with the others. Because $E(X_i) = p$, it follows trivially that $E(X) = np$. Because of independence we can also write that

$$Var(X) = \sum_{i=1}^{n} Var(X_i), \quad \text{where } Var(X_i) = E(X_i^2) - E^2(X_i) = p - p^2 = p(1-p).$$

It follows simply that $Var(X) = np(1-p)$.

Applied Statistics: Probability

1-90

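**Poisson Mean and Variance**

The mean of a Poisson distribution with parameter $\lambda > 0$ is

$$\sum_{k=0}^{\infty} k \frac{\lambda^k e^{-\lambda}}{k!} = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} = \lambda e^{-\lambda} e^{\lambda} = \lambda.$$

Similarly,

$$E(X^2) = \sum_{k=0}^{\infty} k^2 \frac{\lambda^k e^{-\lambda}}{k!} = \lambda e^{-\lambda} \sum_{k=1}^{\infty} k \frac{\lambda^{k-1}}{(k-1)!} = \lambda e^{-\lambda} \left[ \sum_{k=1}^{\infty} (k-1) \frac{\lambda^{k-1}}{(k-1)!} + \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} \right]$$

$$= \lambda^2 e^{-\lambda} \sum_{k=2}^{\infty} \frac{\lambda^{k-2}}{(k-2)!} + \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} = \lambda^2 + \lambda.$$

It follows that $Var(X) = E(X^2) - E^2(X) = \lambda^2 + \lambda - \lambda^2 = \lambda$.

NOTE: The mean and variance of the Poisson random variable $X_t$ are both $\lambda t$ (or $\alpha t$).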
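The result that a Poisson random variable has mean and variance both equal to $\lambda$ can be sanity-checked by truncating the infinite sums. The sketch below is a numerical check under the assumption that 200 terms are plenty for a small $\lambda$; the name `poisson_moments` is made up for this illustration.

```python
from math import exp

def poisson_moments(lam: float, terms: int = 200):
    """Approximate (mean, variance) of a Poisson(lam) RV by truncated sums."""
    pk = exp(-lam)             # P(X = 0); contributes nothing to either sum
    mean = second = 0.0
    for k in range(1, terms):
        pk *= lam / k          # P(X = k) obtained from P(X = k - 1)
        mean += k * pk         # accumulates E(X)
        second += k * k * pk   # accumulates E(X^2)
    return mean, second - mean**2   # working formula Var = E(X^2) - E(X)^2

mean, var = poisson_moments(3.0)
print(mean, var)  # both should be very close to 3
```

The recurrence `pk *= lam / k` avoids computing factorials explicitly.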

Applied Statistics: Probability

1-91

Exercise

Suppose it has been determined that the number of inquiries that arrive per second at the central computer system can be described by a Poisson random variable with an average rate of 10 messages per second. What is the probability that no inquiries arrive in a 1-second period? What is the probability that 15 or fewer inquiries arrive in a 1-second period? What are the mean and variance of the number of arrivals in 1 second?

Applied Statistics: Probability

1-92

**Geometric Mean and Variance**

$$E(X) = \sum_{k=1}^{\infty} k p (1-p)^{k-1} = p \sum_{k=1}^{\infty} k (1-p)^{k-1}.$$

Write $s(p) = \sum_{k=0}^{\infty} (1-p)^k = \frac{1}{p}$. Then

$$s'(p) = -\sum_{k=1}^{\infty} k (1-p)^{k-1} = -\frac{1}{p^2}.$$

Thus, $E(X) = p\left(\frac{1}{p^2}\right) = \frac{1}{p}$.

Homework: Use similar technique to show that $Var(X) = \frac{1-p}{p^2}$.
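The closed forms $E(X) = 1/p$ and $Var(X) = (1-p)/p^2$ can be compared with truncated sums of the geometric pmf. This is a numerical check only, not the requested derivation; the function name `geometric_moments` and the truncation at 2000 terms are assumptions of this sketch.

```python
def geometric_moments(p: float, terms: int = 2000):
    """Approximate (mean, variance) of a geometric RV with success probability p."""
    mean = second = 0.0
    q_pow = 1.0                     # holds (1-p)^(k-1)
    for k in range(1, terms + 1):
        pk = p * q_pow              # P(X = k) = p (1-p)^(k-1)
        mean += k * pk
        second += k * k * pk
        q_pow *= 1 - p
    return mean, second - mean**2

# Rolling a die until the first 6: p = 1/6.
mean, var = geometric_moments(1 / 6)
print(mean, var)  # compare with 1/p = 6 and (1-p)/p^2 = 30
```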

**Discrete Uniform Distribution**

Let $X$ be a random variable with a discrete uniform distribution on the integers $1, 2, \ldots, n$. Then

$$E(X) = \sum_{k=1}^{n} \frac{k}{n} = \frac{1}{n} \sum_{k=1}^{n} k = \frac{n(n+1)}{2n} = \frac{n+1}{2}.$$

Similarly,

$$E(X^2) = \sum_{k=1}^{n} \frac{k^2}{n} = \frac{1}{n} \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6n} = \frac{(n+1)(2n+1)}{6}.$$

Therefore,

$$Var(X) = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} = \frac{2(n+1)(2n+1) - 3(n+1)^2}{12} = \frac{(n+1)(4n+2-3n-3)}{12} = \frac{(n+1)(n-1)}{12} = \frac{n^2-1}{12}.$$

Example: Let $X$ be uniformly distributed on $1, 2, \ldots, 10$. Then $E(X) = \frac{11}{2}$ and $Var(X) = \frac{99}{12} = \frac{33}{4}$.

**Hypergeometric Distribution**

Suppose that a set of $n$ objects includes $k$ objects of type 1 (successes?) and $n - k$ objects of type 0 (failures perhaps?). A sample of size $m$ is selected from the $n$ objects "without replacement," where $m \le n$ (and $k \le n$). Let $X$ be the random variable that denotes the number of type 1 objects in the sample. Then $X$ is said to be a hypergeometric random variable and its pmf is given by:

$$p_X(i) = \begin{cases} \dfrac{\binom{k}{i}\binom{n-k}{m-i}}{\binom{n}{m}} & \text{for } i = \max\{0,\ m+k-n\} \text{ to } \min\{k,\ m\} \\ 0 & \text{otherwise.} \end{cases}$$

Values of $i$ (examples):

(1) $n = 20$, $k = 5$ (type 1), $m = 3$ (sample size) $\Rightarrow i = 0, 1, 2, 3$

(2) Same as (1) but $m = 7 \Rightarrow i = 0, 1, \ldots, 5$

(3) Same as (1) but $m = 17 \Rightarrow i = 2, 3, 4, 5$.

**Text problem 3-101**

A company employs 800 men under the age of 55. Suppose that 30% carry a marker on the male chromosome that indicates an increased risk for high blood pressure.

(a) If 10 men in the company are tested for the marker in this chromosome, what is the probability that exactly 1 man has the marker?

Answer: Notice that this is certainly sampling without replacement. (We don't put the first man back into the pool before we draw the second one.) Let $X$ be the number of men that have the marker in a sample of size 10. $X$ is hypergeometric. Thus,

$$p_X(1) = \frac{\binom{240}{1}\binom{560}{9}}{\binom{800}{10}} = 0.12.$$

**Text problem 3-101 Continued**

(b) If 10 men are tested for the marker, what is the probability that more than 1 has the marker?

Answer: Out of 10 the number with the marker can be $0, 1, 2, \ldots, 10$ (because a total of 240 have the marker). Thus,

$$p_X(i) = \begin{cases} \dfrac{\binom{240}{i}\binom{560}{10-i}}{\binom{800}{10}} & i = 0, 1, \ldots, 10 \\ 0 & \text{otherwise.} \end{cases}$$

The answer is either $\sum_{i=2}^{10} p_X(i)$ or $1 - \sum_{i=0}^{1} p_X(i) = 0.852$.
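Both parts of problem 3-101 can be checked with exact binomial coefficients. The sketch below uses `math.comb`; the function name `hypergeom_pmf` and its argument order are choices made for this illustration, following the slide's symbols $n$ (total), $k$ (type 1), $m$ (sample size).

```python
from math import comb

def hypergeom_pmf(i: int, n: int, k: int, m: int) -> float:
    """P(X = i) type-1 objects in a sample of m drawn without replacement
    from n objects, k of which are type 1."""
    if i < max(0, m + k - n) or i > min(k, m):
        return 0.0
    return comb(k, i) * comb(n - k, m - i) / comb(n, m)

n, k, m = 800, 240, 10   # 30% of 800 men carry the marker; 10 are tested

exactly_one = hypergeom_pmf(1, n, k, m)
more_than_one = 1 - hypergeom_pmf(0, n, k, m) - hypergeom_pmf(1, n, k, m)
print(round(exactly_one, 3), round(more_than_one, 3))
```

Using the complement for part (b) needs only two pmf evaluations instead of nine.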

**Mean and Variance of Hypergeometric**

If $X$ is a hypergeometric random variable with parameters $n$ (total objects), $k$ (number of type 1 objects), and $m$ (sample size), then

$$\mu = E(X) = mp \quad \text{and} \quad \sigma^2 = Var(X) = mp(1-p)\left(\frac{n-m}{n-1}\right),$$

where $p = \frac{k}{n}$ (the proportion of type 1 objects in the total).

Example: In the previous problem $E(X) = 10\left(\frac{240}{800}\right) = 3$ and

$$Var(X) = 10\left(\frac{240}{800}\right)\left(1 - \frac{240}{800}\right)\left(\frac{800-10}{799}\right) = 2.076.$$

**Continuous Random Variables and Moments of Random Variables**

G. A. Marin

For educational purposes only. No further distribution authorized.

**Continuous Cumulative "Distribution Function"**

The CDF $F_X$ of a random variable $X$ is defined to be the function $F_X(x) = P(X \le x)$, $-\infty < x < \infty$. The subscript is dropped if there is no ambiguity. A continuous random variable is characterized by a distribution function that is a continuous function of $x$ for all $x \in \mathbb{R}$. If the distribution function has a derivative at all except, possibly, a finite number of points, then the random variable is said to be absolutely continuous.

Example:

$$F_X(x) = \begin{cases} 0, & x < 0 \\ x, & 0 \le x < 1 \\ 1, & x \ge 1. \end{cases}$$

This means the probability $X \le x$.

**Properties of CDF**

* $0 \le F(x) \le 1$, $-\infty < x < \infty$
* $F(x)$ is a nondecreasing function of $x$.
* $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to +\infty} F(x) = 1$.

**Probability "Density Function"**

For a continuous (differentiable) random variable, $X$, $f(x) = \frac{dF(x)}{dx}$ is called the probability density function (pdf) of $X$. Thus, discrete random variables have a probability mass function and continuous random variables have a probability density function. The cumulative distribution function is used in both cases (or in "mixed" cases).

The pdf satisfies the following properties:

(1) $f(x) \ge 0$ for all $x$.

(2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

The value $f(x)$ means NOTHING, except through integration. We obtain the distribution function from the density function through integration:

$$P(X \le x) = F(x) = \int_{-\infty}^{x} f(t)\,dt, \quad -\infty < x < \infty.$$

Note: in most of our problems the pdf will be defined "piecewise."

**Example**

The probability density function $f$ is given as:

$$f(x) = \begin{cases} \dfrac{k}{x^2} & \text{for } x > 2 \\ 0 & \text{otherwise.} \end{cases}$$

What is the value of $k$? What is the corresponding cdf?

**Probabilities on Intervals and cdf**

Suppose $X$ is a continuous RV with pdf given by $f$ and cdf given by $F$. Then

$$P[X \le x] = F(x) = \int_{-\infty}^{x} f(t)\,dt.$$

If the distribution is continuous, $P[X = a] = P[X = b] = 0$. That is, the probability of any particular value is zero in this case. This implies that for any real number $x$, it is also true that $F(x) = P[X < x]$. Note that it is traditional to write $F(x) = P[X \le x]$. It also implies that

$$P[a \le X \le b] = F(b) - F(a)$$
$$P[a < X \le b] = F(b) - F(a)$$
$$P[a \le X < b] = F(b) - F(a)$$
$$P[a < X < b] = F(b) - F(a).$$

All 4 cases hold because, for a continuous random variable $X$, $P[X = a] = P[X = b] = 0$.

**Exponential Distribution**

A random variable has an exponential distribution if for some $\lambda > 0$ its distribution function is given by:

$$F(x) = \begin{cases} 1 - e^{-\lambda x}, & \text{if } 0 \le x < \infty \\ 0, & \text{otherwise.} \end{cases}$$

It follows that its pdf is given by:

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0 \\ 0, & \text{otherwise.} \end{cases}$$

Note that in most problems the parameter $\lambda$ represents a "rate," such as a rate of arrivals or a rate of failures.

Examples of use:
• Interarrival times at a communication switch
• Service times at a server
• Time to failure or repair of a component.

**Exponential pdf λ = 2**

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0 \\ 0, & \text{otherwise.} \end{cases}$$

Note the values of pdf mean "nothing" except through integration.

**Exponential cdf λ = 2**

$$F(x) = \begin{cases} 1 - e^{-\lambda x}, & \text{if } 0 \le x < \infty \\ 0, & \text{otherwise.} \end{cases}$$

**Class Problem**

Suppose that we stand at a mile marker on I-4 and watch cars pass. We notice that on the average 10 cars pass by us per minute and we're given that the time lapse between two consecutive cars has an exponential distribution. If we begin timing at the moment that one car passes by, what is the probability that we will have to wait more than 20 secs for the next car to pass?

Answer. Note 20 sec = 1/3 min. The average "rate" is $\lambda = 10$. Let $W$ be waiting time in minutes. We seek

$$P\left[W > \tfrac{1}{3}\right] = 1 - F\left(\tfrac{1}{3}\right),$$

where $F(t) = 1 - e^{-\lambda t}$ is the exponential cdf; thus, the answer is

$$P\left[W > \tfrac{1}{3}\right] = e^{-10/3} = 0.036.$$
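The waiting-time answer follows from one evaluation of the exponential survival function. A minimal sketch, with the helper name `expon_survival` chosen here for illustration:

```python
from math import exp

def expon_survival(t: float, rate: float) -> float:
    """P(W > t) = 1 - F(t) = exp(-rate * t) for t >= 0, and 1 for t < 0."""
    return 1.0 if t < 0 else exp(-rate * t)

# 10 cars per minute on average; 20 seconds is 1/3 of a minute.
print(round(expon_survival(1 / 3, 10), 3))
```

The units of `t` and `rate` must agree (minutes and cars per minute here).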

**Simple Exercises**

Use $F$ in the previous problem to write:

* Probability $W < 6$
* Probability $W > 6$
* Probability $W < 0$
* Probability $W < -1$
* Probability $2 < W < 5$
* Probability $W = 1$.

**Memoryless Property**

If $X$ has an exponential distribution, $x > 0$, and $t > 0$, then we know that

$$P(X \le x) = \int_0^{x} \lambda e^{-\lambda y}\,dy \quad \text{and} \quad P(X \le t + x) = \int_0^{t+x} \lambda e^{-\lambda y}\,dy.$$

Thus,

$$P(X \le t + x \mid X > t) = \frac{P\left[(X \le t + x) \cap (X > t)\right]}{P(X > t)} = \frac{\int_t^{t+x} \lambda e^{-\lambda y}\,dy}{\int_t^{\infty} \lambda e^{-\lambda y}\,dy} = \frac{e^{-\lambda t}\left(1 - e^{-\lambda x}\right)}{e^{-\lambda t}} = 1 - e^{-\lambda x} = P(X \le x).$$

This is why we don't replace lightbulbs until they fail. (Would you like it if waiting time at your doctor's office was exponentially distributed?)
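The memoryless property can also be observed numerically: the conditional probability does not change with $t$. The sketch below is an illustration only, with $\lambda = 2$ and the function names picked arbitrarily.

```python
from math import exp

def expon_cdf(x: float, lam: float) -> float:
    """F(x) = 1 - exp(-lam * x) for x >= 0, and 0 otherwise."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0

def conditional_cdf(x: float, t: float, lam: float) -> float:
    """P(X <= t + x | X > t), computed from the cdf alone."""
    return (expon_cdf(t + x, lam) - expon_cdf(t, lam)) / (1.0 - expon_cdf(t, lam))

lam = 2.0
for t in (0.1, 1.0, 3.0):
    # The two printed probabilities agree for every elapsed time t.
    print(t, conditional_cdf(0.5, t, lam), expon_cdf(0.5, lam))
```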

**Exponential/Poisson Relationship**

Show that the time between adjacent arrivals of a Poisson Process has an exponential distribution.

[Figure: two time lines of length t, one with 4 arrivals during time t and one with 0 arrivals during time t.]

Hint: If $N_t$ denotes the number of arrivals during time $t$ and $N_t$ has a Poisson distribution, then the probability that waiting time to the next event is greater than $t$ is $P[W > t]$, where $W$ is waiting time, and

$$P[W > t] = P[N_t = 0].$$

**Properties of Gamma Function**

On the next slide we introduce the gamma distributions, which form a family of distributions that includes the exponential distribution. That definition incorporates something called the gamma function, which is defined as

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx, \quad \alpha > 0.$$

Using integration by parts one can show $\Gamma(\alpha) = (\alpha - 1)\Gamma(\alpha - 1)$ for $\alpha > 1$. Because $\Gamma(1) = 1$, it follows that $\Gamma(n) = (n-1)\Gamma(n-1) = \cdots = (n-1)!$ when $n$ is a positive integer. Note also that $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$. Also, for $\alpha > 0$ and $\lambda > 0$, it is well known that

$$\int_0^{\infty} x^{\alpha-1} e^{-\lambda x}\,dx = \frac{\Gamma(\alpha)}{\lambda^{\alpha}}.$$

We shall refer to the last equation as the "gamma integration formula."

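**Example**

Evaluate $\int_0^{\infty} x^2 e^{-4x}\,dx$.

Answer: Recall that

$$\int_0^{\infty} x^{\alpha-1} e^{-\lambda x}\,dx = \frac{\Gamma(\alpha)}{\lambda^{\alpha}}, \quad \text{for } \lambda > 0 \text{ and } \alpha > 0.$$

In this case $\alpha - 1 = 2 \Rightarrow \alpha = 3$ while $\lambda = 4$. It follows that

$$\int_0^{\infty} x^2 e^{-4x}\,dx = \frac{\Gamma(3)}{4^3} = \frac{2!}{64} = \frac{1}{32}.$$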
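The gamma integration formula can be compared against a direct numerical integration. In the sketch below, `math.gamma` supplies $\Gamma(\alpha)$, and a simple midpoint rule on $[0, 20]$ stands in for the improper integral; the cutoff at 20 and the step count are assumptions that work for this integrand because the tail is negligible.

```python
from math import exp, gamma

def gamma_integral(alpha: float, lam: float) -> float:
    """Closed form of the integral of x^(alpha-1) e^(-lam x) over [0, inf)."""
    return gamma(alpha) / lam**alpha

def midpoint_integral(f, a: float, b: float, steps: int = 200_000) -> float:
    """Plain midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

closed = gamma_integral(3, 4)   # alpha = 3, lambda = 4
numeric = midpoint_integral(lambda x: x * x * exp(-4 * x), 0.0, 20.0)
print(closed, numeric)          # both should be close to 1/32
```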

**Gamma Distribution**

A random variable with pdf given by

$$f(t) = \frac{\lambda^{\alpha} t^{\alpha-1} e^{-\lambda t}}{\Gamma(\alpha)}, \quad \alpha > 0,\ \lambda > 0,\ t > 0$$

is said to have a Gamma distribution with parameters $\lambda$ and $\alpha$ and we write $X \sim \text{GAM}(\lambda, \alpha)$. The parameter $\alpha$ is called the shape parameter and the parameter $\lambda$ is called the scale parameter. For $\alpha = 1$ the gamma becomes identical to the exponential distribution.

NOTE: If a sequence of random variables $X_1, X_2, \ldots, X_k$ are mutually independent and identically distributed as $\text{GAM}(\lambda, \alpha)$, then their sum has a $\text{GAM}(\lambda, k\alpha)$ distribution.

**Gamma Density: g(x, λ, α)**

[Figure: gamma density plots; λ is the scale parameter and α is the shape parameter.]

**Practice Quiz 4**

The pdf of a random variable, $X$, is given by:

$$f_X(x) = \begin{cases} 0 & \text{for } x < 0 \\ \dfrac{x^3}{64} & \text{for } 0 \le x \le 4 \\ 0 & \text{for } x > 4. \end{cases}$$

Answer each of the following and EXPLAIN (show your work).

(a) What is the cdf of $X$? (Find it explicitly.)

(b) What is $P(X > 2)$?

(c) What is $P(X > 2 \mid X > 1)$?

BONUS: Use the gamma integration formula to evaluate the following integral:

$$\int_0^{\infty} x^3 e^{-2x}\,dx$$

Mean and Variance of a Continuous Random Variable Definition Applied Statistics: Probability 1-117 .

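**Expected Value (continuous example)**

Let

$$f_X(x) = \begin{cases} 0 & \text{for } x < 0 \\ \dfrac{x^3}{64} & \text{for } 0 \le x \le 4 \\ 0 & \text{for } x > 4. \end{cases}$$

By definition

$$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^4 x \cdot \frac{x^3}{64}\,dx = \left.\frac{x^5}{5 \times 64}\right|_0^4 = \frac{1024}{5 \times 64} = \frac{16}{5}.$$

$$E(X^2) = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx = \int_0^4 x^2 \cdot \frac{x^3}{64}\,dx = \left.\frac{x^6}{6 \times 64}\right|_0^4 = \frac{4096}{6 \times 64} = \frac{32}{3}.$$

$$Var(X) = E(X^2) - \left[E(X)\right]^2 = \frac{32}{3} - \left(\frac{16}{5}\right)^2 = 0.427.$$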
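For this density the moments have a simple closed form, since $\int_0^4 x^r \cdot \frac{x^3}{64}\,dx = \frac{4^{r+4}}{64(r+4)}$. The sketch below evaluates that expression with exact fractions; the function name `moment` is just a label for this illustration.

```python
from fractions import Fraction

def moment(r: int) -> Fraction:
    """E(X^r) for f(x) = x^3/64 on [0, 4]: integral of x^(r+3)/64 from 0 to 4."""
    return Fraction(4 ** (r + 4), 64 * (r + 4))

mean = moment(1)                 # E(X)
second = moment(2)               # E(X^2)
variance = second - mean**2      # working formula E(X^2) - [E(X)]^2

print(moment(0))                 # total probability, should be 1
print(mean, second, variance, float(variance))
```

The zeroth moment doubles as a check that the pdf integrates to 1.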

**Existence of E(X)**

Continuing a previous example: Let $X$ be a random variable with pdf given by:

$$f(x) = \begin{cases} \dfrac{2}{x^2} & \text{for } x > 2 \\ 0 & \text{otherwise.} \end{cases}$$

Notice that

$$E(X) = \int_2^{\infty} x \cdot \frac{2}{x^2}\,dx = \int_2^{\infty} \frac{2}{x}\,dx = \left. 2 \ln x \right|_2^{\infty} = \infty.$$

Thus, well-defined random variables may not have finite means. Similarly, a random variable may have a finite mean and not have a finite 2nd moment or a finite $k$th moment (to be defined).

Mean and Variance of a Continuous Random Variable Expected Value of a Function of a Continuous Random Variable Applied Statistics: Probability 1-120 .

**Other expected values**

Let $X$ be a random variable with pdf given by $f$. Let $\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$. ($\mu$ is called the mean of $X$.) Then

(a) $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$.

(b) $E(X^2 + 5X - 2) = \int_{-\infty}^{\infty} (x^2 + 5x - 2) f(x)\,dx$.

(c) $E(\sin X) = \int_{-\infty}^{\infty} (\sin x) f(x)\,dx$.

(d) $\sigma^2 = Var(X) = E(X - \mu)^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$. (Variance of $X$.)

**Try these with/without Mathcad**

Let $X$ be a random variable with probability density

$$f(x) = \begin{cases} 0 & \text{for } x < 0 \\ \cos(x) & \text{for } 0 \le x \le \frac{\pi}{2} \\ 0 & \text{for } x > \frac{\pi}{2}. \end{cases}$$

(a) Find the cdf of $X$.

(b) Find $E(X)$.

(c) Find $E(\sin X)$.

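**Exponential Mean and Variance**

Let $X$ be an exponential random variable with parameter $\lambda$. Then $E(X) = \frac{1}{\lambda}$ and $Var(X) = \frac{1}{\lambda^2}$.

Proof. Recall the gamma integration formula:

$$\int_0^{\infty} x^{\alpha-1} e^{-\lambda x}\,dx = \frac{\Gamma(\alpha)}{\lambda^{\alpha}}.$$

Then

$$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^{\infty} x \lambda e^{-\lambda x}\,dx = \lambda \frac{\Gamma(2)}{\lambda^2} = \frac{1}{\lambda}.$$

Similarly, $Var(X) = E(X^2) - \mu^2$ and

$$E(X^2) = \int_0^{\infty} x^2 \lambda e^{-\lambda x}\,dx = \lambda \frac{\Gamma(3)}{\lambda^3} = \frac{2}{\lambda^2}.$$

Thus, $Var(X) = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$.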

**Exponential Example**

The lifetime of a particular brand of lightbulb is exponentially distributed. The "mean time to failure" (MTTF) is 2000 hours. If the lightbulb is installed at time $t = 0$, what is the probability that it fails at time $t = 3000$ hours? What is the probability that it fails in fewer than 3000 hours?

**Gamma Mean and Variance**

Recall that the pdf of a Gamma random variable is given by:

$$f(t) = \frac{\lambda^{\alpha} t^{\alpha-1} e^{-\lambda t}}{\Gamma(\alpha)}, \quad \alpha > 0,\ t > 0.$$

(Think of $\lambda$ as the scale and $\alpha$ as the shape.)

The expected value (mean) of a Gamma random variable is $\frac{\alpha}{\lambda}$. The variance is $\frac{\alpha}{\lambda^2}$.

Note that when $\alpha = 1$, the distribution is exponential.

**Class Exercise**

A gamma distribution has a mean of 1.5 and a variance of 0.75. Sketch the pdf.

First, solve for $\alpha$ and $\lambda$. We have $\frac{\alpha}{\lambda} = 1.5$ and $\frac{\alpha}{\lambda^2} = 0.75$. From the first equation we get $\alpha = 1.5\lambda$. Substitute this in the second equation to get $\frac{1.5\lambda}{\lambda^2} = 0.75$. This gives $\lambda = 2$, which implies $\alpha = 3$. This is all we need to obtain pdf values.

[Figure: plot of the gamma pdf Gam(x) with Scale=2 and Shape=3 for x from 0 to 3.]

Important: Make sure that you can find the scale and shape parameters from a given mean and variance.
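Solving for the parameters from a given mean and variance can be written as two lines of algebra: dividing the mean by the variance gives $\lambda$, and then $\alpha = \text{mean} \times \lambda$. The sketch below applies this to the exercise and evaluates a few pdf values that could be used for the sketch; the function names are illustrative only.

```python
from math import exp, gamma

def gamma_params(mean: float, variance: float):
    """Return (lam, alpha) such that mean = alpha/lam and variance = alpha/lam**2."""
    lam = mean / variance     # (alpha/lam) / (alpha/lam^2) = lam
    alpha = mean * lam
    return lam, alpha

def gamma_pdf(t: float, lam: float, alpha: float) -> float:
    """f(t) = lam^alpha t^(alpha-1) e^(-lam t) / Gamma(alpha) for t > 0, else 0."""
    if t <= 0:
        return 0.0
    return lam**alpha * t ** (alpha - 1) * exp(-lam * t) / gamma(alpha)

lam, alpha = gamma_params(1.5, 0.75)
print(lam, alpha)
for t in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(t, round(gamma_pdf(t, lam, alpha), 4))
```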

**Continuous Uniform Distribution**

A continuous random variable $X$ is said to have a uniform distribution over the interval $(a, b)$ if its density is given by:

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b \\ 0, & \text{otherwise.} \end{cases}$$

The distribution function is:

$$F(x) = \begin{cases} 0, & x < a \\ \dfrac{x-a}{b-a}, & a \le x < b \\ 1, & x \ge b. \end{cases}$$

Continuous Uniform Density on (3.5) Applied Statistics: Probability 1-128 .

Continuous Uniform cdf on (3.5) Applied Statistics: Probability 1-129 .

**Continuous Uniform Mean and Variance**

Let $X$ be a random variable having the continuous uniform distribution over the interval $(a, b)$. Its density is

$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{for } a < x < b \\ 0 & \text{otherwise.} \end{cases}$$

It follows that

$$E(X) = \int_a^b x \frac{1}{b-a}\,dx = \frac{1}{b-a} \cdot \frac{b^2 - a^2}{2} = \frac{b+a}{2}.$$

Similarly, $E(X^2) = \frac{1}{b-a} \cdot \frac{b^3 - a^3}{3}$; thus,

$$Var(X) = \frac{1}{b-a} \cdot \frac{b^3 - a^3}{3} - \left(\frac{b+a}{2}\right)^2 = \frac{(b-a)^2}{12}.$$

Recall that if $X$ has a "discrete" uniform distribution on $1, 2, \ldots, n$, then

$$E(X) = \frac{n+1}{2} \quad \text{and} \quad Var(X) = \frac{n^2 - 1}{12}.$$

**Example**

Suppose that the time that it takes to drive from the Orlando airport to FIT is uniformly distributed between one hour and one and a quarter hours.

(a) What is the mean driving time?

(b) What is the standard deviation of the driving time?

(c) What is the probability that it will take less than one hour and 5 minutes to make the trip?

(d) 80% of the time the trip will take less than ______ minutes?

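**Means and Variances (so far)**

| Distribution | E(X) | Var(X) |
|---|---|---|
| Bernoulli | $p$ | $p(1-p)$ |
| Binomial | $np$ | $np(1-p)$ |
| Geometric | $\frac{1}{p}$ | $\frac{1-p}{p^2}$ |
| Discrete Uniform | $\frac{n+1}{2}$ | $\frac{n^2-1}{12}$ |
| Poisson | $\alpha$ (or $\lambda$) | $\alpha$ (or $\lambda$) |
| Exponential | $\frac{1}{\lambda}$ | $\frac{1}{\lambda^2}$ |
| Gamma | $\frac{\alpha}{\lambda}$ | $\frac{\alpha}{\lambda^2}$ |
| Continuous Uniform | $\frac{a+b}{2}$ | $\frac{(b-a)^2}{12}$ |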

**Practice Quiz 5**

1. Suppose that the driving time between Orlando and Melbourne is uniformly distributed between one hour and one hour plus 15 minutes. (a) What is the variance of the driving time? (b) Eighty percent of all drivers make the trip in fewer than $x$ minutes. What is $x$?

2. The random variable $X$ has an exponential distribution. What is the pdf of $X$? What is the mean of $X$? What is the variance of $X$? (Just write down what we know the mean and variance are. Do not derive them from the definition.)

3. The random variable $X$ has a Binomial distribution and the total number of trials is 30. (a) If the probability of success on a single trial is 0.1, write the pmf for $X$. (b) If $E(X) = 2$, then what is the probability of success on a single trial?

Normal Distribution Definition Applied Statistics: Probability 1-134 .

Normal Distribution Figure 4-10 Normal probability density functions for selected values of the parameters μ and σ2. Applied Statistics: Probability 1-135 .

Normal Distribution Some useful results concerning the normal distribution Applied Statistics: Probability 1-136 .

Normal Distribution Definition : Standard Normal Applied Statistics: Probability 1-137 .

Normal Distribution Example 4-11 Figure 4-13 Standard normal probability density function. Applied Statistics: Probability 1-138 .

Standard Normal Table

Entries are cumulative probabilities P(Z ≤ z); the row gives z to one decimal place and the column gives the second decimal place.

  z    0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
 0.0  0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
 0.1  0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
 0.2  0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
 0.3  0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
 0.4  0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
 0.5  0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
 0.6  0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
 0.7  0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
 0.8  0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
 0.9  0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
 1.0  0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
 1.1  0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
 1.2  0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
 1.3  0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
 1.4  0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
 1.5  0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
 1.6  0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
 1.7  0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
 1.8  0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
 1.9  0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
 2.0  0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
 2.1  0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
 2.2  0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
 2.3  0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
 2.4  0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
 2.5  0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
 2.6  0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
 2.7  0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
 2.8  0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
 2.9  0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
 3.0  0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
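As a cross-check on the table, the standard normal cdf can be computed directly from the error function. This is a minimal sketch in Python (not the Mathcad used in the course); the function name `phi` is an illustrative choice.

```python
import math

def phi(z: float) -> float:
    """Standard normal cdf P(Z <= z), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Spot-check a few entries: row gives z to one decimal, column the second decimal.
for z in (0.00, 1.00, 1.96, 2.58):
    print(f"P(Z <= {z:.2f}) = {phi(z):.4f}")
```

For negative z the symmetry of the normal density gives P(Z ≤ −z) = 1 − P(Z ≤ z), so the same function covers values the table does not list.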

Normal Distribution: Standardizing

Normal Distribution: To Calculate Probability

Normal Distribution: Example 4-13

Normal Distribution: Example 4-14

Note: In MathCad we simply compute: Mean and standard deviation

Normal Distribution: Example 4-14 (continued)

Normal Distribution: Example 4-14 (continued)

Figure 4-16 Determining the value of x to meet a specified probability.

Recall Gamma Function

The gamma function (studied previously) is defined as:
$$\Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x}\,dx, \quad \alpha > 0.$$
Using integration by parts one can show $\Gamma(\alpha) = (\alpha-1)\,\Gamma(\alpha-1)$ for $\alpha > 1$. Because $\Gamma(1) = 1$, it follows that, for a positive integer $n$, $\Gamma(n) = (n-1)\,\Gamma(n-1) = \dots = (n-1)!$ Note also that $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$. Also, it is well known that
$$\int_0^\infty x^{\alpha-1} e^{-\lambda x}\,dx = \frac{\Gamma(\alpha)}{\lambda^{\alpha}}.$$
We shall refer to the last equation as the "gamma integration formula."
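The gamma function facts above can be spot-checked numerically. The sketch below is Python rather than Mathcad; `gamma_integral` is a hypothetical helper that applies a plain midpoint rule on a truncated interval, and the values α = 3, λ = 2 are arbitrary test inputs.

```python
import math

# Gamma(n) = (n - 1)! for positive integers n.
for n in range(1, 8):
    assert math.isclose(math.gamma(n), math.factorial(n - 1))

# Gamma(1/2) = sqrt(pi).
assert math.isclose(math.gamma(0.5), math.sqrt(math.pi))

def gamma_integral(alpha, lam, upper=60.0, steps=200_000):
    """Midpoint-rule estimate of the integral of x^(alpha-1) e^(-lam x) over [0, upper].

    The infinite upper limit is truncated at `upper` (an assumption that is
    harmless here because the integrand decays exponentially).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** (alpha - 1) * math.exp(-lam * x)
    return total * h

alpha, lam = 3.0, 2.0
estimate = gamma_integral(alpha, lam)
exact = math.gamma(alpha) / lam ** alpha  # gamma integration formula
print(estimate, exact)
```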

Joint Probability Distributions

When two or more random variables naturally occur together or take values that seem to be related, it is common to consider their joint probability distributions.

Examples:
(1) The location of a ship is given by its latitude and longitude. Suppose you are searching for a ship from its last known position. It is natural to consider the joint distribution of $(X, Y)$, the ship's probable position.
(2) Similarly an aircraft's position might be predicted using a joint distribution of $(X, Y, Z)$, where $(X, Y)$ is as above and $Z$ is altitude.

Suppose, for example, that $X$ and $Y$ are two random variables whose outcomes we wish to consider jointly. In a manner similar to single random variables we consider two cases.
(1) $X$ and $Y$ are discrete. Here we use a joint pmf and joint cdf (to be defined).
(2) $X$ and $Y$ are continuous. Here we use a joint pdf and joint cdf (to be defined).

Discrete Random Variables

For discrete random variables $X_1, X_2, \dots, X_n$ their joint probability mass function is given by:
$$p_{\mathbf{X}}(\mathbf{x}) = p_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n).$$
If only two random variables are involved, we usually denote them as $X$ and $Y$, instead of $X_1$ and $X_2$, and we write:
$$p_{X,Y}(x, y) = P(X = x, Y = y).$$
Note that the author writes the latter as $f_{X,Y}(x, y) = P(X = x, Y = y)$.

Simple example The joint distribution of ( X . Y ) is given as follows: 1 p X .Y ( 2.Y ( 2.3) = .Y ( 3. 2 ) = 6 6 1 1 1 p X .Y ( 3. p X . and p X . 9 9 9 Do the values of X and Y seem to "influence" each other? Applied Statistics: Probability 1-149 .Y ( 3.1) = . 2 ) = .Y (1. 3 1 1 p X .1) = .1) = and p X .

Joint PMF and Independence

For discrete random variables $X_1, X_2, \dots, X_n$ their joint probability mass function is given by:
$$p_{\mathbf{X}}(\mathbf{x}) = p_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n).$$
The discrete random variables $X_1, X_2, \dots, X_n$ are said to be mutually independent if their joint pmf can be written as:
$$p_{\mathbf{X}}(\mathbf{x}) = p_{X_1}(x_1)\, p_{X_2}(x_2) \cdots p_{X_n}(x_n).$$

Problem: Joint pmf for $X_1$ and $X_2$

           X2 = 1   X2 = 2   X2 = 3
 X1 = 1     1/12     1/6      1/12
 X1 = 2     1/6      1/4      1/12
 X1 = 3     1/12     1/12     0

Find:
(a) $P[X_1 X_2 \text{ is even}]$
(b) $P[X_1 \text{ is odd}]$
(c) $P[X_1 \le 1.5]$
(d) $P[X_2 \text{ is odd} \mid X_1 \text{ is odd}]$

Are $X_1$ and $X_2$ independent?
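One way to work parts (a) through (d) is to sum the joint pmf over the points that make up each event. The sketch below is Python with exact fractions; it assumes the table is read with rows indexed by X1 and columns by X2 (the reading that reproduces the marginal pmf for X1 quoted on the next slide), and the helper name `prob` is illustrative.

```python
from fractions import Fraction as F

# Joint pmf keyed by (x1, x2); rows X1 = 1, 2, 3 and columns X2 = 1, 2, 3 (assumed layout).
pmf = {
    (1, 1): F(1, 12), (1, 2): F(1, 6),  (1, 3): F(1, 12),
    (2, 1): F(1, 6),  (2, 2): F(1, 4),  (2, 3): F(1, 12),
    (3, 1): F(1, 12), (3, 2): F(1, 12), (3, 3): F(0),
}
assert sum(pmf.values()) == 1  # a pmf must sum to 1

def prob(event):
    """Probability of an event given as a predicate on (x1, x2)."""
    return sum(p for (x1, x2), p in pmf.items() if event(x1, x2))

p_a = prob(lambda x1, x2: (x1 * x2) % 2 == 0)   # (a) product is even
p_b = prob(lambda x1, x2: x1 % 2 == 1)          # (b) X1 is odd
p_c = prob(lambda x1, x2: x1 <= 1.5)            # (c) X1 <= 1.5
# (d) conditional probability = P[both odd] / P[X1 odd]
p_d = prob(lambda x1, x2: x1 % 2 == 1 and x2 % 2 == 1) / p_b

print(p_a, p_b, p_c, p_d)
```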

Marginal pmf

If $X_1, X_2, \dots, X_n$ are discrete random variables with joint pmf $p_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n)$, then
$$p_{X_i}(x_i) = \sum_{[X_i = x_i]} p_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n),$$
where the sum is over the points in the range of $(X_1, X_2, \dots, X_n)$ where $X_i = x_i$. The function $p_{X_i}(x_i)$ is called the marginal probability mass function for $X_i$.

Example: Using the previous slide the marginal pmf for $X_1$ is given by:
$$p_{X_1}(1) = \tfrac{1}{3}, \qquad p_{X_1}(2) = \tfrac{1}{2}, \qquad p_{X_1}(3) = \tfrac{1}{6}.$$
The notation above is difficult; however, notice that, in the example, to get $p_{X_1}(1) = \tfrac{1}{3}$ you just sum over all the $(x_1, x_2)$ pair values that have 1 as the value for $X_1$.
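The "sum over the other variable" recipe can be sketched in a few lines of Python. The joint pmf below is the one from the preceding problem slide (rows taken as X1, columns as X2, which is an assumed reading of the table); `marginal` is a hypothetical helper name. The last line applies the product test for independence from the earlier slide.

```python
from collections import defaultdict
from fractions import Fraction as F

pmf = {
    (1, 1): F(1, 12), (1, 2): F(1, 6),  (1, 3): F(1, 12),
    (2, 1): F(1, 6),  (2, 2): F(1, 4),  (2, 3): F(1, 12),
    (3, 1): F(1, 12), (3, 2): F(1, 12), (3, 3): F(0),
}

def marginal(joint, index):
    """Marginal pmf of coordinate `index`: add up the joint pmf over all other coordinates."""
    out = defaultdict(F)  # Fraction() is 0, so missing keys start at zero
    for point, p in joint.items():
        out[point[index]] += p
    return dict(out)

p_x1 = marginal(pmf, 0)
p_x2 = marginal(pmf, 1)

# Independence requires joint = product of marginals at EVERY point.
independent = all(pmf[(a, b)] == p_x1[a] * p_x2[b] for (a, b) in pmf)
print(p_x1, p_x2, independent)
```

A single failing point is enough to rule out independence; here, for instance, the joint pmf at (3, 3) is 0 while both marginals there are positive.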

Joint cdf for Discrete RVs

If $X_1, X_2, \dots, X_n$ are discrete random variables, then their joint cumulative distribution function is given by:
$$F_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \dots, X_n \le x_n).$$
If there are only two random variables involved we usually write:
$$F_{X,Y}(x, y) = P(X \le x, Y \le y).$$
Returning to the "Simple Example" we find, for example, $F_{X,Y}(2, 3) = \tfrac{2}{3}$.

Skip Section 5-1.3

We will not cover the material on conditional probability distributions that is in this section of the text.

Double Integral:
$$\iint_R f(x, y)\,dA = \int_a^b \int_{m(x)}^{n(x)} f(x, y)\,dy\,dx \quad \text{(iterated integral)}$$
(Figure: the surface $z = f(x, y)$ above the region $R$ in the $xy$-plane, where $R$ is bounded by $y = m(x)$, $y = n(x)$, $x = a$, and $x = b$.)

Example: Evaluate the double integral $\iint_R 2xy\,dA$ where $R$ is the region bounded by the curve $y = x^2$ and the lines $y = 0$ and $x = 2$.

Answer:
$$\int_0^2 \int_0^{x^2} 2xy\,dy\,dx = \int_0^2 x \int_0^{x^2} 2y\,dy\,dx = \int_0^2 x \left( y^2 \Big|_0^{x^2} \right) dx = \int_0^2 x \cdot x^4\,dx = \frac{x^6}{6}\Big|_0^2 = \frac{32}{3}.$$
Computes the volume under the surface $z = 2xy$ and above the region $R$.

(Figure: region $R$ under $y = x^2$ for $0 \le x \le 2$.)
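The iterated integral can also be approximated numerically, which is a useful sanity check on the hand calculation. The following is a minimal Python sketch using a nested midpoint rule; `double_integral` and its parameters are illustrative names, not part of the course materials.

```python
def double_integral(f, a, b, lower, upper, n=400):
    """Midpoint-rule estimate of the iterated integral
    int_a^b int_{lower(x)}^{upper(x)} f(x, y) dy dx."""
    hx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx            # midpoint of the i-th x-strip
        y0, y1 = lower(x), upper(x)       # inner limits depend on x
        hy = (y1 - y0) / n
        inner = sum(f(x, y0 + (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total

# Region R: 0 <= x <= 2, 0 <= y <= x^2; integrand 2xy.
volume = double_integral(lambda x, y: 2 * x * y, 0.0, 2.0,
                         lambda x: 0.0, lambda x: x * x)
print(volume)  # should be close to 32/3
```

The same helper handles any region described by two curves in x, since the inner limits are passed in as functions.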

Another Evaluation Approach

$$\int_0^2 \int_0^{x^2} 2xy\,dy\,dx = \int_0^2 x \left[ \int_0^{x^2} 2y\,dy \right] dx$$
The inside integral is simply
$$\int_0^{x^2} 2y\,dy = y^2 \Big|_0^{x^2} = x^4.$$
Substitute this inside the brackets above to get
$$\int_0^2 x \cdot x^4\,dx = \frac{x^6}{6}\Big|_0^2 = \frac{32}{3}.$$

Evaluate the double integral $\iint_R 4x^3 y\,dx\,dy$ where $R$ is the region bounded by $y = x^2$ and $y = 2x$.

(Figure: plot of $x^2$ and $2 \cdot x$ for $0 \le x \le 3$.)

( 4. y = 0. ∫∫ R 10 x 3 y 4 dxdy for R bounded by y = x3 . ∫∫ R y 2 dxdy for R bounded by y = x 2 . x = 1. 6.Extra Practice Evaluate the double integrals: 1. x = 4. y = ∫∫ R 1 1 . y = 0. x = 1 + y 2 . y = −1. x = 1. x = 2 − y 2 . x x Applied Statistics: Probability 1-159 . ∫∫ 3 ( y 2 + 4 ) dxdy for R bounded by x = 1.1) . 2.1) . 0 ) . (1. 3. 12 x 2 ydxdy for R bounded by x = −1 − y 2 . y = 2 x. 4. 3 x 2 y 2 dxdy for R bounded by y = x.y=− . 5. ∫∫ R ∫∫ R R xydxdy for R the triangle with vertices ( 0.

Problem 3

The two curves are $y_1 = x^2$ and $y_2 = \sqrt{2 - x}$ (the upper branch of $x = 2 - y^2$). They intersect at $(-1.353, 1.831)$ and $(1, 1)$.

Iterated integral:
$$\int_{-1.353}^{1} \int_{x^2}^{\sqrt{2 - x}} y^2\,dy\,dx$$
(Figure: plot of $y_1(x)$ and $y_2(x)$ for $-2 \le x \le 3$.)

Joint Distribution of Continuous RVs

The cumulative joint distribution of continuous random variables $X$ and $Y$ is defined by
$$F_{X,Y}(x, y) = P(X \le x, Y \le y), \quad -\infty < x < \infty, \; -\infty < y < \infty.$$
Properties:
* $0 \le F(x, y) \le 1$
* $F(x, y)$ is monotone increasing (nondecreasing) in BOTH variables
* $P(a < X \le b \text{ and } c < Y \le d) = F(b, d) - F(a, d) - F(b, c) + F(a, c)$

Note: $\lim_{y \to \infty} F_{X,Y}(x, y) = F_X(x)$ is called the marginal cumulative distribution of $X$. Think of $F_X(x) = P[X \le x] = P[X \le x \cap Y < \infty] \approx F_{X,Y}(x, \infty)$.

Also: $\lim_{x \to \infty} F_{X,Y}(x, y) = F_Y(y)$ is called the marginal cumulative distribution of $Y$.

The marginal cdf is defined in the same manner for discrete random variables.

Joint Probability Density

If $X$ and $Y$ are both continuous random variables, there is often a function $f$ such that
$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du.$$
The function $f$ is known as the joint probability density function. Note that
$$P(a < X \le b,\; c < Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx.$$
Also, by definition of marginal distribution we know that
$$F_X(x) = \int_{-\infty}^{x} \int_{-\infty}^{\infty} f(u, v)\,dv\,du.$$
The marginal pdf of $X$ is
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy.$$
Similarly, the marginal pdf of $Y$ is
$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx.$$

Example (using Mathcad): The random variables $X$ and $Y$ have the following joint pdf:
$$f(x, y) := \frac{1}{45}\,(x + y) \cdot (1 \le x \le 3) \cdot (0 \le y \le 5)$$
(You need only define the non-zero portion of $f$ to Mathcad.)

Checking:
$$\int_1^3 \int_0^5 \frac{1}{45}\,(x + y)\,dy\,dx = 1$$
The marginal cdf for $X$ is
$$F_X(x) := \int_1^x \int_0^5 \frac{1}{45}\,(u + v)\,dv\,du \quad \text{or} \quad F_X(x) \to \frac{(x - 1)(x + 6)}{18} \quad \text{for } 1 \le x \le 3.$$
The marginal cdf for $Y$ is
$$F_Y(y) := \int_0^y \int_1^3 \frac{1}{45}\,(u + v)\,du\,dv \quad \text{or} \quad F_Y(y) \to \frac{y\,(y + 4)}{45} \quad \text{for } 0 \le y \le 5.$$
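For readers without Mathcad, the same checks can be sketched in Python with a simple midpoint rule over a rectangle. The helper `integrate_rect` and the test points x = 2 and y = 3 are illustrative choices; the closed forms being compared against are the ones shown on the slide.

```python
def f(x, y):
    """Joint pdf (x + y)/45 on 1 <= x <= 3, 0 <= y <= 5, and 0 elsewhere."""
    return (x + y) / 45.0 if 1 <= x <= 3 and 0 <= y <= 5 else 0.0

def integrate_rect(g, x0, x1, y0, y1, n=200):
    """Midpoint-rule estimate of the integral of g over [x0, x1] x [y0, y1]."""
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            total += g(x, y0 + (j + 0.5) * hy)
    return total * hx * hy

total = integrate_rect(f, 1, 3, 0, 5)      # should be 1 for a valid pdf

def F_X(x):
    """Marginal cdf of X: integrate over u in [1, x] and all v in [0, 5]."""
    return integrate_rect(f, 1, x, 0, 5)

def F_Y(y):
    """Marginal cdf of Y: integrate over all u in [1, 3] and v in [0, y]."""
    return integrate_rect(f, 1, 3, 0, y)

print(total, F_X(2), (2 - 1) * (2 + 6) / 18)
print(F_Y(3), 3 * (3 + 4) / 45)
```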

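Example: Joint Density of X, Y

Let
$$f(x, y) = \begin{cases} \tfrac{1}{6} & \text{for } (x, y) \in A \\ 0 & \text{otherwise} \end{cases}$$
where $A$ is the triangle shown. (Figure: triangle $A$ bounded below by $y = 0$, on the right by $x = 5$, and above by the line $y = \tfrac{3}{4}(x - 1)$, which runs from $x = 1$ up to $y = 3$ at $x = 5$.)

What is the probability that $X \le 2$ and $Y \le 2$?
$$P[X \le 2, Y \le 2] = \int_1^2 \int_0^{\frac{3}{4}(x-1)} \frac{1}{6}\,dy\,dx = \int_1^2 \frac{1}{6}\, y \Big|_0^{\frac{3}{4}(x-1)} dx = \int_1^2 \frac{1}{8}(x - 1)\,dx = \frac{1}{8}\left( \frac{x^2}{2} - x \right)\Big|_1^2 = \frac{1}{16}.$$
Also, the answer is the ratio of the area of the small triangle to the area of $A$:
$$\frac{3/8}{6} = \frac{3}{48} = \frac{1}{16}.$$
This only works because $f$ is constant above area $A$.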

Marginal Density of X (previous example)

$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$$
(The value of $x$ is held constant, and the value of $y$ is "integrated out.")
$$= \int_0^{\frac{3}{4}(x-1)} \frac{1}{6}\,dy = \frac{1}{6}\, y \Big|_0^{\frac{3}{4}(x-1)} = \frac{1}{8}(x - 1), \quad \text{for } 1 \le x \le 5.$$
Notice that
$$\int_{-\infty}^{\infty} f_X(x)\,dx = \int_1^5 \frac{1}{8}(x - 1)\,dx = \frac{1}{8}\left( \frac{x^2}{2} - x \right)\Big|_1^5 = \frac{1}{8}\left[ \left( \tfrac{25}{2} - 5 \right) - \left( \tfrac{1}{2} - 1 \right) \right] = 1$$
as required for a pdf.
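The arithmetic in this example and the previous one can be confirmed exactly with rational numbers. This is a small Python sketch; `antiderivative` is a hypothetical helper for the antiderivative of the marginal density (x − 1)/8, and the area figures assume the triangle has base 4 and height 3 as read from the figure labels.

```python
from fractions import Fraction as F

def antiderivative(x):
    """Antiderivative of f_X(x) = (x - 1)/8, namely (x^2/2 - x)/8."""
    x = F(x)
    return (x * x / 2 - x) / 8

# The marginal density must integrate to 1 over 1 <= x <= 5.
total = antiderivative(5) - antiderivative(1)

# For x <= 2 the triangle only reaches y = 3/4, so Y <= 2 holds automatically
# and P[X <= 2, Y <= 2] reduces to P[X <= 2].
p = antiderivative(2) - antiderivative(1)

# Area argument: small triangle (base 1, height 3/4) over the area of A (base 4, height 3).
small_area = F(1, 2) * 1 * F(3, 4)
area_A = F(1, 2) * 4 * 3
area_ratio = small_area / area_A

print(total, p, area_ratio)
```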

Example: The random variables $X$ and $Y$ have the joint density function
$$f(x, y) = \begin{cases} xy \exp\left[ -\tfrac{1}{2}(x^2 + y^2) \right] & \text{for } x > 0 \text{ and } y > 0 \\ 0 & \text{otherwise.} \end{cases}$$
Find $F_Y(y)$, $f_X(x)$, and $F(1, 2)$.

Solution: We begin by finding $f_X(x) = \int_0^\infty f_{X,Y}(x, y)\,dy$:
$$f_X(x) = \int_0^\infty xy \exp\left[ -\tfrac{1}{2}(x^2 + y^2) \right] dy = x e^{-x^2/2} \int_0^\infty y e^{-y^2/2}\,dy = x e^{-x^2/2}.$$
By symmetry, $f_Y(y) = y e^{-y^2/2}$. This is $Y$'s density function, and we can now find the cdf:
$$F_Y(y) = \int_0^y t e^{-t^2/2}\,dt = -e^{-t^2/2} \Big|_0^y = 1 - e^{-y^2/2}.$$

Finding F(1, 2)

Note that in the region of integration, $x > 0$ and $y > 0$, values of $y$ do not depend on values of $x$.
$$F(1, 2) \equiv F_{X,Y}(1, 2) = \int_0^1 \int_0^2 xy \exp\left[ -\tfrac{1}{2}(x^2 + y^2) \right] dy\,dx = \int_0^1 x e^{-x^2/2} \int_0^2 y e^{-y^2/2}\,dy\,dx$$
$$= \int_0^1 x e^{-x^2/2} \left[ -e^{-y^2/2} \right]_0^2 dx = (1 - e^{-2}) \int_0^1 x e^{-x^2/2}\,dx = (1 - e^{-2}) \left[ -e^{-x^2/2} \right]_0^1$$
$$= (1 - e^{-2}) \left( 1 - e^{-1/2} \right) \approx 0.340.$$
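The value 0.340 can be checked two ways: from the factored closed form, and by brute-force numerical integration of the joint density. The Python sketch below does both. The general formula in `joint_cdf` extends the slide's calculation from (1, 2) to arbitrary positive (x, y) using the same factorization, which is an extrapolation not stated on the slide; the function names are illustrative.

```python
import math

def joint_cdf(x, y):
    """F(x, y) = (1 - e^{-x^2/2})(1 - e^{-y^2/2}) for x, y > 0 (factored form)."""
    return (1 - math.exp(-x * x / 2)) * (1 - math.exp(-y * y / 2))

def joint_cdf_numeric(x, y, n=400):
    """Midpoint-rule double integral of u*v*exp(-(u^2 + v^2)/2) over [0, x] x [0, y]."""
    hx, hy = x / n, y / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hx
        for j in range(n):
            v = (j + 0.5) * hy
            total += u * v * math.exp(-0.5 * (u * u + v * v))
    return total * hx * hy

closed = joint_cdf(1, 2)
numeric = joint_cdf_numeric(1, 2)
print(f"{closed:.3f} {numeric:.3f}")
```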

Scatterplot of
$$f(x, y) = \begin{cases} xy \exp\left[ -\tfrac{1}{2}(x^2 + y^2) \right] & \text{for } x > 0 \text{ and } y > 0 \\ 0 & \text{otherwise.} \end{cases}$$

Lessons to Learn

$F(1, 2) = P(X \le 1, Y \le 2)$

Integrate a pdf to get a cdf:
$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt.$$

"Limit out the other variables of a joint cdf to get a marginal cdf."
$$\lim_{y \to \infty} F_{X,Y}(x, y) = F_X(x)$$

**Integrate out the other variables of the joint pdf to get a marginal pdf.**
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy.$$

Bivariate Normal pdf

Mathcad settings: $\mu_x := 0$, $\mu_y := 0$, $\sigma_x := 1$, $\sigma_y := 1$ (plots shown for $\rho := 0$ and $\rho = 0.6$).
$$c := \frac{1}{2 \cdot \pi \cdot \sigma_x \cdot \sigma_y \cdot \sqrt{1 - \rho^2}}$$
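The slide shows only the normalizing constant c. As a rough sketch, the full density can be written in Python by combining c with the standard bivariate normal exponent; that exponent is the textbook form and is an assumption here, since it does not survive in the slide text. The function and parameter names are illustrative.

```python
import math

def bivariate_normal_pdf(x, y, mu_x=0.0, mu_y=0.0, sigma_x=1.0, sigma_y=1.0, rho=0.0):
    """Bivariate normal density; requires -1 < rho < 1."""
    # Normalizing constant c, as defined on the slide.
    c = 1.0 / (2.0 * math.pi * sigma_x * sigma_y * math.sqrt(1.0 - rho ** 2))
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    # Standard quadratic form in the exponent (assumed; not shown on the slide).
    q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho ** 2)
    return c * math.exp(-0.5 * q)

peak_rho0 = bivariate_normal_pdf(0.0, 0.0, rho=0.0)    # equals c with rho = 0
peak_rho06 = bivariate_normal_pdf(0.0, 0.0, rho=0.6)   # equals c with rho = 0.6
print(peak_rho0, peak_rho06)
```

With ρ = 0 the density factors into the product of two univariate normal densities, which ties in with the independence criterion on the next slide.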

Independent RVs

Two random variables, $X$ and $Y$, are independent if
$$F_{X,Y}(x, y) = F_X(x)\,F_Y(y) \quad \text{for } -\infty < x < \infty \text{ and } -\infty < y < \infty.$$
If the corresponding density functions exist, this is equivalent to
$$f_{X,Y}(x, y) = f_X(x)\,f_Y(y) \quad \text{for } -\infty < x < \infty \text{ and } -\infty < y < \infty.$$
EXAMPLE: Let $X$ and $Y$ have joint pdf
$$f(x, y) = \begin{cases} \dfrac{1}{\pi}, & x^2 + y^2 \le 1 \\ 0, & \text{otherwise.} \end{cases}$$
Determine the marginal pdf's of $X$ and $Y$. Are $X$ and $Y$ independent?

Solution (Independent RV's)

We're given:
$$f(x, y) = \begin{cases} \dfrac{1}{\pi}, & x^2 + y^2 \le 1 \\ 0, & \text{otherwise.} \end{cases}$$
Thus, the marginal density for $X$ is
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} \frac{1}{\pi}\,dy = \frac{1}{\pi}\, y \Big|_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} = \frac{2}{\pi} \sqrt{1 - x^2}, \quad -1 \le x \le 1.$$
Because of the symmetry the marginal density of $Y$ is
$$f_Y(y) = \frac{2}{\pi} \sqrt{1 - y^2}, \quad -1 \le y \le 1.$$
To check for independence notice that
$$f_{X,Y}(0, 0) = \frac{1}{\pi} \quad \text{and} \quad f_X(0)\,f_Y(0) = \frac{2}{\pi} \times \frac{2}{\pi} = \frac{4}{\pi^2};$$
thus, $X$ and $Y$ are NOT independent.
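The independence check amounts to comparing the joint density with the product of the marginals at a single point. A short Python sketch of that comparison follows; the function names are illustrative, and the numerical integration at the end is only a sanity check that the marginal is a valid pdf.

```python
import math

def f_joint(x, y):
    """Uniform density 1/pi on the unit disk, 0 outside."""
    return 1.0 / math.pi if x * x + y * y <= 1.0 else 0.0

def f_marginal(t):
    """Marginal density (2/pi) sqrt(1 - t^2) on [-1, 1]; same form for X and Y."""
    return 2.0 / math.pi * math.sqrt(1.0 - t * t) if -1.0 <= t <= 1.0 else 0.0

lhs = f_joint(0.0, 0.0)                    # 1/pi
rhs = f_marginal(0.0) * f_marginal(0.0)    # 4/pi^2
print(lhs, rhs, math.isclose(lhs, rhs))

# Sanity check: the marginal integrates to 1 (midpoint rule on [-1, 1]).
n = 10_000
h = 2.0 / n
area = sum(f_marginal(-1.0 + (i + 0.5) * h) for i in range(n)) * h
print(area)
```

Another telling point is (0.9, 0.9): it lies outside the disk, so the joint density is 0 there, while both marginals are positive.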

Keeping Sums in the Family

* A sum of independent normals is normal.
* A sum of independent Poissons is Poisson.
* A sum of independent exponentials (with a common rate) is Erlang.
* A sum of independent gammas (with a common rate) is gamma.
* A large sum of independent, identically distributed RVs (with finite variance) is approximately normal.
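The Poisson case can be verified exactly without simulation: the pmf of a sum of two independent discrete random variables is the convolution of their pmfs. The Python sketch below compares that convolution with a Poisson pmf whose rate is the sum of the two rates; the rates 1.5 and 2.5 are arbitrary illustrative values.

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def pmf_of_sum(k, lam1, lam2):
    """P(X + Y = k) for independent Poisson X and Y, by discrete convolution."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))

lam1, lam2 = 1.5, 2.5
matches = [
    math.isclose(pmf_of_sum(k, lam1, lam2), poisson_pmf(k, lam1 + lam2))
    for k in range(10)
]
print(all(matches))
```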

Linearity of Expectation

Suppose that $X$ and $Y$ are random variables and that $a$ and $b$ are any two real numbers, then
$$E(aX + bY) = aE(X) + bE(Y).$$
Use this property to derive the working formula for variance:
$$\sigma^2 = E(X^2) - E^2(X).$$
Also notice that $\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$ and that $\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$ if $X$ and $Y$ are independent.

In previous example find $E(3X + 2Y)$.
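One possible route to the working formula, sketched here with $\mu = E(X)$ treated as a constant and using only linearity of expectation:

```latex
\begin{aligned}
\sigma^2 &= E\big[(X - \mu)^2\big] \\
         &= E\big[X^2 - 2\mu X + \mu^2\big] \\
         &= E(X^2) - 2\mu\,E(X) + \mu^2 \\
         &= E(X^2) - 2\mu^2 + \mu^2 \\
         &= E(X^2) - E^2(X).
\end{aligned}
```

The third line is where linearity is applied; the expectation of the constant $\mu^2$ is just $\mu^2$.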
