Probability Distributions
Fall 2014
Random Variables and Probability Distributions
Any quantity or characteristic that is able to assume a number of
different values such that any particular outcome is determined by
chance is called a random variable
Random variables can be either discrete or continuous
A discrete random variable is able to assume only a finite or
countable number of outcomes
A continuous random variable can take on any value in a
specified interval
Example: Let X be a random variable that represents the number
of diagnostic tests a child receives during an office visit to a
pediatric specialist
A probability distribution applies the theory of probability to
describe the behavior of a random variable
For a discrete random variable, the probability distribution
specifies each of the possible outcomes of the random variable along
with the probability that each will occur
We represent a potential outcome of the random variable X by x
Probabilities can be assigned to specific outcomes using either a
table or some mathematical relationship

    x    P(X = x)
    0    0.671
    1    0.229
    2    0.053
    3    0.031
    4    0.010
    5    0.006

P(X = 3) = 0.031
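As a quick check, a valid probability distribution must assign nonnegative probabilities that sum to 1. A minimal sketch in Python, with the probabilities taken from the table above:

```python
# Probability mass function of X, the number of diagnostic tests,
# taken from the table above.
pmf = {0: 0.671, 1: 0.229, 2: 0.053, 3: 0.031, 4: 0.010, 5: 0.006}

# A valid discrete distribution sums to 1 over all outcomes.
total = sum(pmf.values())
assert abs(total - 1.0) < 1e-9

print(pmf[3])   # P(X = 3) = 0.031
```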
Probability distributions can also be displayed using a graph
[Figure: bar chart of the probability P(X = x) against the number of
diagnostic tests x = 0, 1, . . . , 5]
Note that the tabular form of a probability distribution looks very
much like a frequency distribution
For a sample of n measurements, a frequency distribution lists each
observed value and its corresponding frequency (and sometimes the
relative frequency as well)
For a discrete random variable, a probability mass function displays
each possible outcome of the random variable and its associated
probability
Both indicate which values occur more often than others
If a random variable is able to take on a large number of values,
then a probability mass function might not be the most useful way
to summarize its behavior
Instead, measures of location and dispersion can be calculated (as
long as the measurements are not categorical)
The mean or average value assumed by a random variable is called
its expected value, or the population mean
It is represented by E(X) or µ
To obtain the expected value of a discrete random variable X, we
multiply each possible outcome of the variable by its associated
probability and sum over all values with a probability greater than
0
Suppose that a random variable X is able to take on the k distinct
values x1, x2, . . . , xk

E(X) = µ = Σ_{i=1}^{k} xi P(X = xi)
The variance of a random variable X is called the population
variance and is represented by Var(X) or σ 2
It quantifies the dispersion of the possible outcomes of X around
the expected value µ
To calculate the variance of X, we multiply the squared difference
between each possible value xi and the population mean µ by its
associated probability and sum over all values with a probability
greater than 0
Var(X) = σ² = Σ_{i=1}^{k} (xi − µ)² P(X = xi) = E[(X − µ)²]
The standard deviation of X is σ = √σ²

For the diagnostic testing data,

E(X) = 0(0.671) + 1(0.229) + 2(0.053) + 3(0.031) + 4(0.010) + 5(0.006)
     = 0.498

Var(X) = 0.782, so σ = √0.782 ≈ 0.884
The cumulative distribution function (CDF) of a random
variable X is defined as P(X ≤ x)
It is represented by F(x)
For the diagnostic testing data, the CDF of X is:
    x    P(X ≤ x) = F(x)
    0    0.671
    1    0.900
    2    0.953
    3    0.984
    4    0.994
    5    1.000
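The CDF is simply a running sum of the probability mass function; a sketch using the standard library's `itertools.accumulate`:

```python
from itertools import accumulate

# P(X = x) for x = 0, 1, ..., 5 (diagnostic testing data)
pmf = [0.671, 0.229, 0.053, 0.031, 0.010, 0.006]

# F(x) = P(X <= x): cumulative sums of the pmf.
cdf = [round(f, 3) for f in accumulate(pmf)]
print(cdf)   # [0.671, 0.9, 0.953, 0.984, 0.994, 1.0]
```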
The Binomial Distribution
Suppose that 72% of infants born in the U.S. survive to age 70 years

Let Y = 1 if the child survives to age 70 and Y = 0 if he or she does
not

These outcomes are mutually exclusive and exhaustive

P(Y = 1) = p = 0.72
P(Y = 0) = 1 − p = 0.28

The probability distribution of Y is

    y    P(Y = y)
    0    0.28
    1    0.72
Now suppose that we have two newborn infants
How many will survive until the age of 70?
Let X be a random variable that represents the number of children
out of the two who survive
X can take on three possible values: 0, 1, 2
The survival status of any two children can be assumed to be
independent, and the multiplication law of probability can be
applied
    Y1    Y2    X
    0     0     0
    0     1     1
    1     0     1
    1     1     2
P(X = 0) = P(Y1 = 0 and Y2 = 0)
         = (1 − p)(1 − p)
         = (0.28)²
         = 0.08

P(X = 1) = P([Y1 = 1 and Y2 = 0] or [Y1 = 0 and Y2 = 1])
         = P(Y1 = 1 and Y2 = 0) + P(Y1 = 0 and Y2 = 1)
         = p(1 − p) + (1 − p)p
         = 2(0.72)(0.28)
         = 0.40

P(X = 2) = P(Y1 = 1 and Y2 = 1)
         = p²
         = (0.72)²
         = 0.52
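The same three probabilities can be recovered by brute-force enumeration of the four (Y1, Y2) outcomes, using independence to multiply the individual survival probabilities; a minimal sketch:

```python
from itertools import product

p = 0.72  # probability that a single infant survives to age 70

# Enumerate all (Y1, Y2) outcomes and accumulate P(X = x),
# where X = Y1 + Y2 is the number of survivors.
dist = {0: 0.0, 1: 0.0, 2: 0.0}
for y1, y2 in product([0, 1], repeat=2):
    prob = (p if y1 else 1 - p) * (p if y2 else 1 - p)
    dist[y1 + y2] += prob

print({x: round(q, 4) for x, q in dist.items()})
# {0: 0.0784, 1: 0.4032, 2: 0.5184}
```

The unrounded values 0.0784, 0.4032, and 0.5184 correspond to the 0.08, 0.40, and 0.52 shown above.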
The relevant probabilities could also be worked out if we had three
or more newborn infants
The probability distribution of the random variable X, where X
represents the number of infants in the group who survive to age
70, is called a binomial distribution
In general, we have n independent outcomes of a Bernoulli random
variable Y
Each one has a probability of “success” p
The total number of successes X follows a binomial distribution
The fixed numbers n and p are called the parameters of the
binomial distribution
Parameters are numerical quantities that summarize the
characteristics of a probability distribution
For the survival example, n = 2 and p = 0.72
The binomial distribution can be used to describe a variety of
situations such as the number of siblings who will inherit a certain
genetic trait from their parents, the number of surgical patients
who experience a postoperative complication, or the number of
individuals requiring treatment in an emergency room who have
health insurance coverage
Permutations and Combinations
Example: In how many ways can the first 3 letters of the alphabet
— A, B, and C — be ordered?
A B C
A C B
B A C
B C A
C A B
C B A
There are 3 choices for the first position, 2 choices for the second
position, and only 1 choice for the last position
Therefore, the number of ways in which the 3 letters can be
ordered is
3×2×1 = 6
In general, n letters can be ordered in n! (pronounced n factorial)
ways, where
n! = n × (n − 1) × (n − 2) × . . . × 3 × 2 × 1
By definition,
0! = 1
Example: In how many ways can the first 6 letters of the alphabet
— A, B, C, D, E, and F — be ordered?
6! = 6 × 5 × 4 × 3 × 2 × 1
= 720
Example: In how many ways can 3 of the 6 letters be selected and
ordered?

There are 6 choices for the first position, 5 for the second position,
and 4 for the third position

6 × 5 × 4 = 120

In general, the number of ordered arrangements (permutations) of k
objects selected from n is

nPk = n(n − 1) × · · · × (n − k + 1) = n! / (n − k)!
Note that 3 letters can be ordered in 3! ways
Therefore, the number of ways in which 3 letters can be selected
out of 6 when the order of selection does not matter is
(6 × 5 × 4) / 3! = 120 / 6 = 20
In how many ways can k objects be selected out of n when the
order of selection does not matter?
nCk = nPk / k!
    = [n(n − 1) × · · · × (n − k + 1)] / [k(k − 1) × · · · × (2)(1)]
    = n! / [k!(n − k)!]
This is the number of combinations of n objects taken k at a time
Special property: If n and k are nonnegative integers and n ≥ k,
then
nCk = nC(n−k)

Note that

nC(n−k) = n! / [(n − k)!(n − (n − k))!]
        = n! / [(n − k)! k!]
        = n! / [k!(n − k)!]
        = nCk
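Python's standard library exposes all three counts directly (`math.factorial`, `math.perm`, and `math.comb`); a quick sketch using the letter examples above:

```python
import math

# Orderings of 6 letters: 6! = 720
print(math.factorial(6))   # 720

# Ordered selections of 3 letters out of 6: 6 * 5 * 4 = 120
print(math.perm(6, 3))     # 120

# Unordered selections of 3 letters out of 6: 120 / 3! = 20
print(math.comb(6, 3))     # 20

# Symmetry property: C(n, k) == C(n, n - k)
print(math.comb(6, 3) == math.comb(6, 6 - 3))   # True
```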
Return to the binomial distribution
We are given n independent outcomes of a Bernoulli random
variable Y , each with a constant probability of success p
The probability of failure is q = 1 − p
X is the total number of successes
Example: Given 5 newborn infants in the U.S., what is the
probability that exactly 3 of them will survive to age 70?
The probability that it is the first three children who survive, and
the last two who do not, is

p³(1 − p)² = (0.72)³(0.28)² = 0.0293
In how many ways can 3 children be selected out of the 5?

5C3 = 5! / [3!(5 − 3)!]
    = [(5)(4)(3)(2)(1)] / [(3)(2)(1)(2)(1)]
    = 10

Therefore, P(X = 3) = 10 (0.72)³(0.28)² ≈ 0.293
In general, given n independent trials or outcomes each with
probability of success p, the probability of exactly k successes is
P(X = k) = nCk p^k q^(n−k)

where k = 0, 1, . . . , n
X is a binomial random variable
The parameters n and p summarize the characteristics of the
distribution
Many statistical software packages can be used to calculate
binomial probabilities
In some situations, binomial tables can be used to evaluate
probabilities as well
Example: Given that p = 0.30, what is the probability of exactly 4
successes in 5 trials?
P(X = 4) = 0.0283
What is the probability of at most 2 successes in 5 trials?
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
         = 0.1681 + 0.3602 + 0.3087
         = 0.8370
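A minimal sketch of the binomial probability mass function, used to check the numbers above (the helper name `binom_pmf` is ours, not a library function):

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Exactly 4 successes in 5 trials with p = 0.30
print(round(binom_pmf(4, 5, 0.30), 4))

# At most 2 successes in 5 trials: P(X = 0) + P(X = 1) + P(X = 2)
cdf2 = sum(binom_pmf(k, 5, 0.30) for k in range(3))
print(round(cdf2, 4))   # 0.8369
```

Summing the unrounded terms gives 0.8369; the 0.8370 above comes from adding the individually rounded probabilities.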
The expected value of a binomial random variable X is

E(X) = µ = Σ_{i=1}^{m} xi P(X = xi)
         = Σ_{k=0}^{n} k nCk p^k q^(n−k)
         = np
The variance of a binomial random variable X is

Var(X) = σ² = Σ_{i=1}^{m} (xi − µ)² P(X = xi)
            = Σ_{k=0}^{n} (k − np)² nCk p^k q^(n−k)
            = npq
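Both identities can be checked numerically against the definitions, for instance with n = 5 and p = 0.30, where np = 1.5 and npq = 1.05; a sketch:

```python
import math

n, p = 5, 0.30
q = 1 - p

# Full binomial pmf for k = 0, ..., n
pmf = [math.comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

mu = sum(k * pk for k, pk in enumerate(pmf))             # should equal n*p
var = sum((k - mu)**2 * pk for k, pk in enumerate(pmf))  # should equal n*p*q

print(round(mu, 6), round(var, 6))   # 1.5 1.05
```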
Probability distribution of a binomial random variable X with
parameters n = 10 and p = 0.50
[Figure: bar chart of the probability P(X = x) against the number of
successes x = 0, 1, . . . , 10]
Probability distribution of a binomial random variable X with
parameters n = 10 and p = 0.20
[Figure: bar chart of the probability P(X = x) against the number of
successes x = 0, 1, . . . , 10]
Probability distribution of a binomial random variable X with
parameters n = 10 and p = 0.80
[Figure: bar chart of the probability P(X = x) against the number of
successes x = 0, 1, . . . , 10]
Example: In the United States, 11% of individuals have type B
blood
In a randomly selected sample of 10 people, it is found that 2 have
type B blood
What is the probability of this event?
First note that the number of individuals with type B blood is a
binomial random variable with parameters n = 10 and p = 0.11
The expected value of X is np = 10(0.11) = 1.1
P(X = 2) = 10C2 (0.11)² (1 − 0.11)^(10−2)
         = [10! / (2! 8!)] (0.11)² (0.89)⁸
         = 0.214

P(X = 4) = [10! / (4! 6!)] (0.11)⁴ (0.89)⁶
         = 0.015

Would it be unusual to observe 4 individuals with type B blood in a
sample of this size?
One approach to answering this question is to calculate the
probability that X is equal to 4 or anything “more extreme”, or
P(X ≥ 4)
P(X ≥ 4) = Σ_{k=4}^{10} P(X = k)
         = 1 − Σ_{k=0}^{3} P(X = k)
         = 0.018
Is it unusual to observe 2 individuals with type B blood in a sample
of size 10?
P(X ≥ 2) = Σ_{k=2}^{10} P(X = k)
         = 1 − Σ_{k=0}^{1} P(X = k)
         = 0.303
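The point and tail probabilities for the blood-type example can be reproduced with the binomial pmf; a minimal sketch (the helper name `pmf` is ours):

```python
import math

n, p = 10, 0.11  # sample size and prevalence of type B blood

def pmf(k):
    """P(X = k) for a binomial(n, p) random variable."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

print(round(pmf(2), 3))                              # 0.214
print(round(pmf(4), 3))                              # 0.015
print(round(1 - sum(pmf(k) for k in range(4)), 3))   # P(X >= 4) = 0.018
print(round(1 - sum(pmf(k) for k in range(2)), 3))   # P(X >= 2)
```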
The Poisson Distribution
Let X represent the number of events that occur over a time interval
of length t. The probability mass function of X is

P(X = k) = e^(−µ) µ^k / k!

where k = 0, 1, 2, . . . and µ = λt
k is a potential outcome of X
µ is the parameter of the Poisson distribution
The constant λ (lambda) represents the rate at which the event
occurs, or the expected number of events per unit time
e is the base of the natural logarithm; it is a constant approximated
by 2.71828
The probability that a new case of tetanus will occur over some
small subinterval of time — such as a minute — is very tiny
Represent the time interval of interest by t (e.g., one month) and
the length of a subinterval by ∆t (e.g., one minute)
Three assumptions must be made in order for the Poisson
distribution to apply:
1) The probability that a single event occurs within a given small
subinterval is proportional to the length of the subinterval
P(event) ≈ λ ∆t for constant λ
The probability that 0 events occur in the subinterval ≈ 1 − λ∆t
The probability of observing more than one event in a single
subinterval is essentially 0
Therefore, within each subinterval, the event either occurs or it
does not occur
2) The rate at which the event occurs is constant over the entire
interval t
3) Events occurring in consecutive subintervals are independent of
each other
If these assumptions hold, then the random variable X — which
represents the number of events that occur in the interval t —
follows a Poisson distribution
Example: In the United States, cases of tetanus are reported at a
rate of λ = 4.5/month
λ is the expected number of events per unit time
The parameter µ is the expected number of events in the interval t
Since t = 1 month, µ = λt = (4.5)(1) = 4.5
If t = 2 months, then µ = 9
The Poisson distribution can be used to model the number of
ambulances needed in a city in a given night, the number of
particles emitted from a specified amount of radioactive material,
or the number of suicides occurring in a large city in a given week
What is the probability that 0 cases of tetanus will be reported in a
given month?
P(X = 0) = e^(−4.5) (4.5)⁰ / 0!
         = e^(−4.5) (1) / (1)
         = 0.011
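A minimal sketch of the Poisson probability mass function applied to this example (the helper name `poisson_pmf` is ours, not a library function):

```python
import math

def poisson_pmf(k, mu):
    """P(X = k) for a Poisson random variable with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

mu = 4.5  # expected tetanus cases in one month (lambda * t with t = 1)
print(round(poisson_pmf(0, mu), 3))   # 0.011

# The probabilities over a wide range of k sum to (essentially) 1.
total = sum(poisson_pmf(k, mu) for k in range(100))
print(round(total, 6))   # 1.0
```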
Probability distribution of a Poisson random variable X with
parameter µ = 4.5
[Figure: bar chart of the probability P(X = k) against the number of
events k = 0, 1, . . . , 13]
The expected value of a Poisson random variable X is

E(X) = Σ_{i=1}^{m} xi P(X = xi)
     = Σ_{k=0}^{∞} k e^(−µ) µ^k / k!
     = µ

The variance of X is

Var(X) = Σ_{i=1}^{m} (xi − µ)² P(X = xi)
       = Σ_{k=0}^{∞} (k − µ)² e^(−µ) µ^k / k!
       = µ
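Both moments can be checked numerically by truncating the infinite sums at a large k, since the tail mass is negligible; a sketch for µ = 4.5:

```python
import math

mu = 4.5
# Truncate the infinite sums at k = 100; the remaining tail is negligible.
pmf = [math.exp(-mu) * mu**k / math.factorial(k) for k in range(100)]

mean = sum(k * p for k, p in enumerate(pmf))
var = sum((k - mean)**2 * p for k, p in enumerate(pmf))

# For a Poisson random variable, mean and variance are both mu.
print(round(mean, 6), round(var, 6))   # 4.5 4.5
```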