
BIO201

Probability Distributions

Fall 2014

Random Variables and Probability Distributions
Any quantity or characteristic that is able to assume a number of
different values such that any particular outcome is determined by
chance is called a random variable
Random variables can be either discrete or continuous
A discrete random variable is able to assume only a finite or
countable number of outcomes
A continuous random variable can take on any value in a
specified interval

Example: Let X be a random variable that represents the number
of diagnostic tests a child receives during an office visit to a
pediatric specialist
A probability distribution applies the theory of probability to
describe the behavior of a random variable
For a discrete random variable, the probability distribution
specifies each of the possible outcomes of the random variable along
with the probability that each will occur
We represent a potential outcome of the random variable X by x

Probabilities can be assigned to specific outcomes using either a
table or some mathematical relationship

x P(X = x)
0 0.671
1 0.229
2 0.053
3 0.031
4 0.010
5 0.006

The rule that is used to assign probabilities to the various
outcomes is called a probability mass function

\[ 0 < P(X = x) \le 1 \]
\[ \sum_{x} P(X = x) = 1 \]

What is the probability that a patient receives exactly 3 diagnostic
tests?

P(X = 3) = 0.031

What is the probability that a patient receives at most one
diagnostic test?

P(X = 0) + P(X = 1) = 0.671 + 0.229 = 0.900

What is the probability that a patient receives at least four
diagnostic tests?

P(X = 4) + P(X = 5) = 0.010 + 0.006 = 0.016
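
The three probabilities above can be checked with a short script. This is only an illustrative sketch: the dictionary `pmf` and the helper `prob` are names chosen here, not part of the course material.

```python
# Probability mass function for the number of diagnostic tests,
# copied from the table of P(X = x) given earlier.
pmf = {0: 0.671, 1: 0.229, 2: 0.053, 3: 0.031, 4: 0.010, 5: 0.006}


def prob(event):
    """Sum P(X = x) over every outcome x for which event(x) is True."""
    return sum(p for x, p in pmf.items() if event(x))


exactly_three = prob(lambda x: x == 3)   # P(X = 3)
at_most_one = prob(lambda x: x <= 1)     # P(X = 0) + P(X = 1)
at_least_four = prob(lambda x: x >= 4)   # P(X = 4) + P(X = 5)

print(round(exactly_three, 3), round(at_most_one, 3), round(at_least_four, 3))
```

Writing each event as a condition on x keeps the code close to the wording of the questions ("at most one", "at least four").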

Probability distributions can also be displayed using a graph
[Figure: bar chart of Probability X = x (vertical axis, 0 to .7) against Number of Diagnostic Tests (horizontal axis, 0 to 5)]

The area of the bar above each outcome x represents P(X = x)


The total area is equal to 1

Note that the tabular form of a probability distribution looks very
much like a frequency distribution
For a sample of n measurements, a frequency distribution lists each
observed value and its corresponding frequency (and sometimes the
relative frequency as well)
For a discrete random variable, a probability mass function displays
each possible outcome of the random variable and its associated
probability
Both indicate which values occur more often than others

If a random variable is able to take on a large number of values,
then a probability mass function might not be the most useful way
to summarize its behavior
Instead, measures of location and dispersion can be calculated (as
long as the measurements are not categorical)
The mean or average value assumed by a random variable is called
its expected value, or the population mean
It is represented by E(X) or µ
To obtain the expected value of a discrete random variable X, we
multiply each possible outcome of the variable by its associated
probability and sum over all values with a probability greater than
0

Suppose that a random variable X is able to take on the k distinct
values x_1, x_2, ..., x_k

\[ E(X) = \mu = \sum_{i=1}^{k} x_i \, P(X = x_i) \]

For the diagnostic test data,

\[ E(X) = 0(0.671) + 1(0.229) + 2(0.053) + 3(0.031) + 4(0.010) + 5(0.006) = 0.498 \]

We would expect an average of 0.5 diagnostic tests for each visit to
a pediatric specialist
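
As a rough check of this arithmetic, the expected value can be computed as a probability-weighted sum. The names `pmf` and `expected_value` are illustrative choices made here.

```python
# P(X = x) for the number of diagnostic tests, from the table above.
pmf = {0: 0.671, 1: 0.229, 2: 0.053, 3: 0.031, 4: 0.010, 5: 0.006}

# E(X) = sum over all outcomes of x * P(X = x)
expected_value = sum(x * p for x, p in pmf.items())

print(round(expected_value, 3))
```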

The variance of a random variable X is called the population
variance and is represented by Var(X) or σ²
It quantifies the dispersion of the possible outcomes of X around
the expected value µ
To calculate the variance of X, we multiply the squared difference
between each possible value x_i and the population mean µ by its
associated probability and sum over all values with a probability
greater than 0

\[ \mathrm{Var}(X) = \sigma^2 = \sum_{i=1}^{k} (x_i - \mu)^2 \, P(X = x_i) = E\left[(X - \mu)^2\right] \]


The standard deviation of X is √(σ²) = σ
For the diagnostic testing data,

\[ \sigma^2 = (0 - 0.5)^2 (0.671) + (1 - 0.5)^2 (0.229) + (2 - 0.5)^2 (0.053) + (3 - 0.5)^2 (0.031) + (4 - 0.5)^2 (0.010) + (5 - 0.5)^2 (0.006) = 0.782 \]

The standard deviation is

\[ \sigma = \sqrt{0.782} = 0.884 \]
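
The variance and standard deviation can be reproduced in a few lines. This sketch uses the unrounded mean 0.498 rather than the rounded 0.5 used in the hand calculation. The two agree to three decimal places. Variable names are illustrative.

```python
from math import sqrt

# P(X = x) for the number of diagnostic tests, from the table above.
pmf = {0: 0.671, 1: 0.229, 2: 0.053, 3: 0.031, 4: 0.010, 5: 0.006}

# Population mean: probability-weighted sum of the outcomes.
mu = sum(x * p for x, p in pmf.items())

# Population variance: probability-weighted squared distance from the mean.
variance = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(round(variance, 3), round(std_dev, 3))
```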

The cumulative distribution function (CDF) of a random
variable X is defined as P(X ≤ x)
It is represented by F(x)
For the diagnostic testing data, the CDF of X is:

x P(X ≤ x) = F(x)
0 0.671
1 0.900
2 0.953
3 0.984
4 0.994
5 1.000
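
The CDF table is a running total of the probability mass function. A minimal sketch, assuming the outcomes are listed in increasing order (`outcomes`, `probs`, and `cdf` are names chosen here):

```python
from itertools import accumulate

outcomes = [0, 1, 2, 3, 4, 5]
probs = [0.671, 0.229, 0.053, 0.031, 0.010, 0.006]

# F(x) = P(X <= x) is the cumulative sum of P(X = x) up to and including x.
cdf = dict(zip(outcomes, accumulate(probs)))

for x in outcomes:
    print(x, round(cdf[x], 3))
```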

The Binomial Distribution

Consider the dichotomous random variable Y


Y must take on one of two possible values, often referred to as
“failure” and “success”
A random variable of this type is called a Bernoulli random
variable
Example: We are interested in determining whether a newborn
infant in the United States will survive until his or her 70th
birthday
Let Y represent the survival status of the child at age 70 years

Y = 1 if the child survives to age 70 and Y = 0 if he or she does not
These outcomes are mutually exclusive and exhaustive
Suppose that 72% of infants born in the U.S. survive to age 70 years
P(Y = 1) = p = 0.72
P(Y = 0) = 1 − p = 0.28
The probability distribution of Y is

y P(Y = y)
0 0.28
1 0.72

Now suppose that we have two newborn infants
How many will survive until the age of 70?
Let X be a random variable that represents the number of children
out of the two who survive
X can take on three possible values: 0, 1, 2
The survival status of any two children can be assumed to be
independent, and the multiplication law of probability can be
applied

Y1 Y2 X
0 0 0
0 1 1
1 0 1
1 1 2

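\[
\begin{aligned}
P(X = 0) &= P(Y_1 = 0 \text{ and } Y_2 = 0) \\
         &= (1 - p)(1 - p) \\
         &= (0.28)^2 \\
         &= 0.08 \\
P(X = 1) &= P([Y_1 = 1 \text{ and } Y_2 = 0] \text{ or } [Y_1 = 0 \text{ and } Y_2 = 1]) \\
         &= P(Y_1 = 1 \text{ and } Y_2 = 0) + P(Y_1 = 0 \text{ and } Y_2 = 1) \\
         &= p(1 - p) + (1 - p)p \\
         &= 2(0.72)(0.28) \\
         &= 0.40 \\
P(X = 2) &= P(Y_1 = 1 \text{ and } Y_2 = 1) \\
         &= p^2 \\
         &= (0.72)^2 \\
         &= 0.52
\end{aligned}
\]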
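
The same distribution can be built by enumerating the four (Y1, Y2) outcomes and applying the multiplication law for independent events. This is a sketch for illustration, and the name `dist` is an assumption made here.

```python
from itertools import product

p = 0.72  # P(Y = 1): probability that one child survives to age 70

# dist[x] accumulates P(X = x), where X = Y1 + Y2.
dist = {0: 0.0, 1: 0.0, 2: 0.0}

for y1, y2 in product([0, 1], repeat=2):
    # Independence: the joint probability is the product of the marginals.
    joint = (p if y1 == 1 else 1 - p) * (p if y2 == 1 else 1 - p)
    dist[y1 + y2] += joint

for x, prob_x in dist.items():
    print(x, round(prob_x, 2))
```

Replacing `repeat=2` with a larger number (and widening `dist` accordingly) would extend the enumeration to three or more infants.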

The relevant probabilities could also be worked out if we had three
or more newborn infants
The probability distribution of the random variable X, where X
represents the number of infants in the group who survive to age
70, is called a binomial distribution
In general, we have n independent outcomes of a Bernoulli random
variable Y
Each one has a probability of “success” p
The total number of successes X follows a binomial distribution

The fixed numbers n and p are called the parameters of the
binomial distribution
Parameters are numerical quantities that summarize the
characteristics of a probability distribution
For the survival example, n = 2 and p = 0.72
The binomial distribution can be used to describe a variety of
situations such as the number of siblings who will inherit a certain
genetic trait from their parents, the number of surgical patients
who experience a postoperative complication, or the number of
individuals requiring treatment in an emergency room who have
health insurance coverage

Permutations and Combinations
Example: In how many ways can the first 3 letters of the alphabet
— A, B, and C — be ordered?
A B C
A C B
B A C
B C A
C A B
C B A
There are 3 choices for the first position, 2 choices for the second
position, and only 1 choice for the last position
Therefore, the number of ways in which the 3 letters can be
ordered is
3×2×1 = 6
In general, n letters can be ordered in n! (pronounced n factorial)
ways, where

n! = n × (n − 1) × (n − 2) × . . . × 3 × 2 × 1

By definition,

0! = 1

Example: In how many ways can the first 6 letters of the alphabet
— A, B, C, D, E, and F — be ordered?

6! = 6 × 5 × 4 × 3 × 2 × 1
= 720

Example: In how many different ways can 3 letters be selected out
of these 6 when the order of selection matters?
There are 6 choices for the first position, 5 for the second position,
and 4 for the third position

6 × 5 × 4 = 120

In how many ways can k objects be selected out of n when the
order of selection matters?

\[ {}_nP_k = n(n - 1) \times \cdots \times (n - k + 1) \]

This is the number of permutations of n things taken k at a time
What if order does not matter?
In this case, a selection of ABC would be the same as BCA

Note that 3 letters can be ordered in 3! ways
Therefore, the number of ways in which 3 letters can be selected
out of 6 when the order of selection does not matter is

\[ \frac{6 \times 5 \times 4}{3!} = \frac{120}{6} = 20 \]

In how many ways can k objects be selected out of n when the
order of selection does not matter?

\[ {}_nC_k = \binom{n}{k} = \frac{{}_nP_k}{k!} = \frac{n(n - 1) \times \cdots \times (n - k + 1)}{k(k - 1) \times \cdots \times (2)(1)} = \frac{n!}{k!\,(n - k)!} \]

This is the number of combinations of n objects taken k at a time
Special property: If n and k are nonnegative integers and n ≥ k,
then

\[ \binom{n}{k} = \binom{n}{n - k} \]

Note that

\[ \binom{n}{n - k} = \frac{n!}{(n - k)!\,(n - (n - k))!} = \frac{n!}{(n - k)!\,k!} = \frac{n!}{k!\,(n - k)!} = \binom{n}{k} \]
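
The counting results in this part can be reproduced with Python's standard `math` module (`math.perm` and `math.comb` require Python 3.8 or later). The variable names are illustrative.

```python
import math

# Number of ways to order 6 letters: 6!
orderings = math.factorial(6)

# Ordered selections of 3 letters out of 6: 6 * 5 * 4
permutations = math.perm(6, 3)

# Unordered selections of 3 letters out of 6: 6! / (3! * 3!)
combinations = math.comb(6, 3)

# Dividing the permutations by the k! orderings of each selection
# gives the same count as the combination formula.
same_count = permutations // math.factorial(3)

# Symmetry property: choosing k objects is equivalent to leaving out n - k.
symmetric = all(
    math.comb(n, k) == math.comb(n, n - k)
    for n in range(0, 11)
    for k in range(0, n + 1)
)

print(orderings, permutations, combinations, same_count, symmetric)
```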

Return to the binomial distribution
We are given n independent outcomes of a Bernoulli random
variable Y , each with a constant probability of success p
The probability of failure is q = 1 − p
X is the total number of successes
Example: Given 5 newborn infants in the U.S., what is the
probability that exactly 3 of them will survive to age 70?
The probability that it is the first three children who survive is

\[ p \times p \times p \times q \times q = (0.72)^3 (0.28)^2 = 0.0293 \]

In how many ways can 3 children be selected out of the 5?

\[ {}_5C_3 = \frac{5!}{3!\,(5 - 3)!} = \frac{(5)(4)(3)(2)(1)}{(3)(2)(1)(2)(1)} = 10 \]

Therefore, the probability that exactly 3 children will survive is

\[ P(X = 3) = 10 (0.72)^3 (0.28)^2 = 0.293 \]

Given 10 infants, what is the probability that exactly 6 will survive
to age 70?

\[ P(X = 6) = \binom{10}{6} (0.72)^6 (0.28)^{10 - 6} = 0.180 \]
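
Both survival probabilities follow the same pattern: a count of arrangements multiplied by the probability of one arrangement. A small sketch of that calculation, where `binomial_pmf` is a helper name introduced here for illustration:

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_3_of_5 = binomial_pmf(3, 5, 0.72)    # exactly 3 of 5 infants survive
p_6_of_10 = binomial_pmf(6, 10, 0.72)  # exactly 6 of 10 infants survive

print(round(p_3_of_5, 3), round(p_6_of_10, 3))
```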

In general, given n independent trials or outcomes each with
probability of success p, the probability of exactly k successes is

\[ P(X = k) = \binom{n}{k} p^k q^{n - k} \]

where k = 0, 1, ..., n
X is a binomial random variable
The parameters n and p summarize the characteristics of the
distribution
Many statistical software packages can be used to calculate
binomial probabilities
In some situations, binomial tables can be used to evaluate
probabilities as well

Example: Given that p = 0.30, what is the probability of exactly 4
successes in 5 trials?
P(X = 4) = 0.0283
What is the probability of at most 2 successes in 5 trials?
\[ P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.1681 + 0.3602 + 0.3087 = 0.8370 \]

What is the probability of at most 4 successes in 6 trials?

\[
\begin{aligned}
P(X \le 4) &= P(X = 0) + \cdots + P(X = 4) \\
           &= 1 - [P(X = 5) + P(X = 6)] \\
           &= 1 - [0.0102 + 0.0007] \\
           &= 0.9891
\end{aligned}
\]
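
In place of a binomial table, cumulative probabilities can be obtained by summing the probability mass function. This is a sketch only, and the helper names `binomial_pmf` and `binomial_cdf` are assumptions made here. The second calculation uses the complement, as in the example above.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k): sum of the probability mass function from 0 through k."""
    return sum(binomial_pmf(j, n, p) for j in range(k + 1))


p = 0.30
exactly_4_of_5 = binomial_pmf(4, 5, p)
at_most_2_of_5 = binomial_cdf(2, 5, p)

# Complement rule: P(X <= 4) = 1 - [P(X = 5) + P(X = 6)] when n = 6.
at_most_4_of_6 = 1 - (binomial_pmf(5, 6, p) + binomial_pmf(6, 6, p))

print(exactly_4_of_5, at_most_2_of_5, at_most_4_of_6)
```

Values computed this way can differ from table entries in the last decimal place, because tables round each individual probability before summing.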
A binomial distribution can be summarized by a measure of
location and a measure of dispersion
The expected value of a binomial random variable X is

\[ E(X) = \mu = \sum_{i=1}^{m} x_i \, P(X = x_i) = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n - k} = np \]

If repeated samples of size 10 are selected from the population of
infants born in the U.S., the mean number of children per sample
who survive to age 70 would be np = (10)(0.72) = 7.2

The variance of a binomial random variable X is

\[ \mathrm{Var}(X) = \sigma^2 = \sum_{i=1}^{m} (x_i - \mu)^2 \, P(X = x_i) = \sum_{k=0}^{n} (k - np)^2 \binom{n}{k} p^k q^{n - k} = npq \]

Among samples of 10 infants, the variance would be
npq = (10)(0.72)(0.28) = 2.02 and the standard deviation would
be √2.02 = 1.42
Note that the variance npq is largest when p is equal to 0.5, and
smallest when p equals 0 or 1
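
The closed forms np and npq can be checked numerically by evaluating the defining sums directly. This is a sketch for n = 10 and p = 0.72, with variable names chosen here for illustration.

```python
from math import comb, sqrt

n, p = 10, 0.72
q = 1 - p

# Binomial probabilities P(X = k) for k = 0, 1, ..., n.
pmf = [comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)]

# Mean and variance from their definitions as probability-weighted sums.
mean = sum(k * pmf[k] for k in range(n + 1))
variance = sum((k - mean) ** 2 * pmf[k] for k in range(n + 1))
std_dev = sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
print(round(n * p, 2), round(n * p * q, 2))  # closed forms np and npq
```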

Probability distribution of a binomial random variable X with
parameters n = 10 and p = 0.50
[Figure: bar chart of Probability X = x (vertical axis, 0 to .25) against Number of Successes x (horizontal axis, 0 to 10)]

Probability distribution of a binomial random variable X with
parameters n = 10 and p = 0.20
[Figure: bar chart of Probability X = x (vertical axis, 0 to .3) against Number of Successes x (horizontal axis, 0 to 10)]

Probability distribution of a binomial random variable X with
parameters n = 10 and p = 0.80
[Figure: bar chart of Probability X = x (vertical axis, 0 to .3) against Number of Successes x (horizontal axis, 0 to 10)]

Example: In the United States, 11% of individuals have type B
blood
In a randomly selected sample of 10 people, it is found that 2 have
type B blood
What is the probability of this event?
First note that the number of individuals with type B blood is a
binomial random variable with parameters n = 10 and p = 0.11
The expected value of X is np = 10(0.11) = 1.1

 
\[ P(X = 2) = \binom{10}{2} (0.11)^2 (1 - 0.11)^{10 - 2} = \frac{10!}{2!\,8!} (0.11)^2 (0.89)^8 = 0.214 \]

What is the probability that 4 out of 10 people in a sample have
type B blood?

\[ P(X = 4) = \frac{10!}{4!\,6!} (0.11)^4 (0.89)^6 = 0.015 \]

Is observing 4 individuals with type B blood in a group of 10 an
unusual occurrence?

One approach to answering this question is to calculate the
probability that X is equal to 4 or anything “more extreme”, or
P(X ≥ 4)

\[ P(X \ge 4) = \sum_{k=4}^{10} P(X = k) = 1 - \sum_{k=0}^{3} P(X = k) = 0.018 \]

Observing 4 or more people with type B blood in a sample of size
10 is fairly unusual; it happens only about 1.8% of the time
By convention, something that happens less than 5% of the time is
considered to be “unusual”

Is it unusual to observe 2 individuals with type B blood in a sample
of size 10?

\[ P(X \ge 2) = \sum_{k=2}^{10} P(X = k) = 1 - \sum_{k=0}^{1} P(X = k) = 0.303 \]

Since 2 or more people with type B blood would be observed
approximately 30% of the time, this event would not be considered
unusual
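
Both "as or more extreme" probabilities can be computed with the complement rule. This is a sketch under the stated binomial model (n = 10, p = 0.11). The helper names and the `threshold` variable, which encodes the 5% convention, are illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """P(X >= k), computed as 1 - P(X <= k - 1)."""
    return 1 - sum(binomial_pmf(j, n, p) for j in range(k))


n, p = 10, 0.11
threshold = 0.05  # conventional cutoff for calling an event "unusual"

p_at_least_4 = upper_tail(4, n, p)
p_at_least_2 = upper_tail(2, n, p)

print(round(p_at_least_4, 3), p_at_least_4 < threshold)
print(round(p_at_least_2, 3), p_at_least_2 < threshold)
```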

The Poisson Distribution

The Poisson distribution is a discrete probability distribution
used to model the number of occurrences of an event that takes
place infrequently in time or space
Consider a random variable X representing the number of
occurrences of an event over a given interval of time
Example: We are interested in tracking the number of cases of
tetanus reported in the United States in a given month
In theory, X is a count that can assume any integer value greater
than or equal to 0

The probability mass function of X is

\[ P(X = k) = \frac{e^{-\mu} \mu^k}{k!} \]

where k = 0, 1, 2, ... and µ = λt
k is a potential outcome of X
µ is the parameter of the Poisson distribution
The constant λ (lambda) represents the rate at which the event
occurs, or the expected number of events per unit time
e is the base of the natural logarithm; it is a constant approximated
by 2.71828

The probability that a new case of tetanus will occur over some
small subinterval of time — such as a minute — is very tiny
Represent the time interval of interest by t (e.g., one month) and
the length of a subinterval by ∆t (e.g., one minute)
Three assumptions must be made in order for the Poisson
distribution to apply:
1) The probability that a single event occurs within a given small
subinterval is proportional to the length of the subinterval
P(event) ≈ λ ∆t for constant λ
The probability that 0 events occur in the subinterval ≈ 1 − λ∆t

The probability of observing more than one event in a single
subinterval is essentially 0
Therefore, within each subinterval, the event either occurs or it
does not occur
2) The rate at which the event occurs is constant over the entire
interval t
3) Events occurring in consecutive subintervals are independent of
each other
If these assumptions hold, then the random variable X — which
represents the number of events that occur in the interval t —
follows a Poisson distribution
Example: In the United States, cases of tetanus are reported at a
rate of λ = 4.5/month

λ is the expected number of events per unit time
The parameter µ is the expected number of events in the interval t
Since t = 1 month,

µ = λt = (4.5/month)(1 month) = 4.5

If t = 2 months, then µ = 9
The Poisson distribution can be used to model the number of
ambulances needed in a city in a given night, the number of
particles emitted from a specified amount of radioactive material,
or the number of suicides occurring in a large city in a given week

What is the probability that 0 cases of tetanus will be reported in a
given month?

\[ P(X = 0) = \frac{e^{-4.5} (4.5)^0}{0!} = \frac{e^{-4.5} (1)}{(1)} = 0.011 \]

What is the probability that 1 case of tetanus will be reported?

\[ P(X = 1) = \frac{e^{-4.5} (4.5)^1}{1!} = \frac{e^{-4.5} (4.5)}{(1)} = 0.050 \]
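
These two values can be reproduced directly from the Poisson probability mass function. The helper `poisson_pmf` and the variables `rate_per_month` and `months` are names introduced here for illustration. µ is formed as λt, as in the tetanus example.

```python
from math import exp, factorial


def poisson_pmf(k, mu):
    """P(X = k) for a Poisson random variable with parameter mu."""
    return exp(-mu) * mu**k / factorial(k)


rate_per_month = 4.5         # lambda: reported tetanus cases per month
months = 1                   # t: length of the interval of interest
mu = rate_per_month * months

p_zero_cases = poisson_pmf(0, mu)
p_one_case = poisson_pmf(1, mu)

print(round(p_zero_cases, 3), round(p_one_case, 3))
```

Setting `months = 2` gives µ = 9, matching the two-month case mentioned earlier.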

Probability distribution of a Poisson random variable X with
parameter µ = 4.5
[Figure: bar chart of Probability X = k (vertical axis, 0 to .2) against Number of Events k (horizontal axis, 0 to 13)]

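The expected value of a Poisson random variable X is

\[ E(X) = \sum_{i=1}^{m} x_i \, P(X = x_i) = \sum_{k=0}^{\infty} k \, \frac{e^{-\mu} \mu^k}{k!} = \mu \]

The variance of X is

\[ \mathrm{Var}(X) = \sum_{i=1}^{m} (x_i - \mu)^2 \, P(X = x_i) = \sum_{k=0}^{\infty} (k - \mu)^2 \, \frac{e^{-\mu} \mu^k}{k!} = \mu \]

They are the same!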
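
A numerical check of this result for µ = 4.5 is sketched below. The infinite sums are truncated at k = 59, an arbitrary cutoff chosen here because the remaining terms are negligibly small for this value of µ. The check illustrates the result but does not prove it.

```python
from math import exp, factorial

mu = 4.5
max_k = 60  # truncation point for the infinite sums (an assumption)

# Poisson probabilities P(X = k) for k = 0, 1, ..., max_k - 1.
pmf = [exp(-mu) * mu**k / factorial(k) for k in range(max_k)]

# Mean and variance from their definitions as probability-weighted sums.
mean = sum(k * pmf[k] for k in range(max_k))
variance = sum((k - mean) ** 2 * pmf[k] for k in range(max_k))

print(round(mean, 6), round(variance, 6))
```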
