Discrete Probability Distributions

Lecture 5
Some useful discrete probability distributions
CH2010 Engineering Statistics PL2023v6 1

The Bernoulli Process
• We take a sample of size one (a trial), which has only two possible outcomes, e.g.
  • Head or tail
  • Boy or girl
  • Pass or fail
• We repeat this many times.
• The probability of an outcome (head or tail) in each trial is constant, and
independent of the outcomes of the other trials.

Example 5.1
A product has a defective rate of 25%. If we sample three products, what is the
probability that exactly two of them are defective?
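Answer: tabulating the sample space, with X the number of defectives
(D = defective, N = non-defective), we have

Outcome   NNN  NDN  NND  DNN  NDD  DND  DDN  DDD
X          0    1    1    1    2    2    2    3

For example, if X = 2, there are three outcomes (NDD, DND, DDN), each with
probability (1/4)(1/4)(3/4) = 3/64, so
$$P(X = 2) = 3 \times \frac{3}{64} = \frac{9}{64}$$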

Example 5.1 (cont.)
Applying this to all possible values of x (viz. 0, 1, 2, 3), we obtain the probability
distribution function of X.

x      0      1      2     3
f(x)   27/64  27/64  9/64  1/64

This is a binomial distribution. The individual trials are Bernoulli experiments (trials).
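The tabulation in Example 5.1 can be reproduced by brute force. The following is a minimal sketch, assuming independent items with a defective probability of 1/4; the function name `enumerate_distribution` is illustrative and not part of the lecture.

```python
from fractions import Fraction
from itertools import product

p_defective = Fraction(1, 4)  # defective rate from Example 5.1

def enumerate_distribution(n_items, p):
    """Return {x: P(X = x)} by listing every outcome in the sample space."""
    dist = {}
    for outcome in product("ND", repeat=n_items):  # e.g. ('N', 'D', 'N')
        x = outcome.count("D")                      # number of defectives
        prob = Fraction(1)
        for item in outcome:                        # independent trials multiply
            prob *= p if item == "D" else 1 - p
        dist[x] = dist.get(x, Fraction(0)) + prob
    return dist

f = enumerate_distribution(3, p_defective)
for x in sorted(f):
    print(x, f[x])
```

Exact fractions are used so that the result can be compared directly with the f(x) row of the table.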
5.1 Binomial distribution
• Discrete probability distribution
• We take n Bernoulli trials (independent, binary results)
• Probability of a positive outcome in each trial is p.
• What is the probability of x positive outcomes?
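• Written as: $b(x; n, p)$
• Take Example 5.1: $P(X = 2) = b(2; 3, 1/4)$

To generalise,
$$b(x; n, p) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \dots, n, \quad q = 1 - p$$
In the case of Example 5.1,
$$b(2; 3, 1/4) = \binom{3}{2} \left(\frac{1}{4}\right)^2 \left(\frac{3}{4}\right)^1 = \frac{9}{64}$$
This is much faster than “counting sample points”.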

The name comes from the fact that the b(x;n,p) values correspond to the terms in the
binomial expansion of $(q + p)^n$, where q = 1 − p.
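As a quick check of this correspondence, the expansion for n = 3 (the setting of Example 5.1) can be written out term by term; the last line substitutes p = 1/4 and q = 3/4 and recovers the tabulated f(x) values, which sum to 1.

```latex
\begin{align*}
(q + p)^3 &= q^3 + 3q^2 p + 3 q p^2 + p^3 \\
          &= b(0; 3, p) + b(1; 3, p) + b(2; 3, p) + b(3; 3, p) \\
          &= \tfrac{27}{64} + \tfrac{27}{64} + \tfrac{9}{64} + \tfrac{1}{64} = 1
             \qquad \text{for } p = \tfrac{1}{4},\ q = \tfrac{3}{4}.
\end{align*}
```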
The cumulative distribution can be computed as:
$$B(r; n, p) = \sum_{x=0}^{r} b(x; n, p)$$
Computing B(r;n,p) manually may be tedious. The values of B(r;n,p) are given in
Table A.1.
The mean and variance of the binomial distribution b(x;n,p) are:
$$\mu = np \quad \text{and} \quad \sigma^2 = npq$$
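The following sketch evaluates b(x;n,p) and B(r;n,p) directly and checks the standard results μ = np and σ² = npq by summing over the distribution. The helper names `binom_pmf` and `binom_cdf` are illustrative, and the parameters are those of Example 5.1.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(r, n, p):
    """B(r; n, p): probability of at most r successes."""
    return sum(binom_pmf(x, n, p) for x in range(r + 1))

n, p = 3, 0.25  # Example 5.1
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
variance = sum((x - mean) ** 2 * binom_pmf(x, n, p) for x in range(n + 1))

print(binom_pmf(2, n, p))            # probability of exactly two defectives
print(mean, n * p)                   # summed mean vs. np
print(variance, n * p * (1 - p))     # summed variance vs. npq
```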
Applications of binomial distribution
• Engineering
  • Quality control of product lines: pass or fail
  • Mechanical strength test: yield or not yield
• Pharmaceutical
  • Effectiveness of treatment: cure or no cure
• Military
  • Hit or miss (of guided missile)
• What other applications can you think of?

Example 5.2
The probability that a patient recovers from a rare blood disease is 0.4. If 15 people
are known to have contracted this disease, what is the probability that (a) at least 10
recover, (b) from 3 to 8 recover, and (c) exactly 5 recover?
Answer
X = the random variable representing the number of people who recover.
n = 15, p = 0.4.
(a) Probability that at least 10 recover = P(X ≥ 10).
Look up Table A.1:
$$P(X \ge 10) = 1 - P(X < 10) = 1 - B(9; 15, 0.4) = 1 - 0.9662 = 0.0338$$

Example 5.2 (cont.)
(b) Probability that from 3 to 8 recover = P(3 ≤ X ≤ 8).
Look up Table A.1:
$$P(3 \le X \le 8) = B(8; 15, 0.4) - B(2; 15, 0.4) = 0.9050 - 0.0271 = 0.8779$$

Example 5.2 (cont.)
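(c) Probability that exactly 5 recover = P(X = 5).
Look up Table A.1:
$$P(X = 5) = B(5; 15, 0.4) - B(4; 15, 0.4) = 0.4032 - 0.2173 = 0.1859$$
Alternatively,
$$b(5; 15, 0.4) = \binom{15}{5} (0.4)^5 (0.6)^{10} = 0.1859$$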
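The three parts of Example 5.2 can also be evaluated without the table. This is a minimal sketch that sums the binomial terms directly; `binom_pmf` is an illustrative helper name. Because table entries are rounded to four decimals, a result obtained by subtracting two table values may differ from the directly computed one in the last digit.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) for exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 15, 0.4  # 15 patients, recovery probability 0.4

p_at_least_10 = sum(binom_pmf(x, n, p) for x in range(10, n + 1))  # (a) P(X >= 10)
p_3_to_8 = sum(binom_pmf(x, n, p) for x in range(3, 9))            # (b) P(3 <= X <= 8)
p_exactly_5 = binom_pmf(5, n, p)                                   # (c) P(X = 5)

print(p_at_least_10, p_3_to_8, p_exactly_5)
```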

5.2 Hypergeometric distribution
Now, we are sampling from a finite pool of items.
Each item only has two possible outcomes: pass and fail.
We do not put the sampled item back into the pool, nor do we replenish the pool with
new replacements.
As we sample, the number of items left in the pool decreases.
Accordingly, the probability of “pass” for the next draw depends on all previous
outcomes.
This is a hypergeometric experiment.
An example hypergeometric experiment is drawing cards consecutively from a deck
of cards without putting the drawn cards back. Here the outcome is either a red card
or a black card.
Example 5.3
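We draw 5 cards from a deck of 52 playing cards (26 reds and 26 blacks). What is
the probability of observing 3 red cards?
Recall what we’ve learnt in week 2.
Probability of getting 3 reds from a draw of 5:
$$P(3 \text{ reds}) = \frac{\binom{26}{3} \binom{26}{2}}{\binom{52}{5}} = \frac{2600 \times 325}{2598960} \approx 0.3251$$
• $\binom{26}{3}$: there are 26 reds in total; pick 3 out of 26 reds.
• $\binom{26}{2}$: there are 26 blacks in total; pick 2 out of 26 blacks.
• $\binom{52}{5}$: pick 5 out of a deck of 52.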

• Following Example 5.3, we can also compute the probability of getting 0, 1, 2, 4
or 5 reds.
• The probability distribution of getting x reds is a hypergeometric distribution.
• The probability distribution of getting x reds in a draw of 5 cards from a deck of
52 cards containing 26 reds can be denoted h(x; 52, 5, 26).
• To generalise, the probability distribution of getting x successes in a sample of n
units from a pool of N units containing k successes is h(x; N, n, k).
• The hypergeometric distribution is relevant to the quality control of a batch-wise
production process.
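As an illustration of the notation h(x; N, n, k), the sketch below computes the whole distribution of red cards in a 5-card draw by counting combinations, as in Example 5.3. The function name `hypergeom_pmf` is illustrative.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): x successes in a sample of n drawn without replacement
    from a pool of N items that contains k successes."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Number of reds in 5 cards drawn from a 52-card deck with 26 reds.
reds = {x: hypergeom_pmf(x, 52, 5, 26) for x in range(6)}
for x, prob in reds.items():
    print(x, round(prob, 4))
```

Because the deck holds equally many reds and blacks, the distribution is symmetric: the probability of x reds equals that of 5 − x reds.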

Example 5.4
• An injection device is sold in packs of 10.
• It is not acceptable to have more than one defective device in a pack.
• The quality control manager wants to confirm the acceptability of each pack by
testing 3 devices in every pack of 10.
• If all three are good, the pack is accepted.
• Is this a good quality control plan?
Answer
Let’s consider: if the pack is unacceptable, e.g. 2 out of 10 are defective, what is the
probability that the sampling plan does not identify this?

Example 5.4 (cont.)
In other words, what is the probability of picking 0 defectives and 3 non-defectives
from a pack containing 2 defectives and 8 non-defectives? Let X be the random
variable describing the number of defectives sampled.
$$P(X = 0) = \frac{\binom{2}{0} \binom{8}{3}}{\binom{10}{3}} = \frac{56}{120} \approx 0.467$$
Thus, there is a 47% chance that the sampling plan fails to detect any defective
item even if the pack has 2 defectives.
Therefore, this sampling plan is inadequate.
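The calculation in Example 5.4 can be sketched as follows. The function name `miss_probability` is hypothetical, and the loop over larger sample sizes is an illustrative what-if that goes beyond the example itself.

```python
from fractions import Fraction
from math import comb

def miss_probability(pack_size, defectives, sample_size):
    """P(no defective appears in the sample) when sampling without replacement."""
    good = pack_size - defectives
    return Fraction(comb(good, sample_size), comb(pack_size, sample_size))

p_miss = miss_probability(10, 2, 3)  # the plan in Example 5.4
print(p_miss, float(p_miss))

# Illustrative what-if (not part of the example): test more devices per pack.
for sample_size in range(3, 9):
    print(sample_size, float(miss_probability(10, 2, sample_size)))
```

The chance of missing both defectives falls as more devices are tested, which is the trade-off a sampling plan has to balance against testing cost.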

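Generalising the problem in Examples 5.3 and 5.4 above, we have:
$$h(x; N, n, k) = \frac{\binom{k}{x} \binom{N-k}{n-x}}{\binom{N}{n}}, \quad \max\{0, n - (N - k)\} \le x \le \min\{n, k\}$$
The mean and variance of h(x; N, n, k) are:
$$\mu = \frac{nk}{N} \quad \text{and} \quad \sigma^2 = \frac{N - n}{N - 1} \cdot n \cdot \frac{k}{N} \left(1 - \frac{k}{N}\right)$$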

Hypergeometric vs binomial
• In a binomial (Bernoulli) experiment, the probability of a success is constant and
independent of previous results.
• In a hypergeometric experiment, the probability of a success depends on the number
of successes in previous draws.
• In a hypergeometric experiment, if the pool (N) is sufficiently large compared with
the sample size (n), then the probability of a success may be approximated as constant
and independent of previous results.
• In that case the hypergeometric distribution can be approximated by a
binomial distribution.
• This approximation is good for n/N ≤ 0.05 (i.e. when the population is at least 20
times the sample size).

Hypergeometric vs binomial
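If the approximation holds, k/N → p, which is independent of n. Then:
$$\mu = \frac{nk}{N} = np$$
(same as the binomial case)
$$\sigma^2 = \frac{N - n}{N - 1} \cdot n \cdot \frac{k}{N} \left(1 - \frac{k}{N}\right) \approx npq$$
(same as the binomial case, since (N − n)/(N − 1) ≈ 1 when n is much smaller than N)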
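To see how the quality of the binomial approximation depends on n/N, the sketch below compares the two distributions for a large pool and a small pool. The pool sizes and success counts are hypothetical values chosen only for illustration, and `max_gap` is an illustrative helper.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k), sampling without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """b(x; n, p), constant success probability."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def max_gap(N, n, k):
    """Largest |h(x; N, n, k) - b(x; n, k/N)| over the possible values of x."""
    p = k / N
    lo, hi = max(0, n - (N - k)), min(n, k)
    return max(abs(hypergeom_pmf(x, N, n, k) - binom_pmf(x, n, p))
               for x in range(lo, hi + 1))

large_pool = max_gap(N=1000, n=20, k=250)  # n/N = 0.02, within the rule of thumb
small_pool = max_gap(N=40, n=20, k=10)     # n/N = 0.5, far outside it
print(large_pool, small_pool)
```

Both cases share k/N = 0.25, so any difference between the two gaps comes from the size of the pool relative to the sample.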

5.3 Negative binomial distribution
When we perform Bernoulli trials in a binomial experiment, we are interested in the
number of successes in n trials.
In a negative binomial experiment, we are interested in the number of Bernoulli trials
required to reach k successes.
The negative binomial distribution, b*(x; k, p), describes the probability of
requiring a total of x trials to reach k successes, when the probability of
success in each trial is p.
This can be computed by first considering the probability of having (k − 1)
successes in (x − 1) trials, then multiplying by p.

Example 5.5
• A drug is considered 60% effective in bringing some degree of relief to a patient.
• During a clinical trial, what is the probability that the 7th patient tested is the 5th
who experiences relief?
Answer
To rephrase the situation above, it takes exactly 7 trials to have 5 successes.
To rephrase again, the first 6 trials contain 4 successes (regardless of order), and the
7th trial is a success.

Example 5.5 (cont.)
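It takes 6 trials to have 4 successes (and 2 failures, regardless of order), and the
7th is a success.
$$b^*(7; 5, 0.6) = \binom{6}{4} \times (0.6)^4 (0.4)^2 \times 0.6 = 0.1866$$
• $\binom{6}{4}$: number of ways to have 4 successes and 2 failures in 6 trials
• $(0.6)^4 (0.4)^2$: probability of 4 successes and 2 failures in a specific sequence
• $0.6$: probability that the 7th trial is a success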
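The two-step reasoning of Example 5.5 can be sketched in code: first the probability of 4 successes in the first 6 trials, then multiplication by the probability of success on the 7th. The function name `neg_binom_pmf` is illustrative.

```python
from math import comb

def neg_binom_pmf(x, k, p):
    """b*(x; k, p): probability that the k-th success occurs on trial x."""
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

p = 0.6  # probability that a patient experiences relief

# Step 1: 4 successes (and 2 failures) somewhere in the first 6 trials.
four_in_six = comb(6, 4) * p**4 * (1 - p)**2
# Step 2: the 7th trial itself is a success.
two_step = four_in_six * p

direct = neg_binom_pmf(7, 5, p)
print(two_step, direct)
```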

5.3 Negative binomial distribution
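To generalise, the probability that the kth success is reached on the xth trial, when
each trial has a success probability of p, is given by the negative binomial distribution:
$$b^*(x; k, p) = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \dots, \quad q = 1 - p$$
• $\binom{x-1}{k-1}$: number of different ways that the first x − 1 trials had k − 1 successes
• $p^k$: probability of k successes
• $q^{x-k}$: probability of x − k failures
The mean and variance of the negative binomial distribution can be calculated
by:
$$\mu = \frac{k}{p} \quad \text{and} \quad \sigma^2 = \frac{k(1 - p)}{p^2}$$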
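The standard results μ = k/p and σ² = k(1 − p)/p² for the number of trials can be checked numerically. This is a rough sketch that truncates the infinite sum at an arbitrary cut-off (an assumption of the sketch); the parameters are those of Example 5.5.

```python
from math import comb

def neg_binom_pmf(x, k, p):
    """b*(x; k, p): probability that the k-th success occurs on trial x."""
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

k, p = 5, 0.6
xs = range(k, 400)  # arbitrary cut-off; the tail beyond it is negligible here

total = sum(neg_binom_pmf(x, k, p) for x in xs)
mean = sum(x * neg_binom_pmf(x, k, p) for x in xs)
variance = sum((x - mean) ** 2 * neg_binom_pmf(x, k, p) for x in xs)

print(total)                          # should be very close to 1
print(mean, k / p)                    # summed mean vs. k/p
print(variance, k * (1 - p) / p**2)   # summed variance vs. k(1 - p)/p^2
```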

5.4 Geometric distribution
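A special case of the negative binomial distribution is when k = 1,
i.e. what is the probability that the first success occurs on the xth trial?
In this case, the negative binomial distribution
$$b^*(x; 1, p) = p q^{x-1}, \quad x = 1, 2, 3, \dots$$
can be described by the geometric distribution, $g(x; p) = p q^{x-1}$,
with mean and variance:
$$\mu = \frac{1}{p} \quad \text{and} \quad \sigma^2 = \frac{1 - p}{p^2}$$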
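A short sketch of the geometric distribution follows. The success probability p = 0.2 and the horizon of 5 trials are hypothetical values for illustration; the sketch also uses the fact that the probability of at least one success within m trials is 1 − (1 − p)^m.

```python
def geom_pmf(x, p):
    """g(x; p): probability that the first success occurs on trial x."""
    return p * (1 - p) ** (x - 1)

def prob_success_within(m, p):
    """Probability that the first success occurs on or before trial m."""
    return sum(geom_pmf(x, p) for x in range(1, m + 1))

p = 0.2  # hypothetical success probability per trial
m = 5    # hypothetical number of trials we can afford

within = prob_success_within(m, p)
closed_form = 1 - (1 - p) ** m  # complement of m failures in a row

# Mean by truncated summation; the cut-off of 2000 trials is arbitrary.
mean = sum(x * geom_pmf(x, p) for x in range(1, 2001))

print(within, closed_form)
print(mean, 1 / p)
```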

Application of geometric distribution
• The question “how long does it take to reach a success?” is very important when
costing exploratory work, e.g.
  • Drilling in a new oil field
  • Ignition test of a new engine design
  • Discovery of the first effective drug formulation
  • Finding a job
• Understanding the underlying distribution function helps us to evaluate and
optimise the effectiveness of the work plan.
• Otherwise, one might invest too much time and too many resources!

5.5 Poisson distribution
If the random variable X is the number of outcomes over a given time interval or a
specific region, then the process yielding X is a Poisson process.
Examples:
• No. of phone calls received in 60 minutes.
• No. of bacteria found in a culture.
• No. of typos on a page of an essay.
• No. of questions answered in one exam (is this really a random process?)

5.5 Poisson distribution
• The probability that positive outcomes will occur in a Poisson process is
proportional to the size of the test.
• This is reflected in the mean number (i.e. the mathematical expectation) of
outcomes in a Poisson process:
$$\mu = \lambda t$$
where t represents the “time”, “distance”, “area” or “volume” of interest and λ is the
coefficient of proportionality.
• The variance of the Poisson random variable is also λt.

5.5 Poisson distribution
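• The probability distribution of the Poisson random variable X, taking the value x, is:
$$p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}, \quad x = 0, 1, 2, \dots$$
where λ represents the average number of outcomes per unit time, distance, area, or
volume. This is also known as the Poisson distribution.
Table A.2 tabulates the cumulative Poisson distribution:
$$P(r; \lambda t) = \sum_{x=0}^{r} p(x; \lambda t)$$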

Example 5.6
During a laboratory experiment, the average number of radioactive particles passing
through a detector in 1 ms is 4. What is the probability that 6 particles enter the
counter in a given millisecond?
Solution
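Average = 4: λt = 4.
$$p(6; 4) = \frac{e^{-4} \, 4^6}{6!} = 0.1042$$
Alternatively, use Table A.2 to find the probability that 6 particles enter:
$$p(6; 4) = P(6; 4) - P(5; 4) = 0.8893 - 0.7851 = 0.1042$$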
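Both routes in Example 5.6 (the direct formula and the difference of two cumulative values) can be sketched as below. The helper names `poisson_pmf` and `poisson_cdf` are illustrative; the cumulative sum stands in for a lookup in Table A.2.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """p(x; mu): probability of exactly x outcomes when the mean is mu."""
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(r, mu):
    """P(r; mu): probability of at most r outcomes (what Table A.2 tabulates)."""
    return sum(poisson_pmf(x, mu) for x in range(r + 1))

mu = 4  # average number of particles per millisecond

direct = poisson_pmf(6, mu)
from_cumulative = poisson_cdf(6, mu) - poisson_cdf(5, mu)
print(direct, from_cumulative)
```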

Nature of the Poisson probability function
The distribution becomes more symmetric as μ increases, and is almost symmetric when μ is as large as 5.

Binomial distribution vs Poisson distribution
• In a binomial/Bernoulli experiment with many failures and only a few
successes, i.e. large n and small p, the sampling sequence can be
approximated as a continuum (of almost all failures), analogous to the
continuous space or time in a Poisson process.
• Therefore, the Poisson distribution can be considered a limiting case of the binomial
distribution, with μ = np.
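The limiting behaviour can be observed numerically by holding np fixed while n grows. In the sketch below, the mean of 2 and the list of n values are hypothetical choices, and the comparison is limited to x = 0 to 10.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    """p(x; mu)."""
    return exp(-mu) * mu**x / factorial(x)

mu = 2.0  # hypothetical fixed mean, so p = mu / n shrinks as n grows
gaps = {}
for n in (10, 100, 1000, 10000):
    p = mu / n
    # Largest pointwise difference between the two distributions for x = 0..10.
    gaps[n] = max(abs(binom_pmf(x, n, p) - poisson_pmf(x, mu)) for x in range(11))
    print(n, p, gaps[n])
```

The gap shrinks as n increases and p decreases, in line with the idea of the Poisson distribution as the large-n, small-p limit.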

Example 5.8
In a glass-blowing factory, defects (bubbles) occur occasionally, making the piece
undesirable.
On average, 1 in every 1000 items produced has one or more bubbles.
What is the probability that a random sample of 8000 will yield fewer than 7
items possessing bubbles?

Example 5.8 (cont.)
Solution
This is essentially a binomial experiment with n = 8000 and p = 0.001. Since p is
very close to 0 and n is quite large, we can approximate with the Poisson
distribution using:
$$\mu = np = 8000 \times 0.001 = 8$$
Hence, if X represents the number of items with one or more bubbles, we have:
$$P(X < 7) = \sum_{x=0}^{6} b(x; 8000, 0.001) \approx \sum_{x=0}^{6} p(x; 8) = 0.3134$$
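For Example 5.8, the exact binomial sum is tedious by hand but easy to evaluate in code, which gives a way to judge the Poisson approximation. This is a minimal sketch with illustrative helper names.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    """p(x; mu)."""
    return exp(-mu) * mu**x / factorial(x)

n, p = 8000, 0.001
mu = n * p  # mean of the approximating Poisson distribution

exact = sum(binom_pmf(x, n, p) for x in range(7))      # P(X < 7), binomial
approx = sum(poisson_pmf(x, mu) for x in range(7))     # P(X < 7), Poisson
print(exact, approx)
```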

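Summary
Binomial distribution (Table A.1)
No. of successes in n independent Bernoulli experiments
Mean: $\mu = np$; Variance: $\sigma^2 = npq$
Hypergeometric distribution
No. of successes in n successive trials without sample replacement
Mean: $\mu = \frac{nk}{N}$; Variance: $\sigma^2 = \frac{N - n}{N - 1} \cdot n \cdot \frac{k}{N} \left(1 - \frac{k}{N}\right)$
Negative binomial distribution
No. of Bernoulli trials required to reach k successes
Mean: $\mu = \frac{k}{p}$; Variance: $\sigma^2 = \frac{k(1 - p)}{p^2}$
Geometric distribution
No. of Bernoulli trials required to reach the first success
Mean: $\mu = \frac{1}{p}$; Variance: $\sigma^2 = \frac{1 - p}{p^2}$
Poisson distribution (Table A.2)
Probability of success is proportional to the sample size
Mean: $\mu = \lambda t$; Variance: $\sigma^2 = \lambda t$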
