
Faculty of Information Science & Technology

(FIST)

PSM 0325
Introduction to Probability and Statistics

Foundation in Life Science


Foundation in Information Technology

ONLINE NOTES

Topic 4
Probability Distributions

FIST, MULTIMEDIA UNIVERSITY (436821-T)


MELAKA CAMPUS, JALAN AYER KEROH LAMA, 75450 MELAKA, MALAYSIA.
URL: http://fist2.mmu.edu.my

TOPIC 4
PROBABILITY DISTRIBUTIONS

Reference:
Assliza Salim et al., Introduction to Probability and Statistics, Pearson, 2011.

Objectives:
1. To calculate probabilities for the binomial, Poisson and normal distributions by using
the formulas.
2. To find the mean and variance of the binomial and Poisson distributions.
3. To calculate probabilities for the binomial, Poisson and normal distributions by using
the statistical tables.

Contents:
1. Introduction
2. Binomial distribution
3. Poisson Distribution
4. Normal Distribution
5. Standard Normal Distribution

INTRODUCTION

Probability distributions can be grouped into families because the observations generated
by many different statistical experiments show the same general type of behavior. In this
chapter, students will study two discrete probability distributions and one continuous
probability distribution.

4.1 BINOMIAL DISTRIBUTION

An experiment often consists of repeated trials, each with two possible outcomes that
may be labeled success or failure. The most obvious application deals with the testing
of items as they come off an assembly line, where each test or trial may indicate a
defective or a non-defective item. We may choose to define either outcome as a
success.

One may consider cards drawn in succession from an ordinary deck and each trial is
labeled a success or a failure, depending on whether the card is a heart or not a heart.
If each card is replaced and the deck shuffled before the next drawing, the two
experiments just described have similar properties, in that the repeated trials are
independent and the probability of a success remains constant from trial to trial. The
process is referred to as a Bernoulli process. Each trial is called a Bernoulli trial.

Observe in the card-drawing example that the probabilities of a success for the
repeated trials change if the cards are not replaced. That is, the probability of selecting


a heart on the first draw is 1/4, but on the second draw it is a conditional probability
having a value of 13/51 or 12/51, depending on whether a heart appeared on the first
draw: this, then, would no longer be considered a set of Bernoulli trials.

The Bernoulli Process

Strictly speaking, the Bernoulli process must possess the following properties:

1. The experiment consists of n repeated trials.


2. Each trial results in an outcome that may be classified as a success (s) or a failure
(f).
3. The probability of success, denoted by p, remains constant from trial to trial.
4. The repeated trials are independent.

Example 4.4:
Suppose an experiment consists of two repetitions of a Bernoulli trial with the
probability of success equal to 1/3 and the probability of failure equal to 2/3. What is
the probability that exactly one success is obtained in the two trials?

Solution:
We assume that the probabilities of the two possible outcomes (s and f) on the second
trial are independent of the outcome (s or f) on the first trial. In other words, the
second trial is identical (in terms of probabilities) to the first trial. Accordingly, each
fork of the tree has two branches, one leading to s, which has probability 1/3, and
another leading to f, which has probability 2/3.

The outcomes of the experiment which are of interest to us are sf and fs. We conclude
that

Pr[exactly one success] = Pr[{sf, fs}] = 2/9 + 2/9 = 4/9
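
The same answer can be checked by brute force. The Python sketch below is our own illustration (not part of the original notes); it enumerates the four outcomes of the two trials and adds the probabilities of those with exactly one success.

from itertools import product
from fractions import Fraction

p = {"s": Fraction(1, 3), "f": Fraction(2, 3)}   # success and failure probabilities

total = Fraction(0)
for outcome in product("sf", repeat=2):          # ss, sf, fs, ff
    prob = p[outcome[0]] * p[outcome[1]]         # trials are independent
    if outcome.count("s") == 1:                  # exactly one success
        total += prob

print(total)    # 4/9, as obtained above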

Example 4.5:

A fair die is rolled twice and the result is noted each time. On a single roll of the die,
the result is said to be a success if a 1 or a 6 comes up, and the result is said to be a
failure otherwise. Since the die is fair,

Pr[s] = Pr[{1, 6}] = 1/3 and Pr[f] = Pr[{2, 3, 4, 5}] = 2/3

Two rolls of the die can be modeled as two repetitions of a Bernoulli trial.

Throughout this section we will denote the probability of success in a single Bernoulli
trial by p and the probability of failure by q = 1 - p. In general we will be interested in


computing the probability of obtaining a certain number of successes (say, r) in a given
number of trials (say, n). We begin with a special case. Writing each outcome of the
experiment as a string of s's and f's, typical outcomes are

ff...f (n failures),   ss...s (n successes),   and   ss...s ff...f (k successes followed by n - k failures).

Those outcomes of the experiment that have r successes and n - r failures are
represented by elements in the sample space in which the letter s appears r times and
the letter f appears n - r times. Note that on the tree diagram the path corresponding to
such an outcome must have r branches with probability p (the successes) and n - r
branches with the probability q (the failures). Since the probability of the outcome is
the product of the conditional probabilities on the branches, the probability of this
outcome is p^r q^(n - r). Finally, we need only compute the number of such outcomes to
find the probability of r successes in n trials.

To determine the number of outcomes with exactly r successes, we note that each
such outcome corresponds to a selection of r positions from n in which to place the
letter s. For example, two selections are

ff...f ss...s (n - r letters f followed by r letters s)   and   ss...s ff...f (r letters s followed by n - r letters f)

and in general there are many more. Thus to count the number of ways to have r
successes in n trials we simply count the number of ways to select r objects from a set
of n. We recognize this number to be C(n, r). Combining these facts, we have the
formula:

The probability distribution of the binomial random variable X, the number of
successes obtained in a Bernoulli process consisting of n trials with success
probability p and failure probability q = 1 - p, is

b(r; n, p) = P(X = r) = C(n, r) p^r q^(n - r)        (4.1)
for r = 0, 1, 2,…, n.

where n = total number of trials


p = probability of success
q = 1 - p = probability of failure
r = number of successes in n trials
n - r = number of failures in n trials

We illustrate Equation (4.1) by applying it to the experiment described in Example
4.4. It consists of two Bernoulli trials with success probability p = 1/3. Therefore n =
2, p = 1/3, q = 2/3. The problem asks for the probability of exactly 1 success, so r = 1
in Equation (4.1). We have

b(1; 2, 1/3) = P(X = 1) = C(2, 1) (1/3)^1 (2/3)^1 = [2!/(1! 1!)] (1/3) (2/3) = 4/9
as we obtained in Example 4.4.
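
Equation (4.1) is also easy to evaluate in software. The short Python sketch below is our own illustration (the helper name binomial_pmf is not from the notes); it reproduces the value above and can be reused for the next example.

from math import comb

def binomial_pmf(r, n, p):
    # Probability of exactly r successes in n Bernoulli trials: C(n, r) p^r q^(n - r)
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

print(binomial_pmf(1, 2, 1/3))                       # 0.4444... = 4/9 (Example 4.4)
print([binomial_pmf(r, 4, 1/4) for r in range(5)])   # 81/256, 27/64, 27/128, 3/64, 1/256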

Example 4.6:
Consider a binomial experiment consisting of n = 4 Bernoulli trials with p = 1/4 and
q = 3/4. Use Equation (4.1) to find the probabilities of 0, 1, 2, 3, and 4 successes.

Solution:
Here we have n = 4, p = 1/4 and q = 3/4. Using Equation (4.1), we find

Pr[X = 0] = C(4, 0) (1/4)^0 (3/4)^4 = 81/256
Pr[X = 1] = C(4, 1) (1/4)^1 (3/4)^3 = 27/64
Pr[X = 2] = C(4, 2) (1/4)^2 (3/4)^2 = 27/128
Pr[X = 3] = C(4, 3) (1/4)^3 (3/4)^1 = 3/64
Pr[X = 4] = C(4, 4) (1/4)^4 (3/4)^0 = 1/256

We note that in Example 4.6 the events 0 success, 1 success, 2 successes, 3 successes,
and 4 successes form a partition of the sample space S. Thus the sum of the
probabilities for these events should be 1.
Indeed, we have

Pr[X = 0] + Pr[X = 1] + Pr[X = 2] + Pr[X = 3] + Pr[X = 4]


= 81/256 + 27/64 + 27/128 + 3/64 + 1/256
= 1

In fact, in any binomial experiment with n trials the events 0 success, 1 success, 2
successes, …, n successes form a partition of the sample space, and the sum of their
probabilities must be 1.

In any binomial experiment with n Bernoulli trials,


Σ_{r=0}^{n} Pr[X = r] = 1        (4.2)

Equation (4.2) can frequently be used to reduce the amount of computation needed to
solve a problem.
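
As a small illustration of this shortcut (ours, not from the notes), the probability of at least one success in n = 4 trials with p = 1/4 is found from Equation (4.2) as 1 - P(X = 0), instead of adding the four terms P(X = 1) through P(X = 4):

from math import comb

def binomial_pmf(r, n, p):
    # C(n, r) p^r (1 - p)^(n - r)
    return comb(n, r) * p**r * (1 - p)**(n - r)

at_least_one = 1 - binomial_pmf(0, 4, 1/4)   # 1 - 81/256
print(at_least_one)                          # 0.68359375 = 175/256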

Using the Table of Binomial Probabilities


The probabilities for a binomial experiment can also be read from Table 1, pages 4 to
23. That table lists the cumulative probabilities that x is less than or equal to r, for
n = 1 to n = 20 and for selected values of p (from 0.01 to 0.50):

P(X ≤ r) = Σ_{t=0}^{r} C(n, t) p^t (1 - p)^(n - t)

Key Formulas:
(i) P(x = a) = P(x ≤ a) - P(x ≤ a - 1)
(ii) P(x ≥ a) = 1 - P(x ≤ a - 1)
(iii) P(x > a) = 1 - P(x ≤ a)
(iv) P(a ≤ x ≤ b) = P(x ≤ b) - P(x ≤ a - 1)
(v) P(a < x < b) = P(x ≤ b - 1) - P(x ≤ a)

Example 4.8:
Let p = 0.25, n = 6. Find the probability that:
(a) exactly three
(b) at most two
(c) at least three
(d) one to three
are successes.

Solution:
Refer to Table 1, page 6, for n = 6 and p = 0.25.

(a) P(x = 3) = P(x ≤ 3) - P(x ≤ 2) = 0.9624 - 0.8306 = 0.1318

(b) P(x ≤ 2) = 0.8306

(c) P(x ≥ 3) = 1 - P(x < 3) = 1 - P(x ≤ 2) = 1 - 0.8306 = 0.1694

(d) P(1 ≤ x ≤ 3) = P(x ≤ 3) - P(x < 1) = 0.9624 - 0.1780 = 0.7844
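
The table lookups in Example 4.8 can be reproduced by computing the cumulative probability P(X ≤ r) directly. The sketch below is our own illustration; the helper name binomial_cdf is not from the notes.

from math import comb

def binomial_cdf(r, n, p):
    # P(X <= r) = sum over t = 0..r of C(n, t) p^t (1 - p)^(n - t)
    return sum(comb(n, t) * p**t * (1 - p)**(n - t) for t in range(r + 1))

n, p = 6, 0.25
print(round(binomial_cdf(3, n, p) - binomial_cdf(2, n, p), 4))   # (a) 0.1318
print(round(binomial_cdf(2, n, p), 4))                           # (b) 0.8306
print(round(1 - binomial_cdf(2, n, p), 4))                       # (c) 0.1694
print(round(binomial_cdf(3, n, p) - binomial_cdf(0, n, p), 4))   # (d) 0.7844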

If p > 0.50, the table must be entered with q = 1 - p. Let x = number of successes and
y = number of failures, so that x = n - y. Using q = 1 - p:


Key Formulas:
(vi) P(x = a) = P(y = n - a) = P(y ≤ n - a) - P(y ≤ n - a - 1)
(vii) P(x ≤ a) = P(y ≥ n - a)
(viii) P(x > a) = P(y ≤ n - a - 1)
(ix) P(a ≤ x ≤ b) = P(n - b ≤ y ≤ n - a)
    = P(y ≤ n - a) - P(y ≤ n - b - 1)
(x) P(a < x < b) = P(n - b < y < n - a)
    = P(y ≤ n - a - 1) - P(y ≤ n - b)

Example 4.9:
Let p = 0.63, n = 8. Find the probability that:
(a) exactly three
(b) at most two
(c) at least three
(d) one to three
are successes.

Solution:
Refer to Table 1, page 7, for n = 8 and p = 1 - 0.63 = 0.37.

(a) P(x = 3) = P(y = 5) = P(y ≤ 5) - P(y ≤ 4) = 0.9664 - 0.8693 = 0.0971

(b) P(x ≤ 2) = P(y ≥ 6) = 1 - P(y ≤ 5) = 1 - 0.9664 = 0.0336

(c) P(x ≥ 3) = P(y ≤ 5) = 0.9664

(d) P(1 ≤ x ≤ 3) = P(5 ≤ y ≤ 7) = P(y ≤ 7) - P(y ≤ 4) = 0.9996 - 0.8693 = 0.1303
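
Because Table 1 stops at p = 0.50, Example 4.9 was worked through the number of failures. The sketch below is our own check, reusing the binomial_cdf helper sketched earlier; it confirms that the direct computation with p = 0.63 and the complement computation with q = 0.37 agree.

from math import comb

def binomial_cdf(r, n, p):
    # P(X <= r) = sum over t = 0..r of C(n, t) p^t (1 - p)^(n - t)
    return sum(comb(n, t) * p**t * (1 - p)**(n - t) for t in range(r + 1))

n, p = 8, 0.63
direct    = binomial_cdf(3, n, p) - binomial_cdf(2, n, p)          # P(x = 3) with p = 0.63
via_table = binomial_cdf(5, n, 1 - p) - binomial_cdf(4, n, 1 - p)  # P(y = 5) with q = 0.37
print(round(direct, 4), round(via_table, 4))                       # both 0.0971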

Since the probability distribution of any binomial random variable depends only on
the values assumed by the parameters n, p, and q, it would seem reasonable to assume
that the mean and variance of a binomial random variable also depend on the values
assumed by these parameters. Indeed, this is true, and the result below gives general
formulas, as functions of n, p and q, that can be used to compute the mean and
variance of any binomial random variable.

The mean and standard deviation for a binomial distribution are

μ = np  and  σ = √(npq)        (4.3)

Example 4.10:
Using Example 4.9, find the mean and variance of this distribution.


Solution:

n = 8, p = 0.63, q = 1 - 0.63 = 0.37

Therefore,

μ = np = (8)(0.63) = 5.04  and  σ² = npq = (8)(0.63)(0.37) = 1.8648

4.2 POISSON DISTRIBUTION

Experiments yielding numerical values of a random variable X, the number of


outcomes occurring during a given time interval or in a specified region, are called
Poisson experiments. The given time interval may be of any length, such as a
minute, a day, a week, a month, or even a year. Hence a Poisson experiment can
generate observations for the random variable X representing the number of
telephone calls per hour received by an office, the number of days a school is closed
due to snow during the winter, or the number of games postponed due to rain during a
baseball season. The specified region could be a line segment, an area, a volume, or
perhaps a piece of material. In such instances X might represent the number of field
mice per acre, the number of bacteria in a given culture, or the number of typing
errors per page.

A Poisson experiment is derived from the Poisson process and possesses the
following properties:

1. The number of outcomes occurring in one time interval or specified region is


independent of the number that occurs in any other disjoint time interval or region of
space. In this way we say that the Poisson process has no memory.

2. The probability that a single outcome will occur during a very short time interval
or in a small region is proportional to the length of the time interval or the size of
the region and does not depend on the number of outcomes occurring outside this
time interval or region.

3. The probability that more than one outcome will occur in such a short time
interval or fall in such a small region is negligible.

The number X of outcomes occurring during a Poisson experiment is called a Poisson


random variable, and its probability distribution is called Poisson distribution. The
mean number of outcomes is computed from μ = λt, where t is the specific "time" or
"region" of interest. Since its probabilities depend on λ, the rate of occurrence of
outcomes, we shall denote them by the symbol p(x; λt). The derivation of the formula
for p(x; λt), based on the three properties of a Poisson process listed above, is beyond


the scope of this book. The following concept is used for computing Poisson
probabilities.

The probability distribution of the Poisson random variable X, representing


the number of outcomes occurring in a given time interval or specified region
denoted by t, is
p(x; λt) = P(X = x) = e^(-λt) (λt)^x / x!,   x = 0, 1, 2, …        (4.10)

where λ (lambda) is the rate of occurrence, λt is the mean number of occurrences in
the given time interval or region, and the value of e ≈ 2.71828.

Example 4.11:
The mean number of bacteria per milliliter of a liquid is known to be 4. Assuming
that the number of bacteria follows a Poisson distribution, find the probability that, in
1 ml of liquid, there will be
(a) no bacteria
(b) 4 bacteria
(c) less than three bacteria.

Solution:
Let X = number of bacteria in 1 ml of liquid, with λt = 4.

(a) P(X = 0) = (4^0 e^(-4)) / 0! = e^(-4) = 0.0183

(b) P(X = 4) = (4^4 e^(-4)) / 4! = 0.1954

(c) P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)
            = e^(-4) + 4 e^(-4) + (4^2 e^(-4)) / 2!
            = 0.2381

Using the Table of Poisson Probabilities


The probabilities for a Poisson distribution can also be read from Table 2, pages 24
to 32. That table lists the cumulative probabilities P(X ≤ r) for values of the mean λt
from 0.0 to 20:

P(X ≤ r) = Σ_{k=0}^{r} e^(-λt) (λt)^k / k!

Key Formulas:
(a) P(x = a) = P(x ≤ a) - P(x ≤ a - 1)
(b) P(x ≥ a) = 1 - P(x ≤ a - 1)
(c) P(x > a) = 1 - P(x ≤ a)
(d) P(a ≤ x ≤ b) = P(x ≤ b) - P(x ≤ a - 1)
(e) P(a < x < b) = P(x ≤ b - 1) - P(x ≤ a)

Example 4.12:
On average, two new accounts are opened per day at a BBMB Branch. Using Table 2,
find the probability that on a given day the number of new accounts opened at this
bank will be
(a) exactly 6
(b) at most 3
(c) at least 7

Solution:
Let X = number of new accounts opened per day, with λt = 2.

(a) P(x = 6) = P(x ≤ 6) - P(x ≤ 5) = 0.9955 - 0.9834 = 0.0121

(b) P(x ≤ 3) = 0.8571

(c) P(x ≥ 7) = 1 - P(x ≤ 6) = 1 - 0.9955 = 0.0045
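
A cumulative Poisson helper (our own sketch; the name poisson_cdf is not from the notes) reproduces the same answers without the printed table.

from math import exp, factorial

def poisson_cdf(r, mean):
    # P(X <= r) = sum over k = 0..r of e^(-mean) mean^k / k!
    return sum(exp(-mean) * mean**k / factorial(k) for k in range(r + 1))

mean = 2   # λt = 2 new accounts per day
print(round(poisson_cdf(6, mean) - poisson_cdf(5, mean), 4))   # (a) 0.012 exactly; the rounded table entries give 0.0121
print(round(poisson_cdf(3, mean), 4))                          # (b) 0.8571
print(round(1 - poisson_cdf(6, mean), 4))                      # (c) 0.0045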

Like the binomial distribution, the Poisson distribution is used for quality control,
quality assurance, and acceptance sampling. In addition, certain important continuous
distributions used in reliability theory and queuing theory depend on the Poisson
process.

The mean and variance of the Poisson distribution p(x; λt) both have the value λt:

μ = λt  and  σ² = λt        (4.11)

4.3 THE NORMAL DISTRIBUTION


The normal distribution is the most important continuous distribution in statistics. Many
measured quantities in the natural sciences follow a normal distribution, for example
weights, masses, ages, random errors and examination results.

The normal distribution has the following properties:
1. The normal graph is perfectly symmetrical, unimodal and bell shaped.
2. The mean, median and mode are all equal and are located at the center of the
distribution.
3. Because of the symmetry of the curve, a perpendicular dropped from the peak cuts
the curve into two equal halves. Therefore, 50% of the area lies to the left and 50% to
the right of the central perpendicular. The area under the curve always equals 1. On
the normal distribution curve, the frequency of the variable is represented on the
vertical axis and the value of the variable on the horizontal axis.
4. Both tails of the curve extend infinitely and approach the horizontal axis.

The Standard Normal Distribution


The standard normal distribution is a special case of the normal distribution. For the
standard normal distribution, the value of the mean is equal to zero, and the value of
the standard deviation is equal to 1.
The units for the standard normal distribution curve are denoted by z and are called the z
values or z scores. A z score indicates how far, in standard deviations, a particular
measurement lies to the right or to the left of the mean. The z score is obtained by using the
formula
z = (x - μ) / σ,   where μ = mean and σ = standard deviation.

Using Table Of Normal Probabilities


Refer to Table 4. The probabilities for a continuous distribution are areas under the curve
of its probability distribution. For the normal distribution, these areas are obtained from
the table of the standard normal distribution: the z score is calculated and looked up in the
table to give the area, or probability, under the curve to the left of (less than or equal to)
that z score.

Φ(z) = P(Z ≤ z) = area to the left of z

Note: The z score is rounded to two decimal places before entering the table.
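
In practice the table lookup can also be done in software. The sketch below is our own illustration (the mean 50, standard deviation 10 and x = 63 are made-up numbers); it converts x to a z score and evaluates Φ(z) with Python's standard-library NormalDist.

from statistics import NormalDist

mu, sigma = 50, 10                 # hypothetical mean and standard deviation
x = 63
z = (x - mu) / sigma               # z = 1.3
area_left = NormalDist().cdf(z)    # Φ(1.3), area to the left of z
print(round(z, 2), round(area_left, 4))   # 1.3 0.9032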

Converting An X Value To A Z-Value


For a normal random variable x, a particular value of x can be converted to its


corresponding z value by using the formula
z = (x - μ) / σ

where μ and σ are the mean and standard deviation of the normal distribution of x.

Determining The Z And X Values When An Area Under The Normal Distribution
Curve Is Known
We now consider how to use the standard normal table in reverse, that is, to find the z
value when we know the area (probability) under the standard normal curve. The z value
is then converted to an x value by using the formula x = zσ + μ.
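
The reverse lookup can likewise be done in software. In the sketch below (our own illustration; the area 0.95 and the values μ = 50, σ = 10 are made up), inv_cdf returns the z value whose area to the left is the given probability, and x = zσ + μ converts it back to the original scale.

from statistics import NormalDist

mu, sigma = 50, 10
area_left = 0.95
z = NormalDist().inv_cdf(area_left)   # z ≈ 1.6449
x = z * sigma + mu                    # x ≈ 66.45
print(round(z, 4), round(x, 2))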

Example 4.13

Solution:

Example 4.14


Example 4.15

Solution:


Example 4.16

Solution:

-----------------------------------------End of Topic 4------------------------------------------------

