6 SOME PROBABILITY DISTRIBUTIONS

Probability distributions play an important role in the application of Statistics. They are used
to model the behavior of many variables of interest. The variables that are measured in most
scientific studies, whose values occur by chance, are referred to as “random variables”. When used in
a statistical analysis, these random variables are assumed to follow a probability distribution.

6.1. RANDOM VARIABLE


A random variable is a function whose value is a real number determined by each
element in the sample space. Capital letters like X, Y, or Z are used to denote a random
variable. At times, elements of a sample space are not expressed as numbers. The use of
random variables provides a convenient way of expressing elements of a sample space as
numbers.

Example 6.1 A coin is tossed 3 times. Let X be the random variable denoting the number of
heads. The set of all possible outcomes is given in the first row of the table below. The
second row gives the corresponding values of the random variable X.

Outcomes            HHH  HHT  HTH  THH  TTH  THT  HTT  TTT
X = no. of heads     3    2    2    2    1    1    1    0

The random variable of interest, which we denote by the symbol X, represents the number
of heads that come up in three tosses of a coin. The outcome HHH can be numerically
expressed as $x = 3$ (the number of heads in the outcome HHH). Similarly, the outcomes
HHT, HTH, and THH all correspond to the same value of x, which is 2. Thus, the eight
nonnumeric outcomes in three tosses of a coin are converted to the numerical
values 0, 1, 2 and 3 with the use of a random variable.
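
The mapping in Example 6.1 can also be written out mechanically. The short Python sketch below is only an illustration; the names `outcomes` and `X` are chosen here and are not part of the text. It lists the eight outcomes and attaches to each one its number of heads.

```python
from itertools import product

# All 2**3 = 8 outcomes of three tosses, written as strings such as "HHT".
outcomes = ["".join(t) for t in product("HT", repeat=3)]

# The random variable X maps each outcome to its number of heads.
X = {outcome: outcome.count("H") for outcome in outcomes}

for outcome, x in X.items():
    print(outcome, x)
```

Writing X as a dictionary makes the point of the definition concrete: a random variable is a function from the sample space to the real numbers.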

Example 6.2 A pair of dice is rolled. Let Y be the random variable denoting the absolute
difference of the points on the upturned faces of the dice, $Y = |y_1 - y_2|$. The possible
values of Y are 0, 1, 2, 3, 4 and 5. The outcomes and the corresponding values of Y are
summarized as follows:

Y = |y1 - y2|   Corresponding outcomes
0               (1,1), (2,2), (3,3), (4,4), (5,5), (6,6)
1               (1,2), (2,1), (2,3), (3,2), (3,4), (4,3), (4,5), (5,4), (5,6), (6,5)
2               (1,3), (3,1), (2,4), (4,2), (3,5), (5,3), (4,6), (6,4)
3               (1,4), (4,1), (2,5), (5,2), (3,6), (6,3)
4               (1,5), (5,1), (2,6), (6,2)
5               (1,6), (6,1)

In the above example, writing the 36 sample points where each sample point is an
ordered pair is simplified by making use of the random variable representing the absolute
difference of the upturned faces of the two dice.
Random variables are generally classified according to the values that they assume.
A discrete random variable is a random variable that assumes only a finite (or countably
infinite) number of values, most frequently integers. The values of these random variables
are sometimes called count data.
Some examples of discrete random variables are:
• number of hearts drawn from a deck of cards;
• number of heads in 3 tosses of a fair coin;
• number of persons in a city objecting to a new ordinance;
• number of barangays that voted for the opposition candidates.
Not all random variables take on nonnegative integer values like the ones given in the
examples above. When we measure the height (in centimeters) of a person, or the weight (in
kilograms) of an individual, decimal values can be used. A continuous random variable is a
random variable that can assume any value between two points on a continuous scale. The
values of these random variables are sometimes called measured data.
Some examples of continuous random variables are:
• weight of a person;
• height of a person;
• percentage of persons in a city objecting to a new ordinance;
• weekly expenses of an average Filipino family.

6.2 PROBABILITY DISTRIBUTION OF A RANDOM VARIABLE


When making estimates about unknown population parameters, numerical values
that are computed only from the sample are usually used. These computed values, which
are called statistics, are assumed to be values of a random variable. A different value of the
statistic results when a different sample is taken, even though the same formula for its
computation is used. For every value that a statistic takes in a particular sample, a
corresponding probability can be computed. This leads us to the idea of a probability
distribution of a random variable.
Probability distributions can be viewed as ordered pairs in which the first element is
the value of the random variable and the second element is the associated probability.
Probability distributions provide a way of measuring the accuracy of estimates. Since
estimates are computed only from sample information, errors are bound to be committed,
and probability distributions give us a tool for gauging how large these errors are
likely to be.
The sample mean is a statistic, which takes on different values when different
samples are obtained from a population. The sample mean is an example of a random
variable. Most of the statistics computed from sample data are considered as random
variables and they are associated with some probability distributions.

6.2.1 Discrete Probability Distribution


When a random variable is discrete, its corresponding probability distribution is
called a discrete probability distribution. A discrete probability distribution is an equation, or
a table that lists all possible values that a discrete random variable can take on together
with the associated probabilities.
Example 6.3 From Example 6.2, where the random variable Y is the absolute difference of
the upturned faces of the two dice, a value of $Y = 4$ corresponds to four sample points,
namely (1,5), (5,1), (2,6) and (6,2). Since there are 36 sample points, a value of $Y = 4$ is
assigned a probability of 4/36. For the other values of Y, the corresponding probabilities
are summarized in the probability distribution below.

y            0       1       2       3       4       5
P(Y = y)    6/36   10/36    8/36    6/36    4/36    2/36

Notice that if we sum the probabilities in the second row of the table above, the total
is equal to 1. This is one of the important properties of a probability distribution.
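
The counts in the table above can be checked by brute-force enumeration. The sketch below is an illustration only, with variable names chosen here: it lists the 36 ordered pairs, groups them by absolute difference, and confirms that the probabilities add up to 1. Exact fractions are used so that no rounding enters the check.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 36 equally likely ordered pairs (y1, y2) from rolling two dice.
pairs = list(product(range(1, 7), repeat=2))

# Count how many pairs give each absolute difference |y1 - y2|.
counts = Counter(abs(y1 - y2) for y1, y2 in pairs)

# Probability of each value = (number of favourable pairs) / 36.
pmf = {y: Fraction(counts[y], len(pairs)) for y in sorted(counts)}

for y in pmf:
    print(y, f"{counts[y]}/36")

print("total:", sum(pmf.values()))
```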

Example 6.4 From Example 6.1, the random variable of interest X is the number of heads
when a coin is tossed thrice. The possible values of X are 0, 1, 2 and 3. The probability
distribution of X can be written in tabular form as:

x            0     1     2     3
P(X = x)    1/8   3/8   3/8   1/8

It can also be written in equation form as follows:

$$
P(X = x) =
\begin{cases}
1/8, & \text{if } x = 0 \\
3/8, & \text{if } x = 1 \\
3/8, & \text{if } x = 2 \\
1/8, & \text{if } x = 3
\end{cases}
\tag{6.1}
$$

The graph of the probability distribution of a discrete random variable is a set of
isolated points in a Cartesian plane. However, its appearance can be enhanced by using a
histogram like the one shown in Figure 6.1. Corresponding to each value of the random
variable is a rectangle with a shaded area. The shaded area of each rectangle represents a
probability.
To draw the histogram, the values of the random variable are used as the midpoints of
the rectangles and are plotted on the horizontal scale. The height of each rectangle is the
corresponding probability and is drawn on the vertical scale. The graph of a probability
distribution is like a relative frequency histogram.
Figure 6.1 Probability Distribution of Example 6.4

6.2.2 Continuous Probability Distribution


When a random variable is continuous, it takes any of the infinite number of values
in an interval. Thus, there is no meaningful interpretation for assigning a positive probability
for a specific value. For example, if the random variable of interest X, is the height of
sophomore ladies in MSU-IIT, then X can possibly take on values on the interval 55 to 75
inches.
Unlike in the discrete case, the graph of a continuous probability distribution is not
represented by a histogram. The graph of a continuous probability distribution is a smooth
curve like the one that is drawn below (Figure 6.2).

Figure 6.2 $P(a < X < b)$ = area of the shaded region

To find the probability that a random variable X is contained in the interval (a, b), the
area of the shaded portion below the curve, above the x-axis and between the lines erected
at a and b is determined. This method of finding the probability is very similar to finding
the shaded area of a rectangle in the discrete case, except that the upper boundary is now a
curve rather than a straight line. The exact area under a continuous curve is not easy to
compute. However, for some commonly used continuous probability distributions,
probability tables are available to facilitate the computation of the probability of an interval.

Remark: There are two general rules which the values of all probability distributions must obey: First,
since the values of a probability distribution are probabilities, they must be numbers on the
interval from 0 to 1. Second, since a random variable has to take on one of its values, the
sum of all the values of a probability distribution (or, in the continuous case, the total area
under its curve) must be equal to 1.
6.3 SOME PROBABILITY DISTRIBUTIONS


Consider the probability distribution that is defined by the following equation:

$$
f(x) =
\begin{cases}
\dfrac{x}{6}, & \text{if } x = 1, 2, 3 \\
0, & \text{elsewhere}
\end{cases}
\tag{6.2}
$$
The function f(x), as defined above, is a simple example of a probability distribution.
It has two important characteristics that qualify it as a probability distribution:
• the values (range) of the function are non-negative;
• the sum of the values of f(x) is equal to 1.
For the probability distribution defined in equation (6.2), the values of the function
are 1/6, 2/6 and 3/6, which are all positive. The sum of these values is
$\tfrac{1}{6} + \tfrac{2}{6} + \tfrac{3}{6} = 1$.
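
The two characteristics can be checked mechanically for any candidate function with a finite set of possible values. The following sketch does so for the function in equation (6.2); the helper names are chosen here for illustration only.

```python
from fractions import Fraction

def f(x):
    """Candidate distribution f(x) = x/6 for x = 1, 2, 3 and 0 elsewhere."""
    return Fraction(x, 6) if x in (1, 2, 3) else Fraction(0)

support = [1, 2, 3]
values = [f(x) for x in support]

# Characteristic 1: no value of the function is negative.
non_negative = all(v >= 0 for v in values)
# Characteristic 2: the values add up to exactly 1.
sums_to_one = sum(values) == 1

print(non_negative, sums_to_one)
```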
Anybody can formulate a probability distribution, and many can be constructed in this way.
However, not many of them are of practical use. This chapter presents some of the
probability distributions that have been found useful in modeling many real-life
situations.

6.3.1 Binomial Distribution: A Discrete Distribution


The Binomial Distribution is an example of a discrete probability distribution which is
often encountered in practice. Some statistical problems involve repeated trials, where an
experimenter is interested in whether a particular trial resulted in the outcome that he/she
desires. In the terminology of games of chance, an outcome of interest is called a “success”;
otherwise it is called a “failure”.
A trial may be a toss of a coin, where the outcome is a success if it turns up a
head, or a failure if it turns up a tail. The trial can be a vote of a citizen in an election.
A vote for a candidate of your choice is labeled a success; any other choice is labeled a
failure. The term “success” should not be taken literally. It is just a label for one of two
possible mutually exclusive outcomes. There are many such types of trials that can be
repeated any number of times and whose results can be dichotomized into two mutually
exclusive outcomes. These trials are called binomial trials and the whole process is called a
binomial experiment.
A binomial experiment possesses the following characteristics:
• The experiment consists of n repeated trials.
• Each trial results in one of two mutually exclusive outcomes that may be
  classified as either a “success” or a “failure”.
• The probability of a success in one trial, denoted by p, remains constant from
  trial to trial.
• The repeated trials are independent.
Out of a maximum of n possible trials, an experimenter is usually interested in the
number of trials that resulted in a “success”. For example, the variable of interest is the
number of heads in n tosses of a coin; the number of correct guesses in a true-or-false or
multiple-choice examination; or the number of voters in Barangay San Miguel who voted for
the opposition in the last election.
In these trials, the random variable of interest is the number of successes. Let us
denote this random variable by the symbol X. Clearly, the possible values of X are the
integers from 0 to n inclusive. Let the probability of success in one trial be denoted by
p. For a binomial experiment, this probability does not change from trial to trial. The
probability of failure is just the complement $q = 1 - p$. The random variable X is a binomial
random variable and its probability distribution, called the binomial probability distribution, is
described mathematically by the following equation:

$$
P(X = x) = b(x; n, p) =
\begin{cases}
{}_nC_x \, p^x q^{n-x} = \dfrac{n!}{x!\,(n-x)!} \, p^x q^{n-x}, & \text{for } x = 0, 1, 2, \ldots, n \\
0, & \text{otherwise}
\end{cases}
\tag{6.3}
$$

where n = number of trials;
p = probability of success;
$q = 1 - p$ = probability of failure;
x = number of successes.
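
Equation (6.3) translates almost directly into code. The function below is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the name `binom_pmf` and the values of n and p are illustrative choices made here. As a sanity check, it also confirms that the probabilities over x = 0, 1, ..., n add up to 1.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    if not 0 <= x <= n:
        return 0.0          # the "otherwise" branch of equation (6.3)
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# The probabilities over x = 0, 1, ..., n should add up to 1.
n, p = 5, 1 / 6             # illustrative values
total = sum(binom_pmf(x, n, p) for x in range(n + 1))
print(round(total, 10))
```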

Example 6.5 Find the probability of obtaining exactly three 2's if an ordinary die is tossed 5
times.
Solution: Let X be the random variable which represents the number of 2's that come up in
5 tosses of an ordinary die. Let p be the probability that the number 2 on the face of the die
will occur in one trial. Thus, $p = \tfrac{1}{6}$ and $q = 1 - \tfrac{1}{6} = \tfrac{5}{6}$. Hence, the probability of
getting exactly three 2's, with $n = 5$ and $x = 3$, is

$$P(X = 3) = b(3; 5, \tfrac{1}{6}) = {}_5C_3 \left(\tfrac{1}{6}\right)^3 \left(\tfrac{5}{6}\right)^2 = 0.03215$$

Example 6.6 Find the probability of getting at least 4 heads in 6 tosses of a fair coin.
Solution: Let X be the number of heads and let p be the probability of getting a head in
one toss. Thus, $p = \tfrac{1}{2}$ and $q = 1 - \tfrac{1}{2} = \tfrac{1}{2}$. With $n = 6$ and $x = 4$, 5 and 6, the
probability of getting at least 4 heads is:

$$
\begin{aligned}
P(X \ge 4) &= P(X = 4) + P(X = 5) + P(X = 6) \\
&= b(4; 6, \tfrac{1}{2}) + b(5; 6, \tfrac{1}{2}) + b(6; 6, \tfrac{1}{2}) \\
&= {}_6C_4 \left(\tfrac{1}{2}\right)^4 \left(\tfrac{1}{2}\right)^{6-4} + {}_6C_5 \left(\tfrac{1}{2}\right)^5 \left(\tfrac{1}{2}\right)^{6-5} + {}_6C_6 \left(\tfrac{1}{2}\right)^6 \left(\tfrac{1}{2}\right)^{6-6} \\
&= 0.234375 + 0.09375 + 0.015625 \\
&= 0.34375.
\end{aligned}
$$

Example 6.7 A multiple-choice quiz has 10 questions, each with four possible answers of
which only one is the correct answer. What is the probability that sheer guesswork would
yield at most 1 correct answer?

Solution: Let X be the number of correct answers and let p be the probability of getting a
correct answer. Thus, $p = \tfrac{1}{4}$ and $q = 1 - \tfrac{1}{4} = \tfrac{3}{4}$. Hence, the probability of
getting at most 1 correct answer, with $n = 10$ and $x = 0$ and 1, is:

$$
\begin{aligned}
P(X \le 1) &= P(X = 0) + P(X = 1) = b(0; 10, \tfrac{1}{4}) + b(1; 10, \tfrac{1}{4}) \\
&= {}_{10}C_0 \left(\tfrac{1}{4}\right)^0 \left(\tfrac{3}{4}\right)^{10-0} + {}_{10}C_1 \left(\tfrac{1}{4}\right)^1 \left(\tfrac{3}{4}\right)^{10-1} \\
&= 0.05631 + 0.18771 \\
&= 0.24402
\end{aligned}
$$
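
"At least" and "at most" probabilities such as those in Examples 6.6 and 6.7 are sums of individual binomial terms, which is easy to express in code. The sketch below is one possible way to do it; `binom_pmf` is a helper name introduced here, not notation from the text. Because the text rounds each term before adding, the last printed digit may differ slightly from the hand computation.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) for an integer x between 0 and n."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example 6.6: at least 4 heads in 6 tosses of a fair coin.
at_least_4 = sum(binom_pmf(x, 6, 0.5) for x in range(4, 7))

# Example 6.7: at most 1 correct guess out of 10 four-choice questions.
at_most_1 = sum(binom_pmf(x, 10, 0.25) for x in range(0, 2))

print(round(at_least_4, 5), round(at_most_1, 5))
```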

Example 6.8 If 20% of the bolts produced by a machine are defective, determine the
probability that out of 4 bolts chosen at random,
a.) exactly 3 are defective.
b.) at least 3 are defective.
c.) 2 bolts are non-defective.
d.) at most 1 is non-defective.
Solution: Let X be the number of defective bolts drawn out of $n = 4$ bolts,
p = probability of getting a defective bolt = 20% = 0.20 and
q = probability of getting a non-defective bolt = $1 - p = 1 - 0.20 = 0.80$.

a.) The probability that exactly 3 bolts are defective is

$$P(X = 3) = b(3; 4, 0.2) = {}_4C_3 (0.2)^3 (0.8)^{4-3} = 0.0256$$

b.) and c.) are left as exercises for the students.

d.) The event of choosing “at most 1 non-defective” means that 0 or 1 non-defective
bolt is chosen. In this case, the random variable Y is the number of non-defective
items chosen, and has probabilities of success and failure as follows:
$p = 80\% = 0.8$ and $q = 1 - p = 1 - 0.8 = 0.2$. Hence,

$$
\begin{aligned}
P(Y \le 1) &= P(Y = 1) + P(Y = 0) = b(1; 4, 0.8) + b(0; 4, 0.8) \\
&= {}_4C_1 (0.8)^1 (0.2)^{4-1} + {}_4C_0 (0.8)^0 (0.2)^{4-0} \\
&= 0.0256 + 0.0016 \\
&= 0.0272
\end{aligned}
$$

Remark: The mean, variance, and standard deviation of a binomially distributed random variable are
given by $\mu = np$, $\sigma^2 = npq$ and $\sigma = \sqrt{npq}$, respectively.
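
These formulas can be compared with the mean and variance computed directly from the probability distribution, that is, from the sums of $x \, P(X = x)$ and $(x - \mu)^2 P(X = x)$. The sketch below does this for the bolt setting of Example 6.8 (n = 4, p = 0.2); the helper name `binom_pmf` is an illustrative choice.

```python
from math import comb, isclose, sqrt

def binom_pmf(x, n, p):
    """b(x; n, p) for an integer x between 0 and n."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 4, 0.2               # the bolt setting of Example 6.8
q = 1 - p

# Mean and variance computed directly from the distribution.
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
var = sum((x - mean) ** 2 * binom_pmf(x, n, p) for x in range(n + 1))

# Both comparisons should print True if the shortcut formulas agree.
print(isclose(mean, n * p), isclose(var, n * p * q))
print(sqrt(var))
```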

Example 6.9 The probability that a patient recovers from a rare blood disease is 0.4. If 15
people are known to have contracted this disease, find the mean and standard deviation of
the number of recoveries among the 15 patients.
Solution: Let X be the number of patients who recover. The probability p of recovery is 0.40
and the probability of non-recovery is 0.60. With $n = 15$, the mean and standard
deviation of X are given by:

$$\mu = (15)(0.40) = 6 \quad \text{and} \quad \sigma = \sqrt{(15)(0.40)(0.60)} = \sqrt{3.6} = 1.897.$$


6.3.2 Normal Distribution: A Continuous Distribution


A continuous random variable and its associated probability distribution arise
when data are defined over a continuous sample space. Any non-negative function defined
over an interval can serve as a continuous probability distribution, or probability density
function, provided it meets one extra requirement: the total area under the curve must be
equal to 1. Recall that for the discrete case, the extra requirement is that
the sum of all values of the function is 1.
There are various types of continuous distributions, but among the most important, if
not the most important, is the normal distribution. The normal distribution is sometimes
called the Gaussian distribution, in honor of Carl Friedrich Gauss (1777-1855), one of the
mathematicians who derived its equation.
The graph of a normal distribution is a bell-shaped curve that extends asymptotically
to the horizontal axis in both directions. In many practical applications, extending the tails of
the normal distribution very far is no longer necessary because the area below the curve
lying more than 4 or 5 standard deviations from the mean is negligible. The normal curve is
shown in Figure 6.3.


Figure 6.3 The Normal Curve

The mathematical equation of the density function of the normal variable depends
on the parameters $\mu$ and $\sigma^2$, its mean and variance, respectively. The distribution is
denoted by $N(\mu, \sigma^2)$. The normal density function is given by

$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right], \tag{6.4}
$$

where $-\infty < x < \infty$, $\pi \approx 3.14159$, and $\exp(1) = e \approx 2.71828$.

For the special case $\mu = 0$ and $\sigma^2 = 1$, the distribution in equation (6.4) is called
the Standard Normal Distribution and the standard normal random variable Z, in particular,
is used instead of X.
The normal distribution has many interesting properties, some of which are used to
simplify the computations of probabilities. Some of its properties are:
1. It is symmetric with respect to a vertical axis passing through the mean $\mu$.
2. The mean, median, and mode are equal.
3. The tails are asymptotic to the horizontal axis.
4. The total area under the normal curve and above the horizontal axis is 1.
6.3.3 Areas Under The Normal Curve


The area under the probability density curve between two endpoints of an interval
represents the probability of a certain event. In a normal distribution, the probability of an
event can be represented graphically by a shaded region, as shown in the Figure 6.4. Since
the total area under the normal curve is 1, the area in the shaded region is a value between
0 and 1. This is consistent with the definition of the probability of an event.

Figure 6.4 $P(a < X < b)$ = area of the shaded region

To find the area under a normal curve, knowledge of calculus and numerical integration is
required. The area between $z_1$ and $z_2$ of the standard normal distribution is obtained by
evaluating the following integral:

$$P(z_1 < Z < z_2) = \int_{z_1}^{z_2} \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} z^2} \, dz \tag{6.5}$$

Evaluation of such integrals is beyond the scope of this book. However, probability
tables are furnished in Appendix B that give the results of such integration for different
values of z.
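
For readers curious about what the table entries stand for, the integral in equation (6.5) can be approximated numerically. The sketch below uses the composite Simpson rule, which is one of several possible numerical methods and is not taken from the text, and cross-checks the result against the closed form based on the error function `math.erf`. The function names and the limits $z_1 = -1$, $z_2 = 1$ are illustrative assumptions.

```python
from math import erf, exp, pi, sqrt

def phi(z):
    """Standard normal density, the integrand of equation (6.5)."""
    return exp(-0.5 * z * z) / sqrt(2 * pi)

def area_between(z1, z2, steps=1000):
    """Approximate P(z1 < Z < z2) by the composite Simpson rule (steps must be even)."""
    h = (z2 - z1) / steps
    total = phi(z1) + phi(z2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * phi(z1 + i * h)
    return total * h / 3

def cdf(z):
    """P(Z < z) written with the error function, used here as a cross-check."""
    return 0.5 * (1 + erf(z / sqrt(2)))

z1, z2 = -1.0, 1.0          # illustrative limits
print(round(area_between(z1, z2), 4), round(cdf(z2) - cdf(z1), 4))
```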

6.3.4 Computation of Probabilities Using the Standard Normal Table


The probability that a normal random variable is contained in an interval (a, b) is
numerically equal to the area under the normal curve between the endpoints a and b. The
examples below will illustrate how probabilities are computed using the table in Appendix B
(Standard Normal Curve Table). The table provides probability values for a standard normal
random variable ($\mu = 0$ and $\sigma^2 = 1$). However, when $\mu \ne 0$ or $\sigma^2 \ne 1$, it is still possible
to find probabilities by using a simple transformation, which will be illustrated in the later
examples.
The probability that Z is between two values, say $z_1$ and $z_2$, is equal to the area
under the curve, above the z-axis and between perpendicular lines erected at $z_1$ and $z_2$.
The area above a single point, say $z_0$, is equal to zero. Thus $P(Z = z_0) = 0$. Hence, the
probability that Z is less than or equal to $z_0$ is the same as the probability that Z is less
than $z_0$, that is,
$$P(Z \le z_0) = P(Z < z_0).$$
Also, $P(Z \ge z_0) = P(Z > z_0)$ and $P(z_1 \le Z \le z_2) = P(z_1 < Z < z_2)$.
By symmetry, $P(Z \le -z_0) = P(Z \ge z_0)$.
Given a value of Z, say $z_0$, the standard normal curve table gives the probability
value for expressions of the form $P(Z < z_0)$. For expressions of the form $P(Z > z_0)$ or
$P(z_1 < Z < z_2)$, the equivalent quantities are used, namely:
• $P(Z > z_0) = 1 - P(Z < z_0)$; (6.6)
• $P(z_1 < Z < z_2) = P(Z < z_2) - P(Z < z_1)$. (6.7)

Example 6.10 Let Z be a standard normal random variable. Find the following probabilities
using the standard normal table:
a.) $P(Z < 2.4)$;
b.) $P(Z > -0.58)$;
c.) $P(-0.25 < Z < 1.64)$.
Solution: a.) In the standard normal table, locate the row whose leftmost value is 2.4
and the column whose topmost value is 0.00. The cell at the intersection of this row
and column contains the area of the shaded region (the region to the left of
$z = 2.4$), which is 0.9918.

Hence, $P(Z < 2.4) = 0.9918$.

b.) The total area under the normal curve is 1. So the area of the shaded region
within the interval $Z > -0.58$ is found by subtracting from 1 the area below the
curve within the interval $Z < -0.58$.

Hence, $P(Z > -0.58) = 1 - P(Z < -0.58) = 1 - 0.2810 = 0.7190$.

Note: $P(Z > -0.58) = P(Z < 0.58) = 0.7190$, by symmetry.

c.) The area between $z = -0.25$ and $z = 1.64$ can be found by subtracting the
area below $z = -0.25$ from the area below $z = 1.64$.

Hence, $P(-0.25 < Z < 1.64) = P(Z < 1.64) - P(Z < -0.25)$
= 0.9495 - 0.4013
= 0.5482.
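
Where software is available, the table lookups of Example 6.10 can be reproduced with the `NormalDist` class from Python's standard `statistics` module (Python 3.8 or later). This is offered only as a cross-check on the table values; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

Z = NormalDist()                    # standard normal: mean 0, standard deviation 1

a = Z.cdf(2.4)                      # P(Z < 2.4), read directly
b = 1 - Z.cdf(-0.58)                # P(Z > -0.58), using rule (6.6)
c = Z.cdf(1.64) - Z.cdf(-0.25)      # P(-0.25 < Z < 1.64), using rule (6.7)

print(round(a, 4), round(b, 4), round(c, 4))
```

The `cdf` method plays the role of the table: it returns the area to the left of its argument.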

Example 6.11 Find the value of k, if:
a.) $P(Z < k) = 0.8461$
b.) $P(Z > k) = 0.9345$
c.) $P(-k < Z < k) = 0.95$
Solution: a.) To find the value of k, we scan the body of the table for the entry equal
to 0.8461 and read off the row and column in which it lies. If no such value is found, an
approximation can be made by choosing the value nearest to it.

Now, $P(Z < 1.02) = 0.8461$. Thus, $k = 1.02$.

b.) $P(Z > k) = 0.9345$

We know that $P(Z > k) = 1 - P(Z < k)$.
So $P(Z < k) = 1 - P(Z > k) = 1 - 0.9345 = 0.0655$.
From the standard normal table, since $P(Z < -1.51) = 0.0655$, then $k = -1.51$.

c.) $P(-k < Z < k) = 0.95$

The area of the unshaded portion of the graph is $1 - 0.95 = 0.05$. Thus, the
tails below $-k$ and above $k$ each have an area of 0.025. From the standard
normal table, $P(Z < -1.96) = 0.0250$. Therefore, $k = 1.96$.
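
Reading the table "in reverse", as in Example 6.11, corresponds to evaluating the inverse of the cumulative distribution function. The sketch below uses `NormalDist.inv_cdf` from Python's standard `statistics` module as a cross-check; the variable names are illustrative, and the results are rounded to two decimals to match the precision of the table.

```python
from statistics import NormalDist

Z = NormalDist()                    # standard normal

k_a = Z.inv_cdf(0.8461)             # P(Z < k) = 0.8461
k_b = Z.inv_cdf(1 - 0.9345)         # P(Z > k) = 0.9345, so P(Z < k) = 0.0655
k_c = Z.inv_cdf(1 - 0.05 / 2)       # P(-k < Z < k) = 0.95, so P(Z < k) = 0.975

print(round(k_a, 2), round(k_b, 2), round(k_c, 2))
```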

Example 6.12 Using the table of the standard normal curve, find the area:
a.) between $z = -0.46$ and $z = 2.21$.
b.) to the right of $z = 1.24$.
c.) to the left of $z = 1.74$.
Solution: a.) $P(-0.46 < Z < 2.21) = P(Z < 2.21) - P(Z < -0.46)$
= 0.9864 - 0.3228
= 0.6636.

b.) $P(Z > 1.24) = 1 - P(Z < 1.24)$
= 1 - 0.8925
= 0.1075.

c.) $P(Z < 1.74) = 0.9591$.

Recall that the standard normal curve table provides areas only for the standard
normal curve, i.e., $\mu = 0$ and $\sigma^2 = 1$. It would be a tedious task to attempt to set up
separate tables of normal curves for every conceivable value of $\mu$ and $\sigma^2$. However, the
standard normal curve table can still be utilized even if the normal distribution has a mean
$\mu \ne 0$ or a variance $\sigma^2 \ne 1$, by using the transformation or z-score

$$Z = \frac{X - \mu}{\sigma} \tag{6.8}$$

Using equation (6.8), the normal variable X, with mean $\mu$ and variance $\sigma^2$, is
transformed into a standard normal variable Z with mean 0 and variance 1. The
standard normal curve table can then be used for the transformed variable.
To get the original value of X from a given value of Z, we solve for X in terms of Z
using equation (6.8), that is,

$$X = Z\sigma + \mu. \tag{6.9}$$
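
Equations (6.8) and (6.9) are inverses of each other, and the transformation leaves probabilities unchanged: the area below x under the original normal curve equals the area below z under the standard normal curve. The sketch below illustrates both points. The function names and the values of `mu`, `sigma` and `x` are made up for illustration and do not come from the text.

```python
from statistics import NormalDist

def to_z(x, mu, sigma):
    """Equation (6.8): standardize x."""
    return (x - mu) / sigma

def to_x(z, mu, sigma):
    """Equation (6.9): recover x from z."""
    return z * sigma + mu

mu, sigma = 100.0, 20.0     # illustrative values, not from the text
x = 130.0
z = to_z(x, mu, sigma)

print(z, to_x(z, mu, sigma))

# The area below x under N(mu, sigma^2) matches the area below z under N(0, 1).
p_original = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist().cdf(z)
print(round(p_original, 4), round(p_standard, 4))
```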
Example 6.13 Let X be normally distributed with a mean of 72 and a standard deviation of
15. Find the Z-scores corresponding to: a.) 72 and b.) 93.
Solution: The corresponding Z-scores with $\mu = 72$ and $\sigma = 15$ are
a.) $Z = \dfrac{X - \mu}{\sigma} = \dfrac{72 - 72}{15} = 0$;
b.) $Z = \dfrac{X - \mu}{\sigma} = \dfrac{93 - 72}{15} = 1.4$.

Example 6.14 Given a normal distribution with $\mu = 40$ and $\sigma^2 = 36$, find
a.) the probability that X assumes a value greater than 50;
b.) the value of X that has 38% of the area below it.
Solutions: a.) To find $P(X > 50)$, we need to evaluate the area under the normal curve to
the right of $x = 50$. This can be done by transforming $x = 50$ to the
corresponding z value, then obtaining the area to the left of z from the standard
normal table, and finally subtracting this area from 1. From equation (6.8), with
$\sigma = \sqrt{36} = 6$, we see that $z = \dfrac{50 - 40}{6} = 1.67$. Thus,

$$P(X > 50) = P(Z > 1.67) = 1 - P(Z < 1.67) = 1 - 0.9525 = 0.0475$$

b.) To find the value of X given an area of 38% or 0.38 to its left, we require a z
value that has an area of 0.38 to its left, i.e., $P(Z < z) = 0.38$. From the
standard normal curve table, $P(Z < -0.31) \approx 0.38$, so that the desired z value is
$-0.31$. Solving for x,

$$x = z\sigma + \mu = (-0.31)(6) + 40 = 38.14$$

6.3.5 Applications of the Normal Distribution


Many variables used in surveys and scientific studies behave like the normal
distribution. Even if the probability distribution of the random variable is unknown, assuming
a normal distribution for the unknown distribution often (but not always) leads to reliable
results. The succeeding examples illustrate the use of the normal distribution as the
approximate probability distribution for many variables of interest.

Example 6.15 A set of scores in a Statistics examination is approximately normally
distributed with a mean of 74 and a standard deviation of 7.9. Find the probability that a
student received a score between 75 and 80.

Solution: The z values corresponding to $x_1 = 75$ and $x_2 = 80$ with $\mu = 74$ and $\sigma = 7.9$ are

$z_1 = \dfrac{75 - 74}{7.9} = 0.13$ and $z_2 = \dfrac{80 - 74}{7.9} = 0.76$, respectively.

Hence, $P(75 < X < 80) = P(0.13 < Z < 0.76)$
= $P(Z < 0.76) - P(Z < 0.13)$
= 0.7764 - 0.5517 = 0.2247.
Example 6.16 If the weights of 600 students are normally distributed with a mean of 50
kilograms and a variance of 16 square kilograms,
a.) determine the percentage of students with weights lower than 55 kg.
b.) how many students have weights exceeding 52 kg?

Solution: a.) To find the percentage of students with weights lower than 55 kg, we
require the area below $x = 55$ kg. The corresponding z value with $\mu = 50$ and
$\sigma^2 = 16$ (which implies that $\sigma = 4$) is $z = \dfrac{55 - 50}{4} = 1.25$. Thus, it follows that
$P(X < 55) = P(Z < 1.25) = 0.8944$.
The percentage is found by multiplying the area under the normal curve by
100%. Hence, 89.44% of the students have weights lower than 55 kg.

b.) For $x = 52$ kg, the corresponding z value is $z = \dfrac{52 - 50}{4} = 0.5$. Hence,
$P(X > 52) = P(Z > 0.5)$
= $1 - P(Z < 0.5)$
= 1 - 0.6915 = 0.3085.
The number of students who have weights exceeding 52 kg is found by
multiplying the area under the normal curve by the total number of students.
Hence, approximately $600 \times 0.3085 \approx 185$ students have weights exceeding 52
kg.

Example 6.17 Suppose that in Example 6.16 the weights of students are recorded to the
nearest kilogram. What happens to the results computed above?

Solution: a.) Weights recorded to the nearest kilogram are actually weights rounded off to
the nearest kilogram. If a weight measurement is from 54.5 up to (but not including)
55.5, it is reported as 55 kilograms. Thus, weights are reported as lower than 55 kg if
the actual measurement is less than 54.5 kilograms.

$$z = \frac{54.5 - 50}{4} = 1.125$$

Hence, $P(X < 54.5) = P(Z < 1.125) = 0.8697$.
The percentage is found by multiplying the area under the normal curve by
100%. Thus, 86.97% of the students weigh less than 55 kg, if weights are
measured to the nearest kilogram.

b.) Following the same argument as in part (a.), a reported weight exceeds 52 kg
if the actual measurement is greater than or equal to 52.5 kilograms. Thus,

$$z = \frac{52.5 - 50}{4} = 0.625.$$

It follows that $P(X > 52.5) = P(Z > 0.625)$
= $1 - P(Z < 0.625)$
= 1 - 0.73405 = 0.26595.

Therefore, there are approximately $600 \times 0.26595 = 159.57 \approx 160$ students
having weights greater than 52 kg. It is evident that there is a difference
between the results if the weights are recorded to the nearest kilogram.
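
The effect of recording to the nearest kilogram can be seen side by side by computing the quantities of Examples 6.16 and 6.17 together. The sketch below uses `NormalDist` from Python's standard `statistics` module; the variable names are illustrative. Since the software does not round z to the table's precision, the printed values may differ from the hand computations in the last digit.

```python
from statistics import NormalDist

weights = NormalDist(mu=50, sigma=4)    # a variance of 16 means sigma = 4
n_students = 600

# Example 6.16: weights treated as exact measurements.
below_55 = weights.cdf(55)
above_52 = 1 - weights.cdf(52)

# Example 6.17: weights recorded to the nearest kilogram.
below_55_rounded = weights.cdf(54.5)
above_52_rounded = 1 - weights.cdf(52.5)

print(round(100 * below_55, 2), round(n_students * above_52))
print(round(100 * below_55_rounded, 2), round(n_students * above_52_rounded))
```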

6.3.6 Normal Approximation to the Binomial Distribution


Finding a probability with the binomial distribution requires the computation of n!.
For large values of n, this quantity is difficult to compute. The normal distribution gives a
good approximation to the binomial distribution when n is large and p is not too close to 0 or
1. Using the normal approximation enables us to calculate probabilities for large binomial
samples. The mean $\mu$ of a binomial distribution $b(x; n, p)$ is np and the variance $\sigma^2$ is npq. A
binomial random variable X can be approximated by a normal random variable by using the
transformation

$$Z = \frac{X - np}{\sqrt{npq}}$$

Remark: If a discrete random variable is approximated by a continuous random variable, an
adjustment for continuity is necessary because the probability that a continuous random
variable is equal to some specific value is 0. For example, if X is a binomial random
variable, the quantity $P(X = 5)$ is greater than zero provided $n \ge 5$. When using a normal
approximation, $P(X = 5) = 0$. The adjustment for continuity requires the computation of
$P(4.5 < X < 5.5)$ to approximate the value of $P(X = 5)$.
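
The adjustment for continuity can be tried out numerically by comparing an exact binomial probability with its normal approximation. The sketch below does this for $P(X = 5)$; the values n = 20 and p = 0.3 and the helper names `binom_pmf` and `normal_approx` are assumptions made here for illustration, not values from the text.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def normal_approx(lo, hi, n, p):
    """Approximate P(lo <= X <= hi) for binomial X, adjusted for continuity."""
    dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    # Widen the interval by 0.5 on each side before taking the area.
    return dist.cdf(hi + 0.5) - dist.cdf(lo - 0.5)

n, p = 20, 0.3              # illustrative values, not from the text
exact = binom_pmf(5, n, p)
approx = normal_approx(5, 5, n, p)   # area between 4.5 and 5.5

print(round(exact, 4), round(approx, 4))
```

The same function covers interval probabilities: a call such as `normal_approx(0, 24, n, p)` approximates the probability of fewer than 25 successes.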

Example 6.18 A pair of dice is rolled 180 times. What is the probability that a total of seven
(7) dots on the top faces occurs
a.) less than 25 times?
b.) between 33 and 41 times inclusive?

Solution: a.) When a pair of dice is rolled, the probability that a total of 7 occurs is
$\tfrac{6}{36} = \tfrac{1}{6}$. If X represents the number of times that a total of 7 occurs, then the exact
probability is found by computing $P(X < 25) = \sum_{x=0}^{24} b(x; 180, \tfrac{1}{6})$. The mean and
standard deviation of the distribution are, respectively:

$$\mu = np = 180 \cdot \tfrac{1}{6} = 30 \quad \text{and} \quad \sigma = \sqrt{npq} = \sqrt{180 \cdot \tfrac{1}{6} \cdot \tfrac{5}{6}} = 5$$

where $q = 1 - p = 1 - \tfrac{1}{6} = \tfrac{5}{6}$.

Adjusting for continuity, the quantity $P(X < 25)$ becomes $P(X < 24.5)$. The Z
value corresponding to 24.5 is $z = \dfrac{24.5 - 30}{5} = -1.1$. Hence,

$$P(X < 25) = \sum_{x=0}^{24} b(x; 180, \tfrac{1}{6}) \approx P(Z < -1.1) = 0.1357.$$

b.) The exact probability that a total of 7 occurs between 33 and 41 times
inclusive is $P(33 \le X \le 41) = \sum_{x=33}^{41} b(x; 180, \tfrac{1}{6})$.
Using the normal-curve approximation with $\mu = 30$ and $\sigma = 5$, the area to be
determined lies between $x_1 = 32.5$ and $x_2 = 41.5$, after adjusting for
continuity. The corresponding Z values are

$z_1 = \dfrac{32.5 - 30}{5} = 0.5$ and $z_2 = \dfrac{41.5 - 30}{5} = 2.3$, respectively. Hence,

$$
\begin{aligned}
P(33 \le X \le 41) = \sum_{x=33}^{41} b(x; 180, \tfrac{1}{6}) &\approx P(0.5 < Z < 2.3) \\
&= P(Z < 2.3) - P(Z < 0.5) \\
&= 0.9893 - 0.6915 = 0.2978
\end{aligned}
$$

6.4 SAMPLING DISTRIBUTION


In actual practice, the size N of the population is usually a large number, sometimes
even infinite. However, for purposes of illustration, let us consider a population with
only a few elements.
Suppose a population consists of only $N = 5$ elements, say, {2, 3, 4, 5, 9}. The
mean of the population is $\mu = \dfrac{2 + 3 + 4 + 5 + 9}{5} = 4.6$.
Consider drawing all possible samples of size 2 without replacement. There are a
total of 10 possible samples, with their corresponding means:

Sample        {2,3}  {2,4}  {2,5}  {3,4}  {3,5}  {4,5}  {2,9}  {3,9}  {4,9}  {5,9}
Sample mean    2.5    3.0    3.5    3.5    4.0    4.5    5.5    6.0    6.5    7.0

If the sample drawn is {2,3}, the sample mean, $\bar{x}$, is $\dfrac{2 + 3}{2} = 2.5$. If the sample
drawn is {3,9}, the sample mean is 6.0. In fact, the value of the sample mean changes
from sample to sample (only {2,5} and {3,4} share a value, 3.5), and none of the sample
means is equal to the population mean. Single-value estimates from a sample are hardly
ever exactly accurate, but they are still useful and will be discussed in the next chapter.
If there is no knowledge of the population before a sample is made, a sample is
randomly chosen from the population, and the resulting value for the sample mean is a
random value.
Consider now, all possible samples of size n, which can be drawn from a given
population with size N. For each sample, we can compute a statistic, such as the mean or
the standard deviation, and the value of the statistic will vary from sample to sample. In this
manner, we obtain a distribution of the statistic, which is called its sampling distribution.
A sampling distribution is the probability distribution of a statistic.
The sample mean is a very popular statistic, and its distribution is called the
sampling distribution of the mean. Similarly, we could have sampling distributions of the
standard deviation, the variance, the median, proportions, etc.
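For the small population {2, 3, 4, 5, 9} used above, the sampling distribution of the mean can be built by listing every sample. The following Python sketch is an illustration added here, not part of the original text; the variable names are arbitrary, and it assumes each of the 10 samples of size 2 is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 3, 4, 5, 9]
n = 2  # sample size

# All samples of size n drawn without replacement (order ignored).
samples = list(combinations(population, n))

# Sample mean of each sample, kept as exact fractions.
means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution: each distinct mean with its probability.
dist = {m: Fraction(count, len(means))
        for m, count in sorted(Counter(means).items())}

for m, prob in dist.items():
    print(f"x-bar = {float(m):.1f}   probability = {float(prob):.1f}")

# Mean of the sampling distribution compared with the population mean.
mu = Fraction(sum(population), len(population))
mean_of_means = sum(m * prob for m, prob in dist.items())
print("population mean:", float(mu))
print("mean of sample means:", float(mean_of_means))
```

In this enumeration the value 3.5 receives probability 2/10 because two samples produce it, and the mean of the sampling distribution comes out equal to the population mean 4.6.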

For each sampling distribution we can compute the mean or the standard deviation,
etc. Thus, we can speak of the mean or the standard deviation of the sampling distribution
of the mean. The standard deviation of a sampling distribution of a statistic is also called
the standard error of that statistic.
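As a small illustration of the standard error, the sketch below (an addition, not from the original text) computes the standard deviation of the 10 sample means obtained from the population {2, 3, 4, 5, 9} with samples of size 2 drawn without replacement. It assumes all 10 samples are equally likely, so the population form of the standard deviation (`statistics.pstdev`) is applied to the list of means.

```python
from itertools import combinations
from statistics import pstdev

population = [2, 3, 4, 5, 9]
sample_size = 2

# Mean of every possible sample of size 2 (without replacement).
sample_means = [sum(s) / sample_size
                for s in combinations(population, sample_size)]

# Standard error of the mean = standard deviation of the sampling
# distribution of the mean (all samples equally likely).
standard_error = pstdev(sample_means)

print("sample means:", sample_means)
print(f"standard error of the mean: {standard_error:.4f}")
```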
Determining the sampling distribution of a statistic requires knowledge of the
distribution of the population from which the sample was drawn. The
actual distribution may not be easy to determine, but the normal distribution is a convenient
approximation to many unknown population distributions.
The succeeding subsections summarize some of the important results in statistical
theory. They are just stated below to provide the reader with an idea of the probability
distributions of sample statistics that will be used in the later chapters. The theorems stated
below can be found (with proofs) in many books in statistical theory.

6.4.1 The Central Limit Theorem


The central limit theorem, stated in the theorem below, is considered one of the
most important results in statistics. It gives the approximate sampling distribution of the
sample mean.
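Theorem. If random samples of size n are drawn from a large or infinite population with
finite mean $\mu$ and variance $\sigma^2$, then, for sufficiently large n, the sampling
distribution of the sample mean is approximately normal with mean $\mu_{\bar{x}} = \mu$ and standard
deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. Hence, $Z = \frac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma}$ is approximately standard normal or,
equivalently, $\bar{x} \sim N(\mu, \sigma^2/n)$ approximately.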
When the population is normally distributed, the distribution of the sample means is
also normally distributed. However, if the population is not normally distributed, the sample
mean is still approximately normally distributed under the conditions stated in the Central
Limit Theorem. In applications, a sample is considered “large” if $n \ge 30$.
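A simulation can make the theorem concrete. The sketch below is an added illustration under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a clearly non-normal, skewed choice made only for this example), the sample size is n = 30, and 5000 samples are drawn with a fixed seed. It compares the center and spread of the simulated sample means with $\mu$ and $\sigma/\sqrt{n}$.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(2024)    # fixed seed so the run is repeatable
n, reps = 30, 5000           # sample size and number of simulated samples
mu, sigma = 1.0, 1.0         # mean and sd of the assumed exponential population

# Mean of each simulated sample of size n.
sample_means = [sum(rng.expovariate(1.0) for _ in range(n)) / n
                for _ in range(reps)]

std_error = sigma / sqrt(n)          # value predicted by the theorem
center = fmean(sample_means)         # should be close to mu
spread = pstdev(sample_means)        # should be close to sigma / sqrt(n)

# Share of sample means within 1.96 standard errors of mu; a normal
# distribution would put about 95% of them there.
within = sum(abs(m - mu) <= 1.96 * std_error for m in sample_means) / reps

print(f"mean of sample means = {center:.3f} (mu = {mu})")
print(f"sd of sample means   = {spread:.3f} (sigma/sqrt(n) = {std_error:.3f})")
print(f"share within 1.96 standard errors = {within:.3f}")
```

Changing `n` to a small value such as 3 should make the agreement with the normal curve visibly worse, which is the practical meaning of the "large sample" condition.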

6.4.2 Chi-Square Distribution


Another popular statistic whose sampling distribution is needed to perform
hypothesis tests is the sample variance. Its sampling distribution is stated in the theorem
below.

Theorem. If $s^2$ is the variance of a random sample of size n taken from a normal
population having the variance $\sigma^2$, then $\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$ is a value of a random
variable $\chi^2$ having the chi-square distribution with $v = n - 1$ degrees of
freedom.

The value of the chi-square statistic, $\chi^2$, cannot be negative. Unlike the normal
curve, the graph of the chi-square distribution is not symmetric; it is skewed to the
right.
Percentile points of the $\chi^2$ distribution for selected values of $\alpha$ (the level of
significance) and v (the degrees of freedom), $\chi^2_{\alpha,\,v}$, are tabulated in Appendix D.

Example 6.19 Using the table of the chi-square distribution in Appendix D, find the value of
each of the following.
a.) $\chi^2_{0.05,\,10}$ b.) $\chi^2_{0.025,\,6}$ c.) $\chi^2_{0.01,\,2}$

Solution: a.) Scanning through the column ($\alpha = 0.05$) and the row ($v = 10$), the intersection
value is equal to 18.307. Thus, $\chi^2_{0.05,\,10} = 18.307$.

b.) Scanning through the column ($\alpha = 0.025$) and the row ($v = 6$), the intersection
value is equal to 14.449. Therefore, $\chi^2_{0.025,\,6} = 14.449$.

c.) Scanning through the column ($\alpha = 0.01$) and the row ($v = 2$), the intersection
value is equal to 9.210. Hence, $\chi^2_{0.01,\,2} = 9.210$.
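The table values in this example can be reproduced without the appendix. The sketch below is an added illustration, not the method used to build the book's table. It assumes that $\chi^2_{\alpha,\,v}$ denotes the value with area $\alpha$ to its right, which is consistent with the three answers above. The helper names `chi2_cdf` and `chi2_upper` are hypothetical; the cumulative probability is computed from the standard series for the regularized incomplete gamma function, and the critical value is located by bisection.

```python
import math

def chi2_cdf(x, v):
    """P(chi-square with v degrees of freedom <= x), by series expansion."""
    if x <= 0:
        return 0.0
    a, z = v / 2.0, x / 2.0
    term = 1.0 / a          # first term of the series
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)  # next term: z^k / (a (a+1) ... (a+k))
        total += term
        k += 1
    # Multiply the series by z^a e^(-z) / Gamma(a).
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_upper(alpha, v):
    """Value c such that the area to the right of c equals alpha."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, v) > alpha:   # widen until the tail is small enough
        hi *= 2.0
    for _ in range(100):                   # bisection on the upper-tail area
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, v) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for alpha, v in [(0.05, 10), (0.025, 6), (0.01, 2)]:
    print(f"alpha = {alpha}, v = {v}: {chi2_upper(alpha, v):.3f}")
```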

6.4.3 Student t-Distribution


If the variance of the normally distributed population from which we take our random
samples is unknown, and the sample size is not large ($n < 30$ in practice), the values of $s^2$,
the sample variance, vary considerably from sample to sample. For small sample sizes, the
central limit theorem is not very useful, and once $\sigma$ must be estimated the standardized
sample mean may no longer follow a standard normal distribution. Replacing $\sigma$ by s in the
expression for Z in Section 6.4.1 results in the statistic $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$. This statistic has a
Student t-distribution, as summarized below.

Theorem. If $\bar{x}$ and $s^2$ are the mean and variance, respectively, of a random sample of size
n taken from a population that is normally distributed with mean $\mu$ and variance
$\sigma^2$, then $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ is a value of a random variable T having the t distribution
with $v = n - 1$ degrees of freedom.
The t-distribution is similar to that of the standard normal random variable z in that
they are both symmetrical about a mean of zero and both are bell-shaped. Percentile points
of the t-distribution for selected values of $\alpha$ (level of significance) and v (degrees of
freedom), $t_{\alpha,\,v}$, are tabulated in Appendix C.

Example 6.20 Using the table of the t-distribution in Appendix C, find the value of each of
the following.
a.) $t_{0.05,\,10}$ b.) $t_{0.025,\,6}$ c.) $t_{0.01,\,2}$

Solution: a.) Scanning through the column ($\alpha = 0.05$) and the row ($v = 10$), the intersection
value is equal to 1.812. Thus, $t_{0.05,\,10} = 1.812$.

b.) Scanning through the column ($\alpha = 0.025$) and the row ($v = 6$), the intersection
value is equal to 2.447. Therefore, $t_{0.025,\,6} = 2.447$.

c.) Scanning through the column ($\alpha = 0.01$) and the row ($v = 2$), the intersection
value is equal to 6.965. Hence, $t_{0.01,\,2} = 6.965$.
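These lookups can also be approximated numerically. The sketch below is an added illustration, not the book's procedure. It assumes $t_{\alpha,\,v}$ is the value with area $\alpha$ in the right tail (with $\alpha < 0.5$), integrates the standard t density with Simpson's rule, and finds the critical value by bisection. The names `t_pdf`, `t_cdf` and `t_upper` are hypothetical helpers.

```python
import math

def t_pdf(t, v):
    """Density of the t distribution with v degrees of freedom."""
    log_c = math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
    c = math.exp(log_c) / math.sqrt(v * math.pi)
    return c * (1 + t * t / v) ** (-(v + 1) / 2)

def t_cdf(t, v, steps=2000):
    """P(T <= t) for t >= 0: 0.5 (by symmetry) plus Simpson's rule on [0, t]."""
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, v) + t_pdf(t, v)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, v)
    return 0.5 + total * h / 3

def t_upper(alpha, v):
    """Value c > 0 such that the area to the right of c equals alpha (alpha < 0.5)."""
    lo, hi = 0.0, 1.0
    while 1.0 - t_cdf(hi, v) > alpha:  # widen until the tail is small enough
        hi *= 2.0
    for _ in range(60):                # bisection on the upper-tail area
        mid = (lo + hi) / 2.0
        if 1.0 - t_cdf(mid, v) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for alpha, v in [(0.05, 10), (0.025, 6), (0.01, 2)]:
    print(f"alpha = {alpha}, v = {v}: {t_upper(alpha, v):.3f}")
```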

6.4.4 F-Distribution
The sampling distribution of the statistic F, defined to be the ratio of two independent
chi-square variables, each divided by its degrees of freedom, is called the F-distribution.
Hence, if f is a value of the random variable F, we have

$f = \frac{\chi_1^2 / v_1}{\chi_2^2 / v_2} = \frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2} = \frac{\sigma_2^2 s_1^2}{\sigma_1^2 s_2^2}$

where $\chi_1^2$ is a value of a chi-square distribution with $v_1 = n_1 - 1$ degrees of freedom and
$\chi_2^2$ is a value of a chi-square distribution with $v_2 = n_2 - 1$ degrees of freedom. We say that
f is a value of the F distribution with $v_1$ and $v_2$ degrees of freedom.

Theorem. If $s_1^2$ and $s_2^2$ are the variances of independent random samples of size $n_1$ and
$n_2$ taken from normal populations with variances $\sigma_1^2$ and $\sigma_2^2$, respectively,
then

$f = \frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2} = \frac{\sigma_2^2 s_1^2}{\sigma_1^2 s_2^2}$

is a value of a random variable F having the F distribution with $v_1 = n_1 - 1$ and
$v_2 = n_2 - 1$ degrees of freedom.
Percentile points of the F distribution for selected values of $\alpha$ (level of significance),
$v_1$ and $v_2$ (degrees of freedom), $F_{\alpha}(v_1, v_2)$, are tabulated in Appendix E.

Example 6.21 Using the table of the F-distribution in Appendix E, find the value of each of
the following.
a.) $F_{0.05}(5, 10)$ b.) $F_{0.05}(20, 6)$ c.) $F_{0.01}(8, 8)$

Solution: a.) Looking at the critical values of the F-distribution table with $\alpha = 0.05$ and
scanning through the column ($v_1 = 5$) and the row ($v_2 = 10$), the intersection value
is equal to 3.33. Thus, $F_{0.05}(5, 10) = 3.33$.

b.) Looking at the critical values of the F-distribution table with $\alpha = 0.05$ and
scanning through the column ($v_1 = 20$) and the row ($v_2 = 6$), the intersection
value is equal to 3.87. Thus, $F_{0.05}(20, 6) = 3.87$.

c.) Looking at the critical values of the F-distribution table with $\alpha = 0.01$ and
scanning through the column ($v_1 = 8$) and the row ($v_2 = 8$), the intersection
value is equal to 6.03. Thus, $F_{0.01}(8, 8) = 6.03$.
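As with the other tables, these values can be approximated numerically. The sketch below is an added illustration, not the book's procedure. It assumes $F_{\alpha}(v_1, v_2)$ is the value with area $\alpha$ to its right and that $v_1 > 2$, so that the density is zero at the origin (true for all three cases above). It integrates the standard F density with Simpson's rule and finds the critical value by bisection; `f_pdf`, `f_cdf` and `f_upper` are hypothetical helper names.

```python
import math

def f_pdf(x, v1, v2):
    """Density of the F distribution with (v1, v2) degrees of freedom, v1 > 2."""
    if x <= 0:
        return 0.0   # correct at x = 0 only because v1 > 2 is assumed
    log_c = (math.lgamma((v1 + v2) / 2) - math.lgamma(v1 / 2)
             - math.lgamma(v2 / 2) + (v1 / 2) * math.log(v1 / v2))
    return math.exp(log_c + (v1 / 2 - 1) * math.log(x)
                    - ((v1 + v2) / 2) * math.log(1 + v1 * x / v2))

def f_cdf(x, v1, v2, steps=2000):
    """P(F <= x) by Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = f_pdf(0.0, v1, v2) + f_pdf(x, v1, v2)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * f_pdf(i * h, v1, v2)
    return total * h / 3

def f_upper(alpha, v1, v2):
    """Value c such that the area to the right of c equals alpha."""
    lo, hi = 0.0, 1.0
    while 1.0 - f_cdf(hi, v1, v2) > alpha:   # widen until the tail is small enough
        hi *= 2.0
    for _ in range(50):                      # bisection on the upper-tail area
        mid = (lo + hi) / 2.0
        if 1.0 - f_cdf(mid, v1, v2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for alpha, v1, v2 in [(0.05, 5, 10), (0.05, 20, 6), (0.01, 8, 8)]:
    print(f"F_{alpha}({v1}, {v2}) = {f_upper(alpha, v1, v2):.2f}")
```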

EXERCISES

I. Random Variables and Probability Distributions


1. List the possible values of each of the following random variables, and find its
probability distribution.
a.) Let Y be the random variable denoting the sum of the points on the upturned
faces of the dice when a pair of dice is rolled.
b.) Let Z be the random variable denoting the number of heads when a coin is tossed
five times.
2. Determine whether the following can be a probability distribution, defined in each case
for the given values of x, and explain your answer.
a.) $f(x) = \frac{{}_{4}C_{x}}{16}$ for $x = 0, 1, 2, 3, 4$
b.) $f(x) = \frac{x - 2}{5}$ for $x = 0, 1, 2, 3, 4, 5$
c.) $f(x) = \frac{x^2}{14}$ for $x = 0, 1, 2, 3$

II. Binomial Distribution


1. Suppose you know that 80% of the people applying for a certain job had no previous
experience in this job. You select a random sample of 5 current applicants. What is the
probability that 3 of them have no previous experience in the job?
2. The manager of a restaurant claims that only 3% of the customers are dissatisfied with
the service. If this claim is true, what is the probability that in a random sample of 25
customers,
a.) none is dissatisfied?
b.) at least one is satisfied?
c.) at most one is dissatisfied?
3. A manufacturer claims that 6% of their products are defective. If the claim is true, what is
the probability that the number of defective products in a random sample of 20 will be
a.) exactly 2?
b.) 2 or more?
c.) fewer than 5?
4. An insurance salesman sells policies to 5 men, all of identical age and in good health.
According to the actuarial tables, the probability that a man of this particular age will be
alive 30 years hence is $\frac{2}{3}$. Find the probability that in 30 years
a.) 5 men will be alive;
b.) at least 3 men will be alive;
c.) only 2 men will be alive;
d.) at least one man will be alive.

5. The probability that a certain kind of component will survive a given shock test is $\frac{3}{4}$.
Find the probability that exactly 2 of the next 4 components tested survive.

6. A fruit grower claims that $\frac{2}{3}$ of his mango crop has been contaminated by the medfly
infestation. Find the probability that among 4 mangoes inspected by this grower
a.) all 4 have been contaminated by the medfly;
b.) anywhere from 1 to 3 have been contaminated.

III. Normal Applications


1. Suppose the temperature T during the month of June is normally distributed with mean
68 °C and standard deviation 6 °C. Find the probability P that the temperature is
between 70 °C and 80 °C.
2. Suppose the heights H of 800 students are normally distributed with mean 66 inches
and standard deviation 5 inches. Find the number N of students with heights
a.) between 65 and 70 inches;
b.) greater than or equal to 6 feet.
3. A certain 'AA' battery lasts for 3 weeks, on the average, with a standard deviation of 0.5
week. Assuming that the battery lives are normally distributed, find the probability that a
given battery will last
a.) less than 2.3 weeks;
b.) greater than 2 weeks;
c.) between 1.5 and 3.5 weeks.
4. Two students were informed that they received z-scores of 0.8 and –0.4 respectively, on
a multiple-choice examination in English. If their marks were 88 and 64 respectively,
find the mean and the standard deviation of the examination marks.
5. The grades on a short quiz in biology were 0, 1, 2, …, 10 points, depending on the
number answered correctly out of 10 questions. The mean grade was 6.7 with a
standard deviation of 1.2. Assuming the grades to be normally distributed, determine
a.) the percentage of students scoring 6 points;
b.) the maximum grade of the lowest 10% of the class;
c.) the minimum grade of the highest 10% of the class.
6. If the weights of ball bearings are normally distributed with mean 0.614 newtons and
standard deviation 0.0025 newtons, determine the percentage of ball bearings with
weights
a.) between 0.610 and 0.618 newtons;
b.) greater than 0.617 newtons;
c.) less than 0.608 newtons.
7. In an industrial process the diameter of a ball bearing is an important component part.
The buyer sets specifications on the diameter to be $3.0 \pm 0.01$ cm. The implication is
that no part falling outside these specifications will be accepted. It is known that in the
process the diameter of a ball bearing has a normal distribution with mean 3.0 and
standard deviation $\sigma = 0.005$. On the average, how many manufactured ball bearings
will be scrapped?
8. The grades of 300 students in a Math exam are normally distributed with a mean of 80
and a standard deviation of 10.
a.) Find the probability that a student has a grade better than 85.
b.) How many students would have grades from 78 to 84?
c.) If the Math Department requires a grade of at least 70, how many will fail on this
basis?
d.) Find the lowest passing grade if the lowest 10% of the students are given a
failing grade.
