
CHAPTER THREE

DISCRETE AND CONTINUOUS PROBABILITY DISTRIBUTION

3. Introduction

In the previous chapters, you learnt how to construct a frequency polygon for a given
frequency distribution. There seemed to be no way of telling in advance what the polygon
would look like or what the mean and the standard deviation would be. It is therefore
useful to study the general behavior of such distributions and to draw conclusions that
support decision-making.

A probability distribution can be thought of as a theoretical frequency distribution: it
describes how outcomes are expected to vary.

This section focuses on the definitions of random variable and probability distribution. Then, you
will deal with the two most common discrete probability distributions and finally with one type
of continuous probability distribution.

3.1. Random Variable

In block 6, we defined the concept of ‘experiment’ and its associated outcome. A random
variable provides a means of assigning numerical values to experimental outcomes. The
definition of a random variable is as follows:

Definition:

A random variable is a variable whose values are determined by chance. Or,


A random variable is a numerical description of the outcome of an experiment.

Notation: Random variables are usually denoted by capital letters like X, Y, Z, etc.

Example 1: Consider the experiment of tossing a fair coin once.

The sample space is S = {H, T}, where H denotes the outcome ‘Head’ and T denotes the
outcome ‘Tail’. So, there are two possible outcomes, H or T.
Now, let the random variable X represent the number of heads obtained; then X can take the
value 0 (for T) or 1 (for H).

Example 2: Suppose a single fair die is rolled once.

The sample space of this experiment constitutes six possible outcomes,


S = {1, 2, 3 , 4, 5, 6}

Let the random variable Y denote the number that turns up. The event ‘a number greater
than 2 occurs’ then corresponds to Y assuming the values 3, 4, 5 or 6.

Example 3: Consider the experiment of rolling two fair dice once simultaneously.

If the random variable T denotes the sum of the numbers on the two dice, then the event
‘the sum is greater than 10’ consists of the outcomes (5, 6), (6, 5) and (6, 6), for which
T takes the values 11 or 12.

3.1.1. Types of Random Variables

As stated above, a random variable provides a means of associating a numerical value with each
possible experimental outcome.

Depending upon the numerical values it can assume, a random variable can be classified into two
major divisions.

A) Discrete Random Variable: is a random variable that may assume either a finite number
of values or an infinite sequence (e.g. 1, 2, 3…) of values. In general, a discrete random
variable takes whole number values, which can be counted or enumerated.

Example: The number of students who are enrolled for a diploma program in Unity
University College, the number of defective batteries observed in assessing its quality,
the number of customers who visit a shop during one day of operation are all examples of
discrete random variables.
B) Continuous Random Variable: is a random variable which may take on all values in a
certain interval or collection of intervals. A Continuous random variable, as the name
implies, assumes all possible values between any two values.

Example: Weight, time and temperature are examples of continuous random variables.

Remark: One way to determine whether a random variable is discrete or continuous is to


think of the values of the random variable as points on a line segment. If the entire line
segment between any two of these points also represents values the random variable may
assume, then the random variable is continuous.

3.1.2. Definition Of Probability Distribution

The probability distribution for a random variable describes how the probabilities are distributed
over the values of the random variable. For a discrete random variable X, the probability
function is denoted by P(X). The probability function provides the probability for each value of
the random variable.
A probability distribution may in general be defined as follows:

Definition:

A probability distribution is a correspondence, which assigns probabilities to the values of a


random variable.

Example 1: Construct a probability distribution for the number of heads in tossing two fair coins
simultaneously once.

Solution: The sample space of the experiment contains the following:


S = {HH, HT, TH, TT}
Let the random variable X denote the number of heads. We then use the probability function
P(X) to assign a probability to each value of X. The resulting probability distribution is given
below:

Outcome, X           0    1    2
Probability, P(X)    ¼    ½    ¼

The probability distribution shows that the probability that the random variable can assume the
value 0 is ¼, the value 1 is ½ and the value 2 is ¼. Note that the sum of these probabilities is 1.
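
The enumeration in this example can also be reproduced mechanically. The following is a minimal Python sketch, assuming fair coins so that every head/tail sequence is equally likely; the function name heads_distribution is illustrative and does not come from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_coins):
    """Return {number of heads: probability} for n_coins fair coins."""
    # For fair coins every sequence of H/T is equally likely.
    outcomes = list(product("HT", repeat=n_coins))
    # Count how many sequences give each number of heads.
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


dist = heads_distribution(2)
for x, p in dist.items():
    print(x, p)  # prints 0 1/4, then 1 1/2, then 2 1/4
```

Exact fractions are used instead of floating-point numbers so that the probabilities can be compared directly with the table.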

Example 2: The number of mistakes a typist made in ten days of assessment is shown in the
following table.

No of mistakes 2 3 4 5
No of days 1 4 3 2

a) Construct a probability distribution for the number of mistakes she committed.


b) Represent graphically the probability distribution in part (a).

Solution:
a) In constructing the probability distribution, the random variable is the number of
mistakes the typist committed. Let X denote this random variable. The probability of
each value of X is the number of days on which it occurred divided by the total number
of days (10).
The probability distribution is shown below:

No of mistakes, X 2 3 4 5

Probability, P(X) 1/10 4/10 3/10 2/10


b) We emphasize at this point that a probability distribution can be displayed on the
coordinate plane. The value of the random variable X is shown on the horizontal axis (x-
axis) and the probability that the random variable X assumes these values is shown on the
vertical axis (y-axis). For the probability distribution in the example, the random
variable X, which is the number of mistakes the typist committed, is labeled on the x-axis
and the corresponding probability, P(X) on the y-axis.

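[Figure: the probability distribution of part (a), with the number of mistakes X on the horizontal axis and the probability P(X) on the vertical axis, marked from 0.1 to 0.4.]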
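
As a rough stand-in for the graph, the distribution in part (a) can be computed from the observed days and drawn as a text bar chart. This is a sketch only: the names days_by_mistakes, typist_dist and text_bar_chart are hypothetical, and the bar width of 40 characters is an arbitrary choice.

```python
from fractions import Fraction

# Data from the example: number of mistakes -> number of days observed.
days_by_mistakes = {2: 1, 3: 4, 4: 3, 5: 2}
total_days = sum(days_by_mistakes.values())

# Relative frequency of each value serves as its probability.
typist_dist = {x: Fraction(d, total_days) for x, d in days_by_mistakes.items()}


def text_bar_chart(dist, width=40):
    """Return one line per value; bar length is proportional to P(X)."""
    lines = []
    for x, p in dist.items():
        bar = "#" * round(p * width)
        lines.append(f"{x} | {bar} {float(p):.1f}")
    return lines


for line in text_bar_chart(typist_dist):
    print(line)
```

The tallest bar corresponds to X = 3, the value with the largest probability (4/10).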
3.2. Types Of Probability Distribution

A probability distribution can be classified as a discrete or continuous probability distribution
according to whether the random variable it describes is discrete or continuous.

3.2.1. Discrete Probability Distribution

Three commonly used discrete probability distributions are the binomial, Poisson and
hypergeometric distributions. In this section, we will discuss two of them (binomial
and Poisson). In the construction of the probability distribution for a discrete random variable,
the following two conditions must be satisfied.
Properties (Required Conditions) for a Discrete Probability Distribution
i) The sum of the probabilities of all the events in the sample space must equal 1,
i.e. Σ P(x) = 1
ii) The probability of each event in the sample space must be between 0 and 1 inclusive,
i.e. 0 ≤ P(x) ≤ 1

For instance, in the above example, these two conditions are satisfied since

Σ P(X) = P(2) + P(3) + P(4) + P(5) = 0.1 + 0.4 + 0.3 + 0.2 = 1 and

each of these probabilities is greater than or equal to 0 and less than or equal to 1.
For some discrete random variables, the probability distribution can be given as a formula that
yields f(x) for every possible value of x.

Example 3: Suppose a probability distribution is given by the formula:

f(x) = x/5 for x = 0, 2, 3

Construct the corresponding probability distribution.

Solution:
The outcome x assumes the values 0, 2 and 3.

Outcome, x           0     2     3
Probability, f(x)    0/5   2/5   3/5
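
The two required conditions can be checked programmatically. The sketch below applies them to the distribution generated by the formula x/5; the helper name is_valid_distribution is an assumption made for illustration, and the counter-example on x = 1, 2, 3 is hypothetical and not taken from the text.

```python
from fractions import Fraction


def is_valid_distribution(dist):
    """Check the two required conditions for a discrete distribution."""
    probs = list(dist.values())
    each_in_range = all(0 <= p <= 1 for p in probs)  # 0 <= P(x) <= 1
    sums_to_one = sum(probs) == 1                    # sum of P(x) equals 1
    return each_in_range and sums_to_one


# Distribution from the example: probability x/5 for x = 0, 2, 3.
formula_dist = {x: Fraction(x, 5) for x in (0, 2, 3)}
print(is_valid_distribution(formula_dist))  # True

# Hypothetical counter-example: the same formula on x = 1, 2, 3 sums to 6/5.
bad_dist = {x: Fraction(x, 5) for x in (1, 2, 3)}
print(is_valid_distribution(bad_dist))  # False
```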

Expected Value and Variance of a Probability Distribution


i. Expected Value

The expected value, or mean, of a random variable is a measure of the central location for the
random variable. It is denoted by E(x) or μ. The mathematical expression for the expected value
of a discrete random variable x is as follows:
Expected value of a discrete random variable:

\[ E(x) = \mu = x_1 P(x_1) + x_2 P(x_2) + \dots + x_n P(x_n) = \sum_{i=1}^{n} x_i P(x_i) \]

where x1, x2, ..., xn are the outcomes and P(x1), P(x2), ..., P(xn) are the
corresponding probabilities.

The above formula shows that in order to compute the expected value of a discrete random
variable, we must multiply each value of the random variable by the corresponding probability
P(x) and then add the resulting products.

ii. Variance

While the expected value provides the mean value for the random variable, we often need a
measure of dispersion, or variability, for the random variable just as we need variance in block 5
to summarize the dispersion in a data set. The mathematical expression for the variance of a
discrete random variable is as follows:

Variance of a discrete probability distribution, σ²:

\[ \sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 P(x_i) = \sum_{i=1}^{n} x_i^2 P(x_i) - \mu^2 \]

and the standard deviation is

\[ \sigma = \sqrt{\sigma^2} \]

Example 1: If three fair coins are tossed, find the expected number of heads that will occur and
obtain the variance
Solution:
Begin by constructing the probability distribution for the number of heads in tossing the three
coins.
The probability distribution is constructed below:

No of heads, x 0 1 2 3
Probability, P(x) 1/8 3/8 3/8 1/8
Then,

\[ E(x) = \sum_{i=1}^{4} x_i P(x_i) = x_1 P(x_1) + x_2 P(x_2) + x_3 P(x_3) + x_4 P(x_4) \]

= 0·1/8 + 1·3/8 + 2·3/8 + 3·1/8

= 0 + 3/8 + 6/8 + 3/8 = 12/8 = 3/2 = 1.5
The theoretical mean μ = 1.5 implies that if the experiment is repeated a large number of times,
then on the average 1.5 heads occur per experiment of three tosses.

\[ \sigma^2 = \sum_{i=1}^{4} (x_i - \mu)^2 P(x_i) \]

= (x1 - μ)² · P(x1) + (x2 - μ)² · P(x2) + (x3 - μ)² · P(x3) + (x4 - μ)² · P(x4)
= (0 - 1.5)² · 1/8 + (1 - 1.5)² · 3/8 + (2 - 1.5)² · 3/8 + (3 - 1.5)² · 1/8
= 0.28125 + 0.09375 + 0.09375 + 0.28125
σ² = 0.75

Example 2: One thousand tickets are sold at $1 each for a color television valued at $350. What
is the expected value if a person purchases one ticket?

Solution:

The problem can be seen as follows:


When a person purchases one ticket, there are two possible outcomes: a loss of $1 or a net gain of $349.

Gain, x    $349      -$1
P(x)       1/1000    999/1000

Hence,
E(x) = $349 · 1/1000 + (-$1) · 999/1000 = -$0.65
Or,
E(x) = expected winnings - ticket price = $350 · 1/1000 - $1 = -$0.65
i.e. the average loss is $0.65 for each of the 1000 ticket holders.
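
The expected value and variance formulas translate directly into code. The following is a small sketch under the assumption that a distribution is stored as a mapping from each value to its probability; the function and variable names are illustrative only.

```python
from fractions import Fraction


def expected_value(dist):
    """E(x): sum of each value times its probability."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Variance: sum of squared deviations from the mean, weighted by P(x)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


# Number of heads in three tosses of a fair coin.
coins = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
# Net gain in dollars from one ticket in the television raffle.
raffle = {349: Fraction(1, 1000), -1: Fraction(999, 1000)}

print(expected_value(coins), variance(coins))  # 3/2 3/4
print(float(expected_value(raffle)))           # -0.65
```

For the three-coin example this gives a mean of 3/2 and a variance of 3/4; for the raffle it gives an expected gain of -$0.65 per ticket.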
A. The Binomial Probability Distribution

The Binomial Probability Distribution is a discrete probability distribution that has many
applications. It is associated with a multi-step experiment that we call the Binomial experiment,
which is a probability experiment satisfying the following four requirements.

Properties of the Binomial Experiment


a) Each trial can have only two outcomes or outcomes that can be reduced to
two outcomes (success or failure).
b) There must be a fixed number of trials.
c) The outcomes of each trial must be independent.
d) The probability of a success must remain the same for each trial.

Definition:

A probability distribution showing the outcomes of a Binomial experiment along with the
corresponding probabilities is termed as a Binomial Probability Distribution.

In a Binomial experiment, the probability of exactly x successes in n trials is given by:

\[ P(x) = \frac{n!}{(n-x)!\,x!}\, p^x q^{n-x} \]

Where x is the number of successes
P(x) is the probability of exactly x successes
n is the number of trials
p is the numerical probability of success on a single trial
q is the numerical probability of failure on a single trial

Note: q = 1 - p and 0 ≤ x ≤ n
Example 1: Consider the experiment of tossing a coin three times. Show that it is a binomial
experiment and find the probability of getting exactly two heads.

Solution:

This is a binomial experiment since


i) There are only two outcomes, head and tail.
ii) The number of trials is fixed (three)
iii) The probability of success, getting a head, does not change from trial to trial.
iv) The trials or tosses are independent, since the outcome of any trial is not affected by the
outcome of any other trial

Now, to find the probability of getting two heads, let p denote the probability of getting a head
on a single toss.

Then p = 1/2, q = 1 - 1/2 = 1/2, n = 3, x = 2

\[ P(2) = \frac{3!}{(3-2)!\,2!} \left(\frac{1}{2}\right)^{2} \left(\frac{1}{2}\right)^{3-2} = \frac{3!}{1!\,2!} \cdot \frac{1}{4} \cdot \frac{1}{2} = \frac{3}{8} = 0.375 \]
Example 2: A new drug is effective 60% of the time. What is the probability that in a random
sample of 4 patients, it will be effective on two of them?
Solution:
This is a Binomial experiment since the four requirements are satisfied. Define ‘effective’ as
‘success’ and ‘not effective’ as ‘failure’. Then,
p = 0.6, q = 1 - 0.6 = 0.4, n = 4, x = 2
Required: P(2)

\[ P(2) = \frac{4!}{(4-2)!\,2!} (0.6)^{2} (0.4)^{2} = 6 \times 0.0576 = 0.3456 \]

Hence, the drug will be effective on exactly two of a random sample of 4 patients with a probability of
0.3456 (or 34.56%).
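
The binomial formula can be evaluated with the standard library. The sketch below is one possible implementation, not the only one; binomial_pmf is a hypothetical name, and math.comb(n, x) computes n! / ((n - x)! x!).

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure on a single trial
    return comb(n, x) * p**x * q ** (n - x)


# Example 1: exactly two heads in three tosses of a fair coin.
print(binomial_pmf(2, 3, 0.5))            # 0.375
# Example 2: drug effective on exactly two of four patients.
print(round(binomial_pmf(2, 4, 0.6), 4))  # 0.3456
```

Summing binomial_pmf(x, n, p) over x = 0, ..., n should give 1, which is a quick way to check the first required condition for a discrete distribution.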

B. The Poisson Probability Distribution

A discrete probability distribution that is useful when n is large and p is small, and when the
events occur independently over an interval of time, area, volume, etc., is called the Poisson
probability distribution.

The Poisson probability distribution assumes the following two conditions:

i) The probability of an occurrence is the same for any two intervals of equal length.
ii) The occurrence or non-occurrence in any interval is independent of the occurrence or
non-occurrence in any other interval.
The Poisson probability function

\[ P(x; \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!} \]

Where P(x; λ) is the probability of x occurrences in an interval of time, volume, area,
etc., λ denotes the mean number of occurrences in that interval and e ≈ 2.7183

Example 1: While investigating the safety of a dangerous intersection, past police records
indicate a mean of five accidents per month. The number of accidents is distributed according to
a Poisson distribution. Find the probability in any month of
a) Exactly 3 accidents.
b) Fewer than 2 accidents.

Solution: By assumption the given distribution is a Poisson probability distribution.

Given that λ = 5

a) With x = 3,

\[ P(3) = \frac{\lambda^{x} e^{-\lambda}}{x!} = \frac{5^{3} (2.7183)^{-5}}{3!} = \frac{125 \times 0.00674}{6} = 0.1404 \]

b) Fewer than 2 accidents comprises 0 and 1 accident during any month.

\[ P(0) + P(1) = \frac{5^{0} (2.7183)^{-5}}{0!} + \frac{5^{1} (2.7183)^{-5}}{1!} = 0.0067 + 0.0337 = 0.0404 \]

Remark: Although the above probability was determined by evaluating the probability
function, it is often easier to refer to the table for the Poisson probability distribution. This
table provides probabilities for specific values of x and λ. We have included the table at the
end of this block.
For instance, in Example 1(a), λ = 5 and x = 3. In the first column of the table choose
x = 3 and match it with the column for λ = 5; the intersection of the two gives the
required probability, which is ≈ 0.1404.

Example 2: If there are 200 typographical errors randomly distributed in a 500-page


manuscript, find the probability that a given page contains exactly 3 errors.

Solution: First of all, find the mean number of errors per page:

\[ \lambda = \frac{200}{500} = 0.4 \]

Or, 0.4 error per page.
Since x = 3,

\[ P(x; \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{(2.7183)^{-0.4} (0.4)^{3}}{3!} = 0.00715 \]

Thus, there is less than a 1% probability that a given page contains exactly 3 errors.
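
In place of a table lookup, the Poisson probability function can be evaluated directly. The sketch below uses only the standard library; the name poisson_pmf and the parameter name lam (for λ) are assumptions made for illustration.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean count is lam."""
    return exp(-lam) * lam**x / factorial(x)


# Accidents example, mean of 5 per month.
print(round(poisson_pmf(3, 5), 4))                      # 0.1404
print(round(poisson_pmf(0, 5) + poisson_pmf(1, 5), 4))  # 0.0404
# Typographical errors example, mean of 0.4 per page.
print(round(poisson_pmf(3, 0.4), 5))                    # 0.00715
```

Probabilities such as "fewer than 2" are obtained by adding the values of the function for each qualifying x, as in the second line of output.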

3.2.2. Continuous Probability Distribution

So far, we have been concerned with discrete probability distributions. In this section, we shall
turn to cases in which the variable can take on any value within a given range and in which the
probability distribution is continuous.
A common continuous probability distribution is the normal probability distribution. Several
mathematicians were instrumental in its development; among them is the eighteenth-century
mathematician and astronomer Karl Gauss. In honor of his work, the normal probability
distribution is often called the Gaussian distribution.

There are two basic reasons why the normal distribution occupies such a prominent place in
statistics. First, it has some properties that make it applicable to a great many situations in which
it is necessary to make inferences by taking samples. Second, the normal distribution comes
close to fitting the actual observed frequency distributions of many phenomena, including human
characteristics (weights, heights and IQs).

The Normal Probability Distribution

Many continuous variables, such as height and weight, have distributions that are bell-shaped;
such variables are called approximately normally distributed. This gives rise to the most
important probability distribution used to describe a continuous random variable, the normal
probability distribution.
The normal probability distribution is a continuous, symmetric, bell-shaped
distribution of a variable.
The form or shape of the normal probability distribution is shown below.

The shape and position of the normal distribution curve depends on two parameters, the mean
and the standard deviation. Each normally distributed variable will have its own normal
distribution curve.
Properties of the normal probability distribution

• The normal distribution curve is bell-shaped.
• The mean, median and mode are equal and are located at the center of the
distribution.
• The curve is unimodal.
• The curve is symmetrical about the mean.
• The curve is continuous and never touches the x-axis.
• The total area under the normal distribution curve is equal to 1, or 100%.
• The area under the normal curve that lies within one SD of the mean is
approximately 0.68 or 68%, within two SDs about 0.95 or 95% and within
three SDs about 0.997 or 99.7%.

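[Figure: normal curve with the horizontal axis marked at μ-3σ, μ-2σ, μ-σ, μ, μ+σ, μ+2σ and μ+3σ. On each side of the mean, 34.13% of the area lies within one SD, a further 13.59% lies between one and two SDs, and 2.28% lies beyond two SDs, so that about 68%, 95% and 99.7% of the area lies within one, two and three SDs of the mean respectively.]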

The mathematical equation of the normal probability distribution is defined by the probability
density function.

\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]

Where μ = mean
π ≈ 3.14159
e ≈ 2.7183
σ = standard deviation
The standard normal probability distribution

A random variable that has a normal distribution with a mean of 0 and a standard deviation of 1
is said to have a standard normal probability distribution. Recall that the standard score (z-
score) of a value is the number of standard deviations that value is from the mean. All normally
distributed variables can be transformed into the standard normal distributed variable by using
the formula for the standard score:
\[ z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \quad \text{or} \quad z = \frac{X - \mu}{\sigma} \]

[Figure: the standard normal curve, with the z-axis marked at -3, -2, -1, 0, 1, 2 and 3.]

Area under the normal curve

As with other continuous random variables, probability calculations with any normal probability
distribution are made by computing areas under the graph of the probability density function.
Thus, to find the probability that a normal random variable lies within any specific interval, we
must compute the area under the normal curve over that interval.

For the standard normal probability distribution, areas under the normal curve have been
computed and are available in tables that can be used in computing probabilities. The normal
probability distribution table is available at the end of this block.

For the solution of problems using the normal distribution, the following steps are used.
1. Draw a picture
2. Transform the given value to z-value
3. Shade the area desired
4. Read the area from the standard normal distribution table.
Example 1: Find the area under the normal curve between z=0 and z=2.34

Solution:
[Figure: the standard normal curve with the area between 0 and 2.34 shaded.]
From the table, the intersection of the row z = 2.3 with the column 0.04 gives 0.4904 or
49.04%, which is the required area.

Example 2 : Find the area under the normal distribution curve between z = -1.93 and z = 2.35

Solution: For ease of reference, draw the normal curve and locate the two z-scores.
[Figure: the standard normal curve with the area between -1.93 and 2.35 shaded.]
The total area (the shaded region) is the area between -1.93 and 0 plus the
area between 0 and 2.35.

Hence, from the normal distribution table

Area = 0.4732 + 0.4906 = 0.9638 or 96.38%. Note that it is equivalent to say that the
probability of the z-value lying between z = -1.93 and z = 2.35 is 96.38%. This can also be
written as:
P(-1.93 < z < 2.35) = 0.9638

Example 3: Find the probability that the z-value of a normally distributed variable lies to the
left of 1.65

Solution
The probability that the z-value lies to the left of 1.65 is equivalent to the area under
the standard normal curve to the left of 1.65.
[Figure: the standard normal curve with the area to the left of 1.65 shaded.]
Hence, total area = area to the left of 0 plus area between 0 and 1.65, i.e.
P(z < 1.65) = 0.5000 + 0.4505 = 0.9505 or 95.05%,
which is the required probability.

Example 4: Find P(z > 1.91)

Solution
[Figure: the standard normal curve with the area to the right of 1.91 shaded.]
P(z > 1.91) = area to the right of 0 minus area between 0 and 1.91,
i.e.
P(z > 1.91) = P(z > 0) - P(0 < z < 1.91)
= 0.5000 - 0.4719
= 0.0281 or 2.81%
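
The table values used in Examples 1 to 4 can be cross-checked numerically. The sketch below computes the cumulative area under the standard normal curve from the error function in Python's math module; the helper names phi and area_between are illustrative, and the results agree with the table only up to its four-decimal rounding.

```python
from math import erf, sqrt


def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def area_between(a, b):
    """Area under the standard normal curve between z = a and z = b."""
    return phi(b) - phi(a)


print(round(area_between(0, 2.34), 4))      # 0.4904  (Example 1)
print(round(area_between(-1.93, 2.35), 4))  # 0.9638  (Example 2)
print(round(phi(1.65), 4))                  # 0.9505  (Example 3)
print(round(1 - phi(1.91), 4))              # 0.0281  (Example 4)
```

Note that the table lists areas between 0 and z, which correspond to phi(z) - 0.5 in this sketch.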
Exercise
If a random variable x has a normal distribution with a mean of 5.6 and a standard deviation of 1.4, find
a) P(5 < x < 6) b) P(x < 7) c) P(x > 6.4)

Applications of the Normal Distribution

The area under the normal curve is used to solve practical application problems such as finding
probabilities or percentages of values. In order to solve such problems you need only transform
the values of the variable into the z values and read the standard normal distribution table.

Example 1: The scores for an IQ test are normally distributed with a mean of 100 and a
standard deviation of 15. Find the percentage of IQ scores that will fall below 112.

Solution
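Step 1: Draw a figure and represent the area.
[Figure: normal curve with mean 100 and the area to the left of 112 shaded; on the z-scale, 0 and 0.8.]
Step 2: Find the z-value corresponding to an IQ score of 112.

\[ z = \frac{x - \mu}{\sigma} = \frac{112 - 100}{15} = 0.8 \]

Step 3: From the table,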
P(z < 0.8) = P(z < 0) + P(0 < z < 0.8) = 0.5000 + 0.2881 = 0.7881
Hence, 78.81% of the IQ scores fall below 112.
Example 2: The monthly salaries of 2000 workers are normally distributed with a mean of birr
550 and a standard deviation of birr 80. Find the number of workers whose monthly salaries are
a) Between birr 600 and 700
b) Less than birr 700.

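Solution: The z-values corresponding to 600 and 700 are

\[ z = \frac{600 - 550}{80} = 0.625 \qquad \text{and} \qquad z = \frac{700 - 550}{80} = 1.875 \]

[Figure: normal curve with 550, 600 and 700 marked on the salary axis, corresponding to z = 0, 0.625 and 1.875.]

a) From the table, with the z-values rounded to 0.63 and 1.88,
P(0.625 < z < 1.875) ≈ 0.4699 - 0.2357 = 0.2342

Hence, 23.42% × 2000 = 468.4, so approximately 468 of the workers earn a monthly salary
between birr 600 and 700.

b) From the table, with z rounded to 1.88, P(z < 1.875) ≈ 0.5000 + 0.4699 = 0.9699

Hence, 96.99% × 2000 = 1939.8

Approximately 1940 of the workers earn a monthly salary less than birr 700.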
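
The two application examples can also be checked without converting to z-values by hand. The sketch below uses statistics.NormalDist from the Python standard library (Python 3.8 or later). The standard deviation of 80 for the salaries is an assumption inferred from the z-value computations, and the variable names are illustrative. Because no table rounding is involved, the worker counts can differ slightly from the table-based figures.

```python
from statistics import NormalDist

# Example 1: IQ scores with mean 100 and standard deviation 15.
iq = NormalDist(mu=100, sigma=15)
print(round(iq.cdf(112), 4))  # 0.7881

# Example 2: salaries with mean 550; sigma = 80 is inferred from the z-values.
salary = NormalDist(mu=550, sigma=80)
workers = 2000
below_700 = workers * salary.cdf(700)
between_600_700 = workers * (salary.cdf(700) - salary.cdf(600))
print(round(below_700), round(between_600_700))
```

The cdf method returns the area to the left of a value, so the area between two values is the difference of two cdf calls.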

Example 3: A college desires to accept only the top 10% of all graduating seniors on the basis
of the results of a national placement test. The test has a mean of 500 and a standard deviation of
100. Find the cut-off score for the exam.

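Solution:
[Figure: normal curve with mean 500 and the upper 10% of the area shaded to the right of the cut-off score x; on the z-scale, 0 and z.]
We solve the problem backward.
We need to determine the point on the axis that cuts off the upper 10% of the area.
Let it be denoted by x, with corresponding z-value z. The area between 0 and z is

P(0 < Z < z) = 0.5000 - 0.1000 = 0.4000

From the table, the z-value that corresponds to the area 0.4000 is approximately 1.28.
Then,

\[ 1.28 = \frac{x - 500}{100} \;\Rightarrow\; x = 628 \]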

Hence, the score 628 should be used as the cut-off score. Any student scoring below 628 should
not be admitted.
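
Working backward from an area to a score is an inverse lookup. As a sketch, the inv_cdf method of statistics.NormalDist (Python 3.8 or later) returns the value below which a given fraction of the area lies; the variable names here are illustrative.

```python
from statistics import NormalDist

# Placement test scores with mean 500 and standard deviation 100.
test_scores = NormalDist(mu=500, sigma=100)

# The top 10% lies above the score that has 90% of the area to its left.
cutoff = test_scores.inv_cdf(0.90)
print(round(cutoff))  # 628
```

The corresponding z-value, NormalDist().inv_cdf(0.90), is about 1.28, in line with the table lookup.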
Exercise
A standardized test has a mean of 50 and a standard deviation of 10. The scores are normally
distributed. If the test is administered to 800 students, approximately how many will score
between 48 and 62?
