
Chapter 6 Basic Probability distributions

CONTENTS
6.1. Random variables
6.2. The probability distribution for a discrete random variable
6.3. Numerical characteristics of a discrete random variable
6.4. The binomial probability distribution
6.5. The Poisson distribution
6.6. Continuous random variables: distribution function and density function
6.7. Numerical characteristics of a continuous random variable
6.8. The normal distribution
6.9. The Exponential distribution
6.10. Exercises

6.1 Random variables


A random variable is a quantity resulting from a random experiment that, by chance, can
assume different values.
The outcome of an experiment need not be a number; for example, the outcome when a
coin is tossed can be 'heads' or 'tails'. However, we often want to represent outcomes as
numbers. A random variable is a function that associates a unique numerical value with
every outcome of an experiment. It is just a way of assigning a numerical result to the
outcome of a random experiment.
The value of the random variable will vary from trial to trial as the experiment is
repeated. A random variable is a numerical measure of the outcome from a probability
experiment, so its value is determined by chance. Random variables are denoted using
letters such as X.
Example 6.1 Observe 100 babies born in a clinic. The number of boys among them
is a random variable. It may take values from 0 to 100.
Example 6.2 The number of patients visiting a clinic each day is a random variable.
Example 6.3 Select one student from a university, measure his or her height, and record
this height as x. Then x is a random variable, assuming values from, say, 100 cm to
260 cm depending on the student selected.
Example 6.4 The weight of a baby at birth is also a random variable. It can assume
values in an interval, for example, from 800 grams to 6000 grams.

Definition 6.1
A random variable is a variable that assumes numerical values associated with events
of an experiment.
A random variable can be classified as being either discrete or continuous depending on
the numerical values it assumes.
Classification of random variables
Random variables may be divided into two types: discrete random variables and
continuous random variables.

2006. R Waema Probability distributions 1


Discrete Random variable
A discrete random variable is a variable whose possible values are separated from one
another, most often whole numbers such as 7, 9, and so on. When it is a count, a discrete
random variable cannot take fractional values.
Things such as people, cars, or defectives can be counted and are discrete items.
A discrete random variable is one which may take on only a countable number of distinct
values such as 0, 1, 2, 3, 4, ... Discrete random variables are usually (but not necessarily)
counts. If a random variable can take only a finite number of distinct values, then it must
be discrete. Examples of discrete random variables include:
• the number of children in a family
• the Friday night attendance at a cinema
• the number of patients in a doctor's surgery
• the number of defective light bulbs in a box of ten.

Continuous random variable


A continuous random variable is one which can take any value in an interval, so it has
infinitely many possible values. Continuous random variables are usually measurements. Examples include:
• height
• weight
• the amount of sugar in an orange
• the time required to run a mile
A continuous random variable is a random variable that results from measurement.

Independent Random Variables


Two random variables, say X and Y, are said to be independent if and only if knowing the
value of X does not change the probabilities of the values of Y, and vice versa.

Among the random variables described above the number of boys in Example 6.1 and the
number of patients in Example 6.2 are discrete random variables, the height of students
and the weight of babies are continuous random variables.
Example 6.6 Suppose you randomly select a student attending your university. Classify
each of the following random variables as discrete or continuous:
• Number of credit hours taken by the student this semester
• Current grade point average of the student
Solution
• The number of credit hours taken by the student this semester is a discrete
random variable because it can assume only a countable number of values (for
example 10, 11, 12, and so on). It is not continuous, since the number of credit
hours cannot assume values such as 11.6678, 16.3466 and 12.9876 hours.
• The grade point average for the student is a continuous random variable because
it could theoretically assume any value (for example, 6.466, 8.986)
corresponding to the points on the interval from 0 to 10 of a line.

Probability Distribution
What is a Probability Distribution
A probability distribution is similar to the relative frequency distribution of a quantitative
population because both describe the long-run relative frequency of outcomes. In other words, a
probability distribution is a listing of all the possible values that a random variable can take,
along with their probabilities.
The "rule" that associates probabilities with specific values of a random variable is
referred to as a probability distribution.
A probability distribution provides the possible values of the random variable and their
corresponding probabilities. A probability distribution can be in the form of a table,
graph or mathematical formula.
A probability function is a mathematical rule that assigns probabilities to the values of a
random variable such that:
• Every probability assigned to a value of the random variable is between 0 and 1
inclusive.
• The probabilities of all the values of the random variable sum to 1.
If these two conditions aren't met, then the function isn't a probability function. There is
no requirement that the values of the random variable be between 0 and 1, only that
the probabilities be between 0 and 1.
Definition 6.3
The probability distribution for a discrete random variable x is a table, graph, or
formula that gives the probability of observing each value of x. We shall denote the
probability of x by the symbol p(x).

For example, suppose we want to find the probability distribution for the number of
heads on three tosses of a coin:

Number of heads X    Probability P(X)
0                    1/8
1                    3/8
2                    3/8
3                    1/8
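
The table above can be reproduced by listing every equally likely sequence of tosses and counting heads. The following is a minimal sketch in Python; the function name `heads_distribution` is a hypothetical helper introduced only for illustration, and the coin is assumed to be fair.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def heads_distribution(n_tosses):
    """Return {number of heads: probability} for n_tosses of a fair coin."""
    # Every sequence of H/T of length n_tosses is equally likely.
    outcomes = list(product("HT", repeat=n_tosses))
    # Count how many sequences give each number of heads.
    counts = Counter(seq.count("H") for seq in outcomes)
    total = len(outcomes)
    return {x: Fraction(counts[x], total) for x in sorted(counts)}

dist = heads_distribution(3)
for x, p in dist.items():
    print(x, p)
```

Exact fractions are used so that the probabilities match the table entries (1/8, 3/8, 3/8, 1/8) without rounding.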

Presenting Probability distribution in Table form


In this case, we can compute the probability associated with each possible value of x by
counting the outcomes that give that value and dividing by the total number of equally likely
outcomes of the experiment. The table contains two columns: the first column gives the
possible values of the random variable X and the second column gives the
corresponding probability P(X) of each value. For example:

X      P(X)
x1     p1
x2     p2
...    ...
xn     pn

where pk = P(X = xk) is the probability that the variable X assumes the value xk (k = 1, 2, ..., n).
Example 1
To make things a bit more interesting here, we will consider the experiment in which a
fair coin is flipped five times (or five identical coins are flipped simultaneously). We
know from previous work that this experiment will have 2^5 = 32 distinct outcomes.
Now, define:

X = number of heads when a fair coin is flipped 5 times

Then, X is a random variable (because it associates a numerical value with each of the 32
outcomes), and further, it is a discrete random variable because x can have only the
distinct values 0, 1, 2, 3, 4, or 5.
The probability distribution for X is just some way of specifying the probability of
observing each possible value of X.

x      Pr(x)
0      1/32 = 0.03125
1      5/32 = 0.15625
2      10/32 = 0.3125
3      10/32 = 0.3125
4      5/32 = 0.15625
5      1/32 = 0.03125
total: 32/32 = 1

Notice that since the values, Pr(x), are just probabilities, each lies between 0 and 1 and
together they sum to 1.

Example 3
A balanced coin is tossed twice and the number x of heads is observed. Find the
probability distribution for x.
Solution
Let Hk and Tk denote the observation of a head and a tail, respectively, on the kth toss, for
k = 1, 2. The four simple events and the associated values of x are shown in Table 6.1.

Table 6.1 Simple events of the experiment of tossing a coin twice

SIMPLE EVENT    DESCRIPTION    PROBABILITY    NUMBER OF HEADS
E1              H1H2           0.25           2
E2              H1T2           0.25           1
E3              T1H2           0.25           1
E4              T1T2           0.25           0

The event x = 0 is the collection of all simple events that yield a value of x = 0, namely,
the simple event E4. Therefore, the probability that x assumes the value 0 is
P(x = 0) = p(0) = P(E4) = 0.25.
The event x = 1 contains two simple events, E2 and E3. Therefore,
P(x = 1) = p(1) = P(E2) + P(E3) = 0.25 + 0.25 = 0.5.
Finally,
P(x = 2) = p(2) = P(E1) = 0.25.

Table 6.2 Probability distribution for x, the number of heads in two tosses of a coin

x    p(x)
0    0.25
1    0.5
2    0.25

As another example, in a survey of 300 households, 54 had no children, 117 had one child, 72
had two children, 42 had three children, 12 had four children and 3 had five children. If
we wanted to select one of these households at random for a follow-up survey, we could compute the
probability of obtaining a household with each number of children. Table 6.3
provides the necessary information. In this table we denote by x the number of children per
household and by p(x) the probability of the random variable taking a specific value. For
instance, p(1) is the probability that a randomly selected household has just one
child. In this case, this is 117/300 = 0.39.

Table 6.3: Probability Distributions for the Number of Children per Household

x p(x)

0 0.18
1 0.39
2 0.24
3 0.14
4 0.04
5 0.01
Σ p(x) = 1
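
The probabilities in Table 6.3 are relative frequencies: each household count divided by the 300 households surveyed. A short Python sketch of that calculation follows; the variable names are illustrative only.

```python
# Household counts from the survey of 300 households described above.
counts = {0: 54, 1: 117, 2: 72, 3: 42, 4: 12, 5: 3}

total = sum(counts.values())                     # number of households surveyed
p = {x: n / total for x, n in counts.items()}    # relative frequency = p(x)

for x in sorted(p):
    print(f"p({x}) = {p[x]:.2f}")
print("sum of p(x) =", round(sum(p.values()), 10))
```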

Construction of a probability histogram:

This is not a very practical method for discrete random variables, but it illustrates what
becomes the method of choice (indeed, the only practical method) for computing
probabilities for continuous random variables. You just construct a bar of height Pr(X)
and width 1 centered on the value of X. For the five-coin-toss example, we get:

[Figure: probability histogram for the five-coin-toss example. Horizontal axis: x = 0, 1, ..., 5; vertical axis: Pr(x). Bar heights: 0.03125, 0.15625, 0.3125, 0.3125, 0.15625, 0.03125.]

Other than the visual image this representation gives, there's no real new information here
that wasn't present in the tabulation earlier. Because we've drawn the columns for each
possible value of x to have a width of 1 unit, the shaded areas and the probabilities are
equivalent.
The most suitable graph for a discrete random variable is a bar graph. This is shown below for the
two-coin-toss example.

Figure 6.1 Probability distribution for X, the number of heads in two tosses of a coin

[Figure: bar graph of p(x) against x = 0, 1, 2, with bar heights 0.25, 0.5, 0.25.]

Use of an algebraic formula


In many instances, actual algebraic formulas for Pr(x) can be worked out, based on the
known characteristics of the random experiment giving rise to the values of x.

Discrete probability Distribution
A discrete probability function is a probability function in which the random variable is
discrete. Since each value of a discrete random variable is linked to an outcome of an
experiment, the values of a random variable can be related to the probabilities of the
outcomes. The result of this process is called a discrete probability distribution.
A discrete probability distribution is a distribution with a domain whose elements are
the discrete values that a discrete random variable can assume, and a range whose
elements are the probabilities associated with the values in the domain.
The probability distribution for a random variable describes how the probabilities are
distributed over the values of the random variable. For a discrete random variable x, the
probability function is denoted by p(x). The probability function provides the probability
for each value of the random variable.

Requirements for a discrete probability Distribution


The probability that X takes a specific value x is p(x). That is,
$$P(X = x) = p(x)$$
The sum of all probabilities p(x) over all possible values of x is 1, that is,
$$\sum_{\text{all } x} P(X = x) = \sum_{\text{all } x} p(x) = 1$$
where x ranges over all possible values that X can have.

Each p(x) assumes a value between 0 and 1, i.e. $0 \le p(x) \le 1$.

Properties of the probability distribution for a discrete random variable x

$$0 \le p(x) \le 1$$
$$\sum_{\text{all } x} p(x) = 1$$

Example
Let X = outcome of a 6-sided die
X 1 2 3 4 5 6
P(X) 1/6 1/6 1/6 1/6 1/6 1/6

The sum of all probabilities is 1 since 1/6 + 1/6 + · · · + 1/6 = 1 and all probabilities are
between 0 and 1.
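
The two requirements can be checked mechanically. The sketch below assumes the distribution is stored as a dictionary mapping values to probabilities; `is_probability_distribution` is a hypothetical helper name, and exact fractions are used so that the sum can be compared with 1 without floating-point error.

```python
from fractions import Fraction

def is_probability_distribution(probs):
    """Check the two requirements: each p(x) in [0, 1] and the p(x) sum to 1."""
    values = list(probs.values())
    return all(0 <= p <= 1 for p in values) and sum(values) == 1

# The fair six-sided die from the example above.
die = {face: Fraction(1, 6) for face in range(1, 7)}
# A made-up counter-example whose probabilities sum to 5/4.
not_a_distribution = {1: Fraction(1, 2), 2: Fraction(3, 4)}

print(is_probability_distribution(die))                 # True
print(is_probability_distribution(not_a_distribution))  # False
```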

Example
Consider the random variable X having the following probability distribution
X 2 3 5 7
P(X) 0.1 0.4 0.2 0.3

(i) Is this a probability distribution?
(ii) What is P(X = 3) or equivalently P(3)?
(iii) What is P(X = 4)?
(iv) What is P(X = 4.9999999999)?
(v) What is P(2 ≤ X ≤ 5)?
(vi) What is P(2 ≤ X < 5)?
(vii) What is P(X > 2)?

Use of Cumulative Probabilities

Very early in the course, you encountered cumulative frequencies and cumulative relative
frequencies. Cumulative probabilities are analogous. They are just probabilities that x is
less than or equal to some value. All random variables (discrete and continuous) have a
cumulative distribution function. It is a function giving the probability that the random
variable X is less than or equal to x, for every value x.
Formally, the cumulative distribution function F(x) is defined to be:
$$F(x) = P(X \le x) \quad \text{for } -\infty < x < \infty$$
For a discrete random variable, the cumulative distribution function is found by summing
up the probabilities as in the example below.

For the five-coin-toss example, we can easily tabulate cumulative probabilities using the
probabilities for the individual values of x in the table above:

k    Pr(x ≤ k)
0 1/32 = 0.03125
1 6/32 = 0.1875
2 16/32 = 0.5
3 26/32 = 0.8125
4 31/32 = 0.96875
5 32/32 = 1

Thus, for example

Pr(x ≤ 2) = Pr(x = 0) + Pr(x = 1) + Pr(x = 2) = 1/32 + 5/32 + 10/32 = 16/32 = 0.5.

Cumulative probabilities are particularly useful for calculating the probability that ranges
of values of x occur. Thus, for example,

Pr(1 ≤ x ≤ 4) = Pr(x ≤ 4) - Pr(x ≤ 0)
= 0.96875 - 0.03125 = 0.9375
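
As a rough illustration, the cumulative table and the range calculation above can be sketched in Python with a running sum. The helper `prob_between` is a hypothetical name, and the probabilities are those of the five-coin-toss example.

```python
from fractions import Fraction
from itertools import accumulate

# Pr(x) for the number of heads in five tosses of a fair coin (x = 0, ..., 5).
pr = [Fraction(n, 32) for n in (1, 5, 10, 10, 5, 1)]

# cdf[k] = Pr(x <= k), the running sum of the individual probabilities.
cdf = list(accumulate(pr))

def prob_between(lo, hi):
    """Pr(lo <= x <= hi) for integers 0 <= lo <= hi <= 5."""
    below = cdf[lo - 1] if lo > 0 else Fraction(0)
    return cdf[hi] - below

print([float(c) for c in cdf])
print(float(prob_between(1, 4)))   # 0.9375
```

Subtracting Pr(x ≤ lo − 1) rather than Pr(x ≤ lo) is what keeps the lower endpoint inside the range.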

The cumulative distribution function (c.d.f.) of a discrete random variable X is the
function F(t) which tells you the probability that X is less than or equal to t. So if X has
probability function p(x), we have:

$$F(t) = P(X \le t) = \sum_{x \le t} p(x)$$

In other words, for each value that X can be which is less than or equal to t, work out the
probability that X is that value and add up all such results.

The Mean and Standard Deviation of a Discrete Probability Distribution


As we encountered earlier the mean (or what is sometimes called the expected value) of a
random variable is a measure of the central location for the random variable.

Expected Value of a discrete random variable


In everyday usage, an expectation is what is considered most likely to happen; in
probability, the expected value is a long-run average and need not be the most likely value.
The expected value (or expectation) of a discrete random variable is the sum, over all
possible outcomes of the experiment, of the probability of each outcome multiplied by its payoff ("value").
The expected value (or population mean) of a random variable indicates its average or
central value. It is a useful summary value (a number) of the variable's distribution.
Stating the expected value gives a general impression of the behavior of some random
variable without giving full details of its probability distribution.
The expected value of a random variable X is symbolized by E(X) or µ.
The expression for the mean (or expected value) of a discrete random variable is given by
$$E(x) = \mu = \sum x\,P(x)$$
where the terms are summed over all values of the random variable X.
Example
Discrete case: When a die is thrown, each of the possible faces 1, 2, 3, 4, 5, 6 (the xi's)
has a probability of 1/6 (the p(xi)'s) of showing. The expected value of the face showing
is therefore:
µ = E(X) = (1 x 1/6) + (2 x 1/6) + (3 x 1/6) + (4 x 1/6) + (5 x 1/6) + (6 x 1/6) = 3.5
Notice that, in this case, E(X) is 3.5, which is not a possible value of X.
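
The expected-value sum can be written directly as code. The sketch below assumes a distribution stored as a dictionary of value-probability pairs; `expected_value` is a hypothetical helper name used only for this illustration.

```python
from fractions import Fraction

def expected_value(dist):
    """E(X) = sum of x * p(x) over all values x in the distribution."""
    return sum(x * p for x, p in dist.items())

# Fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mu = expected_value(die)
print(mu, float(mu))   # 7/2 3.5
```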

Definition 6.4
The expected value (or mean) of a random variable x, denoted by the symbol E(x), is
defined as follows:
Let x be a discrete random variable with probability distribution p(x). Then the mean
or expected value of x is
$$\mu = E(x) = \sum_{\text{all } x} x\,p(x)$$

Example 6.6 For example, in a survey of 300 households 54 had no children, 117 had
one child, 72 had two children, 42 had three children, 12 had four children and 3 had five
children.
Table 1.1: Probability Distributions for the Number of Children per Household

x    p(x)

0    0.18
1    0.39
2    0.24
3    0.14
4    0.04
5    0.01
Σ p(x) = 1

Using the data from Table 1.1, the mean can be readily computed as

µ = 0×0.18 + 1×0.39 + 2×0.24 + 3×0.14 + 4×0.04 + 5×0.01 = 1.5

Variance of a Discrete Random variable


Variance is a parameter that measures how dispersed a random variable’s probability
distribution is. The (population) variance of a random variable is a non-negative number
which gives an idea of how widely spread the values of the random variable are likely to
be; the larger the variance, the more scattered the observations on average.
Stating the variance gives an impression of how closely concentrated round the expected
value the distribution is; it is a measure of the 'spread' of a distribution about its average
value.
Variance is symbolised by V(X) or Var(X) or σ².
• Variance of a discrete random variable:
$$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 P(x)$$

Note
The larger the variance, the further that individual values of the random variable
(observations) tend to be from the mean, on average;
The smaller the variance, the closer that individual values of the random variable
(observations) tend to be to the mean, on average;
Taking the square root of the variance gives the standard deviation, i.e.:
$$\sigma = \sqrt{\mathrm{Var}(x)}$$
The variance and standard deviation of a random variable are always non-negative.

Definition 6.6
The second important numerical characteristics of a random variable are its variance
and standard deviation, which are defined as follows:

Let x be a discrete random variable with probability distribution p(x). Then the
variance of x is
$$\sigma^2 = \sum_{\text{all } x} (x - \mu)^2 p(x)$$
The standard deviation of x is the positive square root of the variance of x:
$$\sigma = \sqrt{\sigma^2}$$

Table 1.2: Calculation of the Variance for Number of Children per Household

x    x − µ    (x − µ)²    p(x)    (x − µ)² p(x)
0    -1.5     2.25        0.18    0.4050
1    -0.5     0.25        0.39    0.0975
2     0.5     0.25        0.24    0.0600
3     1.5     2.25        0.14    0.3150
4     2.5     6.25        0.04    0.2500
5     3.5     12.25       0.01    0.1225
                                  Σ = 1.25

Thus, in this case σ² = 1.25. The square root of this value provides the standard
deviation for the number of children per household and this is given by:

σ = √1.25 ≈ 1.118
This again is measured in the same units as the random variable, which is children per
household.
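
The column-by-column calculation for the households example can be sketched in Python as follows. This is only an illustration using the definition of the variance; the variable names are not from the original notes.

```python
from math import sqrt

# Probability distribution of the number of children per household.
p = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.04, 5: 0.01}

mu = sum(x * px for x, px in p.items())                 # mean
var = sum((x - mu) ** 2 * px for x, px in p.items())    # variance, by definition
sd = sqrt(var)                                          # standard deviation

print(round(mu, 4), round(var, 4), round(sd, 3))
```

Rounding is applied only when printing, so the intermediate values keep full floating-point precision.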
Note: Consider the following.
The definitions for population mean and variance used with an ungrouped frequency
distribution were:
$$\mu = \frac{\sum x f}{N} \qquad \sigma^2 = \frac{\sum (x - \mu)^2 f}{N}$$
Some of you might be confused by only dividing by N. Recall that this is the population
variance; the sample variance, which is the unbiased estimator of the population
variance, divides by n-1 instead.
Using algebra, this is equivalent to:
$$\sigma^2 = \frac{\sum x^2 f}{N} - \mu^2$$
Recall that a probability is a long term relative frequency. So every f/N can be replaced
by p(x). This simplifies to be:
$$\mu = \sum x\,p(x) \qquad \sigma^2 = \sum x^2 p(x) - \left[\sum x\,p(x)\right]^2$$
What's even better is that the last portion of the variance is the mean squared. So, the two
formulas that we will be using are:
$$\mu = \sum x\,p(x) \qquad \sigma^2 = \sum x^2 p(x) - \mu^2$$

Example 6.7 Refer to the two-coin tossing experiment and the probability distribution
for x, shown in Figure 6.1. Find the variance and standard deviation of x.

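Solution In Example 6.6 we found the mean of x is 1. Then

$$\sigma^2 = E[(x - \mu)^2] = \sum_{x=0}^{2} (x - \mu)^2 p(x) = (0 - 1)^2 \left(\tfrac{1}{4}\right) + (1 - 1)^2 \left(\tfrac{1}{2}\right) + (2 - 1)^2 \left(\tfrac{1}{4}\right) = \tfrac{1}{2}$$
and
$$\sigma = \sqrt{\sigma^2} = \sqrt{\tfrac{1}{2}} \approx 0.707$$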
Example 2: Here's the example we were working on earlier.

x 1 2 3 4 5 6 sum

p(x) 1/6 1/6 1/6 1/6 1/6 1/6 6/6 = 1

x p(x) 1/6 2/6 3/6 4/6 5/6 6/6 21/6 = 3.5

x^2 p(x) 1/6 4/6 9/6 16/6 25/6 36/6 91/6 = 15.1667
The mean is 7/2 or 3.5
The variance is 91/6 - (7/2)^2 = 35/12 = 2.916666...
The standard deviation is the square root of the variance = 1.7078
Do not use rounded off values in the intermediate calculations. Only round off the final
answer.
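
One way to follow the advice about rounding is to keep exact fractions until the last step. The sketch below applies the shortcut formula (the sum of x^2 p(x) minus the mean squared) to the die example; it is an illustration, not the only way to organise the calculation.

```python
from fractions import Fraction
from math import sqrt

die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = sum(x * p for x, p in die.items())          # sum of x p(x)
mean_sq = sum(x**2 * p for x, p in die.items())    # sum of x^2 p(x)
variance = mean_sq - mean**2                       # shortcut formula, still exact
std_dev = sqrt(variance)                           # converted to a float only here

print(mean, mean_sq, variance, round(std_dev, 4))
```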

Types of Discrete Probability distribution


A discrete random variable can follow any of several standard probability
distributions, including:
• Binomial probability distribution
• Poisson probability distribution
• Geometric probability distribution
• Bernoulli distribution

6.4 The binomial probability distribution


One of the most widely known discrete distributions is the binomial distribution, which
has been used for hundreds of years. The binomial distribution is used when there are
exactly two mutually exclusive outcomes of a trial. These outcomes are appropriately
labeled "success" and "failure". The binomial distribution is used to obtain the probability
of observing x successes in n trials, with the probability of success on a single trial
denoted by p. The binomial distribution assumes that p is fixed for all trials.
Many types of probability problems have only two outcomes, or they can be reduced to
two outcomes.  For example, when a coin is tossed, it can land heads or tails.  When a
baby is born, it will be either male or female.  In a basketball game, a team either wins or
loses.  A true-false item can be answered in only two ways, true or false.  Other situations
can be reduced to two outcomes.  For example, a medical treatment can be classified as
effective or ineffective, depending on the results.  A person can be classified as having
normal or abnormal blood pressure, depending on the measure of the blood pressure
gauge.  A multiple-choice question, even though there are four or five answer choices,
can be classified as correct or incorrect.  Situations like these are called binomial
experiments.
An experiment has a binomial probability distribution if it satisfies the following
requirements:
1.  Each trial can have only two outcomes or outcomes that can be reduced to two
outcomes.  These outcomes can be considered as either success or failure.
2.  There must be a fixed number of trials.
3.  The outcomes of each trial must be independent of each other.
4.  The probability of a success must remain the same for each trial.

Examples of binomial experiments
• Tossing a coin 20 times to see how many tails occur.
• Asking 200 people if they watch ABC news.
• Rolling a die to see if a 5 appears.

Examples which aren't binomial experiments
• Rolling a die until a 6 appears (not a fixed number of trials)
• Asking 20 people how old they are (not two outcomes)
• Drawing 5 cards from a deck for a poker hand (done without replacement, so not
independent)

Notation for the Binomial Distribution


P(S)   The symbol for the probability of success
P(F)   The symbol for the probability of failure
p      The numerical probability of a success
q      The numerical probability of a failure
       P(S) = p and P(F) = 1 - p = q
n      The number of trials
X      The number of successes
       Note that 0 ≤ X ≤ n
p(x)   The probability of getting exactly x successes among n trials

Methods of finding probabilities of success


There are three methods for finding probabilities in a binomial experiment. The first
method involves calculations using the binomial probability formula and is the basis for
the other two methods. The second method involves the use of the binomial probability
table, and the third method involves the use of statistical software.

Example 1: A multiple choice test contains 20 questions. Each question has five choices
for the correct answer. Only one of the choices is correct. With random guessing, does
the test have a binomial probability distribution?
  
Example 2: An experiment consists of flipping a fair coin 8 times and counting the
number of tails. Does this experiment have a binomial probability distribution?
  
Example 3: A pair of dice is rolled 37 times and the number of times a sum of 7 is
observed is recorded. Does this experiment have a binomial probability distribution?
 
Example 4: A multiple choice test contains 20 questions. Each question has four or five
choices for the correct answer. Only one of the choices is correct. With random guessing,
does this test have a binomial probability distribution?

Method 1: Use of the binomial Probability Formula


The probability of a given number of successes in a binomial experiment can be computed with the
following binomial formula.

Binomial Probability Formula
In a binomial experiment, the probability of exactly X successes in n trials is
$$P(X) = \frac{n!}{(n - X)!\,X!}\, p^X q^{n - X}$$
where
p = probability of a success on a single trial, q = 1 - p
n = number of trials,
X = number of successes in n trials
The term p is the probability of getting a success on any one trial; the term
q = 1-p is the probability of getting a failure on any one trial.
The terms p and q remain constant through the sequence of trials undertaken.
In n trials, the number of successes X can be any whole number between 0 and n.

Example 1:
A coin is tossed three times.  Find the probability of getting exactly two heads.
Solution:
This problem can be solved by looking at the sample space.  There are three ways to get
two heads.
HHH, HHT, HTH, THH, TTH, THT, HTT, TTT

The answer is 3/8 or 0.375.
Looking at the problem in the previous example from the standpoint of a binomial
experiment, one can show that it meets the four requirements.
1.  There are only two outcomes for each trial, heads or tails.
2.  There is a fixed number of trials (three).
3.  The outcomes are independent of each other (the outcome of one toss in no way
affects the outcome of another toss).
4.  The probability of a success (heads) is 1/2 in each case.

In this case, n = 3, X = 2, p = 1/2, and q = 1/2.  Hence, substituting in the formula gives
$$P(2 \text{ heads}) = \frac{3!}{(3 - 2)!\,2!} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^1 = \frac{3}{8} = 0.375$$
which is the same answer obtained by using the sample space.

The same example can be used to explain the formula.  First, note that there are three ways
to get exactly two heads and one tail from a possible eight ways.  They are HHT, HTH,
and THH.  In this case, then, the number of ways of obtaining two heads from three coin
tosses is ${}_3C_2$, or 3.  In general, the number of ways to get X successes from n trials
without regard to order is
$${}_nC_X = \frac{n!}{(n - X)!\,X!}$$
This is the first part of the binomial formula.  (Some calculators can be used for this.)

Next, each success has a probability of 1/2 and can occur twice.  Likewise, each failure
has a probability of 1/2 and can occur once, giving the $(1/2)^2 (1/2)^1$ part of the formula.  To
generalize, then, each success has a probability of p and can occur X times, and each
failure has a probability of q and can occur (n-X) times.  Putting it all together yields the
binomial formula.
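
The two parts of the formula, the number of arrangements and the probability of any one arrangement, map directly onto code. The following is a minimal sketch using Python's standard `math.comb`; the function name `binomial_pmf` is a hypothetical helper chosen for this illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials, success probability p)."""
    q = 1 - p
    # comb(n, x) counts the arrangements; p**x * q**(n - x) is the
    # probability of any single arrangement with x successes.
    return comb(n, x) * p**x * q**(n - x)

print(comb(3, 2))               # 3 ways to place two heads among three tosses
print(binomial_pmf(2, 3, 0.5))  # 0.375
```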

Example 2
    If a student randomly guesses at five multiple-choice questions, find the probability
that the student gets exactly three correct.  Each question has five possible choices.

Solution
    In this case n = 5, X = 3, and p = 1/5, since there is one chance in five of guessing a
correct answer.  Then,
$$P(3) = \frac{5!}{(5 - 3)!\,3!} \left(\frac{1}{5}\right)^3 \left(\frac{4}{5}\right)^2 = 0.0512 \approx 0.05$$
Example 3
    A survey from Teenage Research Unlimited (Northbrook, Ill.) found that 30% of
teenage consumers receive their spending money from part-time jobs.  If five teenagers
are selected at random, find the probability that at least three of them will have part-time
jobs.

Solution
    To find the probability that at least three have a part-time job, it is necessary to find the
individual probabilities for 3, 4, and 5 and then add them to get the total probability.
$$P(3) = \frac{5!}{(5 - 3)!\,3!} (0.3)^3 (0.7)^2 \approx 0.132$$
$$P(4) = \frac{5!}{(5 - 4)!\,4!} (0.3)^4 (0.7)^1 \approx 0.028$$
$$P(5) = \frac{5!}{(5 - 5)!\,5!} (0.3)^5 (0.7)^0 \approx 0.002$$
Hence,
P(at least three teenagers have part-time jobs) = 0.132 + 0.028 + 0.002 = 0.162

Example 4: An experiment consists of flipping a fair coin 8 times and counting the
number of tails. Find the probability of seeing exactly 3 tails.

Example 5: A multiple choice test contains 20 questions. Each question has five choices
for the correct answer. Only one of the choices is correct. What is the probability of
making an 80 with random guessing?
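
Examples 4 and 5 can be evaluated with the binomial formula. The sketch below is one possible reading, not a worked solution from the original notes: it assumes that "making an 80" means getting exactly 16 of the 20 equally weighted questions correct, and `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example 4: exactly 3 tails in 8 flips of a fair coin.
p_three_tails = binomial_pmf(3, 8, 0.5)

# Example 5 (assumption: a score of 80 means exactly 16 of 20 correct,
# with probability 1/5 of guessing any one question correctly).
p_score_80 = binomial_pmf(16, 20, 0.2)

print(p_three_tails)
print(p_score_80)
```

Under that assumption the second probability is of the order of one in a hundred million, which shows how unlikely a high score is by guessing alone.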

Example 6: An experiment consists of flipping a fair coin 8 times and counting the
number of tails. Find the probability of seeing exactly 6 or 7 tails.

These are mutually exclusive events and thus,

P(x = 6 or x = 7) = P(x = 6) + P(x = 7) = 28/256 + 8/256 = 36/256 ≈ 0.1406
Example 7
Tests for impurities commonly found in drinking water from private wells showed that
30% of all wells in a particular country have impurity A. If a random sample of 5 wells is
selected from the large number of wells in the country, what is the probability that:
a) Exactly 3 will have impurity A?
b) At least 3?
c) Fewer than 3?
Solution
First we confirm that this experiment possesses the characteristics of a binomial
experiment. This experiment consists of n = 5 trials, one corresponding to each randomly
selected well. Each trial results in an S (the well contains impurity A) or an F (the well
does not contain impurity A). Since the total number of wells in the country is large, the
probability of drawing a single well and finding that it contains impurity A is equal to
0.30 and this probability will remain the same for each of the 5 selected wells. Further,
since the sampling is random, we assume that the outcome on any one well is unaffected
by the outcome of any other and that the trials are independent. Finally, we are interested
in the number x of wells in the sample of n = 5 that contain impurity A. Therefore, the
sampling process represents a binomial experiment with n = 5 and p = 0.30.
a) The probability of drawing exactly x = 3 wells containing impurity A is
$p(x) = C_n^x p^x q^{n-x}$ with n = 5, p = 0.30 and x = 3. By this formula,
$$p(3) = \frac{5!}{3!\,2!} (0.30)^3 (1 - 0.30)^{5-3} = 0.1323$$
b) The probability of observing at least 3 wells containing impurity A is
P(x ≥ 3) = p(3) + p(4) + p(5). We have calculated p(3) = 0.1323 and we leave it to the reader
to verify that p(4) = 0.02835 and p(5) = 0.00243. As a result, P(x ≥ 3) = 0.1323 + 0.02835 + 0.00243
= 0.16308.
c) Although P(x < 3) = p(0) + p(1) + p(2), we can avoid calculating 3 probabilities by using
the complementary relationship P(x < 3) = 1 - P(x ≥ 3) = 1 - 0.16308 = 0.83692.
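
The three parts of the wells example can be checked numerically. The sketch below assumes the same binomial model (n = 5, p = 0.30) and uses a hypothetical helper `binomial_pmf`; part c) uses the complement rule rather than summing p(0), p(1) and p(2).

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5, 0.30
exactly_3 = binomial_pmf(3, n, p)                                   # part a)
at_least_3 = sum(binomial_pmf(x, n, p) for x in range(3, n + 1))    # part b)
fewer_than_3 = 1 - at_least_3                                       # part c), complement rule

print(round(exactly_3, 5), round(at_least_3, 5), round(fewer_than_3, 5))
```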
Method 2: Use of the Binomial Probability table
In some cases, we can easily find binomial probabilities by simply referring to binomial
probability tables. First locate n and the corresponding value of x that is desired. At this
stage, one row of the numbers should be isolated. Now align that row with the proper

probability of p by using the column across the top. The isolated number represents the
desired probability (missing its decimal point at the beginning). A very small probability,
such as 0.000000345, is indicated as 0+.
Example:
Suppose that an examination consists of six true and false questions, and assume that a
student has no knowledge of the subject matter. The probability that the student will
guess the correct answer to the first question is 30%. Likewise, the probability of
guessing each of the remaining questions correctly is also 30%. What is the probability of
getting more than three correct answers?
For this problem, n = 6, p = 0.30, and we need P(X > 3). In the binomial table for n = 6,
search along the row of p values for 0.30. The answer is the sum of the probabilities for
X = 4, 5, and 6, which appear at the intersection of each X row and the p = 0.30 column:
P(X > 3) = P(X = 4) + P(X = 5) + P(X = 6) = 0.060 + 0.010 + 0.001 = 0.071 or 7.1%
Thus, if each question is answered by guessing with a 30% chance of being correct, the
probability is 0.071 (or 7.1%) that more than three of the six questions are answered
correctly by the student.
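
A table lookup of this kind can be imitated by rebuilding the relevant column of the table. The sketch below assumes the table rounds each entry to three decimal places, which matches the three values quoted in the example.

```python
from math import comb

n, p = 6, 0.30

# One column of a binomial table: P(X = x) for x = 0..n, rounded to 3 places
# (the rounding is an assumption about how the printed table was prepared).
column = {x: round(comb(n, x) * p**x * (1 - p) ** (n - x), 3) for x in range(n + 1)}

for x, prob in column.items():
    print(x, f"{prob:.3f}")

# P(X > 3) is the sum of the entries for X = 4, 5 and 6
more_than_3 = sum(column[x] for x in (4, 5, 6))
print(f"P(X > 3) = {more_than_3:.3f}")
```

Because each table entry is rounded before the entries are added, the table answer can differ slightly from the sum of the unrounded probabilities.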
Method 3: Use of computer software (Minitab)
Many computer statistics packages include an option for generating binomial
probabilities. With Minitab, first enter a column C1 of the x values for which you want
probabilities (such as 0, 1, 2, 3, …), then select Calc from the main menu, and proceed to
select the submenu items Probability Distributions and Binomial. Enter the number
of trials, the probability of success, and C1 for the input column, then click on OK.

Mean, Variance, and Standard Deviation for the Binomial Distribution


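    The mean, variance, and standard deviation of a variable that has the binomial
distribution can be found by using the following formulas.

\[ \mu = np, \qquad \sigma^2 = npq, \qquad \sigma = \sqrt{npq} \]

where
p = probability of a success on a single trial,
q = 1 - p = probability of a failure on a single trial,
n = number of trials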

These formulas are algebraically equivalent to the formulas for the mean, variance, and
standard deviation of discrete random variables for probability distributions, but because
they are for variables of the binomial distribution, they have been simplified using
algebra.  The algebraic derivation is omitted here, but their equivalence is shown in the
next example.

Example 1
    A coin is tossed four times.  Find the mean, variance, and standard deviations of the
number of heads that will be obtained.
Solution
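    With the formulas for the binomial distribution and n = 4, p = 1/2, and q = 1/2, the
results are
\[ \mu = np = 4 \cdot \tfrac{1}{2} = 2, \qquad \sigma^2 = npq = 4 \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = 1, \qquad \sigma = \sqrt{1} = 1 \]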

From the previous example, when four coins are tossed many, many times, the average of
the number of heads that appears is two, and the standard deviation of the number of
heads is one.  Note that these are theoretical values.
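As stated previously, this problem can also be solved by using the expected value formulas.
The distribution is shown as follows:

No. of heads, X       0      1      2      3      4
Probability, P(X)     1/16   4/16   6/16   4/16   1/16

\[ \mu = \sum X \cdot P(X) = 0\cdot\tfrac{1}{16} + 1\cdot\tfrac{4}{16} + 2\cdot\tfrac{6}{16} + 3\cdot\tfrac{4}{16} + 4\cdot\tfrac{1}{16} = \tfrac{32}{16} = 2 \]
\[ \sigma^2 = \sum X^2 \cdot P(X) - \mu^2 = \tfrac{0 + 4 + 24 + 36 + 16}{16} - 2^2 = 5 - 4 = 1, \qquad \sigma = 1 \]

Hence, the simplified binomial formulas give the same result.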
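
The agreement between the two approaches for the four-coin example can also be confirmed numerically. The sketch below computes the mean and variance once with the binomial shortcut formulas and once with the general expected-value formulas; the variable names are illustrative only.

```python
from math import comb, sqrt

n, p = 4, 0.5
q = 1 - p

# Shortcut formulas for the binomial distribution
mean_shortcut = n * p
var_shortcut = n * p * q

# General expected-value formulas applied to the full distribution
pmf = {x: comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)}
mean_general = sum(x * prob for x, prob in pmf.items())
var_general = sum(x**2 * prob for x, prob in pmf.items()) - mean_general**2

print(mean_shortcut, var_shortcut, sqrt(var_shortcut))
print(mean_general, var_general, sqrt(var_general))
```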

Exercise 1: An experiment consists of flipping a fair coin 10 times and counting the
number of heads. Find the mean and standard deviation for this experiment.
 
 Exercise 2: Approximately 15% of all KSU students commute more than 20 miles one-way
to campus.  Would it be unusual in a class of 60 students to have 16 students who
commute more than 20 miles one-way to campus?
The mean is 60 × 0.15 = 9 and the standard deviation is
\[ \sigma = \sqrt{npq} = \sqrt{60(0.15)(0.85)} = \sqrt{7.65} \approx 2.77 \]
Hence the z-score is
\[ z = \frac{x - \mu}{\sigma} = \frac{16 - 9}{2.77} \approx 2.53 \]
and yes, it would be unusual, since 16 lies more than two standard deviations above the
mean.
 
Exercise 3: At the KSU library, approximately ½% of books are returned late.  In the
next 1000 books that are returned, would it be unusual to see no late books?
To find probabilities from a binomial distribution, one may calculate them
directly, use a binomial table, or use a computer.

6.5 Poisson Distribution


The Poisson distribution is another discrete distribution and has some similarities with the
Binomial distribution. In contrast to the Binomial, however, the Poisson does not focus
on just two outcomes but on the number of discrete occurrences over some interval.
The Poisson distribution depends only on the average number of occurrences per unit
of time or space.
A Poisson distribution is the distribution of the number of events in a fixed time interval,
provided that the events occur at random, independently in time and at a constant rate.
Many experiments consist of observing the occurrence times of random arrivals.
Examples include
• Arrivals of customers for service in a service facility
• Arrivals of calls at a switchboard
• The number of flaws in a bolt of fabric
• The number of typos per page made by a secretary
The Poisson Distribution is a discrete distribution which takes on the values X = 0, 1, 2,
3, ... . It is often used to model the number of arrivals of events that occur in a fixed time
period (such as the number of telephone calls at a business or the number of accidents at
an intersection). It is also useful in ecological studies, e.g., to model the number of prairie
dogs found in a square mile of prairie.
The Poisson distribution is named after the French mathematician Simeon Poisson. Poisson
probabilities are useful when there are a large number of independent trials with a small
probability of success on a single trial and the events occur over a period of time. A
discrete random variable X is said to follow a Poisson distribution with parameter λ if it
has probability distribution
\[ P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}, \qquad x = 0, 1, 2, \ldots, \quad \lambda > 0, \]
where λ is the average number of occurrences in the specified interval.

The Poisson distribution has the following characteristics

(i) It is a discrete distribution.

(ii) The length of the observation period is fixed in advance.

(iii) It describes rare events (events occur randomly).

(iv) Each occurrence is independent of other occurrences.

(v) The events occur at a constant average rate (the expected number of
occurrences must hold constant throughout the experiment).

(vi) The number of occurrences in each interval can range from zero to infinity.

The Poisson probability distribution provides a close approximation to the binomial
probability distribution when n is large and p is quite small or quite large (when p is
close to 1, the approximation is applied to the number of failures, whose probability
1 - p is small). Unlike a binomial experiment, a Poisson experiment does not have a given
number of trials. For instance, a Poisson experiment might examine the number of customers
arriving at a store during a five-minute interval. If you are approximating a binomial
probability using the Poisson, then lambda is the same as n * p.
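
How close the approximation is can be seen by printing both sets of probabilities side by side. The sketch below uses hypothetical values n = 1000 and p = 0.003 (they are not taken from the text) and sets lambda = n * p.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x occurrences when lam are expected."""
    return lam**x * exp(-lam) / factorial(x)

n, p = 1000, 0.003   # hypothetical: many trials, small success probability
lam = n * p          # matching Poisson parameter

# Columns: x, exact binomial probability, Poisson approximation
for x in range(6):
    print(x, f"{binomial_pmf(x, n, p):.4f}", f"{poisson_pmf(x, lam):.4f}")
```

Repeating the comparison with a larger p (for example n = 10, p = 0.3, which also gives lambda = 3) shows a visibly poorer match.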

We can use the Poisson model in the following example. Assume that bank customers
arrive randomly on weekday afternoons at an average of 3.2 customers per 2-minute
interval. What is the probability of exactly four customers arriving in a 2-minute
interval on a weekday afternoon?

In this case x = 4 customers and λ = 3.2, so it is relatively easy to obtain the
probability using the Poisson model.

\[ f(4) = \frac{\lambda^4 e^{-\lambda}}{4!} = \frac{3.2^4 \, e^{-3.2}}{24} = \frac{4.2742}{24} = 0.1781 \]

Thus, the probability that exactly four customers arrive in a two minute interval on a
weekday afternoon is 0.1781 or there is a 17.81 per cent chance of this event occurring.

Example 2:
If there are 500 customers per eight-hour day in a check-out lane, what is the probability
that there will be exactly 3 in line during any five-minute period?
The expected value during any one five minute period would be 500 / 96 = 5.2083333.
The 96 is because there are 96 five-minute periods in eight hours. So, you expect about
5.2 customers in 5 minutes and want to know the probability of getting exactly 3.
p(3;500/96) = e^(-500/96) * (500/96)^3 / 3! = 0.1288 (approx)
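
Both worked examples above can be reproduced with a small helper. This is a minimal sketch using only the Python standard library; the function name poisson_pmf is chosen here for illustration.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) when X counts events that occur at an average of lam per interval."""
    return lam**x * exp(-lam) / factorial(x)

bank = poisson_pmf(4, 3.2)            # four arrivals, 3.2 expected per 2 minutes
checkout = poisson_pmf(3, 500 / 96)   # three arrivals, 500/96 expected per 5 minutes

print(f"{bank:.4f}")
print(f"{checkout:.4f}")
```

Note that the rate must always be expressed for the same interval length as the question: the check-out example first converts 500 customers per eight hours into customers per five minutes.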

Exercises
1. Suppose a bank knows that on average 60 customers arrive in a certain service
hour. Using a time interval of 1 minute, calculate the probability of exactly one
customer arriving in a given one minute interval within that hour.  
2. Suppose a bank knows that on average 60 customers arrive in a certain service
hour. Using a time interval of 1 minute, calculate the probability of no customers
arriving in a given one minute interval within that hour.  
3. Suppose a bank knows that on average 60 customers arrive in a certain service
hour. Using a time interval of 1 minute, calculate the probability of exactly three
customers arriving in a given one minute interval within that hour.  
4. Suppose a bank knows that on average 60 customers arrive in a certain service
hour. Using a time interval of 1 minute, calculate the probability of more than
three customers arriving in a given one minute interval within that hour.  

5. Graph the probability distribution determined in the problems above (and the
lecture example).

Mean and standard deviation of a Poisson distribution

The mean and variance of a random variable X that has a Poisson distribution with
parameter λ are mean = E(X) = λ and variance = V(X) = λ, i.e.,
E(X) = Var(X) = λ,
so the standard deviation is √λ.

Example: In one particular autobiography of a professional athlete, there are an average
of 15 spelling errors per page. If the Poisson distribution is used to model the probability
distribution of the number of errors per page, then the random variable X, the number of
errors per page, has a Poisson distribution with λ = 15. The probability that there are no
errors on a page is
\[ P(X = 0) = \frac{e^{-15}\, 15^0}{0!} = e^{-15} \approx 3.06 \times 10^{-7}. \]

Continuous probability distribution


A continuous probability distribution is a distribution with a domain whose elements
are the continuous values that a continuous random variable can assume, and a range
whose elements are the probabilities associated with the values in the domain.
The probability distribution for a random variable describes how the probabilities are
distributed over the values of the random variable. For a continuous random variable x,
this is done with a probability density function (pdf), denoted by f(x). Unlike the
probability function of a discrete variable, f(x) does not give the probability that the
variate X has the value x: for continuous distributions the probability at a single
point is zero, so probabilities are expressed in terms of an integral of f(x) between two
points.

The probability density function of a continuous random variable is a function which can
be integrated to obtain the probability that the random variable takes a value in a given
interval.

Requirements for a Continuous probability Distribution


The mathematical definition of a continuous probability function, f(x), is a function that
satisfies the following properties.
1. The probability that x is between two points a and b is
\[ P(a \le x \le b) = \int_a^b f(x)\,dx \]
2. It is non-negative for all real x: f(x) ≥ 0.
3. The integral of the probability function is one, that is, the total probability for all
possible values of the continuous random variable X is 1:
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]

What does this actually mean? Since continuous probability functions are defined for an
infinite number of points over a continuous interval, the probability at a single point is
always zero. Probabilities are measured over intervals, not single points. That is, the area
under the curve between two distinct points defines the probability for that interval. This
means that the height of the probability function can in fact be greater than one. The
property that the integral must equal one is equivalent to the property for discrete
distributions that the sum of all the probabilities must equal one.
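
These points can be illustrated numerically. The sketch below uses a hypothetical density (uniform on the interval from 0 to 0.5, so its height is 2) and a simple midpoint-rule integrator written for this illustration; neither comes from the text.

```python
def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def density(x):
    # Hypothetical density: uniform on [0, 0.5], so its height there is 2.
    return 2.0 if 0.0 <= x <= 0.5 else 0.0

total = integrate(density, 0.0, 0.5)   # total probability over the whole range
part = integrate(density, 0.1, 0.3)    # P(0.1 <= X <= 0.3), an area
point = integrate(density, 0.2, 0.2)   # a single point has zero width, so zero area

print(total, part, point)
```

The height of this density is 2 everywhere on its range, yet every probability computed from it stays between 0 and 1 because probabilities are areas, not heights.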

Cumulative Distribution Function for a continuous random variable


All random variables (discrete and continuous) have a cumulative distribution function. It
is a function giving the probability that the random variable X is less than or equal to x,
for every value x.
Formally, the cumulative distribution function F(x) is defined to be:
\[ F(x) = P(X \le x) \quad \text{for } -\infty < x < \infty \]
For a continuous random variable, the cumulative distribution function is the integral of
its probability density function.
If X is a continuous random variable with p.d.f. f(x) defined on a ≤ X ≤ b, then the
cumulative distribution function (c.d.f.), written F(t), is given by:
\[ F(t) = P(X \le t) = \int_a^t f(x)\,dx \]
So the c.d.f. is found by integrating the p.d.f. between the minimum value of X and t.
Similarly, the probability density function of a continuous random variable can be
obtained by differentiating the cumulative distribution.
The c.d.f. can be used to find out the probability of a random variable being between two
values:
P(s ≤ X ≤ t) = the probability that X is between s and t. But this is equal to the
probability that X ≤ t minus the probability that X ≤ s, that is,
P(s ≤ X ≤ t) = F(t) - F(s).

Mean or Expected Value of Continuous Variable


If X is a continuous random variable with probability density function f defined on an
interval with (possibly infinite) endpoints a and b, then the mean or expected value of X
is
\[ E(X) = \int_a^b x\, f(x)\,dx. \]

E(X) is also called the average value of X. It is what we expect to get if we take the
average of many values of X obtained in experiments.

Example 1
Let X have probability density function given by
f(x) = 3x^2,
with domain [0, 1]. Find E(X).

Solution
We have
\[ E(X) = \int_a^b x\, f(x)\,dx = \int_0^1 x\,(3x^2)\,dx = \int_0^1 3x^3\,dx = \left[\frac{3x^4}{4}\right]_0^1 = \frac{3}{4}. \]
Thus, the expected value of X is 3/4.

Variance and Standard Deviation of a continuous variable


Statisticians use the variance and standard deviation of a continuous random variable X
as a way of measuring its dispersion, or the degree to which it is "scattered."
Let X be a continuous random variable with density function f defined on the interval (a,
b), and let µ = E(X) be the mean of X. Then the variance of X is given by

\[ \mathrm{Var}(X) = \sigma^2 = E\big((X - \mu)^2\big) = \int_a^b (x - \mu)^2 f(x)\,dx. \]

The standard deviation of X is the square root of the variance,

\[ \sigma = \sqrt{\mathrm{Var}(X)} \]
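
As an illustration, the variance of the density f(x) = 3x^2 on [0, 1] from Example 1 can be worked out as follows. The derivation uses the standard shortcut Var(X) = E(X^2) - µ^2, which is equivalent to the definition above but is not stated in the text, together with the mean µ = 3/4 found in Example 1.

```latex
\begin{align*}
E(X^2) &= \int_0^1 x^2 \,(3x^2)\,dx = \int_0^1 3x^4\,dx
        = \left[\frac{3x^5}{5}\right]_0^1 = \frac{3}{5}, \\
\mathrm{Var}(X) &= E(X^2) - \mu^2 = \frac{3}{5} - \left(\frac{3}{4}\right)^2
        = \frac{48}{80} - \frac{45}{80} = \frac{3}{80} = 0.0375, \\
\sigma &= \sqrt{\frac{3}{80}} \approx 0.194.
\end{align*}
```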

1.6.3 Cumulative Distribution Function


The cumulative distribution function (cdf) is the probability that the variable takes a
value less than or equal to x. That is
\[ F(x) = P(X \le x) \]
For a continuous distribution, this can be expressed mathematically as
\[ F(x) = \int_{-\infty}^{x} f(t)\,dt \]

Types of Continuous Probability distribution


A continuous random variable can follow different probability distributions, for
example:
• Normal probability distribution
• Uniform probability distribution
• Exponential probability distribution

Normal Distribution
A continuous random variable has a normal distribution if that distribution is symmetric
and bell shaped. The normal (or Gaussian) density function was proposed by C. F. Gauss
(1777-1855) as a model for the relative frequency distribution of errors, such as errors of
measurement. Amazingly, this bell-shaped curve provides an adequate model for the
relative frequency distributions of data collected from many different scientific areas.

The most widely used and known distribution is the normal distribution. It is a
continuous rather than a discrete distribution and is known to fit many human
characteristics such as weight, height, life expectancy, IQ etc.
Characteristics of the Normal Distribution:
1. It is bell shaped and is symmetrical about its mean.
2. It is asymptotic to the axis, i.e., it extends indefinitely in either direction from the
mean.
3. It is a continuous distribution.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation
defines a different normal distribution. Thus, the normal distribution is completely
described by two parameters: mean and standard deviation. See the following
figure.
5. Total area under the curve sums to 1, i.e., the area of the distribution on each side
of the mean is 0.5.
6. It is unimodal, i.e., values mound up only in the center of the curve.
7. The probability that a random variable will have a value between any two points
is equal to the area under the curve between those points.

The normal curve is sometimes referred to as a bell-shaped curve with values that mound
up only in the central portion of the curve. The values of x can vary from minus infinity
to plus infinity but most of the values will be concentrated around the mean. Values in
either tail of the normal curve are possible but have a small probability of occurring. The
normal distribution is fully described by just two parameters - its mean (μ) and its
standard deviation (σ). The normal distribution is a family of curves. Each unique value
of the mean and each unique value of the standard deviation generates a different normal
curve. The probability density function of the normal distribution is given by the
following expression:

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)} \]
In this density function, the parameters μ and σ² are the mean and the variance,
respectively, of the normal random variable.

The case where μ = 0 and σ = 1 is called the standard normal distribution. The equation
for the standard normal distribution is given by
\[ f(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}} \]
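
The density formula translates directly into code. The following is a minimal sketch; the function name normal_pdf and its default arguments (which give the standard normal case) are choices made here for illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Height of the normal density with mean mu and standard deviation sigma at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

# The curve peaks at the mean and is symmetric about it.
print(normal_pdf(0.0))
print(normal_pdf(1.0), normal_pdf(-1.0))

# A different (mu, sigma) pair gives a different member of the family.
print(normal_pdf(50.0, mu=50.0, sigma=10.0))
```

These values are heights of the curve, not probabilities; probabilities come from areas under the curve.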

Notation for the Probability of a Standard Normal Random Variable

Pr(a < Z < b) represents the probability that a standard normal random variable is
between a and b.

Pr(Z > a) represents the probability that a standard normal random variable is greater
than a.

Pr(Z < b) represents the probability that a standard normal random variable is less than b.

Note that integral calculus is needed to find the area under the normal distribution curve.
However, this can be avoided by transforming every normal distribution to the standard
normal distribution. This conversion is done by rescaling the normal distribution axis
from its true units (time, weight, dollars, and so on) to a standard measure called the Z
score or Z value. A Z score is the number of standard deviations that a value, X, is away
from the mean. If the value of X is greater than the mean, the Z score is positive; if the
value of X is less than the mean, the Z score is negative. The Z score equation is as
follows:
Z = (X - Mean) / Standard deviation
That is, if x is a value on the x-axis, its standardized value or z-value is obtained by
dividing the difference x - µ by the standard deviation σ:
z = (x - µ)/σ
A standard Z table can be used to find probabilities for any normal curve problem that
has been converted to Z scores. The table used here (Standard Normal Probabilities, given
later in this chapter) lists the area between 0 and z. The Z distribution is a
normal distribution with a mean of 0 and a standard deviation of 1.
The following steps are helpful when working with normal curve problems:
1. Graph the normal distribution, and shade the area related to the probability you
want to find.
2. Convert the boundaries of the shaded area from X values to the standard normal
random variable Z values using the Z formula above.
3. Use the standard Z table to find the probabilities or the areas related to the Z
values in step 2.
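
Steps 2 and 3 can be sketched in code. The example assumes a Z table that lists the area between 0 and z, and it replaces the table lookup with the error function from the Python standard library; the numbers (mean 100, standard deviation 20, value 130) are hypothetical and chosen only to give a round Z score.

```python
from math import erf, sqrt

def z_score(x: float, mean: float, sd: float) -> float:
    """Step 2: number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def table_area(z: float) -> float:
    """Step 3: area under the standard normal curve between 0 and |z|."""
    return 0.5 * erf(abs(z) / sqrt(2))

z = z_score(130, mean=100, sd=20)   # hypothetical values
area = table_area(z)

print(z)
print(f"{area:.4f}")
```

Here `table_area` plays the role of the printed table, so its result can be compared with the table entry for the same z.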
Example One:
Graduate Management Aptitude Test (GMAT) scores are widely used by graduate
schools of business as an entrance requirement. Suppose that in one particular
year, the mean score for the GMAT was 476, with a standard deviation of 107.
Assuming that the GMAT scores are normally distributed, answer the following
questions:

Question 1.
What is the probability that a randomly selected score from this GMAT falls
between 476 and 650, i.e., P(476 <= X <= 650) = ? The following figure shows a graphic
representation of this problem.

Figure 1
Applying the Z equation, we get: Z = (650 - 476)/107 = 1.62. The Z value of 1.62
indicates that the GMAT score of 650 is 1.62 standard deviations above the mean. The
standard normal table gives the probability of a value falling between 650 and the mean.
The whole number and tenths place portion of the Z score appear in the first column of
the table. Across the top of the table are the values of the hundredths place portion of the
Z score. Reading the table at row 1.6 and column 0.02 gives 0.4474. Thus the answer is
that 0.4474 or 44.74% of the scores on the GMAT fall between a score of 650 and 476.

Question 2.
What is the probability of receiving a score greater than 750 on a GMAT test that has a
mean of 476 and a standard deviation of 107? i.e., P(X >= 750) = ?. This problem
asks for the area of the upper tail of the distribution. The Z score is: Z =
(750 - 476)/107 = 2.56. From the table, the probability for this Z score is 0.4948. This is
the probability of a GMAT score between 476 and 750. The rule is that when we
want to find the probability in either tail, we must subtract the table value from 0.50.
Thus, the answer to this problem is: 0.5 - 0.4948 = 0.0052 or 0.52%. Note that P(X >=
750) is the same as P(X > 750), because, in a continuous distribution, the area under an
exact number such as X = 750 is zero. The following figure shows a graphic representation
of this problem.

Figure 2
Question 3.
What is the probability of receiving a score of 540 or less on a GMAT test that has a
mean of 476 and a standard deviation of 107? i.e., P(X <= 540) = ?. We are asked to
determine the area under the curve for all values less than or equal to 540. The Z score is:
Z = (540 - 476)/107 = 0.6. From the table, the probability for this Z score is 0.2257, which
is the probability of getting a score between the mean (476) and 540. The rule is that when
we want to find the probability between two values of X on either side of the mean, we
just add the two areas together; here the second area is the entire half of the distribution
below the mean, 0.5. Thus, the answer to this problem is: 0.5 + 0.2257 = 0.73
or 73%. The following figure shows a graphic representation of this problem.

Figure 3
Question 4.
What is the probability of receiving a score between 330 and 440 on a GMAT test that
has a mean of 476 and a standard deviation of 107? i.e., P(330 <= X <= 440) = ?

Figure 4
In this problem, the two values fall on the same side of the mean. The Z scores are: Z1 =
(330 - 476)/107 = -1.36, and Z2 = (440 - 476)/107 = -0.34. The probability associated
with Z = -1.36 is 0.4131, and the probability associated with Z = -0.34 is 0.1331. The rule
is that when we want to find the probability between two values of X on one side of the
mean, we just subtract the smaller area from the larger area to get the probability between
the two values. Thus, the answer to this problem is: 0.4131 - 0.1331 = 0.28 or 28%.
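
The answer to Question 4 can be cross-checked without a printed table. The sketch below uses statistics.NormalDist from the Python standard library; it computes the probability once directly and once by imitating the table method with Z scores rounded to two decimals, which is an assumption about how the table was read.

```python
from statistics import NormalDist

gmat = NormalDist(mu=476, sigma=107)

# Direct computation: P(330 <= X <= 440) = F(440) - F(330)
exact = gmat.cdf(440) - gmat.cdf(330)

# Table method: round each Z score to two decimals, look up the area between
# 0 and |Z|, then subtract the smaller area from the larger.
z1 = round((330 - 476) / 107, 2)   # -1.36
z2 = round((440 - 476) / 107, 2)   # -0.34
std = NormalDist()
via_table = (std.cdf(-z1) - 0.5) - (std.cdf(-z2) - 0.5)

print(f"{exact:.4f}")
print(f"{via_table:.4f}")
```

The two results differ slightly because the table method rounds the Z scores before looking up the areas.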

Example Two:

Suppose that a tire factory wants to set a mileage guarantee on its new model called LA
60 tire. Life tests indicated that the mean mileage is 47,900, and the standard deviation of
the normally distributed mileage is 2,050 miles. The factory wants to set the
guaranteed mileage so that no more than 5% of the tires will have to be replaced. What
guaranteed mileage should the factory announce? i.e., P(X <= ?) = 5%.
In this problem, the mean and standard deviation are given, but X and Z are unknown. The
problem is to solve for an X value that has 5% or 0.05 of the X values less than that
value. If 0.05 of the values are less than X, then 0.45 lie between X and the mean (0.5 -
0.05), see the following graph.

Figure 5

Refer to the standard normal distribution table and search the body of the table for 0.45.
Since the exact number is not found in the table, search for the closest number to 0.45.
There are two values equidistant from 0.45: 0.4505 and 0.4495. Move to the left from
these values, and read the Z scores in the margin, which are 1.65 and 1.64. Take the
average of these two Z scores, i.e., (1.65 + 1.64)/2 = 1.645. Because X lies below the
mean, the Z score is negative. Plug this number and the values of the mean and the
standard deviation into the Z equation:
Z = (X - mean)/standard deviation, or -1.645 = (X - 47,900)/2,050, which gives
X = 47,900 - 1.645 × 2,050 ≈ 44,528 miles.
Thus, the factory should set the guaranteed mileage at 44,528 miles if the objective is not
to replace more than 5% of the tires.
Computing Normal Probabilities
There are several different situations that can arise when asked to find normal
probabilities.
Situation                            Instructions

Between zero and any number          Look up the area in the table

Between two positives, or            Look up both areas in the table and subtract the
between two negatives                smaller from the larger

Between a negative and a positive    Look up both areas in the table and add them together

Less than a negative, or             Look up the area in the table and subtract it from 0.5000
greater than a positive

Greater than a negative, or          Look up the area in the table and add it to 0.5000
less than a positive

This can be shortened into two rules.
1. If there is only one z-score given, use 0.5000 for the second area; otherwise look up
both z-scores in the table.
2. If the two numbers have the same sign, then subtract; if they have different signs, then
add. If there is only one z-score, then use the inequality to determine the second sign (< is
negative, and > is positive).
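
The rules can be written as one small function. This is a sketch under the assumption that the table gives the area between 0 and z and that each side of the mean has area 0.5; the function names and the use of None for a missing bound are choices made here, and the table lookup is replaced by the error function.

```python
from math import erf, sqrt

def table_area(z: float) -> float:
    """Area between 0 and |z| under the standard normal curve (the table value)."""
    return 0.5 * erf(abs(z) / sqrt(2))

def probability(lower=None, upper=None) -> float:
    """P(lower < Z < upper) built from table areas; None means no bound on that side."""
    if lower is None and upper is None:
        return 1.0
    if lower is None:   # "less than": add to 0.5 for a positive z, subtract for a negative z
        return 0.5 + table_area(upper) if upper >= 0 else 0.5 - table_area(upper)
    if upper is None:   # "greater than": subtract for a positive z, add for a negative z
        return 0.5 - table_area(lower) if lower >= 0 else 0.5 + table_area(lower)
    if (lower >= 0) == (upper >= 0):   # same side of the mean: subtract
        return abs(table_area(upper) - table_area(lower))
    return table_area(lower) + table_area(upper)   # opposite sides: add

print(f"{probability(-1.0, 1.0):.4f}")     # between a negative and a positive
print(f"{probability(1.0, 2.0):.4f}")      # between two positives
print(f"{probability(lower=1.0):.4f}")     # greater than a positive
print(f"{probability(upper=-1.0):.4f}")    # less than a negative
```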
Some properties of the normal distribution make it relatively easy to use.
1. Normal distribution is completely determined by its mean and standard deviation.
2. The area under the normal curve between two values on the x-axis depends only on
how many standard deviations the values are away from the mean µ which is the x-
coordinate of the highest point on the curve.
In view of 2 above it makes sense to measure x-coordinates in terms of the number of
standard deviations from the mean.
Finding scores given the probabilities
Cautions to keep in mind
1. Don’t confuse z scores and areas.
Z scores are distances along the horizontal scale, but areas are regions under the
normal curve.
2. Choose the correct (right/left) side of the graph.
3. A z score must be negative whenever it is located to the left half of the normal
distribution.
4. Areas (or probabilities) are positive or zero values, but they are never negative.

The opposite question to finding a probability from a z-score is: given an area or
probability, what z-score cuts off that area to the left? Or to the right? To find an
x-score cutting off a certain area or percentage, find the z-score (or scores) cutting off
that area and turn it into an x-score with the following: x = µ + (z • σ)

Procedure for Finding Values Using the Z Table


1. Sketch a normal distribution curve, enter the given probability or percentage in
the appropriate region of the graph, and identify the x value(s) being sought.
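2. Use the normal Z Table to find the z score corresponding to the cumulative left
area bounded by x. Find the closest area, then identify the corresponding z score.
(If the table lists areas between 0 and z, look up the difference between the left
area and 0.5 instead.)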
3. Using the Z Formula, enter the values for µ, σ, and the z score found in
step 2, then solve for x. x = µ + (z • σ) (If z is located to the left of the mean, be
sure that it is a negative number.)
4. Refer to the sketch of the curve to verify that the solution makes sense in the
context of the graph and the context of the problem.
% of the distribution?
X = μ + zσ
X = 50 + (-0.67)(10) = 43.30
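
The same procedure can be carried out in code, with the inverse of the standard normal c.d.f. taking the place of the reverse table lookup. The sketch below uses statistics.NormalDist from the Python standard library; the numbers (mean 200, standard deviation 20, lowest 10%) are hypothetical and are not taken from the text.

```python
from statistics import NormalDist

def value_for_left_area(area: float, mean: float, sd: float):
    """Return (x, z) such that a fraction `area` of the distribution lies below x."""
    z = NormalDist().inv_cdf(area)   # step 2: z score for the cumulative left area
    x = mean + z * sd                # step 3: x = mu + z * sigma
    return x, z

# Hypothetical example: the value that cuts off the lowest 10%
x, z = value_for_left_area(0.10, mean=200, sd=20)

print(f"z = {z:.2f}")
print(f"x = {x:.2f}")
```

Since the requested area is below 50%, z comes out negative and x falls below the mean, which is the sanity check described in step 4.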

Standard Normal Probabilities
z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1  0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2  0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3  0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4  0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5  0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6  0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7  0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8  0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9  0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0  0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1  0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2  0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3  0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4  0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5  0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6  0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7  0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8  0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9  0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0  0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1  0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2  0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3  0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4  0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5  0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6  0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7  0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8  0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9  0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0  0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990

Exercises

1. Given the standard normal distribution, find the probability that a z picked at random
will have a value less than or equal to 2.8.

2. What is the probability that a z picked at random from the population of z's will have a
value between 2.63 and -2.63?

3. What proportion of z values lie between -2.4 and -1.36?

4. Given the standard normal distribution, find the probability that z is at least 1.26

5. The weights of a certain melon are normally distributed with a mean of 14 ozs and a
standard deviation of 1.22 ozs. What is the probability that a melon drawn at random
from this population will weigh less than 12 ozs?

6. Using the standard normal table, determine a z value (to the nearest two decimal
places) such that the area
a. between z and negative infinity is .64

b. from the midpoint to z is .20

7. A teacher gives a test and gets normally distributed results with a mean of 60 and a
standard deviation of 10. If grades are assigned according to the following scheme,
find the numerical limits for each letter grade.
A: Top 10%
B: Scores above the bottom 70% and below the top 10%
C: Scores above the bottom 40% and below the top 30%
D: Scores above the bottom 10% and below the top 70%
F: Bottom 10%
8. According to data from the College Entrance Examination Board, the mean math
SAT score is 476, and 17% of the scores are above 600. Find the standard
deviation, and then use that result to find the 99th percentile (assume the scores are
normally distributed).
9. Quarters have weights that are normally distributed with a mean of 6.67g and a
standard deviation of 0.07g
(a) If a vending machine is adjusted to reject quarters weighing less than 6.63g or
more than 6.81g, what is the percentage of legal quarters that are rejected?
(b) Find the weights of accepted legal quarters if the machine is readjusted so that
the lightest 1.6% are rejected and the heaviest 1.6% are rejected.
10. A subcontractor manufactured ceramic substrates for IBM. These devices have
resistances that are normally distributed with a mean of 1.978 ohms and a standard
deviation of 0.172 ohms. If the required specifications are to be modified so that
3% are rejected because their resistances are too low and another 3% are rejected
because their resistances are too high, find the cutoff values for the acceptable
devices.
11. IQ scores are normally distributed with a mean of 100 and a standard deviation of
16. If we define a genius to be someone in the top 1% of IQ scores, find the score
separating geniuses from the rest of us. This score could be used by a “Think tank”
company as one criterion for employment.
12. According to the Opinion Research Corporation, men spend an average of 11.4
minutes in the shower. Assume that the times are normally distributed with a
standard deviation of 1.8 minutes. Find the values of the quartiles Q1, and Q3.
13. The lengths of pregnancies are normally distributed with a mean of 268 days and a
standard deviation of 16 days. If we stipulate that a baby is premature if the length
of pregnancy is in the lowest 4%, find the length that separates premature babies
from those who are not premature. Premature babies often require special care, and
this result could be helpful to hospital administrators in planning for that care.
14. Weights of paper discarded by households each week are normally distributed with
a mean of 9.4 lb and a standard deviation of 4.2 lb. Find the weight that separates
the bottom 33% from the top 67%
15. An IBM subcontractor was hired to make ceramic substrates which are used to
distribute power and signals to and from computer silicon chips. Specifications
require resistance between 1.6 ohms and 2.6 ohms, but the population has normally
distributed resistances with a mean of 1.978 ohms and a standard deviation of 0.172
ohms. What percentage of the ceramic substrates will not meet the manufacturer
specifications?
Does this manufacturing process appear to be working well?

16. IQ scores are normally distributed with a mean of 100 and a standard deviation of
16. Mensa is an organization for people with high IQs, and eligibility requires an IQ
above 131.6.
(a) If someone is randomly selected, find the probability that he or she meets the
Mensa requirement.
(b) In a typical region of 76,000 people, how many are eligible for Mensa?
17. According to the Opinion Research Corporation, men spend an average of 11.4
minutes in the shower. Assume that the times are normally distributed with a
standard deviation of 1.8 minutes. If a man is randomly selected, find the
probability that he spends at least 10 minutes in the shower.
18. The lengths of pregnancies are normally distributed with a mean of 268 days and a
standard deviation of 16 days. If we stipulate that a baby is premature if it is born
at least three weeks early, what percentage of babies are born premature? Premature
babies often require special care, and this result could be helpful to hospital
administrators in planning for that care.
19. Weights of paper discarded by households each week are normally distributed with
a mean of 9.4 lb and a standard deviation of 4.2 lb. Find the probability of
randomly selecting a household and getting one that discards between 6.0 lb and
8.0 lb of paper in a week.

20. Following their production, industrial generator shafts are tested for static and
dynamic balance, and the necessary weight is added to a predrilled hole in order to
bring each shaft within balance specifications. From past experience, the amount of
weight added to a shaft has been normally distributed, with an average of 36 grams
and a standard deviation of 9 grams. Management has just directed that the best 6%
of the output be reserved for shipment to aerospace customers. Translating “the
best 6%” into an amount of balancing weight, what weight cutoff should be used in
deciding which generator shafts to reserve for aerospace customers?
21. The average life span of a certain brand of tires is 30,000 miles with a standard
deviation of 2,000 miles and follows a normal distribution.
i.) Would it be unusual for a tire to last for 35,000 miles?
ii.) What is the probability that a tire will have a life span between
25,000 and 28,000 miles?
iii.) Suppose this company wishes to replace only 2 out of every 10,000
tires under its warranty. How many miles should it guarantee a tire
will last? Are there tires that have an unusually short life span yet
would not be covered under the warranty? If yes, what percentage of the
production falls into this category?
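
The exercises above all reduce to two calculations on a normal distribution: finding a cutoff value for a given tail percentage, and finding the probability of an interval. As a minimal sketch, assuming Python 3.8 or later and its standard-library statistics.NormalDist class, the helper names cutoff and prob_between below are illustrative and not part of the original notes. The numbers are taken from Exercises 11 and 19.

```python
from statistics import NormalDist


def cutoff(mean, sd, lower_tail_prob):
    """Return x such that P(X <= x) = lower_tail_prob for X ~ Normal(mean, sd)."""
    return NormalDist(mean, sd).inv_cdf(lower_tail_prob)


def prob_between(mean, sd, low, high):
    """Return P(low <= X <= high) for X ~ Normal(mean, sd)."""
    dist = NormalDist(mean, sd)
    return dist.cdf(high) - dist.cdf(low)


# Exercise 11 style: score separating the top 1% of IQ scores (mean 100, sd 16).
# "Top 1%" means 99% of the scores lie below the cutoff.
genius_cutoff = cutoff(100, 16, 0.99)

# Exercise 19 style: P(6.0 lb <= weight <= 8.0 lb) with mean 9.4 lb, sd 4.2 lb.
paper_prob = prob_between(9.4, 4.2, 6.0, 8.0)

print(f"IQ cutoff for top 1%: {genius_cutoff:.1f}")
print(f"P(6.0 <= weight <= 8.0): {paper_prob:.4f}")
```

The same two helpers cover two-sided rejection problems such as Exercise 10: call cutoff once with the lower percentage and once with one minus the upper percentage.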


Exponential Distribution
The exponential distribution is the most important distribution in queueing theory and
also the easiest to use. It is often used to model the time between events that happen
at a constant average rate. Interarrival times and service times can often be
represented exactly or approximately by an exponential distribution.
The probability density function of an exponential distribution with parameter λ > 0 is
given by
f(x) = λ e^(-λx) for x ≥ 0, and f(x) = 0 for x < 0.

The exponential distribution is used to model Poisson processes, which are situations in
which an object initially in state A can change to state B with constant probability per
unit time λ. The time at which the state actually changes is described by an exponential
random variable with parameter λ.

An Exponential(λ) random variable has the following properties:

Mean = μ = 1/λ
Variance = σ² = 1/λ²
Standard Deviation = σ = 1/λ
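
These properties can be checked by simulation. The sketch below assumes an illustrative rate λ = 2 (not a value from the notes) and uses Python's standard-library random.expovariate to draw samples. For an Exponential(λ) variable both the mean and the standard deviation equal 1/λ, so both sample statistics should land close to 0.5.

```python
import random
import statistics


def simulate_exponential(lam, n, seed=0):
    """Draw n samples from Exponential(lam) using a seeded generator."""
    rng = random.Random(seed)
    return [rng.expovariate(lam) for _ in range(n)]


lam = 2.0  # illustrative rate (assumption, not from the notes)
samples = simulate_exponential(lam, 100_000)

sample_mean = statistics.fmean(samples)
sample_sd = statistics.pstdev(samples)

print(f"theoretical mean and sd: {1 / lam:.3f}")
print(f"sample mean: {sample_mean:.3f}")
print(f"sample sd:   {sample_sd:.3f}")
```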
The importance of the exponential distribution is based on the fact that it is the only
continuous distribution that possesses the memoryless property.
Memoryless Property of the Exponential
An exponential random variable X has the property that "the future is independent of the
past", i.e. the fact that the event has not happened yet tells us nothing about how much
longer it will take before it does happen. In symbols, for all s ≥ 0 and t ≥ 0,
P(X > s + t | X > s) = P(X > t).
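
The memoryless property can be verified numerically. The sketch below relies on the standard survival function P(X > x) = e^(-λx) of an Exponential(λ) variable and compares the conditional probability P(X > s + t | X > s) with the unconditional P(X > t). The function names and the values of λ, s and t are illustrative assumptions.

```python
import math


def survival(lam, x):
    """P(X > x) for X ~ Exponential(lam), with x >= 0."""
    return math.exp(-lam * x)


def conditional_survival(lam, s, t):
    """P(X > s + t | X > s) = P(X > s + t) / P(X > s)."""
    return survival(lam, s + t) / survival(lam, s)


lam, s, t = 0.5, 3.0, 2.0  # illustrative values (assumptions)

# Having already waited s units does not change the chance of waiting t more.
print(f"P(X > s + t | X > s) = {conditional_survival(lam, s, t):.6f}")
print(f"P(X > t)             = {survival(lam, t):.6f}")
```

Both printed values agree, because e^(-λ(s+t)) / e^(-λs) = e^(-λt) regardless of s.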
Cumulative distribution function
The cumulative distribution function is given by
F(x) = P(X ≤ x) = 1 - e^(-λx) for x ≥ 0, and F(x) = 0 for x < 0.
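
As a sketch of how the cumulative distribution function might be used, take the standard form F(x) = 1 - e^(-λx) for x ≥ 0 and a hypothetical queue in which customers arrive at an average rate of 3 per hour (this rate is an assumption for illustration, not data from the notes). The probability that the next customer arrives within 20 minutes is then F(1/3).

```python
import math


def exponential_cdf(lam, x):
    """F(x) = P(X <= x) = 1 - exp(-lam * x) for x >= 0, and 0 for x < 0."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-lam * x)


# Hypothetical queue: arrivals at an average rate of 3 per hour (assumption).
rate_per_hour = 3.0

# 20 minutes expressed in hours, to match the units of the rate.
p_within_20_min = exponential_cdf(rate_per_hour, 20 / 60)

print(f"P(next arrival within 20 minutes) = {p_within_20_min:.4f}")
```

The time argument must be in the same units as the rate; here both are in hours.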
