
Certification Course on

Quality Assurance and Statistical Quality Techniques

Code 1.06: Probability

Course Level A
Basics of Probability Theory
Issue No.: 01
Effective Date: 01-01-2015

We often make statements about probability. For example, a weather forecaster may predict
that there is a 90% chance of rain tomorrow. A health news report may state that a person of
Asian origin has a much greater chance of developing heart disease than a person of European
origin. A college student may ask an instructor about the chances of clearing a Business School
entrance test if he secured an average grade in his graduation class.

Probability, which measures the likelihood that an event will occur, is an important part of
statistics. It is the basis of inferential statistics where we make decisions under conditions of
uncertainty. Probability theory is used to evaluate the uncertainty involved in those decisions.
For example, estimating the completion time for a project is based on many assumptions, some
of which may happen to be true and others may not. Probability theory will help us make
decisions under such conditions of imperfect information and uncertainty.

Combining probability and probability distributions with statistics helps us make decisions
about populations based on information obtained from samples. This chapter presents the
basic concepts of probability and the rules for computing probability.

A universe and a sample

We read about samples and populations in Chapter 2. When we want to construct a
frequency distribution chart, we select a few pieces from a batch or lot. This is a sample, which
we use as a representative of the batch. The universe is the whole collection, in this case the
entire batch. When we calculate the mean, standard deviation, variance, etc. from the sample,
each result is called a statistic. Since the composition of the samples will fluctuate, i.e. will not
be identical each time we pick up a different set of pieces, the computed statistic will be larger
or smaller than the actual universal value.

The universe may have a finite number of items, as in a batch, or it may be treated as infinite,
such as the ongoing output of steel screws from a production line. The universe can be defined
differently depending on the situation.

Because it is practically not possible to test the entire universe, due to time and cost
considerations, or because the tests are destructive in nature, a sample is selected. The primary
purpose of selecting a sample is to learn something about the universe that will aid in making a
decision about the universe, such as accepting or rejecting a lot. The smaller the sample size,
the greater will be the fluctuation with respect to the true value. As the sample size approaches
the universe size, the accuracy becomes very good, but of course the cost may be prohibitive.


Subjective Probability

The term probability is generally understood through its simpler synonyms such as likelihood,
chance, etc. In general terms, probability is the chance that something will happen. We
sometimes say there is a good chance that my friend will pass the examination. This is a
subjective statement, without specifying what we mean by ‘good’.

Objective Probability

On the other hand, when we say there is a 50% chance that the match will be lost, we are
assigning a value to probability. In statistical quality control, we always calculate or use
probability in a quantified form.

When a coin is tossed, the probability of getting a head is half, or 50%. You don’t need to be a
statistician to answer that, as it is common sense. When you roll a die, the chance of getting the
number 4 is one sixth. This is because the die has six faces, and each has an equal chance of
showing up. The chance of drawing a red card from a pack of cards is 1/2, the chance of drawing
a card from the hearts suit is 1/4, and the chance of drawing a 10 is 1/13.

Probability is calculated objectively as the ratio of number of favourable outcomes to the total
number of outcomes.
Probability = No. of favourable outcomes / Total number of outcomes

If an event occurs in 'A' ways and fails to occur in 'B' ways, and all these ways are equally likely,
then the probability of occurrence of the event is

A / (A + B)

In statistical quality control, we will often encounter the situation where p represents the
number of defective parts and q represents the number of good parts. The probability that a
randomly selected piece is defective is

p / (p + q)

(As per the ‘Complementary Rule’ we can also state that P(q) = 1.00 – P(p).)
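As an illustrative sketch (the part counts below are hypothetical, chosen only for illustration), this ratio and the complementary rule can be checked in Python:

```python
# Hypothetical lot: p defective parts and q good parts (counts assumed for illustration)
p_defective_count = 4
q_good_count = 96

total = p_defective_count + q_good_count          # total number of outcomes
prob_defective = p_defective_count / total        # P(p) = p / (p + q)
prob_good = 1 - prob_defective                    # Complementary Rule: P(q) = 1 - P(p)

print(prob_defective)  # 0.04
print(prob_good)       # 0.96
```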

Conditions for evaluating probability

Probability can be determined for different types of conditions. For example:

a) Probability of a single event occurring among many possible events

b) Probability of more than one event occurring:
I. Probability of either of two events occurring / any of many possible events
occurring (OR condition)
II. Probability of two events occurring together / many events occurring together
(AND condition)
c) Probability of an event not occurring
d) Probability of a combination of events not occurring

A fundamental rule of probability calculations is that when you add the probabilities of all
possible events, the answer is always 1. For example, the probability of getting Heads is 1/2 and
of getting Tails is also 1/2; the sum of 1/2 and 1/2 is 1. The sum of the probabilities of all six
faces of a die, at 1/6 for each face, is also 1. Hence probability is always expressed as a number
between 0 and 1.00, where 1 represents a certain occurrence and 0 represents that the event
will not occur at all.

Probability of a single event occurring

The probability of a single event (A) is given by the equation

P(A) = s / n

where P(A) is the probability of event A happening, s is the number of outcomes favourable to
A, and n is the total number of possible outcomes.

Thus, to get a ‘10’ in a pack of cards, s = 4 (there are four 10s in a pack), n = 52 (total number of
cards), hence P(A) = 4/52 = 1/13.

This equation holds good for a finite universe, such as a pack of cards, or a lot of 100 hand-pumps
offered for inspection.

When the universe is infinite, prior information is necessary about the universe, which is taken
as the universal distribution, and the probability of an event happening is proportional to the
universal value. For example, if the on-going rejection level in a plastic moulding process is 0.14
%, the probability that a sample will be defective is P(A) = 0.0014.

Probability of more than one event occurring – OR condition

a) Mutually exclusive events

In probability two events are said to be mutually exclusive if the events have no shared
outcomes, i.e if one outcome occurs, the other cannot occur. If we consider the events as sets,
then two events are mutually exclusive when their intersection is the empty set. We could
denote that events A and B are mutually exclusive by the formula A ∩ B = Ø. The following
example will help to make sense of this definition.

Suppose that we roll two six-sided dice and add the number of dots showing on top of the dice.
The event consisting of an ‘even’ sum is mutually exclusive from the event consisting of an ‘odd’
sum, because there is no way possible for a number to be both even and odd.

Addition Rule for Probability of Mutually Exclusive Events

If events A and B are mutually exclusive, then the probability of A or B is the sum of the
probability of A and the probability of B:

P(A or B) = P(A) + P(B)

This is also written as: the probability of the union of events A and B, denoted by P(A ∪ B), is

P(A ∪ B) = P(A) + P(B)

For example, if we randomly select a card from a deck of cards, what is the probability that the
card is either a club or a spade?

P (club or spade) = P (club) + P (spade)= 13/52 + 13/52 = 26/52 = 0.5

b) Non exclusive events

Events are not mutually exclusive when they can also happen together. For example, if we
randomly select a card from a deck of cards, what is the probability that the card is either a king
or a spade or both? Are cards that are kings totally exclusive from cards that are spades? What
about the king of spades? In this example, we have at least one case where both events can
occur together; hence they are not mutually exclusive.

The probability that the card is either a king or a spade or both is

P(king or spade or both) = P(king) + P(spade) – P(king and spade) = 4/52 + 13/52 – 1/52 =
16/52 = 0.31

Why is this so? Because the king of spades is already counted among the 4 cards that are kings as well
as among the 13 cards that are spades. If we did not deduct the joint event, we would be counting it twice.

In quality control, if we are counting samples only as defectives or non-defectives, these are mutually
exclusive events. But if we are counting the occurrence of 2 defects, either of which can be found in a
manufactured part, or both of which can occur together, then the events are not exclusive. In such a
case the probability that a product is defective will be:

P(Defective) = P(D1) + P(D2) – P(D1 AND D2)

Can also be written as

P(D1 ∪ D2) = P(D1) + P(D2) – P(D1 ∩ D2)

The combined probability of D1 and D2 occurring together has to be determined, for example from
historical inspection data (see example below).

Generalized Addition Rule

The above formula can be generalized for situations where events may not be mutually
exclusive and there could be an overlap. For any two events A and B, the probability of A or B is
the sum of the probability of A and the probability of B minus the shared probability of both A
and B:

P(A or B) = P(A) + P(B) - P(A and B)

Example: An inspector selects a casting from a lot. The probability that he picks up (a) a casting
with porosity is 0.40, (b) a casting with shrinkage is 0.30, and (c) a casting with both porosity and
shrinkage is 0.20. What is the probability that the inspector selects a casting with porosity, or a
casting with shrinkage, or both?

Solution: Let F = the event that the inspector selects a casting with porosity; and let N = the
event that the inspector selects a casting with shrinkage. Then, based on the rule of addition:

P(F ∪ N) = P(F) + P(N) – P(F ∩ N)

P(F ∪ N) = 0.40 + 0.30 – 0.20 = 0.50
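The casting example can be cross-checked with a short Python sketch of the generalized addition rule:

```python
# Generalized addition rule: P(F or N) = P(F) + P(N) - P(F and N),
# using the casting example from the text
p_porosity = 0.40      # P(F)
p_shrinkage = 0.30     # P(N)
p_both = 0.20          # P(F and N)

p_either = p_porosity + p_shrinkage - p_both
print(round(p_either, 2))  # 0.5
```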
Probability of more than one event occurring – AND condition

The addition rule helped us solve problems when we performed one task and wanted to know
the probability of two possible events happening ‘during’ that task. We will now learn the
multiplication rule. The multiplication rule also deals with two events, but in these problems
the events occur as a result of more than one task (rolling one die then another, drawing two
cards, spinning a spinner twice, pulling two marbles out of a bag, etc).

Multiplication Rule:

The probability of two independent events A and B happening together is given by

P(A and B) = P(A) × P(B)

P(A ∩ B) = P(A) × P(B)

What The Rule Means:

Suppose we roll one die followed by another and want to find the probability of rolling a 5 on
the first die and rolling an odd number on the second die. In this problem we are not dealing
with the sum of both dice. We are only dealing with the probability of 5 on one die and then,
as a separate event, the probability of an odd number on the other die. The probabilities of
each event are:
P(5) = 1/6
P(odd) = 3/6
The combined probability of both happening together is given by the multiplication equation
P(5 and odd) = (1/6) × (3/6) = 3/36 = 1/12
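The dice example can be verified in Python, both by the multiplication rule and by brute-force enumeration of all 36 equally likely outcomes:

```python
from fractions import Fraction

# Multiplication rule: P(5 on the first die and odd on the second die)
p_five = Fraction(1, 6)
p_odd = Fraction(3, 6)
p_both = p_five * p_odd
print(p_both)  # 1/12

# Cross-check by enumerating all 36 equally likely outcomes
favourable = sum(1 for d1 in range(1, 7) for d2 in range(1, 7)
                 if d1 == 5 and d2 % 2 == 1)
print(Fraction(favourable, 36))  # 1/12
```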

Independent events
In the above example, rolling a 5 on one die followed by rolling an odd number on the second
die are independent events. Each die is treated as a separate event, and what happens on the
first die does not influence or affect what happens on the second die. This is our basic definition
of independent events: the outcome of one event does not influence or affect the outcome of
another event.
Suppose we have a box with 3 blue marbles, 2 red marbles, and 4 yellow marbles. We are going
to take out one marble, record its color, put it back in the box and take out another marble.
What is the probability of taking out a red marble followed by a blue marble?
The multiplication rule says we need to find P(red) * P(blue).
P(red) = 2/9
P(blue) = 3/9

P(red and blue) = (2/9) × (3/9) = 6/81 = 2/27

Are the events in this example independent? Yes, because after the first marble was taken out
and its colour recorded, it was returned to the box. Therefore, the probability for the second
marble was not affected by what happened with the first marble.

Dependent events

If the occurrence of Event A changes the probability of Event B, then Events A and B are
dependent events.

Let us take the same box of marbles as in the previous example, but in this case we take out the
first marble, leave it out, and then take out another marble. What is the probability of taking
out a red marble followed by a blue marble?

We can still use the multiplication rule which says we need to find P(red) * P(blue). But in this
case when we take out the second marble, there will only be 8 marbles left in the bag.
Therefore the probability of the first and second events are:

P(red) = 2/9

P(blue) = 3/8

And the combined probability is

P(red then blue) = (2/9) × (3/8) = 6/72 = 1/12

The events in this example were dependent. When the first marble was taken out and kept out,
it affected the probability of the second event.

Suppose we want to draw two cards from a standard deck. What is the probability that the first
card will be an Ace and the second card a Jack?

As the first card is not replaced, the two events are dependent; using the multiplication rule
with the adjusted second probability we get

P(Ace) × P(Jack) = (4/52) × (4/51) = 16/2652 = 4/663

The probability will be the same even if the question had asked for the probability of a jack
followed by an ace.
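Both without-replacement examples above can be checked in Python using exact fractions:

```python
from fractions import Fraction

# Dependent events: the second draw is made without replacement,
# so its probability depends on the outcome of the first draw.

# Marbles: 2 red, 3 blue, 4 yellow (9 total); the first marble is kept out
p_red_then_blue = Fraction(2, 9) * Fraction(3, 8)
print(p_red_then_blue)  # 1/12

# Cards: an Ace first, then a Jack, from a 52-card deck
p_ace_then_jack = Fraction(4, 52) * Fraction(4, 51)
print(p_ace_then_jack)  # 4/663
```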
Complementary Rule of Probability

As we know, the sum of the probabilities of all possible events is always 1. Hence if only two
events A and B are possible, the probability of event A is 1 – P(B). The complementary rule of
probability can be written as

P(A) = 1 – P(B)

Many times, the situation B is described as event A not happening.

Probability of event A happening = 1 - Probability of event A not happening

Probability of event A not happening = 1 - Probability of event A happening

A lot of probability situations are handled by understanding and applying the complementary
rule. For example, the probability of getting ‘2 or more’ defectives in a sample of 6 is the
complement of the probability of getting 0 or 1 defectives.

If P(0 defect) = 0.45 and P(1 defect) = 0.21, then

P(2,3,4,5,6 defects) = 1 –(0.45+0.21) = 0.34
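A minimal Python sketch of this complementary-rule calculation:

```python
# Complementary rule: P(2 or more defectives) = 1 - [P(0) + P(1)]
p_zero = 0.45   # P(0 defects), from the text
p_one = 0.21    # P(1 defect), from the text

p_two_or_more = 1 - (p_zero + p_one)
print(round(p_two_or_more, 2))  # 0.34
```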

Counting the total number of events

For determining the probability of an event, we must always know the total number of events
that can occur, which becomes the denominator. Thus if we toss a coin, the total number of
events is 2. When we roll a die, the total number of events is 6. If we roll 2 dice, the total
number of events is 6 × 6 = 36. Such multiplication is the simplest method for counting total
events.

A product can be classified into 3 size grades, 2 colour shades and 4 weight categories. In how
many total ways can the product be classified?

3 × 2 × 4 = 24 classifications

The second method of counting commonly used in probability is permutations, or the ways in
which objects can be ordered or arranged. For example, the letters A, B, C can be arranged as:

ABC, ACB, BAC, BCA, CAB, CBA

There are six ways in which these letters can be arranged. How did we get the number? By
calculating the permutation, which is given by n! or factorial n. Thus 3! = 3 × 2 × 1 = 6. If there
were 4 letters, the permutations would be 4! = 4 × 3 × 2 × 1 = 24.
In the above example, if we were to ask what is the probability that the first letter is A, the
answer is 2/6, as 2 out of the 6 permutations have the letter A first.

Permutations can also be calculated for subsets of larger sets. For example, in how many ways
can you arrange a set of 2 letters from a set of 5 letters, e.g. A, B, C, D, E?

This is calculated as n!/(n – r)!, where n is the total number and r is the smaller set size. The
answer is 5!/(5 – 2)! = 20

The third method of counting is combinations. How many combinations of 2 letters can you
make from A, B, C? In this case we are not concerned with the order of the letters but with the
pairing or combinations; thus A,B and B,A are the same. This is calculated as n!/[r!(n – r)!]

Thus in the example, n = 3, r = 2, and the answer is 3!/[2!(3 – 2)!] = 3

If there were 5 letters in the set, A,B,C,D,E and we had to select any two (irrespective of order),
the answer is 5! / [2!x (5-2)!] = 10

A probability question could be: what is the probability that a set has the letter C? The answer
is 4/10 = 0.4.

What is the probability that a set has both letters B and C? The answer is 1/10 = 0.1.

What is the probability that a set has either letter B or C? The answer is

4/10 + 4/10 – 1/10 = 7/10 = 0.7. You can easily see that this covers all the combinations except
AD, AE and DE.
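The counting formulas above map directly onto Python's math module (math.factorial, math.perm and math.comb), which can be used to verify the numbers in this section:

```python
from math import comb, factorial, perm

# The three counting methods from the text
assert factorial(3) == 6      # arrangements of A, B, C
assert perm(5, 2) == 20       # ordered 2-letter sets from A..E: 5!/(5-2)!
assert comb(3, 2) == 3        # unordered pairs from A, B, C
assert comb(5, 2) == 10       # unordered pairs from A..E

# Probability that a 2-letter set from A..E contains the letter C:
# C can be paired with any of the other 4 letters
print(comb(4, 1) / comb(5, 2))  # 0.4
```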

We can also use the multiplication rule of probability with combinations

Example : A box consists of 15 items of which 5 are defectives.

How many sets of 8 can be formed if each consists of

(a) exactly 2 defectives?

(b) at least 2 defectives?


The number of ways of choosing 2 defectives from 5 is 5C2 = 5!/(2! × 3!) = 10
The number of ways of choosing 6 non-defectives from 10 is 10C6 = 10!/(6! × 4!) = 210

(a) Number of possible sets with exactly 2 defectives:

5C2 × 10C6 = 10 × 210 = 2100

(b) Number of sets with 3 defectives:

5C3 × 10C5 = 5!/(3! × 2!) × 10!/(5! × 5!) = 2520

Number of sets with 4 defectives:

5C4 × 10C4 = 5!/(4! × 1!) × 10!/(4! × 6!) = 1050

Number of sets with 5 defectives:

5C5 × 10C3 = 5!/(5! × 0!) × 10!/(3! × 7!) = 120

So the number of sets with at least 2 defectives is:

2100 + 2520 + 1050 + 120 = 5790
An Alternative Solution could be

The problem with the method used above is that if we have many (say 20) to count, it would
become very tedious. So we look at another way of doing it.

If we find the number of sets with 0 and 1 defectives, and subtract this from the total number
of sets, we will have the number with at least 2:

Number of sets with 0 defectives:

10C8 = 10!/(8! × 2!) = 45

Number of sets with 1 defective:

5C1 × 10C7 = 5!/(1! × 4!) × 10!/(7! × 3!) = 600

The total number of sets is:

15C8 = 15!/(8! × (15 – 8)!) = 15!/(8! × 7!) = 6435

So the number with at least 2 defectives is given by:

6435 – 45 – 600 = 5790
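Both the direct summation and the complement shortcut can be verified with Python's math.comb:

```python
from math import comb

# Box of 15 items, 5 of them defective; sets of 8 are drawn
exactly_2 = comb(5, 2) * comb(10, 6)
print(exactly_2)  # 2100

# "At least 2" by direct summation over 2, 3, 4 and 5 defectives
at_least_2 = sum(comb(5, d) * comb(10, 8 - d) for d in range(2, 6))
print(at_least_2)  # 5790

# Same answer via the complement: total sets minus sets with 0 or 1 defective
total_sets = comb(15, 8)
print(total_sets - comb(10, 8) - comb(5, 1) * comb(10, 7))  # 5790
```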

Probability distributions

Random Variable – a set of real numbers associated with the set of outcomes of a random
experiment. Several of the examples we have studied, such as tossing a coin, involve random
variables. The random variable is the value x that our event of interest can take. For example, if
we toss a coin thrice (see example below) and x is the number of heads, x can take the values
0, 1, 2 or 3.

When a random variable can take any value in a range, we are interested to find out

a) what will be the probability that it takes a particular value (probability density function
or pdf)
b) what will be the probability that it is less than or equal to a particular value (cumulative
distribution function or cdf)
c) What will be the probability, it could take a range of values within the large range
d) What will be the probability, it will not take a particular value (i.e it could take any value
except the one)
e) What will be the probability, it will not take a range of values within the large range
Obviously the probability will keep changing with each value or a range of values
The probability function for the entire range of values that our random variable may take is
called the probability distribution.
Probability distributions are very useful in statistical quality control as we are always interested
to know how measurable characteristics of products can take different values in a range, or
how the number of defectives in a sample could vary within a lot.

Probability Distributions: Discrete vs. Continuous

In Chapter 2 we had learnt that data can be of two types one that you can measure and the
other type that you can count. In probability distributions, we estimate the probability of
variables. If a variable can take on any value between two specified values, it is called a
continuous variable; otherwise, it is called a discrete variable. All probability distributions can
be classified either as discrete probability distributions or as continuous probability
distributions, depending on whether they define probabilities associated with discrete
variables or continuous variables. A continuous distribution's probability function such as
normal distribution takes the form of a continuous curve, and its random variable takes on an
infinite number of possible values.
In this chapter we will be learning about discrete probability distributions such as the
hypergeometric, binomial and Poisson distributions, and continuous probability distributions
such as the normal distribution.
Let us see the simplest probability distribution. When a coin is tossed 3 times, we are interested
in finding the probability distribution for the number of heads.
The discrete random variable X is the number of heads.
X can take the value of 0 heads, 1 head, 2 heads or all 3 heads,
or X = {0, 1, 2, 3}
Let us find the probability for each value of the random variable:
0 heads (TTT): Prob. = 1/2 × 1/2 × 1/2 = 1/8
1 head (HTT, THT, TTH): Prob. = 3/8
2 heads (HHT, HTH, THH): Prob. = 3/8
3 heads (HHH): Prob. = 1/8
Fig 1 demonstrates this probability distribution.
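The same distribution can be built by enumerating all eight outcomes in Python, a useful cross-check of the hand calculation:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 8 equally likely outcomes of three coin tosses
outcomes = list(product("HT", repeat=3))
heads_counts = Counter(seq.count("H") for seq in outcomes)
probs = {x: Fraction(n, len(outcomes)) for x, n in heads_counts.items()}

for x in sorted(probs):
    print(x, probs[x])
# 0 1/8
# 1 3/8
# 2 3/8
# 3 1/8
```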
Let us now learn about some of the discrete variable distributions:
Hypergeometric Distribution – used when the overall lot size (N) is small or finite, and the
probability of an event such as the occurrence of a defect (k) can be expressed as a fraction of
the overall lot size. Probability is calculated for occurrences (x) within a trial or sample size (n).
A hypergeometric experiment is a statistical experiment that has the following properties:
• A sample of size n is randomly selected without replacement from a population of N items.
• In the population, k items can be classified as successes, and N – k items can be classified
as failures.
The distribution is expressed by the formula
Prob(x; n, k, N) = [ kCx ] [ N-k Cn-x ] / [ N Cn ]
Where n is the number of samples selected
The same formula can be used in Excel in the same order as the above equation
= HYPGEOMDIST(sample_s,number_sample,population_s,number_pop)
Example: A lot consists of 1000 articles and has 5% defectives. A random sample of 75 is drawn
from this lot. What is the probability that the sample will not contain any defectives?
N= 1000
n = 75
The number of defectives in the whole lot is 5 % = 50, hence k = 50
x = 0 (no defectives)
The probability equation is
Prob(x; n, k, N) = [ 50C0 ] [ 1000-50 C75-0 ] / [ 1000 C75 ]
The denominator 1000 C75 represents the total number of combinations of 75 samples from a lot
of 1000
The numerator has two terms. The first term 50C0 represents the combinations of 0 in 50
defectives, which is obviously 1. The second term is the number of combinations of 75 (as x =
0), out of the remaining 950 items.
This is a very complex calculation, but you can get the answer through Excel, which will return
a probability of 0.0183 or 1.83 %. Alternatively you can feed these values in the Probability
Excel template provided as part of the course material
If our question was x = 1, the equation would become
Prob(x; n, k, N) = [ 50C1 ] [ 1000-50 C75-1 ] / [ 1000 C75 ] = 0.078
If our question was, what is the probability of getting up to 1 defective (i.e. 0 or 1 defective), we
add the two probabilities, which calculates to 0.097.

The probability distribution curve for the above example is given in Fig 2. See how the
probability of getting x defectives rises and peaks around 3, 4 and 5 defectives, and then falls
sharply. This is because 5% of 75 is 3.75, and we can normally expect to get defectives of the
same magnitude in the sample as in the whole population. This also brings home an important
learning: a sample closely resembles the universe, but can come up with a range of results with
varying degrees of probability. For example, 1 out of 100 samples of 75 can contain as many as
9 defectives (12%).

Fig 2

Hypergeometric distributions are used when the population or universe size is limited or finite
(for example Lot size was 1000 in the above example).
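If Excel is not at hand, the hypergeometric pmf can be sketched directly from the combination formula (the function name hypergeom_pmf below is ours, written for illustration, not a standard library call):

```python
from math import comb

def hypergeom_pmf(x, n, k, N):
    """P(x successes in a sample of n drawn without replacement
    from a population of N that contains k successes)."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Small hand-checkable case: N=10, k=3 defectives, sample n=4, x=1 defective
# C(3,1) * C(7,3) / C(10,4) = 3 * 35 / 210
print(hypergeom_pmf(1, 4, 3, 10))  # 0.5

# The lot example from the text: N=1000, k=50 (5% defective), n=75, x=0
print(round(hypergeom_pmf(0, 75, 50, 1000), 4))  # approximately 0.0183
```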
Binomial Distribution – used when the total population is very large. In a binomial distribution,
the probability is linked with the number of trials, e.g. a sample size, and is calculated for each
value of x within the sample size (n). It has the following properties:

• The experiment consists of n repeated trials.
• Each trial can result in just two possible outcomes. We call one of these outcomes a
success and the other, a failure.
• The probability of success, denoted by p, is the same on every trial.
• The trials are independent; that is, the outcome of one trial does not affect the
outcome of other trials.

The binomial probability distribution is defined by two parameters, n and p. When these are
known, the distribution is completely known. The equation for the pdf is:

Prob(x; n, p) = nCx p^x q^(n – x)

where x is the number of defectives, n is the number of trials, p is the fraction defective, and
q = 1 – p. (In generic descriptions of the binomial distribution, p is the probability of success
and q = 1 – p is the probability of failure.)
x assumes the values 0, 1, 2, …, up to n.
The mean of the binomial distribution is np and the variance is np(1 – p).
Example: A batch of 800 washers contains 6% defectives.
Calculate the probability that a random sample of 40 washers will contain
a. 0 defectives
b. Exactly 1 defective
c. Up to 4 defectives
Applying the binomial probability equation:
Prob(0 defectives) = 40C0 × 0.06^0 × 0.94^40
MS Excel uses the following formula expression to calculate binomial probability:
BINOMDIST(number_s,trials,probability_s,cumulative). The last argument is entered as ‘FALSE’
for the probability distribution function (pdf) and ‘TRUE’ for the cumulative distribution
function (cdf).
Applying the values: BINOMDIST(0,40,0.06,FALSE) we get a probability value of 0.084
For 1 defective, the equation is Prob(1 defective) = 40C1 × 0.06^1 × 0.94^39 and the Excel
function is BINOMDIST(1,40,0.06,FALSE) = 0.215
For up to 4 defectives (cdf), we have to add the expressions for 0, 1, 2, 3 and 4, or use the Excel
function BINOMDIST(4,40,0.06,TRUE) = 0.910
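The same binomial figures can be reproduced without Excel from the pmf formula (binom_pmf below is our own helper, written for illustration):

```python
from math import comb

def binom_pmf(x, n, p):
    """P(x successes in n independent trials, each with success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Washer example: n = 40, p = 0.06
print(f"{binom_pmf(0, 40, 0.06):.3f}")  # 0.084
print(f"{binom_pmf(1, 40, 0.06):.3f}")  # 0.215

# Up to 4 defectives (cdf): sum the pmf for x = 0..4
cdf_4 = sum(binom_pmf(x, 40, 0.06) for x in range(5))
print(f"{cdf_4:.3f}")  # 0.910
```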

The probability distribution curve for the above example is given in Fig 3. See how the
probability of getting x defectives rises and peaks between 2 and 3, which corresponds to the
average 6% defective rate (0.06 × 40 = 2.4).

Fig 3

A binomial distribution is used only when the number of trials is finite (for example, a fixed
number of samples selected based on acceptance sampling tables).
Poisson Distribution – similar to the binomial, except that the probability of occurrence is not
linked to trials or sample size. This resembles the situation when the number of samples in a
binomial distribution becomes so large that n and (n – x) are almost the same. It represents
situations where we cannot limit ourselves to a pre-determined number of trials, but wish to
assess the probability of each event based on a known universe trend. Examples of such
situations are:
• Number of surface defects observed in a casting
• Number of machine breakdowns in a shift
• Number of births in a hospital during 1 hour
The Poisson distribution is calculated for each occurrence of x within the overall population, for
an overall event probability denoted by λ. The important conditions are:
a) the probability of occurrence of an event remains constant from trial to trial
b) the total number of trials is very large (or not fixed in advance)
c) the chance of actual occurrence in any single trial is very small
d) the number of events occurring is a random number, i.e. events are mutually exclusive

The probability of an event x occurring (pdf), given that the average number of occurrences is λ,
is given by the Poisson distribution equation:

P(x) = e^(–λ) λ^x / x!   for x = 0, 1, 2, ......., where e = 2.71828
Example: The average number of births in a hospital is 6 in 24 hrs. If we have to calculate probability of
2 births in one hour we find that :

- events occur randomly

- events are mutually exclusive
Average births per hour λ = 6/24 = 0.25 births/hr
Probability of 2 births in an hour: P(2) = e - 0.25 x (0.25)2 / 2! = 0.024
MS Excel uses the following formula expression to calculate Poisson probability:
POISSON(x,mean,cumulative). The last argument is entered as ‘FALSE’ for the probability
distribution function (pdf) and ‘TRUE’ for the cumulative distribution function (cdf).
Example: The number of breakdowns of machines in a day follows a Poisson distribution with
mean 3. Find the probability that
a) there will be no breakdowns tomorrow (pdf)
b) There will be upto 2 breakdowns tomorrow (cdf)
c) There will be more than 4 breakdowns (cdf)
Applying Poisson distribution through MS Excel:
Case a) : POISSON(x,mean,False) = POISSON(0,3, false) = 0.049
Case b) : POISSON(x,mean,True) = POISSON(2,3, true) = 0.423
Case c) : 1 - POISSON(x,mean,True) = 1 - POISSON(4,3, true) = 1 – 0.815 = 0.185
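The three Excel results can be reproduced from the Poisson formula itself (poisson_pmf below is our own illustrative helper):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x events when the average number of occurrences is lam)."""
    return exp(-lam) * lam**x / factorial(x)

# Machine breakdowns with mean 3 per day
p_none = poisson_pmf(0, 3)                                    # case a)
p_up_to_2 = sum(poisson_pmf(x, 3) for x in range(3))          # case b)
p_more_than_4 = 1 - sum(poisson_pmf(x, 3) for x in range(5))  # case c)

print(f"{p_none:.3f} {p_up_to_2:.3f} {p_more_than_4:.3f}")  # 0.050 0.423 0.185
```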

The probability distribution curve for the above example is given in Fig 4. As we see, the highest
probability is observed around the mean.

Fig 4

How should we decide whether to use Poisson or Binomial Distribution?

If a mean or average probability of an event happening per unit time/per page/per mile cycled
etc., is given, and you are asked to calculate a probability of n events happening in a given
time/number of pages/number of miles cycled, then the Poisson Distribution is used.
If, on the other hand, an exact probability of an event happening is given, or implied, in the
question, and you are asked to calculate the probability of this event happening k times out of
n, then the Binomial Distribution must be used
Examples of Binomial or Poisson Distribution
A typist makes on average 2 mistakes per page. What is the probability of a particular page
having no errors on it? (Poisson)

A computer crashes once every 2 days on average. What is the probability of there being 2
crashes in one week? (Poisson)

ICs are packaged in boxes of 10. The probability of an IC being faulty is 2%. What is the
probability of a box containing 2 faulty ICs? (Binomial)

A box contains a large number of washers; there are twice as many steel washers as brass
ones. Four washers are selected at random from the box. What is the probability that 0, 1, 2, 3,
4 are brass? (Binomial)
Normal Distribution

Normal distribution is a continuous variable probability distribution and is among the most
universally found natural distributions.
In a normal distribution, the frequency distribution of a measurable characteristic exhibits a
bell-shaped pattern, with most of the observations clustering around a central value and with
the frequencies diminishing rapidly on either side of this central value. Thus the frequencies of
large deviations from the central value are small. The normal distribution is defined by two
parameters, μ and σ. The pdf of the normal distribution is given by

f(x) = [1 / (σ√(2π))] e^(–(1/2)((x – μ)/σ)²),   –∞ < x < +∞

For this distribution, E(X) = μ and Var(X) = σ². Under certain conditions, especially when n is
large, the binomial and Poisson distributions approximate to the normal distribution. This
distribution is used very extensively in statistical theory.

Fig 5

1. The curve extends from –∞ to +∞.

2. It is symmetrical about the mean; the mean, median and mode are the same.
U = (X – μ)/σ is known as the standard normal variable, with mean 0 and s.d. 1, where X is
N(μ, σ).

3. It is fully explained by the two parameters μ and σ. Different values of μ and σ give rise to
different normal curves (Figures 6 & 7).

Fig 6

Fig 7

4. The total area under the normal curve is one.

5. Area bound by limits:

Limits          Area (%)
μ ± σ           68.27
μ ± 1.64 σ      90.0
μ ± 1.96 σ      95.0
μ ± 2.0 σ       95.45
μ ± 3.0 σ       99.73

MS Excel uses the following formula expression to calculate normal probability:
NORMDIST(x,mean,standard_dev,cumulative). The last argument is entered as ‘FALSE’ for the
probability distribution function (pdf) and ‘TRUE’ for the cumulative distribution function (cdf).
You can also use the MS Excel Probability template for Normal Distribution given as a part of
this course to determine the probability values.

1. The time taken to complete a particular repetitive job is normally distributed with mean 40
minutes and standard deviation 8 minutes. 25 jobs are to be performed.
a) How many jobs are expected to be completed within 35 minutes?
b) How many jobs are expected to take more than 48 minutes?
c) What is the total number of jobs that will be completed within 20 to 50 minutes?
No. of jobs expected to be completed within 35 minutes = Total no. of jobs x (probability that
the time taken for a job is < 35 minutes)
To find probability that the time taken for a job is < 35 minutes, we use the Excel function
NORMDIST(35,40,8,TRUE) to get a probability value of 0.266
Therefore No. of jobs expected to be completed within 35 minutes is 25 x 0.266 which is
approximately 7
2. No. of jobs that are expected to take more than 48 minutes
= Total no. of jobs x (probability that the job will take more than 48 minutes)
For ‘more than’ situations, we calculate the probability of less than the value and subtract it
from 1.
To find the probability that the time taken for a job is < 48 minutes, we use the Excel function
NORMDIST(48,40,8,TRUE) to get a probability value of 0.841
Probability (time taken for a job > 48 minutes) = 1 – 0.841 = 0.159
No. of jobs which are expected to take more than 48 minutes is 25 x 0.159, which is
approximately 4.
3. No. of jobs which will be completed within 20 to 50 minutes
= Total no. of jobs x (Probability that time taken for a job <50 minutes - Probability that time
taken for a job <20 minutes)
= 25 x {NORMDIST(50,40,8,TRUE) - NORMDIST(20,40,8,TRUE)}
= 25 x (0.894 – 0.006) = 25 x 0.888
Therefore the number of jobs that will be completed within 20 to 50 minutes is approximately
22.
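All three answers can also be reproduced without Excel. A sketch using Python's standard library, with the normal c.d.f. written via the error function (equivalent to NORMDIST with the last argument TRUE):

```python
from math import erf, sqrt

def norm_cdf(x, mu, sd):
    # Equivalent to Excel's NORMDIST(x, mu, sd, TRUE).
    return 0.5 * (1 + erf((x - mu) / (sd * sqrt(2))))

jobs, mu, sd = 25, 40, 8

within_35 = jobs * norm_cdf(35, mu, sd)                        # ~ 25 x 0.266
over_48   = jobs * (1 - norm_cdf(48, mu, sd))                  # ~ 25 x 0.159
between   = jobs * (norm_cdf(50, mu, sd) - norm_cdf(20, mu, sd))

print(round(within_35), round(over_48), round(between))  # -> 7 4 22
```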

Estimation of Process Parameters

In statistical quality control, the probability distribution is used to describe or model some
quality characteristic, such as a critical dimension of a product or the fraction defective of the
manufacturing process. Therefore, we are interested in making inferences about the
parameters of probability distributions. Since the parameters are generally unknown, we
require procedures to estimate them from sample data.
We may define an estimator of an unknown parameter as a statistic that corresponds to the
parameter. A particular numerical value of an estimator, computed from sample data, is called
an estimate. A point estimator is a statistic that produces a single numerical value as the
estimate of the unknown parameter. An interval estimator is a random interval in which the
true value of the parameter falls with some level of probability. These random intervals are
usually called confidence intervals.

Point Estimation
Consider the random variable x with probability distribution f(x). Suppose that the mean μ and
variance σ2 of this distribution are both unknown. If a random sample of n observations is
taken, then the sample mean x̄ and sample variance S² are point estimators of the population
mean μ and population variance σ², respectively. For example, suppose we wished to obtain
point estimates of the mean and variance of the inside diameter of a bearing. We could
measure the inside diameters of a random sample of n = 20 bearings (say). Then the sample
mean and sample variance could be computed. If this yields x̄ = 1.495 and S² = 0.001, then the
point estimate of μ is x̄ = 1.495 and the point estimate of σ² is S² = 0.001.
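Python's statistics module computes exactly these point estimates. A minimal sketch along the lines of the bearing example, with a short hypothetical sample of measurements (the eight diameter values below are invented purely for illustration; the text's example uses n = 20):

```python
import statistics

# Hypothetical inside-diameter measurements in cm (for illustration only).
diameters = [1.48, 1.51, 1.50, 1.47, 1.52, 1.49, 1.50, 1.51]

x_bar = statistics.mean(diameters)      # point estimate of mu
s_sq = statistics.variance(diameters)   # point estimate of sigma^2 (n - 1 divisor)

print(f"x-bar = {x_bar:.4f}, S^2 = {s_sq:.5f}")
```

Note that statistics.variance uses the n − 1 divisor (the sample variance S²), which is the usual point estimator of σ².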

Confidence intervals

An interval estimate of a parameter is the interval between two statistics that includes the true
value of the parameter with some probability. For example to construct an interval estimator of
the mean μ, we must find two statistics L and U such that

P{L ≤ μ ≤ U} = 1 − α

The resulting interval

L ≤ μ ≤ U

is called a 100(1 − α)% C.I. for the unknown mean μ. L and U are called the lower and upper
confidence limits, respectively, and 1 − α is called the confidence coefficient. The interpretation
of a Confidence Interval is that if a large number of such intervals are constructed, each
resulting from a random sample, then 100(1 − α)% of these intervals will contain the true value
of μ.
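This interpretation can be demonstrated by simulation: construct many intervals from repeated random samples and count what fraction of them contain the true mean. A sketch assuming a known σ, so that the simple z-interval x̄ ± 1.96 σ/√n applies (the parameter values are arbitrary choices for illustration):

```python
import random
from math import sqrt

random.seed(1)
mu, sigma, n, trials = 50.0, 2.0, 16, 2000
z = 1.96  # two-sided 95% critical value of the standard normal

hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    half = z * sigma / sqrt(n)
    # Does this interval contain the true mean?
    if x_bar - half <= mu <= x_bar + half:
        hits += 1

coverage = hits / trials
print(f"empirical coverage: {coverage:.3f}")  # close to 0.95
```

With 2000 trials the empirical coverage settles very close to the nominal 95%, illustrating that "95% confidence" is a statement about the procedure, not about any single interval.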


The mean tensile strength of a synthetic fiber is an important quality characteristic that is of
interest to the manufacturer, who would like to find a 95% confidence interval estimate of the
mean. From past experience, the manufacturer is willing to assume that tensile strength is
approximately normally distributed; however, both the mean tensile strength and standard
deviation of tensile strength are unknown. A random sample of 16 fiber specimens is selected,
and their tensile strengths are determined. The sample data are shown in Table.

Specimen Strength (psi) Specimen Strength (psi)

1 48.89 9 49.20
2 52.07 10 48.10
3 49.29 11 47.90
4 51.66 12 46.94
5 52.16 13 51.76
6 49.72 14 50.75
7 48.00 15 49.86
8 49.96 16 51.57

We may calculate the sample mean and sample standard deviation of the tensile strength data:

x̄ = (1/N) Σ xᵢ = (1/16)(797.83) = 49.86 psi

S = 1.66 psi

Since t(0.025, 15) = 2.132, we find the 95% two-sided C.I. on μ as

49.86 − (2.132)(1.66)/√16 ≤ μ ≤ 49.86 + (2.132)(1.66)/√16, or

48.98 ≤ μ ≤ 50.74

Another way to express this result is that our estimate of the mean tensile strength is
49.86  0.88 psi with 95% confidence.
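The interval can be verified directly from the table of specimen strengths. A Python sketch using the standard statistics module and the tabulated t value quoted in the text (2.132 for α/2 = 0.025 and 15 degrees of freedom):

```python
import statistics
from math import sqrt

# Tensile strengths (psi) of the 16 fiber specimens from the table.
strengths = [48.89, 52.07, 49.29, 51.66, 52.16, 49.72, 48.00, 49.96,
             49.20, 48.10, 47.90, 46.94, 51.76, 50.75, 49.86, 51.57]

n = len(strengths)
x_bar = statistics.mean(strengths)   # sample mean, ~ 49.86 psi
s = statistics.stdev(strengths)      # sample standard deviation, ~ 1.66 psi

t = 2.132  # tabulated t value for alpha/2 = 0.025, n - 1 = 15 d.f.
half_width = t * s / sqrt(n)

print(f"{x_bar - half_width:.2f} <= mu <= {x_bar + half_width:.2f}")
```

Any tiny discrepancy in the last decimal place relative to the text comes from the text rounding S to 1.66 before computing the interval; the half-width agrees with the quoted ±0.88 psi.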

We have learnt the concept of probability, how objective probability is estimated, what the
various probability distributions are, and how probability is used in the estimation of process
parameters.

Many of the different statistical quality tools that we will learn in the remaining chapters will
rely on probability estimates and confidence intervals.