11.1.1. Objective Probability
• Consider a referee's flipping a coin before a football game to determine which
team will kick off
• Before the referee tosses the coin, both team captains know that they
have a 50% (0,5) chance (probability) of winning the toss
• No-one would argue that the probability of heads or tails was not 50%
• In this case, the probability of 0,50 that either heads or tails will occur
when a coin is tossed is called an objective probability
• More specifically, it is referred to as a classical or a priori (prior
to the occurrence) probability, one of the two types of objective
probabilities
• A classical, or a priori, probability can be defined as follows: Given a set of
outcomes for an activity (e.g. heads/tails), the probability of a specific (desired)
outcome is the ratio of the number of specific outcomes to the total number of
outcomes
• E.g. the probability of a head is the ratio of the number of specific
outcomes (a head) to the total number of outcomes (a head and a tail),
or ½
• E.g. the probability of drawing an ace from a deck of 52 cards would be
found by dividing 4 (the number of aces) by 52 (the total number of cards
in a deck) to get 1/13
• E.g. if we spin a roulette wheel with 50 red numbers and 50 black numbers,
the probability of the wheel's landing on a red number is 50 divided
by 100, or 1/2
• These examples are referred to as a priori probabilities because we can state
the probabilities ahead of time
• This is because we assume we know the number of specific outcomes
and total outcomes prior to the occurrence of the activity
• E.g. 52 cards in a deck, 2 sides to a coin, etc.
• These probabilities are also known as classical probabilities because some of
the earliest references in history to probabilities were related to games of
chance
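The ratio definition of a classical (a priori) probability can be sketched in a few lines of Python; the helper name `a_priori` is illustrative, not from the text:

```python
from fractions import Fraction

def a_priori(specific_outcomes, total_outcomes):
    """Classical (a priori) probability: specific outcomes / total outcomes."""
    return Fraction(specific_outcomes, total_outcomes)

print(a_priori(1, 2))     # a head: 1 of 2 sides -> 1/2
print(a_priori(4, 52))    # an ace: 4 of 52 cards -> 1/13
print(a_priori(50, 100))  # a red number: 50 of 100 slots -> 1/2
```

Using `Fraction` keeps the ratios exact, which matches the "number of specific outcomes over total outcomes" definition directly.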
• The second type of objective probability, a relative frequency probability, is
based on observing the outcomes of an activity: e.g. if a coin is tossed only
10 times, it is possible to get 10 consecutive heads; thus, the relative
frequency (probability) of a head would be 1,0 (100%)
• Key characteristic of a relative frequency probability: the relative
frequency probability becomes more accurate as the total number of
observations of the activity increases
• If a coin were tossed about 350 times, the relative frequency
would approach 1/2 (assuming a fair coin)
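This convergence can be illustrated with a short simulation; the seed and the toss counts below are arbitrary choices for the sketch, not values from the notes:

```python
import random

def head_frequency(tosses, seed=42):
    """Relative frequency of heads in a simulated run of fair-coin tosses."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses

# The relative frequency moves toward 0,5 as the number of tosses grows.
for n in (10, 350, 10_000):
    print(n, head_frequency(n))
```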
11.1.2. Subjective Probability
• When relative frequencies are not available, a probability is often determined
anyway
• In these cases a person must rely on personal belief, experience, and
knowledge of the situation to develop a probability estimate
• A probability estimate that is not based on prior or past evidence is a
subjective probability
• E.g. when a meteorologist forecasts a "60% chance of rain tomorrow," the
0,60 probability is usually based on the meteorologist's experience and
expert analysis of the weather conditions
• In other words, the meteorologist is not saying that these exact
weather conditions have occurred 1 000 times in the past and on
600 occasions it has rained, thus there is a 60% probability of rain
• E.g. when a sportswriter says that a football team has an 80% chance of
winning, it is usually not because the team has won 8 of its 10 previous
games
• The prediction is judgmental, based on the sportswriter's
knowledge of the teams involved, the playing conditions, and so
forth
• Once the relative frequency probability becomes colored by
personal belief, it is subjective
• Caution regarding the use of subjective probabilities: different people will often
arrive at different subjective probabilities
• Everyone should arrive at the same objective probability, given the same
numbers and correct calculations
• When a probabilistic analysis is to be made of a situation, the use of objective
probability will provide more consistent results
11.2. Fundamentals of Probability
• In the terminology of probability, the coin toss (in the example) is referred to as
an experiment
• An experiment is an activity that results in one of several possible
outcomes
• Our coin-tossing experiment can result in either one of two outcomes,
which are referred to as events: a head or a tail
• The probabilities associated with each event in the experiment are:
Event Probability
Head 0,5
Tail 0,5
TOTAL 1,0
• The coin tossing example also exhibits a third characteristic: the events in a set
of events are mutually exclusive (only one of them can occur at a time)
• Either heads or tails – not both
• E.g. A sales manager in a shoe shop estimates that there is a 60%
probability that the customer will buy a pair of shoes and a 40% probability
that he won’t
• It is impossible to buy shoes and not buy shoes at the same time
• E.g. the distribution of the final grades of 3 000 students (each student
receives exactly one grade, so the events are mutually exclusive):

Event (Grade)   Nr of Students   Relative Frequency   Probability
A                      300            300/3000             0,1
B                      600            600/3000             0,2
C                     1500           1500/3000             0,5
D                      450            450/3000             0,15
F                      150            150/3000             0,05
TOTAL                 3000                                  1,0
…
• Now let us consider a case in which two events are not mutually exclusive
• The probability that A or B or both will occur is expressed as:
• P(A or B) = P(A) + P(B) - P(AB)
• P(AB) is the joint probability of A and B (there is a possibility that both A and
B will occur)
• For mutually exclusive events, this term has to equal zero because both events
cannot occur together:
• P(A or B) = P(A) + P(B) - P(AB) = P(A) + P(B) - 0 = P(A) + P(B)
…
• E.g. Suppose it has been determined that 40% of all students in the Faculty are
at present taking management, and 30% of all the students are taking finance
• Also, it has been determined that 10% take both subjects
• The probabilities are:
• P(M) = 0,4
• P(F) = 0,3
• P(MF) = 0,1
• The probability of a student's taking one or the other or both of the courses is:
• P(M or F) = P(M) + P(F) - P(MF) = 0,4 + 0,3 - 0,1 = 0,6
• Why was the joint probability, P(MF), subtracted out? The 40% of the
students who were taking management also included those students taking
both courses
• Likewise, the 30% of the students taking finance also included those students
taking both courses
• Venn diagram that shows the two events, M and F, that are not mutually
exclusive, and the joint event, MF.
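The inclusion-exclusion step above can be checked mechanically; the function below is a sketch, and its name is illustrative:

```python
def p_a_or_b(p_a, p_b, p_ab=0.0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(AB).
    For mutually exclusive events the joint term p_ab is 0."""
    return p_a + p_b - p_ab

# Management/finance example: P(M) = 0,4, P(F) = 0,3, P(MF) = 0,1
print(p_a_or_b(0.4, 0.3, 0.1))  # 0,6 up to floating-point rounding
```

Subtracting `p_ab` removes the double count of students who appear in both the 40% and the 30% groups.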
11.3. Statistical Independence and Dependence
• Statistically, events are either:
• Independent
• The occurrence of one event does not affect the probability of the
occurrence of another event
• Dependent
• The occurrence of one event affects the probability of the occurrence
of another event
• In summary, if events A and B are independent, the following two properties hold:
• P(AB) = P(A) x P(B)
• P(A|B) = P(A)
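The two independence properties can be verified by enumerating the sample space of two fair-coin tosses; a minimal sketch:

```python
from itertools import product

# Sample space of two independent fair-coin tosses; each pair equally likely.
outcomes = list(product("HT", repeat=2))  # 4 equally likely pairs

def p(event):
    """Probability of an event as a share of equally likely outcomes."""
    return sum(1 for o in outcomes if event(o)) / len(outcomes)

p_a  = p(lambda o: o[0] == "H")                  # A: heads on first toss
p_b  = p(lambda o: o[1] == "H")                  # B: heads on second toss
p_ab = p(lambda o: o[0] == "H" and o[1] == "H")  # joint event AB

print(p_ab == p_a * p_b)  # True: P(AB) = P(A) x P(B)
```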
11.3.2. Probability Trees
• E.g. coin is tossed 3 times
• Possible outcomes can be shown in a probability tree:
• The joint probability that events will occur in succession is
computed by multiplying the probabilities of all the events
• E.g. the probability of getting heads on the first toss, tails on the second,
and tails on the third is 0,125:
• P(HTT) = P(H) x P(T) x P(T) = (0,5) (0,5) (0,5) = 0,125
• However, do not confuse the results in the probability tree with conditional
probabilities
• The probability that heads and then two tails will occur on three
consecutive tosses is computed prior to any tosses
• If the first two tosses have already occurred, the probability of getting
tails on the third toss is still 0,5:
• P(T|HT) = P(T) = 0,5
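The branch probabilities of the tree can be reproduced by multiplying along each path and enumerating all eight three-toss sequences; a minimal sketch:

```python
from itertools import product

P = {"H": 0.5, "T": 0.5}  # per-toss probabilities for a fair coin

def joint(sequence):
    """Multiply the per-toss probabilities along one branch of the tree."""
    prob = 1.0
    for toss in sequence:
        prob *= P[toss]
    return prob

print(joint("HTT"))  # 0.125, i.e. P(HTT) = 0,5 x 0,5 x 0,5
# The eight branch probabilities of the full tree sum to 1.
print(sum(joint("".join(s)) for s in product("HT", repeat=3)))  # 1.0
```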
11.3.3. The Binomial Distribution
Some additional information can be drawn from the probability tree of our example
E.g. what is the probability of achieving exactly two tails on three tosses?
Observe the instances in which two tails occurred: two tails in three tosses
occurred three times, each time with a probability of 0,125
the probability of getting exactly two tails in three tosses is the sum of
these three probabilities (0,375)
The use of a probability tree can become very cumbersome, especially if we are
considering an example with 20 tosses
However, the example of tossing a coin exhibits certain properties that enable us to
define it as a Bernoulli process
The properties of a Bernoulli process are:
1. There are two possible outcomes for each trial (i.e. each toss of a
coin)
Success/failure; yes/no; heads/tails; good/bad; etc.
2. The probability of the outcomes remains constant over time
the probability of getting heads on a coin toss remains the
same, regardless of the number of tosses
3. The outcomes of the trials are independent
A heads on the first toss does not affect the probabilities on
subsequent tosses
4. The number of trials is discrete and an integer
Discrete: values that are countable
Integer: whole numbers (e.g. 1 car, 2 people)
…
E.g. suppose we want to determine the probability of getting exactly two tails in
three tosses of a coin
Tails = success (it is the object of the analysis)
Probability of a tail: p = 0,5
q = 1 – 0,5 = 0,5
Number of tosses: n = 3
Number of tails: r = 2
Substituting these values into the binomial formula will result in the
probability of two tails in three coin tosses:
P(2 tails) = P(r = 2) = [n!/(r!(n-r)!)] × p^r × q^(n-r)
           = [3!/(2!(3-2)!)] × 0,5² × 0,5^(3-2)
           = [(3 × 2 × 1)/((2 × 1)(1))] × 0,25 × 0,5
           = (6/2) × 0,125
           = 0,375
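The same calculation can be done directly with Python's `math.comb`; a minimal sketch (the function name is illustrative):

```python
from math import comb

def binomial(r, n, p):
    """Binomial probability of exactly r successes in n Bernoulli trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

print(binomial(2, 3, 0.5))  # 0.375, two tails in three tosses
```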
Similarly, if a batch of microchips contains 20% defective items (p = 0,2) and a
sample of n = 4 microchips is tested, the binomial formula gives the probability
of getting exactly two defective items out of four microchips as 0,154 (15,4%)
…
Now, let us alter this problem to make it even more realistic:
The manager has determined that four microchips from every large batch
should be tested for quality. If two or more defective microchips are found,
the whole batch will be rejected. The manager wants to know the probability
of rejecting an entire batch of microchips, if, in fact, the batch has 20%
defective items.
Remember: binomial distribution gives us the probability of an exact
number of integer successes
if we want the probability of two or more defective items, it is
necessary to compute the probability of two, three, and four defective
items:
P(r ≥ 2) = P(r = 2) + P(r = 3) + P(r = 4)
Substituting the values p = 0,2, n = 4, q = 0,8, and r = 2, 3, and 4 into the
binomial distribution results in the probability of two or more defective items:
P(r ≥ 2) = [4!/(2!(4-2)!)] × 0,2² × 0,8^(4-2) + [4!/(3!(4-3)!)] × 0,2³ × 0,8^(4-3) + [4!/(4!(4-4)!)] × 0,2⁴ × 0,8^0
         = 0,154 + 0,026 + 0,002
         = 0,182
Note: the collectively exhaustive set of events for this example is 0, 1, 2, 3, and 4
defective microchips
Because the sum of the probabilities of a collectively exhaustive set of
events equals 1.0:
P(r=0,1,2,3,4)= P(r=0) + P(r=1) + P(r=2) + P(r=3) + P(r=4) =1,0
Recall that the results of the immediately preceding example show that
P(r = 2 ) + P(r = 3) + P(r = 4) = 0,182
Given this result, we can compute the probability of "less than two
defectives" as follows:
P(r < 2) = P(r = 0) + P(r = 1) = 1,0 - [P(r = 2) + P(r = 3) + P(r = 4)] = 1,0 - 0,182 = 0,818
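The cumulative sum above can be sketched the same way; `binomial` is an illustrative helper, and 0,1808 is the unrounded total that the notes round to 0,182:

```python
from math import comb

def binomial(r, n, p):
    """Binomial probability of exactly r successes in n Bernoulli trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Probability of two or more defectives in a sample of four, with p = 0,2
p_reject = sum(binomial(r, 4, 0.2) for r in range(2, 5))
print(round(p_reject, 4))      # 0.1808

# Complement: fewer than two defectives
print(round(1 - p_reject, 4))  # 0.8192
```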
11.3.4. Dependent Events
As stated earlier, if the occurrence of one event affects the probability of the
occurrence of another event, the events are dependent events
e.g. two buckets contain a number of colored balls each. Bucket 1 contains two red
balls and four white balls, and bucket 2 contains one blue ball and five red balls. A
coin is tossed. If heads result, a ball is drawn out of bucket 1; if tails result, a ball is
drawn from bucket 2
Like statistically independent events, dependent events also exhibit certain defining
properties
E.g. (slightly altered from previous) bucket 2 contains 1 white ball and 5 red balls
The outcomes that can result from this scenario are shown below:
When the coin is flipped, one of two outcomes is possible: heads (0,5) or tails (0,5):
P(H) = 0,5
P(T) = 0,5
…
As indicated previously, these probabilities are referred to as marginal probabilities
The marginal probabilities of a collectively exhaustive set of events sum to
one
They are also unconditional probabilities
Not conditional on the occurrence of any other events (i.e. the probabilities of the
occurrence of a single event)
Once the coin has been tossed and a head or tail has resulted, a ball is drawn from
one of the buckets
If heads result, a ball is drawn from bucket 1
2/6 (0,33) probability of drawing a red ball
4/6 (0,67) probability of drawing a white ball
If tails result, a ball is drawn from bucket 2
5/6 (0,83) probability of drawing a red ball
1/6 (0,17) probability of drawing a white ball
These probabilities of drawing red or white balls are called conditional
probabilities because they are conditional on the outcome of the event of
tossing a coin
Symbolically, these conditional probabilities are expressed as:
P(R|H) = 0,33
"the probability of drawing a red ball given that heads result
from the coin toss," equals 0,33
P(W|H) = 0,67
P(R|T) = 0,83
P(W|T) = 0,17
…
Returning to our example, the joint events are the occurrence of heads and a red
ball, heads and a white ball, tails and a red ball, and tails and a white ball
The probabilities of these joint events are:
P(RH)=P(R|H) . P(H) = (0,33)(0,5) = 0,165
P(WH) = P(W|H) . P(H) = (0,67)(0,5) = 0,335
P(RT) = P(R|T) . P(T) = (0,83)(0,5) = 0,415
P(WT) = P(W|T) . P(T) = (0,17)(0,5) = 0,085
Joint probability table (summarizes the joint probabilities for the example):
Draw a Ball
Flip a coin Red White Marginal probabilities
Heads P(RH) = 0,165 P(WH) = 0,335 P(H) = 0,5
Tails P(RT) = 0,415 P(WT) = 0,085 P(T) = 0,5
Marginal probabilities P(R) = 0,580 P(W) = 0,420 1,00
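Recomputing the table with exact fractions (rather than the rounded conditional probabilities 0,33 and 0,67) is a useful cross-check; a sketch:

```python
from fractions import Fraction as F

# Marginal (coin) and conditional (ball | bucket) probabilities, as fractions
P_H, P_T = F(1, 2), F(1, 2)
P_R_given_H, P_W_given_H = F(2, 6), F(4, 6)  # bucket 1: 2 red, 4 white
P_R_given_T, P_W_given_T = F(5, 6), F(1, 6)  # bucket 2: 5 red, 1 white

# Joint probabilities: P(XY) = P(X|Y) . P(Y)
P_RH, P_WH = P_R_given_H * P_H, P_W_given_H * P_H
P_RT, P_WT = P_R_given_T * P_T, P_W_given_T * P_T

# Marginal probabilities of the ball colours (rows of the table summed)
print(P_RH + P_RT)  # 7/12, about 0,583 (the table's 0,580 comes from rounding)
print(P_WH + P_WT)  # 5/12, about 0,417
```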
11.3.5. Bayesian Analysis
E.g. a production manager for a manufacturing firm is supervising the machine setup for the
production of a product. The machine operator sets up the machine. If the machine is set
up correctly, there is a 10% chance that an item produced on the machine will be defective.
If the machine is set up incorrectly, there is a 40% chance that an item will be defective.
The production manager knows from past experience that there is a 0,5 probability that a
machine will be set up correctly or incorrectly by an operator. In order to reduce the chance
that an item produced on the machine will be defective, the manager has decided that the
operator should produce a sample item. The manager wants to know the probability that
the machine has been set up incorrectly if the sample item turns out to be defective.
The probabilities are:
P(C) = 0,5
P(IC) = 0,5
P(D|C) = 0,1
P(D|IC) = 0,4
C = correct
IC = incorrect
D = defective
The posterior probability for our example is the conditional probability that the
machine has been set up incorrectly, given that the sample item proves to be
defective, or P(IC|D)
In Bayesian analysis, once we are given the initial marginal and conditional
probabilities, we can compute the posterior probability by using Bayes's rule, as follows:
P(IC|D) = P(D|IC) P(IC) / [P(D|IC) P(IC) + P(D|C) P(C)]
        = (0,4)(0,5) / [(0,4)(0,5) + (0,1)(0,5)]
        = 0,8
Previously, the manager knew that there was a 50% chance that the machine was
set up incorrectly
Now, after producing and testing a sample item, the manager knows that if it is
defective, there is a 0,8 probability that the machine was set up incorrectly
By gathering some additional information, the manager can revise the estimate of
the probability that the machine was set up correctly
This will obviously improve decision making by allowing the manager to
make a more informed decision about whether to have the machine set up
again
In general, given two events, A and B, and a third event, C (that is conditionally
dependent on A and B), Bayes's rule can be written as:
P(A|C) = P(C|A) P(A) / [P(C|A) P(A) + P(C|B) P(B)]
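Bayes's rule translates directly into code; a sketch with an illustrative function name, applied to the machine-setup example:

```python
def posterior(p_c_given_a, p_a, p_c_given_b, p_b):
    """Bayes's rule: P(A|C), given the prior P(A) and likelihoods P(C|A), P(C|B)."""
    return (p_c_given_a * p_a) / (p_c_given_a * p_a + p_c_given_b * p_b)

# Machine-setup example: A = incorrect setup, B = correct setup, C = defective item
print(posterior(0.4, 0.5, 0.1, 0.5))  # 0.8
```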
11.4. Expected Value
It is often possible to assign numeric values to the various outcomes that can result
from an experiment
When the values of the variables occur in no particular order or sequence, the
variables are referred to as random variables
Every possible value of a variable has a probability of occurrence associated with it
E.g. if a coin is tossed three times, the number of heads obtained is a random
variable
The possible values of the random variable are 0, 1, 2, and 3 heads
The values of the variable are random because there is no way of predicting
which value will result when the coin is tossed three times
If three tosses are made several times, the values (i.e. numbers of heads)
that will result will have no sequence or pattern - they will be random
Although the exact values of the random variables in the previous examples are not
known prior to the event, it is possible to assign a probability to the occurrence of
the possible values that can result
E.g. Consider a production operation in which a machine breaks down
periodically
From experience it has been determined that the machine will break
down 0, 1, 2, 3, or 4 times per month
Although managers do not know the exact number of breakdowns (x) that
will occur each month, they can determine the relative frequency
probability of each number of breakdowns, P(x)
These probabilities are:
x P(x)
0 0,1
1 0,2
2 0,3
3 0,25
4 0,15
TOTAL 1,0
…
The expected value of the random variable (the number of breakdowns in any
given month) is computed by multiplying each value of the random variable by its
probability of occurrence and summing these products
For our example, the expected number of breakdowns per month is computed
as follows:
E(x) = (0)(0,1) + (1)(0,2) + (2)(0,3) + (3)(0,25) + (4)(0,15)
     = 2,15 breakdowns
The expected value is often referred to as the weighted average (or mean) of
the probability distribution
It is a measure of central tendency of the distribution
In addition to knowing the mean, it is often desirable to know how the values are
dispersed (or scattered) around the mean
A measure of dispersion is the variance, which is computed as follows:
1. Square the difference between each value and the expected value
2. Multiply these resulting amounts by the probability of each value
3. Sum the values compiled in step 2
General formula:
σ² = Σ [x - E(x)]² P(x)

For the example:

σ² = (0 - 2,15)²(0,1) + (1 - 2,15)²(0,2) + (2 - 2,15)²(0,3) + (3 - 2,15)²(0,25) + (4 - 2,15)²(0,15)
   = 1,4275
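The mean and variance of the breakdown distribution can be computed in a few lines; a sketch using the table's values:

```python
# Breakdown distribution: value x -> probability P(x), from the table above
dist = {0: 0.10, 1: 0.20, 2: 0.30, 3: 0.25, 4: 0.15}

# Expected value: weighted average of the values
mean = sum(x * p for x, p in dist.items())

# Variance: probability-weighted squared deviations from the mean
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

print(round(mean, 2))      # 2.15 breakdowns per month
print(round(variance, 4))  # 1.4275
```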