
Chapter 5
Probability (Part 2)
Contingency Tables
Tree Diagrams
Bayes' Theorem (optional)
Counting Rules (optional)

Contingency Tables
What is a Contingency Table?
A contingency table is a cross-tabulation of frequencies into rows and columns.
[Figure: schematic table with Variable 1 across the columns (Col 1 to Col 3) and Variable 2 down the rows (Row 1 to Row 4); each row-column intersection is a cell.]

A contingency table is like a frequency distribution for two variables.

Contingency Tables
Example: Salary Gains and MBA Tuition
Consider the following cross-tabulation table for n = 67 top-tier MBA programs:

Contingency Tables
Example: Salary Gains and MBA Tuition
Are large salary gains more likely to accrue to graduates of high-tuition MBA programs? The frequencies indicate that MBA graduates of high-tuition schools do tend to have large salary gains. Also, most of the top-tier schools charge high tuition. More precise interpretations of this data can be made using the concepts of probability.

Contingency Tables
Marginal Probabilities
The marginal probability of a single event is found by dividing a row or column total by the total sample size. For example, find the marginal probability of a medium salary gain, P(S2).
P(S2) = 33/67 = .4925
Conclude that about 49% of salary gains at the top-tier schools were between $50,000 and $100,000 (medium gain).

Contingency Tables
Marginal Probabilities
Find the marginal probability of a low tuition P(T1).

P(T1) = 16/67 = .2388
There is a 24% chance that a top-tier school's MBA tuition is under $40,000.
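The two marginal probabilities above can be checked with a short Python sketch. The cell counts are an assumption: the table itself is not reproduced in these notes, so they were reconstructed from the probabilities quoted throughout this section (for example 5/67, 11/67, 15/67, 1/67 and 5/32). The label "medium tuition" for T2 and the function names are also illustrative.

```python
# Cell counts reconstructed from probabilities quoted in the slides
# (assumption: the original table image is not available here).
# Rows: tuition T1 (low), T2 (assumed medium), T3 (high).
# Columns: salary gain S1 (small), S2 (medium), S3 (large).
counts = {
    "T1": {"S1": 5, "S2": 10, "S3": 1},
    "T2": {"S1": 7, "S2": 11, "S3": 1},
    "T3": {"S1": 5, "S2": 12, "S3": 15},
}

# Total sample size: sum of every cell.
n = sum(sum(row.values()) for row in counts.values())


def marginal_row(t):
    """Marginal probability of a tuition level: row total / n."""
    return sum(counts[t].values()) / n


def marginal_col(s):
    """Marginal probability of a salary-gain level: column total / n."""
    return sum(row[s] for row in counts.values()) / n


print(n)                              # 67
print(round(marginal_col("S2"), 4))   # 0.4925
print(round(marginal_row("T1"), 4))   # 0.2388
```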

Contingency Tables
Joint Probabilities
A joint probability represents the intersection of two events in a cross-tabulation table. Consider the joint event that the school has low tuition and large salary gains, denoted P(T1 ∩ S3).

Contingency Tables
Joint Probabilities
So, using the cross-tabulation table,
P(T1 ∩ S3) = 1/67 = .0149
There is less than a 2% chance that a top-tier school has both low tuition and large salary gains.

Contingency Tables
Conditional Probabilities
A conditional probability is found by restricting ourselves to a single row or column (the condition). For example, knowing that a school's MBA tuition is high (T3), we would restrict ourselves to the third row of the table.

Contingency Tables
Conditional Probabilities
Find the probability that the salary gains are small (S1) given that the MBA tuition is high (T3).
P(S1 | T3) = 5/32 = .1563
What does this mean?
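As a rough sketch, the joint probability of low tuition with large salary gains and the conditional probability P(S1 | T3) can be computed as exact fractions. The cell counts are again an assumption, reconstructed from the probabilities quoted in this section rather than copied from the original table.

```python
from fractions import Fraction

# Reconstructed cell counts (assumption; see the probabilities quoted above).
counts = {
    "T1": {"S1": 5, "S2": 10, "S3": 1},
    "T2": {"S1": 7, "S2": 11, "S3": 1},
    "T3": {"S1": 5, "S2": 12, "S3": 15},
}
n = sum(sum(row.values()) for row in counts.values())

# Joint probability: one cell divided by the grand total.
p_t1_and_s3 = Fraction(counts["T1"]["S3"], n)

# Conditional probability: restrict to the T3 row, then divide the
# S1 cell by that row's total instead of the grand total.
t3_total = sum(counts["T3"].values())
p_s1_given_t3 = Fraction(counts["T3"]["S1"], t3_total)

print(p_t1_and_s3, float(p_t1_and_s3))      # 1/67 and its decimal value
print(p_s1_given_t3, float(p_s1_given_t3))  # 5/32 and its decimal value
```

The only difference between the two calculations is the denominator: the grand total for a joint probability, the conditioning row's total for a conditional probability.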

Contingency Tables
Independence
To check for independent events in a contingency table, compare the conditional to the marginal probabilities. For example, if large salary gains (S3) were independent of low tuition (T1), then P(S3 | T1) = P(S3).
Conditional: P(S3 | T1) = 1/16 = .0625
Marginal: P(S3) = 17/67 = .2537

What do you conclude about events S3 and T1?
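The comparison can be sketched in Python. The helper `roughly_equal` and its tolerance of 0.02 are hypothetical choices made only for illustration; the slides do not specify how close the two probabilities must be, and sample data will rarely match exactly.

```python
from fractions import Fraction

# Values quoted in the slide above.
p_s3_given_t1 = Fraction(1, 16)   # conditional
p_s3 = Fraction(17, 67)           # marginal

gap = abs(p_s3_given_t1 - p_s3)


def roughly_equal(a, b, tol=Fraction(1, 50)):
    """Hypothetical rule of thumb: treat a and b as equal if within tol."""
    return abs(a - b) <= tol


print(float(gap))
print(roughly_equal(p_s3_given_t1, p_s3))  # False
```

Under this illustrative tolerance the conditional and marginal probabilities are far apart, which is evidence against independence of S3 and T1.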

Contingency Tables
Relative Frequencies
Calculate the relative frequencies below for each cell of the cross-tabulation table to facilitate probability calculations.

Symbolic notation for relative frequencies:

Contingency Tables
Relative Frequencies
Here are the resulting probabilities (relative frequencies). For example,
P(T1 and S1) = 5/67
P(T2 and S2) = 11/67
P(T3 and S3) = 15/67
P(S1) = 17/67
P(T2) = 19/67

Contingency Tables
Relative Frequencies
The nine joint probabilities sum to 1.0000 since these are all the possible intersections. Summing across a row or down a column gives the marginal probability for the respective row or column.
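A minimal sketch of the relative-frequency table, assuming the same reconstructed cell counts used for the earlier MBA calculations (they are inferred from the quoted probabilities, not copied from the original table). Exact fractions are used so the sum-to-one check is not affected by rounding.

```python
from fractions import Fraction

# Reconstructed cell counts (assumption).
counts = {
    "T1": {"S1": 5, "S2": 10, "S3": 1},
    "T2": {"S1": 7, "S2": 11, "S3": 1},
    "T3": {"S1": 5, "S2": 12, "S3": 15},
}
n = sum(sum(row.values()) for row in counts.values())

# Relative frequency of each cell = cell count / n.
rel = {t: {s: Fraction(c, n) for s, c in row.items()}
       for t, row in counts.items()}

# The nine joint probabilities cover every intersection, so they sum to 1.
total = sum(sum(row.values()) for row in rel.values())

# Summing across a row or down a column gives the marginals.
row_marginals = {t: sum(row.values()) for t, row in rel.items()}
col_marginals = {s: sum(rel[t][s] for t in rel) for s in ("S1", "S2", "S3")}

print(total)                 # 1
print(row_marginals["T2"])   # 19/67
print(col_marginals["S1"])   # 17/67
```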

Contingency Tables
Example: Payment Method and Purchase Quantity
A small grocery store would like to know if the number of items purchased by a customer is independent of the type of payment method the customer chooses to use. Why would this information be useful to the store manager? The manager collected a random sample of 368 customer transactions.

Contingency Tables
Example: Payment Method and Purchase Quantity
Here is the contingency table of frequencies:

Contingency Tables
Example: Payment Method and Purchase Quantity

Calculate the marginal probability that a customer will use cash to make the payment.
Let C be the event cash. P(C) = 126/368 = .3424

Now, is this probability the same if we condition on number of items purchased?

Contingency Tables
Example: Payment Method and Purchase Quantity
P(C | 1-5) = 30/88 = .3409
P(C | 6-10) = 46/135 = .3407
P(C | 10-20) = 31/89 = .3483
P(C | 20+) = 19/56 = .3393

P(C) = .3424, so what do you conclude about independence? Based on this, the manager might decide to offer a cash-only lane that is not restricted to the number of items purchased.
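The comparison of conditional and marginal probabilities for the grocery example can be sketched as follows. Only the cash counts and column totals quoted above are used; the variable names are illustrative.

```python
# Cash transactions and total transactions per purchase-quantity group,
# taken from the fractions quoted above (30/88, 46/135, 31/89, 19/56).
cash = {"1-5": 30, "6-10": 46, "10-20": 31, "20+": 19}
totals = {"1-5": 88, "6-10": 135, "10-20": 89, "20+": 56}

# Marginal probability of paying cash.
p_cash = sum(cash.values()) / sum(totals.values())

# Conditional probability of cash given each quantity group.
conditionals = {group: cash[group] / totals[group] for group in cash}

# Largest distance between any conditional and the marginal.
max_gap = max(abs(p - p_cash) for p in conditionals.values())

print(round(p_cash, 4))  # 0.3424
for group, p in conditionals.items():
    print(group, round(p, 4))
print(max_gap < 0.01)    # True
```

Every conditional probability lies within about half a percentage point of P(C), which is what one would expect to see if payment method were independent of the number of items purchased.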

Contingency Tables
How Do We Get a Contingency Table?
Contingency tables require careful organization and are created from raw data. Consider the data of salary gain and tuition for n = 67 top-tier MBA schools.

Contingency Tables
How Do We Get a Contingency Table?
The data should be coded so that the values can be placed into the contingency table.
Once coded, tabulate the frequency in each cell of the contingency table using MINITAB's Stat | Tables | Cross Tabulation.

Tree Diagrams
What is a Tree?
A tree diagram or decision tree helps you visualize all possible outcomes. Start with a contingency table. For example, this table gives expense ratios by fund type for 21 bond funds and 23 stock funds.

Tree Diagrams
What is a Tree?
To label the tree, first calculate conditional probabilities by dividing each cell frequency by its column total. For example,
P(L | B) = 11/21 = .5238
Here is the table of conditional probabilities:

Tree Diagrams
What is a Tree?
The tree diagram shows all events along with their marginal, conditional and joint probabilities. To calculate joint probabilities, use
P(A ∩ B) = P(A | B)P(B) = P(B | A)P(A)

The joint probability of each terminal event on the tree can be obtained by multiplying the probabilities along its branch.
For example, P(B ∩ L) = P(L | B)P(B) = (.5238)(.4773) = .2500
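Multiplying along a branch can be checked with exact fractions. This sketch uses only the figures given above (11 of the 21 bond funds, and 21 + 23 = 44 funds in total); the variable names are illustrative.

```python
from fractions import Fraction

n_bond, n_stock = 21, 23
n_funds = n_bond + n_stock            # 44 funds in total

p_b = Fraction(n_bond, n_funds)       # marginal P(B) = 21/44
p_l_given_b = Fraction(11, n_bond)    # conditional P(L | B) = 11/21

# Joint probability of the terminal event: multiply along the branch.
p_b_and_l = p_l_given_b * p_b

print(p_b_and_l)         # 1/4
print(float(p_b_and_l))  # 0.25
```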

Tree Diagrams
Tree Diagram for Fund Type and Expense Ratios

Bayes' Theorem
Thomas Bayes (1702-1761) provided a method (called Bayes' Theorem) of revising probabilities to reflect new information. The prior (marginal) probability of an event B is revised after event A has been considered to yield a posterior (conditional) probability.

Bayes' formula is:
P(B | A) = P(A | B) P(B) / P(A)

Bayes' Theorem
Bayes' formula begins as:
P(B | A) = P(A | B) P(B) / P(A)

In some situations P(A) is not given. Therefore, the most useful and common form of Bayes' Theorem is:

P(B | A) = P(A | B) P(B) / [P(A | B) P(B) + P(A | B') P(B')]

Bayes' Theorem
How Bayes' Theorem Works
Consider an over-the-counter pregnancy testing kit and its track record of determining pregnancies.

                                   Positive test                  Negative test
If a woman is actually pregnant    96% of time                    4% of time (false negative)
If a woman is not pregnant         1% of time (false positive)    99% of time

Bayes' Theorem
How Bayes' Theorem Works
Suppose that 60% of the women who purchase the kit are actually pregnant. Intuitively, if 1,000 women use this test, the results should look like this.

Bayes' Theorem
How Bayes' Theorem Works
Of the 580 women who test positive, 576 will actually be pregnant.

So, the desired probability is:

P(Pregnant | Positive Test) = 576/580 = .9931

Bayes' Theorem
How Bayes' Theorem Works
Now use Bayes' Theorem to formally derive the result P(Pregnant | Positive) = .9931.
First define:
A = positive test, A' = negative test
B = pregnant, B' = not pregnant
From the contingency table, we know that:
P(A | B) = .96    P(A | B') = .01    P(B) = .60
And the complement of each event is:
P(A' | B) = .04    P(A' | B') = .99    P(B') = .40

Bayes' Theorem
How Bayes' Theorem Works
P(B | A) = P(A | B) P(B) / [P(A | B) P(B) + P(A | B') P(B')]
         = (.96)(.60) / [(.96)(.60) + (.01)(.40)]
         = .576 / (.576 + .004)
         = .576 / .580
         = .9931

So, there is a 99.31% chance that a woman is pregnant, given that the test is positive.
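The two-event form of Bayes' Theorem used above can be written as a small function. This is a minimal sketch; the function name `posterior` and its parameter names are illustrative, not part of the slides.

```python
def posterior(p_a_given_b, p_b, p_a_given_not_b):
    """Return P(B | A) using the two-event form of Bayes' Theorem.

    p_a_given_b     : P(A | B), e.g. a positive test given pregnant
    p_b             : prior P(B)
    p_a_given_not_b : P(A | B'), e.g. the false positive rate
    """
    p_not_b = 1 - p_b
    numerator = p_a_given_b * p_b                        # P(A and B)
    denominator = numerator + p_a_given_not_b * p_not_b  # P(A)
    return numerator / denominator


# Pregnancy test example: P(A | B) = .96, P(B) = .60, P(A | B') = .01
p = posterior(0.96, 0.60, 0.01)
print(round(p, 4))  # 0.9931
```

The denominator is P(A) rebuilt from its two pieces, which is why this form is usable even when P(A) is not given directly.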

Bayes' Theorem
How Bayes' Theorem Works
Bayes' Theorem shows us how to revise our prior probability of pregnancy to get the posterior probability after the results of the pregnancy test are known.
Prior (before the test): P(B) = .60
Posterior (after positive test result): P(B | A) = .9931

Bayes' Theorem is useful when a direct calculation of a conditional probability is not permitted due to lack of information.

Bayes' Theorem
How Bayes' Theorem Works
A tree diagram helps visualize the situation.

Bayes' Theorem
How Bayes' Theorem Works
The 2 branches showing a positive test (A) comprise a reduced sample space B ∩ A and B' ∩ A, so add their probabilities to obtain the denominator of the fraction whose numerator is P(B ∩ A).

Bayes' Theorem
General Form of Bayes' Theorem
A generalization of Bayes' Theorem allows event B to be polytomous (B1, B2, ..., Bn) rather than dichotomous (B and B').
P(Bi | A) = P(A | Bi) P(Bi) / [P(A | B1) P(B1) + P(A | B2) P(B2) + ... + P(A | Bn) P(Bn)]

Bayes' Theorem
Example: Hospital Trauma Centers
Based on historical data, the percent of cases at 3 hospital trauma centers and the probability of a case resulting in a malpractice suit are as follows:

Let event
A = a malpractice suit is filed
Bi = patient was treated at trauma center i

Bayes' Theorem
Example: Hospital Trauma Centers
Applying the general form of Bayes' Theorem, find P(B1 | A).
P(B1 | A) = P(A | B1) P(B1) / [P(A | B1) P(B1) + P(A | B2) P(B2) + P(A | B3) P(B3)]
          = (0.001)(0.50) / [(0.001)(0.50) + (0.005)(0.30) + (0.008)(0.20)]
          = 0.0005 / (0.0005 + 0.0015 + 0.0016)
          = 0.0005 / 0.0036
          = 0.1389

Bayes' Theorem
Example: Hospital Trauma Centers
Conclude that, given a malpractice suit has been filed, the probability that the patient was treated at hospital 1 is .1389 or 13.89%. All the posterior probabilities for each hospital can be calculated and then compared:
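The posterior for every hospital can be computed in one pass with the general form of Bayes' Theorem. This sketch takes the priors (.50, .30, .20) and suit probabilities (.001, .005, .008) from the calculation above; the variable names are illustrative.

```python
# Priors P(Bi): share of cases treated at each trauma center.
priors = [0.50, 0.30, 0.20]
# Likelihoods P(A | Bi): probability a case leads to a malpractice suit.
likelihoods = [0.001, 0.005, 0.008]

# Joint probabilities P(A and Bi) = P(A | Bi) * P(Bi).
joints = [lik * prior for lik, prior in zip(likelihoods, priors)]

# P(A) is the sum of the joints, because the Bi are mutually exclusive
# and collectively exhaustive.
p_a = sum(joints)

# Posteriors P(Bi | A) = P(A and Bi) / P(A).
posteriors = [joint / p_a for joint in joints]

print(round(p_a, 4))                      # 0.0036
print([round(p, 4) for p in posteriors])  # [0.1389, 0.4167, 0.4444]
```

Although hospital 1 handles half of all cases, it accounts for the smallest share of malpractice suits because its suit rate is the lowest of the three.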

Bayes' Theorem
Example: Hospital Trauma Centers
Intuitively, imagine there were 10,000 patients and calculate the frequencies:

Hospital   Malpractice Suit Filed   No Malpractice Suit Filed   Total
1          5                        4,995                       5,000
2          15                       2,985                       3,000
3          16                       1,984                       2,000
Total      36                       9,964                       10,000

Total per hospital: 10,000 x .5 = 5,000; 10,000 x .3 = 3,000; 10,000 x .2 = 2,000
Suits filed: 5,000 x .001 = 5; 3,000 x .005 = 15; 2,000 x .008 = 16
No suit filed: 5,000 - 5 = 4,995; 3,000 - 15 = 2,985; 2,000 - 16 = 1,984

Bayes' Theorem
Example: Hospital Trauma Centers
Now, use these frequencies to find the probabilities needed for Bayes' Theorem. For example,

                            Hospital 1            Hospital 2             Hospital 3             Total
Malpractice Suit Filed      P(B1|A)=5/36=.1389    P(B2|A)=15/36=.4167    P(B3|A)=16/36=.4444    P(A)=36/10000=.0036
No Malpractice Suit Filed   P(B1|A')=.5013        P(B2|A')=.2996         P(B3|A')=.1991         P(A')=.9964
Total                       P(B1)=.5              P(B2)=.3               P(B3)=.2               1.0000

Bayes' Theorem
Example: Hospital Trauma Centers
Consider the following visual description of the problem:

Bayes' Theorem
Example: Hospital Trauma Centers
The initial sample space consists of 3 mutually exclusive and collectively exhaustive events (hospitals B1, B2, B3).

Bayes' Theorem
Example: Hospital Trauma Centers
As indicated by their relative areas, B1 is 50% of the sample space, B2 is 30% and B3 is 20%.

Bayes' Theorem
Example: Hospital Trauma Centers
But, given that a malpractice case has been filed (event A), the relevant sample space is reduced to the yellow area of event A. The revised probabilities are the relative areas within event A.
[Figure: event A divided into three regions labeled P(B1 | A), P(B2 | A) and P(B3 | A).]

Counting Rules
Fundamental Rule of Counting
If event A can occur in n1 ways and event B can occur in n2 ways, then events A and B can occur in n1 x n2 ways. In general, m events can occur in n1 x n2 x ... x nm ways.

Counting Rules
Example: Stock-Keeping Labels
How many unique stock-keeping unit (SKU) labels can a hardware store create by using 2 letters (ranging from AA to ZZ) followed by four numbers (0 through 9)? For example,
AF1078: hex-head 6 cm bolts, box of 12
RT4855: Lime-A-Way cleaner, 16 ounce
LL3319: Rust-Oleum primer, gray, 15 ounce

Counting Rules
Example: Stock-Keeping Labels
View the problem as filling six empty boxes:

There are 26 ways to fill either the 1st or 2nd box and 10 ways to fill the 3rd through 6th. Therefore, there are 26 x 26 x 10 x 10 x 10 x 10 = 6,760,000 unique inventory labels.

Counting Rules
Example: Shirt Inventory
L.L. Bean's men's cotton chambray shirt comes in 6 colors (blue, stone, rust, green, plum, indigo), 5 sizes (S, M, L, XL, XXL) and two styles (short and long sleeves). Their stock might include 6 x 5 x 2 = 60 possible shirts. However, the number of each type of shirt to be stocked depends on prior demand.
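Both counting examples apply the same rule: multiply the number of ways each step can occur. A minimal sketch follows; the helper name `count_outcomes` is illustrative, and `math.prod` requires Python 3.8 or later.

```python
from math import prod


def count_outcomes(ways_per_step):
    """Fundamental rule of counting: multiply the ways for each step."""
    return prod(ways_per_step)


# SKU labels: 2 letters (26 choices each) followed by 4 digits (10 each).
sku_labels = count_outcomes([26, 26, 10, 10, 10, 10])

# Shirts: 6 colors x 5 sizes x 2 sleeve styles.
shirts = count_outcomes([6, 5, 2])

print(sku_labels)  # 6760000
print(shirts)      # 60
```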

Counting Rules
Factorials
The number of ways that n items can be arranged in a particular order is n factorial. n factorial is the product of all integers from 1 to n.

n! = n(n-1)(n-2)...1
Factorials are useful for counting the possible arrangements of any n items.

There are n ways to choose the first, n-1 ways to choose the second, and so on.

Counting Rules
Factorials
As illustrated below, there are n ways to choose the first item, n-1 ways to choose the second, n-2 ways to choose the third and so on.

Counting Rules
Factorials
A home appliance service truck must make 3 stops (A, B, C). In how many ways could the three stops be arranged?
3! = 3 x 2 x 1 = 6
List all the possible arrangements:

{ABC, ACB, BAC, BCA, CAB, CBA}

How many ways can you arrange 9 baseball players in batting order rotation?
9! = 9 x 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 = 362,880
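Both factorial results can be checked with Python's standard library. This is a small sketch: `itertools.permutations` lists the arrangements of the three stops explicitly, and `math.factorial` gives the counts.

```python
import math
from itertools import permutations

# Enumerate every ordering of the three service stops.
stops = ["A", "B", "C"]
routes = ["".join(order) for order in permutations(stops)]

print(routes)             # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
print(len(routes))        # 6
print(math.factorial(3))  # 6

# Batting orders for 9 players: too many to list, so just count them.
print(math.factorial(9))  # 362880
```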

Counting Rules
Permutations
A permutation is an arrangement in a particular order of r randomly sampled items from a group of n items and is denoted by nPr.

nPr = n! / (n - r)!
In other words, how many ways can the r items be arranged, treating each arrangement as different (i.e., XYZ is different from ZYX)?

Counting Rules
Example: Appliance Service Calls
n = 5 home appliance customers (A, B, C, D, E) need service calls, but the field technician can service only r = 3 of them before noon. The order is important so each possible arrangement of the three service calls is different. The number of possible permutations is:

nPr = n! / (n - r)! = 5! / (5 - 3)! = (5 x 4 x 3 x 2 x 1) / 2! = 120 / 2 = 60

Counting Rules
Example: Appliance Service Calls
The 60 permutations with r = 3 out of the n = 5 calls can be enumerated. There are 10 distinct groups of 3 customers:
ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE

Each of these can be arranged in 6 distinct ways:


ABC, ACB, BAC, BCA, CAB, CBA

Since there are 10 groups of 3 customers and 6 arrangements per group, there are 10 x 6 = 60 permutations.
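The count of 60 can be confirmed both by formula and by enumeration. This sketch assumes Python 3.8 or later for `math.perm`; the variable names are illustrative.

```python
import math
from itertools import permutations

customers = "ABCDE"
r = 3

# Formula: nPr = n! / (n - r)!
n_perm = math.factorial(len(customers)) // math.factorial(len(customers) - r)

# Enumeration: every ordered selection of 3 of the 5 customers.
schedules = ["".join(p) for p in permutations(customers, r)]

print(n_perm)                          # 60
print(math.perm(len(customers), r))    # 60
print(len(schedules))                  # 60
print(schedules[:6])                   # first few ordered schedules
```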

Counting Rules
Combinations
A combination is an arrangement of r items chosen at random from n items where the order of the selected items is not important (i.e., XYZ is the same as ZYX). A combination is denoted nCr.

nCr = n! / [r!(n - r)!]

Counting Rules
Example: Appliance Service Calls Revisited
n = 5 home appliance customers (A, B, C, D, E) need service calls, but the field technician can service only r = 3 of them before noon. This time order is not important. Thus, ABC, ACB, BAC, BCA, CAB, CBA would all be considered the same event because they contain the same 3 customers. The number of possible combinations is:
nCr = n! / [r!(n - r)!] = 5! / [3!(5 - 3)!] = (5 x 4 x 3 x 2 x 1) / [(3 x 2 x 1)(2 x 1)] = 120 / 12 = 10

Counting Rules
Example: Appliance Service Calls Revisited
10 combinations is much smaller than the 60 permutations in the previous example. The combinations are easily enumerated: ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE
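The enumeration and the link between combinations and permutations can be sketched with the standard library. `math.comb` and `math.perm` require Python 3.8 or later; the variable names are illustrative.

```python
import math
from itertools import combinations

customers = "ABCDE"
r = 3

# Enumerate the unordered groups of 3 customers.
groups = ["".join(c) for c in combinations(customers, r)]

print(groups)
# ['ABC', 'ABD', 'ABE', 'ACD', 'ACE', 'ADE', 'BCD', 'BCE', 'BDE', 'CDE']
print(math.comb(5, 3))  # 10

# Each group of 3 can be ordered in 3! = 6 ways, so
# permutations = combinations x r!
print(math.comb(5, 3) * math.factorial(r))  # 60
print(math.perm(5, 3))                      # 60
```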

Applied Statistics in Business and Economics


End of Chapter 5
