
Chapter 3: Fundamentals of Data Analysis Statistics

INTRODUCTION TO PROBABILITY
Outline

• Probability

• Mutually Exclusive Events

• Adding Probabilities
Probability
• We have all seen examples of probability
– What is the probability that a coin lands on heads when
flipped?
– What is the probability that I draw a King from a shuffled
deck of cards?
– What is the probability that the stock market goes up by
more than 5% over the next 12 months?

• We want to mathematically formalize probability so we can
use it to make business decisions under uncertainty
Probability
• Probability is a mathematical concept that describes the likelihood of
an event occurring
– It is a number between 0 and 1!

• An experiment is a repeatable procedure with a known set of
outcomes

• A sample space is the set of all possible outcomes of an experiment

• An event is an outcome, or a set of outcomes, of an experiment to
which a probability can be assigned
Probability

• Those definitions are all a bit circular


• Let’s think about the example of drawing a single card
from a shuffled deck
• The sample space is the set of all cards: {AD, AH, AC, AS,
2D, 2H, 2C, 2S, …, KD, KH, KC, KS}
• The experiment is the process of drawing a card
• An event is drawing a king or diamond
• The probability of this event occurring is 16/52
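
The 16/52 figure can be checked by listing the sample space and counting. The following is a minimal sketch in Python; the names `RANKS`, `SUITS`, `deck` and `is_king_or_diamond` are chosen here for illustration and are not part of the slides.

```python
from fractions import Fraction
from itertools import product

# Build the 52-card sample space as (rank, suit) pairs.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["D", "H", "C", "S"]
deck = list(product(RANKS, SUITS))


def is_king_or_diamond(card):
    """Return True when the card belongs to the event 'king or diamond'."""
    rank, suit = card
    return rank == "K" or suit == "D"


# The event is the subset of the sample space that satisfies the condition.
favourable = [card for card in deck if is_king_or_diamond(card)]
probability = Fraction(len(favourable), len(deck))

print(len(deck), len(favourable), probability)  # 52 16 4/13
```

Note that `Fraction` reduces 16/52 to its lowest terms, 4/13.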
Probability
• Consider a machine that is manufactured from 3 parts
• Each of those three parts might be defective
• If 2 or more of those parts are defective then the whole
machine is defective
• Sample space: {GGG, GGD, GDG, GDD, DGG, DGD, DDG, DDD}
• Experiment: make a machine and see if it’s defective
• Event: is it defective?
• Probability: this is a bit harder to answer – we need to learn
some more!
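
The sample space and the event for the machine example can be listed mechanically. This sketch only enumerates outcomes; it does not compute a probability, because the slides do not give the defect rate of each part. The variable names are illustrative.

```python
from itertools import product

# Each of the 3 parts is either good ("G") or defective ("D").
sample_space = ["".join(parts) for parts in product("GD", repeat=3)]

# The machine is defective when 2 or more of its parts are defective.
defective_machine = [outcome for outcome in sample_space if outcome.count("D") >= 2]

print(sample_space)
print(defective_machine)
```

Four of the eight outcomes belong to the event, but they are equally likely only if each part is defective with probability ½, which is an assumption the slides do not make.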
Probability
• Consider an event, A (King or diamond)
• Let’s repeatedly perform an experiment, $n$ times
• Let $a_i = 1$ if $A$ occurs in experiment $i$, and $a_i = 0$ otherwise
• We then define
– $P(A) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} a_i$

• The probability of an event is the relative frequency that event
occurs if the experiment is repeated infinitely many times
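
The limit cannot be computed directly, but a simulation with a finite number of draws shows the relative frequency settling near 16/52 ≈ 0.3077. This is a rough sketch: the function name `relative_frequency`, the seed, and the values of n are arbitrary choices made for illustration.

```python
import random

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["D", "H", "C", "S"]
deck = [(rank, suit) for rank in RANKS for suit in SUITS]


def relative_frequency(n, seed=0):
    """Estimate P(king or diamond) from n simulated single-card draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n):
        rank, suit = rng.choice(deck)  # one experiment: draw a card
        hits += 1 if (rank == "K" or suit == "D") else 0  # this is a_i
    return hits / n


for n in (100, 10_000, 200_000):
    print(n, relative_frequency(n))
```

The estimates fluctuate for small n and tighten as n grows, which is the behaviour the limit definition describes.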
Probability
• Of course this is a bit ridiculous…nothing has ever
happened infinitely many times
• How did we answer the question about the cards?
– Enumerate all possible outcomes
– All cards are equally likely: 1/52
– Count the cards that are king or diamond: 16

• This isn’t always possible, but we still want to describe
the probability of events
Probability
• There are a few ways we can get “probability”
• A priori probability: The mathematical definition from before
• Empirical probability: Use data to measure relative frequency,
with a finite 𝑛
• Expert probability: Ask an expert in the field what they believe
the probability is

• With this in mind, let’s talk about some properties of events
and probability
Events
• Consider two events: A, B
• These two events are said to be mutually exclusive if the
probability of them both occurring is 0

• Let A be drawing a king and B be drawing a queen
• If you draw 1 card it cannot be both a king and a queen!

• Suppose A is drawing a king and B is drawing a diamond
– Are these events mutually exclusive?
Events
• The way we write two events both occurring is with the
intersection notation: ∩
• Two events are mutually exclusive if
– P(A ∩ B) = 0

• The way we write either of two events occurring is with
the union notation: ∪
– What is P(A ∪ B)?
Events

• If A is king and B is diamond
– P(A) = 4/52
– P(B) = 13/52
– P(A) + P(B) = 17/52 ?
– But we already know the true probability is 16/52

• The king of diamonds is in both A and B
• We counted it twice!
Events
• The probability of a union is
– P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

• In the card example, A ∩ B is when a card is both a king and a diamond
– There is exactly 1 of those, with probability 1/52
– P(A ∪ B) = 4/52 + 13/52 − 1/52 = 16/52
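
The union rule can be verified with Python sets, where `|` is union and `&` is intersection. This is a small sketch; the helper `prob` and the set names are illustrative and assume every card is equally likely.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["D", "H", "C", "S"]
deck = set(product(RANKS, SUITS))

A = {card for card in deck if card[0] == "K"}  # event: king
B = {card for card in deck if card[1] == "D"}  # event: diamond


def prob(event):
    """Probability of an event when all cards are equally likely."""
    return Fraction(len(event), len(deck))


lhs = prob(A | B)                        # P(A ∪ B) by direct counting
rhs = prob(A) + prob(B) - prob(A & B)    # P(A) + P(B) − P(A ∩ B)

print(lhs, rhs)  # 4/13 4/13
```

Dropping the `- prob(A & B)` term gives 17/52, which is the double-counting error described above.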
Events
• If two events are mutually exclusive: P(A ∩ B) = 0
– Then P(A ∪ B) = P(A) + P(B)
• What is the probability that a card is a king or queen?
• We already know these are mutually exclusive!
– 4/52 + 4/52 = 8/52

• What is P(A ∩ B) in general?
• We need to learn some more before we can answer this


Chapter 3: Fundamentals of Data Analysis Statistics

JOINT PROBABILITY
Outline

• Independent and dependent events

• Conditional Probability

• Joint probability

• Marginal Probability
Independent Events
• Two events are said to be independent if knowing that one of the events
occurred does not impact the probability of the other event occurring

• What is the probability of drawing a diamond?
– ¼

• I drew a card, it was a king, what is the probability it was a diamond?
– ¼

• King and diamond are independent events


Independent Events

• I drew a card, it was a king, what is the probability
that it was a queen?
– 0
– King and queen are not independent!

• Two events that are not independent are said to be
dependent events
Independent Events
• Suppose two events are independent
• What is P(A ∩ B)?
– P(A ∩ B) = P(A) × P(B)
– Why?
• The percentage of the sample space taken up by an event is its probability
• A takes up a portion P(A) of the sample space
• Of that portion, B takes up a portion P(B)
– Because knowing about A doesn’t change P(B)
– A P(B) portion of P(A) is P(A) × P(B)
– This is P(A ∩ B)
Independent Events
• Can two events, A and B, be independent and mutually exclusive?

• Mutually exclusive: P(A ∩ B) = 0
• Independent: P(A ∩ B) = P(A) × P(B)

• If A and B are independent and mutually exclusive then P(A) × P(B) = 0
• This can only happen if P(A) = 0 or P(B) = 0
• So two events that each have nonzero probability cannot be both
independent and mutually exclusive!
Dependent Events
• What is the probability that I draw a card that’s red and a diamond? ¼

• What is the probability that I draw a red card? ½

• What is the probability that I draw a diamond? ¼

• If red and diamond were independent then P(red ∩ diamond) = P(red) *
P(diamond) = 1/8

• This is clearly incorrect: the true value is ¼, so red and diamond are dependent
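
The independence test P(A ∩ B) = P(A) × P(B) can be applied to each pair of events from the card examples. The sketch below assumes equally likely cards; `prob` and `independent` are illustrative helper names.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["D", "H", "C", "S"]
deck = set(product(RANKS, SUITS))


def prob(event):
    """Probability of an event when all cards are equally likely."""
    return Fraction(len(event), len(deck))


def independent(a, b):
    """Two events are independent when P(A ∩ B) equals P(A) × P(B)."""
    return prob(a & b) == prob(a) * prob(b)


king = {c for c in deck if c[0] == "K"}
queen = {c for c in deck if c[0] == "Q"}
diamond = {c for c in deck if c[1] == "D"}
red = {c for c in deck if c[1] in ("D", "H")}

print(independent(king, diamond))  # True
print(independent(king, queen))    # False
print(independent(red, diamond))   # False
```

Exact fractions are used so the equality test is not affected by floating-point rounding.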


Conditional Probability

• We thus define the notion of conditional probability
– The probability of an event given that we know
another event has occurred
– P(A | B) is the probability that A occurs given that we
know B occurred

• P(diamond | red) = ½

• P(red | diamond) = 1
Conditional Probability

• With conditional probability we can get a new product rule
– Want to calculate P(A ∩ B)
• What portion of the sample space does B take up?
– P(B)
• What portion of that portion does A take up?
– P(A | B)

• P(A ∩ B) = P(A | B) P(B)
• P(A ∩ B) = P(B | A) P(A)
Conditional Probability

• P(red ∩ diamond)
– P(diamond | red) P(red) = ½ * ½ = ¼
– P(red | diamond) P(diamond) = 1 * ¼ = ¼

• Suppose I draw 2 cards (without replacement), what is
the probability they’re both hearts?
– P(card1 is heart) P(card2 is heart | card1 is heart)
– ¼ * 12/51 = 12/204 = 1/17
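
The two-hearts result can be checked two ways: with the product rule, and by brute-force counting of every ordered pair of distinct cards. This is a sketch for illustration; the variable names are not from the slides.

```python
from fractions import Fraction
from itertools import permutations, product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["D", "H", "C", "S"]
deck = list(product(RANKS, SUITS))

# Product rule: P(card1 is heart) * P(card2 is heart | card1 is heart).
p_rule = Fraction(13, 52) * Fraction(12, 51)

# Brute force: every ordered draw of 2 distinct cards is equally likely.
pairs = list(permutations(deck, 2))
both_hearts = [pair for pair in pairs if pair[0][1] == "H" and pair[1][1] == "H"]
p_count = Fraction(len(both_hearts), len(pairs))

print(p_rule, p_count)  # 1/17 1/17
```

Using `permutations` rather than `product(deck, deck)` is what models drawing without replacement, since it never pairs a card with itself.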
Conditional Probability
• Suppose 70% of cars on a used car lot have air conditioning,
40% of cars have CD players and 20% have both
• What is the probability that a car with an AC will also have a
CD player?
– P(CD | AC) ?

• We can solve this using the product rule!
– P(AC ∩ CD) = P(AC) * P(CD | AC)
– 20% = 70% * P(CD | AC)
– P(CD | AC) = 20%/70% = 2/7
Contingency Table
• Another way we can solve this problem is with a contingency table
• A contingency table lists joint probabilities in a table
– Probabilities of intersections
            A               Not A               Total
B           P(A ∩ B)        P(not A ∩ B)        P(B)
Not B       P(A ∩ not B)    P(not A ∩ not B)    P(not B)
Total       P(A)            P(not A)            100%

            CD      NO CD   Total
AC          20%             70%
NO AC
Total       40%             100%

– P(CD | AC) = 20%/70% = 2/7
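
The blank cells of the car-lot table follow from the three given numbers, because each row and column must add up to its total. The sketch below fills them in; working in whole percentages and the dictionary layout are choices made here for illustration.

```python
from fractions import Fraction

# Given values, in percent, so the arithmetic stays in whole numbers.
p_ac, p_cd, p_both = 70, 40, 20

# Each joint cell is a row or column total minus the cell already known.
table = {
    ("AC", "CD"): p_both,
    ("AC", "no CD"): p_ac - p_both,
    ("no AC", "CD"): p_cd - p_both,
    ("no AC", "no CD"): 100 - p_ac - p_cd + p_both,
}

for cell, value in table.items():
    print(cell, f"{value}%")

# Conditional probability: joint cell divided by the row total.
p_cd_given_ac = Fraction(table[("AC", "CD")], p_ac)
print(p_cd_given_ac)  # 2/7
```

The four joint cells sum to 100%, which is a quick sanity check on any contingency table.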


Example 2
• Suppose a study of speeding violations and drivers who use cell phones while
driving produced the following fictional results:
• 40% of drivers use a phone while driving.
• 9% had a speeding violation in the last year.
• 3% use a phone while driving and had a speeding violation in the last year.
• What is the probability that a driver uses a phone while driving, given that he/she had a
speeding violation in the last year?
Solution (1)

• P(A) = P(speeding violation in the last year) = 9%
• P(B) = P(phone use while driving) = 40%
• P(A ∩ B) = 3%
• P(B | A) = P(A ∩ B) / P(A) = 3%/9% = 1/3 ≈ 33%

                              Speeding violation    No speeding violation    Total
                              in the last year      in the last year
Phone use while driving       3%                                             40%
No phone use while driving
Total                         9%                                             100%
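
The calculation for this example is a single division, and writing it as a function makes it easy to see that conditioning the other way round gives a different number. The helper `conditional` is a hypothetical name used for this sketch; the reverse conditional is computed from the same three given values and is not stated in the slides.

```python
from fractions import Fraction


def conditional(p_joint, p_given):
    """Return P(B | A) = P(A ∩ B) / P(A); P(A) must be nonzero."""
    if p_given == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return p_joint / p_given


p_speeding = Fraction(9, 100)   # P(A)
p_phone = Fraction(40, 100)     # P(B)
p_both = Fraction(3, 100)       # P(A ∩ B)

p_phone_given_speeding = conditional(p_both, p_speeding)
p_speeding_given_phone = conditional(p_both, p_phone)

print(p_phone_given_speeding)  # 1/3
print(p_speeding_given_phone)  # 3/40
```

P(phone | speeding) is about 33%, while P(speeding | phone) is 7.5%, so the order of the conditioning matters.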
Contingency Table
• Contingency tables need not be 2 x 2
• In general they can have any number of rows or columns
• The events on the rows need to be mutually exclusive and
together make up the entire sample space
• The events on the columns need to be mutually exclusive and
together make up the entire sample space

• Think back to the treadmill example
– Rows could be product (3)
– Columns could be gender (2)
Marginal Probability

• In the contingency table we have that summing over rows gives the
probability of being in a column
• We call this the marginal probability

• Suppose there are events $B_1, \dots, B_n$ such that $P(B_i \cap B_j) = 0$ for $i \neq j$
and $P(B_1 \cup B_2 \cup \dots \cup B_n) = 1$
• Then $P(A) = P(A \cap B_1) + P(A \cap B_2) + \dots + P(A \cap B_n)$
• This is called the law of total probability
• We can use it to calculate the probability of A
• When we calculate it this way we call it the marginal probability of A
Marginal Probability

• We can further expand the law of total probability
using the product rule
• $P(A) = P(A \mid B_1) P(B_1) + \dots + P(A \mid B_n) P(B_n)$

• This will be a very useful formula as we go on
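
As an illustration, the expanded law of total probability can recover P(CD) in the used-car example by splitting the lot into cars with AC and cars without. This is a sketch under stated assumptions: `total_probability` is a hypothetical helper, and P(CD | no AC) = 20%/30% is derived here from the given figures (40% − 20% of cars have a CD player but no AC, and 30% have no AC).

```python
from fractions import Fraction


def total_probability(conditionals, priors):
    """Return P(A) = sum of P(A | B_i) * P(B_i) over a partition B_1..B_n."""
    if sum(priors) != 1:
        raise ValueError("the B_i must cover the whole sample space")
    return sum(c * p for c, p in zip(conditionals, priors))


# Partition of the car lot: B_1 = AC, B_2 = no AC.
p_ac = Fraction(7, 10)
p_no_ac = 1 - p_ac

p_cd_given_ac = Fraction(2, 7)         # 20% / 70%
p_cd_given_no_ac = Fraction(20, 30)    # derived: (40% - 20%) / 30%

p_cd = total_probability([p_cd_given_ac, p_cd_given_no_ac], [p_ac, p_no_ac])
print(p_cd)  # 2/5
```

The result, 2/5, matches the 40% of cars with CD players stated in the example.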


Chapter 3: Fundamentals of Data Analysis Statistics

BAYES’ RULE
Outline

• Bayes’ Rule

• Examples
Bayes’ Rule
• Remember we developed a new product rule for dependent events?
– P(A ∩ B) = P(A | B) P(B)
– P(A ∩ B) = P(B | A) P(A)
• These two equations have the same left hand side
• That means the right hand sides must be equal
– P(B | A) P(A) = P(A | B) P(B)

• Bayes’ rule rewrites this as
– $P(B \mid A) = \frac{P(A \mid B) P(B)}{P(A)}$
Bayes’ Rule
• The application area of Bayes’ rule is extensive
• Suppose 5% of people default on their home loans
• 10% of defaulters had high credit before their default
• 25% of the population has high credit
• Suppose a customer has high credit, what is the probability they will default on their
home loan?

• $P(\text{default} \mid \text{high}) = \frac{P(\text{default}) P(\text{high} \mid \text{default})}{P(\text{high})} = \frac{5\% \times 10\%}{25\%} = 2\%$
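
The home-loan calculation is a direct application of Bayes’ rule and can be written as a three-argument function. The function name `bayes` and its argument names are chosen for this sketch.

```python
from fractions import Fraction


def bayes(p_a_given_b, p_b, p_a):
    """Return P(B | A) = P(A | B) * P(B) / P(A)."""
    return p_a_given_b * p_b / p_a


p_default = Fraction(5, 100)              # P(default)
p_high_given_default = Fraction(10, 100)  # P(high credit | default)
p_high = Fraction(25, 100)                # P(high credit)

p_default_given_high = bayes(p_high_given_default, p_default, p_high)
print(p_default_given_high, float(p_default_given_high))  # 1/50 0.02
```

A customer with high credit therefore defaults with probability 2%, below the 5% rate for the population as a whole.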
Bayes’ Rule
• Remember I told you the law of total probability would
be useful

• We can expand Bayes’ Rule further with the law of total
probability
– P(A) = P(A | B) P(B) + P(A | not B) P(not B)

• $P(B \mid A) = \frac{P(A \mid B) P(B)}{P(A \mid B) P(B) + P(A \mid \text{not } B) P(\text{not } B)}$
Bayes’ Rule
• Suppose a disease affects 1% of the population. There is a test for
this disease, but it’s not perfect; 5% of people with the disease will
test negative and 7% of people without the disease will test positive
• If you test positive what is the probability you have the disease?
• It seems like the answer is 93% - it’s not!

• We need to use Bayes’ rule to solve it
– P(disease | + test)
Bayes’ Rule
• Let B = disease, A = + test
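• $P(\text{disease} \mid +\text{test}) = \frac{P(+\text{test} \mid \text{disease}) P(\text{disease})}{P(+\text{test})}$
– $P(+\text{test}) = P(+\text{test} \mid \text{disease}) P(\text{disease}) + P(+\text{test} \mid \text{healthy}) P(\text{healthy})$
– P(disease) = 1%
– P(+test | disease) = 95%
– P(+test | healthy) = 7%

• $P(\text{disease} \mid +\text{test}) = \frac{95\% \times 1\%}{95\% \times 1\% + 7\% \times 99\%} \approx 12\%$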
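
The disease-test calculation combines Bayes’ rule with the law of total probability in the denominator. A minimal sketch follows; the function name `posterior` and the parameter names `sensitivity` and `false_positive_rate` are labels introduced here for P(+test | disease) and P(+test | healthy).

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(disease | +test) using Bayes' rule.

    prior:               P(disease)
    sensitivity:         P(+test | disease)
    false_positive_rate: P(+test | healthy)
    """
    # Law of total probability: P(+test) over {disease, healthy}.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.07)
print(round(p, 4))  # 0.1206
```

The answer is low because healthy people outnumber sick people 99 to 1, so the 7% of healthy people who test positive far outnumber the true positives.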
