
Probability 1

Lecture 03 : Probability Axioms

Dr. Daniel Flores-Agreda,


(based on the notes of Prof. Davide La Vecchia)

Spring Semester 2021

Flores-Agreda, La Vecchia S110015 Spring Semester 2021 1 / 49


Objective

• Provide the Axioms upon which Probability Theory is built.
• Explore some rules that result from the Axioms.
• Illustrate how these rules can be used to compute probabilities “in
real life”.
• Introduce the concepts of Independence and Conditional Probability.
• Present two important Theorems and illustrate their consequences.



Outline

1 An Axiomatic Definition of Probability

2 Implications: Properties of P (·)

3 Examples and Illustrations

4 Conditional Probability

5 Independence

6 Theorem I: The Theorem of Total Probability

7 Theorem II: Bayes’ Theorem



An Axiomatic Definition of Probability

Definition
We define probability as a set function P(·) with values in [0, 1], which
satisfies the following axioms:
(i) P(A) ≥ 0, for every event A
(ii) P(S) = 1
(iii) If A_1, A_2, ... is a sequence of mutually exclusive events (namely
A_i ∩ A_j = ∅, for i ≠ j, and i, j = 1, 2, ...), and such that
A = ⋃_{i=1}^{∞} A_i, then

P(A) = P(⋃_{i=1}^{∞} A_i) = ∑_{i=1}^{∞} P(A_i).    (1)
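
As an illustration, the three axioms can be checked numerically on a finite sample space. The sketch below is not part of the original notes: it assumes a fair six-sided die, and the names `S`, `prob` and `P` are hypothetical. On a finite space, Axiom (iii) reduces to additivity over finitely many mutually exclusive events.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die (not taken from the lecture notes).
S = frozenset(range(1, 7))
prob = {s: Fraction(1, 6) for s in S}  # probability of each single outcome

def P(event):
    """Probability of an event, i.e. of a subset of S."""
    return sum((prob[s] for s in event), Fraction(0))

A1, A2 = {1, 2}, {5}  # two mutually exclusive events

axiom_i = all(P({s}) >= 0 for s in S)      # non-negativity
axiom_ii = P(S) == 1                       # P(S) = 1
axiom_iii = P(A1 | A2) == P(A1) + P(A2)    # additivity for disjoint events
print(axiom_i, axiom_ii, axiom_iii)        # → True True True
```

Exact fractions are used instead of floats so that the equalities hold without rounding error.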



Implications: Properties of P (·)

We can use the three axioms to build more sophisticated statements, e.g.

• The Probability of the Empty Set


• The Addition Law of Probability
• The Complement Rule
• The Monotonicity Rule
• The Probability of the Union
• Boole’s Inequality



Implications: Properties of P (·)

Theorem (Probability of the Empty Set)


P (∅) = 0

Proof.
Take A_1 = A_2 = A_3 = ... = ∅. Then by (1) in Axiom (iii) we have

P(∅) = P(⋃_{i=1}^{∞} A_i) = ∑_{i=1}^{∞} P(A_i) = ∑_{i=1}^{∞} P(∅),

which is true only if it is an infinite sum of zeros. Thus

P(∅) = 0.



Implications: Properties of P (·)

Theorem (The Addition Law of Probability)


If A_1, A_2, ..., A_n are mutually exclusive events, then

P(⋃_{i=1}^{n} A_i) = ∑_{i=1}^{n} P(A_i).    (2)

Proof.
Let A_{n+1} = A_{n+2} = ... = ∅. Then ⋃_{i=1}^{n} A_i = ⋃_{i=1}^{∞} A_i and, from (1) (see Axiom (iii)), it
follows that

P(⋃_{i=1}^{n} A_i) = P(⋃_{i=1}^{∞} A_i) = ∑_{i=1}^{∞} P(A_i) = ∑_{i=1}^{n} P(A_i) + ∑_{i=n+1}^{∞} P(A_i),

where the last sum is identically 0, since P(∅) = 0.



Implications: Properties of P (·)

Theorem (The Complement Rule)


If A is an event, then P(A^c) = 1 − P(A).

Proof.
Since A ∪ A^c = S and A ∩ A^c = ∅, the Addition Law gives

P(S) = P(A ∪ A^c) = P(A) + P(A^c).

By Axiom (ii) we have P(S) = 1, so the desired result follows from:

1 = P(A) + P(A^c).



Implications: Properties of P (·)
Theorem (The Monotonicity Rule)
For any two events A and B, such that B ⊂ A, we have

P (A) ≥ P (B ).

Proof.
Let us write
A = B ∪ (B^c ∩ A)
and notice that B ∩ (B^c ∩ A) = ∅, so that

P(A) = P{B ∪ (B^c ∩ A)}
     = P(B) + P(B^c ∩ A)

which implies (since P(B^c ∩ A) ≥ 0) that

P(A) ≥ P(B).



Implications: Properties of P (·)

[Figure: Venn diagram of B ⊂ A]



Implications: Properties of P (·)

Theorem (Probability of the Union of Two Events)


For any two events A and B then

P (A ∪ B ) = P (A) + P (B ) − P (A ∩ B ).

Proof.
Consider that A ∪ B = A ∪ (A^c ∩ B), and A ∩ (A^c ∩ B) = ∅. Now remember that
A^c ∩ B = B − (A ∩ B) (see Lecture 1 for the meaning of set difference). Since B is the
disjoint union of A ∩ B and A^c ∩ B, we have P(A^c ∩ B) = P(B) − P(A ∩ B), so

P(A ∪ B) = P(A) + P(A^c ∩ B)
         = P(A) + P(B) − P(A ∩ B).



Implications: Properties of P (·)

Theorem (Boole’s inequality)


For the events A_1, A_2, ..., A_n,

P(A_1 ∪ A_2 ∪ ... ∪ A_n) ≤ ∑_{i=1}^{n} P(A_i).

For instance, let us consider n = 2. Then we have:

P(A_1 ∪ A_2) = P(A_1) + P(A_2) − P(A_1 ∩ A_2) ≤ P(A_1) + P(A_2)

since P(A_1 ∩ A_2) ≥ 0 by Axiom (i).

Remark
It is worth noticing that if A_i ∩ A_j = ∅, for every i and j, with i ≠ j, then
P(A_1 ∪ A_2 ∪ ... ∪ A_n) = ∑_{i=1}^{n} P(A_i), as stated in (2).
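
The rules derived from the axioms can be verified by brute-force enumeration. The following is a minimal sketch under stated assumptions: the sample space (two fair dice) and the events `A`, `B`, `C` are chosen here for illustration only and do not come from the notes.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two fair dice, 36 equally likely outcomes.
S = set(product(range(1, 7), repeat=2))

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))

A = {s for s in S if s[0] + s[1] >= 9}   # the sum is at least 9
B = {s for s in S if s[0] == 6}          # the first die shows 6
C = {s for s in S if s == (6, 6)}        # double six, so C ⊂ A

complement_ok = P(S - A) == 1 - P(A)                  # Complement Rule
monotone_ok = C <= A and P(C) <= P(A)                 # Monotonicity Rule
union_ok = P(A | B) == P(A) + P(B) - P(A & B)         # Probability of the Union
boole_ok = P(A | B | C) <= P(A) + P(B) + P(C)         # Boole's inequality
print(complement_ok, monotone_ok, union_ok, boole_ok)  # → True True True True
```

A check like this does not replace the proofs, but it is a quick way to catch a mis-stated rule.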



Examples and Illustrations

Example (Flipping a coin twice)


If we flip a balanced coin twice, what is the probability of getting at least one
head?
• The sample space is: S = {HH, HT , TH, TT }
• Balanced coin: outcomes are equally likely and
pHH = pHT = pTH = pTT = p = 1/4
• Let A denote the event of obtaining at least one Head, i.e. A = {HH, HT, TH}

Pr(A) = Pr({HH} ∪ {HT} ∪ {TH})
      = Pr({HH}) + Pr({HT}) + Pr({TH})
      = 1/4 + 1/4 + 1/4 = 3/4
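
The same count can be reproduced by listing the sample space explicitly. This is only an illustrative sketch; the variable names `S`, `A` and `p_A` are chosen here.

```python
from fractions import Fraction
from itertools import product

# Sample space for two flips of a balanced coin: HH, HT, TH, TT.
S = ["".join(flips) for flips in product("HT", repeat=2)]

# Event A: at least one head.
A = [s for s in S if "H" in s]

# Equally likely outcomes: P(A) = (favourable outcomes) / (all outcomes).
p_A = Fraction(len(A), len(S))
print(p_A)  # → 3/4
```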



Examples and Illustrations

Example (Detecting Shoppers)


Shopper TRK is an electronic device that counts the number of shoppers entering a
shopping centre.
Two shoppers enter the shopping centre together, one walking in front of the other:
1. There is a 0.98 probability that the first shopper is detected
2. There is a 0.94 probability that the second shopper is detected.
3. There is a 0.93 probability that both shoppers are detected.
What is the probability that the device will detect at least one of the two shoppers
entering?



Examples and Illustrations

Example (Detecting Shoppers, continued)


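For each shopper, define the events:
• D (shopper is detected) and
• U (shopper is undetected).
Writing the result for the first shopper first, the Sample Space is S = {DD, DU, UD, UU}
The given probabilities are:
• Pr(DD ∪ DU) = 0.98 (the first shopper is detected)
• Pr(DD ∪ UD) = 0.94 (the second shopper is detected)
• Pr(DD) = 0.93 (both shoppers are detected)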



Examples and Illustrations

Example (Detecting Shoppers, continued)

Pr(DD ∪ UD ∪ DU) = Pr({DD ∪ UD} ∪ {DD ∪ DU})
                 = Pr({DD ∪ UD}) + Pr({DD ∪ DU})
                   − Pr({DD ∪ UD} ∩ {DD ∪ DU})

The only missing probability is: Pr({DD ∪ UD} ∩ {DD ∪ DU}) = ?


To compute it, let’s use the fact that the union is distributive with respect to the
intersection (see Ch 2):

(DD ∪ UD ) ∩ (DD ∪ DU ) = DD ∪ (UD ∩ DU ) = DD ∪ ∅ = DD



Examples and Illustrations

Example (Detecting Shoppers, continued)


[Figure: graphical representation of the events]

So, the desired probability is:

Pr(DD ∪ UD ∪ DU) = Pr({DD ∪ UD}) + Pr({DD ∪ DU}) − Pr(DD)
                 = 0.94 + 0.98 − 0.93
                 = 0.99
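
The arithmetic of this example can be sketched in a few lines. The variable names are chosen here for illustration, and the last step (the probability that neither shopper is detected) is an extra application of the Complement Rule, not a figure quoted in the example.

```python
# Given probabilities from the example.
p_first = 0.98    # Pr(DD ∪ DU): the first shopper is detected
p_second = 0.94   # Pr(DD ∪ UD): the second shopper is detected
p_both = 0.93     # Pr(DD): both shoppers are detected

# Probability of the union of two events.
p_at_least_one = p_first + p_second - p_both

# Complement Rule: neither shopper is detected, i.e. Pr(UU).
p_none = 1 - p_at_least_one

print(round(p_at_least_one, 2), round(p_none, 2))  # → 0.99 0.01
```

The results are rounded before printing because floating-point sums of decimals are only approximately exact.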



Examples and Illustrations

Example (De Morgan’s law)


Given P(A ∪ B) = 0.7 and P(A ∪ B^c) = 0.9, find P(A).
By De Morgan’s law,

P(A^c ∩ B^c) = P((A ∪ B)^c) = 1 − P(A ∪ B) = 1 − 0.7 = 0.3

and similarly
P(A^c ∩ B) = P((A ∪ B^c)^c) = 1 − P(A ∪ B^c) = 1 − 0.9 = 0.1.
Thus,
P(A^c) = P(A^c ∩ B^c) + P(A^c ∩ B) = 0.3 + 0.1 = 0.4,
so
P(A) = 1 − 0.4 = 0.6.
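
The steps of this example translate directly into code. The sketch below uses exact fractions and hypothetical variable names; it only restates the computation above.

```python
from fractions import Fraction

p_A_or_B = Fraction(7, 10)    # P(A ∪ B)
p_A_or_Bc = Fraction(9, 10)   # P(A ∪ B^c)

# De Morgan: (A ∪ B)^c = A^c ∩ B^c and (A ∪ B^c)^c = A^c ∩ B.
p_Ac_and_Bc = 1 - p_A_or_B
p_Ac_and_B = 1 - p_A_or_Bc

# A^c is the disjoint union of A^c ∩ B^c and A^c ∩ B.
p_Ac = p_Ac_and_Bc + p_Ac_and_B

# Complement Rule.
p_A = 1 - p_Ac
print(p_A)  # → 3/5
```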



Examples and Illustrations

Example (Probability, union, and complement)


John is taking two books along on his holiday vacation. With probability 0.5, he
will like the first book; with probability 0.4, he will like the second book; and with
probability 0.3, he will like both books. What is the probability that he likes
neither book?
Let A_i be the event that John likes book i, for i = 1, 2. Then the probability that he
likes at least one book is:

P(⋃_{i=1}^{2} A_i) = P(A_1 ∪ A_2) = P(A_1) + P(A_2) − P(A_1 ∩ A_2)
                   = 0.5 + 0.4 − 0.3 = 0.6.

Because the event that John likes neither book is the complement of the event that he
likes at least one of them (namely A_1 ∪ A_2), we have

P(A_1^c ∩ A_2^c) = P((A_1 ∪ A_2)^c) = 1 − P(A_1 ∪ A_2) = 0.4.
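
One way to double-check this answer is to split the sample space into four mutually exclusive cases and confirm that their probabilities add up to 1. The sketch below assumes nothing beyond the three given probabilities; the dictionary `cells` and its labels are hypothetical names.

```python
from fractions import Fraction

p1 = Fraction(5, 10)      # P(A_1): John likes the first book
p2 = Fraction(4, 10)      # P(A_2): John likes the second book
p_both = Fraction(3, 10)  # P(A_1 ∩ A_2): John likes both books

# The four mutually exclusive cases partition the sample space.
cells = {
    "both": p_both,
    "only first": p1 - p_both,
    "only second": p2 - p_both,
    "neither": 1 - (p1 + p2 - p_both),
}
print(cells["neither"], sum(cells.values()))  # → 2/5 1
```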



Conditional Probability

Sometimes, probability depends on the information available.


Example
Suppose you have two dice and throw them; the possible outcomes are:

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)


Conditional Probability

Example (cont’d)
Consider the event A = getting 5 as the sum of the two dice, or equivalently A = {5}.
What is P(A), namely, the probability of getting 5?

Conditional Probability

Example (cont’d)

The dice are fair, so there are 36 outcomes, each with equal probability 1/36.
Namely:

Pr(i, j) = 1/36, for i, j = 1, ..., 6

Thus, adding the probabilities of the outcomes whose sum is 5:

P(5) = Pr{(1, 4) ∪ (2, 3) ∪ (3, 2) ∪ (4, 1)}
     = Pr{(1, 4)} + Pr{(2, 3)} + Pr{(3, 2)} + Pr{(4, 1)}
     = 1/36 + 1/36 + 1/36 + 1/36
     = 4/36
     = 1/9.


Conditional Probability

Example (cont’d)
Now, suppose that we throw the first die and we get 2.
What is the probability of getting 5 given that we have observed 2
in the first throw?

Pr{getting 5 given 2 in the first throw} = Pr{getting 3 in the second throw} = 1/6


Conditional Probability

Remark
• Information changes the probability.
• By knowing that we got 2 in the first throw, we have changed the
sample space:

(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)

• Probability can change drastically: e.g., suppose that in our
example we have 6 in the first throw ⇒ the probability of observing 5
in two draws is zero!


Conditional Probability

Definition
Let A and B be two events. The conditional probability of event A given
event B, denoted by P (A|B ), is defined by:

P(A|B) = P(A ∩ B) / P(B),    if P(B) > 0,

and it is left undefined if P (B ) = 0.



Conditional Probability

Example (A check)
Let us define the set B as the outcomes with 2 in the first throw:

B = {(2,1), (2,2), (2,3), (2,4), (2,5), (2,6)}

So we have

P(B) = Pr{(2, 1) ∪ (2, 2) ∪ (2, 3) ∪ (2, 4) ∪ (2, 5) ∪ (2, 6)}
     = Pr(2, 1) + Pr(2, 2) + Pr(2, 3) + Pr(2, 4) + Pr(2, 5) + Pr(2, 6)
     = 6/36 = 1/6


Conditional Probability

Example (A check)
and consider A ∩ B, the outcomes in B whose sum is 5:

A ∩ B = {(2,3)}

So we have P(A ∩ B) = Pr(2, 3) = 1/36, so

P(A|B) = P(A ∩ B) / P(B) = (1/36) / (1/6) = 1/6.
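
The check above can be repeated by enumerating the 36 outcomes. This is a minimal sketch: the helper names `P` and `cond` and the event names `B2` and `B6` are assumptions made here, not notation from the notes.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # 36 equally likely outcomes

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))

def cond(A, B):
    """P(A|B) = P(A ∩ B) / P(B); left undefined when P(B) = 0."""
    if P(B) == 0:
        raise ValueError("P(A|B) is undefined when P(B) = 0")
    return P(A & B) / P(B)

A = {s for s in S if sum(s) == 5}    # the two dice sum to 5
B2 = {s for s in S if s[0] == 2}     # the first throw gives 2
B6 = {s for s in S if s[0] == 6}     # the first throw gives 6

print(P(A), cond(A, B2), cond(A, B6))  # → 1/9 1/6 0
```

The three printed values correspond to the unconditional probability, the case worked out above, and the case of the Remark where a 6 in the first throw makes a sum of 5 impossible.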



Independence

Definition
Two events A and B are independent if the occurrence of one event has
no effect on the probability of occurrence of the other event. Thus,

P (A|B ) = P (A)

or equivalently
P (B |A) = P (B )

Clearly, if P(A|B) ≠ P(A), then A and B are dependent.



Independence
A second characterisation

Two events A and B are independent if

P(A|B) = P(A).

Now, by definition of conditional probability, we know that

P(A|B) = P(A ∩ B) / P(B),

so we have

P(A) = P(A ∩ B) / P(B),

and rearranging the terms, we find that two events are independent iff

P(A ∩ B) = P(A)P(B).



Independence

Example
A coin is tossed three times and the eight possible outcomes
HHH, HHT, HTH, THH, HTT, THT, TTH, TTT
are assumed to be equally likely with probability 1/8.

Define
A: an H occurs on each of the first two tosses
B: T occurs in the third toss
D: two Ts occur in the three tosses

• Are A and B independent?

• Are B and D independent?



Independence

Example (cont’d)

We have
A = {HHH, HHT}              Pr{A} = 2/8 = 1/4
B = {HHT, HTT, THT, TTT}    Pr{B} = 4/8 = 1/2
D = {HTT, THT, TTH}         Pr{D} = 3/8
A ∩ B = {HHT}               Pr{A ∩ B} = 1/8
B ∩ D = {HTT, THT}          Pr{B ∩ D} = 2/8 = 1/4

Pr{A}Pr{B} = 1/4 × 1/2 = 1/8 = Pr{A ∩ B} ⇒ A and B are independent

Pr{B}Pr{D} = 1/2 × 3/8 = 3/16 ≠ Pr{B ∩ D} = 1/4 ⇒ B and D are dependent
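
The product rule P(A ∩ B) = P(A)P(B) is easy to test by enumeration. The sketch below rebuilds the three-toss example; the helper `independent` is a hypothetical name introduced here.

```python
from fractions import Fraction
from itertools import product

# Eight equally likely outcomes of three coin tosses.
S = {"".join(t) for t in product("HT", repeat=3)}

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))

def independent(E, F):
    """Check the product rule P(E ∩ F) = P(E) P(F)."""
    return P(E & F) == P(E) * P(F)

A = {s for s in S if s[:2] == "HH"}        # H on each of the first two tosses
B = {s for s in S if s[2] == "T"}          # T on the third toss
D = {s for s in S if s.count("T") == 2}    # exactly two Ts in the three tosses

print(independent(A, B), independent(B, D))  # → True False
```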



Theorem I: The Theorem of Total Probability

Theorem (Total Probabilities)


Let B_1, B_2, ..., B_k, ..., B_n be mutually disjoint events, satisfying
S = ⋃_{i=1}^{n} B_i, and P(B_i) > 0, for every i = 1, 2, ..., n. Then for every A we
have that:

P(A) = ∑_{i=1}^{n} P(A|B_i)P(B_i).    (3)

Proof.
Write A = A ∩ S = A ∩ (⋃_{i=1}^{n} B_i) = ⋃_{i=1}^{n} (A ∩ B_i). Since the {B_i ∩ A} are mutually
disjoint, we have

P(A) = P(⋃_{i=1}^{n} (A ∩ B_i)) = ∑_{i=1}^{n} P(A ∩ B_i) = ∑_{i=1}^{n} P(A|B_i)P(B_i).



Theorem I: The Theorem of Total Probability

Remark. The theorem remains valid even if n = ∞ in Eq. (3).

Corollary
Let B satisfy 0 < P(B) < 1; then for every event A:

P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)

Proof.
Exercise [Hint: S = B ∪ B^c].
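
A numerical check of Eq. (3) may help build intuition before attempting the exercise. The sketch below is an assumption-laden illustration, not a proof: it reuses the two-dice sample space, takes A as "the sum is 5", and partitions S by the value of the first die.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # two fair dice

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))

A = {s for s in S if sum(s) == 5}

# Partition of S by the value of the first die: B_1, ..., B_6.
partition = [{s for s in S if s[0] == i} for i in range(1, 7)]

# Right-hand side of Eq. (3): the sum of P(A|B_i) P(B_i).
total = sum(((P(A & B) / P(B)) * P(B) for B in partition), Fraction(0))
print(total, P(A))  # → 1/9 1/9
```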



Theorem II: Bayes’ Theorem

Theorem I can be applied to derive the celebrated Bayes’ Theorem.

Theorem (Bayes’ Theorem)


Let B_1, B_2, ..., B_k, ..., B_n be mutually disjoint events, satisfying

S = ⋃_{i=1}^{n} B_i,

and P(B_i) > 0, for every i = 1, 2, ..., n. Then for every event A for which
P(A) > 0, we have that

P(B_k|A) = P(A|B_k)P(B_k) / ∑_{i=1}^{n} P(A|B_i)P(B_i).    (4)



Theorem II: Bayes’ Theorem

Proof.
Let us write

P(B_k|A) = P(A ∩ B_k) / P(A)
         = P(A ∩ B_k) / ∑_{i=1}^{n} P(A|B_i)P(B_i)
         = P(A|B_k)P(B_k) / ∑_{i=1}^{n} P(A|B_i)P(B_i).

That concludes the proof.
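
Formula (4) can be sketched as a small function that takes the priors P(B_i) and the likelihoods P(A|B_i) and returns all the posteriors P(B_k|A) at once. The function name `bayes_posterior` and the numbers used below are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return the list of P(B_k|A), given [P(B_i)] and [P(A|B_i)]."""
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P(A|B_i) P(B_i)
    p_A = sum(joint)                                      # Eq. (3): P(A)
    if p_A == 0:
        raise ValueError("P(A) must be positive")
    return [j / p_A for j in joint]                       # Eq. (4)

# Hypothetical numbers, not taken from the lecture.
priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)]

posterior = bayes_posterior(priors, likelihoods)
print(posterior, sum(posterior))
```

The posteriors always sum to 1, because the denominator in (4) is exactly the sum of the numerators over k.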



Theorem II: Bayes’ Theorem

Example
Let us consider a special case, where we have only two events A and B.

From the definition of conditional probability

Pr{A|B} = Pr{A ∩ B} / Pr{B},    Pr{B|A} = Pr{A ∩ B} / Pr{A}

This can be written as

Pr{A ∩ B} = Pr{A|B}Pr{B},    Pr{A ∩ B} = Pr{B|A}Pr{A}

That is

Pr{A|B}Pr{B} = Pr{B|A}Pr{A}

Rearranging this we have

Pr{A|B} = (Pr{A} / Pr{B}) × Pr{B|A}    (Bayes’ Theorem)

... so thanks to Bayes’ Theorem we can reverse the role of A|B and B |A.



Theorem II: Bayes’ Theorem
Application

Example (Guessing in a multiple choice exam)



Theorem II: Bayes’ Theorem
Application

Example
Members of a consulting firm in Geneva rent cars from two rental agencies:
• 60% from AVIS
• 40% from Mobility
Now consider that
• 9% of the cars from AVIS need a tune-up
• 20% of the cars from Mobility need a tune-up

If a car delivered to the consulting firm needs a tune-up, what is the probability that the
car came from AVIS?

Aim
A := {car rented from AVIS} and B := {car needs a tune-up}. We know P (B |A) and
we look for P (A|B ) ⇒ Bayes’ theorem!!



Theorem II: Bayes’ Theorem
Application

Example (cont’d)

Pr{A} = 0.6
Pr{B|A} = 0.09
Pr{B|A^c} = 0.2

Pr{B} = Pr{(B ∩ A) ∪ (B ∩ A^c)}
      = Pr{B ∩ A} + Pr{B ∩ A^c}
      = Pr{B|A}Pr{A} + Pr{B|A^c}Pr{A^c}
      = 0.09 × 0.6 + 0.2 × 0.4
      = 0.134

Pr{A|B} = (Pr{A} / Pr{B}) × Pr{B|A} = (0.6 / 0.134) × 0.09 = 0.402985
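
The rental-car computation can be reproduced in a few lines. The sketch below uses the figures of the example; the variable names are chosen here for readability.

```python
p_avis = 0.6                  # P(A): the car is rented from AVIS
p_tune_given_avis = 0.09      # P(B|A): an AVIS car needs a tune-up
p_tune_given_mobility = 0.20  # P(B|A^c): a Mobility car needs a tune-up

# Theorem of Total Probability: P(B).
p_tune = p_tune_given_avis * p_avis + p_tune_given_mobility * (1 - p_avis)

# Bayes' Theorem: P(A|B).
p_avis_given_tune = p_tune_given_avis * p_avis / p_tune

print(round(p_tune, 3), round(p_avis_given_tune, 6))  # → 0.134 0.402985
```

Although 60% of the cars come from AVIS, only about 40% of the cars needing a tune-up do, because AVIS cars need a tune-up less often.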



Wrap-up
Take-home message
• The Probability Axioms are important, since they imply computation
rules.
• These rules are key to computing probabilities “in real life”.
• The probability of an event can be conditional on the realisation of
another event.
• When the probability of an event is the same regardless of whether another
event occurs, the two events are called independent.
• Theorem I (basic)

P(A) = P(A ∩ B) + P(A ∩ B^c) = P(A|B)P(B) + P(A|B^c)P(B^c)

• Theorem II (basic) If we have P(A|B), we can find P(B|A)

P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B^c)P(B^c)]



Thank you for your attention!
“See you” next week!
