
STAT2372 Probability

2020

Topic 1
Elementary Notions

STAT2372 2020 Topic 1 1


A Short History of Probability
• Ancient Egypt and Greece (games of chance)
• Gerolamo Cardano (1501–1576), Liber de ludo aleae (“Book on
Games of Chance”); Blaise Pascal (1623–1662) and Pierre de
Fermat (1601–1665), correspondence
• Jacob Bernoulli (1654–1705), Ars Conjectandi, 1713 (“The Art of
Conjecturing”); Abraham de Moivre (1667–1754), Leonhard Euler
(1707–1783), Joseph-Louis Lagrange (1736–1813), Pierre-Simon
Laplace (1749–1827), Carl Friedrich Gauss (1777–1855), Siméon
Denis Poisson (1781–1840)
• Andrey Kolmogorov (1903–1987), Grundbegriffe der
Wahrscheinlichkeitsrechnung, 1933 (translated as “Foundations of
the Theory of Probability”, 1950).



Sample Spaces and Events
• Consider an experiment whose outcome is not predictable with
certainty, but for which all possible outcomes are known. We call
the set of all possible outcomes the sample space of the
experiment, and usually denote it by S or Ω. The elements are
listed in braces, { }. For example, if the experiment consists of
flipping one coin, the outcomes are H and T, where H denotes
‘head’ and T denotes ‘tail’, and the sample space is
Ω = {H, T } .
If the experiment consists of flipping two coins, the sample space
is
Ω = {(H, H) , (H, T ) , (T, H) , (T, T )}
The ordered pair represents the outcome of the experiment. Thus
(H, T ) means that a head turned up on the first flip and a tail on
the second.
• Any subset E of the sample space Ω is known as an event. For
example, if E1 is the event that a head turns up on the first coin,
then
E1 = {(H, H) , (H, T )} .
If E2 is the event that at least one tail appears, then

E2 = {(H, T ) , (T, H) , (T, T )} .

• Two events A and B are said to be mutually exclusive if
A ∩ B = ∅ (the empty set), i.e. A and B have no outcomes
in common.
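The two-coin setting above can be sketched in a few lines of plain Python, representing events as sets of outcomes (the names E1 and E2 match the events defined above):

```python
# Enumerate the two-coin sample space and represent events as sets.
from itertools import product

omega = set(product("HT", repeat=2))      # {(H,H), (H,T), (T,H), (T,T)}

E1 = {w for w in omega if w[0] == "H"}    # head on the first coin
E2 = {w for w in omega if "T" in w}       # at least one tail

print(sorted(E1))                 # [('H', 'H'), ('H', 'T')]
print(E1 & E2)                    # {('H', 'T')} - not mutually exclusive
print({("H", "H")} & {("T", "T")})  # set() - mutually exclusive
```

Mutual exclusivity is simply an empty set intersection, which is why sets are a natural representation here.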



The Axioms of Probability

A function P whose domain is the set of all possible events and


which satisfies the three conditions (axioms)
1. 0 ≤ P (E) ≤ 1;
2. P (Ω) = 1;
3. For any sequence of mutually exclusive events E1 , E2 , . . .

P (E1 or E2 or . . .) = P (E1 ) + P (E2 ) + . . .

i.e.

P ( ⋃_{i=1}^∞ Ei ) = Σ_{i=1}^∞ P (Ei ) ,

is called a probability function.
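As a sanity check, the three axioms can be verified numerically for the equally likely measure on a finite sample space. This is a sketch only; the helper `prob` is ours, not part of the notes:

```python
# Verify the probability axioms for equally likely outcomes
# on the two-coin sample space.
from itertools import product

omega = set(product("HT", repeat=2))

def prob(event):
    # Equally likely outcomes: P(E) = |E| / |Omega|.
    return len(event) / len(omega)

E1 = {("H", "H"), ("H", "T")}
E2 = {("T", "T")}                  # disjoint from E1

assert 0 <= prob(E1) <= 1                      # axiom 1
assert prob(omega) == 1                        # axiom 2
assert prob(E1 | E2) == prob(E1) + prob(E2)    # axiom 3 (finite case)
```

Note the code can only exhibit the finite form of axiom 3; the axiom itself is stated for countably infinite sequences of mutually exclusive events.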



Probability Notation and Properties

If A and B are any two events within the sample space, i.e. subsets
of Ω, then
• the symbol ∩ denotes intersection and A ∩ B is the event that
both A and B occur.
• Ā or Aᶜ denotes the complement of A (read as “not A”), and so

P (A) + P (Aᶜ) = P (Ω) = 1;

• the symbol ∪ denotes union and A ∪ B is the event that either A
or B (or both) has occurred.
Writing A ∪ B as a union of mutually exclusive pieces and
applying axiom 3 gives

P (A ∪ B) = P (A) + P (B) − P (AB) .
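The inclusion–exclusion identity can be checked directly on the two-coin space (a sketch; the events A and B chosen here are illustrative):

```python
# Check P(A u B) = P(A) + P(B) - P(AB) on the two-coin space.
from itertools import product

omega = set(product("HT", repeat=2))
prob = lambda e: len(e) / len(omega)

A = {w for w in omega if w[0] == "H"}   # head on the first coin
B = {w for w in omega if w[1] == "H"}   # head on the second coin

assert prob(A | B) == prob(A) + prob(B) - prob(A & B)
print(prob(A | B))   # 0.75
```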



• If A and B are mutually exclusive, then

P (A ∪ B) = P (A) + P (B) ;

• Important: Events A and B are said to be independent if

P (AB) = P (A) P (B) ;

• An event F is a sub-event of the event E if E contains F,
i.e. F ⊂ E. Then P (F ) ≤ P (E) .
e.g. in the example of tossing two coins, let E be the event that
at least one head appears and F be the event that the first coin
is a head. Then

E = {(H, T ) , (T, H) , (H, H)}

and
F = {(H, T ) , (H, H)} .



Clearly, F is a subset of E, and P (F ) ≤ P (E) no matter how
the numeric probabilities are assigned to the events.
• Note that the domain of the function P is the set of all events.
Thus if we write P (E) , we are implying that E is an event.
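The sub-event example above can be sketched as a subset check (a sketch; `prob` is our own helper for the equally likely measure):

```python
# Sub-events: F ⊂ E implies P(F) <= P(E) for the two-coin space.
from itertools import product

omega = set(product("HT", repeat=2))
prob = lambda e: len(e) / len(omega)

E = {("H", "T"), ("T", "H"), ("H", "H")}   # at least one head
F = {("H", "T"), ("H", "H")}               # first coin is a head

assert F <= E               # F is a subset of E
assert prob(F) <= prob(E)   # monotonicity of P
```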

Independence: Pairwise and Mutual


• Events A and B are said to be independent if

P (AB) = P (A) P (B) ;

• More precisely, events A and B are said to be pairwise
independent if they satisfy the above condition.
• What about three or more events?
• Three events A, B and C are said to be mutually independent if
– P (ABC) = P (A)P (B)P (C), and
– all combinations of event pairs are pairwise independent, i.e.
∗ P (AB) = P (A)P (B)
∗ P (AC) = P (A)P (C)
∗ P (BC) = P (B)P (C)

Conditional Probability
• If A and B are any two events, then the conditional probability
of A given B is defined by
P (A|B) = P (AB) / P (B) ,

provided P (B) > 0. It is undefined if P (B) = 0.
• This makes sense if you draw a Venn diagram and think about
redefining probabilities when you know that only the outcomes in
B are possible.
• Let us extend this to three events. Suppose we toss two ordinary
six-sided dice (i.e. the numbers on the faces are 1, 2, 3, 4, 5
and 6) and the tosses are “independent”.
• Let A be the event that the number on the first die is odd, B the
event that the number on the second die is odd and C the event
that the sum of the numbers is odd.
• Clearly, events A and B are independent, as the tosses are
independent.
• However, events A and C would seem to be dependent, as the
result of the first toss is used to determine the sum. How do we
investigate this?
• First, we obtain the probabilities by counting, assuming equally
likely outcomes. The sample spaces for the outcomes for each of
the two dice are
Ω1 = Ω2 = {1, 2, 3, 4, 5, 6} .

• Assuming the two dice are perfectly constructed, the outcomes
are equally likely and therefore each has probability 1/6.
• Thus
A = {1, 3, 5} and P (A) = 3/6 = 1/2 ;
B = {1, 3, 5} and P (B) = 3/6 = 1/2 .
Now C = {(1, 2) , (1, 4) , (1, 6) , (2, 1) , (2, 3) , (2, 5) , . . . , (6, 5)} .
The number of outcomes comprising C is 18, the total number of
outcomes is 6 × 6 = 36 and therefore
P (C) = 18/36 = 1/2 .



• Now,
P (C|A) = P (sum is odd | first is odd)
= P (first is even and second is odd | first is odd)
+ P (first is odd and second is even | first is odd)
= P (first is odd and second is even | first is odd)
(the first term vanishes, since we condition on the first being odd)
= P (second is even | first is odd) .

• Since the two tosses are independent, the conditioning can be


dropped. Thus
P (C|A) = P (second is even)
= P (Bᶜ)
= 1 − P (B)
= 1/2



and therefore
P (C|A) = 1/2 = P (C) .
• Hence the events A and C are independent!
• We can also show that B and C are independent.
• Hence, pairwise independence has been shown for the events A, B
and C.
• However, these events are not mutually independent, since

P (C|AB) = 0

and so
P (ABC) = 0.
i.e. if A and B occur, then C cannot.
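The whole dice argument can be verified by brute-force counting over the 36 equally likely outcomes (a sketch; exact arithmetic via `fractions.Fraction` avoids floating-point comparisons):

```python
# Pairwise independence holds for A, B, C, but mutual independence fails.
from itertools import product
from fractions import Fraction

omega = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes
prob = lambda e: Fraction(len(e), len(omega))

A = [w for w in omega if w[0] % 2 == 1]            # first die odd
B = [w for w in omega if w[1] % 2 == 1]            # second die odd
C = [w for w in omega if (w[0] + w[1]) % 2 == 1]   # sum odd

inter = lambda *evs: [w for w in omega if all(w in e for e in evs)]

# All three pairs are pairwise independent ...
assert prob(inter(A, B)) == prob(A) * prob(B)
assert prob(inter(A, C)) == prob(A) * prob(C)
assert prob(inter(B, C)) == prob(B) * prob(C)

# ... but mutual independence fails: if A and B occur, C cannot.
assert prob(inter(A, B, C)) == 0
assert prob(A) * prob(B) * prob(C) == Fraction(1, 8)
```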



The Law of Total Probability
• Suppose events E1 , E2 , . . . , En are mutually exclusive and
exhaustive of the sample space Ω (i.e. ⋃_{i=1}^n Ei = Ω). Then,
for any event A, we can write

A = ⋃_{i=1}^n (AEi ) .

• Because the Ei are mutually exclusive events, the AEi are also
mutually exclusive.



• Thus,

P (A) = Σ_{i=1}^n P (AEi ) = Σ_{i=1}^n P (A|Ei ) P (Ei ) .

• Now, using the conditional probability rule, we get

P (A|B) = P (AB) / P (B)

and so
P (AB) = P (A|B) P (B)
or P (AB) = P (B|A) P (A) .



This can be visualised in a Venn diagram.



Bayes’ Rule (or Theorem)
• Using the Law of Total Probability, the result known as Bayes’
rule, which is named after Thomas Bayes (1701–1761), can be
derived from the above.
• If {Ei ; i = 1, . . . n} are mutually exclusive and exhaustive, and A
is any event, then for j = 1, . . . , n,
P (Ej |A) = P (Ej A) / P (A)
= P (A|Ej ) P (Ej ) / Σ_{i=1}^n P (A|Ei ) P (Ei ) .

• Note that the index in the denominator is i, not j; i is a
dummy (summation) index. Using the same variable in the
numerator and the denominator, especially when one is a
summation index, is sloppy notation which will be avoided.
• Example: A company buys its tyres from four suppliers: A (20%),
B (30%), C (45%), D (5%). 10% of A’s tyres are faulty, 8% of
B’s, 20% of C’s and 2% of D’s. If a tyre is selected at random
and found to be faulty, what is the probability that it came from
supplier C?
• Let F be the event that the tyre is faulty, and Ei be the event
that the tyre came from supplier i, (i = 1, 2, 3, 4), corresponding
to suppliers A, B, C and D respectively.
• Then
P (E1 ) = 0.20 P (F |E1 ) = 0.1
P (E2 ) = 0.30 P (F |E2 ) = 0.08
P (E3 ) = 0.45 P (F |E3 ) = 0.20
P (E4 ) = 0.05 P (F |E4 ) = 0.02



• Thus

P (E3 |F ) = P (F |E3 ) P (E3 ) / Σ_i P (F |Ei ) P (Ei )
= (0.20) (0.45) / [(0.1) (0.2) + (0.08) (0.3) + (0.2) (0.45) + (0.02) (0.05)]
= 0.09 / 0.135
= 2/3
≃ 0.6667

• Note that the denominator value is a “weighted average” of the
conditional probabilities.
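The tyre computation above is a direct application of the law of total probability and Bayes' rule, and can be sketched in a few lines (supplier labels and dictionary names are our own):

```python
# Bayes' rule for the tyre example: posterior probability of supplier C.
priors = {"A": 0.20, "B": 0.30, "C": 0.45, "D": 0.05}   # P(Ei)
faulty = {"A": 0.10, "B": 0.08, "C": 0.20, "D": 0.02}   # P(F | Ei)

# Law of total probability: P(F) = sum_i P(F|Ei) P(Ei).
p_f = sum(faulty[s] * priors[s] for s in priors)

# Bayes' rule: P(C|F) = P(F|C) P(C) / P(F).
posterior_C = faulty["C"] * priors["C"] / p_f

print(round(p_f, 3))          # 0.135
print(round(posterior_C, 4))  # 0.6667
```

Because the priors sum to 1, `p_f` is exactly the weighted average of the conditional fault rates noted above.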

