
LU1: Introduction to Probability Theory


◼ Topics
▪ History

▪ Counting formulas
❑ Permutations
❑ Combinations

▪ Probability definitions
❑ Basic concepts
❑ Laplace’s theory
❑ Frequentist theory
❑ Subjective theory

Ana Cristina Costa 2



◼ At the end of this learning unit, students should be able to:

▪ Apply the appropriate counting formula to each situation
▪ Understand the difference between permutations and combinations
▪ Explain the difference between an elementary outcome of a random
experiment and a non-elementary outcome
▪ Explain the difference between sample space and event, and identify
the sample space of a random experiment
▪ Compute probabilities using Laplace’s theory (i.e., the Classical
definition of probability)
▪ Explain the limitations of Laplace’s theory
▪ Understand the Frequentist and Subjective theories
▪ Explain the limitations of the Frequentist theory


◼ Resources on the Internet


▪ Newbold, P., Carlson, W. L., Thorne, B. (2013). Statistics for Business and
Economics. 8th Edition, Boston: Pearson, Sections 3.1 and 3.2. (requires VPN
connection)

▪ Kyle Siegrist (2020) Random - Probability, Mathematical Statistics, Stochastic
Processes. University of Alabama in Huntsville, Chapter 0 and Chapter 1. (access: 9
Feb 2021)

▪ MathsIsFun.com (2017) Combinations and Permutations. (access: 9 Feb 2021)

▪ Lightner, J. (1991). A Brief Look at the History of Probability and Statistics. The
Mathematics Teacher, 84(8), 623-630. (requires VPN connection)



History

In the frivolous court of the kings of France, an experienced and inveterate gambler,
the Chevalier de Méré, found apparent contradictions between the computed
probabilities of winning a certain game and his own extensive experience. He
put this problem, among other questions about games, to Pascal (1623-1662).

One of these questions was immediately resolved by Pascal; others were resolved by
Fermat (1601-1665) in the course of his correspondence with Pascal.

Also in the 17th century, two further works deserve mention: the remarkable work of
Huygens (1629-1695), who introduced the notion of mean value or mathematical
expectation, and the masterful treatise of Jacob Bernoulli (1654-1705), which still
influences probabilistic thinking today.

Then came de Moivre (1667-1754), who proposed a first version of the Central Limit
Theorem, to which Gauss (1777-1855) and, above all, Laplace (1749-1827)
would later give a more general form.


Bayes (1702-1761) made the first attempt to mathematize statistical inference.

From the end of the 19th century, Galton (1822-1911), K. Pearson (1857-1936) and
Student (1876-1937), the pseudonym of W. S. Gosset, began the broad formulation of
statistics and its applications.

In the twentieth century, the contributors are so many that it is practically
impossible to name them all. In Mathematical Statistics, the names of Fisher
(1890-1962), Wald (1902-1950) and Neyman (1894-1981) stand out.

Gauss (1777-1855), Lagrange (1736-1813) and Poisson (1781-1840), to name just a
few authors, also made important contributions to the Theory of Probabilities.



Counting formulas

◼ In counting processes, a complex problem is decomposed into a sequence
of independent elementary problems
◼ The number of results of the original problem is equal to the product of
the numbers of results of the elementary problems

Example: the toss of a coin has 2 outcomes, and the roll of a die has 6.
Tossing the coin and then rolling the die gives 2 × 6 = 12 different
outcomes.
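
The product rule for the coin-and-die example can be checked by listing the outcomes explicitly. The following is a minimal sketch; the names `coin`, `die` and `outcomes` are illustrative and not part of the course material.

```python
from itertools import product

coin = ["H", "T"]            # 2 outcomes
die = [1, 2, 3, 4, 5, 6]     # 6 outcomes

# Every (coin, die) pair is one outcome of the sequential experiment.
outcomes = list(product(coin, die))

print(len(outcomes))         # number of pairs, i.e. 2 x 6
print(outcomes[:3])          # first few pairs
```

Enumerating is only practical for small problems; the counting formulas give the same number without building the list.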


◼ Example (sequence with repetition)
▪ A bit is equal to 0 or 1; a byte is a sequence of 8 bits.
▪ How many different codes can be represented by a byte?
▪ This is the number of ordered sequences of 8 elements, where each element
can take 2 possible values:

1st position × 2nd position × … × 8th position
2 × 2 × … × 2 = $2^8 = 256$

✓ The order of the bits is important!


◼ Permutations with repetition

▪ The number of ordered sequences of dimension r that can be formed
with the n elements of a set A is given by $n^r$

❑ When the order does matter, it is a permutation

❑ In permutations with repetition, we can re-use the same element within
the sequence
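
As a hedged illustration of the formula for sequences with repetition, the sketch below counts the sequences by brute-force enumeration and compares the result with n raised to the power r, using the byte example (n = 2, r = 8). The helper name `count_sequences_with_repetition` is hypothetical.

```python
from itertools import product

def count_sequences_with_repetition(elements, r):
    """Count ordered sequences of length r by explicit enumeration."""
    return sum(1 for _ in product(elements, repeat=r))

bits = [0, 1]
n_codes = count_sequences_with_repetition(bits, 8)

# Both numbers should agree: the enumeration and the closed formula n**r.
print(n_codes, len(bits) ** 8)
```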


◼ Example (sequence without repetition)
▪ How many codes with 4 digits can you choose for the ATM card if none of
the digits can be repeated?
▪ This is the number of ordered sequences of 4 elements, where the 10 possible
elements cannot be repeated:

1st digit × 2nd digit × 3rd digit × 4th digit
10 × 9 × 8 × 7 = $\frac{10!}{(10-4)!}$ = 5040

✓ The order of the digits is important!
✓ Without repetition our choices get reduced each time


◼ Partial permutations (or Sequences without repetition)

▪ The number of ordered sequences of r (r ≤ n) elements, where the n
possible elements cannot be repeated, is given by $\frac{n!}{(n-r)!}$

❑ When the order does matter, it is a permutation

❑ In partial permutations, we cannot re-use the same element within the
sequence
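
The partial-permutation formula can be sketched in a few lines and cross-checked against an explicit enumeration of the 4-digit ATM codes. The function name `partial_permutations` is an assumption made for this illustration; `math.perm` requires Python 3.8 or later.

```python
import math
from itertools import permutations

def partial_permutations(n, r):
    """n! / (n - r)!: ordered sequences of r distinct elements out of n."""
    return math.factorial(n) // math.factorial(n - r)

atm_codes = partial_permutations(10, 4)

# Cross-check by listing every 4-digit code with no repeated digit.
enumerated = sum(1 for _ in permutations(range(10), 4))

# The standard library offers the same count directly as math.perm.
print(atm_codes, enumerated, math.perm(10, 4))
```

When r = n the formula reduces to n!, which is the "permutations of n" case.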


◼ Permutations of n
▪ Case of a sequence without repetition when n = r
▪ There are n! ordered sequences without repetition

❑ A permutation of a set of objects is an arrangement of those objects into a
particular order: there are n! ways to order the n elements of a set A

◼ Example
▪ To access a particular computer, it is necessary to enter a password
consisting of 10 different digits
▪ The number of passwords you can choose is equal to 10! = 3 628 800


◼ Permutations with multinomial coefficients (or Multiset permutations)

▪ Let A be a set of k elements. We want to form ordered sequences of n
elements, with n > k (i.e., at least one element will have to be repeated), such
that:
❑ The 1st element appears n1 times
❑ The 2nd element appears n2 times
❑ …
❑ The k-th element appears nk times, and n1+n2+…+nk = n

▪ The number of distinct sequences we obtain is given by the multinomial
coefficient: $\frac{n!}{n_1! \times n_2! \times \cdots \times n_k!}$

▪ Multiset permutations: suppose we have a set with n items, where there are
n1, n2, …, nk that are identical. The number of ways to permute them is given
by this multinomial coefficient.


◼ Example
▪ To access a particular computer, it is necessary to enter a password
consisting of 15 digits. How many passwords can be formed that contain:
❑ the digit 0 twice
❑ the digit 1 three times
❑ the digit 9 three times
❑ each of the other digits once

▪ Problem of permutations with multinomial coefficients:

$\frac{15!}{2! \, 3! \, 1! \, 1! \, 1! \, 1! \, 1! \, 1! \, 1! \, 3!}$


◼ Combinations
▪ A combination is a selection of items from a collection, such that the
order of selection does NOT matter (unlike permutations)

▪ A k-combination of a set A with n elements is a subset of k distinct
elements (k ≤ n) of A. The number of k-combinations is equal to the binomial
coefficient

$C_k^n = \binom{n}{k} = \frac{n!}{(n-k)! \, k!}$


◼ Example
▪ A restaurant needs to hire 2 cooks and 3 waiters from 14 candidates,
of which 4 are cooks and 10 are waiters. In how many different ways
can we do it?

▪ The order does not matter (choosing Mary and John is equal to
choosing John and Mary) → problem of Combinations

$C_2^4 \times C_3^{10} = 6 \times 120 = 720$

where $C_2^4$ chooses 2 among the 4 cooks and $C_3^{10}$ chooses 3 among the 10 waiters
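
The restaurant example combines the binomial coefficient with the product rule. A minimal sketch, using `math.comb` from the standard library (Python 3.8 or later) and hypothetical candidate labels `c1` to `c4` for the cooks:

```python
import math
from itertools import combinations

cooks = math.comb(4, 2)      # ways to choose 2 of the 4 cooks
waiters = math.comb(10, 3)   # ways to choose 3 of the 10 waiters
ways = cooks * waiters       # product rule: the two choices are independent

# Cross-check one factor by listing the unordered pairs of cooks.
cook_pairs = list(combinations(["c1", "c2", "c3", "c4"], 2))

print(cooks, waiters, ways, len(cook_pairs))
```

`combinations` never yields both ("c1", "c2") and ("c2", "c1"), which mirrors the remark that choosing Mary and John is the same as choosing John and Mary.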



Probability definitions

◼ Basic concepts
▪ Random experiment
❑ In the context of this Learning Unit, an experiment is said to be
random if:
1. We know all its possible results.
2. Each time it is carried out, it is not known in advance which of
the possible results will happen.
3. It can be repeated under similar conditions.

▪ Elementary outcome (or elementary result)
❑ An elementary outcome of a random experiment is a result that
cannot be subdivided into any other.

▪ Event
❑ In the context of this Learning Unit, an event is a set of one or more
elementary outcomes of a random experiment.

▪ Sample space
❑ In the context of this Learning Unit, the sample space Ω is the set of
all elementary outcomes of a random experiment.

❑ An elementary outcome corresponds to a subset of Ω formed by a
single element.

❑ An event corresponds to any subset of Ω.


◼ Example 1
▪ Consider a random experiment that consists of flipping a coin and
observing which of the faces comes up: H = “Head” or T = “Tail”.

▪ What is the sample space of this random experiment?

✓ Ω1 = {H, T}


◼ Example 2
▪ Consider a random experiment that consists of rolling a die.

▪ What is the sample space of this random experiment?

✓ Ω2 = {1, 2, 3, 4, 5, 6}

▪ Let A be the event “the outcome is an even face”. How should we
represent this event?
✓ A = {2, 4, 6}


◼ Example 3
▪ Consider a random experiment that consists of rolling 2 dice.

▪ What is the sample space of this random experiment?

✓ Ω3 = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), … }

▪ Let B be the event “the sum of the outcomes of the two dice is equal
to 6”. How should we represent this event?
✓ B = {(1,5), (2,4), (3,3), (4,2), (5,1)}


◼ Laplace’s theory (or Classical definition of probability)


▪ Let Ω be the sample space of a random experiment with N elementary
outcomes that are all equally likely to occur. Let A be an event with n
elementary outcomes.

▪ The probability of A is represented by P(A) and is given by

P(A) = n/N
❑ n is the cardinal number of the event A
❑ N is the cardinal number of the sample space

➢ Assumption: the probability of any elementary outcome is 1/N



▪ The probability of an event is the ratio of the number of cases
favorable to it to the number of all cases possible, when nothing leads
us to expect that any one of these cases should occur more than any
other, which renders them, for us, equally possible:

$P(A) = \frac{\#A}{\#\Omega} = \frac{n}{N}$


◼ Example 2 (continued)
▪ Consider a random experiment that consists of rolling a die. Let A be
the event “the outcome is an even face”. What is the probability of the
event A?

▪ The cardinal number of the sample space is

✓ #Ω2 = #{1, 2, 3, 4, 5, 6} = 6

▪ The cardinal number of the event A is

✓ #A = #{2, 4, 6} = 3

▪ The probability of event A is

✓ P(A) = 3/6 = 0.5

◼ Example 3 (continued)
▪ Consider a random experiment that consists of rolling 2 dice. Let B be
the event “the sum of the outcomes of the two dice is equal to 6”.
What is the probability of the event B?

▪ The cardinal number of the sample space is

✓ #Ω3 = #{(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), … } = 36

▪ The cardinal number of the event B is

✓ #B = #{(1,5), (2,4), (3,3), (4,2), (5,1)} = 5

▪ The probability of event B is

✓ P(B) = 5/36 ≈ 0.1389
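
The classical computation for the two-dice example can be reproduced by building the sample space and the event explicitly. This is a sketch under the assumption of balanced dice (all 36 ordered pairs equally likely); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Sample space: all ordered pairs (first die, second die).
omega = list(product(faces, repeat=2))

# Event B: the two outcomes sum to 6.
event_b = [pair for pair in omega if sum(pair) == 6]

# Laplace's definition: favourable cases over possible cases.
p_b = Fraction(len(event_b), len(omega))

print(len(omega), len(event_b), p_b, round(float(p_b), 4))
```

Using `Fraction` keeps the result exact, so the ratio of cardinalities is visible before it is rounded to a decimal.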

◼ Consequences of Laplace’s theory


▪ The probability of an event is a number between 0 and 1

$0 \le \frac{\#A}{\#\Omega} = \frac{n}{N} \le 1$, because #A ≤ #Ω and #A ≥ 0

▪ If an event occurs with certainty, its probability is 1

$P(\Omega) = \frac{\#\Omega}{\#\Omega} = \frac{N}{N} = 1$

Example: P(outcome of a die is greater than or equal to 1) = P(Ω) = 1

▪ If an event is certain not to occur (an impossible event), its probability is 0

$P(\emptyset) = \frac{\#\emptyset}{\#\Omega} = \frac{0}{N} = 0$

Example: P(outcome of a die is 7) = P(∅) = 0


◼ Limitations of Laplace’s theory


▪ Inapplicable when the number of possible elementary outcomes is
infinite

▪ Inapplicable when elementary outcomes are not equiprobable; moreover,
the definition is circular, because “equally likely” already presupposes a
notion of probability

▪ Inapplicable to complex phenomena


◼ Frequentist theory (or Empirical theory)


▪ Consider a random experiment that can be repeated any number of times,
so that we can produce a series of independent trials under identical
conditions. In each observation, depending on chance, a particular
event A either occurs or does not occur.

▪ Let n be the number of repetitions of the experiment and let n(A) be
the number of times that event A occurs in this series of experiments.
❑ The ratio n(A)/n is called the relative frequency of the event A (in this given
series of independent and identical trials)

▪ The probability of A is given by

$P(A) = \lim_{n \to +\infty} \frac{n(A)}{n}$



▪ The relative frequency of occurrence of an event A, observed in a large
number of repetitions of the experiment, is a measure of the
probability of that event:

$P(A) \approx \frac{n(A)}{n}$

▪ It has been empirically observed that the relative frequency becomes
stable in the long run.

▪ This definition no longer requires that the elementary outcomes be
equiprobable!
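
The stabilisation of the relative frequency can be illustrated with a small simulation. This is a sketch only: it uses a pseudo-random generator in place of a physical experiment, and the function name `relative_frequency`, the seed and the chosen numbers of tosses are assumptions made for the illustration.

```python
import random

def relative_frequency(p_heads, n_tosses, seed=0):
    """Simulate n_tosses of a coin with P(heads) = p_heads; return n(A)/n."""
    rng = random.Random(seed)          # seeded so the run is reproducible
    heads = sum(rng.random() < p_heads for _ in range(n_tosses))
    return heads / n_tosses

# The ratio n(A)/n is variable for small n and settles as n grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(0.5, n))
```

Setting `p_heads` to a value other than 0.5 mimics an unbalanced coin, a case the classical definition cannot handle but the frequentist one can.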


◼ Example 4
▪ John Kerrich, along with fellow internee Eric Christensen, tossed a coin 10 000 times
and recorded the occurrence of “heads” while interned in Denmark during
World War II.

▪ By recording the number of heads obtained as the trials continued, Kerrich was
able to demonstrate that the proportion of heads obtained asymptotically
approached the theoretical value of 0.5 (see the results in the LU1_Examples Excel file).
❑ The probability of “heads” is equal to 0.5

❑ A fair coin was used

Kerrich, J. E. (1946). An Experimental Introduction to the
Theory of Probability. Copenhagen: J. Jorgensen.


◼ Example 5
▪ When you toss a coin, there are only two possible outcomes, heads or tails. On
any one toss, you will observe one outcome or another—heads or tails. Over a
large number of tosses, though, the percentage of heads and tails will come to
approximate the true probability of each outcome.

▪ In this applet, you can set the true probability of heads for your virtual coin,
and then toss it any number of times.

❑ Notice how the proportion of tosses that produce heads can be quite
variable at first but will eventually settle down to the true probability.

❑ Click the "Quiz Me" button to complete the activity.

“Statistical Applets”, Probability, book companion site of Moore, D., Notz, W. &
Fligner, M. (2015) The Basic Practice of Statistics. (accessed February 2021)


◼ Example 6
▪ In the academic year 2002/2003, students of the Degree in Information
Management were asked to roll 2 dice at least 50 times and record the "sum
of the dots".

▪ Objective: to compare the probabilities computed through the Frequentist
theory and Laplace’s theory.

❑ The results of the experiment of Alexandra Pinto are available in the
Example6a sheet of the LU1_Examples Excel file
❑ Balanced dice were not used, and therefore the Frequentist setting is more
appropriate

❑ The calculation of probabilities using Laplace’s theory is available in the
Example6b sheet of the LU1_Examples Excel file
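
For reference, the Laplace side of this comparison can be sketched as follows. This is not a reproduction of the Excel sheet, only the classical calculation under the assumption of two balanced dice; the names `sum_counts` and `laplace` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely ordered pairs give each sum.
sum_counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Laplace probability of each sum: favourable pairs / 36.
laplace = {s: Fraction(c, 36) for s, c in sorted(sum_counts.items())}

for s, p in laplace.items():
    print(s, p, round(float(p), 4))
```

These values are what observed relative frequencies of the "sum of the dots" would be compared against.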


◼ Consequences of Frequentist theory


▪ The probability of an event is a number between 0 and 1

▪ If an event occurs with certainty, its probability is 1

▪ If an event is certain not to occur, its probability is 0


◼ Limitations of Frequentist theory


▪ The definition assumes that the relative frequency converges, but its limit
might not exist

▪ Inapplicable when the experiment cannot be repeated

▪ Inapplicable when the experiment cannot be repeated under identical
conditions

▪ Inapplicable to complex phenomena


◼ Remarks on the Frequentist theory


▪ The frequentist definition is deliberately vague on certain points,
because a practical definition is intended.
❑ We need to make statements such as: "the probability that the patient will
survive the operation is 0.4".
❑ Why 0.4? The answer is possibly because 40% of previous patients survived
the operation.
❑ But were the previous patients identical to this patient? No, they were not:
we are all individuals. But this is how this theory is often applied.

➢ This is the problem that is faced whenever a mathematical model is
applied to a practical and real situation. We must be prepared to
understand when we can apply a model and when we cannot.


◼ Subjective theory
▪ The probability of an event is the degree of belief a person attaches to
that event, based on his/her available information, on a scale from 0 to
1 (or 0% to 100%).
❑ This reasoning holds only under the assumption of rationality, which
assumes that people act coherently.

❑ It was developed by the probabilist B. de Finetti.


◼ Example 7
▪ In an interview, an economist said that he considered the
“Improvement” of the economic situation as likely as its “Stagnation”.
However, he viewed “Improvement” as twice as likely as the
“Breakdown” of economic activity.

❑ Sample space:

Ω = {“Improvement”, “Stagnation”, “Breakdown”}

❑ From these statements alone, it is not possible to determine the probability
associated with each outcome, only the relation between them:

P(“Improvement”) = P(“Stagnation”) = 2 × P(“Breakdown”)
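
The relation above fixes only the ratios between the three probabilities. As a hedged illustration, suppose in addition that the three outcomes are mutually exclusive and exhaustive, so that their probabilities sum to 1. This is an assumption not stated in the example; it anticipates the axioms of the next Learning Unit. Writing p for P(“Breakdown”), the values would then follow from a single equation:

```latex
\begin{align*}
P(\text{Improvement}) + P(\text{Stagnation}) + P(\text{Breakdown}) &= 1 \\
2p + 2p + p &= 1 \\
p &= \tfrac{1}{5}
\end{align*}
```

Under that assumption, P(“Improvement”) = P(“Stagnation”) = 2/5 and P(“Breakdown”) = 1/5.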


◼ Limitations of Subjective theory


▪ It involves no formal calculations and reflects only the subject's
opinions and past experience, rather than data or computation.

▪ Subjective probabilities differ from person to person.

▪ There is usually a high degree of personal bias.

➢ One way to improve the quality of a subjective probability is to use the
opinion of an expert in that field (e.g., an investment banker’s opinion of
the probability that a hostile takeover will succeed, or an engineer’s
opinion of the feasibility of a new energy technology).


◼ Remarks on probability definitions


▪ Much of the mathematics of probability was developed based on the
simplistic definition of Laplace’s theory. Alternative interpretations of
probability (e.g., Frequentist and Subjective theories) also have
problems.

▪ The following Learning Unit introduces the Axiomatic theory, which
deals in abstractions, avoiding the limitations and philosophical
complications of any probability interpretation. This means that the
probabilities of our events can be perfectly arbitrary, except that they
must satisfy a set of simple axioms.
❑ The classical theory will correspond to the special case of so-called
equiprobable spaces.
