
BC0406 – Introduction to

Probability and Statistics

Lecture 7
Introduction to Probability
Basic rules, chance,
probability tree

Dr. Richard H.A.H. Jacobs


Universidade Federal do ABC
Agenda
Today’s lecture

• Announcements

• Probability
• Introduction
• Addition rule
• Multiplication rule
• Conditional probability
• Tree diagram
• Notations
• More notions about chance

• Chapters for today and the next lecture


• Larson & Farber: Chapters 3 & 4
• Ross...
Announcement

• Second half of the course


• Probability
• We will do more exercises
• Bring a calculator to the lectures
Transition to probability

• So far, we have used the normal approximation
method to talk about:
– “area under the curve”
– “frequency within a range” etc.
– But we did not define these terms precisely
– We need to do so in order to continue

• With a precise definition of distributions we
can:
– Understand the approximation method for the
normal distribution more fully
– Understand the importance of the random
sample
• The basis for all probabilistic inference
Probability theories
• Subjective probability theories
– “Feeling” of certainty; degree of
conviction in some belief
– Can be applied to potential “unique”
events
• Ex: How likely was Germany to emerge
victorious from WW2 if Hitler had
anticipated the arrival of the Allied troops in
Normandy?
– Also known as “Bayesian” after
Rev. Thomas Bayes, who helped develop this
approach
Probability Theories
• Frequency theory of probability
– Handles processes that can be
repeated several times, independently
and under the “same” conditions
• Tossing coins; throwing dice; card games;
lotteries
• Samples of experiments that can be repeated
– Does not apply to “unique” events
• Events for which repeated sampling cannot
be done
• Impossible to repeat WW2, with Germany
anticipating the arrival of Allied troops in
Normandy
Notions about chance
“If you ain't just a little scared when you enter a
casino, you are either very rich or you haven't
studied the games enough.”  (Anonymous bookie)

"A gambler with a system must be, to a greater or


lesser extent, insane." G. A. Sala (1828-95)

“The Gambler's Fallacy and its twin, the Reverse


Gambler’s Fallacy, have two distinctions that no
other fallacy has. They have built a city in the
desert: Las Vegas. They are the economic mainstay
of Monaco, an entire, albeit tiny, country, from which
we get the alias "Monte Carlo" fallacy… Both
versions of the fallacy are based on the same
mistake, namely, a failure to understand statistical
independence.” (Lacey, 1996)
Notions about chance
“The Probability of an Event is higher or
lower depending on the number of
Chances that it can occur, compared to
the number of total Chances that it can
occur or fail to occur... So if an Event
has 3 Chances to occur and 2 Chances
to fail, the fraction 3/5 adequately
represents the Probability of its
occurrence and may be considered its
measure”.
Abraham de Moivre
(1667-1754)
Contemporary definition of classical probability

• “The chance of something refers to the
percentage of times in which this something
is expected to occur, when the basic process
is repeated over and over for an infinite
number of times, independently and under
the same conditions”.
» (Pisani et al., 1977)
Basic terminology
• Event
– The result of some process or “attempt”
• Tossing a coin, throwing a die, drawing a card

• Independent events
– The occurrence of one event has no effect on
the occurrence of another event
• Results of two coin tosses

• Mutually exclusive events


– The occurrence of one event eliminates the
possibility of another event occurring
• Heads, tails in a coin toss

• Exhaustive events
– A set that represents all possible outcomes that
can be produced
• Heads, tails in a coin toss
• Numbers “1” “2” “3” “4” “5” and “6” in a throw of a
die
Theoretical vs. Empirical probability
• Theoretical probability
– aka the “analytical view”
– Based on the number of ways an event can
occur
• Ex: probability of drawing a King (R) from a deck:
• p(R) = N(R)/[N(R) + N(~R)] = 4/52 = 0,077

• Empirical probability
– “Relative frequency view”
– Conclusions about real past occurrences
• Ex: We drew 2 Kings in 20 attempts
• p(R) = 2/20 = 0,10

• For a finite number of trials, empirical probabilities


need not be the same as theoretical probabilities
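The gap between the two views can be illustrated with a short simulation. This is a minimal sketch, not part of the lecture: the deck representation, the function names and the fixed seed are assumptions made for illustration, and each draw is taken from a full deck (with replacement).

```python
import random
from fractions import Fraction

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]  # 52 cards


def theoretical_p_king():
    """Theoretical view: count the ways the event can occur."""
    kings = sum(1 for rank, _ in DECK if rank == "K")
    return Fraction(kings, len(DECK))


def empirical_p_king(n_draws, seed=0):
    """Empirical view: relative frequency of Kings in n_draws draws.

    Each draw is made from the full deck (assumption: with replacement).
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_draws) if rng.choice(DECK)[0] == "K")
    return hits / n_draws


if __name__ == "__main__":
    print("theoretical:", float(theoretical_p_king()))
    for n in (20, 1_000, 100_000):
        print(n, "draws:", empirical_p_king(n))
```

With few draws the relative frequency can be far from 4/52; with many draws it tends to settle close to it.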
Addition rule
• For a set of mutually exclusive events, A, B
– p(A or B) = p(A) + p(B)

• Generally used to determine the probability
that one of two events will occur in the
same attempt
• Ex: How likely is it that a Jack (V) or a King
(R) will be drawn from the deck in one
attempt?
[Venn diagram: All cards (52); V (4) and R (4) do not overlap]
– p(V) = 4/52
– p(R) = 4/52
– V, R are mutually exclusive, therefore
– p(V or R) = 4/52 + 4/52 = 8/52 = 0,15
Addition rule
• Case 1: A is a subset of B
• Ex: What is the probability that a
card drawn from the deck is a
Jack (V) or a “figure” card (F)?
[Venn diagram: All cards (52); V (4) lies inside F (12); V & F (4)]
– p(V) = 4/52
– p(F) = 12/52
– However, V and F are not mutually
exclusive:
– V is a subset of F: all Jacks are
“figure” cards
– We must avoid “double
counting” V & F
– p(V or F) = p(V) + p(F) – p(V & F)
= 4/52 + 12/52 – 4/52
= 12/52 = 3/13 = 0,23
Addition rule
• Case 2: When A and B intersect
• Ex: What is the probability of a card
drawn from the deck being either a
Jack (V) or a spade (E)?
[Venn diagram: All cards (52); V (4) and E (13) overlap; V & E (1)]
– p(V) = 4/52
– p(E) = 13/52
– However, V and E are not mutually
exclusive:
– One V is also an E
– p(V & E) = 1/52
– Again, we should avoid double
counting
– p(V or E) = p(V) + p(E) – p(V & E)
= 4/52 + 13/52 – 1/52
= 16/52 = 0,31
Addition rule – summary
When events are not mutually exclusive (subset or intersection)
– p(A & B) > 0
– We have to avoid “double counting”
– p(A or B) = p(A) + p(B) – p(A & B)

When events are mutually exclusive
– p(A & B) = 0 (!)

That is why:
p(A or B) = p(A) + p(B) – p(A & B)
= p(A) + p(B) – 0
= p(A) + p(B)
[Venn diagrams: subset (A inside B); intersection (A and B overlap); mutually exclusive (A and B separate)]
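The three cases of the addition rule can be checked by treating events as sets of cards. This is a minimal sketch: the deck encoding and the helper names `p` and `p_or` are assumptions for illustration, not notation from the lecture.

```python
from fractions import Fraction

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = {(rank, suit) for rank in RANKS for suit in SUITS}


def p(event):
    """Probability of an event (a set of cards) in a single draw."""
    return Fraction(len(event), len(DECK))


def p_or(a, b):
    """Addition rule: p(A or B) = p(A) + p(B) - p(A & B)."""
    return p(a) + p(b) - p(a & b)


jacks = {c for c in DECK if c[0] == "J"}                # V
kings = {c for c in DECK if c[0] == "K"}                # R
figures = {c for c in DECK if c[0] in {"J", "Q", "K"}}  # F
spades = {c for c in DECK if c[1] == "spades"}          # E

if __name__ == "__main__":
    print("mutually exclusive:", p_or(jacks, kings))  # 8/52
    print("subset:", p_or(jacks, figures))            # 12/52
    print("intersection:", p_or(jacks, spades))       # 16/52
```

Because `p(a & b)` is 0 for mutually exclusive events, the same function covers all three cases, and it always agrees with counting the union `a | b` directly.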
Multiplication rule

• For a set of independent events, A, B


– p(A & B) = p(A) * p(B)
• Generally used to determine the
probability that two events will both occur
in successive attempts
• Ex: How likely is a Jack (V) to be
drawn from a deck on the first attempt,
followed by a spade (E) on the second
attempt?
Multiplication rule
• To maintain the independence of
events, we must assume that
sampling occurs with replacement
• That is: Card 1 is returned to the deck
before card 2 is drawn
• (we will see sampling without
replacement next)
[Diagram: All cards (52) with V (4) and E (13); attempt T_1 gives V, attempt T_2 gives E]
– p(V1) = 4/52
– p(E2) = 13/52
– p(V1 & E2) = p(V1) * p(E2)
= 4/52 * 13/52 = 0,019
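The product above can be confirmed by listing every ordered pair of draws with replacement and counting the pairs "Jack first, spade second". A minimal sketch, assuming a plain tuple encoding of the deck (the variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]

# With replacement, the second draw again sees all 52 cards: 52 * 52 pairs.
pairs = list(product(DECK, repeat=2))

hits = [
    (first, second)
    for first, second in pairs
    if first[0] == "J" and second[1] == "spades"
]

p_enumerated = Fraction(len(hits), len(pairs))
p_rule = Fraction(4, 52) * Fraction(13, 52)  # multiplication rule

if __name__ == "__main__":
    print(p_enumerated, p_rule, round(float(p_rule), 3))
```

Both routes give 52/2704 = 1/52, which rounds to 0,019.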
Conditional probability
Probability that A will occur given that B
occurred
p(A “given” B) or p(A | B)

Ex: Of 100 voters, we have 30 Republicans,
50 women, and 10 Republican women.
What is p(R | W)?
[Venn diagram: All voters (100); R = 30, W = 50, W & R = 10]
Conditional probability means treating
the condition as the whole set, and
then calculating the probability as always
p(R | W) = p(W & R)/p(W)
= 10/50 = 0,20
p(W | R) = ?
p(W | R) = p(W & R)/p(R)
= 10/30 = 0,33
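The voter example can be written out in two equivalent ways: by shrinking the reference set to the condition, or by dividing probabilities. A minimal sketch; the constant and function names are assumptions for illustration.

```python
from fractions import Fraction

N_VOTERS = 100
N_REPUBLICANS = 30
N_WOMEN = 50
N_REPUBLICAN_WOMEN = 10


def conditional_from_counts(n_both, n_condition):
    """p(A | B) when the condition B is treated as the whole set."""
    return Fraction(n_both, n_condition)


def conditional_from_probabilities(p_both, p_condition):
    """p(A | B) = p(A & B) / p(B)."""
    return p_both / p_condition


p_r_given_w = conditional_from_counts(N_REPUBLICAN_WOMEN, N_WOMEN)
p_w_given_r = conditional_from_counts(N_REPUBLICAN_WOMEN, N_REPUBLICANS)

if __name__ == "__main__":
    print("p(R | W) =", p_r_given_w)  # 1/5
    print("p(W | R) =", p_w_given_r)  # 1/3
```

Note that p(R | W) and p(W | R) differ, because the two conditions (50 women, 30 Republicans) have different sizes.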
Conditional probability and independence

• When p(A | B) = p(A), condition B does not change


p(A).
– Ex: two coin tosses
• p(H2 | H1) = p(H2)
– Therefore, the events are “independent”

• Example: A die and then a coin are thrown. What is the
probability that the coin is Heads and the die shows
“3”?
– Test of independence:
• p(H) = p(H|3) (independent)
– So, we can use the multiplication rule:
• p(H & 3) = p(H) * p(3) = 1/2 * 1/6 = 1/12
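The independence test for the die-and-coin example can be carried out by enumerating the 12 equally likely outcomes. A minimal sketch, assuming a fair die and a fair coin; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All (die, coin) outcomes: 6 faces x 2 sides = 12 equally likely pairs.
OUTCOMES = list(product(range(1, 7), "HT"))


def p(event):
    """Probability of an event given as a predicate on one outcome."""
    favourable = [o for o in OUTCOMES if event(o)]
    return Fraction(len(favourable), len(OUTCOMES))


def is_heads(outcome):
    return outcome[1] == "H"


def is_three(outcome):
    return outcome[0] == 3


def heads_and_three(outcome):
    return is_heads(outcome) and is_three(outcome)


p_heads = p(is_heads)
p_three = p(is_three)
p_both = p(heads_and_three)
p_heads_given_three = p_both / p_three  # p(H | 3) = p(H & 3) / p(3)

if __name__ == "__main__":
    print("p(H) =", p_heads, " p(H | 3) =", p_heads_given_three)
    print("p(H & 3) =", p_both)
```

Since p(H | 3) equals p(H), the multiplication rule applies and the enumerated p(H & 3) matches 1/2 * 1/6.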
Conditional probability and independence
• When p(A | B) != p(A), condition B changes p(A).
– That is, the events are not independent

• Ex: One card is drawn from the deck, followed by


another card, without replacement. What is the
probability of the first card being a Jack (V) and the
second being a King (R)?
– Here, p(R | V) != p(R), since V is not replaced
– Therefore, we cannot use the simple multiplication rule
• p(V & R) != p(V) * p(R)
– We must consider the new situation:
• p(V1 & R2) = p(V1) * p(R2 | V1)
• p(V1) = 4/52
• p(R2 | V1) = 4/51 (NB: 51)
• p(V1 & R2) = 4/52 * 4/51 = 16/2652 = 0,006
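Sampling without replacement can be checked the same way, by listing every ordered pair of two different cards. A minimal sketch with an assumed tuple encoding of the deck:

```python
from fractions import Fraction
from itertools import permutations

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]

# Without replacement, the second card differs from the first: 52 * 51 pairs.
pairs = list(permutations(DECK, 2))

hits = [
    (first, second)
    for first, second in pairs
    if first[0] == "J" and second[0] == "K"
]

p_enumerated = Fraction(len(hits), len(pairs))
p_general_rule = Fraction(4, 52) * Fraction(4, 51)  # p(V1) * p(R2 | V1)
p_naive = Fraction(4, 52) * Fraction(4, 52)         # wrongly assumes independence

if __name__ == "__main__":
    print(p_enumerated, round(float(p_enumerated), 3))
    print("naive product differs:", p_naive != p_enumerated)
```

The enumeration agrees with p(V1) * p(R2 | V1) = 16/2652 and not with the simple product p(V) * p(R).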
Conditional probability – summary

When the events are not independent

P(B | A) != p(B)

Therefore, we must adjust the calculations because of this dependence

P(A & B) = P(A) * p(B | A)

When the events are independent

P(B | A) = p(B)

So we can reduce
P(A & B) = p(A) * p(B | A) to

P(A & B) = p(A) * p(B)


Putting it into practice
A deck has 52 cards (4 suits with 13 cards; 12 Figures and
40 Numbers). Solve the following problems:

1. What is the probability of drawing a Queen (D) or a
Seven (7) in one attempt?

2. What is the probability of drawing a Queen (D) or a
Diamond (O) card in one attempt?

3. What is the probability of drawing a Seven (7) or a
Number (N) card in one attempt?
Putting it into practice
A deck has 52 cards (4 suits with 13 cards; 12 Figures and
40 Numbers). Solve the following problems:

1. What is the probability of drawing a Queen (D) or a
Seven (7) in one attempt?
– P(D) = 4/52
– P(7) = 4/52
– D, 7 are mutually exclusive, therefore
– P(D or 7) = 4/52 + 4/52 = 0.15
Putting it into practice
A deck has 52 cards (4 suits with 13 cards; 12 Figures and
40 Numbers). Solve the following problems:

2. What is the probability of drawing a Queen (D) or a
Diamond (O) card in one attempt?
– P(D) = 4/52
– P(O) = 13/52
– D, O are not mutually exclusive
– P(D & O) = 1/52, therefore
– P(D or O) = P(D) + P(O) – P(D & O)
= 4/52 + 13/52 – 1/52 = 0.31
Putting it into practice
A deck has 52 cards (4 suits with 13 cards; 12 Figures and
40 Numbers). Solve the following problems:

3. What is the probability of drawing a Seven (7) or a
Number (N) card in one attempt?
– P(7) = 4/52
– P(N) = 40/52
– 7, N are not mutually exclusive
– P(7 & N) = 4/52, therefore
– P(7 or N) = P(7) + P(N) – P(7 & N)
= 4/52 + 40/52 – 4/52 = 0.77
Putting it into practice
A deck has 52 cards (4 suits with 13 cards; 12 Figures and
40 Numbers). Solve the following problems:

4. What is the probability of drawing a Queen (D) and then a
Seven (7) in consecutive attempts? (with replacement)

5. What is the probability of drawing a Queen (D) and then a
Diamond (O) card in consecutive attempts? (with replacement)

6. What is the probability of drawing a Seven (7) and then a
Number (N) card in consecutive attempts? (without
replacement!)
Putting it into practice
A deck has 52 cards (4 suits with 13 cards; 12 Figures and
40 Numbers). Solve the following problems:

4. What is the probability of drawing a Queen (D) and then a
Seven (7) in consecutive attempts? (with replacement)
– P(D) = 4/52
– P(7) = 4/52
– D, 7 are independent events (with
replacement), therefore
– P(D & 7) = 4/52 * 4/52 = 0.006
Putting it into practice
A deck has 52 cards (4 suits with 13 cards; 12 Figures and
40 Numbers). Solve the following problems:

5. What is the probability of drawing a Queen (D) and then a
Diamond (O) card in consecutive attempts?
(with replacement)
– P(D) = 4/52
– P(O) = 13/52
– D, O are independent events (with replacement), therefore
– P(D & O) = P(D) * P(O) = 4/52 * 13/52 = 0.02
Putting it into practice
A deck has 52 cards (4 suits with 13 cards; 12 Figures and
40 Numbers). Solve the following problems:

6. What is the probability of drawing a Seven (7) and then a
Number (N) card in consecutive attempts?
(without replacement!)
– P(7 & N) = P(7) * P(N | 7)
– P(7) = 4/52
– P(N | 7) = 39/51 (the Seven already drawn is itself a Number card)
– P(7 & N) = 4/52 * 39/51 = 0.06
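The six answers can be re-derived with exact fractions to check the rounding. This is a minimal sketch: the dictionary keys are hypothetical labels, and the counts (4 Queens, 4 Sevens, 13 Diamonds, 40 Number cards) come from the deck description in the exercises.

```python
from fractions import Fraction

DECK_SIZE = 52


def frac(n, total=DECK_SIZE):
    return Fraction(n, total)


answers = {
    # Addition rule: p(A or B) = p(A) + p(B) - p(A & B)
    "1: Queen or Seven": frac(4) + frac(4) - frac(0),
    "2: Queen or Diamond": frac(4) + frac(13) - frac(1),
    "3: Seven or Number": frac(4) + frac(40) - frac(4),
    # Multiplication rule with replacement (independent draws)
    "4: Queen then Seven": frac(4) * frac(4),
    "5: Queen then Diamond": frac(4) * frac(13),
    # Without replacement: p(7) * p(N | 7), with 39 Number cards left among 51
    "6: Seven then Number": frac(4) * frac(39, 51),
}

if __name__ == "__main__":
    for label, value in answers.items():
        print(f"{label}: {value} = {float(value):.4f}")
```

Printing four decimals makes it easy to compare with the two- and three-decimal values given in the solutions.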
Combining probability rules

• Problems of probability generally require


the use of various rules.

• Methods for organizing the information:


– Venn diagrams (circles)
• Become confusing with multiple events

– Tree diagrams
• Efficient method for organizing
information
• Helps to correctly apply the addition
and multiplication rules

• See the following example


Combining probability rules
An internet provider is interested in knowing how many people use chat
rooms. It has the following information about all internet users:
• 29% are 18-29 yo;
• 47% are 30-49 yo;
• 24% are 50+ yo.

They also interviewed some users of chat rooms:


• 47% of users between 18-29 years use chat rooms;
• 21% of users between 30-49 years;
• 7% among those over 50 years.

For marketing reasons, the provider needs to know: What is the probability
that a randomly chosen customer will use chat rooms?
Combining probability rules
An internet provider is interested in knowing how many people use chat rooms. It has the
following information about all the internet users: 29% are 18-29 yo; 47% are 30-49 yo; and
24% are 50 or more. They also interviewed some users of chat rooms: 47% of users between
18-29 yo use chat rooms; 21% between 30-49 yo; and 7% of those 50+ yo.

Tree diagram (first branch: age group; second branch: conditional probability, e.g. p(Chat | age1); leaf: joint probability)

Internet users
• 0,29 Age 18-29
– 0,47 Chat: 0,1363
– 0,53 No: 0,1537 (1 - 0,47 = 0,53)
• 0,47 Age 30-49
– 0,21 Chat: 0,0987
– 0,79 No: 0,3713
• 0,24 Age 50+
– 0,07 Chat: 0,0168
– 0,93 No: 0,2232

p(age1 & Chat)
= p(age1) * p(C | age1)
= 0,29 * 0,47 = 0,1363

p(Chat)
= p(a1&C) + p(a2&C) + p(a3&C)
= 0,1363 + 0,0987 + 0,0168
= 0,2518
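The tree calculation multiplies along each branch and then adds the "Chat" leaves. A minimal sketch of that procedure; the function name `total_probability` and the list-of-pairs input format are assumptions for illustration.

```python
from fractions import Fraction


def total_probability(branches):
    """Sum of p(group) * p(event | group) over all groups.

    branches: list of (p_group, p_event_given_group) pairs.
    The groups must be mutually exclusive and exhaustive.
    """
    if sum(p_group for p_group, _ in branches) != 1:
        raise ValueError("group probabilities must add up to 1")
    return sum(p_group * p_event for p_group, p_event in branches)


# (p(age group), p(Chat | age group)) for 18-29, 30-49 and 50+
chat_branches = [
    (Fraction(29, 100), Fraction(47, 100)),
    (Fraction(47, 100), Fraction(21, 100)),
    (Fraction(24, 100), Fraction(7, 100)),
]

p_chat = total_probability(chat_branches)

if __name__ == "__main__":
    print(p_chat, float(p_chat))
```

The multiplication rule is used within each branch (the age groups change the chance of chatting), and the addition rule across branches (the age groups are mutually exclusive).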
Putting it into practice
A researcher is interested in how many people have a certain gene. She
has the following information about the tested persons:
• 60% are from Region A
• 40% are from Region B

She also knows that:


• 30% of people in Region A have the gene
• 20% of people in Region B have the gene

What is the probability that a person randomly chosen from all those
tested has the gene?
Putting it into practice
A researcher is interested in how many people have a certain gene. She has the
following information about the tested persons: 60% are from Region A; 40% are from
Region B. She also knows that: 30% of people in Region A have the gene; 20% of
people in Region B have the gene. What is the probability that a person randomly
chosen from all those tested has the gene?

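Tree diagram (first branch: region; second branch: conditional probability, e.g. p(Gene | A); leaf: joint probability)

Tested people
• 0,60 Region A
– 0,30 Gene: 0,18
– 0,70 ~Gene: 0,42 (1 - 0,30 = 0,70)
• 0,40 Region B
– 0,20 Gene: 0,08
– 0,80 ~Gene: 0,32

p(A & Gene)
= p(A) * p(G | A)
= 0,60 * 0,30 = 0,18

p(Gene)
= p(A&G) + p(B&G)
= 0,18 + 0,08
= 0,26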
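A useful check on any tree diagram is that its leaves add up to 1. The sketch below builds all four leaves of the gene tree and runs that check; the dictionary layout and names are assumptions for illustration.

```python
from fractions import Fraction

# p(region) and p(Gene | region), taken from the exercise
regions = {
    "A": (Fraction(60, 100), Fraction(30, 100)),
    "B": (Fraction(40, 100), Fraction(20, 100)),
}

# Each leaf is a joint probability: p(region) * p(outcome | region)
leaves = {}
for name, (p_region, p_gene_given_region) in regions.items():
    leaves[(name, "Gene")] = p_region * p_gene_given_region
    leaves[(name, "~Gene")] = p_region * (1 - p_gene_given_region)

p_gene = sum(p for (_, outcome), p in leaves.items() if outcome == "Gene")

if __name__ == "__main__":
    for leaf, prob in leaves.items():
        print(leaf, float(prob))
    print("sum of leaves:", sum(leaves.values()))
    print("p(Gene) =", float(p_gene))
```

If the leaves do not sum to 1, a branch probability was copied or complemented incorrectly.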


Common confusions about chance
• 1. Thinking about “representativeness”
– Tendency to believe that the result of a random
process, even in the short run, must capture the
essence of randomness
• Ex: In a lottery based on the draw of 6 balls
from a set [1:10], which sequence is more
likely:
– A: 153264
– B: 123456
– At each draw, every remaining ball is equally
likely, whatever was drawn before
– each sequence =
1/10*1/9*1/8*1/7*1/6*1/5
• NB: This does not mean that all bets are
equally “good” in terms of prize amount
– Many people bet on sequences that
“seem random”
– They avoid those that “do not seem
random”
– Therefore, prizes are smaller for (shared)
sequences that do seem random
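The claim that sequences A and B are equally likely can be verified directly. A minimal sketch, assuming the balls are drawn one at a time without replacement and that order matters (as the product 1/10 * 1/9 * ... implies); the function name is hypothetical.

```python
import math
from fractions import Fraction


def p_ordered_sequence(sequence, n_balls=10):
    """Probability of drawing exactly this ordered sequence of distinct balls."""
    if len(set(sequence)) != len(sequence):
        raise ValueError("balls cannot repeat when drawing without replacement")
    if not all(1 <= ball <= n_balls for ball in sequence):
        raise ValueError("ball number out of range")
    prob = Fraction(1)
    for draw_index in range(len(sequence)):
        # One specific ball out of the balls still in the urn
        prob *= Fraction(1, n_balls - draw_index)
    return prob


p_a = p_ordered_sequence([1, 5, 3, 2, 6, 4])
p_b = p_ordered_sequence([1, 2, 3, 4, 5, 6])

if __name__ == "__main__":
    print(p_a, p_b, p_a == p_b)
    print("number of ordered sequences:", math.perm(10, 6))
```

The probability depends only on the length of the sequence, not on which numbers appear or how "random" they look.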
Common confusions about chance
• 2. The “Gambler’s fallacy”
– Belief that the outcome of a random process is
affected by previous results
• Chance as a “self-corrective” process
– Two forms:
• Outcome is more likely because it has not
happened recently
– “The law of averages is on my side" (!)
• Outcome is more likely because it happened
recently
– “I’m on a lucky streak!”
– NB: The popular “Law of Averages” is usually a
distortion of the “Law of Large Numbers” (we will see
it soon)
• It refers to the behavior of random processes in
the long term:
• A large sample of a random process tends to
reflect the inherent probabilities of this process
– Ex: as the number of coin tosses increases, we
expect the proportion of “Heads” to
approach the probability p(Heads).
– But at each attempt, p(Heads) does not change; it
is independent of past events.
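Both points can be seen in a simulation: the overall proportion of Heads approaches 0.5, while the proportion of Heads right after a run of Heads stays near 0.5 as well. A minimal sketch, assuming a fair simulated coin, a fixed seed and a streak length of 3 (all illustrative choices):

```python
import random


def simulate_tosses(n, seed=0):
    """Return a list of n booleans; True means Heads (fair coin assumed)."""
    rng = random.Random(seed)
    return [rng.random() < 0.5 for _ in range(n)]


def proportion_heads(tosses):
    return sum(tosses) / len(tosses)


def proportion_heads_after_streak(tosses, streak=3):
    """Proportion of Heads among tosses that follow `streak` Heads in a row."""
    followers = [
        tosses[i]
        for i in range(streak, len(tosses))
        if all(tosses[i - streak:i])
    ]
    return sum(followers) / len(followers)


if __name__ == "__main__":
    tosses = simulate_tosses(100_000)
    for n in (10, 100, 1_000, 100_000):
        print(n, "tosses:", proportion_heads(tosses[:n]))
    print("after 3 Heads in a row:", proportion_heads_after_streak(tosses))
```

A streak neither makes Tails "due" nor Heads "hot"; the long-run proportion settles because early results are diluted by later ones, not corrected by them.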
Quiz – AIDS Test
• An HIV/AIDS pharmacy test was
developed. When the test is
given to people who actually
have HIV (based on complete
laboratory tests), the probability
of a positive test is 0.90.

• The test is given to a randomly


chosen person, and the outcome
is positive.

• What is the probability of the


person actually having HIV?

• 90%?
Quiz – AIDS Test
• 3. Confusion with inverse conditionals
– Belief that p(B | A) = p(A | B), even
when p(A) != p(B)
• An HIV/AIDS test is designed.
p(test+ | AIDS) = 0,90. What is
p(AIDS | test+)?
– Many will say 0.90, confusing
inverse conditionals
– The correct answer depends on
p(AIDS) and p(test+)
– See example...
• Assume: population n = 1000
• AIDS is rare, so assume n(AIDS) =
10 and p(AIDS) = 0.01
• Assume: the test comes back positive for
most people tested
– n(Test+) = 800
– p(Test+) = 800/1000 = 0.80
– NB: p(A|B) = p(B|A) if p(A) = p(B)
• In this case, p(A) = 0.01; p(+) = 0.80
• Test+ does not reveal much
[Venn diagram: All people (n = 1000); AIDS: n = 10, p(A) = 10/1000 = 0,01; + TEST: n(+) = 800, p(+) = 800/1000 = 0,80; overlap: n(A & +) = 9, p(A & +) = 9/1000 = 0,009]
p(+ | AIDS)
= n(A & +)/n(AIDS)
= 9/10 = 0,90
p(AIDS | + TEST)
= n(AIDS & +)/n(+)
= 9 / 800
= 0,01125
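The two conditionals can be computed side by side from the assumed counts, which makes the asymmetry explicit. A minimal sketch: the constant names are illustrative, and the last line uses the standard relation p(A | +) = p(+ | A) * p(A) / p(+) (Bayes' rule), which the slides express through counts.

```python
from fractions import Fraction

# Hypothetical population from the example
N_PEOPLE = 1000
N_AIDS = 10
N_POSITIVE = 800
N_AIDS_AND_POSITIVE = 9

p_aids = Fraction(N_AIDS, N_PEOPLE)
p_positive = Fraction(N_POSITIVE, N_PEOPLE)

# Same joint count, different reference set
p_positive_given_aids = Fraction(N_AIDS_AND_POSITIVE, N_AIDS)
p_aids_given_positive = Fraction(N_AIDS_AND_POSITIVE, N_POSITIVE)

# Inverting the conditional requires both base rates
p_inverted = p_positive_given_aids * p_aids / p_positive

if __name__ == "__main__":
    print("p(+ | AIDS) =", float(p_positive_given_aids))
    print("p(AIDS | +) =", float(p_aids_given_positive))
    print("via inversion:", float(p_inverted))
```

The two conditionals would only coincide if p(AIDS) and p(+) were equal; here they differ by a factor of 80, and so do the conditionals.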
Chapters for the next lecture
• Larson & Farber: Chapters 3 & 4
