
Introduction to Probability and Bayes Theorem
Events, Frequencies, Probability & Probability Distributions

Courtesy: Dr. Saketh Athkuri


Probability
Terminology

1. What is an event?
2. How many times will the event occur for a given number of trials?
3. What is the probability?
4. Probability vs Statistics
Terminology

1. What is an event? Outcome of an experiment

H T T T H T H
H T H H T H H
What type of events are these?

H H H H T T T
H H H H T T T
What type of events are these?
What is the P(T=H)?
Terminology

3. What is the probability?

H H H H T T T
H H H H T T T
What is the P(T=H)?

Count the number of times event H occurs in all the possible outcomes.
P(T=H)?
P(T=HH)?
P(T=HTH)?
P(T=HHHH)?
P(T=TTHH)?
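The counting rule can be sketched in Python. This is only an illustration: it assumes a fair coin, and it reads P(T=HTH) as the chance of seeing exactly that sequence in as many tosses as the sequence is long. The helper name `sequence_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def sequence_probability(target: str) -> Fraction:
    """Probability of one exact H/T sequence, assuming a fair coin."""
    # For a fair coin every sequence of len(target) tosses is equally likely,
    # so we list all of them and count the ones that match the target.
    outcomes = ["".join(tosses) for tosses in product("HT", repeat=len(target))]
    favourable = sum(1 for outcome in outcomes if outcome == target)
    return Fraction(favourable, len(outcomes))

for seq in ["H", "HH", "HTH", "HHHH", "TTHH"]:
    print(seq, sequence_probability(seq))
```

Under these assumptions each sequence of length n has probability 1/2^n, so the order of heads and tails does not change the answer.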
Terminology

What is the mortality rate for Covid-19?
Terminology

What is the probability of forming the following words with their letters?
• Spain
• India
• Antarctica
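One way to read the word question is sketched below. It assumes, as an interpretation not stated on the slide, that the letters of each word are shuffled uniformly at random and that upper and lower case are treated alike. The function name `word_probability` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def word_probability(word: str) -> Fraction:
    """Chance that a random ordering of the word's letters spells the word."""
    letters = word.lower()
    # Swapping identical letters leaves the word unchanged, so the number of
    # favourable orderings is the product of the factorials of the letter counts.
    favourable = 1
    for count in Counter(letters).values():
        favourable *= factorial(count)
    # All len(letters)! orderings of the letter tiles are equally likely.
    return Fraction(favourable, factorial(len(letters)))

for w in ["Spain", "India", "Antarctica"]:
    print(w, word_probability(w))
```

Repeated letters (the two i's in India, the repeated a, t and c in Antarctica) increase the count of favourable orderings, which is why the answer is not simply 1/n!.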
Terminology

Use of probability in real life:
Insurance companies:
P(Sick|(20-30))=0.04
P(Sick|(60-70))=0.35
Terminology

4. Probability vs Statistics

Probability: Predict the likelihood of a future event
Statistics: Analyse the past events

Probability: What will happen in an ideal world?
Statistics: How ideal is the world?

Probability is the basis of inferential statistics.
Probability basics and rules
Sample space, event outcome
Probability of Event A happening in sample space S
[Figure: Venn diagram of events A and B inside sample space S]
S is: ?
P(S)=?
P(A)=?
P(B)=?
P(A and B)=?
P(A or B)=?
Probability basics and rules
Sample space, event outcome
P(S) = 1
P(A) = ?
P(B) = ?
P(A and B) = ?
P(A or B) = ?
[Figure: Venn diagrams of events A and B, shading in turn the regions for P(A), P(B), P(A and B) and P(A or B)]
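The regions in the Venn diagram can be mimicked with Python sets. The die-roll sample space and the two events below are hypothetical choices made only for illustration; the outcomes are assumed equally likely.

```python
from fractions import Fraction

# Hypothetical sample space: the six faces of a fair die.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # event A: the roll is even
B = {4, 5, 6}  # event B: the roll is greater than 3

def prob(event: set) -> Fraction:
    """Probability of an event when every outcome in S is equally likely."""
    return Fraction(len(event & S), len(S))

p_s = prob(S)            # whole sample space
p_a = prob(A)
p_b = prob(B)
p_a_and_b = prob(A & B)  # intersection: outcomes in both A and B
p_a_or_b = prob(A | B)   # union: outcomes in A, in B, or in both

print(p_s, p_a, p_b, p_a_and_b, p_a_or_b)
```

In this example P(A or B) equals P(A) + P(B) - P(A and B), because the overlap would otherwise be counted twice.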
Find the probabilities?

                   Age
COVID Positive?    Young     Middle-aged    Old    Total
No                 10,503    27,368         259    38,130
Yes                3,586     4,851          120    8,557
Total              14,089    32,219         379    46,687
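A minimal sketch of how marginal, joint and conditional probabilities can be read off the table, using the frequentist approach of dividing counts. The dictionary layout and function names are assumptions made for this example.

```python
from fractions import Fraction

# Counts copied from the table: rows are COVID status, columns are age groups.
counts = {
    "No":  {"Young": 10503, "Middle-aged": 27368, "Old": 259},
    "Yes": {"Young": 3586,  "Middle-aged": 4851,  "Old": 120},
}
total = sum(sum(row.values()) for row in counts.values())

def p_status(status: str) -> Fraction:
    """Marginal probability of a COVID status (row total / grand total)."""
    return Fraction(sum(counts[status].values()), total)

def p_age(age: str) -> Fraction:
    """Marginal probability of an age group (column total / grand total)."""
    return Fraction(sum(row[age] for row in counts.values()), total)

def p_joint(status: str, age: str) -> Fraction:
    """Joint probability of a status and an age group (cell / grand total)."""
    return Fraction(counts[status][age], total)

def p_status_given_age(status: str, age: str) -> Fraction:
    """Conditional probability P(status | age) = P(status and age) / P(age)."""
    return p_joint(status, age) / p_age(age)

print(float(p_status("Yes")))
print(float(p_joint("Yes", "Young")))
print(float(p_status_given_age("Yes", "Old")))
```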
Conditional probability
Sample space, event outcome
[Figure: Venn diagram of events A and B. P(A|B) is the overlap of A and B as a fraction of B; P(B|A) is the same overlap as a fraction of A.]

$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$, $P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$

$P(B) \times P(A \mid B) = P(A) \times P(B \mid A) = P(A \cap B)$
Bayes theorem

Bayes' Theorem states that the conditional probability of an event (A), given that another event (B) has occurred, is equal to the likelihood of the second event (B) given that the first event (A) has occurred, multiplied by the probability of the first event (A), and divided by the probability of the second event (B).
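Written as a formula, and assuming P(B) > 0, the statement reads:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```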
Bayes theorem

Event A1: P(A1)
Event A2: P(A2)
Event A3: P(A3) = 1 - P(A1) - P(A2)
Event B: P(B)=?

A1, A2, A3 form my sample space S. Event B is a subset of S.

Event (A1 & B): P(A1) P(B|A1)
Event (A2 & B): P(A2) P(B|A2)
Event (A3 & B): P(A3) P(B|A3)

P(A1|B)=?
Bayes theorem
[Figure: sample space split into events A1, A2 and A3, with event B overlapping all three]

P(A1 and B) = P(B|A1) P(A1)
P(A2 and B) = P(B|A2) P(A2)
P(A3 and B) = P(B|A3) P(A3)

P(B) = P(B|A1) P(A1) + P(B|A2) P(A2) + P(B|A3) P(A3)

$P(A_1 \mid B) = \dfrac{P(B \mid A_1)\, P(A_1)}{P(B \mid A_1)\, P(A_1) + P(B \mid A_2)\, P(A_2) + P(B \mid A_3)\, P(A_3)}$
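The three-event build-up can be sketched as a small function. The numbers passed in below are hypothetical and chosen only to exercise the code; the function name `posterior` is likewise an assumption.

```python
def posterior(priors, likelihoods):
    """Return P(A_i | B) for events A_1..A_n that partition the sample space.

    priors[i] is P(A_i) and likelihoods[i] is P(B | A_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors of a partition must sum to 1")
    # Joint probabilities P(A_i and B) = P(B | A_i) * P(A_i).
    joints = [p * l for p, l in zip(priors, likelihoods)]
    # P(B) is the sum of the joints over the whole partition.
    evidence = sum(joints)
    return [j / evidence for j in joints]

# Hypothetical values for P(A1), P(A2), P(A3) and P(B|A1), P(B|A2), P(B|A3).
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.8]
post = posterior(priors, likelihoods)
print(post)
```

The returned posteriors always sum to 1, since B must have occurred together with exactly one of A1, A2 or A3.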
Bayes theorem
[Figure: event A and event B within the sample space]
P(A)
P(B|A)
P(B|A^C)
P(A|B)=?

Bayes theorem
[Figure: event "cancer" and event "symptoms" within the sample space]
P(cancer)
P(symptoms|cancer)
P(symptoms|no cancer)
P(cancer|symptoms)=?
Example
Bayes Theorem allows you to find reverse probabilities, and to revise
original probabilities based on new information.

Case – Clinical trials


Epidemiologists claim that probability of cancer among Caucasian women
in their mid-50s is 0.005. An established test identified people who had
cancer and those that were healthy. A new mammography test in clinical
trials has a probability of 0.85 for detecting cancer correctly. In women
without cancer, it has a chance of 0.925 for a negative result. If a 55-year-
old Caucasian woman tests positive for cancer, what is the probability that
she in fact has cancer?
Solution
Case – Clinical trials
P(Cancer) = 0.005 (aka Prior Probability)
P(Test positive | Cancer) = 0.85 (aka Likelihood)
P(Test negative | No cancer) = 0.925, so P(Test positive | No cancer) = 1 - 0.925 = 0.075
P(Cancer | Test positive) = ? (aka Posterior or Revised Probability)
P(Test Positive) aka Evidence
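$P(\text{Cancer} \mid \text{Test}+) = \dfrac{P(\text{Cancer})\, P(\text{Test}+ \mid \text{Cancer})}{P(\text{Test}+ \mid \text{Cancer})\, P(\text{Cancer}) + P(\text{Test}+ \mid \text{No cancer})\, P(\text{No cancer})}$

$= \dfrac{0.005 \times 0.85}{0.85 \times 0.005 + 0.075 \times 0.995} = \dfrac{0.00425}{0.078875} = 0.054$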
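The arithmetic of this case can be checked with a short sketch. The parameter names `sensitivity` (for P(Test positive | Cancer)) and `specificity` (for P(Test negative | No cancer)) are standard terms introduced here, not taken from the slides, and the function name is hypothetical.

```python
def posterior_positive(prior: float, sensitivity: float, specificity: float) -> float:
    """P(condition | positive test) for a two-outcome test."""
    # P(Test positive | No condition) is the complement of the specificity.
    false_positive_rate = 1.0 - specificity
    # Evidence P(Test positive): positives from both groups.
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence

p = posterior_positive(prior=0.005, sensitivity=0.85, specificity=0.925)
print(round(p, 3))  # 0.054
```

Even with a positive result the posterior stays near 5%, because the prior of 0.005 is so small that false positives among healthy women outnumber true positives.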
Example
Spam Assassin works by having users train the system. It
looks for patterns in the words in emails marked as spam
by the user. For example, it may have learned that the
word “free” appears in 20% of the mails marked as spam,
i.e., P(Free | Spam) = 0.20. Assuming 0.1% of non-spam
mail includes the word “free” and 50% of all mails received
by the user are spam, find the probability that a mail is
spam if the word “free” appears in it.
Solution
Case – Spam filtering
P(Spam) = 0.50
P(Free | Spam) = 0.20
P(Free | No spam) = 0.001
P(Spam | Free) = ?

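$P(\text{Spam} \mid \text{Free}) = \dfrac{P(\text{Spam})\, P(\text{Free} \mid \text{Spam})}{P(\text{Free} \mid \text{Spam})\, P(\text{Spam}) + P(\text{Free} \mid \text{No spam})\, P(\text{No spam})}$

$= \dfrac{0.5 \times 0.2}{0.2 \times 0.5 + 0.001 \times 0.5} = \dfrac{0.1}{0.1005} = 0.995$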
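The same answer can be reached by counting mails instead of multiplying probabilities. The sketch below assumes a hypothetical inbox of 100,000 mails; the inbox size and variable names are illustrative only.

```python
# Hypothetical inbox size, chosen so that every count is a whole number.
total_mails = 100_000

spam = total_mails * 50 // 100    # 50% of all mails are spam
not_spam = total_mails - spam

spam_with_free = spam * 20 // 100          # "free" appears in 20% of spam
not_spam_with_free = not_spam * 1 // 1000  # and in 0.1% of non-spam

# Among all mails containing "free", the fraction that are spam.
p_spam_given_free = spam_with_free / (spam_with_free + not_spam_with_free)
print(round(p_spam_given_free, 3))  # 0.995
```

Counting this way makes the denominator concrete: it is simply every mail that contains the word "free", whether spam or not.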
Random variable and Probability distributions
Random variable
A random variable is a mathematical formalization of a quantity or object
which depends on random events.

Examples:
• Random process
• Traffic signal
Probability Distribution
Probability distribution
A probability distribution is the mathematical function that gives the probabilities of occurrence of the different possible outcomes of an experiment.

Points scored per game    0       1       2       3        4       5       6
Frequency, f              1       4       6       12       5       1       1
Probability               1/30    4/30    6/30    12/30    5/30    1/30    1/30

Recall the Frequentist (empirical) approach of assigning probabilities.
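The empirical approach amounts to dividing each frequency by the total number of games. A minimal sketch using the frequencies from the table (variable names are illustrative):

```python
from fractions import Fraction

points = [0, 1, 2, 3, 4, 5, 6]
frequency = [1, 4, 6, 12, 5, 1, 1]

n_games = sum(frequency)  # total number of observed games

# Empirical probability of each score: its frequency over the total.
probability = {x: Fraction(f, n_games) for x, f in zip(points, frequency)}

for x, p in probability.items():
    print(x, p)
```

Because every game falls into exactly one score category, the probabilities add up to 1.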
Probability distribution
[Figure: table of individual ages with columns Age, Frequency and Probability, shown next to a chart of Probability (y-axis, 0 to 0.04) against Age (x-axis)]
Probability distribution
[Figure: table with columns Age, Frequency and Probability, shown next to a chart of Probability Density (y-axis, 0 to 0.04) against Age (x-axis, 0 to 102)]
Read this
[Figure: Probability Density (y-axis, 0 to 0.04) plotted against age (x-axis, 0 to 102)]
What area will indicate the age range of 30-59 years?
Expectation Value
No. of Coughs During Exhaled
Breath Collection in 30 min     Frequency    Probability
0                               4            0.19
1                               1            0.05
2                               3            0.14
3                               2            0.1
4                               2            0.1
5                               1            0.05
6                               1            0.05
8                               1            0.05
11                              1            0.05
18                              1            0.05
24                              1            0.05
69                              1            0.05
88                              1            0.05
99                              1            0.05

How to compute the average/mean number of coughs exhaled in 30 mins? (also called expected value E(x))

$E(x) = \sum_{i=1}^{n} x_i P(x_i)$
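A minimal sketch of the expected-value sum for the cough data. It uses frequency divided by the total as the probability rather than the rounded values in the table (the rounded column does not add up to exactly 1); that choice is an assumption made here.

```python
coughs = [0, 1, 2, 3, 4, 5, 6, 8, 11, 18, 24, 69, 88, 99]
frequency = [4, 1, 3, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1]

n = sum(frequency)  # number of collections observed

# E(x) = sum of x_i * P(x_i), with P(x_i) = frequency_i / n.
expected = sum(x * f / n for x, f in zip(coughs, frequency))
print(round(expected, 2))  # 16.62
```

The mean is pulled well above the typical values by the few collections with 69, 88 and 99 coughs.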
Distributions
Discrete vs Continuous

Discrete Distributions
• Probability Mass Function (PMF)
• Speaks of the probability that X takes a specific value, $P(X = x) = p(x)$.
• Non-negative for all real values of X.
• Sum of the PMF over all values of X is 1, $\sum_{x \in X} p(x) = 1$.
• Countable

Continuous Distributions
• Probability Density Function (PDF)
• Speaks of the probability that X lies in an interval, $P(a \le X \le b) = \int_{a}^{b} f(x)\, dx$.
• Non-negative for all possible intervals in X.
• Integral of the PDF over all values of X is 1, $\int_{-\infty}^{\infty} f(x)\, dx = 1$.
• Measurable
Probability distribution function

Q. If the random variable X has PDF of


! #
𝑓 𝑥 = 𝑥 for 0 < 𝑥 < 2
"
and 0 elsewhere.

1. Find P(x=2)
2. Find P(1<x<2)
3. Find E(x)
4. Find Variance
Types of distribution
• Normal distribution

• Bernoulli distribution

• Binomial distribution
Normal distribution
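• Probability density function: $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\dfrac{(x - \mu)^2}{2\sigma^2}\right)$

• Normalized data: $z = \dfrac{x - \mu}{\sigma}$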

Taken from wiki
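As a minimal sketch, the standard normal density $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$ and the z-score $z = (x - \mu)/\sigma$ can be coded as below. Reading "normalized data" as z-scores is an assumption, and the function names are hypothetical.

```python
import math

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of a normal distribution with mean mu and std deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def standardize(values):
    """Convert data to z-scores using the population mean and std deviation."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sigma for v in values]

print(normal_pdf(0.0))             # peak height of the standard normal
print(standardize([2.0, 4.0, 6.0]))
```

After standardizing, the data has mean 0 and standard deviation 1, so different datasets can be compared on the same scale.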


How to understand them?
Measure, Formula, Description

Mean ($\mu$): $E(X)$
Measures the weighted center of the distribution of X

Variance ($\sigma^2$): $E\left[(X - \mu)^2\right]$
Measures the spread of the distribution of X about the mean

Skewness: $E\left[\left(\dfrac{X - \mu}{\sigma}\right)^3\right]$
Measures asymmetry of the distribution of X

Kurtosis: $E\left[\left(\dfrac{X - \mu}{\sigma}\right)^4\right] - 3$
Measures the distribution of X at the tails and is useful in identification of outliers
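The four measures can be estimated from a sample by replacing each expectation with an average. This sketch uses population-style averages (dividing by n) and a small hypothetical dataset; the function name `moments` is an assumption.

```python
import math

def moments(data):
    """Return (mean, variance, skewness, excess kurtosis) of a sample."""
    n = len(data)
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / n
    sd = math.sqrt(variance)
    # Standardized third and fourth moments; kurtosis has 3 subtracted.
    skewness = sum(((x - mean) / sd) ** 3 for x in data) / n
    kurtosis = sum(((x - mean) / sd) ** 4 for x in data) / n - 3
    return mean, variance, skewness, kurtosis

# Hypothetical, perfectly symmetric data.
mean, variance, skewness, kurtosis = moments([1, 2, 3, 4, 5])
print(mean, variance, skewness, kurtosis)
```

For this symmetric dataset the skewness is zero and the kurtosis is negative, which reflects its flat shape with little weight in the tails.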
Skewness
Kurtosis
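• Measured as excess kurtosis (the fourth standardized moment minus 3), the kurtosis of a Normal Distribution is 0

• Negative kurtosis tells that the distribution has less data in the tails than a normal distribution

• Positive kurtosis tells that the distribution has more data in the tails than a normal distribution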
Binomial distribution
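• It is a discrete distribution
• It describes a set of n independent trials in which only two outcomes are possible, with probabilities p and q = 1 - p

$P(X = k; n, p) = \binom{n}{k} p^{k} (1 - p)^{n - k}$

$E(X) = np; \quad \mathrm{Var}(X) = npq$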
Example
In a multiple-choice test, each question has 4 options and only
1 is the correct answer. If the student guesses randomly, what is
the probability that in a test of 50 questions, he gets
a. Exactly 20 questions correct
b. Exactly 30 questions correct
c. Exactly 12 questions correct
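A minimal sketch of this example using the binomial formula $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. It assumes each guess is independent with p = 1/4 (one correct option out of four); the function name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n_questions = 50
p_correct = 0.25  # one correct option out of four, guessed at random

for k in (20, 30, 12):
    print(k, binomial_pmf(k, n_questions, p_correct))
```

With E(X) = np = 12.5, getting 12 questions right is far more likely than getting 20, and 30 correct answers by guessing is extremely unlikely.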
Example
In a multiple-choice test, each question has 4 options and
2 correct answers. If the student guesses randomly, what is
the probability that in a test of 50 questions, he gets
a. Exactly 20 questions correct
b. Exactly 30 questions correct
c. Exactly 12 questions correct
