Module 5:
Probability
Faculty: Santosh Chapaneri
• In ML, uncertainty arises in different forms, e.g. arriving at the best
prediction of the future given past data, arriving at the best model based on
certain data, arriving at the confidence level while predicting a future
outcome based on past data, etc.
• In ML, we train the system using training data and we expect the ML algorithm
santosh.chapaneri@ieee.org
Probability – Properties
Probability – Properties
(marginal)
Probability – Conditional
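• P(A | B) = the probability of event A given that event B has happened, i.e. the
probability of A conditional on the occurrence of event B.
• P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0.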
• Two events are called independent if and only if P(A ∩ B) = P(A) P(B) (or
equivalently, P(A | B) = P(A) whenever P(B) > 0).
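The definitions of conditional probability and independence can be checked by brute-force enumeration on a small sample space. The sketch below is an illustration only: the two-dice setup and the events A and B are assumptions chosen for the example, not taken from the slides.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: all 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate over outcomes."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def A(o):
    return o[0] % 2 == 0        # first die shows an even number

def B(o):
    return o[0] + o[1] == 7     # the two dice sum to 7

p_a = prob(A)
p_b = prob(B)
p_ab = prob(lambda o: A(o) and B(o))   # P(A and B)

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = p_ab / p_b

print("P(A | B) =", p_a_given_b)
print("Independent:", p_ab == p_a * p_b)
```

For these two events P(A | B) equals P(A), so knowing that B occurred does not change the probability of A.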
Probability – Conditional
• Q1: In a toy-making shop, the automated machine produces few
• Prob. of false alarm = 0.1 (marked as spam even if email is not a spam)
• Prior knowledge that only 0.4% of the total emails received are spam
• Let x be the event of marked as spam if the sender name has the words
‘mass’ or ‘bulk’ and y be the event of some mail really being spam.
• Compute p(y = 1 | x = 1)
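Assuming Bayes' theorem is the intended route, the requested posterior can be written as below. The detection probability p(x = 1 | y = 1) is not stated in the bullets above, so it is left symbolic here rather than guessed.

```latex
p(y=1 \mid x=1)
  = \frac{p(x=1 \mid y=1)\, p(y=1)}
         {p(x=1 \mid y=1)\, p(y=1) + p(x=1 \mid y=0)\, p(y=0)},
\qquad p(y=1) = 0.004,\quad p(y=0) = 0.996,\quad p(x=1 \mid y=0) = 0.1
```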
test for breast cancer called mammogram. Suppose you are told
the test has a sensitivity of 80%, the prior probability of having
breast cancer is 0.4%, and the false positive rate is 10%. If the test is
positive, what is the probability that she has cancer?
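One way to work this out is with Bayes' theorem, using the three numbers given in the question. The sketch below is a minimal illustration; the function name `posterior` and its parameter names are assumptions introduced here.

```python
def posterior(sensitivity, prior, false_positive_rate):
    """P(disease | positive test) via Bayes' theorem."""
    # P(positive and disease) = P(positive | disease) * P(disease)
    true_pos = sensitivity * prior
    # P(positive and no disease) = P(positive | no disease) * P(no disease)
    false_pos = false_positive_rate * (1.0 - prior)
    # Normalise by the total probability of a positive test
    return true_pos / (true_pos + false_pos)

# Values from the question: sensitivity 80%, prior 0.4%, false positive rate 10%
p_cancer_given_positive = posterior(0.80, 0.004, 0.10)
print(f"P(cancer | positive) = {p_cancer_given_positive:.4f}")
```

The posterior comes out at roughly 3%, far below the 80% sensitivity, because the low prior means most positive tests are false positives.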
number of coins that come up heads. Here, the elements are 10-length
sequences of H & T.
• In practice, we usually do not care about the probability of obtaining any particular
sequence of heads and tails. Instead we usually care about real-valued functions
of outcomes, such as the number of heads that appear among our 10 tosses, or
the length of the longest run of tails.
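The two functions of outcomes mentioned above can be sketched as ordinary functions on 10-length sequences of H and T. The code below enumerates all sequences and tabulates the distribution of the number of heads; it assumes fair, independent tosses (so every sequence is equally likely), and the helper names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 2**10 = 1024 sequences of H and T (assumed equally likely).
sequences = list(product("HT", repeat=10))

def num_heads(seq):
    """Random variable: number of heads in the sequence."""
    return seq.count("H")

def longest_tail_run(seq):
    """Random variable: length of the longest run of consecutive tails."""
    best = run = 0
    for toss in seq:
        run = run + 1 if toss == "T" else 0
        best = max(best, run)
    return best

# Distribution of the number of heads: P(X = k) for k = 0..10
counts = Counter(num_heads(s) for s in sequences)
pmf_heads = {k: Fraction(c, len(sequences)) for k, c in sorted(counts.items())}

for k, p in pmf_heads.items():
    print(k, p)
```

Each random variable maps an outcome to a real number, and its distribution follows from counting the outcomes that map to each value.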
Given that 10 coins are tossed, X(w) can take only a finite number of values, so it
is known as a discrete random variable.
• Here, the probability of the set associated with a random variable X taking on
• Suppose that X(w) is a random variable indicating the amount of time it takes for
• We denote the probability that X takes on a value between two real constants a
the values that g(x) can take on for different values of x, where the weights are
given by p(x) or f(x).
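The weighted-average reading of expectation can be sketched for the discrete case, where the weights are the PMF values p(x). The fair six-sided die below is a hypothetical example chosen for illustration; the continuous case with a PDF f(x) would replace the sum with an integral and is not shown.

```python
from fractions import Fraction

# Hypothetical PMF: a fair six-sided die, p(x) = 1/6 for x = 1..6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(g, pmf):
    """E[g(X)] = sum over x of g(x) * p(x), a weighted average of g(x)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(lambda x: x, pmf)          # E[X]
mean_sq = expectation(lambda x: x * x, pmf)   # E[X^2]
variance = mean_sq - mean ** 2                # Var(X) = E[X^2] - (E[X])^2

print("E[X]   =", mean)
print("E[X^2] =", mean_sq)
print("Var(X) =", variance)
```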
X with PDF
(Marginal CDF)
(Marginal PMF)
(Joint PDF)
(Marginal PDF)
Sampling Distributions
• An important application of statistics in machine learning is how to draw a
• E.g. based on the malignancy sample test results of some random tumor
Sampling Distributions
• Population is a finite set of objects being investigated.
way that every member of the population has the same chance of being
chosen.
population if each object chosen is returned to the population before the next
object is chosen, then it is called sampling with replacement. In this case,
repetitions are allowed. For a sample of size n, the number of possible ordered
samples = N^n, since each object can be repeated. Prob. of each sample = 1/N^n
Sampling Distributions
• Sampling with Replacement: choose a random sample of 2 patients from a
chosen to the population before choosing the next object, then the unordered
subset is called a sample without replacement. The number of such samples of
size n that can be drawn from a population of size N is C(N, n) = N! / (n! (N − n)!).
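The two counting rules can be checked by enumeration on a small population. In the sketch below the four patient labels and the sample size n = 2 are assumptions for illustration: ordered samples with replacement are generated with `itertools.product`, and unordered samples without replacement with `itertools.combinations`.

```python
from itertools import combinations, product
from math import comb

# Hypothetical population of N = 4 patients; sample size n = 2
population = ["P1", "P2", "P3", "P4"]
n = 2
N = len(population)

# With replacement: ordered samples, repetitions allowed -> N**n possibilities
with_replacement = list(product(population, repeat=n))

# Without replacement: unordered subsets, no repetition -> C(N, n) possibilities
without_replacement = list(combinations(population, n))

print(len(with_replacement), N ** n)
print(len(without_replacement), comb(N, n))

# Under simple random sampling each ordered sample with replacement
# has probability 1 / N**n
prob_each = 1 / N ** n
print(prob_each)
```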
• When samples are drawn with replacement, these values are independent of
• When samples are drawn without replacement, these values are not
Hypothesis Testing
• A hypothesis is a statement about one or more populations.
• It is usually concerned with the parameters of the population. E.g. the hospital
administrator may want to test the hypothesis that the average length of stay
of patients admitted to the hospital is 5 days.
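As one possible illustration of the hospital example, the sketch below runs a two-sided one-sample t-test of the null hypothesis that the mean length of stay is 5 days. The slides do not say which test is intended, so the choice of a t-test is an assumption, and the `stays` data are made up for the example.

```python
import math
import statistics

# Hypothetical sample of lengths of stay (days); not real hospital data
stays = [6, 7, 5, 8, 6, 7, 9, 8]
mu0 = 5.0                     # hypothesised population mean (H0: mu = 5)

n = len(stays)
sample_mean = statistics.mean(stays)
sample_sd = statistics.stdev(stays)   # sample standard deviation (n - 1 divisor)

# One-sample t statistic: (sample mean - mu0) / (s / sqrt(n))
t_stat = (sample_mean - mu0) / (sample_sd / math.sqrt(n))

# Two-sided 5% critical value of Student's t with n - 1 = 7 degrees of freedom
T_CRIT = 2.365
reject_h0 = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.3f}, reject H0 at the 5% level: {reject_h0}")
```

With this made-up sample the statistic exceeds the critical value, so the hypothesis of a 5-day average would be rejected at the 5% level; different data could lead to the opposite conclusion.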
Hypothesis Testing
this.