
MAXIMUM LIKELIHOOD HYPOTHESES FOR PREDICTING PROBABILITIES

• Consider the setting in which we wish to learn a non-deterministic (probabilistic) function f: X→{0,1}, which has two discrete output values.
(or)
• Learn a neural network whose output is the probability that f(x)=1,
i.e., learn a target function f': X→[0,1] such that
f'(x) = P(f(x)=1)
e.g., instance space X – medical patients described by their symptoms;
target function f(x)=1 if the patient survives, 0 otherwise.

NOTE: Here f(x) is probabilistic: among patients with the same symptoms x, 92% may survive, i.e., f'(x) = 0.92 and the observed f(x) = 1, while the remaining 8% may not survive, i.e., 1 − f'(x) = 0.08 and the observed f(x) = 0.
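Below is a minimal Python sketch of this probabilistic setting (the 0.92 figure comes from the note above; the helper name sample_outcome and the sample size are illustrative assumptions): the 0/1 labels d we actually observe are single draws from the distribution that f' describes.

```python
import random

rng = random.Random(0)

def sample_outcome(f_prime_x):
    """Draw one observed label d in {0, 1}: d = 1 with probability f'(x)."""
    return 1 if rng.random() < f_prime_x else 0

# Hypothetical patients whose symptoms give f'(x) = 0.92: over many
# encounters, about 92% of observed labels are 1 (survives).
draws = [sample_outcome(0.92) for _ in range(10000)]
print(sum(draws) / len(draws))  # approximately 0.92
```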
• Training data D = {<x_1,d_1>, ..., <x_m,d_m>}
• Let x_i and d_i be random variables, with each training example drawn independently of the others.
• Because the examples are drawn independently given h, P(D|h) factors into a product:

P(D|h) = ∏_{i=1}^{m} P(x_i, d_i | h)        …… (1)
• Assume that the probability of encountering any particular instance x_i is independent of the hypothesis h.
• Let's understand this with an example:
• The probability that our training set contains a particular patient x_i does not depend on our hypothesis about survival rates.
• Applying the product rule (and P(x_i | h) = P(x_i) by the independence assumption above):

P(D|h) = ∏_{i=1}^{m} P(d_i | h, x_i) P(x_i)        …… (2)

• The probability of observing d_i given h and x_i is h(x_i) when d_i = 1, and (1 − h(x_i)) when d_i = 0; both cases can be written as one expression:

P(d_i | h, x_i) = h(x_i)^{d_i} (1 − h(x_i))^{1−d_i}        …… (3)

• Use equation (3) to substitute for P(d_i | h, x_i) in equation (2):

P(D|h) = ∏_{i=1}^{m} h(x_i)^{d_i} (1 − h(x_i))^{1−d_i} P(x_i)        …… (4)
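The following sketch compares this likelihood for two candidate hypotheses on a toy data set (the outcome vector and both sets of predicted probabilities are made-up numbers; the common ∏ P(x_i) factor is omitted since it is the same for every hypothesis):

```python
def likelihood(h_vals, d):
    """P(D|h) up to the constant factor prod P(x_i): the product of
    h(x_i)^d_i * (1 - h(x_i))^(1 - d_i) over the training data."""
    p = 1.0
    for h_xi, d_i in zip(h_vals, d):
        p *= (h_xi ** d_i) * ((1.0 - h_xi) ** (1 - d_i))
    return p

d = [1, 1, 0]            # observed outcomes for three patients
h_a = [0.9, 0.8, 0.1]    # candidate h whose predictions fit the data
h_b = [0.5, 0.5, 0.5]    # uninformative candidate h
print(likelihood(h_a, d))  # 0.648
print(likelihood(h_b, d))  # 0.125
```

The hypothesis whose predicted probabilities agree with the observed outcomes earns the higher likelihood, which is exactly what the maximum likelihood hypothesis in the next step formalizes.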

• Now we write an expression for the maximum likelihood hypothesis:

h_ML = argmax_{h∈H} ∏_{i=1}^{m} h(x_i)^{d_i} (1 − h(x_i))^{1−d_i} P(x_i)

• P(x_i) can be dropped because it is a prior probability, independent of the hypothesis h, so it has no effect on which h maximizes the product:

h_ML = argmax_{h∈H} ∏_{i=1}^{m} h(x_i)^{d_i} (1 − h(x_i))^{1−d_i}        …… (5)

• The expression on the right side of equation (5) has the same form seen in the binomial distribution.
• To maximize this expression it is easier to work with the log of the likelihood, which is maximized by the same hypothesis because ln is monotonic:

h_ML = argmax_{h∈H} ∑_{i=1}^{m} d_i ln h(x_i) + (1 − d_i) ln (1 − h(x_i))        …… (6)
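The negation of the sum in (6) is the cross-entropy loss commonly used when training neural networks with probabilistic outputs. Below is a minimal sketch of maximizing (6) by gradient ascent, assuming a single-feature logistic hypothesis h(x) = sigmoid(w·x + b) and made-up toy data (neither comes from the slides):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(hs, ds):
    """Equation (6): sum of d_i ln h(x_i) + (1 - d_i) ln(1 - h(x_i))."""
    return sum(d * math.log(h) + (1 - d) * math.log(1 - h)
               for h, d in zip(hs, ds))

# Toy data: one input feature per patient and the observed 0/1 outcome.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ds = [0,   0,   1,   0,   1,   1]

# Gradient ascent on (6) for h(x) = sigmoid(w*x + b); with the sigmoid,
# the gradient of the log likelihood is sum (d_i - h(x_i)) * x_i for w
# and sum (d_i - h(x_i)) for b.
w, b, lr = 0.0, 0.0, 0.1
for step in range(2001):
    hs = [sigmoid(w * x + b) for x in xs]
    w += lr * sum((d - h) * x for x, h, d in zip(xs, hs, ds))
    b += lr * sum(d - h for h, d in zip(hs, ds))
    if step % 1000 == 0:
        print(step, round(log_likelihood(hs, ds), 4))  # climbs toward max
```

The printed log likelihood increases as training proceeds, and the fitted values h(x_i) move toward the observed outcome frequencies in the data.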
