CS 440/ECE 448
Fall 2020
Margaret Fleck
Probability 3
For this toy example, we can fill out the joint distribution table. However, when we have more variables (typical of AI problems), this isn't practical. The joint distribution contains a number of entries that grows exponentially with the number of variables, so we face two problems: storing the table and estimating its values.
The estimation is the more important problem, because we usually have only limited amounts of data, especially clean annotated data. Many value combinations won't be observed simply because we haven't seen enough examples.
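To see why the full table is impractical, here is a quick back-of-the-envelope calculation (assuming binary variables, so each new variable doubles the table):

```python
# Each of n binary variables doubles the number of rows in the joint table.
def joint_table_size(n):
    return 2 ** n

print(joint_table_size(10))   # 1024 entries: manageable
print(joint_table_size(100))  # about 1.3e30 entries: hopeless to store, let alone estimate
```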
Bayes Rule
Remember that P(A | C) is the probability of A in a context where C is true. For example, the probability of a bike being teal increases dramatically if we happen to know that it's a veoride.
P(teal) = 0.21
P(teal | veoride) = 0.95 (19/20)
Let's write this in the equivalent form P(A,C) = P(C) * P(A | C).
Flipping the roles of A and C (seems like middle school but just keep reading):
P(C,A) = P(A) * P(C | A)
P(A,C) and P(C,A) are the same quantity. (AND is commutative.) So we have
P(C) * P(A | C) = P(A) * P(C | A)
So
P(A | C) = P(C | A) * P(A) / P(C)
This is Bayes rule. When A is a cause and C is evidence for it, the pieces are usually named
posterior = likelihood * prior / normalization
The normalization doesn't have a proper name because we'll typically organize our computation so as to
eliminate it. Or, sometimes, we won't be able to measure it directly, so the number will be set at whatever is
required to make all the probability values add up to 1.
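The computation above can be sketched in code. Here the normalization is never measured directly; it is recovered as whatever sum makes the posteriors add up to 1. The numbers are chosen to match the teal-bike figures in this lecture, assuming P(veoride) = 0.2 and P(teal | standard) = 0.025, which are consistent with P(teal) = 0.21:

```python
# Two candidate causes with priors P(cause) and likelihoods P(evidence | cause).
priors      = {"veoride": 0.2,  "standard": 0.8}
likelihoods = {"veoride": 0.95, "standard": 0.025}  # P(teal | cause)

# Unnormalized posteriors: P(evidence | cause) * P(cause).
unnorm = {c: likelihoods[c] * priors[c] for c in priors}

# The normalization is whatever makes the values sum to 1 -- here it equals P(teal).
Z = sum(unnorm.values())                 # 0.19 + 0.02 = 0.21
posterior = {c: unnorm[c] / Z for c in unnorm}
print(round(Z, 4))                       # 0.21
print(round(posterior["veoride"], 4))    # 0.9048
```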
Example: we see a teal bike. What is the chance that it's a veoride? We know
P(teal | veoride) = 0.95
P(veoride) = 0.2
P(teal) = 0.21
So we can calculate
P(veoride | teal) = P(teal | veoride) * P(veoride) / P(teal) = 0.95 * 0.2 / 0.21 ≈ 0.905
P(evidence | cause) tends to be stable, because it is due to some underlying mechanism. In this example, veoride
likes teal and regular bike owners don't. In a disease scenario, we might be looking at the tendency of measles to
cause spots, which is due to a property of the virus that is fairly stable over time.
P(cause | evidence) is less stable, because it depends on the set of possible causes and how common they are
right now.
For example, suppose that a critical brake problem is discovered and Veoride suddenly has to pull bikes into the
shop for repairs. So then only 2% of the bikes are veorides. Then we might have the following joint distribution.
          veoride  standard |
teal        0.019     0.025 | 0.044
red         0.001     0.955 | 0.956
----------------------------
            0.02      0.98

Now a teal bike is more likely to be standard than veoride.
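Reading the conditional directly off this joint table confirms the flip (a minimal sketch using the table's entries):

```python
# Joint distribution P(color, brand) after the recall.
joint = {
    ("teal", "veoride"): 0.019, ("teal", "standard"): 0.025,
    ("red",  "veoride"): 0.001, ("red",  "standard"): 0.955,
}

# P(veoride | teal) = P(teal, veoride) / P(teal), marginalizing over brand.
p_teal = joint[("teal", "veoride")] + joint[("teal", "standard")]
p_veoride_given_teal = joint[("teal", "veoride")] / p_teal
print(round(p_teal, 3))                # 0.044
print(round(p_veoride_given_teal, 3))  # 0.432 -- under one half, so standard is more likely
```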
Decomposing P(cause | evidence) into P(evidence | cause) and P(cause) allows us to easily adjust for changes in P(cause).
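To illustrate, here is a sketch that keeps the stable likelihoods fixed and only swaps the prior, assuming P(teal | standard) = 0.025 as in the figures above (the joint table rounds this slightly, so it gives 0.432 rather than 0.437):

```python
def posterior_veoride(prior_veoride, p_teal_given_veoride=0.95, p_teal_given_standard=0.025):
    """P(veoride | teal) via Bayes rule; the likelihoods stay fixed across scenarios."""
    num = p_teal_given_veoride * prior_veoride
    den = num + p_teal_given_standard * (1 - prior_veoride)
    return num / den

# Only the prior changes when Veoride pulls bikes into the shop.
print(round(posterior_veoride(0.20), 3))  # before the recall: 0.905
print(round(posterior_veoride(0.02), 3))  # after the recall:  0.437
```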