Andrew Rosenberg - Lecture 7: Graphical Models Machine Learning
Andrew Rosenberg
March 5, 2010
1 / 44
Last Time
Logistic Regression
2 / 44
Today
Graphical Models
3 / 44
Recap
Models we've looked at so far:
  Linear Regression
  Logistic Regression
Both make use of probabilistic models. Graphical models are a way to structure and visualize probability models.
4 / 44
Probability Models
(Joint) Probability Tables. We represent multinomial joint probabilities between K variables as K-dimensional tables.
$p(x) = p(\text{flu?}, \text{achiness?}, \text{headache?}, \ldots, \text{temperature?})$
Assume D binary (true/false) variables. How big is this table? $2^D$ entries.
The size of the probability table increases exponentially with D. This is related to the curse of dimensionality.
What if, rather than Bernoulli (binary) variables, we had multinomials with M choices?
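To make the growth concrete, here is a minimal sketch that counts the entries of a full joint table. The function name `joint_table_size` is hypothetical. With M choices per variable instead of two, the count becomes M to the power D.

```python
def joint_table_size(num_variables: int, choices_per_variable: int = 2) -> int:
    """Number of entries in a full joint table over D variables: M ** D."""
    return choices_per_variable ** num_variables


# Binary variables: the table doubles with every variable added.
for d in (2, 10, 20, 30):
    print(d, joint_table_size(d))

# Multinomial variables with M = 5 choices each and D = 10 variables.
print(joint_table_size(10, choices_per_variable=5))
```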
5 / 44
Probability Models
What if the variables are independent?
$p(x) = p(\text{flu?}, \text{achiness?}, \text{headache?}, \ldots, \text{temperature?})$
Recall, if x and y are independent: $p(x, y) = p(x)\,p(y)$
The original probability distribution then factorizes.
$p(x) = p(\text{flu?})\,p(\text{achiness?})\,p(\text{headache?}) \cdots p(\text{temperature?})$
How big is this table (if each variable is binary)?
p(flu?) = [.2 .8], p(headache?) = [.6 .4], etc.
Total size = $2D$ (two entries per variable)
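As a small sketch of the independent case, the code below rebuilds a joint table from per-variable marginals. The two marginals use the numbers on the slide. Which entry corresponds to "true" is an assumption, and the names `marginals` and `joint_probability` are hypothetical.

```python
from itertools import product

# Marginals from the slide; treating the first entry as "true" is an assumption.
marginals = {
    "flu": {True: 0.2, False: 0.8},
    "headache": {True: 0.6, False: 0.4},
}


def joint_probability(assignment):
    """p(x) as a product of per-variable marginals (valid only under independence)."""
    prob = 1.0
    for name, value in assignment.items():
        prob *= marginals[name][value]
    return prob


names = list(marginals)
full_table = {
    values: joint_probability(dict(zip(names, values)))
    for values in product([True, False], repeat=len(names))
}

for values, prob in full_table.items():
    print(dict(zip(names, values)), prob)
```

Only the marginals need to be stored. Any entry of the joint can be recomputed on demand, which is where the saving from $2^D$ to $2D$ numbers comes from.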
6 / 44
Graphical Models
Independence assumptions are convenient (Naive Bayes), but rarely true. More often some groups of variables are dependent, but others are independent. Moreover others are conditionally independent.
7 / 44
Conditional Independence
If two variables are conditionally independent, then:
$p(x, z \mid y) = p(x \mid y)\,p(z \mid y)$ but, in general, $p(x, z) \neq p(x)\,p(z)$
e.g. y = flu?, x = achiness?, z = headache?.
Written as: $x \perp z \mid y$
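The flu example can be checked numerically. The sketch below uses made-up probabilities (every number is an assumption chosen for illustration) and builds a joint in which achiness and headache each depend only on flu. It then compares the conditional and marginal factorizations.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration (1 = true, 0 = false).
p_y = {1: 0.2, 0: 0.8}                                   # p(flu?)
p_x_given_y = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.1, 0: 0.9}}  # p(achiness? | flu?), [y][x]
p_z_given_y = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.3, 0: 0.7}}  # p(headache? | flu?), [y][z]

# Joint over (x, y, z) built so that x and z depend only on y.
joint = {(x, y, z): p_y[y] * p_x_given_y[y][x] * p_z_given_y[y][z]
         for x, y, z in product([0, 1], repeat=3)}


def marginal(keep):
    """Sum the joint over every variable whose index is not in `keep`."""
    out = {}
    for assignment, prob in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out


p_xy, p_yz, p_xz = marginal((0, 1)), marginal((1, 2)), marginal((0, 2))
p_xm, p_ym, p_zm = marginal((0,)), marginal((1,)), marginal((2,))

# Largest violation of p(x, z | y) = p(x | y) p(z | y) over all assignments.
conditional_gap = max(
    abs(joint[(x, y, z)] / p_ym[(y,)]
        - (p_xy[(x, y)] / p_ym[(y,)]) * (p_yz[(y, z)] / p_ym[(y,)]))
    for x, y, z in product([0, 1], repeat=3)
)

# Violation of p(x, z) = p(x) p(z) at x = 1, z = 1.
marginal_gap = abs(p_xz[(1, 1)] - p_xm[(1,)] * p_zm[(1,)])

print(conditional_gap, marginal_gap)
```

With these numbers the conditional factorization holds up to rounding, while the marginal one does not: knowing that a patient is achy raises the probability of flu, and with it the probability of a headache.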
8 / 44
9 / 44
10 / 44
11 / 44
$p(x_0, \ldots, x_{n-1}) = \prod_{i=0}^{n-1} p(x_i \mid \mathrm{pa}_i) = \prod_{i=0}^{n-1} p(x_i \mid \pi_i)$
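The product over nodes, each conditioned on its parents, can be sketched directly in code. The three-node network, the tables, and the names `parents` and `cpt` below are all hypothetical and serve only to show the mechanics.

```python
from itertools import product

# Hypothetical network x0 -> x1 -> x2 over binary variables.
parents = {"x0": (), "x1": ("x0",), "x2": ("x1",)}

# cpt[node][parent_values][value] = p(node = value | parents = parent_values)
cpt = {
    "x0": {(): {0: 0.7, 1: 0.3}},
    "x1": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.4, 1: 0.6}},
    "x2": {(0,): {0: 0.8, 1: 0.2}, (1,): {0: 0.5, 1: 0.5}},
}


def joint_probability(assignment):
    """Product over nodes of p(x_i | parents of x_i)."""
    prob = 1.0
    for node, node_parents in parents.items():
        parent_values = tuple(assignment[p] for p in node_parents)
        prob *= cpt[node][parent_values][assignment[node]]
    return prob


# Summing the factorized joint over all assignments should give 1.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product([0, 1], repeat=len(parents))
)
print(total)
```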
12 / 44
When we observe a variable (that is, know its value from data), we color the node corresponding to that variable grey. Observing a variable allows us to condition on it, e.g. $p(x, z \mid y)$. Given an observation of any variable, we can generate pdfs for the other variables.
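Conditioning on an observation amounts to keeping the consistent entries of the joint and renormalizing. The following is a minimal sketch over a made-up joint table for (x, y, z). The table values and the helper name `condition` are assumptions.

```python
def condition(joint, index, observed_value):
    """Return p(assignment | variable at `index` = observed_value)."""
    selected = {a: p for a, p in joint.items() if a[index] == observed_value}
    evidence = sum(selected.values())  # probability of the observation itself
    return {a: p / evidence for a, p in selected.items()}


# Hypothetical joint over binary (x, y, z); the eight entries sum to 1.
joint = {
    (0, 0, 0): 0.30, (0, 0, 1): 0.10, (1, 0, 0): 0.15, (1, 0, 1): 0.05,
    (0, 1, 0): 0.04, (0, 1, 1): 0.06, (1, 1, 0): 0.12, (1, 1, 1): 0.18,
}

# Observe y = 1 (index 1) and obtain p(x, z | y = 1).
posterior = condition(joint, index=1, observed_value=1)
print(posterior)
```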
13 / 44
14 / 44
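$p(x, y, z) = p(x)\,p(y \mid x)\,p(z \mid y)$
Is $x \perp z \mid y$? That is, does $p(x, z \mid y) = p(x \mid y)\,p(z \mid y)$?
$p(x, z \mid y) = p(x \mid z, y)\,p(z \mid y)$
$p(x \mid z, y) = \frac{p(x, y, z)}{p(y, z)} = \frac{p(x)\,p(y \mid x)\,p(z \mid y)}{p(y)\,p(z \mid y)} = \frac{p(x)\,p(y \mid x)}{p(y)} = \frac{p(x, y)}{p(y)} = p(x \mid y)$
$p(x, z \mid y) = p(x \mid y)\,p(z \mid y)$
$x \perp z \mid y$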
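The result for the chain x → y → z, whose joint factorizes as p(x) p(y|x) p(z|y), can be confirmed numerically. The tables below are made-up values and the helper `prob` is hypothetical. This is a sanity check under those assumptions, not a proof.

```python
from itertools import product

# Hypothetical binary tables for the chain x -> y -> z.
p_x = {0: 0.6, 1: 0.4}
p_y_given_x = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # indexed [x][y]
p_z_given_y = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # indexed [y][z]

joint = {(x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
         for x, y, z in product([0, 1], repeat=3)}

NAMES = ("x", "y", "z")


def prob(**fixed):
    """Marginal probability that the named variables take the given values."""
    return sum(p for a, p in joint.items()
               if all(a[NAMES.index(n)] == v for n, v in fixed.items()))


max_gap = 0.0
for x, y, z in product([0, 1], repeat=3):
    lhs = prob(x=x, y=y, z=z) / prob(y=y)      # p(x, z | y)
    p_x_y = prob(x=x, y=y) / prob(y=y)         # p(x | y)
    p_z_y = prob(y=y, z=z) / prob(y=y)         # p(z | y)
    max_gap = max(max_gap, abs(lhs - p_x_y * p_z_y))

print(max_gap)  # zero up to floating-point rounding
```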
20 / 44
21 / 44
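$p(x, y, z) = p(x \mid y)\,p(y)\,p(z \mid y)$
Is $x \perp z \mid y$? That is, does $p(x, z \mid y) = p(x \mid y)\,p(z \mid y)$?
$p(x, z \mid y) = p(x \mid z, y)\,p(z \mid y)$
$p(x \mid z, y) = \frac{p(x, y, z)}{p(y, z)} = \frac{p(x \mid y)\,p(y)\,p(z \mid y)}{p(y)\,p(z \mid y)} = p(x \mid y)$
$p(x, z \mid y) = p(x \mid y)\,p(z \mid y)$
$x \perp z \mid y$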
27 / 44
28 / 44
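$p(x, y, z) = p(x)\,p(y \mid x, z)\,p(z)$
Is $x \perp z \mid y$? That is, does $p(x, z \mid y) = p(x \mid y)\,p(z \mid y)$?
$p(x, z \mid y) = p(x \mid z, y)\,p(z \mid y)$
$p(x \mid z, y) = \frac{p(x, y, z)}{p(y, z)} = \frac{p(x)\,p(y \mid x, z)\,p(z)}{p(y \mid z)\,p(z)} = \frac{p(x)\,p(y \mid x, z)}{p(y \mid z)}$
This still depends on z and does not, in general, reduce to $p(x \mid y)$.
$x \not\perp z \mid y$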
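For the graph in which x and z are both parents of y, a small numeric sketch shows the dependence that appears once y is observed. The probabilities are made up, and the names `p_y1_given_xz` and `prob` are hypothetical.

```python
from itertools import product

# Hypothetical tables: x and z are independent causes of y.
p_x = {0: 0.5, 1: 0.5}
p_z = {0: 0.5, 1: 0.5}
p_y1_given_xz = {(0, 0): 0.05, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.95}  # p(y=1 | x, z)


def p_y_given_xz(y, x, z):
    """p(y | x, z) for binary y, from the table of p(y = 1 | x, z)."""
    p1 = p_y1_given_xz[(x, z)]
    return p1 if y == 1 else 1.0 - p1


joint = {(x, y, z): p_x[x] * p_z[z] * p_y_given_xz(y, x, z)
         for x, y, z in product([0, 1], repeat=3)}

NAMES = ("x", "y", "z")


def prob(**fixed):
    """Marginal probability that the named variables take the given values."""
    return sum(p for a, p in joint.items()
               if all(a[NAMES.index(n)] == v for n, v in fixed.items()))


# Without observing y, x and z are independent.
marginal_gap = abs(prob(x=1, z=1) - prob(x=1) * prob(z=1))

# After observing y = 1, additionally learning z = 1 changes the belief in x.
p_x1_given_y1 = prob(x=1, y=1) / prob(y=1)
p_x1_given_y1_z1 = prob(x=1, y=1, z=1) / prob(y=1, z=1)

print(marginal_gap, p_x1_given_y1, p_x1_given_y1_z1)
```

With these numbers, learning that z = 1 lowers the probability of x = 1 given y = 1, an effect often called "explaining away".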
33 / 44
Factorization
A more complicated factorization
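[Figure: directed graph with edges x0 → x1, x0 → x2, x1 → x3, x2 → x4, x1 → x5, x4 → x5]
$p(x_0, x_1, x_2, x_3, x_4, x_5) = \;?$
$= p(x_0) \cdots$
$= p(x_0)\,p(x_1 \mid x_0) \cdots$
$= p(x_0)\,p(x_1 \mid x_0)\,p(x_2 \mid x_0)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_1, x_4)$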
37 / 44
Factorization
How big are the probability tables?
$p(x_0, x_1, x_2, x_3, x_4, x_5) = p(x_0)\,p(x_1 \mid x_0)\,p(x_2 \mid x_0)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_1, x_4)$
$p(x_5 \mid x_1, x_4) =$
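One way to answer the size question is to count entries per factor. The sketch below assumes every variable is binary, which the slide does not state explicitly. The parent sets are read off the factorization, and the helper name `table_entries` is hypothetical.

```python
# Parent sets read off the factorization above.
parents = {
    "x0": [],
    "x1": ["x0"],
    "x2": ["x0"],
    "x3": ["x1"],
    "x4": ["x2"],
    "x5": ["x1", "x4"],
}


def table_entries(num_parents, states=2):
    """Entries in p(x_i | parents): one per joint setting of x_i and its parents."""
    return states ** (num_parents + 1)


factored = {node: table_entries(len(pa)) for node, pa in parents.items()}
factored_total = sum(factored.values())
full_joint_total = 2 ** len(parents)

print(factored)
print(factored_total, full_joint_total)
```

Under the binary assumption, p(x5 | x1, x4) is the largest factor with 8 entries, and the six factors together need 26 entries against 64 for the full joint table.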
38 / 44
39 / 44
Continuous models
40 / 44
Naive Bayes
Naive Bayes Classification.
[Figure: class node y with directed edges to x0, x1, x2]
Observation variables $x_i$ are each independent given the class y. A distribution is optimized using maximum likelihood for each variable separately. It is easy to combine multinomial, Bernoulli, and continuous (e.g. Gaussian) distributions for the variables.
$p(y \mid x_0, x_1, x_2) \propto p(x_0, x_1, x_2 \mid y)\,p(y)$
$p(y \mid x_0, x_1, x_2) \propto p(x_0 \mid y)\,p(x_1 \mid y)\,p(x_2 \mid y)\,p(y)$
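The proportionality above turns into a classifier once the scores are normalized over classes. The sketch below assumes three binary (Bernoulli) features. The class labels, the prior, and every parameter value are made up for illustration, and in practice the parameters would be estimated by maximum likelihood from data.

```python
# Hypothetical class prior p(y) and Bernoulli parameters p(x_i = 1 | y).
prior = {"flu": 0.2, "no_flu": 0.8}
p_feature_on = {
    "flu": [0.9, 0.8, 0.7],
    "no_flu": [0.1, 0.3, 0.2],
}


def posterior(features):
    """p(y | x0, x1, x2) from p(y) * prod_i p(x_i | y), normalized over classes."""
    scores = {}
    for label in prior:
        score = prior[label]
        for p_on, value in zip(p_feature_on[label], features):
            score *= p_on if value == 1 else 1.0 - p_on
        scores[label] = score
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


result = posterior([1, 1, 0])
print(result)
print(max(result, key=result.get))  # predicted class
```

Each feature contributes one factor per class, so swapping a Bernoulli factor for a multinomial or Gaussian one changes only how that single term is computed.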
41 / 44
Graphical Models
Graphical Models
  Graph representation of dependency relationships
  Directed Acyclic Graph (DAG)
  Nodes are random variables
  Edges define dependence relationships
What can we do with Graphical Models?
  Learn parameters to fit data
  Understand the independence relationships between variables
  Perform inference (marginals and conditionals)
  Compute likelihoods for classification
42 / 44
Bye
Next
More fun with Graphical Models
43 / 44