
Lecture 9: Undirected Graphical Models Machine Learning

Andrew Rosenberg

March 5, 2010


Today

Graphical Models
Probabilities in Undirected Graphs


Undirected Graphs

What if we allow undirected graphs? What do they correspond to? Not cause/effect or trigger/response, but general dependence. Example: image pixels, where each pixel is a Bernoulli variable. We can have a probability over all pixels, p(x_{1,1}, ..., x_{1,M}, ..., x_{M,1}, ..., x_{M,M}). Bright pixels have bright neighbors. No parents, just probabilities. Grid models like this are called Markov Random Fields.
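The pixel-grid idea can be sketched numerically. Below is a toy 2x2 binary MRF with a hypothetical pairwise potential that rewards neighboring pixels for agreeing; the grid size and potential values are illustrative, not from the lecture:

```python
from itertools import product

# Toy 2x2 binary-pixel MRF: "bright pixels have bright neighbors",
# encoded as a pairwise potential that rewards agreeing neighbors.
def psi(a, b):
    return 2.0 if a == b else 1.0

pixels = [(0, 0), (0, 1), (1, 0), (1, 1)]
edges = [((0, 0), (0, 1)), ((0, 0), (1, 0)),
         ((0, 1), (1, 1)), ((1, 0), (1, 1))]

def unnormalized(x):  # x maps pixel -> {0, 1}
    score = 1.0
    for a, b in edges:
        score *= psi(x[a], x[b])
    return score

# Normalizer Z sums the product of potentials over all 2^4 images.
Z = sum(unnormalized(dict(zip(pixels, vals)))
        for vals in product([0, 1], repeat=4))

all_bright = {pix: 1 for pix in pixels}
print(unnormalized(all_bright) / Z)  # uniform images get the most mass
```

Note that no pixel has a parent: the joint is defined entirely by the potentials and the global normalizer Z.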

Undirected Graphs

[Figure: undirected graphs over w, x, y, z, illustrating x ⊥ y | {w} and x ⊥ y | {w, z}]

No directed graph can represent both w ⊥ z | {x, y} and x ⊥ y | {w, z} simultaneously; the undirected 4-cycle over w, x, y, z can. Undirected separation is easy. To check x_a ⊥ x_b | x_C, check graph reachability of x_a and x_b without going through nodes in x_C.


Undirected Graphs

[Figure: a directed v-structure x → z ← y and candidate undirected graphs over x, y, z]

The v-structure gives x ⊥ y but not x ⊥ y | z. An undirected graph over x, y, z can represent x ⊥ y | z OR x ⊥ z, but not marginal independence together with conditional dependence.

Undirected separation is easy. To check x_a ⊥ x_b | x_C, check graph reachability of x_a and x_b without going through nodes in x_C.
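The reachability check described here can be sketched as a breadth-first search that refuses to pass through conditioned nodes; the node names and the example 4-cycle are illustrative:

```python
from collections import deque

# Separation test x_a ⊥ x_b | x_C on an undirected graph: BFS from a,
# skipping conditioned nodes; independence holds iff b is unreachable.
def separated(adj, a, b, conditioned):
    if a in conditioned or b in conditioned:
        return True
    seen, frontier = {a}, deque([a])
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            if v == b:
                return False          # found a path avoiding x_C
            if v not in seen and v not in conditioned:
                seen.add(v)
                frontier.append(v)
    return True

# Illustrative 4-cycle w - x - z - y - w
adj = {'w': ['x', 'y'], 'x': ['w', 'z'],
       'y': ['w', 'z'], 'z': ['x', 'y']}
print(separated(adj, 'w', 'z', {'x', 'y'}))  # True:  w ⊥ z | {x, y}
print(separated(adj, 'x', 'y', {'w', 'z'}))  # True:  x ⊥ y | {w, z}
print(separated(adj, 'w', 'z', set()))       # False: path w - x - z
```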


Probabilities in Undirected Graphs


[Figure: an example graph with its cliques highlighted]

Graph cliques define clusters of dependent variables.

Clique: a set of nodes such that there is an edge between every pair of members of the set. We define probability as a product of functions defined over cliques.



Representing Probabilities
Potential functions over cliques:

p(x) = p(x_0, ..., x_{n-1}) = (1/Z) ∏_{c∈C} ψ_c(x_c)

The normalizing term guarantees that p(x) sums to 1:

Z = Σ_x ∏_{c∈C} ψ_c(x_c)

Potential functions are positive functions over groups of connected variables. Use only maximal cliques, e.g. ψ(x_1, x_2, x_3) ψ(x_2, x_3) → ψ(x_1, x_2, x_3).

Logical Inference

[Figure: a logic network over binary nodes with NOT, XOR, and AND gates]

In logic networks, nodes are binary and edges represent gates. Gates: AND, OR, XOR, NAND, NOR, NOT, etc. Inference: given observed variables, predict the others. Problems: uncertainty, conflicts, and inconsistency. Rather than saying a variable is True or False, let's say it is .8 True and .2 False: Probabilistic Inference.

Inference
Probabilistic Inference

[Figure: the same logic network as a Bayesian network]

Replace the logic network with a Bayesian network. Probabilistic inference: given observed variables, predict marginals over the others.

NOT    b=t   b=f
a=t     0     1
a=f     1     0

Inference
Probabilistic Inference

[Figure: the same network with a soft NOT gate]

Replace the logic network with a Bayesian network. Probabilistic inference: given observed variables, predict marginals over the others.

Soft NOT   b=t   b=f
a=t        .1    .9
a=f        .9    .1
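The soft NOT table supports a one-step marginal computation by summing out a; the prior on a below is a made-up value for illustration:

```python
# Soft NOT CPT from the table: p(b | a).
cpt = {('t', 't'): 0.1, ('t', 'f'): 0.9,
       ('f', 't'): 0.9, ('f', 'f'): 0.1}

# Hypothetical prior on a (not given in the slides).
p_a = {'t': 0.7, 'f': 0.3}

# Marginal over b: sum out a.
p_b = {b: sum(p_a[a] * cpt[(a, b)] for a in p_a) for b in ('t', 'f')}
print(p_b)  # b is .34 True, .66 False under this prior
```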

Inference
General Problem: Given a graph and probabilities, for any subset of variables, find

p(x_e | x_o) = p(x_e, x_o) / p(x_o)

Compute both marginals and divide. But this can be exponential... (based on the number of parents each node has, or the size of the cliques). Summing over all variables other than x_j and x_k:

Directed:   p(x_j, x_k) = Σ_{x_0} Σ_{x_1} ... Σ_{x_{M-1}} ∏_{i=0}^{M-1} p(x_i | π_i)

Undirected: p(x_j, x_k) = Σ_{x_0} Σ_{x_1} ... Σ_{x_{M-1}} (1/Z) ∏_{c∈C} ψ(x_c)

We have efficient learning and storage in graphical models; now we need efficient inference.


Inecient Marginals
Brute force. Given CPTs and a graph structure, we can compute arbitrary marginals by brute force, but it's inefficient. For example:

p(x) = p(x_0) p(x_1|x_0) p(x_2|x_0) p(x_3|x_1) p(x_4|x_2) p(x_5|x_2)

p(x_0, x_2) = p(x_0) p(x_2|x_0)

p(x_0, x_5) = Σ_{x_1,x_2,x_3,x_4} p(x)

p(x_0 | x_5) = Σ_{x_1,x_2,x_3,x_4} p(x) / Σ_{x_0,x_1,x_2,x_3,x_4} p(x)

p(x_0 | x_5 = TRUE) = Σ_{x_1,x_2,x_3,x_4} p(x_{U\5} | x_5 = TRUE)

Ecient Computation of Marginals

[Figure: example undirected graph]

Pass messages (small tables) around the graph. The messages are small functions that propagate potentials around an undirected graphical model. The inference technique is the Junction Tree Algorithm.


Junction Tree Algorithm

Efficient message passing on undirected graphs. For directed graphs, first convert to an undirected graph (moralization).

Junction Tree Algorithm:
1. Moralization
2. Introduce Evidence
3. Triangulate
4. Construct Junction Tree
5. Propagate Probabilities
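The first step, moralization, is simple enough to sketch: connect ("marry") the parents of each node, then drop edge directions. The parent lists below are a toy example, not a graph from the slides:

```python
from itertools import combinations

# Moralization: marry all parents of each node, then drop directions.
def moralize(parents):
    und = {v: set() for v in parents}
    for child, ps in parents.items():
        for par in ps:                      # keep every original edge
            und[child].add(par)
            und[par].add(child)
        for a, b in combinations(ps, 2):    # marry co-parents
            und[a].add(b)
            und[b].add(a)
    return und

# v-structure a -> c <- b: moralization adds the undirected edge a - b,
# so the dependence between co-parents is not lost when directions vanish.
g = moralize({'a': [], 'b': [], 'c': ['a', 'b']})
print(sorted(g['a']))  # ['b', 'c']
```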


Bye

Next
Junction Tree Algorithm

