
CS340 Machine Learning: Gibbs Sampling in Markov Random Fields

Image denoising

Given a noisy image $y$, estimate the clean image by the posterior mean $\hat{x} = E[x \mid y, \theta]$.

Ising model
A 2D grid of $\{-1,+1\}$ variables in which neighboring variables are correlated.

$$p(x \mid \theta) = \frac{1}{Z(\theta)} \prod_{\langle ij \rangle} \psi_{ij}(x_i, x_j)$$

Ising model
$$\psi_{ij}(x_i, x_j) = \begin{pmatrix} e^{W} & e^{-W} \\ e^{-W} & e^{W} \end{pmatrix}$$

W > 0: ferromagnet; W < 0: anti-ferromagnet (frustrated system)

$$p(x \mid \theta) = \frac{1}{Z(\theta)} \exp[-\beta H(x \mid \theta)]$$
$$H(x) = -x^T W x = -\sum_{\langle ij \rangle} W_{ij} x_i x_j$$
$\beta = 1/T$ = inverse temperature
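To make the definitions concrete, here is a minimal Python sketch (not from the slides; the grid size and constant coupling are illustrative assumptions) that evaluates the energy $H(x)$ and the unnormalized probability $\exp[-\beta H(x)]$ for a spin grid:

```python
import numpy as np

def ising_energy(x, W=1.0):
    """H(x) = -sum_{<ij>} W * x_i * x_j over nearest-neighbor pairs
    of a 2D grid of {-1,+1} spins with constant coupling W."""
    horiz = np.sum(x[:, :-1] * x[:, 1:])  # horizontal neighbor pairs
    vert = np.sum(x[:-1, :] * x[1:, :])   # vertical neighbor pairs
    return -W * (horiz + vert)

def unnormalized_prob(x, W=1.0, T=2.0):
    """exp[-beta * H(x)] with beta = 1/T; the partition function Z
    is intractable for large grids, so only ratios are usable."""
    return np.exp(-ising_energy(x, W) / T)

# Example: a random 4x4 spin configuration
rng = np.random.default_rng(0)
x = rng.choice([-1, +1], size=(4, 4))
print(ising_energy(x), unnormalized_prob(x))
```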

Samples from an Ising model


[Figure: samples from an Ising model with W=1, at a cold temperature (T=2) and a hot temperature (T=5); MacKay fig. 31.2]

Boltzmann distribution
Prob distribution in terms of clique potentials:
$$p(x \mid \theta) = \frac{1}{Z(\theta)} \prod_{c \in C} \psi_c(x_c \mid \theta_c)$$
In terms of energy functions:
$$\psi_c(x_c) = \exp[-H_c(x_c)], \qquad \log p(x) = -\Big[\sum_{c \in C} H_c(x_c) + \log Z\Big]$$

Ising model
A 2D grid of $\{-1,+1\}$ variables in which neighboring variables are correlated. The pairwise energies are
$$H_{ij}(x_i, x_j) = \begin{pmatrix} -W_{ij} & W_{ij} \\ W_{ij} & -W_{ij} \end{pmatrix}$$
W > 0: ferromagnet; W < 0: anti-ferromagnet (frustrated system)

$$H(x) = -x^T W x = -\sum_{\langle ij \rangle} W_{ij} x_i x_j$$
$$p(x \mid \theta) = \frac{1}{Z(\theta)} \exp[-\beta H(x \mid \theta)]$$
$\beta = 1/T$ = inverse temperature

Local evidence
$$p(x, y) = p(x)\, p(y \mid x) = \frac{1}{Z} \Big[\prod_{\langle ij \rangle} \psi_{ij}(x_i, x_j)\Big] \prod_i p(y_i \mid x_i)$$
with Gaussian local evidence
$$p(y_i \mid x_i) = \mathcal{N}(y_i \mid x_i, \sigma^2)$$
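As a small illustration of this noise model (a sketch, not part of the slides; `sigma` is an assumed parameter), corrupting a clean $\{-1,+1\}$ image with Gaussian noise:

```python
import numpy as np

def corrupt(x, sigma=1.0, rng=None):
    """Sample y_i ~ N(x_i, sigma^2) independently per pixel,
    i.e. draw from the local evidence model p(y_i | x_i)."""
    rng = rng or np.random.default_rng(0)
    return x + sigma * rng.normal(size=x.shape)
```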

Gibbs sampling
A way to draw samples from $p(x_{1:D} \mid y, \theta)$ one variable at a time, i.e., from the full conditionals $p(x_i \mid x_{-i})$:
1. $x_1^{s+1} \sim p(x_1 \mid x_2^s, \ldots, x_D^s)$
2. $x_2^{s+1} \sim p(x_2 \mid x_1^{s+1}, x_3^s, \ldots, x_D^s)$
3. $x_i^{s+1} \sim p(x_i \mid x_{1:i-1}^{s+1}, x_{i+1:D}^s)$
4. $x_D^{s+1} \sim p(x_D \mid x_1^{s+1}, \ldots, x_{D-1}^{s+1})$

Gibbs sampling from a 2d Gaussian

[Figure: Gibbs sampling from a 2D Gaussian; MacKay fig. 29.13]
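This example is easy to reproduce. Here is a minimal sketch (my assumption: a standard bivariate Gaussian with correlation `rho`, whose full conditionals are $x_1 \mid x_2 \sim \mathcal{N}(\rho x_2, 1-\rho^2)$ and symmetrically for $x_2$):

```python
import numpy as np

def gibbs_bivariate_gaussian(rho=0.9, n_samples=1000, rng=None):
    """Gibbs sampling from a standard bivariate Gaussian with
    correlation rho, alternating between the two full conditionals."""
    rng = rng or np.random.default_rng(0)
    sd = np.sqrt(1.0 - rho**2)
    x1 = x2 = 0.0
    samples = []
    for _ in range(n_samples):
        x1 = rho * x2 + sd * rng.normal()  # sample x1 | x2
        x2 = rho * x1 + sd * rng.normal()  # sample x2 | x1
        samples.append((x1, x2))
    return np.array(samples)
```

When the variables are strongly correlated, each conditional step is short, so the chain explores the joint distribution slowly.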

Gibbs sampling in an MRF


Full conditional depends only on the Markov blanket:
$$\begin{aligned}
p(X_i = x \mid x_{-i}) &= \frac{p(X_i = x, x_{-i})}{\sum_{x'} p(X_i = x', x_{-i})} \\
&= \frac{(1/Z)\big[\prod_{j \in N_i} \psi_{ij}(X_i = x, x_j)\big]\big[\prod_{\langle jk \rangle:\, j,k \neq i} \psi_{jk}(x_j, x_k)\big]}{(1/Z)\sum_{x'}\big[\prod_{j \in N_i} \psi_{ij}(X_i = x', x_j)\big]\big[\prod_{\langle jk \rangle:\, j,k \neq i} \psi_{jk}(x_j, x_k)\big]} \\
&= \frac{\prod_{j \in N_i} \psi_{ij}(X_i = x, x_j)}{\sum_{x'} \prod_{j \in N_i} \psi_{ij}(X_i = x', x_j)}
\end{aligned}$$
All potentials that do not involve $x_i$ cancel between numerator and denominator.
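This formula translates directly into code. A minimal sketch for a general pairwise MRF (the `neighbors`, `psi`, and state set below are illustrative assumptions, not from the slides):

```python
import numpy as np

def full_conditional(i, x, neighbors, psi, states=(-1, +1)):
    """p(X_i = s | x_{-i}) ∝ prod_{j in N_i} psi(s, x[j]):
    only the Markov blanket (the neighbors of i) is needed."""
    unnorm = np.array([np.prod([psi(s, x[j]) for j in neighbors[i]])
                       for s in states])
    return unnorm / unnorm.sum()

# Example: Ising potentials psi(a, b) = exp(J * a * b) on a 3-node chain
J = 1.0
psi = lambda a, b: np.exp(J * a * b)
neighbors = {0: [1], 1: [0, 2], 2: [1]}
x = {0: +1, 1: -1, 2: +1}
print(full_conditional(1, x, neighbors, psi))  # distribution over {-1, +1}
```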

Gibbs sampling in an Ising model


Let $\psi_{ij}(x_i, x_j) = \exp(J x_i x_j)$, with $x_i \in \{+1, -1\}$. Then
$$\begin{aligned}
p(X_i = +1 \mid x_{-i}) &= \frac{\prod_{j \in N_i} \psi_{ij}(X_i = +1, x_j)}{\prod_{j \in N_i} \psi_{ij}(X_i = +1, x_j) + \prod_{j \in N_i} \psi_{ij}(X_i = -1, x_j)} \\
&= \frac{\exp\big[J \sum_{j \in N_i} x_j\big]}{\exp\big[J \sum_{j \in N_i} x_j\big] + \exp\big[-J \sum_{j \in N_i} x_j\big]} \\
&= \frac{\exp[J w_i]}{\exp[J w_i] + \exp[-J w_i]} = \sigma(2 J w_i)
\end{aligned}$$
where $w_i = \sum_{j \in N_i} x_j$ and $\sigma(u) = 1/(1 + e^{-u})$ is the sigmoid function.
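A minimal sketch of one Gibbs sweep over the grid under this full conditional (not the course's demo code; `J` and the 4-connected neighborhood are illustrative assumptions):

```python
import numpy as np

def gibbs_sweep_ising(x, J=1.0, rng=None):
    """One in-place Gibbs sweep over a 2D Ising grid: resample each
    spin from p(X_i = +1 | neighbors) = sigmoid(2 * J * w_i)."""
    rng = rng or np.random.default_rng()
    H, W = x.shape
    for i in range(H):
        for j in range(W):
            w = 0.0  # w_i = sum of 4-connected neighbors (off-grid ignored)
            if i > 0:
                w += x[i - 1, j]
            if i < H - 1:
                w += x[i + 1, j]
            if j > 0:
                w += x[i, j - 1]
            if j < W - 1:
                w += x[i, j + 1]
            p_plus = 1.0 / (1.0 + np.exp(-2.0 * J * w))
            x[i, j] = 1 if rng.random() < p_plus else -1
    return x
```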

Adding in local evidence


The final form is
$$p(X_i = +1 \mid x_{-i}, y) = \frac{\exp[J w_i]\,\psi_i(+1, y_i)}{\exp[J w_i]\,\psi_i(+1, y_i) + \exp[-J w_i]\,\psi_i(-1, y_i)}$$
where $\psi_i(x_i, y_i) = p(y_i \mid x_i)$ is the local evidence.

Run demo
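The original demo code is not reproduced here; the following is a minimal Python sketch of what such a denoising demo might look like, combining the Ising prior with the Gaussian local evidence (all parameter values, and the thresholded initialization, are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

def denoise_gibbs(y, J=1.0, sigma=1.0, n_sweeps=50, rng=None):
    """Gibbs sampling for image denoising with an Ising prior and
    Gaussian local evidence p(y_i | x_i) = N(y_i | x_i, sigma^2)."""
    rng = rng or np.random.default_rng(0)
    H, W = y.shape
    x = np.where(y > 0, 1, -1)  # initialize by thresholding the evidence
    # Precompute the log evidence ratio for x_i = +1 vs x_i = -1.
    log_ev = norm.logpdf(y, loc=1, scale=sigma) - norm.logpdf(y, loc=-1, scale=sigma)
    for _ in range(n_sweeps):
        for i in range(H):
            for j in range(W):
                w = 0.0
                if i > 0:
                    w += x[i - 1, j]
                if i < H - 1:
                    w += x[i + 1, j]
                if j > 0:
                    w += x[i, j - 1]
                if j < W - 1:
                    w += x[i, j + 1]
                # log-odds of +1 vs -1: prior term + evidence term
                logit = 2.0 * J * w + log_ev[i, j]
                p_plus = 1.0 / (1.0 + np.exp(-logit))
                x[i, j] = 1 if rng.random() < p_plus else -1
    return x  # one posterior sample; average over sweeps to estimate E[x|y]
```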


Gibbs sampling for DAGs


The Markov blanket of a node is the set of nodes that renders it independent of the rest of the graph: its parents, its children, and its children's other parents (co-parents).
Let $U_{1:n}$ denote the parents of $X_i$, $Y_{1:m}$ its children, $Z_{1:m}$ the co-parents (the other parents of the $Y_j$), and $R$ the remaining variables. Then
$$\begin{aligned}
p(X_i \mid X_{-i}) &= \frac{p(X_i, X_{-i})}{\sum_x p(X_i = x, X_{-i})} = \frac{p(X_i, U_{1:n}, Y_{1:m}, Z_{1:m}, R)}{\sum_x p(x, U_{1:n}, Y_{1:m}, Z_{1:m}, R)} \\
&= \frac{p(X_i \mid U_{1:n})\big[\prod_j p(Y_j \mid X_i, Z_j)\big] P(U_{1:n}, Z_{1:m}, R)}{\sum_x p(X_i = x \mid U_{1:n})\big[\prod_j p(Y_j \mid X_i = x, Z_j)\big] P(U_{1:n}, Z_{1:m}, R)} \\
&= \frac{p(X_i \mid U_{1:n})\big[\prod_j p(Y_j \mid X_i, Z_j)\big]}{\sum_x p(X_i = x \mid U_{1:n})\big[\prod_j p(Y_j \mid X_i = x, Z_j)\big]}
\end{aligned}$$
Hence
$$p(X_i \mid X_{-i}) \propto p(X_i \mid Pa(X_i)) \prod_{Y_j \in ch(X_i)} p(Y_j \mid Pa(Y_j))$$
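As a tiny concrete instance (a sketch with made-up CPTs, not from the slides), consider a binary chain $U \to X \to Y$; the full conditional of $X$ multiplies its own CPT by its child's CPT and renormalizes:

```python
import numpy as np

# Hypothetical CPTs for a binary DAG U -> X -> Y.
p_x_given_u = np.array([[0.9, 0.1],   # p(X=0|U=u), p(X=1|U=u) for u=0
                        [0.3, 0.7]])  # ... for u=1
p_y_given_x = np.array([[0.8, 0.2],   # p(Y=0|X=x), p(Y=1|X=x) for x=0
                        [0.4, 0.6]])  # ... for x=1

def sample_x(u, y, rng):
    """Sample X from p(X | u, y) ∝ p(X | Pa(X)) * p(Y | Pa(Y))."""
    unnorm = p_x_given_u[u] * p_y_given_x[:, y]  # vector over X = 0, 1
    return rng.choice(2, p=unnorm / unnorm.sum())

rng = np.random.default_rng(0)
print(sample_x(u=1, y=0, rng=rng))
```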


Birats


Samples


Posterior predictive check


Boltzmann machines
An Ising model in which the graph structure is arbitrary and the weights W are learned by maximum likelihood.

Restricted Boltzmann machine


Hopfield network
A Boltzmann machine with no hidden nodes (a fully connected Ising model).

