0% found this document useful (0 votes)

53 views35 pages

Probabilistic Graphical Models Overview

This document discusses probabilistic graphical models and their use in expert systems. It introduces probabilistic graphical models including probabilistic networks represented by graphs. It describes properties of conditional independence and how graphs can represent conditional independence relationships through d-separation. It discusses different types of graphs including undirected graphs, directed acyclic graphs, and how they can be used to represent factorizations and perform inference. It provides an example of the Lauritzen-Spiegelhalter algorithm for inference in graphical models.

Uploaded by

565864220

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

53 views35 pages

Probabilistic Graphical Models Overview

Uploaded by

565864220

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

PROBABILISTIC GRAPHICAL

MODELS

David Madigan
Rutgers University

madigan@[Link]
Expert Systems

•Explosion of interest in “Expert Systems” in the

early 1980’s

IF the infection is primary-bacteremia

AND the site of the culture is one of the sterile sites
AND the suspected portal of entry is the gastrointestinal tract
THEN there is suggestive evidence (0.7) that infection is bacteroid.

•Many companies (Teknowledge, IntelliCorp,

Inference, etc.), many IPO’s, much media hype

•Ad-hoc uncertainty handling

Uncertainty in Expert Systems

If A then C (p1)
If B then C (p2)

What if both A and B true?

Then C true with CF:

p1 + (p2 X (1- p1))

“Currently fashionable ad-hoc mumbo jumbo”

A.F.M. Smith
Eschewed Probabilistic Approach

•Computationally intractable
•Inscrutable
•Requires vast amounts of data/elicitation

e.g., for n dichotomous variables need 2n - 1

probabilities to fully specify the joint distribution
Conditional Independence

X Y|Z ! f X ,Y |Z ( x, y | z ) = f X |Z ( x | z ) fY |Z ( y | z )
Conditional Independence

•Suppose A and B are marginally independent. Pr(A),

Pr(B), Pr(C|AB) X 4 = 6 probabilities
•Suppose A and C are conditionally independent

given B: Pr(A), Pr(B|A) X 2, Pr(C|B) X 2 = 5

A B C

C A|B

•Chain with 50 variables requires 99 probabilities

versus 250-1
Properties of Conditional Independence (Dawid, 1980)

For any probability measure P and random variables A, B, and C:

CI 1: A B [P] ⇒ B A [P]

CI 2: A B ∪ C [P] ⇒ A B [P]

CI 3: A B ∪ C [P] ⇒ A B | C [P]

CI 4: A B and A C | B [P] ⇒ A B ∪ C [P]

Some probability measures also satisfy:

CI 5: A B | C and A C | B [P] ⇒ A B ∪ C [P]

CI5 satisfied whenever P has a positive joint probability density with

respect to some product measure
Markov Properties for Undirected Graphs

(Global) S separates A from B ⇒ A B|S

(Local) α V \ cl(α) | bd (α)
(Pairwise) α β | V \ {α,β}

(G) ⇒ (L) ⇒ (P)

X2 X5, X4 | X1, X3 (1)

X1 X2
⇒ X2 X4 | X1, X3, X5 (2)
X5 X3

X4 To go from (2) to (1) need X5 X2 | X1,X3? or CI5

Lauritzen, Dawid, Larsen & Leimer (1990)
Factorizations

A density f is said to “factorize according to G” if:

f(x) = Π ψC(xC)
C εC

• cliques are maximally complete subgraphs “clique potentials”

Proposition: If f factorizes according to a UG G, then it also
obeys the global Markov property

“Proof”: Let S separate A from B in G and assume V = A ! B ! S .

Let CA be the set of cliques with non-empty intersection with A.
Since S separates A from B, we must have B " C = ! for all C
in CA. Then:
f ( x) = #$ C ( xC ) #$ C ( xC ) = f1 ( x A! S ) f 2 ( xB ! S )
C"C A C"C \ C A
Markov Properties for Acyclic Directed Graphs
(Bayesian Networks)

(Global) S separates A from B in Gan(A,B,S)m ⇒ A B|S

(Local) α nd(α)\pa(α) | pa (α)

(G) ⇔ (L)
X1
X1

X3
X3

X2
X2

Lauritzen, Dawid, Larsen & Leimer (1990)

Factorizations

A density f admits a “recursive factorization” according to an

ADG G if f(x) = Π f(xv | xpa(v) )

ADG Global Markov Property ⇔ f(x) = Π f(xv | xpa(v) )

v εV

Lemma: If P admits a recursive factorization according to an

ADG G, then P factorizes according GM (and chordal
supergraphs of GM)

Lemma: If P admits a recursive factorization according to an

ADG G, and A is an ancestral set in G, then PA admits a
recursive factorization according to the subgraph GA
Factorizations

A p(A,B,C,D,E,F,G,H,S) =
p(A)p(C|A)p(D|C)p(S|D,F)p(E|S)
C G B p(F|G)p(G|B)p(H|S,B)p(B)
D F ⇒
S
H
p(S|A,B,C,D,E,F,G,H) ∝
E p(S|D,F)p(E|S)p(H|S,B)

{D,F,W,H,B} is the “Markov Blanket” of S. It contains the parents of

S, the children of S, and the other parents of the children of S.
Markov Properties for Acyclic Directed Graphs
(Bayesian Networks)

(Global) S separates A from B in Gan(A,B,S)m ⇒ A B|S

(Local) α nd(α)\pa(α) | pa (α)

(G) ⇒ (L) α ∪ nd(α) is an ancestral set; pa(α) obviously

separates α from nd(α)\pa(α) in Gan(α∪nd(α))m

(L) ⇒ (factorization) induction on the number of vertices

d-separation

A chain π from a to b in an acyclic directed graph G is said to be

blocked by S if it contains a vertex γ ∈ π such that either:
- γ ∈ S and arrows of π do not meet head to head at γ, or
- γ ∉ S nor has γ any descendents in S, and arrows of π
do meet head to head at g
Two subsets A and B are d-separated by S if all chains from A
to B are blocked by S
d-separation and global markov property

Let A, B, and S be disjoint subsets of a directed, acyclic graph,

G. Then S d-separates A from B if and only if S separates A
from B in Gan(A,B,S)m
UG – ADG Intersection

A B C

C A|B

A A D
A B C
B
C C B A C|B

A C A B | C,D A B C
C D | A,B
A C|B

A B C

A C|B
UG – ADG Intersection

UG ADG

Decomposable

•UG is decomposable if chordal

No CI5
•ADG is decomposable if moral
•Decomposable ~ closed-form log-linear models
Chordal Graphs and RIP

•Chordal graphs (uniquely) admit clique orderings

that have the Running Intersection Property
V 1. {V,T}
2. {A,L,T}
T L 3. {L,A,B}
A S 4. {S,L,B}
5. {A,B,D}
X D B 6. {A,X}

•The intersection of each set with those earlier in the list is fully contained
in previous set
•Can compute cond. probabilities (e.g. Pr(X|V)) by message passing
(Lauritzen & Spiegelhalter, Dawid, Jensen)
Probabilistic Expert System

•Computationally intractable
•Inscrutable
•Requires vast amounts of data/elicitation

•Chordal UG models facilitate fast inference

•ADG models better for expert system applications –

more natural to specify Pr( v | pa(v) )
Factorizations

UG Global Markov Property ⇔ f(x) = Π ψC(xC)

C εC

ADG Global Markov Property ⇔ f(x) = Π f(xv | xpa(v) )

v εV
Lauritzen-Spiegelhalter Algorithm

•Moralize
•Triangulate

Algorithm is widely deployed in commercial software

L&S Toy Example

Pr(C|B)=0.2 Pr(C|¬B)=0.6
A B C Pr(B|A)=0.5 Pr(B|¬A)=0.1
Pr(A)=0.7

ψ(A,B) ← Pr(B|A)Pr(A)
A B C
ψ (B,C) ← Pr(C|B)

B ¬B C ¬C
B ¬B
AB B BC A 0.35 0.35 B 0.2 0.8
1 1
¬A 0.03 0.27 ¬B 0.6 0.4

Message Schedule: AB BC BC AB Pr(A|C)

¬C C ¬C
C
B ¬B B 0.076 0
B 0.076 0.304
0.38 0.62 ¬B 0.372
¬B 0.372 0.248 0
Other Theoretical Developments

Do the UG and ADG global Markov properties

identify all the conditional independences implied
by the corresponding factorizations?

Yes. Completeness for ADGs by Geiger and Pearl

(1988); for UGs by Frydenberg (1988)

Graphical characterization of collapsibility in

hierarchical log-linear models
(Asmussen and Edwards, 1983)
Collapsibility

Survival Survival
No Yes No Yes
Less 3 176 1.7% Less 17 197 7.9%
Care Care
More 4 293 1.4% More 2 23 8.0%

Clinic A Clinic B

Survival
No Yes
Less 20 373 5.1%
Care
More 6 316 1.9%

Pooled
Collapsibility

Care Clinic Surv.

Theorem: A graphical log-linear model L

is collapsible onto A iff every connected
component of Ac is complete.
Bayesian Learning for Discrete ADG’s

• Example: three binary variables

• Five parameters:
Local and Global Independence
Bayesian learning
Consider a particular state pa(v)+ of pa(v)
Equivalence Classes and Chain Graphs

• ADG models for a fixed set of vertices decompose into

Markov equivalence classes:
A B C A B C A B C

A C|B
b b b

a d a d a d
A D | B,C
c c c

D1 D2 D3
B C|A
b

a d A D | B,C
c B C
D4
Why is this a problem?

• Repeating analyses for equivalent ADGs leads to significant

computational inefficiencies.
• Ensuring that equivalent ADGs have equal posterior
probabilities imposes severe constraints on prior
distributions (Geiger and Heckerman, 1995).
• Bayesian model averaging procedures that average across
ADGs assign weights to statistical models that are
proportional to equivalence class sizes.
Equivalence Class Characterization

Theorem (Verma & Pearl, Glymour et al, Frydenberg, AMP94):

Two ADGs are Markov equivalent iff they have the same
skeletons and the same immoralities.

Definition The essential graph D* associated with D is the graph

D* := ∪(D’|D’ ~ D),

b b
b
c
a d a d
a d
c c

b b b

a d a d a d

c c c

D1 D2 D3
Essential Graphs
AMP (1995)

• Essential graphs are chain graphs

• D* is the unique smallest chain graph Markov equivalent to D
• A graph G = (V, E) is equal to D* for some ADG D if and only if G
satisfies the following four conditions:
(i) G is a chain graph;
(ii) For every chain component t of G, Gt is chordal;
(iii) The configuration a→bc does not occur as an induced
subgraph of G;
(iv) Every arrow a→b ∈ G is strongly protected in G:
c2

a b a b a b a b (c1!c2)

c c c c1
(a) (b) (c) (d)

also Meek (1995) and Chickering (1995)

What’s a Chain Graph?

“Equivalence”:
a ~ b iff a b

Chain Components (“Boxes”)

Chain Graphs

UG ADG

Decomposable CG

•Chain graph Markov property, Frydenberg (1990)

•Equivalence results (LWF, AMP, Meek, Studeny)

A D
C D | A,B or C D|A ?
C B
Cox & Wermuth (1996)

Directed vs. Undirected Graphical Models
No ratings yet
Directed vs. Undirected Graphical Models
16 pages
Perfectness in Gaussian Graphical Models
No ratings yet
Perfectness in Gaussian Graphical Models
15 pages
16 Graphical Models
No ratings yet
16 Graphical Models
27 pages
Markov Networks: Overview and Comparison
No ratings yet
Markov Networks: Overview and Comparison
22 pages
PGM Theory Notes
No ratings yet
PGM Theory Notes
16 pages
Sampling in Graphical Markov Models
No ratings yet
Sampling in Graphical Markov Models
16 pages
Understanding Graphical Models in Probability
No ratings yet
Understanding Graphical Models in Probability
99 pages
Directed vs Undirected Graphical Models
No ratings yet
Directed vs Undirected Graphical Models
16 pages
Lauritzen - Graphical Models (1996)
No ratings yet
Lauritzen - Graphical Models (1996)
306 pages
Epi Summer 24
No ratings yet
Epi Summer 24
291 pages
Overview of Conditional Random Fields
No ratings yet
Overview of Conditional Random Fields
24 pages
Directed Graphical Models
No ratings yet
Directed Graphical Models
54 pages
Bayesian Methods for Continuous Densities
No ratings yet
Bayesian Methods for Continuous Densities
13 pages
Part 1 Lecture 4 Graphical Models
No ratings yet
Part 1 Lecture 4 Graphical Models
28 pages
Bayesian Networks Explained with R
No ratings yet
Bayesian Networks Explained with R
322 pages
Probability Rules and Bayesian Inference
No ratings yet
Probability Rules and Bayesian Inference
2 pages
Probabilistic Graphical Models: EEE 485/585 Statistical Learning and Data Analytics
No ratings yet
Probabilistic Graphical Models: EEE 485/585 Statistical Learning and Data Analytics
29 pages
Bayes Ball Algorithm in DAGs
No ratings yet
Bayes Ball Algorithm in DAGs
5 pages
Markov Random Fields in Vision
No ratings yet
Markov Random Fields in Vision
15 pages
Overview of Probabilistic Graphical Models
No ratings yet
Overview of Probabilistic Graphical Models
43 pages
Bayesian Networks and Irrelevance Axioms
No ratings yet
Bayesian Networks and Irrelevance Axioms
48 pages
Markov Triplets in Graphical Models
No ratings yet
Markov Triplets in Graphical Models
4 pages
Gibbs and Markov Random Fields Explained
No ratings yet
Gibbs and Markov Random Fields Explained
6 pages
Quantum Probability in Graph Comparison
No ratings yet
Quantum Probability in Graph Comparison
60 pages
17 Factor Graphs
No ratings yet
17 Factor Graphs
27 pages
Graphlet Frequency Distribution Method
No ratings yet
Graphlet Frequency Distribution Method
4 pages
Graphical Models: Markov Properties & Algorithms
No ratings yet
Graphical Models: Markov Properties & Algorithms
104 pages
Markov Chains and Random Walks Explained
No ratings yet
Markov Chains and Random Walks Explained
8 pages
Understanding Graphical Models in Probability
No ratings yet
Understanding Graphical Models in Probability
85 pages
A Review of Gaussian Markov Models For Conditional Independence
No ratings yet
A Review of Gaussian Markov Models For Conditional Independence
30 pages
Fundamental Concepts in Graph Theory
No ratings yet
Fundamental Concepts in Graph Theory
61 pages
Understanding Directed Graphical Models
No ratings yet
Understanding Directed Graphical Models
45 pages
Understanding Bayesian Networks and Inference
No ratings yet
Understanding Bayesian Networks and Inference
58 pages
Bayesian Belief Networks Overview
No ratings yet
Bayesian Belief Networks Overview
21 pages
PPT10-W10-Graph Analytics For Big Data
No ratings yet
PPT10-W10-Graph Analytics For Big Data
55 pages
Independence and Directed Graphs Explained
No ratings yet
Independence and Directed Graphs Explained
21 pages
Understanding Factor Graphs in Probability
No ratings yet
Understanding Factor Graphs in Probability
3 pages
Markov Random Fields in Probabilistic Inference
No ratings yet
Markov Random Fields in Probabilistic Inference
7 pages
Understanding Bayesian Networks
No ratings yet
Understanding Bayesian Networks
15 pages
BN Lecture2
No ratings yet
BN Lecture2
37 pages
Markov Chains: Theory and Applications
No ratings yet
Markov Chains: Theory and Applications
44 pages
Minimal I-Maps and Chordal Graphs Explained
No ratings yet
Minimal I-Maps and Chordal Graphs Explained
8 pages
Inference Techniques in Graphical Models
No ratings yet
Inference Techniques in Graphical Models
22 pages
Prob Inf
No ratings yet
Prob Inf
56 pages
Bayesian Networks for Researchers
No ratings yet
Bayesian Networks for Researchers
30 pages
Machine Learning Models Overview
No ratings yet
Machine Learning Models Overview
38 pages
Graph Theory and Its Applications
No ratings yet
Graph Theory and Its Applications
71 pages
Machine Learning Practical Certification
No ratings yet
Machine Learning Practical Certification
46 pages
Understanding Markov Random Fields
No ratings yet
Understanding Markov Random Fields
6 pages
Causal Bayesian Network Learning Methods
No ratings yet
Causal Bayesian Network Learning Methods
39 pages
Detecting Weighted Hidden Cliques
No ratings yet
Detecting Weighted Hidden Cliques
18 pages
Understanding Bayesian Networks in ML
No ratings yet
Understanding Bayesian Networks in ML
8 pages
Kuleshov's Deep Generative Models Lecture
No ratings yet
Kuleshov's Deep Generative Models Lecture
31 pages
Learning Bayesian Networks in R
No ratings yet
Learning Bayesian Networks in R
11 pages
Probabilistic Arguments in Graph Theory
No ratings yet
Probabilistic Arguments in Graph Theory
7 pages
Introduction to Probabilistic Graphical Models
No ratings yet
Introduction to Probabilistic Graphical Models
65 pages
Graphical Models and Belief Propagation
No ratings yet
Graphical Models and Belief Propagation
101 pages
Lecture 11
No ratings yet
Lecture 11
32 pages
Tractor Driveline Vibration Analysis
No ratings yet
Tractor Driveline Vibration Analysis
14 pages
Skip List Data Structure Overview
No ratings yet
Skip List Data Structure Overview
4 pages
Applied Mathematics I Course Overview
No ratings yet
Applied Mathematics I Course Overview
3 pages
Metaphysics
100% (7)
Metaphysics
148 pages
Stock Prices & Future Earnings Analysis
No ratings yet
Stock Prices & Future Earnings Analysis
29 pages
Comments On "Sinc Interpolation of Discrete Periodic Signals"
No ratings yet
Comments On "Sinc Interpolation of Discrete Periodic Signals"
9 pages
Algebra Review
No ratings yet
Algebra Review
3 pages
MMW Midterm Exam - Google Forms
No ratings yet
MMW Midterm Exam - Google Forms
20 pages
JEE Main 2020 Part Test II AITS
No ratings yet
JEE Main 2020 Part Test II AITS
20 pages
Statistics All Formula
No ratings yet
Statistics All Formula
11 pages
Trigonometric Notes With Tricks and Formulas)
No ratings yet
Trigonometric Notes With Tricks and Formulas)
64 pages
Noliac CEramics NCE Datasheet
No ratings yet
Noliac CEramics NCE Datasheet
7 pages
Soil Dynamics Assignment: Vibration Analysis
0% (1)
Soil Dynamics Assignment: Vibration Analysis
2 pages
Understanding ANOVA: Key Concepts and Applications
100% (1)
Understanding ANOVA: Key Concepts and Applications
12 pages
Bal Bharti Half Yearly
No ratings yet
Bal Bharti Half Yearly
6 pages
Interval-Valued Fuzzy Matrices Analysis
No ratings yet
Interval-Valued Fuzzy Matrices Analysis
6 pages
Lua Tutorial
0% (2)
Lua Tutorial
38 pages
SAS Data Analysis Techniques
No ratings yet
SAS Data Analysis Techniques
46 pages
Odd and Even Numbers Lesson Plan
No ratings yet
Odd and Even Numbers Lesson Plan
9 pages
Homeostasis Lab: Heart Rate Analysis
No ratings yet
Homeostasis Lab: Heart Rate Analysis
2 pages
Forced-Convection Heat Transfer Equations
No ratings yet
Forced-Convection Heat Transfer Equations
29 pages
App B Teknical SCOP NEF En14825 HeatpumpsEN
No ratings yet
App B Teknical SCOP NEF En14825 HeatpumpsEN
10 pages
Flash Issta20
No ratings yet
Flash Issta20
14 pages
Logical Reasoning and Analytical Ability Syllogism 140401024135 Phpapp01 PDF
No ratings yet
Logical Reasoning and Analytical Ability Syllogism 140401024135 Phpapp01 PDF
17 pages
Engineering Physics Syllabus PDF
No ratings yet
Engineering Physics Syllabus PDF
3 pages
A-Level Revision List - Oct2025
No ratings yet
A-Level Revision List - Oct2025
2 pages
Research Methodology for SMEs Study
No ratings yet
Research Methodology for SMEs Study
5 pages
Numerical Analysis of Dry Tribology
No ratings yet
Numerical Analysis of Dry Tribology
11 pages
Engineering Math Quiz: Asset & Interest Basics
No ratings yet
Engineering Math Quiz: Asset & Interest Basics
2 pages
Chapter 4 PDF
No ratings yet
Chapter 4 PDF
30 pages

Probabilistic Graphical Models Overview

Uploaded by

Probabilistic Graphical Models Overview

Uploaded by

PROBABILISTIC GRAPHICAL

•Explosion of interest in “Expert Systems” in the

IF the infection is primary-bacteremia

•Many companies (Teknowledge, IntelliCorp,

•Ad-hoc uncertainty handling

What if both A and B true?

p1 + (p2 X (1- p1))

“Currently fashionable ad-hoc mumbo jumbo”

e.g., for n dichotomous variables need 2n - 1

•Suppose A and B are marginally independent. Pr(A),

given B: Pr(A), Pr(B|A) X 2, Pr(C|B) X 2 = 5

•Chain with 50 variables requires 99 probabilities

For any probability measure P and random variables A, B, and C:

CI 4: A B and A C | B [P] ⇒ A B ∪ C [P]

Some probability measures also satisfy:

CI5 satisfied whenever P has a positive joint probability density with

(Global) S separates A from B ⇒ A B|S

(G) ⇒ (L) ⇒ (P)

X2 X5, X4 | X1, X3 (1)

X4 To go from (2) to (1) need X5 X2 | X1,X3? or CI5

A density f is said to “factorize according to G” if:

• cliques are maximally complete subgraphs “clique potentials”

“Proof”: Let S separate A from B in G and assume V = A ! B ! S .

(Global) S separates A from B in Gan(A,B,S)m ⇒ A B|S

Lauritzen, Dawid, Larsen & Leimer (1990)

A density f admits a “recursive factorization” according to an

ADG Global Markov Property ⇔ f(x) = Π f(xv | xpa(v) )

Lemma: If P admits a recursive factorization according to an

Lemma: If P admits a recursive factorization according to an

{D,F,W,H,B} is the “Markov Blanket” of S. It contains the parents of

(Global) S separates A from B in Gan(A,B,S)m ⇒ A B|S

(G) ⇒ (L) α ∪ nd(α) is an ancestral set; pa(α) obviously

(L) ⇒ (factorization) induction on the number of vertices

A chain π from a to b in an acyclic directed graph G is said to be

Let A, B, and S be disjoint subsets of a directed, acyclic graph,

•UG is decomposable if chordal

•Chordal graphs (uniquely) admit clique orderings

•Chordal UG models facilitate fast inference

•ADG models better for expert system applications –

UG Global Markov Property ⇔ f(x) = Π ψC(xC)

ADG Global Markov Property ⇔ f(x) = Π f(xv | xpa(v) )

Algorithm is widely deployed in commercial software

Message Schedule: AB BC BC AB Pr(A|C)

Do the UG and ADG global Markov properties

Yes. Completeness for ADGs by Geiger and Pearl

Graphical characterization of collapsibility in

Care Clinic Surv.

Theorem: A graphical log-linear model L

• Example: three binary variables

• ADG models for a fixed set of vertices decompose into

• Repeating analyses for equivalent ADGs leads to significant

Theorem (Verma & Pearl, Glymour et al, Frydenberg, AMP94):

Definition The essential graph D* associated with D is the graph

• Essential graphs are chain graphs

also Meek (1995) and Chickering (1995)

Chain Components (“Boxes”)

•Chain graph Markov property, Frydenberg (1990)

You might also like