Conditional Independence Relations

Yunshu Liu
ASPITRG Research Group
2013-02-20
Motivation
What are conditional independence relations?
A is conditionally independent of B given C
Example: Suppose MIT and Stanford accepted
undergraduate students only based on GPA
MIT : Accepted by MIT
Stanford : Accepted by Stanford
[Figure: diagram relating the nodes MIT, GPA and Stanford]
Given Alice's GPA, written GPA_Alice,
P(MIT | Stanford, GPA_Alice) = P(MIT | GPA_Alice)
We say MIT is conditionally independent of Stanford given GPA_Alice.
This is sometimes written (MIT ⊥ Stanford | GPA_Alice).
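The admission example can be checked numerically. The sketch below is only an illustration: the probabilities, the two GPA levels and the helper name `p_mit` are assumptions that do not appear on the slides. It builds a joint distribution in which each admission decision depends on GPA alone, then compares the conditional probabilities.

```python
# Hypothetical numbers, chosen only for illustration.
p_gpa = {"high": 0.3, "low": 0.7}              # P(GPA)
p_mit_given_gpa = {"high": 0.6, "low": 0.1}    # P(MIT = 1 | GPA)
p_stan_given_gpa = {"high": 0.5, "low": 0.2}   # P(Stanford = 1 | GPA)


def bernoulli(p_one, x):
    """Probability that a 0/1 variable with P(1) = p_one takes the value x."""
    return p_one if x == 1 else 1.0 - p_one


# Joint distribution over (GPA, MIT, Stanford): each decision depends on GPA only.
joint = {
    (g, m, s): p_gpa[g] * bernoulli(p_mit_given_gpa[g], m) * bernoulli(p_stan_given_gpa[g], s)
    for g in p_gpa for m in (0, 1) for s in (0, 1)
}


def p_mit(gpa=None, stanford=None):
    """P(MIT = 1 | given values); None means 'not conditioned on'."""
    def matches(g, s):
        return (gpa is None or g == gpa) and (stanford is None or s == stanford)
    den = sum(p for (g, m, s), p in joint.items() if matches(g, s))
    num = sum(p for (g, m, s), p in joint.items() if matches(g, s) and m == 1)
    return num / den


# Given the GPA, knowing the Stanford decision changes nothing.
print(p_mit("high", stanford=1), p_mit("high"))
# Without the GPA, the two decisions are dependent.
print(p_mit(stanford=1), p_mit())
```

In this toy model the two admissions are dependent marginally but independent once the GPA is known, which is exactly the statement (MIT ⊥ Stanford | GPA).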
Motivation
[Figure: conditional independence relations shown in relation to matroid theory, the entropic region, network coding and probabilistic reasoning]
References:
[1] F. Matúš and M. Studený, Conditional Independences among Four Random Variables I, Combinatorics, Probability and Computing, 1995, pp. 269-278.
[2] F. Matúš, Infinitely Many Information Inequalities, IEEE Int. Symp. Information Theory (ISIT), 2007, pp. 41-44.
Outline
Preliminaries on Matroid theory
Matroid theory and Conditional Independence Relations
Infinitely many Information Inequalities

Definitions of Matroid
What is a matroid?
A matroid is an independence structure that captures and generalizes the notion of linear independence in vector spaces.
Independent-set-based definition of a matroid

A finite matroid can be represented by a pair (E, I), where E is a finite set called the ground set and I is a family of subsets of E, called independent sets, obeying the following properties:
- ∅ ∈ I
- I1 ∈ I implies I2 ∈ I for every subset I2 ⊆ I1
- I1, I2 ∈ I with |I1| < |I2| implies there is an e ∈ I2 \ I1 such that I1 ∪ {e} ∈ I
Definitions of Matroid
Rank-function-based definition of a matroid

A rank function is a function r from the subsets of the ground set E to the integers satisfying the following conditions:
- If X ⊆ E, then 0 ≤ r(X) ≤ |X|
- If X ⊆ Y ⊆ E, then r(X) ≤ r(Y)
- If X, Y ⊆ E, then r(X) + r(Y) ≥ r(X ∪ Y) + r(X ∩ Y)
NOTE: The value of the rank function is always a non-negative integer.
Examples for |E| = 4
For r = [r_1 r_2 r_12 r_3 r_13 r_23 r_123 r_4 r_14 r_24 r_124 r_34 r_134 r_234 r_1234]:
- r1 = [0 0 0 1 1 1 1 1 1 1 1 1 1 1 1] is a matroid
- r2 = [2 1 2 1 2 2 2 1 2 2 2 2 2 2 2] is not a matroid (r_1 = 2 > |{1}|)
- r3 = [2 2 3 2 3 3 4 2 3 3 4 4 4 4 4] is not a matroid (r_1 = 2 > |{1}|)
Polymatroids and Matroids
Polymatroidal axioms
Let f map subsets of the ground set E to nonnegative real numbers. The following conditions are called the polymatroidal axioms:
- f(∅) = 0
- If X ⊆ Y ⊆ E, then f(X) ≤ f(Y)
- If X, Y ⊆ E, then f(X) + f(Y) ≥ f(X ∪ Y) + f(X ∩ Y)
Examples: r1, r2 and r3 all correspond to polymatroids.
Relationship between Matroids and Polymatroids

Matroids are polymatroids with an integer-valued rank function that satisfies the cardinality constraint r(X) ≤ |X|.
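The axioms are easy to test by brute force on the three example vectors. The following is a minimal sketch, assuming the 15 coordinates are indexed by bitmask (element i is bit i − 1), which reproduces the order 1, 2, 12, 3, 13, ... used for the example vectors; the function names are illustrative and not taken from the slides.

```python
def value(vec, mask):
    """Value of the set function on the subset encoded by `mask` (empty set -> 0)."""
    return 0 if mask == 0 else vec[mask - 1]


def is_polymatroid(vec, n=4):
    """Check monotonicity and submodularity over all pairs of subsets."""
    full = 1 << n
    for x in range(full):
        for y in range(full):
            # Monotone: X subset of Y implies f(X) <= f(Y).
            if x & y == x and value(vec, x) > value(vec, y):
                return False
            # Submodular: f(X) + f(Y) >= f(X u Y) + f(X n Y).
            if value(vec, x) + value(vec, y) < value(vec, x | y) + value(vec, x & y):
                return False
    return True


def is_matroid_rank(vec, n=4):
    """A polymatroid that is integer valued and satisfies r(X) <= |X|."""
    integer = all(isinstance(v, int) for v in vec)
    bounded = all(0 <= value(vec, x) <= bin(x).count("1") for x in range(1 << n))
    return integer and bounded and is_polymatroid(vec, n)


r1 = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
r2 = [2, 1, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2]
r3 = [2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 4, 4, 4, 4]

for name, vec in (("r1", r1), ("r2", r2), ("r3", r3)):
    print(name, "polymatroid:", is_polymatroid(vec), "matroid:", is_matroid_rank(vec))
```

All three vectors pass the polymatroid check, while only r1 also respects the cardinality bound.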
Conditional Independence Relations
Phrase (i, j|K)

Let N = {1, 2, ..., n} and let S = S(N) be the family of all couples (i, j|K), where K ⊂ N and ij is the union of two singletons i and j of N − K (i = j is allowed).
Example
For N = {1, 2, 3} there are 18 such couples (i, j|K), including the cases with i = j. They are listed below:
(1, 1|∅), (1, 1|2), (1, 1|3), (1, 1|23),
(2, 2|∅), (2, 2|1), (2, 2|3), (2, 2|13),
(3, 3|∅), (3, 3|1), (3, 3|2), (3, 3|12),
(1, 2|∅), (1, 2|3), (1, 3|∅), (1, 3|2), (2, 3|∅), (2, 3|1)
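The count can be reproduced by direct enumeration. This is a small sketch with an assumed representation (a couple is stored as a tuple `(i, j, K)` with i ≤ j and K a frozenset); the function name `couples` is hypothetical.

```python
from itertools import combinations


def couples(n):
    """All couples (i, j|K) on N = {1, ..., n}, with i <= j and K a subset of N - {i, j}."""
    ground = range(1, n + 1)
    result = []
    for i in ground:
        for j in range(i, n + 1):          # (i, j|K) and (j, i|K) are the same couple
            rest = [e for e in ground if e not in (i, j)]
            for size in range(len(rest) + 1):
                for K in combinations(rest, size):
                    result.append((i, j, frozenset(K)))
    return result


print(len(couples(3)))  # 12 couples with i = j plus 6 with i != j
print(len(couples(4)))
```

The same enumeration applied to a four-element ground set gives the index set over which the relations for |N| = 4 are defined.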
p-representation
Probabilistic (p-) representation
A relation L ⊂ S(N) is called probabilistically representable if there exists a system of n random variables ξ = {ξ_i}_{i∈N} such that:
- L = |[ξ]| = {(i, j|K) ∈ S(N) | ξ_i is conditionally independent of ξ_j given ξ_K, i.e. I(ξ_i; ξ_j | ξ_K) = 0}.
We use P(N) to denote the set of all p-representable relations on N.
Examples for |E| = 4, considering the couple (2, 3|1)

- r1 = [0 0 0 1 1 1 1 1 1 1 1 1 1 1 1] is p-representable
- r2 = [2 1 2 1 2 2 2 1 2 2 2 2 2 2 2] is p-representable
- r3 = [2 2 3 2 3 3 4 2 3 3 4 4 4 4 4] is not p-representable
Semimatroid
Definition of Semimatroid
For r ∈ ℝ^{𝒫(N)} (a real-valued function on the subsets of N) we define |[r]| as
- |[r]| = {(i, j|K) ∈ S(N) | r(iK) + r(jK) − r(ijK) − r(K) = 0}.
A relation L ⊂ S(N) is called a semimatroid if and only if L = |[r]| for some r ∈ Γ_N.
- Here Γ_N is the Shannon outer bound for N random variables, which is also the region containing all polymatroids.
- We use Semi(N) to denote the set of all semimatroids on N.
- We say that the semimatroid L arises from r.
Examples
- p-representable semimatroids are semimatroids arising from entropic vectors h
- matroidal semimatroids are semimatroids arising from rank functions of matroids
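The relation |[r]| can be computed mechanically from a vector r. The sketch below assumes the same coordinate order as the example vectors r1, r2, r3 (element i is bit i − 1 of the index) and stores couples as `(i, j, K)` tuples; the names `induced_relation` and `value` are illustrative.

```python
from itertools import combinations


def value(vec, elems):
    """r(elems) for a set of elements of {1, ..., n}; the empty set has value 0."""
    mask = sum(1 << (e - 1) for e in elems)
    return 0 if mask == 0 else vec[mask - 1]


def induced_relation(vec, n=4, tol=1e-9):
    """|[r]|: all couples (i, j|K) with r(iK) + r(jK) - r(ijK) - r(K) = 0."""
    ground = range(1, n + 1)
    relation = set()
    for i in ground:
        for j in range(i, n + 1):
            rest = [e for e in ground if e not in (i, j)]
            for size in range(len(rest) + 1):
                for K in combinations(rest, size):
                    k_set = set(K)
                    delta = (value(vec, k_set | {i}) + value(vec, k_set | {j})
                             - value(vec, k_set | {i, j}) - value(vec, k_set))
                    if abs(delta) <= tol:
                        relation.add((i, j, frozenset(K)))
    return relation


r1 = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
r2 = [2, 1, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2]
r3 = [2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 4, 4, 4, 4]

couple = (2, 3, frozenset({1}))            # the couple (2, 3|1)
for name, vec in (("r1", r1), ("r2", r2), ("r3", r3)):
    print(name, couple in induced_relation(vec), len(induced_relation(vec)))
```

For all three example vectors the couple (2, 3|1) belongs to the induced relation, so membership of a single couple does not by itself separate p-representable from non-p-representable relations.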

Relationship between p-representable relations and semimatroids
Every p-representable relation is a semimatroid: P(N) ⊆ Semi(N)
- |[ξ]| = {(i, j|K) ∈ S(N) | I(ξ_i; ξ_j | ξ_K) = 0} ∈ P(N)
- |[r]| = {(i, j|K) ∈ S(N) | r(iK) + r(jK) − r(ijK) − r(K) = 0} ∈ Semi(N)
P(N) = Semi(N) for |N| ≤ 3

Recall Γ_N = Γ̄*_N for |N| ≤ 3
P(N) ⊊ Semi(N) for |N| = 4

Recall Γ̄*_4 ⊊ Γ_4
Examples:
[Figure: the Shannon outer bound Γ_4 and, inside it, the entropic region Γ*_4]
- |[r1]| and |[r2]| are p-representable
- |[r3]| is a semimatroid but not p-representable
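Matroids, p-representable semimatroids and semimatroids for |N| = 4
Every matroid that is linearly representable over a finite field is also p-representable. Since every matroid on at most four elements is linearly representable over some finite field, all matroids for |N| ≤ 4 are p-representable.
[Figure: for N = 4, nested regions with matroids inside p-representable semimatroids inside semimatroids]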
Characterization of the extreme rays of Γ4
Matroids that are extreme rays of Γ4

For N = {1, 2, 3, 4}, with I ⊆ N and 0 ≤ t ≤ |N \ I|, define
r_t^I(J) = min{t, |J \ I|} for J ⊆ N.
Then r_t^I is the matroid of rank t with loops I.
Γ_4 has 27 matroid extreme rays, all of which can be expressed as r_t^I for some t and I.
Non-matroid extreme rays of Γ_4

For N = {1, 2, 3, 4} = {i, j, k, l}, define the following functions:

g_i^(2)(J) = 2                  if J = {i}
             min{2, |J|}        if J ≠ {i}

g_i^(3)(J) = |J|                if i ∉ J
             min{3, |J| + 1}    if i ∈ J

f_ij(K) = 3                     if K ∈ {ik, jk, il, jl, kl}
          min{4, 2|K|}          otherwise
Extreme rays of Γ_4
Γ_4 has 41 extreme rays, including
- 27 matroids of the form r_t^I for some t and I,
- 4 extreme rays of the form g_i^(2) for i = 1, 2, 3, 4,
- 4 extreme rays of the form g_i^(3) for i = 1, 2, 3, 4,
- 6 extreme rays of the form f_ij for i, j ∈ N and i ≠ j.
Examples for |E| = 4
- r1 = r_1^{12} = [0 0 0 1 1 1 1 1 1 1 1 1 1 1 1]
- r2 = g_1^(2) = [2 1 2 1 2 2 2 1 2 2 2 2 2 2 2]
- r3 = f_34 = [2 2 3 2 3 3 4 2 3 3 4 4 4 4 4]
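The three example vectors can be regenerated from the closed-form definitions of the extreme rays. The sketch below is illustrative only: the helper names (`r`, `g2`, `g3`, `f`) are assumptions, and the coordinates are listed in the order 1, 2, 12, 3, 13, 23, 123, 4, ... used for the example vectors.

```python
N = 4


def subsets_in_slide_order(n=N):
    """Nonempty subsets of {1, ..., n} as frozensets, ordered 1, 2, 12, 3, 13, ..."""
    return [frozenset(e for e in range(1, n + 1) if (mask >> (e - 1)) & 1)
            for mask in range(1, 1 << n)]


def r(t, loops):
    """r_t^I(J) = min{t, |J minus I|}, with I given by `loops`."""
    return [min(t, len(J - frozenset(loops))) for J in subsets_in_slide_order()]


def g2(i):
    """g_i^(2)(J) = 2 if J = {i}, else min{2, |J|}."""
    return [2 if J == frozenset({i}) else min(2, len(J)) for J in subsets_in_slide_order()]


def g3(i):
    """g_i^(3)(J) = |J| if i is not in J, else min{3, |J| + 1}."""
    return [min(3, len(J) + 1) if i in J else len(J) for J in subsets_in_slide_order()]


def f(i, j):
    """f_ij(K) = 3 if K is one of ik, jk, il, jl, kl, else min{4, 2|K|}."""
    k, l = sorted(set(range(1, N + 1)) - {i, j})
    three = {frozenset(p) for p in ((i, k), (j, k), (i, l), (j, l), (k, l))}
    return [3 if K in three else min(4, 2 * len(K)) for K in subsets_in_slide_order()]


print(r(1, {1, 2}))   # r_1^{12}
print(g2(1))          # g_1^(2)
print(f(3, 4))        # f_34
```

Evaluating the formulas this way is a quick consistency check between the definitions and the listed vectors.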
Extreme rays of Γ_4
|[g_i^(2)]|, |[g_i^(3)]| and |[f_ij]| are all semimatroids; among them
- |[g_i^(2)]| and |[g_i^(3)]| are p-representable,
- |[f_ij]| are not p-representable.
[Figure: for N = 4, the r_t^I lie in the matroid region, |[g_i^(2)]| and |[g_i^(3)]| in the p-representable region, and |[f_ij]| among the semimatroids outside the p-representable region]
Ingleton inequality
Ingleton inequality
Ingleton_12
= I(X1; X2|X3) + I(X1; X2|X4) + I(X3; X4|∅) − I(X1; X2|∅)
= h12 + h13 + h23 + h14 + h24 − h1 − h2 − h34 − h123 − h124 ≥ 0
R_4: the cone generated by the four-variable Shannon-type inequalities and the 6 Ingleton inequalities
R_4 has 35 extreme rays, including
- 27 matroids of the form r_t^I for some t and I,
- 8 extreme rays of the form g_i^(2) and g_i^(3) for i = 1, 2, 3, 4.
Note: f_ij ∉ R_4 for i, j ∈ N and i ≠ j
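The Ingleton expression is a linear function of the 15 coordinates, so it can be evaluated directly on the example vectors. The sketch below assumes the coordinate order 1, 2, 12, 3, 13, ... of the example vectors; `h`, `cmi` and `ingleton` are illustrative helper names.

```python
def h(vec, *elems):
    """Coordinate of `vec` for the set of the given elements (empty set -> 0)."""
    mask = sum(1 << (e - 1) for e in set(elems))
    return 0 if mask == 0 else vec[mask - 1]


def cmi(vec, a, b, cond=()):
    """Formal I(X_a; X_b | X_cond) computed from the coordinates of vec."""
    c = tuple(cond)
    return h(vec, a, *c) + h(vec, b, *c) - h(vec, a, b, *c) - h(vec, *c)


def ingleton(vec, i, j, k, l):
    """Ingleton_ij = I(i;j|k) + I(i;j|l) + I(k;l) - I(i;j)."""
    return cmi(vec, i, j, (k,)) + cmi(vec, i, j, (l,)) + cmi(vec, k, l) - cmi(vec, i, j)


r1 = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
r2 = [2, 1, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2]
r3 = [2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 4, 4, 4, 4]

for name, vec in (("r1", r1), ("r2", r2), ("r3", r3)):
    print(name, "Ingleton_12 =", ingleton(vec, 1, 2, 3, 4))
```

With this indexing r1 and r2 give a nonnegative value, while r3 gives a negative one, so r3 lies in Γ_4 but outside R_4.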
Ingleton inequality
Ingleton semimatroid
A relation L ⊂ S(N) is called an Ingleton semimatroid if and only if L = |[r]| for some r ∈ R_N.
For |N| = 4 every Ingleton semimatroid is p-representable.

Reason:
Define the convex cone of all Ingleton semimatroids as InSemi(N); all the extreme rays of InSemi(N) are p-representable.
G_4^ij: the gap between R_4 and Γ_4
G_4^ij = {h ∈ Γ_4 | Ingleton_ij ≤ 0}
G_4^ij is the convex hull of 15 extreme rays; its V-representation is generated by the 15 linearly independent functions f_ij, r_1^{ijk}, r_1^{ijl}, r_1^{ikl}, r_1^{jkl}, r_1^∅, r_3^∅, r_1^i, r_1^j, r_1^{ik}, r_1^{jk}, r_1^{il}, r_1^{jl}, r_2^k, r_2^l.
The H-representation of G_4^ij consists of 14 Shannon-type inequalities together with Ingleton_ij ≤ 0.
List of semimatroids on four discrete variables
Ingleton semimatroid
There are 120 irreducible p-representable semimatroids of 16 types (counted up to permutations of N) over a four-element set N. Among them there are 36 Ingleton semimatroids of 11 types:
|[0]|, |[r_1^{N−i}]| for i ∈ N, |[r_1^{ij}]| for i, j ∈ N distinct, |[r_1^i]| for i ∈ N, |[r_1]|, |[r_2^i]| for i ∈ N, |[r_2^{ij}]| for i, j ∈ N distinct, |[r_2]|, |[r_3]|, |[g_i^(2)]| for i ∈ N, |[g_i^(3)]| for i ∈ N.
List of semimatroids on four discrete variables
non-Ingleton semimatroid
There are 120 irreducible p-representable semimatroids of 16 types (counted up to permutations of N) over a four-element set N. Among them there are 84 non-Ingleton semimatroids of 5 types:
L_ij^{kl|∅} = {(kl|i), (kl|j), (ij|∅)} ∪ {(kl|ij)} ∪ {(k|ij), (l|ij), (i|jkl), (j|ikl), (k|ijl), (l|ijk)}
L_ij^{(ij|kl)} = {(ij|k), (ij|l), (kl|ij)} ∪ {(kl|i), (kl|j)}
L_ij^{(ik|jl)} = {(kl|ij), (ij|k), (ik|l)} ∪ {(kl|j)} ∪ {(l|ij), (l|ijk)}
L_ij^{ik|j} = {(ij|k), (ik|l), (kl|j)} ∪ {(i|jkl), (j|ikl), (k|ijl), (l|ijk)}
L_ij^{jl|∅} = {(kl|i), (jl|k), (ij|∅)} ∪ {(kl|ij)} ∪ {(k|ij), (l|ij), (i|jkl), (j|ikl), (k|ijl), (l|ijk)}

Information Inequalities
Information Inequalities: Definition
Consider a linear function of all (2^n − 1) joint entropies associated with X:
f = Σ_i λ_{A_i} h_{A_i} ≥ 0,
where the λ_{A_i} are real coefficients. If an inequality of this form is true for any random variables {X1, ..., Xn}, we call it an information inequality.
[Figure: the region Γ*_N lying inside the half-space f ≥ 0]
Information Inequalities: examples
For jointly distributed discrete random variables A, B and C, we can define the conditional entropy of A given B, H(A|B), the mutual information between A and B, I(A; B), and the conditional mutual information I(A; B|C):
H(A|B) = H(A, B) − H(B) ≥ 0
I(A; B) = H(A) + H(B) − H(A, B) ≥ 0
I(A; B|C) = H(A, C) + H(B, C) − H(A, B, C) − H(C) ≥ 0
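These three quantities can be computed from any joint probability table. The sketch below uses an assumed toy distribution that is not on the slides: A and B are independent fair bits and C = A XOR B. Entropies are measured in bits.

```python
import math


def H(joint, coords):
    """Shannon entropy (bits) of the marginal on the given coordinate positions."""
    marginal = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in coords)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)


# Coordinates: 0 = A, 1 = B, 2 = C. Illustrative joint with C = A XOR B.
joint = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}

cond_entropy = H(joint, [0, 1]) - H(joint, [1])                        # H(A|B)
mutual_info = H(joint, [0]) + H(joint, [1]) - H(joint, [0, 1])         # I(A;B)
cond_mutual_info = (H(joint, [0, 2]) + H(joint, [1, 2])
                    - H(joint, [0, 1, 2]) - H(joint, [2]))             # I(A;B|C)

print(cond_entropy, mutual_info, cond_mutual_info)
```

In this example A and B are independent, yet they become dependent once C is known, so I(A; B) vanishes while I(A; B|C) does not.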
Shannon-Type Information inequality

Γ_N = { h ∈ ℝ^{2^N − 1} | h_A + h_B ≥ h_{A∩B} + h_{A∪B} ∀ A, B ⊆ N;  h_P ≥ h_Q ≥ 0 ∀ Q ⊆ P ⊆ N }
Relationship between Γ∗n and Γn

Γ*_2 = Γ_2 and Γ̄*_3 = Γ_3:
All the information inequalities on N ≤ 3 variables are Shannon-type.
For N = 2, using h1 + h2 ≥ h12, h12 ≥ h1 and h12 ≥ h2, we get
[Figure: the cone Γ_2 in the coordinates h1 = H(X1), h2 = H(X2), h12 = H(X1, X2), with apex (0 0 0) and rays through (0 1 1), (1 0 1) and (1 1 1)]
[Figure: for N ≥ 4, the entropic region Γ*_N, a better outer bound Γ^+_N and the Shannon outer bound Γ_N]
For N ≥ 4 we have Γ_N ≠ Γ̄*_N

Non-Shannon-type information inequalities exist for N ≥ 4.
The first discovered non-Shannon-type information inequality is the Zhang-Yeung inequality:
2I(X1; X2) ≤ I(X3; X4) + I(X3; X1, X2) + 3I(X1; X2|X3) + I(X1; X2|X4)

Zhang-Yeung Inequality and DFZ Inequality
Recall:
Ingleton_12
= I(X1; X2|X3) + I(X1; X2|X4) + I(X3; X4|∅) − I(X1; X2|∅)
Zhang-Yeung Inequality
Ingleton_12 + I(X1; X2|X3) + I(X1; X3|X2) + I(X2; X3|X1) ≥ 0
DFZ Inequality
2 Ingleton_12 + 3I(X1; X2|X3) + 3I(X1; X3|X2) + I(X2; X3|X1) ≥ 0
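The Ingleton form of the Zhang-Yeung inequality agrees with the form 2I(X1; X2) ≤ I(X3; X4) + I(X3; X1, X2) + 3I(X1; X2|X3) + I(X1; X2|X4). A short derivation, written as a sketch with h denoting joint entropies, moves everything to one side and groups the terms of Ingleton_12:

```latex
\begin{align*}
& I(X_3;X_4) + I(X_3;X_1,X_2) + 3I(X_1;X_2|X_3) + I(X_1;X_2|X_4) - 2I(X_1;X_2) \\
&\quad = \mathrm{Ingleton}_{12} + 2I(X_1;X_2|X_3) + I(X_3;X_1,X_2) - I(X_1;X_2) \\
&\quad = \mathrm{Ingleton}_{12} + I(X_1;X_2|X_3) + I(X_1;X_3|X_2) + I(X_2;X_3|X_1),
\end{align*}
% The last step uses the identity
%   I(X_1;X_2|X_3) + I(X_3;X_1,X_2) - I(X_1;X_2) = I(X_1;X_3|X_2) + I(X_2;X_3|X_1),
% since both sides equal 2h_{12} + h_{13} + h_{23} - 2h_{123} - h_1 - h_2.
```

The two statements of the inequality are therefore the same linear constraint on the entropy vector.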

Information Inequality
A sequence of information inequalities on four variables, for s = 1, 2, ...:
s·Ingleton_12 + I(X2; X3|X1) + (s(s + 1)/2) [I(X1; X2|X3) + I(X1; X3|X2)] ≥ 0
(s = 1 gives the Zhang-Yeung inequality and s = 2 the DFZ inequality.)
Derived by F. Matúš using adhesivity of polymatroids in [2]; later derived by R. Dougherty, C. Freiling and K. Zeger using D-copies in their preprint http://arxiv.org/abs/1104.3602
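The family can be spot-checked numerically on entropy vectors of actual distributions. The sketch below draws random joint distributions of four binary variables (an assumed test setup, not part of the slides), evaluates the left-hand side for s = 1, ..., 5, and also compares the s = 1 case with the Zhang-Yeung inequality in its original form. A numerical check of this kind illustrates the inequalities but proves nothing.

```python
import itertools
import math
import random


def random_joint(rng, n=4):
    """A random strictly positive joint pmf on n binary variables (illustrative)."""
    outcomes = list(itertools.product((0, 1), repeat=n))
    weights = [rng.random() + 1e-3 for _ in outcomes]
    total = sum(weights)
    return {x: w / total for x, w in zip(outcomes, weights)}


def H(joint, coords):
    """Entropy (bits) of the marginal on the 0-based coordinate set `coords`."""
    marginal = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in sorted(coords))
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)


def cmi(joint, a, b, c=frozenset()):
    """I(X_a; X_b | X_c) for coordinate sets a, b, c."""
    a, b, c = set(a), set(b), set(c)
    return H(joint, a | c) + H(joint, b | c) - H(joint, a | b | c) - H(joint, c)


X1, X2, X3, X4 = {0}, {1}, {2}, {3}


def ingleton12(joint):
    return (cmi(joint, X1, X2, X3) + cmi(joint, X1, X2, X4)
            + cmi(joint, X3, X4) - cmi(joint, X1, X2))


def family(joint, s):
    """Left-hand side of the s-th inequality of the sequence."""
    return (s * ingleton12(joint) + cmi(joint, X2, X3, X1)
            + s * (s + 1) / 2 * (cmi(joint, X1, X2, X3) + cmi(joint, X1, X3, X2)))


def zhang_yeung_gap(joint):
    """Right-hand side minus left-hand side of the original Zhang-Yeung form."""
    return (cmi(joint, X3, X4) + cmi(joint, X3, X1 | X2) + 3 * cmi(joint, X1, X2, X3)
            + cmi(joint, X1, X2, X4) - 2 * cmi(joint, X1, X2))


rng = random.Random(0)
samples = [random_joint(rng) for _ in range(20)]
values = [family(q, s) for q in samples for s in range(1, 6)]
print(min(values))
```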
The closure of Γ∗N is not polyhedral for N ≥ 4
Ideas and procedure of the proof:

- Generate a sequence of information inequalities on four variables.
- Prove a lemma stating that if a cone is polyhedral, then a curve in it with a right tangent must have a certain property.
- Construct a curve in the closure of Γ*_4.
- Show that the property of this curve contradicts the lemma; thus the closure of Γ*_4 is not polyhedral.
The closure of Γ∗4 is not polyhedral
Geometrical Lemma
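If P is a polyhedral set and a curve c : [0, 1] → P has a right tangent ċ_{0+} = lim_{p→0+} (1/p)[c_p − c_0], then P contains the segment with endpoints c_0 and c_0 + ε ċ_{0+} for some ε > 0.
[Figure: the curve p → c_p inside P, starting at c_0, together with the tangent segment from c_0 to c_0 + ε ċ_{0+}]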
Construction of the curve: 4 atoms

Consider four binary random variables X = [x1 x2 x3 x4], where x1 and x2 are independent, i.e. I(x1; x2) = 0, and the marginal distributions of x1 and x2 are
p(x1 = 0) = 2p, p(x1 = 1) = 1 − 2p (where 0 ≤ p ≤ 1/2);
p(x2 = 0) = 1/2, p(x2 = 1) = 1/2.
Furthermore,
x3 = x1 · x2
x4 = (1 − x1) · (1 − x2)
4 atoms: of all the outcomes of X, at most four have non-zero probability.
hx_p: the entropy function of X
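The four-atom distribution and its entropy function can be tabulated directly. The sketch below is an illustration with assumed helper names (`four_atom_joint`, `entropy_vector`); it measures entropy in bits and lists the 15 coordinates in the order 1, 2, 12, 3, 13, ... (element i is bit i − 1 of the index).

```python
import math


def four_atom_joint(p):
    """Joint pmf of (x1, x2, x3, x4) with x3 = x1*x2 and x4 = (1-x1)*(1-x2)."""
    joint = {}
    for x1, p_x1 in ((0, 2 * p), (1, 1 - 2 * p)):
        for x2 in (0, 1):
            prob = p_x1 * 0.5              # x1 and x2 are independent, x2 is uniform
            if prob > 0:
                joint[(x1, x2, x1 * x2, (1 - x1) * (1 - x2))] = prob
    return joint


def entropy_vector(joint, n=4):
    """Entropies (bits) of all nonempty subsets of the n variables."""
    vec = []
    for mask in range(1, 1 << n):
        marginal = {}
        for outcome, prob in joint.items():
            key = tuple(outcome[i] for i in range(n) if (mask >> i) & 1)
            marginal[key] = marginal.get(key, 0.0) + prob
        vec.append(-sum(q * math.log2(q) for q in marginal.values() if q > 0))
    return vec


print(len(four_atom_joint(0.2)))             # number of atoms for an interior p
print(entropy_vector(four_atom_joint(0.0)))  # endpoint p = 0
print(entropy_vector(four_atom_joint(0.5)))  # endpoint p = 1/2
```

At p = 0 the variables x1 and x4 are constant and x3 = x2, so the entropy vector is the rank function of the rank-1 matroid with loops 1 and 4; at p = 1/2 the loops are 1 and 3.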

Construction of the curve: hx_p and r_t^I
ln 2 · c(p) = hx_p + H(p) r_1^{14} + (1/2) [ln 2 + 2p ln 2 − H(2p)] [r_1^{23} + r_2^4]
where 0 ≤ p ≤ 1/2, H(p) = −p ln p − (1 − p) ln(1 − p),
r_1^{14}, r_1^{23} and r_2^4 are linear matroids; also notice hx_0 = r_1^{14} and hx_{1/2} = r_1^{13}.
After some calculation:
c(0) = r_1^{14} + r_1^{23} + r_2^4

ċ_{0+} = f_12 + r_1^{24} + r_2^3
Contradiction
- h(ε) = c_0 + ε ċ_{0+} = [r_1^{14} + r_1^{23} + r_2^4] + ε [f_12 + r_1^{24} + r_2^3]
- By the geometrical lemma, if the closure of Γ*_4 is polyhedral, then h(ε) should be entropic for some ε > 0.
- Plug h(ε) into the sequence of information inequalities:
s·Ingleton_12 + (s(s + 1)/2) [I(X1; X2|X3) + I(X1; X3|X2)] + I(X2; X3|X1) = 1 − εs
- For s large enough, 1 − εs < 0, a contradiction!

Thanks!
Questions!
