
Conditional Independence Relations

Yunshu Liu

ASPITRG Research Group

2013-02-20
Motivation
What are Conditional Independence Relations?
A is conditionally independent of B given C
Example: Suppose MIT and Stanford accept
undergraduate students based only on GPA
MIT : Accepted by MIT
Stanford : Accepted by Stanford
[Diagram: nodes MIT, GPA and Stanford]
Given Alice's GPA as GPA_Alice,
P(MIT | Stanford, GPA_Alice) = P(MIT | GPA_Alice)
We say MIT is conditionally independent of Stanford given GPA_Alice
This is sometimes written (MIT ⊥ Stanford | GPA_Alice)
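
The admission example can be checked numerically. The sketch below is only an illustration: the probabilities are made-up values, and the helper names (`bern`, `prob`, `p_mit_given`) are hypothetical. It builds a joint distribution in which each school looks at the GPA only, then confirms that P(MIT | Stanford, GPA) = P(MIT | GPA).

```python
from itertools import product

# Hypothetical numbers: P(GPA), P(MIT = 1 | GPA), P(Stanford = 1 | GPA).
p_gpa = {"high": 0.3, "low": 0.7}
p_mit = {"high": 0.8, "low": 0.1}
p_stanford = {"high": 0.7, "low": 0.2}

def bern(p, x):
    """Probability that a Bernoulli(p) variable equals x."""
    return p if x == 1 else 1.0 - p

# Joint distribution over (MIT, Stanford, GPA): each decision depends on GPA only.
joint = {
    (m, s, g): p_gpa[g] * bern(p_mit[g], m) * bern(p_stanford[g], s)
    for m, s, g in product((0, 1), (0, 1), p_gpa)
}

def prob(pred):
    """Total probability of the outcomes (m, s, g) that satisfy pred."""
    return sum(p for outcome, p in joint.items() if pred(*outcome))

def p_mit_given(g, s=None):
    """P(MIT = 1 | GPA = g), or P(MIT = 1 | Stanford = s, GPA = g) if s is given."""
    def cond(m, st, gp):
        return gp == g and (s is None or st == s)
    return prob(lambda m, st, gp: m == 1 and cond(m, st, gp)) / prob(cond)

# Knowing the Stanford decision does not change the MIT probability once GPA is known.
for g in p_gpa:
    for s in (0, 1):
        assert abs(p_mit_given(g, s) - p_mit_given(g)) < 1e-12
```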
Motivation

[Diagram: Conditional Independence Relations linked to Matroid Theory, Entropic region, Network Coding and Probabilistic Reasoning]

References:
[1] F. Matúš and M. Studený, Conditional Independences among Four Random Variables I, Combinatorics, Probability and Computing, 1995, pp. 269-278.
[2] F. Matúš, Infinitely Many Information Inequalities, IEEE Int. Symp. Information Theory (ISIT), 2007, pp. 41-44.
Outline

Preliminaries on Matroid theory

Matroid theory and Conditional Independence Relations

Infinitely many Information Inequalities


Definitions of Matroid

What is a Matroid?
A matroid is an independence structure that captures and
generalizes the notion of linear independence in vector spaces.

Independent-set-based definition of a Matroid

We can represent a finite matroid by a pair (E, I), where E is a
finite set called the ground set and I is a family of subsets of E,
called independent sets, obeying the following properties:
- ∅ ∈ I
- I1 ∈ I implies that I2 ∈ I for every subset I2 ⊆ I1
- I1, I2 ∈ I with |I1| < |I2| implies there is an e ∈ I2 \ I1 such
  that I1 ∪ {e} ∈ I
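
The three independent-set properties can be tested by brute force on a small family. The following is a minimal sketch; the function name `is_matroid_family` and the two example families are illustrative choices, not taken from the slides.

```python
from itertools import combinations

def is_matroid_family(independent):
    """Check the three independent-set axioms for a family of subsets."""
    fam = {frozenset(s) for s in independent}
    # Axiom 1: the empty set is independent.
    if frozenset() not in fam:
        return False
    # Axiom 2: every subset of an independent set is independent.
    for s in fam:
        for k in range(len(s)):
            if any(frozenset(sub) not in fam for sub in combinations(s, k)):
                return False
    # Axiom 3: a smaller independent set can be augmented from a larger one.
    for a in fam:
        for b in fam:
            if len(a) < len(b) and not any(a | {e} in fam for e in b - a):
                return False
    return True

# All subsets of {1, 2, 3} with at most two elements (the uniform matroid U_{2,3}).
u23 = [c for k in range(3) for c in combinations((1, 2, 3), k)]
# A hereditary family that fails augmentation: {3} cannot be extended from {1, 2}.
not_matroid = [(), (1,), (2,), (3,), (1, 2)]
```

Here `is_matroid_family(u23)` holds, while `is_matroid_family(not_matroid)` fails on the third property.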
Definitions of Matroid

Rank-function-based definition of a Matroid

A rank function is a function r from subsets of the ground set E to
integers satisfying the following conditions:
- If X ⊆ E, then 0 ≤ r(X) ≤ |X|
- If X ⊆ Y ⊆ E, then r(X) ≤ r(Y)
- If X, Y ⊆ E, then r(X) + r(Y) ≥ r(X ∪ Y) + r(X ∩ Y)
NOTE: The value of the rank function is always a non-negative
integer

Examples with |E| = 4
For r = [r1 r2 r12 r3 r13 r23 r123 r4 r14 r24 r124 r34 r134 r234 r1234]
- r1 = [0 0 0 1 1 1 1 1 1 1 1 1 1 1 1] is a matroid
- r2 = [2 1 2 1 2 2 2 1 2 2 2 2 2 2 2] is not a matroid, since r({1}) = 2 > |{1}|
- r3 = [2 2 3 2 3 3 4 2 3 3 4 4 4 4 4] is not a matroid, since r({1}) = 2 > |{1}|
Polymatroids and Matroids

Polymatroidal axioms
Let f map subsets of the ground set E to nonnegative real numbers.
The following conditions are called the polymatroidal axioms:
- f(∅) = 0
- If X ⊆ Y ⊆ E, then f(X) ≤ f(Y)
- If X, Y ⊆ E, then f(X) + f(Y) ≥ f(X ∪ Y) + f(X ∩ Y)
Examples: r1, r2 and r3 all correspond to polymatroids.

Relationship between Matroids and Polymatroids

Matroids are polymatroids with an integer-valued rank function that satisfies
the cardinality constraint r(X) ≤ |X|.
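
The claims about the three example vectors can be verified by brute force over all pairs of subsets. This is a sketch under one assumption about encoding: entry k of a vector (counting from 1) is read as the subset whose bitmask is k, with element i on bit i-1, which matches the ordering [r1 r2 r12 r3 ...] used above. The names `R1`, `R2`, `R3`, `is_polymatroid` and `is_matroid` are illustrative.

```python
# Example vectors from the slides, ordered as
# [r1 r2 r12 r3 r13 r23 r123 r4 r14 r24 r124 r34 r134 r234 r1234].
R1 = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
R2 = [2, 1, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2]
R3 = [2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 4, 4, 4, 4]

def is_polymatroid(vec, n=4):
    """Check monotonicity and submodularity, with f(empty set) fixed to 0."""
    f = [0] + list(vec)  # f[mask]; bit i-1 of mask stands for element i
    masks = range(1 << n)
    # x & y == x means the subset x is contained in the subset y.
    monotone = all(f[x] <= f[y] for x in masks for y in masks if x & y == x)
    submodular = all(f[x] + f[y] >= f[x | y] + f[x & y]
                     for x in masks for y in masks)
    return monotone and submodular

def is_matroid(vec, n=4):
    """A polymatroid that is integer valued and satisfies r(X) <= |X|."""
    bounded = all(float(v).is_integer() and v <= bin(m).count("1")
                  for m, v in enumerate(vec, start=1))
    return bounded and is_polymatroid(vec, n)
```

With these definitions all three vectors pass `is_polymatroid`, and only `R1` passes `is_matroid`.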
Conditional Independence Relations

Phrase (i, j|K )


Let N = {1, 2, ..., n} and let S = S(N) be the family of all couples (i, j|K),
where K ⊂ N and ij is the union of two singletons i and j of
N − K.

Example
For N = {1, 2, 3}, there are 18 such couples (i, j|K), including
the cases with i = j. They are listed below:
(1, 1|∅), (1, 1|2), (1, 1|3), (1, 1|23),
(2, 2|∅), (2, 2|1), (2, 2|3), (2, 2|13),
(3, 3|∅), (3, 3|1), (3, 3|2), (3, 3|12),
(1, 2|∅), (1, 2|3), (1, 3|∅), (1, 3|2), (2, 3|∅), (2, 3|1)
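
The count of 18 can be reproduced by enumerating the couples directly. A minimal sketch follows; the function name `couples` and the tuple encoding (i, j, K) with i ≤ j are illustrative choices.

```python
from itertools import combinations

def couples(n):
    """All couples (i, j | K) with K a subset of N = {1..n} and i <= j in N minus K."""
    ground = range(1, n + 1)
    out = []
    for size in range(n + 1):
        for K in combinations(ground, size):
            rest = [e for e in ground if e not in K]
            for a, i in enumerate(rest):
                for j in rest[a:]:  # j == i gives the couples (i, i | K)
                    out.append((i, j, K))
    return out

print(len(couples(3)))  # 18, as listed above
print(len(couples(4)))  # 56 for four variables
```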
p-representation

Probabilistic (p-) representation
A relation L ⊂ S is called probabilistically representable if there
exists a system of n random variables ξ = {ξi}i∈N such that:
- L = |[ξ]| = {(i, j|K) ∈ S(N) | ξi is conditionally independent
  of ξj given ξK, i.e. I(ξi; ξj|ξK) = 0}.
We use P(N) to denote the set of all p-representable relations
on N.

Examples with |E| = 4, consider couple (2, 3|1)

For r = [r1 r2 r12 r3 r13 r23 r123 r4 r14 r24 r124 r34 r134 r234 r1234]
- r1 = [0 0 0 1 1 1 1 1 1 1 1 1 1 1 1] is p-representable
- r2 = [2 1 2 1 2 2 2 1 2 2 2 2 2 2 2] is p-representable
- r3 = [2 2 3 2 3 3 4 2 3 3 4 4 4 4 4] is not p-representable
Semimatroid
Definition of Semimatroid
For a real-valued function r on the subsets of N we define |[r]| as
- |[r]| = {(i, j|K) ∈ S(N) | r(iK) + r(jK) − r(ijK) − r(K) = 0}.
A relation L ⊂ S(N) is called a semimatroid if and only if L = |[r]|
for some r ∈ ΓN.
- Here ΓN is the Shannon outer bound for N random
  variables, which is also the region containing all polymatroids.
- We use Semi(N) to denote the set of all semimatroids on N.
- We say the semimatroid L arises from r.

Examples
- p-representable semimatroids are semimatroids arising from
  entropic vectors h
- matroidal semimatroids are semimatroids arising from rank
  functions of matroids
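
The relation |[r]| can be computed mechanically from a rank vector. The sketch below assumes the subset ordering [r1 r2 r12 r3 ...] used for the four-element examples (entry k corresponds to the subset with bitmask k); `mask` and `ci_relation` are hypothetical helper names.

```python
from itertools import combinations

def mask(elems):
    """Bitmask of a set of elements; element i sits on bit i-1."""
    m = 0
    for e in elems:
        m |= 1 << (e - 1)
    return m

def ci_relation(vec, n=4):
    """Return |[r]| as a set of tuples (i, j, K) with i <= j and K a sorted tuple."""
    r = [0] + list(vec)  # r[mask], with r(empty set) = 0
    ground = range(1, n + 1)
    rel = set()
    for size in range(n + 1):
        for K in combinations(ground, size):
            k = mask(K)
            rest = [e for e in ground if e not in K]
            for a, i in enumerate(rest):
                for j in rest[a:]:
                    ik, jk = k | mask([i]), k | mask([j])
                    # r(iK) + r(jK) - r(ijK) - r(K) = 0
                    if r[ik] + r[jk] - r[ik | jk] - r[k] == 0:
                        rel.add((i, j, K))
    return rel

R3 = [2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 4, 4, 4, 4]
rel3 = ci_relation(R3)
```

For the example vector r3, the relation `rel3` contains (3, 4|∅), (1, 2|3) and (1, 2|4) but not (1, 2|∅).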
Relationship between p-representable relations and
semimatroids
Every p-representable relation is a semimatroid:
P(N) ⊆ Semi(N)
- |[ξ]| = {(i, j|K) ∈ S(N) | I(ξi; ξj|ξK) = 0} ∈ P(N)
- |[r]| = {(i, j|K) ∈ S(N) | r(iK) + r(jK) − r(ijK) − r(K) = 0} ∈ Semi(N)

P(N) = Semi(N) for |N| ≤ 3

Recall ΓN = Γ̄∗N for |N| ≤ 3

P(N) ⊊ Semi(N) for |N| = 4

Recall Γ̄∗4 ⊊ Γ4
Examples:
- |[r1]| and |[r2]| are p-representable
- |[r3]| is a semimatroid but not p-representable

[Figure: the entropic region Γ∗4 inside the Shannon outer bound Γ4]
Matroid, p-representable semimatroid and
semimatroid for N = 4

Every matroid that is linearly representable over a finite
field is also p-representable. Every matroid on at most four elements
has such a representation, so all matroids for |N| ≤ 4 are
p-representable.

[Diagram for N = 4: nested sets, matroids inside p-representable relations inside semimatroids]
Characterization of the extreme rays of Γ4

Matroids that are extreme rays of Γ4

For N = {1, 2, 3, 4}, with I ⊆ N and 0 ≤ t ≤ |N \ I|, define

r_t^I(J) = min{t, |J \ I|} with J ⊆ N

Then r_t^I is the matroid of rank t with loops I

Γ4 has 27 matroid extreme rays, all of which can be
expressed as r_t^I for some t and I
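
The formula min{t, |J \ I|} is easy to tabulate. The sketch below lists its values over the non-empty subsets of N in the ordering [r1 r2 r12 r3 ...] used for the earlier examples; the function name `rank_with_loops` is an illustrative choice.

```python
def rank_with_loops(t, loops, n=4):
    """Values of min{t, |J minus I|} for all non-empty J, indexed by bitmask order."""
    loop_mask = 0
    for e in loops:
        loop_mask |= 1 << (e - 1)
    # J & ~loop_mask clears the loop elements before counting what is left.
    return [min(t, bin(J & ~loop_mask).count("1")) for J in range(1, 1 << n)]

# Rank 1 with loops {1, 2} reproduces the first example vector of the slides.
print(rank_with_loops(1, {1, 2}))
# [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```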
Characterization of the extreme rays of Γ4

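non-matroid extreme rays of Γ4

For N = {1, 2, 3, 4}, define the following functions (in f_ij, k and l denote the two elements of N other than i and j):

\[ g_i^{(2)}(J) = \begin{cases} 2 & \text{if } J = \{i\} \\ \min\{2, |J|\} & \text{if } J \neq \{i\} \end{cases} \]

\[ g_i^{(3)}(J) = \begin{cases} |J| & \text{if } i \notin J \\ \min\{3, |J| + 1\} & \text{if } i \in J \end{cases} \]

\[ f_{ij}(K) = \begin{cases} 3 & \text{if } K \in \{ik, jk, il, jl, kl\} \\ \min\{4, 2|K|\} & \text{otherwise} \end{cases} \]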
Characterization of the extreme rays of Γ4

extreme rays of Γ4
Γ4 has 41 extreme rays, including
- 27 matroids of the form r_t^I for some t and I,
- 4 extreme rays of the form g_i^(2) for i = 1, 2, 3, 4,
- 4 extreme rays of the form g_i^(3) for i = 1, 2, 3, 4,
- 6 extreme rays of the form f_ij for i, j ∈ N and i ≠ j.

Examples with |E| = 4
For r = [r1 r2 r12 r3 r13 r23 r123 r4 r14 r24 r124 r34 r134 r234 r1234]
- r1 = r_1^{12} = [0 0 0 1 1 1 1 1 1 1 1 1 1 1 1]
- r2 = g_1^(2) = [2 1 2 1 2 2 2 1 2 2 2 2 2 2 2]
- r3 = f_34 = [2 2 3 2 3 3 4 2 3 3 4 4 4 4 4]
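
The two non-matroid examples can be regenerated from the piecewise definitions of the g and f functions. This is a sketch with illustrative function names (`subsets`, `g2`, `g3`, `f`); it assumes N = {1, 2, 3, 4} and the subset ordering of the vectors above.

```python
def subsets(n=4):
    """Non-empty subsets of {1..n} as frozensets, in bitmask order."""
    return [frozenset(e for e in range(1, n + 1) if m >> (e - 1) & 1)
            for m in range(1, 1 << n)]

def g2(i):
    """2 on the singleton {i}, min{2, |J|} elsewhere."""
    return [2 if J == {i} else min(2, len(J)) for J in subsets()]

def g3(i):
    """|J| if i is not in J, min{3, |J| + 1} if i is in J."""
    return [min(3, len(J) + 1) if i in J else len(J) for J in subsets()]

def f(i, j):
    """3 on the pairs ik, jk, il, jl, kl, and min{4, 2|K|} otherwise."""
    k, l = sorted({1, 2, 3, 4} - {i, j})
    special = {frozenset(p) for p in ((i, k), (j, k), (i, l), (j, l), (k, l))}
    return [3 if K in special else min(4, 2 * len(K)) for K in subsets()]

print(g2(1))    # [2, 1, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2]
print(f(3, 4))  # [2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 4, 4, 4, 4]
```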
Characterization of the extreme rays of Γ4

extreme rays of Γ4
|[g_i^(2)]|, |[g_i^(3)]| and |[f_ij]| are all semimatroids; among them
- |[g_i^(2)]| and |[g_i^(3)]| are p-representable,
- |[f_ij]| are not p-representable.

[Diagram for N = 4: the matroids r_t^I, together with |[g_i^(2)]| and |[g_i^(3)]|, lie among the p-representable relations; |[f_ij]| lies among the semimatroids that are not p-representable]
Ingleton inequality

Ingleton inequality

Ingleton12
= I(X1; X2|X3) + I(X1; X2|X4) + I(X3; X4|∅) − I(X1; X2|∅)
= h12 + h13 + h23 + h14 + h24 − h1 − h2 − h34 − h123 − h124 ≥ 0

R4: the region defined by the four-variable Shannon-type inequalities
and the 6 Ingleton inequalities
R4 has 35 extreme rays, including
- 27 matroids of the form r_t^I for some t and I,
- 8 extreme rays of the form g_i^(2) and g_i^(3) for i = 1, 2, 3, 4.
Note: f_ij ∉ R4 for i, j ∈ N and i ≠ j
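
The Ingleton expression can be evaluated on the three example vectors from the earlier slides. The sketch below uses the definition of Ingleton12 given above and the subset ordering [r1 r2 r12 r3 ...]; the helper names `h`, `cmi` and `ingleton12` are illustrative.

```python
def h(vec, *elems):
    """Value of the vector on the subset elems (0 on the empty set)."""
    m = 0
    for e in elems:
        m |= 1 << (e - 1)
    return vec[m - 1] if m else 0

def cmi(vec, a, b, cond=()):
    """I(Xa; Xb | Xcond) = h(a,cond) + h(b,cond) - h(a,b,cond) - h(cond)."""
    return (h(vec, a, *cond) + h(vec, b, *cond)
            - h(vec, a, b, *cond) - h(vec, *cond))

def ingleton12(vec):
    """I(X1;X2|X3) + I(X1;X2|X4) + I(X3;X4) - I(X1;X2)."""
    return (cmi(vec, 1, 2, (3,)) + cmi(vec, 1, 2, (4,))
            + cmi(vec, 3, 4) - cmi(vec, 1, 2))

R1 = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
R2 = [2, 1, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2]
R3 = [2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 4, 4, 4, 4]

print(ingleton12(R1), ingleton12(R2), ingleton12(R3))  # 1 1 -1
```

The negative value for r3 = f34 shows that this vector violates Ingleton12, which is why it falls outside R4.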
Ingleton inequality

Ingleton semimatroid
A relation L ⊂ S(N) is called an Ingleton semimatroid if and only if
L = |[r]| for some r ∈ R_N.

For |N| = 4 every Ingleton semimatroid is p-representable

Reason:
Denote the convex cone of all Ingleton semimatroids by
InSemi(N); all the extreme rays of InSemi(N) are
p-representable.
G_4^{ij}: The gap between R4 and Γ4

G_4^{ij} = {h ∈ Γ4 | Ingleton_ij ≤ 0}
G_4^{ij} is the cone spanned by 15 extreme rays; its V-representation
is generated by the 15 linearly independent functions f_ij, r_1^{ijk},
r_1^{ijl}, r_1^{ikl}, r_1^{jkl}, r_1^∅, r_3^∅, r_1^i, r_1^j, r_1^{ik}, r_1^{jk}, r_1^{il}, r_1^{jl}, r_2^k, r_2^l.

The H-representation of G_4^{ij} consists of 14 Shannon-type
inequalities together with Ingleton_ij ≤ 0
List of semimatroids on four discrete variables

Ingleton semimatroid
There are 120 irreducible p-representable semimatroids of
sixteen types (that is, up to permutation) over a
four-element set N. Among these there are 36 Ingleton
semimatroids of 11 types:
|[0]|, |[r_1^{N−i}]| for i ∈ N, |[r_1^{ij}]| for i, j ∈ N distinct, |[r_1^i]| for i ∈ N,
|[r_1]|, |[r_2^i]| for i ∈ N, |[r_2^{ikj}]| for i, j ∈ N distinct, |[r_2]|, |[r_3]|, |[g_i^(2)]|
for i ∈ N, |[g_i^(3)]| for i ∈ N.
List of semimatroids on four discrete variables

non-Ingleton semimatroid
There are 120 irreducible p-representable semimatroids of
sixteen types (that is, up to permutation) over a
four-element set N. Among these there are 84 non-Ingleton
semimatroids of 5 types:

L_ij^{(kl|∅)} = {(kl|i), (kl|j), (ij|∅)} ∪ {(kl|ij)} ∪ {(k|ij), (l|ij), (i|jkl), (j|ikl), (k|ijl), (l|ijk)}
L_ij^{(ij|kl)} = {(ij|k), (ij|l), (kl|ij)} ∪ {(kl|i), (kl|j)}
L_ij^{(ik|jl)} = {(kl|ij), (ij|k), (ik|l)} ∪ {(kl|j)} ∪ {(l|ij), (l|ijk)}
L_ij^{(ik|j)} = {(ij|k), (ik|l), (kl|j)} ∪ {(i|jkl), (j|ikl), (k|ijl), (l|ijk)}
L_ij^{(jl|∅)} = {(kl|i), (jl|k), (ij|∅)} ∪ {(kl|ij)} ∪ {(k|ij), (l|ij), (i|jkl), (j|ikl), (k|ijl), (l|ijk)}
Information Inequalities

Information Inequalities: Definition

Consider a linear function of all the
(2^n − 1) joint entropies associated
with X:

\[ f = \sum_i \lambda_{A_i} h_{A_i} \ge 0 \]

where λ_{A_i} are real coefficients. If an
inequality of this form is true for any
random variables {X1, ..., Xn}, we call
it an information inequality.

[Figure: the entropic region Γ∗N lying in the half-space f ≥ 0]
Information Inequalities

Information Inequalities: examples

For jointly distributed discrete random
variables A, B and C, we can define
the conditional entropy of A given B
by H(A|B), the mutual information
between A and B by I(A; B), and the
conditional mutual information
by I(A; B|C).

H(A|B) = H(A, B) − H(B) ≥ 0
I(A; B) = H(A) + H(B) − H(A, B) ≥ 0
I(A; B|C) = H(A, C) + H(B, C) − H(A, B, C) − H(C) ≥ 0
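
These three quantities can be computed from any joint distribution by marginalizing and taking entropies. The sketch below is an illustration only: the example distribution (two independent fair bits A and B with C = A xor B) and the helper names are assumptions, and entropies are measured in bits.

```python
from math import log2
from collections import defaultdict

def entropy(joint, coords):
    """Entropy in bits of the marginal of joint on the given coordinates."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[c] for c in coords)] += p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def cond_entropy(joint, a, b):
    """H(A|B) = H(A, B) - H(B)."""
    return entropy(joint, a + b) - entropy(joint, b)

def mutual_info(joint, a, b, c=()):
    """I(A; B|C) = H(A, C) + H(B, C) - H(A, B, C) - H(C); c = () gives I(A; B)."""
    return (entropy(joint, a + c) + entropy(joint, b + c)
            - entropy(joint, a + b + c) - entropy(joint, c))

# Hypothetical example: outcomes are (A, B, C) with C = A xor B.
xor_joint = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}

print(cond_entropy(xor_joint, (0,), (1,)))        # H(A|B) = 1.0
print(mutual_info(xor_joint, (0,), (1,)))         # I(A; B) = 0.0
print(mutual_info(xor_joint, (0,), (1,), (2,)))   # I(A; B|C) = 1.0
```

In this example A and B are independent, yet they become dependent once C is known, so I(A; B) = 0 while I(A; B|C) = 1 bit.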
Information Inequalities

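Shannon-Type Information inequality

\[ \Gamma_N = \left\{ h \in \mathbb{R}^{2^N - 1} \;\middle|\; \begin{array}{ll} h_A + h_B \ge h_{A \cap B} + h_{A \cup B} & \forall A, B \subseteq N \\ h_P \ge h_Q \ge 0 & \forall Q \subseteq P \subseteq N \end{array} \right\} \]

Relationship between Γ∗n and Γn

Γ∗2 = Γ2 and Γ̄∗3 = Γ3:
All the information inequalities on
N ≤ 3 variables are Shannon-type.

For N = 2, using h1 + h2 ≥ h12,
h12 ≥ h1 and h12 ≥ h2, we get the cone Γ2.

[Figure: the cone Γ2 drawn with axes h1 = H(X1), h2 = H(X2) and h12 = H(X1, X2), with labelled points (0 0 0), (0 1 1), (1 0 1) and (1 1 1)]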

Information Inequalities

[Figure for N ≥ 4: the entropic region Γ∗N, a better outer bound Γ+N, and the Shannon outer bound ΓN]

For N ≥ 4 we have ΓN ≠ Γ̄∗N

Non-Shannon-type information
inequalities exist for N ≥ 4

The first discovered non-Shannon-type information
inequality: the Zhang-Yeung inequality

2I(X1; X2) ≤ I(X3; X4) + I(X3; X1, X2) + 3I(X1; X2|X3) + I(X1; X2|X4)

Zhang-Yeung Inequality and DFZ Inequality
Recall:

Ingleton12
= I(X1; X2|X3) + I(X1; X2|X4) + I(X3; X4|∅) − I(X1; X2|∅)

Zhang-Yeung Inequality

Ingleton12 + I(X1; X2|X3) + I(X1; X3|X2) + I(X2; X3|X1) ≥ 0

DFZ Inequality

2 Ingleton12 + 3I(X1; X2|X3) + 3I(X1; X3|X2) + I(X2; X3|X1) ≥ 0


Information Inequality

A sequence of Information Inequalities on four variables

\[ s\,\mathrm{Ingleton}_{12} + I(X_2; X_3|X_1) + \frac{s(s+1)}{2}\,[I(X_1; X_2|X_3) + I(X_1; X_3|X_2)] \ge 0 \]

Setting s = 1 gives the Zhang-Yeung inequality and s = 2 gives the DFZ inequality.
Derived by F. Matúš using adhesivity of polymatroids in [2];
later derived by R. Dougherty, C. Freiling and K. Zeger using
D-copy in their preprint http://arxiv.org/abs/1104.3602
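
The left-hand side of this family can be evaluated on a polymatroid to see how an inequality rules it out. The sketch below plugs in the vector f34 = [2 2 3 2 3 3 4 2 3 3 4 4 4 4 4] from the earlier examples, using the subset ordering [r1 r2 r12 r3 ...]; the helper names `h`, `cmi` and `family_lhs` are illustrative.

```python
def h(vec, *elems):
    """Value of the vector on the subset elems (0 on the empty set)."""
    m = 0
    for e in elems:
        m |= 1 << (e - 1)
    return vec[m - 1] if m else 0

def cmi(vec, a, b, cond=()):
    """I(Xa; Xb | Xcond) written in terms of the vector entries."""
    return (h(vec, a, *cond) + h(vec, b, *cond)
            - h(vec, a, b, *cond) - h(vec, *cond))

def family_lhs(vec, s):
    """s*Ingleton12 + I(X2;X3|X1) + s(s+1)/2 * [I(X1;X2|X3) + I(X1;X3|X2)]."""
    ingleton12 = (cmi(vec, 1, 2, (3,)) + cmi(vec, 1, 2, (4,))
                  + cmi(vec, 3, 4) - cmi(vec, 1, 2))
    return (s * ingleton12 + cmi(vec, 2, 3, (1,))
            + s * (s + 1) // 2 * (cmi(vec, 1, 2, (3,)) + cmi(vec, 1, 3, (2,))))

F34 = [2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 4, 4, 4, 4]
print([family_lhs(F34, s) for s in (1, 2, 3)])  # [-1, -2, -3]
```

Every member of the family is negative on f34, so this polymatroid lies in Γ4 but violates the inequalities satisfied by all entropic vectors.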
The closure of Γ∗N is not polyhedral for N ≥ 4

Ideas and procedure of the proof:

- Generate a sequence of information inequalities on four
  variables.
- Prove a lemma stating that if a cone is polyhedral, a
  curve in this set must have a certain property.
- Construct a curve in the closure of Γ∗4.
- Show that the property of this curve contradicts the
  lemma, thus the closure of Γ∗4 is not polyhedral.
The closure of Γ∗4 is not polyhedral
Geometrical Lemma
If P is polyhedral and a curve c : [0, 1] → P has a right
tangent ċ0+ = lim_{p→0+} (1/p)[c_p − c_0], then P contains the segment
with endpoints c_0 and c_0 + εċ0+ for some ε > 0.

[Figure: the curve p → c_p inside P, starting at c_0, with the segment from c_0 to c_0 + εċ0+]
The closure of Γ∗4 is not polyhedral

Construction of the curve: 4 atoms


Consider four binary random variables: X = [x1 x2 x3 x4], where
x1 and x2 are independent, i.e. I(x1; x2) = 0, and the marginal
distributions of x1 and x2 are
p(x1 = 0) = 2p, p(x1 = 1) = 1 − 2p (where 0 ≤ p ≤ 1/2);
p(x2 = 0) = 1/2, p(x2 = 1) = 1/2.
Furthermore,
x3 = x1 · x2
x4 = (1 − x1) · (1 − x2)
4 atoms: Of all the outcomes of X, at most four have
non-zero probability.

h_x^p: the entropy function of X
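
The entropy function of this four-atom distribution can be tabulated numerically. The sketch below is an illustration with two assumptions: entropies are measured in bits, and the 15 entries follow the subset ordering [r1 r2 r12 r3 ...] used for the earlier examples. The function names are hypothetical.

```python
from math import log2
from collections import defaultdict

def four_atom_joint(p):
    """Joint distribution of (x1, x2, x3, x4) for a parameter 0 <= p <= 1/2."""
    joint = defaultdict(float)
    for x1, p1 in ((0, 2 * p), (1, 1 - 2 * p)):
        for x2 in (0, 1):
            # x3 = x1 * x2 and x4 = (1 - x1) * (1 - x2); x2 is a fair bit.
            joint[(x1, x2, x1 * x2, (1 - x1) * (1 - x2))] += p1 * 0.5
    return {o: q for o, q in joint.items() if q > 0}

def entropy_vector(joint, n=4):
    """Joint entropies (in bits) of all non-empty subsets, in bitmask order."""
    vec = []
    for m in range(1, 1 << n):
        marg = defaultdict(float)
        for o, q in joint.items():
            marg[tuple(o[e] for e in range(n) if m >> e & 1)] += q
        vec.append(-sum(q * log2(q) for q in marg.values()))
    return vec

print(len(four_atom_joint(0.25)))  # 4 outcomes with non-zero probability
v = entropy_vector(four_atom_joint(0.2))
print(abs(v[0] + v[1] - v[2]) < 1e-9)  # I(x1; x2) = h1 + h2 - h12 = 0
```

At p = 0 the vector has entries 0 on the subsets of {1, 4} and 1 elsewhere, which is the rank-1 matroid with loops {1, 4}; at p = 1/2 the same holds with loops {1, 3}.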


The closure of Γ∗4 is not polyhedral

Construction of the curve: hxp and rtI

1
ln 2 · c(p) = hxp + H(p)r114 + [ln 2 + 2p ln 2 − H(2p)][r123 + r24 ]
2

where 0 ≤ p ≤ 1/2, H(p) = −p ln p − (1 − p) ln(1 − p),
r_1^{14}, r_1^{23} and r_2^4 are linear matroids; also notice h_x^0 = r_1^{14},
h_x^{1/2} = r_1^{13}.

After some calculation:

c(0) = r_1^{14} + r_1^{23} + r_2^4
ċ0+ = f_12 + r_1^{24} + r_2^3
The closure of Γ∗4 is not polyhedral

Contradiction
- h(ε) = c0 + εċ0+ = [r_1^{14} + r_1^{23} + r_2^4] + ε[f_12 + r_1^{24} + r_2^3]
- By the geometrical lemma, if the closure of Γ∗4 is
  polyhedral, then h(ε) should lie in the closure of Γ∗4 for some ε > 0.
- Plug h(ε) into the sequence of information inequalities:

\[ s\,\mathrm{Ingleton}_{12} + \frac{s(s+1)}{2}\,[I(X_1; X_2|X_3) + I(X_1; X_3|X_2)] + I(X_2; X_3|X_1) = 1 - s \]

- For s large enough, 1 − s < 0, which violates the inequality: contradiction!


Thanks!

Questions!
