Halcyon Derks
The Principle of Inclusion-Exclusion:
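Let $A_1, A_2, A_3, \ldots, A_n$ be subsets of a finite set $S$. Define
\[ A_I = \bigcap_{i \in I} A_i \quad \text{for } I \subseteq \{1, 2, 3, \ldots, n\}, \]
with the convention $A_\emptyset = S$. Then we have
\[ |S - (A_1 \cup A_2 \cup A_3 \cup \ldots \cup A_n)| = \sum_{I \subseteq \{1,2,3,\ldots,n\}} (-1)^{|I|} |A_I| \tag{1} \]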
This formula calculates the number of elements that do not belong to any $A_i$.
The complementary form is also commonly used, as it calculates the number
of elements that belong to the union of the sets. Notice that
$|S - (A_1 \cup \ldots \cup A_n)| = |S| - |A_1 \cup \ldots \cup A_n|$, and that the $I = \emptyset$
term of equation (1) is $|S|$, so multiplying equation (1) by $(-1)$ and adding $|S|$ will yield
$|A_1 \cup \ldots \cup A_n|$. Thus
\[ |A_1 \cup A_2 \cup A_3 \cup \ldots \cup A_n| = \sum_{\emptyset \neq I \subseteq \{1,2,3,\ldots,n\}} (-1)^{|I|-1} |A_I| \tag{2} \]
is obtained. Equation (2) can also be written by grouping the terms of the summation
based on |I|.
\[ |A_1 \cup A_2 \cup A_3 \cup \ldots \cup A_n| = \sum_{1 \le i \le n} |A_i| - \sum_{1 \le i < j \le n} |A_i \cap A_j| + \sum_{1 \le i < j < k \le n} |A_i \cap A_j \cap A_k| - \ldots + (-1)^{n+1} |A_1 \cap \ldots \cap A_n| \tag{3} \]
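The grouped form of the formula can be evaluated mechanically. The following is a minimal sketch, not part of the original paper: the function name and the example sets are hypothetical, chosen only to illustrate summing signed intersection sizes over all nonempty index sets.

```python
from itertools import combinations

def union_size_by_inclusion_exclusion(sets):
    """Evaluate the signed sum over nonempty index sets for a list of finite sets."""
    total = 0
    n = len(sets)
    for r in range(1, n + 1):                 # r plays the role of |I|
        sign = (-1) ** (r - 1)
        for chosen in combinations(sets, r):  # every index set I with |I| = r
            total += sign * len(set.intersection(*chosen))
    return total

# Hypothetical example data, chosen only for illustration.
A = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}]
S = set(range(1, 11))

union_size = union_size_by_inclusion_exclusion(A)
outside = len(S) - union_size                 # elements of S in none of the sets
print(union_size, outside)
```

Comparing `union_size` against `len(set().union(*A))` gives a direct check of the formula on any small example.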
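Equation (3) can be proven by showing that an element $a$ included in the union,
$A_1 \cup \ldots \cup A_n$, is counted exactly once in the right-hand side of the equation. Let $a$ be
a member of $r \ge 1$ sets. Then $a$ will be counted $\binom{r}{1}$ times by $\sum |A_i|$, it will be counted $\binom{r}{2}$
times by $\sum |A_i \cap A_j|$, and generally will be counted $\binom{r}{m}$ times by the sum involving
$m$ of the sets. Thus, returning to equation (3), $a$ will be counted exactly
\[ \binom{r}{1} - \binom{r}{2} + \binom{r}{3} - \ldots + (-1)^{r+1} \binom{r}{r} \]
times. Solving for $\binom{r}{1} - \ldots + (-1)^{r+1} \binom{r}{r}$, the observation can be made
that, by the binomial theorem, $\binom{r}{0} - \binom{r}{1} + \binom{r}{2} - \ldots + (-1)^{r} \binom{r}{r} = (1 - 1)^r = 0$, so
\[ \binom{r}{1} - \binom{r}{2} + \binom{r}{3} - \ldots + (-1)^{r+1} \binom{r}{r} = 0 + \binom{r}{0} = 1 \]
So any element $a$ will be counted exactly once in the right-hand side of the equation,
as it is in $|A_1 \cup \ldots \cup A_n|$.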
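The counting argument can be spot-checked numerically. This is a small sketch with a hypothetical helper name; it only evaluates the alternating binomial sum for several values of $r$.

```python
from math import comb

def times_counted(r):
    """Net count of an element that lies in exactly r of the sets."""
    return sum((-1) ** (m + 1) * comb(r, m) for m in range(1, r + 1))

print([times_counted(r) for r in range(1, 9)])
```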
The pattern follows for all $|I|$, so considering components for which $|I| = i$ yields
\[ \sum_{1 \le j_1 < \ldots < j_i \le k} |A_{j_1} \cap \ldots \cap A_{j_i}| + \sum_{1 \le j_1 < \ldots < j_{i-1} \le k} |A_{j_1} \cap \ldots \cap A_{j_{i-1}} \cap A_{k+1}| = \sum_{1 \le j_1 < \ldots < j_i \le k+1} |A_{j_1} \cap \ldots \cap A_{j_i}| \]
Notice also that the signs will work out because $|B \cap A_{k+1}|$ is subtracted, multiplying
all of the summation components by $(-1)$. So,
\[ |A_1 \cup A_2 \cup \ldots \cup A_{k+1}| = \sum_{1 \le a \le k+1} |A_a| - \sum_{1 \le a < b \le k+1} |A_a \cap A_b| + \sum_{1 \le a < b < c \le k+1} |A_a \cap A_b \cap A_c| - \ldots \]
Find the number of positive integers not exceeding 100 that are not divisible
by 5 or 7. Let $S$ be the set of all positive integers less than or equal to
100. Then define the subsets in the following way: $A$ will be the set of those positive
integers that are divisible by 5, and $B$ will be those divisible by 7. Then $A \cap B$
represents numbers that are divisible by both 5 and 7. Since these are relatively prime,
a number that is divisible by both must be divisible by their product, namely 35.
Using these definitions
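\[ |S| = 100 \]
\[ |A| = \left\lfloor \frac{100}{5} \right\rfloor = 20 \]
\[ |B| = \left\lfloor \frac{100}{7} \right\rfloor = 14 \]
\[ |A \cap B| = \left\lfloor \frac{100}{35} \right\rfloor = 2 \]
Finally, applying the Principle of Inclusion-Exclusion yields
\[ |S - (A \cup B)| = |S| - |A| - |B| + |A \cap B| = 100 - 20 - 14 + 2 = 68 \]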
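The set sizes in this example are small enough to confirm by direct enumeration. The sketch below uses a hypothetical helper `count_not_divisible`, which assumes the two divisors are relatively prime, and compares it with a brute-force count.

```python
def count_not_divisible(limit, p, q):
    """Count 1..limit divisible by neither p nor q, assuming gcd(p, q) = 1."""
    return limit - limit // p - limit // q + limit // (p * q)

brute_force = sum(1 for m in range(1, 101) if m % 5 != 0 and m % 7 != 0)
print(count_not_divisible(100, 5, 7), brute_force)
```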
How many bit strings of length 8 do not contain 6 consecutive 0s? Let S
be the set of all bit strings of length 8. Now consider ways that these strings could
have six consecutive 0s. The chain of 0s could begin in the first position of the string,
the second position, or the third position. Notice that positions 4 through 8 will
not suffice since there are not 6 positions following which can be filled with 0s. So,
let A1 , A2 , and A3 be the set of strings with six consecutive 0s starting in the first,
second, or third position, respectively. Then A1 ∩ A2 contains the strings that have
six 0s starting in the first position and in the second. This means that five of the
0s overlap, and so the set contains strings that have 7 consecutive 0s beginning in
the first position. Similarly, A2 ∩ A3 is the set containing the strings that have 7
consecutive 0s beginning in the second position and A1 ∩ A3 is the set containing
the strings that have eight consecutive 0s (six beginning in the first position and six
beginning in the third position). The final set, $A_1 \cap A_2 \cap A_3$, is the set of strings that
have six consecutive 0s starting in each of the first three positions, which again means eight consecutive 0s.
In each case, the size of the set can be calculated by determining how many “free”
positions remain. A free position can be filled with either a 0 or 1 (2 choices), while
the rest of the positions must be 0 (1 choice). So $S$ has eight free positions and
$|S| = 2^8$. This pattern holds for all of the sets. So
\[ |S| = 2^8 = 256 \]
\[ |A_1| = |A_2| = |A_3| = 2^2 = 4 \]
\[ |A_1 \cap A_2| = |A_2 \cap A_3| = 2^1 = 2 \]
\[ |A_1 \cap A_3| = |A_1 \cap A_2 \cap A_3| = 2^0 = 1 \]
Applying equation (1), the number of bit strings of length 8 that do not contain six consecutive 0s is
\[ 256 - (4 + 4 + 4) + (2 + 2 + 1) - 1 = 248 \]
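Since there are only 256 strings, the count can be confirmed by listing them all. This is an illustrative sketch, with variable names of my own choosing, that compares a brute-force count against the inclusion-exclusion total built from the set sizes above.

```python
from itertools import product

strings = ["".join(bits) for bits in product("01", repeat=8)]
without_run = sum(1 for s in strings if "000000" not in s)

# Inclusion-exclusion with the set sizes listed above.
by_formula = 2**8 - 3 * 2**2 + (2**1 + 2**1 + 2**0) - 2**0
print(without_run, by_formula)
```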
How many permutations of the letters of the English alphabet do not contain
any of the strings fish, frog, or bird? Let $S$ be the set of all permutations
of the letters in the English alphabet. Then these three words, fish, frog, and bird,
can appear in any of these permutations. So let the sets $A_{fish}$, $A_{frog}$, and $A_{bird}$ be
those sets made up of permutations containing the specified word. By defining the
sets in this way the intersections will all be empty sets. For instance, a permutation
that contains the word fish cannot also contain the word frog because the f can only
appear once in a permutation. Likewise, fish and bird share the letter i, and frog and
bird share the letter r.
All that remains is to calculate $|S|$, $|A_{fish}|$, $|A_{frog}|$, and $|A_{bird}|$. Clearly $|S| = 26!$,
as there are 26 choices for the first letter in the permutation, 25 choices for the second
letter, 24 for the third, etc. $|A_{fish}|$, $|A_{frog}|$, and $|A_{bird}|$ are slightly more difficult to
calculate. Each word uses four of the letters in the alphabet in a fixed order, so these
four letters become one element in the “alphabet” of the specified set. Then a
set $A_i$ is the set of permutations of 23 elements instead of 26 (22 single letters and 1
word), so $|A_{fish}| = |A_{frog}| = |A_{bird}| = 23!$.
Finally, applying the Principle of Inclusion-Exclusion, the number of permutations
of the English alphabet that do not contain any of the words fish, frog, or bird, is
\[ 26! - 3 \cdot 23! \]
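Enumerating all 26! permutations is out of reach, but the "word as a single element" argument can be tested on a small analogue. The sketch below is an illustration only: the six-letter alphabet "fishab" and the helper name are hypothetical, and the full-size answer is simply evaluated with exact integers.

```python
from itertools import permutations
from math import factorial

def count_containing(letters, word):
    """Brute-force count of permutations of `letters` that contain `word`."""
    return sum(1 for p in permutations(letters) if word in "".join(p))

# Small analogue: 6 letters, so "fish" plus 2 single letters gives 3 elements.
small = count_containing("fishab", "fish")

# Full-size count: all pairwise intersections are empty, so only two levels remain.
answer = factorial(26) - 3 * factorial(23)
print(small, answer)
```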
The methods and logic used to solve these problems can be applied to much more
generalized problems. The rest of this paper is devoted to solving some of these more
challenging problems as well as proving some very interesting results that follow from
(and lead to) the Principle of Inclusion-Exclusion.
Derangements
A derangement is a permutation of objects that leaves no object in its original
position. It is often interesting to calculate $D_n$, the number of derangements of a set of $n$ objects.
Using this logic, consider the general case of $|I| = r$. This case fixes $r$ objects
from the set, allowing the other $(n - r)$ objects to be permuted. So $|A_I| = (n - r)!$.
There are $\binom{n}{r}$ ways to choose which objects are fixed, so
\[ \sum_{|I| = r} |A_I| = \binom{n}{r} \cdot (n - r)! = \frac{n!}{(n - r)! \cdot r!} \cdot (n - r)! = \frac{n!}{r!} \]
Applying equation (1) then gives
\[ D_n = |S - (A_1 \cup A_2 \cup \ldots \cup A_n)| \]
\[ = n! - \frac{n!}{1!} + \frac{n!}{2!} - \ldots + (-1)^r \frac{n!}{r!} + \ldots + (-1)^n \frac{n!}{n!} \]
\[ = n! \left( 1 - \frac{1}{1!} + \frac{1}{2!} - \ldots + (-1)^r \frac{1}{r!} + \ldots + (-1)^n \frac{1}{n!} \right) \]
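The closed form for $D_n$ is easy to compare against a direct count for small $n$. This is a minimal sketch with hypothetical function names; it evaluates the alternating sum with exact integer arithmetic and counts fixed-point-free permutations by brute force.

```python
from itertools import permutations
from math import factorial

def derangements_formula(n):
    """D_n as the alternating sum of n!/r! for r = 0..n, using exact integers."""
    return sum((-1) ** r * (factorial(n) // factorial(r)) for r in range(n + 1))

def derangements_brute(n):
    """Count permutations of range(n) with no fixed point."""
    return sum(1 for p in permutations(range(n)) if all(p[i] != i for i in range(n)))

print([derangements_formula(n) for n in range(1, 8)])
```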
Using the Principle of Inclusion-Exclusion to derive the formula gives
\[ E_{10} = \sum_{I \subseteq [1,5]} (-1)^{|I|} |A_I| = 10! - \binom{5}{1} 9! + \binom{5}{2} 8! - \binom{5}{3} 7! + \binom{5}{4} 6! - \binom{5}{5} 5! \]
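The expression for $E_{10}$ can be evaluated directly. The sketch below also includes a brute-force check on a smaller instance; that check rests on an assumed reading, not stated in the surviving text, that $A_i$ is the set of permutations fixing the $i$th of five designated objects, so that $|A_I| = (10 - |I|)!$. Both function names are hypothetical.

```python
from itertools import permutations
from math import comb, factorial

def count_by_formula(n, m):
    """Sum over r of (-1)^r * C(m, r) * (n - r)!, the shape of the E_10 expression."""
    return sum((-1) ** r * comb(m, r) * factorial(n - r) for r in range(m + 1))

def count_by_brute_force(n, m):
    """ASSUMED reading: permutations of range(n) fixing none of the first m objects."""
    return sum(1 for p in permutations(range(n)) if all(p[i] != i for i in range(m)))

print(count_by_formula(10, 5))
```

Under that assumption, `count_by_formula(6, 3)` and `count_by_brute_force(6, 3)` agree, which supports the term-by-term shape of the sum.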
\[ D_n = (n - 1)(D_{n-1} + D_{n-2}) \]
Manipulating this equation will lead to the second equation above. First, distribute
the $(n - 1)$ through the equation, so
\[ D_n = (n - 1) D_{n-1} + (n - 1) D_{n-2}, \quad \text{that is,} \quad D_n - n D_{n-1} = -\left( D_{n-1} - (n - 1) D_{n-2} \right) \]
From here, an inductive argument yields equation (4). For the base case, confirm
that $D_2 - 2 D_1 = (-1)^2$. There is only one derangement of 2 elements, that is the
reverse of their original placement. There are no derangements of only one object as
it must always end up in its original placement. So,
\[ D_2 - 2 D_1 = 1 - 2 \cdot 0 = 1 = (-1)^2 \]
Now, for the inductive argument, assume that $D_{n-1} - (n - 1) D_{n-2} = (-1)^{n-1}$. Then
\[ D_n - n D_{n-1} = -\left( D_{n-1} - (n - 1) D_{n-2} \right) = -(-1)^{n-1} = (-1)^n \]
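Both recurrences can be checked against each other numerically. This sketch, with a hypothetical helper name and the conventional starting values $D_0 = 1$ and $D_1 = 0$ as assumptions, builds the sequence from the two-term recurrence and then tests the relation $D_n - n D_{n-1} = (-1)^n$.

```python
def derangement_numbers(limit):
    """Return [D_0, ..., D_limit] using D_n = (n - 1)(D_{n-1} + D_{n-2})."""
    d = [1, 0]                      # assumed start: D_0 = 1 (empty permutation), D_1 = 0
    for n in range(2, limit + 1):
        d.append((n - 1) * (d[n - 1] + d[n - 2]))
    return d[: limit + 1]

D = derangement_numbers(10)
# Check D_n - n * D_{n-1} = (-1)^n for n = 1..10.
print(all(D[n] - n * D[n - 1] == (-1) ** n for n in range(1, 11)))
```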
\[ P(A + B) = P(A) + P(B) - P(A \cdot B) \]
The rest of this paper is devoted to solving some interesting problems that use the
Principle of Inclusion-Exclusion or the Sieve Formula in their solution, or that help
illustrate some of their interesting consequences.
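\[ |A_I| = \frac{n}{\prod_{k \in I} p_k} \]
Manipulating the summation will yield the more common form of the equation. Begin
by writing this out without the summation sign. That is
\[ \varphi(n) = n \left( 1 - \frac{1}{p_1} - \frac{1}{p_2} - \frac{1}{p_3} - \ldots - \frac{1}{p_r} + \frac{1}{p_1 p_2} + \frac{1}{p_2 p_3} + \ldots \right) \]
This might be recognized as the product of binomials, specifically those of the form
$\left(1 - \frac{1}{p_k}\right)$. So the formula becomes
\[ \varphi(n) = n \left(1 - \frac{1}{p_1}\right) \left(1 - \frac{1}{p_2}\right) \left(1 - \frac{1}{p_3}\right) \ldots \left(1 - \frac{1}{p_r}\right) = n \prod_{i=1}^{r} \left(1 - \frac{1}{p_i}\right) \]
Notice that in order to obtain one of the terms in the summation, go through the
product and choose either the $1$ or the $\left(-\frac{1}{p_k}\right)$ from each term.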
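The agreement between the inclusion-exclusion sum and a direct count of integers coprime to $n$ can be tested for small $n$. The following is a sketch under my own naming; it factors $n$ by trial division, sums $(-1)^{|I|} \, n / \prod_{k \in I} p_k$ over all subsets of the prime factors, and compares with a gcd-based count.

```python
from itertools import combinations
from math import gcd, prod

def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def phi_inclusion_exclusion(n):
    """Sum of (-1)^|I| * n / prod(p_k for k in I) over all subsets I of the primes."""
    primes = prime_factors(n)
    return sum((-1) ** r * (n // prod(chosen))
               for r in range(len(primes) + 1)
               for chosen in combinations(primes, r))

def phi_brute_force(n):
    """Count the integers in 1..n that are relatively prime to n."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

print(phi_inclusion_exclusion(360), phi_brute_force(360))
```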
Prove the identity
\[ \sum_{i=0}^{n} (-1)^i \binom{n}{i} i^k = \begin{cases} 0 & \text{if } 0 \le k < n \\ (-1)^n n! & \text{if } k = n \end{cases} \tag{5} \]
In order to prove equation (5), start by calculating the number of onto functions
that map a set K of k elements to one N of n elements. Take S to be the set of
all functions mapping elements of K to elements of N . Then take Ai to be the set
of functions that do not map any element in the domain to the ith element of the
codomain (hence, any function in Ai cannot be an onto function). Then the number
of onto functions will be |S − (A1 ∪ A2 ∪ A3 ∪ . . . ∪ An )|.
A function that maps $K$ to $N$ will take every element in the domain and map it
to some element in the codomain. This means that there will be $n$ choices for each of
the $k$ elements in the domain to map to, yielding $n^k$ different functions. So, $|S| = n^k$.
A function that does not map any elements of $K$ to the $i$th element of $N$ will
take every element in the domain and map it to one of the other $(n - 1)$ elements in the codomain.
So, there are $(n - 1)$ options for each element of $K$ to map to. Then $|A_i|$ for all
$i \in \{1, 2, 3, \ldots, n\}$ will be $(n - 1)^k$. Also, there are $n$ different ways to choose $i$ from
the set $N$, so overall,
\[ \sum_{1 \le i \le n} |A_i| = n \cdot (n - 1)^k \]
Now, consider the general case when $|I| = r$. $A_I$ contains the functions that map
to only $(n - r)$ elements of $N$. This means that there are still $k$ elements of the domain,
mapping to $n - r$ elements in the codomain, for any given subset of $\{1, 2, 3, \ldots, n\}$
of size $r$. There will be $\binom{n}{r}$ subsets of size $r$, and there will be $(n - r)^k$ appropriate
functions for each of these subsets. So
\[ \sum_{|I| = r} |A_I| = \binom{n}{r} (n - r)^k \]
That is
\[ |S - (A_1 \cup A_2 \cup \ldots \cup A_n)| = \sum_{i=0}^{n} (-1)^{n-i} \binom{n}{n-i} i^k \]
This is very similar to the above identity. Up to this point this summation is the
number of onto functions mapping $K$ to $N$. If $k < n$ there will be no onto functions,
so
\[ \sum_{i=0}^{n} (-1)^{n-i} \binom{n}{n-i} i^k = 0 \quad \text{if } 0 \le k < n \]
Then, when $k = n$, onto functions are simply permutations of the elements, so the
number of functions is $n!$. So,
\[ \sum_{i=0}^{n} (-1)^{n-i} \binom{n}{n-i} i^k = \begin{cases} 0 & \text{if } 0 \le k < n \\ n! & \text{if } k = n \end{cases} \]
Manipulating this equation will yield equation (5). First, without changing the
outcome of the equation, replace $\binom{n}{n-i}$ with $\binom{n}{i}$ (these are equal). Next, consider
$(-1)^{n-i}$. Multiply both sides of the equation by $(-1)^n$. This will give a coefficient of
$(-1)^{2n-i} = (-1)^{-i} = (-1)^i$. So,
\[ \sum_{i=0}^{n} (-1)^i \binom{n}{i} i^k = \begin{cases} 0 & \text{if } 0 \le k < n \\ (-1)^n n! & \text{if } k = n \end{cases} \]
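The identity and the onto-function count behind it can both be checked on small cases. This is an illustrative sketch with hypothetical function names; it relies on Python evaluating `0 ** 0` as 1, which matches the convention needed for the $k = 0$ case.

```python
from itertools import product
from math import comb, factorial

def alternating_sum(n, k):
    """Left-hand side of equation (5); Python evaluates 0**0 as 1."""
    return sum((-1) ** i * comb(n, i) * i ** k for i in range(n + 1))

def onto_count_brute(k, n):
    """Count functions from a k-set onto an n-set by listing all n^k functions."""
    return sum(1 for f in product(range(n), repeat=k) if len(set(f)) == n)

def onto_count_formula(k, n):
    """Inclusion-exclusion count of onto functions from a k-set to an n-set."""
    return sum((-1) ** (n - i) * comb(n, i) * i ** k for i in range(n + 1))

print(alternating_sum(4, 2), alternating_sum(4, 4), onto_count_formula(5, 3))
```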
More Involved Combinatorial Arguments
Let $A_1, A_2, \ldots, A_n$ be any events, $B_i = f_i(A_1, A_2, \ldots, A_n)$ polynomials, and
$c_1, c_2, \ldots, c_k$ reals. Then show that
\[ \sum_{i=1}^{k} c_i P(B_i) \ge 0 \tag{6} \]
The coefficient of $P(B)$ in $\sum_{i=1}^{k} c_i P(B_i)$ will be $\sum_{B \subseteq B_i} c_i$. In order to find this
in terms of $B_i'$, it is necessary to determine which polynomials $B$ appears in. When
$B \subseteq B_i$, the atom $B' = A_1' A_2' \cdots A_k' A_{k+1}' \cdots A_n' \subseteq B_i'$. The probability of this atom
is equal to 1, meaning that $P(B_i') \neq 0$. Also, since all probabilities are either 0 or 1,
$P(B_i') = 1$. The reverse is true as well, that is if $P(B_i') \neq 0$, then the probability of
at least one atom contained in $B_i'$ is nonzero. The only atom with a probability that
is nonzero is $B$, so $B \subseteq B_i$. This means that the sum of the coefficients of $P(B)$ for
$B \subseteq B_i$ will be
\[ \sum_{i=1}^{k} c_i P(B_i') \ge 0 \]
Note that this result can be applied not only to inequalities but to identities as well:
an identity holds exactly when both the difference of its two sides and the negative of
that difference are at least 0.
probability that exactly $q$ of them occur is
\[ \sum_{j=q}^{n} (-1)^{j+q} \binom{j}{q} \sigma_j \]
Approach this problem by considering both sides of the equation and applying the
previous result. Thus, let P (A1 ) = P (A2 ) = · · · = P (Ak ) = 1 and then P (Ak+1 ) =
P (Ak+2 ) = · · · = P (An ) = 0.
Start by considering $P(A_Q)$. Since this is a polynomial in $A_1, A_2, \ldots, A_n$, applying
the previous result allows the problem to be reduced to only showing it when all
probabilities of individual events are 1 or 0. From here consider two cases: the first
when $k \neq q$, the other when $k = q$. When $k \neq q$ there will be no way for exactly
$q$ events to occur. So the probability that exactly $q$ events occur is 0. When $k = q$
there will be exactly one way to choose $q$ events that occur from the $n$ events total.
Once these $q$ events have been chosen, the probability that they occur is exactly 1.
So the probability that exactly $q$ events occur is 1. Thus,
\[ P(A_Q) = \begin{cases} 0 & \text{if } k < q \\ 1 & \text{if } k = q \\ 0 & \text{if } k > q \end{cases} \]
As before, consider cases; this time there will be three cases to examine, beginning
with $k < q$. In this case there will be no summation, as the upper bound will be less
than the lower bound. So, when $k < q$, $\sum_{j=q}^{k} (-1)^{j+q} \binom{j}{q} \sigma_j = 0$.
The next case is $k = q$. In this case the summation has only one term, so calculate
what this term will be. The summation is $(-1)^{q+q} \binom{q}{q} \sigma_q$. So, there are three parts
to the formula: $(-1)^{2q} = 1$, $\binom{q}{q} = 1$, and $\sigma_q = 1$. This last one requires a bit more
explanation: $\sigma_q = \sum_{|I| = q} P(A_I)$, and, in this case, there is only one set $I$ of
$q$ elements for which $P(A_I)$ is nonzero. So, $P(A_I) = P(A_1 \cdot A_2 \cdots A_q)$. Finally, since all of these events have a
probability of 1, $\sigma_q = 1$. So, when $k = q$,
\[ \sum_{j=q}^{k} (-1)^{j+q} \binom{j}{q} \sigma_j = 1 \]
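The final case to consider is $k > q$. Start by calculating $\sigma_j$ in this case. This will
be the number of different ways that $j$ events can be chosen from those $k$ events with
probability 1. Thus $\sigma_j = \binom{k}{j}$. Making this substitution in the summation,
\[ \sum_{j=q}^{k} (-1)^{j+q} \binom{j}{q} \binom{k}{j} \]
Now the expression inside of the summation looks much like the binomial theorem.
Making the observation that $(-1)^{j+q} = (-1)^j (-1)^q = (-1)^j (-1)^{-q} = (-1)^{j-q}$, and
using this together with the identity $\binom{j}{q} \binom{k}{j} = \binom{k}{q} \binom{k-q}{j-q}$ reveals
\[ \binom{k}{q} \sum_{j=q}^{k} (-1)^{j-q} \binom{k-q}{j-q} = \binom{k}{q} \cdot (1 - 1)^{k-q} = 0 \]
So, overall
\[ \sum_{j=q}^{n} (-1)^{j+q} \binom{j}{q} \sigma_j = \begin{cases} 0 & \text{if } k < q \\ 1 & \text{if } k = q \\ 0 & \text{if } k > q \end{cases} \]
So, indeed
\[ P(A_Q) = \sum_{j=q}^{n} (-1)^{j+q} \binom{j}{q} \sigma_j \]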
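The formula for the probability that exactly $q$ events occur can be tested with exact rational arithmetic. The sketch below makes an assumption the paper does not need: the events are taken to be independent, purely so that $P(A_I)$ is a product and easy to compute. The probabilities and function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, prod

def prob_exactly_q_direct(p, q):
    """P(exactly q events occur) for independent events with probabilities p."""
    n = len(p)
    total = Fraction(0)
    for occurring in combinations(range(n), q):
        total += prod(p[i] if i in occurring else 1 - p[i] for i in range(n))
    return total

def sigma(p, j):
    """sigma_j = sum over |I| = j of P(A_I); independence makes P(A_I) a product."""
    return sum(prod(p[i] for i in chosen) for chosen in combinations(range(len(p)), j))

def prob_exactly_q_formula(p, q):
    """Sum over j >= q of (-1)^(j+q) * C(j, q) * sigma_j."""
    n = len(p)
    return sum((-1) ** (j + q) * comb(j, q) * sigma(p, j) for j in range(q, n + 1))

# Hypothetical probabilities, chosen only for illustration.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4), Fraction(2, 5)]
print([prob_exactly_q_formula(p, q) == prob_exactly_q_direct(p, q) for q in range(5)])
```

The 0/1 case used in the proof corresponds to the purely combinatorial sum of $(-1)^{j+q} \binom{j}{q} \binom{k}{j}$ over $j$ from $q$ to $k$, which equals 1 when $k = q$ and 0 otherwise.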
Using an inductive argument prove that $P(A_1 | \bar{A}_2 \bar{A}_3 \ldots \bar{A}_n) \le \frac{1}{2d}$. Let the inductive
hypothesis be that all subgraphs $G'$ of $G$, with $V(G') \subset V(G)$ have
$P(A_{i_1} | \bar{A}_{i_2} \bar{A}_{i_3} \ldots \bar{A}_{i_k}) \le \frac{1}{2d}$.
Then, the base case will be the smallest possible subgraphs, or those consisting
of only a single vertex. In this case $P(A_i) \le \frac{1}{4d} \le \frac{1}{2d}$ from part (i) in the statement
of the problem.
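Let vertices 2 through $m$ be those adjacent to vertex 1. Then rewrite
\[ P(A_1 | \bar{A}_2 \bar{A}_3 \ldots \bar{A}_n) = \frac{P(A_1 \bar{A}_2 \ldots \bar{A}_n)}{P(\bar{A}_2 \ldots \bar{A}_n)} = \frac{\dfrac{P(A_1 \bar{A}_2 \ldots \bar{A}_n)}{P(\bar{A}_{m+1} \ldots \bar{A}_n)}}{\dfrac{P(\bar{A}_2 \ldots \bar{A}_n)}{P(\bar{A}_{m+1} \ldots \bar{A}_n)}} = \frac{P(A_1 \bar{A}_2 \ldots \bar{A}_m | \bar{A}_{m+1} \ldots \bar{A}_n)}{P(\bar{A}_2 \ldots \bar{A}_m | \bar{A}_{m+1} \ldots \bar{A}_n)} \]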
Now, from the inductive assumption, $P(A_i | \bar{A}_{m+1} \ldots \bar{A}_n) \le \frac{1}{2d}$. Also, every vertex has
degree of at most $d$ and $m - 1$ is the degree of vertex 1. Thus,
\[ 1 - \sum_{i=2}^{m} P(A_i | \bar{A}_{m+1} \ldots \bar{A}_n) \ge 1 - \frac{m - 1}{2d} \ge 1 - \frac{d}{2d} = \frac{1}{2} \]
So, the numerator is less than or equal to $\frac{1}{4d}$, and the denominator is greater than
or equal to $\frac{1}{2}$. Meaning that, overall,
\[ P(A_1 | \bar{A}_2 \bar{A}_3 \ldots \bar{A}_n) \le \frac{\frac{1}{4d}}{\frac{1}{2}} = \frac{1}{2d} \]
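Now, rewriting $P(\bar{A}_1 \bar{A}_2 \cdots \bar{A}_n)$ in terms of $P(A_1 | \bar{A}_2 \bar{A}_3 \ldots \bar{A}_n)$ will yield the final
result. Notice that
\[ \frac{P(\bar{A}_1 \bar{A}_2 \ldots \bar{A}_n)}{P(\bar{A}_2 \bar{A}_3 \ldots \bar{A}_n)} = P(\bar{A}_1 | \bar{A}_2 \bar{A}_3 \ldots \bar{A}_n) = 1 - P(A_1 | \bar{A}_2 \bar{A}_3 \ldots \bar{A}_n) \]
However, $P(A_1 | \bar{A}_2 \bar{A}_3 \ldots \bar{A}_n) \le \frac{1}{2d}$, so this means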
\[ a_{ij} \neq 0 \Rightarrow x_i \le x_j \]
Show that the sum, the product, and (if it exists) the inverse of compatible
matrices are compatible. Let $A = (a_{ij})$ and $B = (b_{ij})$ be compatible matrices.
Then show that $C = A + B$, $D = AB$, and $E = A^{-1}$ are compatible as well.
Let $C = (c_{ij})$. Then for all $i, j \le n$, $c_{ij} = a_{ij} + b_{ij}$. If $c_{ij} \neq 0$ then $a_{ij}$ or $b_{ij}$ must
be nonzero as well. Then, since $A$ and $B$ are both compatible, this implies $x_i \le x_j$,
so $C$ is also compatible.
Let $D = (d_{ij})$. Then for all $i, j \le n$,
\[ d_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \]
If $d_{ij} \neq 0$, there must be some $k$ such that $a_{ik} \neq 0$ and $b_{kj} \neq 0$. This means that
$x_i \le x_k$ (from compatibility of $A$), and $x_k \le x_j$ (from compatibility of $B$). Then, by
transitivity, $x_i \le x_j$, so $D$ is compatible.
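The sum and product claims can be illustrated on a concrete partially ordered set. The sketch below is an assumption-laden example, not taken from the paper: it uses the divisors of 12 ordered by divisibility, and two arbitrary integer matrices whose nonzero entries sit only where $x_i \le x_j$.

```python
# Hypothetical poset for illustration: the divisors of 12 ordered by divisibility.
xs = [1, 2, 3, 4, 6, 12]
n = len(xs)

def leq(a, b):
    """Partial order: a <= b exactly when a divides b."""
    return b % a == 0

def is_compatible(m):
    """True when m[i][j] != 0 implies xs[i] <= xs[j] in the poset."""
    return all(m[i][j] == 0 or leq(xs[i], xs[j]) for i in range(n) for j in range(n))

# Two arbitrary compatible matrices: nonzero entries only where xs[i] divides xs[j].
A = [[(i + 2 * j + 1) if leq(xs[i], xs[j]) else 0 for j in range(n)] for i in range(n)]
B = [[(3 * i - j + 2) if leq(xs[i], xs[j]) else 0 for j in range(n)] for i in range(n)]

C = [[A[i][j] + B[i][j] for j in range(n)] for i in range(n)]
D = [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

print(is_compatible(A), is_compatible(B), is_compatible(C), is_compatible(D))
```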
Assuming that $A$ is invertible, let $E = (e_{ij})$. Then prove that $e_{ij} \neq 0 \Rightarrow x_i \le x_j$.
First, assume that $x_i \le x_j \Rightarrow i \le j$ (this can be done by indexing $A$ correctly). This
guarantees that $A$ is an upper triangular matrix, so
\[ \det(A) = \prod_{i=1}^{n} a_{ii} \]
Since $A$ is invertible, $\det(A) \neq 0$, so then for all $i$, $a_{ii} \neq 0$. Now assume that for some
$e_{ij} \neq 0$, $x_i \not\le x_j$. It is possible to
then choose the maximal $i$ that yields this result. Looking at the product of $A$ and
$E$ (the product of inverse matrices is the identity),
\[ \sum_{k=1}^{n} a_{ik} e_{kj} = 0 \quad (i \neq j) \]
However, since $a_{ii} \neq 0$ and $e_{ij} \neq 0$, $a_{ii} e_{ij} \neq 0$. So there must be some $k \neq i$ such that
$a_{ik} e_{kj} \neq 0$. Then $a_{ik} \neq 0 \Rightarrow x_i \le x_k$, and thus $i < k$. Since $i < k$, the earlier choice
of $i$ as maximal guarantees that $e_{kj} \neq 0 \Rightarrow x_k \le x_j$ ($i$ was the largest value for
which there was an “incompatibility” in $E$). Transitivity then yields $x_i \le x_j$, which
contradicts the assumption that $E$ was not compatible. So $E$ is compatible.
First observe that these conditions uniquely determine the function µ(x, y). Indeed,
if we know the values of µ(x, y) for x ≤ y < z, then by the last condition the value of
µ(x, z) can be determined.
Let the indexing be so that $x_i \le x_j$ implies $i \le j$ and define a matrix $Q = (q_{ij})$
where
\[ q_{ij} = \begin{cases} 1 & \text{if } x_i \le x_j \\ 0 & \text{otherwise} \end{cases} \]
Define a matrix $M = (m_{ij})$ where $m_{ij} = \mu(x_i, x_j)$. Then rewrite the requirements
on $\mu$ as
\[ QM = I \Rightarrow M = Q^{-1} \]
This uniquely determines $\mu(x, y)$.
Confirm this by checking that the three conditions are met. First, since $Q$ is a
compatible matrix, $M$ must also be compatible, confirming equation (7). Second,
$\mu(x, x)$ will always be an entry along the diagonal of $M$. $Q$ is upper triangular with all $q_{ii} = 1$,
so $\det(Q) = 1$ and its inverse is upper triangular with diagonal entries $1/q_{ii} = 1$; thus $m_{ii} = 1$,
confirming equation (8). Finally, let $i_{xz}$ be an entry in the identity matrix. Then
\[ i_{xz} = \sum_{y=1}^{n} m_{xy} q_{yz} \]
Notice however, that when $y < x$, $m_{xy} = 0$, and when $y > z$, $q_{yz} = 0$. So
\[ i_{xz} = \sum_{y=1}^{n} m_{xy} q_{yz} = \sum_{y=x}^{z} m_{xy} q_{yz} = 0 \quad (x < z) \]
Then, since $q_{yz}$ is 1 whenever $y \le z$, this is
\[ \sum_{x \le y \le z} \mu(x, y) = 0 \]
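The relationship between $\mu$ and the inverse of $Q$ can be checked on a small example. This is a sketch under stated assumptions: the poset (divisors of 12 under divisibility) and all names are hypothetical, and $\mu$ is computed from the three defining conditions by recursion rather than by matrix inversion.

```python
from functools import lru_cache

# Hypothetical poset for illustration: the divisors of 12 ordered by divisibility,
# listed so that x_i <= x_j implies i <= j.
xs = [1, 2, 3, 4, 6, 12]

def leq(a, b):
    """Partial order: a <= b exactly when a divides b."""
    return b % a == 0

@lru_cache(maxsize=None)
def mu(x, z):
    """Mobius function from the three conditions: 0 unless x <= z, 1 on the
    diagonal, and sum of mu(x, y) over x <= y <= z equal to 0 when x < z."""
    if not leq(x, z):
        return 0
    if x == z:
        return 1
    return -sum(mu(x, y) for y in xs if leq(x, y) and leq(y, z) and y != z)

n = len(xs)
Q = [[1 if leq(xs[i], xs[j]) else 0 for j in range(n)] for i in range(n)]
M = [[mu(xs[i], xs[j]) for j in range(n)] for i in range(n)]
MQ = [[sum(M[i][k] * Q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

print(MQ == identity, [mu(1, x) for x in xs])
```

For this particular poset, the values `mu(1, x)` coincide with the classical number-theoretic Möbius function of `x`.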
Then, verify the requirements of the function $\mu$ from above. First, it is clear that
$\mu(X, Y) = 0$ when $X \not\subseteq Y$ from the definition of the function, confirming equation (7).
Next, $\mu(X, X) = (-1)^{|X - X|} = (-1)^0 = 1$, confirming equation (8). Finally, calculate
$\sum_{X \subseteq Y \subseteq Z} \mu(X, Y)$ when $X \subset Z$. This will be
\[ \sum_{X \subseteq Y \subseteq Z} (-1)^{|Y - X|} = \sum_{k=0}^{|Z - X|} \binom{|Z - X|}{k} (-1)^k \]
Notice that this is an instance of the binomial theorem, so the sum is $(1 - 1)^{|Z - X|} = 0$, fulfilling
equation (9).
Represent $f(x)$ and $g(x)$ as the vectors $[f(x_1), \ldots, f(x_n)]$ and $[g(x_1), \ldots, g(x_n)]$,
respectively. Then observe that $g(x) = f(x) \cdot Q$. It follows that
\[ f(x) = g(x) \cdot Q^{-1} = g(x) \cdot M \]
So,
\[ f(x) = \sum_{z \le x} g(z) \mu(z, x) \]
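The inversion formula can be exercised on the lattice of subsets, where $\mu(X, Y) = (-1)^{|Y - X|}$ for $X \subseteq Y$. The following is a minimal sketch: the ground set, the function `f`, and the helper names are all hypothetical choices made for illustration.

```python
from itertools import combinations

def subsets(s):
    """All subsets of the frozenset s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def mu(X, Y):
    """Mobius function of the subset lattice: (-1)^{|Y - X|} if X is a subset of Y."""
    return (-1) ** len(Y - X) if X <= Y else 0

ground = frozenset({1, 2, 3})
# Hypothetical f, chosen arbitrarily for illustration.
f = {X: 10 * len(X) + sum(X) for X in subsets(ground)}
# g(x) = sum of f(z) over z <= x.
g = {X: sum(f[Z] for Z in subsets(X)) for X in subsets(ground)}
# Inversion: f(x) = sum of g(z) * mu(z, x) over z <= x.
recovered = {X: sum(g[Z] * mu(Z, X) for Z in subsets(X)) for X in subsets(ground)}

print(recovered == f)
```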
Finally, show that the Sieve is a special case of this statement.
Let $S = \{1, 2, \ldots, n\}$, then the Sieve Formula states that
\[ P(A_1 + A_2 + \cdots + A_n) = \sum_{\emptyset \neq K \subseteq S} P(A_K) (-1)^{|K|-1} \]
However, since $K$ ranges over all subsets of $S$, $S - K$ will also range over all subsets.
So,
\[ \sum_{K \subseteq S} P(A_{S-K}) (-1)^{|S-K|} = \sum_{K \subseteq S} P(A_K) (-1)^{|K|} \]
Finally,
\[ P(A_1 + A_2 + \cdots + A_n) = \sum_{\emptyset \neq K \subseteq S} P(A_K) (-1)^{|K|-1} \]
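The Sieve Formula can be checked on a small finite probability space with exact fractions. The sketch below assumes a uniform measure on twelve outcomes and three arbitrarily chosen events, none of which come from the paper; the sum runs over the nonempty index sets $K$, with $A_K$ the intersection of the chosen events.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite probability space: a fair 12-sided die, uniform measure.
omega = set(range(1, 13))
events = [
    {w for w in omega if w % 2 == 0},   # A_1: even
    {w for w in omega if w % 3 == 0},   # A_2: multiple of 3
    {w for w in omega if w > 8},        # A_3: greater than 8
]

def P(event):
    """Uniform probability of an event inside omega."""
    return Fraction(len(event), len(omega))

def sieve(events):
    """Sum over nonempty K of (-1)^(|K|-1) * P(A_K), with A_K the intersection."""
    total = Fraction(0)
    for r in range(1, len(events) + 1):
        for chosen in combinations(events, r):
            total += (-1) ** (r - 1) * P(set.intersection(*chosen))
    return total

print(sieve(events), P(set.union(*events)))
```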