Diplomarbeit
Directed Graphs
DIPLOMARBEIT
Diplom-Informatiker
FRIEDRICH-SCHILLER-UNIVERSITÄT JENA
Jena, 11.02.2009
Abstract
In Bioinformatics, the task of hierarchically classifying diseases with
noisy data recently led to studying the Transitivity Editing prob-
lem, which is to change a given digraph by adding and removing a
minimum number of arcs such that the resulting digraph is transitive.
We show that both Transitivity Editing and Transitivity Dele-
tion, which does not allow the insertion of arcs, are NP-complete even
when restricted to DAGs. We provide polynomial-time executable data
reduction rules that yield an O(k^2)-vertex kernel for general digraphs
and an O(k)-vertex kernel for digraphs of bounded degree. Furthermore,
a heuristic approach and a search tree algorithm are presented.
We show an asymptotic running time of O(2.57^k + n^3) for Transitivity
Editing and O(2^k + n^3) for Transitivity Deletion.
2 Preliminaries
2.1 Basic Definitions and Notations
2.2 Complexity
2.3 Transitivity Editing
2.4 Graph Classes
4 Computational Complexity
4.1 Complexity of Transitivity Editing
4.2 Complexity of Acyclic Transitivity Editing
7 Heuristics
8 Experimental Results
8.1 Employed Tests
8.2 Results and Interpretation
9 Conclusion
1 Introduction
Finding and highlighting structure in graph modeled data has been a compu-
tational problem since automated data collection became possible. Machine
learning, data mining, pattern recognition, image analysis, genome assembly,
automatic data reduction, and chip design are examples of fields where such
tasks are common. Examples of such structures range from induced cycles
of even or odd length, through paths and vertices of special degree, to complex
subgraphs such as complete bipartite graphs of a certain size. In molecular
diagnostics, the task of finding hierarchical disease classifications based on
noisy data was recently considered by Jacob et al. [JJK+08]: A group of
patients that share a disease is analyzed for the possibility of hierarchically
classifying the disease in a scheme of sub-diseases, based on molecular
characteristics [HBB+06]. A threshold is applied to deduce that a certain disease
is a sub-disease of some other: If the ratio of patients with some feature B
that also exhibit feature A is beyond a given threshold, then (A, B) is an arc
in the disease hierarchy. Naturally, such measurements and classifications
are subject to errors that may result in noisy data. Although the relation
of being a sub-disease of some other disease should logically be transitive,
this noise may cause inconsistencies in the resulting data set. For instance,
a disease A is found to be a sub-disease of B which is in turn a sub-disease
of C, but the measured data does not indicate that A is a sub-disease of C.
To be able to work with the resulting data, we wish to eliminate as much
noise as possible from the data set, that is, we want to find a consistent dis-
ease hierarchy that is closest to the measured data. When considering the
data as a directed graph, we insert and delete arcs from it until transitivity
is achieved. Obviously, we must assume the error to be small; otherwise, we
may reconstruct almost any hierarchical structure from the data.
The central problem in this work is the Transitivity Editing prob-
lem, which is to find the closest transitive digraph to a given digraph. It
is highly related to the Cluster Editing problem, which is to find the
closest transitive undirected graph for a given undirected graph. It has
been shown that Cluster Editing is NP-complete [KM86]. We show
that the Transitivity Editing problem is also NP-complete and remains
NP-complete even when restricted to acyclic digraphs (DAGs) or digraphs
of maximum degree four. Another related problem is the Comparabil-
ity Editing problem, which asks whether a given undirected graph can
be transformed by a limited number of operations such that the remaining
graph can be transitively ordered. Comparability Editing has also been
shown to be NP-complete [NSS01]. In this work, we only briefly address
this problem.
Bearing in mind that, since the measurement error is small, the edit-
ing distance to the closest transitive digraph is small, a fixed-parameter
approach is promising. In parameterized complexity, problems are ana-
lyzed with regard to parameters other than the input size. Often, it can be
shown that problems become efficiently solvable when considering a param-
eter that is small in comparison to the input size. In this context, Böcker
et al. [BBK09] have shown that Transitivity Editing is fixed-parameter
tractable by providing a search tree algorithm that runs in O(3^k · n^3) time,
with n denoting the number of vertices in the input graph and k denoting
the minimum number of arc modifications. They also presented a very
fast Integer Linear Programming (ILP) implementation of Transi-
tivity Editing based on the ILP implementation of Cluster Editing.
However, the existence of a polynomial-size problem kernel was posed as
an open problem. By constructing a problem kernel that contains O(k^2)
vertices and improving the branching vector of the search tree algorithm,
we show that Transitivity Editing can be solved in O(2.57^k + n^3) time,
improving the previous result by Böcker et al. [BBK09].
Apart from the exact search tree algorithm, we present a heuristic that
computes a transitive digraph that is relatively close but not necessarily
optimal in terms of edit distance. However, the algorithm can provide an
upper bound for the edit distance of the given digraph and, although possibly
less accurate, runs asymptotically faster than any known exact algorithm
and a previous heuristic algorithm by Jacob et al. [JJK+08]. As a novelty,
we study the problem of Transitivity Deletion, which asks whether the
given digraph can be turned transitive with a given number of arc deletions.
We show its NP-completeness and present a search tree algorithm that runs
in O(2^k + n^3) time based on a kernelization that leaves a problem kernel
of O(k^2) vertices. We only briefly address the problem of Transitivity
Completion, which is to calculate the transitive closure of a given digraph.
2 Preliminaries
In the following, we give a survey of definitions and graph-theoretic facts
needed in later sections. Complexity results and techniques are introduced,
and a collection of classes of digraphs and problems, including the
Transitivity Editing problem and some of its variants, is presented.
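\[ \mathrm{pred}_A(V') := \bigcup_{v \in V'} \mathrm{pred}_A(v), \qquad \mathrm{succ}_A(V') := \bigcup_{v \in V'} \mathrm{succ}_A(v) \]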
Furthermore, indeg_A(u) := |pred_A(u)| denotes the indegree of the vertex u
in D and outdeg_A(u) := |succ_A(u)| denotes the outdegree of the vertex u
in D. Note that
\[ \sum_{u \in V} \mathrm{indeg}_A(u) = \sum_{u \in V} \mathrm{outdeg}_A(u) = |A|. \]
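The degree identity above can be checked on a small example. The following is a minimal sketch, assuming a digraph is given as a set of arcs (ordered pairs); the helper name `neighborhoods` and the sample digraph are illustrative and not part of the thesis.

```python
from collections import defaultdict

def neighborhoods(arcs):
    """Return (pred, succ): dictionaries mapping each vertex to a set of vertices."""
    pred, succ = defaultdict(set), defaultdict(set)
    for u, v in arcs:
        succ[u].add(v)  # v is a successor of u
        pred[v].add(u)  # u is a predecessor of v
    return pred, succ

# Illustrative digraph D = (V, A).
V = {"a", "b", "c", "d"}
A = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}

pred, succ = neighborhoods(A)
indeg = {u: len(pred[u]) for u in V}
outdeg = {u: len(succ[u]) for u in V}
```

Every arc contributes exactly one to the indegree of its head and one to the outdegree of its tail, which is why both sums equal |A|.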
2.2 Complexity
In this section, we give an introduction to the terms and ideas that are
related to combinatorially hard problems like Transitivity Editing. In
particular, we give a brief overview of NP-completeness and parameterized
complexity as well as kernelization and approximation.
Vertex Cover:
Input: An undirected graph G = (V, E) and an integer k ≥ 0.
Question: Is there some set C ⊆ V such that each edge in E
has at least one endpoint in C and |C| ≤ k?
Vertex Cover can be solved in O(1.2738^k + kn) time [CKX06], where the
parameter k is a bound on the maximum size of the vertex cover set we are
looking for and n is the number of vertices of the given graph (throughout
this work, n always refers to the number of vertices in the input graph).
The best known “non-parameterized” solution for Vertex Cover is due
to Robson [Rob86, Rob01]. He showed that Independent Set, and thus
Vertex Cover, can be solved in O(1.19^n) time. However, for k ≤ 0.71n,
the above-mentioned fixed-parameter solution turns out to be better.
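The threshold on k can be recovered by comparing the two exponential terms. The sketch below assumes that polynomial factors are ignored, so the comparison reduces to 1.2738^k ≤ 1.19^n; the function name `crossover_ratio` is illustrative.

```python
import math

# Bases taken from the two running times quoted above.
FPT_BASE = 1.2738    # from O(1.2738^k + kn)
EXACT_BASE = 1.19    # from O(1.19^n)

def crossover_ratio(fpt_base=FPT_BASE, exact_base=EXACT_BASE):
    """Largest ratio k/n for which fpt_base**k <= exact_base**n holds."""
    # fpt_base**k <= exact_base**n  <=>  k * ln(fpt_base) <= n * ln(exact_base)
    return math.log(exact_base) / math.log(fpt_base)

ratio = crossover_ratio()
```

Under this simplification the ratio lies slightly above 0.71, which agrees with the bound k ≤ 0.71n stated in the text.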
D contains a P3
⇔ ∃ u, v, w ∈ V: (u, v) ∈ A ∧ (v, w) ∈ A ∧ (u, w) ∉ A
⇔ ∃ u, v, w ∈ V: ¬(¬((u, v) ∈ A ∧ (v, w) ∈ A) ∨ (u, w) ∈ A)
Lemma 2.2 allows us to use the terms “P3-free” and “transitive”
synonymously. Furthermore, in a transitive digraph, the arcs from a vertex u
to all vertices v that are reachable from u are present, since otherwise there
would be some vertex w with (u, w, v) being a P3. Recall that a digraph is
strong if each of its vertices is reachable from every other vertex. This implies
that, if a transitive digraph is strong, then it is a complete digraph.
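The equivalence of “P3-free” and “transitive” suggests a direct test for transitivity. The following is a minimal sketch, assuming that a P3 consists of three distinct vertices and that the digraph has no loops; the names `find_p3s` and `is_transitive` are illustrative. The enumeration takes cubic time and is meant for small examples only.

```python
from itertools import product

def find_p3s(vertices, arcs):
    """List all triples (u, v, w) with (u, v), (v, w) in arcs and (u, w) missing."""
    arcs = set(arcs)
    return [
        (u, v, w)
        for u, v, w in product(vertices, repeat=3)
        if len({u, v, w}) == 3  # assumption: the three vertices are distinct
        and (u, v) in arcs and (v, w) in arcs and (u, w) not in arcs
    ]

def is_transitive(vertices, arcs):
    """A digraph is transitive iff it contains no P3."""
    return not find_p3s(vertices, arcs)
```

For instance, the path 1 → 2 → 3 contains exactly one P3, and inserting the arc (1, 3) makes it transitive.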
Transitivity Editing:
Input: A directed graph D = (V, A) and an integer k ≥ 0.
Question: Is there a directed graph D' = (V, A') that is
transitive and |A∆A'| ≤ k?
In this context, the set S := A∆A' is called solution set and contains all
arcs that are inserted into or deleted from D. There are several interesting
properties of solution sets described in Section 3. An example for the Tran-
sitivity Editing problem is shown in Figure 1. When considering graph
editing problems it is also interesting to consider versions of the problem
that are limited to insertion or deletion, respectively.
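To make the problem statement concrete, a smallest solution set can be found by exhaustive search on very small digraphs. This is only a sketch under stated assumptions: loop-free digraphs, P3s on three distinct vertices, and illustrative function names. It enumerates all candidate sets, takes exponential time, and is unrelated to the search tree algorithm of the thesis.

```python
from itertools import combinations, permutations

def is_transitive(vertices, arcs):
    """True iff no P3 (u, v, w) on three distinct vertices exists."""
    return all(
        (u, w) in arcs
        for u, v, w in permutations(vertices, 3)
        if (u, v) in arcs and (v, w) in arcs
    )

def min_transitivity_editing(vertices, arcs):
    """Return one smallest set S such that (V, A symmetric-difference S) is transitive."""
    arcs = set(arcs)
    candidates = list(permutations(vertices, 2))  # every possible arc
    for size in range(len(candidates) + 1):
        for S in combinations(candidates, size):
            if is_transitive(vertices, arcs ^ set(S)):
                return set(S)
```

An instance (D, k) is then a yes-instance of Transitivity Editing exactly if the returned set has at most k elements.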
Transitivity Deletion:
Input: A directed graph D = (V, A) and an integer k ≥ 0.
Question: Is there a directed graph D' = (V, A') that is
transitive with A' ⊆ A and |A\A'| ≤ k?
deleted from D, while S_INS := S\A contains all arcs that are inserted into D.
The arcs in S_DEL and S_INS may sometimes be called delete operations or
insert operations, respectively. Note that, obviously, S_INS ∩ S_DEL = ∅.
Applying S to D results in (V, A∆S). Note that all operations of S can be
applied in arbitrary order, as long as all operations in S are applied.
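Applying a solution set is a symmetric difference of arc sets, which is also why the order of operations does not matter. The sketch below assumes S_DEL = S ∩ A (the counterpart of S_INS = S\A); the function names are illustrative.

```python
def split_solution(arcs, solution):
    """Split a solution set S into (S_DEL, S_INS) with respect to the arc set A."""
    arcs, solution = set(arcs), set(solution)
    s_del = solution & arcs   # assumed definition: arcs of S that are present in A
    s_ins = solution - arcs   # S \ A: arcs of S that are absent from A
    return s_del, s_ins

def apply_solution(arcs, solution):
    """Return the arc set obtained by applying S to A, that is, A symmetric-difference S."""
    return set(arcs) ^ set(solution)

# Illustrative example: delete (2, 3) and insert (1, 3).
A = {(1, 2), (2, 3)}
S = {(2, 3), (1, 3)}
s_del, s_ins = split_solution(A, S)
result = apply_solution(A, S)
```

Since the symmetric difference is commutative and associative, applying the operations of S one at a time in any order yields the same arc set.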
Note that there may still be many different optimal solution sets for a
single digraph. We discuss optimal solution sets in detail in Section 3.
On the other hand, not every DAG is transitive; moreover, we will see
that Transitivity Editing on DAGs is NP-complete. Figure 2 shows that
there is a digraph D, such that
Since for each arc (a, b) either (a, b) or (b, a) is in the arc set, Lemma 2.5
implies that, when restricted to tournaments, Transitivity Deletion ⊆
Feedback Arc Set. It has been shown that the Feedback Arc Set
problem is APX-hard [Kan92]. However, while still being NP-hard [CTY07],
3 STRUCTURE OF OPTIMAL SOLUTION SETS 15
Comparability Graphs.
Comparability Editing:
Input: An undirected graph G = (V, E) and an integer k ≥ 0.
Question: Is there an undirected graph G' = (V, E') that is a
comparability graph and |E∆E'| ≤ k?
following lemma shows that arc deletions preserve the property of being
diamond-free.
With the following lemma, we are able to show that in order to solve
Transitivity Editing on a digraph which is diamond-free, it is optimal to
only perform arc deletions. This helps us improve the performance (running
time) of our algorithms on diamond-free graphs.
Lemma 3.2. If a given directed graph D = (V, A) does not contain a dia-
mond, then there is an optimal solution set S for D that does not insert an
arc, that is, S = S_DEL.
Proof. Let S' be an optimal solution set for D. By Lemma 3.1, we can apply
all delete operations of a given solution set and still maintain a diamond-free
digraph. Hence, we assume D to be diamond-free and the solution set S' to
only contain insert operations. We now construct S from S':
Since D does not contain a diamond, for each pair (a, b), there is at most
one w meeting the criteria (a, w) ∈ A and (w, b) ∈ A. Hence, for each arc
in S' there is at most one arc in S and hence |S| ≤ |S'|.
Let D' := (V, A') with A' := A∆S. We now show that S is a solution
set for D by proving that D' is transitive: Assume there is a P3 p = (x, y, z)
in D'. Since S ⊆ A (that is, S contains only delete operations), we know
that (x, y) ∈ A and (y, z) ∈ A and, since S' is a solution set for D, we know
that p is not a P3 in (V, A∆S'), implying either (x, z) ∈ S' or (x, z) ∈ S.
However, (x, z) ∉ S', because otherwise (x, y) ∈ S, contradicting p being
a P3 in D'. Hence, (x, z) ∈ A and (x, z) ∈ S. By definition of S, this implies
that there is a v ∈ V with (z, v) ∈ A and (x, v) ∈ S'. Also, (y, v) ∉ A, since
otherwise, (x, z, v) and (x, y, v) would form a diamond in D. Hence, q =
(y, z, v) is a P3 in D. Like p, also q cannot be a P3 in (V, A∆S'). However, S'
does only contain insert operations, which implies (y, v) ∈ S'. Since (y, z) ∈
A and (z, v) ∈ A, this implies (y, z) ∈ S, contradicting p being a P3 in D'.
In the following, we can specialize this circumstance to the fact that arc
insertions need only to take place if the endpoints take part in a diamond.
However, to prepare this, the following lemma shows that there are optimal
solution sets that do not increase the number of paths between two vertices,
if they are not head and tail of a diamond.
Lemma 3.3. Let D = (V, A) be a digraph and u, v, w ∈ V. Furthermore,
let pred_A(v) ∩ succ_A(u) ⊆ {w}. Then there is an optimal solution set S*
for D such that if there is a path of length ≥ 3 from u to v in (V, A∆S*),
then it is (u, w, v).
Proof. Let S' denote an optimal solution set for D. Furthermore, for all S ⊆
V × V, let
\[ P(S) := \bigcup_{m \ge 3} \{ (u = x_0, x_1, \ldots, x_{m-1} = v) \in V^m \mid \forall\, 0 \le i \le m-2 : (x_i, x_{i+1}) \in A \Delta S \} \]
denote the set of all paths of length ≥ 3 in A∆S from u to v and let P'(S) :=
P(S)\{(u, w, v)}. Note that, since there is no diamond (u, ..., v), for each
path (x_0, ..., x_{m−1}) ∈ P'(S'), we know that (u, x_1) ∈ S'_INS ∨ (x_1, v) ∈ S'_INS
and (u, x_{m−2}) ∈ S'_INS ∨ (x_{m−2}, v) ∈ S'_INS. We refer to this fact as
observation 1. In the following, we show that there is also an optimal solution
set S* for D such that P'(S*) = ∅. Let
\[ S^* := (S' \cup S'^{+}) \setminus S'^{-} \]
with
\[ S'^{+} := \bigcup_{(x_0, \ldots, x_{m-1}) \in P'(S')} A \cap \{ (u, x_1), (x_{m-2}, v) \} \]
and
\[ S'^{-} := \bigcup_{(x_0, \ldots, x_{m-1}) \in P'(S')} S'_{\mathrm{INS}} \cap \{ (u, x_1), (u, x_{m-2}), (x_1, v), (x_{m-2}, v) \}. \]
Proof. By Lemma 3.3 there is an optimal solution set S' for D such that
the digraph (V, A∆S') contains at most one path of length ≥ 3 from u
to v. If there is no such path in (V, A∆S'), then S'\{(u, v)} is obviously
an optimal solution set that does not contain (u, v). If there is such a
path, then Lemma 3.3 implies that this path is (u, w, v). We can assume
that (u, v) ∈ S'_INS. In the following, we show that
is an optimal solution set for D that does not contain (u, v). Obviously, |S| ≤
|S'|, hence, we need only show that S is a solution set for D. Suppose there
is a P3 p = (x, y, z) in (V, A∆S). Since we remove an arc insertion and add
an arc removal, it is clear that (x, z) ∈ S'∆S.
Moreover, for all pairs of vertices (u, v) ∉ A we can prove that, if there
is no directed path from u to v after applying all arc deletions of an optimal
solution set, then this solution set does not contain (u, v).
Corollary 3.7. Applying optimal solution sets preserves sources and sinks.
Apart from preserving sources and sinks, optimal solution sets also do
not delete arcs from any source to any sink.
Lemma 3.8. Let D = (V, A) be a digraph and V_SRC and V_SNK be the sets
of all sources and sinks in D, respectively. If S is an optimal solution set
for D, then S_DEL ∩ (V_SRC × V_SNK) = ∅.
4 Computational Complexity
In this section, we prove the NP-completeness of Transitivity Editing
and Transitivity Deletion (see Section 2.3). Although it has been
stated that the NP-completeness of Transitivity Editing had been
previously shown [JJK+08], the cited source ([NSS01]) does not prove the
NP-completeness of Transitivity Editing but the NP-completeness of
Comparability Editing. Motivated by the lack of a completeness result, we
show that both Transitivity Editing and Transitivity Deletion are
NP-complete even when restricted to DAGs (see Section 4.2).
Positive-Not-all-equal-3SAT:
Input: A Boolean formula ϕ in n variables x_0, ..., x_{n−1} which
is a conjunction of m clauses C_i, each consisting of three positive
literals.
Each variable cycle has a subpath of eight vertices for each of the m clauses.
As we will see, each clause may cause the fifth vertex of the corresponding
subpath to be connected to other variable cycles, if x_k is one of the variables
of this clause. The collection of all variable cycles is then referred to by (V, A)
with
\[ V := \bigcup_{k=0}^{n-1} V_k, \qquad A := \bigcup_{k=0}^{n-1} A_k. \]
In the following, we refer to the arcs (v^k_0, v^k_1), (v^k_2, v^k_3), ..., (v^k_{8m−2}, v^k_{8m−1}) as
even arcs and all other arcs in the variable cycle as odd arcs. Furthermore,
for each of the m clauses in ϕ, we construct a directed cycle of length three
between the variable cycles of its three variables as shown in Figure 3. These
will be referred to as clause cycles. In particular, for each clause C_i =
(x_{i_0}, x_{i_1}, x_{i_2}), we construct the following clause cycle:
\[ A'_i := \left\{ \left(v^{i_0}_{8i+4}, v^{i_1}_{8i+4}\right), \left(v^{i_1}_{8i+4}, v^{i_2}_{8i+4}\right), \left(v^{i_2}_{8i+4}, v^{i_0}_{8i+4}\right) \right\}. \]
Note that we do not need any vertices other than those in V. Finally,
let D := (V, A ∪ A') denote the resulting digraph.
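The construction can be sketched in code. The following assumes that vertex v^k_j is encoded as the pair (k, j), that each variable cycle consists of the arcs (v^k_j, v^k_{j+1}) with indices taken modulo 8m, and that A' is the union of all clause cycles A'_i. The function names are illustrative, and the budget 2m + 4mn is the parameter used later in the proof.

```python
def build_reduction_digraph(n, clauses):
    """Build (V, A, A_prime) from n variables and a list of clauses.

    Each clause is a triple (i0, i1, i2) of variable indices.
    The pair (k, j) stands for the vertex v^k_j.
    """
    m = len(clauses)
    length = 8 * m  # eight vertices per clause in every variable cycle
    V = {(k, j) for k in range(n) for j in range(length)}
    # Variable cycles (assumed closing arc from v^k_{8m-1} back to v^k_0).
    A = {((k, j), (k, (j + 1) % length)) for k in range(n) for j in range(length)}
    A_prime = set()
    for i, (i0, i1, i2) in enumerate(clauses):
        j = 8 * i + 4  # fifth vertex of the subpath belonging to clause i
        A_prime |= {((i0, j), (i1, j)), ((i1, j), (i2, j)), ((i2, j), (i0, j))}
    return V, A, A_prime

def is_even_arc(arc):
    """Even arcs are (v_0, v_1), (v_2, v_3), ...: the tail has an even index."""
    (_, j), _ = arc
    return j % 2 == 0

def budget(n, m):
    """Parameter of the constructed instance: 4m per variable cycle, 2 per clause."""
    return 2 * m + 4 * m * n

V, A, A_prime = build_reduction_digraph(3, [(0, 1, 2)])
```

For a single clause over three variables, this yields three cycles of eight vertices each and one clause cycle attached at index 4.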
In order to show the correctness of the reduction, we need the following
lemmas.
Lemma 4.3. In order to turn a directed cycle of even length ≥ 4 transitive
without inserting an arc, it is optimal to delete every second arc. Moreover,
this is the only optimal way to do so.
Proof. Let C = (V_C, A_C) denote a directed cycle of length l = 2 · l'.
Suppose S is an optimal solution set for C that does not delete every second
arc. Note that |S| ≤ l', since otherwise the set containing every second
arc of C is a solution set that is smaller than S, contradicting the
optimality of S. Consider all pairs of adjacent arcs (a, b), (b, c) ∈ A_C.
Obviously, (a, c) ∉ A_C. Since S is a solution set for C, we know that (a, b) ∈ S
or (b, c) ∈ S. However, since S does not delete every second arc, there
is some pair of arcs (a', b'), (b', c') that are both in S. Obviously, P :=
(V_C, A_C\{(a', b'), (b', c')}) is a path of l − 2 = 2(l' − 1) arcs. Hence, there
are l' − 1 disjoint P3s in P. Since S\{(a', b'), (b', c')} must be a solution set
for P, we know that |S| − 2 ≥ l' − 1 and thus |S| ≥ l' + 1, which contradicts S
being an optimal solution set for C.
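Lemma 4.3 can be checked by exhaustive search on a small cycle. The sketch below enumerates all minimum-size deletion sets for a directed cycle of length six; it assumes P3s on three distinct vertices, uses illustrative function names, and is a sanity check rather than a proof.

```python
from itertools import combinations, permutations

def is_transitive(vertices, arcs):
    """True iff no P3 (u, v, w) on three distinct vertices exists."""
    return all(
        (u, w) in arcs
        for u, v, w in permutations(vertices, 3)
        if (u, v) in arcs and (v, w) in arcs
    )

def optimal_deletion_sets(vertices, arcs):
    """All minimum-size sets of arcs whose deletion leaves a transitive digraph."""
    arcs = sorted(arcs)
    for size in range(len(arcs) + 1):
        found = [
            set(D) for D in combinations(arcs, size)
            if is_transitive(vertices, set(arcs) - set(D))
        ]
        if found:
            return found

def directed_cycle(length):
    """Arc set of the directed cycle 0 -> 1 -> ... -> length-1 -> 0."""
    return {(i, (i + 1) % length) for i in range(length)}

optima = optimal_deletion_sets(range(6), directed_cycle(6))
```

For length six, the search finds exactly the two alternating deletion sets of size three, matching the statement of the lemma.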
Note that there are two ways to delete every second arc in a variable
cycle: either delete all odd arcs or all even arcs. These two optimal
solutions will represent the truth value of the corresponding variable. If the
variable cycle for x_k is turned transitive by the deletion of all even arcs, x_k
is considered to be assigned true, otherwise false.
Consider a clause cycle. Obviously, a cycle of length three can be turned
transitive with two arc deletions. However, this stamps an asymmetry on
the clause cycle that results in a remaining P3 , if all even arcs of all three
variable cycles are deleted or all odd arcs are. Hence, if this is the case, an
additional arc deletion is required.
Lemma 4.4. For each clause C_i, if a solution set S to the induced subgraph
\[ D\left[ V_{i_0} \cup V_{i_1} \cup V_{i_2} \cup \left\{ v^{i_0}_{8i+4}, v^{i_1}_{8i+4}, v^{i_2}_{8i+4} \right\} \right] \]
then S contains at least 3 · 4m + 3 arcs, 4m arcs for each variable cycle and 3
for the clause cycle.
Figure 4: If all variable cycles adjoin to the clause cycle in the same way,
that is either all even arcs of all variables are deleted (left image), or all
odd arcs of all variables are deleted (right image), then the structure can
neither be turned transitive by removing two arcs, nor by removing an arc
and inserting its opposite arc. Always three operations are required. Bold
arcs symbolize membership in A'. Dashed arcs symbolize deletions.
Figure 5: If the variable cycles adjoin to the clause cycle in different ways
(the left image shows that all odd arcs of the variable cycle of x_{i_1} are deleted
and all even arcs of the variable cycles of the other two variables are deleted,
the right image shows the opposite), then the cycles can be turned transitive
by removing two arcs. Removing an arc and inserting its opposite does not
yield transitive subgraphs. Bold arcs symbolize membership in A'. Dashed
arcs symbolize deletions.
then there is some clause C_i = (x_{i_0}, x_{i_1}, x_{i_2}) with β(x_{i_0}) = β(x_{i_1}) = β(x_{i_2}).
By Lemma 4.4, turning the corresponding clause cycle transitive would
require three operations, contradicting (D, 2m + 4mn) ∈ Transitivity
Editing.
Since Transitivity Editing is in NP and also NP-hard, the NP-
completeness follows.
In the above proof, we never employ arc insertions which implies that it
can be used to prove that Transitivity Deletion is NP-complete.
Figure 6: The variable gadget of x_k. The bold arcs show potential docking
arcs (see Figures 8 and 9), while the additional paths via a^k_j, b^k_j, and c^k_j
ensure that optimally turning this structure transitive requires the deletion
of either (v^k_{0,j}, v^k_1) for each 0 ≤ j ≤ 8m or (v^k_5, v^k_{6,j}) for each 0 ≤ j ≤ 8m.
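with
\[ A^{\mathrm{upper}}_{k,j} := \left\{ \left(v^k_1, v^k_{2,j}\right), \left(v^k_{2,j}, v^k_{3,j}\right), \left(v^k_{3,j}, v^k_{4,j}\right), \left(v^k_{4,j}, v^k_5\right) \right\}, \]
\[ A^{\mathrm{lower}}_{k,j} := \left\{ \left(v^k_1, a^k_j\right), \left(a^k_j, b^k_j\right), \left(b^k_j, c^k_j\right), \left(c^k_j, v^k_5\right) \right\}, \text{ and} \]
\[ A^{\mathrm{outer}}_{k,j} := \left\{ \left(v^k_{0,j}, v^k_1\right), \left(v^k_5, v^k_{6,j}\right) \right\}. \]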
Note that (V, A) is acyclic and diamond-free. The following arc-disjoint P3s
are contained in each variable gadget (V_k, A_k):
1. (v^k_{0,j}, v^k_1, v^k_{2,j}), (v^k_{2,j}, v^k_{3,j}, v^k_{4,j}), (v^k_{4,j}, v^k_5, v^k_{6,j}) for all 0 ≤ j < 3m
2. (v^k_{0,3m+j}, v^k_1, a^k_j), (a^k_j, b^k_j, c^k_j), (c^k_j, v^k_5, v^k_{6,3m+j}) for all 0 ≤ j ≤ 5m
3. (v^k_1, a^k_j, b^k_j), (b^k_j, c^k_j, v^k_5) for all 5m < j ≤ 13m + 1
Observation 4.7. For each variable gadget, at least 40m + 5 operations are
required to turn it transitive.
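The bound of Observation 4.7 presumably follows by counting the arc-disjoint P3s listed above, assuming that each of them needs an operation of its own. Item 1 contributes three P3s for each of 3m indices, item 2 three P3s for each of 5m + 1 indices, and item 3 two P3s for each of 8m + 1 indices:

```latex
\[
  \underbrace{3 \cdot 3m}_{\text{item 1}}
  + \underbrace{3 \cdot (5m + 1)}_{\text{item 2}}
  + \underbrace{2 \cdot (8m + 1)}_{\text{item 3}}
  = 9m + 15m + 3 + 16m + 2
  = 40m + 5 .
\]
```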
Definition 4.8. For each variable x_k, the vertices v^k_1 and v^k_5 are odd
vertices. All other vertices in V are even vertices if they are adjacent to an odd
vertex and odd vertices if they are adjacent to an even vertex. We refer to
an arc (u, v) as odd arc if u is odd, otherwise (u, v) is called even arc.
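The two vertices u^i_{p,0} and u^i_{p,1} are then connected to the variable gadgets,
depending on p:
\[ A'_{i,p,r} := \begin{cases} \left\{ \left(v^{i_r}_{3,3i+p}, u^i_{p,0}\right), \left(v^{i_r}_{4,3i+p}, u^i_{p,1}\right) \right\}, & \text{if } p = r \\ \left\{ \left(v^{i_r}_{2,3i+p}, u^i_{p,0}\right), \left(v^{i_r}_{3,3i+p}, u^i_{p,1}\right) \right\}, & \text{if } p \neq r \end{cases} \]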
With the construction of these clause gadgets, we need fewer operations if one
of the variable gadgets is edited in a different manner than the other two,
which will correspond to one of the variables of C_i being assigned a different
truth value than the other two. This is achieved by choosing different
“docking arcs” for each clause gadget part: The three docking arcs of the
gadget part p of the clause gadget of C_i = (x_{i_0}, x_{i_1}, x_{i_2}) are (see Figure 7)
Furthermore, for each 0 ≤ r < 3, the arc γ_{i_r,3i+p} denotes the arc in A_{i_r}
that is incoming to the vertex that α_{i_r,3i+p} is outgoing from. Note that D
is acyclic and diamond-free. Hence, we can assume that there is an optimal
solution set for D that contains only arc deletions. By the construction of
the clause gadgets it is clear that each gadget part requires at least two
operations to be turned transitive.
Observation 4.9. For each clause gadget, at least six operations are re-
quired to turn it transitive, independent of deletions in the variable gadgets.
Observation 4.10. All three parts of each clause gadget dock over one odd
arc and two even arcs.
Since not all (v^k_{0,j}, v^k_1) and all (v^k_5, v^k_{6,j}) are deleted, it is clear that there
is either some (v^k_{0,j}, v^k_1) or some (v^k_5, v^k_{6,j}) that is not deleted. Hence, either
all (v^k_1, v^k_{2,j}) or all (v^k_{4,j}, v^k_5) must be deleted.
Proof. Suppose the premise is true, that is, p is a gadget part for which
all α_{i_r,3i+p} or all γ_{i_r,3i+p} are deleted. Without loss of generality we assume
all α_{i_r,3i+p} to be deleted. Since for each 0 ≤ r < 3 exactly one of the
arcs α_{i_r,3i+p} and γ_{i_r,3i+p} is deleted, we know that all γ_{i_r,3i+p} are not deleted.
Figure 8 shows that it is possible to turn part p of the clause gadget
corresponding to clause C_i transitive with four operations. As we can also see
in Figure 8, there are four disjoint P3s in (V ∪ V', (A ∪ A')\(S\A'_{i_r,3i+p})).
Hence, four arc deletions are also required.
Suppose the premise is false, that is, p is a gadget part for which there
is some α_{i_s,3i+p} that is not deleted and there is also some γ_{i_t,3i+p} that is not
deleted. Figure 9 shows that it is possible to turn part p of the clause gadget
corresponding to clause C_i transitive with five operations. As we can also
see in Figure 9, there are five disjoint P3s in (V ∪ V', (A ∪ A')\(S\A'_{i_r,3i+p})).
Hence, at least five arc deletions are required.
gadget of x_k are deleted, otherwise all odd arcs are. Since β is a satisfying
assignment, there is no clause whose variables are assigned the same
truth value. Thus, each clause gadget docks to at least one variable gad-
get whose odd arcs are deleted and at least one variable gadget whose even
arcs are deleted. Hence, there is exactly one part of each clause gadget for
which the docking arcs are either all deleted or all not deleted. Lemma 4.14
describes that, under these circumstances, we can turn each clause gadget
transitive with 14 arc deletions and thus, we can turn D transitive with a
total of n · (40m + 5) + 14m arc deletions.
“⇐”: Since (D, n · (40m + 5) + 14m) is a yes-instance, there is some
solution set S to D that is optimal and |S| ≤ n · (40m + 5) + 14m. Furthermore,
since D does not contain a diamond, by Lemma 3.2, we can assume
that S contains only arc deletions. Let D' := (V ∪ V', (A ∪ A')∆S). Let x_k
be a variable of ϕ, let C_i denote a clause, and let p denote a gadget part of
the clause gadget of C_i. By Lemma 4.11, either all (v^k_{0,r}, v^k_1) or all (v^k_5, v^k_{6,r})
with 0 ≤ r ≤ 8m are deleted. Without loss of generality, we assume that
all (v^k_{0,r}, v^k_1) ∈ S and all (v^k_5, v^k_{6,r}) ∉ S for 0 ≤ r ≤ 8m.
In the following, we show that under these circumstances all even arcs of
the variable gadget of x_k are deleted. Since i and p were chosen arbitrarily,
it suffices to show that we can modify S such that it is an optimal solution
set and
\[ S \cap A^{\mathrm{upper}}_{k,3i+p} = \left\{ \left(v^k_{2,3i+p}, v^k_{3,3i+p}\right), \left(v^k_{4,3i+p}, v^k_5\right) \right\}. \tag{1} \]
Obviously, (v^k_{4,3i+p}, v^k_5) must be in S since otherwise (v^k_{4,3i+p}, v^k_5, v^k_{6,0}) is a P3
in (V, A∆S). Furthermore, there must be more than two arcs in S ∩ A^upper_{k,3i+p},
since the remaining P4 (v^k_1, v^k_{2,3i+p}, v^k_{3,3i+p}, v^k_{4,3i+p}) can only be destroyed with
a single arc deletion by deleting (v^k_{2,3i+p}, v^k_{3,3i+p}), implying that S already
satisfies (1). Furthermore, we assume that
\[ \left| S \cap A^{\mathrm{upper}}_{k,3i+p} \right| \neq 4, \]
since otherwise, we can remove one of these four arcs from S without creating
a P3 in D', contradicting the optimality of S. Hence, it is clear that
\[ \left| S \cap A^{\mathrm{upper}}_{k,3i+p} \right| = 3 \]
and thus, exactly one of the arcs in A^upper_{k,3i+p} is not in S. Recall that the
docking arc α_{k,3i+p} of part p of the clause gadget of clause C_i is either the
arc (v^k_{2,3i+p}, v^k_{3,3i+p}) or the arc (v^k_{3,3i+p}, v^k_{4,3i+p}). In the following, we show
that S can be modified without creating a P3 in (V, A∆S) such that (1)
holds. To this end, we consider the following six cases:
then all even arcs of the variable gadget of x_k are deleted. By analogy, it
can be shown that if
\[ \forall\, 0 \le r \le 8m : \left(v^k_5, v^k_{6,r}\right) \in S \]
then all odd arcs of the variable gadget of x_k are deleted. By Lemma 4.11, we
can assume that for each variable gadget, either all even arcs or all odd arcs
are in S. Since |S| ≤ n · (40m + 5) + 14m, Observation 4.7 and Lemma 4.14
imply that for each 0 ≤ i < m,
By Lemma 4.14 and Equation (2), we know that for each clause gadget, there is
some part such that all its docking arcs are deleted or all its docking arcs
are not deleted. By the construction of the clause gadgets, this implies that
the truth values of the three variable gadgets of each clause gadget cannot
be equal. Hence, β is a satisfying assignment for the variables of ϕ.
All in all, the given instance of Positive-NAE-3SAT is a yes-instance,
iff (D, n · (40m + 5) + 14m) is a yes-instance of Transitivity Editing.
The theorem follows.
In the proof, we never employ arc insertions which implies that it can
be used to prove that Transitivity Deletion is NP-complete on DAGs.
5 POLYNOMIAL-TIME DATA REDUCTION 38
Lemma 5.4. Let D = (V, A) be a digraph and let n := |V|. Rule 5.2 can be
implemented to run in O(n^3) time.
Proof. First, we show how it is possible to determine whether a given
vertex v takes part in a P3 in (V, A) in O(n^2) time. A vertex v takes part in
a P3 if one of the following conditions is true:
1. succ_A(succ_A(v)) \ succ_A(v) ≠ ∅.
2. succ_A(v) \ succ_A(pred_A(v)) ≠ ∅.
3. pred_A(pred_A(v)) \ pred_A(v) ≠ ∅.
Since the difference of two sets of size O(n) each can be calculated in O(n)
time and |succ_A(succ_A(v))| ∈ O(n^2), determining whether a given vertex
takes part in a P3 can be done in O(n^2) time. Second, we show that it is
possible to remove a vertex v from the digraph (V, A) in O(n^2) time. For
each vertex a maximum of O(n) arcs have to be removed, and each arc
removal can be done in O(n) time. All in all, it follows that Rule 5.2 can be
implemented in O(n^3) time.
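The test used in the proof, whether a vertex v takes part in a P3, can be sketched as follows. The code checks the three possible positions of v (first, middle, last vertex) directly from the definition of a P3 instead of evaluating the set expressions above. It assumes a loop-free digraph and P3s on three distinct vertices, and the function name is illustrative. Each of the three checks inspects O(n^2) pairs of vertices.

```python
def takes_part_in_p3(v, vertices, arcs):
    """Return True if v is the first, middle, or last vertex of some P3."""
    succ = {x: set() for x in vertices}
    pred = {x: set() for x in vertices}
    for a, b in arcs:
        succ[a].add(b)
        pred[b].add(a)
    # v is the first vertex: v -> y -> w with the arc (v, w) missing.
    first = any(w != v and w not in succ[v] for y in succ[v] for w in succ[y])
    # v is the middle vertex: u -> v -> w with the arc (u, w) missing.
    middle = any(u != w and w not in succ[u] for u in pred[v] for w in succ[v])
    # v is the last vertex: u -> y -> v with the arc (u, v) missing.
    last = any(u != v and u not in pred[v] for y in pred[v] for u in pred[y])
    return first or middle or last
```

On the path 1 → 2 → 3 with an additional isolated vertex 4, the vertices 1, 2, and 3 take part in a P3 while 4 does not.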
Theorem 5.5. Let D be a digraph that is reduced with respect to Rule 5.2
and let δ denote the maximum degree of D. If (D, k) is a yes-instance of
Transitivity Editing, then D contains at most 2k · (δ + 1) vertices.
denote the set of vertices in V that are affected by modifying the arc (u, v).
In the following, we prove that
\[ \bigcup_{(u,v) \in S} R_{(u,v)} = V \]
and
\[ \sum_{(u,v) \in S} \left| R_{(u,v)} \right| \le 2k(\delta + 1). \]
Figure 12: Examples for the application of Rule 5.6. Left: If (u, v) ∉ A
and |Z| > k then insert (u, v) into D. Right: If (u, v) ∈ A and |Z_u| + |Z_v| > k
then delete (u, v) from D. Note that x ∉ Z_u and y ∉ Z_v.
Theorem 5.5 encourages the thought that the complexity of the problem
is partially related to the degree of the given digraph.
The following reduction rule follows an idea of Gramm et al. [GGHN05]
for the Cluster Editing problem: If there is some arc (a, b) in the given
digraph such that, if (a, b) is not modified, then each solution set must
contain more than k other arcs, then, in order for the solution set to contain
at most k arcs, (a, b) has to be modified. An example for the rule can be
found in Figure 12.
2. Let (u, v) ∈ A,
Lemma 5.7. Rule 5.6 causes an arc insertion or an arc deletion iff this
operation destroys more than k P3s in D.
Proof. “⇒”: Suppose that the application of Rule 5.6 to D causes an arc
insertion. Thus, there is a pair (u, v) ∈ V × V with (u, v) ∉ A and |Z| > k
for Z := succ_A(u) ∩ pred_A(v). For all w ∈ Z, this operation destroys
the P3 (u, w, v) in D.
If the application of Rule 5.6 to D causes an arc deletion, then there
is a pair (u, v) ∈ A with |Z_u| + |Z_v| > k for Z_u := pred_A(u) \ pred_A(v)
and Z_v := succ_A(v) \ succ_A(u). For all w ∈ Z_u and z ∈ Z_v, this operation
destroys the P3s (w, u, v) and (u, v, z) in D. Thus, |Z_u| + |Z_v| > k P3s in D
are destroyed in total.
“⇐”: Suppose the insertion of the arc (u, v) destroys more than k P3s
in D, hence (u, v) ∉ A and there is a set Z := {w ∈ V | (u, w, v) is a P3 in D}
with |Z| > k. Hence
and thus
∀ w ∈ Z: w ∈ succ_A(u) ∧ w ∈ pred_A(v)
which implies
Z ⊆ succ_A(u) ∩ pred_A(v).
Since |Z| > k, Rule 5.6 applies.
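The sets Z, Z_u, and Z_v from this proof translate directly into a sketch of one application of Rule 5.6. The function name, the return convention, and the explicit exclusion of v from Z_u and of u from Z_v (a guard against 2-cycles) are assumptions made for illustration; the decrease of k by one follows the instance (D*, k − 1) used in the correctness proof.

```python
def apply_rule_5_6(vertices, arcs, k):
    """Apply Rule 5.6 once; return (new_arcs, k - 1), or None if it does not apply."""
    arcs = set(arcs)
    succ = {x: set() for x in vertices}
    pred = {x: set() for x in vertices}
    for a, b in arcs:
        succ[a].add(b)
        pred[b].add(a)
    for u in vertices:
        for v in vertices:
            if u == v:
                continue
            if (u, v) not in arcs:
                # Inserting (u, v) destroys the P3s (u, w, v) for all w in Z.
                Z = succ[u] & pred[v]
                if len(Z) > k:
                    return arcs | {(u, v)}, k - 1
            else:
                # Deleting (u, v) destroys the P3s (w, u, v) and (u, v, z).
                Z_u = pred[u] - pred[v] - {v}  # "- {v}" is an added guard
                Z_v = succ[v] - succ[u] - {u}  # "- {u}" is an added guard
                if len(Z_u) + len(Z_v) > k:
                    return arcs - {(u, v)}, k - 1
    return None
```

With k = 2, for example, three distinct paths of two arcs from u to v without the arc (u, v) force its insertion.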
Proof. Let (D∗ , k−1) with D∗ = (V, A∗ ) denote the instance that is obtained
by applying Rule 5.6 to the given instance (D, k) with D = (V, A). Further-
more, let {(a, b)} = A∆A∗ , that is, applying Rule 5.6 modifies arc (a, b).
“⇐”: Suppose (D∗ , k − 1) is a yes-instance of Transitivity Editing.
Let S ∗ denote a solution set for D∗ with |S ∗ | ≤ k − 1. Obviously, S :=
S ∗ ∪ {(a, b)} is a solution set for D and |S| = |S ∗ | + 1 ≤ k.
“⇒”: Let (D, k) be a yes-instance of Transitivity Editing. In the
following, we show that all solution sets S for D with |S| ≤ k contain (a, b).
For the sake of contradiction, we assume that there is a solution set S for D
with |S| ≤ k and (a, b) ∉ S. Lemma 5.7 implies that modifying (a, b)
destroys more than k different P3s in D. Let p_0, ..., p_m with m ≥ k denote
these P3s. For each 0 ≤ i ≤ m, the P3 p_i must contain a and b in order
to be destroyed by modifying (a, b). Furthermore, p_i must contain a third
vertex c_i that is different from a and b. Obviously, all c_i must be pairwise
different, otherwise the P3s would not all be different. Since S is a solution
set for D, we know that for each 0 ≤ i ≤ m, it must contain one of the
arcs (a, c_i), (b, c_i), (c_i, a), (c_i, b), or (a, b). However, (a, b) ∉ S and
thus |S| ≥ m + 1 ≥ k + 1, contradicting S being a solution set for D with |S| ≤ k.
Lemma 5.9. Let D = (V, A) be a digraph and let n := |V|. Rule 5.6 can be
executed in O(n^3) time.
Proof. We show that, given a pair of vertices, we can execute Rule 5.6
in O(n) time. Let u, v ∈ V. If (u, v) ∉ A, then we need to calculate
succ_A(u) ∩ pred_A(v), which can be done in O(n) time. If (u, v) ∈ A, then we
need to calculate pred_A(u) \ pred_A(v) and succ_A(v) \ succ_A(u), which can
also be done in O(n) time. Of course, we can determine whether the sizes of
the resulting sets exceed k in constant time. Obviously, inserting or
deleting (u, v) can be done in O(n) time as well. Since there are O(n^2) pairs
of vertices, the overall running time of O(n^3) follows.
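To make the rule concrete, the following is a minimal Python sketch of a single application of Rule 5.6, written under our own assumptions rather than taken from the implementation used for the experiments. The function name apply_rule_5_6 and the representation of A as a set of ordered pairs are hypothetical. As a further assumption, v is excluded from Z_u and u from Z_v, so that a 2-cycle between u and v is not counted as a P3.

```python
def apply_rule_5_6(vertices, arcs, k):
    """Try to apply Rule 5.6 once.

    vertices: list of vertex names; arcs: (frozen)set of ordered pairs.
    Returns (new_arcs, new_k, changed).
    """
    succ = {x: set() for x in vertices}
    pred = {x: set() for x in vertices}
    for a, b in arcs:
        succ[a].add(b)
        pred[b].add(a)

    for u in vertices:
        for v in vertices:
            if u == v:
                continue
            if (u, v) not in arcs:
                # Every w in Z yields a P3 (u, w, v) that the insertion destroys.
                z = succ[u] & pred[v]
                if len(z) > k:
                    return arcs | {(u, v)}, k - 1, True
            else:
                # P3s (w, u, v) and (u, v, z) that the deletion destroys.
                # Assumption: v and u themselves are excluded (2-cycles).
                z_u = pred[u] - pred[v] - {v}
                z_v = succ[v] - succ[u] - {u}
                if len(z_u) + len(z_v) > k:
                    return arcs - {(u, v)}, k - 1, True
    return arcs, k, False
```

Each pair is handled with a constant number of set operations on sets of size O(n), which matches the O(n) bound per pair used in the proof of Lemma 5.9.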
With Rules 5.2 and 5.6 established, we look at the size of the remaining
instance. In the following, we show a kernel of O(k^2) vertices.
X := V \Y
Note that all vertices in X are adjacent to at least one vertex in Y because D
is reduced with respect to Rule 5.2. Also note that in order to destroy a P3 p
in D, the solution set S must contain an arc incident to two of the vertices
of p, hence for each P3 p in D at most one of the vertices of p is in X.
Since D can be turned transitive with at most k operations, we know
that |S| ≤ k and consequently |Y| ≤ 2k. Obviously |V| = |X| + |Y|, hence
the assumption that |V| > k(k + 2) implies |X| > k^2. With the above
observation, it follows that there are more than k^2 P3s in D.
For each operation (a, b) ∈ S, let
    ∀ (a, b) ∈ S : q ∉ Z_(a,b)
and thus
    q ∉ ⋃_{(a,b)∈S} Z_(a,b).
A lower bound for k. An important topic, not only for kernelization but
also for the implementation of the search tree algorithm, is to find a lower
bound for the size of a solution set. In the following, we present a train of
thought that enables us to find a lower bound for the number of deletions
needed to turn a given digraph transitive: The main idea is that two P3s that
do not interfere cannot be destroyed by a single operation and, hence, if a
given digraph can be turned transitive with at most k operations, then it
cannot contain more than k disjoint P3s. In this context, recall that two P3s
are called disjoint if they share at most one vertex (see Section 2.1).
Proof. Suppose there is an optimal solution set S′ for D with |S′| < |I|.
Consider some operation (a, b) ∈ S′ that destroys a P3 p = (u, v, w) in I.
Since p is destroyed by the operation, we know that (a, b) ∈ {(u, v), (v, w), (u, w)}.
However, since I is an independent set, we also know that all P3s represented
by vertices in I are disjoint. Hence, each q ∈ I \ {p} shares at most one
vertex with p, and thus there is no q ∈ I \ {p} that can be destroyed by the
operation (a, b). Since (a, b) is arbitrary, we know that no arc in S′ can
destroy more than one P3 represented by a vertex in I and thus the theorem follows.
The lower bound for the size of solution sets can be used when travers-
ing the search tree: If a subtree is discovered that requires turning a di-
graph D transitive with at most k operations and D is known to require
more than k operations, then we can skip over the subtree in the search tree
algorithm, thus saving time. Furthermore, it can be used in conjunction
with Rule 5.6:
Rule 5.13. Given an instance (D, k) with D = (V, A). Let I denote an
independent set of the P3-conflict graph of D.
Furthermore, let
2. Let (u, v) ∈ A,
Since the size of any independent set of C_D is a lower bound for the size
of a solution set for D, the best such lower bound is of course the size of a
maximum independent set of C_D. However, since the problem of obtaining a
maximum independent set is NP-hard, we may limit ourselves to finding a fairly
good independent set. This can be done by calculating a maximal matching M
of vertices in C_D¹ (all vertices of C_D that are not in M form an independent
set). In practice, it appears that the lower bound for the size of the
solution sets can be improved significantly by not just picking any maximal
matching of C_D but finding a small maximal matching, yielding a larger
independent set. By conventional means, finding a maximal matching in
the P3-conflict graph may take O(n^6) time, since there may be O(n^3) P3s in
the given digraph. In the following, we present an approach that computes
a large independent set of the P3-conflict graph C_D in O(n^3 log n) time
(see Algorithm 2).
There are two key observations: First, each pair of vertices (u, v) ∈ V^2
causes a clique in C_D. The vertices of this clique are all P3s that contain u
and v, since these P3s cannot be disjoint. We refer to these cliques as
arc-induced cliques. Second, each P3 can be contained in at most three
arc-induced cliques. For each pair of vertices, its clique can be found in O(n)
time. Since there are O(n^2) pairs of vertices, we can find all arc-induced
cliques in the P3-conflict graph in O(n^3) time. The next step is to sort
all P3s by the number of arc-induced cliques of size ≥ 2 they are contained
in and the sum of the sizes of all arc-induced cliques they are contained in,
biased by the former. This can be done in O(n^3 log n) time. In the last step,
we take P3s one by one in ascending order and insert their arcs into a set of
forbidden arcs. If one of the arcs of a P3 is in the set of forbidden arcs, then
we simply discard it, otherwise, it is joined into the set of disjoint P3s. Since
there are up to n^3 P3s in the given digraph, this can take up to O(n^3 log n)
time. Thus, the overall running time is bounded by O(n^3 log n).
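The steps described above can be sketched in Python as follows. This is a simplified illustration under our own assumptions and not a transcription of Algorithm 2: the names list_p3s and greedy_disjoint_p3s are hypothetical, a P3 (u, v, w) is assumed to consist of three distinct vertices, vertices are assumed to be sortable, and hash sets replace the ordered sets that account for the logarithmic factor in the analysis.

```python
from itertools import combinations


def list_p3s(vertices, arcs):
    """All P3s (u, v, w): (u, v) and (v, w) are arcs, (u, w) is not, u != w."""
    succ = {x: set() for x in vertices}
    for a, b in arcs:
        succ[a].add(b)
    return [(u, v, w)
            for u in vertices for v in succ[u] for w in succ[v]
            if w != u and (u, w) not in arcs]


def greedy_disjoint_p3s(vertices, arcs):
    """Greedily pick P3s that pairwise share at most one vertex."""
    p3s = list_p3s(vertices, arcs)

    # Size of the arc-induced clique of each unordered vertex pair,
    # that is, the number of P3s containing both vertices of the pair.
    clique_size = {}
    for p in p3s:
        for pair in combinations(sorted(p), 2):
            clique_size[pair] = clique_size.get(pair, 0) + 1

    def sort_key(p):
        sizes = [clique_size[pair] for pair in combinations(sorted(p), 2)]
        # Primary: number of cliques of size >= 2; secondary: sum of sizes.
        return (sum(1 for s in sizes if s >= 2), sum(sizes))

    forbidden = set()   # vertex pairs already used by a chosen P3
    chosen = []
    for p in sorted(p3s, key=sort_key):
        pairs = set(combinations(sorted(p), 2))
        if pairs & forbidden:
            continue    # shares two vertices with a chosen P3
        forbidden |= pairs
        chosen.append(p)
    return chosen
```

The length of the returned list is a lower bound for the size of any solution set, so a search tree node with a smaller remaining budget k can be discarded immediately.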
¹ Note that the smaller the maximal matching, the larger the independent set.
Unfortunately, the MinMax-Matching problem is NP-hard [YG80].
Proof. Let I denote the set that is returned by Algorithm 2 and suppose I is
not an independent set of C_D. Then there are two P3s p and q in I that are
connected in C_D. Thus, by definition of C_D, the P3s p and q are not disjoint.
Without loss of generality, we assume that p precedes q in the set P after
it has been sorted in Line 11. Then, however, all arcs of p are inserted into
the set called “forbidden arcs” before q is considered and, since p and q are
not disjoint, q cannot be inserted into the resulting set, a contradiction.
It remains to show that the algorithm runs in O(n^3 log n) time. As
sketched in the above text, processing the first loop takes at most O(n^3)
time, since there are at most n^2 pairs of vertices and for each pair, there
are O(n) P3s that contain the pair. The set of all P3s in D can be calculated
and sorted in O(n^3 log n) time because there cannot be more than n^3
different P3s in D. Since |P| ∈ O(n^3) and the insertion of p into the set of
disjoint P3s may take up to O(log n) time, the final loop runs in O(n^3 log n)
time. All in all, the running time of the algorithm does not exceed O(n^3 log n).
Rule 5.16. Let D = (V, A) be a digraph and v some vertex that is not part
of the belt of a diamond. Let V_SRC denote the set of all sources in D, let
    R := pred_A(v) ∩ V_SRC ∩ {r | ∃ u ∈ succ_A(v) : (r, u) ∉ A},
and let
    T := succ_A(v) ∩ {t | ∃ r ∈ R : (r, t) ∉ A}.
Furthermore, for each T′ ⊆ T, let
    R_T′ := R ∩ ⋂_{t∈T′} pred_A(t)
denote the set of all vertices in R that are predecessors of all vertices in T′.
If |T′| + |R_T′| ≤ |R| for all T′ ⊆ T, then delete all arcs in {v} × T and
modify k accordingly.
Figure 13: Rule 5.16: If all vertices in R are sources and we know that
deletion is optimal and the indegree of v is greater than its outdegree, then
deleting all arcs leaving v is at least as good as any other solution.
Note that it is unclear whether checking for |T′| + |R_T′| ≤ |R| for all T′ ⊆ T
is possible in polynomial time. However, it is possible to determine in
polynomial time whether a condition that implies |T′| + |R_T′| ≤ |R| for
all T′ ⊆ T is true.
Obviously, |T′| ≤ |T|. Furthermore,
    |R_T′| ≤ max_{t∈T′} |R_{t}| ≤ max_{t∈T} |R_{t}|.
The following rule is a special case of Rule 5.16 that can be executed in
polynomial time.
Rule 5.18. Let D = (V, A) be a digraph and v some vertex that is not part
of the belt of a diamond. Let V_SRC denote the set of all sources in D, let
    R := pred_A(v) ∩ V_SRC ∩ {r | ∃ u ∈ succ_A(v) : (r, u) ∉ A},
and let
    T := succ_A(v) ∩ {t | ∃ r ∈ R : (r, t) ∉ A}.
Furthermore, for each t ∈ T, let R_t := R ∩ pred_A(t). If |T| + max_{t∈T} |R_t| ≤ |R|,
then delete all arcs in {v} × T and modify k accordingly.
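The precondition of Rule 5.18 only needs the sets R, T, and R_t, which can be computed directly from the adjacency sets. The following Python sketch illustrates this under our own assumptions: the function name rule_5_18_targets is hypothetical, a source is taken to be a vertex without predecessors, and the caller is assumed to have verified already that v is not part of the belt of a diamond.

```python
def rule_5_18_targets(vertices, arcs, v):
    """Return the set T if Rule 5.18 applies to v, else an empty set.

    If the result is non-empty, the rule deletes all arcs in {v} x T.
    Assumption: v is known not to be part of the belt of a diamond.
    """
    succ = {x: set() for x in vertices}
    pred = {x: set() for x in vertices}
    for a, b in arcs:
        succ[a].add(b)
        pred[b].add(a)

    sources = {x for x in vertices if not pred[x]}
    # Sources preceding v that miss an arc to at least one successor of v.
    R = {r for r in pred[v] & sources
         if any((r, u) not in arcs for u in succ[v])}
    # Successors of v that miss an arc from at least one vertex of R.
    T = {t for t in succ[v] if any((r, t) not in arcs for r in R)}
    if not T:
        return set()

    largest = max(len(R & pred[t]) for t in T)   # max over |R_t|
    return T if len(T) + largest <= len(R) else set()
```

For a vertex v with three source predecessors and a single successor t that none of the sources points to, the sketch returns {t}: deleting (v, t) costs one operation, whereas keeping it would force three modifications.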
By Lemma 5.17, it is clear that the preconditions of Rule 5.18 imply the
preconditions of Rule 5.16. Thus, if Rule 5.16 is correct, then Rule 5.18 is
also correct.
It is not hard to see that Rule 5.16 can be modified such that it works
with sinks instead of sources. All lemmas in this section are true for both
Rules 5.16 and 5.19, but the proofs for Rule 5.19 are omitted since they are
completely analogous.
Rule 5.19. Let D = (V, A) be a digraph and v some vertex that is not part
of the belt of a diamond. Let V_SNK denote the set of all sinks in D, let
    R := succ_A(v) ∩ V_SNK ∩ {r | ∃ u ∈ pred_A(v) : (u, r) ∉ A},
and let
    T := pred_A(v) ∩ {t | ∃ r ∈ R : (t, r) ∉ A}.
Furthermore, for each t ∈ T, let R_t := R ∩ succ_A(t). If |T| + max_{t∈T} |R_t| ≤ |R|,
then delete all arcs in T × {v} and modify k accordingly.
Before proving the correctness of the reduction rules, we need the fol-
lowing lemma.
Lemma 5.20. Let D, R, T, R_T′ and v be as described in Rule 5.16. Let S
be an optimal solution set for D that does not contain any arc insertions
between R and T. Furthermore, let |T′| + |R_T′| ≤ |R| for all T′ ⊆ T. Then
    |({v} × T) \ S| ≤ |(R × {v}) ∩ S|.
Proof. Let T_S := succ_{A\S}(v) ∩ T denote the set of vertices of T that are
successors of v in (V, A\S). Obviously, for each t ∈ T_S, there is some
r ∈ R \ R_{T_S} such that (r, v, t) is a P3 in D. Since S is a solution set for D
that does not delete any arc (v, t) with t ∈ T_S, we know that
    (R \ R_{T_S}) × {v} ⊆ S.
Hence,
(X × T ) ∩ A = ∅. (4)
denote the set of all vertices that are not in X ∪ T ∪ {v} but are reachable
from some z ∈ Z in D′. Let X_Z := {x ∈ X^+ | reach*_{D′}({x}) ∩ Z ≠ ∅} denote
the set of all vertices in X^+ from which a vertex in Z can be reached. There
are a number of interesting facts to notice about this modified reachability
function:
Figure 14: An overview of most of the defined sets of the proof for
Lemma 5.21. Not all arcs are drawn here.
    S^+_INS := (R × Y) ∩ A ∪ S
The set S^+ := S^+_DEL ∪ S^+_INS then contains all arc modifications that are
added to those in S. Furthermore, consider the following sets.
    S^-_DEL := (R × Y) ∩ S_DEL
    S^-_INS := (((R ∪ Y) × T) ∩ S_INS) ∪ ({v} × (V\Y) ∩ S_INS)
The set S^- := S^-_DEL ∪ S^-_INS then contains all arc modifications that are
removed from those in S. In this context, note that the sets S^+_DEL, S^+_INS,
S^-_DEL, and S^-_INS are pairwise disjoint. Furthermore, S^- ⊆ S and
S^+ ∩ S = ∅. Finally, the set S′ is constructed by
    S′ := (S ∪ S^+) \ S^-.
Figure 15: Visualization of the sets X^t_INS and Y_t in the proof of Lemma 5.21.
Dotted arcs are inserted by S.
Note that (9) follows directly from Lemma 5.20. For the following proofs,
recall that |Z| ≤ |X_Z| for each Z ⊆ reach*_{D′}(X^+), and that the
function reach*_{D′}() is linear.
In the following, we show that
    |((Y \ {v}) × T) ∩ S^+_DEL| ≤ |S^-_INS|.
By (4), we know that ((Y \ {v}) × T) ∩ S^+_DEL = (reach*_{D′}(X^+) × T) ∩ S^+_DEL.
Thus, it suffices to show that
    |(reach*_{D′}(X^+) × T) ∩ S^+_DEL| ≤ |(X^+ × T) ∩ S_INS|.    (10)
To this end, we show that, for each t ∈ T, the solution set S contains
more arc insertions from X^+ to t than there are arcs from reach*_{D′}(X^+)
to t. For each t ∈ T, consider its predecessors in D′. Especially, consider
those predecessors of t that are either in X^+ or in reach*_{D′}(X^+). More
formally, these two sets are denoted by X^t_INS := pred_{A∆S}(t) ∩ X^+ and
Y_t := pred_{A∆S}(t) ∩ reach*_{D′}(X^+), respectively.
Figure 16: Visualization of the sets X^r_DEL and Y^r in the proof of Lemma 5.21.
Dashed arcs are deleted by S.
    ⇒ (x, t) ∈ S_INS
    ⇒ x ∈ X^t_INS.
Thus, for all t ∈ T, it is clear that X_{Y_t} ⊆ X^t_INS, which implies
|Y_t| ≤ |X_{Y_t}| ≤ |X^t_INS| and thus |Y_t × {t}| ≤ |X^t_INS × {t}|. By the
definition of X^t_INS and Y_t, it is clear that
Y_t × {t} = (reach*_{D′}(X^+) × {t}) ∩ (A∆S) and
X^t_INS × {t} = (X^+ × {t}) ∩ S_INS. This implies that
Obviously, all sets on the left hand side are pairwise disjoint and so are all
sets on the right hand side. Thus, we know the sizes of their respective
unions. It is not hard to see that (10) follows.
In the following, we show that
    |S^+_INS| ≤ |(R × (Y \ {v})) ∩ S^-_DEL|.
To this end, we show for each r ∈ R that S contains more arc deletions
from r to X^+ than insertions are needed to complete {r} × reach*_{D′}(X^+) in D′.
For each r ∈ R, consider its successors in D′. Especially, consider those
successors of r that are either in X^+ or in reach*_{D′}(X^+). More formally,
these two sets are denoted by X^r_DEL := succ_{A∆S}(r) ∩ X^+ and
Y^r := succ_{A∆S}(r) ∩ reach*_{D′}(X^+), respectively. See Figure 16 for a
visualization. Since D′ is transitive, we know that for all r ∈ R and all x ∈ X^+
    x ∉ X^r_DEL ⇒ (r, x) ∈ A∆S
                ⇒ ∀ y ∈ reach*_{D′}({x}) : (r, y) ∈ A∆S
                ⇒ ∀ y ∈ reach*_{D′}({x}) : y ∉ Y^r
                ⇒ x ∉ X_{Y^r}.
Obviously, all sets on the left hand side are pairwise disjoint and so are all
sets on the right hand side. Thus, we know the sizes of their respective
unions. It is not hard to see that (11) follows.
Altogether, we have proved that S′ is also an optimal solution set for D
and that S′ contains all (v, t) ∈ A with t ∈ T. Thus, there is an optimal
solution set for D that deletes all arcs in {v} × T. The correctness of
Rule 5.16 follows.
Although the application of Rule 5.16 causes only arc deletions, it is not
obvious that it can be applied for Transitivity Deletion as well. The
proof presented above allows arc insertions in the original solution set S and
relies on arc insertions in the constructed solution S′. Hence, the following
proof is needed to apply Rule 5.16 to instances of Transitivity Deletion.
Proof. Let S denote an optimal solution set for D with S = S_DEL and
let D′ := (V, A\S). As in the proof for Lemma 5.21, let X := succ_A(v) \ T
and X^+ := succ_{A\S}(v). Note that, under this circumstance, we know that
    succ_{A\S}(X^+) ⊆ X^+.    (12)
In the following, we show that S′ := (S ∪ S^+) \ S^- with
    S^- := R × Y
and
    S^+ := {v} × T
is an optimal solution set for D that deletes all arcs in {v} × T. In this
context, note that
    R × Y ⊆ A \ S′.    (13)
Suppose S′ was not a solution set for D. Then there is some P3 (a, b, c)
in (V, A\S′) that is not in (V, A\S), and hence (a, b) ∈ S^-, (b, c) ∈ S^-,
or (a, c) ∈ S^+.
Case 1: (a, b) ∈ R × Y.
By (12), it is obvious that c ∈ X^+. Then, however, (13) contradicts (a, b, c)
being a P3 in (V, A\S′).
Case 2: (b, c) ∈ R × Y.
Since R is a set of sources and S′ contains only arc deletions, it is clear
that b is a source in (V, A\S′), contradicting (a, b, c) being a P3 in (V, A\S′).
Case 3: (a, c) ∈ {v} × T.
Since a = v, it is clear that b ∈ X^+. This implies that there is some r ∈ R
with (r, b) ∈ A and thus, (r, a, c) and (r, b, c) are P3s in D, contradicting the
diamond constraint.
Suppose S′ was not optimal, that is, |S′| > |S|. Since S^- ⊆ S and
S^+ ∩ S = ∅, this implies |S^+| > |S^-|. However, by Lemma 5.20, we know that
    |S^-| ≥ |(R × {v}) ∩ S| ≥ |({v} × T) \ S| = |S^+|,
contradicting |S^+| > |S^-|.
Definition 5.23. For a digraph D = (V, A), the set of all P3 middles is
denoted by
    M_D := {b | ∃ a, c ∈ V : (a, b, c) is a P3 in D}.
Lemma 5.24. Let D = (V, A) be a digraph and a ∈ V. The vertex a is not
in M_D iff
    pred_A(a) × succ_A(a) ⊆ A.    (14)
Proof. Let (x, a, y) denote a path in D. The following statements are equivalent:
• (x, a, y) is a P3 in D
• (x, y) ∉ A
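The characterization of Lemma 5.24 translates directly into a membership test. The following Python sketch computes the set of P3 middles; the function name p3_middles is hypothetical, and we assume that a P3 consists of three distinct vertices, so a 2-cycle x → b → x does not make b a middle.

```python
def p3_middles(vertices, arcs):
    """Return all vertices b that are the middle of some P3 (a, b, c)."""
    succ = {x: set() for x in vertices}
    pred = {x: set() for x in vertices}
    for a, b in arcs:
        succ[a].add(b)
        pred[b].add(a)

    # b is a middle iff some pair in pred(b) x succ(b) is not an arc.
    return {b for b in vertices
            if any(a != c and (a, c) not in arcs
                   for a in pred[b] for c in succ[b])}
```

For a vertex without predecessors or without successors the product pred_A(b) × succ_A(b) is empty, so the condition of Lemma 5.24 holds trivially and the vertex is never reported as a middle.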
Obviously, no source or sink is a P3 middle, that is, all sources and sinks
are in V \ M_D. Thus, we can expect a larger set R for the application of
Rule 5.16 if R is built from V \ M_D instead of V_SRC. Hence, it is more likely
to find some vertex v that the rule is applicable to.
Lemma 5.25. Rule 5.16 is sound for Transitivity Deletion, even with
    R := pred_A(v) ∩ (V \ M_D) ∩ {r | ∃ u ∈ succ_A(v) : (r, u) ∉ A}.
Proof. Let all notations except R and S′ be as in the proof for Lemma 5.22.
In the following, we show that S′ := (S ∪ S^+) \ S^- with
    S^- := (R ∪ pred_{A\S}(R)) × Y
and
    S^+ := {v} × T
is an optimal solution set for D that deletes all arcs in {v} × T. By
construction of S′ and (12), it is clear that
    succ_{A\S′}(Y) ⊆ Y.    (15)
As in the proof for Lemma 5.22, the fact that |S′| ≤ |S| follows directly
from Lemma 5.20. For the sake of contradiction, we assume that (a, b, c) is
a P3 in (V, A\S′) but not in (V, A\S).
Case 1: (a, b) ∈ S^-.
Obviously, b ∈ Y and thus, (15) implies c ∈ Y. Hence, by (16), we know
that (a, c) ∈ A\S′, a contradiction.
Case 2: (b, c) ∈ S^-.
Case 2.1: b ∈ R.
Since (a, b) ∉ S^-, it is clear that (a, b) ∈ A\S. Thus, a ∈ pred_{A\S}(R) and
by (16) we know that (a, c) ∈ A\S′, a contradiction.
Case 2.2: b ∈ pred_{A\S}(R).
Then, however, there must be some r ∈ R with (b, r) ∈ A\S. Since (a, b) ∉ S^-,
it is clear that (a, r) ∈ A\S, implying a ∈ pred_{A\S}(R) and thus, by (16),
it is clear that (a, c) ∈ A\S′, a contradiction.
Case 3: (a, c) ∈ S^+.
Obviously, a = v and c ∈ T. Hence, (15) implies b ∈ Y and thus, c ∈ Y,
contradicting c ∈ T.
Figure 17: A digraph that is reduced with respect to all presented rules.
Note that d = (k − 1)/2 and |Q_i| = k − 1 for all modules i. Its size is O(k^2).
in total. Rule 5.19 does not apply to v_i since in this case T = {b, u_i},
R = Q_i ∪ {w_i}, and max_{t∈T} |R_t| = |Q_i|. Hence,
    |T| + max_{t∈T} |R_{t}| = |{b, u_i}| + |Q_i| = k + 1 > k = |R|.
Rule 5.18 does not apply to (V, A) since the only sources in this construction
are the vertices u_i. However, since in this case T = Q_i ∪ {w_i} and R = {u_i},
it is clear that
    |T| + max_{t∈T} |R_{t}| = |Q_i| + 1 = k > 1 = |R|.
Since this set contains at most (k − 1)/2 vertices, Rule 5.6 does not apply
to (a, b). Furthermore, the indegrees of a and b are 1 and the indegree of
each vi is 2. Thus, for each module i, Rule 5.6 does not apply to any arc
in {a, b, vi } × Qi and (vi , wi ). For each module i, it is clear that
and thus, Rule 5.6 does not apply to (b, vi ). Finally, Rule 5.6 does not apply
to (ui , vi ) for each module i, since ui is a source and |succA (vi )| = k. All
in all, the construction is reduced with respect to the mentioned rules. In
the following, we consider the size of the construction. With d = (k − 1)/2
and |Q_i| = k − 1 for all modules i, we can calculate the number of vertices
    |V| = |V_b| + Σ_{i=0}^{d−1} |V_i^m|
        = 2 + Σ_{i=0}^{d−1} (|Q_i| + 3)
        = 2 + d · (k − 1 + 3)
        = 2 + (k − 1)/2 · (k + 2)
        = (k^2 + k + 2)/2.
Hence, the construction, which is reduced with respect to all presented rules,
contains O(k^2) vertices (see Figure 18) and thus, the kernel size does not
improve asymptotically. However, since the rules do provide additional data
reduction in polynomial time, it is, for practical purposes, justified to apply
them to the given instance.
Deletion problem, to check whether (V, A ∪ {(u, w)}) can be turned
transitive with ≤ k − 1 operations may be omitted. In this case, the worst-case
running time is 2^k · poly(n). Let us have a closer look at the polynomial
factor. Finding a P3 in a given digraph can take O(n^3) time. However, if
we know which arc to modify, we can calculate a list of P3s that contain
this arc in O(n) steps, since each arc can be contained in at most n P3s.
The idea is to keep a set of P3s in D while branching. After each arc
modification, this set must be updated. By the above observation and the fact
that set insertions may take logarithmic time, we arrive at a running time
of O(n log n). Also, the initial calculation of the P3-set takes O(n^3) time.
However, the task of finding a P3 can then be solved in constant time. All
in all, Transitivity Deletion can be solved in O(2^k · n log n + n^3) time
and Transitivity Editing can be solved in O(3^k · n log n + n^3) time.
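The branching strategy for Transitivity Deletion can be illustrated by a short Python sketch. It is deliberately naive and rests on our own assumptions: the function names find_p3 and transitivity_deletion are hypothetical, a P3 is assumed to consist of three distinct vertices, and a P3 is searched from scratch in every node instead of maintaining the P3-set described above, so the polynomial factor is worse than O(n log n).

```python
def find_p3(arcs):
    """Return some P3 (u, v, w) of the digraph, or None if it is transitive."""
    succ = {}
    for a, b in arcs:
        succ.setdefault(a, set()).add(b)
    for u, v in arcs:
        for w in succ.get(v, ()):
            if w != u and (u, w) not in arcs:
                return (u, v, w)
    return None


def transitivity_deletion(arcs, k):
    """Return a set of at most k arcs whose deletion makes the digraph
    transitive, or None if no such set exists."""
    p3 = find_p3(arcs)
    if p3 is None:
        return set()        # already transitive
    if k == 0:
        return None         # a P3 remains, but the budget is used up
    u, v, w = p3
    # Without insertions, a P3 can only be destroyed by deleting
    # one of its two arcs, which yields two branches.
    for arc in ((u, v), (v, w)):
        sub = transitivity_deletion(arcs - {arc}, k - 1)
        if sub is not None:
            return sub | {arc}
    return None
```

Calling the function with k = 0, 1, 2, ... until it succeeds yields an optimal solution set. A third branch that inserts (u, w) would turn the sketch into the straightforward search tree for Transitivity Editing with branching number 3.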
In order to improve the running time of Algorithm 3, remember that
Lemma 3.4 implies that we only need to consider inserting an arc if we
encounter a diamond. This helps us decrease the branching number. The
modified search tree algorithm that is presented as Algorithm 4 traverses the
search tree in the following way: Upon finding a diamond d = (u, {x, y}, v)
in the given digraph D = (V, A), the algorithm recursively asks whether
1. (V, A\{(u, x), (u, y)}) can be turned transitive with ≤ k − 2 operations
2. (V, A\{(u, x), (y, v)}) can be turned transitive with ≤ k − 2 operations
3. (V, A\{(x, v), (u, y)}) can be turned transitive with ≤ k − 2 operations
4. (V, A\{(x, v), (y, v)}) can be turned transitive with ≤ k − 2 operations
5. (V, A ∪ {(u, v)}) can be turned transitive with ≤ k − 1 operations
If there are no diamonds in the input graph, then the straightforward search
tree implementation for Transitivity Deletion is used to solve the problem.
Recall that this implementation runs in O(2^k · n^3) time. Since all
possible ways of destroying an encountered diamond are considered by
Algorithm 4 and Lemma 3.4 implies the correctness of using the straightforward
search tree implementation for Transitivity Deletion when all diamonds
are destroyed, it is clear that Algorithm 4 is correct.
In the following, we consider the running time of Algorithm 4. By the
above enumeration, we conclude that the branching vector is (2, 2, 2, 2, 1).
This leads to a branching number of about 2.5616. Thus, Algorithm 4
takes, in the worst case, 2.57^k · poly(n) time to find an optimal solution set
for the given digraph. By intersecting the predecessors of each vertex with
the successors of each vertex, one can find a diamond in asymptotically the
same time as finding a P3 needs. Thus, the polynomial factor is O(n^3).
However, as before, we can keep a set of heads and tails of diamonds in D
that can be updated in O(n^2 log n) steps. Again, the initial calculation of
the set of diamonds can be done in O(n^3) time. With this modification,
we arrive at a total running time of O(2.57^k · n^2 log n + n^3) for solving the
Transitivity Editing problem with the modified search tree algorithm.
Finally, we can apply the technique of interleaving (see Section 2.2) to
both search tree algorithms, resulting in a running time of O(2.57^k + n^3)
and O(2^k + n^3) for solving Transitivity Editing and Transitivity
Deletion, respectively.
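The branching number of a branching vector (d_1, ..., d_r) is the unique root x > 1 of x^(−d_1) + ... + x^(−d_r) = 1. For the vector (2, 2, 2, 2, 1) this equation becomes x^2 = x + 4, whose positive root is (1 + √17)/2. The following Python sketch, with the hypothetical helper name branching_number, approximates the root for an arbitrary vector by bisection; it assumes a vector of at least two positive integers.

```python
def branching_number(vector, tol=1e-12):
    """Root x > 1 of sum(x ** -d for d in vector) == 1, found by bisection.

    Assumes len(vector) >= 2 and every entry is a positive integer.
    """
    lo, hi = 1.0, float(len(vector)) + 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The left-hand side decreases in x: a value above 1 means x is too small.
        if sum(mid ** -d for d in vector) > 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Applied to (1, 1) and (1, 1, 1), the helper returns approximately 2 and 3, the branching numbers of the straightforward search trees for Transitivity Deletion and Transitivity Editing.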
Proof. Since the proofs for each of the three parts of the lemma are anal-
ogous, we only show that, if (a, b) ∈ P and (b, c) ∈ P , then there is no
solution set for D that deletes (a, c). Clearly, any solution set for D that
deletes (a, c) must also delete either (a, b) or (b, c) to destroy the P3 (a, b, c).
Thus, if S does not modify (a, b) and (b, c) it cannot delete (a, c).
We use Lemma 6.2 to modify the algorithm that marks an arc in the
digraph such that, whenever an arc is marked, the conditions of Lemma 6.2
are checked and further marks are established accordingly.
Recall that sources and sinks are preserved by applying optimal solution
sets. In the following, we show that certain arcs between sources and sinks
can be marked.
Proof. Since the proofs for both parts of the rule are analogous, it suffices
to prove the first part. For the sake of contradiction, assume that there is
an optimal solution set S for D that deletes (r, s). By Corollary 3.7, r is
a source in (V, A∆S) and s is a sink in (V, A∆S). Thus, it is obvious that
undoing the deletion of (r, s) cannot create a P3 and thus S\{(r, s)} is also
a solution set for D, contradicting the optimality of S.
Proof. Obviously, optimal solution sets do not insert arcs between the two
components and thus, the union of both partial solutions is optimal for D if
the partial solutions are optimal for D_1 and D_2, respectively.
It is not hard to see that if a given digraph D = (V, A) has more than
one weakly connected component, then we can split the digraph and edit
the components individually. If restricted to Transitivity Deletion, this
idea can be used to split a digraph even earlier.
Proof. Suppose S was not an optimal solution set for D, that is, S is no
solution set for D or S is not optimal. If S is no solution set for D, then
there is a P3 p = (u, v, w) in (V, A\S). Since S_1 and S_2 are optimal solution
sets for D_1 and D_2, respectively, p cannot be entirely contained in D_1 or D_2.
Without loss of generality, let u ∈ V_1 \ V_2 and w ∈ V_2 \ V_1. Since D′_1 and D′_2
are different weakly connected components, it is clear that
v ∈ V_1 ∩ V_2 = V_SRC ∪ V_SNK. Hence, v is a sink or a source in D and by
Corollary 3.7, v is a sink or source in both (V_1, A_1\S_1) and (V_2, A_2\S_2).
Hence, v is also a sink or source in (V, A\S), a contradiction to (u, v, w)
being a P3 in (V, A\S).
If S is not optimal, then there is an optimal solution set S′ for D with
|S′| < |S|. Obviously, S′_1 := S′ ∩ (V_1 × V_1) and S′_2 := S′ ∩ (V_2 × V_2) are
solution sets for D_1 and D_2, respectively. Since S_1 and S_2 are optimal, we
know that |S_1| ≤ |S′_1| and |S_2| ≤ |S′_2|. However, since S′ is optimal,
Lemma 3.8 implies that there are no arcs between sources and sinks in S′ and
thus S′_1 ∩ S′_2 = ∅. Hence, |S| ≤ |S_1| + |S_2| ≤ |S′_1| + |S′_2| = |S′|,
contradicting |S′| < |S|.
7 Heuristics
In practice, one may find oneself forced to find solution sets for Transi-
tivity Editing faster than possible with the search tree algorithm. This
can be done by waiving the optimality of the computed solution. In the
following, we introduce a heuristic for Transitivity Editing. A heuristic
is an algorithm that either does not produce provably optimal solutions or
is not provably efficient. The presented heuristic may compute suboptimal
solutions but we conjecture that its running time is polynomial for all in-
puts. The basic idea of the presented heuristic is to assign a rank to every
pair of vertices of the given digraph D and then to greedily insert or remove
arcs based on their rank. Hereby, the rank of an arc represents its potential
to destroy P3 s in D. By adding the arc of maximum rank to the solution
set we hope to destroy as many P3 s as possible in each step. Note that this
arc is not necessarily in D.
Definition 7.1. For each pair (u, v) ∈ V × V , the rank of (u, v) is the
number of P3 s in D that are destroyed by modifying (u, v) minus the number
of P3 s that are created in D, if (u, v) is modified.
All ranks are initially computed by Algorithm 6 and stored in an array.
Lemma 7.2. Algorithm 6 is correct, that is, after calling Algorithm 6, for
each (u, v) ∈ V × V , the value of rnk [(u, v)] is equal to the rank of (u, v) as
defined in Definition 7.1.
Proof. Obviously, for any arc a, each P3 that is created by deleting a if a ∈ A
is destroyed by inserting a if a ∉ A. This is accounted for in line 3
of the algorithm. Thus, the proof works analogously for (u, v) ∉ A and it
suffices to show the lemma for (u, v) ∈ A. Let r denote the number of P3s
in D that are destroyed by deleting (u, v) and let s denote the number of P3s
that are created by deleting (u, v). We prove the following: after running
Algorithm 6, the rank of the arc (u, v) is rnk[(u, v)] = r − s. Since (u, v) ∈ A,
it is clear that
    s = |succ_A(u) ∩ pred_A(v)|
and
    r = |(pred_A(u) \ {v}) \ pred_A(v)| + |(succ_A(v) \ {u}) \ succ_A(u)|.
Figure 19: If the arc (u, v) of some digraph (V, A) is modified, then we
need to update the ranks of the arcs that are drawn for the three classes of
vertices that are represented by x, y, and z.
Proof. Obviously, there are n^2 pairs of vertices (u, v). For each pair, three
set differences are calculated with each set being of size O(n). This can
be done in linear time, resulting in a worst-case running time of O(n^3) for
Algorithm 6.
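The initial rank computation can be sketched in Python as follows. This is an illustration of Definition 7.1 under our own assumptions and not a transcription of Algorithm 6: the function name compute_ranks is hypothetical and the array rnk is replaced by a dictionary that is indexed by ordered vertex pairs.

```python
def compute_ranks(vertices, arcs):
    """Rank of every ordered pair (u, v) with u != v as in Definition 7.1."""
    succ = {x: set() for x in vertices}
    pred = {x: set() for x in vertices}
    for a, b in arcs:
        succ[a].add(b)
        pred[b].add(a)

    rnk = {}
    for u in vertices:
        for v in vertices:
            if u == v:
                continue
            # P3s (u, w, v) that exist exactly when (u, v) is absent.
            through = len(succ[u] & pred[v])
            # P3s (w, u, v) and (u, v, z) that exist exactly when (u, v) is present.
            around = (len((pred[u] - {v}) - pred[v])
                      + len((succ[v] - {u}) - succ[u]))
            # Deleting an arc destroys `around` and creates `through` P3s;
            # inserting a missing arc does the opposite.
            sign = 1 if (u, v) in arcs else -1
            rnk[(u, v)] = sign * (around - through)
    return rnk
```

For the path 0 → 1 → 2 the pairs (0, 1), (1, 2), and (0, 2) all receive rank 1, since modifying any of them destroys the only P3, whereas (2, 0) receives rank −2 because inserting it would create two new P3s.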
To avoid recalculating the rank for all vertex pairs after modifying a
pair (u, v), we only update the ranks locally. Figure 19 illustrates that there
are three classes of rank updates that vary in their relation to u and v.
All vertices w ∈ V \{u, v} are processed three times. First, they are
considered as being in class x, that is, the ranks of (w, u) and (w, v) are
updated. Second, they are considered as being in class y, that is, the ranks
of (u, w) and (v, w) are updated. Finally, they are considered as being in
class z, that is, the ranks of (u, w) and (w, v) are updated. Thus, a total
of 6(|V | − 2) updates have to be done after each modification. Each affected
arc may have to get its rank adjusted in a different way, depending on
whether it is in A or not. To describe this correlation, we introduce a group
of functions defined in Table 1.
Example 7.4. Consider Figure 19. What would be the effect on the rank
of arc (x, v) if the arc (u, v) was deleted? Prior to the deletion, the removal
of (x, v) would have caused the P3 (x, u, v). After the deletion of (u, v),
this is no longer possible and, hence, the rank of (x, v) should increase.
The corresponding function value is update11((x, u), (x, v)) = 1. If (u, v)
was to be inserted into A, then the removal of (x, v) causes the P3 (x, u, v)
after the insertion, which was not possible prior to the insertion. Thus
the rank must decrease. This is implemented by multiplying the value
of update11((x, u), (x, v)) with sgn_{u,v}.
Algorithm 7:
Input: A directed graph D = (V, A), an arc (u, v) that has been
       modified and the current maximum of all ranks.
Output: Updated ranks for all (x, y) ∈ V × V and the new
        maximum of all ranks.
1 rnk[(u, v)] := −rnk[(u, v)];
2 foreach w ∈ V \ {u, v} do
3     rnk[(u, w)] := rnk[(u, w)] + sgn_{u,v} · (update03((u, w), (w, v)) +
                     update02((u, w), (v, w)));
4     rnk[(w, u)] := rnk[(w, u)] + sgn_{u,v} · update01((w, u), (w, v));
5     rnk[(w, v)] := rnk[(w, v)] + sgn_{u,v} · (update11((w, u), (w, v)) +
                     update13((u, w), (w, v)));
6     rnk[(v, w)] := rnk[(v, w)] + sgn_{u,v} · update12((u, w), (v, w));
7     update the maximum rank if necessary;
8 end
Algorithm 7 describes how the update-procedure is implemented and
Algorithm 8 is used to calculate the resulting transitive digraph.
Lemma 7.5. Let D = (V, A) be a digraph and for all (a, b) ∈ V × V
let rnk[(a, b)] equal the rank of (a, b) as defined in Definition 7.1. If the
arc (u, v) is modified, then Algorithm 7 updates rnk[(a, b)] such that it still
equals the rank of (a, b).
Proof. Let D = (V, A) be a digraph, (a, b) be a pair of vertices, and (u, v)
be the pair of vertices with maximum rank in D. Let $D'$ denote the digraph
that results from modifying (u, v) in D. Let r and $r'$ denote the number
of P3s in D and $D'$, respectively, that are destroyed by modifying (a, b).
Analogously, let s and $s'$ denote the number of P3s that are created by
modifying (a, b). We prove the following: if rnk[(a, b)] = r − s before calling
Algorithm 7, then rnk[(a, b)] = $r' - s'$ afterwards. First, note that arcs that
are not incident to either u or v are not affected by the modification of (u, v).
Obviously, if (a, b) = (u, v), then $r' - s' = -(r - s)$, since all P3s that are
destroyed by modifying (u, v) would be created by modifying (u, v) again,
and vice versa. Without loss of generality, we assume that the modification
made to (u, v) is an arc deletion, that is, $\mathrm{sgn}_{u,v} = 1$.
In the following, let w denote some vertex in V \ {u, v}. We consider the
rank of (a, b) = (u, w) and show, as an example, that line 3 of Algorithm 7
is correct by proving
update03((u, w), (w, v)) + update02((u, w), (v, w)) = $(r' - s') - (r - s)$.
Lines 4-6 can be verified analogously. The arc (u, w) can act as the two
arcs (u, y) and (u, z) in Figure 19. As shown in the middle and right
columns of Table 1, the corresponding update values are update02((u, w), (v, w))
and update03((u, w), (w, v)). Without loss of generality, let (u, w) ∈ A. We
consider the following cases:
Case 1: (v, w) ∈ A, (w, v) ∈ A.
The removal of (u, w) creates the P3 (u, v, w) in D but not in $D'$. Further-
more, the removal of (u, w) destroys the P3 (u, w, v) in $D'$, which does not
exist in D. Hence, $r' = r + 1$ and $s' = s - 1$, and thus $(r' - s') - (r - s) = 2$.
Since update03((u, w), (w, v)) + update02((u, w), (v, w)) = 2, the correctness
follows.
Case 2: (v, w) ∈ A, (w, v) ∉ A.
The removal of (u, w) creates the P3 (u, v, w) in D but not in $D'$. Hence,
$r' = r$ and $s' = s - 1$, and thus $(r' - s') - (r - s) = 1$. Since
update03((u, w), (w, v)) + update02((u, w), (v, w)) = 1, the correctness follows.
Case 3: (v, w) ∉ A, (w, v) ∈ A.
The removal of (u, w) destroys the P3 (u, w, v) in $D'$, which does not exist
in D. Hence, $r' = r + 1$ and $s' = s$, and thus $(r' - s') - (r - s) = 1$.
Since update03((u, w), (w, v)) + update02((u, w), (v, w)) = 1, the correctness
follows.
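The two observations that open the proof (the rank of the modified pair
changes its sign, and pairs avoiding both u and v keep their rank) can be
checked by brute force on small digraphs. The following Python sketch is not
the implementation used in this work: it recomputes each rank from scratch
instead of using the update values of Table 1, and it assumes that a
P3 (x, y, z) consists of three distinct vertices with (x, y), (y, z) ∈ A
and (x, z) ∉ A. The names count_p3, modify, rank, and
check_lemma_invariants are hypothetical.

```python
from itertools import permutations


def count_p3(vertices, arcs):
    """Count triples (x, y, z) of distinct vertices that form a P3
    (assumed definition: (x, y), (y, z) in arcs and (x, z) not in arcs)."""
    return sum(
        1
        for x, y, z in permutations(vertices, 3)
        if (x, y) in arcs and (y, z) in arcs and (x, z) not in arcs
    )


def modify(arcs, pair):
    """Modify a pair: delete the arc if it is present, insert it otherwise."""
    return set(arcs) ^ {pair}


def rank(vertices, arcs, pair):
    """Number of P3s destroyed minus number of P3s created by modifying
    `pair`; this equals the P3 count before minus the P3 count after."""
    return count_p3(vertices, arcs) - count_p3(vertices, modify(arcs, pair))


def check_lemma_invariants(vertices, arcs, u, v):
    """Check the two observations from the proof for the pair (u, v)."""
    vertices = list(vertices)
    modified = modify(arcs, (u, v))
    # (a, b) = (u, v): the rank changes its sign.
    sign_flips = rank(vertices, modified, (u, v)) == -rank(vertices, arcs, (u, v))
    # Pairs that contain neither u nor v keep their rank.
    others = [w for w in vertices if w not in (u, v)]
    unchanged = all(
        rank(vertices, modified, p) == rank(vertices, arcs, p)
        for p in permutations(others, 2)
    )
    return sign_flips and unchanged
```

Pairs that share exactly one vertex with (u, v) are the ones whose ranks
actually change; these are the pairs handled by lines 3-6 of Algorithm 7.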
Conjecture 7.6. If there is no pair of vertices (u, v) with rnk [(u, v)] > 0
in a digraph D, then there is no P3 in D.
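The role of the conjecture can be illustrated with a greedy loop. The
following is a minimal sketch, assuming that the heuristic repeatedly
modifies a pair of maximum rank for as long as that rank is positive (the
details of Algorithm 8 may differ). All ranks are recomputed by brute force
rather than maintained with Algorithm 7, the names count_p3 and
greedy_transitivity_editing are hypothetical, and a P3 (x, y, z) is assumed
to consist of three distinct vertices with (x, y), (y, z) ∈ A and
(x, z) ∉ A.

```python
from itertools import permutations


def count_p3(vertices, arcs):
    """Count triples (x, y, z) of distinct vertices with
    (x, y), (y, z) in arcs and (x, z) not in arcs (assumed P3 definition)."""
    return sum(
        1
        for x, y, z in permutations(vertices, 3)
        if (x, y) in arcs and (y, z) in arcs and (x, z) not in arcs
    )


def greedy_transitivity_editing(vertices, arcs):
    """Return (list of modified pairs, resulting arc set)."""
    vertices = list(vertices)
    arcs = set(arcs)
    solution = []
    current = count_p3(vertices, arcs)
    while True:
        best_rank, best_pair = 0, None
        for pair in permutations(vertices, 2):
            # rank = number of P3s before minus number of P3s after
            # modifying (toggling) the pair
            r = current - count_p3(vertices, arcs ^ {pair})
            if r > best_rank:
                best_rank, best_pair = r, pair
        if best_pair is None:
            break  # no pair of positive rank is left
        arcs ^= {best_pair}
        solution.append(best_pair)
        current -= best_rank
    return solution, arcs
```

Every iteration strictly decreases the number of P3s, so the loop always
terminates. Whether the digraph it stops on is always free of P3s, and hence
transitive, is exactly the statement of Conjecture 7.6.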
8 Experimental Results
In the course of this work, experiments were carried out to assess the
presented algorithms in practice. In this section, we report the results
of various test runs of the implementations of the two algorithms
for solving Transitivity Editing that are described in Section 6 and
Section 7, respectively. We incorporated preprocessing Rules 5.2 and 5.13
into the implementation of Algorithm 5 in order to reduce the number of
vertices in the input digraph to $O(k^2)$ (see Section 5.1). Furthermore,
Rules 5.18 and 5.19 were implemented. To calculate the lower bound on the
size of the solution set needed for Rule 5.13, Algorithm 2 (see Page 48)
was used.
In the following, we refer to the resulting FPT algorithm simply as “the
FPT algorithm”. We explain the tests that were run with the algorithms
and present and interpret their results. All tests were run on a single core
of the multi-core system described in Table 2.
[...]
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9300 @2.50GHz
stepping : 7
cpu MHz : 2497.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
[...]
bogomips : 4982.43
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
[Figure 20 plot: y-axis "heuristic error" (0 to 14), x-axis 0 to 70.]
Figure 20: The absolute error that the heuristic algorithm made, that
is, $\bigl| |S_{\mathrm{OPT}}| - |S_{\mathrm{heur}}| \bigr|$, versus $|S_{\mathrm{OPT}}|$. The data was taken from all tests.
Note that the error does not exceed 23%.
[Figure 21 plot: y-axis "time in seconds" (0 to 8000), x-axis "vertices" (0 to 400); one curve each for k = 10, 15, 20, 25, 30.]
Figure 21: Time in seconds that the FPT algorithm took to find an optimal
solution set for a given digraph versus the number of vertices in this digraph
for different optimal solution set sizes (k ∈ {10, 15, 20, 25, 30}).
Fixed n Test. The results of our experiments are not surprising in that
they agree with the theoretical analysis. Figures 21 and 22 illustrate that
if we choose n ≫ k, the running times of both algorithms approach a
polynomial in n. Note that the results do not differ significantly for
different optimal solution sizes. This may be due to our method of testing,
since it is likely that almost the complete optimal solution set was found
by the polynomial-time preprocessing preceding the branching in the FPT
algorithm. However, as can be seen from the different scales of the time
axes in the diagrams, the preprocessing algorithm is slower than the
heuristic by a factor of about 100. This may, however, be influenced by
calling Algorithm 2 (see Page 48) for lower bounding the size of the
solution set, which takes $O(n^3 \log n)$ time every time the solution set
grows. Hence, we expect the graphs to diverge for larger solution set sizes.
[Figure 22 plot: y-axis "time in seconds" (0 to 90), x-axis "vertices in the input digraph" (0 to 400); one curve each for k = 10, 15, 20, 25, 30.]
Figure 22: Time in seconds that the heuristic algorithm took to find a
solution set to a given digraph versus the number of vertices in this digraph
for different optimal solution set sizes (k ∈ {10, 15, 20, 25, 30}).
[Figure 23 plot: y-axis "time in seconds" (logarithmic, 0.01 to 10000), x-axis 0 to 70; one curve each for n = 10, 15, 20, and the bold line f(x).]
Figure 23: Time in seconds that the FPT algorithm took to find a solution
set for a given digraph on a logarithmic scale versus the size of the optimal
solution sets for different input digraph sizes (number of vertices)
(n ∈ {10, 15, 20}). The bold line marked f(x) is the graph of the function
$f(x) := 0.001 \cdot 1.3^x$.
Fixed k Test. Figure 23 shows the time that the FPT algorithm took to
compute an optimal solution for a given input graph on a logarithmic scale.
Due to the polynomial summand, we do not expect a straight line in the
diagram but rather a graph that is a bit bumpy in the proximity of the
origin and approaches a line asymptotically. The bold line in the diagram
is drawn for comparison. It is the graph of the function
$f(x) := 0.001 \cdot 1.3^x$. This suggests that, in practice, the running
time of the FPT algorithm grows much more slowly than $2.57^k$.
The implementation of the heuristic algorithm (Algorithm 8) never took
more than 0.04 seconds for n ≤ 20. Unfortunately, this is too close to the
measurement inaccuracy to provide any meaningful results.
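Reading a growth base off a log-scale plot such as Figure 23 amounts to
fitting a straight line to the logarithms of the running times. The sketch
below only illustrates this step: the function name fit_exponential is
hypothetical, and the sample values are generated from the comparison
function f(x) itself, not taken from the measurements.

```python
import math


def fit_exponential(ks, times):
    """Least-squares fit of times ~ c * b**k on a logarithmic scale.

    Fits log(time) = log(c) + k * log(b) and returns the pair (c, b).
    """
    n = len(ks)
    logs = [math.log(t) for t in times]
    mean_k = sum(ks) / n
    mean_log = sum(logs) / n
    slope = sum((k - mean_k) * (l - mean_log) for k, l in zip(ks, logs)) / sum(
        (k - mean_k) ** 2 for k in ks
    )
    intercept = mean_log - slope * mean_k
    return math.exp(intercept), math.exp(slope)


# Synthetic sample points on the comparison curve f(x) = 0.001 * 1.3**x.
sample_ks = list(range(10, 70, 10))
sample_times = [0.001 * 1.3 ** k for k in sample_ks]
c, b = fit_exponential(sample_ks, sample_times)
```

On measured data, the polynomial summand distorts the fit for small
solution sizes, so restricting the fit to the larger values of k gives a
more meaningful estimate of the base.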
9 Conclusion
In the course of this work, we considered Transitivity Editing and
some related problems. While Transitivity Completion is solvable in
$O(n^{2.376})$ time, we have seen that both Transitivity Editing and
Transitivity Deletion are NP-complete, even when restricted to DAGs or
digraphs of maximum degree 4. We have shown that both problems admit
a problem kernel containing at most k(k + 1) vertices and that this kernel
can be calculated in $O(n^3)$ time. Furthermore, we presented FPT
algorithms for Transitivity Editing and Transitivity Deletion that run
in $O(2.57^k + n^3)$ and $O(2^k + n^3)$ time, respectively. We also presented a
heuristic algorithm for solving Transitivity Editing. Finally, we
performed various experiments testing how the running times depend on both
the input size and the solution size in practice.
Although we did not succeed in improving on the $O(k^2)$-vertex kernel
for Transitivity Editing and Transitivity Deletion, it may be
possible to find an O(k)-vertex kernel in the future. Some sort of
crown-type reduction rule [FLRS07] may be powerful enough to achieve this.
It would also be interesting to determine whether a polynomial-size kernel
can be computed in less than cubic time. Furthermore, a more detailed
analysis of the running time of Rule 5.16 is desirable. It also remains to
prove Conjecture 7.6, which would show that the heuristic algorithm provided
here always returns a solution set and runs in $O(n^4)$ time. Finally, the
problem of Transitivity Vertex Deletion is yet to be analyzed.
References
[ACK+ 99] Giorgio Ausiello, Pierluigi Crescenzi, Viggo Kann, Alberto
Marchetti-Spaccamela, Giorgio Gambosi, and Marco Protasi.
Complexity and Approximation: Combinatorial Optimization
Problems and Their Approximability Properties. Springer, Jan-
uary 1999. 9
[CTY07] Pierre Charbit, Stéphan Thomassé, and Anders Yeo. The Min-
imum Feedback Arc Set problem is NP-hard for tournaments.
Combinatorics, Probability & Computing, 16(1):1–4, 2007. 14
[DGH+ 06] Michael Dom, Jiong Guo, Falk Hüffner, Rolf Niedermeier, and
Anke Truß. Fixed-parameter tractability results for feedback set
problems in tournaments. In Proceedings of the 6th Conference
[GGHN05] Jens Gramm, Jiong Guo, Falk Hüffner, and Rolf Niedermeier.
Graph-modeled data clustering: fixed-parameter algorithms for
clique generation. Theory of Computing Systems, 38(4):373–392,
July 2005. 10, 42
[HBB+ 06] Michael Hummel, Stefan Bentink, Hilmar Berger, Wolfram Klap-
per, Swen Wessendorf, Thomas F.E. Barth, Heinz-Wolfram
Bernd, Sergio B. Cogliatti, Judith Dierlamm, Alfred C. Feller,
Martin-Leo Hansmann, Eugenia Haralambieva, Lana Harder,
Dirk Hasenclever, Michael Kühn, Dido Lenze, Peter Lichter,
Jose Ignacio Martin-Subero, Peter Möller, Hans-Konrad Müller-
Hermelink, German Ott, Reza M. Parwaresch, Christiane Pott,
Andreas Rosenwald, Maciej Rosolowski, Carsten Schwaenen,
Benjamin Stürzenhofecker, Monika Szczepanowski, Heiko Traut-
mann, Hans-Heinrich Wacker, Rainer Spang, Markus Loeffler,
Lorenz Trümper, Harald Stein, and Reiner Siebert. A bi-
ologic definition of Burkitt’s lymphoma from transcriptional
and genomic profiling. New England Journal of Medicine,
354(23):2419–2430, June 2006. 3, 10
[JJK+ 08] Juby Jacob, Marcel Jentsch, Dennis Kostka, Stefan Bentink,
and Rainer Spang. Detecting hierarchical structure in molec-
[YZL07] Bing Yang, Si-Qing Zheng, and Enyue Lu. Finding two disjoint
paths in a network with MinSum-MinMin objective function. In
Proceedings of the 2007 International Conference on Foundations
of Computer Science (FCS2007), pages 356–361. CSREA Press,
2007. 26