Professional Documents
Culture Documents
Riesen and Bunke - Approximate Graph Edit Distance
Riesen and Bunke - Approximate Graph Edit Distance
a r t i c l e i n f o a b s t r a c t
Article history: In recent years, the use of graph based object representation has gained popularity. Simultaneously,
Received 15 October 2007 graph edit distance emerged as a powerful and flexible graph matching paradigm that can be used to
Received in revised form 1 February 2008 address different tasks in pattern recognition, machine learning, and data mining. The key advantages
Accepted 9 April 2008
of graph edit distance are its high degree of flexibility, which makes it applicable to any type of graph,
and the fact that one can integrate domain specific knowledge about object similarity by means of spe-
cific edit cost functions. Its computational complexity, however, is exponential in the number of nodes of
Keywords:
the involved graphs. Consequently, exact graph edit distance is feasible for graphs of rather small size
Graph based representation
Graph edit distance
only. In the present paper we introduce a novel algorithm which allows us to approximately, or subop-
Bipartite graph matching timally, compute edit distance in a substantially faster way. The proposed algorithm considers only local,
rather than global, edge structure during the optimization process. In experiments on different datasets
we demonstrate a substantial speed-up of our proposed method over two reference systems. Moreover, it
is emprically verified that the accuracy of the suboptimal distance remains sufficiently accurate for var-
ious pattern recognition applications.
Ó 2008 Elsevier B.V. All rights reserved.
0262-8856/$ - see front matter Ó 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.imavis.2008.04.004
K. Riesen, H. Bunke / Image and Vision Computing 27 (2009) 950–959 951
planarly embedded, which is satisfied in many, but not all com- the concept of edit distance can be tailored to specific
puter vision applications of graph matching. applications.
In this paper, we propose a new efficient algorithm for edit dis- A standard set of distortion operations is given by insertions,
tance computation for general graphs. The method is based on an deletions, and substitutions of both nodes and edges. We denote
(optimal) fast bipartite optimization procedure mapping nodes the substitution of two nodes u and v by ðu ! vÞ, the deletion of
and their local structure of one graph to nodes and their local struc- node u by ðu ! eÞ, and the insertion of node v by ðe ! vÞ. For edges
ture of another graph. This procedure is somewhat similar in spirit to we use a similar notation. Other operations, such as merging and
the method proposed in [16,17]. However, rather than using dy- splitting of nodes [21], can be useful in certain applications but
namic programming for finding an optimal match between the sets are not considered in this paper. Given two graphs, the source
of local structure, we make use of Munkres’ algorithm [18]. Origi- graph g 1 and the target graph g 2 , the idea of graph edit distance
nally, this algorithm has been proposed to solve the assignment is to delete some nodes and edges from g 1 , relabel (substitute)
problem in polynomial time. However, in the present paper we gen- some of the remaining nodes and edges, and insert some nodes
eralize the original algorithm to the computation of graph edit dis- and edges in g 2 , such that g 1 is finally transformed into g 2 . A se-
tance. In experiments on semi-artificial and real-world data we quence of edit operations e1 ; . . . ; ek that transform g 1 completely
demonstrate that the proposed method results in a substantial into g 2 is called an edit path between g 1 and g 2 . In Fig. 1 an exam-
speed-up of the computation of graph edit distance, while at the ple of an edit path between two graphs g 1 and g 2 is given. This edit
same time the accuracy of the approximated distances is not much path consists of three edge deletions, one node deletion, one node
affected. insertion, two edge insertions, and two node substitutions.
A preliminary version of the current paper appeared in [19]. The Obviously, for every pair of graphs (g 1 ; g 2 ), there exist a number
current paper has been significantly extended with respect to the of different edit paths transforming g 1 into g 2 . Let !ðg 1 ; g 2 Þ denote
underlying methodology and the experimental evaluation. the set of all such edit paths. To find the most suitable edit path out
of !ðg 1 ; g 2 Þ, one introduces a cost for each edit operation, measur-
2. Graph edit distance computation ing the strength of the corresponding operation. The idea of such a
cost function is to define whether or not an edit operation repre-
In this section we first define our basic notation and then intro- sents a strong modification of the graph. Obviously, the cost func-
duce graph edit distance and its computation. Let L be a finite or tion is defined with respect to the underlying node or edge labels.
infinite set of labels for nodes and edges. Clearly, between two similar graphs, there should exist an inex-
pensive edit path, representing low cost operations, while for dis-
Definition 1. (Graph) A graph g is a four-tuple g ¼ ðV; E; l; mÞ, similar graphs an edit path with high costs is needed.
where Consequently, the edit distance of two graphs is defined by the
minimum cost edit path between two graphs.
V is the finite set of nodes,
E # V V is the set of edges, Definition 2. (Graph Edit Distance) Let g 1 ¼ ðV 1 ; E1 ; l1 ; m1 Þ be the
l : V ! L is the node labeling function, and source and g 2 ¼ ðV 2 ; E2 ; l2 ; m2 Þ the target graph. The graph edit
m : E ! L is the edge labeling function. distance between g 1 and g 2 is defined by
X
k
dðg 1 ; g 2 Þ ¼ min cðei Þ;
This definition allows us to handle arbitrary graphs with uncon- ðe1 ;...;ek Þ2!ðg 1 ;g 2 Þ
i¼1
strained labeling functions. For example, the labels can be given by
the set of integers, the vector space Rn , or a set of symbolic labels where !ðg 1 ; g 2 Þ denotes the set of edit paths transforming g 1 into g 2 ,
L ¼ fa; b; c; . . .g. Moreover, unlabeled graphs are obtained as a spe- and c denotes the cost function measuring the strength cðei Þ of edit
cial case by assigning the same label l to all nodes and edges. Edges operation ei .
are given by pairs of nodes ðu; vÞ, where u 2 V denotes the source The computation of the edit distance is usually carried out by
node and v 2 V the target node of a directed edge. Undirected means of a tree search algorithm which explores the space of all
graphs can be modeled by inserting a reverse edge ðv; uÞ 2 E for possible mappings of the nodes and edges of the first graph to
each edge ðu; vÞ 2 E with mðu; vÞ ¼ mðv; uÞ. the nodes and edges of the second graph. A widely used method
Graph matching refers to the task of evaluating the dissimilar- is based on the A* algorithm [22] which is a best-first search algo-
ity of graphs. One of the most flexible methods for measuring the rithm. The basic idea is to organize the underlying search space as
dissimilarity of graphs is the edit distance [2,3]. Originally, the an ordered tree. The root node of the search tree represents the
edit distance has been proposed in the context of string matching starting point of our search procedure, inner nodes of the search
[20]. Procedures for edit distance computation aim at deriving a tree correspond to partial solutions, and leaf nodes represent com-
dissimilarity measure from the number of distortions one has to plete – not necessarily optimal – solutions. Such a search tree is
apply to transform one pattern into the other. The concept of edit constructed dynamically at runtime by iteratively creating succes-
distance has been extended from strings to trees and eventually sor nodes linked by edges to the currently considered node in the
to graphs [2,3]. Similarly to string edit distance, the key idea of search tree. In order to determine the most promising node in the
graph edit distance is to define the dissimilarity, or distance, of current search tree, i.e. the node which will be used for further
graphs by the minimum amount of distortion that is needed to expansion of the desired mapping in the next iteration, a heuristic
transform one graph into another. Compared to other approaches function is usually used. Formally, for a node p in the search tree,
to graph matching, graph edit distance is known to be very flex- we use gðpÞ to denote the cost of the optimal path from the root
ible since it can handle arbitrary graphs and any type of node and node to the current node p, i.e. gðpÞ is set equal to the cost of the
edge labels. Furthermore, by defining costs for edit operations, partial edit path accumulated so far, and we use hðpÞ for denoting
g1 g2
Fig. 1. A possible edit path between graph g 1 and g 2 (node labels are represented by different shades of grey).
952 K. Riesen, H. Bunke / Image and Vision Computing 27 (2009) 950–959
the estimated cost from p to a leaf node. The sum gðpÞ þ hðpÞ gives cannot be any edge ðu; u0 Þ 2 E1 . Consequently, an edge insertion
the total cost assigned to an open node in the search tree. One can ðe ! e2 Þ has to be performed.
show that, given that the estimation of the future costs hðpÞ is low- Obviously, the implied edge operations can be derived from
er than, or equal to, the real costs, the algorithm is admissible, i.e. every partial or complete edit path during the search procedure gi-
an optimal path from the root node to a leaf node is guaranteed to ven in Algorithm 1. The costs of these implied edge operations are
be found [22]. dynamically added to the corresponding paths in OPEN.
In order to integrate more knowledge about partial solutions in
the search tree, it has been proposed to use heuristics [22]. Basi-
Algorithm 1. Graph edit distance algorithm cally, such heuristics for a tree search algorithm aim at the estima-
tion of a lower bound hðpÞ of the future costs. In the simplest
Input: Non-empty graphs g 1 ¼ ðV 1 ; E1 ; l1 ; m1 Þ and
scenario this lower bound estimation hðpÞ for the current node p
g 2 ¼ ðV 2 ; E2 ; l2 m2 Þ, where V 1 ¼ fu1 ; . . . ; ujV 1 j g and
is set to zero for all p, which is equivalent to using no heuristic
V 2 ¼ fv1 ; . . . ; vjV 2 j g
information about the present situation at all. The other extreme
Output: A minimum cost edit path from g 1 to g 2
would be to compute for a partial edit path the actual optimal path
e.g. pmin ¼ fu1 ! v3 ; u2 ! e; . . . ; e ! v2 g
to a leaf node, i.e. perform a complete edit distance computation
1: initialize OPEN to the empty set fg
for each node of the search tree. In this case, the function hðpÞ is
2: For each node w 2 V 2 , insert the substitution fu1 ! wg into
not a lower bound, but the exact value of the optimal costs. Of
OPEN
course, the computation of such a perfect heuristic is both unrea-
3: Insert the deletion fu1 ! eg into OPEN
sonable and untractable. Somewhere in between the two extremes,
4: loop
one can define a function hðpÞ evaluating how many edit opera-
5: Remove pmin ¼ argminp2OPEN fgðpÞ þ hðpÞg from OPEN
tions have to be performed in a complete edit path at least [2].
6: if pmin is a complete edit path then
One possible function of this type is described in the next
7: Return pmin as the solution
paragraph.
8: else
Let us assume that a partial edit path at a position in the search
9: Let pmin ¼ fu1 ! vi1 ; ; uk ! vik g
tree is given, and let the number of unprocessed nodes of the first
10: if k < jV 1 j then
graph g 1 and second graph g 2 be n1 and n2 , respectively. For an effi-
11: For each w 2 V 2 n fvi1 ; ; vik g, insert pmin [ fukþ1 ! wg
cient estimation of the remaining optimal edit operations, we first
into OPEN
attempt to perform as many node substitutions as possible. To this
12: Insert pmin [ fukþ1 ! eg into OPEN
end, we potentially substitute each of the n1 nodes from g 1 with
13: else
S any of the n2 nodes from g 2 . To obtain a lower bound of the exact
14: Insert pmin [ w2V 2 nfvi1 ;;vik g fe ! wg into OPEN
edit cost, we accumulate the costs of the minfn1 ; n2 g least expen-
15: end if
sive of these node substitutions, and the costs of maxf0; n1 n2 g
16: end if
node deletions and maxf0; n2 n1 g node insertions. Any of the se-
17: end loop
lected substitutions that is more expensive than a deletion fol-
lowed by an insertion operation is replaced by the latter. The
unprocessed edges of both graphs are handled analogously. Obvi-
In Algorithm 1 the A*-based method for optimal graph edit dis-
ously, this procedure allows multiple substitutions involving the
tance computation is given. The nodes of the source graph are pro-
same node or edge and, therefore, it possibly represents an invalid
cessed in the order ðu1 ; u2 ; . . .Þ. The deletion (line 12) or the
way to edit the remaining part of g 1 into the remaining part of g 2 .
substitution of a node (line 11) are considered simultaneously,
However, the estimated cost certainly constitutes a lower bound of
which produces a number of successor nodes in the search tree.
the exact cost and thus an optimal edit path is guaranteed to be
If all nodes of the first graph have been processed, the remaining
found [22]. In the following, we refer to this method as HEURISTIC-A*.
nodes of the second graph are inserted in a single step (line 14).
The set OPEN of partial edit paths contains the search tree nodes
to be processed in the next steps. The most promising partial edit 3. Fast suboptimal edit distance algorithms
path p 2 OPEN, i.e. the one that minimizes gðpÞ þ hðpÞ, is always
chosen first (line 5). This procedure guarantees that the complete The method described in the previous section finds an optimal
edit path found by the algorithm first is always optimal, i.e. has edit path between two graphs. Unfortunately, the computational
minimal costs among all possible competing paths (line 7). complexity of the edit distance algorithm, whether or not a heuris-
Note that edit operations on edges are implied by edit opera- tic function hðpÞ is used to govern the tree traversal process, is
tions on their adjacent nodes, i.e. whether an edge is substituted, exponential in the number of nodes of the involved graphs. This
deleted, or inserted, depends on the edit operations performed means that the running time and space complexity may be huge
on its adjacent nodes. Formally, let u; u0 2 V 1 [ feg and even for reasonably small graphs. In order to cope better with
v; v0 2 V 2 [ feg, and assume that the two node operations ðu ! vÞ the problem of graph distance, in [16,17] a suboptimal method
and ðu0 ! v0 Þ have been executed. We distinguish three cases. has been proposed. The basic idea is to decompose graphs into sets
(1) Assume there are edges e1 ¼ ðu; u0 Þ 2 E1 and e2 ¼ ðv; v0 Þ 2 E2 of subgraphs. These subgraphs consist of a node and its adjacent
in the corresponding graphs g 1 and g 2 . Then the edge substitution structure, i.e. edges including their nodes. The graph matching
ðe1 ! e2 Þ is implied by the node operations given above. problem is then reduced to the problem of finding an optimal
(2) Assume there is an edge e1 ¼ ðu; u0 Þ 2 E1 but there is no edge match between the sets of subgraphs by means of dynamic
e2 ¼ ðv; v0 Þ 2 E2 . Then the edge deletion ðe1 ! eÞ is implied by the programming.
node operations given above. Obviously, if v ¼ e or v0 ¼ e there can- In the present paper we introduce a novel algorithm for solving
not be any edge ðv; v0 Þ 2 E2 and thus an edge deletion ðe1 ! eÞ has the problem of graph edit distance computation. This approach is
to be performed. somewhat similar to the method described in [16,17], i.e. we
(3) Assume there is no edge e1 ¼ ðu; u0 Þ 2 E1 but an edge approximate the graph edit distance by finding an optimal match
e2 ¼ ðv; v0 Þ 2 E2 . Then the edge insertion ðe ! e2 Þ is implied by between nodes and their local structure of two graphs. However,
the node operations given above. Obviously, if u ¼ e or u0 ¼ e there we use a bipartite matching procedure to find the optimal match
K. Riesen, H. Bunke / Image and Vision Computing 27 (2009) 950–959 953
rather than dynamic programming. Because of its suboptimal nat- 16: end if
ure, this method does not generally return the optimal edit path, 17: else
but only an approximate one. 18: Save the smallest uncovered element emin GOTO STEP 4
19: end if
3.1. The assignment problem 20: STEP 3: Construct a series S of alternating primed and starred
zeros as follows:
Our novel approach for graph edit distance computation is 21: Insert Z 0 into S
based on the assignment problem. The assignment problem con- 22: while In the column of Z 0 exists a starred zero Z 1 do
siders the task of finding an optimal assignment of the elements 23: Insert Z 1 into S
of a set A to the elements of a set B, where A and B have the same 24: Replace Z 0 with the primed zero in the row of Z 1 . Insert Z 0
cardinality. Assuming that numerical costs are given for each into S
assignment pair, an optimal assignment is one which minimizes 25: end while
the sum of the assignment costs. Formally, the assignment prob- 26: Unstar each starred zero in S and replace all primes with
lem can be defined as follows. stars. Erase all other primes and uncover every line in C GOTO
Definition 3. (The Assignment Problem) Let us assume there are STEP 1
two sets A and B together with an n n cost matrix C of real numbers 27: STEP 4: Add emin to every element in covered rows and
given, where jAj ¼ jBj ¼ n. The matrix elements Cij correspond to the subtract it from every element in uncovered columns. GOTO
costs of assigning the i-th element of A to the j-th element of B. The STEP 2
assignment problem can be stated as finding a permutation 28: DONE: Assignment pairs are indicated by the positions of
P starred zeros in the cost matrix
p ¼ p1 ; . . . ; pn of the integers 1; 2; . . . ; n that minimizes ni¼1 Cipi .
The assignment problem can be reformulated as finding an opti-
mal matching in a complete bipartite graph and is therefore also Munkres’ algorithm is based on the following theorem.
referred to as bipartite graph matching problem. Solving the
assignment problem in a brute force manner by enumerating all Theorem 1. (Equivalent Matrices) Given a cost matrix C as defined
permutations and selecting the one that minimizes the objective in Definition 3, a column vector c ¼ ðc1 ; . . . ; cn Þ, and a row vector
function leads to an exponential complexity which is unreason- r ¼ ðr 1 ; . . . ; r n Þ, the square matrix C0 with the elements
able, of course. However, there exists an algorithm which is known C0ij ¼ Cij ci r j has the same optimal assignment solution as the
as Munkres’ algorithm [18]1 that solves the bipartite matching matrix C. C and C0 are said to be equivalent.
problem in polynomial time. In Algorithm 2 Munkres’ method is de-
scribed in detail. The assignment cost matrix C given in Definition 3 Proof. [24] Let p be a permutation of the integers 1; 2; . . . ; n mini-
P
is the algorithms’ input, and the output corresponds to the optimal mizing ni¼1 Cipi , then
permutation, i.e. the assignment pairs resulting in the minimum
X
n X
n X
n X
n
cost. In the description of Munkres’ method in Algorithm 2 some C0ipi ¼ Cipi ci rj
lines (rows or columns) of the cost matrix C and some zero elements i¼1 i¼1 i¼1 j¼1
are distinguished. They are termed covered or uncovered lines and
starred or primed zeros, respectively. The values of the last two terms are independent of permutation
P P
p so that if p minimizes ni¼1 Cipi , it also minimizes ni¼1 C0ipi .
Consequently, if we find a new matrix C0 equivalent to the initial
Algorithm 2. Munkres’ algorithm for the assignment problem cost matrix C, and a permutation p with all C0ipi ¼ 0, then p also
Pn
Input: A cost matrix C with dimensionality n minimizes i¼1 Cipi . Intuitively, Munkres’ algorithm transforms
Output: The minimum cost node or edge assignment the original cost matrix C into an equivalent matrix C0 having
1: For each row r in C, subtract its smallest element from every n independent zero elements. This independent set of zero ele-
element in r ments exactly corresponds to the optimal assignment pairs.
2: For each column c in C, subtract its smallest element from The operations executed in lines 1 and 2, and STEP 4 of Algo-
every element in c rithm 2 find a matrix equivalent to the initial cost matrix (Theorem
3: For all zeros zi in C, mark zi with a star if there is no starred 1). In lines 1 and 2 the column vector c ¼ ðc1 ; . . . ; cn Þ is constructed
zero in its row or column by ci ¼ minfCij gj¼1;...;n , and the row vector r ¼ ðr 1 ; . . . ; r n Þ0 by
4: STEP 1: rj ¼ minfCij gi¼1;...;n . In STEP 4 the vectors c and r are defined by
5: for Each column containing a starred zero do the rules
6: cover this column
emin if row i is covered 0 if column j is covered
7: end for ci ¼ rj ¼
0 otherwise emin otherwise
8: if n columns are covered then GOTO DONE else GOTO STEP 2
end if
where emin is the smallest uncovered element in the cost matrix C.
9: STEP 2:
STEP 1, 2, and 3 of Algorithm 2 are procedures to find a maxi-
10: if C contains an uncovered zero then
mum set of independent zeros which mark the optimal assign-
11: Find an arbitrary uncovered zero Z 0 and prime it
ment. In the worst case the maximum number of operations
12: if There is no starred zero in the row of Z 0 then
needed by the algorithm is Oðn3 Þ. Note that the Oðn3 Þ complexity
13: GOTO STEP 3
is much smaller than the Oðn!Þ complexity required by a brute
14: else
force algorithm.
15: Cover this row, and uncover the column containing the
starred zero GOTO STEP 2
3.2. Graph edit distance computation by means of Munkres’ algorithm
1
Munkres’ algorithm is a refinement of an earlier version by Kuhn [23] and is also Munkres’ algorithm as introduced in the last section provides us
referred to as Kuhn-Munkres, or Hungarian algorithm. with an optimal solution to the assignment problem in Oðn3 Þ time.
954 K. Riesen, H. Bunke / Image and Vision Computing 27 (2009) 950–959
The same algorithm can be used to derive a suboptimal solution to On the basis of the new cost matrix C defined above, Munkres’
the graph edit distance problem as described below. algorithm [18] can be executed (Algorithm 2). This algorithm finds
Let us assume a source graph g 1 ¼ ðV 1 ; E1 ; l1 ; m1 Þ and a target the optimal, i.e. the minimum cost, permutation p ¼ p1 ; . . . ; pnþm of
Pnþm
graph g 2 ¼ ðV 2 ; E2 ; l2 ; m2 Þ of equal size, i.e. jV 1 j ¼ jV 2 j, are given. the integers 1; 2; . . . ; n þ m that minimizes i¼1 Cipi . Obviously, this
One can use Munkres’ algorithm in order to map the nodes of V 1 is equivalent to the minimum cost assignment of the nodes of g 1
to the nodes of V 2 such that the resulting node substitution costs represented by the rows to the nodes of g 2 represented by the col-
are minimal, i.e. we solve the assignment problem of Definition 3 umns of matrix C. That is, Munkres’ algorithm indicates the mini-
with A ¼ V 1 and B ¼ V 2 . In our solution we define the cost matrix mum cost assignment pairs with starred zeros in the
C such that entry Ci;j corresponds to the cost of substituting the transformed cost matrix C0 . These starred zeros are independent,
i-th node of V 1 with the j-th node of V 2 . Formally, i.e. each row and each column of C0 contains exactly one starred
Cij ¼ cðui ! vj Þ, where ui 2 V 1 and vj 2 V 2 , for i j ¼ 1; . . . ; jV 1 j. zero. Consequently, each node of graph g 1 is either uniquely as-
The constraint that both graphs to be matched are of equal size signed to a node of g 2 (left upper corner of C0 ), or to the deletion
is too restrictive since it cannot be expected that all graphs in a node e (right upper corner of C0 ). Vice versa, each node of graph
specific problem domain always have the same number of nodes. g 2 is either uniquely assigned to a node of g 1 (left upper corner
However, Munkres’ algorithm can be applied to non-quadratic of C0 ), or to the insertion node e (bottom left corner of C0 ). The e-
matrices, i.e. to sets with unequal size, as well [24]. In a prelimin- nodes in g 1 and g 2 corresponding to rows n þ 1; . . . ; n þ m and col-
ary version of the present paper Munkres’ algorithm has been ap- umns m þ 1; . . . ; m þ n in C0 that are not used cancel each other out
plied to rectangular cost matrices [19]. In this approach Munkres’ without any costs (bottom right corner of C0 ).
algorithm first finds the minfjV 1 j; jV 2 jg node substitutions which So far the proposed algorithm considers the nodes only and takes
minimize the total costs. Then the costs of maxf0; jV 1 j jV 2 jg no information about the edges into account. In order to achieve a
node deletions and maxf0; jV 2 j jV 1 jg node insertions are added better approximation of the true edit distance, it would be highly
to the minimum cost node assignment such that all nodes of desirable to involve edge operations and their costs in the node
both graphs are processed. Using this approach, all nodes of the assignment process as well. In order to achieve this goal, an exten-
smaller graph are substituted and the nodes remaining in the sion of the cost matrix is needed. To each entry ci;j , i.e. to each cost of
larger graph are either deleted (if they belong to g 1 ) or inserted a node substitution cðui ! vj Þ, the minimum sum of edge edit oper-
(if they belong to g 2 ). ation costs, implied by node substitution ui ! vj , is added. Formally,
In the present paper we extend the idea proposed in [19] by assume that node ui has adjacent edges Eui and node vj has adjacent
defining a new cost matrix C which is more general in the sense edges Evj . With these two sets of edges, Eui and Evj , an individual cost
that we allow insertions or deletions to occur not only in the larger, matrix similarly to Definition 4 can be established and an optimal
but also in the smaller of the two graphs under consideration. In assignment of the elements Eui to the elements Evj according to
contrast with the scenario of [19], this new setting now perfectly Algorithm 2 can be performed. Clearly, this procedure leads to the
reflects the edit model of Definition 2. Moreover, matrix C is by minimum sum of edge edit costs implied by the given node substi-
definition quadratic. Consequently, the original method as de- tution ui ! vj . These edge edit costs are added to the entry ci;j .
scribed in Algorithm 2 can be used to find the minimum cost Clearly, to the entry ci;e , which denotes the cost of a node deletion,
assignment. the cost of the deletion of all adjacent edges of ui is added, and to
the entry ce;j , which denotes the cost of a node insertion, the cost
Definition 4. (Cost Matrix) Let g 1 ¼ ðV 1 ; E1 ; l1 ; m1 Þ be the source
of all insertions of the adjacent edges of vj is added.
and g 2 ¼ ðV 2 ; E2 ; l2 ; m2 Þ be the target graph with V 1 ¼ fu1 ; . . . ; un g
Note that Munkres’ algorithm used in its original form is opti-
and V 2 ¼ fv1 ; . . . ; vm g, respectively. The cost matrix C is defined as
mal for solving the assignment problem, but it provides us with
a suboptimal solution for the graph edit distance problem only.
This is due to the fact that each node edit operation is considered
individually (considering the local structure only), such that no im-
plied operations on the edges can be inferred dynamically. The re-
sult returned by Munkres’ algorithm corresponds to the minimum
cost mapping, according to matrix C, of the nodes of g 1 to the nodes
of g 2 . Given this mapping, the implied edit operations of the edges
are inferred, and the accumulated costs of the individual edit oper-
ations on both nodes and edges can be computed. These costs serve
us as an approximate graph edit distance. The approximate edit
distance values obtained by this procedure are equal to, or larger
than, the exact distance values, since our suboptimal approach
finds an optimal solution in a subspace of the complete search
space. In the following we refer to our novel suboptimal graph edit
distance algorithm as BIPARTITE, or BP for short.
where ci;j denotes the cost of a node substitution, ci;e denotes the
cost of a node deletion cðui ! eÞ, and ce;j denotes the costs of a node
insertion cðe ! vj Þ.
4. Experimental results
Obviously, the left upper corner of the cost matrix represents
the costs of all possible node substitutions, the diagonal of the right The purpose of the experiments presented in this section is to
upper corner the costs of all possible node deletions, and the diag- empirically verify the feasibility of the proposed suboptimal graph
onal of the bottom left corner the costs of all possible node inser- edit distance algorithm. First of all, we compare the runtime of
tions. Note that each node can be deleted or inserted at most once. suboptimal graph edit distance with the optimal version. Secondly,
Therefore any non-diagonal element of the right-upper and left- we aim at answering the question whether the resulting subopti-
lower part is set to 1. The bottom right corner of the cost matrix mal graph edit distances remain sufficiently accurate for pattern
is set to zero since substitutions of the form ðe ! eÞ should not recognition tasks. To this end we consider several classification
cause any costs. tasks. There are various approaches to graph classification that
K. Riesen, H. Bunke / Image and Vision Computing 27 (2009) 950–959 955
Fig. 5. Fingerprint examples from the four classes. Fig. 7. Protein examples of all top level classes.
K. Riesen, H. Bunke / Image and Vision Computing 27 (2009) 950–959 957
Table 2
Accuracy (Acc), correlation ðqÞ, time ðtÞ on all datasets
Based on the scatter plots and the high correlation coeffi- 5. Conclusions
cient q reported in Table 2, one can presume that the classifica-
tion results of an NN-classifier will not be negatively affected In the present paper, we have considered the problem of graph
when we substitute the exact edit distances by approximate edit distance computation. The edit distance of graphs is a power-
ones. In fact, this can be observed in Table 2 for all datasets. ful and flexible concept that has found various applications in pat-
On Letter (L) the classification accuracy is improved, while on tern recognition and related areas. A serious drawback is, however,
Letter (M) and Letter (H) it drops compared to the exact algo- the exponential complexity of graph edit distance computation.
rithm. On the COIL data, all algorithms achieve the same classi- Hence using optimal, or exact, algorithms restricts the applicability
fication result. Note that the only difference that is statistically of the edit distance to graphs of rather small size.
significant (using a Z-test at the 95% level) is the deterioration In the current paper, we propose a suboptimal approach to
on Letter (H) from 63.0% to 61.6%. On the datasets GREC and graph edit distance computation, which is based on Munkres algo-
Molecules we observe that our algorithm outperforms the sec- rithm for solving the assignment problem. The assignment prob-
ond reference system (with statistical significance), while on lem consists in finding an assignment of the elements of two sets
Fingerprints the accuracy drops statistically significantly. On with each other such that a cost function is minimized. In the cur-
Proteins we have only one result. This result is comparable to rent paper we show how the graph edit distance problem can be
results published in [38], where a random walk graph kernel transformed into the assignment problem. Our proposed solution
was used. allows for the insertion, deletion, and substitution of both nodes
From the results reported in Table 2, we can conclude that in and edges, but considers these edit operations in a rather indepen-
general the classification accuracy of the 1-NN classifier is not neg- dent fashion from each. Therefore, while Munkres algorithm re-
atively affected by using the approximate rather than the exact turns the optimal solution to the assignment problem, our
edit distances. This is due to the fact that most of the overesti- proposed solution yields only a suboptimal, or approximate, solu-
mated distances belong to inter-class pairs of graphs, while tion to the graph edit distance problem. However, the time com-
intra-class distances are not strongly affected. Obviously, intra- plexity is only cubic in the number of nodes of the two
class distances are of much higher importance for a distance based underlying graphs.
classifier than inter-class distances. In other words, through the In the experimental section of this paper we compute edit dis-
approximation of the edit distances, the graphs are rearranged tance on seven different graph datasets. The graph based represen-
with respect to each other such that a better classification becomes tations of our sets cover line drawings, object and fingerprint
possible. Graphs which belong to the same class (according to the images, and molecular compounds. On four of our datasets the ex-
ground truth) often remain near, while graphs from different clas- act graph edit distance is not applicable since the graphs are too
ses are pulled apart from each other. Obviously, if the approxima- large. However, the proposed suboptimal method can cope with
tion is too inaccurate, the similarity measure and the underlying the task of graph edit distance on all datasets. The first finding of
classifier will be unfavorably disturbed. our work is that although BEAM, which is another suboptimal algo-
Fig. 9. Scatter plots of the exact edit distances (x-axis) and the suboptimal edit distances (y-axis).
K. Riesen, H. Bunke / Image and Vision Computing 27 (2009) 950–959 959
rithm developed previously, achieves remarkable speed-ups com- [12] J. Wong, J. Hopcroft, Linear time algorithm for isomorphism of planar graphs,
in: Proceedings of 6th Annual ACM Symposium on Theory of Computing, 1974,
pared to the exact method, our novel algorithm is still much faster.
pp. 172–184.
Furthermore, our new approach makes graph edit distance feasible [13] E. Luks, Isomorphism of graphs of bounded valence can be tested in
for graphs with up to 130 nodes. polynomial time, Journal of Computer and Systems Sciences 25 (1982) 42–65.
The second finding is that suboptimal graph edit distance need [14] A. Torsello, D. Hidovic-Rowe, M. Pelillo, Polynomial-time metrics for attributed
trees, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (7)
not necessarily lead to a deterioration of the classification accuracy (2005) 1087–1099.
of a distance based classifier. On two out of four datasets, the clas- [15] P. Dickinson, H. Bunke, A. Dadej, M. Kraetzl, On graphs with unique node
sification accuracy of a nearest-neighbor classifier remains the labels, in: E. Hancock, M. Vento (Eds.), Proceedings of 4th International
Workshop on Graph Based Representations in Pattern Recognition, LNCS,
same or even improves when the exact edit distances are replaced 2726, Springer, 2003, pp. 13–23.
by the suboptimal ones returned by our novel algorithm. This can [16] M. Eshera, K. Fu, A graph distance measure for image analysis, IEEE Transactions
be explained by the fact that all distances computed by the on Systems, Man, and Cybernetics (Part B) 14 (3) (1984) 398–408.
[17] M. Eshera, K. Fu, A similarity measure between attributed relational graphs for
suboptimal method are equal to, or larger than, the true distances. image analysis, in: Proceedings of 7th International Conference on Pattern
Moreover, an experimental analysis has shown that the larger the Recognition, 1984, pp. 75–77.
true distances are the larger is their overestimation. In other [18] J. Munkres, Algorithms for the assignment and transportation problems,
Journal of the Society for Industrial and Applied Mathematics 5 (1957) 32–38.
words, smaller distances are computed more accurately than larger [19] K. Riesen, M. Neuhaus, H. Bunke, Bipartite graph matching for computing the
distances by our suboptimal algorithm. This means that inter-class edit distance of graphs, in: F. Escolano, M. Vento (Eds.), in: Proceedings of 6th
distances are more affected than intra-class distances by the International Workshop on Graph Based Representations in Pattern
Recognition, LNCS 4538, 2007, pp. 1–12.
suboptimal algorithm. However, for a nearest-neighbor classifier,
[20] R. Wagner, M. Fischer, The string-to-string correction problem, Journal of the
small distances have more influence on the decision than large dis- Association for Computing Machinery 21 (1) (1974) 168–173.
tances. Hence no serious deterioration of the classification accu- [21] R. Ambauen, S. Fischer, H. Bunke, Graph edit distance with node splitting and
racy occurs when our novel suboptimal algorithm is used instead merging and its application to diatom identification, in: E. Hancock, M. Vento
(Eds.), Proceedings of 4th International Workshop on Graph Based
of an exact method. Representations in Pattern Recognition, LNCS, 2726, Springer, 2003, pp. 95–106.
In future work we will study the problem of graph clustering [22] P. Hart, N. Nilsson, B. Raphael, A formal basis for the heuristic determination of
using the proposed graph edit distance method, and integrate the minimum cost paths, IEEE Transactions of Systems, Science, and Cybernetics 4
(2) (1968) 100–107.
suboptimal distances into more sophisticated graph classifiers-like [23] H. Kuhn, The Hungarian method for the assignment problem, Naval Research
graph kernels. Also the comparison with other graph matching ap- Logistic Quarterly 2 (1955) 83–97.
proaches could be a rewarding avenue to be explored. [24] F. Bourgeois, J. Lassalle, An extension of the Munkres algorithm for the
assignment problem to rectangular matrices, Communications of the ACM 14
(12) (1971) 802–804.
References [25] K. Riesen, M. Neuhaus, H. Bunke, Graph embedding in vector spaces by means
of prototype selection, in: F. Escolano, M. Vento (Eds.), in: Proceedings of 6th
[1] D. Conte, P. Foggia, C. Sansone, M. Vento, Thirty years of graph matching in International Workshop on Graph Based Representations in Pattern
pattern recognition, International Journal of Pattern Recognition and Artificial Recognition, LNCS 4538, 2007, pp. 383–393.
Intelligence 18 (3) (2004) 265–298. [26] M. Neuhaus, H. Bunke, Bridging the Gap between Graph Edit Distance and
[2] H. Bunke, G. Allermann, Inexact graph matching for structural pattern Kernel Machines, World Scientific, 2007.
recognition, Pattern Recognition Letters 1 (1983) 245–253. [27] S. Nene, S. Nayar, H. Murase, Columbia object image library: coil-100,
[3] A. Sanfeliu, K. Fu, A distance measure between attributed relational graphs for Technical Report, Department of Computer Science, Columbia University,
pattern recognition, IEEE Transactions on Systems, Man, and Cybernetics (Part New York, 1996.
B) 13 (3) (1983) 353–363. [28] D. Comaniciu, P. Meer, Robust analysis of feature spaces: color image
[4] M. Neuhaus, H. Bunke, Edit distance based kernel functions for structural segmentation, in: IEEE Conference on Computer Vision and Pattern
pattern classification, Pattern Recognition 39 (10) (2006) 1852–1863. Recognition, 1997, pp. 750–755.
[5] M. Neuhaus, H. Bunke, An error-tolerant approximate matching algorithm for [29] P. Dosch, E. Valveny, Report on the second symbol recognition contest, in: L.
attributed planar graphs and its application to fingerprint classification, in: A. Wenyin, J. Llados (Eds.), Graphics Recognition. Ten years review and future
Fred, T. Caelli, R. Duin, A. Campilho, D. de Ridder (Eds.), Proceedings of 10th perspectives, Proceedings of 6th International Workshop on Graphics
International Workshop on Structural and Syntactic Pattern Recognition, LNCS, Recognition (GREC’05), LNCS, 3926, Springer, 2005, pp. 381–397.
3138, Springer, 2004, pp. 180–189. [30] R. Zhou, C. Quek, G. Ng, A novel single-pass thinning algorithm and an effective
[6] A. Robles-Kelly, E. Hancock, Graph edit distance from spectral seriation, IEEE set of performance criteria, Pattern Recognition Letters 16 (12) (1995) 1267–
Transactions on Pattern Analysis and Machine Intelligence 27 (3) (2005) 365– 1275.
378. [31] M. Neuhaus, H. Bunke, A graph matching based approach to fingerprint
[7] M. Boeres, C. Ribeiro, I. Bloch, A randomized heuristic for scene recognition by classification using directional variance, in: T. Kanade, A. Jain, N. Ratha (Eds.),
graph matching, in: C. Ribeiro, S. Martins (Eds.), Proceedings of 3rd Workshop Proceedings of 5th International Conference on Audio- and Video-based
on Efficient and Experimental Algorithms, LNCS, 3059, Springer, 2004, pp. 100– Biometric Person Authentication, LNCS, 3546, Springer, 2005, pp. 191–200.
113. [32] C. Watson, C. Wilson, NIST Special Database 4, Fingerprint Database, National
[8] S. Sorlin, C. Solnon, Reactive tabu search for measuring graph similarity, in: L. Institute of Standards and Technology, 1992.
Brun, M. Vento (Eds.), Proceedings of 5th International Workshop on Graph- [33] E. Henry, Classification and Uses of Finger Prints, Routledge, London, 1900.
based Representations in Pattern Recognition, LNCS, 3434, Springer, 2005, pp. [34] D.T.P. DTP, AIDS antiviral screen, 2004. Available from: <http://dtp.nci.nih.gov/
172–182. docs/aids/aids-data.html>.
[9] D. Justice, A. Hero, A binary linear programming formulation of the graph edit [35] H. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. Bhat, H. Weissig, I. Shidyalov,
distance, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 P. Bourne, The protein data bank, Nucleic Acids Research 28 (2000) 235–242.
(8) (2006) 1200–1214. [36] I. Schomburg, A. Chang, C. Ebeling, M. Gremse, C. Heldt, G. Huhn, D. abd
[10] M. Neuhaus, K. Riesen, H. Bunke, Fast suboptimal algorithms for the Schomburg, Brenda, the enzyme database: updates and major new
computation of graph edit distance, in: C. Ribeiro, S. Martins (Eds.), developments, Nucleic Acids Research 32 (2004). Database issue: D431–D433.
Proceedings of 11th International Workshop on Structural and Syntactic [37] V. Levenshtein, Binary codes capable of correcting deletions, insertions and
Pattern Recognition, LNCS, 3059, Springer, 2006, pp. 163–172. reversals, Soviet Physics Doklady 10 (8) (1966) 707–710.
[11] A. Cross, R. Wilson, E. Hancock, Inexact graph matching using genetic search, [38] K. Borgwardt, H.P. Kriegel, Shortest-path kernels on graphs, in: Proceedings of
Pattern Recognition 30 (6) (1997) 953–970. 5th International Conference on Data Mining, 2005, pp. 74–81.