Multiple Polyline to Polygon Matching

Mirela Tănase¹, Remco C. Veltkamp¹, Herman Haverkort²

¹ Department of Information & Computing Sciences, Utrecht University, The Netherlands
² Department of Mathematics and Computing Science, TU Eindhoven, The Netherlands

Institute of Information and Computing Sciences, Utrecht University
Technical Report UU-CS-2005-017
www.cs.uu.nl
Abstract

This paper addresses the partial shape matching problem, which helps identify similarities even when a significant portion of one shape is occluded or seriously distorted. We introduce a measure for computing the similarity between multiple polylines and a polygon, which can be computed in O(km²n²) time with a straightforward dynamic programming algorithm. We then present a novel fast algorithm that runs in O(kmn log mn) time. Here, m denotes the number of vertices in the polygon, and n is the total number of vertices in the k polylines that are matched against the polygon. The effectiveness of the similarity measure has been demonstrated in a part-based retrieval application with known ground truth.
1 Introduction
The motivation for multiple polyline to polygon matching is twofold. Firstly, the matching of shapes has mostly been done by comparing them as a whole [3, 14, 18, 23, 24, 25, 26, 28, 31]. This fails when a significant part of one shape is occluded or distorted by noise. In this paper, we address the partial shape matching problem: matching portions of two given shapes. Secondly, partial matching helps identify similarities even when a significant portion of one shape boundary is occluded or seriously distorted. It could also help in identifying similarities between contours of a non-rigid object in different configurations of its moving parts, like the contours of a sitting and a walking cat. Finally, partial matching helps alleviate the problem of unreliable object segmentation from images, where over- or under-segmentation gives only partially correct contours.
Despite its usefulness, considerably less work has been done on partial shape matching than on global shape matching. One reason could be that partial shape matching is more involved than global shape matching, since it poses two additional difficulties: identifying the portions of the shapes to be matched, and achieving invariance to scaling. If the shape representation is not scale invariant, shapes at different scales can be globally matched after scaling both shapes to the same length or to the same area of the minimum bounding box. Such solutions work reasonably well in practice for many global similarity measures. When doing partial matching, however, it is unclear how the scaling should be done, since there is no way of knowing in advance the relative magnitude of the matching portions of the two shapes. This can be seen in figure 1: any two of the three shapes in the figure should give a high score when partially matched, but for each shape the scale at which this matching should be done depends on the portions of the shapes that match.
Contribution. Firstly, we introduce a measure for computing the similarity between multiple polylines and a polygon. The polylines could, for example, be pieces of an object contour incompletely extracted from an image, or boundary parts in a decomposition of an object contour. The measure we propose is based on the turning function representation of the polylines and the polygon. We then derive a number of non-trivial properties of the similarity measure.
Secondly, based on these properties we characterize the optimal solution, which leads to a straightforward O(km²n²)-time dynamic programming algorithm. We then present a novel O(kmn log mn)-time algorithm. Here, m denotes the number of vertices in the polygon, and n is the total number of vertices in the k polylines that are matched against the polygon.
Figure 1: Scale invariance in partial matching is difficult.
Thirdly, we have experimented with a part-based retrieval application. Given a large collection of shapes and a query consisting of a set of polylines, we want to retrieve those shapes in the collection that best match our query. The polylines forming the query are boundary parts in a decomposition of a database shape – both this database shape and the parts in the query are selected by the user. The evaluation on the basis of a known ground truth indicates that a part-based approach to matching improves the global matching performance for difficult categories of shapes.
2 Related Work
Arkin et al. [3] describe a metric for comparing two whole polygons that is invariant under translation, rotation and scaling. It is based on the L₂-distance between the turning functions of the two polygons, and can be computed in O(mn log mn) time, where m is the number of vertices in one polygon and n is the number of vertices in the other.
Most partial shape matching methods are based on computing local features of the contour, and then looking for correspondences between the features of the two shapes, for example points of high curvature [2, 17]. Other local feature-based approaches make use of genetic algorithms [22], k-means clustering [7], or fuzzy relaxation techniques [20]. Such local feature-based solutions work well when the matched subparts are almost equivalent up to a transformation such as translation, rotation or scaling, because for such subparts the sequences of local features are very similar. However, parts that we perceive as similar may have quite different local features (a different number of curvature extrema, for example).
Geometric hashing [32] is a method that determines whether there is a transformed subset of the query point set that matches a subset of a target point set, by building a hash table in transformation space. The Hausdorff distance [11] also allows partial matching. For arbitrary non-empty bounded and closed sets A and B, it is the maximum of the distances from each point of A to its nearest point of B, and vice versa. Both methods are designed for partial matching, but do not easily transfer to our case of matching multiple polylines to a polygon.
Partially matching the turning angle functions of two polylines under scaling, translation and rotation can be done in O(m²n²) time [8]. Given two matches with the same squared error, the match involving the longer part of the polylines has the lower dissimilarity. The dissimilarity measure is a function of the scale, rotation, and the shift of one polyline along the other. However, this works only for two single polylines.
Latecki et al. [15] establish the best correspondence of parts in a decomposition of the matched shapes. The best correspondence between the maximal convex arcs of two simplified versions of the original shapes gives the partial similarity measure between the shapes. One serious drawback of this approach is the fact that the matching is done between parts of simplified shapes at "the appropriate evolution stage" [14]. How these evolution stages are identified is not indicated in their papers, though it certainly has an effect on the quality of the matching process. Also, by looking for correspondences between parts of the given shapes, false negatives could appear in the retrieval process. An example of such a case is in figure 2, where two parts of the query shape Q are denoted by Q_1 and Q_2, and two parts of the database shape P are denoted by P_1 and P_2. Though P has a portion of P_1 ∪ P_2 similar to a portion of Q_1 ∪ Q_2, no correspondence among these four parts succeeds in recognizing this. Considering different stages of the curve evolution (such that P_1 ∪ P_2 and Q_1 ∪ Q_2 constitute individual parts) does not solve the problem either. This is because the method used for matching two chains normalizes the two chains to unit length, and thus no shifting of one endpoint of one chain over the other is done. Also notice that if, instead of making correspondences between parts, we shift the parts of one shape over the other shape and select those that give a close match, the similarity between the shapes in figure 2 is reported.

Figure 2: Partial matching based on correspondence of parts may fail to identify similarities.
In our approach, we also use a decomposition into parts in order to identify the matching portions,
but we do this only for one of the shapes. In the matching process between a (union of) part(s) and an
undecomposed shape, the part(s) are shifted along the shape, and the optimal placement is computed based
on their turning functions.
In the next section, we introduce a similarity measure for matching a union of possibly disjoint parts
and a whole shape. This similarity measure is a turning angle function-based similarity, minimized over all
possible shiftings of the endpoints of the parts over the shape, and also over all independent rotations of the
parts. Since we allow the parts to rotate independently, this measure could capture the similarity between
contours of non-rigid objects, with parts in different relative positions.
3 Polylines-to-Polygon Matching
We concentrate on the problem of matching an ordered set {P_1, P_2, ..., P_k} of k polylines against a polygon P. We want to compute how close an ordered set of polylines {P_1, P_2, ..., P_k} is to being part of the boundary of P, in the given order, in counterclockwise direction around P (see figure 3). For this purpose, the polylines are rotated and shifted along the polygon P, in such a way that the pieces of the boundary of P "covered" by the k polylines are mutually disjoint except possibly at their endpoints. In this section we define a similarity measure for such a polylines-to-polygon matching. The measure is based on the turning function representation of the given polylines and it is invariant under translation and rotation. We first give an O(km²n²)-time dynamic programming algorithm for computing this similarity measure, where m is the number of vertices in P and n is the total number of vertices in the k polylines. We then refine this algorithm to obtain a running time of O(kmn log mn). Note that the fact that P is a polygon, and not an open polyline, comes from the intended application of part-based shape retrieval.
3.1 Similarity between polyline and polygon
The turning function Θ_A of a polygon A measures the angle of the counterclockwise tangent with respect to a reference orientation as a function of the arc length s, measured from some reference point on the boundary of A. It is a piecewise-constant function, with jumps corresponding to the vertices of A. A rotation of A by an angle θ corresponds to shifting Θ_A over a distance θ in the vertical direction. Moving the location of the reference point A(0) over a distance t ∈ [0, l_A) along the boundary of A corresponds to shifting Θ_A horizontally over a distance t.
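To make this concrete, the turning function of a polygon can be computed in linear time from its vertex list. The following Python sketch (our own illustration, not code from the report) returns the function as parallel lists of arc-length breakpoints and angle values; the angle unwrapping is the standard choice that makes a full counterclockwise traversal accumulate 2π.

```python
import math

def turning_function(vertices):
    """Turning function of a simple polygon given as a list of (x, y) vertices
    in counterclockwise order. Returns (breakpoints, angles, perimeter):
    on arc-length interval [breakpoints[i], breakpoints[i+1]) the function
    has the constant value angles[i]."""
    n = len(vertices)
    breakpoints, angles = [], []
    s = 0.0        # arc length travelled so far
    theta = None   # cumulative turning angle
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        raw = math.atan2(y1 - y0, x1 - x0)   # direction of the current edge
        if theta is None:
            theta = raw
        else:
            # unwrap: add the signed exterior angle at the shared vertex
            theta += (raw - theta + math.pi) % (2 * math.pi) - math.pi
        breakpoints.append(s)
        angles.append(theta)
        s += math.hypot(x1 - x0, y1 - y0)
    return breakpoints, angles, s

# A counterclockwise unit square yields angles [0, pi/2, pi, 3*pi/2]
# at breakpoints [0, 1, 2, 3], with perimeter 4.
```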
The distance between two polygons A and B is defined as the L₂-distance between their turning functions Θ_A and Θ_B, minimized with respect to the vertical and horizontal shifts of these functions (in other words, minimized with respect to rotation and choice of reference point). More formally, suppose A and B are two polygons with perimeter lengths l_A and l_B, respectively, and the polygon B is placed over A in such a way that the reference point B(0) of B coincides with the point A(t) at distance t along A from the reference point A(0), and B is rotated clockwise by an angle θ with respect to the reference orientation. A pair (t, θ) ∈ ℝ × ℝ will be referred to as a placement of B over A. The first component t of a placement (t, θ) is also called a horizontal shift, since it corresponds to a horizontal shifting of Θ_A over a distance t, while the second component θ is also called a vertical shift, since it corresponds to a vertical shifting of Θ_A over a distance θ.

Figure 3: Matching an ordered set {P_1, P_2, P_3} of polylines against a polygon P.

We define the quadratic similarity f(A, B, t, θ) between A and B for a given placement (t, θ) of B over A as the square of the L₂-distance between their turning functions Θ_A and Θ_B:

    f(A, B, t, θ) = ∫₀^{l_B} (Θ_A(s + t) − Θ_B(s) + θ)² ds.

The similarity between two polygons A and B is then given by:

    min_{θ∈ℝ, t∈[0,l_A)} √( f(A, B, t, θ) ).
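As an illustration of the definition above (our own sketch, not from the report), f(A, B, t, θ) can be evaluated numerically once the turning functions are available as step functions; an exact sweep over the O(m + n) intervals of constancy is equally possible, but simple quadrature keeps the sketch short. Θ_A is looked up with the periodic extension Θ_A(s + l_A) = Θ_A(s) + 2π that is made explicit in section 3.2.

```python
import bisect
import math

def step_value(breaks, vals, period, x):
    """Value at x of a piecewise-constant function on [0, period), extended by
    f(x + period) = f(x) + 2*pi (one full counterclockwise turn per period)."""
    wraps, x = divmod(x, period)
    i = bisect.bisect_right(breaks, x) - 1
    return vals[i] + wraps * 2 * math.pi

def quadratic_similarity(theta_a, theta_b, l_b, t, theta, samples=4096):
    """f(A, B, t, theta): integral over [0, l_B) of
    (Theta_A(s + t) - Theta_B(s) + theta)^2 ds, by midpoint quadrature.
    theta_a and theta_b are (breakpoints, values, period) triples; the
    polyline case uses the same code, with s never wrapping past l_B."""
    h = l_b / samples
    total = 0.0
    for i in range(samples):
        s = (i + 0.5) * h
        diff = step_value(*theta_a, s + t) - step_value(*theta_b, s) + theta
        total += diff * diff * h
    return total
```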
To achieve invariance under scaling, Arkin et al. [3] propose scaling the two polygons to unit length prior to the matching.

For measuring the difference between a polygon A and a polyline B, the same measure can be used (for matching two polylines, some slight adaptations would be needed). Notice that, for the purpose of our part-based retrieval application, we want a polyline B that is part of a polygon A to match this polygon perfectly, that is: their similarity should be zero. But if we scale A and B to the same length, their turning functions will no longer match perfectly. For this reason we do not scale the polyline or the polygon prior to the matching process. Thus, our similarity measure is not scale invariant.

In a part-based retrieval application (see section 4) using a similarity measure based on the above quadratic similarity (see section 3.2), we achieve robustness to scaling by normalizing all shapes in the collection to the same diameter of their circumscribed circle. This is a reasonable solution for our collection, which does not contain occluded or incomplete shapes. Moreover, these shapes are object contours and not synthetic shapes.
3.2 Similarity between multiple polylines and a polygon
Let Θ : [0, l] → ℝ be the turning function of a polygon P with m vertices and perimeter length l. Since P is a closed polyline, the domain of Θ can easily be extended to the entire real line by Θ(s + l) = Θ(s) + 2π. Let {P_1, P_2, ..., P_k} be a set of polylines, and let Θ_j : [0, l_j] → ℝ denote the turning function of the polyline P_j of length l_j. If P_j is made of n_j segments, Θ_j is piecewise constant with n_j − 1 jumps. For simplicity of exposition, f_j(t, θ) denotes the quadratic similarity

    f_j(t, θ) = ∫₀^{l_j} (Θ(s + t) − Θ_j(s) + θ)² ds.
We assume the polylines {P_1, P_2, ..., P_k} satisfy the condition Σ_{j=1}^{k} l_j ≤ l. The similarity measure, which we denote by d(P_1, ..., P_k; P), is the square root of the sum of the quadratic similarities f_j, minimized over all valid placements of P_1, ..., P_k over P (or in other words, minimized over all valid horizontal and vertical shifts of their turning functions):

    d(P_1, ..., P_k; P) = min_{valid placements (t_1,θ_1), ..., (t_k,θ_k)} ( Σ_{j=1}^{k} f_j(t_j, θ_j) )^{1/2}.
4
l 2l
Θ
t
1
t
1
+ l
1
Θ
1
t
2
t
2
+ l
2
Θ
2
t
3
t
3
+ l
3
< t
1
+ l
Θ
3
Figure 4: To compute d(P
1
, . . . , P
k
; P) between the polylines P
1
, . . . , P
3
and the polygon P, we shift the
turning functions Θ
1
, Θ
2
, and Θ
3
horizontally and vertically over Θ.
It remains to define what the valid placements are. The horizontal shifts t_1, ..., t_k correspond to shiftings of the starting points of the polylines P_1, ..., P_k along P. We require that the starting points of P_1, ..., P_k are matched with points on the boundary of P in counterclockwise order around P, that is: t_{j−1} ≤ t_j for all 1 < j ≤ k, and t_k ≤ t_1 + l. Furthermore, we require that the matched parts are disjoint (except possibly at their endpoints), sharpening the constraints to t_{j−1} + l_{j−1} ≤ t_j for all 1 < j ≤ k, and t_k + l_k ≤ t_1 + l (see figure 4).
The vertical shifts θ_1, ..., θ_k correspond to rotations of the polylines P_1, ..., P_k with respect to the reference orientation, and are independent of each other. Therefore, in an optimal placement the quadratic similarity between a particular polyline P_j and P depends only on the horizontal shift t_j, while the vertical shift must be optimal for the given horizontal shift. We can thus express the similarity between P_j and P for a given positioning t_j of the starting point of P_j over P as: f*_j(t_j) = min_{θ∈ℝ} f_j(t_j, θ).
The similarity between the polylines P_1, ..., P_k and the polygon P is thus:

    d(P_1, ..., P_k; P) = min_{t_1∈[0,l), t_2,...,t_k∈[0,2l); ∀j∈{2,...,k}: t_{j−1}+l_{j−1}≤t_j; t_k+l_k≤t_1+l} ( Σ_{j=1}^{k} f*_j(t_j) )^{1/2}.    (1)
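The constraint structure of equation (1) can be made concrete with a brute-force reference evaluation (our own sketch, with hypothetical parameter names): given callables for the rotation-minimized similarities f*_j (for instance built as in the sketch after lemma 1 below), it searches a grid of horizontal shifts subject to the disjointness constraints. It is exponential in k and only approximates the continuous minimum, so it is useful solely as a sanity check for the dynamic programming algorithms of sections 3.5 and 3.6.

```python
import itertools
import math

def d_bruteforce(fstar, lengths, l, steps=60):
    """Grid-search reference for equation (1). fstar[j](t) evaluates f*_{j+1}(t)
    (periodic with period l, so it must accept t in [0, 2l)); lengths[j] is
    l_{j+1}; l is the perimeter of the polygon P."""
    k = len(fstar)
    grid1 = [i * l / steps for i in range(steps)]        # t_1 in [0, l)
    grid2 = [i * 2 * l / steps for i in range(steps)]    # t_j in [0, 2l), j >= 2
    best = math.inf
    for t in itertools.product(grid1, *([grid2] * (k - 1))):
        # counterclockwise order and disjoint matched parts:
        # t_{j-1} + l_{j-1} <= t_j for 1 < j <= k, and t_k + l_k <= t_1 + l
        if all(t[j - 1] + lengths[j - 1] <= t[j] for j in range(1, k)) \
                and t[k - 1] + lengths[k - 1] <= t[0] + l:
            best = min(best, sum(fstar[j](t[j]) for j in range(k)))
    return math.sqrt(best) if best < math.inf else math.inf
```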
3.3 Properties of the similarity function
In this section we give a few properties of f*_j(t), as functions of t, that constitute the basis of the algorithms for computing d(P_1, ..., P_k; P) in sections 3.5 and 3.6. We also give a simpler formulation of the optimization problem in the definition of d(P_1, ..., P_k; P). Arkin et al. [3] have shown that for any fixed t, the function f_j(t, θ) is a convex quadratic function of θ. This implies that for a given t, the optimization problem min_{θ∈ℝ} f_j(t, θ) has a unique solution, given by the root θ*_j(t) of the equation ∂f_j(t, θ)/∂θ = 0. Since

    ∂f_j(t, θ)/∂θ = ∫₀^{l_j} 2(Θ(s + t) − Θ_j(s) + θ) ds = ∫₀^{l_j} 2(Θ(s + t) − Θ_j(s)) ds + 2θ l_j,    (2)

we get:
Lemma 1 For a given positioning t of the starting point of P_j over P, the rotation that minimizes the quadratic similarity between P_j and P is given by θ*_j(t) = −(1/l_j) ∫₀^{l_j} (Θ(s + t) − Θ_j(s)) ds.
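Lemma 1 translates directly into code: the optimal rotation is minus the mean difference of the two turning functions over the matched piece. The sketch below (ours, reusing step_value and quadratic_similarity from the sketch in section 3.1, again with quadrature standing in for an exact sweep) computes θ*_j(t) and with it f*_j(t).

```python
def optimal_rotation(theta, theta_j, l_j, t, samples=4096):
    """theta*_j(t) of lemma 1: -(1/l_j) * integral of Theta(s+t) - Theta_j(s)."""
    h = l_j / samples
    acc = 0.0
    for i in range(samples):
        s = (i + 0.5) * h
        acc += (step_value(*theta, s + t) - step_value(*theta_j, s)) * h
    return -acc / l_j

def fstar_j(theta, theta_j, l_j, t):
    """f*_j(t) = f_j(t, theta*_j(t)): quadratic similarity at the best rotation."""
    best_rot = optimal_rotation(theta, theta_j, l_j, t)
    return quadratic_similarity(theta, theta_j, l_j, t, best_rot)
```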
We now consider the properties of f*_j(t) = f_j(t, θ*_j(t)), as a function of t.

Lemma 2 The quadratic similarity f*_j(t) has the following properties:
i) it is periodic, with period l;
ii) it is piecewise quadratic, with mn_j breakpoints within any interval of length l; moreover, the parabolic pieces are concave.
Figure 5: For given horizontal and vertical shifts of Θ_j over Θ, the discontinuities of Θ_j and Θ form a set of rectangular strips.
Proof: i) Since Θ(t + l) = 2π + Θ(t), from lemma 1 we get that θ*_j(t + l) = θ*_j(t) − 2π. Thus

    f*_j(t + l) = ∫₀^{l_j} (Θ(s + t + l) − Θ_j(s) + θ*_j(t) − 2π)² ds = f*_j(t).

ii) Since f*_j(t) = f(P, P_j, t, θ*_j(t)) = f_j(t, θ*_j(t)), we have:

    f*_j(t) = ∫₀^{l_j} (Θ(s + t) − Θ_j(s) + θ*_j(t))² ds
            = ∫₀^{l_j} (Θ(s + t) − Θ_j(s))² ds + 2θ*_j(t) ∫₀^{l_j} (Θ(s + t) − Θ_j(s)) ds + l_j (θ*_j(t))².

Applying lemma 1 we get:

    f*_j(t) = ∫₀^{l_j} (Θ(s + t) − Θ_j(s))² ds − ( ∫₀^{l_j} (Θ(s + t) − Θ_j(s)) ds )² / l_j.    (3)
For the rest of this proof we restrict our attention to the behaviour of f*_j within an interval of length l. For a given value of the horizontal shift t, the discontinuities of Θ and Θ_j define a set S of rectangular strips. Each strip is defined by a pair of consecutive discontinuities of the two turning functions and is bounded from below and above by these functions.

As Θ_j is shifted horizontally with respect to Θ, there are at most mn_j critical events within an interval of length l where the configuration of the rectangular strips defined by Θ and Θ_j changes. These critical events are at values of t where a discontinuity of Θ_j coincides with a discontinuity of Θ. In between these critical events, the height of a strip σ is a constant h_σ, while its width w_σ is a linear function of t with coefficient −1, 0 or 1.
For a given configuration of strips, the value of ∫₀^{l_j} (Θ(s + t) − Θ_j(s))² ds is thus given by the sum, over the strips in the current configuration, of the width of each strip times the square of its height:

    ∫₀^{l_j} (Θ(s + t) − Θ_j(s))² ds = Σ_{σ∈S} w_σ(t) h_σ².

Similarly, we have:

    ∫₀^{l_j} (Θ(s + t) − Θ_j(s)) ds = Σ_{σ∈S} w_σ(t) h_σ.
Figure 6: The quadratic similarity f*_j is a piecewise quadratic function of t, with concave parabolic pieces.
And thus equation (3) can be rewritten as:

    f*_j(t) = Σ_{σ∈S} w_σ(t) h_σ² − (1/l_j) ( Σ_{σ∈S} w_σ(t) h_σ )².    (4)

Since the functions w_σ are linear functions of t, the function f*_j is a piecewise quadratic function of t, with at most mn_j breakpoints within an interval of length l. Moreover, since the coefficient of t² in equation (4) is negative, the parabolic pieces of f*_j are concave (see figure 6).
The following corollary indicates that in computing the minimum of the function f*_j, it suffices to restrict our attention to a discrete set of at most mn_j points.

Corollary 1 The local minima of the function f*_j are among the breakpoints between its parabolic pieces.
We now give a simpler formulation of the optimization problem in the definition of d(P_1, ..., P_k; P). In order to simplify the restrictions on t_j in equation (1), we define:

    f̃_j(t) := f*_j( t + Σ_{i=1}^{j−1} l_i ).

In other words, the function f̃_j is a copy of f*_j, but shifted to the left by Σ_{i=1}^{j−1} l_i. Obviously, f̃_j has the same properties as f*_j, that is: it is a piecewise quadratic function of t that has its local minima in at most mn_j breakpoints in any interval of length l. With this simple transformation of the set of functions f*_j, the optimization problem defining d(P_1, ..., P_k; P) becomes:
    d(P_1, ..., P_k; P) = min_{t_1∈[0,l), t_2,...,t_k∈[0,2l); ∀j∈{2,...,k}: t_{j−1}≤t_j; t_k≤t_1+l_0} ( Σ_{j=1}^{k} f̃_j(t_j) )^{1/2},    (5)

where l_0 := l − Σ_{i=1}^{k} l_i. Notice that if (t'_1, ..., t'_k) is a solution to the optimization problem in equation (5), then (t*_1, ..., t*_k), with t*_j := t'_j + Σ_{i=1}^{j−1} l_i, is a solution to the optimization problem in equation (1).
3.4 Characterization of an optimal solution
In this section we characterize the structure of an optimal solution to the optimization problem in equation (5), and give a recursive definition of this solution. This definition forms the basis of a straightforward dynamic programming solution to the problem.

Let (t*_1, ..., t*_k) be a solution to the optimization problem in equation (5).
Lemma 3 The values of an optimal solution (t*_1, ..., t*_k) are found in a discrete set of points X ⊂ [0, 2l), which consists of the breakpoints of the functions f̃_1, ..., f̃_k, plus two copies of each breakpoint: one shifted left by l_0 and one shifted right by l_0.
Proof: If t*_{j−1} ≠ t*_j and t*_j ≠ t*_{j+1}, the constraints on the choice of t*_1, ..., t*_k do not prohibit decreasing or increasing t*_j, thus t*_j must give a local minimum of the function f̃_j.

Notice now that t*_{j−1} = t*_j means that, in the parameterization of equation (1), the corresponding shifts satisfy t_j = t_{j−1} + l_{j−1}, thus the positioning of the ending point of P_{j−1} and of the starting point of P_j coincide. Then, in the given optimal placement the polylines P_{j−1} and P_j are "glued" together. Similarly, if t*_k = t*_1 + l_0, we have that P_k and P_1 are glued together.

For any j ∈ {1, ..., k}, let G_j be the maximal set of polylines containing P_j and glued together in the given optimal placement, that is: no polyline that is not in G_j is glued to a polyline in G_j. Observe that G_j must have either the form {P_g, ..., P_h} (with j ∈ {g, ..., h}) or the form {P_1, ..., P_h, P_g, ..., P_k} (with j ∈ {1, ..., h, g, ..., k}), for some g and h in {1, ..., k}.
In the first case, we have t*_i = t*_j for all i ∈ {g, ..., h}. Since G_j is a maximal set of polylines glued together, the constraints on the choice of t*_1, ..., t*_k do not prohibit decreasing or increasing t*_g, ..., t*_h simultaneously, so t*_j must give a local minimum of the function f_{gh} : ℝ → ℝ, f_{gh}(t) = Σ_{i=g}^{h} f̃_i(t). This function is obviously piecewise quadratic, with concave parabolic pieces, and the breakpoints of f_{gh} are the breakpoints of the constituent functions f̃_g, ..., f̃_h. Note that a local minimum of f_{gh} may not be a local minimum of any of the constituent functions f̃_i (g ≤ i ≤ h), but nevertheless it will always be given by a breakpoint of one of the constituent functions.
In the second case, we can argue in a similar manner that t*_j must give a local minimum of the function f_{gh} : ℝ → ℝ, f_{gh}(t) = Σ_{i=g}^{k} f̃_i(t) + Σ_{i=1}^{h} f̃_i(t − l_0) (when j ∈ {g, ..., k}), or of the function f_{gh} : ℝ → ℝ, f_{gh}(t) = Σ_{i=g}^{k} f̃_i(t + l_0) + Σ_{i=1}^{h} f̃_i(t) (when j ∈ {1, ..., h}). Thus, t*_j must be a breakpoint of one of the constituent functions, or it must be such a breakpoint shifted left or right by l_0.
We call a point in [0, 2l) which is either a breakpoint of one of the functions f̃_1, ..., f̃_k, or such a breakpoint shifted left or right by l_0, a critical point. Since function f̃_j has 2mn_j breakpoints in [0, 2l), the total number of critical points in [0, 2l) is at most 6m Σ_{i=1}^{k} n_i = 6mn. Let X = {x_0, ..., x_{N−1}} be the set of critical points in [0, 2l).
With the observations above, the optimization problem we have to solve is:

    d(P_1, ..., P_k; P) = min_{t_1,...,t_k ∈ X; ∀j>1: t_{j−1}≤t_j; t_k−t_1≤l_0} ( Σ_{j=1}^{k} f̃_j(t_j) )^{1/2}.    (6)
We now show that an optimal solution of this problem can be constructed recursively. For this purpose, we denote:

    D[j, a, b] = min_{t_1,...,t_j ∈ X; x_a ≤ t_1 ≤ ... ≤ t_j ≤ x_b} Σ_{i=1}^{j} f̃_i(t_i),    (7)

where j ∈ {1, ..., k}, a, b ∈ {0, ..., N−1}, and a ≤ b. Equation (7) describes the subproblem of matching the set {P_1, ..., P_j} of j polylines to a subchain of P, starting at P(x_a) and ending at P(x_b + Σ_{i=1}^{j} l_i). We now show that D[j, a, b] can be computed recursively. Let (t*_1, ..., t*_j) be an optimal solution for D[j, a, b]. Regarding the value of t*_j we distinguish two cases:
• t*_j = x_b, in which case (t*_1, ..., t*_{j−1}) must be an optimal solution for D[j−1, a, b], otherwise (t*_1, ..., t*_j) would not give a minimum for D[j, a, b]; thus in this case, D[j, a, b] = D[j−1, a, b] + f̃_j(x_b);

• t*_j ≠ x_b, in which case (t*_1, ..., t*_j) must be an optimal solution for D[j, a, b−1], otherwise (t*_1, ..., t*_j) would not give a minimum for D[j, a, b]; thus in this case D[j, a, b] = D[j, a, b−1].
We can now conclude that:

    D[j, a, b] = min{ D[j−1, a, b] + f̃_j(x_b), D[j, a, b−1] },  for j ≥ 1 ∧ a ≤ b,    (8)

where the boundary cases are D[0, a, b] = 0, and D[j, a, a−1] has no solution (its value is taken to be ∞).

A solution of the optimization problem (6) is then given by:

    d(P_1, ..., P_k; P) = min_{x_a, x_b ∈ X, x_b − x_a ≤ l_0} √( D[k, a, b] ).    (9)
3.5 A straightforward algorithm

Equations (8) and (9) lead to a straightforward dynamic programming algorithm for computing the similarity measure d(P_1, ..., P_k; P):

Algorithm SimpleCompute d(P_1, ..., P_k; P)
1.  Compute the set of critical points X = {x_0, ..., x_{N−1}}, and sort them.
2.  For all j ∈ {1, ..., k} and all i ∈ {0, ..., N−1}, evaluate f̃_j(x_i).
3.  MIN ← ∞
4.  for a ← 0 to N−1 do
5.      for j ← 1 to k do D[j, a, a−1] ← ∞
6.      b ← a
7.      while x_b − x_a ≤ l_0 do
8.          D[0, a, b] ← 0
9.          for j ← 1 to k do
10.             D[j, a, b] ← min{ D[j−1, a, b] + f̃_j(x_b), D[j, a, b−1] }
11.         b ← b + 1
12.     MIN ← min(D[k, a, b−1], MIN)
13. return √MIN
Lemma 4 The running time of algorithm SimpleCompute d(P_1, ..., P_k; P) is O(km²n²), and the algorithm uses O(km²n²) storage.
Proof: Step 1 of this algorithm requires first identifying all values of t such that a vertex of a turning function Θ_j is aligned with a vertex of Θ. For a fixed vertex v of a turning function Θ_j, all m values of t such that v is aligned with a vertex of Θ can be identified in O(m) time; the results for all n_j vertices of Θ_j can thus be collected in O(mn_j) time. We then have to shift these values left by Σ_{i=1}^{j−1} l_i and make copies of them shifted left and right by l_0. Overall, the identification of the set X of critical points takes O(Σ_{j=1}^{k} mn_j) = O(mn) time, and they can be sorted in O(mn log(mn)) time.
We now show that, for a particular value of j, evaluating f̃_j(x_i) for all i ∈ {0, ..., N−1} takes O(mn) time. The algorithm is an adaptation of the algorithm used in [3], and we thus indicate only its main ideas. As equation (4) from the proof of lemma 2 indicates, the function f̃_j(t) is the difference of two piecewise linear functions with the same set of breakpoints. Thus these functions can be completely determined by evaluating an initial value and the slope of each linear piece. It remains to indicate how to compute the slopes of these linear pieces.

Notice that when Θ_j is horizontally shifted over Θ such that the configuration S of the rectangular strips does not change, the only strips whose width is changing are those bounded by a jump of Θ and a jump of Θ_j. More precisely, let us denote by S_{ΘΘj} the set of rectangular strips in the current configuration S that are bounded on the left by a jump of Θ and on the right by a jump of Θ_j (see figure 7), by S_{ΘjΘ} the set of rectangular strips that are bounded on the left by a jump of Θ_j and on the right by a jump of Θ, and by S_{ΘΘ} and S_{ΘjΘj} the sets of rectangular strips that are bounded on both sides by a jump of Θ and of Θ_j, respectively. When shifting Θ_j such that the configuration S does not change, the widths of the strips in S_{ΘjΘj} and S_{ΘΘ} do not change, while the width of those in S_{ΘΘj} increases and of those in S_{ΘjΘ} decreases. Thus the slope of each linear piece of the sum functions in equation (4) is determined only by the rectangular strips in S_{ΘΘj} and S_{ΘjΘ}.

Figure 7: The four types of strips formed by the discontinuities of two turning functions Θ_j and Θ: σ_1, σ_4 ∈ S_{ΘjΘ}; σ_2, σ_6 ∈ S_{ΘΘj}; σ_3 ∈ S_{ΘjΘj}; and σ_5 ∈ S_{ΘΘ}.

Notice also that at each breakpoint of f̃_j, only three strips in S change their type. The evaluation of the function f̃_j thus starts with computing an initial configuration of strips, with the strips divided into the four groups mentioned above. An initial value f̃_j(0) can be computed from this configuration in O(m + n_j) time. We then go through the set X of critical events in order, and when necessary we update the configuration of strips (and thus the sets S_{ΘΘj} and S_{ΘjΘ}) and also compute the value of the function f̃_j, based on the slopes computed from S_{ΘΘj} and S_{ΘjΘ}. The updates of the configuration and the computation of the function values take constant time per critical point. Thus we can evaluate the function f̃_j at all critical points in X in O(N) = O(mn) time, and thus step 2 of the algorithm takes O(kmn) time.
The operations in the rest of the algorithm all take constant time each. Their total time is determined by the number of times line 10 is executed, which is at most N times for the loop in line 4, times N for the loop in line 7, times k for the loop in line 9, for a total of O(kN²) = O(km²n²) executions.

Thus, lines 4 to 12 dominate the running time of algorithm SimpleCompute, which is therefore O(km²n²). The amount of storage used by the algorithm is dominated by the space needed to store the table D, which is O(kN²) = O(km²n²).
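For concreteness, here is a direct Python transcription of SimpleCompute (our own sketch). It assumes the sorted critical points xs and the precomputed values fvals[j][i] = f̃_{j+1}(x_i) are given, computed as in the proof above; a rolling one-dimensional array replaces the table D[j, a, b], which reduces the storage to O(k + N) without changing the recurrence (the O(km²n²) storage bound of lemma 4 refers to the table formulation).

```python
import math

def simple_compute(xs, fvals, l0):
    """O(k N^2)-time dynamic program of section 3.5.
    xs: sorted critical points x_0, ..., x_{N-1};
    fvals[j][i]: f~_{j+1}(x_i), the shifted quadratic similarity of polyline j+1;
    l0: the slack l - (l_1 + ... + l_k). Returns d(P_1, ..., P_k; P)."""
    N, k = len(xs), len(fvals)
    best = math.inf
    for a in range(N):
        # D[j] plays the role of D[j, a, b-1]; D[j, a, a-1] = infinity, D[0,.,.] = 0
        D = [0.0] + [math.inf] * k
        b = a
        while b < N and xs[b] - xs[a] <= l0:
            for j in range(1, k + 1):
                # ascending j: D[j-1] already holds D[j-1, a, b],
                # while D[j] still holds D[j, a, b-1]  (equation (8))
                D[j] = min(D[j - 1] + fvals[j - 1][b], D[j])
            best = min(best, D[k])   # candidate for equation (9)
            b += 1
    return math.sqrt(best) if best < math.inf else math.inf
```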
3.6 A fast algorithm
The above time bound on the computation of the similarity measure d(P_1, ..., P_k; P) can be improved to O(kmn log mn). The refinement of the dynamic programming algorithm is based on the following property of equation (8):
Lemma 5 For any polyline P_j, j ∈ {1, ..., k}, and any critical point x_b, b ∈ {0, ..., N−1}, there is a critical point x_z, 0 ≤ z ≤ b, such that:
i) D[j, a, b] = D[j, a, b−1], for all a ∈ {0, ..., z−1}, and
ii) D[j, a, b] = D[j−1, a, b] + f̃_j(x_b), for all a ∈ {z, ..., b}.
a
1
, ..., t
a
j
) denote the lexicographically smallest solution among
the optimal solutions for the subproblem D[j, a, b]. For ease of notation, define t
a
0
= x
a
and t
a
j+1
= x
b
.
The proof of i) and ii) is based on the following observation: as a increases from 0 to b, the value t
a
i
can
only increase as well, or remain the same, for any i ∈ {1, . . . , j}. We first prove this observation.
For the sake of contradiction, assume that there is an a (0 < a ≤ b) and an i (1 ≤ i ≤ j) such that t^a_i < t^{a−1}_i. For such an a, let i′ be the smallest such i, and let i″ be the largest such i.

By our choice of i′ and i″, we have t^a_{i′−1} ≤ t^a_{i′} < t^{a−1}_{i′} ≤ t^{a−1}_{i″} ≤ t^{a−1}_{i″+1} ≤ t^a_{i″+1}. It follows that (t^a_1, ..., t^a_{i′−1}, t^{a−1}_{i′}, ..., t^{a−1}_{i″}, t^a_{i″+1}, ..., t^a_j) is a valid solution for j, a and b. From the fact that the lexicographically smallest optimal solution (t^a_1, ..., t^a_j) is different, we must conclude that it is at least as good, i.e.

    Σ_{i=i′}^{i″} f̃_i(t^a_i) ≤ Σ_{i=i′}^{i″} f̃_i(t^{a−1}_i).
But we also have t^{a−1}_{i′−1} ≤ t^a_{i′−1} ≤ t^a_{i′} ≤ t^a_{i″} < t^{a−1}_{i″} ≤ t^{a−1}_{i″+1}, from which it follows that (t^{a−1}_1, ..., t^{a−1}_{i′−1}, t^a_{i′}, ..., t^a_{i″}, t^{a−1}_{i″+1}, ..., t^{a−1}_j) is a valid solution for j, a−1 and b. From the fact that this solution is lexicographically smaller than the lexicographically smallest optimal solution (t^{a−1}_1, ..., t^{a−1}_j), we must conclude that it is suboptimal, so Σ_{i=i′}^{i″} f̃_i(t^a_i) > Σ_{i=i′}^{i″} f̃_i(t^{a−1}_i). This contradicts the conclusion of the previous paragraph. It follows that t^a_i ≥ t^{a−1}_i for all 0 < a ≤ b and 1 ≤ i ≤ j, and thus, by induction, that t^{a′}_i ≤ t^a_i for all 0 ≤ a′ ≤ a.
We can now prove items i) and ii) of the lemma. Let z be the smallest value in {0, ..., b} for which t^z_j = x_b (since t^b_j must be x_b, such a value always exists). Since, by our choice of z, we have t^a_j ≠ x_b for all a < z, we have D[j, a, b] = D[j, a, b−1] for all a ∈ {0, ..., z−1}. Moreover, for a ∈ {z, ..., b}, we have x_b = t^z_j ≤ t^a_j ≤ t^b_j = x_b, so t^a_j = x_b, and thus D[j, a, b] = D[j−1, a, b] + f̃_j(x_b).
For given j and b, we consider the function D[j, b] : {0, ..., N−1} → ℝ, with D[j, b](a) = D[j, a, b]. Lemma 5 expresses the fact that the values of function D[j, b] can be obtained from D[j, b−1] up to some value z, and from D[j−1, b] (while adding f̃_j(x_b)) from this value onwards. This property allows us to improve the time bound of the dynamic programming algorithm of the previous section. Instead of computing arrays of scalars D[j, a, b], we will compute arrays of functions D[j, b]. The key to success will be to represent these functions in such a way that they can be evaluated fast, and D[j, b] can be constructed from D[j−1, b] and D[j, b−1] fast. Before describing the data structure used for this purpose, we present the summarized refined algorithm:
Algorithm FastCompute d(P_1, ..., P_k; P)
1.  Compute the set of critical points X = {x_0, ..., x_{N−1}}, and sort them
2.  For all j ∈ {1, ..., k} and all b ∈ {0, ..., N−1}, evaluate f̃_j(x_b)
3.  ZERO ← a function that always evaluates to zero
4.  INFINITY ← a function that always evaluates to ∞
5.  MIN ← ∞
6.  a ← 0
7.  for j ← 1 to k do
8.      D[j, −1] ← INFINITY
9.  for b ← 0 to N−1 do
10.     D[0, b] ← ZERO
11.     for j ← 1 to k do
12.         Construct D[j, b] from D[j−1, b] and D[j, b−1]
13.     while x_a < x_b − l_0 do
14.         a ← a + 1
15.     val ← evaluation of D[k, b](a)
16.     MIN ← min(val, MIN)
17. return √MIN
The running time of this algorithm depends on how the functions D[j, b] are represented. In particular, to make steps 12 and 15 of the above algorithm efficient, we represent the functions D[j, b] by means of balanced binary trees. Asano et al. [4] used an idea similar in spirit.
3.6.1 An efficient representation for function D[j, b]
We now describe the tree T_{j,b} used for storing the function D[j, b]. Each node ν of T_{j,b} is associated with an interval [a⁻_ν, a⁺_ν], with 0 ≤ a⁻_ν ≤ a⁺_ν ≤ N−1. The root ρ is associated with the full domain, that is: a⁻_ρ = 0 and a⁺_ρ = N−1. Each node ν with a⁻_ν < a⁺_ν is an internal node that has a split value a_ν = ⌊(a⁻_ν + a⁺_ν)/2⌋ associated with it. Its left and right children are associated with [a⁻_ν, a_ν] and [a_ν + 1, a⁺_ν], respectively. Each node ν with a⁻_ν = a⁺_ν is a leaf of the tree, with a_ν = a⁻_ν = a⁺_ν. For any index a of a critical point x_a, we will denote the leaf ν that has a_ν = a by λ_a. Note that so far, the tree looks exactly the same for each function D[j, b]: the trees are balanced binary trees with N leaves and height O(log N). Moreover, all trees have the same associated intervals and split values in their corresponding nodes.

With each node ν we also store a weight w_ν, such that T_{j,b} has the following property: D[j, b](a) is the sum of the weights on the path from the tree root to the leaf λ_a. Such a representation of a function D[j, b] is not unique. A trivial example is to set w_ν equal to D[j, b](a_ν) for all leaves ν, and to set it to zero for all internal nodes. It will become clear below why we also allow representations with non-zero weights on internal nodes.
internal nodes.
11
T
j−1,b
T
j,b−1
T
j,b

λz
λ
0
λ
N−1
λ
0
λ
0
λz
λz
λ
N−1
λ
N−1
Figure 8: The tree T
j,b
is contructed from T
j,b−1
and T
j−1,b
by creating new nodes along the path from
the root to the leaf λ
z
, and adopting the subtrees to the left of the path fromT
j,b−1
, and the subtrees to the
right of the path from T
j−1,b
.
Furthermore, we store with each node ν a value m_ν, which is the sum of the weights on the path from the left child of ν to the leaf λ_{a_ν}, that is: the rightmost descendant of the left child of ν. This concludes the description of the data structure used for storing a function D[j, b].
Lemma 6 The data structure T_{j,b} for the representation of function D[j, b] can be operated on such that:
(i) The representation of a zero-function (i.e. a function that always evaluates to zero) can be constructed in O(N) time. Also the representation of a function that always evaluates to ∞ can be constructed in O(N) time.
(ii) Given the representation T_{j,b} of D[j, b], evaluating D[j, b](a) takes O(log N) time.
(iii) Given the representations T_{j−1,b} and T_{j,b−1} of the functions D[j−1, b] and D[j, b−1], respectively, a representation T_{j,b} of D[j, b] can be computed in O(log N) time.
Proof: (i) Constructing the representation of a zero-function involves nothing more than building a balanced binary tree on N leaves and setting all weights to zero. This can be done in O(N) time. The representation of a function that always evaluates to ∞ also takes O(N) time, since it involves only building a balanced binary tree on N leaves and setting the weights of all leaves to ∞, and those of the internal nodes to zero.

(ii) Once T_{j,b} representing the function D[j, b] has been constructed, the stored weights allow us to evaluate D[j, b](a) for any a in O(log N) time, by summing up the weights of the nodes on the path from the root to the leaf λ_a that corresponds to a.
(iii) To construct T_{j,b} from T_{j,b−1} and T_{j−1,b} efficiently, we take the following approach. We find the sequence of left and right turns that leads from the root of the trees down to the leaf λ_z, where z is defined as in lemma 5. Note that the sequences of left and right turns are the same in the trees T_{j,b}, T_{j,b−1}, and T_{j−1,b}; only the weights on the path differ. Though we do not compute z explicitly, we will show below that we are able to construct the path from the root of the tree to the leaf λ_z corresponding to z by identifying, based on the stored weights, at each node along this path whether the path continues into the right or left subtree of the current node.

Lemma 5 tells us that for each leaf left of λ_z, the total weight on the path to the root in T_{j,b} must be the same as the total weight on the corresponding path in T_{j,b−1}. At λ_z itself and right of λ_z, the total weights to the root in T_{j,b} must equal those in T_{j−1,b}, plus f̃_j(x_b). We construct the tree T_{j,b} with these properties as follows.
We start building T_{j,b} by constructing a root ρ. If the path to λ_z goes into the right subtree, we adopt as left child of ρ the corresponding left child ν of the root of T_{j,b−1}. There is no need to copy ν: we just add a pointer to it, thus at once making all descendants of ν in T_{j,b−1} into descendants of the corresponding node in the tree T_{j,b} under construction. Furthermore, we set the weight of ρ equal to the weight of the root of T_{j,b−1}. If the path to λ_z goes into the left subtree, we adopt the right child from T_{j−1,b} and take the weight of ρ from there, now adding f̃_j(x_b).

Then we make a new root for the other subtree of the root ρ, i.e. the one that contains λ_z, and continue the construction process in that subtree. Every time we go into the left branch, we adopt the right child from T_{j−1,b}, and every time we go into the right branch, we adopt the left child from T_{j,b−1} (see figure 8). For every constructed node ν, we set its weight w_ν so that the total weight of ν and its ancestors equals the total weight of the corresponding nodes in the tree from which we adopt ν's child; if the adopted subtree comes from T_{j−1,b}, we increase w_ν by f̃_j(x_b).
By keeping track of the accumulated weights on the paths down from the root in all the trees, we can set the weight of each newly constructed node ν correctly in constant time per node. The accumulated weights on the paths down from the root, together with the stored weights for the paths down to the left children's rightmost descendants, also allow us to decide in constant time which is better: D[j, b−1](a_ν) or D[j−1, b](a_ν) + f̃_j(x_b). This tells us whether λ_z is to be found in the left or in the right subtree of ν.

The complete construction process takes only O(1) time for each node on the path from ρ to λ_z. Since the trees are perfectly balanced, this path has only O(log N) nodes, so that T_{j,b} is constructed in O(log N) time.
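The following Python sketch (our own illustration; names and bookkeeping are assumptions, not code from the report) implements the three operations of lemma 6. Trees are immutable: combine creates new nodes only along the path towards the crossover leaf λ_z of lemma 5, locating it implicitly by comparing, at each split, the two options of equation (8) at the rightmost leaf of the left subtree, and shares every other subtree by reference.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Node:
    lo: int        # a^-_nu: leftmost leaf index
    hi: int        # a^+_nu: rightmost leaf index
    w: float       # weight w_nu
    m: float       # weight-sum from left child to its rightmost leaf (m_nu)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build_const(lo: int, hi: int, value: float) -> Node:
    """ZERO / INFINITY trees: the constant function a -> value, weights on leaves."""
    if lo == hi:
        return Node(lo, hi, value, 0.0)
    mid = (lo + hi) // 2
    return Node(lo, hi, 0.0, value,
                build_const(lo, mid, value), build_const(mid + 1, hi, value))

def evaluate(t: Node, a: int) -> float:
    """D[j, b](a): weight-sum on the root-to-leaf path (lemma 6 (ii))."""
    s = t.w
    while t.left is not None:
        t = t.left if a <= t.left.hi else t.right
        s += t.w
    return s

def combine(p: Node, q: Node, fb: float) -> Node:
    """T[j,b] from p = T[j-1,b], q = T[j,b-1] and fb = f~_j(x_b) (lemma 6 (iii))."""
    def rec(p: Node, q: Node, acc_p: float, acc_q: float, acc_new: float) -> Node:
        # acc_*: weight-sums of the strict ancestors in the respective trees
        if p.left is None:                 # leaf: equation (8) pointwise
            val = min(acc_p + p.w + fb, acc_q + q.w)
            return Node(p.lo, p.hi, val - acc_new, 0.0)
        dp = acc_p + p.w + p.m + fb        # option "P_j placed at x_b" at the split leaf
        dq = acc_q + q.w + q.m             # option "inherited from b-1" at the split leaf
        if dp <= dq:
            # lambda_z lies in the left subtree: share the right subtree of p (+fb)
            w = acc_p + p.w + fb - acc_new
            left = rec(p.left, q.left, acc_p + p.w, acc_q + q.w, acc_new + w)
            return Node(p.lo, p.hi, w, dp - (acc_new + w), left, p.right)
        else:
            # lambda_z lies in the right subtree: share the left subtree of q
            w = acc_q + q.w - acc_new
            right = rec(p.right, q.right, acc_p + p.w, acc_q + q.w, acc_new + w)
            return Node(p.lo, p.hi, w, dq - (acc_new + w), q.left, right)
    return rec(p, q, 0.0, 0.0, 0.0)
```

With this representation, line 12 of FastCompute is a single combine call, line 15 a single evaluate call, and the assignments in lines 8 and 10 copy references to build_const(0, N−1, 0.0) and build_const(0, N−1, float("inf")).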
Theorem 1 The similarity d(P_1, ..., P_k; P) between k polylines {P_1, ..., P_k} with n vertices in total, and a polygon P with m vertices, can be computed in O(kmn log(mn)) time using O(kmn log(mn)) storage.

Proof: We use algorithm FastCompute d(P_1, ..., P_k; P) with the data structure described above. Steps 1 and 2 of the algorithm are the same as in the algorithm from section 3.5 and can be executed in O(kmn + mn log(mn)) time. From lemma 6, we have that the zero-function ZERO can be constructed in O(N) time (line 3). Similarly, the infinity-function INFINITY can be constructed in O(N) time (line 4). Lemma 6 also ensures that constructing D[j, b] from D[j−1, b] and D[j, b−1] (line 12) takes O(log N) time, and that the evaluation of D[k, b](a) (line 15) takes O(log N) time.

Notice that no node is ever edited after it has been constructed: no tree is ever updated, and a new tree is always constructed by making O(log N) new nodes and otherwise referring to nodes of existing trees as they are. Therefore, an assignment as in lines 8 or 10 of the algorithm can safely be done by just copying a reference in O(1) time, without copying the structure of the complete tree. Thus, the total running time of the algorithm is dominated by the O(kN) executions of line 12, taking in total O(kN log N) = O(kmn log(mn)) time.

Apart from the function values of f̃_j computed in step 2, we have to store the ZERO and the INFINITY functions. All these require O(kN) = O(kmn) storage. Notice that any of the functions constructed in line 12 requires storing only O(log N) = O(log(mn)) new nodes and pointers to nodes in previously computed trees, and thus we need O(kmn log(mn)) storage for all the trees computed in line 12. So the total storage required by the algorithm is O(kmn log(mn)).
We note that the problem resembles a general edit distance type approximate string matching problem [19]. Global string matching under a general edit distance error model can be done by dynamic programming in O(kN) time, where k and N represent the lengths of the two strings. The same time complexity can be achieved for partial string matching through a standard "assign the first row to zero" trick [19]. This trick however does not apply here, due to the condition x_b − x_a ≤ l_0. Indeed, using the above trick to convert global matching into partial matching, we would have to fill in a table D of O(kN) entries, in which each entry D[j, b] gives the cost of the best alignment of the first j polylines ending at critical point x_b (over all possible starting points x_a satisfying the condition x_b − x_a ≤ l_0). This table however cannot be filled efficiently in a column-wise manner, because an optimal D[j, b] might be given by an alignment of the first j polylines such that the alignment of the first j−1 polylines is sub-optimal for D[j−1, c], where x_c is the placement of polyline j−1. This is also the case for a different formulation of the dynamic programming problem, where D[j, b] gives the best alignment of the first j polylines ending at a critical point within the interval [x_0, x_b].
4 A Part-based Retrieval Application

In this section we describe a part-based shape retrieval application. The retrieval problem we are considering is the following: given a large collection of polygonal shapes, and a query consisting of a set of disjoint polylines, we want to retrieve those shapes in the collection that best match the query. The query represents a set of disjoint boundary parts of a single shape, and the matching process evaluates how closely these parts resemble disjoint pieces of a shape in the collection. Thus, instead of querying with complete shapes, we make the query process more flexible by allowing the user to search only for certain parts of a given shape. The parts in the query are selected by the user from an automatically computed decomposition of a given shape. More specifically, in the application we discuss below, these parts are the result of applying the boundary decomposition framework introduced in [29]. For matching the query against a database shape we use the similarity measure described in section 3.

Figure 9: Examples of images from Core Experiment "CE-Shape-1", part B. Images in the same row belong to the same class.
The shape collection used by our retrieval application comes from the MPEG-7 shape silhouette database. Specifically, we used the Core Experiment "CE-Shape-1" part B [13], a test set devised by the MPEG-7 group to measure the performance of similarity-based retrieval for shape descriptors. This test set consists of 1400 images: 70 shape classes, with 20 images per class. Some examples of images in this collection are given in figure 9, where shapes in each row belong to the same class. The outer closed contour of the object in each image was extracted. In this contour, each pixel corresponds to a vertex. In order to decrease the number of vertices, we then used the Douglas-Peucker [9] polygon approximation algorithm. Each resulting simplified contour was then decomposed, as described in [29]. The instantiation of the boundary decomposition framework described there uses the medial axis. For the medial axis computation we used the Abstract Voronoi Diagram LEDA package [1]. The running time for a single query on the MPEG-7 test set of 1400 images is typically about one second on a 2 GHz PC.
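For reference, a minimal recursive Douglas-Peucker simplification (our own sketch; the tolerance eps is a parameter whose value is not specified in the report, and the endpoints of each processed chain are assumed distinct):

```python
import math

def douglas_peucker(points, eps):
    """Simplify an open polyline given as a list of (x, y) points: a vertex is
    kept iff it lies more than eps from the chord spanned by the endpoints of
    its subchain. For a closed contour, split at two far-apart vertices first."""
    if len(points) < 3:
        return list(points)
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = math.hypot(dx, dy)
    # find the interior vertex farthest from the chord
    imax, dmax = 0, -1.0
    for i in range(1, len(points) - 1):
        x, y = points[i]
        d = abs(dy * (x - x0) - dx * (y - y0)) / norm
        if d > dmax:
            imax, dmax = i, d
    if dmax <= eps:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:imax + 1], eps)
    right = douglas_peucker(points[imax:], eps)
    return left[:-1] + right   # drop the duplicated split vertex
```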
The matching of a query consisting of k polylines to an arbitrary database contour is based on the similarity measure described in section 3. This similarity measure is not scale invariant, as we noted in section 3.1. Our shape collection, however, contains shapes at different scales. In order to achieve some robustness to scaling, we scaled all shapes in the collection to the same diameter of the circumscribed disk. Given the nature of our collection (each image in "CE-Shape-1" is one view of a complete object), this is a reasonable approach. The reason we opted for a normalization based on the circumscribed disk, instead of the bounding box, for example, is that a class in the collection may contain images of an object at different rotational angles.
4.1 Experimental Results
The selection of parts comprising the query has a big influence on the results of a part-based retrieval process. Our collection, for example, includes classes, such as those labelled "horse", "dog", "deer" and "cattle", whose shapes have similar parts, such as limbs. Querying with such parts has been shown to yield shapes from all these classes. Figure 10 depicts two part-based queries on the same shape (belonging to the "beetle" class) with very different results. In the first example, the query consists of five chains, each representing a leg of the beetle. Among the first ten retrieved results, only three come from the same class. In the second example, the same parts are contained in the query, but they were concatenated so that they form two chains (three legs on one side, two on the other). All of the first ten retrieved results are this time from the same class. By concatenating the leg parts in the second query, a spatial arrangement of these parts is searched for in the database, and this gives the query more power to discriminate between beetle shapes and other shapes containing leg-like parts.

Figure 10: (Upper half) Original image, decomposed contour with five query parts in solid line style, each representing a leg of a beetle contour, and the top ten retrieved shapes when querying with these five polylines. (Lower half) The same original image and decomposed contour, and the top ten retrieved shapes when querying with two polylines, one containing three legs and the other two legs.
The MPEG group initiated the MPEG-7 Visual Standard in order to specify standard content-based descriptors that allow measuring similarity in images or video based on visual criteria. Each visual descriptor incorporated in the MPEG-7 Standard was selected from a few competing proposals, based on an evaluation of their performance in a series of tests called Core Experiments. The Core Experiment "CE-Shape-1" was devised to measure the performance of 2D shape descriptors. The performance of several shape descriptors, proposed for standardization within MPEG-7, is reported in [21]. The performance of each shape descriptor was measured using the so-called "bulls-eye" test: each image is used as a query, and the number of retrieved images belonging to the same class is counted among the top 40 (twice the class size) matches.
The shape descriptor selected by MPEG-7 to represent a closed contour of a 2D object or region in an image is based on the Curvature Scale Space (CSS) representation [18]. The CSS of a closed planar curve is computed by repeatedly convolving the contour with a Gaussian function, and representing the curvature zero-crossing points of the resulting curve in the (s, σ) plane, where s is the arc length. The matching process has two steps. Firstly, the eccentricity and circularity of the contours are used as a filter, to reduce the amount of computation. The shape similarity between two shapes is computed, in the second step, by relating the positions of the maxima of the corresponding CSSs. The reported similarity-based retrieval performance of this method is 75.44% [21]. In the years following the selection of this descriptor for the MPEG-7 Visual Standard, better performances have been reported on the same set of data for other shape descriptors. Belongie et al. [6] reported a retrieval performance of 76.51% for their descriptor based on shape contexts. Even better performances, of 78.18% and 78.38%, were reported by Sebastian et al. [27] and Grigorescu and Petkov [10], respectively. The matching in [27] is based on aligning the two shapes in a deformation-based approach, while in [10] a local descriptor of an image point is proposed that is determined by the spatial arrangement of other image points around that point. Latecki et al. mention in [16] a retrieval performance of 83.19% for a shape descriptor introduced in [5].
A global turning function-based shape descriptor [12], proposed for MPEG-7, is reported to have a similarity-based retrieval performance of 54.14%. A later variation [33] increases the performance to 65.67%.
For classes in "CE-Shape-1" with a low variance among their shapes, the CSS matching [18] gives good results, with a retrieval rate over 90%, as measured by the "bulls-eye" test. For difficult classes, however, for example the class "beetle", the different relative lengths of the antennas and legs, and the different shapes of the body, pose problems for global retrieval. The average performance rate of CSS matching for this class is only 36%. For the "ray" and "deer" classes, the bad results (an average performance rate of the CSS matching of 26% and 33%, respectively) are caused by the different shape and size of the tails and antlers, respectively, of the contours in these classes. A part-based matching, with a proper selection of parts, is significantly more effective in such cases.
We tested the performance of our part-based shape matching. A prerequisite for such performance of the part-based matching is the selection of query parts that capture relevant and specific characteristics of the shape. This has to be done interactively by the user. An overall performance score for the matching process, like in [21], is therefore intractable, since that would require 1400 interactive queries. We therefore compared, on a number of individual queries, our part-based approach with the global CSS matching and with a global matching based on the turning function. These experimental results indicate that for those classes with a low average performance of the CSS matching, our approach consistently performs better. For example, for the shape called "carriage-18", the CSS method has a bull's-eye score of 70%, the global turning function method a score of 80%, and our multiple polyline to polygon matching method a score of 95%. For the shape called "horse-17", both the CSS method and the global turning function method have a bull's-eye score of 10%, and our method a score of 70%. (See many examples in the appendix.)
5 Concluding Remarks

In this paper we addressed the partial shape matching problem. We introduced a measure for computing the similarity between a set of pieces of one shape and another shape. Such a measure is useful in overcoming problems due to occlusion or unreliable object segmentation from images. The similarity measure was tested in a shape-based image retrieval application. We compared our part-based approach to shape matching with two global shape matching techniques (the CSS matching, and a turning function-based matching). Experimental results indicate that for those classes with a low average performance of the CSS matching, our approach consistently performs better. In conclusion, our dissimilarity measure provides a powerful complementary shape matching method.

A prerequisite for such performance of the part-based matching is the selection of query parts that capture relevant and specific characteristics of the shape. This has to be done interactively by the user. Since the user does not usually have any knowledge of the database content before the retrieval process, we have developed an alternative approach to part-based shape retrieval, in which the user is relieved of the responsibility of specifying shape parts to be searched for in the database; see [30]. The query, in this case, is a polygon, and in order to select among the large number of possible searches in the database with parts of the query, the user interacts with the system in the retrieval process. This interaction, in the form of marking relevant results among those retrieved by the system, has the purpose of allowing the system to guess what the user is looking for in the database.
Acknowledgements

This research was partially supported by the Dutch Science Foundation (NWO) under grant 612.061.006, and by the FP6 IST Network of Excellence 506766 Aim@Shape. Thanks to Veli Mäkinen for the partial string matching reference.
References
[1] AVD LEP, the Abstract Voronoi Diagram LEDA Extension Package. http://www.mpi-sb.
mpg.de/LEDA/friends/avd.html.
[2] N. Ansari and E. J. Delp. Partial shape recognition: A landmark-based approach. PAMI, 12:470–483,
1990.
[3] E. Arkin, L. Chew, D. Huttenlocher, K. Kedem, and J. Mitchell. An efficiently computable metric for
comparing polygonal shapes. PAMI, 13:209–215, 1991.
[4] T. Asano, M. de Berg, O. Cheong, H. Everett, H.J. Haverkort, N. Katoh, and A. Wolff. Optimal
spanners for axis-aligned rectangles. Computational Geometry, Theory and Applications, 30(1):59–
77, 2005.
[5] E. Attalla. Shape Based Digital Image Similarity Retrieval. PhD thesis, Wayne State University,
2004.
[6] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts.
PAMI, 24:509–522, 2002.
[7] B. Bhanu and J. C. Ming. Recognition of occluded objects: A cluster-structure algorithm. Pattern
Recognition, 20(2):199–211, 1987.
[8] Scott D. Cohen and Leonidas J. Guibas. Partial matching of planar polylines under similarity trans-
formations. In Proc. SODA, pages 777–786, 1997.
[9] D.H. Douglas and T.K. Peucker. Algorithms for the reduction of the number of points required to
represent a digitized line or its caricature. The Canadian Cartographer, 10(2):112–122, 1973.
[10] C. Grigorescu and N. Petkov. Distance sets for shape filters and shape recognition. IEEE Transactions
on Image Processing, 12(10):1274–1286, 2003.
[11] D. Huttenlocher, G. Klanderman, and W. Rucklidge. Comparing images using the hausdorff distance.
PAMI, 15:850–863, 1993.
[12] IBM. Technical summary of turning angle shape descriptors proposed by IBM. Technical report,
IBM, 1999. TR ISO/IEC JTC 1/SC 29/WG 11/ P162.
[13] S. Jeannin and M. Bober. Description of core experiments for MPEG-7 motion/shape. Technical
report, MPEG-7, March 1999. Technical Report ISO/IEC JTC 1/SC 29/WG 11 MPEG99/N2690,
Seoul.
[14] L. J. Latecki and R. Lakämper. Shape similarity measure based on correspondence of visual parts.
PAMI, 22:1185–1190, 2000.
[15] L. J. Latecki, R. Lakämper, and D. Wolter. Shape similarity and visual parts. In Proc. Int. Conf.
Discrete Geometry for Computer Imagery (DGCI), pages 34–51, 2003.
[16] L. J. Latecki, R. Lakämper, and D. Wolter. Optimal partial shape similarity. Image and Vision
Computing, 23:227–236, 2005.
[17] H.-C. Liu and M. D. Srinath. Partial shape classification using contour matching in distance transfor-
mation. PAMI, 12(11):1072–1079, 1990.
[18] F. Mokhtarian, S. Abbasi, and J. Kittler. Efficient and robust retrieval by shape content through
curvature scale space. In Workshop on Image DataBases and MultiMedia Search, pages 35–42, 1996.
[19] Gonzalo Navarro. A guided tour to approximate string matching. ACM Computing Surveys, 33(1):31–
88, 2001.
[20] H. Ogawa. A fuzzy relaxation technique for partial shape matching. Pattern Recognition Letters,
15(4):349–355, 1994.
[21] J.-R. Ohm and K. Müller. Results of MPEG-7 Core Experiment Shape-1. Technical report, MPEG-7,
July 1999. Technical Report ISO/IEC JTC1/SC29/WG11 MPEG98/M4740.
[22] E. Ozcan and C. K. Mohan. Partial shape matching using genetic algorithms. Pattern Recognition
Letters, 18(10):987–992, 1997.
[23] A. Pentland, R. W. Picard, and S. Sclaroff. Photobook: Content-based manipulation of image
databases. International Journal of Computer Vision, 18(3):233–254, June 1996.
[24] E. Persoon and K. S. Fu. Shape discrimination using Fourier descriptors. IEEE Transactions on
Systems, Man, and Cybernetics, 7(3):170–179, 1977.
[25] R. J. Prokop and A. P. Reeves. A survey of moment-based techniques for unoccluded object represen-
tation and recognition. Computer Vision, Graphics, and Image Processing, 54(5):438–460, September
1992.
[26] P. Rosin. Multiscale representation and matching of curves using codons. Graphical Models and
Image Processing, 55(4):286–310, July 1993.
[27] T. Sebastian, P. Klein, and B. Kimia. On aligning curves. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 25(1):116–125, 2003.
[28] K. Siddiqi, A. Shokoufandeh, S.J. Dickinson, and S.W. Zucker. Shock graphs and shape matching.
IJCV, 55(1):13–32, 1999.
[29] Mirela Tanase. Shape Decomposition and Retrieval. PhD thesis, Utrecht University, Department of
Computer Science, February 2005.
[30] Mirela Tanase and Remco C. Veltkamp. Part-based shape retrieval with relevance feedback. In Pro-
ceedings International Conference on Multimedia and Expo (ICME05), 2005.
[31] R. Veltkamp and M. Hagedoorn. State-of-the-art in shape matching. In M. Lew, editor, Principles of
Visual Information Retrieval, pages 87–119. Springer, 2001.
[32] Haim Wolfson and Isidore Rigoutsos. Geometric hashing: an overview. IEEE Computational Science
& Engineering, pages 10–21, October-December 1997.
[33] C. Zibreira and F. Pereira. A study of similarity measures for a turning angles-based shape descriptor.
In Proc. Conf. Telecommunications, Portugal, 2001.
Appendix A
We compared our multiple polyline to polygon matching (MPP) approach with the global CSS matching
(CSS) and with a global matching based on the turning function (GTA). Each row of Table 1 corresponds
to one query, identified by the name of the shape whose image was used as the query for the CSS and GTA
matching, and from whose decomposition the query parts for our part-based matching were selected. The
retrieval rates measured by the “bull's-eye” test, the percentage of true positives returned in the first 40
(twice the class size) matches, are indicated for the three matching techniques in the first three result
columns, while the remaining three columns give the percentage of true positives returned in the first 20
(class size) matches.
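For concreteness, the two scores can be computed as in the following minimal sketch, which assumes a retrieval result represented as a list of class labels sorted by increasing dissimilarity to the query; the function names are illustrative, not part of our implementation.

    # Minimal sketch of the two retrieval scores reported in Table 1.
    # A 'ranking' is assumed to be the list of class labels of the database
    # shapes, sorted by increasing dissimilarity to the query.

    def bulls_eye_score(ranking, query_class, class_size=20):
        # Fraction of the query's class found among the top 2 * class_size
        # matches (the MPEG-7 "bull's-eye" test).
        top = ranking[:2 * class_size]
        return sum(1 for c in top if c == query_class) / class_size

    def true_positives_in_class_size(ranking, query_class, class_size=20):
        # Fraction of the query's class found among the top class_size matches.
        top = ranking[:class_size]
        return sum(1 for c in top if c == query_class) / class_size

    # Example: 13 correct hits in the top 20 and all 20 in the top 40.
    ranking = ["beetle"] * 13 + ["bug"] * 10 + ["beetle"] * 7 + ["ant"] * 10
    print(bulls_eye_score(ranking, "beetle"))               # 1.0  (i.e. 100%)
    print(true_positives_in_class_size(ranking, "beetle"))  # 0.65 (i.e. 65%)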
Query shape     Bull's-eye score (top 40)    True positives (top 20)
                CSS     GTA     MPP          CSS     GTA     MPP
beetle-20       10%     30%     65%          10%     10%     55%
beetle-10       10%     35%     60%           5%     20%     55%
ray-3           15%     15%     70%          15%      5%     60%
ray-17          15%     25%     50%          15%     20%     40%
deer-15         15%     40%     50%           5%     30%     35%
deer-13         20%     40%     45%          15%     40%     40%
bird-11          5%     20%     45%           5%     15%     35%
bird-17         20%     15%     60%          20%     15%     40%
bird-9          20%     25%     50%          15%     25%     40%
carriage-18     70%     80%     95%          45%     80%     90%
horse-17        10%     10%     70%          10%      5%     60%
horse-3         15%     25%     65%          10%     25%     40%
horse-4         25%     30%     50%          20%     20%     45%
crown-13        30%     30%     50%          25%     25%     40%
butterfly-11    20%     45%     65%          20%     35%     50%
butterfly-4     15%     25%     55%          10%     20%     55%
dog-11          10%     15%     50%          10%     15%     45%

Table 1: A comparison of the Curvature Scale Space (CSS), the Global
Turning Angle function (GTA), and our Multiple Polyline to Polygon
(MPP) matching. (The query images and selected query parts shown
alongside each row in the original layout are omitted.)

Multiple Polyline to Polygon Matching
Mirela T˘ nase1 Remco C. Veltkamp1 Herman Haverkort2 a 1 Department of Information & Computing Sciences Utrecht University, The Netherlands 2 Department of Mathematics and Computing Science TU Eindhoven, The Netherlands
Abstract This paper addresses the partial shape matching problem, which helps identifying similarities even when a significant portion of one shape is occluded, or seriously distorted. We introduce a measure for computing the similarity between multiple polylines and a polygon, that can be computed in O(km2 n2 ) time with a straightforward dynamic programming algorithm. We then present a novel fast algorithm that runs in time O(kmn log mn). Here, m denotes the number of vertices in the polygon, and n is the total number of vertices in the k polylines that are matched against the polygon. The effectiveness of the similarity measure has been demonstrated in a part-based retrieval application with known ground-truth.

1 Introduction
The motivation for multiple polyline to polygon matching is twofold. Firstly, the matching of shapes has been done mostly by comparing them as a whole [3, 14, 18, 23, 24, 25, 26, 28, 31]. This fails when a significant part of one shape is occluded, or distorted by noise. In this paper, we address the partial shape matching problem, matching portions of two given shapes. Secondly, partial matching helps identifying similarities even when a significant portion of one shape boundary is occluded, or seriously distorted. It could also help in identifying similarities between contours of a non-rigid object in different configurations of its moving parts, like the contours of a sitting and a walking cat. Finally, partial matching helps alleviating the problem of unreliable object segmentation from images, over or undersegmentation, giving only partially correct contours. Despite its usefulness, considerably less work has been done on partial shape matching than on global shape matching. One reason could be that partial shape matching is more involved than global shape matching, since it poses two more difficulties: identifying the portions of the shapes to be matched, and achieving invariance to scaling. If the shape representation is not scale invariant, shapes at different scales can be globally matched after scaling both shapes to the same length or the same area of the minimum bounding box. Such solutions work reasonably well in practice for many global similarity measures. When doing partial matching, however, it is unclear how the scaling should be done since there is no way of knowing in advance the relative magnitude of the matching portions of the two shapes. This can be seen in figure 1: any two of the three shapes in the figure should give a high score when partially matched but for each shape the scale at which this matching should be done depends on the portions of the shapes that match. Contribution Firstly, we introduce a measure for computing the similarity between multiple polylines and a polygon. The polylines could for example be pieces of an object contour incompletely extracted from an image, or could be boundary parts in a decomposition of an object contour. The measure we propose is based on the turning function representation of the polylines and the polygon. We then derive a number of non-trivial properties of the similarity measure. Secondly, based on these properties we characterize the optimal solution that leads to a straightforward O(km2 n2 )-time dynamic programming algorithm. We then present a novel O(kmn log mn)-time algorithm. Here, m denotes the number of vertices in the polygon, and n is the total number of vertices in the k polylines that are matched against the polygon. 1

for example points of high curvature [2. It is defined for arbitrary non-empty bounded and closed sets A and B as the infimum of the distance of the points in A to B and the points in B to A. Both methods are designed for partial matching. parts that we perceive as similar. One serious drawback of this approach is the fact that the matching is done between parts of simplified shapes at “the appropriate evolution stage” [14]. The best correspondence between the maximal convex arcs of two simplified versions of the original shapes gives the partial similarity measure between the shapes. and then looking for correspondences between the features of the two shapes. by looking for correspondences between parts of the given shapes. Thirdly. 17]. Most partial shape matching methods are based on computing local features of the contour. where m is the number of vertices in one polygon and n is the number of vertices in the other. An example of such a case is in figure 2. rotation. Geometric hashing [32] is a method that determines if there is a transformed subset of the query point set that matches a subset of a target point set. because for such subparts the sequences of local features are very similar. k-means clustering [7]. we want to retrieve those shapes in the collection that best match our query. but do not easily transform to our case of matching multiple polylines to a polygon. establish the best correspondence of parts in a decomposition of the matched shapes. Such local features-based solutions work well when the matched subparts are almost equivalent up to a transformation such as translation. Other local features-based approaches make use of genetic algorithms [22]. this works for only two single polylines. The dissimilarity measure is a function of the scale. translation and rotation. and two parts of the database shape P are denote by P1 and P2 . Also the Hausdorff distance [11] allows partial matching. where two parts of the query shape Q are denoted by Q 1 and Q2 . Considering different stages of the curve evolution (such that P1 ∪ P2 and Q1 ∪ Q2 constitute individual parts) does not solve the problem either. Given a large collection of shapes and a query consisting of a set of polylines. rotation and scaling. The set of polylines forming the query are boundary parts in a decomposition of a database shape – both this database shape and the parts in the query are selected by the user. can be done in time O(m 2 n2 ) [8]. The evaluation on the basis of a known ground-truth indicate that a part-based approach to matching improves the global matching performance for difficult categories of shapes. Though P has a portion of P 1 ∪ P2 similar to a portion of Q 1 ∪ Q2 . Given two matches with the same squared error. [3] describe a metric for comparing two whole polygons that is invariant under translation. 2 Related Work Arkin et al. However. or fuzzy relaxation techniques [20]. Partially matching the turning angle function of two polylines under scaling. and can be computed in O(mn log mn) time. Latecki et al [15]. rotation or scaling. may have quite different local features (different number of curvature extrema for example). How these evolution stages are identified is not indicated in their papers. though it certainly has an effect on the quality of the matching process. However. the match involving the longer part of the polylines has a lower dissimilarity. and the shift of one polyline along the other. It is based on the L 2 -distance between the turning functions of the two polygons. 
This is because the method used for matching two chains normalizes the two chains to unit length and thus no shifting of 2 . no correspondence among these four parts succeeds to recognize this.A B C Figure 1: Scale invariance in partial matching is difficult. we have experimented with a part-based retrieval application. by building a hash table in transformation space. Also. false negatives could appear in the retrieval process.

since it corresponds to a horizontal shifting of Θ A over a distance t. we shift the parts of one shape over the other shape and select those that give a close match. we introduce a similarity measure for matching a union of possibly disjoint parts and a whole shape. and the polygon B is placed over A in such a way that the reference point B(0) of B coincides with point A(t) at distance t along A from the reference point A(0). . 3. In our approach. Since we allow the parts to rotate independently. It is a piecewise constant function. In this section we define a similarity measure for such a polylines-to-polygon matching. the part(s) are shifted along the shape. . 3 Polylines-to-Polygon Matching We concentrate on the problem of matching an ordered set {P 1 . More formally. A rotation of A by an angle θ corresponds to a shifting of Θ A over a distance θ in the vertical direction. comes from the intended application of part based shape retrieval. The distance between two polygons A and B is defined as the L 2 distance between their two turning functions ΘA and ΘB . We first give an O(km2 n2 )-time dynamic programming algorithm for computing this similarity measure. measured from some reference point on the boundary of A. minimized with respect to the vertical and horizontal shifts of these functions (in other words. . Pk } of k polylines against a polygon P . We want to compute how close an ordered set of polylines {P 1 . P2 . . In the matching process between a (union of) part(s) and an undecomposed shape. Also notice that if instead of making correspondences between parts. this measure could capture the similarity between contours of non-rigid objects. Pk } is to being part of the boundary of P in the given order in counter-clockwise direction around P (see figure 3). . 3 . suppose A and B are two polygons with perimeter length l A and lB . . A pair (t. we also use a decomposition into parts in order to identify the matching portions. where m is the number of vertices in P and n is the total number of vertices in the k polylines. one endpoint of one chain over the other is done. Note that the fact that P is a polygon.1 Similarity between polyline and polygon The turning function Θ A of a polygon A measures the angle of the counterclockwise tangent with respect to a reference orientation as a function of the arc-length s. For this purpose. l A ) along the boundary of A corresponds to shifting Θ A horizontally over a distance t. and the optimal placement is computed based on their turning functions. . and also over all independent rotations of the parts. respectively. the polylines are rotated and shifted along the polygon P . but we do this only for one of the shapes. in such a way that the pieces of the boundary of P “covered” by the k polylines are mutually disjoint except possibly at their endpoints. and not an open polyline. . minimized with respect to rotation and choice of reference point). θ) ∈ R × R will be referred to as a placement of B over A.P2 P1 Q1 Q2 Figure 2: Partial matching based on correspondence of parts may fail to identify similarities. This similarity measure is a turning angle function-based similarity. with parts in different relative positions. the similarity between the shapes in figure 2 is reported. Moving the location of the reference point A(0) over a distance t ∈ [0. minimized over all possible shiftings of the endpoints of the parts over the shape. and B is rotated clockwise by an angle θ with respect to the reference orientation. 
θ) is also called a horizontal shift. The measure is based on the turning function representation of the given polylines and it is invariant under translation and rotation. The first component t of a placement (t. with jumps corresponding to the vertices of A. In the next section. P2 . We then refine this algorithm to obtain a running time of O(kmn log mn).

that is: their similarity should be zero. Pk over P (or in other words. B.lA ) min f(A. . . θ) denotes the quadratic similarity f j (t. For this reason we do not scale the polyline or the polygon prior to the matching process. f j (t. . P2 . their turning functions will no longer match perfectly. The similarity between two polygons A and B is then given by: θ∈R. Thus. We assume the polylines {P1 . t∈[0. Pk . Since P is a closed polyline. For measuring the difference between a polygon A and a polyline B the same measure can be used (for matching two polylines. But if we scale A and B to the same length. (tk . 3. or incomplete shapes. we want that a polyline B included in a polygon A to match perfectly this polygon. θ) of B over A. . our similarity measure is not scale-invariant. Θj is piecewise-constant with nj − 1 jumps. by Θ(s+l) = Θ(s)+2π. which does not contain occluded. is the square root of the sum of quadratic similarities f j . θ) between A and B for a given placement (t. θ) = 0 j (Θ(s + t) − Θj (s) + 2 θ) ds. P2 . while the second component θ is also called a vertical shift. . B. P3 } of polylines against a polygon P . lj ] → R denote the turning function of the polyline Pj of length l j .2 Similarity between multiple polylines and a polygon Let Θ : [0. j=1 4 . we achieve robustness to scaling by normalizing all shapes in the collection to the same diameter of their circumscribed circle. and of perimeter length l. Pk . . θ1 ) . Let {P1 . t. t. In a part-based retrieval application (see section 4) using a similarity measure based on the above quadratic similarity (see section 3.P P3 P P2 P1 P3 P2 P1 Figure 3: Matching an ordered set {P 1 . This is a reasonable solution for our collection. minimized over all valid horizontal and vertical shifts of their turning functions): ⎛ ⎞1/2 d(P1 . . Moreover these shapes are object contours and not synthetic shapes. θ). l For simplicity of exposition. Pk } satisfy the condition k lj ≤ l. . P2 . . The similarity measure. Notice that.2). Pk } be a set of polylines. . minimized over all valid placements of P 1 . Arkin et al. θj )⎠ . To achieve invariance under scaling. as the square of the L 2 -distance between their two turning functions Θ A and ΘB : f(A. since it corresponds to a vertical shifting of Θ A over a distance θ. for the purpose of our part-based retrieval application. . θ) = lB 0 (ΘA (s + t) − ΘB (s) + θ)2 ds. B. . . . P ). θk ) ⎝ k fj (tj . . j=1 which we denote by d(P 1 . t. the domain of Θ can be easily extended to the entire real line. If Pj is made of nj segments. . . . P ) = min valid placements (t1 . We define the quadratic similarity f(A. l] → R be the turning function of a polygon P of m vertices. . [3] propose scaling the two polygons to unit length prior to the matching. some slight adaptations would be needed). . and let Θ j : [0. .

. tk +lk ≤t1 +l ⎝ k j=1 ⎞1/2 ∗ fj (tj )⎠ . Pk . Arkin et al. the optimization ∗ problem minθ∈R fj (t.. It remains to define what the valid placements are. as a function of t. P ) between the polylines P 1 . . We also give a simpler formulation of the optimization problem in the definition of d(P 1 . . . .3 Properties of the similarity function ∗ In this section we give a few properties of f j (t). Pk are matched with points on the boundary of P in counterclockwise order around P . . . the function fj (t. that is: tj−1 ≤ tj for all 1 < j ≤ k.6. we require that the matched parts are disjoint (except possibly at their endpoints). . with period l. . . . and are independent of each other. Furthermore. . . θ) = ∂θ we get: lj 0 2(Θ(s + t) − Θj (s) + θ)ds = lj 0 2(Θ(s + t) − Θj (s))ds + 2θlj . and tk + lk ≤ t1 + l (see figure 4). with mnj breakpoints within any interval of length l. P3 and the polygon P . Pk and the polygon P is thus: ⎛ d(P1 . . . the parabolic pieces are concave. we shift the turning functions Θ 1 . θ) is a quadratic convex function of θ. .Θ3 Θ1 Θ l t1 Θ2 t2 t2 + l2 t3 2l t3 + l3 < t1 + l t1 + l1 Figure 4: To compute d(P 1 . . . θ)/∂θ = 0. . moreover. . P ) = min t1 ∈[0. . . Pk . . sharpening the constraints to t j−1 + lj−1 ≤ tj for all 1 < j ≤ k. .2l).. . P )...l). The similarity between the polylines P 1 . 5 . We can thus express the similarity between P j and P ∗ for a given positioning t j of the starting point of P j over P as: fj (tj ) = minθ∈R fj (tj . ii) it is piecewise quadratic. Pk along P . while the vertical shift must be optimal for the given horizontal shift. . P ) in sections 3. ...tk ∈[0. . Since ∂fj (t. . Pk . Pk . . We require that the starting points of P1 . The vertical shifts θ1 . . . . .k}: tj−1 +lj−1 ≤tj . . and t k ≤ t1 + l. given by the root θ j (t) of the equation ∂f j (t. . ∀j∈{2. . in an optimal placement the quadratic similarity between a particular polyline P j and P depends only on the horizontal shift t j . θj (t)). (2) Lemma 1 For a given positioning t of the starting point of P j over P . θk correspond to rotations of the polylines P 1 .5 and 3. and Θ3 horizontally and vertically over Θ. t2 . . . Pk with respect to the reference orientation. tk correspond to shiftings of the starting points of the polylines P 1 . . (1) 3. ∗ Lemma 2 The quadratic similarity fj (t) has the following properties: i) it is periodic. . This implies that for a given t. [3] have shown that for any fixed t. ∗ ∗ We now consider the properties of f j (t) = fj (t. Θ2 .. . θ). The horizontal shifts t 1 . θ) has a unique solution. Therefore. as functions of t. that constitute the basis of the algorithms for computing d(P 1 . . the rotation that minimizes the l ∗ quadratic similarity between Pj and P is given by θ j (t) = − 0 j (Θ(s + t) − Θj (s))ds/lj . . .

θj (t)). For a given value of the horizontal shift t. the value of 0 j (Θ(s + t) − Θj (s))2 ds is thus given by the sum over the strips in the current configuration of the width of the strip times the square of its height: lj 0 (Θ(s + t) − Θj (s))2 ds = σ∈S wσ (t)h2 . t. 0 or 1. l For a given configuration of strips. the discontinuities of Θ and Θ j define a set S of rectangular strips. (3) ∗ For the rest of this proof we restrict our attention to the behaviour of f j within an interval of length l.Θj Θ Figure 5: For given horizontal and vertical shifts of Θ j over Θ. the discontinuities of Θ j and Θ form a set of rectangular strips. Each strip is defined by a pair of consecutive discontinuities of the two turning functions and is bounded from below and above by these functions. ∗ ∗ ∗ ii) Since fj (t) = f(P. we have: 0 lj (Θ(s + t) − Θj (s))ds = σ∈S wσ (t)hσ . we have: ∗ fj (t) = lj 0 ∗ (Θ(s + t) − Θj (s) + θj (t))2 ds = 0 lj (Θ(s + t) − Θj (s))2 ds + + lj 0 ∗ 2(Θ(s + t) − Θj (s))θj (t)ds ∗ (θj (t))2 ds lj 0 = 0 lj (Θ(s + t) − Θj (s))2 ds + lj 0 ∗ 2θj (t) (Θ(s + t) − Θj (s))ds + ∗ lj (θj (t))2 . 6 . Proof: i) Since Θ(t + l) = 2π + Θ(t). σ Similarly. As Θj is shifted horizontally with respect to Θ. from lemma 1 we get that θ ∗ (t + l) = θ∗ (t) − 2π. Inbetween these critical events. Thus ∗ fj (t + l) = lj 0 ∗ (Θ(s + t + l) − Θj (s) + θ∗ (t) − 2π)2 ds = fj (t). the height of a strip σ is a constant h σ . while the width wσ is a linear function of t with coefficient −1. θj (t))fj (t. Pj . there are at most mn j critical events within an interval of length l where the configuration of the rectangular strips defined by Θ and Θ j changes. Applying lemma 1 we get: ∗ fj (t) = lj 0 (Θ(s + t) − Θj (s))2 ds − lj (Θ(s 0 + t) − Θj (s))ds lj 2 . These critical events are at values of t were a discontinuity of Θ j coincides with a discontinuity of Θ.

4 Characterization of an optimal solution In this section we characterize the structure of an optimal solution to the optimization problem in equation (5). P ) = min t1 ∈[0. . ∗ In other words. and give a recursive definition of this solution. . ∗ ∗ tk ≤t1 +l0 ⎝ k j=1 ⎞1/2 f j (tj )⎠ . And thus equation (3) can be rewritten as: ∗ fj (t) = σ∈S wσ (t)h2 σ − ⎛ 1⎝ lj ⎞2 wσ (t)hσ ⎠ . the optimization problem defining d(P 1 . the function f j is a piecewise quadratic function of t. f j has the i=1 ∗ same properties as fj . 7 . . . with t∗ := tj + i=1 li . . with concave parabolic pieces. We now give a simpler formulation of the optimization problem in the definition of d(P 1 . since the coefficient of t 2 in equation (4) ∗ is negative. with at most mnj breakpoints within an interval of length l.. . Moreover.l). is a solution to the optimization problem in equation (1).2l). P ). . ∀j∈{2.k}: tj−1 ≤tj . . P ) becomes: ⎛ d(P1 . we define: j−1 ∗ f j (t) := fj t+ i=1 li . . that is: it is a piecewise quadratic function of t that has its local minima in at most ∗ mnj breakpoints in any interval of length l. .tk ∈[0. Pk . .∗ fj (t) t ∗ Figure 6: The quadratic similarity f j is a piecewise quadratic function of t. . . (5) where l0 := l− k li . 1 j k 3. i=1 ∗ j−1 then (t∗ .. . Notice that if (t 1 . (4) σ∈S ∗ Since the functions w σ are linear functions of t. it suffices to restrict our attention to a discrete set of at most mn j points. t∗ ).. . . tk ) is a solution to the optimization problem in equation (5). In order to simplify the restrictions on t j in equation (1). t2 . Pk . ∗ Corollary 1 The local minima of the function f j are among the breakpoints between its parabolic pieces... With this simple transformation of the set of functions f j . . Obviously. ∗ ∗ Let (t1 . the parabolic pieces of f j are concave (see figure 6). . the function f j is a copy of fj . . . .. ∗ The following corollary indicates that in computing the minimum of the function f j . . This definition forms the basis of a straightforward dynamic programming solution to the problem.. .. tk ) be a solution to the optimization problem in equation (5). Pk . . but shifted to the left with j−1 li .

. (when j ∈ {g. P ) = t1 . and a ≤ b. b] can be computed recursively.. .. 2l).Lemma 3 The values of an optimal solution (t 1 . b] = D[j − 1. .. tj ∈ X xa ≤ t1 ≤ . . . For any j ∈ {1. Pg . .. h}). in which case (t1 . otherwise (t1 . ∗ ∗ Notice now. we denote: j D[j. . With the observations above. otherwise (t1 . b − 1]. . k}) or the function f gh : R → R. Proof: If t j−1 = tj and tj = tj+1 . b] + f j (xb ). ∗ In the second case. . tk ) are found in a discrete set of points X ⊂ [0. b]. a.. thus tj must give a local minimum of function f j . . 2l). a. . starting at P (x a ) and ending at P (x b + j i=1 li ).. h. . f gh (t) = i=g f i (t) + i=1 f i (t − l0 ). 2l). a. and the breakpoints of f gh are the breakpoints of the constituent functions f g . or such a breakpoint shifted left or right by l 0 . . xN −1 } be the set of critical points in [0.. f gh (t) = i=g f i (t + l0 ) + i=1 f i (t) (when j ∈ {1. .. a. ... . Since function f j has 2mnj breakpoints... ... that is: no polyline that is not in G j is glued to a polyline in G j .. We call a point in [0.. 8 . . ... k}. i=1 (7) where j ∈ {1. Thus. a.. . .. Then. f k . Pk . thus in this case D[j. . for some g and h in {1. . . . . . we have that Pk and P1 are glued together.. Note that a local minimum of f gh may not be a local minimum of any of the constituent functions f i (g ≤ i ≤ h). tj ) must be an optimal solution for D[j. . . . tj ) would not give a minimum for D[j. b] = t1 . D[j.. . Ph } (with j ∈ {g. a. . ... Regarding the value of t j we distinguish two cases: • tj = xb . thus the positioning of the ending point of j j−1 Pj−1 and of the starting point of P j coincide. . Let X = {x0 . b]. . .. . but nevertheless it will always be given by a breakpoint of one of the constituent functions. the total j number of critical points in [0.. the constraints on the choice of t 1 . This function is obviously piecewise quadratic. ≤ tj ≤ xb min f i (ti ) .. ∗ ∗ In the first case... • tj = xb . so t j must give a local minimum of the function f gh : R → R. Pj } of j polylines to a subchain of P . (6) We now show that an optimal solution of this problem can be constructed recursively. f h . in the given optimal placement the polylines P j−1 and ∗ ∗ Pj are “glued” together. g. t j must be a breakpoint of one of the functions f g .. a. f gh (t) = i=g f i (t). a. . let Gj be the maximal set of polylines containing P j and glued together in the given optimal placement. tj ) be an optimal solution for D[j.. . which is either a breakpoint of one of the functions f 1 . . Since Gj is a maximal set of polylines ∗ ∗ ∗ ∗ glued together. k}. b ∈ {0.. . .. . . in which case (t1 .. . k}). b]. .. Pk } (with j ∈ {1.. h}) or the form {P 1 .. f h . which consists of the breakpoints of the functions f 1 . . N − 1}. . . tk − t1 ≤ l0 k h ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ min ⎝ k j=1 ⎞1/2 f j (tj )⎠ . b − 1]. tj−1 ) must be an optimal solution for D[j − 1. tk ∈ X ∀j > 1 : tj−1 ≤ tj . We now show that D[j. . a.. a. we have t i = tj for all i ∈ {g... . Observe that Gj must have either the form {P g .. a. .. . tk do not prohibit decreasing ∗ ∗ or increasing t j . thus in this case. .. For this purpose. .. b] = D[j. the constraints on the choice of t 1 .. . . . . .. . Ph . . we can argue in a similar manner that t j must give a local minimum of the function k h f gh : R → R. .. Similarly. . tk do not prohibit decreasing or increasing t g . plus two copies of each breakpoint: one shifted left by l0 and one shifted right by l 0 . k}... .. if t k = t1 + l0 . 2l) is at most 6m i=1 ni = 6mn. b]. 
Equation (7) describes the subproblem of matching the set {P1 . h}. a critical point. a.. or it must be such a breakpoint shifted left or right by l 0 . the optimization problem we have to solve is: ⎛ d(P1 . with concave parabolic pieces. . f k .. . that tj−1 = tj means that t∗ = t∗ + lj−1 . th ∗ h simultaneously. Let (t 1 . tj ) would not give a minimum for D[j...

b] ← min D[j − 1. . Overall. b] . let us denote by S ΘΘj the set of rectangular strips in the current configuration S that are bounded left by a jump of Θ and right by a jump of Θ j (see figure 7). for j ≥ 1 ∧ a ≤ b. D[j. a. It remains to indicate how to compute the slopes of these linear pieces. . P ) = xa . b − 1] 11. respectively. . . . . Pk . a. while xb − xa ≤ l0 do 8. for all i ∈ {0. a. a. b] + f j (xb ). For all j ∈ {1. and by S ΘΘ and SΘj Θj the set of of rectangular strips that are bounded left and right by a jump of Θ and Θ j . N − 1}. and they can be sorted in O(mn log(mn)). for a ← 0 to N − 1 do 5. b − 1]. . the only strips whose width is changing are those bounded by a jump of Θ and a jump of Θj . The algorithm is an adaptation of the algorithm used in [3]. return M IN Algorithm SimpleCompute d(P1 . P ) Lemma 4 The running time of algorithm SimpleCompute d(P 1 . Proof: Step 1 of this algorithm requires first identifying all values of t such that a vertex of a turning function Θ j is aligned with a vertex of Θ. . a − 1] ← ∞ 6. xb −xa ≤l0 (8) min D[k. xN−1 }. 3.5 A straightforward algorithm Equations (8) and (9) lead to a straightforward dynamic programming algorithm for computing the similarity measure d(P1 . . M IN ← ∞ 4. b] = min D[j − 1. . b] ← 0 9. k} and all i ∈ {0. Pk . . . . only three strips in S change their type. Pk . a. where the boundary cases are D[0. N −1} takes O(mn). . the identification of the set X of critical points takes O( k mnj ) = O(mn). D[0. Compute the set of critical points X = {x0 . Notice also that at each breakpoint of f j . and we thus indicate only its main ideas. j=1 We now show that. We then have to shift these values left by j−1 li and make i=1 copies of them shifted left and right by l 0 . a − 1] has no solution. . .We can now conclude that : D[j. (9) 3. . For a fixed vertex v of a turning function Θ j . b − 1] . for j ← 1 to k do 10. Notice that when Θ j is horizontally shifted over Θ such that the configuration S of the rectangular strips does not change. a. As equation (4) from the proof of lemma 2 indicates. evaluate f j (xi ). A solution of the optimization problem (6) is then given by d(P1 . When shifting Θ j such that the configuration S does not change. the widths of the strips in S Θj Θj and SΘΘ do not change. M IN ← min(D[k. b← b+1 12. . by S Θj Θ the set of rectangular strips that are bounded left by a jump of Θ j and right by a jump of Θ. a. . with strips divided in the above 9 . b] + f j (xb ). . a. . . for a particular value of j. b] = 0 and D[j. P ) is O(km2 n2 ). and sort them. b←a 7. P ): 1. a. More precisely. a. D[j. . Thus the slope of each linear piece of the sum functions in equation (4) is determined only by the rectangular strips in S ΘΘj and SΘj Θ . M IN ). √ 13. . D[j. the results for all n j vertices of Θj can thus be collected in O(mn j ) time. Thus these functions can be completely determined by evaluating an initial value and the slope of each linear piece. evaluating f j (xi ).xb ∈X. . for j ← 1 to k do D[j. . Pk . and the algorithm uses O(km2 n2 ) storage. 2. . . the function f j (t) is the difference of two piecewise linear functions with the same set of breakpoints. The evaluation of the function f j thus starts with computing an initial configuration of strips. all m values of t such that v is aligned with a vertex of Θ can be identified in O(m) time. . while the width of those in S ΘΘj increases and of those in S Θj Θ decreases. . a. . a. .

. ta−1 . i. and let i be the largest such i. N − 1}. .. i i But we also have ta−1 ≤ ta −1 ≤ ta ≤ ta < ta−1 ≤ ta−1 ... b] + f j (xb ). From the fact that the 1 j i i i i lexicographically smallest optimal solution (t a . j ∈ {1... there is a critical point xz . This contradicts the conclusion of the previous i 10 . σ3 ∈ SΘj Θj . let (t a .. ta−1 . 0 ≤ z ≤ b.. 1 i i i i −1 i i +1 i −1 ta . line 4 to 12 dominate the running time of algorithm SimpleCompute. Pk . i i By our choice of i and i . . j}. b]. and σ5 ∈ SΘΘ . . . and any critical point x b . b}.. and ii) D[j...σ3 σ2 Θj Θ σ1 σ5 σ4 σ6 Figure 7: The four types of strips formed by the discontinuities of two turning functions Θ j and Θ: σ1 .. σ6 ∈ SΘΘj ... Thus we can evaluate the function f j at all critical points in x in O(N ) = O(mn). which is O(kN 2 ) = O(km2 n2 ). so i i=i i i f i (ta ) > i i i=i f i (ta−1 ). The updates of the configuration and the function values computation take constant time per critical point. ta +1 ... . b] = D[j.6 A fast algorithm The above time bound on the computation of the similarity measure d(P 1 . a. We first prove this observation. Thus. a. . a. σ2 . and thus the step 2 of the algorithm takes O(kmn) time. a − 1 and b.. . and this is O(km 2 n2 ). we must conclude 1 j that it is suboptimal. ta−1 .. σ4 ∈ SΘj Θ . which is at most N times for the loop in line 4. b] = D[j − 1. . P ) can be improved to O(kmn log mn).. .. a. for a total of O(kN 2 ) = O(km2 n2 ) times. assume that there is an a (0 < a ≤ b) and an i (1 ≤ i ≤ j) such that ta < ta−1 . From the fact that this solution is lexicoi i i +1 i graphically smaller than the lexicographically smallest optimal solution (t a−1 . such that: i) D[j.. . Proof: For fixed values of j and b.. the value t a can i only increase as well. It follows i i i i i i +1 that (ta .. . . we have ta −1 ≤ ta < ta−1 ≤ ta−1 ≤ ta−1 ≤ ta +1 . for any i ∈ {1. . let i be the smallest such i. We then go through the set X of critical events in order. For the sake of contradiction. . 3. for all a ∈ {0. for all a ∈ {z.e. 0 j+1 The proof of i) and ii) is based on the following observation: as a increases from 0 to b. The operations in the rest of the algorithm all take constant time each... ta ) is a valid solution for j. ta ) denote the lexicographically smallest solution among 1 j the optimal solutions for the subproblem D[j. times N for the loop on line 6. ta . mentioned four groups. .. define t a = xa and ta = xb . Their total time is determined by the number of times line 10 is executed. times k for the loop on line 8. or remain the same. An initial value f j (0) can be computed from this configuration in O(m + n j ). ta−1 ). we must conclude that it is at least as 1 j good. ta ) is different. ta−1 . . z − 1}. and when necessary we update the configuration of strips (and thus the sets SΘΘj and SΘj Θ ) and also compute the value of the function f j . . .. ta−1 ) is a valid solution for j. The amount of storage used by the algorithm is dominated by the amount of space needed to store the matrix M .. . a and b .... b − 1]. i=i f i (ta ) ≤ i=i f i (ta−1 ). . k}. . based on the slopes computed from S ΘΘj and SΘj Θ .. . . . For such an a. . The refinement of the dynamic programming algorithm is based on the following property of equation (8): Lemma 5 For any polyline P j . ta −1 .. a. For ease of notation. from which it follows that (t a−1 . b ∈ {0. . .

and log N height. ν ν ν ν that is: a− = 0 and a+ = N − 1. For ν ν ν ν ν any index a of a critical point x a . Each node ν of T j. b] can be constructed from D[j − 1. and thus... −1] ← INFINITY 9. MIN ← ∞ 6. for j ← 1 to k do 8. Each node ν with a − = a+ is a leaf of the tree. Moreover. j j j j For given j and b. b − 1] fast. .. Pk . .. and sort them 2. Construct D[j. The key to success will be to represent these functions in such a way that they can be evaluated fast and D[j.b used for storing the function D[j. N − 1}. xN−1 }.6. .. for all 0 < a ≤ b and 1 ≤ i ≤ j. b](a) is the sum of the weights on the path from the tree root to the leaf λ a . The root ρ is associated with the full domain. aν ] and ν ν ν [aν + 1. . b] (while adding f j (xb )) from this value onwards. Note that so far. b − 1] 13.. b] = D[j − 1. and split values in their corresponding nodes. a+ ]. b] = D[j. A trivial example is to set w ν equal to D[j. b − 1] up to some value z. b}. so ta = xb .. . b] + f j (xb ). b].paragraph. Each node ν with a − < a+ is an internal node that has a split ρ ρ ν ν value aν = (a− + a+ )/2 associated with it.b has the following property: D[j. by induction. and thus. P ) The running time of this algorithm depends on how the functions D[j. It follows that t a ≥ ta−1 . Its left and right children are associated with [a − . we present the summarized refined algorithm: 1. 3. i i i i for all 0 ≤ a ≤ a. b](a) = D[j... D[j.. We can now prove the lemma item i) and ii). . b] and D[j. b] from D[j − 1. MIN ← min (val. Before describing the data structure used for this purpose. Compute the set of critical points X = {x0 . with a ν = a− = a+ . a. that t a ≤ ta . b − 1] for all a ∈ {0. 11 . INFINITY ← a function that always evaluates to ∞ 5.. This property allows us to improve the time bound of the dynamic programming algorithm in the previous section. . Such a representation of a function D[j. . a. a ← 0 7. b] are represented. b] and D[j. D[j. b] by means of balanced binary trees. we have xb = tz ≤ ta ≤ tb = xb . a+ ]. respectively. we will compute arrays of functions D[j. . N − 1} → R. a. . ZERO ← a function that always evaluates to zero 4. we have t a = xb for all j j a < z. With each node ν we also store a weight w ν . for b ← 0 to N − 1 do 10. b] is not unique.. In order to make especially steps 12 and 15 of the above algorithm efficient. [4] used an idea similar in spirit. b] : {0. b]. k} and all b ∈ {0. we will denote the leaf ν that has a ν = a by λa . b} for which t z = xb j (since tb must be xb . b] can be obtained from D[j. and to set it to zero for all internal nodes. with D[j.. all trees have the same associated intervals. we represent the functions D[j. the tree looks exactly the same for each function D[j. . Lemma 5 expresses the fact that the values of function D[j. .b is associated with an interval [a− . Moreover. Let z be the smallest value in {0. a ← a+1 15. For all j ∈ {1.1 An efficient representation for function D[j. we consider the function D[j. with 0 ≤ a− ≤ a+ ≤ N − 1.. val ← evaluation of D[k. . Instead of computing arrays of scalars D[j. D[0. Since. we have D[j.. Asano et al. such a value always exists). b] We now describe the tree T j. such that Tj. b]. a. z − 1}. for a ∈ {z. return MIN Algorithm FastCompute d(P1 . a. It will become clear below why we also allow representations with non-zero weights on internal nodes. b](aν ) for all leaves ν. b](a) 16. evaluate f j (xb ) 3. by our choice of z. and from D[j − 1. while xa < xb − l0 do 14. 
b]: they are balanced binary trees with N leaves. a. b]. b] ← ZERO 11. MIN ) √ 17. for j ← 1 to k do 12.

b](a) takes O(log N ) time. b − 1].b−1 . Also the representation of a function that always evaluates to ∞ can be constructed in O(N ) time. by identifying.b must equal those in T j−1.b−1 Tj−1. The representation of a function that always evaluates to ∞ takes also O(N ) time since it involves only the building a balanced binary tree on N leaves and setting the weights of all leaves to ∞. b]. We find the sequences of left and right turns that lead from the root of the trees down to the leaf λ z . that is: the rightmost descendant of the left child of ν.e.b with these properties as follows.b is contructed from T j.b λ0 λz λN −1 λ0 λz λN −1 Figure 8: The tree T j. (ii) Once Tj. There is no need to copy ν: we just add a pointer to it. the total weights to the root in Tj. Lemma 5 tells us that for each leaf left of λ z . (iii) To construct T j. a representation Tj. evaluating function D[j. Lemma 6 The data structure Tj.b .b and take the weight of ρ from there.b and Tj. b] can be operated on such that: (i) The representation of a zero-function (i.b of D[j.b−1 .b efficiently.b representing the function D[j.Tj. (ii) Given Tj. Note that the sequences of left and right turns are the same in the trees T j. and adopting the subtrees to the left of the path from T j. and Tj−1.b−1 . Tj. we store with each node ν a value m ν which is the sum of the weights on the path from the left child of ν to the leaf λ aν . i. (iii) Given the representations T j−1. the weights stored allow us to evaluate D[j.b−1 . now adding f j (xz ). at each node node along this path whether the path continues into the right or left subtree of the current node. Proof: (i) Constructing the representation of a zero-function involves nothing more than building a balanced binary tree on N leaves and setting all weights to zero. and continue the construction process in that subtree. based on the stored weights. b]. we take the following approach.b by constructing a root ρ. Every time we go into the left branch.b−1 and Tj−1. the total weight on the path to the root in T j.b .b . b] has been constructed. and the subtrees to the right of the path from T j−1. we will show below that we are able to construct the path from the root of the tree to the leaf λ z corresponding to z. We construct the tree T j. plus f j (xz ).b . and of internal nodes to zero. b] can be computed in O(log N ) time. b](a) for any a in O(log N ) time by summing up the weights of the nodes on the path from the root to the leaf λ a that corresponds to a.b by creating new nodes along the path from the root to the leaf λ z . only the weights on the path differ.b of D[j.b for the representation of function D[j. b] and D[j. a function that always evaluates to zero) can be constructed in O(N ) time. We start building Tj. Then we make a new root for the other subtree of the root ρ. we set the weight of ρ equal to the weight of the root of Tj.b−1 . we adopt the right child 12 .b−1 of the functions D[j − 1.b ⇓ λ0 λz λN −1 Tj. If the path to λ z goes into the left subtree. thus at once making all descendants of ν in T j. the one that contains λ z .e. Furthermore.b−1 into descendants of the corresponding node in the tree T j.b must be the same as the total weight on the corresponding path in T j. At λz itself and right of λ z . we adopt the right child from T j−1. Though we do not compute z explicitly.b−1 and Tj−1. where z is defined as in lemma 5. If the path to λ z goes into the right subtree. respectively. 
we adopt as left child of ρ the corresponding left child ν of the root from T j. Furthermore. This concludes the description of the data structure used for storing a function D[j. This can be done in O(N ) time.b under construction.b from Tj.

b] gives the best alignment of the first j polylines ending at a critical point within the interval [x 0 . Thus. Global string matching under a general edit distance error model can be done by dynamic programming in O(kN ) time. where D[j. the infinity-function IN F IN IT Y can be constructed in O(N ) time (line 4). The accumulated weights on the path down from the root. and every time we go into the right branch. Lemma 6 also insures that constructing D[j. because an optimal D[j. Pk . We note that the problem resembles a general edit distance type approximate string matching problem [19]. together with the stored weights for the paths down to left childrens’ rightmost descendants. without copying the structure of the complete tree. Notice that no node is ever edited after it has been constructed: no tree is ever updated. This is also the case for a different formulation of the dynamic programming problem. This will tell us if λz is to be found in the left or in the right subtree of ν. b] and D[j. This trick however does not apply here due to the condition x b − xa ≤ l0 . . also allow us to decide in constant time which is better: D[j. . . b] gives the cost of the best alignment of the first j polylines ending at critical point x b (over all possible starting points x a satisfying the condition x b − xa ≤ l0 ).5 and can be executed in O(kmn + mn log n) time.from Tj−1. we can set the weight of each newly constructed node ν correctly in constant time per node. So the total storage required by the algorithm is O(kmn log(mn)). the total running time of the above algorithm will be dominated by O(kN ) executions of line 12. .b−1 (see figure 8). Indeed. an assignment as in lines 8 or 10 of the algorithm can safely be done by just copying a reference in O(1) time. P ) with the data structure described above. by the above trick to convert global matching into partial matching we would have to fill in a table D of O(kN ) elements. such that the alignment of the first j − 1 polylines is sub-optimal for D[j − 1. and thus we need O(kmn log(mn)) for all the trees computed in step 12. we increase wν by f j (xz ). xb ]. Notice that any of the functions constructed in step 12 requires only storing O(log(N )) = O(log(mn)) new nodes and pointers to nodes in previously computed trees. b] might be given by an alignment of the first j polylines. we set its weight w ν so that the total weight of ν and its ancestors equals the total weight of the corresponding nodes in the tree from which we adopt ν’s child — if the subtree adopted comes from Tj−1. The complete construction process only takes O(1) time for each node on the path from ρ to λ z . 4 A Part-based Retrieval Application In this section we describe a part-based shape retrieval application. . we have to store the ZERO and the IN F IN IT Y function. taking in total O(kN log N ) = O(kmn log(mn)) time.b . . All these require O(kN ) = O(kmn) storage. Proof: We use algorithm Fastcompute d(P 1 . P ) between k polylines {P1 . . The retrieval problem we are considering is the following: given a large collection of polygonal shapes. where xc is the placement of the polyline j − 1. This table however cannot be filled efficiently in a column-wise manner.b is constructed in time O(log N ). b](aν ) + f j (xz ). By keeping track of the accumulated weights on the path down from the root in all the trees. b] from D[j − 1. in which each entry D[j. . we have that the zero-function ZERO can be constructed in O(N ) time (line 3). we adopt the left child from T j. . 
b](a) (line 15) takes O(log N ) time. where k and N represent the lengths of the two strings. . so that T j. Similarly. b − 1] (line 12) takes O(log N ) time.b . Apart from the function values of f j computed in step 2. b − 1](a ν ) or D[j − 1. . Step 1 and 2 of the algorithm are the same as in the algorithm from section 3. Theorem 1 The similarity d(P1 . Since the trees are perfectly balanced. Pj } with n vertices in total. From lemma 6. and a query consisting of a set of disjoint 13 . and that the evaluation of D[k. and a new tree is always constructed by making O(log N ) new nodes and for the rest by referring to nodes from existing trees as they are. can be computed in O(kmn log(mn)) time using O(kmn log(mn)) storage. For every constructed node ν. c]. this path has only O(log N ) nodes. The same time complexity can be achieved for partial string matching through a standard “assign first line to zero” trick [19]. . Pk . Therefore. and a polygon P with m vertices.

such as those labelled “horse”. Some examples of images in this collection are given in figure 9. contains shapes at different scales. Figure 10 depicts two part-based queries on the same shape (belonging to the “beetle” class) with very different results. we then used the Douglas-Peuker [9] polygon approximation algorithm. The shape collection used by our retrieval application comes from the MPEG-7 shape silhouette database. Querying with such parts have shown to yield shapes from all these classes. Specifically.Figure 9: Examples of images from Core Experiment “CE-Shape1”. For matching the query against a database shape we use the similarity measure described in section 3. The matching of a query consisting of k polylines to an arbitrary database contour is based on the similarity measure described in section 3. 4. we want to retrieve those shapes in the collection that best match the query. instead of the bounding box. “dog”. Given the nature of our collection (each image in “CE-Shape-1” is one view of a complete object). this is a reasonable approach. In this contour. This test set consists of 1400 images: 70 shape classes.1. these parts are the result of applying the boundary decomposition framework introduced in [29]. This similarity measure is not scale invariant. The parts in the query are selected by the user from an automatically computed decomposition of a given shape. is related to the fact that a class in the collection may contain images of an object at different rotational angles.1 Experimental Results The selection of parts comprising the query has a big influence on the results of a part-based retrieval process. part B. the query consists of five chains. The instantiation of the boundary decomposition framework described there uses the medial axis. we used the Core Experiment “CE-Shape-1” part B [13]. instead of querying with complete shapes. Our shape collection. as described in [29]. where shapes in each row belong to the same class. In order to achieve some robustness to scaling. however. a test set devised by the MPEG-7 group to measure the performance of similarity-based retrieval for shape descriptors. “deer”. such as limbs. for example. each pixel corresponds to a vertex. in the application we discuss below. The outer closed contour of the object in each image was extracted. Each resulting simplified contour was then decomposed. we scaled all shapes in the collection to the same diameter of the circumscribed disk. polylines. The running time for a single query on the MPEG-7 test set of 1400 images is typically about one second on a 2 GHz PC. In order to decrease the number of vertices. In the first example. For the medial axis computation we used the Abstract Voronoi Diagram LEDA package [1]. each representing a 14 . whose shapes have similar parts. More specifically. The reason we opted for a normalization based on the circumscribed disk. and the matching process evaluates how closely these parts resemble disjoint pieces of a shape in the collection. as we noticed in section 3. The query represents a set of disjoint boundary parts of a single shape. Our collection for example includes classes. Images in the same row belong to the same class. Thus. with 20 images per class. we make the query process more flexible by allowing the user to search only for certain parts of a given shape. “cattle”.

The shape descriptor selected by MPEG-7 to represent a closed contour of a 2D object or region in an 15 . decomposed countour with five query parts in solid line style. Each visual descriptor incorporated in the MPEG-7 Standard was selected from a few competing proposals. leg of the beetle. each representing a leg of a beetle contour. proposed for standardization within MPEG-7. (Lower half) The same original image and decomposed contour. The performance of each shape descriptor was measured using the so-called “bulls-eye” test: each image is used as a query. two of the other). The MPEG group initiated the MPEG-7 Visual Standard in order to specify standard content-based descriptors that allow to measure similarity in images or video based on visual criteria. and the top ten retrieved shapes when quering with two polylines. a spatial arrangement of these parts is searched in the database. is reported in [21]. and this gives the query more power to discriminate between beetle shapes and other shapes containing leg-like parts. The Core Experiment “CEShape-1” was devised to measure the performance of 2D shape descriptors. In the second example. but they were concatenated. and the number of retrieved images belonging to the same class was counted in the top 40 (twice the class size) matches. one containing three legs and the other two legs. based on an evaluation of their performance in a series of tests called Core Experiments. the same parts are contained in the query.Figure 10: (Upper half) Original image. so that they form two chains (three legs on one side. only three come from the same class. and the top ten retrieved shapes when quering with these five polylines. By concatenating the leg parts in the second query. The performance of several shape descriptors. Among the first ten retrieved results. All first ten retrieved results are this time from the same class.

The reported similarity-based retrieval performance of this method is 75. while in [10] a local descriptor of an image point is proposed that is determined by the spatial arrangement of other image points around that point. This has to be done interactively by the user. is therefore untractable. A prerequisite of such a performance of the part-based matching is the selection of query parts that capture relevant and specific characteristics of the shape. of 78. respectively. of the contours in these classes. by relating the positions of the maxima of the corresponding CSSs. The query. for example the class “beetle”. These experimental results indicate that for those classes with a low average performance of the CSS matching. Belongie et al. Firstly. is significantly more effective in such cases. and a turning function-based matching). For the “ray” and “deer” classes. since that would re quire 1400 interactive queries. respectively) are caused by the different shape and size of the tails and antlers. σ) plane. and our method a score of 70%. proposed for MPEG-7. Latecki at al. we have developed an alternative approach to part-based shape retrieval. [6] reported a retrieval performance of 76. For the shape called “horse-17”. people have reported better performances on the same set of data for other shape descriptors. the CSS matching [18] gives good results. in which the user is relieved of the responsibility of specifying shape parts to be searched for in the database. The average performance rate of CSS matching for this class is only 36%.19% for a shape descriptor introduced in [5].) 5 Concluding Remarks In this paper we addressed the partial shape matching problem. A prerequisite of such a performance of the part-based matching is the selection of query parts that capture relevant and specific characteristics of the shape. The matching process has two steps. The matching in [27] is based on aligning the two shapes in a deformation-based approach. Since the user does not usually have any knowledge of the databased content before the retrieval process. For example. and our multiple polyline to polygon matching method a score of 95%. the different relative lengths of the antennas/legs. The CSS of a closed planar curve is computed by repeatedly convolving the contour with a Gaussian function. An overall performance score for the matching process. In the years following the selection of this descriptor for the MPEG-7 Visual Standard. [27] and Grigorescu and Petkov [10]. with a retrieval rate over 90%.67%. We compared our part-based approach to shape matching with two global shape matching technique (the CSS matching.51% for their descriptor based on shape contexts. our part-based approach with the global CSS matching and with a global matching based on the turning function. (See many examples in the appendix. Even better performances. for the shape called “carriage-18”. respectively. with a proper selection of parts. see [30]. is reported to have a similarity-based retrieval performance of 54. We therefore compared. and representing the curvature zero-crossing points of the resulting curve in the (s. A global turning function-based shape descriptor [12]. the global turning function method a score of 80%. the eccentricity and circularity of the contours are used as a filter.44% [21]. This has to be done interactively by the user. The similarity measure was tested in a shape-based image retrieval application. For classes in “CE-Shape-1” with a low variance among their shapes. is a polygon. 
The reported similarity-based retrieval performance of this method, as measured by the "bulls-eye" test, is 75.44% [21]. A global turning function-based shape descriptor [12], also proposed for MPEG-7, is reported to have a similarity-based retrieval performance of 54.14% [21]. A later variation [33] increases the performance rate up to 65.67%. In the years following the selection of the CSS descriptor for the MPEG-7 Visual Standard, people have reported better performances on the same set of data for other shape descriptors. Belongie et al. [6] reported a retrieval performance of 76.51% for their descriptor based on shape contexts. Even better performances, of 78.18% and 78.38%, respectively, were reported by Sebastian et al. [27] and Grigorescu and Petkov [10]. The matching in [27] is based on aligning the two shapes in a deformation-based approach, while in [10] a local descriptor of an image point is proposed that is determined by the spatial arrangement of other image points around that point. Latecki et al. mention in [16] a retrieval performance of 83.19% for a shape descriptor introduced in [5].

We tested the performance of our part-based shape matching on this data set. A prerequisite of a good performance of the part-based matching is the selection of query parts that capture relevant and specific characteristics of the shape. This has to be done interactively by the user. An overall performance score for the matching process, like in [21], is therefore intractable, since that would require 1400 interactive queries. We therefore compared, on a number of individual queries, our part-based approach with the global CSS matching and with a global matching based on the turning function. The query, in this case, is a polygon.
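To indicate what this global turning function baseline computes, here is a discretized sketch in the spirit of the metric of Arkin et al. [3]. The uniform resampling, the sample count, and the brute-force minimization over starting points are simplifications of ours, not the algorithm of [3] or the descriptor of [12].

```python
import numpy as np

def turning_function(polygon, samples=256):
    """Sample the turning function of a closed polygon, given as an
    (n, 2) vertex array, at `samples` uniform arclength positions."""
    edges = np.roll(polygon, -1, axis=0) - polygon
    lengths = np.hypot(edges[:, 0], edges[:, 1])
    directions = np.arctan2(edges[:, 1], edges[:, 0])
    # Wrap successive direction changes into (-pi, pi] and accumulate,
    # so the function records total turning rather than raw angles.
    turns = np.angle(np.exp(1j * np.diff(directions)))
    theta = directions[0] + np.concatenate(([0.0], np.cumsum(turns)))
    # The turning function is constant along each edge; sample it at
    # uniform positions of the normalized arclength.
    breaks = np.cumsum(lengths) / lengths.sum()
    s = (np.arange(samples) + 0.5) / samples
    return theta[np.searchsorted(breaks, s)]

def turning_distance(poly_a, poly_b, samples=256):
    """L2 distance between turning functions, minimized by brute force
    over the starting point on the second contour and over rotations."""
    a = turning_function(poly_a, samples)
    b = turning_function(poly_b, samples)
    best = np.inf
    for shift in range(samples):       # starting point on contour B
        diff = a - np.roll(b, shift)
        diff -= diff.mean()            # optimal rotation is the mean gap
        best = min(best, np.sqrt(np.mean(diff * diff)))
    return best
```

Arkin et al. [3] minimize over continuous shifts exactly and more efficiently; the brute force over sampled shifts above only illustrates the quantity being minimized, namely an L2 distance between turning functions that is invariant to rotation (the vertical offset) and to the choice of starting point (the horizontal shift).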

For classes in "CE-Shape-1" with a low variance among their shapes, the CSS matching [18] gives good results, with a retrieval rate over 90%. For difficult classes, however, our dissimilarity measure provides a powerful complementary shape matching method. For example, for the class "beetle", the different relative lengths of the antennas/legs and the different shapes of the body pose problems for global retrieval; the average performance rate of CSS matching for this class is only 36%. For the "ray" and "deer" classes, the bad results (an average performance rate of the CSS matching of 26% and 33%, respectively) are caused by the different shape and size of the tails and antlers, respectively, of the contours in these classes. A part-based matching, with a proper selection of parts, is significantly more effective in such cases. For the shape called "horse-17", both the CSS method and the global turning function method have a bull's eye score of 10%, and our method a score of 70%. For the shape called "carriage-18", the CSS method has a bull's eye score of 70%, the global turning function method a score of 80%, and our multiple polyline to polygon matching method a score of 95%. (See many examples in the appendix.) In conclusion, these experimental results indicate that for those classes with a low average performance of the CSS matching, our approach consistently performs better.

5 Concluding Remarks

In this paper we addressed the partial shape matching problem. We introduced a measure for computing the similarity between a set of pieces of one shape and another shape. Such a measure is useful in overcoming problems due to occlusion or unreliable object segmentation from images. The similarity measure was tested in a shape-based image retrieval application. We compared our part-based approach to shape matching with two global shape matching techniques (the CSS matching, and a turning function-based matching). Experimental results indicate that for those classes with a low average performance of the CSS matching, our approach consistently performs better. A prerequisite of such a performance of the part-based matching is the selection of query parts that capture relevant and specific characteristics of the shape, and this has to be done interactively by the user. Since the user does not usually have any knowledge of the database content before the retrieval process, and in order to select among the large number of possible searches in the database with parts of the query, we have developed an alternative approach to part-based shape retrieval, see [30], in which the user is relieved of the responsibility of specifying shape parts to be searched for in the database. Instead, the user interacts with the system in the retrieval process. His interaction, in the form of marking relevant results from those retrieved by the system, has the purpose of allowing the system to guess what the user is looking for in the database.

Acknowledgements

This research was partially supported by the Dutch Science Foundation (NWO) under grant 612.061.006, and by the FP6 IST Network of Excellence 506766 Aim@Shape. Thanks to Veli Mäkinen for the partial string matching reference.

References

[1] AVD LEP, the Abstract Voronoi Diagram LEDA Extension Package. http://www.mpi-sb.mpg.de/LEDA/friends/avd.html.

[2] N. Ansari and E. Delp. Partial shape recognition: A landmark-based approach. PAMI, 12:470–483, 1990.

[3] E. Arkin, L. Chew, D. Huttenlocher, K. Kedem, and J. Mitchell. An efficiently computable metric for comparing polygonal shapes. PAMI, 13:209–215, 1991.

[4] T. Asano, M. de Berg, O. Cheong, H. Everett, H. Haverkort, N. Katoh, and A. Wolff. Optimal spanners for axis-aligned rectangles. Computational Geometry, Theory and Applications, 30(1):59–77, 2005.

[5] E. Attalla. Shape Based Digital Image Similarity Retrieval. PhD thesis, Wayne State University, 2004.

[6] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. PAMI, 24:509–522, 2002.

[7] B. Bhanu and J. Ming. Recognition of occluded objects: A cluster-structure algorithm. Pattern Recognition, 20(2):199–211, 1987.

[8] Scott D. Cohen and Leonidas J. Guibas. Partial matching of planar polylines under similarity transformations. In Proc. SODA, pages 777–786, 1997.

[9] D. Douglas and T. Peucker. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. The Canadian Cartographer, 10(2):112–122, 1973.

[10] C. Grigorescu and N. Petkov. Distance sets for shape filters and shape recognition. IEEE Transactions on Image Processing, 12(10):1274–1286, 2003.

[11] D. Huttenlocher, G. Klanderman, and W. Rucklidge. Comparing images using the Hausdorff distance. PAMI, 15:850–863, 1993.

[12] IBM. Technical summary of turning angle shape descriptors proposed by IBM. Technical Report ISO/IEC JTC 1/SC 29/WG 11/P162.

[13] S. Jeannin and M. Bober. Description of core experiments for MPEG-7 motion/shape. Technical Report ISO/IEC JTC 1/SC 29/WG 11 MPEG99/N2690, Seoul, March 1999.

[14] L. Latecki and R. Lakämper. Shape similarity measure based on correspondence of visual parts. PAMI, 22:1185–1190, 2000.

[15] L. Latecki and R. Lakämper. Shape similarity and visual parts. In Proc. Int. Conf. Discrete Geometry for Computer Imagery (DGCI), pages 34–51, 2003.

[16] L. Latecki, R. Lakämper, and D. Wolter. Optimal partial shape similarity. Image and Vision Computing, 23:227–236, 2005.

[17] H. Liu and M. Srinath. Partial shape classification using contour matching in distance transformation. PAMI, 12(11):1072–1079, 1990.

[18] F. Mokhtarian, S. Abbasi, and J. Kittler. Efficient and robust retrieval by shape content through curvature scale space. In Workshop on Image DataBases and MultiMedia Search, pages 35–42, 1996.

[19] Gonzalo Navarro. A guided tour to approximate string matching. ACM Computing Surveys, 33(1):31–88, 2001.

[20] H. Ogawa. A fuzzy relaxation technique for partial shape matching. Pattern Recognition Letters, 15(4):349–355, 1994.

[21] J. Ohm and K. Müller. Results of MPEG-7 Core Experiment Shape-1. Technical Report ISO/IEC JTC1/SC29/WG11 MPEG98/M4740, July 1999.

[22] E. Ozcan and C. Mohan. Partial shape matching using genetic algorithms. Pattern Recognition Letters, 18(10):987–992, 1997.

[23] A. Pentland, R. Picard, and S. Sclaroff. Photobook: Content-based manipulation of image databases. International Journal of Computer Vision, 18(3):233–254, 1996.

[24] E. Persoon and K. Fu. Shape discrimination using Fourier descriptors. IEEE Transactions on Systems, Man, and Cybernetics, 7(3):170–179, 1977.

[25] R. Prokop and A. Reeves. A survey of moment-based techniques for unoccluded object representation and recognition. Graphical Models and Image Processing, 54(5):438–460, September 1992.

[26] P. Rosin. Multiscale representation and matching of curves using codons. Graphical Models and Image Processing, 55(4):286–310, July 1993.

[27] T. Sebastian, P. Klein, and B. Kimia. On aligning curves. PAMI, 25(1):116–125, 2003.

[28] K. Siddiqi, A. Shokoufandeh, S. Dickinson, and S. Zucker. Shock graphs and shape matching. International Journal of Computer Vision, 35(1):13–32, 1999.

[29] Mirela Tanase. Shape Decomposition and Retrieval. PhD thesis, Department of Computer Science, Utrecht University, February 2005.

[30] Mirela Tanase and Remco C. Veltkamp. Part-based shape retrieval with relevance feedback. In Proceedings International Conference on Multimedia and Expo (ICME05), 2005.

[31] R. Veltkamp and M. Hagedoorn. State-of-the-art in shape matching. In M. Lew, editor, Principles of Visual Information Retrieval, pages 87–119. Springer, 2001.

[32] Haim Wolfson and Isidore Rigoutsos. Geometric hashing: an overview. IEEE Computational Science & Engineering, pages 10–21, October–December 1997.

[33] C. Zibreira and F. Pereira. A study of similarity measures for a turning angles-based shape descriptor. In Proc. Conf. on Telecommunications, Portugal, 2001.

Appendix A

We compared our multiple polyline to polygon matching (MPP) approach with the global CSS matching (CSS) and with a global matching based on the turning function (GTA). Table 1 presents the query image of the CSS matching and the query parts of our part-based matching in the first and second column of the table (the images are pictorial and are not reproduced here; each row is identified by the name of the query shape). The retrieval rate measured by the "bulls-eye" test, i.e. the percentage of true positives returned in the first 40 (twice the class size) matches, is indicated for the three matching techniques in the following three columns, while the remaining columns give the percentage of true positives returned in the first 20 (class size) matches.

                    Bull's Eye Performance        True Positives in class size
Query shape         CSS      GTA      MPP         CSS      GTA      MPP
beetle-20           10%      30%      65%         10%      10%      55%
beetle-10           10%      35%      60%          5%      20%      55%
ray-3               15%      15%      70%         15%       5%      60%
ray-17              15%      25%      50%         15%      20%      40%
deer-15             15%      40%      50%          5%      30%      35%
deer-13             20%      40%      45%         15%      40%      40%
bird-11              5%      20%      45%          5%      15%      35%
bird-17             20%      15%      60%         20%      15%      40%
bird-9              20%      25%      50%         15%      25%      40%
carriage-18         70%      80%      95%         45%      80%      90%
horse-17            10%      10%      70%         10%       5%      60%
horse-3             15%      25%      65%         10%      25%      40%
horse-4             25%      30%      50%         20%      20%      45%
crown-13            30%      30%      50%         25%      25%      40%
butterfly-11        20%      45%      65%         20%      35%      50%
butterfly-4         15%      25%      55%         10%      20%      55%
dog-11              10%      15%      50%         10%      15%      45%

Table 1: A comparison of the Curvature Scale Space (CSS), the Global Turning Angle function (GTA), and our Multiple Polyline to Polygon (MPP) matching.
