This action might not be possible to undo. Are you sure you want to continue?

# Multiple Polyline to Polygon Matching

Mirela T˘ anase

Remco C. Veltkamp

Herman Haverkort

institute of information and computing sciences, utrecht university

technical report UU-CS-2005-017

www.cs.uu.nl

Multiple Polyline to Polygon Matching

Mirela T˘ anase

1

Remco C. Veltkamp

1

Herman Haverkort

2

1

Department of Information & Computing Sciences

Utrecht University, The Netherlands

2

Department of Mathematics and Computing Science

TU Eindhoven, The Netherlands

Abstract

This paper addresses the partial shape matching problem, which helps identifying similarities even

when a signiﬁcant portion of one shape is occluded, or seriously distorted. We introduce a measure for

computing the similarity between multiple polylines and a polygon, that can be computed in O(km

2

n

2

)

time with a straightforward dynamic programming algorithm. We then present a novel fast algorithm

that runs in time O(kmnlog mn). Here, m denotes the number of vertices in the polygon, and n is the

total number of vertices in the k polylines that are matched against the polygon. The effectiveness of the

similarity measure has been demonstrated in a part-based retrieval application with known ground-truth.

1 Introduction

The motivation for multiple polyline to polygon matching is twofold. Firstly, the matching of shapes has

been done mostly by comparing them as a whole [3, 14, 18, 23, 24, 25, 26, 28, 31]. This fails when a

signiﬁcant part of one shape is occluded, or distorted by noise. In this paper, we address the partial shape

matching problem, matching portions of two given shapes. Secondly, partial matching helps identifying

similarities even when a signiﬁcant portion of one shape boundary is occluded, or seriously distorted. It

could also help in identifying similarities between contours of a non-rigid object in different conﬁgurations

of its moving parts, like the contours of a sitting and a walking cat. Finally, partial matching helps allevi-

ating the problem of unreliable object segmentation from images, over or undersegmentation, giving only

partially correct contours.

Despite its usefulness, considerably less work has been done on partial shape matching than on global

shape matching. One reason could be that partial shape matching is more involved than global shape

matching, since it poses two more difﬁculties: identifying the portions of the shapes to be matched, and

achieving invariance to scaling. If the shape representation is not scale invariant, shapes at different scales

can be globally matched after scaling both shapes to the same length or the same area of the minimum

bounding box. Such solutions work reasonably well in practice for many global similarity measures. When

doing partial matching, however, it is unclear how the scaling should be done since there is no way of

knowing in advance the relative magnitude of the matching portions of the two shapes. This can be seen

in ﬁgure 1: any two of the three shapes in the ﬁgure should give a high score when partially matched but

for each shape the scale at which this matching should be done depends on the portions of the shapes that

match.

Contribution Firstly, we introduce a measure for computing the similarity between multiple polylines

and a polygon. The polylines could for example be pieces of an object contour incompletely extracted from

an image, or could be boundary parts in a decomposition of an object contour. The measure we propose is

based on the turning function representation of the polylines and the polygon. We then derive a number of

non-trivial properties of the similarity measure.

Secondly, based on these properties we characterize the optimal solution that leads to a straightforward

O(km

2

n

2

)-time dynamic programming algorithm. We then present a novel O(kmnlog mn)-time algo-

rithm. Here, m denotes the number of vertices in the polygon, and n is the total number of vertices in the

k polylines that are matched against the polygon.

1

A

B

C

Figure 1: Scale invariance in partial matching is difﬁcult.

Thirdly, we have experimented with a part-based retrieval application. Given a large collection of

shapes and a query consisting of a set of polylines, we want to retrieve those shapes in the collection that

best match our query. The set of polylines forming the query are boundary parts in a decomposition of a

database shape – both this database shape and the parts in the query are selected by the user. The evaluation

on the basis of a known ground-truth indicate that a part-based approach to matching improves the global

matching performance for difﬁcult categories of shapes.

2 Related Work

Arkin et al. [3] describe a metric for comparing two whole polygons that is invariant under translation,

rotation and scaling. It is based on the L

2

-distance between the turning functions of the two polygons, and

can be computed in O(mnlog mn) time, where m is the number of vertices in one polygon and n is the

number of vertices in the other.

Most partial shape matching methods are based on computing local features of the contour, and then

looking for correspondences between the features of the two shapes, for example points of high curvature

[2, 17]. Other local features-based approaches make use of genetic algorithms [22], k-means clustering

[7], or fuzzy relaxation techniques [20]. Such local features-based solutions work well when the matched

subparts are almost equivalent up to a transformation such as translation, rotation or scaling, because for

such subparts the sequences of local features are very similar. However, parts that we perceive as similar,

may have quite different local features (different number of curvature extrema for example).

Geometric hashing [32] is a method that determines if there is a transformed subset of the query point

set that matches a subset of a target point set, by building a hash table in transformation space. Also the

Hausdorff distance [11] allows partial matching. It is deﬁned for arbitrary non-empty bounded and closed

sets A and B as the inﬁmum of the distance of the points in A to B and the points in B to A. Both methods

are designed for partial matching, but do not easily transform to our case of matching multiple polylines to

a polygon.

Partially matching the turning angle function of two polylines under scaling, translation and rotation,

can be done in time O(m

2

n

2

) [8]. Given two matches with the same squared error, the match involving the

longer part of the polylines has a lower dissimilarity. The dissimilarity measure is a function of the scale,

rotation, and the shift of one polyline along the other. However, this works for only two single polylines.

Latecki et al [15], establish the best correspondence of parts in a decomposition of the matched shapes.

The best correspondence between the maximal convex arcs of two simpliﬁed versions of the original shapes

gives the partial similarity measure between the shapes. One serious drawback of this approach is the fact

that the matching is done between parts of simpliﬁed shapes at “the appropriate evolution stage” [14]. How

these evolution stages are identiﬁed is not indicated in their papers, though it certainly has an effect on the

quality of the matching process. Also, by looking for correspondences between parts of the given shapes,

false negatives could appear in the retrieval process. An example of such a case is in ﬁgure 2, where two

parts of the query shape Q are denoted by Q

1

and Q

2

, and two parts of the database shape P are denote

by P

1

and P

2

. Though P has a portion of P

1

∪ P

2

similar to a portion of Q

1

∪ Q

2

, no correspondence

among these four parts succeeds to recognize this. Considering different stages of the curve evolution (such

that P

1

∪ P

2

and Q

1

∪ Q

2

constitute individual parts) does not solve the problem either. This is because

the method used for matching two chains normalizes the two chains to unit length and thus no shifting of

2

P

1

P

2

Q

1

Q

2

Figure 2: Partial matching based on correspondence of parts may fail to identify similarities.

one endpoint of one chain over the other is done. Also notice that if instead of making correspondences

between parts, we shift the parts of one shape over the other shape and select those that give a close match,

the similarity between the shapes in ﬁgure 2 is reported.

In our approach, we also use a decomposition into parts in order to identify the matching portions,

but we do this only for one of the shapes. In the matching process between a (union of) part(s) and an

undecomposed shape, the part(s) are shifted along the shape, and the optimal placement is computed based

on their turning functions.

In the next section, we introduce a similarity measure for matching a union of possibly disjoint parts

and a whole shape. This similarity measure is a turning angle function-based similarity, minimized over all

possible shiftings of the endpoints of the parts over the shape, and also over all independent rotations of the

parts. Since we allow the parts to rotate independently, this measure could capture the similarity between

contours of non-rigid objects, with parts in different relative positions.

3 Polylines-to-Polygon Matching

We concentrate on the problem of matching an ordered set {P

1

, P

2

, . . . , P

k

} of k polylines against a poly-

gon P. We want to compute how close an ordered set of polylines {P

1

, P

2

, . . . , P

k

} is to being part of the

boundary of P in the given order in counter-clockwise direction around P (see ﬁgure 3). For this purpose,

the polylines are rotated and shifted along the polygon P, in such a way that the pieces of the boundary of

P “covered” by the k polylines are mutually disjoint except possibly at their endpoints. In this section we

deﬁne a similarity measure for such a polylines-to-polygon matching. The measure is based on the turning

function representation of the given polylines and it is invariant under translation and rotation. We ﬁrst

give an O(km

2

n

2

)-time dynamic programming algorithm for computing this similarity measure, where m

is the number of vertices in P and n is the total number of vertices in the k polylines. We then reﬁne this

algorithm to obtain a running time of O(kmnlog mn). Note that the fact that P is a polygon, and not an

open polyline, comes from the intended application of part based shape retrieval.

3.1 Similarity between polyline and polygon

The turning function Θ

A

of a polygon A measures the angle of the counterclockwise tangent with respect

to a reference orientation as a function of the arc-length s, measured from some reference point on the

boundary of A. It is a piecewise constant function, with jumps corresponding to the vertices of A. A

rotation of A by an angle θ corresponds to a shifting of Θ

A

over a distance θ in the vertical direction.

Moving the location of the reference point A(0) over a distance t ∈ [0, l

A

) along the boundary of A

corresponds to shifting Θ

A

horizontally over a distance t.

The distance between two polygons A and B is deﬁned as the L

2

distance between their two turning

functions Θ

A

and Θ

B

, minimized with respect to the vertical and horizontal shifts of these functions (in

other words, minimized with respect to rotation and choice of reference point). More formally, suppose A

and B are two polygons with perimeter length l

A

and l

B

, respectively, and the polygon B is placed over A

in such a way that the reference point B(0) of B coincides with point A(t) at distance t along A from the

reference point A(0), and B is rotated clockwise by an angle θ with respect to the reference orientation. A

pair (t, θ) ∈ R × R will be referred to as a placement of B over A. The ﬁrst component t of a placement

(t, θ) is also called a horizontal shift, since it corresponds to a horizontal shifting of Θ

A

over a distance t,

3

P

P

1

P

2

P

3

P

P

3

P

2

P

1

Figure 3: Matching an ordered set {P

1

, P

2

, P

3

} of polylines against a polygon P.

while the second component θ is also called a vertical shift, since it corresponds to a vertical shifting of Θ

A

over a distance θ. We deﬁne the quadratic similarity f(A, B, t, θ) between A and B for a given placement

(t, θ) of B over A, as the square of the L

2

-distance between their two turning functions Θ

A

and Θ

B

:

f(A, B, t, θ) =

_

lB

0

(Θ

A

(s +t) −Θ

B

(s) +θ)

2

ds.

The similarity between two polygons A and B is then given by:

min

θ∈R, t∈[0,lA)

_

f(A, B, t, θ).

To achieve invariance under scaling, Arkin et al. [3] propose scaling the two polygons to unit length

prior to the matching.

For measuring the difference between a polygon A and a polyline B the same measure can be used

(for matching two polylines, some slight adaptations would be needed). Notice that, for the purpose of our

part-based retrieval application, we want that a polyline B included in a polygon A to match perfectly this

polygon, that is: their similarity should be zero. But if we scale A and B to the same length, their turning

functions will no longer match perfectly. For this reason we do not scale the polyline or the polygon prior

to the matching process. Thus, our similarity measure is not scale-invariant.

In a part-based retrieval application (see section 4) using a similarity measure based on the above

quadratic similarity (see section 3.2), we achieve robustness to scaling by normalizing all shapes in the col-

lection to the same diameter of their circumscribed circle. This is a reasonable solution for our collection,

which does not contain occluded, or incomplete shapes. Moreover these shapes are object contours and not

synthetic shapes.

3.2 Similarity between multiple polylines and a polygon

Let Θ : [0, l] →R be the turning function of a polygon P of m vertices, and of perimeter length l. Since P

is a closed polyline, the domain of Θcan be easily extended to the entire real line, by Θ(s+l) = Θ(s)+2π.

Let {P

1

, P

2

, . . . P

k

} be a set of polylines, and let Θ

j

: [0, l

j

] → R denote the turning function of the

polyline P

j

of length l

j

. If P

j

is made of n

j

segments, Θ

j

is piecewise-constant with n

j

−1 jumps.

For simplicity of exposition, f

j

(t, θ) denotes the quadratic similarity f

j

(t, θ) =

_

lj

0

(Θ(s+t) −Θ

j

(s) +

θ)

2

ds.

We assume the polylines {P

1

, P

2

, . . . , P

k

} satisfy the condition

k

j=1

l

j

≤ l. The similarity measure,

which we denote by d(P

1

, . . . , P

k

; P), is the square root of the sum of quadratic similarities f

j

, minimized

over all valid placements of P

1

, . . . , P

k

over P (or in other words, minimized over all valid horizontal and

vertical shifts of their turning functions):

d(P

1

, . . . , P

k

; P) = min

valid placements

(t1, θ1) . . . (t

k

, θ

k

)

⎛

⎝

k

j=1

f

j

(t

j

, θ

j

)

⎞

⎠

1/2

.

4

l 2l

Θ

t

1

t

1

+ l

1

Θ

1

t

2

t

2

+ l

2

Θ

2

t

3

t

3

+ l

3

< t

1

+ l

Θ

3

Figure 4: To compute d(P

1

, . . . , P

k

; P) between the polylines P

1

, . . . , P

3

and the polygon P, we shift the

turning functions Θ

1

, Θ

2

, and Θ

3

horizontally and vertically over Θ.

It remains to deﬁne what the valid placements are. The horizontal shifts t

1

, . . . , t

k

correspond to

shiftings of the starting points of the polylines P

1

, . . . , P

k

along P. We require that the starting points

of P

1

, . . . , P

k

are matched with points on the boundary of P in counterclockwise order around P, that is:

t

j−1

≤ t

j

for all 1 < j ≤ k, and t

k

≤ t

1

+ l. Furthermore, we require that the matched parts are disjoint

(except possibly at their endpoints), sharpening the constraints to t

j−1

+ l

j−1

≤ t

j

for all 1 < j ≤ k, and

t

k

+l

k

≤ t

1

+l (see ﬁgure 4).

The vertical shifts θ

1

, . . . , θ

k

correspond to rotations of the polylines P

1

, . . . , P

k

with respect to the

reference orientation, and are independent of each other. Therefore, in an optimal placement the quadratic

similarity between a particular polyline P

j

and P depends only on the horizontal shift t

j

, while the vertical

shift must be optimal for the given horizontal shift. We can thus express the similarity between P

j

and P

for a given positioning t

j

of the starting point of P

j

over P as: f

∗

j

(t

j

) = min

θ∈R

f

j

(t

j

, θ).

The similarity between the polylines P

1

, . . . , P

k

and the polygon P is thus:

d(P

1

, . . . , P

k

; P) = min

t1∈[0,l), t2,...,t

k

∈[0,2l);

∀j∈{2,...,k}: tj−1+lj−1≤tj; t

k

+l

k

≤t1+l

⎛

⎝

k

j=1

f

∗

j

(t

j

)

⎞

⎠

1/2

. (1)

3.3 Properties of the similarity function

In this section we give a few properties of f

∗

j

(t), as functions of t, that constitute the basis of the algorithms

for computing d(P

1

, . . . , P

k

; P) in sections 3.5 and 3.6. We also give a simpler formulation of the opti-

mization problem in the deﬁnition of d(P

1

, . . . , P

k

; P). Arkin et al. [3] have shown that for any ﬁxed t,

the function f

j

(t, θ) is a quadratic convex function of θ. This implies that for a given t, the optimization

problem min

θ∈R

f

j

(t, θ) has a unique solution, given by the root θ

∗

j

(t) of the equation ∂f

j

(t, θ)/∂θ = 0.

Since

∂f

j

(t, θ)

∂θ

=

_

lj

0

2(Θ(s +t) −Θ

j

(s) +θ)ds =

_

lj

0

2(Θ(s +t) −Θ

j

(s))ds + 2θl

j

, (2)

we get:

Lemma 1 For a given positioning t of the starting point of P

j

over P, the rotation that minimizes the

quadratic similarity between P

j

and P is given by θ

∗

j

(t) = −

_

lj

0

(Θ(s +t) −Θ

j

(s))ds/l

j

.

We now consider the properties of f

∗

j

(t) = f

j

(t, θ

∗

j

(t)), as a function of t.

Lemma 2 The quadratic similarity f

∗

j

(t) has the following properties:

i) it is periodic, with period l;

ii) it is piecewise quadratic, with mn

j

breakpoints within any interval of length l; moreover, the

parabolic pieces are concave.

5

Θ

Θ

j

Figure 5: For given horizontal and vertical shifts of Θ

j

over Θ, the discontinuities of Θ

j

and Θ form a set

of rectangular strips.

Proof: i) Since Θ(t +l) = 2π + Θ(t), from lemma 1 we get that θ

∗

(t +l) = θ

∗

(t) −2π. Thus

f

∗

j

(t +l) =

_

lj

0

(Θ(s +t +l) −Θ

j

(s) +θ

∗

(t) −2π)

2

ds = f

∗

j

(t).

ii) Since f

∗

j

(t) = f(P, P

j

, t, θ

∗

j

(t))f

j

(t, θ

∗

j

(t)), we have:

f

∗

j

(t) =

_

lj

0

(Θ(s +t) −Θ

j

(s) +θ

∗

j

(t))

2

ds =

_

lj

0

(Θ(s +t) −Θ

j

(s))

2

ds +

_

lj

0

2(Θ(s +t) −Θ

j

(s))θ

∗

j

(t)ds +

_

lj

0

(θ

∗

j

(t))

2

ds

=

_

lj

0

(Θ(s +t) −Θ

j

(s))

2

ds +

2θ

∗

j

(t)

_

lj

0

(Θ(s +t) −Θ

j

(s))ds +

l

j

(θ

∗

j

(t))

2

.

Applying lemma 1 we get:

f

∗

j

(t) =

_

lj

0

(Θ(s +t) −Θ

j

(s))

2

ds −

_

_

lj

0

(Θ(s +t) −Θ

j

(s))ds

_

2

l

j

. (3)

For the rest of this proof we restrict our attention to the behaviour of f

∗

j

within an interval of length

l. For a given value of the horizontal shift t, the discontinuities of Θ and Θ

j

deﬁne a set S of rectangular

strips. Each strip is deﬁned by a pair of consecutive discontinuities of the two turning functions and is

bounded from below and above by these functions.

As Θ

j

is shifted horizontally with respect to Θ, there are at most mn

j

critical events within an interval

of length l where the conﬁguration of the rectangular strips deﬁned by Θ and Θ

j

changes. These critical

events are at values of t were a discontinuity of Θ

j

coincides with a discontinuity of Θ. Inbetween these

critical events, the height of a strip σ is a constant h

σ

, while the width w

σ

is a linear function of t with

coefﬁcient −1, 0 or 1.

For a given conﬁguration of strips, the value of

_

lj

0

(Θ(s+t) −Θ

j

(s))

2

ds is thus given by the sum over

the strips in the current conﬁguration of the width of the strip times the square of its height:

_

lj

0

(Θ(s +t) −Θ

j

(s))

2

ds =

σ∈S

w

σ

(t)h

2

σ

.

Similarly, we have:

_

lj

0

(Θ(s +t) −Θ

j

(s))ds =

σ∈S

w

σ

(t)h

σ

.

6

t

f

∗

j

(t)

Figure 6: The quadratic similarity f

∗

j

is a piecewise quadratic function of t, with concave parabolic pieces.

And thus equation (3) can be rewritten as:

f

∗

j

(t) =

σ∈S

w

σ

(t)h

2

σ

−

1

l

j

⎛

⎝

σ∈S

w

σ

(t)h

σ

⎞

⎠

2

. (4)

Since the functions w

σ

are linear functions of t, the function f

∗

j

is a piecewise quadratic function of t, with

at most mn

j

breakpoints within an interval of length l. Moreover, since the coefﬁcient of t

2

in equation (4)

is negative, the parabolic pieces of f

∗

j

are concave (see ﬁgure 6).

The following corollary indicates that in computing the minimum of the function f

∗

j

, it sufﬁces to

restrict our attention to a discrete set of at most mn

j

points.

Corollary 1 The local minima of the function f

∗

j

are among the breakpoints between its parabolic pieces.

We now give a simpler formulation of the optimization problem in the deﬁnition of d(P

1

, . . . , P

k

; P).

In order to simplify the restrictions on t

j

in equation (1), we deﬁne:

f

j

(t) := f

∗

j

_

t +

j−1

i=1

l

i

_

.

In other words, the function f

j

is a copy of f

∗

j

, but shifted to the left with

j−1

i=1

l

i

. Obviously, f

j

has the

same properties as f

∗

j

, that is: it is a piecewise quadratic function of t that has its local minima in at most

mn

j

breakpoints in any interval of length l. With this simple transformation of the set of functions f

∗

j

, the

optimization problem deﬁning d(P

1

, . . . , P

k

; P) becomes:

d(P

1

, . . . , P

k

; P) = min

t1∈[0,l), t2,...,t

k

∈[0,2l);

∀j∈{2,...,k}: tj−1≤tj; t

k

≤t1+l0

⎛

⎝

k

j=1

f

j

(t

j

)

⎞

⎠

1/2

, (5)

where l

0

:= l−

k

i=1

l

i

. Notice that if (t

∗

1

, . . . , t

∗

k

) is a solution to the optimization problemin equation (5),

then (t

∗

1

, . . . , t

∗

k

), with t

∗

j

:= t

∗

j

+

j−1

i=1

l

i

, is a solution to the optimization problem in equation (1).

3.4 Characterization of an optimal solution

In this section we characterize the structure of an optimal solution to the optimization problem in equa-

tion (5), and give a recursive deﬁnition of this solution. This deﬁnition forms the basis of a straightforward

dynamic programming solution to the problem.

Let (t

∗

1

, . . . , t

∗

k

) be a solution to the optimization problem in equation (5).

7

Lemma 3 The values of an optimal solution (t

∗

1

, . . . , t

∗

k

) are found in a discrete set of points X ⊂ [0, 2l),

which consists of the breakpoints of the functions f

1

, ..., f

k

, plus two copies of each breakpoint: one shifted

left by l

0

and one shifted right by l

0

.

Proof: If t

∗

j−1

= t

∗

j

and t

∗

j

= t

∗

j+1

, the constraints on the choice of t

∗

1

, ..., t

∗

k

do not prohibit decreasing

or increasing t

∗

j

, thus t

∗

j

must give a local minimum of function f

j

.

Notice now, that t

∗

j−1

= t

∗

j

means that t

∗

j

= t

∗

j−1

+ l

j−1

, thus the positioning of the ending point of

P

j−1

and of the starting point of P

j

coincide. Then, in the given optimal placement the polylines P

j−1

and

P

j

are “glued” together. Similarly, if t

∗

k

= t

∗

1

+l

0

, we have that P

k

and P

1

are glued together.

For any j ∈ {1, ..., k}, let G

j

be the maximal set of polylines containing P

j

and glued together in the

given optimal placement, that is: no polyline that is not in G

j

is glued to a polyline in G

j

. Observe that

G

j

must have either the form {P

g

, ..., P

h

} (with j ∈ {g, ..., h}) or the form {P

1

, ..., P

h

, P

g

, ..., P

k

} (with

j ∈ {1, ..., h, g, ..., k}), for some g and h in {1, ..., k}.

In the ﬁrst case, we have t

∗

i

= t

∗

j

for all i ∈ {g, ..., h}. Since G

j

is a maximal set of polylines

glued together, the constraints on the choice of t

∗

1

, ..., t

∗

k

do not prohibit decreasing or increasing t

∗

g

, ..., t

∗

h

simultaneously, so t

∗

j

must give a local minimum of the function f

gh

: R →R, f

gh

(t) =

h

i=g

f

i

(t). This

function is obviously piecewise quadratic, with concave parabolic pieces, and the breakpoints of f

gh

are

the breakpoints of the constituent functions f

g

, ..., f

h

. Note that a local minimum of f

gh

may not be a local

minimum of any of the constituent functions f

i

(g ≤ i ≤ h), but nevertheless it will always be given by a

breakpoint of one of the constituent functions.

In the second case, we can argue in a similar manner that t

∗

j

must give a local minimum of the function

f

gh

: R → R, f

gh

(t) =

k

i=g

f

i

(t) +

h

i=1

f

i

(t − l

0

), (when j ∈ {g, ..., k}) or the function f

gh

: R →

R, f

gh

(t) =

k

i=g

f

i

(t + l

0

) +

h

i=1

f

i

(t) (when j ∈ {1, ..., h}). Thus, t

∗

j

must be a breakpoint of one of

the functions f

g

, ..., f

h

, or it must be such a breakpoint shifted left or right by l

0

.

We call a point in [0, 2l), which is either a breakpoint of one of the functions f

1

, ..., f

k

, or such a

breakpoint shifted left or right by l

0

, a critical point. Since function f

j

has 2mn

j

breakpoints, the total

number of critical points in [0, 2l) is at most 6m

j

i=1

n

i

= 6mn. Let X = {x

0

, . . . , x

N−1

} be the set of

critical points in [0, 2l).

With the observations above, the optimization problem we have to solve is:

d(P

1

, . . . , P

k

; P) = min

t1, . . . , t

k

∈ X

∀j > 1 : tj−1 ≤ tj ; t

k

− t1 ≤ l0

⎛

⎝

k

j=1

f

j

(t

j

)

⎞

⎠

1/2

. (6)

We now show that an optimal solution of this problem can be constructed recursively. For this purpose,

we denote:

D[j, a, b] = min

t1, . . . , tj ∈ X

xa ≤ t1 ≤ . . . ≤ tj ≤ x

b

j

i=1

f

i

(t

i

) , (7)

where j ∈ {1, . . . , k}, a, b ∈ {0, . . . , N − 1}, and a ≤ b. Equation (7) describes the subproblem of

matching the set {P

1

, . . . , P

j

} of j polylines to a subchain of P, starting at P(x

a

) and ending at P(x

b

+

j

i=1

l

i

). We nowshow that D[j, a, b] can be computed recursively. Let (t

1

, . . . t

j

) be an optimal solution

for D[j, a, b]. Regarding the value of t

j

we distinguish two cases:

• t

j

= x

b

, in which case (t

1

, . . . t

j−1

) must be an optimal solution for D[j − 1, a, b], otherwise

(t

1

, . . . t

j

) would not give a minimum for D[j, a, b]; thus in this case, D[j, a, b] = D[j − 1, a, b] +

f

j

(x

b

);

• t

j

= x

b

, in which case (t

1

, . . . t

j

) must be an optimal solution for D[j, a, b − 1]; otherwise

(t

1

, . . . t

j

) would not give a minimum for D[j, a, b]; thus in this case D[j, a, b] = D[j, a, b −1].

8

We can now conclude that :

D[j, a, b] = min

_

D[j −1, a, b] + f

j

(x

b

), D[j, a, b −1]

_

, for j ≥ 1 ∧ a ≤ b, (8)

where the boundary cases are D[0, a, b] = 0 and D[j, a, a −1] has no solution.

A solution of the optimization problem (6) is then given by

d(P

1

, . . . , P

k

; P) = min

xa,x

b

∈X, x

b

−xa≤l0

_

D[k, a, b] . (9)

3.5 A straightforward algorithm

Equations (8) and (9) lead to a straightforward dynamic programming algorithm for computing the simi-

larity measure d(P

1

, . . . , P

k

; P):

Algorithm SimpleCompute d(P

1

, . . . , P

k

; P)

1. Compute the set of critical points X = {x0, . . . , xN−1}, and sort them.

2. For all j ∈ {1, . . . , k} and all i ∈ {0, . . . , N −1}, evaluate fj(xi).

3. MIN ←∞

4. for a ←0 to N −1 do

5. for j ←1 to k do D[j, a, a −1] ←∞

6. b ←a

7. while x

b

−xa ≤ l0 do

8. D[0, a, b] ←0

9. for j ←1 to k do

10. D[j, a, b] ←min

_

D[j −1, a, b] + fj(x

b

), D[j, a, b −1]

_

11. b ←b + 1

12. MIN ←min(D[k, a, b −1], MIN).

13. return

√

MIN

Lemma 4 The running time of algorithm SimpleCompute d(P

1

, . . . , P

k

; P) is O(km

2

n

2

), and the algo-

rithm uses O(km

2

n

2

) storage.

Proof: Step 1 of this algorithm requires ﬁrst identifying all values of t such that a vertex of a turning

function Θ

j

is aligned with a vertex of Θ. For a ﬁxed vertex v of a turning function Θ

j

, all m values of t

such that v is aligned with a vertex of Θ can be identiﬁed in O(m) time; the results for all n

j

vertices of

Θ

j

can thus be collected in O(mn

j

) time. We then have to shift these values left by

j−1

i=1

l

i

and make

copies of them shifted left and right by l

0

. Overall, the identiﬁcation of the set X of critical points takes

O(

k

j=1

mn

j

) = O(mn), and they can be sorted in O(mnlog(mn)).

We nowshowthat, for a particular value of j, evaluating f

j

(x

i

), for all i ∈ {0, . . . , N−1} takes O(mn).

The algorithm is an adaptation of the algorithm used in [3], and we thus indicate only its main ideas. As

equation (4) from the proof of lemma 2 indicates, the function f

j

(t) is the difference of two piecewise

linear functions with the same set of breakpoints. Thus these functions can be completely determined by

evaluating an initial value and the slope of each linear piece. It remains to indicate how to compute the

slopes of these linear pieces.

Notice that when Θ

j

is horizontally shifted over Θ such that the conﬁguration S of the rectangular strips

does not change, the only strips whose width is changing are those bounded by a jump of Θ and a jump of

Θ

j

. More precisely, let us denote by S

ΘΘj

the set of rectangular strips in the current conﬁguration S that

are bounded left by a jump of Θ and right by a jump of Θ

j

(see ﬁgure 7), by S

ΘjΘ

the set of rectangular

strips that are bounded left by a jump of Θ

j

and right by a jump of Θ, and by S

ΘΘ

and S

ΘjΘj

the set of of

rectangular strips that are bounded left and right by a jump of Θ and Θ

j

, respectively. When shifting Θ

j

such that the conﬁguration S does not change, the widths of the strips in S

ΘjΘj

and S

ΘΘ

do not change,

while the width of those in S

ΘΘj

increases and of those in S

ΘjΘ

decreases. Thus the slope of each linear

piece of the sum functions in equation (4) is determined only by the rectangular strips in S

ΘΘj

and S

ΘjΘ

.

Notice also that at each breakpoint of f

j

, only three strips in S change their type. The evaluation of

the function f

j

thus starts with computing an initial conﬁguration of strips, with strips divided in the above

9

σ1

σ2

σ3

σ4

σ5

σ6

Θ

Θj

Figure 7: The four types of strips formed by the discontinuities of two turning functions Θ

j

and Θ: σ

1

, σ

4

∈

S

ΘjΘ

, σ

2

, σ

6

∈ S

ΘΘj

, σ

3

∈ S

ΘjΘj

, and σ

5

∈ S

ΘΘ

.

mentioned four groups. An initial value f

j

(0) can be computed from this conﬁguration in O(m+n

j

). We

then go through the set X of critical events in order, and when necessary we update the conﬁguration of

strips (and thus the sets S

ΘΘj

and S

ΘjΘ

) and also compute the value of the function f

j

, based on the slopes

computed from S

ΘΘj

and S

ΘjΘ

. The updates of the conﬁguration and the function values computation

take constant time per critical point. Thus we can evaluate the function f

j

at all critical points in x in

O(N) = O(mn), and thus the step 2 of the algorithm takes O(kmn) time.

The operations in the rest of the algorithm all take constant time each. Their total time is determined

by the number of times line 10 is executed, which is at most N times for the loop in line 4, times N for the

loop on line 6, times k for the loop on line 8, for a total of O(kN

2

) = O(km

2

n

2

) times.

Thus, line 4 to 12 dominate the running time of algorithm SimpleCompute, and this is O(km

2

n

2

). The

amount of storage used by the algorithm is dominated by the amount of space needed to store the matrix

M, which is O(kN

2

) = O(km

2

n

2

).

3.6 A fast algorithm

The above time bound on the computation of the similarity measure d(P

1

, . . . , P

k

; P) can be improved

to O(kmnlog mn). The reﬁnement of the dynamic programming algorithm is based on the following

property of equation (8):

Lemma 5 For any polyline P

j

, j ∈ {1, . . . , k}, and any critical point x

b

, b ∈ {0, . . . , N − 1}, there is a

critical point x

z

, 0 ≤ z ≤ b, such that:

i) D[j, a, b] = D[j, a, b −1], for all a ∈ {0, ..., z − 1}, and

ii) D[j, a, b] = D[j −1, a, b] + f

j

(x

b

), for all a ∈ {z, ..., b}.

Proof: For ﬁxed values of j and b, let (t

a

1

, ..., t

a

j

) denote the lexicographically smallest solution among

the optimal solutions for the subproblem D[j, a, b]. For ease of notation, deﬁne t

a

0

= x

a

and t

a

j+1

= x

b

.

The proof of i) and ii) is based on the following observation: as a increases from 0 to b, the value t

a

i

can

only increase as well, or remain the same, for any i ∈ {1, . . . , j}. We ﬁrst prove this observation.

For the sake of contradiction, assume that there is an a (0 < a ≤ b) and an i (1 ≤ i ≤ j) such that

t

a

i

< t

a−1

i

. For such an a, let i

be the smallest such i, and let i

**be the largest such i.
**

By our choice of i

and i

, we have t

a

i

−1

≤ t

a

i

< t

a−1

i

≤ t

a−1

i

≤ t

a−1

i

+1

≤ t

a

i

+1

. It follows

that (t

a

1

, ..., t

a

i

−1

, t

a−1

i

, ..., t

a−1

i

, t

a

i

+1

, ..., t

a

j

) is a valid solution for j, a and b . From the fact that the

lexicographically smallest optimal solution (t

a

1

, ..., t

a

j

) is different, we must conclude that it is at least as

good, i.e.

i

i=i

f

i

(t

a

i

) ≤

i

i=i

f

i

(t

a−1

i

).

But we also have t

a−1

i

−1

≤ t

a

i

−1

≤ t

a

i

≤ t

a

i

< t

a−1

i

≤ t

a−1

i

+1

, fromwhich it follows that (t

a−1

1

, ..., t

a−1

i

−1

,

t

a

i

, ..., t

a

i

, t

a−1

i

+1

, ..., t

a−1

i

) is a valid solution for j, a − 1 and b. From the fact that this solution is lexico-

graphically smaller than the lexicographically smallest optimal solution (t

a−1

1

, ..., t

a−1

j

), we must conclude

that it is suboptimal, so

i

i=i

f

i

(t

a

i

) >

i

i=i

f

i

(t

a−1

i

). This contradicts the conclusion of the previous

10

paragraph. It follows that t

a

i

≥ t

a−1

i

, for all 0 < a ≤ b and 1 ≤ i ≤ j, and thus, by induction, that t

a

i

≤ t

a

i

,

for all 0 ≤ a

≤ a.

We can now prove the lemma item i) and ii). Let z be the smallest value in {0, ..., b} for which t

z

j

= x

b

(since t

b

j

must be x

b

, such a value always exists). Since, by our choice of z, we have t

a

j

= x

b

for all

a < z, we have D[j, a, b] = D[j, a, b −1] for all a ∈ {0, ..., z −1}. Moreover, for a ∈ {z, ..., b}, we have

x

b

= t

z

j

≤ t

a

j

≤ t

b

j

= x

b

, so t

a

j

= x

b

, and thus, D[j, a, b] = D[j −1, a, b] + f

j

(x

b

).

For given j and b, we consider the function D[j, b] : {0, N − 1} → R, with D[j, b](a) = D[j, a, b].

Lemma 5 expresses the fact that the values of function D[j, b] can be obtained from D[j, b − 1] up to

some value z, and from D[j − 1, b] (while adding f

j

(x

b

)) from this value onwards. This property allows

us to improve the time bound of the dynamic programming algorithm in the previous section. Instead of

computing arrays of scalars D[j, a, b], we will compute arrays of functions D[j, b]. The key to success will

be to represent these functions in such a way that they can be evaluated fast and D[j, b] can be constructed

fromD[j −1, b] and D[j, b −1] fast. Before describing the data structure used for this purpose, we present

the summarized reﬁned algorithm:

Algorithm FastCompute d(P

1

, . . . , P

k

; P)

1. Compute the set of critical points X = {x0, . . . , xN−1}, and sort them

2. For all j ∈ {1, ..., k} and all b ∈ {0, ..., N −1}, evaluate fj(x

b

)

3. ZERO ←a function that always evaluates to zero

4. INFINITY ←a function that always evaluates to ∞

5. MIN ←∞

6. a ←0

7. for j ←1 to k do

8. D[j, −1] ←INFINITY

9. for b ←0 to N −1 do

10. D[0, b] ←ZERO

11. for j ←1 to k do

12. Construct D[j, b] from D[j −1, b] and D[j, b −1]

13. while xa < x

b

−l0 do

14. a ←a + 1

15. val ←evaluation of D[k, b](a)

16. MIN ←min (val, MIN)

17. return

√

MIN

The running time of this algorithm depends on how the functions D[j, b] are represented. In order to

make especially steps 12 and 15 of the above algorithm efﬁcient, we represent the functions D[j, b] by

means of balanced binary trees. Asano et al. [4] used an idea similar in spirit.

3.6.1 An efﬁcient representation for function D[j, b]

We now describe the tree T

j,b

used for storing the function D[j, b]. Each node ν of T

j,b

is associated

with an interval [a

−

ν

, a

+

ν

], with 0 ≤ a

−

ν

≤ a

+

ν

≤ N − 1. The root ρ is associated with the full domain,

that is: a

−

ρ

= 0 and a

+

ρ

= N − 1. Each node ν with a

−

ν

< a

+

ν

is an internal node that has a split

value a

ν

= (a

−

ν

+ a

+

ν

)/2 associated with it. Its left and right children are associated with [a

−

ν

, a

ν

] and

[a

ν

+ 1, a

+

ν

], respectively. Each node ν with a

−

ν

= a

+

ν

is a leaf of the tree, with a

ν

= a

−

ν

= a

+

ν

. For

any index a of a critical point x

a

, we will denote the leaf ν that has a

ν

= a by λ

a

. Note that so far, the

tree looks exactly the same for each function D[j, b]: they are balanced binary trees with N leaves, and

log N height. Moreover, all trees have the same associated intervals, and split values in their corresponding

nodes.

With each node ν we also store a weight w

ν

, such that T

j,b

has the following property: D[j, b](a) is the

sum of the weights on the path from the tree root to the leaf λ

a

. Such a representation of a function D[j, b]

is not unique. A trivial example is to set w

ν

equal to D[j, b](a

ν

) for all leaves ν, and to set it to zero for

all internal nodes. It will become clear below why we also allow representations with non-zero weights on

internal nodes.

11

T

j−1,b

T

j,b−1

T

j,b

⇓

λz

λ

0

λ

N−1

λ

0

λ

0

λz

λz

λ

N−1

λ

N−1

Figure 8: The tree T

j,b

is contructed from T

j,b−1

and T

j−1,b

by creating new nodes along the path from

the root to the leaf λ

z

, and adopting the subtrees to the left of the path fromT

j,b−1

, and the subtrees to the

right of the path from T

j−1,b

.

Furthermore, we store with each node ν a value m

ν

which is the sum of the weights on the path from

the left child of ν to the leaf λ

aν

, that is: the rightmost descendant of the left child of ν. This concludes the

description of the data structure used for storing a function D[j, b].

Lemma 6 The data structure T

j,b

for the representation of function D[j, b] can be operated on such that:

(i) The representation of a zero-function (i.e. a function that always evaluates to zero) can be con-

structed in O(N) time. Also the representation of a function that always evaluates to ∞ can be

constructed in O(N) time.

(ii) Given T

j,b

of D[j, b], evaluating function D[j, b](a) takes O(log N) time.

(iii) Given the representations T

j−1,b

and T

j,b−1

of the functions D[j −1, b] and D[j, b −1], respectively,

a representation T

j,b

of D[j, b] can be computed in O(log N) time.

Proof: (i) Constructing the representation of a zero-function involves nothing more than building a

balanced binary tree on N leaves and setting all weights to zero. This can be done in O(N) time. The

representation of a function that always evaluates to ∞ takes also O(N) time since it involves only the

building a balanced binary tree on N leaves and setting the weights of all leaves to ∞, and of internal

nodes to zero.

(ii) Once T

j,b

representing the function D[j, b] has been constructed, the weights stored allow us to

evaluate D[j, b](a) for any a in O(log N) time by summing up the weights of the nodes on the path from

the root to the leaf λ

a

that corresponds to a.

(iii) To construct T

j,b

from T

j,b−1

and T

j−1,b

efﬁciently, we take the following approach. We ﬁnd the

sequences of left and right turns that lead from the root of the trees down to the leaf λ

z

, where z is deﬁned

as in lemma 5. Note that the sequences of left and right turns are the same in the trees T

j,b

, T

j,b−1

, and

T

j−1,b

, only the weights on the path differ. Though we do not compute z explicitly, we will show belowthat

we are able to construct the path from the root of the tree to the leaf λ

z

corresponding to z, by identifying,

based on the stored weights, at each node node along this path whether the path continues into the right or

left subtree of the current node.

Lemma 5 tells us that for each leaf left of λ

z

, the total weight on the path to the root in T

j,b

must be the

same as the total weight on the corresponding path in T

j,b−1

. At λ

z

itself and right of λ

z

, the total weights

to the root in T

j,b

must equal those in T

j−1,b

, plus f

j

(x

z

). We construct the tree T

j,b

with these properties

as follows.

We start building T

j,b

by constructing a root ρ. If the path to λ

z

goes into the right subtree, we adopt

as left child of ρ the corresponding left child ν of the root fromT

j,b−1

. There is no need to copy ν: we just

add a pointer to it, thus at once making all descendants of ν in T

j,b−1

into descendants of the corresponding

node in the tree T

j,b

under construction. Furthermore, we set the weight of ρ equal to the weight of the

root of T

j,b−1

. If the path to λ

z

goes into the left subtree, we adopt the right child fromT

j−1,b

and take the

weight of ρ from there, now adding f

j

(x

z

).

Then we make a new root for the other subtree of the root ρ, i.e. the one that contains λ

z

, and continue

the construction process in that subtree. Every time we go into the left branch, we adopt the right child

12

fromT

j−1,b

, and every time we go into the right branch, we adopt the left child from T

j,b−1

(see ﬁgure 8).

For every constructed node ν, we set its weight w

ν

so that the total weight of ν and its ancestors equals the

total weight of the corresponding nodes in the tree from which we adopt ν’s child — if the subtree adopted

comes fromT

j−1,b

, we increase w

ν

by f

j

(x

z

).

By keeping track of the accumulated weights on the path down from the root in all the trees, we can set

the weight of each newly constructed node ν correctly in constant time per node. The accumulated weights

on the path down from the root, together with the stored weights for the paths down to left childrens’

rightmost descendants, also allow us to decide in constant time which is better: D[j, b − 1](a

ν

) or D[j −

1, b](a

ν

) + f

j

(x

z

). This will tell us if λ

z

is to be found in the left or in the right subtree of ν.

The complete construction process only takes O(1) time for each node on the path from ρ to λ

z

. Since

the trees are perfectly balanced, this path has only O(log N) nodes, so that T

j,b

is constructed in time

O(log N).

Theorem 1 The similarity d(P

1

, . . . , P

k

; P) between k polylines {P

1

, . . . , P

j

} with n vertices in total,

and a polygon P with m vertices, can be computed in O(kmnlog(mn)) time using O(kmnlog(mn))

storage.

Proof: We use algorithm Fastcompute d(P

1

, . . . , P

k

; P) with the data structure described above.

Step 1 and 2 of the algorithm are the same as in the algorithm from section 3.5 and can be executed in

O(kmn + mnlog n) time. From lemma 6, we have that the zero-function ZERO can be constructed

in O(N) time (line 3). Similarly, the inﬁnity-function INFINITY can be constructed in O(N) time

(line 4). Lemma 6 also insures that constructing D[j, b] from D[j − 1, b] and D[j, b − 1] (line 12) takes

O(log N) time, and that the evaluation of D[k, b](a) (line 15) takes O(log N) time.

Notice that no node is ever edited after it has been constructed: no tree is ever updated, and a new tree

is always constructed by making O(log N) new nodes and for the rest by referring to nodes from existing

trees as they are. Therefore, an assignment as in lines 8 or 10 of the algorithm can safely be done by

just copying a reference in O(1) time, without copying the structure of the complete tree. Thus, the total

running time of the above algorithm will be dominated by O(kN) executions of line 12, taking in total

O(kN log N) = O(kmnlog(mn)) time.

Apart fromthe function values of f

j

computed in step 2, we have to store the ZEROand the INFINITY

function. All these require O(kN) = O(kmn) storage. Notice that any of the functions constructed in

step 12 requires only storing O(log(N)) = O(log(mn)) new nodes and pointers to nodes in previously

computed trees, and thus we need O(kmnlog(mn)) for all the trees computed in step 12. So the total

storage required by the algorithm is O(kmnlog(mn)).

We note that the problem resembles a general edit distance type approximate string matching problem

[19]. Global string matching under a general edit distance error model can be done by dynamic program-

ming in O(kN) time, where k and N represent the lengths of the two strings. The same time complexity

can be achieved for partial string matching through a standard “assign ﬁrst line to zero” trick [19]. This

trick however does not apply here due to the condition x

b

−x

a

≤ l

0

. Indeed, by the above trick to convert

global matching into partial matching we would have to ﬁll in a table D of O(kN) elements, in which

each entry D[j, b] gives the cost of the best alignment of the ﬁrst j polylines ending at critical point x

b

(over all possible starting points x

a

satisfying the condition x

b

− x

a

≤ l

0

). This table however cannot be

ﬁlled efﬁciently in a column-wise manner, because an optimal D[j, b] might be given by an alignment of

the ﬁrst j polylines, such that the alignment of the ﬁrst j −1 polylines is sub-optimal for D[j −1, c], where

x

c

is the placement of the polyline j − 1. This is also the case for a different formulation of the dynamic

programming problem, where D[j, b] gives the best alignment of the ﬁrst j polylines ending at a critical

point within the interval [x

0

, x

b

].

4 A Part-based Retrieval Application

In this section we describe a part-based shape retrieval application. The retrieval problem we are consider-

ing is the following: given a large collection of polygonal shapes, and a query consisting of a set of disjoint

13

Figure 9: Examples of images from Core Experiment “CE-Shape1”, part B. Images in the same row belong

to the same class.

polylines, we want to retrieve those shapes in the collection that best match the query. The query represents

a set of disjoint boundary parts of a single shape, and the matching process evaluates how closely these

parts resemble disjoint pieces of a shape in the collection. Thus, instead of querying with complete shapes,

we make the query process more ﬂexible by allowing the user to search only for certain parts of a given

shape. The parts in the query are selected by the user from an automatically computed decomposition of a

given shape. More speciﬁcally, in the application we discuss below, these parts are the result of applying

the boundary decomposition framework introduced in [29]. For matching the query against a database

shape we use the similarity measure described in section 3.

The shape collection used by our retrieval application comes fromthe MPEG-7 shape silhouette database.

Speciﬁcally, we used the Core Experiment “CE-Shape-1” part B [13], a test set devised by the MPEG-7

group to measure the performance of similarity-based retrieval for shape descriptors. This test set consists

of 1400 images: 70 shape classes, with 20 images per class. Some examples of images in this collection

are given in ﬁgure 9, where shapes in each row belong to the same class. The outer closed contour of the

object in each image was extracted. In this contour, each pixel corresponds to a vertex. In order to decrease

the number of vertices, we then used the Douglas-Peuker [9] polygon approximation algorithm. Each re-

sulting simpliﬁed contour was then decomposed, as described in [29]. The instantiation of the boundary

decomposition framework described there uses the medial axis. For the medial axis computation we used

the Abstract Voronoi Diagram LEDA package [1]. The running time for a single query on the MPEG-7 test

set of 1400 images is typically about one second on a 2 GHz PC.

The matching of a query consisting of k polylines to an arbitrary database contour is based on the

similarity measure described in section 3. This similarity measure is not scale invariant, as we noticed in

section 3.1. Our shape collection, however, contains shapes at different scales. In order to achieve some

robustness to scaling, we scaled all shapes in the collection to the same diameter of the circumscribed disk.

Given the nature of our collection (each image in “CE-Shape-1” is one view of a complete object), this is

a reasonable approach. The reason we opted for a normalization based on the circumscribed disk, instead

of the bounding box, for example, is related to the fact that a class in the collection may contain images of

an object at different rotational angles.

4.1 Experimental Results

The selection of parts comprising the query has a big inﬂuence on the results of a part-based retrieval pro-

cess. Our collection for example includes classes, such as those labelled “horse”, “dog”, “deer”, “cattle”,

whose shapes have similar parts, such as limbs. Querying with such parts have shown to yield shapes from

all these classes. Figure 10 depicts two part-based queries on the same shape (belonging to the “beetle”

class) with very different results. In the ﬁrst example, the query consists of ﬁve chains, each representing a

14

Figure 10: (Upper half) Original image, decomposed countour with ﬁve query parts in solid line style,

each representing a leg of a beetle contour, and the top ten retrieved shapes when quering with these ﬁve

polylines. (Lower half) The same original image and decomposed contour, and the top ten retrieved shapes

when quering with two polylines, one containing three legs and the other two legs.

leg of the beetle. Among the ﬁrst ten retrieved results, only three come from the same class. In the second

example, the same parts are contained in the query, but they were concatenated, so that they form two

chains (three legs on one side, two of the other). All ﬁrst ten retrieved results are this time from the same

class. By concatenating the leg parts in the second query, a spatial arrangement of these parts is searched in

the database, and this gives the query more power to discriminate between beetle shapes and other shapes

containing leg-like parts.

The MPEG group initiated the MPEG-7 Visual Standard in order to specify standard content-based

descriptors that allow to measure similarity in images or video based on visual criteria. Each visual de-

scriptor incorporated in the MPEG-7 Standard was selected from a few competing proposals, based on an

evaluation of their performance in a series of tests called Core Experiments. The Core Experiment “CE-

Shape-1” was devised to measure the performance of 2D shape descriptors. The performance of several

shape descriptors, proposed for standardization within MPEG-7, is reported in [21]. The performance of

each shape descriptor was measured using the so-called “bulls-eye” test: each image is used as a query,

and the number of retrieved images belonging to the same class was counted in the top 40 (twice the class

size) matches.

The shape descriptor selected by MPEG-7 to represent a closed contour of a 2D object or region in an

15

image is based on the Curvature Scale Space (CSS) representation [18]. The CSS of a closed planar curve

is computed by repeatedly convolving the contour with a Gaussian function, and representing the curvature

zero-crossing points of the resulting curve in the (s, σ) plane, where s is the arclength. The matching

process has two steps. Firstly, the eccentricity and circularity of the contours are used as a ﬁlter, to reduce

the amount of computation. The shape similarity measure between two shapes is computed, in the second

step, by relating the positions of the maxima of the corresponding CSSs. The reported similarity-based

retrieval performance of this method is 75.44% [21]. In the years following the selection of this descriptor

for the MPEG-7 Visual Standard, people have reported better performances on the same set of data for

other shape descriptors. Belongie et al. [6] reported a retrieval performance of 76.51% for their descriptor

based on shape contexts. Even better performances, of 78.18% and 78.38%, were reported by Sebastian et

al. [27] and Grigorescu and Petkov [10], respectively. The matching in [27] is based on aligning the two

shapes in a deformation-based approach, while in [10] a local descriptor of an image point is proposed that

is determined by the spatial arrangement of other image points around that point. Latecki at al. mention in

[16] a retrieval performance of 83.19% for a shape descriptor introduced in [5].

A global turning function-based shape descriptor [12], proposed for MPEG-7, is reported to have a

similarity-based retrieval performance of 54.14%. A later variation [33] increases the performance rate up

to 65.67%.

For classes in “CE-Shape-1” with a lowvariance among their shapes, the CSS matching [18] gives good

results, with a retrieval rate over 90%, as measured by the “bulls-eye” test. For difﬁcult classes however,

for example the class “beetle”, the different relative lengths of the antennas/legs, and the different shapes of

the body pose problems for global retrieval. The average performance rate of CSS matching for this class

is only 36%. For the “ray” and “deer” classes, the bad results (an average performance rate of the CSS

matching of 26% and 33%, respectively) are caused by the different shape and size of the tails and antlers,

respectively, of the contours in these classes. A part-based matching, with a proper selection of parts, is

signiﬁcantly more effective in such cases.

We tested the performance of our part-based shape matching. A prerequisite of such a performance of

the part-based matching is the selection of query parts that capture relevant and speciﬁc characteristics of

the shape. This has to be done interactively by the user. An overall performance score for the matching pro-

cess, like in [21], is therefore untractable, since that would re quire 1400 interactive queries. We therefore

compared, on a number of individual queries, our part-based approach with the global CSS matching and

with a global matching based on the turning function. These experimental results indicate that for those

classes with a low average performance of the CSS matching, our approach consistently performs better.

For example, for the shape called “carriage-18”, the CSS method has a bull’s eye score of 70%, the global

turning function method a score of 80%, and our multiple polyline to polygon matching method a score of

95%. For the shape called “horse-17”, both the CSS method and the global turning function method have

a bull’s eye score of 10%, and our method a score of 70%. (See many examples in the appendix.)

5 Concluding Remarks

In this paper we addressed the partial shape matching problem. We introduced a measure for computing the

similarity between a set of pieces of one shape and another shape. Such a measure is usefull in overcoming

problems due to occlusion or unreliable object segmentation from images. The similarity measure was

tested in a shape-based image retrieval application. We compared our part-based approach to shape match-

ing with two global shape matching technique (the CSS matching, and a turning function-based matching).

Experimental results indicate that for those classes with a low average performance of the CSS matching,

our approach consistently performs better. Concludingly, our dissimilarity measure provides a powerful

complementary shape matching method.

A prerequisite of such a performance of the part-based matching is the selection of query parts that

capture relevant and speciﬁc characteristics of the shape. This has to be done interactively by the user.

Since the user does not usually have any knowledge of the databased content before the retrieval process,

we have developed an alternative approach to part-based shape retrieval, in which the user is relieved of

the responsibility of specifying shape parts to be searched for in the database, see [30]. The query, in this

case, is a polygon, and in order to select among the large number of possible searches in the database with

16

parts of the query, the user interacts with the system in the retrieval process. His interaction, in the form

of marking relevant results from those retrieved by the system, has the purpose of allowing the system to

guess what the user is looking for in the database.

Acknowledgements

This research was partially supported by the Dutch Science Foundation (NWO) under grant 612.061.006,

and by the FP6 IST Network of Excellence 506766 Aim@Shape. Thanks to Veli M¨ akinen for the partial

string matching reference.

References

[1] AVD LEP, the Abstract Voronoi Diagram LEDA Extension Package. http://www.mpi-sb.

mpg.de/LEDA/friends/avd.html.

[2] N. Ansari and E. J. Delp. Partial shape recognition: A landmark-based approach. PAMI, 12:470–483,

1990.

[3] E. Arkin, L. Chew, D. Huttenlocher, K. Kedem, and J. Mitchell. An efﬁciently computable metric for

comparing polygonal shapes. PAMI, 13:209–215, 1991.

[4] T. Asano, M. de Berg, O. Cheong, H. Everett, H.J. Haverkort, N. Katoh, and A. Wolff. Optimal

spanners for axis-aligned rectangles. Computational Geometry, Theory and Applications, 30(1):59–

77, 2005.

[5] E. Attalla. Shape Based Digital Image Similarity Retrieval. PhD thesis, Wayne State University,

2004.

[6] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts.

PAMI, 24:509–522, 2002.

[7] B. Bhanu and J. C. Ming. Recognition of occluded objects: A cluster-structure algorithm. Pattern

Recognition, 20(2):199–211, 1987.

[8] Scott D. Cohen and Leonidas J. Guibas. Partial matching of planar polylines under similarity trans-

formations. In Proc. SODA, pages 777–786, 1997.

[9] D.H. Douglas and T.K. Peucker. Algorithms for the reduction of the number of points required to

represent a digitized line or its caricature. The Canadian Cartographer, 10(2):112–122, 1973.

[10] C. Grigorescu and N. Petkov. Distance sets for shape ﬁlters and shape recognition. IEEE Transactions

on Image Processing, 12(10):1274–1286, 2003.

[11] D. Huttenlocher, G. Klanderman, and W. Rucklidge. Comparing images using the hausdorff distance.

PAMI, 15:850–863, 1993.

[12] IBM. Technical summary of turning angle shape descriptors proposed by IBM. Technical report,

IBM, 1999. TR ISO/IEC JTC 1/SC 29/WG 11/ P162.

[13] S. Jeannin and M. Bober. Description of core experiments for MPEG-7 motion/shape. Technical

report, MPEG-7, March 1999. Technical Report ISO/IEC JTC 1/SC 29/WG 11 MPEG99/N2690,

Seoul.

[14] L. J. Latecki and R. Lak¨ amper. Shape similarity measure based on correspondence of visual parts.

PAMI, 22:1185–1190, 2000.

[15] L. J. Latecki, R. Lak¨ amper, and D. Wolter. Shape similarity and visual parts. In Proc. Int. Conf.

Discrete Geometry for Computer Imagery (DGCI), pages 34–51, 2003.

17

[16] L. J. Latecki, R. Lak¨ amper, and D. Wolter. Optimal partial shape similarity. Image and Vision

Computing, 23:227–236, 2005.

[17] H.-C. Liu and M. D. Srinath. Partial shape classiﬁcation using contour matching in distance transfor-

mation. PAMI, 12(11):1072–1079, 1990.

[18] F. Mokhtarian, S. Abbasi, and J. Kittler. Efﬁcient and robust retrieval by shape content through

curvature scale space. In Workshop on Image DataBases and MultiMedia Search, pages 35–42, 1996.

[19] Gonzala Navarro. A guided tour to approximate string matching. ACM Computing Surveys, 33(1):31–

88, 2001.

[20] H. Ogawa. A fuzzy relaxation technique for partial shape matching. Pattern Recognition Letters,

15(4):349–355, 1994.

[21] J.-R. Ohm and K. M¨ uller. Results of MPEG-7 Core Experiment Shape-1. Technical report, MPEG-7,

July 1999. Technical Report ISO/IEC JTC1/SC29/WG11 MPEG98/M4740.

[22] E. Ozcan and C. K. Mohan. Partial shape matching using genetic algorithms. Pattern Recognition

Letters, 18(10):987–992, 1997.

[23] A. Pentland, R. W. Picard, and S. Sclaroff. Photobook: Content-based manipulation of image

databases. International Journal of Computer Vision, 18(3):233–254, June 1996.

[24] E. Persoon and K. S. Fu. Shape discrimination using Fourier descriptors. IEEE Transactions on

Systems, Man, and Cybernetics, 7(3):170–179, 1977.

[25] R. J. Prokop and A. P. Reeves. A survey of moment-based techniques for unoccluded object represen-

tation and recognition. Computer Vision, Graphics, and Image Processing, 54(5):438–460, Setember

1992.

[26] P. Rosin. Multiscale representation and matching of curves using codons. Graphical Models and

Image Processing, 55(4):286–310, July 1993.

[27] T. Sebastian, P. Klein, and B. Kimia. On aligning curves. IEEE Transactions on Pattern Analysis and

Machine Intelligence, 25(1):116–125, 2003.

[28] K. Siddiqi, A. Shokoufandeh, S.J. Dickinson, and S.W. Zucker. Shock graphs and shape matching.

IJCV, 55(1):13–32, 1999.

[29] Mirela Tanase. Shape Deomposition and Retrieval. PhD thesis, Utrecht University, Department of

Computer Science, February 2005.

[30] Mirela Tanase and Remco C. Veltkamp. Part-based shape retrieval with relevance feedack. In Pro-

ceedings International Conference on Multimedia and Expo (ICME05), 2005.

[31] R. Veltkamp and M. Hagedoorn. State-of-the-art in shape matching. In M. Lew, editor, Principles of

Visual Information Retrieval, pages 87–119. Springer, 2001.

[32] Haim Wolfson and Isidore Rigoutsos. Geometric hashing: an overview. IEEE Computational Science

& Engineering, pages 10–21, October-December 1997.

[33] C. Zibreira and F. Pereira. A study of similarity measures for a turning angles-based shape descriptor.

In Proc. Conf. Telecommunications, Portugal, 2001.

18

Appendix A

We compared our multiple polyline to polygon matching (MPP) approach with the global CSS matching

(CSS) and with a global matching based on the turning function (GTA). Table 1 presents the query image

of the CSS matching and the query parts of our part-based matching in the ﬁrst and second column of the

table. The retrieval rate measured by the “bulls-eye” test, the percentage of true positives returned in the

ﬁrst 40 (twice the class size) matches, for the three matching techniques are indicated in the following three

columns, while the remaining columns give the percentage of true positives returned in the ﬁrst 20 (class

size) matches.

Bull’s Eye True Positives

CSS Part-base Performance in class size

Query Image Query Parts Matching Process Matching Process

CSS GTA MPP CSS GTA MPP

10% 30% 65% 10% 10% 55%

beetle-20

10% 35% 60% 5% 20% 55%

beetle-10

15% 15% 70% 15% 5% 60%

ray-3

15% 25% 50% 15% 20% 40%

ray-17

15% 40% 50% 5% 30% 35%

Table 1: A comparison of the Curvature Scale Space (CSS), the Global

Turning Angle function (GTA), and our Multiple Polyline to Polygon

(MPP) matching.

19

Bull’s Eye True Positives

CSS Part-base Performance in class size

Query Image Query Parts Matching Process Matching Process

CSS GTA MPP CSS GTA MPP

deer-15

20% 40% 45% 15% 40% 40%

deer-13

5% 20% 45% 5% 15% 35%

bird-11

20% 15% 60% 20% 15% 40%

bird-17

20% 25% 50% 15% 25% 40%

bird-9

70% 80% 95% 45% 80% 90%

carriage-18

10% 10% 70% 10% 5% 60%

horse-17

15% 25% 65% 10% 25% 40%

horse-3

Table 1: A comparison of the Curvature Scale Space (CSS), the Global

Turning Angle function (GTA), and our Multiple Polyline to Polygon

(MPP) matching.

20

Bull’s Eye True Positives

CSS Part-base Performance in class size

Query Image Query Parts Matching Process Matching Process

CSS GTA MPP CSS GTA MPP

25% 30% 50% 20% 20% 45%

horse-4

30% 30% 50% 25% 25% 40%

crown-13

20% 45% 65% 20% 35% 50%

butterﬂy-11

15% 25% 55% 10% 20% 55%

butterﬂy-4

10% 15% 50% 10% 15% 45%

dog-11

Table 1: A comparison of the Curvature Scale Space (CSS), the Global

Turning Angle function (GTA), and our Multiple Polyline to Polygon

(MPP) matching.

21

**Multiple Polyline to Polygon Matching
**

Mirela T˘ nase1 Remco C. Veltkamp1 Herman Haverkort2 a 1 Department of Information & Computing Sciences Utrecht University, The Netherlands 2 Department of Mathematics and Computing Science TU Eindhoven, The Netherlands

Abstract This paper addresses the partial shape matching problem, which helps identifying similarities even when a signiﬁcant portion of one shape is occluded, or seriously distorted. We introduce a measure for computing the similarity between multiple polylines and a polygon, that can be computed in O(km2 n2 ) time with a straightforward dynamic programming algorithm. We then present a novel fast algorithm that runs in time O(kmn log mn). Here, m denotes the number of vertices in the polygon, and n is the total number of vertices in the k polylines that are matched against the polygon. The effectiveness of the similarity measure has been demonstrated in a part-based retrieval application with known ground-truth.

1 Introduction

The motivation for multiple polyline to polygon matching is twofold. Firstly, the matching of shapes has been done mostly by comparing them as a whole [3, 14, 18, 23, 24, 25, 26, 28, 31]. This fails when a signiﬁcant part of one shape is occluded, or distorted by noise. In this paper, we address the partial shape matching problem, matching portions of two given shapes. Secondly, partial matching helps identifying similarities even when a signiﬁcant portion of one shape boundary is occluded, or seriously distorted. It could also help in identifying similarities between contours of a non-rigid object in different conﬁgurations of its moving parts, like the contours of a sitting and a walking cat. Finally, partial matching helps alleviating the problem of unreliable object segmentation from images, over or undersegmentation, giving only partially correct contours. Despite its usefulness, considerably less work has been done on partial shape matching than on global shape matching. One reason could be that partial shape matching is more involved than global shape matching, since it poses two more difﬁculties: identifying the portions of the shapes to be matched, and achieving invariance to scaling. If the shape representation is not scale invariant, shapes at different scales can be globally matched after scaling both shapes to the same length or the same area of the minimum bounding box. Such solutions work reasonably well in practice for many global similarity measures. When doing partial matching, however, it is unclear how the scaling should be done since there is no way of knowing in advance the relative magnitude of the matching portions of the two shapes. This can be seen in ﬁgure 1: any two of the three shapes in the ﬁgure should give a high score when partially matched but for each shape the scale at which this matching should be done depends on the portions of the shapes that match. Contribution Firstly, we introduce a measure for computing the similarity between multiple polylines and a polygon. The polylines could for example be pieces of an object contour incompletely extracted from an image, or could be boundary parts in a decomposition of an object contour. The measure we propose is based on the turning function representation of the polylines and the polygon. We then derive a number of non-trivial properties of the similarity measure. Secondly, based on these properties we characterize the optimal solution that leads to a straightforward O(km2 n2 )-time dynamic programming algorithm. We then present a novel O(kmn log mn)-time algorithm. Here, m denotes the number of vertices in the polygon, and n is the total number of vertices in the k polylines that are matched against the polygon. 1

for example points of high curvature [2. It is deﬁned for arbitrary non-empty bounded and closed sets A and B as the inﬁmum of the distance of the points in A to B and the points in B to A. Both methods are designed for partial matching. parts that we perceive as similar. One serious drawback of this approach is the fact that the matching is done between parts of simpliﬁed shapes at “the appropriate evolution stage” [14]. The best correspondence between the maximal convex arcs of two simpliﬁed versions of the original shapes gives the partial similarity measure between the shapes. and then looking for correspondences between the features of the two shapes. by looking for correspondences between parts of the given shapes. Thirdly. 17]. Most partial shape matching methods are based on computing local features of the contour. where m is the number of vertices in one polygon and n is the number of vertices in the other. An example of such a case is in ﬁgure 2. rotation. Geometric hashing [32] is a method that determines if there is a transformed subset of the query point set that matches a subset of a target point set. because for such subparts the sequences of local features are very similar. k-means clustering [7]. we want to retrieve those shapes in the collection that best match our query. but do not easily transform to our case of matching multiple polylines to a polygon. establish the best correspondence of parts in a decomposition of the matched shapes. Such local features-based solutions work well when the matched subparts are almost equivalent up to a transformation such as translation. Other local features-based approaches make use of genetic algorithms [22]. this works for only two single polylines. The dissimilarity measure is a function of the scale. translation and rotation. and two parts of the database shape P are denote by P1 and P2 . Also the Hausdorff distance [11] allows partial matching. where two parts of the query shape Q are denoted by Q 1 and Q2 . Considering different stages of the curve evolution (such that P1 ∪ P2 and Q1 ∪ Q2 constitute individual parts) does not solve the problem either. Given a large collection of shapes and a query consisting of a set of polylines. rotation and scaling. The set of polylines forming the query are boundary parts in a decomposition of a database shape – both this database shape and the parts in the query are selected by the user. can be done in time O(m 2 n2 ) [8]. The evaluation on the basis of a known ground-truth indicate that a part-based approach to matching improves the global matching performance for difﬁcult categories of shapes. Though P has a portion of P 1 ∪ P2 similar to a portion of Q 1 ∪ Q2 . Given two matches with the same squared error. [3] describe a metric for comparing two whole polygons that is invariant under translation. 2 Related Work Arkin et al. However. or fuzzy relaxation techniques [20]. Partially matching the turning angle function of two polylines under scaling. and can be computed in O(mn log mn) time. Latecki et al [15]. rotation or scaling. may have quite different local features (different number of curvature extrema for example). How these evolution stages are identiﬁed is not indicated in their papers. though it certainly has an effect on the quality of the matching process. However. the match involving the longer part of the polylines has a lower dissimilarity. and the shift of one polyline along the other. It is based on the L 2 -distance between the turning functions of the two polygons. This is because the method used for matching two chains normalizes the two chains to unit length and thus no shifting of 2 . no correspondence among these four parts succeeds to recognize this.A B C Figure 1: Scale invariance in partial matching is difﬁcult. we have experimented with a part-based retrieval application. by building a hash table in transformation space. Also. false negatives could appear in the retrieval process.

since it corresponds to a horizontal shifting of Θ A over a distance t. we shift the parts of one shape over the other shape and select those that give a close match. we introduce a similarity measure for matching a union of possibly disjoint parts and a whole shape. and the polygon B is placed over A in such a way that the reference point B(0) of B coincides with point A(t) at distance t along A from the reference point A(0). . 3. In our approach. Since we allow the parts to rotate independently. It is a piecewise constant function. In this section we deﬁne a similarity measure for such a polylines-to-polygon matching. the part(s) are shifted along the shape. . 3 Polylines-to-Polygon Matching We concentrate on the problem of matching an ordered set {P 1 . More formally. A rotation of A by an angle θ corresponds to a shifting of Θ A over a distance θ in the vertical direction. comes from the intended application of part based shape retrieval. The distance between two polygons A and B is deﬁned as the L 2 distance between their two turning functions ΘA and ΘB . We ﬁrst give an O(km2 n2 )-time dynamic programming algorithm for computing this similarity measure. measured from some reference point on the boundary of A. minimized with respect to the vertical and horizontal shifts of these functions (in other words. . Pk } of k polylines against a polygon P . We want to compute how close an ordered set of polylines {P 1 . P2 . . In the matching process between a (union of) part(s) and an undecomposed shape. Also notice that if instead of making correspondences between parts. this measure could capture the similarity between contours of non-rigid objects. Pk } is to being part of the boundary of P in the given order in counter-clockwise direction around P (see ﬁgure 3). . 3 . suppose A and B are two polygons with perimeter length l A and lB . . A pair (t. we also use a decomposition into parts in order to identify the matching portions. where m is the number of vertices in P and n is the total number of vertices in the k polylines. one endpoint of one chain over the other is done. Note that the fact that P is a polygon.1 Similarity between polyline and polygon The turning function Θ A of a polygon A measures the angle of the counterclockwise tangent with respect to a reference orientation as a function of the arc-length s. For this purpose. l A ) along the boundary of A corresponds to shifting Θ A horizontally over a distance t. and the optimal placement is computed based on their turning functions. . and also over all independent rotations of the parts. respectively. the polylines are rotated and shifted along the polygon P . but we do this only for one of the shapes. in such a way that the pieces of the boundary of P “covered” by the k polylines are mutually disjoint except possibly at their endpoints. and not an open polyline. . minimized with respect to rotation and choice of reference point). θ) ∈ R × R will be referred to as a placement of B over A.P2 P1 Q1 Q2 Figure 2: Partial matching based on correspondence of parts may fail to identify similarities. This similarity measure is a turning angle function-based similarity. with parts in different relative positions. the similarity between the shapes in ﬁgure 2 is reported. Moving the location of the reference point A(0) over a distance t ∈ [0. minimized over all possible shiftings of the endpoints of the parts over the shape. and B is rotated clockwise by an angle θ with respect to the reference orientation. θ) is also called a horizontal shift. The measure is based on the turning function representation of the given polylines and it is invariant under translation and rotation. The ﬁrst component t of a placement (t. with jumps corresponding to the vertices of A. In the next section. P2 . We then reﬁne this algorithm to obtain a running time of O(kmn log mn).

that is: their similarity should be zero. Pk over P (or in other words. B.lA ) min f(A. . . θ) denotes the quadratic similarity f j (t. For this reason we do not scale the polyline or the polygon prior to the matching process. f j (t. . P2 . their turning functions will no longer match perfectly. The similarity between two polygons A and B is then given by: θ∈R. Thus. We assume the polylines {P1 . t∈[0. Pk . Since P is a closed polyline. For measuring the difference between a polygon A and a polyline B the same measure can be used (for matching two polylines. But if we scale A and B to the same length. (tk . 3. or incomplete shapes. we want that a polyline B included in a polygon A to match perfectly this polygon. θ) of B over A. . our similarity measure is not scale-invariant. Θj is piecewise-constant with nj − 1 jumps. by Θ(s+l) = Θ(s)+2π. which does not contain occluded. is the square root of the sum of quadratic similarities f j . θ) between A and B for a given placement (t. θ) = 0 j (Θ(s + t) − Θj (s) + 2 θ) ds. P2 . while the second component θ is also called a vertical shift. . B. P3 } of polylines against a polygon P . lj ] → R denote the turning function of the polyline Pj of length l j .2 Similarity between multiple polylines and a polygon Let Θ : [0. j=1 4 . we achieve robustness to scaling by normalizing all shapes in the collection to the same diameter of their circumscribed circle. and of perimeter length l. Pk . . θ1 ) . Let {P1 . t. t. In a part-based retrieval application (see section 4) using a similarity measure based on the above quadratic similarity (see section 3.P P3 P P2 P1 P3 P2 P1 Figure 3: Matching an ordered set {P 1 . This is a reasonable solution for our collection. minimized over all valid horizontal and vertical shifts of their turning functions): ⎛ ⎞1/2 d(P1 . . Moreover these shapes are object contours and not synthetic shapes. θ). l For simplicity of exposition. Pk } satisfy the condition k lj ≤ l. . P2 . . The similarity measure. Notice that.2). Pk } be a set of polylines. . minimized over all valid placements of P 1 . Arkin et al. θj )⎠ . To achieve invariance under scaling. as the square of the L 2 -distance between their two turning functions Θ A and ΘB : f(A. since it corresponds to a vertical shifting of Θ A over a distance θ. for the purpose of our part-based retrieval application. . θ) = lB 0 (ΘA (s + t) − ΘB (s) + θ)2 ds. B. . . . P ). θk ) ⎝ k fj (tj . . j=1 which we denote by d(P 1 . t. the domain of Θ can be easily extended to the entire real line. If Pj is made of nj segments. . . . P ) = min valid placements (t1 . We deﬁne the quadratic similarity f(A. l] → R be the turning function of a polygon P of m vertices. . [3] propose scaling the two polygons to unit length prior to the matching. some slight adaptations would be needed). . and let Θ j : [0. .

. tk +lk ≤t1 +l ⎝ k j=1 ⎞1/2 ∗ fj (tj )⎠ . Pk . Arkin et al. the optimization ∗ problem minθ∈R fj (t.. It remains to deﬁne what the valid placements are. as a function of t. P ) between the polylines P 1 . . We also give a simpler formulation of the optimization problem in the deﬁnition of d(P 1 . . . .3 Properties of the similarity function ∗ In this section we give a few properties of f j (t). Pk are matched with points on the boundary of P in counterclockwise order around P . . . the function fj (t. that is: tj−1 ≤ tj for all 1 < j ≤ k.6. we require that the matched parts are disjoint (except possibly at their endpoints). . with period l. . . . and are independent of each other. Furthermore. . . θ) = ∂θ we get: lj 0 2(Θ(s + t) − Θj (s) + θ)ds = lj 0 2(Θ(s + t) − Θj (s))ds + 2θlj . and tk + lk ≤ t1 + l (see ﬁgure 4). with mnj breakpoints within any interval of length l. P3 and the polygon P . Pk and the polygon P is thus: ⎛ d(P1 . . . the parabolic pieces are concave. we shift the turning functions Θ 1 . θ) is a quadratic convex function of θ. .Θ3 Θ1 Θ l t1 Θ2 t2 t2 + l2 t3 2l t3 + l3 < t1 + l t1 + l1 Figure 4: To compute d(P 1 . . . θ)/∂θ = 0. . moreover. . P ) = min t1 ∈[0. . . Pk . . sharpening the constraints to t j−1 + lj−1 ≤ tj for all 1 < j ≤ k. .2l).. . P )...l). The similarity between the polylines P 1 . 5 . We can thus express the similarity between P j and P ∗ for a given positioning t j of the starting point of P j over P as: fj (tj ) = minθ∈R fj (tj . ii) it is piecewise quadratic. Pk along P . while the vertical shift must be optimal for the given horizontal shift. . P ) in sections 3. ...tk ∈[0. . Since ∂fj (t. . Pk . Pk . . We require that the starting points of P1 . The vertical shifts θ1 . . . . .k}: tj−1 +lj−1 ≤tj . . and t k ≤ t1 + l. given by the root θ j (t) of the equation ∂f j (t. . ∀j∈{2. . in an optimal placement the quadratic similarity between a particular polyline P j and P depends only on the horizontal shift t j . θj (t)). (2) Lemma 1 For a given positioning t of the starting point of P j over P . θk correspond to rotations of the polylines P 1 .5 and 3. and Θ3 horizontally and vertically over Θ. t2 . . . Pk with respect to the reference orientation. tk correspond to shiftings of the starting points of the polylines P 1 . . (1) 3. ∗ Lemma 2 The quadratic similarity fj (t) has the following properties: i) it is periodic. . This implies that for a given t. [3] have shown that for any ﬁxed t. ∗ ∗ We now consider the properties of f j (t) = fj (t. Θ2 .. . θ). The horizontal shifts t 1 . θ) has a unique solution. Therefore. as functions of t. that constitute the basis of the algorithms for computing d(P 1 . . the rotation that minimizes the l ∗ quadratic similarity between Pj and P is given by θ j (t) = − 0 j (Θ(s + t) − Θj (s))ds/lj . . .

θj (t)). For a given value of the horizontal shift t. the value of 0 j (Θ(s + t) − Θj (s))2 ds is thus given by the sum over the strips in the current conﬁguration of the width of the strip times the square of its height: lj 0 (Θ(s + t) − Θj (s))2 ds = σ∈S wσ (t)h2 . t. 0 or 1. l For a given conﬁguration of strips. the discontinuities of Θ and Θ j deﬁne a set S of rectangular strips. (3) ∗ For the rest of this proof we restrict our attention to the behaviour of f j within an interval of length l.Θj Θ Figure 5: For given horizontal and vertical shifts of Θ j over Θ. the discontinuities of Θ j and Θ form a set of rectangular strips. Each strip is deﬁned by a pair of consecutive discontinuities of the two turning functions and is bounded from below and above by these functions. ∗ ∗ ∗ ii) Since fj (t) = f(P. we have: 0 lj (Θ(s + t) − Θj (s))ds = σ∈S wσ (t)hσ . we have: ∗ fj (t) = lj 0 ∗ (Θ(s + t) − Θj (s) + θj (t))2 ds = 0 lj (Θ(s + t) − Θj (s))2 ds + + lj 0 ∗ 2(Θ(s + t) − Θj (s))θj (t)ds ∗ (θj (t))2 ds lj 0 = 0 lj (Θ(s + t) − Θj (s))2 ds + lj 0 ∗ 2θj (t) (Θ(s + t) − Θj (s))ds + ∗ lj (θj (t))2 . 6 . Proof: i) Since Θ(t + l) = 2π + Θ(t). σ Similarly. As Θj is shifted horizontally with respect to Θ. from lemma 1 we get that θ ∗ (t + l) = θ∗ (t) − 2π. Inbetween these critical events. Thus ∗ fj (t + l) = lj 0 ∗ (Θ(s + t + l) − Θj (s) + θ∗ (t) − 2π)2 ds = fj (t). the height of a strip σ is a constant h σ . while the width wσ is a linear function of t with coefﬁcient −1. θj (t))fj (t. Pj . there are at most mn j critical events within an interval of length l where the conﬁguration of the rectangular strips deﬁned by Θ and Θ j changes. Applying lemma 1 we get: ∗ fj (t) = lj 0 (Θ(s + t) − Θj (s))2 ds − lj (Θ(s 0 + t) − Θj (s))ds lj 2 . These critical events are at values of t were a discontinuity of Θ j coincides with a discontinuity of Θ.

4 Characterization of an optimal solution In this section we characterize the structure of an optimal solution to the optimization problem in equation (5). P ) = min t1 ∈[0. . ∗ In other words. and give a recursive deﬁnition of this solution. . ∗ ∗ tk ≤t1 +l0 ⎝ k j=1 ⎞1/2 f j (tj )⎠ . And thus equation (3) can be rewritten as: ∗ fj (t) = σ∈S wσ (t)h2 σ − ⎛ 1⎝ lj ⎞2 wσ (t)hσ ⎠ . the optimization problem deﬁning d(P 1 . the function f j is a piecewise quadratic function of t. f j has the i=1 ∗ same properties as fj . 7 . . . with t∗ := tj + i=1 li . . with concave parabolic pieces. We now give a simpler formulation of the optimization problem in the deﬁnition of d(P 1 . since the coefﬁcient of t 2 in equation (4) ∗ is negative. with at most mnj breakpoints within an interval of length l.. . Moreover.l). is a solution to the optimization problem in equation (1).2l). P ). . ∀j∈{2.k}: tj−1 ≤tj . . P ) becomes: ⎛ d(P1 . we deﬁne: j−1 ∗ f j (t) := fj t+ i=1 li . . that is: it is a piecewise quadratic function of t that has its local minima in at most ∗ mnj breakpoints in any interval of length l. .tk ∈[0. Pk . .∗ fj (t) t ∗ Figure 6: The quadratic similarity f j is a piecewise quadratic function of t. . . (5) where l0 := l− k li . 1 j k 3. i=1 ∗ j−1 then (t∗ .. . Notice that if (t 1 . (4) σ∈S ∗ Since the functions w σ are linear functions of t. it sufﬁces to restrict our attention to a discrete set of at most mn j points. t∗ ).. . . tk ) is a solution to the optimization problem in equation (5). In order to simplify the restrictions on t j in equation (1). t2 . Pk . ∗ Corollary 1 The local minima of the function f j are among the breakpoints between its parabolic pieces... With this simple transformation of the set of functions f j . . Obviously. ∗ ∗ Let (t1 . the parabolic pieces of f j are concave (see ﬁgure 6). . the function f j is a copy of fj . . . .. ∗ The following corollary indicates that in computing the minimum of the function f j . . This deﬁnition forms the basis of a straightforward dynamic programming solution to the problem.. .. tk ) be a solution to the optimization problem in equation (5). Pk . . but shifted to the left with j−1 li .

. (when j ∈ {g. P ) = t1 . and a ≤ b. b] can be computed recursively.. .. 2l).Lemma 3 The values of an optimal solution (t 1 . b] = D[j − 1. .. tj ∈ X xa ≤ t1 ≤ . . . For any j ∈ {1. Pg . .. h}). in which case (t1 . otherwise (t1 . ∗ ∗ Notice now. we denote: j D[j. . With the observations above. otherwise (t1 . b − 1]. . k}) or the function f gh : R → R. Proof: If t j−1 = tj and tj = tj+1 . b] + f j (xb ). ∗ In the second case. . tk ) are found in a discrete set of points X ⊂ [0. b]. a.. thus tj must give a local minimum of function f j . . 2l). a. . starting at P (x a ) and ending at P (x b + j i=1 li ).. h. . f gh (t) = i=g f i (t) + i=1 f i (t − l0 ). 2l). a. and the breakpoints of f gh are the breakpoints of the constituent functions f g . or such a breakpoint shifted left or right by l 0 . . xN −1 } be the set of critical points in [0.. f gh (t) = i=g f i (t + l0 ) + i=1 f i (t) (when j ∈ {1. .. a. ... . Since function f j has 2mnj breakpoints... ... that is: no polyline that is not in G j is glued to a polyline in G j .. We call a point in [0.. 8 . . ... k}. i=1 (7) where j ∈ {1. Thus. a.. . .. Then. f k . Pk . thus in this case D[j. . for some g and h in {1. . . . . . we have that Pk and P1 are glued together.. Note that a local minimum of f gh may not be a local minimum of any of the constituent functions f i (g ≤ i ≤ h). tj ) must be an optimal solution for D[j. . . . tj ) would not give a minimum for D[j. b] = t1 . D[j.. . Ph } (with j ∈ {g. a. . ... Regarding the value of t j we distinguish two cases: • tj = xb . thus the positioning of the ending point of j j−1 Pj−1 and of the starting point of P j coincide. . Let X = {x0 . b]. . .. . but nevertheless it will always be given by a breakpoint of one of the constituent functions. the total j number of critical points in [0.. the constraints on the choice of t 1 . This function is obviously piecewise quadratic. ≤ tj ≤ xb min f i (ti ) .. ∗ ∗ In the ﬁrst case... • tj = xb . so t j must give a local minimum of the function f gh : R → R. Pj } of j polylines to a subchain of P . (6) We now show that an optimal solution of this problem can be constructed recursively. f h . in the given optimal placement the polylines P j−1 and ∗ ∗ Pj are “glued” together. g. t j must be a breakpoint of one of the functions f g .. a. f gh (t) = i=g f i (t). a. . let Gj be the maximal set of polylines containing P j and glued together in the given optimal placement. tj ) be an optimal solution for D[j.. . which is either a breakpoint of one of the functions f 1 . . Since Gj is a maximal set of polylines ∗ ∗ ∗ ∗ glued together. k}. b ∈ {0.. . .. . . in which case (t1 .. . k}). b]. .. Pk } (with j ∈ {1.. h}) or the form {P 1 .. f h . which consists of the breakpoints of the functions f 1 . . N − 1}. . . tk − t1 ≤ l0 k h ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ min ⎝ k j=1 ⎞1/2 f j (tj )⎠ . b − 1]. tj−1 ) must be an optimal solution for D[j − 1. tk ∈ X ∀j > 1 : tj−1 ≤ tj . We now show that D[j. . a.. a. we have t i = tj for all i ∈ {g... . Observe that Gj must have either the form {P g .. a. .. . tk do not prohibit decreasing ∗ ∗ or increasing t j . thus in this case. .. For this purpose. .. b] = D[j. the constraints on the choice of t 1 .. . . . . .. . Ph . . we can argue in a similar manner that t j must give a local minimum of the function k h f gh : R → R. .. Similarly. . tk do not prohibit decreasing or increasing t g . plus two copies of each breakpoint: one shifted left by l0 and one shifted right by l 0 . k}... .. if t k = t1 + l0 . 2l) is at most 6m i=1 ni = 6mn. b]. Equation (7) describes the subproblem of matching the set {P1 . h}. a critical point. a.. or it must be such a breakpoint shifted left or right by l 0 . the optimization problem we have to solve is: ⎛ d(P1 . with concave parabolic pieces. . f k .. . that tj−1 = tj means that t∗ = t∗ + lj−1 . th ∗ h simultaneously. Let (t 1 . tj ) would not give a minimum for D[j...

b] ← min D[j − 1. . Overall. b] . let us denote by S ΘΘj the set of rectangular strips in the current conﬁguration S that are bounded left by a jump of Θ and right by a jump of Θ j (see ﬁgure 7). for j ≥ 1 ∧ a ≤ b. D[j. a. It remains to indicate how to compute the slopes of these linear pieces. . P ) = xa . b − 1] 11. respectively. . . . . Pk . a. while xb − xa ≤ l0 do 8. for all i ∈ {0. a. a. b] + f j (xb ). For all j ∈ {1. and by S ΘΘ and SΘj Θj the set of of rectangular strips that are bounded left and right by a jump of Θ and Θ j . N − 1}. and they can be sorted in O(mn log(mn)). for a ← 0 to N − 1 do 5. b − 1]. . the only strips whose width is changing are those bounded by a jump of Θ and a jump of Θj . The algorithm is an adaptation of the algorithm used in [3]. return M IN Algorithm SimpleCompute d(P1 . P ) Lemma 4 The running time of algorithm SimpleCompute d(P 1 . Proof: Step 1 of this algorithm requires ﬁrst identifying all values of t such that a vertex of a turning function Θ j is aligned with a vertex of Θ. . a − 1] ← ∞ 6. xb −xa ≤l0 (8) min D[k. xN−1 }. 3.5 A straightforward algorithm Equations (8) and (9) lead to a straightforward dynamic programming algorithm for computing the similarity measure d(P1 . . M IN ← ∞ 4. b] = min D[j − 1. . b] ← 0 9. k} and all i ∈ {0. Pk . . . . only three strips in S change their type. Pk . a. where the boundary cases are D[0. N −1} takes O(mn). . the identiﬁcation of the set X of critical points takes O( k mnj ) = O(mn). D[0. Compute the set of critical points X = {x0 . Notice also that at each breakpoint of f j . and we thus indicate only its main ideas. j=1 We now show that. We then have to shift these values left by j−1 li and make i=1 copies of them shifted left and right by l 0 . a − 1] has no solution. . .We can now conclude that : D[j. (9) 3. . For a ﬁxed vertex v of a turning function Θ j . b − 1] . for j ← 1 to k do 10. Notice that when Θ j is horizontally shifted over Θ such that the conﬁguration S of the rectangular strips does not change. a. As equation (4) from the proof of lemma 2 indicates. evaluate f j (xi ). A solution of the optimization problem (6) is then given by d(P1 . When shifting Θ j such that the conﬁguration S does not change. the widths of the strips in S Θj Θj and SΘΘ do not change. M IN ← min(D[k. b← b+1 12. . by S Θj Θ the set of rectangular strips that are bounded left by a jump of Θ j and right by a jump of Θ. a. . with strips divided in the above 9 . b] + f j (xb ). . a. . . for a particular value of j. b] = 0 and D[j. P ) is O(km2 n2 ). and sort them. b←a 7. P ): 1. a. More precisely. a. D[j. . Thus the slope of each linear piece of the sum functions in equation (4) is determined only by the rectangular strips in S ΘΘj and SΘj Θ . M IN ). √ 13. . D[j. the results for all n j vertices of Θj can thus be collected in O(mn j ) time. Thus these functions can be completely determined by evaluating an initial value and the slope of each linear piece. evaluating f j (xi ).xb ∈X. . for j ← 1 to k do D[j. . Pk . and the algorithm uses O(km2 n2 ) storage. 2. . . the function f j (t) is the difference of two piecewise linear functions with the same set of breakpoints. The evaluation of the function f j thus starts with computing an initial conﬁguration of strips. all m values of t such that v is aligned with a vertex of Θ can be identiﬁed in O(m) time. . while the width of those in S ΘΘj increases and of those in S Θj Θ decreases. . a. . a. .

. ta−1 . i. and let i be the largest such i. N − 1}. .. i i But we also have ta−1 ≤ ta −1 ≤ ta ≤ ta < ta−1 ≤ ta−1 ... b] + f j (xb ). From the fact that the 1 j i i i i lexicographically smallest optimal solution (t a . j ∈ {1... there is a critical point xz . This contradicts the conclusion of the previous i 10 . σ3 ∈ SΘj Θj . let (t a .. ta−1 . 0 ≤ z ≤ b.. 1 i i i i −1 i i +1 i −1 ta . line 4 to 12 dominate the running time of algorithm SimpleCompute. Pk . i i By our choice of i and i . . j}. b]. and σ5 ∈ SΘΘ . . . and any critical point x b . b}.. and ii) D[j...σ3 σ2 Θj Θ σ1 σ5 σ4 σ6 Figure 7: The four types of strips formed by the discontinuities of two turning functions Θ j and Θ: σ1 .. σ6 ∈ SΘΘj ... Thus we can evaluate the function f j at all critical points in x in O(N ) = O(mn). which is O(kN 2 ) = O(km2 n2 ). so i i=i i i f i (ta ) > i i i=i f i (ta−1 ). The updates of the conﬁguration and the function values computation take constant time per critical point. ta +1 ... . b] = D[j.6 A fast algorithm The above time bound on the computation of the similarity measure d(P 1 . a. We ﬁrst prove this observation. Thus. a. . a. σ2 . and thus the step 2 of the algorithm takes O(kmn) time. a − 1 and b.. . and this is O(km 2 n2 ). we must conclude 1 j that it is suboptimal. ta−1 .. σ4 ∈ SΘj Θ . which is at most N times for the loop in line 4. b] = D[j − 1. . P ) can be improved to O(kmn log mn).. .. a. for a total of O(kN 2 ) = O(km2 n2 ) times. assume that there is an a (0 < a ≤ b) and an i (1 ≤ i ≤ j) such that ta < ta−1 . From the fact that this solution is lexicoi i i +1 i graphically smaller than the lexicographically smallest optimal solution (t a−1 . such that: i) D[j.. . Proof: For ﬁxed values of j and b.. the value t a can i only increase as well. It follows i i i i i i +1 that (ta .. . . we have ta −1 ≤ ta < ta−1 ≤ ta−1 ≤ ta−1 ≤ ta +1 . for any i ∈ {1. . let i be the smallest such i. We then go through the set X of critical events in order. For the sake of contradiction. . 3. for all a ∈ {0. for all a ∈ {z.e. 0 j+1 The proof of i) and ii) is based on the following observation: as a increases from 0 to b. The operations in the rest of the algorithm all take constant time each... ta ) is a valid solution for j. ta ) denote the lexicographically smallest solution among 1 j the optimal solutions for the subproblem D[j. times N for the loop on line 6. ta . mentioned four groups. .. deﬁne t a = xa and ta = xb . Their total time is determined by the number of times line 10 is executed. times k for the loop on line 8. or remain the same. An initial value f j (0) can be computed from this conﬁguration in O(m + n j ). ta−1 ). we must conclude that it is at least as 1 j good. ta ) is different. ta−1 . . z − 1}. and when necessary we update the conﬁguration of strips (and thus the sets SΘΘj and SΘj Θ ) and also compute the value of the function f j . . .. ta−1 ) is a valid solution for j. The amount of storage used by the algorithm is dominated by the amount of space needed to store the matrix M .. . a and b .... b − 1]. i=i f i (ta ) ≤ i=i f i (ta−1 ). . k}. . based on the slopes computed from S ΘΘj and SΘj Θ .. . . . For such an a. . The reﬁnement of the dynamic programming algorithm is based on the following property of equation (8): Lemma 5 For any polyline P j . ta −1 .. a. For ease of notation. from which it follows that (t a−1 . b ∈ {0. . .

and log N height. ν ν ν ν that is: a− = 0 and a+ = N − 1. For ν ν ν ν ν any index a of a critical point x a . Each node ν of T j. b] can be constructed from D[j − 1. and thus... −1] ← INFINITY 9. MIN ← ∞ 6. for j ← 1 to k do 8. Each node ν with a − = a+ is a leaf of the tree. Moreover. j j j j For given j and b. b − 1] fast. .. Pk . .. and sort them 2. Construct D[j. The key to success will be to represent these functions in such a way that they can be evaluated fast and D[j.b used for storing the function D[j. N − 1}. xN−1 }.6. .. for all 0 < a ≤ b and 1 ≤ i ≤ j. b](a) is the sum of the weights on the path from the tree root to the leaf λ a . The root ρ is associated with the full domain. aν ] and ν ν ν [aν + 1. . b] (while adding f j (xb )) from this value onwards. Note that so far. b − 1] 13.. b] = D[j − 1. and split values in their corresponding nodes. a+ ]. b] = D[j. A trivial example is to set w ν equal to D[j. b − 1] up to some value z. b}. so ta = xb .. . b] + f j (xb ). b].paragraph. Each node ν with a − < a+ is an internal node that has a split ρ ρ ν ν value aν = (a− + a+ )/2 associated with it.b has the following property: D[j. by induction. and thus. P ) The running time of this algorithm depends on how the functions D[j. It follows that t a ≥ ta−1 . Its left and right children are associated with [a − . we present the summarized reﬁned algorithm: 1. 3. i i i i for all 0 ≤ a ≤ a. b](a) = D[j... D[j.. We can now prove the lemma item i) and ii). . b] and D[j. b] from D[j − 1. MIN ← min (val. Before describing the data structure used for this purpose. Compute the set of critical points X = {x0 . with a ν = a− = a+ . a. that t a ≤ ta . b − 1] for all a ∈ {0. 11 . INFINITY ← a function that always evaluates to ∞ 5.. This property allows us to improve the time bound of the dynamic programming algorithm in the previous section. . Such a representation of a function D[j. . a. a ← 0 7. b] are represented. b] and D[j. D[j. b] by means of balanced binary trees. we have xb = tz ≤ ta ≤ tb = xb . a+ ]. respectively. we will compute arrays of functions D[j. . N − 1} → R. a. . ZERO ← a function that always evaluates to zero 4. we have t a = xb for all j j a < z. With each node ν we also store a weight w ν . for b ← 0 to N − 1 do 10. b] is not unique.. In order to make especially steps 12 and 15 of the above algorithm efﬁcient. [4] used an idea similar in spirit. b] : {0. b]. k} and all b ∈ {0. we will denote the leaf ν that has a ν = a by λa . b} for which t z = xb j (since tb must be xb . b] can be obtained from D[j. and to set it to zero for all internal nodes. with D[j.. all trees have the same associated intervals. we represent the functions D[j. the tree looks exactly the same for each function D[j. . Lemma 5 expresses the fact that the values of function D[j. .b is associated with an interval [a− . Moreover. Let z be the smallest value in {0. a ← a+1 15. For all j ∈ {1.1 An efﬁcient representation for function D[j. we consider the function D[j. with 0 ≤ a− ≤ a+ ≤ N − 1.. val ← evaluation of D[k. . Instead of computing arrays of scalars D[j. D[0. Since. we have D[j.. Asano et al. such a value always exists). b] We now describe the tree T j. such that Tj. b]. a. z − 1}. for a ∈ {z. return MIN Algorithm FastCompute d(P1 . a. It will become clear below why we also allow representations with non-zero weights on internal nodes. b](aν ) for all leaves ν. b](a) 16. evaluate f j (xb ) 3. by our choice of z. and from D[j − 1. while xa < xb − l0 do 14. b]: they are balanced binary trees with N leaves. a. b]. b] ← ZERO 11. MIN ) √ 17. for j ← 1 to k do 12.

b](a) takes O(log N ) time. b − 1].b−1 . Also the representation of a function that always evaluates to ∞ can be constructed in O(N ) time. by identifying.b must equal those in T j−1.b−1 Tj−1. The representation of a function that always evaluates to ∞ takes also O(N ) time since it involves only the building a balanced binary tree on N leaves and setting the weights of all leaves to ∞. b]. We ﬁnd the sequences of left and right turns that lead from the root of the trees down to the leaf λ z . that is: the rightmost descendant of the left child of ν.e.b with these properties as follows.b is contructed from T j.b λ0 λz λN −1 λ0 λz λN −1 Figure 8: The tree T j. (ii) Once Tj. There is no need to copy ν: we just add a pointer to it. the total weights to the root in Tj. Lemma 5 tells us that for each leaf left of λ z . (iii) To construct T j. a representation Tj. evaluating function D[j. Lemma 6 The data structure Tj.b .b and take the weight of ρ from there.b and Tj. b] can be operated on such that: (i) The representation of a zero-function (i.b of D[j.b−1 .b efﬁciently.b representing the function D[j.Tj. (ii) Given Tj. Note that the sequences of left and right turns are the same in the trees T j. and adopting the subtrees to the left of the path from T j. and Tj−1.b−1 . Tj. we store with each node ν a value m ν which is the sum of the weights on the path from the left child of ν to the leaf λ aν . i. (iii) Given the representations T j−1. the weights stored allow us to evaluate D[j.b−1 . now adding f j (xz ). at each node node along this path whether the path continues into the right or left subtree of the current node. Proof: (i) Constructing the representation of a zero-function involves nothing more than building a balanced binary tree on N leaves and setting all weights to zero. and continue the construction process in that subtree. based on the stored weights. b]. we take the following approach.b by constructing a root ρ. Every time we go into the left branch.b−1 and Tj−1. the total weight on the path to the root in T j.b .b . b] has been constructed. and the subtrees to the right of the path from T j−1. we will show below that we are able to construct the path from the root of the tree to the leaf λ z corresponding to z. We construct the tree T j. plus f j (xz ).b . and of internal nodes to zero. b] can be computed in O(log N ) time. b](a) for any a in O(log N ) time by summing up the weights of the nodes on the path from the root to the leaf λ a that corresponds to a.b by creating new nodes along the path from the root to the leaf λ z . only the weights on the path differ.b of D[j.b for the representation of function D[j. b] and D[j. a function that always evaluates to zero) can be constructed in O(N ) time. We start building Tj. Then we make a new root for the other subtree of the root ρ. we set the weight of ρ equal to the weight of the root of Tj.b−1 . we adopt the right child 12 .b−1 of the functions D[j − 1.b ⇓ λ0 λz λN −1 Tj. If the path to λ z goes into the left subtree. thus at once making all descendants of ν in T j. the one that contains λ z .e. Furthermore.b−1 into descendants of the corresponding node in the tree T j.b must be the same as the total weight on the corresponding path in T j. At λz itself and right of λ z . we adopt the right child from T j−1. Though we do not compute z explicitly.b−1 and Tj−1. where z is deﬁned as in lemma 5. If the path to λ z goes into the right subtree. respectively. we adopt as left child of ρ the corresponding left child ν of the root from T j. Furthermore. This concludes the description of the data structure used for storing a function D[j. This can be done in O(N ) time.b under construction.b from Tj.

b] gives the best alignment of the ﬁrst j polylines ending at a critical point within the interval [x 0 . Thus. Global string matching under a general edit distance error model can be done by dynamic programming in O(kN ) time. where D[j. the inﬁnity-function IN F IN IT Y can be constructed in O(N ) time (line 4). The accumulated weights on the path down from the root. and every time we go into the right branch. Lemma 6 also insures that constructing D[j. because an optimal D[j. Pk . We note that the problem resembles a general edit distance type approximate string matching problem [19]. together with the stored weights for the paths down to left childrens’ rightmost descendants. without copying the structure of the complete tree. Notice that no node is ever edited after it has been constructed: no tree is ever updated. This is also the case for a different formulation of the dynamic programming problem. This will tell us if λz is to be found in the left or in the right subtree of ν. b] and D[j. This trick however does not apply here due to the condition x b − xa ≤ l0 . . also allow us to decide in constant time which is better: D[j. . . b] gives the cost of the best alignment of the ﬁrst j polylines ending at critical point x b (over all possible starting points x a satisfying the condition x b − xa ≤ l0 ).5 and can be executed in O(kmn + mn log n) time.from Tj−1. we can set the weight of each newly constructed node ν correctly in constant time per node. So the total storage required by the algorithm is O(kmn log(mn)). the total running time of the above algorithm will be dominated by O(kN ) executions of line 12. .b−1 (see ﬁgure 8). Indeed. an assignment as in lines 8 or 10 of the algorithm can safely be done by just copying a reference in O(1) time. P ) with the data structure described above. by the above trick to convert global matching into partial matching we would have to ﬁll in a table D of O(kN ) elements. such that the alignment of the ﬁrst j − 1 polylines is sub-optimal for D[j − 1. and thus we need O(kmn log(mn)) for all the trees computed in step 12. we increase wν by f j (xz ). xb ]. Notice that any of the functions constructed in step 12 requires only storing O(log(N )) = O(log(mn)) new nodes and pointers to nodes in previously computed trees. b] might be given by an alignment of the ﬁrst j polylines. we set its weight w ν so that the total weight of ν and its ancestors equals the total weight of the corresponding nodes in the tree from which we adopt ν’s child — if the subtree adopted comes from Tj−1. The complete construction process only takes O(1) time for each node on the path from ρ to λ z . 4 A Part-based Retrieval Application In this section we describe a part-based shape retrieval application. . we have to store the ZERO and the IN F IN IT Y function. taking in total O(kN log N ) = O(kmn log(mn)) time.b . . All these require O(kN ) = O(kmn) storage. Proof: We use algorithm Fastcompute d(P 1 . P ) between k polylines {P1 . . The retrieval problem we are considering is the following: given a large collection of polygonal shapes. where xc is the placement of the polyline j − 1. This table however cannot be ﬁlled efﬁciently in a column-wise manner.b is constructed in time O(log N ). b](aν ) + f j (xz ). By keeping track of the accumulated weights on the path down from the root in all the trees. b] from D[j − 1. in which each entry D[j. . we have that the zero-function ZERO can be constructed in O(N ) time (line 3). we adopt the left child from T j. . b](a) (line 15) takes O(log N ) time. where k and N represent the lengths of the two strings. . so that T j. Similarly. b − 1] (line 12) takes O(log N ) time.b . Apart from the function values of f j computed in step 2. b − 1](a ν ) or D[j − 1. . Step 1 and 2 of the algorithm are the same as in the algorithm from section 3. Theorem 1 The similarity d(P1 . Since the trees are perfectly balanced. Pj } with n vertices in total. From lemma 6. and a query consisting of a set of disjoint 13 . and that the evaluation of D[k. and a new tree is always constructed by making O(log N ) new nodes and for the rest by referring to nodes from existing trees as they are. can be computed in O(kmn log(mn)) time using O(kmn log(mn)) storage. For every constructed node ν. c]. this path has only O(log N ) nodes. The same time complexity can be achieved for partial string matching through a standard “assign ﬁrst line to zero” trick [19]. . Pk . Therefore. and a polygon P with m vertices.

such as those labelled “horse”. Some examples of images in this collection are given in ﬁgure 9. contains shapes at different scales. Figure 10 depicts two part-based queries on the same shape (belonging to the “beetle” class) with very different results. we then used the Douglas-Peuker [9] polygon approximation algorithm. The shape collection used by our retrieval application comes from the MPEG-7 shape silhouette database. Querying with such parts have shown to yield shapes from all these classes. Speciﬁcally.Figure 9: Examples of images from Core Experiment “CE-Shape1”. For matching the query against a database shape we use the similarity measure described in section 3. The matching of a query consisting of k polylines to an arbitrary database contour is based on the similarity measure described in section 3. 4. we want to retrieve those shapes in the collection that best match the query. instead of the bounding box. “dog”. Given the nature of our collection (each image in “CE-Shape-1” is one view of a complete object). this is a reasonable approach. In this contour. This test set consists of 1400 images: 70 shape classes.1. these parts are the result of applying the boundary decomposition framework introduced in [29]. This similarity measure is not scale invariant. The parts in the query are selected by the user from an automatically computed decomposition of a given shape. is related to the fact that a class in the collection may contain images of an object at different rotational angles.1 Experimental Results The selection of parts comprising the query has a big inﬂuence on the results of a part-based retrieval process. part B. the query consists of ﬁve chains. The instantiation of the boundary decomposition framework described there uses the medial axis. we used the Core Experiment “CE-Shape-1” part B [13]. instead of querying with complete shapes. Our shape collection. as described in [29]. where shapes in each row belong to the same class. In order to achieve some robustness to scaling. however. a test set devised by the MPEG-7 group to measure the performance of similarity-based retrieval for shape descriptors. “deer”. such as limbs. for example. each pixel corresponds to a vertex. in the application we discuss below. The outer closed contour of the object in each image was extracted. Each resulting simpliﬁed contour was then decomposed. we scaled all shapes in the collection to the same diameter of the circumscribed disk. polylines. The running time for a single query on the MPEG-7 test set of 1400 images is typically about one second on a 2 GHz PC. In order to decrease the number of vertices. In the ﬁrst example. For the medial axis computation we used the Abstract Voronoi Diagram LEDA package [1]. each representing a 14 . whose shapes have similar parts. More speciﬁcally. The reason we opted for a normalization based on the circumscribed disk. and the matching process evaluates how closely these parts resemble disjoint pieces of a shape in the collection. as we noticed in section 3. The query represents a set of disjoint boundary parts of a single shape. Our collection for example includes classes. Images in the same row belong to the same class. Thus. with 20 images per class. we make the query process more ﬂexible by allowing the user to search only for certain parts of a given shape. “cattle”.

The shape descriptor selected by MPEG-7 to represent a closed contour of a 2D object or region in an 15 . decomposed countour with ﬁve query parts in solid line style. Each visual descriptor incorporated in the MPEG-7 Standard was selected from a few competing proposals. leg of the beetle. each representing a leg of a beetle contour. proposed for standardization within MPEG-7. (Lower half) The same original image and decomposed contour. The performance of each shape descriptor was measured using the so-called “bulls-eye” test: each image is used as a query. two of the other). The MPEG group initiated the MPEG-7 Visual Standard in order to specify standard content-based descriptors that allow to measure similarity in images or video based on visual criteria. and the top ten retrieved shapes when quering with two polylines. a spatial arrangement of these parts is searched in the database. is reported in [21]. and this gives the query more power to discriminate between beetle shapes and other shapes containing leg-like parts. The Core Experiment “CEShape-1” was devised to measure the performance of 2D shape descriptors. In the second example. but they were concatenated. and the number of retrieved images belonging to the same class was counted in the top 40 (twice the class size) matches. one containing three legs and the other two legs. based on an evaluation of their performance in a series of tests called Core Experiments. the same parts are contained in the query.Figure 10: (Upper half) Original image. so that they form two chains (three legs on one side. only three come from the same class. and the top ten retrieved shapes when quering with these ﬁve polylines. By concatenating the leg parts in the second query. The performance of several shape descriptors. Among the ﬁrst ten retrieved results. All ﬁrst ten retrieved results are this time from the same class.

The reported similarity-based retrieval performance of this method is 75. while in [10] a local descriptor of an image point is proposed that is determined by the spatial arrangement of other image points around that point. This has to be done interactively by the user. is therefore untractable. A prerequisite of such a performance of the part-based matching is the selection of query parts that capture relevant and speciﬁc characteristics of the shape. of 78. respectively. of the contours in these classes. by relating the positions of the maxima of the corresponding CSSs. The query. for example the class “beetle”. These experimental results indicate that for those classes with a low average performance of the CSS matching. Belongie et al. Firstly. is signiﬁcantly more effective in such cases. and a turning function-based matching). For the “ray” and “deer” classes. since that would re quire 1400 interactive queries. respectively) are caused by the different shape and size of the tails and antlers. σ) plane. and our method a score of 70%. proposed for MPEG-7. Latecki at al. we have developed an alternative approach to part-based shape retrieval. [6] reported a retrieval performance of 76. For the shape called “horse-17”. people have reported better performances on the same set of data for other shape descriptors. the CSS matching [18] gives good results. in which the user is relieved of the responsibility of specifying shape parts to be searched for in the database. The average performance rate of CSS matching for this class is only 36%.19% for a shape descriptor introduced in [5].) 5 Concluding Remarks In this paper we addressed the partial shape matching problem. A prerequisite of such a performance of the part-based matching is the selection of query parts that capture relevant and speciﬁc characteristics of the shape. The matching process has two steps. The matching in [27] is based on aligning the two shapes in a deformation-based approach. Since the user does not usually have any knowledge of the databased content before the retrieval process. For example. and our multiple polyline to polygon matching method a score of 95%. the different relative lengths of the antennas/legs. The CSS of a closed planar curve is computed by repeatedly convolving the contour with a Gaussian function. An overall performance score for the matching process. In the years following the selection of this descriptor for the MPEG-7 Visual Standard. [27] and Grigorescu and Petkov [10]. with a retrieval rate over 90%.67%. We compared our part-based approach to shape matching with two global shape matching technique (the CSS matching.51% for their descriptor based on shape contexts. our part-based approach with the global CSS matching and with a global matching based on the turning function. (See many examples in the appendix. Even better performances. for the shape called “carriage-18”. respectively. with a proper selection of parts. see [30]. is reported to have a similarity-based retrieval performance of 54. We therefore compared. and representing the curvature zero-crossing points of the resulting curve in the (s. A global turning function-based shape descriptor [12]. the global turning function method a score of 80%. the eccentricity and circularity of the contours are used as a ﬁlter.44% [21]. This has to be done interactively by the user. The similarity measure was tested in a shape-based image retrieval application. For classes in “CE-Shape-1” with a low variance among their shapes. is a polygon. and in order to select among the large number of possible searches in the database with 16 . The shape similarity measure between two shapes is computed.38%. our approach consistently performs better. our approach consistently performs better. where s is the arclength. to reduce the amount of computation. and the different shapes of the body pose problems for global retrieval. on a number of individual queries.image is based on the Curvature Scale Space (CSS) representation [18]. We introduced a measure for computing the similarity between a set of pieces of one shape and another shape. Such a measure is usefull in overcoming problems due to occlusion or unreliable object segmentation from images. A later variation [33] increases the performance rate up to 65. in this case. the bad results (an average performance rate of the CSS matching of 26% and 33%.14%. the CSS method has a bull’s eye score of 70%. our dissimilarity measure provides a powerful complementary shape matching method. as measured by the “bulls-eye” test. mention in [16] a retrieval performance of 83. Concludingly.18% and 78. like in [21]. A part-based matching. were reported by Sebastian et al. Experimental results indicate that for those classes with a low average performance of the CSS matching. both the CSS method and the global turning function method have a bull’s eye score of 10%. in the second step. For difﬁcult classes however. We tested the performance of our part-based shape matching.

MPEG-7. Shape similarity measure based on correspondence of visual parts. in the form of marking relevant results from those retrieved by the system. Lak¨ mper. [10] C.html. Huttenlocher. Acknowledgements This research was partially supported by the Dutch Science Foundation (NWO) under grant 612. Katoh. 2005. Distance sets for shape ﬁlters and shape recognition. pages 34–51. 30(1):59– 77. [4] T. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. [11] D. J.061. Everett. Guibas. L. has the purpose of allowing the system to guess what the user is looking for in the database. Conf. Theory and Applications. and W. pages 777–786. and by the FP6 IST Network of Excellence 506766 Aim@Shape. G. http://www. Latecki and R. 24:509–522. Description of core experiments for MPEG-7 motion/shape. Ansari and E. Shape matching and object recognition using shape contexts. Mitchell. J. 1991. [7] B.J. 15:850–863. [9] D. PAMI. [15] L. 1997. Haverkort. PhD thesis. Int. 1987. Technical report. SODA. An efﬁciently computable metric for comparing polygonal shapes. K. IBM.006. Kedem. 2002. Malik. J. Klanderman. R. the Abstract Voronoi Diagram LEDA Extension Package. Shape Based Digital Image Similarity Retrieval. References [1] AVD LEP. Comparing images using the hausdorff distance. Shape similarity and visual parts. [14] L. Bober. March 1999. Arkin. 12:470–483. His interaction. J. 13:209–215. 1973. Pattern Recognition. In Proc. Peucker. Cheong. [5] E. Huttenlocher. Delp. mpg. Technical Report ISO/IEC JTC 1/SC 29/WG 11 MPEG99/N2690. 2003. [3] E. Belongie. and J. Petkov. a PAMI. PAMI. Attalla. 22:1185–1190.parts of the query. 2003. Bhanu and J. Cohen and Leonidas J. H. de Berg. 2004.H. TR ISO/IEC JTC 1/SC 29/WG 11/ P162. Jeannin and M. [6] S. [12] IBM. Wolter. 17 . [8] Scott D. 1993. Technical report. Computational Geometry. Wolff. Partial matching of planar polylines under similarity transformations. PAMI. Rucklidge. M. The Canadian Cartographer. PAMI. [2] N. 12(10):1274–1286. Wayne State University.K. Douglas and T. Thanks to Veli M¨ kinen for the partial a string matching reference. Ming. In Proc. [13] S. the user interacts with the system in the retrieval process. 1990. D. and A. Grigorescu and N. Lak¨ mper. IEEE Transactions on Image Processing.de/LEDA/friends/avd. Technical summary of turning angle shape descriptors proposed by IBM. N. a Discrete Geometry for Computer Imagery (DGCI). Puzicha. and J. Asano. and D. H. 2000. O.mpi-sb. Recognition of occluded objects: A cluster-structure algorithm. Optimal spanners for axis-aligned rectangles. Chew. Partial shape recognition: A landmark-based approach. C. Seoul. Latecki. 1999. 20(2):199–211. 10(2):112–122.

[23] A. 2005. pages 10–21. [24] E. [32] Haim Wolfson and Isidore Rigoutsos. A fuzzy relaxation technique for partial shape matching. In Proceedings International Conference on Multimedia and Expo (ICME05). Part-based shape retrieval with relevance feedack. Shokoufandeh.[16] L. Hagedoorn. 1996. June 1996. P. [27] T. July 1993. In Workshop on Image DataBases and MultiMedia Search. Prokop and A. Rosin. R. 7(3):170–179. Siddiqi. 1990. 15(4):349–355. [30] Mirela Tanase and Remco C. Shock graphs and shape matching. R. u July 1999. Dickinson. Image and Vision a Computing. [25] R. Zibreira and F. [33] C. [17] H. October-December 1997. International Journal of Computer Vision. ACM Computing Surveys. Abbasi. Multiscale representation and matching of curves using codons. 55(1):13–32. Pattern Recognition Letters. Principles of Visual Information Retrieval. Sclaroff. Utrecht University. J. Pereira. Partial shape matching using genetic algorithms. Kimia. 18(10):987–992.-C. Technical Report ISO/IEC JTC1/SC29/WG11 MPEG98/M4740. Technical report. [26] P. Efﬁcient and robust retrieval by shape content through curvature scale space. Veltkamp and M.J. Mokhtarian. Pentland. Ohm and K. A guided tour to approximate string matching. and Cybernetics. Ogawa. W. Results of MPEG-7 Core Experiment Shape-1. Telecommunications. [28] K. Conf. editor. and S. Photobook: Content-based manipulation of image databases. 1997. [22] E. A survey of moment-based techniques for unoccluded object representation and recognition. MPEG-7. [20] H. IEEE Computational Science & Engineering. and J. 12(11):1072–1079. Lak¨ mper. Zucker. PhD thesis. [19] Gonzala Navarro. Geometric hashing: an overview. [29] Mirela Tanase. 1999. 33(1):31– 88. Optimal partial shape similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence. 18(3):233–254. Partial shape classiﬁcation using contour matching in distance transformation. February 2005. A study of similarity measures for a turning angles-based shape descriptor. and B. IEEE Transactions on Systems. 2005. 25(1):116–125. 54(5):438–460. P. Graphical Models and Image Processing. Shape Deomposition and Retrieval. Springer. 1994. 2001. Mohan. Persoon and K. K. and Image Processing. A.-R. Department of Computer Science. 2003. 2001. S. Wolter. and D. J. Graphics. M¨ ller. D. Portugal. [31] R. Sebastian. Reeves. Ozcan and C. 1977. Kittler. PAMI. IJCV. 55(4):286–310. Man. Lew. On aligning curves. S. Veltkamp. Liu and M. Latecki. 2001. Shape discrimination using Fourier descriptors. pages 87–119. S. Picard. Srinath. [21] J. State-of-the-art in shape matching. Setember 1992. [18] F.W. Pattern Recognition Letters. and S. 18 . 23:227–236. Computer Vision. pages 35–42. Fu. In M. Klein. In Proc.

and our Multiple Polyline to Polygon (MPP) matching. the Global Turning Angle function (GTA). 19 . The retrieval rate measured by the “bulls-eye” test. the percentage of true positives returned in the ﬁrst 40 (twice the class size) matches.Appendix A We compared our multiple polyline to polygon matching (MPP) approach with the global CSS matching (CSS) and with a global matching based on the turning function (GTA). Table 1 presents the query image of the CSS matching and the query parts of our part-based matching in the ﬁrst and second column of the table. Bull’s Eye Performance Matching Process CSS GTA MPP True Positives in class size Matching Process CSS GTA MPP CSS Query Image Part-base Query Parts 10% 30% 65% 10% 10% 55% beetle-20 10% 35% 60% 5% 20% 55% beetle-10 15% 15% 70% 15% 5% 60% ray-3 15% 25% 50% 15% 20% 40% ray-17 15% 40% 50% 5% 30% 35% Table 1: A comparison of the Curvature Scale Space (CSS). while the remaining columns give the percentage of true positives returned in the ﬁrst 20 (class size) matches. for the three matching techniques are indicated in the following three columns.

CSS Query Image deer-15 Part-base Query Parts Bull’s Eye Performance Matching Process CSS GTA MPP True Positives in class size Matching Process CSS GTA MPP 20% 40% 45% 15% 40% 40% deer-13 5% 20% 45% 5% 15% 35% bird-11 20% 15% 60% 20% 15% 40% bird-17 20% 25% 50% 15% 25% 40% bird-9 70% 80% 95% 45% 80% 90% carriage-18 10% 10% 70% 10% 5% 60% horse-17 15% 25% 65% 10% 25% 40% horse-3 Table 1: A comparison of the Curvature Scale Space (CSS). the Global Turning Angle function (GTA). and our Multiple Polyline to Polygon (MPP) matching. 20 .

21 . and our Multiple Polyline to Polygon (MPP) matching.CSS Query Image Part-base Query Parts Bull’s Eye Performance Matching Process CSS GTA MPP True Positives in class size Matching Process CSS GTA MPP 25% 30% 50% 20% 20% 45% horse-4 30% 30% 50% 25% 25% 40% crown-13 20% 45% 65% 20% 35% 50% butterﬂy-11 15% 25% 55% 10% 20% 55% butterﬂy-4 10% 15% 50% 10% 15% 45% dog-11 Table 1: A comparison of the Curvature Scale Space (CSS). the Global Turning Angle function (GTA).