histogram based “plug-in” estimation techniques [1], this method has a faster convergence rate when f is a smooth function [2]. In the remainder of the paper, we restrict attention to graphs G that are spanning trees. In this case, the MGW function is equal to the EMST weight.
3. THE EMST REGISTRATION METRIC
Inspired by information theoretic registration techniques [8] and result (1), entropic graphs have recently been usefully employed for multi-modal registration [3], [7] and [4]. The basic framework of this approach is as follows. For images I_i(x,y), i = 1, 2, and d a positive even integer, define

f_i(x,y) = f(I_i(x',y') : (x',y') ∈ N(x,y)) ∈ R^{d/2},

where N(x,y) ⊂ R^2 is called the “neighborhood” of (x,y) and f is the “feature” function. A trivial but useful example is f_i = I_i(x,y), where f is the identity function and N(x,y) = (x,y). In this case pixel intensity values are the “features”. Information theoretic registration methods are based on treating samples of the joint signal (f_1, f_2) ∈ R^d as realizations of a random variable. Intuition suggests that samples from f_1 and f_2 should become “dependent” as the two images are correctly aligned. This might be measured, for example, by mutual information or joint entropy or, based on (1), by an entropic spanning graph [4], [7]. In particular, for a fixed feature function f and constant α, we define the EMST registration function to be the EMST weight W_γ(F_n), where F_n is a set of n samples of (f_1, f_2). Research indicates that using higher dimensional features, i.e. not only pixel intensity values, makes the approach more robust and accurate by incorporating spatial information. In contrast to the “plug-in” entropy estimators, the EMST registration function can be employed when high dimensional features are used [7]. To date, the two major disadvantages of using the EMST metric have been the computational bottleneck of computing the metric and the lack of a clear way to obtain gradient information concurrently from that computation. We address the second of these issues below.
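To make the registration function concrete, the following minimal Python sketch computes W_γ(F_n) for γ = 1 with the trivial identity feature f_i = I_i(x,y) described above. The function names, and the use of SciPy's minimum spanning tree routine, are our own illustrative choices, not the implementation used in the paper:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def emst_weight(samples: np.ndarray, gamma: float = 1.0) -> float:
    """EMST weight W_gamma: sum of edge lengths^gamma over the Euclidean
    MST of the n x d sample array.  Note: csgraph treats zero entries as
    absent edges, so samples are assumed to be distinct."""
    dists = squareform(pdist(samples)) ** gamma   # dense pairwise distances
    return minimum_spanning_tree(dists).sum()     # MST of the complete graph

def emst_registration_function(img1: np.ndarray, img2: np.ndarray) -> float:
    """W_1(F_n) with F_n = {(I1(x,y), I2(x,y))} over all pixels (x,y)."""
    joint = np.column_stack([img1.ravel(), img2.ravel()]).astype(float)
    return emst_weight(joint)
```

For higher dimensional features, only the construction of the joint sample array changes; `emst_weight` is dimension-agnostic.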
4. OPTIMIZING THE EMST METRIC
Image registration problems typically consist of three major components. The transformation space determines the allowed spatial transformations applied to the images. This component is highly application dependent; examples are rigid-body, affine and deformable transformations. The registration function quantifies the similarity between two images under a given transformation; some examples are mutual information, the correlation ratio and the EMST function. The optimization method searches for the optimum transformation that maximizes the similarity between the images.

Consider the problem of aligning images I_i(x,y), i = 1, 2, using the EMST metric with fixed sampling locations Ω ⊂ R^2 and feature function f(·). Let the spatial transformation T : Ω → R^2 be parameterized by m parameters, i.e. T_t where t = (t_1, ..., t_m). For example, a rigid-body transformation T_t^R has three parameters: two shifts (t_x, t_y) and a rotation θ. It can be expressed as:
T_t^R(x,y) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} t_x \\ t_y \end{pmatrix}.   (2)

Let W(t) denote the EMST weight of F(t) = {(f_1(x,y), f_2(T_t(x,y))) : ∀(x,y) ∈ Ω}.
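Evaluating W(t) for a rigid-body transformation thus amounts to resampling I_2 at the transformed sample locations and recomputing the EMST weight. A sketch, assuming identity features and taking Ω to be all pixel centers; the helper names and the choice of bilinear interpolation via `scipy.ndimage.map_coordinates` are our own assumptions:

```python
import numpy as np
from scipy.ndimage import map_coordinates
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def emst_weight(samples: np.ndarray) -> float:
    """W_1: total edge length of the Euclidean MST over distinct samples."""
    return minimum_spanning_tree(squareform(pdist(samples))).sum()

def W(t, img1: np.ndarray, img2: np.ndarray) -> float:
    """EMST weight of F(t) = {(I1(x,y), I2(T_t(x,y))) : (x,y) in Omega}
    for a rigid-body transform t = (tx, ty, theta)."""
    tx, ty, theta = t
    ys, xs = np.mgrid[0:img1.shape[0], 0:img1.shape[1]]
    # T_t^R(x,y): rotate by theta, then shift by (tx, ty) -- cf. Eq. (2)
    xp = np.cos(theta) * xs + np.sin(theta) * ys + tx
    yp = -np.sin(theta) * xs + np.cos(theta) * ys + ty
    # map_coordinates expects coordinates as [rows, cols] = [y, x]
    f2 = map_coordinates(img2, [yp.ravel(), xp.ravel()], order=1)
    joint = np.column_stack([img1.ravel().astype(float), f2])
    return emst_weight(joint)
```

With t = 0 and img1 = img2, the joint samples lie on the diagonal and W reduces to the EMST weight of the matched intensity pairs.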
Note that in the rest of the paper we will assume γ = 1. The choice of a different value for γ may change the registration result, but the methods discussed below extend in a straightforward way to these cases. In this framework, the registration problem boils down to the following optimization problem:

t_opt = argmin_t W(t).

Typically this problem is non-convex, so finding the global optimum is a difficult task. For illustration, Figure 1 shows the profile of the EMST registration function with respect to the translation along the x-axis for an image pair.
Fig. 1. Profile of the EMST registration function (EMST metric vs. translation along the x-axis, in pixels).
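Given such a non-convex (and, as discussed next, non-differentiable) objective, one pragmatic baseline is gradient-free local optimization from several starting points, keeping the best local minimum. A sketch assuming some callable registration function W(t); the helper name and the use of Nelder-Mead via `scipy.optimize.minimize` are our own choices, not the method proposed in the paper:

```python
import numpy as np
from scipy.optimize import minimize

def multistart_minimize(W, starts):
    """Run gradient-free Nelder-Mead from each starting parameter vector
    and return the result with the smallest objective value found."""
    best = None
    for t0 in starts:
        res = minimize(W, t0, method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    return best

# Toy usage: a non-convex 1-D surrogate for a W(t) profile with two basins.
W_toy = lambda t: (t[0] ** 2 - 4.0) ** 2 + 0.5 * t[0]
result = multistart_minimize(W_toy, [np.array([-3.0]), np.array([3.0])])
```

Here the two starts fall into different basins, and the multi-start wrapper returns the deeper of the two local minima.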
4.1. Obtaining a Descent Direction
Descent optimization methods use a computed search direction that guarantees a local decrease in the value of the optimization function. Using the gradient, or an estimate of the gradient, is a common way to determine this direction. These methods usually assure convergence to a local extremum of the optimization function. To obtain a global extremum, multiple-start and multi-resolution approaches [6] can be used to augment the basic method in an attempt to avoid getting stuck in an undesired local extremum. The difficulty, however, in employing a gradient based approach for the EMST metric is that, in general, the EMST metric is not differentiable. For example, consider the vertex set V = {v_1, v_2, v_3} with edges and parameterized weights: