Image Registration: John Ashburner

Image
Registration
John Ashburner
* Smooth
* Realign
* Normalise
* Segment
With slides by Chloe Hutton and Jesper Andersson
Overview of SPM Analysis
fMRI time-series Design matrix Statistical Parametric Map
Motion Smoothing General Linear Model

Correction
Spatial Parameter Estimates

Normalisation
Anatomical Reference
Contents
* Preliminaries
* Smooth
* Rigid-Body and Affine Transformations
* Optimisation and Objective Functions
* Transformations and Interpolation
* Intra-Subject Registration
* Inter-Subject Registration
Smooth
Smoothing is done by convolution.
Each voxel after smoothing effectively

becomes the result of applying a weighted
region of interest (ROI).
Before convolution Convolved with a circle Convolved with a Gaussian
Image Registration
Registration - i.e. Optimise the parameters
that describe a spatial transformation
between the source and reference
(template) images
Transformation - i.e. Re-sample according

to the determined transformation
parameters
2D Affine Transforms
* Translations by tx and ty
* x1 = x0 + tx
* y1 = y0 + ty
* Rotation around the origin by  radians
* x1 = cos() x0 + sin() y0
* y1 = -sin() x0 + cos() y0
* Zooms by sx and sy
* x1 = sx x0 Shear
* y1 = sy y0 x1 = x0 + h y0
y1 = y0
2D Affine Transforms
* Translations by tx and ty
* x1 = 1 x0 + 0 y0 + tx
* y1 = 0 x0 + 1 y0 + ty
* Rotation around the origin by  radians
* x1 = cos() x0 + sin() y0 + 0
* y1 = -sin() x0 + cos() y0 + 0
* Zooms by sx and sy:
* x1 = sx x0 + 0 y0 + 0 Shear
* y1 = 0 x0 + sy y0 + 0 x1 = 1 x0 + h y0 + 0
y1 = 0 x0 + 1 y0 + 0
3D Rigid-body Transformations
* A 3D rigid body transform is defined by:
* 3 translations - in X, Y & Z directions
* 3 rotations - about X, Y & Z axes
* The order of the operations matters
1 0 0 Xtrans   1 0 0 0   cosΘ 0 sinΘ 0   cosΩ sinΩ 0 0 



0 1 0 Ytrans   0 cosΦ sinΦ 0   0 1 0 0    sinΩ cosΩ 0 0 
    
0 0 1 Ztrans   0  sinΦ cosΦ 0    sinΘ 0 cosΘ 0  0 0 1 0
       
0 0 0 1   0 0 0 1   0 0 0 1   0 0 0 1 
Translations Pitch Roll Yaw

about x axis about y axis about z axis
Voxel-to-world Transforms
* Affine transform associated with each image
* Maps from voxels (x=1..nx, y=1..ny, z=1..nz) to some world
co-ordinate system. e.g.,
* Scanner co-ordinates - images from DICOM toolbox
* T&T/MNI coordinates - spatially normalised
* Registering image B (source) to image A (target) will
update B’s voxel-to-world mapping
* Mapping from voxels in A to voxels in B is by
* A-to-world using MA, then world-to-B using MB-1
* MB-1 MA
Left- and Right-handed Coordinate
Systems
* Analyze™ files are stored in a left-handed system
* Talairach & Tournoux uses a right-handed system
* Mapping between them requires a flip
* Affine transform with a negative determinant
Optimisation
* Optimisation involves finding some “best”
parameters according to an “objective function”,
which is either minimised or maximised
* The “objective function” is often related to a
probability based on some model
Most probable solution

Objective (global optimum)
function Local optimum Local optimum
Value of parameter
Objective Functions
* Intra-modal
* Mean squared difference (minimise)
* Normalised cross correlation (maximise)
* Entropy of difference (minimise)
* Inter-modal (or intra-modal)
* Mutual information (maximise)
* Normalised mutual information (maximise)
* Entropy correlation coefficient (maximise)
* AIR cost function (minimise)
Transformation
* Images are re-sampled. An example in 2D:
for y0=1..ny0 % loop over rows
for x0=1..nx0 % loop over pixels in row
x1 = tx(x0,y0,q) % transform according to q
y1 = ty(x0,y0,q)
if 1x1 nx1 & 1y1ny1 then % voxel in range
f1(x0,y0) = f0(x1,y1) % assign re-sampled value
end % voxel in range
end % loop over pixels in row
end % loop over rows
* What happens if x1 and y1 are not integers?
Simple Interpolation
* Nearest neighbour
* Take the value of the
closest voxel
* Tri-linear
* Just a weighted
average of the
neighbouring voxels
* f5 = f1 x2 + f2 x1
* f6 = f3 x2 + f4 x1
* f7 = f5 y2 + f6 y1
B-spline Interpolation
A continuous function is represented by 2D B-spline basis functions
a linear combination of basis functions of degrees 0, 1, 2 and 3
B-splines are piecewise polynomials

Nearest neighbour and
trilinear interpolation are
the same as B-spline
interpolation with degrees
0 and 1.
Contents
* Preliminaries
* Realign
* Mean-squared difference objective function
* Residual artifacts and distortion correction
* Coregister
Mean-squared Difference
* Minimising mean-squared difference works for

intra-modal registration (realignment)
* Simple relationship between intensities in one
image, versus those in the other
* Assumes normally distributed differences
Residual Errors from aligned fMRI
* Re-sampling can introduce interpolation errors
* especially tri-linear interpolation
* Gaps between slices can cause aliasing artefacts
* Slices are not acquired simultaneously
* rapid movements not accounted for by rigid body model
* Image artefacts may not move according to a rigid body model
* image distortion
* image dropout
* Nyquist ghost
* Functions of the estimated motion parameters can be

modelled as confounds in subsequent analyses
Movement by Distortion Interaction
of fMRI
•Subject disrupts B0 field,
rendering it inhomogeneous
=> distortions in phase-
encode direction
•Subject moves during EPI
time series
•Distortions vary with
subject orientation
=> shape varies
Movement by distortion interaction
Correcting for distortion changes using
Unwarp Estimate reference from
mean of all scans.
Estimate new distortion

fields for each image:
• estimate rate of change Unwarp time
of field with respect to series.
Estimate the current estimate of
movement movement parameters
parameters. in pitch and roll.
 +
B0  B0 
Andersson et al, 2001
Contents
* Preliminaries
* Realign
* Coregister
* Mutual Information objective function
Inter-modal registration
• Match images from same
subject but different
modalities:
– anatomical localisation of
single subject activations
– achieve more precise
spatial normalisation of
functional image using
anatomical image.
Mutual Information
* Used for between-modality registration

* Derived from joint histograms
* MI= ab P(a,b) log2 [P(a,b)/( P(a) P(b) )]
* Related to entropy: MI = -H(a,b) + H(a) + H(b)
* Where H(a) = -a P(a) log2P(a) and H(a,b) = -a P(a,b) log2P(a,b)
Contents
* Preliminaries
* Normalise
* Affine Registration
* Nonlinear Registration
* Regularisation
* Segment
Spatial Normalisation - Reasons
* Inter-subject averaging
* Increase sensitivity with more subjects
* Fixed-effects analysis
* Extrapolate findings to the population as a whole
* Mixed-effects analysis
* Standard coordinate system
* e.g., Talairach & Tournoux space
Spatial Normalisation - Procedure
* Minimise mean squared difference from template
image(s)
Affine registration Non-linear registration

T2 T1 Transm T1 305
EPI PD PET PD T2 SS
Template Images “Canonical” images

Spatial normalisation can
PET be weighted so that non-
brain voxels do not
A wider range of influence the result.
contrasts can be
Similar weighting masks
registered to a
can be used for normalising
linear combination lesioned brains.
of template images.
T1 PD
Spatial Normalisation - Templates
Spatial Normalisation - Affine
* The first part is a 12 parameter
affine transform
* 3 translations
* 3 rotations
* 3 zooms
* 3 shears
* Fits overall shape and size
* Algorithm simultaneously minimises

* Mean-squared difference between template and source image
* Squared distance between parameters and their expected values
(regularisation)
Spatial Normalisation - Non-linear
Deformations consist of a
linear combination of smooth
basis functions
These are the lowest

frequencies of a 3D discrete
cosine transform (DCT)
Algorithm simultaneously minimises

* Mean squared difference between template
and source image
* Squared distance between parameters and
their known expectation
Spatial Normalisation - Overfitting
Without
regularisation, Affine
the non-linear registration.
(2 = 472.1)
spatial Template
image
normalisation
can introduce
unnecessary Non-linear
warps. Non-linear
registration
without
registration
regularisation.
using
(2 = 287.3)
regularisation.
(2 = 302.7)
Contents
* Preliminaries
* Normalise
* Segment
* Gaussian mixture model
* Intensity non-uniformity correction
* Deformed tissue probability maps
Segmentation
* Segmentation in SPM5 also estimates a spatial
transformation that can be used for spatially
normalising images.
* It uses a generative model, which involves:

* Mixture of Gaussians (MOG)
* Bias Correction Component
* Warping (Non-linear Registration) Component
Gaussian Probability Density
* If intensities are assumed to be Gaussian of mean mk
and variance s2k, then the probability of a value yi is:
Non-Gaussian Probability Distribution
* A non-Gaussian probability density function can be
modelled by a Mixture of Gaussians (MOG):
Mixing proportion - positive and sums to one

Belonging Probabilities
Belonging
probabilities
are assigned
by normalising
to one.
Mixing Proportions
* The mixing proportion gk represents the prior
probability of a voxel being drawn from class k -
irrespective of its intensity.
* So:
Non-Gaussian Intensity Distributions
* Multiple Gaussians per tissue class allow non-Gaussian
intensity distributions to be modelled.
* E.g. accounting for partial volume effects
Probability of Whole Dataset
* If the voxels are assumed to be independent, then
the probability of the whole image is the product of
the probabilities of each voxel:
* A maximum-likelihood solution can be found by

minimising the negative log-probability:
Modelling a Bias Field
* A bias field is included, such that the required
scaling at voxel i, parameterised by b, is ri(b).
* Replace the means by mk/ri(b)

* Replace the variances by (sk/ri(b))2
Modelling a Bias Field
* After rearranging...
y r(b) y r(b)
Tissue Probability Maps
* Tissue probability maps (TPMs) are used instead
of the proportion of voxels in each Gaussian as
the prior.
ICBM Tissue Probabilistic Atlases. These tissue probability maps are

kindly provided by the International Consortium for Brain Mapping, John C.
Mazziotta and Arthur W. Toga.
“Mixing Proportions”
* Tissue probability maps for
each class are included.
* The probability of obtaining
class k at voxel i, given
weights g is then:
Deforming the Tissue Probability
Maps
* Tissue probability
images are
deformed according
to parameters a.
* The probability of
obtaining class k at
voxel i, given
weights g and
parameters a is
then:
The Extended Model
* By combining the modified P(ci=k|) and P(yi|ci=k,), the
overall objective function (E) becomes:
The Objective Function

Optimisation
* The “best” parameters are those that minimise this
objective function.
* Optimisation involves finding them.
* Begin with starting estimates, and repeatedly change
them so that the objective function decreases each
time.
Steepest Descent
Start
Optimum
Alternate between
optimising different groups
of parameters
Schematic of optimisation
Repeat until convergence…
Hold g, m, s2 and a constant, and minimise E w.r.t. b

- Levenberg-Marquardt strategy, using dE/db and d2E/db2
Hold g, m, s2 and b constant, and minimise E w.r.t. a

- Levenberg-Marquardt strategy, using dE/da and d2E/da2
Hold a and b constant, and minimise E w.r.t. g, m and s2

-Use an Expectation Maximisation (EM) strategy.
end
Levenberg-Marquardt Optimisation
* LM optimisation is used for nonlinear registration (a)
and bias correction (b).
* Requires first and second derivatives of the
objective function (E).
* Parameters a and b are updated by
* Increase l to improve stability (at expense of

decreasing speed of convergence).
Expectation Maximisation is used to
update m, s2 and g
* For iteration (n), alternate between:
* E-step: Estimate belonging probabilities by:
* M-step: Set (n+1) to values that reduce:

Regularisation
* Some bias fields and warps are more probable (a
priori) than others.
* Encoded using Bayes rule (for a maximum a posteriori
solution):
* Prior probability distributions modelled by a

multivariate normal distribution.
* Mean vector ma and mb
* Covariance matrix Sa and Sb
* -log[P(a)] = (a-ma)TSa-1(ama) + const
* -log[P(b)] = (b-mb)TSb-1(bmb) + const
Initial Affine Registration
The procedure begins with a Joint Probability Histogram
Mutual Information affine
registration of the image with
the tissue probability maps.
MI is computed from a 4x256
joint probability histogram.
See D'Agostino, Maes, Vandermeulen &

P. Suetens. “Non-rigid Atlas-to-Image
Registration by Minimization of Class-
Conditional Image Entropy”. Proc.
MICCAI 2004. LNCS 3216, 2004. Pages
745-753.
Background voxels
excluded
Background Voxels are
Excluded
An intensity threshold is found by

fitting image intensities to a
mixture of two Gaussians. This
threshold is used to exclude most
of the voxels containing only air.
Spatially
normalised
BrainWeb
phantoms (T1,
T2 and PD)
Tissue
probability
maps of GM
and WM
Cocosco, Kollokian, Kwan & Evans. “BrainWeb: Online Interface to a 3D MRI Simulated Brain Database”. NeuroImage 5(4):S425 (1997)
References
* Friston et al. Spatial registration and normalisation of images.
Human Brain Mapping 3:165-189 (1995).
* Collignon et al. Automated multi-modality image registration based on
information theory. IPMI’95 pp 263-274 (1995).
* Ashburner et al. Incorporating prior knowledge into image registration.
NeuroImage 6:344-352 (1997).
* Ashburner & Friston. Nonlinear spatial normalisation using basis functions.
Human Brain Mapping 7:254-266 (1999).
* Thévenaz et al. Interpolation revisited.
IEEE Trans. Med. Imaging 19:739-758 (2000).
* Andersson et al. Modeling geometric deformations in EPI time series.
Neuroimage 13:903-919 (2001).
* Ashburner & Friston. Unified Segmentation.
NeuroImage in press (2005).

Image Registration: John Ashburner

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Image Registration: John Ashburner

Uploaded by

Copyright:

Available Formats

Image

Motion Smoothing General Linear Model

Spatial Parameter Estimates

Each voxel after smoothing effectively

Transformation - i.e. Re-sample according

1 0 0 Xtrans   1 0 0 0   cosΘ 0 sinΘ 0   cosΩ sinΩ 0 0 

Translations Pitch Roll Yaw

Most probable solution

B-splines are piecewise polynomials

* Minimising mean-squared difference works for

* Functions of the estimated motion parameters can be

Estimate new distortion

* Used for between-modality registration

Affine registration Non-linear registration

Template Images “Canonical” images

* Algorithm simultaneously minimises

These are the lowest

Algorithm simultaneously minimises

* It uses a generative model, which involves:

Mixing proportion - positive and sums to one

* A maximum-likelihood solution can be found by

* Replace the means by mk/ri(b)

ICBM Tissue Probabilistic Atlases. These tissue probability maps are

The Objective Function

Hold g, m, s2 and a constant, and minimise E w.r.t. b

Hold g, m, s2 and b constant, and minimise E w.r.t. a

Hold a and b constant, and minimise E w.r.t. g, m and s2

* Increase l to improve stability (at expense of

* M-step: Set (n+1) to values that reduce:

* Prior probability distributions modelled by a

See D'Agostino, Maes, Vandermeulen &

An intensity threshold is found by

You might also like