
2.1.4 Interpolation

Discrete data such as digital images and the related shiftmaps are by definition known only at discrete locations on a regular grid; they are functions of two variables, x and y in this case, and can also be referred to as a surface, since the value of the function can be viewed as the z coordinate of a 3D point (x, y, z). Such functions can be evaluated between the discrete points using interpolation. A priori knowledge of the behaviour of the actual data is beneficial in the choice of interpolation function, but most data can be interpolated using bilinear or, better, bicubic interpolation. In image processing, bicubic interpolation is preferred over bilinear because it is better at preserving fine detail in scaling applications, although bilinear is far simpler to implement. Bicubic interpolation is used frequently in the warping, scaling and registration algorithms referred to in this thesis.

Figure 2-4 Interpolation in a regular lattice: the point P(x, y) lies inside the cell bounded by the grid points O11, O21, O12 and O22, at x-coordinates X1, X2 and y-coordinates Y1, Y2; Q1 and Q2 are the intermediate points obtained by interpolating along x.

Regardless of the order of the interpolation, the concept is to evaluate the value of a point P that falls between known values on a regular lattice, as shown in Figure 2-4.

2.1.4.1 Bilinear Interpolation

In bilinear interpolation, the values are assumed to vary linearly along the grid lines; this does not hold in any other direction, where the interpolant takes a quadratic form because the interpolation is performed in each direction in turn. The first step is to locate the four closest points (O11, O12, O21, O22) that surround P(x, y). The second step is to interpolate along the x direction and evaluate the function at the points Q1(x, Y1) and Q2(x, Y2):
f(Q_1) \approx f(O_{11}) + (x - X_1)\,\frac{f(O_{21}) - f(O_{11})}{X_2 - X_1}
f(Q_2) \approx f(O_{12}) + (x - X_1)\,\frac{f(O_{22}) - f(O_{12})}{X_2 - X_1}        (2.3)

Then the interpolation in the y direction is performed:


f(P) \approx f(Q_1) + (y - Y_1)\,\frac{f(Q_2) - f(Q_1)}{Y_2 - Y_1}        (2.4)
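As an illustration of equations 2.3 and 2.4, the following is a minimal Python sketch (not part of the thesis; the cell limits and corner values are hypothetical) that evaluates one point inside a single patch:

def bilinear(P, X1, X2, Y1, Y2, f11, f21, f12, f22):
    # Evaluate f at P = (x, y) inside the cell [X1, X2] x [Y1, Y2] from the
    # corner values f11 = f(O11), f21 = f(O21), f12 = f(O12), f22 = f(O22).
    x, y = P
    t = (x - X1) / (X2 - X1)
    fQ1 = f11 + t * (f21 - f11)                       # Eq. 2.3, along x at y = Y1
    fQ2 = f12 + t * (f22 - f12)                       # Eq. 2.3, along x at y = Y2
    return fQ1 + (y - Y1) / (Y2 - Y1) * (fQ2 - fQ1)   # Eq. 2.4, along y

# Hypothetical unit cell with corner values 10, 15, 5, 20
print(bilinear((0.25, 0.75), 0.0, 1.0, 0.0, 1.0, 10.0, 15.0, 5.0, 20.0))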

Figure 2-5(a) shows the bilinear interpolation as a greyscale image over a 3 by 3 regular grid whose values are given; the same data are plotted as a surface mesh in Figure 2-5(b), where the interpolated points (coarse mesh) are shown with an offset of 35 units on top of the uninterpolated points (fine mesh). Notice the curved surface in the front right quadrant, and yet the edges of the patches are straight lines, indicating that surface curvature continuity is not maintained from one patch to the next, i.e., the interpolated surface is not continuous in its first derivative in all directions. Whilst a grid of 3 by 3 points was used to demonstrate this shortcoming, the smallest grid that can be interpolated using the bilinear method is 2 by 2, i.e. a single patch.

Figure 2-5 Bilinear interpolation of the 3 by 3 grid of values [20 0 20; 10 15 5; 5 20 20]. (a) greyscale (b) mesh plot.

2.1.4.2 Bicubic Interpolation

Bicubic interpolation constrains the interpolated surface to also be continuous in its first derivative in all directions, a significant improvement over the bilinear method. However, the bicubic implementation requires at least 3 by 3 data points in order to have enough constraints to calculate the x, y and xy cross derivatives. The interpolant is a 3rd order polynomial of the form:
f(x, y) \approx \sum_{i=0}^{3} \sum_{j=0}^{3} a_{ij}\, x^i y^j        (2.5)

where the 16 coefficients a_ij are determined by imposing the continuity and first-derivative continuity constraints. Figure 2-6 shows the greyscale and mesh plots of the same data points as used for the bilinear interpolation. The continuity of the first derivatives is evident from the smoothness of the greyscale transitions and from the surface curvature of the mesh plot.

Figure 2-6 Bicubic interpolation of the same 3 by 3 grid of values [20 0 20; 10 15 5; 5 20 20]. (a) greyscale (b) mesh plot.
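To make the polynomial of equation 2.5 concrete, the following Python sketch determines the 16 coefficients of one patch by fitting the polynomial exactly through a hypothetical 4 by 4 neighbourhood of samples; the thesis fixes the coefficients through value and derivative continuity constraints instead, but the surface has the same polynomial form either way.

import numpy as np

# Hypothetical 4 by 4 neighbourhood of samples on a unit-spaced grid,
# with local coordinates x, y in {0, 1, 2, 3}.
samples = np.array([[20.,  0., 20., 10.],
                    [10., 15.,  5., 12.],
                    [ 5., 20., 20.,  8.],
                    [ 7.,  9., 11., 13.]])

# Build the 16 x 16 system V a = f: each row of V holds the monomials
# x^i y^j (i, j = 0..3) evaluated at one sample location.
xs, ys = np.meshgrid(np.arange(4), np.arange(4), indexing="ij")
V = np.stack([(xs.ravel() ** i) * (ys.ravel() ** j)
              for i in range(4) for j in range(4)], axis=1)
a = np.linalg.solve(V, samples.ravel())        # the 16 coefficients a_ij

def bicubic(x, y):
    # Evaluate f(x, y) = sum_ij a_ij x^i y^j  (Eq. 2.5)
    monomials = np.array([(x ** i) * (y ** j)
                          for i in range(4) for j in range(4)])
    return a @ monomials

print(bicubic(1.0, 2.0))   # reproduces samples[1, 2] = 5.0
print(bicubic(1.5, 1.5))   # a smoothly interpolated value inside the patch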

2.1.5 Affine Transformation

Mathematically speaking, an affine transformation is a linear transformation followed by a translation, of the form x ↦ Ax + b. In the image processing context, affine transformation refers to a set of simple transformations applied individually or in combination: translation, rotation, scaling and shear. All of these transformations can be written in the form [92]:

x' = a_{11} x + a_{21} y + a_{31}
y' = a_{12} x + a_{22} y + a_{32}        (2.6)

where the coefficients a_ij define the mapping of the point (x, y) into the point (x', y'). These coefficients can be collected and the two equations written in matrix form:
[x' \; y' \; 1] = [x \; y \; 1] \begin{bmatrix} a_{11} & a_{12} & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & 1 \end{bmatrix} = [x \; y \; 1]\,[T]        (2.7)

The seemingly useless third column of the transformation matrix [T] is retained to allow
concatenation of multiple transformations by matrix multiplication.
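As a small Python sketch (not from the thesis; the coefficients and points are hypothetical) of the row-vector convention of equation 2.7 and of concatenation through the retained third column:

import numpy as np

# Hypothetical coefficients a_ij arranged as in Eq. 2.7 (row-vector convention).
T = np.array([[ 1.2, 0.1, 0.0],
              [-0.1, 0.9, 0.0],
              [ 5.0, 3.0, 1.0]])

p = np.array([10.0, 20.0, 1.0])           # the point (x, y) in homogeneous form
print(p @ T)                              # gives [x', y', 1] as in Eq. 2.7

# A second transformation (here a translation) concatenates into one matrix:
T2 = np.array([[ 1.0,  0.0, 0.0],
               [ 0.0,  1.0, 0.0],
               [-5.0, -3.0, 1.0]])
print((p @ T) @ T2, p @ (T @ T2))         # identical results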

The transformation matrix is applied to every pixel of an image unless subregions have been defined; therefore the output image may or may not fit inside the original image boundaries. These boundaries may need to be enlarged for intermediate operations to avoid truncation, and then cropped back to the original size of the image.

2.1.5.1 Translation

Translation is simply the offsetting of all points in the x and y directions by adding Tx and Ty to x and y, respectively:

T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ T_x & T_y & 1 \end{bmatrix}        (2.8)

2.1.5.2 Rotation

Rotation is carried out about the origin of the image. All points in the xy-plane are rotated counter-clockwise by an angle θ:

R = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}        (2.9)

When rotating an image, it is understood that the rotation is about the image's centre point; to perform such a rotation, the image should first be translated so that its centre coincides with the origin, rotated, and then translated back to its original centre.
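A minimal Python sketch of this translate-rotate-translate sequence, using the row-vector convention of equation 2.7 and an assumed image centre (cx, cy):

import numpy as np

def translate(tx, ty):
    # Translation matrix of Eq. 2.8 (row-vector convention)
    return np.array([[1, 0, 0], [0, 1, 0], [tx, ty, 1]], dtype=float)

def rotate(theta):
    # Counter-clockwise rotation about the origin, Eq. 2.9
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]], dtype=float)

cx, cy = 64.0, 64.0                                   # assumed image centre
# Move the centre to the origin, rotate, then move it back.
M = translate(-cx, -cy) @ rotate(np.radians(30)) @ translate(cx, cy)

print(np.array([cx, cy, 1.0]) @ M)        # the centre maps to itself
print(np.array([cx + 10, cy, 1.0]) @ M)   # a nearby point rotates about the centre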

2.1.5.3 Scale

Similarly to rotation, scaling works about the origin of the image, i.e., all distances are scaled with respect to the origin. The scale factors Sx and Sy in the x and y directions can be different, resulting in a differentially scaled image. Positive scale factors less than unity reduce the image, whilst scale factors greater than unity enlarge it. Mirroring can be achieved with negative scale factors:

S = \begin{bmatrix} S_x & 0 & 0 \\ 0 & S_y & 0 \\ 0 & 0 & 1 \end{bmatrix}        (2.10)

Again, just as with rotation, reducing or enlarging an image is generally understood to be about its centre point; to perform such a scaling, the image should first be translated to the origin, scaled, and then translated back to its centre.
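The same pattern applies in Python, where a negative S_x additionally mirrors the image about the vertical axis through the centre (a sketch with a hypothetical centre and scale factors):

import numpy as np

def translate(tx, ty):
    return np.array([[1, 0, 0], [0, 1, 0], [tx, ty, 1]], dtype=float)

def scale(sx, sy):
    # Scaling matrix of Eq. 2.10 (row-vector convention)
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

cx, cy = 64.0, 64.0                                    # assumed image centre
# Enlarge by 2 in x and 1.5 in y about the centre rather than the origin.
M = translate(-cx, -cy) @ scale(2.0, 1.5) @ translate(cx, cy)
print(np.array([cx + 10, cy + 10, 1.0]) @ M)           # -> centre + (20, 15)

# Mirroring about the vertical axis through the centre: Sx = -1.
M_mirror = translate(-cx, -cy) @ scale(-1.0, 1.0) @ translate(cx, cy)
print(np.array([cx + 10, cy, 1.0]) @ M_mirror)         # -> centre - (10, 0)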

2.1.5.4 Shear

The shear transform is analogous to distorting a rectangle into a parallelogram by offsetting each row (column) in proportion to its distance from the first row (column). Therefore, the shear transform along the x-axis keeps the y coordinates constant whilst linearly translating the x coordinates with respect to y:
H_x = \begin{bmatrix} 1 & 0 & 0 \\ H_x & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}        (2.11)

Similarly, the shear transform along the y-axis keeps the x coordinates constant
whilst linearly translating the y coordinates with respect to x:
H_y = \begin{bmatrix} 1 & H_y & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}        (2.12)

A combination of shear transforms is often used as a faster alternative to the simple rotation transform [69].
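One well-known decomposition of this kind (not necessarily the scheme used in [69]) expresses a rotation by θ as three shears; a Python sketch under the row-vector convention of equation 2.7:

import numpy as np

def shear_x(h):
    # Shear along x, Eq. 2.11: x' = x + h*y, y' = y (row-vector convention)
    return np.array([[1, 0, 0], [h, 1, 0], [0, 0, 1]], dtype=float)

def shear_y(h):
    # Shear along y, Eq. 2.12: x' = x, y' = y + h*x
    return np.array([[1, h, 0], [0, 1, 0], [0, 0, 1]], dtype=float)

def rotate(theta):
    # Rotation about the origin, Eq. 2.9
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]], dtype=float)

theta = np.radians(25)
alpha = -np.tan(theta / 2)
three_shears = shear_x(alpha) @ shear_y(np.sin(theta)) @ shear_x(alpha)
print(np.allclose(three_shears, rotate(theta)))        # True

The appeal of the shear form is that each shear moves entire rows or columns of pixels as units, which lends itself to an efficient implementation.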

2.1.5.5 Successive transformations

Due to the discrete nature of image data, successive transformations are bound to introduce artefacts caused by round-off errors. The simplest remedy to this problem is to concatenate all the transformations by multiplying their matrices and operate on the image data only once. If this is not practical, another consideration would be to perform the simpler transformations first, such as translations, which are less likely to cause fractional displacements, whereas rotations will undoubtedly introduce fractional displacements that vary with the distance of the affected pixels from the rotation axis.
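A small Python sketch of this effect (with a hypothetical pixel location), snapping to the nearest pixel after each step to mimic operating on discrete image data:

import numpy as np

def rotate(theta):
    # Rotation about the origin, Eq. 2.9 (row-vector convention)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]], dtype=float)

p = np.array([37.0, 112.0, 1.0])                    # a hypothetical pixel location
R1, R2 = rotate(np.radians(17)), rotate(np.radians(23))

# Transforming the data after each step forces a rounding each time.
stepwise = np.round(np.round(p @ R1) @ R2)

# Concatenating first and transforming once rounds only once.
concatenated = np.round(p @ (R1 @ R2))

print(stepwise, concatenated)                       # the two can differ by a pixel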

2.1.6 Sampling

Sampling is at the core of any imaging system and has a decisive effect on the
success of any further digital processing. The problem stems from the fact that a
continuous signal (the image) with presumably infinite detail is to be stored by means of
a finite array of intensity dots (pixels). For the moment, it would simplify our thinking
process if we didn’t concern ourselves with the precision by which the intensity
information can be recorded and assume the process to take place with enough fidelity.
The effects of quantisation can be accounted for as a noise source.

What makes any signal of interest is the amount of information, or detail, it can carry. In the case of an image, this is determined by the density of spatial detail or, more precisely, the spatial frequency content that it can hold. Although the concept of frequency is more intuitively understood in terms of cycles per second when talking about an electrical signal, its counterpart in image processing is simply the number of cycles per width or height of an image of a given size. A very useful tool for analysing the frequency content of a signal is the Fourier transform.
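As a simple illustration of spatial frequency measured in cycles per image width (a one-dimensional Python sketch, not from the thesis), a sinusoidal intensity profile with k cycles across N pixels produces a peak at index k of its discrete Fourier transform provided k is below the limit of N/2 cycles; above that limit the detail cannot be represented and aliases to a lower frequency:

import numpy as np

N = 64                                    # pixels across the image width
x = np.arange(N)

def peak_cycles(k):
    # A sinusoidal intensity profile with k cycles per image width.
    profile = np.sin(2 * np.pi * k * x / N)
    spectrum = np.abs(np.fft.rfft(profile))
    return int(np.argmax(spectrum))       # bin index = cycles per width

print(peak_cycles(10))   # 10: below N/2, the detail is recovered correctly
print(peak_cycles(40))   # 24: above N/2, aliased down to N - 40 = 24 cycles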

2.1.7 Fourier transform

Similarly to a time-varying signal s(t), whose frequency spectrum is given by its Fourier transform S(f_t), where f_t is the temporal frequency, the frequency content of an image g(x) is given by its Fourier transform G(f_s), where f_s is the spatial frequency. We can drop the subscript s for convenience. The Fourier transform is defined by:
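A commonly used one-dimensional continuous form, under the e^{-i 2\pi f x} exponent convention assumed here, is

G(f) = \int_{-\infty}^{\infty} g(x)\, e^{-i 2\pi f x}\, dx

with the two-dimensional form obtained by integrating over both spatial variables.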
