
1496 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 11, NOVEMBER 1995

Detection of Missing Data in Image Sequences
Anil C. Kokaram, Member, IEEE, Robin D. Morris, William J. Fitzgerald, and Peter J. W. Rayner

Manuscript received March 19, 1994; revised January 10, 1995. This work was supported in part by the British Library and Cable and Wireless PLC. The associate editor coordinating the review of this paper and approving it for publication was Prof. A. Murat Tekalp.
The authors are with the Signal Processing and Communications Laboratory, Department of Engineering, Cambridge University, Cambridge, UK.
IEEE Log Number 9414596.
1057-7149/95$04.00 © 1995 IEEE

Abstract: Bright and dark flashes are typical artifacts in degraded motion picture material. The distortion is referred to as “dirt and sparkle” in the motion picture industry. This is caused either by dirt becoming attached to the frames of the film, or by the film material being abraded. The visual result is random patches of the frames having grey level values totally unrelated to the initial information at those sites. To restore the film without causing distortion to areas of the frames that are not affected, the locations of the blotches must be identified. Heuristic and model-based methods for the detection of these missing data regions are presented in this paper, and their action on simulated and real sequences is compared.

I. INTRODUCTION

Methods for suppressing impulsive distortion in still images and video sequences have traditionally involved median filters of some kind. Arce, Alp et al. [1]-[3] have introduced 3-D (spatiotemporal) multistage median filters (MMF’s) that can be used to suppress single pixel wide distortion in video signals. The MMF is a variant of standard median filtering in which the output value is the median of a set of values that are themselves the output of several other median filter masks of various shapes. In the case of degraded motion picture film however, it is more typical to find blotches that represent multiple pixel sized impulsive distortion. Such regions of constant intensity disturbances are called “dirt and sparkle” by television engineers. Kokaram et al. [4] have introduced a 3-D MMF that can reject such distortion.

It is important to realize that a successful treatment of the missing data problem must involve detection of the missing regions. This would enable the reconstruction algorithm to concentrate on these areas and so the reconstruction errors at noncorrupted sites can be reduced. This philosophy has important implications for median filtering in particular, which tends to remove the fine detail in images. Such a system incorporating a detector into a median filtering system for video has been used to good effect in [4]-[6].

This paper introduces model-based approaches to the general problem of detecting missing data in image sequences. Although it is clear that, as yet, there does not exist a definitive image sequence model, both Markov random field (MRF) based techniques and the 3-D autoregressive (AR) model hold some promise. Both models can describe the smooth variation of grey scale that is found over large areas of the image and the local pixel intensities. They can also handle the fine detail that is so important for image appreciation. The following work describes both an MRF based and a 3-D AR detector for dirt and sparkle in video signals. The performance is compared with the systems introduced in [4] and [5].

Of course, any solution to this general problem of detection and suppression of missing data in image sequences must involve attention to the motion of objects in the scene. Without considering motion, the application of 3-D processes to typical image sequences (e.g., television) would result in little improvement over what could be achieved using just spatial information. This is because like information must be treated together in each frame, and motion in a scene implies that the information at a particular position coordinate in one frame may not be related to the information at that coordinate in other frames. In other words, moving portions of an image tend to be highly nonstationary in the temporal direction perpendicular to the frame.

Although both AR and MRF methods can be used to estimate motion in video [6]-[9], a high computational cost is incurred. It is to be noted also that motion estimation is a vibrant research area and it would not be feasible to treat both this problem and the detection problem in this one paper. It is chosen instead to use block matching to generate motion vectors that are then used by the 3-D detection process that follows. Block matching is widely used as a robust motion estimator in many applications [10], [11]. Since it is primarily motion that gives clues to the detection of dirt and sparkle, a description of the motion estimator used is given first, followed by the description of the detectors.

II. MOTION ESTIMATION

Despite the additional computational load necessary to estimate motion in an image sequence, the rewards in terms of detection accuracy are great. Furthermore, dirt and sparkle can be easily modeled as a temporal discontinuity facilitating its recognition. This discontinuity at a site of dirt and sparkle may be recognized in a broad sense as an area of image that cannot be matched to a similar area in both the previous and next frames. Using three frames for detection in this manner reduces problems caused by occlusion and uncovering of objects, which would give rise to temporal discontinuities in either the forward or backward direction only.

The algorithm used for motion estimation is described fully in [6]. It is a multiresolution motion estimation technique using block matching (BM) with a full motion search (FMS). A multiresolution technique is essential if one is to deal efficiently with all the different magnitudes of motion in an interesting scene. Several representations of the original image are made on different scales by successively lowpass filtering and subsampling the original frame. Typically three or four levels are used for a 256 × 256 pixel image having resolutions

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on October 8, 2009 at 04:31 from IEEE Xplore. Restrictions apply.
KOKARAM et al.: DETECTION OF MISSING DATA IN IMAGE SEQUENCES 1497

128 × 128, 64 × 64, etc. In this paper, if there are $N$ levels generated, the highest resolution image is defined as Level 0, and the lowest as Level $N - 1$.

Motion estimation begins at Level $N - 1$. Block matching involves, first of all, segmenting the current frame $f$, say, into predefined rectangular blocks (of size $L \times L$ pixels in this case) and then estimating the motion of each block separately. It is necessary, first of all, to detect motion in each of these blocks before a search for the correct motion vector can begin. This is done simply by thresholding the mean absolute error (MAE) between the pixels in the current block and those in the block at the same position in the previous frame. If the MAE exceeds a threshold $t_m$, then it is assumed the block is moving.

Once motion is detected, the MAE between the current block and every block in a predefined search space in the previous frame is calculated. This search space is defined by fixing the maximum expected displacement to $\pm w$ pixels. Then, the search space is the $(L + 2w) \times (L + 2w)$ block centered on the current block position, but in the previous frame. The motion estimator used here is a simple integer accurate technique, i.e., the blocks searched in the previous frame correspond only to the pixels available on the given grid locations. Fractional displacement accuracy is possible by interpolating between grid locations or by interpolating the resulting MAE curve from an integer accurate search. Fractional estimation will yield better results, but it is more computationally demanding and so was not used in this work.

The displacement corresponding to the minimum MAE ($E_d$) is then selected for consideration. In order to prevent spurious matches caused by noise (another problem encountered frequently in degraded video sequences), the method of Boyce [12] is used. This technique compares $E_d$ with the “no motion” error $E_0$ corresponding to the center block in the search space. If the ratio $r = E_0/E_d$ is less than some threshold ratio $r_t$, the match is assumed to be a spurious one and the estimated motion vector is set to [0, 0]. If the ratio is larger than the threshold, then it is assumed that the minimum match is too small to be due to the effect of noise, and the displacement corresponding to that match is selected.

After motion estimation at level $N - 1$ is complete, the vectors are propagated down to the level $N - 2$ where FMS BM is again used to refine those estimates. Bilinear interpolation is used to estimate initial start vectors in the level $l$ that are not estimated at the previous level $l + 1$. The multiresolution scheme is the same as that used by Enkelmann et al. [13], except only top down vector refinement is used. At the final level, 0, it is possible that blocks not containing moving areas are assigned nonzero motion vectors because of their proximity to moving regions. To identify and correct this problem of vector haloes the solution by Bierling [10] is used. Motion is detected again at the original level before the estimate from level 1 is accepted.

The final result is a field of vectors estimated on a block basis over the entire image. To get a displacement for every pixel one could either use the same vector for every pixel in a block, or as is used here, to interpolate the vector field.¹ It is found in general that it is better for pixel-wise detection to interpolate the vector field than to use a block-based field. This alleviates the more serious blocking artifacts, although it is agreed that this solution is by no means a consistent one. Removing blocking artifacts should be incorporated into the motion estimator itself and not as a post-processing stage. Nevertheless, as far as detection of degradation is concerned, blocking artifacts from the motion estimator are not a problem.

¹Using in this case bilinear interpolation.

For alternative motion estimation schemes, the reader is referred to the extensive literature in [6] and [14]-[19].

III. THE MODELS

In a sense, estimating motion in the video signal already imposes some model on the data. Using BM implies a translational model of the image sequence, such that

$$I_n(\vec{r}) = I_{n-1}(\vec{r} + \vec{v}_{n,n-1}(\vec{r})) \tag{1}$$

where $\vec{r} = [x, y]$ denotes spatial coordinate and $\vec{v}_{n,n-1}(\vec{r})$ is the motion vector mapping the relevant portion of frame $n$ into the corresponding portion of frame $n - 1$ at position $\vec{r}$. The motion vector is found by minimizing a functional of $I_n(\vec{r}) - I_{n-1}(\vec{r} + \vec{v}_{n,n-1}(\vec{r}))$. In the case of BM, this form is the absolute error operation and the minimization is achieved via a direct search technique over all possible motion vectors within a certain range.

This basic model therefore creates each image by rearranging patches of grey scale from the previous frames. This simple structure can be used to propose several detectors for a temporal discontinuity that will be considered in the next section. However, it is possible to use alternative models, such as those discussed in the following sections, to describe the evolution of pixel intensities. These models are more capable of describing changing object brightness due to shading, for example. Of course these models must take motion into account and it is possible to design schemes for motion estimation, whether implicit or explicit, using these techniques [6]-[8]. In practice, however, one finds it feasible to combine a rough yet robust motion estimation algorithm (such as BM) with more complicated image models. The process is treated in two stages, the first involving motion estimation and the second using these motion estimates to construct some image sequence model. This procedure takes advantage of the relatively simpler BM motion estimation process rather than resorting to the more complicated model based processes. Note however, even though differing ideas underlie the motion estimation in the first stage, and the models used in the second stage, the essential basis remains that an image in a sequence is mostly the same as the images before or after it.

A. Markov Random Fields

The use of Gibbs Distributions, or equivalently MRF models for images was introduced to the signal processing literature by Geman and Geman [20]. The framework is a very flexible one. Here, only the basic theory needed for the development of the MRF-based detector in Section IV-D is outlined; for a complete discussion, refer to [20] and [21].
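The block-matching step of Section II (motion detection by thresholding the MAE, a full search over displacements of up to ±w pixels, and the ratio test attributed to Boyce [12]) can be sketched for a single block as follows. This is a minimal illustration and not the authors' implementation: the function names `mae` and `match_block`, the default values of `L`, `w`, `t_m`, and `r_t`, the use of NumPy, and the handling of frame borders and of a perfect match are all assumptions made for the example.

```python
import numpy as np


def mae(a, b):
    """Mean absolute error between two equally sized blocks."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))


def match_block(cur, prev, top, left, L=8, w=4, t_m=2.0, r_t=1.2):
    """Integer-accurate full-search block matching for one L x L block.

    Returns (dy, dx), the offset into `prev` at which the block of `cur`
    with top-left corner (top, left) is best matched.
    All default parameter values are hypothetical, chosen for illustration.
    """
    block = cur[top:top + L, left:left + L]

    # "No motion" error E0: same block position in the previous frame.
    e0 = mae(block, prev[top:top + L, left:left + L])

    # Motion detection: search only if the MAE exceeds the threshold.
    if e0 <= t_m:
        return (0, 0)

    height, width = prev.shape
    best, e_d = (0, 0), e0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            y, x = top + dy, left + dx
            # Assumption: candidates falling outside the frame are skipped.
            if y < 0 or x < 0 or y + L > height or x + L > width:
                continue
            e = mae(block, prev[y:y + L, x:x + L])
            if e < e_d:
                best, e_d = (dy, dx), e

    # Assumption: a perfect match (E_d = 0) is accepted without the ratio test.
    if e_d == 0.0:
        return best

    # Ratio test: r = E0 / E_d below r_t marks the match as spurious.
    if e0 / e_d < r_t:
        return (0, 0)
    return best


# Usage: the current frame is the previous frame moved down 2 rows
# and left 1 column, so cur[y, x] == prev[y - 2, x + 1] in the interior.
rng = np.random.default_rng(0)
prev_frame = rng.integers(0, 256, size=(32, 32))
cur_frame = np.roll(prev_frame, (2, -1), axis=(0, 1))
vector = match_block(cur_frame, prev_frame, top=8, left=8)
```

In the multiresolution scheme described above, a search of this kind would be run at the coarsest level first and then repeated at each finer level around the propagated vectors; that propagation is not shown here.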


Fig. 1. 2-D lattice illustrating neighborhoods of different orders. The $n$th-order neighborhood of $\vec{r}$ includes all pixels labeled with numbers $\le n$. Neighborhoods on 3-D lattices are defined similarly.

Fig. 2. Cliques associated with the first-order neighborhood system used in the MRF detector. (Clique types shown: singleton, horizontal, vertical.)

Consider a finite lattice $S$ in two or three dimensions. At each site $\vec{r}$ of the lattice, define a random variable $i(\vec{r})$, where $i(\vec{r})$ takes values from the discrete state-space $\omega$. Let $i$ denote any of the possible configurations of the complete field $I$. Define a neighborhood system $N$ on $S$ (see Fig. 1), where the $n$th order neighborhood of pixel $\vec{r}$ is the set of pixels $\vec{s}$ such that $|\vec{r} - \vec{s}|^2 \le n$. These local neighborhoods are symmetric, such that $\vec{s} \in N_{\vec{r}} \Leftrightarrow \vec{r} \in N_{\vec{s}}$. The set $I$ is then an MRF if

$$P(I = i) > 0 \quad \forall\, i \in \omega^{|S|} \tag{2}$$

$$P(I(\vec{r}) = i(\vec{r}) \mid I(\vec{s}) = i(\vec{s}),\ \vec{r} \ne \vec{s}) = P(I(\vec{r}) = i(\vec{r}) \mid I(\vec{s}) = i(\vec{s}),\ \vec{s} \in N_{\vec{r}}) \tag{3}$$

For example, if $I$ represents the intensities of the pixels in an image, (3) states that the conditional probability of a pixel taking a particular value is a function only of the intensities of pixels in a finite (and usually small) neighborhood. The Hammersley-Clifford theorem [22] states this conditional probability distribution can be written as a sum over clique potentials as follows:

$$P(I(\vec{r}) = i(\vec{r}) \mid I(\vec{s}) = i(\vec{s}),\ \vec{s} \in N_{\vec{r}}) = \frac{1}{Z_{\vec{r}}} \exp\Big(-\sum_{C:\, \vec{r} \in C} V_C(i)\Big) \tag{4}$$

that is, the conditional probability is a function only of those cliques $C$ dependent on $i(\vec{r})$, where a clique is defined to be a subset of $S$ such that the clique contains either a single site, or every site in the clique is a neighbor of every other site. Some cliques for the first-order neighborhood are illustrated in Fig. 2. The function $V_C(i)$ is known as the potential function and is a function only of those variables within the clique $C$. From the conditional probability definition the joint distribution may be written

$$P(I = i) = \frac{1}{Z} \exp\Big(-\sum_{C \in \mathcal{C}} V_C(i)\Big) \tag{5}$$

where $\mathcal{C}$ is the set of all cliques.

Because of the identity between MRF's and Gibbs distributions, many of the techniques of, and analogies with, statistical physics can be applied to problems described by this framework. In particular the techniques of simulated annealing [20] and mean field annealing [23] have been used successfully to find solutions to the optimization problems associated with MRF's.

The Gibbs sampler is the basic technique for much work with MRF's. It provides a computationally tractable method for generating samples from the joint distribution $P(I = i)$. The Gibbs sampler is applied by repeatedly sampling from the simple distribution of (4); that is, samples are drawn from $P(I(\vec{r}) = i(\vec{r}) \mid I(\vec{s}) = i(\vec{s}),\ \vec{s} \ne \vec{r})$, where each time $\vec{r}$ indexes a different site in the lattice. This procedure will cause the configuration of the field $I$ to converge to a sample from the joint distribution $P(I)$, irrespective of the initial configuration. These samples from $P(I)$ can be used to calculate expectations with respect to $P(I)$, and, coupled with annealing, can be used to find the mode of the distribution in (5), the configuration with maximum probability.

Finding the maximum of this probability distribution is a massive combinatorial optimization problem, similar to the travelling salesman problem [24]. Introduce into (5) the idea of temperature, by multiplying the argument of the exponential by $1/T$. By varying the value of $T$, the characteristics of the distribution may be changed from uniform at $T = \infty$, to completely concentrated at the mode for $T = 0$. By introducing the variable $T$ and reducing its value at each iteration of the Gibbs sampler according to some schedule, the sample drawn from $P(I)$ will converge to the maximum probability configuration of the field. In [20] it was proved a logarithmic schedule will cause the algorithm to converge to the maximum probability solution. It has been noted many times, however, that this schedule is too slow in practice, and commonly an exponential schedule is used [25].

Details of using the mean field approximation to solve for the minimum variance solution can be found in [26].

B. The 3-D AR Model

The structure of the AR model allows efficient, closed-form, computational algorithms to be developed, and it is this, together with its spatiotemporal nature, which is of interest. The physical basis for its use as an image model is limited to its ability to describe local image smoothness both in time and space.

Simply put, the model tries to make the best prediction of a pel in the current frame based on a weighted linear combination of intensities at pels in a predefined support region. This support region may occupy pels in the current frame as well as previous and past frames. The 3-D AR model has already been described by Strobach [27], Efstratiadis et al. [7], and Kokaram [6], and the equation is repeated below using the notation of Kashyap [28].

$$I(x, y, n) = \sum_{k=1}^{N} a_k\, I\big(x + q_{xk} + v^{x}_{n,n+q_{nk}}(x, y),\ y + q_{yk} + v^{y}_{n,n+q_{nk}}(x, y),\ n + q_{nk}\big) + \epsilon(x, y, n) \tag{6}$$
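The weighted sum on the right-hand side of (6) can be sketched in code as follows. This is only an illustration under stated assumptions: the names `ar_predict`, `ar_residual`, and `displacement`, the particular five-pel support, the restriction to integer-valued motion vectors, and the (frame, row, column) array layout are not taken from the paper.

```python
import numpy as np

# Hypothetical support offsets (q_x, q_y, q_n): five pels in the previous frame.
SUPPORT = [(0, 0, -1), (-1, 0, -1), (0, -1, -1), (1, 0, -1), (0, 1, -1)]


def ar_predict(frames, n, x, y, coeffs, displacement, support=SUPPORT):
    """Weighted sum of motion-compensated support pels, as in (6) without
    the residual term.

    frames: array of shape (T, H, W), indexed as frames[frame, row, column].
    displacement(n, m, x, y): assumed helper returning the integer motion
    vector (vx, vy) from frame n into frame m at pel (x, y).
    """
    prediction = 0.0
    for a_k, (qx, qy, qn) in zip(coeffs, support):
        m = n + qn                      # frame holding this support pel
        vx, vy = displacement(n, m, x, y)
        prediction += a_k * frames[m, y + qy + vy, x + qx + vx]
    return prediction


def ar_residual(frames, n, x, y, coeffs, displacement, support=SUPPORT):
    """Residual: observed intensity minus the weighted sum above."""
    return frames[n, y, x] - ar_predict(
        frames, n, x, y, coeffs, displacement, support
    )


# Usage: frame 1 is frame 0 moved down one row and right one column,
# so every pel of frame 1 is found in frame 0 at offset (-1, -1).
frames = np.zeros((2, 6, 6))
frames[0] = np.arange(36).reshape(6, 6)
frames[1] = np.roll(frames[0], (1, 1), axis=(0, 1))


def constant_motion(n, m, x, y):
    return (-1, -1)


# With all weight on the centre support pel, the prediction follows the
# motion trajectory exactly and the residual vanishes.
eps = ar_residual(frames, 1, 3, 3, [1.0, 0.0, 0.0, 0.0, 0.0], constant_motion)
```

A large residual at a pel, for coefficients that fit the surrounding image well, is the kind of evidence a detector based on this model would look for. The symbols appearing in (6) are defined in the text that follows.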


In (6), $I(x, y, n)$ represents the pixel intensity at the location $(x, y)$ in the $n$th frame. There are $N$ model coefficients $a_k$. With no motion between frames, each coefficient would weight the pixel at the location offset by the vectors $\vec{q}_k = [q_{xk}, q_{yk}, q_{nk}]$, the sum of these weighted pixels giving the predicted value $\hat{I}(x, y, n)$ below

$$\hat{I}(x, y, n) = \sum_{k=1}^{N} a_k\, I(x + q_{xk},\ y + q_{yk},\ n + q_{nk}). \tag{7}$$

Because there is motion however, these support locations must be offset by the relative displacement between the predicted pixel location and the support location. This displacement between frame $n$ and frame $m$ is defined to be $\vec{v}_{n,m}(x, y) = [v^{x}_{n,m}(x, y),\ v^{y}_{n,m}(x, y)]$. The arguments $(x, y)$ illustrate that the displacement is a function of position in the image. Finally, $\epsilon(x, y, n)$ is the prediction error or residual at location $(x, y, n)$. It can also be considered to be the innovations sequence driving the AR model to produce the observed image $I(x, y, n)$. Fig. 3 shows a temporally causal 3-D AR model with five pixels support at [0, 0, -1], [-1, 0, -1], [0, -1, -1], [1, 0, -1], [0, 1, -1]. The figure illustrates how the displacement is incorporated into the prediction.

Fig. 3. Handling motion with the 3-D AR model. (Panels: no displacement; displacement of [-1 -1]. Labels: motion vector, support pel, predicted pel.)

For the purposes of parameter estimation, the model is considered in the prediction mode of (7). The task then becomes to choose the parameters in order to minimize some function of the prediction error, or residual

$$\epsilon(x, y, n) = I(x, y, n) - \hat{I}(x, y, n). \tag{8}$$

Equation (8) is just a rearrangement of the model (6) with the emphasis placed on the prediction error, $\epsilon(x, y, n)$.

It was decided, in the interest of computational load, to use a least-squared estimate for the model coefficients in order to adapt the coefficients to the image function prior to motion estimation. Recall that the displacement estimates are derived from a separate motion estimation process and so they do not complicate the least-squared solution further. The coefficients are chosen, therefore, to minimize the square of the error $\epsilon(\cdot)$, above. This leads to the normal equations. The derivation is the same as the 1-D case and the solution can be arrived at by invoking the principle of orthogonality. Solving the normal equations yields the model coefficients $a_k$.

The final set of equations to be solved is stated below:

$$\mathbf{C}\mathbf{a} = -\mathbf{c}. \tag{9}$$

Here, $\mathbf{C}$ and $\mathbf{c}$ represent terms from the correlation function of the image sequence. $\mathbf{a}$ is the vector of model coefficients. (See [6], [7], [27]-[29].)

IV. THE DETECTORS

It is important to realize from the outset that this work characterizes missing data in an image sequence by a region of pixels that have no relation to the information in any frame but the current one. “No relation” is assessed in different ways depending on the model structure used. This is typically the case in all real occurrences of the problem. This simple realization gives the key to all the detectors discussed here; the idea is to look for temporal discontinuities in the sequence. Further information can be gathered from spatial discontinuities as well. This is more difficult to rely upon principally because spatial discontinuities are a common and perhaps a necessary occurrence in an interesting picture.

Several detectors are described here. The discussion begins with those previously introduced and then moves on to the new detectors, namely the SDIa-, MRF-, and AR-based systems.

A. Heuristics

There have been two detectors previously discussed that involve some heuristics for detection. The earliest is that discussed by Storey [30], [31]. This did not employ motion estimation and instead thresholded the forward and backward nonmotion-compensated frame differences to detect a blotch.² There were a number of heuristics involved for detecting motion by using this information to vary the threshold in some way. The main thrust of the detector, however, is given by the following statements (where $I_n(\vec{r})$ is the pixel intensity at the location $\vec{r}$ in the $n$th frame):

$$e_b = I_n(\vec{r}) - I_{n-1}(\vec{r})$$
$$e_f = I_n(\vec{r}) - I_{n+1}(\vec{r})$$
$$D_{BBC} = \begin{cases} 1, & \text{if } (|e_b| > e_t) \text{ AND } (|e_f| > e_t) \text{ AND } (\operatorname{sgn}(e_b) = \operatorname{sgn}(e_f)) \\ 0, & \text{otherwise.} \end{cases} \tag{10}$$

²The term blotch is used here as a synonym for the terms “dirt and sparkle” and “temporal discontinuities.”

The detector can be stated in words as follows: $I_n(\vec{r})$ is a blotched pixel if both the absolute forward and backward errors are greater than the threshold error $e_t$, and $I_n(\vec{r})$ does not lie within the range represented by the values $I_{n-1}(\vec{r})$ and $I_{n+1}(\vec{r})$. The latter rule is placed because of the assumption that if the pixel value is between those of the pixels in the two frames $n + 1$, $n - 1$, then it must be part of the natural evolution of grey levels in the scene. The first two rules ensure that both the forward and backward differences agree that the central pixel represents some discontinuity. This would lessen the effect of false alarms at occlusion and uncovering since in that situation there would be a large error in one temporal direction only. The assumption of equal sign is only


true in general if the blotches tend to be bright white or dark black. If the blotches are random in grey scale, then this detector is likely to miss those occurrences. However this is not a common situation. Finally, in the presence of large motion, this detector cannot correctly separate moving regions from blotched areas for obvious reasons, despite the additional control measures implemented in [30]. The reader is referred to [30] and [31] for further details.

B. SDI

A very similar detector, using the spike detection index (SDI), was presented in [4]. This was motion compensated, however. It attempted to generate one number from which the presence or absence of a blotch could be inferred. The SDI is defined as follows:

[The equation defining the SDI in terms of the motion-compensated differences $e_1$ and $e_2$ is missing from the source text.]

where $t_1$ is a low threshold that overcomes problems when $e_1$ and $e_2$ tend to zero. The SDI is limited to values between 0 and 1 and the decision that a spike is present is taken when the SDI at the tested pixel is greater than some predefined threshold.

To understand how this approach works, assume the motion is purely translational. Now consider the following points, where $p$, $f$, $b$ are the present, forward, and backward pixel values, respectively, along a motion trajectory.

• Occlusion: $|p - f|$ will be large and $|p - b|$ will be zero. Therefore, SDI = 0.
• Uncovering: $|p - f|$ will be zero and $|p - b|$ will be large. Therefore, SDI = 0.
• Normal (trackable) motion: Both $|p - f|$ and $|p - b|$ will be zero. As both $p - f$ and $p - b$ tend to 0, the SDI is not well behaved. However, when this happens, it means the motion estimator has found a good match in both directions; hence, the current pel is not likely to be a scratch. Therefore in this case the SDI is set to zero.
• A blotch at the current pel but in the position of an object showing normal motion: Both $|p - f|$ and $|p - b|$ will be large and the same and so SDI = 1. They would be the same since $f$, $b$ would both be the same pels on the object at different times thus having the same intensity, provided the assumption of pure translation holds.
• A blotch at the current pel but in a position on an object image showing occlusion or uncovering: it is difficult to say how the SDI behaves here. The SDI will take some undefined value, not necessarily zero or one. This value would depend on the actual intensities of the occluding regions.
• A blotch at the current pel but $f$ and/or $b$ represent pels at which blotches have also occurred. Again the SDI is not defined, but if the blotches are fairly constant valued the index tends to 0.

For real sequences there must be some lower threshold $t_1$ for the forward and backward differences that will indicate that the match found is sufficiently good that the current pel is uncorrupted. This is necessary because in real sequences the motion is not translational and due to lighting effects the intensities of corresponding areas do not necessarily match. Further, there will be errors from the motion estimator.

The general rule is that when the SDI is 0 the current pel is uncorrupted; else when it is 1 the current pel is corrupted. In order to allow for the cases where occlusion and multiple corruptions along the motion trajectory are possible, there must be some threshold to make the decision. The threshold also allows some tolerance in the case of real sequences where motion is not purely translational and one has to deal with slight lighting changes not due to motion.

The SDI was found to be effective in most cases but relies on the motion estimator tracking the actual image and not being affected by blotches. This is an important issue since typical BM algorithms are not robust to artifacts of such a potentially large size. Further, the use of the lower threshold $t_1$ automatically excludes a number of discontinuities from consideration. The SDI also has quite a high false alarm rate in occluded and uncovered regions where large forward and backward differences are likely. Nevertheless it is more effective than the detector of (10), primarily because of its use of explicit motion compensation.

C. SDIa

There is scope for implementing a motion-compensated version of the detector given in (10). This is the first new (perhaps simpler) formulation to be considered in this paper. It flags a pixel as being distorted using a thresholding operation as follows:

$$e_b = I_n(\vec{r}) - I_{n-1}(\vec{r} + \vec{v}_{n,n-1}(\vec{r}))$$
$$e_f = I_n(\vec{r}) - I_{n+1}(\vec{r} + \vec{v}_{n,n+1}(\vec{r}))$$
$$D_{SDIa} = \begin{cases} 1, & \text{if } (|e_b| > e_t) \text{ AND } (|e_f| > e_t) \\ 0, & \text{otherwise.} \end{cases}$$

Here, $\vec{v}_{n,n-1}(\vec{r})$, $\vec{v}_{n,n+1}(\vec{r})$ are motion vectors mapping the pixel in frame $n$ into the previous and next frames. A pixel is therefore flagged as distorted when the forward and backward motion-compensated frame differences are both larger than some threshold $e_t$. This is the simplest detector for temporal discontinuities [6]; it does not involve the sign operations of the detector defined by (10). This is because it is possible for blotches to occur that violate the “sign” portion of the rule. The SDIa also has a direct association with the AR-based system that is discussed later in this article.

D. Detection Using Markov Random Fields

The use of the theory of MRF’s outlined in Section III-A enables a different definition of “no relation” to be used: the spatial nature of MRF models allows the information that dirt and scratches tend to occur in connected regions to be encoded into the detector. No significant attempt is made to model the image, this being too computationally intensive;


rather, the MRF model is applied to the blotch detection frame to introduce spatial continuity there. This encourages the detection of connected blotch regions.

In this section, $D$ denotes the detection frame between the two image frames, which is to be estimated, where $d(\vec{r}) = 1$ indicates the presence of a blotch at position $\vec{r}$ and $d(\vec{r}) = 0$ denotes no blotch. Bayes’ theorem gives

$$P(D = d \mid I = i) \propto P(I = i \mid D = d)\, P(D = d). \tag{11}$$

That is, the probability distribution of the detection frame is proportional to the product of the likelihood of observing the frame $I$, given the detection configuration, and the prior distribution on the detection frame, i.e., the model for the expected blotch generation process.

Thus, using $\phi(\cdot)$ to denote the potential function for the two element cliques used, $N$ for the four in-frame neighbors (the first-order neighborhood) and $i(\vec{n}_c)$ for the single motion-compensated neighbor used, the likelihood function is

$$P(I = i \mid D = d) = \frac{1}{Z} \exp\Big(-\sum_{\vec{r} \in S} \Big[\sum_{\vec{s} \in N_{\vec{r}}} \phi(i(\vec{r}) - i(\vec{s})) + \alpha (1 - d(\vec{r}))\, \phi(i(\vec{r}) - i(\vec{n}_c))\Big]\Big) \tag{12}$$

i.e., the probability of a pixel having a particular grey scale value is a function of the pixels in its spatiotemporal neighborhood, with the temporal neighbor being excluded if a discontinuity is indicated.

The prior on the detection frame is taken to be the nearest-neighbor Ising model [32] (the $n = 1$ neighborhood in Fig. 1), together with a term to bias the distribution toward no detections, to avoid (11) being optimized by a solution with scratches detected everywhere. This prior is successful in organizing the detection into connected regions as desired. The prior is

$$P(D = d) = \frac{1}{Z} \exp\Big(-\sum_{\vec{r} \in S} \big[-\beta_1 f(d(\vec{r})) + \beta_2\, \delta(1 - d(\vec{r}))\big]\Big) \tag{13}$$

where $f(d(\vec{r}))$ is the number of the four neighbors of $d(\vec{r})$ with the same value as $d(\vec{r})$, and $\delta()$ is the delta function.

Combining (12) and (13), using $\phi(\cdot) = (\cdot)^2$ as the potential function and dropping the term from (12) that is not a function of $d$ gives the a posteriori distribution as

$$P(D = d \mid I = i) = \frac{1}{Z} \exp\Big(-\sum_{\vec{r} \in S} \big[\alpha (1 - d(\vec{r}))\,(i(\vec{r}) - i(\vec{n}_c))^2 - \beta_1 f(d(\vec{r})) + \beta_2\, \delta(1 - d(\vec{r}))\big]\Big). \tag{14}$$

This is the joint distribution for the detection frame $D$. From this, the conditional distribution for $d(\vec{r})$ is calculated keeping all $d(\vec{s})$, $\vec{s} \ne \vec{r}$, constant at their current values.

These conditional distributions are used in the Gibbs sampler with annealing to find the maximum a posteriori (MAP) configuration of the detection frame, given the data and the model for blotches, as discussed in Section III-A. The MAP configuration is found for the detection frame between the current frame and the previous frame, and the current frame and the following frame. Regions detected in both temporal directions are consistent with the heuristic for blotches and are classified as such.

Parameter Estimation: The MRF detector is seen to depend on three parameters: $\alpha$, $\beta_1$, $\beta_2$. The value of $\beta_1$ controls the strength of the self-organization of the discontinuities, and Ripley [33] gives arguments for a value around two for a four nearest neighbor system, based on considerations of the conditional probability of a pixel when surrounded by three or four pixels of the same state. Arguments of a similar nature can be used to find $\alpha$ and $\beta_2$.

The last term in (14) “balances” the increase in conditional probability introducing a discontinuity that eliminates the effect of the first term, the motion-compensated frame difference term. To balance a difference of $e_1$ requires

$$\alpha e_1^2 \approx \beta_2. \tag{15}$$

Also, consider a single pixel error of magnitude $e_2$. For this to be detected requires

$$\exp(-\beta_2) > \exp(-\alpha e_2^2 + 4\beta_1). \tag{16}$$

Thus, by quantifying the heuristic that spatial discontinuities are indicators of blotches, the values of the parameters of the model to detect the blotches may be chosen consistently. This has been shown to result in a detector with a “soft” threshold [34], whereby the temporal discontinuity required for a blotch to be detected is reduced as the spatial extent of the blotch increases.

E. Detection Using the 3-D AR Model

Assume that the image is corrupted according to the following equation:

[Equation (17), relating the observed image to $I$ and $b$, is missing from the source text.]

where

$$b(\vec{r}) = \begin{cases} 0 & \text{with probability } (1 - P_B) \\ B & \text{with probability } P_B. \end{cases} \tag{18}$$

Here, $B$ is a randomly distributed grey level representing a blotch or impulse, and it occurs at random intervals in the frame with probability $P_B$. As in the previous section, it is required to detect the likely occurrences of $b(\vec{r}) \ne 0$ in order to isolate the distortion for further examination. The key to the solution is to make the assumption that the undistorted image $I(x, y, n)$ obeys the AR model whereas $b(\vec{r})$ does not. This approach was taken by Vaseghi et al. [35], [36] in addressing a similar problem in degraded audio.

Suppose the model coefficients for a particular image sequence were known. The prediction error could then be
(14) the local conditional distributions are easily formed by considered to be noise with some correlation structure [29].
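Looking back at the MRF detector defined by (14)-(16), the parameter choice and the "soft" threshold can be illustrated with a small sketch. It is not the optimization used in this work: it swaps in a simple deterministic update (iterated conditional modes) for the MAP search, uses a single temporal direction only, and counts each neighbor agreement once so that the local energies match the $4\beta_1$ term in (16). The names `mrf_parameters`, `detect_blotches`, and `margin`, and the test values $e_1 = 20$, $e_2 = 30$, are assumptions made for illustration.

```python
def mrf_parameters(e1, e2, beta1=2.0, margin=1.05):
    """Choose alpha and beta2 from chosen e1 < e2 (hypothetical helper).

    (15): beta2 = alpha * e1**2
    (16): beta2 < alpha * e2**2 - 4 * beta1
    Substituting (15) into (16) gives alpha > 4*beta1 / (e2**2 - e1**2);
    `margin` (> 1) is an assumed safety factor above that bound.
    """
    if e2 <= e1:
        raise ValueError("e2 must exceed e1")
    alpha = margin * 4.0 * beta1 / (e2 ** 2 - e1 ** 2)
    beta2 = alpha * e1 ** 2
    return alpha, beta2


def detect_blotches(frame, mc_frame, alpha, beta1, beta2, n_iter=5):
    """Greedy (ICM-style) minimization of the energy in (14), one direction."""
    h, w = len(frame), len(frame[0])
    diff2 = [[(frame[y][x] - mc_frame[y][x]) ** 2 for x in range(w)]
             for y in range(h)]
    # Assumed initialization: decide from the temporal term alone.
    d = [[1 if alpha * diff2[y][x] > beta2 else 0 for x in range(w)]
         for y in range(h)]
    for _ in range(n_iter):
        for y in range(h):
            for x in range(w):
                nbrs = [d[j][i]
                        for j, i in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                        if 0 <= j < h and 0 <= i < w]
                n1 = sum(nbrs)            # neighbors flagged as blotch
                n0 = len(nbrs) - n1       # neighbors flagged as clean
                energy_clean = alpha * diff2[y][x] - beta1 * n0
                energy_blotch = beta2 - beta1 * n1
                d[y][x] = 1 if energy_blotch < energy_clean else 0
    return d


# Synthetic test: flat frames with a 3 x 3 blotch and two isolated pixels.
alpha, beta2 = mrf_parameters(e1=20.0, e2=30.0)
mc = [[100.0] * 12 for _ in range(12)]
cur = [row[:] for row in mc]
for y in range(2, 5):
    for x in range(2, 5):
        cur[y][x] += 25.0      # extended blotch, difference between e1 and e2
cur[8][8] += 25.0              # isolated pixel, same difference
cur[10][10] += 30.0            # isolated pixel, difference equal to e2
mask = detect_blotches(cur, mc, alpha, beta1=2.0, beta2=beta2)
```

In this synthetic case the 3 x 3 region with a difference of 25 is kept, while an isolated pixel with the same difference is rejected and only the isolated pixel reaching $e_2$ survives. That is the behavior described above: the temporal discontinuity needed for detection falls as the spatial extent grows.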

If $g(\vec{r})$ was filtered with the model prediction filter, the output could be written as below:

$$\begin{aligned}
\varepsilon(\vec{r}) &= g(\vec{r}) - \sum_{k=1}^{N} a_k\, g(\vec{r} + \vec{q}_k) \\
&= I(\vec{r}) + b(\vec{r}) - \sum_{k=1}^{N} a_k\, I(\vec{r} + \vec{q}_k) - \sum_{k=1}^{N} a_k\, b(\vec{r} + \vec{q}_k) \\
&= e(\vec{r}) + b(\vec{r}) - \sum_{k=1}^{N} a_k\, b(\vec{r} + \vec{q}_k).
\end{aligned} \tag{19}$$

Equation (19) shows the undistorted samples in the degraded signal are reduced to the scale of the error or residual sequence. The distorted samples can be reduced in amplitude if there are other distorted samples nearby but this does not occur often. Therefore, thresholding the prediction error is a useful method of detecting local distortion.

Parameter Estimation: For a real sequence, the model coefficients are unknown. In this work, they are estimated from the degraded sequence using the normal equations. A motion-compensated volume of data is extracted and then "centralized" by subtracting a linear 3-D trend following the 2-D work of Veldhuis [37]. The coefficients are then estimated using the previously calculated displacements and the normal equations. The choice of the spatial extent of the volume used is important. If the size of a block is comparable to the size of a particular blotch, then the coefficients are heavily biased by that distortion and the resulting detection efficiency is poor. This effect is enhanced when the model has spatial support in the current frame since the model support is then more likely to contain corrupted sites.³ In the case of dirt and sparkle, because the distortion occupies a large spatial area, a model with spatial support in the current frame would only give large prediction errors at the edges of a blotch. Inside the blotch the residual would be reduced in magnitude. In practice, models with no support in the current frame are more effective since the distortion is local (impulsive) in time but not necessarily as local in space.

³ In most real degraded sequences, blotches do not occur at the same spatial position in consecutive frames.

There is the question of how the current block being modeled is assigned motion vectors to yield the 3-D data volume required. There are two approaches. One is to use the same block size as used by the motion estimator, which would be consistent with previous assumptions, then compensate the entire block using the one vector. The other is to compensate each pixel in that block using interpolated vectors. This work uses the former technique primarily because of the lower computation required.

It becomes helpful to describe AR predictors by the number of pixels of support in each frame. There is no evidence for asymmetric supports so a 9:0 model refers to a model with nine pixels in a 3 × 3 square in the previous frame acting as support. A 9:0:9 model has twice that support, nine pixels in each of the previous and next frames.

Implementation: There are two types of model-based detection systems that can be considered. The first thresholds the prediction error given a single model. Therefore, an impulse is detected when

$$[\varepsilon(\vec{r})]^2 \geq t_e \tag{20}$$

where $t_e$ is some threshold.

The other detection system uses two temporally different models: a forward predictor N:0 and a backward predictor 0:N. The two prediction error fields, $\varepsilon_1$ and $\varepsilon_2$, are then thresholded to yield a detected distortion when

$$\big([\varepsilon_1(\vec{r})]^2 \geq t_e\big) \text{ AND } \big([\varepsilon_2(\vec{r})]^2 \geq t_e\big). \tag{21}$$

Therefore, a blotch is located when both predictors agree a match cannot be found in either of the two frames. Such a system is denoted by N:0/0:N.

In practice the causal/anti-causal detector is better than the noncausal approach. This is due to the better ability of the former technique to account for occlusion and uncovering by seeking an agreement between two directed predictions. Only the N:0/0:N system is considered here.

Note that the SDIa detector is the same as the 1:0/0:1 AR detector except that the two AR coefficients are set to 1.0. That detector is true to the model being used for motion estimation via BM. It follows from the idea that every image is just a rearrangement of image patches in the past or the next frame. Hence, pixels that cannot be found (to some tolerance) in either of the two surrounding frames must not be part of the sequence.

F. Computational Load

In this work, multiresolution block matching was used to estimate motion. At each frame, motion must be estimated in both the forward and backward temporal directions. The computation this requires, in all cases, is far in excess of that required by the detectors. Also, the detectors do not involve the motion estimator explicitly; therefore, the motion estimator load is not considered here.

All arithmetic operations (e.g., +, -, ABS, <) were counted as costing one operation. The exponential function evaluation was taken as costing 20 operations and inversion of an $N \times N$ matrix was assumed to be an $N^3$ process. Estimates for the number of operations per pixel for the detectors are as follows: DBBC = 11, SDI = 11, SDIa = 7, 3DAR = 140 (assuming a block size of 8 × 8 pixels and a 9:0 model) and MRF = 50 per iteration. Only a small number of iterations (typically five) were needed in the following experiments as the temporal term in the detector (14) usually dominates over the spatial terms.

V. RESULTS AND DISCUSSION

In order to objectively assess the performance of the various detectors just discussed, the sequence WESTERN1 (60 frames of 256 × 256) was artificially corrupted with blotches of varying size and shape and random grey level. The method of corruption is outlined in Appendix A. The exact method of corruption is not important; it is sufficient to recognize that areas of missing data were introduced into each frame in some random manner so they represented temporal discontinuities. The corruption was quite realistic in that the size and shape
of the blotches produced were not regular. Typical degraded frames (48-50) are shown as Figs. 6-8. No effort was made to ensure that blotches did not occur at the same spatial location in consecutive frames; indeed this was the case in some frames. The experiment thus represents worst case results in some sense, since multiple occurrences of blotches in the same position in consecutive frames are indeed a very rare event in practice. Figs. 9-13 show, respectively, detection results when the SDIa, SDI, MRF, 1:0/0:1 (known⁴), and 1:0/0:1 systems are applied to frame 49.
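The corruption procedure outlined in Appendix A (a biased Ising field evolved for a few Gibbs sweeps, then filled with random grey levels) can be sketched roughly as below. This is only an illustration under stated assumptions: the local conditional is the textbook Ising form with a bias toward the state -1, not necessarily the exact parametrization used in this work, and the names `ising_blotch_mask`, `corrupt`, `beta`, `eta`, and `n_sweeps`, together with their default values, are hypothetical.

```python
import math
import random


def ising_blotch_mask(h, w, beta=0.8, eta=0.6, n_sweeps=6, seed=0):
    """Run a few Gibbs sweeps on a biased Ising field; return 1 where z = +1.

    Assumed local conditional: P(z = s | neighbors) is proportional to
    exp(beta * s * sum_of_neighbors + eta * [s == -1]).
    """
    rng = random.Random(seed)
    # Random initialization with roughly equal numbers of each state.
    z = [[rng.choice((-1, 1)) for _ in range(w)] for _ in range(h)]
    for _ in range(n_sweeps):
        for y in range(h):
            for x in range(w):
                s = sum(z[j][i]
                        for j, i in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                        if 0 <= j < h and 0 <= i < w)
                # Log-odds of z = +1 against z = -1 under the assumed model.
                log_odds = 2.0 * beta * s - eta
                p_plus = 1.0 / (1.0 + math.exp(-log_odds))
                z[y][x] = 1 if rng.random() < p_plus else -1
    return [[1 if v == 1 else 0 for v in row] for row in z]


def corrupt(frame, mask, seed=0):
    """Fill each 4-connected region of the mask with one random grey level."""
    rng = random.Random(seed)
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                level = rng.randint(0, 255)   # uniform grey level in [0, 255]
                stack = [(y, x)]
                seen[y][x] = True
                while stack:                   # flood fill of one region
                    j, i = stack.pop()
                    out[j][i] = level          # replaces the original data
                    for jj, ii in ((j - 1, i), (j + 1, i), (j, i - 1), (j, i + 1)):
                        if (0 <= jj < h and 0 <= ii < w
                                and mask[jj][ii] and not seen[jj][ii]):
                            seen[jj][ii] = True
                            stack.append((jj, ii))
    return out


clean = [[128] * 32 for _ in range(32)]
mask = ising_blotch_mask(32, 32)
degraded = corrupt(clean, mask)
```

How many sweeps to run and how strong a bias to apply decide how sparse and how large the regions are; the defaults above are arbitrary and would need tuning to resemble real dirt and sparkle.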
⁴ Using parameters estimated from the clean sequence.

Fig. 4. Performance of detectors on 60 frames of the sequence: WESTERN. (Plot; horizontal axis: probability of false alarm, logarithmic scale from 1E-05 to 1.)

Motion estimates were made from the degraded frames. A four-level motion estimation process was used as outlined in the description previously. The search space used for the full search block matching process was ±4 pixels at each level. The generating kernel for the image pyramid was a spatially truncated Gaussian function with variance of 1.0 and a kernel size of 9 × 9 pixels. A threshold of 10.0 on the MAE was used to detect motion, with a noise ratio [12] of 1.2 at the original resolution level and 1.0 at all other levels. A block size of 9 × 9 was used at the 256 × 256 level, and 5 × 5 otherwise. In a sense, these and other details about the BM parameters used are not important. It is only necessary to note the results for each detector were generated using the same motion vector estimates.

Fig. 4 shows a plot of the correct detection rate versus false alarm rate for each detector. Since the original data is available it is possible to estimate a true set of AR model coefficients for a particular AR model configuration, and this was done for both a 1:0/0:1 and a 5:0/0:5 system (using a support as in Fig. 3). The motion estimates used for this artificial situation were also gained from the degraded sequence. The curves for the SDIa and AR systems were generated by making measurements of accuracy for thresholds that vary from 0 to 2000 in steps of 100. The SDI curve was generated similarly with thresholds varying from 0 to 1.0 in steps of 0.05 ($t_1 = 10.0$). The point on the lines nearest the top right hand corner of the graphs corresponds to the smallest threshold. The MRF detector characteristic was found by using different values of $e_1$ and $e_2$ ($14 \leq e_1 \leq 34$ and $16 \leq e_2 \leq 56$) in estimating the parameters $\alpha$ and $\beta_2$ via (15) and (16).

It can be seen that the SDIa detector performs very well overall, maintaining greater than 80% correct detection for less than 1% false alarm rate. Surprisingly, the AR model based detector systems do not perform well when the coefficients are estimated from the degraded data (the real curves). The MRF approach gives slightly better results than the SDIa detector in a real situation. The SDI detector does not perform as well as the SDIa or MRF systems and is more restrictive in its useful operating range.

Considering the AR system, more spatial support seems to yield a worse performance. This may seem to be counterintuitive, but the curves obtained when the coefficients are estimated from the original, clean data (the known curves) show a much better performance, and hence provide an explanation. Since these curves were generated using the same motion estimates from the degraded images, the worse performance in the practical case is due to the AR coefficient estimation process being biased by the degradation.

The energy of the blotches is sometimes so large in proportion to the rest of the image patch being modeled that it causes an adverse bias in the estimation process. This leads to an increase in the false alarm rate and a decrease in the correct detection rate. When the coefficient estimation process operates on clean (known) data, the performance is much better than the SDIa or MRF systems. In this case, increasing the spatiotemporal support does help the situation and the 5:0/0:5 (known) system performs better than the 1:0/0:1 (known) system. In the real case, increasing the support for the system worsens the bias because the block sizes used for estimation (9 × 9) are small, and hence the confidence with which coefficients can be estimated is sensitive to the number of missing pixels and the number of correlation terms that must be measured. Such small block sizes are forced because of the spatial nonhomogeneity of images.

It is interesting to note that in a real situation, the AR model-based detector would miss blotches with a "low" intensity difference with preceding and next frames. The reason is the coefficients can adjust to account for low levels of grey scale temporal discontinuities, hence yielding a low residual power.

The SDI system has a limited range of activity because of the low threshold that must be used to "filter" pixels with a good match in both the previous and next frames. Furthermore, the SDI ratio is not well defined in regions of occlusion and uncovering, especially if a blotch is present. The false alarm rate is seen to be higher than either the SDIa or MRF systems. Note, however, that one could choose thresholds so that the SDI performance approaches that of the SDIa. This shows that it is more difficult to use the SDI effectively.

The MRF system performs slightly better than the SDIa because it is able to incorporate the idea of spatial smoothness in the form of the blotch. Therefore, it will flag a pixel as being distorted not only if it is at the site of a temporal discontinuity but also if it is connected to a pixel of the same grey level that was at the site of a discontinuity. It will therefore be able to detect the marginally contrasted blotches primarily because of spatial connectivity, whereas
the SDIa system would be unable to detect a blotch if the temporal differences are too low (i.e., poorly contrasted). That this only produces a small improvement in performance is understandable as the additional blotches found are those of low grey-scale difference, which will be only a small proportion of the blotches. In a frame when this is significant, a larger difference in operating characteristic is observed (see Figs. 5-11).

Fig. 5. Performance of detectors on frame 49 of the sequence: WESTERN. (Plot; horizontal axis: probability of false alarm, logarithmic scale.)

Fig. 6. Degraded frame 48 of WESTERN.

Fig. 7. Degraded frame 49 of WESTERN.

Fig. 8. Degraded frame 50 of WESTERN.

Figs. 9-13 show, respectively, detection results when the SDIa, SDI, MRF, 1:0/0:1 (known), and 1:0/0:1 systems are applied to frame 49. Each figure shows the result obtained when the relevant parameters or thresholds are set so the probability of correct detection is 90%; hence they represent a horizontal slice across Fig. 5 at a probability of correct detection of 0.90. They illustrate the points made earlier. Red represents a missed blotched pixel, green represents correctly detected blotched pixels, and brown represents false alarms. Note how none of the systems (SDIa, SDI, or AR) detect the lightly contrasted blotch on the shoulder well. Note also that the increased false alarm rate shown in Fig. 13 occurs around the blotches and is due primarily to the influence of these discontinuities on the coefficient estimation process.

The false alarm rates for each of the detectors are: SDIa, 4.4%; SDI, 4.8%; MRF, 0.28%; AR 1:0/0:1 (known), 0.23%; AR 1:0/0:1 (estimated), 1.3%. All the detectors flag the area highlighted in Fig. 7 as a false alarm region. The main area of improvement of the MRF detector and the AR 1:0/0:1 (known) detector is the reduction in the number of single pixel false alarms flagged. These are most noticeable on the actor's shoulder, hair, and arm. The false alarm rate for the AR detector with estimated coefficients is dominated by the effect of the bias in the coefficient estimation process.
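The effect of a blotch on coefficient estimation can be reproduced on a toy block. The sketch below applies the two-sided test of (21) with a 1:0/0:1 support, once with both coefficients fixed at 1.0 (the SDIa case) and once with a single coefficient per direction estimated by least squares from the degraded block itself. It makes several simplifying assumptions that are not taken from this work: the scene is static so no motion compensation is needed, the block is "centralized" by subtracting its mean rather than a linear 3-D trend, and the function names and the threshold value of 900 are chosen for illustration only.

```python
def estimate_coefficient(cur, ref):
    """Least-squares estimate of one AR coefficient after mean removal.

    For a single coefficient the normal equations reduce to a ratio of a
    cross-correlation to an autocorrelation.
    """
    n = len(cur)
    mean_cur, mean_ref = sum(cur) / n, sum(ref) / n
    num = sum((c - mean_cur) * (r - mean_ref) for c, r in zip(cur, ref))
    den = sum((r - mean_ref) ** 2 for r in ref)
    return num / den, mean_cur, mean_ref


def residual(cur, ref, a=None):
    """Prediction error of cur from ref; estimate a when it is not given."""
    if a is None:
        a, mean_cur, mean_ref = estimate_coefficient(cur, ref)
        return [(c - mean_cur) - a * (r - mean_ref) for c, r in zip(cur, ref)]
    return [c - a * r for c, r in zip(cur, ref)]


def detect(cur, prev, nxt, t_e, a=None):
    """Flag a pixel when both directed squared errors reach t_e, as in (21)."""
    e1 = residual(cur, prev, a)    # forward predictor (from previous frame)
    e2 = residual(cur, nxt, a)     # backward predictor (from next frame)
    return [int(x * x >= t_e and y * y >= t_e) for x, y in zip(e1, e2)]


def rates(flags, truth):
    """Return (correct detection rate, false alarm rate)."""
    hits = sum(1 for f, t in zip(flags, truth) if f and t)
    false_alarms = sum(1 for f, t in zip(flags, truth) if f and not t)
    n_blotch = sum(truth)
    return hits / n_blotch, false_alarms / (len(truth) - n_blotch)


# 8 x 8 block (row-major) with a horizontal ramp; static scene.
prev = [100.0 + 10.0 * (i % 8) for i in range(64)]
nxt = prev[:]
# 4 x 4 blotch at rows 2..5, columns 4..7, replaced by grey level 255.
truth = [1 if 2 <= i // 8 <= 5 and i % 8 >= 4 else 0 for i in range(64)]
cur = [255.0 if t else p for p, t in zip(prev, truth)]

a_est = estimate_coefficient(cur, prev)[0]             # clean value would be 1.0
sdia_rates = rates(detect(cur, prev, nxt, 900.0, a=1.0), truth)
ar_rates = rates(detect(cur, prev, nxt, 900.0), truth)
```

Here the blotch covers a quarter of the block, so the estimated coefficient moves well away from its clean value of 1.0. As a result some blotched pixels fall below the threshold and some clean pixels rise above it, whereas the fixed-coefficient detector is unaffected on this block.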

Fig. 9. Detection using SDIa on frame 49.

Fig. 10. Detection using SDI on frame 49.

Fig. 11. Detection using the MRF on frame 49.

Fig. 12. Detection using known AR parameters, 1:0/0:1.

In all Figs. 9-13, there is an undetected region (red) on the shoulder of the main figure. This region can be seen to be only slightly contrasted in the degraded frame in Fig. 7. It is notable that all the detectors miss this region at this detection/false alarm rate, and it is because the area is of low contrast with the rest of the image; it is, in fact, difficult to see.

Overall then, the SDIa detector is the best in terms of the compromise it strikes between computation and accuracy. The MRF approach is the most accurate, however, and performs extremely well in the real situation where the AR based approach fails because of poor estimation of model coefficients. It is possible to use optimal weighted estimation of coefficients to alleviate the difficulties with the use of the AR approach as in [38]. Of course the computational complexity would then be increased. In cases where high fidelity of the reconstruction is required, for example still frame viewing, the MRF detector is most suitable.

A. Errors in Motion Estimation

It is clear that motion estimation errors would adversely affect the performance of all these detectors, more so the purely temporal SDI and SDIa systems. In the interest of brevity then, we do not include results when the motion estimates come from the clean original, but choose instead to present Figs. 6-13.

Figs. 6-8 show three frames, 48-50. A red block highlights a region in frame 49 that has been uncovered from frame 48 and partially occluded in frame 50. As stated earlier, Figs.
9-13 show, respectively, detection results when the SDIa, SDI, MRF, 1:0/0:1 (known), and 1:0/0:1 systems are applied to frame 49. Note how the white coat lining of the central figure is a source of false alarms in all cases. This effectively demonstrates a fundamental limit in detection capability: areas of fast motion can represent temporal discontinuities as regions are rapidly uncovered and occluded. It is in these areas it would be advantageous to use more than three frames in the motion estimator (in the manner of [30] and [40]-[42], for instance) to allow matches to be found when the material is again uncovered.

This problem is unavoidable and therefore, in the design of an interpolator for this "missing" data, robust estimators must be found; i.e., interpolators that can reconstruct large, apparently missing regions without distortion by using spatial continuity when temporal smoothness is absent. This is discussed in [43].

Fig. 13. Detection using estimated AR parameters, 1:0/0:1.

Fig. 14. Frame from actual degraded sequence.

VI. REAL DEGRADATION

Figs. 14 and 15 show results from the application of the SDIa and MRF systems to the problem of detecting the real distortion in a motion picture frame. For brevity, only the frame concerned is shown here; the motion in the scene consists of a vertical pan of four to five pixels per frame. The background consists of out of focus trees that sway in and out of shadow. The motion is typical of motion pictures; the objects in the scene move with velocities varying from small (foreground) to very large (background).

The main distortion is boxed in red in Fig. 14. The results for the SDIa and MRF systems are superimposed on the image in Fig. 15. Red pixels are those flagged as distorted by both detectors. Bright white pixels are those flagged by the SDIa process but not by the MRF process; finally, green pixels are those flagged by the MRF process and not by the SDIa process. The brightness of the image in Fig. 15 has been reduced so the color of the flagged pixels can be more easily seen.

Fig. 15. Detection using SDIa and MRF systems. Red, both; bright white, SDIa; green, MRF.

As expected, the MRF system detects more of the large blotch due to spatial connectivity. The SDIa is unable to detect all of it because parts of the blotch match well with parts of the head in the next and previous frames. The SDIa has more false alarms in the background but performs better on the daisy (with respect to false alarms) again because of the MRF tendency to "collect" pixels together. Both detectors have problems along the moving arm of the figure because the integer accurate motion estimation cannot properly compensate for the fractional motion here, and the edge of the arm is highly contrasted with the dark suit. Nevertheless, both detection systems detect the distortions satisfactorily.

It is useful to note that by detecting the regions of suspected distortion, the computation necessary for the next stage of
reconstruction is reduced since the number of pixels that must be considered is a small subset of the entire frame. The rate of "suspicion" for the systems in this case is 1.2 and 0.88% for the SDIa and MRF, respectively.

VII. CONCLUSION

The problem of detecting blotches or missing data of this kind in image sequences is well posed in a temporal sense. The success of the purely temporal SDIa detector shows the problem can be solved just by observing temporal information. The results have indicated that incorporating more spatial information does have some benefit, but to exploit the full potential gains it is necessary to reduce the influence of the degradation on both the motion estimator and the coefficient estimator (with respect to the AR systems). It would be an advantage if it were possible to estimate the MRF hyperparameters from the image.

The paper has discussed the problem of detecting blotches in degraded motion pictures, which pertains to the more general problem of missing data detection. Identification of the missing data regions allows efficient algorithms to be developed to interpolate these missing regions. This is discussed in [43].

APPENDIX A
GENERATION OF ARTIFICIAL BLOTCHES

Figs. 6-8 show frames from the sequence degraded with artificial blotches. These artificial blotches are a good visual match with blotches observed on real degraded sequences. The method of generation was as follows.

The Ising model is an MRF model defined on two states, with a conditional probability structure defined such that the probability of a pixel being in a given state is proportional to the number of its neighbors in that state. The joint probability of the field is thus, where $z_i \in \{-1, +1\}$

Samples from this model have approximately equal number of pixels in each state. If a term of the form $n\,\delta(z_i + 1)$ is introduced into (22), this will bias the field toward the state $z_i = -1$. Iterating the Gibbs sampler on this biased distribution will result, from an initialization of equal numbers of each state, in a clustering of the pixels in each state, and a gradual reduction of the number of pixels in the state $z_i = 1$. If the evolution of the field is stopped before a uniform picture is reached, then it will consist of small connected regions of state $z_i = 1$, randomly distributed across the frame. As can be seen from the figures, these regions are good simulations of the kind of amorphous distortion found in practice. These model the location of the blotches very well, as can be seen from the figures.

Finally, isolated blotches were colored uniformly with a value chosen randomly from [0, 255] and then the original frames were corrupted by inserting the colored areas into the frames, replacing the original information.

REFERENCES

[1] G. R. Arce and E. Malaret, "Motion preserving ranked-order filters for image sequence processing," in Proc. IEEE Int. Conf. Circuits Syst., 1989, pp. 983-986.
[2] G. R. Arce, "Multistage order statistic filters for image sequence processing," IEEE Trans. Signal Processing, vol. 39, pp. 1146-1161, May 1991.
[3] Bilge Alp, Petri Haavisto, Tiina Jarske, Kai Oistamo, and Yrjo Neuvo, "Median-based algorithms for image sequence processing," SPIE Visual Commun. Image Processing, 1990, pp. 122-133.
[4] A. C. Kokaram and P. J. W. Rayner, "A system for the removal of impulsive noise in image sequences," in SPIE Visual Commun. Image Processing, Nov. 1992, pp. 322-331.
[5] A. C. Kokaram and P. J. W. Rayner, "Removal of impulsive noise in image sequences," in Singapore Int. Conf. Image Processing, Sept. 1992, pp. 629-633.
[6] A. C. Kokaram, "Motion picture restoration," Ph.D. thesis, Cambridge Univ., UK, May 1993.
[7] S. Efstratiadis and A. Katsagellos, "A model-based, pel-recursive motion estimation algorithm," in Proc. IEEE ICASSP, 1990, pp. 1973-1976.
[8] J. Konrad and E. Dubois, "Bayesian estimation of motion vector fields," IEEE Trans. Patt. Anal. Machine Intell., vol. 14, no. 9, Sept. 1992.
[9] I. M. Abdelquader, S. A. Rajala, W. E. Snyder, and G. L. Bilbro, "Energy minimization approach to motion estimation," Signal Processing, vol. 28, pp. 291-309, 1992.
[10] M. Bierling, "Displacement estimation by hierarchical block matching," in SPIE VCIP, 1988, pp. 942-951.
[11] M. Ghanbari, "The cross-search algorithm for motion estimation," IEEE Trans. Commun., vol. 38, pp. 950-953, July 1990.
[12] J. Boyce, "Noise reduction of image sequences using adaptive motion compensated frame averaging," in IEEE ICASSP, vol. 3, 1992, pp. 461-464.
[13] W. Enkelmann, "Investigations of multigrid algorithms for the estimation of optical flow fields in image sequences," Comput. Vision Graph. Image Processing, vol. 43, pp. 150-177, 1988.
[14] H. Nagel, "Recent advances in image sequence analysis," in Premiere Coloque Image Traitment, Synthese, Technologie et Applications, pp. 545-558, May 1984.
[15] H. Nagel and W. Enkelmann, "An investigation of smoothness constraints for the estimation of displacement vector field from image sequences," IEEE Trans. Patt. Anal. Machine Intell., vol. PAMI-8, pp. 565-592, Sept. 1986.
[16] S. Fogel, "The estimation of velocity vector fields from time-varying image sequences," Comput. Vision Graph. Image Processing: Image Understanding, vol. 53, pp. 253-287, May 1991.
[17] J. Robbins and A. Netravali, "Image sequence processing and dynamic scene analysis," in Recursive Motion Compensation: A Review. Berlin, Vienna, New York: Springer-Verlag, 1983, pp. 76-103.
[18] B. Schunck, "Image flow: fundamentals and future research," in IEEE ICASSP, 1985, pp. 560-571.
[19] J. Riveros and K. Jabbour, "Review of motion analysis techniques," Proc. IEEE, vol. 136, no. 397-404, Dec. 1989.
[20] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Trans. Patt. Anal. Machine Intell., vol. PAMI-6, pp. 721-741, Nov. 1984.
[21] D. Geman, "Random fields and inverse problems in imaging," in Lecture Notes in Mathematics, volume 1427. Berlin, Vienna, New York: Springer-Verlag, 1990, pp. 113-193.
[22] J. Besag, "Spatial interaction and the statistical analysis of lattice systems," J. Royal Statist. Soc. B, vol. 36, pp. 192-326, 1974.
[23] H. P. Hiriyannaiah, G. L. Bilbro, and W. E. Snyder, "Restoration of piecewise-constant images by mean-field annealing," J. Opt. Soc. Am., pp. 1901-1912, 1989.
[24] S. Kirkpatrick, C. Gelatt, and M. Vecci, "Optimization by simulated annealing," Science, vol. 220, pp. 671-680, 1983.
[25] S. Geman, D. E. McClure, and D. Geman, "A nonlinear filter for film restoration and other problems in image restoration," CVGIP: Graphical Models Image Processing, vol. 54, no. 4, pp. 281-289, July 1992.
[26] G. L. Bilbro, W. E. Snyder, and R. C. Mann, "Mean-field approximation minimizes relative entropy," J. Opt. Soc. Am., vol. 8, no. 2, pp. 290-294, Feb. 1991.
[27] P. Strobach, "Quadtree-structured linear prediction models for image sequence processing," IEEE Trans. Patt. Anal. Machine Intell., vol. 11, pp. 742-747, July 1989.
[28] A. Rosenfeld, Ed., Univariate and Multivariate Random Fields for Images. New York: Academic, 1981, pp. 245-258.
[29] A. K. Jain, Fundamentals of Digital Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1989.
[30] R. Storey, "Electronic detection and concealment of film dirt," UK Patent Specification no. 2139039, 1984.
[31] R. Storey, "Electronic detection and concealment of film dirt," SMPTE J., pp. 642-647, June 1985.
[32] D. Chandler, Introduction to Modern Statistical Mechanics. New York: Oxford University Press, 1987.
[33] B. D. Ripley, Statistical Inference for Spatial Processes. Cambridge, UK: Cambridge University Press, 1988.
[34] R. D. Morris and W. J. Fitzgerald, "Replacement noise in image sequences - Detection and correction by motion field segmentation," in Proc. ICASSP, vol. 5, 1994, pp. V245-248.
[35] S. V. Vaseghi and P. J. W. Rayner, "Detection and suppression of impulsive noise in speech communication systems," Proc. IEEE, vol. 137, pp. 38-46, 1990.
[36] S. V. Vaseghi, "Algorithms for the restoration of archived gramophone recordings," Ph.D. thesis, Cambridge Univ., UK, 1988.
[37] R. Veldhuis, Restoration of Lost Samples in Digital Signals. Englewood Cliffs, NJ: Prentice-Hall, 1980.
[38] E. DiClaudio, G. Orlandi, F. Piazza, and A. Uncini, "Optimal weighted LS AR estimation in presence of impulsive noise," in IEEE ICASSP, vol. E3.8, 1991, pp. 3149-3152.
[39] M. Sezan, M. Ozkan, and S. Fogel, "Temporally adaptive filtering of noisy image sequences using a robust motion estimation algorithm," in Proc. IEEE ICASSP, vol. 3, 1991, pp. 2429-2431.
[40] M. Ozkan, M. I. Sezan, and A. M. Tekalp, "Adaptive motion-compensated filtering of noisy image sequences," IEEE Trans. Circuits Syst. Video Technol., pp. 277-290, Aug. 1993.
[41] M. Ozkan, A. Erdem, M. Sezan, and A. Tekalp, "Efficient multiframe Wiener restoration of blurred and noisy image sequences," IEEE Trans. Image Processing, vol. 1, pp. 453-476, 1992.
[42] A. Erdem, M. Sezan, and M. Ozkan, "Motion-compensated multiframe Wiener restoration of blurred and noisy image sequences," in IEEE ICASSP, vol. 3, pp. 293-296, 1992.
[43] A. C. Kokaram, R. D. Morris, W. J. Fitzgerald, and P. J. W. Rayner, "Interpolation of missing data in image sequences," IEEE Trans. Image Processing, this issue, pp. 1509-1519.

Robin D. Morris was born in Bury, Lancashire, UK, on December 15, 1969. He received the B.A. degree in electrical and information sciences from Cambridge University Engineering Department in June 1991.
Since then, he has qualified for the M.A. degree and is working toward the Ph.D. degree with the Signal Processing Laboratory of the same department. His research has been in the area of Bayesian inference and statistical signal processing, with applications in the area of Markov random fields applied to motion picture restoration.
In October of 1994, Mr. Morris was elected to a Junior Research Fellowship at Trinity College, Cambridge.

William J. Fitzgerald received the B.Sc. degree in physics in 1971, the M.Sc. degree in solid state physics in 1972, and the Ph.D. degree in 1974 from the University of Birmingham, UK.
He worked for six years at the Institut Laue Langevin in Grenoble, France, as a Research Scientist working on the theory of neutron scattering from condensed matter. He spent a year teaching physics at Trinity College, Dublin, and then became Associate Professor of Physics at the ETH in Zurich, working on diffuse scattering of x-rays from metallic systems as well as teaching. After several years working in industrial research in Cambridge, he then took up his present position as University Lecturer in the Engineering Department, where he teaches and conducts research in signal processing.
Dr. Fitzgerald is a Fellow of Christ's College, and his research interests are concerned with Bayesian inference and model-based signal processing.

Anil C. Kokaram (S’91-M’92) was born in Sangre Grande, Trinidad and Tobago, on June 19, 1967. He received the B.A. degree in electrical and information engineering sciences from the Cambridge University Engineering Department, UK, in 1989. He went on to receive the M.A. and Ph.D. degrees from the Signal Processing Group of the Cambridge University Engineering Department in 1993, having worked principally on motion picture restoration.
Since 1993, he has been working on other problems in archived motion picture film in his capacity of Research Associate in the Signal Processing Group. His interests encompass image sequence processing in general. He is currently engaged in such areas as image sequence noise reduction, missing data reconstruction, and motion estimation. He has applied this work in many different environments including scanning electron microscopes and particle image velocimetry.

Peter J. W. Rayner received the M.A. degree from Cambridge University, UK, in 1968 and the Ph.D. degree from Aston University in 1969.
Since 1968, he has been with the Department of Engineering at Cambridge University and is Head of the Signal Processing and Communications Research Group. He teaches courses in random signal theory, digital signal processing, image processing, and communication systems. His current research interests include image sequence restoration, audio restoration, nonlinear estimation and detection, and time series modeling and classification. In 1990, he was appointed to an ad-hominem Readership in Information Engineering.
