1985 Threshold Selection Based On A Simple Image Statistic
AND
J. FÖGLEIN
Institute for Coordination of Computer Techniques, Budapest, Hungary
Received June 19, 1984; accepted December 4, 1984
The problem of automatic threshold selection is considered. After a brief review of available
techniques, a novel method is proposed. It is based on image statistics which can be computed
without histogramming the grey level values of the image. A detailed analysis of the properties
of the algorithm is then carried out. The effectiveness of the method is shown on a number of
practical examples. © 1985 Academic Press, Inc.
1. INTRODUCTION
A primary problem of image processing is to devise algorithms which will
successfully divide complex images into areas which meaningfully correspond to
objects in the real world. This image segmentation problem can be extremely difficult
for general images which contain a large range of luminance or grey level values.
However, for many important applications in medicine or industrial inspection, the
main features of an image can be represented by as few as two grey levels. A typical
example is the inspection of an object placed on a dark background with which it
contrasts strongly. In such a situation the histogram of luminance values will possess
a strong bimodality with one peak corresponding to pixels from the object regions
and the other corresponding to pixels of the image background. This observation
permits classification or segmentation of the image by considering the relation of the
luminance values l(x, y) with a luminance value T which is between the luminance
values of the object and background. The simple decision criterion for the class of
each pixel is whether l(x, y) lies above or below T: in the example above, pixels
brighter than T are assigned to the object and the remaining pixels to the background.
T is called a threshold and this paper is concerned with automatically choosing this
number for images which satisfy a two-class assumption.
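The two-class decision can be sketched in a few lines. This is a minimal illustration only, assuming a bright object on a dark background and a strict "greater than" comparison; the function name `threshold_image` and the sample grey levels are hypothetical and do not come from the paper.

```python
def threshold_image(image, t):
    """Label a pixel 1 (object) if its luminance exceeds t, else 0 (background).

    Assumption: the object is brighter than the background.
    """
    return [[1 if value > t else 0 for value in row] for row in image]


# Hypothetical 4 x 4 image: a bright 2 x 2 object on a dark background.
image = [
    [10, 12, 11, 10],
    [11, 50, 52, 10],
    [10, 51, 49, 12],
    [12, 10, 11, 11],
]
binary = threshold_image(image, 30)
```

Any T between the two clusters of grey levels gives the same binary image here; the rest of the paper is about choosing T automatically.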
The importance of the thresholding segmentation method is based on its simplic-
ity and its wide applicability. It is useful because it is a data reduction step and
because it produces a binary representation of an image. Binary images are readily
manipulated to produce higher level descriptions of the scenes and objects, i.e.,
borders, relational graphs, etc.
The problems which occur in blindly applying thresholding are due to the nature
of real images and the fact that the assumptions which underlie the method, i.e., the
image is representable using only two grey levels, are not satisfied. A common
0734-189X/85 $3.00
Copyright © 1985 by Academic Press, Inc.
All rights of reproduction in any form reserved.
KITTLER, ILLINGWORTH, AND FÖGLEIN
problem is the effect of gradual shading across the surface of a large image. The
image is still segmentable provided nonrandom contrast which delineates objects is
locally preserved. However the gradual shading may mean that spatially separated
points of the same object may have luminances sufficiently different that they are
assigned to different classes. The shading leads to a broadening or filling in of the
intermodal valleys of the grey level histogram. It can even result in unimodal
distributions. Another problem may be that the object area is small compared with
the background area and therefore the size of the mode that it contributes to the
histogram is no more significant than that of random picture noise. Both of the
above problems can be addressed by summing the histogram over an area which is
more appropriate, i.e., in these cases a smaller area. However, the optimal size and
shape of such an area is difficult to determine and histogramming over too small an
area may produce results which lose any statistical significance or which violate our
basic assumption of histogram analysis, i.e., that two distinct significant modes exist.
Equally invalid is any attempt to apply the threshold method to an image with
more than two histogram modes. A common case occurs for industrial inspection
images in which there is a background, an object, and a strong shadow from the
object. Two intermodal valleys would then exist and it would be necessary to search
for two threshold values. In this paper we will not deal with this question of multiple
threshold selection.
The aim of this paper is to present a new method of automatically selecting a
threshold value. This is accomplished using simple statistics of the image and
without reference to histogram analysis. In order to place the new method, which we
call RATS (robust automatic threshold selector), in context we have included a brief
survey of many of the other methods which have been proposed for automatic
threshold selection, stressing some of the advantages or drawbacks of each method.
Such a review is timely as thresholding is an important developing technique and
few recent references exist to provide any comparative study of useful techniques.
The paper is organized so that the review occupies the following section. Section 3
considers a specific model of images from which several interesting properties and
statistics can be derived. Their usefulness has been indicated in previous publications
[1, 2]. Section 4 constructs and justifies the use of a combination of these measures to
select a meaningful threshold. The effect of noise on the robustness of the RATS
algorithm is then analysed in Section 5. Section 6 gives examples of the practical
application of the method to a couple of images. Section 7 discusses and illustrates
how the method can be applied more locally to overcome the effects of nonuniform
illumination. The final section includes discussion and conclusions.
2. REVIEW OF THRESHOLDING TECHNIQUES
A wide selection of thresholding techniques use only the information contained
within the luminance histogram of the image. The most general method involves
locating all the modes of the histogram. Several peak-finding algorithms exist [3, 4].
We have used a scheme which produces a piecewise linear approximation of the
histogram [5, 6]. This is then coded to indicate sections of positive, negative, or zero
slope and a simple syntactic analysis can be performed to locate all peaks and
valleys. This method includes several tunable parameters such as the precision of the
linear approximation and the value of gradient at which a line is regarded as having
significant nonzero slope. In practice these were easily selected. The method was
found to work well as it imposes few assumptions on the structure of the histogram
and can be used to select several thresholds if the structure of the histogram
indicates that more than two prominent peaks are present.
A popular thresholding method assumes that the grey level histogram contains
two and only two prominent modes and they are both Normally (Gaussianly)
distributed. The method fits the observed histogram to a sum of Gaussians with the
distribution means and widths as parameters [7]. The problem of such an analysis is
the computational complexity and its sensitivity to the correctness of the underlying
assumptions. A goodness-of-fit criterion can be used as a test of the suitability of the
method for any particular image. If a bad fit is discovered then the analysis can be
repeated postulating a sum of three Gaussian distributions.
Ridler [8] has proposed an iterative method of thresholding. In his method he
utilizes a switching function image which is the binary version of the picture
obtained by using the threshold value of the last iteration. The initial switch function
is arbitrarily chosen as a binary image with the corner points assigned as background
and the rest of the picture as object. At each iteration the mean luminance values of
the pixels in the object and background classes of the associated switching function
image are calculated. The average of these grey level means is used as a new
threshold value to produce a new switching function. This process is iterated until a
stable solution is found. This is a very inelegant, multiple pass formulation of the
method. Essentially only grey level histogram information is used and this can be
accumulated by a single pass through the image data. The method consists of
arbitrarily dividing the histogram into two parts and calculating the mean grey level
of each part. The next approximation to the best threshold is the average of these
two mean values. This new approximation is used to divide the histogram and the
process is iterated until a stable solution is obtained. This formulation of the method
is simple and the process is much faster [30].
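The histogram formulation of the iterative method can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `iterative_threshold` is hypothetical, the initial division is taken at the overall mean grey level (the text only says it is arbitrary), and pixels with grey level equal to the current threshold are counted in the lower part.

```python
def iterative_threshold(hist, start=None, max_iter=100, eps=1e-9):
    """Iterate t <- (mean of lower part + mean of upper part) / 2 on a histogram.

    hist[g] is the number of pixels with grey level g.
    """
    total = sum(hist)
    if total == 0:
        raise ValueError("empty histogram")
    # Assumed starting point: the overall mean grey level.
    t = start if start is not None else sum(g * h for g, h in enumerate(hist)) / total
    for _ in range(max_iter):
        low_n = sum(h for g, h in enumerate(hist) if g <= t)
        high_n = total - low_n
        if low_n == 0 or high_n == 0:
            return t  # one part is empty, no further refinement is possible
        low_mean = sum(g * h for g, h in enumerate(hist) if g <= t) / low_n
        high_mean = sum(g * h for g, h in enumerate(hist) if g > t) / high_n
        new_t = (low_mean + high_mean) / 2
        if abs(new_t - t) < eps:
            return new_t  # stable solution reached
        t = new_t
    return t


# Hypothetical 64-level histogram with two spikes of unequal size.
hist = [0] * 64
hist[10] = 300
hist[50] = 100
t = iterative_threshold(hist)
```

Only the histogram is touched inside the loop, so the image itself needs to be read once, which is the point made in the text.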
The effectiveness of grey level histogram analysis can be increased by considering
the histogram of suitable subpopulations of pixels of the image. In an ideal case the
pixels which populate the intermodal valley are those which lie on the edges between
the object and background regions. Thus if these pixels can be identified and
removed from the histogram the intermodal valley will deepen and be more easily
identified. Conversely the mean of the subpopulation of edge pixels should be a good
point at which to choose a threshold. Many people have suggested weighting the
grey level histogram with local derivative information, gradient and/or Laplacian.
However problems occur because random noise pixels also have large derivative
values. Rosenfeld and Weszka [9] have made a detailed study of such methods and
conclude that the study of grey level versus gradient plots can be useful aids in
threshold selection but they are not a general solution to the threshold selection
problem.
Wu, Hong, and Rosenfeld [10] have experimented with isolating edge region pixels
by looking at small size blocks of a quadtree representation of an image. This
involves iterative subdivision of the image and parts of the image into quadrants. A
quadrant is not subdivided once the variance of the grey level values in the quadrant
is less than a specified tolerance. Edge points will give large contributions to the grey
level variance and therefore squares containing them will be subdivided until they
reach a small size. The subset of these small quadtree blocks provides an enriched
sample of edge points upon which threshold analysis can be more easily performed.
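The variance-driven subdivision can be illustrated with a short sketch. It is not the implementation of [10]; it assumes a square image whose side is a power of two, and the names `block_variance` and `quadtree_leaves`, the tolerance, and the sample image are hypothetical.

```python
def block_variance(image, r, c, size):
    """Variance of the grey levels in the size x size block whose top-left corner is (r, c)."""
    values = [image[i][j] for i in range(r, r + size) for j in range(c, c + size)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def quadtree_leaves(image, tol, r=0, c=0, size=None):
    """Return the leaf blocks (row, col, size) of a variance-driven quadtree."""
    if size is None:
        size = len(image)
    # Stop once the block is uniform enough or cannot be split further.
    if size == 1 or block_variance(image, r, c, size) < tol:
        return [(r, c, size)]
    half = size // 2
    leaves = []
    for dr in (0, half):
        for dc in (0, half):
            leaves += quadtree_leaves(image, tol, r + dr, c + dc, half)
    return leaves


# Hypothetical 4 x 4 image with a vertical step between columns 0 and 1.
image = [[10, 50, 50, 50] for _ in range(4)]
leaves = quadtree_leaves(image, tol=1.0)
small = [(r, c) for r, c, size in leaves if size == 1]
edge_mean = sum(image[r][c] for r, c in small) / len(small)
```

In this toy case the single-pixel blocks all straddle the step, and the mean grey level of that enriched sample falls midway between the two levels.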
mean value of the contrast of all step edges detected by the threshold. A step edge is
identified whenever the trial threshold value lies between the intensity values of the
adjacent pixels under consideration. The contrast contribution is the minimum of
the absolute differences between the trial threshold and the respective pixel intensi-
ties. This method of calculating contrast ensures that a given step edge gives
maximal contribution when the trial threshold is midway between the intensities of
object and background of the step edge. The threshold selected is that trial value
which gives the highest average contrast value. The method favors the selection of a
threshold which has large numbers of high contrast edges and low numbers of low
contrast edges. As with the method of Barrett, it is also suitable for selecting
multiple thresholds.
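The average-contrast criterion described above can be sketched as follows. This is an illustration under assumptions, not the cited authors' code: adjacent pixels are taken as right and lower neighbours only, a step edge is detected when the trial threshold lies strictly between the two intensities, and the names `average_contrast` and `best_contrast_threshold` are hypothetical.

```python
def average_contrast(image, t):
    """Mean contrast of all step edges detected by the trial threshold t."""
    rows, cols = len(image), len(image[0])
    total, count = 0, 0
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((0, 1), (1, 0)):  # right and lower neighbour
                ni, nj = i + di, j + dj
                if ni < rows and nj < cols:
                    a, b = image[i][j], image[ni][nj]
                    if min(a, b) < t < max(a, b):
                        # Contribution: the smaller distance from t to either pixel.
                        total += min(abs(t - a), abs(t - b))
                        count += 1
    return total / count if count else 0.0


def best_contrast_threshold(image, levels):
    """Trial value with the highest average contrast (first one on ties)."""
    return max(levels, key=lambda t: average_contrast(image, t))


# Hypothetical two-level image with grey levels in the range 0 to 63.
image = [[10, 10, 50, 50] for _ in range(4)]
best = best_contrast_threshold(image, range(64))
```

For an ideal step between levels 10 and 50 the contribution of each edge peaks when the trial value is midway between them.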
Several authors have investigated the use of relaxation algorithms for classification
and threshold selection. This relies on the fact that the pixels within a uniform area
should have similar grey level values and therefore comparison of the assignment of
a pixel with its immediate neighborhood allows a quantitative assessment of the
correctness of that assignment. Rosenfeld and Smith [21] claim good results with this
method but Renade and Prewitt [22] find that the method is sensitive to the initial
assignment and to subsequent pixel updating rules. Renade found relaxation a useful
method to improve thresholding results after initializing the method with a standard
histogram threshold selection algorithm. Bhanu and Faugeras [23] have performed
studies of a gradient relaxation scheme and found that this yields good results and
the convergence properties of the method are controllable using only a few parame-
ters. They claim that their method is useful for unimodal distributions [24].
Other papers which suggest thresholding methods and/or include comparative
studies of thresholding algorithms can be found in [25-291.
3. PRELIMINARIES
Most of the approaches reviewed in Section 2 involve the analysis of the histogram
of grey level values which is associated with the difficulties and problems listed
earlier. The exceptions include the convergent evidence method [11], where the
threshold is considered as a parameter of some criterion function quantifying the fit
between edge and threshold based segmentation. The determination of the optimal
threshold involves several iterations through the image data. Likewise, the edge
profile searching method [20] and the relaxation methods are iterative. Although the
unreliable histogram analysis is obviated in these methods, they are neither simple to
implement, nor free from artifacts, two essential prerequisites for completely auto-
matic operation.
Ideally we should like to be able to determine the correct threshold on the basis of
simple statistics defined directly in terms of pixel grey level values and possibly their
functions, without the need to rely on histogram analyses or some criterion optimiza-
tion involving multiple data passes. We shall introduce such a novel method in
Section 4 but before doing so it will be useful to provide some background and
preliminary material.
The search for a simple statistic which could provide a basis for a thresholding
method was stimulated by our recent work on edge detection which led to the
development of the absorption edge detector [1, 2]. This detector has the desirable
property of yielding the edge magnitude proportional to the contrast between the
background and the object independent of the actual edge position and orientation.
It is based on the observation that the sum of the edge magnitudes output by
conventional edge operators in the vicinity of an edge and along a line intersecting
the edge is constant. This property has been shown to hold for a family of operators
with a 3 x 3 kernel. For the purpose of our discussion here it will be more
appropriate to consider a 1 x 3 operator and show that the absorption principle
remains valid.
Let us consider a scene segment containing a boundary between the dark
background and light object illustrated in Fig. 1. Suppose the contrast between the
object and the background is E, i.e.,
$$E = B - D,$$
where B and D are the luminance of the object and the background, respectively.
For the moment let us assume that we can obtain a noise-free image of the scene and
also that the true edge angle lies in the interval [−45°, 45°].
We shall now apply the edge gradient operator illustrated in Fig. 2 along one scan
line and sum its output over a set of consecutive pixels. Note that well inside the
background, or the object, the outputs of the operator will be zero and hence their
sum will also be zero. We shall therefore turn our attention to the boundary region.
In the previously reported studies of edge detectors a particular model for the
imaging device has been adopted [1, 2]. According to this model, the grey level value
at a pixel is given by the integral of the scene luminance function over the pixel area.
Here we shall adopt a more general model capable of characterizing factors affecting
the imaging process such as frequency characteristic limitations, cell overlap, cell
crosstalk, etc. Thus it will be assumed that an ideal step in the scene luminance
function corresponding to the object boundary will give rise to a grey level value
transition function in the boundary region. The only restriction in the model is that
at any pixel in the vicinity of a true edge at angle [−45°, 45°] the magnitude of the
derivative of the transition function with respect to x will exceed the one in the y
direction.
[FIG. 3: grey level values along the scan line, from background to object: D, D, a_1, a_2, a_3, ..., a_{k-1}, a_k, B, B.]
Let us consider grey level values in the boundary region along one scan line and
let us denote them as shown in Fig. 3. Applying the operator of Fig. 2 we get the
gradient magnitudes shown in Fig. 4. Summing over the k + 1 pixels of the
luminance function transition region along the scan line we find
$$\sum_{j=0}^{k+1} e_j = \sum_{j=1}^{k} a_j + 2(B - D) - \sum_{j=1}^{k} a_j = 2E. \tag{1}$$
Thus the summation of the derivatives in the x direction equals twice the contrast.
As the derivatives of the luminance function at pixels outside the boundary region
are zero, the sum of the operator outputs along the complete scan line is still equal to
2E.
In order to extend this result to true edge angles from the interval [135°, 225°], we
simply need to replace e_j by its absolute value. It is easy to verify that this does not
affect the result in (1).
To summarize this result more formally let us introduce the concept of a vertical
edge.
DEFINITION 1. Let the luminance function over the imaged scene be a step
function and denote the contour defining the boundary between high and low
luminance values by Γ(x, y). Let (x_0, y_0) be a boundary point. We say that the
luminance function has a vertical edge at point (x_0, y_0) if the angle between the x
axis and the normal to the boundary at this point lies in either of the following
intervals: [−45°, 45°], [135°, 225°].
Let us denote by N the number of pixels in one scan line. We can now state the
following theorem.
THEOREM 1. Let e_j, j = 1, ..., N, be the outputs of the differentiation operator in Fig. 2
along a horizontal scan line intersecting one vertical edge. Then
$$\sum_{j=1}^{N} |e_j| = 2E.$$
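Theorem 1 can be checked numerically on a single scan line. The sketch below assumes that the operator of Fig. 2 is the mask [−1, 0, 1] (consistent with the differences a_{j+1} − a_{j−1} used in the derivation, although the figure itself is not reproduced here), that the end pixels are replicated at the borders, and that the transition is monotonic; the function name `x_derivative` and the profile values are hypothetical.

```python
def x_derivative(line):
    """Apply the mask [-1, 0, 1] along a scan line, replicating the end pixels."""
    n = len(line)
    return [line[min(j + 1, n - 1)] - line[max(j - 1, 0)] for j in range(n)]


D, B = 10, 50  # background and object grey levels, contrast E = 40
# Hypothetical gradual, monotonic transition from background to object.
line = [D, D, D, 14, 23, 41, 47, B, B, B]
e = x_derivative(line)
total = sum(abs(v) for v in e)
```

The sum depends only on the two plateau levels, not on the intermediate values of the transition, which is the absorption property the theorem states.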
We now extend this result to the whole image. Suppose that in total n horizontal
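scan lines intersect one vertical edge. Denoting by e_ij the output of the differentiation
operator centered at pixel (i, j), each intersected line contributes 2E to the sum
of all |e_ij|. Assuming that the total number of horizontal scan lines is equivalent to N, we have
$$\sum_{i=1}^{N}\sum_{j=1}^{N} |e_{ij}| = 2En. \tag{2}$$
[FIG. 4: gradient magnitudes along the scan line of Fig. 3: e_0 = a_1 − D, e_1 = a_2 − D, e_2 = a_3 − a_1, ..., e_k = B − a_{k-1}, e_{k+1} = B − a_k, and zero inside the background and the object.]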
Obviously in usual situations edges arise at the boundaries between objects and
the background. If a complete object is in the field of view, then one scan line will
intersect at least two vertical edges. A complicated shape of an object or several
objects may give rise to several vertical edges being intersected by one scan line.
Provided these edges are separated enough to allow the grey level function to reach
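either the object or background intensity as appropriate, then the ith scan line will
contribute to the summation of the derivative magnitudes by the amount of
2E n_i, where n_i is the number of vertical edges intersected by the ith scan line.
The summation over the complete image will yield
$$\sum_{i=1}^{N}\sum_{j=1}^{N} |e_{ij}| = 2E \sum_{i=1}^{N} n_i = 2En. \tag{3}$$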
Thus the output is directly proportional to the number of vertical edge pixels in
the image.
By analogy we can define the horizontal edge as follows.
DEFINITION 2. We say that the luminance function has a horizontal edge at
point (x_0, y_0) if the angle between the x axis and the normal to the boundary
Γ(x, y) at this point lies in either of the following intervals: [45°, 135°] and
[225°, 315°].
If instead of the operator in Fig. 2 we now use the column operator of Fig. 5,
because of the rotational symmetry we can derive identical results to (3) for the
horizontal edges.
In order to combine these results for an image containing both vertical and
horizontal edges we recall that the operators in Figs. 2 and 5 approximate the x and
y derivatives of the image intensity function. It is easy to show that for a vertical
edge the output of the x mask (Fig. 2) will exceed the output of the y mask in Fig. 5
and vice versa. Thus by selecting the greater of the two outputs we obtain the
appropriate maximum derivative map of the image.
[FIG. 5: the column (y derivative) operator.]
We now consider the effect of summing up the pixel values of the derived map.
Irrespective of the edge directions it makes no difference to the output of the
summation operator whether the summation is carried out row-wise or column-wise.
Provided we are not too close to the pixels where vertical and horizontal edges
intersect, i.e., near corners, the above results can be applied directly. In the region
where the vertical and horizontal edges intersect, the situation is not as clear and
further detailed analysis is warranted. However, the number of such points is likely
to be small in comparison with noncorner points and their effect on the summation
operator output will therefore be negligible. We thus have the following theorem.
THEOREM 2. Let e_ij be the maximum in the absolute sense of the outputs of the x
and y derivative masks centered at the (i, j)th pixel. Ignoring any edge corner effects,
the sum of the absolute values of e_ij over an image is equal to two times the contrast E
times the number n of edge pixels in the image, i.e.,
$$\sum_{i=1}^{N}\sum_{j=1}^{N} |e_{ij}| = 2En. \tag{4}$$
This result is quite interesting. It states that irrespective of the edge profile which
is a function of the sensing device limitations, the output of the derivative map
summation operator depends only on the contrast and the number of edge pixels.
Note that n is the sum of the perimeters (expressed in pixel size units) of the objects
in the image. It must be emphasized that the result is valid only under the
assumptions of uniform lighting and zero noise. However these aspects will be dealt
with later.
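The relationship between the summed maximum-derivative map, the contrast, and the object perimeter can be illustrated on a small synthetic image. The sketch assumes [−1, 0, 1] masks in the x and y directions with replicated borders; the function name `max_gradient_sum` and the test image are hypothetical.

```python
def max_gradient_sum(image):
    """Sum over the image of max(|x derivative|, |y derivative|)."""
    rows, cols = len(image), len(image[0])
    total = 0
    for i in range(rows):
        for j in range(cols):
            ex = image[i][min(j + 1, cols - 1)] - image[i][max(j - 1, 0)]
            ey = image[min(i + 1, rows - 1)][j] - image[max(i - 1, 0)][j]
            total += max(abs(ex), abs(ey))
    return total


D, B = 0, 10  # background and object grey levels, contrast E = 10
# Hypothetical 8 x 8 image holding a 4 x 4 square object with sharp edges.
image = [[B if 2 <= i <= 5 and 2 <= j <= 5 else D for j in range(8)]
         for i in range(8)]
total = max_gradient_sum(image)
n_estimate = total / (2 * (B - D))
```

Here the estimate comes out at 14, somewhat below the 16 pixel units of the square's perimeter. The shortfall arises at the four corners, the effect that Theorem 2 sets aside; on larger objects the corner pixels form a smaller share of the boundary.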
We shall not elaborate on the possible uses of this result which could, for instance,
include object perimeter mensuration. Instead we shall consider whether similar
simple and meaningful relationships exist between other statistics that can be easily
derived from any image. We shall see that one such relationship provides a basis for
an effective method of threshold determination which does not require the compu-
tation and, in particular, analysis of the grey level histogram.
4. A NEW THRESHOLDING METHOD
Having obtained such a surprising result when summing up max derivative values,
let us consider some other obvious candidate variables for such simple statistics.
Suppose we sum up all the grey level values g_ij to see whether the result has
interesting properties. In an ideal two-level image object and background pixels have
grey level values B and D, respectively. If the number of object pixels is q, then the
sum will be
$$\sum_{i=1}^{N}\sum_{j=1}^{N} g_{ij} = qB + (N^2 - q)D = qE + N^2 D.$$
In a more realistic image where the transition from the background to object
intensities is gradual this result will still hold provided the transition function is
reasonably symmetric.
While there may be some uses for these grey level statistics if some of the
parameters are known, i.e., either object size, contrast, or background level, espe-
cially in conjunction with the relationship in (4), the result does not seem to have a
clear designation of applications. We shall, therefore, consider the next obvious
candidate, namely the sum of grey values each multiplied by the maximum deriva-
tive.
FIG. 6. The product of the grey level and the x-derivative magnitude for the scan line of Fig. 3.
As before, we shall first consider one scan line intersected by a single edge at an
angle from the interval [−45°, 45°] as in Fig. 1. Taking the product of the grey level
values in Fig. 3 with the corresponding magnitudes of the x derivatives in Fig. 4 we
get an output scan line illustrated in Fig. 6. Summing up over j = 0, ..., k + 1 with
a_0 = D and a_{k+1} = B we find
$$\sum_{j=0}^{k+1} h_j = \sum_{j=1}^{k} a_j (a_{j+1} - a_{j-1}) + a_0 (a_1 - a_0) + a_{k+1} (a_{k+1} - a_k).$$
The cross terms cancel in pairs, leaving
$$\sum_{j=0}^{k+1} h_j = a_{k+1}^2 - a_0^2 = (B + D)E.$$
It is easy to see that this result holds for any vertical edge provided hj is replaced
by its absolute value. Also if a scan line is intersected by one vertical edge only, then
the result holds for the summation over all pixels in the scan line, as the pixels well
within the object and the background do not contribute to the value of the statistics
(zero derivative). The following theorem summarizes this basic result.
THEOREM 3. Let h_j, j = 1, ..., N, be the product of the grey level value g_j and the
output e_j of the operator in Fig. 2 along a horizontal scan line intersecting one vertical
edge. Then
$$\sum_{j=1}^{N} |h_j| = (B + D)E.$$
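Theorem 3 can be verified on a gradual, monotonic transition. As before, the sketch assumes that the operator of Fig. 2 is the mask [−1, 0, 1] with replicated end pixels; the function name `x_derivative` and the profile values are hypothetical.

```python
def x_derivative(line):
    """Apply the mask [-1, 0, 1] along a scan line, replicating the end pixels."""
    n = len(line)
    return [line[min(j + 1, n - 1)] - line[max(j - 1, 0)] for j in range(n)]


D, B = 10, 50  # background and object grey levels
E = B - D      # contrast
line = [D, D, D, 14, 23, 41, 47, B, B, B]  # hypothetical edge profile
e = x_derivative(line)
# h_j is the grey level multiplied by the derivative output at the same pixel.
h_sum = sum(abs(g * d) for g, d in zip(line, e))
e_sum = sum(abs(d) for d in e)
ratio = h_sum / e_sum
```

Dividing the weighted sum by the plain derivative sum removes the contrast and leaves the midpoint of the two grey levels, whatever the shape of the transition.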
$$T = \frac{\sum_{i=1}^{N}\sum_{j=1}^{N} |h_{ij}|}{\sum_{i=1}^{N}\sum_{j=1}^{N} |e_{ij}|} = \frac{B + D}{2}. \tag{7}$$
Now note that the right-hand side of (7) is the midpoint between the object and
background intensities. Under the assumption that B and D do not vary over the
image, this quantity is the appropriate threshold value for segmenting objects from
the background. Thus we have derived a completely new basis for thresholding
which does not require the computation of the grey level histogram and, more
importantly, its subsequent analysis.
A few comments are in order. First of all note that this novel method of
determining the threshold can be applied to images of any size. Second, the size of
the objects in the image in relation to the background size is immaterial. However, if
the image contains no objects then the statistic T is undefined. It is important,
therefore, to check the denominator in (7) to avoid numerical and semantic prob-
lems.
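The statistic in (7), together with the denominator check, can be sketched as below. This is an illustrative reading of the formula rather than the authors' PASCAL implementation: the derivative masks are assumed to be [−1, 0, 1] in x and y with replicated borders, and the function name `rats_threshold` and the `min_gradient_sum` guard are hypothetical.

```python
def rats_threshold(image, min_gradient_sum=0.0):
    """Return sum(g * e) / sum(e), with e = max(|x derivative|, |y derivative|).

    Returns None when the gradient sum does not exceed min_gradient_sum,
    i.e. when the image appears to contain no edges and T is undefined.
    """
    rows, cols = len(image), len(image[0])
    num = 0.0
    den = 0.0
    for i in range(rows):
        for j in range(cols):
            ex = image[i][min(j + 1, cols - 1)] - image[i][max(j - 1, 0)]
            ey = image[min(i + 1, rows - 1)][j] - image[max(i - 1, 0)][j]
            e = max(abs(ex), abs(ey))
            num += image[i][j] * e
            den += e
    if den <= min_gradient_sum:
        return None
    return num / den


# Hypothetical image: six identical scan lines crossing one gradual vertical edge.
image = [[10, 10, 10, 14, 23, 41, 47, 50, 50, 50] for _ in range(6)]
t = rats_threshold(image)
flat = rats_threshold([[7] * 5 for _ in range(5)])
```

The image is read in a single pass and no histogram is built. Returning `None` for an edge-free image is one possible way of handling the undefined case; a noisy image would need a positive `min_gradient_sum`.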
The assumption that the image be uniform, that is, having constant object and
background intensities over the whole image is highly unrealistic. Indeed if that were
the case the image thresholding problem would be trivial. Nevertheless it can be
argued that the assumption often holds for a small enough subimage. If we divide
the image accordingly then an appropriate threshold value can be determined for
each resulting image window separately. This is the basic philosophy behind the
variable thresholding approach to segmenting images of scenes subject to nonuni-
form lighting. However, unlike histogram-based thresholding methods, where the
image partitioning accentuates the problems of histogram analysis, the new method
of threshold selection is remarkably robust. Of course, the subdivision of the image
increases the probability of each window being homogeneous, that is, containing
either background or object only. A suitable strategy, therefore, must be adopted to
cope with such situations.
Another factor causing the image to be nonuniform is noise. In the above
derivation of the properties of the statistic T, noise has not been taken into account. It
will obviously result in nonzero value of the sum of derivatives even if a window
contains only background or object pixels. The effect of noise on the threshold value
will be investigated in the next section.
where g_ij is the grey level in the noise-free case and η_ij is the noise signal, both at
pixel (i, j).
For the sake of simplicity, instead of analyzing pixel values in the 2-dimensional
image field, we shall consider a 1-dimensional analog, that is, a time series (or
one scan line of the image). The generalization to the 2-dimensional case is fairly
straightforward.
According to the simplifying assumption our model in the interval of N sample
points is now
$$T = \frac{\sum_{i=1}^{N} s(i)\,|e(i)|}{\sum_{i=1}^{N} |e(i)|}, \tag{9}$$
where e(i) is the digital approximation of the derivative of s obtained using the
mask of Fig. 2, i.e.,
$$e(i) = s(i+1) - s(i-1). \tag{10}$$
$$p(\eta) = [2\pi\sigma^2]^{-1/2} \exp\left\{-\frac{\eta^2}{2\sigma^2}\right\}. \tag{11}$$
Let us denote the difference noise signal by ξ(i), i.e.,
and
A = [1, −1],
we find that ξ(i) is distributed according to N(0, 2σ²). Thus the density of ξ(i) is
WI} = I. (14)
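The claim that the difference of two independent zero-mean Gaussian noise samples of variance σ² has variance 2σ² can be checked with a short simulation. This is only an empirical illustration; the function name `difference_noise_variance`, the sample size, and the seed are hypothetical choices.

```python
import random


def difference_noise_variance(sigma, n_samples=20000, seed=0):
    """Estimate the variance of noise(i + 1) - noise(i - 1) for white Gaussian noise."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, sigma) for _ in range(n_samples + 2)]
    diffs = [noise[i + 1] - noise[i - 1] for i in range(1, n_samples + 1)]
    mean = sum(diffs) / len(diffs)
    return sum((d - mean) ** 2 for d in diffs) / len(diffs)


sigma = 2.0
estimate = difference_noise_variance(sigma)  # should be close to 2 * sigma ** 2
```

Successive differences share samples and so are correlated with each other, but each one is still the difference of two independent samples, so the variance estimate settles near 2σ².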
Let us now return to the numerator of T in (9). We shall examine the expression:
(15)
and show that for sufficiently large N it will approach zero. We first consider the
expected value of (15). Changing the order of the mathematical expectation and
summation operators we get
(16)
Note that for |i − j| ≥ 2, η(i) and η(j) will be independent of e(i) and e(j) and
therefore the last term will vanish. Because of the independence of η(i) and e(i) we
$$= \frac{\sigma^2}{N^2} \sum_{i=1}^{N} E\{[g(i+1) - g(i-1) + \eta(i+1) - \eta(i-1)]^2\}. \tag{19}$$
Since the absolute value of the difference of two noise-free pixel intensity values is
less than or equal to the contrast E, the first term in (18) can be bounded by σ²(E² + 2σ²)/N.
For the second term in (18) we can write
where D is the background signal level. Consequently the effect of the second term
in the numerator can be neglected and we can write:
Starting from (24), it has been shown in [31] that in the presence of noise the
computed threshold T will be approaching
where q denotes the fraction of the object pixels in the image. For q = 0.5 the
threshold determined will still be correct. For other values of q the threshold will be
shifted either towards the object or background grey level values. However this bias
can be removed very effectively as discussed in [32].
6. EXPERIMENTAL RESULTS
We have implemented the RATS algorithm on a PDP 11/44 using the PASCAL
programming language. Input images were acquired using a standard VIDICON
camera whose output was digitized to give grey levels in the range 0 to 63. The
gradient values of the image were calculated in software but considerable speed
advantages would result from a hardware implementation of this simple operator.
Figure 7 is an image of a lens cap. High contrast between the dark cap and light
background permits simple, correct threshold selection. A single global threshold
was calculated by including contributions of all the pixels. The grey level histogram
of the image is shown in Fig. 8a with the selected threshold indicated. The resultant
binary image is Fig. 8b.
Figure 9a is an image of a typical metal product which may need inspection.
Figures 9b and c show the grey level histogram and the binary image obtained using
the indicated threshold. The histogram is relatively complex. The upper mode of the
histogram has been spread and split into 3 submodes by nonuniform lighting and
shadowing. The resultant globally determined threshold produces a poor segmenta-
tion. The nonuniformity of illumination is illustrated in Fig. 9d which is the profile
of grey level intensity along a diagonal line in the background portion of the image.
This image requires threshold selection to be made locally.
FIG. 8. (a) Gray level histogram. The selected threshold is indicated. (b) Binary image based on global
statistic. Good segmentation results.
FIG. 9. (a) Image of metal object. (b) Gray level histogram. The selected threshold is indicated. (c)
Binary image from global statistic. Poor segmentation. (d) Gray level scan along a diagonal line of the
image background.
7. VARIABLE THRESHOLDING
For scenes which are nonuniformly illuminated, the determination of a single
global threshold for the image is often unsatisfactory. A more appropriate method is
to partition the image into smaller square windows for which relevant thresholds can
be independently determined. This strategy has been successfully used by several
authors for several thresholding methods [7, 27]. However many threshold selection
methods which analyze the gray level histogram of the image are ill-suited to this
approach because in small populations statistical fluctuations dominate and make a
significant and correct segmentation of the histogram difficult. However the RATS
method, which has been shown to be insensitive to population size for noise-free
images, should benefit from this local application. This is related to the proposition
that in noisy images RATS will produce an optimal threshold if the number of
object and number of background pixels are equal. As the window size decreases it
becomes more probable that a window which contains both object and background
pixels will achieve this desired balance. However, many small windows will contain
either only object or only background pixels. The determination of a threshold for
these homogeneous windows will be inappropriate but suitable threshold values to
classify them as all object or all background pixels should be derivable from their
spatial and gray level relationships to the well-thresholded windows.
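As a rough illustration of the windowed scheme, the sketch below computes one threshold per square window. It assumes that the RATS threshold is the edge-strength-weighted mean gray level, T = sum(e * I) / sum(e), with e taken as the larger of the absolute horizontal and vertical central differences; the definition given earlier in the paper takes precedence if it differs. The function names, the clamped border handling, and the use of `None` for windows without any gradient are all hypothetical choices made for this example.

```python
def edge_strength(img, y, x):
    """Larger of the absolute horizontal and vertical central differences."""
    h, w = len(img), len(img[0])
    # Assumption: neighbour indices are clamped at the image border.
    gx = abs(img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)])
    gy = abs(img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x])
    return max(gx, gy)


def window_stats(img, y0, x0, size):
    """Return (sum of e, sum of e * I) over one size x size window."""
    h, w = len(img), len(img[0])
    sum_e = 0.0
    sum_ei = 0.0
    for y in range(y0, min(y0 + size, h)):
        for x in range(x0, min(x0 + size, w)):
            e = edge_strength(img, y, x)
            sum_e += e
            sum_ei += e * img[y][x]
    return sum_e, sum_ei


def local_thresholds(img, size):
    """One threshold per window; None marks windows with zero gradient sum."""
    h, w = len(img), len(img[0])
    grid = []
    for y0 in range(0, h, size):
        row = []
        for x0 in range(0, w, size):
            sum_e, sum_ei = window_stats(img, y0, x0, size)
            row.append(sum_ei / sum_e if sum_e > 0 else None)
        grid.append(row)
    return grid


if __name__ == "__main__":
    # Dark strip (10) two pixels wide on the left, bright (200) elsewhere.
    img = [[10, 10, 200, 200, 200, 200, 200, 200] for _ in range(8)]
    for row in local_thresholds(img, 4):
        print(row)
```

In this toy image only the left-hand windows straddle the step, so only they receive a threshold; the right-hand windows are homogeneous and stay unassigned, which is the situation the remainder of this section addresses.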
A simple test for windows which are homogeneous, i.e., contain only object or only
background pixels, is to consider the Σ-grad statistic for all the windows into which
the image is divided. Windows which contain edge pixels will generally have a larger
Σ-grad value than those containing no edge pixels. The effect of noise, if its statistics
are constant over the image, will be to add, on average, an equal Σ-grad to both types of
window. The windows with large Σ-grad values, as they contain edge pixels, are
thresholdable. They can be separated from the small Σ-grad windows by treating the
2-dimensional array of Σ-grad values as an image to which the RATS thresholding
method can be applied. Figure 10 shows the effectiveness of this method for the
image in Fig. 9a. The image was partitioned into square windows each with a side
length of 32 pixels. The 8 x 8 array of Σ-grad values of all windows was thresholded
using the RATS method and Fig. 10 is the resultant binary image together with an
overlay of the border points of the object in the image. (The border was obtained
from edge pixels of a later successful image binarization.) It is seen that the bright
windows, i.e., those above the calculated Σ-grad threshold, coincide well with the
border of the object. The Σ-grad threshold selects windows which contain edge
points and it is for these windows that meaningful thresholds are calculable. Similarly
good results were found for a variety of images.
FIG. 10. The effect of the grad cut. The bright squares have values above the cut. This indicates they
have gradient contributions from large edge values and therefore good thresholds may be assigned for
these windows.
FIG. 11. A pyramid data structure. Values at high levels are constructed by the successive union of
nonoverlapping 2 x 2 window values.
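The window test described above, in which the array of per-window summed gradient values is itself thresholded, might be sketched as follows. The sketch assumes the same edge-strength measure and edge-weighted-mean form of RATS as stated in its comments; the names `grad_sum_map`, `rats_threshold`, and `thresholdable_windows` are hypothetical, and the clamped border handling is a choice made for this example rather than something the text specifies.

```python
def edge_strength(a, y, x):
    """Larger of the absolute horizontal and vertical central differences."""
    h, w = len(a), len(a[0])
    # Assumption: neighbour indices are clamped at the array border.
    gx = abs(a[y][min(x + 1, w - 1)] - a[y][max(x - 1, 0)])
    gy = abs(a[min(y + 1, h - 1)][x] - a[max(y - 1, 0)][x])
    return max(gx, gy)


def rats_threshold(a):
    """Edge-weighted mean of a 2-D array (assumed form of the RATS statistic)."""
    sum_e = 0.0
    sum_ea = 0.0
    for y in range(len(a)):
        for x in range(len(a[0])):
            e = edge_strength(a, y, x)
            sum_e += e
            sum_ea += e * a[y][x]
    return sum_ea / sum_e if sum_e > 0 else None


def grad_sum_map(img, size):
    """Summed edge strength of every size x size window of the image."""
    h, w = len(img), len(img[0])
    out = []
    for y0 in range(0, h, size):
        row = []
        for x0 in range(0, w, size):
            s = 0.0
            for y in range(y0, min(y0 + size, h)):
                for x in range(x0, min(x0 + size, w)):
                    s += edge_strength(img, y, x)
            row.append(s)
        out.append(row)
    return out


def thresholdable_windows(img, size):
    """True for windows whose summed gradient lies above the computed cut."""
    sums = grad_sum_map(img, size)
    cut = rats_threshold(sums)
    if cut is None:  # completely flat image: no window contains an edge
        return [[False] * len(row) for row in sums]
    return [[v > cut for v in row] for row in sums]


if __name__ == "__main__":
    # Vertical step between columns 7 and 8 of a 16 x 16 image.
    img = [[10] * 8 + [200] * 8 for _ in range(16)]
    for row in thresholdable_windows(img, 4):
        print(row)
```

Here the step falls on the boundary between the two middle window columns, so both pick up gradient contributions and are flagged, while the outer columns fall below the cut.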
The assignment of good thresholds to window areas of the image which contain
only object or only background pixels can be attempted in many ways. The two
possibilities which have been considered in our work involve simple neighborhood
window averaging or the use of a pyramid data structure. The first of these methods
is well described in [7]. An unassigned window is given a threshold value which is the
weighted arithmetic mean of the threshold values of those of its 8 neighboring windows
which have a threshold assigned. Each window threshold is weighted by a factor
inversely proportional to the distance between its center and that of the central
unassigned window (e.g., for diagonal neighbors by 1/√2). This assignment process
is iterated until thresholds have been assigned to all windows.
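A minimal sketch of this neighborhood averaging is given below. Unassigned windows are represented by `None`, and each pass updates all windows simultaneously from the values of the previous pass; that synchronous update order, like the function name `fill_by_neighbour_average`, is an assumption of this example and is not taken from [7].

```python
import math


def fill_by_neighbour_average(grid):
    """Assign thresholds to None cells from their assigned 8-neighbours."""
    rows, cols = len(grid), len(grid[0])
    grid = [row[:] for row in grid]  # work on a copy
    while any(v is None for row in grid for v in row):
        updates = {}
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] is not None:
                    continue
                num = 0.0
                den = 0.0
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        if dr == 0 and dc == 0:
                            continue
                        rr, cc = r + dr, c + dc
                        inside = 0 <= rr < rows and 0 <= cc < cols
                        if inside and grid[rr][cc] is not None:
                            # Weight 1 for side neighbours, 1/sqrt(2) for diagonals.
                            wgt = 1.0 / math.hypot(dr, dc)
                            num += wgt * grid[rr][cc]
                            den += wgt
                if den > 0:
                    updates[(r, c)] = num / den
        if not updates:
            break  # no assigned window anywhere, nothing to propagate
        for (r, c), value in updates.items():
            grid[r][c] = value
    return grid


if __name__ == "__main__":
    print(fill_by_neighbour_average([[100.0, None, None]]))
```

Windows far from any assigned window are filled on later passes, once their own neighbors have received values.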
The use of a pyramid data structure was motivated by the desire to select the best
thresholds from those determined at several different spatial resolutions [34]. The
independent nature of the sums of individual pixel statistics means that the RATS
algorithm is well suited to this approach. The pyramid data structure is illustrated in
Fig. 11. At the lowest level of the pyramid the statistics of the image are calculated
for small windows. At the next higher level the statistics for nonoverlapping blocks
formed by the union of 2 x 2 windows are calculated by summing the statistics
obtained for those 4 windows at the lower level. At the very highest level the
statistics are just the sums calculated over all pixels and the threshold calculated is
the global threshold. An appropriate or best threshold can be calculated for the low
level windows by considering information at several spatial scales. At each level we
FIG. 12. Four-gray-level image which indicates the level at which a threshold was assigned to each base
level 32 x 32 pixel window. The brightest squares were assigned thresholds based on 32 x 32 pixel
window statistics, the next brightest on 64 x 64 pixel windows, the next brightest on 128 x 128 pixel
windows, and the darkest areas were given a threshold determined by sums over the full 256 x 256 image.
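The pyramid of window statistics can be sketched as below. Each base window is assumed to carry a pair (sum of e, sum of e * I), and the threshold of a window is taken as the ratio of the two sums, which is the assumed form of the RATS statistic. The names `build_pyramid` and `level_thresholds` are hypothetical, and the base grid is assumed to be square with a side that is a power of two. The rule for choosing among the levels is not reproduced here, so the sketch only builds the levels and lists the candidate thresholds at each one.

```python
def build_pyramid(base):
    """Sum (sum_e, sum_ei) pairs over nonoverlapping 2 x 2 blocks, level by level."""
    levels = [base]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        n = len(prev) // 2
        nxt = []
        for r in range(n):
            row = []
            for c in range(n):
                block = [prev[2 * r + i][2 * c + j] for i in (0, 1) for j in (0, 1)]
                row.append((sum(b[0] for b in block), sum(b[1] for b in block)))
            nxt.append(row)
        levels.append(nxt)
    return levels


def level_thresholds(level):
    """Candidate threshold of every window at one level (None if no gradient)."""
    return [
        [(s_ei / s_e if s_e > 0 else None) for (s_e, s_ei) in row]
        for row in level
    ]


if __name__ == "__main__":
    base = [
        [(2.0, 20.0), (0.0, 0.0)],
        [(0.0, 0.0), (2.0, 380.0)],
    ]
    for level in build_pyramid(base):
        print(level_thresholds(level))
```

Because the sums are additive, no pixel is revisited above the base level, and the single window at the top reproduces the global threshold.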
[Diagram: four adjacent windows, labeled WINDOW A, WINDOW B, WINDOW C, and WINDOW D.]
FIG. 13. Assignment of individual pixel thresholds based on 4 point interpolation between the window
thresholds of the nearest four windows.
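One plausible reading of this 4 point interpolation is bilinear interpolation between window thresholds located at the window centers, sketched below. The bilinear form, the clamping of pixels that lie outside the outermost centers, and the name `pixel_threshold` are assumptions of this example; the caption only states that the four nearest windows are used.

```python
def pixel_threshold(grid, size, y, x):
    """Interpolate window thresholds (placed at window centres) at pixel (y, x)."""
    rows, cols = len(grid), len(grid[0])
    # Pixel position in units of windows, measured from the centre of window (0, 0).
    fy = (y + 0.5) / size - 0.5
    fx = (x + 0.5) / size - 0.5
    # Pixels beyond the outermost centres take the value of the nearest centres.
    fy = min(max(fy, 0.0), rows - 1.0)
    fx = min(max(fx, 0.0), cols - 1.0)
    r0, c0 = int(fy), int(fx)
    r1, c1 = min(r0 + 1, rows - 1), min(c0 + 1, cols - 1)
    ty, tx = fy - r0, fx - c0
    top = (1.0 - tx) * grid[r0][c0] + tx * grid[r0][c1]
    bottom = (1.0 - tx) * grid[r1][c0] + tx * grid[r1][c1]
    return (1.0 - ty) * top + ty * bottom


if __name__ == "__main__":
    grid = [[100.0, 200.0], [100.0, 200.0]]  # thresholds of a 2 x 2 window grid
    print([pixel_threshold(grid, 4, 0, x) for x in range(8)])
```

Interpolating in this way gives a threshold surface that varies smoothly across window boundaries instead of jumping from one window value to the next.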
FIG. 14. (a) Result of variable thresholding. Window threshold reassignment was by averaging of
nearest neighbor windows. (b) Result of variable thresholding. Window threshold reassignment was by
use of a pyramid data structure.
8. CONCLUSIONS
The problem of automatic threshold selection has been considered. After a brief
review of available techniques, a novel method has been proposed. It is based on
image statistics which can be computed without histogramming the grey level values
of the image. A detailed analysis of the properties of the algorithm has been carried
out. The effectiveness of the method has been shown on a number of practical
examples.
REFERENCES
1. J. Kittler and K. Paler, An absorption edge detector, in Proceedings Computer Vision and Pattern
Recognition Conf., Washington 1983, pp. 345-350.
2. J. Kittler, J. Illingworth, and K. Paler, The magnitude accuracy of the template edge detector, Pattern
Recognition 16, 1983, 607-613.
3. J. Eklundh and A. Rosenfeld, Peak detection using difference operators, IEEE Trans. Pattern Anal.
Mach. Intell. PAMI-1, No. 3, 1979, 317-325.
4. S. L. Horowitz, Peak recognition in waveforms, in Syntactic Pattern Recognition, Applications (K. S.
Fu, Ed.), Springer-Verlag, Berlin/New York, 1977.
5. I. Tomek, Two algorithms for piecewise-linear continuous approximations of functions of one
variable, IEEE Trans. Comput. C-22, 1974, 445-448.
6. C. Williams, An efficient algorithm for the piecewise linear approximation of planar curves, Comput.
Graphics Image Process. 8, 1978, 286-293.
7. Y. Nakagawa and A. Rosenfeld, Some experiments on variable thresholding, Pattern Recognition 11,
1979, 191-204.
8. T. Ridler and S. Calvard, Picture thresholding using an iterative selection method, IEEE Trans.
Systems Man Cybern. SMC-8, No. 8, 1978, 630-632.
9. J. Weszka and A. Rosenfeld, Histogram modification for threshold selection, IEEE Systems Man
Cybern. SMC-9, No. 1, 1979, 38-52.
10. A. Wu, T. H. Hong, and A. Rosenfeld, Threshold selection using quadtrees, IEEE Trans. Pattern
Anal. Mach. Intell. PAMI-4, No. 1, 1982, 90-94.
11. D. Milgram, Region extraction using convergent evidence, Comput. Graphics Image Process. 11, 1979,
1-12.
12. A. Rosenfeld and P. De La Torre, Histogram concavity analysis as an aid in threshold selection,
IEEE Trans. Systems Man Cybern. SMC-13, No. 3, 1983, 231-235.
13. S. K. Pal, R. A. King, and A. A. Hashim, Automatic grey level thresholding through index of
fuzziness and entropy, to appear.
14. N. Otsu, A threshold selection method from grey level histograms, IEEE Trans. Systems Man
Cybern. SMC-9, No. 1, 1979, 62-66.
15. N. Otsu, Discriminant and least squares threshold selection, in 4th Int. Joint Conf. on Pattern
Recognition, Kyoto, Japan, 1978, pp. 592-596.
16. T. Pun, A new method for grey level picture thresholding using the entropy of the histogram, Signal
Process. 2, 1980, 223-237.
17. T. Pun, Entropic thresholding, a new approach, Comput. Graphics Image Process. 16, 1981, 210-239.
18. G. Johannsen and J. Bille, A threshold selection method using information measures, in 6th Int. Conf.
on Pattern Recognition, Munich, Germany, 1982.
19. F. Deravi and S. K. Pal, Grey level thresholding using second-order statistics, Pattern Recognition
Lett. 1, Nos. 5, 6, 1983, 417-422.
20. W. Barrett, An iterative algorithm for multiple threshold selection, in Proc. IEEE Comput. Soc. Conf.
on Pattern Recognition and Image Process., Dallas, Texas, 1981, pp. 273-278.
21. A. Rosenfeld and R. Smith, Thresholding using relaxation, IEEE Trans. Pattern Anal. Mach. Intell.
PAMI-3, No. 5, 1981, 598-606.
22. S. Ranade and J. Prewitt, A comparison of some segmentation algorithms for cytology, in 5th Int.
Conf. on Pattern Recognition, Miami, Fla., 1980, pp. 561-564.
23. B. Parvin and B. Bhanu, Segmentation of images using a relaxation technique, in IEEE Comput. Soc.
Conf. on Computer Vision and Pattern Recognition, Washington, D.C., 1983, pp. 151-153.
24. B. Bhanu and O. Faugeras, Segmentation of images having unimodal distributions, IEEE Trans.
Pattern Anal. Mach. Intell. PAMI-4, No. 4, 1982, 408-419.
25. B. Nordin, E. Bengtsson, B. Dahlqvist, O. Eriksson, T. Jarkrans, and B. Stenkvist, Object orientated
cell image segmentation, in First IEEE Symp. on Medical Imaging and Image Interpretation,
Berlin 1982.
26. H. Bunke, H. Feistl, H. Niemann, G. Sagerer, F. Wolf, and G. X. Zhou, Smoothing, thresholding and
contour extraction in images from gated blood pool studies, in First IEEE Symp. on Medical
Imaging and Image Interpretation, Berlin 1982.
27. G. Fernando and D. M. Munro, Variable thresholding applied to angiography, in First IEEE Symp.
on Medical Imaging and Image Interpretation, Berlin 1982.
28. J. Tokumitsu, S. Kawata, Y. Ichioka, and T. Suzuki, Adaptive binarisation using a hybrid image
processing system, Appl. Optics 17, No. 16, 1978, 2655-2657.
29. J. White and G. Rohrer, Image thresholding for OCR and other applications requiring character
image extraction, IBM J. Res. Dev. 27, No. 4, 1983, 400-410.
30. H. J. Trussell, Comments on "Picture thresholding using an iterative selection method," IEEE Trans.
Systems Man Cybern. SMC-9, 1979, 311.
31. J. Kittler, J. Illingworth, J. Foglein, and K. Paler, An automatic thresholding method for waveform
segmentation, in Proc. Digital Signal Processing-84, Florence 1984, pp. 727-732.
32. J. Kittler, J. Illingworth, J. Foglein, and K. Paler, An automatic thresholding algorithm and its
performance, in Proc. 7th Int. Conf. on Pattern Recognition, Montreal 1984, pp. 245.
33. R. Kohler, A segmentation system based on thresholding, Comput. Graphics Image Process. 15, 1981,
319-338.
34. A. R. Hanson and E. M. Riseman, Processing cones: A computational structure for image analysis, in
Structured Computer Vision (S. Tanimoto and A. Klinger, Eds.), Academic Press, New York,
1980.