1. Merge two regions R_i and R_j if w/P_m > θ₁, where P_m = min(P_i, P_j), P_i and P_j are
the perimeters of R_i and R_j, and w is the number of weak boundary locations
(pixels on either side have their magnitude difference less than some threshold
a). The parameter θ₁ controls the size of the region to be merged. For example,
θ₁ = 1 implies two regions will be merged only if one of the regions almost
surrounds the other. Typically, θ₁ = 0.5.
2. Merge R_i and R_j if w/l > θ₂, where l is the length of the common boundary
between the two regions. Typically θ₂ = 0.75. Thus the two regions are merged if
the common boundary is sufficiently weak. Often this step is applied after the first
heuristic has been used to reduce the number of regions. (A code sketch of these
two boundary-weakness tests appears after this list.)
3. Merge R_i and R_j only if there are no strong edge points between them. Note
that the run-length connectivity method for binary images can be interpreted
as an example of this heuristic.
4. Merge R_i and R_j if their similarity distance (see Section 9.14) is less than a
threshold.
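As an illustration of the first two heuristics, the sketch below evaluates the weak-boundary tests for a pair of labeled regions. It is only a minimal sketch, assuming a label image, a gradient-magnitude image `grad_mag`, 4-connectivity, and the illustrative function name `weak_boundary_merge_test`; none of these names or defaults come from the text.

```python
import numpy as np

def weak_boundary_merge_test(labels, grad_mag, i, j,
                             a=10.0, theta1=0.5, theta2=0.75):
    """Sketch of merge heuristics 1 and 2 (illustrative, not from the text).

    labels   : 2-D array of region labels
    grad_mag : 2-D array of edge-magnitude values
    i, j     : labels of the two candidate regions R_i and R_j
    a        : weak-boundary magnitude threshold (arbitrary placeholder value)
    """
    ri, rj = labels == i, labels == j
    shifts = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # 4-neighborhood

    # Common boundary: pixels of R_i with a 4-neighbor in R_j
    # (np.roll wrap-around at the image border is ignored for brevity).
    common = np.zeros_like(ri)
    for dy, dx in shifts:
        common |= ri & np.roll(rj, (dy, dx), axis=(0, 1))

    boundary_len = common.sum()                   # l, length of common boundary
    if boundary_len == 0:
        return False

    # Weak boundary locations: edge magnitude below threshold a.
    w = (common & (grad_mag < a)).sum()

    # Crude perimeter estimate: region pixels touching a different label.
    def perimeter(mask):
        border = np.zeros_like(mask)
        for dy, dx in shifts:
            border |= mask & ~np.roll(mask, (dy, dx), axis=(0, 1))
        return border.sum()

    p_m = min(perimeter(ri), perimeter(rj))       # P_m = min(P_i, P_j)

    # Heuristic 1: weak boundary relative to the smaller perimeter.
    # Heuristic 2: weak fraction of the common boundary.
    return (w / p_m > theta1) or (w / boundary_len > theta2)
```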
[Figure: region growing by split and merge — (a) input; successive split and merge passes produce labeled regions such as 1(A, B, D), 2(A, B, C), 3(B, C, D).]
Template Matching
Texture Segmentation
A major task after feature extraction is to classify the object into one of several
categories. Figure 9.2 lists various classification techniques applicable in image
analysis. Although an in-depth discussion of classification techniques can be found
in the pattern-recognition literature (see, for example, [1]), we will briefly review
these techniques here to establish their relevance in image analysis.
It should be mentioned that classification and segmentation processes have
closely related objectives. Classification can lead to segmentation, and vice versa.
Classification of pixels in an image is another form of component labeling that can
result in segmentation of various objects in the image. For example, in remote
sensing, classification of multispectral data at each pixel location results in
segmentation of various regions of wheat, barley, rice, and the like. Similarly, image
segmentation by template matching, as in character recognition, leads to
classification or identification of each object.
There are two basic approaches to classification, supervised and nonsupervised,
depending on whether or not a set of prototypes is available.
Supervised Learning
Let x denote the N × 1 feature vector obtained from the observed image. A fundamental
function in pattern recognition is the discriminant function. It is defined such that the
kth discriminant function g_k(x) takes the maximum value if x belongs to class k; that is,
the decision rule is
$$ g_k(x) > g_i(x), \quad \forall\, i \neq k \;\Rightarrow\; x \in S_k \qquad (9.138) $$
For a K-class problem, we need K − 1 discriminant functions. These functions
divide the N-dimensional feature space into K different regions with a maximum of
K(K − 1)/2 hypersurfaces. The partitions become hyperplanes if the discriminant
function is linear, that is, if it has the form
$$ g_k(x) = a_k^T x + b_k \qquad (9.139) $$
Such a function arises, for example, when x is classified to the class whose centroid
is nearest in Euclidean distance to it (Problem 9.17). The associated classifier is
called the minimum mean (Euclidean) distance classifier.
An alternative decision rule is to classify x to S_i if, among a total of k nearest
prototype neighbors of x, the maximum number of neighbors belongs to class S_i. This
is the k-nearest neighbor classifier, which for k = 1 becomes a minimum-distance
classifier.
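For concreteness, here is a minimal sketch of the minimum Euclidean distance rule and the k-nearest neighbor rule over a set of labeled prototype vectors; the NumPy realization and the function names are illustrative assumptions, not taken from the text.

```python
import numpy as np

def min_distance_classify(x, class_means):
    """Assign x to the class whose centroid is nearest in Euclidean
    distance -- the minimum mean (Euclidean) distance classifier."""
    d = np.linalg.norm(class_means - x, axis=1)
    return int(np.argmin(d))

def knn_classify(x, prototypes, labels, k=3):
    """Assign x to the class holding the majority among its k nearest
    prototypes; k = 1 reduces to a minimum-distance rule."""
    d = np.linalg.norm(prototypes - x, axis=1)
    nearest = labels[np.argsort(d)[:k]]
    values, counts = np.unique(nearest, return_counts=True)
    return int(values[np.argmax(counts)])
```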
When some choice of linear discriminant functions classifies all the prototypes
correctly, the classes are said to be linearly separable. In that case, the
weights a_k and b_k can be determined via a successive linear training algorithm. Other
discriminants can be piecewise linear, quadratic, or polynomial functions. The
k-nearest neighbor classification can be shown to be equivalent to using piecewise
linear discriminants.
Decision tree classification [60, 61]. Another distribution-free classifier,
called a decision tree classifier, splits the N-dimensional feature space into unique
regions by a sequential method. The algorithm is such that every class need not be
tested to arrive at a decision. This becomes advantageous when the number of
classes is very large. Moreover, unlike many other training algorithms, this
algorithm is guaranteed to converge whether or not the feature space is linearly
separable.
Let µ_k(i) and σ_k(i) denote the mean and standard deviation, respectively,
measured from repeated independent observations of the kth prototype vector
element y_m^(k)(i), m = 1, ..., M_k. Define the normalized average prototype features
z_k(i) ≜ µ_k(i)/σ_k(i) and an N × K matrix
$$
Z = \begin{bmatrix}
z_1(1) & z_2(1) & \cdots & z_K(1) \\
z_1(2) & z_2(2) & \cdots & z_K(2) \\
\vdots & \vdots & & \vdots \\
z_1(N) & z_2(N) & \cdots & z_K(N)
\end{bmatrix} \qquad (9.140)
$$
The row number of Z is the feature number and the column number is the object or
class number. Further, let Z' denote the matrix obtained by arranging the
elements of each row of Z in increasing order, with the smallest element on the left
and the largest on the right. Now, the algorithm is as follows.
Decision Tree Algorithm
Step 1 Convert Z to Z'. Find the maximum distance between adjacent row
elements in each row of Z' . Find r, the row number with the largest maximum
distance. The row r represents a feature. Set a threshold at the midpoint of the
maximum distance boundaries and split row r into two parts.
Step 2 Convert Z' to Z̃ such that the row r is the same in both matrices.
The elements of the other rows of Z' are rearranged such that each column of Z̃
represents a prototype vector. This means, simply, that the elements of each row of
Z̃ are in the same order as the elements of row r. Split Z̃ into two matrices Z_1 and Z_2
at the threshold found in Step 1, and repeat the procedure on each submatrix until
every submatrix contains a single class.
This gives

$$
Z' = \begin{bmatrix} 6 & 12 & 20 & 24 & 27 \\ 28 & 35 & 42 & 48 & 56 \end{bmatrix}
\;\Rightarrow\;
\tilde{Z}_1 = \begin{bmatrix} 6 & 12 & 20 & 24 & 27 \\ 56 & 28 & 42 & 35 & 48 \end{bmatrix},
\qquad \eta_1 = 16
$$

where the columns of $\tilde{Z}_1$ correspond to classes 1, 2, 3, 4, 5. The largest adjacent
difference in the first row is 8; in the second row it is 7. Hence the first row is chosen,
and z(1) is the feature to be thresholded. This splits $\tilde{Z}_1$ into $Z_2$ (classes 1, 2) and
$Z_3$ (classes 3, 4, 5), as shown. Proceeding similarly with these matrices, we get

$$
Z_2' = \begin{bmatrix} 6 & 12 \\ 28 & 56 \end{bmatrix}
\;\Rightarrow\;
\tilde{Z}_2 = \begin{bmatrix} 12 & 6 \\ 28 & 56 \end{bmatrix}
\;(\text{classes } 2, 1),
\qquad \eta_2 = 42
$$

$$
Z_3' = \begin{bmatrix} 20 & 24 & 27 \\ 35 & 42 & 48 \end{bmatrix}
\;\Rightarrow\;
\tilde{Z}_3 = \begin{bmatrix} 24 & 20 & 27 \\ 35 & 42 & 48 \end{bmatrix}
\;(\text{classes } 4, 3, 5),
\qquad \eta_3 = 38.5, \; \eta_4 = 23.5
$$

Here $\eta_2$ and $\eta_3$ are thresholds on z(2); $\eta_3$ separates class 4 from the
submatrix $Z_4$ (classes 3 and 5), which is then split on z(1) at $\eta_4 = 23.5$.
The thresholds partition the feature space and induce the decision tree, as shown in
Fig. 9.58.
[Fig. 9.58: The five classes plotted in the (z(1), z(2)) feature plane, with the thresholds η1, ..., η4 partitioning the plane and the corresponding decision tree.]
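The tree-building procedure can be summarized in a short recursive sketch. The code below is an illustrative reading of Steps 1 and 2 (split on the feature whose sorted values contain the widest gap, threshold at the gap midpoint, recurse); the function name and NumPy realization are assumptions, and the Z matrix is the one reconstructed from the worked example above.

```python
import numpy as np

def build_tree(Z, classes):
    """Recursively split the columns (classes) of Z on the feature whose
    sorted values contain the widest gap; threshold at the gap midpoint."""
    if len(classes) == 1:
        return {"class": classes[0]}

    best = None
    for r in range(Z.shape[0]):                 # examine each feature (row)
        vals = np.sort(Z[r])
        gaps = np.diff(vals)
        g = int(np.argmax(gaps))
        if best is None or gaps[g] > best[0]:
            eta = 0.5 * (vals[g] + vals[g + 1])  # midpoint threshold
            best = (gaps[g], r, eta)

    _, r, eta = best
    left = Z[r] <= eta                           # columns below the threshold
    return {
        "feature": r, "threshold": eta,
        "left":  build_tree(Z[:, left],  [c for c, m in zip(classes, left) if m]),
        "right": build_tree(Z[:, ~left], [c for c, m in zip(classes, ~left) if m]),
    }

# Prototypes of the worked example: rows are features z(1), z(2),
# columns are classes 1..5 (values taken from the example).
Z = np.array([[ 6, 12, 20, 24, 27],
              [56, 28, 42, 35, 48]], dtype=float)
tree = build_tree(Z, classes=[1, 2, 3, 4, 5])   # reproduces eta = 16, 42, 38.5, 23.5
```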
The Bayes (minimum-risk) classifier assigns x so as to minimize the average risk

$$ \text{Risk } \mathcal{R} \triangleq \sum_{k=1}^{K} \int_{R_k} c(x \mid S_k)\, p(x)\, dx \qquad (9.141) $$

where c(x|S_k) is the average cost of classifying x into class S_k and R_k is the region of
feature space assigned to S_k. The risk is minimized by the decision rule

$$ \sum_{i=1}^{K} c_{i,k}\, P(S_i)\, p(x \mid S_i) < \sum_{i=1}^{K} c_{i,j}\, P(S_i)\, p(x \mid S_i), \quad \forall\, j \neq k \;\Rightarrow\; x \in S_k \qquad (9.142) $$

where c_{i,k} denotes the cost of classifying x into class S_k when it actually belongs to S_i.
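Evaluating the rule (9.142) is straightforward once the priors P(S_i), the class-conditional densities p(x|S_i), and the costs c_{i,k} are specified. The sketch below assumes Gaussian class-conditional densities and uses SciPy purely for illustration; nothing in the text prescribes this form.

```python
import numpy as np
from scipy.stats import multivariate_normal

def bayes_classify(x, priors, means, covs, cost):
    """Minimum-risk (Bayes) rule of Eq. (9.142).

    priors      : P(S_i), shape (K,)
    means, covs : parameters of the assumed Gaussian densities p(x|S_i)
    cost        : cost[i, k] = cost of deciding class k when the true class is i
    """
    K = len(priors)
    # p(x|S_i) P(S_i) for each class i
    weighted = np.array([
        multivariate_normal.pdf(x, mean=means[i], cov=covs[i]) * priors[i]
        for i in range(K)
    ])
    # Conditional risk of deciding class k: sum_i cost[i, k] p(x|S_i) P(S_i)
    risks = cost.T @ weighted
    return int(np.argmin(risks))
```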
[Figure: panels (a)–(d); a scatter of feature vectors in the (v1, v2) plane (0 ≤ v1 ≤ 240, 0 ≤ v2 ≤ 180) showing Cluster 1 and Cluster 2, with a flowchart fragment in which a similarity rule decides whether to split.]
Chain method [63]. The first data sample is designated as the representative
of the first cluster, and the similarity or distance of the next sample is measured from
the first cluster representative. If this distance is less than a threshold, say η, the
sample is placed in the first cluster; otherwise it becomes the representative of the
second cluster. The process is continued for each new data sample until all the data
have been exhausted. Note that this is a one-pass method.
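A minimal sketch of this one-pass procedure, assuming Euclidean distance, a distance threshold eta, and that each new sample is compared against all existing cluster representatives, might look as follows.

```python
import numpy as np

def chain_cluster(samples, eta):
    """One-pass chain method: each sample joins the nearest existing
    cluster representative if within eta, else starts a new cluster."""
    reps, assignments = [], []
    for x in samples:
        if reps:
            d = [np.linalg.norm(x - r) for r in reps]
            j = int(np.argmin(d))
            if d[j] < eta:
                assignments.append(j)
                continue
        reps.append(x)                    # x becomes a new representative
        assignments.append(len(reps) - 1)
    return reps, assignments
```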
The procedure is repeated for each x_i, one at a time, until the clusters and their
centers remain unchanged. If d(x, y) is the Euclidean distance, then a cluster center
is simply the mean location of its elements. If K is not known, we start with a large
value of K.
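The iteration described here behaves like a batch K-means loop; under that assumption, a minimal sketch with Euclidean distance and mean-valued cluster centers is shown below (the function name and initialization are illustrative, not from the text).

```python
import numpy as np

def iterative_cluster(X, K, max_iter=100, seed=0):
    """Reassign each sample to its nearest center and recompute centers
    as the means of their members, until the assignments stop changing."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for it in range(max_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = d.argmin(axis=1)
        if it > 0 and np.array_equal(new_labels, labels):
            break                          # clusters and centers unchanged
        labels = new_labels
        for k in range(K):
            members = X[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return centers, labels
```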
[Figure: block diagram of image analysis — feature extraction maps the image to features, which are classified into one of the classes S_k, k = 1, ..., K; symbols feed a symbolic representation and interpretation stage, using visual models and look-up tables, to produce a description.]