Dip Unit 4

18ECE011T - DIGITAL IMAGE PROCESSING
UNIT-4
IMAGE SEGMENTATION AND REPRESENTATION
Introduction
• What is image segmentation?
• Technically speaking, image segmentation refers to the decomposition of a scene into different components (to facilitate tasks at higher levels such as object detection and recognition).
• Scientifically speaking, segmentation is a hypothetical middle-level vision task performed by neurons between low-level and high-level cortical areas.
• There is no ground truth for a segmentation task (an example is given in the next slide).
EE465: Introduction to Digital Image Processing. Copyright Xin Li
Dilemma
[Figure: input image, segmentation result 1, segmentation result 2]
What do we mean by “DIFFERENT” objects?
Another example: when we look at trees at a close distance, we consider each of them as a different object; but as we look at trees far away, they merge into one coherent object (woods).

Overview of Segmentation Techniques
Techniques:
• Edge-based
• Color-based
• Texture-based
• Disparity-based
• Motion-based
Application images:
• Document images
• Medical images
• Range images
• Biometric images
• Texture images

Edge-based Techniques
[Diagram: edge detection; segmentation by boundary detection; classification and analysis; region filling]

Edge Detection
Basic idea: look for a neighborhood with strong signs of change.
Example neighborhood (gray levels):
81 82 26 24
82 33 25 25
81 82 26 24
Problems:
• neighborhood size
• how to detect change
Differential Operators
Differential operators
• attempt to approximate the gradient at a pixel via masks
• threshold the gradient to select the edge pixels
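Example: Sobel Operator
$$S_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \qquad S_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}$$
On a pixel of the image
• let $g_x$ be the response to $S_x$
• let $g_y$ be the response to $S_y$
Then $g = (g_x^2 + g_y^2)^{1/2}$ is the gradient magnitude, and $\theta = \operatorname{atan2}(g_y, g_x)$ is the gradient direction.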
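The Sobel computation at a single pixel can be sketched as follows. This is a minimal illustration in plain Python, assuming the image is a list of rows and the masks are applied as a correlation (no mask flipping); the helper names `mask_response` and `sobel_at` are made up for this example.

```python
import math

# Sobel masks as given on the slide.
SX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
SY = [[ 1,  2,  1],
      [ 0,  0,  0],
      [-1, -2, -1]]

def mask_response(image, mask, r, c):
    """Sum of mask[i][j] * image value over the 3x3 neighborhood centered at (r, c)."""
    return sum(mask[i][j] * image[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))

def sobel_at(image, r, c):
    """Return (gradient magnitude, gradient direction in radians) at an interior pixel."""
    gx = mask_response(image, SX, r, c)
    gy = mask_response(image, SY, r, c)
    return math.hypot(gx, gy), math.atan2(gy, gx)

# Vertical step edge: dark columns on the left, bright columns on the right.
image = [[10, 10, 90, 90] for _ in range(4)]
magnitude, direction = sobel_at(image, 1, 1)
print(magnitude, direction)
```

On this vertical step edge only $S_x$ responds, so the direction comes out as 0 radians (pointing across the edge).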
Java Toolkit’s Sobel Operator
[Figure: original image, gradient magnitude, thresholded gradient magnitude]
Zero Crossing Operators
Motivation: The zero crossings of the second derivative of the image function localize an edge more precisely than the peaks of the first derivative.
[Figure: a step edge, the smoothed edge, its 1st derivative, and its 2nd derivative with the zero crossing marked]
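The zero-crossing idea can be tried on a 1D step edge. The sketch below is only an illustration: the smoothing kernel [1, 2, 1] / 4, the discrete second difference, and the function names are assumptions chosen for brevity, not operators prescribed by the slides.

```python
def smooth(signal):
    """Smooth with the 3-tap kernel [1, 2, 1] / 4, replicating the end samples."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [(padded[i] + 2 * padded[i + 1] + padded[i + 2]) / 4
            for i in range(len(signal))]

def second_derivative(signal):
    """Discrete second derivative s[i-1] - 2*s[i] + s[i+1] at the interior samples."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]

def zero_crossings(values):
    """Indices i where values[i] and values[i+1] have strictly opposite signs."""
    return [i for i in range(len(values) - 1) if values[i] * values[i + 1] < 0]

step = [0, 0, 0, 0, 8, 8, 8, 8]      # ideal step between samples 3 and 4
smoothed = smooth(smooth(step))      # smoothing spreads the step over several samples
d2 = second_derivative(smoothed)     # d2[k] belongs to sample k + 1
crossings = zero_crossings(d2)
print(d2, crossings)
```

The single sign change in `d2` falls between the entries for samples 3 and 4, which is where the original step sits, even though smoothing has widened the transition.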
Canny Edge Detector
• Smooth the image with a Gaussian filter.
• Compute gradient magnitude and direction at each pixel of the smoothed image.
• Zero out any pixel response ≤ the two neighboring pixels on either side of it, along the direction of the gradient.
• Track high-magnitude contours.
• Keep only pixels along these contours, so weak little segments go away.
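The third step (non-maximum suppression) is the least obvious one, so here is a rough sketch of it alone. It assumes the gradient magnitude and direction are already available as lists of rows, that the direction is measured with x along columns and y along rows increasing downward, and that directions are quantized to 0, 45, 90 or 135 degrees; the function name is hypothetical and the other Canny steps are not shown.

```python
import math

def non_max_suppress(mag, theta):
    """Keep a response only if it is strictly larger than both neighbors
    along the (quantized) gradient direction; everything else becomes 0."""
    rows, cols = len(mag), len(mag[0])
    # (row step, column step) for each quantized direction; rows grow downward (assumption).
    offsets = {0: (0, 1), 45: (1, 1), 90: (1, 0), 135: (1, -1)}
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):          # border pixels are left at 0
        for c in range(1, cols - 1):
            angle = math.degrees(theta[r][c]) % 180
            q = (int((angle + 22.5) // 45) * 45) % 180
            dr, dc = offsets[q]
            if mag[r][c] > mag[r + dr][c + dc] and mag[r][c] > mag[r - dr][c - dc]:
                out[r][c] = mag[r][c]
    return out

# A thick vertical ridge of gradient magnitude; the gradient points horizontally (theta = 0).
mag = [[1, 2, 5, 2, 1] for _ in range(5)]
theta = [[0.0] * 5 for _ in range(5)]
thin = non_max_suppress(mag, theta)
print(thin[2])
```

Only the ridge center survives in each interior row, which is what makes the later contour tracking operate on thin edges.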
Canny Examples [figure]
Best Canny on Kidney from Hw1 [figure]
Best Canny on Blocks from Hw1 [figure]
Hough Transform
• The Hough transform is a method for detecting lines or curves specified by a parametric function.
• If the parameters are p1, p2, … pn, then the Hough procedure uses an n-dimensional accumulator array in which it accumulates votes for the correct parameters of the lines or curves found on the image.
[Figure: a line y = mx + b in the image and the corresponding accumulator array with axes m and b]
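Voting in the (m, b) accumulator can be illustrated with a few points. This sketch makes simplifying assumptions: candidate slopes are a small set of integers, intercepts are used exactly instead of being binned, and a `Counter` stands in for the accumulator array. Practical implementations usually discretize both parameters and often switch to a (ρ, θ) parameterization, because m is unbounded for vertical lines.

```python
from collections import Counter

def hough_lines(points, slopes):
    """Each point (x, y) votes for the cell (m, b) with b = y - m*x, for every candidate m."""
    accumulator = Counter()
    for x, y in points:
        for m in slopes:
            b = y - m * x
            accumulator[(m, b)] += 1
    return accumulator

# Four points on y = 2x + 1 plus one outlier.
points = [(0, 1), (1, 3), (2, 5), (3, 7), (2, 0)]
acc = hough_lines(points, slopes=range(-3, 4))
(best_m, best_b), votes = acc.most_common(1)[0]
print(best_m, best_b, votes)
```

The cell with the most votes recovers the slope and intercept of the line shared by the four collinear points; the outlier contributes only scattered single votes.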
Thresholding
A non-contextual approach.
The input to a thresholding operation is typically a grayscale or color image. In the simplest implementation, the output is a binary image representing the segmentation. Black pixels correspond to background and white pixels correspond to foreground (or vice versa).
Thresholding of pixel grey level (Basic Global Thresholding)
Segmentation is accomplished by scanning the image pixel by pixel and labeling each pixel as object or background, depending on whether the grey level is greater or less than the value of T.
$$g(x, y) = \begin{cases} 0, & f(x, y) \le T \\ 1, & f(x, y) > T \end{cases}$$
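As a small sketch of this pixel-by-pixel labeling, assuming the image is a list of rows of gray levels and that bright pixels are the object (the function name and the sample values are illustrative only):

```python
def threshold(image, T):
    """Label each pixel 1 (object) if its gray level exceeds T, else 0 (background)."""
    return [[1 if pixel > T else 0 for pixel in row] for row in image]

image = [[12,  20, 200],
         [15, 180, 210],
         [10,  18,  25]]
binary = threshold(image, 100)
for row in binary:
    print(row)
```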

Thresholding works well when the grey level histogram of the image separates the pixels of the object and the background into two dominant modes. Then a threshold T can be easily chosen between the modes.
Picking the threshold is the hard part
• A human operator decides the threshold
• Use the mean gray level of the image
• A fixed proportion of pixels is detected (set to 1) by the thresholding operation
• Analyze the histogram of the image
Basic Global Thresholding
Figure 1 A) shows a classic bi-modal intensity distribution. This image can be successfully segmented using a single threshold T1. B) is slightly more complicated. Here we suppose the central peak represents the objects we are interested in and so threshold segmentation requires two thresholds: T1 and T2. In C), the two peaks of a bi-modal distribution have run together and so it is almost certainly not possible to successfully segment this image using a single global threshold.
Basic Global Thresholding
The same approach can be used with more than one threshold value. For example, for thresholds T1 and T2, any point which satisfies the relation T1 < f(x,y) < T2 would be labeled as an object point and all others would be labeled background points.
In general this technique is less reliable than using a single threshold. This is because it is often difficult to establish multiple thresholds that effectively isolate the region of interest, especially when the number of modes in the corresponding histogram is high.
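The two-threshold rule can be sketched in the same style. The function name and the sample gray levels are assumptions for illustration; the strict inequalities follow the relation T1 < f(x,y) < T2 stated above.

```python
def band_threshold(image, t1, t2):
    """Label a pixel 1 (object) when t1 < f(x, y) < t2, else 0 (background)."""
    return [[1 if t1 < pixel < t2 else 0 for pixel in row] for row in image]

image = [[ 20, 140, 240],
         [135, 145, 130],
         [ 90, 149, 150]]
mask = band_threshold(image, 130, 150)
for row in mask:
    print(row)
```

Pixels exactly equal to T1 or T2 fall outside the band here; whether the bounds are inclusive is a design choice that the relation above leaves to the strict form.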
Basic Global and Local Thresholding
Thresholding may be viewed as an operation that involves tests against a function T of the form:
T = T[x, y, p(x,y), f(x,y)]
where f(x,y) is the gray level and p(x,y) is some local property.
Simple thresholding schemes compare each pixel's gray level with a single global threshold, so T depends only on f(x,y). This is referred to as Global Thresholding.
If T depends on both f(x,y) and p(x,y), this is referred to as Local Thresholding.
An algorithm used to obtain T automatically for global thresholding
1. Select an initial estimate for T.
2. Segment the image using T. This will produce two groups of pixels: G1, consisting of all pixels with gray level values > T, and G2, consisting of pixels with values ≤ T.
3. Compute the average gray level values $\mu_1$ and $\mu_2$ for the pixels in regions G1 and G2.
4. Compute a new threshold value: $T = \frac{1}{2}(\mu_1 + \mu_2)$
5. Repeat steps 2 through 4 until the difference in T in successive iterations is smaller than a predefined parameter $T_0$.
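The five steps above can be sketched as a short loop. This is an illustration under stated assumptions: the pixels are passed as a flat list, the initial estimate is the mean gray level (one possible choice for step 1), and pixels equal to T are placed in G2 so that no pixel is left unassigned. The names `iterative_threshold` and `t0` are made up for this example.

```python
def iterative_threshold(pixels, t0=0.5):
    """Iterate T = (mu1 + mu2) / 2 until T changes by less than t0."""
    T = sum(pixels) / len(pixels)              # step 1: initial estimate (assumed: the mean)
    while True:
        g1 = [p for p in pixels if p > T]      # step 2: split into two groups
        g2 = [p for p in pixels if p <= T]
        if not g1 or not g2:                   # degenerate split: nothing to average
            return T
        mu1 = sum(g1) / len(g1)                # step 3: group means
        mu2 = sum(g2) / len(g2)
        new_T = 0.5 * (mu1 + mu2)              # step 4: new threshold
        if abs(new_T - T) < t0:                # step 5: stop when T settles
            return new_T
        T = new_T

pixels = [10, 12, 14, 16, 18, 200, 210]
T = iterative_threshold(pixels)
print(T)
```

For these values the threshold moves from the overall mean to the midpoint between the dark group's mean (14) and the bright group's mean (205), then stops.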
Global Thresholding - Guidelines for Use
The histogram for the image [figure not shown] shows a nice bi-modal distribution: the lower peak represents the object and the higher one represents the background. The picture can be segmented using a single threshold at a pixel intensity value of 120. The result is shown in [figure not shown].
The histogram for a second image is [figure not shown]. Due to the severe illumination gradient across the scene, the peaks corresponding to foreground and background have run together and so simple thresholding does not give good results. The following images [figures not shown] show the resulting bad segmentations for single threshold values of 80 and 120 respectively (reasonable results can be achieved by using adaptive thresholding on this image).
Thresholding is also used to filter the output of or input to other operators. For instance, in the former case, an edge detector like Sobel will highlight regions of the image that have high spatial gradients. If we are only interested in gradients above a certain value (i.e. sharp edges), then thresholding can be used to just select the strongest edges and set everything else to black. As an example, [figure not shown] was obtained by first applying the Sobel operator to [figure not shown] and then thresholding this using a threshold value of 60.
Use of Boundary Characteristics for Histogram Improvement and Local Thresholding
From the previous discussion, an indication of whether a pixel is on an edge may be obtained by computing its gradient. In addition, use of the Laplacian can yield information regarding whether a given pixel lies on the dark or light side of an edge. The average value of the Laplacian is 0 at the transition of an edge, so in practice the valleys of histograms formed from the pixels selected by a gradient/Laplacian criterion can be expected to be sparsely populated.
We can calculate the gradient $\nabla f$ and the Laplacian $\nabla^2 f$ at any point (x,y) in an image. These two quantities may be used to form a three-level image, as follows:
$$g(x, y) = \begin{cases} 0, & \nabla f < T \\ +, & \nabla f \ge T \text{ and } \nabla^2 f \ge 0 \\ -, & \nabla f \ge T \text{ and } \nabla^2 f < 0 \end{cases}$$
(1) All pixels that are not on an edge are labeled 0
(2) All pixels on the dark side of an edge are labeled +
(3) All pixels on the light side of an edge are labeled -
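A rough sketch of the three-level labeling follows. The discrete operators are assumptions picked for brevity (central differences for the gradient, the 4-neighbor Laplacian), border pixels are simply left at "0", and the function name is hypothetical.

```python
def three_level(image, T):
    """Label interior pixels '0', '+' or '-' from gradient magnitude and Laplacian sign."""
    rows, cols = len(image), len(image[0])
    labels = [["0"] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2     # central differences
            gy = (image[r + 1][c] - image[r - 1][c]) / 2
            grad = (gx * gx + gy * gy) ** 0.5
            lap = (image[r - 1][c] + image[r + 1][c] + image[r][c - 1]
                   + image[r][c + 1] - 4 * image[r][c])      # 4-neighbor Laplacian
            if grad >= T:
                labels[r][c] = "+" if lap >= 0 else "-"
    return labels

# Dark region (10) on the left, light region (90) on the right.
image = [[10, 10, 10, 90, 90, 90] for _ in range(3)]
labels = three_level(image, T=20)
print(labels[1])
```

In the middle row the last dark pixel before the step is labeled "+" and the first light pixel after it is labeled "-", matching rules (2) and (3).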
Thresholding can be used as preprocessing to extract an interesting subset of image structures which will then be passed along to another operator in an image processing chain. For example, the image [figure not shown] shows a slice of brain tissue containing nervous cells (i.e. the large gray blobs, with darker circular nuclei in the middle) and glia cells (i.e. the isolated, small, black circles).
We can threshold this image so as to map all pixel values between 0 and 150 in the original image to foreground (i.e. 255) values in the binary image, and leave the rest to go to background, as in [figure not shown].
The resultant image can then be connected-components-labeled in order to count the total number of cells in the original image, as in [figure not shown].
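Counting by connected-components labeling can be sketched with a flood fill. This is a minimal illustration, assuming 4-connectivity and a tiny made-up gray-level array in place of the brain-tissue image; the 0 to 150 foreground range follows the text, while the function name is hypothetical.

```python
from collections import deque

def count_components(binary):
    """Count 4-connected groups of foreground (nonzero) pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                count += 1                       # found a new, unvisited component
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:                     # breadth-first flood fill
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

gray = [[200,  40, 200, 200],
        [200,  60, 200, 100],
        [200, 200, 200, 120],
        [ 30, 200, 200, 200]]
# Map gray levels 0..150 to foreground (255), everything else to background (0).
binary = [[255 if p <= 150 else 0 for p in row] for row in gray]
cells = count_components(binary)
print(cells)
```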
If we wanted to know how many nerve cells there are in the original image, we might try applying a double threshold in order to select out just the pixels which correspond to nerve cells (and therefore have middle level grayscale intensities) in the original image. (In remote sensing and medical terminology, such thresholding is usually called density slicing.) Applying a threshold band of 130 - 150 yields [figure not shown].
Thresholding in RGB space
For color or multi-spectral images, it may be possible to set different thresholds for each color channel, and so select just those pixels within a specified cuboid in RGB space. Another common variant is to set to black all those pixels corresponding to background, but leave foreground pixels at their original color/intensity (as opposed to forcing them to white), so that that information is not lost.
$$g(x, y) = \begin{cases} 1, & d(x, y) \le d_{\max} \\ 0, & d(x, y) > d_{\max} \end{cases}$$
where
$$d(x, y) = \sqrt{[f_R(x, y) - R_0]^2 + [f_G(x, y) - G_0]^2 + [f_B(x, y) - B_0]^2}$$
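The distance test in RGB space can be sketched as follows, assuming each pixel is an (R, G, B) tuple and that d is the Euclidean distance to a reference color (R0, G0, B0). The function name, the reference color, and the sample pixels are illustrative assumptions.

```python
import math

def rgb_threshold(image, reference, d_max):
    """Label a pixel 1 when its RGB distance to the reference color is at most d_max."""
    r0, g0, b0 = reference
    return [[1 if math.sqrt((r - r0) ** 2 + (g - g0) ** 2 + (b - b0) ** 2) <= d_max else 0
             for (r, g, b) in row]
            for row in image]

image = [[(250, 10, 10), (10, 250, 10)],
         [(200, 30, 30), (240, 0, 20)]]
mask = rgb_threshold(image, reference=(255, 0, 0), d_max=60)
for row in mask:
    print(row)
```

The two pixels close to pure red are selected; the darker red at distance of roughly 69 and the green pixel are rejected.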
Adaptive Thresholding
A more complex thresholding algorithm would be to use a spatially varying threshold. This approach is very useful to compensate for the effects of non-uniform illumination. If T depends on the coordinates x and y, this is referred to as Dynamic Thresholding or Adaptive Thresholding.
Another approach is to perform a preprocessing step before thresholding. Preprocessing the image to remove noise or other non-uniformities can improve the performance of the thresholding.
A technique which often provides better results is to use only edge points when creating the grey level histogram.
Adaptive thresholding - how it works?
There are two main approaches to finding the threshold:
(i) the Chow and Kaneko approach and
(ii) local thresholding.
The assumption behind both methods is that smaller image regions are more likely to have approximately uniform illumination, thus being more suitable for thresholding.
Chow and Kaneko divide an image into an array of overlapping subimages and then find the optimum threshold for each subimage by investigating its histogram. The threshold for each single pixel is found by interpolating the results of the subimages. The drawback of this method is that it is computationally expensive and, therefore, is not appropriate for real-time applications.
Adaptive thresholding - Local thresholding
An alternative approach to finding the local threshold is to statistically examine the intensity values of the local neighborhood of each pixel. The statistic which is most appropriate depends largely on the input image. Simple and fast functions include the mean of the local intensity distribution,
T = mean
the median value,
T = median
or the mean of the minimum and maximum values,
T = (max + min) / 2
The size of the neighborhood has to be large enough to cover sufficient foreground and background pixels, otherwise a poor threshold is chosen. On the other hand, choosing regions which are too large can violate the assumption of approximately uniform illumination. This method is less computationally intensive than the Chow and Kaneko approach and produces good results for some applications.
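The three candidate statistics can be compared on one neighborhood. The sketch below assumes a square window clipped at the image border; the function name, the `half` parameter (window half-width) and the dictionary keys are made up for this illustration.

```python
from statistics import mean, median

def local_thresholds(image, r, c, half=1):
    """Return the three candidate thresholds for the window centered at (r, c)."""
    rows, cols = len(image), len(image[0])
    window = [image[y][x]
              for y in range(max(0, r - half), min(rows, r + half + 1))
              for x in range(max(0, c - half), min(cols, c + half + 1))]
    return {
        "mean": mean(window),
        "median": median(window),
        "midrange": (min(window) + max(window)) / 2,   # mean of minimum and maximum
    }

image = [[10,  10,  10],
         [10, 100,  10],
         [10,  10, 250]]
t = local_thresholds(image, 1, 1)
print(t)
```

On this window the three statistics differ widely (median 10, mean about 46.7, midrange 130), which shows why the appropriate statistic depends on the input image.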
Adaptive thresholding - Guidelines for Use
Local adaptive thresholding, on the other hand, selects an individual threshold for each pixel based on the range of intensity values in its local neighborhood. This allows for thresholding of an image whose global intensity histogram doesn't contain distinctive peaks.
A task well suited to local adaptive thresholding is segmenting text from the image [figure not shown]. Because this image contains a strong illumination gradient, global thresholding produces a very poor result, as can be seen in [figure not shown].
Using the mean of a 7×7 neighborhood, adaptive thresholding yields [figure not shown].
The method succeeds in the area surrounding the text because there are enough foreground and background pixels in the local neighborhood of each pixel; i.e. the mean value lies between the intensity values of foreground and background and, therefore, separates easily. On the margin, however, the mean of the local area is not suitable as a threshold, because the range of intensity values within a local neighborhood is very small and their mean is close to the value of the center pixel.
The situation can be improved if the threshold employed is not the mean, but (mean-C), where C is a constant. Using this statistic, all pixels which exist in a uniform neighborhood (e.g. along the margins) are set to background. The result for a 7×7 neighborhood and C=7 is shown in [figure not shown] and for a 75×75 neighborhood and C=10 in [figure not shown].
The larger window yields the poorer result, because it is more adversely affected by the illumination gradient. Also note that the latter is more computationally intensive than thresholding using the smaller window.
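The (mean-C) rule can be sketched on a synthetic image with an illumination gradient. The assumptions here are loud ones: foreground is taken to be darker than its surroundings (as with dark text on a light page), the window is clipped at the border, and the image, window size and names are invented for the example rather than taken from the figures.

```python
def adaptive_threshold(image, half, C):
    """Mark a pixel as foreground (1) when it is darker than (local mean - C)."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[y][x]
                      for y in range(max(0, r - half), min(rows, r + half + 1))
                      for x in range(max(0, c - half), min(cols, c + half + 1))]
            if image[r][c] < sum(window) / len(window) - C:
                out[r][c] = 1
    return out

# Background brightens from left (100) to right (195); "ink" pixels are 40 darker locally.
ink = {(2, 1), (2, 18)}
image = [[100 + 5 * c - (40 if (r, c) in ink else 0) for c in range(20)]
         for r in range(5)]
result = adaptive_threshold(image, half=1, C=7)
found = {(r, c) for r in range(5) for c in range(20) if result[r][c]}
print(sorted(found))
```

The ink pixel on the bright side has gray level 150, which is brighter than the background on the dark side (100), so no single global threshold separates both ink pixels from the background, while the local rule finds exactly the two of them.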
The result of using the median instead of the mean can be seen in [figure not shown]. (The neighborhood size for this example is 7×7 and C = 4.) The result shows that, in this application, the median is a less suitable statistic than the mean.
Consider another example image containing a strong illumination gradient [figure not shown]. This image cannot be segmented with a global threshold, as shown in [figure not shown], where a threshold of 80 was used.
However, since the image contains a large object, it is hard to apply adaptive thresholding as well. Using (mean - C) as a local threshold, we obtain [figure not shown] with a 7×7 window and C = 4, and [figure not shown] with a 140×140 window and C = 8. All pixels which belong to the object but do not have any background pixels in their neighborhood are set to background. The latter image shows a much better result than that achieved with a global threshold, but it is still missing some pixels in the center of the object. In many applications, computing the mean of a neighborhood (for each pixel!) whose size is of the order 140×140 may take too much time. In this case, the more complex Chow and Kaneko approach to adaptive thresholding would be more successful.
Color-Based Techniques
• Color representations
  - Device dependent: RGB (displaying) or CMYK (printing)
  - Device independent: CIE XYZ or CIELAB (L*a*b*)
• There are different specifications of RGB color spaces (e.g., HP/Microsoft vs. Adobe)

Color Space Conversion
[Figure: color space conversions for analog TV and digital TV (MPEG)]

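Clustering via K-Means Algorithm
An algorithm for partitioning (or clustering) N data points into K disjoint subsets $S_j$ containing $N_j$ data points so as to minimize the sum-of-squares criterion
$$J = \sum_{j=1}^{K} \sum_{n \in S_j} \lVert x_n - \mu_j \rVert^2$$
where $x_n$ is the n-th data point and $\mu_j$ is the centroid of the data points in $S_j$.
[Diagram: data points and centroids] Initialization: randomly choose K centroids; then the data points are assigned to the K sets; then the centroid is updated for each set.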
EE465: Introduction to Digital Image Processing. Copyright Xin Li, 2004
Subproblem I: Clustering by distance to known centers
Subproblem II: Finding the centers from known clustering
Toy Example of K-means Clustering
1. Initialization
2. NN-Clustering
3. Codeword-update
4. Alternate 2 and 3 until convergence
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
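The alternation of the two subproblems can be sketched in a few lines. This is a minimal illustration, not the applet's implementation: the initial centroids are passed in explicitly instead of being chosen at random (so the run is repeatable), squared Euclidean distance is used for the nearest-neighbor step, and the function name and sample points are assumptions.

```python
def kmeans(points, initial_centroids, max_iter=100):
    """Alternate NN-clustering and centroid update until assignments stop changing."""
    centroids = list(initial_centroids)
    assignment = None
    for _ in range(max_iter):
        # Subproblem I: assign each point to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_assignment == assignment:       # converged: no point changed cluster
            break
        assignment = new_assignment
        # Subproblem II: move each centroid to the mean of its assigned points.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:                        # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assignment

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, assignment = kmeans(points, initial_centroids=[(1, 1), (2, 1)])
print(centroids, assignment)
```

Because the distance and the mean are computed per coordinate with `zip`, the same function accepts 3D points such as RGB or L*a*b* triples without changes.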
K-means Clustering: Step 1
Algorithm: k-means, Distance Metric: Euclidean Distance
[Figures: successive scatter plots of the data with three centroids k1, k2 and k3; axes labeled "expression in condition 1" and "expression in condition 2", both ranging from 0 to 5]
Data Clustering via K-means
Instead of 2D data, k-means can be applied to a 3D color space such as RGB or L*a*b*.

Texture-based Techniques
What is Texture?
No one exactly knows.
In the visual arts, texture is the perceived surface quality of an artwork.

Disparity-based Techniques

Motion Segmentation

Document Segmentation
• Document images consist of texts, graphics, photos and so on
• Document segmentation is useful for compression and text recognition
• Adobe and Xerox are the major players

Medical Image Segmentation
• Medical image analysis can be used as a preliminary screening technique to help doctors
• Partial differential equation (PDE) methods have been used for segmenting medical images
[Figure: active contour model (snake)]
Range Image Segmentation
[Figure: range image, intensity image, ground truth]

Biometric Image Segmentation
• For fingerprint, face and iris images, we also need to segment out the region of interest
• Various cues can be used, such as ridge pattern, skin color and pupil shape
• Robust segmentation could be difficult for poor-quality images
