UNIT I DIGITAL IMAGE FUNDAMENTALS AND TRANSFORMS
Elements of visual perception - Image sampling and quantization - Basic relationships between pixels - Basic geometric transformations - Introduction to Fourier transform and DFT - Properties of 2D Fourier transform - FFT - Separable image transforms - Walsh-Hadamard - Discrete cosine transform - Haar, Slant, Karhunen-Loeve transforms.
UNIT II IMAGE ENHANCEMENT TECHNIQUES
Spatial domain methods: Basic grey level transformations - Histogram equalization - Image subtraction - Image averaging - Spatial filtering - Smoothing and sharpening filters - Laplacian filters - Frequency domain filters - Smoothing and sharpening filters - Homomorphic filtering.
UNIT III IMAGE RESTORATION
Model of the image degradation/restoration process - Noise models - Inverse filtering - Least mean square filtering - Constrained least mean square filtering - Blind image restoration - Pseudo-inverse - Singular value decomposition.
UNIT IV IMAGE COMPRESSION
Lossless compression: Variable length coding - LZW coding - Bit plane coding - Predictive coding - DPCM. Lossy compression: Transform coding - Wavelet coding - Basics of image compression standards - JPEG, MPEG - Basics of vector quantization.
UNIT V IMAGE SEGMENTATION AND REPRESENTATION
Edge detection - Thresholding - Region based segmentation - Boundary representation - Chain codes - Polygonal approximation - Boundary segments - Boundary descriptors - Simple descriptors - Fourier descriptors - Regional descriptors - Simple descriptors - Texture.
TEXT BOOK
1. Rafael C. Gonzalez, Richard E. Woods, Digital Image Processing, 2nd Edition,
Pearson Education, 2003.
REFERENCES
1. William K. Pratt, Digital Image Processing, John Wiley, 2001.
2. Milan Sonka, Vaclav Hlavac, Roger Boyle, Image Processing, Analysis and Machine Vision, Brooks/Cole, Thomson Learning, 1999.
UNIT I DIGITAL IMAGE FUNDAMENTALS AND TRANSFORMS
Elements of visual perception - Image sampling and quantization - Basic relationships between pixels - Basic geometric transformations - Introduction to Fourier transform and DFT - Properties of 2D Fourier transform - FFT - Separable image transforms - Walsh-Hadamard - Discrete cosine transform - Haar, Slant, Karhunen-Loeve transforms.
Projections
Let us determine the image functions for the above sensitivity functions imaging the same scene:
1. This is the most realistic of the three. Sensitivity is concentrated in a band around $\lambda_0$:
$$f_1(x', y') = \int c_p(x', y', \lambda)\, V_1(\lambda)\, d\lambda$$
2. This is an unrealistic capture device which has sensitivity only to a single wavelength $\lambda_0$, as determined by the delta function. However, there are devices that come close to such selective behavior.
$$f_2(x', y') = \int c_p(x', y', \lambda)\, V_2(\lambda)\, d\lambda = \int c_p(x', y', \lambda)\, \delta(\lambda - \lambda_0)\, d\lambda = c_p(x', y', \lambda_0)$$
3. This is what happens if you take a picture without taking the cap off the lens of your camera, i.e., $V_3(\lambda) = 0$:
$$f_3(x', y') = \int c_p(x', y', \lambda)\, V_3(\lambda)\, d\lambda = \int c_p(x', y', \lambda) \cdot 0\, d\lambda = 0$$
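To make the sensitivity model concrete, here is a minimal numerical sketch that approximates the first integral above with a Riemann sum. The Gaussian band sensitivity $V_1$ and the spectral radiance $c_p$ below are synthetic stand-ins, not values from these notes:

```python
import numpy as np

# Hypothetical example: approximate f1(x', y') = ∫ cp(x', y', λ) V1(λ) dλ
# for a single pixel location. Both cp and V1 are synthetic stand-ins.
lam = np.linspace(400e-9, 700e-9, 301)        # visible wavelengths (m)
lam0 = 550e-9                                 # center of the sensitivity band

V1 = np.exp(-((lam - lam0) ** 2) / (2 * (30e-9) ** 2))  # band around lam0
cp = 1.0 + 0.5 * np.sin(lam / 50e-9)                    # synthetic radiance at (x', y')

f1 = np.trapz(cp * V1, lam)   # the sensor output for this pixel
print(f1)
```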
For a camera that captures color images, imagine that it has three sensors at each $(x', y')$ with sensitivity functions tuned to the colors or wavelengths red, green and blue, outputting three image functions:
$$f_R(x', y') = \int c_p(x', y', \lambda)\, V_R(\lambda)\, d\lambda$$
$$f_G(x', y') = \int c_p(x', y', \lambda)\, V_G(\lambda)\, d\lambda$$
$$f_B(x', y') = \int c_p(x', y', \lambda)\, V_B(\lambda)\, d\lambda$$
These three image functions can be used by display devices (such as your monitor or your eye) to show a color image.
The image function $f_C(x', y')$ is still a function of $x' \in [x'_{\min}, x'_{\max}]$ and $y' \in [y'_{\min}, y'_{\max}]$, which vary in a continuum given by the respective intervals. The values taken by the image function are real numbers which again vary in a continuum or interval: $f_C(x', y') \in [f_{\min}, f_{\max}]$.
Digital computers cannot process parameters/functions that vary in a continuum. We have to discretize:
Quantization
Quantization to $P$ levels: divide $[f_{\min}, f_{\max}]$ into $P$ equal intervals of width $Q = (f_{\max} - f_{\min})/P$ and map each value to the midpoint of its interval:
$$Q(f_C(i, j)) = (k + 1/2)\, Q + f_{\min}$$
if and only if $f_C(i, j) \in [f_{\min} + kQ,\; f_{\min} + (k+1)Q)$, i.e., if and only if
$$f_{\min} + kQ \le f_C(i, j) < f_{\min} + (k+1)Q,$$
for $k = 0, \ldots, P - 1$.
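A minimal sketch of this uniform quantizer, assuming a synthetic continuous-valued image (the function name and parameters are illustrative, not from the notes):

```python
import numpy as np

def quantize(f, f_min, f_max, P):
    """Uniform quantization of continuous values in [f_min, f_max] to P levels.

    Each value in [f_min + k*Q, f_min + (k+1)*Q) is mapped to the midpoint
    (k + 1/2)*Q + f_min of its interval, for k = 0, ..., P-1.
    """
    Q = (f_max - f_min) / P
    k = np.floor((f - f_min) / Q).astype(int)
    k = np.clip(k, 0, P - 1)          # values exactly at f_max fall in the top level
    return (k + 0.5) * Q + f_min

# Example: quantize a synthetic 4x4 "image" to P = 8 levels
f = np.random.uniform(0.0, 1.0, size=(4, 4))
print(quantize(f, 0.0, 1.0, 8))
```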
$f_R(i, j)$, $f_G(i, j)$ and $f_B(i, j)$ are called the (R, G, B) parameterization of the color space of the full color image.
There are other parameterizations, each with its own advantages and
disadvantages.
Grayscale Images
A grayscale or luminance image can be considered to be one of the components of a different parameterization.
Advantage: It captures most of the image information.
Our emphasis in this class will be on general processing. Hence we will mainly work with grayscale images in order to avoid the various nuances involved with different parameterizations.
Images as Matrices
The 2D DFT of a real image exhibits conjugate symmetry:
$$F(u, v) = F^*(-u, -v), \qquad |F(u, v)| = |F(-u, -v)|$$
Periodicity properties
(Figure: the Fourier spectrum shows back-to-back half periods in the range [0, N-1]; the shifted spectrum shows a full period in the same range.)
Average Value
Therefore,
$$\bar{f}(x, y) = \frac{1}{N} F(0, 0)$$
The Laplacian
$$\mathcal{F}\{\nabla^2 f(x, y)\} \Leftrightarrow -(2\pi)^2 (u^2 + v^2)\, F(u, v)$$
Convolution & Correlation
1-D convolution example: let $f(\alpha)$ be a box of height 1 on $0 \le \alpha \le 1$ and $g(\alpha)$ a box of height 1/2 on $0 \le \alpha \le 1$. To evaluate $f(x) * g(x) = \int f(\alpha)\, g(x - \alpha)\, d\alpha$, compute $g(x - \alpha)$ by displacing the mirrored function $g(-\alpha)$ by the value $x$, and integrate the product $f(\alpha)\, g(x - \alpha)$ over the overlap.
Thus we have
$$f(x) * g(x) = \begin{cases} x/2 & 0 \le x \le 1 \\ 1 - x/2 & 1 \le x \le 2 \\ 0 & \text{elsewhere.} \end{cases}$$
(Figure: graphically, $f(x) * g(x)$ is a triangle of peak height 1/2 at $x = 1$, supported on $[0, 2]$.)
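A quick numerical check of this result; the discrete approximation below is an illustration, not part of the original notes:

```python
import numpy as np

# Discretely approximate the convolution of f (box of height 1 on [0,1])
# with g (box of height 1/2 on [0,1]) and compare against the closed
# form: x/2 on [0,1], 1 - x/2 on [1,2], 0 elsewhere.
d = 0.001                                  # sample spacing
alpha = np.arange(0, 1, d)
f = np.ones_like(alpha)                    # height 1 on [0,1)
g = 0.5 * np.ones_like(alpha)              # height 1/2 on [0,1)

conv = np.convolve(f, g) * d               # Riemann-sum scaling
x = np.arange(len(conv)) * d

closed_form = np.where(x <= 1, x / 2, np.where(x <= 2, 1 - x / 2, 0))
print(np.max(np.abs(conv - closed_form)))  # should be close to 0
```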
Convolution and impulse functions
The sifting property of the unit impulse:
$$\int_{-\infty}^{\infty} f(x)\, \delta(x - x_0)\, dx = f(x_0), \qquad \int_{-\infty}^{\infty} \delta(x - x_0)\, dx = 1$$
(Figure: an impulse of strength $A$ located at $x_0$ is drawn as an arrow $A\,\delta(x - x_0)$.)
Convolution with an impulse function: given $f(x)$ and
$$g(x) = \delta(x + T) + \delta(x) + \delta(x - T),$$
the convolution $f(x) * g(x)$ replicates $f$ at $-T$, $0$ and $T$.
Convolution and the Fourier transform (convolution theorem):
$$f(x) * g(x) \Leftrightarrow F(u)\, G(u)$$
$$f(x)\, g(x) \Leftrightarrow F(u) * G(u)$$
The distance from the origin of the frequency plane is $D(u, v) = \sqrt{u^2 + v^2}$.
Ideal lowpass filter (ILPF)
(Figure: the ILPF transfer function $H(u, v)$ equals 1 inside a circle of radius $D_0$ about the origin of the $(u, v)$ plane and 0 outside it.)
The point $D_0$ traces a circle from the frequency origin, giving a locus of cutoff frequencies (all at distance $D_0$ from the origin).
One way to establish a set of standard loci is to compute circles that encompass various amounts of the total signal power $P_T$, given by
$$P_T = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} P(u, v)$$
where $P(u, v) = |F(u, v)|^2$ is the power spectrum. A circle enclosing $\alpha$ percent of the power satisfies
$$\alpha = 100 \sum_{u} \sum_{v} P(u, v) / P_T$$
where the summation is over all points $(u, v)$ encompassed by the circle.
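A minimal frequency-domain sketch of the ILPF, assuming a synthetic test image (grid construction and variable names are illustrative):

```python
import numpy as np

def ideal_lowpass(image, D0):
    """Apply an ideal lowpass filter with cutoff radius D0 (in DFT samples)."""
    M, N = image.shape
    F = np.fft.fftshift(np.fft.fft2(image))        # center the spectrum

    # Distance D(u, v) from the (shifted) frequency origin
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)

    H = (D <= D0).astype(float)                    # 1 inside the circle, 0 outside
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))

# Example on a synthetic image
img = np.random.rand(128, 128)
smooth = ideal_lowpass(img, D0=20)
```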
UNIT 1
2 marks
1. Define an image.
2. What is dynamic range?
3. Define brightness.
4. Define tapered quantization.
5. What is meant by gray level?
6. What is meant by a color model?
7. List the hardware-oriented color models.
8. What are hue and saturation?
9. Explain the separability property of the 2D Fourier transform.
10. What are the properties of the Haar and Slant transforms?
11. Define resolution.
12. What is meant by a pixel?
13. Define digital image.
14. What are the steps involved in DIP?
16. Specify the elements of a DIP system.
18. What are the types of light receptors?
19. Differentiate photopic and scotopic vision.
26. Define sampling and quantization.
27. Find the number of bits required to store a 256 x 256 image with 32 gray levels.
28. Write the expression to find the number of bits needed to store a digital image.
30. What is meant by zooming and shrinking of digital images?
32. Write short notes on the neighbors of a pixel.
33. Explain the types of connectivity.
34. What is meant by a path?
36. What is a geometric transformation?
40. What is an image transform?
16 MARKS
UNIT I
K-L (Karhunen-Loeve) Transform: arrange each sample as a column vector
$$X = [x_1, x_2, \ldots, x_n]^T$$
For M samples, the covariance matrix is
$$C_x = \frac{1}{M} \sum_{k=1}^{M} (x_k - m_x)(x_k - m_x)^T$$
where $m_x$ is the mean vector. The K-L transform is $Y = A(X - m_x)$, where the rows of $A$ are the eigenvectors of $C_x$.
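A compact sketch of the K-L transform on synthetic data (numpy only; the data and dimensions are illustrative):

```python
import numpy as np

# Synthetic data: M samples, each an n-dimensional column vector
M, n = 500, 4
X = np.random.randn(n, M) * np.array([[3.0], [2.0], [1.0], [0.5]])

m_x = X.mean(axis=1, keepdims=True)      # mean vector m_x
C_x = (X - m_x) @ (X - m_x).T / M        # C_x = (1/M) sum (x_k - m_x)(x_k - m_x)^T

eigvals, eigvecs = np.linalg.eigh(C_x)   # eigendecomposition (ascending order)
A = eigvecs[:, ::-1].T                   # rows of A = eigenvectors, largest first

Y = A @ (X - m_x)                        # K-L transform Y = A (X - m_x)

# The transformed components are decorrelated: covariance of Y is (nearly) diagonal
print(np.round(Y @ Y.T / M, 2))
```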
UNIT II IMAGE ENHANCEMENT TECHNIQUES
Spatial domain methods: Basic grey level transformations - Histogram equalization - Image subtraction - Image averaging - Spatial filtering - Smoothing and sharpening filters - Laplacian filters - Frequency domain filters - Smoothing and sharpening filters - Homomorphic filtering.
B has one tenth as many distinct pixel values as A. Note also the vertical axis scaling in $h_B(l)$.
Stretched/Compressed Pixel Value Ranges
For a given image, decompose the range of pixel values (0, ..., 255) into discrete intervals $R_t = [a_t, b_t]$, $t = 1, \ldots, T$, where T is the total number of segments. Each $R_t$ is typically obtained as a range of pixel values that correspond to a hill of $h_A(l)$.
Label the pixels with pixel values within each Rt via a point function.
Main Assumption: Each object is assumed to be composed of
pixels with similar pixel values.
Limitations
Histogram Equalization
For a given image A, we will now design a special point function $g_A^e(l)$, called the histogram equalizing point function for A. If $B(i, j) = g_A^e(A(i, j))$, then our aim is to make $h_B(l)$ as uniform/flat as possible, irrespective of $h_A(l)$.
Histogram equalization will help us:
Stretch/Compress an image such that:
Pixel values that occur frequently in A occupy a bigger dynamic range in B,
i.e., get stretched and become more visible.
Pixel values that occur infrequently in A occupy a smaller dynamic range in B,
i.e., get compressed and become less visible.
Compare images by mapping their histograms into a standard
histogram and sometimes undo the effects of some unknown
processing.
The techniques we are going to use to get $g_A^e(l)$ are also applicable in histogram modification/specification.
Let $g_1(l) = \sum_{k=0}^{l} p_A(k)$. Note that $g_1(l) \in [0, 1]$.
$g_A^e(l)$ stretches the range of pixel values that occur frequently in A, and compresses the range of pixel values that occur infrequently in A.
Example
Comparison/Undoing
Comparison/Undoing - contd.
Histogram Equalization
$$g_l(l) = \sum_{k=0}^{l} p_A(k) \;\Rightarrow\; g_l(l) - g_l(l-1) = p_A(l) = \frac{h_A(l)}{NM} \quad (l = 1, \ldots, 255).$$
$g_A^e(l) = \mathrm{round}(255\, g_l(l))$ is the histogram equalizing point function for the image A, and $B(i, j) = g_A^e(A(i, j))$ is the histogram equalized version of A.
In general, histogram equalization stretches/compresses an image such that pixel values that occur frequently in A occupy a bigger dynamic range in B (they get stretched and become more visible), while pixel values that occur infrequently in A occupy a smaller dynamic range in B (they get compressed and become less visible). Remember, two totally different images may have very similar histograms.
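A minimal sketch of this procedure on an 8-bit grayscale array (numpy only; the function and variable names are illustrative):

```python
import numpy as np

def equalize(A):
    """Histogram-equalize an 8-bit grayscale image A (2-D uint8 array)."""
    N, M = A.shape
    h_A = np.bincount(A.ravel(), minlength=256)   # histogram h_A(l)
    p_A = h_A / (N * M)                           # normalized histogram p_A(l)
    g_l = np.cumsum(p_A)                          # g_l(l) = sum_{k<=l} p_A(k), in [0, 1]
    g_Ae = np.round(255 * g_l).astype(np.uint8)   # equalizing point function
    return g_Ae[A]                                # B(i, j) = g_Ae(A(i, j))

# Example: a synthetic low-contrast image
A = np.clip(np.random.normal(100, 10, (64, 64)), 0, 255).astype(np.uint8)
B = equalize(A)
print(A.min(), A.max(), "->", B.min(), B.max())
```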
Histogram Matching / Specification
16 Marks
1. Explain the types of gray level transformations used for image enhancement.
# Linear (negative and identity)
# Logarithmic (log and inverse log)
# Power-law (nth root and nth power)
# Piecewise-linear (contrast stretching, gray level slicing, bit plane slicing)
2. What is a histogram? Explain histogram equalization.
# P(rk) = nk/n
# Ps(s) = 1 means the histogram is uniform.
3. Discuss the image smoothing filter with its model in the spatial domain.
# LPF - blurring
# Median filter - noise reduction while preserving edges
4. What are image sharpening filters? Explain the various types.
# Used for highlighting fine details
# HPF - output gets sharpened and the background becomes darker
# High boost - output gets sharpened but the background remains unchanged
# Derivative - first and second order derivatives
Applications:
# Medical imaging
# Electronic printing
# Industrial inspection
UNIT III IMAGE RESTORATION
Image degradation/restoration process: the input image $f(x, y)$ passes through a degradation function $H$ and is corrupted by additive noise $\eta(x, y)$ to give the observed image
$$g(x, y) = H[f(x, y)] + \eta(x, y),$$
which the restoration filter(s) then process to produce an estimate $\hat{f}(x, y)$ of the original image.
Noise models
Gaussian noise
Gaussian (normal) noise:
$$p(z) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(z - \bar{z})^2 / 2\sigma^2}$$
Uniform noise
$$p(z) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le z \le b \\ 0 & \text{otherwise} \end{cases}$$
where $z$ represents intensity, with mean and variance
$$\bar{z} = \frac{a + b}{2}, \qquad \sigma^2 = \frac{(b - a)^2}{12}$$
Impulse (salt-and-pepper) noise
$$p(z) = \begin{cases} P_a & \text{for } z = a \\ P_b & \text{for } z = b \\ 0 & \text{otherwise} \end{cases}$$
If $b > a$, then any pixel with intensity $b$ will appear as a light dot in the image, while pixels with intensity $a$ will appear as dark dots.
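A short sketch generating these three noise models with numpy (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
img = np.full((64, 64), 128.0)            # flat synthetic test image

# Gaussian noise with standard deviation sigma = 10
gaussian = img + rng.normal(loc=0.0, scale=10.0, size=img.shape)

# Uniform noise on [a, b]: mean (a+b)/2, variance (b-a)^2 / 12
a, b = -20.0, 20.0
uniform = img + rng.uniform(a, b, size=img.shape)

# Salt-and-pepper: intensity b with prob P_b (light dots), a with prob P_a (dark dots)
Pa, Pb = 0.05, 0.05
u = rng.random(img.shape)
saltpepper = img.copy()
saltpepper[u < Pa] = 0          # dark dots (pepper)
saltpepper[u > 1 - Pb] = 255    # light dots (salt)
```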
Mean filters
Order-Statistic filters
Adaptive filters
Mean filters (arithmetic)
$$\hat{f}(x, y) = \frac{1}{mn} \sum_{(s,t) \in S_{xy}} g(s, t)$$
The operation is generally implemented using a spatial filter of size m x n in which all coefficients have value 1/mn. A mean filter smoothes local variations in an image; noise is reduced as a result of blurring.
Geometric mean filter
$$\hat{f}(x, y) = \left[\prod_{(s,t) \in S_{xy}} g(s, t)\right]^{1/mn}$$
Contraharmonic mean filter of order Q
$$\hat{f}(x, y) = \frac{\displaystyle\sum_{(s,t) \in S_{xy}} g(s, t)^{Q+1}}{\displaystyle\sum_{(s,t) \in S_{xy}} g(s, t)^{Q}}$$
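An illustrative implementation of the contraharmonic mean filter over square m x n neighborhoods (pure numpy; the function name and the epsilon guard are not from the text):

```python
import numpy as np

def contraharmonic_mean(g, m, n, Q):
    """Contraharmonic mean filter of order Q over an m x n neighborhood.

    Q = 0 reduces to the arithmetic mean filter; Q > 0 removes pepper
    noise, Q < 0 removes salt noise.
    """
    pad = np.pad(g.astype(float), ((m // 2,), (n // 2,)), mode="edge")
    out = np.zeros_like(g, dtype=float)
    eps = 1e-12                                  # avoid division by zero
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            S = pad[x:x + m, y:y + n]            # neighborhood S_xy
            out[x, y] = (S ** (Q + 1)).sum() / ((S ** Q).sum() + eps)
    return out

g = np.random.randint(0, 256, (32, 32)).astype(float)
arith = contraharmonic_mean(g, 3, 3, Q=0)        # arithmetic mean as special case
```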
16 Marks
6. What are the two approaches for blind image restoration? Explain in detail.
> Direct measurement
> Indirect estimation
UNIT IV IMAGE COMPRESSION
Objectives
At the end of this lesson, the students should be able to:
1. Explain the need for standardization in image transmission and reception.
2. Name the coding standards for fax and bi-level images and state their
characteristics.
3. Present the block diagrams of JPEG encoder and decoder.
4. Describe the baseline JPEG approach.
5. Describe the progressive JPEG approach through spectral selection.
6. Describe the progressive JPEG approach through successive
approximation.
7. Describe the hierarchical JPEG approach.
8. Describe the lossless JPEG approach.
9. Convert RGB images to YUV.
10. Illustrate the interleaved and non-interleaved ordering for color images.
Introduction
With the rapid developments of imaging technology, image compression and coding
tools and techniques, it is necessary to evolve coding standards so that there is
compatibility and interoperability between the image communication and storage
products manufactured by different vendors. Without the availability of standards,
encoders and decoders cannot communicate with each other; the service providers
will have to support a variety of formats to meet the needs of the customers and the
customers will have to install a number of decoders to handle a large number of data
formats. Towards the objective of setting up coding standards, the
international standardization agencies, such as
International Standards Organization (ISO), International Telecommunications Union
(ITU), International Electro-technical Commission (IEC) etc. have formed expert
groups and solicited proposals from industries, universities and research laboratories.
This has resulted in establishing standards for bi-level (facsimile) images and
continuous tone (gray scale) images. In this lesson, we are going to discuss the
salient features of these standards. These standards use both lossless and lossy coding and compression techniques, which we have already studied in the previous lessons.
The first part of this lesson is devoted to the standards for bi-level image coding.
Modified Huffman (MH) and Modified Relative Element Address Designate
(MREAD) standards are used for text-based documents, but more recent
standards like JBIG1 and JBIG2, proposed by the Joint bi-level experts group (JBIG)
can efficiently encode handwritten characters and binary halftone images. The latter part
of this lesson is devoted to the standards for continuous tone images. We are going to
discuss in details about the Joint Photographic Experts Group (JPEG) standard and its
different modes, such as baseline (sequential), progressive, hierarchical and lossless.
(c) JBIG1: The earlier two algorithms just mentioned work well for printed texts
but are inadequate for handwritten texts or binary halftone images (continuous
images converted to dot patterns). The JBIG1 standard, proposed by the
Joint Bi-level Experts Group uses a larger region of support for coding the
pixels. Binary pixel values are directly fed into an arithmetic coder, which
utilizes a sequential template of nine adjacent and previously coded pixels plus
one adaptive pixel to form a 10-bit context. Other than the sequential mode
just described, JBIG1 also supports progressive mode in which a reduced
resolution starting layer image is followed by the transmission of progressively
higher resolution layers. The compression ratios of JBIG1 standard is
slightly better than that of MREAD for text images but has an
improvement of 8-to-1 for binary halftone images.
(d) JBIG2: This is a more recent standard proposed by the Joint bi-level Experts
Group. It uses a soft pattern matching approach to provide a solution to the
problem of substitution errors in which an imperfectly scanned symbol is
wrongly matched to a different symbol, as frequently observed in Optical
Character Recognition (OCR). JBIG2 codes the bitmap of each mark, rather
than its matched class index. In case a good match cannot be found for the
current mark, it becomes a token for a new class. This new token is then coded
using JBIG1 with a fixed template of previous pixels around the current mark.
The JBIG2 standard is seen to be 20% more efficient than the JBIG1 standard
for lossless compression.
JPEG Encoder
Figure shows the block diagram of a JPEG encoder, which has the following
components:
(a) Forward Discrete Cosine Transform (FDCT): The still images are first partitioned into non-overlapping blocks of size 8x8 and the image samples are shifted from unsigned integers with range $[0, 2^p - 1]$ to signed integers with range $[-2^{p-1}, 2^{p-1} - 1]$, where p is the number of bits (here, p = 8). The theory of the DCT has already been discussed in lesson-8 and will not be repeated here. It should however be mentioned that, to preserve freedom for innovation and customization within implementations, JPEG specifies neither a unique FDCT algorithm nor a unique IDCT algorithm. The implementations may therefore differ in precision, and JPEG has specified an accuracy test as a part of the compliance test.
(b) Quantization: Each of the 64 coefficients from the FDCT outputs of a block
is uniformly quantized according to a quantization table. Since the aim is to
compress the images without visible artifacts, each step-size should be chosen
as the perceptual threshold or for just noticeable distortion. Psycho-visual
experiments have led to a set of quantization tables, and these appear in the ISO-JPEG standard as a matter of information, but not as a requirement.
(c) Entropy Coder: This is the final processing step of the JPEG encoder. The JPEG standard specifies two entropy coding methods: Huffman and arithmetic coding. The baseline sequential JPEG uses Huffman coding only, but codecs with both methods are specified for the other modes of operation. Huffman coding requires that one or more sets of coding tables be specified by the application; the same tables used for compression are needed to decompress the image. The baseline JPEG uses only two sets of Huffman tables: one for DC and the other for AC coefficients.
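An illustrative sketch of the FDCT-plus-quantization stage for one 8x8 block, using scipy's DCT. The quantization table shown is the well-known luminance example table that appears in the standard as informative material; the function names are our own:

```python
import numpy as np
from scipy.fftpack import dct, idct

# Example JPEG luminance quantization table (informative in the standard)
Q_TABLE = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99]])

def encode_block(block):
    """Level-shift, 2-D FDCT, and uniform quantization of one 8x8 block."""
    shifted = block.astype(float) - 128                     # [0,255] -> [-128,127]
    coeffs = dct(dct(shifted, axis=0, norm='ortho'), axis=1, norm='ortho')
    return np.round(coeffs / Q_TABLE).astype(int)

def decode_block(q):
    """Dequantize, inverse DCT, and undo the level shift."""
    coeffs = q * Q_TABLE
    shifted = idct(idct(coeffs.astype(float), axis=0, norm='ortho'),
                   axis=1, norm='ortho')
    return np.clip(np.round(shifted + 128), 0, 255).astype(np.uint8)

block = np.random.randint(0, 256, (8, 8)).astype(np.uint8)
rec = decode_block(encode_block(block))
```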
JPEG Decoder
Figure shows the block diagram of the JPEG decoder. It performs the inverse operation
of the JPEG encoder.
Baseline Encoding: Baseline sequential coding is for images with 8-bit samples and uses Huffman coding only. In baseline encoding, each block is encoded in a single left-to-right, top-to-bottom scan. It encodes and decodes complete 8x8 blocks with full precision one at a time and supports interleaving of color components. The FDCT, quantization, DC difference and zig-zag ordering proceed as described above. For a product to claim JPEG compatibility, it must include support for at least the baseline encoding system.
There are two forms of progressive encoding: (a) spectral selection approach and (b)
successive approximation approach. Each of these approaches is described below.
Progressive scanning through spectral selection: In this approach, the first scan
sends some specified low frequency DCT coefficients within each block. The
corresponding reconstructed image obtained at the decoder from the first scan therefore
appears blurred as the details in the form of high frequency components are missing.
In subsequent scans, bands of coefficients, which are higher in frequency than the
previous scan, are encoded and therefore the reconstructed image gets richer with
details. This procedure is called spectral selection, because each band typically
contains coefficients which occupy a lower or higher part of the frequency spectrum
for that 8x8 block.
The spectral selection approach: here all the 64 DCT coefficients in a block are of 8-bit resolution and successive blocks are stacked one after the other in the scanning order. The spectral selection approach performs the slicing of coefficients horizontally and picks up a band of coefficients, starting with low frequency, and encodes them to full resolution.
The successive approximation approach: the organization of the DCT coefficients and the stacking of the blocks are the same as before. The successive approximation approach performs the slicing operation vertically and picks up a group of bits, starting with the most significant ones and progressively considering the less significant ones.
Obtain the reduced resolution images starting with the original and for each,
reduce the resolution by a factor of two, as described above.
Encode the reduced resolution image from the topmost layer of the pyramid.
Decode the above reduced resolution image. Interpolate and up-sample it by a
factor of two horizontally and/or vertically, using the identical interpolation
filter which the decoder must use. Use this interpolated and up-sampled image
as a predicted image for encoding the next lower layer (finer resolution) of the
pyramid.
Encode the difference between the image in the next lower layer and the
predicted image using baseline, progressive or lossless encoding.
Repeat the steps of encoding and decoding until the lowermost layer
(finest resolution) of the pyramid is encoded.
Figure illustrates the hierarchical encoding process. In hierarchical encoding, the image quality at low bit rates surpasses that of the other JPEG encoding methods, but at the cost of an increased number of bits at the full resolution. Hierarchical encoding is used for
applications in which a high-resolution image should be accessed by a low resolution
display device. For example, the image may be printed by a high-resolution printer,
while it is being displayed on a low resolution monitor.
An entropy encoder is then used to encode the prediction error obtained from the lossless predictor. Lossless codecs typically produce around 2:1 compression for color images
with moderately complex scenes. Lossless JPEG encoding finds applications in
transmission and storage of medical images.
Lossless JPEG prediction modes (A = left neighbor, B = above neighbor, C = upper-left neighbor):

Selection value | Prediction
0 | None
1 | A
2 | B
3 | C
4 | A + B - C
5 | A + (B - C)/2
6 | B + (A - C)/2
7 | (A + B)/2
It is possible to convert an RGB image into YUV using the following relations:
$$Y = 0.3R + 0.59G + 0.11B$$
$$U = \frac{B - Y}{2} + 0.5$$
$$V = \frac{R - Y}{1.6} + 0.5$$
Non-interleaved ordering transmits one component at a time, e.g. Scan-1: Y1, Y2, Y3, ..., Y15, Y16. Interleaved ordering mixes the components within a scan, e.g. Y1, Y2, Y3, Y4, U1, V1, Y5, Y6, Y7, Y8, U2, V2, ...
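A small sketch of this RGB-to-YUV conversion on normalized values in [0, 1], following the relations above (the function name is illustrative):

```python
import numpy as np

def rgb_to_yuv(rgb):
    """Convert an RGB image (floats in [0, 1], shape HxWx3) to YUV using
    Y = 0.3R + 0.59G + 0.11B, with U and V offset by 0.5 to stay in [0, 1]."""
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    Y = 0.3 * R + 0.59 * G + 0.11 * B
    U = (B - Y) / 2 + 0.5
    V = (R - Y) / 1.6 + 0.5
    return np.stack([Y, U, V], axis=-1)

rgb = np.random.rand(16, 16, 3)
yuv = rgb_to_yuv(rgb)
```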
JPEG Performance
Considering color images having 8 bits/sample for the luminance component and 8 bits/sample for each of the two chrominance components U and V, each pixel requires 16 bits for representation if both U and V are sub-sampled by a factor of two in either of the directions. Using JPEG compression on a wide variety of such color images, the following image qualities were measured subjectively:
A more advanced still image compression standard, JPEG-2000, has evolved in recent times. This will be our topic in the next lesson.
UNIT IV
2 Marks
16 Marks
7. Explain how compression is achieved in transform coding, and explain the DCT.
UNIT V IMAGE SEGMENTATION AND REPRESENTATION
Thresholding segmentation is a method which separates an image into two meaningful regions, foreground and background, through a selected threshold value T. If the image is a grey image, T is an integer in the range [0..K], where K is the maximum intensity value. For example, if the image is an 8-bit gray image, K takes the value 255 and T is in the range [0..255]. Once the value of T is decided, the segmentation procedure is given by the following equation:
$$G_B(x, y) = \begin{cases} 1, & \text{if } G(x, y) > T \\ 0, & \text{if } G(x, y) \le T \end{cases} \tag{1}$$
In equation (1), G(x, y) indicates the intensity value of pixel (x, y) in the grey image G, and GB is the segmentation result. It forms a binary image, in which each value of GB(x, y) gives the category (foreground or background) that the corresponding pixel belongs to. If GB(x, y) = 1, then pixel (x, y) in the image G is classified as a foreground pixel; otherwise it is classified as a background pixel.
Equation (1) is formulated under the assumption that foreground pixels in the image G have relatively high intensity values and background pixels take low intensity values. Of course, you can reverse the equation when you need to set the low intensity region as the foreground.
The major problem of thresholding segmentation is how to select the optimal value of the threshold T. Usually, in order to get the optimal value of T, we need to statistically analyze the so-called histogram (or intensity histogram) of the input gray image G. Before describing the algorithm, we first list all the notations and statistical definitions that relate to the histogram, as follows.
Basic notations
HG is the intensity histogram of image G; it maps each intensity value to an integer. The value of HG(i) indicates the number of pixels in G that take the intensity value i, where i ∈ [0..K] and K is the maximum intensity value as mentioned above (for example, K = 255 for 8-bit grey images). Obviously HG(i) is an integer within the range [0..N], where N is the total number of pixels in G. Based on the definition of the histogram HG, we can get the so-called normalized histogram PG, defined as follows:
$$P_G(i) = H_G(i) / N \tag{2}$$
In equation (2), the value of each PG(i) indicates the percentage of pixels in G that take the intensity value i. Clearly PG(i) is a real value within the range [0..1]. The main reason for introducing the normalized histogram is that sometimes we need to compare two histograms from two images that contain different numbers of pixels, and the normalized histogram makes this kind of comparison meaningful.
Assume the current threshold value is T, which separates the input image G into two regions, foreground and background, according to equation (1). The frequency of the background, $\omega_B(T)$, and the frequency of the foreground, $\omega_F(T)$, are defined as follows:
$$\omega_B(T) = \sum_{i=0}^{T} P_G(i), \qquad \omega_F(T) = \sum_{i=T+1}^{K} P_G(i) \tag{3}$$
Of course, the frequency of the entire image G is $\omega_B(T) + \omega_F(T) = 1$, no matter what value T takes. The mean intensity values of the background and foreground, $\mu_B(T)$ and $\mu_F(T)$, are calculated as:
$$\mu_B(T) = \sum_{i=0}^{T} i\, P_G(i) \,/\, \omega_B(T), \qquad \mu_F(T) = \sum_{i=T+1}^{K} i\, P_G(i) \,/\, \omega_F(T) \tag{4}$$
The mean intensity value of the entire image can be calculated as $\mu = \omega_B(T)\,\mu_B(T) + \omega_F(T)\,\mu_F(T)$. Clearly, no matter what value T takes, $\mu$ stays the same. The intensity variances of the background and foreground, $\sigma_B^2(T)$ and $\sigma_F^2(T)$, are defined as:
$$\sigma_B^2(T) = \sum_{i=0}^{T} (i - \mu_B(T))^2\, P_G(i) \,/\, \omega_B(T), \qquad \sigma_F^2(T) = \sum_{i=T+1}^{K} (i - \mu_F(T))^2\, P_G(i) \,/\, \omega_F(T) \tag{5}$$
Having the definitions of the variances of the background and the foreground, it is time to define the so-called within-class variance, $\sigma_{within}^2$:
$$\sigma_{within}^2(T) = \omega_B(T)\,\sigma_B^2(T) + \omega_F(T)\,\sigma_F^2(T) \tag{6}$$
We can also define the so-called between-class variance, $\sigma_{between}^2$:
$$\sigma_{between}^2(T) = \sigma^2 - \sigma_{within}^2(T) = \omega_B(T)\,\omega_F(T)\,[\mu_B(T) - \mu_F(T)]^2 \tag{7}$$
2
In the above equation, indicates the intensity variance of the entire image G and it is calculated as:
2
2
K i P i
(8)
i 0
2
Obviously given the image G, is a constant value independent to the selection of the value of T.
Otsu's algorithm
Otsu's algorithm is simple. We let T try all the intensity values from 0 to K and choose the one that gives the minimum within-class variance $\sigma_{within}^2$ as the optimal threshold value. Formally speaking:
Optimal value of T = $T_{Opt}$, where
$$\sigma_{within}^2(T_{Opt}) = \min_{0 \le T \le K} \sigma_{within}^2(T) \tag{9}$$
As we said before, $\sigma^2 = \sigma_{within}^2(T) + \sigma_{between}^2(T)$, and $\sigma^2$ is independent of the selection of T; therefore, minimization of $\sigma_{within}^2$ means maximization of $\sigma_{between}^2$. So the optimal value of T can also be taken as:
Optimal value of T = $T_{Opt}$, where
$$\sigma_{between}^2(T_{Opt}) = \max_{0 \le T \le K} \sigma_{between}^2(T) \tag{10}$$
In fact, equation (10) is the usual way to find the optimal threshold value, because for each T the calculation of $\sigma_{between}^2$ only needs the calculations of $\omega_B$, $\omega_F$, $\mu_B$ and $\mu_F$ according to equation (7). And these values can be updated iteratively:
Initially, T = 0:
Calculate the mean intensity of the entire image, $\mu$;
$\omega_B(0) = P_G(0)$; $\omega_F(0) = 1 - \omega_B(0) = 1 - P_G(0)$;
$\mu_B(0) = 0$; $\mu_F(0) = \mu / \omega_F(0)$; be careful: if $\omega_F(0) = 0$, then $\mu_F(0) = 0$.
Iteratively, T = T + 1:
$\omega_B(T+1) = \omega_B(T) + P_G(T+1)$; $\omega_F(T+1) = 1 - \omega_B(T+1)$;
If $\omega_B(T+1) = 0$, then $\mu_B(T+1) = 0$;
ELSE $\mu_B(T+1) = [\omega_B(T)\,\mu_B(T) + (T+1)\, P_G(T+1)] / \omega_B(T+1)$;
If $\omega_F(T+1) = 0$, then $\mu_F(T+1) = 0$;
ELSE $\mu_F(T+1) = [\mu - \omega_B(T+1)\,\mu_B(T+1)] / \omega_F(T+1)$;
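A direct implementation of equation (10); this is a straightforward sketch that recomputes the class statistics for each T rather than using the iterative updates above:

```python
import numpy as np

def otsu_threshold(G, K=255):
    """Return T_opt maximizing the between-class variance (equation (10))."""
    N = G.size
    H = np.bincount(G.ravel(), minlength=K + 1)   # histogram H_G(i)
    P = H / N                                     # normalized histogram P_G(i)
    i = np.arange(K + 1)
    mu = (i * P).sum()                            # global mean intensity

    best_T, best_var = 0, -1.0
    for T in range(K + 1):
        w_B = P[:T + 1].sum()
        w_F = 1.0 - w_B
        if w_B == 0 or w_F == 0:
            continue
        mu_B = (i[:T + 1] * P[:T + 1]).sum() / w_B
        mu_F = (mu - w_B * mu_B) / w_F
        var_between = w_B * w_F * (mu_B - mu_F) ** 2   # equation (7)
        if var_between > best_var:
            best_var, best_T = var_between, T
    return best_T

G = np.random.randint(0, 256, (64, 64))
T = otsu_threshold(G)
GB = (G > T).astype(np.uint8)                     # equation (1)
```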
Connected component labeling
Step 1:
Create a label image GL with the same size (width and height) as GB, and initially set each GL(x, y) = -1. Create a variable, current_label, for recording the current available label, and initially set current_label = 0.
Step 2:
Scan the binary image GB sequentially and update the label image GL as follows:
Step 3:
Build the 2D transit matrix M_T. Initially create an empty matrix
M_T[0..current_label-1][0..current_label-1].
Clearly it is a 2D array of size (current_label) x (current_label), and each element in the matrix is first set to 0. Then assign 1 to each element of the matrix located on the diagonal; that is, set M_T[i][i] = 1, where i runs from 0 to current_label-1.
After the above initialization, we need to update the matrix M_T according to the label image GL so that M_T becomes the transit matrix of GL. The update procedure is given as follows:
Step 4:
Calculate the transit closure matrix, M_TC, of M_T. This calculation is based on Warshall's algorithm, which is an iterative procedure described as follows:
(1) Initially create two temporary matrices M_0 and M_1, which have the same size as the matrix M_T. Copy each value in M_T into the corresponding position in M_0 and M_1.
(2) Update the matrix M_1 using the transitivity law.
(3) Compare M_1 and M_0 to see if they are exactly the same, i.e., each element of M_1 has the same value as the corresponding element of M_0. If so, set M_TC = M_1, which means we have obtained the transit closure matrix. If not, copy M_1 to M_0 and go back to (2).
Step 5:
Count the number of connected objects (or the number of equivalence classes) in the transit closure matrix M_TC. It is not hard to see that the number of connected objects equals the number of distinct rows (or columns) in the matrix M_TC. Assume M_TC[i] and M_TC[j] are two rows of the matrix M_TC, where i ≠ j. We say these two rows are distinct if and only if there exists at least one k ∈ [0..current_label-1] such that M_TC[i][k] ≠ M_TC[j][k].
Step 6:
Output the number of connected objects (or the number of distinct rows of M_TC) and return.
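A compact sketch of Step 4 using Warshall's algorithm in boolean-matrix form. The small transit matrix below is hypothetical, chosen so that the closure reproduces the two distinct rows (1111100) and (0000011) from the walkthrough that follows:

```python
import numpy as np

def transit_closure(M_T):
    """Warshall's algorithm: compute the transit (transitive) closure of M_T."""
    M = M_T.astype(bool).copy()
    n = M.shape[0]
    for k in range(n):
        # Labels i and j become connected if both reach the intermediate label k
        M |= np.outer(M[:, k], M[k, :])
    return M.astype(int)

# Hypothetical 7-label transit matrix (1 = labels known to touch)
M_T = np.eye(7, dtype=int)
M_T[0, 1] = M_T[1, 0] = 1
M_T[1, 2] = M_T[2, 1] = 1
M_T[2, 3] = M_T[3, 2] = 1
M_T[3, 4] = M_T[4, 3] = 1
M_T[5, 6] = M_T[6, 5] = 1

M_TC = transit_closure(M_T)
n_objects = len({tuple(row) for row in M_TC})   # number of distinct rows
print(n_objects)                                # 2 connected objects
```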
Assume that after Step 3 we have the transit matrix M_T. In Step 4 (1), we initially set two temporary matrices M_0 and M_1, and copy M_T to these two matrices, as shown in Figure 1. Then we operate on the matrix M_1 as described in Step 4 (2), and get an updated M_1, as shown in Figure 2.
Figure 1: Copy M_T to M_0 and M_1. Figure 2: M_1 updated by applying Step 4 (2).
Then, according to Step 4 (3), we compare matrix M_1 (Figure 2) with matrix M_0 (Figure 1). We find M_1 is different from M_0. Therefore we copy M_1 to M_0, as shown in Figure 3, go back to (2), perform the same operations on M_1 again, and get M_1 updated again, as shown in Figure 4.
Figure 3: Copy M_1 to M_0. Figure 4: M_1 is updated again.
Then we compare M_1 (Figure 4) and M_0 (Figure 3). Again we find that they are different. So we need to copy M_1 to M_0 (as shown in Figure 5) and go back to (2) to get M_1 updated again (as shown in Figure 6).
This time we find that M_1 did not change, which means M_0 (Figure 5) and M_1 (Figure 6) are identical. So we say the current M_1 is the transit closure matrix that we want, and set M_TC = M_1. Furthermore, we discover there are two distinct rows in the transit closure matrix: (1111100) and (0000011). This means that there are two connected objects.
Detection of Discontinuities
Thresholding
Region-Based Segmentation
1. What is segmentation?
2. Write the applications of segmentation.
3. What are the three types of discontinuity in a digital image?
4. How are the derivatives obtained in edge detection?
5. Write about linking edge points.
6. What are the two properties used for establishing similarity of edge pixels?
7. Define the gradient operator.
8. Define region growing.
9. Define compactness.
16 Marks
9. Explain the segmentation techniques that are based on finding the regions
directly.
Edge detection and line detection
Region growing
Region splitting
Region merging
10. How are lines detected? Explain through the operators.
Types of line masks:
1. Horizontal
2. Vertical
3. +45 degrees, -45 degrees
3. +45,-45