

DIGITAL IMAGE PROCESSING


II M.SC COMPUTER SCIENCE
III SEMESTER


UNIT–I
Fundamentals: Image Sensing and Acquisition, Image Sampling and Quantization,
relationship between Pixels; Random noise; Gaussian Markov Random Field, σ-field, Linear and
Non-linear Operations; Image processing models: Causal, Semi-causal, Non-causal models.
Color Models: Color Fundamentals, Color Models, Pseudo-color Image Processing, Full
Color Image Processing, Color Transformation, Noise in Color Images.
UNIT–II
Spatial Domain: Enhancement in spatial domain: Point processing; Mask processing;
Smoothing Spatial Filters; Sharpening Spatial Filters; Combining Spatial Enhancement
Methods. Frequency Domain: Image transforms: FFT, DCT, Karhunen-Loeve transform, Hotelling's
T2 transform, Wavelet transforms and their properties - Image filtering in frequency domain.
UNIT–III
Edge Detection: Types of edges; threshold; zero-crossing; Gradient operators: Roberts,
Prewitt, and Sobel operators; residual analysis based technique; Canny edge detection. Edge
features and their applications.
UNIT–IV
Image Compression: Fundamentals, Image Compression Models, Elements of Information
Theory. Error Free Compression: Huffman coding; Arithmetic coding; Wavelet transform based
coding; Lossy Compression: FFT; DCT; KLT; DPCM; MRFM based compression; Wavelet
transform based; Image Compression standards.
UNIT–V
Image Segmentation: Detection of Discontinuities; Edge Linking and Boundary
Detection; Threshold; Region-Based Segmentation. Segmentation by Morphological watersheds.
The use of motion in segmentation, Image Segmentation based on Color. Morphological Image
Processing: Erosion and Dilation, Opening and Closing, Hit-Or-Miss Transformation, Basic
Morphological Algorithms, Gray-Scale Morphology.

TEXT BOOKS
1. Rafael C. Gonzalez, Richard E. Woods, “Digital Image Processing”, Fourth Edition, PHI/Pearson
Education, 2013.
2. A. K. Jain, “Fundamentals of Digital Image Processing”, Second Edition, PHI, New Delhi, 2015.
REFERENCES
1. B. Chanda, D. Dutta Majumder, “Digital Image Processing and Analysis”, PHI, 2003.
2. Nick Efford, “Digital Image Processing: A Practical Introduction Using Java”, Pearson Education,
2004.
3. Todd R. Reed, “Digital Image Sequence Processing, Compression, and Analysis”, CRC Press,
2015.
4. L. Prasad, S. S. Iyengar, “Wavelet Analysis with Applications to Image Processing”, CRC Press,
2015.

UNIT-I

DIGITAL IMAGE FUNDAMENTALS


1. What is meant by Digital Image Processing? Explain how digital images can be represented?
An image may be defined as a two-dimensional function, f(x, y), where x and y are
spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the
intensity or gray level of the image at that point. When x, y, and the amplitude values of f are
all finite, discrete quantities, we call the image a digital image.
The field of digital image processing refers to processing digital images by means of a
digital computer. Note that a digital image is composed of a finite number of elements, each of
which has a particular location and value. These elements are referred to as picture elements,
image elements, pels, and pixels. Pixel is the term most widely used to denote the elements of a
digital image.
Vision is the most advanced of our senses, so it is not surprising that images play the
single most important role in human perception. However, unlike humans, who are limited to
the visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the
entire EM spectrum, ranging from gamma to radio waves. They can operate on images
generated by sources that humans are not accustomed to associating with images. These include
ultrasound, electron microscopy, and computer-generated images. Thus, digital image
processing encompasses a wide and varied field of applications. There is no general agreement
among authors regarding where image processing stops and other related areas, such as image
analysis and computer vision, start.
Sometimes a distinction is made by defining image processing as a discipline in which
both the input and output of a process are images. We believe this to be a limiting and somewhat
artificial boundary.
For example, under this definition, even the trivial task of computing the average
intensity of an image (which yields a single number) would not be considered an image
processing operation. On the other hand, there are fields such as computer vision whose ultimate
goal is to use computers to emulate human vision, including learning and being able to make
inferences and take actions based on visual inputs. This area itself is a branch of artificial
intelligence (AI) whose objective is to emulate human intelligence. The field of AI is in its
earliest stages of infancy in terms of development, with progress having been much slower than
originally anticipated. The area of image analysis (also called image understanding) is in
between image processing and computer vision.
Representing Digital Images:
We will use two principal ways to represent digital images. Assume that an image f(x, y) is
sampled so that the resulting digital image has M rows and N columns. The values of the
coordinates (x, y) now become discrete quantities. For notational clarity and convenience, we
shall use integer values for these discrete coordinates. Thus, the values of the coordinates at the
origin are (x, y) = (0, 0). The next coordinate values along the first row of the image are
represented as (x, y) = (0, 1). It is important to keep in mind that the notation (0, 1) is used to
signify the second sample along the first row. It does not mean that these are the actual values of
physical coordinates when the image was sampled.

Figure 1 shows the coordinate convention used.

Fig 1 Coordinate convention used to represent digital images

The notation introduced in the preceding paragraph allows us to write the complete M*N digital image
in the following compact matrix form:

f(x, y) = [ f(0, 0)      f(0, 1)      ...   f(0, N-1)
            f(1, 0)      f(1, 1)      ...   f(1, N-1)
            ...          ...          ...   ...
            f(M-1, 0)    f(M-1, 1)    ...   f(M-1, N-1) ]

The right side of this equation is by definition a digital image. Each element of this matrix array is
called an image element, picture element, pixel, or pel.
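The matrix view above maps directly onto the array representation used in practice. A minimal NumPy sketch (the small synthetic gray levels are arbitrary values chosen only for illustration):

import numpy as np

# A small synthetic digital image: M = 3 rows, N = 4 columns, 8-bit gray levels.
f = np.array([[ 10,  20,  30,  40],
              [ 50,  60,  70,  80],
              [ 90, 100, 110, 120]], dtype=np.uint8)

M, N = f.shape                   # image dimensions
print("M x N =", M, "x", N)

# f(x, y): x indexes rows, y indexes columns, origin (0, 0) at the top-left corner.
print("f(0, 1) =", f[0, 1])      # second sample along the first row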

2. Explain the process of image acquisition.


Image Sensing and Acquisition:
The types of images in which we are interested are generated by the combination of an
"illumination" source and the reflection or absorption of energy from that source by the elements of the
"scene" being imaged. We enclose illumination and scene in quotes to emphasize the fact that they are
considerably more general than the familiar situation in which a visible light source illuminates a
common everyday 3-D (three-dimensional) scene. For example, the illumination may originate from a
source of electromagnetic energy such as radar, infrared, or X-ray energy. But, as noted earlier, it could
originate from less traditional sources, such as ultrasound or even a computer-generated illumination
pattern.
Similarly, the scene elements could be familiar objects, but they can just as easily be molecules, buried
rock formations, or a human brain. We could even image a source, such as acquiring images of the
sun. Depending on the nature of the source, illumination energy is reflected from, or transmitted
through, objects. An example in the first category is light reflected from a planar surface. An example
in the second category is when X-rays pass through a patient’s body for the purpose of generating a
diagnostic X-ray film. In some applications, the reflected or transmitted energy is focused onto a
photo converter (e.g., a phosphor screen), which converts the energy into visible light. Electron
microscopy and some applications of gamma imaging use this approach.
Figure 5.1 shows the three principal sensor arrangements used to transform illumination energy into
digital images. The idea is simple: Incoming energy is transformed into a voltage by the combination
of input electrical power and sensor material that is responsive to the particular type

of energy being detected. The output voltage waveform is the response of the sensor(s), and a digital
quantity is obtained from each sensor by digitizing its response.

Fig.5.1 (a) Single imaging Sensor (b) Line sensor (c) Array sensor
(1) Image Acquisition Using a Single Sensor:
Figure 5.1 (a) shows the components of a single sensor. Perhaps the most familiar sensor of this type
is the photodiode, which is constructed of silicon materials and whose output voltage

waveform is proportional to light. The use of a filter in front of a sensor improves selectivity. For
example, a green (pass) filter in front of a light sensor favors light in the green band of the color
spectrum. As a consequence, the sensor output will be stronger for green light than for other
components in the visible spectrum.
In order to generate a 2-D image using a single sensor, there has to be relative displacements in both
the x- and y-directions between the sensor and the area to be imaged. Figure 5.2 shows an arrangement
used in high-precision scanning, where a film negative is mounted onto a drum whose mechanical
rotation provides displacement in one dimension. The single sensor is mounted on a lead screw that
provides motion in the perpendicular direction. Since mechanical motion can be controlled with high
precision, this method is an inexpensive (but slow) way to obtain high-resolution images. Other similar
mechanical arrangements use a flat bed, with the sensor moving in two linear directions. These types of
mechanical digitizers sometimes are referred to as microdensitometers.

Fig.5.2. Combining a single sensor with motion to generate a 2-D image

(2) Image Acquisition Using Sensor Strips:


A geometry that is used much more frequently than single sensors consists of an in-line arrangement of
sensors in the form of a sensor strip, as Fig. 5.1 (b) shows. The strip provides imaging elements in one
direction. Motion perpendicular to the strip provides imaging in the other direction, as shown in Fig.
5.3 (a).This is the type of arrangement used in most flat bed scanners. Sensing devices with 4000 or
more in-line sensors are possible. In-line sensors are used routinely in airborne imaging applications, in
which the imaging system is mounted on an aircraft that flies at a constant altitude and speed over
the geographical area to be imaged. One-dimensional imaging sensor strips that respond to various
bands of the electromagnetic spectrum are mounted perpendicular to the direction of flight. The
imaging strip gives one line of an image
at a time, and the motion of the strip completes the other dimension of a two-dimensional image.
Lenses or other focusing schemes are used to project the area to be scanned onto the sensors.
Sensor strips mounted in a ring configuration are used in medical and industrial imaging to obtain
cross-sectional ("slice") images of 3-D objects, as Fig. 5.3 (b) shows. A rotating X-ray source provides
illumination, and the portion of the sensors opposite the source collects the X-ray energy that passes
through the object (the sensors obviously have to be sensitive to X-ray energy). This is the basis for
medical and industrial computerized axial tomography (CAT). It is important to note that the output of
the sensors must be processed by reconstruction algorithms whose objective is to transform the sensed
data into meaningful cross-sectional images.
In other words, images are not obtained directly from the sensors by motion alone; they require
extensive processing. A 3-D digital volume consisting of stacked images is generated as the object is
moved in a direction perpendicular to the sensor ring. Other modalities of imaging based on the CAT
principle include magnetic resonance imaging (MRI) and positron emission tomography (PET).The
illumination sources, sensors, and types of images are different, but conceptually they are very similar
to the basic imaging approach shown in Fig. 5.3 (b).

Fig.5.3 (a) Image acquisition using a linear sensor strip (b) Image acquisition using a
circular sensor strip

(3) Image Acquisition Using Sensor Arrays:

Figure 5.1 (c) shows individual sensors arranged in the form of a 2-D array. Numerous electromagnetic
and some ultrasonic sensing devices frequently are arranged in an array format. This is also the
predominant arrangement found in digital cameras. A typical sensor for these cameras is a CCD
array, which can be manufactured with a broad range of sensing properties and can be packaged in
rugged arrays of 4000 * 4000 elements or more. CCD sensors are used widely in digital cameras and
other light sensing instruments. The response of each sensor is proportional to the integral of the light
energy projected onto the surface of the sensor, a property that is used in astronomical and other
applications requiring low noise images. Noise reduction is achieved by letting the sensor integrate the
input light signal over minutes or even hours. Since the sensor array shown in Fig. 5.4 (c) is two
dimensional, its key advantage is that a complete image can be obtained by focusing the energy pattern
onto the surface of the array. The principal manner in which array sensors are used is shown in
Fig.5.4. This figure shows the energy from an illumination source being reflected from a scene
element, but, as mentioned at the beginning of this section, the energy also could be transmitted
through the scene elements. The first function performed by the imaging system shown in Fig.5.4 (c)
is to collect the incoming energy and focus it onto an image plane. If the illumination is light, the front
end of the imaging system is a lens, which projects the viewed scene onto the lens focal plane, as Fig.
5.4 (d) shows. The sensor array, which is coincident with the focal plane, produces outputs
proportional to the integral of the light received at each sensor. Digital and analog circuitry sweep
these outputs and converts them to a video signal, which is then digitized by another section of the
imaging system. The output is a digital image, as shown diagrammatically in Fig. 5.4 (e).

Fig.5.4 An example of the digital image acquisition process (a) Energy ("illumination") source
(b) An element of a scene (c) Imaging system (d) Projection of the scene onto the image plane (e)
Digitized image
3. Explain about image sampling and quantization process.
Image Sampling and Quantization:
The output of most sensors is a continuous voltage waveform whose amplitude and spatial behavior
are related to the physical phenomenon being sensed. To create a digital image, we need to convert
the continuous sensed data into digital form. This involves two processes: sampling and quantization.
Basic Concepts in Sampling and Quantization:
The basic idea behind sampling and quantization is illustrated in Fig.6.1. Figure 6.1(a) shows a
continuous image, f(x, y), that we want to convert to digital form. An image may be continuous with
respect to the x- and y-coordinates, and also in amplitude. To convert it to digital form, we have to
sample the function in both coordinates and in amplitude. Digitizing the coordinate values is
called sampling. Digitizing the amplitude values is called quantization.
The one-dimensional function shown in Fig.6.1 (b) is a plot of amplitude (gray level) values of the
continuous image along the line segment AB in Fig. 6.1(a).The random variations are due to image
noise. To sample this function, we take equally spaced samples along line AB, as shown in Fig.6.1
(c).The location of each sample is given by a vertical tick mark in the bottom part of the figure. The
samples are shown as small white squares superimposed on the function. The set of these discrete
locations gives the sampled function. However, the values of the samples still span (vertically) a
continuous range of gray-level values. In order to form a digital function, the gray-level values also
must be converted (quantized) into discrete quantities. The right side of Fig. 6.1 (c) shows the gray-
level scale divided into eight discrete levels, ranging from black to white. The vertical tick marks
indicate the specific value assigned to each of the eight gray levels. The continuous gray levels are
quantized simply by assigning one of the eight discrete gray levels to each sample. The assignment is
made depending on the vertical proximity of a sample to a vertical tick mark. The digital samples
resulting from both sampling and quantization are shown in Fig.6.1 (d). Starting at the top of the image
and carrying out this procedure line by line produces a two-dimensional digital image.
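A minimal NumPy sketch of the idea just described, applied to a single synthetic scan line (the sine-plus-noise profile, the 64 samples, and the eight gray levels are assumptions of this sketch that mirror the example of Fig. 6.1):

import numpy as np

# Simulated "continuous" scan line along AB: a smooth profile plus image noise.
t = np.linspace(0.0, 1.0, 1000)
continuous = 0.5 + 0.4 * np.sin(2 * np.pi * 2 * t) + 0.02 * np.random.randn(t.size)

# Sampling: take equally spaced samples along the line.
num_samples = 64
sample_idx = np.linspace(0, t.size - 1, num_samples).astype(int)
samples = np.clip(continuous[sample_idx], 0.0, 1.0)

# Quantization: assign each sample to one of 8 discrete gray levels (0..7),
# i.e. to the nearest level on a scale divided into eight intervals.
num_levels = 8
quantized = np.round(samples * (num_levels - 1)).astype(np.uint8)

print("first ten quantized samples:", quantized[:10])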

Sampling in the manner just described assumes that we have a continuous image in both coordinate
directions as well as in amplitude. In practice, the method of sampling is determined by the sensor
arrangement used to generate the image. When an image is generated by a single sensing element
combined with mechanical motion, as in Fig. 5.2, the output of the sensor is quantized in the manner
described above. However, sampling is accomplished by selecting the number of individual
mechanical increments at which we activate the sensor to collect data. Mechanical motion can be
made very exact so, in principle, there is almost no limit as to how fine we can sample an image.
However, practical limits are established by imperfections in the optics used to focus on the
sensor an illumination spot that is inconsistent with the fine resolution achievable with mechanical
displacements.

Fig.6.1. Generating a digital image (a) Continuous image (b) A scan line from A to B in the
continuous image, used to illustrate the concepts of sampling and quantization (c) Sampling and
quantization (d) Digital scan line

When a sensing strip is used for image acquisition, the number of sensors in the strip
establishes the sampling limitations in one image direction. Mechanical motion in the other direction
can be controlled more accurately, but it makes little sense to try to achieve sampling density in one
direction that exceeds the sampling limits established by the number of sensors in the other.
Quantization of the sensor outputs completes the process of generating a digital image.

When a sensing array is used for image acquisition, there is no motion and the number of sensors in the
array establishes the limits of sampling in both directions. Figure 6.2 illustrates this concept. Figure
6.2 (a) shows a continuous image projected onto the plane of an array sensor. Figure 6.2 (b) shows the
image after sampling and quantization. Clearly, the quality of a digital image is determined to a large
degree by the number of samples and discrete gray levels used in sampling and quantization.

Fig.6.2. (a) Continuous image projected onto a sensor array (b) Result of image sampling and
quantization

1. Explain about the basic relationships and distance measures between pixels in a digital image.
Neighbors of a Pixel:
A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates are given
by (x+1, y), (x-1, y), (x, y+1), (x, y-1). This set of pixels, called the 4-neighbors of p, is denoted by
N4(p). Each pixel is a unit distance from (x, y), and some of the neighbors of p lie outside the digital
image if (x, y) is on the border of the image.
The four diagonal neighbors of p have coordinates (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1) and are
denoted by ND(p). These points, together with the 4-neighbors, are called the 8-neighbors of p,
denoted by N8(p). As before, some of the points in ND(p) and N8(p) fall outside the image if (x, y) is
on the border of the image.
Connectivity:
Connectivity between pixels is a fundamental concept that simplifies the definition of numerous digital
image concepts, such as regions and boundaries. To establish if two pixels are connected, it must be
determined if they are neighbors and if their gray levels satisfy a specified criterion of similarity (say,
if their gray levels are equal). For instance, in a binary image with values 0 and 1, two pixels may be
4-neighbors, but they are said to be connected only if they have the same value.
Let V be the set of gray-level values used to define adjacency. In a binary image, V={1} if we are
referring to adjacency of pixels with value 1. In a grayscale image, the idea is the same, but set V
typically contains more elements. For example, in the adjacency of pixels with a range of possible
gray-level values 0 to 255, set V could be any subset of these 256 values. We consider three types of
adjacency:
(a) 4-adjacency. Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).

(b) 8-adjacency. Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).

(c) m-adjacency (mixed adjacency). Two pixels p and q with values from V are m-adjacent if

(i) q is in N4(p), or

(ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
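A small Python sketch of these neighborhood sets (coordinates that fall outside the image are simply discarded; the 5 x 5 image size in the example is an arbitrary assumption):

def n4(x, y):
    """4-neighbors of pixel (x, y): horizontal and vertical neighbors."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    """Diagonal neighbors of pixel (x, y)."""
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    """8-neighbors: union of the 4-neighbors and the diagonal neighbors."""
    return n4(x, y) + nd(x, y)

def inside(coords, M, N):
    """Keep only neighbors that lie inside an M x N image."""
    return [(i, j) for (i, j) in coords if 0 <= i < M and 0 <= j < N]

# Example: neighbors of a corner pixel in a 5 x 5 image.
print(inside(n4(0, 0), 5, 5))   # [(1, 0), (0, 1)]
print(inside(n8(0, 0), 5, 5))   # [(1, 0), (0, 1), (1, 1)]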

Mixed adjacency is a modification of 8-adjacency. It is introduced to eliminate the ambiguities that
often arise when 8-adjacency is used. For example, consider the pixel arrangement shown in Fig.9 (a)
for V= {1}.The three pixels at the top of Fig.9 (b) show multiple (ambiguous) 8- adjacency, as
indicated by the dashed lines. This ambiguity is removed by using m-adjacency, as shown in Fig. 9
(c).Two image subsets S1 and S2 are adjacent if some pixel in S1 is adjacent to some pixel in S2. It is
understood here and in the following definitions that adjacent means 4-, 8-, or m-adjacent. A (digital)
path (or curve) from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of
distinct pixels with coordinates

(x0, y0), (x1, y1), ..., (xn, yn)

where (x0, y0) = (x, y), (xn, yn) = (s, t), and pixels (xi, yi) and (xi-1, yi-1) are adjacent for
1 ≤ i ≤ n. In this case, n is the length of the path. If (x0, y0) = (xn, yn), the path is a closed path.
We can define 4-, 8-, or m-paths depending on the type of adjacency specified. For example, the
paths shown in Fig. 9 (b) between the northeast and southeast points are 8-paths, and the path in Fig.
9 (c) is an m-path. Note the absence of ambiguity in the m-path. Let S represent a subset of pixels in an
image. Two pixels p and q are said to be connected in S if there exists a path between them
consisting entirely of pixels in S. For any pixel p in S, the set of pixels that are connected to it in S is
called a connected component of S. If it only has one connected component, then set S is called a
connected set.

Fig.9 (a) Arrangement of pixels; (b) pixels that are 8-adjacent (shown dashed) to the center pixel;
(c) m-adjacency
Let R be a subset of pixels in an image. We call R a region of the image if R is a connected set. The
boundary (also called border or contour) of a region R is the set of pixels in the region that have one
or more neighbors that are not in R. If R happens to be an entire image (which we recall is a
rectangular set of pixels), then its boundary is defined as the set of pixels in the first and last rows
and columns of the image. This extra definition is required because an image has no neighbors
beyond its border. Normally, when we refer to a region, we are referring to a subset of an image, and
any pixels in the boundary of the region that happen to coincide with the border of the image are
included implicitly as part of the region boundary.
Distance Measures:

For pixels p, q, and z, with coordinates (x, y), (s, t), and (v, w), respectively, D is a distance function or
metric if

(a) D(p, q) ≥ 0 (D(p, q) = 0 iff p = q),
(b) D(p, q) = D(q, p), and
(c) D(p, z) ≤ D(p, q) + D(q, z).

The Euclidean distance between p and q is defined as

De(p, q) = [(x - s)^2 + (y - t)^2]^(1/2)

For this distance measure, the pixels having a distance less than or equal to some value r from (x, y)
are the points contained in a disk of radius r centered at (x, y).
The D4 distance (also called city-block distance) between p and q is defined as

D4(p, q) = |x - s| + |y - t|

In this case, the pixels having a D4 distance from (x, y) less than or equal to some value r form a
diamond centered at (x, y). For example, the pixels with D4 distance ≤ 2 from (x, y) (the center point)
form the following contours of constant distance:

        2
    2   1   2
2   1   0   1   2
    2   1   2
        2

The pixels with D4 = 1 are the 4-neighbors of (x, y).


The D8 distance (also called chessboard distance) between p and q is defined as

D8(p, q) = max(|x - s|, |y - t|)

In this case, the pixels with D8 distance from (x, y) less than or equal to some value r form a square
centered at (x, y). For example, the pixels with D8 distance ≤ 2 from (x, y) (the center point) form the
following contours of constant distance:

2   2   2   2   2
2   1   1   1   2
2   1   0   1   2
2   1   1   1   2
2   2   2   2   2

The pixels with D8 = 1 are the 8-neighbors of (x, y). Note that the D4 and D8 distances between p and q
are independent of any paths that might exist between the points because these distances involve only
the coordinates of the points. If we elect to consider m-adjacency, however, the Dm distance between
two points is defined as the shortest m-path between the points. In this case, the distance between two
pixels will depend on the values of the pixels along the path, as well as the values of their neighbors.
For instance, consider the following arrangement of pixels and assume that p, p2, and p4 have value 1
and that p1 and p3 can have a value of 0 or 1:

        p3   p4
   p1   p2
   p

Suppose that we consider adjacency of pixels valued 1 (i.e., V = {1}). If p1 and p3 are 0, the length of the
shortest m-path (the Dm distance) between p and p4 is 2 (the path p p2 p4). If p1 is 1, then p2 and p will
no longer be m-adjacent (see the definition of m-adjacency) and the length of the shortest m-path becomes 3
(the path goes through the points p p1 p2 p4). Similar comments apply if p3 is 1 (and p1 is 0); in this case,
the length of the shortest m-path also is 3. Finally, if both p1 and p3 are 1, the length of the shortest m-path
between p and p4 is 4. In this case, the path goes through the sequence of points p p1 p2 p3 p4.
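A minimal sketch of the De, D4 and D8 distances defined above (the pixel coordinates in the example are arbitrary):

import math

def d_euclidean(p, q):
    """Euclidean distance between pixels p = (x, y) and q = (s, t)."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def d4(p, q):
    """City-block distance: |x - s| + |y - t|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance: max(|x - s|, |y - t|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
print(d_euclidean(p, q))  # 5.0
print(d4(p, q))           # 7
print(d8(p, q))           # 4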
2. Explain about Random noise?
Noise is always present in digital images; it is introduced during image acquisition, coding, transmission,
and processing steps. It is very difficult to remove noise from digital images without prior knowledge
of filtering techniques.
Random noise in images:

Image noise is a random variation of brightness or color information in the captured images. It is a
degradation of the image signal caused by external sources. Images containing multiplicative noise have
the characteristic that the brighter the area, the noisier it is. But mostly noise is additive. We can model a
noisy image as

A(x, y) = H(x, y) + B(x, y)

where A(x, y) is the noisy image, H(x, y) is the image noise, and B(x, y) is the original image.

Fig.2 Addition of random noise in an image


Sources of Image noise:
 While the image is being sent electronically from one place to another.
 Sensor heat while capturing an image.
 A varying ISO factor, which depends on the capacity of the camera to absorb light.
Types of Image noise:

There are different types of image noise. They can typically be divided into the following types.

Fig.3 classification of noise
1. Gaussian Noise:

Gaussian noise is a statistical noise having a probability density function equal to the normal distribution,
also known as the Gaussian distribution. A random Gaussian function is added to the image function to
generate this noise. It is also called electronic noise because it arises in amplifiers or detectors. Source:
thermal vibration of atoms and the discrete nature of radiation of warm objects.

Fig.4 Plot of the Probability Density Function

Figure 4 shows a bell-shaped probability density function with mean 0 and standard deviation (sigma) 1.
1.1 Implementation of Gaussian Noise with OpenCV-Python:
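A minimal sketch of one way to add Gaussian noise to a grayscale image with NumPy and OpenCV (the file name "input.jpg" and the parameters mean = 0, sigma = 25 are assumptions of this sketch):

import cv2
import numpy as np

def add_gaussian_noise(image, mean=0.0, sigma=25.0):
    """Add zero-mean Gaussian noise with the given sigma to a grayscale image."""
    noise = np.random.normal(mean, sigma, image.shape)
    noisy = image.astype(np.float64) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder file name
noisy_img = add_gaussian_noise(img, mean=0.0, sigma=25.0)
cv2.imwrite("gaussian_noise.jpg", noisy_img)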

1.2 Effect of Standard Deviation (sigma) on Gaussian Noise:

Fig.5 Effect of Sigma on Gaussian Noise

The magnitude of Gaussian noise depends on the standard deviation (sigma); the noise magnitude is
directly proportional to the sigma value.


2. Impulse Noise:

Impulse Function: In the discrete world, the impulse function takes a value of 1 at a single location, and
in the continuous world the impulse function is an idealised function having unit area.

Fig.6 Impulse function in discrete world and continuous world

2.1 Types of Impulse Noise:

There are three types of impulse noises. Salt Noise, Pepper Noise, Salt and Pepper Noise.

Salt Noise: Salt noise is added to an image by the addition of random bright pixels (with pixel value 255)
all over the image.

Pepper Noise: Pepper noise is added to an image by the addition of random dark pixels (with pixel value 0)
all over the image.

Salt and Pepper Noise: Salt and pepper noise is added to an image by the addition of both random bright
pixels (value 255) and random dark pixels (value 0) all over the image. This model is also known as data
drop noise because statistically it drops the original data values [5]. Source: malfunctioning of the camera's
sensor cells.
2.2 Implementation of Salt and Pepper Noise with OpenCV-Python:
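A minimal sketch of adding salt and pepper noise with NumPy and OpenCV (the noise probability prob = 0.02 and the file name are assumptions of this sketch):

import cv2
import numpy as np

def add_salt_and_pepper(image, prob=0.02):
    """Set a random fraction of pixels to 255 (salt) and another fraction to 0 (pepper)."""
    noisy = image.copy()
    rnd = np.random.rand(*image.shape[:2])
    noisy[rnd < prob / 2] = 255        # salt: random bright pixels
    noisy[rnd > 1 - prob / 2] = 0      # pepper: random dark pixels
    return noisy

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder file name
cv2.imwrite("salt_pepper_noise.jpg", add_salt_and_pepper(img, prob=0.02))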

3. Poisson Noise:

The appearance of this noise is due to the statistical nature of electromagnetic waves such as X-rays,
visible light and gamma rays. X-ray and gamma-ray sources emit a number of photons per unit time.
In medical X-ray and gamma-ray imaging systems, these rays are injected into the patient's body from
the source. These sources have random fluctuations in the number of photons, so the resulting image has
spatial and temporal randomness. This noise is also called quantum (photon) noise or shot noise.

3.1 Implementation of Poisson Noise with OpenCV-Python:
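A minimal sketch of simulating Poisson (shot) noise with NumPy; treating the pixel values themselves as the expected photon counts is a simplifying assumption of this sketch:

import cv2
import numpy as np

def add_poisson_noise(image):
    """Draw each pixel from a Poisson distribution whose mean is the pixel value."""
    noisy = np.random.poisson(image.astype(np.float64))
    return np.clip(noisy, 0, 255).astype(np.uint8)

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder file name
cv2.imwrite("poisson_noise.jpg", add_poisson_noise(img))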

4. Speckle Noise

A fundamental problem in optical and digital holography is the presence of speckle noise in the image

reconstruction process. Speckle is a granular noise that inherently exists in an image and degrades its quality.

Speckle noise is multiplicative: it can be simulated by multiplying the pixel values of an image by random values.
4.1 Implementation of Speckle Noise with OpenCV-Python:
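A minimal sketch of multiplicative speckle noise (the variance 0.04 and the file name are assumptions of this sketch):

import cv2
import numpy as np

def add_speckle_noise(image, var=0.04):
    """Multiply the image by (1 + n), where n is zero-mean Gaussian noise."""
    noise = np.random.randn(*image.shape) * np.sqrt(var)
    noisy = image.astype(np.float64) * (1.0 + noise)
    return np.clip(noisy, 0, 255).astype(np.uint8)

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder file name
cv2.imwrite("speckle_noise.jpg", add_speckle_noise(img))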

With this we have discussed the common types of noise that exist in a digital image. It is not necessary
that only one type of noise will be present in a particular image; it can be a combination of different
noises. So it is very important to know about different filtering techniques to remove these noises for the
betterment and enhancement of the image.

Gaussian Markov Random Fields


Gaussian Processes (GP) and Gaussian Markov Random Fields (GMRF) are closely related, although
different resources use different formalisms. The main definitions are summarised below.

A Gaussian Process is a supervised learning method for modelling a latent function y = f(x) + ε, given a
labelled training set. It is specified by a mean function m(x) = E[f(x)] and a covariance function
k(x, y) = E[(f(x) − m(x))(f(y) − m(y))]. Predicting with a GP means computing, for a new point x*:

f̄* = k*^T (K + σn^2 I)^(-1) y
V[f*] = k(x*, x*) − k*^T (K + σn^2 I)^(-1) k*

where k* is the vector of covariances between the test point x* and the training points xi, i.e.
(k*)i = k(x*, xi), y is the vector of labels of the elements in the training set, and K is the n×n covariance
matrix between all inputs of the training set. To evaluate a model, one has to compute the LML
(logarithm of the marginal likelihood):

LML = log p(y) = −(1/2) log det(K + σn^2 I) − (1/2) y^T (K + σn^2 I)^(-1) y − (n/2) log 2π

The computation of the LML is important while training, because the training consists of optimizing the
hyper-parameters of the kernel function.

A Gaussian random field (GRF) is a random field involving Gaussian probability density functions of the
variables. Specifically, a random field is defined as X(s, ω), where s ∈ D is a set of locations (usually
D = R^d) and ω is some element in a sample space (which usually is R and is dropped from the notation).
In the Gaussian case, each X(s) is defined by a mean function μ(s) = E[X(s)] and a covariance function
C(s, t) = Cov(X(s), X(t)). Each finite collection of points is jointly Gaussian:

x ≡ (X(s1), ..., X(sp))^T ~ N(μ, Σ), where Σij = C(si, sj).

A Markov random field (MRF) is a synonym for an undirected graphical model: it is a random field with
the Markov property. Interpreting each random variable X(s) as a node in an undirected graph G = (V, E),
the field satisfies the Markov property if:

 pairwise Markov property: any two non-adjacent variables are conditionally independent given all
other variables;
 local Markov property: a variable is conditionally independent of all other variables given its
neighbors;
 global Markov property: any two subsets of variables are conditionally independent given a separating
subset.

A Gaussian Markov Random Field (GMRF) is a GRF with the Markov property (or, equivalently, a
Markov random field with Gaussian random variables). A random vector x = (x1, ..., xn)^T ∈ R^n is a
GMRF with respect to a labelled graph G = (V, E), with mean μ and precision matrix Q > 0, if and only if
its density has the form

π(x) = (2π)^(-n/2) |Q|^(1/2) exp(−(1/2) (x − μ)^T Q (x − μ))

and Qij ≠ 0 ⇔ {i, j} ∈ E for all i ≠ j.
The algorithm for using a GP consists roughly of computing the Cholesky decomposition
(K + σn^2 I) = L L^T. With that, it is simple to compute α = (K + σn^2 I)^(-1) y, and finally:

f̄* = k*^T α
V[f̄*] = k(x*, x*) − (L \ k*)^T (L \ k*)

For the fitting, a gradient-based algorithm is used to find the best hyper-parameterization of the kernel
(i.e., of the covariance matrix). For this, the LML is evaluated repeatedly until the hyper-parameters
converge to their best values.

In general, a GMRF can be seen as a special case of a GP. In a GP we usually specify the covariance
matrix K (perhaps after applying a kernel function φ to the data, such that K = φ(X)^T φ(X)), while in a
GMRF we specify the precision matrix Q = K^(-1), whose zero pattern encodes the conditional-independence
structure of the underlying graph.

Related questions and tasks that arise in practice for GPs and GMRFs include:

 fitting a GMRF with an algorithm designed for GPs (i.e., a black box that computes the LML, the
predictive mean f̄* and the predictive variance);
 sampling from the Gibbs distribution of the process;
 reconstructing the precision matrix Q = K^(-1), which encodes the structure of the underlying graph,
given access to the whole covariance matrix K of the data;
 performing classification with a GP.
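A minimal NumPy sketch of the Cholesky-based GP prediction described above (the RBF kernel, its length scale, the noise level sigma_n = 0.1 and the toy sine data are assumptions of this sketch):

import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential (RBF) covariance between two sets of 1-D inputs."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale ** 2)

# Toy training data: noisy observations of a latent function.
rng = np.random.default_rng(0)
X = np.linspace(0.0, 5.0, 20)
y = np.sin(X) + 0.1 * rng.standard_normal(X.size)
sigma_n = 0.1                                           # assumed noise standard deviation

K = rbf_kernel(X, X) + sigma_n ** 2 * np.eye(X.size)
L = np.linalg.cholesky(K)                               # K + sigma_n^2 I = L L^T
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))     # (K + sigma_n^2 I)^-1 y

# Predictive mean and variance at new test points.
X_star = np.linspace(0.0, 5.0, 50)
k_star = rbf_kernel(X, X_star)                          # covariances: training vs. test
mean = k_star.T @ alpha
v = np.linalg.solve(L, k_star)                          # L \ k*
var = rbf_kernel(X_star, X_star).diagonal() - np.sum(v * v, axis=0)

print(mean[:3], var[:3])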

Gamma-Ray Imaging

Major uses of imaging based on gamma rays include nuclear medicine and astronomical observations. In
nuclear medicine, the approach is to inject a patient with a radioactive isotope that emits gamma rays as it
decays. Images are produced from the emissions collected by gamma ray detectors. Figure 1.6(a) shows an
image of a complete bone scan obtained by using gamma-ray imaging. Images of this sort are used to locate
sites of bone pathology, such as infections or tumors. Figure 1.6(b) shows another major modality of
nuclear imaging called positron emission tomography (PET). The principle is the same as with X-ray
tomography, mentioned briefly in Section 1.2. However, instead of using an external source of X-ray
energy, the patient is given a radioactive isotope that emits positrons as it decays. When a positron meets an
electron, both are annihilated and two gamma rays are given off. These are detected and a tomographic
image is created using the basic principles of tomography. The image shown in Fig. 1.6(b) is one sample of
a sequence that constitutes a 3-D rendition of the patient. This image shows a tumor in the brain and one in
the lung, easily visible as small white masses. A star in the constellation of Cygnus exploded about 15,000
years ago, generating a superheated stationary gas cloud (known as the Cygnus Loop) that glows in a
spectacular array of colors. Figure 1.6(c) shows the Cygnus Loop imaged in the gamma-ray band. Unlike
the two examples shown in Figs. 1.6(a) and (b), this image was obtained using the natural radiation of the
object being imaged. Finally, Fig. 1.6(d) shows an image of gamma radiation from a valve in a nuclear
reactor. An area of strong radiation is seen in the lower left side of the image.

Linear and Nonlinear Operations


Let H be an operator whose input and output are images. H is said to be a linear operator if, for any two
images f and g and any two scalars a and b,

H(af + bg) = aH(f) + bH(g)          (2.6-1)

In other words, the result of applying a linear operator to the sum of two images (that have been
multiplied by the constants shown) is identical to applying the operator to the images individually,
multiplying the results by the appropriate constants, and then adding those results. For example, an operator
whose function is to compute the sum of K images is a linear operator. An operator that computes the
absolute value of the difference of two images is not. An operator that fails the test of Eq. (2.6-1) is by
definition nonlinear. Linear operations are exceptionally important in image processing because they are
based on a significant body of well-understood theoretical and practical results. Although nonlinear
operations sometimes offer better performance, they are not always predictable, and for the most part are
not well understood theoretically.
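A small NumPy sketch illustrating the test of Eq. (2.6-1): the sum-of-images operator passes it, while the absolute-difference operator does not (the random test images and scalars are arbitrary):

import numpy as np

rng = np.random.default_rng(1)
f1, g1 = rng.random((4, 4)), rng.random((4, 4))
f2, g2 = rng.random((4, 4)), rng.random((4, 4))
a, b = 2.0, -3.0

def H_sum(f, g):
    """Operator that computes the sum of two images (linear)."""
    return f + g

def H_absdiff(f, g):
    """Operator that computes the absolute difference of two images (nonlinear)."""
    return np.abs(f - g)

for H in (H_sum, H_absdiff):
    lhs = H(a * f1 + b * f2, a * g1 + b * g2)     # operator applied to scaled sums
    rhs = a * H(f1, g1) + b * H(f2, g2)           # scaled sum of operator outputs
    print(H.__name__, "satisfies Eq. (2.6-1):", np.allclose(lhs, rhs))
# Expected output: H_sum True, H_absdiff False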
1. What are the fundamental steps in Digital Image Processing?
Fundamental Steps in Digital Image Processing:
Image acquisition is the first process shown in Fig.2. Note that acquisition could be as simple as being
given an image that is already in digital form. Generally, the image acquisition stage involves
preprocessing, such as scaling.
Image enhancement is among the simplest and most appealing areas of digital image processing.
Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to
highlight certain features of interest in an image. A familiar example of enhancement is when we
increase the contrast of an image because "it looks better." It is important to keep in mind that
enhancement is a very subjective area of image processing.
Image restoration is an area that also deals with improving the appearance of an image. However,
unlike enhancement, which is subjective, image restoration is objective, in the sense that restoration
techniques tend to be based on mathematical or probabilistic models of image degradation.
Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes
a "good" enhancement result.
Color image processing is an area that has been gaining in importance because of the significant
increase in the use of digital images over the Internet.

Fig.2. Fundamental steps in Digital Image Processing


Wavelets are the foundation for representing images in various degrees of resolution. Compression, as
the name implies, deals with techniques for reducing the storage required to save an image, or the
bandwidth required to transmit it. Although storage technology has improved significantly over the
past decade, the same cannot be said for transmission capacity. This is true particularly in uses of the
Internet, which are characterized by significant pictorial content. Image compression is familiar
(perhaps inadvertently) to most users of computers in the form of image file extensions, such as the jpg
file extension used in the JPEG (Joint Photographic Experts Group) image compression standard.
Morphological processing deals with tools for extracting image components that are useful in the
representation and description of shape.
Segmentation procedures partition an image into its constituent parts or objects. In general,
autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged
segmentation procedure brings the process a long way toward successful solution of imaging problems
that require objects to be identified individually. On the other hand, weak or erratic segmentation
algorithms almost always guarantee eventual failure. In general, the more accurate the segmentation,
the more likely recognition is to succeed.

Representation and description almost always follow the output of a segmentation stage, which usually
is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating one
image region from another) or all the points in the region itself. In either case, converting the data to a
form suitable for computer processing is necessary. The first decision that must be made is whether
the data should be represented as a boundary or as a complete region. Boundary representation is
appropriate when the focus is on external shape characteristics, such as corners and inflections.
Regional representation is appropriate when the focus is on internal properties, such as texture or
skeletal shape. In some applications, these representations complement each other. Choosing a
representation is only part of the solution for transforming raw data into a form suitable for subsequent
computer processing. A method must also be specified for describing the data so that features of
interest are highlighted. Description, also called feature selection, deals with extracting attributes that
result in some quantitative information of interest or are basic for differentiating one class of objects
from another.
Recognition is the process that assigns a label (e.g., "vehicle") to an object based on its descriptors.
We conclude our coverage of digital image processing with the development of methods for
recognition of individual objects.
Color Image Processing
Color image processing includes the following topics:
1. Color fundamentals
In 1666, Sir Isaac Newton discovered that when a beam of sunlight passes through a glass prism, the
emerging beam is split into a spectrum of colors.

The colors that humans and most animals perceive in an object are determined by the nature of the
light reflected from the object
For example, green objects reflect light with wavelengths primarily in the range of 500 to 570 nm, while
absorbing most of the energy at other wavelengths.

Three basic quantities are used to describe the quality of a chromatic light source:
– Radiance: the total amount of energy that flows from the light source (measured in watts).
– Luminance: the amount of energy an observer perceives from the light source (measured in lumens).
Note that we can have high radiance but low luminance.
– Brightness: a subjective (practically un-measurable) notion that embodies the achromatic notion of
intensity of light.
Chromatic light spans the electromagnetic spectrum from approximately 400 to 700 nm.
➢ Human color vision is achieved through 6 to 7 million cones in each eye.
➢ Three principal sensing groups: about 66% of these cones are sensitive to red light, 33% to green
light, and 2% to blue light.
➢ Absorption curves for the different cones have been determined experimentally.
➢ Strangely these do not match the CIE standards for red (700nm), green (546.1nm) and blue (435.8nm)
light as the standards were developed before the experiments

➢ The primary colors can be added to produce the secondary colors.
➢ Mixing the three primaries produces white.
➢ Mixing a secondary with its opposite primary produces white (e.g. red + cyan).
➢ Primary colors of light: red, green, blue.
➢ Primary colors of pigments (colorants): a pigment primary is a color that subtracts or absorbs a
primary color of light and reflects the other two. These are cyan, magenta and yellow (CMY). A proper
combination of pigment primaries produces black.
How to distinguish one color from another?
➢ Brightness: the achromatic notion of intensity.
➢ Hue: the dominant wavelength in a mixture of light waves. Note: the dominant color perceived by an
observer, e.g. when we call an object red or orange we refer to its hue.
➢ Saturation: the amount of white light mixed with a hue. Pure colors are fully saturated. Pink
(red + white) is less saturated.
➢ Hue and saturation together are called chromaticity.
➢ Therefore any color is characterized by its brightness and chromaticity.
➢ The amounts of red, green and blue needed to form a particular color are called tristimulus values and
are denoted by X, Y, Z.
➢ A color is then specified by its trichromatic coefficients x = X/(X + Y + Z), y = Y/(X + Y + Z),
z = Z/(X + Y + Z), which satisfy x + y + z = 1.
➢ For any visible wavelength, the tristimulus values needed to produce that wavelength are obtained
from curves compiled by extensive experimentation.
2. Color Models

➢ There are different ways to model color.
➢ Two very popular models used in color image processing are RGB (Red, Green, Blue) and HSI (Hue,
Saturation, Intensity).
➢ In the RGB model each color appears in its primary spectral components of red, green and blue.
➢ The model is based on a Cartesian coordinate system.
➢ RGB values are at three corners of the color cube.
➢ Cyan, magenta and yellow are at three other corners.
➢ Black is at the origin.
➢ White is the corner furthest from the origin.
➢ Different colors are points on or inside the cube, represented by RGB vectors.

➢ Images represented in the RGB color model consist of three component images, one for each primary
color.
➢ When fed into a monitor these images are combined to create a composite color image.
➢ The number of bits used to represent each pixel is referred to as the color depth.
➢ A 24-bit image is often referred to as a full-color image as it allows 16,777,216 colors.
➢ Generating an RGB image

➢ RGB is useful for hardware implementations and is serendipitously related to the way in which the
human visual system works.
➢ However, RGB is not a particularly intuitive way in which to describe colors.
➢ Rather, when people describe colors they tend to use hue, saturation and brightness.
➢ RGB is great for color generation, but HSI is great for color description.
HSI Color Model
➢ Hue: a color attribute that describes a pure color (pure yellow, orange or red).
➢ Saturation: gives a measure of how much a pure color is diluted with white light.
➢ Intensity: brightness is nearly impossible to measure because it is so subjective. Instead we use
intensity.
➢ Intensity is the same achromatic notion that we have seen in grey-level images.
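A minimal NumPy sketch of converting a single RGB pixel to HSI using the standard conversion formulas (the example pixel values are arbitrary, and the small epsilon added to the denominator is an assumption to avoid division by zero):

import numpy as np

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (values 0..255) to HSI (H in degrees, S and I in [0, 1])."""
    r, g, b = [v / 255.0 for v in (r, g, b)]
    i = (r + g + b) / 3.0                                        # intensity
    s = 0.0 if (r + g + b) == 0 else 1.0 - 3.0 * min(r, g, b) / (r + g + b)   # saturation
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12      # avoid division by zero
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = theta if b <= g else 360.0 - theta                       # hue
    return h, s, i

print(rgb_to_hsi(200, 30, 30))    # a reddish pixel: hue near 0 degrees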
3. Pseudocolor Image Processing
➢ Pseudocolor (also called false color) image processing consists of assigning colors to grey values
based on a specific criterion.
➢ The principal use of pseudocolor image processing is for human visualisation. Humans can discern
between thousands of color shades and intensities, compared to only about two dozen or so shades of
grey.
➢ Intensity slicing: intensity slicing and color coding is one of the simplest kinds of pseudocolor image
processing.
➢ First we consider an image as a 3-D function mapping spatial coordinates to intensities (that we can
consider heights).
➢ Now consider placing planes at certain levels parallel to the coordinate plane.
➢ If a value is on one side of such a plane it is rendered in one color, and a different color if on the
other side.
➢ In general, intensity slicing can be summarised as follows.
➢ Let [0, L-1] represent the grey scale. Let l0 represent black [f(x, y) = 0] and let lL-1 represent white
[f(x, y) = L-1].
➢ Suppose P planes perpendicular to the intensity axis are defined at levels l1, l2, ..., lP.
➢ Assuming that 0 < P < L-1, the P planes partition the grey scale into P + 1 intervals V1, V2, ..., VP+1.
➢ Grey-level-to-color assignments can then be made according to the relation: f(x, y) = ck if
f(x, y) ∈ Vk, where ck is the color associated with the kth intensity interval Vk defined by the
partitioning planes at l = k - 1 and l = k.
➢ Three independent transformations of the intensity can also be performed.
➢ The results are fed into the R, G, B channels.
➢ The resulting composite image highlights certain image parts.
➢ Sinusoidal transformation functions may be used.
➢ Changing the phase or the frequency of the transformation functions can emphasize ranges in the
gray scale.
➢ A small change in the phase between the transformations assigns a strong color to the pixels with
intensities in the valleys.
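A minimal sketch of intensity slicing with OpenCV and NumPy (the four intervals, the BGR colors, and the file name are assumptions of this sketch); a built-in colormap is also shown as a shortcut for smooth gray-to-color mapping:

import cv2
import numpy as np

gray = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)      # placeholder file name

# Manual intensity slicing: partition [0, 255] into 4 intervals and assign a
# BGR color c_k to every pixel whose intensity falls in interval V_k.
levels = [0, 64, 128, 192, 256]
colors = [(255, 0, 0), (0, 255, 0), (0, 255, 255), (0, 0, 255)]   # arbitrary colors
pseudo = np.zeros((*gray.shape, 3), dtype=np.uint8)
k = np.digitize(gray, levels[1:-1])                        # interval index of each pixel
for idx, c in enumerate(colors):
    pseudo[k == idx] = c

# Shortcut: a built-in colormap gives a smooth pseudocolor rendering.
pseudo_jet = cv2.applyColorMap(gray, cv2.COLORMAP_JET)

cv2.imwrite("pseudocolor_slices.jpg", pseudo)
cv2.imwrite("pseudocolor_jet.jpg", pseudo_jet)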

Color Transformation
➢ Gray-scale image transformations may also be applied to each color component separately. Color
transformation operations are:
o Intensity adjustment
o Color complement
o Color slicing
o Tone correction
o Correction of color imbalances
Color Image Smoothing and Sharpening
➢ Color image smoothing and sharpening are two important pre-processing techniques within the
computer vision field.
➢ Color image smoothing is part of pre-processing techniques intended for removing possible image
perturbations without losing image information.
➢ Analogously, sharpening is a pre-processing technique that plays an important role in feature
extraction in image processing.
➢ Sharpening is in charge of improving the visual appearance of the image and enhancing its details
and borders.
➢ There exist many smoothing and sharpening methods that are able to improve the visual quality of
images. However, the same does not happen with simultaneous approaches, due to the opposite nature
of these two operations.
➢ Typical spatial filters for color image smoothing are based on the convolution of the image with
different kernels, depending on the intended result.
➢ These kernels could be of any size n × n, but usually 3 × 3.
➢ Commonly used filters for smoothing are based on averaging.
➢ Similarly, typical spatial techniques for sharpening images are based on kernels.
➢ Most common methods are based on derivatives, such as the Laplacian filter.
➢ Smoothing intends to remove the high frequencies, whereas sharpening intends to increase the high
frequencies.
➢ Both smoothing and sharpening are used to enhance the image. In the two-step approach, either
smoothing or sharpening is performed first and the other operation afterwards.
➢ However, this approach usually leads to many problems, so simultaneous approaches are also
studied. A sketch of basic smoothing and sharpening kernels is given below.
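A minimal sketch of kernel-based color smoothing and sharpening with OpenCV (the 3 × 3 averaging kernel, the Laplacian sharpening amount 0.7, and the file name are assumptions of this sketch):

import cv2
import numpy as np

img = cv2.imread("input_color.jpg")              # placeholder file name (BGR image)

# Smoothing: convolve each channel with a 3 x 3 averaging kernel.
avg_kernel = np.ones((3, 3), dtype=np.float32) / 9.0
smoothed = cv2.filter2D(img, -1, avg_kernel)

# Sharpening: subtract a scaled Laplacian (a derivative-based kernel) from the image.
lap = cv2.Laplacian(img, cv2.CV_64F, ksize=3)
sharpened = np.clip(img.astype(np.float64) - 0.7 * lap, 0, 255).astype(np.uint8)

cv2.imwrite("smoothed.jpg", smoothed)
cv2.imwrite("sharpened.jpg", sharpened)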
Color Edge Detection
➢ Color edge detection is one of the most important tasks in image processing and scene analysis
systems.
➢ One common edge detection technique is to measure the gradient vector magnitude at pixel
locations.
➢ Edge detection is one of the most commonly used operations in image processing and pattern
recognition.
➢ Edges form the outline of an object.
➢ An edge is the boundary between an object and the background, and indicates the boundary between
overlapping objects.
➢ This means that if the edges in an image can be identified accurately, all of the objects can be
located and basic properties such as area, perimeter, and shape can be measured.
➢ Edge detection in a color image is a far more challenging task than in gray-scale images, as the color
space is considered a vector space.
➢ Almost 90% of the edge information in a color image can be found in the corresponding grayscale
image. However, the remaining 10% can still be vital in certain computer vision tasks.
➢ In color edge detection, each pixel is represented by three channel values called Red, Green, and Blue.
➢ Each channel contains gray values. Gray-scale edge detection techniques are applied to each channel
to detect the edges and to discriminate the objects from the background, as sketched below.
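A minimal sketch of per-channel color edge detection with the Sobel operator, combining the channel gradient magnitudes by taking their maximum (the file name and the combination rule are assumptions of this sketch):

import cv2
import numpy as np

img = cv2.imread("input_color.jpg")              # placeholder file name (BGR image)

magnitudes = []
for ch in cv2.split(img):                        # process the B, G and R channels separately
    gx = cv2.Sobel(ch, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(ch, cv2.CV_64F, 0, 1, ksize=3)
    magnitudes.append(np.sqrt(gx ** 2 + gy ** 2))

# Combine the per-channel gradient magnitudes into a single edge map.
edges = np.max(np.stack(magnitudes), axis=0)
edges = np.clip(edges / (edges.max() + 1e-12) * 255, 0, 255).astype(np.uint8)
cv2.imwrite("color_edges.jpg", edges)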
Noise in Color Images
➢ Image noise is random variation of brightness or color information in images, and is usually an
aspect of electronic noise. It can be produced by the image sensor and circuitry of a scanner or digital
camera.
➢ The noise models discussed for grayscale images are also applicable to color images.
➢ However, in many applications, a color channel may be more or less affected than the other channels.
➢ For instance, using a red color filter in a CCD camera may affect the red component of the image
(CCD sensors are noisier at low levels of illumination).
➢ When a noisy RGB image is converted to HSI, the hue and saturation components are significantly
degraded.
➢ This is due to the nonlinearity of the cos and min operations used in the transformation.
➢ The intensity component is smoother due to the averaging of the three noisy RGB components.
➢ When only one channel is affected by noise, conversion to HSI spreads the noise to all HSI
component images.
➢ This is due to the transformation that makes use of all RGB components to compute each HSI
component. Common noises include: 1) salt and pepper noise, 2) impulse noise, 3) Gaussian noise.

UNIT-II

IMAGE ENHANCEMENT (SPATIAL AND FREQUENCY DOMAINS)

1. What is the objective of image enhancement? Define spatial domain. Define point processing.
The term spatial domain refers to the aggregate of pixels composing an image. Spatial domain methods
are procedures that operate directly on these pixels. Spatial domain processes will be denoted by the
expression

g(x, y) = T[f(x, y)]

where f(x, y) is the input image, g(x, y) is the processed image, and T is an operator on f, defined over
some neighborhood of (x, y). In addition, T can operate on a set of input images, such as performing the
pixel-by-pixel sum of K images for noise reduction.
The principal approach in defining a neighborhood about a point (x, y) is to use a square or rectangular
subimage area centered at (x, y), as Fig.2.1 shows. The center of the subimage is moved from pixel to
pixel starting, say, at the top left corner. The operator T is applied at each location (x, y) to yield the
output, g, at that location.The process utilizes only the pixels in the area of the image spanned by the
neighborhood.

Fig.2.1 A 3*3 neighborhood about a point (x, y) in an image.

Although other neighborhood shapes, such as approximations to a circle, sometimes are used, square and
rectangular arrays are by far the most predominant because of their ease of implementation. The simplest
form of T is when the neighborhood is of size 1*1 (that is, a single pixel). In this case, g depends only on
the value of f at (x, y), and T becomes a gray-level (also called an intensity or mapping) transformation
function of the form

s = T(r)

where, for simplicity in notation, r and s are variables denoting, respectively, the gray level of f(x, y)
and g(x, y) at any point (x, y). For example, if T(r) has the form shown in Fig. 2.2(a), the effect of this
transformation would be to produce an image of higher contrast than the original by darkening the levels
below m and brightening the levels above m in the original image. In this technique, known as contrast
stretching, the values of r below m are compressed by the transformation function into a narrow range of
s, toward black.The opposite effect takes place for values of r above m. In the limiting case shown in Fig.
2.2(b), T(r) produces a two-level (binary) image. A mapping of this form is called a thresholding
function. Some fairly simple, yet powerful, processing approaches can be formulated with gray-level
transformations. Because enhancement at any point in an image depends only on the gray level at that
point, techniques in this category often are referred to as point processing.
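A minimal sketch of the two point-processing transformations just described, contrast stretching about a level m and thresholding (the value m = 128, the piecewise-linear slopes, and the file name are assumptions of this sketch):

import cv2
import numpy as np

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)     # placeholder file name
m = 128                                                 # assumed mid gray level

# Contrast stretching: compress levels below m toward black and expand levels above m
# (a simple piecewise-linear mapping about m).
r = img.astype(np.float64)
stretched = np.where(r < m, r * 0.5, m * 0.5 + (r - m) * 1.5)
stretched = np.clip(stretched, 0, 255).astype(np.uint8)

# Thresholding: the limiting case T(r) that produces a two-level (binary) image.
binary = np.where(img >= m, 255, 0).astype(np.uint8)

cv2.imwrite("stretched.jpg", stretched)
cv2.imwrite("binary.jpg", binary)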

Fig.2.2 Graylevel transformation functions for contrast enhancement.

Larger neighborhoods allow considerably more flexibility. The general approach is to use a function of
the values of f in a predefined neighborhood of (x, y) to determine the value of g at (x, y).One of the
principal approaches in this formulation is based on the use of so-called masks
(also referred to as filters, kernels, templates, or windows). Basically, a mask is a small (say, 3*3)
2-D array, such as the one shown in Fig. 2.1, in which the values of the mask coefficients determine the
nature of the process, such as image sharpening.
2. What is meant by image enhancement by point processing? Discuss any two methods in it.
Basic Gray Level Transformations:
The study of image enhancement techniques is done by discussing gray-level transformation functions.
These are among the simplest of all image enhancement techniques. The values of pixels, before and
after processing, will be denoted by r and s, respectively. As indicated in the previous section, these
values are related by an expression of the form s=T(r), where T is a transformation that maps a pixel
value r into a pixel value s. Since we are dealing with digital quantities, values of the transformation
function typically are stored in a one-dimensional array and the mappings from r to s are implemented
via table lookups. For an 8-bit environment, a lookup table containing the values of T will have 256
entries. As an introduction to gray-level transformations, consider Fig. 1.1, which shows three basic
types of functions used frequently for image enhancement: linear (negative and identity
transformations), logarithmic (log and inverse-log transformations), and power-law (nth power and nth
root transformations).The identity function is the trivial case in which output intensities are identical to
input intensities. It is included in the graph only for completeness.
Image Negatives:

The negative of an image with gray levels in the range [0, L-1] is obtained by using the negative
transformation shown in Fig.1.1, which is given by the expression
s = L - 1 - r.

Reversing the intensity levels of an image in this manner produces the equivalent of a photographic
negative. This type of processing is particularly suited for enhancing white or gray detail embedded in
dark regions of an image, especially when the black areas are dominant in size.
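
As a minimal illustrative sketch (assuming the OpenCV C++ library, which is also used for the Canny example later in these notes), the negative of an 8-bit image can be produced with a 256-entry lookup table implementing s = L - 1 - r:

#include <opencv2/opencv.hpp>
using namespace cv;

// Negative of an 8-bit grayscale image via a 256-entry lookup table (L = 256).
Mat negativeImage(const Mat& src)            // src is assumed to be CV_8UC1
{
    Mat lut(1, 256, CV_8U);
    for (int r = 0; r < 256; r++)
        lut.at<uchar>(0, r) = (uchar)(255 - r);   // s = L - 1 - r
    Mat dst;
    LUT(src, lut, dst);                      // table lookup, as described above
    return dst;
}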

Fig.1.1 Some basic gray-level transformation functions used for image enhancement
Log Transformations:
The general form of the log transformation shown in Fig.1.1 is
s = c log(1 + r)
where c is a constant, and it is assumed that r ≥ 0.The shape of the log curve in Fig. 1.1 shows that
this transformation maps a narrow range of low gray-level values in the input image into a wider range
of output levels.The opposite is true of higher values of input levels.We would use a transformation of
this type to expand the values of dark pixels in an image while compressing the higher-level values.The
opposite is true of the inverse log transformation.

Any curve having the general shape of the log functions shown in Fig. 1.1 would accomplish this
spreading/compressing of gray levels in an image. In fact, the power-law transformations discussed in
the next section are much more versatile for this purpose than the log transformation. However, the
log function has the important characteristic that it compresses the dynamic range of images with large
variations in pixel values. A classic illustration of an application in which pixel values have a large
dynamic range is the Fourier spectrum. At the moment, we are concerned only with the image
characteristics of spectra. It is not unusual to encounter spectrum values that range from 0 to 10^6 or
higher. While processing numbers such as these presents no problems for a computer, image display
systems generally will not be able to reproduce faithfully such a wide range of intensity values. The net
effect is that a significant degree of detail will be lost in the display of a typical Fourier spectrum.
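
A short, hedged sketch of the log transformation s = c log(1 + r) as it might be used to display a spectrum (OpenCV assumed; here the constant c is absorbed into a final normalization to the display range [0, 255]):

#include <opencv2/opencv.hpp>
using namespace cv;

// Apply s = c log(1 + r) and rescale the result for display.
Mat logTransform(const Mat& src)             // src assumed CV_8UC1 or CV_32F
{
    Mat f, dst;
    src.convertTo(f, CV_32F);
    Mat shifted = f + 1.0f;                  // 1 + r
    log(shifted, dst);                       // natural logarithm, element by element
    normalize(dst, dst, 0, 255, NORM_MINMAX);  // plays the role of the constant c
    dst.convertTo(dst, CV_8U);
    return dst;
}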
Power-Law Transformations:

Power-law transformations have the basic form
s = c r^γ
where c and γ are positive constants. Sometimes the equation is written as
s = c (r + ε)^γ
to account for an offset (that is, a measurable output when the input is zero). However, offsets typically
are an issue of display calibration and as a result they are normally ignored. Plots of s versus r
for various values of γ are shown in Fig. 1.2. As in the case of the log transformation, power-law
curves with fractional values of γ map a narrow range of dark input values into a wider range of output
values, with the opposite being true for higher values of input levels. Unlike the log function, however,
we notice here a family of possible transformation curves obtained simply by varying γ. As expected,
we see in Fig.1.2 that curves generated with values of γ > 1 have exactly the opposite effect as those
generated with values of γ < 1. Finally, we note that the equation reduces to the identity transformation
when c = γ = 1. A variety of devices used for image capture, printing, and display respond according to
a power law. By convention, the exponent in the power-law equation is referred to as gamma. The
process used to correct this power-law response phenomenon is called gamma correction. For example,
cathode ray tube (CRT) devices have an intensity-to-voltage response that is a power function, with
exponents varying from approximately 1.8 to 2.5. With reference to the curve for γ = 2.5 in Fig.1.2, we
see that such display systems would tend to produce images that are darker than intended.

Fig.1.2 Plots of the equation s = c r^γ for various values of γ (c = 1 in all cases).
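
A hedged sketch of gamma correction (OpenCV assumed; the gamma value below is only an illustrative choice): normalize the input to [0, 1], raise it to the desired power, and rescale.

#include <opencv2/opencv.hpp>
using namespace cv;

// s = c r^gamma with c = 1, computed on a normalized copy of the image.
Mat gammaCorrect(const Mat& src, double gamma)   // src assumed CV_8UC1
{
    Mat f, dst;
    src.convertTo(f, CV_32F, 1.0 / 255.0);       // map [0, 255] to [0, 1]
    pow(f, gamma, dst);                          // s = r^gamma
    dst.convertTo(dst, CV_8U, 255.0);            // map back to [0, 255]
    return dst;
}

// For a CRT-like display with gamma of about 2.5, the image would be
// pre-corrected with the inverse exponent, e.g. gammaCorrect(img, 1.0 / 2.5).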

Piecewise-Linear Transformation Functions:


The principal advantage of piecewise linear functions over the types of functions we have
discussed above is that the form of piecewise functions can be arbitrarily complex. In fact, as we
will see shortly, a practical implementation of some important transformations can be formulated
only as piecewise functions. The principal disadvantage of piecewise functions is that their
specification requires considerably more user input.
Contrast stretching:

One of the simplest piecewise linear functions is a contrast-stretching transformation. Low- contrast
images can result from poor illumination, lack of dynamic range in the imaging sensor, or even
wrong setting of a lens aperture during image acquisition.The idea behind contrast stretching is to
increase the dynamic range of the gray levels in the image being processed.
Figure 1.3 (a) shows a typical transformation used for contrast stretching.

The locations of points (r1 , s1) and (r2 , s2) control the shape of the transformation

Fig.1.3 Contrast Stretching (a) Form of Transformation function (b) A low-contrast image
(c) Result of contrast stretching (d) Result of thresholding.
function. If r1 = s1 and r2 = s2, the transformation is a linear function that produces no changes in gray
levels. If r1 = r2, s1 = 0 and s2 = L-1, the transformation becomes a thresholding function that creates a
binary image, as illustrated by the result in Fig. 1.3(d). Intermediate values of (r1, s1) and (r2, s2) produce various
degrees of spread in the gray levels of the output image, thus affecting its contrast. In general, r1 ≤ r2
and s1 ≤ s2 is assumed so that the function is single valued and monotonically increasing.This
condition preserves the order of gray levels, thus preventing the creation of intensity artifacts in the
processed image.
Figure 1.3 (b) shows an 8-bit image with low contrast. Fig. 1.3(c) shows the result of contrast
stretching, obtained by setting (r 1 , s1) = (rmin , 0) and (r2 , s2) = (rmax , L-1) where rmin and rmax denote the
minimum and maximum gray levels in the image, respectively.Thus, the transformation function
stretched the levels linearly from their original range to the full range [0, L-1]. Finally, Fig. 1.3 (d)
shows the result of using the thresholding function defined previously,with r1 = r2 = m, the mean gray
level in the image.The original image on which these results are based is a scanning electron
microscope image of pollen,magnified approximately 700 times.
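
A minimal sketch of the contrast-stretching case just described, with (r1, s1) = (rmin, 0) and (r2, s2) = (rmax, L-1), which reduces to one linear rescaling (OpenCV assumed):

#include <opencv2/opencv.hpp>
using namespace cv;

// Stretch the gray levels of an 8-bit image linearly to the full range [0, 255].
Mat contrastStretch(const Mat& src)              // src assumed CV_8UC1, not constant
{
    double rmin, rmax;
    minMaxLoc(src, &rmin, &rmax);                // find rmin and rmax in the image
    Mat dst;
    double scale = 255.0 / (rmax - rmin);        // s = (r - rmin) * (L - 1) / (rmax - rmin)
    src.convertTo(dst, CV_8U, scale, -rmin * scale);
    return dst;
}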

Gray-level slicing:
Highlighting a specific range of gray levels in an image often is desired. Applications include enhancing
features such as masses of water in satellite imagery and enhancing flaws in X-ray images.There are

several ways of doing level slicing, but most of them are variations of two basic themes.One
approach is to display a high value for all gray levels in the range of interest and a low value for all
other gray levels.This transformation, shown in Fig. 1.4 (a), produces a binary image.The second
approach, based on the transformation shown in Fig. 1.4 (b), brightens the desired range of gray levels but
preserves the background and gray-level tonalities in the image. Figure 1.4(c) shows a gray-scale image,
and Fig. 1.4 (d) shows the result of using the transformation in Fig. 1.4 (a).Variations of the two
transformations shown in Fig. 1.4 are easy to formulate

Fig.1.4 (a) This transformation highlights range [A, B] of gray levels and reduce all others to a
constant level (b) This transformation highlights range [A, B] but preserves all other levels (c) An
image (d) Result of using the transformation in (a).
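
A hedged sketch of the two slicing variants of Fig. 1.4 (OpenCV assumed; the range [A, B] and the output levels are arbitrary illustrative choices):

#include <opencv2/opencv.hpp>
using namespace cv;

// preserveBackground = false gives variant (a): binary output.
// preserveBackground = true  gives variant (b): brighten [A, B], keep the rest.
Mat graySlice(const Mat& src, int A, int B, bool preserveBackground)
{
    Mat dst = src.clone();                       // src assumed CV_8UC1
    for (int y = 0; y < src.rows; y++)
        for (int x = 0; x < src.cols; x++) {
            uchar r = src.at<uchar>(y, x);
            if (r >= A && r <= B)
                dst.at<uchar>(y, x) = 255;       // highlight the range of interest
            else if (!preserveBackground)
                dst.at<uchar>(y, x) = 10;        // variant (a): push all other levels low
        }
    return dst;
}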

Bit-plane slicing:
Instead of highlighting gray-level ranges, highlighting the contribution made to total image appearance
by specific bits might be desired. Suppose that each pixel in an image is represented by 8 bits.
Imagine that the image is composed of eight 1-bit planes, ranging from bit-plane 0 for
the least significant bit to bit plane 7 for the most significant bit. In terms of 8-bit bytes, plane 0

contains all the lowest order bits in the bytes comprising the pixels in the image and plane 7 contains
all the high-order bits.Figure 1.5 illustrates these ideas, and Fig. 1.7 shows the various bit planes
for the image shown in Fig.1.6. Note that the higher-order bits (especially the top four) contain
the majority of the visually significant data. The other bit planes contribute to more subtle details in the
image. Separating a digital image into its bit planes is useful for analyzing the relative importance
played by each bit of the image, a process that aids in determining the adequacy of the number of bits
used to quantize each pixel.

Fig.1.5 Bit-plane representation of an 8-bit image.

In terms of bit-plane extraction for an 8-bit image, it is not difficult to show that the (binary) image for
bit-plane 7 can be obtained by processing the input image with a thresholding gray-level
transformation function that (1) maps all levels in the image between 0 and 127 to one level (for
example, 0); and (2) maps all levels between 128 and 255 to another (for example, 255).
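
A short sketch of bit-plane extraction (OpenCV assumed): plane k is obtained by masking bit k of every pixel, and comparing against zero scales the plane to 0/255 for display. For plane 7 the result is the same as the thresholding described above.

#include <opencv2/opencv.hpp>
using namespace cv;

// Extract bit plane k (k = 0..7) of an 8-bit grayscale image as a binary image.
Mat bitPlane(const Mat& src, int k)              // src assumed CV_8UC1
{
    Mat masked = src & (1 << k);                 // keep only bit k of every pixel
    Mat plane  = masked > 0;                     // 255 where the bit is set, 0 elsewhere
    return plane;
}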

Fig.1.6 An 8-bit fractal image

Fig.1.7 The eight bit planes of the image in Fig.1.6. The number at the bottom, right of each image
identifies the bit plane.

What is a mask
A mask is a filter. The concept of masking is also known as spatial filtering. In this concept we deal with the
filtering operation that is performed directly on the image.
A sample mask has been shown below
-1 0 1

-1 0 1

-1 0 1
What is filtering
The process of filtering is also known as convolving a mask with an image. Because this process is the same as
convolution, filter masks are also known as convolution masks.
How it is done
The general process of filtering and applying masks consists of moving the filter mask from point to
point in an image. At each point (x, y) of the original image, the response of the filter is calculated by a
predefined relationship. All the filter values are predefined and are a standard.
Types of filters
Generally, there are two types of filters. One type is called linear or smoothing filters, and the other is
called frequency domain filters.
Why filters are used?
Filters are applied to images for multiple purposes. The two most common uses are the following:

 Filters are used for Blurring and noise reduction


 Filters are used for edge detection and sharpness
Blurring and noise reduction
Filters are most commonly used for blurring and for noise reduction. Blurring is used in pre processing
steps, such as removal of small details from an image prior to large object extraction.
Masks for blurring
The common masks for blurring are.

 Box filter
 Weighted average filter
In the process of blurring we reduce the edge content in an image and try to make the transitions between
different pixel intensities as smooth as possible.
Noise reduction is also possible with the help of blurring.
Edge Detection and sharpness
Masks or filters can also be used for edge detection in an image and to increase sharpness of an image.
What are edges

We can also say that sudden changes or discontinuities in an image are called edges. Significant
transitions in an image are called edges. A picture with edges is shown below.
Original picture

Same picture with edges

2. Write about Smoothing Spatial filters.


Smoothing Spatial Filters:
Smoothing filters are used for blurring and for noise reduction. Blurring is used in preprocessing steps,
such as removal of small details from an image prior to (large) object extraction, and bridging of small
gaps in lines or curves. Noise reduction can be accomplished by blurring with a linear filter and also by
non-linear filtering.
(1) Smoothing Linear Filters:

The output (response) of a smoothing, linear spatial filter is simply the average of the pixels contained
in the neighborhood of the filter mask. These filters sometimes are called averaging
filters. The idea behind smoothing filters is straightforward.By replacing the value of every pixel

in an image by the average of the gray levels in the neighborhood defined by the filter mask, this
process results in an image with reduced "sharp" transitions in gray levels. Because random
noise typically consists of sharp transitions in gray levels, the most obvious application of smoothing is
noise reduction.However, edges (which almost always are desirable features of an image) also are
characterized by sharp transitions in gray levels, so averaging filters have the undesirable side effect
that they blur edges. Another application of this type of process includes the smoothing of false
contours that result from using an insufficient number of gray levels.

Fig.10.1 Two 3 x 3 smoothing (averaging) filter masks. The constant multiplier in front of each
mask is equal to 1 divided by the sum of the values of its coefficients, as is required to compute an average.

A major use of averaging filters is in the reduction of "irrelevant" detail in an image. By
"irrelevant" we mean pixel regions that are small with respect to the size of the filter mask.

Figure 10.1 shows two 3 x 3 smoothing filters. Use of the first filter yields the standard average of
the pixels under the mask. This can best be seen by substituting the coefficients of the mask into the expression
R = (1/9)(z1 + z2 + ... + z9),
which is the average of the gray levels of the pixels in the 3 x 3 neighborhood defined by the
mask.Note that, instead of being 1/9, the coefficients of the filter are all 1’s.The idea here is that it is
computationally more efficient to have coefficients valued 1. At the end of the filtering process the
entire image is divided by 9. An m x n mask would have a normalizing constant equal to 1/mn.
A spatial averaging filter in which all coefficients are equal is sometimes called a box filter.

The second mask shown in Fig.10.1 is a little more interesting. This mask yields a so-called
weighted average, terminology used to indicate that pixels are multiplied by different coefficients, thus
giving more importance (weight) to some pixels at the expense of others. In the mask shown in Fig.
10.1(b) the pixel at the center of the mask is multiplied by a higher value than any other, thus giving
this pixel more importance in the calculation of the average.The other pixels are inversely weighted as a
function of their distance from the center of the mask. The diagonal terms are further away from the
center than the orthogonal neighbors (by a factor of √2) and, thus, are weighed less than these immediate
neighbors of the center pixel.

The basic strategy behind weighing the center point the highest and then reducing the value of the
coefficients as a function of increasing distance from the origin is simply an attempt to reduce blurring
in the smoothing process. We could have picked other weights to accomplish the same general
objective. However, the sum of all the coefficients in the mask of Fig. 10.1(b) is equal to 16, an
attractive feature for computer implementation because it is an integer power of 2. In practice, it is
difficult in general to see differences between images smoothed by using either of the masks in Fig.
10.1, or similar arrangements, because the area these masks span at any one location in an image is so
small.
The general implementation for filtering an M x N image with a weighted averaging filter of size m x n
(m and n odd) is given by the expression
g(x, y) = [Σs Σt w(s, t) f(x + s, y + t)] / [Σs Σt w(s, t)],
where the sums are taken over s = -a, ..., a and t = -b, ..., b, with a = (m - 1)/2 and b = (n - 1)/2.
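
A minimal sketch of applying both masks of Fig. 10.1 with a general convolution routine (OpenCV assumed):

#include <opencv2/opencv.hpp>
using namespace cv;

// Apply the 3 x 3 box filter and the 3 x 3 weighted-average filter of Fig. 10.1.
void smoothingExamples(const Mat& src, Mat& boxOut, Mat& weightedOut)
{
    Mat box = Mat::ones(3, 3, CV_32F) / 9.0f;             // all coefficients 1, normalized by 9
    Mat weighted = (Mat_<float>(3, 3) << 1, 2, 1,
                                         2, 4, 2,
                                         1, 2, 1);
    weighted = weighted / 16.0f;                          // coefficients sum to 16
    filter2D(src, boxOut, -1, box);                       // -1 keeps the source depth
    filter2D(src, weightedOut, -1, weighted);
}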

(2) Order-Statistics Filters:


Order-statistics filters are nonlinear spatial filters whose response is based on ordering (ranking) the
pixels contained in the image area encompassed by the filter, and then replacing the value of the center
pixel with the value determined by the ranking result. The best-known example in this category is the
median filter, which, as its name implies, replaces the value of a pixel by the median of the gray levels
in the neighborhood of that pixel (the original value of the pixel is included in the computation of the
median). Median filters are quite popular because, for certain types of random noise, they provide
excellent noise-reduction capabilities, with considerably less blurring than linear smoothing filters of
similar size. Median filters are particularly effective in the presence of impulse noise, also called
salt-and-pepper noise because of its appearance as white and black dots superimposed on an image.
The median, ε, of a set of values is such that half the values in the set are less than or equal to ε, and
half are greater than or equal to ε. In order to perform median filtering at a point in an image,
we first sort the values of the pixel in question and its neighbors, determine their median, and
assign this value to that pixel. For example, in a 3 x 3 neighborhood the median is the 5th largest value,
in a 5 x 5 neighborhood the 13th largest value, and so on. When several values in a neighborhood are
the same, all equal values are grouped. For example, suppose that a 3 x 3 neighborhood has values (10,
20, 20, 20, 15, 20, 20, 25, 100). These values are sorted as (10, 15,
20, 20, 20, 20, 20, 25, 100), which results in a median of 20. Thus, the principal function of median
filters is to force points with distinct gray levels to be more like their neighbors. In fact, isolated
clusters of pixels that are light or dark with respect to their neighbors, and whose area is less than n²/2
(one-half the filter area), are eliminated by an n x n median filter. In this case
"eliminated" means forced to the median intensity of the neighbors. Larger clusters are affected
considerably less.
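
A brief sketch comparing a 3 x 3 averaging filter with a 3 x 3 median filter on an image degraded by salt-and-pepper noise (OpenCV assumed; the file names are only placeholders):

#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
    Mat noisy = imread("noisy.png", IMREAD_GRAYSCALE);   // placeholder file name
    Mat averaged, medianed;
    blur(noisy, averaged, Size(3, 3));        // linear smoothing: spreads impulses into neighbors
    medianBlur(noisy, medianed, 3);           // order-statistics filter: removes isolated impulses
    imwrite("averaged.png", averaged);
    imwrite("median.png", medianed);
    return 0;
}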

Sharpening Spatial Filters

Previously we have looked at smoothing filters, which remove fine detail. Sharpening spatial filters seek
to highlight fine detail:
 Remove blurring from images
 Highlight edges
 Useful for emphasizing transitions in image intensity
Sharpening filters are based on spatial differentiation.
Spatial Differentiation
Differentiation measures the rate of change of a function. Let us consider a simple one-dimensional example.

Use of Second Derivatives for Enhancement–The Laplacian:
The approach basically consists of defining a discrete formulation of the second-order derivative and
then constructing a filter mask based on that formulation. We are interested in isotropic filters, whose
response is independent of the direction of the discontinuities in the image to which the filter is
applied. In other words, isotropic filters are rotation invariant, in the sense that rotating the image and
then applying the filter gives the same result as applying the filter to the image first and then rotating
the result.
Development of the method:
It can be shown (Rosenfeld and Kak [1982]) that the simplest isotropic derivative operator is the
Laplacian, which, for a function (image) f(x, y) of two variables, is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y²
Because derivatives of any order are linear operations, the Laplacian is a linear operator. In order to be
useful for digital image processing, this equation needs to be expressed in discrete form. There are
several ways to define a digital Laplacian using neighborhoods. Taking into account that
we now have two variables, we use the following notation for the partial second-order derivative in
the x-direction:
∂²f/∂x² = f(x + 1, y) + f(x - 1, y) - 2f(x, y)
and, similarly, in the y-direction:
∂²f/∂y² = f(x, y + 1) + f(x, y - 1) - 2f(x, y)
so that the digital Laplacian of two variables becomes
∇²f = [f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1)] - 4f(x, y)
This definition can be implemented with the mask shown in Fig.11.1(a), which gives isotropic results for rotations in increments of 90°.
Fig.11.1. (a) Filter mask used to implement the digital Laplacian (b) Mask used to implement an
extension of this equation that includes the diagonal neighbors. (c) and (d) Two other
implementations of the Laplacian.
The diagonal directions can be incorporated in the definition of the digital Laplacian by adding two more
terms of the same form, but with the coordinates along the diagonals. Since each diagonal term also contains
a -2f(x, y) term, the total subtracted from the difference terms now would be -8f(x, y). The mask used to
implement this new definition is shown in Fig.11.1(b). This mask yields isotropic results for increments of
45°. The other two masks shown in Fig. 11.1 also are used frequently in practice.
They are based on a definition of the Laplacian that is the negative of the one we used here. As such,
they yield equivalent results, but the difference in sign must be kept in mind when combining (by
addition or subtraction) a Laplacian-filtered image with another image.
Because the Laplacian is a derivative operator, its use highlights gray-level discontinuities in an image
and deemphasizes regions with slowly varying gray levels.This will tend to produce images that have
grayish edge lines and other discontinuities, all superimposed on a dark, featureless
background. Background features can be "recovered" while still preserving the sharpening effect of the
Laplacian operation simply by adding the original and Laplacian images. As noted in the previous
paragraph, it is important to keep in mind which definition of the Laplacian is used. If the definition
used has a negative center coefficient, then we subtract, rather

than add, the Laplacian image to obtain a sharpened result. Thus, the basic way in which we use the
Laplacian for image enhancement is as follows:
g(x, y) = f(x, y) - ∇²f(x, y)   if the center coefficient of the Laplacian mask is negative
g(x, y) = f(x, y) + ∇²f(x, y)   if the center coefficient of the Laplacian mask is positive
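
A hedged sketch of this sharpening rule (OpenCV assumed; cv::Laplacian with its default 3 x 3 aperture uses a negative center coefficient, so the Laplacian image is subtracted):

#include <opencv2/opencv.hpp>
using namespace cv;

// Sharpen an image by subtracting its Laplacian (negative-center-coefficient mask).
Mat laplacianSharpen(const Mat& src)              // src assumed CV_8UC1
{
    Mat lap, src16, sharp;
    Laplacian(src, lap, CV_16S);                  // signed output so negative responses survive
    src.convertTo(src16, CV_16S);
    subtract(src16, lap, sharp);                  // g(x, y) = f(x, y) - Laplacian(f)(x, y)
    sharp.convertTo(sharp, CV_8U);                // clip back to [0, 255]
    return sharp;
}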

Use of First Derivatives for Enhancement—The Gradient:

First derivatives in image processing are implemented using the magnitude of the gradient. For a
function f(x, y), the gradient of f at coordinates (x, y) is defined as the two-dimensional column vector
∇f = [Gx, Gy]^T = [∂f/∂x, ∂f/∂y]^T
The magnitude of this vector is given by
∇f = mag(∇f) = [Gx² + Gy²]^(1/2)
The components of the gradient vector itself are linear operators, but the magnitude of this vector
obviously is not because of the squaring and square root operations. On the other hand, the
partial derivatives are not rotation invariant (isotropic), but the magnitude of the gradient vector is.
Although it is not strictly correct, the magnitude of the gradient vector often is referred to as the
gradient.

The computational burden of implementing over an entire image is not trivial, and it is common
practice to approximate the magnitude of the gradient by using absolute values instead of squares and
square roots:
∇f ≈ |Gx| + |Gy|
This equation is simpler to compute and it still preserves relative changes in gray levels, but the
isotropic feature property is lost in general. However, as in the case of the Laplacian, the isotropic
properties of the digital gradient defined in the following paragraph are preserved only for a limited
number of rotational increments that depend on the masks used to approximate the derivatives. As it
turns out, the most popular masks used to approximate the gradient give the same result only for
vertical and horizontal edges and thus the isotropic properties of the gradient are preserved only for
multiples of 90°.
As in the case of the Laplacian, we now define digital approximations to the preceding equations, and
from there formulate the appropriate filter masks. In order to simplify the discussion that follows, we
will use the notation in Fig. 11.2 (a) to denote image points in a 3 x 3 region. For example, the center
point, z5, denotes f(x, y), z1 denotes f(x - 1, y - 1), and so on. The simplest approximations to a first-
order derivative that satisfy the conditions stated in that section are Gx = (z8 - z5) and Gy = (z6 - z5).
Two other definitions proposed by Roberts [1965] in the early development of digital image processing
use cross differences:
Gx = (z9 - z5) and Gy = (z8 - z6)
from which we compute the gradient as
∇f = [(z9 - z5)² + (z8 - z6)²]^(1/2)
If we use absolute values, then substituting the quantities in the equations gives us the following
approximation to the gradient:
∇f ≈ |z9 - z5| + |z8 - z6|
This equation can be implemented with the two masks shown in Figs. 11.2 (b) and(c). These masks are
referred to as the Roberts cross-gradient operators. Masks of even size are awkward to implement. The
smallest filter mask in which we are interested is of size 3 x 3.An approximation using absolute values,
still at point z5, but using a 3 x 3 mask, is
∇f ≈ |(z7 + 2z8 + z9) - (z1 + 2z2 + z3)| + |(z3 + 2z6 + z9) - (z1 + 2z4 + z7)|
The difference between the third and first rows of the 3 x 3 image region approximates the derivative
in the x-direction, and the difference between the third and first columns approximates the derivative in
the y-direction. The masks shown in Figs. 11.2(d) and (e) are called the Sobel operators. The idea behind
using a weight value of 2 is to achieve some smoothing by giving more importance to the center
point. Note that the coefficients in all the masks shown in Fig.
11.2 sum to 0, indicating that they would give a response of 0 in an area of constant gray level, as
expected of a derivative operator.

Fig.11.2 A 3 x 3 region of an image (the z’s are gray-level values) and masks used to

compute the gradient at the point labeled z5. All mask coefficients sum to zero, as expected of a
derivative operator.
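
A short sketch of the approximate gradient magnitude |Gx| + |Gy| computed with the Sobel masks of Fig. 11.2 (OpenCV assumed; the two absolute responses are averaged here simply to keep the result inside the 8-bit range):

#include <opencv2/opencv.hpp>
using namespace cv;

// Approximate gradient magnitude using the 3 x 3 Sobel masks.
Mat sobelGradient(const Mat& src)                 // src assumed CV_8UC1
{
    Mat gx, gy, agx, agy, grad;
    Sobel(src, gx, CV_16S, 1, 0, 3);              // derivative in the x-direction
    Sobel(src, gy, CV_16S, 0, 1, 3);              // derivative in the y-direction
    convertScaleAbs(gx, agx);                     // |Gx|
    convertScaleAbs(gy, agy);                     // |Gy|
    addWeighted(agx, 0.5, agy, 0.5, 0, grad);     // blend of the two absolute responses
    return grad;
}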

How Spatial Combining Works:

Replacing legacy systems with next-generation technology isn't always a one-to-one fix. This blog
describes spatial combining, why it's important for sensitive equipment like electronic countermeasures
(ECMs) and how it helps achieve the highest levels of power available.
Power in the Past

Historically, vacuum tube amplifiers were used for all amplifiers, from audio frequency to RF and
microwave, as well as for lighting and the displays for our television sets. Traveling wave tube
amplifiers (TWTAs) have continued to provide high-power amplification at microwave frequencies over
broad bandwidths.

But vacuum tubes, which typically operate in the multi-kV range, have lower reliability than solid-state devices
with their low-voltage power supplies, and over time, the supply of vacuum tubes and the expertise to manufacture
them have decreased. As a result, most vacuum tubes have been replaced with solid-state alternatives,
except in applications such as microwave ovens and electronic warfare (EW), which required vacuum
tubes to generate the higher power levels necessary for equipment like ECM jamming transmitters.
But power combining techniques now make it possible to achieve these power levels with solid-state
devices.

Cathode Ray Tube Television


Spatial Combining: What It Is and Why It's Important

ECM systems comprise receivers, processors, displays and jamming transmitters. Only recently have
solid-state solutions been able to meet the power and bandwidth requirements of ECM jamming
transmitters, due to the advent of gallium nitride (GaN) power amplifier MMICs and low-loss,
broadband combining techniques. However, a single GaN MMIC still has insufficient power for most
ECM systems, which can have requirements of more than 100 watts from 1.5-7.5 GHz. Solid-state
devices must combine multiple power amplifiers to reach the same power levels originally offered by
TWTAs.
Spatial combining can create solid-state power amplifiers (SSPAs) in the range of 100 W to 1 kW and 1 GHz to
40 GHz, covering up to a decade of bandwidth, and provides the following performance improvements when
compared with TWTAs:
 Reduced harmonic content in the output spectrum
 Less noise generated
 Increased linearity
Thermal management is critical to get the best performance out of solid-state devices. For ECMs, we
often work with different thermal environments on different platforms. Some systems could use a cooling
fluid or air cooling with a fan.

Spatial combining provides the most efficient means of combining GaN MMICs, lowering the amount of
heat dispersed. When we measure thermal performance, the efficiency of the total amplifier is the most
important factor.

The efficiency of GaN MMICs combined with the efficiency of spatial combining yields the most
efficient solid-state amplifier available. The more efficient the solid-state device, the lower the amount of
heat that must be dissipated.

Also, when solid-state components operate cooler, their reliability is better so we want them to operate as
cool as possible. Using good thermal conductors, like copper, and maximizing the available cross section
are key for improving the thermals. However, there are trade-offs between the types of metals used and
the weight of the equipment — it can't be too heavy for airborne platforms, for instance. So, there are
other metals that can be used when making size, weight and power (SWaP) considerations.
How Is Spatium Being Used Today?

Qorvo's method of spatial combining is called Spatium®. With coaxial construction, Spatium provides an
efficient, broadband and compact way of combining multiple MMICs in a single step. In fact, Spatium
can typically combine 16 amplifiers in one step, with only 0.5 dB combining loss. In addition, Qorvo
designed a thermal path on the back of the MMIC to the cooling plate, to help with thermal management.
(See the figure below for a thermal simulation using Qorvo's Spatium QPB1006.)

Example of thermal simulation using Qorvo's Spatium QPB1006: cooler performance, better reliability


1. Discuss the frequency domain techniques of image enhancement in detail.


Enhancement In Frequency Domain:

The frequency domain methods of image enhancement are based on convolution theorem. This is represented
as,
g(x, y) = h (x, y)*f(x, y)
where
g(x, y) = resultant image
h(x, y) = position invariant operator
f(x, y) = input image
The Fourier transform representation of equation above is,

G (u, v) = H (u, v) F (u, v)

The function H (u, v) in equation is called transfer function. It is used to boost the edges of input
image f (x, y) to emphasize the high frequency components.

The different frequency domain methods for image enhancement are as follows.

1. Contrast stretching.
2. Clipping and thresholding.
3. Digital negative.
4. Intensity level slicing and
5. Bit extraction.

1. Contrast Stretching:

Due to non-uniform lighting conditions, there may be poor contrast between the background and the
feature of interest. Figure 1.1 (a) shows the contrast stretching transformations.

Fig.1.1 (a) Histogram of input image

Fig.1.1 (b) Linear Law

Fig.1.1 (c) Histogram of the transformed image

These stretching transformations are expressed as
s = α r                          for 0 ≤ r < a
s = β (r - a) + α a              for a ≤ r < b
s = γ (r - b) + β (b - a) + α a  for b ≤ r < L
In the area of stretching the slope of transformation is considered to be greater than unity. The
parameters of stretching transformations i.e., a and b can be determined by examining the histogram
of the image.

2. Clipping and Thresholding:

Clipping is considered as the special scenario of contrast stretching. It is the case in which the
parameters are α = γ = 0. Clipping is more advantageous for reduction of noise in input signals of
range [a, b].

Threshold of an image is selected by means of its histogram. Let us take the image shown in the
following figure 1.2.

Fig. 1.2

Figure 1.2 (b) shows a histogram with two peaks, i.e., background and object. The threshold is selected at the
abscissa of the histogram minimum (D1). This selected threshold (D1) can separate background and
object to convert the image into its respective binary form. The thresholding transformations are
shown in figure 1.3.

Fig.1.3

3. Digital Negative:

The digital negative of an image is obtained by reverse scaling of its grey levels according to the
transformation s = L - 1 - r. Digital negatives are especially useful in the display of medical images.

A digital negative transformation of an image is shown in figure 1.4.

Fig.1.4

4. Intensity Level Slicing:

The images which consist of grey levels in between the intensity of the background and that of other objects
require the intensity of the object range to be modified. This process of changing intensity levels is done with the
help of intensity level slicing. It is expressed as

The histogram of input image and its respective intensity level slicing is shown in the figure 1.5.

Fig.1.5
5. Bit Extraction:

When an image is uniformly quantized, the nth most significant bit can be extracted and
displayed.

Let u = k1 2^(B-1) + k2 2^(B-2) + ... + k(B-1) 2 + kB


Then, the output is expressed as

Image transforms
Hotelling Transform

The Hotelling Transform is a conventional image processing transformation. The equation is as
follows. Suppose the points pi are the feature points that we found from the SMSB. The covariance matrix is
calculated by the mathematical relations. In this part, the eigenvalues and eigenvectors are obtained. Once the
eigenvalues and eigenvectors are found, the geometric attack can be detected by some mathematical relations.
Fig.3 Hotelling Transform

Fig.4 eigen analysis

Fig.3 shows the equations of the Hotelling Transform. Pi is extracted from the SMSB and Co is the covariance
matrix calculated from the mean vector and the summation over every feature point. After getting the
covariance matrix, eigen analysis can be used to find the eigenvectors and eigenvalues. Thus, the eigenvectors
and eigenvalues represent the characteristics of one image. Fig. 4 shows the concept of eigen analysis.
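
A hedged sketch of the covariance and eigen-analysis step (OpenCV assumed; here the feature points stand in for the pi extracted from the SMSB and are stored one per row of a floating-point matrix):

#include <opencv2/opencv.hpp>
using namespace cv;

// Mean vector, covariance matrix, and eigenvalues/eigenvectors of a set of
// feature points (e.g. an N x 2 matrix of 2-D points, type CV_64F).
void hotellingAnalysis(const Mat& points, Mat& eigenvalues, Mat& eigenvectors)
{
    Mat covar, mean;
    calcCovarMatrix(points, covar, mean,
                    COVAR_NORMAL | COVAR_ROWS | COVAR_SCALE, CV_64F);
    eigen(covar, eigenvalues, eigenvectors);      // covar is symmetric, so eigen() applies
}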

wavelet transform

The basic method of the wavelet transform is selecting a function whose integral is zero in time-domain
as the basic wavelet. By the expansion and translation of the basic wavelet, we can get a family function
which may constitute a framework for the function space. We decompose the signal by projecting the
analysis signals on the framework. The signal in original time domain can get a time-scale expression by
several scalings in the wavelet transform domain. Then we are able to achieve the most effective signal
processing purpose in the transform domain. The essence of wavelet de-noising is searching for the best
mapping of the signal from the actual space to the wavelet function space in order to get the best restoration
of the original signal. From the view of signal processing, wavelet de-noising is a form of signal filtering.
Wavelet de-noising is able to retain the characteristics of the image successfully. Actually, it is
a combination of feature extraction and low-pass filtering.
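
As a minimal, self-contained sketch of the decomposition idea (plain C++, not a library routine): one level of the 1-D Haar wavelet transform splits a signal into approximation and detail coefficients; thresholding the detail coefficients and inverting the transform is the essence of wavelet de-noising.

#include <vector>
#include <cmath>

// One level of the 1-D Haar wavelet transform: pairwise averages (approximation)
// followed by pairwise differences (detail), each scaled by 1/sqrt(2).
std::vector<double> haarLevel(const std::vector<double>& x)   // x.size() assumed even
{
    const std::size_t half = x.size() / 2;
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < half; i++) {
        out[i]        = (x[2 * i] + x[2 * i + 1]) / std::sqrt(2.0);   // approximation
        out[half + i] = (x[2 * i] - x[2 * i + 1]) / std::sqrt(2.0);   // detail
    }
    return out;
}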


Figure 1: The de-noising process based on wavelet transform one

Wavelet properties
1. Vanishing moments

The vanishing moment is a criterion about how a function f(t) decays towards infinity. We can estimate

the decay rate using the integration below:

∫ t^k f(t) dt = 0  (the integral taken from -∞ to ∞)

where k is the decay rate.

Also, a wavelet function ψ(t) has N vanishing moments if:

∫ t^k ψ(t) dt = 0  for k = 0, 1, ..., N - 1

This property shows how fast a wavelet decays with time. The number of vanishing moments and the

oscillation of wavelets have a close relationship. As the number of vanishing moments grows, the wavelet

oscillation becomes greater.


From the image above, we have three wavelets for comparison. The first has two vanishing moments. The

second has four, and the third figure has eight vanishing moments.

As we can see, as the number of vanishing moments increases, the number of oscillations also increases.

Therefore, the names of many wavelets are derived from the number of vanishing moments. Examples are

shown below:

dbN, symN; [N is the number of vanishing moments]. For example, db6 has six vanishing moments

and sym3 has three vanishing moments.

coifN; [2N is the number of vanishing moments]. coif3 has six vanishing moments.

biorNr.Nd; Nr is the number of vanishing moments of the synthesis wavelet, and Nd is the number of

vanishing moments of the analysis wavelet. For example, bior3.5 has three vanishing moments in the

synthesis wavelet and five vanishing moments in the analysis wavelet.

2. Support size

The size of support indicates the FIR filter’s length. If an FIR filter has an N number of samples, the

support width is N-1.

For example, haar, dbN, and symN have N vanishing moments, a filter length of 2N, and a

support width of 2N-1.


For coifN, the number of vanishing moments is 2N, and the filter length is 6N. Below is an example of

the symlet4 wavelet:

Symlet4 has four vanishing moments, and the filter length is given by 2N. This means that the filter

length is 8. As we can see from the figure, there are eight samples (0-7) in all the filters.

This is how the number of vanishing moments relates to the length of the filters. The more the vanishing

moments, the higher the filter’s length.

3. Regularity

Regularity is the number of continuous derivatives a function has. Intuitively, it can be considered as the

measure of smoothness.

A wavelet with more vanishing moments will be more regular. This is shown in the figure below:
In the figure above, we have two wavelets, db2 and db10. In both of them, we have the wavelet function

and scaling function. As we can see, both the wavelet and scaling functions of db10 are smoother than

that of db2.

Also, db10 has ten vanishing moments and db2 has two vanishing moments. The higher the vanishing

moments, the more regular the wavelet.

4. Symmetry

This is a property that concerns whether wavelet filter characteristics have a linear phase or not.

Considering this property, db is not symmetric. Therefore, it can be preferred as a non-linear phase.

Sym and coif are near symmetric or almost linear phases. Bior is a linear phase. We can understand this

better in the figure below:


We have used db4 and bior3.5 to clarify this. In the first figure, we represent the scale function of db4.

The blue plot is the phase plot in the frequency domain.

As we can see, it is not symmetric. However, for the bior3.5, in the frequency domain, the plot is

symmetric. This is used in the selection of wavelets for various applications.

5. Orthogonality

It is the ability of the wavelet to conserve energy. Strict orthogonality is beneficial for the accurate

reconstruction of the signal from decomposition coefficients. The scaling function and wavelet function

are orthogonal.

Haar, db, sym, coif, meyr are orthogonal wavelets. Bior and rbior are bi-orthogonal, and the rest are not

orthogonal.

The summary of wavelet classification is shown below:



Image filtering in frequency domain.


In frequency domain processing, a digital image is converted from the spatial domain to the frequency domain,
where image filtering is used for image enhancement for a specific application. A Fast
Fourier transform is the tool of the frequency domain used to convert the spatial domain to the
frequency domain. For smoothing an image, a low pass filter is implemented, and for sharpening an image, a
high pass filter is implemented. Each of these filters can be implemented as an ideal filter, a Butterworth filter,
or a Gaussian filter.

The frequency domain is a space which is defined by Fourier transform. Fourier transform has a very
wide application in image processing. Frequency domain analysis is used to indicate how signal energy
can be distributed in a range of frequency.

The basic principle of frequency domain analysis in image filtering is to compute the 2D discrete Fourier
transform of the image.
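
A hedged sketch of this pipeline for a Gaussian low pass filter (OpenCV assumed; the transfer function H(u, v) and the cut-off D0 are illustrative choices): compute the DFT of the image, multiply it by H(u, v), and take the inverse DFT.

#include <opencv2/opencv.hpp>
#include <cmath>
using namespace cv;

// Smooth an image with a Gaussian low pass filter applied in the frequency domain.
Mat gaussianLowpass(const Mat& src, double D0)     // src assumed CV_8UC1
{
    Mat f, F;
    src.convertTo(f, CV_32F);
    dft(f, F, DFT_COMPLEX_OUTPUT);                 // forward 2-D DFT

    // H(u, v) = exp(-D(u, v)^2 / (2 D0^2)), with the frequency origin at the
    // corners, matching the unshifted layout of the DFT output.
    Mat H(F.size(), CV_32F);
    for (int u = 0; u < H.rows; u++)
        for (int v = 0; v < H.cols; v++) {
            double du = (u <= H.rows / 2) ? u : H.rows - u;
            double dv = (v <= H.cols / 2) ? v : H.cols - v;
            H.at<float>(u, v) = (float)std::exp(-(du * du + dv * dv) / (2.0 * D0 * D0));
        }

    Mat planes[2];
    split(F, planes);                              // real and imaginary parts
    multiply(planes[0], H, planes[0]);             // G(u, v) = H(u, v) F(u, v)
    multiply(planes[1], H, planes[1]);
    Mat G, g;
    merge(planes, 2, G);
    dft(G, g, DFT_INVERSE | DFT_REAL_OUTPUT | DFT_SCALE);   // back to the spatial domain
    g.convertTo(g, CV_8U);
    return g;
}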
Fourier Series and Transform
Fourier Series
A Fourier series represents periodic signals as a sum of sines and cosines, each
multiplied by a certain weight. The periodic signals are further broken down into more signals with
some properties which are listed below:

1. Broken signals are sines and cosines.
2. New signals are harmonics of each other.

Fourier series analysis of a step edge:

Fourier decomposition
Example:

Fourier Transformation
Fourier transformation is a tool for image processing. It is used for decomposing an image into sine and
cosine components. The input image is in the spatial domain and the output is represented in the Fourier or
frequency domain. Fourier transformation is used in a wide range of applications such as image filtering,
image compression, image analysis and image reconstruction.

The formula for the 2-D discrete Fourier transformation of an M x N image f(x, y) is
F(u, v) = Σx Σy f(x, y) e^(-j2π(ux/M + vy/N)),  for u = 0, ..., M-1 and v = 0, ..., N-1
Example:
Unit III
Concept of Edge Detection
The concept of edge detection is used to detect the location and presence of edges from changes in the
intensity of an image. Different operators are used in image processing to detect edges. Edge detection can
detect the variation of grey levels, but it also responds quickly when noise is present. In image processing,
edge detection is a very important task. Edge detection is the main tool in pattern recognition, image
segmentation and scene analysis. It is a type of filter which is applied to extract the edge points in an image.
Sudden changes occur in an image where an object contour crosses and the brightness of the image changes
abruptly.

In image processing, edges are interpreted as a single class of singularity. In a function, a singularity is
characterized as a discontinuity at which the gradient approaches infinity.

Since the image data is in discrete form, the edges of the image are defined as the local maxima of
the gradient.

Mostly, edges exist between objects and objects, primitives and primitives, and objects and background. The
boundaries of the reflected objects appear as discontinuities. Edge detection methods study the change from one
pixel of an image to the next in a gray area.

Edge detection is mostly used for the measurement, detection and location of changes in image gray level.
Edges are the basic feature of an image. In an object, the clearest parts are the edges and lines. With the help of
edges and lines, an object's structure can be known. That is why extracting the edges is a very important
technique in graphics processing and feature extraction.

The basic idea behind edge detection is as follows:

1. To highlight local edges, use an edge enhancement operator.


2. Define the edge strength and set the edge points.

NOTE: edge detection cannot be performed reliably when the image is noisy or blurred.

There are five edge detection operators; they are as follows:

1. Sobel Edge Detection Operator


The Sobel edge detection operator extracts all the edges of an image, without worrying about the directions. The
main advantage of the Sobel operator is that it provides differencing and smoothing effect.

The Sobel edge detection operator is implemented as the sum of two directional edge responses, and the resulting
image is an edge outline of the original image.

Sobel Edge detection operator consists of 3x3 convolution kernels. Gx is a simple kernel and Gy is rotated by
90°

These kernels are applied separately to the input image because separate measurements can be produced in each
orientation, i.e., Gx and Gy.

Following is the gradient magnitude:

Because it is much faster to compute, an approximate magnitude is often used:

2. Robert's cross operator

Robert's cross operator is used to perform a 2-D spatial gradient measurement on an image; it is simple and
quick to compute. In Robert's cross operator, each output pixel value represents the estimated absolute magnitude
of the spatial gradient of the input image at that point.

Robert's cross operator consists of 2x2 convolution kernels. Gx is a simple kernel and Gy is rotated by 90°.

Following is the gradient magnitude:

Because it is much faster to compute, an approximate magnitude is often used:

3. Laplacian of Gaussian


The Laplacian of Gaussian is a 2-D isotropic measure of an image. The Laplacian highlights regions of an image
in which the intensity changes rapidly, and it is therefore used for edge detection. The Laplacian is applied to an
image which has first been smoothed using a Gaussian smoothing filter to reduce its sensitivity to noise. This
operator takes a single grey level image as input and produces a single grey level image as output.

Following is the Laplacian L(x,y) of an image which has pixel intensity value I(x, y).

In the Laplacian, the input image is represented as a set of discrete pixels. So a discrete convolution kernel which
can approximate the second derivatives in the definition of the Laplacian is used.

Three commonly used kernels are the following:

These are three discrete approximations which are commonly used for the Laplacian filter.

Following is the 2-D LoG function with Gaussian standard deviation σ:
LoG(x, y) = -(1/(πσ⁴)) [1 - (x² + y²)/(2σ²)] exp(-(x² + y²)/(2σ²))
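
In practice the LoG response is often approximated by Gaussian smoothing followed by the Laplacian; a hedged sketch (OpenCV assumed, sigma chosen by the caller):

#include <opencv2/opencv.hpp>
using namespace cv;

// Laplacian of Gaussian: smooth with a Gaussian first, then apply the Laplacian.
Mat laplacianOfGaussian(const Mat& src, double sigma)   // src assumed CV_8UC1
{
    Mat smoothed, response;
    GaussianBlur(src, smoothed, Size(0, 0), sigma);     // kernel size derived from sigma
    Laplacian(smoothed, response, CV_32F, 3);           // signed response; zero crossings mark edges
    return response;
}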

4. Prewitt operator

The Prewitt operator is a differentiation operator. The Prewitt operator is used for calculating the approximate
gradient of the image intensity function. At each point in an image, the Prewitt operator produces a gradient vector
or its normal vector. In the Prewitt operator, an image is convolved in the horizontal and vertical directions with a
small, separable and integer-valued filter. It is inexpensive in terms of computation.

Distinguish between spatial domain and frequency domain methods.

Zero Crossing Detector

Common Names: Zero crossing detector,


Marr edge detector, Laplacian of Gaussian edge detector

Brief Description
The zero crossing detector looks for places in the Laplacian of an image where the value of the Laplacian passes
through zero --- i.e. points where the Laplacian changes sign. Such points often occur at `edges' in images ---
i.e. points where the intensity of the image changes rapidly, but they also occur at places that are not as easy to
associate with edges. It is best to think of the zero crossing detector as some sort of feature detector rather than
as a specific edge detector. Zero crossings always lie on closed contours, and so the output from the zero
crossing detector is usually a binary image with single pixel thickness lines showing the positions of the zero
crossing points.

The starting point for the zero crossing detector is an image which has been filtered using the Laplacian of
Gaussian filter. The zero crossings that result are strongly influenced by the size of the Gaussian used for the
smoothing stage of this operator. As the smoothing is increased then fewer and fewer zero crossing contours
will be found, and those that do remain will correspond to features of larger and larger scale in the image.

How It Works
The core of the zero crossing detector is the Laplacian of Gaussian filter and so a knowledge of that operator is
assumed here. As described there, `edges' in images give rise to zero crossings in the LoG output. For instance,
Figure 1 shows the response of a 1-D LoG filter to a step edge in the image.

Figure 1 Response of 1-D LoG filter to a step edge. The left hand graph shows a 1-D image, 200 pixels long,
containing a step edge. The right hand graph shows the response of a 1-D LoG filter with Gaussian standard
deviation 3 pixels.

However, zero crossings also occur at any place where the image intensity gradient starts increasing or starts
decreasing, and this may happen at places that are not obviously edges. Often zero crossings are found in
regions of very low gradient where the intensity gradient wobbles up and down around zero.

Once the image has been LoG filtered, it only remains to detect the zero crossings. This can be done in several
ways.

The simplest is to simply threshold the LoG output at zero, to produce a binary image where the boundaries
between foreground and background regions represent the locations of zero crossing points. These boundaries
can then be easily detected and marked in a single pass, e.g. using some morphological operator. For instance, to
locate all boundary points, we simply have to mark each foreground point that has at least one background
neighbor.

The problem with this technique is that it will tend to bias the location of the zero crossing edge to either the light
side of the edge, or the dark side of the edge, depending on whether it is decided to look for the edges of
foreground regions or for the edges of background regions.

A better technique is to consider points on both sides of the threshold boundary, and choose the one with the
lowest absolute magnitude of the Laplacian, which will hopefully be closest to the zero crossing.

Since the zero crossings generally fall in between two pixels in the LoG filtered image, an alternative output
representation is an image grid which is spatially shifted half a pixel across and half a pixel down, relative to the
original image. Such a representation is known as a dual lattice. This does not actually localize the zero
crossing any more accurately, of course.

A more accurate approach is to perform some kind of interpolation to estimate the position of the zero crossing
to sub-pixel precision.
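
A hedged sketch of the simplest zero-crossing test described above (OpenCV assumed): threshold the LoG output at zero and mark every foreground point that has at least one background neighbor.

#include <opencv2/opencv.hpp>
using namespace cv;

// Mark zero crossings of a LoG-filtered image (CV_32F) as a binary image.
Mat zeroCrossings(const Mat& logResponse)
{
    Mat pos = logResponse > 0;                    // foreground: positive LoG response
    Mat edges = Mat::zeros(logResponse.size(), CV_8U);
    for (int y = 1; y < logResponse.rows - 1; y++)
        for (int x = 1; x < logResponse.cols - 1; x++) {
            if (!pos.at<uchar>(y, x)) continue;   // test only foreground points
            // mark the point if any 4-neighbor lies on the other side of zero
            if (!pos.at<uchar>(y - 1, x) || !pos.at<uchar>(y + 1, x) ||
                !pos.at<uchar>(y, x - 1) || !pos.at<uchar>(y, x + 1))
                edges.at<uchar>(y, x) = 255;
        }
    return edges;
}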

A Residual-Based Joint Denoising Method


Bilateral filter and structure adaptive denoising methods described above can restore the noisy images
efficiently. For example, the original noise-free image “Lena” shown in Figure 1(a) is degraded by an additive
white Gaussian noise with noise variances , as shown in Figure 2(a). We test the two methods on the Noisy
“Lena" image, respectively. As we can see, the bilateral filter with PSNR = 27.56 dB, SSIM = 0.86 and the
structure adaptive denoising method with PSNR = 30.20 dB, SSIM = 0.84 exhibit acceptable image quality as
shown in Figures 2(b) and 2(c), respectively. On the other hand, as shown in the figure, the bilateral filter can
smooth the Noisy “Lena” efficiently but erase certain details of the image in the meanwhile; and the local self-
adaptive method can deliver the fine features, such as edges, of the image, whereas it might generate image ripples
caused by noise in the smooth area. Based on this observation, we here propose a new method, that is, a residual
feedback-based joint image denoising method using bilateral filter and structure adaptive methods.

Figure 1
Four test images.

Figure 2

Comparison of the three denoising methods: (a) noise image with , (b) bilateral filter, (c) structure adaptive
filter, and (d) the proposed method.
Firstly, we define an image residual model as follows:
where is the pixel value at point of the noisy image with pixels; is the pixel value at point of the denoised image; and is the estimate of the location disturbance of , namely, the image residual. We assume that the image residual still obeys a Gaussian distribution with mean 0 and a low-level variance; that is,
We next define a criterion for testing the image residual as follows:
where denotes a neighbor of with the radius being . As the expectation of the image disturbance is 0, the criterion should not be too large or too small in a flat area. Thus we can tag abnormal points by
where is a pre-prescribed value, used as an indicator of the toughness of tagging.
As an example, the image residual between Figures 2(a) and 2(b) is shown in Figure 3(a), and the result shown in Figure 3(b) is obtained by using (13). From Figure 3(b), we can see that the details are tagged to some extent, but the noise still remains. So, for a tagged residual figure , we need to “filter” it a second time. Define
Here, similarly to , is a neighbor of and is its radius; and is a preselected threshold. The main idea of designing in (14) is based on the fact that the useful information, such as edges, in the residual tagged image tends to be clustered, while the noise tags are distributed randomly in the image . Thus we can determine whether a given pixel, say , is informative or noisy by simply calculating . As shown in Figure 3(c), the noise scattered in the tagged image has been eliminated visibly after the second filtering process.
In addition, it is worth noting that there exist some interrupted points in image , which might result from the fact that some useful and informative areas are covered by the noise. So, we further expand the tagged information by linking those unconnected abutting areas with the following function:
After the operation using , we get a binary image, shown in Figure 3(d), which tags only the useful information, such as edges; the meaningless noise has been eliminated clearly.

Figure 3
Tagging procedure of bilateral filter.
According to the analysis described above, can clearly indicate the indistinct margins caused by bilateral
denoising. On the other hand, as discussed in Section 2, the structure adaptive kernel based methods can protect
fine features, such as edges, of an image as well as possible. So, we can avoid the appearance of indistinct
margins and ripples in the restored image by combining the favorable properties of these two methods.
In detail, let and be the images denoised by using the structure adaptive kernel method and the bilateral filter,
respectively. Then the resulting image can be formulated as follows:
It is worth noting that in order to get , instead of filtering the whole noisy image to obtain by the structure
adaptive kernel method, we only need to get some parts of by filtering those pixels satisfying the first restriction
in (15), which reduces the computation complexity greatly. In addition, in order to erase those image edges
caused by splicing, a smoothing formula is defined as
where denotes the number of elements of set . Thus we can replace with in (16) to gain a smoothed splicing image.
In summary, the main steps of our proposed method are as follows:
(1) Restore the noisy image by the bilateral filter and get the restored image .
(2) Calculate the residual by (10).
(3) Compute by (13), (14), (15), and (17).
(4) Apply the structure adaptive method to denoise those pixels of the noisy image satisfying the first restriction of (15) and get .
(5) Get the restored image by
Figure 2(d) illustrates the restored “Lena” image by using our proposed method. This image has PSNR =
31.65 dB and SSIM = 0.92. As we can see, the proposed method performs remarkably better than the bilateral
filter and structure adaptive kernel method. Furthermore, the execution time of our proposed method, 55.16
seconds, is much shorter than that of the structure adaptive kernel method, 78.12 seconds. This can be
explained by the fact that our proposed method only performs the structure filter on the tagged points indicated
by , instead of the whole image, leading to a decrease in computational complexity.

Canny Edge Detector


An explanation using the OpenCV C++ Library
The Background
What are edges?

In image processing, edges simply represent sets of points within an image where the image brightness has a high

rate of change (more on this later).

Example of image edges. Source: https://en.wikipedia.org/wiki/Sobel_operator
Why detect edges?

Edge detection is an important part of image processing as it facilitates the computational identification and

separation of images into distinct parts. This has wide ranging applications, from photo editing software like

photoshop, to pedestrian awareness software used by autonomous vehicle initiatives like Tesla.
Development of the Canny Edge Detector

The Canny Edge Detector (CED) was developed in 1986 by John F. Canny to detect a wide range of edges in

images. CED aims to satisfy three main criteria. First, the detection should accurately reveal as many edges as

possible. Second, the distance between the edges that are detected and the actual edges present in the source

image must be minimized. Third, an edge should only be detected once within an image.
The Algorithm

CED uses a multistage algorithm to detect edges. The process is as follows:


1. Apply a Gaussian Filter
2. Compute the image gradient
3. Apply non-maximum suppression
4. Apply double thresholding
5. Apply hysteresis

We’ll revisit each of these in more detail as we walkthrough the example implementation.
The Implementation

OpenCV logo

OpenCV is an open-source computer vision library written in C++ (also with bindings in Python, Java, and

MATLAB, which we won’t be discussing in this post). A comprehensive coverage of the features and

capabilities of OpenCV is outside of this post’s scope, so I will briefly go over the relevant parts as they come up.
cv::Mat

Before we get to the meat of the CED, we need to take a look at how we’ll be working with our image data.

OpenCV implements a matrix type cv::Mat to load, store, and manipulate image data. Conceptually,

the cv::Mat type can be thought of as a two-dimensional array, with each element representing a pixel value.

Simply iterate through the array to access or mutate pixel values.

For three-channel (color) images, a single cv::Mat with three channels is used, which can be thought of as a

three-dimensional array, with the third dimension representing the blue, green, and red color

channels (OpenCV stores color images in BGR order).

However, because we will be doing edge detection, the color values of an image aren’t very relevant, so we’ll be

loading all our images in grayscale, as single-channel cv::Mat objects.


Loading an image for CED
String imgName = argv[1];
String imgPath = "../testImages/" + imgName;
Mat src = imread(imgPath, IMREAD_GRAYSCALE);

The above code uses cv::imread to load an image into the Mat object src. The first parameter specifies the

image path and the second parameter provides the IMREAD_GRAYSCALE flag, so our image will be loaded in

grayscale.

source image
CED, Step 1: Apply Gaussian Filter

A Gaussian filter is used to smooth the source image so as to remove image noise that may cause false-positive

edge detection. Gaussian filters work by replacing every pixel in an image with a weighted average of neighbor

pixels. The pixels are weighted similar to a normal distribution (hence, the ‘Gaussian’) where farther neighbor

pixels have decreasing weights. The number of neighbors to consider within the average is variable, and the

smoothness of the image after being filtered is dependent on the size of the region being averaged over.
Mat blurred;
blur(src, blurred, Size(3, 3));

For this CED implementation, we use the built-in function cv::blur, which applies a normalized box (averaging)

filter as a simple approximation of the Gaussian filter; a true Gaussian kernel can be applied with

cv::GaussianBlur, as sketched below. The first parameter is the source Mat object, the second parameter is the

destination Mat object, and the third parameter defines the size of the region being averaged over.
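If a true Gaussian kernel is preferred over the box average that cv::blur applies, cv::GaussianBlur can be dropped in instead; a minimal sketch using the same 3 x 3 neighborhood:

Mat blurred;
// sigma = 0 lets OpenCV derive the standard deviation from the 3 x 3 kernel size.
GaussianBlur(src, blurred, Size(3, 3), 0);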

Before and After applying the Gaussian filter


CED, Step 2: Compute Image Gradient

First, what is an image gradient? An image gradient is the two-dimensional gradient vector representing the

directional change in intensity (brightness) of an image. These intensity values provide the information necessary

to determine the positions and polarities of edges.

The image gradient is computed by convolving the source image with a derivative filter to get the first-order

partial derivatives, which together form the (image) gradient vector. A potential edge is identified by the values

with the highest rates of change, so the derived values with the highest magnitude are

potential edge candidates.


Mat xComponent, yComponent;
Sobel(blurred, xComponent, CV_64F, 1, 0, 3);
Sobel(blurred, yComponent, CV_64F, 0, 1, 3);

We use the cv::Sobel method to compute the x and y gradient vector components. Note, the CV_64F parameter

value is simply an output type (in this case, our output will be a Mat object with pixel values represented as 64-

bit floats), and the remaining three parameters specify the component to output (x or y) and the kernel size for the

Sobel derivative filter (a kernel simply holds a filter, so for our current purposes we can think of them as the

same).
CED, Step 3: Non-Maximum Suppression

Non-maximum suppression is an edge thinning technique used to remove extraneous edge candidates. It works

by iterating through all pixel values, comparing the current value with the pixel value in the positive and negative

gradient directions, and suppressing the current value if it does not have the highest magnitude relative to its

neighbors. For our implementation, we will be using a set of discrete gradient directions.
Mat magnitude, angle;
cartToPolar(xComponent, yComponent, magnitude, angle, true);

First we use the cv::cartToPolar method to calculate the gradient magnitude and direction.
normalize(magnitude, magnitude, 0, 1, NORM_MINMAX);
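The suppression loop itself is not reproduced in this excerpt, so the following is only a sketch of one common way to write it, assuming the magnitude and angle matrices computed above (CV_64F, angle in degrees) and the usual <cmath> header:

// Quantize each gradient direction to 0, 45, 90 or 135 degrees and keep a pixel
// only if its magnitude is a local maximum along that direction.
Mat suppressed = Mat::zeros(magnitude.size(), CV_64F);
for (int r = 1; r < magnitude.rows - 1; r++) {
    for (int c = 1; c < magnitude.cols - 1; c++) {
        double m = magnitude.at<double>(r, c);
        double a = fmod(angle.at<double>(r, c), 180.0);   // fold 180..360 onto 0..180
        double n1, n2;                                     // neighbours along the gradient
        if (a < 22.5 || a >= 157.5) {          // ~0 degrees: compare left/right
            n1 = magnitude.at<double>(r, c - 1);
            n2 = magnitude.at<double>(r, c + 1);
        } else if (a < 67.5) {                 // ~45 degrees
            n1 = magnitude.at<double>(r - 1, c + 1);
            n2 = magnitude.at<double>(r + 1, c - 1);
        } else if (a < 112.5) {                // ~90 degrees: compare up/down
            n1 = magnitude.at<double>(r - 1, c);
            n2 = magnitude.at<double>(r + 1, c);
        } else {                               // ~135 degrees
            n1 = magnitude.at<double>(r - 1, c - 1);
            n2 = magnitude.at<double>(r + 1, c + 1);
        }
        if (m >= n1 && m >= n2)
            suppressed.at<double>(r, c) = m;   // keep local maxima only
    }
}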

Result of non-maximum suppression
CED, Step 4: Apply Double Thresholding

Double thresholding is used to categorize the remaining edge pixels into three categories using a low and high

threshold value. If an edge pixel value is greater than the high threshold value, it is categorized as a strong edge

pixel, with a high probability of being an edge. If an edge pixel value is less than the high threshold value, but

greater than the low threshold value, it is categorized as a weak edge pixel, with some probability of being an

edge. If an edge pixel value is less than both the high and low threshold values, it is categorized as having a very
low probability of being an edge and is suppressed.
We accomplish this by defining a set of destination Mat objects to hold the categorized values and then simply

iterate through the magnitude Mat and compare the edge pixel values to our

defined magMax and magMin values, adding them to their respective Mat object as necessary.
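Since the original listing is not reproduced here, the sketch below shows one way that categorization might look, assuming the suppressed matrix from the previous step and the magMax and magMin thresholds mentioned above:

Mat strong = Mat::zeros(suppressed.size(), CV_64F);
Mat weak   = Mat::zeros(suppressed.size(), CV_64F);
for (int r = 0; r < suppressed.rows; r++) {
    for (int c = 0; c < suppressed.cols; c++) {
        double m = suppressed.at<double>(r, c);
        if (m >= magMax)
            strong.at<double>(r, c) = m;       // strong edge pixel
        else if (m >= magMin)
            weak.at<double>(r, c) = m;         // weak edge pixel
        // anything below magMin stays zero (suppressed)
    }
}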

Sorted edge pixel values
CED, Step 5: Hysteresis

Hysteresis is the final step of the CED algorithm. It determines which of the values in the weak edge category

should be included in the final edge detection image. The process simply checks whether a weak edge pixel is
connected to a strong edge pixel; if it is, the weak pixel is kept as part of an edge, otherwise it is discarded.
We can implement this by iterating through the Mat object for weak pixel values, and checking if the coordinate

value in the strong pixel value Mat object has non-zero neighboring values. If so, we add the weak pixel value

back into the strong pixel value Mat object.
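A minimal sketch of that check, assuming the strong and weak matrices produced in the double-thresholding step:

// Promote a weak pixel to the final edge map if any of its 8 neighbours is strong.
for (int r = 1; r < weak.rows - 1; r++) {
    for (int c = 1; c < weak.cols - 1; c++) {
        if (weak.at<double>(r, c) == 0) continue;
        bool connected = false;
        for (int dr = -1; dr <= 1 && !connected; dr++)
            for (int dc = -1; dc <= 1 && !connected; dc++)
                if (strong.at<double>(r + dr, c + dc) > 0)
                    connected = true;
        if (connected)
            strong.at<double>(r, c) = weak.at<double>(r, c);
    }
}
// strong now holds the final edge map.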

In digital image processing, edge detection is a technique used in computer vision to find the boundaries of

objects within an image. It uses algorithms that search for discontinuities in pixel brightness in an image

that has been converted to grayscale. This allows software to detect features, objects and even landmarks in a

photograph by using segmentation and extraction techniques. The points in an image where the

brightness changes sharply form sets of curved line segments that are called edges.

There are 4 things to look for in edge detection:


 Discontinuities in depth
 Discontinuities in surface orientation
 Changes in material properties
 Variations in scene illumination

The edges of an image allow us to see the boundaries of objects in the image. This is needed in software that

must identify or detect, say, people’s faces. Other objects like cars, street signs, traffic lights and crosswalks are

detected in self-driving cars. AI services like Google’s Cloud Vision use these techniques for image content

analysis. This provides ways to extract features in an image such as faces, logos, landmarks and even optical

character recognition (OCR). These are used for image recognition, which I will explain with examples.

I will present three types of examples that use edge detection: feature extraction, landmark detection and pixel
integrity.
Edge detection Feature Extraction

First example I will discuss is with regards to feature extraction to identify objects.

So what is a car?

We know from experience that it is a machine we use for transportation, and that we need to drive it. Our vision

can easily identify it as an object with wheels, a windshield, headlights, bumpers, and so on. It is rather trivial to

ask that question of another person.

Machines do not know what a car is. We have to teach them using computer vision. Let’s take a look at the photo of

a car (below).

Feature extraction can be used to identify objects. A system can be trained to identify a car by its features —
headlights, wheels, front bumper, side mirrors, license plate, etc.

Using edge detection, we can isolate or extract the features of an object. Once the boundaries have been

identified, software can analyze the image and identify the object. Machines can be taught to examine the outline

of an image’s boundary and then extract the contents within it to identify the object. There will be false-positives,

or identification errors, so refining the algorithm becomes necessary until the level of accuracy increases.

The actual process of image recognition (i.e. identifying a car as a car) involves more complex computation

techniques that use neural networks e.g. Convolutional Neural Networks or CNN. I will not cover that in this

article.
Landmark Detection

Software that recognizes objects like landmarks are already in use e.g. Google Lens. Once again the extraction of

features leads to detection once we have the boundaries.

Landmark detection builds on the detected edges. In this example one can see how a system can identify
a landmark from its outline.

Landmarks, in image processing, refer to points of interest in an image that allow it to be recognized. A landmark

can be a building or public place, or a common object we are familiar with from daily life. For

software to recognize what something is in an image, it needs the coordinates of these points, which it then feeds
into a neural network. With the use of machine learning, certain patterns can be identified by software based on

the landmarks. Detecting the landmarks can then help the software to differentiate let’s say a horse from a car.

Once the edges help to isolate the object in an image, the next step is to identify the landmarks and then identify

the object.

Let us take a closer look at an edge detected image.

Pattern analysis can help identify objects in an image from landmark detection.

I have isolated five objects as an example. Each object has landmarks that software can use to recognize what it is.

I won’t delve further into that, but basically this means that once a pattern the software can recognize emerges

from an object, the software will be able to identify it.

Are they always accurate?

In most cases they are, but strict compliance and regulation determine the required level of accuracy.

Systems in which life and death are at stake, as in medical equipment, must have a higher level of accuracy

than, say, an image filter used in a social media app. This is why thorough and rigorous testing is involved

before the final release of image recognition software. In vision systems like those used in self-driving cars, this

is crucial.
Pixel Integrity

I created code to validate an image’s integrity pixel by pixel using the ImageChops module from the PIL library.

This allows a pixel-by-pixel comparison of two images: one image is the original and the other is the one

that needs to be compared. Sometimes there is a need to verify whether the original image has been modified,

especially in multi-user environments. Without version control, a retoucher may not know if the image was

modified. Original content creators may also be curious to see whether the original image they created is the same as

the content that another person has uploaded on the Internet.

Now, in order to do this, it is best to set the same pixel size on both the original image (Image 1) and the

non-original image (Image 2). I usually take the pixel size of the non-original image, so as to preserve its

dimensions, since I can easily downscale or upscale the original image. For example, Image 2 may be 600 x 906

pixels at a 0.662 aspect ratio. I set the same pixel size for Image 1 and then run my code to compare them using

edge detection.
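The author’s code uses PIL’s ImageChops; an analogous pixel-by-pixel check can be sketched with OpenCV, to keep all examples in this material in C++. The file names and the resizing choice are illustrative only:

Mat img1 = imread("original.png", IMREAD_GRAYSCALE);   // hypothetical file names
Mat img2 = imread("uploaded.png", IMREAD_GRAYSCALE);
resize(img1, img1, img2.size());                       // match Image 2's pixel size
Mat diff;
absdiff(img1, img2, diff);                             // per-pixel absolute difference
bool identical = (countNonZero(diff) == 0);            // true only if every pixel matches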

Final edge detection image

And finally, we are left with our final resulting image.

WHAT ARE THE APPLICATIONS OF EDGE DETECTION?
Edge detection is an extremely popular task in fields such as computer vision and image processing. It’s not
hard to see why: as humans, we depend on edge detection for tasks such as depth perception and detecting
objects in our field of view.

Perhaps the most widespread application of edge detection is for object recognition in computer vision.
Whenever objects are present in an image, there will be an edge between the boundary of the object and the
background behind the object. These algorithms identify locations where there is a large gradient between
neighboring pixels, indicating a significant change in the intensity of the image. Even today, simple yet
powerful algorithms such as Canny edge detection are widely used as a preprocessing step for many computer
vision systems.

Wherever computer vision systems can benefit from identifying edges (i.e. sharp changes in gradient
magnitude), you can bet that these algorithms will be used. Some of the most common applications are:

 Fingerprint recognition: When recognizing fingerprints, it’s useful to preprocess the image by
performing edge detection. In this case, the “edges” are the contours of the fingerprint, as contrasted
with the background on which the fingerprint was made. This helps reduce noise so that the system can
focus exclusively on the fingerprint’s shape.
 Medical imaging: Like fingerprint recognition, medical imaging is another field where computer
vision systems can improve performance, and improve healthcare with AI, by eliminating noise and
extraneous information. Medical images are susceptible to different types of noise during the data
collection process. Applying edge detection makes it easier for computer vision systems (and human
physicians) to detect anomalies in medical images.
 Vehicle detection: Consider one of the most cutting-edge computer vision use cases: self-driving cars.
These systems depend on the ability to rapidly and automatically detect other vehicles on the road. This
technology can help dramatically reduce the complexity of the images that self-driving cars need to
analyze, while still preserving the recognizable shapes of other vehicles.

2. Explain about Ideal Low Pass Filter (ILPF) in frequency domain.

Lowpass Filter:

The edges and other sharp transitions (such as noise) in the gray levels of an image contribute
significantly to the high-frequency content of its Fourier transform. Hence blurring (smoothing) is
achieved in the frequency domain by attenuating a specified range of high-frequency components in the
transform of a given image.

G (u, v) = H (u, v) F(u, v)

where F (u, v) is the Fourier transform of an image to be smoothed. The problem is to select a filter
transfer function H (u, v) that yields G (u, v) by attenuating the high-frequency components of F (u,
v). The inverse transform then will yield the desired smoothed image g (x, y).

Ideal Filter:

A 2-D ideal lowpass filter (ILPF) is one whose transfer function satisfies the relation

H(u, v) = 1   if D(u, v) ≤ Do
H(u, v) = 0   if D(u, v) > Do

where Do is a specified nonnegative quantity, and D(u, v) is the distance from point (u, v) to the origin
of the frequency plane; that is, D(u, v) = [u² + v²]^(1/2), with the frequencies measured from the
centered origin of the transform.

Figure 3(a) shows a 3-D perspective plot of H(u, v) as a function of u and v. The name ideal filter
indicates that all frequencies inside a circle of radius

Fig.3 (a) Perspective plot of an ideal lowpass filter transfer function; (b) filter cross
section.

Do are passed with no attenuation, whereas all frequencies outside this circle are completely
attenuated.
The lowpass filters are radially symmetric about the origin. For this type of filter,
specifying a cross section extending as a function of distance from the origin along a radial line is
sufficient, as Fig. 3 (b) shows. The complete filter transfer function can then be generated by rotating
the cross section 360° about the origin. Specification of radially symmetric filters centered on the N x
N frequency square is based on the assumption that the origin of the Fourier transform has been
centered on the square.
For an ideal lowpass filter cross section, the point of transition between H(u, v) = 1
and H(u, v) = 0 is often called the cutoff frequency. In the case of Fig.3 (b), for example, the cutoff
frequency is Do. As the cross section is rotated about the origin, the point Do traces a circle giving a
locus of cutoff frequencies, all of which are a distance Do from the origin. The cutoff frequency
concept is quite useful in specifying filter characteristics. It also serves as a common base for
comparing the behavior of different types of filters.
The sharp cutoff frequencies of an ideal lowpass filter cannot be realized with electronic
components, although they can certainly be simulated in a computer.
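The definition above translates directly into code. The following is a small sketch (not from the textbook) that builds the ILPF transfer function for a centered M x N spectrum, assuming the OpenCV and <cmath> headers:

// Build an ideal lowpass transfer function H(u, v) for a centered M x N spectrum.
Mat idealLowpass(int M, int N, double D0) {
    Mat H(M, N, CV_32F);
    for (int u = 0; u < M; u++) {
        for (int v = 0; v < N; v++) {
            double du = u - M / 2.0, dv = v - N / 2.0;
            double D = std::sqrt(du * du + dv * dv);   // distance from the centre
            H.at<float>(u, v) = (D <= D0) ? 1.0f : 0.0f;
        }
    }
    return H;   // multiply this with the centered DFT of the image, then invert
}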

3. Discuss about Butterworth lowpass filter with a suitable example.

Butterworth filter:

The transfer function of the Butterworth lowpass filter (BLPF) of order n, with cutoff frequency locus at
a distance Do from the origin, is defined by the relation

H(u, v) = 1 / [1 + (D(u, v)/Do)^(2n)]

where D(u, v) is the distance from (u, v) to the origin, as before. A perspective plot and cross section of
the BLPF function are shown in figure 4.

Fig.4 (a) A Butterworth lowpass filter (b) radial cross section for n = 1.


Unlike the ILPF, the BLPF transfer function does not have a sharp discontinuity that establishes
a clear cutoff between passed and filtered frequencies. For filters with smooth transfer functions, it is
customary to define a cutoff frequency locus at points for which H(u, v) is down to a certain fraction of its
maximum value. In the case of the above equation, H(u, v) = 0.5 (down 50 percent from its
maximum value of 1) when D(u, v) = Do. Another value commonly used is 1/√2 of the
maximum value of H(u, v). The following simple modification yields the desired value when
D(u, v) = Do:

H(u, v) = 1 / [1 + (√2 − 1)(D(u, v)/Do)^(2n)] ≈ 1 / [1 + 0.414 (D(u, v)/Do)^(2n)]
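For comparison with the ideal-filter sketch above, a Butterworth transfer function of order n can be generated the same way; this is an illustrative sketch, not textbook code:

// Build a Butterworth lowpass transfer function of order n for a centered M x N spectrum.
Mat butterworthLowpass(int M, int N, double D0, int n) {
    Mat H(M, N, CV_32F);
    for (int u = 0; u < M; u++) {
        for (int v = 0; v < N; v++) {
            double du = u - M / 2.0, dv = v - N / 2.0;
            double D = std::sqrt(du * du + dv * dv);
            H.at<float>(u, v) = static_cast<float>(1.0 / (1.0 + std::pow(D / D0, 2.0 * n)));
        }
    }
    return H;   // H falls to 0.5 at D = D0; there is no sharp cutoff
}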

IMAGE RESTORATION
1. Explain about gray level interpolation.

The distortion correction equations yield non integer values for x' and y'. Because the distorted
image g is digital, its pixel values are defined only at integer coordinates. Thus using non integer
values for x' and y' causes a mapping into locations of g for which no gray levels are defined. Inferring
what the gray-level values at those locations should be, based only on the pixel values at integer
coordinate locations, then becomes necessary. The technique used to accomplish this is called gray-
level interpolation.
The simplest scheme for gray-level interpolation is based on a nearest neighbor approach.
This method, also called zero-order interpolation, is illustrated in Fig. 6.1. This figure shows

(A) The mapping of integer (x, y) coordinates into fractional coordinates (x', y') by means of
following equations

x' = c1x + c2y + c3xy + c4

and
y' = c5x + c6y + c7xy + c8

(B) The selection of the closest integer coordinate neighbor to (x', y');

and

(C) The assignment of the gray level of this nearest neighbor to the pixel located at (x, y).


Fig. 6.1 Gray-level interpolation based on the nearest neighbor concept.

Although nearest neighbor interpolation is simple to implement, this method often has the
drawback of producing undesirable artifacts, such as distortion of straight edges in images of
high resolution. Smoother results can be obtained by using more sophisticated techniques, such as
cubic convolution interpolation, which fits a surface of the sin(z)/z type through a much larger
number of neighbors (say, 16) in order to obtain a smooth estimate of the gray level at any


desired point. Typical areas in which smoother approximations generally are required include 3- D
graphics and medical imaging. The price paid for smoother approximations is additional computational
burden. For general-purpose image processing a bilinear interpolation approach that uses the gray
levels of the four nearest neighbors usually is adequate. This approach is straightforward. Because the
gray level of each of the four integral nearest neighbors of a non integral pair of coordinates (x', y') is
known, the gray-level value at these coordinates, denoted v(x', y'), can be interpolated from the values
of its neighbors by using the relationship
v (x', y') = ax' + by' + c x' y' + d
where the four coefficients are easily determined from the four equations in four unknowns that can be
written using the four known neighbors of (x', y'). When these coefficients have been determined, v(x',
y') is computed and this value is assigned to the location in f(x, y) that yielded the spatial mapping into
location (x', y'). It is easy to visualize this procedure with the aid of Fig.
6.1. The exception is that, instead of using the gray-level value of the nearest neighbor to (x', y'), we
actually interpolate a value at location (x', y') and use this value for the gray-level assignment at (x, y).
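A small sketch of that bilinear estimate, assuming g is an 8-bit grayscale cv::Mat and the point lies strictly inside the image (the function name is hypothetical; <cmath> is assumed):

// Bilinear interpolation of the gray level at a non-integer location (xp, yp).
double bilinear(const Mat& g, double xp, double yp) {
    int x0 = (int)std::floor(xp), y0 = (int)std::floor(yp);
    double dx = xp - x0, dy = yp - y0;
    double g00 = g.at<uchar>(y0,     x0);
    double g10 = g.at<uchar>(y0,     x0 + 1);
    double g01 = g.at<uchar>(y0 + 1, x0);
    double g11 = g.at<uchar>(y0 + 1, x0 + 1);
    // Weighted average of the four nearest neighbours.
    return g00 * (1 - dx) * (1 - dy) + g10 * dx * (1 - dy)
         + g01 * (1 - dx) * dy       + g11 * dx * dy;
}

This weighted-average form is algebraically equivalent to v(x', y') = ax' + by' + cx'y' + d with the four coefficients determined from the four known neighbors.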
2. Explain about Wiener filter used for image restoration.

Unlike the inverse filtering approach, which makes no explicit provision for handling noise, Wiener
filtering incorporates both the degradation function and the statistical characteristics of noise into the
restoration process. The method is founded on considering images and noise as random processes, and the
objective is to find an estimate f̂ of the uncorrupted image f such that the mean square error between
them is minimized. This error measure is given by

e² = E{(f − f̂)²}

where E{•} is the expected value of the argument. It is assumed that the noise and the image
are uncorrelated; that one or the other has zero mean; and that the gray levels in the estimate
are a linear function of the levels in the degraded image. Based on these conditions, the
minimum of the error function is given in the frequency domain by the expression

F̂(u, v) = [ (1/H(u, v)) |H(u, v)|² / ( |H(u, v)|² + Sη(u, v)/Sf(u, v) ) ] G(u, v)
where we used the fact that the product of a complex quantity with its conjugate is equal to the
magnitude of the complex quantity squared. This result is known as the Wiener filter, after N. Wiener
[1942], who first proposed the concept in the year shown. The filter, which consists of the terms
inside the brackets, also is commonly referred to as the minimum mean square error filter or the least
square error filter. The Wiener filter does not have the same problem as the inverse filter with zeros in
the degradation function, unless both H(u, v) and Sη(u, v) are zero for the same value(s) of u and v.
The terms in the above equation are as follows:

H(u, v) = degradation function
H*(u, v) = complex conjugate of H(u, v)
|H(u, v)|² = H*(u, v) H(u, v)
Sη(u, v) = |N(u, v)|² = power spectrum of the noise
Sf(u, v) = |F(u, v)|² = power spectrum of the undegraded image
As before, H (u, v) is the transform of the degradation function and G (u, v) is the transform of
the degraded image. The restored image in the spatial domain is given by the inverse Fourier
transform of the frequency-domain estimate F̂(u, v). Note that if the noise is zero, then the noise
power spectrum vanishes and the Wiener filter reduces to the inverse filter.
When we are dealing with spectrally white noise, the spectrum |N(u, v)|² is a
constant, which simplifies things considerably. However, the power spectrum of the
undegraded image seldom is known. An approach used frequently when these
quantities are not known or cannot be estimated is to approximate the equation as

F̂(u, v) = [ (1/H(u, v)) |H(u, v)|² / ( |H(u, v)|² + K ) ] G(u, v)

where K is a specified constant.
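On a per-frequency basis the approximation is easy to express with std::complex; the sketch below (an illustration, not library code) computes one restored Fourier coefficient from H(u, v), G(u, v) and the constant K:

#include <complex>

// One Wiener-filtered frequency sample: Fhat = conj(H) / (|H|^2 + K) * G,
// which equals (1/H) * |H|^2 / (|H|^2 + K) * G whenever H is nonzero.
std::complex<double> wienerSample(std::complex<double> H,
                                  std::complex<double> G,
                                  double K) {
    double H2 = std::norm(H);              // |H(u, v)|^2
    return (std::conj(H) / (H2 + K)) * G;
}

Applying it to a whole image means looping this computation over all DFT coefficients of the degraded image and then taking the inverse transform.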

3. Explain a Model of the Image Degradation/Restoration Process.

As Fig. 6.3 shows, the degradation process is modeled as a degradation function that, together
with an additive noise term, operates on an input image f(x, y) to produce a degraded image g(x, y).
Given g(x, y), some knowledge about the degradation function H, and some knowledge about the
additive noise term η(x, y), the objective of restoration is to obtain an estimate f̂(x, y) of the original
image. The estimate should be as close as possible to the original input image and, in general, the more
we know about H and η, the closer f̂(x, y) will be to f(x, y).
The degraded image is given in the spatial domain by

g (x, y) = h (x, y) * f (x, y) + η (x, y)


where h (x, y) is the spatial representation of the degradation function and, the symbol *
indicates convolution. Convolution in the spatial domain is equal to multiplication in the
frequency domain, hence
G (u, v) = H (u, v) F (u, v) + N (u, v)
where the terms in capital letters are the Fourier transforms of the corresponding terms in above equation.


Fig. 6.3 model of the image degradation/restoration process.


4. Explain about the restoration filters used when the image degradation is due to noise
only.

If the degradation present in an image is only due to noise, then,

g (x, y) = f (x, y) + η (x, y)


G (u, v) = F (u, v) + N (u, v)
The restoration filters used in this case are,

1. Mean filters
2. Order-statistic filters, and
3. Adaptive filters

5. Explain Mean filters.


There are four types of mean filters. They are

(i) Arithmetic mean filter

This is the simplest of the mean filters. Let Sxy represent the set of coordinates in a
rectangular subimage window of size m x n, centered at point (x, y). The arithmetic mean filtering
process computes the average value of the corrupted image g(x, y) in the area defined by Sxy. The
value of the restored image f̂ at any point (x, y) is simply the arithmetic mean computed using the
pixels in the region defined by Sxy. In other words,

f̂(x, y) = (1/mn) Σ(s,t)∈Sxy g(s, t)

This operation can be implemented using a convolution mask in which all coefficients have value
1/mn.
(ii) Geometric mean filter

An image restored using a geometric mean filter is given by the expression

f̂(x, y) = [ Π(s,t)∈Sxy g(s, t) ]^(1/mn)

Here, each restored pixel is given by the product of the pixels in the subimage window, raised to the
power 1/mn. A geometric mean filter achieves smoothing comparable to the arithmetic mean filter, but
it tends to lose less image detail in the process.
(iii) Harmonic mean filter
The harmonic mean filtering operation is given by the expression

f̂(x, y) = mn / Σ(s,t)∈Sxy [1 / g(s, t)]

The harmonic mean filter works well for salt noise, but fails for pepper noise. It also does well with
other types of noise, such as Gaussian noise.
(iv) Contra harmonic mean filter

The contra-harmonic mean filtering operation yields a restored image based on the expression

f̂(x, y) = Σ(s,t)∈Sxy g(s, t)^(Q+1) / Σ(s,t)∈Sxy g(s, t)^Q

where Q is called the order of the filter. This filter is well suited for reducing or virtually eliminating
the effects of salt-and-pepper noise. For positive values of Q, the filter eliminates pepper noise. For
negative values of Q it eliminates salt noise. It cannot do both simultaneously. Note that the contra
harmonic filter reduces to the arithmetic mean filter if Q = 0, and to the harmonic mean filter if Q = -
1.
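A sketch of the contra-harmonic computation for a single pixel follows; it assumes an 8-bit grayscale cv::Mat, odd window dimensions m and n, a window fully inside the image, and the <cmath> header (pixel values of 0 would need special handling for negative Q):

// Contraharmonic mean of the m x n window centred at (x, y); Q = 0 gives the
// arithmetic mean, Q = -1 gives the harmonic mean.
double contraharmonic(const Mat& g, int x, int y, int m, int n, double Q) {
    double num = 0.0, den = 0.0;
    for (int s = -m / 2; s <= m / 2; s++) {
        for (int t = -n / 2; t <= n / 2; t++) {
            double v = g.at<uchar>(y + s, x + t);
            num += std::pow(v, Q + 1);
            den += std::pow(v, Q);
        }
    }
    return num / den;
}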


6. Explain the Order-Statistic Filters.


There are four types of Order-Statistic filters. They are

(i) Median filter


UNIT-IV

IMAGE SEGMENTATION AND MORPHOLOGICAL IMAGE PROCESSING

1. What are the derivative operators useful in image segmentation? Explain their role in
segmentation.
Gradient operators:
First-order derivatives of a digital image are based on various approximations of the 2-D gradient.
The gradient of an image f(x, y) at location (x, y) is defined as the vector

∇f = (Gx, Gy) = (∂f/∂x, ∂f/∂y)

It is well known from vector analysis that the gradient vector points in the direction of maximum rate of
change of f at coordinates (x, y). An important quantity in edge detection is the magnitude
of this vector, denoted by ∇f, where

∇f = mag(∇f) = [Gx² + Gy²]^(1/2)

This quantity gives the maximum rate of increase of f(x, y) per unit distance in the direction of ∇f. It
is a common (although not strictly correct) practice to refer to ∇f also as the gradient. The direction
of the gradient vector also is an important quantity. Let α(x, y) represent the direction angle of the
vector ∇f at (x, y). Then, from vector analysis,

α(x, y) = tan⁻¹(Gy / Gx)

where the angle is measured with respect to the x-axis. The direction of an edge at (x, y) is
perpendicular to the direction of the gradient vector at that point. Computation of the gradient of
an image is based on obtaining the partial derivatives ∂f/∂x and ∂f/∂y at every pixel location. Let
the 3 x 3 area shown in Fig. 1.1(a) represent the gray levels in a neighborhood of an image. One of the
simplest ways to implement a first-order partial derivative at point z5 is to use the
following Roberts cross-gradient operators:

Gx = z9 − z5 and Gy = z8 − z6


These derivatives can be implemented for an entire image by using the masks shown in Fig. 1.1(b).
Masks of size 2 x 2 are awkward to implement because they do not have a clear center. An approach
using masks of size 3 x 3 is given by

Gx = (z7 + z8 + z9) − (z1 + z2 + z3) and Gy = (z3 + z6 + z9) − (z1 + z4 + z7)

Fig. 1.1 A 3 x 3 region of an image (the z's are gray-level values) and the various masks used to
compute the gradient at the point labeled z5.


A weight value of 2 is used to achieve some smoothing by giving more importance to the center point.
Figures 1.1(f) and (g), called the Sobel operators, and are used to implement these two equations. The
Prewitt and Sobel operators are among the most used in practice for computing digital gradients. The
Prewitt masks are simpler to implement than the Sobel masks, but the latter have slightly superior
noise-suppression characteristics, an important issue when dealing with derivatives. Note that the
coefficients in all the masks shown in Fig. 1.1 sum to 0, indicating that they give a response of 0 in
areas of constant gray level, as expected of a derivative operator.
The masks just discussed are used to obtain the gradient components Gx and Gy. Computation of the
gradient requires that these two components be combined, for example as ∇f = [Gx² + Gy²]^(1/2).
However, this implementation is not always desirable because of the computational burden required by
squares and square roots. An approach used frequently is to approximate the gradient by absolute values:

∇f ≈ |Gx| + |Gy|

This equation is much more attractive computationally, and it still preserves relative changes in gray
levels. The price paid is that the resulting filters generally are not isotropic (invariant to rotation);
however, this is not an issue when masks such as the Prewitt and Sobel masks are used to
compute Gx and Gy.
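With the Sobel components in hand (computed as in the OpenCV example earlier in this material), the absolute-value approximation can be formed with standard OpenCV calls; image here stands for any grayscale input:

Mat gx, gy, absGx, absGy, gradApprox;
Sobel(image, gx, CV_16S, 1, 0, 3);                    // Gx
Sobel(image, gy, CV_16S, 0, 1, 3);                    // Gy
convertScaleAbs(gx, absGx);                           // |Gx| as 8-bit
convertScaleAbs(gy, absGy);                           // |Gy| as 8-bit
addWeighted(absGx, 0.5, absGy, 0.5, 0, gradApprox);   // ~(|Gx| + |Gy|), scaled by 1/2 to avoid saturation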
It is possible to modify the 3 X 3 masks in Fig. 1.1 so that they have their strongest responses along the
diagonal directions. The two additional Prewitt and Sobel masks for detecting discontinuities in the
diagonal directions are shown in Fig. 1.2.


Fig.1.2 Prewitt and Sobel masks for detecting diagonal edges

The Laplacian:

The Laplacian of a 2-D function f(x, y) is a second-order derivative defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²

For a 3 x 3 region, one of the two forms encountered most frequently in practice is

∇²f = 4z5 − (z2 + z4 + z6 + z8)

Fig. 1.3 Laplacian masks used to implement the equations above.



where the z's are defined in Fig. 1.1(a). A digital approximation including the diagonal neighbors is
given by

∇²f = 8z5 − (z1 + z2 + z3 + z4 + z6 + z7 + z8 + z9)

Masks for implementing these two equations are shown in Fig. 1.3. We note from these masks that the
implementations of Eqns. are isotropic for rotation increments of 90° and 45°, respectively.
2. Write about edge detection.
Intuitively, an edge is a set of connected pixels that lie on the boundary between two regions.
Fundamentally, an edge is a "local" concept whereas a region boundary, owing to the way it is defined,
is a more global idea. A reasonable definition of "edge" requires the ability to measure gray-level
transitions in a meaningful way. We start by modeling an edge intuitively. This will lead us to
formalism in which "meaningful" transitions in gray levels can be measured. Intuitively, an ideal edge
has the properties of the model shown in Fig. 2.1(a). An ideal edge according to this model is a set of
connected pixels (in the vertical direction here), each of which is located at an orthogonal step
transition in gray level (as shown by the horizontal profile in the figure).
In practice, optics, sampling, and other image acquisition imperfections yield edges that are
blurred, with the degree of blurring being determined by factors such as the quality of the image
acquisition system, the sampling rate, and illumination conditions under which the image is
acquired. As a result, edges are more closely modeled as having a "ramp like" profile, such as the one
shown in Fig.2.1 (b).

Fig. 2.1 (a) Model of an ideal digital edge. (b) Model of a ramp edge. The slope of the ramp is
inversely proportional to the degree of blurring in the edge.
The slope of the ramp is inversely proportional to the degree of blurring in the edge. In this model,
we no longer have a thin (one pixel thick) path. Instead, an edge point now is any point contained in the
ramp, and an edge would then be a set of such points that are connected. The "thickness" of the edge is
determined by the length of the ramp as it transitions from an initial to a final gray level.


This length is determined by the slope, which, in turn, is determined by the degree of blurring. This
makes sense: blurred edges tend to be thick and sharp edges tend to be thin. Figure 2.2(a) shows the
image from which the close-up in Fig. 2.1(b) was extracted. Figure 2.2(b) shows a horizontal gray-
level profile of the edge between the two regions. This figure also shows the first and second
derivatives of the gray-level profile. The first derivative is positive at the points of transition into and
out of the ramp as we move from left to right along the profile; it is constant for points in the ramp;
and is zero in areas of constant gray level. The second derivative is positive at the transition associated
with the dark side of the edge, negative at the transition associated with the light side of the edge, and
zero along the ramp and in areas of constant gray level. The signs of the derivatives in Fig. 2.2(b)
would be reversed for an edge that transitions from light to dark.
We conclude from these observations that the magnitude of the first derivative can be used to detect
the presence of an edge at a point in an image (i.e. to determine if a point is on a ramp).
Similarly, the sign of the second derivative can be used to determine whether an edge pixel lies

on the dark or light side of an edge. We note two additional properties of the second derivative around
an edge: A) It produces two values for every edge in an image (an undesirable feature); and B) an
imaginary straight line joining the extreme positive and negative values of the second derivative would
cross zero near the midpoint of the edge. This zero-crossing property of the second derivative is quite
useful for locating the centers of thick edges.

Fig.2.2 (a) Two regions separated by a vertical edge (b) Detail near the edge, showing a gray-
level profile, and the first and second derivatives of the profile.

3. Explain about the edge linking procedures.
The different methods for edge linking are as follows

(i) Local processing

(ii) Global processing via the Hough Transform

(iii) Global processing via graph-theoretic techniques.

(i) Local Processing:


One of the simplest approaches for linking edge points is to analyze the characteristics of pixels in a
small neighborhood (say, 3 X 3 or 5 X 5) about every point (x, y) in an image that has been labeled an
edge point. All points that are similar according to a set of predefined criteria are linked, forming an
edge of pixels that share those criteria.
The two principal properties used for establishing similarity of edge pixels in this kind of analysis are
(1) the strength of the response of the gradient operator used to produce the edge
pixel, and (2) the direction of the gradient vector. The first property is given by the value of ∇f.
Thus an edge pixel with coordinates (xo, yo) in a predefined neighborhood of (x, y) is similar in
magnitude to the pixel at (x, y) if

|∇f(x, y) − ∇f(xo, yo)| ≤ E

where E is a nonnegative magnitude threshold. The direction (angle) of the gradient vector is given by the
equation for α(x, y) above. An edge pixel at (xo, yo) in the predefined neighborhood of (x, y) has an angle
similar to the pixel at (x, y) if

|α(x, y) − α(xo, yo)| < A

where A is a nonnegative angle threshold. The direction of the edge at (x, y) is perpendicular to the
direction of the gradient vector at that point.
A point in the predefined neighborhood of (x, y) is linked to the pixel at (x, y) if both magnitude and
direction criteria are satisfied. This process is repeated at every location in the image. A record must
be kept of linked points as the center of the neighborhood is moved from pixel to pixel. A simple
bookkeeping procedure is to assign a different gray level to each set of linked edge pixels.

Fig. 5.10 Some important probability density functions.

7. Enumerate the differences between the image enhancement and image restoration.

(i) Image enhancement techniques are heuristic procedures designed to manipulate an image in order to
take advantage of the psychophysical aspects of the human visual system, whereas image restoration
techniques are basically reconstruction techniques by which a degraded image is reconstructed using
prior knowledge of the degradation phenomenon.
(ii) Image enhancement can be implemented by spatial and frequency domain techniques, whereas
image restoration can be implemented by frequency domain and algebraic techniques.
(iii) The computational complexity of image enhancement is relatively low compared with that of
image restoration, since the algebraic methods used in restoration require the manipulation of a large
number of simultaneous equations. Under some conditions, however, the computational complexity of
restoration can be reduced to the same level as that required by traditional frequency domain techniques.
(iv) Image enhancement techniques are problem oriented, whereas image restoration techniques are
general and are oriented towards modeling the degradation and applying the reverse process in order
to reconstruct the original image.
(v) Masks are used in spatial domain methods for image enhancement, whereas masks are not used
in image restoration techniques.
(vi) Contrast stretching is considered an image enhancement technique because it is based on what is
visually pleasing to the viewer, whereas removal of image blur by applying a deblurring function is
considered an image restoration technique.
8. Explain about iterative nonlinear restoration using the Lucy–Richardson algorithm.

The Lucy-Richardson algorithm is a nonlinear restoration method used to recover a latent image
which has been blurred by a point spread function (psf). It is also known as Richardson-Lucy
deconvolution.

With Pij as the point spread function, the pixels in the observed image are expressed as

ci = Σj Pij uj

Here,
uj = pixel value at location j in the latent image
ci = observed value at the ith pixel location
The L-R algorithm cannot be used in applications in which the psf (Pij) depends on one or more
unknown variables.

The L-R algorithm is based on a maximum-likelihood formulation in which Poisson statistics
are used to model the image. Maximizing the likelihood of the model yields an equation that is satisfied
when the following iteration converges:

f̂(k+1)(x, y) = f̂(k)(x, y) [ h(−x, −y) ⋆ ( g(x, y) / ( h(x, y) ⋆ f̂(k)(x, y) ) ) ]

Here, f̂ is the estimate of the undegraded image, g is the degraded image, h is the psf, and ⋆ denotes
convolution.

The factor f̂ that appears in the denominator on the right side leads to nonlinearity. Since the algorithm
is a type of nonlinear restoration, it is stopped when a satisfactory result is obtained.
The basic MATLAB syntax of the function deconvlucy, with which the L-R algorithm is implemented,
is given below.

fr = deconvlucy(g, psf, NUMIT, DAMPAR, WEIGHT)

Here the parameters are,

g = Degraded image

fr = Restored image

psf = Point spread function

NUMIT = Total number of iterations.

The remaining two parameters are,

DAMPAR
The DAMPAR parameter is a scalar that specifies the threshold deviation of the resultant image from
the degraded image g. For pixels that deviate from their original value by less than DAMPAR, the
iterations are suppressed, so as to reduce noise generation while preserving essential image detail.

WEIGHT
The WEIGHT parameter assigns a weight to each pixel. It is an array of the same size as the
degraded image g. A pixel that produces improper results can be excluded by assigning it a weight of 0.
Pixels may also be given weights depending on the flat-field correction required by the imaging array.
Weights are used in applications such as blurring with a specified psf; they can be used to exclude the
pixels at the boundary of the image, which are blurred differently by the psf.
If the psf array is of size n x n, then the border of zero weights used has width ceil(n / 2).
The best-known order-statistics filter is the median filter, which, as its name implies, replaces
the value of a pixel by the median of the gray levels in the neighborhood of that pixel:

f̂(x, y) = median(s,t)∈Sxy { g(s, t) }
The original value of the pixel is included in the computation of the median. Median filters are quite
popular because, for certain types of random noise, they provide excellent noise-reduction
capabilities, with considerably less blurring than linear smoothing filters of similar size. Median
filters are particularly effective in the presence of both bipolar and unipolar impulse noise.
(ii) Max and min filters

Although the median filter is by far the order-statistics filler most used in image processing,
it is by no means the only one. The median represents the 50th percentile of a ranked set of numbers,
but the reader will recall from basic statistics that ranking lends itself to many other possibilities. For
example, using the 100th percentile results in the so-called max filter, given by

f̂(x, y) = max(s,t)∈Sxy { g(s, t) }
This filter is useful for finding the brightest points in an image. Also, because pepper noise has very
low values, it is reduced by this filter as a result of the max selection process in the subimage area
Sxy.
The 0th percentile filter is the min filter:

f̂(x, y) = min(s,t)∈Sxy { g(s, t) }
This filter is useful for finding the darkest points in an image. Also, it reduces salt noise as a result of
the min operation.
(iii) Midpoint filter
The midpoint filter simply computes the midpoint between the maximum and minimum
values in the area encompassed by the filter:

f̂(x, y) = ½ [ max(s,t)∈Sxy { g(s, t) } + min(s,t)∈Sxy { g(s, t) } ]
Note that this filter combines order statistics and averaging. This filter works best for randomly distributed
noise, like Gaussian or uniform noise.
(iv) Alpha - trimmed mean filter

It is a filter formed by deleting the d/2 lowest and the d/2 highest gray-level values of g(s, t)
in the neighborhood Sxy. Let gr(s, t) represent the remaining mn − d pixels. A filter formed by
averaging these remaining pixels is called an alpha-trimmed mean filter:

f̂(x, y) = (1/(mn − d)) Σ(s,t)∈Sxy gr(s, t)
where the value of d can range from 0 to mn - 1. When d = 0, the alpha- trimmed filter reduces to the
arithmetic mean filter. If d = (mn - l)/2, the filter becomes a median filter. For other values of d, the
alpha-trimmed filter is useful in situations involving multiple types of noise, such as a combination
of salt-and-pepper and Gaussian noise.
9. Explain the Adaptive Filters.
Adaptive filters are filters whose behavior changes based on statistical characteristics of the
image inside the filter region defined by the m X n rectangular window Sxy.
Adaptive, local noise reduction filter:

The simplest statistical measures of a random variable are its mean and variance. These are
reasonable parameters on which to base an adaptive filler because they are quantities closely related
to the appearance of an image. The mean gives a measure of average gray level in the region over
which the mean is computed, and the variance gives a measure of average contrast in that region.
This filter operates on a local region Sxy. The response of the filter at any point (x, y) on
which the region is centered is based on four quantities: (a) g(x, y), the value of the noisy
image at (x, y); (b) σ²η, the variance of the noise corrupting f(x, y) to form g(x, y); (c) mL, the local
mean of the pixels in Sxy; and (d) σ²L, the local variance of the pixels in Sxy.


The behavior of the filter to be as follows:

1. If σ²η is zero, the filter should simply return the value of g(x, y). This is the trivial, zero-noise case
in which g(x, y) is equal to f(x, y).
2. If the local variance is high relative to σ2η the filter should return a value close to g (x, y). A
high local variance typically is associated with edges, and these should be preserved.
3. If the two variances are equal, we want the filter to return the arithmetic mean value of the
pixels in Sxy. This condition occurs when the local area has the same properties as the overall
image, and local noise is to be reduced simply by averaging.
An adaptive local noise reduction filter based on these assumptions is given by

f̂(x, y) = g(x, y) − (σ²η / σ²L) [ g(x, y) − mL ]

The only quantity that needs to be known or estimated is the variance of the overall noise, σ²η. The
other parameters are computed from the pixels in Sxy at each location (x, y) on which the filter
window is centered.
Adaptive median filter:

The median filter performs well as long as the spatial density of the impulse noise is not large (as a
rule of thumb, Pa and Pb less than 0.2). The adaptive median filtering can handle impulse noise with
probabilities even larger than these. An additional benefit of the adaptive median filter is that it
seeks to preserve detail while smoothing nonimpulse noise, something that the "traditional" median
filter does not do. The adaptive median filter also works in a rectangular window area Sxy. Unlike those
filters, however, the adaptive median filter changes (increases) the size of Sxy during filter operation,
depending on certain conditions. The output of the filter is a single value used to replace the value of
the pixel at (x, y), the particular point on which the window Sxy is centered at a given time.
Consider the following notation:
zmin = minimum gray level value in Sxy

zmax = maximum gray level value in Sxy

zmed = median of gray levels in Sxy

zxy = gray level at coordinates (x, y)

Smax = maximum allowed size of Sxy.


The adaptive median filtering algorithm works in two levels, denoted level A and level B, as follows:
Level A:  A1 = zmed − zmin
          A2 = zmed − zmax
          If A1 > 0 AND A2 < 0, go to level B
          Else increase the window size
          If window size ≤ Smax, repeat level A
          Else output zxy

Level B:  B1 = zxy − zmin
          B2 = zxy − zmax
          If B1 > 0 AND B2 < 0, output zxy
          Else output zmed
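A compact sketch of that two-level logic for one pixel, assuming an 8-bit grayscale cv::Mat, a starting window of 3 x 3, the pixel far enough from the border, and the <vector> and <algorithm> headers:

// Adaptive median value for the pixel at (x, y); Smax is the largest allowed window size.
uchar adaptiveMedian(const Mat& g, int x, int y, int Smax) {
    uchar zxy = g.at<uchar>(y, x);
    for (int S = 3; S <= Smax; S += 2) {
        std::vector<uchar> w;
        for (int s = -S / 2; s <= S / 2; s++)
            for (int t = -S / 2; t <= S / 2; t++)
                w.push_back(g.at<uchar>(y + s, x + t));
        std::sort(w.begin(), w.end());
        uchar zmin = w.front(), zmax = w.back(), zmed = w[w.size() / 2];
        if (zmed > zmin && zmed < zmax) {               // level A passed
            if (zxy > zmin && zxy < zmax) return zxy;   // level B: keep the pixel
            return zmed;                                // level B: replace by the median
        }
        // otherwise enlarge the window and repeat level A
    }
    return zxy;   // window size exceeded Smax
}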

10. Explain a simple Image Formation Model.


An image is represented by two-dimensional functions of the form f(x, y). The value or
amplitude of f at spatial coordinates (x, y) is a positive scalar quantity whose physical meaning is
determined by the source of the image. When an image is generated from a physical process, its values
are proportional to energy radiated by a physical source (e.g., electromagnetic waves). As a
consequence, f(x, y) must be nonzero and finite; that is,
0 < f(x, y) < ∞ …. (1)

The function f(x, y) may be characterized by two components:

A) The amount of source illumination incident on the scene being viewed.

B) The amount of illumination reflected by the objects in the scene.

Appropriately, these are called the illumination and reflectance components and are denoted by i (x,
y) and r (x, y), respectively. The two functions combine as a product to form f (x, y).
f (x, y) = i (x, y) r (x, y) …. (2)
where

0 < i (x, y) < ∞ …. (3)

and
0 < r (x, y) < 1 …. (4)


Equation (4) indicates that reflectance is bounded by 0 (total absorption) and 1 (total
reflectance). The nature of i(x, y) is determined by the illumination source, and r(x, y) is
determined by the characteristics of the imaged objects. It is noted that these expressions also are
applicable to images formed via transmission of the illumination through a medium, such as a chest
X-ray.
11. Write brief notes on inverse filtering.
The simplest approach to restoration is direct inverse filtering, where an estimate F̂(u, v) of the
transform of the original image is computed simply by dividing the transform of the degraded image,
G(u, v), by the degradation function:

F̂(u, v) = G(u, v) / H(u, v)
The divisions are between individual elements of the functions.

But G(u, v) is given by

G(u, v) = H(u, v) F(u, v) + N(u, v)

Hence

F̂(u, v) = F(u, v) + N(u, v) / H(u, v)

This tells us that even if the degradation function is known, the undegraded image [the inverse
Fourier transform of F(u, v)] cannot be recovered exactly, because N(u, v) is a random function
whose Fourier transform is not known.
If the degradation function has zero or very small values, then the ratio N(u, v)/H(u, v) could
easily dominate the estimate F̂(u, v).
One approach to get around the zero or small-value problem is to limit the filter
frequencies to values near the origin. We know that H(0, 0) is equal to the average value of h(x, y), and that this
is usually the highest value of H (u, v) in the frequency domain. Thus, by limiting the analysis to
frequencies near the origin, the probability of encountering zero values is reduced.
12. Write about Noise Probability Density Functions.

The following are among the most common PDFs found in image processing applications.

Gaussian noise

Because of its mathematical tractability in both the spatial and frequency domains, Gaussian
(also called normal) noise models are used frequently in practice. In fact, this tractability is so
convenient that it often results in Gaussian models being used in situations in which they are
marginally applicable at best.
The PDF of a Gaussian random variable z is given by

p(z) = (1 / (√(2π) σ)) e^(−(z − µ)² / (2σ²))    … (1)

where z represents gray level, µ is the mean (average) value of z, and σ is its standard deviation.
The standard deviation squared, σ², is called the variance of z. A plot of this function is shown in
Fig. 5.10. When z is described by Eq. (1), approximately 70% of its values will be in the range [(µ −
σ), (µ + σ)], and about 95% will be in the range [(µ − 2σ), (µ + 2σ)].
Rayleigh noise
The PDF of Rayleigh noise is given by

The mean and variance of this density are given by

µ = a + √(πb/4)
σ² = b(4 − π)/4

Figure 5.10 shows a plot of the Rayleigh density. Note the displacement from the origin and the fact
that the basic shape of this density is skewed to the right. The Rayleigh density can be quite useful
for approximating skewed histograms.
Erlang (Gamma) noise

The PDF of Erlang noise is given by

where the parameters are such that a > 0, b is a positive integer, and "!" indicates factorial. The
mean and variance of this density are given by

µ = b/a
σ² = b/a²

Exponential noise

The PDF of exponential noise is given by

The mean and variance of this density are given by

µ = 1/a
σ² = 1/a²

This PDF is a special case of the Erlang PDF, with b = 1.


Uniform noise
The PDF of uniform noise is given by

The mean and variance of this density are given by

µ = (a + b)/2
σ² = (b − a)²/12
Impulse (salt-and-pepper) noise

The PDF of (bipolar) impulse noise is given by

If b > a, gray-level b will appear as a light dot in the image. Conversely, level a will appear
like a dark dot. If either Pa or Pb is zero, the impulse noise is called unipolar. If neither probability is
zero, and especially if they are approximately equal, impulse noise values will resemble salt-and-
pepper granules randomly distributed over the image. For this reason, bipolar impulse noise also is
called salt-and-pepper noise. Shot and spike noise also are terms used to refer to this type of noise.


The threshold in the preceding example was specified by using a heuristic approach,
based on visual inspection of the histogram. The following algorithm can be used to obtain T
automatically:
1. Select an initial estimate for T.

2. Segment the image using T. This will produce two groups of pixels: G1, consisting of all pixels
with gray level values > T, and G2, consisting of pixels with values ≤ T.
3. Compute the average gray level values µ1 and µ2 for the pixels in regions G1 and G2.

4. Compute a new threshold value:

T = ½ (µ1 + µ2)

5. Repeat steps 2 through 4 until the difference in T in successive iterations is smaller than a
predefined parameter To.
When there is reason to believe that the background and object occupy comparable areas in the image,
a good initial value for T is the average gray level of the image. When objects are small compared to
the area occupied by the background (or vice versa), then one group of pixels will dominate the
histogram and the average gray level is not as good an initial choice. A more appropriate initial value
for T in cases such as this is a value midway between the maximum and minimum gray levels. The
parameter To is used to stop the algorithm after changes become small in terms of this parameter. This
is used when speed of iteration is an important issue.
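A minimal sketch of that iterative procedure for an 8-bit grayscale cv::Mat (To is the stopping tolerance; <cmath> is assumed):

// Basic global threshold selection by iterative averaging of the two group means.
double iterativeThreshold(const Mat& img, double To) {
    double T = mean(img)[0];                 // initial estimate: average gray level
    while (true) {
        double s1 = 0, n1 = 0, s2 = 0, n2 = 0;
        for (int r = 0; r < img.rows; r++) {
            for (int c = 0; c < img.cols; c++) {
                double v = img.at<uchar>(r, c);
                if (v > T) { s1 += v; n1++; } else { s2 += v; n2++; }
            }
        }
        double m1 = (n1 > 0) ? s1 / n1 : 0;  // mean of group G1
        double m2 = (n2 > 0) ? s2 / n2 : 0;  // mean of group G2
        double Tnew = 0.5 * (m1 + m2);       // step 4: new threshold
        if (std::fabs(Tnew - T) < To) return Tnew;
        T = Tnew;
    }
}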

4. Explain about basic adaptive thresholding process used in image


segmentation.
Basic Adaptive Thresholding:

Imaging factors such as uneven illumination can transform a perfectly segmentable histogram into a
histogram that cannot be partitioned effectively by a single global threshold. An approach
for handling such a situation is to divide the original image into subimages and then utilize a
different threshold to segment each subimage. The key issues in this approach are how to subdivide the
image and how to estimate the threshold for each resulting subimage. Since the threshold used for each
pixel depends on the location of the pixel in terms of the subimages, this type of thresholding is
adaptive.


Fig.5 (a) Original image, (b) Result of global thresholding. (c) Image subdivided into
individual subimages (d) Result of adaptive thresholding.
We illustrate adaptive thresholding with a example. Figure 5(a) shows the image, which we concluded
could not be thresholded effectively with a single global threshold. In fact, Fig. 5(b) shows the result of
thresholding the image with a global threshold manually placed in the valley of its histogram. One
approach to reduce the effect of nonuniform illumination is to subdivide the image into smaller
subimages, such that the illumination of each subimage is approximately uniform. Figure 5(c) shows
such a partition, obtained by subdividing the image into four equal parts, and then subdividing each part
by four again. All the subimages that did not contain a boundary between object and background had
variances of less than 75. All subimages containing boundaries had variances in excess of 100. Each
subimage with variance greater than 100 was segmented with a threshold computed for that subimage
using the algorithm. The initial
value for T in each case was selected as the point midway between the minimum and maximum gray
levels in the subimage. All subimages with variance less than 100 were treated as one composite
image, which was segmented using a single threshold estimated using the same algorithm. The result
of segmentation using this procedure is shown in Fig. 5(d).
With the exception of two subimages, the improvement over Fig. 5(b) is evident. The boundary
between object and background in each of the improperly segmented subimages was small and dark,
and the resulting histogram was almost unimodal.

5. Explain in detail the threshold selection based on boundary characteristics.


Use of Boundary Characteristics for Histogram Improvement and Local Thresholding:

It is intuitively evident that the chances of selecting a "good" threshold are enhanced considerably if
the histogram peaks are tall, narrow, symmetric, and separated by deep valleys. One approach for
improving the shape of histograms is to consider only those pixels that lie on or near the edges
between objects and the background. An immediate and obvious improvement is that histograms
would be less dependent on the relative sizes of objects and the background. For instance, the
histogram of an image composed of a small object on a large background area (or vice versa) would be
dominated by a large peak because of the high concentration of one type of pixels.
If only the pixels on or near the edge between object and the background were used, the
resulting histogram would have peaks of approximately the same height. In addition, the probability
that any of those given pixels lies on an object would be approximately equal to the probability that it
lies on the background, thus improving the symmetry of the histogram peaks.
Finally, as indicated in the following paragraph, using pixels that satisfy some
simple measures based on gradient and Laplacian operators has a tendency to deepen the valley
between histogram peaks.
The principal problem with the approach just discussed is the implicit assumption that the edges
between objects and background arc known. This information clearly is not available during
segmentation, as finding a division between objects and background is precisely what segmentation is
all about. However, an indication of whether a pixel is on an edge may be obtained by computing its
gradient. In addition, use of the Laplacian can yield information regarding whether a given pixel lies on
the dark or light side of an edge. The average value of the Laplacian is 0 at the transition of an edge,
so in practice the valleys of histograms formed from
the pixels selected by a gradient/Laplacian criterion can be expected to be sparsely populated. This
property produces the highly desirable deep valleys.
The gradient ∇f at any point (x, y) in an image can be found. Similarly, the Laplacian ∇²f can also
be found. These two quantities may be used to form a three-level image, as follows:

s(x, y) = 0 if ∇f < T;  s(x, y) = + if ∇f ≥ T and ∇²f ≥ 0;  s(x, y) = − if ∇f ≥ T and ∇²f < 0
where the symbols 0, +, and - represent any three distinct gray levels, T is a threshold, and the gradient
and Laplacian are computed at every point (x, y). For a dark object on a light background, the use of
the Eqn. produces an image s(x, y) in which (1) all pixels that are not on
an edge (as determined by ∇f being less than T) are labeled 0; (2) all pixels on the dark side of an
edge are labeled +; and (3) all pixels on the light side of an edge are labeled -. The symbols + and - in
Eq. above are reversed for a light object on a dark background. Figure 6.1 shows the
labeling produced by Eq. for an image of a dark, handwritten stroke written on a light background.
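
A short MATLAB sketch of this labeling follows; the image file name and the value of T are illustrative assumptions, and only the sign of the discrete Laplacian matters here.

f = im2double(imread('stroke.tif'));        % hypothetical image of a dark stroke
[gx, gy] = gradient(f);
gmag = sqrt(gx.^2 + gy.^2);                 % gradient magnitude (edge strength)
lap  = del2(f);                             % discrete Laplacian (scaled; only its sign is used)
T = 0.2;                                    % assumed gradient threshold
s = zeros(size(f));                         % 0 : pixel not on an edge
s(gmag >= T & lap >= 0) =  1;               % '+' : dark side of an edge
s(gmag >= T & lap <  0) = -1;               % '-' : light side of an edge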

The information obtained with this procedure can be used to generate a segmented,
binary image in which l's correspond to objects of interest and 0's correspond to the background. The
transition (along a horizontal or vertical scan line) from a light background to a dark object must be
characterized by the occurrence of a - followed by a + in s (x, y). The interior of the object is
composed of pixels that are labeled either 0 or +. Finally, the transition from the object back to the
background is characterized by the occurrence of a + followed by a -. Thus a horizontal or vertical
scan line containing a section of an object has the following structure:

(…)(-, +)(0 or +)(+, -)(…)


Fig.6.1 Image of a handwritten stroke coded by using Eq. discussed above

where (…) represents any combination of +, -, and 0. The innermost parentheses contain object points
and are labeled 1. All other pixels along the same scan line are labeled 0, with the exception of any
other sequence of (- or +) bounded by (-, +) and (+, -).

Fig.6.2 (a) Original image, (b) Image segmented by local thresholding.


Figure 6.2 (a) shows an image of an ordinary scenic bank check. Figure 6.3 shows the histogram as a
function of gradient values for pixels with gradients greater than 5. Note that this histogram has two
dominant modes that are symmetric, nearly of the same height, and are separated by a distinct valley.
Finally, Fig. 6.2(b) shows the segmented image obtained by thresholding with T at or near the midpoint of the
valley. Note that this example is an illustration of local thresholding, because the value of T was
determined from a histogram of the gradient and Laplacian, which are local properties.

Fig.6.3 Histogram of pixels with gradients greater than 5

6. Explain about region based segmentation.


Region-Based Segmentation:

The objective of segmentation is to partition an image into regions. Earlier, we approached this problem by
finding boundaries between regions based on discontinuities in gray levels, or by thresholding based on
the distribution of pixel properties, such as gray-level values or color. Here, segmentation is based on
finding the regions directly.
Basic Formulation:
Let R represent the entire image region. We may view segmentation as a process that partitions
R into n subregions, R1, R2, ..., Rn, such that

(a) the union of all the regions Ri equals R;
(b) Ri is a connected region, for i = 1, 2, ..., n;
(c) Ri ∩ Rj = Ø for all i and j, i ≠ j;
(d) P(Ri) = TRUE for i = 1, 2, ..., n; and
(e) P(Ri ∪ Rj) = FALSE for any adjacent regions Ri and Rj.

Here, P(Ri) is a logical predicate defined over the points in set Ri and Ø is the null set. Condition (a)
indicates that the segmentation must be complete; that is, every pixel must be in a region. Condition
(b) requires that points in a region must be connected in some predefined sense. Condition (c)
indicates that the regions must be disjoint. Condition (d) deals with the properties that must be
satisfied by the pixels in a segmented region; for example, P(Ri) = TRUE if all pixels in Ri have
the same gray level. Finally, condition (e) indicates that regions Ri and Rj are different in the sense of
predicate P.
Region Growing:
As its name implies, region growing is a procedure that groups pixels or subregions into larger regions
based on predefined criteria. The basic approach is to start with a set of "seed" points and from these
grow regions by appending to each seed those neighboring pixels that have properties similar to the
seed (such as specific ranges of gray level or color). When a priori information is not available, the
procedure is to compute at every pixel the same set of properties that ultimately will be used to assign
pixels to regions during the growing process. If the result of these computations shows clusters of
values, the pixels whose properties place them near the centroid of these clusters can be used as seeds.
The selection of similarity criteria depends not only on the problem under consideration, but also on
the type of image data available. For example, the analysis of land-use satellite imagery depends
heavily on the use of color. This problem would be significantly more difficult, or even impossible, to
handle without the inherent information available in color images. When the images are monochrome,
region analysis must be carried out with a set of descriptors based on gray levels and spatial properties
(such as moments or texture).
Basically, growing a region should stop when no more pixels satisfy the criteria for inclusion in that
region. Criteria such as gray level, texture, and color, are local in nature and do not take into account
the "history" of region growth. Additional criteria that increase the power of a region- growing
algorithm utilize the concept of size, likeness between a candidate pixel and the pixels grown so far
(such as a comparison of the gray level of a candidate and the average gray level of
the grown region), and the shape of the region being grown. The use of these types of descriptors is
based on the assumption that a model of expected results is at least partially available.

Figure 7.1 (a) shows an X-ray image of a weld (the horizontal dark region) containing several cracks
and porosities (the bright, white streaks running horizontally through the middle of the image). We
wish to use region growing to segment the regions of the weld failures. These segmented features could
be used for inspection, for inclusion in a database of historical studies, for controlling an automated
welding system, and for other numerous applications.


Fig.7.1 (a) Image showing defective welds, (b) Seed points, (c) Result of region growing, (d)
Boundaries of segmented defective welds (in black).

The first order of business is to determine the initial seed points. In this application, it is known that
pixels of defective welds tend to have the maximum allowable digital value (255 in this case).
Based on this information, we selected as starting points all pixels having values of 255. The points thus
extracted from the original image are shown in Fig. 7.1(b). Note that many of the points are clustered
into seed regions.
The next step is to choose criteria for region growing. In this particular example we
chose two criteria for a pixel to be annexed to a region: (1) The absolute gray-level difference between
any pixel and the seed had to be less than 65. This number is based on the histogram shown in Fig. 7.2
and represents the difference between 255 and the location of the first major valley to the left, which is
representative of the highest gray level value in the dark weld region. (2) To be included in one of the
regions, the pixel had to be 8-connected to at least one pixel in that region.
If a pixel was found to be connected to more than one region, the regions
were merged. Figure 7.1(c) shows the regions that resulted by starting with the seeds in Fig. 7.1(b) and
utilizing the criteria defined in the previous paragraph. Superimposing the boundaries of these regions on
the original image [Fig. 7.1(d)] reveals that the region-growing procedure did indeed segment the

defective welds with an acceptable degree of accuracy. It is of interest to note that it was not necessary
to specify any stopping rules in this case because the criteria for region growing were sufficient to isolate
the features of interest.

Fig.7.2 Histogram of Fig. 7.1 (a)
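
The following MATLAB sketch mirrors this example: seeds at gray level 255, a gray-level tolerance of 65, and 8-connected growth by repeated conditional dilation. The image file name is an assumption, since the weld image itself is not reproduced here.

f = imread('weld.tif');                              % hypothetical X-ray image of the weld
seeds = f == 255;                                    % seed points: the maximum allowable value
candidates = abs(double(f) - 255) < 65;              % criterion (1): close enough to the seed value
region = seeds;
while true
    grown = imdilate(region, ones(3)) & candidates;  % criterion (2): 8-connected neighbours only
    if isequal(grown, region), break; end            % stop when no more pixels can be appended
    region = grown;
end
% imreconstruct(seeds, candidates) computes the same result in a single call.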

Region Splitting and Merging:


The procedure just discussed grows regions from a set of seed points. An alternative is to subdivide an
image initially into a set of arbitrary, disjointed regions and then merge and/or split the regions in an
attempt to satisfy the conditions. A split and merge algorithm that iteratively works toward satisfying
these constraints is developed.
Let R represent the entire image region and select a predicate P. One approach for segmenting R is to
subdivide it successively into smaller and smaller quadrant regions so that, for any region Ri, P(Ri) =
TRUE. We start with the entire region. If P(R) = FALSE, we divide the image into quadrants. If P is
FALSE for any quadrant, we subdivide that quadrant into subquadrants, and so on. This particular
splitting technique has a convenient representation in the form of a so-called quadtree (that is, a tree in
which nodes have exactly four descendants), as illustrated in Fig. 7.3. Note that the root of the tree
corresponds to the entire image and that each node corresponds to a subdivision. In this case, only R4
was subdivided further.


Fig. 7.3 (a) Partitioned image (b) Corresponding quadtree.

If only splitting were used, the final partition likely would contain adjacent regions with identical
properties. This drawback may be remedied by allowing merging, as well as splitting. Satisfying the
constraints, requires merging only adjacent regions whose combined pixels satisfy the predicate P. That
is, two adjacent regions Rj and Rk are merged only if P (Rj U Rk) = TRUE.
The preceding discussion may be summarized by the following procedure, in which, at any step we
1. Split into four disjoint quadrants any region Ri, for which P (Ri) = FALSE.

2. Merge any adjacent regions Rj and Rk for which P (Rj U Rk) =TRUE.

3. Stop when no further merging or splitting is possible.

Several variations of the preceding basic theme are possible. For example, one possibility is to split the
image initially into a set of blocks. Further splitting is carried out as described previously, but merging
is initially limited to groups of four blocks that are descendants in the quadtree representation and that
satisfy the predicate P. When no further mergings of this type are possible, the procedure is
terminated by one final merging of regions satisfying step 2. At this point, the merged regions may be
of different sizes. The principal advantage of this approach is that it uses the same quadtree for splitting
and merging, until the final merging step.
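
A hedged MATLAB sketch of the splitting half of this scheme is given below, using the toolbox quadtree decomposition. The predicate (maximum minus minimum gray level no greater than 0.2) and the image name are assumptions, and the merging passes are not shown.

f = im2double(imread('scene.tif'));          % hypothetical input image
f = f(1:256, 1:256);                         % qtdecomp needs square, power-of-two dimensions
S = qtdecomp(f, 0.2);                        % split any block where max - min > 0.2 (P is FALSE)
blocks = full(S);                            % block size is stored at each block's top-left corner
imagesc(blocks); axis image;                 % visualize the quadtree partition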


(ii) Global processing via the Hough Transform:

In this process, points are linked by determining first if they lie on a curve of specified shape. We now
consider global relationships between pixels. Given n points in an image, suppose that we want to find
subsets of these points that lie on straight lines. One possible solution is to first find all lines
determined by every pair of points and then find all subsets of points that are close to particular lines.
The problem with this procedure is that it involves finding n(n − 1)/2 ≈ n² lines and then performing
n·[n(n − 1)/2] ≈ n³ comparisons of every point to all lines. This approach is computationally prohibitive
in all but the most trivial applications.
Hough [1962] proposed an alternative approach, commonly referred to as the Hough transform.
Consider a point (xi, yi) and the general equation of a straight line in slope-intercept form, yi
= a.xi + b. Infinitely many lines pass through (xi, yi) but they all satisfy the equation yi =
a.xi + b for varying values of a and b. However, writing this equation as b = -a.xi + yi, and considering
the ab-plane (also called parameter space) yields the equation of a single line for a fixed pair (xi, yi).
Furthermore, a second point (xj, yj) also has a line in parameter space associated with it, and this line
intersects the line associated with (xi, yi) at (a', b'), where a' is the slope and b' the intercept of the line
containing both (xi, yi) and (xj, yj) in the xy-plane. In fact, all points contained on this line have lines in
parameter space that intersect at (a', b'). Figure 3.1 illustrates these concepts.

Fig.3.1 (a) xy-plane (b) Parameter space

Fig.3.2 Subdivision of the parameter plane for use in the Hough transform


The computational attractiveness of the Hough transform arises from subdividing the parameter space
into so-called accumulator cells, as illustrated in Fig. 3.2, where (amin, amax) and (bmin, bmax)
are the expected ranges of slope and intercept values. The cell at coordinates (i, j), with accumulator
value A(i, j), corresponds to the square associated with parameter space coordinates (ai, bj).
Initially, these cells are set to zero. Then, for every point (x k, yk) in the image plane, we let the
parameter a equal each of the allowed subdivision values on the a-axis and solve for the corresponding
b using the equation b = - xk a + yk .The resulting b’s are then rounded off to the nearest allowed value
in the b-axis. If a choice of ap results in solution bq, we let A (p, q) = A (p,
q) + 1. At the end of this procedure, a value of Q in A (i, j) corresponds to Q points in the xy- plane
lying on the line y = ai x + bj. The number of subdivisions in the ab-plane determines the accuracy of
the collinearity of these points. Note that subdividing the a-axis into K increments gives, for every point
(xk, yk), K values of b corresponding to the K possible values of a. With n image points, this method
involves nK computations. Thus the procedure just discussed is linear in n, and the product nK does
not approach the number of computations discussed at the beginning unless K approaches or exceeds
n.
A problem with using the equation y = ax + b to represent a line is that the slope
approaches infinity as the line approaches the vertical. One way around this difficulty is to use the
normal representation of a line:
x cosθ + y sinθ = ρ

Figure 3.3(a) illustrates the geometrical interpretation of the parameters used. The use of this
representation in constructing a table of accumulators is identical to the method discussed for the slope-
intercept representation. Instead of straight lines, however, the loci are sinusoidal curves in the ρθ -
plane. As before, Q collinear points lying on a line x cosθj + y sinθj = ρi yield Q sinusoidal curves
that intersect at (ρi, θj) in the parameter space. Incrementing θ and solving for the corresponding ρ
gives Q entries in accumulator A(i, j) associated with the cell determined by (ρi, θj). Figure 3.3 (b)
illustrates the subdivision of the parameter space.

Fig.3.3 (a) Normal representation of a line (b) Subdivision of the ρθ-plane into cells


The range of angle θ is ±90°, measured with respect to the x-axis. Thus with reference to Fig. 3.3 (a), a
horizontal line has θ = 0°, with ρ being equal to the positive x-intercept. Similarly, a vertical line has
θ = 90°, with ρ being equal to the positive y-intercept, or θ = −90°, with ρ being equal to the negative
y-intercept.
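
A compact MATLAB sketch of the ρθ accumulator described above follows. The binary edge image and the 1°/1-pixel quantization are assumptions; the toolbox functions hough, houghpeaks and houghlines implement the same idea.

bw = imread('edges.tif') > 0;                  % hypothetical binary edge map
[yy, xx] = find(bw);                           % coordinates of the edge points
thetas = -90:89;                               % theta range in degrees
rhoMax = ceil(hypot(size(bw,1), size(bw,2)));
A = zeros(2*rhoMax + 1, numel(thetas));        % accumulator A(i, j), initially zero
for k = 1:numel(xx)
    for j = 1:numel(thetas)
        rho = xx(k)*cosd(thetas(j)) + yy(k)*sind(thetas(j));
        i = round(rho) + rhoMax + 1;           % nearest allowed rho cell
        A(i, j) = A(i, j) + 1;                 % one more point votes for this (rho, theta)
    end
end
[votes, idx] = max(A(:));                      % a cell with Q votes means Q collinear points
[iBest, jBest] = ind2sub(size(A), idx);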

(iii) Global processing via graph-theoretic techniques


In this process we have a global approach for edge detection and linking based on representing edge
segments in the form of a graph and searching the graph for low-cost paths that correspond to
significant edges. This representation provides a rugged approach that performs well in the presence of
noise.

Fig.3.4 Edge element between pixels p and q

We begin the development with some basic definitions. A graph G = (N,U) is a finite, nonempty set of
nodes N, together with a set U of unordered pairs of distinct elements of N. Each pair (ni, nj) of U is
called an arc. A graph in which the arcs are directed is called a directed graph. If an arc is directed
from node ni to node nj, then nj is said to be a successor of the parent node ni. The process of
identifying the successors of a node is called expansion of the node. In each graph we define levels,
such that level 0 consists of a single node, called the start or root node, and the nodes in the last level
are called goal nodes. A cost c (ni, nj) can be associated with every arc (ni, nj). A sequence of nodes n1,
n2... nk, with each node ni being a successor of node ni-1 is called a path from n1 to nk. The cost of the
entire path is the sum of the costs of all arcs along the path:

c = c(n1, n2) + c(n2, n3) + … + c(nk-1, nk)

The following discussion is simplified if we define an edge element as the boundary between two
pixels p and q, such that p and q are 4-neighbors, as Fig.3.4 illustrates. Edge elements are identified by
the xy-coordinates of points p and q. In other words, the edge element in Fig. 3.4 is defined by the pairs
(xp, yp) (xq, yq). Consistent with the definition an edge is a sequence of connected edge elements.
We can illustrate how the concepts just discussed apply to edge detection using the 3 X
3 image shown in Fig. 3.5 (a). The outer numbers are pixel


Fig.3.5 (a) A 3 X 3 image region, (b) Edge segments and their costs, (c) Edge corresponding to the
lowest-cost path in the graph shown in Fig. 3.6

coordinates and the numbers in brackets represent gray-level values. Each edge element, defined by
pixels p and q, has an associated cost, defined as

c(p, q) = H − [f(p) − f(q)]

where H is the highest gray-level value in the image (7 in this case), and f(p) and f(q) are the gray-
level values of p and q, respectively. By convention, the point p is on the right-hand side of the
direction of travel along edge elements. For example, the edge segment (1, 2) (2, 2) is between points
(1, 2) and (2, 2) in Fig. 3.5 (b). If the direction of travel is to the right, then p is the point with
coordinates (2, 2) and q is point with coordinates (1, 2); therefore, c (p, q) = 7 - [7
- 6] = 6. This cost is shown in the box below the edge segment. If, on the other hand, we are traveling
to the left between the same two points, then p is point (1, 2) and q is (2, 2). In this case the cost is 8, as
shown above the edge segment in Fig. 3.5(b). To simplify the discussion, we assume that edges start in
the top row and terminate in the last row, so that the first element of an edge can be only between
points (1, 1), (1, 2) or (1, 2), (1, 3). Similarly, the last edge element has
to be between points (3, 1), (3, 2) or (3, 2), (3, 3). Keep in mind that p and q are 4-neighbors, as noted
earlier. Figure 3.6 shows the graph for this problem. Each node (rectangle) in the graph corresponds to
an edge element from Fig. 3.5. An arc exists between two nodes if the two corresponding edge
elements taken in succession can be part of an edge.


Fig. 3.6 Graph for the image in Fig.3.5 (a). The lowest-cost path is shown dashed.

As in Fig. 3.5 (b), the cost of each edge segment, is shown in a box on the side of the arc leading into
the corresponding node. Goal nodes are shown shaded. The minimum cost path is shown dashed, and
the edge corresponding to this path is shown in Fig. 3.5 (c).

7. What is thresholding? Explain about global thresholding.


Thresholding:
Because of its intuitive properties and simplicity of implementation, image thresholding enjoys a
central position in applications of image segmentation.
Global Thresholding:

The simplest of all thresholding techniques is to partition the image histogram by using a single global
threshold, T. Segmentation is then accomplished by scanning the image pixel by pixel and labeling
each pixel as object or back-ground, depending on whether the gray level of that pixel is greater or less
than the value of T. As indicated earlier, the success of this method depends entirely on how well the
histogram can be partitioned.


Fig.4.1 (a) Original image, (b) Image histogram, (c) Result of global
thresholding with T midway between the maximum and minimum gray levels.
Figure 4.1(a) shows a simple image, and Fig. 4.1(b) shows its histogram. Figure 4.1(c) shows the
result of segmenting Fig. 4.1(a) by using a threshold T midway between the maximum and minimum
gray levels. This threshold achieved a "clean" segmentation by eliminating the shadows and leaving
only the objects themselves. The objects of interest in this case are darker than the background, so any
pixel with a gray level ≤ T was labeled black (0), and any pixel with a gray level > T was labeled
white (255). The key objective is merely to generate a binary image, so the black-white relationship
could be reversed. The type of global thresholding just described can be expected to be successful in
highly controlled environments. One of the areas in which this often is possible is in industrial
inspection applications, where control of the illumination usually is feasible.
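
A short MATLAB sketch of choosing T automatically follows: start midway between the minimum and maximum gray levels and then refine T from the two class means until it stops changing. The image file name is an assumption.

f = double(imread('parts.tif'));              % hypothetical input image
T = (min(f(:)) + max(f(:))) / 2;              % initial estimate: the midway point
while true
    m1 = mean(f(f >  T));                     % mean gray level of the brighter group
    m2 = mean(f(f <= T));                     % mean gray level of the darker group
    Tnew = (m1 + m2) / 2;
    if abs(Tnew - T) < 0.5, break; end        % stop once T settles
    T = Tnew;
end
g = uint8(255 * (f > T));                     % objects (dark) -> 0, background -> 255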

UNIT IV
Image Compression

Error-Free Compression
Useful in applications where no loss of information is tolerable. This may be due to accuracy
requirements, legal requirements, or the less-than-perfect quality of the original image.
• Compression can be achieved by removing coding and/or interpixel redundancy.
• Typical compression ratios achievable by lossless techniques range from 2 to 10.

Huffman Code
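
Since the worked figures for this section are not reproduced here, a small MATLAB sketch of building a Huffman code is given instead; the symbol probabilities are illustrative assumptions.

p = [0.4 0.3 0.1 0.1 0.06 0.04];         % assumed symbol probabilities
n = numel(p);
codes = repmat({''}, 1, n);              % code string for each original symbol
nodes = num2cell(1:n);                   % each node records the source symbols it covers
prob  = p;
while numel(prob) > 1
    [prob, order] = sort(prob);          % the two least probable nodes come first
    nodes = nodes(order);
    for s = nodes{1}, codes{s} = ['0' codes{s}]; end   % prepend a 0 for symbols under node 1
    for s = nodes{2}, codes{s} = ['1' codes{s}]; end   % prepend a 1 for symbols under node 2
    prob  = [prob(1) + prob(2), prob(3:end)];          % merge the two nodes into one
    nodes = [{[nodes{1} nodes{2}]}, nodes(3:end)];
end
Lavg = sum(p .* cellfun(@numel, codes)); % average code length in bits/symbol
H    = -sum(p .* log2(p));               % source entropy, for comparison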


Arithmetic coding for image compression


If the encoding requires more than 4 gigabytes, you must use .tif to be able to handle that size; if you can be
sure that the encoding will be no more than 2 gigabytes (minus one byte), then you can use .png.
.tif and .png files can be used to store arbitrary bytes without loss. (.bmp too, but .bmp has a limit of 30000
bytes for this purpose.)
You will not get any useful output if you ask to display an image created in this way: if the arithmetic
encoding worked properly, then the output will look pretty much random. Just because you can create an
image file does not mean that the image file is understandable to humans.
People often expect a compressed image to look like a distorted, but partly recognizable, version of the
original, or perhaps a smaller and possibly recolored version of it. Compression theory, however, says
that you can continue to compress until one of two things happens:

1. the compressed values become statistically indistinguishable from random; or

2. the overhead needed to describe the data transformations to apply more compression starts to take more
space than just listing the compressed bytes as they are.
If you could still make out any resemblance between the original image and the data representing the
compressed image, then you did not do a good enough job of compression.
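
To make the idea concrete, here is a minimal MATLAB sketch of arithmetic encoding by interval subdivision; the symbol sequence and probabilities are illustrative assumptions, and a practical coder would also need interval renormalization and an explicit bit output.

p   = [0.5 0.3 0.2];                 % assumed symbol probabilities
cdf = [0 cumsum(p)];                 % cumulative boundaries of each symbol's slot
seq = [1 3 2 1];                     % sequence of symbol indices to encode
lo = 0; hi = 1;
for s = seq
    range = hi - lo;
    hi = lo + range * cdf(s + 1);    % shrink the interval to the current symbol's slot
    lo = lo + range * cdf(s);
end
tag = (lo + hi) / 2;                 % any number in [lo, hi) identifies the whole sequence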

Lossy Compression
• Unlike error-free compression, lossy encoding is based on the concept of compromising the
accuracy of the reconstructed image in exchange for increased compression.
• The lossy compression method produces distortion which is irreversible. On the other hand, very
high compression ratios, ranging between 10:1 and 50:1, can be achieved with results that are visually
indistinguishable from the original. The error-free methods rarely give results of more than 3:1.
• Transform Coding: Transform coding is the most popular lossy image compression method. Unlike
predictive methods, it does not operate directly on the pixels of an image.
• The method uses a reversible transform (e.g. the Fourier or cosine transform) to map the image into a set
of transform coefficients, which are then quantized and coded.
• The goal of the transformation is to decorrelate the pixels of a given image block such that most of the
information is packed into the smallest number of transform coefficients.
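
A hedged MATLAB sketch of block transform coding follows: take the 2-D DCT of each 8x8 block and keep only a low-frequency zone of coefficients. The test image (whose dimensions are assumed to be multiples of 8) and the 4x4 retained zone are assumptions, and the entropy coding of the kept coefficients is omitted.

f = im2double(imread('cameraman.tif'));               % example grayscale image (256 x 256)
C = blockproc(f, [8 8], @(b) dct2(b.data));           % forward DCT, block by block
mask = zeros(8); mask(1:4, 1:4) = 1;                  % keep the 16 low-frequency coefficients
Cq = blockproc(C, [8 8], @(b) b.data .* mask);        % zonal coding: discard the rest
g  = blockproc(Cq, [8 8], @(b) idct2(b.data));        % reconstruct the approximation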

How MRFM Works

MRFM uses equipment that is very similar to that used for Atomic Force Microscopy (AFM). The main
difference is that for MRFM the microscope tip is magnetic. A coil in the instrument applies a radio-
frequency (RF) magnetic field that changes the ‘spin’ of electrons and protons in the sample. Each flip of
the spin causes the magnetic tip attached to the cantilever to move, resulting in vibration as it moves across the
sample surface. A laser and optical detector are able to read these vibrations and produce a picture of the
individual atoms in the sample.

Figure 1. Schematic diagram for an MRFM

WAVELET TRANSFORM BASED COMPRESSION
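
Since the worked material for this section is not reproduced here, a minimal MATLAB sketch of the idea is given instead: one level of wavelet decomposition, discarding small detail coefficients before reconstruction. The wavelet ('haar'), the threshold, and the image name are assumptions; dwt2 and idwt2 require the Wavelet Toolbox.

f = im2double(imread('cameraman.tif'));
[cA, cH, cV, cD] = dwt2(f, 'haar');          % approximation + horizontal/vertical/diagonal detail
thr = 0.05;                                  % assumed dead-zone threshold
cH(abs(cH) < thr) = 0;                       % discard small detail coefficients
cV(abs(cV) < thr) = 0;
cD(abs(cD) < thr) = 0;
g = idwt2(cA, cH, cV, cD, 'haar');           % reconstruct from the coefficients that were kept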


IMAGE COMPRESSION STANDARDS


 MPEG-1 (1993): Coding of moving pictures and associated audio for digital storage media at up to
about 1.5 Mbit/s (ISO/IEC 11172). It includes the popular MPEG-1 Audio Layer III (MP3) audio
compression format.
 MPEG-2 (1995): Generic coding of moving pictures and associated audio information (ISO/IEC
13818).
 MPEG-3: MPEG-3 dealt with standardizing scalable and multi-resolution compression and was
intended for HDTV compression but was found to be redundant and was merged with MPEG-2.
 MPEG-4 (1998): Coding of audio-visual objects. It includes MPEG-4 Part 14 (MP4).

UNIT V
Image Segmentation
1. The Approach
Whenever one takes a bird’s-eye view of image segmentation tasks, one observes a
crucial process that happens here – object identification. From simple to complex application areas,
everything is based on object detection.

And as we discussed earlier, detection is made possible because image segmentation algorithms try to
– if we put it in layman’s terms – collect similar pixels together and separate out dissimilar pixels. This is
done by following two approaches based on the image properties:

1.1. Similarity Detection (Region Approach)


This fundamental approach relies on detecting similar pixels in an image – based on a threshold, region
growing, region spreading, and region merging. Machine learning algorithms like clustering rely on this
approach of similarity detection on an unknown set of features, as does classification, which detects
similarity based on a pre-defined (known) set of features.

1.2. Discontinuity Detection (Boundary Approach)


This is the stark opposite of the similarity detection approach, where the algorithm instead searches for
discontinuity. Image segmentation algorithms like edge detection, point detection, and line detection
follow this approach – where edges get detected based on various metrics of discontinuity, such as intensity.

2. Edge Based Segmentation


Edge detection is the process of locating edges in an image, which is a very important step towards
understanding image features. It is believed that edges consist of meaningful features and contain
significant information. Edge detection significantly reduces the amount of data that will be processed and filters out
information that may be regarded as less relevant, preserving and focusing solely on the important
structural properties of an image for a business problem.

Edge-based segmentation algorithms work to detect edges in an image, based on various discontinuities in
grey level, colour, texture, brightness, saturation, contrast etc. To further enhance the results,
supplementary processing steps must follow to concatenate all the edges into edge chains that correspond
better with borders in the image.



Edge detection algorithms fall primarily into two categories – gradient-based methods and gray-histogram
methods. Basic edge detection operators like the Sobel, Canny, and Roberts operators are used in
these algorithms. These operators aid in detecting the edge discontinuities and hence mark the edge
boundaries. The end goal is to reach at least a partial segmentation using this process, where we group all
the local edges into a new binary image where only edge chains that match the required existing objects or
image parts are present.
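
As a quick illustration of this pipeline, the MATLAB sketch below detects edges and then links nearby fragments with a small closing; the image name and the structuring-element size are assumptions.

f  = imread('coins.png');                      % example grayscale image
bw = edge(f, 'canny');                         % Canny edge map (binary)
bw = imclose(bw, strel('disk', 2));            % concatenate nearby edges into chains
imshow(bw)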

3. Region Based Segmentation


The region based segmentation methods involve the algorithm creating segments by dividing the image
into various components having similar characteristics. These components, simply put, are nothing but a
set of pixels. Region-based image segmentation techniques initially search for some seed points – either
smaller parts or considerably bigger chunks in the input image.

Next, certain approaches are employed, either to add more pixels to the seed points or further diminish or
shrink the seed point to smaller segments and merge with other smaller seed points. Hence, there are two
basic techniques based on this method.

3.1 Region Growing


It’s a bottom-up method where we begin with a small set of pixels and start accumulating or iteratively
merging them based on certain pre-determined similarity constraints. The region-growing algorithm starts by
choosing an arbitrary seed pixel in the image and comparing it with its neighboring pixels.

If there is a match or similarity with the neighboring pixels, then they are added to the initial seed pixel, thus
increasing the size of the region. When saturation is reached and the growth of that region
cannot proceed further, the algorithm chooses another seed pixel, which does not yet belong
to any existing region, and starts the process again.
Region growing methods often achieve effective segmentation that corresponds well to the observed
edges. But when the algorithm lets a region grow completely before trying other seeds, it
usually biases the segmentation in favour of the regions that are segmented first. To counter this effect,
most algorithms take the similarity criteria from user input first, no single region is allowed to
dominate and grow completely, and multiple regions are allowed to grow simultaneously.

Region growing, like thresholding, is a pixel-based algorithm, but the major difference is that thresholding
extracts a large region of similar pixels from anywhere in the image, whereas region growing
extracts only adjacent pixels. Region growing techniques are preferable for noisy images, where it is
highly difficult to detect the edges.

3.2 Region Splitting and Merging


The splitting and merging based segmentation methods use two basic techniques done together in
conjunction – region splitting and region merging – for segmenting an image. Splitting involves iteratively
dividing an image into regions having similar characteristics and merging employs combining the adjacent
regions that are somewhat similar to each other.

A region split, unlike region growing, considers the entire input image as the area of interest.
It then tries to match a known set of parameters or pre-defined similarity constraints and picks up
all the pixel areas matching the criteria. This is a divide-and-conquer method, as opposed to the region
growing algorithm.

Now, the above is just one half of the process. After performing the split, we will have
many similarly marked regions scattered all across the image, meaning the final segmentation would
contain scattered clusters of neighbouring regions that have identical or similar properties. To complete the
process, we perform merging, which after each split compares adjacent regions and, if required,
based on similarity degrees, merges them. Such algorithms are called split-and-merge algorithms.

SEGMENTATION BY MORPHOLOGICAL WATERSHEDS

Watershed is a ridge-based approach, also a region-based method, which follows the concept of topographic
interpretation. We consider the analogy of a geographic landscape with ridges and valleys for the various
components of an image. The slope and elevation of this topography are quantified by the
gray values of the respective pixels, i.e. the gradient magnitude. Based on this 3D representation, which
is usually followed for Earth landscapes, the watershed transform decomposes an image into regions that
are called “catchment basins”. For each local minimum, a catchment basin comprises all pixels whose path
of steepest descent of gray values terminates at this minimum.


Put simply, the algorithm treats the pixel values as a “local topography” (elevation),
often initializing itself from user-defined markers. The algorithm then defines “basins”,
which are the local minima, and the basins are flooded from the markers until the basins meet on
watershed lines. The watersheds formed in this way separate the basins from each other. The
picture thus gets decomposed because we have pixels assigned to each such region or watershed.
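
A hedged MATLAB sketch of this idea follows, applying the watershed transform to the gradient magnitude. The image name is an assumption, and without markers the result is usually oversegmented.

f = im2double(imread('cells.tif'));           % hypothetical input image
g = imgradient(f);                            % "topography": the gradient magnitude
L = watershed(g);                             % label matrix; 0 marks the watershed (ridge) lines
imshow(labeloverlay(f, L == 0));              % draw the watershed lines over the image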

Color-Based Segmentation



This example shows how to segment colors in an automated fashion using k-means clustering.
Clustering is a way to separate groups of objects. K-means clustering treats each object as having a
location in space. It finds partitions such that objects within each cluster are as close to each other as
possible, and as far from objects in other clusters as possible. You can use the imsegkmeans function to
separate image pixels by value into clusters within a color space. This example performs k-means
clustering of an image in the RGB and L*a*b* color spaces to show how using different color spaces can
improve segmentation results.
Step 1: Read Image
Read in hestain.png, which is an image of tissue stained with hematoxylin and eosin (H&E). This staining
method helps pathologists distinguish between tissue types that are stained blue-purple and pink.
he = imread("hestain.png");
imshow(he), title('H&E image');
text(size(he,2),size(he,1)+15, ...
"Image courtesy of Alan Partin, Johns Hopkins University", ...
"FontSize",7,"HorizontalAlignment","right");


Step 2: Classify Colors in RGB Color Space Using K-Means Clustering


Segment the image into three regions using k-means clustering in the RGB color space. For each pixel in
the input image, the imsegkmeans function returns a label corresponding to a cluster.
Display the label image as an overlay on the original image. The label image incorrectly groups white,
light blue-purple, and light pink regions together. Because the RGB color space combines brightness and
color information within each channel (red, green, blue), lighter versions of two different colors are closer
together and more challenging to segment than darker versions of the same two colors.
numColors = 3;
L = imsegkmeans(he,numColors);
B = labeloverlay(he,L);
imshow(B)
title("Labeled Image RGB")

Step 3: Convert Image from RGB Color Space to L*a*b* Color Space
The L*a*b* color space separates image luminosity and color. This makes it easier to segment regions by
color, independent of lightness. The color space is also more consistent with human visual perception of
the distinct white, blue-purple, and pink regions in the image.
The L*a*b* color space is derived from the CIE XYZ tristimulus values. The L*a*b* space consists of the
luminosity layer L*, the chromaticity layer a* that indicates where a color falls along the red-green axis,
and the chromaticity layer b* that indicates where a color falls along the blue-yellow axis. All of the color
information is in the a* and b* layers.
Convert the image to the L*a*b* color space by using the rgb2lab function.
lab_he = rgb2lab(he);
Step 4: Classify Colors in a*b* Space Using K-Means Clustering
To segment the image using only color information, limit the image to the a* and b* values in lab_he.
Convert the image to data type single for use with the imsegkmeans function. Use
the imsegkmeans function to separate the image pixels into three clusters. Set the value of
the NumAttempts name-value argument to repeat clustering three times with different initial cluster
centroid positions to avoid fitting to a local minimum.
ab = lab_he(:,:,2:3);
ab = im2single(ab);
pixel_labels = imsegkmeans(ab,numColors,"NumAttempts",3);
Display the label image as an overlay on the original image. The new label image more clearly separates
the white, blue-purple, and pink stained tissue regions.
B2 = labeloverlay(he,pixel_labels);
imshow(B2)
title("Labeled Image a*b*")

Step 5: Create Images that Segment H&E Image by Color

Using pixel_labels, you can separate objects in the original image hestain.png by color, resulting in three
masked images.
mask1 = pixel_labels == 1;
cluster1 = he.*uint8(mask1);
imshow(cluster1)
title("Objects in Cluster 1");

mask2 = pixel_labels == 2;
cluster2 = he.*uint8(mask2);
imshow(cluster2)
title("Objects in Cluster 2");

mask3 = pixel_labels == 3;
cluster3 = he.*uint8(mask3);
imshow(cluster3)
title("Objects in Cluster 3");

Step 6: Segment Nuclei


Cluster 3 contains only the blue objects. Notice that there are dark and light blue objects. You can separate
dark blue from light blue using the L* layer in the L*a*b* color space. The cell nuclei are dark blue.
The L* layer contains the brightness value of each pixel. Extract the brightness values of the pixels in this
cluster and threshold them with a global threshold by using the imbinarize function. The
mask idx_light_blue gives the indices of light blue pixels.
L = lab_he(:,:,1);
L_blue = L.*double(mask3);
L_blue = rescale(L_blue);
idx_light_blue = imbinarize(nonzeros(L_blue));
Copy the mask of blue objects, mask3, and then remove the light blue pixels from the mask. Apply the
new mask to the original image and display the result. Only dark blue cell nuclei are visible.
blue_idx = find(mask3);
mask_dark_blue = mask3;
mask_dark_blue(blue_idx(idx_light_blue)) = 0;

blue_nuclei = he.*uint8(mask_dark_blue);
imshow(blue_nuclei)
title("Blue Nuclei");


Dilation:

Dilation is a process in which the binary image is expanded from its original shape. The way the binary
image is expanded is determined by the structuring element.

This structuring element is smaller in size compared to the image itself, and normally the size used for the
structuring element is 3 x 3.

The dilation process is similar to the convolution process, that is, the structuring element is reflected and
shifted from left to right and from top to bottom, at each shift; the process will look for any overlapping
similar pixels between the structuring element and that of the binary image.

If there exists an overlapping then the pixels under the center position of the structuring element will be
turned to 1 or black.

Let us define X as the reference image and B as the structuring element. The dilation operation is defined
by equation,

X ⊕ B = { z | (B̂)z ∩ X ≠ Ø }

where B̂ is the image B rotated about the origin. The equation states that when the image X is dilated by the
structuring element B, the outcome contains every element z for which there is at least one element in B that
intersects with an element in X.


If this is the case, the position where the structuring element is being centered on the image will be 'ON'.
This process is illustrated in Fig. a. The black square represents 1 and the white square represents 0.

Initially, the center of the structuring element is aligned at position •. At this point, there is no overlapping
between the black squares of B and the black squares of X; hence at position • the square will remain
white.

This structuring element will then be shifted towards right. At position **, we find that one of the black
squares of B is overlapping or intersecting with the black square of X.

Thus, at position ** the square will be changed to black. Similarly, the structuring element B is shifted
from left to right and from top to bottom on the image X to yield the dilated image as shown in Fig. a.

The dilation is an expansion operator that enlarges binary objects. Dilation has many uses, but the major
one is bridging gaps in an image, due to the fact that B is expanding the features of X.
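
A small MATLAB check of this behaviour is given below; the test mask is an assumption.

X = false(7);  X(3:5, 3:4) = true;            % small binary object
B = strel('square', 3);                       % 3 x 3 structuring element
Xd = imdilate(X, B);                          % the object grows by one pixel on every side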

Erosion:

Erosion is the counter-process of dilation. If dilation enlarges an image then erosion shrinks the image.
The way the image is shrunk is determined by the structuring element. The structuring element is normally
smaller than the image with a 3 x 3 size.

This will ensure faster computation time when compared to larger structuring-element size. Almost similar
to the dilation process, the erosion process will move the structuring element from left to right and top to
bottom.


At the center position, indicated by the center of the structuring element, the process will look for whether
there is a complete overlap with the structuring element or not.

If there is no complete overlapping then the center pixel indicated by the center of the structuring element
will be set white or 0. Let us define X as the reference binary image and B as the structuring element.
Erosion is defined by the equation

X ⊖ B = { z | (B)z ⊆ X }

The equation states that the outcome contains an element z only when the structuring element, translated to z, is a
subset of (or equal to) the binary image X. This process is depicted in Fig. b. Again, the white square indicates 0 and the
black square indicates 1.

The erosion process starts at position •. Here, there is no complete overlapping, and so the pixel at the
position • will remain white.

The structuring element is then shifted to the right and the same condition is observed. At position **,
complete overlapping is not present; thus, the black square marked with ** will be turned to white.

The structuring element is then shifted further until its center reaches the position marked by ***. Here, we
see that the overlapping is complete, that is, all the black squares in the structuring element overlap with
the black squares in the image.

Hence, the center of the structuring element corresponding to the image will be black. Figure b shows the
result after the structuring element has reached the last pixel. Erosion is a thinning operator that shrinks an
image. By applying erosion to an image, narrow regions can be eliminated while wider ones are thinned.
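
And the matching erosion check; again the test mask is an assumption.

X = false(7);  X(2:6, 3:5) = true;            % wider object ...
X(4, 2:6) = true;                             % ... with a one-pixel-wide arm
B = strel('square', 3);
Xe = imerode(X, B);                           % the thin arm disappears; the wide part is thinned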

Morphological opening and closing

To finish up segmentation and morphological operations, it is important to know the compound operations of opening and closing. While
erosion and dilation are very useful, these operations tend to change the size of the mask. Opening and closing operations refine the shape of the
objects in the mask while keeping their size roughly the same.
Opening
Morphological opening is an erosion followed by a dilation. I will leave it to you to verify each step, but an example of the
operation is shown in the diagram below.

Note that the final output mask has a size that is similar to the original input mask.

To implement this operation in MATLAB:

1. Read in the image FILE into the variable inputMask

inputMask =
9×9 logical array
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 1 1 1 0 0 0 0
0 0 1 1 1 0 0 0 0
0 0 1 1 1 1 1 0 0
0 0 1 1 1 1 1 0 0
0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0

2. This time, let's define a disk-shaped structuring element, and display it:

>> se = strel('disk', 1);
>> % Display the structuring element
>> imshow(se.Neighborhood, 'InitialMagnification', 1000)

3. To carry out morphological opening, use the function imopen(input_mask, structuring_element):

>> outputMask = imopen(inputMask, se)

outputMask =
9×9 logical array
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0
0 0 1 1 1 0 0 0 0
0 0 1 1 1 1 0 0 0
0 0 1 1 1 1 1 0 0
0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0

The net effect of a morphological opening is to keep only the pixels of the input mask which fit the structuring element. Thus,
opening tends to remove pixels or open up gaps in the image.

To illustrate this last point, let's use a different example image.

1. Read in the image blobs.png. This image should already be available with MATLAB as it comes with the Image Processing Toolbox.

>> inputMask = imread('blobs.png');
>> imshow(inputMask)

2. Let's create a larger structuring element to make the change more obvious:

>> se = strel('disk', 5);

3. Run the opening operation and display the result in a new window:

>> outputMask = imopen(inputMask, se);
>> figure; imshow(outputMask)

As you can see from the output mask, opening with a 5 px radius disk has essentially removed most of the objects in the image other than
the disk-shaped objects.
Hence, this operation could be useful for selecting objects of a certain shape or for smoothing the outline of objects in the image. In the
images below, the input image (left) has both dots and lines, while the output image (right) after opening has only dots. You can try removing
the lines yourself using the image dotsandlines.png.


Closing
Morphological closing is a dilation followed by an erosion (i.e. the reverse of the operations for an opening). As you might expect,
the net effect of the closing operation is to remove background pixels that fit the structuring element. In other words, closing fills in
gaps in the image.

The function for morphological closing is imclose(input_mask, structuring_element). As we did last time, I'll leave you to work through
the example above in MATLAB.

If you tried closing the image blobs.png with a 5 px radius disk, you should get an image that looks similar to this:


Note that the closing operation has filled in gaps that are smaller than or the same size as the structuring element. Hence, closing is
useful for connecting objects that are near each other, while not affecting objects that are too far apart.

Hit-and-Miss Transform

Common Names: Hit-and-miss Transform, Hit-or-miss Transform


Brief Description

The hit-and-miss transform is a general binary morphological operation that can be used to look for
particular patterns of foreground and background pixels in an image. It is actually the basic operation of
binary morphology since almost all the other binary morphological operators can be derived from it. As
with other binary morphological operators it takes as input a binary image and a structuring element, and
produces another binary image as output.

How It Works

The structuring element used in the hit-and-miss is a slight extension to the type that has been introduced
for erosion and dilation, in that it can contain both foreground and background pixels, rather than just
foreground pixels, i.e. both ones and zeros. Note that the simpler type of structuring element used with
erosion and dilation is often depicted containing both ones and zeros as well, but in that case the zeros
really stand for `don't care's', and are just used to fill out the structuring element to a convenient shaped
kernel, usually a square. In all our illustrations, these `don't care's' are shown as blanks in the kernel in
order to avoid confusion. An example of the extended kind of structuring element is shown in Figure 1. As
usual we denote foreground pixels using ones, and background pixels using zeros.

Figure 1 Example of the extended type of structuring element used in hit-and-miss operations. This
particular element can be used to find corner points, as explained below.

The hit-and-miss operation is performed in much the same way as other morphological operators, by
translating the origin of the structuring element to all points in the image, and then comparing the
structuring element with the underlying image pixels. If the foreground and background pixels in the
structuring element exactly match foreground and background pixels in the image, then the pixel
underneath the origin of the structuring element is set to the foreground color. If it doesn't match, then that
pixel is set to the background color.

For instance, the structuring element shown in Figure 1 can be used to find right angle convex corner
points in images. Notice that the pixels in the element form the shape of a bottom-left convex corner. We
assume that the origin of the element is at the center of the 3×3 element. In order to find all the corners in a
binary image we need to run the hit-and-miss transform four times with four different elements
representing the four kinds of right angle corners found in binary images. Figure 2 shows the four different
elements used in this operation.

Figure 2 Four structuring elements used for corner finding in binary images using the hit-and-miss
transform. Note that they are really all the same element, but rotated by different amounts.

After obtaining the locations of corners in each orientation, we can then simply OR all these images
together to get the final result showing the locations of all right angle convex corners in any orientation.
Figure 3 shows the effect of this corner detection on a simple binary image.

Figure 3 Effect of the hit-and-miss based right angle convex corner detector on a simple binary image.
Note that the `detector' is rather sensitive.

Implementations vary as to how they handle the hit-and-miss transform at the edges of images where the
structuring element overlaps the edge of the image. A simple solution is to simply assume that any
structuring element that overlaps the image does not match underlying pixels, and hence the corresponding
pixel in the output should be set to zero.
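
A hedged MATLAB sketch of the corner detector follows. The interval below (1 = foreground, -1 = background, 0 = don't care) is one plausible encoding of a bottom-left convex corner, not a reproduction of Figure 1, and the test image reuses blobs.png from the opening example.

bw = imread('blobs.png') > 0;                   % binary test image
corner = [ 0  1  0;
          -1  1  1;
          -1 -1  0];                            % foreground / background / don't-care pattern
corners = false(size(bw));
for k = 0:3
    corners = corners | bwhitmiss(bw, rot90(corner, k));   % OR the four rotations together
end
imshow(corners)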

The hit-and-miss transform has many applications in more complex morphological operations. It is
used to construct the thinning and thickening operators, and hence for all the applications built on them.

GRAYSCALE MORPHOLOGY

If you are not familiar with binary morphology, you may want to read about it first. As mentioned in the section on binary
morphology, grayscale morphology is simply a generalization from 1 bpp (binary) images to images with multiple bits/pixel,
where the Max and Min operations are used in place of the OR and AND operations, respectively, of binary morphology. The
following should be noted:

 These operations, as with binary morphology, are nonlinear. The significance of nonlinear operations, as opposed to linear
operations, is that nonlinear operations can be used directly for making decisions about regions of the image. Nonlinear
operations are therefore of particular use in image analysis.
 The generalization can be seen in reverse: binary morphology is a specialization of grayscale morphology. The Max of a set
of values {0, 1} is equivalent to an OR, and the Min is equivalent to an AND.
 The generalization only applies to flat structuring elements. With grayscale images, dilation and erosion can be defined with
non-uniform structuring elements. These are rarely used, and will not be discussed further.
 With flat structuring elements, grayscale morphology is itself a specialization of rank-order filters. When a rank-order filter
acts on a source image, it generates a dest image where each pixel is computed as follows: place the structuring element with
its origin on the corresponding source pixel, and choose the rank (ordered) pixel in the set of source pixels that is covered by
the structuring element. Erosion is thus a rank-order filter with rank = 0.0; dilation has rank = 1.0; and the median filter,
which is useful for removing noise, has rank = 0.5. (To be completely accurate, when doing a dilation as described here,
invert the structuring element about its origin.)
 Because the standard photometry of binary and grayscale images is opposite (white is the minimum value in binary and the
maximum value in grayscale), grayscale dilation lightens a grayscale image and erosion darkens it, visually.
 Without special hardware registers that do Min and Max comparisons on 4 or 8 bytes simultaneously, grayscale morphology
must be done by comparing integers, one pixel at a time. It is thus much more than 8 times slower than implementations of
binary morphology that pack 32 pixels into each 32-bit word.
 This inefficiency is somewhat offset by the existence of a simple algorithm that computes grayscale morphological
operations in a time independent of the size of the structuring element. For the general case of an arbitrary SE of size A,
where A, the "area" of the SE, is the number of hits, the computation of the Max or Min requires A pixel comparisons on the
source image for each pixel of the destination. However, things get more efficient when the SE is a brick, by which we mean
a solid rectangle of hits. For a brick SE, dilation (or erosion) can be decomposed into a sequence of two dilations (or
erosions) on 1-D horizontal and vertical SEs. Each 1-D grayscale morphological operation can be done with fewer
comparisons than the "brute-force" method. Luc Vincent described one such method, which uses an ordered queue of pixels,
given by the set of pixels currently under the SE. The queue is ordered so that the Min or Max can be read off from the end.
When the SE is moved one position, one pixel is removed from the queue, and a new pixel is placed into it, in the proper
location. Most of the work goes into sorting the queue to maintain the order. More recently, van Herk, Gil and Werman have
described an algorithm that uses a maximum of 3 comparisons for a linear SE, regardless of the size of the SE.

Some useful image transforms are composed of a sequence of grayscale operations. For example, the tophat operator is often
used to find local maxima as seeds for some filling operation. It is defined as the difference between an image and the opening
of the image by a structuring element of given size. Another local maximum finder is called the hdome operator, and it uses a
grayscale reconstruction method for seed filling. (Yes, with h-domes, you do a seed-filling operation to find seeds for another
operation!)
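
Two of the points above can be checked directly in MATLAB: flat grayscale erosion, dilation, and the median are rank-order filters (ordfilt2), and the tophat is an image minus its opening (imtophat). The image name and the structuring-element sizes are assumptions.

f   = imread('rice.png');                      % grayscale example image
dom = ones(3);                                 % flat 3 x 3 structuring element
ero = ordfilt2(f, 1, dom);                     % rank 0.0 -> minimum -> grayscale erosion
dil = ordfilt2(f, 9, dom);                     % rank 1.0 -> maximum -> grayscale dilation
med = ordfilt2(f, 5, dom);                     % rank 0.5 -> median filter
th  = imtophat(f, strel('disk', 12));          % f minus its opening: small bright maxima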
