
Digital Image Processing

UNIT - 1

TYBSC Sem – VI
Lecturer - Dhawal S. Bhangale
What is DIP?
• Processing of an image in digital form by a
digital computer.

Why process?
• Improvement of pictorial info for human
perception.
• For autonomous machine applications like quality
control, assembly automation.
• Efficient storage & transmission, e.g. reducing disk
space and transmission time through a low-
bandwidth channel.
A. Human Perception
• Noise filtration.
• Improve certain characteristics of an image
like
– Contrast enhancement
– De-blurring – blurriness may appear due to
hand shake, a moving platform or wrong
camera settings.
• Remote sensing – images taken from
satellite.
Filtering
Image Enhancement
Colored Image Enhancement
De-blurring
Medical Imaging
Remote Sensing
Machine Vision Applications
• Here we are not much interested in improving the
visual quality of an image, but in processing the
image to extract information or features that can
be used in applications like:
• Industrial machine vision, inspection.
• Automated target detection and tracking.
• Fingerprint detection.
• Aerial & satellite images for weather prediction
and forecasting.
Automated inspection
Video Sequence Processing
• A video sequence is a series of image frames
displayed one after another.
• Used for applications like;
• Detection of moving parts (targets) for security
surveillance.
• Find trajectory of moving objects.
• Monitoring boundaries of moving objects in
medical applications.
Video Sequence Processing
Image compression
• For storing or transmitting.
• An image usually contains a lot of redundancy that
can be exploited to achieve compression.
• 3 types of redundancies:
– Pixel Redundancy
– Coding Redundancy
– Psychovisual Redundancy
• For compression we try to remove redundancies,
retaining only the essential information in the image.
Image Compression
History
• In the 1920s DIP was used to transmit digital images
between London and New York. Only 2 shades were used.
• In 1921 photographic printing improved quality; 5
shades were used.
• By 1929 it reached up to 15 shades.
• In 1964, for printing moon pictures sent by
spacecraft, the technology was improved; modern
DIP actually started here.
Representing Digital Image
• An image is a 2D light intensity function f(x,y). At
any point (x,y), the value represents the intensity at
that point.
• f(x,y) = r(x,y) · i(x,y)
• r(x,y) – reflectivity of the surface at the
corresponding point of the image.
• i(x,y) – intensity of the incident light.
• Representing an analog image exactly in a
computer is impossible.
• Ideally an image has infinitely many points
representing the info.
• The value at any point may also have an
amplitude anywhere from 0 to infinity.
• Instead of storing such an infinite-sized image, we
store some sample values from the image.
• To locate the points to be stored we use a discretization
process.
• To store amplitude values we quantize the ideally
infinite amplitude range to a limited range.
• Image can be represented as a matrix.

      | f(0,0)     f(0,1)     ...  f(0,N-1)   |
I  =  | f(1,0)     f(1,1)     ...  f(1,N-1)   |
      | :          :               :          |
      | f(M-1,0)   f(M-1,1)   ...  f(M-1,N-1) |

• Each point is called a pixel (or pel).


Image
• Definition:- An image is a two-dimensional
function that represents a measure of some
characteristic such as brightness or color of a
viewed scene.
• An image is a projection of a 3D scene into a
2D projection plane.

• It can be defined as a two-variable function
f(x,y), where for each position (x,y) in the
projection plane, f(x,y) defines the light
intensity at this point.
Analog Image
• An analog image can be mathematically
represented as a continuous range of values
representing position and intensity.

• An analog image is characterized by a physical
magnitude varying continuously in space. For
example, the image produced on the screen of
a CRT monitor is analog in nature.
Digital image
• A digital image is composed of picture
elements called pixels.
• Pixels are the smallest sample of an image. A
pixel represents the brightness at one point.
• Conversion of an analog image into a digital
image involves two important operations,
namely, Sampling and Quantization.
Advantages of Digital Images
• The processing of images is faster and cost
effective.
• Digital images can be effectively stored and
efficiently transmitted from one place to
another.
• When shooting a digital image, one can
immediately see if the image is good or not.
Advantages of Digital Images
• Copying a digital image is easy. The quality of the
digital image will not be degraded even if it is
copied for several times.
• Whenever the image is in digital format, the
reproduction of the image is both faster and
cheaper.
• Digital technology offers plenty of scope for
versatile image manipulation.
Drawbacks of Digital Images
• Misuse of copyright has become easier because
images can be copied from the Internet easily.
• A digital file cannot be enlarged beyond a certain
size without compromising on quality.
• The memory required to store and process good-
quality digital images is very high.
• For real-time implementation of digital-image-
processing algorithms, the processor has to be
very fast because the volume of data is very high.
Human Visual System
Structure
of human
eye
Structure of human eye
• Horizontal cross-section of the eye.
• Nearly a sphere with a diameter of approx. 20 mm.
• 3 membranes: the cornea & sclera (outer cover), the
choroid, and the retina.
• The cornea is tough, transparent tissue; continuous
with the cornea, the sclera is an opaque membrane.
• The choroid lies just below the sclera. It contains
blood vessels giving nutrition to the eye. It is heavily
pigmented.
• At the anterior end it is divided into the ciliary body & iris.
Structure of human eye
• The iris diaphragm contracts and expands to control
the amount of light entering the eye.
• The central opening of the iris (the pupil) varies from 2–8 mm.
• The lens contains 60%–70% water.
• The lens is covered by a yellowish pigmentation; if this
increases with age it is called a cataract.
• Innermost membrane is retina.
• When eye is focused, light from outside object is
imaged on retina.
Structure of human eye
• 2 types of light receptors on the retina: cones & rods.
• Cones – 6 to 7 million, located in the central portion
of the retina, called the fovea. Sensitive to color. We can
resolve fine details with these because each one is
connected to its own nerve. Muscles controlling
the eye rotate the eyeball until the image of an
object of interest falls on the fovea. Cone vision is
known as photopic or bright-light vision.
Structure of human eye
• Rods – 75 to 150 million, distributed over the retinal
surface. Several rods connected to a single nerve
reduce the detail captured by these receptors.
No color identification. Sensitive to low levels of
illumination. Objects appearing colored in daylight
appear colorless in moonlight. This
phenomenon is called scotopic or dim-light
vision.
• The absence of receptors in the area where the
optic nerve emerges is called the blind spot.
Image Formation in the Eye
• The lens of the eye is flexible.
• The shape of the lens is controlled by tension in
the fibres of the ciliary body.
• The distance between the centre of the lens and
the retina (called the focal length) varies from
approximately 17 mm to about 14 mm, as the
refractive power of the lens increases from its
minimum to its maximum.
Image Formation in the Eye

• To calculate the size of the retinal image of any
object: for example, the observer is looking at a tree 15
m high at a distance of 100 m. If h is the height in
mm of that object in the retinal image, the
geometry of Fig. 2.3 yields 15/100 = h/17, or
h = 2.55 mm.
Sampling &
Quantization
Continuous Image and a scan line from
A to B in the continuous image.
Digitizing the coordinate values is called SAMPLING.
Digitizing the amplitude values is called QUANTIZATION.
Result of image sampling and
Quantization
Digital Image Representation
• A digital image is a two dimensional discrete
signal represented as an N X N array of
elements. Each element is a number
representing the intensity.
• E.g. a 4 X 4 image;
Digital Image Representation
• 7 X 7 Image Image in Graphical form

0 0 0 0 0 0 0
0 0 0 1 1 0 0
0 0 1 1 1 0 0
0 0 1 1 1 0 0
0 1 1 1 1 1 0
0 1 1 0 1 1 0
0 0 0 0 0 0 0
Neighbours of a Pixel
• Two types of neighbours a pixel P can have:
• 4 Neighbours:
    X
  X P X
    X
• 8 Neighbours:
  X X X
  X P X
  X X X
Resolution
• Resolution gives the degree of distinguishable
details.
• Spatial Resolution: the smallest visible detail in an
image; depends on the number of pixels, i.e. on sampling.
• Gray-Level Resolution: the smallest visible change in gray
level; depends on the number of gray levels, which is
generally an integer power of 2, e.g. 8, 16, 64, 256.
• If we use an insufficient number of gray levels in smooth
areas of an image, it causes “False Contouring”.
False Contouring

ORIGINAL FALSE CONTOURING


CLASSIFICATION OF DIGITAL IMAGES
• Raster Image:
– Rectangular array of regularly sampled values (pixels),
mapped on a grid; not easily scalable.
– Ex. scanned graphics and web graphics.
– Loses its quality when enlarged beyond its fixed and
limited number of pixels (resolution), because the computer
has to make up the missing information.
– Captured by optical scanners, digital cameras.
– They can show colors well, so suitable for detailed
images like photographs.
– Image formats – BMP, JPEG, GIF, PNG.
CLASSIFICATION OF DIGITAL IMAGES
• Vector Image:
– Made up of lines and curves that are
mathematically defined in the computer.
– Mathematical, so scalable, i.e. can be printed at
any size and displayed at any resolution without
losing detail.
– Because vector images can be scaled by any factor
without altering the resolution, they are suitable
for typography, line art and illustrations.
IMAGE TYPES
• Binary image: (black and white)
– Only two values 0 and 1
– Easy to recognize shapes, location of shapes in
image.
• Grayscale image:
– Each pixel is represented by a byte or word, the
value of which represents the light intensity at
that point.
– In an 8-bit image the brightness value ranges from 0 to
255.
• Color image:
– Modeled as a three-band monochrome image (i.e.
three values per pixel).
– Each pixel value is the brightness info in each
spectral band.
– Common color spaces (bands) – RGB, HSV (hue,
saturation, value) and CMYK (cyan, magenta, yellow,
black).
• Volume image:
– 3D image.
– Obtained from medical imaging equipment.
– Each data point is a voxel (volume pixel).
– Ex. CAT scan image.

• Multi-spectral image:
– Set of images taken in different bands (e.g. visible,
infrared, ultraviolet light, etc.)
– The images are generally not visible to humans, but the
info can be represented in visual form in RGB.
– Acquired for remote sensing applications.
– A hyperspectral image is a set of 224 images taken
in 224 bands at a specific location on the earth's
surface.
Image Processing System: Elements
• Image acquisition, storage, processing, display.
• Image Acquisition (ex. a camera): converts a real-world
image into a digital image.
• Old cameras – silver-halide-based film used to
capture the scene.
• Digital cameras – CCD (charge-coupled device)
or CMOS (complementary metal–oxide–
semiconductor) devices convert light into
electrical charges.
CCD
• Charged Coupled devices
• How CCD works ?
–Step 1: Exposure
–Step 2: Charge Transfer
–Step3: charge is converted to voltage and
amplified.
Converting Photons into Electric Charge
• A photon falls on a pixel → energy absorbed by the
silicon releases an electron.
• These electrons are collected and converted to a
voltage.
• Higher light level → more electrons released.
• Higher exposure time → more electrons released.
• Higher wavelength of light → lower electron count.
Potential Wells and Barriers (how electron charge
is collected in a CCD)
1. 3 components – conductive material (polysilicon)
kept on a semiconductor (silicon), surrounded by
highly insulating material (silicon dioxide).
2. +ve voltage on the polysilicon (aka gate electrode) →
potential well within the semiconductor beneath the gate.
3. -ve voltage on the gate → potential barrier within the
semiconductor beneath the gate.
4. Light falling onto the sensor silicon releases
electrons → collected within the well → electron
charge transferred and converted to voltage.
• CCDs are arranged in 3 possible configurations;
– Point scanning, line scanning, area scanning.
• Point scanning:-
– Single-cell detector or pixel.
– Advantages:- high resolution, simple detector.
– Disadvantages:- errors due to 2-dimensional
movement of the scanner, scanning speed,
complexity.
• Line Scanning (Line Array):-
– An array of single-cell detectors is placed along a single
axis.
– Advantages:- scanning speed better than the point
scanner, high resolution, less complicated than the
point scanner.
– Disadvantages:- pixel spacing in one direction limits
the resolution, scan time still high, high cost.
• Area scanner (Area Array):-
– A 2-dimensional array of detectors.
– Advantages:- no need to move the detector, highest
frame rate (fast scanning), low system complexity.
– Disadvantages:- limited resolution in both
dimensions, high cost.
Scanning Mechanism
• A scanner converts light into 1s and 0s, i.e. converts an
analog image to a digital image.
• For greater speed, CCD sensors are generally used.
• Components of Scanner:
1. An optical system
2. A light sensor
3. An interface
4. Driver software
• Quality of scan depends on 1 and 2.
• Productivity depends on 3 and 4.
Types of Scanners
• Drum Scanner
• High-quality scans for professional use. Better
quality than a flatbed scanner.
• Consists of a translucent cylindrical drum. The image
to be scanned is wet-mounted on the drum.
• The sensing element is a photomultiplier tube, which is
more sensitive than a CCD.
• Flying Spot Scanner
• Used for scanning still images usually
photographic slides.
• High quality pictures but complex to implement.
• A high-precision CRT tube with a flat face plate is
used to generate a blank raster (flying spot).
• The image of the raster is projected onto the film, and the
modulated light passing through the film or slide
is collected by a condenser lens system and focused
on the photomultiplier tube.
(Figure: a photographic slide.)
• Flatbed scanner
• Normal office-use scanners, with a CCD array.
• The image of the document to be scanned reaches the
CCD via a series of mirrors, filters and lenses.
• A lamp is used to illuminate the document. The scan head
contains the entire mechanism (mirrors, lens, filters,
CCD array). The scan head moves slowly across the
document; this movement is called the “PASS”.
• The image of the document is reflected by a series of
slightly curved mirrors (to focus the image on a
smaller surface) onto a lens, which then
focuses it on the CCD via a series of filters.
• A filter may be used to identify colors or for other
purposes.
• Scan Quality
• The quality of a scanned image is determined by its
resolution and color depth.
• Scanner resolution is measured in dpi, or dots per
inch.
• Scanner resolution can be classified as 1. optical
resolution and 2. interpolated resolution.
• Optical resolution is the actual number of sensor
elements per inch on the scanner head, usually
1600 dpi or 3200 dpi.
• Interpolated resolution is not real resolution,
because an algorithm is used to compute the
values in between dots. This makes interpolated
images look too smooth or slightly out of focus.
• Color depth indicates the number of colors that can be
used in the image. In general-purpose scanners it is
24-bit, but high-end scanners can have more.
Image File Formats
• A digital image is encoded in the form of
binary files for storage.
• Different file formats may compress the image
data by different amounts, but all image files
have 2 parts: the Header and the Image data.
• The header contains vital info like the image
format, size, type, etc.
GIF
• Graphics Interchange Format.
• Developed by CompuServe for sending graphical images
over phone lines using a modem; its LZW compression was
patented by Unisys.
• 8 bit color images only.
• Lossless compression so image quality preserved.
• Used to create smaller animations.
• Supports only 256 colors.
JPEG
• Standardized by the Joint Photographic Experts Group
under the International Organization for Standardization
(ISO).
• JPEG typically achieves 10:1 compression with
little perceptible quality loss.
• Good 24-bit color image support and a platform-
independent format.
• Lossy compression, so some image quality degradation.
PNG
• Portable Network Graphics
• Better than GIF and supports 16 million colors.
• Lossless image compression.
TIFF
• Tagged Image File Format. Aldus Corp, 1980s.
• Does not compress the image, so image quality is
preserved, but file size is large.
• It can handle multiple images and data in a single
file through the inclusion of “tags” in the header.
• Can support any range of image resolution, size,
and color depth.
Applications of DIP
• Medicine : DIP techniques like image
segmentation and pattern recognition are used
in digital mammography to identify tumours.
• Image registration and fusion helps in
extracting information from medical images.
• In telemedicine, lossless compression
algorithms allow the medical images to be
transmitted effectively from one place to
another.
• Forensics : personal identification. Different
biometrics used in personal identification are
face, fingerprint, iris, vein pattern, etc.

• Pre-processing techniques include edge
enhancement, denoising, skeletonisation, etc.
• Template-matching algorithms are widely used
for proper identification.
• Remote Sensing : Observations usually
consist of measurements of electromagnetic
radiation with different wavelengths of the
radiation carrying a variety of information
about the earth's surface and atmosphere.
• Planning, hydrology, agriculture, geology and
forestry.

• Image enhancement, image merging, image
classification, multispectral image processing and
texture classification.
• Communications : Video conferencing helps
people in different locations to interact live.
Fast information transmission is required.
• Effective image and video compression
algorithms like JPEG, JPEG2000, H.26X standards
help to transmit the data effectively for live video
conference.
• Automotives : 'night vision system'. Night
vision system helps to identify obstacles during
night time to avoid accidents. Infrared cameras
are invariably used in a night-vision system.
• Image enhancement, boundary detection and
object recognition.
CH. 2

2D SIGNALS AND SYSTEMS


Signal
• A signal is a variable that carries information.
• A signal is a function with one or more
variables.
• Example of one-dimensional signal is an ECG
signal.
• Whereas an image in a 2D signal. As intensity
of image changes in both x and y dimensions,
it is 2D signal.
System
• A system processes input signals and produces
output signals.
• A system is an operation, or a set of operations,
performed on an input signal to produce an output
signal.
2D Signals
• A 2D discrete signal is represented as x(n1,n2),
• where (n1,n2) is a pair of integers and x is a real
or complex value.
• In the case of an image, x(n1,n2) represents the pixel
intensity at the location (n1,n2).
• 2D Unit Impulse Sequence

• x(n1,n2) = δ(n1,n2) = { 1, n1 = n2 = 0
                          0, otherwise
• Line Impulse
– Vertical Line Impulse

• x(n1,n2) = δ(n1)
• i.e. x(n1,n2) will have
value 1 when n1 = 0
• Line Impulse
– Horizontal Line Impulse

• x(n1,n2) = δ(n2)
• i.e. x(n1,n2) will have
value 1 when n2 = 0
• Line Impulse
– Diagonal Line Impulse
• Diagonal impulse has value 1 only at the diagonal points.

• x(n1,n2) = δ(n1 + n2)


• i.e. x(n1,n2) will have
value 1 when n1 + n2 = 0
• Line Impulse
– Diagonal Line Impulse
• Another form of diagonal impulse is,

• x(n1,n2) = δ(n1 - n2)


• i.e. x(n1,n2) will have
value 1 when n1 - n2 = 0
Solve…
• Sketch the sequence x(n1,n2) = δ(2n1 - n2)
• Solution :- from the given equation it is clear that
x(n1,n2) will have a value only when 2n1 - n2 = 0
• ⇒ 2n1 = n2
Solve…
• Sketch the sequence x(n1,n2) = δ(n1 + n2 - 1)
• Solution :- from the given equation it is clear that
x(n1,n2) will have a value only when n1 + n2 - 1 = 0
• ⇒ n2 = 1 – n1
• ⇒ when n1 = 0, n2 = 1 and when n1 = 1, n2 = 0
Exercise…
• Sketch the following.
• x(n1,n2) = δ(2n1 + n2 - 1)
Exponential Sequence
• Defined by,
• x(n1,n2) = a^n1 · b^n2 ………. – ∞ < n1,n2 < ∞
– where a and b are complex numbers.
• For a = e^(jw1) and b = e^(jw2),
• x(n1,n2) = e^(j(w1n1 + w2n2))
• x(n1,n2) = cos(w1n1 + w2n2) + j·sin(w1n1 + w2n2)
Separable Sequence
• A signal x(n1,n2) is separable if it can be
represented as a product of the function of n1
alone and n2 alone.
• x(n1,n2) = x1 (n1) x2 (n2)
• e.g.
• Impulse sequence δ(n1,n2) is separable;
• => δ(n1,n2) = δ(n1) δ(n2)
• Also unit step sequence is separable;
• => u(n1,n2) = u(n1) u(n2)
Periodic Sequence
• A 2D signal is periodic if it repeats itself at
regularly spaced intervals.
• For a 1-dimensional signal,
• x(n + N) = x(n) ∀ n
• N is the period (an integer), and the frequency
of the period is 1/N.
• x(n1,n2) is periodic with period N1 × N2 iff
• x(n1,n2) = x(n1+N1, n2) = x(n1, n2+N2)
• Discrete signals are periodic in time only if their
frequency is rational.
• Note:- a rational number is any number that can be written as a
fraction of two integers.

• So if N is the period, then 1/N is the frequency.
• By trigonometry, a complete circle (that is, one
period) equals 2π, so the corresponding frequency is 1/(2π).
Solve…
• Determine whether the given signal is periodic:
• x(n1,n2) = cos( (π/4)·n1 + (π/2)·n2 )
• If periodic, determine the period.
• Solution: Given,
• x(n1,n2) = cos( (π/4)·n1 + (π/2)·n2 )
• Compare with x(n1,n2) = cos(w1n1 + w2n2).
• For the signal to be periodic,
• w1/2π and w2/2π should be rational.
• In the given problem w1 = π/4 and w2 = π/2.
• ⇒ w1/2π = (π/4) × (1/2π) = 1/8 ⇒ N1 = 8
• And w2/2π = (π/2) × (1/2π) = 1/4 ⇒ N2 = 4
• Because w1/2π and w2/2π are rational,
• ⇒ the given signal is periodic, with periods
• N1 × N2 = 8 × 4
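• A quick numerical check of this result (a minimal Python/numpy sketch; the array sizes are arbitrary choices):

import numpy as np

# Verify that x(n1,n2) = cos((pi/4)n1 + (pi/2)n2) repeats with N1 = 8, N2 = 4.
n1 = np.arange(32).reshape(-1, 1)   # column of n1 values
n2 = np.arange(32).reshape(1, -1)   # row of n2 values
x = np.cos((np.pi / 4) * n1 + (np.pi / 2) * n2)
print(np.allclose(x, np.cos((np.pi / 4) * (n1 + 8) + (np.pi / 2) * n2)))  # True
print(np.allclose(x, np.cos((np.pi / 4) * n1 + (np.pi / 2) * (n2 + 4))))  # True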
Exercise …

• Determine whether given signal is periodic,


• X(n1,n2) = cos(n1 + n2)
2D System
• 2D system is a device or algorithm that performs
some prescribed operation on a 2D signal. If
x(n1,n2) is the input signal and y(n1, n2) is
output signal, then
• y(n1, n2) = T[x(n1,n2)]
• T denotes the transformation
Classification of 2D Systems
• Linear Vs. Non-Linear Systems
• The response of the system to a weighted sum of
signals should be equal to the corresponding
weighted sum of the output of the system to
each of the individual input signals. This is
superposition principle.
• T[ax1(n1,n2) + bx2(n1,n2)] = ay1(n1,n2) + by2(n1,n2)
• Here a and b are scalar constants
• The superposition principle obeys both the
scaling property and additive property.

• Scaling property,
• If input is x1(n1,n2) and output is y1(n1,n2)
• Then if input is a1x1(n1,n2), output is
a1y1(n1,n2)
• Additive property,
• Shift-Variant Vs. Shift-Invariant Systems
• Shift-Invariant system’s input-output
characteristics do not change with time.
• Shift invariance is given by,
• For T[x(n1, n2)] = y(n1, n2)
• T[x(n1-m1, n2-m2)] = y(n1-m1, n2-m2)
• Static Vs. Dynamic System
• A system is static or memoryless if its output at
any instant depends at most on the input sample
but not on the past and future samples of the
input.
• In any other case, the system is said to be
dynamic or to have memory.
• i.e. y(n) = x(n) is static
• But
• y1(n) = x1(n) + x1(n-1) + x1(n+2) is dynamic
2D Digital Filters
• Filters are used for many image processing
applications like image enhancement, image
deblurring, target matching, etc.

• There are 2 types of filters:
• Finite Impulse Response (FIR) filters
• Infinite Impulse Response (IIR) filters
• A system's impulse response h[n] is defined as
the output signal that results when an impulse is
applied to the system input.
• Why is this useful? It allows us to predict what
the system's output will look like in the time
domain, and to calculate the output of
these systems for any input signal.
FIR Filter
• If the filter output y(n1,n2) depends on
present and past inputs only, then it is a non-
recursive realization and the system is an FIR
system.
• e.g.
• y(n1,n2) = Σ_{q=0}^{M} b_q · x(n1−q, n2−q)
IIR Filters
• If the filter output y(n1,n2) depends on previous
outputs and on present and previous inputs, then the
filter is a recursive realization and the system is an IIR
system.
• e.g.
• y(n1,n2) = Σ_{p=1}^{N} a_p · y(n1−p, n2−p) + Σ_{q=0}^{M} b_q · x(n1−q, n2−q)
  (past outputs)                (present and past inputs)
2D CONVOLUTION

• In mathematics, convolution is a mathematical
operation on two functions (f and g) that
produces a third function (f ∗ g) that expresses
how the shape of one is modified by the other.
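• A direct implementation of 2D convolution (a minimal Python/numpy sketch; the function name and the test arrays are illustrative):

import numpy as np

def conv2d(x, h):
    # Full 2D convolution: y(n1,n2) = sum_k sum_l x(k,l) * h(n1-k, n2-l).
    M1, N1 = x.shape
    M2, N2 = h.shape
    y = np.zeros((M1 + M2 - 1, N1 + N2 - 1))
    for k in range(M1):
        for l in range(N1):
            y[k:k + M2, l:l + N2] += x[k, l] * h
    return y

x = np.array([[1, 2], [3, 4]])
h = np.array([[1, 1], [1, 1]])
print(conv2d(x, h))  # 3x3 full result: [[1,3,2],[4,10,6],[3,7,4]]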
• Properties of 2D Convolution

• Commutative Property
– x (n1, n2) ∗ y (n1, n2) = y (n1, n2) ∗ x (n1, n2)

• Associative Property
– (x (n1, n2) ∗ y (n1, n2)) ∗ z (n1, n2)
= x (n1, n2) ∗ ( y (n1, n2) ∗ z (n1, n2))
• Distributive Property
– x (n1, n2) ∗ ( y (n1, n2) + z (n1, n2))
= x (n1, n2) ∗ y (n1, n2) + x (n1, n2) ∗ z (n1, n2)

• Convolution with Shifted Impulses
– x (n1, n2) ∗ δ (n1 − m1, n2 − m2)
= x (n1 − m1, n2 − m2)
2D Z-Transform
• The Z-transform converts a discrete time-domain
signal into a complex frequency-domain
representation.
• It is discrete time equivalent of LAPLACE
Transform.
• Applications:
• Discrete Signal Processing – analysis of digital
filters
• Analysing Linear Time Invariant (LTI) system
• Economics and population science, etc.
Image Transforms

• A mathematical tool which allows us to
move a signal from one domain to another
(e.g. the time domain to the frequency
domain); useful in image processing and
image analysis.
Need for Transforms
• Mathematical Convenience:-
• Convolution in the time domain → multiplication
in the frequency domain.
• To Extract More Information:-
• Compared to seeing white light, seeing the spectrum
of the light gives more info, and the prism is the tool
for this transformation.
• Here, the person X is in the time domain and the
person Y is in the frequency domain. The tool
which allows us to move from time domain to
frequency domain is the TRANSFORM.
FOURIER TRANSFORM
• An image is a spatially varying function. That
means there are many variations in an image
from low to high and high to low frequencies.
• One way to analyze spatial variations is to
decompose an image into its component
frequencies.
• The Continuous Time Fourier Transform (CTFT) is
defined as,
• X(Ω) = ∫_{−∞}^{∞} x(t) e^{−jΩt} dt
• A continuous time signal x(t) is converted into
discrete time signal x(nT) by sampling process,
where T is the sampling interval.
• The Fourier transform of a finite-energy discrete-
time signal x(nT) is given by,
• X(e^{jω}) = Σ_{n=−∞}^{∞} x(nT) e^{−j(ΩT)n} ………. (1)
• The relation between ω and Ω is given by
• ω = ΩT
• Replacing Ω by 2πf,
• ω = 2πf × T ……….. (2)
• where T is the sampling interval and is equal to
1/fs. Replacing T by 1/fs,
• ω = 2πf × (1/fs) ………… (3)
• Let f/fs = k
• Replacing in eq. (3),
• ω = k × 2π
• To limit the infinite number of values to a finite
number,
• ω/2π = k/N
• The Discrete Fourier Transform (DFT) of a finite-
duration sequence x(n) is defined as,
• X(k) = Σ_{n=0}^{N−1} x(n) e^{−j(2π/N)kn}
• where k = 0, 1, 2, ..., N−1
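• The definition can be evaluated directly and checked against a library FFT (a small Python/numpy sketch; the test signal is arbitrary):

import numpy as np

# X(k) = sum_{n=0}^{N-1} x(n) e^{-j 2 pi k n / N}
x = np.array([1.0, 2.0, 3.0, 4.0])
N = len(x)
n = np.arange(N)
X = np.array([np.sum(x * np.exp(-2j * np.pi * k * n / N)) for k in range(N)])
print(np.allclose(X, np.fft.fft(x)))  # True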


Digital Image Processing
UNIT - 2

TYBSC Sem – VI
Lecturer - Dhawal S. Bhangale
IMAGE ENHANCEMENT IN
SPATIAL DOMAIN
Spatial domain
• Means working with pixel values, working
directly with the raw data.
• The modified image
can be expressed as,
g( x , y ) = T [ f( x , y ) ]
where T is the
Transformation applied
• Spatial domain transformations are carried out in
2 ways:
Point processing, Neighborhood processing.
• Point Processing:- Work with single pixel.
T is 1 X 1 operator.
• Identity transform: the original image
does not change; the output pixel value s equals
the input pixel value r.
(Plot: modified gray level s (output pixel value)
vs. original gray level r (input pixel value).)


• Point processing operations:-
• Digital Negative:-
• Inverting the gray levels. Used in displaying X-ray
images.
• The digital negative transform is given by:
• s = ( L – 1 ) – r ……. L is the number of gray levels (generally 256)
• So when r = 0, s = 255, and when r = 255, s = 0.
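• A one-line implementation for 8-bit images (a Python/numpy sketch; the sample array is illustrative):

import numpy as np

def negative(img):
    return 255 - img          # s = (L - 1) - r with L = 256

r = np.array([[0, 128, 255]], dtype=np.uint8)
print(negative(r))            # [[255 127   0]]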

• Contrast Stretching:-
• We may get low-contrast images due to poor
illumination or a wrong setting of the lens aperture.
Contrast stretching makes the dark portion darker
and the bright portion brighter.
• The formula for contrast stretching is (the standard
piecewise form, with breakpoints (r1, s1) and (r2, s2)):
• s = l·r, for 0 ≤ r < r1
• s = m·(r − r1) + s1, for r1 ≤ r < r2
• s = n·(r − r2) + s2, for r2 ≤ r ≤ L−1
• where l, m and n are slopes, and
• l, n < 1 and m > 1
• Thresholding:-
• Extreme contrast stretching yields Thresholding.
• i.e. if r1=r2 and s1=0 and s2=L-1.
• Gray Level Slicing:- Used to highlight a specific
range of gray values. This can help identify flaws
in X-rays and CAT scans.
• There are two methods;
1. slicing without background
2. Slicing with Background
• Bit plane slicing:-
• We can find the contribution made by each bit in
an image.
• For an 8-bit image we have 8 bit planes.
• These bit planes range from bit plane 0 (the least
significant plane) to bit plane 7 (the most significant
plane), i.e. 8 images, showing the importance of
each bit in the original image. Can be used for
image compression and steganography.

• Higher-order bits (especially the top four) contain
the majority of the visually significant data. The
other bit planes contribute the more subtle details
in the image.
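• Extracting a bit plane takes one shift and one mask (a Python/numpy sketch; sample values are illustrative):

import numpy as np

def bit_plane(img, k):
    # Plane k of an 8-bit image: 0 = least significant, 7 = most significant.
    return (img >> k) & 1

img = np.array([[200, 13], [255, 0]], dtype=np.uint8)
print(bit_plane(img, 7))      # MSB plane: [[1 0] [1 0]]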
• Dynamic range compression (log
transformation):- s = c log (1 + r)
• Where, c = 255/(log(1 + max_input_pixel_value)) and 1 is added
to r because, log(0) is infinity.
• Sometimes in an original image the difference
between the highest pixel value and the lowest
pixel value (the dynamic range) is huge. Due to this,
pixels with low values look obscured. E.g. in the
daytime we can't see stars even though they are
there. The opposite is also true.
• Expands the dark pixels in the image while
compressing the brighter pixels and Compresses
the dynamic range
• Clearly, the low intensity values in the input
image are mapped to a wider range of output
levels. The opposite is true for the higher values.
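• The transformation in code (a Python/numpy sketch; the scaling constant c follows the formula above):

import numpy as np

def log_transform(img):
    # s = c * log(1 + r), with c chosen so the output spans 0..255.
    r = img.astype(np.float64)
    c = 255.0 / np.log(1.0 + r.max())
    return (c * np.log(1.0 + r)).astype(np.uint8)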
Image Negatives, Power-Law & Log Transformations
• Power-law transformation (gamma correction):-
• s = c·r^γ
• where c and γ (the gamma correction factor) are positive
constants.
(Figure: magnetic resonance (MR) image of a fractured
human spine: the original, and gamma-corrected results
with C = 1 and γ = 0.6, 0.4 and 0.3.)

• Neighborhood Processing:-
• In this method, we change the value of pixel f(x,y)
based on the values of its 8 neighbors.
• We will calculate g(x,y) and put it in the center
place. Then the mask moves to next pixel.
• What are frequencies in an image?
(Figure: image regions labeled LOW FREQUENCY,
HIGH FREQUENCY and LOW FREQUENCY.)
• If the gray levels change very rapidly, the frequency of
that image region is high, and vice versa.
• Image background → low freq.; edges → high freq.
• Image Noise
• During image acquisition and transmission noise
gets added.
• Two types;
– Gaussian Noise
– Salt and Pepper Noise
• 1. Gaussian Noise:-
• Arises during acquisition: poor illumination, high
temperature, electronic circuit noise.
• Can be removed using a mean filter, median filter or
Gaussian smoothing filter.
• 2. Salt and Pepper Noise:-
• Impulse / speckle noise.
• Sparsely occurring white and black pixels.
• Can be caused due to sharp and sudden
disturbances in the image signal.
• Can be removed using
median filter,
morphological filter.
• Low Pass Filtering
• Low pass averaging filter/mean Filter:-
• Can be used to remove Gaussian noise. It
achieves filtering by blurring the noise.
• Frequency response & spatial response.
• For an averaging mask, all coefficients are positive.
The idea is to replace every pixel value in an
image by the average of the gray levels in its
neighborhood; this process reduces sharp
transitions in gray levels.
• We can notice that the edges are blurred now: the
sudden transition from value 10 to 50 is reduced
to 10 → 23.3 → 36.6 → 50.
• Some other masks are also useful.
• Using the second type of mask is called weighted
averaging.
(Figure: (a) original image; (b)–(f) results of smoothing
with square averaging filter masks of sizes n = 3, 5,
9, 15 and 35, respectively.)
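• A 3×3 averaging filter reproduces the edge-smoothing numbers quoted above (a Python sketch using scipy; the toy image is illustrative):

import numpy as np
from scipy.ndimage import uniform_filter

img = np.full((4, 6), 10.0)
img[:, 3:] = 50.0                        # sharp vertical edge: 10 | 50
smoothed = uniform_filter(img, size=3)   # 3x3 averaging mask
print(smoothed[0])                       # 10 -> 23.3 -> 36.7 -> 50 across the edge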
• Limitations of the Low Pass / Mean / Averaging Filter:
– The averaging operation leads to blurring of the image.
– The averaging operation attenuates and diffuses impulse
noise in the image but cannot remove it.
– A single pixel with a very unrepresentative value can
significantly affect the mean value of all the pixels in its
neighborhood.
• Low pass median filter:-
• Replaces the value of a pixel by the median of the
neighborhood.
• Median filters are particularly effective in the
presence of impulse noise, also called salt-and-
pepper noise.
• e.g.

(Figure: median filtering example, steps 1–4.)
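• Removing an impulse with a 3×3 median mask (a Python sketch using scipy; the toy image is illustrative):

import numpy as np
from scipy.ndimage import median_filter

img = np.full((5, 5), 20, dtype=np.uint8)
img[2, 2] = 255                          # a 'salt' pixel
clean = median_filter(img, size=3)       # 3x3 median mask
print(clean[2, 2])                       # 20 -- the impulse is gone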
• Highpass Filtering
• Eliminates low-frequency regions and enhances
high-frequency regions. Applying this filter
to an image may result in removal of the background
and enhancement of fine details and edges.
• So it is also used to sharpen images.
• Consider this image with a sharp edge. If we
apply above mask on this image output will be;

• We can observe 1) negative values, which are
invalid, and 2) very high pixel values that cannot be
displayed.
• By setting all negative values to 0 we can solve the 1st
issue. To remove the 2nd issue we can use a
scaling component in the mask, which scales
the pixel values down to realizable values.
• High-Boost Filtering:-
• In highpass filtering we may get rid of the complete
background of the image. If we want to enhance
edges without losing the background, we
can go for high-boost filtering.

• High pass = original – low pass
• fsharp(x,y) = f(x,y) – flow(x,y)
• To get a high-boost image we change this to:
• fhb(x,y) = A · f(x,y) – flow(x,y)
• fhb(x,y) = (A – 1) · f(x,y) + f(x,y) – flow(x,y)
• fhb(x,y) = (A – 1) · f(x,y) + fsharp(x,y)
• High boost = (A – 1) · original + high pass

High pass filtered image.


• High boosted image
• Unsharp Masking: We can also get a sharp
image by subtracting a blurred version of image
from original image. Used for edge enhancement.
• Steps are given below:
– Blur filter the image
– Subtract blurred image from original image
– Multiply the result with some weight
– Add result obtained to the original image
• f′(m,n) = f(m,n) + k·[f(m,n) – f̄(m,n)]
– f(m,n) – original image
– f̄(m,n) – blurred version
– k – the weight from the steps above
– f′(m,n) – sharpened result
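• The four steps in code (a Python sketch using scipy; the box blur and the weight k are illustrative choices):

import numpy as np
from scipy.ndimage import uniform_filter

def unsharp(img, k=1.0, size=3):
    f = img.astype(np.float64)
    blurred = uniform_filter(f, size=size)                  # step 1: blur
    mask = f - blurred                                      # step 2: subtract
    return np.clip(f + k * mask, 0, 255).astype(np.uint8)   # steps 3-4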
• Zooming
• Can be done by two methods;
– Replication, Interpolation
• Replication:-
• Simply replicating each row and column.
• This is a very easy method of zooming, but if the
image is zoomed to larger sizes, clusters of gray
levels are formed, making the image look patchy.
(Figure: original image and zoomed image.)
• Linear Interpolation:-
• Average of two adjacent pixels along the row is
taken and placed between those pixels, same
operation is performed along the columns.
• Interpolation along rows:
v(m, 2n) = u(m,n); 0 ≤ m ≤ M−1, 0 ≤ n ≤ N−1
v(m, 2n+1) = ½{u(m,n) + u(m,n+1)}
• Interpolation along columns:
v(2m, n) = u(m,n); 0 ≤ m ≤ M−1, 0 ≤ n ≤ N−1
v(2m+1, n) = ½{u(m,n) + u(m+1,n)}
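• Both zooming methods in a few lines (a Python/numpy sketch; the 2×2 input is illustrative):

import numpy as np

u = np.array([[1, 3], [5, 7]], dtype=np.float64)

# Replication: repeat each row and column.
zoom_rep = np.repeat(np.repeat(u, 2, axis=0), 2, axis=1)
print(zoom_rep)

# Linear interpolation along rows: place the average of each
# adjacent pair of pixels between them.
v = np.empty((u.shape[0], 2 * u.shape[1] - 1))
v[:, 0::2] = u
v[:, 1::2] = 0.5 * (u[:, :-1] + u[:, 1:])
print(v)   # [[1. 2. 3.] [5. 6. 7.]]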
Histogram
• It is a plot of the number of occurrences of gray
levels in the image against the gray-level values.
• provides more insight about image contrast and
brightness.
– The histogram of a dark image will be clustered
towards the lower gray level.
– The histogram of a bright image will be clustered
towards higher gray level.
– For a low-contrast image, the histogram will not be
spread equally, that is, the histogram will be narrow.
– For a high-contrast image, the histogram will have an
equal spread in the gray level.
• Histogram Linear Stretching
• We do not alter the basic shape of the histogram,
but we spread it to cover the entire range.
• The new gray level to assign pixels to:
s = T(r) = ((s-max – s-min) / (r-max – r-min)) · (r – r-min) + s-min
• Where,
• s-max  max gray level of output image
• s-min  min gray level of output image
• r-max  max gray level of input image
• r-min  min gray level of input image
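• The stretching formula in code (a Python/numpy sketch; stretching to the full 8-bit range is an example choice):

import numpy as np

def stretch(img, s_min=0, s_max=255):
    r = img.astype(np.float64)
    r_min, r_max = r.min(), r.max()
    s = (s_max - s_min) / (r_max - r_min) * (r - r_min) + s_min
    return s.astype(np.uint8)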
Histogram Equalization
• A perfect image is one which has an equal number of
pixels in all its gray levels.
• Equalization is a process that attempts to spread
out the gray levels in an image so that they are
evenly distributed across their range.

• It is a technique where the histogram of the
resultant image is as flat as possible. Histogram
equalization provides more visually pleasing
results across a wider range of images.
• Procedure to Perform Histogram Equalisation

– The maximum pixel value is 5, so we need three bits of
storage, giving eight possible gray levels from 0 to 7. The
histogram of the input image is given below:
• Step 1 Compute the running sum of histogram values.

• Step 2 Divide the running sum obtained in Step 1 by the


total number of pixels. In this case, the total number of
pixels is 25.
• Step 3 Multiply the result obtained in Step 2 by the
maximum gray-level value, which is 7 in this case.
• Step 4 Mapping of gray
level by a one-to-one
correspondence:

• The original image and the histogram-equalised image
are shown side by side.
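• The four steps in code (a Python/numpy sketch; function and variable names are illustrative):

import numpy as np

def equalise(img, levels=256):
    hist = np.bincount(img.ravel(), minlength=levels)   # histogram
    run_sum = np.cumsum(hist)                            # Step 1: running sum
    frac = run_sum / img.size                            # Step 2: divide by pixel count
    s = np.round(frac * (levels - 1)).astype(np.uint8)   # Step 3: scale by max gray level
    return s[img]                                        # Step 4: one-to-one mapping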
• Let's take an example:
• Equalize the following histogram.
• (Left as homework.)
IMAGE ARITHMETIC
• Image Addition
• Used to create double exposure. Superimposing
one image upon another.
• c (m, n) = f (m, n) + g (m, n)
• e.g. if we have captured 2 images of the same location on the same
date and time and 1 image has some noise, then that part can
be compensated from the other image through image addition.
• Image Subtraction
• Used to find the changes between two images of
a same scene.
• c (m, n) = f (m, n) − g (m, n)
• Image subtraction can be used to remove certain
features in the image.
• Image Multiplication
• Image multiplication is basically used for masking.
Used for background suppression.
• To extract an area from an image, that area can
be multiplied by one and the rest part by zero.
• Image Division
• Dividing the pixels in one image by the
corresponding pixels in a second image is
commonly used in transformation.
• The result of image division is just opposite to
that of image multiplication.
• Alpha Blending
• Alpha blending refers to addition of two images,
each with 0 to 1 fractional masking weights.
Alpha blending is useful for transparency and
compositing(the process of combining multiple images to
form a single image).
• If a and b are two images and c is output image,
• c = (1−α) a + αb
• α is the user-defined value.
• Different values of the variable α will increase the
emphasis on the image a or image b.
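• Alpha blending in code (a Python/numpy sketch; a and b are assumed to be same-sized 8-bit arrays):

import numpy as np

def blend(a, b, alpha):
    # c = (1 - alpha) * a + alpha * b
    return ((1.0 - alpha) * a + alpha * b).astype(np.uint8)

# alpha = 0 gives image a, alpha = 1 gives image b;
# intermediate values mix the two.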
(Figure: image a and image b blended with different values of α.)
• c = (1−1)·a + 1·b (α = 1)        c = (1−0.001)·a + 0.001·b (α = 0.001)
• c = (1−0.2)·a + 0.2·b (α = 0.2)  c = (1−0.7)·a + 0.7·b (α = 0.7)

IMAGE ENHANCEMENT IN
FREQUENCY DOMAIN
• Image of a damaged IC chip
• 2nd image shows its Fourier
transform.
• Brightness at the center
indicates image with more low
frequency components.
Diagonal lines indicate
existence of sharp edges
diagonally in original image.
• We should remember center of
DFT represents low frequency
components and peripheries
represents high frequency
components
• Because a Fourier transform is perfectly reversible, if
we compute the DFT of an image and then
immediately inverse transform the result, we
regain the same image.
• But if we compute the DFT of an image and then
accentuate certain frequency components and
attenuate others by multiplying the Fourier
coefficients by suitable weights, the
corresponding changes in the spatial form can be
seen after an inverse transform is computed. This
is called Fourier filtering or frequency-domain
filtering.
LOW PASS frequency domain filters
• Ideal Low Pass Filter (ILPF)
• The simplest filter. It cuts off all high-frequency
components of the Fourier transform that are at a
distance greater than a specified distance (the
cut-off frequency) D0 from the origin.

• Broadly classified into two types:
1. non-separable filter mask, and
2. separable filter mask.
Non-separable filter transfer function:
H(k,l) = 1 if D(k,l) ≤ D0, and 0 if D(k,l) > D0,
where D(k,l) is the distance from the origin of the
centered frequency plane.
Separable filter transfer function:
H(k,l) = 1 if |k| ≤ k0 and |l| ≤ l0, and 0 otherwise.
• The ideal filter is not realizable using
physical components; it can be realized only in
computer programs.
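• An ILPF program in a few lines (a Python/numpy sketch; the distance D is computed on the fftshift-centered spectrum):

import numpy as np

def ideal_lowpass(img, d0):
    F = np.fft.fftshift(np.fft.fft2(img))           # center low frequencies
    M, N = img.shape
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)  # distance from center
    H = (D <= d0).astype(float)                     # ILPF transfer function
    g = np.fft.ifft2(np.fft.ifftshift(F * H))
    return np.real(g)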
• Ringing effect in low pass filtering:-

• The reason behind this effect is the convolution
theorem. We know
g(x,y) = f(x,y) ∗ h(x,y), which is equivalent to
G(k,l) = F(k,l) · H(k,l)
• Now if we take the IDFT of H(k,l), it shows
oscillations / ripples.

• Hence when h(x,y) is displayed as an image, it shows
a dominant filtering component at the center and
unwanted concentric components, which cause the
ringing effect in the output image.
• Butterworth Low Pass Filter (BLPF):-
• The ringing effect of the ideal filter is caused by its sharp
cut-off frequency; to get rid of it we remove the sharp
cut-off in the BLPF.
• The transfer function of the BLPF is
• H(k,l) = 1 / [1 + (D(k,l)/D0)^(2n)]
• where n is the order of the function (order:- how many
past values are used to calculate the current value).
As n increases, it tends to behave like the ideal filter.
• Filtered using Ideal filter.
With ringing

• Filtered using Butterworth.


No rings visible
• BLPF functions with orders increasing from
n = 1, 2, 5, 20.
• We can observe that as the order increases, the ringing
effect appears and also increases.
• Gaussian Low Pass Filter (GLPF):-
• Given by
• H(k,l) = e^(−D²(k,l) / (2·D0²))
• The main advantage of the GLPF over the BLPF is that
there is no ringing effect no matter what filter
parameters we choose.
High Pass Frequency domain filters
• To enhance edges and abrupt changes in gray
levels in an image high pass filters are used.
• Ideal High Pass Filter (IHPF):-
• Defined by
• H(k,l) = 0 if D(k,l) ≤ D0, and 1 if D(k,l) > D0
• where D0 is the cut-off frequency.
• Ideal high-pass filters also suffer from the ringing
effect, so they are not suitable in many practical
situations.
• Butterworth High Pass Filter (BHPF):-
• Any high-pass filter can be calculated as
• H_BHPF(k,l) = 1 – H_BLPF(k,l), where BLPF is the
Low-Pass Butterworth Filter.
• The BHPF causes no ringing effect at lower n (order)
values.

• With D0 = 10
Homomorphic Filtering
• An image f(n1,n2) is characterized by two
components;
– The amount of light incident on scene/object
(illumination) :- i(n1,n2)
– Amount of light reflected by the object
(reflection):- r(n1,n2)
• Image f(n1,n2) is formed by;
f(n1,n2) = i(n1,n2) . r(n1,n2)
where 0 < i(n1,n2) < infinity AND
0 < r(n1,n2) < 1; 0 indicates
total absorption and 1 indicates total reflection.

• Illumination contributes to the low frequencies and
reflectance contributes to the high frequencies.
• To work with both components separately,
we take the logarithm of the input function:
z(n1,n2) = ln f(n1,n2) = ln i(n1,n2) + ln r(n1,n2)
• Taking the Fourier transform on both sides, we get
Z(k,l) = I(k,l) + R(k,l)
where I(k,l) and R(k,l) are the transforms of ln i and ln r.
• Now applying the desired filter function H(k,l),
S(k,l) = H(k,l) · Z(k,l)
• How to get back the original image? (transforming the
image from the frequency to the spatial domain)
– Calculate the IDFT of the above expression, s(n1,n2),
and for the antilog perform the exponential operation.
• The desired enhanced image is then
g(n1,n2) = exp( s(n1,n2) )
• As the homomorphic filter is a high-pass-type filter, to
decrease the contribution of low frequencies and
amplify the contribution of high frequencies,
• we choose γL < 1 and γH > 1.
DIP UNIT 2 CH 2, 3
Binary Image Processing,
Colour Image Processing
• Morphology is the science of appearance,
shape and organization.
• Mathematical morphology is a collection of
non-linear processes which can be applied to
an image to remove details smaller than a
certain reference shape.
BINARISATION
• Binarisation is the process of converting a
grayscale image to a black-and-white image.
• Grayscale image contains a pixel intensity range
of 0 to 255 levels.
• Global thresholding is used to set all pixels
above a defined value to white, and the rest of
the pixels to black.
• It is very important to decide the appropriate
threshold value to binarise the image.
MATHEMATICAL MORPHOLOGY
• Mathematical morphology is based on set theory.
• Morphological operators are based on set theory
operators.
• Black and White colour pixel sets in mathematical
morphology represent objects in an image.
• In morphological operations, black and white
pixels sets in different shapes and sizes are
applied to the image, these shapes are called as
structuring elements.

• Morphological operations are performed by
moving a structuring element over the binary
image pixel by pixel.
• A logical operation is performed on the pixels
covered by the structuring element.
LOGICAL OPERATIONS
• AND, OR operations (B = black, W = white):

  p   q   p AND q        p   q   p OR q
  B   B   B              W   W   W
  W   B   W              W   B   B
  B   W   W              B   W   B
  W   W   W              B   B   B

• NOT, XOR operations:

  p   NOT p              p   q   p XOR q
  W   B                  W   W   W
  B   W                  W   B   B
                         B   W   B
                         B   B   W
STANDARD BINARY
MORPHOLOGICAL OPERATIONS
• Dilation
• The dilation operation is defined as
• X ⊕ B = { z | (B̂)z ∩ X ≠ ∅ }
• where X is the original image,
• B is the structuring element, and
• (B̂)z is the image B rotated about the origin and shifted by
z positions.
• Structuring element is smaller in size than image.
• In this process the SE (Structuring Element) is
placed on the image and shifted from left to right
pixel by pixel.
• The process will look for any overlap of similar
pixels between the structuring element and the
binary image.
• If there exists an overlapping then the pixels
under the center position of the structuring
element will be turned to 1 or black.
• Lets take an example;
• Consider that the SE is
placed first at * position in
image. At this point, there
is no overlapping between
the black squares of B and
the black squares of X;
hence at position * the square will
remain white.
• When positioned at **, 1 black square
of SE is overlapping 1 black square of
image, hence ** position will be
changed to black. This process will
continue till the last pixel of image.
• The dilation is an expansion operator that
enlarges binary objects. Dilation has many uses,
but the major one is bridging gaps in an image.
• Erosion
• Erosion is the counter-process of dilation. If
dilation enlarges an image then erosion shrinks
the image.
• Erosion is defined by
• X ⊖ B = { z | (B)z ⊆ X }
• This equation states that the outcome element z
is considered only when the structuring element
is a subset of, or equal to, the binary image X.
• the process will look for whether there is a
complete overlap with the structuring element or
not. If there is no complete overlapping then the
image pixel overlapped by the center of the
structuring element will be set white or 0.
• Erosion is a thinning operator that shrinks an
image. By applying erosion to an image, narrow
regions can be eliminated, while wider ones are
thinned.
Original Image

Dilated Image Eroded Image
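• Dilation and erosion on a toy image (a Python sketch using scipy; the 3×3 object and structuring element are illustrative):

import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

X = np.zeros((7, 7), dtype=bool)
X[2:5, 2:5] = True                        # a 3x3 square object
B = np.ones((3, 3), dtype=bool)           # 3x3 structuring element

print(binary_dilation(X, structure=B).sum())  # 25 -- grows to 5x5
print(binary_erosion(X, structure=B).sum())   # 1  -- shrinks to the center pixel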


DILATION & EROSION BASED
OPERATIONS
• Opening
• X ∘ B = (X ⊖ B) ⊕ B, i.e. erosion followed by dilation,
• where X is an input image and B is a structuring
element.
• Opening smoothes the inside of the object
contour, breaks narrow strips and eliminates thin
portions of the image.
• The opening operation is used to remove noise
and CCD defects in the images.
• Closing
• X • B = (X ⊕ B) ⊖ B, i.e. dilation followed by erosion.
• The closing operation fills small holes and gaps.
• It also smoothes contours and maintains shapes
and sizes of objects.
DISTANCE TRANSFORM
• It calculates the distance between pixels of an image.
• Distance calculation is useful in medical imaging,
pattern recognition, comparison of binary images,
etc.
• Common metrics: Euclidean, city-block, chessboard and
quasi-Euclidean distance transforms.
Euclidean Distance
• D = √((x1 − x2)² + (y1 − y2)²)
City-Block Distance
• D = |x1 − x2| + |y1 − y2|
• This metric measures the path between the pixels
based on a four-connected neighborhood.
Chessboard Distance
• D = max(|x1 − x2|, |y1 − y2|)
• The chessboard distance metric measures the
path between the pixels based on an eight-
connected neighborhood.
Quasi-Euclidean Distance Transform
• The quasi-Euclidean metric measures the total
Euclidean distance along a set of horizontal,
vertical and diagonal line segments.
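• The metrics side by side (a Python/numpy sketch; the quasi-Euclidean formula follows the common piecewise definition, an assumption not spelled out on the slides):

import numpy as np

def distances(p, q):
    dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
    quasi = dx + (np.sqrt(2) - 1) * dy if dx > dy else (np.sqrt(2) - 1) * dx + dy
    return {"euclidean": float(np.hypot(dx, dy)),
            "city_block": dx + dy,
            "chessboard": max(dx, dy),
            "quasi_euclidean": float(quasi)}

print(distances((0, 0), (3, 4)))  # euclidean 5.0, city_block 7, chessboard 4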
CH 3
COLOUR IMAGE PROCESSING
• Color is the perceptual sensation of light in the
visible range incident upon the retina.
• Different wavelengths of light are perceived as
different colors.
• Frequency of light – Colour
• Amount of light – Intensity
• Light that has a dominant frequency or set of
frequencies is called chromatic.
• Achromatic light has no color and it contributes
only to quantity or intensity.
COLOUR FORMATION

• There are three colour formation processes:
(i) the additive process, (ii) the subtractive process, and
(iii) pigmentation.
• Additive Colour Formation
• Two colour frequencies and intensities get added.
• This is used in TV screens.
• Subtractive Colour Formation
• When light passes through a filter, part of the
light (some frequencies) gets absorbed and the rest
is transmitted.
• Series of filters can be used to get required
results.
• When color slides are projected onto a screen
colors are formed subtractively.
• Colour Formation by Pigmentation
• Pigment particles can reflect, transmit or absorb
the light that reaches them.
• These events determine the color of the light
reflected by the surface.
• Colour formation through pigmentation
• allows one to see colors in a painting.
COLOUR MODEL
• Color models standardize the way to specify a
particular color.
• It also specifies all constructible colours within a
particular model.
• Some popular color models are,
1. RGB
2. CMY
3. HSI
RGB
• RED, GREEN, BLUE
• Used in CRT monitor screens.
• RGB is an additive
color model.
• Magenta = Red + Blue
• Yellow = Red + Green
• Cyan = Blue + Green
CMY
• Cyan, Magenta, Yellow
• Printers commonly employ the CMY model.

• CMY is a subtractive color


model.
• Magenta = White – Green
• Cyan = White – Red
• Yellow = White –Blue
HSI
• Hue, Saturation and Intensity
1. Hue - dominant colour
2. Saturation - relative purity or the amount of
white light mixed with a hue
3. Intensity – brightness
• HSI decouples the intensity information from the
colour, while hue and saturation correspond to
human perception.
• The HSI model is useful for developing image-
processing algorithms.
• The conversion from RGB space to HSI space is
given below:
UNIT 3 CH 1, CH 2
Image Segmentation,
Image Compression
Image Segmentation
• It is a process of partitioning an image into
homogeneous groups of pixels based on some criteria.
• Different groups must not intersect with each
other, and adjacent groups must be
heterogeneous.
IMAGE-SEGMENTATION TECHNIQUES
Image Segmentation
– Local Segmentation
– Global Segmentation
Approaches: Region Approach, Edge Approach,
Boundary Approach
Region Approach
• Regions in an image are a group of
connected pixels with similar properties.

• In the region approach, each pixel is
assigned to a particular object or region.
Region Growing
• Here each neighbouring pixel is examined and
added to a region class if no edges are detected.
• If adjacent regions are found, both regions are
merged; weak edges are dissolved and strong
edges are left intact.
• The algorithm starts with a single pixel as a seed;
using a homogeneity criterion a region is grown
and then removed from the process.
• Then another pixel is chosen as a seed and the
algorithm continues until all pixels have been
allocated to a segment.
Region Splitting and Merging
• As the name suggests, it is a two-step process.
• It begins with the whole image and divides it up
such that the segregated parts are more
homogeneous than the whole.
• Then it merges any adjacent regions that are similar
enough.
• Algorithm:
1. Start with the whole image.
2. If the variance is too large, break it into
quadrants.
3. Merge any adjacent regions that are similar
enough.
4. Repeat steps (2) and (3) iteratively until no more
splitting or merging occurs.
(Figures: splitting and merging examples.)
IMAGE SEGMENTATION BASED ON
THRESHOLDING
• Thresholding techniques produce segments
having pixels with similar intensities.
• In thresholding, we establish boundaries in
images that contain solid objects resting on a
contrasting background.
Global Thresholding
• g(x,y) = 1 if f(x,y) ≥ θ, else 0 …. θ is the threshold value
• If, due to poor illumination, the image has different
areas with different intensities, variable
thresholding, i.e. a form of local thresholding, can be
used.
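• Global thresholding in code (a Python/numpy sketch; θ is supplied by the caller):

import numpy as np

def threshold(img, theta):
    # Pixels at or above theta become 1 (object), the rest 0 (background).
    return (img >= theta).astype(np.uint8)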
Adaptive Thresholding
• Global thresholding works when the image
foreground has an approximately uniform grey
level and the background also has a uniform grey
level, unequal to the foreground grey level.
• But if the background grey levels are not constant, or
the contrast varies within the image, then it is
better to use adaptive thresholds, choosing the
grey-level threshold per image area.
Histogram-Based Threshold Selection
• If we calculate Histogram for an image containing
object on a contrasting grey level background, it
will have 2 peaks, representing large number of
pixels inside and outside of the object. The dip
between the two peaks represents relatively low
number of pixels around the edge of the object.
This dip is commonly used for selecting threshold
grey level.
• If the objects in the image are not large, or the
image is noisy, the histogram may not show a
distinct dip.
EDGE-BASED SEGMENTATION
• Points in an image where brightness changes
abruptly are called edges or edge points.
• Step Edge: the intensity changes abruptly.
• Line Edge: when a segment is very narrow,
it has two edges in close proximity.
• Ramp Edge: a smoother transition between
segments.
• Roof Edge: two nearby ramp edges.
• Causes of Edges:
– shadows, texture, geometry, etc.
– discontinuities in the image intensity due to changes
in the image structure.
– Edge points are boundaries of objects and
background.
Edge Detection Using First-order and
Second-order Derivatives
Consider two regions having transitions in intensity
1.from dark to bright to dark
2.from bright to dark to bright
• The first derivative values are shown.
• The 1st derivative is positive at the leading edge of a
transition from dark to bright, and vice versa.
• The second derivative is positive on the darker side of
the edge and negative on the brighter side.
• The first-order derivative can be calculated using the
gradient operator, whereas the second-order derivative
uses the Laplacian.
• The gradient of the image f(x,y) at location (x,y) is
defined by the vector,
• ∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ, and |∇f| ≈ |Gx| + |Gy|
• ∂f/∂x = Gx = f(x+1, y) − f(x, y)
• ∂f/∂y = Gy = f(x, y+1) − f(x, y)
(Figure: gradient masks. Masks that include averaging
reduce noise; masks without averaging do not.)
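• First-difference gradient magnitude (a Python/numpy sketch; |Gx| + |Gy| follows the approximation above):

import numpy as np

def gradient_magnitude(img):
    f = img.astype(np.float64)
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    gx[:-1, :] = f[1:, :] - f[:-1, :]   # Gx = f(x+1, y) - f(x, y)
    gy[:, :-1] = f[:, 1:] - f[:, :-1]   # Gy = f(x, y+1) - f(x, y)
    return np.abs(gx) + np.abs(gy)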
• Second-order Derivative Method
• Differentiating the first derivative gives the
second derivative: the Laplacian,
• ∇²f = ∂²f/∂x² + ∂²f/∂y²
• ∂²f/∂x² = f(x+1, y) + f(x−1, y) − 2f(x, y), and
• ∂²f/∂y² = f(x, y+1) + f(x, y−1) − 2f(x, y)
• But this is not used alone in edge detection, as it
is very sensitive to noise and it gives double edges.
CH 2
IMAGE
COMPRESSION
Need for Compression
• Due to developments in the Internet,
teleconferencing, multimedia and high-definition
media technologies, storage and transmission
of this data becomes an issue.
• E.g. during a live Teams call, the screen is captured a
minimum of 10 times a second and transmitted to all
participants, so the data requirement is approx. 30 MB
per second.
• From this we can understand the need for image
compression.
REDUNDANCY IN IMAGES

• Images have redundant information.
• Compression is achieved through redundancy and
irrelevancy reduction.
Redundancy is broadly classified as:
• Statistical
– Interpixel (Spatial, Temporal)
– Coding
• Psychovisual
Statistical Redundancy
• Interpixel Redundancy
• If in an image neighbouring pixel values are
related to each other (i.e. neighboring pixels are
not statistically independent), the image has
interpixel redundancy.

• Coding redundancy
• In an image the information is represented in the
form of codes. Choosing efficient codes may
reduce image size.
Spatial Redundancy

• Spatial redundancy implies that there is a
relationship between neighboring pixels in an
image.
• Instead of representing each pixel in an image
independently, pixel value can be predicted from
its neighbors.
Temporal Redundancy

• Temporal redundancy is the statistical correlation
between pixels from successive frames in a video
sequence.
• Temporal redundancy is also called interframe
redundancy.
Psychovisual Redundancy

• In the Human Visual System, visual information is
not perceived equally.
• Some information may be more important than
other information.
• If less data is used to represent less important
visual information, perception will not be
affected.
IMAGE-COMPRESSION SCHEMES
• Lossless Compression or Reversible Compression
• The image after compression and decompression is
identical to the original image.
• This only achieves a modest compression rate.
• Preferred in the case of medical image
compression.
• Lossy Compression or Irreversible Compression
• The reconstructed image contains degradations
with respect to the original image.
• A higher compression ratio can be achieved.
• The term ‘visually lossless’ is often used to
characterize lossy compression schemes that
result in no visible degradation.
• Preferred in the case of multimedia applications.
RUN-LENGTH CODING
• It is effective when long sequences of the same
symbol occur.
• It exploits the spatial redundancy
• Run - repetition of a symbol
• Run-length - number of repeated symbols
• It maps a sequence of numbers into a sequence of
symbol pairs (run, value).
• It is used in Windows bitmap file format.
• There are 2 types (i) 1D run-length coding, and (ii)
2D run-length coding
1D Run-length Coding
• In this scheme each scan line is coded
individually.
• Each scan line is an alternating sequence of runs of
1s and 0s.
• Consider the following binary image:
• In RLC every run is converted to 2 values,
(run length, actual symbol).
• In the given example, the run-length coding for
the fourth row of the input image is
(4, 0), (10, 1), (4, 0).
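• A 1D run-length encoder (a Python sketch; the sample row matches the fourth-row example above):

def rle_encode(line):
    # Emit (run length, symbol) pairs for one scan line.
    pairs, run, symbol = [], 1, line[0]
    for s in line[1:]:
        if s == symbol:
            run += 1
        else:
            pairs.append((run, symbol))
            run, symbol = 1, s
    pairs.append((run, symbol))
    return pairs

row = [0] * 4 + [1] * 10 + [0] * 4
print(rle_encode(row))   # [(4, 0), (10, 1), (4, 0)]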

• 2D Run-length Coding
• Utilises the correlation between pixels in neighboring
scan lines to achieve higher coding efficiency.
HUFFMAN CODING

• The most popular technique for removing coding
redundancy.
• Huffman coding assigns a binary code to each
intensity value.
• Shorter codes are assigned to intensities with
higher probability, and longer codes to
intensities with lower probability.
• In the first step we combine the lowest probability
symbols into a single symbol that replaces them in
next source reduction.
• In the second step we code each reduced source,
starting with the smallest source and working
back to the original source.
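• The two steps in code (a Python sketch using heapq; the tiebreaker index and the dictionary-of-codes representation are implementation choices):

import heapq
from collections import Counter

def huffman(text):
    freq = Counter(text)
    # Each heap entry: [frequency, tiebreaker, {symbol: code-so-far}].
    heap = [[f, i, {s: ""}] for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # Step 1: combine the two
        f2, i, c2 = heapq.heappop(heap)   # lowest-probability symbols
        codes = {s: "0" + c for s, c in c1.items()}
        codes.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, [f1 + f2, i, codes])  # Step 2: codes grow back to the source
    return heap[0][2]

print(huffman("COMMITTEE"))   # M, T, E (prob. 2/9) get the shortest codes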
• Obtain the Huffman code for the word
‘COMMITTEE’:

  p(C) = 1/9 ≈ 0.1
  p(O) = 1/9 ≈ 0.1
  p(M) = 2/9 ≈ 0.2
  p(I) = 1/9 ≈ 0.1
  p(T) = 2/9 ≈ 0.2
  p(E) = 2/9 ≈ 0.2
(Figures: Step 1, source reduction; Step 2, code assignment.)
