
CAT1 Computer Vision Syllabus

1. Introduction
2. Understanding the fundamentals of Computer Vision and Images; apply and analyze the methods for pre-processing
3. What is computer vision? -A brief history
4. Image formation -Geometric primitives and transformations
5. Photometric image formation
6. Understanding Image Acquisition and the role of the camera
7. The digital camera
8. Image processing
9. Point operators
10. Understanding and applying the basics of image transformation
11. Linear filtering
12. More neighbourhood operators

CAT 1 Question

1 Discuss the components of image processing and analyse each component in detail. (7)
Ans:- Image processing is a field of study that involves the
manipulation of digital images using computer algorithms. It
involves a series of operations performed on an image to enhance
its quality, extract useful information, or transform it into a more
meaningful representation. The process of image processing is
composed of several stages, each with its own set of techniques
and algorithms. In this answer, we will discuss the various
components of image processing and analyze each of them in
detail.

The components of image processing can be broadly categorized into three main stages:

1. Image Acquisition
2. Image Enhancement
3. Image Analysis and Recognition

Let's examine each component in more detail.

1. Image Acquisition:

Image acquisition is the process of capturing or obtaining an image. This can be done using various types of imaging devices,
such as cameras, scanners, and sensors. The quality of the image
obtained depends on the type of device used and the environment
in which it was captured. In image processing, the goal of image
acquisition is to obtain high-quality images with sufficient
information for further processing.

2. Image Enhancement:

Image enhancement is the process of improving the visual quality of an image by removing noise, improving contrast, and sharpening
edges. This stage is usually performed after image acquisition to
improve the image's appearance and quality. There are various
techniques used in image enhancement, such as spatial domain
filtering, frequency domain filtering, histogram equalization, and
edge detection.

Spatial domain filtering involves modifying the pixel values of an image based on their surrounding pixels. It includes techniques
such as mean filtering, median filtering, and Gaussian filtering.
Frequency domain filtering, on the other hand, involves modifying
the image's frequency components using techniques such as
Fourier transform and wavelet transform.

Histogram equalization is a technique that enhances the contrast of an image by adjusting the pixel values' intensity levels. Edge
detection techniques are used to detect and enhance the edges in
an image.

3. Image Analysis and Recognition:

Image analysis and recognition involve the extraction of useful information from an image. This stage is usually performed after
image enhancement to improve the accuracy of the analysis. There
are various techniques used in image analysis and recognition, such
as segmentation, feature extraction, and classification.

Segmentation is the process of dividing an image into multiple regions or segments. Feature extraction involves identifying and
extracting relevant features from the image, such as texture, shape,
and color. Classification involves categorizing the image into
predefined classes based on the extracted features.
In conclusion, image processing is composed of three main stages:
image acquisition, image enhancement, and image analysis and
recognition. Each of these stages involves a set of techniques and
algorithms that are used to manipulate and extract useful
information from digital images. Understanding these components
is essential in developing efficient and accurate image processing
systems for various applications.

2 Discuss the Colour models like RGB, CMY, HSV, etc. (5)
Ans:- Color models are mathematical representations that define
how colors can be represented numerically.

1. RGB Color Model:

The RGB color model stands for Red, Green, Blue. It is an additive
color model used to represent colors in electronic systems, such as
computer monitors, televisions, and digital cameras. In this model,
colors are created by combining different intensities of red, green,
and blue light.

2. CMY Color Model:

The CMY color model stands for Cyan, Magenta, Yellow. It is a subtractive color model used in printing and other physical media.
In this model, colors are created by subtracting different amounts
of cyan, magenta, and yellow pigments from white paper.

3. HSV Color Model:

The HSV color model stands for Hue, Saturation, Value. It is a cylindrical color model used to represent colors in a way that is
more intuitive to humans than the RGB or CMY color models. In this
model, colors are represented by their hue, saturation, and value.
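
As a small illustration (a minimal sketch using Python's standard colorsys module; the RGB triple is arbitrary), an RGB colour can be converted to its HSV representation:

import colorsys

# An arbitrary colour expressed as 8-bit RGB values.
r, g, b = 255, 64, 32

# colorsys works on values in the range [0, 1], so normalise first.
h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

# Hue is returned as a fraction of a full turn; scale to degrees for readability.
print(f"Hue: {h * 360:.1f} deg, Saturation: {s:.2f}, Value: {v:.2f}")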

3 Mention the difference between Monochrome and grayscale image. (4)

Ans:- A monochrome image and a grayscale image are two different types of images
commonly used in digital image processing. Although they are often used
interchangeably, they have some fundamental differences.

A monochrome image is an image that consists of only one color, typically black or
white. It is a binary image where each pixel is represented by a single bit of
information, 0 or 1, indicating whether the pixel is black or white. Monochrome images
are commonly used in text-based documents, such as faxes, where only black or white
information is needed.

On the other hand, a grayscale image is an image that consists of different shades of
gray. Each pixel in a grayscale image is represented by a single value, typically ranging
from 0 to 255, indicating the intensity of gray. Grayscale images are commonly used in
image processing applications where color information is not necessary, but the
intensity or brightness of the image is important.

The main difference between monochrome and grayscale images is the number of
possible values that each pixel can take. In a monochrome image, each pixel can only
take one of two possible values, whereas in a grayscale image, each pixel can take any
value between 0 and 255. This difference in the number of values has implications for
image processing applications. For example, grayscale images can be used for more
complex image processing operations, such as edge detection, while monochrome
images are more suitable for simple operations like thresholding.

In summary, while both monochrome and grayscale images are commonly used in
image processing applications, they differ in the number of values that each pixel can
take. Monochrome images have only two possible values, black or white, while
grayscale images can take any value between 0 and 255, indicating different shades of
gray.
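
As a minimal sketch (assuming NumPy; the pixel values and the threshold of 128 are arbitrary), a grayscale image can be reduced to a monochrome (binary) image by thresholding:

import numpy as np

# A small 8-bit grayscale image: each pixel is an intensity in [0, 255].
gray = np.array([[ 12, 200,  90],
                 [130, 255,  60],
                 [  0, 180, 127]], dtype=np.uint8)

# Thresholding produces a monochrome (binary) image: each pixel is 0 or 1.
threshold = 128
binary = (gray >= threshold).astype(np.uint8)

print(binary)   # only two possible values per pixel
print(gray)     # up to 256 possible values per pixel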

4 Explain the simple image model. How the basic nature of image is characterized by the
two components, called illumination and reflectance components.(7)
Ans:-
The simple image model is a mathematical model used in digital image
processing to describe the basic nature of an image. According to this
model, any image can be decomposed into two components: the
illumination component and the reflectance component.

The illumination component refers to the lighting conditions under which the image was captured. It is the amount of light that falls on the scene and
affects the brightness of the image. This component includes factors such as
the direction, intensity, and color of the light source, as well as the presence
of shadows and reflections.

The reflectance component refers to the inherent properties of the objects in the scene that reflect the light. It is the pattern of colors and textures that
we see in the image. This component includes factors such as the color,
texture, and material of the objects in the scene.
The two components can be mathematically separated from each other
using various image processing techniques, such as statistical analysis or
filtering. This separation allows us to manipulate the illumination and
reflectance components separately, providing us with greater control over
the image and enabling us to enhance its quality and detail.
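
In the notation commonly used for this model, the image intensity f(x, y) at a point is written as the product of the two components:

f(x, y) = i(x, y) * r(x, y)

where i(x, y) > 0 is the illumination component and the reflectance component r(x, y) lies between 0 (total absorption) and 1 (total reflection).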

The simple image model is widely used in various image processing applications, such as image segmentation, image restoration, and object
recognition. By separating the illumination and reflectance components, we
can reduce the impact of lighting variations and other noise in the image,
leading to more accurate and reliable results.

In summary, the simple image model characterizes the basic nature of an image as being composed of two components: the illumination component
and the reflectance component. These components can be mathematically
separated from each other and manipulated independently, providing
greater control over the image and enhancing its quality and detail.

5 Define 2D Transformation Image Formation. (5)


Ans:- 2D transformation in image processing refers to the process of
manipulating the geometric properties of an image by applying a
set of mathematical operations to its pixel coordinates. These
operations can include translation, rotation, scaling, shearing, and
reflection.

Image formation refers to the process of generating a digital image from a physical scene. In image formation, light rays from the scene
are captured by a sensor, such as a camera or scanner, and
converted into digital signals that can be stored and processed by a
computer.

2D transformation is used in image processing to manipulate the geometric properties of an image in order to achieve various goals,
such as correcting distortion, aligning images, and creating special
effects. The transformation is applied by defining a set of
transformation matrices that describe how the pixel coordinates of
the image are to be transformed.

The most common types of 2D transformations include translation, rotation, and scaling. Translation involves moving the image
horizontally or vertically, rotation involves rotating the image
around a fixed point, and scaling involves changing the size of the
image. Shearing and reflection involve deforming the image in a
specific direction and reflecting it across an axis, respectively.

2D transformation is an important tool in image processing, as it allows us to manipulate the geometry of an image without
changing its content. This can be useful for a variety of applications,
such as image registration, image enhancement, and image
segmentation.
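
As a brief sketch (assuming NumPy and homogeneous coordinates, which are commonly used for this purpose), each of these transformations can be written as a 3x3 matrix and composed by matrix multiplication:

import numpy as np

def translation(tx, ty):
    return np.array([[1, 0, tx],
                     [0, 1, ty],
                     [0, 0,  1]], dtype=float)

def rotation(theta_rad):
    c, s = np.cos(theta_rad), np.sin(theta_rad)
    return np.array([[c, -s, 0],
                     [s,  c, 0],
                     [0,  0, 1]], dtype=float)

def scaling(sx, sy):
    return np.array([[sx,  0, 0],
                     [ 0, sy, 0],
                     [ 0,  0, 1]], dtype=float)

# A pixel coordinate (x, y) in homogeneous form (x, y, 1).
p = np.array([10.0, 5.0, 1.0])

# Compose: scale, then rotate by 90 degrees, then translate.
T = translation(100, 50) @ rotation(np.pi / 2) @ scaling(2, 2)
print(T @ p)  # transformed coordinate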

6 Analyse the Role of Digital Cameras and How can we control the depth of field?(7)
Ans:- Digital cameras have become ubiquitous in modern society and have
revolutionized the way we capture and share images. They use digital technology to
convert light into electrical signals that can be stored and processed by a computer,
resulting in high-quality images that can be easily edited and shared.

One important aspect of digital photography is the control of depth of field, which
refers to the range of distances in an image that are in focus. Controlling depth of field
can help to create a more aesthetically pleasing image by drawing attention to specific
objects or areas of the scene.

There are several factors that can affect the depth of field in a digital image, including
aperture size, focal length, and distance to the subject. To control depth of field, we can
adjust the aperture size, which refers to the size of the opening in the lens that lets
light into the camera. A larger aperture (smaller f-number) will result in a shallower
depth of field, while a smaller aperture (larger f-number) will result in a deeper depth
of field.

In addition to aperture size, we can also control depth of field by adjusting the focal
length of the lens and the distance to the subject. A longer focal length and a closer
distance to the subject will result in a shallower depth of field, while a shorter focal
length and a farther distance to the subject will result in a deeper depth of field.

Overall, digital cameras play a critical role in capturing high-quality images and allow
us to control various aspects of the image, including depth of field. By adjusting the
aperture size, focal length, and distance to the subject, we can create images that are
aesthetically pleasing and convey our artistic vision.
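
These relationships can be made concrete with a commonly used close-focus approximation, total depth of field roughly equal to 2 * N * c * s^2 / f^2, where N is the f-number, c the acceptable circle of confusion, s the subject distance, and f the focal length (all lengths in the same units). A minimal sketch with illustrative values:

def depth_of_field_mm(f_number, focal_length_mm, subject_distance_mm,
                      circle_of_confusion_mm=0.03):
    """Approximate total depth of field (valid when the subject is much
    closer than the hyperfocal distance)."""
    return (2 * f_number * circle_of_confusion_mm * subject_distance_mm**2
            / focal_length_mm**2)

# Same 50 mm lens and 2 m subject distance, two different apertures.
print(depth_of_field_mm(2.8, 50, 2000))   # wide aperture  -> shallow DoF (~270 mm)
print(depth_of_field_mm(11,  50, 2000))   # small aperture -> deep DoF  (~1060 mm)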

7 What is image enhancement? Differentiate spatial domain and frequency domain methods. If I is input intensity and O is output intensity, then write the equations for image negation and log transformation. Let the intensity range for the image be [0, L-1]. (7)
Ans:- Image enhancement is a process in which the quality of an
image is improved by applying various operations or techniques to
it. The goal of image enhancement is to improve the visual quality
of the image, make it easier to interpret, or to extract relevant
information from it.

There are two main categories of image enhancement methods: spatial domain methods and frequency domain methods.

Spatial domain methods operate directly on the pixel values of the image. They are based on simple mathematical operations, such as
filtering, thresholding, and contrast stretching. Spatial domain
methods are easy to understand and implement, but they can be
computationally intensive and may not be suitable for all types of
images.

Frequency domain methods, on the other hand, involve transforming the image from the spatial domain to the frequency
domain using mathematical techniques such as the Fourier
transform. Frequency domain methods are useful for analyzing the
frequency content of an image and can be used for tasks such as
image compression and noise reduction.

The equation for image negation can be written as:

O = L - 1 - I

where the intensity range is [0, L-1], so L - 1 is the maximum intensity value, and I and O are the input and output intensity values, respectively. Image negation is a spatial domain method that involves reversing the intensity values of the image, resulting in a negative image.

The equation for log transformation can be written as:

O = c * log(1 + I)

where c is a scaling constant and I and O are the input and output
intensity values, respectively. Log transformation is a spatial domain
method that is used to enhance the contrast of an image,
particularly in the darker regions. The log function compresses the
dynamic range of the image, making the darker regions brighter
and easier to see.
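
A minimal sketch of both point operations (assuming NumPy and an 8-bit image, so L = 256; c is chosen so that the output also spans [0, 255]):

import numpy as np

L = 256  # number of intensity levels for an 8-bit image

I = np.array([[0, 30, 128, 255]], dtype=np.uint8)  # a tiny example image

# Image negation: O = (L - 1) - I
negative = (L - 1) - I.astype(np.int32)

# Log transformation: O = c * log(1 + I), with c chosen to map the
# maximum input intensity to L - 1.
c = (L - 1) / np.log(1 + (L - 1))
log_transformed = c * np.log1p(I.astype(np.float64))

print(negative)
print(np.round(log_transformed).astype(np.uint8))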

In summary, image enhancement is a process of improving the quality of an image using various techniques. Spatial domain
methods and frequency domain methods are the two main
categories of image enhancement methods. Image negation and
log transformation are two examples of spatial domain methods.
The equation for image negation is O = L - 1 - I, and the equation
for log transformation is O = c * log(1 + I).

8 State different limitations of a pinhole camera and how to overcome these limitations.
Write a short note on thin lenses. (7)
Ans:- Pinhole cameras are simple optical devices that use a tiny
aperture to project an inverted image of the outside world onto a
screen or film. However, pinhole cameras have several limitations
that can affect the quality of the image:

1. Limited light gathering ability: The small aperture of a pinhole camera limits the amount of light that can enter the camera,
resulting in dim and grainy images.
2. Limited depth of field: The small aperture of a pinhole camera also
limits the depth of field, making it difficult to get objects at
different distances in focus at the same time.
3. Low resolution: The lack of a lens in a pinhole camera limits the
resolution of the image, resulting in a blurry and low-quality image.

To overcome these limitations, we can use a lens-based camera instead of a pinhole camera. A lens-based camera uses a lens to
focus the light onto the film or sensor, which can improve the light
gathering ability, depth of field, and resolution of the image.

Thin lenses are one of the most common types of lenses used in
cameras and other optical devices. A thin lens is a simple lens that
is characterized by its focal length and refractive index. Thin lenses
can be used to form images of objects by bending the light that
passes through them.

There are two types of thin lenses: converging lenses and diverging
lenses. A converging lens is a lens that converges light rays towards
a single point, called the focal point. A diverging lens, on the other
hand, diverges light rays away from a single point, called the virtual
focal point.
Thin lenses can be used in different ways to form images,
depending on the position of the object and the lens. For example,
a converging lens can be used to form a real image of a distant
object by placing the object beyond the focal point of the lens. A
diverging lens can be used to form a virtual image of a nearby
object by placing the object inside the focal point of the lens.
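
This image-forming behaviour is summarised by the thin lens equation, 1/f = 1/d_o + 1/d_i, where f is the focal length, d_o the object distance, and d_i the image distance. For example, an object placed 30 cm in front of a converging lens with f = 10 cm forms a real image 15 cm behind the lens, since 1/10 - 1/30 = 1/15.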

In summary, pinhole cameras have several limitations, such as limited light gathering ability, limited depth of field, and low
resolution. To overcome these limitations, we can use a lens-based
camera instead. Thin lenses are a common type of lens used in
cameras, which can be used to form images by bending light.

9 List out the different types of images in the view of the Computer Vision.(4)
Ans:- In the context of Computer Vision, images can be classified into several types
based on their characteristics and properties. The different types of images in
Computer Vision include:

1. Binary image: A binary image is a black and white image in which each pixel is either
black (0) or white (1). Binary images are commonly used for image processing tasks
such as edge detection, segmentation, and morphology.
2. Grayscale image: A grayscale image is an image in which each pixel has a single
intensity value, typically ranging from 0 (black) to 255 (white). Grayscale images are
commonly used for image analysis tasks such as feature extraction, object recognition,
and classification.
3. Color image: A color image is an image in which each pixel is represented by a
combination of red, green, and blue (RGB) values. Color images are commonly used in
computer vision applications such as object detection, segmentation, and tracking.
4. Depth image: A depth image is a type of image that represents the distance of each
pixel from the camera. Depth images are commonly used in applications such as 3D
reconstruction, object recognition, and human pose estimation.

These are some of the common types of images used in Computer Vision, and each
type has its own unique characteristics and applications.
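
As a minimal sketch (assuming NumPy; the shapes, dtypes, and value ranges shown are common conventions rather than fixed rules), the four image types typically differ in array shape and data type:

import numpy as np

h, w = 4, 6  # illustrative image dimensions

binary    = np.zeros((h, w), dtype=bool)         # each pixel is 0 or 1
grayscale = np.zeros((h, w), dtype=np.uint8)     # each pixel in [0, 255]
color     = np.zeros((h, w, 3), dtype=np.uint8)  # R, G, B per pixel
depth     = np.zeros((h, w), dtype=np.float32)   # distance from the camera

for name, img in [("binary", binary), ("grayscale", grayscale),
                  ("color", color), ("depth", depth)]:
    print(name, img.shape, img.dtype)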

10 Analyse the Image Noise Filters techniques you know and explain with an example. (7)
Ans:- Image noise is an unwanted variation in pixel intensity that can
be caused by several factors such as electronic interference, sensor
noise, or environmental factors. Image noise can reduce the quality
of an image and make it difficult to perform image analysis tasks.
Therefore, noise removal is an important pre-processing step in
image processing.
There are several techniques available for removing image noise,
and these techniques can be broadly classified into two categories:
spatial domain filters and frequency domain filters.

1. Spatial Domain Filters: Spatial domain filters are a class of noise filters that operate directly on the pixel values of the image. These
filters are based on the idea of replacing each pixel value with a
weighted average of its neighboring pixels.

Example: Median filter - A median filter is a spatial domain filter that replaces each pixel value with the median value of its
neighboring pixels. The median filter is a popular filter for removing
impulse noise, which is a type of noise that can cause isolated pixels
to have extreme values. The median filter works by replacing each
noisy pixel with the median value of its neighboring pixels, which
effectively removes the impulse noise while preserving the image
details.
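
As a minimal illustration of the idea (a hand-written 1-D median filter assuming NumPy; library routines such as scipy.ndimage.median_filter apply the same idea in 2-D):

import numpy as np

signal = np.array([10, 12, 11, 255, 13, 12, 0, 11, 12], dtype=float)  # two impulse spikes

def median_filter_1d(x, window=3):
    """Replace each sample with the median of a window centred on it."""
    pad = window // 2
    padded = np.pad(x, pad, mode="edge")
    return np.array([np.median(padded[i:i + window]) for i in range(len(x))])

print(median_filter_1d(signal))  # the 255 and 0 spikes are removed,
                                 # while the smooth values are preserved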

2. Frequency Domain Filters: Frequency domain filters are a class of noise filters that operate on the frequency components of the
image. These filters are based on the idea of transforming the
image into the frequency domain using techniques such as Fourier
transform, and then filtering out the noise components in the
frequency domain.

Example: Wiener filter - A Wiener filter is a frequency domain filter that uses a statistical model of the image and noise to estimate the
ideal image. The Wiener filter works by estimating the power
spectrum of the noise and the signal, and then applying a filter that
minimizes the mean square error between the estimated image and
the true image. The Wiener filter is a popular filter for removing
additive noise, which is a type of noise that adds random values to
the pixel intensities.

In summary, image noise filters are important tools for removing noise from images and improving their quality. Spatial domain
filters and frequency domain filters are two popular techniques for
removing image noise, each with its own strengths and weaknesses.
Median filter and Wiener filter are two examples of image noise
filters, which can be used to remove impulse noise and additive
noise, respectively.

11 List the difference between Linear Filters and Non-Linear Filters? With suitable example.
(7)
Ans:- Linear filters and non-linear filters are two broad categories of
image filters used for image processing tasks. The main difference
between these two types of filters is how they operate on the pixel
values of the image.

Linear Filters: Linear filters are a class of image filters that operate
on the pixel values of an image using a linear function. These filters
are based on the principle of convolution, which involves sliding a
small matrix, called a kernel or mask, over the image and
computing the weighted sum of the neighboring pixels at each
location.

Examples of linear filters include:

1. Gaussian filter - A Gaussian filter is a linear filter that is used to smooth an image by reducing high-frequency noise while
preserving the image details. The Gaussian filter works by
convolving the image with a Gaussian kernel, which assigns higher
weights to the central pixels and lower weights to the neighboring
pixels.
2. Sobel filter - A Sobel filter is a linear filter that is used for edge
detection in images. The Sobel filter works by convolving the image
with two kernels, one for horizontal edges and one for vertical
edges. The resulting gradient magnitude is used to identify the
edges in the image.

Non-linear Filters: Non-linear filters are a class of image filters that operate on the pixel values of an image using a non-linear function.
These filters are based on the principle of ranking, which involves
sorting the neighboring pixels and selecting a pixel based on its
rank.

Examples of non-linear filters include:

1. Median filter - A median filter is a non-linear filter that is used for removing impulse noise from an image. The median filter works by
replacing each pixel with the median value of its neighboring pixels,
which effectively removes the impulse noise while preserving the
image details.
2. Bilateral filter - A bilateral filter is a non-linear filter that is used for
smoothing an image while preserving its edges. The bilateral filter
works by assigning weights to the neighboring pixels based on their
intensity and spatial distance, which ensures that nearby pixels with
similar intensity are given higher weights.

In summary, the main difference between linear filters and non-linear filters is the type of operation they perform on the pixel
values of an image. Linear filters operate using a linear function,
while non-linear filters operate using a non-linear function.
Gaussian and Sobel filters are examples of linear filters, while
median and bilateral filters are examples of non-linear filters.
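
As a minimal sketch of the distinction (assuming NumPy; a 3x3 mean kernel stands in here for the Gaussian, and only a single pixel's neighbourhood is filtered for brevity), a linear filter computes a weighted sum over the neighbourhood, while a non-linear filter such as the median ranks the neighbourhood values:

import numpy as np

def filter_pixel_linear(neighbourhood, kernel):
    # Linear: output is a weighted sum of the neighbouring pixel values.
    return np.sum(neighbourhood * kernel)

def filter_pixel_median(neighbourhood):
    # Non-linear: output is chosen by ranking (sorting) the neighbourhood.
    return np.median(neighbourhood)

patch = np.array([[10, 10, 10],
                  [10, 255, 10],   # central pixel corrupted by impulse noise
                  [10, 10, 10]], dtype=float)

mean_kernel = np.full((3, 3), 1 / 9.0)   # a simple linear smoothing kernel

print(filter_pixel_linear(patch, mean_kernel))  # ~37.2: the spike leaks into the average
print(filter_pixel_median(patch))               # 10.0: the spike is rejected entirely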

12 Discuss the Opening and closing morphology operations and its uses in Image Processing.
(7)
Ans:- Opening and closing are two fundamental morphological operations in image
processing that are used for various image enhancement tasks, such as noise
reduction, object extraction, and shape analysis. Both operations are based on the
idea of structuring elements, which are small binary images that define the shape
and size of the features to be extracted or modified.

Opening: Opening is a morphological operation that involves two sequential steps: erosion followed by dilation. The opening operation is used to remove small
objects, eliminate noise, and smooth the edges of larger objects in an image. The
opening operation works by eroding away the foreground pixels that do not match
the shape of the structuring element and then dilating the remaining pixels using the
same structuring element.

Closing: Closing is a morphological operation that involves two sequential steps: dilation followed by erosion. The closing operation is used to fill small holes,
connect broken edges, and smooth the contours of objects in an image. The closing
operation works by dilating the foreground pixels that match the shape of the
structuring element and then eroding the resulting pixels using the same structuring
element.

Uses of Opening and Closing operations in Image Processing:

1. Image Denoising - The opening operation is used to remove small noisy regions
from an image, while the closing operation is used to fill in gaps and smooth the
edges of larger objects.
2. Object Extraction - The opening operation is used to remove small objects from an
image, while the closing operation is used to fill in gaps and connect the broken
edges of larger objects.
3. Shape Analysis - The opening and closing operations are used to extract the
boundaries and contours of objects in an image, which can be used to measure the
shape, size, and orientation of objects.
4. Edge Detection - The opening and closing operations can be used to detect the
edges and boundaries of objects in an image by comparing the original image with
the opened or closed image.

In summary, opening and closing are two important morphological operations in image processing that are used for various image enhancement tasks, such as noise
reduction, object extraction, and shape analysis. The opening operation is used to
remove small objects and eliminate noise, while the closing operation is used to fill
small holes and smooth the contours of objects in an image.
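
A brief sketch using SciPy's binary_opening and binary_closing (assuming scipy.ndimage is available; the input arrays and the 3x3 structuring element are illustrative):

import numpy as np
from scipy.ndimage import binary_opening, binary_closing

structure = np.ones((3, 3), dtype=bool)  # 3x3 structuring element

# Opening: a solid 4x4 object plus an isolated noise pixel.
noisy = np.zeros((8, 8), dtype=bool)
noisy[2:6, 2:6] = True    # object
noisy[0, 7] = True        # single-pixel noise
opened = binary_opening(noisy, structure=structure)
# The noise pixel disappears; the 4x4 object survives (erosion shrinks it
# to its 2x2 core, then dilation grows it back).

# Closing: the same object but with a one-pixel hole inside it.
holed = np.zeros((8, 8), dtype=bool)
holed[2:6, 2:6] = True
holed[3, 3] = False       # small hole
closed = binary_closing(holed, structure=structure)
# The hole is filled because dilation covers it before erosion restores the shape.

print(opened.astype(int))
print(closed.astype(int))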

13 Discuss the problem of median filter with even number of points in the window. (7)
Ans:- The median filter is a common technique used in image
processing for removing noise from an image. It is based on the
principle of replacing each pixel value with the median value of its
neighboring pixels. The median filter is effective in removing
impulsive noise, such as salt and pepper noise, from an image.

However, when the size of the median filter window is even, it can
lead to some issues. Specifically, the issue arises when computing
the median value for a set of even number of pixels in the window.
In this case, there is no unique median value, which leads to some
ambiguity in the result.

For example, consider a median filter window that contains an even number of pixels, say the four values {6, 8, 1, 4}. Sorting them gives {1, 4, 6, 8}, so there is no unique middle element. One possible solution is to take the average of the two middle values, i.e., (4 + 6)/2 = 5, and use that as the output for the central pixel. This approach works reasonably well, but it does introduce some blurring and loss of detail, since the averaged value may not be a pixel value that actually occurs in the window.

To overcome this problem, some implementations of the median filter use a window size that is odd, such as 3x3 or 5x5. This ensures
that there is always a unique median value, which helps to avoid the
ambiguity introduced by even-sized windows. Another approach is
to use a weighted median filter, where each pixel in the window is
given a weight based on its distance from the central pixel. This
helps to give more importance to the pixels that are closer to the
central pixel, which can help to reduce the blurring effect
introduced by averaging the two middle values.
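
A brief sketch of this averaging convention (assuming NumPy; note that np.median uses the same rule for an even number of values):

import numpy as np

window = np.array([6, 8, 1, 4])          # an even number of pixel values

sorted_vals = np.sort(window)            # [1, 4, 6, 8] -> no single middle element
middle_average = (sorted_vals[1] + sorted_vals[2]) / 2.0
print(middle_average)                    # 5.0

# NumPy's median uses the same averaging convention for even-length input.
print(np.median(window))                 # 5.0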

14 Define image sampling and image quantization and discuss their role in the quality of an
image. (7)
Ans:-
Image sampling and image quantization are two important
concepts in digital image processing that have a significant impact
on the quality of an image.

Image sampling refers to the process of converting a continuous analog image into a digital representation by sampling the image at
discrete points. In other words, it involves converting a continuous
image into a discrete image by selecting a finite number of samples
from the continuous signal. The sampling rate is usually measured
in pixels per inch (PPI) or dots per inch (DPI). The sampling rate
determines the level of detail that can be captured in the digital
image. Higher sampling rates result in higher resolution images
with more detail, but also require more storage space.

Image quantization, on the other hand, refers to the process of converting a continuous range of values into a finite set of discrete
values. This involves assigning a digital value to each sample point
based on its intensity level. The number of quantization levels
determines the range of intensities that can be represented in the
digital image. For example, an 8-bit image has 256 levels of
intensity, while a 16-bit image has 65,536 levels of intensity. Higher
quantization levels result in higher precision and accuracy, but also
require more storage space.

The quality of an image is influenced by both the sampling rate and the quantization level. Insufficient sampling can result in aliasing
artifacts, which appear as jagged edges or patterns in the image.
On the other hand, oversampling can result in a larger image file
size without any significant improvement in image quality.
Insufficient quantization can result in loss of detail or inaccurate color representation, while using more quantization levels than necessary results in larger file sizes without significant improvement in image quality.

Therefore, in order to obtain high-quality digital images, it is important to carefully select appropriate sampling rates and
quantization levels based on the specific requirements of the
application. In general, higher sampling rates and quantization
levels result in higher quality images, but also require more storage
space and processing power.
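
A minimal sketch of uniform quantization (assuming NumPy; the ramp signal and the choice of 4 versus 256 levels are illustrative):

import numpy as np

def quantize(samples, levels):
    """Map values in [0, 1] onto `levels` equally spaced intensity values."""
    indices = np.round(samples * (levels - 1))      # nearest quantization level
    return indices / (levels - 1)

# "Sampling": take 8 discrete samples of a continuous ramp from 0 to 1.
samples = np.linspace(0.0, 1.0, 8)

print(quantize(samples, 256))  # 8-bit quantization: nearly indistinguishable from input
print(quantize(samples, 4))    # 2-bit quantization: only 4 distinct values (visible banding)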

15 Explain the term 'computer vision' and it's need with the help of a suitable example. (4)
Ans:- Computer vision refers to the field of artificial intelligence and
computer science that deals with enabling computers to interpret
and understand digital images and video. The goal of computer
vision is to replicate the abilities of human vision, such as
recognizing objects, identifying patterns, and making decisions
based on visual input.

The need for computer vision arises from the vast amounts of visual
data that are generated and consumed every day, from security
cameras to social media photos. With the help of computer vision,
machines can analyze, interpret, and extract insights from this visual
data, enabling them to automate tasks, make predictions, and assist
humans in decision-making.

For example, consider a self-driving car. The car is equipped with several cameras and sensors that capture real-time visual data of
the surroundings. With the help of computer vision algorithms, the
car can interpret this data to recognize objects such as pedestrians,
vehicles, and traffic signals. The car can then make decisions based
on this visual input, such as slowing down, changing lanes, or
coming to a stop. This example illustrates how computer vision
enables machines to replicate the abilities of human vision and
make intelligent decisions based on visual input.

16 Differentiate between CCD and CMOS. (7)
Ans: CCD (Charge-Coupled Device) and CMOS (Complementary
Metal-Oxide Semiconductor) are two different types of image
sensors that are commonly used in digital cameras, smartphones,
and other imaging devices. While both CCD and CMOS serve the
same function of converting light into electrical signals, they differ
in their structure and operation.

1. Structure:

CCD sensors have a simple structure consisting of an array of photodiodes (light sensors) that are connected by a series of metal
electrodes. When light hits the photodiodes, electrons are
generated and stored in a "bucket brigade" fashion along the
electrodes until they are read out.

CMOS sensors, on the other hand, have a more complex structure consisting of an array of photodiodes that are connected to
amplifiers and signal processors. Each pixel in a CMOS sensor has
its own amplifier and readout circuitry, allowing for faster and more
efficient image capture.

2. Power consumption:

CCD sensors consume more power than CMOS sensors due to their
higher operating voltage and the need for external clock signals to
read out the stored charges.

CMOS sensors consume less power due to their low operating voltage and the fact that each pixel has its own amplifier and
readout circuitry, eliminating the need for external clock signals.

3. Image quality:

CCD sensors are known for their high image quality, low noise, and
excellent dynamic range, making them ideal for applications where
image quality is of utmost importance, such as professional
photography and scientific imaging.
CMOS sensors have improved significantly in recent years and are
now capable of producing high-quality images with low noise and
good dynamic range. However, they may not match the image
quality of CCD sensors in certain applications.

4. Speed:

CMOS sensors are generally faster than CCD sensors due to their
on-chip circuitry and the ability to read out pixels in parallel.

CCD sensors are slower than CMOS sensors due to the sequential
readout of pixels along the electrodes.

In summary, CCD sensors offer higher image quality and lower noise but consume more power and are slower than CMOS sensors.
CMOS sensors, on the other hand, consume less power, are faster,
and offer good image quality but may not match the image quality
of CCD sensors in certain applications.

17 What do you mean by the aperture of the camera, discuss aperture's role in the quality of
an image. (5)
Ans:- In photography, the aperture of a camera refers to the
opening in the lens through which light enters the camera. It is
represented by an f-number, which determines the size of the
aperture. A smaller f-number represents a larger aperture, while a
larger f-number represents a smaller aperture.

The aperture plays a crucial role in the quality of an image, particularly in terms of depth of field and exposure.

1. Depth of field:

The aperture controls the depth of field, which is the range of distances in the image that are in sharp focus. A larger aperture
(smaller f-number) creates a shallow depth of field, with only a
small portion of the image in focus and the background blurred. A
smaller aperture (larger f-number) creates a deeper depth of field,
with more of the image in focus from foreground to background.

2. Exposure:
The aperture also plays a key role in determining the exposure of
the image. The larger the aperture, the more light enters the
camera, which can result in a brighter image. The smaller the
aperture, the less light enters the camera, resulting in a darker
image. Controlling the aperture in conjunction with shutter speed
and ISO allows for precise control of the exposure of the image.
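
To make this concrete: the f-number is defined as N = f/D, the focal length divided by the aperture diameter, and the light admitted is proportional to the aperture area, i.e. roughly proportional to 1/N^2. A quick sketch with illustrative values:

# f-number N = focal_length / aperture_diameter
focal_length_mm = 50.0

for diameter_mm in (25.0, 12.5, 6.25):
    n = focal_length_mm / diameter_mm
    relative_light = 1.0 / n ** 2     # light gathered is proportional to 1/N^2
    print(f"f/{n:.1f}  diameter={diameter_mm} mm  relative light={relative_light:.4f}")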

In summary, the aperture of a camera is the opening in the lens through which light enters the camera. It plays a crucial role in
determining the depth of field and exposure of the image, and
controlling the aperture allows for precise control over these
aspects of the image.

18 Define the process of image formation with a suitable diagram. (7)


Ans: Image formation is the process by which light is captured and
transformed into a digital image. The process of image formation
involves four main stages: illumination, reflection, refraction, and
capture.

1. Illumination: The first stage of image formation involves illuminating the object with a light source. The light source could be
natural, such as sunlight, or artificial, such as a lamp or flash.
2. Reflection: The illuminated object reflects some of the light, which
then enters the camera lens. The reflected light carries information
about the object's color and shape.
3. Refraction: As the light enters the lens, it is refracted or bent. The
lens focuses the light onto the camera's sensor, forming an image.
4. Capture: The camera's sensor captures the focused image in the
form of pixels, which are then processed and stored as a digital
image.

Here is a simplified diagram to illustrate the process of image formation:

Light source -> Object (reflection) -> Lens (refraction) -> Sensor (capture) -> Digital image

In the diagram, the object is illuminated with a light source, and the
reflected light enters the camera lens. The lens refracts the light to
focus it onto the camera's sensor, where the image is captured as
pixels and stored as a digital image.

Overall, image formation is a complex process that involves the interaction of light, optics, and electronics. Understanding this
process is important in image processing and computer vision, as it
helps to optimize the quality and accuracy of the digital images.

19 Electromagnetic Spectrum defines the ranges of Electro-Magnetic Radiation from the Sun.
Explain how it helps in vision process. (7)
Ans:- The electromagnetic spectrum is a range of frequencies of
electromagnetic radiation that includes visible light, as well as other
forms of radiation such as radio waves, microwaves, infrared
radiation, ultraviolet radiation, X-rays, and gamma rays.

In the context of the vision process, electromagnetic radiation plays a critical role in how we see and perceive the world around us. The
visible part of the electromagnetic spectrum, which ranges from
approximately 400 to 700 nanometers, is the range of radiation that
our eyes can detect and interpret as color. Different wavelengths of
visible light correspond to different colors, and our brains process
these signals to create the visual experience of color.

Other parts of the electromagnetic spectrum, such as ultraviolet and infrared radiation, are not visible to the human eye but can still
be detected and utilized in certain applications of vision processing.
For example, infrared cameras can detect heat radiation and are
used in night vision applications, while ultraviolet light can be used
in fluorescence microscopy to highlight specific molecules or
structures.

Furthermore, other forms of electromagnetic radiation, such as X-rays and gamma rays, are used in medical imaging to visualize
internal structures of the human body. These forms of radiation can
penetrate through the body's tissues and create images of bones,
organs, and other structures that would not be visible otherwise.
In summary, the electromagnetic spectrum plays a crucial role in
the vision process by providing the range of radiation that our eyes
can detect and interpret as color, as well as other forms of radiation
that can be utilized in various applications of vision processing.

20 Illustrate how image processing, computer graphics and computer vision are related to
each other. (7)
Ans:-
Image processing, computer graphics, and computer vision are all closely
related fields in computer science, each with its own specific focus and
applications. While there is some overlap between these fields, they all have
unique characteristics and applications.

Image processing refers to the manipulation of images using mathematical algorithms and computer programs. The goal of image processing is to
enhance images, correct distortions or noise, extract useful information from
images, or transform images into a different representation. Image
processing has applications in various fields, including medicine, security,
robotics, and entertainment.

Computer graphics, on the other hand, deals with the generation of visual
content using computer programs. The goal of computer graphics is to
create realistic or stylized images, animations, or visual effects for various
applications, including video games, movies, and virtual reality. Computer
graphics involves the use of computer programs to create, manipulate, and
render images, as well as to simulate complex physical phenomena.

Computer vision is a field that aims to enable computers to interpret and understand visual content, similar to how humans perceive and interpret
visual information. The goal of computer vision is to extract useful
information from visual data, such as images or videos, and use it to make
decisions or take actions. Computer vision has a wide range of applications,
including self-driving cars, face recognition, and object detection.

While image processing, computer graphics, and computer vision are distinct fields, they are closely related and often used together. For example,
image processing techniques can be used to preprocess visual data before
applying computer vision algorithms. Computer graphics techniques can be
used to create synthetic training data for computer vision algorithms.
Computer vision algorithms can be used to track or recognize objects in
video feeds for use in computer graphics or augmented reality applications.

21 Image processing is the process of transforming an image into a suitable form that is more
appropriate for a particular application. Explain the steps required to execute in the entire
transformation process. (7)
Ans:- The image processing transformation process involves several
steps that are required to execute in a specific sequence to achieve
the desired outcome. These steps are as follows:

1. Image Acquisition: The first step is to acquire the image from a source such as a camera, scanner, or database. This step is essential
to obtain the initial image data that will be processed in subsequent
steps.
2. Image Preprocessing: The acquired image may have unwanted
elements, such as noise or distortion, that can affect the quality of
the image. In the preprocessing step, these unwanted elements are
removed, and the image is corrected to ensure that the image data
is suitable for processing.
3. Image Enhancement: This step involves improving the quality of the
image to make it more suitable for a particular application.
Enhancement techniques include sharpening, contrast adjustment,
and filtering.
4. Image Restoration: Image restoration involves removing or
reducing the effect of noise or other distortions that have affected
the quality of the image. Restoration techniques include filtering
and deblurring.
5. Image Analysis: In this step, the image is analyzed to extract specific
features, such as edges, corners, or shapes, that are relevant to the
application. Image analysis techniques include feature extraction,
segmentation, and classification.
6. Image Interpretation: Image interpretation involves making
decisions based on the features extracted in the previous step. This
step is often the most challenging and requires the application of
machine learning or artificial intelligence techniques to make
decisions based on the extracted features.
7. Image Compression: Finally, the processed image data can be
compressed to reduce its storage requirements and make it easier
to transmit over networks. Compression techniques include lossless
and lossy compression.

In summary, the image processing transformation process involves several steps, each of which is essential to achieve the desired
outcome. By following this sequence of steps, an image can be
transformed into a suitable form for a particular application, such as
image recognition or video analysis.

22 Differentiate between edge and corner detection. (4)


Ans: Edge detection and corner detection are two different
techniques used in image processing for feature detection. The
main differences between edge and corner detection are:

1. Definition: Edge detection detects edges or boundaries between regions with different properties such as color, texture or intensity,
while corner detection detects corners or points where two or more
edges meet.
2. Result: Edge detection results in a line or curve that separates two
regions with different properties, while corner detection results in a
point that marks the intersection of two or more edges.
3. Application: Edge detection is used in applications such as object
recognition, image segmentation, and feature extraction. Corner
detection is used in applications such as object tracking, stereo
vision, and 3D reconstruction.
4. Complexity: Edge detection is generally simpler than corner
detection because edges are more prevalent and distinct in images,
while corners are fewer and more difficult to detect. Corner
detection requires more complex algorithms such as Harris corner
detection or FAST (Features from Accelerated Segment Test) corner
detection.

In summary, edge detection and corner detection are two different techniques used for feature detection in image processing. Edge
detection detects edges between regions with different properties,
while corner detection detects points where two or more edges
meet. Each technique has its own strengths and weaknesses and is
used in different applications based on their suitability.
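
As a minimal sketch of the edge-detection side (assuming NumPy and SciPy; the tiny test image is illustrative), convolving with the two Sobel kernels gives the horizontal and vertical gradients, whose magnitude highlights edge pixels; corner detectors such as Harris go further and analyse how these gradients vary within a window:

import numpy as np
from scipy.ndimage import convolve

# A tiny image: dark on the left, bright on the right (a vertical edge).
img = np.zeros((5, 5), dtype=float)
img[:, 3:] = 255.0

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)   # responds to vertical edges
sobel_y = sobel_x.T                              # responds to horizontal edges

gx = convolve(img, sobel_x)
gy = convolve(img, sobel_y)
magnitude = np.hypot(gx, gy)   # large values mark edge pixels

print(magnitude)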

23 Write a note of Morphological filtering. (4)


Ans:-
Morphological filtering is a technique in image processing that
involves the use of mathematical morphology operations to filter
and process images. Morphological filtering is based on the
mathematical principles of set theory and topology, and is used to
extract image features such as edges, contours, corners, and
texture.

Morphological filtering is useful for image enhancement, noise reduction, object recognition, and feature extraction. The most
commonly used morphological filters are erosion and dilation,
which are used to remove small objects and noise, and to fill in
gaps in the image respectively. Other morphological filters include
opening and closing, which are used to remove small objects while
preserving the shape and size of larger objects.

Morphological filtering is widely used in many applications, including medical imaging, remote sensing, and robotics. It is a
powerful tool for enhancing images and extracting features, and
can be used in conjunction with other image processing techniques
to achieve better results.

24 What is feature extraction? How it is done? (4)


Ans:- Feature extraction is the process of identifying and extracting
meaningful features or patterns from raw data, such as images,
audio signals, or text. In image processing, feature extraction
involves extracting important characteristics of an image, such as
edges, corners, texture, and color, that can be used to describe or
classify the image.

Feature extraction can be done using various techniques such as statistical analysis, filtering, segmentation, and transformation.
Statistical analysis involves computing statistical properties of an
image, such as mean, variance, and skewness, to extract features
related to brightness, contrast, and texture. Filtering involves
applying various filters to an image to extract features such as
edges and corners. Segmentation involves dividing an image into
regions based on similarity criteria and extracting features from
these regions. Transformation involves applying mathematical
transformations, such as Fourier or wavelet transforms, to an image
to extract features related to frequency or spatial information.

Once the features have been extracted, they can be used for various
tasks such as image classification, object detection, and recognition.
The choice of feature extraction technique depends on the specific
application and the type of features required.
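
As a minimal sketch of the statistical approach (assuming NumPy; the patch values are made up), a few global features are computed and stacked into a feature vector:

import numpy as np

patch = np.array([[ 10,  12, 200],
                  [ 11, 180, 210],
                  [  9,  13, 205]], dtype=float)   # a small grayscale patch

mean = patch.mean()                       # overall brightness
variance = patch.var()                    # contrast / spread of intensities
hist, _ = np.histogram(patch, bins=4, range=(0, 255))  # coarse intensity distribution

feature_vector = np.concatenate(([mean, variance], hist))
print(feature_vector)   # can now be fed to a classifier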

25 Differentiate between supervised, un-supervised and semi-supervised classification. (4)


Ans:- Supervised, unsupervised, and semi-supervised classification are three
different approaches to machine learning and data analysis. The main
differences between them are:

1. Supervised classification: In supervised classification, the algorithm is trained on labeled data, which means that the data is already labeled with the
correct answers. The algorithm then uses this labeled data to learn how to
classify new, unlabeled data. In other words, the algorithm learns from
example data that has already been labeled.
2. Unsupervised classification: In unsupervised classification, the algorithm is
not given any labeled data. Instead, it attempts to identify patterns or
clusters in the data based on some similarity metric. The algorithm does not
know what the correct answers are, so it must find structure in the data on
its own.
3. Semi-supervised classification: Semi-supervised classification is a
combination of both supervised and unsupervised approaches. It involves
training the algorithm on a smaller set of labeled data and then using the
knowledge gained from this to classify a larger set of unlabeled data.
