
DIGITAL IMAGE

PROCESSING
B.E. GEOINFORMATICS
2018 - 2022
TABLE OF CONTENTS

S.NO TITLE PG.NO


01 One-dimensional statistics 1
   Multidimensional statistics 6
02 Characteristics of image and pixel 10
03 Classification
   - Bayes’ theorem 11
   - Supervised classification methods 14
   - Parametric and non-parametric classification 15
   - Supervised training and refinement stages 16
   - Support vector machine 19
   - Unsupervised classification methods 20
   - Sub-pixel classification 23
   - Fuzzy classification 24
   - Accuracy estimation 29
   - ANN for image classification 35
   - Decision tree classifier – Boundary descriptors 37
04 Color Theory 39
05 Data/ image formats --
06 Data perception from satellites 41
07 Display Operations 41
08 Display systems and methods 43
09 Expert Systems
   - definition 48
   - components 50
   - applications 53
10 Errors
   - Geometric 55
   - Radiometric 57
11 Filters
   - Convolutional filters 61
   - Global operators 64
   - Statistical and morphological operators 69
12 Hyperspectral data
   - definition 72
   - data analysis 75
13 Image
   - displays 78
   - enhancements
     * definition 80
     * Types of operations in enhancement 82
     * Point operators (threshold to noise clipping) 83
     * Linear and non-linear enhancement 85
     * Histogram equalization 86
     * False coloring and pseudo coloring 90
   - formation 92
   - fusion 93
   - properties 94
   - ratios 95
   - rectification 96
   - Indexed images and image ratios 100
14 IP systems 101
15 Photo-write systems 104
16 Pixel
   - Adjacency --
   - Connectivity 109
   - Path and path lengths 110
   - Neighbourhood 111
17 Principal Component Analysis (PCA) 113
18 Products:
   - Data products (diff levels) 115
   - Special products 117
   - DEM, DSM, DTM 120
   - Image processing hardwares and softwares 122
   - Open source products 124
   - Stereovision 126
19 Resolutions 129

TOPICS NOT INCLUDED:

- Data/ image formats


- Pixel adjacency
ONE-DIMENSIONAL STATISTICS: (2018107019 & 2018107020)
A measure of central tendency is a single value that describes the way in which a group of data cluster around a
central value. It represents the center point or a typical value in a dataset.

Why are measures of central tendency important?

➢ They indicate where most of the values fall in a distribution and help to analyze the variability of the data.
➢ They allow the user to compare one set of data to another.
➢ They help the user to understand the histogram of the image (the histogram of an image is the plot between the pixel values and the frequency of pixel values) and parameters such as skewing, contrast (the dynamic range), and how clustered or spread out the values are.

Mean

The mean (or average) of a set of data values is the sum of all of the data values divided by the number of data
values. There are two types of mean namely population mean and sample mean. Population refers to the entire
dataset whereas sample is a subset of the population consisting of a specified range of values.

Median

The median of a set of data values is the middle value of the data set when it has been arranged in ascending
order. That is, from the smallest value to the highest value. Example: The median of 4, 1, and 7 is 4 because
when the numbers are put in order (1, 4, 7), the number 4 is in the middle. To Calculate the Median: Arrange
the n measurements in ascending (or descending) order. We denote the median of the data by M.

Mode

• The mode of a set of data values is the value(s) that occurs most often. Example: The mode of {4, 2, 4, 3, 2, 2} is 2 because it occurs three times, which is more than any other number.

• Highest frequency = most "popular" value.

The mean, median and mode of a data set are collectively known as measures of central tendency as these three
measures focus on where the data is centered or clustered.
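The three measures can be computed with Python's standard library; a minimal sketch using the values from the mode example above:

import statistics

data = [4, 2, 4, 3, 2, 2]         # the mode example above
print(statistics.mean(data))      # 2.833..., the sum of the values divided by their count
print(statistics.median(data))    # 2.5, the average of the two middle values (2 and 3)
print(statistics.mode(data))      # 2, the most frequently occurring value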

➢ The mean is useful for predicting future results when there are no extreme values in the data set. However,
the impact of extreme values on the mean may be important and should be considered. E.g. The impact of
a stock market crash on average investment returns.
➢ The median may be more useful than the mean when there are extreme values in the data set as it is not
affected by the extreme values.
➢ The mode is useful when the most common item, characteristic or value of a data set is required.

When to use mean, median and mode?

In an ideally symmetrical normal distribution curve (or a symmetrical histogram) mean, median and mode are identical.

Mean is the most frequently used measure of central tendency and generally considered the best measure of it. An important property of mean is that it includes every value of the dataset as part of the calculation and produces the least amount of error.

However, the mean cannot be used when the curve or histogram is skewed (that is, when it is affected by outliers or extreme values). When the histogram is skewed, the mean loses its ability to indicate the central location, because the skewed data drags it away from the typical value.

Hence in this case, the median would prove to be an effective measurement as it is least affected by the outliers. Normally, the mode is used for categorical data where there is a need to identify which is the most common item or category.

Determining mean, median, mode from the histogram:

Consider the above histogram where heights of students (in inches) in a school have been given. The total number
of observations is 16.

Mean x̄ = (59 + 60 + 61 + 62 + 63 + 64 + 65 + 66 + 67 + 68 + 69 + 70 + 71 + 72 + 73 + 74) / 16 = 66.5 inches.

Median M:

The total number of observations is even (16), hence there are two middle values.

The value at position n/2 (the 8th observation) = 66, and the value at position n/2 + 1 (the 9th) = 67.

Median = (66 + 67) / 2 = 66.5 inches. (Mean and median are equal, which indicates that the histogram is not skewed.)

Mode: The most recurring value occurs between 66 and 67.

________________________________________________

Skewed histogram:

Consider the histogram of the data: 6, 7, 7, 7, 7, 8, 8, 8, 9, 10

Mean x̄ = (6 + 7 + 7 + 7 + 7 + 8 + 8 + 8 + 9 + 10) / 10 = 7.7

Median: The total number of observations is even (10).

Median = (7 + 8) / 2 = 7.5

Mode: 7 (It has the highest frequency).

Points to remember:

➢ If the mean is much larger than the median, the histogram is generally skewed right; a few values are larger
than the rest.

➢ If the mean is much smaller than the median, the histogram is generally skewed left; a few smaller values
bring the mean down.

➢ If the mean and median are close, the histogram is symmetrical and fairly balanced.

STANDARD DEVIATION:

It is the most widely used measure of variability or diversity in statistics. In terms of image processing it shows how much variation or "dispersion" exists from the average (mean, or expected value). A low standard deviation indicates that the data points tend to be very close to the mean, whereas a high standard deviation indicates that the data points are spread out over a large range of values. Mathematically, the population standard deviation is given by σ = √( Σ(xᵢ − μ)² / N ).

The Standard Deviation is a measure of how spread out numbers are. It is useful in comparing sets of data which may have the same mean but a different range. For example, the mean of the following two sets is the same: 15, 15, 15, 14, 16 and 2, 7, 14, 22, 30.

(Notation: σ² denotes the population variance, s² the sample variance.)

VARIANCE:

Variance (σ²) in statistics is a measurement of the spread between numbers in a data set. That is, it measures how far each number in the set is from the mean and therefore from every other number in the set.
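A short NumPy sketch reproducing the comparison above: both sets share the same mean but have very different spread.

import numpy as np

a = np.array([15, 15, 15, 14, 16])
b = np.array([2, 7, 14, 22, 30])

print(a.mean(), b.mean())              # both 15.0
print(a.std(), b.std())                # population standard deviations: about 0.63 vs. about 10.08
print(a.var(ddof=0), b.var(ddof=0))    # population variances (sigma^2); use ddof=1 for the sample variance s^2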

SKEWNESS:

Skewness refers to distortion or asymmetry in a symmetrical bell curve, or normal distribution, in a set of data. If the curve is shifted to the left or to the right, it is said to be skewed. Skewness can be quantified as a representation of the extent to which a given distribution varies from a normal distribution. A normal distribution has a skew of zero.

Negative skew: The left tail is longer; the mass of the distribution is concentrated on the right of the figure. The distribution is said to be left-skewed, left-tailed, or skewed to the left.

Positive skew: The right tail is longer; the mass of the distribution is concentrated on the left of the figure. The distribution is said to be right-skewed, right-tailed, or skewed to the right.

KURTOSIS:

Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution. That is,
data sets with high kurtosis tend to have heavy tails, or outliers. Data sets with low kurtosis tend to have light tails,
or lack of outliers. A uniform distribution would be the extreme case.
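Skewness and excess kurtosis can be estimated with SciPy (a sketch, assuming SciPy is available), using the skewed-histogram data from the earlier example:

from scipy.stats import skew, kurtosis

data = [6, 7, 7, 7, 7, 8, 8, 8, 9, 10]   # the skewed-histogram example above
print(skew(data))                         # positive value: the distribution is right (positively) skewed
print(kurtosis(data))                     # excess kurtosis; 0 corresponds to a normal distribution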

MULTI-DIMENSIONAL STATISTICS: (2018107021)
Scatterogram:

Scatter Diagrams are convenient mathematical tools to study the correlation between two random variables.
We can take any variable as the independent variable in such a case (the other variable being the dependent one),
and correspondingly plot every data point on the graph (𝑥𝑖 , 𝑦𝑖 ). The totality of all the plotted points forms the scatter
diagram. The Scatter Plot allows you to interactively classify two bands of raster data. One band provides the x
coordinates and the other band provides the y coordinates. If the bands do not contain dependent data, either band
can be plotted on either axis and the Scatter Plot illustrates only the degree of correlation between the two bands.

3D scatter plots are used to plot data points on three axes in the attempt to show the relationship between three
variables. Each row in the data table is represented by a marker whose position depends on its values in the columns
set on the X, Y, and Z axes.

Correlation:

Correlation shows the strength of a relationship between two variables and is expressed numerically by the
correlation coefficient. The correlation coefficient's values range between -1.0 and +1.0. We can calculate a
coefficient of correlation for the given data. It is a quantitative measure of the association of the random variables.
Its magnitude is never greater than 1, and it may be positive or negative. In the case of a positive correlation, the plotted points are distributed from the lower left corner to the upper right corner (in the general pattern of being evenly spread about a straight line with a positive slope), and in the case of a negative correlation, the plotted points are spread about a straight line with a negative slope, from upper left to lower right.

If the points are randomly distributed in space, or almost equally distributed at every location without depicting
any particular pattern, it is the case of a very small correlation, tending to 0.

The formula for correlation is

r = Σ(X − X̄)(Y − Ȳ) / ( √Σ(X − X̄)² · √Σ(Y − Ȳ)² )

where r = the correlation coefficient, X̄ = the average of the observations of variable X, and Ȳ = the average of the observations of variable Y.

In this example from a Landsat 8 scene, the best correlation of 0.98 is between bands 1 and 2 (both of which lie in
the visible), and bands 10 and 11 in the thermal IR. Several bands have negative correlations (especially 9 and the
two TIR bands), and some have essentially no correlation (bands 1 and 5).

"If two bands were perfectly correlated, they would be redundant." To estimate the degree of interrelation between variables in a manner not influenced by measurement units, the correlation coefficient is commonly used. The correlation between two bands of remotely sensed data, r_kl, is the ratio of their covariance (cov_kl) to the product of their standard deviations (s_k s_l); thus, r_kl = cov_kl / (s_k s_l).
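These band statistics can be computed directly with NumPy; a sketch in which random arrays stand in for real image bands:

import numpy as np

band_k = np.random.rand(100, 100)                          # placeholder for one image band
band_l = 0.8 * band_k + 0.2 * np.random.rand(100, 100)     # a second band partly correlated with it

cov_kl = np.cov(band_k.ravel(), band_l.ravel())[0, 1]      # covariance of the two bands
s_k, s_l = band_k.std(ddof=1), band_l.std(ddof=1)          # their standard deviations
r_kl = cov_kl / (s_k * s_l)                                 # correlation coefficient r_kl
print(r_kl)                                                 # matches np.corrcoef(band_k.ravel(), band_l.ravel())[0, 1]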

Covariance:

Covariance is a measure of how much two random variables vary together. It is similar to variance, but where variance tells you how a single variable varies, covariance tells you how two variables vary together.

The different remote-sensing-derived spectral measurements for each pixel often change together in some
predictable fashion. If there is no relationship between the brightness value in one band and that of another for a
given pixel, the values are mutually independent; that is, an increase or decrease in one band's brightness value is
not accompanied by a predictable change in another band's brightness value. Because spectral measurements of

individual pixels may not be independent, some measure of their mutual interaction is needed. This measure, called
the covariance, is the joint variation of two variables about their common mean.

The sign of the covariance therefore shows the tendency in the linear relationship between the variables. However,
the magnitude of the covariance is not easy to interpret, because it is not normalized and hence depends on the
magnitudes of the variables. On the other hand, the normalized version of the covariance, the correlation
coefficient, shows the strength of the linear relation by its magnitude.

Unlike the correlation coefficient, covariance is measured in units. The units are computed by multiplying the units of the two variables. The covariance can take any positive or negative value. The values are interpreted as follows:

- Positive covariance: Indicates that two variables tend to move in the same direction.
- Negative covariance: Reveals that two variables tend to move in inverse directions.

Difference between covariance and correlation:

Covariance measures the total variation of two random variables from their expected values. Using covariance,
we can only gauge the direction of the relationship (whether the variables tend to move in tandem or show an
inverse relationship). However, it does not indicate the strength of the relationship, nor the dependency between
the variables.

On the other hand, correlation measures the strength of the relationship between variables. Correlation is the scaled
measure of covariance. It is dimensionless. In other words, the correlation coefficient is always a pure value and not
measured in any units.

Cross-correlation:

Cross-correlation is a measurement that tracks the movements of two or more sets of time series data relative
to one another. It is used to compare multiple time series and objectively determine how well they match up with
each other and, in particular, at what point the best match occurs.
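A small NumPy sketch of cross-correlation between two synthetic series, locating the lag at which they match best:

import numpy as np

x = np.zeros(100); x[20:30] = 1.0      # a reference pulse
y = np.zeros(100); y[35:45] = 1.0      # the same pulse occurring 15 samples later

corr = np.correlate(y, x, mode="full") # cross-correlation at every possible lag
lags = np.arange(-len(x) + 1, len(y))
print(lags[np.argmax(corr)])           # 15: the pulse in y best matches x at a lag of 15 samples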

Regression:

Regression is a statistical method that attempts to determine the strength and character of the relationship
between one dependent variable (usually denoted by Y) and a series of other variables (known as independent
variables). The two basic types of regression are simple linear regression and multiple linear regression, although
there are non-linear regression methods for more complicated data and analysis. Simple linear regression uses one
independent variable to explain or predict the outcome of the dependent variable Y, while multiple linear regression
uses two or more independent variables to predict the outcome.

• Simple regression analysis uses a single x variable for each dependent “y” variable. For example: (x1, Y1).

• Multiple regression uses multiple “x” variables for each dependent “y” variable: ((x1)1, (x2)1, (x3)1, Y1).

The only difference between simple linear regression and multiple regression is in the number of predictors (“x”
variables) used in the regression.

The regression equation representing how much y changes with any given change of x can be used to construct a
regression line on a scatter diagram, and in the simplest case this is assumed to be a straight line. The direction in
which the line slopes depends on whether the correlation is positive or negative. When the two sets of observations
increase or decrease together (positive) the line slopes upwards from left to right; when one set decreases as the
other increases the line slopes downwards from left to right.

Simple linear regression: Y=a + bX + u


Multiple linear regression: Y = a + b1X1 + b2X2 + b3X3 + ... + btXt + u

where,
Y = the variable that you are trying to predict (dependent variable).
X = the variable that you are using to predict Y (independent variable).
a = the intercept.
b = the slope.
u = the regression residual.
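A sketch of simple linear regression with NumPy's polyfit, fitting Y = a + bX to a few illustrative points:

import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])   # roughly Y = 2X

b, a = np.polyfit(X, Y, deg=1)             # slope b and intercept a of the least-squares line
u = Y - (a + b * X)                        # the regression residuals
print(a, b)                                # intercept near 0, slope near 2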

CHARACTERISTICS OF IMAGE AND PIXEL: (2017107021)
A digital image is an image composed of picture elements, also known as pixels, each holding a finite, discrete numeric value representing its intensity or grey level; the image can be viewed as a 2D function whose inputs are the spatial coordinates x and y. (OR) A digital image is a matrix of many small elements, or pixels. Each pixel is represented by a numerical value. In general, the pixel value is related to the brightness or colour that we will see when the digital image is converted into an analog image for display and viewing.

In digital imaging, a pixel (or picture element) is the smallest item of information in an image. Pixels are arranged in
a 2-dimensional grid, represented using squares. Each pixel is a sample of an original image, where more samples
typically provide more-accurate representations of the original. The intensity of each pixel is variable; in colour
systems, each pixel has typically three or four components such as red, green, and blue, or cyan, magenta, yellow,
and black. The word pixel is based on a contraction of pix ("pictures") and el (for "element").

The term Image resolution is often used as a pixel count in digital imaging. When the pixel counts are referred to as
resolution, the convention is to describe the pixel resolution with the set of two numbers. The first number is the
number of pixel columns (width) and the second is the number of pixel rows (height), for example as 640 by 480.
Another popular convention is to cite resolution as the total number of pixels in the image, typically given as number
of megapixels, which can be calculated by multiplying pixel columns by pixel rows and dividing by one million. An
image that is 2048 pixels in width and 1536 pixels in height has a total of 2048×1536 = 3,145,728 pixels or 3.1
megapixels. One could refer to it as 2048 by 1536 or a 3.1-megapixel image. Other conventions include describing
pixels per length unit or pixels per area unit, such as pixels per inch or per square inch.

Every pixel-based image has three basic characteristics: dimension, bit depth, and color model.

Dimension:

Dimension is the attribute that is loosely related to size. Pixel-based images are always rectangular grids made up
of little squares, like checkerboards or chessboards. Image dimensions are limited by the capabilities of your capture
device, the amount of available storage space, your patience—the more pixels in the image, the more space it needs,
and the longer it takes to do things to it—and in Photoshop, by the 300,000-by-300,000-pixel image size limit.
Dimension is only indirectly related to physical size or resolution: Until you specify how large each pixel is (called
"resolution") an image has no specific size. But resolution and size aren't innate to the digital image; they're fungible
qualities.

Bit Depth:

Bit depth is the attribute that dictates how many shades or colours the image can contain. Because each pixel's tone or
colour is defined by one or more numbers, the range in which those numbers can fall dictates the range of possible values
for each pixel, and hence the total number of colours (or shades of grey) that the image can contain. For example, in a 1-
bit image (one in which each pixel is represented by one bit of information—either a one or a zero) each pixel is either on
or off, which usually means black and white. With two bits per pixel, there are four possible combinations (00, 01, 10, and
11), hence four possible values, and four possible colours or grey levels. Eight bits of information give you 256 possible
values; in 8-bit/channel RGB images, each pixel actually has three 8-bit values—one each for red, green, and blue for a
total of 24 bits per pixel.
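The pixel-count and bit-depth arithmetic above can be verified in a couple of lines (a small sketch):

width, height = 2048, 1536
print(width * height / 1e6)          # 3.145728, i.e. about 3.1 megapixels

bits_per_channel = 8
print(2 ** bits_per_channel)         # 256 possible values per channel
print(2 ** (3 * bits_per_channel))   # 16,777,216 possible 24-bit RGB colours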

Colour Model:

The third essential attribute of images, the Colour Model, is the one that dictates whether all those numbers
represent shades of grey, or colours. As we mentioned earlier, computers know nothing about tone or colour; they
just crunch numbers. Image mode is the attribute that provides a human meaning for the numbers they crunch. In
general, the numbers that describe pixels relate to tonal values, with lower numbers representing darker tones and
higher ones representing brighter tones. In an 8-bit/channel grayscale image, 0 represents solid black, 255
represents pure white, and the intermediate numbers represent intermediate shades of grey. In the colour image
modes, the numbers represent shades of a primary colour rather than shades of grey. So, an RGB image is actually
made up of three grayscale channels: one representing red values, one green, and one blue.
BAYES’ THEOREM/ BAYES’ LAW/ BAYES’ RULE: (2018107041)
It is a concept in decision theory which describes the probability of an event, based on prior knowledge of conditions that might be related to the event. Bayesian inference is a method of statistical inference in which Bayes’ theorem is used to update the probability of a hypothesis as more evidence or information becomes available. For example, if the risk of developing health problems is known to increase with age, Bayes’ theorem allows the risk to an individual of a known age to be assessed more accurately than by simply assuming that the individual is typical of the population as a whole. So, it is suitable to say that this law is purely evidence based.

Mathematical Expression:

P(A|B) = P(B|A) P(A) / P(B), where A and B are events and P(B) ≠ 0.

P(A|B): conditional probability – likelihood of event A occurring, given B is true
P(B|A): conditional probability – likelihood of event B occurring, given A is true
P(A) and P(B): marginal probabilities of observing A and B

Example:

A person who has corona shows lung congestion 90% of the time (the symptom is sensitive).
∴ True positive rate = 0.90
A person who does not have corona shows no lung congestion 80% of the time (the symptom is specific).
∴ False positive rate = 0.20
Assume 5% prevalence of corona; i.e., prevalence = 0.05.
What is the probability that a person who "tested positive" (shows lung congestion) really has corona?

The Positive Predictive Value of a test is the proportion of persons who are actually positive out of all those testing positive, and can be calculated from a sample as

PPV = True positives / Tested positive

Here the "test" is lung congestion, so we want P(Corona | Lung congestion):

P(Corona | L.C) = P(L.C | Corona) P(Corona) / P(L.C)
= P(L.C | Corona) P(Corona) / [ P(L.C | Corona) P(Corona) + P(L.C | No corona) P(No corona) ]
= (0.90 × 0.05) / (0.90 × 0.05 + 0.20 × 0.95)
= 0.045 / (0.045 + 0.19) ≈ 19%

- Sensitivity measures the proportion of positives that are correctly identified (e.g., the percentage of sick people who are correctly identified as having some illness).

- Specificity measures the proportion of negatives that are correctly identified (e.g., the percentage of healthy people who are correctly identified as not having some illness).

- NOTE: More common = higher prevalence; less common = lower prevalence.


The Law of Total Probability gives

P(L.C) = P(L.C | Corona) P(Corona) + P(L.C | No corona) P(No corona),

which can be evaluated when the sensitivity, specificity and prevalence are known.

Even if someone has lung congestion, the probability that they are infected with corona is only about 19%, because in this group only 5% of people actually have corona; most positives are false positives coming from the remaining 95%.
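The worked example can be reproduced in a few lines (a sketch using the numbers above):

sensitivity = 0.90       # P(lung congestion | corona)
false_positive = 0.20    # P(lung congestion | no corona) = 1 - specificity
prevalence = 0.05        # P(corona)

p_lc = sensitivity * prevalence + false_positive * (1 - prevalence)   # law of total probability: P(lung congestion)
ppv = sensitivity * prevalence / p_lc                                  # Bayes' theorem: P(corona | lung congestion)
print(round(ppv, 3))                                                   # 0.191, i.e. roughly 19%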

Derivation:

(For Events)

Bayes’ theorem may be derived from the definition of conditional probability:

P(A|B) = P(A∩B) / P(B), if P(B) ≠ 0,
P(B|A) = P(B∩A) / P(A), if P(A) ≠ 0,

where P(A∩B) is the joint probability of both A and B being true. Because

P(B∩A) = P(A∩B)
→ P(A∩B) = P(A|B) P(B) = P(B|A) P(A)
→ P(A|B) = P(B|A) P(A) / P(B), if P(B) ≠ 0.

Bayesian Interpretation:

In the Bayesian (or epistemological) interpretation, probability measures a "degree of belief". Bayes' theorem links the degree of belief in a proposition before and after accounting for evidence. For example, suppose it is believed with 50% certainty that a coin is twice as likely to land heads than tails. If the coin is flipped a number of times and the outcomes observed, that degree of belief will probably rise or fall, but might even remain the same, depending on the results. For proposition A and evidence B,

• P(A), the prior, is the initial degree of belief in A.
• P(A|B), the posterior, is the degree of belief after incorporating the news that B is true.
• The quotient P(B|A) / P(B) represents the support B provides for A.

For more on the application of Bayes' theorem under the Bayesian interpretation of probability, see Bayesian inference.

Classification of Pixel in an image:

P(Cᵢ|X) = P(X|Cᵢ) P(Cᵢ) / P(X), where X is the pixel vector and Cᵢ is the class to which the pixel may belong.

Discriminant Function:

A function of several variates used to assign items into one of two or more groups. The function for a particular set of items is obtained from measurements of the variates of items which belong to a known group.

Discriminant analysis is a statistical technique used to classify observations into non-overlapping groups, based on scores on one or more quantitative predictor variables. For example, a doctor could perform a discriminant analysis to identify patients at high or low risk for stroke. It is used to study group differences, relationships and group membership.

Assumptions here:

Adequate sample size (80:20), normal distribution, homogeneous covariance, absence of extreme outliers, non-multicollinearity, mutually exclusive group membership, classification variability.

SUPERVISED CLASSIFICATION METHODS: (2018107045)
In supervised classification the user or image analyst “supervises” the pixel
classification process. The user specifies the various pixels values or spectral
signatures that should be associated with each class. This is done by selecting
representative sample sites of a known cover type called Training Sites or Areas.
The computer algorithm then uses the spectral signatures from these training
areas to classify the whole image. Ideally, the classes should not overlap or should
only minimally overlap with other classes. In ENVI there are four different
classifications - Maximum Likelihood, Minimum Distance, Mahalanobis Distance,
Spectral Angle Mapper

Maximum Likelihood:

Assumes that the statistics for each class in each band are normally distributed
and calculates the probability that a given pixel belongs to a specific class. Each
pixel is assigned to the class that has the highest probability (that is, the maximum
likelihood). This is the default.

Minimum Distance:

Uses the mean vectors for each class and calculates the Euclidean distance from
each unknown pixel to the mean vector for each class. The pixels are classified to
the nearest class.
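A minimal sketch of the minimum-distance rule with NumPy; the class mean vectors and the unknown pixel below are made-up values for a four-band image:

import numpy as np

class_means = np.array([           # one mean vector per training class, one column per band
    [30.0, 40.0, 35.0, 90.0],      # e.g. vegetation
    [60.0, 55.0, 50.0, 40.0],      # e.g. soil
    [10.0, 15.0, 12.0,  5.0],      # e.g. water
])

pixel = np.array([28.0, 42.0, 33.0, 85.0])                # spectral vector of an unknown pixel
distances = np.linalg.norm(class_means - pixel, axis=1)   # Euclidean distance to each class mean
print(np.argmin(distances))                               # 0: the pixel is assigned to the nearest class (vegetation)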

Mahalanobis Distance:

A direction-sensitive distance classifier that uses statistics for each class. It is similar to maximum likelihood
classification, but it assumes all class covariances are equal, and therefore is a faster method. All pixels are classified
to the closest training data.

Spectral Angle Mapper:

(SAM) is a physically-based spectral classification that uses an n-dimensional angle to match pixels to training data. This method determines the spectral similarity between two spectra by calculating the angle between them, treating them as vectors in a space with dimensionality equal to the number of bands. This technique, when used on calibrated reflectance data, is relatively insensitive to illumination and albedo effects.
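The spectral angle itself is simply the angle between a pixel vector and a reference (training) spectrum; a short sketch with made-up reflectance values, where the pixel is a brighter copy of the reference:

import numpy as np

reference = np.array([0.05, 0.08, 0.06, 0.40, 0.35])   # training (endmember) spectrum
pixel = 2.0 * reference                                 # same shape, twice the brightness

cos_angle = pixel.dot(reference) / (np.linalg.norm(pixel) * np.linalg.norm(reference))
angle = np.arccos(np.clip(cos_angle, -1.0, 1.0))        # spectral angle in radians
print(angle)                                            # ~0: SAM is insensitive to overall illumination/albedo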

Advantages and Disadvantages:

In supervised classification the majority of the effort is done prior to the actual classification process. Once the classification is run, the output is a thematic image with classes that correspond to information classes or land cover types. Supervised classification can be much more accurate than unsupervised classification, but depends heavily on the training sites, the skill of the individual processing the image, and the spectral distinctness of the classes. If two or more classes are very similar to each other in terms of their spectral reflectance (e.g., annual-dominated grasslands vs. perennial grasslands), mis-classifications will tend to be high. Supervised classification requires close attention to the development of training data. If the training data is poor or not representative, the classification results will also be poor. Supervised classification generally requires more time and money compared to unsupervised.

PARAMETRIC AND NON-PARAMETRIC CLASSIFICATION: (2018107043)
Digital image processing uses the techniques like image processing, image enhancement, photogrammetric
image processing of stereoscopic imagery, parametric and non-parametric information extraction, expert system
and neural network image analysis, hyper spectral data analysis and change detections.

a) Parametric Information Extraction:

Scientists attempting to extract land-cover information from remotely sensed data now routinely specify if the
classification is to be:

• Hard (sometimes referred to as crisp), with discrete mutually exclusive classes, or fuzzy, where the proportions
of materials within pixels are extracted;

• Based on individual pixels (referred to as a per-pixel classification) or if it will use object-based image analysis
(OBIA) segmentation algorithms that take into account not only the spectral characteristics of a pixel, but also
the spectral characteristics of contextual surrounding pixels. Thus, the algorithms take into account spectral and
spatial information.

Once these issues are addressed, it is a matter of determining whether to use parametric (based on the analysis of
normally distributed data), nonparametric, and/or nonmetric classification techniques. The maximum likelihood
classification algorithm continues to be a widely used parametric classification algorithm. Unfortunately, the
algorithm requires normally distributed training data in n bands for computing the class variance and covariance
matrices. It is difficult to incorporate nonimage categorical data into a maximum likelihood classification. Support
Vector Machine (SVM) classification is also very effective especially when spectral training data consist of mixed
pixels.

b) Nonparametric Information Extraction:

Nonparametric clustering algorithms, such as ISODATA, continue to be used extensively in digital image
processing research. Unfortunately, such algorithms depend on how the seed training data are extracted and it is
often difficult to label the clusters to turn them into useful information classes. For these and other reasons there
has been a significant increase in the development and use of artificial neural networks (ANN) for remote sensing
applications. An ANN does not require normally distributed training data. An ANN may incorporate virtually any type
of spatially distributed data in the classification. The only drawback is that sometimes it is difficult to determine
exactly how the ANN came up with a certain conclusion because the information is locked within the weights in the
hidden layer(s). Scientists continue to work on ways to extract hidden information so that the rules used can be
more formally stated. The ability of an ANN to learn should not be underestimated.

c) Nonmetric Information Extraction:

It is difficult to make a computer understand and use the heuristic rules of thumb and knowledge that a human
expert uses when interpreting an image. Nevertheless, there has been progress in the use of artificial intelligence
(AI) to try to make computers do things that, at the moment, people do better. One area of AI that has great
potential for image analysis is the use of expert systems that place all the information contained within an image in
its proper context with ancillary data and extract valuable information.

Parametric digital image classification techniques are based primarily on summary statistics such as the mean,
variance, and covariance matrices. Decision-tree or rule-based classifiers are not based on inferential statistics, but
instead “let the data speak for itself”. In other words, the data retains its precision and is not dumbed down by
summarizing it through means, etc. Decision-tree classifiers can process virtually any type of spatially distributed
data and can incorporate prior probabilities

There are several approaches to rule creation, including:

1) explicitly extracting knowledge and creating rules from experts,

2) implicitly extracting variables and rules using cognitive methods

3) empirically generating rules from observed data and automatic induction methods. The development of a
decision tree using human-specified rules is time-consuming and difficult. However, it rewards the user with detailed
information about how individual classification decisions were made.

Ideally, computers can derive the rules from training data without human intervention. This is referred to as
machine-learning. The analyst identifies representative training areas. The machine learns the patterns from these
training data, creates the rules, and uses them to classify the remotely sensed data. The rules are available to
document how decisions were made. A drawback of artificial neural network and machine learning classifiers in
general is the need for a large sample size of training data.

SUPERVISED TRAINING AND REFINEMENT STAGES: (2018107044)


The analyst "supervises" the categorization of a set of specific classes by providing training statistics that identify each category.

• In the imagery, the analyst identifies homogeneous, representative examples of the various surface cover types (information classes) of interest.

• These samples are referred to as training areas.

• The selection of appropriate training areas is based on the analyst’s familiarity with the geographical area and
their knowledge of the actual surface cover types present in the image.

• Good visual interpretation skills are mandatory for success.

Step 1: Define training data

• Each training site should appear homogenous and representative of the legend class

• Delineate several training sites for each legend class.

• Make each training site at least 20-25 pixels (i.e., > 5 acres for 30 meter pixels)

• Each class should be represented by ~100 × n pixels (where n = number of spectral bands in the data set)

Training Sites:

* For 6-band TM & ETM imagery, the total number of training pixels per class should be at least 600.

* Try to capture the landscape diversity of the class.

Refinement stages:

• The first stage saliency map S^1 generated by the feedforward network is coarse compared to the original
resolution ground truth.

• Thus, in the second stage, we adopt a refinement net used for the subsequent refinement as shown in
following figure.

We use a network structure composed of the first four convolutional blocks of ResNet-50 (denoted as C1′, C2_x′, C3_x′ and C4_x′) with different parameters than those used in stage 1. This allows a more flexible account of different structures and helps learn stage-specific refinements. S^1 serves as the input to the subsequent incorporation module R^1 and is refined to progressively increase the resolution. Similarly, in a subsequent stage t (t ∈ {2, ..., T}), each incorporation module R^(t-1) aggregates information from the preceding coarse map encoding S^(t-1) and outputs the feature F^t of the refinement net in stage t. Each module R^(t-1) takes as input a mask encoding S^(t-1) generated in the master pass, along with matching features F^t generated in the refinement pass. It learns to merge this information in order to generate a new prediction encoding S^t:

S^t = R^(t-1)(S^(t-1), F^t), where S^(t-1) and S^t denote the t-th stage input and output, respectively.

Structure Details:

The following image shows a detailed illustration of the first refinement module R^1 adopted in stage 2 (i.e., concatenating a coarse saliency map S^1 from the master pass with a feature map F^2 from a refinement pass) to generate a finer saliency map S^2. Since S^1 (12×12 pixels) is coarser than F^2 (23×23), we first upsample S^1 to double its size. Then, we combine the upsampled saliency map with the feature maps F^2 to generate S^2. We append two extra convolutional layers behind the fourth convolutional block (C4_x′) to reduce the dimension. The first extra layer has 3×3 kernels and 256 channels, while the second extra layer (the output feature map) has 3×3 kernels and 64 channels.

• The structure of the pyramid pooling module: from the first row to the last, 1×1, 2×2, 3×3 and 6×6 feature bins, which are achieved by employing variable-size pooling kernels with different strides on the output feature map (e.g., C4_x′).

SUPPORT VECTOR MACHINE: (2018107053)
Support vector machines (SVMs) are a set of supervised learning methods used for classification, regression and outlier detection. A support vector machine (SVM) is a machine learning algorithm that analyzes data for classification and regression analysis. SVM is a supervised learning method that looks at data and sorts it into one of two categories. An SVM outputs a map of the sorted data with the margins between the two classes as far apart as possible.
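A short sketch of this two-category sorting using scikit-learn's SVC (assuming scikit-learn is installed); the two "spectral classes" below are made-up two-band samples:

import numpy as np
from sklearn import svm

X = np.array([[1.0, 1.2], [1.1, 0.9], [0.9, 1.0],    # class 0 samples
              [3.0, 3.1], [2.9, 3.3], [3.2, 2.8]])   # class 1 samples
y = np.array([0, 0, 0, 1, 1, 1])

clf = svm.SVC(kernel="linear")                 # other kernels (rbf, poly, ...) can be specified
clf.fit(X, y)
print(clf.predict([[1.0, 1.1], [3.1, 3.0]]))   # [0 1]
print(clf.support_vectors_)                    # the training points that define the margin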

Advantages:

• Effective in high dimensional spaces.


• Still effective in cases where number of dimensions is greater than the number of samples.
• Uses a subset of training points in the decision function (called support vectors), so it is also memory
efficient.
• Versatile: different Kernel functions can be specified for the decision function. Common kernels are
provided, but it is also possible to specify custom kernels.

Disadvantages of SVMs:

• If the number of features is much greater than the number of samples, avoiding over-fitting by choosing appropriate kernel functions and a regularization term is crucial.
• SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation (see Scores and probabilities, below).

Applications of SVMs:

SVMs can be used to solve various real-world problems:

• SVMs are helpful in text and hyper-text categorization, as their application can significantly reduce the need
for labeled training instances in both the standard inductive and transductive settings. Some methods for
shallow semantic parsing are based on support vector machines.
• Classification of images can also be performed using SVMs. Experimental results show that SVMs achieve significantly higher search accuracy than traditional query refinement schemes after just three to four rounds of relevance feedback. This is also true for image segmentation systems, including those using a modified version of SVM that uses the privileged approach as suggested by Vapnik.
• Classification of satellite data like SAR data using supervised SVM.
• Hand-written characters can be recognized using SVM.
• The SVM algorithm has been widely applied in the biological and other sciences.
• They have been used to classify proteins with up to 90% of the compounds classified correctly. Permutation
tests based on SVM weights have been suggested as a mechanism for interpretation of SVM models.
Support-vector machine weights have also been used to interpret SVM models in the past. Posthoc

interpretation of support-vector machine models in order to identify features used by the model to make
predictions is a relatively new area of research with special significance in the biological sciences.

History of SVMs:

The original SVM algorithm was invented by Vladimir N. Vapnik and Alexey Ya. Chervonenkis in 1963. In 1992, Bernhard Boser, Isabelle Guyon and Vladimir Vapnik suggested a way to create non-linear classifiers by applying the kernel trick to maximum-margin hyperplanes. The current standard incarnation (soft margin) was proposed by Corinna Cortes and Vapnik in 1993 and published in 1995.

UNSUPERVISED CLASSIFICATION: (2018107047)


Unsupervised classification is a form of pixel-based classification and
is essentially computer automated classification. The user specifies the
number of classes and the spectral classes are created solely based on the
numerical information in the data (i.e. the pixel values for each of the
bands or indices). Clustering algorithms are used to determine the
natural, statistical grouping of the data. The pixels are grouped together
into based on their spectral similarity. The computer uses feature space
to analyze and group the data into classes. Roll over the below image to
see how the computer might use feature space to group the data into ten
classes.

While the process is basically automated, the user has control over certain
inputs. This includes the Number of Classes, the Maximum Iterations,
(which is how many times the classification algorithm runs) and the
Change Threshold %, which specifies when to end the classification
procedure. After the data has been classified the user has to interpret,
label and color code the classes accordingly. Unsupervised classification
generates clusters based on similar spectral characteristics inherent in the
image. Then, you classify each cluster without providing training samples
of your own. The steps for running an unsupervised classification are: 1.
Generate clusters 2. Assign classes

Step 1: Generate clusters

INPUT: The image you want to classify.

NUMBER OF CLASSES: The number of classes you want to generate during the unsupervised classification. For example, if you are working with multispectral imagery (red, green, blue, and NIR bands), then the number here might be 40 (4 bands × 10 clusters per band).

MINIMUM CLASS SIZE: This is the number of pixels needed to make a unique class. When you click OK, it creates clusters based on your input parameters. But you still need to identify which land cover classes each cluster belongs to.

Step 2: Assign classes

Now that you have clusters, the last step is to identify each class from
the iso-clusters output. Here are some tips to make this step easier:

• In general, it helps to select a representative color for each class; for example, set water to blue.

• After setting each one of your classes, we can merge the classes by using the reclassify tool.

Unsupervised classification is where the outcomes (groupings of pixels with common characteristics) are based on
the software analysis of an image without the user providing sample classes. The computer uses techniques to
determine which pixels are related and groups them into classes. The user can specify which algorithm the software
will use and the desired number of output classes but otherwise does not aid in the classification process. However,
the user must have knowledge of the area being classified when the groupings of pixels with common characteristics
produced by the computer have to be related to actual features on the ground (such as wetlands, developed areas,
coniferous forests, etc.).

OUTPUT: Unsupervised classification methods generate a map with each pixel assigned to a particular class based
on its multispectral composition. The number of classes can be specified by the user or may be determined by the
number of natural groupings in the data. The user must then assign meaning to the classes, and combine or split
classes where necessary to generate a meaningful map.
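A sketch of the two steps (generate clusters, then assign classes) using k-means clustering from scikit-learn; the "image" here is random data standing in for a real 4-band scene:

import numpy as np
from sklearn.cluster import KMeans

bands = np.random.rand(100, 100, 4)              # fake image: 100 x 100 pixels, 4 bands
pixels = bands.reshape(-1, 4)                    # one row per pixel, one column per band

kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(pixels)
cluster_map = kmeans.labels_.reshape(100, 100)   # step 1: every pixel assigned to a spectral cluster
print(np.unique(cluster_map))                    # cluster ids 0..9
# Step 2 is manual: inspect the clusters, relabel them as land-cover classes, and merge where necessary.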

Successful Rangeland Uses

Unsupervised classification has been used extensively in rangelands for a wide range of applications, including:

• Land cover classes


• Major vegetation types
• Distinguishing native vs invasive species cover
• Vegetation condition
• Disturbed areas (e.g., fire)
• Land use change

Data Inputs

Unsupervised classification can be performed with any number of different remote-sensing or GIS-derived inputs.
Commonly, spectral bands from satellite or airborne sensors, band ratios or vegetation indices (e.g., NDVI), and
topographic data (e.g., elevation, slope, aspect) are used as inputs for unsupervised classification.

Software/Hardware Requirements

Unsupervised classification is relatively easy to perform in any remote sensing software (e.g., ERDAS Imagine, ENVI, Idrisi), and even in many GIS programs (e.g., ArcGIS with Spatial Analyst or Image Analysis extensions, GRASS).

Advantages:

• Unsupervised classification is
fairly quick and easy to run.
• There is no extensive prior
knowledge of area required, but
you must be able to identify and
label classes after the
classification.
• The classes are created purely
based on spectral information,
therefore they are not as
subjective as manual visual
interpretation.

Disadvantages:

• One of the disadvantages is that the spectral classes do not always correspond to informational classes.
• The user also has to spend time interpreting and labeling the classes following the classification.
• Spectral properties of classes can also change over time, so you can’t always use the same class information
when moving from one image to another.

ISODATA -- A Special Case of Minimum Distance Clustering

• “Iterative Self-Organizing Data Analysis Technique”


• Parameters you must enter include:
–N - the maximum number of clusters that you want
–T - a convergence threshold and
–M - the maximum number of iterations to be performed.

ISODATA Procedure
• N arbitrary cluster means are established,
• The image is classified using a minimum distance classifier
• A new mean for each cluster is calculated
• The image is classified again using the new cluster means
• Another new mean for each cluster is calculated
• The image is classified again...
• After each iteration, the algorithm calculates the percentage of pixels that remained in the same cluster
between iterations
• When this percentage exceeds T (convergence threshold), the program stops or…
• If the convergence threshold is never met, the program will continue for M iterations and then stop.
• Not biased to the top pixels in the image (as sequential clustering can be)
• Non-parametric--data does not need to be normally distributed
• Very successful at finding the “true” clusters within the data if enough iterations are allowed
• Cluster signatures saved from ISODATA are easily incorporated and manipulated along with (supervised)
spectral signatures
• Slowest (by far) of the clustering procedures.

Unsupervised Classification

• Alternatives to ISODATA approach


• K-means algorithm
• assumes that the number of clusters is known a priori, while ISODATA allows for different number of clusters
• Non-iterative
• Identify areas with “smooth” texture
• Define cluster centers according to first occurrence in image of smooth areas
• Agglomerative hierarchical
• Group two pixels closest together in spectral space
• Recalculate position as mean of those two; group
• Group next two closest pixels/groups
• Repeat until each pixel grouped

SUB-PIXEL CLASSIFICATION: (2018107049)
Sub-pixel level classification deals with performing the classification by breaking the pixel into more pixels based on spectral unmixing, identifying the abundance of classes using fuzzy logic, whereas object-oriented classification is based on information from a set of similar pixels (objects).

Sub-pixel mapping combines the advantages of both hard and soft classification techniques, producing an easily interpretable map with the high information content of a sub-pixel classification. As much information as possible is extracted from a soft classification, producing a hard classification on a finer scale by assuming spatial dependence. Sub-pixel mapping is the process of spatially designating land cover class proportions to concrete pure sub-pixels.

Sub-pixel sharpening is a very closely related concept to sub-pixel mapping, which can be described as the process of spatially assigning land cover class proportions to sub-pixels, such that a soft classification is changed to a finer resolution. To address the "mixed pixel" problem, area enumeration is used. A pixel is a combination of different reflectors, a convolution of the radiance from those reflectors. MIXED PIXEL → impure pixel → a combination of pure elements. The task is to find out what the elements are and how much each contributes to the combination.

Spectral unmixing:

Spectral unmixing is used to retrieve class abundances within mixed pixels, which can then be used to fill "upsampled" sub-pixels.

Spectral mixture analysis: fractions of the pure reflectors.

For example, a pixel is a mix of water, soil and vegetation:
40% water + 40% soil + 20% veg = DN value
pure "water" pixel DN value is D1
pure "soil" pixel DN value is D2
pure "veg" pixel DN value is D3
Then 0.4 × D1 + 0.4 × D2 + 0.2 × D3 = DN value

Mixing model: regression of the observed DN value on the independent fractions. The linear mixing model (LMM) gives 0.4 × D1 + 0.4 × D2 + 0.2 × D3 = DN value. From such equations, written for pixels of various fractions and solved simultaneously, the fraction of a pure class is enumerated with less error of commission and error of omission.

The Linear Mixing Model: The idealized, pure signature for a spectral class
is called an endmember. Because of sensor noise and within-class signature
variability, endmembers only exist as a conceptual convenience and as
idealizations in real images. The spatial mixing of objects within a GIFOV
(ground instantaneous field of view) and the resulting linear mixing of their
spectral signatures are illustrated. Linear mixture modelling assumes a
single reflectance within the GIFOV and is an approximation to reality in
many cases; nonlinear mixing occurs whenever there is radiation
transmission through one of the materials (such as a vegetation canopy),
followed by reflectance at a second material (such as soil); or if there are
multiple reflections within a material or between objects within a GIFOV.

The linear mixing model for a single GIFOV. The boundaries between objects can be any shape and complexity; only
the fractional coverages and individual spectral reflectance are important.

Linear mixing is described mathematically as a linear vector-matrix equation:

DN_ij = E f_ij + ε_ij

where f_ij is the L × 1 vector of L endmember fractions for the pixel at (i, j), and E is the K × L endmember signature matrix, with each column containing one of the endmember spectral vectors. The left-hand side DN_ij is the K-dimensional spectral vector at pixel (i, j). The added term ε_ij represents the residual error in the fitting of a given pixel's spectral vector by the sum of L endmember spectra, plus unknown noise. Two constraints, f_l ≥ 0 and Σ_{l=1}^{L} f_l = 1, are additionally used.
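The vector-matrix equation above can be solved per pixel by least squares; a sketch with the water/soil/vegetation example (the endmember DN values are made up; NumPy assumed):

import numpy as np

E = np.array([[20.0, 80.0,  40.0],    # endmember signature matrix: 4 bands (rows) x
              [15.0, 90.0,  60.0],    # 3 endmembers (columns): water, soil, vegetation
              [10.0, 85.0, 120.0],
              [ 8.0, 70.0, 150.0]])

true_f = np.array([0.4, 0.4, 0.2])    # 40% water + 40% soil + 20% vegetation
DN = E @ true_f                       # the observed mixed-pixel spectral vector (noise-free here)

f, res, rank, sv = np.linalg.lstsq(E, DN, rcond=None)
print(f)                              # recovers approximately [0.4, 0.4, 0.2]
# The constraints f_l >= 0 and sum(f_l) = 1 from the text would require a constrained solver.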

FUZZY CLASSIFICATION: (2018107050)


Q1. What is fuzzy classification?

Fuzzy classification is the process of grouping elements into a fuzzy set whose membership function is defined by the truth value of a fuzzy propositional function. A fuzzy propositional function is, by analogy [5], an expression containing one or more variables, such that, when values are assigned to these variables, the expression becomes a fuzzy proposition. Accordingly, fuzzy classification is the process of grouping individuals having the same characteristics into a fuzzy set.

Q2. What is fuzzy image processing?

Fuzzy image processing is an attempt to translate this ability of human reasoning into computer vision problems as
it provides an intuitive tool for inference from imperfect data.

Q3. What is the purpose of Fuzzy Logic?

Fuzzy logic allows one to quantify appropriately and handle imperfect data. It also allows combining them for a final
decision, even if we only know heuristic rules, and no analytic relations.

Q4. What is special about fuzzy image processing?

Fuzzy image processing is special in terms of its relation to other computer vision techniques. It is not a solution for
a special task, but rather describes a new class of image processing techniques. It provides a new methodology,
augmenting classical logic, a component of any computer vision tool. A new type of image understanding and
treatment has to be developed. Fuzzy image processing can be a single image processing routine, or complement
parts of a complex image processing chain.

Q5. What are the basic components of fuzzy systems?

The two basic components of fuzzy systems are fuzzy sets and operations on fuzzy sets. Fuzzy logic defines rules,
based on combinations of fuzzy sets by these operations.

Q6. What are the operations done on fuzzy sets?

Operations on fuzzy sets. In order to manipulate fuzzy sets, we need to have operations that enable us to combine
them. As fuzzy sets are defined by membership functions, the classical set theoretic operations have to be replaced
by function theoretic operations.

Q7. What are linguistic variables?

Linguistic variables. An important feature of fuzzy systems is the concept of linguistic variables, introduced to reduce the complexity of precise definitions; they make use of words or sentences in a natural or artificial language to describe a vague property.

Q8. Define fuzzy logic

Fuzzy logic. The concept of linguistic variables allows us to define combinatorial relations between properties in
terms of a language. Fuzzy logic—an extension of classical Boolean logic—is based on linguistic variables, a fact
which has assigned fuzzy logic the attribute of computing with words

Q9. Why fuzzy image processing?

An edge in an image may show up only weakly or not at all. Additionally, the edge may be corrupted by noise. A noisy edge can appropriately be detected with probabilistic approaches, computing the likelihood of the noisy measurement belonging to the class of edges.

Q9. How do we define an edge? How do we classify an image that shows a gray-value slope?

A noisy slope stays a slope even if all noise is removed. If the slope is extended over the entire image we usually do
not call it an edge. But if the slope is “high” enough and only extends over a “narrow” region, we tend to call it an
edge.

Q10. What is strength of fuzzy logic?

Fuzzy logic does not need models. It can handle vague information, imperfect knowledge and combine it by heuristic
rules—in a well-defined mathematical framework. This is the strength of fuzzy logic!

Q11. Block diagram of fuzzy image processing

[Figure: block diagram of fuzzy image processing, handling imperfect knowledge in image processing.]

Q12. Define membership values

The definition of the membership values depends on the specific requirements of the particular application and on the corresponding expert knowledge. The figure shows an example where brightness and edginess are used to define the membership grade of each pixel.

Q13. Image Fuzzification

Fuzzy image processing is a kind of nonlinear image processing. The difference from other well-known methodologies is that fuzzy techniques operate on membership values. The image fuzzification (generation of suitable membership values) is, therefore, the first processing step. Generally, three types of image fuzzification can be distinguished: histogram-based gray-level fuzzification, local neighborhood fuzzification, and feature fuzzification.

Q14. Histogram-based gray-level fuzzification.

Histogram-based gray-level fuzzification. To develop any point operation (global histogram-based techniques), each gray level should be assigned one or more membership values according to the corresponding requirements.

Example: Image brightness. The brightness of an image can be regarded as a fuzzy set containing the subsets dark, gray, and bright intensity levels (of course, one may define more subsets such as very dark, slightly bright, etc.). Depending on the normalized image histogram, the location of the membership functions can be determined. It should be noted that for histogram-based gray-level fuzzification some knowledge about the image and its histogram is required (e.g., minimum and maximum of gray-level frequencies). The detection accuracy of these histogram points, however, need not be very high, as we are using the concept of fuzziness (we do not require precise data).
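A sketch of histogram-based gray-level fuzzification with triangular membership functions for dark, gray and bright; the breakpoints are made-up values that would normally be derived from the image histogram:

import numpy as np

def triangular(x, a, b, c):
    # Triangular membership function with feet at a and c and peak at b.
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

gray = np.array([10, 60, 128, 200, 250], dtype=float)   # a few gray levels to fuzzify

mu_dark = triangular(gray, -1, 0, 128)        # membership in "dark"
mu_gray = triangular(gray, 0, 128, 255)       # membership in "gray"
mu_bright = triangular(gray, 128, 255, 256)   # membership in "bright"
print(mu_dark, mu_gray, mu_bright)            # each gray level belongs to each subset to some degree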

Q15. Local neighbourhood fuzzification.

Local neighbourhood fuzzification. Intermediate techniques (e.g., segmentation, noise filtering etc.) operate on a
predefined neighborhood of pixels. To use fuzzy approaches to such operations, the fuzzification step should also
be done within the selected neighbourhood. The local neighborhood fuzzification can be carried out depending on
the task to be done. Of course, local neighborhood fuzzification requires more computing time compared with
histogram-based approach. In many situations, we also need more thoroughness in designing membership functions
to execute the local fuzzification because noise and outliers may falsify membership values.

Q16. Feature fuzzification

For high-level tasks, image features should usually be extracted (e.g., length of objects, homogeneity of regions, entropy, mean value, etc.). These features will be used to analyze the results, recognize the objects, and interpret the scenes. Applying fuzzy techniques to these tasks, we need to fuzzify the extracted features. It is necessary not only because fuzzy techniques operate only on membership values but also because the extracted features are often incomplete and/or imprecise.

Example: Object length. If the length of an object was calculated in a previous processing step, the fuzzy subsets very short, short, middle-long, long and very long can be introduced as terms of the linguistic variable length in order to identify certain types of objects.

26
Q17. How is an image characterized?

Characterization of an image is generally accomplished in terms of primitive attributes called features. Image
features can be mainly divided into two categories:

• “natural” features, that are defined by the visual appearance of an image (e.g. luminance of a region of pixels,
gray scale of regions);

• “artificial” features, that come from image manipulation (e.g. histograms, spectral graphs).

Q18. What is the aim of fuzzy image processing?

The aim of Fuzzy Image Processing (FIP) is to understand, represent, process an image and its features as fuzzy sets.
Moreover, the combination of fuzzy techniques with connectionist processing methods produces great advantages
and a flexible framework for solving complex problems. Neuro-fuzzy systems integrate the processing capabilities
of neural networks with the readability of fuzzy rule base providing results characterized by a high interpretability
and good degree of accuracy.

Q19. How can pixel classification be useful?

The classification of pixels can find applications in different areas related to the field of image processing. In
particular, classifying pixels of an image can be useful as a pre-processing or a post-processing phase for the image
segmentation problem, respectively to facilitate the segmentation process or to refine its results.

Q20. State the stages in fuzzy image processing.

Fuzzy image processing consists (as all other fuzzy approaches) of three stages: fuzzification, suitable operations on membership values, and, if necessary, defuzzification. The main difference from other methodologies in image processing is that the input data (histograms, gray levels, features, ...) are processed in the so-called membership plane, where one can use the great diversity of fuzzy logic, fuzzy set theory and fuzzy measure theory to modify/aggregate the membership values, classify data, or make decisions using fuzzy inference.

Q21. General structure of fuzzy image processing systems

Q22. State the operations in the membership plane.

The generated membership values are modified by a suitable fuzzy approach. This can be a modification, aggregation, classification, or processing by some kind of if-then rules.

Aggregation. Many fuzzy techniques aggregate the membership values to produce new memberships. Examples are fuzzy hybrid connectives and fuzzy integrals, to mention only some of them. The result of aggregation is a global value that considers different criteria, such as features and hypotheses, to deliver a certainty factor for a specific decision (e.g., pixel classification).
27
Modification. Another class of fuzzy techniques modifies the membership values in some way. Examples of such modifications are linguistic hedges and distance-based modification in prototype-based fuzzy clustering. The result of the modification is a new membership value for each fuzzified feature (e.g., gray level, segment, object).

Classification. Fuzzy classification techniques can be used to classify input data. They can be numerical approaches (e.g., fuzzy clustering algorithms, fuzzy integrals, etc.) or syntactic approaches (e.g., fuzzy grammars, fuzzy if-then rules, etc.). With regard to the membership values, classification can be a kind of modification (e.g., distance-based adaptation of memberships in prototype-based clustering) or aggregation (e.g., evidence combination by fuzzy integrals).

Inference. Fuzzy if-then rules can be used to make soft decisions using expert knowledge. Indeed, fuzzy inference can also be regarded as a kind of membership aggregation, because it uses different fuzzy connectives to fuse the partial truths in the premise and conclusion of if-then rules.

Q23. What is the need for defuzzification?

In many applications we need a crisp value as output. Fuzzy algorithms, however, always deliver fuzzy answers (a
membership function or a membership value). In order to reverse the process of fuzzification, we use defuzzification
to produce a crisp answer from a fuzzy output feature. Depending on the selected fuzzy approach, there are
different ways to defuzzify the results. The well-known defuzzification methods such as center of area and mean of
maximum are used mainly in inference engines. One can also use the inverse membership function if point operations are applied. A modification-based approach, for example, passes through all three of these stages (fuzzification, modification of the memberships, and defuzzification).
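The following minimal Python/NumPy sketch illustrates the two defuzzification methods mentioned above for a discrete fuzzy output; the Gaussian-shaped membership function used here is only a made-up example.

import numpy as np

def center_of_area(x, mu):
    # Center-of-area (centroid) defuzzification of a discrete
    # membership function mu defined over the values x.
    x, mu = np.asarray(x, float), np.asarray(mu, float)
    return float(np.sum(x * mu) / np.sum(mu))

def mean_of_maximum(x, mu):
    # Mean of the values at which the membership function is maximal.
    x, mu = np.asarray(x, float), np.asarray(mu, float)
    return float(np.mean(x[mu == mu.max()]))

x = np.arange(0, 256)                    # e.g., possible output gray levels
mu = np.exp(-((x - 180) / 30.0) ** 2)    # some illustrative fuzzy output
crisp_coa = center_of_area(x, mu)        # ~180
crisp_mom = mean_of_maximum(x, mu)       # 180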

Q24. Fuzzy geometry

Geometrical relationships between the image components play a key role in intermediate image processing. Many geometrical categories, such as area, perimeter, and diameter, have already been extended to fuzzy sets. The geometrical fuzziness arising during segmentation tasks can be handled efficiently if we consider the image or its segments as fuzzy sets.

28
The main application areas of fuzzy geometry are feature extraction (e.g., in image enhancement), image segmentation, and image representation. Fuzzy topology plays an important role in fuzzy image understanding, as already pointed out earlier. Typical fuzzy geometrical measures include compactness, index of area coverage, and elongatedness; a more detailed description of other aspects of fuzzy geometry can be found in the literature.

Q25. Theoretical components of fuzzy image processing

Fuzzy image processing is knowledge-based and nonlinear. It is based on fuzzy logic and uses its logical, set-
theoretical, relational and epistemic aspects. The most important theoretical frameworks that can be used to
construct the foundations of fuzzy image processing are: fuzzy geometry, measures of fuzziness/image information,
rule-based approaches, fuzzy clustering algorithms, fuzzy mathematical morphology, fuzzy measure theory, and
fuzzy grammars. Any of these topics can be used either to develop new techniques or to extend existing algorithms.

ACCURACY ESTIMATION: (2018107046)


Accuracy assessment is a general term comparing the classification to geographical data that are assumed to be
true, in order to determine the accuracy of the classification process. Usually, the assumed-true data are derived
from ground truth data. Because it is not practical to test every pixel in the classification image, a representative
sample of reference points in the image with known class values is used.

Sources of errors in remote sensing-derived information (figure not reproduced):

29
Ground Reference Test pixels:

Locate ground reference test pixels (or polygons if the classification is based on human visual interpretation) in
the study area.

• These sites are not used to train the classification algorithm and therefore represent unbiased reference
information.
• It is possible to collect some ground reference test information prior to the classification, perhaps at the
same time as the training data.
• Most often collected after the classification using a random sample to collect the appropriate number of
unbiased observations per class.

Accuracy assessment “best practices”

• 30-50 reference points per class is ideal


• Reference points should be derived from imagery or data acquired at or near the same time as the classified
image.

• If no other option is available, use the original image to visually evaluate the reference points (effective for
generalized classification schemes)

Sample size:

The sample size N to be used to assess the accuracy of a land-use classification map, based on binomial probability theory, is:

N = Z² (p)(q) / E²

where, p – expected percent accuracy,
q = 100 – p,
E – allowable error,
Z = 2 (from the standard normal deviate of 1.96 for the 95% two-sided confidence level).

For a sample for which the expected accuracy is 85% and the allowable error is 5% (i.e., at the 95% confidence level), the number of points necessary for reliable results is N = 2²(85)(15) / 5² = 204.

With an expected map accuracy of 85% and an acceptable error of 10%, the sample size for a map would be N = 2²(85)(15) / 10² = 51.
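A small Python sketch of the binomial sample-size formula above (the function name is ours) reproduces both results:

def accuracy_sample_size(p, E, z=2.0):
    # Sample size N = z^2 * p * q / E^2 for accuracy assessment,
    # with p and E given in percent (q = 100 - p).
    q = 100.0 - p
    return (z ** 2) * p * q / (E ** 2)

print(round(accuracy_sample_size(p=85, E=5)))    # 204 points
print(round(accuracy_sample_size(p=85, E=10)))   # 51 points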

30
Sample Design

There are basically five common sampling designs used to collect ground reference test data for assessing the accuracy of a remote sensing-derived thematic map:

1. random sampling,

2. systematic sampling,

3. stratified random sampling,

4. stratified systematic unaligned sampling

5. cluster sampling

Commonly Used Methods of Generating Reference Points

• Random: no rules are used; created using a completely random process


• Stratified random: points are generated proportionate to the distribution of classes in the image.
• Equalized random: each class has an equal number of random points
• With a “stratified random” sample, a minimum number of reference points in each class is usually
specified (i.e., 30)
• For example, a 3 class image (80% forest, 10% urban, 10% water) & 30 reference points:
• Completely random: 30 forest, 0 urban, 1 water
• Stratified random: 24 forest, 3 urban, 3 water
• Equalized random: 10 forest, 10 urban, 10 water

Error Matrix:

Once a classification has been sampled a contingency table (also referred to as an error matrix or confusion
matrix) is developed.

• This table is used to properly analyze the validity of each class as well as the classification as a whole.
• In this way we can evaluate the efficacy of the classification in more detail.

One way to assess accuracy is to go out in the field, observe the actual land class at a sample of locations, and compare it to the land class assigned on the thematic map.

• There are a number of ways to quantitatively express the amount of agreement between the ground truth classes and the remote sensing classes.
• One way is to construct a confusion matrix, alternatively called an error matrix.
• This is a square table, with as many rows as columns.
• Each row of the table is reserved for one of the information or remote sensing classes used by the classification algorithm.
• Each column displays the corresponding ground truth classes in an identical order.

31
Overall accuracy:

The diagonal elements tally the number of pixels classified correctly in each class.

An overall measure of classification accuracy is the sum of these diagonal elements divided by the total number of reference pixels, which in this example amounts to (35 + 37 + 41) / 136 = 113/136, or 83%.

But just because 83% of the classifications were accurate overall does not mean that each category was successfully classified at that rate.

Users accuracy:

A user of the imagery who is particularly interested in class A, say, might wish to know what proportion of pixels assigned to class A were correctly assigned. In this example 35 of the 39 pixels assigned to class A were correct, so the user's accuracy for this category is 35/39 = 90%.

32
Producers accuracy:

Contrasted with user's accuracy is producer's accuracy, which has a slightly different interpretation.

• Producer's accuracy is a measure of how much of the land in each category was classified correctly.
• It is found, for each class or category, as the number of correctly classified pixels of that class divided by the total number of ground truth (reference) pixels of that class, i.e. the column total. For class A in the error matrix below, the producer's accuracy is 35/50 = 70%.

Kappa Analysis

Khat (K̂) Coefficient of Agreement:

• Kappa analysis yields a Khat statistic, which is an estimate of Kappa.
• It is a measure of agreement or accuracy between the remote sensing-derived classification map and the reference data as indicated by (a) the major diagonal, and (b) the chance agreement, which is indicated by the row and column totals (referred to as marginals).

Kappa Coefficient

• Expresses the proportionate reduction in error generated by the classification in comparison with a
completely random process.
• A value of 0.82 implies that 82% of the errors of a random classification are being avoided.

The Kappa coefficient is not as sensitive to differences in sample sizes between classes and is therefore
considered a more reliable measure of accuracy; Kappa should always be reported. A Kappa of 0.8 or above is
considered a good classification; 0.4 or below is considered poor.

33
Khat = [ M × Σ nii − Σ (ni × nj) ] / [ M² − Σ (ni × nj) ]

where,
r = number of rows (classes) in the error matrix,
nii = number of observations in row i, column i (the diagonal element for class i),
ni = total number of observations in row i,
nj = total number of observations in the corresponding column i,
M = total number of observations in the matrix,
and each sum runs over the r classes.

KAPPA COEFFICIENT

For an error matrix with r rows, and hence the same number of columns, let A = the sum of the r diagonal elements (which is the numerator in the computation of overall accuracy) and let B = the sum of the r products (row total × column total). Then

Khat = (N × A − B) / (N² − B)

where N is the number of pixels in the error matrix (the sum of all r² individual cell values).

                                Ground truth classes        No. classified
                                A        B        C         pixels
Thematic map classes     A      35       2        2         39
                         B      10       37       3         50
                         C      5        1        41        47
No. ground truth pixels         50       40       46        136

For the above error matrix, A = 35 + 37 + 41 = 113; B = (39×50) + (50×40) + (47×46) = 6112; N = 136, so Khat = (136×113 − 6112) / (136² − 6112) ≈ 0.75.
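The following Python/NumPy sketch computes the overall, user's and producer's accuracies and the Kappa coefficient directly from the error matrix of this example (variable names are ours):

import numpy as np

# Error matrix from the example above: rows = thematic map classes (A, B, C),
# columns = ground truth classes (A, B, C).
cm = np.array([[35,  2,  2],
               [10, 37,  3],
               [ 5,  1, 41]], dtype=float)

N = cm.sum()                                        # 136
A = np.trace(cm)                                    # 113 (diagonal sum)
B = np.sum(cm.sum(axis=1) * cm.sum(axis=0))         # 6112 (sum of row*column totals)

overall_accuracy = A / N                            # ~0.83
users_accuracy = np.diag(cm) / cm.sum(axis=1)       # per map class (row totals)
producers_accuracy = np.diag(cm) / cm.sum(axis=0)   # per reference class (column totals)
kappa = (N * A - B) / (N ** 2 - B)                  # ~0.75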

34
ANN FOR IMAGE CLASSIFICATION: (2018107048)
Biological Neurons:

The brain is made up of neurons, which process information. This information is transmitted to other neurons and finally a decision is reached. The network of human neurons is massively parallel: it contains on the order of 10 billion neurons and 60 trillion interconnections. The processing speed of a biological neuron is on the order of 10⁻³ s, five to six orders of magnitude slower than a digital computer.

Artificial Neurons: (ANN)

An ANN is a computing system designed to simulate the functioning of the human brain. It is a branch of Artificial Intelligence and solves problems that would prove difficult or impossible for human beings. The processing speed of a computing element is on the order of 10⁻⁹ s. When the relationship between inputs and outputs is non-linear, we use an ANN. Here, every input is connected with an individual strength (weight), and the output is obtained by adjusting these connection strengths.

ELEMENTS OF ARTIFICIAL NEURON NETWORKS:

Keeping the above characteristics in mind, we can derive the basic elements as follows:

• Processing Elements
• Topology
• Learning Algorithm

Processing Elements:

In general, a processing unit is made up of a summing unit followed by an output unit. The function of the summing unit is to take n input values, weight each input value, and calculate the weighted sum of those values. The sign of each weight determines whether the corresponding input is excitatory (positive weight) or inhibitory (negative weight). The weighted sum produced by the summing unit is known as the activation value, and based on this activation value the output unit produces the output.
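A minimal Python sketch of such a processing element, assuming NumPy and a simple step function as the output unit (the actual activation function depends on the network), might look like this:

import numpy as np

def neuron(inputs, weights, bias=0.0):
    # Summing unit: weighted sum of the inputs (the activation value),
    # followed by an output unit, here a simple threshold (step) function.
    activation_value = np.dot(inputs, weights) + bias
    return 1 if activation_value > 0 else 0

# Example: two inputs, one excitatory (+) and one inhibitory (-) weight
out = neuron(inputs=[0.8, 0.3], weights=[0.9, -0.5], bias=-0.2)   # -> 1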

Topology:

The organization or arrangement of processing elements, their interconnections, inputs and outputs is simply known
as Topology. Some of the commonly used Topologies are:

❖ Instar
❖ Group of outstars
❖ Outstar
❖ Bidirectional Associative Memory
❖ Group of instars
❖ Auto associative Memory

35
Learning Algorithm or Laws:

The operation of any neural network is governed by both activation state dynamics and synaptic weight dynamics. Learning algorithms or laws are implementations of synaptic dynamics and are described in terms of the first derivative of the weights. Some of the commonly known learning algorithms are:

• Hebb's Law • Perceptron Learning Law • Delta Learning Law • Correlation Learning Law • Instar Learning Law • Outstar Learning Law

CHARACTERISTICS OF ARTIFICIAL NEURON NETWORKS:

1) Nonlinearity: An artificial neuron is a nonlinear device, so a network of interconnected neurons is itself nonlinear; moreover, the nonlinearity is distributed throughout the network.

2) Input/Output mapping: The network learns an input/output mapping by adjusting its weights until the output is close to the desired output. Since this is not achieved immediately, a learning algorithm is used.

3) Evidential Response: In classification, an ANN can provide not only a decision but also a measure of confidence in that decision.

4) Fault Tolerance: The failure of a few neurons does not seriously affect the others; overall performance depends on how many neurons fail and degrades gradually (graceful degradation).

5) VLSI implementability: The massively parallel structure makes it possible to implement very large numbers of neurons together in VLSI hardware.

6) Neurobiological Analogy: Two seemingly unrelated inputs can be compared for their shared qualities; the network can make rational associations by finding connections and comparisons between dissimilar things.

7) Adaptivity: The free parameters are adapted through a continuing process of stimulation by the environment in which the network is embedded.

ADVANTAGES:

• The main advantage of ANN is parallel processing.


• Due to their parallel processing structure, any failure in one neural element will not affect the rest of the
process.
• Neural networks can be applied to a wide range of applications and can solve complex problems.
• By implementing appropriate learning algorithms, an ANN can be made to learn without reprogramming.

DISADVANTAGES:

• All the parallel processing requires a huge amount of processing power and time.
• There is a requirement for a “training” period before real-world implementation.

36
APPLICATIONS IN IMAGE PROCESSING:

• Recognition of symbols (used in Olympics)


• Recognition of handwriting
• Segmentation of image
• Classification and segmentation of texture

AREAS OF ARTIFICIAL INTELLIGENCE (figure not reproduced)

DECISION TREE CLASSIFIER: (2018107055)


A decision tree is a type of multistage classifier that can be applied to a single image or a stack of images. It is
made up of a series of binary decisions that are used to determine the correct category for each pixel. The decisions
can be based on any available characteristic of the dataset. For example, you may have an elevation image and two
different multispectral images collected at different times, and any of those images can contribute to decisions
within the same tree. No single decision in the tree performs the complete segmentation of the image into classes.
Instead, each decision divides the data into one of two possible classes or groups of classes.

Given data consisting of attributes together with their classes, a decision tree produces a sequence of rules that can be used to classify the data. A decision tree is a predictive machine-learning model that decides a new sample's target value (dependent variable) based on the differing attribute values of the available data. The tree's internal nodes denote different attributes, with the branches between nodes representing the possible values of those attributes in the observed samples, while the terminal nodes give the final value of the dependent variable. The predicted attribute is known as the dependent variable, since its value depends on, or is decided by, the values of the other attributes. The attributes which aid prediction of the dependent variable's value are known as the independent variables in the dataset.

A decision tree classifier is a hierarchically based classifier which compares the data with a range of properly selected features. When a decision tree classifier provides only two outcomes at each stage, it is called a binary decision tree (BDT) classifier. Features often used are as follows:

1. spectral values, 2. an index computed from spectral values, 3. principal components, 4. any arithmetically derived value.

37
Advantages:

1. Decision trees are simple to understand and visualise, require little data preparation, and can handle both numerical and categorical data.

2. The decision tree is a non-parametric method; that is, it is distribution-free and does not depend on probability distribution assumptions. It can work on high-dimensional data with excellent accuracy.

3. Decision trees can perform feature selection or variable screening. They can work on both categorical and numerical data and can handle problems with multiple results or outputs.

Disadvantages:

1. Decision trees can create complex trees that do not generalise well, and they can be unstable because small variations in the data might result in a completely different tree being generated.

In classification, a set of example records, called the training data set, is given, with each record consisting of several attributes. One of the categorical attributes, called the class label, indicates the class to which each record belongs. The objective of classification is to use the training data set to build a model of the class label such that it can be used to classify new data whose class labels are unknown. For example, spectral bands (defined over ranges of micrometres) can be used to classify an image into sea, forest, etc., based on their reflectance values arranged in the form of a decision tree.
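As a hedged illustration of this idea, the following sketch uses scikit-learn's DecisionTreeClassifier on made-up pixel spectral values and land-cover labels; the band values, class names and tree depth are assumptions, not taken from the text.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training samples: each row is a pixel's spectral values
# (e.g., green, red, near-infrared bands); labels are land-cover classes.
X_train = np.array([[30, 20, 10],    # water: low NIR reflectance
                    [40, 35, 90],    # forest: high NIR, lower red
                    [80, 85, 60],    # urban: bright in the visible bands
                    [28, 18, 12],
                    [45, 30, 95],
                    [90, 80, 65]])
y_train = ['water', 'forest', 'urban', 'water', 'forest', 'urban']

tree = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)

# Classify new pixels (rows of band values) with the learned binary decisions
new_pixels = np.array([[35, 25, 11], [42, 33, 88]])
print(tree.predict(new_pixels))      # e.g., ['water' 'forest']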

38
COLOR THEORY: (2018107017)
The human visual system can distinguish hundreds of thousands of different colour shades and intensities, but only around 100 shades of grey. Therefore, a great deal of extra information may be contained in the colour of an image, and this extra information can then be used to simplify image analysis, e.g. object identification and extraction based on colour.

Color Image Processing

The use of color is important in image processing because:

• Color is a powerful descriptor that simplifies object identification and extraction.


• Humans can discern thousands of color shades and intensities, compared to about only two dozen shades
of grey.

Color image processing is divided into two major areas:

• Full-color processing: images are acquired with a full-color sensor, such as a color TV camera or color
scanner.
• Pseudocolor processing: The problem is one of assigning a color to a particular monochrome intensity or
range of intensities.

Color Fundamentals

• Colors are seen as variable combinations of the primary colors of light [red (R), green (G), and blue (B)].
• The primary colors can be mixed to produce the secondary colors: magenta (red+blue), cyan (green+blue),
and yellow (red+green).
• Mixing the three primaries, or a secondary with its opposite primary color, produces white light.
• However, the primary colors of pigments are cyan (C), magenta (M), and yellow (Y), and the secondary colors
are red, green, and blue.
• A proper combination of the three pigment primaries, or a secondary with its opposite primary, produces
black.

Color characteristics:

1. HUE: The hue is determined by the dominant wavelength. Visible colours occur between about 400nm (violet)
and 700nm (red) on the electromagnetic spectrum.

2. SATURATION: The saturation is determined by the excitation purity, and depends on the amount of white light
mixed with the hue.

3. INTENSITY: The intensity is determined by the actual amount of light, with more light corresponding to more
intense colours.

➢ Hue and saturation together determine the chromaticity for a given colour.
➢ Achromatic light has no colour - its only attribute is quantity or intensity.
➢ Grey level is a measure of intensity.
➢ Brightness or Luminance is determined by the perception of the colour, and is therefore psychological.
Color models:

A color model is a specification of a coordinate system and a subspace within it where each color is represented as a single point.

1. RGB MODEL

• In the RGB model, an image consists of three independent image planes, one in each of the primary colours
(red, green and blue)
• Specifying a particular colour is by specifying the amount of each of the primary components present.

39
• This is an additive model, i.e. the colours
present in the light add to form new colours, and
is appropriate for the mixing of coloured light.

- The figure shows the geometry of the RGB colour model for specifying colours using a Cartesian coordinate system.

- The greyscale spectrum, i.e. those colours made from equal amounts of each primary, lies on the line joining the black and white vertices.

Use: The RGB model is used for colour monitors and most video cameras.

2. CMY MODEL

● The CMY (cyan-magenta-yellow) model is a subtractive model appropriate to the absorption of colours.

● This model describes what is subtracted from white. In this case, the primaries are cyan, magenta and yellow, with red, green and blue as the secondary colours.

● One figure shows the additive mixing of the red, green and blue primaries to form the three secondary colours yellow (red + green), cyan (blue + green) and magenta (red + blue), and white (red + green + blue).

● The other figure shows the three subtractive primaries and their pairwise combinations to form red, green and blue, and finally black by subtracting all three primaries from white.

Use: The CMY model is used by printing devices and filters

3. THE HSI MODEL:

The HSI (hue-saturation-intensity) model can be visualised as a stack of hexagonal planes, with an arbitrary colour point lying on one of them.

● The hue is determined by an angle from a reference point, usually red.

● The saturation is the distance from the origin (the vertical axis) to the point.

● The intensity is determined by how far up the vertical intensity axis this hexagonal plane sits.

Color Transformation

As with the gray-level transformation, we model color transformations using the expression g(x, y) = T[f(x, y)], where f(x, y) is a color input image, g(x, y) is the transformed color output image, and T is the color transform.
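A minimal Python/NumPy sketch of this idea, with two illustrative transforms T (a colour negative and an intensity reduction), might look like this:

import numpy as np

def color_transform(rgb_image, T):
    # Apply a per-pixel color transform g(x, y) = T[f(x, y)].
    # T is any function mapping RGB values (0-255) to new values.
    return np.clip(T(rgb_image.astype(float)), 0, 255).astype(np.uint8)

negative = lambda f: 255.0 - f       # colour negative of every component
dim      = lambda f: 0.7 * f         # reduce the intensity of every component

img = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)   # toy RGB image
neg = color_transform(img, negative)
dimmed = color_transform(img, dim)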

40
DATA PERCEPTION FROM SATELLITES: (2018107003)
• The data obtained from satellites are used by remote sensing techniques.
• Remote sensing: obtaining information about an object with a sensor which is physically separated from the object. There are different types of data from satellites; broadly, data products are available in high and low resolutions.
• Data products from satellites are available at different levels:
- Level 0: raw imagery, containing the greatest number of geometric disturbances.
- Level 1: corrections applied for radiometric degradation.
- Level 2: both radiometric and geometric distortions corrected.
- Level 3: higher-level (special) products generated for easier interpretation.

QUICK LOOK PRODUCTS:

- Path/ row products - Shift along track products - Quadrant products - Basic stereo product

Row and Path based products:

- A satellite acquires imagery strip by strip; each scanning strip (path) is given a number.
- Each path is divided into a number of frames (rows); by specifying the path and row numbers, the required scene can be purchased.

Shift along track products:

- If the area of interest spans two scenes along a path, the two scenes are merged and the required area is supplied as a single scene shifted along the track.

Quadrant products:

- The scene is divided into four quadrants with specified geocodes, and the user selects the particular area (or) portion required.

Stereo products:

- The same area is imaged from different angles and different altitudes, which allows 3D models (heights) to be measured.

OTHER TYPES OF DATA PRODUCTS:

1. PRECISION PRODUCT: in this data product, the coordinates are supplied with precise (refined) values.
2. GEOCODED PRODUCTS:
- Geocode is the set of latitude and longitude coordinates of a physical address.
- Geocoding is the process of turning a physical address into a set of latitude and longitude coordinates
which can then be plotted or displayed on a map.
- Products are supplied referenced to the set of latitude and longitude coordinates specified by the user.

DISPLAY OPERATIONS: (2018107014)


Zoom

Zoom of an image is accomplished by making multiple copies of the pixels of the selected region. A zoom operation
on a digital image produces an increase in its displayed size, with a corresponding increase in "pixel" size. Although
zooming does not increase the information content of the region of interest, it can be helpful in the visualization of
small structures or for the examination of individual pixel brightness values.

To introduce the idea of interpolation, suppose that a small matrix must be zoomed by a factor of 2, and the median
of the closest two (or four) original pixels is used to interpolate each new pixel:

41
The simplest way to accomplish zooming of arbitrary scale is to double the size of the original as many times as
needed to obtain an image larger than the target size in all dimensions, interpolating new pixels on each expansion.
Then the desired image can be attained by subsampling the large image, or taking pixels at regular intervals from
the larger image in order to obtain an image with the correct length and width.

Basically, zooming requires two steps: the creation of new pixel locations and the assignment of gray levels to those new locations. Suppose that we have an image of size 500×500 pixels and we want to enlarge it 1.5 times to 750×750 pixels. Conceptually, we lay an imaginary 750×750 grid over the original image; the spacing in the grid will be less than one pixel because we are fitting it over a smaller image. In order to perform gray level assignment for any point in the overlay, we look for the closest pixel in the original image and assign its gray level to the new pixel in the grid. This method of gray level assignment is called nearest neighbour interpolation.
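A minimal Python/NumPy sketch of zooming by nearest neighbour interpolation (the function name and rounding details are ours) is shown below:

import numpy as np

def zoom_nearest(image, factor):
    # Zoom a 2-D gray-level image by a (possibly non-integer) factor using
    # nearest-neighbour interpolation: each new pixel takes the gray level
    # of the closest original pixel.
    rows, cols = image.shape
    new_rows, new_cols = int(rows * factor), int(cols * factor)
    r_idx = np.minimum((np.arange(new_rows) / factor).round().astype(int), rows - 1)
    c_idx = np.minimum((np.arange(new_cols) / factor).round().astype(int), cols - 1)
    return image[np.ix_(r_idx, c_idx)]

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
zoomed = zoom_nearest(img, 1.5)    # 6 x 6 result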

Image shrinking:

Image shrinking is done in a similar manner to zooming. The equivalent process of pixel replication is row and column deletion. For example, to shrink an image by one-half, we delete every other row and column.

Pyramid

An image pyramid (formally, a "pyramid representation of an image") is an image and signal processing technique in which a single image is represented by a set of cascading images. Image pyramids provide useful properties for many applications, such as noise reduction, image analysis and image enhancement.

It is essentially a representation of the image by a set of images in different frequency bands.

It is a type of multi-scale signal representation developed by the computer vision, image processing and signal
processing communities, in which a signal or an image is subject to repeated smoothing and subsampling. Pyramid
representation is a predecessor to scale-space representation and multiresolution analysis.
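A simple Gaussian pyramid can be sketched in Python using NumPy and SciPy's gaussian_filter for the repeated smoothing and subsampling; the number of levels and the smoothing sigma are arbitrary choices here.

import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, levels=4, sigma=1.0):
    # Each level is obtained from the previous one by smoothing (to reduce
    # noise and aliasing) and subsampling by a factor of two in each direction.
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        smoothed = gaussian_filter(pyramid[-1], sigma=sigma)
        pyramid.append(smoothed[::2, ::2])
    return pyramid

img = np.random.rand(256, 256)
levels = gaussian_pyramid(img, levels=4)   # shapes 256x256, 128x128, 64x64, 32x32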

Look-up Tables

Lookup tables can be loaded with the values of transfer functions, in order to accomplish real time radiometric
processing.

42
These functions can be used to build models for n-modal histograms, to adjust the radiometry of images to match
prespecified statistics, and to compensate nonlinearities of the photographic process when digitizing aerial
photography in a remote sensing facility.

In data analysis applications, such as image processing, a lookup table (LUT) is used to transform the input data into a more desirable output format. For example, a grayscale picture of the planet Saturn can be transformed into a color image to emphasize the differences in its rings.
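For example, a 256-entry LUT holding a transfer function (here an illustrative gamma correction; the gamma value is arbitrary) can be applied to an entire 8-bit image in one indexing step:

import numpy as np

# Build a 256-entry lookup table implementing a transfer function,
# here a simple gamma (non-linearity) correction for 8-bit data.
gamma = 0.5
lut = np.array([255.0 * (i / 255.0) ** gamma for i in range(256)],
               dtype=np.uint8)

img = np.random.randint(0, 256, (512, 512), dtype=np.uint8)
corrected = lut[img]     # the LUT is applied to every pixel in one step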

DISPLAY SYSTEMS AND METHODS: (2018107016)


1. CRT (Cathode Ray Tube) DISPLAY

MAIN COMPONENTS :

43
1. Electron gun: The electron gun consists of a series of elements, primarily a heating filament (heater) and a cathode. It creates a source of electrons which are focused into a narrow beam directed at the face of the CRT.

2. Control Electrode: It is used to turn the electron beam on and off.

3. Focusing system: It is used to create a clear picture by focusing the electrons into a narrow beam.

4. Deflection Yoke: It is used to control the direction of the electron beam. It creates an electric or magnetic field
which will bend the electron beam as it passes through the area. In a conventional CRT, the yoke is linked to a sweep
or scan generator. The deflection yoke which is connected to the sweep generator creates a fluctuating electric or
magnetic potential.

5. Phosphorus-coated screen: The inside front surface of every CRT is coated with phosphors. Phosphors glow when
a high-energy electron beam hits them. Phosphorescence is the term used to characterize the light given off by a
phosphor after it has been exposed to an electron beam.

Working of CRT

• In a CRT, an electron beam moves back and forth behind the screen to illuminate or activate the phosphor dots on the inside of the glass tube.
• The monitor displays colour pictures by using a combination of phosphors that emit different-coloured light.
• By combining the emitted light from the different phosphors, a range of colours (RED, GREEN, BLUE) can be
generated.

The two basic techniques for producing colour displays with a CRT are:

a. Beam-penetration method:

• A beam of slow electrons excites only the outer red layer.


• A beam of very fast electrons penetrates through the red layer and excites inner green layer.
• At intermediate beam speeds, combinations of red and green light are emitted to show two additional
colours, orange and yellow.
• The speed of electrons and the screen colour is controlled by beam acceleration voltage.

• This method is an inexpensive way to produce colour in random-scan monitors, but only four colours are
possible

Advantages:

• Inexpensive

Disadvantages:

• Only four colors are possible
• Quality of pictures is not as good as with another method.
44
b. Shadow-mask method

• Shadow Mask Method is commonly used in Raster-Scan System because they produce a much wider range
of colors than the beam-penetration method.

• It is used in the majority of color TV sets and monitors.

• It has 3 phosphor color dots at each pixel position.

• One phosphor dot emits red light, the second emits green light, and the third emits blue light.

• The phosphor dots in the triangles are organized so that each electron beam can activate only its
corresponding color dot when it passes through the shadow mask.

• When the three beams pass through a hole in the shadow mask, they activate a dotted triangle, which occurs
as a small spot on the screen.

Advantages:

• Realistic image
• Millions of different colors can be generated
• Shadow scenes are possible

Disadvantages:

• Relatively expensive compared with the monochrome CRT.
• Relatively poor resolution
• Convergence problem

2. RASTER DISPLAY

Displays that use the raster scan technique for assembling an electronic image on a screen by drawing a raster of
horizontal lines. They are the most popular kind of graphics display. High-quality raster displays not only show areas
of solid color but can also display vectors (lines connecting endpoints) that have been appropriately converted
(rasterized) to pixel patterns.

This process is also called scan conversion and usually takes place in the image-creation system, which handles the generation of a picture on the raster display. In raster scan, the picture is assembled line by line, for example as an electron beam draws a set pattern of horizontal lines on the surface of a phosphor-coated CRT screen.

ADVANTAGES:

• Real life images with different shades can be displayed.


• The available color range is bigger than that of a random scan display.
45
DISADVANTAGES:

• Resolution is lower than that of a random scan display.
• More memory is required.
• Data about the intensities of all pixels has to be stored.

3. LCD (LIQUID CRYSTAL DISPLAY)

Liquid crystal displays are super-thin display screens that are generally used in laptop computer screens, TVs, cell phones and portable video games. LCD technology allows displays to be much thinner than cathode ray tube (CRT) technology. An LCD is composed of several layers, which include two polarized panel filters and electrodes.

CONSTRUCTION:

• The basic structure of the liquid crystal is controlled by changing the applied current.
• It must use polarized light.
• The liquid crystal should be able both to transmit the polarized light and to change it.

WORKING:

• When an electrical current is applied to the liquid crystal molecule, the molecule tends to untwist.
• This changes the angle of the light passing through the molecule of the polarized glass, and also changes the angle of the top polarizing filter.
• As a result, only a little light is allowed to pass through the polarized glass in that particular area of the LCD.
• Thus, that particular area becomes dark compared to the others. The LCD works on the principle of blocking light.

Advantages:

• LCDs consume less power than CRT and LED displays
• LCDs draw a few microwatts for the display, in comparison to a few milliwatts for LEDs
• LCDs are low cost
• They provide excellent contrast
• LCDs are thinner and lighter than cathode-ray tubes and LEDs

Disadvantages

• Require additional light sources


• Range of temperature is limited for operation
• Low reliability and Speed is very low.
• LCD’s need an AC drive

4. PLASMA DISPLAY

A plasma display is a type of flat panel display that uses plasma, an electrically charged ionized gas, to illuminate
each pixel in order to produce a display output. It is commonly used in large TV displays of 30 inches and higher.
Plasma displays are often brighter than LCD displays and also have a wider color gamut, with black levels almost
equaling "dark room" levels. Plasma displays are also known as gas-plasma displays.
46
WORKING:

• In a plasma, photons of energy are released if an electrical current is allowed to pass through it. The electrons and ions are attracted to each other, causing collisions. (Figure: UV light interacting with the phosphor to create visible light.)
• This collision causes the energy to be
produced. Plasma displays mostly make
use of the Xenon and neon atoms.
• When the energy is liberated during
collision, light is produced by them.
• These light photons are mostly
ultraviolet in nature.
• Though they are not visible to us, they play a very important role in exciting the phosphor coating, which then emits light that is visible to us.
• A plasma display consists of many tiny fluorescent-light-like cells which together form the image on the screen.

Advantages

• The slimmest of all displays


• Very high contrast ratios [1:2,000,000]
• Weighs less and is less bulky than CRTs.
• Higher viewing angles compared to other displays [178 degrees]
• High clarity and hence better colour reproduction [68 billion (2³⁶) colours vs 16.7 million (2²⁴)]
• Very little motion blur due to high refresh rates and response time.
• Has a life span of about 100,000 hours.

Disadvantages

• Cost is much higher compared to other displays.


• Energy consumption is more.
• Produces glares due to reflection.
• These displays are not available in smaller sizes than 32 inches.
• Though the display itself doesn't weigh much, the glass screen needed to protect the display adds considerable weight.
• Cannot be used in high altitudes. The pressure difference between the gas and the air may cause a
temporary damage or a buzzing noise.
• Area flickering is possible.

47
EXPERT SYSTEMS
- DEFINITION: (2018107054)
An expert system is a computer system that emulates the decision-making ability of a human expert. (also) An
expert system is a computer program that uses artificial intelligence (AI) technologies to simulate the judgment and
behavior of a human or an organization that has expert knowledge and experience in a particular field.

Types:

An expert system is divided into two subsystems:

1.The inference engine - The inference engine applies the rules to the known facts to deduce new facts.
Inference engines can also include explanation and debugging abilities.

2.The knowledge base - The knowledge base represents facts and rules.

Knowledge engineering:

The process of building an expert system:

1. The knowledge engineer establishes a dialog with the human expert to elicit knowledge.

2. The knowledge engineer codes the knowledge explicitly in the knowledge base.

3. The expert evaluates the expert system and gives a critique to the knowledge engineer.

Role of AI:

• An algorithm is an ideal solution guaranteed to yield a solution in a finite amount of time.


• When an algorithm is not available or insufficient, we rely on Artificial Intelligence (AI).
• Expert system relies on inference-we accept a 'reasonable solution'.

Uncertainty:

• Uncertainty = having limited knowledge, where more than one outcome is possible.


• Both human experts and expert systems must be able to deal with uncertainty.
• It is easier to program expert systems with shallow knowledge than with deep knowledge.
• Shallow knowledge – based on empirical and heuristic knowledge.
• Deep knowledge – based on the basic structure, function and behavior of objects.

Elements of expert system:

• User interface – mechanism by which user and system communicate.


• Explanation facility – explains the reasoning of the expert system to the user.
• Working memory – global database of facts used by rules.
• Inference engine – makes inferences deciding which rules are satisfied and prioritizing.
• Agenda – a prioritized list of rules created by the inference engine, whose patterns are satisfied by facts or
objects in working memory.
• Knowledge acquisition facility – automatic way for the user to enter knowledge in the system bypassing the
explicit coding by knowledge engineer.

Limitations of expert system:

• Limitation 1: typical expert systems cannot generalize through analogy to reason about new situations in
the way people can.

- Solution 1 for limitation 1: repeating the cycle of interviewing the expert.

48
• Limitation arising from Solution 1: a knowledge acquisition bottleneck results from the time-consuming and labor-intensive task of building an expert system.

Early expert system:

• DENDRAL – used in chemical mass spectroscopy to identify chemical constituents

• MYCIN – medical diagnosis of illness

• DIPMETER – geological data analysis for oil

• PROSPECTOR – geological data analysis for minerals

• XCON/R1 – configuring computer systems

Production rules:

• Knowledge base is also called production memory.

• Production rules can be expressed in IF-THEN pseudocode format.

• In rule-based systems, the inference engine determines which rule antecedents are satisfied by the facts.

Production system:

• Rule-based expert systems – most popular type today.

• Knowledge is represented as multiple rules that specify what should/not be concluded from different
situations.

• Forward chaining – start w/ facts and use rules to draw conclusions/take actions (see the sketch after this list).

• Backward chaining – start w/hypothesis and look for rules that allow hypothesis to be proven true.
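The following toy Python sketch shows forward chaining over IF-THEN production rules; the facts and rules about vegetation are invented purely for illustration.

# Rules are (antecedents, conclusion) pairs; the engine keeps firing rules
# whose antecedents are all satisfied by the facts in working memory until
# nothing new can be deduced.
rules = [
    ({"high NIR reflectance", "low red reflectance"}, "vegetation"),
    ({"vegetation", "regular field pattern"}, "cropland"),
]
facts = {"high NIR reflectance", "low red reflectance", "regular field pattern"}

changed = True
while changed:
    changed = False
    for antecedents, conclusion in rules:
        if antecedents <= facts and conclusion not in facts:
            facts.add(conclusion)     # fire the rule and add the new fact
            changed = True

print(facts)   # now also contains 'vegetation' and 'cropland'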

Advantages of expert system:

• Increased availability
• Multiple expertise
• Fast response
• Reduced cost
• Increased reliability
• Intelligent tutor
• Reduced danger
• Explanation
• Intelligent database
• Performance

49
COMPONENTS: (2018107057)
An expert system is typically composed of the following components:

1. Human expert,
2. Knowledge base (rule-based domain),
3. Inference engine,
4. User interface,
5. Online databases,
6. User.

The knowledge base, inference engine and user interface form the primary components.

Fig: Basic Functions of Expert Systems (figure not reproduced)

1.Human expert:

-Building an expert system requires a human expert from whom the required knowledge is extracted.

-A human expert is an individual who has the capability of recognizing things in a superior way within a domain, e.g. a doctor.

2.Knowledge Base:

- The expert knowledge is stored in knowledge base.

- The knowledge base is obtained from books, magazines and from various experts, scholars, and the Knowledge
Engineers. The knowledge engineer is a person with the qualities of empathy, quick learning, and case analyzing
skills.

- IF-THEN-ELSE rules are used to organize and formalize the knowledge in the knowledge base in a meaningful way, to be used by the inference engine.

-The success of any expert system depends largely on the quality, completeness, and accuracy of the information stored in the knowledge base.

3.Inference engine:

- It draws conclusion from the Knowledge base.

-The inference engine is the main processing element of the expert system.

-The inference engine applies logical rules to the knowledge base and deduces new knowledge. This process iterates, as each new fact in the knowledge base could trigger additional rules in the inference engine.

50
- It seeks information and relationships from the knowledge base and provides answers, predictions and suggestions the way a human expert would.

4.User interface:

-A user interface is the method by which the expert system interacts with a user.

-These can be through dialog boxes, command prompts, forms, or other input methods. Some expert systems
interact with other computer applications, and do not interact directly with a human.

-The UI (User Interface) is the system that allows a non-expert user to query (question) the expert system and to receive advice. The user interface is designed to be as simple to use as possible.

5. Online databases:

The rules and conditions may be applied and evaluated using data and/or information stored in online databases.

-The databases can take a variety of forms. They can be spatial, consisting of remotely sensed images and thematic maps in raster and vector format.

-However, the database may also consist of charts, graphs, algorithms, pictures and texts that are considered important by the expert. The database should contain detailed, standardized metadata.

6. User:

The user is a particular person or group of people, who may not be experts, working with the expert system and needing solutions or advice for complex queries.

- Expert systems cannot use past experience to make analogies in a new environment.

- Expert systems come under non-metric information extraction.

In short, the non-expert user queries the expert system by asking a question, or by answering questions asked by the expert system, through the user interface.

The inference engine uses the query to search the knowledge base and then provides an answer or some advice to
the user.

51
Problem Domain Vs Knowledge Domain:

-An expert's knowledge is specific to one problem domain, e.g. medicine, finance, science or engineering.

-The expert's knowledge about solving specific problems is called the knowledge domain.

-The problem domain is always a superset of the knowledge domain.

52
APPLICATIONS: (2018107059)

Advantages of Expert System

• Less Production Cost − Production costs of expert systems are extremely reasonable and affordable.
• Speed − They offer great speed and reduce the amount of work.
• Less Error Rate − Error rate is much lower as opposed to human errors.
• Low Risks − They are capable of working in environments that are dangerous to humans.
• Steady response − They avoid motions, tensions and fatigues.

53
Limitations

It is evident that no technology can offer easy and complete solutions. Larger systems are not only expensive but also require a significant amount of development time and computer resources. Limitations of expert systems include:

• Difficult knowledge acquisition


• Maintenance costs
• Development costs
• Adheres only to specific domains.
• Requires constant manual updates, it cannot learn by itself.
• It is incapable of providing logic behind the decisions.

Major Examples of Expert Systems

There are numerous examples of expert systems. Some of them are:

• MYCIN: This was one of the earliest expert systems that was based on backward chaining. It has the ability
to identify various bacteria that cause severe infections. It is also capable of recommending drugs based on
a person’s weight.
• DENDRAL: This was an AI based expert system used essentially for chemical analysis. It uses a substance’s
spectrographic data in order to predict its molecular structure.
• R1/XCON: This ES had the ability to select specific software to generate a computer system as per user
preference.
• PXDES: This system could easily determine the type and the degree of lung cancer in patients based on limited
data.
• CaDet: This is a clinical support system that identifies cancer in early stages.
• DXplain: This is also a clinical support system that is capable of suggesting a variety of diseases based on just
the findings of the doctor.

Conclusion

To conclude this report, expert system is undeniably reliable in terms of providing reasonable and highly
valuable decisions. Knowledge and experiences from a human expert can lead to the critical decision-making in
achieving success.

ERRORS
Any kind of errors present in remote sensing images are known as image distortions. Any remote sensing images
acquired from either spaceborne or airborne platforms are susceptible to a variety of distortions. These distortions
occur due to data recording procedure, shape and rotation of the Earth and environmental conditions prevailing at
the time of data acquisition. Distortions occurring in remote sensing images can be categorised into two types:

• Radiometric distortions • Geometric distortions.

As you know, an image is composed of finite number of pixels. The positions of these pixels are initially referenced
by their row and column numbers. However, if you want to use images, you should be able to relate these pixels to
their positions on the Earth surface. Further, the distance, area, direction and shape properties vary across an image,
thus these errors are known as geometric errors/distortions. This distortion is inherent in images because we
attempt to represent three-dimensional Earth surface as a two-dimensional image.

54
- GEOMETRIC ERRORS (2018107022)
Geometric errors originate during the process of data collection and vary in type and magnitude. There are several
factors causing geometric distortions such as:

• Earth’s rotation
• Earth’s curvature
• satellite platform instability and
• instrument error.

Nature of Geometric Errors

Geometric errors present in remote sensing images can be categorised into the following two types:

• Internal geometric errors, and

• External geometric errors.

It is important to recognise the source of internal and external error and whether it is systematic (predictable) or non-systematic (random).

Internal Geometric Errors:

Internal geometric errors are introduced by the sensor system itself or by the effects of Earth's rotation and curvature. These errors are predictable or computable and are often referred to as systematic errors; they can be identified and corrected using pre-launch measurements or platform ephemeris data.

Reasons of geometric distortions causing internal geometric errors in remote sensing images include the following:

• Skew caused by the Earth’s rotation effects


• scanning system induced variation in ground resolution cell size and dimensional relief displacement
• scanning system tangential scale distortion.

Earth Rotation Effect: You know that Earth rotates on its axis from west to east. Earth observing sun-synchronous
satellites are normally launched in fixed orbits that collect a path (or swath) of imagery as the satellite makes its way
from the north to the south in descending mode. As a result of the relative motion between the fixed orbital path
of satellites and the Earth’s rotation on its axis, the start of each scan line is slightly to the west of its predecessor
which causes overall effect of skewed geometry in the image.

Variation in Ground Resolution Cell Size and Dimensional Relief Displacement: An orbital multispectral scanning
system scans through just a few degrees off-nadir as it collects data hundreds of kilometers above the Earth’s surface
(between 600 and 700 km above the ground level). This configuration minimizes the amount of distortion
introduced by the scanning system. In the case of low-altitude multispectral scanning systems, numerous types of geometric distortion may be introduced that can be difficult to correct.

Tangential Scale Distortion: This occurs due to the rotation of the scanning system itself. As the scanner scans across each scan line, the distance from the scanner to the ground increases further away from the center of the ground swath. Although the scanning mirror rotates at a constant speed, the instantaneous field of view of the scanner moves faster and scans a larger area as it moves closer to the edges. This causes compression of image features at points away from the nadir. This distortion is known as tangential scale distortion.

External geometric errors are usually introduced by phenomena that vary in nature through space and time. The most important external variables that can cause geometric error in remote sensor data are random movements by the spacecraft at the exact time of data collection, which usually involve:

• Altitude changes and


• Attitude changes (yaw, roll and pitch).

55
Geometric correction:

Image correction involves image operations which normally precede manipulation and analysis of image data to extract specific information. The primary aim of image correction operations is to correct distorted image data to create a more accurate representation of the original scene. Image corrections are also known as preprocessing of remotely sensed images. Preprocessing is a preparatory phase that improves the quality of images and serves as a basis for further image analysis. Depending upon the kinds of errors present in images, the image correction functions comprise radiometric and geometric corrections. Geometric correction is the process of correcting geometric distortions and assigning the properties of a map to an image.

It is the process of correction of raw remotely sensed data for errors of skew, rotation and perspective.

Rectification is the process of alignment of an image to a map (map projection system). In many cases, the image
must also be oriented so that the north direction corresponds to the top of the image. It is also known as
georeferencing. Registration is the process of alignment of one image to another image of the same area not
necessarily involving a map coordinate system.

Geocoding is a special case of rectification that includes geographical registration or coding of pixels in an image.
Geocoded data are images that have been rectified to a particular map projection and pixel size. The use of standard
pixel sizes and coordinates permits convenient overlaying of images from different sensors and maps in a GIS.

Orthorectification is the process of pixel-by-pixel correction of an image for topographic distortion. Every pixel in
an orthorectified image appears to view the Earth from directly above, i.e., the image is in an orthographic
projection.

It is essential to remove geometric errors because, if geometric distortions in an image are not removed, it may not be possible to:

• relate features of the image to field data


• compare two images taken at different times and carry out change analysis
• obtain accurate estimates of the area of different regions in the image and
• relate, compare and integrate the image with any other spatial data. Geometric correction is usually
necessary but it is not required if the purpose of the study is not concerned with the precise positional
information and rather with the relative estimates of areas.

The following two common geometric correction procedures are often used:

• Image-to-map rectification • Image-to-image registration

In Image-to-image registration the reference is another image. If a rectified image is used as the reference base, an
image registered to it will inherit the geometric errors existing in the reference image.

Image-to-map rectification is the process by which the geometry of an image is made planimetric. Whenever accurate area, direction and distance measurements are required, image-to-map geometric rectification should be performed. It may not, however, remove all the distortion caused by topographic relief displacement in images. The image-to-map rectification process normally involves selecting GCP image pixel coordinates (row and column) together with their map coordinate counterparts (e.g., easting and northing in meters in a Universal Transverse Mercator map projection).
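As a hedged sketch of this idea, the following Python/NumPy code fits a first-order (affine) polynomial from GCP image coordinates to map coordinates by least squares; the GCP values below are entirely hypothetical.

import numpy as np

# Hypothetical ground control points: (row, col) image coordinates paired
# with (easting, northing) map coordinates in a UTM projection.
image_rc = np.array([[100, 120], [150, 400], [420, 380], [460, 90]], float)
map_en   = np.array([[355210.0, 1461620.0], [362155.0, 1460100.0],
                     [361796.0, 1454190.0], [354653.0, 1453745.0]])

# First-order polynomial model: E = a0 + a1*col + a2*row, similarly for N.
A = np.column_stack([np.ones(len(image_rc)), image_rc[:, 1], image_rc[:, 0]])
coef_E, *_ = np.linalg.lstsq(A, map_en[:, 0], rcond=None)
coef_N, *_ = np.linalg.lstsq(A, map_en[:, 1], rcond=None)

def to_map(row, col):
    # Predict map coordinates for an image pixel with the fitted model.
    v = np.array([1.0, col, row])
    return float(v @ coef_E), float(v @ coef_N)

print(to_map(300, 250))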

56
- RADIOMETRIC ERRORS: (2018107023)
Radiometric correction is done to calibrate the pixel values and/or correct for errors in the values. The process improves the interpretability and quality of remote sensed data. Radiometric calibration and corrections are particularly important when comparing multiple data sets over a period of time.

Procedures of Image processing

● Preprocessing ● Information enhancement ● Information extraction ● Post-classification ● Information output


57
Why corrections ?

➢ The perfect remote sensing system has yet to be developed.


➢ The Earth’s atmosphere, land, and water are amazingly complex and do not lend themselves well to being
recorded by remote sensing devices.

Error Sources

➢ Internal cause: when individual detectors do not function properly or are improperly calibrated (systematic error).
➢ External cause: the atmosphere (between the terrain and the sensor) can contribute noise (i.e., atmospheric attenuation) such that the energy recorded does not resemble that reflected/emitted by the terrain (non-systematic error).

Types of Radiometric Corrections

➢ Detector or Sensor Error (Internal Error)


➢ Atmospheric Error (External Error)
➢ Topographic Error (External Error)

Detector or Sensor Error

(Internal Error)

1. Ideally, the radiance recorded by a remote sensing system in various bands is an accurate representation of the
radiance actually leaving the feature of interest (Eg: Soil, Vegetation, Atmosphere, H2O or Urban Land Cover) on the
earth's surface or atmosphere. Unfortunately, noise (Error) can enter the data-collection system at several points.

2. Some of the commonly observed systematic radiometric errors are:

A. random bad pixels B. line start/stop problems C. line or column drop-outs D. n-Line striping

2A. Random bad pixels (shot noise)

• Sometimes an individual detector does not record spectral data for an individual pixel. When this occurs
randomly, it is called a bad pixel.
• When there are numerous random bad pixels found within the scene, it is called shot noise because it
appears that the image was shot by a shotgun.
• Normally these bad pixels contain values of 0 or 255 (in 8-bit data) in one or more of the bands.

2B. Line-start/stop problems

• Occasionally, scanning systems fail to collect data at the beginning or end of a scan line, or they place the
pixel data at inappropriate locations along the scan line.
• For example, all of the pixels in a scan line might be systematically shifted just one pixel to the right. This is
called a line-start problem.

2C. Line or column drop-outs

• An entire line containing no spectral information may be produced if an individual detector in a scanning
system (e.g., Landsat MSS or Landsat 7 ETM+) fails to function properly.
• If a detector in a linear array (e.g., SPOT XS, IRS-1C, QuickBird) fails to function, this can result in an entire
column of data with no spectral information. The bad line or column is commonly called a line or column
drop-out and contains brightness values equal to zero.

2D. N-Line Striping

• N-line striping is caused by the miscalibration of one of the n detectors.

• Correction is usually accomplished by computing a histogram of the values for each of the n detectors that
collected data over the entire scene (ideally, this would take place over a homogeneous area, such as a body
of water).

Atmospheric correction

1.There are several ways to atmospherically correct remotely sensed data. Some are relatively straightforward while
others are complex, being founded on physical principles and requiring a significant amount of information to
function properly.

2. This discussion will focus on two major types of atmospheric correction:

A. Absolute atmospheric Correction B. Relative Radiometric Correction

A. Absolute atmospheric correction

1.Solar radiation is largely unaffected as it travels through the vacuum of space. When it interacts with the Earth’s
atmosphere, however, it is selectively scattered and absorbed. The sum of these two forms of energy loss is called
atmospheric attenuation.

Atmospheric attenuation may

a) make it difficult to relate hand-held in situ spectroradiometer measurements with remote measurements,
b) make it difficult to extend spectral signatures through space and time, and
c) have an impact on classification accuracy within a scene if atmospheric attenuation varies significantly
throughout the image.

2.The general goal of absolute radiometric correction is to turn the digital brightness values (or DN) recorded by a
remote sensing system into scaled surface reflectance values. These values can then be compared or used in
conjunction with scaled surface reflectance values obtained anywhere else on the planet.

B. Relative Radiometric Correction

1. When the data required for absolute radiometric correction are not available, relative radiometric correction can be performed.

2. Relative radiometric correction may be applied as: a) single-image normalization using histogram adjustment, or b)
multiple-date image normalization using regression.

Topographic correction

1. Topographic slope and aspect also introduce radiometric distortion (for example, areas in shadow).

2. The goal of a slope-aspect correction is to remove topographically induced illumination variation so that two
objects having the same reflectance properties show the same brightness value (or DN) in the image despite their
different orientation to the Sun's position. The correction is based on a DEM and the Sun's position.

3. Topographic correction is a correction applied to observed geophysical values to remove the effects of topography.

FILTERS
CONVOLUTION FILTERS: (2018107035)
Convolution is the process of adding each element of the image to its local neighbors, weighted by the kernel.
This is related to a form of mathematical convolution. The matrix operation being performed—convolution—is not
traditional matrix multiplication, despite being similarly denoted by *. For example, if we have two three-by-three
matrices, the first a kernel, and the second an image piece, convolution is the process of flipping both the rows and
columns of the kernel and multiplying locally similar entries and summing. The element at coordinates [2, 2] (that
is, the central element) of the resulting image would be a weighted combination of all the entries of the image
matrix, with weights given by the kernel:
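A minimal Python/NumPy sketch of this weighted-combination (convolution) operation; the function name, the edge-padding choice and the averaging kernel below are illustrative, not taken from the original:

import numpy as np

def convolve2d(image, kernel):
    # Flip the kernel rows and columns, then slide it over the image and
    # sum the element-wise products at every pixel position.
    k = np.flipud(np.fliplr(kernel))
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * k)
    return out

# A 3 x 3 averaging (low pass) kernel: an array of ones divided by 9
mean_kernel = np.ones((3, 3)) / 9.0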

Low Pass Filtering

A low pass filter is the basis for most smoothing methods. An image is smoothed by decreasing the disparity between pixel values, i.e. by averaging nearby pixels. Using a low pass filter tends to retain the low frequency information within an image while reducing the high frequency information. An example is an array of ones divided by the number of elements within the kernel, such as a 3 by 3 kernel in which every weight equals 1/9.

• This filter produces a simple average (or arithmetic mean) of the nearest neighbors of each pixel in the image.

• It is one of a class of filters known as low pass filters.

• They pass the low frequencies in the image or, equivalently, the long wavelengths in the image.

• The high frequencies that are suppressed correspond to abrupt changes in image intensity, i.e. edges; removing them therefore causes blurring.

• Also, because the filter replaces each pixel with an average of the pixels in its local neighborhood, one can understand
why it tends to blur the image.

• Blurring is typical of low pass filters.

High Pass Filtering

A high pass filter is the basis for most sharpening methods. An image is sharpened when contrast is enhanced
between adjoining areas with little variation in brightness or darkness.

A high pass filter tends to retain the high frequency information within an image while reducing the low frequency
information. The kernel of the high pass filter is designed to increase the brightness of the center pixel relative to
neighboring pixels. The kernel array usually contains a single positive value at its center, which is completely
surrounded by negative values. The following array is an example of a 3 by 3 kernel for a high pass filter:

• High pass filtering of an image can be achieved by applying a low pass filter to the image and then subtracting
the low pass filtered result from the original image.
• In abbreviated terms this is H = I − L, where H is the high pass filtered image, I the original image and L the low pass
filtered image.
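A short sketch of the H = I − L relation, reusing the convolve2d function and mean_kernel from the convolution sketch above and assuming image is a 2-D NumPy array:

low_pass = convolve2d(image, mean_kernel)      # L: the blurred (low pass) image
high_pass = image.astype(float) - low_pass     # H = I - L: the high pass result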

Directional Filtering

A directional filter forms the basis for some edge detection methods. An edge within an image is visible when a
large change (a steep gradient) occurs between adjacent pixel values. This change in values is measured by the first
derivatives (often referred to as slopes) of an image. Directional filters can be used to compute the first derivatives
of an image.

Directional filters can be designed for any direction within a given space. For images, x- and y-directional filters are
commonly used to compute derivatives in their respective directions. The following array is an example of a 3 by 3
kernel for an x-directional filter (the kernel for the y-direction is the transpose of this kernel):

• An edge within an image is visible when a large change (a steep gradient) occurs between adjacent pixel
values.
• This change in values is measured by the first derivatives of an image.
• Directional filters can be used to compute the first derivatives of an image
• Directional filters can be designed for any direction within a given space.
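A small illustrative sketch of x- and y-directional kernels; these Prewitt-style weights are an assumption, since the document's own kernel is not reproduced here:

import numpy as np

# x-directional first-derivative kernel; the y-kernel is its transpose.
kx = np.array([[-1.0, 0.0, 1.0],
               [-1.0, 0.0, 1.0],
               [-1.0, 0.0, 1.0]])
ky = kx.T

# Gradients in x and y, e.g. using the convolve2d sketch from above:
# gx = convolve2d(image, kx); gy = convolve2d(image, ky)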

Laplacian Filtering

A Laplacian filter forms another basis for edge detection methods. A Laplacian filter can be used to compute the
second derivatives of an image, which measure the rate at which the first derivatives
change. This helps to determine if a change in adjacent pixel values is an edge or a
continuous progression.

Kernels of Laplacian filters usually contain negative values in a cross pattern (similar to a
plus sign), which is centered within the array. The corners are either zero or positive values.
The center value can be either negative or positive. The following array is an example of a
3 by 3 kernel for a Laplacian filter
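One commonly used 3 by 3 Laplacian kernel matching the description above (negative values in a cross pattern, zero corners); the exact kernel in the original figure is not known, so treat this as an example:

import numpy as np

laplacian = np.array([[ 0.0, -1.0,  0.0],
                      [-1.0,  4.0, -1.0],
                      [ 0.0, -1.0,  0.0]])
# second_derivative = convolve2d(image, laplacian)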

GLOBAL OPERATORS: (2018107036)


Image processing operations fall into two classes: local and global. Local operations affect only a small
corresponding area in the output image, and include edge detection, smoothing and point operations. Global
operations include histogram computation, image warping, the Hough transform, connected components, eigenvalue and
eigenvector computation, and principal components analysis. One of the best examples of a global operator is image fusion.

Global Operators:

1) Principal components analysis (PCA)
2) Histogram
3) Image warping
4) Hough transform
5) Connected components
6) Image fusion

1) Principal components analysis (PCA):

• The fusion method based on PCA is very simple; PCA is a general statistical technique that transforms
multivariate data with correlated variables into data with uncorrelated variables. PCA has been widely used
in image encoding, image data compression, image enhancement and image fusion.

• Principal Component Analysis (PCA) is a dimensionality reduction method that is often used to reduce the
dimensionality of large data sets, by transforming a large set of variables into a smaller one that still contains
most of the information in the large set.

Explanation:

In the fusion process, the PCA method generates uncorrelated images (PC1, PC2, ..., PCn), where n is the number of input multispectral bands. The first principal component PC1 is replaced with the panchromatic band, which has higher spatial resolution than the multispectral images. The inverse PCA transformation is then applied to obtain the fused image in the RGB color model, as shown in the figure.

The first principal component, which contains the maximum variance, is replaced by the PAN image. Such replacement maximizes the effect of the panchromatic image in the fused product. One solution could be stretching the principal component to give a spherical distribution. Besides, the PCA approach is sensitive to the choice of the area to be fused. If the grey values of the PAN image are adjusted to grey values similar to the PC1 component before the replacement, the color distortion is significantly reduced.

The PCA procedure involves the following steps:

1. Standardization
2. Covariance matrix computation
3. Computation of the eigenvectors and eigenvalues
4. Feature vector
5. Recasting the data along the principal component axes

1: STANDARDIZATION

The aim of this step is to standardize the range of the continuous initial variables so that each one of them
contributes equally to the analysis. Mathematically, this can be done by subtracting the mean and dividing by the
standard deviation for each value of each variable: z = (value − mean) / standard deviation.

2: COVARIANCE MATRIX COMPUTATION

The aim of this step is to understand how the variables of the input data set vary from the mean with respect
to each other, or in other words, to see if there is any relationship between them. The covariance matrix is a p × p
symmetric matrix (where p is the number of dimensions) that has as entries the covariances associated with all
possible pairs of the initial variables.

It is the sign of the covariance that matters:

- if positive then: the two variables increase or decrease together (correlated)

- if negative then: One increases when the other decreases (Inversely correlated)

3: COMPUTE THE EIGEN VECTORS AND EIGEN VALUES
OF THE COVARIANCE MATRIX TO IDENTIFY PC

Principal components are new variables that are constructed as linear combinations or mixtures of the initial variables. These combinations are done in such a way that the new variables (i.e., principal components) are uncorrelated and most of the information within the initial variables is squeezed or compressed into the first components. This allows you to reduce dimensionality without losing much information, by discarding the components with low information and considering the remaining components as your new variables.

The first principal component accounts for the largest possible variance in the data set; it maximizes the variance (the average
of the squared distances from the projected points to the origin). The second principal component is
calculated in the same way, with the condition that it is uncorrelated with (i.e., perpendicular to) the first principal
component and that it accounts for the next highest variance. This continues until a total of p principal components have
been calculated, equal to the original number of variables. By ranking your eigenvectors in order of their eigenvalues,
highest to lowest, you get the principal components in order of significance.

4: FEATURE VECTOR:

The feature vector is simply a matrix that has as columns the eigenvectors of the components that we decide to keep. This makes it the first step towards dimensionality reduction, because if we choose to keep only some of the eigenvectors (components), the final data set will have only that many dimensions.

5: RECAST THE DATA ALONG THE PRINCIPAL COMPONENTS AXES

The aim is to use the feature vector formed using the eigenvectors of the covariance matrix to reorient the data from the original axes to the ones represented by the principal components (hence the name Principal Components Analysis).
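A compact NumPy sketch of the five steps just described; the function and variable names are illustrative, and `data` is assumed to be an (n_samples, n_bands) array:

import numpy as np

def pca(data, n_components=3):
    # 1. Standardization: z = (value - mean) / standard deviation
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    # 2. Covariance matrix of the standardized variables
    cov = np.cov(z, rowvar=False)
    # 3. Eigenvalues and eigenvectors of the covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # rank by eigenvalue, highest first
    # 4. Feature vector: keep the eigenvectors of the components we retain
    feature_vector = eigvecs[:, order[:n_components]]
    # 5. Recast the data along the principal component axes
    return z @ feature_vector

# Example (hypothetical input): pcs = pca(pixels_by_band, n_components=3)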

Histogram:

A histogram is a specialized graph or plot used in statistics. In its most common form, the independent variable
is plotted along the horizontal axis, and the dependent variable (usually a percentage) is plotted along the vertical
axis. The independent variable can attain only a finite number of discrete values (for example, five) rather than a
continuous range of values. The dependent variable can span a continuous range.
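For an 8-bit image the histogram can be computed as in this brief sketch (`image` is assumed to be a NumPy array of grey values 0-255):

import numpy as np

counts, bin_edges = np.histogram(image, bins=256, range=(0, 256))
percent = 100.0 * counts / image.size   # dependent variable expressed as a percentage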

Correlation - Correlation is a statistical technique that can


show whether and how strongly pairs of variables are related.
Correlation measures association, but doesn't show if x causes
y or vice versa, or if the association is caused by a third–
perhaps unseen–factor.

Covariance - Covariance is a statistical tool that is used to determine the relationship between the movement of
two asset prices. When two stocks tend to move together, they are seen as having a positive covariance; when they
move inversely, the covariance is negative.

Cross correlation - Cross-correlation is a measurement that tracks the movements of two variables or sets of data
relative to each other. Cross-correlation is generally used when measuring information between two different time
series. The values range from -1 to 1, such that the closer the cross-correlation value is to 1, the more closely related the
information sets are.

3) Images Warping:

Image warping is the process of digitally manipulating an image such that any shapes portrayed in the image have
been significantly distorted. Warping may be used for correcting image distortion as well as for creative purposes.
The most obvious approach to transforming a digital image is the forward mapping. However, for injective
transforms reverse mapping is also available.

4) Hough transform:

The Hough transform is a feature extraction technique used in image analysis, Computer vision, and digital image
processing. The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by
a voting procedure. This voting procedure is carried out in a parameter space, from which object candidates are
obtained as local maxima in a so-called accumulator space that is explicitly constructed by the algorithm for
computing the Hough transform.

Example1

Consider three data points, shown here as black dots.

For each data point, a number of lines are plotted going through it, all at different angles. These are shown here in
different colors. To each line, a support line exists which is perpendicular to it and which intersects the origin. In
each case, one of these is shown as an arrow. The length (i.e. perpendicular distance to the origin) and angle of each
support line is calculated. Lengths and angles are tabulated below the diagrams.

5) Connected Components:

A set of pixels in which each pixel is connected to all other pixels is called a Connected component.

Definition: A pixel p∈S is said to be connected to q∈S if there is a path from p to q consisting entirely of pixels of S.
A component labeling algorithm finds all connected components in an image and assigns a unique label to all points
in the same component.
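A minimal sketch of component labeling using SciPy's ndimage.label; the example binary array is made up, and SciPy is assumed to be available:

import numpy as np
from scipy import ndimage

binary = np.array([[1, 1, 0, 0],
                   [0, 1, 0, 1],
                   [0, 0, 0, 1],
                   [1, 0, 0, 1]])

labels, num_components = ndimage.label(binary)   # default 4-connectivity
# `labels` assigns a unique integer label to every pixel of each connected component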

STATISTICAL AND MORPHOLOGICAL OPERATORS: (2018107036)


Statistical filters: mean filter and median filter

Morphological filters:

• Morphology is a broad set of image


processing operations that process
images based on shapes.

• In a morphological operation, the value of


each pixel in the output image is based on a
comparison of the corresponding pixel in the
input image with its neighbors.

The structuring element: in a morphological filter, the matrix applied to the image is called the structuring element, instead of the coefficient matrix used in a linear filter. The structuring element contains only the values 0 and 1 (unlike weighted coefficients), and the hot spot of the filter is the dark-shaded element (the qualifying pixel).

A binary image can be described as a set of two-dimensional coordinate points. This is called the point set Q, and it consists of the coordinate pairs p = (u, v) of all foreground pixels.

Inverting a binary image corresponds to the complement operation on the set; reflection of a binary image I is obtained by multiplying each point p by −1; two binary images are combined with the union operator; and a binary image I is shifted by adding some coordinate vector d to each point.

Properties of erosion and dilation:

Dilation adds pixels to the boundaries of objects in an image, while erosion removes pixels on object boundaries.
The number of pixels added or removed from the objects in an image depends on the size and shape of the
structuring element used to process the image.

Composite operation:

- Dilation and erosion work together in composite operations.

- There are two common ways to order these two operations: opening and closing.

Opening: an erosion followed by a dilation (removes small objects and thin protrusions while preserving the overall shape).

Closing: a dilation followed by an erosion (fills small holes and gaps while preserving the overall shape).
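A brief SciPy sketch of erosion, dilation and the two composite operations; the binary test image and the structuring element are illustrative assumptions:

import numpy as np
from scipy import ndimage

binary = np.zeros((7, 7), dtype=bool)
binary[2:5, 2:5] = True                   # a small square object
se = np.ones((3, 3), dtype=bool)          # structuring element of ones

eroded  = ndimage.binary_erosion(binary, structure=se)    # removes boundary pixels
dilated = ndimage.binary_dilation(binary, structure=se)   # adds boundary pixels

# Opening = erosion followed by dilation; closing = dilation followed by erosion
opened = ndimage.binary_dilation(ndimage.binary_erosion(binary, structure=se), structure=se)
closed = ndimage.binary_erosion(ndimage.binary_dilation(binary, structure=se), structure=se)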

HYPERSPECTRAL DATA
- DEFINITION: (2018107051)
The electromagnetic spectrum is composed of thousands of bands representing different types of light energy.
Imaging spectrometers (instruments that collect hyperspectral data) break the electromagnetic spectrum into
groups of bands that support classification of objects by their spectral properties on the earth's surface.
Hyperspectral data consists of many bands -- up to hundreds of bands -- that cover the electromagnetic spectrum.

The data produced by imaging spectrometers is different from that of multispectral instruments owing to the
enormous number of wavebands recorded. For a given geographical area imaged, the data produced can be viewed
as a cube, having two dimensions that represent spatial position and one that represents wavelength. Although data
volume strictly does not pose any major data processing challenge with contemporary computing systems, it is
nevertheless useful to examine the relative data volumes of, say, TM and AVIRIS. Clearly, the major differences
between the two are the number of wavebands (7 vs. 224) and the radiometric quantization (8 vs. 10 bits per pixel per band).
Ignoring differences in spatial resolution, the relative data volumes per pixel are 56:2240; per pixel there are therefore 40
times as many bits for AVIRIS as for TM.

With 40 times as much data per pixel, does that mean 40 times as much information per pixel? Generally, of course, that is not the
case: much of the additional data does not add to the inherent information content for a particular application, even
though it often helps in discovering that information; in other words, it contains redundancies. In remote sensing
data, redundancy can take two forms: spatial and spectral. Exploiting spatial redundancy is what lies behind spatial-context
methods. Spectral redundancy means that the information content of one band can be fully or partly predicted from
the other bands in the data.

BASICS CONCEPTS

• Hyperspectral data are acquired in contiguous spectral bands.

• Very narrow band widths (of the order of 10 nm).

• Produce complete spectral signatures with no wavelength omissions.

• Conventional multispectral remote sensing, by contrast, uses a few wide spectral bands (50-300 nm):

– Less sensitive to subtle spectral changes such as phenological changes, mineral mixture
proportional changes, vegetation stress, etc.

– Harder to extract quantitative information

– The ability to measure reflectance in several contiguous bands across a specific part of the
spectrum allows these instruments to produce a spectral curve that can be compared to
reference spectra for any number of minerals, thereby allowing the mineral content of a particular
piece of ground to be determined.

n- Dimensional Data

Hyperspectral data (or spectra) can be thought of as points in an n-dimensional scatterplot. The data for a given
pixel corresponds to a spectral reflectance for that given pixel. The distribution of the hyperspectral data in n-space
can be used to estimate the number of spectral endmembers and their pure spectral signatures and to help
understand the spectral characteristics of the materials which make up that signature.

Radiometric Correction

• Hyperspectral imaging sensors collect radiance data from either airborne or spaceborne platforms which
must be converted to apparent surface reflectance before analysis techniques can take place. Atmospheric
correction techniques have been developed that use the data themselves to remove spectral atmospheric
transmission and scattered path radiance. There are seven gases in the Earth's atmosphere that produce
observable absorption features in the 0.4 - 2.5 micron range. They are water vapor, carbon dioxide, ozone,
nitrous oxide, carbon monoxide, methane, and oxygen.

• Approximately half of the 0.4 - 2.5 micron spectrum is affected by gaseous absorption

PREPROCESSING

Calibration

Calibrating imaging spectroscopy data to surface reflectance is an integral part of the data analysis process,
and is vital if accurate results are to be obtained. The identification and mapping of materials and material properties
is best accomplished by deriving the fundamental properties of the surface, its reflectance, while removing the
interfering effects of atmospheric absorption and scattering, the solar spectrum, and instrumental biases.

Calibration to surface reflectance is inherently simple in concept, yet it is very complex in practice because
atmospheric radiative transfer models and the solar spectrum have not been characterized with sufficient accuracy
to correct the data to the precision of some currently available instruments, such as the NASA/JPL Airborne Visible
and Infra-Red Imaging Spectrometer.

The objectives of calibrating remote sensing data are to remove the effects of the atmosphere (scattering and
absorption) and to convert from radiance values received at the sensor to reflectance values of the land surface.
The advantages offered by calibrated surface reflectance spectra compared to uncorrected radiance data include:
1) the shapes of the calibrated spectra are principally influenced by the chemical and physical properties of surface
materials, 2) the calibrated remotely-sensed spectra can be compared with field and laboratory spectra of known
materials, and 3) the calibrated data may be analyzed using spectroscopic methods that isolate absorption features
and relate them to chemical bonds and physical properties of materials. Thus, greater confidence may be placed in
the maps of derived from calibrated reflectance data, in which errors may be viewed to arise from problems in
interpretation rather than incorrect input data.

Data normalization

When detailed radiometric correction is not feasible normalization is an alternative which makes the corrected
data independent of multiplicative noise, such as topographic and solar spectrum effects. This can be performed
using Log Residuals, based on the relationship between radiance and reflectance.

The number of training samples required to train a classifier for high dimensional data is much greater than that
required for conventional data, and gathering these training samples can be difficult and expensive. Therefore, the
assumption that enough training samples are available to accurately estimate the class quantitative description is
frequently not satisfied for high dimensional data. The pre-labeled samples used to estimate class parameters and
design a classifier are called training samples. The accuracy of parameter estimation depends substantially on the
ratio of the number of training samples to the dimensionality of the feature space. As the dimensionality increases,
the number of training samples needed to characterize the classes increase as well. If the number of training samples
available fails to catch up with the need, which is the case for hyperspectral data, parameter estimation becomes
inaccurate. Consider the case of a finite and fixed number of training samples. The accuracy of statistics estimation
decreases as dimensionality increases, leading to a decline in classification accuracy. Although increasing the
number of spectral bands (dimensionality) potentially provides more discriminating information, with a fixed training set the
classification accuracy first grows and then declines as the number of spectral bands increases; this is often referred to as the Hughes phenomenon.

Minimum Noise Fraction (MNF) Transformation

A minimum noise fraction (MNF) transformation is used to reduce the dimensionality of the hyperspectral
data by segregating the noise in the data. The MNF transform is a linear transformation which is essentially two
cascaded Principal Components Analysis (PCA) transformations. The first transformation decorrelates and rescales
the noise in the data. This results in transformed data in which the noise has unit variance and no band to band
correlations. [ENVI] The second transformation is a standard PCA of the noise-whitened data.

The Minimum Noise Fraction (MNF) transform computes the normalized linear combinations of the original bands
which maximize the ratio of the signal to noise. The approach was developed specifically for analysis of multiple
band remotely sensed data which would produce orthogonal bands ordered by their information content. It can
also be used for filtering noise through application of filters matched to the noise characteristics of the transformed
bands and inverting the data.

Because the transform is a ratio, it is also invariant with respect to scale changes in bands. Additionally, the signal
and noise of the transformed bands are also orthogonal. The approach requires that the covariance of the noise be
known, which is not generally the case for remotely sensed data. A reasonable estimate of the noise in each band
can be obtained when the signal is highly correlated across bands through adaptation of a procedure called the
maximum autocorrelation factor which exploits the correlation of signals in spatial neighbourhoods.

Pixel Purity Index

The Pixel Purity Index (PPI) is a processing technique designed to determine which pixels are the most
spectrally unique or pure. Due to the large amount of data, PPI is usually performed on MNF data which has been
reduced to coherent images. The most spectrally pure pixels occur when there is mixing of endmembers. The PPI is
computed by continually projecting n-dimensional scatterplots onto a random vector. The extreme pixels for each
projection are recorded and the total number of hits are stored into an image. These pixels are excellent candidates
for selecting endmembers which can be used in subsequent processing.

Use Pixel Purity Index (PPI) to find the most spectrally pure (extreme) pixels in multispectral and hyperspectral
images. A Pixel Purity Image is created where each pixel value corresponds to the number of times that pixel was
recorded as extreme.

Spectral Library

• The number of training samples required to train a classifier for high dimensional data is much greater than
that required for conventional data, and gathering these training samples can be difficult and expensive.

• The interpretation of high spectral resolution “hyperspectral” image data can be simplified by using
examples from laboratory and ground acquired libraries of documented spectra referred to as spectral
library.

• Spectral libraries contain spectra of individual species that have been acquired at test sites representatives
of varied terrain and climatic zones, observed in the field under natural conditions.

HYPER-SPECTRAL DATA ANALYSIS: (2018107052)


Spectral mixing

Spectral mixing is a consequence of the mixing of materials having different spectral properties within the Ground
instantaneous field of view of a single pixel.

Causes of spectral mixing:

Mixing can originate from a range of effects, such as mixing of the received signal at the sensor due to limited spatial
resolution or mixing of spectral contributions within the material, e.g., due to intimate mixtures of different
spectrally active materials within one sampling point. Mixing can be considered linear and non-linear, each requiring
adapted approaches. A wide range of algorithms has been developed to address this issue.

➢ A variety of factors interact to produce the signal received by the imaging spectrometer

➢ A very thin volume of material interacts with incident sun light. All the materials present in this volume
contribute to total reflected energy.

➢ Spatial mixing of materials in the area represented by a single pixel results in spectrally mixed reflected
signals.

➢ Variable illumination due to topography [shade] and actual shadow in the area represented by the pixel
further modify the reflected signal, basically mixing with a “black” end member.

➢ The imaging spectrometer integrates the reflected light from each pixel.

➢ Spectral unmixing analysis is devoted to extracting the pure spectra that, of necessity, make up each image pixel. It can
be achieved only if the pixels are formed by linear mixing.

Spectral Angle Mapper Classification

The Spectral Angle Mapper Classification (SAM) is an automated method for directly comparing image spectra to a known spectrum or an endmember. This method treats both spectra as vectors and calculates the spectral angle between them. It is insensitive to illumination since the SAM algorithm uses only the vector direction and not the vector length. The result of the SAM classification is an image showing the best match at each pixel. This method is typically used as a first cut for determining the mineralogy and works well in homogeneous regions.

[Figure 1: Spectral angle]

Endmembers:

Mixed pixels are common in hyperspectral remote sensing images.


Endmember extraction is a key step in spectral unmixing. It is usually assumed that there are some pixels (known as
endmembers) that contain only one ground object, and the process of finding these pixels is
referred to as endmember extraction.

Linear Spectral Unmixing Algorithm

In the linear spectral mixing model, each mixed pixel can be expressed as a linear combination of endmembers, each weighted
by its corresponding abundance. A widely used method for extracting surface information from remotely sensed
images is image classification: the spectral characteristics of each training class are defined through a statistical or
probabilistic process in feature space, and unknown pixels to be classified are statistically compared with the known
classes and assigned to the class they most resemble.
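A minimal least-squares sketch of the linear mixing model pixel ≈ E · a, where the columns of E hold the endmember spectra and a holds the abundances; names and shapes are illustrative:

import numpy as np

def unmix(pixel, endmembers):
    # endmembers: (n_bands, n_endmembers); pixel: (n_bands,)
    abundances, residuals, rank, sv = np.linalg.lstsq(endmembers, pixel, rcond=None)
    return abundances

A practical unmixing step would usually also enforce non-negativity and sum-to-one constraints on the abundances.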

Calculation of spectral angle:

Spectral angle mapping calculates the spectral similarity between a test reflectance spectrum and a reference reflectance spectrum. The algorithm determines the similarity between two spectra by calculating the spectral angle between them. The spectral similarity between an unknown spectrum t and a reference spectrum r is expressed in terms of the angle µ between the spectra, µ = cos⁻¹( (t · r) / (‖t‖ ‖r‖) ), as shown in Figure 2. For each pixel the spectral angle mapper calculates the angle between the vector defined by the pixel values and each endmember vector. The result is one raster layer for each endmember containing the spectral angle. The smaller the spectral angle, the more similar a pixel is to a given endmember class. In a second step one can then go ahead and enforce thresholds of maximum angles or simply classify each pixel to the most similar endmember.
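A short sketch of the spectral angle computation between a test spectrum t and a reference spectrum r (the function name is illustrative):

import numpy as np

def spectral_angle(t, r):
    # Only the vector directions matter, so the result is insensitive to illumination.
    cos_angle = np.dot(t, r) / (np.linalg.norm(t) * np.linalg.norm(r))
    return np.arccos(np.clip(cos_angle, -1.0, 1.0))   # angle in radians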

Spectral Unmixing/Matched Filtering

Most surfaces on the earth, geologic or vegetated, are not homogeneous, which results in a mixture of signatures
being characterized by a single pixel. How the materials are mixed on the surface determines the type of
mathematical model capable of determining their abundances. If the scale of mixing is rather large (macroscopic), then the mixing of the
signatures can be represented by a linear model. However, if the mixing is microscopic (intimate), then the mixing models
become more complex and non-linear.

The first step in determining the abundances of materials is to select endmembers, which is the most difficult step
in the unmixing process. The ideal case would be a spectral library of endmembers that, when
linearly combined, can form all observed spectra. A simple vector-matrix multiplication between the inverse library
matrix and an observed mixed spectrum then gives an estimate of the abundance of the library endmembers for the
unknown spectrum.
Other Classification Techniques

A nonparametric classifier, such as a neural network, together with feature extraction methods, can be used to accurately
classify a hyperspectral image. Feature extraction methods, such as decision boundary feature extraction (DBFE),
can extract the features necessary to achieve classification accuracy while reducing the amount of data analysed in
feature space.

Three different supervised classifiers

• Linear discriminant analysis (LDA): find a linear


combination of features which separate two or more
classes; the resulting combination may be used as a linear
classifier (only linearly separable classes will remain
separable after applying LDA). Linear discriminant analysis
is primarily used here to reduce the number of features to
a more manageable number before classification. Each of
the new dimensions is a linear combination of pixel values,
which form a template. This method can be used to
separate the alteration zones. For example, when different
data from various zones are available, discriminant analysis
can find the pattern within the data and classify it
effectively

• Support vector machine (SVM): An SVM model is


a representation of the examples as points in space,
mapped so that the examples of the separate categories are
divided by a clear gap that is as wide as possible. New
examples are then mapped into that same space and
predicted to belong to a category based on the side of the
gap on which they fall. SVM constructs a set of hyperplanes
in high-dimensional space; a good separation is achieved by
the hyperplane that has the largest distance to the nearest
training data points of any class.

• Multinomial logistic regression (MLR): a model used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.). It models the posterior class distributions in a Bayesian framework, thus supplying (in addition to the boundaries between the classes) a degree of plausibility for those classes.

Application of Hyperspectral Image Analysis

➢ Mineral targeting and mapping.

➢ Soil properties including moisture, organic content, and salinity.

➢ To identify Vegetation species, study plant canopy chemistry

➢ To detect military vehicles under partial vegetation canopy, and many other military target detection
objectives.

➢ Atmosphere: Study of atmospheric parameters such as clouds, aerosol conditions and Water vapor for
monitoring long term, large-scale atmospheric variations as a result of environmental change. Study of cloud
characteristics, i.e. structure and its distribution.

➢ Oceanography: Measurement of photosynthetic potential by detection of phytoplankton, detection of


yellow substance and detection of suspended matter. Investigations of water quality, monitoring coastal
erosion.

➢ Snow and Ice: Spatial distribution of snow cover, surface albedo and snow water equivalent. Calculation of
energy balance of a snow pack, estimation of snow properties-snow grain size, snow depth and liquid water
content.

➢ Oil Spills: When oil spills in an area affected by wind, waves, and tides, a rapid assessment of the damage
can help to maximize the cleanup efforts.

Challenges in hyperspectral image classification

The special characteristics of hyperspectral data pose several processing problems:

➢ The high-dimensional nature of hyperspectral data introduces important limitations in supervised classifiers,
such as the limited availability of training samples or the inherently complex structure of the data.

➢ There is a need to address the presence of mixed pixels resulting from insufficient spatial resolution and other
phenomena in order to properly model the hyperspectral data.

➢ There is a need to develop computationally efficient algorithms, able to provide a response in a reasonable
time and thus address the computational requirements of time-critical remote sensing applications.

➢ Subspace projection techniques applied prior to supervised classification of hyperspectral image data can help to
address each of the aforementioned items.

IMAGE DISPLAYS: (2018107013)


Image displays in use today are mainly color (preferably flat-screen) TV monitors. Monitors are driven by the
outputs of image and graphics display cards that are an integral part of the computer system. In some cases, it is
necessary to have stereo displays, and these are implemented in the form of headgear containing two small displays
embedded in goggles worn by the user.

Types of displays:

A) Color CRT Monitors:

A color CRT monitor displays images by using a combination of phosphors of different colors. There are two
popular approaches for producing color displays with a CRT:

1. Beam Penetration Method 2. Shadow-Mask Method

1) Beam Penetration Method

The Beam-Penetration method has been used with random-scan monitors. In this method, the CRT screen is coated
with two layers of phosphor, red and green and the displayed color depends on how far the electron beam
penetrates the phosphor layers. This method produces four colors only, red, green, orange and yellow. A beam of
slow electrons excites the outer red layer only; hence the screen shows red color only. A beam of high-speed
electrons excites the inner green layer. Thus, the screen shows a green color.

Advantages:

1. Inexpensive

Disadvantages:

1. Only four colors are possible

2. Quality of pictures is not as good as with


another method.

2) Shadow-Mask Method:

Shadow Mask Method is commonly used in Raster-Scan System because they produce a much wider range of colors
than the beam-penetration method. It is used in the majority of color TV sets and monitors.

Construction: A shadow mask CRT has 3 phosphor color dots at each pixel position.

- One phosphor dot emits: red light


- Another emits: green light
- Third emits: blue light

This type of CRT has 3 electron guns, one for each color dot and a shadow mask grid just behind the phosphor coated
screen. Shadow mask grid is pierced with small round holes in a triangular pattern. Figure shows the delta-delta
shadow mask method commonly used in color CRT systems.

Advantage:

1. Realistic image

2. Million different colors to be generated

3. Shadow scenes are possible

Disadvantage:

1. Relatively expensive compared with the


monochrome CRT.

2. Relatively poor resolution

3. Convergence Problem

B) RASTER DISPLAYS

Raster Scan

In a raster scan system, the electron beam is swept across the


screen, one row at a time from top to bottom. As the electron beam
moves across each row, the beam intensity is turned on and off to
create a pattern of illuminated spots. Picture definition is stored in
a memory area called the Refresh Buffer or Frame Buffer. This
memory area holds the set of intensity values for all the screen
points. Stored intensity values are then retrieved from the refresh
buffer and “painted” on the screen one row scanline at a time
as shown in the following illustration. Each screen point is referred
to as a pixel. At the end of each scan line, the electron beam returns
to the left side of the screen to begin displaying the next scan line.

C) Plasma display

A plasma display is a computer video display in which each pixel on the screen is illuminated by a tiny bit of plasma
or charged gas, somewhat like a tiny neon light. Plasma displays are thinner than cathode ray tube (CRT) displays

How Plasma Display Panels Work?

The structure of PDPs consists of multiple layers of


various materials, as seen in figure. The innermost layer
consists of a series of 3 cell compartments which make
up a single pixel of the projected image. Each cell
contains a gas mixture of noble gases, usually neon with
10-15% xenon, and is responsible for producing one of
the three primary colors, red, blue, or green. Outside of
these cells is the layer of dielectric material and
electrodes that provides energy to each of the 3-cell
chambers. The dielectric layer allows for more charge
to gather between the electrodes and the cells. On the
projection side of the display, the electrodes are
vertical and transparent. These electrodes are known as the transparent display electrodes and are coated in
magnesium oxide, while the back electrodes are known as the address electrodes. The outermost part of the plasma
display are the glass layers, one of which the image is shown on.

The way plasma displays work is similar to how a fluorescent light bulb works: gas is used to excite phosphors
which produce visible light. In plasma displays a voltage is applied to the gas within the cells and the gas becomes
ionized, creating plasma. The plasma does not provide the visible light itself; rather, it produces ultraviolet (UV)
light that excites the phosphors coated onto each cell. The color that is produced (either red, green, or
blue) depends on the phosphor. Red light is produced by phosphors such as (Y,Gd)BO3:Eu, YBO3:Eu, and
Y2O3:Eu. Blue light is produced by phosphors such as (Y,Gd)(V,P)O4 and BaMgAl14O23:Eu. Green light is produced
with phosphors such as Zn2SiO4:Mn, BaAl12O19:Mn, and SrAl12O19:Mn. By varying the intensity of the red, green,
and blue cells, all colors in the spectrum can be achieved.

IMAGE ENHANCEMENT DEFN: (2018107029)


Image enhancement refers to sharpening of image features such as edges, boundaries, or contrast to make a
graphic display more useful for display and analysis. The enhancement process does not increase the inherent
information content in the data. But it does increase the dynamic range of chosen features so that they can be
detected easily. Image enhancement is used to improve visual appeal and to understand the image content, to
accentuate edge information and aid better analysis, to help select a suitable feature extraction method, and to
increase the dynamic range.

Types of image enhancement techniques:

The aim of image enhancement is to improve the interpretability or perception of information in images for human
viewers, or to provide 'better' input for other automated image processing techniques. Image enhancement techniques
can be divided into two broad categories:

1. Spatial domain techniques, which operate directly on pixels.

2. Frequency domain techniques, which operate on the Fourier transform of an image.

Image enhancement is mostly an interactive and empirical approach.

SPATIAL vs FREQUENCY APPROACH

SPATIAL DOMAIN:

Direct manipulation of pixel values in the image plane:

g(x, y) = T(f(x, y))

where f is the input image, g is the output image and T is the operation. If T operates on the pixel value alone, it is a point operation (zero memory); if T involves the neighbouring pixel values, it is a local operation.

FREQUENCY DOMAIN:

Manipulation of pixel values by a linear, position-invariant operator:

g(x, y) = f(x, y) * h(x, y)

where f is the input image, g is the output image and h is the impulse response (convolution kernel). Equivalently, in the frequency domain,

G(w1, w2) = F(w1, w2) H(w1, w2)

Here are some useful examples and methods of image enhancement:

->Filtering with morphological operators

->Histogram equalization

->Noise removal using a Wiener filter

->Linear contrast adjustment

->Median filtering

->Unsharp mask filtering

->Contrast-limited adaptive histogram equalization (CLAHE)

->Decorrelation stretch

TYPES OF OPERATIONS IN IMAGE ENHANCEMENT: (2018107030)
There is a variety of ways to classify and characterize image operations. The reason for doing so is to understand
what type of results we might expect to achieve with a given type of operation or what might be the computational
burden associated with a given operation.

Image Enhancement

1. Point operations
• Contrast stretching
• Noise clipping
• Window slicing
• Histogram modeling

2. Spatial operations
• Noise smoothing
• Median filtering
• LP, HP & BP filtering
• Zooming

3. Transform operations
• Linear filtering
• Root filtering
• Homomorphic filtering

4. Pseudocoloring
• False coloring
• Pseudo coloring

Morphological operation

In a morphological operation, the value of each pixel in the output image is based on a comparison of the
corresponding pixel in the input image with its neighbors.

Composite Operation

Dilation and erosion work together in composite operation. There are common ways to represent the order of these
two operations, opening and closing.

Type of operations

The types of operations that can be applied to digital images to transform an input image a [m, n] into an output
image b [m, n] (or another representation) can be classified into three categories

Operation — Characterization — Generic complexity per pixel

• Point — the output value at a specific coordinate is dependent only on the input value at that same coordinate — constant

• Local — the output value at a specific coordinate is dependent on the input values in the neighborhood of that same coordinate — P²

• Global — the output value at a specific coordinate is dependent on all the values in the input image — N²

Types of image operations. Image size = N × N, neighborhood size = P × P. Note that the complexity is specified in operations per pixel. This is shown graphically in Figure (1.2).

[Figure (1.2): Illustration of various types of image operations]

Types of neighborhoods

Neighborhood operations play a key role in modern digital image


processing. It is therefore important to understand how images can
be sampled and how that relates to the various neighborhoods that
can be used to process an image.

Rectangular sampling - In most cases, images are sampled by laying


a rectangular grid over an image as illustrated in Figure (1.1). This
results in the type of sampling shown in Figure(1.3ab). Hexagonal
sampling-An alternative sampling scheme is shown in Figure (1.3c) and is termed hexagonal sampling.

Both sampling schemes have been studied extensively and both represent a possible periodic tiling of the
continuous image space. However, rectangular sampling, due to hardware and software considerations, remains the
method of choice. Local operations produce an output pixel value b[m = m0, n = n0] based upon the pixel values in the
neighbourhood of a[m = m0, n = n0]. Some of the most common neighborhoods are the 4-connected and 8-connected
neighborhoods in the case of rectangular sampling, and the 6-connected neighborhood in the case of hexagonal sampling,
as illustrated in Figure (1.3).

POINT OPERATIONS: (2018107031)


Zero memory operations where a given gray level 𝒖 ∈ [𝟎, 𝑳] is mapped into a gray level 𝒗 ∈ [𝟎, 𝑳] according to
a transformation. Input and output gray levels are distributed between [0, L]. Typically, L = 255. Point operations
refer to running the same conversion operation for each pixel in a grayscale image. The transformation is based on
the original pixel and is independent of its location or neighboring pixels. Let “r” and “s” be the gray value at a point
(x, y) of the input image f (x, y) and the output image g (x, y), then the point operation can be defined by the following
formula, s= T (r) where T is the point operator of a certain gray-level mapping relationship between the original
image and the output image. Point operations are often used to change the grayscale range and distribution.

Point operations techniques are

● Contrast stretching ● Noise clipping ● Window slicing ● Histogram modeling

CONTRAST STRETCHING

Contrast stretching (often called normalization) is a simple


image enhancement technique that attempts to improve the
contrast in an image by `stretching' the range of intensity
values it contains to span the desired range of values, e.g. the
full range of pixel values that the image type concerned allows.
It differs from the more sophisticated histogram equalization
in that it can only apply a linear scaling function to the image
pixel values. As a result, the `enhancement' is less harsh. (Most
implementations accept a gray level image as input and
produce another gray level image as output)

THRESHOLDING

In many vision applications, it is


useful to be able to separate the regions
of the image corresponding to objects in
which we are interested, from the
regions of the image that correspond to
the background. Thresholding often
provides an easy and convenient way to
perform this segmentation based on the
different intensities or colors in the
foreground and background regions of an
image.

Besides, it is often useful to be able to see what areas of an image consist of pixels whose values lie within a specified
range, or band of intensities (or colors). Thresholding can be used for this as well. The a and b define the valley
between the peaks of the histogram. For a = b = t, this is called THRESHOLDING (The output becomes binary).

NOISE CLIPPING

A special case of contrast stretching where α = γ = 0 is called CLIPPING. It is useful


for noise reduction when the input signal is known to lie in the range [a, b]. The
slopes α, β, γ determine the relative contrast stretch.

LINEAR AND NON-LINEAR ENHANCEMENT: (2018107032)
Linear Enhancement:

1.Min-Max Contrast Enhancement

2.Percentage Linear Enhancement

3.Piecewise Linear Enhancement

Min-Max Contrast Enhancement:

The original minimum and maximum values of the data are assigned to a newly specified set of values that utilize
the full range of available brightness values.

DN_out(i) = (DN(i) − DN_min) / (DN_max − DN_min) × 255
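A small sketch of the min-max stretch applied to an array of digital numbers (variable names are illustrative):

import numpy as np

def minmax_stretch(dn):
    dn = dn.astype(float)
    stretched = (dn - dn.min()) / (dn.max() - dn.min()) * 255.0
    return stretched.astype(np.uint8)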

Percentage Linear Contrast Enhancement:

The percentage linear contrast stretch is similar to the minimum-maximum linear contrast stretch except that it uses
specified minimum and maximum values that lie at a certain percentage of pixels from the mean of
the histogram. A standard deviation from the mean is often used to push the tails of the histogram beyond the
original minimum and maximum values.

DN_out(i) = (DN(i) − DN_pmin) / (DN_pmax − DN_pmin) × 255

Piecewise Linear Enhancement:

When the distribution of a histogram in an image is bi or trimodal, an analyst may stretch certain values of the
histogram for increased enhancement in selected areas. This method of contrast enhancement is called a
piecewise linear contrast stretch. A piecewise linear contrast enhancement involves the identification of a number
of linear enhancement steps that expands the brightness ranges in the modes of the histogram and to increase the
dynamic range(contrast) specifically for selected targets.
DN_out(i) = (DN(i) − DN_pmin) / (DN_pmax − DN_pmin) × r1, for all DN(i) where r1 ≤ i ≤ r2

Nonlinear Enhancement:

Nonlinear contrast enhancement often involves histogram equalizations through the use of an algorithm. The
nonlinear contrast stretch method has one major disadvantage. Each value in the input image can have several
values in the output image, so that objects in the original scene lose their correct relative brightness value.

• Nonlinear functions with a fixed form.

• Fewer parameters to adjust.

• Satisfying 0 = f(min) <= g <= f(max) = L-1

Logarithmic transformation g = b log (af+ 1)

• Stretch dark region, suppress bright region.

Exponential transformation g = b (e^af - 1)

• Expand bright region

Power Law g = af^k

• K = 2: square law, similar to exponential

• K = 1/3: cubic root, similar to logarithmic
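A brief sketch of the three transformations above applied to a toy image; the constants a, b and k are arbitrary example values:

import numpy as np

f = np.arange(256, dtype=float).reshape(16, 16)   # toy input image
a, b, k = 0.05, 40.0, 2.0

g_log   = b * np.log(a * f + 1.0)    # stretches dark regions, suppresses bright ones
g_exp   = b * (np.exp(a * f) - 1.0)  # expands bright regions
g_power = a * f ** k                 # power law (k = 2 square law, k = 1/3 cube root)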

There are four methods of nonlinear contrast enhancement:

1. Histogram Equalizations

2. Adaptive Histogram Equalization

3. Homomorphic Filter

4. Unsharp Mask

HISTOGRAM EQUALISATION: (2018107033)

Histogram equalization is a method in image processing of contrast


adjustment using the image's histogram. This method usually increases
the global contrast of many images, especially when the usable data of
the image is represented by close contrast values.

➢ Through this adjustment, the intensities can be better distributed on the histogram. This allows for areas of lower
local contrast to gain a higher contrast without affecting the global contrast.

➢ Histogram equalization accomplishes this by effectively spreading out the most frequent intensity values.

➢ The method is useful in images with backgrounds and foregrounds that are both bright or both dark

➢ A key advantage of the method is that it is a fairly straightforward technique and an invertible operator. So, in
theory, if the histogram equalization function is known, then the original histogram can be recovered.

➢ A disadvantage of the method is that it is indiscriminate. It may increase the contrast of background noise, while
decreasing the usable signal.

➢ There are two ways to think about and implement histogram equalization, either as image change or as palette
change. The operation can be expressed as P(M(I)) where I is the original image, M is histogram equalization
mapping operation and P is a palette.

➢ If we define new palette as P'=P(M) and leave image I unchanged then histogram equalization is implemented as
palette change. On the other hand if palette P remains unchanged and image is modified to I'=M(I) then the
implementation is by image change. In most cases palette change is better as it preserves the original data.

Histogram equalization of color image

➢ The above describes histogram equalization on a greyscale image. However, it can also be used on color images
by applying the same method separately to the Red, Green and Blue components of the RGB color values of the
image.

➢ Still, it should be noted that applying the same method on the Red, Green, and Blue components of an RGB image
may yield dramatic changes in the image's color balance since the relative distributions of the color channels change
as a result of applying the algorithm.

➢ However, if the image is first converted to another color space, Lab color space, or HSL/HSV color space in
particular, then the algorithm can be applied to the luminance or value channel without resulting in changes to the
hue and saturation of the image.

Example:

[Small image and its 8-bit greyscale pixel values]

This cdf shows that the minimum value in the sub image is 52 and the maximum value is 154. The cdf of 64 for value
154 coincides with the number of pixels in the image. The cdf must be normalized to [0, 255]. The general histogram
equalization formula is:

h(v) = round( (cdf(v) − cdf_min) / (M × N − cdf_min) × (L − 1) )

where cdf_min is the minimum value of the cumulative distribution function (in this case 1), M × N gives the image's
number of pixels (for the example above 64, where M is the width and N the height) and L is the number of grey levels
used (in most cases, like this one, 256). The equalization formula for this particular example is:

h(v) = round( (cdf(v) − 1) / (64 − 1) × 255 )

For example, the cdf of 78 is 46. (The value of 78 is used in the bottom row of the 7th column.) The normalized value
becomes h(78) = round( (46 − 1) / 63 × 255 ) = round(182.14) = 182.

Once this is done then the values of the equalized image are directly taken from the normalized
cdf to yield the equalized values:

Notice that the minimum value (52) is now 0 and the maximum value (154) is now 255.
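The same normalisation can be coded directly from the cdf. A minimal sketch for an 8-bit greyscale image follows; the function and variable names are illustrative assumptions, not from the text.

import numpy as np

def histogram_equalize(image, levels=256):
    """Equalize an 8-bit greyscale image using
    h(v) = round((cdf(v) - cdf_min) / (M*N - cdf_min) * (L - 1))."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()           # smallest non-zero cdf value
    total = image.size                     # M * N
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * (levels - 1))
    lut = np.clip(lut, 0, levels - 1).astype(np.uint8)
    return lut[image]                      # map every pixel through the LUT

# For the sub-image tabulated above, the LUT sends 52 -> 0, 78 -> 182 and 154 -> 255.
img = np.random.randint(52, 155, size=(8, 8), dtype=np.uint8)   # placeholder data
eq = histogram_equalize(img)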

FALSE COLORING AND PSEUDO-COLORING: (2018107034)
False coloring

1. False color images are a representation of a multispectral image produced using any bands other than visible red, green and blue as the red, green and blue components of the display.

2. False color composites allow us to visualize wavelengths that the human eye cannot see (i.e. near-infrared and beyond).

3. False color images are mainly used for satellite and space imagery.

4. Examples are remote sensing satellites (e.g. Landsat), space telescopes (e.g. the Hubble Space Telescope) or space probes (e.g. Cassini-Huygens).

5. False color refers to a group of color rendering methods used to display images in color which were recorded in
the visible or non-visible parts of the electromagnetic spectrum.

6.A false-color image is an image that depicts an object in colors that differ from those a photograph would show.

7. In such images, colors are assigned to three different wavelengths that our eyes cannot normally see.

8.In addition, variants of false color such as pseudo color, density slicing, and choropleths are used for information
visualization of either data gathered by a single grayscale channel or data not depicting parts of the electromagnetic
spectrum (e.g. elevation in relief maps or tissue types in magnetic resonance imaging).

9.In contrast to a true-color image, a false-color image sacrifices natural color rendition in order to ease the
detection of features that are not readily discernible otherwise – for example the use of near infrared for the
detection of vegetation in satellite images.

10.While a false-color image can be created using solely the visual spectrum (e.g. to accentuate color differences),
typically some or all data used is from electromagnetic radiation (EM) outside the visual spectrum (e.g. infrared,
ultraviolet or X-ray).

Pseudo coloring

1. A pseudo color image (sometimes styled pseudo-color or pseudocolor) is derived from a grayscale image by mapping each intensity value to a color according to a table or function.

2. Pseudo color is typically used when a single channel of data is available (e.g. temperature, elevation, soil
composition, tissue type, and so on), in contrast to false color which is commonly used to display three channels of
data.

3.Pseudocoloring can make some details more visible, as the perceived difference in color space is bigger than
between successive gray levels alone.

4.A typical example for the use of pseudo color is thermography (thermal imaging), where infrared cameras
feature only one spectral band and show their grayscale images in pseudo color.

5.Another familiar example of pseudo color is the encoding of elevation using hypsometric tints in physical relief
maps, where negative values (below sea level) are usually represented by shades of blue, and positive values by
greens and browns.

6.Depending on the table or function used and the choice of data sources, pseudo coloring may increase the
information contents of the original image, for example adding geographic information, combining information
obtained from infrared or ultra-violet light, or other sources like MRI scans.

7. A further application of pseudo coloring is to store the results of image elaboration; that is, changing the colors in order to make an image easier to understand.
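A minimal sketch of pseudo coloring as a lookup-table operation; the blue-to-red ramp used here is only an illustrative choice of table, not one prescribed by the text.

import numpy as np

def pseudo_color(gray, colormap):
    """Map each grey level of an 8-bit image to an RGB colour via a 256x3 table."""
    return colormap[gray]                      # result has shape (H, W, 3)

# Illustrative 256-entry table: blue for low values ramping to red for high values.
levels = np.arange(256, dtype=np.float32) / 255.0
colormap = np.stack([levels, np.zeros(256), 1.0 - levels], axis=1)   # R, G, B in [0, 1]
colormap = np.round(colormap * 255).astype(np.uint8)

gray = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)   # placeholder image
rgb = pseudo_color(gray, colormap)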

Density slicing

1.A variation of pseudo color, divides an image into a few colored bands and is (among others) used in the
analysis of remote sensing images.

2. In density slicing, the range of grayscale levels is divided into intervals, with each interval assigned to one of a few discrete colors – this is in contrast to pseudo color, which uses a continuous color scale.

3. For example, in a grayscale thermal image the temperature values can be split into bands of 2 °C, with each band represented by one color – as a result the temperature of a spot in the thermograph can be read more easily by the user, because the discernible differences between the discrete colors are greater than those of images with a continuous grayscale or continuous pseudo color.

Choropleth

1.A choropleth is an image or map in which areas are colored or patterned proportionally to the category or
value of one or more variables being represented.

2.The variables are mapped to a few colors; each area contributes one data point and receives one color from
these selected colors

3.Basically it is density slicing applied to a pseudo color overlay.

4. A choropleth map of a geographic area is thus an extreme form of false color.

IMAGE FORMATION: (2018107001)
The study of image formation encompasses the radiometric and geometric processes by which 2D images of 3D
objects are formed. In cases of digital images, the formation process also includes analog to digital conversion and
sampling.

A digital image is formed by small units of data, i.e. pixels, which are stored in computers. When we capture an image with a digital camera in the presence of light, the camera acts as a digital sensor and converts the incoming light into digital signals.

The two parts of the image formation process:

- The geometry of the image formation which determines where in the image plane the projection of a point
in the scene will be located
- The physics of light which determines the brightness of a point in the image plane as a function of
illumination and surface properties.

Image formation involves:

- Geometry and Radiometry - Photometry - Colorimetry - Sensors

Physical parameters of image formation:

• Geometric

• Optical

• Photometric

Filtering:

Digital filters are used to blur and sharpen digital images. They are performed by:

- Convolution with specifically designed kernels in the spatial domain.


- Masking specific frequency regions in the frequency domain.
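As an illustration of the spatial-domain route, here is a minimal sketch of 2-D convolution with a 3×3 averaging (blur) kernel and a Laplacian-based sharpening kernel; the use of SciPy's ndimage.convolve and the random test image are assumptions for the example only (a frequency-domain filter would instead zero out selected Fourier coefficients).

import numpy as np
from scipy import ndimage

img = np.random.rand(64, 64)                       # placeholder greyscale image

blur_kernel = np.full((3, 3), 1.0 / 9.0)           # averaging kernel -> blurring
sharpen_kernel = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], float)   # Laplacian-based sharpening

blurred = ndimage.convolve(img, blur_kernel, mode="reflect")
sharpened = ndimage.convolve(img, sharpen_kernel, mode="reflect")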

Overlapping fields

• Machine or computer vision

• Computer graphics

• Artificial intelligence

• Signal processing

Image padding in Fourier domain filtering

Images are typically padded before being transformed to the Fourier domain; the high-pass filtered results illustrate the consequences of different padding techniques:

1. Zero padded 2. Repeated edge padded

Affine transformations:

They enable the basic image transformations including scale, rotate, translate, mirror and shear.

- Identity - Reflection - Scale - Rotate - Shear.
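A minimal sketch of how these basic transformations can be expressed as 3×3 affine matrices acting on homogeneous pixel coordinates; the composition order, angles and offsets below are arbitrary illustrative choices.

import numpy as np

def affine_matrix(scale=(1.0, 1.0), angle_deg=0.0, translate=(0.0, 0.0), shear=0.0):
    """Compose scale, rotation, shear and translation into one 3x3 affine matrix."""
    a = np.deg2rad(angle_deg)
    S = np.diag([scale[0], scale[1], 1.0])
    R = np.array([[np.cos(a), -np.sin(a), 0],
                  [np.sin(a),  np.cos(a), 0],
                  [0,          0,         1]])
    H = np.array([[1, shear, 0], [0, 1, 0], [0, 0, 1]], float)   # shear in x
    T = np.array([[1, 0, translate[0]], [0, 1, translate[1]], [0, 0, 1]], float)
    return T @ R @ H @ S

# Transform a pixel coordinate (x, y) = (10, 20) expressed homogeneously.
M = affine_matrix(scale=(2, 2), angle_deg=30, translate=(5, -3))
x_new, y_new, _ = M @ np.array([10.0, 20.0, 1.0])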

Image sampling.

• Discretizing the coordinate values is called sampling.


• Single sensing element combined with motion.
• Sensing strip
• Sensing array

Quantization:

Discretizing the amplitude values is called quantization. The quality of a digital image is determined to a large extent by the number of samples and discrete intensity levels used in the sampling and quantization.

Image interpolation:

Types: Nearest neighbor interpolation, Bilinear interpolation, Bicubic interpolation

SIMPLE IMAGE FORMATION MODEL

An image is a 2-D light intensity function f(x, y). An image is formed through the mapping of a 3D scene onto a 2D surface or plane. The light intensity distribution f(x, y) over that plane characterizes the image and is considered to depend on two distributions: illumination and reflectance. The image is modeled by the illumination-reflectance model, f(x, y) = i(x, y) · r(x, y).

Transformations between frames

Object coordinates 3D

World coordinates 3D

Camera coordinates 3D

Image plane coordinates 2D

Pixel coordinates 2D

IMAGE PROPERTIES: (2018107009)


The image properties are explained as follows,

• Image Dimension • Bit depth • Width • Colour model

IMAGE DIMENSIONS:

Image dimensions are the length and width of a digital image. They are usually measured in pixels, but some graphics programs allow you to view and work with your image in the equivalent inches or centimeters. Depending on what you plan to use your image for, you may want to change the image size. For example, if you are using a high-resolution digital photograph, you might want to make the image dimensions smaller. The standard aspect ratio for widescreen is 16:9 (or 1.78:1); the letterbox format minimizes the size of the bars required to fit traditional 4:3 broadcasts onto screens using the wider format.

BIT DEPTH:

Bit depth refers to the color information stored in an image. The higher the bit depth of an image, the more colors
it can store. The simplest image, a 1 bit image, can only show two colors, black and white. That is because the 1 bit
can only store one of two values, 0 (black) and 1 (white). It is the amount of colour information contained in each
pixel of the image. When referring to a color component, the concept can be defined as bits per component, bits
per channel, bits per color (all three abbreviated bpc), and also bits per pixel component, bits per color channel or
bits per sample (bps)

WIDTH:

The width property specifies the width of an element, and the height property specifies the height of an element.

img {width: 200px; height: 100px; }

The height and width properties may have the following values:

• auto - This is default. The browser calculates the height and width

• length - Defines the height/width in px, cm etc.

• % - Defines the height/width in percent of the containing block

• initial - Sets the height/width to its default value

• inherit - The height/width will be inherited from its parent value

COLOUR MODEL:

Digital images produced in color are governed by the same concepts of sampling, quantization, spatial resolution,
bit depth, and dynamic range that apply to their grayscale counterparts. However, instead of a single brightness
value expressed in gray levels or shades of gray, color images have pixels that are quantized using three independent
brightness components, one for each of the primary colors. When color images are displayed on a computer
monitor, three separate color emitters are employed, each producing a unique spectral band of light, which are
combined in varying brightness levels at the screen to generate all of the colors in the visible spectrum

Two common color models in imaging are RGB and CMY, two common color models in video are YUV and YIQ. YUV
uses properties of the human eye to prioritize information. Y is the black and white (luminance) image, U and V are
the color difference (chrominance) images.

IMAGE FUSION: (2018107038)


Image fusion is defined as the process of gathering all the important information from multiple images and including it in fewer images, usually a single one. The resulting single image is more informative and accurate than any single source image, and it contains all the necessary information.

Two basic stages:

i. image registration: which brings the input images to spatial alignment.

ii. combining the image function: (intensities, colors, etc.,) in the area of frame overlap

Image registration works in four stages:

Feature detection: a low-level image processing operation. It is usually performed as the first operation on an image and examines every pixel to see if a feature is present at that pixel.

Feature matching: detects a specified target in a cluttered scene. This method detects a single object rather than multiple objects. The algorithm is based on comparing and analyzing point correspondences between the reference image and the target image.

Transform model estimation: The type and parameters of the so-called mapping functions, which align the sensed image with the reference image, are estimated.

Image resampling and transformation: The sensed image is transformed by means of the mapping functions. Image values at non-integer coordinates are estimated by an appropriate interpolation technique.

Multi modal fusion with different resolution: One image with high spatial resolution, the other one with low spatial
but higher spectral resolution. E.g.: PAN sharpening.

IMAGE RATIOS: (2018107058)


NDVI (Normalized Difference Vegetation Index)

● NDVI quantifies vegetation by measuring the difference between near-infrared (which vegetation strongly
reflects) and red light (which vegetation absorbs).

● NDVI is functionally equivalent to the simple ratio (SR); i.e. there is no scatter in an SR vs NDVI plot, and each SR value has a fixed NDVI value.

NDVI = (NIR - Red) / (NIR + Red)

The NDVI value varies from -1 to +1.

● Healthy vegetation reflects more NIR and green light compared to other wavelengths. But it absorbs more red
and blue light.

● This is why our eyes see vegetation as green colour.

Generally, the obtained results are:

NDVI= -1 to 0 represent water bodies

NDVI= -0.1 to 0.1 represent barren rocks, sand or snow

NDVI= 0.2 to 0.5 represent shrubs and grasslands

NDVI= 0.6 to 1.0 represent dense vegetation or tropical rainforest.

Importance of NDVI:

● Seasonal and inter - annual changes in Vegetation growth and activity can be monitored.

● The ratioing reduces many forms of multiplicative noise (sun illumination differences, cloud shadows, some
atmospheric attenuation, some topographic variations) present in multiple bands of multiple date imagery.

➢ Two of the standard MODIS land products are sixteen-day composite NDVI datasets of the world at spatial resolutions of 500 m and 1 km.
➢ The NDVI transform also performs well on high spatial resolution imagery that has red and near-infrared
bands.

How we use NDVI:

• Satellites like Sentinel-2, Landsat and SPOT produce red and near infrared images.
• This list of 15 free satellite imagery data sources has data that we can download and create NDVI maps in
ArcGIS and QGIS.
• We use NDVI in agriculture, forestry and the environment.

NDBI (Normalized Difference Built-up Index)

● Normalized Difference Built-up Index is used to extract built-up features.

● Built-up areas are effectively mapped through arithmetic manipulation of re-coded Normalized Difference Vegetation Index (NDVI) and NDBI images derived from the imagery.

● NDBI is computed from the near-infrared (NIR) band and the middle/short-wave infrared (SWIR) band, and is used for extraction of urban built-up land.

● NDBI can be applied to study seasonal changes in the surface urban heat island effect.

NDBI = (SWIR - NIR) / (SWIR + NIR)

● NDBI value lies between -1 to +1.

● Negative value of NDBI represent water bodies whereas higher value represents built-up areas.

● NDBI value for Vegetation is low.

● This resulted in an output image that contained only built-up and barren pixels having positive values while
all other land cover had a value of 0 or -254.

● The technique was reported to be 92% accurate.

NDWI (Normalized Difference Water Index)

● The Normalized Difference Water Index is an index used to extract water bodies from satellite imagery.

● It is used to differentiate water from dry land and is most suitable for water body mapping.

● Water bodies have low reflectance and strong absorption in the visible to infrared wavelength range.

● NDWI uses the green and near-infrared bands of remote sensing images, based on this behaviour.

NDWI = (G - NIR) / (G + NIR)

This formula highlights the amount of water in water bodies. An alternate method of calculation uses the NIR
and SWIR channels.

NDWI = (NIR - SWIR)/(NIR + SWIR)

● The results of NDWI can be presented in the form of maps and graphs providing information on both the
spatial distribution of water stress on vegetation and its temporal evolution over longer periods of time.

● The NDWI value varies from -1 to +1.

● Generally, water bodies NDWI value is greater than 0.5.

● Vegetation has much smaller values, which makes it easy to distinguish vegetation from water bodies.

● Built-up features have positive values that lie between 0 and 0.2.
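A minimal sketch that applies the three ratio formulas above to co-registered band arrays; the band variables, the epsilon guard against division by zero and the random placeholder data are illustrative assumptions.

import numpy as np

def normalized_difference(band_a, band_b, eps=1e-10):
    """Generic (A - B) / (A + B) ratio, guarded against division by zero."""
    a = band_a.astype(np.float64)
    b = band_b.astype(np.float64)
    return (a - b) / (a + b + eps)

# Placeholder reflectance bands of identical shape (i.e. resampled to a common grid).
green = np.random.rand(100, 100)
red = np.random.rand(100, 100)
nir = np.random.rand(100, 100)
swir = np.random.rand(100, 100)

ndvi = normalized_difference(nir, red)     # vegetation: (NIR - Red) / (NIR + Red)
ndbi = normalized_difference(swir, nir)    # built-up:   (SWIR - NIR) / (SWIR + NIR)
ndwi = normalized_difference(green, nir)   # water:      (G - NIR) / (G + NIR)

water_mask = ndwi > 0.5                    # e.g. the >0.5 water threshold mentioned above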

IMAGE RECTIFICATION: (2018107039)


Image rectification is a process of geometrically correcting an image so that it can be represented on a planar
surface, conform to other images or conform to a map. That is, it is the process by which geometry of an image is
made planimetric. It is necessary when accurate area, distance and direction measurements are required to be made
from the imagery. It is achieved by transforming the data from one grid system into another grid system using a
geometric transformation. Grid transformation is achieved by establishing mathematical relationship between the

addresses of pixels in an image with the corresponding coordinates of those pixels on another image or map or
ground.

IMAGE TO MAP RECTIFICATION PROCEDURE:

Two basic operations must be performed geometrically to rectify a remotely sensed image to a map coordinate
system:

1. Geometric Transformation coefficient computation:

• The geometric relationship between the input pixel location (row and column; x', y') and the associated map coordinates of the same point (x, y) is identified. This involves selecting Ground Control Points (GCPs), which are used to establish the nature of the geometric coordinate transformation that must be applied to rectify or fill every pixel in the output image (x, y) with a value from a pixel in the unrectified input image (x', y').
• At the end of this procedure, it is equivalent to overlapping an empty output grid over an image or a map.
• This process is also called spatial interpolation.

2. Intensity Interpolation (Resampling):

• Now, in this step, we will fill the DN values in the empty grid which was created in the previous step.
• A pixel in the rectified image often requires a value from the input pixel grid that does not fall neatly on a
row and column co-ordinate.
• When this occurs, there must be some mechanism for determining the brightness value (BV) to be assigned
to the output rectified pixel. This process is called intensity interpolation.

The first step mentioned above in the process of image rectification begins with the selection of Ground Control
Points (GCP).

A) GROUND CONTROL POINTS (GCPs):

• A ground control point (GCP) is a location on the surface of the Earth (e.g., a road intersection) that can be
identified on the imagery and located accurately on a map.
• There are two distinct sets of co-ordinates associated with each GCP:
- Source or image coordinates specified in i rows and j columns and,
- Reference or map coordinates (e.g., x, y measured in degrees of latitude and longitude, or meters in a
Universal Transverse Mercator Projection).

ACCURACY OF TRANSFORMATION:

• Accurate GCPs are essential for accurate rectification.


• Well dispersed GCPs result in more reliable rectification.
• GCPs for Large Scale Imagery:
- Road intersections, airport runways, towers, buildings etc.
- For small scale imagery:
• Larger features like urban area or Geological features can be used.
NOTE: Landmarks that can vary (like lakes, other water bodies, vegetation etc.) should not be used.
• Sufficiently large number of GCPs should be selected.
• Requires a minimum number depending on the type of transformation.

SPATIAL INTERPOLATION USING POLYNOMIAL COORDINATE TRANSFORMATION:

• Polynomial equations are used to convert the source file coordinates to rectified map coordinates.
• Depending upon the distortions in the imagery, the number of GCPs used and their location relative to one another, more or less complex polynomial equations are used.
• The degree of complexity of the polynomial is expressed as ORDER of the polynomial.

• The order is simply the highest exponent used in the polynomial.

FIRST ORDER POLYNOMIAL TRANSFORMATION/LINEAR TRANSFORMATION/AFFINE TRANSFORMATION:

• This type of transformation can model six kinds of distortion in the remote sensor data including:
✓ Location in x and /or y
✓ Scale in x and /or y
✓ Skew in x and/or y
✓ Rotation.
• If the coefficients a0, a1, a2, b0, b1, b2 are known, then the first-order polynomial can be used to relate any point on the map to its corresponding point on the image and vice versa. Hence six coefficients are required for this transformation (three for x and three for y).
• So, it requires a minimum of THREE GCPs for solving the above equation.
• However, the error cannot be estimated with three GCPs alone. Hence one additional GCP is taken.
• Before applying rectification to the entire set of the data, it is important to determine how well the six coefficients derived from the least squares regression of the initial GCPs account for the geometric distortion of the input image.

(The accompanying figure illustrates the first-order polynomial transformation process described above.)

ACCURACY OF TRANSFORMATION:

• Before applying the rectification to the entire data i.e., calculating where each pixel would lie on the map,
we have to test the accuracy that is achieved with these four GCPs (Ground Control Points).
• In this method, we check how good do selected points fit between the map and the image?
• To solve the linear polynomials, we first take four GCPs to compute the six coefficients. The source coordinates of a GCP in the original input image are, say, x1 and y1. The position of the same point in the reference map, in degrees, feet or meters, is, say, x, y.
• Now, if we input the map x, y values for the first GCP back into the linear polynomial equation with all the coefficients in place, we get the computed or retransformed 𝑥𝑟 and 𝑦𝑟 values, which are supposed to be the location of this point in the input image.
• Ideally measured and computed values should be equal.
• In reality, this does not happen which results in error.

• So, we have to compute this error by the below technique.

ROOT MEAN SQUARE (RMS) ERROR:

• Accuracy is measured by computing Root Mean Square Error (RMS error) for each of the ground control
point.
• RMS error is the distance between the input (source or measured) location of a GCP and the retransformed
(or computed) location for the same GCP.
• RMS error is computed with the Euclidean distance equation:

RMS error = [ (xr - xi)² + (yr - yi)² ]^1/2

where xi and yi are the input source coordinates and xr and yr are the retransformed coordinates.

ACCEPTABLE RMS ERROR:

• The amount of RMS error that is tolerated can be thought of as a window around each source coordinate, inside which a retransformed coordinate is considered to be correct.
• Acceptable RMS error depends upon the end use of the data, the type of data being used, and the accuracy of the GCPs and the ancillary data.
• Normally an RMS error of less than 1 per GCP and a total RMS error of less than half a pixel (0.5) is acceptable.
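A minimal sketch of this workflow: fit the six first-order coefficients from a set of GCPs by least squares, retransform the map coordinates back to image space, and report the RMS error per GCP. The GCP coordinate values below are made-up illustrations, not data from the text.

import numpy as np

# Illustrative GCPs: image (column, row) and corresponding map (x, y) coordinates.
img_xy = np.array([[100.0, 200.0], [400.0, 180.0], [390.0, 520.0], [120.0, 540.0]])
map_xy = np.array([[5000.0, 8000.0], [5300.0, 8010.0], [5290.0, 7680.0], [5020.0, 7660.0]])

# Design matrix [1, x_map, y_map]: solve x_img = a0 + a1*x + a2*y and y_img = b0 + b1*x + b2*y.
A = np.column_stack([np.ones(len(map_xy)), map_xy])
coeff_x, *_ = np.linalg.lstsq(A, img_xy[:, 0], rcond=None)   # a0, a1, a2
coeff_y, *_ = np.linalg.lstsq(A, img_xy[:, 1], rcond=None)   # b0, b1, b2

# Retransform map coordinates to image space and measure the per-GCP RMS error.
x_r = A @ coeff_x
y_r = A @ coeff_y
rms_per_gcp = np.sqrt((x_r - img_xy[:, 0]) ** 2 + (y_r - img_xy[:, 1]) ** 2)
total_rms = np.sqrt(np.mean(rms_per_gcp ** 2))
acceptable = total_rms < 0.5   # e.g. the half-pixel tolerance mentioned above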

B) INTENSITY INTERPOLATION (RESAMPLING):

- Once an image is warped, how are DNs assigned to the “new” pixels?

• Since the grid of pixels in the source image rarely matches the grid for the reference image, the pixels are
resampled so that new data file values for the output file can be calculated.
• This process involves filling the rectified output grid with brightness values extracted from a location in the
input image and its reallocation in the appropriate coordinate location in the rectified output image.
• This results in input line and column numbers that are real numbers (and not integers).
• When this occurs, the methods of assigning Brightness values are: Nearest neighbor, Bilinear, Cubic.

B1) NEAREST NEIGHBOUR INTERPOLATION:

• The nearest neighbor approach uses the value of the closest input pixel for the output pixel value.
• The pixel value occupying the closest image file coordinate to the estimated coordinate will be used for the
output pixel value in the georeferenced image.

ADVANTAGES:

• Output values are the original input values. Other methods of resampling tend to average surrounding
values. This may be an important consideration when discriminating between vegetation types or locating
boundaries.
• Since original data are retained, this method is recommended before classification.
• Easy to compute and therefore fastest to use.

DISADVANTAGES:

• Produces a choppy, "stair-stepped" effect. The image has a rough appearance relative to the original unrectified data.
• Data values may be lost, while other values may be duplicated.

B2) BILINEAR INTERPOLATION:

The bilinear interpolation approach uses the weighted average of the nearest four pixels to the output pixel.

ADVANTAGES:

The stair-step effect caused by the nearest neighbor approach is reduced. The image looks smooth.

DISADVANTAGES:

• Alters original data and reduces contrast by averaging neighboring values together.
• It is computationally more intensive than nearest neighbor.
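A minimal sketch of bilinear resampling for one output pixel whose back-projected location falls at a non-integer input position (x, y); the function name and test grid are illustrative assumptions.

import numpy as np

def bilinear_sample(image, x, y):
    """Weighted average of the four input pixels surrounding the real-valued (x, y).
    `image` is indexed as image[row, col]; x is the column and y the row coordinate."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, image.shape[1] - 1), min(y0 + 1, image.shape[0] - 1)
    dx, dy = x - x0, y - y0
    top = (1 - dx) * image[y0, x0] + dx * image[y0, x1]
    bottom = (1 - dx) * image[y1, x0] + dx * image[y1, x1]
    return (1 - dy) * top + dy * bottom

img = np.arange(25, dtype=float).reshape(5, 5)   # placeholder input grid
value = bilinear_sample(img, 2.4, 1.7)           # brightness assigned to the output pixel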

B3) CUBIC CONVOLUTION:

The cubic convolution approach uses the weighted average of the nearest sixteen pixels to the output pixel. The
output is similar to bilinear interpolation, but the smoothing effect caused by the averaging of surrounding input
pixel values is more dramatic.

ADVANTAGES:

The stair-step effect caused by the nearest neighbor approach is reduced. The image looks smooth.

DISADVANTAGES:

• Alters original data and reduces contrast by averaging neighboring values together.
• It is computationally more intensive than nearest neighbor or bilinear interpolation.

INDEXED IMAGES AND IMAGE RATIOS: (2018107011)


Image processing toolbox software defines several fundamental types of images; one of these is the indexed image.

Indexed images:

• An indexed image consists of an image matrix and a color map.

• A color map is an m-by-3 matrix of class double containing values in range [0,1].

• Each row of color map specifies the red, green and blue components of a single color.
• In the image matrix, the pixel values are direct indices into the color map; each pixel value therefore points to a corresponding color in the color map.

• The mapping depends on the class of the image matrix:

➢ For single or double arrays, integer values range over [1, p], where p is the length of the color map. The value 1 points to the first row, 2 to the second row, and so on.

➢ For logical, uint8 or uint16 arrays, values range over [0, p-1]. Here 0 points to the first row in the color map, 1 points to the second row, and so on.
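A minimal sketch of how an indexed image is turned into RGB by looking each pixel value up in the colour map; the tiny four-colour map is purely illustrative, and indices follow the uint8 convention starting at 0.

import numpy as np

# A tiny m-by-3 colour map with values in [0, 1]: black, red, green, blue.
colormap = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

# uint8 index matrix: 0 points to the first row of the map, 1 to the second, etc.
index_image = np.array([[0, 1],
                        [2, 3]], dtype=np.uint8)

rgb = colormap[index_image]          # shape (2, 2, 3), true-colour representation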

Image ratios: [width: height]

• It is the ratio of its width to its height.

• It is commonly expressed as two numbers separated by colon, as in 16:9.

• For x:y, the image is x units wide and y units high.

• The aspect ratio of an image is primarily determined by dimensions of the camera’s sensor.

• The two most common aspect ratios are 4:3 known as full screen and 16:9 known as wide screen.

IP SYSTEMS: (2018107015)
Image Processing System (IP system) is a combination of computer hardware and the image processing software
which can analyze digital image data. IP systems can be used to process remotely sensed data on

a) Mainframes (>64-bit CPU) b) Workstation (> 64-bit CPU) c)Personal computers (32 to 64-bit CPU)

The major difference is in the speed at which the computer processes millions of instructions per second (MIPS).
Mainframes are generally more efficient than workstations, which perform better than personal computers. The
MIPS being processed on all types of computers are increasing logarithmically, while the cost of a computer per
MIPS is decreasing.

Image Processing System Hardware characteristics

When working with or selecting a digital image processing system, the following factors should be considered:

• the number of analysts who will have access to the system at one time,
• the mode of operation,
• the central processing unit (CPU),
• the operating system,
• type of compiler,
• the amount and type of mass storage required,
• the spatial and color resolution desired, and
• the image processing applications software.

Central Processing Unit:

The central processing unit (CPU) is the computing part of the computer. It consists of a control unit and an
arithmetic logic unit. CPU directs input and output from and to mass storage devices, color monitors, digitizers,
plotters, etc.

Operating Systems:

Image processing systems make use of either a single-user operating system or a network operating system with multi-user capability.

Random Access Memory

Computers should have sufficient RAM for the operating system, image processing applications software, and any
remote sensor data that must be held in temporary memory while calculations are performed. Computers with 64-
bit CPUs can address more RAM than 32-bit machines

Interactive Graphical User Interface (GUI)

It is ideal if the image processing takes place in an interactive environment where the analyst selects the processes to be performed using a graphical user interface, or GUI (Campbell and Cromp, 1980). Most sophisticated image processing systems are now configured with a point-and-click GUI that allows rapid selection and deselection of the images to be analyzed and the appropriate function to be applied.

Some common and effective digital image processing graphical user interfaces (GUI) include:

• ERDAS Imagine
• ENVI
• ESRI ArcGIS, etc..

MASS STORAGE

Digital image processing systems and related GIS data require substantial mass storage resources. Mass storage media should have:

• rapid access time,
• longevity, and
• low cost.

Compiler

Compiler has the ability to modify existing software or integrate newly developed algorithms with the existing
software. Image processing systems provide a toolkit that programmers can use to compile their own digital image
processing algorithms (e.g., ERDAS, ENVI). The toolkit consists of subroutines that perform very specific tasks such
as reading a line of image data into RAM or modifying a color look-up table to change the color of a pixel (RGB) on
the screen.

The above network depicts a typical networked digital image processing laboratory configuration and peripheral
devices for input and output of remotely sensed data.

HOW DO IP SYSTEMS PROCESS REMOTELY SENSED IMAGE DATA?

Image processing functions available in image processing systems can be categorized into the following categories:

• Preprocessing (Radiometric and Geometric)


• Display and Enhancement
• Information Extraction
• Photogrammetric Information Extraction
• Metadata and Image/Map Lineage Documentation
• Image and Map Cartographic Composition
• Geographic Information Systems (GIS)
• Integrated Image Processing and GIS

Preprocessing functions involve those operations that are normally required prior to the main data analysis and
extraction of information, and are generally grouped as radiometric or geometric corrections.

Image enhancement is solely to improve the appearance of the imagery to assist in visual interpretation and
analysis.

Image transformations are operations similar in concept to those for image enhancement. However, unlike image
enhancement operations which are normally applied only to a single channel of data at a time, image
transformations usually involve combined processing of data from multiple spectral bands. Arithmetic operations
(i.e. subtraction, addition, multiplication, division) are performed to combine and transform the original bands into
"new" images which better display or highlight certain features in the scene.

Image classification and analysis operations are used to digitally identify and classify pixels in the
data. Classification is usually performed on multi-channel data sets (A) and this process assigns each pixel in an
image to a particular class or theme (B) based on statistical characteristics of the pixel brightness values. There are
a variety of approaches taken to perform digital classification. We will briefly describe the two generic approaches
which are used most often, namely supervised and unsupervised classification.

It is not good for remotely sensed data to be analyzed in a vacuum. Remote sensing information fulfills its promise best when used in conjunction with ancillary data (e.g., soils, elevation, and slope) stored in a geographic information system (GIS). The ideal system should be able to process the digital remote sensor data and perform any necessary GIS processing. It is not efficient to exit the digital image processing system, log into a GIS system, perform a required GIS function, and then take the output of the procedure back into the digital image processing system for further analysis. Integrated systems perform both digital image processing and GIS functions and consider map data as image data (or vice versa) and operate on them accordingly.

There are both Commercial firms and Public agencies actively marketing
digital image processing systems with their own limitations!

PHOTOWRITE SYSTEMS: (2018107012)


Photo-write systems are precision opto-electronic equipment that have the capability of writing high-resolution, continuous-tone images from digital data onto B&W and color photographic paper.
They are used to develop photo products (hard-copy data): photoproducts are produced by generating master films on digital film recorders connected to the computer system.
Output quality depends on the film response and the illumination conditions.

Components:

- Grayscale drum scanner imager
- Color drum scanner imager
- Coordinate measuring light table
- Densitometer (measures optical density, i.e. degree of darkness)
- Color film recorder
- Color management software
- Spectrophotometers

(Examples of photo-write systems, left to right: Fotorite 20530 RS, Fotorite 1040C (M), Fotorite 2030.)

Drum Scanners:
• A drum scanner is a special scanner used to scan high resolution pictures into a detailed and sharp image.
They are high-end scanners that are very expensive and are used by professionals.
• The drum scanner works by attaching the original image to a transparent revolving drum or cylinder.
• The film is wet mounted and then inserted into the scanner. The drum spins at a very high speed while light
from scanner illuminates each part of the film pixel by pixel, storing the particular color and gray scale
information as digital data.
• When one revolution is complete, the light source moves one pixel to the side, and images the next row,
continuing this process until the entire picture is imaged.
• After scanning, the grayscale levels corresponding to each constituent color are "reassembled" either as one
large digital file (if the image needs further manipulation) or as individual color-separated films.
• An important benefit of drum scanners is their tonality: a drum scanner is capable of achieving better tonality due to the nature of its scan. It scans in analog using RGB light, collects data using vacuum tubes, capacitors and resistors, and converts it into digital data.

Coordinate Measuring Light Table:


A coordinate measuring machine uses a very sensitive electronic probe to measure a series of discrete points from
the geometry of a solid part.

Densitometer:
A densitometer is a device that measures the degree of darkness (the optical density) of a photographic or semi-
transparent material or of a reflecting surface. The densitometer is basically a light source aimed at a photoelectric
cell. It determines the density of a sample placed between the light source and the photoelectric cell from
differences in the readings.
Types:
• Transmission densitometers measure transparent materials; color transparency film and transparent substrates are common examples of transparent surfaces measured.
• Reflection densitometers that measure light reflected from a surface.

Uses:

• Densitometers are used for measuring color saturation by print professionals


• Calibration of printing equipment
• It serves as one of the Molecular tools for gene study to quantify the radioactivity of a compound such as
radio labelled DNA.
• They are also used for making adjustments so that outputs are consistent with the colors desired in the
finished products.
• They are used in industrial radiography to ensure x-ray films are within code-required density ranges. They
are also used to compare relative material thicknesses.
• Densitometers are used for process control of density, dot gain, dot area and ink trapping.
• Densitometer readings differ across printing processes and substrates.

Color Film Recorder:

• A film recorder is a graphical output device for transferring images to photographic film from a digital
source. In a typical film recorder, an image is passed from a host computer to a mechanism to expose film
through a variety of methods, historically by direct photography of a high-resolution cathode ray tube (CRT)
display. The exposed film can then be developed using conventional developing techniques, and displayed
with a slide or motion picture projector.
• All film recorders typically work in the same manner. The image is fed from a host computer as a raster
stream over a digital interface. A film recorder exposes film through various mechanisms; flying spot (early
recorders); photographing a high-resolution video monitor; electron beam recorder (Sony HDVS); a CRT
scanning dot (Celco); focused beam of light from a light valve technology (LVT) recorder; a scanning laser
beam (Arrilaser); or recently, full-frame LCD array chips.
• For color image recording on a CRT film recorder, the red, green, and blue channels are sequentially
displayed on a single gray scale CRT, and exposed to the same piece of film as a multiple exposure through
a filter of the appropriate color. This approach yields better resolution and color quality than possible with
a tri-phosphor color CRT. The three filters are usually mounted on a motor-driven wheel. The filter wheel,
as well as the camera's shutter, aperture, and film motion mechanism are usually controlled by the
recorder's electronics and/or the driving software. CRT film recorders are further divided into analog and
digital types. The analog film recorder uses the native video signal from the computer, while the digital type
uses a separate display board in the computer to produce a digital signal for a display in the recorder. Digital
CRT recorders provide a higher resolution at a higher cost compared to analog recorders due to the
additional specialized hardware.

Uses:

Film recorders are used in digital printing to generate master negatives for offset and other bulk printing processes.
For preview, archiving, and small-volume reproduction, film recorders have been rendered obsolete by modern
printers that produce photographic-quality hard copies directly on plain paper.

Color Management:

• The primary goal of color management is to obtain a good match across color devices; for example, the
colors of one frame of a video should appear the same on a computer LCD monitor, on a plasma TV screen,
and as a printed poster.
• Color management helps to achieve the same appearance on all of these devices, provided the devices are
capable of delivering the needed color intensities

Spectro Photometers:

A spectrophotometer is an instrument that measures the amount of light absorbed by a sample.

Gamma Correction:

• Gamma describes the relationship between a color value and its brightness on a particular device.
• For images described in an RGB color space to appear visually correct, the display device should generate
an output brightness directly proportional (linearly related) to the input color value.
• Most display devices do not have this property. Gamma correction is a technique used to compensate for
the non-linear display characteristics of a device.
• Gamma correction is achieved by mapping the input values through a correction function, tailored to the
characteristics of the display device, before sending them to the display device.

Take a look at the image below. If we don’t account for gamma, the curve will be exponential (lower green curve).
If we perform gamma correction the actual response will be linear, as it ought to be. For comparison, the image also
shows how the graph looks when we perform gamma correction but the monitor actually has a linear response. In
this case, the intensities will be distorted in the opposite fashion, and we can see that when a non-linear monitor
distorts them in turn, this cancels out, and we end up with a straight line.

This image shows the mapping of colour intensities as sent to the monitor by the graphics card, and intensities
that are displayed by the monitor.
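A minimal sketch of gamma correction applied before values are sent to the display; the display gamma of 2.2 is a typical assumption, not a value from the text.

import numpy as np

def gamma_correct(values, display_gamma=2.2):
    """Pre-distort normalized colour values so that a display whose response is
    output = input**display_gamma ends up producing a linear overall response."""
    v = np.clip(values, 0.0, 1.0)
    return v ** (1.0 / display_gamma)

# A mid-grey of 0.5 is raised to roughly 0.73 before display; the monitor's
# non-linear response then brings it back to about 0.5 perceived intensity.
corrected = gamma_correct(np.array([0.0, 0.25, 0.5, 0.75, 1.0]))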

Image Composition:
Each band of a multi spectral image can be displayed one band at a time as a gray scale image, or as a
combination of three bands at a time as a color composite image. The three primary colors of light are red, green
and blue (RGB). Computer screens can display an image composed of three different bands, by using a different
primary color for each band. When we combine these three images, the result is a color image with each pixel’s
color determined by combination of RGB of different brightness.

Natural or True Color Composites:


A natural or true color composite is an image displaying a combination of the visible red, green and blue bands assigned to the corresponding red, green and blue channels on the computer display. The resulting composite resembles what would be observed naturally by the human eye: vegetation appears green, water darkish blue to black, and bare ground and impervious surfaces appear light gray and brown. Many people prefer true color composites, as colors appear natural to our eyes, but often subtle differences in features are difficult to recognize. Natural color images can be low in contrast and somewhat hazy due to the scattering of blue light by the atmosphere.

False Color Composites (FCC):

False color images are a representation of a multispectral image produced using any bands other than visible red, green and blue as the red, green and blue components of the display. False color composites allow us to visualize wavelengths that the human eye cannot see (i.e. near infrared and beyond). Using bands such as near infrared highlights the spectral differences and often increases the interpretability of the data. There are many different false color composites that can be used to highlight different features, as in the sketch below.
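A minimal sketch of building such a composite by stacking bands into display channels; the band arrays and the simple percent-clip stretch are illustrative assumptions.

import numpy as np

def stretch(band, low=2, high=98):
    """Simple percent-clip contrast stretch to [0, 1] for display."""
    lo, hi = np.percentile(band, [low, high])
    return np.clip((band - lo) / (hi - lo + 1e-10), 0.0, 1.0)

# Placeholder co-registered bands.
green, red, nir = (np.random.rand(256, 256) for _ in range(3))

# Standard false colour composite: NIR -> red channel, red -> green, green -> blue.
fcc = np.dstack([stretch(nir), stretch(red), stretch(green)])   # shape (256, 256, 3)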

• Image data
• Image manipulation
• Non-image data handling
• Color assurance
• Geometric accuracy
• Scale
• Printers
• Plotters – raster, vector
• Photo printers
• Photo write systems
• Images on photo films

PIXEL CONNECTIVITY: (2018107027)
In the field of image processing, pixel connectivity is the way in which pixels in 2-dimensional images relate to
their neighbours. The notation of pixel connectivity describes a relation between two or more pixels. For two pixels
to be connected they have to fulfil certain conditions on the pixel brightness and spatial adjacency. Two pixels are
connected if they are adjacent.

A set of pixels in an image which are all connected to each other is called a connected component. First, in order for two pixels to be considered connected, their pixel values must both be from the same set of values V. For a grayscale image, V might be any range of gray levels, e.g. V = {22, 23, ..., 40}; for a binary image we simply have V = {1}. If the possible intensity values are 0–255, V can be any subset of these 256 values.

In two dimensions the types of connectivity are 4-connectivity, 6-connectivity and 8-connectivity. In 3-D, there are connectivities such as 6-connectivity, 18-connectivity and 26-connectivity.

TYPES OF CONNECTIVITY:

4-connectivity: 4-connected pixels are neighbours to every pixel that touches one of their edges. These pixels are
connected horizontally and vertically.

6-connectivity: 6-connected pixels are neighbours to every pixel that touches one of their corners (which includes
pixels that touch one of their edges) in a hexagonal grid or stretched bond grid. There are several ways to map
hexagonal tiles to integer pixel coordinates. With one method, in addition to the 4-connected pixels, the two pixels
at coordinates (x+1, y+1) and (x-1, y-1) are connected to the pixel at (x, y).

8-connectivity: 8-connected pixels are neighbours to every pixel that touches one of their edges or corners. These
pixels are connected horizontally, vertically, and diagonally.

An example of a binary image with two connected components based on 4-connectivity can be seen in Figure 1. If the connectivity were based on 8-neighbours, the two connected components would merge into one.

Figure 1: Two connected components based on 4-connectivity.
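A minimal sketch of extracting connected components of a binary image under either definition, using scipy.ndimage.label with a 4- or 8-connected structuring element; the small binary array is an illustrative example, not the figure from the text.

import numpy as np
from scipy import ndimage

binary = np.array([[1, 0, 0, 1],
                   [0, 1, 0, 1],
                   [0, 0, 0, 1],
                   [1, 1, 0, 0]], dtype=np.uint8)

four_conn = ndimage.generate_binary_structure(2, 1)    # edge-touching neighbours only
eight_conn = ndimage.generate_binary_structure(2, 2)   # edges and corners

labels4, n4 = ndimage.label(binary, structure=four_conn)
labels8, n8 = ndimage.label(binary, structure=eight_conn)
# Here n4 == 4 while n8 == 3: the two diagonally touching pixels in the top-left
# corner merge into one component only under 8-connectivity.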

PIXEL – PATH AND PATH LENGTHS: (2018107028)
• In digital imaging, a pixel, pel, or picture element is a physical point in a raster image, or the smallest
addressable element in an all points addressable display device; so it is the smallest controllable element of a
picture represented on the screen. The intensity of each pixel is variable.
• In a 1-bit (binary) image, a pixel can have only two possible values, black or white. An image with eight bits per pixel can display 256 different brightness levels, which is generally adequate for human viewing. When an image is in digital form, it is effectively blurred by the size of the pixel.

Characteristics: • Dimension, • Bit depth • Color model.

Basic Properties of pixels: • Neighbourhood • Adjacency • Connectivity • Path

Digital Path

• A path (curve) from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels (x0, y0), (x1, y1), …, (xn, yn), where (x0, y0) = (x, y), (xn, yn) = (s, t), and (xi, yi) is adjacent to (xi-1, yi-1) for 1 ≤ i ≤ n; in this case n is the length of the path.

• If (xo, yo) = (xn, yn) the path is a closed path

• 4-, 8-, m-paths can be defined depending on the type of adjacency specified.

• If p, q ∈ S, then q is connected to p in S if there is a path from p to q consisting entirely of pixels in S.

DISTANCE MEASURES

Given pixels p, q and z with coordinates (x, y), (s, t), (u, v) respectively, the distance function D has following
properties:

a. D (p, q) ≥ 0 [D (p, q) = 0, if p = q]

b. D (p, q) = D (q, p)

c. D (p, z) ≤ D (p, q) + D (q, z)

Distance measures and connectivity

• The D4 distance between two points p and q is the shortest 4-path between the two points

• The D8 distance between two points p and q is the shortest 8-path between the two points.

• For m-connectivity, the value of the distance (the length of the path) between two points depends on the values
of the pixels along the path.

The following are the different Distance measures:

City Block Distance:

D4 (p, q) = |x-s| + |y-t|

- The pixels having a D4 distance from (x, y) less than or equal to some value r form a diamond centered at (x, y).

Chess Board Distance:

D8 (p, q) = max (|x-s|, |y-t|)

- The pixels with D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y).

Euclidean Distance:

De (p, q) = [(x-s)² + (y-t)²]^1/2 - The pixels having a distance less than or equal to some value r from (x, y) are the points contained in a disk of radius r centered at (x, y).
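A minimal sketch of the three distance measures between pixels p = (x, y) and q = (s, t); the function names are illustrative.

def city_block(p, q):      # D4 distance
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def chessboard(p, q):      # D8 distance
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def euclidean(p, q):       # De distance
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

p, q = (2, 3), (5, 7)
print(city_block(p, q), chessboard(p, q), euclidean(p, q))   # 7, 4, 5.0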

Conclusions

The relationships between the elements of a picture are used in the following:

1. Processing operations applied based on digital image elements.

2. Measure distances between elements of the picture.

3. Retrieving medical images based on the content.

4. Computer Aided diagnosis in the field of medical imaging.

PIXEL NEIGHBOURHOOD: (2018107025)


PIXEL

• After sampling we get a number of analog samples, and each sample has an intensity value which is quantized as the final step of digitization.
• The values are quantized to discrete levels.
• 8 bits per pixel are used for a black and white (grayscale) image.
• 24 bits per pixel are used for a colour image.
• A matrix element is called a pixel.
• For 8 bits, a pixel can have a value between 0 and 255.

NEIGHBOURHOOD

• The neighbourhood of a pixel is the set of pixels that touch it. Simple as that.
• Thus, the neighbourhood of a pixel can have a maximum of 8 pixels (images are always considered 2D).
• The neighbourhood of a pixel is required for operations such as morphology, edge detection, median filter,
etc.
• Many computer vision algorithms allow the programmer to choose an arbitrary neighbourhood.

NEIGHBOURS OF A PIXEL

(x-1, y-1) (x, y-1) (x+1, y-1)

(x-1, y) P (x, y) (x+1, y)

(x-1, y+1) (x, y+1) (x+1, y+1)

The surrounding pixels form the neighbourhood of the pixel 'p'. Neighbourhoods of a more specific nature exist for various applications.

Here's a list of them:

TYPES OF NEIGHBOURS

N4 - 4-neighbours ND - diagonal neighbours N8 - 8-neighbours (N4 U ND)

a) 4-NEIGHBORHOOD

• The neighbourhood consisting of only the pixels directly touching: the pixel above, below, to the left and to the right of a particular pixel.
• Any pixel p(x, y) has two vertical and two horizontal neighbours, given by (x+1, y), (x-1, y), (x, y+1), (x, y-1).
• This set of pixels is called the 4-neighbors of P and is denoted by N4(P).
• Each of them is at a unit distance from P.

(x, y-1)
(x-1, y) P (x, y) (x+1, y)
(x, y+1)

b) d- NEIGHBORHOOD

• This neighbourhood consists of the pixels that touch the given pixel only at its corners, i.e. the diagonal pixels.
• The four diagonal neighbours of p(x, y) are given by (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1).
• This set is denoted by ND(P).
• Each of them is at Euclidean distance of 1.414 from P.

c) 8- NEIGHBORHOOD

• This is the union of the 4-neighbourhood and the d-neighbourhood.


• It is the maximum possible neighbourhood that a pixel can have.
• 8-neighbors of a pixel are its 4 vertical horizontal and 4 diagonal
neighbours denoted by N8(p).
• The points ND(P) and N4(P) are together known as 8-neighbors of the point
P, denoted by N8(P).
• Some of the points in N4(P), ND(P) and N8(P) may fall outside the image when P lies on the border of the image.

N4 – 4 neighbours
ND – diagonal neighbours
N8 – 8 neighbours (N4 U ND)

This give rise to 2 types of connectivity namely, 4-connectivity and 8-connectivity

PRINCIPAL COMPONENT ANALYSIS: (2018107024)


Overview

Principal component analysis, or PCA, is a statistical procedure that allows you to summarize the information content in large data tables by means of a smaller set of "summary indices" that can be more easily visualized and analyzed. The underlying data can be measurements describing properties of production samples, chemical compounds or reactions, process time points of a continuous process, batches from a batch process, biological individuals or trials of a DOE protocol, for example.

History

PCA was invented in 1901 by Karl Pearson, as an analogue of


the principal axis theorem in mechanics; it was later
independently developed and named by Harold Hotelling in the
1930s. Depending on the field of application, it is also named the discrete Karhunen–Loève transform (KLT) in signal
processing, the Hotelling transform in multivariate quality control, proper orthogonal decomposition (POD) in
mechanical engineering, singular value decomposition (SVD) of X (Golub and Van Loan, 1983), eigenvalue
decomposition (EVD) of XTX in linear algebra, factor analysis (for a discussion of the differences between PCA and
factor analysis see Ch. 7 of Jolliffe's Principal Component Analysis), Eckart–Young theorem (Harman, 1960), or
empirical orthogonal functions (EOF) in meteorological science, empirical eigenfunction decomposition (Sirovich,
1987), empirical component analysis (Lorenz, 1956), quasiharmonic modes (Brooks et al., 1988), spectral
decomposition in noise and vibration, and empirical modal analysis in structural dynamics.

Definition

PCA is mathematically defined as an orthogonal linear transformation that transforms the data to a new
coordinate system such that the greatest variance by some projection of the data comes to lie on the first coordinate
(called the first principal component), the second greatest variance on the second coordinate, and so on.

PCA is a statistical approach to find the principal features of a distributed dataset based on the total variance. Given a set of multivariate data in an X-Y coordinate system, PCA first finds the directions of maximum variation of the original dataset. The data points are then projected onto new axes, called the i-j coordinate system; the directions of the i and j axes are known as the principal components.

Importance of PCA

PCA is used in exploratory data analysis and for making predictive models. It is commonly used for dimensionality
reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional
data while preserving as much of the data's variation as possible. The first principal component can equivalently be
defined as a direction that maximizes the variance of the projected data. The 𝑖𝑡ℎ principal component can be taken
as a direction orthogonal to the first i-1 principal components that maximizes the variance of the projected data.
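A minimal sketch of PCA applied to a multiband image, treating each pixel as a sample vector of band values; the random band data and the choice of three retained components are illustrative assumptions.

import numpy as np

def pca(samples, n_components):
    """PCA via eigendecomposition of the covariance matrix.
    `samples` has shape (n_samples, n_features); returns the projected data
    and the principal component directions (eigenvectors)."""
    centered = samples - samples.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]                  # directions of maximum variance
    return centered @ components, components

# Placeholder 6-band image reshaped so that every pixel is one 6-dimensional sample.
bands, rows, cols = 6, 100, 100
image = np.random.rand(rows, cols, bands)
samples = image.reshape(-1, bands)

scores, components = pca(samples, n_components=3)
pc_image = scores.reshape(rows, cols, 3)            # first three principal components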

A high-resolution image is referred to as a high-dimensional data space, as the image data is organized into two-dimensional pixel values in which each pixel consists of its respective RGB bit values. This representation of image data poses a challenge for sharing image files over the Internet: the lengthy image uploading and downloading time has always been a major issue for Internet users. Apart from the data transmission problem, high-resolution images consume greater storage space. Principal Component Analysis (PCA) is a mathematical technique to reduce the dimensionality of data. It works on the principle of factoring matrices to extract the principal pattern of a linear system. Using eigenfaces for recognition of faces is a quintessential technique in computer vision; Sirovich and Kirby (1987) showed that PCA could be used on a collection of face images to form a set of basis features.

A technique known as spike-triggered covariance analysis uses a variant of Principal Components Analysis in
Neuroscience to identify the specific properties of a stimulus that increase a neuron's probability of generating an
action potential.

When and Where to use:

It is used when we need to tackle the curse of dimensionality in data with linear relationships, i.e. where having too many dimensions (features) in the data causes noise and difficulties (the data can be sound, pictures or context). This gets specifically worse when features have different scales (e.g. weight, length, area, speed, power, temperature, volume, time, cell number, etc.).

We do this by reducing the dimension i.e. the features. But when should we reduce or change dimensions?

1 - Better perspective and less complexity: When we need a more realistic perspective, when we have many features in a given data set, and specifically when we have the intuition that we do not need that many features.

Similarly, in many other practices modelling is easier in 2D than 3D, right?

2 - Better visualization: When we cannot get a good visualization due to a high number of dimensions, we use PCA to reduce the data to a shadow of 2D or 3D features (or more, but few enough for better parallel coordinates or an Andrews curve; e.g. when you transform 100 features into 10 features you still cannot depict them in 2D or 3D, but you can get a much better Andrews curve).

3- Reduce size: When we have too much data and we are going to use process-intensive algorithms (like many
supervised algorithms) on the data so we need to get rid of redundancy.

Sometimes change of perspective matters more than D reduction and we want to exploit dimensionality:

4 - Different perspective: Maybe you don't have any of these motivations but you merely need to improve your knowledge of your data. PCA can give you the best linearly independent and different combinations of features, which you can use to describe your data differently. It has extensive application wherever extensive data is found, e.g. in media editing, statistical quality control, portfolio analysis, etc., as far as linear relationships are concerned. So if you are dealing with data that reflects some sort of chaos, disruption and disorder, maybe PCA is not the best choice.

Applications:

Some of the applications of Principal Component Analysis (PCA) are:

• Spike-triggered covariance analysis in Neuroscience


• Quantitative Finance
• Image Compression
• Facial Recognition
• Other applications like Medical Data correlation

DATA PRODUCTS
- DIFFERENT LEVELS: (2018107005)
Data products are of two types: - Standard - Special

Standard products are generated after applying radiometric and geometric corrections. Special products are
generated after further processing the Standard products by mosaicing / merging / extracting and enhancement of
the data.

The raw data recorded at the earth station is corrected to various levels of processing at the Data Processing System
(DPS). They are:

Level 0: uncorrected (Raw data)

Level 1: Radiometrically corrected and Geometrically corrected only for earth rotation (Browse product)

Level 2: Both Radiometrically and Geometrically corrected (Standard product)

Level 3: Special processing like merging, enhancement etc., after Level 2 corrections (Special product)

Level 2 and Level 3 products will be supplied to users.

STANDARD PRODUCTS

The various kinds of standard products that will be supplied are as follows:

i. Path/Row products

ii. Shift Along Track products

iii. Quadrant products

iv. Stereo products

v. Geocoded products.

1. Path/Row Based Products

These products will be generated based on the referencing scheme of each sensor. The user has to specify the
following:

i. Path/Row number as per the Referencing Scheme

ii. Sensor Identification

iii. Subscene Identification (for PAN)

iv. Date of Pass

v. Band number for B/W and Digital products, Band combination for FCC products

vi. Product Code

2. Shift Along Track Products

If a user's area of interest is less than the dimensions of a full scene and falls in two successive rows of the same
path, then the data will be supplied by sliding the scene in the forward (along the path) direction. These are called
Shift Along Track (SAT) products. This way the required area can be accommodated in a single product. In the case
of SAT products, the percentage of shift has to be specified in addition to the inputs specified by the user for
Path/Row based products. The percentage of shift along the path has to be specified between 10% and 90%, in
multiples of 10%.

3. Quadrant Products

Each scene is divided into four nominal and twelve derived quadrants. Quadrant numbers 1, 2, 3 and 4 are nominal
quadrants. The remaining eight quadrants are obtained after sliding quadrants 1, 2, 3 and 4 by 25%, along and across
the scene, within the path, in the forward direction. LISS-III quadrant products are generated on 1:125,000 scale.

Quadrant products will be supplied from the LISS-III sensor at the visible band resolution, for visible and near infra-
red bands only. Quadrant products will not be available at the SWIR band resolution. While placing a request for these
products, the users need to specify the quadrant number, in addition to the details specified in the case of Path/Row
based products.

4. Stereo Products

The oblique viewing capability of the PAN sensor can be used to acquire stereopairs. A stereopair comprises two
images of the same area acquired on different dates and at different angles.

One of the parameters by which the quality of a stereopair can be judged is the Base/Height (B/H) ratio. The B/H ratio
is the ratio of the distance between the two satellite passes (the base) to the satellite altitude.
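As a purely illustrative worked example (the numbers are assumed, not taken from any actual mission): if the two passes are separated by a ground distance of 500 km and the satellite altitude is 800 km, then B/H = 500 / 800 ≈ 0.63.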

Stereo products will be available from the PAN sensor only. The input required in addition to Path/ Row details is
B/H ratio. Two scenes selected on two different dates satisfying the user’s B/H ratio will be supplied as a stereo pair.

Stereo products will be supplied with two levels of processing:

i. Only Radiometrically corrected

ii. Radiometrically corrected and Geometrically corrected for Across Track correction.
5. Geocoded Products

Geocoding corrects the imagery to a source-independent format whereby multidate and multi-satellite data can be
handled with ease. Geocoded products are generated after applying radiometric and geometric corrections,
orienting the image to true north and generating the products with an output resolution appropriate to the map
scale. The advantage of a geocoded product is that it can be overlaid on a Survey of India (SOI) toposheet.

SPECIAL PRODUCTS: (2018107006)


Special products are generated after further processing standard products (Radiometrically and Geometrically
corrected) by extracting a specific area, mosaicing, merging and enhancing the data. The various types of special
products that will be supplied are as follows:

1) LISS - III district Geo-coded products

2) PAN 5’x5’ Geo coded products

3) PAN Full Scene (Path/ Row based and SAT)

4) PAN Quadrant products (PAN I or PAN Q)

5) Orthoimage

6) PAN + LISS - III Merged products

7) WiFS Zonal products

8) WiFS - VIM Zonal products

9) WiFS - VIM Full India products

A) LISS - III District Geo-coded Products

This product will be generated by mosaicing the standard corrected LISS-III scenes covering the district. The mosaic
will then be rotated to true north.

The inputs to be specified by the user are as follows :

● State/ Union Territory’s name and district name as prevalent in the year 1991.

● All inputs as specified for Path-Row based products.

The criteria used while selecting the scenes for preparing the mosaic are the same as in the case of LISS-III based
geocoded products. Depending on their area, the districts in India have been classified into four categories:

CLASS      Area (km x km)

CLASS A    45 x 45

CLASS B    90 x 90

CLASS C    180 x 180

CLASS D    400 x 400

Geocoded products of districts falling in categories A, B and C will be supplied on 1:250,000 scale, while those of
category D will be supplied on 1:500,000 scale. The physical size of the photographic products will be 480 mm for the
A and B categories and 960 mm for the C and D categories.

B) PAN 5’x5’ Geocoded Products

An area corresponding to 5’x5’ within a path will be extracted around a user-specified point and aligned to true
north after applying standard corrections. The inputs to be specified by the user are the latitude/longitude of the point
around which the 5’x5’ data is required, in addition to the details as in the case of Path/Row based products. The
main advantage of this product over the geocoded product is that it will be on a larger scale and can be overlaid
on a 1:12,500 scale map.

C) PAN Full Scene (Path/Row) Products

PAN Full scene products will be generated by mosaicing the data collected by the three arrays. The correction level
of these products is the same as that of standard products. The inputs to be specified are path, row and A,B,C or D.
The master will be a 960mm film and will be written on a Large Format Photowrite system on 1:125,000 scale. The
final product will be a 960mm paper print on the same scale.

D) PAN Full Scene (SAT) products

These products are PAN full scene products but shifted along the track by a user-specified percentage. The percentage
of shift varies between 10% and 90% in the forward direction. The inputs to be specified by the user are the same
as those of PAN Full Scene products. The scale of the 960mm master film is 1:125,000 and the final products
will be supplied as 960mm paper prints.

E) PAN Quadrant Products

The PAN full scene is divided into four quadrants, as shown in the figure. Each quadrant corresponds to one and a
half arrays of data. Here again, the scale of the 960mm master film will be 1:125,000. The products will also be
supplied as 960mm paper prints on 1:125,000 scale.

F) PAN + LISS III Merged Products

In order to exploit the dual advantage of the spectral resolution of LISS-III and the spatial resolution of PAN, it is
planned to supply PAN + LISS-III merged products at PAN resolution. The inputs to be specified by the user are as
follows:

- Path and Row number as per referencing scheme
- Subscene ID for PAN
- Date of Pass for PAN and LISS-III
- Product code

The criteria that will be considered while selecting the PAN and LISS-III scenes are :

1. PAN tilt is near nadir and the scene fits into a LISS-III scene.

2. The dates of pass are not separated by more than a few days.

These products will be supplied on 1:25,000 scale as color photographic products. Black and white and digital
products are not supplied.

G) WiFS - Zonal Products

These products will be generated zone-wise (refer to figure 4.3.5 and Table A). India is divided into ten zones and each
zone covers at least one state completely. The input to be specified by the user, in addition to Path/Row details, is
the zone number. Based upon this input, the WiFS scenes covering the zone will be mosaiced and final products
on 1:2 million scale will be generated on a 960mm paper print. These products are supplied as black and white
photographic products and digital products. The criteria considered while mosaicing the WiFS scenes covering the zone are as
follows:

● Same cycle

● Adjacent cycle

● Same season of the previous year

Table A WiFS ZONES

Zone No. States Covered

1. Jammu & Kashmir, Punjab, Himachal Pradesh, Haryana, Delhi, Parts of Uttar Pradesh, Rajasthan.

2. Rajasthan, Gujarat & Haryana, Parts of Madhya Pradesh and Maharashtra.

3. Uttar Pradesh, Parts of Bihar and Maharashtra.

4. Assam, Arunachal Pradesh, Meghalaya, Manipur, Mizoram, Nagaland, Sikkim, Tripura.

5. Madhya Pradesh, Parts of Maharashtra, Uttar pradesh, Andhra Pradesh & Orissa.

6. Orissa, Bihar, West Bengal, Sikkim, Parts of Madhya Pradesh, Uttar Pradesh.

7. Karnataka, Tamil Nadu, Goa, Kerala, Lakshadweep and Parts of Andhra Pradesh.

8. Maharashtra, Parts of Karnataka, Andhra Pradesh, Tamil Nadu & Madhya Pradesh.

9. Andhra Pradesh, Parts of Karnataka, Madhya Pradesh, Maharashtra, Tamil Nadu & Orissa.

10. Andaman and Nicobar Islands.

H) WiFS - VIM Full India Products

The WiFS Vegetation Index Map (VIM) for full India will be generated by mosaicing the WiFS scenes covering the
entire country within an interval of 10 days using the WiFS sensor. The vegetation index is calculated from the IR and
visible band data as

NDVI = (B2 − B1) / (B2 + B1)

where B1 = visible band, B2 = IR band.

The NDVI thus calculated takes real values in the range (−1, 1); a post-normalization is incorporated in
this formula, which results in an output range of (0, 255). The final product is colour coded into 12 classes for
interpretation and will be on 1:6 million scale on a 960mm paper print. The inputs to be specified by the user are the
specific date for which the VIM full India product is required and the other inputs as mentioned in the case of
standard Path/Row products.
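A minimal sketch of this calculation and the post-normalization to the (0, 255) range (the band arrays below are invented digital numbers, and the exact rescaling and colour-coding convention of the operational product may differ):

    import numpy as np

    def ndvi_to_byte(b1_visible, b2_ir):
        b1 = b1_visible.astype(np.float32)
        b2 = b2_ir.astype(np.float32)
        denom = b2 + b1
        # NDVI = (B2 - B1) / (B2 + B1), guarding against division by zero.
        ndvi = np.where(denom != 0, (b2 - b1) / denom, 0.0)
        # Post-normalisation: map the real-valued range (-1, 1) to (0, 255).
        return np.round((ndvi + 1.0) * 127.5).astype(np.uint8)

    visible = np.array([[50, 60], [40, 80]])   # hypothetical visible-band DNs
    ir = np.array([[150, 90], [200, 85]])      # hypothetical IR-band DNs
    print(ndvi_to_byte(visible, ir))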

I) WiFS - VIM Zonal Products

As in the case of WiFS zonal products, the user has to specify the zone number and the other inputs, similar to
Path/Row based products. The scenes covering the zone are mosaiced and VIM products will be generated for them.
The criteria to be considered while selecting the scenes are as follows:

● Same cycle

● Adjacent cycles

● Same season of the previous years

The final product will be supplied on 1:2 million scale on 960mm paper prints.

J) Orthoimage

It is planned to introduce PAN orthoimages a few months after the IRS-1C launch. This is a new product and its
generation will be in an experimental phase before reaching operational status.

DEM, DSM, DTM: (2018107056)


DIGITAL ELEVATION MODEL (DEM)

A Digital Elevation Model (DEM) is a raster object that contains elevation values for a site. The term DEM is frequently
used to refer to any digital representation of a topographic surface. A DEM contains ground elevation data only, i.e. Z
as a function of X and Y.

DIGITAL TERRAIN MODEL:

A digital terrain model (DTM) is a digital representation of the elevation of a planetary surface. It contains ground
elevation and terrain type information, i.e. it stores Z (as a function of X and Y) together with type information (such as water, forest).

DIGITAL SURFACE MODEL:

A Digital Surface Model (DSM) represents the earth's surface and includes all objects on it. It stores Z as a function of
X and Y, usually for the topmost surface.

DEM, DSM, DTM:

Digital elevation model (DEM): Generic term covering digital topographic data in all its various forms, as well as the
method for interpreting implicitly the elevations between observations (Maune et al., 2001).

Digital terrain model (DTM): DTM is a synonym of bare-earth DEM (Maune et al., 2001). Florinsky (1998) defined
DTMs as digital representations of variables relating to a topographic surface, namely: digital elevation models
(DEMs), digital models of gradient (G), aspect (A), horizontal (Kh) and vertical (Kv) land surface curvatures, as well as
other topographic attributes.

Digital surface model (DSM): Model depicting elevations of the top of reflective surfaces, such as buildings and
vegetation (Maune et al., 2001).

In the accompanying figure, the green line indicates the DSM and the blue line indicates the DTM.
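To make the relationship between the three models concrete, here is a small sketch with invented elevation grids: subtracting the bare-earth DTM from the DSM leaves approximately the heights of above-ground objects (a derived layer often called a normalised DSM).

    import numpy as np

    # Hypothetical 3 x 3 elevation grids in metres (values are invented).
    dsm = np.array([[102.0, 108.5, 103.0],    # top surface incl. buildings/trees
                    [101.5, 112.0, 102.5],
                    [101.0, 101.5, 102.0]])
    dtm = np.array([[101.5, 101.8, 102.0],    # bare-earth terrain
                    [101.2, 101.5, 101.8],
                    [101.0, 101.3, 101.9]])

    # Near zero over open ground, large over objects such as buildings or canopy.
    object_heights = dsm - dtm
    print(object_heights.round(1))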

Methods for Obtaining Elevation Data:

· Real Time Kinematic GPS
· Theodolite or total station
· Focus variation
· Stereo photogrammetry
· LIDAR
· Doppler radar
· Inertial surveys

ELEVATION DATA CAPTURE

1. PHOTOGRAMMETRIC MAPPING
2. LIDAR
3. Ground Survey
4. Digitize from existing map data

Grid, TIN, Contours - Elevation data may be stored differently:

Grid

- Regular raster grid

- Irregular grid (denser where needed)

- With or without additional break line constraints

- Easy structure, easy to interpolate, fast (see the bilinear interpolation sketch after this list).

TIN (Triangulated Irregular Network)

- Point storage, but not in regular raster

- With or without break line constraints

- Approximates the surface better with fewer points, but more difficult to store and interpolate.

Contour lines

- Vector data with isolines (each line has a Z-value)

- Difficult to interpolate. Sometimes masked by buildings.
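Returning to the regular grid described above, the following minimal sketch shows why a regular raster grid is easy to interpolate, using bilinear interpolation at a fractional pixel position (the DEM values and coordinates are invented for illustration):

    import numpy as np

    def bilinear(dem, x, y):
        # dem: 2-D array of elevations; x is the column coordinate, y the row coordinate.
        x0, y0 = int(np.floor(x)), int(np.floor(y))
        x1 = min(x0 + 1, dem.shape[1] - 1)
        y1 = min(y0 + 1, dem.shape[0] - 1)
        fx, fy = x - x0, y - y0
        top = (1 - fx) * dem[y0, x0] + fx * dem[y0, x1]      # blend along the upper row
        bottom = (1 - fx) * dem[y1, x0] + fx * dem[y1, x1]   # blend along the lower row
        return (1 - fy) * top + fy * bottom                  # blend between the two rows

    dem = np.array([[100.0, 102.0],
                    [104.0, 110.0]])
    print(bilinear(dem, 0.5, 0.5))   # 104.0, the mean of the four neighbouring cells

Interpolating a TIN, by contrast, first requires locating the triangle that encloses the query point, which is why TIN interpolation is more involved.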

IMAGE PROCESSING HARDWARE AND SOFTWARE: (2018107018)
HARDWARE

Computer hardware characteristics that are of value when conducting digital image processing include: type of
computer, central processing unit (CPU), system random-access memory (RAM) and read-only memory (ROM), mass storage
and data archive considerations, video display spatial and spectral resolution, input and output devices, etc. The
hardware associated with typical digital image processing laboratories is discussed below.

SOFTWARE

High-quality digital image processing software is critical for successful digital image processing. The software should
be easy to use and functional. The most important digital image processing functions are introduced below. Many of the
most commonly used digital image processing systems are then reviewed, including their functional strengths. Image
processing system cost constraints are also introduced.

HARDWARE CLASSIFICATIONS

Central processing unit

The central processing unit (CPU) is the computing part of the computer. It consists of a control unit and an
arithmetic logic unit. The CPU:

• performs numerical integer and/or floating point calculations, and

• directs input and output from and to mass storage devices, color monitors, digitizers, plotters, etc.

The CPU’s efficiency can be measured in terms of

• The number of cycles it can process in one second, e.g., 3.7 GHz means the CPU performs approximately 3.7
billion cycles per second,

• How many millions of instructions it can process per second (MIPS), e.g., 500 MIPS,

• The number of transistors used by the CPU.

Read-Only Memory (ROM) and Random-Access Memory

ROM retains information even after the computer is turned off because power is supplied by a battery that
occasionally must be replaced. When a computer is turned on, the computer examines the information stored in
the various ROM registers and uses this information to proceed. Most personal computers have sufficient ROM to
perform quality digital image processing.

RAM is the computer’s primary temporary workspace. Unlike ROM, the data stored in RAM are lost when the
computer is turned off. Computers should have sufficient RAM for the operating system, image processing software,
and any spatial data that must be held in temporary memory while calculations are performed. Because of this, the
amount of RAM is one of the most important considerations when purchasing a computer for digital image
processing.

TYPES OF COMPUTER

Personal Computers

Personal computers (PCs) are the workhorses of the digital image processing industry (Figure 3-2). These include
relatively inexpensive computers such as desktops, laptops, and tablets. Most of the personal computers now come
with multiple CPUs. A multi-core processor is a single computing component with two or more independent central
processing units (called cores), that read and execute program instructions. CPUs were originally developed with
only one core. A dual-core processor has two cores (e.g., Intel Core Duo), a quad-core processor contains four cores

(e.g., Intel quad-core i7), a hexa-core processor contains six cores (e.g., Intel Core i7 Extreme Edition), an octa-core
processor contains eight cores, and so on.

Computer Workstations

Computer workstations usually contain more powerful processors, more RAM, larger hard disk drives, and
very high-quality graphics display capability. These improved components allow workstations to perform digital
image processing analysis more rapidly than a personal computer. However, the cost of a workstation is usually two
to three times more than the cost of a personal computer.

Mainframe Computers

Mainframe computers perform calculations more rapidly than personal computers and workstations, and
they are able to support hundreds of users simultaneously. They may contain hundreds of CPUs in which case they
are usually referred to as super computers. Mainframe computers are usually expensive to purchase and maintain.
Also, digital image processing software for mainframe computers is more expensive.

Input Devices

Aerial photographs are typically 9 × 9 in. in size. Therefore, it is important to have a scanner with an effective
area of at least 12 × 16 in. so that an entire 9 × 9 in. aerial photograph can be scanned in one pass.

Output Devices

A GIS should be able to output high-quality maps, images, charts, and diagrams in both small and large formats.
To accomplish this, small- and large-format printers are required. Inexpensive ink-jet or color laser printers can be
used for small-format printing, and E-sized ink-jet plotters can be used for large formats.

Digital Image Processing Software

Multispectral Digital Image Processing Software

ERDAS Imagine, ENVI, PCI Geomatica, TNTmips, and IDRISI are heavily adopted digital image processing systems used
for the analysis of aircraft and satellite multispectral remote sensor data. All have radiometric and geometric
preprocessing algorithms, a variety of image enhancement and analysis routines, and useful change detection modules.
Some can process RADAR imagery.

Hyperspectral Digital Image Processing Software

ENVI and VIPER are rigorous hyperspectral image analysis programs. Both have sophisticated radiometric
(atmospheric) correction capabilities and a diversity of algorithms to extract information from hyperspectral data.

Geographic Object-Based Image Analysis (GEOBIA)

eCognition, Feature Analyst, IDRISI, and ENVI have excellent geographic object-based image analysis (GEOBIA)
programs.

LiDAR Digital Image Processing Software

SOCET GXP software has extensive enterprise level LiDAR 3D processing capability used primarily by intelligence
gathering agencies.

RADAR Digital Image Processing Software

Only a few of the commercially-available digital image processing programs can process RADAR (single polarization)
or polarimetric RADAR (multiple polarization) data. ERDAS Imagine, ENVI, PCI Geomatica and IDRISI have RADAR
processing modules.

Open-Source Digital Image Processing Software

If software cost is a concern, open-source digital image processing software, such as GRASS or MultiSpec, may be
the best solution.

OPEN SOURCE PRODUCTS: (2018107007)

Geomatica (PCI Geomatics): A remote sensing and photogrammetry desktop software package for processing earth
observation data.

SAGA GIS: SAGA (System for Automated Geoscientific Analyses) is rich in library grid, imagery and terrain processing
modules and has basic supervised classification. Standard modules: filter for grids, gridding, grid calculator,
geostatistics, grid discretization, grid tools, image classification, vector tools, terrain analysis. Key features:
intuitive GUI for data management, visualization and analysis; framework-independent function development;
object-oriented system design; geo-referencing and projections.

TNTmips (MicroImages, USA): TNTmips is a geospatial analysis system providing a fully featured GIS, RDBMS and
automated image processing system with CAD, TIN, surface modeling, map layout and innovative data publishing tools.

ERDAS IMAGINE: Provides true value, consolidating remote sensing, photogrammetry, LiDAR analysis, basic vector
analysis, and radar processing into a single product.

ENVI: ENVI ("Environment for Visualizing Images") is a software program used to visualize, process, and analyze
geospatial imagery. It supports all sensors, more than 200 different types of data, and a myriad of data modalities.

Google Earth: A computer program, formerly known as Keyhole Earth Viewer, that maps the Earth by superimposing
satellite images, aerial photography, and GIS data onto a 3D globe, allowing users to see cities and landscapes from
various angles.

GRASS GIS: A long-established (about 30 years old) GIS software with several tools for image processing. Vector and
raster data are organized in locations and mapsets. It provides a supervised classification module with the Maximum
Likelihood algorithm. Key features: image processing, raster analysis, vector analysis, geocoding.

OpenEV: An open-source geospatial toolkit and a frontend to that toolkit. It was developed using Python and uses the
GDAL library to display georeferenced images and elevation data. It also has image editing capabilities and uses
OpenGL to display elevation data in 3D.

Opticks: The neat part about Opticks is the long list of extensions you can add. There are plugins for raster math,
radar processing and hyper/multispectral data. Before downloading an extension, make sure to check its compatibility.
Orfeo Toolbox: Has several algorithms for image filtering, image segmentation, and image classification with K-means
and SVM (Support Vector Machines). There is an official interface that allows interactive execution of applications,
and it also integrates with other software through a Python interface.

RemoteView: A group of software programs designed by Textron Systems Geospatial Solutions to aid in analyzing
satellite or aerial images of the Earth's surface for the purpose of collecting and disseminating geospatial
intelligence. The National Geospatial-Intelligence Agency (NGA) was a user of RemoteView software.

SOCET SET: A software application that performs functions related to photogrammetry. It is developed and published
by BAE Systems. SOCET SET was among the first commercial digital photogrammetry software programs.

IDRISI: TerrSet (formerly IDRISI) is an integrated GIS and remote sensing software developed by Clark Labs at Clark
University for the analysis and display of digital geospatial information. It is a PC grid-based system that offers
tools for researchers and scientists engaged in analyzing earth system dynamics for environmental management,
sustainable resource development and equitable resource allocation.

eCognition: eCognition Developer is a development environment for object-based image analysis. It is used in the
earth sciences to develop rule sets (or applications) for the analysis of remote sensing data. eCognition Server
software provides a processing environment for batch execution of image analysis jobs.

ArcGIS: Maintained by the Environmental Systems Research Institute (ESRI). It is used for creating and using maps,
compiling geographic data, analyzing mapped information, sharing and discovering geographic information, and
managing geographic information in a database. ArcGIS consists of the following Windows desktop software: ArcReader,
ArcGIS Desktop, ArcMap, ArcScene, ArcGlobe, ArcCatalog and ArcGIS Pro. Key features: geocoding, directions/mapping,
data visualization, offline mode to access maps.

SNAP: SNAP (Sentinels Application Platform) is an ESA development for the scientific exploitation of Earth
Observation missions. It contains the functionalities of toolboxes such as ENVISAT BEAM, NEST and Orfeo Toolbox. It
comprises an app for interactive work with Earth Observation data, the Graph Processing Framework to create and
execute recurring workflows, a command line interface, and programming interfaces for Python and Java.

Sentinel Toolbox: Consists of three separate applications: the Sentinel-1 Toolbox (SAR applications), the Sentinel-2
Toolbox (high-resolution optical applications) and the Sentinel-3 Toolbox (ocean and land applications). Sentinel-2
has become the gold standard for open satellite data. The Sen2Cor plugin allows users to correct for atmospheric
effects and classify images. The Sentinel-1 Toolbox can perform interferometry, speckle filtering and co-registration.

QGIS Semi-Automatic Classification Plugin (SCP): In terms of remote sensing plugins, the semi-automatic
classification plugin is one of the best. It is especially useful because you can download satellite imagery directly
in the plugin, such as Sentinel, Landsat, ASTER and MODIS. It also provides tools for pre- and post-processing of
imagery. Key features: data capturing, overlaying, spatial analysis; create, edit, manage and export data.

PolSARPro: This software can handle dual- and full-polarization SAR from satellites like ENVISAT-ASAR, ALOS-PALSAR,
RADARSAT-2 and TerraSAR-X. There is a wide range of tools such as radar decompositions, InSAR processing and
calibration. It has a graph processing framework where users can automate workflows.
Whitebox GAT: Intended to provide a platform for advanced geospatial analysis and data visualization with
applications in environmental research and the geomatics industry. It replaced the Terrain Analysis System (TAS).
Whitebox GAT has a solid 410 tools for GIS needs. Key features: LiDAR data, image processing tools, hydrology tools,
GIS tools.

gvSIG: Users can perform supervised classification, band algebra and decision trees. On top of that, gvSIG delivers a
more diverse range of tools such as TASSELED CAP (ideal for monitoring vegetation health/vigour and urban growth) and
VEGETATION INDICES (the vegetation indices toolbar analyzes chlorophyll and plant health for multispectral data). Key
features: 3D and animation, vector representation, raster and remote sensing, topology.

InterImage: A bit different from the other open source remote sensing software on this list, InterImage specializes
in automatic image interpretation. The core theme of automatic image interpretation is object-based image analysis
(OBIA), which involves segmentation, exploring spectral, geometric and spatial properties, and then classification.

STEREO VISION: (2018107010)


A stereoscope is a device for viewing a stereoscopic pair of separate images, depicting left-eye and right-eye views
of the same scene, as a single three-dimensional image.

The function of a stereoscope is to deflect normally converging lines of sight so that each eye views a different
image. Instruments in use today for the three-dimensional study of aerial photographs are of two types: the lens
stereoscope and the reflecting (mirror) stereoscope.

Stereoscopic Vision

• Stereoscopic vision, also called space vision or plastic vision, is a characteristic possessed by most persons
of normal vision; it is essential for the ability to perceive objects in three dimensions and to judge
distances.

• Stereoscopic vision is the basic prerequisite for photogrammetry and photo interpretation. Stereoscopy is
defined as the science or art which deals with stereoscopic or other three-dimensional effects and methods
by which these effects are produced.

• Close objects appear larger, brighter and more detailed than distant objects, and a close object obstructs the
view of a distant object. Monocular vision means seeing with one eye; binocular vision means using both eyes
simultaneously. The degree of depth perception is called stereoscopic acuity.


Stereoscopic viewing
• Two eyes must see two images, which are only slightly different in angle of view, orientation, colour,
brightness, shape and size.

• Human eyes fixed on the same object provide the two points of observation required for parallax. A finger held
at arm's length and viewed alternately with the left and right eye appears to move sideways; this apparent
movement or displacement is the horizontal parallax.

Types of stereoscopic vision

Stereoscopic vision can be of two types:

• Natural Stereoscopic Vision

• Artificial Stereoscopic Vision

Natural Stereoscopic Vision

• Natural stereoscopic vision is possible even with monocular vision, due to cues such as the relative size of
objects, overlapping of objects, convergence and accommodation of the eyes, atmospheric haze, etc.
• Binocular vision is responsible for the perception of depth. Two slightly different images, seen by the two eyes
simultaneously, are fused into one by the brain, giving the sensation of a three-dimensional model.
• The three-dimensional effect is reduced beyond a viewing distance of about one metre. The distance between the
two eyes, called the eye base, also affects stereoscopic vision: the wider the eye base, the better the
three-dimensional effect.

Artificial Stereoscopic Vision

• Artificial stereoscopic vision can be achieved with certain aids, so that a two-dimensional photograph can
provide a three-dimensional effect.
• The image obtained is comparable to the image that would be obtained if the two eyes were placed at the two
exposure stations on a flight line. Here the distance between the two exposure stations is called the air base.

• Relationship of accommodation i.e. changes of focus, and convergence or divergence of visual axes is
important. As the eyes focus on an object, they also turn so that lines of sight intersect at the object.

• The angle of convergence for a nearer object is larger than that for a more distant object. Proper association
between accommodation and convergence is necessary for the efficient functioning of the eyes.

• This association can be weakened or destroyed by improper use of the eyes. Visual illusions, colour vision
defects, focusing defects, coordination defects in depth perception, etc., are important factors affecting photo
interpretation.

• Stereoscopic vision, thus, is the observer's ability to resolve parallax differences between far and near
images. When the relative positions of the aerial photographs are reversed, the relief appears inverted; this
phenomenon is called pseudo stereo vision.

Requirements of Stereoscopic Photography

If, instead of looking at the original scene, we observe photos of that scene taken from two different viewpoints, we
can, under suitable conditions, obtain a three-dimensional impression from the two-dimensional photos. In order to
produce a spatial model, the two photographs of a scene must fulfil certain conditions:

• Both photographs must cover the same scene, with about 60% overlap.

• The time of exposure of both photographs must be the same.

• The scale of the two photographs should be approximately the same. Differences of up to 15% may be successfully
accommodated; for continuous observation and measurement, differences greater than 5% may be disadvantageous.

• The brightness of both the photographs should be similar.

• The base-height (B/H) ratio must have an appropriate value. Normally the B/H ratio is up to 2; the ideal value is
not known exactly but is probably near 0.25. If this ratio is too small, say 0.02, the stereoscopic view will not
provide a depth impression better than that obtained from a single photo.

In the base-height ratio B/H:

B = the distance between the two exposure stations (the base); H = the distance between the object and the line
joining the two exposure stations.

Stereo products

• The oblique viewing capability of the PAN sensor can be used to acquire stereo pairs. A stereo pair comprises
two images of the same area acquired on different dates and at different angles.

• The Base/Height ratio (the ratio of the distance between the two satellite passes to the satellite altitude) is
used to judge the quality of the stereo pairs.

• Stereo products will be available from the PAN sensors only. The input required in addition to path/row
details is base height ratio.

• Two scenes selected on two different dates satisfying the user's base-height ratio will be supplied as the stereo
pair. These will be available as black and white photographic and digital products.

• Photographic products will be available on 1:250,000 scale.

Two levels of processing

• Only radiometrically corrected

• Radiometrically corrected and geometrically corrected for across track correction

These two levels of processing are available with or without histogram equalisation. Stereo pairs are widely used
in photo interpretation for relief perception and also in photogrammetric studies for deriving DTMs.

Stereo Triplet products will also be supplied. Here in addition to the two scenes forming the stereo pair, a nadir
scene is also supplied.

RESOLUTION OF IMAGE: (2018107008)
The resolution of remotely sensed raster data can be characterized in several different ways. There are four
primary types of "resolution" for rasters:

• Spatial
• Spectral
• Radiometric
• Temporal

It is nearly impossible to acquire imagery that has high spatial, spectral, radiometric and temporal resolution all at
once. This is known as the resolution trade-off, since it is difficult and expensive to obtain imagery with extremely
high resolution in every respect. Therefore, it is necessary to identify which types of resolution are most important
for a project.

Spatial Resolution:

Spatial resolution is the size each pixel represents in the real world, also referred to as the ground resolution
distance. Spatial resolution is usually reported as the length of one side of a single pixel. For example, Landsat 8
has 30-meter spatial resolution; in other words, a single pixel represents an area on the ground that is 30 meters
across.

In analog imagery (film), the dimension (or width) of the smallest object on the ground that can be distinguished in
the imagery defines the spatial resolution. The spatial resolution of a raster is determined by the sensor
characteristics for digital imagery, and by the film characteristics, field of view and altitude for film photography.
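As a quick illustration of what a stated spatial resolution implies on the ground (the image dimensions below are assumed for illustration, not the specification of any particular sensor):

    # Ground footprint of a hypothetical 6000 x 6000 pixel image with 30 m pixels.
    pixel_size_m = 30
    rows = cols = 6000
    print(cols * pixel_size_m / 1000, "km across,",
          rows * pixel_size_m / 1000, "km down")   # 180.0 km across, 180.0 km down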

Satellite Data Spatial Resolution:

Spectral Resolution:

Spectral resolution refers to how many spectral "bands" an instrument records. Spectral resolution is also defined
by how "wide" each band is, i.e. the range of wavelengths covered by a single band. Black and white photos contain
only one band that covers the visible wavelengths, colour (RGB) images contain 3 bands, and Landsat 8 has a total of
11 bands. For example, MODIS (Moderate Resolution Imaging Spectroradiometer) has greater spectral resolution than
Landsat 8 because it has 36 relatively narrow bands that cover wavelengths from 0.4 to 14 micrometers. Landsat 8, on
the other hand, has a total of 11 bands that cover a smaller portion of the spectrum, and each band is wider in terms
of wavelength.

Radiometric Resolution:

Radiometric resolution is how finely a satellite or sensor divides up the radiance it receives in each band. The
greater the radiometric resolution, the greater the range of intensities of radiation the sensor is able to
distinguish and record. Radiometric resolution is typically expressed as the number of bits for each band.
Traditionally, 8-bit data was common in remotely sensed data; newer sensors (such as Landsat 8) have 16-bit data
products. 8 bits = 2^8 = 256 levels (usually 0 to 255); 16 bits = 2^16 = 65,536 levels (0 to 65,535).
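The relationship between bit depth and the number of quantisation levels can be computed directly, as in this small illustrative snippet:

    # Number of distinguishable levels = 2 ** bits; DN range is 0 .. 2**bits - 1.
    for bits in (6, 8, 10, 12, 16):
        print(f"{bits}-bit data: {2 ** bits} levels (0 to {2 ** bits - 1})")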

Temporal Resolution:

Remotely sensed data represents a snapshot in time. Temporal resolution is the time between two subsequent data
acquisitions for an area, also known as the "return time" or "revisit time". The temporal resolution depends
primarily on the platform: satellites usually have set return times, while sensors mounted on aircraft or unmanned
aircraft systems (UAS) have variable return times. For satellites, the return time depends on the orbital
characteristics (low vs. high orbit), the swath width and whether or not there is an ability to point the sensor.
Landsat has a return time of approximately 16 days, while other sensors like MODIS have nearly daily return times.

