Vision Presentation

University College of Borås - Robotics Group
http://www.ing.hb.se
An Introduction to Machine Vision
IPWS
n.n@hb.se
The Problem
• The process of interpreting 2D images of 3D scenes
(i.e. a many-to-one mapping).
• The process has given time constraints.
• The set of features for interpretation is usually known beforehand.
A Classification of the
Stages within the Process
Sensing
The process of acquiring and quantising the image of the scene.
• The most important variables in this process are
Resolution of the image, measured in terms of
∗ the number of pixels within the image plane and
∗ the number of bits used to represent a pixel
Illumination
• These determine the fundamental limits on what can be achieved
in later stages.
Preprocessing
This usually performs one of two functions
• Noise Reduction - removal of what is unwanted
• Enhancement - emphasising what is required for later stages
Both, in effect, selectively change the signal-to-noise ratio of components
in the image.
Segmentation
• This process groups the pixels in the image into 'related' regions.
• These groups often correspond to areas of the image with similar
physical 3D characteristics, e.g.
∗ edges
∗ colour
∗ texture
Feature Extraction
• This process represents the segmented region using some form
of abstraction.
• Often in the form of vectors of numbers called feature vectors.
• There is a loss of information at this point and the features represent
only the chosen characteristic.
Matching
• This process associates the feature vectors with known or expected
regions in the 3D scene.
• This often involves a local model of the expected scene.
Interpretation
• Assigns a globally consistent match to all the individually matched
regions to produce an interpretation of the entire scene.
• Often with the help of a global model of the expected scene.
Note
• These individual stages are not independent.
• In the implementation of a time-efficient system they often overlap.
Illumination
• This is the most critical aspect of the machine vision problem.
• A badly chosen illumination scheme can make a tractable
problem intractable.
• The illumination scheme must be suitable for the features which
are to be measured.
• There are 4 basic schemes: diffuse lighting, directional lighting,
back lighting and structured lighting.
Diffuse Lighting
[Figure: diffuse lighting arrangement; labels: Light source, Diffusing Surface, Object, Shadow]
Often used
• to enhance the surface characteristics when employed with
smooth or regular surfaces.
• to remove shadows.
Directional Lighting
• Used to enhance the scattering properties of the scene, when
observed from the camera's perspective.
• This can be used to identify defects in a smooth surface, which
may appear as bright spots on a dark background.
• The angle of reflectance is chosen to reflect light away from the
camera for a smooth surface.
Back lighting
[Figure: back lighting arrangement; label: Collimated Light Source]
• This is used to enhance the object boundaries.
• It produces a silhouette of the object without the need for preprocessing.
• Suits quantitative measurement.
Structured Lighting
• This is a group of techniques which identify variation in an expected
pattern of illumination.
• The pattern of illumination is chosen based on the features to be
extracted.
• Suitable for quantitative measurement.
Connectivity
• Used to define the neighbours of a pixel, or those pixels belonging to a
related set such as a connected boundary.
• Various kinds of connectivity can be defined (given quantisation on
a square pixel grid)
N4(p) those pixels a unit distance away from p
ND(p) those pixels at a distance of √2 from p
N8(p) the set of all adjacent pixels, i.e. N4(p) ∪ ND(p)
• Note: square pixel grids are not the usual case.
Typically pixels have an aspect ratio of 4 : 3.
N4(p)
  q
q p q
  q

ND(p)
q   q
  p
q   q

N8(p)
q q q
q p q
q q q
Mixed Connectivity
0 0 0 0 0 0
0 0 1 1 0 0
0 1 0 0 1 0
0 1 0 0 1 0
0 1 0 0 1 0
0 0 1 1 0 0
0 0 0 0 0 0
Ambiguity
• The edge of the hole is not connected if we apply N4 connectivity.
• The hole and background are connected if we simply apply N8 connectivity
to both background and edge.
• Neither definition gives a unique definition of connectedness
=⇒ M-connectivity
M-Connectivity
M-connectivity is said to exist for the pixels q and p
(where p, q ∈ V, the set of interest)
• if q ∈ N4(p)
• or
• if q ∈ ND(p) and N4(p) ∩ N4(q) = ∅ (counting only pixels that belong to V)
Note: this may not be implemented directly but can be implicitly achieved
by choosing an appropriate search order.
[Figure: pixel grids of 1s illustrating M-connectivity for diagonal pixels p and q, annotated "q is not a member of N4(p)", "q is a member of ND(p)" and "N4(p) intersect N4(q) is empty"]
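The neighbourhood sets and the M-connectivity test can be sketched in a few lines of Python. This is only an illustration: pixels are assumed to be (x, y) integer tuples, V is assumed to be a set of such tuples, and the names n4, nd, n8 and m_connected are hypothetical. The intersection test counts only pixels that belong to V.

```python
def n4(p):
    """4-neighbours: the pixels a unit distance away from p."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}


def nd(p):
    """Diagonal neighbours: the pixels at distance sqrt(2) from p."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}


def n8(p):
    """All adjacent pixels."""
    return n4(p) | nd(p)


def m_connected(p, q, V):
    """True if p and q (both in V) are M-connected."""
    if p not in V or q not in V:
        return False
    if q in n4(p):
        return True
    # Diagonal neighbours count only if they share no 4-neighbour in V.
    return q in nd(p) and not (n4(p) & n4(q) & V)
```

With V = {(0, 0), (1, 0), (1, 1)} the diagonal pair (0, 0), (1, 1) is not M-connected, because the two pixels are already linked through their shared 4-neighbour (1, 0).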
Measures of Distance
• Metrics are distance measures with defined properties.
• A metric d is a real-valued function.
• Metrics can be used to measure the distance between
pixels or features.
Properties of a Metric
Positivity
$d(p, q) \ge 0$
$d(p, q) = 0$ iff $p = q$
Symmetry
$d(p, q) = d(q, p)$
Triangle Inequality
$d(p, z) + d(q, z) \ge d(p, q)$
[Figure: triangle with vertices p = (xp, yp), q = (xq, yq), z = (xz, yz) and sides d(p,q), d(p,z), d(z,q)]
Some Metrics
Euclidean Metric
$d_e(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2}$
Absolute Metric
$d_a(p, q) = |x_p - x_q| + |y_p - y_q|$
Maximum Value Metric
$d_m(p, q) = \max(|x_p - x_q|, |y_p - y_q|)$
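As a small illustration, the three metrics can be written directly in Python. Points are assumed to be (x, y) tuples, and the function names are hypothetical.

```python
import math


def d_euclidean(p, q):
    """Euclidean metric: straight-line distance."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def d_absolute(p, q):
    """Absolute metric: sum of the coordinate differences."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def d_maximum(p, q):
    """Maximum value metric: the larger coordinate difference."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


p, q = (0, 0), (3, 4)
print(d_euclidean(p, q), d_absolute(p, q), d_maximum(p, q))  # 5.0 7 4
```

The absolute and maximum value metrics need no square root, which is one reason they are attractive when computation time matters.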
Observations
• $d_e(p, q)^2$ is not a metric since the triangle inequality does not
hold (e.g. for p = (0, 0), z = (1, 0), q = (2, 0): $d_e(p, q)^2 = 4 > 1 + 1$).
• Truncated or rounded versions of $d_e(\cdot)$ also violate the triangle
inequality.
• Area A is a metric.
• Perimeter P is a metric.
• =⇒ Thinness T is a metric
$T = 4\pi \frac{A}{P^2}, \quad 0 \le T \le 1$
for a circle T = 1
• Aspect ratio AR is a metric
$AR = \frac{\text{length}}{\text{breadth}}$
[Figure: object with its Length and Breadth marked]
Approaches to Preprocessing
A simple categorisation of approaches
• Spatial domain
Use the spatial relationships between pixels and their values, i.e.
changes in intensity in a local neighbourhood of pixels.
• Frequency domain
The relationships of quantities after transformation with
an orthonormal function, e.g. the 2D Fourier transform.
• Statistical distributions
The statistical distribution of pixel intensities in the image.
Spatial Domain Methods
• Often based upon passing a small mask of constant values
(e.g. 3 × 3) over the image.
• Each mask is chosen to perform a particular function and preprocessing
involves several passes.
• Since processing with the mask is equivalent to convolution
of the image,
• there is a direct relationship between spatial and frequency domain
techniques.
• Spatial methods are often a more efficient method of processing,
except for very large images.
Convolution by a Mask
(0,0) x
0,0 1,0 2,0 ...
0,1 ... ... ...
0,2 ... ... ...
...
[Figure: a mask passed over the input image to produce the output image, shown at its first and second positions]
If the pixels are $p_1, \dots, p_9$ and the weights in the mask are $w_1, \dots, w_9$,
i.e.
w1 w2 w3
w4 w5 w6
w7 w8 w9
then the new output pixel value is
$p' = \sum_{i=1}^{9} w_i p_i$
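The weighted sum can be sketched as a plain Python loop over the image. This is a minimal illustration under stated assumptions: the image is a list of rows, the mask is a 3 × 3 list of lists, border pixels are simply copied, and the name apply_mask is hypothetical. The mask is applied without flipping, which equals convolution whenever the mask is symmetric.

```python
def apply_mask(image, mask):
    """Weighted sum of each 3x3 neighbourhood; borders are copied unchanged."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            total = 0
            for j in range(3):
                for i in range(3):
                    # w(j, i) weights the pixel at offset (j - 1, i - 1)
                    total += mask[j][i] * image[y + j - 1][x + i - 1]
            out[y][x] = total
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ones = [[1, 1, 1]] * 3
print(apply_mask(image, ones)[1][1])  # 45
```

Each output value depends only on the input image, so the result is written to a copy rather than in place.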
Frequency domain Methods
• For smaller images frequency-based methods are usually more
time consuming than spatial methods.
• However, certain operations which are difficult to achieve using
spatial methods are easier in the frequency domain,
e.g. the magnitude of the frequency spectrum of the image is invariant
under translation, and a rotation of the image simply rotates the spectrum.
• Usually the transformation used is separable, e.g. the 2D FFT.
• NOTE: the 1D FFT can be used to produce an invariant feature
vector of a boundary.
Computing the 2D FFT of an Image
1. Compute the 1D FFT for each row, producing an array of coefficients
2. Multiply these by N, the dimension of the row
3. Take the 1D FFT of these values along each column
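The row-then-column procedure can be sketched in pure Python. To stay self-contained this sketch uses a direct O(N²) 1D DFT in place of a real FFT routine; the result is the same, only slower. It also assumes an unnormalised forward transform, in which case no multiplication by N is needed between the two passes (that step presumably belongs to a convention where the 1D transform carries a 1/N factor). The names dft_1d and dft_2d are hypothetical.

```python
import cmath


def dft_1d(values):
    """Direct 1D discrete Fourier transform (unnormalised)."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]


def dft_2d(image):
    """2D DFT computed separably: every row first, then every column."""
    rows = [dft_1d(row) for row in image]
    cols = [dft_1d(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major order


spectrum = dft_2d([[1, 2], [3, 4]])
print(abs(spectrum[0][0]))  # 10.0, the sum of all pixel values
```

Separability is what makes the method practical: an N × N image needs 2N one-dimensional transforms rather than one full two-dimensional sum per coefficient.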
Statistical Methods
The form of these methods depends on the stage of image processing in
which they are used
• Preprocessing
where they are often used to generate a description of the
intensities of a region of an image
• Segmentation
where they are used to form a texture description
• Matching and Interpretation
Bayesian classifiers.
[Figure: intensity histogram over the range 0 to 255 with modes labelled Background and Pawn]
Preprocessing of Grey Scale Images
Noise Reduction - Neighbourhood Averaging
Let
f be the input image
g be the output image
n be the number of pixels in the neighbourhood about and
including f(x, y)
$g(x, y) = \frac{1}{n} \sum_{i=1}^{n} f_i$
For a 3 × 3 neighbourhood n = 9 and the mask is
1 1 1
1 1 1
1 1 1
the result would then be divided by n
[Figure: Raw image alongside its 3x3 N. Smoothed and 5x5 N. Smoothed versions]
• The assumption for the noise reduction is that any noise in the
image is randomly distributed according to a zero-mean Gaussian
distribution.
• Averaging has the effect of blurring the image.
• The larger the neighbourhood the larger the blurring effect.
Noise Reduction - Median Filtering
Median filtering requires
• that the pixels in the neighbourhood be sorted according to their
intensity values (i.e. lowest to highest)
• then the value for which half the values are smaller and half the
values are larger is taken.
For a 3 × 3 neighbourhood this corresponds to the 5th value.
• This approach is slower than neighbourhood averaging but does
not blur the image.
• Thus it is appropriate for cases where quantitative
edge measurements are required.
• The median is a non-parametric statistic and so makes
no assumption about the noise distribution.
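A 3 × 3 median filter can be sketched as follows. The image is assumed to be a list of rows of intensities, border pixels are copied unchanged, and the name median_filter is hypothetical.

```python
def median_filter(image):
    """3x3 median filter; border pixels are copied unchanged."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            window = sorted(
                image[y + j][x + i] for j in (-1, 0, 1) for i in (-1, 0, 1)
            )
            out[y][x] = window[4]  # the 5th of the 9 sorted values
    return out


noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
print(median_filter(noisy)[1][1])  # 10: the isolated bright pixel is removed
```

On a step edge such as rows of 0 0 9 9 the filter returns the same values, which is the edge-preserving behaviour described above.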
Noise Reduction - Integration
If we assume the noise is 'salt and pepper' noise which
• has zero mean
• is additive
• is uncorrelated with the image
then simply averaging images of the same scene over time will reduce
the noise.
Edge Enhancement
in Grey Scale Images
• Edges in an image are rapid changes in intensity.
• These correspond, in the 3D world, to
∗ surface discontinuities
∗ reflectance boundaries
∗ illumination boundaries
In order to distinguish between the causes of an edge we require
• a model of illumination in our scene
• a geometric model of the poses of the objects within the scene
• These are not usually available.
Therefore all edges are processed similarly.
The principle of edge enhancement
• Since an edge is a rapid change in intensity between adjacent
pixels in an image,
• the most obvious process would be to take the difference between
pixels,
• producing a local approximation to the rate of change of intensity.
• The edge could be emphasised further by taking the second
difference.
• However, in both cases the image must be filtered to reduce the
noise prior to using these methods.
[Figure: an image edge with its intensity profile, first derivative and second derivative]
$\mathrm{Gradient}[f(x, y)] = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \partial f / \partial x \\ \partial f / \partial y \end{bmatrix}$
The gradient is a vector and points in the direction of the maximum rate
of change.
The magnitude is often sufficient for the characterisation of edges
$\text{magnitude} = \left(G_x^2 + G_y^2\right)^{1/2}$
Often another metric is chosen to reduce the computation required,
such as
$\text{magnitude} \approx |G_x| + |G_y|$
The first-order partial derivatives $\partial f / \partial x$ and $\partial f / \partial y$ can be obtained in a number
of ways
First difference
$G_x = \frac{\partial f}{\partial x} \approx f(x, y) - f(x - 1, y)$
$G_y = \frac{\partial f}{\partial y} \approx f(x, y) - f(x, y - 1)$
The masks which follow are all discrete approximations of the first
or second derivative
2 × 2 Roberts Cross Operator
The simplest approximation to the gradient
Masks:
 1  0      0  1
 0 -1     -1  0
Pixels:
 a  b
 c  d
$R(a) = |a - d| + |b - c|$
Laplacian ∇²
• This implements an approximation to the second derivative for an
N4 or N8 connected neighbourhood.
• It has a larger neighbourhood and thus is less susceptible to the
effect of noise than the simple Roberts Cross operator.
• However, its use still requires significant noise reduction.
N4 mask:     N8 mask:
0  1  0      1  1  1
1 -4  1      1 -8  1
0  1  0      1  1  1
Sobel Operators
• This is a weighted first difference using two masks,
• one for change in the x direction and the other for the y direction.
• The result is usually an approximation to the magnitude of the
vector sum.
Gy:            Gx:
-1 -2 -1       -1  0  1
 0  0  0       -2  0  2
 1  2  1       -1  0  1
$\text{magnitude} = |G_x| + |G_y|$
[Figure: Edge]
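The two Sobel masks and the |Gx| + |Gy| approximation can be sketched in Python as below. The image is assumed to be a list of rows of intensities; border pixels are set to 0 in this sketch, and the name sobel_magnitude is hypothetical.

```python
SOBEL_GX = [[-1, 0, 1],
            [-2, 0, 2],
            [-1, 0, 1]]
SOBEL_GY = [[-1, -2, -1],
            [0, 0, 0],
            [1, 2, 1]]


def sobel_magnitude(image):
    """Approximate gradient magnitude |Gx| + |Gy|; borders are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    v = image[y + j - 1][x + i - 1]
                    gx += SOBEL_GX[j][i] * v
                    gy += SOBEL_GY[j][i] * v
            out[y][x] = abs(gx) + abs(gy)
    return out


step = [[0, 0, 10, 10]] * 3
print(sobel_magnitude(step)[1])  # [0, 40, 40, 0]
```

A vertical step edge responds only through Gx, and a flat region gives 0 everywhere.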
Segmentation
Given a
• primal sketch,
an image in which the features to be used for segmentation are
enhanced,
• segmentation
subdivides the image into regions which have similar properties.
• Thus similarity and discontinuity are the basis for segmentation
techniques.
• One quantity on which to base a measure of similarity is pixel
intensity.
Thresholding
Binary thresholding classifies each pixel individually as belonging to
one of two classes based upon a threshold value T
$g(x, y) = \begin{cases} \uparrow & f(x, y) > T \\ \downarrow & f(x, y) \le T \end{cases}$
where (↑, ↓) can be suitable values such as (255, 0) or (1, 0)
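Binary thresholding is a one-line operation per pixel. In this sketch the image is assumed to be a list of rows, and the function name and the default output values (255, 0) are illustrative choices.

```python
def threshold(image, t, high=255, low=0):
    """Map each pixel to `high` if it exceeds t, otherwise to `low`."""
    return [[high if v > t else low for v in row] for row in image]


print(threshold([[10, 200], [128, 129]], 128))  # [[0, 255], [0, 255]]
```

Note that a pixel exactly equal to T falls in the lower class, matching the ≤ in the definition.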
Thresholding techniques are said to be
• Global
• Variable or
• Dynamic
in nature depending on the method used to calculate T
Global
• The value T is calculated using statistics derived from the
entire image.
• This single value T is used to classify all pixels in the image.
Variable
• Also known as local threshold techniques.
• Use local neighbourhoods of pixels to generate a local threshold
for the neighbourhood.
• Thus a number of thresholds T1, ..., Tn are used to classify
an entire image.
• Most often the neighbourhoods do not overlap.
• Variable threshold techniques can adapt to variations in illumination
(e.g. shadows).
[Figure: image tiled into square neighbourhoods (dimension labelled 32) with local thresholds T00, T10, ...]
Dynamic
• In principle each pixel is classified according to its own threshold,
which is based on the pixel's neighbourhood.
• For most applications this is impractical.
Calculation of Threshold values
based upon Statistical Analysis
• These techniques often sample the neighbourhood and create a
histogram of pixel intensities representing their distribution.
• In the ideal case the histogram will be bi-modal, where one mode
represents
the object (i.e. what we wish to isolate) and the other
the background (i.e. everything else).
However, often the image contains
• more than one object,
• more than one background or shadows and
• thus the histogram is multi-modal.
• Multiple thresholding techniques do exist but are by their nature
less robust.
• The approach then is either to
• control the scene (e.g. backlighting) or
• choose selectively which pixel intensities of the image are used.
• Given that we are often concerned with edges,
• it is possible to form a distribution of only those pixels which have a
large gradient and thus lie on or near an edge.
Gy mask:    Gx mask:
 1           1 -1
-1
$G_x + G_y > \varepsilon$
• If we assume that the relevant regions of the image can be distinguished
in the histogram (i.e. a separable distribution),
• then suitable thresholds will correspond to the troughs or dips in
the histogram.
• The method chosen for the selection of a particular point for the
trough contains assumptions about the underlying distribution.
Piecewise Approximation
• This approach makes a piecewise linear approximation to the
distribution.
• The troughs, and thus the threshold values, are a subset of the
points of intersection.
• This approach can be simply extended to multiple thresholds.
[Figure: histogram against Intensity with the threshold T marked]
Class Separation
Iterative technique which attempts to optimise
$B(T) = N_1 N_2 (M_1 - M_2)^2$
N1 the number of pixels with intensities < T
N2 the number of pixels with intensities ≥ T
M1 the mean intensity of the pixels whose values are < T
M2 the mean intensity of the pixels whose values are ≥ T
• The value $(M_1 - M_2)^2$ corresponds to a measure of separation
of the two means.
• The value $N_1 N_2$ corresponds to a measure of symmetry.
• The value B(T) is a maximum for the optimal T for a bi-modal
distribution.
[Figure: two intensity histograms marking M1, N1, M2, N2, the separation (M1 − M2)² and the Global Mean]
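The class separation measure can be evaluated for every candidate T and the maximum taken. The sketch below uses an exhaustive search rather than an iterative scheme, which is a simplification; pixels is assumed to be a flat list of intensities, and the names class_separation and best_threshold are hypothetical.

```python
def class_separation(pixels, t):
    """B(T) = N1 * N2 * (M1 - M2)^2 for the split at threshold t."""
    low = [v for v in pixels if v < t]
    high = [v for v in pixels if v >= t]
    if not low or not high:
        return 0.0  # one class is empty, so there is no separation
    m1 = sum(low) / len(low)
    m2 = sum(high) / len(high)
    return len(low) * len(high) * (m1 - m2) ** 2


def best_threshold(pixels, levels=256):
    """Exhaustive search for the T that maximises B(T)."""
    return max(range(levels), key=lambda t: class_separation(pixels, t))


pixels = [10, 12, 11, 200, 205, 198]
t = best_threshold(pixels)
print(12 < t <= 198)  # True: T falls in the gap between the two modes
```

Any T inside the gap between the two modes gives the same split and therefore the same B(T); the search simply returns the first such value.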
Optimal Threshold Selection
The calculation of the optimal threshold requires that we know
$P_o(z)$ the probability density function of the pixel intensity for the
object
$P_b(z)$ the probability density function of the pixel intensity for the
background
$p_b, p_o$ the a priori probabilities of there being background and object
respectively in the image
and that the image consists of only object and background, so
$p_o + p_b = 1$
so that the probability density function for the image P(z) is given by
$P(z) = p_o P_o(z) + p_b P_b(z)$
Thus the optimal threshold is at the point at which the two weighted
distributions intersect
$p_o P_o(T) = p_b P_b(T)$
If we assume that both pixel distributions are Gaussian distributions,
i.e. for the object
$P_o(z) = \frac{1}{\sqrt{2\pi}\,\sigma_o} \exp\left(-\frac{(z - m_o)^2}{2\sigma_o^2}\right)$
• then assuming $\sigma_o = \sigma_b = \sigma$ and substituting $P_o(z)$ and $P_b(z)$
into $p_o P_o(T) = p_b P_b(T)$
(taking logarithms gives $(T - m_b)^2 - (T - m_o)^2 = 2\sigma^2 \ln(p_b / p_o)$),
• solving for T gives
$T = \frac{m_o + m_b}{2} + \frac{\sigma^2}{m_o - m_b} \ln\frac{p_b}{p_o}$
• If $\sigma = 0$ or $p_o = p_b$ then T simply becomes the average of
the means $m_o$, $m_b$.
• For most practical cases the assumptions are not realisable.
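The equal-variance Gaussian case can be checked numerically. In this sketch the logarithm term is written as ln(p_b / p_o), which is what solving the intersection condition p_o P_o(T) = p_b P_b(T) yields; the function names and the example values are illustrative only.

```python
import math


def gaussian(z, mean, sigma):
    """Gaussian probability density with the given mean and sigma."""
    return math.exp(-((z - mean) ** 2) / (2 * sigma ** 2)) / (
        math.sqrt(2 * math.pi) * sigma
    )


def optimal_threshold(m_o, m_b, sigma, p_o, p_b):
    """Threshold where the prior-weighted object and background densities meet."""
    return (m_o + m_b) / 2 + sigma ** 2 / (m_o - m_b) * math.log(p_b / p_o)


# Illustrative values: bright object (mean 150) on a darker background (mean 50).
t = optimal_threshold(150, 50, 20, p_o=0.3, p_b=0.7)
lhs = 0.3 * gaussian(t, 150, 20)
rhs = 0.7 * gaussian(t, 50, 20)
print(math.isclose(lhs, rhs))  # True: the weighted densities intersect at T
```

With equal priors the logarithm vanishes and T is the midpoint of the two means; when the object is the rarer class, T moves towards the object mean.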
The Pyramidal Threshold Technique
• The approach is to partition the image into a set of non-overlapping
quad-tree-like regions R.
• Determine if there is sufficient variability in each region r to be
able to determine a local threshold.
Often based upon the variance of the histogram for the region.
• If not, determine a histogram from adjacent regions and re-calculate
the variability:
if the variability is sufficient, use the threshold of this larger region
for r;
if not, enlarge the region and continue.
[Figure: pyramid of thresholds, from the first-layer regions T00 ... T31, through merged regions T', up to TGlobal]
A Distance Weighted Variation
• This approach does not create the entire pyramid but only the
first layer.
• The regions with insufficient variability calculate their thresholds as
a weighted mean of those neighbouring regions with a threshold.
• The thresholds of the neighbouring regions are inversely weighted
by the distance to the unassigned region.
[Figure: grid of region thresholds T around an unassigned region marked ?, with neighbours labelled T1 and T2]
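One possible reading of the inverse distance weighting is sketched below. It assumes that regions are identified by (row, column) grid positions, that distance is the Euclidean distance between those positions, and that each weight is 1/distance; none of these details is fixed by the slides, and the name weighted_threshold is hypothetical.

```python
import math


def weighted_threshold(target, neighbours):
    """Inverse-distance weighted mean of neighbouring region thresholds.

    target: (row, col) of the region without a threshold.
    neighbours: list of ((row, col), threshold) for regions that have one.
    """
    numerator = denominator = 0.0
    for (r, c), t in neighbours:
        weight = 1.0 / math.hypot(r - target[0], c - target[1])
        numerator += weight * t
        denominator += weight
    return numerator / denominator


# A region at distance 1 with T = 100 and one at distance 2 with T = 160.
print(weighted_threshold((1, 1), [((1, 2), 100), ((1, 3), 160)]))  # 120.0
```

The nearer region dominates: the result lies closer to 100 than to 160.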
Binary Image Processing
• The sequence of previous stages results in a binary image, i.e. one where
each pixel may have only one of two values.
• The transformation to a binary image often results in
∗ broken lines
∗ isolated pixels
∗ missing corners
• Binary image morphology provides a general theory to tackle
these problems. We will look at this later.
Removal of Isolated pixels
Neighbourhood:    Isolated pixel:
a b c             0 0 0
d p e             0 1 0
f g h             0 0 0
a + b + c + d + p + e + f + g + h
Look Up Table Implementation
Form an index i
$i = 256a + 128b + 64c + 32d + 16p + 8e + 4f + 2g + h$
or
$i = 2(2(2(2(2(2(2(2a + b) + c) + d) + p) + e) + f) + g) + h$
into an array of 512 elements containing the Boolean function, represented as
Index  a b c d p e f g h  output
0      0 0 0 0 0 0 0 0 0  1
...    ...
511
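A look-up table for isolated pixel removal might be built as follows. The index uses the nine neighbourhood bits in the order a, b, c, d, p, e, f, g, h, so it ranges over 0 to 511. The Boolean function stored here (keep p only if at least one of its eight neighbours is set) is an assumption made for this sketch, and the table on the slide may encode a different function. All function names are hypothetical.

```python
def neighbourhood_index(a, b, c, d, p, e, f, g, h):
    """Pack the nine binary values into an index 0..511 (a is the top bit)."""
    i = 0
    for bit in (a, b, c, d, p, e, f, g, h):
        i = 2 * i + bit  # nested form of 256a + 128b + ... + 2g + h
    return i


def build_isolated_pixel_lut():
    """Output 1 only when p is set and at least one neighbour is set."""
    lut = [0] * 512
    for i in range(512):
        p = (i >> 4) & 1       # p carries weight 16
        neighbours = i & ~16   # every bit except p
        lut[i] = 1 if (p and neighbours) else 0
    return lut


def remove_isolated(image, lut):
    """Apply the table to every interior pixel of a binary image."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            n = [image[y + j][x + i] for j in (-1, 0, 1) for i in (-1, 0, 1)]
            out[y][x] = lut[neighbourhood_index(*n)]
    return out
```

Once the table is built, each output pixel costs one index computation and one array access, whatever Boolean function the table holds.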
Boundary Isolation
There are various definitions of exactly what constitutes a boundary
0 0
1 1 1 1
1 1 1 1
1 1 1 1
0 0
$B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}^{4} \cup \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}^{4} \cup \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}^{4}$
Chain Code
Chain code is a standardised form used to encode connected sequences
of pixels
3 2 1
4 p 0
5 6 7
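Encoding a pixel sequence with the direction numbering above can be sketched as a simple table look-up. The sketch assumes points are (row, column) tuples with rows increasing downwards, so that direction 2 ("up" in the diagram) is a row step of -1, and that consecutive points are 8-adjacent (any other step raises a KeyError). The names are hypothetical.

```python
# (row step, column step) -> direction code, following the layout
#   3 2 1
#   4 p 0
#   5 6 7
STEP_TO_CODE = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def chain_code(points):
    """Return the chain code for a sequence of 8-adjacent (row, col) points."""
    codes = []
    for (r0, c0), (r1, c1) in zip(points, points[1:]):
        codes.append(STEP_TO_CODE[(r1 - r0, c1 - c0)])
    return codes


print(chain_code([(2, 0), (2, 1), (1, 2), (0, 2)]))  # [0, 1, 2]
```

Only the starting point and the list of codes need to be stored to reconstruct the whole sequence.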
[Figure: boundary traced from the first point found after a raster search, with the chain code of each step; steps in directions 1, 0 and 7 are marked + and steps in directions 3, 4 and 5 are marked −]
[Figure: area contributed by a single chain-code step at height y: direction 0, Area = 1 × y; direction 5, Area = (y × 1) − 0.5; direction 2, Area = 0]
Direction   Area     Twice the Area
0           y        2y
1           y+0.5    2y+1
2           0        0
3           y+0.5    2y+1
4           y        2y
5           y-0.5    2y-1
6           0        0
7           y-0.5    2y-1
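The table can be turned into an area calculation for a closed chain code. The sketch below makes several assumptions that the slides do not state outright: y is the height of the pixel at the start of each step, measured upwards; contributions for directions 0, 1 and 7 are added while those for 3, 4 and 5 are subtracted (following the + and − marks in the figure); and the "twice the area" column is used so that the running total stays an integer. The name chain_code_area is hypothetical.

```python
def chain_code_area(codes, y_start):
    """Area enclosed by a closed chain code, starting at height y_start."""
    # Twice the signed area contributed by one step in each direction.
    twice_area = {
        0: lambda y: 2 * y,
        1: lambda y: 2 * y + 1,
        2: lambda y: 0,
        3: lambda y: -(2 * y + 1),
        4: lambda y: -2 * y,
        5: lambda y: -(2 * y - 1),
        6: lambda y: 0,
        7: lambda y: 2 * y - 1,
    }
    # Change in height for each direction (1, 2, 3 go up; 5, 6, 7 go down).
    dy = {0: 0, 1: 1, 2: 1, 3: 1, 4: 0, 5: -1, 6: -1, 7: -1}
    y = y_start
    total = 0
    for code in codes:
        total += twice_area[code](y)
        y += dy[code]
    return abs(total) / 2


# A 2 x 2 square traced up, right, down, left from its bottom-left corner.
print(chain_code_area([2, 2, 0, 0, 6, 6, 4, 4], y_start=0))  # 4.0
```

Taking the absolute value at the end makes the result independent of whether the boundary is traced clockwise or anticlockwise.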
[Figure: closed boundary labelled with its chain code, showing the step from the last point back to the first (Last −> First)]
