
Lecture #2

Camera model

The aperture is a hole or an opening through which light is admitted. It is implemented through a device called a diaphragm, which allows openings of different sizes.

Perspective camera model

1. Weak perspective camera model: the relative distance between two scene points is much smaller than the average distance (Z_AV) between the camera and those points.
   a. Orthographic projection: points are projected along rays parallel to the optical axis (x' = x, y' = y)
   b. Isotropic scaling by the factor f/Z_AV
2. Magnification

3. Thin lens camera model: Z·z = f² (Newton's form), where S0 = Z + f and Si = z + f

4. The perspective camera model (pinhole)


Laws of geometric optics

Physical camera parameters

Intrinsic parameters: internal camera geometric and optical characteristics
● Focal length := the distance between the optical center of the lens and the image plane: f [mm] or [pixels]
● Effective pixel size (dpx, dpy) [mm]
● Principal point := location of the image center in pixel coordinates: (u0, v0)
● Distortion coefficients of the lens: radial (k1, k2) and tangential (p1, p2)

Bayer Pattern

Interpolation: when a value is missing from the sensor, we perform an 8-neighbour interpolation, averaging the available values. For each of the colours (R, G, B) we have one value in the matrix (R, G or B); the other two values are calculated via interpolation.
Keyword: bilinear interpolation
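The 8-neighbour averaging described above can be sketched as follows. This is a minimal illustration; storing each colour plane sparsely with `None` for missing samples is an assumption, not the lecture's exact data structure:

```python
def interpolate_missing(channel, i, j):
    """Average the available 8-neighbours of (i, j) in a sparse colour plane.
    Missing Bayer samples are stored as None."""
    vals = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == dj == 0:
                continue
            ni, nj = i + di, j + dj
            if 0 <= ni < len(channel) and 0 <= nj < len(channel[0]):
                v = channel[ni][nj]
                if v is not None:
                    vals.append(v)
    return sum(vals) / len(vals)
```

For a green plane with the four axial neighbours present, this reduces to the classic bilinear average of those four samples.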
Camera frame <-> image plane transformation

Camera frame -> image plane transformation
1. Transform P = [Xc, Yc, Zc] to p = [x, y, -f]
   f - focal distance [metric units]

2. Transform p = [x, y]^T (metric units) to image coordinates [u, v]^T (pixels)
   Du, Dv - coefficients needed to transform metric units to pixels
   Du = 1/dpx ; fx = f/dpx
   Dv = 1/dpy ; fy = f/dpy
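The two steps can be combined in a short sketch. Parameter names follow the intrinsic parameters above; ignoring the sign convention of the image plane at -f is a simplifying assumption:

```python
def project_point(Xc, Yc, Zc, f, dpx, dpy, u0, v0):
    # Step 1: perspective projection onto the image plane (metric units)
    x = f * Xc / Zc
    y = f * Yc / Zc
    # Step 2: metric units -> pixels, using Du = 1/dpx and Dv = 1/dpy
    u = x / dpx + u0   # equivalently fx * Xc / Zc + u0, with fx = f / dpx
    v = y / dpy + v0
    return u, v
```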

Image plane -> camera frame transformation

Modeling the lens distortions

1. Radial lens distortion: causes the actual image point to be displaced radially in the image plane.
2. Tangential distortion: appears if the centers of curvature of the lens surfaces are not strictly collinear.

Transform p = [x, y]^T (metric) -> image coordinates [u, v]^T (pixels)

=> The projection equations become non-linear

Solution: perform distortion correction on the image, then apply the linear projection.
Distortion correction: we establish a correspondence between the distorted image and the original image:
1. Compute the image coordinates (x, y) (1)
2. Compute the (x', y') coordinates in the distorted image (2)
3. Compute the (u', v') coordinates in the distorted image (3)
4. D(u, v) = S(u', v')
Keyword: bilinear interpolation of the destination pixel intensity
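Steps (1)-(3) can be sketched for the radial terms only. This is a minimal illustration under assumptions: tangential coefficients p1, p2 are omitted, and the parameter names mirror the intrinsic parameters listed earlier:

```python
def undistort_coords(u, v, fx, fy, u0, v0, k1, k2):
    """For a destination pixel (u, v), compute the source coordinates
    (u', v') in the distorted image (radial model only)."""
    # (1) pixel -> normalized metric coordinates
    x = (u - u0) / fx
    y = (v - v0) / fy
    r2 = x * x + y * y
    # (2) radial displacement: x' = x * (1 + k1*r^2 + k2*r^4)
    scale = 1 + k1 * r2 + k2 * r2 * r2
    xd, yd = x * scale, y * scale
    # (3) back to pixel coordinates; sample the source image S here
    return xd * fx + u0, yd * fy + v0
```

The returned coordinates are generally non-integer, which is where the bilinear interpolation of the destination pixel intensity comes in.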

3D (world) to 2D (image) mapping using the Projection Matrix

Projection matrix
The projection equation of a 3D world point [X_W, Y_W, Z_W] expressed in normalized coordinates:

Obtaining the 2D image coordinates from normalized coordinates:

Lecture #3
Binary images

Threshold selection: using the standard deviation and the mean, where k1 = k2 = 1 for low resolution and k1 = 1 to 1.5, k2 = 2 for high resolution
(check the formulae)

Global thresholding: let f(x,y) be the source image and b(x,y) the binary image.

Semi-thresholding: pixels whose values lie within a given threshold range retain their original values. Pixels with values lying outside the threshold range are set to 0.
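Both definitions can be sketched directly; plain nested lists stand in for images, and the >= / inclusive-range conventions are assumptions:

```python
def global_threshold(f, T):
    # b(x, y) = 1 where f(x, y) >= T, 0 otherwise
    return [[1 if p >= T else 0 for p in row] for row in f]

def semi_threshold(f, lo, hi):
    # pixels inside [lo, hi] keep their value; the rest are set to 0
    return [[p if lo <= p <= hi else 0 for p in row] for row in f]
```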

Variable thresholding:
● Variable thresholding allows different threshold levels to be applied to different regions of an image.
● Let f(x,y) be the source image and let d(x,y) denote the local (region) threshold value associated with each point in the image; that is, d(x,y) is the threshold value associated with the region in which point (x,y) lies.
● The thresholded image b(x,y) is defined by:

Multilevel thresholding: for segmentation of pixels into bins.
For example, if the image histogram contains three peaks, then it is possible to segment the image using two thresholds. These thresholds divide the value set into three non-overlapping ranges, each of which can be associated with a unique value in the resulting image.

This determines multiple thresholds for reducing the number of image intensity (gray) levels. Its first step is to determine the local maxima of the histogram. Then, each gray level is assigned to the closest maximum.
The following steps must be performed in order to determine the histogram maxima:
1. Normalize the histogram (transform it into a PDF)
2. Choose a window width 2*WH+1 (a good value for WH is 5)
3. Choose a threshold TH (a good value is 0.0003)
4. For each position k (middle of the window) from 0+WH to 255-WH:
   - Compute the average v of the normalized histogram values in the interval [k-WH, k+WH]. Remark: the value v is the average of 2*WH+1 values.
   - If PDF[k] > v+TH and PDF[k] is greater than or equal to all PDF values in the interval [k-WH, k+WH], then k corresponds to a histogram maximum. Store it and continue from the next position.
5. Insert 0 at the beginning of the maxima position list and 255 at the end.

The second step is thresholding. Thresholds are located at equal distances between the maxima; therefore, the algorithm simply assigns to each pixel the value of the nearest histogram maximum.
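The steps above transcribe almost directly; a normalized 256-bin histogram (PDF) as input is assumed:

```python
def histogram_maxima(pdf, WH=5, TH=0.0003):
    """Steps 2-5: local maxima of a normalized 256-bin histogram."""
    maxima = [0]                        # step 5: 0 at the beginning
    for k in range(WH, 256 - WH):       # step 4: slide the window
        window = pdf[k - WH:k + WH + 1]
        v = sum(window) / (2 * WH + 1)  # average of 2*WH+1 values
        if pdf[k] > v + TH and pdf[k] >= max(window):
            maxima.append(k)
    maxima.append(255)                  # step 5: 255 at the end
    return maxima

def nearest_maximum(g, maxima):
    # second step: assign each gray level to the closest maximum
    return min(maxima, key=lambda m: abs(m - g))
```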

Adaptive thresholding: the gray-level histogram is approximated by two normal distributions; the threshold is set to give the minimum probability of segmentation error.

Segmentation
Partitioning of an image into regions (subsets of the image) such that each region satisfies a predicate, meaning that all the points in that region share a common property. A segmentation should be:
- Exhaustive: the entire image is used
- Exclusive: a pixel may belong to only one category

Pixels belonging to adjacent regions, when taken jointly, do not satisfy the predicate.
Threshold selection by best approximation with a two-level image: given an image fij, the binary image is gij
- where t is the threshold
- a and b are constants chosen to minimize the distance on the corresponding intervals (between the original and the binarized image)
- the image has P gray levels

The Euclidean distance on the interval [0, P-1] is

The minimum of F with respect to a and b is attained at the mean values:

Otsu method
Problem statement: we have two groups of pixels, one with one range of values and one with another. Thresholding is difficult because these ranges usually overlap. Idea: minimize the error of classifying a background pixel as a foreground one or vice versa.
Set the threshold such that each cluster is as tight as possible, to minimize their overlap.
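The idea can be implemented by exhaustively maximizing the between-class variance, a standard formulation of Otsu's method (the 256-bin histogram input is an assumption):

```python
def otsu_threshold(hist):
    """Return the threshold t maximizing the between-class variance
    w0 * w1 * (m0 - m1)^2 over all candidate thresholds."""
    total = sum(hist)
    sum_all = sum(g * h for g, h in enumerate(hist))
    w0 = sum0 = 0
    best_t, best_var = 0, -1.0
    for t in range(len(hist)):
        w0 += hist[t]                  # background pixel count
        if w0 == 0:
            continue
        w1 = total - w0                # foreground pixel count
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0                 # background mean
        m1 = (sum_all - sum0) / w1     # foreground mean
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```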
Geometric properties
Area: the area A is measured in pixels and indicates the relative size of the object.

Center of mass: the equations correspond to the row and column where the center of mass is located.
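Both properties can be computed in one pass over a binary image (0-indexed rows and columns assumed):

```python
def area_and_center(binary):
    """Area = number of object pixels; center of mass = mean row and column
    of the object pixels."""
    A = ri = ci = 0
    for r, row in enumerate(binary):
        for c, p in enumerate(row):
            if p:
                A += 1
                ri += r
                ci += c
    return A, ri / A, ci / A
```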

Axis of elongation (ORIENTATION): the axis of elongation gives information about how the object is positioned in the field of view, that is, its orientation. The axis corresponds to the direction in which the object (seen as a plane surface of constant width) can rotate most easily (has a minimum kinetic moment).

Perimeter: P
Thinness ratio (circularity): the function above has a maximum value of 1, which is obtained for a circle.

1/T is called the irregularity factor (or compactness factor) of an object.

Aspect ratio: this property is found by scanning the image and keeping the minimum and maximum values of the lines and columns that form the rectangle circumscribed to the object.

Roundness: 0 - straight line; 1 - circle

Emin: +sin 2θ; Emax: -sin 2θ

Projections: the projections give information about the shape of the object.

The horizontal projection equals the sum of the pixels computed on each line of the image.

The vertical projection is given by the sum of the pixels on the columns.
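Both projections in a couple of lines:

```python
def projections(binary):
    # horizontal projection: sum along each row (line of the image)
    h = [sum(row) for row in binary]
    # vertical projection: sum along each column
    v = [sum(col) for col in zip(*binary)]
    return h, v
```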
Run-Length Encoding
Run-Length Encoding (108): two approaches are commonly used in run-length encoding:
● the start positions and lengths of the runs of 1s for each row
● the lengths of the runs, starting with the length of the 0 run

- The vertical projection is obtained by adding up all picture cell values in one column of the image.
- It is difficult to compute the vertical projection directly from the run-lengths. Consider instead the first difference of the vertical projection:
- The first difference of the vertical projection can be obtained by projecting not the image data, but the first horizontal differences of the image data:
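The trick above — projecting the first horizontal differences instead of the image — can be sketched with runs stored as (row, start, length) triples (this storage format is an assumption):

```python
def vertical_projection_from_runs(runs, width):
    """Each run of 1s contributes +1 at its start column and -1 just past its
    end in the first difference of the vertical projection; a prefix sum then
    recovers the projection itself."""
    diff = [0] * (width + 1)
    for _, start, length in runs:
        diff[start] += 1
        diff[start + length] -= 1
    proj, acc = [], 0
    for d in diff[:width]:
        acc += d
        proj.append(acc)
    return proj
```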
Lecture #4

Sequential Labeling

Classical Labeling (Two pass)
1. First pass: performs label propagation
   a. Whenever a situation arises in which two different labels can propagate to the same pixel, the smaller label propagates, and each such equivalence found is entered in an equivalence table (e.g. (1,2) -> EqTable).
   b. Each entry in the equivalence table consists of an ordered pair, the values of its components being the labels found to be equivalent.
   c. After the first pass, the equivalence classes are found.
   d. Each equivalence class is assigned a unique label, usually the minimum (or oldest) label in the class.
2. A second pass through the image performs a translation, assigning to each pixel the label of the equivalence class of its first-pass label.
BFS Labeling
The first step is to initialize the label matrix to zeros, which indicates that everything is unlabeled. Then the algorithm searches for an unlabeled object pixel. If it finds one, it gives it a new label and propagates the label to its neighbors. We repeat this until all object pixels are given a label.

The queue data structure maintains the list of points that need to be labeled. Since the queue uses a FIFO policy, we obtain a breadth-first traversal. We mark visited nodes by setting the label for their position. Changing the data structure to a stack would result in a depth-first traversal of the image graph.
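A minimal sketch of the BFS labeling loop (8-connectivity assumed):

```python
from collections import deque

def bfs_label(img):
    """Connected-component labeling by breadth-first traversal."""
    rows, cols = len(img), len(img[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 = unlabeled
    next_label = 0
    for i in range(rows):
        for j in range(cols):
            if img[i][j] and labels[i][j] == 0:
                next_label += 1
                labels[i][j] = next_label
                q = deque([(i, j)])              # FIFO -> breadth-first
                while q:
                    r, c = q.popleft()
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            nr, nc = r + dr, c + dc
                            if (0 <= nr < rows and 0 <= nc < cols
                                    and img[nr][nc] and labels[nr][nc] == 0):
                                labels[nr][nc] = next_label
                                q.append((nr, nc))
    return labels
```

Swapping `deque.popleft()` for `list.pop()` (a LIFO stack) turns this into the depth-first variant mentioned above.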
Border tracing algorithm
1. Search the image from the top left until a pixel of a new region is found; this pixel P0 is the starting pixel of the region border. Define a variable dir which stores the direction of the previous move along the border, from the previous border element to the current border element. Assign:
   (a) dir = 0 if the border is detected in 4-connectivity
   (b) dir = 7 if the border is detected in 8-connectivity
2. Search the 3x3 neighborhood of the current pixel in an anti-clockwise direction, beginning the neighborhood search at the pixel positioned in the direction:
   (a) (dir + 3) mod 4 (Fig. 6.1c)
   (b) (dir + 7) mod 8 if dir is even (Fig. 6.1d)
       (dir + 6) mod 8 if dir is odd (Fig. 6.1e)
   The first pixel found with the same value as the current pixel is a new boundary element Pn. Update the dir value.
3. If the current boundary element Pn is equal to the second border element P1 and the previous border element Pn-1 is equal to P0, stop. Otherwise repeat step (2).
4. The detected border is represented by pixels P0 ... Pn-2.

Polygonal approximation

Chain codes
OPEN CONTOUR = you do not return to the starting pixel

Lecture #5
Dilation

Erosion

Open: generally smoothes the contour of an object, breaks narrow isthmuses, and eliminates thin protrusions.

Close: tends to smooth sections of contours but, as opposed to opening, it generally fuses narrow breaks and long thin gulfs, eliminates small holes, and fills gaps in the contour.
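Erosion can be sketched as follows; opening is then erosion followed by dilation, and closing the reverse. Leaving border pixels at 0 is a simplifying assumption:

```python
def erode(img, se):
    """Binary erosion: the output pixel is 1 only if the structuring element,
    centered there, fits entirely inside the object."""
    rows, cols = len(img), len(img[0])
    k, l = len(se), len(se[0])
    ck, cl = k // 2, l // 2
    out = [[0] * cols for _ in range(rows)]
    for i in range(ck, rows - ck):
        for j in range(cl, cols - cl):
            out[i][j] = int(all(img[i + a - ck][j + b - cl]
                                for a in range(k) for b in range(l) if se[a][b]))
    return out
```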
Hit-or-miss transform
The hit-or-miss transform is a natural operation for selecting pixels that have certain geometric properties, such as corner points, isolated points, or border points, and it performs template matching, thinning, thickening, and centering.

It is basically an erosion with more filters (where one is applied to the complementary image), whose results are then combined.
If objects touch tangentially (label after erosion) => Opening; Closing is more for filling gaps. If opening is not enough, use the vertical/horizontal projections.

Lecture #6
Mean value

mean = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} f(i,j)}{m \cdot n}

Standard deviation

deviation = \sqrt{\frac{\sum_{i=1}^{m} \sum_{j=1}^{n} \left(f(i,j) - mean\right)^2}{m \cdot n}}

Variance
variance = deviation^2
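The three statistics, computed together for an m x n image stored as nested lists:

```python
def image_stats(f):
    """Mean, standard deviation, and variance of an m x n gray-level image."""
    m, n = len(f), len(f[0])
    mean = sum(sum(row) for row in f) / (m * n)
    var = sum((p - mean) ** 2 for row in f for p in row) / (m * n)
    return mean, var ** 0.5, var
```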

Probability distribution function

p(g) = h(g) / M

Histogram slide: BRIGHTNESS
Histogram stretch/shrink (gout_max and gout_min - the brightness range desired for the result): CONTRAST

g_{out} = g_{out}^{min} + \left(g_{in} - g_{in}^{min}\right) \cdot \frac{g_{out}^{MAX} - g_{out}^{min}}{g_{in}^{MAX} - g_{in}^{min}}

Gamma correction

g_{out} = L \cdot \left(\frac{g_{in}}{L}\right)^{\gamma}
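Both point operations, sketched per pixel (the gray-level range L and the parameter names are assumptions):

```python
def stretch(g_in, gin_min, gin_max, gout_min, gout_max):
    # linear contrast stretch of [gin_min, gin_max] onto [gout_min, gout_max]
    return gout_min + (g_in - gin_min) * (gout_max - gout_min) / (gin_max - gin_min)

def gamma_correct(g_in, L=256, gamma=2.2):
    # g_out = L * (g_in / L) ** gamma
    return L * (g_in / L) ** gamma
```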

Cumulative probability density function (CPDF)

The probability that the respective gray level (or a lower one) occurs:

pc_i = p_i + pc_{i-1} (we add up all the probabilities up to that intensity)

Histogram equalization
Histogram equalization is a transform which allows us to obtain an output image with a quasi-uniform histogram/PDF, regardless of the shape of the histogram/PDF of the input image.

pc(gin) = 1 when gin is the last (maximum) gray value
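The whole transform — histogram, cumulative PDF, then the remapping of each gray level — can be sketched for levels 0..L-1:

```python
def equalize(img, L=256):
    """Map each gray level through the cumulative PDF scaled to [0, L-1]."""
    M = sum(len(row) for row in img)       # total number of pixels
    hist = [0] * L
    for row in img:
        for p in row:
            hist[p] += 1
    cpdf, acc = [], 0
    for h in hist:
        acc += h / M                       # pc_i = p_i + pc_{i-1}
        cpdf.append(acc)
    return [[round((L - 1) * cpdf[p]) for p in row] for row in img]
```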


Histogram specification / histogram matching (198)

Self-information (surprisal)

Information: the information associated with gray level f:

Entropy: the average information of the image:

Energy: how the gray levels are distributed:
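The standard definitions (entropy as the expected self-information, energy as the sum of squared probabilities), sketched over a normalized histogram:

```python
import math

def entropy_energy(pdf):
    """Entropy H = -sum p*log2(p) over nonzero bins; energy E = sum p^2."""
    H = -sum(p * math.log2(p) for p in pdf if p > 0)
    E = sum(p * p for p in pdf)
    return H, E
```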

Lecture #7
Dirac Delta function
Can be thought of as a function on the real line which is zero everywhere except at the origin, and which is also constrained to satisfy the identity:

Unit pulse function

Mean filter: 3x3 (note: for salt & pepper noise the median filter below is the usual choice)

Gaussian filter: 3x3

Laplace filters: edge detection

High-pass filters:

Discrete Fourier transform (DFT):

​Median filter
Selects the middle value from the ordered statistic and replaces the destination pixel
with it. In the example above, the selected value would be 104. The median filter
allows the elimination of salt & pepper noise.

​Maximum filter
Selects the largest value amongst the ordered values of pixels from the window. In
the above example, the value selected is 114. This filter can be used to eliminate the
pepper noise, but it amplifies the salt noise if applied to a salt & pepper noise image.

​Minimum filter
Selects the smallest value amongst the ordered values of pixels from the window. In
the above example, the value selected is 85. This filter can be used to eliminate the
salt noise, but it amplifies the pepper noise if applied to a salt & pepper noise image.
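Median, minimum and maximum are all order-statistic filters, so they can share one sketch (3x3 window; leaving border pixels unchanged is an assumption):

```python
def rank_filter(img, rank):
    """Order-statistic filter on a 3x3 window; rank 0 = minimum,
    4 = median, 8 = maximum. Border pixels are left unchanged."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            window = sorted(img[i + di][j + dj]
                            for di in (-1, 0, 1) for dj in (-1, 0, 1))
            out[i][j] = window[rank]
    return out
```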

Gaussian

G(x,y) = \frac{1}{2 \pi \sigma^2} \cdot e^{-\frac{(x-x_0)^2 + (y-y_0)^2}{2 \sigma^2}}
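Sampling the formula above into a normalized convolution kernel, with (x0, y0) at the kernel center:

```python
import math

def gaussian_kernel(size, sigma):
    """Sampled 2D Gaussian, normalized so the weights sum to 1."""
    c = size // 2  # center (x0, y0)
    k = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    s = sum(map(sum, k))
    return [[v / s for v in row] for row in k]
```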
Lecture #8
Signal to noise ratio (a single image)

SNR = \sqrt{\frac{f^2}{n^2} - 1}

To calculate it, we need two of d, f or n.

Signal to noise ratio (multiple images)

SNR = \sqrt{\frac{r}{1-r}}
Lecture #9-10
Average filter (mean filter)

Threshold average filter (the middle pixel is not taken into consideration)
Differs from the average filter due to the threshold: the centre value is not included, we take the average of the neighbours, and the source value is kept only if its difference from this average is below the threshold.

Median filter (uses the sorted vector)

Weighted median filter
Each element may be taken a certain number of times (its weight)

Butterworth filter (low-pass filter: removes high frequencies, blurs the image)

Trapezoidal filter
Lecture #11-12
First order differential edge detection (Canny) - 332
Use the gradient

Sobel filter - 336

"Apply one Sobel kernel, then the other, and take the square root of the sum of squares" - MAGNITUDE
In order to find the edges, the algorithm has to check the points with the largest variation of intensity. For this, we apply the gradient to find the local maximum points of the 1st order derivative.
This gradient is a vector perpendicular to the tangent at that point only if the point is part of a contour. It has two components that can be approximated by convolution with derivative kernels (Sobel). After getting the two components, the direction of the vector and the magnitude can also be obtained.
The magnitude is compared to a threshold value to determine whether or not the point is an edge point.

-> We can obtain noise points that get classified as edge points, we can have gaps in edges, and finally, thicker edges (for this we may need an edge-thinning algorithm).
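The recipe quoted above — both Sobel kernels, then the square root of the sum of squares — at a single interior pixel:

```python
import math

def sobel_gradient(img, i, j):
    """Gx and Gy via the 3x3 Sobel kernels at (i, j);
    returns (magnitude, direction)."""
    gx = (img[i-1][j+1] + 2*img[i][j+1] + img[i+1][j+1]
          - img[i-1][j-1] - 2*img[i][j-1] - img[i+1][j-1])
    gy = (img[i+1][j-1] + 2*img[i+1][j] + img[i+1][j+1]
          - img[i-1][j-1] - 2*img[i-1][j] - img[i-1][j+1])
    return math.hypot(gx, gy), math.atan2(gy, gx)
```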

Second order differential edge detection (Laplace)

1. Apply the filter.
2. "Go over every pixel, look at its neighbours and see where the product of opposite neighbours becomes negative or zero, because that is where the sign changed."
It means there is an edge in that direction = the edges are detected by searching for zero crossings.
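Checking the sign change between opposite neighbours of the Laplacian response, as described:

```python
def zero_crossings(lap):
    """Mark pixels where the Laplacian changes sign between horizontal
    or vertical neighbours (borders left at 0)."""
    rows, cols = len(lap), len(lap[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # a negative product of opposite neighbours marks a sign change
            if lap[i][j-1] * lap[i][j+1] < 0 or lap[i-1][j] * lap[i+1][j] < 0:
                out[i][j] = 1
    return out
```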
Canny edge detection (*IPLab11)
With the MAGNITUDE and the direction you can go on and apply Canny.

A derivative shows how strongly the colour changes and in which direction: the magnitude shows how strongly it changes, the direction shows in which direction.

Zero crossing methods - 353

1. The image is filtered with a LoG kernel representing the composition of a Gaussian low-pass filter and a Laplacian 2nd order derivative filter
2. The edge points are detected as the zero crossings of the filtered image

Steps in edge detection:
1. noise suppression
2. image quality enhancement
3. edge detection
4. edge localization / thinning / linking

Gradient: a vector which has a certain magnitude and direction
● the magnitude specifies the strength of the edge
● the direction is always perpendicular to the direction of the edge

Multiple types of edge detectors:
● Roberts: the interpolation is computed at point (i+1/2, j+1/2)
● Prewitt: each neighboring pixel is assigned the same importance
● Sobel: the most popular operator used in edge detection; it gives more importance to the pixels that are closer to the center pixel

Optimal edge detection criteria:
1. good detection: minimize the probability of false positives/negatives
2. good localization: the detected edge points should be as close as possible to the true edge
3. single response constraint: one point only for each true edge point; minimize the number of local maxima around the true edge
Canny edge detection
● using the first order derivative
● based on a step-edge model with additive Gaussian noise

Step 0 -> noise removal using the Gaussian filter

1. compute the derivatives
   ○ using Sobel filters
2. compute the gradient
3. apply non-maxima suppression
   ○ find the local maxima of the gradient at each point; this step is used to thin the edge ridges that might have been thickened by the noise removal algorithm
   ○ done by checking if each non-zero gradient is greater than its neighbors (edge pixel) depending on the orientation
4. apply hysteresis thresholding/ edge linking
○ adaptive thresholding for fixing the issues caused by lighting
conditions
○ we find the value of the gradient magnitude that exceeds the ones of
the pixels that don't form an edge (non-edge pixels); this way we can
make the distinction more easily between the edge and non-edge
pixels
○ steps:
1. compute the histogram of the image after applying
non-maxima suppression, find the number of non-edge
pixels
2. the values of the histogram are summed, similarly to the
cumulative histogram, and assign the threshold the first
value that exceeds the number of non-edge pixels
○ then we use edge linking to complete the shapes of the edges, which
could be broken by the lighting conditions/noise
○ the algorithm classifies the edges into 3 categories: weak, strong and non-edge, with the help of two thresholds: the one computed in the previous step (adaptive thresholding) and one computed by multiplying this value by a coefficient smaller than 1.
○ the algorithm iterates through the strong edge pixels, finds their weak edge neighbors, and converts them into strong edges.
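The adaptive-threshold part of step 4 can be sketched as follows: the histogram of gradient magnitudes is summed cumulatively until it exceeds the estimated number of non-edge pixels, and the low threshold is a fraction k of the high one. Both the histogram-of-magnitudes input and the default k are assumptions:

```python
def hysteresis_thresholds(hist, non_edge_count, k=0.4):
    """High threshold: first gradient-magnitude bin where the cumulative
    histogram exceeds the estimated number of non-edge pixels.
    Low threshold: k * high, with k < 1."""
    acc = 0
    for g, h in enumerate(hist):
        acc += h
        if acc > non_edge_count:
            return g, k * g
    return len(hist) - 1, k * (len(hist) - 1)
```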
LoG kernel = Gaussian low-pass filter + Laplacian second order derivative
Gaussian low-pass filter:
● blurs the image to retrieve only the main features in the final result

LoG:
● finds the positions in the image where the second derivative is zero (edges)

The gradient has a greater value at an edge point (we find this by using the first derivative); and if we apply a second derivative we notice that the "peak" of the gradient value is now a zero value. However, there are more 0s in the second derivative due to the ripples in the original image.

When we find a zero-crossing of the Laplacian, we should also check if it is caused by a ripple in the original image (noise).

This is done by computing an estimate of the local variance whenever we find a zero-crossing.
● if the variance is low, the edge is a ripple
● if the variance is high, it is a true edge since it makes the distinction between two very different grey levels

(The second order derivative is taken in the direction of the gradient.)


Lecture #13
Stereovision
How to obtain a distance from two images.

What matters is the equation with Z. d - disparity (the difference in where the point appears); b - the distance between the cameras; Z - depth
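The equation with Z mentioned above, for a rectified stereo pair, is Z = b·f/d; a one-line sketch (units assumed: b and Z in the same metric unit, f and d both in pixels):

```python
def depth_from_disparity(b, f, d):
    # Z = b * f / d: depth is inversely proportional to disparity
    return b * f / d
```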

Depth approximation

Range resolution
The smallest change in the depth Z that can be observed through a change in the disparity.