[Figure: mobile robot control scheme, with Perception (Sensors, Vision), Localization (Global Map), and Cognition; previously covered: uncertainties and line extraction from laser scans.]
http://webvision.med.utah.edu/sretina.html
© R. Siegwart , M.Chli and D. Scaramuzza, ETH Zurich - ASL
Lecture 5 - Perception - Vision
Human Visual Capabilities
Perspective Projection
Stereo Vision
Optical Flow
Color Tracking
[Figure: pinhole imaging, with rays from the object passing through a small hole onto the film; example: images of a solar eclipse projected through pinholes.]
http://www.thelivingmoon.com/45jack_files/03files/Launch_Sites_Baikonur_Tour.html
Pinhole camera model
The pinhole model:
Captures a beam of rays: all rays pass through a single point
This point is called the Center of Projection or Optical Center
The image is formed on the Image Plane
We will use the pinhole camera model to describe how the image is formed
Slide by Steve Seitz
Why so blurry?
http://www.debevec.org/Pinhole/
A lens can focus multiple rays coming from the same point
[Figure: a lens focusing rays from an object; optical axis, focal point, focal length f.]
[Figure: thin lens imaging an object of height A at distance z to an image of height B at distance e.]
Similar triangles: B/A = e/z
[Figure: the same construction, using the ray through the focal point.]
Similar triangles: B/A = (e − f)/f
[Figure: thin lens, with the object at distance z, the image at distance e, and focal length f.]
Combining the two similar-triangle relations B/A = e/z and B/A = (e − f)/f gives the thin lens equation:
1/f = 1/z + 1/e
Any object point satisfying this equation is in focus
“Depth from Focus”: use this to estimate (roughly) the distance to the object
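The depth-from-focus idea can be sketched numerically; the lens and distance values below are illustrative, not from the slides.

```python
def depth_from_focus(f, e):
    """Solve the thin lens equation 1/f = 1/z + 1/e for the object
    distance z, given focal length f and in-focus image distance e."""
    return 1.0 / (1.0 / f - 1.0 / e)

f = 0.05                               # 50 mm lens
z_true = 1.0                           # object 1 m away
e = 1.0 / (1.0 / f - 1.0 / z_true)     # image distance that focuses it
z_est = depth_from_focus(f, e)         # recovers the 1 m distance
```

In practice one sweeps the focus setting, detects the sharpest e, and inverts the equation as above.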
[Figure: lens of aperture L imaging an object onto film; a point not in focus spreads into a “Circle of Confusion”, or “Blur Circle”.]
If the object is out of focus, the Blur Circle has radius R = L·δ/(2e), where δ is the displacement of the film from the in-focus image plane at distance e.
A minimal L (pinhole) gives a minimal R
For objects out of focus, a larger aperture gives worse blur
Adjust camera settings such that R remains smaller than the image resolution
[Plot: blur-circle radius R versus object distance z; for example, z = 10 gives R ≈ 0.117.]
Increased sensitivity to blurring when the object is close to the lens
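A short sketch of the blur-circle relation, assuming the geometric form R = L·δ/(2e) as reconstructed above; all numbers are illustrative.

```python
def blur_radius(L, e, delta):
    """Blur-circle radius R = L*delta/(2e) for aperture diameter L,
    in-focus image distance e, and image-plane displacement delta."""
    return L * delta / (2.0 * e)

# Halving the aperture halves the blur radius, which is why a
# pinhole (minimal L) keeps everything (almost) in focus.
R_wide = blur_radius(L=0.010, e=0.05, delta=0.001)    # 10 mm aperture
R_narrow = blur_radius(L=0.005, e=0.05, delta=0.001)  # 5 mm aperture
```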
[Figure: pinhole projection, with an object of height h at distance z projecting through the optical center C to an image of height h′ at distance e ≈ f, so h′/h = e/z ≈ f/z.]
C = “optical center”, “center of projection”
The “Ames room” illusion exploits this: a distorted room is built so that its image matches that of a normal room from one specific viewpoint.
[Figure: camera frame with optical center C and axes Xc, Yc, Zc; image plane (CCD) at distance f, with principal point O and image point p.]
O = principal point
C = optical center = center of the lens
For convenience, the image plane is usually represented in front of C, so that the image preserves the same orientation (i.e. it is not flipped)
Perspective projection in two steps:
1. Convert the world point Pw to camera coordinates Pc, using the extrinsic parameters [R|T]
2. Convert Pc to (discretised) pixel coordinates (u,v)
Perspective Projection (2)
[Figure: image plane with pixel origin (0,0) in the corner and pixel axes u, v.]
From the Camera frame to pixel coordinates:
So:  u = u0 + ku·f·Xc/Zc = u0 + ku·x
     v = v0 + kv·f·Yc/Zc = v0 + kv·y
Use Homogeneous Coordinates for a linear mapping from 3D to 2D, by introducing an extra element (scale):
p̃ = λ·[u, v, 1]^T, and similarly for the world coordinates. Note: usually λ = 1.
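The camera-frame-to-pixel mapping above can be written directly in code; the intrinsic values here are hypothetical.

```python
def project_to_pixels(X_c, Y_c, Z_c, f, k_u, k_v, u0, v0):
    """u = u0 + k_u*f*X_c/Z_c, v = v0 + k_v*f*Y_c/Z_c:
    scale by the focal length, divide by depth, convert metres
    to pixels, and shift to the principal point (u0, v0)."""
    u = u0 + k_u * f * X_c / Z_c
    v = v0 + k_v * f * Y_c / Z_c
    return u, v

# 8 mm lens, 125000 px/m sensor, so k_u*f = k_v*f = 1000 px
u, v = project_to_pixels(0.1, 0.2, 1.0, f=0.008,
                         k_u=125000, k_v=125000, u0=320, v0=240)
```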
Perspective Projection (3)
So:
u = u0 + ku·f·Xc/Zc
v = v0 + kv·f·Yc/Zc
Expressed in matrix form and homogeneous coordinates:
λ·[u]   [ku·f    0    u0] [Xc]
  [v] = [  0   kv·f   v0] [Yc]
  [1]   [  0     0     1] [Zc]
[Figure: image plane (CCD) with principal point O and image point p at focal distance f; optical center C and camera axes Xc, Yc.]
Or alternatively, using the Projection Matrix:
λ·[u; v; 1] = K·[Xc; Yc; Zc] = K·[R | T]·[Xw; Yw; Zw; 1]
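In code, the whole chain λ·p̃ = K·[R|T]·P̃w is one 3×4 matrix-vector product followed by dividing out the scale; K, R and T below are made-up values.

```python
import numpy as np

K = np.array([[1000.0,    0.0, 320.0],    # intrinsics (hypothetical)
              [   0.0, 1000.0, 240.0],
              [   0.0,    0.0,   1.0]])
R = np.eye(3)                             # extrinsics: camera aligned
T = np.array([[0.0], [0.0], [2.0]])       # with world, origin 2 m ahead

M = K @ np.hstack([R, T])                 # 3x4 projection matrix
P_w = np.array([0.5, 0.25, 0.0, 1.0])     # homogeneous world point
p = M @ P_w                               # = lambda * (u, v, 1)
u, v = p[:2] / p[2]                       # divide out the scale
```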
Radial Distortion
[Figure: barrel distortion versus pincushion distortion, controlled by the radial distortion parameter.]
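One common polynomial model for radial distortion (this specific parameterisation is an assumption, not necessarily the one on the slide): a point at normalised radius r is scaled by 1 + k1·r² + k2·r⁴, giving barrel distortion for k1 < 0 and pincushion for k1 > 0.

```python
def radial_distort(x, y, k1, k2=0.0):
    """Scale the normalised image point (x, y) by 1 + k1*r^2 + k2*r^4
    about the principal point; k1 < 0 pulls points inward (barrel),
    k1 > 0 pushes them outward (pincushion)."""
    r2 = x * x + y * y
    s = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * s, y * s

x_barrel, _ = radial_distort(1.0, 0.0, k1=-0.1)   # pulled inward
x_pin, _ = radial_distort(1.0, 0.0, k1=+0.1)      # pushed outward
```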
In full: λ·[u; v; 1] = M·[Xw; Yw; Zw; 1], where M is the 3×4 matrix with entries m11 … m34 (its last row gives λ = m31·Xw + m32·Yw + m33·Zw + m34).
What we obtained: the 3×4 projection matrix M.
What we need: its decomposition into the camera calibration matrix K, and the rotation R and position T of the camera.
Use QR factorization to decompose the 3×3 submatrix (m11 … m33) into the product of an upper triangular matrix K and a rotation matrix R (an orthogonal matrix).
The translation T can subsequently be obtained by: T = K⁻¹·[m14; m24; m34]
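A sketch of this decomposition, built from numpy's QR routine; the sign fix forces a positive diagonal on K, and the setup values are invented.

```python
import numpy as np

def decompose_projection(M):
    """Split a 3x4 projection matrix M = K [R | T] into intrinsics K
    (upper triangular), rotation R, and translation T = K^-1 m4."""
    A = M[:, :3]
    P = np.flipud(np.eye(3))           # row-reversal permutation
    Q, U = np.linalg.qr((P @ A).T)     # QR of the reversed transpose
    K = P @ U.T @ P                    # upper-triangular factor
    R = P @ Q.T                        # orthogonal factor
    D = np.diag(np.sign(np.diag(K)))   # make diag(K) positive
    K, R = K @ D, D @ R
    T = np.linalg.solve(K, M[:, 3])    # T = K^-1 [m14, m24, m34]^T
    return K / K[2, 2], R, T
```

Strictly this is an RQ factorisation (triangular factor on the left); the slide's "QR factorization" refers to the same family of orthogonal-triangular decompositions.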
From disparity to depth: ZP = b·f / (ul − ur), where b is the stereo baseline, f the focal length, and ul, ur the horizontal pixel coordinates of the point in the left and right images.
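The disparity relation in code; baseline, focal length and pixel values are illustrative.

```python
def stereo_depth(b, f, u_l, u_r):
    """Depth from horizontal disparity: Z = b*f / (u_l - u_r),
    with baseline b and focal length f expressed in pixels."""
    return b * f / (u_l - u_r)

# 12 cm baseline, f = 1000 px, 20 px disparity: point is 6 m away
Z = stereo_depth(b=0.12, f=1000.0, u_l=350.0, u_r=330.0)
```

Note that the same one-pixel disparity error maps to a much larger depth error for far points, so depth resolution degrades with distance.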
[Figure: general stereo pair, with the world point (Xw, Yw, Zw) seen by left and right cameras related by (R, T).]
Left camera (set the world frame to coincide with the left camera frame):
λl·[ul; vl; 1] = Kl·[Xw; Yw; Zw]
Right camera:
λr·[ur; vr; 1] = Kr·(R·[Xw; Yw; Zw] + T)
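Given the left and right projection equations, a corresponding pixel pair can be triangulated; below is a linear (DLT-style) sketch with invented calibration values.

```python
import numpy as np

def triangulate(p_l, p_r, K_l, K_r, R, T):
    """Linear triangulation of one stereo correspondence, with the
    world frame on the left camera: P_l = K_l [I|0], P_r = K_r [R|T].
    Each pixel contributes two rows of a homogeneous system A X = 0,
    solved via the smallest right singular vector."""
    P1 = K_l @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K_r @ np.hstack([R, np.reshape(T, (3, 1))])
    (u1, v1), (u2, v2) = p_l, p_r
    A = np.array([u1 * P1[2] - P1[0],
                  v1 * P1[2] - P1[1],
                  u2 * P2[2] - P2[0],
                  v2 * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]        # null vector of A
    return X[:3] / X[3]
```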
[Figure: epipolar geometry, with image points p1 = (u1, v1) and p2 = (u2, v2), camera centers C1 and C2, epipoles E1 and E2, and the corresponding epipolar lines.]
Impose the epipolar constraint to aid matching: search for a
correspondence along the epipolar line
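The constraint can be written with the essential matrix E = [T]ₓ·R: for normalised image coordinates (pixel coordinates with K divided out), corresponding points satisfy x_rᵀ·E·x_l = 0. A numerical sketch, with R, T and the point all made up:

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])  # small yaw
T = np.array([0.5, 0.0, 0.0])                               # baseline
E = skew(T) @ R                                             # essential matrix

P = np.array([0.3, -0.2, 2.0])     # point in the left camera frame
x_l = P / P[2]                     # normalised left coordinates
P_r = R @ P + T                    # same point in the right frame
x_r = P_r / P_r[2]                 # normalised right coordinates
residual = x_r @ E @ x_l           # ~0: the epipolar constraint holds
```

The epipolar line in the right image for x_l is l = E·x_l; restricting the correspondence search to this line is the constraint mentioned above.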
[Figure sequence: an image pair from the left and right cameras, progressively corrected for each calibration parameter.]
Stereo calibration must account for:
Rotation
Focal lengths
Lens Distortion
Translation
[Plot: reconstructed 3D point positions (X-Y view).]
[Figure: two-view geometry, with corresponding image points x and x′ in cameras C and C′ related by (R, T).]
Multiple-view structure from motion
Optical Flow
It computes the motion vectors of all pixels in the image (or of a subset of them, to be faster).
Color Tracking
Thresholding in YUV space can achieve greater stability to illumination changes than thresholding in RGB space.
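A minimal sketch of chroma-based thresholding; the BT.601 RGB-to-YUV conversion is standard, but the threshold ranges below are invented for the example.

```python
def rgb_to_yuv(r, g, b):
    """BT.601 RGB -> YUV: Y is luminance, U and V carry chrominance."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.492 * (b - y), 0.877 * (r - y)

def is_target_color(r, g, b, u_range, v_range):
    """Threshold only U and V, ignoring the luminance Y, so a shadowed
    and a brightly lit patch of the same colour can both pass."""
    _, u, v = rgb_to_yuv(r, g, b)
    return u_range[0] <= u <= u_range[1] and v_range[0] <= v <= v_range[1]

bright_red = is_target_color(200, 30, 30, u_range=(-30, 0), v_range=(40, 120))
dim_red = is_target_color(100, 15, 15, u_range=(-30, 0), v_range=(40, 120))
blue = is_target_color(30, 30, 200, u_range=(-30, 0), v_range=(40, 120))
```

A single RGB box that admits both the bright and the dim red pixel would typically also admit many unrelated colours; separating chroma from luminance avoids that.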