Lecture 11

EE-381 Robotics-1
UG ELECTIVE
Dr. Hafsa Iqbal
Department of Electrical Engineering,
School of Electrical Engineering and Computer Science,
National University of Sciences and Technology,
Pakistan
Robot Perception using Vision
2
Computer Vision
• State-of-the-art
  • OCR (Optical Character Recognition)
    • Converts scanned documents to text
    • Recognizes handwritten numbers
  • Face Detection
    • Detects blinks for better photos
    • Smile shutter: the camera waits until you smile
    • Camera-based login
  • Object Recognition
  • Action Recognition
  • Activity Recognition
3
Computer Vision
• State-of-the-art
• Localization and Mapping
4
Automotive Safety
Slide acknowledgement: Prof. Fei Fei Li’s CS131 class at Stanford
5

Applications of Computer Vision
6
Applications of Computer Vision
7
Connection to other disciplines
8
What is the goal of Computer Vision?
• “The goal of Computer Vision is to make useful decisions about real physical objects and scenes based on sensed images.”
[Diagram: Computer Vision shown together with Image Processing and Computer Graphics; label: 2D & 3D Scene Understanding]

9
What is Computer Vision?
Slide acknowledgement: Prof. Fei Fei Li’s CS131 class at Stanford

10
11
What we would like to infer…
• Will person B put some money into Person C’s tip bag?
12
Why is computer vision hard?
⦁ Computers are good at numerical processing
⦁ Humans are good at perceptual processing
⦁ We want to use a computer to mimic human perception… which is complex to understand
13
The Complexity of Perception
14
15
16
Perception
Ref: Light and Vision: LIFE Science Library

17
Perception
18
What is this?
19
Images are 2D projections of 3D World
20
Recognition Helps Reorganization
21
22
Writing Programs that “See”
An Example
23
What kind of information can we extract from an image?
⦁ 3D Information
⦁ Semantic Information
24
What kind of information can we extract from an image?
• 3D Information
• Semantic Information
25
How to infer from such complicated images?
Will person B put some money into Person C’s tip bag?
26
Origins of Computer Vision
27
The camera
- Textbook sections 4.2.1 and 4.2.2 are left for students to read on their own
- The Digital Camera
- CCD Cameras
- CMOS Cameras
- Color Cameras
Sony Cybershot WX1
28
Pinhole camera
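First described by Ibn al-Haytham (ابو علی، الحسن بن الحسن بن الھیثم; Abu Ali al-Hasan ibn al-Hasan ibn al-Haytham)
in his seven-volume work کتاب المناظر (Kitab al-Manazir, "Book of Optics").
He termed it بیت المظلم (al-Bayt al-Muzlim, "the dark room"),
which was later translated into Latin as “camera obscura”.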
http://www.ibnalhaytham.com/discover/who-was-ibn-al-haytham/
29
The first photograph on record
30
Pinhole Camera
• The lens is assumed to be a single point
• Infinitesimally small aperture
• Has infinite depth of field, i.e., everything is in focus
31
Pinhole Camera Properties: Distant objects are smaller
Slide Credit: Forsyth/Ponce http://www.cs.berkeley.edu/~daf/bookpages/slides.html and Khurram Shafique, Object Video
32
Pinhole camera model
• Pinhole model:
• Captures pencil of rays – all rays through a single point
• The point is called Center of Projection
• The image is formed on the Image Plane
Slide by Steve Seitz
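The pinhole model can be sketched numerically. The snippet below is a minimal illustration, assuming the standard perspective projection x = f·X/Z, y = f·Y/Z with the image plane placed in front of the center of projection (so the image is not inverted). The function name `project_point` and the numbers are made up for the example.

```python
def project_point(X, Y, Z, f):
    """Project a 3D point (camera coordinates) through a pinhole.

    Assumes the center of projection is at the origin and the image
    plane sits at distance f along the optical (Z) axis.
    """
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)


# The same 1 m tall object seen at two depths (f = 0.05 m is hypothetical).
near = project_point(0.0, 1.0, 2.0, f=0.05)
far = project_point(0.0, 1.0, 4.0, f=0.05)
print(near, far)
```

Doubling the depth halves the projected height, which is the "distant objects are smaller" property.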

35
How do we see the world?
• Let’s design a camera
• Idea 1: put a piece of film in front of an object
• Do we get a reasonable image?
36
Pinhole camera
• Add a barrier to block off most of the rays
• This reduces blurring
• The opening is known as the aperture
37
Solution: adding a lens
• A lens focuses light onto the film
• Rays passing through the center are not deviated
38
Solution: adding a lens
• A lens focuses light onto the film
• Rays passing through the center are not deviated
• All parallel rays converge to one point on a plane located at the focal length f
39
Thin lenses
• Thin lens equation:
$$\frac{1}{d_0} + \frac{1}{d_i} = \frac{1}{f}$$
• Any object point satisfying this equation is in focus
• This formula can also be used to roughly estimate the distance to the object (“Depth from Focus”)
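As a rough sketch of "Depth from Focus", the thin lens equation can be solved for the object distance once the focal length and the in-focus image distance are known. The function name `object_distance` and the numeric values are illustrative assumptions, not part of the lecture.

```python
def object_distance(d_i, f):
    """Solve 1/d_0 + 1/d_i = 1/f for the object distance d_0.

    d_i: lens-to-image distance at which the point is in focus
    f:   focal length (same unit as d_i)
    """
    if d_i <= f:
        raise ValueError("a real object needs d_i > f")
    return 1.0 / (1.0 / f - 1.0 / d_i)


# Hypothetical 50 mm lens that is in focus with the sensor 52 mm behind it.
d_0 = object_distance(d_i=52.0, f=50.0)
print(f"estimated object distance: {d_0:.1f} mm")
```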
40
Pin-hole approximation
• If the object is far away from the lens, the whole lens will appear (in relative terms) as a pinhole.
$$\frac{1}{d_0} + \frac{1}{d_i} = \frac{1}{f}$$
$$d_0 \gg d_i \;\Rightarrow\; \frac{1}{d_i} \approx \frac{1}{f} \;\Rightarrow\; d_i \approx f$$
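A quick numerical check of the approximation, assuming a hypothetical 50 mm lens: as the object distance grows, the image distance from the thin lens equation approaches the focal length.

```python
def image_distance(d_0, f):
    """Solve the thin lens equation 1/d_0 + 1/d_i = 1/f for d_i (needs d_0 > f)."""
    return 1.0 / (1.0 / f - 1.0 / d_0)


f = 0.05  # hypothetical focal length in metres
relative_errors = []
for d_0 in (0.5, 5.0, 50.0):
    d_i = image_distance(d_0, f)
    relative_errors.append((d_i - f) / f)
    print(f"d_0 = {d_0:5.1f} m   d_i = {d_i * 1000:.2f} mm   "
          f"(d_i - f)/f = {relative_errors[-1]:.2%}")
```

The relative gap between d_i and f shrinks roughly in proportion to f/d_0, which is why a distant scene can be treated with the pinhole model.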
41
Shrinking the aperture
• Why not make the aperture as small as possible?
• Less light gets through (must increase the exposure)
• Diffraction effects…
$$A = \pi r^2 = \pi \left(\frac{D}{2}\right)^2 = \pi \left(\frac{f}{2n}\right)^2$$
• where A is the area of the aperture,
• D is the diameter of the aperture,
• f is the focal length, and
• n is the f-number (n = f/D), which is usually provided by the manufacturer
Slide by Steve Seitz
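A small sketch of the aperture-area relation, assuming the usual definition of the f-number n = f/D (so D = f/n). The function name and the lens values are hypothetical.

```python
import math


def aperture_area(f, n):
    """Area of a circular aperture for focal length f and f-number n.

    Assumes n = f / D, i.e. the aperture diameter is D = f / n.
    """
    D = f / n
    return math.pi * (D / 2.0) ** 2


# Hypothetical 50 mm lens at f/2 and f/4.
a_f2 = aperture_area(50.0, 2.0)
a_f4 = aperture_area(50.0, 4.0)
print(f"f/2: {a_f2:.1f} mm^2, f/4: {a_f4:.1f} mm^2, ratio: {a_f2 / a_f4:.1f}")
```

Doubling the f-number cuts the light-gathering area by a factor of four, which is why a smaller aperture needs a longer exposure.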
42
Shrinking the aperture
43
44
45
46
Representing a Digital Image
• It is natural to represent a digital image as a matrix
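[Figure: image matrix I with I(0,0) at the origin, column index c along the horizontal axis and row index r along the vertical axis; example element I(8,15)]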
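In code, the matrix view maps directly onto a 2D array indexed as (row, column). The sketch below uses NumPy and assumes, purely for illustration, a 9 x 16 image so that I(8,15) is the last element.

```python
import numpy as np

rows, cols = 9, 16  # hypothetical size chosen so that I(8, 15) exists
I = np.arange(rows * cols, dtype=np.uint8).reshape(rows, cols)

print(I.shape)    # (rows, columns)
print(I[0, 0])    # first row, first column
print(I[8, 15])   # row r = 8, column c = 15
```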
47
Representing a Digital Image
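[Figure: the same matrix written as f, with elements f(0,0) and f(8,15) and axes r and c, shown next to Cartesian axes x and y with origin (0,0); annotation: 90° counterclockwise rotation]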
48
Image Acquisition
49
Generating a Digital Image
• Digitization of coordinate values is called sampling
• Digitization of amplitude values is called quantization
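The two steps can be sketched on a 1D signal. This is a minimal illustration assuming amplitudes normalized to [0, 1] and uniform quantization; the function name and the test signal are made up for the example.

```python
import numpy as np


def sample_and_quantize(signal, num_samples, num_levels):
    """Digitize a continuous signal defined on [0, 1] with amplitudes in [0, 1]."""
    # Sampling: evaluate the signal at a finite set of coordinates.
    x = np.linspace(0.0, 1.0, num_samples)
    amplitudes = signal(x)
    # Quantization: map each amplitude to one of num_levels integer levels.
    levels = np.floor(amplitudes * num_levels)
    levels = np.clip(levels, 0, num_levels - 1).astype(int)
    return x, levels


x, q = sample_and_quantize(lambda t: 0.5 + 0.5 * np.sin(2 * np.pi * t),
                           num_samples=8, num_levels=4)
print(q)
```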
50
Generating a Digital Image
• How many samples to use?
• How many quantization levels to use?
51
Generating a Digital Image
52
Resolution
• Spatial resolution is the smallest discernible detail in an image
• What is the minimum width of lines W such that they are discernible in the image?
• Gray-level resolution is the smallest discernible change in gray level
53
Image Size and Resolution
256 x 192, 128 x 96, 64 x 48, 32 x 24
These images were produced simply by picking every n-th sample horizontally and vertically and replicating the value n x n times
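The subsample-and-replicate procedure can be sketched with NumPy slicing and `np.repeat`. The synthetic 256 x 192 image (192 rows, 256 columns) and the function name are assumptions for illustration.

```python
import numpy as np


def subsample_and_replicate(image, n):
    """Keep every n-th sample in both directions, then blow each one up to n x n."""
    small = image[::n, ::n]
    blocky = np.repeat(np.repeat(small, n, axis=0), n, axis=1)
    # Crop in case the original size is not a multiple of n.
    return small, blocky[:image.shape[0], :image.shape[1]]


image = np.arange(192 * 256, dtype=np.uint16).reshape(192, 256)
small, blocky = subsample_and_replicate(image, 2)
print(small.shape, blocky.shape)
```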
54
Color Components
Monochrome Image = 0.2126 R(x, y) + 0.7152 G(x, y) + 0.0722 B(x, y)
Red R(x, y), Green G(x, y), Blue B(x, y)
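The weighted sum above can be applied to a whole image at once. A minimal NumPy sketch, assuming the image is stored as a height x width x 3 array in R, G, B order; the function name is hypothetical.

```python
import numpy as np


def to_monochrome(rgb):
    """Weighted sum of the R, G, B planes using the weights from the slide."""
    weights = np.array([0.2126, 0.7152, 0.0722])
    return rgb[..., :3].astype(np.float64) @ weights


# Tiny 1 x 2 test image: one white pixel and one pure green pixel.
rgb = np.array([[[255, 255, 255], [0, 255, 0]]], dtype=np.uint8)
gray = to_monochrome(rgb)
print(gray)
```

The weights sum to 1, so a white pixel keeps its full intensity, and green contributes most to perceived brightness.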

56
Color Components
Ref: Light and Vision: LIFE Science Library
57

58
Storage Requirements
• Image of M x N pixels with 2^k gray-levels and c color components:
Size = M x N x k x c bits
Example: 512 x 512 monochrome image with 256 gray-levels
Size = 512 x 512 x 8 x 1 = 2,097,152 bits or 256 kB (1 kB = 1024 bytes)
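The storage formula is easy to check in code. The helper name `image_size_bits` is made up for this sketch, and kB and MB are taken as 1024 and 1024 x 1024 bytes.

```python
def image_size_bits(M, N, k, c=1):
    """Uncompressed size in bits: M x N pixels, k bits per sample, c components."""
    return M * N * k * c


mono_bits = image_size_bits(512, 512, 8, 1)
color_bits = image_size_bits(2048, 1536, 8, 3)

print(mono_bits, "bits =", mono_bits / 8 / 1024, "kB")
print(color_bits, "bits =", color_bits / 8 / 1024 / 1024, "MB")
```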
59
⦁ N x N x k
⦁ L = 2^k
60
Example: 2048 x 1536 color image with 256 gray-levels per channel
Size = 2048 x 1536 x 8 x 3 = 75,497,472 bits = 9,437,184 bytes, i.e. 9 MB (1 MB = 1024 x 1024 bytes)
The image was compressed on board the digital camera using JPEG compression and the file size came out to be 527 kB
61
Images at 1250, 300, 150 and 72 dpi
63
Intensity resolution
⦁ Intensity resolution: the smallest discernible change in intensity level
⦁ The number of intensity levels is usually an integer power of two (e.g. 8 bits, 16 bits)
⦁ 2^1, 2^2, 2^3, …, 2^8, …, 2^16
⦁ An 8-bit system quantizes intensity in fixed increments of 1/256 units of intensity amplitude
64
Different Number of Gray Levels
[Figure: the same image shown with 256, 32, 16, 8, 4 and 2 gray levels; label: Contouring]
65
How many gray-levels are required?
[Figure: the same image shown with 32, 64, 128 and 256 gray levels; label: Contouring…]
Digital images are usually quantized to 256 gray-levels
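Contouring is easy to reproduce by requantizing an 8-bit image to fewer levels. A minimal sketch on a synthetic gray ramp, assuming the number of levels is a power of two no larger than 256; the function name is hypothetical.

```python
import numpy as np


def reduce_gray_levels(image, levels):
    """Requantize an 8-bit image to `levels` gray levels (power of two, <= 256)."""
    step = 256 // levels
    # Integer division drops the fine detail; multiplying restores the range.
    return (image // step) * step


# Smooth horizontal ramp from 0 to 255, repeated over 32 rows.
ramp = np.tile(np.arange(256, dtype=np.uint8), (32, 1))
coarse = reduce_gray_levels(ramp, 8)
print(np.unique(coarse))
```

With only 8 levels the smooth ramp turns into 8 flat bands, and the visible steps between them are the contours.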

66
Picture Element
• Value of each location is 0 or 1
• Monochrome
67
Picture Element
• How many different colours can you represent using 24 bits?
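A short worked count, assuming the 24 bits are split as 8 bits each for the red, green and blue components:

```latex
$$2^{24} = 2^{8} \times 2^{8} \times 2^{8} = 256 \times 256 \times 256 = 16{,}777{,}216 \text{ colours}$$
```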
68
Picture Element
69
Image Features
70
Image Features
• Features
• Line/Edge
• Corner
• Pipe
• Blob
• Shape
• Object
71
Edge Detection
72
What is an Edge?
⦁ An edge is a location of rapid intensity variation
⦁ They often mark boundaries of objects, occlusion contours, shadow boundaries or surface contours
⦁ Edges are very important in human perception
73
74
Finding Shapes from Edges
75
Finding Shapes from Edges
76
Line Detection from Edges
77
Line Detection for Lane Finding
78
More Examples
79
Line Detection
80
Edge Detection by Humans
81
Discrete Derivatives in 2-D
$$f_x = \frac{\partial f}{\partial x} \approx f(x+1) - f(x), \qquad f_y = \frac{\partial f}{\partial y} \approx f(y+1) - f(y)$$
82
