
EE-381 Robotics-1

UG ELECTIVE

Lecture 11
Dr. Hafsa Iqbal
Department of Electrical Engineering,
School of Electrical Engineering and Computer Science,
National University of Sciences and Technology,
Pakistan
Robot Perception using Vision

2
Computer Vision
• State-of-the-art
• OCR (Optical Character Recognition)
• Converts scanned documents to text
• Recognizes handwritten numbers
• Face Detection
• Detects blinks for better photos
• Smile shutter: the camera waits until you smile
• Camera-based login
• Object Recognition
• Action Recognition
• Activity Recognition
3
Computer Vision
• State-of-the-art
• Localization and Mapping

4
Automotive Safety

Slide acknowledgement: Prof. Fei Fei Li’s CS131 class at Stanford

5


Applications of Computer Vision

6
Applications of Computer Vision

7
Connection to other disciplines

8
What is the goal of Computer Vision?
• “The goal of Computer Vision is to make useful
decisions about real physical objects and scenes
based on sensed images”.

[Figure: Computer Vision shown alongside Image Processing and Computer Graphics, leading to 2D & 3D Scene Understanding]


9
What is Computer Vision?

Slide acknowledgement: Prof. Fei Fei Li’s CS131 class at Stanford


10
11
What we would like to infer…

Will Person B put some money into Person C’s tip bag?
12
Why is computer vision hard?
⦁ Computers are good at numerical processing

⦁ Humans are good at perceptual processing

⦁ We want to use a computer to mimic human perception… which is complex to understand

13
The Complexity of Perception

14
15
16
Perception

Ref: Light and Vision: LIFE Science Library


17
Perception

18
What is this?

19
Images are 2D projections of 3D World

20
Recognition Helps Reorganization

21
22
Writing Programs that “See”

An Example

23
What kind of information can we extract from an image?

⦁ 3D Information
⦁ Semantic Information

25
How to infer from such complicated images?

Will Person B put some money into Person C’s tip bag?
26
Origins of Computer Vision

27
The camera
- Textbook sections 4.2.1 and 4.2.2 are left for self-study
- The digital Camera
- CCD Cameras
- CMOS Cameras
- Color Cameras

Sony Cybershot WX1

28
Pinhole camera
First described by Ibn al-Haytham
(Abu Ali al-Hasan ibn al-Hasan ibn al-Haytham)
in his seven-volume work
Kitab al-Manazir (Book of Optics).
He termed it
al-Bayt al-Muzlim ("the dark room"),
which was later translated into
Latin as “camera obscura”

http://www.ibnalhaytham.com/discover/who-was-ibn-al-haytham/
29
The first photograph on record

30
Pinhole Camera
• The lens is assumed to be a single point
• Infinitesimally small aperture
• Has infinite depth of field, i.e., everything is in focus

31
Pinhole Camera Properties: Distant objects are smaller

Slide Credit: Forsyth/Ponce http://www.cs.berkeley.edu/~daf/bookpages/slides.html and Khurram Shafique, Object Video

32
Pinhole camera model
• Pinhole model:
• Captures a pencil of rays: all rays through a single point
• The point is called Center of Projection
• The image is formed on the Image Plane

Slide by Steve Seitz
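The slides describe the pinhole model in words only. As a minimal sketch, assuming the standard ideal perspective relation x = f X / Z, y = f Y / Z (center of projection at the origin, image plane at distance f), the following shows why distant objects appear smaller. The function name `project_pinhole` and the numeric values are illustrative, not taken from the lecture.

```python
def project_pinhole(point_3d, focal_length):
    """Project a 3D point (X, Y, Z) in camera coordinates onto the image plane.

    Assumes an ideal pinhole: center of projection at the origin, optical
    axis along +Z, image plane at distance `focal_length` (same units as X, Y, Z).
    """
    X, Y, Z = point_3d
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


# The top of a 1 m tall object, seen from 2 m and from 4 m (f = 0.05 m).
near = project_pinhole((0.0, 1.0, 2.0), focal_length=0.05)
far = project_pinhole((0.0, 1.0, 4.0), focal_length=0.05)
```

Doubling the distance Z halves the projected height, which is the "distant objects are smaller" property.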


35
How do we see the world?
• Let’s design a camera
• Idea 1: put a piece of film in front of an object
• Do we get a reasonable image?

36
Pinhole camera
• Add a barrier to block off most of the rays
• This reduces blurring
• The opening is known as the aperture

37
Solution: adding a lens
• A lens focuses light onto the film
• Rays passing through the center are not deviated
• All parallel rays converge to one point on a plane located at the
focal length f

39
Thin lenses
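• Thin lens equation: $\frac{1}{d_0} + \frac{1}{d_i} = \frac{1}{f}$, where $d_0$ is the object distance, $d_i$ the image distance and $f$ the focal length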
• Any object point satisfying this equation is in focus
• This formula can also be used to estimate roughly the distance
to the object (“Depth from Focus”)
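As a small illustration of the thin lens equation, the sketch below solves it for either distance. The function names and the 50 mm / 2 m values are assumptions chosen for the example; a real depth-from-focus system would also need to decide which lens position is in focus, which is not shown here.

```python
def image_distance(d0, f):
    """Solve 1/d0 + 1/di = 1/f for the image distance di."""
    if d0 == f:
        raise ValueError("object at the focal point: image forms at infinity")
    return d0 * f / (d0 - f)


def object_distance(di, f):
    """Solve the same equation for the object distance d0 ("depth from focus")."""
    if di == f:
        raise ValueError("sensor at the focal plane: object is at infinity")
    return di * f / (di - f)


# A 50 mm lens focused on an object 2 m away (all lengths in mm).
di = image_distance(2000.0, 50.0)
d0 = object_distance(di, 50.0)
```

Here `di` comes out slightly larger than the focal length, and feeding it back recovers the 2 m object distance up to rounding.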

40
Pin-hole approximation
• If the object is far away from the lens, the whole lens will
appear (in relative terms) as a pinhole.
$\frac{1}{d_0} + \frac{1}{d_i} = \frac{1}{f}$
$d_0 \gg d_i \Rightarrow \frac{1}{d_i} \approx \frac{1}{f} \Rightarrow d_i \approx f$
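The approximation can be checked numerically. This sketch, with an assumed focal length of 0.05 m and three arbitrary object distances, computes how far the image distance deviates from f as the object moves away.

```python
def image_distance(d0, f):
    """Image distance from the thin lens equation 1/d0 + 1/di = 1/f."""
    return 1.0 / (1.0 / f - 1.0 / d0)


f = 0.05  # focal length in metres (illustrative value)
# Relative deviation of di from f for objects at 0.5 m, 5 m and 50 m.
deviations = [(image_distance(d0, f) - f) / f for d0 in (0.5, 5.0, 50.0)]
```

The deviation shrinks roughly in proportion to f / d0, so for distant objects the image plane can be treated as sitting at the focal length.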

41
Shrinking the aperture
• Why not make the aperture as small as possible?
• Less light gets through (must increase the exposure)
• Diffraction effects…

$A = \pi r^2 = \pi \left(\frac{D}{2}\right)^2 = \pi \left(\frac{f}{2n}\right)^2$
• where A is the area of the aperture,
• D is the diameter of the aperture,
• f is the focal length, and
• n is the f-number (n = f / D), which is usually specified by the manufacturer
Slide by Steve Seitz
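A minimal sketch of the aperture area calculation, assuming the usual definition of the f-number n = f / D so that D = f / n. The function name and the 50 mm lens are illustrative.

```python
import math


def aperture_area(focal_length, f_number):
    """Area of a circular aperture, using D = f / n and A = pi * (D / 2) ** 2."""
    diameter = focal_length / f_number
    return math.pi * (diameter / 2.0) ** 2


# 50 mm lens: stopping down from f/2 to f/4 halves D and quarters the area.
area_f2 = aperture_area(50.0, 2.0)
area_f4 = aperture_area(50.0, 4.0)
```

Because the area falls with the square of the f-number, each doubling of n admits one quarter of the light, which is why a smaller aperture needs a longer exposure.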

42
Shrinking the aperture

43
44
45
46
Representing a Digital Image
• It is natural to represent a digital image as a matrix

[Figure: matrix I indexed by row r and column c, with elements I(0,0) and I(8,15) marked]
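The matrix view maps directly onto a nested list indexed as image[r][c]. The pixel values below are made up for illustration; a practical program would more likely use an array library.

```python
# A tiny 3 x 4 gray-level image stored row by row: image[r][c].
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
]

rows = len(image)        # number of rows (r runs from 0 to rows - 1)
cols = len(image[0])     # number of columns (c runs from 0 to cols - 1)

top_left = image[0][0]   # I(0,0)
pixel = image[2][1]      # I(2,1): row 2, column 1
```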
47
Representing a Digital Image

[Figure: image f with origin (0,0) at f(0,0), row index r along x and column index c along y, example element f(8,15); annotated "90 cc rotation"]

48
Image Acquisition

49
Generating a Digital Image

• Digitization of
coordinate values is
called sampling

• Digitization of
amplitude values is
called quantization
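The two steps can be sketched on a one-dimensional signal. The helper `sample_and_quantize`, the raised sine wave, and the choice of 8 samples and 4 levels are all assumptions made for this example.

```python
import math


def sample_and_quantize(signal, num_samples, num_levels):
    """Sample `signal` on [0, 1) and quantize amplitudes in [0, 1] to integer levels.

    `signal` is any function mapping a coordinate in [0, 1) to an amplitude in [0, 1].
    """
    samples = []
    for i in range(num_samples):
        x = i / num_samples                      # sampling: discrete coordinates
        amplitude = signal(x)
        level = min(int(amplitude * num_levels), num_levels - 1)  # quantization
        samples.append(level)
    return samples


# One period of a raised sine wave, 8 samples, 4 gray levels (0..3).
wave = lambda x: 0.5 + 0.5 * math.sin(2 * math.pi * x)
digital = sample_and_quantize(wave, num_samples=8, num_levels=4)
```

Increasing `num_samples` refines the coordinate grid, while increasing `num_levels` refines the amplitude steps; the two choices are independent.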

50
Generating a Digital Image

• How many samples to use?
• How many quantization levels to use?

51
Generating a Digital Image

52
Resolution
• Spatial resolution is the smallest discernible detail in an image

• What is the minimum width W of lines such that they are discernible in the image?

• Gray-level resolution is the smallest discernible change in gray level

53
Image Size and Resolution

256 x 192, 128 x 96, 64 x 48, 32 x 24

These images were produced simply by picking every n-th sample horizontally and vertically and replicating the value n x n times
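The procedure described above can be sketched in a few lines. The function name and the 4 x 4 test image are illustrative, and the sketch assumes the image dimensions are multiples of n.

```python
def subsample_and_replicate(image, n):
    """Keep every n-th pixel in each direction, then repeat it n x n times.

    `image` is a list of equal-length rows whose dimensions are multiples of n.
    The output has the same size as the input but a lower effective resolution.
    """
    out = []
    for r in range(0, len(image), n):
        kept = image[r][::n]                                 # every n-th sample in the row
        wide = [value for value in kept for _ in range(n)]   # replicate horizontally
        out.extend([list(wide) for _ in range(n)])           # replicate vertically
    return out


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
blocky = subsample_and_replicate(image, 2)
```

With n = 2 each 2 x 2 block of the output carries a single value from the input, which produces the blocky look of the low-resolution images.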

54
Color Components
Monochrome Image
0.2126 R(x, y) + 0.7152 G(x, y) + 0.0722 B(x, y)

Red R(x, y) Green G(x, y) Blue B(x, y)
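The weighted sum above can be applied pixel by pixel. In this sketch the image is a nested list of (R, G, B) tuples, which is an assumed representation chosen to keep the example self-contained.

```python
def to_monochrome(rgb_image):
    """Convert an RGB image (rows of (R, G, B) tuples) to gray levels.

    Uses the weights from the slide: 0.2126 R + 0.7152 G + 0.0722 B.
    """
    return [
        [0.2126 * r + 0.7152 * g + 0.0722 * b for (r, g, b) in row]
        for row in rgb_image
    ]


rgb = [[(255, 0, 0), (0, 255, 0)],
       [(0, 0, 255), (255, 255, 255)]]
gray = to_monochrome(rgb)
```

The weights sum to 1, so white stays at full intensity, and pure green maps to a much brighter gray than pure blue.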


56
Color Components

Ref: Light and Vision: LIFE Science Library

57


58
Storage Requirements
• Image of M x N pixels with $2^k$ gray levels and c color components:

Size = M x N x k x c bits

Example: 512 x 512 monochrome image with 256 gray levels (k = 8)

Size = 512 x 512 x 8 x 1 = 2,097,152 bits or 256 kB

59
Storage Requirements

⦁ N x N x k
⦁ $L = 2^k$

60
Storage Requirements

Example: 2048 x 1536 color image with 256 gray levels per channel

Size = 2048 x 1536 x 8 x 3 = 75,497,472 bits or 9 MB (9,437,184 bytes)

The image was compressed on board the digital camera using JPEG
compression and the file size came out to be 527kB
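The storage formula Size = M x N x k x c is easy to check in code. The sketch below reproduces the two examples from the slides; the function name is an assumption, and the byte conversions use units of 1024 and 1024 x 1024 bytes.

```python
def image_size_bits(m, n, k, c=1):
    """Uncompressed size in bits of an M x N image with 2**k levels and c channels."""
    return m * n * k * c


mono_bits = image_size_bits(512, 512, k=8, c=1)
color_bits = image_size_bits(2048, 1536, k=8, c=3)
mono_kib = mono_bits / 8 / 1024          # bytes -> units of 1024 bytes
color_mib = color_bits / 8 / 1024 ** 2   # bytes -> units of 1024 * 1024 bytes
```

Comparing the uncompressed color size with the 527 kB JPEG file gives a rough feel for how much the on-camera compression saved.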

61
Images at 1250, 300, 150 and 72 dpi

63
Intensity resolution
⦁ Intensity resolution: the smallest discernible change in intensity level

⦁ The number of intensity levels is usually an integer power of two (e.g. 8 bits, 16 bits)

⦁ $2^1, 2^2, 2^3, \ldots, 2^8, \ldots, 2^{16}$

⦁ An 8-bit system quantizes intensity in fixed increments of 1/256 units of intensity amplitude

64
Different Number of Gray Levels

[Figure: the same image at 256, 32, 16, 8, 4 and 2 gray levels, with contouring visible]

65
How many gray-levels are required?
[Figure: contouring in the same image at 32, 64, 128 and 256 gray levels]

Digital Images are usually quantized to 256 gray-levels
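Reducing the number of gray levels can be sketched as integer binning. The function below is one simple scheme, assuming an 8-bit input and a level count that divides 256; it is not necessarily how the slide images were produced.

```python
def reduce_gray_levels(image, levels):
    """Requantize an 8-bit image (values 0..255) to `levels` gray levels.

    Each pixel is mapped to the lower edge of its quantization bin, so the
    output still uses the 0..255 range but only `levels` distinct values.
    """
    step = 256 // levels        # width of each bin; levels must divide 256
    return [[(value // step) * step for value in row] for row in image]


ramp = [list(range(0, 256, 32))]        # one row: 0, 32, 64, ..., 224
coarse = reduce_gray_levels(ramp, 4)    # bins of width 64
```

On a smooth ramp, neighbouring values collapse into the same bin, and the resulting flat bands with abrupt steps are what appears as contouring.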


66
Picture Element

• Value of each location is 0 or 1

• Monochrome

67
Picture Element
• How many different colours can you represent using 24 bits?
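The count follows from the number of distinct bit patterns. The split into three 8-bit channels below is the common RGB layout and is assumed here for illustration; the total depends only on the 24 bits.

```python
bits_per_channel = 8
channels = 3                       # red, green, blue
total_bits = bits_per_channel * channels
num_colours = 2 ** total_bits      # one colour per distinct bit pattern
```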

68
Picture Element

69
Image Features

70
Image Features
• Features

• Line/Edge

• Corner

• Pipe

• Blob

• Shape

• Object

71
Edge Detection

72
What is an Edge?
⦁ An edge is a location of rapid intensity variation

⦁ They often mark boundaries of objects, occlusion contours, shadow boundaries or surface contours

⦁ Edges are very important in human perception

73
74
Finding Shapes from Edges

75
Finding Shapes from Edges

76
Line Detection from Edges

77
Line Detection for Lane Finding

78
More Examples

79
Line Detection

80
Edge Detection by Humans

81
Discrete Derivatives in 2-D

f f
fx   f (x 1)  f (x) fy   f ( y 1)  f ( y)
x y

82
