
LECTURE (01)

Introduction to Computer Vision


It’s The First Lecture
❑ Introduction to computer vision
❑ Course overview

Course Details
❑ Course Description:
▪ Computer vision systems are becoming increasingly important for solving problems in a
variety of areas, including manufacturing and surveillance. It is therefore important
for future computer engineering graduates to have solid knowledge of this field. To
this end, the course “Computer Vision” provides students with both theoretical
knowledge and practical experience with fundamental and advanced computer
vision algorithms. Topics range from basic image processing techniques, such as
image convolution and region and edge detection, to more complex vision algorithms
for contour detection, depth perception, dynamic vision, and object recognition.
Core topics such as color processing, texture analysis, and visual geometry are also
covered. In programming assignments, students gain practical insight into the
development of vision applications by implementing computer vision algorithms in
Python or MATLAB. The final project is the development
of their own computer vision program that solves a given problem, for example
a simple object recognition task.

Learning Objectives
❑ Upon completion of this course, students will:
▪ Be familiar with both the theoretical and practical aspects of computing
with images;
▪ Have described the foundation of image formation, measurement, and
analysis;
▪ Have implemented common methods for robust image matching and
alignment;
▪ Understand the geometric relationships between 2D images and the 3D
world;
▪ Have gained exposure to object and scene recognition and categorization
from images;
▪ Grasp the principles of state-of-the-art deep neural networks; and
▪ Have developed the practical skills necessary to build computer vision
applications.

Useful References
❑ Reference Books
▪ Rafael C. Gonzalez and Richard E. Woods, “Digital Image Processing”, 3rd edition, Pearson, 2007.
▪ Richard Szeliski, “Computer Vision: Algorithms and Applications”, Springer, 2010-11.

❑ Websites and additional reading materials
▪ Will be posted on the course website.

Grading
❑ Grading:
▪ Midterm:
▪ Quizzes: 50
▪ Assignments:
▪ Project:
▪ Final: 50

Computer Vision
❑ Computer Vision is the science of building systems that can extract
certain task-relevant information from a visual scene.
❑ Such systems are used in applications such as optical character
recognition, analysis of satellite and microscopic images, magnetic
resonance imaging, surveillance, identity verification, and quality control in
manufacturing.

Computer Vision
❑ In a way, computer vision can be considered the inverse of computer
graphics.
❑ A computer graphics system receives as its input the formal description
of a visual scene, and its output is a visualization of that scene.
❑ A computer vision system receives as its input a visual scene, and its
output is a formal description of that scene with regard to the system’s
task.
❑ Unfortunately, while a computer graphics task allows only one solution,
computer vision tasks are often ambiguous, and it can be unclear what the
correct output should be.
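
One source of this ambiguity can be illustrated with a minimal sketch. It assumes an idealized pinhole camera, which these slides do not define; the function name `project`, the focal length, and the sample points are hypothetical. Rendering (graphics) maps each 3D point to exactly one pixel position, but two different 3D points can land on the same position, so the inverse direction (vision) has no unique answer.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the image plane.

    Assumed model: x = f * X / Z, y = f * Y / Z, with Z > 0.
    """
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# Two different scene points that lie on the same viewing ray.
near_point = (1.0, 0.5, 2.0)   # 2 units in front of the camera
far_point = (3.0, 1.5, 6.0)    # three times farther away

print(project(near_point))  # (0.5, 0.25)
print(project(far_point))   # (0.5, 0.25)
```

Both points produce the same image coordinates, so the image alone cannot tell a small, near object from a large, distant one.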

What is (computer) vision?

[Figure: Scope of computer vision and its related fields: artificial intelligence, machine learning, deep learning, robotics, human-computer interaction, image processing, graphics, neuroscience, computational photography, medical imaging, optics]

The goal of computer vision
❑ To bridge the gap between pixels and
“meaning”

Computer Vision
❑ A simple two-stage model of computer vision:
Bitmap image → Image processing → Scene analysis → Scene description
▪ Image processing: prepares the image for scene analysis
▪ Scene analysis: builds an iconic model of the world
▪ Feedback (tuning) loop between the stages

Computer Vision
❑ The image processing stage prepares the input image for the
subsequent scene analysis.
❑ Usually, image processing results in one or more new images that
contain specific information on relevant features of the input image.
❑ The information in the output images is arranged in the same way as in
the input image. For example, in the upper left corner in the output
images we find information about the upper left corner in the input
image.
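
As a minimal sketch of such an image-to-image step, the following pure-Python example applies a central-difference filter along each row. The filter choice, the function name `horizontal_gradient`, and the tiny test image are assumptions made for illustration, not the course's prescribed method. The point is that the output has the same size and layout as the input: each output pixel describes the same location in the input image.

```python
def horizontal_gradient(image):
    """Return an image of the same size holding |I(r, c+1) - I(r, c-1)|.

    Border columns are left at 0 so that the output keeps the input layout.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(1, cols - 1):
            out[r][c] = abs(image[r][c + 1] - image[r][c - 1])
    return out


# Dark region on the left, bright region on the right.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

edges = horizontal_gradient(image)
for row in edges:
    print(row)  # [0, 0, 9, 9, 0, 0] for every row
```

The large values in columns 2 and 3 mark the vertical edge at the same position where it appears in the input.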

Computer Vision
❑ The scene analysis stage interprets the results from the image
processing stage.
❑ Its output completely depends on the problem that the computer vision
system is supposed to solve.
❑ For example, it could be the number of bacteria in a microscopic
image, or the identity of a person whose retinal scan was input to the
system.
❑ In the following lectures we will focus on the lower-level stage, i.e., image
processing techniques.
❑ Later we will discuss a variety of scene analysis methods and
algorithms.
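
The bacteria-counting example can be sketched as follows, assuming the image processing stage has already produced a binary image in which 1 marks bacteria pixels. Counting 4-connected regions with a flood fill is one simple way to do this, not necessarily the method used later in the course; `count_blobs` and the sample image are hypothetical.

```python
from collections import deque


def count_blobs(binary):
    """Count 4-connected regions of 1s in a binary image (list of lists)."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # New, unvisited foreground pixel: flood-fill its whole region.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


binary = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]

print(count_blobs(binary))  # 3
```

Unlike the image processing stage, the output here is a single number rather than another image.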
Computer Vision
❑ Have you ever used computer vision?
❑ How? Where?

▪ Laptop: biometric auto-login (face recognition, 3D), OCR
▪ Smartphones: QR codes, computational photography (Android Lens Blur, iPhone Portrait
Mode), panorama construction (Google Photo Spheres), face detection, expression
detection (smile), Snapchat filters (face tracking), Google Tango (3D reconstruction),
Night Sight (Pixel)
▪ Web: image search, Google Photos (face recognition, object recognition, scene
recognition, geolocalization from vision), Facebook (image captioning), Google Maps
aerial imaging (image stitching), YouTube (content categorization)
▪ VR/AR: outside-in tracking (HTC VIVE), inside-out tracking (simultaneous localization
and mapping, HoloLens), object occlusion (dense depth estimation)
▪ Motion: Kinect, full-body skeleton tracking, gesture recognition, virtual try-on
▪ Medical imaging: CAT / MRI reconstruction, assisted diagnosis, automatic pathology,
connectomics, endoscopic surgery
▪ Industry: vision-based robotics (marker-based), machine-assisted router (jig),
automated post, ANPR (number plates), surveillance, drones, shopping
▪ Transportation: assisted driving (everything), face tracking and iris dilation to detect
drunkenness and drowsiness, automated distribution (all modes)
▪ Media: visual effects for film and TV (reconstruction), virtual sports replay
(reconstruction), semantics-based auto edits (reconstruction, recognition)

The Four Rs of Computer Vision
▪ R1: Reconstruction
▪ R2: Registration
▪ R3: Reorganization
▪ R4: Recognition

R1: Reconstruction
❑ In computer vision, 3D reconstruction is the process of capturing the
shape and appearance of real objects.
❑ Multiview Geometry, 3D Vision, Shape-from-X

[Figure: left view and right view]
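
One common way to recover depth from a left and a right view is triangulation from disparity. The sketch below assumes a rectified stereo pair with a known focal length (in pixels) and baseline (in meters), so that depth is Z = f · B / d. The slides do not give this setup; the function name and all numeric values are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth of a point seen in a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


# The same scene point appears at column 440 in the left view
# and at column 400 in the right view (assumed values).
x_left, x_right = 440.0, 400.0
disparity = x_left - x_right

depth = depth_from_disparity(disparity, focal_length_px=800.0, baseline_m=0.5)
print(depth)  # 10.0 (meters)
```

Nearby points shift more between the two views (large disparity), while distant points shift less, which is why depth is inversely proportional to disparity.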

R2: Registration
❑ Image registration is the process of transforming different sets of data
into one coordinate system. Data may be multiple images, data from
different sensors, times, depths, or viewpoints.
❑ Tracking, Alignment, Optical Flow, Correspondence
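
As a minimal sketch of registration, the example below searches for the integer translation that best aligns one image with another by minimizing the mean squared difference over the overlapping pixels. Restricting the transform to small integer shifts and using exhaustive search are simplifying assumptions; `estimate_shift`, `pattern`, and the synthetic images are hypothetical.

```python
def estimate_shift(reference, moving, max_shift=2):
    """Find the integer (dy, dx) such that moving[r][c] matches reference[r+dy][c+dx].

    Scores each candidate by the mean squared difference over the overlap.
    """
    rows, cols = len(reference), len(reference[0])
    best_score, best_shift = None, None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, n = 0, 0
            for r in range(rows):
                for c in range(cols):
                    rr, cc = r + dy, c + dx
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += (reference[rr][cc] - moving[r][c]) ** 2
                        n += 1
            score = total / n
            if best_score is None or score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift


def pattern(r, c):
    """Synthetic intensity pattern used only to build test images."""
    return r * r + 3 * c * c


size = 6
reference = [[pattern(r, c) for c in range(size)] for r in range(size)]
# `moving` shows the same pattern displaced: pixel (r, c) equals reference (r+1, c+2).
moving = [[pattern(r + 1, c + 2) for c in range(size)] for r in range(size)]

print(estimate_shift(reference, moving))  # (1, 2)
```

Once the shift is known, both images can be expressed in the reference image's coordinate system, which is the goal of registration.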

R3: Reorganization
❑ Clustering, Unsupervised Learning, Segmentation, Perceptual
Organization
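
A small illustration of reorganization is segmenting an image by clustering its pixel intensities without any labels. The sketch below uses k-means with k = 2 on gray values, initialized deterministically from the darkest and brightest pixel. This is one simple choice among many; the function name and the sample image are hypothetical.

```python
def two_means_segmentation(image, iterations=10):
    """Cluster gray values into two groups and return (labels, centers).

    Label 0 marks pixels closer to the dark center, label 1 the bright one.
    """
    pixels = [v for row in image for v in row]
    low, high = float(min(pixels)), float(max(pixels))
    for _ in range(iterations):
        dark = [v for v in pixels if abs(v - low) <= abs(v - high)]
        bright = [v for v in pixels if abs(v - low) > abs(v - high)]
        if not dark or not bright:
            break  # only one cluster is populated, e.g. a constant image
        low, high = sum(dark) / len(dark), sum(bright) / len(bright)
    labels = [
        [0 if abs(v - low) <= abs(v - high) else 1 for v in row]
        for row in image
    ]
    return labels, (low, high)


image = [
    [10, 12, 11, 200],
    [9, 13, 198, 205],
    [11, 201, 199, 202],
]

labels, centers = two_means_segmentation(image)
for row in labels:
    print(row)
# [0, 0, 0, 1]
# [0, 0, 1, 1]
# [0, 1, 1, 1]
```

The pixels are grouped into a dark and a bright region purely from the data, which is what makes this an unsupervised method.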

R4: Recognition
❑ Image recognition, in the context of machine vision, is the ability to
identify objects, places, people, writing and actions in images.
❑ Verification, Identification, Detection.
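
The difference between verification and identification can be sketched with a nearest-neighbor comparison of feature vectors. The example assumes each person is already represented by a small feature vector; how such features are extracted is outside this sketch. The gallery, the names, and the threshold are hypothetical. Detection (finding where an object is in an image) is not covered here.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(probe, gallery):
    """Identification: which enrolled identity is closest to the probe?"""
    return min(gallery, key=lambda name: distance(probe, gallery[name]))


def verify(probe, claimed_name, gallery, threshold=1.0):
    """Verification: is the probe close enough to the claimed identity?"""
    return distance(probe, gallery[claimed_name]) <= threshold


# Hypothetical enrolled feature vectors.
gallery = {"alice": (0.0, 0.0), "bob": (3.0, 4.0)}
probe = (0.3, 0.4)

print(identify(probe, gallery))         # alice
print(verify(probe, "alice", gallery))  # True
print(verify(probe, "bob", gallery))    # False
```

Identification is a one-to-many comparison, while verification is a one-to-one comparison against a claimed identity.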

Companies Using Computer Vision

Computer Vision Applications

Optical character recognition (OCR)
❑ Technology to convert images of text into text.
▪ If you have a scanner, it probably came with OCR software

▪ Live camera translation
▪ Mail digit recognition, AT&T Labs: http://www.research.att.com/~yann/
▪ License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

Face detection
❑ Almost all digital cameras detect faces
❑ Snapchat face filters

Smile detection

Sony Cyber-shot® T70 Digital Still Camera


Object recognition (in supermarkets)

How does it work?

Vision-based biometrics

“How the Afghan Girl was Identified by Her Iris Patterns”


Read the story (Wikipedia)

Facial login without a password…


Object recognition (in mobile phones)
e.g., Google Lens

Human shape capture


Special effects: motion capture

Interactive Games
Object Recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY

Medical imaging

▪ 3D imaging: MRI, CT
▪ Image-guided surgery (Grimson et al., MIT)

Self-driving cars: Uber bought CMU’s lab

Industrial robots

Vision-guided robots position nut runners on wheels

Vision in Spaaaaace

NASA's Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.

Vision systems (JPL) are used for several tasks:
• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.

Mobile Robots
▪ STAIR at Stanford (Saxena et al., 2008)
▪ Skydio 2 drone: six fisheye cameras for obstacle avoidance, onboard NVIDIA GPU

Augmented Reality

Virtual Reality
