
INVICTA 2020

Image Processing and Computer Vision Workshop

ROBOTICS CLUB
IIT GUWAHATI
Topics we are going to cover:
● Introduction to computer vision
● Understanding Image and colour model
● Basics of OpenCV
● Geometric transformation of image
● Drawing and writing texts on image
● Contours
● Edge and corner detection
● Object detection
● Colour recognition
● Introduction to Deep Learning
What is Computer Vision?
“Computer Vision is an interdisciplinary scientific field that deals with how computers can be
made to gain high-level understanding from digital images or videos.” -Wikipedia
01 FACE RECOGNITION
02 OBJECT RECOGNITION
03 POSE ESTIMATION
04 GESTURE RECOGNITION
What is Image Processing then?
“A method to perform some operations on an image to extract useful information from it”
Seems daunting? We will break all of it down for you, using OpenCV of course!
OpenCV is a library which provides functions to perform all of these complex operations on images!
WHAT IS AN IMAGE?
There are two types of images, based on how they are recorded:
1. Analog Image : Analog images are the type of images that we, as humans, look at. They also include such things
as photographs, paintings, TV images, and all of our medical images recorded on film or displayed on various
display devices. What we see in an analog image is various levels of brightness (or film density) and colors. It is
generally continuous and not broken into many small individual pieces.

2. Digital Image : A digital image is composed of a finite number of elements, each of which has a particular
location and value. These elements are referred to as pels, pixels, picture elements or image elements. A digital
image is a numeric representation, normally binary, of a two-dimensional image.

Most commonly used types of images:

1. Grayscale images

2. RGB images
A Grayscale Image

A grayscale or greyscale image is one in which the value of each pixel is a single sample representing only an amount of light, that is, it carries only intensity information. Grayscale images, a kind of black-and-white or gray monochrome, are composed exclusively of shades of gray. The contrast ranges from black at the weakest intensity to white at the strongest.
Color Image

A (digital) color image is a digital image that includes color information for each
pixel.

A color image can be represented with several color models; two commonly used ones are:

1. RGB

2. HSV
RGB Color Model
The RGB color model is an additive color model in which red, green and blue light
are added together in various ways to reproduce a broad array of colors
RGB Color Model
An RGB image consists of three matrices, one for each primary color (red, green and blue), holding the intensity of that color at every pixel.
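As a small illustration, here is a minimal sketch (assuming an image file such as 'messi5.jpg' is available, and using functions introduced later in this workshop) showing that an image loaded by OpenCV is simply a NumPy array of per-pixel intensities, stored in B, G, R channel order:

import cv2

# Load a color image; OpenCV stores the channels in B, G, R order
img = cv2.imread('messi5.jpg')        # NumPy array of shape (height, width, 3)

# Split into the three intensity matrices described above
b, g, r = cv2.split(img)              # each channel has shape (height, width)

# Intensity of the red channel at row 100, column 50
print(r[100, 50])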
HSV Color Model
The HSV color wheel is sometimes depicted as a cone or cylinder, but always with
these three components HSV (hue, saturation, value):

Hue : Hue is the color portion of the color model, expressed as a number from 0 to
360 degrees.

Saturation : Saturation is the amount of gray in the color, from 0 to 100 percent.
Reducing the saturation toward zero to introduce more gray produces a faded
effect. Sometimes, saturation is expressed in a range from just 0–1, where 0 is gray
and 1 is a primary color.

Value (or Brightness) : Value works in conjunction with saturation and describes the
brightness or intensity of the color, from 0–100 percent, where 0 is completely
black, and 100 is the brightest and reveals the most color.
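Note that for 8-bit images OpenCV scales these ranges: H is stored as 0-179 (degrees divided by 2) and S, V as 0-255. A minimal sketch converting a single BGR color to HSV (the pure-green pixel is just an example value):

import cv2
import numpy as np

# A single pure-green pixel in B, G, R order
green_bgr = np.uint8([[[0, 255, 0]]])

# OpenCV stores H in [0, 179], S and V in [0, 255] for 8-bit images
green_hsv = cv2.cvtColor(green_bgr, cv2.COLOR_BGR2HSV)
print(green_hsv)    # [[[ 60 255 255]]] -> hue 60 corresponds to 120 degrees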
What does a video consist of?
Digital video is an electronic representation of moving visual images (video) in the form of
encoded digital data. This is in contrast to analog video, which represents moving visual
images with analog signals. Digital video comprises a series of digital images displayed in
rapid succession.

Frame Rate : Frame rate is the number of images shown in succession per second, i.e. the speed at which frames are displayed.

The higher the frame rate, the higher the video quality and the more storage the video requires.

Commonly used frame rates: 24, 30, 60 (also 25, 50 and many more...)
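As a quick sketch (assuming a video file such as 'car.mp4' exists), cv2.VideoCapture, which is covered later in this workshop, can report these properties directly:

import cv2

cap = cv2.VideoCapture('car.mp4')               # open a saved video file

fps = cap.get(cv2.CAP_PROP_FPS)                 # frames shown per second
frames = cap.get(cv2.CAP_PROP_FRAME_COUNT)      # total number of frames

if fps > 0:
    print('Frame rate:', fps)
    print('Duration (s):', frames / fps)
cap.release()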
Getting Started with Images
What is OpenCV?
❏ OpenCV (Open Source Computer Vision Library) is an open
source computer vision and machine learning software library.
❏ It is used for all sorts of image and video analysis, like facial
recognition and detection, license plate reading, photo editing,
advanced robotic vision, optical character recognition, and a
whole lot more.
❏ We will be using Python as the programming language, since other
libraries like NumPy, Matplotlib and SciPy make our task easier.
Important Functions

cv2.imread()

Used to read an image from storage.

1st argument is the name or path of the image and 2nd argument is a flag:
1 loads a color image, 0 loads it in grayscale, -1 loads it unchanged.

Example - cv2.imread('messi5.jpg', 0)

Example - cv2.imread('C:\\Users\\Desktop\\messi.jpg', 0)

Note: cv2.imread() reads only from the local file system; it cannot load an image directly from a URL.


Important Functions

cv2.imshow()

Used to display an image in a window.

1st argument is the window name and 2nd argument is the image to display.

Example - cv2.imshow('image', img)
Important Functions

cv2.imwrite()

Used to save an image.

1st argument is the filename you want to save the image as and 2nd argument is the image itself.

Example - cv2.imwrite('messi.png', img)


Important Functions

cv2.waitKey()

It is a keyboard binding function. Its argument is the time in milliseconds: the function waits that long for a key press, and waits indefinitely if 0 is passed.

cv2.destroyAllWindows()

It simply destroys all the windows we created.
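Putting these functions together, a minimal sketch (assuming an image file named 'messi5.jpg' exists in the working directory):

import cv2

img = cv2.imread('messi5.jpg', 0)      # flag 0 -> read as grayscale

cv2.imshow('image', img)               # show it in a window named 'image'
key = cv2.waitKey(0)                   # 0 -> wait indefinitely for a key press

if key == ord('s'):                    # press 's' to save a copy
    cv2.imwrite('messi_gray.png', img)

cv2.destroyAllWindows()                # close all OpenCV windows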


Getting Started with Videos
Important Functions

cv2.VideoCapture()
To capture a video, you need to create a VideoCapture object.
Its argument can be either the device index or the name of a video file.
● cv2.VideoCapture('car.mp4')
→ used to play a saved video file
● cv2.VideoCapture(0)
→ used to take video from the laptop webcam
● cv2.VideoCapture(1)
→ used to take video from an external camera
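A minimal capture loop might look like this (a sketch assuming a webcam at device index 0):

import cv2

cap = cv2.VideoCapture(0)              # 0 = default webcam

while True:
    ret, frame = cap.read()            # ret is False if no frame was grabbed
    if not ret:
        break

    cv2.imshow('webcam', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):   # press 'q' to quit
        break

cap.release()                          # free the camera
cv2.destroyAllWindows()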
Important Functions

cv2.cvtColor(input_image, cv2.COLOR_BGR2GRAY)
used to convert a BGR image to a grayscale image

cv2.cvtColor(input_image, cv2.COLOR_BGR2HSV)
used to convert a BGR image to an HSV image
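For example, converting a loaded image (filename assumed) to both color spaces:

import cv2

img = cv2.imread('messi5.jpg')                   # loaded in BGR order

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)     # single-channel intensity image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)       # hue, saturation, value channels

print(img.shape, gray.shape, hsv.shape)          # (H, W, 3)  (H, W)  (H, W, 3)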
Geometric Transformation of Image
Learn to apply different geometric transformations to images,
like scaling, translation, rotation, affine transformation, etc.
SCALING
❏ Scaling is just resizing of the image.
❏ OpenCV comes with a function cv2.resize() for this purpose.
❏ Syntax - cv2.resize(src, dsize, fx(opt), fy(opt), interpolation)
❏ The size of the image can be specified manually, or you can specify the scaling factor.
Parameters:-
❏ Src - Input image
❏ dsize - output image size; if it equals zero, it is computed as:
dsize = Size(round(fx*src.cols), round(fy*src.rows))
❏ fx - scale factor along the horizontal axis; when it equals 0, it is computed as
fx=(double)dsize.width/src.cols
SCALING

❏ fy - scale factor along the vertical axis; when it equals 0, it is computed as


fy= (double)dsize.height/src.rows
❏ Interpolation - interpolation method
❏ Preferable interpolation methods are cv2.INTER_AREA for shrinking and cv2.INTER_CUBIC
(slow) & cv2.INTER_LINEAR (by default) for zooming.
❏ Example :- res = cv2.resize(img, None, fx=2, fy=2, interpolation = cv2.INTER_CUBIC)
res = cv2.resize(img, (2*width, 2*height), interpolation = cv2.INTER_CUBIC)
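A runnable sketch of both approaches (scale factor vs. explicit output size), assuming an image file is available:

import cv2

img = cv2.imread('messi5.jpg')
height, width = img.shape[:2]          # note: shape is (rows, cols, channels)

# Upscale by a factor of 2 using the scale-factor form
res1 = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)

# Same result, giving the output size explicitly as (width, height)
res2 = cv2.resize(img, (2 * width, 2 * height), interpolation=cv2.INTER_CUBIC)

# Shrinking: INTER_AREA usually gives the best result
small = cv2.resize(img, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)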
TRANSLATION

❏ Translation is the shifting of object’s location.


❏ If we know the shift in the (x,y) direction, say tx and ty, then the transformation matrix
is:

M = [ 1  0  tx ]
    [ 0  1  ty ]
❏ we can use the cv2.warpAffine() function to implement these translations.


❏ We can form a transformation matrix using numpy and pass that into the cv2.warpAffine()
function. Example:- cv2.warpAffine(img , M, output_image_size)
** output_image_size should be in the form of (width, height)
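A minimal sketch, shifting an image 100 pixels right and 50 pixels down (the shift values are chosen only for illustration):

import cv2
import numpy as np

img = cv2.imread('messi5.jpg')
rows, cols = img.shape[:2]

tx, ty = 100, 50                             # shift right by 100 px, down by 50 px
M = np.float32([[1, 0, tx],
                [0, 1, ty]])                 # 2x3 translation matrix

shifted = cv2.warpAffine(img, M, (cols, rows))   # output size is (width, height)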
ROTATION

❏ Rotation of an image by an angle θ is achieved by a transformation matrix of the form:

M = [ cosθ  -sinθ ]
    [ sinθ   cosθ ]
❏ To find this transformation matrix, OpenCV provides a function, “cv2.getRotationMatrix2D()”.


❏ Create a transformation matrix using cv2.getRotationMatrix2D() function and pass that into
cv2.warpAffine().
❏ Syntax : cv2.getRotationMatrix2D(centre, angle, scale)
**+ve angle - counter-clockwise (the coordinate origin is assumed to be the top-left corner)
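A minimal sketch, rotating an image 90 degrees counter-clockwise about its centre:

import cv2

img = cv2.imread('messi5.jpg')
rows, cols = img.shape[:2]

# Arguments: centre, angle (+ve = counter-clockwise), scale
M = cv2.getRotationMatrix2D((cols / 2, rows / 2), 90, 1)

rotated = cv2.warpAffine(img, M, (cols, rows))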
Drawing on Image
We're going to be covering how to draw various shapes
on our images and videos.
Drawing a line:- To draw a line, you need to pass starting and
ending coordinates of line.

Syntax: cv2.line(image, start_point, end_point, color, thickness)

Drawing a rectangle:- To draw a rectangle, you need the top-left corner and bottom-right corner of the rectangle.

Syntax: cv2.rectangle(image, start_point, end_point, color, thickness)

**color should be in (B,G,R) format
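For example, drawing on a blank black canvas (a 512x512 image created with NumPy just for illustration):

import cv2
import numpy as np

canvas = np.zeros((512, 512, 3), dtype=np.uint8)     # black BGR image

# Blue diagonal line, 5 px thick (color is given in B, G, R order)
cv2.line(canvas, (0, 0), (511, 511), (255, 0, 0), 5)

# Green rectangle from top-left (50, 50) to bottom-right (200, 150)
cv2.rectangle(canvas, (50, 50), (200, 150), (0, 255, 0), 3)

cv2.imshow('shapes', canvas)
cv2.waitKey(0)
cv2.destroyAllWindows()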


Drawing on Image

Drawing a circle:- To draw a circle, you need its center coordinates and radius

Syntax: cv2.circle(image, center_coordinates, radius, color, thickness)

**We can pass -1 for thickness. This means the shape will be filled, so we will get a filled-in circle.

Try to draw a polygon!!
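As a hint for the polygon exercise, here is a sketch using cv2.circle and cv2.polylines (all coordinates are arbitrary):

import cv2
import numpy as np

canvas = np.zeros((512, 512, 3), dtype=np.uint8)

# Filled red circle: thickness -1 fills the shape
cv2.circle(canvas, (256, 256), 60, (0, 0, 255), -1)

# Polygon: polylines expects the vertices as an int32 array of shape (n, 1, 2)
pts = np.array([[100, 50], [300, 80], [350, 300], [120, 350]], np.int32)
pts = pts.reshape((-1, 1, 2))
cv2.polylines(canvas, [pts], True, (0, 255, 255), 2)   # True = closed polygon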


Adding texts to Images

cv2.putText()

Syntax: cv2.putText(image, text, org, font, fontScale, color, thickness, lineType, bottomLeftOrigin)

org: It is the coordinates of the bottom-left corner of the text string in the image.

font: It denotes the font type. Some of the font types are FONT_HERSHEY_SIMPLEX,
FONT_HERSHEY_PLAIN, etc.

lineType: This is an optional parameter. It gives the type of the line to be used.

bottomLeftOrigin: This is an optional parameter. When it is true, the image data origin
is at the bottom-left corner. Otherwise, it is at the top-left corner.
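A minimal sketch (the text, position and sizes are arbitrary):

import cv2
import numpy as np

canvas = np.zeros((200, 600, 3), dtype=np.uint8)

cv2.putText(canvas,
            'OpenCV',                    # text to draw
            (50, 120),                   # org: bottom-left corner of the text
            cv2.FONT_HERSHEY_SIMPLEX,    # font type
            2,                           # fontScale
            (255, 255, 255),             # color in B, G, R
            3,                           # thickness
            cv2.LINE_AA)                 # lineType (anti-aliased)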
Blurring and Filtering
BLURRING
Blurring is commonly used for reducing image noise and smoothing the image.
BLURRING
cv2.blur() or cv2.boxFilter()

Its parameters are the image and the kernel size.

It simply takes the average of all the pixels under the kernel area and
replaces the central element with this average.

cv2.GaussianBlur()

Its parameters are the image, the kernel size and sigmaX (the Gaussian standard deviation).
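A minimal sketch comparing the two (the 5x5 kernel size is chosen arbitrarily):

import cv2

img = cv2.imread('messi5.jpg')

# Simple averaging over a 5x5 neighbourhood
avg = cv2.blur(img, (5, 5))

# Gaussian-weighted averaging; the third argument is sigmaX
# (0 lets OpenCV derive it from the kernel size)
gauss = cv2.GaussianBlur(img, (5, 5), 0)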


Contours

cv2.findContours(), cv2.drawContours()

❏ Contours can be explained simply as a curve joining all the continuous points (along the boundary) having the same color or intensity. Contours are a useful tool for shape analysis and object detection and recognition.

Contours
image, contours, hierarchy =
cv2.findContours(gray_img,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)

Arguments-
❏ The first one is the source image (usually a binary image obtained by
thresholding), the second is the contour retrieval mode, and the third is the
contour approximation method. It outputs the image, contours and hierarchy.
(Note: in OpenCV 4.x, cv2.findContours returns only the contours and hierarchy.)
❏ contours is a Python list of all the contours in the image. Each
individual contour is a Numpy array of (x,y) coordinates of boundary
points of the object.
Contours

img = cv2.drawContours(img, contours, -1, (0,255,0), 3)

❏ To draw the contours, the cv2.drawContours function is used. It can also
be used to draw any shape provided you have its boundary points.
Its first argument is the source image, the second argument is the contours,
which should be passed as a Python list, the third argument is the index of
contours (useful when drawing an individual contour; to draw all
contours, pass -1), and the remaining arguments are color, thickness, etc.
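Putting both functions together, a minimal sketch (written for OpenCV 4.x, where findContours returns two values; in 3.x it returns three, as shown above):

import cv2

img = cv2.imread('messi5.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold first so contours are found on a clean binary image
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE,
                                       cv2.CHAIN_APPROX_SIMPLE)

# Draw all contours (index -1) in green with thickness 3
img = cv2.drawContours(img, contours, -1, (0, 255, 0), 3)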
Contours-Features

Moments, Contour Area, Contour Perimeter, Contour Approximation


❏ cnt = contours[0]
M = cv2.moments(cnt)
❏ area = cv2.contourArea(cnt)
❏ perimeter = cv2.arcLength(cnt,True)
❏ epsilon = 0.1*cv2.arcLength(cnt,True)
approx = cv2.approxPolyDP(cnt,epsilon,True)
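For example, the moments can be used to compute a contour's centroid. A sketch (again assuming OpenCV 4.x and the example file 'messi5.jpg'):

import cv2

gray = cv2.imread('messi5.jpg', 0)
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

cnt = contours[0]                          # take the first contour
M = cv2.moments(cnt)
if M['m00'] != 0:                          # avoid division by zero
    cx = int(M['m10'] / M['m00'])          # centroid x from the moments
    cy = int(M['m01'] / M['m00'])          # centroid y from the moments

area = cv2.contourArea(cnt)
perimeter = cv2.arcLength(cnt, True)       # True means the contour is closed
approx = cv2.approxPolyDP(cnt, 0.1 * perimeter, True)   # simplified polygon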
THANKS
ROBOTICS CLUB
IIT GUWAHATI
