
COMPUTER VISION

Ishaya, Jeremiah Ayock


Lecture 9

May 1, 2023
Academic City University College, Agbogba Haatso, Ghana.
Computer Vision
COMPUTER VISION

Computer vision is a field of artificial intelligence (AI) that enables computers and systems to derive meaningful information from digital images, videos and other visual inputs, and to take actions or make recommendations based on that information.
Computer vision uses convolutional neural networks to process visual data at the pixel level, and deep learning recurrent neural networks to understand how one pixel relates to another.
COMPUTER VISION WORKS

Computer vision algorithms are based on pattern recognition: we train a model on a massive amount of visual (image) data.
The model processes the labelled images and finds patterns in the objects they contain.
For example, if we send a million vegetable images to a model to train on, it will analyze them and create an engine (a computer vision model) based on the patterns common to all vegetables.
As a result, the model will be able to accurately detect whether a particular image is a vegetable every time we send it one.
COMPUTER VISION WORKS CON’T

Machines interpret images very simply: as a series of pixels, each with its own set of colour values.
Consider the simplified image below, and how grayscale values are converted into a simple array of numbers:
COMPUTER VISION CON’T

Figure 1: How computer vision works
COMPUTER VISION CON’T

Images are a giant grid of squares, or pixels (this image is a very simplified version of what looks like either Abraham Lincoln or a Dementor).
Each pixel in an image can be represented by a number, usually from 0–255. The series of numbers on the right is what the software sees when you input an image.
For our image, there are 12 columns and 16 rows, which means there are 192 input values for this image.
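The grid-of-numbers idea can be sketched directly in NumPy; the 16 x 12 shape matches the example above, but the pixel values here are random placeholders rather than the actual image:

```python
import numpy as np

# A hypothetical 16-row x 12-column grayscale image: one 0-255
# intensity value per pixel (random placeholder values).
img = np.random.randint(0, 256, size=(16, 12), dtype=np.uint8)

print(img.shape)  # (16, 12)
print(img.size)   # 192 input values, as stated above
```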

COMPUTER VISION CON’T

When we start to add colour, things get more complicated.
Computers usually read colour as a series of 3 values – red, green, and blue (RGB) – on that same 0–255 scale.
Now, each pixel actually has 3 values for the computer to store in addition to its position. If we were to colourize President Lincoln (or Harry Potter’s worst fear), that would lead to 12 x 16 x 3 values, or 576 numbers.
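A quick sketch of the same 16 x 12 image after colourization, now holding three values per pixel:

```python
import numpy as np

# The same 16 x 12 image, now with three 0-255 values (R, G, B)
# per pixel: 12 x 16 x 3 = 576 numbers in total.
rgb = np.zeros((16, 12, 3), dtype=np.uint8)
print(rgb.size)  # 576
```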

COMPUTER VISION CON’T

Figure 2: Color Mapping


COMPUTER VISION CON’T

For some perspective on how computationally expensive this is, consider the following:

• Each colour value is stored in 8 bits.
• 8 bits x 3 colours per pixel = 24 bits per pixel.
• A normal sized 1024 x 768 image x 24 bits per pixel = almost 19M bits, or about 2.36 megabytes.
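The arithmetic behind these bullets can be checked directly (the 2.36 figure uses decimal megabytes):

```python
# Storage cost of a 1024 x 768 RGB image at 24 bits per pixel.
width, height = 1024, 768
bits_per_pixel = 8 * 3                 # 8 bits per colour channel
total_bits = width * height * bits_per_pixel
megabytes = total_bits / 8 / 1e6       # bits -> bytes -> decimal MB

print(total_bits)            # 18874368, i.e. almost 19M bits
print(round(megabytes, 2))   # 2.36
```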

COMPUTER VISION CON’T

That is a lot of memory to require for one image, and a lot of pixels for an algorithm to iterate over. But to train a model with meaningful accuracy, especially when you are talking about deep learning, you will usually need tens of thousands of images, and the more the merrier.
Even if you were to use transfer learning to borrow the insights of an already trained model, you would still need a few thousand images to train yours on.
With the sheer amount of computing power and storage required just to train deep learning models for computer vision, it is not hard to understand why advances in those two fields have driven machine learning forward to such a degree.
COMPUTER VISION CON’T

Figure 3: Human Vs Computer Vision

TOP TOOLS USED FOR COMPUTER VISION

• OpenCV
• TensorFlow
• Keras
• CUDA
• MATLAB
• Viso Suite
• CAFFE
• SimpleCV
• DeepFace
• YOLO
• GPUImage
COMPUTER VISION TASKS

Image classification sees an image and can classify it (a dog, an apple, a person’s face). More precisely, it is able to accurately predict that a given image belongs to a certain class.
For example, a social media company might want to use it to automatically identify and segregate objectionable images uploaded by users.
Object detection can use image classification to identify a certain class of object, and then detect and tabulate its appearances in an image or video.
Examples include detecting damage on an assembly line or identifying machinery that requires maintenance.
Object tracking follows or tracks an object once it is
detected. This task is often executed with images captured in
sequence or real-time video feeds.
Autonomous vehicles, for example, need to not only classify
and detect objects such as pedestrians, other cars and road
infrastructure, but they also need to track them in motion to
avoid collisions and obey traffic laws.

Content-based image retrieval uses computer vision to
browse, search and retrieve images from large data stores,
based on the content of the images rather than metadata tags
associated with them.
This task can incorporate automatic image annotation that
replaces manual image tagging. These tasks can be used for
digital asset management systems and can increase the
accuracy of search and retrieval.

COMPUTER VISION AND IMAGE PROCESSING

Image processing is the process of creating a new image from an existing image, typically simplifying or enhancing the content in some way.
It is a type of digital signal processing and is not concerned with understanding the content of an image.
A given computer vision system may require image processing to be applied to raw input, e.g. preprocessing images.
EXAMPLES OF IMAGE PROCESSING INCLUDE

• Normalizing photometric properties of the image, such as brightness or colour.
• Cropping the bounds of the image, such as centring an object in a photograph.
• Removing digital noise from an image, such as digital artefacts from low light levels.
Image Processing
WHAT IS IMAGE PROCESSING?

Digital image processing is the class of methods that deal with manipulating digital images through the use of computer algorithms.
It is an essential preprocessing step in many applications, such as face recognition, object detection, and image compression.
Image processing is done to enhance an existing image or to sift out important information from it. This is important in several deep learning-based computer vision applications, where such preprocessing can dramatically boost the performance of a model.
Manipulating images, for example adding objects to or removing them from images, is another application, especially in the entertainment industry.
TYPES OF IMAGES/HOW MACHINES ”SEE” IMAGES?

Digital images are interpreted as 2D or 3D matrices by a computer, where each value or pixel in the matrix represents the amplitude, known as the ”intensity”, of the pixel.
Typically, we are used to dealing with 8-bit images, wherein the amplitude value ranges from 0–255.
IMAGE PROCESSING

Figure 4: Image Processing
Thus, a computer ”sees” digital images as a function: I(x, y) or I(x, y, z), where ”I” is the pixel intensity and (x, y) or (x, y, z) represent the coordinates (for binary/grayscale or RGB images respectively) of the pixel in the image.
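In NumPy terms, I(x, y) is just array indexing; note that NumPy indexes row-first (img[row, col]), a convention detail the notation above leaves open:

```python
import numpy as np

# I(x, y): a single intensity from a grayscale image.
gray = np.arange(64, dtype=np.uint8).reshape(8, 8)
print(gray[2, 3])        # 19: the intensity at row 2, column 3

# I(x, y, z): one channel of one pixel in an RGB image.
rgb = np.zeros((8, 8, 3), dtype=np.uint8)
rgb[2, 3] = (255, 0, 0)  # make that pixel red
print(rgb[2, 3, 0])      # 255: its red-channel intensity
```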

IMAGE PROCESSING

Figure 5: Image Coordinate

Computers deal with different “types” of images based on their function representations. Let us look into them next.
BINARY IMAGE

Images that have only two unique values of pixel intensity, 0 (representing black) and 1 (representing white), are called binary images.
Such images are generally used to highlight a discriminating portion of a coloured image. For example, they are commonly used for image segmentation, as shown below.
BINARY IMAGE

Figure 6: Binary Image
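A minimal sketch of how a binary image can be produced from a grayscale one by thresholding (the threshold value here is arbitrary):

```python
import numpy as np

# Thresholding: pixels above the threshold become 1 (white),
# the rest become 0 (black).
gray = np.array([[ 10,  50, 200],
                 [250, 120,  30],
                 [ 90, 180,  60]], dtype=np.uint8)

threshold = 100
binary = (gray > threshold).astype(np.uint8)
print(binary)
# [[0 0 1]
#  [1 1 0]
#  [0 1 0]]
```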

GRAYSCALE IMAGE

Grayscale or 8-bit images are composed of 256 unique intensity values, where a pixel intensity of 0 represents the colour black and a pixel intensity of 255 represents the colour white.
All the other 254 values in between are the different shades of gray.
An example of an RGB image converted to its grayscale version is shown below. Notice that the shape of the histogram remains the same for the RGB and grayscale images.
GRAYSCALE IMAGE

Figure 7: Grayscale Image
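One common way to do the RGB-to-grayscale conversion is a weighted sum of the channels; the ITU-R BT.601 luma weights below are a standard choice, though the slides do not specify which method produced the figure:

```python
import numpy as np

# Weighted-sum ("luminosity") grayscale conversion.
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = (255, 255, 255)   # a white pixel
rgb[1, 1] = (255, 0, 0)       # a pure red pixel

weights = np.array([0.299, 0.587, 0.114])   # BT.601 luma weights
gray = np.rint(rgb @ weights).astype(np.uint8)

print(gray[0, 0])  # 255: white stays white
print(gray[1, 1])  # 76: red maps to a fairly dark gray
```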


RGB COLOR IMAGE

The images we are used to in the modern world are RGB or coloured images, which are 24-bit matrices to computers: 8 bits for each of the three channels, giving 16,777,216 possible colours for each pixel.
”RGB” represents the Red, Green, and Blue ”channels” of an image.
Thus, a pixel in an RGB image will be of colour black when the
pixel value is (0, 0, 0) and white when it is (255, 255, 255).
Any combination of numbers in between gives rise to all the
different colours existing in nature. For example, (255, 0, 0) is
the colour red (since only the red channel is activated for this
pixel).
Similarly, (0, 255, 0) is green and (0, 0, 255) is blue.
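These channel combinations can be verified in NumPy; a tiny 2 x 2 image split into its R, G, and B channels:

```python
import numpy as np

img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = (255, 0, 0)      # red pixel
img[0, 1] = (0, 255, 0)      # green pixel
img[1, 0] = (0, 0, 255)      # blue pixel
img[1, 1] = (255, 255, 255)  # white pixel

# Each channel is a 2D array holding one colour's intensities.
r, g, b = img[..., 0], img[..., 1], img[..., 2]
print(r)  # 255 only where red contributes (the red and white pixels)
```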

RGB COLOR IMAGE CON’T

Figure 8: RGB Splitting
RGBA IMAGE

RGBA images are coloured RGB images with an extra channel known as ”alpha” that depicts the opacity of the RGB image. Opacity ranges from a value of 0% to 100% and is essentially a “see-through” property.
Opacity in physics depicts the amount of light that an object blocks. For instance, cellophane paper is transparent (0% opacity), frosted glass is translucent, and wood is opaque (100% opacity).
The alpha channel in RGBA images tries to mimic this property. An example of this is shown below.
RGBA IMAGE CON’T

Figure 9: RGBA
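A minimal sketch of what the alpha channel does when compositing, using the standard "over" rule out = alpha * foreground + (1 - alpha) * background (this formula is general compositing practice, not something specific to these slides):

```python
import numpy as np

fg = np.array([255, 0, 0], dtype=float)   # red foreground pixel
bg = np.array([0, 0, 255], dtype=float)   # blue background pixel

for alpha in (0.0, 0.5, 1.0):             # transparent -> opaque
    out = alpha * fg + (1 - alpha) * bg
    print(alpha, out.astype(np.uint8))
# alpha 0.0 leaves pure blue, 0.5 gives purple, 1.0 gives pure red
```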
TYPES OF IMAGE PROCESSING

There are five main types of image processing:

• Visualization - find objects that are not visible in the image
• Recognition - distinguish or detect objects in the image
• Sharpening and restoration - create an enhanced image from the original image
• Pattern recognition - measure the various patterns around the objects in the image
• Retrieval - browse and search images from a large database of digital images that are similar to the original image
Phases of Image Processing
IMAGE ACQUISITION

The image is captured by a camera and digitized (if the camera output is not digitized automatically) using an analogue-to-digital converter for further processing in a computer.
IMAGE ENHANCEMENT

In this step, the acquired image is manipulated to meet the requirements of the specific task for which the image will be used.
Such techniques are primarily aimed at highlighting hidden or important details in an image, like contrast and brightness adjustment. Image enhancement is highly subjective in nature.
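A common enhancement sketch is a linear contrast/brightness adjustment, out = clip(gain * pixel + bias); the gain and bias values below are illustrative:

```python
import numpy as np

img = np.array([[50, 100],
                [150, 200]], dtype=np.uint8)

gain, bias = 1.5, 10   # contrast gain and brightness bias (made up)
enhanced = np.clip(gain * img.astype(float) + bias, 0, 255).astype(np.uint8)
print(enhanced)
# [[ 85 160]
#  [235 255]]   <- the 200 pixel saturates at 255
```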

IMAGE RESTORATION

This step deals with improving the appearance of an image and is an objective operation, since the degradation of an image can be attributed to a mathematical or probabilistic model. For example, removing noise or blur from images.
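A classic restoration sketch: a 3 x 3 median filter removes an isolated "salt" pixel from an otherwise flat image (written out by hand here; libraries such as SciPy provide faster equivalents):

```python
import numpy as np

img = np.full((5, 5), 100, dtype=np.uint8)
img[2, 2] = 255                  # a single noisy pixel

# Replace each interior pixel by the median of its 3 x 3 neighbourhood.
out = img.copy()
for i in range(1, 4):
    for j in range(1, 4):
        out[i, j] = np.median(img[i - 1:i + 2, j - 1:j + 2])

print(out[2, 2])  # 100: the noisy pixel is gone
```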

COLOR IMAGE PROCESSING

This step aims at handling the processing of coloured images (RGB or RGBA images), for example, performing colour correction or colour modelling in images.
WAVELETS AND MULTI-RESOLUTION PROCESSING

Wavelets are the building blocks for representing images in various degrees of resolution. Images are subdivided successively into smaller regions for data compression and for pyramidal representation.
IMAGE COMPRESSION

For transferring images to other devices, or due to computational storage constraints, images need to be compressed and cannot be kept at their original size.
This is also important in displaying images over the internet; for example, on Google, a small thumbnail of an image is a highly compressed version of the original. Only when you click on the image is it shown in the original resolution. This process saves bandwidth on the servers.
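A crude sketch of why thumbnails are cheaper: 2 x 2 block averaging quarters the pixel count (real formats such as JPEG are far more sophisticated than this):

```python
import numpy as np

img = np.arange(16, dtype=np.uint8).reshape(4, 4)

# Average each 2 x 2 block into one pixel: 16 values shrink to 4.
thumb = img.reshape(2, 2, 2, 2).mean(axis=(1, 3)).astype(np.uint8)
print(img.size, "->", thumb.size)  # 16 -> 4
```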

MORPHOLOGICAL PROCESSING

Image components that are useful in the representation and description of shapes need to be extracted for further processing or downstream tasks.
Morphological processing provides the tools (which are essentially mathematical operations) to accomplish this.
For example, erosion and dilation operations are used to shrink and expand the foreground regions of an image, respectively, which helps remove noise and close small gaps in objects.
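A hand-rolled sketch of dilation with a 3 x 3 neighbourhood (erosion is the same loop with min instead of max):

```python
import numpy as np

def dilate(img):
    # A pixel turns on if any pixel in its 3 x 3 neighbourhood is on.
    out = np.zeros_like(img)
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = img[i - 1:i + 2, j - 1:j + 2].max()
    return out

img = np.zeros((5, 5), dtype=np.uint8)
img[2, 2] = 1                  # a single foreground pixel
print(dilate(img).sum())       # 9: it grows into a 3 x 3 block
```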

IMAGE SEGMENTATION

This step involves partitioning an image into different key parts to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze.
Image segmentation allows computers to focus attention on the more important parts of the image, discarding the rest, which enables automated systems to perform better.
REPRESENTATION AND DESCRIPTION

Image segmentation procedures are generally followed by this step, where the task for representation is to decide whether the segmented region should be depicted as a boundary or a complete region.
Description deals with extracting attributes that result in some quantitative information of interest or are basic for differentiating one class of objects from another.
OBJECT DETECTION AND RECOGNITION

After the objects are segmented from an image and the representation and description phases are complete, the automated system needs to assign a label to the object, to let the human users know what object has been detected, for example, ”vehicle” or ”person”.
BENEFITS OF IMAGE PROCESSING

The implementation of image processing techniques has had a massive impact on many tech organizations.
Here are some of the most useful benefits of image processing, regardless of the field of operation:

• The digital image can be made available in any desired format (improved image, X-ray, photo negative, etc.)
• It helps to improve images for human interpretation
• Information can be processed and extracted from images for machine interpretation
• The pixels in the image can be manipulated to any desired density and contrast
• Images can be stored and retrieved easily
• It allows for easy electronic transmission of images
APPLICATIONS OF COMPUTER VISION

• Agriculture
• Sports
• Healthcare
• Transportation
• Manufacturing
• Retail
• Construction

AGRICULTURE

• Product Quality Testing


• Plant disease detection
• Livestock health monitoring
• Crop and yield monitoring
• Insect detection
• Aerial survey and imaging
• Automatic weeding
• Yield Assessment

HEALTHCARE

• Cell Classification
• Disease Progression Score
• Cancer detection
• Blood loss measurement
• Movement analysis
• CT and MRI
• X-Ray analysis

TRANSPORTATION

• Vehicle Classification
• Traffic flow analysis
• Self-driving cars
• Moving Violations Detection
• Pedestrian detection
• License Plate Recognition
• Driver Attentiveness Detection
• Road condition monitoring

MANUFACTURING

• Defect inspection
• Reading text and barcodes
• Product assembly

RETAIL

• Intelligent video analytics


• Waiting Time Analytics
• Theft Detection
• Foot traffic and people counting
• Self-checkout
• Automatic replenishment

CONSTRUCTION

• Predictive maintenance
• PPE Detection

END OF PRESENTATION

THANK YOU
