
COMPUTER VISION

Ishaya, Jeremiah Ayock


Lecture 9

May 1, 2023
Academic City University College, Agbogba Haatso, Ghana.
Computer Vision
COMPUTER VISION

Computer vision is a field of artificial intelligence (AI) that enables computers and systems to derive meaningful information from digital images, videos and other visual inputs, and to take actions or make recommendations based on that information.
Computer vision uses convolutional neural networks to process visual data at the pixel level, and deep learning recurrent neural networks to understand how one pixel relates to another.
COMPUTER VISION WORKS

Computer vision algorithms are based on pattern recognition: we train a model on a massive amount of visual (image) data.
The model processes the labelled images and finds patterns in the objects they contain.
For example, if we send a million vegetable images to a model to train on, it will analyze them and create an engine (a computer vision model) based on the patterns common to all vegetables.
As a result, the model will be able to accurately detect whether a particular image is a vegetable every time we send it one.
COMPUTER VISION WORKS CON’T

Machines interpret images very simply: as a series of pixels, each with its own set of colour values.
Consider the simplified image below, and how grayscale values are converted into a simple array of numbers:
COMPUTER VISION CON’T

Figure 1: How computer vision works
COMPUTER VISION CON’T

Images are a giant grid of squares, or pixels (this image is a very simplified version of what looks like either Abraham Lincoln or a Dementor).
Each pixel in an image can be represented by a number, usually from 0–255. The series of numbers on the right is what the software sees when you input an image.
For our image, there are 12 columns and 16 rows, which means there are 192 input values for this image.
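The grid-of-numbers idea can be sketched directly in NumPy; the 16 x 12 shape matches the example above, but the pixel values here are random placeholders rather than the actual image:

```python
import numpy as np

# A hypothetical 16-row x 12-column grayscale image: one 0-255
# intensity value per pixel (random placeholder values).
img = np.random.randint(0, 256, size=(16, 12), dtype=np.uint8)

print(img.shape)  # (16, 12)
print(img.size)   # 192 input values, as stated above
```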

COMPUTER VISION CON’T

When we start to add colour, things get more complicated.
Computers usually read colour as a series of 3 values – red, green, and blue (RGB) – on that same 0–255 scale.
Now, each pixel actually has 3 values for the computer to store in addition to its position. If we were to colourize President Lincoln (or Harry Potter’s worst fear), that would lead to 12 x 16 x 3 values, or 576 numbers.
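A quick sketch of the same 16 x 12 image after colourization, now holding three values per pixel:

```python
import numpy as np

# The same 16 x 12 image, now with three 0-255 values (R, G, B)
# per pixel: 12 x 16 x 3 = 576 numbers in total.
rgb = np.zeros((16, 12, 3), dtype=np.uint8)
print(rgb.size)  # 576
```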

COMPUTER VISION CON’T

Figure 2: Color Mapping


COMPUTER VISION CON’T

For some perspective on how computationally expensive this is, consider the following:

• Each colour value is stored in 8 bits.
• 8 bits x 3 colours per pixel = 24 bits per pixel.
• A normal sized 1024 x 768 image x 24 bits per pixel = almost 19M bits, or about 2.36 megabytes.
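The arithmetic behind these bullets can be checked directly (the 2.36 figure uses decimal megabytes):

```python
# Storage cost of a 1024 x 768 RGB image at 24 bits per pixel.
width, height = 1024, 768
bits_per_pixel = 8 * 3                 # 8 bits per colour channel
total_bits = width * height * bits_per_pixel
megabytes = total_bits / 8 / 1e6       # bits -> bytes -> decimal MB

print(total_bits)            # 18874368, i.e. almost 19M bits
print(round(megabytes, 2))   # 2.36
```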

COMPUTER VISION CON’T

That is a lot of memory to require for one image, and a lot of pixels for an algorithm to iterate over. But to train a model with meaningful accuracy, especially when you are talking about deep learning, you will usually need tens of thousands of images, and the more the merrier.
Even if you were to use transfer learning to borrow the insights of an already trained model, you would still need a few thousand images to train yours on.
With the sheer amount of computing power and storage required just to train deep learning models for computer vision, it is not hard to understand why advances in those two fields have driven machine learning forward to such a degree.
COMPUTER VISION CON’T

Figure 3: Human Vs Computer Vision

TOP TOOLS USED FOR COMPUTER VISION

• OpenCV
• TensorFlow
• Keras
• CUDA
• MATLAB
• Viso Suite
• CAFFE
• SimpleCV
• DeepFace
• YOLO
• GPUImage
COMPUTER VISION TASKS

Image classification sees an image and can classify it (a dog, an apple, a person’s face). More precisely, it is able to accurately predict that a given image belongs to a certain class.
For example, a social media company might want to use it to automatically identify and segregate objectionable images uploaded by users.
Object detection can use image classification to identify a certain class of object, and then detect and tabulate its appearances in an image or video.
Examples include detecting damage on an assembly line or identifying machinery that requires maintenance.
Object tracking follows or tracks an object once it is
detected. This task is often executed with images captured in
sequence or real-time video feeds.
Autonomous vehicles, for example, need to not only classify
and detect objects such as pedestrians, other cars and road
infrastructure, but they also need to track them in motion to
avoid collisions and obey traffic laws.

Content-based image retrieval uses computer vision to
browse, search and retrieve images from large data stores,
based on the content of the images rather than metadata tags
associated with them.
This task can incorporate automatic image annotation that
replaces manual image tagging. These tasks can be used for
digital asset management systems and can increase the
accuracy of search and retrieval.

COMPUTER VISION AND IMAGE PROCESSING

Image processing is the process of creating a new image from an existing image, typically simplifying or enhancing the content in some way.
It is a type of digital signal processing and is not concerned with understanding the content of an image.
A given computer vision system may require image processing to be applied to raw input, e.g. preprocessing images.
EXAMPLES OF IMAGE PROCESSING INCLUDE

• Normalizing photometric properties of the image, such as brightness or colour.
• Cropping the bounds of the image, such as centring an object in a photograph.
• Removing digital noise from an image, such as digital artefacts from low light levels.
Image Processing
WHAT IS IMAGE PROCESSING?

Digital image processing is the class of methods that deal with manipulating digital images through the use of computer algorithms.
It is an essential preprocessing step in many applications, such as face recognition, object detection, and image compression.
Image processing is done to enhance an existing image or to sift out important information from it. This is important in several deep learning-based computer vision applications, where such preprocessing can dramatically boost the performance of a model.
Manipulating images, for example adding objects to or removing them from images, is another application, especially in the entertainment industry.
TYPES OF IMAGES/HOW MACHINES ”SEE” IMAGES?

Digital images are interpreted as 2D or 3D matrices by a computer, where each value or pixel in the matrix represents the amplitude, known as the ”intensity”, of the pixel.
Typically, we are used to dealing with 8-bit images, wherein the amplitude value ranges from 0–255.
IMAGE PROCESSING

Figure 4: Image Processing
Thus, a computer ”sees” digital images as a function: I(x, y) or I(x, y, z), where ”I” is the pixel intensity and (x, y) or (x, y, z) represent the coordinates (for binary/grayscale or RGB images respectively) of the pixel in the image.
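In NumPy terms, I(x, y) is just array indexing; note that NumPy indexes row-first (img[row, col]), a convention detail the notation above leaves open:

```python
import numpy as np

# I(x, y): a single intensity from a grayscale image.
gray = np.arange(64, dtype=np.uint8).reshape(8, 8)
print(gray[2, 3])        # 19: the intensity at row 2, column 3

# I(x, y, z): one channel of one pixel in an RGB image.
rgb = np.zeros((8, 8, 3), dtype=np.uint8)
rgb[2, 3] = (255, 0, 0)  # make that pixel red
print(rgb[2, 3, 0])      # 255: its red-channel intensity
```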

IMAGE PROCESSING

Figure 5: Image Coordinate

Computers deal with different “types” of images based on their function representations. Let us look into them next.
BINARY IMAGE

Images that have only two unique values of pixel intensity, 0 (representing black) and 1 (representing white), are called binary images.
Such images are generally used to highlight a discriminating portion of a coloured image. For example, they are commonly used for image segmentation, as shown below.
BINARY IMAGE

Figure 6: Binary Image
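A minimal sketch of how a binary image can be produced from a grayscale one by thresholding (the threshold value here is arbitrary):

```python
import numpy as np

# Thresholding: pixels above the threshold become 1 (white),
# the rest become 0 (black).
gray = np.array([[ 10,  50, 200],
                 [250, 120,  30],
                 [ 90, 180,  60]], dtype=np.uint8)

threshold = 100
binary = (gray > threshold).astype(np.uint8)
print(binary)
# [[0 0 1]
#  [1 1 0]
#  [0 1 0]]
```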

GRAYSCALE IMAGE

Grayscale or 8-bit images are composed of 256 unique intensity values, where a pixel intensity of 0 represents the colour black and a pixel intensity of 255 represents the colour white.
All the other 254 values in between are the different shades of gray.
An example of an RGB image converted to its grayscale version is shown below. Notice that the shape of the histogram remains the same for the RGB and grayscale images.
GRAYSCALE IMAGE

Figure 7: Grayscale Image
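One common way to do the RGB-to-grayscale conversion is a weighted sum of the channels; the ITU-R BT.601 luma weights below are a standard choice, though the slides do not specify which method produced the figure:

```python
import numpy as np

# Weighted-sum ("luminosity") grayscale conversion.
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = (255, 255, 255)   # a white pixel
rgb[1, 1] = (255, 0, 0)       # a pure red pixel

weights = np.array([0.299, 0.587, 0.114])   # BT.601 luma weights
gray = np.rint(rgb @ weights).astype(np.uint8)

print(gray[0, 0])  # 255: white stays white
print(gray[1, 1])  # 76: red maps to a fairly dark gray
```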


RGB COLOR IMAGE

The images we are used to in the modern world are RGB or coloured images, which are 24-bit matrices to computers: 8 bits for each of the three channels, giving 16,777,216 possible colours for each pixel.
”RGB” represents the Red, Green, and Blue ”channels” of an image.
Thus, a pixel in an RGB image will be of colour black when the
pixel value is (0, 0, 0) and white when it is (255, 255, 255).
Any combination of numbers in between gives rise to all the
different colours existing in nature. For example, (255, 0, 0) is
the colour red (since only the red channel is activated for this
pixel).
Similarly, (0, 255, 0) is green and (0, 0, 255) is blue.
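These channel combinations can be verified in NumPy; a tiny 2 x 2 image split into its R, G, and B channels:

```python
import numpy as np

img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = (255, 0, 0)      # red pixel
img[0, 1] = (0, 255, 0)      # green pixel
img[1, 0] = (0, 0, 255)      # blue pixel
img[1, 1] = (255, 255, 255)  # white pixel

# Each channel is a 2D array holding one colour's intensities.
r, g, b = img[..., 0], img[..., 1], img[..., 2]
print(r)  # 255 only where red contributes (the red and white pixels)
```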

RGB COLOR IMAGE CON’T

Figure 8: RGB Splitting
RGBA IMAGE

RGBA images are coloured RGB images with an extra channel known as ”alpha” that depicts the opacity of the RGB image. Opacity ranges from a value of 0% to 100% and is essentially a “see-through” property.
Opacity in physics depicts the amount of light that an object blocks. For instance, cellophane paper is transparent (0% opacity), frosted glass is translucent, and wood is opaque (100% opacity).
The alpha channel in RGBA images tries to mimic this property. An example of this is shown below.
RGBA IMAGE CON’T

Figure 9: RGBA
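A minimal sketch of what the alpha channel does when compositing, using the standard "over" rule out = alpha * foreground + (1 - alpha) * background (this formula is general compositing practice, not something specific to these slides):

```python
import numpy as np

fg = np.array([255, 0, 0], dtype=float)   # red foreground pixel
bg = np.array([0, 0, 255], dtype=float)   # blue background pixel

for alpha in (0.0, 0.5, 1.0):             # transparent -> opaque
    out = alpha * fg + (1 - alpha) * bg
    print(alpha, out.astype(np.uint8))
# alpha 0.0 leaves pure blue, 0.5 gives purple, 1.0 gives pure red
```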
TYPES OF IMAGE PROCESSING

There are five main types of image processing:

• Visualization - find objects that are not visible in the image
• Recognition - distinguish or detect objects in the image
• Sharpening and restoration - create an enhanced image from the original image
• Pattern recognition - measure the various patterns around the objects in the image
• Retrieval - browse and search images from a large database of digital images that are similar to the original image
Phases of Image Processing
IMAGE ACQUISITION

The image is captured by a camera and digitized (if the camera output is not digitized automatically) using an analogue-to-digital converter for further processing in a computer.
IMAGE ENHANCEMENT

In this step, the acquired image is manipulated to meet the requirements of the specific task for which the image will be used.
Such techniques are primarily aimed at highlighting hidden or important details in an image, like contrast and brightness adjustment. Image enhancement is highly subjective in nature.
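A common enhancement sketch is a linear contrast/brightness adjustment, out = clip(gain * pixel + bias); the gain and bias values below are illustrative:

```python
import numpy as np

img = np.array([[50, 100],
                [150, 200]], dtype=np.uint8)

gain, bias = 1.5, 10   # contrast gain and brightness bias (made up)
enhanced = np.clip(gain * img.astype(float) + bias, 0, 255).astype(np.uint8)
print(enhanced)
# [[ 85 160]
#  [235 255]]   <- the 200 pixel saturates at 255
```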

IMAGE RESTORATION

This step deals with improving the appearance of an image and is an objective operation, since the degradation of an image can be attributed to a mathematical or probabilistic model. For example, removing noise or blur from images.
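A classic restoration sketch: a 3 x 3 median filter removes an isolated "salt" pixel from an otherwise flat image (written out by hand here; libraries such as SciPy provide faster equivalents):

```python
import numpy as np

img = np.full((5, 5), 100, dtype=np.uint8)
img[2, 2] = 255                  # a single noisy pixel

# Replace each interior pixel by the median of its 3 x 3 neighbourhood.
out = img.copy()
for i in range(1, 4):
    for j in range(1, 4):
        out[i, j] = np.median(img[i - 1:i + 2, j - 1:j + 2])

print(out[2, 2])  # 100: the noisy pixel is gone
```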

COLOR IMAGE PROCESSING

This step aims at handling the processing of coloured images (RGB or RGBA images), for example, performing colour correction or colour modelling in images.
WAVELETS AND MULTI-RESOLUTION PROCESSING

Wavelets are the building blocks for representing images in various degrees of resolution. Images are subdivided successively into smaller regions for data compression and for pyramidal representation.
IMAGE COMPRESSION

For transferring images to other devices, or due to computational storage constraints, images need to be compressed and cannot be kept at their original size.
This is also important in displaying images over the internet; for example, on Google, a small thumbnail of an image is a highly compressed version of the original. Only when you click on the image is it shown in the original resolution. This process saves bandwidth on the servers.
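A crude sketch of why thumbnails are cheaper: 2 x 2 block averaging quarters the pixel count (real formats such as JPEG are far more sophisticated than this):

```python
import numpy as np

img = np.arange(16, dtype=np.uint8).reshape(4, 4)

# Average each 2 x 2 block into one pixel: 16 values shrink to 4.
thumb = img.reshape(2, 2, 2, 2).mean(axis=(1, 3)).astype(np.uint8)
print(img.size, "->", thumb.size)  # 16 -> 4
```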

MORPHOLOGICAL PROCESSING

Image components that are useful in the representation and description of shapes need to be extracted for further processing or downstream tasks.
Morphological processing provides the tools (which are essentially mathematical operations) to accomplish this.
For example, erosion and dilation operations are used to shrink and expand the foreground regions of an image, respectively, which helps remove noise and close small gaps in objects.
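A hand-rolled sketch of dilation with a 3 x 3 neighbourhood (erosion is the same loop with min instead of max):

```python
import numpy as np

def dilate(img):
    # A pixel turns on if any pixel in its 3 x 3 neighbourhood is on.
    out = np.zeros_like(img)
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = img[i - 1:i + 2, j - 1:j + 2].max()
    return out

img = np.zeros((5, 5), dtype=np.uint8)
img[2, 2] = 1                  # a single foreground pixel
print(dilate(img).sum())       # 9: it grows into a 3 x 3 block
```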

IMAGE SEGMENTATION

This step involves partitioning an image into different key parts to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze.
Image segmentation allows computers to focus attention on the more important parts of the image, discarding the rest, which enables automated systems to perform better.
REPRESENTATION AND DESCRIPTION

Image segmentation procedures are generally followed by this step, where the task for representation is to decide whether the segmented region should be depicted as a boundary or a complete region.
Description deals with extracting attributes that result in some quantitative information of interest or are basic for differentiating one class of objects from another.
OBJECT DETECTION AND RECOGNITION

After the objects are segmented from an image and the representation and description phases are complete, the automated system needs to assign a label to the object, to let the human users know what object has been detected, for example, ”vehicle” or ”person”.
BENEFITS OF IMAGE PROCESSING

The implementation of image processing techniques has had a massive impact on many tech organizations.
Here are some of the most useful benefits of image processing, regardless of the field of operation:

• The digital image can be made available in any desired format (improved image, X-ray, photo negative, etc.)
• It helps to improve images for human interpretation
• Information can be processed and extracted from images for machine interpretation
• The pixels in the image can be manipulated to any desired density and contrast
• Images can be stored and retrieved easily
• It allows for easy electronic transmission of images
APPLICATIONS OF COMPUTER VISION

• Agriculture
• Sports
• Healthcare
• Transportation
• Manufacturing
• Retail
• Construction

AGRICULTURE

• Product Quality Testing


• Plant disease detection
• Livestock health monitoring
• Crop and yield monitoring
• Insect detection
• Aerial survey and imaging
• Automatic weeding
• Yield Assessment

HEALTHCARE

• Cell Classification
• Disease Progression Score
• Cancer detection
• Blood loss measurement
• Movement analysis
• CT and MRI
• X-Ray analysis

TRANSPORTATION

• Vehicle Classification
• Traffic flow analysis
• Self-driving cars
• Moving Violations Detection
• Pedestrian detection
• License Plate Recognition
• Driver Attentiveness Detection
• Road condition monitoring

MANUFACTURING

• Defect inspection
• Reading text and barcodes
• Product assembly

RETAIL

• Intelligent video analytics


• Waiting Time Analytics
• Theft Detection
• Foot traffic and people counting
• Self-checkout
• Automatic replenishment

CONSTRUCTION

• Predictive maintenance
• PPE Detection

END OF PRESENTATION

THANK YOU
