DIYA ACADEMY OF LEARNING
Affiliated to CBSE Board, New Delhi
Affiliation No. 830420
Artificial Intelligence
Code-417
Computer Vision
Grade 10
Bringing Education & Values Together
Computer Vision
• Domain of Artificial Intelligence
• Enables machines to see through images or other visual data, then
process and analyse that data using algorithms and methods in order to
understand real-world phenomena.
Applications of Computer Vision
• Facial Recognition
• Face Filters
• Google’s Search by Image
• Computer Vision in Retail
• Self-Driving Cars
• Medical Imaging
• Google Translate App
Computer Vision Tasks
• Image Classification: the task of assigning an input image one label
from a fixed set of categories. This is one of the core problems in CV
that, despite its simplicity, has a large variety of practical
applications.
• Classification + Localisation: the task of identifying what object is
present in the image and, at the same time, identifying at what location
that object is present in the image. It is used only for single objects.
• Object Detection: the process of finding instances of real-world objects
such as faces, bicycles, and buildings in images or videos. Object
detection algorithms typically use extracted features and learning
algorithms to recognize instances of an object category. It is commonly
used in applications such as image retrieval and automated vehicle
parking systems.
• Instance Segmentation: the process of detecting instances of objects,
giving them a category, and then giving each pixel a label on the basis
of that. A segmentation algorithm takes an image as input and outputs a
collection of regions (or segments).
QA
1. Define Computer Vision
2. What are the different types of computer vision tasks? Explain with
examples
Basics of Images
• Pixel: Picture element.
• Every photograph, in digital form, is made up of pixels.
• They are the smallest unit of information that make up a picture.
• Usually round or square, they are typically arranged in a 2-dimensional grid.
• Resolution: The number of pixels in an image is sometimes called the
resolution.
• It is commonly expressed as width by height, for example a monitor
resolution of 1280×1024.
• Another convention is to express the number of pixels as a single number, like
a 5 megapixel camera (a megapixel is a million pixels). This means the pixels
along the width multiplied by the pixels along the height of the image taken
by the camera equals 5 million pixels.
• Pixel value: Each pixel of an image stored inside a computer has a
pixel value which describes how bright that pixel is and/or what colour
it should be. The most common pixel format is the byte image, in which
each value is stored as an 8-bit integer, giving a range of possible
values from 0 to 255.
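The resolution arithmetic and the 0 to 255 range can be checked with a few lines of Python. This is only a small sketch: the function names are made up for illustration, and the 1280×1024 figure is the monitor example from the bullets above.

```python
# Hypothetical helper names, used only for illustration.

def pixel_count(width, height):
    """Total number of pixels in a width x height image."""
    return width * height

def megapixels(width, height):
    """Pixel count expressed in millions of pixels."""
    return pixel_count(width, height) / 1_000_000

def clamp_to_byte(value):
    """Keep a pixel value inside the byte-image range 0..255."""
    return max(0, min(255, int(value)))

print(pixel_count(1280, 1024))  # 1310720
print(megapixels(1280, 1024))   # 1.31072
print(2 ** 8)                   # 256 possible values in one byte: 0..255
print(clamp_to_byte(300))       # 255
```

So a 1280×1024 screen shows a little over 1.3 million pixels, which is far fewer than a 5 megapixel photo contains.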
Grayscale Images: Have a range of shades of gray without apparent colour.
• The darkest possible shade is black, which is the total absence of colour, or a pixel
value of 0. The lightest possible shade is white, which is the total presence of colour,
or a pixel value of 255.
• Intermediate shades of gray are represented by equal brightness levels of the three
primary colours.
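A grayscale image can be pictured as a grid of numbers, one per pixel. The sketch below uses a made-up 2×3 image and hypothetical helper names to show that a gray shade is the same brightness level repeated for red, green and blue.

```python
# A tiny, made-up grayscale image: 2 rows x 3 columns,
# each entry is one pixel value in the range 0..255.
gray_image = [
    [0, 128, 255],
    [64, 192, 32],
]

def gray_to_rgb(value):
    """A gray shade has equal brightness in red, green and blue."""
    return (value, value, value)

def is_gray(rgb):
    """True when all three primary colours have the same level."""
    red, green, blue = rgb
    return red == green == blue

height = len(gray_image)
width = len(gray_image[0])
print(width, height)           # 3 2
print(gray_to_rgb(0))          # (0, 0, 0)  -> black
print(gray_to_rgb(255))        # (255, 255, 255)  -> white
print(is_gray((128, 128, 128)))  # True
print(is_gray((255, 0, 0)))      # False
```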
RGB Images
Most of the images that we see around us are coloured images. These
images are made up of three primary colours: Red, Green and Blue. Every
colour present in them can be made by combining different intensities
of red, green and blue.
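As a rough illustration, each pixel of an RGB image can be written as three intensities (red, green, blue), each from 0 to 255. The 2×2 image and the helper name split_channels below are made up for this sketch.

```python
# Each pixel holds three intensities: (red, green, blue), each 0..255.
RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)
YELLOW = (255, 255, 0)  # full red + full green, no blue

# A made-up 2x2 RGB image built from the colours above.
rgb_image = [
    [RED, GREEN],
    [BLUE, YELLOW],
]

def split_channels(image):
    """Separate an RGB image into its red, green and blue channels."""
    red = [[pixel[0] for pixel in row] for row in image]
    green = [[pixel[1] for pixel in row] for row in image]
    blue = [[pixel[2] for pixel in row] for row in image]
    return red, green, blue

red, green, blue = split_channels(rgb_image)
print(red)    # [[255, 0], [0, 255]]
print(green)  # [[0, 255], [0, 255]]
print(blue)   # [[0, 0], [255, 0]]
```

Each channel on its own looks like a grayscale image; stacking the three gives back the coloured picture.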
Convolution
Convolution is a simple mathematical operation which is fundamental
to many common image processing operators.
An (image) convolution is simply an element-wise multiplication of
a patch of the image array with another array called the kernel,
followed by a sum. The kernel is moved across the image and the same
calculation is repeated at every position.
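The multiply-and-sum step described above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function name convolve2d, the sample image and the kernel are all made up, the kernel is applied without flipping (as the description above and most CNN code do), and no padding is added, so the output is smaller than the input.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image; at each position multiply
    element-wise and add up the products (no padding, step of 1)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            total = 0
            for ki in range(k_rows):
                for kj in range(k_cols):
                    total += image[i + ki][j + kj] * kernel[ki][kj]
            row.append(total)
        output.append(row)
    return output

# Made-up image: bright on the left, dark on the right.
image = [
    [10, 10, 0, 0],
    [10, 10, 0, 0],
    [10, 10, 0, 0],
]

# Made-up kernel that responds where brightness drops from left to right.
kernel = [
    [1, -1],
    [1, -1],
]

print(convolve2d(image, kernel))  # [[0, 20, 0], [0, 20, 0]]
```

The output is zero where the image is flat and large only in the middle column, which is where the bright region meets the dark one. That is how a kernel can pick out an edge.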
Applications:
Image editing software like Photoshop, and apps like Instagram and
Snapchat, use convolution to apply filters that enhance the quality of
an image.
Convolutional Neural Network
• A convolutional neural network (CNN) is a type of neural network that
is most often applied to image processing problems.
• Convolutional neural networks can also be used in natural language
processing projects.
CNN Process
Layers of a CNN
• Convolution Layer: To extract features from the input image. The first
convolution layers capture low-level features such as edges, and deeper
layers build higher-level features from them.
• Rectified Linear Unit (ReLU) Function: This layer simply replaces all
the negative numbers in the feature map with zero and lets the positive
numbers stay as they are.
• Pooling Layer: Pooling layer is responsible for reducing the spatial size
of the Convolved Feature while still retaining the important features.
• Fully Connected Layer: To take the results of the convolution/pooling
process and use them to classify the image into a label.
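The layers after convolution can be sketched on a small made-up feature map. This is only an illustration under stated assumptions: max pooling with a 2×2 window is one common kind of pooling (the slides do not say which kind is used), the weights are invented rather than learned, and the names label_a and label_b are placeholders.

```python
def relu(feature_map):
    """Replace every negative number with 0; keep positives as they are."""
    return [[max(0, value) for value in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled

def fully_connected(features, weights):
    """Score each label as a weighted sum and pick the highest score."""
    scores = {label: sum(f * w for f, w in zip(features, label_weights))
              for label, label_weights in weights.items()}
    return max(scores, key=scores.get), scores

# Made-up output of a convolution layer.
feature_map = [
    [-3, 5, 0, -1],
    [2, -8, 4, 6],
    [1, 0, -2, -7],
    [9, 3, -4, 2],
]

activated = relu(feature_map)
pooled = max_pool(activated)
flat = [value for row in pooled for value in row]

# Invented weights; a real network learns these during training.
weights = {"label_a": [1, 0, 0, 1], "label_b": [0, 1, 1, 0]}
label, scores = fully_connected(flat, weights)

print(activated)  # [[0, 5, 0, 0], [2, 0, 4, 6], [1, 0, 0, 0], [9, 3, 0, 2]]
print(pooled)     # [[5, 6], [9, 2]]
print(scores)     # {'label_a': 7, 'label_b': 15}
print(label)      # label_b
```

Pooling shrinks the 4×4 map to 2×2 while keeping the strongest responses, and the fully connected step turns those four numbers into one label.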