Class 10 Notes Ai Computer Vision

Uploaded by

audichaturvedi6

CLASS X

ARTIFICIAL INTELLIGENCE
ADVANCED PYTHON-COMPUTER VISION
• Emoji Scavenger Hunt :
[Link]
• [Link]
know-about-computer-vision
• Computer Vision is a domain of Artificial Intelligence.
• It deals with visual data inputs, primarily images and videos, and
enables computers to interpret and understand visual information.
• Computer Vision is like giving computers the ability to see and
understand the world through digital images and videos, much like
how humans use their eyes to perceive their surroundings.
• It involves extracting information from digital images such as
photographs and videos. Computers then analyse this visual
information to recognize objects, understand scenes, and make
decisions based on what they “SEE”.
• Computer Vision based applications analyse and understand the
content using the concepts of image processing and machine
learning models.
Applications of Computer Vision
1. Facial Recognition
2. Face Filters
3. Google’s Search by Image
4. Computer Vision in Retail
5. Self-Driving Cars
6. Medical Imaging
7. Google Translate App
Weather forecasting
• Weather forecasting involves gathering satellite data, identifying
patterns in the observations made, and then computing the results to get
accurate weather predictions. This is done in real time so that warnings
can be issued ahead of disasters.
• Artificial Intelligence uses mathematical models and computer vision
technology to identify patterns so that relevant weather predictions can
be made. Scientists are now using AI for weather forecasting to obtain
refined and accurate results quickly.
• In the current model of weather forecasting, scientists gather satellite
data (temperature, wind, humidity, etc.) and compare and analyse this
data against a mathematical model that is based on past weather patterns
and the geography of the region in question.
• One key advantage of the AI-based model, and a reason scientists
increasingly prefer it, is that it adjusts itself to the dynamics of
atmospheric changes.
Facial Recognition:
• With the advent of smart cities and smart
homes, Computer Vision plays a vital role in
making the home smarter.
• It powers the face lock system in a smartphone.
• It is used for guest recognition or log
maintenance of visitors.
• It is used in schools for an attendance system
based on facial recognition of students.
Face Filters :
• Modern apps like Instagram and Snapchat
have many features based on computer
vision. Face filters are one of them.
• Through the camera, the algorithm
identifies the facial dynamics of the
person and applies the selected face
filter.
Google’s Search by Image
• Most searches on Google’s search engine
use textual data, but it also has an
interesting feature for getting search
results through an image.
• CV analyses various features of the input
image, compares them with the database
of images, and gives us the search result.
Computer Vision in Retail:
• Retailers can use Computer Vision techniques to track
customers’ movements through stores, analyse
navigational routes and detect walking patterns.
• Inventory Management is another such application.
Through security camera image analysis, a Computer
Vision algorithm can generate a very accurate estimate
of the items available in the store. Also, it can analyse
the use of shelf space to identify suboptimal
configurations and suggest better item placement.
Self-Driving Cars:
• A self-driving car, also known as an autonomous vehicle (AV),
driverless car, robot car, or robotic car, is a vehicle that is
capable of sensing its environment and moving safely with
little or no human input.
• Self-driving cars combine a variety of sensors, such as radar,
lidar, sonar, GPS and odometry, to perceive their surroundings
and differentiate objects such as pedestrians, vehicles and
road signs within the vehicle's environment.
• Leading car manufacturers such as Tesla are investing in
artificial intelligence to develop on-road versions of
hands-free technology.
• This involves identifying objects, finding navigational routes
and monitoring the environment, all at the same time.
Watch These Videos:
Video 1: Google’s Waymo -
[Link]
Video 2: Tesla’s self-parking -
[Link]
MEDICAL IMAGING
• Medical Imaging: For decades, computer-supported
medical imaging applications have been a
trustworthy help for physicians. They not only
create and analyse images, but also act as an
assistant and help doctors with their
interpretation. Such applications are used to
read and convert 2D scan images into interactive
3D models that enable medical professionals to
gain a detailed understanding of a patient’s
health condition.
GOOGLE TRANSLATE
• Google Translate App: All you need to do to
read signs in a foreign language is to point
your phone’s camera at the words and let the
Google Translate app tell you what they mean in
your preferred language almost instantly. It
uses optical character recognition to read the
text in the image and augmented reality to
overlay the translation, which makes it a
convenient tool built on Computer Vision.
CV is effective in the following:

• Optical Character Recognition

• Fingerprint Recognition
• Advanced Robotic Surgery in Healthcare
• Virtual Mirrors in Retail Industry
• Camera-based Customer Analysis Applications
• Cashier-less Stores
• Driverless Trucks in Transportation & Logistics Industry.
• Monitoring Product Assembly Sequence
Basics of images
Capture an image using a mobile camera.
So how is an image, with multiple attributes like colour,
height and width, stored in a computer/mobile phone?

For the mobile/computer the image is like a grid of numbers.

It is stored in terms of pixel intensities in a construct called a MATRIX.
Once an image is represented as a matrix, it can be worked with like any
other data.
The matrix is processed to identify the colour etc.
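To make the "grid of numbers" idea concrete, here is a minimal sketch using NumPy. The 4x4 image and its values are made up for illustration; a real photo would simply be a much larger grid.

```python
import numpy as np

# A hypothetical 4x4 grayscale image: each entry is one pixel's intensity.
image = np.array(
    [
        [0, 0, 255, 255],
        [0, 0, 255, 255],
        [255, 255, 0, 0],
        [255, 255, 0, 0],
    ],
    dtype=np.uint8,
)

height, width = image.shape          # number of rows and columns
top_right_pixel = int(image[0, 3])   # pixel at row 0, column 3
brightest = int(image.max())
darkest = int(image.min())

print(image)
print("size:", height, "x", width)
print("top-right pixel:", top_right_pixel)
```

Reading a pixel is just indexing the matrix by row and column.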
How can NumPy and CV be integrated together?

• NumPy and OpenCV (cv2) are often used together in
computer vision tasks because they complement each
other well.
• OpenCV handles image processing, while NumPy
provides powerful array operations.
• In OpenCV, images are represented as NumPy arrays. This
means you can apply all the operations available in
NumPy directly on images.
• By integrating NumPy with OpenCV, you can leverage
powerful numerical operations on images, create custom
filters and manipulate pixels.
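The sketch below shows the kind of NumPy operations that can be applied to an image array. To keep it self-contained it uses a small hand-written array as a stand-in for an image that OpenCV would normally load; the values are assumptions chosen for illustration.

```python
import numpy as np

# Stand-in for a loaded grayscale image: 2 rows, 3 columns.
img = np.array([[10, 100, 200],
                [50, 150, 250]], dtype=np.uint8)

negative = 255 - img     # invert every pixel in one operation
cropped = img[:, 1:]     # slicing keeps only columns 1 and 2

# Brighten by 60. The type is widened first so values do not wrap
# around past 255, then the result is clipped back into 0..255.
brighter = np.clip(img.astype(np.int16) + 60, 0, 255).astype(np.uint8)

print(negative)
print(cropped)
print(brighter)
```

Each operation acts on the whole array at once, so no explicit loop over pixels is needed.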
BASICS OF IMAGE
• “Pixel” means picture element. Pixels are the smallest units of information
that make up a picture. Every photograph, in digital form, is made up of pixels.
• The number of pixels in an image is sometimes called the resolution. The
more pixels you have, the more closely the image resembles the original, i.e.
the resolution of the image is higher.
• Usually round or square, they are typically arranged in a 2-dimensional grid.
• Each of the pixels that represents an image stored inside a computer has a
pixel value which describes how bright that pixel is, and/or what colour it
should be.
• Each pixel in a colour image has three numbers (ranging from 0 to 255)
associated with it. These numbers represent the intensity of red, green and
blue colour in that particular pixel.
• Each pixel value (one per colour channel) uses 1 byte, which is equivalent to
8 bits of data. Since each bit can have two possible values, 8 bits can
represent 2^8 = 256 different values, which start at 0 and end at 255.
• Consider a monitor resolution of 1280×1024. This means there are 1280 pixels
from one side to the other, and 1024 from top to bottom.
• (0,0,0) represents black colour
• (255, 255, 255) represents white colour
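The arithmetic behind the 0 to 255 range and the monitor example can be checked with a few lines of Python. The storage figure assumes an uncompressed image with three one-byte channels per pixel, which is an assumption made here for illustration.

```python
bits_per_value = 8
levels = 2 ** bits_per_value       # number of distinct values in 8 bits
lowest, highest = 0, levels - 1    # values run from 0 up to levels - 1

# Pixel count for the 1280 x 1024 monitor resolution.
width, height = 1280, 1024
total_pixels = width * height

# Assumed: three channels (R, G, B) at one byte each, no compression.
bytes_for_rgb = total_pixels * 3

print(levels, lowest, highest)
print(total_pixels, bytes_for_rgb)
```

So 8 bits give 256 possible values, and the largest of them is 255 because counting starts at 0.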
TYPES OF IMAGES
• GRAYSCALE
• RGB

A computer sees an image as a 2-dimensional array (matrix) in the case of
grayscale, or as a 3-dimensional array in the case of a colour image.
Grayscale images
• Grayscale images are images which have a range of shades of gray
without apparent colour. The darkest possible shade is black, which
is the total absence of colour or zero value of pixel. The lightest
possible shade is white, which is the total presence of colour or 255
value of a pixel. Intermediate shades of gray are represented by
equal brightness levels of the three primary colours.
• In a grayscale image, each value in the 2D matrix represents the
brightness of a pixel.
• The numbers in the matrix range from 0 to 255, where 0
represents black, 255 represents white and the values between
them are shades of gray.
• Each pixel stores a single brightness value, which
means the grayscale image is composed of only one channel.
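As a small illustration, the following sketch builds a grayscale image as a 2D array that fades from black to white. The image size and the number of shades are arbitrary choices for the example.

```python
import numpy as np

# One row that fades from black (0) to white (255) in 5 steps.
gradient_row = np.linspace(0, 255, num=5).astype(np.uint8)

# Repeat the row 3 times to form a 3x5 grayscale image.
gray = np.tile(gradient_row, (3, 1))

print(gray)
print("dimensions:", gray.ndim)   # a grayscale image is a 2D array
print("shape:", gray.shape)       # (rows, columns), with no channel axis
```

There is only one number per pixel, which is what "one channel" means in practice.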
RGB (Coloured) images
• A colour image is viewed as a 3-dimensional
array. Each colour is produced by the right mix of
three primary colours (Red, Green and Blue), so a
colour image has three channels. The number of
channels refers to the number of colour components
stored for each pixel.
• This means that in an RGB image, each pixel has a
set of three different values which together give
colour to that particular pixel.
• An image is usually represented as height x width x
channels, where channels is 3 for a coloured image.
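The height x width x channels layout can be seen directly in a NumPy array. This is a minimal sketch with a made-up 2x2 image whose four pixels are red, green, blue and white.

```python
import numpy as np

height, width = 2, 2
rgb = np.zeros((height, width, 3), dtype=np.uint8)   # starts all black

rgb[0, 0] = (255, 0, 0)       # red
rgb[0, 1] = (0, 255, 0)       # green
rgb[1, 0] = (0, 0, 255)       # blue
rgb[1, 1] = (255, 255, 255)   # white

red_channel = rgb[:, :, 0]    # the red values of every pixel, as a 2D array

print("shape:", rgb.shape)    # (height, width, channels)
print(red_channel)
```

Taking one channel out of a colour image gives a 2D matrix, just like a grayscale image.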
Visit the link [Link]
Answer the questions
• What is the output colour when you put R=G=B=255 ?

• What is the output colour when you put R=G=B=0 ?

• How does the colour vary when you put either of the three as 0 and
then keep on varying the other two?

• How does the output colour change when all the three colours are
varied in same proportion ?

• What is the RGB value of your favourite colour from the colour
palette?
Visit the link [Link]
and create your own pixel art.
Try and make a GIF using the online app for your
own pixel art.
Tasks in Computer Vision
A computer cannot make sense of images on its own. For humans,
this ability comes naturally and effortlessly, but for
machines it is a fairly complicated process.
The idea is to teach a computer how to make sense of a
matrix of numbers and identify objects, faces and
characters using mathematical principles.
1. Classification (Image Classification)
2. Classification + Localization
3. Object Detection
4. Instance Segmentation
1. Classification
• Classification in Computer Vision (CV) refers to the task of
categorizing images or objects within images into predefined
classes or categories.
• The goal is to train a model to recognize patterns and features in
images that correspond to specific categories, and then use this
trained model to predict the category of new, unseen images.
• In this process, an image is classified depending on its visual content,
i.e. an input image is assigned one label from a fixed set of categories.
It is the process of finding out the class of the input image.
• A set of classes (objects to identify in images) is defined and a
model is trained to recognize them with the help of labelled photos,
i.e. it takes an image as input and outputs a class (a cat, a dog,
etc.) or a probability for each class, from which the class with the
highest probability is chosen.
• Classification in computer vision is a
fundamental task where the goal is to
automatically assign labels to images or
objects within images based on learned
patterns and features.
• This technique is widely used in various
applications, including object recognition,
facial recognition, and medical image analysis.
How Classification Works:

• Data Collection:
– A dataset of labeled images is collected, where each image is associated with
a specific class label. For example, a dataset might contain images of cats,
dogs, and birds, each labeled with their respective class.
• Feature Extraction:
– The model processes the images to extract relevant features, such as edges,
shapes, colors, or textures, that help in distinguishing between different
classes.
• Model Training:
– A machine learning algorithm is trained on the labeled dataset. During
training, the model learns to map the extracted features to the correct class
labels.
• Prediction:
– Once trained, the model can predict the class of new, unseen images by
analyzing their features and determining which class they most closely match.
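The four steps above can be sketched in a few lines of NumPy. This is a toy illustration, not a practical classifier: the images are synthetic, the class names "grass" and "sky" and their colours are made up, the only feature is the average colour, and "training" just stores one average feature vector per class (a nearest-centroid approach swapped in for a real learning algorithm).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_image(base_colour):
    """Hypothetical 8x8 RGB image: a base colour plus a little noise."""
    noise = rng.integers(-20, 21, size=(8, 8, 3))
    return np.clip(np.array(base_colour) + noise, 0, 255).astype(np.uint8)

def extract_features(image):
    """Feature extraction: average R, G and B over the whole image."""
    return image.reshape(-1, 3).mean(axis=0)

# 1. Data collection: labelled images (made-up classes and colours).
base_colours = {"grass": (40, 180, 60), "sky": (60, 120, 220)}
dataset = [(make_image(colour), label)
           for label, colour in base_colours.items()
           for _ in range(10)]

# 2-3. Feature extraction and training: one mean feature vector per class.
centroids = {
    label: np.mean([extract_features(img) for img, lab in dataset if lab == label], axis=0)
    for label in base_colours
}

# 4. Prediction: choose the class whose centroid is closest to the new image.
def predict(image):
    features = extract_features(image)
    return min(centroids, key=lambda label: np.linalg.norm(features - centroids[label]))

print(predict(make_image((50, 170, 70))))   # a new, unseen greenish image
```

Because the new image is much closer in average colour to the "grass" examples than to the "sky" examples, it is labelled "grass". Real systems replace the average-colour feature and the centroid rule with learned features and a trained model.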
Image Classification of Animals:
Suppose you want to build a system that can automatically classify images of animals
into categories like "cat," "dog," and "rabbit."
Steps:
• Dataset:
– Collect a dataset containing thousands of images of cats, dogs, and rabbits, each
labeled with the corresponding animal type.
• Feature Extraction:
– Use techniques like convolutional neural networks (CNNs) to automatically extract
features such as fur patterns, ear shapes, and body contours from the images.
• Training:
– Train a classification model (e.g., a deep learning model like a CNN) on the labeled
dataset. The model learns the distinguishing features of each class.
• Testing:
– Provide the model with new images, and it will classify them as either "cat," "dog,"
or "rabbit" based on the learned features.
Outcome:
• When given an image of a dog, the model processes the image, identifies the
relevant features, and predicts the label "dog" with high accuracy.
2. Classification and Localization
• It identifies what object is present in the image and, at the
same time, at what location that object is present in the
image.
• It is used only for single objects.
• E.g. if there is a dog in an image, the algorithm predicts the
class, and once the object is classified and labelled, it
creates a bounding box around the object in the image.
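A bounding box is just four numbers that enclose the object. The sketch below finds one for a single bright object on a dark background. The image, the brightness threshold of 128 and the label are assumptions for illustration; a real system would predict both the label and the box with a trained model.

```python
import numpy as np

# Hypothetical 6x8 image: dark background with one bright object.
image = np.zeros((6, 8), dtype=np.uint8)
image[2:5, 3:7] = 200    # the object occupies rows 2-4 and columns 3-6

object_mask = image > 128                        # pixels belonging to the object
rows = np.flatnonzero(object_mask.any(axis=1))   # rows that contain the object
cols = np.flatnonzero(object_mask.any(axis=0))   # columns that contain the object

# Bounding box as (top, left, bottom, right), using inclusive pixel indices.
bounding_box = (int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1]))
label = "object"   # placeholder; a classifier would supply the real class

print(label, bounding_box)
```

The output pairs one class label with one box, which matches the single-object case described above.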
3. Object Detection
• Object detection in computer vision (CV) is the process of identifying and locating
objects within an image or video. It aims to find instances of real-world objects
such as faces, bicycles, and buildings in images or videos.
• Unlike classification, which assigns a single label to an entire image, object
detection not only classifies the objects present in an image but also determines
their positions within the image by drawing bounding boxes around them.
• If there are multiple objects in the image, object detection algorithms use extracted
features and learning algorithms to recognize instances of each object category. The
output can contain multiple bounding boxes, each with its own label.
• It is commonly used in applications such as image retrieval and automated vehicle
parking systems.
4. Instance segmentation
• Instance segmentation helps in identifying and outlining distinctly
each object of interest appearing in an image.
• It is the process of detecting instances of the objects, giving them a
category and then giving each pixel a label on the basis of that.
• A segmentation algorithm takes an image as input and outputs a
collection of regions (or segments).
• This process helps to create a pixel-wise mask for each object in the
image and provides a far more granular understanding of the
object(s) in the image.
• Objects belonging to the same class are shown in different colours, one per instance.
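The idea of a pixel-wise mask for each object can be illustrated with a small label map. The map below is a made-up stand-in for the output of a segmentation model: 0 marks background, and 1 and 2 mark two separate instances that both belong to the class "dog".

```python
import numpy as np

# Hypothetical instance map for a 4x6 image.
instance_map = np.array([
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
])
instance_class = {1: "dog", 2: "dog"}   # both instances share one class

# One boolean mask per instance: True where a pixel belongs to that object.
masks = {int(i): instance_map == i for i in np.unique(instance_map) if i != 0}

# Number of pixels covered by each instance.
areas = {i: int(mask.sum()) for i, mask in masks.items()}

for i, mask in masks.items():
    print(i, instance_class[i], "pixels:", areas[i])
```

Even though both objects are dogs, each gets its own mask, which is why instances of the same class can be drawn in different colours.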
Image Features

• In computer vision and image processing, a feature is a
piece of information which is relevant for solving the
computational task related to a certain application.
• Features may be specific structures in the image such
as points, edges or objects.
example:
• Imagine that your security camera is capturing an
image. At the top of the image we are given six small
patches of images. Our task is to find the exact location
of those image patches in the image.
• Take a pencil and mark the exact location of those
patches in the image.
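The patch-location exercise can also be tried in code. The following sketch slides a small patch over an image and reports the position where the pixels differ least (the sum of squared differences). The random image and the patch position are made up for the example, and this brute-force search is only one simple way to do it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up 20x30 grayscale image (int32 so differences do not overflow).
image = rng.integers(0, 256, size=(20, 30)).astype(np.int32)

# Cut a 4x5 patch out of the image, starting at row 5, column 12.
patch = image[5:9, 12:17].copy()

def find_patch(image, patch):
    """Return the (row, col) whose window differs least from the patch."""
    ph, pw = patch.shape
    best_pos, best_err = None, None
    for r in range(image.shape[0] - ph + 1):
        for c in range(image.shape[1] - pw + 1):
            window = image[r:r + ph, c:c + pw]
            err = int(((window - patch) ** 2).sum())   # sum of squared differences
            if best_err is None or err < best_err:
                best_pos, best_err = (r, c), err
    return best_pos

location = find_patch(image, patch)
print("patch found at row, column:", location)
```

A patch with distinctive content is easy to place, while a flat, featureless patch would match many positions equally well. That is why structures such as corners and edges make good features.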
OpenCV
• OpenCV, or Open Source Computer Vision Library, is a tool
that helps a computer extract features from images for
further processing.
• Image file types
– Windows bitmaps – *.bmp
– Joint Photographic Expert Group – *.jpeg, *.jpg
– Portable Network Graphics – *.png
To install OpenCV library
At command prompt type
pip install opencv-python

Besides the above you can also install

pip install numpy
pip install matplotlib
Steps to read and display an image in OpenCV

1. Read an image using the imread() function.

2. Create a GUI window and display the image using the imshow()
function.
3. Use the function waitKey(0) to hold the image window on the
screen. The argument is a delay in milliseconds; 0 means wait
until a key is pressed. In a script, imshow() should be
followed by waitKey() so that the window is actually drawn.
4. Delete the image window from memory after displaying it using
the destroyAllWindows() function. This command is optional.
Flags used when reading an image
cv2.IMREAD_COLOR: It specifies to load a colour image. It is
the default flag. Alternatively, we can pass integer
value 1 for this flag.

cv2.IMREAD_GRAYSCALE: It specifies to load an image
in grayscale mode. Alternatively, we can pass integer
value 0 for this flag.

cv2.IMREAD_UNCHANGED: It specifies to load an
image as such, including the alpha channel. Alternatively,
we can pass integer value -1 for this flag.
Displaying image
import cv2
# Read the image in grayscale mode
img = cv2.imread("C:/Users/admin/Desktop/original_man.jpg", cv2.IMREAD_GRAYSCALE)
cv2.imshow("image", img)
cv2.waitKey(0)            # waits for a key to be pressed
cv2.destroyAllWindows()   # removes the created GUI window from screen and memory

Alternatively, we can pass integer value 0 instead of
cv2.IMREAD_GRAYSCALE, or 1 to load and show the image in colour.
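We can also display images using Matplotlib.

import cv2
import matplotlib.pyplot as plt
img = cv2.imread("C:/Users/admin/Desktop/original_man.jpg")
# OpenCV loads colour images in BGR order, so red and blue appear swapped here
plt.imshow(img)
plt.axis("on")
plt.show()

To show the correct colours, convert the image from BGR to RGB first:

import cv2
import matplotlib.pyplot as plt
img = cv2.imread("C:/Users/admin/Desktop/original_man.jpg")
RGB_img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)   # reorder channels for Matplotlib
plt.imshow(RGB_img)
plt.axis("on")
plt.show()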
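Why the cv2.COLOR_BGR2RGB conversion matters can be seen on a tiny array. OpenCV stores colour pixels in blue, green, red order, while Matplotlib expects red, green, blue. The sketch below uses a hand-written 1x2 array instead of a loaded photo, and reverses the channel axis with NumPy slicing, which for a 3-channel image should give the same pixel values as the conversion.

```python
import numpy as np

# A 1x2 image in BGR order: a pure blue pixel, then a pure red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R values, giving RGB order.
rgb = bgr[:, :, ::-1]

print("BGR:", bgr.tolist())
print("RGB:", rgb.tolist())
```

If the BGR array were passed to Matplotlib unchanged, the blue pixel would be drawn as red and the red pixel as blue.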

gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
