
• Chapter-wise Questions

• Subjective
• Objective

________ is a domain of AI that depicts the capability of a machine to get and
analyse visual information and afterwards predict some decisions about it.
a) NLP
b) Data Sciences
c) Augmented Reality
d) Computer Vision

________ is the sub-field of AI that is focused on enabling computers
to understand and process human languages.
a) Deep Learning
b) Machine Learning
c) NLP
d) Data Sciences

Scanned with ComScanner


In ________, the machine is trained with huge amounts of data,
which helps it in training itself around the data.
a) Supervised Learning
b) Deep Learning
c) Classification
d) Unsupervised Learning

Expand CBT
a) Computer Behaved Training
b) Cognitive Behavioural Therapy
c) Consolidated Batch of Trainers
d) Combined Basic Training



Give 2 examples of Supervised Learning models.
a) Classification and Regression
b) Clustering and Dimensionality Reduction
c) Rule Based and Learning Based
d) Classification and Clustering



Define Machine Learning.
a) Machine learning is the study of computer algorithms that improve
automatically through experience.
b) Refers to any technique that enables computers to mimic human
intelligence.
c) Machine learning refers to computer systems (both machines
and software) that enable machines to perform tasks for which they
are programmed.
d) Machine Learning refers to projects that allow the machine to work
on a particular logic.

________ refers to the AI modelling where the machine
learns by itself.
a) Learning Based
b) Rule Based
c) Machine Learning
d) Data Sciences

In ________, the machine is trained with huge amounts of data
which helps it in training itself around the data.
a) Machine Learning
b) Artificial Intelligence
c) NLP
d) Deep Learning

Define the term Machine Learning. Also give 2 applications of Machine
Learning in our daily lives.

• Machine Learning: It is a subset of Artificial Intelligence which enables machines to improve
at tasks with experience (data).

• The intention of Machine Learning is to enable machines to learn by themselves using
the provided data and make accurate predictions/decisions.

• Machine Learning is used in Snapchat filters and the Netflix recommendation system.



Differentiate between Classification and Regression.

Classification:
• This model works on a discrete dataset, which means the data need not be continuous.
• For example, in the grading system, students are classified on the basis of the grades they
obtain with respect to their marks in the examination.

Regression:
• Such models work on continuous data.
• For example, if you wish to predict your next salary, then you would put in the data of your
previous salary, any increments, etc. and would train the model.
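
The contrast between the two model types can be sketched in a few lines of Python. This is only an illustration: the grade boundaries and the salary figures below are made-up assumptions, and the regression is a plain least-squares straight line rather than any particular library model.

```python
# Classification: the output is one of a fixed set of discrete labels.
# The grade boundaries are assumed values, chosen only for illustration.
def classify_grade(marks: float) -> str:
    if marks >= 90:
        return "A"
    if marks >= 75:
        return "B"
    if marks >= 50:
        return "C"
    return "D"


# Regression: the output is a continuous number.
# Fit y = slope * x + intercept with ordinary least squares.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


years = [1, 2, 3, 4]                               # hypothetical years of service
salaries = [30000.0, 32000.0, 34000.0, 36000.0]    # hypothetical past salaries

slope, intercept = fit_line(years, salaries)
next_salary = slope * 5 + intercept                # continuous prediction for year 5

print(classify_grade(82))   # a discrete label
print(next_salary)          # a continuous value
```

The classifier can only ever return one of four labels, while the fitted line can return any number, which is the distinction the answer above draws.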



Categorize the following under Data Sciences, Machine Learning, Computer
Vision and NLP:

The latest technological advancements have made our lives
convenient. Google Home, Alexa and Siri have been a huge help to
non-tech savvy people. Features like Facial recognition and Facelock have
added additional security to our gadgets. These advancements have
also contributed in making our needs more approachable and
convenient. Now you can even check the prices with Price comparison
websites and order groceries online with chatbots. Did you know that
you can even find how you are going to look when you grow old?
Faceapps and Snapchat filters have made this possible!

• Alexa, Siri - NLP
• Facial Recognition - Computer Vision
• Facelock - Computer Vision
• Price comparison websites - Data Sciences
• Chatbots - NLP
• Faceapps - Computer Vision
• Snapchat Filters - Machine Learning



Which of the following is correct about the rule based approach?
a) We cannot provide enough rules to the machine.
b) A drawback/feature for this approach is that the learning is static.
c) Once the rules are fed into the system, it takes into consideration
any changes made in the original training dataset.
d) It can improve itself based on feedback.



When a machine possesses the ability to mimic the following human traits, it is said
to have artificial intelligence. Identify the positive traits that an AI machine
should possess.
i. make decisions
ii. bias
iii. predict
iv. learn and improve on its own
a) i) and iii) only
b) i) , iii) and iv) only
c) ii) and iv) only
d) i) ,ii), and iv) only



Assertion (A): Neural networks are the backbone of deep learning algorithms
Reason (R): Neural networks use vast amounts of data
a) Both A and R are correct and R is the correct explanation of A
b) Both A and R are correct but R is NOT the correct explanation of A
c) A is correct but R is not correct
d) A is not correct but R is correct.



A business problem wherein we categorize whether an observation is “Safe,”
“At-Risk,” or “Unsafe” is an example of
a) Classification
b) Clustering
c) Regression
d) Dimensionality Reduction

Tom is a student of grade five. He likes to move constantly at his desk. He
plays with pencils and taps his fingers, stands up in his place any time he gets a
chance. He enjoys playing basketball, and likes to play in the classroom.
Which of the following intelligence does he demonstrate?
a) Linguistic
b) Logical-Mathematical
c) Musical
d) Kinesthetic



The basis of decision making depends upon
i) availability of information
ii) past experience
iii) positive attitude
iv) self-awareness
a) i) and ii)
b) ii) and iv)
c) i), ii) and iv)
d) … and iii)

Infrared sensors detect infrared energy that is emitted by one's body heat. When
hands are placed in the proximity of the sensor, the infrared energy quickly
fluctuates. This fluctuation triggers the pump to activate and dispense the
designated amount of sanitizer. This is an example of
a) Automated machine
b) AI machine
c) Semi-automatic machine
d) Deep Learning machine

Match Column A with Column B:

Column A                       Column B
1. Face recognition machine    (i) Not AI
2. Automatic door              (ii) AI
3. Gesture recognition
4. Automatic toy car
a) 1 -> (i) ; 2 -> (ii) ; 3 -> (i) ; 4 -> (ii)
b) 1 -> (ii) ; 2 -> (i) ; 3 -> (ii) ; 4 -> (i)
c) 1 -> (i) ; 2 -> (i) ; 3 -> (ii) ; 4 -> (i)
d) 1 -> (ii) ; 2 -> (i) ; 3 -> (i) ; 4 -> (ii)



Assertion(A): Anyone can kick an artificially intelligent machine
Reason (R): They have no pain receptors
a) Both A and R are correct and R is the correct explanation of A
b) Both A and R are correct but R is NOT the correct explanation of A
c) A is correct but R is not correct
d) A is not correct but R is correct.



If Data is represented as “Answer”, Processing is represented as “Data” and
Answer is represented as “Processing”, which of the following can be related to
the description of layers in a neural network?
Choose the correct options
a) Input Layer -> Data; Output layer -> Processing; Hidden Layer -> Answer
b) Input Layer -> Processing; Output layer -> Data; Hidden Layer -> Answer
c) Input Layer -> Answer; Output layer -> Processing; Hidden Layer -> Data
d) Input Layer -> Answer; Output layer -> Data; Hidden Layer -> Processing



Which of the following is true about neural networks?
a) Neural Networks tend to perform better with larger amounts of data.
b) Neural Networks tend to perform poorer with larger amounts of data.
c) Neural Networks tend to perform better with smaller amounts of data.
d) Neural Networks need no data



Choose the correct option
a) Unsupervised learning -> labelled dataset, Regression
b) Supervised learning -> labelled dataset, Regression
c) Unsupervised learning -> unlabelled dataset, Classification
d) Supervised learning -> unlabelled data set, Regression



A leading multinational company that operates a chain of hypermarkets and grocery
stores deployed an AI application to make it easier for employees to keep
their stores running smoothly. They used thousands of video cameras, weighted
sensors on shelves, and other technologies that can tell employees when certain
products are starting to go bad. One of the tasks of the application is to identify
bananas that have started to turn brown, eliminating the need for employees to
manually inspect fruit. Which of the following domains is used to achieve this?
a) Data sciences
b) Computer vision
c) Natural Language Processing
d) Fuzzy logic



Data about the houses such as square footage, number of rooms, features,
whether a house has a garden or not, and the prices of these houses, i.e.,
the corresponding labels are fed into an AI machine. By leveraging data coming
from thousands of houses, their features and prices, we can now train the
model to predict a new house's price. This is an example of
a) Reinforcement learning
b) Supervised learning
c) Unsupervised learning
d) None of the above



Amazon had been working on a secret AI recruiting tool. The machine-
learning specialists uncovered a big problem: their new recruiting engine
did not like women. The system taught itself that male candidates were
preferable. It penalized resumes that included the word “women”. This led to
the failure of the tool. This is an example of
a) Data Privacy
b) AI access
c) AI Bias
d) Data Exploration



Why should we avoid using the training data for evaluation?

This is because our model will simply remember the whole training
set, and will therefore always predict the correct label for any
point in the training set.
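
A small sketch can make this concrete. The "model" below is a deliberately naive one that only memorises its training points (a hypothetical worst case, not a real learning algorithm), and all the data points and labels are made up for illustration. Evaluating it on the training set reports a perfect score even though it has learnt nothing general.

```python
class MemorizingModel:
    """Toy model that stores every training point and its label."""

    def __init__(self, default_label):
        self.default_label = default_label
        self.memory = {}

    def fit(self, points, labels):
        self.memory = dict(zip(points, labels))

    def predict(self, point):
        # Seen points are recalled exactly; unseen points get a fixed guess.
        return self.memory.get(point, self.default_label)


def accuracy(model, points, labels):
    correct = sum(model.predict(p) == y for p, y in zip(points, labels))
    return correct / len(points)


# Hypothetical data: small coordinates are "low", large ones are "high".
train_points = [(1, 1), (2, 2), (8, 8), (9, 9)]
train_labels = ["low", "low", "high", "high"]
test_points = [(1, 2), (9, 8), (8, 9), (2, 1)]
test_labels = ["low", "high", "high", "low"]

model = MemorizingModel(default_label="low")
model.fit(train_points, train_labels)

train_accuracy = accuracy(model, train_points, train_labels)
test_accuracy = accuracy(model, test_points, test_labels)
print(train_accuracy, test_accuracy)
```

The score on the training points is perfect, while the score on the unseen test points is no better than the fixed guess, so only the test score says anything about how the model behaves on new data.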

________ and ________ are AI based applications that help us in navigation.

Google Maps, Apple Maps



“This type of intelligence measures one's awareness of the natural
world around them and their sensitivities to any changes that occur. It
allows us to identify the variation among two different species and understand
how they are related”.
Identify the type of intelligence described in the above sentence

Naturalist Intelligence



Identify the incorrect statement(s) from the following:
(i)Deep learning is a subset of Machine Learning
(ii)Machine Learning is a subset of Deep Learning
(iii) Artificial Intelligence is a subset of Deep Learning
(iv) Deep Learning is the advanced form of AI and ML
(a) only (i)
(b) (ii) and (iii)
(c) (i) and (ii)
(d) Only (iii)



Search engines not only predict what popular searches may apply to your query
as you start typing, but it looks at the whole picture and recognizes what you're
trying to say rather than the exact search words. This is an example of
(a) Computer Vision
(b) Data Sciences
(c) Natural Language Processing
(d) Natural Language Understanding



When a user installs an app in the smartphone, it asks for access to gallery,
contacts, etc. After accepting this, it gives the user agreement which most
users accept without realizing the implications. What is the concern here?
(a) Data Privacy
(b) Unemployment
(c) AI bias
(d) No concern



We can't make “good” decisions without information. (True/False)



Divya was learning neural networks. She understood that there were three layers
in a neural network. Help her identify the layer that does processing in the
neural network.
(a) Output layer
(b) Hidden layer
(c) Input layer
(d) Data layer



Smita is working on a project that involves over a lakh of records. Which of
the following should she use to make the best project?
(a) Traditional programming
(b) Manual processing
(c) IoT
(d) Neural networks



Identify the algorithm based on the given graph
(a) Dimensionality reduction
(b) Classification
(c) Clustering
(d) Regression



How do you understand whether a machine/application is AI based or not?
Explain with the help of an example.

Any machine that has been trained with data and can make decisions/predictions on its own
can be termed as AI.

E.g.: A bot or automation machine that is not trained with any data is not AI, while a chatbot
that understands and processes human language is AI.



If you do an image search for vacations on a popular search engine, the first
few searches mostly return the picture of beaches. What is the concern here?
Explain.

In the given scenario, we are concerned about the bias.

When we talk about a machine, we know that it is artificial and cannot think on its own. It can have
intelligence, but we cannot expect a machine to have any biases of its own.

Any bias can transfer from the developer to the machine while the algorithm is being developed.



Ashwat is amazed to learn about his sister Ananya who is multi-talented and has
excelled in academics, music, dancing, sports and painting. He was quite
curious when Ananya told him that he too possessed all these intelligences like
every human being does, but only at different levels. He wondered which
intelligence she was talking about. Can you help Ashwat in learning about different
types of intelligences by naming and explaining any four types of intelligences?

1. Mathematical Logical Reasoning: ability to regulate, measure, and understand numerical
symbols, abstraction and logic.

2. Linguistic Intelligence: language processing skills both in terms of understanding or implementation in
writing or verbally.

3. Spatial Visual Intelligence: ability to perceive the visual world and the relationship of one object to another.

4. Kinesthetic Intelligence: ability that is related to how a person uses his limbs in a skilled manner.

5. Musical Intelligence: ability to recognize and create sounds, rhythms, and sound patterns.



Samarth attended a seminar on Artificial Intelligence and has now been asked to
write a report on his learnings from the seminar. Being a non-technical person, he
understood that the AI enabled machine uses data of different formats in many of
the daily based applications but failed to sync it with the right terminologies and
express the details. Help Samarth define Artificial Intelligence, list the three domains
of AI and the data that is used in these domains.

Artificial Intelligence (AI) refers to any technique that enables computers to mimic human
intelligence i.e., make decisions, predict the future, learn and improve on its own.

With respect to the type of data fed in the AI model, AI models can be broadly categorised into
three domains:

1. Data Sciences: takes input in the form of numeric and alphanumeric data.

2. Computer Vision: takes input in the form of images and videos.

3. Natural Language Processing: takes input in the form of text and speech.

Neural networks are said to be modelled on the way neurons in the human
brain behave. A similar system is mimicked by the AI machine to perform certain
tasks. Explain how neural networks work in an AI model and mention any three
features of Neural Networks.

Neural networks are loosely modelled after how neurons in the human brain
behave. The features of a neural network are :

1. They are able to extract data features automatically without needing the input
of the programmer.

2. A neural network is essentially a system of organizing machine learning
algorithms to perform certain tasks.

3. It is a fast and efficient way to solve problems for which the dataset is very
large, such as in images.

[Figure: neural network layers, including a hidden layer]




The ________ canvas helps you in identifying the key elements
related to the problem.
a) Problem scoping
b) 4Ws Problem
c) Project cycle
d) Algorithm



Name any 2 methods of collecting data.
a) Surveys and Interviews
b) Rumors and Myths
c) AI models and applications
d) Imagination and thoughts



What is the role of modelling in an NLP based AI model?
a) Modelling in NLP helps in processing of AI model
b) Modelling is required to make an AI model
c) In NLP, modelling requires data pre-processing only after which the
data is fed to the machine.
d) Modelling is used in simplification of data acquisition

Which of the following is not part of the AI Project Cycle?
a) Data Exploration
b) Modelling
c) Testing
d) Problem Scoping



________ is the last stage of the AI project life cycle.
a) Problem Scoping
b) Evaluation
c) Modelling
d) Data Acquisition



Create a 4W Project Canvas for the following:
As more and more new technologies get into play, risks will get more
concentrated into a common network. Cybersecurity becomes extremely
complicated in such scenarios and goes beyond the control of firewalls.
It will not be able to detect unusual activity and patterns including
the movement of data.

Think how AI algorithms can scrape through vast amounts of logs to
identify susceptible user behaviour. Use an AI project cycle to clearly
identify the scope, how you will collect data, model and evaluation
parameters.

WHO: OUR [stakeholders]
People who are using the new technology.

WHAT: HAS/HAVE PROBLEM THAT [issue, problem, need]
Cyber security is the need when so much of the flow of data is not monitored or escapes the
antiviruses/firewall systems.

WHERE: WHEN/WHILE [context/situation]
The problem is in the use of the latest technology where vast amounts of data are at risk.

WHY: AN IDEAL SOLUTION WOULD [benefit of solution to them]
An effective AI system which is able to detect the flow of data and also report unusual activity.

Choose the five stages of AI project cycle in correct order
a) Evaluation -> Problem Scoping -> Data Exploration -> Data Acquisition ->
Modelling
b) Problem Scoping -> Data Exploration -> Data Acquisition -> Evaluation ->
Modelling
c) Data Acquisition -> Problem Scoping -> Data Exploration -> Modelling ->
Evaluation
d) Problem Scoping -> Data Acquisition -> Data Exploration -> Modelling ->
Evaluation



________ helps us to summarise all the key points into one single outline so
that in future, whenever there is need to look back at the basis of the problem,
we can take a look at it and understand the key elements of it.
a) 4W Problem canvas
b) Problem Statement Template
c) Data Acquisition
d) Algorithm



Which of the following is incorrect?
i)Testing data is the one on which we train and fit our model basically to fit
the parameters
ii)Training data is used only to assess performance of model
iii) Testing data is the unseen data for which predictions have to be made
a) i) and iii) only
b) i) and ii) only
c) ii) and iii) only
d) i), ii) and iii)



a. Understand and inspect the web page to find the HTML markers associated
with the information we want.
b. Use Python libraries to pull out data from the HTML page.
c. Manipulate the collected data to get it in the form we need.
The above given steps are for collecting data from which of the following data
sources?
a) Cameras
b) Sensors
c) Surveys
d) Web scraping
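
The three steps listed in this question can be sketched with Python's standard library alone. Everything specific here is an assumption made for illustration: the page is a hard-coded HTML string instead of a downloaded one, the `class="price"` marker is hypothetical, and real projects often use third-party scraping libraries rather than `html.parser`.

```python
from html.parser import HTMLParser

# Step a: inspect the page and decide which HTML marker holds the data.
# Here we assume prices sit inside <li class="price"> elements.
PAGE = """
<html><body>
  <ul>
    <li class="price">Rs. 120</li>
    <li class="name">Notebook</li>
    <li class="price">Rs. 45</li>
  </ul>
</body></html>
"""


class PriceParser(HTMLParser):
    """Collects the text found inside <li class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.in_price = False
        self.raw_prices = []

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "price") in attrs:
            self.in_price = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_price = False

    def handle_data(self, data):
        if self.in_price:
            self.raw_prices.append(data.strip())


# Step b: pull the data out of the HTML.
parser = PriceParser()
parser.feed(PAGE)

# Step c: manipulate the collected text into the form we need (integers).
prices = [int(text.replace("Rs.", "").strip()) for text in parser.raw_prices]
print(prices)
```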



________ helps us to summarise all the key points into one single Template
so that in future, whenever there is a need to look back at the basis of the problem,
we can take a look at this and understand the key elements of it.

Problem Statement Template



For better efficiency of an AI project, training data should be
i)Relevant
ii) Scattered
iii) Structured
iv)Authentic
Choose the correct option:
(a) Both i and ii
(b) Both i and iv
(c) Only i
(d) Only iv



Suhana works for a company wherein she was assigned the task of developing
a project using AI project cycle. She knew that the first stage was scoping the
problem. Help her list the remaining stages that she must go through to develop
the project.

Problem Scoping -> Data Acquisition -> Data Exploration -> Modelling -> Evaluation



________ is a simple file format that stores data separated by commas.

(b) doc
(c) csv
(d) png

Comma Separated Values
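
As a brief illustration of the format, the sketch below reads comma-separated text with Python's built-in `csv` module. The column names and values are invented for the example, and the data is held in a string rather than a file so that the snippet runs on its own.

```python
import csv
import io

# Hypothetical CSV content: the first row is the header, the rest are records.
raw = "name,marks\nAsha,91\nRavi,78\n"

# DictReader uses the header row as keys for every following row.
rows = list(csv.DictReader(io.StringIO(raw)))

# Values are read as text, so convert them before doing arithmetic.
marks = [int(row["marks"]) for row in rows]

print(rows[0]["name"], marks)
```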

Which of the following is an application of data science?
(a) Text summarization
(b) Target Advertisements
(c) Face lock in smartphones
(d) Email filters



Rajat has made a model which predicts the performance of Indian Cricket players in
upcoming matches. He collected the data of players' performance with respect
to stadium, bowlers, opponent team and health. His model works with good accuracy
and precision value. Which of the statements given below is incorrect?
(a) Data gathered with respect to stadium, bowlers, opponent team and health is
known as Testing Data.
(b) Data given to an AI model to check accuracy and precision is Testing Data.
(c) Training data and testing data are acquired in the Data Acquisition stage.
(d) Training data is always larger as compared to testing data.



Ajay wants to access data from various sources. Suggest to him any two points that
he needs to keep in mind while accessing data from any data source.

While accessing data from any of the data sources, following points should be kept in mind:

1. Data which is available for public usage only should be taken up.

2.Personal datasets should only be used with the consent of the owner.

3.One should never breach someone's privacy to collect data.

4.Reliable sources of data ensure the authenticity of data which helps in the proper training of the AI
model.

Give one example of an application which uses augmented reality.

Apple Vision Pro, Self-Driving Cars

In ________, input to machines can be photographs, videos and pictures
from thermal or infrared sensors, indicators and different sources.
a) Computer Vision
b) Data Acquisition
c) Data Collection
d) Machine learning



________ is the process of finding instances of real-world objects in images or videos.
(a) Instance segmentation
(b) Object detection
(c) Classification
(d) Image segmentation

Object Detection



How many channels does a colour image have?



________ means a picture element which is the smallest unit of information that makes
up a picture.
(a) Vision
(b) Pics
(c) Pixel
(d) Piskel



Explain the term resolution with an example.

Resolution of an image refers to the number of pixels in an image, across the width and height.

For example, a monitor resolution of 1280 × 1024 means there are 1280 pixels from one side
to the other, and 1024 from top to bottom.
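
The arithmetic behind this example is just width times height. A minimal sketch using the same 1280 × 1024 figures (the variable names are chosen here for illustration):

```python
# Resolution from the example above: pixels across and pixels down.
width, height = 1280, 1024

# The total number of pixels is the product of the two dimensions.
total_pixels = width * height

# Resolutions are often quoted in megapixels (millions of pixels).
megapixels = total_pixels / 1_000_000

print(f"{width} x {height} = {total_pixels} pixels ({megapixels:.2f} MP)")
```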



The term Sentence Segmentation is
a) the whole corpus is divided into sentences
b) to undergo several steps to normalise the text to a lower level
c) in which each sentence is then further divided into tokens
d) the process in which the affixes of words are removed



________ is an example of an application of Natural Language
Processing.
a) Evaluation
b) Automatic Summarization
c) Deep Learning
d) Problem Scoping



Give 2 points of difference between a script-bot and a smart-bot

Script Bot:
• Script bots are easy to make
• Script bots work around a script which is programmed around them
• Most of them are free and easy to integrate with a messaging platform
• None or little language processing skills
• Limited functionality

Smart Bot:
• Smart bots are flexible and powerful
• Smart bots work on large databases and other resources directly
• Smart bots learn with more data
• Coding is required
• Wide functionality

Explain the term Text Normalisation in Data Processing.

The first step in Data processing is Text Normalisation.

It helps in cleaning up the textual data in such a way that it comes down to a level
where its complexity is lower than the actual data.

In this we undergo several steps to normalise the text to a lower level.

The term used for the whole textual data from all the documents is known as corpus.



Name any 2 applications of Natural Language Processing which are used in
the real-life scenario.

• Automatic Summarization,
• Sentiment Analysis,
• Text classification,
• Virtual Assistants

Differentiate between stemming and lemmatization. Explain with the help
of an example.

CARING -> lemmatization -> CARE

CARING -> stemming -> CAR

Stemming is the process in which the affixes of words are removed and the words are converted to
their base form.

In lemmatization, the word we get after affix removal(also known as lemma) is a meaningful one.

Lemmatization makes sure that lemma is a word with meaning and hence it takes a longer time to
execute than stemming.
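
The difference can be sketched with two toy functions. Both are simplifications written for this illustration: the stemmer only chops a short, assumed list of suffixes, and the lemmatizer is a hand-written lookup table standing in for the dictionary that a real lemmatizer would consult.

```python
# Toy stemmer: blindly removes the first matching suffix from an assumed list.
SUFFIXES = ("ing", "es", "s")


def toy_stem(word: str) -> str:
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least two letters so very short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)]
    return word


# Toy lemmatizer: a hypothetical lookup table of word -> meaningful base form.
LEMMA_TABLE = {"caring": "care", "studies": "study"}


def toy_lemmatize(word: str) -> str:
    word = word.lower()
    return LEMMA_TABLE.get(word, word)


for word in ["caring", "studies"]:
    print(word, "| stem:", toy_stem(word), "| lemma:", toy_lemmatize(word))
```

The stemmer is fast because it never checks whether its output is a real word, while the lemmatizer has to look the word up, which mirrors the speed difference described above.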


What will be the output of the word “studies" if we do the following:
a. Lemmatization
b. Stemming



How many tokens are there in the sentence given below?
Traffic Jams have become a common part of our lives nowadays. Living in an
urban area means you have to face traffic each and every time you get out on the
road. Mostly, school students opt for buses to go to school.

46 tokens
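
The count can be checked with a short script. It assumes a simple tokenisation rule in which every word and every punctuation mark counts as one token; the regular expression is one possible way of expressing that rule, not the only one.

```python
import re

text = (
    "Traffic Jams have become a common part of our lives nowadays. "
    "Living in an urban area means you have to face traffic each and "
    "every time you get out on the road. "
    "Mostly, school students opt for buses to go to school."
)

# A token is either a run of word characters or a single punctuation mark.
tokens = re.findall(r"\w+|[^\w\s]", text)

print(len(tokens))
```

Under this rule the three full stops and the comma are tokens too, which is how the total reaches 46 rather than the 42 words alone.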



What is a corpus?

The term used to describe the whole textual data from all the
documents altogether is known as a corpus.



Identify any 2 stopwords in the given sentence:
Pollution is the introduction of contaminants into the natural environment
that cause adverse change. The three types of pollution are air pollution, water
pollution and land pollution.

Stopwords in the given sentence are: is, the, of, that, into, are, and
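
Picking out stopwords can be sketched as a lookup against a stopword list. The list below is a small hand-picked set assumed for this example (real toolkits ship much longer lists), and only the first sentence of the passage is used.

```python
import re

# Assumed, hand-picked stopword list for illustration only.
STOPWORDS = {"is", "the", "of", "that", "into", "are", "and"}

sentence = (
    "Pollution is the introduction of contaminants into the natural "
    "environment that cause adverse change."
)

tokens = [token.lower() for token in re.findall(r"\w+", sentence)]

# Stopwords found, in order of first appearance and without repeats.
found = []
for token in tokens:
    if token in STOPWORDS and token not in found:
        found.append(token)

# What is left after stopword removal.
remaining = [token for token in tokens if token not in STOPWORDS]

print(found)
print(remaining)
```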

“Automatic summarization is used in NLP applications". Is the given statement
correct? Justify your answer with an example.

• Yes, the given statement is correct.

• Automatic summarization is relevant not only for summarizing the
meaning of documents and information, but also to understand the
emotional meanings within the information, such as in collecting
data from social media.

• Automatic summarization is especially relevant when used to
provide an overview of a news item or blog post, while avoiding
redundancy from multiple sources and maximizing the diversity of
content obtained.



Write down the steps to implement bag of words algorithm.

The steps to implement bag of words algorithm are as follows:

1. Text Normalisation: Collect data and pre-process it.

2. Create Dictionary: Make a list of all the unique words occurring
in the corpus. (Vocabulary)

3. Create document vectors: For each document in the
corpus, find out how many times the word from the unique list
of words has occurred.

4. Create document vectors for all the documents.
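
The steps above can be sketched in plain Python. The two documents are assumed to be already normalised (step 1), and the function names are invented for this illustration.

```python
# Step 1 (assumed already done): each document is a list of normalised tokens.
documents = [
    ["akash", "ajay", "best", "friend"],
    ["akash", "like", "play", "football", "ajay", "prefer", "play", "online", "game"],
]


def build_vocabulary(corpus):
    """Step 2: list every unique word, in order of first appearance."""
    vocabulary = []
    for document in corpus:
        for word in document:
            if word not in vocabulary:
                vocabulary.append(word)
    return vocabulary


def document_vector(document, vocabulary):
    """Step 3: count how often each vocabulary word occurs in one document."""
    return [document.count(word) for word in vocabulary]


vocabulary = build_vocabulary(documents)

# Step 4: repeat for every document, giving one vector per document.
vectors = [document_vector(document, vocabulary) for document in documents]

print(vocabulary)
for vector in vectors:
    print(vector)
```

There is exactly one vector per document, and each vector is as long as the vocabulary.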



________ helps in assigning a predefined category to a document and
organizing it in such a way that helps customers to find the information they want.
For example spam filtering in email, auto tagging on social media, categorization of
news articles etc.

TEXT CLASSIFICATION



Which of the following is the type of data used by NLP applications?
(a) Images
(b) Numerical data
(c) Graphical data
(d) Text and Speech

Ayushi was learning about NLP. She wanted to know the term used for the
whole textual data from all the documents altogether. Help her in identifying the term
used for it.

What is the full form of TF-IDF?

Term Frequency Inverse Document Frequency
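
To show how the two parts combine, here is a minimal sketch under one common formulation, assumed here rather than taken from the slides: term frequency is the raw count of a word in a document, inverse document frequency is log10(total documents / documents containing the word), and the score is their product. The two documents are illustrative.

```python
import math

documents = [
    ["akash", "ajay", "best", "friend"],
    ["akash", "like", "play", "football", "ajay", "prefer", "play", "online", "game"],
]


def term_frequency(word, document):
    """How many times the word occurs in one document."""
    return document.count(word)


def inverse_document_frequency(word, corpus):
    """log10(total documents / documents containing the word).

    Assumes the word occurs in at least one document of the corpus.
    """
    containing = sum(1 for document in corpus if word in document)
    return math.log10(len(corpus) / containing)


def tf_idf(word, document, corpus):
    return term_frequency(word, document) * inverse_document_frequency(word, corpus)


# "akash" appears in every document, so it carries no distinguishing value.
common_score = tf_idf("akash", documents[1], documents)

# "play" appears twice, and only in the second document, so it scores higher.
rare_score = tf_idf("play", documents[1], documents)

print(common_score, rare_score)
```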

A corpus contains 12 documents. How many document vectors will be there for
that corpus?
a. 12
b. 1
c.24
d. 1/12

Identify the type of chatbot with the information given below:
These bots work on pre-programmed instructions inside the
application/machine and are generally easy to develop. They are deployed in
the customer care section of various companies. Their job is to answer some
basic queries that they are coded for and connect them to human executives
once they are unable to handle the conversation.

Script bot



What will be the results of conversion of the term “happily” in the process of
stemming and lemmatization? Which process takes longer time for execution?

What do we get from the “bag of words” algorithm?

Bag of words gives us two things:

1. A vocabulary of words for the corpus

2. The frequency of these words (number of times it has occurred in the whole corpus)

Samiksha, a student of class X was exploring the Natural Language
Processing domain. She got stuck while performing the text normalisation.
Help her to normalise the text on the segmented sentences given below:

Document 1: Akash and Ajay are best friends.


Document 2: Akash likes to play football but Ajay prefers to play online games.

1. Tokenisation:
Document 1: Akash, and, Ajay, are, best, friends
Document 2: Akash, likes, to, play, football, but, Ajay, prefers, to, play, online, games

2. Removal of stopwords:
Document 1: Akash, Ajay, best, friends
Document 2: Akash, likes, play, football, Ajay, prefers, play, online, games

3. Converting text to a common case:
Document 1: akash, ajay, best, friends
Document 2: akash, likes, play, football, ajay, prefers, play, online, games

4. Stemming/Lemmatisation:
Document 1: akash, ajay, best, friend
Document 2: akash, like, play, football, ajay, prefer, play, online, game
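
The four steps can be chained into one small function. This is a sketch under stated assumptions: the stopword list contains only the words removed in this example, punctuation is dropped during tokenisation (as in the worked answer), and the "stemmer" merely strips a trailing "s", which happens to be enough for these two documents.

```python
import re

# Assumed stopword list: just the words removed in this worked example.
STOPWORDS = {"and", "are", "to", "but"}


def normalise(sentence: str) -> list:
    # 1. Tokenisation: keep words only, dropping punctuation.
    tokens = re.findall(r"\w+", sentence)
    # 2. Removal of stopwords.
    tokens = [token for token in tokens if token.lower() not in STOPWORDS]
    # 3. Converting text to a common case.
    tokens = [token.lower() for token in tokens]
    # 4. Very crude stemming: strip a trailing "s" from longer words.
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]


doc1 = normalise("Akash and Ajay are best friends.")
doc2 = normalise("Akash likes to play football but Ajay prefers to play online games.")

print(doc1)
print(doc2)
```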

________ is defined as the percentage of correct predictions out of all the
observations.
a) Predictions
b) Accuracy
c) Reality
d) F1 Score

What will be the outcome, if the Prediction is “Yes” and it matches
with the Reality? What will be the outcome, if the Prediction is “Yes”
and it does not match the Reality?
a) True Positive, True Negative
b) True Negative, False Negative
c) True Negative, False Positive
d) True Positive, False Positive



Recall-Evaluation method is
a) defined as the fraction of positive cases that are correctly
identified.
b) defined as the percentage of true positive cases versus all the cases
where the prediction is true.
c) defined as the percentage of correct predictions out of all the
observations.
d) comparison between the prediction and reality



Differentiate between Prediction and Reality.
a) Prediction is the input given to the machine to receive the expected
result of the reality.
b) Prediction is the output given to match the reality.
c) The prediction is the output which is given by the machine and the
reality is the real scenario in which the prediction has been made.
d) Prediction and reality both can be used interchangeably.

Which of the following statements is true for the term Evaluation?
a) Helps in classifying the type and genre of a document.
b) It helps in predicting the topic for a corpus.
c) Helps in understanding the reliability of any AI model
d) Process to extract the important information out of a corpus.



Prediction and Reality can be easily mapped together with the help of :
a) Prediction
b) Reality
c) Accuracy
d) Confusion Matrix

What is F1 Score in Evaluation?

F1 score can be defined as the measure of balance between precision and recall.

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)

Imagine that you have come up with an AI based prediction model
which has been deployed on the roads to check traffic jams. Now, the
objective of the model is to predict whether there will be a traffic jam
or not. Now, to understand the efficiency of this model, we need to
check if the predictions which it makes are correct or not. Thus, there
exist two conditions which we need to ponder upon: Prediction and
Reality.

Traffic Jams have become a common part of our lives nowadays. Living in
an urban area means you have to face traffic each and every time you get
out on the road. Mostly, school students opt for buses to go to school.
Many times, the bus gets late due to such jams and the students are not
able to reach their school on time.

Considering all the possible situations make a Confusion Matrix for the
above situation.

Case 1: Is there a traffic jam? Prediction: Yes, Reality: Yes -> True Positive
Case 2: Is there a traffic jam? Prediction: No, Reality: No -> True Negative
Case 3: Is there a traffic jam? Prediction: Yes, Reality: No -> False Positive
Case 4: Is there a traffic jam? Prediction: No, Reality: Yes -> False Negative

Confusion Matrix    Reality: Yes      Reality: No
Prediction: Yes     True Positive     False Positive
Prediction: No      False Negative    True Negative
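The four cases can also be expressed as a short Python sketch. The function name `outcome` and the sample `observations` list are hypothetical, chosen only for illustration; they are not part of the slides.

```python
from collections import Counter


def outcome(prediction: bool, reality: bool) -> str:
    """Name the confusion-matrix cell for one (prediction, reality) pair."""
    if prediction and reality:
        return "True Positive"
    if not prediction and not reality:
        return "True Negative"
    if prediction and not reality:
        return "False Positive"
    return "False Negative"


# Hypothetical observations: (did the model predict a jam?, was there a jam?)
observations = [
    (True, True),
    (False, False),
    (True, False),
    (False, True),
    (True, True),
]

# Tally how many observations fall into each cell of the confusion matrix.
counts = Counter(outcome(p, r) for p, r in observations)
for cell in ("True Positive", "False Positive", "False Negative", "True Negative"):
    print(f"{cell}: {counts[cell]}")
```

Each pair lands in exactly one cell, so the four counts always add up to the total number of observations.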

What should be the value of F1 score if the model needs to have 100% accuracy?

The model will have an F1 score of 1 if it has to be 100% accurate.
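One way to see why, assuming that "100% accurate" means the model makes no wrong predictions (FP = 0 and FN = 0) and that at least one positive case exists (TP > 0):

```latex
\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP} = \frac{TP}{TP + 0} = 1 \\
\text{Recall}    &= \frac{TP}{TP + FN} = \frac{TP}{TP + 0} = 1 \\
\text{F1 Score}  &= \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
                  = \frac{2 \times 1 \times 1}{1 + 1} = 1
\end{aligned}
```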

Give an example of a situation wherein false positive would have a high cost associated
with it.

• Let us consider a model that predicts whether a mail is spam or not.

• If the model always predicts that the mail is spam, people would
not look at it and eventually might lose important information.

• Here, the False Positive condition (predicting the mail as spam while the
mail is not spam) would have a high cost.

What is a confusion matrix? What is it used for?

• The confusion matrix is used to store the results of comparison
between the prediction and reality.

• From the confusion matrix, we can calculate parameters like recall,
precision and F1 score, which are used to evaluate the performance of an AI
model.


Explain from the given graph, how the value and occurrence of a word are related in a
corpus?

[Graph: occurrence of words plotted against their value, with regions labelled "Stop words" and "Rare / Valuable words"]

As shown in the graph, occurrence and value of a word are inversely proportional.

The words which occur most (like stop words) have negligible value.

As the occurrence of words drops, the value of such words rises. These words are termed rare or valuable
words. These words occur the least but add the most value to the corpus.
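The inverse relationship can be illustrated with a small Python sketch. The three-sentence `corpus` is hypothetical, and the `value` function uses an assumed measure (the logarithm of total documents divided by documents containing the word); the slide does not say how value is computed, so treat this only as one possible way to quantify it.

```python
import math
from collections import Counter

# Hypothetical corpus of three short documents.
corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the eagle soared over the canyon",
]

# Occurrence: how many times each word appears across the whole corpus.
occurrence = Counter(word for doc in corpus for word in doc.split())


def value(word: str) -> float:
    """Assumed value measure: log(total documents / documents containing the word)."""
    docs_with_word = sum(1 for doc in corpus if word in doc.split())
    return math.log(len(corpus) / docs_with_word)


# Print words from most to least frequent alongside their assumed value.
for word, count in occurrence.most_common():
    print(f"{word:8s} occurrence={count}  value={value(word):.2f}")
```

In this sketch the stop word "the" appears in every document and gets a value of zero, while a word that appears in only one document, such as "eagle", gets the highest value.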

Take a look at the confusion matrix:

The Confusion Matrix    Reality: Yes           Reality: No
Prediction: Yes         True Positive (TP)     False Positive (FP)
Prediction: No          False Negative (FN)    True Negative (TN)

How do you calculate F1 score?
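A minimal Python sketch of the calculation, using the standard definitions Precision = TP/(TP+FP), Recall = TP/(TP+FN) and F1 Score = 2*Precision*Recall/(Precision+Recall). The function names and the sample counts are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention, not something the slides specify.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of positive predictions that were correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0  # assumed convention for 0/0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positive cases that were correctly identified."""
    return tp / (tp + fn) if (tp + fn) else 0.0  # assumed convention for 0/0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Balance between precision and recall."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts read off a confusion matrix.
tp, fp, fn = 8, 2, 2
print(f"Precision = {precision(tp, fp):.2f}")
print(f"Recall    = {recall(tp, fn):.2f}")
print(f"F1 Score  = {f1_score(tp, fp, fn):.2f}")
```

True Negatives do not appear in any of the three formulas, which is why the functions take only TP, FP and FN.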


The output given by the AI machine is known as ________. (Prediction / Reality)

________ is used to record the result of comparison between the prediction
and reality. It is not an evaluation metric but a record which can help in evaluation.

Confusion Matrix


Raunak was learning the conditions that make up the confusion matrix. He
came across a scenario in which the machine that was supposed to predict an
animal was always predicting not an animal. What is this condition called?
(a) False Positive
(b) True Positive
(c) False Negative
(d) True Negative


Which two evaluation methods are used to calculate F1 Score?
(a) Precision and Accuracy
(b) Precision and Recall
(c) Accuracy and Recall
(d) Precision, F1 Score

Priya was confused by the terms used in the evaluation stage. Suggest to her the term
used for the percentage of correct predictions out of all the observations.
(a) Accuracy
(b) Precision
(c) Recall
(d) F1 Score


People of a village are totally dependent on the farmers for their daily food
items. Farmers grow new seeds by checking the weather conditions every
year. An AI model is being deployed in the village which predicts the chances of
heavy rain to alert farmers which helps them in doing the farming at the
right time. Which evaluation parameter out of precision, recall and F1 Score
is best to evaluate the performance of this AI model? Explain.

• Predicting heavy rain is important for farmers to protect their crops.

• Focusing only on positive predictions (storm is coming) can lead to farmers delaying their crop if the
predictions are not accurate.

• Focusing only on negative predictions (storm is not coming) can lead to damaged crops if the predictions
are not accurate.

• The best approach is to balance both: being accurate when predicting rain (precision) and catching the
important events (recall). This is what the F1 Score measures.


An automated trade industry has developed an AI model which predicts the selling
and purchasing of automobiles. During testing, the AI model came up with the
following predictions.

Confusion Matrix    Reality: Yes    Reality: No
Predicted: Yes      60              25
Predicted: No       5               10

(i) How many total tests have been performed in the above scenario?
(ii) Calculate precision, recall and F1 Score.

(i) TP=60, TN=10, FP=25, FN=5
60+25+5+10=100 total cases have been performed.

(ii) (Note: For calculating Precision, Recall and F1 score, we need not multiply
the formula by 100, as all these parameters need to range between 0 and 1.)

Precision = TP/(TP+FP) = 60/(60+25) = 60/85 = 0.71

Recall = TP/(TP+FN) = 60/(60+5) = 60/65 = 0.92

F1 Score = 2*Precision*Recall/(Precision+Recall)
= 2*0.71*0.92/(0.71+0.92)
= 0.80
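As a cross-check, substituting the precision and recall formulas into the F1 formula gives an equivalent expression in terms of the counts alone. This rearrangement is not on the slide and is offered only as a sketch. Because it skips the rounded intermediate values, its result can differ slightly in the second decimal place from a calculation that rounds precision and recall first.

```latex
\text{F1 Score}
  = \frac{2 \cdot TP}{2 \cdot TP + FP + FN}
  = \frac{2 \cdot 60}{2 \cdot 60 + 25 + 5}
  = \frac{120}{150}
  = 0.8
```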