Unit 1: INTRODUCTION TO AI
Important Topics:
• Types of Intelligence
• How to identify an AI
• AI vs ML vs DL
• AI Domains
Unit 2: AI PROJECT CYCLE
Important Topics:
• Different Stages
• 4W Canvas
• Neural Networks
Unit 3: NATURAL LANGUAGE PROCESSING
Important Topics:
• Text Normalization
• Bag of words
• TFIDF (optional)
Unit 4: EVALUATION
Important Topics:
• Confusion matrix
Ans:
Ans:
• A machine becomes intelligent by being trained with data and
algorithms.
• AI machines keep updating their knowledge to optimize
their output.
Introduction to AI
Differentiate between what is AI and what is not AI with the
help of an example.
Introduction to AI
What do you understand by Deep Learning?
Ans:
Ans:
If you wish to predict your next salary, you would feed in
the data of your previous salaries and train your model on it.
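The salary example can be sketched as a small regression fit. This is only an illustration: the salary figures are hypothetical, and a straight-line least-squares fit is assumed as the model, which the original answer does not specify.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: year of service and salary in that year.
years = [1, 2, 3, 4]
salaries = [30000, 32000, 34000, 36000]

slope, intercept = fit_line(years, salaries)
predicted = slope * 5 + intercept  # prediction for the next (5th) year
print(predicted)  # 38000.0
```

The previous salaries act as training data, and the fitted line is the trained model used for the prediction.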
Project Cycle
What is an Artificial Neural Network? Explain the layers in an artificial
neural network.
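Input Layer: This layer accepts all the inputs provided by the programmer.
Hidden Layer: Between the input and output layers, one or more hidden layers carry out the computations on the inputs.
Output Layer: The final results are delivered via this layer.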
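A minimal sketch of how data could flow through such a network, from the input layer through one hidden layer to the output layer. The weights, biases, layer sizes and the sigmoid activation are all assumptions chosen for illustration; a real network learns its weights during training.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(inputs):
    # Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
    hidden = dense(inputs, weights=[[0.5, -0.4], [0.3, 0.8]], biases=[0.0, 0.1])
    output = dense(hidden, weights=[[1.2, -0.7]], biases=[0.05])
    return hidden, output

# The input layer simply receives the values supplied by the programmer.
hidden, output = forward([1.0, 0.0])
print("hidden layer:", hidden)
print("output layer:", output)
```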
Project Cycle
Differentiate between Classification and Regression.
Ans:
Project Cycle
Natural Language Processing
Give 2 points of difference between a script-bot and a smart-bot.
Natural Language Processing
Explain the term Text Normalization in Data Processing.
Ans:
Ans:
• Virtual Assistants: With the help of speech recognition, these assistants can
not only detect our speech but also help with everyday tasks like setting an
alarm.
Natural Language Processing
Name any 2 applications of Natural Language Processing which are
used in real-life scenarios.
Ans:
• Automatic Summarization
• Sentiment Analysis
• Text classification
• Virtual Assistants
Natural Language Processing
What will be the output for the word “studies” after each of the
following:
a. Lemmatization
b. Stemming
Ans:
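Lemmatization returns a meaningful dictionary word, so “studies” becomes “study”, while stemming only strips the affix, so a Porter-style stemmer gives “studi”. The toy sketch below illustrates the difference; the `LEMMAS` dictionary and the suffix rules are hypothetical stand-ins for a real lexicon and a real stemming algorithm.

```python
# Hypothetical mini-dictionary standing in for a real lemma lexicon.
LEMMAS = {"studies": "study", "studying": "study", "studied": "study"}

# Simplified suffix rules (suffix, replacement), checked in this order.
SUFFIX_RULES = (("ies", "i"), ("ing", ""), ("ed", ""), ("es", ""), ("s", ""))

def lemmatize(word):
    """Return the dictionary form if it is known, else the word unchanged."""
    word = word.lower()
    return LEMMAS.get(word, word)

def stem(word):
    """Strip a common suffix without checking that the result is a real word."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)] + replacement
    return word

print(lemmatize("studies"))  # study
print(stem("studies"))       # studi
```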
Ans:
Ans:
The whole textual data from all the documents taken together
is known as the corpus.
Natural Language Processing
Identify any 2 stopwords in the given sentence:
Ans:
Stopwords in the given sentence are: is, the, of, that, into, are, and
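Stopword removal can be sketched as a simple filter over the tokens. The stopword set below is taken from the answer above; the example sentence is hypothetical, since the original sentence is not reproduced here, and real stopword lists are much longer.

```python
# Stopwords listed in the answer above (a real list would be longer).
STOPWORDS = {"is", "the", "of", "that", "into", "are", "and"}

def remove_stopwords(sentence):
    """Split on whitespace and separate stopwords from the remaining tokens."""
    tokens = sentence.lower().split()
    kept = [t for t in tokens if t not in STOPWORDS]
    removed = [t for t in tokens if t in STOPWORDS]
    return kept, removed

# Hypothetical sentence used only for illustration.
sentence = "the corpus is the collection of documents"
kept, removed = remove_stopwords(sentence)
print(kept)     # ['corpus', 'collection', 'documents']
print(removed)  # ['the', 'is', 'the', 'of']
```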
Natural Language Processing
“Automatic summarization is used in NLP applications”. Is the
given statement correct? Justify your answer with an example.
Ans:
Ans:
1. Document Classification
Helps in classifying the type and genre of a document.
2. Topic Modelling
It helps in predicting the topic for a corpus.
Ans:
Ans:
The occurrence and value of a word are inversely proportional.
• The words which occur most (like stop words) have negligible value.
• Rare or valuable words occur the least but add the most value to the corpus.
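The inverse relation between occurrence and value can be seen in a small TF-IDF calculation. This sketch assumes the commonly taught form TFIDF(W) = TF(W) × log10(N / DF(W)), where N is the number of documents and DF(W) is the number of documents containing W; the three-document corpus is hypothetical.

```python
import math

# Hypothetical corpus, already normalized and tokenized.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "cat", "purred"],
]

def document_frequency(word, corpus):
    """Number of documents in which the word appears at least once."""
    return sum(1 for doc in corpus if word in doc)

def tfidf(word, doc, corpus):
    """Term frequency in doc times log10 of inverse document frequency."""
    tf = doc.count(word)
    df = document_frequency(word, corpus)
    if df == 0:
        return 0.0
    return tf * math.log10(len(corpus) / df)

print(tfidf("the", corpus[0], corpus))  # 0.0, occurs in every document
print(tfidf("cat", corpus[0], corpus))  # log10(3/2), occurs in two documents
print(tfidf("dog", corpus[1], corpus))  # log10(3), occurs in one document
```

A word that appears in every document scores zero, while the word that appears in only one document scores highest.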
Natural Language Processing
Classify each of the images according to how well the model’s output matches
the data samples:
1. The model’s output does not match the true function at all. Hence the model
is said to be underfitting and its accuracy is lower.
2. The model tries to cover all the data samples, even if they are out of
alignment. This model is said to be overfitting and also has lower accuracy.
3. The model’s performance matches the true function well; this is called a
perfect fit.
Evaluation
What is F1 Score in Evaluation?
Ans:
F1 score can be defined as the measure of balance between
precision and recall.
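Written out, this balance is the harmonic mean of the two metrics, which is the standard definition of the F1 score:

```latex
\[
F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
\]
```

The score is high only when both Precision and Recall are high, and it equals 1 only when both are 1.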
Evaluation
Give an example of a situation wherein false positive would have a
high cost associated with it.
Ans:
• If the model always predicts that the mail is spam, people would
not look at it and eventually might lose important information.
Ans:
This is because our model will simply remember the whole training
set, and will therefore always predict the correct label for any point
in the training set.
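A minimal sketch of this effect, using a hypothetical model that does nothing but memorize its training examples. The data and the `MemorizingModel` class are invented for illustration only.

```python
class MemorizingModel:
    """Stores every training example and looks labels up verbatim."""

    def __init__(self, default_label):
        self.memory = {}
        self.default_label = default_label

    def fit(self, samples, labels):
        for sample, label in zip(samples, labels):
            self.memory[sample] = label

    def predict(self, sample):
        # Unseen samples fall back to a fixed guess.
        return self.memory.get(sample, self.default_label)

def accuracy(model, samples, labels):
    correct = sum(model.predict(s) == y for s, y in zip(samples, labels))
    return correct / len(samples)

# Hypothetical data: the label is 1 when the number is even.
train_x, train_y = [1, 2, 3, 4], [0, 1, 0, 1]
test_x, test_y = [5, 6, 7, 8], [0, 1, 0, 1]

model = MemorizingModel(default_label=0)
model.fit(train_x, train_y)
print(accuracy(model, train_x, train_y))  # 1.0
print(accuracy(model, test_x, test_y))    # 0.5
```

Evaluating on the training set reports a perfect score even though the model has learned nothing it can apply to new data.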
Evaluation
Which evaluation metric would be crucial in the following cases?
Justify your answer.
a. Mail Spamming
• If the model always predicts that the mail is spam, people would
not look at it and eventually might lose important information.
b. Gold Mining
• A model says that treasure exists at a point and you keep
on digging there, but it turns out to be a false alarm.
c. Viral Outbreak
a. Lack of Training Data: If the data is not sufficient for developing an AI model,
or if some data is missed while training the model, the model will not be efficient.
Precision: The percentage of true positive cases out of all the cases where the prediction is positive (True Positives + False Positives).
Recall: The fraction of actual positive cases that are correctly identified.
False Positive (impacts Precision): A person is predicted as high risk but does not
have a heart attack.
False Negative (impacts Recall): A person is predicted as low risk but has a heart
attack.
Therefore, False Negatives miss actual heart patients, so the Recall metric needs
more improvement. False Negatives are more dangerous than False Positives.
Evaluation
Calculate Accuracy, Precision, Recall and F1 Score for the following Confusion
Matrix on SPAM FILTERING. Also suggest which metric would not be a good
evaluation parameter here, and why.
Precision: The percentage of true positive cases out of all the cases where the prediction is positive (True Positives + False Positives).
Recall: The fraction of actual positive cases that are correctly identified.
False Positive (impacts Precision): Mail is predicted as “spam” but is not spam.
False Negative (impacts Recall): Mail is predicted as “not spam” but is actually spam.
Of course, too many False Negatives will make the Spam Filter ineffective. But
False Positives may cause important mails to be missed. Hence, Precision is more
important to improve.
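The four metrics can be computed from the confusion matrix counts as sketched below. The counts here are hypothetical, because the confusion matrix from the question is not reproduced in this text; substitute the actual values to get the required answer.

```python
def evaluate(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts: tp = spam caught, fp = genuine mail flagged as spam,
# fn = spam that slipped through, tn = genuine mail delivered normally.
metrics = evaluate(tp=30, tn=40, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these assumed counts, Precision (0.75) is higher than Recall (0.60); in spam filtering a low Precision would be the more worrying sign, because it means genuine mail is being flagged.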
Evaluation
Thank you!