This document is confidential and intended solely for the educational purposes of
RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only for the respective group or
learning community as addressed. If you are not the addressee, you should not
disseminate, distribute, or copy it through e-mail. Please notify the sender
immediately by e-mail if you have received this document by mistake and delete
it from your system. If you are not the intended recipient, you are notified that
disclosing, copying, distributing, or taking any action in reliance on the contents
of this information is strictly prohibited.
22IT401 - ARTIFICIAL
INTELLIGENCE AND
MACHINE LEARNING
Date: 01.02.2024
Table of Contents

S.NO.  CONTENTS                SLIDE NO.
1      CONTENTS                5
2      COURSE OBJECTIVES       7
17     EXPERT SYSTEMS          57
20     ASSIGNMENT 3 - UNIT 3   66
21     PART A Q & A            68
22     PART B Qs               72
26     ASSESSMENT SCHEDULE     78
PRE-REQUISITE CHART

(Chart: prerequisite courses leading to 22IT401 - Artificial Intelligence and Machine Learning)
• 22MA401 - Probability and Statistics
• 22CS102 - Problem Solving
• Analysis of Algorithms

COURSE OBJECTIVES
• Understand the concept of Artificial Intelligence
• Familiarize with knowledge-based AI systems and approaches
• Apply probabilistic approaches to AI
• Identify the role of Neural Networks and NLP in designing AI models
• Recognize the concepts of Machine Learning and its deterministic tools
UNIT 1 PROBLEM SOLVING AND SEARCH STRATEGIES
Introduction: What is AI, the foundations of artificial intelligence, the history of
artificial intelligence, the state of the art. Intelligent Agents: agents and
environments, good behaviour: the concept of rationality, the nature of
environments, and the structure of agents. Solving Problems by Searching: problem-
solving agents, uninformed search strategies, informed (heuristic) search strategies,
heuristic functions. Beyond Classical Search: local search algorithms and optimization
problems, searching with nondeterministic actions and partial observations, online
search agents and unknown environments. Constraint Satisfaction Problems:
definition, constraint propagation, backtracking search, local search, the structure of
problems.
List of Exercise/Experiments
1. Implementation of uninformed search algorithm (BFS and DFS).
2. Implementation of Informed Search algorithm (A* and Hill
Climbing Algorithm)
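The two uninformed searches in Exercise 1 can be sketched in Python as follows. The adjacency-list graph is a made-up example for illustration, not part of the syllabus:

```python
from collections import deque

def bfs(graph, start):
    """Breadth-first search: visit nodes level by level using a FIFO queue."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order

def dfs(graph, start, visited=None):
    """Depth-first search: follow one branch as deep as possible, then backtrack."""
    if visited is None:
        visited = []
    visited.append(start)
    for neighbour in graph.get(start, []):
        if neighbour not in visited:
            dfs(graph, neighbour, visited)
    return visited

# A hypothetical adjacency list
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
```

On this graph, BFS visits A, B, C, D while DFS visits A, B, D, C, which illustrates the queue-versus-stack difference between the two strategies.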
4. 22IT401 ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
UNIT 3 LEARNING
Learning from Examples: Forms of Learning, Supervised Learning, Learning
Decision Trees, Evaluating and Choosing the Best Hypothesis, The Theory of Learning,
Regression and Classification with Linear Models, Artificial Neural Networks. Applications:
Human computer interaction (HCI), Knowledge management technologies, AI for customer
relationship management, Expert systems, Data mining, text mining, and Web mining,
Other current topics.
List of Exercise/Experiments
1. NumPy Operations
2. NumPy Arrays
3. NumPy Indexing and Selection
4. NumPy Exercise:
(i) Write code to create a 4x3 matrix with values ranging from 2 to 13.
(ii) Write code to replace the odd numbers by -1 in the following array.
(iii) Perform the following operations on an array of mobile phones prices 6999,
7500, 11999, 27899, 14999, 9999.
a) Create a 1d-array of mobile phones prices
b) Convert this array to float type
c) Append a new mobile having price of 13999 Rs. to this array
d) Reverse this array of mobile phones prices
e) Apply GST of 18% on mobile phones prices and update this array.
f) Sort the array in descending order of price
g) What is the average mobile phone price.
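One possible solution sketch for the NumPy exercise above. The matrix and price values are the ones listed in the exercise; the variable names are our own:

```python
import numpy as np

# (i) 4x3 matrix with values ranging from 2 to 13
m = np.arange(2, 14).reshape(4, 3)

# (ii) replace the odd numbers by -1
m[m % 2 == 1] = -1

# (a) 1-D array of mobile phone prices
prices = np.array([6999, 7500, 11999, 27899, 14999, 9999])

# (b) convert this array to float type
prices = prices.astype(float)

# (c) append a new mobile having a price of 13999 Rs.
prices = np.append(prices, 13999.0)

# (d) reverse this array of prices
reversed_prices = prices[::-1]

# (e) apply GST of 18% and update the array
prices = prices * 1.18

# (f) sort the array in descending order of price
sorted_desc = np.sort(prices)[::-1]

# (g) average mobile phone price
average = prices.mean()
```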
List of Exercise/Experiments
Use Cases
Case Study 1: Churn Analysis and Prediction (Survival Modelling)
Cox-proportional models
Churn Prediction
Case Study 2: Credit card Fraud Analysis
Imbalanced Data
Neural Network
Case study 3: Sentiment Analysis or Topic Mining from New York Times
Similarity measures (Cosine Similarity, Chi-Square, N Grams)
Part-of-Speech Tagging
Stemming and Chunking
Case Study 4: Sales Funnel Analysis
A/B testing
Campaign effectiveness, Web page layout effectiveness
Scoring and Ranking
Case Study 5: Recommendation Systems and Collaborative filtering
User based
Item Based
Singular value decomposition–based recommenders
Case Study 6: Customer Segmentation and Value
Segmentation Strategies
Lifetime Value
Case Study 7: Portfolio Risk Conformance
Risk Profiling
Portfolio Optimization
Case Study 8: Uber Alternative Routing
Graph Construction
Route Optimization
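Several of the case studies above rely on similarity measures. A minimal sketch of cosine similarity over raw term counts; the two documents passed in are made-up examples:

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between two documents, using raw word counts as vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)                      # shared-term products
    norm_a = math.sqrt(sum(v * v for v in a.values()))     # vector lengths
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)
```

Identical documents score 1.0 and documents with no words in common score 0.0; real pipelines would apply TF-IDF weighting before this step.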
5. COURSE OUTCOME

Course Outcome Statements in Cognitive Domain:

Course Code | Course Outcome Statement | Cognitive Level | Expected Level of Attainment
C211.1 | Explain the problem solving and search strategies. | Understand (K2) | 70%
C211.2 | Demonstrate the techniques for knowledge representation and reasoning. | Apply (K3) | 70%
C211.3 | Interpret various forms of learning, artificial neural networks and their applications. | Apply (K3) | 70%
6.CO-PO/PSO MAPPING
LECTURE PLAN – UNIT III LEARNING

Sl. No. | Topic | No. of Periods | Pertaining CO(s) | Taxonomy Level | Mode of Delivery
1 | Learning from Examples: Forms of Learning | 1 | CO3 | K2 | BB, ICT
2 | Supervised Learning, Learning Decision Trees | 1 | CO3 | K2 | BB, ICT
3 | Evaluating and Choosing the Best Hypothesis | 1 | CO3 | K2 | BB, ICT
4 | The Theory of Learning, Regression and Classification with Linear Models | 1 | CO3 | K3 | BB, ICT
5 | Artificial Neural Networks | 1 | CO3 | K3 | BB, ICT
6 | Applications: Human Computer Interaction (HCI), Knowledge Management Technologies | 1 | CO3 | K2 | BB, ICT
7 | AI for Customer Relationship Management | 1 | CO3 | K2 | BB, ICT
Activity Based Learning - Crossword Puzzle
Activity Based Learning – Online Games
Lecture Notes - UNIT 3 LEARNING
Machine Learning (ML) is automated learning with little or no human intervention. It
involves programming computers so that they learn from available inputs. The main
purpose of machine learning is to explore and construct algorithms that can learn from
previous data and make predictions on new input data.
FORMS OF LEARNING
Concepts of Learning
Learning can be broadly classified into three categories, as mentioned below, based on
the nature of the learning data and interaction between the learner and the environment.
• Supervised Learning
• Unsupervised Learning
• Semi-supervised Learning
Similarly, machine learning algorithms fall into four categories: supervised,
unsupervised, semi-supervised, and reinforcement learning.
• However, the most commonly used ones are supervised and unsupervised learning.
Supervised Learning
Supervised learning is commonly used in real world applications, such as face and
speech recognition, products or movie recommendations, and sales forecasting.
Supervised learning can be further classified into two types -
Regression and Classification.
Supervised learning deals with learning a function from available training data. Here,
a learning algorithm analyzes the training data and produces a derived function that
can be used for mapping new examples. There are many supervised learning
algorithms such as Logistic Regression, Neural networks, Support Vector Machines
(SVMs), and Naive Bayes classifiers.
Common examples of supervised learning include classifying e-mails into spam and
not-spam categories, labeling webpages based on their content, and voice
recognition.
Unsupervised Learning
Unsupervised learning is used to detect anomalies and outliers, such as fraud or
defective equipment, or to group customers with similar behaviours for a sales
campaign. It is the opposite of supervised learning: there is no labeled data here.
When learning data contains only some indications without any description or
labels, it is up to the coder or to the algorithm to find the structure of the
underlying data, to discover hidden patterns, or to determine how to describe the
data. This kind of learning data is called unlabeled data.
Suppose that we have a number of data points, and we want to classify them into
several groups. We may not exactly know what the criteria of classification would
be. So, an unsupervised learning algorithm tries to classify the given dataset into
a certain number of groups in an optimum way.
Unsupervised learning algorithms are extremely powerful tools for analyzing data
and for identifying patterns and trends. They are most commonly used for
clustering similar input into logical groups. Unsupervised learning algorithms
include k-means, hierarchical clustering, and so on.
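A minimal clustering sketch using scikit-learn's KMeans, assuming scikit-learn is available. The 2-D points are made-up values forming two obvious groups:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy 2-D points: two visually obvious clusters
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])

# Ask k-means to find 2 groups; random_state fixes the run for reproducibility
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
labels = kmeans.labels_  # cluster index assigned to each point
```

With well-separated data like this, the first three points share one label and the last three share the other, without any labels ever being supplied.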
Semi-supervised Learning
If some learning samples are labeled but some others are not, then it is
semi-supervised learning. It makes use of a large amount of unlabeled data
together with a small amount of labeled data for training. Semi-supervised
learning is applied in cases where it is expensive to acquire a fully labeled dataset
but practical to label a small subset. For example, it often requires skilled
experts to label certain remote sensing images, and lots of field experiments to
locate oil at a particular location, while acquiring unlabeled data is relatively easy.
Reinforcement Learning
Here learning data gives feedback so that the system adjusts to dynamic
conditions in order to achieve a certain objective. The system evaluates its
performance based on the feedback responses and reacts accordingly. The best
known instances include self-driving cars and the Go-playing program AlphaGo.
If the outlook is sunny and humidity is normal, then yes, you may play tennis.
• The learning data has attribute-value pairs, as in the example shown above: Wind,
as an attribute, has two possible values - strong or weak.
• The target function has discrete output. Here, the target function is - should you
play tennis? And the possible discrete outputs are Yes and No.
ID3 Algorithm
Although there are various decision tree learning algorithms, we will explore the
Iterative Dichotomiser 3, commonly known as ID3, which was invented by Ross
Quinlan.
Entropy
Entropy(S) = -[p(1)*log2(p(1)) + p(2)*log2(p(2)) + p(3)*log2(p(3)) + … + p(n)*log2(p(n))]
where p(i) is the proportion of examples in S that belong to class i.
Information Gain
We decided to make the first decision on the basis of outlook. We could have based
our first decision on humidity or wind, but we chose outlook. Why?
Because making our decision on the basis of outlook reduces the randomness in the
outcome (whether to play or not) more than humidity or wind would have.
Let's understand this with the example here. Please refer to the play tennis dataset
that is shown above.
We have data for 14 days. We have only two outcomes :
Either we played tennis or we didn’t play.
In the given 14 days, we played tennis on 9 occasions and we did not play on 5
occasions.
Probability of playing tennis:
Number of favourable events: 9
Number of total events: 14
Probability = (Number of favourable events) / (Number of total events)
= 9/14
= 0.642
Now, we will see the probability of not playing tennis.
Probability of not playing tennis:
Number of favourable events: 5
Number of total events: 14
Probability = 5/14 = 0.357
Entropy = -(9/14)*log2(9/14) - (5/14)*log2(5/14) = 0.940
So, the entropy of the whole system before we make our first decision is 0.940.
1. Outlook
2. Temperature
3. Windy
4. Humidity
Let’s see what happens to entropy when we make our first decision on the basis
of Outlook.
Outlook
If we make a decision tree division at this level 0 based on outlook, we have three
branches possible: Sunny, Overcast, or Rain.
1. Sunny: In the given data, 5 days were sunny. Among those 5 days, tennis was
played on 2 days and tennis was not played on 3 days. What is the entropy here?
Probability of playing tennis = 2/5 = 0.4
Probability of not playing tennis = 3/5 = 0.6
Entropy = -(0.4)*log2(0.4) - (0.6)*log2(0.6) = 0.97
2. Overcast: In the given data, 4 days were overcast and tennis was played on all four
days. Since every outcome is the same, the entropy here is 0.
3. Rain: In the given data, 5 days were rainy. Among those 5 days, tennis was played
on 3 days and tennis was not played on 2 days. What is the entropy here?
Probability of playing tennis = 3/5 = 0.6
Probability of not playing tennis = 2/5 = 0.4
Entropy = -(0.6)*log2(0.6) - (0.4)*log2(0.4) = 0.97
Weighted entropy after the split = (5/14)*0.97 + (4/14)*0 + (5/14)*0.97 = 0.69
What is the reduction in randomness due to choosing outlook as the decision maker?
= 0.940 – 0.69
= 0.246
This reduction in randomness is called Information Gain. A similar calculation can be
done for the other features.
Temperature
Information Gain = 0.029
Windy
Information Gain = 0.048
Humidity
Information Gain = 0.152
We can see that the decrease in randomness, or information gain, is highest for
Outlook. So, we choose Outlook as the first decision maker.
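The entropy and information-gain calculations above can be reproduced in a few lines of Python; the class counts are the ones from the play-tennis dataset:

```python
import math

def entropy(pos, neg):
    """Shannon entropy (in bits) of a two-class collection."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            result -= p * math.log2(p)
    return result

# Whole dataset: 9 "play" days and 5 "don't play" days
base = entropy(9, 5)  # about 0.940

# Splitting on Outlook: Sunny (2 yes, 3 no), Overcast (4 yes, 0 no), Rain (3 yes, 2 no)
weighted = (5/14) * entropy(2, 3) + (4/14) * entropy(4, 0) + (5/14) * entropy(3, 2)

# Information gain = entropy before the split minus weighted entropy after it
gain_outlook = base - weighted  # about 0.246
```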
Lecture Notes- Evaluating and Choosing the
Best Hypothesis
Bayesian Belief Network in Artificial Intelligence
Bayesian networks are probabilistic because they are built from a probability
distribution and also use probability theory for prediction and anomaly detection.
A Bayesian network can be used for building models from data and expert
opinions, and it consists of two parts:
1. Directed Acyclic Graph
2. Table of conditional probabilities
Let us now discuss what learning means for a machine.
Training the system: While training the model, data is usually split in the ratio
80:20, i.e., 80% as training data and the rest as testing data. For the training data,
we feed in both the input and the expected output. The model learns from the
training data only. We use different machine learning algorithms to build our model.
Learning means that the model builds some logic of its own.
Once the model is ready, it can be tested. At testing time, input is fed from the
remaining 20% of the data, which the model has never seen before. The model
predicts some value, and we compare it with the actual output to calculate the
accuracy.
For example, in Figure B above, the output (wind speed) does not take discrete
values but is continuous within a particular range. The goal here is to predict a value
as close to the actual output value as our model can, and evaluation is done by
calculating the error value. The smaller the error, the greater the accuracy of our
regression model.
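The error-based evaluation just described is commonly computed as mean squared error; the actual and predicted values below are made-up illustrations:

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical wind-speed readings vs. model predictions
actual = [12.0, 15.5, 9.8]
predicted = [11.5, 16.0, 10.0]
error = mean_squared_error(actual, predicted)
```

A smaller value of `error` means the regression model tracks the true values more closely, matching the evaluation rule stated above.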
Example of Supervised Learning Algorithms:
• Linear Regression
• Logistic Regression
• Nearest Neighbor
• Decision Trees
• Random Forest
Regression: In regression, the output variable is a continuous variable, and the goal
is to predict its value based on the input variables. Examples of regression problems
include predicting stock prices, weather forecasting, and sales forecasting.
• Supervised learning algorithms are widely used in various fields, such as natural
language processing, computer vision, medical diagnosis, speech recognition, and
many others. Some of the popular supervised learning algorithms include: linear
regression, logistic regression, decision trees, random forest, k-nearest neighbors
(KNN), support vector machines (SVM), and neural networks.
For example, predicting the number of copies a music album will sell next month is a
regression task.
Further reading:
• https://machinelearningmastery.com/logistic-regression-for-machine-learning/
• https://machinelearningmastery.com/linear-regression-for-machine-learning/
# Python code to illustrate classification using the Iris data set

# Importing the required libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report

# Importing the dataset
dataset = pd.read_csv(
    'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data',
    sep=',', header=None)
data = dataset.iloc[:, :]

# Checking for null values
print("Sum of NULL values in each column:")
print(data.isnull().sum())

# Separating the feature columns from the target column
X = data.iloc[:, :-1].values
y = data.iloc[:, 4].values

# Encoding the target variable (species names to integers)
labelencoder_y = LabelEncoder()
y = labelencoder_y.fit_transform(y)

# Splitting the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Using the random forest classifier for the prediction
classifier = RandomForestClassifier()
classifier.fit(X_train, y_train)
predicted = classifier.predict(X_test)

# Printing the results
print('Confusion Matrix:')
print(confusion_matrix(y_test, predicted))
print('Accuracy Score:', accuracy_score(y_test, predicted))
print('Report:')
print(classification_report(y_test, predicted))
Artificial Neural Network
What is Artificial Neural Network?
The given figures illustrate the typical diagram of a biological neural network and
what a typical artificial neural network looks like. Dendrites from the biological
neural network represent inputs in artificial neural networks, the cell nucleus
represents nodes, synapses represent weights, and the axon represents the output.

Biological NN     Artificial NN
Dendrites         Inputs
Cell nucleus      Nodes
Synapse           Weights
Axon              Output
An artificial neural network is a system in the field of artificial intelligence that
attempts to mimic the network of neurons that makes up a human brain, so that
computers have an option to understand things and make decisions in a
human-like manner. The artificial neural network is designed by programming
computers to behave simply like interconnected brain cells.
The human brain contains on the order of 86 billion neurons. Each neuron has an
association point somewhere in the range of 1,000 to 100,000. In the human
brain, data is stored in a distributed manner, and we can extract more than one
piece of this data when necessary from our memory in parallel. We can say that
the human brain is made up of incredibly amazing parallel processors.
Input Layer:
The input layer accepts inputs from the external world and passes them on to the
hidden layer.
Hidden Layer:
The hidden layer sits between the input and output layers. It performs all the
calculations to find hidden features and patterns.
Output Layer:
The input goes through a series of transformations in the hidden layer, which
finally results in output that is conveyed through this layer.
The artificial neural network takes the inputs, computes the weighted sum of the
inputs, and includes a bias. This computation is represented in the form of a
transfer function.
The weighted total is then passed as input to an activation function to produce the
output. Activation functions choose whether a node should fire or not. Only those
that fire make it to the output layer. There are distinctive activation functions
available that can be applied depending on the sort of task we are performing.
If the weighted sum is zero, a bias is added to make the output non-zero or to
scale up the system's response. The bias has an input of 1 and a weight of 1. The
total of the weighted inputs can range from 0 to positive infinity. Here, to keep the
response within the limits of the desired value, a certain maximum value is
benchmarked, and the total of weighted inputs is passed through the activation
function.
The activation function refers to the set of transfer functions used to achieve the
desired output. There are different kinds of activation functions, primarily either
linear or non-linear sets of functions. Some of the commonly used activation
functions are the binary, linear, and tan hyperbolic (sigmoidal) activation
functions. Let us take a look at the binary function in detail:
Binary:
In a binary activation function, the output is either a one or a zero. To accomplish
this, a threshold value is set up. If the net weighted input of the neuron exceeds the
threshold, the final output of the activation function is returned as 1; otherwise, the
output is returned as 0.
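The binary (step) activation just described can be sketched as a single neuron. The weights and bias below are made-up values that happen to implement a logical AND:

```python
def binary_neuron(inputs, weights, bias, threshold=0.0):
    """One neuron: weighted sum plus bias, passed through a binary step activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > threshold else 0
```

With weights (0.6, 0.6) and bias -1.0, the neuron fires only when both inputs are 1, so it behaves as an AND gate; changing the weights or threshold changes which input patterns fire.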
Feed-Forward ANN:
Applications:
• Expert systems
• Web mining
• Industrial design − for interactive products like mobile phones, microwave ovens,
etc.
Human computer interaction (HCI)
What Is Human-Computer Interaction (HCI)?
HCI is the study of how people interact with computers, especially as it relates to
technology design. User-centered design, UI, and UX are combined with HCI to provide
intuitive technology and products.
HCI specialists consider how to develop and deploy computer systems that satisfy human
users. The majority of this research focuses on enhancing human-computer interaction by
improving how people use and comprehend an interface.
The User
When using a computer, a user always has a purpose or aim in mind. To achieve this, the
computer presents a digital representation of things.
The Interface
An essential HCI element that can improve the quality of user interaction is the interface.
Many interface-related factors need to be taken into account, including the type of
interaction, screen resolution, display size, and even color contrast.
The Context
HCI is not only about providing better communication between users and computers but
also about factoring in the context and environment in which the system is accessed.
Examples of HCI
Let's examine some well-known HCI examples that have sped up the field's development:
• Speech recognition
• Cloud computing
Knowledge management technologies
New-age information solutions require the latest technology, and one of the
fast-paced developments in customer service is easy solutions powered by AI
(Artificial Intelligence) tied to knowledge management.
Right now, it is vital to give customers exactly what they want, and quickly, so
adopting this technology will provide a positive CX and an excellent brand image.
1. Assigning responsibilities
The knowledge management process helps in assigning responsibilities to the
concerned parties. The designation of the employees is updated, and their
responsibilities are filled in.
2. Content retargeting
Retargeting content helps reduce the labor of the content and product teams. A
knowledge base article, once created, can be adapted for any medium and any
channel. This provides seamless circulation, eliminating multiple pieces of work on
the same topic. Admins can set the period during which knowledge is scheduled or
published.
3. Analytics and feedback
An analytics dashboard makes comparing set standards against achieved targets
easy and feasible. Users of an effective knowledge management platform are free
to rate and comment on it. This helps judge the practical credibility of users,
creators, and knowledge management solutions.
4. Cloud repository
The knowledge base created can be online, offline, cloud, or on-premise, making
accessibility extremely flexible, customizable, and secure. 24×7 accessibility to
authorized parties helps to reduce the time gap between the arrival of the
problem and the delivery of an apt solution.
5. Customizable IT knowledge base
A knowledge management system represents brand tonality to its users. These
users can be both employees and self-service users. A knowledge website or
application can be fully customized per the client’s needs to ensure easy usage
and a friendly UI. Multiple templates are available.
6. Omni-channel service
Overwhelming customers can backfire, but supporting them by being present in
exactly the manner they want is nothing short of a miracle.
Online knowledge management has information and solutions in guides, charts,
images, videos, and chats.
7. Implement SOP
Adherence to Standard Operating Procedure is not killing creativity but kindling it
towards productive results. It helps establish a chain of command and syncs the
activities of employees on multiple fronts.
8. Better decision making
A good knowledge management tool has multiple options, from archiving
knowledge to acquiring facts, data, leads, feedback, communication, and
rating. DIY knowledge creation helps in user-generated content, thus being more
relevant as it is passed through expert magnification.
AI for customer relationship
management
Combining generative AI with CRM allows companies to automate business
processes better, develop more personalized communications, and provide
customers with the most helpful answers to questions.
At its core, CRM AI helps businesses better organize customer information and
access that information more easily. This includes contact details, demographic
data, communication history, purchase history, and other pertinent data used to
build sales opportunities and better serve customers.
On the sales side, CRM with embedded AI gives users things like:
• Recommendations
Intelligent case routing may be a coming AI feature on the customer service side.
The goal of artificial intelligence in CRM is to let AI handle the analysis and make
intelligent recommendations about a customer or prospect based on all the data
about that person the system has collected.
With AI, a salesperson can open a contact record and ask the system for
suggestions on how to best connect with that person without spending time
sifting through company news and Twitter or LinkedIn profiles.
Examples of CRM With AI
Not surprisingly, some of the biggest names in CRM eagerly tout their artificial
intelligence functionality.
Companies like Salesforce, HubSpot, SugarCRM, Zoho, and Microsoft have all integrated
some form of AI into their platforms.
HubSpot
HubSpot has released an alpha version of ChatSpot, which combines GPT-4, HubSpot
CRM, DALL·E 2, and Google Docs.
In this video, HubSpot co-founder Dharmesh Shah demonstrates AI use cases for sales,
reporting, and marketing.
Salesforce
Salesforce’s original AI platform was called Einstein. Salesforce billed Einstein as “a layer
of artificial intelligence that delivers predictions and recommendations based on your
unique business processes and customer data.”
Einstein GPT can be integrated into many facets of the Salesforce platform, including
sales, service, marketing, commerce, and Tableau (analytics).
An impressive marketing use of Einstein GPT is its ability to design an event or campaign-
specific landing page, including text, image, and web form.
Microsoft
In March 2023, Microsoft introduced Microsoft Dynamics 365 Copilot for CRM and
ERP.
Generative AI functionality will be available in Dynamics 365 Sales and Viva Sales,
Dynamics 365 Customer Service, Dynamics 365 Customer Insights, and Dynamics 365
Marketing.
What is an Expert System?
Expert systems are computer applications developed to solve complex problems in a
particular domain, at a level of extraordinary human intelligence and expertise.
Characteristics of Expert Systems:
• High performance
• Understandable
• Reliable
• Highly responsive

Capabilities of Expert Systems:
• Advising
• Demonstrating
• Deriving a solution
• Diagnosing
• Explaining
• Interpreting input
• Predicting results
Knowledge Base
It contains domain-specific and high-quality knowledge.
Knowledge is required to exhibit intelligence. The success of any ES majorly depends
upon the collection of highly accurate and precise knowledge.
What is Knowledge?
Data is a collection of facts. Information is data organized as facts about the task
domain. Data, information, and past experience combined together are termed as
knowledge.
Components of Knowledge Base
Knowledge representation
Knowledge Acquisition
User Interface
The user interface provides interaction between the user of the ES and the ES
itself. It generally uses Natural Language Processing so that it can be used by a
user who is well-versed in the task domain. The user of the ES need not
necessarily be an expert in Artificial Intelligence.
The ES also explains how it has arrived at a particular recommendation. The
explanation may appear in the following forms −
• Natural language displayed on screen.
• Verbal narrations in natural language.
• Listing of rule numbers displayed on the screen.
The user interface makes it easy to trace the credibility of the
deductions.
Expert Systems Limitations
No technology can offer an easy and complete solution. Large systems are costly
and require significant development time and computing resources. ESs have their
limitations, which include −
• Limitations of the technology
• Difficult knowledge acquisition
• ES are difficult to maintain
• High development costs
Applications of Expert System
The following table shows where ES can be applied.
The Data Mining process breaks down into the following steps –
1. Collect, Extract, Transform and Load the data into the data
warehouse
2. Store and manage the data in the database or on the cloud.
3. Provide access to data to the business analyst, management
teams, and Information Technology professionals.
Text Mining
The basic idea behind Text Mining is to find patterns in large datasets
that can be used for various purposes.
There are three main types of Web Data, as shown in the above image. Let's
discuss these Web Data types in brief.
Web Content Data: The widespread forms of data in web content are HTML, web
pages, images, etc. All these various data types constitute web content data. The
main layout language for Internet/web content is HTML; there are slight
differences depending on the browser, but the basic layout structure is the same
everywhere.
Web Structure Data: On a typical web page, the contents are arranged
within HTML tags. The pages are hyperlinked, allowing users to navigate
back and forth to find relevant information. So basically, relationship/links
describing the connection between webpages is web structure data.
Web Usage Data: The main data here is generated by the web server and
application server of a typical web page. The web/application server collects log
data, including information about users such as their geographical location, access
time, and the content they interacted with.
Natural Language Processing (NLP) - Introduction
Definition
NLP is a subfield of artificial intelligence that deals with the interaction between
computers and human languages. It encompasses a wide range of tasks, from basic
language understanding to advanced language generation.
Key Concepts
• Tokenization: Breaking down text into smaller units, such as words or phrases.
• Part-of-Speech Tagging: Assigning grammatical categories to words.
• Named Entity Recognition (NER): Identifying and classifying entities (e.g., names,
locations) in text.
3. NLP Workflow
Text Preprocessing
Tokenization: Breaking down text into individual units.
Stopword Removal: Eliminating common words without significant meaning.
Stemming/Lemmatization: Reducing words to their base form.
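A toy sketch of the preprocessing steps above. The stopword list and the suffix-stripping "stemmer" are deliberately naive stand-ins for what real NLP libraries provide:

```python
# A tiny, made-up stopword list (real libraries ship much larger ones)
STOPWORDS = {"the", "is", "are", "a", "an", "and", "of", "to"}

def preprocess(text):
    """Tokenize, remove stopwords, then crudely stem by stripping 'ing'/'s'."""
    tokens = text.lower().split()                        # tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]   # stopword removal
    stemmed = []
    for t in tokens:
        if t.endswith("ing"):                            # naive stemming rules
            t = t[:-3]
        elif t.endswith("s"):
            t = t[:-1]
        stemmed.append(t)
    return stemmed
```

Note that naive suffix stripping produces non-words ("running" becomes "runn"), which is why practical pipelines use proper stemmers or lemmatizers instead.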
Feature Extraction
Word Embeddings: Capturing semantic relationships between words.
TF-IDF (Term Frequency-Inverse Document Frequency): Assigning weights to words
based on their importance in a document.
Model Training and Evaluation
Supervised Learning: Training models on labeled datasets for tasks
like sentiment analysis or named entity recognition.
Unsupervised Learning: Extracting patterns and relationships from
unlabeled data, as seen in clustering or topic modeling.
Virtual Assistants
NLP powers virtual assistants like Siri or Alexa, enabling them to
understand and respond to user queries.
Information Retrieval
Enhancing search engines to provide more relevant and accurate
results.
Healthcare Applications
Analyzing medical texts for information extraction, diagnosis, and
research.
Conclusion
Natural Language Processing plays a pivotal role in bridging the gap
between human languages and computing systems. From machine
translation to sentiment analysis, NLP empowers various applications
that enhance our interaction with technology. While facing challenges
related to the complexity and diversity of language, continuous
advancements in NLP contribute to the development of more
intelligent and language-aware systems.
ASSIGNMENT – UNIT III
2. Provide an outline of the ID3 algorithm used for inducing a decision tree from
the training tuples.
a) If your task is to build a model for COVID - 10 daily cases prediction, which
learning category would you use? Justify your answer.
Very hard: (CO3, K4)
7. What is CART?
CART (Classification and Regression Trees) is a machine learning algorithm used
for both classification and regression tasks. It works by recursively partitioning the
input space into regions and fitting a simple model (e.g., a constant prediction) to
each region. CART builds binary trees, splitting the data into two subsets at each
node based on a threshold value for a chosen feature.
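The core CART step, choosing the threshold that best splits the data, can be sketched as follows. This minimal version uses Gini impurity for a single numeric feature and binary labels; the toy dataset is an assumption for illustration:

```python
def gini(labels):
    # Gini impurity of a list of binary labels (0/1).
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of class 1
    return 1.0 - p * p - (1 - p) * (1 - p)

def best_split(xs, ys):
    # Try each distinct feature value as a threshold; return the one
    # minimizing the weighted Gini impurity of the two subsets.
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [0, 0, 0, 1, 1, 1]
print(best_split(xs, ys))  # (3.0, 0.0) -- a perfect split
```

A full CART implementation applies this split recursively to each subset until a stopping criterion (depth, purity, minimum samples) is met.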
8. Define ANN.
An artificial neural network (ANN) is a computational model inspired by the
structure and function of biological neural networks. It consists of interconnected
nodes (neurons) organized in layers: an input layer, one or more hidden layers,
and an output layer. ANNs are used for tasks such as pattern recognition,
classification, regression, and optimization.
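A single forward pass through a tiny 2-2-1 network (two inputs, one hidden layer of two neurons, one output) can be sketched as below. The weights are arbitrary values chosen for illustration, not trained parameters:

```python
import math

def sigmoid(z):
    # Standard logistic activation, squashing any input into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    # Hidden layer: weighted sum of inputs, then sigmoid activation.
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for w, b in zip(w_hidden, b_hidden)]
    # Output layer: weighted sum of hidden activations, then sigmoid.
    return sigmoid(sum(wo * h for wo, h in zip(w_out, hidden)) + b_out)

w_hidden = [[0.5, -0.2], [0.3, 0.8]]  # one weight row per hidden neuron
b_hidden = [0.0, 0.1]
w_out = [1.0, -1.0]
b_out = 0.0
y = forward([1.0, 2.0], w_hidden, b_hidden, w_out, b_out)
print(round(y, 3))  # a value in (0, 1)
```

Training such a network means adjusting the weights and biases, typically by backpropagation, so that the output approaches the desired targets.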
10. Write a short note on machine learning and its types. Explain how ML is
applied in CRM (Customer Relationship Management).
SUPPORTIVE ONLINE COURSES – UNIT III
https://onlinecourses.nptel.ac.in/noc21_cs42/preview
An Introduction to Artificial Intelligence
By Prof. Mausam | IIT Delhi
https://www.coursera.org/learn/computational-thinking-problem-solving
https://www.coursera.org/learn/artificial-intelligence-education-for-teachers
https://www.coursera.org/specializations/ai-healthcare
https://www.coursera.org/learn/predictive-modeling-machine-learning
https://www.drdobbs.com/parallel/the-practical-application-of-prolog/184405220
REAL TIME APPLICATION- UNIT III
Neural networks find extensive application in areas where traditional computing
does not fare well: problems where, instead of fixed programmed outputs, you
want the system to learn, adapt, and change its results in step with the data you
feed it. Neural networks are also heavily used for dealing with noisy or incomplete
data, and most real-world data is indeed noisy.
With their brain-like ability to learn and adapt, neural networks form the basis of
many Artificial Intelligence systems and, consequently, of Machine Learning
algorithms. Before we get to how neural networks power Artificial Intelligence,
let us first talk briefly about what Artificial Intelligence is.
For the longest time, the word "intelligence" was associated only with the human
brain. Then scientists found a way of training computers inspired by the way our
brain works. Thus came Artificial Intelligence, which can essentially be defined as
intelligence originating from machines: giving machines the ability to "think",
"learn", and "adapt".
With that background, it is worth understanding the main use cases of AI and how
neural networks help the cause. Let us look at the applications of neural networks
across various domains: from social media and online shopping to personal
finance and the smart assistant on your phone.
This list is in no way exhaustive; the applications of neural networks are
widespread. Almost anything that makes machines learn deploys one type of
neural network or another.
Social Media
The ever-increasing data deluge surrounding social media gives the creators of these
platforms the unique opportunity to dabble with the unlimited data they have. No
wonder you get to see a new feature every fortnight. It is fair to say that all of this
would have remained a distant dream without neural networks to save the day.
Neural Networks and their learning algorithms find extensive applications in the world of
social media. Let’s see how:
Facebook
As soon as you upload any photo to Facebook, the service automatically highlights faces
and prompts friends to tag. How does it instantly identify which of your friends is in the
photo?
The answer is simple – Artificial Intelligence. In a video highlighting Facebook’s Artificial
Intelligence research, they discuss the applications of Neural Networks to power their
facial recognition software. Facebook is investing heavily in this area, not only within the
organization, but also through the acquisitions of facial-recognition startups
like Face.com (acquired in 2012 for a rumored $60M), Masquerade (acquired in 2016 for
an undisclosed sum), and Faciometrics (acquired in 2016 for an undisclosed sum).
In June 2016, Facebook announced a new Artificial Intelligence initiative that uses
various deep neural networks such as DeepText – an artificial intelligence engine
that can understand the textual content of thousands of posts per second, with
near-human accuracy.
Instagram
Instagram, acquired by Facebook back in 2012, uses deep learning, employing
recurrent neural networks to identify the contextual meaning of an emoji, which
has been steadily replacing slang (for instance, a laughing emoji could replace
"rofl").
By algorithmically identifying the sentiments behind emojis, Instagram creates
and auto-suggests emojis and emoji-related hashtags. This may seem like a
minor application of AI, but being able to interpret and analyze this emoji-to-text
translation at a larger scale sets the basis for further analysis of how people use
Instagram.
Online Shopping
Do you find yourself in situations where you’re set to buy something, but you end
up buying a lot more than planned, thanks to some super-awesome
recommendations?
Yeah, blame neural networks for that. By making use of neural networks and what
they learn, the e-commerce giants are creating Artificial Intelligence systems that
know you better than you know yourself. Let's see how:
Search
Your Amazon searches (“earphones”, “pizza stone”, “laptop charger”, etc) return a
list of the most relevant products related to your search, without wasting much
time. In a description of its product search technology, Amazon states that
its algorithms learn automatically to combine multiple relevant features. It uses
past patterns and adapts to what is important for the customer in question.
And what makes the algorithms “learn”? You guessed it right – Neural Networks!
Recommendations
Amazon shows you recommendations using its “customers who viewed this item
also viewed”, “customers who bought this item also bought”, and also via curated
recommendations on your homepage, on the bottom of the item pages, and
through emails. Amazon makes use of artificial neural networks to train its
algorithms to learn the patterns and behaviour of its users. This, in turn, helps
Amazon provide better, more personalized recommendations.
CONTENT BEYOND SYLLABUS – UNIT III
ASSESSMENT SCHEDULE
S.NO | Name of the Assessment | Start Date | End Date | Portion
PRESCRIBED TEXT BOOKS AND REFERENCE BOOKS
REFERENCES:
Mini Projects
Air and water quality index and environment monitoring