
MACHINE LEARNING LABORATORY MANUAL [ICS-553]

B.TECH III YEAR – V SEM
[A.Y: 2023-2024]

Department of Computer Science and Engineering

Vision

 To impart quality education and instill high standards of discipline, making the students technologically superior and ethically strong, thereby improving the quality of life of the human race.

Mission

 To achieve and impart holistic technical education using the best infrastructure and outstanding technical and teaching expertise, to establish the students as competent and confident engineers.

 To evolve a centre of excellence through creative and innovative teaching-learning practices, promoting academic achievement to produce internationally accepted, competitive and world-class professionals.
PROGRAMME EDUCATIONAL OBJECTIVES

PEO1 – ANALYTICAL SKILLS

 To facilitate the graduates with the ability to visualize, gather information, articulate, analyze, solve complex problems, and make decisions. These are essential to address the challenges of complex and computation-intensive problems and to increase their productivity.

PEO2 – TECHNICAL SKILLS

 To facilitate the graduates with the technical skills that prepare them for immediate employment and for certification, providing a deeper understanding of the technology in advanced areas of computer science and related fields, thus encouraging them to pursue higher education and research based on their interests.
PEO3 – SOFT SKILLS

 To facilitate the graduates with the soft skills that include fulfilling the mission, setting goals, showing self-confidence by communicating effectively, having a positive attitude, getting involved in team-work, being a leader, and managing their career and their life.
PEO4 – PROFESSIONAL ETHICS

 To facilitate the graduates with the knowledge of professional and ethical responsibilities by paying attention to grooming, being conservative with style, following dress codes and safety codes, and adapting themselves to technological advancements.
PROGRAM OUTCOMES

Engineering Graduates will be able to:


1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.
2. Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.
3. Design / development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for, sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change.

1. Lab Objectives:
 Make use of data sets in implementing the machine learning algorithms.
 Implement the machine learning concepts and algorithms in any suitable language of choice.
 Learn basic concepts of Python through illustrative examples and small exercises.
 To familiarize students with Python programming in an AI environment.
 To provide students with an academic environment that creates awareness of various AI algorithms.
 To train students in Python programming to comprehend, analyze, design, and create AI platforms and solutions for real-life problems.

2. Description (If any):

1. The programs can be implemented in either Java or Python.

2. Data sets can be taken from standard repositories (https://archive.ics.uci.edu/datasets) or constructed by the students.

3. Study Experiment / Project:


Course outcomes: The students should be able to:

1. Understand the implementation procedures for the machine learning algorithms.


2. Apply various AI search algorithms (uninformed, informed, heuristic, constraint satisfaction).
3. Demonstrate working knowledge of reasoning in the presence of incomplete
and/or uncertain information.
4. Design Java/Python programs for various Learning algorithms.
5. Apply appropriate data sets to the Machine Learning algorithms.
6. Understand the fundamentals of knowledge representation and inference.
7. Identify and apply Machine Learning algorithms to solve real world problems.
4. Introduction about lab Minimum System requirements:
 Processors: Intel Atom® processor or Intel® Core™ i3 processor.
 Disk space: 1 GB.
 Operating systems: Windows* 7 or later, macOS, and Linux.
 Python versions: 2.7.X, 3.6.X, 3.8.X, 3.12.X
 IDEs: Anaconda, Jupyter Notebook, Spyder, PyDev (Eclipse), etc.
About language:
Python is a general-purpose, high-level programming language; other high-level languages you might have heard of are C++, PHP, and Java. Virtually all modern programming languages make use of an Integrated Development Environment (IDE), which allows the creation, editing, testing, and saving of programs and modules. In Python, the bundled IDE is called IDLE (like many items in the language, this is a reference to the British comedy group Monty Python, and in this case, one of its members, Eric Idle).
Many modern languages use both processes. They are first compiled into a lower-level language, called byte code, and then interpreted by a program called a virtual machine. Python uses both processes, but because of the way programmers interact with it, it is usually considered an interpreted language.
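For instance, a minimal illustration (using only the standard library's dis module, which this manual does not otherwise require) of inspecting the byte code Python generates for a small function:

import dis

def add_one(x):
    return x + 1

# Prints the byte-code instructions the interpreter will execute
# (e.g. LOAD_FAST and an add/return instruction; exact names vary by Python version)
dis.dis(add_one)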
Practical aspects are the key to understanding and conceptual visualization of the theoretical aspects covered in the books. This course also reviews concepts studied in previous semesters and implements the various algorithms covered in the theory course.

How to install Python on Windows?


Python is a high-level programming language that has become increasingly popular due to its simplicity, versatility, and extensive range of applications. The process of installing Python on the Windows operating system is relatively easy and involves a few uncomplicated steps.
How to Install Python in Windows?
We have provided step-by-step instructions to guide you and ensure a successful installation.
Whether you are new to programming or have some experience, mastering how to install Python on
Windows will enable you to utilize this potent language and uncover its full range of potential
applications.
To download Python on your system, you can use the following steps
Step 1: Select Version to Install Python
Visit the official page for Python https://www.python.org/downloads/ on the Windows operating
system. Locate a reliable version of Python 3, preferably version 3.10.11, which was used in testing
this tutorial. Choose the correct link for your device from the options provided: either Windows
installer (64-bit) or Windows installer (32-bit) and proceed to download the executable file.

Python Homepage
Step 2: Downloading the Python Installer
Once you have downloaded the installer, open the .exe file, such as python-3.10.11-amd64.exe, by double-clicking it to launch the Python installer. Choose the option to Install the launcher for all users by checking the corresponding checkbox, so that all users of the computer can access the Python launcher application. Enable users to run Python from the command line by checking the Add python.exe to PATH checkbox.

Python Installer

After Clicking the Install Now Button the setup will start installing Python on your Windows
system. You will see a window like this.

Python Setup

Step 3: Running the Executable Installer


After completing the setup, Python will be installed on your Windows system and you will see a success message.
Python Successfully installed

Step 4: Verify the Python Installation in Windows


Close the window after successful installation of Python. You can check if the installation of Python
was successful by using either the command line or the Integrated Development Environment
(IDLE), which you may have installed. To access the command line, click on the Start menu and
type “cmd” in the search bar. Then click on Command Prompt.
python --version

Python version

You can also check the version of Python by opening the IDLE application. Go to Start, enter IDLE in the search bar, and then click the IDLE app, for example, IDLE (Python 3.10.11 64-bit). If you can see the Python IDLE window, then you have successfully downloaded and installed Python on Windows.

Python IDLE

Getting Started with Python


Python is a lot easier to code and learn. Python programs can be written on any plain text editor
like Notepad, notepad++, or anything of that sort. One can also use an Online IDE to run Python
code or can even install one on their system to make it more feasible to write these codes because
IDEs provide a lot of features like an intuitive code editor, debugger, compiler, etc. To begin with,
writing Python Codes and performing various intriguing and useful operations, one must have Python
installed on their System.
How to Install Anaconda on Windows?
Anaconda is open-source software that contains Jupyter, Spyder, and other tools used for large data processing, data analytics, and heavy scientific computing. Anaconda works for the R and Python programming languages. Spyder (a sub-application of Anaconda) is used for Python; OpenCV for Python will work in Spyder. Package versions are managed by the package management system called conda.
To begin working with Anaconda, one must get it installed first. Follow the below instructions to
Download and install Anaconda on your system:

Download and install Anaconda:

Head over to anaconda.com and install the latest version of Anaconda. Make sure to download the
“Python 3.7 Version” for the appropriate architecture.

Begin with the installation process:

 Getting Started:
 Getting through the License Agreement:

 Select Installation Type: Select Just Me if you want the software to be used by a single User
 Choose Installation Location:

 Advanced Installation Option:


 Getting through the Installation Process:

 Recommendation to Install Pycharm:


 Finishing up the Installation:

Working with Anaconda:

Once the installation process is done, Anaconda can be used to perform multiple operations. To begin
using Anaconda, search for Anaconda Navigator from the Start Menu in Windows
How to install Jupyter Notebook on Windows?
Jupyter Notebook is an open-source web application that allows you to create and share documents
that contain live code, equations, visualizations, and narrative text. Uses include data cleaning and
transformation, numerical simulation, statistical modeling, data visualization, machine learning, and
much more.
Jupyter has support for over 40 different programming languages and Python is one of them. Python
is a requirement (Python 3.3 or greater, or Python 2.7) for installing the Jupyter Notebook itself.
Jupyter Notebook can be installed by using either of the two ways described below:
 Using Anaconda:
Install Python and Jupyter using the Anaconda Distribution, which includes Python, the Jupyter
Notebook, and other commonly used packages for scientific computing and data science. To install
Anaconda, go through How to install Anaconda on windows? and follow the instructions provided.
 Using PIP:
Install Jupyter using the PIP package manager used to install and manage software
packages/libraries written in Python. To install pip, go through How to install PIP on
Windows? and follow the instructions provided.

Installing Jupyter Notebook using Anaconda:

To install Jupyter using Anaconda, just go through the following instructions:
 Launch Anaconda Navigator:

 Click on the Install Jupyter Notebook Button:


 Beginning the Installation:

 Loading Packages:
 Finished Installation:

Launching Jupyter:
Installing Jupyter Notebook using pip:

PIP is a package management system used to install and manage software packages/libraries written
in Python. These files are stored in a large “on-line repository” termed as Python Package Index
(PyPI).
pip uses PyPI as the default source for packages and their dependencies.
To install Jupyter using pip, we need to first check if pip is updated in our system. Use the following
command to update pip:
python -m pip install --upgrade pip

After updating the pip version, follow the instructions provided below to install Jupyter:
 Command to install Jupyter:
 python -m pip install jupyter
 Beginning Installation:
 Downloading Files and Data:

 Installing Packages:

 Finished Installation:

Launching Jupyter:
Use the following command to launch Jupyter using command-line:
jupyter notebook

Machine learning
Machine learning is a subset of artificial intelligence in
the field of computer science that often uses statistical
techniques to give computers the ability to "learn" (i.e.,
progressively improve performance on a specific task) with
data, without being explicitly programmed. In the past
decade, machine learning has given us self-driving cars,
practical speech recognition, effective web search, and a
vastly improved understanding of the human genome.
Machine learning tasks
Machine learning tasks are typically classified into two
broad categories, depending on whether there is a learning
"signal" or "feedback" available to a learning system:

Supervised learning: The computer is presented with example inputs and their desired outputs, given by a "teacher", and the goal is to learn a general rule that maps inputs to outputs. As special cases, the input signal can be only partially available, or restricted to special feedback:

Semi-supervised learning: the computer is given only an incomplete training signal: a training set with some (often many) of the target outputs missing.

Active learning: the computer can only obtain training labels for a limited set of instances (based on a budget), and also has to optimize its choice of objects to acquire labels for. When used interactively, these can be presented to the user for labeling.

Reinforcement learning: training data (in the form of rewards and punishments) is given only as feedback to the program's actions in a dynamic environment, such as driving a vehicle or playing a game against an opponent.

Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end (feature learning).

Supervised learning                                        Unsupervised learning   Instance-based learning
Find-S algorithm                                           EM algorithm            Locally weighted regression algorithm
Candidate elimination algorithm                            K-means algorithm
Decision tree algorithm
Back propagation algorithm
Naïve Bayes algorithm
K-nearest neighbour algorithm (lazy learning algorithm)
Machine learning applications
In classification, inputs are divided into two or more
classes, and the learner must produce a model that assigns
unseen inputs to one or more (multi-label classification)
of these classes. This is typically tackled in a supervised
manner. Spam filtering is an example of classification,
where the inputs are email (or other) messages and the
classes are "spam" and "not spam". In regression, also a
supervised problem, the outputs are continuous rather than
discrete.

In clustering, a set of inputs is to be divided into groups. Unlike in classification, the groups are not known beforehand, making this typically an unsupervised task.
Density estimation finds the distribution of inputs in some
space. Dimensionality reduction simplifies inputs by
mapping them into a lower- dimensional space. Topic
modeling is a related problem, where a program is given a
list of human language documents and is tasked with finding
out which documents cover similar topics.

Machine learning Approaches

Decision tree learning: Decision tree learning uses a decision tree as a predictive model, which maps observations about an item to conclusions about the item's target value.

Association rule learning: Association rule learning is a method for discovering interesting relations between variables in large databases.

Artificial neural networks

An artificial neural network (ANN) learning algorithm, usually called "neural network" (NN), is a learning algorithm that is vaguely inspired by biological neural networks. Computations are structured in terms of an interconnected group of artificial neurons, processing information using a connectionist approach to computation. Modern neural networks are non-linear statistical data modeling tools. They are usually used to model complex relationships between inputs and outputs, to find patterns in data, or to capture the statistical structure in an unknown joint probability distribution between observed variables.

Deep learning

Falling hardware prices and the development of GPUs for personal use in the last few years have contributed to the development of the concept of deep learning, which consists of multiple hidden layers in an artificial neural network. This approach tries to model the way the human brain processes light and sound into vision and hearing. Some successful applications of deep learning are computer vision and speech recognition.
Inductive logic programming
Inductive logic programming (ILP) is an approach to rule
learning using logic programming as a uniform representation
for input examples, background knowledge, and hypotheses.
Given an encoding of the known background knowledge and a
set of examples represented as a logical database of facts,
an ILP system will derive a hypothesized logic program that
entails all positive and no negative examples. Inductive
programming is a related field that considers any kind of
programming languages for representing hypotheses (and not
only logic programming), such as functional programs.

Support vector machines

Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other.
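As a brief illustrative sketch (assuming scikit-learn, which ships with Anaconda but is not mandated by this manual), a linear SVM can be trained on a tiny two-class data set and asked to categorize new examples:

from sklearn.svm import SVC

# Toy two-class training data: two numeric features per example
X = [[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]]
y = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel="linear")   # linear support vector classifier
clf.fit(X, y)

print(clf.predict([[0.5, 0.5], [4.5, 4.8]]))  # expected: [0 1]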

Clustering

Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that observations within the same cluster are similar according to some pre-designated criterion or criteria, while observations drawn from different clusters are dissimilar. Different clustering techniques make different assumptions on the structure of the data, often defined by some similarity metric and evaluated, for example, by internal compactness (similarity between members of the same cluster) and separation between different clusters. Other methods are based on estimated density and graph connectivity. Clustering is a method of unsupervised learning, and a common technique for statistical data analysis.
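A minimal sketch of this idea, assuming scikit-learn's KMeans (only one of many possible clustering techniques):

from sklearn.cluster import KMeans

# Unlabeled observations: two loose groups in 2-D
X = [[1, 2], [1, 4], [2, 3], [8, 8], [9, 10], [8, 9]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster index assigned to each observation
print(kmeans.cluster_centers_)  # centroid of each cluster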

Bayesian networks

A Bayesian network, belief network or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional independencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases. Efficient algorithms exist that perform inference and learning.
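As a tiny numerical illustration (with made-up probabilities, not taken from any data set in this manual), a two-node network Disease → Symptom can be queried with Bayes' rule:

# Hypothetical numbers: P(disease), P(symptom | disease), P(symptom | no disease)
p_d = 0.01
p_s_given_d = 0.9
p_s_given_not_d = 0.05

# P(disease | symptom) by Bayes' rule
p_s = p_s_given_d * p_d + p_s_given_not_d * (1 - p_d)
p_d_given_s = p_s_given_d * p_d / p_s
print(round(p_d_given_s, 3))  # ≈ 0.154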

Reinforcement learning
Reinforcement learning is concerned with how an agent ought
to take actions in an environment so as to maximize some
notion of long-term reward. Reinforcement learning
algorithms attempt to find a policy that maps states of the
world to the actions the agent ought to take in those
states. Reinforcement learning differs from the supervised
learning problem in that correct input/output pairs are
never presented, nor sub-optimal actions explicitly
corrected.

Similarity and metric learning


In this problem, the learning machine is given pairs of
examples that are considered similar and pairs of less
similar objects. It then needs to learn a similarity
function (or a distance metric function) that can predict
if new objects are similar. It is sometimes used in
Recommendation systems.

Genetic algorithms
A genetic algorithm (GA) is a search heuristic that mimics
the process of natural selection, and uses methods such as
mutation and crossover to generate new genotype in the hope
of finding good solutions to a given problem. In machine
learning, genetic algorithms found some uses in the 1980s
and 1990s. Conversely, machine learning techniques have
been used to improve the performance of genetic and
evolutionary algorithms.
Rule-based machine learning

Rule-based machine learning is a general term for any machine learning method that identifies, learns, or evolves "rules" to store, manipulate or apply knowledge. The defining characteristic of a rule-based machine learner is the identification and utilization of a set of relational rules that collectively represent the knowledge captured by the system. This is in contrast to other machine learners that commonly identify a singular model that can be universally applied to any instance in order to make a prediction. Rule-based machine learning approaches include learning classifier systems, association rule learning, and artificial immune systems.

Feature selection approach

Feature selection is the process of selecting an optimal subset of relevant features for use in model construction. It is assumed the data contains some features that are either redundant or irrelevant, and can thus be removed to reduce calculation cost without incurring much loss of information. Common optimality criteria include accuracy, similarity and information measures.
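A minimal sketch of filter-based feature selection, assuming scikit-learn's SelectKBest with a chi-squared score (one possible optimality criterion):

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)        # 4 original features
selector = SelectKBest(score_func=chi2, k=2)
X_new = selector.fit_transform(X, y)      # keep the 2 highest-scoring features

print(X.shape, "->", X_new.shape)         # (150, 4) -> (150, 2)
print(selector.get_support())             # mask of the selected features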
INDEX

Week No.   List of Programs

1    Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis based on a given set of training data samples. Read the training data from a .CSV file.

2    For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.

3    Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.

4    Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.

5    Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.

6    Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to write the program. Calculate the accuracy, precision, and recall for your data set.

7    Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using standard Heart Disease Data Set. You can use Java/Python ML library classes/API.

8    Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering. You can add Java/Python ML library classes/API in the program.

9    Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions. Java/Python ML library classes can be used for this problem.

10   Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select appropriate data set for your experiment and draw graphs.
1. Implement and demonstrate the FIND-S algorithm for finding the most specific
hypothesis based on a given set of training data samples. Read the training
data from a .CSV file.

FIND-S Algorithm
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
       For each attribute constraint ai in h
           If the constraint ai is satisfied by x
               Then do nothing
           Else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h

Training Examples:

Example Sky AirTemp Humidity Wind Water Forecast EnjoySport


1 Sunny Warm Normal Strong Warm Same Yes
2 Sunny Warm High Strong Warm Same Yes
3 Rainy Cold High Strong Warm Change No
4 Sunny Warm High Strong Cool Change Yes

To illustrate this algorithm, assume the learner is given the sequence of training examples above from the EnjoySport task.


• The first step of FIND-S is to initialize h to the most specific hypothesis in H

h0 = (Ø, Ø, Ø, Ø, Ø, Ø)
Consider the first training example
x1 = [Sunny Warm Normal Strong Warm Same], +

Observing the first training example, it is clear that hypothesis h is too specific.

None of the "Ø" constraints in h are satisfied by this example, so each is replaced by the next
more general constraint that fits the example

h1 = [Sunny Warm Normal Strong Warm Same]

Consider the second training example

x2 = [Sunny, Warm, High, Strong, Warm, Same], +

The second training example forces the algorithm to further generalize h, this time substituting a
"?" in place of any attribute value in h that is not satisfied by the new example

h2 = [Sunny Warm ? Strong Warm Same]

Consider the third training example

x3 = [Rainy, Cold, High, Strong, Warm, Change], -

Upon encountering the third training the algorithm makes no change to h. Because the FIND-S
algorithm simply ignores every negative example.

h3 = [Sunny Warm ? Strong Warm Same]

Consider the fourth training example

x4 = [Sunny Warm High Strong Cool Change], +

The fourth example leads to a further generalization of h

h4 = [Sunny Warm ? Strong ? ? ]


Program:
import csv

num_attributes = 6
a = []
print("\n The Given Training Data Set \n")

with open('enjoysport.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        a.append(row)
        print(row)

print("\n The initial value of hypothesis: ")
hypothesis = ['0'] * num_attributes
print(hypothesis)

# Initialise the hypothesis with the attribute values of the first training instance
for j in range(0, num_attributes):
    hypothesis[j] = a[0][j]

print("\n Find S: Finding a Maximally Specific Hypothesis\n")

for i in range(0, len(a)):
    if a[i][num_attributes] == 'yes':          # FIND-S considers only positive examples
        for j in range(0, num_attributes):
            if a[i][j] != hypothesis[j]:
                hypothesis[j] = '?'            # generalize the attribute constraint
            else:
                hypothesis[j] = a[i][j]
    print(" For Training instance No:{0} the hypothesis is ".format(i), hypothesis)

print("\n The Maximally Specific Hypothesis for a given Training Examples :\n")
print(hypothesis)
Data Set:

sunny warm normal strong warm same yes


sunny warm high strong warm same yes
rainy cold high strong warm change no
sunny warm high strong cool change yes

Output:

The Given Training Data Set

['sunny', 'warm', 'normal', 'strong', 'warm', 'same', 'yes']
['sunny', 'warm', 'high', 'strong', 'warm', 'same', 'yes']
['rainy', 'cold', 'high', 'strong', 'warm', 'change', 'no']
['sunny', 'warm', 'high', 'strong', 'cool', 'change', 'yes']

The initial value of hypothesis:
['0', '0', '0', '0', '0', '0']

Find S: Finding a Maximally Specific Hypothesis

For Training Example No:0 the hypothesis is
['sunny', 'warm', 'normal', 'strong', 'warm', 'same']

For Training Example No:1 the hypothesis is
['sunny', 'warm', '?', 'strong', 'warm', 'same']

For Training Example No:2 the hypothesis is
['sunny', 'warm', '?', 'strong', 'warm', 'same']

For Training Example No:3 the hypothesis is
['sunny', 'warm', '?', 'strong', '?', '?']

The Maximally Specific Hypothesis for a given Training Examples:

['sunny', 'warm', '?', 'strong', '?', '?']


2. For a given set of training data examples stored in a .CSV
file, implement and demonstrate the Candidate-Elimination
algorithm to output a description of the set of all hypotheses
consistent with the training examples.
Algorithm

For each training example d, do:
    If d is a positive example:
        Remove from G any hypothesis h inconsistent with d
        For each hypothesis s in S not consistent with d:
            Remove s from S
            Add to S all minimal generalizations of s consistent with d and having a generalization in G
            Remove from S any hypothesis with a more specific h in S
    If d is a negative example:
        Remove from S any hypothesis h inconsistent with d
        For each hypothesis g in G not consistent with d:
            Remove g from G
            Add to G all minimal specializations of g consistent with d and having a specialization in S
            Remove from G any hypothesis having a more general hypothesis in G

Solved Numerical Example – 1 (Candidate Elimination Algorithm):

Example Sky AirTemp Humidity Wind Water Forecast EnjoySport

1 Sunny Warm Normal Strong Warm Same Yes

2 Sunny Warm High Strong Warm Same Yes

3 Rain Cold High Strong Warm Change No

4 Sunny Warm High Strong Cool Change Yes


Solution:

S0: (ø, ø, ø, ø, ø, ø) Most Specific Boundary

G0: (?, ?, ?, ?, ?, ?) Most Generic Boundary

The first example is positive, the hypothesis at the specific boundary is inconsistent, hence
we extend the specific boundary, and the hypothesis at the generic boundary is consistent
hence we retain it.

S1: (Sunny,Warm, Normal, Strong, Warm, Same)


G1: (?, ?, ?, ?, ?, ?)

The second example is positive; again the hypothesis at the specific boundary is inconsistent,
hence we extend the specific boundary, and the hypothesis at the generic boundary is
consistent hence we retain it.

S2: (Sunny,Warm, ?, Strong, Warm, Same)


G2: (?, ?, ?, ?, ?, ?)

The third example is negative, the hypothesis at the specific boundary is consistent, hence
we retain it, and hypothesis at the generic boundary is inconsistent hence we write all
consistent hypotheses by removing one “?” (question mark) at time.

S3: (Sunny,Warm, ?, Strong, Warm, Same)


G3: (Sunny,?,?,?,?,?) (?,Warm,?,?,?,?) (?,?,?,?,?,Same)
The fourth example is positive, the hypothesis at the specific boundary is inconsistent,
hence we extend the specific boundary, and the consistent hypothesis at the generic
boundary are retained.

S4: (Sunny, Warm, ?, Strong, ?, ?)


G4: (Sunny,?,?,?,?,?) (?,Warm,?,?,?,?)
Learned Version Space by Candidate Elimination Algorithm for the given data set is given by S4 and G4 above.

Program:
import csv

with open("trainingexamples.csv") as f:
    csv_file = csv.reader(f)
    data = list(csv_file)

specific = data[1][:-1]
general = [['?' for i in range(len(specific))] for j in range(len(specific))]

for i in data:
    if i[-1] == "Yes":           # positive example: generalize S and prune G
        for j in range(len(specific)):
            if i[j] != specific[j]:
                specific[j] = "?"
                general[j][j] = "?"
    elif i[-1] == "No":          # negative example: specialize G
        for j in range(len(specific)):
            if i[j] != specific[j]:
                general[j][j] = specific[j]
            else:
                general[j][j] = "?"

    print("\nStep " + str(data.index(i) + 1) + " of Candidate Elimination Algorithm")
    print(specific)
    print(general)

gh = []  # gh = general Hypothesis
for i in general:
    for j in i:
        if j != '?':
            gh.append(i)
            break

print("\nFinal Specific hypothesis:\n", specific)
print("\nFinal General hypothesis:\n", gh)
Step 1 of Candidate Elimination Algorithm
['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same']
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?',
'?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?',
'?', '?'], ['?', '?', '?', '?', '?', '?']]

Step 2 of Candidate Elimination Algorithm


['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same']
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?',
'?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?',
'?', '?'], ['?', '?', '?', '?', '?', '?']]

Step 3 of Candidate Elimination Algorithm


['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?',
'?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?',
'?', '?'], ['?', '?', '?', '?', '?', '?']]

Step 4 of Candidate Elimination Algorithm


['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']
[['Sunny', '?', '?', '?', '?', '?'], ['?', 'Warm', '?', '?', '?', '?'],
['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?',
'?', '?', '?', '?'], ['?', '?', '?', '?', '?', 'Same']]

Step 5 of Candidate Elimination Algorithm


['Sunny', 'Warm', '?', 'Strong', '?', '?']
[['Sunny', '?', '?', '?', '?', '?'], ['?', 'Warm', '?', '?', '?', '?'],
['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?',
'?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]

Final Specific hypothesis:


['Sunny', 'Warm', '?', 'Strong', '?', '?']

Final General hypothesis:


[['Sunny', '?', '?', '?', '?', '?'], ['?', 'Warm', '?', '?', '?', '?']]

3. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use
an appropriate data set for building the decision tree and apply this knowledge to classify a
new sample.

Decision Tree ID3 Algorithm

ID3(Examples, Target_attribute, Attributes)

    Examples are the training examples.
    Target_attribute is the attribute whose value is to be predicted by the tree.
    Attributes is a list of other attributes that may be tested by the learned decision tree.
    Returns a decision tree that correctly classifies the given Examples.

    Create a Root node for the tree
    If all Examples are positive, Return the single-node tree Root, with label = +
    If all Examples are negative, Return the single-node tree Root, with label = -
    If Attributes is empty, Return the single-node tree Root, with label = most common value of Target_attribute in Examples
    Otherwise Begin
        A ← the attribute from Attributes that best* classifies Examples
        The decision attribute for Root ← A
        For each possible value, vi, of A,
            Add a new tree branch below Root, corresponding to the test A = vi
            Let Examples_vi be the subset of Examples that have value vi for A
            If Examples_vi is empty
                Then below this new branch add a leaf node with label = most common value of Target_attribute in Examples
            Else
                below this new branch add the subtree ID3(Examples_vi, Target_attribute, Attributes – {A})
    End
    Return Root

* The best attribute is the one with the highest information gain.

Problem Definition:
Build a decision tree using ID3 algorithm for the given training data in the table (Buy
Computer data), and predict the class of the following new example: age<=30,
income=medium, student=yes, credit-rating=fair
Solution:

age income student Credit rating Buys computer

<=30 high no fair no

<=30 high no excellent no

31…40 high no fair yes

>40 medium no fair yes

>40 low yes fair yes

>40 low yes excellent no

31…40 low yes excellent yes

<=30 medium no fair no

<=30 low yes fair yes

>40 medium yes fair yes

<=30 medium yes excellent yes

31…40 medium no excellent yes

31…40 high yes fair yes

>40 medium no excellent no


First, check which attribute provides the highest Information Gain in order to split the training set based on that attribute. We need to calculate the entropy of the whole set (the expected information needed to classify it) and the weighted entropy after splitting on each attribute; the information gain of an attribute is the overall entropy minus this weighted entropy.

The entropy of the two classes (9 yes, 5 no) is:

Entropy(S) = E(9,5) = -9/14 log2(9/14) – 5/14 log2(5/14) = 0.94


Now Consider the Age attribute
For Age, we have three values age<=30 (2 yes and 3 no), age31..40 (4 yes and 0 no), and age>40 (3 yes and
2 no)
Entropy(age) = 5/14 (-2/5 log2(2/5)-3/5log2(3/5)) + 4/14 (0) + 5/14 (-3/5log2(3/5)-2/5log2(2/5))
= 5/14(0.9709) + 0 + 5/14(0.9709) = 0.6935

Gain(age) = 0.94 – 0.6935 = 0.2465

Next, consider Income Attribute


For Income, we have three values incomehigh (2 yes and 2 no), incomemedium (4 yes and 2 no), and
incomelow (3 yes 1 no)
Entropy(income) = 4/14(-2/4log2(2/4)-2/4log2(2/4)) + 6/14 (-4/6log2(4/6)-2/6log2(2/6)) + 4/14 (-
3/4log2(3/4)-1/4log2(1/4))
= 4/14 (1) + 6/14 (0.918) + 4/14 (0.811)
= 0.285714 + 0.393428 + 0.231714 = 0.9108

Gain(income) = 0.94 – 0.9108 = 0.0292

Next, consider Student Attribute


For Student, we have two values studentyes (6 yes and 1 no) and studentno (3 yes 4 no)

Entropy(student) = 7/14(-6/7log2(6/7)-1/7log2(1/7)) + 7/14(-3/7log2(3/7)-4/7log2(4/7)


= 7/14(0.5916) + 7/14(0.9852)

= 0.2958 + 0.4926 = 0.7884

Gain (student) = 0.94 – 0.7884 = 0.1516

Finally, consider Credit_Rating Attribute


For Credit_Rating we have two values credit_ratingfair (6 yes and 2 no) and credit_ratingexcellent (3
yes 3 no)

Entropy(credit_rating) = 8/14(-6/8log2(6/8)-2/8log2(2/8)) + 6/14(-3/6log2(3/6)-3/6log2(3/6))


= 8/14(0.8112) + 6/14(1)

= 0.4635 + 0.4285 = 0.8920

Gain(credit_rating) = 0.94 – 0.8920 = 0.048

Since Age has the highest Information Gain we start splitting the dataset using the age attribute.

Decision Tree after step 1


Since all records under the branch age31..40 are all of the class, Yes, we can replace the leaf with
Class=Yes
Now build the decision tree for the left subtree

The same process of splitting has to happen for the two remaining branches.

Left sub-branch
For branch age<=30 we still have attributes income, student, and credit_rating. Which one should be
used to split the partition?

The mutual information is E(Sage<=30)= E(2,3)= -2/5 log2(2/5) – 3/5 log2(3/5)=0.97


For Income, we have three values incomehigh (0 yes and 2 no), incomemedium (1 yes and 1 no) and
incomelow (1 yes and 0 no)
Entropy(income) = 2/5(0) + 2/5 (-1/2log2(1/2)-1/2log2(1/2)) + 1/5 (0) = 2/5 (1) = 0.4
Gain(income) = 0.97 – 0.4 = 0.57

For Student, we have two values studentyes (2 yes and 0 no) and studentno (0 yes 3 no)
Entropy(student) = 2/5(0) + 3/5(0) = 0

Gain (student) = 0.97 – 0 = 0.97

We can then safely split on attribute student without checking the other attributes since the information
gain is maximized.
Decision Tree after step 2
Since these two new branches are from distinct classes, we make them into leaf nodes with their
respective class as label:

Decision Tree after step 2_2

Now build the decision tree for the right subtree

Right sub-branch
The mutual information is Entropy(Sage>40)= I(3,2)= -3/5 log2(3/5) – 2/5 log2(2/5)=0.97
For Income, we have two values incomemedium (2 yes and 1 no) and incomelow (1 yes and 1 no)
Entropy(income) = 3/5(-2/3log2(2/3)-1/3log2(1/3)) + 2/5 (-1/2log2(1/2)-1/2log2(1/2))
= 3/5(0.9182) + 2/5(1) = 0.55 + 0.4 = 0.95

Gain(income) = 0.97 – 0.95 = 0.02

For Student, we have two values studentyes (2 yes and 1 no) and student no (1 yes and 1 no)
Entropy(student) = 3/5(-2/3log2(2/3)-1/3log2(1/3)) + 2/5(-1/2log2(1/2)-1/2log2(1/2)) = 0.95
Gain (student) = 0.97 – 0.95 = 0.02

For Credit_Rating, we have two values credit_ratingfair (3 yes and 0 no) and credit_ratingexcellent (0
yes and 2 no)
Entropy(credit_rating) = 0

Gain(credit_rating) = 0.97 – 0 = 0.97

We then split based on credit_rating. These splits give partitions each with records from the same class.
We just need to make these into leaf nodes with their class label attached:

Decision Tree for Buys Computer

New example: age<=30, income=medium, student=yes, credit-rating=fair

Follow branch(age<=30) then student=yes we predict Class=yes

Buys_computer = yes

ENTROPY:
Entropy measures the impurity of a collection of examples S:

    Entropy(S) = -p+ log2(p+) - p- log2(p-)

where p+ is the proportion of positive examples in S and p- is the proportion of negative examples in S.

INFORMATION GAIN:

 Information gain is the expected reduction in entropy caused by partitioning the examples according to an attribute.
 The information gain, Gain(S, A), of an attribute A relative to a collection of examples S, is defined as

    Gain(S, A) = Entropy(S) - Σ (v ∈ Values(A)) (|Sv| / |S|) Entropy(Sv)
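As a quick check of these definitions, the short sketch below (plain Python, not part of the prescribed program) recomputes Entropy(S) and Gain(S, age) for the Buys Computer data used in the worked example above:

import math
from collections import Counter

def entropy(labels):
    # Entropy of a list of class labels
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(rows, attr_index, labels):
    # Gain(S, A) = Entropy(S) - sum over values v of |Sv|/|S| * Entropy(Sv)
    total = len(labels)
    gain = entropy(labels)
    for v in set(r[attr_index] for r in rows):
        subset = [labels[i] for i, r in enumerate(rows) if r[attr_index] == v]
        gain -= (len(subset) / total) * entropy(subset)
    return gain

# Buys Computer data: (age, income, student, credit_rating) and class label
rows = [("<=30","high","no","fair"), ("<=30","high","no","excellent"),
        ("31..40","high","no","fair"), (">40","medium","no","fair"),
        (">40","low","yes","fair"), (">40","low","yes","excellent"),
        ("31..40","low","yes","excellent"), ("<=30","medium","no","fair"),
        ("<=30","low","yes","fair"), (">40","medium","yes","fair"),
        ("<=30","medium","yes","excellent"), ("31..40","medium","no","excellent"),
        ("31..40","high","yes","fair"), (">40","medium","no","excellent")]
labels = ["no","no","yes","yes","yes","no","yes","no","yes","yes","yes","yes","yes","no"]

print(round(entropy(labels), 4))             # 0.9403, i.e. ~0.94 as above
print(round(info_gain(rows, 0, labels), 4))  # Gain(age) ≈ 0.247, matching 0.2465 up to rounding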

Training Dataset:

Day Outlook Temperature Humidity Wind PlayTennis


D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Normal Weak Yes
D6 Rain Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rain Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rain Mild High Strong No

Test Dataset:

Day Outlook Temperature Humidity Wind


T1 Rain Cool Normal Strong
T2 Sunny Mild Normal Strong
Program:

import math
import csv

def load_csv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    headers = dataset.pop(0)
    return dataset, headers

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

def subtables(data, col, delete):
    dic = {}
    coldata = [row[col] for row in data]
    attr = list(set(coldata))

    counts = [0] * len(attr)
    r = len(data)
    c = len(data[0])
    for x in range(len(attr)):
        for y in range(r):
            if data[y][col] == attr[x]:
                counts[x] += 1

    for x in range(len(attr)):
        dic[attr[x]] = [[0 for i in range(c)] for j in range(counts[x])]
        pos = 0
        for y in range(r):
            if data[y][col] == attr[x]:
                if delete:
                    del data[y][col]
                dic[attr[x]][pos] = data[y]
                pos += 1
    return attr, dic

def entropy(S):
    attr = list(set(S))
    if len(attr) == 1:
        return 0

    counts = [0, 0]
    for i in range(2):
        counts[i] = sum([1 for x in S if attr[i] == x]) / (len(S) * 1.0)

    sums = 0
    for cnt in counts:
        sums += -1 * cnt * math.log(cnt, 2)
    return sums

def compute_gain(data, col):
    attr, dic = subtables(data, col, delete=False)

    total_size = len(data)
    entropies = [0] * len(attr)
    ratio = [0] * len(attr)

    total_entropy = entropy([row[-1] for row in data])
    for x in range(len(attr)):
        ratio[x] = len(dic[attr[x]]) / (total_size * 1.0)
        entropies[x] = entropy([row[-1] for row in dic[attr[x]]])
        total_entropy -= ratio[x] * entropies[x]
    return total_entropy

def build_tree(data, features):
    lastcol = [row[-1] for row in data]
    if (len(set(lastcol))) == 1:          # all examples have the same class: leaf node
        node = Node("")
        node.answer = lastcol[0]
        return node

    n = len(data[0]) - 1
    gains = [0] * n
    for col in range(n):
        gains[col] = compute_gain(data, col)
    split = gains.index(max(gains))        # attribute with the highest information gain
    node = Node(features[split])
    fea = features[:split] + features[split + 1:]
    attr, dic = subtables(data, split, delete=True)

    for x in range(len(attr)):
        child = build_tree(dic[attr[x]], fea)
        node.children.append((attr[x], child))
    return node

def print_tree(node, level):
    if node.answer != "":
        print(" " * level, node.answer)
        return

    print(" " * level, node.attribute)
    for value, n in node.children:
        print(" " * (level + 1), value)
        print_tree(n, level + 2)

def classify(node, x_test, features):
    if node.answer != "":
        print(node.answer)
        return
    pos = features.index(node.attribute)
    for value, n in node.children:
        if x_test[pos] == value:
            classify(n, x_test, features)

'''Main program'''
dataset, features = load_csv("data3.csv")   # training data file (name assumed; pairs with data3_test.csv below)
node1 = build_tree(dataset, features)

print("The decision tree for the dataset using ID3 algorithm is")
print_tree(node1, 0)

testdata, features = load_csv("data3_test.csv")
for xtest in testdata:
    print("The test instance:", xtest)
    print("The label for test instance:", end=" ")
    classify(node1, xtest, features)

Output:

The decision tree for the dataset using ID3 algorithm is

Outlook
rain
Wind
strong
no
weak
yes
overcast
yes
sunny
Humidity
normal
yes
high
no

The test instance: ['rain', 'cool', 'normal', 'strong']


The label for test instance: no

The test instance: ['sunny', 'mild', 'normal', 'strong']


The label for test instance: yes

Exp No. 04: Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.

BACKPROPAGATION (training_examples, η, nin, nhidden, nout)

Each training example is a pair of the form (x→, t→), where x→ is the vector of network input values and t→ is the vector of target network output values.
η is the learning rate (e.g., 0.05); nin is the number of network inputs, nhidden the number of units in the hidden layer, and nout the number of output units.
The input from unit i into unit j is denoted xji, and the weight from unit i to unit j is denoted wji.

 Create a feed-forward network with nin inputs, nhidden hidden units, and nout output units.
 Initialize all network weights to small random numbers.
 Until the termination condition is met, Do
 For each (x→, t→) in training_examples, Do

    Propagate the input forward through the network:
    1. Input the instance x→ to the network and compute the output ou of every unit u in the network.

    Propagate the errors backward through the network:
    2. For each network output unit k, calculate its error term δk:
           δk ← ok (1 − ok) (tk − ok)
    3. For each hidden unit h, calculate its error term δh:
           δh ← oh (1 − oh) Σ (k ∈ outputs) wkh δk
    4. Update each network weight wji:
           wji ← wji + Δwji,   where Δwji = η δj xji

Training Examples:

Example   Sleep   Study   Expected % in Exams
1         2       9       92
2         1       5       86
3         3       6       89

Normalize the input:

Example   Sleep              Study              Expected % in Exams
1         2/3 = 0.66666667   9/9 = 1            0.92
2         1/3 = 0.33333333   5/9 = 0.55555556   0.86
3         3/3 = 1            6/9 = 0.66666667   0.89

Program:

import numpy as np

X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X / np.amax(X, axis=0)   # maximum of X array column-wise (normalize inputs)
y = y / 100                  # normalize outputs to the range 0..1

# Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of Sigmoid Function
def derivatives_sigmoid(x):
    return x * (1 - x)

# Variable initialization
epoch = 5000               # Setting training iterations
lr = 0.1                   # Setting learning rate
inputlayer_neurons = 2     # number of features in data set
hiddenlayer_neurons = 3    # number of hidden layer neurons
output_neurons = 1         # number of neurons at output layer

# weight and bias initialization
# draws a random range of numbers uniformly of dim x*y
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # Forward Propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)

    # Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)

    # how much hidden layer weights contributed to error
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH * hiddengrad

    # dot product of next-layer error and current-layer output
    wout += hlayer_act.T.dot(d_output) * lr
    wh += X.T.dot(d_hiddenlayer) * lr

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)

Output:

Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]

Actual Output:
[[0.92]
[0.86]
[0.89]]

Predicted Output:
[[0.89726759]
[0.87196896]
[0.9000671]]

Python Program to Implement and Demonstrate Backpropagation Algorithm Machine Learning

import numpy as np

X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X / np.amax(X, axis=0)   # maximum of X array column-wise
y = y / 100

# Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of Sigmoid Function
def derivatives_sigmoid(x):
    return x * (1 - x)

# Variable initialization
epoch = 5                  # Setting training iterations
lr = 0.1                   # Setting learning rate
inputlayer_neurons = 2     # number of features in data set
hiddenlayer_neurons = 3    # number of hidden layer neurons
output_neurons = 1         # number of neurons at output layer

# weight and bias initialization
# draws a random range of numbers uniformly of dim x*y
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # Forward Propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)

    # Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)

    hiddengrad = derivatives_sigmoid(hlayer_act)   # how much hidden layer weights contributed to error
    d_hiddenlayer = EH * hiddengrad

    wout += hlayer_act.T.dot(d_output) * lr        # dot product of next-layer error and current-layer output
    wh += X.T.dot(d_hiddenlayer) * lr

    print("-----------Epoch-", i + 1, "Starts----------")
    print("Input: \n" + str(X))
    print("Actual Output: \n" + str(y))
    print("Predicted Output: \n", output)
    print("-----------Epoch-", i + 1, "Ends----------\n")

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)


Training Examples:

Example Sleep Study Expected % in Exams

1 2 9 92

2 1 5 86

3 3 6 89

Normalize the input

Example Sleep Study Expected % in Exams

1 2/3 = 0.66666667 9/9 = 1 0.92

2 1/3 = 0.33333333 5/9 = 0.55555556 0.86

3 3/3 = 1 6/9 = 0.66666667 0.89

Output
———–Epoch- 1 Starts———-
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]

[0.86]
[0.89]]
Predicted Output:
[[0.81951208]
[0.8007242 ]
[0.82485744]]
———–Epoch- 1 Ends———-

Epoch- 2 Starts———-
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.82033938]
[0.80153634]
[0.82568134]]
———–Epoch- 2 Ends———-
———–Epoch- 3 Starts———-
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.82115226]
[0.80233463]
[0.82649072]]
———–Epoch- 3 Ends———-

———–Epoch- 4 Starts———-
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.82195108]
[0.80311943]
[0.82728598]]
———–Epoch- 4 Ends———-

———–Epoch- 5 Starts———-
Input:

[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.8227362 ]
[0.80389106]
[0.82806747]]
———–Epoch- 5 Ends———-

Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.8227362 ]
[0.80389106]
[0.82806747]]

Exp No. 05: Write a program to implement the Naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.

Bayes' Theorem is stated as:

    P(h|D) = P(D|h) P(h) / P(D)

Where,

P(h|D) is the probability of hypothesis h given the data D. This is called the posterior probability.
P(D|h) is the probability of data D given that the hypothesis h was true.
P(h) is the probability of hypothesis h being true. This is called the prior probability of h.
P(D) is the probability of the data. This is called the prior probability of D.

After calculating the posterior probability for a number of different hypotheses h, we are interested in finding the most probable hypothesis h ∈ H given the observed data D. Any such maximally probable hypothesis is called a maximum a posteriori (MAP) hypothesis. Using Bayes' theorem to calculate the posterior probability of each candidate hypothesis, the MAP hypothesis is

    hMAP = argmax (h ∈ H) P(D|h) P(h)

(ignoring P(D) since it is a constant).
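As a small illustration of the MAP rule (with made-up priors and likelihoods, not drawn from the data set described below), the hypothesis with the largest P(D|h)P(h) is selected:

# Hypothetical priors P(h) and likelihoods P(D|h) for two candidate hypotheses
priors      = {"h1": 0.6, "h2": 0.4}
likelihoods = {"h1": 0.2, "h2": 0.5}

# Unnormalized posteriors P(D|h) * P(h); P(D) is a common constant and can be ignored
scores = {h: likelihoods[h] * priors[h] for h in priors}
h_map = max(scores, key=scores.get)

print(scores)   # {'h1': 0.12, 'h2': 0.2}
print(h_map)    # 'h2' is the MAP hypothesis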

Gaussian Naive Bayes

A Gaussian Naive Bayes algorithm is a special type of Naïve Bayes algorithm. It’s specifically used when
the features have continuous values. It’s also assumed that all the features are following a Gaussian
distribution i.e., normal distribution

Representation for Gaussian Naive Bayes

We calculate the probabilities for input values for each class using a frequency. With real- valued inputs,
we can calculate the mean and standard deviation of input values (x) for each class to summarize the
distribution.

This means that in addition to the probabilities for each class, we must also store the mean and standard
deviations for each input variable for each class.
Gaussian Naive Bayes Model from Data

The probability density function for the normal distribution is defined by two parameters (mean and standard deviation); we calculate the mean and standard deviation of each input variable (x) for each class value.
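A minimal sketch of this summarization step, using a made-up single-attribute data set rather than the Pima data introduced below:

import math

def gaussian_pdf(x, mean, stdev):
    # Normal probability density used to estimate P(x | class)
    exponent = math.exp(-((x - mean) ** 2) / (2 * stdev ** 2))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

# Toy data: one numeric attribute, two classes
data = {0: [2.0, 3.0, 4.0], 1: [7.0, 8.0, 9.0]}

# Summarize each class by (mean, standard deviation)
summaries = {}
for cls, values in data.items():
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    summaries[cls] = (m, math.sqrt(var))

x = 3.5  # new attribute value
for cls, (m, s) in summaries.items():
    print(cls, gaussian_pdf(x, m, s))  # higher density => more likely class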

Examples:
The data set used in this program is the Pima Indians Diabetes problem.
This data set is comprised of 768 observations of medical details for Pima Indian patients. The records describe instantaneous measurements taken from the patient such as their age, the number of times pregnant, and blood workup. All patients are women aged 21 or older. All attributes are numeric, and their units vary from attribute to attribute.

The attributes are Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI,


DiabeticPedigreeFunction, Age, Outcome

Each record has a class value that indicates whether the patient suffered an onset of diabetes within 5 years
of when the measurements were taken (1) or not (0)

Sample Examples:

Example  Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin  BMI   DiabeticPedigreeFunction  Age  Outcome
1        6            148      72             35             0        33.6  0.627                     50   1
2        1            85       66             29             0        26.6  0.351                     31   0
3        8            183      64             0              0        23.3  0.672                     32   1
4        1            89       66             23             94       28.1  0.167                     21   0
5        0            137      40             35             168      43.1  2.288                     33   1
6        5            116      74             0              0        25.6  0.201                     30   0
7        3            78       50             32             88       31.0  0.248                     26   1
8        10           115      0              0              0        35.3  0.134                     29   0
9        2            197      70             45             543      30.5  0.158                     53   1
10       8            125      96             0              0        0.0   0.232                     54   1

Python Program to Implement and Demonstrate Naïve Bayesian Classifier Machine Learning

import csv
import random
import math

def loadcsv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    for i in range(len(dataset)):
        # converting strings into numbers for processing
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitdataset(dataset, splitratio):
    # 67% training size
    trainsize = int(len(dataset) * splitratio)
    trainset = []
    copy = list(dataset)
    while len(trainset) < trainsize:
        # generate indices for the dataset list randomly to pick elements for training data
        index = random.randrange(len(copy))
        trainset.append(copy.pop(index))
    return [trainset, copy]

def separatebyclass(dataset):
    separated = {}  # dictionary of classes 1 and 0
    # creates a dictionary of classes 1 and 0 where the values are
    # the instances belonging to each class
    for i in range(len(dataset)):
        vector = dataset[i]
        if (vector[-1] not in separated):
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers) / float(len(numbers))

def stdev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x - avg, 2) for x in numbers]) / float(len(numbers) - 1)
    return math.sqrt(variance)

def summarize(dataset):
    # creates a list of (mean, stdev) tuples, one per attribute
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]  # excluding the class labels (+ve or -ve)
    return summaries

def summarizebyclass(dataset):
    separated = separatebyclass(dataset)
    # summaries is a dict of (mean, std) tuples for each class value
    summaries = {}
    for classvalue, instances in separated.items():
        summaries[classvalue] = summarize(instances)  # summarize computes mean and std
    return summaries

def calculateprobability(x, mean, stdev):
    exponent = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

def calculateclassprobabilities(summaries, inputvector):
    probabilities = {}  # probabilities of each class for the test instance
    for classvalue, classsummaries in summaries.items():  # class and attribute information as mean and sd
        probabilities[classvalue] = 1
        for i in range(len(classsummaries)):
            mean, stdev = classsummaries[i]  # mean and sd of every attribute for class 0 and 1 separately
            x = inputvector[i]               # test vector's i-th attribute
            probabilities[classvalue] *= calculateprobability(x, mean, stdev)  # use normal distribution
    return probabilities

def predict(summaries, inputvector):
    probabilities = calculateclassprobabilities(summaries, inputvector)
    bestLabel, bestProb = None, -1
    for classvalue, probability in probabilities.items():  # assign the class with the highest probability
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classvalue
    return bestLabel

def getpredictions(summaries, testset):
    predictions = []
    for i in range(len(testset)):
        result = predict(summaries, testset[i])
        predictions.append(result)
    return predictions

def getaccuracy(testset, predictions):
    correct = 0
    for i in range(len(testset)):
        if testset[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(testset))) * 100.0

def main():
    filename = 'naivedata.csv'
    splitratio = 0.67
    dataset = loadcsv(filename)

    trainingset, testset = splitdataset(dataset, splitratio)
    print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset), len(trainingset), len(testset)))

    # prepare model
    summaries = summarizebyclass(trainingset)

    # test model
    predictions = getpredictions(summaries, testset)  # predictions of test data with the training data
    accuracy = getaccuracy(testset, predictions)
    print('Accuracy of the classifier is : {0}%'.format(accuracy))

main()

Output
Split 768 rows into train=514 and test=254 rows

Accuracy of the classifier is : 71.65354330708661%
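
For comparison, the same task can be solved with scikit-learn's GaussianNB. This is a minimal sketch, assuming the same 'naivedata.csv' file; the accuracy will vary slightly from run to run because of the random train/test split.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

data = pd.read_csv('naivedata.csv', header=None)   # assumed headerless CSV
X = data.iloc[:, :-1]                              # the eight medical attributes
y = data.iloc[:, -1]                               # Outcome (0 or 1)

Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.33)

model = GaussianNB().fit(Xtrain, ytrain)           # learns per-class mean/std internally
print('Accuracy:', accuracy_score(ytest, model.predict(Xtest)))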

Exp No. 06: Assuming a set of documents that need to be classified, use the naïve
Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to
write the program. Calculate the accuracy, precision, and recall for your data set.

Naive Bayes algorithms for learning and classifying text

LEARN_NAIVE_BAYES_TEXT (Examples, V)
Examples is a set of text documents along with their target values. V is the set of all possible target
values. This function learns the probability terms P(w k |vj,), describing the probability that a
randomly drawn word from a document in class v j will be the English word wk. It also learns the
class prior probabilities P(vj).

1. collect all words, punctuation, and other tokens that occur in Examples
 Vocabulary ← the set of all distinct words and other tokens occurring in any text document from Examples

2. calculate the required P(vj) and P(wk|vj) probability terms

 For each target value vj in V do



docsj ← the subset of documents from Examples for which the target value is vj

P(vj) ← | docsj | / |Examples|

Textj ← a single document created by concatenating all members of docsj

n ← total number of word positions in Textj

for each word wk in Vocabulary

nk ← number of times word wk occurs in Textj

P(wk|vj) ← ( nk + 1) / (n + | Vocabulary| )

CLASSIFY_NAIVE_BAYES_TEXT (Doc)

Return the estimated target value for the document Doc. ai denotes the word found in the ith position within Doc.

 positions ← all word positions in Doc that contain tokens found in Vocabulary
 Return vNB, where

vNB = argmax over vj in V of  P(vj) * ∏ (i in positions) P(ai | vj)
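
A minimal from-scratch sketch of LEARN_NAIVE_BAYES_TEXT and CLASSIFY_NAIVE_BAYES_TEXT is given below; the sklearn-based program for this experiment follows the data set. The two training documents at the end are placeholders for illustration only.

import math
from collections import Counter

def learn_naive_bayes_text(examples):
    # examples: list of (document_text, target_value) pairs
    vocabulary = set(w for doc, _ in examples for w in doc.lower().split())
    priors, likelihoods = {}, {}
    targets = set(v for _, v in examples)
    for vj in targets:
        docs_j = [doc for doc, v in examples if v == vj]
        priors[vj] = len(docs_j) / len(examples)
        text_j = ' '.join(docs_j).lower().split()
        n = len(text_j)                      # total word positions in Text_j
        counts = Counter(text_j)
        # Laplace-smoothed estimate (nk + 1) / (n + |Vocabulary|)
        likelihoods[vj] = {wk: (counts[wk] + 1) / (n + len(vocabulary))
                           for wk in vocabulary}
    return vocabulary, priors, likelihoods

def classify_naive_bayes_text(doc, vocabulary, priors, likelihoods):
    words = [w for w in doc.lower().split() if w in vocabulary]
    scores = {}
    for vj in priors:
        # log probabilities avoid numerical underflow on long documents
        scores[vj] = math.log(priors[vj]) + sum(math.log(likelihoods[vj][w]) for w in words)
    return max(scores, key=scores.get)

# Placeholder example
train = [('I love this sandwich', 'pos'), ('My boss is horrible', 'neg')]
V, P, L = learn_naive_bayes_text(train)
print(classify_naive_bayes_text('I love my boss', V, P, L))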
Data set:

Text Documents Label


1 I love this sandwich pos
2 This is an amazing place pos
3 I feel very good about these beers pos
4 This is my best work pos
5 What an awesome view pos
6 I do not like this restaurant neg
7 I am tired of this stuff neg
8 I can't deal with this neg
9 He is my sworn enemy neg
10 My boss is horrible neg
11 This is an awesome place pos
12 I do not like the taste of this juice neg
13 I love to dance pos
14 I am sick and tired of this place neg
15 What a great holiday pos
16 That is a bad locality to stay neg
17 We will have good fun tomorrow pos
18 I went to my enemy's house today neg

Program:

import pandas as pd

msg = pd.read_csv('naivetext.csv', names=['message', 'label'])

print('The dimensions of the dataset', msg.shape)

msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})
X = msg.message
y = msg.labelnum

print(X)
print(y)

# splitting the dataset into train and test data
from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(X, y)

print('\n The total number of Training Data :', ytrain.shape)
print('\n The total number of Test Data :', ytest.shape)

# output of the count vectoriser is a sparse matrix
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)
xtest_dtm = count_vect.transform(xtest)
print('\n The words or Tokens in the text documents \n')
print(count_vect.get_feature_names())

df = pd.DataFrame(xtrain_dtm.toarray(), columns=count_vect.get_feature_names())

# training the Naive Bayes (NB) classifier on the training data
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(xtrain_dtm, ytrain)
predicted = clf.predict(xtest_dtm)

# printing accuracy, confusion matrix, precision and recall
from sklearn import metrics
print('\n Accuracy of the classifier is',
      metrics.accuracy_score(ytest, predicted))
print('\n Confusion matrix')
print(metrics.confusion_matrix(ytest, predicted))

print('\n The value of Precision',
      metrics.precision_score(ytest, predicted))

print('\n The value of Recall',
      metrics.recall_score(ytest, predicted))
Output:

The dimensions of the dataset (18, 2)


0 I love this sandwich
1 This is an amazing place
2 I feel very good about these beers
3 This is my best work
4 What an awesome view
5 I do not like this restaurant
6 I am tired of this stuff
7 I can't deal with this
8 He is my sworn enemy
9 My boss is horrible
10 This is an awesome place
11 I do not like the taste of this juice
12 I love to dance
13 I am sick and tired of this place
14 What a great holiday
15 That is a bad locality to stay
16 We will have good fun tomorrow
17 I went to my enemy's house today

Name: message, dtype: object


0 1
1 1
2 1
3 1
4 1
5 0
6 0
7 0
8 0
9 0
10 1
11 0
12 1
13 0
14 1
15 0
16 1
17 0
Name: labelnum, dtype: int64

The total number of Training Data : (13,)
The total number of Test Data : (5,)
The words or Tokens in the text documents
['about', 'am', 'amazing', 'an', 'and', 'awesome', 'beers',
'best', 'can', 'deal', 'do', 'enemy', 'feel',
'fun', 'good', 'great', 'have', 'he', 'holiday', 'house', 'is',
'like', 'love', 'my', 'not', 'of', 'place',
'restaurant', 'sandwich', 'sick', 'sworn', 'these', 'this',
'tired', 'to', 'today', 'tomorrow', 'very', 'view', 'we',
'went', 'what', 'will', 'with', 'work']

Accuracy of the classifier is 0.8

Confusion matrix
[[2 1]
 [0 2]]

The value of Precision 0.6666666666666666
The value of Recall 1.0

Basic knowledge

Confusion Matrix

True positives (TP): data points labelled as positive that are actually positive
False positives (FP): data points labelled as positive that are actually negative
True negatives (TN): data points labelled as negative that are actually negative
False negatives (FN): data points labelled as negative that are actually positive

Accuracy: how often is the classifier correct?

Accuracy = (TP + TN) / (TP + FP + TN + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
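
These definitions can be checked against the output above (confusion matrix [[2 1], [0 2]], i.e. TN = 2, FP = 1, FN = 0, TP = 2) with a short sketch:

# Counts taken from the confusion matrix printed above
TP, FP, TN, FN = 2, 1, 2, 0

accuracy  = (TP + TN) / (TP + FP + TN + FN)
precision = TP / (TP + FP)
recall    = TP / (TP + FN)

print(accuracy, precision, recall)   # 0.8  0.666...  1.0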


Example: Movie Review

Doc Text Class

1 I loved the movie +


2 I hated the movie -
3 a great movie. good movie +
4 poor acting -
5 great acting. good movie +

Unique word
< I, loved, the, movie, hated, a, great, good, poor, acting>

Doc   I   loved  the  movie  hated  a   great  good  poor  acting  Class
1     1   1      1    1      0      0   0      0     0     0       +
2     1   0      1    1      1      0   0      0     0     0       -
3     0   0      0    2      0      1   1      1     0     0       +
4     0   0      0    0      0      0   0      0     1     1       -
5     0   0      0    1      0      0   1      1     0     0       +

Documents with the positive (+) class:

Doc   I   loved  the  movie  hated  a   great  good  poor  acting  Class
1     1   1      1    1      0      0   0      0     0     0       +
3     0   0      0    2      0      1   1      1     0     0       +
5     0   0      0    1      0      0   1      1     0     0       +

P(+) = 3 / 5 = 0.6

P(I | +)      = (1 + 1) / (14 + 10) = 0.0833
P(loved | +)  = (1 + 1) / (14 + 10) = 0.0833
P(the | +)    = (1 + 1) / (14 + 10) = 0.0833
P(movie | +)  = (4 + 1) / (14 + 10) = 0.2083
P(hated | +)  = (0 + 1) / (14 + 10) = 0.0416
P(a | +)      = (1 + 1) / (14 + 10) = 0.0833
P(great | +)  = (2 + 1) / (14 + 10) = 0.125
P(good | +)   = (2 + 1) / (14 + 10) = 0.125
P(poor | +)   = (0 + 1) / (14 + 10) = 0.0416
P(acting | +) = (1 + 1) / (14 + 10) = 0.0833

Documents with the negative (−) class:

Doc   I   loved  the  movie  hated  a   great  good  poor  acting  Class
2     1   0      1    1      1      0   0      0     0     0       -
4     0   0      0    0      0      0   0      0     1     1       -

P(−) = 2 / 5 = 0.4

P(I | −)      = (1 + 1) / (6 + 10) = 0.125
P(loved | −)  = (0 + 1) / (6 + 10) = 0.0625
P(the | −)    = (1 + 1) / (6 + 10) = 0.125
P(movie | −)  = (1 + 1) / (6 + 10) = 0.125
P(hated | −)  = (1 + 1) / (6 + 10) = 0.125
P(a | −)      = (0 + 1) / (6 + 10) = 0.0625
P(great | −)  = (0 + 1) / (6 + 10) = 0.0625
P(good | −)   = (0 + 1) / (6 + 10) = 0.0625
P(poor | −)   = (1 + 1) / (6 + 10) = 0.125
P(acting | −) = (1 + 1) / (6 + 10) = 0.125

Let’s classify the new document

I hated the poor acting

If vj = + then,
P(+) P(I | +) P(hated | +) P(the | +) P(poor | +) P(acting | +)
= 0.6 * 0.0833 * 0.0416 * 0.0833 * 0.0416 * 0.0833
= 6.03 X 10^-7

If vj = − then,
P(−) P(I | −) P(hated | −) P(the | −) P(poor | −) P(acting | −)
= 0.4 * 0.125 * 0.125 * 0.125 * 0.125 * 0.125
= 1.22 X 10^-5

Since 1.22 X 10^-5 > 6.03 X 10^-7, the new document belongs to the ( − ) class.
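
The two scores above can be reproduced with a few lines of Python. This is only a sanity check of the hand calculation; the word probabilities are the Laplace-smoothed values derived from the tables above.

# Laplace-smoothed likelihoods from the tables above
p_pos = {'I': 2/24, 'hated': 1/24, 'the': 2/24, 'poor': 1/24, 'acting': 2/24}
p_neg = {'I': 2/16, 'hated': 2/16, 'the': 2/16, 'poor': 2/16, 'acting': 2/16}

doc = ['I', 'hated', 'the', 'poor', 'acting']

score_pos, score_neg = 0.6, 0.4          # class priors P(+) and P(-)
for w in doc:
    score_pos *= p_pos[w]
    score_neg *= p_neg[w]

print(score_pos)                          # about 6.03e-07
print(score_neg)                          # about 1.22e-05
print('+' if score_pos > score_neg else '-')   # prints '-'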
7. Write a program to construct a Bayesian network considering medical data. Use this model to
demonstrate the diagnosis of heart patients using a standard Heart Disease Data Set. You can
use Java/Python ML library classes/API.
Theory

A Bayesian network is a directed acyclic graph in which each node corresponds to a unique random
variable and each edge corresponds to a conditional dependency.

A Bayesian network consists of two major parts: a directed acyclic graph and a set of
conditional probability distributions.

 The directed acyclic graph is a set of random variables represented by nodes.
 The conditional probability distribution of a node (random variable) is
defined for every possible outcome of the preceding causal node(s).
For illustration, consider the following example. Suppose we attempt to turn on our
computer, but the computer does not start (observation/evidence). We would like to
know which of the possible causes of computer failure is more likely. In this
simplified illustration, we assume only two possible causes of this misfortune:
electricity failure and computer malfunction.

The corresponding directed acyclic graph is depicted in below figure.

The goal is to calculate the posterior conditional probability distribution of each of the
possible unobserved causes given the observed evidence, i.e. P [Cause | Evidence].
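
A minimal sketch of this two-cause example using pgmpy is shown below. All probability values are assumed for illustration only; older pgmpy releases expose the model class as BayesianModel (used in the program later in this experiment), newer ones as BayesianNetwork.

from pgmpy.models import BayesianModel            # BayesianNetwork in newer pgmpy releases
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Structure: both causes point to the observed effect
model = BayesianModel([('Electricity', 'Computer'), ('Malfunction', 'Computer')])

# All numbers below are assumed for illustration
cpd_e = TabularCPD('Electricity', 2, [[0.9], [0.1]])       # 0 = ok, 1 = failure
cpd_m = TabularCPD('Malfunction', 2, [[0.95], [0.05]])     # 0 = ok, 1 = faulty
cpd_c = TabularCPD('Computer', 2,
                   # P(Computer | Electricity, Malfunction); columns: (E,M) = (0,0),(0,1),(1,0),(1,1)
                   [[0.99, 0.10, 0.02, 0.01],              # Computer = 0 (starts)
                    [0.01, 0.90, 0.98, 0.99]],             # Computer = 1 (does not start)
                   evidence=['Electricity', 'Malfunction'], evidence_card=[2, 2])

model.add_cpds(cpd_e, cpd_m, cpd_c)

# P(Cause | Evidence): posterior of each cause given that the computer does not start
infer = VariableElimination(model)
print(infer.query(variables=['Electricity'], evidence={'Computer': 1}))
print(infer.query(variables=['Malfunction'], evidence={'Computer': 1}))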

Data Set:
Title: Heart Disease Databases
The Cleveland database contains 76 attributes, but all published experiments refer to
using a subset of 14 of them. In particular, the Cleveland database is the only one that
has been used by ML researchers to this date. The “Heartdisease” field refers to the
presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4.

Database: 0 1 2 3 4 Total
Cleveland: 164 55 36 35 13 303

Attribute Information:

1. age: age in years


2. sex: sex (1 = male; 0 = female)
3. cp: chest pain type
1. Value 1: typical angina
2. Value 2: atypical angina
3. Value 3: non-anginal pain
4. Value 4: asymptomatic
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital)
5. chol: serum cholestoral in mg/dl
6. fbs: (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
7. restecg: resting electrocardiographic results
1. Value 0: normal
2. Value 1: having ST-T wave abnormality (T wave inversions
and/or ST elevation or depression of > 0.05 mV)
3. Value 2: showing probable or definite left ventricular
hypertrophy by Estes’ criteria
8. thalach: maximum heart rate achieved
9. exang: exercise induced angina (1 = yes; 0 = no)
10.oldpeak = ST depression induced by exercise relative to rest
11.slope: the slope of the peak exercise ST segment
1. Value 1: upsloping
2. Value 2: flat
3. Value 3: downsloping
12. ca: number of major vessels (0-3) colored by fluoroscopy
13. thal: 3 = normal; 6 = fixed defect; 7 = reversable defect
14. Heartdisease: It is integer valued from 0 (no presence) to 4.

Some instance from the dataset:

age  sex  cp  trestbps  chol  fbs  restecg  thalach  exang  oldpeak  slope  ca  thal  Heartdisease
63   1    1   145       233   1    2        150      0      2.3      3      0   6     0
67   1    4   160       286   0    2        108      1      1.5      2      3   3     2
67   1    4   120       229   0    2        129      1      2.6      2      2   7     1
41   0    2   130       204   0    2        172      0      1.4      1      0   3     0
62   0    4   140       268   0    2        160      0      3.6      3      2   3     3
60   1    4   130       206   0    2        132      1      2.4      2      2   7     4

Python Program to Implement and Demonstrate Bayesian network using pgmpy Machine
Learning

import numpy as np

import pandas as pd

import csv

from pgmpy.estimators import MaximumLikelihoodEstimator

from pgmpy.models import BayesianModel

from pgmpy.inference import VariableElimination

heartDisease = pd.read_csv('heart.csv')

heartDisease = heartDisease.replace('?',np.nan)

print('Sample instances from the dataset are given below')

print(heartDisease.head())

print('\n Attributes and datatypes')

print(heartDisease.dtypes)
model = BayesianModel([('age', 'heartdisease'), ('sex', 'heartdisease'),
    ('exang', 'heartdisease'), ('cp', 'heartdisease'),
    ('heartdisease', 'restecg'), ('heartdisease', 'chol')])

print('\nLearning CPD using Maximum likelihood estimators')

model.fit(heartDisease,estimator=MaximumLikelihoodEstimator)

print('\n Inferencing with Bayesian Network:')

HeartDiseasetest_infer = VariableElimination(model)

print('\n 1. Probability of HeartDisease given evidence= restecg')

q1 = HeartDiseasetest_infer.query(variables=['heartdisease'], evidence={'restecg': 1})

print(q1)

print('\n 2. Probability of HeartDisease given evidence= cp ')

q2 = HeartDiseasetest_infer.query(variables=['heartdisease'], evidence={'cp': 2})

print(q2)
Output

8. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set
for clustering using k-Means algorithm. Compare the results of these two algorithms and
comment on the quality of clustering. You can add Java/Python ML library classes/API
in the program.

Python Program to Implement and Demonstrate K-Means and EM Algorithm Machine Learning

from sklearn.cluster import KMeans


from sklearn.mixture import GaussianMixture

import sklearn.metrics as metrics

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

names = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width',
'Class']

dataset = pd.read_csv("8-dataset.csv", names=names)

X = dataset.iloc[:, :-1]

label = {'Iris-setosa': 0,'Iris-versicolor': 1, 'Iris-virginica': 2}

y = [label[c] for c in dataset.iloc[:, -1]]

plt.figure(figsize=(14,7))

colormap=np.array(['red','lime','black'])

# REAL PLOT

plt.subplot(1,3,1)
plt.title('Real')

plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[y])

# K-PLOT

model=KMeans(n_clusters=3, random_state=0).fit(X)

plt.subplot(1,3,2)

plt.title('KMeans')

plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[model.labels_])

print('The accuracy score of K-Mean: ',metrics.accuracy_score(y, model.labels_))

print('The Confusion matrixof K-Mean:\n',metrics.confusion_matrix(y, model.labels_))

# GMM PLOT

gmm=GaussianMixture(n_components=3, random_state=0).fit(X)

y_cluster_gmm=gmm.predict(X)

plt.subplot(1,3,3)

plt.title('GMM Classification')

plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[y_cluster_gmm])

print('The accuracy score of EM: ',metrics.accuracy_score(y, y_cluster_gmm))

print('The Confusion matrix of EM:\n ',metrics.confusion_matrix(y, y_cluster_gmm))

Output
The accuracy score of K-Mean: 0.24

The Confusion matrixof K-Mean:

[[ 0 50 0]
[48 0 2]
[14 0 36]]

The accuracy score of EM: 0.36666666666666664

The Confusion matrix of EM:

[[50 0 0]
[ 0 5 45]
[ 0 50 0]]
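
Because the cluster numbers that k-Means and GMM assign are arbitrary, the raw accuracy scores above understate how well the clusters match the true classes (the k-Means confusion matrix shows mostly consistent clusters, just with permuted labels). A label-independent measure such as the adjusted Rand index gives a fairer comparison; the snippet below is a sketch that reuses the variables y, model and y_cluster_gmm from the program above.

from sklearn.metrics import adjusted_rand_score

# Compare cluster assignments with the true labels, ignoring label numbering
print('ARI of K-Means :', adjusted_rand_score(y, model.labels_))
print('ARI of EM (GMM):', adjusted_rand_score(y, y_cluster_gmm))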

9. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data
set. Print both correct and wrong predictions. Java/Python ML library classes can be
used for this problem.
K-Nearest Neighbor Algorithm

Training algorithm:

 For each training example (x, f(x)), add the example to the list training examples

Classification algorithm:

 Given a query instance xq to be classified,
 Let x1 . . . xk denote the k instances from training examples that are nearest to xq
 Return
f(xq) ← argmax over v in V of  Σ (i = 1 to k) δ(v, f(xi)),  where δ(a, b) = 1 if a = b and 0 otherwise
 For a continuous-valued target, f(xq) is instead the mean of the f(xi) values of the k nearest training examples.

Data Set:

Iris Plants Dataset: Dataset contains 150 instances (50 in each of three classes)
Number of Attributes: 4 numeric, predictive attributes and the Class.
Sample Data

Python Program to Implement and Demonstrate KNN Algorithm

import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn import metrics

names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

# Read dataset to pandas dataframe
dataset = pd.read_csv("9-dataset.csv", names=names)

X = dataset.iloc[:, :-1]
y = dataset.iloc[:, -1]
print(X.head())

Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.10)

classifier = KNeighborsClassifier(n_neighbors=5).fit(Xtrain, ytrain)

ypred = classifier.predict(Xtest)

i = 0
print("\n-------------------------------------------------------------------------")
print('%-25s %-25s %-25s' % ('Original Label', 'Predicted Label', 'Correct/Wrong'))
print("-------------------------------------------------------------------------")
for label in ytest:
    print('%-25s %-25s' % (label, ypred[i]), end="")
    if (label == ypred[i]):
        print(' %-25s' % ('Correct'))
    else:
        print(' %-25s' % ('Wrong'))
    i = i + 1
print("-------------------------------------------------------------------------")

print("\nConfusion Matrix:\n", metrics.confusion_matrix(ytest, ypred))
print("-------------------------------------------------------------------------")

print("\nClassification Report:\n", metrics.classification_report(ytest, ypred))
print("-------------------------------------------------------------------------")

print('Accuracy of the classifier is %0.2f' % metrics.accuracy_score(ytest, ypred))
print("-------------------------------------------------------------------------")

Output
sepal-length sepal-width petal-length petal-width

0 5.1 3.5 1.4 0.2

1 4.9 3.0 1.4 0.2

2 4.7 3.2 1.3 0.2

3 4.6 3.1 1.5 0.2

4 5.0 3.6 1.4 0.2


Original Label Predicted Label Correct/Wrong

Iris-versicolor Iris-versicolor Correct

Iris-virginica Iris-versicolor Wrong

Iris-virginica Iris-virginica Correct

Iris-versicolor Iris-versicolor Correct

Iris-setosa Iris-setosa Correct

Iris-versicolor Iris-versicolor Correct

Iris-setosa Iris-setosa Correct

Iris-setosa Iris-setosa Correct

Iris-virginica Iris-virginica Correct

Iris-virginica Iris-versicolor Wrong

Iris-virginica Iris-virginica Correct

Iris-setosa Iris-setosa Correct

Iris-virginica Iris-virginica Correct

Iris-virginica Iris-virginica Correct

Iris-versicolor Iris-versicolor Correct

Confusion Matrix:

[[4 0 0]
 [0 4 0]
 [0 2 5]]

Classification Report:

precision recall f1-score support

Iris-setosa 1.00 1.00 1.00 4

Iris-versicolor 0.67 1.00 0.80 4

Iris-virginica 1.00 0.71 0.83 7

avg / total 0.91 0.87 0.87 15

Accuracy of the classifier is 0.87
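
The value k = 5 used above is a common default; a simple way to justify the choice is cross-validation over a range of k values. A minimal sketch, assuming X and y from the program above:

from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Mean 5-fold cross-validation accuracy for each candidate k
for k in range(1, 16, 2):
    knn = KNeighborsClassifier(n_neighbors=k)
    score = cross_val_score(knn, X, y, cv=5).mean()
    print('k = %2d  accuracy = %.3f' % (k, score))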

Exp. No. 10: Implement the non-parametric Locally Weighted Regression algorithm in Python in order
to fit data points. Select the appropriate data set for your experiment and draw graphs.
Locally Weighted Regression Algorithm

Regression:

 Regression is a technique from statistics that is used to predict values of the desired target
quantity when the target quantity is continuous.
 In regression, we seek to identify (or estimate) a continuous variable y associated with a
given input vector x.
 y is called the dependent variable.
 x is called the independent variable.

Loess/Lowess Regression:

Loess regression is a nonparametric technique that uses locally weighted regression to fit a smooth
curve through points in a scatter plot.

Lowess Algorithm:

Locally weighted regression is a very powerful nonparametric model used in statistical learning.

Given a dataset X, y, we attempt to find model parameters β(x0) that minimize the residual sum of
weighted squared errors.

The weights are given by a kernel function (k or w), which can be chosen arbitrarily; the program
below uses a Gaussian kernel.

Algorithm
1. Read the given data sample into X and the curve (linear or non-linear) values into Y
2. Set the value of the smoothening (free) parameter τ
3. Set the point of interest x0, which is a row of X
4. Determine the weight matrix W using:
   w(i, i) = exp( -(xi - x0)^2 / (2τ^2) )
5. Determine the value of the model parameter β using:
   β(x0) = (X^T W X)^-1 (X^T W y)
6. Prediction = x0 * β(x0)

Python Program to Implement and Demonstrate Locally Weighted Regression Algorithm

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

def kernel(point, xmat, k):
    # Gaussian kernel: weight of each training point relative to the query point
    m, n = np.shape(xmat)
    weights = np.mat(np.eye(m))
    for j in range(m):
        diff = point - xmat[j]
        weights[j, j] = np.exp(diff * diff.T / (-2.0 * k**2))
    return weights

def localWeight(point, xmat, ymat, k):
    # beta(x0) = (X^T W X)^-1 (X^T W y)
    wei = kernel(point, xmat, k)
    W = (xmat.T * (wei * xmat)).I * (xmat.T * (wei * ymat.T))
    return W

def localWeightRegression(xmat, ymat, k):
    m, n = np.shape(xmat)
    ypred = np.zeros(m)
    for i in range(m):
        ypred[i] = xmat[i] * localWeight(xmat[i], xmat, ymat, k)
    return ypred

# load data points
data = pd.read_csv('10-dataset.csv')
bill = np.array(data.total_bill)
tip = np.array(data.tip)

# preparing the data: add a column of ones to the bill values
mbill = np.mat(bill)
mtip = np.mat(tip)
m = np.shape(mbill)[1]
one = np.mat(np.ones(m))
X = np.hstack((one.T, mbill.T))

# set the smoothening parameter k (tau) here
ypred = localWeightRegression(X, mtip, 0.5)

SortIndex = X[:, 1].argsort(0)
xsort = X[SortIndex][:, 0]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.scatter(bill, tip, color='green')
ax.plot(xsort[:, 1], ypred[SortIndex], color='red', linewidth=5)
plt.xlabel('Total bill')
plt.ylabel('Tip')
plt.show()
Output: a scatter plot of the total bill versus tip data (green) with the fitted locally weighted regression curve drawn in red.
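
As a cross-check of the from-scratch implementation, the statsmodels library offers a ready-made LOWESS smoother. A minimal sketch, assuming the same '10-dataset.csv' (tips data) is available:

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.nonparametric.smoothers_lowess import lowess

data = pd.read_csv('10-dataset.csv')

# frac plays a role similar to the smoothening parameter tau above
smoothed = lowess(data.tip, data.total_bill, frac=0.5)   # returns sorted (x, yfit) pairs

plt.scatter(data.total_bill, data.tip, color='green')
plt.plot(smoothed[:, 0], smoothed[:, 1], color='red', linewidth=3)
plt.xlabel('Total bill')
plt.ylabel('Tip')
plt.show()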
Viva Questions
1. What is machine learning?
2. Define supervised learning
3. Define unsupervised learning
4. Define semi supervised learning
5. Define reinforcement learning
6. What do you mean by hypotheses
7. What is classification
8. What is clustering
9. Define precision, accuracy and recall
10. Define entropy
11. Define regression
12. How Knn is different from k-means clustering
13. What is concept learning
14. Define specific boundary and general boundary
15. Define target function
16. Define decision tree
17. What is ANN
18. Explain gradient descent approximation
19. State Bayes theorem
20. Define Bayesian belief networks
21. Differentiate hard and soft clustering
22. Define variance
23. What is inductive machine learning
24. Why K nearest neighbour algorithm is lazy learning algorithm
25. Why naïve Bayes is naïve
26. Mention classification algorithms
27. Define pruning
Data Sources:

https://www.vtupulse.com/machine-learning
https://archive.ics.uci.edu/datasets
https://github.com/
