
MACHINE LEARNING LABORATORY MANUAL [ICS-553]

B.TECH III YEAR – V SEM
[A.Y: 2023-2024]

Department of Computer Science and Engineering

Vision

 To impart quality education and instill high standards of discipline, making the students technologically superior and ethically strong, thereby improving the quality of life of the human race.

Mission

 To achieve and impart holistic technical education using the best infrastructure and outstanding technical and teaching expertise, to establish the students as competent and confident engineers.

 To evolve a centre of excellence through creative and innovative teaching-learning practices, promoting academic achievement to produce internationally accepted, competitive and world-class professionals.
PROGRAMME EDUCATIONAL OBJECTIVES

PEO1 – ANALYTICAL SKILLS

 To facilitate the graduates with the ability to visualize, gather information, articulate, analyze, solve complex problems, and make decisions. These are essential to address the challenges of complex and computation-intensive problems and to increase their productivity.

PEO2 – TECHNICAL SKILLS

 To facilitate the graduates with the technical skills that prepare them for immediate employment and for certification, providing a deeper understanding of the technology in advanced areas of computer science and related fields, thus encouraging them to pursue higher education and research based on their interests.
PEO3 – SOFT SKILLS

 To facilitate the graduates with the soft skills that include fulfilling the mission, setting goals, showing self-confidence by communicating effectively, having a positive attitude, getting involved in team-work, being a leader, and managing their career and their life.
PEO4 – PROFESSIONAL ETHICS

 To facilitate the graduates with the knowledge of professional and ethical responsibilities by paying attention to grooming, being conservative with style, following dress codes and safety codes, and adapting themselves to technological advancements.
PROGRAM OUTCOMES

Engineering Graduates will be able to:


1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.
2. Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.
3. Design / development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for, sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change.

1. Lab Objectives:
 Make use of data sets in implementing the machine learning algorithms.
 Implement the machine learning concepts and algorithms in any suitable language of choice.
 Learn basic concepts of Python through illustrative examples and small exercises.
 To familiarize students with Python programming in an AI environment.
 To provide students with an academic environment that creates awareness of various AI algorithms.
 To train students in Python programming to comprehend, analyze, design, and create AI platforms and solutions for real-life problems.

2. Description (If any):

1. The programs can be implemented in either Java or Python.

2. Data sets can be taken from standard repositories (https://archive.ics.uci.edu/datasets) or constructed by the students.

3. Study Experiment / Project:


Course outcomes: The students should be able to:

1. Understand the implementation procedures for the machine learning algorithms.


2. Apply various AI search algorithms (uninformed, informed, heuristic, constraint satisfaction).
3. Demonstrate working knowledge of reasoning in the presence of incomplete
and/or uncertain information.
4. Design Java/Python programs for various Learning algorithms.
5. Apply appropriate data sets to the Machine Learning algorithms.
6. Understand the fundamentals of knowledge representation and inference.
7. Identify and apply Machine Learning algorithms to solve real world problems.
4. Introduction about lab Minimum System requirements:
 Processors: Intel Atom® processor or Intel® Core™ i3 processor.
 Disk space: 1 GB.
 Operating systems: Windows* 7 or later, macOS, and Linux.
 Python versions: 2.7.X, 3.6.X, 3.8.X, 3.12.X
 IDEs: Anaconda, Jupyter Notebook, Spyder, PyDev (Eclipse), etc.
About language:
Python is a general-purpose, high-level programming language; other high-level languages you might have heard of are C++, PHP, and Java. Virtually all modern programming languages make use of an Integrated Development Environment (IDE), which allows the creation, editing, testing, and saving of programs and modules. In Python, the bundled IDE is called IDLE (like many items in the language, this is a reference to the British comedy group Monty Python, and in this case, one of its members, Eric Idle).
Many modern languages use both processes. They are first compiled into a lower-level language, called byte code, and then interpreted by a program called a virtual machine. Python uses both processes, but because of the way programmers interact with it, it is usually considered an interpreted language.
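For instance, a minimal illustration (using only the standard library's dis module, which this manual does not otherwise require) of inspecting the byte code Python generates for a small function:

import dis

def add_one(x):
    return x + 1

# Prints the byte-code instructions the interpreter will execute
# (e.g. LOAD_FAST and an add/return instruction; exact names vary by Python version)
dis.dis(add_one)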
Practical aspects are the key to understanding and conceptual visualization of the theoretical aspects covered in the books. This course also reviews concepts studied in previous semesters and implements the various algorithms covered in the theory course.

How to install Python on Windows?


Python is a high-level programming language that has become increasingly popular due to its simplicity, versatility, and extensive range of applications. The process of installing Python on the Windows operating system is relatively easy and involves a few uncomplicated steps.
How to Install Python in Windows?
We have provided step-by-step instructions to guide you and ensure a successful installation.
Whether you are new to programming or have some experience, mastering how to install Python on
Windows will enable you to utilize this potent language and uncover its full range of potential
applications.
To download Python on your system, you can use the following steps
Step 1: Select Version to Install Python
Visit the official page for Python https://www.python.org/downloads/ on the Windows operating
system. Locate a reliable version of Python 3, preferably version 3.10.11, which was used in testing
this tutorial. Choose the correct link for your device from the options provided: either Windows
installer (64-bit) or Windows installer (32-bit) and proceed to download the executable file.

Python Homepage
Step 2: Downloading the Python Installer
Once you have downloaded the installer, open the .exe file, such as python-3.10.11-amd64.exe, by double-clicking it to launch the Python installer. Choose the option to Install the launcher for all users by checking the corresponding checkbox, so that all users of the computer can access the Python launcher application. Enable users to run Python from the command line by checking the Add python.exe to PATH checkbox.

Python Installer

After Clicking the Install Now Button the setup will start installing Python on your Windows
system. You will see a window like this.

Python Setup

Step 3: Running the Executable Installer


After completing the setup, Python will be installed on your Windows system and you will see a success message.
Python Successfully installed

Step 4: Verify the Python Installation in Windows


Close the window after successful installation of Python. You can check if the installation of Python
was successful by using either the command line or the Integrated Development Environment
(IDLE), which you may have installed. To access the command line, click on the Start menu and
type “cmd” in the search bar. Then click on Command Prompt.
python --version

Python version

You can also check the version of Python by opening the IDLE application. Go to Start, enter IDLE in the search bar, and then click the IDLE app, for example, IDLE (Python 3.10.11 64-bit). If you can see the Python IDLE window, then you have successfully downloaded and installed Python on Windows.

Python IDLE

Getting Started with Python


Python is a lot easier to code and learn. Python programs can be written on any plain text editor
like Notepad, notepad++, or anything of that sort. One can also use an Online IDE to run Python
code or can even install one on their system to make it more feasible to write these codes because
IDEs provide a lot of features like an intuitive code editor, debugger, compiler, etc. To begin with,
writing Python Codes and performing various intriguing and useful operations, one must have Python
installed on their System.
How to Install Anaconda on Windows?
Anaconda is open-source software that contains Jupyter, Spyder, and other tools used for large data processing, data analytics, and heavy scientific computing. Anaconda works for the R and Python programming languages. Spyder (a sub-application of Anaconda) is used for Python; OpenCV for Python will work in Spyder. Package versions are managed by the package management system called conda.
To begin working with Anaconda, one must get it installed first. Follow the below instructions to
Download and install Anaconda on your system:

Download and install Anaconda:

Head over to anaconda.com and install the latest version of Anaconda. Make sure to download the
“Python 3.7 Version” for the appropriate architecture.

Begin with the installation process:

 Getting Started:
 Getting through the License Agreement:

 Select Installation Type: Select Just Me if you want the software to be used by a single User
 Choose Installation Location:

 Advanced Installation Option:


 Getting through the Installation Process:

 Recommendation to Install Pycharm:


 Finishing up the Installation:

Working with Anaconda:

Once the installation process is done, Anaconda can be used to perform multiple operations. To begin
using Anaconda, search for Anaconda Navigator from the Start Menu in Windows
How to install Jupyter Notebook on Windows?
Jupyter Notebook is an open-source web application that allows you to create and share documents
that contain live code, equations, visualizations, and narrative text. Uses include data cleaning and
transformation, numerical simulation, statistical modeling, data visualization, machine learning, and
much more.
Jupyter has support for over 40 different programming languages and Python is one of them. Python
is a requirement (Python 3.3 or greater, or Python 2.7) for installing the Jupyter Notebook itself.
Jupyter Notebook can be installed by using either of the two ways described below:
 Using Anaconda:
Install Python and Jupyter using the Anaconda Distribution, which includes Python, the Jupyter
Notebook, and other commonly used packages for scientific computing and data science. To install
Anaconda, go through How to install Anaconda on windows? and follow the instructions provided.
 Using PIP:
Install Jupyter using the PIP package manager used to install and manage software
packages/libraries written in Python. To install pip, go through How to install PIP on
Windows? and follow the instructions provided.

Installing Jupyter Notebook using Anaconda:

To install Jupyter using Anaconda, just go through the following instructions:
 Launch Anaconda Navigator:

 Click on the Install Jupyter Notebook Button:


 Beginning the Installation:

 Loading Packages:
 Finished Installation:

Launching Jupyter:
Installing Jupyter Notebook using pip:

PIP is a package management system used to install and manage software packages/libraries written
in Python. These files are stored in a large “on-line repository” termed as Python Package Index
(PyPI).
pip uses PyPI as the default source for packages and their dependencies.
To install Jupyter using pip, we need to first check if pip is updated in our system. Use the following
command to update pip:
python -m pip install --upgrade pip

After updating the pip version, follow the instructions provided below to install Jupyter:
 Command to install Jupyter:
 python -m pip install jupyter
 Beginning Installation:
 Downloading Files and Data:

 Installing Packages:

 Finished Installation:

Launching Jupyter:
Use the following command to launch Jupyter using command-line:
jupyter notebook

Machine learning
Machine learning is a subset of artificial intelligence in
the field of computer science that often uses statistical
techniques to give computers the ability to "learn" (i.e.,
progressively improve performance on a specific task) with
data, without being explicitly programmed. In the past
decade, machine learning has given us self-driving cars,
practical speech recognition, effective web search, and a
vastly improved understanding of the human genome.
Machine learning tasks
Machine learning tasks are typically classified into two
broad categories, depending on whether there is a learning
"signal" or "feedback" available to a learning system:

Supervised learning: The computer is presented with example inputs and their desired outputs, given by a "teacher", and the goal is to learn a general rule that maps inputs to outputs. As special cases, the input signal can be only partially available, or restricted to special feedback:

Semi-supervised learning: the computer is given only an incomplete training signal: a training set with some (often many) of the target outputs missing.

Active learning: the computer can only obtain training labels for a limited set of instances (based on a budget), and also has to optimize its choice of objects to acquire labels for. When used interactively, these can be presented to the user for labeling.

Reinforcement learning: training data (in the form of rewards and punishments) is given only as feedback to the program's actions in a dynamic environment, such as driving a vehicle or playing a game against an opponent.

Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end (feature learning).

Supervised learning                                        Unsupervised learning   Instance-based learning
Find-S algorithm                                           EM algorithm            Locally weighted regression algorithm
Candidate elimination algorithm                            K-means algorithm
Decision tree algorithm
Back propagation algorithm
Naïve Bayes algorithm
K-nearest neighbour algorithm (lazy learning algorithm)
Machine learning applications
In classification, inputs are divided into two or more
classes, and the learner must produce a model that assigns
unseen inputs to one or more (multi-label classification)
of these classes. This is typically tackled in a supervised
manner. Spam filtering is an example of classification,
where the inputs are email (or other) messages and the
classes are "spam" and "not spam". In regression, also a
supervised problem, the outputs are continuous rather than
discrete.

In clustering, a set of inputs is to be divided into groups. Unlike in classification, the groups are not known beforehand, making this typically an unsupervised task.
Density estimation finds the distribution of inputs in some
space. Dimensionality reduction simplifies inputs by
mapping them into a lower- dimensional space. Topic
modeling is a related problem, where a program is given a
list of human language documents and is tasked with finding
out which documents cover similar topics.

Machine learning Approaches

Decision tree learning: Decision tree learning uses a decision tree as a predictive model, which maps observations about an item to conclusions about the item's target value.

Association rule learning: Association rule learning is a method for discovering interesting relations between variables in large databases.

Artificial neural networks

An artificial neural network (ANN) learning algorithm, usually called "neural network" (NN), is a learning algorithm that is vaguely inspired by biological neural networks. Computations are structured in terms of an interconnected group of artificial neurons, processing information using a connectionist approach to computation. Modern neural networks are non-linear statistical data modeling tools. They are usually used to model complex relationships between inputs and outputs, to find patterns in data, or to capture the statistical structure in an unknown joint probability distribution between observed variables.

Deep learning

Falling hardware prices and the development of GPUs for personal use in the last few years have contributed to the development of the concept of deep learning, which consists of multiple hidden layers in an artificial neural network. This approach tries to model the way the human brain processes light and sound into vision and hearing. Some successful applications of deep learning are computer vision and speech recognition.
Inductive logic programming
Inductive logic programming (ILP) is an approach to rule
learning using logic programming as a uniform representation
for input examples, background knowledge, and hypotheses.
Given an encoding of the known background knowledge and a
set of examples represented as a logical database of facts,
an ILP system will derive a hypothesized logic program that
entails all positive and no negative examples. Inductive
programming is a related field that considers any kind of
programming languages for representing hypotheses (and not
only logic programming), such as functional programs.

Support vector machines

Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other.
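As a brief illustrative sketch (assuming scikit-learn, which ships with Anaconda but is not mandated by this manual), a linear SVM can be trained on a tiny two-class data set and asked to categorize new examples:

from sklearn.svm import SVC

# Toy two-class training data: two numeric features per example
X = [[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]]
y = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel="linear")   # linear support vector classifier
clf.fit(X, y)

print(clf.predict([[0.5, 0.5], [4.5, 4.8]]))  # expected: [0 1]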

Clustering

Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that observations within the same cluster are similar according to some pre-designated criterion or criteria, while observations drawn from different clusters are dissimilar. Different clustering techniques make different assumptions on the structure of the data, often defined by some similarity metric and evaluated, for example, by internal compactness (similarity between members of the same cluster) and separation between different clusters. Other methods are based on estimated density and graph connectivity. Clustering is a method of unsupervised learning, and a common technique for statistical data analysis.
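A minimal sketch of this idea, assuming scikit-learn's KMeans (only one of many possible clustering techniques):

from sklearn.cluster import KMeans

# Unlabeled observations: two loose groups in 2-D
X = [[1, 2], [1, 4], [2, 3], [8, 8], [9, 10], [8, 9]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster index assigned to each observation
print(kmeans.cluster_centers_)  # centroid of each cluster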

Bayesian networks

A Bayesian network, belief network or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional independencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases. Efficient algorithms exist that perform inference and learning.
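As a tiny numerical illustration (with made-up probabilities, not taken from any data set in this manual), a two-node network Disease → Symptom can be queried with Bayes' rule:

# Hypothetical numbers: P(disease), P(symptom | disease), P(symptom | no disease)
p_d = 0.01
p_s_given_d = 0.9
p_s_given_not_d = 0.05

# P(disease | symptom) by Bayes' rule
p_s = p_s_given_d * p_d + p_s_given_not_d * (1 - p_d)
p_d_given_s = p_s_given_d * p_d / p_s
print(round(p_d_given_s, 3))  # ≈ 0.154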

Reinforcement learning
Reinforcement learning is concerned with how an agent ought
to take actions in an environment so as to maximize some
notion of long-term reward. Reinforcement learning
algorithms attempt to find a policy that maps states of the
world to the actions the agent ought to take in those
states. Reinforcement learning differs from the supervised
learning problem in that correct input/output pairs are
never presented, nor sub-optimal actions explicitly
corrected.

Similarity and metric learning


In this problem, the learning machine is given pairs of
examples that are considered similar and pairs of less
similar objects. It then needs to learn a similarity
function (or a distance metric function) that can predict
if new objects are similar. It is sometimes used in
Recommendation systems.

Genetic algorithms
A genetic algorithm (GA) is a search heuristic that mimics
the process of natural selection, and uses methods such as
mutation and crossover to generate new genotype in the hope
of finding good solutions to a given problem. In machine
learning, genetic algorithms found some uses in the 1980s
and 1990s. Conversely, machine learning techniques have
been used to improve the performance of genetic and
evolutionary algorithms.
Rule-based machine learning

Rule-based machine learning is a general term for any machine learning method that identifies, learns, or evolves "rules" to store, manipulate or apply knowledge. The defining characteristic of a rule-based machine learner is the identification and utilization of a set of relational rules that collectively represent the knowledge captured by the system. This is in contrast to other machine learners that commonly identify a singular model that can be universally applied to any instance in order to make a prediction. Rule-based machine learning approaches include learning classifier systems, association rule learning, and artificial immune systems.

Feature selection approach

Feature selection is the process of selecting an optimal subset of relevant features for use in model construction. It is assumed the data contains some features that are either redundant or irrelevant, and can thus be removed to reduce calculation cost without incurring much loss of information. Common optimality criteria include accuracy, similarity and information measures.
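A minimal sketch of filter-based feature selection, assuming scikit-learn's SelectKBest with a chi-squared score (one possible optimality criterion):

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)        # 4 original features
selector = SelectKBest(score_func=chi2, k=2)
X_new = selector.fit_transform(X, y)      # keep the 2 highest-scoring features

print(X.shape, "->", X_new.shape)         # (150, 4) -> (150, 2)
print(selector.get_support())             # mask of the selected features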
INDEX

Week No.   List of Programs

1    Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis based on a given set of training data samples. Read the training data from a .CSV file.

2    For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.

3    Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.

4    Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.

5    Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.

6    Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to write the program. Calculate the accuracy, precision, and recall for your data set.

7    Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using standard Heart Disease Data Set. You can use Java/Python ML library classes/API.

8    Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering. You can add Java/Python ML library classes/API in the program.

9    Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions. Java/Python ML library classes can be used for this problem.

10   Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select appropriate data set for your experiment and draw graphs.
1. Implement and demonstrate the FIND-S algorithm for finding the most specific
hypothesis based on a given set of training data samples. Read the training
data from a .CSV file.

FIND-S Algorithm
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
       For each attribute constraint ai in h
           If the constraint ai is satisfied by x
               Then do nothing
           Else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h

Training Examples:

Example Sky AirTemp Humidity Wind Water Forecast EnjoySport


1 Sunny Warm Normal Strong Warm Same Yes
2 Sunny Warm High Strong Warm Same Yes
3 Rainy Cold High Strong Warm Change No
4 Sunny Warm High Strong Cool Change Yes

To illustrate this algorithm, assume the learner is given the sequence of training examples above from the EnjoySport task.


• The first step of FIND-S is to initialize h to the most specific hypothesis in H

h0 = (Ø, Ø, Ø, Ø, Ø, Ø)
Consider the first training example
x1 = [Sunny Warm Normal Strong Warm Same], +

Observing the first training example, it is clear that hypothesis h is too specific.

None of the "Ø" constraints in h are satisfied by this example, so each is replaced by the next
more general constraint that fits the example

h1 = [Sunny Warm Normal Strong Warm Same]

Consider the second training example

x2 = [Sunny, Warm, High, Strong, Warm, Same], +

The second training example forces the algorithm to further generalize h, this time substituting a
"?" in place of any attribute value in h that is not satisfied by the new example

h2 = [Sunny Warm ? Strong Warm Same]

Consider the third training example

x3 = [Rainy, Cold, High, Strong, Warm, Change], -

Upon encountering the third training the algorithm makes no change to h. Because the FIND-S
algorithm simply ignores every negative example.

h3 = [Sunny Warm ? Strong Warm Same]

Consider the fourth training example

x4 = [Sunny Warm High Strong Cool Change], +

The fourth example leads to a further generalization of h

h4 = [Sunny Warm ? Strong ? ? ]


Program:
import csv

num_attributes = 6
a = []
print("\n The Given Training Data Set \n")

with open('enjoysport.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        a.append(row)
        print(row)

print("\n The initial value of hypothesis: ")
hypothesis = ['0'] * num_attributes
print(hypothesis)

# Initialise the hypothesis with the attribute values of the first training instance
for j in range(0, num_attributes):
    hypothesis[j] = a[0][j]

print("\n Find S: Finding a Maximally Specific Hypothesis\n")

for i in range(0, len(a)):
    if a[i][num_attributes] == 'yes':          # FIND-S considers only positive examples
        for j in range(0, num_attributes):
            if a[i][j] != hypothesis[j]:
                hypothesis[j] = '?'            # generalize the attribute constraint
            else:
                hypothesis[j] = a[i][j]
    print(" For Training instance No:{0} the hypothesis is ".format(i), hypothesis)

print("\n The Maximally Specific Hypothesis for a given Training Examples :\n")
print(hypothesis)
Data Set:

sunny warm normal strong warm same yes


sunny warm high strong warm same yes
rainy cold high strong warm change no
sunny warm high strong cool change yes

Output:

The Given Training Data Set

['sunny', 'warm', 'normal', 'strong', 'warm', 'same', 'yes']
['sunny', 'warm', 'high', 'strong', 'warm', 'same', 'yes']
['rainy', 'cold', 'high', 'strong', 'warm', 'change', 'no']
['sunny', 'warm', 'high', 'strong', 'cool', 'change', 'yes']

The initial value of hypothesis:
['0', '0', '0', '0', '0', '0']

Find S: Finding a Maximally Specific Hypothesis

For Training Example No:0 the hypothesis is
['sunny', 'warm', 'normal', 'strong', 'warm', 'same']

For Training Example No:1 the hypothesis is
['sunny', 'warm', '?', 'strong', 'warm', 'same']

For Training Example No:2 the hypothesis is
['sunny', 'warm', '?', 'strong', 'warm', 'same']

For Training Example No:3 the hypothesis is
['sunny', 'warm', '?', 'strong', '?', '?']

The Maximally Specific Hypothesis for a given Training Examples:

['sunny', 'warm', '?', 'strong', '?', '?']


2. For a given set of training data examples stored in a .CSV
file, implement and demonstrate the Candidate-Elimination
algorithm to output a description of the set of all hypotheses
consistent with the training examples.
Algorithm

For each training example d, do:
    If d is a positive example:
        Remove from G any hypothesis h inconsistent with d
        For each hypothesis s in S not consistent with d:
            Remove s from S
            Add to S all minimal generalizations of s consistent with d and having a generalization in G
            Remove from S any hypothesis with a more specific h in S
    If d is a negative example:
        Remove from S any hypothesis h inconsistent with d
        For each hypothesis g in G not consistent with d:
            Remove g from G
            Add to G all minimal specializations of g consistent with d and having a specialization in S
            Remove from G any hypothesis having a more general hypothesis in G

Solved Numerical Example – 1 (Candidate Elimination Algorithm):

Example Sky AirTemp Humidity Wind Water Forecast EnjoySport

1 Sunny Warm Normal Strong Warm Same Yes

2 Sunny Warm High Strong Warm Same Yes

3 Rain Cold High Strong Warm Change No

4 Sunny Warm High Strong Cool Change Yes


Solution:

S0: (ø, ø, ø, ø, ø, ø) Most Specific Boundary

G0: (?, ?, ?, ?, ?, ?) Most Generic Boundary

The first example is positive, the hypothesis at the specific boundary is inconsistent, hence
we extend the specific boundary, and the hypothesis at the generic boundary is consistent
hence we retain it.

S1: (Sunny,Warm, Normal, Strong, Warm, Same)


G1: (?, ?, ?, ?, ?, ?)

The second example is positive; again the hypothesis at the specific boundary is inconsistent,
hence we extend the specific boundary, and the hypothesis at the generic boundary is
consistent hence we retain it.

S2: (Sunny,Warm, ?, Strong, Warm, Same)


G2: (?, ?, ?, ?, ?, ?)

The third example is negative, the hypothesis at the specific boundary is consistent, hence
we retain it, and hypothesis at the generic boundary is inconsistent hence we write all
consistent hypotheses by removing one “?” (question mark) at time.

S3: (Sunny,Warm, ?, Strong, Warm, Same)


G3: (Sunny,?,?,?,?,?) (?,Warm,?,?,?,?) (?,?,?,?,?,Same)
The fourth example is positive, the hypothesis at the specific boundary is inconsistent,
hence we extend the specific boundary, and the consistent hypothesis at the generic
boundary are retained.

S4: (Sunny, Warm, ?, Strong, ?, ?)


G4: (Sunny,?,?,?,?,?) (?,Warm,?,?,?,?)
Learned Version Space by Candidate Elimination Algorithm for the given data set is given by S4 and G4 above.

Program:
import csv

with open("trainingexamples.csv") as f:
    csv_file = csv.reader(f)
    data = list(csv_file)

specific = data[1][:-1]
general = [['?' for i in range(len(specific))] for j in range(len(specific))]

for i in data:
    if i[-1] == "Yes":           # positive example: generalize S and prune G
        for j in range(len(specific)):
            if i[j] != specific[j]:
                specific[j] = "?"
                general[j][j] = "?"
    elif i[-1] == "No":          # negative example: specialize G
        for j in range(len(specific)):
            if i[j] != specific[j]:
                general[j][j] = specific[j]
            else:
                general[j][j] = "?"

    print("\nStep " + str(data.index(i) + 1) + " of Candidate Elimination Algorithm")
    print(specific)
    print(general)

gh = []  # gh = general Hypothesis
for i in general:
    for j in i:
        if j != '?':
            gh.append(i)
            break

print("\nFinal Specific hypothesis:\n", specific)
print("\nFinal General hypothesis:\n", gh)
Step 1 of Candidate Elimination Algorithm
['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same']
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?',
'?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?',
'?', '?'], ['?', '?', '?', '?', '?', '?']]

Step 2 of Candidate Elimination Algorithm


['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same']
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?',
'?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?',
'?', '?'], ['?', '?', '?', '?', '?', '?']]

Step 3 of Candidate Elimination Algorithm


['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?',
'?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?',
'?', '?'], ['?', '?', '?', '?', '?', '?']]

Step 4 of Candidate Elimination Algorithm


['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']
[['Sunny', '?', '?', '?', '?', '?'], ['?', 'Warm', '?', '?', '?', '?'],
['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?',
'?', '?', '?', '?'], ['?', '?', '?', '?', '?', 'Same']]

Step 5 of Candidate Elimination Algorithm


['Sunny', 'Warm', '?', 'Strong', '?', '?']
[['Sunny', '?', '?', '?', '?', '?'], ['?', 'Warm', '?', '?', '?', '?'],
['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?',
'?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]

Final Specific hypothesis:


['Sunny', 'Warm', '?', 'Strong', '?', '?']

Final General hypothesis:


[['Sunny', '?', '?', '?', '?', '?'], ['?', 'Warm', '?', '?', '?', '?']]

3. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use
an appropriate data set for building the decision tree and apply this knowledge to classify a
new sample.

Decision Tree ID3 Algorithm

ID3(Examples, Target_attribute, Attributes)

    Examples are the training examples.
    Target_attribute is the attribute whose value is to be predicted by the tree.
    Attributes is a list of other attributes that may be tested by the learned decision tree.
    Returns a decision tree that correctly classifies the given Examples.

    Create a Root node for the tree
    If all Examples are positive, Return the single-node tree Root, with label = +
    If all Examples are negative, Return the single-node tree Root, with label = -
    If Attributes is empty, Return the single-node tree Root, with label = most common value of Target_attribute in Examples
    Otherwise Begin
        A ← the attribute from Attributes that best* classifies Examples
        The decision attribute for Root ← A
        For each possible value, vi, of A,
            Add a new tree branch below Root, corresponding to the test A = vi
            Let Examples_vi be the subset of Examples that have value vi for A
            If Examples_vi is empty
                Then below this new branch add a leaf node with label = most common value of Target_attribute in Examples
            Else
                below this new branch add the subtree ID3(Examples_vi, Target_attribute, Attributes – {A})
    End
    Return Root

* The best attribute is the one with the highest information gain.

Problem Definition:
Build a decision tree using ID3 algorithm for the given training data in the table (Buy
Computer data), and predict the class of the following new example: age<=30,
income=medium, student=yes, credit-rating=fair
Solution:

age income student Credit rating Buys computer

<=30 high no fair no

<=30 high no excellent no

31…40 high no fair yes

>40 medium no fair yes

>40 low yes fair yes

>40 low yes excellent no

31…40 low yes excellent yes

<=30 medium no fair no

<=30 low yes fair yes

>40 medium yes fair yes

<=30 medium yes excellent yes

31…40 medium no excellent yes

31…40 high yes fair yes

>40 medium no excellent no


First, check which attribute provides the highest Information Gain in order to split the training set based on that attribute. We need to calculate the entropy of the whole set (the expected information needed to classify it) and the weighted entropy after splitting on each attribute; the information gain of an attribute is the overall entropy minus this weighted entropy.

The entropy of the two classes (9 yes, 5 no) is:

Entropy(S) = E(9,5) = -9/14 log2(9/14) – 5/14 log2(5/14) = 0.94


Now Consider the Age attribute
For Age, we have three values age<=30 (2 yes and 3 no), age31..40 (4 yes and 0 no), and age>40 (3 yes and
2 no)
Entropy(age) = 5/14 (-2/5 log2(2/5)-3/5log2(3/5)) + 4/14 (0) + 5/14 (-3/5log2(3/5)-2/5log2(2/5))
= 5/14(0.9709) + 0 + 5/14(0.9709) = 0.6935

Gain(age) = 0.94 – 0.6935 = 0.2465

Next, consider Income Attribute


For Income, we have three values incomehigh (2 yes and 2 no), incomemedium (4 yes and 2 no), and
incomelow (3 yes 1 no)
Entropy(income) = 4/14(-2/4log2(2/4)-2/4log2(2/4)) + 6/14 (-4/6log2(4/6)-2/6log2(2/6)) + 4/14 (-
3/4log2(3/4)-1/4log2(1/4))
= 4/14 (1) + 6/14 (0.918) + 4/14 (0.811)
= 0.285714 + 0.393428 + 0.231714 = 0.9108

Gain(income) = 0.94 – 0.9108 = 0.0292

Next, consider Student Attribute


For Student, we have two values studentyes (6 yes and 1 no) and studentno (3 yes 4 no)

Entropy(student) = 7/14(-6/7log2(6/7)-1/7log2(1/7)) + 7/14(-3/7log2(3/7)-4/7log2(4/7)


= 7/14(0.5916) + 7/14(0.9852)

= 0.2958 + 0.4926 = 0.7884

Gain (student) = 0.94 – 0.7884 = 0.1516

Finally, consider Credit_Rating Attribute


For Credit_Rating we have two values credit_ratingfair (6 yes and 2 no) and credit_ratingexcellent (3
yes 3 no)

Entropy(credit_rating) = 8/14(-6/8log2(6/8)-2/8log2(2/8)) + 6/14(-3/6log2(3/6)-3/6log2(3/6))


= 8/14(0.8112) + 6/14(1)

= 0.4635 + 0.4285 = 0.8920

Gain(credit_rating) = 0.94 – 0.8920 = 0.048

Since Age has the highest Information Gain we start splitting the dataset using the age attribute.

Decision Tree after step 1


Since all records under the branch age31..40 are all of the class, Yes, we can replace the leaf with
Class=Yes
Now build the decision tree for the left subtree

The same process of splitting has to happen for the two remaining branches.

Left sub-branch
For branch age<=30 we still have attributes income, student, and credit_rating. Which one should be
used to split the partition?

The mutual information is E(Sage<=30)= E(2,3)= -2/5 log2(2/5) – 3/5 log2(3/5)=0.97


For Income, we have three values incomehigh (0 yes and 2 no), incomemedium (1 yes and 1 no) and
incomelow (1 yes and 0 no)
Entropy(income) = 2/5(0) + 2/5 (-1/2log2(1/2)-1/2log2(1/2)) + 1/5 (0) = 2/5 (1) = 0.4
Gain(income) = 0.97 – 0.4 = 0.57

For Student, we have two values studentyes (2 yes and 0 no) and studentno (0 yes 3 no)
Entropy(student) = 2/5(0) + 3/5(0) = 0

Gain (student) = 0.97 – 0 = 0.97

We can then safely split on attribute student without checking the other attributes since the information
gain is maximized.
Decision Tree after step 2
Since these two new branches are from distinct classes, we make them into leaf nodes with their
respective class as label:

Decision Tree after step 2_2

Now build the decision tree for the right subtree

Right sub-branch
The mutual information is Entropy(Sage>40)= I(3,2)= -3/5 log2(3/5) – 2/5 log2(2/5)=0.97
For Income, we have two values incomemedium (2 yes and 1 no) and incomelow (1 yes and 1 no)
Entropy(income) = 3/5(-2/3log2(2/3)-1/3log2(1/3)) + 2/5 (-1/2log2(1/2)-1/2log2(1/2))
= 3/5(0.9182) + 2/5(1) = 0.55 + 0.4 = 0.95

Gain(income) = 0.97 – 0.95 = 0.02

For Student, we have two values studentyes (2 yes and 1 no) and student no (1 yes and 1 no)
Entropy(student) = 3/5(-2/3log2(2/3)-1/3log2(1/3)) + 2/5(-1/2log2(1/2)-1/2log2(1/2)) = 0.95
Gain (student) = 0.97 – 0.95 = 0.02

For Credit_Rating, we have two values credit_ratingfair (3 yes and 0 no) and credit_ratingexcellent (0
yes and 2 no)
Entropy(credit_rating) = 0

Gain(credit_rating) = 0.97 – 0 = 0.97

We then split based on credit_rating. These splits give partitions each with records from the same class.
We just need to make these into leaf nodes with their class label attached:

Decision Tree for Buys Computer

New example: age<=30, income=medium, student=yes, credit-rating=fair

Follow branch(age<=30) then student=yes we predict Class=yes

Buys_computer = yes

ENTROPY:
Entropy measures the impurity of a collection of examples S:

    Entropy(S) = -p+ log2(p+) - p- log2(p-)

where p+ is the proportion of positive examples in S and p- is the proportion of negative examples in S.

INFORMATION GAIN:

 Information gain is the expected reduction in entropy caused by partitioning the examples according to an attribute.
 The information gain, Gain(S, A), of an attribute A relative to a collection of examples S, is defined as

    Gain(S, A) = Entropy(S) - Σ (v ∈ Values(A)) (|Sv| / |S|) Entropy(Sv)
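As a quick check of these definitions, the short sketch below (plain Python, not part of the prescribed program) recomputes Entropy(S) and Gain(S, age) for the Buys Computer data used in the worked example above:

import math
from collections import Counter

def entropy(labels):
    # Entropy of a list of class labels
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(rows, attr_index, labels):
    # Gain(S, A) = Entropy(S) - sum over values v of |Sv|/|S| * Entropy(Sv)
    total = len(labels)
    gain = entropy(labels)
    for v in set(r[attr_index] for r in rows):
        subset = [labels[i] for i, r in enumerate(rows) if r[attr_index] == v]
        gain -= (len(subset) / total) * entropy(subset)
    return gain

# Buys Computer data: (age, income, student, credit_rating) and class label
rows = [("<=30","high","no","fair"), ("<=30","high","no","excellent"),
        ("31..40","high","no","fair"), (">40","medium","no","fair"),
        (">40","low","yes","fair"), (">40","low","yes","excellent"),
        ("31..40","low","yes","excellent"), ("<=30","medium","no","fair"),
        ("<=30","low","yes","fair"), (">40","medium","yes","fair"),
        ("<=30","medium","yes","excellent"), ("31..40","medium","no","excellent"),
        ("31..40","high","yes","fair"), (">40","medium","no","excellent")]
labels = ["no","no","yes","yes","yes","no","yes","no","yes","yes","yes","yes","yes","no"]

print(round(entropy(labels), 4))             # 0.9403, i.e. ~0.94 as above
print(round(info_gain(rows, 0, labels), 4))  # Gain(age) ≈ 0.247, matching 0.2465 up to rounding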

Training Dataset:

Day Outlook Temperature Humidity Wind PlayTennis


D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Normal Weak Yes
D6 Rain Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rain Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rain Mild High Strong No

Test Dataset:

Day Outlook Temperature Humidity Wind


T1 Rain Cool Normal Strong
T2 Sunny Mild Normal Strong
Program:

import math
import csv

def load_csv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    headers = dataset.pop(0)
    return dataset, headers

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

def subtables(data, col, delete):
    dic = {}
    coldata = [row[col] for row in data]
    attr = list(set(coldata))

    counts = [0] * len(attr)
    r = len(data)
    c = len(data[0])
    for x in range(len(attr)):
        for y in range(r):
            if data[y][col] == attr[x]:
                counts[x] += 1

    for x in range(len(attr)):
        dic[attr[x]] = [[0 for i in range(c)] for j in range(counts[x])]
        pos = 0
        for y in range(r):
            if data[y][col] == attr[x]:
                if delete:
                    del data[y][col]
                dic[attr[x]][pos] = data[y]
                pos += 1
    return attr, dic

def entropy(S):
    attr = list(set(S))
    if len(attr) == 1:
        return 0

    counts = [0, 0]
    for i in range(2):
        counts[i] = sum([1 for x in S if attr[i] == x]) / (len(S) * 1.0)

    sums = 0
    for cnt in counts:
        sums += -1 * cnt * math.log(cnt, 2)
    return sums

def compute_gain(data, col):
    attr, dic = subtables(data, col, delete=False)

    total_size = len(data)
    entropies = [0] * len(attr)
    ratio = [0] * len(attr)

    total_entropy = entropy([row[-1] for row in data])
    for x in range(len(attr)):
        ratio[x] = len(dic[attr[x]]) / (total_size * 1.0)
        entropies[x] = entropy([row[-1] for row in dic[attr[x]]])
        total_entropy -= ratio[x] * entropies[x]
    return total_entropy

def build_tree(data, features):
    lastcol = [row[-1] for row in data]
    if (len(set(lastcol))) == 1:          # all examples have the same class: leaf node
        node = Node("")
        node.answer = lastcol[0]
        return node

    n = len(data[0]) - 1
    gains = [0] * n
    for col in range(n):
        gains[col] = compute_gain(data, col)
    split = gains.index(max(gains))        # attribute with the highest information gain
    node = Node(features[split])
    fea = features[:split] + features[split + 1:]
    attr, dic = subtables(data, split, delete=True)

    for x in range(len(attr)):
        child = build_tree(dic[attr[x]], fea)
        node.children.append((attr[x], child))
    return node

def print_tree(node, level):
    if node.answer != "":
        print(" " * level, node.answer)
        return

    print(" " * level, node.attribute)
    for value, n in node.children:
        print(" " * (level + 1), value)
        print_tree(n, level + 2)

def classify(node, x_test, features):
    if node.answer != "":
        print(node.answer)
        return
    pos = features.index(node.attribute)
    for value, n in node.children:
        if x_test[pos] == value:
            classify(n, x_test, features)

'''Main program'''
dataset, features = load_csv("data3.csv")   # training data file (name assumed; pairs with data3_test.csv below)
node1 = build_tree(dataset, features)

print("The decision tree for the dataset using ID3 algorithm is")
print_tree(node1, 0)

testdata, features = load_csv("data3_test.csv")
for xtest in testdata:
    print("The test instance:", xtest)
    print("The label for test instance:", end=" ")
    classify(node1, xtest, features)

Output:

The decision tree for the dataset using ID3 algorithm is

Outlook
rain
Wind
strong
no
weak
yes
overcast
yes
sunny
Humidity
normal
yes
high
no

The test instance: ['rain', 'cool', 'normal', 'strong']


The label for test instance: no

The test instance: ['sunny', 'mild', 'normal', 'strong']


The label for test instance: yes

Exp No. 04: Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.

BACKPROPAGATION (training_examples, η, nin, nhidden, nout)

Each training example is a pair of the form (x→, t→), where x→ is the vector of network input values and t→ is the vector of target network output values.
η is the learning rate (e.g., 0.05); nin is the number of network inputs, nhidden the number of units in the hidden layer, and nout the number of output units.
The input from unit i into unit j is denoted xji, and the weight from unit i to unit j is denoted wji.

 Create a feed-forward network with nin inputs, nhidden hidden units, and nout output units.
 Initialize all network weights to small random numbers.
 Until the termination condition is met, Do
 For each (x→, t→) in training_examples, Do

    Propagate the input forward through the network:
    1. Input the instance x→ to the network and compute the output ou of every unit u in the network.

    Propagate the errors backward through the network:
    2. For each network output unit k, calculate its error term δk:
           δk ← ok (1 − ok) (tk − ok)
    3. For each hidden unit h, calculate its error term δh:
           δh ← oh (1 − oh) Σ (k ∈ outputs) wkh δk
    4. Update each network weight wji:
           wji ← wji + Δwji,   where Δwji = η δj xji

Training Examples:

Example   Sleep   Study   Expected % in Exams
1         2       9       92
2         1       5       86
3         3       6       89

Normalize the input:

Example   Sleep              Study              Expected % in Exams
1         2/3 = 0.66666667   9/9 = 1            0.92
2         1/3 = 0.33333333   5/9 = 0.55555556   0.86
3         3/3 = 1            6/9 = 0.66666667   0.89

Program:

import numpy as np

X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X / np.amax(X, axis=0)   # maximum of X array column-wise (normalize inputs)
y = y / 100                  # normalize outputs to the range 0..1

# Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of Sigmoid Function
def derivatives_sigmoid(x):
    return x * (1 - x)

# Variable initialization
epoch = 5000               # Setting training iterations
lr = 0.1                   # Setting learning rate
inputlayer_neurons = 2     # number of features in data set
hiddenlayer_neurons = 3    # number of hidden layer neurons
output_neurons = 1         # number of neurons at output layer

# weight and bias initialization
# draws a random range of numbers uniformly of dim x*y
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # Forward Propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)

    # Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)

    # how much hidden layer weights contributed to error
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH * hiddengrad

    # dot product of next-layer error and current-layer output
    wout += hlayer_act.T.dot(d_output) * lr
    wh += X.T.dot(d_hiddenlayer) * lr

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)

Output:

Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]

Actual Output:
[[0.92]
[0.86]
[0.89]]

Predicted Output:
[[0.89726759]
[0.87196896]
[0.9000671]]

Python Program to Implement and Demonstrate Backpropagation Algorithm Machine Learning

import numpy as np

X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X / np.amax(X, axis=0)   # maximum of X array column-wise
y = y / 100

# Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of Sigmoid Function
def derivatives_sigmoid(x):
    return x * (1 - x)

# Variable initialization
epoch = 5                  # Setting training iterations
lr = 0.1                   # Setting learning rate
inputlayer_neurons = 2     # number of features in data set
hiddenlayer_neurons = 3    # number of hidden layer neurons
output_neurons = 1         # number of neurons at output layer

# weight and bias initialization
# draws a random range of numbers uniformly of dim x*y
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # Forward Propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)

    # Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)

    hiddengrad = derivatives_sigmoid(hlayer_act)   # how much hidden layer weights contributed to error
    d_hiddenlayer = EH * hiddengrad

    wout += hlayer_act.T.dot(d_output) * lr        # dot product of next-layer error and current-layer output
    wh += X.T.dot(d_hiddenlayer) * lr

    print("-----------Epoch-", i + 1, "Starts----------")
    print("Input: \n" + str(X))
    print("Actual Output: \n" + str(y))
    print("Predicted Output: \n", output)
    print("-----------Epoch-", i + 1, "Ends----------\n")

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)


Training Examples:

Example Sleep Study Expected % in Exams

1 2 9 92

2 1 5 86

3 3 6 89

Normalize the input

Example Sleep Study Expected % in Exams

1 2/3 = 0.66666667 9/9 = 1 0.92

2 1/3 = 0.33333333 5/9 = 0.55555556 0.86

3 3/3 = 1 6/9 = 0.66666667 0.89

Output
———–Epoch- 1 Starts———-
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]

[0.86]
[0.89]]
Predicted Output:
[[0.81951208]
[0.8007242 ]
[0.82485744]]
———–Epoch- 1 Ends———-

Epoch- 2 Starts———-
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.82033938]
[0.80153634]
[0.82568134]]
———–Epoch- 2 Ends———-
———–Epoch- 3 Starts———-
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.82115226]
[0.80233463]
[0.82649072]]
———–Epoch- 3 Ends———-

———–Epoch- 4 Starts———-
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.82195108]
[0.80311943]
[0.82728598]]
———–Epoch- 4 Ends———-

———–Epoch- 5 Starts———-
Input:

[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.8227362 ]
[0.80389106]
[0.82806747]]
———–Epoch- 5 Ends———-

Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.8227362 ]
[0.80389106]
[0.82806747]]

Exp No. 05: Write a program to implement the Naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.

Bayes' Theorem is stated as:

    P(h|D) = P(D|h) P(h) / P(D)

Where,

P(h|D) is the probability of hypothesis h given the data D. This is called the posterior probability.
P(D|h) is the probability of data D given that the hypothesis h was true.
P(h) is the probability of hypothesis h being true. This is called the prior probability of h.
P(D) is the probability of the data. This is called the prior probability of D.

After calculating the posterior probability for a number of different hypotheses h, we are interested in finding the most probable hypothesis h ∈ H given the observed data D. Any such maximally probable hypothesis is called a maximum a posteriori (MAP) hypothesis. Using Bayes' theorem to calculate the posterior probability of each candidate hypothesis, the MAP hypothesis is

    hMAP = argmax (h ∈ H) P(D|h) P(h)

(ignoring P(D) since it is a constant).
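As a small illustration of the MAP rule (with made-up priors and likelihoods, not drawn from the data set described below), the hypothesis with the largest P(D|h)P(h) is selected:

# Hypothetical priors P(h) and likelihoods P(D|h) for two candidate hypotheses
priors      = {"h1": 0.6, "h2": 0.4}
likelihoods = {"h1": 0.2, "h2": 0.5}

# Unnormalized posteriors P(D|h) * P(h); P(D) is a common constant and can be ignored
scores = {h: likelihoods[h] * priors[h] for h in priors}
h_map = max(scores, key=scores.get)

print(scores)   # {'h1': 0.12, 'h2': 0.2}
print(h_map)    # 'h2' is the MAP hypothesis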

Gaussian Naive Bayes

A Gaussian Naive Bayes algorithm is a special type of Naïve Bayes algorithm. It’s specifically used when
the features have continuous values. It’s also assumed that all the features are following a Gaussian
distribution i.e., normal distribution

Representation for Gaussian Naive Bayes

We calculate the probabilities for input values for each class using a frequency. With real- valued inputs,
we can calculate the mean and standard deviation of input values (x) for each class to summarize the
distribution.

This means that in addition to the probabilities for each class, we must also store the mean and standard
deviations for each input variable for each class.
Gaussian Naive Bayes Model from Data

The probability density function for the normal distribution is defined by two parameters (mean and standard deviation); we calculate the mean and standard deviation of each input variable (x) for each class value.
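A minimal sketch of this summarization step, using a made-up single-attribute data set rather than the Pima data introduced below:

import math

def gaussian_pdf(x, mean, stdev):
    # Normal probability density used to estimate P(x | class)
    exponent = math.exp(-((x - mean) ** 2) / (2 * stdev ** 2))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

# Toy data: one numeric attribute, two classes
data = {0: [2.0, 3.0, 4.0], 1: [7.0, 8.0, 9.0]}

# Summarize each class by (mean, standard deviation)
summaries = {}
for cls, values in data.items():
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    summaries[cls] = (m, math.sqrt(var))

x = 3.5  # new attribute value
for cls, (m, s) in summaries.items():
    print(cls, gaussian_pdf(x, m, s))  # higher density => more likely class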

Examples:
The data set used in this program is the Pima Indians Diabetes problem.
This data set is comprised of 768 observations of medical details for Pima Indian patients. The records describe instantaneous measurements taken from the patient such as their age, the number of times pregnant, and blood workup. All patients are women aged 21 or older. All attributes are numeric, and their units vary from attribute to attribute.

The attributes are Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI,


DiabeticPedigreeFunction, Age, Outcome

Each record has a class value that indicates whether the patient suffered an onset of diabetes within 5 years
of when the measurements were taken (1) or not (0)

Sample Examples:

Example  Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin  BMI   DiabeticPedigreeFunction  Age  Outcome
1        6            148      72             35             0        33.6  0.627                     50   1
2        1            85       66             29             0        26.6  0.351                     31   0
3        8            183      64             0              0        23.3  0.672                     32   1
4        1            89       66             23             94       28.1  0.167                     21   0
5        0            137      40             35             168      43.1  2.288                     33   1
6        5            116      74             0              0        25.6  0.201                     30   0
7        3            78       50             32             88       31.0  0.248                     26   1
8        10           115      0              0              0        35.3  0.134                     29   0
9        2            197      70             45             543      30.5  0.158                     53   1
10       8            125      96             0              0        0.0   0.232                     54   1

Python Program to Implement and Demonstrate Naïve Bayesian Classifier Machine Learning

import csv
import random
import math

def loadcsv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    for i in range(len(dataset)):
        # converting strings into numbers for processing
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitdataset(dataset, splitratio):
    # 67% training size
    trainsize = int(len(dataset) * splitratio)
    trainset = []
    copy = list(dataset)
    while len(trainset) < trainsize:
        # generate indices for the dataset list randomly to pick elements for training data
        index = random.randrange(len(copy))
        trainset.append(copy.pop(index))
    return [trainset, copy]

def separatebyclass(dataset):
    separated = {}  # dictionary of classes 1 and 0
    # creates a dictionary of classes 1 and 0 where the values are
    # the instances belonging to each class
    for i in range(len(dataset)):
        vector = dataset[i]
        if (vector[-1] not in separated):
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers) / float(len(numbers))

def stdev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x - avg, 2) for x in numbers]) / float(len(numbers) - 1)
    return math.sqrt(variance)

def summarize(dataset):
    # creates a list of (mean, stdev) tuples, one per attribute
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]  # excluding the class labels (+ve or -ve)
    return summaries

def summarizebyclass(dataset):
    separated = separatebyclass(dataset)
    # summaries is a dict of (mean, std) tuples for each class value
    summaries = {}
    for classvalue, instances in separated.items():
        summaries[classvalue] = summarize(instances)  # summarize computes mean and std
    return summaries

def calculateprobability(x, mean, stdev):
    exponent = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

def calculateclassprobabilities(summaries, inputvector):
    probabilities = {}  # probabilities of each class for the test instance
    for classvalue, classsummaries in summaries.items():  # class and attribute information as mean and sd
        probabilities[classvalue] = 1
        for i in range(len(classsummaries)):
            mean, stdev = classsummaries[i]  # mean and sd of every attribute for class 0 and 1 separately
            x = inputvector[i]               # test vector's i-th attribute
            probabilities[classvalue] *= calculateprobability(x, mean, stdev)  # use normal distribution
    return probabilities

def predict(summaries, inputvector):
    probabilities = calculateclassprobabilities(summaries, inputvector)
    bestLabel, bestProb = None, -1
    for classvalue, probability in probabilities.items():  # assign the class with the highest probability
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classvalue
    return bestLabel

def getpredictions(summaries, testset):
    predictions = []
    for i in range(len(testset)):
        result = predict(summaries, testset[i])
        predictions.append(result)
    return predictions

def getaccuracy(testset, predictions):
    correct = 0
    for i in range(len(testset)):
        if testset[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(testset))) * 100.0

def main():
    filename = 'naivedata.csv'
    splitratio = 0.67
    dataset = loadcsv(filename)

    trainingset, testset = splitdataset(dataset, splitratio)
    print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset), len(trainingset), len(testset)))

    # prepare model
    summaries = summarizebyclass(trainingset)

    # test model
    predictions = getpredictions(summaries, testset)  # predictions of test data with the training data
    accuracy = getaccuracy(testset, predictions)
    print('Accuracy of the classifier is : {0}%'.format(accuracy))

main()

Output
Split 768 rows into train=514 and test=254 rows

Accuracy of the classifier is : 71.65354330708661%
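
For comparison, the same task can be solved with scikit-learn's GaussianNB. This is a minimal sketch, assuming the same 'naivedata.csv' file; the accuracy will vary slightly from run to run because of the random train/test split.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

data = pd.read_csv('naivedata.csv', header=None)   # assumed headerless CSV
X = data.iloc[:, :-1]                              # the eight medical attributes
y = data.iloc[:, -1]                               # Outcome (0 or 1)

Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.33)

model = GaussianNB().fit(Xtrain, ytrain)           # learns per-class mean/std internally
print('Accuracy:', accuracy_score(ytest, model.predict(Xtest)))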

Exp No. 06: Assuming a set of documents that need to be classified, use the naïve
Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to
write the program. Calculate the accuracy, precision, and recall for your data set.

Naive Bayes algorithms for learning and classifying text

LEARN_NAIVE_BAYES_TEXT (Examples, V)
Examples is a set of text documents along with their target values. V is the set of all possible target
values. This function learns the probability terms P(w k |vj,), describing the probability that a
randomly drawn word from a document in class v j will be the English word wk. It also learns the
class prior probabilities P(vj).

1. collect all words, punctuation, and other tokens that occur in Examples
 Vocabulary ← the set of all distinct words and other tokens occurring in any text document from Examples

2. calculate the required P(vj) and P(wk|vj) probability terms

 For each target value vj in V do



docsj ← the subset of documents from Examples for which the target value is vj

P(vj) ← | docsj | / |Examples|

Textj ← a single document created by concatenating all members of docsj

n ← total number of word positions in Textj

for each word wk in Vocabulary

nk ← number of times word wk occurs in Textj

P(wk|vj) ← ( nk + 1) / (n + | Vocabulary| )

CLASSIFY_NAIVE_BAYES_TEXT (Doc)

Return the estimated target value for the document Doc. ai denotes the word found in the ith position within Doc.

 positions ← all word positions in Doc that contain tokens found in Vocabulary
 Return vNB, where

vNB = argmax over vj in V of  P(vj) * ∏ (i in positions) P(ai | vj)
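
A minimal from-scratch sketch of LEARN_NAIVE_BAYES_TEXT and CLASSIFY_NAIVE_BAYES_TEXT is given below; the sklearn-based program for this experiment follows the data set. The two training documents at the end are placeholders for illustration only.

import math
from collections import Counter

def learn_naive_bayes_text(examples):
    # examples: list of (document_text, target_value) pairs
    vocabulary = set(w for doc, _ in examples for w in doc.lower().split())
    priors, likelihoods = {}, {}
    targets = set(v for _, v in examples)
    for vj in targets:
        docs_j = [doc for doc, v in examples if v == vj]
        priors[vj] = len(docs_j) / len(examples)
        text_j = ' '.join(docs_j).lower().split()
        n = len(text_j)                      # total word positions in Text_j
        counts = Counter(text_j)
        # Laplace-smoothed estimate (nk + 1) / (n + |Vocabulary|)
        likelihoods[vj] = {wk: (counts[wk] + 1) / (n + len(vocabulary))
                           for wk in vocabulary}
    return vocabulary, priors, likelihoods

def classify_naive_bayes_text(doc, vocabulary, priors, likelihoods):
    words = [w for w in doc.lower().split() if w in vocabulary]
    scores = {}
    for vj in priors:
        # log probabilities avoid numerical underflow on long documents
        scores[vj] = math.log(priors[vj]) + sum(math.log(likelihoods[vj][w]) for w in words)
    return max(scores, key=scores.get)

# Placeholder example
train = [('I love this sandwich', 'pos'), ('My boss is horrible', 'neg')]
V, P, L = learn_naive_bayes_text(train)
print(classify_naive_bayes_text('I love my boss', V, P, L))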
Data set:

Text Documents Label


1 I love this sandwich pos
2 This is an amazing place pos
3 I feel very good about these beers pos
4 This is my best work pos
5 What an awesome view pos
6 I do not like this restaurant neg
7 I am tired of this stuff neg
8 I can't deal with this neg
9 He is my sworn enemy neg
10 My boss is horrible neg
11 This is an awesome place pos
12 I do not like the taste of this juice neg
13 I love to dance pos
14 I am sick and tired of this place neg
15 What a great holiday pos
16 That is a bad locality to stay neg
17 We will have good fun tomorrow pos
18 I went to my enemy's house today neg

Program:

import pandas as pd

msg = pd.read_csv('naivetext.csv', names=['message', 'label'])

print('The dimensions of the dataset', msg.shape)

msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})
X = msg.message
y = msg.labelnum

print(X)
print(y)

# splitting the dataset into train and test data
from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(X, y)

print('\n The total number of Training Data :', ytrain.shape)
print('\n The total number of Test Data :', ytest.shape)

# output of the count vectoriser is a sparse matrix
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)
xtest_dtm = count_vect.transform(xtest)
print('\n The words or Tokens in the text documents \n')
print(count_vect.get_feature_names())

df = pd.DataFrame(xtrain_dtm.toarray(), columns=count_vect.get_feature_names())

# training the Naive Bayes (NB) classifier on the training data
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(xtrain_dtm, ytrain)
predicted = clf.predict(xtest_dtm)

# printing accuracy, confusion matrix, precision and recall
from sklearn import metrics
print('\n Accuracy of the classifier is',
      metrics.accuracy_score(ytest, predicted))
print('\n Confusion matrix')
print(metrics.confusion_matrix(ytest, predicted))

print('\n The value of Precision',
      metrics.precision_score(ytest, predicted))

print('\n The value of Recall',
      metrics.recall_score(ytest, predicted))
Output:

The dimensions of the dataset (18, 2)


0 I love this sandwich
1 This is an amazing place
2 I feel very good about these beers
3 This is my best work
4 What an awesome view
5 I do not like this restaurant
6 I am tired of this stuff
7 I can't deal with this
8 He is my sworn enemy
9 My boss is horrible
10 This is an awesome place
11 I do not like the taste of this juice
12 I love to dance
13 I am sick and tired of this place
14 What a great holiday
15 That is a bad locality to stay
16 We will have good fun tomorrow
17 I went to my enemy's house today

Name: message, dtype: object


0 1
1 1
2 1
3 1
4 1
5 0
6 0
7 0
8 0
9 0
10 1
11 0
12 1
13 0
14 1
15 0
16 1
17 0
Name: labelnum, dtype: int64

The total number of Training Data : (13,)
The total number of Test Data : (5,)
The words or Tokens in the text documents
['about', 'am', 'amazing', 'an', 'and', 'awesome', 'beers',
'best', 'can', 'deal', 'do', 'enemy', 'feel',
'fun', 'good', 'great', 'have', 'he', 'holiday', 'house', 'is',
'like', 'love', 'my', 'not', 'of', 'place',
'restaurant', 'sandwich', 'sick', 'sworn', 'these', 'this',
'tired', 'to', 'today', 'tomorrow', 'very', 'view', 'we',
'went', 'what', 'will', 'with', 'work']

Accuracy of the classifier is 0.8

Confusion matrix
[[2 1]
 [0 2]]

The value of Precision 0.6666666666666666
The value of Recall 1.0

Basic knowledge

Confusion Matrix

True positives (TP): data points labelled as positive that are actually positive
False positives (FP): data points labelled as positive that are actually negative
True negatives (TN): data points labelled as negative that are actually negative
False negatives (FN): data points labelled as negative that are actually positive

Accuracy: how often is the classifier correct?

Accuracy = (TP + TN) / (TP + FP + TN + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
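
These definitions can be checked against the output above (confusion matrix [[2 1], [0 2]], i.e. TN = 2, FP = 1, FN = 0, TP = 2) with a short sketch:

# Counts taken from the confusion matrix printed above
TP, FP, TN, FN = 2, 1, 2, 0

accuracy  = (TP + TN) / (TP + FP + TN + FN)
precision = TP / (TP + FP)
recall    = TP / (TP + FN)

print(accuracy, precision, recall)   # 0.8  0.666...  1.0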


Example: Movie Review

Doc Text Class

1 I loved the movie +


2 I hated the movie -
3 a great movie. good movie +
4 poor acting -
5 great acting. good movie +

Unique word
< I, loved, the, movie, hated, a, great, good, poor, acting>

Doc   I   loved  the  movie  hated  a   great  good  poor  acting  Class
1     1   1      1    1      0      0   0      0     0     0       +
2     1   0      1    1      1      0   0      0     0     0       -
3     0   0      0    2      0      1   1      1     0     0       +
4     0   0      0    0      0      0   0      0     1     1       -
5     0   0      0    1      0      0   1      1     0     0       +

Documents with the positive (+) class:

Doc   I   loved  the  movie  hated  a   great  good  poor  acting  Class
1     1   1      1    1      0      0   0      0     0     0       +
3     0   0      0    2      0      1   1      1     0     0       +
5     0   0      0    1      0      0   1      1     0     0       +

P(+) = 3 / 5 = 0.6

P(I | +)      = (1 + 1) / (14 + 10) = 0.0833
P(loved | +)  = (1 + 1) / (14 + 10) = 0.0833
P(the | +)    = (1 + 1) / (14 + 10) = 0.0833
P(movie | +)  = (4 + 1) / (14 + 10) = 0.2083
P(hated | +)  = (0 + 1) / (14 + 10) = 0.0416
P(a | +)      = (1 + 1) / (14 + 10) = 0.0833
P(great | +)  = (2 + 1) / (14 + 10) = 0.125
P(good | +)   = (2 + 1) / (14 + 10) = 0.125
P(poor | +)   = (0 + 1) / (14 + 10) = 0.0416
P(acting | +) = (1 + 1) / (14 + 10) = 0.0833

Documents with the negative (−) class:

Doc   I   loved  the  movie  hated  a   great  good  poor  acting  Class
2     1   0      1    1      1      0   0      0     0     0       -
4     0   0      0    0      0      0   0      0     1     1       -

P(−) = 2 / 5 = 0.4

P(I | −)      = (1 + 1) / (6 + 10) = 0.125
P(loved | −)  = (0 + 1) / (6 + 10) = 0.0625
P(the | −)    = (1 + 1) / (6 + 10) = 0.125
P(movie | −)  = (1 + 1) / (6 + 10) = 0.125
P(hated | −)  = (1 + 1) / (6 + 10) = 0.125
P(a | −)      = (0 + 1) / (6 + 10) = 0.0625
P(great | −)  = (0 + 1) / (6 + 10) = 0.0625
P(good | −)   = (0 + 1) / (6 + 10) = 0.0625
P(poor | −)   = (1 + 1) / (6 + 10) = 0.125
P(acting | −) = (1 + 1) / (6 + 10) = 0.125

Let’s classify the new document

I hated the poor acting

If vj = + then,
P(+) P(I | +) P(hated | +) P(the | +) P(poor | +) P(acting | +)
= 0.6 * 0.0833 * 0.0416 * 0.0833 * 0.0416 * 0.0833
= 6.03 X 10^-7

If vj = − then,
P(−) P(I | −) P(hated | −) P(the | −) P(poor | −) P(acting | −)
= 0.4 * 0.125 * 0.125 * 0.125 * 0.125 * 0.125
= 1.22 X 10^-5

Since 1.22 X 10^-5 > 6.03 X 10^-7, the new document belongs to the ( − ) class.
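
The two scores above can be reproduced with a few lines of Python. This is only a sanity check of the hand calculation; the word probabilities are the Laplace-smoothed values derived from the tables above.

# Laplace-smoothed likelihoods from the tables above
p_pos = {'I': 2/24, 'hated': 1/24, 'the': 2/24, 'poor': 1/24, 'acting': 2/24}
p_neg = {'I': 2/16, 'hated': 2/16, 'the': 2/16, 'poor': 2/16, 'acting': 2/16}

doc = ['I', 'hated', 'the', 'poor', 'acting']

score_pos, score_neg = 0.6, 0.4          # class priors P(+) and P(-)
for w in doc:
    score_pos *= p_pos[w]
    score_neg *= p_neg[w]

print(score_pos)                          # about 6.03e-07
print(score_neg)                          # about 1.22e-05
print('+' if score_pos > score_neg else '-')   # prints '-'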
7. Write a program to construct a Bayesian network considering medical data. Use this model to
demonstrate the diagnosis of heart patients using a standard Heart Disease Data Set. You can
use Java/Python ML library classes/API.
Theory

A Bayesian network is a directed acyclic graph in which each node corresponds to a unique random
variable and each edge corresponds to a conditional dependency.

A Bayesian network consists of two major parts: a directed acyclic graph and a set of
conditional probability distributions.

 The directed acyclic graph is a set of random variables represented by nodes.
 The conditional probability distribution of a node (random variable) is
defined for every possible outcome of the preceding causal node(s).
For illustration, consider the following example. Suppose we attempt to turn on our
computer, but the computer does not start (observation/evidence). We would like to
know which of the possible causes of computer failure is more likely. In this
simplified illustration, we assume only two possible causes of this misfortune:
electricity failure and computer malfunction.

The corresponding directed acyclic graph is depicted in below figure.

The goal is to calculate the posterior conditional probability distribution of each of the
possible unobserved causes given the observed evidence, i.e. P [Cause | Evidence].
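
A minimal sketch of this two-cause example using pgmpy is shown below. All probability values are assumed for illustration only; older pgmpy releases expose the model class as BayesianModel (used in the program later in this experiment), newer ones as BayesianNetwork.

from pgmpy.models import BayesianModel            # BayesianNetwork in newer pgmpy releases
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Structure: both causes point to the observed effect
model = BayesianModel([('Electricity', 'Computer'), ('Malfunction', 'Computer')])

# All numbers below are assumed for illustration
cpd_e = TabularCPD('Electricity', 2, [[0.9], [0.1]])       # 0 = ok, 1 = failure
cpd_m = TabularCPD('Malfunction', 2, [[0.95], [0.05]])     # 0 = ok, 1 = faulty
cpd_c = TabularCPD('Computer', 2,
                   # P(Computer | Electricity, Malfunction); columns: (E,M) = (0,0),(0,1),(1,0),(1,1)
                   [[0.99, 0.10, 0.02, 0.01],              # Computer = 0 (starts)
                    [0.01, 0.90, 0.98, 0.99]],             # Computer = 1 (does not start)
                   evidence=['Electricity', 'Malfunction'], evidence_card=[2, 2])

model.add_cpds(cpd_e, cpd_m, cpd_c)

# P(Cause | Evidence): posterior of each cause given that the computer does not start
infer = VariableElimination(model)
print(infer.query(variables=['Electricity'], evidence={'Computer': 1}))
print(infer.query(variables=['Malfunction'], evidence={'Computer': 1}))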

Data Set:
Title: Heart Disease Databases
The Cleveland database contains 76 attributes, but all published experiments refer to
using a subset of 14 of them. In particular, the Cleveland database is the only one that
has been used by ML researchers to this date. The “Heartdisease” field refers to the
presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4.

Database: 0 1 2 3 4 Total
Cleveland: 164 55 36 35 13 303

Attribute Information:

1. age: age in years


2. sex: sex (1 = male; 0 = female)
3. cp: chest pain type
1. Value 1: typical angina
2. Value 2: atypical angina
3. Value 3: non-anginal pain
4. Value 4: asymptomatic
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital)
5. chol: serum cholestoral in mg/dl
6. fbs: (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
7. restecg: resting electrocardiographic results
1. Value 0: normal
2. Value 1: having ST-T wave abnormality (T wave inversions
and/or ST elevation or depression of > 0.05 mV)
3. Value 2: showing probable or definite left ventricular
hypertrophy by Estes’ criteria
8. thalach: maximum heart rate achieved
9. exang: exercise induced angina (1 = yes; 0 = no)
10.oldpeak = ST depression induced by exercise relative to rest
11.slope: the slope of the peak exercise ST segment
1. Value 1: upsloping
2. Value 2: flat
3. Value 3: downsloping
12. ca: number of major vessels (0-3) colored by fluoroscopy
13. thal: 3 = normal; 6 = fixed defect; 7 = reversable defect
14. Heartdisease: It is integer valued from 0 (no presence) to 4.

Some instance from the dataset:

age  sex  cp  trestbps  chol  fbs  restecg  thalach  exang  oldpeak  slope  ca  thal  Heartdisease
63   1    1   145       233   1    2        150      0      2.3      3      0   6     0
67   1    4   160       286   0    2        108      1      1.5      2      3   3     2
67   1    4   120       229   0    2        129      1      2.6      2      2   7     1
41   0    2   130       204   0    2        172      0      1.4      1      0   3     0
62   0    4   140       268   0    2        160      0      3.6      3      2   3     3
60   1    4   130       206   0    2        132      1      2.4      2      2   7     4

Python Program to Implement and Demonstrate Bayesian network using pgmpy Machine
Learning

import numpy as np

import pandas as pd

import csv

from pgmpy.estimators import MaximumLikelihoodEstimator

from pgmpy.models import BayesianModel

from pgmpy.inference import VariableElimination

heartDisease = pd.read_csv('heart.csv')

heartDisease = heartDisease.replace('?',np.nan)

print('Sample instances from the dataset are given below')

print(heartDisease.head())

print('\n Attributes and datatypes')

print(heartDisease.dtypes)
model = BayesianModel([('age', 'heartdisease'), ('sex', 'heartdisease'),
    ('exang', 'heartdisease'), ('cp', 'heartdisease'),
    ('heartdisease', 'restecg'), ('heartdisease', 'chol')])

print('\nLearning CPD using Maximum likelihood estimators')

model.fit(heartDisease,estimator=MaximumLikelihoodEstimator)

print('\n Inferencing with Bayesian Network:')

HeartDiseasetest_infer = VariableElimination(model)

print('\n 1. Probability of HeartDisease given evidence= restecg')

q1 = HeartDiseasetest_infer.query(variables=['heartdisease'], evidence={'restecg': 1})

print(q1)

print('\n 2. Probability of HeartDisease given evidence= cp ')

q2 = HeartDiseasetest_infer.query(variables=['heartdisease'], evidence={'cp': 2})

print(q2)
Output

8. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set
for clustering using k-Means algorithm. Compare the results of these two algorithms and
comment on the quality of clustering. You can add Java/Python ML library classes/API
in the program.

Python Program to Implement and Demonstrate K-Means and EM Algorithm Machine Learning

from sklearn.cluster import KMeans


from sklearn.mixture import GaussianMixture

import sklearn.metrics as metrics

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

names = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width',
'Class']

dataset = pd.read_csv("8-dataset.csv", names=names)

X = dataset.iloc[:, :-1]

label = {'Iris-setosa': 0,'Iris-versicolor': 1, 'Iris-virginica': 2}

y = [label[c] for c in dataset.iloc[:, -1]]

plt.figure(figsize=(14,7))

colormap=np.array(['red','lime','black'])

# REAL PLOT

plt.subplot(1,3,1)
plt.title('Real')

plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[y])

# K-PLOT

model=KMeans(n_clusters=3, random_state=0).fit(X)

plt.subplot(1,3,2)

plt.title('KMeans')

plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[model.labels_])

print('The accuracy score of K-Mean: ',metrics.accuracy_score(y, model.labels_))

print('The Confusion matrixof K-Mean:\n',metrics.confusion_matrix(y, model.labels_))

# GMM PLOT

gmm=GaussianMixture(n_components=3, random_state=0).fit(X)

y_cluster_gmm=gmm.predict(X)

plt.subplot(1,3,3)

plt.title('GMM Classification')

plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[y_cluster_gmm])

print('The accuracy score of EM: ',metrics.accuracy_score(y, y_cluster_gmm))

print('The Confusion matrix of EM:\n ',metrics.confusion_matrix(y, y_cluster_gmm))

Output
The accuracy score of K-Mean: 0.24

The Confusion matrixof K-Mean:

[[ 0 50 0]
[48 0 2]
[14 0 36]]

The accuracy score of EM: 0.36666666666666664

The Confusion matrix of EM:

[[50 0 0]
[ 0 5 45]
[ 0 50 0]]
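
Because the cluster numbers that k-Means and GMM assign are arbitrary, the raw accuracy scores above understate how well the clusters match the true classes (the k-Means confusion matrix shows mostly consistent clusters, just with permuted labels). A label-independent measure such as the adjusted Rand index gives a fairer comparison; the snippet below is a sketch that reuses the variables y, model and y_cluster_gmm from the program above.

from sklearn.metrics import adjusted_rand_score

# Compare cluster assignments with the true labels, ignoring label numbering
print('ARI of K-Means :', adjusted_rand_score(y, model.labels_))
print('ARI of EM (GMM):', adjusted_rand_score(y, y_cluster_gmm))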

9. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data
set. Print both correct and wrong predictions. Java/Python ML library classes can be
used for this problem.
K-Nearest Neighbor Algorithm

Training algorithm:

 For each training example (x, f(x)), add the example to the list training examples

Classification algorithm:

 Given a query instance xq to be classified,
 Let x1 . . . xk denote the k instances from training examples that are nearest to xq
 Return
f(xq) ← argmax over v in V of  Σ (i = 1 to k) δ(v, f(xi)),  where δ(a, b) = 1 if a = b and 0 otherwise
 For a continuous-valued target, f(xq) is instead the mean of the f(xi) values of the k nearest training examples.

Data Set:

Iris Plants Dataset: Dataset contains 150 instances (50 in each of three classes)
Number of Attributes: 4 numeric, predictive attributes and the Class.
Sample Data

Python Program to Implement and Demonstrate KNN Algorithm

import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn import metrics

names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

# Read dataset to pandas dataframe
dataset = pd.read_csv("9-dataset.csv", names=names)

X = dataset.iloc[:, :-1]
y = dataset.iloc[:, -1]
print(X.head())

Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.10)

classifier = KNeighborsClassifier(n_neighbors=5).fit(Xtrain, ytrain)

ypred = classifier.predict(Xtest)

i = 0
print("\n-------------------------------------------------------------------------")
print('%-25s %-25s %-25s' % ('Original Label', 'Predicted Label', 'Correct/Wrong'))
print("-------------------------------------------------------------------------")
for label in ytest:
    print('%-25s %-25s' % (label, ypred[i]), end="")
    if (label == ypred[i]):
        print(' %-25s' % ('Correct'))
    else:
        print(' %-25s' % ('Wrong'))
    i = i + 1
print("-------------------------------------------------------------------------")

print("\nConfusion Matrix:\n", metrics.confusion_matrix(ytest, ypred))
print("-------------------------------------------------------------------------")

print("\nClassification Report:\n", metrics.classification_report(ytest, ypred))
print("-------------------------------------------------------------------------")

print('Accuracy of the classifier is %0.2f' % metrics.accuracy_score(ytest, ypred))
print("-------------------------------------------------------------------------")

Output
sepal-length sepal-width petal-length petal-width

0 5.1 3.5 1.4 0.2

1 4.9 3.0 1.4 0.2

2 4.7 3.2 1.3 0.2

3 4.6 3.1 1.5 0.2

4 5.0 3.6 1.4 0.2


Original Label Predicted Label Correct/Wrong

Iris-versicolor Iris-versicolor Correct

Iris-virginica Iris-versicolor Wrong

Iris-virginica Iris-virginica Correct

Iris-versicolor Iris-versicolor Correct

Iris-setosa Iris-setosa Correct

Iris-versicolor Iris-versicolor Correct

Iris-setosa Iris-setosa Correct

Iris-setosa Iris-setosa Correct

Iris-virginica Iris-virginica Correct

Iris-virginica Iris-versicolor Wrong

Iris-virginica Iris-virginica Correct

Iris-setosa Iris-setosa Correct

Iris-virginica Iris-virginica Correct

Iris-virginica Iris-virginica Correct

Iris-versicolor Iris-versicolor Correct

Confusion Matrix:

[[4 0 0]
 [0 4 0]
 [0 2 5]]

Classification Report:

precision recall f1-score support

Iris-setosa 1.00 1.00 1.00 4

Iris-versicolor 0.67 1.00 0.80 4

Iris-virginica 1.00 0.71 0.83 7

avg / total 0.91 0.87 0.87 15

Accuracy of the classifier is 0.87
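
The value k = 5 used above is a common default; a simple way to justify the choice is cross-validation over a range of k values. A minimal sketch, assuming X and y from the program above:

from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Mean 5-fold cross-validation accuracy for each candidate k
for k in range(1, 16, 2):
    knn = KNeighborsClassifier(n_neighbors=k)
    score = cross_val_score(knn, X, y, cv=5).mean()
    print('k = %2d  accuracy = %.3f' % (k, score))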

Exp. No. 10: Implement the non-parametric Locally Weighted Regression algorithm in Python in order
to fit data points. Select the appropriate data set for your experiment and draw graphs.
Locally Weighted Regression Algorithm

Regression:

 Regression is a technique from statistics that is used to predict values of the desired target
quantity when the target quantity is continuous.
 In regression, we seek to identify (or estimate) a continuous variable y associated with a
given input vector x.
 y is called the dependent variable.
 x is called the independent variable.

Loess/Lowess Regression:

Loess regression is a nonparametric technique that uses locally weighted regression to fit a smooth
curve through points in a scatter plot.

Lowess Algorithm:

Locally weighted regression is a very powerful nonparametric model used in statistical learning.

Given a dataset X, y, we attempt to find model parameters β(x0) that minimize the residual sum of
weighted squared errors.

The weights are given by a kernel function (k or w), which can be chosen arbitrarily; the program
below uses a Gaussian kernel.

Algorithm
1. Read the given data sample into X and the curve (linear or non-linear) values into Y
2. Set the value of the smoothening (free) parameter τ
3. Set the point of interest x0, which is a row of X
4. Determine the weight matrix W using:
   w(i, i) = exp( -(xi - x0)^2 / (2τ^2) )
5. Determine the value of the model parameter β using:
   β(x0) = (X^T W X)^-1 (X^T W y)
6. Prediction = x0 * β(x0)

Python Program to Implement and Demonstrate Locally Weighted Regression Algorithm

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

def kernel(point, xmat, k):
    # Gaussian kernel: weight of each training point relative to the query point
    m, n = np.shape(xmat)
    weights = np.mat(np.eye(m))
    for j in range(m):
        diff = point - xmat[j]
        weights[j, j] = np.exp(diff * diff.T / (-2.0 * k**2))
    return weights

def localWeight(point, xmat, ymat, k):
    # beta(x0) = (X^T W X)^-1 (X^T W y)
    wei = kernel(point, xmat, k)
    W = (xmat.T * (wei * xmat)).I * (xmat.T * (wei * ymat.T))
    return W

def localWeightRegression(xmat, ymat, k):
    m, n = np.shape(xmat)
    ypred = np.zeros(m)
    for i in range(m):
        ypred[i] = xmat[i] * localWeight(xmat[i], xmat, ymat, k)
    return ypred

# load data points
data = pd.read_csv('10-dataset.csv')
bill = np.array(data.total_bill)
tip = np.array(data.tip)

# preparing the data: add a column of ones to the bill values
mbill = np.mat(bill)
mtip = np.mat(tip)
m = np.shape(mbill)[1]
one = np.mat(np.ones(m))
X = np.hstack((one.T, mbill.T))

# set the smoothening parameter k (tau) here
ypred = localWeightRegression(X, mtip, 0.5)

SortIndex = X[:, 1].argsort(0)
xsort = X[SortIndex][:, 0]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.scatter(bill, tip, color='green')
ax.plot(xsort[:, 1], ypred[SortIndex], color='red', linewidth=5)
plt.xlabel('Total bill')
plt.ylabel('Tip')
plt.show()
Output: a scatter plot of the total bill versus tip data (green) with the fitted locally weighted regression curve drawn in red.
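
As a cross-check of the from-scratch implementation, the statsmodels library offers a ready-made LOWESS smoother. A minimal sketch, assuming the same '10-dataset.csv' (tips data) is available:

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.nonparametric.smoothers_lowess import lowess

data = pd.read_csv('10-dataset.csv')

# frac plays a role similar to the smoothening parameter tau above
smoothed = lowess(data.tip, data.total_bill, frac=0.5)   # returns sorted (x, yfit) pairs

plt.scatter(data.total_bill, data.tip, color='green')
plt.plot(smoothed[:, 0], smoothed[:, 1], color='red', linewidth=3)
plt.xlabel('Total bill')
plt.ylabel('Tip')
plt.show()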
Viva Questions
1. What is machine learning?
2. Define supervised learning
3. Define unsupervised learning
4. Define semi supervised learning
5. Define reinforcement learning
6. What do you mean by hypotheses
7. What is classification
8. What is clustering
9. Define precision, accuracy and recall
10. Define entropy
11. Define regression
12. How Knn is different from k-means clustering
13. What is concept learning
14. Define specific boundary and general boundary
15. Define target function
16. Define decision tree
17. What is ANN
18. Explain gradient descent approximation
19. State Bayes theorem
20. Define Bayesian belief networks
21. Differentiate hard and soft clustering
22. Define variance
23. What is inductive machine learning
24. Why K nearest neighbour algorithm is lazy learning algorithm
25. Why naïve Bayes is naïve
26. Mention classification algorithms
27. Define pruning
Data Sources:

https://www.vtupulse.com/machine-learning
https://archive.ics.uci.edu/datasets
https://github.com/
