B. M. S. EVENING COLLEGE OF ENGINEERING
Bull Temple Road, Bangalore – 19
Department of Computer Science & Engineering
Machine Learning Laboratory Record
Subject Code: 17CSL76
Name: _______________
USN: ________________
B. M. S. EVENING COLLEGE OF ENGINEERING
(Affiliated to VTU, Belagavi)
LABORATORY CERTIFICATE
This is to certify that Mr / Ms ______________________________ has
satisfactorily completed the course of experiments in Practical
___________________________ prescribed by the Visvesvaraya
Technological University for _____________________ Semester
________________________ Course in the Laboratory of the college in
the year 2020-21.
Head of the Department Staff in-charge of the batch
Date: ____________
Particulars of the Experiments Performed
CONTENTS
Expt No.  Experiment
01. Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis based on a given set of training data samples. Read the training data from a .CSV file.
02. For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.
03. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.
04. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.
05. Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.
06. Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to write the program. Calculate the accuracy, precision, and recall for your data set.
07. Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using standard Heart Disease Data Set. You can use Java/Python ML library classes/API.
08. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering. You can add Java/Python ML library classes/API in the program.
09. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions. Java/Python ML library classes can be used for this problem.
10. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select appropriate data set for your experiment and draw graphs.
MACHINE LEARNING LABORATORY (17CSL76)
BMS EVENING COLLEGE OF ENGINEERING


1. Implement and demonstrate the FIND-S algorithm for finding the most
specific hypothesis based on a given set of training data samples. Read the
training data from a .CSV file.
import csv

def loadCsv(filename):
    # Read every row of the CSV file into a list of lists
    with open(filename, "rt") as f:
        return list(csv.reader(f))

attributes = ['Sky', 'Temp', 'Humidity', 'Wind', 'Water', 'Forecast']
print(attributes)
n = len(attributes)
dataset = loadCsv("pgm1.csv")
print(dataset)
h = ['0'] * n   # start with the most specific hypothesis
print("Initial hypothesis")
print(h)
print("The hypotheses are")
for i in range(len(dataset)):
    target = dataset[i][-1]
    if target == 'Yes':   # FIND-S ignores negative examples
        for j in range(n):
            if h[j] == '0':
                h[j] = dataset[i][j]
            if h[j] != dataset[i][j]:
                h[j] = '?'
    print(i+1, '=', h)
print("Final hypothesis")
print(h)
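
Because pgm1.csv is not reproduced in this record, the following is a minimal self-contained sketch of the same FIND-S procedure on four inline rows. The rows are an assumption (the EnjoySport examples commonly used with this algorithm), and `find_s` is a hypothetical helper name that does not appear in the program above.

```python
def find_s(rows):
    """Return the most specific hypothesis consistent with the positive rows.

    Each row is a list of attribute values followed by a 'Yes'/'No' label.
    """
    n = len(rows[0]) - 1
    h = ['0'] * n                      # most specific hypothesis
    for row in rows:
        if row[-1] != 'Yes':           # negative examples are skipped
            continue
        for j in range(n):
            if h[j] == '0':
                h[j] = row[j]          # first positive example: copy its value
            elif h[j] != row[j]:
                h[j] = '?'             # conflicting value: generalize
    return h

# Assumed training data (not taken from pgm1.csv)
rows = [
    ['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same', 'Yes'],
    ['Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same', 'Yes'],
    ['Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change', 'No'],
    ['Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change', 'Yes'],
]
print(find_s(rows))   # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```

Wrapping the loop in a function keeps the hypothesis local, so the same routine can be re-run on a different file without resetting global state.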

SAMPLE OUTPUT

2. For a given set of training data examples stored in a .CSV file, implement and
demonstrate the Candidate-Elimination algorithm to output a description of the
set of all hypotheses consistent with the training examples.
import csv

def get_domains(examples):
    # collect the sorted set of values seen in each column
    d = [set() for i in examples[0]]
    for x in examples:
        for i, xi in enumerate(x):
            d[i].add(xi)
    return [list(sorted(x)) for x in d]

def more_general(h1, h2):
    # True if h1 is at least as general as h2 in every attribute
    more_general_parts = []
    for x, y in zip(h1, h2):
        mg = x == "?" or (x != "0" and (x == y or y == "0"))
        more_general_parts.append(mg)
    return all(more_general_parts)

def fulfills(example, hypothesis):
    # the implementation is the same as for hypotheses:
    return more_general(hypothesis, example)

def min_generalizations(h, x):
    h_new = list(h)
    for i in range(len(h)):
        if not fulfills(x[i:i+1], h[i:i+1]):
            h_new[i] = '?' if h[i] != '0' else x[i]
    return [tuple(h_new)]

def min_specializations(h, domains, x):
    results = []
    for i in range(len(h)):
        if h[i] == "?":
            for val in domains[i]:
                if x[i] != val:
                    h_new = h[:i] + (val,) + h[i+1:]
                    results.append(h_new)
        elif h[i] != "0":
            h_new = h[:i] + ('0',) + h[i+1:]
            results.append(h_new)
    return results

def generalize_S(x, G, S):
    S_prev = list(S)
    for s in S_prev:
        if s not in S:
            continue
        if not fulfills(x, s):
            S.remove(s)
            Splus = min_generalizations(s, x)
            ## keep only generalizations that have a counterpart in G
            S.update([h for h in Splus if any([more_general(g, h) for g in G])])
            ## remove hypotheses less specific than any other in S
            S.difference_update([h for h in S if any([more_general(h, h1) for h1 in S if h != h1])])
    return S

def specialize_G(x, domains, G, S):
    G_prev = list(G)
    for g in G_prev:
        if g not in G:
            continue
        if fulfills(x, g):
            G.remove(g)
            Gminus = min_specializations(g, domains, x)
            ## keep only specializations that have a counterpart in S
            G.update([h for h in Gminus if any([more_general(h, s) for s in S])])
            ## remove hypotheses less general than any other in G
            G.difference_update([h for h in G if any([more_general(g1, h) for g1 in G if h != g1])])
    return G

def candidate_elimination(examples):
    domains = get_domains(examples)[:-1]
    n = len(domains)
    G = set([("?",)*n])
    S = set([("0",)*n])
    print("Maximally specific hypotheses - S ")
    print("Maximally general hypotheses - G ")
    i = 0
    print("\nS[0]:", str(S), "\nG[0]:", str(G))
    for xcx in examples:
        i = i + 1
        x, cx = xcx[:-1], xcx[-1]
        if cx == 'Y':  # x is a positive example
            G = {g for g in G if fulfills(x, g)}
            S = generalize_S(x, G, S)
        else:          # x is a negative example
            S = {s for s in S if not fulfills(x, s)}
            G = specialize_G(x, domains, G, S)
        print("\nS[{0}]:".format(i), S)
        print("G[{0}]:".format(i), G)
    return

with open('program2.csv') as csvFile:
    examples = [tuple(line) for line in csv.reader(csvFile)]
candidate_elimination(examples)
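
The S and G sets printed by the program bound the hypotheses that are consistent with every training example. As a small sketch of what "consistent" means here, the helpers below (hypothetical names `covers` and `consistent`, with made-up three-attribute examples) check one hypothesis against labelled examples.

```python
def covers(hypothesis, example):
    """True if every attribute constraint in the hypothesis accepts the example.

    '?' accepts any value; '0' accepts none because it never equals a value.
    """
    return all(h == '?' or h == x for h, x in zip(hypothesis, example))

def consistent(hypothesis, labelled_examples):
    """True if the hypothesis covers the positive examples and no negative one."""
    return all(covers(hypothesis, x) == is_positive
               for x, is_positive in labelled_examples)

# Made-up examples for illustration only
examples = [
    (('Sunny', 'Warm', 'Normal'), True),
    (('Rainy', 'Cold', 'High'), False),
]
print(consistent(('Sunny', '?', '?'), examples))   # True
print(consistent(('?', '?', '?'), examples))       # False: it also covers the negative row
```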

SAMPLE OUTPUT

3. Write a program to demonstrate the working of the decision tree based ID3
algorithm. Use an appropriate data set for building the decision tree and apply
this knowledge to classify a new sample.
import math
import csv

def load_csv(filename):
    # the first row holds the attribute names, the remaining rows are samples
    with open(filename, "r") as f:
        dataset = list(csv.reader(f))
    headers = dataset.pop(0)
    return dataset, headers

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

def subtables(data, col, delete):
    dic = {}
    coldata = [row[col] for row in data]
    attr = list(set(coldata))  # all values of the attribute retrieved
    for k in attr:
        dic[k] = []
    for y in range(len(data)):
        key = data[y][col]
        if delete:
            del data[y][col]
        dic[key].append(data[y])
    return attr, dic

def entropy(S):
    attr = list(set(S))
    if len(attr) == 1:  # all samples share one label
        return 0
    counts = [0, 0]  # only two values possible, 'yes' or 'no'
    for i in range(2):
        counts[i] = sum([1 for x in S if attr[i] == x]) / (len(S) * 1.0)
    sums = 0
    for cnt in counts:
        sums += -1 * cnt * math.log(cnt, 2)
    return sums

def compute_gain(data, col):
    attValues, dic = subtables(data, col, delete=False)
    total_entropy = entropy([row[-1] for row in data])
    for x in range(len(attValues)):
        ratio = len(dic[attValues[x]]) / (len(data) * 1.0)
        entro = entropy([row[-1] for row in dic[attValues[x]]])
        total_entropy -= ratio * entro
    return total_entropy

def build_tree(data, features):
    lastcol = [row[-1] for row in data]
    if (len(set(lastcol))) == 1:  # if all samples have the same label, return that label
        node = Node("")
        node.answer = lastcol[0]
        return node
    n = len(data[0]) - 1
    gains = [compute_gain(data, col) for col in range(n)]
    split = gains.index(max(gains))  # index of the attribute with maximum gain
    node = Node(features[split])     # 'node' stores the selected attribute
    fea = features[:split] + features[split+1:]
    attr, dic = subtables(data, split, delete=True)  # data is split into subtables
    for x in range(len(attr)):
        child = build_tree(dic[attr[x]], fea)
        node.children.append((attr[x], child))
    return node

def print_tree(node, level):
    if node.answer != "":
        print(" "*level, node.answer)  # displays leaf node yes/no
        return
    print(" "*level, node.attribute)   # displays attribute name
    for value, n in node.children:
        print(" "*(level+1), value)
        print_tree(n, level + 2)

def classify(node, x_test, features):
    if node.answer != "":
        print(node.answer)
        return
    pos = features.index(node.attribute)
    for value, n in node.children:
        if x_test[pos] == value:
            classify(n, x_test, features)

''' Main program '''
dataset, features = load_csv("pgm3a.csv")  # read Tennis data
node = build_tree(dataset, features)       # build decision tree
print("The decision tree for the dataset using ID3 algorithm is ")
print_tree(node, 0)
testdata, features = load_csv("pgm3b.csv")
for xtest in testdata:
    print("The test instance : ", xtest)
    print("The predicted label : ", end="")
    classify(node, xtest, features)
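
The `entropy` function above assumes exactly two class labels. As a hedged alternative, the sketch below computes entropy and information gain for any number of labels using `collections.Counter`. The function names and the four-row table are illustrative assumptions and are not part of the lab program.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, col):
    """Entropy of the labels minus the weighted entropy after splitting on col."""
    labels = [r[-1] for r in rows]
    groups = {}
    for r in rows:
        groups.setdefault(r[col], []).append(r[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Made-up table: one attribute that separates the classes perfectly
rows = [['Sunny', 'No'], ['Sunny', 'No'], ['Rain', 'Yes'], ['Rain', 'Yes']]
print(information_gain(rows, 0))                      # 1.0
print(round(entropy(['Yes'] * 9 + ['No'] * 5), 3))    # 0.94
```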

SAMPLE OUTPUT

4. Build an Artificial Neural Network by implementing the Backpropagation
algorithm and test the same using appropriate data sets.
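import numpy as np

X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X/np.amax(X, axis=0)   # scale each input column to [0, 1]
y = y/100                  # scale the target to [0, 1]

def sigmoid(x):
    return 1/(1 + np.exp(-x))

def dersig(x):
    # derivative of the sigmoid, written in terms of the sigmoid output
    return x * (1 - x)

e = 7000    # number of training epochs
lr = 0.1    # learning rate
iln = 2     # input layer neurons
hln = 3     # hidden layer neurons
oln = 1     # output layer neurons
wh = np.random.uniform(size=(iln, hln))
bh = np.random.uniform(size=(1, hln))
wout = np.random.uniform(size=(hln, oln))
bout = np.random.uniform(size=(1, oln))

for i in range(e):
    # forward pass
    h1 = np.dot(X, wh)
    h = h1 + bh
    hla = sigmoid(h)
    oi1 = np.dot(hla, wout)
    oi = oi1 + bout
    op = sigmoid(oi)
    # backward pass
    EO = y - op
    og = dersig(op)
    dop = EO * og
    EH = dop.dot(wout.T)
    hg = dersig(hla)
    dhl = EH * hg
    # weight updates (the biases bh and bout are left unchanged in this program)
    wout += hla.T.dot(dop) * lr
    wh += X.T.dot(dhl) * lr

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", op)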
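
Note that `dersig` in the program is applied to the activations (`op`, `hla`) and not to the raw weighted sums. The short check below, a sketch using only the standard library, compares that expression with a finite-difference estimate of the sigmoid slope at one arbitrarily chosen point.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dersig(s):
    # expects s = sigmoid(x), i.e. the activation, not the raw input x
    return s * (1.0 - s)

x = 0.5          # arbitrary test point
eps = 1e-6       # step for the central difference
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = dersig(sigmoid(x))
print(abs(numeric - analytic) < 1e-8)
```

If `dersig` were called on the raw input instead, the two values would disagree, which is a common source of error when adapting this program.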
SAMPLE OUTPUT

5. Write a program to implement the naïve Bayesian classifier for a sample
training data set stored as a .CSV file. Compute the accuracy of the classifier,
considering few test data sets.
import csv
import random
import math

def loadCsv(filename):
    with open(filename, "r") as f:
        dataset = list(csv.reader(f))
    for i in range(len(dataset)):
        # converting strings into numbers for processing
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitDataset(dataset, splitRatio):
    # 67% training size
    trainSize = int(len(dataset) * splitRatio)
    trainSet = []
    copy = list(dataset)
    while len(trainSet) < trainSize:
        # pick random indices from the remaining rows for the training data
        index = random.randrange(len(copy))
        trainSet.append(copy.pop(index))
    return [trainSet, copy]

def separateByClass(dataset):
    # dictionary of classes 1 and 0 whose values are the instances belonging to each class
    separated = {}
    for i in range(len(dataset)):
        vector = dataset[i]
        if (vector[-1] not in separated):
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers)/float(len(numbers))

def stdev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x-avg, 2) for x in numbers])/float(len(numbers)-1)
    return math.sqrt(variance)

def summarize(dataset):
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]   # drop the summary of the class column
    return summaries

def summarizeByClass(dataset):
    separated = separateByClass(dataset)
    summaries = {}
    for classValue, instances in separated.items():
        # summaries is a dict of (mean, std) tuples for each class value
        summaries[classValue] = summarize(instances)
    return summaries

def calculateProbability(x, mean, stdev):
    exponent = math.exp(-(math.pow(x-mean, 2)/(2*math.pow(stdev, 2))))
    return (1 / (math.sqrt(2*math.pi) * stdev)) * exponent

def calculateClassProbabilities(summaries, inputVector):
    probabilities = {}
    # class and attribute information as mean and sd
    for classValue, classSummaries in summaries.items():
        probabilities[classValue] = 1
        for i in range(len(classSummaries)):
            # take mean and sd of every attribute for class 0 and 1 separately
            mean, stdev = classSummaries[i]
            x = inputVector[i]   # i-th attribute of the test vector
            probabilities[classValue] *= calculateProbability(x, mean, stdev)  # use normal dist
    return probabilities

def predict(summaries, inputVector):
    probabilities = calculateClassProbabilities(summaries, inputVector)
    bestLabel, bestProb = None, -1
    # assign the class which has the highest probability
    for classValue, probability in probabilities.items():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel

def getPredictions(summaries, testSet):
    predictions = []
    for i in range(len(testSet)):
        result = predict(summaries, testSet[i])
        predictions.append(result)
    return predictions

def getAccuracy(testSet, predictions):
    correct = 0
    for i in range(len(testSet)):
        if testSet[i][-1] == predictions[i]:
            correct += 1
    return (correct/float(len(testSet))) * 100.0

def main():
    filename = '5.csv'
    splitRatio = 0.67
    dataset = loadCsv(filename)
    trainingSet, testSet = splitDataset(dataset, splitRatio)
    print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset), len(trainingSet), len(testSet)))
    # prepare model
    summaries = summarizeByClass(trainingSet)
    # test model
    predictions = getPredictions(summaries, testSet)
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy of the classifier is : {0}%'.format(accuracy))

main()
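
Multiplying many small densities, as `calculateClassProbabilities` does, can underflow to zero when there are many attributes. One common workaround is to add log-densities instead. The sketch below assumes the same `summaries` layout (a dict mapping each class value to a list of (mean, stdev) pairs). The names `gaussian_log_pdf` and `predict_log` and the numbers are illustrative only, and, like the program above, it does not include class priors.

```python
import math

def gaussian_log_pdf(x, mean, stdev):
    """Natural log of the normal density at x."""
    var = stdev ** 2
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def predict_log(summaries, input_vector):
    """Return the class whose summed log-likelihood is largest."""
    best_label, best_score = None, -math.inf
    for label, class_summaries in summaries.items():
        score = sum(gaussian_log_pdf(input_vector[i], m, s)
                    for i, (m, s) in enumerate(class_summaries))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up per-class (mean, stdev) summaries for two attributes
summaries = {0.0: [(1.0, 0.5), (10.0, 2.0)],
             1.0: [(3.0, 0.5), (20.0, 2.0)]}
print(predict_log(summaries, [2.9, 19.0]))   # 1.0
```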
SAMPLE OUTPUT

6. Assuming a set of documents that need to be classified, use the naïve Bayesian
Classifier model to perform this task. Built-in Java classes/API can be used to
write the program. Calculate the accuracy, precision, and recall for your data
set.
import pandas as pd

msg = pd.read_csv('pgm6.csv', names=['message', 'label'])
print('Total instances in the dataset:', msg.shape[0])
msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})
X = msg.message
Y = msg.labelnum
print('\nThe message and its label of first 5 instances are listed below')
X5, Y5 = X[0:5], msg.label[0:5]
for x, y in zip(X5, Y5):
    print(x, ',', y)

from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(X, Y)
print('\nDataset is split into Training and Testing samples')
print('Total training instances :', xtrain.shape[0])
print('Total testing instances :', xtest.shape[0])

from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)
xtest_dtm = count_vect.transform(xtest)
print('\nTotal features extracted using CountVectorizer:', xtrain_dtm.shape[1])
print('\nFeatures for first 5 training instances are listed below')
# get_feature_names_out() replaces get_feature_names() in current scikit-learn releases
df = pd.DataFrame(xtrain_dtm.toarray(), columns=count_vect.get_feature_names_out())
print(df[0:5])

from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(xtrain_dtm, ytrain)
predicted = clf.predict(xtest_dtm)
print('\nClassification results of testing samples are given below')
for doc, p in zip(xtest, predicted):
    pred = 'pos' if p == 1 else 'neg'
    print('%s -> %s ' % (doc, pred))

from sklearn import metrics
print('\nAccuracy metrics')
print('Accuracy of the classifier is', metrics.accuracy_score(ytest, predicted))
print('Recall :', metrics.recall_score(ytest, predicted), '\nPrecision :', metrics.precision_score(ytest, predicted))
print('Confusion matrix')
print(metrics.confusion_matrix(ytest, predicted))
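
To see what the three reported scores measure, the following sketch computes accuracy, precision and recall by hand for binary labels (1 = 'pos', 0 = 'neg'). The helper name `binary_metrics` and the label lists are made up for illustration and do not come from pgm6.csv.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for 0/1 labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # true negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Made-up labels: 2 true positives, 1 false positive, 1 false negative, 2 true negatives
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
print(binary_metrics(y_true, y_pred))
```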

SAMPLE OUTPUT

7. Write a program to construct a Bayesian network considering medical data.
Use this model to demonstrate the diagnosis of heart patients using standard
Heart Disease Data Set. You can use Java/Python ML library classes/API.
Initial Setup

import numpy as np
import pandas as pd
import csv
from pgmpy.estimators import MaximumLikelihoodEstimator
# Note: newer pgmpy releases may expose this class under the name BayesianNetwork
from pgmpy.models import BayesianModel
from pgmpy.inference import VariableElimination

heartDisease = pd.read_csv('heart.csv')
heartDisease = heartDisease.replace('?', np.nan)   # treat '?' as a missing value
print('Sample instances from the dataset are given below')
print(heartDisease.head())
print('\n Attributes and datatypes')
print(heartDisease.dtypes)

# each pair is a directed edge (parent, child) of the network
model = BayesianModel([('age', 'heartdisease'), ('sex', 'heartdisease'),
                       ('exang', 'heartdisease'), ('cp', 'heartdisease'),
                       ('heartdisease', 'restecg'), ('heartdisease', 'chol')])
print('\nLearning CPD using Maximum likelihood estimators')
model.fit(heartDisease, estimator=MaximumLikelihoodEstimator)

print('\n Inferencing with Bayesian Network:')
HeartDiseasetest_infer = VariableElimination(model)

print('\n 1. Probability of HeartDisease given evidence= restecg')
q1 = HeartDiseasetest_infer.query(variables=['heartdisease'], evidence={'restecg': 1})
print(q1)

print('\n 2. Probability of HeartDisease given evidence= cp ')
q2 = HeartDiseasetest_infer.query(variables=['heartdisease'], evidence={'cp': 2})
print(q2)
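
Maximum-likelihood fitting of a discrete network amounts to counting: each conditional probability is the fraction of records with a given child value among the records sharing the parent value. The sketch below illustrates this for a single parent using plain Python. The function name `estimate_cpd` and the five records are made up and do not come from heart.csv.

```python
from collections import Counter

def estimate_cpd(records, child, parent):
    """Maximum-likelihood estimate of P(child | parent) from a list of dicts.

    Returns a dict keyed by (parent_value, child_value).
    """
    joint = Counter((r[parent], r[child]) for r in records)
    parent_totals = Counter(r[parent] for r in records)
    return {(p, c): n / parent_totals[p] for (p, c), n in joint.items()}

# Made-up records for illustration only
records = [
    {'cp': 1, 'heartdisease': 0},
    {'cp': 1, 'heartdisease': 0},
    {'cp': 1, 'heartdisease': 1},
    {'cp': 2, 'heartdisease': 1},
    {'cp': 2, 'heartdisease': 1},
]
cpd = estimate_cpd(records, child='heartdisease', parent='cp')
print(cpd[(2, 1)])   # 1.0: both records with cp=2 have heartdisease=1
```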
SAMPLE OUTPUT



8. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same
data set for clustering using k-Means algorithm. Compare the results of these
two algorithms and comment on the quality of clustering. You can add
Java/Python ML library classes/API in the program.
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.cluster import KMeans
import sklearn.metrics as sm
import pandas as pd
import numpy as np

l1 = [0, 1, 2]

def rename(s):
    # relabel cluster ids as 0, 1, 2 in the order in which they first appear
    l2 = []
    for i in s:
        if i not in l2:
            l2.append(i)
    for i in range(len(s)):
        pos = l2.index(s[i])
        s[i] = l1[pos]
    return s

iris = datasets.load_iris()
X = pd.DataFrame(iris.data)
X.columns = ['Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width']
y = pd.DataFrame(iris.target)
y.columns = ['Targets']
print("Actual Target is:\n", iris.target)

# k-Means clustering
model = KMeans(n_clusters=3)
model.fit(X)
plt.figure(figsize=(14, 7))
colormap = np.array(['red', 'lime', 'black'])
plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real Classification')
plt.subplot(1, 2, 2)   # second panel, so the k-Means plot does not overwrite the first
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[model.labels_], s=40)
plt.title('K Mean Classification')
plt.show()
km = rename(model.labels_)
print("\nWhat KMeans thought: \n", km)
print("Accuracy of KMeans is ", sm.accuracy_score(y, km))
print("Confusion Matrix for KMeans is \n", sm.confusion_matrix(y, km))

# EM clustering with a Gaussian mixture on standardized features
from sklearn import preprocessing
scaler = preprocessing.StandardScaler()
scaler.fit(X)
xsa = scaler.transform(X)
xs = pd.DataFrame(xsa, columns=X.columns)
print("\n", xs.sample(5))

from sklearn.mixture import GaussianMixture
gmm = GaussianMixture(n_components=3)
gmm.fit(xs)
y_cluster_gmm = gmm.predict(xs)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y_cluster_gmm], s=40)
plt.title('GMM Classification')
plt.show()
em = rename(y_cluster_gmm)
print("\nWhat EM thought: \n", em)
print("Accuracy of EM is ", sm.accuracy_score(y, em))
print("Confusion Matrix for EM is \n", sm.confusion_matrix(y, em))
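
The `rename` helper relies on the iris targets being stored in class order, so that the first cluster id seen corresponds to class 0. A hedged alternative, sketched below with made-up labels, relabels every cluster with the most common true class among its members. The name `map_clusters_to_classes` is hypothetical, and two clusters may end up mapped to the same class.

```python
from collections import Counter

def map_clusters_to_classes(true_labels, cluster_ids):
    """Relabel each cluster with the most common true class among its members."""
    members = {}
    for t, c in zip(true_labels, cluster_ids):
        members.setdefault(c, []).append(t)
    mapping = {c: Counter(ts).most_common(1)[0][0] for c, ts in members.items()}
    return [mapping[c] for c in cluster_ids]

# Made-up labels: cluster ids are arbitrary and one point is misplaced
true_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
cluster_ids = [2, 2, 2, 0, 0, 1, 1, 1, 1]
relabelled = map_clusters_to_classes(true_labels, cluster_ids)
accuracy = sum(t == r for t, r in zip(true_labels, relabelled)) / len(true_labels)
print(relabelled)   # [0, 0, 0, 1, 1, 2, 2, 2, 2]
```

The relabelled list can then be passed to an accuracy or confusion-matrix routine in place of the output of `rename`.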
SAMPLE OUTPUT



9. Write a program to implement k-Nearest Neighbour algorithm to classify the
iris data set. Print both correct and wrong predictions. Java/Python ML library
classes can be used for this problem.
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import datasets

iris = datasets.load_iris()
print("Iris Data set loaded...")
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.1)
print("Dataset is split into training and testing...")
print("Size of training data and its label", x_train.shape, y_train.shape)
print("Size of testing data and its label", x_test.shape, y_test.shape)
for i in range(len(iris.target_names)):
    print("Label", i, "-", str(iris.target_names[i]))

classifier = KNeighborsClassifier(n_neighbors=1)
classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)
print("Results of Classification using K-nn with K=1 ")
for r in range(0, len(x_test)):
    # a prediction is wrong when the actual and predicted labels differ
    print(" Sample:", str(x_test[r]), " Actual-label:", str(y_test[r]), " Predicted-label:", str(y_pred[r]))
print("Classification Accuracy :", classifier.score(x_test, y_test))

from sklearn.metrics import classification_report, confusion_matrix
print('Confusion Matrix')
print(confusion_matrix(y_test, y_pred))
print('Accuracy Metrics')
print(classification_report(y_test, y_pred))
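
For readers who want to see the voting step that the library classifier hides, here is a minimal sketch of k-nearest-neighbour prediction in plain Python (it needs Python 3.8 or later for `math.dist`). The function name `knn_predict` and the four training points are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=1):
    """Return the majority label among the k training points closest to query."""
    dists = sorted((math.dist(x, query), label)
                   for x, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Made-up 2-D points forming two well-separated groups
train_X = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
train_y = [0, 0, 1, 1]
print(knn_predict(train_X, train_y, (1.1, 1.0), k=1))   # 0
print(knn_predict(train_X, train_y, (4.8, 5.1), k=3))   # 1
```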

SAMPLE OUTPUT

10. Implement the non-parametric Locally Weighted Regression algorithm in
order to fit data points. Select appropriate data set for your experiment and
draw graphs.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

def kernel(point, xmat, k):
    # diagonal matrix of Gaussian weights centred on the query point
    m, n = np.shape(xmat)
    weights = np.asmatrix(np.eye(m))
    for j in range(m):
        diff = point - xmat[j]
        weights[j, j] = np.exp(diff*diff.T/(-2.0*k**2))
    return weights

def localWeight(point, xmat, ymat, k):
    # weighted least-squares solution for this query point
    wei = kernel(point, xmat, k)
    W = (xmat.T*(wei*xmat)).I*(xmat.T*(wei*ymat.T))
    return W

def localWeightRegression(xmat, ymat, k):
    m, n = np.shape(xmat)
    ypred = np.zeros(m)
    for i in range(m):
        ypred[i] = xmat[i]*localWeight(xmat[i], xmat, ymat, k)
    return ypred

def graphPlot(X, ypred):
    # uses the module-level arrays bill and tip for the scatter plot
    sortindex = X[:, 1].argsort(0)   # indices that sort the bill column in ascending order
    xsort = X[sortindex][:, 0]
    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1)
    ax.scatter(bill, tip, color='green')
    ax.plot(xsort[:, 1], ypred[sortindex], color='red', linewidth=5)
    plt.xlabel('Total bill')
    plt.ylabel('Tip')
    plt.show()

# load data points
data = pd.read_csv('pgm10.csv')
bill = np.array(data.total_bill)   # only the bill amount and tip columns are used
tip = np.array(data.tip)
mbill = np.asmatrix(bill)          # converts the 1-D array into a 2-D (1 x m) matrix
mtip = np.asmatrix(tip)
m = np.shape(mbill)[1]
one = np.asmatrix(np.ones(m))
X = np.hstack((one.T, mbill.T))    # 244 rows, 2 cols
# increase k to get smoother curves
ypred = localWeightRegression(X, mtip, 3)
graphPlot(X, ypred)
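
For a single input variable, the weighted normal equations solved in `localWeight` reduce to a 2x2 system, which can be written out without any matrix library. The sketch below does this with the same Gaussian kernel. The name `lwr_predict` and the five data points are assumptions, chosen to lie exactly on y = 2x + 1 so that the fitted value is easy to check.

```python
import math

def lwr_predict(x0, xs, ys, k):
    """Predict y at x0 with a locally weighted straight-line fit; k is the kernel bandwidth."""
    # Gaussian weight of every training point relative to the query x0
    w = [math.exp(-((x - x0) ** 2) / (2.0 * k ** 2)) for x in xs]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    # Solve the 2x2 weighted normal equations for intercept b0 and slope b1
    det = sw * swxx - swx * swx
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0 + b1 * x0

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # exactly y = 2x + 1
print(round(lwr_predict(2.5, xs, ys, k=1.0), 6))   # 6.0
```

On data that is not exactly linear, a smaller k makes the fit follow local variation more closely, while a larger k approaches an ordinary least-squares line.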
SAMPLE OUTPUT
