
LAB MANUAL
MACHINE LEARNING

Subject Code: 3170724

Prepared By:

Mahavir Swami College of Engineering and Technology, Surat

Practical List (Academic Year: 2022-23)

Subject: Machine Learning (3170724)
Semester: 7th
Department: C.S.E.
Faculty Name: Darsha Chauhan
Guide: Darsha Chauhan



MAHAVIR SWAMI COLLEGE OF ENGINEERING & TECHNOLOGY, SURAT

CERTIFICATE

This is to certify that Mr. / Ms. ____________________________________
of class Computer Science & Engineering, 7th semester, Enrollment No.
__________________ has satisfactorily submitted his / her term work in the
subject Machine Learning (Sub. Code: 3170724) for the term ending in
___________.
Date:

Sign of teacher Sign of Head of department



Index
Sr. No.  Date         Practical
1        17/06/2021   Write a program to implement mean, median and mode
2        01/07/2021   Write a program to implement a data distribution histogram
3        08/07/2021   Write a program to implement a scatter plot using the given dataset
4        15/07/2021   Write a program to implement linear regression from the given dataset
5        29/07/2021   Write a program to implement feature scaling from the given dataset
6        12/08/2021   Write a program for training and testing from the given dataset
7        02/09/2021   Write a program to implement a decision tree from the given dataset
8        16/09/2021   Write a program to implement the K-Nearest Neighbors algorithm from the given dataset
9        23/09/2021   Write a program to implement K-Means clustering from the given dataset
10       07/10/2021   Write a program to implement hierarchical clustering from the given dataset

Practical 1: Write a program to implement mean, median and mode
Code:
import numpy as np

# mean of the values 1..32
v1 = np.arange(1, 33)
print(v1)
print('----------------------------------')
v2 = np.mean(v1)
print(v2)
print('----------------------------------')

# median of the values 1..10
v4 = np.arange(1, 11)
v3 = np.median(v4)
print(v3)
print('----------------------------------')

# mean computed manually for comparison: sum of 1..10 divided by the count (10)
total = 0
for i in range(1, 11):
    total = total + i
print(total)
print(total / 10)
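The listing stops at the mean and median; a minimal sketch for the mode (assuming the standard-library statistics module and an illustrative list with a repeated value) is:

from statistics import mode

# mode: the value that occurs most often in the data
data = [1, 2, 2, 3, 4, 4, 4, 5]
print(mode(data))   # prints 4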


Output:


Practical 2: Write a program to implement a data distribution histogram from the given dataset.
Code:
import numpy
import matplotlib.pyplot as plt

# 2,000,000 values drawn from a normal distribution (mean 5.0, standard deviation 1.0)
x = numpy.random.normal(5.0, 1.0, 2000000)

# histogram with 100 bars
plt.hist(x, 100)
plt.show()


Output:


Practical 3: Write a program to implement a scatter plot using the given dataset
Code:
import numpy
import matplotlib.pyplot as plt

# 200 random points: x centred on 6.0 (std 1.0), y centred on 10.0 (std 2.0)
x = numpy.random.normal(6.0, 1.0, 200)
y = numpy.random.normal(10.0, 2.0, 200)

plt.scatter(x, y)
plt.show()


Output:


Practical 4: Write a program to implement linear regression from the given dataset
Code:
# Note: this listing is a Tkinter Scale-widget demo, not a regression fit.
from tkinter import *

def select():
    sel = "Value = " + str(v.get())
    label.config(text = sel)

top = Tk()
top.geometry("200x100")
v = DoubleVar()
scale = Scale(top, variable = v, from_ = 1, to = 100, orient = HORIZONTAL)
scale.pack(anchor=CENTER)
btn = Button(top, text="Value", command=select)
btn.pack(anchor=CENTER)
label = Label(top)
label.pack()
top.mainloop()
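Since the widget demo above does not perform any regression, a minimal linear-regression sketch matching the practical title is given below (the x/y values are illustrative and scipy.stats.linregress is assumed to be available):

import matplotlib.pyplot as plt
from scipy import stats

# illustrative dataset (x could be car age, y car speed)
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

# least-squares line y = slope * x + intercept
slope, intercept, r, p, std_err = stats.linregress(x, y)
print("slope:", slope, "intercept:", intercept, "correlation r:", r)

plt.scatter(x, y)
plt.plot(x, [slope * xi + intercept for xi in x])
plt.show()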


Output:


Practical 5: Write a program to implement feature scaling from the given dataset
Code:
import pandas
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler

scale = StandardScaler()

df = pandas.read_csv("cars2.csv")
X = df[['Weight', 'Volume']]

# standardise each column to zero mean and unit variance
scaledX = scale.fit_transform(X)
print(scaledX)
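The linear_model import above is never used in the listing; a hypothetical continuation (assuming cars2.csv also contains a CO2 column, as in the common version of this example) would fit a regression on the scaled features and predict for a new car:

# assumes a CO2 column exists in cars2.csv (hypothetical continuation)
regr = linear_model.LinearRegression()
regr.fit(scaledX, df['CO2'])

# predict CO2 for a car weighing 2300 kg with a 1300 cm3 engine,
# scaling the new sample with the same fitted scaler
scaled = scale.transform([[2300, 1300]])
print(regr.predict(scaled))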


Output:


Practical 6: Write a program for training and testing from the given dataset
Code:
1. Training set: the first 80% of the generated dataset; testing set: the remaining 20%.

import numpy
import matplotlib.pyplot as plt

numpy.random.seed(2)

# synthetic dataset of 100 points
x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

# 80/20 split
train_x = x[:80]
train_y = y[:80]
test_x = x[80:]
test_y = y[80:]

# fit a degree-4 polynomial to the training data and plot it
mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))
myline = numpy.linspace(0, 6, 100)

plt.scatter(train_x, train_y)
plt.plot(myline, mymodel(myline))
plt.show()


Output:


2. Training set: the first 20% of the generated dataset; testing set: the remaining 80%.

import numpy
import matplotlib.pyplot as plt

numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

# 20/80 split
train_x = x[:20]
train_y = y[:20]
test_x = x[20:]
test_y = y[20:]

mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))
myline = numpy.linspace(0, 6, 100)

plt.scatter(train_x, train_y)
plt.plot(myline, mymodel(myline))
plt.show()
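Neither variant actually evaluates on its test split; a short check of fit quality on both splits (assuming sklearn's r2_score is available) could be appended to either listing:

from sklearn.metrics import r2_score

# R^2 close to 1 on both splits suggests the model generalises;
# a large gap between the two suggests overfitting to the training points
print("train R2:", r2_score(train_y, mymodel(train_x)))
print("test R2:", r2_score(test_y, mymodel(test_x)))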


Output:


Practical 7: Write a program to implement a decision tree from the given dataset.
Code:
# DecisionTree is a local helper module used by this lab (it provides
# build_tree, getLeafNodes, getInnerNodes, computeAccuracy, prune_tree and
# print_tree); it is not part of scikit-learn.
from DecisionTree import *
import pandas as pd
from sklearn import model_selection

df = pd.read_csv('data_set/Social_Network_Ads.csv')

header = list(df.columns)

lst = df.values.tolist()

trainDF, testDF = model_selection.train_test_split(lst, test_size=0.2)

t = build_tree(trainDF, header)

print("\nLeaf nodes ****************")

leaves = getLeafNodes(t)

for leaf in leaves:
    print("id = " + str(leaf.id) + " depth =" + str(leaf.depth))

print("\nNon-leaf nodes ****************")

innerNodes = getInnerNodes(t)

for inner in innerNodes:
    print("id = " + str(inner.id) + " depth =" + str(inner.depth))

maxAccuracy = computeAccuracy(testDF, t)

print("\nTree before pruning with accuracy: " + str(maxAccuracy*100) + "\n")

print_tree(t)

nodeIdToPrune = -1
for node in innerNodes:
    if node.id != 0:
        prune_tree(t, [node.id])
        currentAccuracy = computeAccuracy(testDF, t)
        print("Pruned node_id: " + str(node.id) + " to achieve accuracy: " +
              str(currentAccuracy*100) + "%")
        if currentAccuracy > maxAccuracy:
            maxAccuracy = currentAccuracy
            nodeIdToPrune = node.id
        t = build_tree(trainDF, header)
        if maxAccuracy == 1:
            break

if nodeIdToPrune != -1:
    t = build_tree(trainDF, header)
    prune_tree(t, [nodeIdToPrune])
    print("\nFinal node Id to prune (for max accuracy): " + str(nodeIdToPrune))

else:
    t = build_tree(trainDF, header)
    print("\nPruning strategy didn't increase accuracy")

print("\n********************************************************************")

print("*********** Final Tree with accuracy: " + str(maxAccuracy*100) + "% ************")

print("********************************************************************\n")

print_tree(t)
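For readers without the local DecisionTree module, an equivalent sketch using scikit-learn's built-in DecisionTreeClassifier is shown below (it assumes the last column of Social_Network_Ads.csv is the class label and keeps only numeric feature columns):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv('data_set/Social_Network_Ads.csv')
X = df.iloc[:, :-1].select_dtypes('number')   # numeric feature columns only
y = df.iloc[:, -1]                            # class label (last column)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

clf = DecisionTreeClassifier(max_depth=4)     # a shallow tree, roughly playing the role of pruning
clf.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))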


Output:


Practical 8: Write a program to implement the K-Nearest Neighbors algorithm from the given dataset


Code:
import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"

# Assign column names to the dataset

names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

# Read dataset to pandas dataframe

dataset = pd.read_csv(url, names=names)

X = dataset.iloc[:, :-1].values

y = dataset.iloc[:, 4].values

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

scaler.fit(X_train)

X_train = scaler.transform(X_train)

X_test = scaler.transform(X_test)

from sklearn.neighbors import KNeighborsClassifier

classifier = KNeighborsClassifier(n_neighbors=5)


classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)

from sklearn.metrics import classification_report, confusion_matrix

print(confusion_matrix(y_test, y_pred))

print(classification_report(y_test, y_pred))

error = []

# Calculating error for K values between 1 and 40

for i in range(1, 40):
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(X_train, y_train)
    pred_i = knn.predict(X_test)
    error.append(np.mean(pred_i != y_test))

plt.figure(figsize=(12, 6))
plt.plot(range(1, 40), error, color='red', linestyle='dashed', marker='o',
         markerfacecolor='blue', markersize=10)
plt.title('Error Rate vs. K Value')
plt.xlabel('K Value')
plt.ylabel('Mean Error')
plt.show()
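A short hypothetical follow-up that picks the k with the lowest mean error from the curve above and refits the classifier with it:

# error[0] corresponds to k = 1, so add 1 to the index of the minimum
best_k = int(np.argmin(error)) + 1
best_knn = KNeighborsClassifier(n_neighbors=best_k)
best_knn.fit(X_train, y_train)
print("Best k:", best_k, "test accuracy:", best_knn.score(X_test, y_test))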


Output:


Practical 9: Write a program to implement K-Means clustering from the given dataset.
Code:
import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

dataset = pd.read_csv('Mall_Customers.csv')

X = dataset.iloc[:, [3, 4]].values

from sklearn.cluster import KMeans

wcss = []

for i in range(1, 11):
    kmeans = KMeans(n_clusters = i, init = 'k-means++', random_state = 42)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)

plt.plot(range(1, 11), wcss)

plt.title('The Elbow Method')

plt.xlabel('Number of clusters')

plt.ylabel('WCSS')

plt.show()

kmeans = KMeans(n_clusters = 5, init = 'k-means++', random_state = 42)

y_kmeans = kmeans.fit_predict(X)

plt.scatter(X[y_kmeans == 0, 0], X[y_kmeans == 0, 1], s = 100, c = 'red', label = 'Cluster 1')


plt.scatter(X[y_kmeans == 1, 0], X[y_kmeans == 1, 1], s = 100, c = 'blue', label = 'Cluster 2')

plt.scatter(X[y_kmeans == 2, 0], X[y_kmeans == 2, 1], s = 100, c = 'green', label = 'Cluster 3')

plt.scatter(X[y_kmeans == 3, 0], X[y_kmeans == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4')

plt.scatter(X[y_kmeans == 4, 0], X[y_kmeans == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5')

plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s = 300, c = 'yellow', label = 'Centroids')

plt.title('Clusters of customers')

plt.xlabel('Annual Income (k$)')

plt.ylabel('Spending Score (1-100)')

plt.legend()

plt.show()
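A short hypothetical follow-up showing how a new customer would be assigned to one of the fitted clusters (the income and spending-score values below are illustrative):

# predict the cluster index (0-4) for a new customer
new_customer = [[60, 50]]   # [annual income (k$), spending score (1-100)]
print(kmeans.predict(new_customer))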


Output:


Practical 10: Write a program to implement hierarchical clustering from the given dataset.
Code:
import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

dataset = pd.read_csv('Mall_Customers.csv')

X = dataset.iloc[:, [3, 4]].values

import scipy.cluster.hierarchy as sch

dendrogram = sch.dendrogram(sch.linkage(X, method = 'ward'))

plt.title('Dendrogram')

plt.xlabel('Customers')

plt.ylabel('Euclidean distances')

plt.show()

from sklearn.cluster import AgglomerativeClustering

hc = AgglomerativeClustering(n_clusters = 5, affinity = 'euclidean', linkage = 'ward')

y_hc = hc.fit_predict(X)

plt.scatter(X[y_hc == 0, 0], X[y_hc == 0, 1], s = 100, c = 'red', label = 'Cluster 1')

plt.scatter(X[y_hc == 1, 0], X[y_hc == 1, 1], s = 100, c = 'blue', label = 'Cluster 2')

plt.scatter(X[y_hc == 2, 0], X[y_hc == 2, 1], s = 100, c = 'green', label = 'Cluster 3')

plt.scatter(X[y_hc == 3, 0], X[y_hc == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4')

plt.scatter(X[y_hc == 4, 0], X[y_hc == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5')


plt.title('Clusters of customers')

plt.xlabel('Annual Income (k$)')

plt.ylabel('Spending Score (1-100)')

plt.legend()

plt.show()
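Note that recent scikit-learn releases rename the affinity argument of AgglomerativeClustering to metric; since ward linkage implies Euclidean distances anyway, a version-tolerant construction (an alternative sketch, not part of the original listing) is:

# ward linkage already uses Euclidean distances, so no metric/affinity argument is needed
hc = AgglomerativeClustering(n_clusters = 5, linkage = 'ward')
y_hc = hc.fit_predict(X)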


Output:
