
LAB MANUAL
MACHINE LEARNING

Subject Code: 3170724

Prepared By:

Mahavir Swami College of Engineering and Technology, Surat

Practical List (Academic Year: 2022-23)

Subject: Machine Learning (3170724)
Semester: 7th
Department: C.S.E.
Faculty Name: Darsha Chauhan
Guide: Darsha Chauhan



MAHAVIR SWAMI COLLEGE OF ENGINEERING & TECHNOLOGY, SURAT

CERTIFICATE

This is to certify that Mr. / Ms. ____________________________________
of class Computer Science & Engineering, 7th semester, Enrollment No.
__________________ has satisfactorily submitted his / her term work in the
subject Machine Learning (Sub. Code: 3170724) for the term ending in
___________.
Date:

Sign of teacher Sign of Head of department



Index
Sr. No.  Date         Practical
1        17/06/2021   Write a program to implement mean, median and mode
2        01/07/2021   Write a program to implement a data distribution histogram
3        08/07/2021   Write a program to implement a scatter plot using the given dataset
4        15/07/2021   Write a program to implement linear regression from the given dataset
5        29/07/2021   Write a program to implement feature scaling from the given dataset
6        12/08/2021   Write a program for training and testing from the given dataset
7        02/09/2021   Write a program to implement a decision tree from the given dataset
8        16/09/2021   Write a program to implement the K-Nearest Neighbors algorithm from the given dataset
9        23/09/2021   Write a program to implement K-Means clustering from the given dataset
10       07/10/2021   Write a program to implement hierarchical clustering from the given dataset

Practical 1: Write a program to implement mean, median and mode
Code:
import numpy as np

# mean of the values 1..32
v1 = np.arange(1, 33)
print(v1)
print('----------------------------------')
v2 = np.mean(v1)
print(v2)
print('----------------------------------')

# median of the values 1..10
v4 = np.arange(1, 11)
v3 = np.median(v4)
print(v3)
print('----------------------------------')

# mean computed manually for comparison: sum of 1..10 divided by the count (10)
total = 0
for i in range(1, 11):
    total = total + i
print(total)
print(total / 10)
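The listing stops at the mean and median; a minimal sketch for the mode (assuming the standard-library statistics module and an illustrative list with a repeated value) is:

from statistics import mode

# mode: the value that occurs most often in the data
data = [1, 2, 2, 3, 4, 4, 4, 5]
print(mode(data))   # prints 4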


Output:


Practical 2: Write a program to implement a data distribution histogram from the given dataset.
Code:
import numpy
import matplotlib.pyplot as plt

# 2,000,000 values drawn from a normal distribution (mean 5.0, standard deviation 1.0)
x = numpy.random.normal(5.0, 1.0, 2000000)

# histogram with 100 bars
plt.hist(x, 100)
plt.show()


Output:


Practical 3: Write a program to implement a scatter plot using the given dataset
Code:
import numpy
import matplotlib.pyplot as plt

# 200 random points: x centred on 6.0 (std 1.0), y centred on 10.0 (std 2.0)
x = numpy.random.normal(6.0, 1.0, 200)
y = numpy.random.normal(10.0, 2.0, 200)

plt.scatter(x, y)
plt.show()


Output:


Practical 4: Write a program to implement linear regression from the given dataset
Code:
# Note: this listing is a Tkinter Scale-widget demo, not a regression fit.
from tkinter import *

def select():
    sel = "Value = " + str(v.get())
    label.config(text = sel)

top = Tk()
top.geometry("200x100")
v = DoubleVar()
scale = Scale(top, variable = v, from_ = 1, to = 100, orient = HORIZONTAL)
scale.pack(anchor=CENTER)
btn = Button(top, text="Value", command=select)
btn.pack(anchor=CENTER)
label = Label(top)
label.pack()
top.mainloop()
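Since the widget demo above does not perform any regression, a minimal linear-regression sketch matching the practical title is given below (the x/y values are illustrative and scipy.stats.linregress is assumed to be available):

import matplotlib.pyplot as plt
from scipy import stats

# illustrative dataset (x could be car age, y car speed)
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

# least-squares line y = slope * x + intercept
slope, intercept, r, p, std_err = stats.linregress(x, y)
print("slope:", slope, "intercept:", intercept, "correlation r:", r)

plt.scatter(x, y)
plt.plot(x, [slope * xi + intercept for xi in x])
plt.show()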


Output:


Practical 5: Write a program to implement feature scaling from the given dataset
Code:
import pandas
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler

scale = StandardScaler()

df = pandas.read_csv("cars2.csv")
X = df[['Weight', 'Volume']]

# standardise each column to zero mean and unit variance
scaledX = scale.fit_transform(X)
print(scaledX)
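The linear_model import above is never used in the listing; a hypothetical continuation (assuming cars2.csv also contains a CO2 column, as in the common version of this example) would fit a regression on the scaled features and predict for a new car:

# assumes a CO2 column exists in cars2.csv (hypothetical continuation)
regr = linear_model.LinearRegression()
regr.fit(scaledX, df['CO2'])

# predict CO2 for a car weighing 2300 kg with a 1300 cm3 engine,
# scaling the new sample with the same fitted scaler
scaled = scale.transform([[2300, 1300]])
print(regr.predict(scaled))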


Output:


Practical 6: Write a program for training and testing from the given dataset
Code:
1. Training set: the first 80% of the generated dataset; testing set: the remaining 20%.

import numpy
import matplotlib.pyplot as plt

numpy.random.seed(2)

# synthetic dataset of 100 points
x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

# 80/20 split
train_x = x[:80]
train_y = y[:80]
test_x = x[80:]
test_y = y[80:]

# fit a degree-4 polynomial to the training data and plot it
mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))
myline = numpy.linspace(0, 6, 100)

plt.scatter(train_x, train_y)
plt.plot(myline, mymodel(myline))
plt.show()


Output:


2. Training set: the first 20% of the generated dataset; testing set: the remaining 80%.

import numpy
import matplotlib.pyplot as plt

numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

# 20/80 split
train_x = x[:20]
train_y = y[:20]
test_x = x[20:]
test_y = y[20:]

mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))
myline = numpy.linspace(0, 6, 100)

plt.scatter(train_x, train_y)
plt.plot(myline, mymodel(myline))
plt.show()
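Neither variant actually evaluates on its test split; a short check of fit quality on both splits (assuming sklearn's r2_score is available) could be appended to either listing:

from sklearn.metrics import r2_score

# R^2 close to 1 on both splits suggests the model generalises;
# a large gap between the two suggests overfitting to the training points
print("train R2:", r2_score(train_y, mymodel(train_x)))
print("test R2:", r2_score(test_y, mymodel(test_x)))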


Output:


Practical 7: Write a program to implement a decision tree from the given dataset.
Code:
# DecisionTree is a local helper module used by this lab (it provides
# build_tree, getLeafNodes, getInnerNodes, computeAccuracy, prune_tree and
# print_tree); it is not part of scikit-learn.
from DecisionTree import *
import pandas as pd
from sklearn import model_selection

df = pd.read_csv('data_set/Social_Network_Ads.csv')

header = list(df.columns)

lst = df.values.tolist()

trainDF, testDF = model_selection.train_test_split(lst, test_size=0.2)

t = build_tree(trainDF, header)

print("\nLeaf nodes ****************")

leaves = getLeafNodes(t)

for leaf in leaves:
    print("id = " + str(leaf.id) + " depth =" + str(leaf.depth))

print("\nNon-leaf nodes ****************")

innerNodes = getInnerNodes(t)

for inner in innerNodes:
    print("id = " + str(inner.id) + " depth =" + str(inner.depth))

maxAccuracy = computeAccuracy(testDF, t)

print("\nTree before pruning with accuracy: " + str(maxAccuracy*100) + "\n")

print_tree(t)

nodeIdToPrune = -1
for node in innerNodes:
    if node.id != 0:
        prune_tree(t, [node.id])
        currentAccuracy = computeAccuracy(testDF, t)
        print("Pruned node_id: " + str(node.id) + " to achieve accuracy: " +
              str(currentAccuracy*100) + "%")
        if currentAccuracy > maxAccuracy:
            maxAccuracy = currentAccuracy
            nodeIdToPrune = node.id
        t = build_tree(trainDF, header)
        if maxAccuracy == 1:
            break

if nodeIdToPrune != -1:
    t = build_tree(trainDF, header)
    prune_tree(t, [nodeIdToPrune])
    print("\nFinal node Id to prune (for max accuracy): " + str(nodeIdToPrune))

else:
    t = build_tree(trainDF, header)
    print("\nPruning strategy didn't increase accuracy")

print("\n********************************************************************")

print("*********** Final Tree with accuracy: " + str(maxAccuracy*100) + "% ************")

print("********************************************************************\n")

print_tree(t)
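For readers without the local DecisionTree module, an equivalent sketch using scikit-learn's built-in DecisionTreeClassifier is shown below (it assumes the last column of Social_Network_Ads.csv is the class label and keeps only numeric feature columns):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv('data_set/Social_Network_Ads.csv')
X = df.iloc[:, :-1].select_dtypes('number')   # numeric feature columns only
y = df.iloc[:, -1]                            # class label (last column)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

clf = DecisionTreeClassifier(max_depth=4)     # a shallow tree, roughly playing the role of pruning
clf.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))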


Output:


Practical 8: Write a program to implement the K-Nearest Neighbors algorithm from the given dataset


Code:
import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"

# Assign column names to the dataset

names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

# Read dataset to pandas dataframe

dataset = pd.read_csv(url, names=names)

X = dataset.iloc[:, :-1].values

y = dataset.iloc[:, 4].values

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

scaler.fit(X_train)

X_train = scaler.transform(X_train)

X_test = scaler.transform(X_test)

from sklearn.neighbors import KNeighborsClassifier

classifier = KNeighborsClassifier(n_neighbors=5)


classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)

from sklearn.metrics import classification_report, confusion_matrix

print(confusion_matrix(y_test, y_pred))

print(classification_report(y_test, y_pred))

error = []

# Calculating error for K values between 1 and 40

for i in range(1, 40):
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(X_train, y_train)
    pred_i = knn.predict(X_test)
    error.append(np.mean(pred_i != y_test))

plt.figure(figsize=(12, 6))
plt.plot(range(1, 40), error, color='red', linestyle='dashed', marker='o',
         markerfacecolor='blue', markersize=10)
plt.title('Error Rate vs. K Value')
plt.xlabel('K Value')
plt.ylabel('Mean Error')
plt.show()
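A short hypothetical follow-up that picks the k with the lowest mean error from the curve above and refits the classifier with it:

# error[0] corresponds to k = 1, so add 1 to the index of the minimum
best_k = int(np.argmin(error)) + 1
best_knn = KNeighborsClassifier(n_neighbors=best_k)
best_knn.fit(X_train, y_train)
print("Best k:", best_k, "test accuracy:", best_knn.score(X_test, y_test))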


Output:


Practical 9: Write a program to implement K-Means clustering from the given dataset.
Code:
import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

dataset = pd.read_csv('Mall_Customers.csv')

X = dataset.iloc[:, [3, 4]].values

from sklearn.cluster import KMeans

wcss = []

for i in range(1, 11):
    kmeans = KMeans(n_clusters = i, init = 'k-means++', random_state = 42)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)

plt.plot(range(1, 11), wcss)

plt.title('The Elbow Method')

plt.xlabel('Number of clusters')

plt.ylabel('WCSS')

plt.show()

kmeans = KMeans(n_clusters = 5, init = 'k-means++', random_state = 42)

y_kmeans = kmeans.fit_predict(X)

plt.scatter(X[y_kmeans == 0, 0], X[y_kmeans == 0, 1], s = 100, c = 'red', label = 'Cluster 1')


plt.scatter(X[y_kmeans == 1, 0], X[y_kmeans == 1, 1], s = 100, c = 'blue', label = 'Cluster 2')

plt.scatter(X[y_kmeans == 2, 0], X[y_kmeans == 2, 1], s = 100, c = 'green', label = 'Cluster 3')

plt.scatter(X[y_kmeans == 3, 0], X[y_kmeans == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4')

plt.scatter(X[y_kmeans == 4, 0], X[y_kmeans == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5')

plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s = 300, c = 'yellow', label = 'Centroids')

plt.title('Clusters of customers')

plt.xlabel('Annual Income (k$)')

plt.ylabel('Spending Score (1-100)')

plt.legend()

plt.show()
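A short hypothetical follow-up showing how a new customer would be assigned to one of the fitted clusters (the income and spending-score values below are illustrative):

# predict the cluster index (0-4) for a new customer
new_customer = [[60, 50]]   # [annual income (k$), spending score (1-100)]
print(kmeans.predict(new_customer))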


Output:


Practical 10: Write a program to implement hierarchical clustering from the given dataset.
Code:
import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

dataset = pd.read_csv('Mall_Customers.csv')

X = dataset.iloc[:, [3, 4]].values

import scipy.cluster.hierarchy as sch

dendrogram = sch.dendrogram(sch.linkage(X, method = 'ward'))

plt.title('Dendrogram')

plt.xlabel('Customers')

plt.ylabel('Euclidean distances')

plt.show()

from sklearn.cluster import AgglomerativeClustering

hc = AgglomerativeClustering(n_clusters = 5, affinity = 'euclidean', linkage = 'ward')

y_hc = hc.fit_predict(X)

plt.scatter(X[y_hc == 0, 0], X[y_hc == 0, 1], s = 100, c = 'red', label = 'Cluster 1')

plt.scatter(X[y_hc == 1, 0], X[y_hc == 1, 1], s = 100, c = 'blue', label = 'Cluster 2')

plt.scatter(X[y_hc == 2, 0], X[y_hc == 2, 1], s = 100, c = 'green', label = 'Cluster 3')

plt.scatter(X[y_hc == 3, 0], X[y_hc == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4')

plt.scatter(X[y_hc == 4, 0], X[y_hc == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5')


plt.title('Clusters of customers')

plt.xlabel('Annual Income (k$)')

plt.ylabel('Spending Score (1-100)')

plt.legend()

plt.show()
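Note that recent scikit-learn releases rename the affinity argument of AgglomerativeClustering to metric; since ward linkage implies Euclidean distances anyway, a version-tolerant construction (an alternative sketch, not part of the original listing) is:

# ward linkage already uses Euclidean distances, so no metric/affinity argument is needed
hc = AgglomerativeClustering(n_clusters = 5, linkage = 'ward')
y_hc = hc.fit_predict(X)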


Output:
