
ANNA UNIVERSITY

UNIVERSITY COLLEGE OF ENGINEERING – DINDIGUL

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


CS3491 – ARTIFICIAL INTELLIGENCE AND MACHINE
LEARNING LABORATORY

ANNA UNIVERSITY
UNIVERSITY COLLEGE OF ENGINEERING – DINDIGUL
DINDIGUL – 62422

BONAFIDE CERTIFICATE
This is to certify that this is a bonafide record of work done by
Mr./Ms.________________________________________
in _____________________________________________
laboratory during the academic year 2022-20

University Registration no.:

Staff In charge Head of the Department

Submitted for the University Practical Examination held on __________________

INTERNAL EXAMINER EXTERNAL EXAMINER

PRACTICAL EXERCISES: 30 PERIODS

1. Implementation of Uninformed search algorithms (BFS, DFS)

2. Implementation of Informed search algorithms (A*, memory-bounded A*)

3. Implement naïve Bayes models

4. Implement Bayesian Networks

5. Build Regression models

6. Build decision trees and random forests

7. Build SVM models

8. Implement ensembling techniques

9. Implement clustering algorithms

10. Implement EM for Bayesian networks

11. Build simple NN models

12. Build deep learning NN models

INDEX
EXPT NO.   NAME OF THE EXPERIMENT   PAGE NO.   DATE OF COMPLETION   SIGNATURE
1
2
3
4
5
6
7
8
9
10
11
12
EXPT NO.: 01
UNINFORMED SEARCH ALGORITHMS
DATE :
Aim:
To write a Python program to implement uninformed search algorithms (BFS, DFS).
Algorithm:
Breadth First Search:
Step 1: Consider the graph you want to traverse.
Step 2: Select any vertex in the graph, say v1, from which you want to start the traversal.
Step 3: Set up the two data structures needed for traversing the graph:
a visited array (the size of the graph) and a queue.
Step 4: Starting from the vertex v1, add it to the visited array, and afterward add
v1’s adjacent vertices to the queue.
Step 5: Now, using the FIFO concept, remove the front element from the queue, put it
into the visited array, and then add the adjacent vertices of the removed element to
the queue.
Step 6: Repeat step 5 until the queue is empty and no vertex is left to be visited.
Depth First Search:
Step 1: Start by putting any one of the graph’s vertices on top of a stack (the
recursive version below uses the call stack for this).
Step 2: Take the top item of the stack and add it to the visited list.
Step 3: Create a list of that vertex's adjacent nodes. Push those which are not in
the visited list onto the stack so they are explored depth-first.
Step 4: Keep repeating steps 2 and 3 until the stack is empty.
Step 5: Stop the program.
Source Code:
Breadth First Search:
graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}

visited = []   # List for visited nodes
queue = []     # Initialize a queue

def bfs(visited, graph, node):   # function for BFS
    visited.append(node)
    queue.append(node)
    while queue:                 # loop until every reachable node is visited
        m = queue.pop(0)
        print(m, end=" ")
        for neighbour in graph[m]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

# Driver Code
print("Following is the Breadth-First Search")
bfs(visited, graph, '5')   # function calling
Depth First Search:
# Using a Python dictionary to act as an adjacency list
graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}

visited = set()   # Set to keep track of visited nodes of the graph

def dfs(visited, graph, node):   # function for DFS (recursive)
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)

# Driver Code
print("Following is the Depth-First Search")
dfs(visited, graph, '5')
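
The recursive implementation above relies on Python's call stack. For reference, an equivalent explicit-stack version of the same traversal (a small sketch using the same graph as above) is:

def dfs_iterative(graph, start):
    visited = []        # nodes in the order they are visited
    stack = [start]     # explicit stack replacing the recursion
    while stack:
        node = stack.pop()                    # take the top item of the stack
        if node not in visited:
            visited.append(node)
            # Push unvisited neighbours; reversed() keeps the visiting order
            # the same as the recursive version
            for neighbour in reversed(graph[node]):
                if neighbour not in visited:
                    stack.append(neighbour)
    return visited

print(dfs_iterative(graph, '5'))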

Result:
Thus the Python program for uninformed search algorithms (BFS, DFS) was executed
successfully and the output was verified.

EXPT NO.: 02
INFORMED SEARCH ALGORITHMS
DATE :
Aim:
To write a Python program to implement informed search algorithms (A*,
memory-bounded A*).
Algorithm:
A*:
Step 1: Initialize the starting node with a cost of zero and add it to an open list.
Step 2: While the open list is not empty:
a. Find the node with the lowest cost in the open list and remove it.
b. If this node is the goal node, return the path to this node.
c. Generate all successor nodes of the current node.
d. For each successor node, calculate its cost and add it to the open list.
Step 3: If the open list is empty and the goal node has not been found, then there is
no path from the start node to the goal node.
Memory-bounded A*:
Step 1: Initialize the starting node with a cost of zero and add it to an open list and a
closed list.
Step 2: While the open list is not empty:
a. Find the node with the lowest cost in the open list and remove it.
b. If this node is the goal node, return the path to this node.
c. Generate all successor nodes of the current node.
d. For each successor node, calculate its cost and add it to the open list if it is
not in the closed list.
e. If the open list is too large, remove the node with the highest cost from the
open list and add it to the closed list.
f. Add the current node to the closed list.
Step 3: If the open list is empty and the goal node has not been found, then there is
no path from the start node to the goal node.
Source Code:
A*:
import queue as Q

graph = {
    'a': {'b': 2, 'c': 2},
    'b': {'a': 2, 'd': 1},
    'c': {'a': 2, 'd': 8, 'f': 3},
    'd': {'b': 1, 'c': 8, 'e': 2, 'S': 3},
    'e': {'d': 2, 'h': 8, 'r': 2, 'S': 9},
    'f': {'c': 3, 'G': 2, 'r': 2},
    'G': {'f': 2},
    'h': {'e': 8, 'p': 4, 'q': 4},
    'p': {'h': 4, 'q': 15, 'S': 1},
    'q': {'h': 4, 'p': 15},
    'r': {'e': 2, 'f': 2},
    'S': {'d': 3, 'e': 9, 'p': 1}
}

h_scores = {'S': 10, 'a': 5, 'b': 7, 'c': 4, 'd': 7, 'e': 5, 'f': 2,
            'G': 0, 'h': 11, 'p': 14, 'q': 12, 'r': 3}

def astar(graph, source, destination):
    visited = []
    path = [destination]
    parents = {source: None}
    queue = Q.PriorityQueue()
    queue.put((h_scores[source], source))   # f(start) = g(start) + h(start) = h(start)
    while not queue.empty():
        cost, node = queue.get()
        if node not in visited:
            visited.append(node)
            if node == destination:
                # Reconstruct the path by walking back through the parents
                i = destination
                while not path[-1] == source:
                    path.append(parents[i])
                    i = path[-1]
                return visited, list(reversed(path))
            children = graph[node]
            for i in children:
                if i not in visited:
                    total_cost = cost + children[i]                      # g so far plus the edge cost
                    total = total_cost + h_scores[i] - h_scores[node]    # replace parent's h with child's h
                    queue.put((total, i))
                    parents[i] = node

solution, path = astar(graph, 'S', 'G')

print("The expanded vertex list---->", solution)
print("The return path is ---->", path)
Memory-bounded A*:
import heapq
import math

class PriorityQueue:
    """Priority queue implementation using heapq"""
    def __init__(self):
        self.elements = []

    def is_empty(self):
        return len(self.elements) == 0

    def put(self, item, priority):
        heapq.heappush(self.elements, (priority, item))

    def get(self):
        return heapq.heappop(self.elements)[1]

class Node:
    """Node class for representing the search tree"""
    def __init__(self, state, parent=None, action=None, path_cost=0):
        self.state = state
        self.parent = parent
        self.action = action
        self.path_cost = path_cost

    def __lt__(self, other):
        return self.path_cost + heuristic(self.state) < other.path_cost + heuristic(other.state)

    def __eq__(self, other):
        return self.state == other.state

def heuristic(state):
    """Heuristic function for estimating the cost to reach the goal state"""
    # Example heuristic function - Euclidean distance to the goal
    goal_state = (0, 0)  # Replace with actual goal state
    return math.sqrt((state[0] - goal_state[0])**2 + (state[1] - goal_state[1])**2)

def memory_bounded_a_star_search(start_state, max_memory):
    """Memory-bounded A* search algorithm"""
    frontier = PriorityQueue()
    frontier.put(Node(start_state), 0)
    explored = set()
    memory = {start_state: 0}
    while not frontier.is_empty():
        node = frontier.get()
        if node.state not in explored:
            explored.add(node.state)
            if is_goal_state(node.state):
                return get_solution_path(node)
            for child_state, action, step_cost in get_successor_states(node.state):
                child_node = Node(child_state, node, action, node.path_cost + step_cost)
                child_node_f = child_node.path_cost + heuristic(child_state)
                if child_state not in memory or child_node_f < memory[child_state]:
                    frontier.put(child_node, child_node_f)
                    memory[child_state] = child_node_f
            # Drop the cheapest remembered states until the memory bound is respected
            while memory_usage(memory) > max_memory:
                state_to_remove = min(memory, key=memory.get)
                del memory[state_to_remove]
    return None

def get_successor_states(state):
    """Function for generating successor states"""
    # Replace with actual successor state generation logic
    return []

def is_goal_state(state):
    """Function for checking if a state is the goal state"""
    # Replace with actual goal state checking logic
    return False

def get_solution_path(node):
    """Function for retrieving the solution path"""
    path = []
    while node.parent is not None:
        path.append((node.action, node.state))
        node = node.parent
    path.reverse()
    return path

def memory_usage(memory):
    """Function for estimating the memory usage of a dictionary"""
    return sum(memory.values())
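
The helper functions above are placeholders. As a small illustration (a hypothetical 4-connected grid problem, with the goal fixed at (0, 0) to match the example heuristic, and an arbitrary memory bound), they could be replaced like this before calling the search:

# Sketch: replace the placeholders with a tiny grid problem and run the search
def is_goal_state(state):
    return state == (0, 0)

def get_successor_states(state):
    x, y = state
    moves = [('up', (x, y + 1)), ('down', (x, y - 1)),
             ('left', (x - 1, y)), ('right', (x + 1, y))]
    # Each successor is (child_state, action, step_cost)
    return [(pos, action, 1) for action, pos in moves]

path = memory_bounded_a_star_search((3, 4), max_memory=500)   # bound chosen arbitrarily
print(path)   # list of (action, state) pairs from the start toward the goal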

Result:
Thus the Python program for informed search algorithms (A*, memory-bounded A*) was
executed successfully and the output was verified.

EXPT NO.: 03
IMPLEMENT NAÏVE BAYES MODELS
DATE :

Aim:
To write a Python program to implement naïve Bayes models.
Algorithm:
Step 1: Collect the dataset: The first step in using Naive Bayes is to collect a dataset
that contains a set of data points and their corresponding classes.
Step 2: Prepare the data: The next step is to pre-process the data and prepare it for
the Naïve Bayes algorithm. This involves removing any unnecessary features or attributes
and normalizing the data.
Step 3: Compute the prior probabilities: The prior probabilities of each class can be
computed by calculating the number of data points belonging to each class and dividing it by
the total number of data points.
Step 4: Compute the likelihoods: The likelihoods of each feature for each class can be
computed by calculating the conditional probability of the feature given the class. This
involves counting the number of data points in each class that have the feature and dividing
it by the total number of data points in that class.
Step 5: Compute the posterior probabilities: The posterior probabilities of each class
can be computed by multiplying the prior probability of the class with the product of the
likelihoods of each feature for that class.
Step 6: Make predictions: Once the posterior probabilities have been computed for
each class, the Naive Bayes algorithm can be used to make predictions by selecting the class
with the highest probability.
Step 7: Evaluate the model: The final step is to evaluate the performance of the
Naive Bayes model. This can be done by computing various performance metrics such as
accuracy, precision, recall, and F1 score.
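
As a quick illustration of Steps 3 to 6 before the scikit-learn implementation below, here is a tiny hand-worked sketch with hypothetical counts (a toy two-class text example, not the dataset used in the source code):

# Toy illustration of Steps 3-6 (hypothetical counts, not the dataset used below)
# Priors: 8 of 20 training messages are "spam", 12 are "ham"
prior_spam, prior_ham = 8 / 20, 12 / 20
# Likelihoods: how often the word "offer" appears in each class
p_offer_given_spam, p_offer_given_ham = 6 / 8, 2 / 12
# Posteriors (unnormalised) for a new message containing "offer"
post_spam = prior_spam * p_offer_given_spam   # 0.30
post_ham = prior_ham * p_offer_given_ham      # 0.10
# Step 6: predict the class with the higher posterior -> "spam"
print(post_spam, post_ham)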
Source Code:
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv(r'C:\Users\Manikandan\OneDrive\KARTHIK MANI\datasets\Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, -1].values

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Training the Naive Bayes model on the Training set
from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(X_train, y_train)

# Predicting the Test set results
y_pred = classifier.predict(X_test)

y_pred

y_test

# Making the Confusion Matrix and computing the accuracy
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
ac = accuracy_score(y_test, y_pred)

ac

cm

Result:
Thus the Python program to implement naïve Bayes models was executed
successfully and the output was verified.

EXPT NO.: 04
IMPLEMENT BAYESIAN NETWORKS
DATE :

Aim:
To write a Python program to implement Bayesian networks.
Algorithm:
Step 1: Define the variables: The first step in implementing a Bayesian Network is to
define the variables that will be used in the model. Each variable should be clearly defined
and its possible states should be enumerated.
Step 2: Determine the relationships between variables: The next step is to determine
the probabilistic relationships between the variables. This can be done by identifying the
causal relationships between the variables or by using data to estimate the conditional
probabilities of each variable given its parents.
Step 3: Construct the Bayesian Network: The Bayesian Network can be constructed
by representing the variables as nodes in a directed acyclic graph (DAG). The edges between
the nodes represent the conditional dependencies between the variables.
Step 4: Assign probabilities to the variables: Once the structure of the Bayesian
Network has been defined, the probabilities of each variable must be assigned. This can be
done by using expert knowledge, data, or a combination of both.
Step 5: Inference: Inference refers to the process of using the Bayesian Network to
make predictions or draw conclusions. This can be done by using various inference
algorithms, such as variable elimination or belief propagation.
Step 6: Learning: Learning refers to the process of updating the probabilities in the
Bayesian Network based on new data. This can be done using various learning algorithms,
such as maximum likelihood or Bayesian learning.
Step 7: Evaluation: The final step in implementing a Bayesian Network is to evaluate
its performance. This can be done by comparing the predictions of the model to actual data
and computing various performance metrics, such as accuracy or precision.
Source Code:
import numpy as np
import csv
import pandas as pd
from pgmpy.models import BayesianModel
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# Read the Cleveland Heart Disease data
heartDisease = pd.read_csv('heart.csv')
heartDisease = heartDisease.replace('?', np.nan)

# Display the data
print('Few examples from the dataset are given below')
print(heartDisease.head())

# Model the Bayesian Network
model = BayesianModel([('age', 'trestbps'), ('age', 'fbs'), ('sex', 'trestbps'),
                       ('exang', 'trestbps'), ('trestbps', 'heartdisease'),
                       ('fbs', 'heartdisease'), ('heartdisease', 'restecg'),
                       ('heartdisease', 'thalach'), ('heartdisease', 'chol')])

# Learning CPDs using Maximum Likelihood Estimators
print('\n Learning CPD using Maximum likelihood estimators')
model.fit(heartDisease, estimator=MaximumLikelihoodEstimator)

# Inferencing with Bayesian Network
print('\n Inferencing with Bayesian Network:')
HeartDisease_infer = VariableElimination(model)

# Computing the probability of HeartDisease given age
print('\n 1. Probability of HeartDisease given Age=28')
q = HeartDisease_infer.query(variables=['heartdisease'], evidence={'age': 28})
print(q['heartdisease'])

# Computing the probability of HeartDisease given cholesterol
print('\n 2. Probability of HeartDisease given cholesterol=100')
q = HeartDisease_infer.query(variables=['heartdisease'], evidence={'chol': 100})
print(q['heartdisease'])

Result:
Thus the Python program to implement Bayesian networks was executed
successfully and the output was verified.

EXPT NO.: 05
BUILD REGRESSION MODELS
DATE :

Aim:
To write a Python program to build regression models.
Algorithm:
Step 1: Collecting and cleaning the data: The first step in building a regression model
is to gather the data needed for analysis and ensure that it is clean and consistent. This may
involve removing missing values, outliers, and other errors.
Step 2: Exploring the data: Once the data is cleaned, it is important to explore it to
gain an understanding of the relationships between the input and outcome variables. This
may involve calculating summary statistics, creating visualizations, and testing for
correlations.
Step 3: Choosing the algorithm: Based on the nature of the problem and the
characteristics of the data, an appropriate regression algorithm is chosen.
Step 4: Pre-processing the data: Before applying the regression algorithm, it may be
necessary to pre-process the data to ensure that it is in a suitable format. This may involve
standardizing or normalizing the data, encoding categorical variables, or applying feature
engineering techniques.
Step 5: Training the model: The regression model is trained on a subset of the data,
using an optimization algorithm to find the values of the model parameters that minimize
the difference between the predicted and actual values.
Step 6: Evaluating the model: Once the model is trained, it is evaluated using a
separate test dataset to determine its accuracy and generalization performance. Metrics
such as mean squared error, R-squared, or root mean squared error can be used to assess
the model's performance.
Step 7: Improving the model: Based on the evaluation results, the model can be
refined by adjusting the model parameters or using different algorithms.
Step 8: Deploying the model: Finally, the model can be deployed to make predictions
on new data.
Source Code:
# Importing necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Importing the dataset
customers = pd.read_csv(r'C:\Users\Manikandan\OneDrive\KARTHIK MANI\datasets\Ecommerce Customers.csv')

# View the first five rows of the dataset
customers.head()

# Description of the dataset
customers.describe()

# View a summary of the dataset
customers.info()

sns.set_palette("GnBu_d")
sns.set_style('whitegrid')

sns.jointplot(x='Time on Website', y='Yearly Amount Spent', data=customers)
sns.jointplot(x='Time on App', y='Yearly Amount Spent', data=customers)
sns.pairplot(customers)
sns.lmplot(x='Length of Membership', y='Yearly Amount Spent', data=customers)

y = customers['Yearly Amount Spent']
X = customers[['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership']]

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)

from sklearn.linear_model import LinearRegression
lm = LinearRegression()
lm.fit(X_train, y_train)

predictions = lm.predict(X_test)

plt.scatter(y_test, predictions)
plt.xlabel('Y Test')
plt.ylabel('Predicted Y')

from sklearn import metrics
print('MAE:', metrics.mean_absolute_error(y_test, predictions))
print('MSE:', metrics.mean_squared_error(y_test, predictions))
print('RMSE:', np.sqrt(metrics.mean_squared_error(y_test, predictions)))

coefficients = pd.DataFrame(lm.coef_, X.columns)
coefficients.columns = ['Coefficient']
coefficients
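
Step 6 of the algorithm also mentions R-squared; a short optional addition using scikit-learn's r2_score (an assumed extension, not part of the original listing) reports it:

# Optional: report R-squared as well, as mentioned in Step 6 of the algorithm
from sklearn.metrics import r2_score
print('R-squared:', r2_score(y_test, predictions))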

Result:
Thus the Python program to build regression models was executed successfully and
the output was verified.

EXPT NO.: 06
BUILD DECISION TREES AND RANDOM FORESTS
DATE :
Aim:
To write a Python program to build decision trees and random forests.

Algorithm:
Decision Trees:
Step 1: Select the feature that best splits the data: The first step is to select the
feature that best separates the data into groups with different target values.
Step 2: Recursively split the data: For each group created in step 1, repeat the
process of selecting the best feature to split the data until a stopping criterion is met. The
stopping criterion may be a maximum tree depth, a minimum number of samples in a leaf
node, or another condition.
Step 3: Assign a prediction value to each leaf node: Once the tree is built, assign a
prediction value to each leaf node. This value may be the mean or median target value of
the samples in the leaf node.
Random Forest:
Step 1: Randomly select a subset of features: Before building each decision tree,
randomly select a subset of features to consider for splitting the data.
Step 2: Build multiple decision trees: Build multiple decision trees using the process
described above, each with a different subset of features.
Step 3: Aggregate the predictions: When making predictions on new data, aggregate
the predictions from all decision trees to obtain a final prediction value. This can be done by
taking the average or majority vote of the predictions.
Source Code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline
sns.set_style("whitegrid")
plt.style.use("fivethirtyeight")

df = pd.read_csv(r'C:\Users\Manikandan\OneDrive\KARTHIK MANI\datasets\WA_Fn-UseC_-HR-Employee-Attrition.csv')
df.head()

sns.countplot(x='Attrition', data=df)

df.drop(['EmployeeCount', 'EmployeeNumber', 'Over18', 'StandardHours'], axis="columns", inplace=True)

categorical_col = []
for column in df.columns:
    if df[column].dtype == object and len(df[column].unique()) <= 50:
        categorical_col.append(column)

df['Attrition'] = df.Attrition.astype("category").cat.codes
categorical_col.remove('Attrition')

# Transform categorical data into dummies (alternative approach, kept commented out)
# categorical_col.remove("Attrition")
# data = pd.get_dummies(df, columns=categorical_col)
# data.info()
from sklearn.preprocessing import LabelEncoder
label = LabelEncoder()
for column in categorical_col:
    df[column] = label.fit_transform(df[column])

from sklearn.model_selection import train_test_split
X = df.drop('Attrition', axis=1)
y = df.Attrition
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

def print_score(clf, X_train, y_train, X_test, y_test, train=True):
    if train:
        pred = clf.predict(X_train)
        clf_report = pd.DataFrame(classification_report(y_train, pred, output_dict=True))
        print("Train Result:\n================================================")
        print(f"Accuracy Score: {accuracy_score(y_train, pred) * 100:.2f}%")
        print("_______________________________________________")
        print(f"CLASSIFICATION REPORT:\n{clf_report}")
        print("_______________________________________________")
        print(f"Confusion Matrix: \n {confusion_matrix(y_train, pred)}\n")
    elif train == False:
        pred = clf.predict(X_test)
        clf_report = pd.DataFrame(classification_report(y_test, pred, output_dict=True))
        print("Test Result:\n================================================")
        print(f"Accuracy Score: {accuracy_score(y_test, pred) * 100:.2f}%")
        print("_______________________________________________")
        print(f"CLASSIFICATION REPORT:\n{clf_report}")
        print("_______________________________________________")
        print(f"Confusion Matrix: \n {confusion_matrix(y_test, pred)}\n")

from sklearn.tree import DecisionTreeClassifier
tree_clf = DecisionTreeClassifier(random_state=42)
tree_clf.fit(X_train, y_train)
print_score(tree_clf, X_train, y_train, X_test, y_test, train=True)
print_score(tree_clf, X_train, y_train, X_test, y_test, train=False)

from sklearn.ensemble import RandomForestClassifier
rf_clf = RandomForestClassifier(n_estimators=100)
rf_clf.fit(X_train, y_train)
print_score(rf_clf, X_train, y_train, X_test, y_test, train=True)
print_score(rf_clf, X_train, y_train, X_test, y_test, train=False)

Result:
Thus the Python program to build decision trees and random forests was executed
successfully and the output was verified.

EXPT NO.: 07
BUILD SVM MODELS
DATE :

Aim:
To write a Python program to build SVM models.
Algorithm:
Step 1: Load a dataset using the pandas library
Step 2: Split the dataset into training and testing sets using train_test_split function
from scikit-learn
Step 3: Train three SVM models with different kernels (linear, polynomial, and RBF)
using SVC function from scikit-learn
Step 4: Predict the test set labels using the trained models
Step 5: Evaluate the accuracy of the models using the accuracy_score function from
scikit-learn
Step 6: Print the accuracy of each model
Source Code:
# Import the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Load the dataset
dataset = pd.read_csv(r'C:\Users\Manikandan\OneDrive\KARTHIK MANI\datasets\Social_Network_Ads.csv')

dataset.head()

# Split the dataset into X and y
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values

# Split X and y into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Perform feature scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Fit SVM to the Training set
from sklearn.svm import SVC
classifier = SVC(kernel='rbf', random_state=0)
classifier.fit(X_train, y_train)

# Predict the Test set results
y_pred = classifier.predict(X_test)

y_pred

# Make the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

# Visualise the Test set results
from matplotlib.colors import ListedColormap
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(start=X_set[:, 0].min() - 1, stop=X_set[:, 0].max() + 1, step=0.01),
                     np.arange(start=X_set[:, 1].min() - 1, stop=X_set[:, 1].max() + 1, step=0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c=ListedColormap(('red', 'green'))(i), label=j)
plt.title('SVM (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
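
Step 3 of the algorithm asks for three SVM models with different kernels, while the listing above trains only the RBF model; a minimal sketch of that comparison on the same split and scaling (an assumed extension, reusing SVC and accuracy_score imported above) is:

# Sketch: compare the three kernels named in the algorithm on the same split
for kernel_name in ('linear', 'poly', 'rbf'):
    clf = SVC(kernel=kernel_name, random_state=0)
    clf.fit(X_train, y_train)
    print(kernel_name, 'kernel accuracy:', accuracy_score(y_test, clf.predict(X_test)))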

Result:
Thus the Python program to build SVM models was executed successfully and the
output was verified.

EXPT NO.: 08
IMPLEMENT ENSEMBLING TECHNIQUES
DATE :
Aim:
To write a Python program to implement ensembling techniques.
Algorithm:
Step 1: Load the dataset and split it into training and testing sets.
Step 2: Choose the base models to be included in the ensemble.
Step 3: Train each base model on the training set.
Step 4: Combine the predictions of the base models using the chosen ensembling
technique (voting, bagging, boosting, etc.).
Step 5: Evaluate the performance of the ensemble model on the testing set.
Step 6: If the performance is satisfactory, deploy the ensemble model for making
predictions on new data.
Source Code:
import pandas as pd

loan_data = pd.read_csv(r'C:\Users\Manikandan\OneDrive\KARTHIK MANI\datasets\loan_data.csv')
loan_data.head()

loan_data.info()

print(loan_data['not.fully.paid'].value_counts())
loan_data['not.fully.paid'].value_counts().plot(kind='barh')

# Balance the classes by down-sampling the majority class
loan_data_class_1 = loan_data[loan_data['not.fully.paid'] == 1]
number_class_1 = len(loan_data_class_1)
loan_data_class_0 = loan_data[loan_data['not.fully.paid'] == 0].sample(number_class_1)
final_loan_data = pd.concat([loan_data_class_1, loan_data_class_0])
print(final_loan_data.shape)

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))

# Remove the unwanted 'purpose' column and get the data
final_loan_data.drop('purpose', axis=1, inplace=True)
X = final_loan_data.drop('not.fully.paid', axis=1)
normalized_X = scaler.fit_transform(X)

from sklearn.model_selection import train_test_split
y = final_loan_data['not.fully.paid']
r_state = 2023
t_size = 0.33
X_train, X_test, y_train, y_test = train_test_split(normalized_X, y,
                                                    test_size=t_size,
                                                    random_state=r_state,
                                                    stratify=y)

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
# Define the model
random_forest_model = RandomForestClassifier()
# Fit the model to the data
random_forest_model.fit(X_train, y_train)

from sklearn.metrics import accuracy_score
# Make predictions
y_pred = random_forest_model.predict(X_test)
# Get the performance
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# Hold out a validation set for stacking
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train,
                                                  test_size=t_size,
                                                  random_state=r_state)

# Decision Tree model
dt_model = DecisionTreeClassifier()
dt_model.fit(X_train, y_train)
dt_model_pred_val = pd.DataFrame(dt_model.predict(X_val))
dt_model_pred_test = pd.DataFrame(dt_model.predict(X_test))

# KNN model
knn_model = KNeighborsClassifier()
knn_model.fit(X_train, y_train)
knn_model_pred_val = pd.DataFrame(knn_model.predict(X_val))
knn_model_pred_test = pd.DataFrame(knn_model.predict(X_test))

x_val = pd.DataFrame(X_val)
x_test = pd.DataFrame(X_test)

# Keep the base-model predictions in the same column order for both sets
df_val_lr = pd.concat([x_val, knn_model_pred_val, dt_model_pred_val], axis=1)
df_test_lr = pd.concat([x_test, knn_model_pred_test, dt_model_pred_test], axis=1)

# Logistic Regression model as the meta-learner
lr_model = LogisticRegression()
lr_model.fit(df_val_lr, y_val)
lr_model.score(df_test_lr, y_test)
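
Step 4 of the algorithm also names voting as an ensembling option; a short sketch using scikit-learn's VotingClassifier over the same base learners (an assumed alternative to the manual stacking above, reusing the classifiers already imported) could look like this:

# Sketch: hard-voting ensemble over the same three base learners
from sklearn.ensemble import VotingClassifier
voting_model = VotingClassifier(estimators=[('dt', DecisionTreeClassifier()),
                                            ('knn', KNeighborsClassifier()),
                                            ('lr', LogisticRegression())],
                                voting='hard')
voting_model.fit(X_train, y_train)
print("Voting accuracy:", voting_model.score(X_test, y_test))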

Result:
Thus the Python program to implement ensembling techniques was executed
successfully and the output was verified.

EXPT NO.: 09
IMPLEMENT CLUSTERING ALGORITHMS
DATE :
Aim:
To write a Python program to implement clustering algorithms.
Algorithm:
Step 1: Data preparation: The first step is to prepare the data that we want to
cluster. This may involve data cleaning, normalization, and feature extraction, depending on
the type and quality of the data.
Step 2: Choosing a distance metric: The next step is to choose a distance metric or
similarity measure that will be used to determine the similarity between data points.
Common distance metrics include Euclidean distance, Manhattan distance, and cosine
similarity.
Step 3: Choosing a clustering algorithm: There are many clustering algorithms
available, each with its own strengths and weaknesses. Some popular clustering algorithms
include K-Means, hierarchical clustering, and DBSCAN.
Step 4: Choosing the number of clusters: Depending on the clustering algorithm
chosen, we may need to specify the number of clusters we want to form. This can be done
using domain knowledge or by using techniques such as the elbow method or silhouette
analysis.
Step 5: Cluster assignment: Once the clusters have been formed, we need to assign
each data point to its nearest cluster based on the chosen distance metric.
Step 6: Interpretation and evaluation: Finally, we need to interpret and evaluate the
results of the clustering algorithm to determine if the clustering has produced meaningful
and useful insights.
Source Code:
# Importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# Importing the dataset
dataset = pd.read_csv(r'C:\Users\Manikandan\OneDrive\KARTHIK MANI\datasets\Mall_Customers.csv')

x = dataset.iloc[:, [3, 4]].values

# Finding the optimal number of clusters using the elbow method
from sklearn.cluster import KMeans
wcss_list = []   # Initializing the list for the values of WCSS

# Using a for loop for iterations from 1 to 10
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(x)
    wcss_list.append(kmeans.inertia_)

mtp.plot(range(1, 11), wcss_list)
mtp.title('The Elbow Method Graph')
mtp.xlabel('Number of clusters (k)')
mtp.ylabel('wcss_list')
mtp.show()

# Training the K-means model on the dataset
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)
y_predict = kmeans.fit_predict(x)

# Visualizing the clusters
mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1], s=100, c='blue', label='Cluster 1')
mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1], s=100, c='green', label='Cluster 2')
mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1], s=100, c='red', label='Cluster 3')
mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1], s=100, c='cyan', label='Cluster 4')
mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1], s=100, c='magenta', label='Cluster 5')
mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='yellow', label='Centroids')
mtp.title('Clusters of customers')
mtp.xlabel('Annual Income (k$)')
mtp.ylabel('Spending Score (1-100)')
mtp.legend()
mtp.show()
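
Step 4 of the algorithm also mentions silhouette analysis as a way to choose the number of clusters; a brief optional check using scikit-learn's silhouette_score (an assumed addition, not part of the original listing) is:

# Optional: silhouette analysis for candidate values of k (see Step 4 of the algorithm)
from sklearn.metrics import silhouette_score
for k in range(2, 11):
    labels = KMeans(n_clusters=k, init='k-means++', random_state=42).fit_predict(x)
    print('k =', k, 'silhouette score =', silhouette_score(x, labels))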

Result:
Thus the Python program to implement clustering algorithms was executed
successfully and the output was verified.

EXPT NO.: 10
IMPLEMENT EM FOR BAYESIAN NETWORKS
DATE :
Aim:
To write a Python program to implement EM for Bayesian networks.
Algorithm:
Step 1: Initialize the parameters: Start by initializing the parameters of the Bayesian
network, such as the CPDs for each node. These can be initialized randomly or using some
prior knowledge.
Step 2: E-step: In the E-step, we estimate the expected sufficient statistics for the
unobserved variables in the network, given the observed data and the current parameter
estimates. This involves computing the posterior probability distribution over the hidden
variables, given the observed data and the current parameter estimates.
Step 3: M-step: In the M-step, we maximize the expected log-likelihood of the
observed data with respect to the parameters. This involves updating the parameter
estimates using the expected sufficient statistics computed in the E-step.
Step 4: Repeat steps 2 and 3 until convergence: Iterate between the E-step and M-
step until the parameter estimates converge, or some other stopping criterion is met.
Source Code:
# Import libraries
# For plotting
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("white")
%matplotlib inline
# For matrix math
import numpy as np
# For normalization + probability density function computation
from scipy import stats
# For data preprocessing
import pandas as pd
from math import sqrt, log, exp, pi
from random import uniform
print("import done")

random_seed = 36788765
np.random.seed(random_seed)
Mean1 = 2.0           # Input parameter, mean of first normal probability distribution
Standard_dev1 = 4.0   #@param {type:"number"}
Mean2 = 9.0           # Input parameter, mean of second normal probability distribution
Standard_dev2 = 2.0   #@param {type:"number"}

# Generate data
y1 = np.random.normal(Mean1, Standard_dev1, 1000)
y2 = np.random.normal(Mean2, Standard_dev2, 500)
data = np.append(y1, y2)

# For data visualisation, calculate the left and right limits of the graph
Min_graph = min(data)
Max_graph = max(data)
x = np.linspace(Min_graph, Max_graph, 2000)   # to plot the data
print('Input Gaussian {:}: μ = {:.2}, σ = {:.2}'.format("1", Mean1, Standard_dev1))
print('Input Gaussian {:}: μ = {:.2}, σ = {:.2}'.format("2", Mean2, Standard_dev2))
sns.distplot(data, bins=20, kde=False);

class Gaussian:
    "Model univariate Gaussian"
    def __init__(self, mu, sigma):
        # Mean and standard deviation
        self.mu = mu
        self.sigma = sigma

    # Probability density function
    def pdf(self, datum):
        "Probability of a data point given the current parameters"
        u = (datum - self.mu) / abs(self.sigma)
        y = (1 / (sqrt(2 * pi) * abs(self.sigma))) * exp(-u * u / 2)
        return y

    def __repr__(self):
        return 'Gaussian({0:4.6}, {1:4.6})'.format(self.mu, self.sigma)

print("done")
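
The listing above only generates the data and defines the Gaussian class; the E-step and M-step described in the algorithm are not shown. A minimal sketch of those two steps for this two-component mixture (equal mixing weights are assumed for brevity, and the initialisation is arbitrary) could be:

# Sketch of the EM iterations for the two-component mixture generated above
# (equal mixing weights assumed; initial parameters chosen arbitrarily)
g1 = Gaussian(uniform(Min_graph, Max_graph), 1.0)
g2 = Gaussian(uniform(Min_graph, Max_graph), 1.0)

for iteration in range(20):
    # E-step: responsibility of each component for every data point
    p1 = np.array([g1.pdf(d) for d in data])
    p2 = np.array([g2.pdf(d) for d in data])
    w1 = p1 / (p1 + p2)
    w2 = 1.0 - w1

    # M-step: re-estimate each component's mean and standard deviation
    g1.mu = np.sum(w1 * data) / np.sum(w1)
    g1.sigma = sqrt(np.sum(w1 * (data - g1.mu) ** 2) / np.sum(w1))
    g2.mu = np.sum(w2 * data) / np.sum(w2)
    g2.sigma = sqrt(np.sum(w2 * (data - g2.mu) ** 2) / np.sum(w2))

print(g1, g2)   # with a reasonable initialisation the estimates approach the input parameters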

Result:
Thus the Python program to implement EM for Bayesian networks was executed
successfully and the output was verified.
EXPT NO.: 11
BUILD SIMPLE NN MODELS
DATE :

Aim:
To write a Python program to build simple NN models.
Algorithm:
Step 1: Data preparation: Pre-process the data to make it suitable for training the
NN. This may involve normalizing the input data, splitting the data into training and
validation sets, and encoding the output variables if necessary.
Step 2: Define the architecture: Choose the number of layers and neurons in the NN,
and define the activation functions for each layer. The input layer should have one neuron
per input feature, and the output layer should have one neuron per output variable.
Step 3: Initialize the weights: Initialize the weights of the NN randomly, using a small
value to avoid saturating the activation functions.
Step 4: Forward propagation: Feed the input data forward through the NN, applying
the activation functions at each layer, and compute the output of the NN.
Step 5: Compute the loss: Calculate the error between the predicted output and the
true output, using a suitable loss function such as mean squared error or cross-entropy.
Step 6: Backward propagation: Compute the gradient of the loss with respect to the
weights, using the chain rule and backpropagate the error through the NN to adjust the
weights.
Step 7: Update the weights: Adjust the weights using an optimization algorithm such
as stochastic gradient descent or Adam, and repeat steps 4-7 for a fixed number of epochs
or until the performance on the validation set stops improving.
Step 8: Evaluate the model: Test the performance of the model on a held-out test set
and report the accuracy or other performance metrics.
Source Code:
import torch
import torch.nn as nn

n_input, n_hidden, n_out, batch_size, learning_rate = 10, 15, 1, 100, 0.01

data_x = torch.randn(batch_size, n_input)
data_y = (torch.rand(size=(batch_size, 1)) < 0.5).float()

print(data_x.size())
print(data_y.size())

model = nn.Sequential(nn.Linear(n_input, n_hidden),
                      nn.ReLU(),
                      nn.Linear(n_hidden, n_out),
                      nn.Sigmoid())

print(model)

loss_function = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

losses = []
for epoch in range(5000):
    pred_y = model(data_x)
    loss = loss_function(pred_y, data_y)
    losses.append(loss.item())

    model.zero_grad()
    loss.backward()
    optimizer.step()

import matplotlib.pyplot as plt

plt.plot(losses)
plt.ylabel('loss')
plt.xlabel('epoch')
plt.title("Learning rate %f" % (learning_rate))
plt.show()

Result:
Thus the Python program to build simple NN models was executed successfully and
the output was verified.

EXPT NO.: 12
BUILD DEEP LEARNING NN MODELS
DATE :
Aim:
To write a Python program to build deep learning NN models.
Algorithm:
Step 1: Data preparation: Pre-process the data to make it suitable for training the
NN. This may involve normalizing the input data, splitting the data into training and
validation sets, and encoding the output variables if necessary.
Step 2: Define the architecture: Choose the number of layers and neurons in the NN,
and define the activation functions for each layer. Deep learning models typically use
activation functions such as ReLU or variants thereof, and often incorporate dropout or
other regularization techniques to prevent overfitting.
Step 3: Initialize the weights: Initialize the weights of the NN randomly, using a small
value to avoid saturating the activation functions.
Step 4: Forward propagation: Feed the input data forward through the NN, applying
the activation functions at each layer, and compute the output of the NN.
Step 5: Compute the loss: Calculate the error between the predicted output and the
true output, using a suitable loss function such as mean squared error or cross-entropy.
Step 6: Backward propagation: Compute the gradient of the loss with respect to the
weights, using the chain rule and backpropagate the error through the NN to adjust the
weights.
Step 7: Update the weights: Adjust the weights using an optimization algorithm such
as stochastic gradient descent or Adam, and repeat steps 4-7 for a fixed number of epochs
or until the performance on the validation set stops improving.
Step 8: Evaluate the model: Test the performance of the model on a held-out test set
and report the accuracy or other performance metrics.
Step 9: Fine-tune the model: If necessary, fine-tune the model by adjusting the
hyperparameters or experimenting with different architectures.
Source Code:
import tensorflow as tf
from tensorflow import keras

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize the input data
x_train = x_train / 255.0
x_test = x_test / 255.0

# Define the model architecture
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10)
])

# Compile the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))

# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print('Test accuracy:', test_acc)

Result:
Thus the Python program to build deep learning NN models was executed
successfully and the output was verified.

