Laboratory Manual
2018 Scheme
DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND
MACHINE LEARNING
M1: Provide state-of-the-art infrastructure, tools and facilities to make students competent and achieve
excellence in education and research.
M2: Provide a strong theoretical and practical knowledge across the AIML discipline with an emphasis on
AI based research and software development.
M3: Inculcate strong ethical values, professional behavior and leadership abilities through various
curricular, co-curricular, training and development activities.
Program Educational Objectives (PEOs)
PEO1: Graduates will be able to follow a logical, practical and research-oriented approach for solving
real-world problems by providing AI based solutions.
PEO2: Graduates will be able to work independently as well as in multidisciplinary teams in the workplace.
PEO4: Graduates will be able to set up start-ups and become successful entrepreneurs.
Program Specific Outcomes (PSOs)
PSO - 1: Train machine learning models to address real life challenging problems using acquired AI knowledge.
PSO - 2: Develop applications using ML techniques related to the field of medical, agriculture, defense,
education and various scientific explorations.
Course Index: Course Outcome
C317.2: Students will apply K-means and EM clustering algorithms to a CSV dataset, enabling a comparative
evaluation of their clustering quality.
C317.3: Students will implement and apply the Locally Weighted Regression algorithm to fit data points,
gaining practical experience and visualization skills.
1. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set.
Print both correct and wrong predictions.
Program:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.neighbors import KNeighborsClassifier as knn

# Load the data: every column except the last is a feature, the last is the class label
data = pd.read_csv('iris.csv')
x = np.array(data.iloc[:, :-1])
y = np.array(data.iloc[:, -1])

# Hold out half of the rows for testing
xtr, xt, ytr, yt = train_test_split(x, y, test_size=.5)

clf = knn(n_neighbors=3)
clf.fit(xtr, ytr)
y_pred = clf.predict(xt)

# Iterate over the test set (not the training set) when comparing predictions
n = len(xt)
for i in range(0, n):
    if yt[i] == y_pred[i]:
        print(yt[i], " is correctly predicted as: ", y_pred[i])
    else:
        print(yt[i], " is wrongly predicted as: ", y_pred[i])

print("accuracy= ", accuracy_score(yt, y_pred))
# confusion_matrix expects (y_true, y_pred): rows are true classes, columns are predictions
print("confusion matrix:\n", confusion_matrix(yt, y_pred))
The program classifies the Iris dataset with the k-Nearest Neighbors (k-NN) algorithm. It imports pandas,
numpy, and the scikit-learn components train_test_split, accuracy_score, confusion_matrix, and
KNeighborsClassifier, then loads the dataset from iris.csv into a DataFrame. All columns except the last
form the input features (x); the last column is the target variable (y). train_test_split holds out half
of the rows (test_size=.5) as the test set. A k-NN classifier with n_neighbors=3 is instantiated and
trained on the training data, and predictions are made on the test set. A loop over the test instances
prints whether each one is correctly or wrongly predicted. Finally, accuracy_score reports the overall
accuracy and confusion_matrix shows how the predictions are distributed across the classes.
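The voting rule that KNeighborsClassifier applies can be illustrated without scikit-learn. The following is a minimal sketch, not the library's implementation: the function name knn_predict and the six-point data set are made up for illustration, and plain Euclidean distance with an unweighted majority vote is assumed.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    # Count the class labels of the k closest points and return the most frequent one
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-class data: three points near (1, 1) and three near (5, 5)
train_x = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
train_y = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_x, train_y, (1.1, 1.0)))
print(knn_predict(train_x, train_y, (4.9, 5.1)))
```

With k=3 each query is decided by its three closest neighbours, which is the same setting the program uses through n_neighbors=3.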
Output:
2. Develop a program to apply K-means algorithm to cluster a set of data stored in .CSV file. Use
the same data set for clustering using EM algorithm. Compare the results of these two algorithms
and comment on the quality of clustering.
Program:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.mixture import GaussianMixture

dataset = load_iris()
x = pd.DataFrame(dataset.data)
x.columns = ['Sepal_length', 'Sepal_width', 'Petal_length', 'Petal_width']
y = dataset.target

# Standardize the features to zero mean and unit variance
scaler = preprocessing.StandardScaler()
scaler.fit(x)
xsa = scaler.transform(x)
xs = pd.DataFrame(xsa, columns=x.columns)

gmm = GaussianMixture(n_components=3)
# Three clusters, matching the three Iris species and the GMM, so the results are comparable
kmeans = KMeans(n_clusters=3)
gmm.fit(xs)
kmeans.fit(xs)
y_cluster_gmm = gmm.predict(xs)
k_cluster_kmeans = kmeans.predict(xs)

# One colour per class/cluster index (the colours are an assumption;
# colormap is used but not defined in the original listing)
colormap = np.array(['red', 'lime', 'black'])

plt.figure(figsize=(14, 7))
plt.subplot(1, 3, 1)
plt.scatter(x.Petal_length, x.Petal_width, c=colormap[y], s=40)
plt.title('Real')
plt.subplot(1, 3, 2)
plt.scatter(x.Petal_length, x.Petal_width, c=colormap[y_cluster_gmm], s=40)
plt.title("GMM clustering")
plt.subplot(1, 3, 3)
plt.scatter(x.Petal_length, x.Petal_width, c=colormap[k_cluster_kmeans], s=40)
plt.title("Kmeans clustering")
plt.show()
The program applies the K-means and Expectation-Maximization (EM) clustering algorithms to the Iris
dataset. In this listing the data is loaded with scikit-learn's load_iris; a CSV file with the same four
feature columns could be read with pandas instead. The features are standardized before a Gaussian
Mixture Model (GMM), which is fitted by EM, and KMeans are applied. Three scatter plots of petal length
against petal width compare the real classes with the clusters predicted by GMM and by K-means, which
gives a visual assessment of the clustering quality. accuracy_score is imported, but cluster indices are
arbitrary, so they would first have to be matched to the class labels before an accuracy figure is
meaningful.
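One way to put a number on the comparison is to score each clustering by how well its clusters line up with the known classes. The sketch below is an illustration under assumptions, not part of the manual's program: the helper name cluster_accuracy and the nine sample labels are made up, and each cluster is simply credited with its majority class (a measure often called purity).

```python
from collections import Counter

def cluster_accuracy(true_labels, cluster_ids):
    """Fraction of points whose true class is the majority class of their cluster."""
    correct = 0
    for cid in set(cluster_ids):
        # True classes of all points assigned to this cluster
        members = [t for t, c in zip(true_labels, cluster_ids) if c == cid]
        # Credit the cluster with the size of its largest class
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(true_labels)

# Hypothetical labels: cluster ids are permuted relative to the classes,
# and one point of class 1 has been placed in the cluster dominated by class 2
true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
clusters = [2, 2, 2, 0, 0, 1, 1, 1, 1]

print(cluster_accuracy(true, clusters))
```

The same call could be applied to y with y_cluster_gmm and to y with k_cluster_kmeans. The score is only comparable when both models use the same number of clusters, because splitting the data into more clusters can only raise it.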
Output:
3. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data
points. Select appropriate data set for your experiment and draw graphs
Dataset: https://drive.google.com/file/d/1LIVYmyNvwXhhGUwDV1S8I6Gid4-YSG4B/view
Program:
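import numpy as np
import pandas as pd
# Assumption: the lowess module is taken to be the one shipped with the moepy package;
# the original listing does not show its import
from moepy import lowess

data = pd.read_csv('10-dataset.csv')
x = np.array(data.total_bill)
y = np.array(data.tip)
lowess_model = lowess.Lowess()
lowess_model.fit(x, y)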
The Python program implements the Locally Weighted Regression (LOWESS) algorithm to fit data
points from a specified dataset, accessible via a Google Drive link. The dataset, loaded using Pandas,
consists of 'total_bill' and 'tip' columns. The LOWESS algorithm is applied to the data, and the
resulting regression curve is plotted alongside the original data points. The matplotlib library is utilized
for visualization, with the regression curve displayed in black and the data points in red. The program
enhances the understanding of the dataset by fitting a smooth curve through the data points,
highlighting trends and relationships. The choice of the LOWESS algorithm ensures a non-parametric
approach, offering flexibility in capturing complex patterns present in the data.
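The idea behind locally weighted regression can be sketched without any library: to predict at a query point x0, every training point is weighted by its closeness to x0 and a weighted straight line is fitted. This is a minimal sketch under assumptions not taken from the manual: a Gaussian kernel with bandwidth tau, the hypothetical function name lowess_point, and made-up sample data. Library implementations may use a different kernel and add robustness iterations.

```python
import math

def lowess_point(x0, xs, ys, tau=0.5):
    """Evaluate a locally weighted linear fit at x0 (Gaussian kernel, bandwidth tau)."""
    # Points close to x0 get weights near 1, distant points get weights near 0
    w = [math.exp(-((x - x0) ** 2) / (2 * tau ** 2)) for x in xs]
    sw = sum(w)
    # Weighted means of x and y
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    # Weighted least-squares slope of the local line
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

# Hypothetical curved data: a single global line would fit this poorly
xs = [i / 2 for i in range(21)]          # 0.0, 0.5, ..., 10.0
ys = [math.sin(x) for x in xs]

for x0 in (1.0, 4.0, 7.5):
    print(x0, lowess_point(x0, xs, ys), math.sin(x0))
```

A smaller tau follows the data more closely, while a larger tau approaches an ordinary straight-line fit, so the bandwidth plays the role of the smoothing parameter.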
Output:
4. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test
the same using appropriate data sets
Program:
import numpy as np

# Illustrative data (an assumption: the original listing does not show how x and y are defined)
x = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)

# Scale each input feature by its column maximum and the targets into [0, 1]
x = x / np.amax(x, axis=0)
y = y / 100

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def derivatives_sigmoid(x):
    # x is expected to be a sigmoid output (activation), not the raw input
    return x * (1 - x)

epoch = 5000
lr = 0.1
inputlayer_neuron = 2
hiddenlayer_neuron = 3
output_neurons = 1

# Random initial weights and biases
wh = np.random.uniform(size=(inputlayer_neuron, hiddenlayer_neuron))
bh = np.random.uniform(size=(1, hiddenlayer_neuron))
wout = np.random.uniform(size=(hiddenlayer_neuron, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # Forward pass
    hinp1 = np.dot(x, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)

    # Backward pass: propagate the output error back to the hidden layer
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH * hiddengrad

    # Weight updates
    wout += hlayer_act.T.dot(d_output) * lr
    wh += x.T.dot(d_hiddenlayer) * lr

print("Input:\n" + str(x))
print("Actual Output:\n" + str(y))
print("Predicted Output:\n", output)
The Python program implements an Artificial Neural Network (ANN) using the Backpropagation
algorithm. The network consists of an input layer with 2 neurons, a hidden layer with 3 neurons, and
an output layer with 1 neuron. The Backpropagation algorithm is employed to train the network on a
small dataset (x, y). The dataset represents input-output pairs where 'x' is a 2D array of input features,
and 'y' is a 2D array of corresponding output labels. The program initializes random weights and biases
for the network.
The training process involves 5000 epochs with a learning rate (lr) of 0.1. In each epoch, the program
performs a forward pass to calculate the predicted output of the network. The error between the
actual and predicted output is then propagated backwards to update the two weight matrices (wh and
wout); the biases bh and bout keep their initial random values in this listing.
This program provides a basic understanding of how a simple neural network is constructed and
trained using the Backpropagation algorithm.
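One detail of the listing is easy to misread: derivatives_sigmoid is called on values that have already passed through the sigmoid, because the derivative of the sigmoid can be written in terms of its own output as a * (1 - a). The following sketch checks that identity against a numerical derivative. The helper names are made up for illustration and only the standard library is used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def derivative_from_activation(a):
    """Sigmoid derivative expressed through the activation a = sigmoid(z)."""
    return a * (1.0 - a)

def numerical_derivative(f, z, h=1e-6):
    """Central-difference approximation of f'(z)."""
    return (f(z + h) - f(z - h)) / (2 * h)

for z in (-2.0, 0.0, 1.5):
    a = sigmoid(z)
    # The two columns should agree to several decimal places
    print(z, derivative_from_activation(a), numerical_derivative(sigmoid, z))
```

This is why the program passes output and hlayer_act, rather than outinp and hinp, to derivatives_sigmoid.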
Output:
5. Demonstrate Genetic algorithm by taking a suitable data for any simple application
Program:
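import random
import numpy as np

def init_population(pop_size, genome_size):
    # Assumed helper (called below but not defined in the original listing):
    # a list of random binary genomes
    return [[random.randint(0, 1) for _ in range(genome_size)] for _ in range(pop_size)]

def fitness(individual):
    # Fitness is the number of 1s in the genome
    return sum(individual)

def mutation(individual):
    # Flip each bit independently with probability 0.1
    for i in range(len(individual)):
        if random.random() < 0.1:
            individual = individual[:i] + [1 - individual[i]] + individual[i + 1:]
    return individual

pop_size, genome_size = 6, 5
population = init_population(pop_size, genome_size)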
The provided Python program illustrates a basic Genetic Algorithm (GA) designed for binary
optimization. The program initializes a population of binary strings, evaluates their fitness based on
the count of '1's, and employs tournament selection, one-point crossover, and mutation to evolve the
population across generations. The goal is to maximize the number of '1's in the binary strings,
reflecting the optimization objective. The algorithm iterates through multiple generations,
showcasing the genetic evolution process. The program offers a foundational understanding of key
GA components such as selection, crossover, and mutation, and can serve as a starting point for more
complex optimization problems.
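The generation loop described above (tournament selection, one-point crossover, mutation) can be sketched as follows. This is an illustration under assumptions rather than the manual's own code: the function names, the tournament size of 2, the 20 generations and the fixed seed are all hypothetical choices, and only the population size, genome size and mutation rate are taken from the listing.

```python
import random

def fitness(ind):
    return sum(ind)  # number of 1s in the genome

def tournament(pop, rng, k=2):
    """Draw k distinct individuals at random and return the fittest."""
    return max(rng.sample(pop, k), key=fitness)

def one_point_crossover(a, b, rng):
    """Child takes a's genes before a random cut point and b's genes after it."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def mutate(ind, rng, rate=0.1):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in ind]

def evolve(pop_size=6, genome_size=5, generations=20, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_size)] for _ in range(pop_size)]
    for _ in range(generations):
        # Build the next generation: select two parents, recombine, then mutate
        pop = [
            mutate(one_point_crossover(tournament(pop, rng), tournament(pop, rng), rng), rng)
            for _ in range(pop_size)
        ]
    return pop

final = evolve()
print("best fitness:", max(fitness(ind) for ind in final))
```

Because no individual is copied unchanged into the next generation, the best fitness can drop between generations; keeping the fittest individual (elitism) is a common variation.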
Output:
Program:
import numpy as np

# Constants
GRID_SIZE = 3
NUM_ACTIONS = 4
NUM_STATES = GRID_SIZE * GRID_SIZE
START_STATE = 0
GOAL_STATE = NUM_STATES - 1
OBSTACLE_STATE = 4
EPSILON = 0.1
LEARNING_RATE = 0.1
DISCOUNT_FACTOR = 0.9
NUM_EPISODES = 1000

# Initialize Q-table
q_table = np.zeros((NUM_STATES, NUM_ACTIONS))

# Assumed helper (called below but not defined in the original listing):
# row-major state index, the inverse of divmod(state, GRID_SIZE)
def get_state_index(row, col):
    return row * GRID_SIZE + col

# Helper function to get possible next states from the current state
def get_next_states(state):
    row, col = divmod(state, GRID_SIZE)
    possible_next_states = []
    # Check up
    if row > 0:
        possible_next_states.append(get_state_index(row - 1, col))
    # Check down
    if row < GRID_SIZE - 1:
        possible_next_states.append(get_state_index(row + 1, col))
    # Check left
    if col > 0:
        possible_next_states.append(get_state_index(row, col - 1))
    # Check right
    if col < GRID_SIZE - 1:
        possible_next_states.append(get_state_index(row, col + 1))
    return possible_next_states

# Q-learning algorithm
for episode in range(NUM_EPISODES):
    state = START_STATE
    # Take the selected action and observe the next state and reward
    next_states = get_next_states(state)
    next_state = np.random.choice(next_states)
    reward = -1  # Small negative reward for each step
The provided Python program exemplifies the application of the Q-learning algorithm to a grid world
problem. In this simplified scenario, the agent learns to navigate from a starting state to a goal state
while avoiding obstacles. The Q-learning algorithm employs a Q-table to dynamically update its
decision-making strategy based on the observed rewards during exploration. The program iteratively
refines the Q-values associated with state-action pairs, ensuring the agent's policy converges towards
optimal actions that maximize cumulative rewards. The epsilon-greedy strategy balances the agent's
exploration of new actions with the exploitation of learned knowledge. The learned Q-table serves as a
valuable representation of the agent's acquired knowledge, capturing the optimal actions for each state
and facilitating efficient decision-making in similar grid-based environments.
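The epsilon-greedy choice and the Q-value update described above can be sketched end to end on the same 3 x 3 grid. This is a minimal sketch under assumptions, not the manual's program: the reward of 10 for reaching the goal, the rule that blocked moves leave the agent in place, the cap of 100 steps per episode, the fixed seed, and all function names are hypothetical. The grid size, start, goal, obstacle, epsilon, learning rate and discount factor follow the constants in the listing.

```python
import random

GRID, N_ACTIONS = 3, 4                    # actions: 0=up, 1=down, 2=left, 3=right
START, GOAL, OBSTACLE = 0, 8, 4
EPSILON, ALPHA, GAMMA = 0.1, 0.1, 0.9
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def step(state, action):
    """Apply an action; moves off the grid or into the obstacle leave the state unchanged."""
    row, col = divmod(state, GRID)
    r, c = row + MOVES[action][0], col + MOVES[action][1]
    if not (0 <= r < GRID and 0 <= c < GRID) or r * GRID + c == OBSTACLE:
        r, c = row, col
    nxt = r * GRID + c
    reward = 10 if nxt == GOAL else -1    # assumed reward scheme
    return nxt, reward

def train(episodes=1000, max_steps=100, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(GRID * GRID)]
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if rng.random() < EPSILON:
                action = rng.randrange(N_ACTIONS)          # explore
            else:
                best = max(q[state])                       # exploit, breaking ties at random
                action = rng.choice([a for a in range(N_ACTIONS) if q[state][a] == best])
            nxt, reward = step(state, action)
            # Q-learning update: move Q(s, a) towards reward + gamma * max_a' Q(s', a')
            q[state][action] += ALPHA * (reward + GAMMA * max(q[nxt]) - q[state][action])
            state = nxt
            if state == GOAL:
                break
    return q

def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START until GOAL or the step limit."""
    state, path = START, [START]
    while state != GOAL and len(path) <= max_steps:
        action = max(range(N_ACTIONS), key=lambda a: q[state][a])
        state, _ = step(state, action)
        path.append(state)
    return path

q = train()
print(greedy_path(q))
```

Reading the greedy path off the trained table is a quick check that the learned values steer the agent around the obstacle to the goal.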
Output: