Step-2: Select K random points as centroids (they need not be points from the input dataset).
Step-3: Assign each data point to its closest centroid; this forms the K clusters.
Step-4: Calculate the variance and place a new centroid for each cluster (the mean of its assigned points).
Step-5: Repeat Step-3, i.e. reassign each data point to the new closest centroid, until the assignments no longer change.
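Steps 2 to 5 can be sketched directly in NumPy before turning to the library version below. This is a minimal illustration, not the scikit-learn implementation; the function and variable names are my own.

```python
import numpy as np

def kmeans_sketch(points, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 2: pick k random points as the initial centroids
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 3: assign each point to its closest centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: place the new centroid of each cluster at the cluster mean
        new_centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])
        # Step 5: repeat until the centroids (and hence assignments) stop moving
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```

The sketch omits details a real implementation handles (e.g. empty clusters, k-means++ initialization), which is why the code below uses sklearn's KMeans instead.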
# Importing libraries
import numpy as nm
import pandas as pd
import matplotlib.pyplot as mtp

dataset = pd.read_csv('Mall_Customers_data.csv')
x = dataset.iloc[:, [3, 4]].values  # feature columns (assumed: Annual Income, Spending Score)

# Elbow method: compute WCSS for k = 1 to 10
from sklearn.cluster import KMeans
wcss_list = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(x)
    wcss_list.append(kmeans.inertia_)
mtp.plot(range(1, 11), wcss_list)
mtp.xlabel('Number of clusters (k)')
mtp.ylabel('wcss_list')
mtp.show()
# Training the model on the chosen number of clusters (k = 5)
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)
y_predict = kmeans.fit_predict(x)
mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1], s=100, c='blue', label='Cluster 1')     # first cluster
mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1], s=100, c='green', label='Cluster 2')    # second cluster
mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1], s=100, c='red', label='Cluster 3')      # third cluster
mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1], s=100, c='cyan', label='Cluster 4')     # fourth cluster
mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1], s=100, c='magenta', label='Cluster 5')  # fifth cluster
mtp.title('Clusters of customers')
mtp.legend()
mtp.show()
Exp 6: Classification
Bayes' theorem, also known as Bayes' rule or Bayes' law, is used to determine the probability of a hypothesis from prior knowledge. It depends on conditional probability.
P(B|A) is the likelihood: the probability of the evidence B given that the hypothesis A is true.
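Written out, Bayes' theorem is:

    P(A|B) = P(B|A) · P(A) / P(B)

where P(A|B) is the posterior probability of hypothesis A given evidence B, P(A) is the prior probability of the hypothesis, and P(B) is the marginal probability of the evidence.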
The working of the Naïve Bayes classifier can be understood with the following example:
Suppose we have a dataset of weather conditions and a corresponding target variable "Play". Using this dataset, we need to decide whether or not we should play on a particular day, according to the weather conditions. To solve this problem, we need to follow the steps below:
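As a small worked illustration of the weather/"Play" decision, the posterior for each class can be computed from frequency counts via Bayes' theorem. The counts here are hypothetical, since the dataset itself is not reproduced in the text:

```python
# Hypothetical frequency counts for the weather/"Play" example
# (illustrative numbers only; the actual dataset is not shown in the text)
n_yes, n_no = 10, 4                      # total days with Play = Yes / No
sunny_yes, sunny_no = 3, 2               # sunny days within each class
p_yes = n_yes / (n_yes + n_no)           # prior P(Yes)
p_no = n_no / (n_yes + n_no)             # prior P(No)
p_sunny_given_yes = sunny_yes / n_yes    # likelihood P(Sunny | Yes)
p_sunny_given_no = sunny_no / n_no       # likelihood P(Sunny | No)
# Posteriors up to the common factor P(Sunny), which cancels in the comparison
post_yes = p_sunny_given_yes * p_yes
post_no = p_sunny_given_no * p_no
decision = 'Yes' if post_yes > post_no else 'No'
```

The code below performs the same comparison automatically, with Gaussian likelihoods estimated from the training data.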
import numpy as nm
import pandas as pd

dataset = pd.read_csv('user_data.csv')
x = dataset.iloc[:, [2, 3]].values  # feature columns (assumed: Age, EstimatedSalary)
y = dataset.iloc[:, 4].values       # target column

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)

# Fitting Naive Bayes to the Training set
from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(x_train, y_train)

# Predictions
y_pred = classifier.predict(x_test)