avinash_tiwari_9
Q.9 : To perform K-Means clustering on the Iris dataset and visualize the results
THEORY:
1. Import Libraries:
from sklearn.cluster import KMeans: This imports the KMeans clustering algorithm from
scikit-learn, a popular Python library for machine learning.
import matplotlib.pyplot as plt: This imports pyplot, the plotting interface of the matplotlib
library, under the conventional alias plt.
2. Initialization:
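wcss = []: This initializes an empty list, wcss, to store the within-cluster sum of squares (WCSS)
for each number of clusters tried. WCSS is the sum of the squared distances between each sample and
the centroid of the cluster it is assigned to. scikit-learn exposes this value as the inertia_
attribute of a fitted KMeans object.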
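To make the WCSS definition concrete, here is a minimal sketch that computes it by hand with NumPy.
The helper name wcss_for and the four toy points are made up for illustration; they are not part of
the original assignment.

```python
import numpy as np

def wcss_for(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    diffs = points - centroids[labels]  # pair every point with its own centroid
    return float(np.sum(diffs ** 2))

# Toy data (illustrative only): two obvious groups in 2-D.
points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
labels = np.array([0, 0, 1, 1])                     # cluster index of each point
centroids = np.array([[0.0, 1.0], [10.0, 1.0]])     # mean of each group

wcss = []
wcss.append(wcss_for(points, labels, centroids))
print(wcss)  # each point is 1 unit from its centroid, so this prints [4.0]
```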
4. KMeans Clustering:
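kmeans = KMeans(n_clusters=i, init='k-means++', max_iter=300, n_init=10, random_state=0): This
creates a KMeans clustering object with the current number of clusters (i), where i is the loop
variable that runs over the candidate cluster counts.
init='k-means++': This chooses initial centroids that are spread apart from each other, which
usually speeds up convergence compared with purely random initialization.
max_iter=300: This sets the maximum number of iterations for a single KMeans run to 300. If the
run has not converged within these iterations, it stops and returns its current result.
n_init=10: This sets the number of times the KMeans algorithm is run with different centroid
seeds. The final result is the best of these n_init runs, that is, the one with the lowest inertia.
random_state=0: This fixes the seed of the random number generator so that the results are
reproducible.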
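As a rough sketch of how these parameters are used, the snippet below fits a single KMeans model on
the Iris data. Loading the data with sklearn.datasets.load_iris and fixing n_clusters=3 are
assumptions made here for illustration; the original code may load the data differently.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: use the copy of Iris bundled with scikit-learn (150 samples, 4 features).
X = load_iris().data

# Same settings as described above, with the cluster count fixed at 3 for this example.
kmeans = KMeans(n_clusters=3, init='k-means++', max_iter=300, n_init=10, random_state=0)
kmeans.fit(X)

print(kmeans.inertia_)               # WCSS of the best of the 10 runs
print(kmeans.n_iter_)                # iterations used by that run (at most max_iter)
print(kmeans.cluster_centers_.shape) # one centroid per cluster, one column per feature
```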
In the plot of WCSS against the number of clusters, the x-axis represents the number of clusters
and the y-axis represents the corresponding WCSS values.
By visually inspecting the graph, you can identify the "elbow point," which is the point where the
rate of decrease in WCSS starts to slow down. The number of clusters at this point is usually a
good choice, because adding more clusters beyond it reduces WCSS only slightly.
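The elbow procedure described above can be sketched as follows. The range of 1 to 10 clusters, the
use of load_iris, and the output file name elbow.png are assumptions for this example, not details
taken from the original code.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file; remove this line to use plt.show() interactively
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X = load_iris().data

wcss = []
cluster_range = range(1, 11)  # assumption: try 1 to 10 clusters
for i in cluster_range:
    kmeans = KMeans(n_clusters=i, init='k-means++', max_iter=300, n_init=10, random_state=0)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)  # inertia_ is the WCSS of the fitted model

# Plot WCSS against the number of clusters and look for the bend in the curve.
plt.plot(list(cluster_range), wcss, marker='o')
plt.title('Elbow method')
plt.xlabel('Number of clusters')
plt.ylabel('WCSS')
plt.savefig('elbow.png')
```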
CODE:
wcss = []
target variable named 'target_variable'. Adjust these names according to your actual dataset.
Additionally, customize the preprocessing steps as needed for your specific data.
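Since the question also asks for a visualization of the clusters, one possible sketch is shown
below. It assumes 3 clusters and plots only petal length against petal width (columns 2 and 3 of
the scikit-learn copy of Iris); both choices, and the file name iris_clusters.png, are illustrative
rather than taken from the original code.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file; remove this line to use plt.show() interactively
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X = load_iris().data
n_clusters = 3  # assumption: cluster count chosen for this example

kmeans = KMeans(n_clusters=n_clusters, init='k-means++', max_iter=300, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)  # cluster index (0, 1 or 2) for every sample

# Draw each cluster in its own colour, using two of the four features as axes.
for cluster in range(n_clusters):
    members = X[labels == cluster]
    plt.scatter(members[:, 2], members[:, 3], label=f'Cluster {cluster}')

# Mark the centroids on the same two feature axes.
centers = kmeans.cluster_centers_
plt.scatter(centers[:, 2], centers[:, 3], marker='X', s=200, c='black', label='Centroids')

plt.xlabel('Petal length (cm)')
plt.ylabel('Petal width (cm)')
plt.title('K-Means clusters on the Iris dataset')
plt.legend()
plt.savefig('iris_clusters.png')
```

Clustering is done on all four features; the two petal measurements are used only as plot axes, so
points that look close in the figure may still be separated along the sepal features.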