Hierarchical Clustering
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from scipy.cluster.hierarchy import dendrogram
dend = dendrogram(wardlink)
dend = dendrogram(wardlink,
                  truncate_mode='lastp',
                  p=10)
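The `wardlink` matrix passed to `dendrogram` is not defined in this excerpt. The following is a minimal sketch of how such a matrix could be built with SciPy's `linkage`, assuming Ward's method (as the variable name suggests). The array `X` is hypothetical stand-in data, not the notebook's dataset, and `no_plot=True` is used only so the sketch runs without a display.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Hypothetical stand-in data: three well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(20, 2)) for c in (0, 5, 10)])

# Ward linkage: each merge joins the pair of clusters that gives the
# smallest increase in total within-cluster variance.
wardlink = linkage(X, method="ward")

# truncate_mode='lastp' with p=10 keeps only the last 10 merged clusters
# as leaves; no_plot=True returns the layout dict without drawing.
dend = dendrogram(wardlink, truncate_mode="lastp", p=10, no_plot=True)
print(wardlink.shape, len(dend["leaves"]))
```

Dropping `no_plot=True` draws the truncated tree on the current matplotlib axes, as in the notebook cell above.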
#Method 1
clusters
# Method 2
clusters
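The assignments behind "Method 1" and "Method 2" are missing from this excerpt. A common pair of approaches, shown here only as an assumption about what the two methods might be, is SciPy's `fcluster` with `criterion='maxclust'` (fix the number of clusters) and `criterion='distance'` (cut the tree at a height). The data `X`, the cluster count of 3, and the threshold are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical stand-in data: three well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(20, 2)) for c in (0, 5, 10)])
wardlink = linkage(X, method="ward")

# Possible Method 1: request at most 3 flat clusters.
clusters_maxclust = fcluster(wardlink, t=3, criterion="maxclust")

# Possible Method 2: cut the tree at a distance threshold. The threshold
# here sits halfway between the third-last and second-last merge heights,
# so the last two merges are undone and 3 clusters remain.
threshold = (wardlink[-3, 2] + wardlink[-2, 2]) / 2
clusters_distance = fcluster(wardlink, t=threshold, criterion="distance")

print(np.unique(clusters_maxclust), np.unique(clusters_distance))
```

In practice the distance threshold is usually read off the dendrogram's vertical axis rather than computed from the linkage matrix.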
df.head()
Cluster Frequency
df.clusters.value_counts().sort_index()
Cluster Profiles
aggdata=df.iloc[:,1:8].groupby('clusters').mean()
aggdata['Freq']=df.clusters.value_counts().sort_index()
aggdata
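To make the profiling step concrete, here is a small sketch of the same group-by-mean-plus-frequency pattern on a toy frame. The column names `score_a` and `score_b` and all values are hypothetical; only `SR_NO`, `clusters`, and `Freq` come from the notebook.

```python
import pandas as pd

# Hypothetical toy data: six rows already labelled with a cluster id.
df = pd.DataFrame({
    "SR_NO": [1, 2, 3, 4, 5, 6],
    "score_a": [9.0, 8.0, 5.0, 4.0, 1.0, 2.0],
    "score_b": [7.0, 9.0, 6.0, 4.0, 3.0, 1.0],
    "clusters": [1, 1, 2, 2, 3, 3],
})

# Mean of every feature per cluster (the serial number is not a feature).
aggdata = df.drop(columns="SR_NO").groupby("clusters").mean()

# Cluster sizes; the assignment aligns on the cluster label index.
aggdata["Freq"] = df["clusters"].value_counts().sort_index()
print(aggdata)
```

Selecting columns by name, as here, is less fragile than a positional slice such as `iloc[:, 1:8]` if the column order ever changes.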
df["Agglo_CLusters"]=Cluster_agglo
df.columns
agglo_data=df.drop(["SR_NO","clusters"],axis=1).groupby('Agglo_CLusters').mean()
agglo_data['Freq']=df.Agglo_CLusters.value_counts().sort_index()
agglo_data
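The `Cluster_agglo` labels used above are not defined in this excerpt. One plausible source, offered as an assumption rather than the notebook's actual code, is scikit-learn's `AgglomerativeClustering`. The data `X` and the choice of 3 clusters with Ward linkage are hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Hypothetical stand-in data: three well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(20, 2)) for c in (0, 5, 10)])

# Bottom-up clustering with Ward linkage, stopped at 3 clusters.
model = AgglomerativeClustering(n_clusters=3, linkage="ward")

# fit_predict returns one integer label per row.
Cluster_agglo = model.fit_predict(X)
print(np.bincount(Cluster_agglo))
```

Note that scikit-learn labels start at 0, whereas SciPy's `fcluster` labels start at 1, so the two label columns are not directly comparable by value.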
Recommendations
1. For companies hiring, the go-to colleges for placements are Tier 1 colleges, followed by Tier 2
colleges.
2. For companies providing training programs to staff and students, the go-to colleges are Tier 2
and Tier 3 colleges, since Tier 1 is comparatively performing better.
3. Tier 3 colleges will need to concentrate more on marketing and advertising their
campuses to create awareness and attract students.
4. Students looking to enroll in a college can give priority to Tier 1 over Tier 2 and Tier 3 colleges.
Saving the Cluster Profiles in a CSV file
#aggdata.to_csv('enggdata_hc.csv')
K-Means Clustering
import pandas as pd
import numpy as np
%matplotlib inline