[4]: data
[95 rows x 7 columns]
plt.xlabel("Atomic Number")
plt.ylabel("Electronegativity")
plt.title("Electronegativity vs. Atomic Number")
plt.grid(True)
I was able to plot the line directly because the data was already sorted by atomic number.
Otherwise, we would first have to sort the rows by atomic number before plotting.
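As a minimal sketch of that sorting step, assuming the data arrived in shuffled order: the pairs below are a few illustrative Pauling electronegativity values, and the variable names are my own, not the notebook's.

```python
# Toy (atomic number, electronegativity) pairs in shuffled order.
records = [(9, 3.98), (3, 0.98), (8, 3.44), (6, 2.55), (1, 2.20)]

# Sort by atomic number so a line plot connects neighbours along the x-axis
# instead of zig-zagging back and forth.
records_sorted = sorted(records, key=lambda pair: pair[0])

atomic_number_sorted = [z for z, _ in records_sorted]
electronegativity_sorted = [en for _, en in records_sorted]

print(atomic_number_sorted)
print(electronegativity_sorted)
```

With a pandas DataFrame, the same thing can be done with `sort_values` on the atomic-number column (the column name is not shown here, so it is left out).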
Number of peaks observed: 15
# All peaks that remain after merging
plt.scatter(atomic_number[merged_peaks], electronegativity[merged_peaks], color='Brown', label="Peaks (Margined)")
# The two peaks the algorithm reports in addition to the ones visible by eye
plt.scatter(atomic_number[merged_peaks[6]], electronegativity[merged_peaks[6]], color='Blue', label="Extra peaks")
plt.scatter(atomic_number[merged_peaks[8]], electronegativity[merged_peaks[8]], color='Blue')
0.0.1 After merging peaks, the algorithm gives 11 peaks, while I observe 9 peaks
The peaks colored in red total 9, which matches what I see by eye. The blue ones indicated in the
graph above are the 2 extra peaks computed by the algorithm. This is quite plausible: the noise
around a peak can create a local maximum, which the algorithm then counts as a peak, although it
is not a significant peak for our data.
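To illustrate how noise produces extra local maxima and how merging reduces them, here is a minimal sketch. It is not necessarily the merging algorithm used in this notebook: `find_local_maxima`, `merge_close_peaks`, the toy `signal`, and the `margin` value are all assumptions made for the example.

```python
def find_local_maxima(values):
    """Return indices whose value is strictly greater than both neighbours."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


def merge_close_peaks(values, peaks, margin):
    """Group consecutive peaks that are at most `margin` indices apart
    and keep only the tallest peak of each group."""
    merged = []
    group = []
    for p in peaks:
        # A gap larger than the margin closes the current group.
        if group and p - group[-1] > margin:
            merged.append(max(group, key=lambda i: values[i]))
            group = []
        group.append(p)
    if group:
        merged.append(max(group, key=lambda i: values[i]))
    return merged


# Two real bumps, each with a small noisy side maximum next to it.
signal = [0.0, 1.0, 0.8, 1.2, 0.3, 0.1, 0.2, 2.0, 1.9, 1.95, 0.5]

raw = find_local_maxima(signal)
merged = merge_close_peaks(signal, raw, margin=2)

print(raw)     # [1, 3, 7, 9]
print(merged)  # [3, 7]
```

Merging by index distance removes noisy maxima that sit close to a taller peak, but an isolated noisy bump survives, which is one way to end up with more peaks than the eye counts. Filtering by peak height relative to its surroundings (for example the `prominence` argument of `scipy.signal.find_peaks`) is another option worth trying.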
[169]: clusters.items()
[169]: dict_items([(0, [1, 2, 3, 4, 5, 6, 7, 8, 9]), (1, [10, 11, 12, 13, 14, 15, 16,
17]), (2, [18, 19, 20, 21, 22, 23, 24]), (3, [25, 26, 27, 28, 29, 30, 31, 32,
33, 34, 35, 36]), (4, [37, 38, 39, 40, 41, 42]), (5, [43, 44, 45, 46, 47, 48,
49, 50, 51, 52, 53]), (6, [54, 55, 56, 57, 58, 59, 60]), (7, [61, 62, 63, 64,
65, 66, 67, 68, 69]), (8, [70, 71, 72, 73, 74]), (9, [75, 76, 77, 78, 79]), (10,
[80, 81, 82, 83, 84, 85, 86, 88, 89, 90, 91, 92, 93, 94, 95, 96])])
[170]: # Step 8: Define a function to create a polygon from the bounding box
def create_polygon(x, y, margin):
    x_min, x_max = min(x) - margin, max(x) + margin
    y_min, y_max = min(y) - margin, max(y) + margin
    return [(x_min, y_min), (x_min, y_max), (x_max, y_max), (x_max, y_min)]

margins = [2,1,1,1,1,1,1,1,1,1,2]
for idx, (cluster_idx, elements) in enumerate(clusters.items()):
    cluster_x = atomic_number[electronegativity.index.isin(elements)].tolist()
    cluster_y = electronegativity[electronegativity.index.isin(elements)].tolist()
plt.legend()
plt.show()
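As a small worked example of what the bounding-box helper returns, the sketch below repeats `create_polygon` and calls it on a made-up cluster. The toy coordinates and the step that closes the outline are assumptions for illustration, not the part of the cell that is cut off above.

```python
def create_polygon(x, y, margin):
    # Same bounding-box logic as the helper defined in the notebook.
    x_min, x_max = min(x) - margin, max(x) + margin
    y_min, y_max = min(y) - margin, max(y) + margin
    return [(x_min, y_min), (x_min, y_max), (x_max, y_max), (x_max, y_min)]


# Hypothetical cluster: three elements with made-up y values.
cluster_x = [10, 11, 12]
cluster_y = [1.0, 1.5, 3.0]

corners = create_polygon(cluster_x, cluster_y, margin=1)
print(corners)  # [(9, 0.0), (9, 4.0), (13, 4.0), (13, 0.0)]

# To draw the box as a closed outline, repeat the first corner at the end.
outline = corners + [corners[0]]
outline_x = [px for px, _ in outline]
outline_y = [py for _, py in outline]
```

Note that the same margin is applied on both axes. Since the atomic number spans a much wider range than the electronegativity, a margin of 1 is small horizontally but large vertically, so separate x and y margins might give tighter boxes.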
for i in range(len(kmeans.cluster_centers_)):
    cluster_points = X[y_kmeans == i]
    hull = ConvexHull(cluster_points)
    plt.fill(cluster_points[hull.vertices, 0], cluster_points[hull.vertices, 1], alpha=0.15, color='blue')
plt.legend()
plt.show()
The above is the final result after applying the KMeans algorithm, which does a pretty decent job
of clustering the elements. Each shaded region is the convex hull of one cluster's points.
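To make clearer what the shaded hull around each cluster represents, here is a minimal pure-Python sketch of a 2D convex hull. It uses Andrew's monotone chain method, which is a stand-in chosen for readability and not the Qhull routine behind `scipy.spatial.ConvexHull`; the function names and toy points are assumptions.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counterclockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower hull from left to right.
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper hull from right to left.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # Drop the last point of each half because it starts the other half.
    return lower[:-1] + upper[:-1]


# Toy cluster: four corners of a square plus an interior and an edge point.
cluster_points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(cluster_points)
print(hull)  # [(0, 0), (2, 0), (2, 2), (0, 2)]
```

Only the outermost points end up in the hull; interior points and points lying on an edge are dropped. This is why filling the polygon through the hull vertices shades the whole cluster region.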