initialization:
  - select k random cluster centers
repeat:
  “assignment” step:
    - assign each point to the cluster of the nearest center
  “update” step:
    - move the cluster centers to the mean point of each cluster

The same algorithm in expectation-maximization terminology:

initialization:
  - select k random cluster centers
repeat:
  “expectation” step:
    - assign each point to the cluster of the nearest center
  “maximization” step:
    - move the cluster centers to the mean point of each cluster
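The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the choice of initial centers (k distinct data points), the stopping rule, and the handling of empty clusters are assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means on an (m, n) array X; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # initialization: use k distinct data points as the starting centers
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # assignment step: index of the nearest center for every point
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: move each center to the mean of its points
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:      # assumption: an empty cluster keeps its old center
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break                     # centers stopped moving
        centers = new_centers
    return centers, labels

# toy data: two well-separated groups of three points
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centers, labels = kmeans(X, k=2)
print(labels)
```

On this toy data the first three points end up in one cluster and the last three in the other; which cluster gets index 0 depends on the random initialization.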

[Figures: k-means on an example data set, 𝑘 = 4]
● Given data $X = \{\vec{x}^{(1)}, \dots, \vec{x}^{(m)}\} \subset \mathbb{R}^n$ and a number of clusters $k \in \mathbb{N}^+$, minimize the distortion

$$J = \sum_{j=1}^{k} \sum_{\vec{x} \in C_j} \lVert \vec{x} - \vec{\mu}^{(j)} \rVert^2 = \sum_{i=1}^{m} \sum_{j=1}^{k} z_{ij} \lVert \vec{x}^{(i)} - \vec{\mu}^{(j)} \rVert^2$$

  ○ $z_{ij} = \begin{cases} 1 & \text{if } \vec{x}^{(i)} \in C_j \\ 0 & \text{otherwise} \end{cases}$, with $\sum_{j=1}^{k} z_{ij} = 1, \ \forall i$

  ○ $\vec{\mu}^{(j)} \in \mathbb{R}^n$ is the center of cluster $C_j$

● $J$ is a function of the assignments $z_{ij}$ and the centers $\vec{\mu}^{(j)}$
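As a quick check that the two ways of writing $J$ agree, the sketch below evaluates both on a tiny hand-made example. The data, the labels, and the variable names are illustrative assumptions.

```python
import numpy as np

X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]])  # m = 4 points
labels = np.array([0, 0, 1, 1])                 # cluster index of each point
mu = np.array([[1.0, 0.0], [11.0, 0.0]])        # k = 2 cluster centers
k = len(mu)

# first form: sum over clusters, then over the points in each cluster
J_clusters = sum(np.sum((X[labels == j] - mu[j]) ** 2) for j in range(k))

# second form: one-hot indicator matrix z of shape (m, k)
z = np.eye(k)[labels]
sq_dists = np.sum((X[:, None, :] - mu[None, :, :]) ** 2, axis=2)  # (m, k)
J_indicator = np.sum(z * sq_dists)

print(J_clusters, J_indicator)
```

Every point lies at distance 1 from its center, so both expressions evaluate to 4.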

initialization:
  - choose random values for $\vec{\mu}^{(j)}$, for all $j \in \{1, 2, \dots, k\}$
repeat:
  “expectation” step:
    - minimize $J$ w.r.t. $z_{ij}$, keeping $\vec{\mu}^{(j)}$ fixed
  “maximization” step:
    - minimize $J$ w.r.t. $\vec{\mu}^{(j)}$, keeping $z_{ij}$ fixed

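“Expectation” step: minimize $J$ w.r.t. $z_{ij}$, keeping $\vec{\mu}^{(j)}$ fixed

$$J = \sum_{i=1}^{m} \sum_{j=1}^{k} z_{ij} \lVert \vec{x}^{(i)} - \vec{\mu}^{(j)} \rVert^2$$

● Since $\sum_{j} z_{ij} = 1 \ \forall i$, each $\vec{x}^{(i)}$ contributes exactly one term to $J$, so the terms can be minimized independently

  ○ $J$ is minimized by
  $$z_{ij^*} := \begin{cases} 1 & \text{if } j^* = \operatorname{argmin}_j \lVert \vec{x}^{(i)} - \vec{\mu}^{(j)} \rVert \\ 0 & \text{otherwise} \end{cases}$$

● That is, each $\vec{x}^{(i)}$ is assigned to the cluster of the nearest center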
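A minimal NumPy sketch of this step, assuming the points are the rows of `X` and the centers the rows of `mu` (both names, and the helper `e_step`, are illustrative):

```python
import numpy as np

def e_step(X, mu):
    """One-hot assignment matrix z (m, k) that minimizes J for fixed centers mu."""
    dists = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)  # (m, k)
    nearest = dists.argmin(axis=1)           # j* for every point
    z = np.zeros_like(dists)
    z[np.arange(len(X)), nearest] = 1.0      # exactly one 1 per row
    return z

X = np.array([[0.0, 0.0], [1.0, 0.0], [9.0, 0.0]])
mu = np.array([[0.0, 0.0], [10.0, 0.0]])
z = e_step(X, mu)
print(z)
```

The first two points are closer to the first center and the third point is closer to the second, so the rows of `z` are (1, 0), (1, 0), (0, 1). Ties are broken in favor of the lower index, which is a property of `argmin` rather than of the algorithm.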

“Maximization” step: minimize $J$ w.r.t. $\vec{\mu}^{(j)}$, keeping $z_{ij}$ fixed

$$J = \sum_{i=1}^{m} \sum_{j=1}^{k} z_{ij} \lVert \vec{x}^{(i)} - \vec{\mu}^{(j)} \rVert^2$$

● $J$ is a quadratic function of $\vec{\mu}^{(j)}$ $\implies$ set $\dfrac{\partial J}{\partial \vec{\mu}^{(j)}} = 0$

$$\frac{\partial J}{\partial \vec{\mu}^{(j)}} = -2 \sum_{i} z_{ij} \left( \vec{x}^{(i)} - \vec{\mu}^{(j)} \right) = -2 \sum_{i} z_{ij} \vec{x}^{(i)} + 2 \sum_{i} z_{ij} \vec{\mu}^{(j)} = 0 \implies \vec{\mu}^{(j)} = \frac{\sum_{i} z_{ij} \vec{x}^{(i)}}{\sum_{i} z_{ij}} = \frac{1}{|C_j|} \sum_{\vec{x} \in C_j} \vec{x}$$

● $\vec{\mu}^{(j)}$ is the mean of the points assigned to $C_j$
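The result can be checked numerically. The sketch below computes the centers from a fixed assignment matrix, verifies that the gradient vanishes there, and compares $J$ at the means with $J$ at shifted centers. The helper names `m_step` and `distortion` and the toy data are assumptions for illustration; the code also assumes no cluster is empty.

```python
import numpy as np

def m_step(X, z):
    """Centers minimizing J for fixed assignments z (m, k); assumes no empty cluster."""
    return (z.T @ X) / z.sum(axis=0)[:, None]

def distortion(X, z, mu):
    sq_dists = np.sum((X[:, None, :] - mu[None, :, :]) ** 2, axis=2)
    return np.sum(z * sq_dists)

X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 4.0], [12.0, 6.0]])
z = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
mu = m_step(X, z)                      # cluster means: (1, 0) and (11, 5)

# gradient of J w.r.t. each center: -2 * sum_i z_ij (x^(i) - mu^(j))
grad = -2 * (z.T @ X - z.sum(axis=0)[:, None] * mu)

J_mean = distortion(X, z, mu)
J_shifted = distortion(X, z, mu + np.array([0.5, -0.5]))
print(grad, J_mean, J_shifted)
```

Here the gradient is zero at the cluster means, $J$ equals 6 there, and shifting both centers by (0.5, -0.5) raises it to 8.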


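Silhouette score of a point $\vec{x}^{(i)}$:

$$s(\vec{x}^{(i)}) = \frac{b(\vec{x}^{(i)}) - a(\vec{x}^{(i)})}{\max\{a(\vec{x}^{(i)}),\, b(\vec{x}^{(i)})\}}$$

● $a(\vec{x}^{(i)})$: mean distance from $\vec{x}^{(i)}$ to the other points in its own cluster

● $b(\vec{x}^{(i)})$: mean distance from $\vec{x}^{(i)}$ to the points of the nearest other cluster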

k-means++ initialization:

choose the first center uniformly at random from the data points
repeat until all $k$ centers have been chosen:
  - compute $D(\vec{x}^{(i)})$, the distance from $\vec{x}^{(i)}$ to the nearest chosen center
  - choose a new center at random with probability $P(\vec{x}^{(i)}) \propto D(\vec{x}^{(i)})^2$
run the standard k-means algorithm
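The seeding procedure above can be sketched as follows. The function name `kmeans_pp_init` and the toy data are assumptions, and the code assumes the rows of `X` are distinct so that the sampling weights never sum to zero.

```python
import numpy as np

def kmeans_pp_init(X, k, seed=0):
    """Choose k initial centers from the rows of X by D^2-weighted sampling."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]            # first center: uniform over the data
    while len(centers) < k:
        # squared distance from every point to its nearest chosen center
        diffs = X[:, None, :] - np.array(centers)[None, :, :]
        d2 = np.min(np.sum(diffs ** 2, axis=2), axis=1)
        probs = d2 / d2.sum()                      # P(x) proportional to D(x)^2
        centers.append(X[rng.choice(len(X), p=probs)])
    return np.array(centers)

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [20.0, 0.0]])
init = kmeans_pp_init(X, k=3)
print(init)
```

A point that is already a center has $D = 0$ and therefore probability zero, so the same point is never picked twice. Points far from all chosen centers are favored, which tends to spread the initial centers over the data.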



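Rand index of two clusterings of the same $n$ points (here $n$ is the number of points):

$$R = \frac{a + b}{n(n-1)/2}$$

○ $a$: number of pairs of points that are in the same cluster in both clusterings

○ $b$: number of pairs of points that are in different clusters in both clusterings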
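A direct, unoptimized way to evaluate this formula is to loop over all pairs of points. The function name `rand_index` is an assumption for this sketch; it computes the plain Rand index, not the chance-corrected (adjusted) variant.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree."""
    n = len(labels_a)
    agree = 0
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:      # counts a (same/same) and b (different/different)
            agree += 1
    return agree / (n * (n - 1) / 2)

# identical partitions, only the label names are swapped
r_same = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# one point moved to the other cluster
r_moved = rand_index([0, 0, 1, 1], [0, 0, 0, 1])
print(r_same, r_moved)
```

The first comparison gives 1.0 because the index depends only on which points are grouped together, not on the label values. In the second, three of the six pairs agree, giving 0.5.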

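● Hard assignment: each $\vec{x}^{(i)}$ belongs to exactly one cluster

$$z_{ij^*} := \begin{cases} 1 & \text{if } j^* = \operatorname{argmin}_j \lVert \vec{x}^{(i)} - \vec{\mu}^{(j)} \rVert \\ 0 & \text{otherwise} \end{cases}$$

● Soft assignment: $z_{ij} \in \mathbb{R}$ is the degree to which $\vec{x}^{(i)}$ belongs to $C_j$

$$z_{ij}^{*} := \frac{e^{-\beta \lVert \vec{x}^{(i)} - \vec{\mu}^{(j)} \rVert}}{\sum_{j'} e^{-\beta \lVert \vec{x}^{(i)} - \vec{\mu}^{(j')} \rVert}}, \qquad \vec{\mu}^{(j)} = \frac{\sum_{i} z_{ij} \vec{x}^{(i)}}{\sum_{i} z_{ij}}$$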
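The soft assignment is a softmax over the negative scaled distances, and the center update becomes a weighted mean. The sketch below illustrates both; the function names and the values of $\beta$ are assumptions chosen for the example.

```python
import numpy as np

def soft_assign(X, mu, beta):
    """Soft assignments z (m, k): softmax over -beta * distance to each center."""
    dists = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)
    logits = -beta * dists
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability only
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def soft_update(X, z):
    """Weighted means: mu^(j) = sum_i z_ij x^(i) / sum_i z_ij."""
    return (z.T @ X) / z.sum(axis=0)[:, None]

X = np.array([[0.0], [1.0], [5.0], [9.0], [10.0]])
mu = np.array([[0.0], [10.0]])
z_soft = soft_assign(X, mu, beta=1.0)
z_hard = soft_assign(X, mu, beta=50.0)    # large beta approaches the hard assignment
mu_new = soft_update(X, z_soft)
print(z_soft.round(3))
```

Each row of `z_soft` sums to 1. The point at 5 is equidistant from both centers and receives weight 0.5 for each, whatever the value of $\beta$. For the other points, increasing $\beta$ pushes the weights toward 0 and 1, which recovers the hard assignment.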

● Assignment step:

$$z_{ij^*} = \begin{cases} 1 & \text{if } j^* = \operatorname{argmin}_j \lVert \vec{x}^{(i)} - \vec{\mu}^{(j)} \rVert \\ 0 & \text{otherwise} \end{cases}$$

The squared distance can be written in terms of inner products:

$$\begin{aligned}
\lVert \vec{x}^{(i)} - \vec{\mu}^{(j)} \rVert^2
&= \langle \vec{x}^{(i)} - \vec{\mu}^{(j)},\, \vec{x}^{(i)} - \vec{\mu}^{(j)} \rangle
 = \langle \vec{x}^{(i)}, \vec{x}^{(i)} \rangle - 2 \langle \vec{x}^{(i)}, \vec{\mu}^{(j)} \rangle + \langle \vec{\mu}^{(j)}, \vec{\mu}^{(j)} \rangle \\
&= \lVert \vec{x}^{(i)} \rVert^2 - 2 \Big\langle \vec{x}^{(i)},\, \frac{1}{|C_j|} \sum_{\vec{x}' \in C_j} \vec{x}' \Big\rangle + \Big\langle \frac{1}{|C_j|} \sum_{\vec{x}' \in C_j} \vec{x}',\, \frac{1}{|C_j|} \sum_{\vec{x}'' \in C_j} \vec{x}'' \Big\rangle \\
&= \lVert \vec{x}^{(i)} \rVert^2 - \frac{2}{|C_j|} \sum_{\vec{x}' \in C_j} \langle \vec{x}^{(i)}, \vec{x}' \rangle + \frac{1}{|C_j|^2} \sum_{\vec{x}' \in C_j} \sum_{\vec{x}'' \in C_j} \langle \vec{x}', \vec{x}'' \rangle
\end{aligned}$$

The term $\lVert \vec{x}^{(i)} \rVert^2$ does not depend on $j$ and can be dropped from the minimization:

$$\operatorname{argmin}_j \lVert \vec{x}^{(i)} - \vec{\mu}^{(j)} \rVert = \operatorname{argmin}_j \left[ -\frac{2}{|C_j|} \sum_{\vec{x}' \in C_j} \langle \vec{x}^{(i)}, \vec{x}' \rangle + \frac{1}{|C_j|^2} \sum_{\vec{x}' \in C_j} \sum_{\vec{x}'' \in C_j} \langle \vec{x}', \vec{x}'' \rangle \right]$$

The assignment therefore depends on the data only through inner products between points.
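The identity can be verified numerically: the inner-product expression and the squared distance to the cluster mean should differ only by the constant $\lVert \vec{x}^{(i)} \rVert^2$. The helper name and the toy clusters below are assumptions for this check.

```python
import numpy as np

def score_from_inner_products(x, cluster):
    """-2/|C| sum <x, x'> + 1/|C|^2 sum sum <x', x''>, with the points of C as rows."""
    n_c = len(cluster)
    cross = -2.0 / n_c * np.sum(cluster @ x)
    within = np.sum(cluster @ cluster.T) / n_c ** 2
    return cross + within

clusters = [np.array([[0.0, 0.0], [2.0, 0.0]]),
            np.array([[10.0, 4.0], [12.0, 6.0], [11.0, 5.0]])]
x = np.array([3.0, 1.0])

scores = [score_from_inner_products(x, C) for C in clusters]
sq_dists = [np.sum((x - C.mean(axis=0)) ** 2) for C in clusters]

# both lists differ by the same constant ||x||^2, so they share the same argmin
offsets = [d - s for d, s in zip(sq_dists, scores)]
print(scores, sq_dists, offsets)
```

For this example the scores are -5 and 70, the squared distances are 5 and 80, and the offset is 10 in both cases, which equals $\lVert \vec{x} \rVert^2 = 3^2 + 1^2$.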



Kernel k-means: replace the inner products by a kernel function $K$

$$z_{ij^*} := \begin{cases} 1 & \text{if } j^* = \operatorname{argmin}_j \left[ -\dfrac{2}{|C_j|} \sum\limits_{\vec{x}' \in C_j} K(\vec{x}^{(i)}, \vec{x}') + \dfrac{1}{|C_j|^2} \sum\limits_{\vec{x}' \in C_j} \sum\limits_{\vec{x}'' \in C_j} K(\vec{x}', \vec{x}'') \right] \\ 0 & \text{otherwise} \end{cases}$$

○ $z_{ij} = \begin{cases} 1 & \text{if } \vec{x}^{(i)} \in C_j \\ 0 & \text{otherwise} \end{cases}$ (the current assignment, which defines $C_j$)

○ $K: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is the kernel function

Example with $k = 2$ and the Gaussian kernel:

$$K(\vec{x}, \vec{y}) = e^{-\frac{\lVert \vec{x} - \vec{y} \rVert^2}{2\sigma^2}}$$
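A compact sketch of the resulting procedure on a precomputed kernel matrix is given below. The function names, the initial labeling, the value of $\sigma$, and the treatment of empty clusters are assumptions made for the illustration.

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma):
    """K[a, b] = exp(-||x_a - x_b||^2 / (2 sigma^2))."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def kernel_kmeans(K, labels, k, n_iter=50):
    """Kernel k-means on a precomputed (m, m) kernel matrix, starting from `labels`."""
    labels = labels.copy()
    for _ in range(n_iter):
        scores = np.full((len(K), k), np.inf)
        for j in range(k):
            members = labels == j
            n_j = members.sum()
            if n_j == 0:
                continue              # assumption: an empty cluster attracts no points
            cross = -2.0 / n_j * K[:, members].sum(axis=1)
            within = K[np.ix_(members, members)].sum() / n_j ** 2
            scores[:, j] = cross + within
        new_labels = scores.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                     # assignments stopped changing
        labels = new_labels
    return labels

X = np.array([[0.0], [0.5], [1.0], [10.0], [10.5], [11.0]])
K = gaussian_kernel_matrix(X, sigma=2.0)
start = np.array([0, 0, 1, 1, 1, 1])      # third point starts in the wrong group
labels = kernel_kmeans(K, start, k=2)
print(labels)
```

Starting from a labeling in which the third point is grouped with the far cluster, the update moves it to the near one and the labels settle at 0, 0, 0, 1, 1, 1. No cluster centers are ever computed explicitly; only entries of the kernel matrix are used.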

from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, silhouette_samples, adjusted_rand_score
from sklearn.mixture import GaussianMixture

# X: data array of shape (m, n), assumed to be defined already
# k = 4, k-means++ initialization, best of 10 runs
# (both set explicitly, since the defaults differ between scikit-learn versions)
km = KMeans(n_clusters=4, init="k-means++", n_init=10)
km.fit(X)             # run the algorithm, compute the cluster centers
y = km.predict(X)     # cluster assignment for the points it was fitted on
km.cluster_centers_   # coordinates of the cluster centers
km.inertia_           # final distortion value

silhouette_score(X, y)  # mean silhouette score over all samples
