1. Explain cross-entropy and the Gini index. How are they used to measure the performance of
decision trees? (5)
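As a minimal sketch for question 1, the two measures can be computed for a single tree node. It assumes the usual node-impurity definitions, G = sum over k of p_k(1 - p_k) and D = -sum over k of p_k log(p_k), where p_k is the proportion of observations of class k in the node. The function names and the class proportions are hypothetical, chosen only for illustration.

```python
import math

def gini_index(proportions):
    """Gini index of one node: sum of p_k * (1 - p_k) over the classes."""
    return sum(p * (1.0 - p) for p in proportions)

def cross_entropy(proportions):
    """Cross-entropy of one node: sum of -p_k * log(p_k) over the classes.

    Classes with p_k = 0 are skipped, since they contribute nothing.
    """
    return sum(-p * math.log(p) for p in proportions if p > 0)

# Hypothetical class proportions for three nodes (illustrative values only).
nodes = {
    "pure": [1.0, 0.0],
    "skewed": [0.9, 0.1],
    "mixed": [0.5, 0.5],
}

for name, proportions in nodes.items():
    g = gini_index(proportions)
    d = cross_entropy(proportions)
    print(f"{name}: Gini = {g:.4f}, cross-entropy = {d:.4f}")
```

Both measures are zero for a pure node and largest for an evenly mixed one, which is why a smaller value indicates a better split.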
2. Determine the radial kernel values among the 3 points 𝑎: (10,2), 𝑏: (3,5) and 𝑐: (5,7). What
information is derived from the radial kernel values? Comment on which of the points 𝑎 and 𝑏
has greater influence on point 𝑐. Assume 𝛾 = 1. (10)
$$K(a, b) = \exp\left(-\gamma \sum_{j=1}^{p} (a_j - b_j)^2\right) = \exp\left(-1\left((10 - 3)^2 + (2 - 5)^2\right)\right)$$
K(a,b) = 6.47023E-26
K(b,c) = 0.000335463
K(a,c) = 1.92875E-22
The radial kernel values show that the pairs (a,b) and (a,c) are far apart and have almost no
influence on each other, since their kernel values are effectively zero. Points b and c are much
closer and have some influence on each other. This indicates that point a is far from both b and c.
Since K(b,c) is far larger than K(a,c), point b has greater influence on point c.
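The kernel values above can be checked with a short Python sketch. The function name radial_kernel and the dictionary layout are hypothetical; the points and 𝛾 = 1 are taken from the question.

```python
import math

def radial_kernel(x, y, gamma=1.0):
    """Radial kernel: exp(-gamma * squared Euclidean distance between x and y)."""
    squared_distance = sum((xj - yj) ** 2 for xj, yj in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Points from question 2.
points = {"a": (10, 2), "b": (3, 5), "c": (5, 7)}

# Kernel value for each pair of points, with gamma = 1 as stated.
kernel_values = {
    (p, q): radial_kernel(points[p], points[q], gamma=1.0)
    for p, q in [("a", "b"), ("b", "c"), ("a", "c")]
}

for (p, q), value in kernel_values.items():
    print(f"K({p},{q}) = {value:.6g}")

# The point with the larger kernel value against c has the greater influence on c.
closest_to_c = "b" if kernel_values[("b", "c")] > kernel_values[("a", "c")] else "a"
print(f"Point with greater influence on c: {closest_to_c}")
```

Because the kernel decays exponentially in the squared distance, even a modest difference in distance produces kernel values that differ by many orders of magnitude.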
3. Form 2 clusters based on the radial kernel values in problem 2. The clusters should be formed
based on similarity. Determine the complete and centroid linkages between these 2 clusters.
(10)
Based on the radial kernel similarity, the two most similar points are b and c. Hence the
two clusters are C1: {a} and C2: {b,c}.
Euclidean distance
D(a,b) : 7.615
D(a,c) : 7.071
Complete linkage: max(𝐷(𝑎, 𝑏), 𝐷(𝑎, 𝑐)) = 7.615 (using Euclidean Distance measure)
Centroid of C1: (10,2)
Centroid of C2: (4,6)
Centroid linkage: D((10,2), (4,6)) = sqrt((10 − 4)^2 + (2 − 6)^2) = sqrt(52) ≈ 7.211
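The linkage calculations for these two clusters can be sketched in Python as follows. The helper names are hypothetical, and Euclidean distance is assumed as the dissimilarity measure, as in the solution.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((xj - yj) ** 2 for xj, yj in zip(x, y)))

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def complete_linkage(cluster_1, cluster_2):
    """Largest pairwise distance between points of the two clusters."""
    return max(euclidean(x, y) for x in cluster_1 for y in cluster_2)

def centroid_linkage(cluster_1, cluster_2):
    """Distance between the centroids of the two clusters."""
    return euclidean(centroid(cluster_1), centroid(cluster_2))

# Clusters from question 3: C1 = {a}, C2 = {b, c}.
c1 = [(10, 2)]
c2 = [(3, 5), (5, 7)]

print(f"Centroid of C1: {centroid(c1)}")
print(f"Centroid of C2: {centroid(c2)}")
print(f"Complete linkage: {complete_linkage(c1, c2):.3f}")
print(f"Centroid linkage: {centroid_linkage(c1, c2):.3f}")
```

Complete linkage is driven by the farthest pair, here (a,b), while centroid linkage depends only on the two cluster means.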
4. The result below shows the first few PCs of the mtcars data (11 features) with their associated
standard deviations.
Determine the associated PVE and cumulative PVE of the PCs. (5)
###########################