Training set:
(x1=1, x2=2, y=1); (x1=2, x2=1, y=1); (x1=3, x2=2, y=1);
(x1=1, x2=3, y=2); (x1=2, x2=3, y=2); (x1=2, x2=4, y=2);
(x1=4, x2=1, y=3); (x1=4, x2=2, y=3); (x1=5, x2=2, y=3);
(x1=5, x2=3, y=4); (x1=5, x2=4, y=4); (x1=6, x2=4, y=4)
Q2. Given a set of training datapoints (dimension: 4), create a random forest
with 3 trees, where every tree has depth 1 only. For each tree, choose any 2 of
the dimensions. Given the test datapoint, what will be the output of the random
forest?
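The training and test datapoints for Q2 are not reproduced above, so the sketch below uses hypothetical 4-dimensional points and labels. It shows the mechanics the question asks for: each of the 3 trees is a depth-1 stump restricted to 2 randomly chosen dimensions, and the forest output is the majority vote.

```python
import random
from collections import Counter

def train_stump(X, y, feats):
    """Depth-1 tree: among the allowed features, pick the (feature,
    threshold) split with the lowest training error."""
    best = None
    for f in feats:
        for t in sorted({x[f] for x in X}):
            left = [yi for x, yi in zip(X, y) if x[f] <= t]
            right = [yi for x, yi in zip(X, y) if x[f] > t]
            lmaj = Counter(left).most_common(1)[0][0] if left else y[0]
            rmaj = Counter(right).most_common(1)[0][0] if right else y[0]
            err = sum(yi != (lmaj if x[f] <= t else rmaj)
                      for x, yi in zip(X, y))
            if best is None or err < best[0]:
                best = (err, f, t, lmaj, rmaj)
    _, f, t, lmaj, rmaj = best
    return lambda x: lmaj if x[f] <= t else rmaj

# Hypothetical 4-D training points and labels (not from the question).
X = [(1, 2, 0, 5), (2, 1, 1, 4), (5, 6, 2, 1), (6, 5, 3, 0)]
y = [0, 0, 1, 1]

random.seed(0)
# Each tree sees a random pair of the 4 dimensions.
forest = [train_stump(X, y, random.sample(range(4), 2)) for _ in range(3)]

def predict(x):
    """Forest output = majority vote over the 3 stumps."""
    return Counter(tree(x) for tree in forest).most_common(1)[0][0]
```

On an exam answer you would instead enumerate each stump's chosen feature, threshold, and leaf labels by hand, then tally the three votes for the given test point.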
Q3. You are given 6 2D data points with binary class labels. Start with the linear
classifier y = sign(x1 - x2), where x1, x2 are the two dimensions. Run the AdaBoost
algorithm for 2 iterations, thus obtaining 2 more linear classifiers along with
their weights. Carry out ensemble classification of these points using the 3 linear
classifiers (including the initial one).

ID  1   2   3   4   5   6
Y   1   1  -1   1  -1  -1
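The coordinates of the 6 points are not reproduced above, so the sketch below pairs the question's labels with hypothetical coordinates; the weak classifiers sign(x1 - x2 + bias) are stand-ins for whatever linear classifiers the solution derives. It shows one AdaBoost round: weighted error ε, classifier weight α = ½ ln((1-ε)/ε), and the exponential reweighting of the points.

```python
import math

# Labels from the question's table; coordinates are placeholders.
pts = [(2, 1), (3, 1), (1, 2), (2, 3), (0, 4), (4, 5)]
ys = [1, 1, -1, 1, -1, -1]

def h(bias):
    """Linear weak classifier sign(x1 - x2 + bias)."""
    return lambda p: 1 if p[0] - p[1] + bias > 0 else -1

def adaboost_round(w, clf):
    """One AdaBoost iteration: returns updated weights and alpha.
    Assumes clf makes at least one weighted mistake (eps > 0)."""
    eps = sum(wi for wi, p, yi in zip(w, pts, ys) if clf(p) != yi)
    alpha = 0.5 * math.log((1 - eps) / eps)
    w = [wi * math.exp(-alpha * yi * clf(p))
         for wi, p, yi in zip(w, pts, ys)]
    z = sum(w)
    return [wi / z for wi in w], alpha

w0 = [1 / 6] * 6
w1, a1 = adaboost_round(w0, h(0))   # round 1: the misclassified point gains weight
w2, a2 = adaboost_round(w1, h(2))   # round 2 on the reweighted points
```

A useful check for hand computation: after each round, the total weight on the points the round's classifier got wrong is exactly 1/2.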
Q4. You are given 4 labelled 3-dimensional points. Find a 3D linear classifier
that separates the two classes, and calculate the margins of that classifier with
respect to each of the classes. Note: the perpendicular distance of the point (l, m, n)
from the plane ax + by + cz + d = 0 is |al + bm + cn + d| / √(a² + b² + c²).
ID   1   2   3   4
X1   2  -1   0   1
X2   1   0   1  -1
X3  -1   2   2   1
Y   +1  +1  -1  -1
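A quick way to check a candidate classifier for Q4 is to code the distance formula directly. The plane used below, -x1 - x3 + 1.5 = 0, is one choice that happens to separate the table's points (many others exist; it is not claimed to be the intended answer). The margin of each class is the smallest perpendicular distance of that class's points from the plane.

```python
import math

# Labelled 3-D points from the table: ID -> ((x1, x2, x3), y).
pts = {1: ((2, 1, -1), 1), 2: ((-1, 0, 2), 1),
       3: ((0, 1, 2), -1), 4: ((1, -1, 1), -1)}

def dist(p, a, b, c, d):
    """Perpendicular distance of (l, m, n) from ax + by + cz + d = 0."""
    l, m, n = p
    return abs(a * l + b * m + c * n + d) / math.sqrt(a**2 + b**2 + c**2)

# One candidate separating plane (an assumption, not the unique answer).
a, b, c, d = -1, 0, -1, 1.5
# Separation check: every point lies on its label's side of the plane.
assert all(yi * (a * p[0] + b * p[1] + c * p[2] + d) > 0
           for p, yi in pts.values())

# Margin of a class = smallest distance of its points from the plane.
margin_pos = min(dist(p, a, b, c, d) for p, yi in pts.values() if yi == 1)
margin_neg = min(dist(p, a, b, c, d) for p, yi in pts.values() if yi == -1)
```

For this plane all four points sit at distance 0.5/√2 ≈ 0.354, so both class margins coincide.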
Q5. Given a set of training datapoints and their corresponding alphas, a set of
test datapoints, and a kernel function, show the classification results using a
kernelized SVM. Assume b = 0.
Training: (x1=1, x2=3, y=-1); (x1=2, x2=1, y=-1); (x1=4, x2=0, y=-1); (x1=0, x2=4, y=-1);
(x1=1, x2=5, y=1); (x1=2, x2=5, y=1); (x1=8, x2=2, y=1); (x1=3, x2=6, y=1)
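The alphas and the kernel for Q5 are not reproduced above, so the sketch below substitutes uniform alphas and a quadratic polynomial kernel K(u, v) = (u·v + 1)² purely to illustrate the decision rule, which is the part the question tests: f(x) = Σᵢ αᵢ yᵢ K(xᵢ, x) + b, classified by its sign.

```python
# Training points and labels from the question.
X = [(1, 3), (2, 1), (4, 0), (0, 4), (1, 5), (2, 5), (8, 2), (3, 6)]
y = [-1, -1, -1, -1, 1, 1, 1, 1]

# Stand-ins for the question's given values: uniform alphas and a
# quadratic polynomial kernel (both are assumptions).
alpha = [1.0] * len(X)

def K(u, v):
    return (u[0] * v[0] + u[1] * v[1] + 1) ** 2

def classify(x, b=0.0):
    """Kernelized SVM decision: sign(sum_i alpha_i * y_i * K(x_i, x) + b)."""
    s = sum(a * yi * K(xi, x) for a, yi, xi in zip(alpha, y, X)) + b
    return 1 if s >= 0 else -1
```

By hand, each test point needs one kernel evaluation per training point; with b = 0 only the sign of the weighted sum matters.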
Q6. You are given 4 3D data-points. You want to reduce them to 1 dimension,
as a*X1 + b*X2 + c*X3, where a, b, c are coefficients, at least one of which must
be 0. For what values of a, b, c can you get 100% accuracy on this dataset?
ID   1   2   3   4
X1   1   2  -1   0
X2   2   3  -2   1
X3   1  -3   1   0
Y    A   A   B   B
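Q6 can be checked mechanically: project the four points with a candidate (a, b, c) and see whether the two classes land in disjoint intervals on the line. The brute-force search below over small integer coefficients (the range is an assumption; any positive scaling of a solution also works) enforces the "at least one coefficient is 0" constraint.

```python
from itertools import product

# The four labelled 3-D points from the table.
X = [(1, 2, 1), (2, 3, -3), (-1, -2, 1), (0, 1, 0)]
Y = ['A', 'A', 'B', 'B']

def separates(a, b, c):
    """True if the 1-D projection a*X1 + b*X2 + c*X3 splits A from B."""
    proj = [a * x1 + b * x2 + c * x3 for x1, x2, x3 in X]
    A = [p for p, lab in zip(proj, Y) if lab == 'A']
    B = [p for p, lab in zip(proj, Y) if lab == 'B']
    return max(A) < min(B) or max(B) < min(A)

# Search small integer coefficients with at least one zero entry.
solutions = [(a, b, c) for a, b, c in product(range(-3, 4), repeat=3)
             if (a, b, c) != (0, 0, 0) and 0 in (a, b, c)
             and separates(a, b, c)]
```

For example, projecting on X1 alone (a=1, b=0, c=0) gives values 1, 2, -1, 0, so class A occupies [1, 2] and class B occupies [-1, 0], which a threshold between 0 and 1 separates perfectly.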
Q7. You are given a set of 4 3D points and two sets of components to project
them onto: A) W1 = [3/5, 4/5, 0], W2 = [4/5, -3/5, 0]; B) V1 = [5/13, 0, 12/13],
V2 = [12/13, 0, -5/13]. Which set will you prefer to project them onto?
ID 1 2 3 4
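The data table for Q7 is incomplete above (only the ID row survived), so the four points below are placeholders. The decision criterion itself is standard: prefer the orthonormal basis that retains more of the data's variance after projection, as PCA would.

```python
# Placeholder 3-D points (the question's table is incomplete here).
pts = [(3, 4, 0), (6, 8, 1), (-3, -4, 0), (0, 0, -1)]

def variance_along(w, data):
    """Variance of the data projected onto direction w."""
    proj = [sum(wi * xi for wi, xi in zip(w, p)) for p in data]
    mean = sum(proj) / len(proj)
    return sum((p - mean) ** 2 for p in proj) / len(proj)

A = [(3/5, 4/5, 0), (4/5, -3/5, 0)]        # option A from the question
B = [(5/13, 0, 12/13), (12/13, 0, -5/13)]  # option B from the question

# Total variance retained by each 2-component basis.
var_A = sum(variance_along(w, pts) for w in A)
var_B = sum(variance_along(w, pts) for w in B)
prefer = 'A' if var_A > var_B else 'B'
```

Note that basis A spans the X1-X2 plane and basis B spans the X1-X3 plane, so the comparison reduces to var(X2) versus var(X3) for the given data.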
iii) Using the softmax function on the output layer, write the probability
distribution of its class label according to our NN.
Q9. Given a set of datapoints, carry out agglomerative clustering using (i) single
linkage with threshold 5, and (ii) complete linkage with threshold 10. Use
Euclidean distance.
(x1=1, x2=1); (x1=3, x2=3); (x1=9, x2=6); (x1=4, x2=7); (x1=9, x2=4); (x1=1, x2=9);
(x1=13, x2=9);
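The hand procedure for Q9 can be mirrored in code: repeatedly merge the two closest clusters, where "closest" means the minimum pairwise point distance for single linkage (or the maximum for complete linkage), and stop once the best merge distance exceeds the threshold.

```python
import math

# The seven 2-D points from the question.
pts = [(1, 1), (3, 3), (9, 6), (4, 7), (9, 4), (1, 9), (13, 9)]

def euclid(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def agglomerative(points, threshold, linkage=min):
    """Agglomerative clustering; linkage=min gives single linkage,
    linkage=max gives complete linkage. Merging stops when the
    cheapest merge would exceed the threshold."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(euclid(p, q)
                            for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

single = agglomerative(pts, 5)           # part (i)
complete = agglomerative(pts, 10, max)   # part (ii)
```

With single linkage at threshold 5 the merges chain (1,1)-(3,3)-(4,7)-(1,9) into one cluster and (9,6)-(9,4)-(13,9) into another, since the next-cheapest merge costs about 5.10.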
Q10. You are given a set of 10 data-points (2D) below. Attempt to partition them
into 3 clusters using a) K-means, b) K-means++. Use Manhattan distance to
identify the nearest cluster centre for each point. Show 3 complete iterations in
each case.
[(-1,1), (5,-6), (0,-1), (-4,4), (1,0), (0,-2), (6,4), (-6,5), (-2,-1), (-5,-7)]
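A sketch of part a) for Q10. The initial centres below are picked by hand from the data (the question leaves initialization open; K-means++ would instead choose each new centre with probability proportional to its squared distance from the centres already chosen). Assignment uses Manhattan distance as specified; the centre update uses the coordinate-wise mean, which is the usual K-means step, though coordinate-wise medians pair more naturally with Manhattan distance.

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def kmeans(points, centres, iters=3):
    """Lloyd's algorithm: assign each point to its nearest centre
    (Manhattan distance), then move each centre to its group's mean."""
    groups = [[] for _ in centres]
    for _ in range(iters):
        groups = [[] for _ in centres]
        for p in points:
            i = min(range(len(centres)),
                    key=lambda i: manhattan(p, centres[i]))
            groups[i].append(p)
        centres = [(sum(p[0] for p in g) / len(g),
                    sum(p[1] for p in g) / len(g)) if g else c
                   for g, c in zip(groups, centres)]
    return centres, groups

pts = [(-1, 1), (5, -6), (0, -1), (-4, 4), (1, 0),
       (0, -2), (6, 4), (-6, 5), (-2, -1), (-5, -7)]
# Hand-picked initial centres (an assumption, not from the question).
centres, groups = kmeans(pts, [(-1, 1), (5, -6), (6, 4)])
```

On paper, each of the 3 iterations is one assignment table plus one centre update, which is exactly what the loop body computes.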
Q11. You have the following 1D observations, and you wish to fit a Gaussian
Mixture Model to them. Looking at the data, decide how many Gaussian
components you want to use. Then estimate the Gaussian parameters using the
EM algorithm.
[2.3, 4.7, -5.5, -4.8, 9.1, 3.5, 10.4, -4.3, 11.2, 1.9, 10.8, 3.4]
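The observations visibly group near -5, 3, and 10, which suggests K = 3 components; that choice and the initial means below are assumptions a solution would need to justify. The sketch runs standard EM for a 1-D Gaussian mixture: the E-step computes each component's responsibility for each point, and the M-step re-estimates the mixing weights, means, and variances from those responsibilities.

```python
import math

data = [2.3, 4.7, -5.5, -4.8, 9.1, 3.5, 10.4, -4.3, 11.2, 1.9, 10.8, 3.4]

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm(xs, mus, iters=50):
    """EM for a 1-D Gaussian mixture; len(mus) fixes the number of
    components, initialized with equal weights and unit variances."""
    k = len(mus)
    pis = [1 / k] * k
    vars_ = [1.0] * k
    for _ in range(iters):
        # E-step: responsibility r[i][j] of component j for point i.
        r = []
        for x in xs:
            w = [pis[j] * normal_pdf(x, mus[j], vars_[j]) for j in range(k)]
            s = sum(w)
            r.append([wj / s for wj in w])
        # M-step: weighted re-estimation of each component's parameters.
        for j in range(k):
            nj = sum(r[i][j] for i in range(len(xs)))
            pis[j] = nj / len(xs)
            mus[j] = sum(r[i][j] * xs[i] for i in range(len(xs))) / nj
            vars_[j] = max(sum(r[i][j] * (xs[i] - mus[j]) ** 2
                               for i in range(len(xs))) / nj, 1e-6)
    return pis, mus, vars_

pis, mus, vars_ = em_gmm(data, mus=[-5.0, 3.0, 10.0])
```

Because the three groups are well separated, EM converges quickly to means near the per-group averages (-4.87, 3.16, and 10.375) with mixing weights 3/12, 5/12, and 4/12.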