
Assignment 6

Introduction to Machine Learning


Prof. B. Ravindran
1. Which of the following properties are characteristic of decision trees?

(a) Low bias


(b) High variance
(c) Lack of smoothness of prediction surfaces
(d) Unbounded parameter set

Sol. (a), (b), (c) & (d)


Refer to the lecture
2. (2 marks) Consider the following dataset:

Age  Vaccination  Tumor Size  Tumor Site  Malignant
 5        1          Small     Shoulder       0
 9        1          Small     Knee           0
 6        0          Small     Marrow         0
 6        1          Medium    Chest          0
 7        0          Medium    Shoulder       0
 8        1          Large     Shoulder       0
 5        1          Large     Liver          0
 9        0          Small     Liver          1
 8        0          Medium    Shoulder       1
 8        0          Medium    Shoulder       1
 6        0          Small     Marrow         1
 7        0          Small     Chest          1

What is the initial entropy of Malignant?


(a) 0.543
(b) 0.9798
(c) 0.8732
(d) 1
Sol. (b)
−(5/12) log2(5/12) − (7/12) log2(7/12) = 0.9798
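As a sanity check, the entropy computation can be reproduced with a small Python snippet (the `entropy` helper below is illustrative, not part of the assignment):

```python
from math import log2

def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

# Malignant column from the table: 7 benign (0), 5 malignant (1)
malignant = [0] * 7 + [1] * 5
print(f"{entropy(malignant):.4f}")
```

The printed value agrees with option (b) up to rounding in the last digit.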

3. (2 marks) For the same dataset, what is the info gain of Vaccination?
(a) 0.4763
(b) 0.2102
(c) 0.1134
(d) 0.9355

Sol. (a)
0.9798 − [(5/12)(−(0/5) log2(0/5) − (5/5) log2(5/5)) + (7/12)(−(2/7) log2(2/7) − (5/7) log2(5/7))] = 0.4763
(taking 0 log2 0 = 0: the Vaccination = 1 branch has 5 samples, all non-malignant; the Vaccination = 0 branch has 7 samples, of which 5 are malignant)
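The same number can be checked programmatically. The `info_gain` helper below is a sketch written for this check, not from the lecture; it weights each child's entropy by its fraction of the data and subtracts from the parent entropy:

```python
from math import log2

def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def info_gain(feature, labels):
    """Information gain of splitting `labels` on the values of `feature`."""
    n = len(labels)
    gain = entropy(labels)
    for v in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == v]
        gain -= len(subset) / n * entropy(subset)
    return gain

# Vaccination and Malignant columns, in table order
vaccination = [1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0]
malignant   = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(f"{info_gain(vaccination, malignant):.4f}")
```

The result matches option (a) up to rounding in the last digit.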
4. Consider the following statements:
Statement 1: Decision Trees are linear non-parametric models.
Statement 2: A decision tree may be used to explain the complex function learned by a neural
network.
(a) Both the statements are True.
(b) Statement 1 is True, but Statement 2 is False.
(c) Statement 1 is False, but Statement 2 is True.
(d) Both the statements are False.
Sol. (c)
Refer to the lecture
5. Which of the following machine learning models can solve the XOR problem without any
transformations on the input space?
(a) Linear Perceptron
(b) Neural Networks
(c) Decision Trees
(d) Logistic Regression
Sol. (b), (c)
Refer to the lecture
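The decision-tree part of this answer can be made concrete with a hand-written depth-2 tree (a sketch, not from the lecture): two axis-aligned splits on the raw inputs reproduce XOR, which no single linear separator (perceptron or logistic regression) on the untransformed inputs can:

```python
def xor_tree(x1, x2):
    """Depth-2 decision tree with axis-aligned splits computing XOR."""
    if x1 == 0:
        return 1 if x2 == 1 else 0  # left subtree: output follows x2
    else:
        return 0 if x2 == 1 else 1  # right subtree: output negates x2

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_tree(x1, x2))
```

A neural network solves the same problem by learning a non-linear transformation in its hidden layer.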
6. Which of the following is/are major advantages of decision trees over other supervised learning
techniques? (Note that more than one choice may be correct)
(a) Theoretical guarantees of performance
(b) Higher performance
(c) Interpretability of classifier
(d) More powerful in its ability to represent complex functions
Sol. (c)
Refer to the lecture
7. (2 Marks) Consider a dataset with only one attribute (categorical). Suppose there are q unordered
values in this attribute. How many possible combinations must be considered to find the best
split-point for building the decision tree classifier?
(Note: The decision tree only makes binary splits, like CART.)
(a) q
(b) q^2
(c) 2^(q−1)
(d) 2^(q−1) − 1
Sol. (d)
Refer to the lecture
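The count 2^(q−1) − 1 can be verified by brute force: enumerate every way to divide the q category values into two non-empty groups, treating mirrored splits as the same. The `binary_splits` helper below is illustrative:

```python
from itertools import combinations

def binary_splits(values):
    """All distinct binary splits (left, right) of unordered categories."""
    values = list(values)
    n = len(values)
    splits, seen = [], set()
    for r in range(1, n):
        for left in combinations(values, r):
            right = tuple(v for v in values if v not in left)
            # A split and its mirror image are the same partition
            key = frozenset([frozenset(left), frozenset(right)])
            if key not in seen:
                seen.add(key)
                splits.append((left, right))
    return splits

for q in range(2, 7):
    print(q, len(binary_splits(range(q))), 2 ** (q - 1) - 1)
```

For each q the enumerated count equals 2^(q−1) − 1, confirming option (d).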
