
ML: Question Bank for Unit 2

Classification, Regression, Generalization

1. What is Multiclass classification? Explain with suitable example.


2. A contingency table helps in assessing the performance of a classifier. Prove.
3. Explain the true negative rate in a classifier's performance.
4. How does a class probability estimator work?
5. Do we really need squared error? Justify your answer.
6. How is squared error calculated? Explain with a suitable example.
7. Write a short note on:
a. Precision
b. Recall
c. Accuracy
d. False positive and False Negative
8. An empirical probability helps in assessing a class probability estimate.
Justify.
9. What are the different ways to find a formula for m_H(N)?
10. What is generalization error?
11. Why do we need to bound the growth function?
12. Write a short note on VC dimensions.
13. Briefly explain what is generalization in the context of pattern recognition
problems?

Q. 1 Give examples of classification problems.


Q. 1 Define and explain the following terms with examples.
(i) Label (ii) Label space
(iii) Output space (iv) Classification problem
(v) Regression problem (vi) Ranking and scoring problem
(vii) Probability estimation problem
Q. 2 Explain different types of predictive machine learning tasks.
Q. 3 What do you mean by
1. Binary Classification
2. Multiclass Classification
3. Decision Boundary
Q. 4 What is a training data set and a test data set?
Q. 5 Performance analysis of classifier is better practice. Justify.
Q. 6 Performance analysis of Regression model is better practice. Justify.
Q. 7 Explain with diagram
1. Univariate Binary Classification.
2. Bivariate Binary Classification.
3. Multivariate Binary Classification.
Q. 8 Explain (i) Input (ii) Output (iii) Constraints of Classification.
Q. 9 Explain confusion matrix for Binary classifier in detail.
Q. 10 Define and explain following terms.
1. True Positive 2. True Negative
3. False Positive 4. False Negative
5. TPR 6. TNR
7. FPR 8. FNR
9. Sensitivity 10. Recall
11. Specificity 12. Fallout 13. miss rate
Q. 11 Consider the following confusion matrix and calculate:
(i) Sensitivity of classifier (ii) Recall of classifier
(iii) Miss rate of classifier (iv) Fallout of classifier
Confusion matrix    Predicted +    Predicted –    Total
Actual +                 8             10           18
Actual –                 4              8           12
Total                   12             18           30
Comment on quality/usability of this classifier.
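A minimal sketch of how the Q. 11 quantities could be computed, assuming the rows of the matrix are actual classes and the columns are predicted classes, as the table labels suggest. The function name `rates` and the dictionary keys are illustrative choices, not part of the question.

```python
# Counts read from the Q. 11 confusion matrix (rows = actual, columns = predicted).
TP, FN = 8, 10   # actual positives: predicted +, predicted -
FP, TN = 4, 8    # actual negatives: predicted +, predicted -


def rates(tp, fn, fp, tn):
    """Return the standard rates of a binary classifier as a dict."""
    pos, neg = tp + fn, fp + tn
    return {
        "sensitivity": tp / pos,   # TPR, also called recall
        "miss_rate": fn / pos,     # FNR
        "fallout": fp / neg,       # FPR
        "specificity": tn / neg,   # TNR
        "accuracy": (tp + tn) / (pos + neg),
    }


r = rates(TP, FN, FP, TN)
for name, value in r.items():
    print(f"{name:12s} {value:.3f}")
```

Under this reading, sensitivity (which equals recall) is 8/18, the miss rate is 10/18, the fallout is 4/12 and the accuracy is 16/30. Fewer than half of the positive instances are detected, which is the main point to raise when commenting on usability.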
Q. 12 Prove that
(i) FPR = 1 – TNR
(ii) TNR = 1 – FPR
(iii) FNR = 1 – TPR
(iv) TPR = 1 – FNR
(v) Accuracy = 1 – Error Rate
(vi) Error Rate = 1 – Accuracy
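One possible route for Q. 12, sketched under the usual definitions TPR = TP/(TP+FN), FNR = FN/(TP+FN), TNR = TN/(TN+FP), FPR = FP/(TN+FP), with N = TP+TN+FP+FN denoting the total number of instances. Each pair of rates shares a denominator, so each pair sums to 1 and all six identities follow by rearranging.

```latex
\begin{align*}
\mathrm{TPR} + \mathrm{FNR} &= \frac{TP}{TP+FN} + \frac{FN}{TP+FN} = \frac{TP+FN}{TP+FN} = 1 \\
\mathrm{TNR} + \mathrm{FPR} &= \frac{TN}{TN+FP} + \frac{FP}{TN+FP} = \frac{TN+FP}{TN+FP} = 1 \\
\mathrm{Accuracy} + \mathrm{Error\ Rate} &= \frac{TP+TN}{N} + \frac{FP+FN}{N} = \frac{N}{N} = 1
\end{align*}
```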
Q. 13 Justify the correctness or incorrectness of the following statements for a good classifier.
(i) TPR should be close to 1
(ii) FNR should be close to 1
(iii) TNR should be close to 1
(iv) FPR should be close to 1
Ans. :
(Hint: correct, incorrect, correct, incorrect. Justify by using the
probabilities of correct classification of positive or negative instances.)
Q. 14 Comment on the following statements.
(i) A higher value of TPR ensures that the probability of correct classification of
positive instances is high.
(ii) A lower value of FNR ensures that the probability of misclassification of positive
instances is low.

Q. 15 What do you mean by class probability estimation?


Q. 16 Explain class probability estimation process for binary classifier.
Q. 17 Explain how classification is done by using threshold probabilities.
Q. 18 What do you mean by
1. Feature Tree (Partitioning of training data )
2. Probability estimation tree
3. Classification Tree
Q. 19 Consider the following partitioning of training data and construct a feature tree,
a probability estimation tree and a classification tree.
                     Romantic Movie       Non-Romantic Movie
Action Movie         Hit Movies = 10      Hit Movies = 2
                     Flop Movies = 2      Flop Movies = 10
Non-Action Movie     Hit Movies = 12      Hit Movies = 6
                     Flop Movies = 6      Flop Movies = 12
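A minimal sketch of how the Q. 19 leaves could be turned into a probability estimation tree and a classification tree. It assumes the first column of counts belongs to Romantic movies and the second to Non-Romantic movies, and that each leaf is labelled with its majority class. The helper `leaf_summary` is a hypothetical name used only for illustration.

```python
# Leaf counts (hit, flop) from the Q. 19 partition.
# Assumption: first column of counts = Romantic, second = Non-Romantic.
leaves = {
    ("Action", "Romantic"): (10, 2),
    ("Action", "Non-Romantic"): (2, 10),
    ("Non-Action", "Romantic"): (12, 6),
    ("Non-Action", "Non-Romantic"): (6, 12),
}


def leaf_summary(hit, flop):
    """Empirical P(Hit) for a leaf and its majority-class label."""
    p_hit = hit / (hit + flop)
    # Ties at exactly 0.5 are sent to "Flop"; this choice is arbitrary.
    label = "Hit" if p_hit > 0.5 else "Flop"
    return p_hit, label


summary = {leaf: leaf_summary(*counts) for leaf, counts in leaves.items()}
for leaf, (p_hit, label) in summary.items():
    print(leaf, f"P(Hit) = {p_hit:.3f}", "->", label)
```

The counts alone give the feature tree, the `p_hit` values give the probability estimation tree, and the labels give the classification tree.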
Q. 20 For the partitioning of data given in Q. 19, evaluate the performance of the probability
estimation classifier.
Find: (1) SE_i for i = 1 to the number of leaf nodes.
(2) Let |Test set| = 120; find the MSE for all nodes.
Q. 21 Define and Explain
(a) Empirical probabilities of each leaf or class.
(b) Empirical probabilities of training data.
Q. 22 Prove that squared error with training data = calibration loss + Refinement loss.
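A numerical check can help build intuition before attempting the proof in Q. 22. The sketch below assumes a binary problem in which every instance in a segment (leaf) receives the same predicted score, and it measures squared error on the positive-class probability only. The leaf counts are taken from Q. 19; the predicted scores are hypothetical values chosen to differ from the empirical probabilities.

```python
# Each segment: (n_pos, n_neg, predicted_score).
# Counts come from the Q. 19 leaves; the scores are hypothetical.
segments = [
    (10, 2, 0.9),
    (2, 10, 0.1),
    (12, 6, 0.6),
    (6, 12, 0.4),
]
n = sum(pos + neg for pos, neg, _ in segments)

# Mean squared error over instances: positives have target 1, negatives 0.
squared_error = sum(
    pos * (1 - s) ** 2 + neg * s ** 2 for pos, neg, s in segments
) / n

calibration_loss = 0.0
refinement_loss = 0.0
for pos, neg, s in segments:
    weight = (pos + neg) / n        # fraction of instances in this segment
    p = pos / (pos + neg)           # empirical probability of the segment
    calibration_loss += weight * (s - p) ** 2
    refinement_loss += weight * p * (1 - p)

print(squared_error, calibration_loss + refinement_loss)
```

When the predicted scores are replaced by the empirical probabilities of the training data, every `(s - p)` term vanishes, so the calibration loss is zero and the squared error equals the refinement loss.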
Q. 23 Short note on : Class probability Estimation.
Q. 24 Define and explain with example multi-class classification.
Q. 25 Explain construction of multi-class classifier.
1. One vs all approach
2. One vs one approach
3. Error correcting output codes approach.
Q. 26 Explain confusion matrix for multi-class classifier.
Q. 27 Write formulae for following measures used for performance evaluation of
multi-class classification.
(i) Accuracy of multi-class classifier.
(ii) Error Rate of multi-class classifier.
(iii) Precision of multi-class classifier.
(iv) Recall of multi-class classifier.
Q. 28 Consider the following confusion matrix and evaluate the performance of this multiclass
classifier.
Confusion matrix    Predicted C1    Predicted C2    Predicted C3    Total
Actual C1                20              40              20           80
Actual C2                20              30              20           70
Actual C3                20              20              60          100
Total                    60              90             100          250
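A minimal sketch of the Q. 27 measures applied to the Q. 28 matrix. It assumes a three-class problem in which the last column and last row of the table are totals, with rows as actual classes and columns as predicted classes. Per-class precision and recall are reported; how they are averaged into a single figure is left open.

```python
classes = ["C1", "C2", "C3"]
# Rows = actual class, columns = predicted class (totals omitted).
matrix = [
    [20, 40, 20],
    [20, 30, 20],
    [20, 20, 60],
]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(classes)))
accuracy = correct / total
error_rate = 1 - accuracy

precision = {}
recall = {}
for i, c in enumerate(classes):
    predicted_as_c = sum(matrix[r][i] for r in range(len(classes)))  # column sum
    actually_c = sum(matrix[i])                                      # row sum
    precision[c] = matrix[i][i] / predicted_as_c
    recall[c] = matrix[i][i] / actually_c

print(f"accuracy = {accuracy:.3f}, error rate = {error_rate:.3f}")
for c in classes:
    print(c, f"precision = {precision[c]:.3f}", f"recall = {recall[c]:.3f}")
```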
Q. 29 What do you mean by regression? Explain with an example.
Q. 30 What is simple linear Regression or linear Regression?
Q. 31 What is multiple linear Regression?
Q. 32 Explain the statistical and geometric properties of linear regression.
Q. 33 Interpret the regression equation Y = 0.20 + 0.15 X.
Q. 34 Write and explain characteristic of best Regression line.
Q. 35 If errors follow a normal distribution with mean 0 and variance σ², then show that
the output variable Y also follows a normal distribution.
Q. 36 For given following data evaluate performance of following regression.

X Y
1 5
5 3
4 5
6 7
8 7
7 1
Q. 37 Explain the following measures for evaluating the performance of regression.
1. SSE 2. MSE
3. RMSE 4. NMSE
5. R-squared 6. MAD
7. MAPE 8. ALC
9. BLC
Q. 38 Calculate value of above measures for example in Q. 36.
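A minimal sketch for Q. 36 to Q. 38. The question does not state which regression to evaluate, so this assumes an ordinary least-squares line fitted to the six points. It also assumes that MSE divides by n and that MAD means the mean absolute deviation of the residuals. NMSE, ALC and BLC are left out because their definitions vary between textbooks.

```python
import math

X = [1, 5, 4, 6, 8, 7]
Y = [5, 3, 5, 7, 7, 1]
n = len(X)

# Ordinary least-squares fit of Y = intercept + slope * X.
x_mean = sum(X) / n
y_mean = sum(Y) / n
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(X, Y))
s_xx = sum((x - x_mean) ** 2 for x in X)
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

predictions = [intercept + slope * x for x in X]
residuals = [y - p for y, p in zip(Y, predictions)]

sse = sum(e ** 2 for e in residuals)            # sum of squared errors
mse = sse / n                                   # assumption: divide by n
rmse = math.sqrt(mse)
sst = sum((y - y_mean) ** 2 for y in Y)         # total sum of squares
r_squared = 1 - sse / sst
mad = sum(abs(e) for e in residuals) / n        # mean absolute deviation
mape = 100 * sum(abs(e / y) for e, y in zip(residuals, Y)) / n  # in percent

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"SSE = {sse:.3f}, MSE = {mse:.3f}, RMSE = {rmse:.3f}")
print(f"R-squared = {r_squared:.5f}, MAD = {mad:.3f}, MAPE = {mape:.1f}%")
```

For this data the fitted slope is very close to zero and R-squared is close to zero, so a straight line in X explains almost none of the variation in Y.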
Q. 39 What do you mean by overfitting? Prove and explain that there are different
catalysts for overfitting in regression and classification.
Q. 40 What do you meant by training error and test error?
Q. 41 Explain underfit, over fit, just fit models for classification or Regression model.
Q. 42 Explain case study of Regression by
Q. 43 Explain significance of Generalization theory.
Q. 44 Explain Hoeffding inequality.
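A small numerical sketch can accompany an answer to Q. 44. It assumes the single-hypothesis form of the Hoeffding inequality, P[|E_in - E_out| > ε] ≤ 2·exp(-2·ε²·N), where N is the sample size. The function names are illustrative.

```python
import math


def hoeffding_bound(epsilon, n):
    """Upper bound on P[|E_in - E_out| > epsilon] for one fixed hypothesis."""
    return 2 * math.exp(-2 * epsilon ** 2 * n)


def samples_needed(epsilon, delta):
    """Smallest N for which the bound above is at most delta."""
    return math.ceil(math.log(2 / delta) / (2 * epsilon ** 2))


for n in (100, 1000, 10000):
    print(n, hoeffding_bound(0.05, n))
print("N for epsilon = 0.05, delta = 0.05:", samples_needed(0.05, 0.05))
```

The bound shrinks exponentially as N grows and does not depend on the unknown out-of-sample error, which is why it is the starting point for the generalization results in the later questions.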
Q. 46 Justify that the effective number of hypotheses is 2^n.
Q. 47 Explain Bounding function.
Q. 48 Explain the relation between:
(i) Hoeffding inequality
(ii) Bounding function to divide number of hypotheses.
Q. 49 Justify Regularization theory is used to achieve good generalization.
Q. 50 What do you mean by Regularized regression ?
Q. 51 Explain
(i) Shrinkage Method
(ii) Ridge Regression
(iii) Lasso Regression
Q. 53 Explain the effect of the learning parameter λ on regularization.
Q. 54 What do you mean by VC dimension?
Q. 55 Explain significance of
1. VC dimension and VC inequality
2. Hoeffding inequality and VC inequality.
Q. 58 How to evaluate the VC dimension if
1. Only the maximum number of instances which can be shattered by the model is
known.
2. Only the size of the training data is known.
Q. 59 Explain theory of generalization w.r.t.
1. Hoeffding inequality
2. Shattering of k points.
3. VC-dimension
4. VC-inequality
