1. Suppose you have been given a fair coin and you want to find out the odds of
getting heads. Which of the following options is true in this case?

Odds will be 1
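
Odds and probability are easy to confuse: the probability of an event is p,
while the odds in favour of it are p / (1 - p). A minimal sketch of that
definition (the function name odds is only illustrative):

```python
from fractions import Fraction

def odds(p):
    """Odds in favour of an event with probability p, i.e. p / (1 - p)."""
    p = Fraction(p)
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

# A fair coin lands heads with probability 1/2.
fair_coin_odds = odds(Fraction(1, 2))
print(fair_coin_odds)  # 1
```

For a fair coin the probability of heads is 0.5, and the odds are 0.5 / 0.5 = 1.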

2. Youden Index is a measure of the following:

Sensitivity + Specificity – 1
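
As an illustration, the index can be computed from the four cells of a
confusion matrix. The counts below are made up for the example and do not come
from the quiz:

```python
def youden_index(tp, fn, tn, fp):
    """Youden's J = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity + specificity - 1

# Hypothetical counts: 50 actual positives, 50 actual negatives.
j = youden_index(tp=40, fn=10, tn=45, fp=5)
```

J ranges from -1 to 1; a value of 0 means the classifier does no better than
chance, and 1 means it separates the two classes perfectly.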

3. What is the use of a lift chart?

It sifts through a large number of records to identify the records that belong to a
class of interest
It helps determine how effectively we can “skim the cream” by selecting a
relatively small number of records and getting a relatively large portion of the
responders
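
The "skim the cream" idea can be sketched in a few lines: rank records by
predicted score, take the top fraction, and compare the share of responders
captured with the share of records selected. The data and the function names
below are assumptions made for illustration only:

```python
def cumulative_gains(actual, scores):
    """Running count of responders after sorting records by score, highest first."""
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    gains, running = [], 0
    for _score, label in ranked:
        running += label
        gains.append(running)
    return gains

def lift_at(actual, scores, fraction):
    """Share of responders captured in the top fraction, divided by that fraction."""
    n_top = max(1, round(len(actual) * fraction))
    captured = cumulative_gains(actual, scores)[n_top - 1] / sum(actual)
    return captured / (n_top / len(actual))

# Hypothetical data: 1 = responder, 0 = non-responder.
actual = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.35, 0.3, 0.2, 0.1]
top_30_lift = lift_at(actual, scores, 0.3)
```

Here the top 30% of records contain 2 of the 4 responders, so the lift is
0.5 / 0.3, or roughly 1.67 times what random selection would give.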

4. The results of Logistic Regression are influenced by which of the following
assumptions?

Multicollinearity

5. Which of the following techniques is used for pruning the tree?

Cross Validation

6. You wish to classify spam from ham emails. Most of the independent variables
you have are categorical in nature. Which of the classifiers would be most
appropriate in this situation?

Classification Trees

7. In a binary logistic regression, the baseline is a function of the majority class from
which of the following:

The Full Dataset

8. In Classification Trees, which one of the following can be used for identifying the
split for growing the tree:

Gini Index
Entropy
Misclassification Rate
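
All three measures score how mixed the classes are in a node, given the class
proportions in that node. A minimal sketch using the standard definitions (the
function names are illustrative):

```python
from math import log2

def gini(proportions):
    """Gini index: 1 minus the sum of squared class proportions."""
    return 1 - sum(p * p for p in proportions)

def entropy(proportions):
    """Entropy in bits; classes with zero proportion contribute nothing."""
    return -sum(p * log2(p) for p in proportions if p > 0)

def misclassification_rate(proportions):
    """Error made by predicting the majority class for every record in the node."""
    return 1 - max(proportions)

# A perfectly mixed two-class node is the worst case for all three measures.
mixed = [0.5, 0.5]
mixed_scores = (gini(mixed), entropy(mixed), misclassification_rate(mixed))
```

A pure node scores 0 on all three; a candidate split is usually judged by how
much it reduces the weighted impurity of the child nodes relative to the parent.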

9. Consider the following figure for answering the next few questions. In the figure,
X1 and X2 are the two variables and each data point is represented by a dot (-1 is
the negative class and +1 is the positive class). You first split the data on
variable X1 (say the splitting point is x11), shown in the figure as a vertical
line. Every value less than x11 is predicted as the positive class and every value
greater than x11 is predicted as the negative class.

Equal to X11

10. Classification Modeling is a type of:

Supervised Learning

11. For the figure below, what is the approximate optimum size of a Classification
Tree?

10

12. Calculate the misclassification rate for the following figure.

Approximately 3.67%

13. ROC curve is a plot between which of the following:

True Positive Rate and False Positive Rate
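
Each point on the curve is the (False Positive Rate, True Positive Rate) pair
obtained at one cut-off. The sketch below computes a few such points by hand;
the labels and scores are invented for the example:

```python
def roc_point(actual, scores, cutoff):
    """Return (false positive rate, true positive rate) at the given cut-off."""
    tp = sum(1 for a, s in zip(actual, scores) if a == 1 and s >= cutoff)
    fp = sum(1 for a, s in zip(actual, scores) if a == 0 and s >= cutoff)
    positives = sum(actual)
    negatives = len(actual) - positives
    return fp / negatives, tp / positives

# Hypothetical labels (1 = positive) and predicted probabilities.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.65, 0.4, 0.3, 0.1]
curve = [roc_point(actual, scores, c) for c in (1.1, 0.8, 0.6, 0.35, 0.0)]
```

Lowering the cut-off moves the point from (0, 0) toward (1, 1); a curve that
hugs the top-left corner indicates a better classifier.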

14. For the model below, predict the outcome for the following record:

Income = 80
Lot Size = 10
Duration = 30

Owner

15. The figure above shows ROC curves for three Logistic Regression models. Which
one of them will give the best results?

Yellow

16. What happens if the beta coefficient of an X variable is positive in Logistic
Regression?

There will be an overall increase in probability with increase in X value
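
This follows from the logistic function p = 1 / (1 + exp(-(b0 + b1 * x))): when
b1 is positive, p rises monotonically with x, and each unit increase in x
multiplies the odds by exp(b1). A small numeric check, with an intercept and
coefficient chosen arbitrarily for illustration:

```python
from math import exp

def logistic_probability(x, intercept, beta):
    """Predicted probability from a one-variable logistic regression."""
    return 1 / (1 + exp(-(intercept + beta * x)))

# Hypothetical model: intercept = -2.0, beta = 0.8 (positive).
probs = [logistic_probability(x, intercept=-2.0, beta=0.8) for x in range(6)]

# Ratio of the odds at x = 1 to the odds at x = 0.
odds_ratio = (probs[1] / (1 - probs[1])) / (probs[0] / (1 - probs[0]))
```

The list probs is strictly increasing, and odds_ratio equals exp(0.8) up to
floating-point error.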

17. Which of the following is/are true about default cut-off in Logistic Regression?

It is the probability of a majority class in the binary logistic regression

18. In a Classification Model, which technique can help you choose a threshold that
balances sensitivity and specificity?

ROC Curve
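
One common way to read a threshold off the ROC curve is to pick the cut-off
that maximizes sensitivity + specificity - 1 (Youden's J). This is only one
possible criterion, and the data below are invented for the sketch:

```python
def best_cutoff(actual, scores):
    """Return (cutoff, J) for the cut-off with the largest Youden's J."""
    positives = sum(actual)
    negatives = len(actual) - positives
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for a, s in zip(actual, scores) if a == 1 and s >= cutoff)
        tn = sum(1 for a, s in zip(actual, scores) if a == 0 and s < cutoff)
        j = tp / positives + tn / negatives - 1
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best

# Hypothetical labels (1 = positive) and predicted probabilities.
actual = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.85, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.1, 0.05]
cutoff, j = best_cutoff(actual, scores)
```

With these numbers the chosen cut-off is 0.5, where sensitivity is 1.0 and
specificity is 0.8. If misclassifying one class costs more than the other, a
different point on the curve may be preferable.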

19. Which of the following is true about random forests?

They are meant to increase the accuracy of Classification Trees

They create a large number of trees on the training data that collectively vote on
the predicted response for each observation
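
The voting step can be sketched as a simple majority count over the labels
predicted by the individual trees. This shows only the aggregation, not how the
trees are grown, and the per-tree predictions are hypothetical:

```python
from collections import Counter

def forest_predict(tree_predictions):
    """Majority vote across the class labels predicted by individual trees."""
    votes = Counter(tree_predictions)
    label, _count = votes.most_common(1)[0]
    return label

# Hypothetical predictions from five trees for a single email.
votes_for_one_email = ["spam", "ham", "spam", "spam", "ham"]
predicted = forest_predict(votes_for_one_email)
```

Three of the five trees vote "spam", so that is the forest's prediction for
this record.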

20. What is the True Positive Rate in the following figure for cut-off = 0.6?

10/12

21. Name any two classifiers

Logistic Regression
Neural Nets

22. Which of the following is a good measure of performance when the data is
unbalanced?

23. These are three scatter plots (A, B, C, left to right) with hand-drawn decision
boundaries for logistic regression.
Which of the following is/are true about the above visualizations?

Plot ‘C’ is likely to result in maximum test error

Plot ‘A’ is the best model because it is highly interpretable

24. Which one is true about pruning a classification tree?

It is required to tackle the problem of overfitting on the training data


25. Which one of the following is required to construct a lift chart?

Actual and Predicted Response Variable in the Training Set
