Suppose you have been given a fair coin and you want to find out the odds of
getting heads. Which of the following options is true for such a case?
Sensitivity + Specificity – 1
It sifts through a large number of records to identify the records that belong to a
class of interest
It helps determine how effectively we can “skim the cream” by selecting a
relatively small number of records and getting a relatively large portion of the
responders
4. The results of Logistic Regression are influenced by which of the following
assumptions?
Multicollinearity
Cross Validation
6. You wish to classify spam from ham emails. Most of the independent variables
you have are categorical in nature. Which of the classifiers would be most
appropriate in this situation?
Classification Trees
7. In a binary logistic regression, the baseline is a function of the majority class from
which of the following:
8. In Classification Trees, which one of the following can be used for identifying the
split for growing the tree:
Gini Index
Entropy
Misclassification Rate
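As a rough illustration of the three split criteria listed above, the sketch below computes each impurity measure for a list of class labels and scores a candidate split by the size-weighted impurity of its two children. The function names and the toy labels are assumptions made for this example; they do not come from the quiz.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the share of each class in a non-empty list of labels."""
    counts = Counter(labels)
    n = len(labels)
    return [c / n for c in counts.values()]


def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Entropy in bits: -sum(p * log2(p)) over the classes present."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def misclassification(labels):
    """Misclassification rate: 1 minus the share of the majority class."""
    return 1.0 - max(class_proportions(labels))


def weighted_impurity(left, right, measure):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * measure(left) + len(right) / n * measure(right)


# Hypothetical node with two records of each class (-1 and +1).
node = [1, 1, -1, -1]
print(gini(node), entropy(node), misclassification(node))
# A split that separates the classes perfectly has zero weighted impurity.
print(weighted_impurity([1, 1], [-1, -1], gini))
```

A tree-growing routine would typically evaluate `weighted_impurity` for every candidate split point and keep the one with the lowest value.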
9. Consider the following figure for answering the next few questions. In the figure,
X1 and X2 are the two variables and the data points are represented by dots (-1 is
the negative class and +1 is the positive class). You first split the data on
variable X1 (say the splitting point is x11), shown in the figure as a vertical
line. Every value less than x11 will be predicted as the positive class and every
value greater than x11 will be predicted as the negative class.
Equal to X11
Supervised Learning
11. For the figure below, what is the approximate optimum size of a Classification
Tree?
10
Approximately 3.67%
13. ROC curve is a plot between which of the following:
14. For the model below, predict the outcome for the following record:
Income = 80
Lot Size = 10
Duration = 30
Owner
15. The figure above shows ROC curves for three Logistic Regression models. Which
one of them will give the best results?
Yellow
17. Which of the following is/are true about the default cut-off in Logistic Regression?
18. In a classification model, which technique can help you choose a threshold that
balances sensitivity and specificity?
ROC Curve
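One common way to read a balanced threshold off an ROC curve is to pick the cut-off that maximises sensitivity + specificity - 1 (Youden's J statistic). The sketch below is a minimal version of that idea, not necessarily the method the quiz has in mind. The labels and scores are made-up illustration data, and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, scores, cutoff):
    """Classify as positive when score >= cutoff, then return
    (sensitivity, specificity). Assumes both classes are present."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= cutoff
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(y_true, scores):
    """Try every observed score as a cut-off and return the
    (cutoff, J) pair with the largest J = sensitivity + specificity - 1."""
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(y_true, scores, cutoff)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Made-up labels (1 = positive, 0 = negative) and predicted probabilities.
y_true = [0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8]
print(best_cutoff(y_true, scores))
```

Each cut-off tried here corresponds to one point on the ROC curve, so maximising J amounts to finding the point farthest above the diagonal.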
19. Which of the following is true about random forests?
They create a large number of trees on the training data that jointly vote on the
predicted response for each observation
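The voting step described in this answer can be sketched as follows. This covers only the aggregation of per-tree predictions, not tree fitting or bootstrap sampling, and the function name and the toy predictions are assumptions for illustration.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Combine per-tree predictions into one label per observation.

    tree_predictions: list with one inner list per tree; every inner
    list holds that tree's predicted label for each observation.
    Using an odd number of trees avoids ties in the binary case.
    """
    n_observations = len(tree_predictions[0])
    voted = []
    for i in range(n_observations):
        votes = Counter(tree[i] for tree in tree_predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Three hypothetical trees, each predicting a class for three observations.
predictions = [
    [1, -1, 1],
    [1, 1, -1],
    [-1, 1, -1],
]
print(majority_vote(predictions))
```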
20. What is the true positive rate in the following figure for cut-off = 0.6?
10/12
Logistic Regression
Neural Nets
22. Which of the following is a good measure of performance when the data is
unbalanced?
23. These are three scatter plots (A, B, C, left to right) with hand-drawn decision
boundaries for logistic regression.
Which of the following is/are true about above visualizations?