
1. Which of the following is NOT a common type of classification problem?

a. Binary classification
b. Multi-class classification
c. Regression
d. None of the above
2. Which of the following algorithms is a popular example of a Naive Bayes classifier?
a. Decision Trees
b. K-Nearest Neighbors
c. Random Forest
d. Naive Bayes
3. Which of the following is NOT an assumption made by the Naive Bayes algorithm?
a. Features are independent
b. Features have equal importance
c. Features are continuous
d. Features are categorical
4. In a Naive Bayes classifier, the probability of each class is calculated using:
a. The Bayes' theorem
b. The Central Limit Theorem
c. The Law of Large Numbers
d. The Pythagorean Theorem
5. The Laplace smoothing technique is used in Naive Bayes to:
a. Avoid zero probabilities for unseen features
b. Reduce overfitting
c. Speed up the algorithm
d. None of the above
6. Which of the following is NOT a metric used to evaluate the performance of a classification model?
a. Accuracy
b. Precision
c. Recall
d. F1-score
e. All of the above are used to evaluate model performance
7. In a binary classification problem, the confusion matrix shows:
a. The number of correctly classified instances for each class
b. The number of incorrectly classified instances for each class
c. Both the number of correctly and incorrectly classified instances for each class
d. None of the above
8. Which of the following is an example of a regression problem?
a. Predicting the color of a flower based on its petal length
b. Predicting the likelihood of a customer to buy a product based on their age and income
c. Predicting the price of a house based on its location and size
d. None of the above
1. The linear regression algorithm is used for:
a. Binary classification
b. Multi-class classification
c. Regression
d. Clustering
2. Which of the following is NOT an assumption made by linear regression?
a. The relationship between the independent and dependent variables is linear
b. The residuals are normally distributed
c. The independent variables are independent of each other
d. The dependent variable is categorical
3. Which of the following is a common metric used to evaluate the performance of a regression model?
a. Accuracy
b. Precision
c. Recall
d. Mean squared error
4. Which of the following is NOT an example of a decision tree algorithm?
a. ID3
b. C4.5
c. CART
d. KNN
5. Which of the following is NOT an advantage of decision trees?
a. They are easy to interpret and visualize
b. They can handle both categorical and continuous data
c. They are immune to overfitting
d. They can handle missing data
6. Which of the following is an example of an ensemble method?
a. Logistic Regression
b. K-Nearest Neighbors
c. Decision Trees
d. Random Forest
7. In a random forest, each tree is trained on:
a. The same set of features
b. A different set
1. Which of the following is NOT a popular ensemble learning method?
A. Boosting B. Bagging C. Random forests D. Logistic regression

Answer: D

2. Boosting and bagging are two popular methods for:


A. Feature selection B. Dimensionality reduction C. Ensemble learning D.
Unsupervised learning

Answer: C

3. In linear regression, the aim is to:


A. Predict the class labels of input data B. Predict the probability of a binary
event C. Predict the value of a continuous output variable D. Cluster input data
points

Answer: C

4. In logistic regression, the output variable is:


A. Continuous B. Binary C. Categorical D. None of the above

Answer: B

5. Maximum likelihood estimation is a method for:


A. Finding the optimal value of the regularization parameter B. Estimating the
parameters of a probabilistic model C. Reducing the variance of a model's
predictions D. None of the above

Answer: B

6. Regularization is used to:


A. Reduce the bias of a model B. Reduce the variance of a model C. Reduce
both the bias and variance of a model D. None of the above

Answer: B

7. Statistical learning theory is concerned with:


A. Developing algorithms for supervised learning B. Understanding the
theoretical properties of learning algorithms C. Developing algorithms for
unsupervised learning D. None of the above

Answer: B
8. The perceptron rule is used for:
A. Training linear regression models B. Training logistic regression models C.
Training neural networks with a single layer D. None of the above

Answer: C

9. Multi-layer perceptrons are also known as:


A. Feedforward neural networks B. Convolutional neural networks C. Recurrent
neural networks D. None of the above

Answer: A

10. Backpropagation is a method for:


A. Training supervised learning models B. Training unsupervised learning
models C. Evaluating the performance of a learning algorithm D. None of the
above

Answer: A

11. Deep learning models are characterized by:


A. Having a large number of layers B. Being trained using unsupervised
learning C. Being highly interpretable D. None of the above

Answer: A

12. PCA is a technique for:


A. Feature selection B. Feature extraction C. Model selection D. None of the
above

Answer: B

13. LDA is a technique for:


A. Feature selection B. Feature extraction C. Model selection D. None of the
above

Answer: B

14. K-means clustering is a method for:


A. Supervised learning B. Unsupervised learning C. Semi-supervised learning
D. None of the above

Answer: B
15. Gaussian mixture models are used for:
A. Regression
B. Classification
C. Clustering
D. None of the above

Answer: C

Q2. Machine Learning is a field of AI consisting of learning algorithms that ..............

A. At executing some task

B. Over time with experience

C. Improve their performance

D. All of the above

Q3. .............. is a widely used and effective machine learning algorithm based on the idea of bagging.
A. Regression B. Classification C. Decision Tree D. Random Forest

Q4. What is the disadvantage of decision trees? A. Factor analysis B. Decision trees are robust to
outliers C. Decision trees are prone to be overfit D. All of the above

Q5. How can you handle missing or corrupted data in a dataset? A. Drop missing rows or columns B.
Assign a unique category to missing values C. Replace missing values with mean/median/mode D. All
of the above

Q6. Which of the following are the most widely used metrics and tools to assess a classification model?
A. Confusion matrix B. Cost-sensitive accuracy C. Area under the ROC curve D. All of the above

Q7. Machine learning algorithms build a model based on sample data, known as ................. A.
Training Data B. Transfer Data C. Data Training D. None of the above

Q8. Machine learning is a subset of ................ A. Deep Learning B. Artificial Intelligence C. Data
Learning D. None of the above

Q9. A Machine Learning technique that helps in detecting the outliers in data. A. Clustering B.
Classification C. Anomaly Detection D. All of the above

Q10. Who is the father of Machine Learning? A. Geoffrey Hill B. Geoffrey Chaucer C. Geoffrey Everest
Hinton D. None of the above

Q11. What is the most significant phase in a genetic algorithm? A. Selection B. Mutation C.
Crossover D. Fitness function

Q12. Which one of the following is not a Machine Learning discipline? A. Physics B. Information
Theory C. Neurostatistics D. Optimization Control
Q13. Machine Learning has various function representations. Which of the following is not a symbolic
function representation? A. Decision Trees B. Rules in propositional logic C. Rules in first-order
predicate logic D. Hidden Markov Models (HMM)

Q14. ................... algorithms enable the computers to learn from data, and even improve
themselves, without being explicitly programmed. A. Deep Learning B. Machine Learning C. Artificial
Intelligence D. None of the above

Q15. What are the three types of Machine Learning? A. Supervised Learning B. Unsupervised
Learning C. Reinforcement Learning D. All of the above

Q16. Which of the following is not a supervised learning method? A. PCA B. Naive Bayesian C. Linear
Regression D. Decision Tree

Q17. Real-Time decisions, Game AI, Learning Tasks, Skill acquisition, and Robot Navigation are
applications of ............. A. Reinforcement Learning B. Supervised Learning: Classification C.
Unsupervised Learning: Regression D. None of the above

Q18. Which of the following is not a numerical function in the various function representations of
Machine Learning? A. Case-based B. Neural Network C. Linear Regression D. Support Vector
Machines

Q19. Common classes of problems in machine learning are .............. A. Clustering B. Regression C.
Classification D. All of the above

Q20. Which of the following clustering algorithms merges and splits nodes to help modify nonoptimal
partitions? A. K-Means clustering B. Conceptual clustering C. Agglomerative clustering D. All of the
above

4) The term machine learning was coined in which year?

 A. 1958  B.1959  C.1960  D.1961

10. Consider a linear-regression model with N = 3 and D = 1 with input-output pairs as follows: y1 =
2, x1 = 1, y2 = 3, x2 = 1, y3 = 3, x3 = 2. What is the gradient of mean-square error (MSE) with respect
to β1 when β0 = 0 and β1 = 1? Give your answer correct to two decimal digits. Answer: -1.66
(deviation 0.01)
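
The quoted answer can be checked with a short calculation. The sketch below assumes the cost is MSE = (1/(2N)) Σ (yn − β0 − β1 xn)^2, a common convention that the question itself does not state, and uses y1 = 2, the value for which the quoted answer is reproduced. The function name `mse_gradient_beta1` is made up for illustration.

```python
def mse_gradient_beta1(xs, ys, beta0, beta1):
    """Gradient of (1/(2N)) * sum((y - beta0 - beta1*x)^2) with respect to beta1."""
    n = len(xs)
    # Each term contributes -(residual) * x to the derivative with respect to beta1.
    return -sum((y - beta0 - beta1 * x) * x for x, y in zip(xs, ys)) / n

xs = [1, 1, 2]
ys = [2, 3, 3]  # assumption: y1 = 2
grad = mse_gradient_beta1(xs, ys, beta0=0.0, beta1=1.0)
print(round(grad, 2))  # -1.67, within 0.01 of the quoted -1.66
```

The residuals at β0 = 0, β1 = 1 are 1, 2 and 1, so the gradient is −(1·1 + 2·1 + 1·2)/3 = −5/3.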

11. Let us say that we have computed the gradient of our cost function and stored it in a vector g.
What is the cost of one gradient descent update given the gradient? (A) O(D) (B) O(N) (C) O(ND) (D)
O(ND^2) Answer: (A)
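
Once the gradient vector g of length D is available, one update touches each of the D parameters exactly once, which is where the O(D) cost comes from. A minimal sketch, where the helper `gd_update` and the step size `alpha` are illustrative assumptions rather than anything defined in the question:

```python
def gd_update(beta, g, alpha):
    """One gradient descent step: beta <- beta - alpha * g, one operation per parameter."""
    return [b - alpha * gi for b, gi in zip(beta, g)]

beta = [0.0, 1.0]  # D = 2 parameters
g = [0.5, -2.0]    # gradient assumed to be already computed
new_beta = gd_update(beta, g, alpha=0.1)
print(new_beta)
```

Computing g itself is the expensive part (it typically depends on N and D); the update alone does not look at the data.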

12. Let us say that we are fitting a one-parameter model to the data, i.e. yn ≈ β0. The average of y1,
y2, ..., yN is 1. We start gradient descent at β0^(0) = 0 and set the step-size to 0.5. What is the value
of β0 after 3 iterations, i.e., the value of β0^(3)? Answer: 0.875 (deviation 0.01)

13. Let us say that we are fitting a one-parameter model to the data, i.e. yn ≈ β0. The average of y1, y2,
..., yN is 1. We start gradient descent at β0^(0) = 10 and set the step-size to 0.5. What is the value
of β0 after 3 iterations, i.e., the value of β0^(3)? Answer: 2.125 (deviation 0.01)
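
Both answers above follow from iterating the update β0 ← β0 − step · (β0 − mean(y)), which is the gradient step for the cost (1/(2N)) Σ (yn − β0)^2. The 1/(2N) normalization is an assumption, chosen because it reproduces the quoted answers; the function name `run_gd` is illustrative.

```python
def run_gd(beta0, y_mean, step, iterations):
    """Gradient descent on (1/(2N)) * sum((y_n - beta0)^2); the gradient is beta0 - y_mean."""
    for _ in range(iterations):
        beta0 = beta0 - step * (beta0 - y_mean)
    return beta0

print(run_gd(0.0, 1.0, 0.5, 3))   # 0.875
print(run_gd(10.0, 1.0, 0.5, 3))  # 2.125
```

With a step size of 0.5, each iteration halves the distance to the mean: starting from 0 the iterates are 0.5, 0.75, 0.875, and starting from 10 they are 5.5, 3.25, 2.125.
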
14. Computational complexity of Gradient descent is, (A) linear in D (B) linear in N (C) polynomial in
D (D) dependent on the number of iterations

6. K-fold cross-validation is (A) linear in K (B) quadratic in K (C) cubic in K (D) exponential in K Answer: A
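
The linear dependence on K comes from fitting one model per fold. A minimal sketch in plain Python, treating the cost of a single fit as roughly constant; the "model" here is just the mean of the training targets, and `k_fold_mse` is a hypothetical helper, not a library function:

```python
def k_fold_mse(ys, k):
    """K-fold cross-validation for a mean predictor; returns (average MSE, number of fits)."""
    folds = [ys[i::k] for i in range(k)]  # simple interleaved split into k folds
    fits, errors = 0, []
    for i in range(k):
        # Train on every fold except fold i.
        train = [y for j, fold in enumerate(folds) if j != i for y in fold]
        prediction = sum(train) / len(train)  # "fit" the model
        fits += 1
        # Validate on the held-out fold i.
        errors.append(sum((y - prediction) ** 2 for y in folds[i]) / len(folds[i]))
    return sum(errors) / k, fits

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
for k in (2, 3, 6):
    _, fits = k_fold_mse(data, k)
    print(k, fits)  # the number of fits equals k
```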

17. You observe the following while fitting a linear regression to the data: as you increase the
amount of training data, the test error decreases and the training error increases. The train error is
quite low (almost what you expect it to be), while the test error is much higher than the train error.
What do you think is the main reason behind this behavior? Choose the most probable option. (A)
High variance (B) High model bias (C) High estimation bias (D) None of the above Answer: A

18. Adding more basis functions in a linear model... (pick the most probable option) (A) Decreases
model bias (B) Decreases estimation bias (C) Decreases variance (D) Doesn't affect bias and
variance
