
AKTU EXAM 19-20

Machine Learning Solved MCQ


Answer Key
Question Answer Question Answer Question Answer
1 A 26 C 51 A
2 D 27 C 52 A
3 C 28 B 53 A
4 D 29 B 54 C
5 D 30 C 55 D
6 B or C 31 C 56 A
7 D 32 D 57 C
8 A 33 C 58 B
9 A 34 A 59 D
10 A 35 C 60 A
11 A 36 A 61 C
12 A 37 D 62 D
13 C 38 C 63 C
14 B 39 D 64 D
15 B 40 B 65 D
16 C 41 A 66 C
17 B 42 D 67 A
18 B 43 B 68 A
19 C 44 C 69 B
20 A 45 B 70 A
21 B 46 C
22 B 47 B
23 A 48 D
24 A 49 D
25 B 50 D

www.universityacademy.in, info@universityacademy.in
Correct answers are listed in the Answer Key above.
Note: Attempt all questions. The question paper contains 70 MCQ-type questions. Each
question carries equal marks. Select the answer and fill the bubble corresponding to
that question in the attached OMR sheet.

1. What is Machine learning?
(A) The autonomous acquisition of knowledge through the use of computer programs
(B) The autonomous acquisition of knowledge through the use of manual programs
(C) The selective acquisition of knowledge through the use of computer programs
(D) The selective acquisition of knowledge through the use of manual programs

2. Which of the following is not a factor affecting the performance of a learner system?
(A) Representation scheme used
(B) Training scenario
(C) Type of feedback
(D) Good data structures

3. Which of the following statements is/are true about "Type-1" and "Type-2" errors?
(i) Type 1 is known as false positive and Type 2 is known as false negative.
(ii) Type 1 is known as false negative and Type 2 is known as false positive.
(iii) Type 1 error occurs when we reject a null hypothesis when it is true.
(A) Only (i)
(B) Only (iii)
(C) (i) and (iii)
(D) (ii) and (iii)

4. How do you handle missing or corrupted data in a dataset?
(A) Drop missing rows or columns
(B) Replace missing values with mean/median/mode
(C) Assign a unique category to missing values
(D) All of the above

5. Which of the following options is true about the FIND-S algorithm?
(A) FIND-S starts from the most specific hypothesis and generalizes it by considering only positive examples.
(B) FIND-S ignores negative examples.
(C) FIND-S finds the most specific hypothesis within H that is consistent with the positive training examples.
(D) All of the above
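Question 5 describes FIND-S. A minimal sketch of the idea, assuming simple tuple-valued training examples; the attribute values and data below are illustrative, not from the paper:

```python
# Minimal FIND-S sketch (Question 5): start from the most specific
# hypothesis and generalize it using only the positive examples.
# The attribute values and training data below are illustrative.

def find_s(examples):
    """examples: list of (attributes, label) pairs; label True = positive."""
    hypothesis = None
    for attrs, label in examples:
        if not label:           # FIND-S ignores negative examples
            continue
        if hypothesis is None:  # first positive example: most specific start
            hypothesis = list(attrs)
        else:                   # generalize any attribute that disagrees
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attrs)]
    return hypothesis

data = [
    (("sunny", "warm", "normal"), True),
    (("sunny", "warm", "high"), True),
    (("rainy", "cold", "high"), False),
]
print(find_s(data))  # ['sunny', 'warm', '?']
```

Note how the negative example never changes the hypothesis, matching options (A)-(C).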
6. Regarding bias and variance, which of the following statements are true? (Here 'high' and 'low' are relative to the ideal model.)
(A) Models which overfit have a high bias.
(B) Models which overfit have a low bias.
(C) Models which underfit have a high variance.
(D) None of these

7. Which of the following sentences is FALSE regarding regression?
(A) It relates inputs to outputs.
(B) It is used for prediction.
(C) It may be used for interpretation.
(D) It discovers causal relationships.

8. You observe the following while fitting a linear regression to the data: as you increase the amount of training data, the test error decreases and the training error increases. The training error is quite low (almost what you expect it to be), while the test error is much higher than the training error. What do you think is the main reason behind this behavior? Choose the most probable option.
(A) High variance
(B) High model bias
(C) High estimation bias
(D) None of the above

9. Adding more basis functions in a linear model... (pick the most probable option)
(A) Decreases model bias
(B) Decreases estimation bias
(C) Decreases variance
(D) Doesn't affect bias and variance

10. Which of the following will be true about k in k-NN in terms of bias?
(A) When you increase k, the bias will increase
(B) When you decrease k, the bias will increase
(C) Can't say
(D) None of these

11. Which of the following distance measures do we use in case of categorical variables in k-NN?
1. Hamming Distance
2. Euclidean Distance
3. Manhattan Distance
(A) 1
(B) 2
(C) 3
(D) 2 and 3

12. Imagine you are working with "Analytics Vidhya" and you want to develop a machine learning algorithm which predicts the number of views on the articles. Your analysis is based on features like author name, number of articles written by the same author on Analytics Vidhya in the past, and a few other features. Which of the following evaluation metrics would you choose in that case?
1. Mean Square Error
2. Accuracy
3. F1 Score
(A) Only 1
(B) Only 2
(C) Only 3
(D) 1 and 3
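For Question 11, the Hamming distance between two categorical feature vectors simply counts the positions where they disagree. A quick sketch; the feature values are made up:

```python
# Hamming distance for categorical features (Question 11): counts the
# positions where two equal-length feature vectors disagree.
# The feature values below are made up.

def hamming(a, b):
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

print(hamming(("red", "suv", "petrol"), ("red", "sedan", "diesel")))  # 2
```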
13. At a certain university, 4% of men are over 6 feet tall and 1% of women are over 6 feet tall. The total student population is divided in the ratio 3:2 in favour of women. If a student is selected at random from among all those over six feet tall, what is the probability that the student is a woman?
(A) 2/5
(B) 3/5
(C) 3/11
(D) 1/100

14. The macromutation operator is also known as
(A) Headed chicken
(B) Headless chicken
(C) SPX operator
(D) BLX operator

15. Choose the FALSE statement. The gradient of a continuous and differentiable function
(A) is zero at a minimum
(B) is non-zero at a maximum
(C) is zero at a saddle point
(D) decreases as you get closer to the minimum

16. The computational complexity of gradient descent is
(A) linear in D
(B) linear in N
(C) polynomial in D
(D) dependent on the number of iterations

17. Let's say you are using an activation function X in the hidden layers of a neural network. At a neuron for any given input, you get the output as "-0.0001". Which of the following activation functions could X represent?
(A) ReLU
(B) tanh
(C) SIGMOID
(D) None of these

18. Which of the following hyperparameter(s), when increased, may cause a random forest to overfit the data?
1. Number of Trees
2. Depth of Tree
3. Learning Rate
(A) Only 1
(B) Only 2
(C) 1 and 2
(D) 2 and 3
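Question 13 can be checked with Bayes' theorem: with women:men = 3:2 we have P(woman) = 3/5 and P(man) = 2/5, with P(tall | woman) = 0.01 and P(tall | man) = 0.04, giving P(woman | tall) = 3/11, matching option (C). A quick verification:

```python
from fractions import Fraction

# Worked check of Question 13 using Bayes' theorem.
# Women : men = 3 : 2, P(tall | man) = 4%, P(tall | woman) = 1%.
p_w, p_m = Fraction(3, 5), Fraction(2, 5)
p_tall_w, p_tall_m = Fraction(1, 100), Fraction(4, 100)

p_tall = p_w * p_tall_w + p_m * p_tall_m   # total probability of being tall
p_w_given_tall = p_w * p_tall_w / p_tall   # Bayes' rule
print(p_w_given_tall)  # 3/11
```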
19. Which of the following is a disadvantage of decision trees?
(A) Factor analysis
(B) Decision trees are robust to outliers
(C) Decision trees are prone to overfit
(D) None of the above

20. To find the minimum or the maximum of a function, we set the gradient to zero because:
(A) The value of the gradient at extrema of a function is always zero
(B) Depends on the type of problem
(C) Both A and B
(D) None of the above

21. In the Delta Rule for error minimization,
(A) weights are adjusted with respect to change in the output
(B) weights are adjusted with respect to the difference between desired output and actual output
(C) weights are adjusted with respect to the difference between input and output
(D) none of the above

22. Back propagation is a learning technique that adjusts weights in the neural network by propagating weight changes
(A) Forward from source to sink
(B) Backward from sink to source
(C) Forward from source to hidden nodes
(D) Backward from sink to hidden nodes

23. Which of the following neural networks uses supervised learning?
A. Multilayer perceptron
B. Self-organizing feature map
C. Hopfield network
Choose the correct answer:
(A) A only
(B) B only
(C) A and B only
(D) A and C only

24. Which of the following sentences is incorrect in reference to Information gain?
(A) It is biased towards single-valued attributes
(B) It is biased towards multi-valued attributes
(C) ID3 makes use of information gain
(D) The approach used by ID3 is greedy
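Question 24 concerns information gain as used by ID3. A small sketch of how it is computed, as the entropy of the labels minus the weighted entropy after a split; the toy weather-style data is illustrative:

```python
import math
from collections import Counter

# Information gain as used by ID3 (Question 24): entropy of the labels minus
# the weighted entropy after splitting on an attribute. Toy data is made up.

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    n = len(labels)
    split = {}
    for row, lab in zip(rows, labels):
        split.setdefault(row[attr], []).append(lab)
    remainder = sum(len(part) / n * entropy(part) for part in split.values())
    return entropy(labels) - remainder

rows = [{"wind": "weak"}, {"wind": "strong"}, {"wind": "weak"}, {"wind": "strong"}]
labels = ["yes", "yes", "yes", "no"]
print(round(info_gain(rows, labels, "wind"), 3))  # 0.311
```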
25. What are the two steps of tree pruning?
(A) Pessimistic pruning and Optimistic pruning
(B) Post-pruning and Pre-pruning
(C) Cost complexity pruning and time complexity pruning
(D) None of the options

26. Which one of these is not a tree-based learner?
(A) CART
(B) ID3
(C) Bayesian classifier
(D) Random Forest

27. What are tree-based classifiers?
(A) Classifiers which form a tree with each attribute at one level
(B) Classifiers which perform a series of condition checks with one attribute at a time
(C) Both A and B
(D) None of the options

28. Decision nodes are represented by
(A) Disks
(B) Squares
(C) Circles
(D) Triangles

29. Previous probabilities in Bayes' Theorem that are changed with the help of new available information are classified as
(A) independent probabilities
(B) posterior probabilities
(C) interior probabilities
(D) dependent probabilities

30. Which of the following is true about Naive Bayes?
(A) Assumes that all the features in a dataset are equally important
(B) Assumes that all the features in a dataset are independent
(C) Both A and B
(D) None of the above options

31. The method in which the previously calculated probabilities are revised with new probabilities is classified as
(A) updating theorem
(B) revised theorem
(C) Bayes theorem
(D) dependency theorem

32. Which of the following is a widely used and effective machine learning algorithm based on the idea of bagging?
(A) Decision Tree
(B) Regression
(C) Classification
(D) Random Forest

33. Which of the following is a good test dataset characteristic?
(A) Large enough to yield meaningful results
(B) Is representative of the dataset as a whole
(C) Both A and B
(D) None of the above
34. What is the arity in the case of the crossover operator in GA?
(A) Number of parents used for the operator
(B) Number of offspring used for the operator
(C) Both A and B
(D) None

35. Which of the following statements about regularization is not correct?
(A) Using too large a value of lambda can cause your hypothesis to underfit the data.
(B) Using too large a value of lambda can cause your hypothesis to overfit the data.
(C) Using a very large value of lambda cannot hurt the performance of your hypothesis.
(D) None of the above

36. You are given reviews of movies marked as positive, negative, and neutral. Classifying reviews of a new movie is an example of
(A) Supervised Learning
(B) Unsupervised Learning
(C) Reinforcement Learning
(D) None of these

37. Regarding bias and variance, which of the following statements are true? (Here 'high' and 'low' are relative to the ideal model.)
(i) Models which overfit have a high bias.
(ii) Models which overfit have a low bias.
(iii) Models which underfit have a high variance.
(iv) Models which underfit have a low variance.
(A) (i) and (ii)
(B) (ii) and (iv)
(C) (iii) and (iv)
(D) None of these

38. What is the purpose of restricting the hypothesis space in machine learning?
(A) It can be easier to search
(B) It may avoid overfit since restricted hypotheses are usually simpler (e.g. linear or low-order decision surface)
(C) Both of the above
(D) None of the above

39. Suppose you find that your linear regression model is underfitting the data. In such a situation, which of the following options would you consider?
(A) You will add more features
(B) You will start introducing higher-degree features
(C) You will remove some features
(D) Both A and B

40. Consider a simple linear regression model with one independent variable (X). The output variable is Y. The equation is Y = aX + b, where a is the slope and b is the intercept. If we change the input variable (X) by 1 unit, by how much will the output variable (Y) change?
(A) 1 unit
(B) By the slope (a)
(C) By the intercept (b)
(D) None
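Question 40 can be confirmed directly: for Y = aX + b, a one-unit change in X changes Y by exactly the slope a. The constants below are arbitrary:

```python
# Question 40 check: for Y = a*X + b, changing X by one unit changes Y by
# exactly the slope a. The constants below are arbitrary.
a, b = 2.5, 7.0
y = lambda x: a * x + b
print(y(4.0) - y(3.0))  # 2.5
```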
41. You have generated data from a 3-degree polynomial with some noise. What do you expect of a model that was trained on this data using a 5-degree polynomial as the function class?
(A) Low bias, high variance
(B) High bias, low variance
(C) Low bias, low variance
(D) High bias, high variance

42. Genetic Algorithms are a part of
(A) Evolutionary Computing
(B) Inspired by Darwin's theory about evolution: "survival of the fittest"
(C) Adaptive heuristic search algorithms based on the evolutionary ideas of natural selection and genetics
(D) All of the above

43. What are the two types of learning?
(A) Improvised and un-improvised
(B) Supervised and unsupervised
(C) Layered and unlayered

44. Unsupervised learning is
(A) learning without computers
(B) problem-based learning
(C) learning from environment
(D) learning from teachers

45. In supervised learning,
(A) classes are not predefined
(B) classes are predefined
(C) classes are not required
(D) classification is not done

46. Mutating a strain is:
(A) Changing all the genes in the strain.
(B) Removing one gene in the strain.
(C) Randomly changing one gene in the strain.
(D) Removing the strain from the population.

47. Genetic Algorithms are considered pseudo-random because they:
(A) Search the solution space in a random fashion.
(B) Search the solution space using the previous generation as a starting point.
(C) Have no knowledge of what strains are contained in the next generation.
(D) Use random numbers.
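Questions 46-48 refer to mutation and crossover on "strains". A minimal bit-string sketch of the two operators; the encoding and the one-point crossover variant are illustrative assumptions, not taken from the paper:

```python
import random

# One-point crossover and single-gene mutation (Questions 46-48), using
# bit-string "strains"; the encoding and rates here are illustrative choices.

def crossover(p1, p2, point):
    # child receives genes from both parents, split at the crossover point
    return p1[:point] + p2[point:]

def mutate(strain, rng):
    # randomly change one gene in the strain (Question 46, option C)
    i = rng.randrange(len(strain))
    return strain[:i] + ("1" if strain[i] == "0" else "0") + strain[i + 1:]

child = crossover("111100", "000011", 3)
print(child)                            # 111011
print(mutate(child, random.Random(0)))  # child with exactly one bit flipped
```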
48. The three gene operators we have discussed can be thought of as:
(A) Crossover: receiving the best genes from both parents.
(B) Mutation: changing one gene so that the child is almost like the parent.
(C) Mirror: changing a string of genes in the child so it is like a cousin to the parent.
(D) A and B only

49. If a population contains only one strain, you can introduce new strains by:
A. Using the Crossover operator.
B. Injecting random strains into the population.
C. Using the Mutation operator.
Choose the correct answer:
(A) A only
(B) A and B only
(C) B only
(D) B and C only

50. The efficiency of a Genetic Algorithm (how quickly it arrives at the best solution) is dependent upon:
(A) The initial conditions.
(B) The size of the population.
(C) The types of operators employed.
(D) All of the above

51. Which of the following methods do we use to find the best-fit line for data in linear regression?
(A) Least Square Error
(B) Maximum Likelihood
(C) Logarithmic Loss
(D) Both A and B

52. Among the following, which one is not a "hyperparameter"?
(A) learning rate α
(B) number of layers L in the neural network
(C) activation values a[l]
(D) size of the hidden layers n[l]

53. (i) The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers.
(ii) The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.
Which of the following options is correct?
(A) (i) is correct and (ii) is incorrect
(B) (i) is incorrect while (ii) is correct
(C) Both are correct
(D) Both are incorrect

54. There are certain functions with the following properties:
(i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network).
(ii) To compute it using a deep network circuit, you need only an exponentially smaller network.
Which of the following options is correct?
(A) (i) is correct and (ii) is incorrect
(B) (i) is incorrect while (ii) is correct
(C) Both are correct
(D) Both are incorrect
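Question 51 names Least Square Error as the way to find the best-fit line. A sketch of the closed-form least-squares fit for y = ax + b; the data is generated from a known line so the result is easy to check:

```python
# Closed-form least-squares fit of y = a*x + b (Question 51): minimizes the
# squared error. Data generated from a known line so the fit is easy to check.

def least_squares(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx   # slope, intercept

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]     # exactly y = 2x + 1
print(least_squares(xs, ys))  # (2.0, 1.0)
```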
55. Factor Analysis involves:
(A) a dimensionality reduction technique
(B) finding correlation among variables
(C) capturing maximum variance in the data with a minimum number of variables
(D) All of the above

56. Which of the following is a way to reduce the skewness of a variable?
(A) Taking the log of the skewed variable
(B) Dividing each value of the skewed variable by its standard deviation
(C) Normalizing the skewed variable
(D) Standardizing the skewed variable

57. What causes overfitting?
(A) Large number of features in the data
(B) Noise in the data
(C) Both A and B
(D) None of the above

58. Given an image of a person,
(i) predicting the height of that person
(ii) finding whether the person is in a happy, angry or sad mood
the type of ML problem is
(A) (i) is classification while (ii) is a regression problem
(B) (ii) is classification while (i) is a regression problem
(C) Both are classification problems
(D) Both are regression problems

59. What does the fitness function represent to describe an optimization problem?
(A) Objective function
(B) Scaling function
(C) Chromosome decoding function
(D) All of the above

60. Which of the following algorithms is called a Lazy Learner?
(A) KNN
(B) SVM
(C) Naive Bayes
(D) Decision Tree

61. What are the main driving operators of a Genetic Algorithm?
(A) Selection
(B) Crossover
(C) Both A and B
(D) None of these
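Question 56 can be illustrated numerically: applying a log transform to a right-skewed variable reduces its sample skewness. The data and the population-skewness formula used here are illustrative choices:

```python
import math

# Question 56: taking the log of a right-skewed variable reduces its
# skewness. Sample skewness computed directly; the data below is made up.

def skewness(xs):
    n = len(xs)
    m = sum(xs) / n
    s = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    return sum(((x - m) / s) ** 3 for x in xs) / n

raw = [1, 1, 2, 2, 3, 3, 4, 50]           # long right tail
logged = [math.log(x) for x in raw]
print(skewness(raw) > skewness(logged))   # True
```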
62. Which of the following is true about bagging and boosting?
(A) Both are ensemble learning techniques
(B) Both combine the output of weak learners to make consistent predictions
(C) Both can be used to solve classification as well as regression problems
(D) All of the above

63. What causes underfitting?
(A) Less number of features in the data
(B) Less number of observations in the data
(C) Both A and B
(D) None of the above

64. The performance of GA is influenced by
(A) Population size
(B) Crossover rate
(C) Mutation rate
(D) All of the above

65. Which of the following are the main components of evolutionary computation?
(A) Initial population
(B) Fitness function
(C) Crossover, mutation and selection
(D) All of the above

66. Which of the following statement(s) is/are true about Genetic Algorithm?
(A) Genetic Algorithm mimics the process of natural selection
(B) Chromosomes play vital roles in GA
(C) Both A and B
(D) Chromosomes can't be encoded

67. Characteristics of an individual are represented by
(A) Chromosomes
(B) Gray Code
(C) Initial population
(D) None of the above

68. What is the main concept of evolutionary computation?
(A) Survival of the fittest
(B) Survival of the weakest
(C) Phenotype
(D) None of these

69. Selective pressure is also known as
(A) Takeover Time
(B) Candidate solution
(C) Proportionate time
(D) None of the above

70. Which selection strategy is susceptible to high selection pressure and low population diversity?
(A) Roulette-wheel selection
(B) Rank-based selection
(C) Tournament selection
(D) All of the above
