
SHRI RAMSWAROOP MEMORIAL GROUP OF PROFESSIONAL COLLEGES

B.Tech. [SEM VIII]


OBJECTIVE TYPE QUESTION BANK-II
(Session: 2020-21)
RCS-080: Machine Learning

Unit: II Unit Name: Decision Tree Learning


Course Outcome: CO2 Name of Faculty: Er. Saurabh Kumar Jain

TOPIC-WISE OBJECTIVE QUESTIONS

Topic Set-1: Decision tree learning algorithm, Issues in Decision tree learning    Source Lecture(s): U2_L1 to U2_L2    Ref.: T1, R1 & D1

[A] In the questions below, each statement has only one correct option: G S

A _________ is a decision support tool that uses a tree-like graph or model of
decisions and their possible consequences, including chance event outcomes,
resource costs and utility.
Q1) a) Decision trees. M (a)
b) Graphs.
c) Neural network.
d) ANN
Which of the following is not a Decision Tree node?
a) Decision Node
Q2) b) Chance Node M (d)
c) End Node
d) Leaf Node
Decision Nodes are represented by ____________
a) Disks
Q3) b) Squares L (b)
c) Circles
d) Triangles
Chance Nodes are represented by __________
a) Disks
Q4) b) Squares H (c)
c) Circles
d) Triangles
End Nodes are represented by __________
a) Disks
Q5) b) Squares M (d)
c) Circles
d) Triangles
Q6) Tree/Rule based classification algorithms generate ______ rules to perform the H (a)
classification.
a) If-then.
b) While
c) Do while
d) Switch
Which of the following is a widely used and effective machine learning algorithm
based on the idea of bagging?

Q7) a) Decision Tree H (d)


b) Regression
c) Classification
d) Random Forest 
Cost complexity pruning algorithm is used in?
a) CART
Q8) b) C4.5 M (a)
c) ID3
d) Bayesian Classifier
Which one of these is not a tree based learner?
a) CART
Q9) b) C4.5 M (d)
c) ID3
d) Bayesian Classifier
Which one of these is a tree based learner?
a) Rule based
Q10) b) Bayesian Belief Network M (d)
c) Bayesian classifier
d) Random Forest
What is the approach of the basic algorithm for decision tree induction?
a) Greedy
Q11) b) Top Down M (a)
c) Procedural
d) Step by Step
How will you counter over-fitting in a decision tree?
a) By pruning the longer rules
Q12) b) By creating new rules H (a)
c) Both 'by pruning the longer rules' and 'by creating new rules'
d) By pruning the shorter rules
Which of the following is a disadvantage of decision trees?
a) Factor analysis
Q13) b) Decision trees are robust to outliers L (c)
c) Decision trees are prone to overfitting
d) Decision trees are prone to underfitting
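Illustration (not part of the question set): a minimal Python sketch of the pruning ideas asked about in Q8 and Q12 above, assuming scikit-learn is installed; the Iris dataset and the ccp_alpha value of 0.02 are illustrative choices, not values from the text.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)                      # illustrative dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

unpruned = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pruned = DecisionTreeClassifier(ccp_alpha=0.02,        # cost-complexity pruning (as used by CART)
                                random_state=0).fit(X_train, y_train)

print(unpruned.get_n_leaves(), pruned.get_n_leaves())  # the pruned tree typically has fewer leaves
print(unpruned.score(X_test, y_test), pruned.score(X_test, y_test))

Cost-complexity pruning trades a little training accuracy for a smaller tree, which usually generalizes better.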
[B] In the questions below, the statements may have more than one correct option: G S
Pruning is used in decision trees for the purpose of
a) Reduces the size of decision trees.
Q1) b) Reduces the complexity of the final classifier, M (a), (b),(c)
c) Improves predictive accuracy.
d) Increase the size of decision trees.
How can we avoid overfitting in a decision tree?
a) Pre pruning
Q2) b) Post pruning M (a),(b),(c)
c) Sequester a proportion of the original data.
d) Regularization
Q3) Decision trees can handle (a),(b)
a) Categorical data. M
b) Numerical data.
c) Time series data
d) Character data

Which of the following classifications would not best suit the student performance
classification systems?
Q4) a) If-then analysis
b) Market-basket analysis H
c) Regression analysis (b),(c),(d)
d) Cluster analysis
Decision tree learning is generally best suited to problems with the following
characteristics:
a) Instances are represented by attribute-value pairs.
Q5) H (a),(b),(c)
b) The target function has discrete output values.
c) Disjunctive descriptions may be required.
d) The target function has continuous output values.
What is a Decision Tree?
a) Flow-chart
Q6) b) Structure in which each internal node represents a test on an attribute, each branch H (a),(b),(c)
represents an outcome of the test and each leaf node represents a class label
c) It is one way to display an algorithm that only contains conditional control
statements.
d) Uses a graph-like model of decisions and their possible consequences
Which of the following are the advantages of Decision Trees?
a) Possible scenarios can be added
Q7) b) Use a white box model, if a given result is provided by a model H (a),(b),(c)
c) Worst, best and expected values can be determined for different scenarios
d) A decision tree requires normalization of data
The method that measures the degree or probability of a particular variable being
wrongly classified when it is randomly chosen is known as
a) Gini Index
Q8) b) Gini Impurity M (a),(b)
c) Entropy
d) Gini purity
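Illustration (not part of the question set): a short, self-contained Python sketch of the impurity measures named in Q8; the class labels at the node are a made-up example.

from collections import Counter
from math import log2

def gini_impurity(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

node = ["yes", "yes", "yes", "no", "no"]                        # hypothetical class labels at a node
print(round(gini_impurity(node), 3), round(entropy(node), 3))   # 0.48 and 0.971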

Topic Set-2: Artificial Neural Network, Perceptron    Source Lecture(s): U2_L3    Ref.: T1, R1 & D1

[A] In the questions below, each statement has only one correct option: G S

In which of the following scenarios is gain ratio preferred over Information Gain?
a) When a categorical variable has a very large number of categories
Q1) b) When a categorical variable has a very small number of categories L (a)
c) The number of categories is not the reason
d) When low-cardinality problems occur
The fundamental unit of an artificial neural network is:
a) Brain
Q2) b) Nucleus L (c)
c) Neurons
d) Axon
What is the feature of ANNs due to which they can deal with noisy, fuzzy, inconsistent
data?
a) Associative nature of networks
Q3) M (c)
b) Distributive nature of networks
c) Both associative & distributive
d) Commutative nature of networks
What was the main deviation of the perceptron model from the MP (McCulloch-Pitts) model?
a) More inputs can be incorporated
Q4) b) Learning enabled H (b)
c) Hidden layer enabled
d) Hidden layer disabled
What was the 2nd stage in the perceptron model called?
a) Sensory units
Q5) b) Summing units H (c)
c) Association unit
d) Output unit
What is delta (error) in the perceptron model of a neuron?
a) Error due to environmental condition
Q6) b) Difference between desired & actual output M (b)
c) Can be both due to difference in target output or environmental condition
d) Error due to hidden layer
Major pruning techniques used in decision trees are
a) Minimum error
Q7) b) Smallest tree M (b)
c) Largest tree
d) Maximum error
A perceptron is:
a) A single layer feed-forward neural network with pre-processing
b) An auto-associative neural network
Q8) L (a)
c) A double layer auto-associative neural network
d) A neural network that contains feedback

Which of the following is not a promise of artificial neural networks?


a) It can explain result
Q9) b) It can survive the failure of some nodes L (a)
c) It has inherent parallelism
d) It can handle noise
A perceptron adds up all the weighted inputs it receives, and if the sum exceeds a certain
value, it outputs a 1; otherwise it outputs a 0.
a)True
Q10) M (a)
b) False
c) Sometimes – it can also output intermediate values as well
d) Can’t say
What is delta (error) in the perceptron model of a neuron?
a) Error due to environmental condition
Q11) b) Difference between desired & actual output M (b)
c) Can be both due to difference in target output or environmental condition
d) Difference between training & target output
Q12) In a neural network, knowing the weight and bias of each neuron is the most important M (c)
step. If you can somehow get the correct value of weight and bias for each neuron, you
can approximate any function. What would be the best way to approach this?
a) Assign random values and pray to God they are correct
b) Search every possible combination of weights and biases till you get the best value
c) Iteratively check that after assigning a value how far you are from the best values,
and slightly change the assigned values to make them better
d) Search every possible combination of weights till you get the best value
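Illustration (not part of the question set): a minimal Python/NumPy sketch of the perceptron unit described in Q8 and Q10, trained with the perceptron (delta) rule; the AND dataset, learning rate and epoch count are illustrative assumptions, not values from the text.

import numpy as np

def step(z):
    return 1 if z > 0 else 0                       # threshold unit: output 1 only if the sum exceeds 0

def train_perceptron(X, y, eta=0.1, epochs=20):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x_i, t in zip(X, y):
            o = step(np.dot(w, x_i) + b)           # sum of weighted inputs, then threshold
            delta = t - o                          # perceptron-rule error (delta)
            w += eta * delta * x_i                 # adjust weights in proportion to the error
            b += eta * delta
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])     # logical AND: linearly separable
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
print([step(np.dot(w, x) + b) for x in X])         # [0, 0, 0, 1]

Because AND is linearly separable, the update rule converges to a separating line.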

[B] In the questions below, the statements may have more than one correct option: G S

Various ways to adjust weights in an ANN are
a) Brute-force method
Q1) b) Batch Gradient Descent M (a),(b)
c) Incremental Gradient Descent
d) None of these
Which is true for neural networks?
a) It has set of nodes and connections
b) Each node computes its weighted input
Q2) M (a), (b),(c)
c) Node could be in excited state or non-excited state
d) It has set of nodes and no connections

What are the advantages of neural networks over conventional computers?


a) They have the ability to learn by example
Q3) b) They are more fault tolerant M (a), (b) ,(c)
c) They are more suited for real time operation due to their high computational rates
d) None of these
Single layer associative neural networks do not have the ability to:
a) Perform pattern recognition
Q4) b) Find the parity of a picture M (b),(c)
c) Determine whether two or more shapes in a picture are connected or not
d) None of these
What are tree based classifiers?
a) Classifiers which form a tree with each attribute at one level.
Q5) b) Classifiers which perform series of condition checking with one attribute at a time. M (a), (b)
c) Both a & b
d) None of these
Which of the following sentences are true?
a)In pre-pruning a tree is 'pruned' by halting its construction early
Q6) b) A pruning set of class labeled tuples is used to estimate cost complexity. H (a), (b) ,(c)
c) The best pruned tree is the one that minimizes the number of encoding bits.
d) A pruning set of class labeled tuples is used to estimate time complexity

A 3-input neuron is trained to output a zero when the input is 110 and a one when the
input is 111. After generalization, the output will be zero when and only when the
input is:
Q7) a)000 H (a), (b) ,(c),(d)
b)010
c)110
d)100
The ANN unit is composed of
a) Summation function
Q8) b) Threshold function M (a),(b)
c) Target function
d) Activation function

Topic Set-3: Gradient Descent & Delta Rule    Source Lecture(s): U2_L4    Ref.: T1, R1 & D1

[A] In the questions below, each statement has only one correct option: G S

What are the steps for using a gradient descent algorithm?


1. Calculate error between the actual value and the predicted value
2. Reiterate until you find the best weights of network
3. Pass an input through the network and get values from output layer
4. Initialize random weight and bias
5. Go to each neuron which contributes to the error and change its respective (d)
Q1) H
values to reduce the error
a) 1, 2, 3, 4, 5
b) 5, 4, 3, 2, 1
c) 3, 2, 1, 5, 4
d) 4, 3, 1, 5, 2
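Illustration (not part of the question set): a minimal NumPy sketch of the loop ordered in Q1, fitting a one-variable linear model by gradient descent; the synthetic data, learning rate and iteration count are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))                      # synthetic inputs (illustrative)
y = 3.0 * X[:, 0] + 0.5 + rng.normal(scale=0.1, size=100)

w, b = 0.0, 0.0                                    # step 4: initialise the weight and bias (zeros here)
eta = 0.1                                          # learning rate
for _ in range(200):                               # step 2: reiterate until the weights are good
    y_hat = w * X[:, 0] + b                        # step 3: pass the inputs through the model
    err = y_hat - y                                # step 1: error between actual and predicted values
    w -= eta * np.mean(err * X[:, 0])              # step 5: change each parameter along the
    b -= eta * np.mean(err)                        #         negative gradient to reduce the error
print(round(w, 2), round(b, 2))                    # close to the true values 3.0 and 0.5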

What if we use a learning rate that’s too large?


a) Network will converge
b) Network will not converge (b)
Q2) L
c) Can’t Say
d) Depend on the network

For a classification task, instead of random weight initializations in a neural


network, we set all the weights to zero. Which of the following statements is
true?
a) There will not be any problem and the neural network will train properly
Q3) M
b) The neural network will train but all the neurons will end up recognizing the (b)
same thing
c)The neural network will not train as there is no net gradient change
d) The neural network will train as there is no net gradient change
In a neural network, knowing the weight and bias of each neuron is the most
important step. If you can somehow get the correct value of weight and bias for
each neuron, you can approximate any function. What would be the best way to
approach this?
a) Assign random values and pray to God they are correct
Q4) M
b) Search every possible combination of weights and biases till you get the best (c)
value
c) Iteratively check that after assigning a value how far you are from the best
values, and slightly change the assigned values to make them better
d) Assign values which depend on the network
Which gradient technique is more advantageous when the data is too big to
handle in RAM simultaneously?
a) Full Batch Gradient Descent
Q5) L
b) Stochastic Gradient Descent (b)
c) Mini Gradient Descent
d) Gradient Descent
Name the function where, if plotted in an n-dimensional plane, the negative and
positive examples of the function can be totally separated using a straight plane (b)
across the space.
Q6) a) Separable function L
b) Linearly separable function
c) Graphically separable function
d) Logical separable function
Q7) Computational complexity of Gradient descent is, L
a) Linear in D
b) Linear in N (c)
c) Polynomial in D
d) Dependent on the number of iterations
ADALINE uses ____________(from the net input) to learn the model coefficients,
which is more “powerful” since it tells us by “how much” we were right or
wrong.
Q8) a) sorted values L
(b)
b) continuous predicted values
c) random values
d) Any value
ADALINE is a ____________neural network with multiple nodes where each
node accepts multiple inputs and generates one output. 
a) continuous (c)
Q9) L
b) discrete
c) single layer
d) two layer
Which of the following is also known as stochastic gradient descent?
a) batch gradient descent
Q10) b) Incremental gradient descent L (b)
c) mini gradient descent
d) mini delta descent
Incremental Gradient Descent can approximate Batch gradient descent arbitrarily
closely if η is made________.
a) large enough (b)
Q11) L
b) small enough
c) finite
d) integer
The ADALINE always converges (given small enough η) to the minimum
squared error, while the perceptron only converges when___________.
a) data is separable (a)
Q12) M
b) data is large
c) data is not separable
d) data is small
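Illustration (not part of the question set): a minimal NumPy sketch of the ADALINE/delta rule referred to in Q8-Q12, in its incremental (stochastic) form, which approximates batch gradient descent when η is small enough; the AND data with -1/+1 targets, the learning rate and the epoch count are illustrative assumptions.

import numpy as np

def adaline_incremental(X, y, eta=0.05, epochs=200):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x_i, t in zip(X, y):
            net = np.dot(w, x_i) + b               # continuous predicted value (the net input)
            delta = t - net                        # delta-rule error term
            w += eta * delta * x_i                 # incremental (per-example) weight update
            b += eta * delta
    return w, b

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1.0, -1.0, -1.0, 1.0])              # AND with -1/+1 targets (illustrative)
w, b = adaline_incremental(X, y)
print(np.where(X @ w + b > 0, 1, -1))              # [-1 -1 -1  1]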

[B] In the questions below, the statements may have more than one correct option: G S

Gradient of a continuous and differentiable function


a) is zero at a minimum
b) is non-zero at a maximum (a), (c), (d)
Q1) H
c) is zero at a saddle point
d) decreases as you get closer to the minimum

Gradient descent
a) Gradient descent will always find the global optimum (b), (c)
Q2) b) Steps are taken proportional to the gradient of the function at the current point H
c) The starting point could affect if a global optimum is found
d) The descent continues until the gradient is very large
Steps in the back propagation learning algorithm are:
a) Initialize weights with random values and set other parameters.
b) Read in the input vector and the desired output.
Q3) H (a), (b), (c), (d)
c) Compute the actual output via the calculations, working forward through the
layers.
d) Compute the error.
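Illustration (not part of the question set): a minimal NumPy sketch of one back-propagation iteration following the steps listed in Q3, on a tiny 2-2-1 sigmoid network; the random initialisation, input, target and learning rate are illustrative assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)    # (a) initialise weights with random values
W2, b2 = rng.normal(size=(1, 2)), np.zeros(1)
x, t = np.array([1.0, 0.0]), np.array([1.0])     # (b) read in the input vector and desired output
eta = 0.5

h = sigmoid(W1 @ x + b1)                         # (c) compute the actual output, working forward
o = sigmoid(W2 @ h + b2)                         #     through the layers

delta_o = (o - t) * o * (1 - o)                  # (d) compute the error term at the output
delta_h = (W2.T @ delta_o) * h * (1 - h)         #     and propagate it back to the hidden layer

W2 -= eta * np.outer(delta_o, h); b2 -= eta * delta_o    # gradient-descent weight updates
W1 -= eta * np.outer(delta_h, x); b1 -= eta * delta_h
o_new = sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)[0]
print(o[0], o_new)    # after the update the output should be closer to the target t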
Q4) Multilayer Networks provide better modeling for M (a), (b)
a) Complex network
b) Biological network
c) Monoplex networks
d) Local Area Network
Multilayer perceptron network
a) A neural network with several layers of nodes (or weights)
b) There are connections both between and within each layer
Q5) M (a),(d)
c) The number of units in each layer must be equal
d)Multiple layers of neurons allow for more complex decision boundaries than a
single layer
Back propagation
a) Is a learning algorithm for multilayer perceptron networks
b) The backward pass follows after the forward pass
Q6) M (a), (b)
c) Is based on a gradient descent technique to maximize the mean square
difference between the desired and actual outputs
d) Is also applicable to self-organizing feature maps
Weight updates in Back propagation
a) Usually, the weights are initially set to 0
Q7) b) Are proportional to the difference between the desired and actual outputs H (b),(c), (d)
c) The weight change is also proportional to the input to the weight layer
d) The output layer weights are used for computing the error of the hidden layer

For the function f(x) = 1 / (1 + e^(-x)):
a) f(x) is called a sigmoid function
Q8) b) It is beneficial because it does not limit the output value H (a),(c)
c) It is called an activation function and such a function is used on every
multilayer perceptron output
d) The derivative of the function is (f(x) + 1) f(x)
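Illustration (not part of the question set): a small NumPy check of the derivative asked about in Q8(d), assuming the stem refers to the sigmoid f(x) = 1/(1 + e^(-x)); the point x = 0.7 is an arbitrary choice.

import numpy as np

f = lambda x: 1.0 / (1.0 + np.exp(-x))           # assumed sigmoid from the stem
x, h = 0.7, 1e-6
numeric = (f(x + h) - f(x - h)) / (2 * h)        # central finite difference
analytic = f(x) * (1 - f(x))                     # f'(x) = f(x)(1 - f(x)), not (f(x) + 1) f(x)
print(round(numeric, 6), round(analytic, 6))     # both ~0.2217, so option (d) is indeed false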

Topic Set-4: Derivation of Back Propagation Rule & Back Propagation Algorithm    Source Lecture(s): U2_L5    Ref.: T1, R1 & D1

[A] In the questions below, each statement has only one correct option: G S

How does the name counter propagation signify its architecture?


a) Its ability to learn inverse mapping functions
Q1) b) Its ability to learn forward mapping functions M (c)
c) Its ability to learn forward and inverse mapping functions
d) Its ability to learn backward mapping functions
Back propagation is a learning technique that adjusts weights in the neural network by
propagating weight changes.
a) Forward from source to sink
Q2) H (b)
b) Backward from sink to source
c) Forward from source to hidden nodes
d) Backward from sink to hidden nodes
Q3) Representations of systems with heterogeneous nodes are known as M (a)
a) Node-colored graphs
b) Edge-colored graphs
c) Cyclic graphs

d) Directed graphs
Multiplex networks in neural networks are
a) Edge-colored graphs
Q4) b) Node-based graphs L (a)
c) Disconnected graphs
d) Vertex labeled graphs
Multilevel networks in neural networks are
a) Node-colored graphs
Q5) b) Edge-colored graphs L (b)
c) Vertex labeled graphs
d) Disconnected graphs
Representations of systems with heterogeneous connections between the nodes are known as
a) Node-colored graphs
Q6) b) Edge-colored graphs M (b)
c) Node-based graphs
d) Weighted graphs
A self-organizing list improves
a) Average access time
Q7) b) Insertion M (a)
c) Deletion
d) Binary search
Which of the following is not a rearranging method used to implement self-
organizing lists?
Q8) a) Count method H (d)
b) Move to front method
c) Ordering method
d) Least frequently used
Self-Organizing Maps are trained using:
a) Unsupervised learning
Q9) b) Supervised learning M (a)
c) Reinforcement learning
d) Semi supervised learning
SOMs perform dimension reduction using a clustering method:
a) K means clustering
Q10) b) Grid based clustering L (a)
c) Partition based clustering
d) Hierarchical based clustering
Kohonen self-organizing maps (SOM) are a type of _________ network:
a)Feed forward
Q11) b) Feed backward L (a)
c) Radial Basis Function Neural Network
d) Convolution Neural Network
Self-organizing maps differ from other artificial neural networks as they apply
a) Competitive learning
Q12) b) Unsupervised learning L (a)
c) Supervised learning
d) Reinforcement learning
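Illustration (not part of the question set): a minimal NumPy sketch of one SOM training step, i.e. competitive, unsupervised learning where the best-matching unit (BMU) and its grid neighbours are pulled towards the input; the 5x5 grid, input vector, learning rate and neighbourhood radius are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
grid = rng.random((5, 5, 3))             # 5x5 map of 3-dimensional weight vectors
x = np.array([0.9, 0.1, 0.4])            # one (hypothetical) input sample
eta, sigma = 0.5, 1.0                    # learning rate and neighbourhood radius

d = np.linalg.norm(grid - x, axis=2)                  # distance of every map node to x
bmu = np.unravel_index(np.argmin(d), d.shape)         # best-matching unit on the grid
rows, cols = np.indices(d.shape)
grid_dist2 = (rows - bmu[0]) ** 2 + (cols - bmu[1]) ** 2
h = np.exp(-grid_dist2 / (2 * sigma ** 2))            # Gaussian neighbourhood function
grid += eta * h[..., None] * (x - grid)               # pull BMU and its neighbours towards x

print(bmu, np.linalg.norm(grid[bmu] - x))             # the BMU is now halfway to x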

[B] In the questions below, the statements may have more than one correct option: G S
A SOM has two layers:
a) Input layers
Q1) b) Output layers M (a),(b)
c) Hidden layers
d) Network layers
SOM is used for
a) Clustering (a),(b)
M
Q2) b) Dimension reduction
c) Classification
d) Pattern matching
SOMs operate in two modes
a) Training
Q3) b) Mapping M (a),(b)
c) Testing
d) Filtering
Application of SOM are:
a) Project prioritization and selection (a),(b),
Q4) b) Seismic facies analysis for oil and gas exploration M
c) Creation of artwork (c),(d)
d) Failure mode and effects analysis
Point out the correct statements about SOM:
a) Self Organizing Map is trained using unsupervised learning.
Q5) b) Self Organizing Map performs dimension reduction using a clustering method, H (a),(b),(c),(d)
K-means clustering
c) Self Organizing Map operates in two modes.
d) It is used in the problem of dimension reduction.
Advantages of Self Organizing Map are:
a) Data is easily interpreted and understood.
(a),(b),
Q6) b) Self Organizing Map is very easy to understand. H
(c),(d)
c) Huge data sets can be tackled
d) Self Organizing Map has high performance speed.
Disadvantages of Self Organizing Map are:
a) The number of parameters that has to be set.
Q7) b) The size and topology of the map needs to be determined. H (a),(b), (c)
c) Spend time on the optimization of the mapping.
d) Only gives accurate result for small set of data.
The visible part of a self-organizing map is the map space, which consists of
components called
a) Nodes
Q8) b) Neurons M (a),(b)
c) Axons.
d) Nucleus

Topic Set-5: Multilayer Networks, Generalization    Source Lecture(s): U2_L5 & U2_L8    Ref.: T1, R1 & D1

[A] In the questions below, each statement has only one correct option: G S

Generalization means how good our model is at _______ from the given
data and ________ the learnt information elsewhere.
a) selecting, applying
Q1) M (c)
b) selecting, posting
c) learning, applying
d) learning, posting
Which of the following is a two-step process (learning step and prediction step) in machine
learning?
a) generalization
Q2) L (b)
b) classification
c) optimization
d) selection
This concept of learning from some data and correctly applying the gained
knowledge on other data is called _________________.
a) generalization (a)
Q3) L
b) classification
c) optimization
d) selection
If the neural network trains on the 10 breeds of dogs and refuses to classify
the other 2 breeds of dogs as dogs, then this neural network has ________ on
the training data.
a) underfit
Q4) L (b)
b) overfit 
c) dependent
d)Not dependent

In the learning step, the model is developed based on____________.


a) true data (c)
Q5) b) real data L
c) given training data
d) verified data
In the ________step, the model is used to predict the response for given data.
a) verification
(b)
Q6) b) prediction L
c) learning
d) testing
The performance of ________________ mostly depends upon its
generalization capability.
a) back propagation algorithm (c)
Q7) L
b) perceptron
c) Artificial Neural Networks
d) CNN
Generalization of Artificial Neural Networks (ANN) is the ability to
_________________.
a) handle unseen data (a)
Q8) L
b) used data
c) tested data
d) classified data
Q9) Which capability of the network is mostly determined by system complexity L (c)
and training of the network?
a) prediction
b) performance
c) generalization
d) Association
A feed forward neural network is an artificial neural network wherein
connections between the units _______a cycle.
a) form
Q10) L (b)
b) do not form
c) detect
d) select
A network with at least one unit that is not output or input, where the direction of
data flow is in only one direction is called_______.
a)Neural network
Q11) L (b)
b) Multi-layer Feed Forward Networks
c) Artificial neural network
d)CNN
Which layer consists of the set of nodes that are not input or output units?
a) input layer
Q12) b) output layer L (c)
c) hidden layer
d) Multiple layer
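Illustration (not part of the question set): a minimal scikit-learn sketch of the learning step / prediction step split and of generalization to unseen data discussed in this topic; the Iris dataset and the single hidden layer of 16 units are illustrative assumptions.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)                       # illustrative dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A small multilayer feed-forward network with one hidden layer of 16 units.
model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X_train, y_train)                             # learning step: fit on training data only
print(model.score(X_train, y_train))                    # accuracy on data the model has seen
print(model.score(X_test, y_test))                      # prediction step: accuracy on unseen data

A large gap between the two scores would signal overfitting, i.e. limited generalization.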

[B] In the questions below, the statements may have more than one correct option: G S
Overfitting
a) When the trained system matches the training set perfectly, overfitting may occur
b) Indicates limited generalization
Q1) H (a), (b), (d)
c) Should be avoided mainly because of the long training time
d) Stopping training earlier could reduce the problem

Multilayer perceptron network


a) a neural network with several layers of nodes (or weights)
b) There are connections both between and within each layer
Q2) H (a),(d)
c) The number of units in each layer must be equal
d)Multiple layers of neurons allow for more complex decision boundaries
than a single layer
Neural Networks
a) Nerve cells in the brain are called neurons
Q3) b) The output from the neuron is called dendrite M (a), (d)
c) One kind of neurons is called synapses
d) Learning takes place in the synapses
The perceptron
a) Invented by Hebb
b) Is a simplified model of the biological neuron
Q4) M (b), (c), (d)
c) Can be used to make multi-layer neural networks
d) Weights can be trained by adjusting them by an amount proportional to
the difference between the desired output and the actual output
Deep learning works well despite ________________ problem(s).
a) High capacity (Susceptible to over fitting)
Q5) b) Numerical instability (vanishing/exploding gradient) M (a),(b),
c) Sharp minima (c)
d) Local maxima
Q6) Which of the following statements are true when you use 1×1 convolutions in a CNN?
a) It can help in dimensionality reduction
b) It can be used for feature pooling H (a),(b),
c) It suffers less overfitting due to small kernel size (c)
d) It restricts activations from becoming too high or low
What steps can we take to prevent overfitting in a Neural Network?
a) Data Augmentation
Q7) b) Weight Sharing M (a),(b),(c),
c) Early Stopping (d)
d) Dropout
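Illustration (not part of the question set): a minimal NumPy sketch of one of the remedies listed in Q7, inverted dropout; the activation matrix and drop probability are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
activations = rng.random((4, 8))                  # hypothetical hidden-layer activations
p_drop = 0.5

mask = rng.random(activations.shape) >= p_drop    # keep each unit with probability 1 - p_drop
dropped = activations * mask / (1.0 - p_drop)     # rescale survivors (inverted dropout)
print(mask.mean(), dropped.mean())                # roughly half the units kept; mean roughly preserved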
Types of layers to build Convolutional network architectures are
a) Convolutional Layer
Q8) b) Pooling Layer H (a), (b), (c )
c) Fully-Connected Layer
d) Network layer

REFERENCES:

TEXT BOOKS:
[T1] Tom M. Mitchell, Machine Learning, McGraw-Hill Education (India) Private Limited, 2013.
[T2] Stephen Marsland, Machine Learning: An Algorithmic Perspective, CRC Press, 2009.
REFERENCE BOOKS:
[R1] Bishop, C., Pattern Recognition and Machine Learning, Berlin: Springer-Verlag, 2012.
ONLINE/DIGITAL REFERENCES:
[D1] Objective Questions in Machine Learning: https://www.javacodemonk.com/machine-learning-based-multiple-choice-questions-626ca098

…………. X ………………….

