
Quiz 1 Machine Learning II Total points 15/30

Let us say that we have computed the gradient of our cost function and
stored it in a vector g. What is the cost of one gradient descent update
given the gradient? 1/1

O(D)

O(N)

O(ND)

O(ND^2)
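
As a rough illustration of the update step this question asks about, the sketch below applies one gradient descent step to a weight vector with D entries, given a gradient that has already been computed. The names `w`, `g`, and `learning_rate` and all numeric values are assumptions made for this example, not part of the quiz.

```python
def gradient_descent_step(w, g, learning_rate):
    """Return the updated weights: one multiply and one subtract per entry."""
    return [w_i - learning_rate * g_i for w_i, g_i in zip(w, g)]


w = [1.0, -2.0, 0.5]   # D = 3 parameters (hypothetical values)
g = [0.5, -1.0, 0.0]   # gradient, already computed and stored
w_new = gradient_descent_step(w, g, learning_rate=0.5)
```

The update visits each of the D entries exactly once. The number of training examples N only matters when computing `g` itself, which this step takes as given.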

Which statement is true about the K-Means algorithm? 1/1

All attribute values must be categorical.

The output attribute must be categorical.

Attribute values may be either categorical or numeric.

All attributes must be numeric

Which function does the following multi-layer perceptron realize? 2/2

AND

XOR

NOR

NAND
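
The network diagram for this question is not reproduced in this text, so the sketch below does not use the quiz's weights. It only shows one way to work out which Boolean function a small threshold-unit perceptron realizes: evaluate it on all four input pairs and read off the truth table. The weights, biases, and function names are hypothetical.

```python
def step(z):
    """Hard-limiting activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def forward(x1, x2, hidden, output):
    """Two inputs, two hidden threshold units, one output threshold unit."""
    h = [step(w1 * x1 + w2 * x2 + b) for (w1, w2, b) in hidden]
    v1, v2, c = output
    return step(v1 * h[0] + v2 * h[1] + c)


# Hypothetical weights (w1, w2, bias) per hidden unit and (v1, v2, bias) for the output.
hidden = [(1.0, 1.0, -0.5), (1.0, 1.0, -1.5)]
output = (1.0, -1.0, -0.5)

truth_table = {(a, b): forward(a, b, hidden, output) for a in (0, 1) for b in (0, 1)}
```

With these illustrative weights the table is 0, 1, 1, 0 for inputs (0,0), (0,1), (1,0), (1,1). The same enumeration applies to any two-input network once its weights are read from the figure.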

Principal component analysis (PCA) 0/2

Finds the directions with the most variation in the data

Is useful for visualizing data

Dimensions are increased when applying PCA

Eigenvalues and eigenvectors are computed from the covariance matrix
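
A minimal sketch of the PCA steps these statements refer to, assuming a tiny two-dimensional data set invented for the example: center the data, form the covariance matrix, take its eigenvalues and leading eigenvector, and project onto that direction. The closed-form eigenvalue formula used here only applies to symmetric 2x2 matrices; a real implementation would normally call a linear algebra library.

```python
import math

# Hypothetical 2-D data set.
points = [(2.0, 1.0), (-2.0, -1.0), (1.0, 2.0), (-1.0, -2.0)]
n = len(points)

# Center the data on its mean.
mean_x = sum(p[0] for p in points) / n
mean_y = sum(p[1] for p in points) / n
centered = [(x - mean_x, y - mean_y) for x, y in points]

# Sample covariance matrix [[a, b], [b, c]].
a = sum(x * x for x, _ in centered) / (n - 1)
b = sum(x * y for x, y in centered) / (n - 1)
c = sum(y * y for _, y in centered) / (n - 1)

# Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
mid = (a + c) / 2
radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
eigenvalues = (mid + radius, mid - radius)

# Unit eigenvector for the largest eigenvalue (this form assumes b != 0).
vx, vy = b, eigenvalues[0] - a
norm = math.hypot(vx, vy)
first_component = (vx / norm, vy / norm)

# Project each 2-D point onto the first principal component: 2 dimensions become 1.
projected = [x * first_component[0] + y * first_component[1] for x, y in centered]
```

The largest eigenvalue is the variance along the first principal direction, and the projection has fewer dimensions than the input.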

The average squared difference between classifier predicted output and
actual output. 1/1

mean squared error

root mean squared error

mean absolute error

mean relative error
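
For reference, a short sketch of how an average of squared differences is computed, with its square root alongside. The function name and the sample values are assumptions made for illustration.

```python
import math


def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and targets."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


predicted = [2.0, 4.0, 7.0]   # hypothetical classifier outputs
actual = [1.0, 4.0, 5.0]      # hypothetical true outputs
mse = mean_squared_error(predicted, actual)   # (1 + 0 + 4) / 3
rmse = math.sqrt(mse)                         # root mean squared error
```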

Name *

Atharva Gondkar

A feed-forward neural network is said to be fully connected when 1/1

all nodes are connected to each other.

all nodes at the same layer are connected to each other.

all nodes at one layer are connected to all nodes in the next higher layer.

all hidden layer nodes are connected to all output layer nodes.

The average positive difference between computed and desired
outcome values. 0/1

root mean squared error

mean squared error

mean absolute error

mean positive error
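
A brief sketch of averaging absolute (always non-negative) differences between computed and desired values. The function name and the numbers are assumptions made for illustration.

```python
def mean_absolute_error(computed, desired):
    """Average of the absolute differences, so positive and negative errors cannot cancel."""
    return sum(abs(c - d) for c, d in zip(computed, desired)) / len(desired)


mae = mean_absolute_error([2.0, 4.0, 7.0], [1.0, 4.0, 5.0])   # (1 + 0 + 2) / 3
```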

K-Means 0/2

Automatically finds the number of clusters

Each cluster center is moved to the mean of data points assigned to it for each
iteration

A too small number of clusters may lead to overfitting

The algorithm has converged when the change in cluster assignment is less than a
threshold
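
The statements above can be checked against a minimal sketch of the standard K-Means loop. It assumes the number of clusters is fixed by the initial centers passed in, and it stops when no assignment changes (a threshold of zero). The function names and the data are invented for this example.

```python
def assign(points, centers):
    """Index of the nearest center (squared Euclidean distance) for each point."""
    return [
        min(range(len(centers)),
            key=lambda k: sum((p_i - c_i) ** 2 for p_i, c_i in zip(p, centers[k])))
        for p in points
    ]


def update(points, labels, centers):
    """Move each center to the mean of the points currently assigned to it."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:            # keep an empty cluster's center where it was
            new_centers.append(old)
            continue
        new_centers.append(tuple(sum(m[d] for m in members) / len(members)
                                 for d in range(len(old))))
    return new_centers


def k_means(points, centers, max_iter=100):
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centers)
        if new_labels == labels:   # converged: no assignment changed
            break
        labels = new_labels
        centers = update(points, labels, centers)
    return labels, centers


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels, centers = k_means(points, centers=[(0.0, 0.0), (6.0, 5.0)])
```

Note that the caller supplies the initial centers, and therefore the number of clusters; the loop itself never changes it.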

Roll No *

2176032

What strategies can help reduce overfitting in decision trees? 0/2

Pruning

Make sure each leaf node is one pure class

Enforce a minimum number of samples in leaf nodes

Enforce a maximum depth for the tree
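
Two of the options above are stopping rules applied while the tree is grown. The sketch below shows how such checks might look inside a tree builder. `may_split` is a hypothetical helper, not a function from any library, and pruning after growth is a separate step that is not shown.

```python
def may_split(depth, left_labels, right_labels, max_depth, min_samples_leaf):
    """Return True only if a proposed split respects both growth limits."""
    if depth >= max_depth:
        return False               # tree is already as deep as allowed
    if len(left_labels) < min_samples_leaf or len(right_labels) < min_samples_leaf:
        return False               # a child leaf would hold too few samples
    return True


# Hypothetical candidate splits at different depths.
ok = may_split(0, ["a", "a", "b"], ["b", "b", "b"], max_depth=3, min_samples_leaf=2)
too_deep = may_split(3, ["a", "a", "b"], ["b", "b", "b"], max_depth=3, min_samples_leaf=2)
too_small = may_split(0, ["a"], ["b", "b", "b"], max_depth=3, min_samples_leaf=2)
```

Both limits stop the tree before it carves out tiny regions around individual training points.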

Ensemble learning 0/2

A combination of classifiers is applied for classification

Classifiers should be trained to be slightly different

In bagging, each training sample (data point) is used only once for each iteration

Minority voting is used if there is disagreement
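
A small sketch of two mechanics commonly associated with bagging ensembles: drawing a bootstrap sample with replacement for each classifier, and combining predictions by majority vote. The helper names, the seed, and the data are assumptions for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement, so an item may repeat or be left out."""
    return [rng.choice(data) for _ in data]


def majority_vote(predictions):
    """Return the label predicted by the most classifiers."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)             # fixed seed so the sketch is repeatable
data = list(range(10))             # hypothetical training set of 10 items
sample = bootstrap_sample(data, rng)
vote = majority_vote(["spam", "ham", "spam"])
```

Because each classifier sees a different resample, the trained classifiers end up slightly different from one another, which is what makes combining them worthwhile.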

MLP

Gradient of a continuous and differentiable function 0/2

is zero at a minimum

is non-zero at a maximum

is zero at a saddle point

decreases as you get closer to the minimum
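
To make the statements about stationary points concrete, the sketch below evaluates the analytic gradients of two simple functions chosen for this example: a bowl with a minimum at the origin and a saddle with a saddle point at the origin.

```python
def grad_bowl(x, y):
    """Gradient of f(x, y) = x^2 + y^2, which has its minimum at (0, 0)."""
    return (2 * x, 2 * y)


def grad_saddle(x, y):
    """Gradient of f(x, y) = x^2 - y^2, which has a saddle point at (0, 0)."""
    return (2 * x, -2 * y)


at_minimum = grad_bowl(0.0, 0.0)
at_saddle = grad_saddle(0.0, 0.0)
far = grad_bowl(1.0, 0.0)          # farther from the minimum
near = grad_bowl(0.1, 0.0)         # closer to the minimum
```

The gradient vanishes at both the minimum and the saddle point, so a zero gradient alone does not identify a minimum. For this particular bowl the gradient also shrinks as the point approaches the minimum.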

During backpropagation training, the purpose of the delta rule is to make
weight adjustments so as to 0/1

minimize the number of times the training data must pass through the network.

minimize the number of times the test data must pass through the network.

minimize the sum of absolute differences between computed and actual outputs.

minimize the sum of squared error differences between computed and actual
output.
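
As a hedged illustration of a delta-rule style update, the sketch below trains a single linear unit by nudging each weight in proportion to the output error times the input, and tracks the sum of squared errors before and after. The data, learning rate, and function names are assumptions; a full backpropagation network applies the same idea layer by layer.

```python
def delta_rule_epoch(weights, samples, learning_rate):
    """One pass over the data, updating the weights after every sample."""
    for inputs, target in samples:
        output = sum(w * x for w, x in zip(weights, inputs))
        error = target - output
        weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
    return weights


def sum_squared_error(weights, samples):
    return sum((t - sum(w * x for w, x in zip(weights, xs))) ** 2 for xs, t in samples)


# Hypothetical data following target = 1 + 2 * x; the first input is a constant bias of 1.
samples = [((1.0, 0.0), 1.0), ((1.0, 1.0), 3.0), ((1.0, 2.0), 5.0)]
weights = [0.0, 0.0]
error_before = sum_squared_error(weights, samples)
for _ in range(200):
    weights = delta_rule_epoch(weights, samples, learning_rate=0.1)
error_after = sum_squared_error(weights, samples)
```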

The test set accuracy of a backpropagation neural network can often be
improved by 2/2

increasing the number of epochs used to train the network.

decreasing the number of hidden layer nodes.

increasing the learning rate.

decreasing the number of hidden layers.

Email *

atharva.gondkar@gmail.com

Multilayer perceptron network 0/1

Usually, the weights are initially set to small random values

A hard limiting activation function is often used

The weights can only be updated after all the training vectors have been presented

Multiple layers of neurons allow for less complex decision boundaries than a single
layer

Unsupervised learning 2/2

Categorizes training vectors by identifying similarities between them

Can use the same error functions as supervised learning

Collaborative learning methods are often applied between classes

The data applied is unlabeled

Biological neural networks 0/2

Synapses can be inhibitory or excitatory

Learning takes place in the dendrites

The outputs from neurons are pulses of fixed strength (height) and duration

The output from the neuron is called a synapse

Feedback

The correct answer is right because x, y, z

Support Vector Machines (SVMs) * 2/2

Support vectors are used for computing hyperplanes

Is a method for minimizing the margin to hyperplanes

Nonlinear problems are handled with mapping inputs to lower-dimensional space

Kernel functions are used for transforming data
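
A small sketch of what a kernel function does, using a degree-2 polynomial kernel on two-dimensional inputs as an assumed example: the kernel value equals an inner product in a higher-dimensional feature space, without that space ever being built explicitly. The function names and input vectors are hypothetical.

```python
import math


def poly_kernel(x, z):
    """Degree-2 polynomial kernel on 2-D inputs: (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def feature_map(x):
    """Explicit 3-D feature map whose inner product matches poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


x, z = (1.0, 2.0), (3.0, 4.0)
via_kernel = poly_kernel(x, z)                       # (3 + 8)^2
phi_x, phi_z = feature_map(x), feature_map(z)
via_features = sum(a * b for a, b in zip(phi_x, phi_z))
```

Both routes give the same number, but the kernel works directly on the original two-dimensional vectors.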


Logistic regression is a ________ regression technique that is used to
model data having a _____ outcome. 2/2

linear, numeric

linear, binary

nonlinear, numeric

nonlinear, binary
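
For context, a minimal sketch of how a logistic regression model turns inputs into a two-class prediction: a weighted sum is passed through the logistic (sigmoid) function to give a probability, which is then thresholded. The weights, bias, features, and the 0.5 threshold are assumptions for illustration.

```python
import math


def predict_probability(weights, bias, features):
    """Logistic function applied to a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


p = predict_probability([2.0, -1.0], 0.0, [1.0, 1.0])   # z = 1
label = 1 if p >= 0.5 else 0
```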

The values input into a feed-forward neural network 1/1

may be categorical or numeric.

must be either all categorical or all numeric but not both.

must be numeric.

must be categorical.
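
Since a network computes weighted sums of its inputs, categorical attributes are commonly converted to numbers before being fed in. One-hot encoding is one such scheme. The helper and the example attribute below are hypothetical.

```python
def one_hot(value, categories):
    """Encode a categorical value as a vector with a single 1.0 entry."""
    return [1.0 if value == c else 0.0 for c in categories]


colors = ["red", "green", "blue"]
# One categorical attribute plus one numeric attribute form a single numeric input vector.
encoded = one_hot("green", colors) + [0.7]
```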

This form was created inside of MIT University.

