DL - Quiz 2 - Google Forms

DL - Quiz2
Dear All,
Read all the instructions carefully.
This quiz is timed for 20 minutes.
DO NOT EDIT the FORM_TIMER_UNIQUE_IDENTIFIER field. It will get updated automatically.
Take screenshots of your responses in case you face issues during submission.
Thank you.
* Required
1. Email address *
2. PRN *
Questions
3. This question and the next two are based on this image. Given the vector shown, what
is its L1 norm? (NOTE: Enter only the computed number; round floating-point numbers
to 3 decimal places.) *
4. For the same vector, what is its L2 norm? *

5. What is its max norm? *
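
The vector for questions 3 to 5 was given as an image that is not reproduced here. As a minimal sketch of the three definitions, the snippet below uses a made-up vector `v` (an assumption for illustration, not the quiz's vector) and plain Python.

```python
import math

# Hypothetical vector, chosen only for illustration; not the vector from the quiz image.
v = [3.0, -4.0, 12.0]

l1_norm = sum(abs(x) for x in v)             # L1: sum of absolute values
l2_norm = math.sqrt(sum(x * x for x in v))   # L2: square root of the sum of squares
max_norm = max(abs(x) for x in v)            # max norm: largest absolute value

# Round to 3 decimal places, as the quiz instructions ask.
print(round(l1_norm, 3), round(l2_norm, 3), round(max_norm, 3))
```
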
6. Given 60000 grayscale images of size 28 x 28, what would be the shape of the tensor
required to store them? *
Mark only one oval.
(6000, 28, 28)
(60000, 28, 28)
(28, 28, 60000)
(60000, 28*28)
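
To make the idea of a tensor shape concrete, here is a small sketch that stacks images into a nested list and reads off its shape. It assumes the common convention of putting the samples axis first; `nested_shape` is a hypothetical helper, and only 3 blank images are used to keep the example light.

```python
def nested_shape(data):
    """Return the shape of a regular (non-ragged, non-empty) nested list as a tuple."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0]
    return tuple(shape)

# Hypothetical stand-in: 3 blank images instead of 60000.
num_images, height, width = 3, 28, 28
images = [[[0] * width for _ in range(height)] for _ in range(num_images)]

print(nested_shape(images))  # axes are (samples, height, width)
```
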
Questions (contd)
7. In a neural network, making a prediction is __________ while optimizing the
weights is __________ *
Mark only one oval.
forward pass, backward pass
backward pass, forward pass
forward pass, backpropagation
feed forward, backpropagation

8. A gradient in a neural network is defined as a *
Mark only one oval.
vector of partial derivatives of the loss score (or empirical loss) with respect to every
weight in the network
vector of derivatives of the loss score (or empirical loss) with respect to every weight in
the network
vector of the loss score (or empirical loss) computed with respect to every weight in the
network
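
As a rough illustration of a gradient as a collection of per-weight derivatives, the sketch below approximates each partial derivative of a toy loss with a central difference. The loss function, the sample values, and the helper `numerical_gradient` are all assumptions made for this example.

```python
def loss(weights):
    # Hypothetical toy loss: squared error of a two-weight linear model on one sample.
    w1, w2 = weights
    x1, x2, y_true = 1.0, 2.0, 3.0
    y_pred = w1 * x1 + w2 * x2
    return (y_pred - y_true) ** 2

def numerical_gradient(f, weights, eps=1e-6):
    """Approximate the partial derivative of f with respect to each weight."""
    grad = []
    for i in range(len(weights)):
        up = list(weights)
        down = list(weights)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

weights = [0.5, 0.5]
gradient = numerical_gradient(loss, weights)
print(gradient)  # one entry per weight
```
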
9. Select all that are true about a Jacobian
Check all that apply.
it is a vector of partial derivatives

it is a vector of gradients
it is computed when there are multiple functions to be dealt with
it is computed when there is a function that has multiple variables
it is computed when there are multiple functions where each function has multiple
variables
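
For reference, a Jacobian can be approximated numerically. The sketch below assumes a made-up pair of functions of two variables and a hypothetical helper `numerical_jacobian`; entry `[i][j]` estimates the partial derivative of output `i` with respect to variable `j`.

```python
def f(v):
    # Hypothetical example: two functions, each of the two variables x and y.
    x, y = v
    return [x * y, x + 2.0 * y]

def numerical_jacobian(func, point, eps=1e-6):
    """Central-difference estimate of the Jacobian of func at point."""
    n_out = len(func(point))
    jac = [[0.0] * len(point) for _ in range(n_out)]
    for j in range(len(point)):
        up = list(point)
        down = list(point)
        up[j] += eps
        down[j] -= eps
        f_up, f_down = func(up), func(down)
        for i in range(n_out):
            jac[i][j] = (f_up[i] - f_down[i]) / (2 * eps)
    return jac

jacobian = numerical_jacobian(f, [2.0, 3.0])
print(jacobian)  # row i holds the partial derivatives of output i
```
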
10. Observe the following neural network carefully and select the correct
representation for the final prediction Yp. *
Mark only one oval.
Yp = sigmoid ( relu ( relu ( x1*w11 + x2*w12 + x3*w13 + b_z1 ) * w2 + b_z2 ) * w_Yp + b_Yp )
Yp = sigmoid ( relu ( x1*w11 + x2*w12 + x3*w13 + b_z1 ) * w2 + b_z2 ) * w_Yp + b_Yp )
Yp = sigmoid ( relu ( relu ( x1*w11 + x2*w12 + x3*w13 + b_z2 ) * w2 ) * w_Yp + b_Yp )
Yp = sigmoid ( relu ( relu ( x, w + b_z1 ) * w2 + b_z2 ) * w_Yp + b_Yp )
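
The network diagram is not reproduced here, so the following is only a sketch of how a forward pass composes layer by layer. It assumes three inputs, two single-neuron ReLU layers, and a sigmoid output; that layout and every numeric value are assumptions, and only the parameter names are borrowed from the options above.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Made-up inputs and parameters, for illustration only.
x1, x2, x3 = 1.0, 2.0, 3.0
w11, w12, w13, b_z1 = 0.1, 0.2, 0.3, 0.5
w2, b_z2 = 2.0, -1.0
w_Yp, b_Yp = 0.5, 0.0

z1 = relu(x1 * w11 + x2 * w12 + x3 * w13 + b_z1)  # first hidden neuron
z2 = relu(z1 * w2 + b_z2)                         # second hidden neuron
Yp = sigmoid(z2 * w_Yp + b_Yp)                    # output squashed into (0, 1)

print(Yp)
```
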
Questions (contd)
11. Considering the steps involved in gradient descent optimization, which of the
following options do you think is the ideal size for a batch? *
Mark only one oval.
batch = single data item
batch = few samples of the data
batch = all the samples of the data
batch = all the training samples of the data
batch = few training samples of the data
batch = single data item from the training set
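
For context on what a "batch" means in practice, here is a minimal sketch of splitting a training set into consecutive batches. The helper `iterate_minibatches`, the toy data, and the batch size are assumptions for illustration; real training loops usually shuffle the samples first.

```python
def iterate_minibatches(samples, batch_size):
    """Yield consecutive slices of the training samples, batch_size items at a time."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

# Hypothetical training set of 10 items, stood in for by integers.
train_samples = list(range(10))
batches = list(iterate_minibatches(train_samples, batch_size=4))

for batch in batches:
    print(batch)  # the last batch may be smaller than batch_size
```
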

12. Consider the following graph of a neural network with a single input neuron and a
single output neuron (NOTE: there is no hidden neuron). The intermediate values
obtained are shown on the edges. Recall how the chain rule is applied in
backpropagation to update the weights. The next two questions are based on this
graph. Applying the chain rule, what is grad(loss_val, b)? *
Mark only one oval.
grad(loss_val, x2) * grad(x2, b)
grad(loss_val, y_true) * grad(loss_val, x2) * grad(loss_val, b)
grad(loss_val, y_true) * grad(loss_val, x2) * grad(x2, b)
grad(loss_val, x2) * grad(loss_val, b)
13. Now, for the same graph, write the expression for grad(loss_val, w). [NOTE:
Follow the notation used in the previous question.] *
14. Thus the chain rule in this backward graph says that you can obtain the derivative
of a node with respect to another node by *
Mark only one oval.
multiplying the derivatives for each edge along the path linking the two nodes
adding the derivatives for each edge along the path linking the two nodes
multiplying the derivatives for every intermediate node along the path linking the two
nodes
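
The graph for questions 12 to 14 is not reproduced here. As a sketch of how derivatives combine along a path, the snippet below assumes a graph of the form x1 = w * x, x2 = x1 + b, loss_val = |y_true - x2|; those operations, the node name `x1`, and all numeric values are assumptions. It then checks the result against a finite-difference estimate.

```python
# Assumed graph (not taken from the quiz image):
#   x1 = w * x,  x2 = x1 + b,  loss_val = |y_true - x2|
def forward(x, w, b, y_true):
    x1 = w * x
    x2 = x1 + b
    return x1, x2, abs(y_true - x2)

x, w, b, y_true = 2.0, 3.0, 1.0, 4.0
x1, x2, loss_val = forward(x, w, b, y_true)

# Local derivative on each edge of the backward graph.
grad_loss_x2 = 1.0 if x2 > y_true else -1.0  # d|y_true - x2| / dx2
grad_x2_x1 = 1.0                             # d(x1 + b) / dx1
grad_x2_b = 1.0                              # d(x1 + b) / db
grad_x1_w = x                                # d(w * x) / dw

# Combine the edge derivatives along each path from loss_val to the parameter.
grad_loss_b = grad_loss_x2 * grad_x2_b
grad_loss_w = grad_loss_x2 * grad_x2_x1 * grad_x1_w

# Sanity check with a central difference on w.
eps = 1e-6
numeric_w = (forward(x, w + eps, b, y_true)[2] - forward(x, w - eps, b, y_true)[2]) / (2 * eps)
print(grad_loss_b, grad_loss_w, numeric_w)
```
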
15. Read the following question carefully *
Mark only one oval.
-2.808
-0.648
9.36
-3.6
-1.808
16. Continuing with the example above, *
Mark only one oval.
-1.249
-2.28
3.369
-2.378
-0.249
17. ⏩ (DO NOT MODIFY THIS ANSWER - for official purposes only)
FORM_TIMER_UNIQUE_IDENTIFIER *
This is your unique identifier. Please do not modify this.

