Quiz 05 - More On Deep Neural Networks and CNN Intro

29/11/2020 Quiz 05: More on Deep Neural Networks and CNN Intro
Quiz 05: More on Deep Neural Networks

and CNN Intro
Your email address (makbar201013@mscse.uiu.ac.bd) will be recorded when you submit
this form. Not you? Switch account
Which of the algorithms are not suitable for mini-batch gradient descent 1 point
algorithm?
Delta-bar-delta
Adam
RMSProp
Because pooling layers do not have parameters, they do not affect the 1 point
backpropagation (derivatives) calculation. What do you think of the

statement?
True
False
Clear selection
You're editing your response. Sharing this URL allows others to also edit
OPEN BLANK FORM
your response.
https://docs.google.com/forms/d/e/1FAIpQLSdgx7QHosSvcAnlCi6Pd-seOXGa9Q4krkFFB_y4ERwJ2kVv4g/viewform?edit2=2_ABaOnudqLvPN6SVgpP8-1s… 1/6
Consider this figure. These plots were generated with gradient descent; 3 points
with gradient descent with momentum (\betaβ = 0.5) and gradient

descent with momentum (\betaβ = 0.9). Which curve corresponds to
which algorithm?
(1) is gradient descent. (2) is gradient descent with momentum (small β). (3) is
gradient descent with momentum (large β)
(1) is gradient descent. (2) is gradient descent with momentum (large β) . (3) is
gradient descent with momentum (small β)
(1) is gradient descent with momentum (small β), (2) is gradient descent with
momentum (small β), (3) is gradient descent
(1) is gradient descent with momentum (small β). (2) is gradient descent. (3) is
gradient descent with momentum (large β)
Clear selection
You have an input volume that is 32x32x16, and apply max pooling with a 2 points
stride of 2 and a filter size of 2. What is the output volume?
16x16x16
15x15x16
16x16x8
32x32x8
Clear selection
OPEN BLANK FORM
your response.
Which of these is NOT a good learning rate decay scheme? Here, t is the 2 points
epoch number. lets assume alpha0 is the initial learning rate.
alpha = alpha0 * (1/sqrt(t))
alpha = alpha0 * (1/(1+2t))
alpha = alpha0 * exp(t)
alpha = alpha0 * (0.95) ^ t, ^ denotes power
Clear selection
A neural network has three layers. Inputs to this layers are 32x32 size 2 points
single channel image. After the inputs are taken, there is a convolutional
layer with 32 filters each of size 3x3 followed by a fully connected layer.
The flattened output of the Convolutional Layer is fed to the fully
connected layer with 64 units. The last layer of this network is a softmax
layer with 4 units. What is the total number of learnable parameters?
You have an input volume that is 63x63x16, and convolve it with 32 filters 1 point
that are each 7x7, using a stride of 2 and no padding. What is the output
volume?
29x29x32
16x16x32
29x29x16
16x16x16
Clear selection
OPEN BLANK FORM
your response.
Which of the following is not true about Adagrad algorithm? 1 point
It can produce a division by zero error during backpropagation
The summation of the gradients accumulated might be too high to make the
algorithm too slow
Adagrad uses individual learning rates for all the parameters
Which of the following statements about Adam is False? 1 point
Adam combines the advantages of RMSProp and momentum
Adam should be used with batch gradient computations, not with mini-batches.
The learning rate hyperparameter α in Adam usually needs to be tuned.
What do you think this filter will do to the gray scale images? 1 point
Detect 45 degree edges
Detect horizontal edges
Detect image contrast
Detect vertical edges
Clear selection
OPEN BLANK FORM
your response.
You have an input volume that is 15x15x8, and pad it using “pad=2.” What is 1 point
the dimension of the resulting volume (after padding)?
17x17x8
17x17x10
19x19x8
19x19x12
Clear selection
You use an exponentially weighted average on the Dhaka temperature 2 points
dataset. You use the following to track the temperature: v(t) =βv(t−1) +
(1−β)θ(t). The red line below was computed using β=0.9. What would
happen to your red curve as you vary β?
Decreasing beta will shift the red line slightly to the right
Increasing beta will shift the red line slightly to the right
Decreasing beta will create more oscillations within the red line
Increasing
You're editing beta will create
your response. more
Sharing this oscillations within to
URL allows others thealso
rededit
line
OPEN BLANK FORM
your response.
You have an input volume that is 63x63x16, and convolve it with 32 filters 2 points
that are each 7x7, and stride of 1. You want to use a “same” convolution.
What is the padding?
Clear selection
Submit
Never submit passwords through Google Forms.
This form was created inside of United International University. Report Abuse
Forms
OPEN BLANK FORM
your response.

Quiz 05 - More On Deep Neural Networks and CNN Intro

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Quiz 05 - More On Deep Neural Networks and CNN Intro

Uploaded by

Copyright:

Available Formats

29/11/2020 Quiz 05: More on Deep Neural Networks and CNN Intro

Quiz 05: More on Deep Neural Networks

backpropagation (derivatives) calculation. What do you think of the

with gradient descent with momentum (\betaβ = 0.5) and gradient

alpha = alpha0 * (1/sqrt(t))

alpha = alpha0 * (1/(1+2t))

alpha = alpha0 * exp(t)

alpha = alpha0 * (0.95) ^ t, ^ denotes power

Which of the following is not true about Adagrad algorithm? 1 point

It can produce a division by zero error during backpropagation

Adagrad uses individual learning rates for all the parameters

Which of the following statements about Adam is False? 1 point

Adam combines the advantages of RMSProp and momentum

The learning rate hyperparameter α in Adam usually needs to be tuned.

Detect 45 degree edges

Detect horizontal edges

Detect image contrast

Detect vertical edges

the dimension of the resulting volume (after padding)?

You use an exponentially weighted average on the Dhaka temperature 2 points

Never submit passwords through Google Forms.

You might also like