Mitigation:
1. How does the ReLU activation function help in avoiding bad local minima in
feedforward neural networks?
2. How does the initialization of weights affect the performance of the ReLU activation
function in avoiding bad local minima in feedforward neural networks?
3. What are some alternative activation functions to ReLU for avoiding bad local minima
in feedforward neural networks?
4. How does the number of layers in a feedforward neural network affect the
performance of the ReLU activation function in avoiding bad local minima?
5. Describe the role of activation functions in feedforward neural networks.
6. Explain how the ReLU activation function can be used to avoid getting stuck in bad
local minima during training.
1. What is the goal of heuristics for faster training in feedforward neural networks?
2. How can batch normalization be used for faster training in feedforward neural
networks?
3. What are some other common heuristics used for faster training in feedforward neural
networks?
4. What are the limitations of heuristics for faster training in feedforward neural
networks?
5. What is the vanishing gradient problem, and how can it be mitigated in a feedforward
neural network?
6. Explain how batch normalization can be used as a heuristic for faster training in a
feedforward neural network.
1. What is Nesterov's Accelerated Gradient Descent, and how does it differ from
traditional gradient descent in feedforward neural networks?
2. How can Nesterov's Accelerated Gradient Descent be applied to improve the training of
feedforward neural networks?
3. What are the limitations of Nesterov's Accelerated Gradient Descent in feedforward
neural networks?
4. How can other optimization techniques be combined with Nesterov's Accelerated
Gradient Descent to improve the training of feedforward neural networks?
5. Describe the difference between gradient descent and Nesterov's accelerated gradient
descent algorithm.
6. Explain how the momentum term in Nesterov's accelerated gradient descent algorithm
can improve convergence speed.
Regularization:
Dropout:
UNIT III
CNN Architectures:
Convolution Operation:
1. Explain the concept of a convolutional operation in a CNN.
2. What are the advantages of using convolutional operations in image processing tasks?
1. What is a dilated convolution, and how does it differ from a standard convolution?
2. Explain how a transposed convolution is used in a CNN.
Structured Outputs:
Data Types:
1. What is the difference between grayscale and RGB images, and how does this affect
the input data type in a CNN?
2. Describe the role of data normalization in a CNN.
1. Explain how the fast Fourier transform (FFT) can be used to speed up convolution in
a CNN.
2. Describe the concept of a separable convolution and how it can improve the
efficiency of a CNN.
1. What are random features, and how are they used in a CNN?
2. Explain the concept of unsupervised learning in a CNN and give an example.
LeNet, AlexNet:
CNN Architectures:
1. Explain in detail the architecture of a convolutional neural network and its different
components, such as convolutional layers, pooling layers, and fully connected layers.
Also, provide an example of a typical CNN architecture and its use case.
2. Compare and contrast the architectures of LeNet and AlexNet, including their
differences in layer types and sizes.
3. Suppose you have an input image of size 28 x 28 pixels and a max pooling layer
of size 2 x 2 pixels with a stride of 2. You apply the max pooling operation to the
input image, resulting in a reduced output image. What is the size of the output
image?
Assume that the max pooling operation takes the maximum value of each 2 x 2 region of the
input image and outputs it as a single value in the corresponding location in the output image.
Solution:
4. Suppose you have an input image of size 64 x 64 pixels and a max pooling layer
of size 3 x 3 pixels with a stride of 2. You apply the max pooling operation to the
input image, resulting in a reduced output image. What is the size of the output
image?
Assume that the max pooling operation takes the maximum value of each 3 x 3 region of the
input image and outputs it as a single value in the corresponding location in the output image.
Solution:
Solution:
6. Suppose you have an input image of size 128 x 128 pixels and a filter/kernel of
size 3 x 3 pixels. You apply the convolution operation to the input image using
the given filter/kernel, and then apply a max pooling layer with a size of 2 x 2
pixels and a stride of 2. What is the size of the output after the max pooling
operation?
Solution:
Assume that the 2D convolution operation involves element-wise multiplication of each filter
with a corresponding portion of the input image, followed by summation of the resulting
values to obtain a single output value at each location in the feature map. Also, assume that
batch normalization is applied to each filter output independently, and ReLU activation is
applied to the batch-normalized output.
Solution:
8. Suppose you have an input image of size 256 x 256 pixels and a 2D max pooling
layer with a pool size of 2 x 2 pixels and a stride of 2. After the max pooling
operation, you apply a dropout layer with a dropout rate of 0.5. What is the size of the
output after the dropout operation?
Assume that the max pooling operation takes the maximum value of each 2 x 2 region of the
input feature map and outputs it as a single value in the corresponding location in the output
map. Also, assume that the dropout operation randomly sets half of the input units to zero
during training to prevent overfitting.
Solution:
Assume that the 2D convolution operation involves element-wise multiplication of each filter
with a corresponding portion of the input image, followed by summation of the resulting
values to obtain a single output value at each location in the feature map. Also, assume that
the max pooling operation takes the maximum value of each 2 x 2 region of the input feature
map and outputs it as a single value in the corresponding location in the output map.
Solution:
10. Suppose you have an input image of size 200 x 200 pixels and a 2D convolutional
layer with 16 filters, each of size 3 x 3 pixels, applied with a stride of 1 and zero
padding of size 1. After the convolution operation, you apply a batch normalization
layer and a ReLU activation function. What is the size of the output after the ReLU
activation function?
Assume that the 2D convolution operation involves element-wise multiplication of each filter
with a corresponding portion of the input image, followed by summation of the resulting
values to obtain a single output value at each location in the feature map. Also, assume that
the batch normalization layer normalizes the outputs of the convolutional layer to have zero
mean and unit variance, and the ReLU activation function sets any negative values to zero.
Solution:
11. Suppose you have a 2D input image of size 64 x 64 pixels and a max pooling layer
with a pooling window size of 2 x 2 pixels and stride of 2. What is the size of the
output feature map after applying the max pooling operation?
Solution:
12. Suppose you have a 3D input tensor of size 32 x 32 x 16 and a max pooling layer with
a pooling window size of 2 x 2 x 2 and stride of 2. What is the size of the output
feature map after applying the max pooling operation?
Solution:
13. Consider a CNN with an input layer of size 32 x 32, followed by a convolutional layer
(C1) with 16 filters of size 5 x 5 and a stride of 1, a max pooling layer (P1) with a
pooling window of size 2 x 2 and a stride of 2, and a fully connected layer (FC1) with
128 neurons. The input image is a color image with 3 channels.
(a) Answer the following questions in about 10 words each:
(b) Given the input image, what is the size of the output feature map after applying the
convolutional layer? Show all intermediate computations.
(c) What is the size of the output feature map after applying the max pooling layer? Show
all intermediate computations.
(d) What is the size of the input vector to the fully connected layer?
answers:
(a)
1. The purpose of the convolutional layer is to extract features from the input image
through the application of convolutional filters.
2. The purpose of the max pooling layer is to downsample the feature maps produced by
the convolutional layer by taking the maximum value within a pooling window.
3. The purpose of the fully connected layer is to perform classification or regression on
the features extracted by the convolutional and pooling layers.
(b) The size of the output feature map after applying the convolutional layer is 28 x 28 x 16.
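Explanation:
The convolutional layer has 16 filters of size 5 x 5 and a stride of 1.
The output feature map size is given by (input size - filter size)/stride + 1.
With no padding, the output feature map size is (32-5)/1 + 1 = 28. Each filter spans
all 3 input channels and produces one output channel, so 16 filters give 28 x 28 x 16.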
(c) The size of the output feature map after applying the max pooling layer is 14 x 14 x 16.
Explanation:
The max pooling layer has a pooling window of size 2 x 2 and a stride of 2.
The output feature map size is given by (input size - pooling window size)/stride + 1.
Applying a pooling window size of 2 x 2 with a stride of 2 reduces the spatial
dimensions of the input feature map by half in each dimension. Therefore, the output
feature map size is 14 x 14 x 16.
(d) The size of the input vector to the fully connected layer is 3136.
Explanation:
The output feature map size from the max pooling layer is 14 x 14 x 16.
Flattening the feature map into a vector yields a length of 14 x 14 x 16 = 3136.
The 128 neurons in the fully connected layer determine its output size, not its input
size. The layer therefore holds 3136 x 128 = 401408 weights (plus 128 biases).
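The shape arithmetic in question 13 can be checked with a few lines of Python. This is a minimal sketch, assuming no padding and that fractional sizes are rounded down; the helper name `output_size` is introduced here for illustration and is not part of the original question.

```python
def output_size(n, k, stride=1, padding=0):
    """Output length along one spatial axis for a conv or pooling window.

    Assumes floor rounding, which is the usual default in deep learning libraries.
    """
    return (n + 2 * padding - k) // stride + 1


# Question 13: 32 x 32 x 3 input -> C1 (16 filters, 5 x 5, stride 1)
#              -> P1 (2 x 2 window, stride 2) -> FC1 (128 neurons)
c1 = output_size(32, 5, stride=1)   # spatial size after C1
p1 = output_size(c1, 2, stride=2)   # spatial size after P1
flat = p1 * p1 * 16                 # length of the flattened vector fed to FC1
fc1_weights = flat * 128            # weight count of FC1 (biases not included)

print(f"C1 output: {c1} x {c1} x 16")
print(f"P1 output: {p1} x {p1} x 16")
print(f"FC1 input length: {flat}, FC1 weights: {fc1_weights}")
```

The number of neurons in FC1 only affects the weight count, not the length of the vector entering the layer.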
14. Consider a convolutional neural network with the following architecture:
(a) What is the size of the output feature map after applying the first convolutional layer?
Show your work.
(b) What is the size of the output feature map after applying the second convolutional layer?
Show your work.
(c) What is the size of the input vector to the fully connected layer? Show your work.
Solution:
(a) The output feature map size after applying the first convolutional layer is 26 x 26 x 16.
Explanation:
(b) The output feature map size after applying the second convolutional layer is 9 x 9 x 32.
Explanation:
The max pooling layer with pooling window size of 2 x 2 and stride of 2 reduces the
spatial dimensions of the input feature map by half in each dimension. Therefore, the
output feature map size after the first max pooling layer is 13 x 13 x 16.
The second convolutional layer has 32 filters of size 5 x 5 and a stride of 1.
The output feature map size is given by (input size - filter size)/stride + 1.
With no padding, the output feature map size is (13-5)/1 + 1 = 9. Applying 32 filters
results in an output feature map size of 9 x 9 x 32.
Another max pooling layer with pooling window size of 2 x 2 and stride of 2 gives
(9-2)/2 + 1 = 4.5 per spatial dimension, which is rounded down to 4 (the usual default).
Therefore, the final output feature map size after pooling is 4 x 4 x 32.
(c) The size of the input vector to the fully connected layer is 512.
Explanation:
The second convolutional layer outputs 9 x 9 x 32, and the second max pooling layer
(2 x 2, stride 2) reduces it to 4 x 4 x 32, assuming the fractional size 4.5 is rounded down.
Flattening the feature map into a vector yields a length of 4 x 4 x 32 = 512. (If the
pooling layer rounds up instead, the map is 5 x 5 x 32 and the length is 800.)
The 64 neurons in the fully connected layer set its output size, not its input size;
512 x 64 = 32768 is the number of weights in the layer.
The ReLU activation function sets negative values to zero but does not remove them,
so it does not change the length of the vector.
Formulas used in CNN. Here are some of the commonly used formulas:
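The rule the worked answers in this section rely on most is the output-size formula for a convolution or pooling layer. It is written out below as a reference sketch; the symbols W (input size along one axis), K (filter or window size), P (zero padding per side), S (stride) and C (number of channels) are notation introduced here, and the floor reflects the common default of rounding down.

```latex
% Output size along one spatial axis (convolution or pooling)
O = \left\lfloor \frac{W - K + 2P}{S} \right\rfloor + 1

% Example: W = 32, K = 5, P = 0, S = 1 gives O = 28

% Length of the vector obtained by flattening an H x W x C feature map
N_{\text{flat}} = H \cdot W \cdot C
```

For a convolutional layer the number of output channels equals the number of filters, while a pooling layer leaves the channel count unchanged.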
15. Assume you are feeding a Convolutional Neural Network with the following image matrix.
Each layer in the Deep Convolutional Neural Network performs some operation on
its input and outputs a new value.
Answer the following questions.
1 4 2 8 6
3 5 -11 4 -8
6 -9 6 -1 5
5 12 4 -14 3
-9 3 1 6 9
1. In the convolutional layer having the following filter, briefly explain what the output
is and why.
1 0
0 1
2. The output of the convolutional layer goes through the ReLU (Rectified Linear
Unit) layer. What will the output of this layer be? Explain this step briefly, with both
text and workings.
3. The output of the ReLU layer goes through a Max Pooling layer ([3 3]). What will the
output of this layer be? Explain your answer.
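1. The output of the convolutional layer can be obtained using the following formula:
O(i, j) = sum over m, n of I(i + m, j + n) x K(m, n)
Using the given filter [1 0; 0 1], each output value is the sum of a pixel and its
lower-right diagonal neighbour: O(i, j) = I(i, j) + I(i+1, j+1). With a stride of 1 and no
padding (assumed, since the question does not state them), the 5 x 5 input image gives a
4 x 4 output:
6 -7 6 0
-6 11 -12 9
18 -5 -8 2
8 13 10 -5
2. The ReLU layer applies the element-wise activation function f(x) = max(0,x) to the
output of the convolutional layer. Therefore, any negative value in the output of the
convolutional layer is replaced by zero, while the positive values remain unchanged.
Thus, the output of the ReLU layer for the above input image would be:
6 0 6 0
0 11 0 9
18 0 0 2
8 13 10 0
3. The Max Pooling layer outputs the maximum value in each 3 x 3 window of the ReLU
output. The question does not state the stride; assuming a stride of 1, the 4 x 4 input
gives a 2 x 2 output:
18 11
18 13
(With a stride equal to the window size, only one 3 x 3 window fits and the output is the
single value 18.)
The output size of the Max Pooling layer is reduced due to the down-sampling operation.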
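The three steps of question 15 (convolution, ReLU, max pooling) can be reproduced with a short pure-Python sketch. Stride 1 and no padding for the convolution, and stride 1 for the pooling layer, are assumptions, since the question does not state them; the function names are introduced here for illustration. The filter is applied without flipping (cross-correlation), which is what most deep learning libraries call convolution.

```python
def correlate2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + m][j + n] * kernel[m][n]
                for m in range(kh) for n in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Replace every negative entry with zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size, stride):
    """Take the maximum of each size x size window (incomplete windows are dropped)."""
    out_h = (len(fmap) - size) // stride + 1
    out_w = (len(fmap[0]) - size) // stride + 1
    return [
        [
            max(fmap[i * stride + m][j * stride + n]
                for m in range(size) for n in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [
    [1, 4, 2, 8, 6],
    [3, 5, -11, 4, -8],
    [6, -9, 6, -1, 5],
    [5, 12, 4, -14, 3],
    [-9, 3, 1, 6, 9],
]
kernel = [[1, 0], [0, 1]]

conv = correlate2d_valid(image, kernel)    # 4 x 4 feature map
act = relu(conv)                           # same shape, negatives zeroed
pooled = max_pool(act, size=3, stride=1)   # 2 x 2 under the stride-1 assumption

for name, fmap in (("conv", conv), ("relu", act), ("pool", pooled)):
    print(name)
    for row in fmap:
        print(" ", row)
```

Calling `max_pool(act, size=3, stride=3)` instead shows the other common reading of the question, in which only a single 3 x 3 window fits.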
Assume you are feeding a Convolutional Neural Network with the following grayscale image
matrix of size 6 x 6. Each layer in the Deep Convolutional Neural Network performs some
operation on its input and outputs a new value.
Image Matrix:
1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18
19 20 21 22 23 24
25 26 27 28 29 30
31 32 33 34 35 36
1. In the convolutional layer having the following filter, what is the output & why?
Filter:
1 0
0 -1
2. The output of the convolutional layer goes through a ReLU layer. What will the
output of this layer be? Briefly explain both text and workings.
3. The output of the ReLU layer goes through a Max Pooling layer ([2 2], stride=2).
What will the output of this layer be? Briefly explain both text and workings.
Image Matrix:
2 4 6 8 7
5 4 7 1 9
9 6 2 3 1
1 2 4 8 6
3 7 6 9 1
Perform vertical edge detection on this image using the following 3x3 filter:
-1 0 1
-2 0 2
-1 0 1
Solution: We can perform vertical edge detection using the convolution operation. We will
slide the filter over the image matrix and multiply the filter values with the corresponding
values of the image matrix, sum them up, and place the result in the output matrix.
Here are the steps to perform vertical edge detection on the given image using the given
filter:
1. We will start by placing the filter on the top-left corner of the image matrix:
2 4 6
5 4 7
9 6 2
We will then perform the convolution operation:
(-1)x2 + 0x4 + 1x6 + (-2)x5 + 0x4 + 2x7 + (-1)x9 + 0x6 + 1x2 = 1
The result of the convolution operation is 1, which we will place in the output
matrix at position (1,1).
2. Next, we will move the filter one step to the right:
4 6 8
4 7 1
6 2 3
We will perform the convolution operation:
(-1)x4 + 0x6 + 1x8 + (-2)x4 + 0x7 + 2x1 + (-1)x6 + 0x2 + 1x3 = -5
The result of the convolution operation is -5, which we will place in the output
matrix at position (1,2).
3. We will continue sliding the filter over the image matrix, performing the convolution
operation, and placing the result in the output matrix, until we reach the bottom-right
corner of the image matrix.
The resulting output matrix for vertical edge detection using the given filter on the given
image (stride 1, no padding, filter applied without flipping) is:
1 -5 4
-9 -3 2
2 11 -2
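The sliding-window procedure described above can be sketched in pure Python to compute every position of the output matrix. This is an illustrative sketch, assuming stride 1, no padding, and no kernel flip (cross-correlation, matching the multiply-and-sum steps in the solution); the names `correlate2d_valid` and `vertical_filter` are introduced here.

```python
def correlate2d_valid(image, kernel):
    """Multiply the kernel with each image window and sum (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + m][j + n] * kernel[m][n]
                for m in range(kh) for n in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [
    [2, 4, 6, 8, 7],
    [5, 4, 7, 1, 9],
    [9, 6, 2, 3, 1],
    [1, 2, 4, 8, 6],
    [3, 7, 6, 9, 1],
]
vertical_filter = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

edges = correlate2d_valid(image, vertical_filter)  # 3 x 3 output for a 5 x 5 input
for row in edges:
    print(row)
```

A strict mathematical convolution would rotate the filter by 180 degrees first; for this particular filter that only flips the sign of every output value.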
10 20 30 40 50
5 15 25 35 45
0 10 20 30 40
-5 5 15 25 35
-10 0 10 20 30
Perform horizontal edge detection on the given image matrix using the following filter:
-1 -1 -1
0 0 0
1 1 1
Solution:
Performing horizontal edge detection using the above filter involves sliding the filter over the
input image matrix and performing element-wise multiplication and summation. The
resulting output matrix represents the edges detected in the horizontal direction.
The first step is to create a padded matrix to avoid boundary effects. We add one column of
zeros to the left and right of the input matrix, and one row of zeros to the top and bottom.
0 0 0 0 0 0 0
0 10 20 30 40 50 0
0 5 15 25 35 45 0
0 0 10 20 30 40 0
0 -5 5 15 25 35 0
0 -10 0 10 20 30 0
0 0 0 0 0 0 0
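Next, we slide the filter over the padded matrix and perform element-wise multiplication and
summation. Each output value is the sum of the three pixels in the row below the window
centre minus the sum of the three pixels in the row above it:
20 45 75 105 80
-20 -30 -30 -30 -20
-20 -30 -30 -30 -20
-20 -30 -30 -30 -20
0 -15 -45 -75 -60
The resulting matrix represents the edges detected in the horizontal direction.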
Note: In practice, the output matrix would be further processed by passing it through an
activation function like ReLU to introduce non-linearity.
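The padding and sliding steps of this solution can be sketched in pure Python as follows. It is a minimal sketch, assuming one ring of zero padding, stride 1, and no kernel flip; the helper names `zero_pad` and `correlate2d_valid` are introduced here for illustration.

```python
def zero_pad(image, pad):
    """Surround the image with `pad` rows and columns of zeros on every side."""
    width = len(image[0]) + 2 * pad
    top = [[0] * width for _ in range(pad)]
    body = [[0] * pad + list(row) + [0] * pad for row in image]
    bottom = [[0] * width for _ in range(pad)]
    return top + body + bottom


def correlate2d_valid(image, kernel):
    """Multiply the kernel with each image window and sum (stride 1, no extra padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + m][j + n] * kernel[m][n]
                for m in range(kh) for n in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [
    [10, 20, 30, 40, 50],
    [5, 15, 25, 35, 45],
    [0, 10, 20, 30, 40],
    [-5, 5, 15, 25, 35],
    [-10, 0, 10, 20, 30],
]
horizontal_filter = [
    [-1, -1, -1],
    [0, 0, 0],
    [1, 1, 1],
]

padded = zero_pad(image, 1)                           # 7 x 7 padded matrix
edges = correlate2d_valid(padded, horizontal_filter)  # 5 x 5, same size as the input
for row in edges:
    print(row)
```

With the padding in place, the output has the same 5 x 5 size as the input. The first and last rows differ from the interior because part of each window there falls on the zero border.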