
UNIT II

Mitigation:

1. What is the role of mitigation techniques in feedforward neural networks?


2. How can regularization be used for mitigation in feedforward neural networks?
3. What are some common mitigation techniques used in feedforward neural networks?
4. What are the limitations of mitigation techniques in feedforward neural networks?
5. What is overfitting in a neural network, and how can it be mitigated?
6. Explain how regularization techniques can be used to mitigate overfitting in a
feedforward neural network.

ReLU Heuristics for Avoiding Bad Local Minima:

1. How does the ReLU activation function help in avoiding bad local minima in
feedforward neural networks?
2. How does the initialization of weights affect the performance of the ReLU activation
function in avoiding bad local minima in feedforward neural networks?
3. What are some alternative activation functions to ReLU for avoiding bad local minima
in feedforward neural networks?
4. How does the number of layers in a feedforward neural network affect the
performance of the ReLU activation function in avoiding bad local minima?
5. Describe the role of activation functions in feedforward neural networks.
6. Explain how the ReLU activation function can be used to avoid getting stuck in bad
local minima during training.

Heuristics for Faster Training:

1. What is the goal of heuristics for faster training in feedforward neural networks?
2. How can batch normalization be used for faster training in feedforward neural
networks?
3. What are some other common heuristics used for faster training in feedforward neural
networks?
4. What are the limitations of heuristics for faster training in feedforward neural
networks?
5. What is the vanishing gradient problem, and how can it be mitigated in a feedforward
neural network?
6. Explain how batch normalization can be used as a heuristic for faster training in a
feedforward neural network.

Nesterov's Accelerated Gradient Descent:

1. What is Nesterov's Accelerated Gradient Descent, and how does it differ from
traditional gradient descent in feedforward neural networks?
2. How can Nesterov's Accelerated Gradient Descent be applied to improve the training of
feedforward neural networks?
3. What are the limitations of Nesterov's Accelerated Gradient Descent in feedforward
neural networks?
4. How can other optimization techniques be combined with Nesterov's Accelerated
Gradient Descent to improve the training of feedforward neural networks?
5. Describe the difference between gradient descent and Nesterov's accelerated gradient
descent algorithm.
6. Explain how the momentum term in Nesterov's accelerated gradient descent algorithm
can improve convergence speed.

Regularization:

1. What is the role of regularization in feedforward neural networks?


2. How can L1 and L2 regularization be applied in feedforward neural networks?
3. What are some other common regularization techniques used in feedforward neural
networks?
4. How does the strength of regularization affect the performance of a feedforward
neural network?
5. What is L1 regularization, and how can it be used to prevent overfitting in a
feedforward neural network?
6. Describe the role of early stopping in regularization of a feedforward neural network.

Dropout:

1. What is dropout, and how does it work in feedforward neural networks?


2. How can dropout be used to prevent overfitting in feedforward neural networks?
3. What are some limitations of using dropout in feedforward neural networks?
4. How can dropout be combined with other regularization techniques to improve the
performance of a feedforward neural network?
5. Explain how dropout regularization can be used to prevent overfitting in a
feedforward neural network.
6. Describe the trade-off between dropout and the learning rate in a feedforward neural
network.

UNIT III

CNN Architectures:

1. What is a convolutional neural network (CNN), and how is it different from a
traditional neural network?
2. Describe the key components of a CNN architecture, such as convolutional layers,
pooling layers, and fully connected layers.

Convolution Operation:
1. Explain the concept of a convolutional operation in a CNN.
2. What are the advantages of using convolutional operations in image processing tasks?

Variants of the Basic Convolution Function:

1. What is a dilated convolution, and how does it differ from a standard convolution?
2. Explain how a transposed convolution is used in a CNN.

Structured Outputs:

1. What is the difference between a classification and a regression task in a CNN?


2. Describe how a CNN can be used for object detection in an image.

Data Types:

1. What is the difference between grayscale and RGB images, and how does this affect
the input data type in a CNN?
2. Describe the role of data normalization in a CNN.

Efficient Convolution Algorithm:

1. Explain how the fast Fourier transform (FFT) can be used to speed up convolution in
a CNN.
2. Describe the concept of a separable convolution and how it can improve the
efficiency of a CNN.

Random or Unsupervised Features:

1. What are random features, and how are they used in a CNN?
2. Explain the concept of unsupervised learning in a CNN and give an example.

LeNet, AlexNet:

1. What is LeNet, and what was it originally designed for?


2. Describe the architecture of AlexNet and its contributions to the development of
CNNs.

5 and 7 Marks Questions

CNN Architectures:

1. Explain in detail the architecture of a convolutional neural network and its different
components, such as convolutional layers, pooling layers, and fully connected layers.
Also, provide an example of a typical CNN architecture and its use case.
2. Compare and contrast the architectures of LeNet and AlexNet, including their
differences in layer types and sizes.
3. Suppose you have an input image of size 28 x 28 pixels and a max pooling layer of
size 2 x 2 pixels with a stride of 2. You apply the max pooling operation to the
input image, resulting in a reduced output image. What is the size of the output
image?

Assume that the max pooling operation takes the maximum value of each 2 x 2 region of the
input image and outputs it as a single value in the corresponding location in the output image.

Solution:

• The input image size is 28 x 28 pixels.
• The max pooling window is 2 x 2 pixels, so each 2 x 2 region of the input image is
reduced to its maximum value in the output image.
• The stride of the max pooling layer is 2, which means that the window moves 2 pixels at
a time when scanning the input image.
• The output size along each dimension is (input size - window size) / stride + 1 =
(28 - 2) / 2 + 1 = 14. Because the window size equals the stride here, this is the same
as dividing the input size by the stride: 28 / 2 = 14.
• Therefore, the size of the output image after applying the max pooling layer once is
14 x 14 pixels.

4. Suppose you have an input image of size 64 x 64 pixels and a max pooling layer of
size 3 x 3 pixels with a stride of 2. You apply the max pooling operation to the
input image, resulting in a reduced output image. What is the size of the output
image?

Assume that the max pooling operation takes the maximum value of each 3 x 3 region of the
input image and outputs it as a single value in the corresponding location in the output image.

Solution:

• The input image size is 64 x 64 pixels.
• The max pooling window is 3 x 3 pixels, so each 3 x 3 region of the input image is
reduced to its maximum value in the output image.
• The stride of the max pooling layer is 2, which means that the window moves 2 pixels at
a time when scanning the input image.
• Because the window (3) is larger than the stride (2), neighbouring windows overlap, so
the output size is not simply the input size divided by the stride. Instead, use
output size = floor((input size - window size) / stride) + 1.
• With no padding, floor((64 - 3) / 2) + 1 = floor(30.5) + 1 = 30 + 1 = 31. The last row
and column of the input do not fill a complete window and are dropped.
• Therefore, the size of the output image after applying the max pooling layer once is
31 x 31 pixels.
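
The pooling arithmetic in questions 3 and 4 can be checked with a few lines of Python. This is a minimal sketch, assuming no padding and the same window and stride along both axes; the function name `pool_output_size` is illustrative and not taken from any library.

```python
def pool_output_size(n: int, window: int, stride: int) -> int:
    """Output length along one axis for pooling without padding."""
    if window > n:
        raise ValueError("pooling window is larger than the input")
    # Floor division drops any partial window at the border.
    return (n - window) // stride + 1


print(pool_output_size(28, 2, 2))  # question 3: 14
print(pool_output_size(64, 3, 2))  # question 4: 31
```

The floor division matters only when (n - window) is not a multiple of the stride, as in question 4.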
5. Suppose you have an input image of size 32 x 32 pixels and a filter/kernel of size
5 x 5 pixels. You apply the convolution operation to the input image using the
given filter/kernel, resulting in a feature map. What is the size of the feature
map?

Assume that the convolution operation involves element-wise multiplication of the
filter/kernel with a corresponding portion of the input image, followed by summation of the
resulting values to obtain a single output value at each location in the feature map.

Solution:

• The input image size is 32 x 32 pixels.
• The filter/kernel size is 5 x 5 pixels.
• To calculate the size of the feature map, we need to determine how many times the
filter/kernel can be applied to the input image without going out of bounds.
• The output size can be computed using the formula: Output size = [(Input size - Filter
size + 2*Padding)/Stride] + 1
• Assume the stride is 1 and there is no padding. Using the above formula, the output
size would be [(32 - 5)/1] + 1 = 28.
• Therefore, the size of the feature map is 28 x 28 pixels.
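
The idea of counting how many times the filter fits inside the image can be made concrete by listing the valid filter positions and comparing the count with the formula. This is an illustrative sketch; `conv_output_size` and `valid_positions` are hypothetical helper names, and the brackets in the formula are read as rounding down.

```python
def conv_output_size(n, f, padding=0, stride=1):
    """Closed-form output length along one axis (rounded down)."""
    return (n - f + 2 * padding) // stride + 1


def valid_positions(n, f, stride=1):
    """Left-edge offsets at which an f-wide filter fits inside n pixels (no padding)."""
    return list(range(0, n - f + 1, stride))


positions = valid_positions(32, 5)
print(positions[0], positions[-1], len(positions))  # 0 27 28
print(conv_output_size(32, 5))                      # 28
```

The filter's left edge can sit at offsets 0 through 27, which is exactly the 28 positions the formula predicts.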

6. Suppose you have an input image of size 128 x 128 pixels and a filter/kernel of
size 3 x 3 pixels. You apply the convolution operation to the input image using
the given filter/kernel, and then apply a max pooling layer with a size of 2 x 2
pixels and a stride of 2. What is the size of the output after the max pooling
operation?

Assume that the convolution operation involves element-wise multiplication of the
filter/kernel with a corresponding portion of the input image, followed by summation of the
resulting values to obtain a single output value at each location in the feature map. Also,
assume that the max pooling operation takes the maximum value of each 2 x 2 region of the
input feature map and outputs it as a single value in the corresponding location in the output
map.

Solution:

• The input image size is 128 x 128 pixels.
• The filter/kernel size is 3 x 3 pixels.
• To calculate the size of the feature map, we need to determine how many times the
filter/kernel can be applied to the input image without going out of bounds.
• Assuming stride = 1 and no padding, the output feature map size after applying the
convolution operation would be (128 - 3 + 1) x (128 - 3 + 1) = 126 x 126 pixels.
• After applying the max pooling operation with a size of 2 x 2 pixels and a stride of 2,
the output feature map size would be (126 / 2) x (126 / 2) = 63 x 63 pixels.
• Therefore, the size of the output feature map after the max pooling operation is
63 x 63 pixels.
7. Suppose you have an input image of size 64 x 64 pixels and a 2D convolutional
layer with 16 filters, each of size 3 x 3 pixels, applied with a stride of 1 and no
padding. After the convolution operation, you apply a batch normalization layer
followed by a ReLU activation function. What is the size of the output feature
map?

Assume that the 2D convolution operation involves element-wise multiplication of each filter
with a corresponding portion of the input image, followed by summation of the resulting
values to obtain a single output value at each location in the feature map. Also, assume that
batch normalization is applied to each filter output independently, and ReLU activation is
applied to the batch-normalized output.

Solution:

• The input image size is 64 x 64 pixels.
• The convolutional layer has 16 filters of size 3 x 3 pixels each.
• Assuming stride = 1 and no padding, the output feature map size after applying the
convolution operation would be (64 - 3 + 1) x (64 - 3 + 1) x 16 = 62 x 62 x 16.
• Batch normalization rescales values without changing the shape, so the feature map
remains 62 x 62 x 16.
• ReLU is applied element by element, so the feature map remains 62 x 62 x 16 as well.
• Therefore, the size of the output feature map is 62 x 62 x 16.

8. Suppose you have an input image of size 256 x 256 pixels and a 2D max pooling
layer with a pool size of 2 x 2 pixels and a stride of 2. After the max pooling
operation, you apply a dropout layer with a dropout rate of 0.5. What is the size of the
output after the dropout operation?

Assume that the max pooling operation takes the maximum value of each 2 x 2 region of the
input feature map and outputs it as a single value in the corresponding location in the output
map. Also, assume that the dropout operation randomly sets half of the input units to zero
during training to prevent overfitting.

Solution:

• The input image size is 256 x 256 pixels.
• After applying the max pooling operation with a size of 2 x 2 pixels and a stride of 2,
the output feature map size would be (256 / 2) x (256 / 2) = 128 x 128 pixels.
• Dropout with a rate of 0.5 sets some values to zero but does not remove them, so the
size of the output feature map remains the same, i.e., 128 x 128 pixels.
• Therefore, the size of the output feature map after the dropout operation is 128 x 128
pixels.
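
A small sketch shows why dropout leaves the size unchanged. It assumes the common "inverted dropout" variant, in which each unit is zeroed independently with probability equal to the rate and the surviving units are rescaled; the question itself does not specify the variant, and the function name and seed are illustrative.

```python
import random


def dropout(values, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)          # fixed seed so the run is repeatable
x = [1.0] * 8
y = dropout(x, 0.5, rng)
print(len(x), len(y))           # both 8: dropped units become 0.0, they are not removed
```

With a rate of 0.5, roughly half of the units are zeroed on average, not exactly half in every pass.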
9. Suppose you have an input image of size 100 x 100 pixels and a 2D convolutional
layer with 8 filters, each of size 5 x 5 pixels, applied with a stride of 2 and no padding.
After the convolution operation, you apply a max pooling layer with a pool size of 2 x
2 pixels and a stride of 2. What is the size of the output after the max pooling
operation?

Assume that the 2D convolution operation involves element-wise multiplication of each filter
with a corresponding portion of the input image, followed by summation of the resulting
values to obtain a single output value at each location in the feature map. Also, assume that
the max pooling operation takes the maximum value of each 2 x 2 region of the input feature
map and outputs it as a single value in the corresponding location in the output map.

Solution:

• The input image size is 100 x 100 pixels.
• The convolutional layer has 8 filters of size 5 x 5 pixels each.
• With stride = 2 and no padding, the output size along each dimension is
floor((100 - 5) / 2) + 1 = floor(47.5) + 1 = 48, so the output of the convolution
operation is 48 x 48 x 8.
• After applying the max pooling operation with a size of 2 x 2 pixels and a stride of 2,
the output feature map size would be (48 / 2) x (48 / 2) x 8 = 24 x 24 x 8.
• Therefore, the size of the output feature map after the max pooling operation is
24 x 24 x 8.
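
The two steps of question 9 can be chained in code by passing (height, width, channels) tuples from one layer to the next. This is a sketch under stated assumptions: the input is taken to have one channel, which the question does not specify, and the helper names are hypothetical.

```python
def conv2d_shape(shape, filters, kernel, stride=1, padding=0):
    """Shape (height, width, channels) after a 2D convolution."""
    h, w, _ = shape  # the input channel count does not affect the output shape
    out = lambda n: (n - kernel + 2 * padding) // stride + 1
    return (out(h), out(w), filters)


def max_pool2d_shape(shape, window, stride):
    """Shape after 2D max pooling; the channel count is unchanged."""
    h, w, c = shape
    out = lambda n: (n - window) // stride + 1
    return (out(h), out(w), c)


after_conv = conv2d_shape((100, 100, 1), filters=8, kernel=5, stride=2)
after_pool = max_pool2d_shape(after_conv, window=2, stride=2)
print(after_conv)  # (48, 48, 8)
print(after_pool)  # (24, 24, 8)
```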

10. Suppose you have an input image of size 200 x 200 pixels and a 2D convolutional
layer with 16 filters, each of size 3 x 3 pixels, applied with a stride of 1 and zero
padding of size 1. After the convolution operation, you apply a batch normalization
layer and a ReLU activation function. What is the size of the output after the ReLU
activation function?

Assume that the 2D convolution operation involves element-wise multiplication of each filter
with a corresponding portion of the input image, followed by summation of the resulting
values to obtain a single output value at each location in the feature map. Also, assume that
the batch normalization layer normalizes the outputs of the convolutional layer to have zero
mean and unit variance, and the ReLU activation function sets any negative values to zero.

Solution:

• The input image size is 200 x 200 pixels.
• The convolutional layer has 16 filters of size 3 x 3 pixels each, and the padding size is
1 pixel.
• With stride = 1 and padding = 1, the output size along each dimension is
(200 + 2*1 - 3) / 1 + 1 = 200, so the output of the convolution operation is
200 x 200 x 16.
• After applying the batch normalization layer, the size of the output feature map
remains the same, i.e., 200 x 200 x 16.
• The ReLU activation function sets negative values to zero without removing them, so
the size of the output feature map remains the same, i.e., 200 x 200 x 16.
• Therefore, the size of the output feature map after the ReLU activation function is
200 x 200 x 16.
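
Question 10 is an instance of so-called "same" padding: with a stride of 1 and an odd filter size F, a padding of (F - 1) / 2 keeps the spatial size unchanged. The sketch below checks this; the helper names are illustrative only.

```python
def same_padding(kernel: int) -> int:
    """Padding that preserves the size for stride 1 and an odd kernel size."""
    if kernel % 2 == 0:
        raise ValueError("symmetric 'same' padding needs an odd kernel size")
    return (kernel - 1) // 2


def conv_output_size(n, kernel, padding, stride=1):
    return (n + 2 * padding - kernel) // stride + 1


p = same_padding(3)
print(p)                              # 1, the padding used in question 10
print(conv_output_size(200, 3, p))    # 200
print(conv_output_size(200, 5, same_padding(5)))  # 200 again, with padding 2
```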

11. Suppose you have a 2D input image of size 64 x 64 pixels and a max pooling layer
with a pooling window size of 2 x 2 pixels and stride of 2. What is the size of the
output feature map after applying the max pooling operation?

Solution:

• The input image size is 64 x 64 pixels.
• The max pooling layer has a pooling window size of 2 x 2 pixels and a stride of 2.
• The pooling operation downsamples the input image by taking the maximum value in
each pooling window. The output feature map has the same number of channels as the
input image, and because the window size equals the stride, the spatial dimensions are
reduced by a factor of the stride in each dimension.
• Therefore, the size of the output feature map after applying the max pooling operation
is (64 / 2) x (64 / 2) = 32 x 32 pixels.

12. Suppose you have a 3D input tensor of size 32 x 32 x 16 and a max pooling layer with
a pooling window size of 2 x 2 x 2 and stride of 2. What is the size of the output
feature map after applying the max pooling operation?

Solution:

• The input tensor size is 32 x 32 x 16.
• The max pooling layer has a pooling window size of 2 x 2 x 2 and a stride of 2, assumed
to apply along all three dimensions.
• Because the window is three-dimensional, the pooling operation takes the maximum
value in each 2 x 2 x 2 block and downsamples all three dimensions, including the
third one. This differs from 2D pooling, which leaves the number of channels unchanged.
• Each dimension is halved: 32 / 2 = 16, 32 / 2 = 16 and 16 / 2 = 8.
• Therefore, the size of the output feature map after applying the max pooling operation
is 16 x 16 x 8.
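
To see that a 2 x 2 x 2 window really shrinks the third dimension as well, the pooling can be run on a dummy tensor built from nested lists. This is a plain-Python sketch, not a framework API; the tensor contents are arbitrary and the stride is assumed to be the same along all three axes.

```python
def max_pool_3d(t, window=2, stride=2):
    """3D max pooling over a nested-list tensor, without padding."""
    out = lambda n: (n - window) // stride + 1
    d0, d1, d2 = len(t), len(t[0]), len(t[0][0])
    return [[[max(t[i * stride + a][j * stride + b][k * stride + c]
                  for a in range(window)
                  for b in range(window)
                  for c in range(window))
              for k in range(out(d2))]
             for j in range(out(d1))]
            for i in range(out(d0))]


# Arbitrary 32 x 32 x 16 test tensor: each entry is the sum of its indices.
tensor = [[[i + j + k for k in range(16)] for j in range(32)] for i in range(32)]
pooled = max_pool_3d(tensor)
print(len(pooled), len(pooled[0]), len(pooled[0][0]))  # 16 16 8
```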

13. Consider a CNN with an input layer of size 32 x 32, followed by a convolutional layer
(C1) with 16 filters of size 5 x 5 and a stride of 1, a max pooling layer (P1) with a
pooling window of size 2 x 2 and a stride of 2, and a fully connected layer (FC1) with
128 neurons. The input image is a color image with 3 channels.

(a) Answer the following questions in about 10 words each:

1. What is the purpose of the convolutional layer?
2. What is the purpose of the max pooling layer?
3. What is the purpose of the fully connected layer?

(b) Given the input image, what is the size of the output feature map after applying the
convolutional layer? Show all intermediate computations.

(c) What is the size of the output feature map after applying the max pooling layer? Show
all intermediate computations.

(d) What is the size of the input vector to the fully connected layer?

Answers:

(a)

1. The purpose of the convolutional layer is to extract features from the input image
through the application of convolutional filters.
2. The purpose of the max pooling layer is to downsample the feature maps produced by
the convolutional layer by taking the maximum value within a pooling window.
3. The purpose of the fully connected layer is to perform classification or regression on
the features extracted by the convolutional and pooling layers.

(b) The size of the output feature map after applying the convolutional layer is 28 x 28 x 16.

Explanation:

• The input image size is 32 x 32 x 3 (assuming RGB color channels).
• The convolutional layer has 16 filters of size 5 x 5 and a stride of 1.
• The output feature map size is given by (input size - filter size + 2*padding)/stride + 1.
• With no padding, the output feature map size is (32-5)/1 + 1 = 28. Applying 16 filters
results in an output feature map size of 28 x 28 x 16.

(c) The size of the output feature map after applying the max pooling layer is 14 x 14 x 16.

Explanation:

• The max pooling layer has a pooling window of size 2 x 2 and a stride of 2.
• The output feature map size is given by (input size - pooling window size)/stride + 1 =
(28 - 2)/2 + 1 = 14.
• Pooling acts on each channel separately, so the 16 channels are kept. Therefore, the
output feature map size is 14 x 14 x 16.

(d) The size of the input vector to the fully connected layer is 3136.

Explanation:

• The output feature map size from the max pooling layer is 14 x 14 x 16.
• Flattening the feature map into a vector yields a length of 14 x 14 x 16 = 3136.
• The 128 neurons determine the size of the layer's output, not of its input. Each neuron
is connected to all 3136 inputs, so the layer has 3136 x 128 = 401,408 weights (plus one
bias per neuron, if biases are used).
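
The difference between the length of the input vector and the number of weights is easy to confuse, so a short calculation may help. This sketch assumes that every filter and every neuron has one bias term, which the question does not state; the function names are illustrative.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter in a convolutional layer."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, neurons):
    """Weights plus one bias per neuron in a fully connected layer."""
    return inputs * neurons + neurons


flat = 14 * 14 * 16                   # length of the vector fed into FC1
print(flat)                           # 3136
print(conv_params(5, 3, 16))          # C1: 5*5*3*16 + 16 = 1216
print(dense_params(flat, 128))        # FC1: 3136*128 + 128 = 401536
```

Almost all of the parameters of this small network sit in the fully connected layer, not in the convolutional layer.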
14. Consider a convolutional neural network with the following architecture:

• Input layer: 28 x 28 x 1
• Convolutional layer: 16 filters of size 3 x 3 with a stride of 1 and no padding
• ReLU activation function
• Max pooling layer: pooling window of size 2 x 2 and stride of 2
• Convolutional layer: 32 filters of size 5 x 5 with a stride of 1 and no padding
• ReLU activation function
• Max pooling layer: pooling window of size 2 x 2 and stride of 2
• Flatten layer
• Fully connected layer: 64 neurons
• ReLU activation function
• Output layer: 10 neurons (for 10 possible classes)
• Softmax activation function

(a) What is the size of the output feature map after applying the first convolutional layer?
Show your work.

(b) What is the size of the output feature map after applying the second convolutional layer?
Show your work.

(c) What is the size of the input vector to the fully connected layer? Show your work.

Solution:

(a) The output feature map size after applying the first convolutional layer is 26 x 26 x 16.

Explanation:

• The input image size is 28 x 28 x 1.
• The convolutional layer has 16 filters of size 3 x 3 and a stride of 1.
• The output feature map size is given by (input size - filter size)/stride + 1.
• With no padding, the output feature map size is (28-3)/1 + 1 = 26. Applying 16 filters
results in an output feature map size of 26 x 26 x 16.

(b) The output feature map size after applying the second convolutional layer is 9 x 9 x 32.

Explanation:

• The first max pooling layer, with a pooling window of 2 x 2 and a stride of 2, halves
the spatial dimensions of its 26 x 26 x 16 input. Therefore, the output feature map
size after the first max pooling layer is 13 x 13 x 16.
• The second convolutional layer has 32 filters of size 5 x 5 and a stride of 1.
• The output feature map size is given by (input size - filter size)/stride + 1.
• With no padding, the output feature map size is (13-5)/1 + 1 = 9. Applying 32 filters
results in an output feature map size of 9 x 9 x 32.
• The second max pooling layer, with a pooling window of 2 x 2 and a stride of 2, then
gives floor((9 - 2)/2) + 1 = 4 along each spatial dimension. The last row and column of
the 9 x 9 map do not fill a complete window and are dropped, so the final feature map
size is 4 x 4 x 32.

(c) The size of the input vector to the fully connected layer is 512.

Explanation:
• The output feature map size from the second max pooling layer is 4 x 4 x 32.
• Flattening the feature map into a vector yields a length of 4 x 4 x 32 = 512.
• The 64 neurons in the fully connected layer determine the size of its output, not of its
input. Each neuron is connected to all 512 inputs, so the layer has 512 x 64 = 32,768
weights (plus one bias per neuron, if biases are used).
• The ReLU activation function sets negative values to zero but does not remove any
elements, so it never changes the size of a feature map or vector.
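
The layer-by-layer bookkeeping for question 14 can be traced with a short script. This is a minimal sketch assuming no padding anywhere and floor division for windows that do not fit; the layer list format and the function name `trace_shapes` are made up for this illustration. Activation layers are left out because they do not change shapes.

```python
layers = [
    ("conv", {"filters": 16, "kernel": 3, "stride": 1}),
    ("pool", {"window": 2, "stride": 2}),
    ("conv", {"filters": 32, "kernel": 5, "stride": 1}),
    ("pool", {"window": 2, "stride": 2}),
]


def trace_shapes(shape, layers):
    """Return the (height, width, channels) shape after every layer."""
    shapes = [shape]
    for kind, cfg in layers:
        h, w, c = shapes[-1]
        if kind == "conv":
            k, s = cfg["kernel"], cfg["stride"]
            shapes.append(((h - k) // s + 1, (w - k) // s + 1, cfg["filters"]))
        elif kind == "pool":
            k, s = cfg["window"], cfg["stride"]
            shapes.append(((h - k) // s + 1, (w - k) // s + 1, c))
        else:
            raise ValueError(f"unknown layer kind: {kind}")
    return shapes


shapes = trace_shapes((28, 28, 1), layers)
for s in shapes:
    print(s)   # (28, 28, 1), (26, 26, 16), (13, 13, 16), (9, 9, 32), (4, 4, 32)

h, w, c = shapes[-1]
flat = h * w * c
print(flat)    # 512 inputs to the fully connected layer
```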

Formulas used in CNNs. Here are some of the commonly used formulas:

1. Output size of a convolutional layer: Output size = floor((N - F + 2P)/S) + 1, where N
is the input size, F is the filter size, P is the padding, and S is the stride.
2. Output size of a max pooling layer: Output size = floor((N - F)/S) + 1, where N is the
input size, F is the pooling window size, and S is the stride.
3. Number of parameters in a convolutional layer: Number of parameters = (F x F x
D_in x K) + K, where F is the filter size, D_in is the number of input channels, and K is
the number of filters. The final + K counts one bias per filter.
4. Number of inputs to the first fully connected layer (the length of the flattened
feature map): Number of inputs = H x W x D, where H is the height of the input feature
map, W is the width of the input feature map, and D is the depth of the input feature
map.
5. Loss function for CNN: Cross-entropy loss = -Σ y_i log(y_hat_i), summed over the
classes i, where y_i is the ground truth (one-hot) label and y_hat_i is the predicted
probability for class i.
6. Learning rate update rule: Learning rate = initial learning rate / (1 + decay_rate x
epoch), where initial learning rate is the starting learning rate, decay_rate is the rate at
which the learning rate decreases, and epoch is the current epoch.
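
Formulas 5 and 6 can be evaluated directly. The sketch below is illustrative only: the label vector, the predicted probabilities and the decay settings are made-up numbers, and the small `eps` guard against log(0) is an addition that is not part of the formula.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """-sum(y * log(y_hat)) over classes; y_pred holds probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def decayed_learning_rate(initial_lr, decay_rate, epoch):
    """Inverse-time decay: initial_lr / (1 + decay_rate * epoch)."""
    return initial_lr / (1 + decay_rate * epoch)


loss = cross_entropy([0, 1, 0], [0.2, 0.7, 0.1])   # equals -log(0.7)
lr = decayed_learning_rate(0.1, decay_rate=0.5, epoch=2)
print(round(loss, 4))  # 0.3567
print(lr)              # 0.05
```

With a one-hot label, only the probability assigned to the correct class contributes to the loss.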

15. Assume you are feeding a Convolutional Neural Network with the following image matrix.
Each layer in the Deep Convolutional Neural Network performs some operation on
its input and outputs a new value.
Answer the following questions.

1 4 2 8 6
3 5 -11 4 -8
6 -9 6 -1 5
5 12 4 -14 3
-9 3 1 6 9

1. In the convolutional layer having the following filter, briefly explain what the output
is and why.
1 0
0 1
2. The output of the convolutional layer goes through the ReLU (Rectified Linear Unit)
layer. What will the output of this layer be? [Explain this step briefly, with both text
and workings.]
3. The output of the ReLU layer goes through a Max Pooling layer ([3 3]). What will the
output of this layer be? (Explain your answer.)

1. The output of the convolutional layer can be obtained using the following formula:

output[i,j] = input[i,j]*filter[0,0] + input[i,j+1]*filter[0,1] + input[i+1,j]*filter[1,0] +
input[i+1,j+1]*filter[1,1]

With the given filter [1 0; 0 1] this reduces to output[i,j] = input[i,j] + input[i+1,j+1],
i.e. each output value is the sum of the two diagonal entries of a 2 x 2 window. With a
stride of 1 and no padding, the 5 x 5 input gives a (5 - 2 + 1) x (5 - 2 + 1) = 4 x 4 output:

6 -7 6 0
-6 11 -12 9
18 -5 -8 2
8 13 10 -5

For example, output[0,0] = 1 + 5 = 6 and output[1,2] = -11 + (-1) = -12.

2. The ReLU layer applies the element-wise activation function f(x) = max(0,x) to the
output of the convolutional layer. Therefore, any negative value in the output of the
convolutional layer is replaced by zero, while the positive values remain unchanged.
Thus, the output of the ReLU layer would be:

6 0 6 0
0 11 0 9
18 0 0 2
8 13 10 0

3. The Max Pooling layer with a [3 3] window performs a down-sampling operation by
selecting the maximum value from each 3 x 3 sub-region of its input. The question does
not state a stride. With non-overlapping windows (stride 3), only one complete 3 x 3
window fits into the 4 x 4 input, so the output is a single value:

18

If a stride of 1 is assumed instead, the windows overlap and the output is 2 x 2:

18 11
18 13

The output size of the Max Pooling layer is reduced due to the down-sampling operation.
Assume you are feeding a Convolutional Neural Network with the following grayscale image
matrix of size 6 x 6. Each layer in the Deep Convolutional Neural Network performs some
operation on its input and outputs a new value.

Image Matrix:

1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18
19 20 21 22 23 24
25 26 27 28 29 30
31 32 33 34 35 36

1. In the convolutional layer having the following filter, what is the output & why?
Filter:
1 0
0 -1
2. The output of the convolutional layer goes through a ReLU layer. What will the
output of this layer be? Briefly explain both text and workings.
3. The output of the ReLU layer goes through a Max Pooling layer ([2 2], stride=2).
What will the output of this layer be? Briefly explain both text and workings.
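
The three stages asked for here (convolution, ReLU, max pooling) can be worked through with a few plain-Python helpers. This is a sketch under stated assumptions: the filter is read as [1 0; 0 -1], it is applied without flipping (as in the output[i,j] formula used for question 15), the convolution uses stride 1 with no padding, and the helper names are hypothetical. The same helpers are also run on the 5 x 5 matrix of question 15, assuming stride 3 for its [3 3] pooling window.

```python
def correlate2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding, no flipping)."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(cols)] for i in range(rows)]


def relu(m):
    return [[max(0, v) for v in row] for row in m]


def max_pool(m, window, stride):
    rows = (len(m) - window) // stride + 1
    cols = (len(m[0]) - window) // stride + 1
    return [[max(m[i * stride + a][j * stride + b]
                 for a in range(window) for b in range(window))
             for j in range(cols)] for i in range(rows)]


# 6 x 6 question: entries 1..36 in row-major order.
image6 = [[6 * r + c + 1 for c in range(6)] for r in range(6)]
conv6 = correlate2d(image6, [[1, 0], [0, -1]])
pool6 = max_pool(relu(conv6), window=2, stride=2)
print(conv6[0])   # [-7, -7, -7, -7, -7]; every entry of the 5 x 5 output is -7
print(pool6)      # [[0, 0], [0, 0]]

# Question 15: 5 x 5 matrix with the filter [1 0; 0 1].
image5 = [[1, 4, 2, 8, 6], [3, 5, -11, 4, -8], [6, -9, 6, -1, 5],
          [5, 12, 4, -14, 3], [-9, 3, 1, 6, 9]]
conv5 = correlate2d(image5, [[1, 0], [0, 1]])
act5 = relu(conv5)
print(conv5)                   # [[6, -7, 6, 0], [-6, 11, -12, 9], [18, -5, -8, 2], [8, 13, 10, -5]]
print(max_pool(act5, 3, 3))    # [[18]]
```

For the 6 x 6 matrix every window gives input[i,j] - input[i+1,j+1] = -7, because moving one step down and one step right always adds 7. ReLU therefore zeroes the whole 5 x 5 map, and the 2 x 2 pooling with stride 2 yields a 2 x 2 map of zeros.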

List of filters used in convolution in CNN

Problem: Consider the following 5x5 image matrix:

2 4 6 8 7
5 4 7 1 9
9 6 2 3 1
1 2 4 8 6
3 7 6 9 1

Perform vertical edge detection on this image using the following 3x3 filter:

-1 0 1
-2 0 2
-1 0 1

Solution: We can perform vertical edge detection using the convolution operation. We will
slide the filter over the image matrix and multiply the filter values with the corresponding
values of the image matrix, sum them up, and place the result in the output matrix.

The output matrix will have a size of (5-3+1) x (5-3+1) = 3x3.

Here are the steps to perform vertical edge detection on the given image using the given
filter:

1. We will start by placing the filter on the top-left corner of the image matrix:
2 4 6
5 4 7
9 6 2
We will then perform the convolution operation:
(-1)x2 + 0x4 + 1x6 + (-2)x5 + 0x4 + 2x7 + (-1)x9 + 0x6 + 1x2 = 1
The result of the convolution operation is 1, which we will place in the output
matrix at position (1,1).
2. Next, we will move the filter one step to the right:
4 6 8
4 7 1
6 2 3
We will perform the convolution operation:
(-1)x4 + 0x6 + 1x8 + (-2)x4 + 0x7 + 2x1 + (-1)x6 + 0x2 + 1x3 = -5
The result of the convolution operation is -5, which we will place in the output
matrix at position (1,2).
3. We will continue sliding the filter over the image matrix, performing the convolution
operation, and placing the result in the output matrix, until we reach the bottom-right
corner of the image matrix.

The resulting output matrix for vertical edge detection using the given filter on the given
image is:

1 -5 4
-9 -3 2
2 11 -2
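
The sliding-window procedure in the steps above can be reproduced in a few lines of Python. This is a sketch that follows the text in multiplying the filter with the image window as-is (no kernel flipping), with stride 1 and no padding; the function and variable names are illustrative.

```python
def correlate_valid(image, kernel):
    """Multiply-and-sum the kernel at every position where it fully fits."""
    k = len(kernel)
    size = len(image) - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(size)] for i in range(size)]


image = [[2, 4, 6, 8, 7],
         [5, 4, 7, 1, 9],
         [9, 6, 2, 3, 1],
         [1, 2, 4, 8, 6],
         [3, 7, 6, 9, 1]]
vertical_edge_filter = [[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]]

for row in correlate_valid(image, vertical_edge_filter):
    print(row)
# [1, -5, 4]
# [-9, -3, 2]
# [2, 11, -2]
```

Each output value is the weighted difference between the right and left columns of a 3 x 3 window, so large magnitudes mark places where the intensity changes strongly from left to right.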

Convolutional operation for Horizontal Edge Detection:

Consider the following image matrix:

10 20 30 40 50
5 15 25 35 45
0 10 20 30 40
-5 5 15 25 35
-10 0 10 20 30

Perform horizontal edge detection on the given image matrix using the following filter:
-1 -1 -1
0 0 0
1 1 1

Solution:

Performing horizontal edge detection using the above filter involves sliding the filter over the
input image matrix and performing element-wise multiplication and summation. The
resulting output matrix represents the edges detected in the horizontal direction.

The first step is to create a padded matrix to avoid boundary effects. We add one column of
zeros to the left and right of the input matrix, and one row of zeros to the top and bottom.

0 0 0 0 0 0 0
0 10 20 30 40 50 0
0 5 15 25 35 45 0
0 0 10 20 30 40 0
0 -5 5 15 25 35 0
0 -10 0 10 20 30 0
0 0 0 0 0 0 0

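Next, we slide the filter over the padded matrix and perform element-wise multiplication and
summation. Because the top row of the filter is -1 and the bottom row is +1, each output
value is the sum of the three pixels below the centre minus the sum of the three pixels
above it:

20 45 75 105 80
-20 -30 -30 -30 -20
-20 -30 -30 -30 -20
-20 -30 -30 -30 -20
0 -15 -45 -75 -60

The resulting matrix represents the edges detected in the horizontal direction. The interior
values are all -30 because the intensity falls by 5 from each row to the next, so the rows
below and above each centre differ by -10 in each of the 3 columns. The values along the
border differ only because of the zero padding.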

Note: In practice, the output matrix would be further processed by passing it through an
activation function like ReLU to introduce non-linearity.
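
The padded computation, followed by the ReLU step mentioned in the note, can be sketched as follows. The code assumes one ring of zero padding, stride 1, and no kernel flipping, matching the description above; the helper names are hypothetical.

```python
def zero_pad(image, pad=1):
    """Surround the image with `pad` rows and columns of zeros."""
    width = len(image[0]) + 2 * pad
    blank = [[0] * width for _ in range(pad)]
    middle = [[0] * pad + row + [0] * pad for row in image]
    return blank + middle + [r[:] for r in blank]


def correlate_valid(image, kernel):
    k = len(kernel)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(cols)] for i in range(rows)]


image = [[10, 20, 30, 40, 50],
         [5, 15, 25, 35, 45],
         [0, 10, 20, 30, 40],
         [-5, 5, 15, 25, 35],
         [-10, 0, 10, 20, 30]]
horizontal_edge_filter = [[-1, -1, -1],
                          [0, 0, 0],
                          [1, 1, 1]]

edges = correlate_valid(zero_pad(image), horizontal_edge_filter)
activated = [[max(0, v) for v in row] for row in edges]   # ReLU
print(edges[0])      # [20, 45, 75, 105, 80]
print(edges[2])      # [-20, -30, -30, -30, -20]
print(edges[4])      # [0, -15, -45, -75, -60]
print(activated[1])  # [0, 0, 0, 0, 0]
```

After ReLU only the first row keeps non-zero values, and those come from the zero padding above the image, not from an edge inside it.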
