
CT 1 question bank NNDL

1. What is an artificial neural network? What are its advantages and disadvantages?

2. What is a simple artificial neuron and how does it work?

3. What do you mean by Perceptron? What are different types of perceptrons?

4. Explain how a single-layer neural network is used as a linear classifier?

5. What is the role of the Activation functions in Neural Networks? List any three Activation
Functions used in Neural Networks.

6. What do you mean by Backpropagation?

7. What do you mean by Forward Propagation?


8. Explain Gradient Descent and its types.
Ans: Gradient descent is one of the most commonly used iterative optimization
algorithms in machine learning; it is used to train machine learning and deep learning
models by finding a local minimum of a function.

Cost function:
The cost function measures the difference (error) between the actual values and the
expected values at the current position, expressed as a single real number.

Moving in the direction of the negative gradient, i.e., away from the gradient of the function
at the current point, leads toward a local minimum of that function.

The goal of the gradient descent algorithm is to minimize the given function
(say, the cost function). To achieve this goal, it performs two steps iteratively:
1. Compute the gradient (slope), the first-order derivative of the function at
the current point.
2. Take a step (move) in the direction opposite to the gradient, i.e., the
direction opposite to the slope's increase, moving from the current point by
alpha times the gradient at that point.
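The two steps above can be sketched in a few lines. This is a minimal illustration, not a library implementation; the function f(w) = (w - 3)^2 and the names `alpha` and `grad` are chosen only for the example.

```python
# Gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3.

def grad(w):
    # Step 1: first-order derivative of f(w) = (w - 3)^2
    return 2 * (w - 3)

w = 0.0          # starting point
alpha = 0.1      # learning rate (step size)
for _ in range(100):
    # Step 2: move opposite to the gradient by alpha times its value
    w = w - alpha * grad(w)

print(round(w, 4))  # converges to 3.0, the minimizer
```

Each iteration multiplies the distance to the minimum by (1 - 2·alpha), so with alpha = 0.1 the error shrinks by a factor of 0.8 per step.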

Types of gradient descent:


1. Batch gradient descent: It computes the error for each point in the training
set and updates the model only after all training examples have been evaluated
(one epoch). In simple words, it sums over all examples for each update.
2. Stochastic gradient descent: It processes one training example per iteration,
so the parameters are updated after every single example is processed. Hence,
each update is much faster than in batch gradient descent.
3. Mini-batch gradient descent: It often works faster than both batch and stochastic
gradient descent. Here 'b' examples, where b < m (the total number of training
examples), are processed per iteration. So even if the number of training examples
is large, they are processed in batches of 'b' examples at a time. Thus, it scales
to large training sets, and with fewer iterations than stochastic gradient descent.
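The mini-batch schedule above can be sketched as follows. This is an illustrative toy, fitting the slope of y = 2x with squared error; the batch size `b = 2` and learning rate are arbitrary choices for the example (setting `b = m` would give batch gradient descent, `b = 1` stochastic gradient descent).

```python
import random

# Mini-batch gradient descent fitting y = 2x with one weight w.

data = [(x, 2 * x) for x in range(1, 9)]   # m = 8 training examples
w = 0.0
alpha = 0.01

def grad_on(batch, w):
    # Gradient of mean squared error (1/|b|) * sum((w*x - y)^2) w.r.t. w
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

b = 2  # mini-batch size, b < m
for epoch in range(200):
    random.shuffle(data)                   # visit examples in random order
    for i in range(0, len(data), b):       # one parameter update per batch of b
        w -= alpha * grad_on(data[i:i + b], w)

print(round(w, 3))  # approaches the true slope 2.0
```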

9. Explain the difference between Forward propagation and Backward Propagation in Neural
Networks?

10. Explain feedforward neural network?


11. Explain feedback neural network?

12. What are the Learning Rules in a Neural Network?


Ans=
A learning rule (or learning process) is a method or a mathematical logic that improves an
artificial neural network's performance when applied over the network. Learning rules
update the weights and bias levels of a network as the network is simulated in a specific
data environment.
Applying a learning rule is an iterative process; it helps a neural network learn from
existing conditions and improve its performance.
The different learning rules in neural networks are:

Hebbian learning rule – specifies how to modify the weights between the nodes of a network.
Perceptron learning rule – the network starts its learning by assigning a random value to
each weight.
Delta learning rule – the modification of a node's synaptic weight is equal to the product of
the error and the input.
Correlation learning rule – a supervised counterpart of the Hebbian rule.
Outstar learning rule – used when the nodes or neurons of a network are assumed to be
arranged in a layer.
2.1. Hebbian Learning Rule
The Hebbian rule was the first learning rule. Donald Hebb developed it in 1949 as a learning
algorithm for unsupervised neural networks. It specifies how to modify the weights between
the nodes of a network.
The Hebb learning rule assumes that if two neighboring neurons are activated (or
deactivated) at the same time, the weight connecting them should increase. For neurons
operating in opposite phases, the weight between them should decrease. If there is no
correlation between their signals, the weight should not change.
When the inputs of both nodes are either positive or negative, a strong positive weight
develops between the nodes. If the input of one node is positive and the other's is negative,
a strong negative weight develops between them.
At the start, all weights are set to zero. This learning rule can be used with both soft-
and hard-activation functions. Since the desired responses of the neurons are not used in
the learning procedure, it is an unsupervised learning rule. The absolute values of the
weights are usually proportional to the learning time, which is undesirable.

The Hebbian learning rule is written as:

Δwij = η · xi · oj

where η is the learning rate, xi is the input from node i, and oj is the output of node j.
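The sign behavior described above can be seen in a minimal sketch. The function name `hebbian_update` and the learning rate value are illustrative only.

```python
# Hebbian update: weights grow when input and output share a sign,
# shrink when the signs differ, and stay put when either is zero.

def hebbian_update(w, x, o, eta=0.5):
    # delta_w_i = eta * x_i * o for each input connection
    return [wi + eta * xi * o for wi, xi in zip(w, x)]

w = [0.0, 0.0]                          # weights start at zero, as the rule states
w = hebbian_update(w, x=[1, -1], o=1)
print(w)  # [0.5, -0.5]: same-sign pair strengthened, opposite-sign pair weakened
```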

2.2. Perceptron Learning Rule


As you know, each connection in a neural network has an associated weight, which changes
in the course of learning. The perceptron rule is an example of supervised learning: the
network starts its learning by assigning a random value to each weight.
The output value is then calculated for a set of records whose expected output values are
known. This set defines the learning task and is therefore called the learning sample.
The network compares the calculated output value with the expected value and computes an
error function E, which can be the sum of the squares of the errors occurring for each
individual in the learning sample.
It is computed as follows:

E = Σi Σj (Eij − Oij)²

where the first summation runs over the individuals of the learning set and the second over
the output units; Eij and Oij are the expected and obtained values of the jth unit for the
ith individual.
The network then adjusts the weights of the different units, checking each time whether the
error function has increased or decreased. As in conventional regression, this amounts to
solving a least-squares problem.
Since the weights are adjusted toward expected outputs supplied by the user, this is an
example of supervised learning.
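A small sketch of the rule on the AND function, assuming a threshold unit and the standard per-example update w ← w + r·(expected − obtained)·x; the seed, learning rate, and epoch count are arbitrary choices for the example.

```python
import random

# Perceptron rule on AND: weights start random, then move to reduce
# the error (expected - obtained) for each record in the learning sample.
random.seed(0)

samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w = [random.uniform(-1, 1) for _ in range(2)]   # random initial weights
b = random.uniform(-1, 1)                       # bias, also random
r = 0.1                                         # learning rate

def predict(x):
    # Hard-threshold output unit
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

for _ in range(50):                             # iterate over the learning sample
    for x, expected in samples:
        err = expected - predict(x)             # Eij - Oij for this record
        w = [wi + r * err * xi for wi, xi in zip(w, x)]
        b += r * err

print([predict(x) for x, _ in samples])  # [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees the loop stops making weight changes after finitely many mistakes.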

2.3. Delta Learning Rule


Developed by Widrow and Hoff, the delta rule is one of the most common learning rules. It
is a supervised learning rule.
The rule states that the modification of a node's synaptic weight is equal to the product of
the error and the input.
For a given input vector, the output vector is compared with the correct answer. If the
difference is zero, no learning takes place; otherwise, the weights are adjusted to reduce
this difference. The change in weight from ui to uj is: dwij = r * ai * ej,
where r is the learning rate, ai is the activation of ui, and ej is the difference between
the expected output and the actual output of uj. If the set of input patterns forms a
linearly independent set, arbitrary associations can be learned using the delta rule.
It can be shown that for networks with linear activation functions and no hidden units, the
graph of squared error versus the weights is a paraboloid in n-space. Such a function is
concave upward and has a minimum: the vertex of the paraboloid is the point of least error,
and the weight vector corresponding to this point is the ideal weight vector.
The delta learning rule can be used with a single output unit as well as with several
output units.
When applying the delta rule, it is assumed that the error can be measured directly.
The aim of applying the delta rule is to reduce the difference between the actual and
expected output, that is, the error.
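The update dwij = r * ai * ej can be sketched on a single linear output unit. The input values, target, and learning rate below are arbitrary example choices.

```python
# Delta rule on one linear output unit u_j fed by two input units u_i.

inputs = [0.5, -0.3]        # activations a_i of the input units
target = 1.0                # expected output of u_j
w = [0.0, 0.0]
r = 0.5                     # learning rate

for _ in range(100):
    actual = sum(wi * ai for wi, ai in zip(w, inputs))  # linear activation
    e = target - actual                                 # error e_j
    w = [wi + r * ai * e for wi, ai in zip(w, inputs)]  # dw_ij = r * a_i * e_j

output = sum(wi * ai for wi, ai in zip(w, inputs))
print(round(output, 4))  # converges toward the target 1.0
```

Since the unit is linear, each iteration shrinks the error by a constant factor, descending the paraboloid of squared error toward its vertex.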

2.4. Correlation Learning Rule


The correlation learning rule is based on a principle similar to the Hebbian learning rule.
It assumes that weights between neurons that respond together should become more positive,
and weights between neurons with opposite reactions should become more negative.

Contrary to the Hebbian rule, the correlation rule is a supervised learning rule: instead of
the actual response oj, the desired response dj is used for the weight-change calculation.
In mathematical form the correlation learning rule is:

Δwij = η · xi · dj

where dj is the desired value of the output signal. This training algorithm usually starts
by initializing the weights to zero.
Since the desired outputs are supplied by the user, the correlation learning rule is an
example of supervised learning.

2.5. Outstar Learning Rule


We use the Outstar learning rule when we assume that the nodes or neurons of a network are
arranged in a layer. Here the weights connected to a certain node should equal the desired
outputs of the neurons connected through those weights. The outstar rule produces the
desired response t for the layer of n nodes.

13. What is supervised learning and unsupervised learning?

Supervised Machine Learning:

Supervised learning is a machine learning method in which models are trained using labeled
data. In supervised learning, the model must find the mapping function that maps the input
variable (X) to the output variable (Y):

Y = f(X)

Supervised learning needs supervision to train the model, much as a student learns in the
presence of a teacher. Supervised learning can be used for two types of problems:
classification and regression.

14. What is machine learning?


Ans. Machine learning is a branch of artificial intelligence (AI) and computer science that
focuses on the use of data and algorithms to imitate the way humans learn, gradually
improving in accuracy.

15. What is deep learning?

16. Explain the difference between deep learning and neural networks.
