
UNIT – 4

2 MARKS

1. What is the basic principle behind deep learning architectures?

Deep learning architectures are based on artificial neural networks (ANNs), which are stacks
of linear approximators interleaved with nonlinear activation functions. When trained to
find the optimal weights via regression, ANNs are capable of approximating arbitrary
functions. This is the promise of deep learning, whose name refers to “deep” stacks of such layers.
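
As a rough illustration, the sketch below builds such a stack by hand in NumPy: each layer is a
linear map followed by a nonlinear activation. The layer sizes, the ReLU activation, and the
random weights are illustrative assumptions, not a prescribed architecture.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(x, params):
    W1, b1, W2, b2 = params
    h = relu(W1 @ x + b1)  # linear approximator passed through a nonlinearity
    return W2 @ h + b2     # final linear read-out

rng = np.random.default_rng(0)
params = (rng.normal(size=(16, 4)), np.zeros(16),
          rng.normal(size=(2, 16)), np.zeros(2))
print(forward(rng.normal(size=4), params))  # a 2-dimensional output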

2. How can deep learning architectures be used to model an arbitrary state space?

Deep learning architectures can be used to model an arbitrary state space by altering the
structure and hyperparameters of the network. For example, the number of layers and neurons
in each layer can be changed to match the complexity of the state space. Additionally, different
types of activation functions can be used to learn different kinds of relationships in the
data.
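
A hedged sketch of this idea, assuming PyTorch is available: a small helper (build_mlp is a
hypothetical name, not a library function) that lets you vary the depth, width, and activation
of the network to match the complexity of the state space.

import torch.nn as nn

def build_mlp(state_dim, action_dim, hidden_sizes=(64, 64), activation=nn.ReLU):
    """Hypothetical helper: depth, width, and activation are the
    hyperparameters you would tune to match the state space."""
    layers, in_dim = [], state_dim
    for width in hidden_sizes:
        layers += [nn.Linear(in_dim, width), activation()]
        in_dim = width
    layers.append(nn.Linear(in_dim, action_dim))
    return nn.Sequential(*layers)

# A small network for a simple, low-dimensional state space ...
small = build_mlp(4, 2, hidden_sizes=(32,))
# ... and a deeper, wider one (with a different activation) for a more complex one.
large = build_mlp(64, 8, hidden_sizes=(256, 256, 128), activation=nn.Tanh)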

3. What is the main purpose of a nonlinear activation function in an artificial neural
network (ANN)?

The main purpose of a nonlinear activation function in an ANN is to introduce non-linearity into
the network. This is necessary because the real world is highly non-linear, and a linear ANN
would not be able to learn complex mappings from inputs to outputs. Activation functions
introduce non-linearity by transforming the output of a neuron in a non-linear way. This means
that the output of a neuron is not simply a linear combination of its inputs, but rather a more
complex function of its inputs.
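
The point can be shown numerically: without an activation, two stacked linear layers collapse
into a single linear map, while inserting a ReLU between them breaks that equivalence. The
sizes and random weights below are arbitrary choices for illustration.

import numpy as np

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(5, 3)), rng.normal(size=(2, 5))
x = rng.normal(size=3)

# Two linear layers with no activation collapse into a single linear map:
print(np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x))                  # True

# Inserting a ReLU between them breaks that equivalence:
print(np.allclose(W2 @ np.maximum(0.0, W1 @ x), (W2 @ W1) @ x))   # False in general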

4. Provide an example of how a nonlinear activation function can be used to learn a
non-linear mapping:

Suppose we are training an ANN to classify images of handwritten digits. The input to the ANN
is a vector of pixel values, and the output is a vector of probabilities, where each probability
represents the probability that the image is a particular digit.

If we used a linear activation function, the ANN would only be able to learn linear mappings
from the input pixel values to the output probabilities. This would not be enough to learn to
classify the images accurately, as the relationship between the pixel values and the digit labels
is non-linear.

However, if we use a nonlinear activation function, the ANN will be able to learn more complex
mappings from the input pixel values to the output probabilities. This will allow the ANN to
learn to classify the images more accurately.
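
A minimal sketch of such a classifier, assuming PyTorch, 28x28 grayscale images flattened to
784 pixel values, and 10 digit classes; the hidden width of 128 and the ReLU activation are
illustrative choices.

import torch
import torch.nn as nn

# Assumed setup: flattened 28x28 digits (784 inputs), 10 output classes.
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),               # the nonlinearity that enables non-linear mappings
    nn.Linear(128, 10),
)

images = torch.rand(32, 784)                  # a dummy batch of flattened images
probs = torch.softmax(model(images), dim=1)   # per-digit probabilities
print(probs.shape)                            # torch.Size([32, 10])
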
5. What is the difference between a Multilayer Perceptron (MLP) and a Deep Belief
Network (DBN)?

An MLP is a feed-forward neural network with multiple layers of neurons. The connections
between the layers are fully connected, meaning that each neuron in one layer is connected to
every neuron in the next layer. DBNs are similar to MLPs, but the connections between the top
two (or more) layers are undirected, meaning that information can flow in both directions.
DBNs also use restricted Boltzmann machines (RBMs) to learn the probability distribution of
the data. RBMs are a type of undirected graphical model that can be used to learn complex
relationships between data points.
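
For a flavour of how a single RBM layer is trained, here is a minimal NumPy sketch of one
contrastive-divergence (CD-1) update. It assumes binary visible and hidden units; a DBN would
stack several such layers, feeding each layer's hidden activations to the next, which this
sketch does not show.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.1, rng=np.random.default_rng(0)):
    """One CD-1 step for a binary RBM with visible bias b and hidden bias c.
    v0 is a batch of visible vectors, shape (n_samples, n_visible)."""
    # Positive phase: hidden activations driven by the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one step of Gibbs sampling back to the visible units.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Approximate gradient: data statistics minus reconstruction statistics.
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c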

6. Deep learning is used in reinforcement learning to cope with complex state spaces.
What are some examples of complex state spaces?

Examples of complex state spaces include:

Visual data: Deep learning models, such as convolutional neural networks (CNNs), can be
used to learn from visual data, such as images and videos. This is useful for tasks such as
robot navigation and self-driving cars.

Time-series data: Deep learning models, such as recurrent neural networks (RNNs), can be
used to learn from time-series data, such as stock prices and sensor readings. This is useful
for tasks such as financial forecasting and anomaly detection.

High-dimensional multisensor data: Deep learning models, such as multi-layer
perceptrons (MLPs), can be used to learn from high-dimensional multisensor data, such as
data from multiple cameras or sensors on a robot. This is useful for tasks such as object
recognition and scene understanding.
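
As an illustration of the visual-data case, a small convolutional network for a stack of
preprocessed frames might look like the PyTorch sketch below; the frame size (84x84), frame
stack (4), and action count (6) are assumptions borrowed from typical Atari-style setups.

import torch
import torch.nn as nn

# Assumed input: a stack of 4 grayscale 84x84 frames; output: 6 action values.
cnn = nn.Sequential(
    nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
    nn.Linear(512, 6),
)

frames = torch.rand(1, 4, 84, 84)   # a dummy batch with one stacked observation
print(cnn(frames).shape)            # torch.Size([1, 6])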

7. What are some challenges of using deep learning in reinforcement learning?

One challenge of using deep learning in reinforcement learning is that it can be difficult to
collect enough data to train a good model. Additionally, reinforcement learning agents often
have to wait a long time to receive feedback, which can make it difficult to learn.

8. What are two of the main challenges of using nonlinear function approximators in
reinforcement learning?

Two of the main challenges of using nonlinear function approximators in reinforcement
learning are:

1. Divergence: Nonlinear function approximators can diverge from, rather than converge to,
the optimal solution, because the moving-target problem is harder to handle with nonlinear
function approximators.
2. Correlated data: Model optimizers like stochastic gradient descent assume that the data
is independent and identically distributed (IID). In reinforcement learning, however,
observations sampled close together in time are likely to be correlated. This violates the
IID assumption and can cause models to fail to converge.

9. What are the benefits of using experience replay in reinforcement learning?

Experience replay has two main benefits:

1. Sample efficiency: Experience replay allows you to use old data to train your model,
making training more sample efficient. This is because you can reuse the same data
multiple times, and you don't need to collect new data every time you update your model.
2. Decorrelation: Experience replay breaks the correlation between consecutive
observations. This is important because it helps the agent to learn a more general policy.

10. What is the purpose of cloning the Q-network in the Deep Q-Network (DQN)
algorithm?

Cloning the Q-network in the DQN algorithm is used to improve the stability of the algorithm
and make it less likely to oscillate or diverge. It does this by creating two separate neural
networks: an online network and a target network. The online network is used to select
actions in the environment, while the target network is used to estimate the Q-values of
those actions. After a certain number of iterations, the weights of the online network are
copied to the target network. This ensures that the target network is always using a stable
set of parameters to estimate the Q-values, which helps to improve the stability of the
overall algorithm.
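
A hedged sketch of the online/target split in PyTorch: the network shape, the 4-dimensional
state, the 2 actions, and the update interval of 1,000 steps are all illustrative assumptions.

import copy
import torch.nn as nn

online_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = copy.deepcopy(online_net)          # the initial clone

TARGET_UPDATE_EVERY = 1_000                     # iterations between clones

for step in range(10_000):
    # ... act with online_net and train it against Q-value targets that are
    # computed with target_net, whose weights stay fixed between clones ...
    if step % TARGET_UPDATE_EVERY == 0:
        target_net.load_state_dict(online_net.state_dict())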

16 MARKS

1. What are the key considerations for designing a neural network architecture for
implementing DQN? – 119
2. How does DQN perform on the CartPole environment? – 120
3. How can deep Q-networks (DQNs) be used to reduce energy usage in buildings? – 125
4. How can distributional reinforcement learning be used to improve the performance of a
DQN agent? – 126
5. How can Rainbow DQN be used to improve the performance of Atari game agents and predict
the results? – 129
