
Quiz # 04

Questions can have multiple correct answers.

1. When deep neural networks are used as function approximators, policy-based RL: (d)

a) Look for the policy to achieve maximum future reward

b) Estimate the optimal policy

c) Searches directly for the optimal policy

d) All of the above

2. In deep reinforcement learning, what is the major concern about the data? (a,c)

a) The data is not independent and identically distributed (IID)

b) The data is independent and identically distributed (IID)

c) The data is highly correlated

d) The data is not correlated

3. In an experience replay buffer, each transition is stored as: (c)

a) s, a, r`, s

b) s, a`, r, s

c) s, a, r, s`

d) s, a, r, s
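
The (s, a, r, s`) ordering from question 3 can be illustrated with a minimal sketch of a replay buffer. The names `Transition`, `ReplayBuffer`, and `push` are hypothetical and chosen only for illustration; the quiz specifies the tuple layout, not an implementation.

```python
from collections import deque, namedtuple

# Hypothetical container for one transition, in the (s, a, r, s`) order from the quiz.
Transition = namedtuple("Transition", ["s", "a", "r", "s_next"])


class ReplayBuffer:
    """Fixed-capacity buffer of transitions (illustrative sketch only)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition once full.
        self.buffer = deque(maxlen=capacity)

    def push(self, s, a, r, s_next):
        # Store state, action, reward, and next state together as one record.
        self.buffer.append(Transition(s, a, r, s_next))

    def __len__(self):
        return len(self.buffer)


# Toy usage: integer states, a single action, constant reward.
buf = ReplayBuffer(capacity=3)
for t in range(5):
    buf.push(s=t, a=0, r=1.0, s_next=t + 1)
```

With a capacity of 3 and five pushes, only the three most recent transitions remain in the buffer.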

4. DQN is: (a)

a) Target policy

b) On-policy

c) Current policy

d) Both b and c

5. Consider the following experience replay buffer, B. What would be an appropriate mini-batch from the given replay buffer? (b)

1. s, a, r, s`
2. s, a, r, s`
3. s, a, r, s`
4. s, a, r, s`
5. s, a, r, s`
6. s, a, r, s`
7. s, a, r, s`
8. s, a, r, s`
9. s, a, r, s`
10. s, a, r, s`

a) [1,2,8,10]

b) [1,4,7,10]

c) [4,6,7,8]

d) [5,8,9,10]
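
Mini-batches are typically drawn at random from the buffer so that the sampled transitions are less correlated than a run of consecutive ones. The sketch below assumes uniform sampling without replacement; `sample_minibatch` and the toy buffer contents are hypothetical, and the indices drawn depend on the random seed.

```python
import random


def sample_minibatch(buffer, batch_size, rng=random):
    """Uniformly sample distinct transitions from the buffer (no replacement)."""
    return rng.sample(list(buffer), batch_size)


# Toy buffer of 10 placeholder transitions in (s, a, r, s`) order.
buffer = [(f"s{i}", f"a{i}", float(i), f"s{i}_next") for i in range(1, 11)]

# A seeded generator is used here only to make the example repeatable.
rng = random.Random(0)
batch = sample_minibatch(buffer, batch_size=4, rng=rng)
```

Each call returns four distinct transitions taken from anywhere in the buffer rather than a contiguous block.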

6. While training a DQN, the next state (s`) comes from: (b)

a) Q-Net

b) Q̂-Net

c) Y-Q(s,a)

d) Y-Q(s,a`)
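
For context on question 6, the sketch below shows how the next state s` commonly enters the DQN training target, assuming the standard formulation in which a separate target network evaluates s`. The function `td_target`, the lookup-table "network", and the value of `gamma` are hypothetical stand-ins, not taken from the quiz.

```python
def td_target(r, s_next, done, q_target, gamma=0.99):
    """Compute y = r + gamma * max over a` of Q_target(s`, a`).

    q_target is assumed to map a state to a list of action values.
    On terminal transitions there is no bootstrap term.
    """
    if done:
        return r
    return r + gamma * max(q_target(s_next))


# Toy stand-in for a target network: a lookup table of action values.
q_target_table = {"s1": [0.5, 2.0], "s2": [1.0, 0.0]}


def q_target(state):
    return q_target_table[state]


y = td_target(r=1.0, s_next="s1", done=False, q_target=q_target, gamma=0.9)
y_terminal = td_target(r=1.0, s_next="s2", done=True, q_target=q_target, gamma=0.9)
```

Here `y` is approximately 2.8 (1.0 + 0.9 × 2.0), and `y_terminal` is just the reward because the episode has ended.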
