Key Concepts on Deep Neural Networks
Due Aug 30, 2:59 PM
TO PASS 80% or higher
Congratulations! You passed!
LATEST SUBMISSION GRADE: 100%

1. What is the "cache" used for in our implementation of forward propagation and backward propagation?

   ( ) It is used to keep track of the hyperparameters that we are searching over, to speed up computation.
   (x) We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives.
   ( ) It is used to cache the intermediate values of the cost function during training.
   ( ) We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.

   ✓ Correct
   Correct, the "cache" records values from the forward propagation units and sends them to the backward propagation units because they are needed to compute the chain-rule derivatives. (See the sketch after question 6 for one way a cache can be built.)

2. Among the following, which ones are "hyperparameters"? (Check all that apply.)

   [x] size of the hidden layers n^[l]
       ✓ Correct
   [ ] activation values a^[l]
   [x] number of layers L in the neural network
       ✓ Correct
   [ ] bias vectors b^[l]
   [ ] weight matrices W^[l]
   [x] learning rate α
       ✓ Correct
   [x] number of iterations
       ✓ Correct

3. Which of the following statements is true?

   ( ) The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.
   (x) The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers.

   ✓ Correct

4. Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l = 1, 2, ..., L. True/False?

   (x) False
   ( ) True

   ✓ Correct
   Forward propagation propagates the input through the layers. Although for shallow networks we may just write all the lines (a^[2] = g^[2](z^[2]), z^[2] = W^[2] a^[1] + b^[2], ...), in a deeper network we cannot avoid a for loop iterating over the layers: (a^[l] = g^[l](z^[l]), z^[l] = W^[l] a^[l-1] + b^[l], ...).

5. Assume we store the values for n^[l] in an array called layer_dims, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has four hidden units, layer 2 has 3 hidden units, and so on. Which of the following for-loops will allow you to initialize the parameters for the model?

   ( ) for i in range(1, len(layer_dims)):
           parameter['W' + str(i)] = np.random.randn(layer_dims[i-1], layer_dims[i]) * 0.01
           parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01

   ( ) for i in range(1, len(layer_dims)/2):
           parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i-1]) * 0.01
           parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01

   ( ) for i in range(1, len(layer_dims)/2):
           parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i-1]) * 0.01
           parameter['b' + str(i)] = np.random.randn(layer_dims[i-1], 1) * 0.01

   (x) for i in range(1, len(layer_dims)):
           parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i-1]) * 0.01
           parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01

   ✓ Correct
   (A runnable version of this loop appears in the sketch after question 6.)

6. Consider the following neural network.

   [Figure: a network with inputs x1, x2, x3, three hidden layers, and a single output unit.]

   How many layers does this network have?

   ( ) The number of layers L is 4. The number of hidden layers is 4.
   (x) The number of layers L is 4. The number of hidden layers is 3.
   ( ) The number of layers L is 3. The number of hidden layers is 3.
   ( ) The number of layers L is 5. The number of hidden layers is 4.

   ✓ Correct
   Yes. As seen in lecture, the number of layers is counted as the number of hidden layers + 1. The input and output layers are not counted as hidden layers.
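Questions 1, 4, and 5 fit together in practice: parameters are initialized per layer, forward propagation runs an explicit loop over the layers, and each layer stores a cache for the backward pass. The sketch below is a minimal illustration under stated assumptions (ReLU hidden units, a sigmoid output, and helper names initialize_parameters / L_model_forward chosen here for illustration); it is not the course's grading code.

    import numpy as np

    def initialize_parameters(layer_dims):
        # Same loop as the selected answer in question 5.
        # In practice, biases are often initialized to zeros rather than small random values.
        parameter = {}
        for i in range(1, len(layer_dims)):
            parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i - 1]) * 0.01
            parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01
        return parameter

    def L_model_forward(X, parameter):
        # Explicit for-loop over the layers (question 4): vectorization removes the loops
        # over training examples and hidden units, but not the loop over layers.
        L = len(parameter) // 2
        A = X
        caches = []
        for l in range(1, L + 1):
            A_prev = A
            W = parameter['W' + str(l)]
            b = parameter['b' + str(l)]
            Z = W @ A_prev + b                    # z[l] = W[l] a[l-1] + b[l]
            if l < L:
                A = np.maximum(0, Z)              # ReLU in the hidden layers (an assumption)
            else:
                A = 1.0 / (1.0 + np.exp(-Z))      # sigmoid in the output layer (an assumption)
            caches.append((A_prev, W, b, Z))      # the "cache" (question 1): values backprop will need
        return A, caches

    layer_dims = [5, 4, 3, 2, 1]                  # n_x = 5 picked arbitrarily for this demo
    parameter = initialize_parameters(layer_dims)
    AL, caches = L_model_forward(np.random.randn(5, 10), parameter)
    print(AL.shape)                               # (1, 10): one prediction per training example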
7. During forward propagation, in the forward function for layer l you need to know what the activation function in that layer is (sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what the activation function for layer l is, since the gradient depends on it. True/False?

   ( ) False
   (x) True

   ✓ Correct
   Yes, as you've seen in Week 3, each activation has a different derivative. Thus, during backpropagation you need to know which activation was used in the forward propagation to be able to compute the correct derivative. (A sketch of an activation-aware backward step is given at the end of this page.)

8. There are certain functions with the following properties:
   (i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but
   (ii) To compute it using a deep network circuit, you need only an exponentially smaller network.
   True/False?

   (x) True
   ( ) False

   ✓ Correct

9. Consider the following 2 hidden layer neural network:

   [Figure: a 2-hidden-layer network; from the answers below it has 4 input features, 4 units in the first hidden layer, 3 units in the second hidden layer, and a single output unit.]

   Which of the following statements are True? (Check all that apply.)

   [ ] W^[2] will have shape (3, 1)
   [ ] b^[1] will have shape (3, 1)
   [x] b^[1] will have shape (4, 1)
       ✓ Correct: Yes. More generally, the shape of b^[l] is (n^[l], 1).
   [x] W^[1] will have shape (4, 4)
       ✓ Correct: Yes. More generally, the shape of W^[l] is (n^[l], n^[l-1]).
   [ ] b^[2] will have shape (4, 1)
   [x] b^[2] will have shape (3, 1)
       ✓ Correct: Yes. More generally, the shape of b^[l] is (n^[l], 1).
   [x] W^[2] will have shape (3, 4)
       ✓ Correct: Yes. More generally, the shape of W^[l] is (n^[l], n^[l-1]).
   [ ] W^[3] will have shape (3, 1)
   [x] W^[3] will have shape (1, 3)
       ✓ Correct: Yes. More generally, the shape of W^[l] is (n^[l], n^[l-1]).
   [x] b^[3] will have shape (1, 1)
       ✓ Correct: Yes. More generally, the shape of b^[l] is (n^[l], 1).
   [ ] W^[1] will have shape (3, 4)

10. Whereas the previous question used a specific network, in the general case what is the dimension of W^[l], the weight matrix associated with layer l?

   ( ) W^[l] has shape (n^[l-1], n^[l])
   ( ) W^[l] has shape (n^[l], n^[l+1])
   ( ) W^[l] has shape (n^[l+1], n^[l])
   (x) W^[l] has shape (n^[l], n^[l-1])

   ✓ Correct
   True.
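A quick way to check the shape rules from questions 9 and 10 is to build the parameter dictionary for that specific network and print each shape. This is a minimal sketch assuming the layer sizes inferred above (4 inputs, hidden layers of 4 and 3 units, one output); it is not part of the quiz itself.

    import numpy as np

    # Layer sizes inferred from question 9: n[0] = 4, n[1] = 4, n[2] = 3, n[3] = 1.
    layer_dims = [4, 4, 3, 1]

    parameters = {}
    for l in range(1, len(layer_dims)):
        # W[l] has shape (n[l], n[l-1]); b[l] has shape (n[l], 1).
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))

    for l in range(1, len(layer_dims)):
        print('W' + str(l), parameters['W' + str(l)].shape,
              'b' + str(l), parameters['b' + str(l)].shape)
    # Prints: W1 (4, 4) b1 (4, 1)
    #         W2 (3, 4) b2 (3, 1)
    #         W3 (1, 3) b3 (1, 1)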

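Finally, question 7 is the reason layer implementations record which activation they used: the backward step must apply the derivative of that same activation when computing dZ. Below is a minimal, illustrative sketch; the function names (relu_backward, sigmoid_backward, linear_activation_backward) and the cache layout (A_prev, W, b, Z) are assumptions made here, not the course's exact API.

    import numpy as np

    def relu_backward(dA, Z):
        # Derivative of ReLU: 1 where Z > 0, else 0.
        return dA * (Z > 0)

    def sigmoid_backward(dA, Z):
        # Derivative of sigmoid: s * (1 - s), with s = sigmoid(Z).
        s = 1.0 / (1.0 + np.exp(-Z))
        return dA * s * (1 - s)

    def linear_activation_backward(dA, cache, activation):
        # The backward function must know which activation the forward pass used,
        # because dZ = dA * g'(Z) depends on the derivative of that activation.
        A_prev, W, b, Z = cache
        m = A_prev.shape[1]
        if activation == "relu":
            dZ = relu_backward(dA, Z)
        elif activation == "sigmoid":
            dZ = sigmoid_backward(dA, Z)
        dW = (dZ @ A_prev.T) / m
        db = np.sum(dZ, axis=1, keepdims=True) / m
        dA_prev = W.T @ dZ
        return dA_prev, dW, db

    # Quick smoke test with a toy cache of 5 examples.
    A_prev, W, b = np.random.randn(3, 5), np.random.randn(2, 3), np.zeros((2, 1))
    Z = W @ A_prev + b
    dA = np.random.randn(2, 5)
    dA_prev, dW, db = linear_activation_backward(dA, (A_prev, W, b, Z), "relu")
    print(dW.shape, db.shape, dA_prev.shape)   # (2, 3) (2, 1) (3, 5)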