1) Use pseudocode to fill in the steps (1 to 4) in such a way that the model goes through the process
of training. Stopping criteria can be ignored. Assign and reuse variables if needed. (5 Points)
Algorithm 1: Generic machine learning model training
input: batches = {samples, targets}, learning rate = λ, Parameters = Θ, loss = MSE, model = NN
init model(parameters);
for batch in data do
1
2
3
4
end
return model;
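For orientation, a minimal sketch of how the four blanks are typically filled (forward pass, loss computation, gradient computation, parameter update), written here for a one-parameter linear model with MSE loss; the variable names, the toy data, and the hand-derived gradient are illustrative assumptions, not part of the exam.

# Sketch of the four blanks: (1) forward pass, (2) loss, (3) gradients, (4) update.
# Toy linear model y = theta * x with MSE loss; all values are illustrative.
batches = [([1.0, 2.0], [2.0, 4.0]), ([3.0, 4.0], [6.0, 8.0])]  # (samples, targets)
lr = 0.01     # learning rate (lambda)
theta = 0.0   # model parameter (Theta), i.e. init model(parameters)

for samples, targets in batches:
    preds = [theta * x for x in samples]                                      # 1) forward pass
    loss = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(samples)   # 2) MSE loss
    grad = sum(2 * (p - t) * x                                                # 3) dMSE/dtheta
               for p, t, x in zip(preds, targets, samples)) / len(samples)
    theta -= lr * grad                                                        # 4) gradient-descent step

print(theta)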
2) The following table contains predicted values from a simplified linear model (ŷᵢ = β₀ + xᵢ)
and their true (i.e. expected) counterparts. Show the calculation of the MSE for this model! How
should β₀ be updated to minimize the MSE? (5 Points)
Predicted Expected
2 1
3 2
5 4
8 7
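A worked sketch of the expected calculation, assuming the four rows above are the whole dataset:

\[
\mathrm{MSE} = \frac{1}{4}\Big[(2-1)^2 + (3-2)^2 + (5-4)^2 + (8-7)^2\Big] = \frac{4}{4} = 1
\]

Every prediction overshoots its target by exactly 1, so \(\frac{\partial\,\mathrm{MSE}}{\partial \beta_0} = \frac{2}{4}\sum_i(\hat{y}_i - y_i) = 2 > 0\); β₀ should therefore be decreased (here by exactly 1, which brings the MSE to 0).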
1) Which types of neural networks do you know and for which tasks are they typically used? (2
Points)
2) Explain what distinguishes a Long Short-Term Memory model (LSTM) from a conventional
Recurrent Neural Network (RNN). (3 Points) (See the sketch after this question group.)
3) Name at least three NLP tasks for which an LSTM is suitable! (3 Points)
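For question 2, a minimal scalar sketch of the two update rules; the weights are toy values and all gates share one weight only to keep the code short (real LSTMs use separate weight matrices per gate), so this illustrates the structural difference rather than a reference implementation.

# A vanilla RNN keeps a single hidden state updated through one tanh;
# an LSTM adds a cell state plus forget/input/output gates that control
# what is kept, written, and exposed at each step.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rnn_step(x, h, w_x=0.5, w_h=0.5, b=0.0):
    # conventional RNN: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b)
    return math.tanh(w_x * x + w_h * h + b)

def lstm_step(x, h, c, w=0.5, b=0.0):
    f = sigmoid(w * x + w * h + b)          # forget gate
    i = sigmoid(w * x + w * h + b)          # input gate
    o = sigmoid(w * x + w * h + b)          # output gate
    c_tilde = math.tanh(w * x + w * h + b)  # candidate cell update
    c = f * c + i * c_tilde                 # cell state: long-term memory
    h = o * math.tanh(c)                    # hidden state: short-term output
    return h, c

h_rnn, h_lstm, c_lstm = 0.0, 0.0, 0.0
for x in [1.0, -0.5, 2.0]:
    h_rnn = rnn_step(x, h_rnn)
    h_lstm, c_lstm = lstm_step(x, h_lstm, c_lstm)
print(h_rnn, h_lstm, c_lstm)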
1) Explain the terms overfitting and underfitting! When can they each occur? (2 Points)
2) Explain the differences between parameters and hyperparameters in a machine learning model. (3
Points)
[Figure: small feed-forward network over the inputs x0 and x1; the edge weights and the neuron biases b = −2, b = 0.5, and b = −3 are given in the diagram.]
Show that the network correctly classifies the following data. Assume sgn as the activation function. (5
Points)

\[
\mathrm{sgn}(x) := \begin{cases} +1, & \text{if } x > 0, \\ -1, & \text{if } x \le 0. \end{cases}
\]

x0   x1   Class
 2    1    1
-1    2    1
-3    2   -1
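The exact weights belong to the figure above; as a sketch of how the check proceeds, the snippet below pushes each data row through a small two-layer sgn network and compares the result with the Class column. The weight and bias values used here are placeholders chosen for illustration, not the ones from the exam figure.

# Verify a sgn-activated feed-forward network against the data table.
# Substitute the weights/biases from the figure; the values below are placeholders.
def sgn(x):
    return 1 if x > 0 else -1

def forward(x, layers):
    # layers: list of (weight_matrix, bias_vector); each neuron computes sgn(w . x + b)
    for weights, biases in layers:
        x = [sgn(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
             for w, b in zip(weights, biases)]
    return x[0]

hidden = ([[1.0, 1.0], [1.0, -1.0]], [-0.5, 0.0])  # placeholder hidden layer
output = ([[2.0, 1.0]], [-0.5])                    # placeholder output neuron
data = [((2, 1), 1), ((-1, 2), 1), ((-3, 2), -1)]  # (x0, x1), Class

for x, expected in data:
    print(x, "->", forward(x, [hidden, output]), "expected", expected)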
1) Name and describe a task usually used for the pretraining of language models (e.g. BERT). (2
Points)
2) What are positional embeddings and why are they used in the context of Transformer models? (2 Points) (See the sketch after this question block.)
3) Name at least four downstream tasks at token or text level and briefly explain them. (4 Points)
4) Discuss where even the largest language models reach their limits! (2 Points)
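Regarding question 2: BERT itself learns its positional embeddings as ordinary trainable vectors, whereas the original Transformer uses a fixed sinusoidal encoding; the latter is sketched below as one concrete illustration (the sequence length and model dimension are arbitrary assumptions).

# Fixed sinusoidal positional encodings (original Transformer variant).
# Each position receives a distinct vector that is added to the token embedding,
# so the otherwise order-agnostic self-attention can distinguish positions.
import math

def positional_encoding(seq_len, d_model):
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

for row in positional_encoding(4, 8):   # e.g. 4 positions, 8 dimensions
    print([round(v, 3) for v in row])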
Benchmarks (10 Points)
\[
G(y, n) := \big[\,(y_1, \ldots, y_n),\ (y_2, \ldots, y_{n+1}),\ \ldots,\ (y_{|y|-n+1}, \ldots, y_{|y|})\,\big] \qquad (1)
\]

\[
P(\hat{y}, y, n) := \frac{\displaystyle\sum_{g \in G(\hat{y}, n)} \min\big(C(g, \hat{y}, n),\ C(g, y, n)\big)}{\displaystyle\sum_{g \in G(\hat{y}, n)} C(g, \hat{y}, n)} \qquad (3)
\]
1) Calculate the uni-/bi-grams for G(ŷ, 1), G(y, 1), G(ŷ, 2), G(y, 2). (4 Points)
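The candidate ŷ and reference y are given on the exam sheet; as a sketch, the snippet below builds G(·, n) exactly as in Equation (1) and evaluates the modified n-gram precision of Equation (3), using two made-up sentences in place of the exam data.

# n-gram extraction (Equation 1) and modified n-gram precision (Equation 3).
# The example sentences are placeholders; use the sequences from the sheet.
def G(y, n):
    # all contiguous n-grams of the token sequence y
    return [tuple(y[i:i + n]) for i in range(len(y) - n + 1)]

def C(g, y, n):
    # number of occurrences of the n-gram g among the n-grams of y
    return G(y, n).count(g)

def P(y_hat, y, n):
    grams = G(y_hat, n)
    clipped = sum(min(C(g, y_hat, n), C(g, y, n)) for g in grams)
    return clipped / sum(C(g, y_hat, n) for g in grams)

y_hat = "the cat sat on the mat".split()  # hypothetical candidate
y = "the cat is on the mat".split()       # hypothetical reference
print(G(y_hat, 1), G(y_hat, 2))           # uni- and bi-grams of the candidate
print(P(y_hat, y, 1), P(y_hat, y, 2))     # modified uni-/bi-gram precision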