Question No. 1
The expected error between the label y of a data sample x and the output of a supervised model $\hat{f}(x)$ depends on bias or on variance, according to the weights of the model.
No Answer / TRUE / FALSE
Question No. 2
State if the following statements about nearest neighbours algorithms are true or false:
In the K-nearest-neighbours algorithm for classification, the output of a new input vector cannot be obtained as the majority of the outputs of the K-nearest vectors in memory.
No Answer / TRUE / FALSE
$$y = \frac{\sum_{j=1}^{k} \frac{y_j}{d(x_j, x) + d_0}}{\sum_{j=1}^{k} \frac{1}{d(x_j, x) + d_0}}.$$
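The distance-weighted average above can be sketched in a few lines. This is only an illustration under stated assumptions: Euclidean distance is used for d, and the function name `weighted_knn_predict`, the sample layout and the default value of `d0` are hypothetical choices, not part of the exam text.

```python
import math

def weighted_knn_predict(samples, query, k=3, d0=1e-3):
    """Distance-weighted k-NN regression (illustrative sketch).

    samples: list of (x, y) pairs, x a tuple of floats, y a float.
    d0: small positive offset that avoids division by zero.
    """
    # Keep the k stored samples closest to the query (Euclidean distance assumed).
    nearest = sorted(samples, key=lambda s: math.dist(s[0], query))[:k]
    # Each neighbour is weighted by 1 / (d(x_j, x) + d0).
    weights = [1.0 / (math.dist(x, query) + d0) for x, _ in nearest]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

Closer neighbours receive larger weights, so they influence the prediction more than distant ones.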
Question No. 3
Given the dataset in the table, state if the following statements are true or false:
Question No. 4
State if the following statements about classification metrics are true or false:
Precision is the proportion of true positives over all the predictions predicted as positive by a classifier.
No Answer / TRUE / FALSE
Accuracy is defined as the proportion of true positives and true negatives over all the predictions made by a classifier.
No Answer / TRUE / FALSE
Recall is the proportion of true negatives over all the elements that actually belong to the positive class.
No Answer / TRUE / FALSE
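For reference when checking statements of this kind, the standard definitions in terms of confusion-matrix counts can be sketched as follows. The function name `classification_metrics` and its argument names are hypothetical; the formulas are the usual textbook ones (precision = TP / (TP + FP), recall = TP / (TP + FN), accuracy = (TP + TN) / total).

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (precision, recall, accuracy) from confusion-matrix counts."""
    # Precision: fraction of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    # Recall: fraction of actual positives that were predicted positive.
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    # Accuracy: fraction of all predictions that are correct.
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, accuracy
```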
Question No. 5
State if the following statements about linear functions and convexity are true or false:
The points satisfying the inequality $6x_1 + 9x_2 > 0$ do not form a convex set.
No Answer / TRUE / FALSE
Question No. 6
State if the following statements about training, validating and testing are true or false:
"Statistically significant result" means that one can estimate probabilities of obtaining specific results, and that the probability to obtain the result by chance is less than a threshold value. Therefore the conclusion "statistically significant or not" depends on the threshold value.
No Answer / TRUE / FALSE
Question No. 7
Given a dataset composed of n tuples $(x_i, y_i)$ with $1 \le i \le n$, each output $y_i$ is predicted using some regression model $\hat{f}$. State if the following metrics are suitable to assess the performance of $\hat{f}$:
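$\frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{f}(x_i)|$
No Answer / TRUE / FALSE
$\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{f}(x_i))$
No Answer / TRUE / FALSE
$\frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{f}(x_i))^2$
No Answer / TRUE / FALSE
$\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{f}(x_i))^2$
No Answer / TRUE / FALSE
$\sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{f}(x_i))}$
No Answer / TRUE / FALSE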
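To compare how some of these candidate metrics behave on the same predictions, a small sketch follows. The function name `regression_metrics` and the dictionary keys are hypothetical; it computes the mean absolute error, the mean signed error and the mean squared error between $y_i$ and $\hat{f}(x_i)$.

```python
def regression_metrics(y_true, y_pred):
    """Compute three error summaries over paired targets and predictions."""
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, y_pred)]
    return {
        # Mean absolute error: errors cannot cancel each other.
        "mae": sum(abs(e) for e in errors) / n,
        # Mean signed error: positive and negative errors can cancel.
        "mean_error": sum(errors) / n,
        # Mean squared error: large errors are penalized more heavily.
        "mse": sum(e * e for e in errors) / n,
    }
```

With targets `[1, 2, 3]` and predictions `[2, 1, 3]`, two of the three predictions are wrong, yet the signed errors sum to zero while the absolute and squared errors do not.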
Question No. 8
Given the points in the following plot, predict the class of triangle points using 3-nearest
neighbours. Consider all blue points as members of the positive class and all red points
as members of the negative class. State if the following statements are true or false:
Precision is $\frac{1}{3}$.
No Answer / TRUE / FALSE
Recall is $\frac{1}{3}$.
No Answer / TRUE / FALSE
Accuracy is $\frac{2}{3}$.
No Answer / TRUE / FALSE
Question No. 9
State if the following statements about Locally-Weighted Regression are true or false:
Question No. 10
State if the following statements about bottom-up clustering are true or false:
In the following plot, the covariance between the input dimensions is positive. Also, if using PCA to project the cloud of points from 2 dimensions to 1 dimension and then to 2 dimensions again, there is no information loss.
No Answer / TRUE / FALSE
PCA is sensitive to outliers, because points which are far away from most of the other points contribute greatly to the sum of squared distances in the projected space.
No Answer / TRUE / FALSE
In PCA, input variables are transformed into a possibly lower number of uncorrelated input variables called principal components. In ascending order, principal components take into account as much of the remaining variability of data as possible.
No Answer / TRUE / FALSE
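The 2-D to 1-D to 2-D round trip mentioned in the first PCA statement can be tried numerically. The sketch below is an illustration under assumptions: it handles only 2-D points, uses the closed-form angle of the first principal axis of a 2x2 covariance matrix, and the function name `pca_1d_roundtrip` is hypothetical. It returns the covariance between the two dimensions and the total squared reconstruction error.

```python
import math

def pca_1d_roundtrip(points):
    """Project 2-D points on the first principal axis and map them back."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n
    # Angle of the direction of maximum variance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    # 2-D -> 1-D (scores along the axis), then 1-D -> 2-D (reconstruction).
    scores = [x * ux + y * uy for x, y in centered]
    rebuilt = [(mx + s * ux, my + s * uy) for s in scores]
    loss = sum((x - rx) ** 2 + (y - ry) ** 2
               for (x, y), (rx, ry) in zip(points, rebuilt))
    return sxy, loss
```

The reconstruction error is zero only when all points lie exactly on a line; any spread orthogonal to the first principal axis is lost in the round trip.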
Question No. 11
State if the following statements about top-down clustering are true or false:
In bottom-up clustering, at each step the most similar sets of points are merged together until a stopping criterion is met and the merging stops.
No Answer / TRUE / FALSE
In top-down clustering, at each step the most similar sets of points are merged together until a stopping criterion is met and the merging stops.
No Answer / TRUE / FALSE
Let x and y be two vectors of length n. $\|x - y\|$ defines the Manhattan distance between x and y.
No Answer / TRUE / FALSE
$$\Delta p_c = \eta \cdot \mathrm{membership}(x, c) \cdot (x - p_c)$$
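One application of this update rule can be sketched as follows. The exam text does not define how membership(x, c) is computed, so the sketch takes it as a precomputed number; the function name `update_prototype` and the default learning rate are hypothetical.

```python
def update_prototype(p_c, x, membership, eta=0.1):
    """Move prototype p_c towards sample x by eta * membership * (x - p_c)."""
    # membership is assumed to be a precomputed value for (x, c),
    # e.g. 1.0 for full membership and 0.0 for none.
    return [p + eta * membership * (xi - p) for p, xi in zip(p_c, x)]
```

With `membership=0.0` the prototype does not move; with `membership=1.0` and `eta=1.0` it jumps onto the sample.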
Question No. 12
If the number of different points is larger than the number of dimensions, the solution of the linear system can be found by computing the inverse of the matrix.
No Answer / TRUE / FALSE
Question No. 13
State if the following statements about the bias-variance dilemma are true or false:
Models with too few parameters tend to produce a large variation of results if runs of training and testing are repeated (with some randomization).
No Answer / TRUE / FALSE
Models with too many parameters tend to fail because of a large bias, since they define models which might easily overfit data.
No Answer / TRUE / FALSE
Question No. 14
State if the following statements about supervised learning are true or false:
A suitable error measure on the training examples for regression models is the sum of squared errors between the known output $y_i$ and the output $\hat{f}(x_i)$ obtained by the models: $\sum_i (y_i - \hat{f}(x_i))^2$.
No Answer / TRUE / FALSE
Available data is composed of vectors of features $\vec{x}$ associated to an output y, and such data is used to build a function which models the relationship between $\vec{x}$ and y.
No Answer / TRUE / FALSE
Models for solving classification and regression problems are built by optimizing an objective function, which is assumed to be sufficiently smooth so that the generalization of a model built on training examples is possible.
No Answer / TRUE / FALSE
Question No. 15
State if the following statements about neurons and maximum likelihood are true or
false:
If w are the weights of logistic regression, Pr() is the output and $y_i$ the correct classification (0 or 1), the probability of obtaining the given output on the examples, assuming independence, is:
$$\mathrm{Likelihood}(w) = \sum_{i=1}^{\ell} \Pr(y_i \mid x_i, w)^{y_i} + (1 - \Pr(y_i \mid x_i, w))^{(1 - y_i)}.$$
No Answer / TRUE / FALSE
Consider a single perceptron defined by the model $\hat{f}(w, x) = w^T \cdot x$, where w is a vector of weights and x is a vector of input data. Such a model can be used to linearly separate the two classes in the following plot:
No Answer / TRUE / FALSE
If w are the weights of logistic regression, Pr() is the output and $y_i$ the correct classification (0 or 1), the probability of obtaining the given output on the examples, assuming independence, is:
$$\mathrm{Likelihood}(w) = \prod_{i=1}^{\ell} \Pr(y_i \mid x_i, w)^{y_i} (1 - \Pr(y_i \mid x_i, w))^{(1 - y_i)}.$$
No Answer / TRUE / FALSE
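To experiment with the likelihood of independent examples, a minimal sketch follows. It rests on assumptions not stated in the exam text: the model output is taken to be the sigmoid of $w^T x$, interpreted as the probability of class 1, and the function names `sigmoid` and `likelihood` are hypothetical. Because the examples are assumed independent, their probabilities are multiplied.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def likelihood(w, xs, ys):
    """Probability of the observed labels ys under independence."""
    total = 1.0
    for x, y in zip(xs, ys):
        # p is the model's probability that the label of x is 1 (assumption).
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        # The exponent selects p when y == 1 and (1 - p) when y == 0.
        total *= (p ** y) * ((1.0 - p) ** (1 - y))
    return total
```

With all weights at zero every example has probability 0.5, so three examples give a likelihood of 0.125 whatever their labels are.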
In autoencoders, the perceptrons in the middle layer of the neural network extract, from the original dataset, the input dimensions which are used to build a model that minimizes the difference between input data and rebuilt data.
No Answer / TRUE / FALSE
Question No. 16
State if the following statements about goodness functions are true or false.
Question No. 17
Given the dataset in the table, state if the following statements are true or false:
Question No. 18
$\hat{f}(x) = w_1 x_1 + w_2 x_2^2 + w_3 x_3^3 + w_4 x_4^4$
No Answer / TRUE / FALSE
$\hat{f}(x) = 9$
No Answer / TRUE / FALSE
$\hat{f}(x) = w_1 x_1 x_2 + w_2 x_2 x_3 + w_3 x_3 x_4 + w_4 x_4^2$
No Answer / TRUE / FALSE
$\hat{f}(x) = \frac{1}{w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4}$
No Answer / TRUE / FALSE
$\hat{f}(x) = w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4$
No Answer / TRUE / FALSE
$\hat{f}(x) = 9 x_3$
No Answer / TRUE / FALSE
$\hat{f}(x) = x_1 w_1 + x_2 w_2^2 + x_3 w_3^3 + x_4 w_4^4$
No Answer / TRUE / FALSE
$\hat{f}(x) = w_1^{x_1} + w_2^{x_2} + w_3^{x_3} + w_4^{x_4}$
No Answer / TRUE / FALSE
Question No. 19
If the number of input variables is 33, and one starts training from 99 examples, the parameters of the linear model obtaining zero error on the examples can always be determined.
No Answer / TRUE / FALSE
If the number of input variables is 77, and one starts training from 33 different examples, the parameters of the linear model obtaining zero error on the examples can always be determined.
No Answer / TRUE / FALSE
Question No. 20
State if the following statements about bottom-up clustering are true or false:
The covariance matrix can be used to estimate the shape of a spherical cluster.
No Answer / TRUE / FALSE
$$\bar{\delta}_{avg}(C, D) = \frac{\sum_{x \in C,\, y \in D} \delta(x, y)}{|C| \cdot |D|},$$
$$\delta(x, y) = \sqrt{(x - y)^T \cdot S^{-1} \cdot (x - y)}.$$
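The two formulas above (an average distance between clusters C and D, with δ a distance that uses the inverse of a matrix S) can be sketched as follows. The sketch assumes that the inverse matrix is already available and passed in as a nested list; the names `mahalanobis`, `average_linkage` and `s_inv` are hypothetical.

```python
import math

def mahalanobis(x, y, s_inv):
    """Compute sqrt((x - y)^T S^{-1} (x - y)) for a given inverse matrix."""
    diff = [a - b for a, b in zip(x, y)]
    dim = len(diff)
    # Row vector diff^T times the matrix S^{-1}.
    row = [sum(diff[i] * s_inv[i][j] for i in range(dim)) for j in range(dim)]
    # Dot product of the result with diff.
    return math.sqrt(sum(r * d for r, d in zip(row, diff)))

def average_linkage(C, D, s_inv):
    """Average of the pairwise distances between points of C and points of D."""
    total = sum(mahalanobis(x, y, s_inv) for x in C for y in D)
    return total / (len(C) * len(D))
```

When `s_inv` is the identity matrix, the distance reduces to the ordinary Euclidean distance.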
Question No. 22
State if the following statements about feature selection are true or false:
Filter methods are not able to identify mutual relationships between different inputs.
No Answer / TRUE / FALSE
Even if the Pearson correlation coefficient between two data features is zero, it is not possible to state that such features are independent.
No Answer / TRUE / FALSE
The linear correlation coefficient between $x_i$ and $y_i$ may change if x values are normalized by dividing them by their standard deviation $\sigma > 0$ in the following manner: $x_i / \sigma$.
No Answer / TRUE / FALSE
If the Pearson correlation coefficient between two data features is zero, the Mutual Information between such features is also zero.
No Answer / TRUE / FALSE
The Pearson correlation coefficient measures the linear relationship between numeric data features. It is defined as the covariance of two input features divided by the product of their standard deviations; as a consequence, if the covariance is negative then the correlation coefficient is also going to be negative.
No Answer / TRUE / FALSE
A possible normalization of the data consists of rescaling all input dimensions so that their range is [0, 1].
No Answer / TRUE / FALSE
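Statements about the Pearson correlation coefficient can be checked on small hand-made data. The sketch below implements the definition quoted above (covariance divided by the product of the standard deviations) and a min-max rescaling to [0, 1]; the function names `pearson` and `min_max_rescale` are hypothetical, and both assume non-constant input lists.

```python
import math

def pearson(xs, ys):
    """Covariance of xs and ys divided by the product of their std deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

def min_max_rescale(xs):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]
```

For example, with `xs = [-2, -1, 0, 1, 2]` and `ys = [x * x for x in xs]` the coefficient is zero although `ys` is fully determined by `xs`, and dividing `xs` by a positive constant leaves the coefficient unchanged.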