This document contains exercises related to machine learning fundamentals. It includes:
1) An exercise on Bayes risk that asks the risk of a classifier under 0-1 loss in a binary classification problem where the probability of a class depends on the relative values of input features.
2) An exercise on consistency of empirical risk minimization (ERM) learners that asks about reasons ERM may not be Bayes consistent and whether decision tree learners of a certain depth could be consistent.
3) An exercise on consistency of 1-nearest neighbor classification that asks the risk of 1-NN on a binary problem and whether it converges to the Bayes risk.
Exercise - What is the Risk of a Classifier ?

Consider, as during the lecture, the following setting:
We have $\mathcal{Y} = \hat{\mathcal{Y}} = \{\text{blue}, \text{orange}\}$ and $P_X = \mathrm{Uniform}([0,1]^2)$.
Also, $P(\text{orange} \mid x^{(1)} > x^{(2)}) = 0.9$ and $P(\text{orange} \mid x^{(1)} \le x^{(2)}) = 0.1$. Define
$$f(x) = \begin{cases} \text{blue} & \text{if } x^{(2)} > 0.5 \\ \text{orange} & \text{otherwise.} \end{cases}$$
• What is the risk of $f$ under the $\ell_{0,1}$ loss? Give the details.
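To sanity-check a pencil-and-paper answer, here is a minimal Monte Carlo sketch (assuming Python with NumPy; the variable names are ours, not from the lecture). It samples the joint distribution above and estimates the 0-1 risk of $f$ empirically:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

# Draw X uniformly on the unit square [0, 1]^2.
x = rng.uniform(size=(N, 2))

# P(orange | x) = 0.9 above the diagonal (x1 > x2), 0.1 below it.
p_orange = np.where(x[:, 0] > x[:, 1], 0.9, 0.1)
y_is_orange = rng.uniform(size=N) < p_orange

# f predicts blue when x2 > 0.5, orange otherwise.
f_is_orange = x[:, 1] <= 0.5

# Empirical 0-1 risk: fraction of points where f disagrees with Y.
risk = np.mean(f_is_orange != y_is_orange)
print(f"Monte Carlo estimate of R(f): {risk:.4f}")
```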
Exercise - Is the ERM Bayes-Consistent ?
Consider a learner that outputs a classifier $h_S(\cdot)$ when given a dataset $S$ as input. This learner is "Bayes-consistent" if $R(h_S) \to R(f^*)$ when $|S| \to \infty$ (convergence in probability).
• Name the two reasons why an ERM learner could fail to be Bayes-consistent.
• Let $DT_k = \{\text{decision trees of depth at most } k\}$. Assume the learner generates classifiers from $DT_3$ by ERM. Can this learner be Bayes-consistent? (A small empirical probe is sketched below.)
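Before answering the last question formally, it can be instructive to watch the depth-3 learner's test error as $|S|$ grows, for instance on the distribution of the previous exercise. Below is a rough sketch (assuming Python with scikit-learn; note that `DecisionTreeClassifier` fits trees greedily, which only approximates exact ERM over $DT_3$, so treat this as an illustration, not a proof):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def sample(n):
    # Same distribution as the previous exercise: X ~ Uniform([0,1]^2),
    # P(orange | x1 > x2) = 0.9 and P(orange | x1 <= x2) = 0.1.
    x = rng.uniform(size=(n, 2))
    p = np.where(x[:, 0] > x[:, 1], 0.9, 0.1)
    y = (rng.uniform(size=n) < p).astype(int)
    return x, y

x_test, y_test = sample(200_000)
for n in [100, 1_000, 10_000, 100_000]:
    x_tr, y_tr = sample(n)
    tree = DecisionTreeClassifier(max_depth=3).fit(x_tr, y_tr)
    err = np.mean(tree.predict(x_test) != y_test)
    print(f"n = {n:>7}: depth-3 test error ~ {err:.3f}")
```

Comparing these numbers with your answer to the first exercise should suggest what happens to the risk of the depth-3 learner even as $|S| \to \infty$.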
Exercise - Is the 1-Nearest Neighbor Bayes-Consistent ?
Let $\mathcal{X} = \mathbb{R}$ and $\mathcal{Y} = \hat{\mathcal{Y}} = \{0, 1\}$. Assume $P$ is such that $X$ is uniformly distributed over $[0, 1]$ and $Y$ is distributed according to a Bernoulli of parameter $\eta$, independently of $X$. In other words, $P(Y = 1 \mid X) = \eta$ is constant. Assume $S = \{(x_1, y_1), \ldots, (x_N, y_N)\}$ is drawn from $P^N$.
What is the Bayes risk?
(1) Let us denote $h^{1\mathrm{NN}}_S(\cdot)$ the classifier induced by a 1-nearest neighbor on a dataset $S$. For any $x \in [0, 1]$, we write $I(x) \in \{1, \ldots, N\}$ for the index of the nearest neighbor of $x$ in $S$. Clearly, $h^{1\mathrm{NN}}_S(x) = y_{I(x)}$. What is $P(y_{I(x)} = 1)$, the probability that the nearest neighbor of $x$ in $S$ has a label equal to 1?
(2) The risk of the 1-NN is $R(h^{1\mathrm{NN}}_S) = \mathbb{E}_{X,Y}\left[\mathbb{1}_{[Y \neq y_{I(X)}]}\right]$. Compute its expectation over $S$. Show that if $\eta \in \left]0, \tfrac{1}{2}\right[$ then it is strictly higher than the Bayes risk. What do you conclude?
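To check your computation of $\mathbb{E}_S[R(h^{1\mathrm{NN}}_S)]$ numerically, here is a minimal simulation sketch (assuming Python with NumPy; the choice $\eta = 0.3$ and the sample sizes are arbitrary choices of ours). It averages the empirical 1-NN error over many independent draws of $S$:

```python
import numpy as np

rng = np.random.default_rng(0)
eta = 0.3                       # Bernoulli parameter, chosen in ]0, 1/2[
N, n_test, n_trials = 50, 5_000, 200

errors = []
for _ in range(n_trials):
    # One training sample S: X ~ Uniform([0,1]), Y ~ Bernoulli(eta), independent of X.
    x_train = rng.uniform(size=N)
    y_train = (rng.uniform(size=N) < eta).astype(int)

    # Fresh test points from the same distribution P.
    x_test = rng.uniform(size=n_test)
    y_test = (rng.uniform(size=n_test) < eta).astype(int)

    # 1-NN prediction: the label of the closest training point.
    nn_idx = np.abs(x_test[:, None] - x_train[None, :]).argmin(axis=1)
    errors.append(np.mean(y_train[nn_idx] != y_test))

# Compare this value with your closed-form expression for E_S[R(h_S^1NN)]
# and with the Bayes risk you derived above.
print(f"average 1-NN risk over {n_trials} draws of S: {np.mean(errors):.4f}")
```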