Learning From Data

Summer Semester 2022 Prof. Dr. Jörg Schäfer & Fatima Sajid Butt

Sheet 3
Vapnik-Chervonenkis Theorem (Generalization Bound)
Issue Date: NA
Submission Date: NA

Generalization Bounds
Exercise 1

Consider the Vapnik-Chervonenkis Theorem (Generalization Bound) from lecture 3 (LfD03):

$$P\left[\,|E_{\text{in}}(h) - E_{\text{out}}(h)| > \epsilon\,\right] \le 4\, m_{\mathcal{H}}(2N)\, e^{-\epsilon^2 N/8}.$$

Assume that the growth function takes the form $m_{\mathcal{H}}(N) = 100 N^k$. How many data points are
required if you want to ensure that, with 90% probability, the out-of-sample error does not differ
from the in-sample error by more than 5%, for k = 2, 20, 200 and 2000?
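
A minimal numerical sketch of how such a sample size could be computed, assuming that "90% probability" means δ = 0.1 and "5%" means ε = 0.05. Taking logarithms of $4 \cdot 100\,(2N)^k e^{-\epsilon^2 N/8} \le \delta$ gives the implicit condition $N \ge \frac{8}{\epsilon^2}\left(\log\frac{400}{\delta} + k \log(2N)\right)$, which is solved here by fixed-point iteration. The function names `log_bound` and `required_samples` are illustrative and do not come from the lecture.

```python
import math


def log_bound(n, k, eps=0.05, c=100.0):
    """Natural log of 4 * c * (2n)^k * exp(-eps^2 * n / 8), computed without overflow."""
    return math.log(4.0 * c) + k * math.log(2.0 * n) - eps**2 * n / 8.0


def required_samples(k, eps=0.05, delta=0.1, c=100.0):
    """Smallest integer N >= 1 for which the VC bound is at most delta.

    Assumes the growth function m_H(N) = c * N^k from the exercise.
    """
    log_delta = math.log(delta)
    n = 1.0  # start below the solution; the iteration then increases monotonically
    for _ in range(1000):
        n_next = (8.0 / eps**2) * (math.log(4.0 * c / delta) + k * math.log(2.0 * n))
        converged = abs(n_next - n) <= 1e-12 * n_next
        n = n_next
        if converged:
            break
    n_int = math.ceil(n)
    # Correct for rounding so that n_int is exactly the smallest valid integer.
    while log_bound(n_int, k, eps, c) > log_delta:
        n_int += 1
    while n_int > 1 and log_bound(n_int - 1, k, eps, c) <= log_delta:
        n_int -= 1
    return n_int


for k in (2, 20, 200, 2000):
    print(k, required_samples(k))
```

Working with the logarithm of the bound avoids overflow in $(2N)^k$ for large k. The iteration is only a convenience. Any root finder applied to `log_bound(n, k) - log(delta)` should give the same answer.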

Exercise 2

Prove equations (1) and (2) from page 42 of lecture 3 (LfD03), i.e. prove that
$$P\left[\,|E_{\text{in}}(h) - E_{\text{out}}(h)| > \epsilon\,\right] \le 4\, m_{\mathcal{H}}(2N)\, e^{-\epsilon^2 N/8},$$

implies that for any fixed tolerance level δ we have with probability of at least 1 − δ:
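$$E_{\text{out}}(h) \le E_{\text{in}}(h) + \sqrt{\frac{8}{N} \log \frac{4\, m_{\mathcal{H}}(2N)}{\delta}} \tag{1}$$
$$E_{\text{out}}(h) \ge E_{\text{in}}(h) - \sqrt{\frac{8}{N} \log \frac{4\, m_{\mathcal{H}}(2N)}{\delta}}. \tag{2}$$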

Hint: consider the following inequality, work towards both sides, and choose $\epsilon$ and $\delta$ appropriately:

$$\left(E_{\text{in}}(h) - E_{\text{out}}(h)\right)^2 \le \frac{8}{N} \log \frac{4\, m_{\mathcal{H}}(2N)}{\delta}$$
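
One possible way to relate ε and δ, sketched under the assumption that δ is taken to be the right-hand side of the tail bound (the lecture's argument may be organized differently):

```latex
\delta = 4\, m_{\mathcal{H}}(2N)\, e^{-\epsilon^2 N/8}
\quad\Longleftrightarrow\quad
\epsilon = \sqrt{\frac{8}{N}\,\log\frac{4\, m_{\mathcal{H}}(2N)}{\delta}}
```

With this choice, the event $|E_{\text{in}}(h) - E_{\text{out}}(h)| \le \epsilon$ is the complement of the event in the tail bound and therefore has probability at least $1 - \delta$.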

Growth Function and VC Dimension


Exercise 3

Which of the following are possible formulas for a growth function $m_{\mathcal{H}}(N)$, and why?
1. $1 + N$
2. $1 + N + N^2$
3. 1 + N + N2


4. $2^N$
5. $2^{N^2}$

Exercise 4

Prove Lemma 11 of lecture 3 (LfD03), i.e.

$$\sum_{i=0}^{k-1} \binom{N}{i} \le \begin{cases} N^{k-1} + 1 \\[4pt] \left(\dfrac{e N}{k-1}\right)^{k-1} \end{cases}$$
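
Before attempting the proof, both bounds can be checked numerically on a small grid. The sketch below assumes the range $N \ge k - 1 \ge 1$, which is an assumption made here and not stated on the sheet. The function names are illustrative. A passing check is of course no substitute for the proof.

```python
import math


def binomial_sum(n, k):
    """Left-hand side of the lemma: sum of C(n, i) for i = 0, ..., k - 1."""
    return sum(math.comb(n, i) for i in range(k))


def check_lemma_bounds(n, k):
    """Return (polynomial_ok, exponential_ok) for one pair (n, k) with k >= 2."""
    d = k - 1
    lhs = binomial_sum(n, k)
    polynomial_ok = lhs <= n**d + 1
    exponential_ok = lhs <= (math.e * n / d) ** d
    return polynomial_ok, exponential_ok


# Collect every (n, k) on the grid where at least one of the two bounds fails.
violations = [
    (n, k)
    for k in range(2, 12)
    for n in range(k - 1, 60)
    if not all(check_lemma_bounds(n, k))
]
print(violations)
```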

Exercise 5

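Let $\mathcal{X} = \mathbb{R}$ and $\mathcal{H} = \mathbb{R}$. For a fixed number $\omega \in \mathbb{R}$ a hypothesis is given by

$$h_\omega(x) = \operatorname{sign}(\sin(\omega x)).$$

Prove that for all $N \in \mathbb{N}$ the points $\{2^1, 2^2, \dots, 2^N\}$ are shattered by $\mathcal{H}$ and thus the VC dimension
is infinite.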
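
A small numerical sanity check of the shattering claim, taking the points to be $x_i = 2^i$ for $i = 1, \dots, N$. The choice $\omega = \pi\left(2^{-(N+1)} + \sum_{i=1}^{N} b_i 2^{-i}\right)$, with $b_i = 1$ exactly where the desired label is −1, is one candidate construction assumed here for illustration. It is not taken from the lecture, and verifying it for a few N does not replace the proof.

```python
import itertools
import math


def omega_for_labels(labels):
    """Candidate frequency for the labelling labels[i-1] in {+1, -1} at x_i = 2**i."""
    n = len(labels)
    bits = [(1 - y) // 2 for y in labels]  # 1 where the label is -1, else 0
    binary_fraction = sum(b * 2.0 ** -i for i, b in enumerate(bits, start=1))
    return math.pi * (2.0 ** -(n + 1) + binary_fraction)


def h(omega, x):
    """sign(sin(omega * x)); the value at exactly 0 is mapped to -1 by convention."""
    return 1 if math.sin(omega * x) > 0 else -1


def is_shattered(n):
    """Check that every one of the 2**n labellings of 2, 4, ..., 2**n is realised."""
    points = [2.0 ** i for i in range(1, n + 1)]
    for labels in itertools.product((1, -1), repeat=n):
        omega = omega_for_labels(labels)
        if tuple(h(omega, x) for x in points) != labels:
            return False
    return True


print(all(is_shattered(n) for n in range(1, 9)))
```

The check is limited to small N because the number of labellings grows as $2^N$ and because floating-point evaluation of $\sin$ loses accuracy for very large arguments.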

Exercise 6

Use lemma 14 of lecture 4, i.e.

$$m_{\mathcal{H}}(N) \le m_{\tilde{\mathcal{H}}}(N) \prod_{i=1}^{K} m_{\mathcal{H}_i}(N)$$

to estimate the VC dimension of a fully connected feed-forward network with an input layer of size
N, L hidden layers of size M, and an output layer of size 1.¹ You should assume the VC dimension
of the activation and summation function to be 1.

Exercise 7

Suppose the hypothesis set $\mathcal{H}$ does not shatter $x_1, \dots, x_N$, and $x_1, \dots, x_N$ is uniformly distributed.
Prove that:
$$P[\text{Falsification}] \ge 1 - \frac{m_{\mathcal{H}}(N)}{2^N}.$$
Hint: look at the probability of the complementary event, i.e. the probability that falsification does
not take place, and prove that this is at most $m_{\mathcal{H}}(N)/2^N$.
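
To make the bound concrete, the sketch below enumerates a toy case exactly. It rests on two assumptions that are not from the sheet: the labels of the N points are drawn uniformly from $\{-1, +1\}^N$, and "falsification" means that no hypothesis fits the labels perfectly. Positive rays on N sorted points serve as an example hypothesis set, for which $m_{\mathcal{H}}(N) = N + 1$.

```python
import itertools
from fractions import Fraction


def positive_ray_dichotomies(n):
    """All labellings of n sorted points realisable by h(x) = sign(x - a)."""
    # A threshold leaves the first t points at -1 and the remaining n - t at +1.
    return {tuple([-1] * t + [1] * (n - t)) for t in range(n + 1)}


def falsification_probability(n):
    """Exact probability that a uniformly random labelling is not realisable."""
    realizable = positive_ray_dichotomies(n)
    total = 0
    falsified = 0
    for labels in itertools.product((-1, 1), repeat=n):
        total += 1
        if labels not in realizable:
            falsified += 1
    return Fraction(falsified, total)


def bound(n):
    """Right-hand side 1 - m_H(N) / 2^N, with m_H(N) counted by enumeration."""
    growth = len(positive_ray_dichotomies(n))
    return 1 - Fraction(growth, 2 ** n)


for n in range(2, 8):
    print(n, falsification_probability(n), bound(n))
```

In this toy case the two quantities coincide, because every realisable dichotomy is counted exactly once. The range starts at N = 2 since a single point is shattered by positive rays.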

¹ This is not a sharp estimate, as we will see later.
