Theoretical Exercises
January 25th, 2023
Exercise 1: Bayesian Classifier
a) What is the difference between discriminative and generative
modeling?
Paul Stöwer, Siming Bayer | Pattern Recognition Exercises January 25th, 2023 1
• Generative: models the class-conditional density p(x|y) together with the prior p(y), i.e. the joint p(x, y); the posterior p(y|x) then follows via Bayes' rule.
• Discriminative: models the posterior p(y|x) (or the decision boundary) directly, without modeling how the data x is generated.
b) What is the decision rule of the Bayesian classifier?
Bayes' rule:

p(y|x) = p(y) · p(x|y) / p(x)
b)
Maximizing the posterior and dropping the evidence p(x), which does not depend on y, gives the decision rule:

ŷ = argmax_y p(y|x)
  = argmax_y p(y) · p(x|y) / p(x)
  = argmax_y p(y) · p(x|y)
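As a concrete illustration, the rule ŷ = argmax_y p(y) · p(x|y) can be sketched in Python; the two 1-D Gaussian class-conditional densities and the priors below are invented purely for the example.

```python
import math

# Invented 1-D example: two classes with Gaussian class-conditional
# densities p(x|y) = N(x; mu_y, sigma_y^2) and priors p(y).
priors = {0: 0.7, 1: 0.3}
params = {0: (0.0, 1.0), 1: (2.0, 1.0)}  # (mean, std) per class

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def bayes_decision(x):
    # argmax_y p(y) * p(x|y); the evidence p(x) is dropped because it
    # does not depend on y and therefore cannot change the argmax.
    return max(priors, key=lambda y: priors[y] * gaussian_pdf(x, *params[y]))
```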
c) Simplify the decision rule if there is no prior knowledge about the occurrence of the classes available.
Without prior knowledge, all classes are assumed equally likely; p(y) is then constant and cancels from the argmax:

ŷ = argmax_y p(x|y)

i.e. the Bayesian classifier reduces to a maximum likelihood classifier.
d) Show the optimality of the Bayesian classifier for the (0, 1)-loss function.
l(y1, y2) is the loss incurred if a feature vector belonging to class y2 is assigned to class y1. The (0, 1)-loss function is defined by

l(y1, y2) = 0, if y1 = y2
            1, otherwise
d)
The expected loss of deciding for class y given x is

R(y|x) = ∑_{y′} l(y, y′) · p(y′|x) = 1 − p(y|x),

which is minimized by choosing the class with maximal posterior p(y|x). Hence the classifier that is optimal with respect to the (0, 1)-loss function applies the Bayesian decision rule; this classifier is called the Bayesian classifier.
Exercise 2: Discriminant Analysis
a) Write down the objective function for PCA.
ΣΦᵀ = λ′Φᵀ

Σ = (1/m) ∑ᵢ (xᵢ − µ)(xᵢ − µ)ᵀ
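The PCA eigenvalue problem can be sketched directly in numpy; the data below is random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # m = 100 samples, d = 3 (illustrative data)

# Sigma = (1/m) sum_i (x_i - mu)(x_i - mu)^T
mu = X.mean(axis=0)
Sigma = (X - mu).T @ (X - mu) / X.shape[0]

# Solve the eigenvalue problem Sigma phi = lambda phi; np.linalg.eigh
# returns eigenvalues in ascending order for the symmetric matrix Sigma.
eigvals, eigvecs = np.linalg.eigh(Sigma)
Phi = eigvecs[:, ::-1].T  # rows = principal axes, largest variance first
```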
b) Write down the objective function for LDA.
Φ∗ = argmax_Φ (1/K) ∑_{y=1}^{K} (Φµ′_y − Φµ̄′)ᵀ(Φµ′_y − Φµ̄′) + ∑_{i=1}^{L} λᵢ(‖Φᵢ‖₂² − 1)

This leads to the eigenvalue problem

Σinter Φᵀ = λ′Φᵀ

Σinter = (1/m) ∑ᵢ (xᵢ − µ_{yᵢ})(xᵢ − µ_{yᵢ})ᵀ
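A minimal numpy sketch of Σinter and the associated eigenvalue problem; the two-class 2-D data below is invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
# Invented labelled data: two classes (K = 2) in two dimensions.
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
               rng.normal(3.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Sigma_inter = (1/m) sum_i (x_i - mu_{y_i})(x_i - mu_{y_i})^T:
# every sample is centred on the mean of its own class.
class_means = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
centred = X - np.array([class_means[c] for c in y])
Sigma_inter = centred.T @ centred / X.shape[0]

# The projection Phi follows from the eigenvalue problem
# Sigma_inter Phi^T = lambda' Phi^T.
eigvals, eigvecs = np.linalg.eigh(Sigma_inter)
```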
c) Describe the differences between PCA and LDA.
• PCA does not require a classified (labelled) set of feature vectors, unlike LDA.
• PCA-transformed features are approximately normally distributed (central limit theorem).
• PCA uses the covariance matrix, whereas LDA uses the inter-class covariance matrix to solve the respective eigenvalue/eigenvector problem.
Exercise 3: Gaussian Mixture Model and EM
a) Write down the general form of a Gaussian mixture model.

p(x) = ∑_{k=1}^{K} p_k · N(x; µ_k, Σ_k)
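A small sketch of this mixture density, written for the 1-D case (all mixture parameters below are invented for illustration):

```python
import numpy as np

def gmm_density(x, weights, means, variances):
    # p(x) = sum_k p_k * N(x; mu_k, sigma_k^2), written for the 1-D case
    weights, means, variances = map(np.asarray, (weights, means, variances))
    comps = (np.exp(-0.5 * (x - means) ** 2 / variances)
             / np.sqrt(2.0 * np.pi * variances))
    return float(np.sum(weights * comps))
```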
b) Which parameters of the GMM can be estimated using the EM
algorithm?
µ_k — the K means
Σ_k — the K covariance matrices of size d × d
p_k — the fraction of all features in mixture component k

Additional estimates:
p(k|xᵢ) ≡ p_ik — the K membership probabilities for each of the m feature vectors xᵢ
c) How do you initialize the EM algorithm?
Use k-means to find an initial guess for µ_k^(0).
Compute p_k^(0) and Σ_k^(0) based on the K clusters.
d) Describe the basic steps of the EM algorithm for GMMs.
E-step: using the current parameter estimates, compute the membership probabilities (responsibilities)

p_ik = p_k · N(xᵢ; µ_k, Σ_k) / ∑_j p_j · N(xᵢ; µ_j, Σ_j)

M-step: using the current responsibilities, re-estimate the parameters

p_k = (1/m) ∑ᵢ p_ik
µ_k = ∑ᵢ p_ik xᵢ / ∑ᵢ p_ik
Σ_k = ∑ᵢ p_ik (xᵢ − µ_k)(xᵢ − µ_k)ᵀ / ∑ᵢ p_ik

Both steps are alternated until convergence.
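The two steps can be sketched for a 1-D GMM as follows; to keep the sketch self-contained, the means are initialized by spreading them over the data range rather than by k-means.

```python
import numpy as np

def em_gmm_1d(x, K, n_iter=50):
    # Initialization: means spread over the data range (instead of k-means,
    # purely to keep this sketch self-contained).
    mu = np.linspace(x.min(), x.max(), K)
    var = np.full(K, x.var())
    pk = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities p_ik proportional to p_k * N(x_i; mu_k, var_k)
        dens = (np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                / np.sqrt(2.0 * np.pi * var))
        resp = pk * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate p_k, mu_k, var_k from the responsibilities
        Nk = resp.sum(axis=0)
        pk = Nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / Nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
    return pk, mu, var
```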
Exercise 4: Kernel PCA
a) Describe the key idea of Kernel PCA.
The feature vectors are mapped into a (typically higher-dimensional) space by a non-linear transform, and PCA is performed in that space. Since all computations can be written in terms of inner products, the kernel trick allows this without ever computing the mapping explicitly.
b) Explain the kernel trick and give the corresponding equation.
Σeᵢ = λᵢeᵢ

where:

Σ = (1/m) ∑_{i=1}^{m} xᵢxᵢᵀ ∈ ℝ^{d×d}

eᵢ = ∑_k α_{i,k} x_k
b)
Σeᵢ = λᵢeᵢ

((1/m) ∑_{j=1}^{m} xⱼxⱼᵀ) · ∑_k α_{i,k} x_k = λᵢ ∑_k α_{i,k} x_k

∑_{j,k} α_{i,k} xⱼ xⱼᵀ x_k = m · λᵢ ∑_k α_{i,k} x_k

These equations have to be fulfilled for all projections onto x_l, for all indices l. All feature vectors then show up in terms of inner products, so the kernel trick can be applied:

∑_{j,k} α_{i,k} k(x_l, xⱼ) · k(xⱼ, x_k) = m · λᵢ ∑_k α_{i,k} k(x_l, x_k)
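A numpy sketch of the resulting computation: build a kernel matrix and solve its eigenvalue problem for the coefficients α. The RBF kernel and the data are assumptions for the example, and the explicit centring step goes slightly beyond the derivation above, which assumes centred features.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2), one common kernel choice
    sq = np.sum(X ** 2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 2))  # invented data, m = 30

K = rbf_kernel(X)
# Centre the kernel matrix in feature space (the derivation assumes
# centred feature vectors).
m = K.shape[0]
one = np.full((m, m), 1.0 / m)
Kc = K - one @ K - K @ one + one @ K @ one

# Kc alpha_i = m * lambda_i * alpha_i: the eigenvectors of Kc hold the
# coefficients alpha_{i,k} of the implicit eigenvectors e_i.
eigvals, alphas = np.linalg.eigh(Kc)
```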
Exercise 5: Maximum Likelihood Estimation
a) Write down the log-likelihood function to estimate the
parameters µ and Σ of a Gaussian probability density
N (x ; µ, Σ) from training data x1 . . . xm .
L(x₁ … x_m; µ, Σ) = ∑_{i=1}^{m} log N(xᵢ; µ, Σ)
                  = ∑_{i=1}^{m} [ −(1/2) log(|2πΣ|) − (1/2)(xᵢ − µ)ᵀ Σ⁻¹ (xᵢ − µ) ]
b) Write down the ML estimators for µ and Σ.

µ = (1/m) ∑_{i=1}^{m} xᵢ

Σ = (1/m) ∑_{i=1}^{m} (xᵢ − µ)(xᵢ − µ)ᵀ
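These estimators are one-liners in numpy; the ground-truth parameters below are invented to generate sample data.

```python
import numpy as np

rng = np.random.default_rng(4)
true_mu = np.array([1.0, -2.0])
true_Sigma = np.array([[2.0, 0.3], [0.3, 0.5]])
X = rng.multivariate_normal(true_mu, true_Sigma, size=5000)

# ML estimators: the sample mean and the (biased, 1/m) sample covariance.
mu_hat = X.mean(axis=0)
Sigma_hat = (X - mu_hat).T @ (X - mu_hat) / X.shape[0]
```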
Exercise 6: Naive Bayes
a) Which independence assumption is used for naive Bayes?
The features are assumed to be conditionally independent given the class:

p(x|y) = ∏_{i=1}^{d} p(xᵢ|y)
b) What is the decision rule of naive Bayes?

ŷ = argmax_y p(y) · ∏_{i=1}^{d} p(xᵢ|y)
c) What is the structure of the covariance matrix of normal-distributed classes in naive Bayes?
Due to the independence assumption, the features are uncorrelated within each class, so every class covariance matrix is diagonal.
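A Gaussian naive Bayes sketch tying a)–c) together: per class, each feature gets its own mean and variance, which is exactly a diagonal covariance matrix. Function names and data are assumptions for illustration.

```python
import numpy as np

def fit_gnb(X, y):
    # Per class: prior p(y), per-feature means and per-feature variances
    # (the variances form the diagonal of the class covariance matrix).
    classes = np.unique(y)
    return {c: (np.mean(y == c),
                X[y == c].mean(axis=0),
                X[y == c].var(axis=0))
            for c in classes}

def predict_gnb(model, x):
    def log_post(c):
        prior, mu, var = model[c]
        # log p(y) + sum_i log p(x_i|y) under the independence assumption
        return np.log(prior) + np.sum(-0.5 * np.log(2.0 * np.pi * var)
                                      - 0.5 * (x - mu) ** 2 / var)
    return max(model, key=log_post)
```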
Exercise 7: Sigmoid Function
a) Write down the Sigmoid function g (x ).
g(x) = 1 / (1 + e⁻ˣ), with x ∈ ℝ.
b) Write down the posteriors for a two class problem (y = ±1) for
a given decision boundary F (x) in terms of a logistic function.
p(y|x) = 1 / (1 + e^{y·F(x)})
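A quick sanity check of this posterior: the probabilities for y = +1 and y = −1 sum to 1 for any value of F(x).

```python
import math

def posterior(y, F_x):
    # p(y|x) = 1 / (1 + e^{y * F(x)}) for y in {-1, +1}, using the slide's
    # sign convention for the decision boundary F(x).
    return 1.0 / (1.0 + math.exp(y * F_x))
```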
Exercise 8: Support Vector Machine
a) Write down the objective function for Rosenblatt’s Perceptron.
ŷ = sgn(αᵀx + α₀)

minimize D(α₀, α) = − ∑_{xᵢ∈M} yᵢ · (αᵀxᵢ + α₀)

where M denotes the set of misclassified feature vectors.
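The objective above is minimized by the classic perceptron update, which moves (α, α₀) along the negative gradient for each misclassified sample; a minimal sketch (training data in the test is invented and linearly separable):

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, n_epochs=100):
    # Rosenblatt's update: for each misclassified x_i (i.e. x_i in M),
    # step along the gradient of -y_i * (alpha^T x_i + alpha_0).
    alpha = np.zeros(X.shape[1])
    alpha0 = 0.0
    for _ in range(n_epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (alpha @ xi + alpha0) <= 0:  # x_i is misclassified
                alpha += lr * yi * xi
                alpha0 += lr * yi
                errors += 1
        if errors == 0:  # converged: all samples correctly classified
            break
    return alpha, alpha0
```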
b) Write down the optimization problem for SVM.
In the SVM, the margin between the following two hyperplanes has to be maximized:

αᵀxᵢ + α₀ ≤ −1, if yᵢ = −1
αᵀxᵢ + α₀ ≥ +1, if yᵢ = +1

Maximizing the margin 2/‖α‖₂ is equivalent to:

minimize (1/2)‖α‖₂²
subject to yᵢ · (αᵀxᵢ + α₀) − 1 ≥ 0 ∀i
c) Explain the difference between Rosenblatt's Perceptron and SVM.
The perceptron stops at any separating hyperplane; its solution depends on the initialization and on the order of the training samples. The SVM instead finds the unique separating hyperplane that maximizes the margin to the closest feature vectors (the support vectors), which tends to generalize better.