
Machine Learning

Assignment #2

1. (Linear Regression) Linear regression aims to learn the parameters $\vec{\theta}$ from
the training set $D = \{(\vec{x}^{(i)}, y^{(i)}),\ i = 1, 2, \ldots, m\}$ so that the hypothesis
$h_\theta(\vec{x}) = \vec{\theta}^T \vec{x}$ can predict the output $y$ given an input vector $\vec{x}$.
Please derive the least mean squares (LMS) and stochastic gradient descent update
rules; that is, use the gradient descent algorithm to update $\vec{\theta}$ so as to
minimize the least squares cost function $J(\vec{\theta})$.
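For reference, a minimal sketch of the intended derivation (assuming the conventional squared-error form of $J$, which the assignment does not restate):

$$J(\vec{\theta}) = \frac{1}{2}\sum_{i=1}^{m}\left(h_\theta(\vec{x}^{(i)}) - y^{(i)}\right)^2, \qquad \frac{\partial J(\vec{\theta})}{\partial \theta_j} = \sum_{i=1}^{m}\left(h_\theta(\vec{x}^{(i)}) - y^{(i)}\right)x_j^{(i)},$$

so the stochastic (single-example) LMS update with learning rate $\alpha$ is

$$\theta_j \leftarrow \theta_j + \alpha\left(y^{(i)} - h_\theta(\vec{x}^{(i)})\right)x_j^{(i)}.$$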

2. (Linear Regression) Please explain, from the probabilistic view, why the
least squares cost function $J(\vec{\theta})$ is a reasonable choice for learning $\vec{\theta}$.
(Hint: probabilistic assumption and maximum likelihood estimation.)
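A sketch of the argument the hint points at (assuming i.i.d. Gaussian noise, the standard modeling choice): if $y^{(i)} = \vec{\theta}^T\vec{x}^{(i)} + \epsilon^{(i)}$ with $\epsilon^{(i)} \sim \mathcal{N}(0, \sigma^2)$, the log-likelihood is

$$\ell(\vec{\theta}) = \sum_{i=1}^{m}\log p\left(y^{(i)} \mid \vec{x}^{(i)}; \vec{\theta}\right) = m\log\frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{\sigma^2}\cdot\frac{1}{2}\sum_{i=1}^{m}\left(y^{(i)} - \vec{\theta}^T\vec{x}^{(i)}\right)^2,$$

so maximizing $\ell(\vec{\theta})$ is exactly minimizing the least squares cost $J(\vec{\theta})$.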

3. (Logistic Regression) Logistic regression aims to learn the parameters $\vec{\theta}$
from the training set $D = \{(\vec{x}^{(i)}, y^{(i)}),\ i = 1, 2, \ldots, m\}$ so that the hypothesis
$h_\theta(\vec{x}) = g(\vec{\theta}^T \vec{x})$ (here $g(z)$ is the logistic or sigmoid function
$g(z) = \frac{1}{1 + e^{-z}}$) can predict the output $y \in \{0, 1\}$ given an input vector $\vec{x}$.
Please derive the stochastic gradient ascent rule for the logistic regression
learning problem.
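A sketch of the target result (assuming the standard Bernoulli likelihood $p(y \mid \vec{x}; \vec{\theta}) = h_\theta(\vec{x})^{y}\left(1 - h_\theta(\vec{x})\right)^{1-y}$): the log-likelihood is

$$\ell(\vec{\theta}) = \sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(\vec{x}^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(\vec{x}^{(i)})\right)\right],$$

and using $g'(z) = g(z)\left(1 - g(z)\right)$ the per-example gradient ascent update works out to

$$\theta_j \leftarrow \theta_j + \alpha\left(y^{(i)} - h_\theta(\vec{x}^{(i)})\right)x_j^{(i)},$$

the same form as the LMS rule but with the sigmoid hypothesis.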

4. (Linear Regression) Manually train a linear function based on the following
training instances (either the batch or the stochastic gradient descent algorithm is
fine). The initial values of the parameters are $\theta_0 = 0$, $\theta_1 = 0$, $\theta_2 = 0$. The learning rate
$\alpha$ is 0.5. Please update each parameter at least five times. A code sketch for
cross-checking the updates follows the table.

$x_1$   $x_2$   $y$
  0       0      2
  0       1      3
  1       0      3
  1       1      4
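
To cross-check the hand computation, here is a minimal Python sketch of the stochastic variant (a note from working the numbers: batch updates summed over all four examples diverge at $\alpha = 0.5$, so the per-example form is shown; the assignment permits either):

```python
# Minimal sketch of stochastic gradient descent for Problem 4.
# Hypothesis: h(x) = theta0 + theta1*x1 + theta2*x2, with x0 = 1 implicit.

data = [  # (x1, x2, y) from the table above
    (0, 0, 2),
    (0, 1, 3),
    (1, 0, 3),
    (1, 1, 4),
]

theta = [0.0, 0.0, 0.0]  # initial theta0, theta1, theta2
alpha = 0.5              # learning rate from the problem statement

for epoch in range(3):   # 3 passes over 4 examples = 12 updates (>= 5)
    for x1, x2, y in data:
        h = theta[0] + theta[1] * x1 + theta[2] * x2   # prediction
        err = y - h                                    # residual
        # LMS update: theta_j <- theta_j + alpha * err * x_j
        theta[0] += alpha * err * 1
        theta[1] += alpha * err * x1
        theta[2] += alpha * err * x2
    print(f"after epoch {epoch + 1}: theta = {theta}")
```

Run as-is, the parameters head toward $\theta_0 = 2$, $\theta_1 = 1$, $\theta_2 = 1$, which fits all four training instances exactly.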
