Problem 1

Consider the function
$$f(x_1, x_2) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2.$$

(1) Program a steepest gradient descent algorithm with backtracking to find minimizers of $f$. What are your $c_1$ and initial values of $x$ and $\alpha$? Let's call your total number of iterations $\mathrm{Iter}_{GD}$. Print your solutions and the step lengths at the 1st, $\mathrm{Iter}_{GD}/10$-th, $(2 \cdot \mathrm{Iter}_{GD})/10$-th, ..., $\mathrm{Iter}_{GD}$-th iteration. (A hedged sketch of such a backtracking loop appears after Problem 3.)

(2) Program Newton's algorithm with backtracking search for the step length to find minimizers of $f$. What are your $c_1$ and initial values of $x$ and $\alpha$? Print your solutions and the step lengths at the 1st, $\mathrm{Iter}_N/10$-th, $(2 \cdot \mathrm{Iter}_N)/10$-th, ..., $\mathrm{Iter}_N$-th iteration.

Problem 2

(1) Suppose $Q$ is a symmetric and positive definite matrix, $r$ is a vector of suitable dimension, and $d$ is a scalar. Show that $f(w) = w^T Q w + r^T w + d$ always defines a convex function. In particular, $f(w) = \|w\|^2$ is a convex function.

(2) Adding a simple convex function can help convexify a possibly non-convex function. In Newton's method for minimization, adding an $\ell_2$-regularization term to the 2nd-order Taylor series approximation gives
$$f(x_k + p) \approx f(x_k) + p^T \nabla f(x_k) + \frac{1}{2}\, p^T \nabla^2 f(x_k)\, p + \lambda \|p\|^2, \qquad \lambda > 0.$$
Show that the first-order condition with respect to $p$ leads to the following adjusted Newton's system:
$$\left( \nabla^2 f(x_k) + 2\lambda I \right) p_k = -\nabla f(x_k).$$
(A sketch of this computation appears after Problem 3.)

(3) If $\nabla^2 f(x_k)$ is not invertible, discuss how you would choose $\lambda$ to make $\nabla^2 f(x_k) + 2\lambda I$ invertible. (Hint: a symmetric matrix is invertible when all its eigenvalues are positive.) (A sketch of one such choice appears after Problem 3.)

Problem 3

This question concerns the optimization for logistic regression. Consider a dataset consisting of $N$ points $\{(x_i, y_i)\}_{i=1}^N$, where $x_i$ is a $p$-dimensional random vector and $0 \le y_i \le 1$. A logistic function is defined as
$$\theta(t) = \frac{1}{1 + \exp(-t)}.$$
This function is useful for modeling a response variable that takes values between 0 and 1; i.e., one may model $y_i \approx \theta(\alpha + x_i^T \beta)$ for some parameters $\alpha$ and $\beta$ (a $p$-dimensional vector). To estimate $\alpha$ and $\beta$, one can minimize the least squares criterion
$$\mathrm{RSS}(\alpha, \beta) = \sum_{i=1}^N \left( \theta(\alpha + x_i^T \beta) - y_i \right)^2.$$

(1) Plot the logistic function.

(2) Derive the gradient of the above objective function.

(3) We apply logistic regression to the dataset homework2.csv to find the minimizer $(\alpha^*, \beta^*)$. We first examine the surface of our RSS function over a grid of $(\alpha, \beta)$ values; a sketch of such plotting code appears after this problem.

(4) Now use the gradient descent method to solve for the minimizer. Try two different starting values for $(\alpha, \beta)$: (i) $(-2, -2)$; (ii) $(-2, 2)$. For each case, print your final solution and plot some (selected) iterates on the contour plot using points( , , cex=0.3, col="red") (i.e., add the points whose coordinates are given by the first two arguments). Do these two starting points yield the same minimizer? Are they close to your guess? Why or why not? (A descent-loop sketch appears after this problem.)

(5) Now we illustrate the utility of convexifying the objective function. One may consider $\ell_2$-regularized logistic regression, whose objective function becomes
$$\mathrm{RSS}(\alpha, \beta) = \sum_{i=1}^N \left( \theta(\alpha + x_i^T \beta) - y_i \right)^2 + \lambda \|\beta\|^2.$$

(6) In this part, derive the gradient of the above objective function. Code up the steepest gradient descent for this $\ell_2$-regularized logistic regression problem and apply it to the dataset (use the same fixed step length $10^{-2}$). For $\lambda$, experiment with some small numbers, e.g., $10^{-1}$, $10^{-2}$, etc. Try the same two starting values for $(\alpha, \beta)$: (i) $(-2, -2)$; (ii) $(-2, 2)$. For each case, print your final solution and plot some (selected) iterates on the contour plot using points( , , cex=0.3, col="red") (i.e., add the points whose coordinates are given by the first two arguments). Do these two starting points yield the same minimizer? Are they close to your guess? Why or why not?
To help explain the outcome, you may want to plot the contour and the 3D plot of the $\ell_2$-regularized logistic objective function as in part (3).
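A minimal sketch for Problem 1(1), assuming an Armijo backtracking rule with contraction factor $\rho$. The specific choices $c_1 = 10^{-4}$, $\rho = 0.5$, and starting point $(-1.2, 1)$ are illustrative assumptions, since the problem asks you to pick your own.

```r
# Hedged sketch: steepest descent with Armijo backtracking on
# f(x1, x2) = 100*(x2 - x1^2)^2 + (1 - x1)^2.
f  <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
gf <- function(x) c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
                     200 * (x[2] - x[1]^2))

backtrack <- function(x, p, a0 = 1, rho = 0.5, c1 = 1e-4) {
  a <- a0
  # shrink a until the sufficient-decrease (Armijo) condition holds
  while (f(x + a * p) > f(x) + c1 * a * sum(gf(x) * p)) a <- rho * a
  a
}

x <- c(-1.2, 1)                        # assumed starting point
for (k in 1:100000) {
  p <- -gf(x)                          # steepest descent direction
  a <- backtrack(x, p)
  x <- x + a * p
  if (sqrt(sum(gf(x)^2)) < 1e-6) break # gradient-norm stopping rule
}
print(x)                               # should approach the minimizer (1, 1)
```

For Problem 1(2), the same loop works once the direction p <- -gf(x) is replaced by the Newton direction p <- -solve(hf(x), gf(x)), where hf(x) returns the $2 \times 2$ Hessian of $f$.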
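For Problem 2(2), the computation behind the adjusted system is one line of matrix calculus; a sketch, using $\nabla_p (p^T A p) = 2Ap$ for symmetric $A$ and $\nabla_p \|p\|^2 = 2p$:

```latex
\nabla_p \left[ f(x_k) + p^T \nabla f(x_k)
              + \tfrac{1}{2}\, p^T \nabla^2 f(x_k)\, p + \lambda \|p\|^2 \right]
  = \nabla f(x_k) + \nabla^2 f(x_k)\, p + 2\lambda p = 0
\;\Longrightarrow\;
\left( \nabla^2 f(x_k) + 2\lambda I \right) p_k = -\nabla f(x_k).
```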
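For Problem 2(3), the hint suggests shifting the spectrum: if $\lambda_{\min}$ is the smallest eigenvalue of $\nabla^2 f(x_k)$, any $\lambda$ with $2\lambda > -\lambda_{\min}$ makes every eigenvalue of $\nabla^2 f(x_k) + 2\lambda I$ positive. A minimal sketch; the example matrix and the $10^{-6}$ margin are assumptions:

```r
# If H (the Hessian at x_k) is symmetric but not positive definite, choose
# lambda so every eigenvalue of H + 2*lambda*I is positive, hence invertible.
H <- matrix(c(1, 2, 2, 1), 2, 2)       # example symmetric matrix; eigenvalues 3, -1
lam_min <- min(eigen(H, symmetric = TRUE)$values)
lambda  <- if (lam_min <= 0) (-lam_min + 1e-6) / 2 else 0  # assumed margin 1e-6
H_adj   <- H + 2 * lambda * diag(nrow(H))
min(eigen(H_adj, symmetric = TRUE)$values) > 0  # TRUE: H_adj is positive definite
```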
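For Problem 3(2)-(3), a sketch of the gradient and the surface plot. It uses $\theta'(t) = \theta(t)(1 - \theta(t))$ and assumes $p = 1$ (so the contour over $(\alpha, \beta)$ is two-dimensional) with columns named x and y in homework2.csv; the column names and the plotting window are assumptions about a file not shown here.

```r
dat   <- read.csv("homework2.csv")     # assumed columns: x (predictor), y (response)
theta <- function(t) 1 / (1 + exp(-t))

rss <- function(a, b) sum((theta(a + b * dat$x) - dat$y)^2)

# Gradient of RSS, via theta'(t) = theta(t) * (1 - theta(t)):
#   d/d alpha = sum 2*(theta_i - y_i)*theta_i*(1 - theta_i)
#   d/d beta  = sum 2*(theta_i - y_i)*theta_i*(1 - theta_i)*x_i
grad_rss <- function(a, b) {
  th <- theta(a + b * dat$x)
  r  <- 2 * (th - dat$y) * th * (1 - th)
  c(sum(r), sum(r * dat$x))
}

a_grid <- seq(-3, 3, length.out = 60)  # assumed plotting window
b_grid <- seq(-3, 3, length.out = 60)
z <- outer(a_grid, b_grid, Vectorize(rss))
persp(a_grid, b_grid, z, theta = 30, phi = 30,
      xlab = "alpha", ylab = "beta", zlab = "RSS")  # 3D surface
contour(a_grid, b_grid, z, xlab = "alpha", ylab = "beta")
```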
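For Problem 3(4) and (6), a fixed-step descent sketch reusing grad_rss, a_grid, b_grid, and z from the previous sketch; setting lambda = 0 recovers the unpenalized part (4). The iteration count and the thinning stride for the plotted iterates are assumptions.

```r
grad_desc <- function(start, step = 1e-2, lambda = 0, iters = 20000) {
  ab   <- start                        # c(alpha, beta)
  path <- matrix(NA_real_, iters, 2)
  for (k in 1:iters) {
    # gradient of RSS + lambda * beta^2; the penalty does not touch alpha
    g  <- grad_rss(ab[1], ab[2]) + c(0, 2 * lambda * ab[2])
    ab <- ab - step * g
    path[k, ] <- ab
  }
  list(sol = ab, path = path)
}

out <- grad_desc(c(-2, -2))            # try also c(-2, 2), and lambda = 1e-1, 1e-2
print(out$sol)
contour(a_grid, b_grid, z, xlab = "alpha", ylab = "beta")
keep <- seq(1, nrow(out$path), by = 500)   # thin the path before plotting
points(out$path[keep, 1], out$path[keep, 2], cex = 0.3, col = "red")
```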
