
SYS 6003 - Midterm Exam

Due: Thursday, 10/11/2012 by 1:45 PM

INSTRUCTIONS:
You have roughly 48 hours to complete this exam. Please bring a hard copy of your
solutions to Prof. Cogill's office (Olsson 111B) by 1:45 PM on Thursday, October 11.
You may use any books, notes, or software (such as Matlab or Octave) that you
want. However, you may not discuss this exam with anyone until 10/12,
after everyone has taken the exam. The only exception is that you may contact Prof.
Cogill via email during the exam if a problem statement is unclear. Discussing this
exam during the exam period will be considered an honor code violation.
Please check your email a few times during the exam. It is likely that we will send out
clarifications.
You should attempt to solve every problem. When assigning partial credit, a clear, correctly reasoned incomplete solution is generally preferable to an incoherent, incorrect
complete solution.
Please write and sign the Honor Pledge on your exam.
Good luck!

HONOR PLEDGE:
On my honor as a student I have neither given nor received aid on this exam.

1. Convex functions
Show that the following functions are convex:

(a) [10 pts.] $f(x) = x^T x$ with domain $\mathbb{R}^n$.

(b) [10 pts.] $f(x) = \exp(a^T x)$ with domain $\mathbb{R}^n$.

(c) [10 pts.] $f(x) = c^T x - \log(b - a^T x)$ with domain $\{x \in \mathbb{R}^n \mid a^T x < b\}$.

(d) [10 pts.] $f(x) = \ln\left(\sum_{i=1}^{n} \exp(x_i)\right)$ with domain $\mathbb{R}^n$.
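
A numerical spot check can help catch a mistake before writing a proof, although it never replaces one. The sketch below tests the defining inequality $f(tx + (1-t)y) \le t f(x) + (1-t) f(y)$ at random points for the function in part (d). The helper names, the sampling range, and the tolerance are assumptions chosen for illustration; they are not part of the exam.

```python
import math
import random

def log_sum_exp(x):
    """f(x) = ln(sum_i exp(x_i)), shifted by max(x) for numerical stability."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))

def violates_convexity(f, x, y, t, tol=1e-9):
    """True if f(t*x + (1-t)*y) exceeds t*f(x) + (1-t)*f(y) by more than tol."""
    z = [t * xi + (1 - t) * yi for xi, yi in zip(x, y)]
    return f(z) > t * f(x) + (1 - t) * f(y) + tol

def count_violations(f, dim=4, trials=2000, seed=0):
    """Count random (x, y, t) triples at which the convexity inequality fails."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(dim)]  # hypothetical sampling box
        y = [rng.uniform(-5, 5) for _ in range(dim)]
        t = rng.random()
        if violates_convexity(f, x, y, t):
            bad += 1
    return bad

violations = count_violations(log_sum_exp)
```

Finding zero violations is only evidence, not a proof; a single violation, on the other hand, would show that a function is not convex.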

2. Maximum likelihood estimation


Maximum likelihood estimation is a method for estimating the parameters of a statistical model from data. The method of maximum likelihood assumes that data were
generated from a given statistical model. Then, given this model and the observed
data, the method seeks the parameters that best explain the data in a certain sense.
Specifically, suppose that $x_1, \dots, x_n$ are a set of $n$ observed values. We might assume
that these values were generated independently, each according to the probability distribution $h(x \mid \theta)$, where $\theta$ is a vector of unknown parameters. Maximum likelihood
seeks the $\theta$ that maximizes

$$\prod_{i=1}^{n} h(x_i \mid \theta).$$

Equivalently, we can search for the $\theta$ that minimizes

$$f(\theta) = -\sum_{i=1}^{n} \ln\left(h(x_i \mid \theta)\right).$$

In this problem you will derive maximum likelihood estimates for several common
classes of underlying models. The functions that you are to minimize in each part are
convex. You do not need to prove their convexity.
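
The equivalence between maximizing the product of densities and minimizing the negative sum of their logarithms can be illustrated numerically. The sketch below assumes an exponential density $h(x \mid \theta) = \theta e^{-\theta x}$, uses made-up data, and searches a coarse grid of candidate values; the function names, the data, and the grid are all hypothetical. It is a sanity check only, not the derivation the problem asks for.

```python
import math

def likelihood(theta, xs):
    """Product of the assumed exponential densities theta * exp(-theta * x)."""
    p = 1.0
    for x in xs:
        p *= theta * math.exp(-theta * x)
    return p

def neg_log_likelihood(theta, xs):
    """Negative sum of log densities: -(ln(theta) - theta * x) summed over the data."""
    return -sum(math.log(theta) - theta * x for x in xs)

xs = [0.5, 1.2, 0.3, 2.0, 0.9]             # hypothetical observations
grid = [0.01 * j for j in range(1, 1001)]   # candidate theta values in (0, 10]

# Both criteria should select the same grid point, because ln is increasing.
theta_max = max(grid, key=lambda t: likelihood(t, xs))
theta_min = min(grid, key=lambda t: neg_log_likelihood(t, xs))
```

The logarithmic form is usually preferred in practice because the raw product underflows quickly as $n$ grows, while the sum stays well scaled.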

(a) [10 pts.] Exponential distribution: Find the scalar $\theta$ that minimizes

$$f(\theta) = \theta \left( \sum_{i=1}^{n} x_i \right) - n \ln(\theta)$$

(b) [10 pts.] Normal distribution with unit variance: Find the scalar $\theta$ that minimizes

$$f(\theta) = \frac{1}{2} \sum_{i=1}^{n} (x_i - \theta)^2$$

(c) [10 pts.] Binomial distribution: Find the scalar $\theta$ that minimizes

$$f(\theta) = -\sum_{i=1}^{n} \left[ (m - x_i) \ln(1 - \theta) + x_i \ln(\theta) \right]$$

3. Least squares with a prior


In class, we saw that least squares provides a general approach for estimating the
values of unknown parameters that are linearly related to observations. Occasionally,
we want to incorporate additional constraints or prior knowledge on the values of the
unknown parameters. Here, you will solve an optimization problem that results from
such an extension of least squares.
In this problem you will minimize the objective function

$$f(x) = \sum_{i=1}^{n} \left( \frac{1}{M} x_i - (k - 1) \ln(x_i) \right) + \frac{1}{2\sigma^2} (Ax - b)^T (Ax - b),$$

where the domain of this function is the set of positive vectors in $\mathbb{R}^n$ (that is, the set of
vectors with $x_i > 0$ for all $i \in \{1, \dots, n\}$). The values $M$, $k$, and $\sigma$ are given constants
that appear in a probabilistic model that gives rise to this objective function. [1]

(a) [10 pts.] Find the gradient of f .

(b) [10 pts.] Find the Hessian of f .

(c) [10 pts.] You are given an instance of this problem in the Matlab file midterm_p3.mat.
This file contains the matrix A, the vector b, and the scalar parameters k, M, and
sig. Use Newton's method to find the minimizing x. Print and turn in the value
of the minimizing x you found, as well as the Matlab code that you wrote to compute
it.
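
As a reminder of the iteration itself, here is a minimal one-dimensional sketch of Newton's method on a toy function, $g(x) = x - \ln(x)$ with domain $x > 0$, whose minimizer is $x = 1$. It is written in Python rather than Matlab, and the function name `newton_1d`, the toy objective, and the step-halving rule are assumptions made for illustration. It does not solve part (c), which needs the gradient and Hessian from parts (a) and (b) and a linear solve at each step.

```python
def newton_1d(grad, hess, x0, in_domain, tol=1e-10, max_iter=50):
    """Scalar Newton iteration; the step is halved until the new point is in the domain."""
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:          # stop once the derivative is numerically zero
            break
        step = -g / hess(x)       # Newton direction (hess(x) > 0 for a convex g)
        t = 1.0
        while not in_domain(x + t * step):
            t *= 0.5              # shrink the step to stay inside the domain
        x = x + t * step
    return x

# Toy problem: g(x) = x - ln(x), so g'(x) = 1 - 1/x and g''(x) = 1/x**2.
x_star = newton_1d(
    grad=lambda x: 1.0 - 1.0 / x,
    hess=lambda x: 1.0 / x ** 2,
    x0=3.0,                       # a full Newton step from here would leave x > 0
    in_domain=lambda x: x > 0,
)
```

The halving here only keeps the iterate feasible. A more careful implementation would also backtrack until the objective value decreases, which matters when the starting point is far from the minimizer.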

[1] Specifically, the function $f$ is obtained from a probabilistic model that assumes our initial uncertainty
in each of the $x_i$ can be characterized by a gamma distribution.