Subgradient and Bundle Methods
Harsh Pareek
May 5, 2010
Abstract
Minimizing a convex function over a convex region is an important problem in Nonlinear Programming. Several methods have been proposed and studied for the optimization of differentiable functions. We introduce a number of methods for the optimization of non-smooth functions, all based on the notion of subgradients. Among these, we consider subgradient and bundle methods in detail.
1 Introduction
A function is said to be smooth if it is differentiable and its derivatives are continuous. Many methods exist for the optimization of smooth convex functions (notably steepest descent methods, Newton and quasi-Newton methods, and interior point methods); [2] and [3] detail these methods and their implementation. One can ask why convex problems are given such importance. To rephrase a quote by R. T. Rockafellar,
The great divide in optimization is not between linear and nonlinear problems, but between convex and nonconvex problems.
1.1 Motivation
Many naturally occurring problems are nonsmooth. Some common examples are:

- Hinge loss: $f(x) = \max(0, 1 - x)$ (a short numerical check follows this list)
- Piecewise linear functions
- For a problem
  $$ \min \; f(x) \quad \text{s.t.} \quad g(x) \le 0, $$
  an equivalent problem is
  $$ \min \; t \quad \text{s.t.} \quad f(x) \le t, \;\; g(x) \le 0. $$
  This objective is always smooth and convex, but the problem is inherently still nonsmooth. (In some sense, the nonsmoothness has been shifted to the constraints.)
- A function approximating a nonsmooth function may be analytically smooth but “numerically nonsmooth”. For example, it may show similar oscillatory behaviour under iterative algorithms such as gradient descent.
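To make the nonsmoothness of the hinge loss above concrete, here is a small numerical sketch (Python; the function name `hinge` and the step sizes are choices made here, not taken from the report). The one-sided difference quotients at the kink $x = 1$ converge to different values, so no derivative exists there:

```python
def hinge(x):
    """Hinge loss f(x) = max(0, 1 - x); nonsmooth at x = 1."""
    return max(0.0, 1.0 - x)

x = 1.0
for t in (1e-2, 1e-4, 1e-6):
    right = (hinge(x + t) - hinge(x)) / t    # -> 0  (flat branch)
    left = (hinge(x - t) - hinge(x)) / (-t)  # -> -1 (linear branch)
    print(f"t={t:g}  right derivative ~ {right:+.4f}  left derivative ~ {left:+.4f}")
```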
1.2 Approaches to Solutions
Nonsmooth problems have arisen in many fields and have been solved by various methods; [5] covers a number of such methods for certain classes of optimization problems. Some general approaches are:
- Approximate by a smooth function or a sequence of smooth functions.
- Reformulate the problem, adding more variables/constraints, such that the problem becomes smooth.
- Subgradient Methods: These proceed like gradient descent, except that they use subgradients instead of gradients and make appropriate modifications (a minimal iteration sketch follows this list).
- Cutting Plane Methods: Lower bound the function by a piecewise linear function and use it to iteratively find the minimum.
- Moreau-Yosida Regularization.
- Bundle Methods: Combine the above two methods.
- VU-decomposition: Decompose the function to facilitate optimization; refer to [13].
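As a rough illustration of the subgradient-method idea listed above, here is a minimal sketch (Python). It is not the algorithm as developed later in this report: the test function $f(x) = \|x - c\|_1$, the diminishing step $t_k = 1/k$, and all names are choices made here for illustration only.

```python
import numpy as np

c = np.array([1.0, -2.0, 0.5])          # minimizer of f is x = c

def f(x):
    # nonsmooth convex test function f(x) = ||x - c||_1
    return np.sum(np.abs(x - c))

def subgrad(x):
    # sign(x - c) is one valid subgradient of f at x (at coordinates
    # where x_i == c_i, any value in [-1, 1] would do; sign gives 0)
    return np.sign(x - c)

x = np.zeros(3)
best = f(x)
for k in range(1, 2001):
    x = x - (1.0 / k) * subgrad(x)      # subgradient step with t_k = 1/k
    best = min(best, f(x))              # steps are not monotone in f,
                                        # so track the best value seen
print(best)                             # close to the optimal value 0
```

Unlike gradient descent, a subgradient step is not guaranteed to decrease $f$, which is why the best value seen so far is tracked.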
1.3 Subgradients
Just as we use gradients in differentiable convex analysis, we define subgradients when the objective is not smooth. From the theory of convex analysis, a differentiable function $f$ is convex iff
$$ f(y) \ge f(x) + \nabla f(x)^T (y - x) \qquad \forall\, x, y \in \mathbb{R}^n. \qquad (1) $$
Similarly, a subgradient of $f$ at $x$ is defined as any $g \in \mathbb{R}^n$ such that
$$ f(y) \ge f(x) + g^T (y - x) \qquad \forall\, y \in \mathbb{R}^n. \qquad (2) $$
Many such subgradients may exist. The set of all subgradients of $f$ at $x$ is denoted $\partial f(x)$.
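To connect definition (2) to something computable, here is a small sketch (Python). The oracle `subgrad` below returns one particular subgradient of $f(x) = |x|$ (the sign), a choice made here for illustration; a brute-force test then confirms that inequality (2) holds at randomly sampled points.

```python
import numpy as np

def f(x):
    return abs(x)

def subgrad(x):
    # sign(x) is a valid subgradient of |x|; at x = 0 it returns 0,
    # which is one of the many subgradients in [-1, 1]
    return np.sign(x)

# check f(y) >= f(x) + g*(y - x) for many random pairs (x, y)
rng = np.random.default_rng(0)
xs = rng.uniform(-5, 5, size=1000)
ys = rng.uniform(-5, 5, size=1000)
assert all(f(y) >= f(x) + subgrad(x) * (y - x) - 1e-12
           for x, y in zip(xs, ys))
print("subgradient inequality (2) holds on all sampled pairs")
```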
 
1.4 Convex Analysis for Subgradients
A number of facts regarding nonsmooth convex functions should be noted.
- A convex function is always subdifferentiable, i.e., a subgradient of a convex function exists at every point.
- Directional derivatives also exist at every point.
- If a convex function $f$ is differentiable at $x$, its subdifferential is a singleton set containing only the gradient at that point, i.e., $\partial f(x) = \{\nabla f(x)\}$.
- Let $f'(x; d)$ denote the directional derivative of $f$ in the direction $d$ and let $t > 0$. From the definition of a subgradient,
  $$ \frac{f(x + td) - f(x)}{t} \ge g^T d \qquad \forall\, g \in \partial f(x). \qquad (3) $$
  So, subgradients are “lower bounds” for directional derivatives.
- In fact, $f'(x; d) = \sup_{g \in \partial f(x)} \langle g, d \rangle$ (see the numerical check after this list).
- Further, $d$ is a descent direction iff $g^T d < 0$ for all $g \in \partial f(x)$.
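As a quick numerical sanity check of the identity $f'(x; d) = \sup_{g \in \partial f(x)} \langle g, d \rangle$ (a sketch under choices made here: $f(x) = |x|$ at $x = 0$, where $\partial f(0) = [-1, 1]$, with the supremum approximated over a dense sample of subgradients):

```python
import numpy as np

f = abs                                   # f(x) = |x|, and ∂f(0) = [-1, 1]

def dir_derivative(f, x, d, t=1e-8):
    # one-sided difference quotient approximating f'(x; d)
    return (f(x + t * d) - f(x)) / t

subgradients = np.linspace(-1.0, 1.0, 2001)   # dense sample of ∂f(0)
for d in (2.0, -3.0, 0.5):
    lhs = dir_derivative(f, 0.0, d)
    rhs = max(g * d for g in subgradients)    # sup of <g, d> over the sample
    print(f"d={d:+.1f}   f'(0; d) ~ {lhs:.4f}   sup <g, d> ~ {rhs:.4f}")
# both columns equal |d|, as the identity predicts
```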
1.5 Properties
Just as we have an arithmetic for gradients, we have an arithmetic for subgradients. Rigorous proofs of these results are presented in [9]. Properties:

- $\partial(f_1 + f_2)(x) = \partial f_1(x) + \partial f_2(x)$
- $\partial(\alpha f)(x) = \alpha\, \partial f(x)$ for $\alpha \ge 0$
- If $g(x) = f(Ax + b)$, then $\partial g(x) = A^T \partial f(Ax + b)$

We have maxima and minima conditions in differential calculus (namely, if a point is an extremal point, then the derivative there is $0$). The extension of this lemma to nonsmooth functions is:
$$ x \text{ is a local extremum} \iff 0 \in \partial f(x). $$
We note that this is not very useful in practice, because the subgradient may not vary continuously, i.e., its value at a point may not be representative of its values nearby. This makes finding minima for general nonsmooth functions impossible. As we shall see, we can use the convexity to find the location of the minimum (as it is unique).

As an example, consider the case $f(x) = |x|$: the oracle returns the subgradient $0$ only at $x = 0$. So, the magnitude of the subgradient is no indication of how close we are to the final solution. During an iterative method, we may never hit zero, though we come arbitrarily close.
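This behaviour is easy to reproduce numerically (a minimal sketch; the starting point and the $1/k$ step size are choices made here). The iterates of a subgradient step on $f(x) = |x|$ approach $0$, yet every subgradient the oracle returns has magnitude $1$, so the subgradient norm never signals convergence:

```python
# subgradient steps on f(x) = |x|: the iterate drifts toward 0, but the
# oracle keeps returning g = +1 or -1, so |g| says nothing about how
# close we are to the minimizer
x = 5.0
for k in range(1, 10001):
    g = 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)   # subgradient of |x|
    x -= g / k
    if k % 2000 == 0:
        print(f"k={k:5d}   x={x:+.6f}   |g|={abs(g):.1f}")
```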
