
“mathematics is more about geometrical visualization than Symbolic Acrobatics”

Laopti Assignment 53

Lagrangian Duality (LD): the trickiest part of optimization theory.

LD is the core of the modern theory of convex optimization. It arose from the Lagrangian
function. The Lagrangian function is built from the objective function and the
constraints (treated as functions by dropping the equality or inequality part of each
constraint); it lets us write the first-order optimality conditions in a
methodical way. The resulting relationships between the gradients of the objective and
constraint functions, together with the complementarity conditions, are the famous KKT
conditions.

Complementarity conditions basically let us avoid writing 'if' in the
mathematical statement. The optimum of the objective function can lie either on
the boundary of the constraint set or in its interior. If it is on the boundary, we have one set
of first-order conditions, and if it is in the interior we have another set.
The complementarity conditions combine these into a single statement.

These conditions allow us to check whether an optimization algorithm has
converged to the optimal point.
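As an illustration, such a KKT check can be sketched numerically. This is a minimal sketch on a hypothetical one-dimensional problem (f(x) = (x − 2)², g(x) = 1 − x, whose constrained optimum sits on the boundary at x* = 1 with multiplier λ* = 2); the function names are mine, not from the notes.

```python
# Hypothetical toy problem:  min f(x)  subject to  g(x) >= 0
def f(x): return (x - 2)**2
def g(x): return 1 - x
def df(x): return 2*(x - 2)   # gradient of f
def dg(x): return -1.0        # gradient of g

def kkt_satisfied(x, lam, tol=1e-8):
    stationarity    = abs(df(x) - lam*dg(x)) < tol  # grad L = grad f - lam*grad g = 0
    primal_feas     = g(x) >= -tol                  # g(x) >= 0
    dual_feas       = lam >= -tol                   # lam >= 0
    complementarity = abs(lam*g(x)) < tol           # lam * g(x) = 0
    return stationarity and primal_feas and dual_feas and complementarity

print(kkt_satisfied(1.0, 2.0))  # True: boundary optimum, lam* = 2 > 0, g(x*) = 0
print(kkt_satisfied(2.0, 0.0))  # False: the unconstrained minimizer violates g(x) >= 0
```

Note how the single product condition λ·g(x) = 0 handles both the boundary case (g = 0, λ > 0) and the interior case (g > 0, λ = 0) without any 'if'.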

Soon mathematicians discovered that there is more to the Lagrangian function
(corresponding to a convex objective function defined over a convex constraint
set). The finding is: corresponding to an optimization problem in terms of the
original primal variables (sometimes called design variables), there is an
equivalent dual optimization problem whose variables are the Lagrange
multipliers. It follows from a geometry-based argument on the Lagrangian function.

To understand this, consider a simple problem.

    min_x f(x)   subject to   g(x) ≥ 0

The corresponding Lagrangian function is

    L(x, λ) = f(x) − λ g(x),   λ ≥ 0
Through a geometrical argument, we can derive the following:

    min_x f(x)   subject to   g(x) ≥ 0

is equivalent to

    max_λ min_x L(x, λ) = min_x max_λ L(x, λ)   ------------- (1)

The derivation is based on a very tricky 'thought experiment'. It is given in the note
handed out to you and is also in the PPT slides. It will be discussed in class. One
hour each is required to prove that the LHS and the RHS both lock on to the
same minimum value of the objective function over the constraint set.
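Before the geometric proof, relation (1) can be checked numerically on a toy problem. This is a minimal sketch, assuming the hypothetical problem f(x) = (x − 2)², g(x) = 1 − x (constrained minimum f(1) = 1), with λ restricted to a bounded grid since the true inner maximum can be +∞:

```python
import numpy as np

# Hypothetical 1-D example: f(x) = (x-2)^2, g(x) = 1 - x, L = f - lam*g
xs   = np.linspace(-1.0, 4.0, 2001)
lams = np.linspace(0.0, 10.0, 2001)   # bounded stand-in for lam >= 0
X, LAM = np.meshgrid(xs, lams, indexing="ij")
L = (X - 2)**2 - LAM*(1 - X)

minimax = L.max(axis=1).min()   # min over x of (max over lam)
maximin = L.min(axis=0).max()   # max over lam of (min over x)
print(minimax, maximin)         # both approach f(x*) = 1 on this grid
```

On a finer grid both sides converge to the same constrained minimum value, which is exactly the content of (1) for this convex example.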

Note very specially that the whole argument is centered on the objective
function value.

Here, through several pictures, we will prove one part of the relation, namely the RHS of (1):

    min_x f(x) subject to g(x) ≥ 0   ⇒   min_x max_λ L(x, λ)

Follow the arguments/visualization below.

Step 1: Visualize the problem. Assume x ∈ R²; f(x) is to be visualized in the third
dimension.

Step 2: Mentally compute max_{λ≥0} f(x) − λ g(x).

We are allowed to vary λ. Find what λ should be at each x so as to maximize the
Lagrangian function at that point.

In the region where g(x) ≥ 0, max_{λ≥0} f(x) − λ g(x) is f(x) itself; it is obtained by
setting λ = 0. Note again that we are computing the Lagrangian function and
plotting it at every point where g(x) ≥ 0.

In the region where g(x) < 0, max_{λ≥0} f(x) − λ g(x) is ∞; it is obtained by letting
λ → ∞. Note that we are computing and plotting it at every point where g(x) < 0.

The resulting picture looks like the one given below.


Step 3: Compute min_x max_{λ≥0} f(x) − λ g(x).

The output of step 3 is given in the following figure. We end up finding the
minimum value of f(x) in the feasible region.
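Steps 2 and 3 can be sketched on a grid. A minimal sketch, assuming the hypothetical toy problem f(x) = (x − 2)², g(x) = 1 − x (feasible region x ≤ 1, constrained minimum f(1) = 1):

```python
import numpy as np

# Hypothetical toy problem: f(x) = (x-2)^2, g(x) = 1 - x
xs = np.linspace(-1.0, 4.0, 2001)
f = (xs - 2)**2
g = 1 - xs

# Step 2: max over lam >= 0 of f - lam*g is f where g >= 0 (lam = 0)
#         and +inf where g < 0 (lam -> inf).
inner_max = np.where(g >= 0, f, np.inf)

# Step 3: min over x of the inner max = constrained minimum of f.
print(inner_max.min())  # ~1.0 = f(1), attained on the boundary g(x) = 0
```

The infinite wall erected over the infeasible region is what forces the outer minimization to land inside the constraint set.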

This proves that

    min_x f(x) subject to g(x) ≥ 0   ⇒   min_x max_λ L(x, λ)

Similarly, we can prove that

    min_x f(x) subject to g(x) ≥ 0   ⇒   max_λ min_x L(x, λ)

but this requires a slightly tougher geometric argument.

We will do it later.
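Even before the geometric proof, the max–min side can be explored numerically. A minimal sketch, assuming the hypothetical problem f(x) = (x − 2)², g(x) = 1 − x: the dual function q(λ) = min_x L(x, λ) is evaluated on a grid and then maximized over λ ≥ 0.

```python
import numpy as np

# Hypothetical toy problem: f(x) = (x-2)^2, g(x) = 1 - x
xs = np.linspace(-5.0, 5.0, 10001)

def q(lam):
    # dual function: q(lam) = min over x of L(x, lam) = (x-2)^2 - lam*(1 - x)
    return np.min((xs - 2)**2 - lam*(1 - xs))

lams = np.linspace(0.0, 5.0, 5001)
qs = np.array([q(lam) for lam in lams])
best_lam = lams[qs.argmax()]
print(best_lam, qs.max())  # lam* ~ 2, max q ~ 1 = constrained minimum f(1)
```

The maximum of the dual function matches the constrained minimum of f, as the max–min side of (1) promises for this convex problem.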

Lagrangian duality applied to a linear programming problem.

Consider the following LP:

    max_x c^T x   subject to   Ax ≤ b,  x ≥ 0

Let us find the Lagrangian function.

Writing the Lagrangian is tricky.

One way is to convert the problem into a standard format (changing max to min, changing
the type of inequality, etc.).

I personally remember it in the following way.

If it is a maximization problem, add the positive (≥ 0) constraint part, multiplied by its
Lagrange multiplier, to the objective function and go above the objective function
value.

If it is a minimization problem, subtract the positive (≥ 0) constraint part, multiplied by
its Lagrange multiplier, from the objective function and go below the objective
function value.

It may take some time and practice to fully appreciate the above statement.

For the above LP problem, the Lagrangian is:

    max_x c^T x,  Ax ≤ b,  x ≥ 0   ⇒   L(x, y ≥ 0, λ ≥ 0) = c^T x + y^T(b − Ax) + λ^T x


The positive constraint part of the first set of inequality constraints is b − Ax, because
b − Ax ≥ 0. Similarly, the positive constraint part of the second set is x,
because x ≥ 0.

As per the duality theorem,

    max_x c^T x,  Ax ≤ b,  x ≥ 0   ⇒   min_{y,λ} max_x L(x, y ≥ 0, λ ≥ 0) = min_{y,λ} max_x [ c^T x + y^T(b − Ax) + λ^T x ]



With respect to the primal variable x, we write the optimality condition and substitute it back
into the Lagrangian to obtain a minimization problem in terms of the dual variables.

    ∂L/∂x = 0   ⇒   c − A^T y + λ = 0   (a vector equation)

To eliminate the x variables from the Lagrangian, we substitute c = A^T y − λ:

    c^T x + y^T(b − Ax) + λ^T x = (A^T y − λ)^T x + y^T(b − Ax) + λ^T x = y^T b

The x terms cancel, and the Lagrangian becomes L(y ≥ 0, λ ≥ 0) = b^T y.


The dual variables y, λ are related by c = A^T y − λ. Since λ ≥ 0, this is equivalent to
the constraint A^T y ≥ c.

So, as per duality, we obtain a new optimization problem:

    min_y b^T y   subject to   A^T y ≥ c,  y ≥ 0
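This primal–dual pair can be verified numerically. A minimal sketch, assuming SciPy is available and using hypothetical data (the particular c, A, b are mine, not from the notes); scipy.optimize.linprog minimizes, so the primal objective is negated:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data for  max c^T x  s.t.  Ax <= b, x >= 0
c = np.array([3.0, 5.0])
A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 2.0]])
b = np.array([4.0, 12.0, 18.0])

# Primal: linprog minimizes, so pass -c; default bounds give x >= 0.
primal = linprog(-c, A_ub=A, b_ub=b)

# Dual:  min b^T y  s.t.  A^T y >= c, y >= 0, written as -A^T y <= -c.
dual = linprog(b, A_ub=-A.T, b_ub=-c)

print(-primal.fun, dual.fun)  # strong LP duality: the two optimal values coincide
```

For a feasible, bounded LP, strong duality guarantees the two optimal values are equal, which is what the printout confirms for this data.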

Assignment Question

Find the dual optimization problem for SVM problem given by

    min_{w,γ} (1/2) w^T w   subject to   d_i (w^T x_i − γ) ≥ 1,  ∀ i = 1 : m

Refer to our book on SVM or any other book on kernel methods.

You are about to enter the world of kernel methods, which revolutionized
machine learning theory in the 1990s.

Note that in deep learning algorithms for classification, the last block is still an
SVM classifier.

Kernel PCA, Kernel CCA, and Kernel ICA are powerful concepts useful in AI and data
science.
