
Optimum Design Concepts: Optimality Conditions

Dr Yasser El-shaer
Upon completion of this chapter, you will be able to:
• Define local and global minima (maxima) for unconstrained and constrained
optimization problems
• Write optimality conditions for unconstrained problems
• Write optimality conditions for constrained problems
• Check optimality of a given point for unconstrained and constrained problems
• Solve first-order optimality conditions for candidate minimum points
• Check convexity of a function and the design optimization problem
• Use Lagrange multipliers to study changes to the optimum value of the cost function
due to variations in a constraint
Classification of optimization methods
OPTIMALITY CONDITIONS
Basic concept of optimality: Functions of a Single Variable

A stationary point may be (a) a minimum, (b) a maximum, or (c) an inflection point
DEFINITIONS OF GLOBAL AND LOCAL MINIMA

• A global minimum point is the one where there are no other feasible points with better cost function
values.
• A local minimum point is the one where there are no other feasible points “in the vicinity” with better cost
function values.
Existence of a Minimum

Necessary condition:
Consider a function f(x) of a single variable defined for a < x < b. For a point x* ∈ (a, b) to minimize f(x), x* must be a stationary point of f(x); that is, the first derivative of f(x) with respect to x must vanish at x = x*: f′(x*) = 0.
• Sufficient condition:
For the same function f(x), with f′(x*) = 0, f(x*) is a minimum value of f(x) if f″(x*) > 0, or a maximum value if f″(x*) < 0.
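As a minimal illustration of how these two conditions are applied (a sketch only; the cubic below is an invented example, not from the slides), the stationary points of a single-variable function can be found and classified symbolically:

```python
import sympy as sp

# Invented example function, used only to illustrate the two conditions
x = sp.symbols('x')
f = x**3 - 3*x**2 + 4

df = sp.diff(f, x)       # first derivative f'(x)
d2f = sp.diff(f, x, 2)   # second derivative f''(x)

# Necessary condition: f'(x*) = 0 gives the candidate (stationary) points
candidates = sp.solve(df, x)

# Sufficient condition: the sign of f''(x*) classifies each candidate
for xs in candidates:
    curvature = d2f.subs(x, xs)
    if curvature > 0:
        kind = "local minimum"
    elif curvature < 0:
        kind = "local maximum"
    else:
        kind = "test inconclusive (higher-order check needed)"
    print(f"x* = {xs}: f''(x*) = {curvature} -> {kind}")
```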
Functions of a Single Variable
Basic concept of optimality: Functions of Multiple Variables
• Necessary condition:
The gradient of the function f(x) must vanish at x = x*; that is, x* must be a stationary point: ∇f(x*) = 0.

• Sufficient condition:
For the same function f(x), let ∇f(x*) = 0. Then f(x*) is a local minimum value of f(x) if the Hessian matrix of f evaluated at x* is positive definite.

Note that a square symmetric matrix H is positive definite if

(a) all of its leading principal minors (the determinants of the upper-left 1 × 1, 2 × 2, …, n × n submatrices) are positive,
or (b) all of its eigenvalues are positive
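A quick numerical check of these two equivalent tests (a sketch; the matrix below is made up for illustration):

```python
import numpy as np

# Made-up symmetric matrix, used only to illustrate the two tests
H = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# Test (a): all leading principal minors are positive (Sylvester's criterion)
minors = [np.linalg.det(H[:k, :k]) for k in range(1, H.shape[0] + 1)]

# Test (b): all eigenvalues are positive
eigvals = np.linalg.eigvalsh(H)   # eigvalsh assumes a symmetric matrix

print("leading principal minors:", minors)   # [4.0, 11.0]
print("eigenvalues:", eigvals)               # both positive
print("positive definite:", all(m > 0 for m in minors) and bool(np.all(eigvals > 0)))
```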
Quadratic Forms and Definite Matrices
• Quadratic Form
The quadratic form is a special nonlinear function having only second-order terms (either
the square of a variable or the product of two variables); for example, the following
function of three variables:

Generalizing the quadratic form of three variables, in summation form:

F(x) = Σ Σ pij xi xj (summing i and j from 1 to n)

where the pij are constants related to the coefficients of the various terms


The Matrix of the Quadratic Form
• The quadratic form can be written in matrix notation.
Let P = [pij] be an n × n matrix and x = (x1, x2, …, xn) be an n-dimensional
vector. Then the quadratic form can be written as: F(x) = xT P x
P is called the matrix of the quadratic form F(x). Elements of P are obtained from the coefficients of the terms in
the function F(x). For example, the following two (3 × 3) P matrices are associated with the quadratic form

In these matrices, the diagonal elements are the coefficients of the square terms in the function. The off-diagonal terms are obtained from the coefficients of the cross-product terms, subject to the constraint that pij + pji must equal the coefficient of the xi xj term.

All such matrices are asymmetric except one. The symmetric matrix A associated with the quadratic form is obtained by requiring pij = pji.
• The symmetric matrix A for the quadratic form is obtained from any asymmetric matrix P as follows:

A = ½ (P + Pᵀ), that is, aij = ½ (pij + pji)

Using this definition of the symmetric matrix A, the matrix P can be replaced by the symmetric matrix A, and the quadratic form becomes:
F(x) = xT A x
The value or expression of the quadratic form does not change with P replaced by A. The
symmetric matrix A is useful in determining the nature of the quadratic form, which will be
discussed later in this lecture.
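A quick numerical sketch of this property (the matrix and vector values are invented for illustration):

```python
import numpy as np

# Made-up asymmetric matrix P of a quadratic form
P = np.array([[1.0, 4.0, 0.0],
              [0.0, 2.0, 6.0],
              [2.0, 0.0, 3.0]])

# Symmetric matrix of the same quadratic form: A = (P + P^T) / 2
A = 0.5 * (P + P.T)

x = np.array([1.0, -2.0, 3.0])

# F(x) = x^T P x and x^T A x give the same value for every x
print(x @ P @ x)   # value using the asymmetric matrix
print(x @ A @ x)   # same value using the symmetric matrix
```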
Example:

Dividing the coefficients equally between the off-diagonal terms, we obtain the symmetric matrix
associated with the quadratic form:
DETERMINATION OF THE FORM
OF A MATRIX
Stationary point nature summary
Gradient Vector: Partial Derivatives of a Function
• Geometrically, the gradient vector is normal to the tangent plane at the point
x*, as shown in Figure for a function of three variables. Also, it points in the
direction of maximum increase in the function.

Gradient vector for f(x1, x2, x3) at the point x*.


CALCULATION OF A GRADIENT VECTOR
Hessian Matrix: Second-Order Partial Derivatives
• Differentiating the gradient vector once again, we obtain a
matrix of second partial derivatives for the function f(x)
called the Hessian matrix or, simply, the Hessian.
EVALUATION OF THE GRADIENT AND HESSIAN
OF A FUNCTION
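The worked example referred to by this heading is not reproduced in the text. As a stand-in, here is a minimal symbolic sketch (the function is invented) of evaluating a gradient and Hessian:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')

# Invented function of two variables, used only for illustration
f = x1**2 + 2*x1*x2 + 3*x2**2 + 4*x1

grad = sp.Matrix([sp.diff(f, x1), sp.diff(f, x2)])   # gradient vector
hess = sp.hessian(f, (x1, x2))                       # Hessian matrix

point = {x1: 1, x2: -1}
print(grad.subs(point))   # gradient evaluated at the chosen point
print(hess)               # Hessian (constant here because f is quadratic)
```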
Basic concept of optimality: Functions of Multiple Variables
A function of two variables, f(x1, x2) = −(cos²x1 + cos²x2)², is graphed in Figure (a). The point (x1, x2) = (0, 0) is a local minimum: perturbations from it in any direction increase the value of f(x); that is, the slopes of the function with respect to x1 and x2 are zero at this point of local minimum. Similarly, the function f(x1, x2) = (cos²x1 + cos²x2)², graphed in Figure (b), has a local maximum at (x1, x2) = (0, 0): perturbations from this point in any direction decrease the value of f(x); that is, the slopes of the function with respect to x1 and x2 are zero at this point of local maximum.

• The first derivatives of the function with respect to the variables are zero at the minimum or maximum,
which again is called a stationary point.
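This can be verified symbolically. A minimal sketch, reading the first function above as f(x1, x2) = −(cos²x1 + cos²x2)² (an interpretation of the slide's notation):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = -(sp.cos(x1)**2 + sp.cos(x2)**2)**2

grad = sp.Matrix([sp.diff(f, x1), sp.diff(f, x2)])
hess = sp.hessian(f, (x1, x2))

origin = {x1: 0, x2: 0}
print(grad.subs(origin))                        # zero vector -> (0, 0) is a stationary point
print(hess.subs(origin).is_positive_definite)   # True -> (0, 0) is a local minimum
```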
Example 3:
MINIMUM-COST CYLINDRICAL TANK DESIGN

• Project/problem description. Design a minimum-cost cylindrical tank


closed at both ends to contain a fixed volume of fluid V. The cost is
found to depend directly on the area of sheet metal used.
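The slide's full formulation and numerical data are not reproduced here. A minimal symbolic sketch, assuming the standard formulation (minimize the surface area A = 2πR² + 2πRH of a closed cylinder of radius R and height H, subject to the volume constraint πR²H = V):

```python
import sympy as sp

R, V = sp.symbols('R V', positive=True)

# Assumed formulation: eliminate H using the volume constraint pi*R**2*H = V,
# then minimize the surface area of the closed tank as a function of R only
H = V / (sp.pi * R**2)
A = 2 * sp.pi * R**2 + 2 * sp.pi * R * H

# First-order necessary condition: dA/dR = 0
R_star = sp.solve(sp.diff(A, R), R)[0]
H_star = H.subs(R, R_star)

print(sp.simplify(R_star))                    # R* = (V / (2*pi))**(1/3)
print(sp.simplify(H_star / R_star))           # H*/R* = 2: optimal height equals the diameter
print(sp.diff(A, R, 2).subs(R, R_star) > 0)   # second derivative positive -> a minimum
```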
Local minima for a function of two variables using
optimality conditions

Lagrange multipliers
• The Lagrange multiplier technique lets you find the maximum or minimum of
a multivariable function f(x,y,…) when there is some constraint on the input
values you are allowed to use.
• This technique only applies to constraints that look something like this:
g(x,y,…)=c
• Here, g is another multivariable function with the same input space as f, and c is some
constant.

• The core idea is to look for points where the contour lines of f and g are tangent to each other.
• This is the same as finding points where the gradient vectors of f and g are parallel to each other.
• Using contour maps: reasoning about this problem becomes easier if we visualize it not with a graph, but with its contour lines.
• When the contour lines of the two functions f and g are tangent, their gradient vectors are parallel.
Lagrange wrote down a special new function, the Lagrangian ℒ, which takes in all the same input variables as f and g, along with a new variable λ:

ℒ(x, y, …, λ) = f(x, y, …) − λ (g(x, y, …) − c)
Example
• consider the following inputs:

Here's how this new function would look:


• The candidate points are found by setting the gradient of ℒ equal to the zero vector: ∇ℒ = 0.
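The example inputs on the slide are not reproduced above. As a stand-in sketch with an invented problem (maximize f = x·y on the circle x² + y² = 1), the candidate points are obtained by solving ∇ℒ = 0:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)

# Invented example: f(x, y) = x*y subject to g(x, y) = x**2 + y**2 = 1
f = x * y
g = x**2 + y**2
c = 1

# Lagrangian and its gradient with respect to (x, y, lam)
L = f - lam * (g - c)
grad_L = [sp.diff(L, v) for v in (x, y, lam)]

# Candidate points: solve grad(L) = 0
candidates = sp.solve(grad_L, (x, y, lam), dict=True)
for sol in candidates:
    print(sol, " f =", f.subs(sol))
```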
NECESSARY CONDITIONS FOR A GENERAL
CONSTRAINED PROBLEM

• The Role of Inequalities: an inequality constraint gi(x) ≤ 0 is ACTIVE at a point if it holds there as an equality, gi(x) = 0, and INACTIVE if it holds as a strict inequality, gi(x) < 0.
Karush-Kuhn-Tucker Necessary Conditions
• Working with inequality constraints:
• An inequality constraint gi(x) ≤ 0 is equivalent to the equality constraint gi(x) + si = 0, where si ≥ 0 is a slack variable. The slack variables si are treated as unknowns of the design problem along with the original variables. Their values are determined as a part of the solution.
• Note that with the preceding procedure, we must introduce one additional variable si and an additional constraint si ≥ 0 to treat each inequality constraint. This increases the dimension of the design problem. The constraint si ≥ 0 can be avoided if we use si² as the slack variable instead of si itself. Therefore, the inequality gi(x) ≤ 0 is converted to the equality

gi(x) + si² = 0

where si can have any real value. This form can be used in the Lagrange Multiplier Theorem to treat inequality constraints and to derive the corresponding necessary conditions.
• The m new equations needed for determining the slack variables are obtained by requiring the Lagrangian L to be stationary with respect to the slack variables as well: ∂L/∂si = 0.
• There is an additional necessary condition for the Lagrange multipliers ui of "≤ type" constraints, given as ui ≥ 0.
EXAMPLE: INEQUALITY-CONSTRAINED
PROBLEM—USE OF NECESSARY CONDITIONS

Solution:
One solution can be obtained by setting s to zero to satisfy
the condition 2us=0 in Eq. (g). Equations (d) through (f)
are solved to obtain x1*=x2*=1, u*=1, s=0. When s=0, the
inequality constraint is active. x1, x2, and u are solved
from the remaining three equations, (d) through (f), which
are linear in the variables. This is a stationary point of L,
so it is a candidate minimum point.
Note from Figure that it is actually a minimum point,
since any move away from x* either violates the
constraint or increases the cost function.
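The slide's problem statement and its equations (d) through (g) are not reproduced in this text. The sketch below therefore uses an assumed problem that is consistent with the quoted solution (minimize f = (x1 − 1.5)² + (x2 − 1.5)² subject to g = x1 + x2 − 2 ≤ 0, which gives x1* = x2* = 1 and u* = 1):

```python
import sympy as sp

x1, x2, u, s = sp.symbols('x1 x2 u s', real=True)

# Assumed problem data (not taken from the slide): cost and inequality constraint
f = (x1 - sp.Rational(3, 2))**2 + (x2 - sp.Rational(3, 2))**2
g = x1 + x2 - 2                       # g(x) <= 0

# Lagrangian using the slack-variable form g + s**2 = 0
L = f + u * (g + s**2)

# Stationarity with respect to x1, x2, and s, plus the converted equality constraint
eqs = [sp.diff(L, v) for v in (x1, x2, s)] + [g + s**2]
solutions = sp.solve(eqs, (x1, x2, u, s), dict=True)

# Keep only real candidates that satisfy the multiplier sign condition u >= 0
for sol in solutions:
    if sol[s].is_real and sol[u] >= 0:
        print(sol)   # expected: x1 = 1, x2 = 1, u = 1, s = 0
```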
• These gradients are along the same line but in opposite directions, as shown in Figure. Observe also that any small move from point C either increases the cost function or takes the design into the infeasible region; that is, the cost function cannot be reduced any further without violating the constraint. Thus, point (1, 1) is indeed a local minimum point.
• This geometrical condition is called the sufficient condition for a local
minimum point.
• It turns out that the necessary condition u ≥ 0 ensures that the gradients of the cost and the constraint functions point in opposite directions. This way f cannot be reduced any further by stepping in the negative gradient direction of the cost function without violating the constraint. That is, any further reduction in the cost function requires leaving the feasible region at the candidate minimum point. This can be observed in Figure.
• The necessary conditions for the equality- and inequality-constrained problem written in the standard form can be summed up in what are commonly known as the Karush-Kuhn-Tucker (KKT) conditions.
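For completeness, the standard statement of these conditions (written here in the usual form for minimizing f(x) subject to hj(x) = 0 and gi(x) ≤ 0; this is the textbook form, not transcribed from the slides) is:

```latex
\begin{aligned}
&\nabla f(x^{*}) + \sum_{j=1}^{p} v_{j}^{*}\,\nabla h_{j}(x^{*})
                 + \sum_{i=1}^{m} u_{i}^{*}\,\nabla g_{i}(x^{*}) = 0
 &&\text{(stationarity of the Lagrangian)}\\
&h_{j}(x^{*}) = 0, \qquad g_{i}(x^{*}) \le 0
 &&\text{(feasibility)}\\
&u_{i}^{*}\, g_{i}(x^{*}) = 0
 &&\text{(switching / complementary slackness)}\\
&u_{i}^{*} \ge 0
 &&\text{(nonnegativity of multipliers for $\le$ constraints)}
\end{aligned}
```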
Solution to optimality conditions using MATLAB
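The MATLAB listing for this slide is not reproduced in the text. As a rough equivalent sketch in Python (scipy.optimize.minimize, reusing the same assumed example problem as above), the candidate optimum can also be obtained numerically:

```python
import numpy as np
from scipy.optimize import minimize

# Same assumed example problem as above (not from the slide):
# minimize (x1 - 1.5)**2 + (x2 - 1.5)**2  subject to  x1 + x2 - 2 <= 0
cost = lambda x: (x[0] - 1.5)**2 + (x[1] - 1.5)**2

# SciPy expects inequality constraints written as c(x) >= 0
constraints = [{'type': 'ineq', 'fun': lambda x: 2.0 - x[0] - x[1]}]

result = minimize(cost, x0=np.array([0.0, 0.0]), constraints=constraints)
print(result.x)   # approximately [1.0, 1.0], matching the KKT solution
```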
