
Optimization

VARUN PRATAP SINGH


Optimization literature
Textbooks:

1. Nocedal J. and Wright S.J., Numerical Optimization, Springer Series in Operations Research, Springer, 636 pp, 1999.
2. Spall J.C., Introduction to Stochastic Search and Optimization: Estimation, Simulation and Control, Wiley, 595 pp, 2003.
3. Chong E.K.P. and Zak S.H., An Introduction to Optimization, Second Edition, John Wiley & Sons, New York, 476 pp, 2001.
4. Rao S.S., Engineering Optimization: Theory and Practice, John Wiley & Sons, New York, 903 pp, 1996.
5. Gill P.E., Murray W. and Wright M.H., Practical Optimization, Elsevier, 401 pp, 2004.
6. Goldberg D.E., Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, Mass., 1989.
7. Boyd S. and Vandenberghe L., Convex Optimization, Cambridge University Press, 2004 (available at http://www.stanford.edu/~boyd/cvxbook/).
Optimization literature
Journals:

1. Engineering Optimization
2. ASME Journal of Mechanical Design
3. AIAA Journal
4. ASCE Journal of Structural Engineering
5. Computers and Structures
6. International Journal for Numerical Methods in Engineering
7. Structural Optimization
8. Journal of Optimization Theory and Applications
9. Computers and Operations Research
10. Operations Research and Management Science
Optimization
Course Schedule:

1. Introduction to Optimization
2. Classical Optimization Techniques
3. Linear programming and the Simplex method
4. Nonlinear programming: One-Dimensional Minimization Methods
5. Nonlinear programming: Unconstrained Optimization Techniques
6. Nonlinear programming: Constrained Optimization Techniques
7. Global Optimization Methods: Genetic Algorithms
8. Global Optimization Methods: Simulated Annealing
9. Global Optimization Methods: Coupled Local Minimizers
Optimization
Course Prerequisite:

• Familiarity with MATLAB. If you are not familiar with MATLAB, please
visit:

http://www.ece.ust.hk/~palomar/courses/ELEC692Q/lecture%2006%20-%20cvx/matlab_crashcourse.pdf

http://www.ece.ust.hk/~palomar/courses/ELEC692Q/lecture%2006%20-%20cvx/official_getting_started.pdf
1. Introduction
• Optimization is the act of obtaining the best result under given
circumstances.

• Optimization can be defined as the process of finding the conditions
that give the maximum or minimum of a function.

• The optimum-seeking methods are also known as mathematical
programming techniques and are generally studied as a part of
operations research.

• Operations research is a branch of mathematics concerned with the
application of scientific methods and techniques to decision-making
problems and with establishing the best or optimal solutions.

• Operations research (in the US), also called operational research
(in the UK), is an interdisciplinary branch of mathematics which uses
methods like:
– mathematical modeling
– statistics
– algorithms
to arrive at optimal or good decisions in complex problems which are
concerned with optimizing the maxima (profit, faster assembly line,
greater crop yield, higher bandwidth, etc.) or minima (cost, loss,
lowering of risk, etc.) of some objective function.

• The eventual intention behind using operations research is to
elicit a best possible solution to a problem mathematically, which
improves or optimizes the performance of the system.
1. Introduction
Historical development

• Isaac Newton (1642-1727)
(The development of differential calculus methods of optimization)

• Joseph-Louis Lagrange (1736-1813)
(Calculus of variations, minimization of functionals, method of
optimization for constrained problems)

• Augustin-Louis Cauchy (1789-1857)
(Solution by direct substitution, steepest descent method for
unconstrained optimization)

• Leonhard Euler (1707-1783)
(Calculus of variations, minimization of functionals)

• Gottfried Leibnitz (1646-1716)
(Differential calculus methods of optimization)

• George Bernard Dantzig (1914-2005)
(Linear programming and Simplex method (1947))

• Richard Bellman (1920-1984)
(Principle of optimality in dynamic programming problems)

• Harold William Kuhn (1925-2014)
(Necessary and sufficient conditions for the optimal solution of
programming problems, game theory)

• Albert William Tucker (1905-1995)
(Necessary and sufficient conditions for the optimal solution of
programming problems, nonlinear programming, game theory;
his PhD student was John Nash)

• John von Neumann (1903-1957)
(Game theory)
1. Introduction
• Mathematical optimization problem:

minimize $f_0(x)$
subject to $g_i(x) \le b_i, \quad i = 1, \ldots, m$

• $f_0 : \mathbb{R}^n \to \mathbb{R}$: objective function
• $x = (x_1, \ldots, x_n)$: design variables (unknowns of the problem;
they must be linearly independent)
• $g_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \ldots, m$: inequality constraints

• The problem is a constrained optimization problem
1. Introduction
• If a point x* corresponds to the minimum value of the function f (x), the
same point also corresponds to the maximum value of the negative of
the function, -f (x). Thus optimization can be taken to mean
minimization since the maximum of a function can be found by seeking
the minimum of the negative of the same function.
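A one-line consequence, shown here as a short sketch with an assumed example function: any minimization routine can be used for maximization by negating the objective.

```python
# The maximizer of f is the minimizer of -f (example function assumed).
from scipy.optimize import minimize_scalar

f = lambda x: -(x - 3.0)**2 + 5.0            # maximum at x = 3, f = 5
res = minimize_scalar(lambda x: -f(x))       # minimize the negative
print(res.x, -res.fun)                       # -> approx 3.0 and 5.0
```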

1. Introduction
Constraints

• Behaviour constraints: Constraints that represent limitations on
the behaviour or performance of the system are termed behaviour or
functional constraints.

• Side constraints: Constraints that represent physical limitations on
design variables, such as manufacturing limitations.
1. Introduction
Constraint Surface
• For illustration purposes, consider an optimization problem with only
inequality constraints gj(X) ≤ 0. The set of values of X that satisfy
the equation gj(X) = 0 forms a hypersurface in the design space and
is called a constraint surface.

• Note that this is an (n-1)-dimensional subspace, where n is the
number of design variables. The constraint surface divides the
design space into two regions: one in which gj(X) < 0 and the other
in which gj(X) > 0.

• Thus the points lying on the hypersurface satisfy the constraint
gj(X) critically, whereas the points lying in the region where gj(X) > 0
are infeasible or unacceptable, and the points lying in the region
where gj(X) < 0 are feasible or acceptable.

• In the figure below, a hypothetical two-dimensional design space is
depicted where the infeasible region is indicated by hatched lines. A
design point that lies on one or more constraint surfaces is called a
bound point, and the associated constraint is called an active
constraint.

• Design points that do not lie on any constraint surface are known as
free points.

• Depending on whether a particular design point belongs to the
acceptable or unacceptable region, it can be identified as one of the
following four types:
– Free and acceptable point
– Free and unacceptable point
– Bound and acceptable point
– Bound and unacceptable point
1. Introduction
• The conventional design procedures aim at finding an acceptable or
adequate design which merely satisfies the functional and other
requirements of the problem.

• In general, there will be more than one acceptable design, and the
purpose of optimization is to choose the best one of the many
acceptable designs available.

• Thus a criterion has to be chosen for comparing the different
alternative acceptable designs and for selecting the best one.

• The criterion with respect to which the design is optimized, when
expressed as a function of the design variables, is known as the
objective function.
1. Introduction
• In civil engineering, the objective is usually taken as the
minimization of the cost.

• In mechanical engineering, the maximization of the mechanical
efficiency is the obvious choice of an objective function.

• In aerospace structural design problems, the objective function for
minimization is generally taken as weight.

• In some situations, there may be more than one criterion to be
satisfied simultaneously. An optimization problem involving multiple
objective functions is known as a multiobjective programming
problem.
1. Introduction
• With multiple objectives there arises a possibility of conflict, and one
simple way to handle the problem is to construct an overall objective
function as a linear combination of the conflicting multiple objective
functions.

• Thus, if f1(X) and f2(X) denote two objective functions, construct a new
(overall) objective function for optimization as:

$f(\mathbf{X}) = \alpha_1 f_1(\mathbf{X}) + \alpha_2 f_2(\mathbf{X})$

where α1 and α2 are constants whose values indicate the relative
importance of one objective function to the other.
1. Introduction
• The locus of all points satisfying f (X) = c = constant forms a
hypersurface in the design space, and for each value of c there
corresponds a different member of a family of surfaces. These surfaces,
called objective function surfaces, are shown in a hypothetical two-
dimensional design space in the figure below.

1. Introduction
• Once the objective function surfaces are drawn along with the constraint
surfaces, the optimum point can be determined without much difficulty.
• But the main problem is that as the number of design variables exceeds
two or three, the constraint and objective function surfaces become
complex even for visualization and the problem has to be solved purely
as a mathematical problem.

Example
Example:

Design a uniform column of tubular section to carry a compressive load P=2500 kgf
for minimum cost. The column is made up of a material that has a yield stress of 500
kgf/cm2, modulus of elasticity (E) of 0.85e6 kgf/cm2, and density (ρ) of 0.0025 kgf/cm3.
The length of the column is 250 cm. The stress induced in this column should be less
than the buckling stress as well as the yield stress. The mean diameter of the column
is restricted to lie between 2 and 14 cm, and columns with thicknesses outside the
range 0.2 to 0.8 cm are not available in the market. The cost of the column includes
material and construction costs and can be taken as 5W + 2d, where W is the weight
in kilograms force and d is the mean diameter of the column in centimeters.

Example
Example:

The design variables are the mean diameter (d) and tube thickness (t):

$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} d \\ t \end{Bmatrix}$

The objective function to be minimized is given by:

$f(\mathbf{X}) = 5W + 2d = 5 \rho l \pi d t + 2d = 9.82\, x_1 x_2 + 2 x_1$
Example
• The behaviour constraints can be expressed as:

stress induced ≤ yield stress
stress induced ≤ buckling stress

• The induced stress is given by:

$\text{induced stress} = \sigma_i = \frac{P}{\pi d t} = \frac{2500}{\pi x_1 x_2}$
Example
• The buckling stress for a pin-connected column is given by:

$\text{buckling stress} = \sigma_b = \frac{\text{Euler buckling load}}{\text{cross-sectional area}} = \frac{\pi^2 E I}{l^2} \frac{1}{\pi d t}$

where I is the second moment of area of the cross section of the
column, given by:

$I = \frac{\pi}{64}(d_o^4 - d_i^4) = \frac{\pi}{64}(d_o^2 + d_i^2)(d_o + d_i)(d_o - d_i)$
$\quad = \frac{\pi}{64}\left[(d+t)^2 + (d-t)^2\right]\left[(d+t) + (d-t)\right]\left[(d+t) - (d-t)\right]$
$\quad = \frac{\pi}{8}\, d t (d^2 + t^2) = \frac{\pi}{8}\, x_1 x_2 (x_1^2 + x_2^2)$
Example
• Thus, the behaviour constraints can be restated as:

$g_1(\mathbf{X}) = \frac{2500}{\pi x_1 x_2} - 500 \le 0$

$g_2(\mathbf{X}) = \frac{2500}{\pi x_1 x_2} - \frac{\pi^2 (0.85 \times 10^6)(x_1^2 + x_2^2)}{8(250)^2} \le 0$

• The side constraints are given by:

$2 \le d \le 14$
$0.2 \le t \le 0.8$
Example
• The side constraints can be expressed in standard form as:

$g_3(\mathbf{X}) = -x_1 + 2 \le 0$
$g_4(\mathbf{X}) = x_1 - 14 \le 0$
$g_5(\mathbf{X}) = -x_2 + 0.2 \le 0$
$g_6(\mathbf{X}) = x_2 - 0.8 \le 0$
Example
• For a graphical solution, the constraint surfaces are to be
plotted in a two-dimensional design space where the two axes
represent the two design variables x1 and x2. To plot the first
constraint surface, we have:

$g_1(\mathbf{X}) = \frac{2500}{\pi x_1 x_2} - 500 \le 0 \quad \Rightarrow \quad x_1 x_2 \ge 1.593$

• Thus the curve x1x2 = 1.593 represents the constraint surface
g1(X) = 0. This curve can be plotted by finding several points on
the curve. The points on the curve can be found by giving a
series of values to x1 and finding the corresponding values of x2
that satisfy the relation x1x2 = 1.593, as shown in the table below:

x1:  2       4       6       8      10      12      14
x2:  0.7965  0.3983  0.2655  0.199  0.1593  0.1328  0.114
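The table can be reproduced with a few lines of code (a sketch, not from the slides), since each entry is just x2 = 1.593/x1:

```python
# Points on the constraint surface g1(X) = 0, i.e. x1 * x2 = 1.593.
for x1 in (2, 4, 6, 8, 10, 12, 14):
    print(f"x1 = {x1:2d}, x2 = {1.593 / x1:.4f}")
```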

Example
• The infeasible region represented by g1(X)>0 or x1x2< 1.593 is
shown by hatched lines. These points are plotted and a curve P1Q1
passing through all these points is drawn as shown:

Example
• Similarly, the second constraint g2(X) ≤ 0 can be expressed as:

$x_1 x_2 (x_1^2 + x_2^2) \ge 47.3$

• The points lying on the constraint surface g2(X) = 0 can be obtained
as follows (these points are plotted as curve P2Q2):

x1:  2     4      6      8       10      12      14
x2:  2.41  0.716  0.219  0.0926  0.0473  0.0274  0.0172
Example
• The plotting of side constraints is simple since they represent
straight lines.

• After plotting all six constraints, the feasible region is determined
as the bounded area ABCDEA.
Example
• Next, the contours of the objective function are to be plotted before
finding the optimum point. For this, we plot the curves given by:

$f(\mathbf{X}) = 9.82\, x_1 x_2 + 2 x_1 = c = \text{constant}$

for a series of values of c. By giving different values to c, the
contours of f can be plotted with the help of the following points.
Example
• For $f(\mathbf{X}) = 9.82\, x_1 x_2 + 2 x_1 = 50.0$:

x2:  0.1    0.2    0.3    0.4   0.5   0.6   0.7
x1:  16.77  12.62  10.10  8.44  7.24  6.33  5.64

• For $f(\mathbf{X}) = 9.82\, x_1 x_2 + 2 x_1 = 40.0$:

x2:  0.1    0.2    0.3   0.4   0.5   0.6   0.7
x1:  13.40  10.10  8.08  6.75  5.79  5.06  4.51

• For $f(\mathbf{X}) = 9.82\, x_1 x_2 + 2 x_1 = 31.58$ (passing through the corner point C):

x2:  0.1    0.2   0.3   0.4   0.5   0.6  0.7
x1:  10.57  7.96  6.38  5.33  4.57  4    3.56

• For $f(\mathbf{X}) = 9.82\, x_1 x_2 + 2 x_1 = 26.53$ (passing through the corner point B):

x2:  0.1   0.2   0.3   0.4   0.5   0.6   0.7
x1:  8.88  6.69  5.36  4.48  3.84  3.36  2.99
Example
• These contours are shown in the
figure below and it can be seen
that the objective function can not
be reduced below a value of 26.53
(corresponding to point B) without
violating some of the constraints.
Thus, the optimum solution is
given by point B with d*=x1*=5.44
cm and t*=x2*=0.293 cm with
fmin=26.53.
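For reference, the whole graphical construction can be scripted. The sketch below (an assumed reconstruction, not the original figure) plots the two behaviour-constraint boundaries and the objective contours over the side-constraint box:

```python
# Graphical solution of the tubular-column example: constraint
# boundaries g1 = 0, g2 = 0 and objective contours f = c.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import brentq

d = np.linspace(2, 14, 200)                       # 2 <= x1 = d <= 14
f = lambda x1, x2: 9.82 * x1 * x2 + 2 * x1

plt.plot(d, 1.593 / d, label="g1 = 0 (yield)")    # x1*x2 = 1.593
# g2 = 0: solve x1*x2*(x1^2 + x2^2) = 47.3 for x2 at each d
t2 = [brentq(lambda t: di * t * (di**2 + t**2) - 47.3, 1e-4, 10) for di in d]
plt.plot(d, t2, label="g2 = 0 (buckling)")

X1, X2 = np.meshgrid(d, np.linspace(0.2, 0.8, 100))   # 0.2 <= x2 = t <= 0.8
cs = plt.contour(X1, X2, f(X1, X2), levels=[26.53, 31.58, 40.0, 50.0])
plt.clabel(cs); plt.xlabel("x1 = d (cm)"); plt.ylabel("x2 = t (cm)")
plt.legend(); plt.show()
```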

Examples

Design of civil engineering structures:
• variables: width and height of member cross-sections
• constraints: limit stresses, maximum and minimum dimensions
• objective: minimum cost or minimum weight

Analysis of statistical data and building empirical models from
measurements:
• variables: model parameters
• constraints: physical upper and lower bounds for model parameters
• objective: prediction error
Classification of optimization problems

Classification based on:

• Constraints
– Constrained optimization problem
– Unconstrained optimization problem

• Nature of the design variables
– Static optimization problems
– Dynamic optimization problems

• Physical structure of the problem
– Optimal control problems
– Non-optimal control problems

• Nature of the equations involved
– Nonlinear programming problem
– Geometric programming problem
– Quadratic programming problem
– Linear programming problem

• Permissible values of the design variables
– Integer programming problems
– Real-valued programming problems

• Deterministic nature of the variables
– Stochastic programming problem
– Deterministic programming problem

• Separability of the functions
– Separable programming problems
– Non-separable programming problems

• Number of objective functions
– Single-objective programming problem
– Multiobjective programming problem
Geometric Programming
• A geometric programming problem (GMP)
is one in which the objective function and
constraints are expressed as posynomials
in X.

Quadratic Programming Problem
• A quadratic programming problem is a nonlinear programming problem with a
quadratic objective function and linear constraints. It is usually formulated as
follows:

$F(\mathbf{X}) = c + \sum_{i=1}^{n} q_i x_i + \sum_{i=1}^{n} \sum_{j=1}^{n} Q_{ij} x_i x_j$

subject to

$\sum_{i=1}^{n} a_{ij} x_i = b_j, \quad j = 1, 2, \ldots, m$

$x_i \ge 0, \quad i = 1, 2, \ldots, n$

where c, qi, Qij, aij, and bj are constants.
Optimal Control Problem
• An optimal control (OC) problem is
a mathematical programming
problem involving a number of
stages, where each stage evolves
from the preceding stage in a
prescribed manner.
• It is usually described by two
types of variables: the control
(design) and the state variables.
The control variables define the
system and govern the evolution
of the system from one stage to
the next, and the state variables
describe the behaviour or status of
the system in any stage.

Optimal Control Problem
• The problem is to find a set of control or design
variables such that the total objective function
(also known as the performance index) over all
stages is minimized subject to a set of
constraints on the control and state variables.
• An OC problem can be stated as follows:

Find X which minimizes $f(\mathbf{X}) = \sum_{i=1}^{l} f_i(x_i, y_i)$

subject to the constraints

$q_i(x_i, y_i) + y_i = y_{i+1}, \quad i = 1, 2, \ldots, l$
$g_j(x_j) \le 0, \quad j = 1, 2, \ldots, l$
$h_k(y_k) \le 0, \quad k = 1, 2, \ldots, l$

where xi is the ith control variable, yi is the ith state variable, and
fi is the contribution of the ith stage to the total objective function;
gj, hk and qi are functions of xj, yk, and xi and yi, respectively, and
l is the total number of stages.
Integer Programming Problem
• If some or all of the design variables x1, x2, ..., xn of
an optimization problem are restricted to take
on only integer (or discrete) values, the problem
is called an integer programming problem.

• If all the design variables are permitted to take
any real value, the optimization problem is
called a real-valued programming problem.
Stochastic Programming Problem
• A stochastic programming problem is an
optimization problem in which some or all of the
parameters (design variables and/or
preassigned parameters) are probabilistic
(nondeterministic or stochastic).
• In other words, stochastic programming deals
with the solution of the optimization problems in
which some of the variables are described by
probability distributions.

Separable Programming Problem
• A function f (X) is said to be separable if it can be expressed as
the sum of n single-variable functions f1(x1), f2(x2), ..., fn(xn), that is,

$f(\mathbf{X}) = \sum_{i=1}^{n} f_i(x_i)$

• A separable programming problem is one in which the objective
function and the constraints are separable and can be expressed in
standard form as:

Find X which minimizes $f(\mathbf{X}) = \sum_{i=1}^{n} f_i(x_i)$

subject to

$g_j(\mathbf{X}) = \sum_{i=1}^{n} g_{ij}(x_i) \le b_j, \quad j = 1, 2, \ldots, m$

where bj is a constant.
Multiobjective Programming Problem
• A multiobjective programming problem can be stated as follows:

Find X which minimizes f1(X), f2(X), ..., fk(X)

subject to

$g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m$

where f1, f2, ..., fk denote the objective functions to be minimized
simultaneously.
Review of mathematics
Concepts from linear algebra:
Positive definiteness
• Test 1: A matrix A will be positive definite if all its
eigenvalues are positive; that is, all the values of λ that satisfy
the determinantal equation

$|\mathbf{A} - \lambda \mathbf{I}| = 0$

should be positive. Similarly, the matrix A will be negative
definite if its eigenvalues are negative.
Review of mathematics
Positive definiteness
• Test 2: Another test that can be used to find the positive definiteness
of a matrix A of order n involves evaluation of the determinants of its
leading principal submatrices:

$A_1 = a_{11}, \quad A_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad A_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \quad \ldots, \quad A_n = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$

• The matrix A will be positive definite if and only if all the values A1,
A2, A3, ..., An are positive.
• The matrix A will be negative definite if and only if the sign of Aj is
(-1)^j for j = 1, 2, ..., n.
• If some of the Aj are positive and the remaining Aj are zero, the matrix
A will be positive semidefinite.
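Both tests are easy to run numerically; the sketch below (an assumed example matrix, not from the slides) checks the eigenvalues for Test 1 and the leading principal minors A1, ..., An for Test 2:

```python
import numpy as np

A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])    # symmetric example matrix

# Test 1: positive definite iff all eigenvalues are positive
print(np.all(np.linalg.eigvalsh(A) > 0))              # -> True

# Test 2: positive definite iff all leading principal minors are positive
minors = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]
print(minors, all(m > 0 for m in minors))             # -> [2.0, 3.0, 4.0] True
```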
Review of mathematics

Negative definiteness

• Equivalently, a matrix is negative definite if all its
eigenvalues are negative.

• It is positive semidefinite if all its eigenvalues are
greater than or equal to zero.

• It is negative semidefinite if all its eigenvalues are
less than or equal to zero.
Review of mathematics
Concepts from linear algebra:

Nonsingular matrix: The determinant of the matrix is not zero.

Rank: The rank of a matrix A is the order of the largest
nonsingular square submatrix of A, that is, the largest
submatrix with a determinant other than zero.
Review of mathematics
Solutions of a linear problem

Minimize $f(\mathbf{x}) = \mathbf{c}^T \mathbf{x}$
subject to $g(\mathbf{x})$: $\mathbf{A}\mathbf{x} = \mathbf{b}$
side constraints: $\mathbf{x} \ge 0$

• The existence of a solution to this problem depends on the
rows of A.

• If the rows of A are linearly independent, then there is a
unique solution to the system of equations.

• If det(A) is zero, that is, matrix A is singular, there are either
no solutions or infinitely many solutions.
Review of mathematics
Suppose

$\mathbf{A} = \begin{bmatrix} 1 & 1 & 1 \\ -1 & 1 & 0.5 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 3 \\ 1.5 \end{bmatrix}, \quad \mathbf{A}^* = \begin{bmatrix} 1 & 1 & 1 & 3 \\ -1 & 1 & 0.5 & 1.5 \end{bmatrix}$

The new matrix A* is called the augmented matrix: the columns of b are added
to A. According to the theorems of linear algebra:

• If the augmented matrix A* and the matrix of coefficients A have the same rank
r which is less than the number of design variables n (r < n), then there are
many solutions.

• If the augmented matrix A* and the matrix of coefficients A do not have the
same rank, a solution does not exist.

• If the augmented matrix A* and the matrix of coefficients A have the same rank
r = n, where the number of constraints is equal to the number of design variables,
then there is a unique solution.
Review of mathematics
In the example above, the largest square submatrix is a 2 x 2 matrix
(since m = 2 and m < n). Taking the submatrix which includes the first
two columns of A, the determinant has a value of 2 and therefore is
nonsingular. Thus the rank of A is 2 (r = 2). The same columns appear
in A*, making its rank also 2. Since r < n, infinitely many solutions
exist.
Review of mathematics
One way to determine the solutions is to assign (n-r) variables arbitrary
values and use them to determine values for the remaining r variables.
The value n-r is often identified as the degree of freedom of the system
of equations.

In this example, the degree of freedom is 1 (i.e., 3-2). For instance, x3 can
be assigned a value of 1, in which case x1 = 0.5 and x2 = 1.5.
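The rank conditions and this particular solution can be checked directly (a sketch using NumPy; the fixed value x3 = 1 is the one chosen above):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [-1.0, 1.0, 0.5]])
b = np.array([3.0, 1.5])
A_star = np.column_stack([A, b])                   # augmented matrix [A | b]

print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A_star))  # -> 2 2

# degree of freedom n - r = 1: fix x3 = 1, solve the 2x2 system for x1, x2
x3 = 1.0
print(np.linalg.solve(A[:, :2], b - A[:, 2] * x3))  # -> [0.5 1.5]
```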

Homework
What is the solution of the system given below?
Hint: Determine the rank of the matrix of the coefficients and
the augmented matrix.

$g_1: \; x_1 + x_2 = 2$
$g_2: \; -x_1 + x_2 = 1$
$g_3: \; x_1 + 2 x_2 = 1$
2. Classical optimization techniques
Single variable optimization

• Useful in finding the optimum solutions of continuous and
differentiable functions.

• These methods are analytical and make use of the techniques of
differential calculus in locating the optimum points.

• Since some of the practical problems involve objective functions that are
not continuous and/or differentiable, the classical optimization techniques
have limited scope in practical applications.
2. Classical optimization techniques
Single variable optimization

• A function of one variable f (x) has a relative or local minimum at
x = x* if f (x*) ≤ f (x* + h) for all sufficiently small positive and
negative values of h.

• A point x* is called a relative or local maximum if f (x*) ≥ f (x* + h)
for all values of h sufficiently close to zero.

[Figure: local minima, local maxima, and global minima of a single-variable function]
2. Classical optimization techniques
Single variable optimization
• A function f (x) is said to have a global or absolute
minimum at x* if f (x*) ≤ f (x) for all x, and not just for all
x close to x*, in the domain over which f (x) is defined.

• Similarly, a point x* will be a global maximum of f (x) if
f (x*) ≥ f (x) for all x in the domain.
Necessary condition

• If a function f (x) is defined in the interval a ≤ x ≤ b and has a
relative minimum at x = x*, where a < x* < b, and if the derivative
df (x)/dx = f'(x) exists as a finite number at x = x*, then f'(x*) = 0.

• The theorem does not say that the function necessarily will have a
minimum or maximum at every point where the derivative is zero.
For example, f'(x) = 0 at x = 0 for the function shown in the figure,
but this point is neither a minimum nor a maximum. In general, a
point x* at which f'(x*) = 0 is called a stationary point.
Necessary condition

• The theorem does not say what happens if a minimum or a
maximum occurs at a point x* where the derivative fails to exist.
For example, in the figure,

$\lim_{h \to 0} \frac{f(x^* + h) - f(x^*)}{h} = m^+ \text{ (positive) or } m^- \text{ (negative)}$

depending on whether h approaches zero through positive or
negative values, respectively. Unless the numbers m+ and m- are
equal, the derivative f'(x*) does not exist. If f'(x*) does not exist,
the theorem is not applicable.
Sufficient condition

• Let f'(x*) = f''(x*) = ... = f^(n-1)(x*) = 0, but f^(n)(x*) ≠ 0. Then f (x*) is
– A minimum value of f (x) if f^(n)(x*) > 0 and n is even
– A maximum value of f (x) if f^(n)(x*) < 0 and n is even
– Neither a minimum nor a maximum if n is odd
Example
Determine the maximum and minimum values of the function:

$f(x) = 12 x^5 - 45 x^4 + 40 x^3 + 5$

Solution: Since $f'(x) = 60(x^4 - 3x^3 + 2x^2) = 60 x^2 (x-1)(x-2)$,
f'(x) = 0 at x = 0, x = 1, and x = 2.

The second derivative is:

$f''(x) = 60(4 x^3 - 9 x^2 + 4 x)$

At x = 1, f''(x) = -60 and hence x = 1 is a relative maximum. Therefore,
fmax = f (x=1) = 12

At x = 2, f''(x) = 240 and hence x = 2 is a relative minimum. Therefore,
fmin = f (x=2) = -11
Example
Solution cont'd:
At x = 0, f''(x) = 0 and hence we must investigate the next derivative:

$f'''(x) = 60(12 x^2 - 18 x + 4) = 240 \quad \text{at } x = 0$

Since f'''(x) ≠ 0 at x = 0, x = 0 is neither a maximum nor a minimum; it is an
inflection point.
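The whole classification can be automated (a sketch, assuming SymPy): find the stationary points and walk up the derivatives until the first nonzero one, exactly as the sufficient condition prescribes.

```python
import sympy as sp

x = sp.symbols("x")
f = 12*x**5 - 45*x**4 + 40*x**3 + 5

for xs in sp.solve(sp.diff(f, x), x):        # stationary points 0, 1, 2
    n = 2
    d = sp.diff(f, x, n).subs(x, xs)
    while d == 0:                            # first nonvanishing derivative
        n += 1
        d = sp.diff(f, x, n).subs(x, xs)
    if n % 2 == 1:
        kind = "inflection point"
    else:
        kind = "relative minimum" if d > 0 else "relative maximum"
    print(f"x = {xs}: f^({n}) = {d} -> {kind}")
```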

Multivariable optimization with no constraints
• Definition: rth differential of f
If all partial derivatives of the function f through order r ≥ 1
exist and are continuous at a point X*, the polynomial

$d^r f(\mathbf{X}^*) = \underbrace{\sum_{i=1}^{n} \sum_{j=1}^{n} \cdots \sum_{k=1}^{n}}_{r \text{ summations}} h_i h_j \cdots h_k \, \frac{\partial^r f(\mathbf{X}^*)}{\partial x_i \partial x_j \cdots \partial x_k}$

is called the rth differential of f at X*.

• Example: when r = 2 and n = 3, we have

$d^2 f(\mathbf{X}^*) = d^2 f(x_1^*, x_2^*, x_3^*) = \sum_{i=1}^{3} \sum_{j=1}^{3} h_i h_j \frac{\partial^2 f(\mathbf{X}^*)}{\partial x_i \partial x_j}$
$= h_1^2 \frac{\partial^2 f}{\partial x_1^2}(\mathbf{X}^*) + h_2^2 \frac{\partial^2 f}{\partial x_2^2}(\mathbf{X}^*) + h_3^2 \frac{\partial^2 f}{\partial x_3^2}(\mathbf{X}^*)$
$\quad + 2 h_1 h_2 \frac{\partial^2 f}{\partial x_1 \partial x_2}(\mathbf{X}^*) + 2 h_2 h_3 \frac{\partial^2 f}{\partial x_2 \partial x_3}(\mathbf{X}^*) + 2 h_1 h_3 \frac{\partial^2 f}{\partial x_1 \partial x_3}(\mathbf{X}^*)$
Multivariable optimization with no constraints
• Taylor series expansion
The Taylor series expansion of a function f (X) about a point X* is
given by:

$f(\mathbf{X}) = f(\mathbf{X}^*) + df(\mathbf{X}^*) + \frac{1}{2!} d^2 f(\mathbf{X}^*) + \frac{1}{3!} d^3 f(\mathbf{X}^*) + \cdots + \frac{1}{N!} d^N f(\mathbf{X}^*) + R_N(\mathbf{X}^*, \mathbf{h})$

where the last term, called the remainder, is given by:

$R_N(\mathbf{X}^*, \mathbf{h}) = \frac{1}{(N+1)!} d^{N+1} f(\mathbf{X}^* + \theta \mathbf{h}), \quad \text{where } 0 < \theta < 1 \text{ and } \mathbf{h} = \mathbf{X} - \mathbf{X}^*$
Example
Find the second-order Taylor series approximation of the function

$f(x_1, x_2, x_3) = x_2^2 x_3 + x_1 e^{x_3}$

about the point $\mathbf{X}^* = \begin{Bmatrix} 1 \\ 0 \\ -2 \end{Bmatrix}$

Solution: The second-order Taylor series approximation of the function f
about point X* is given by

$f(\mathbf{X}) \approx f(\mathbf{X}^*) + df(\mathbf{X}^*) + \frac{1}{2!} d^2 f(\mathbf{X}^*)$

where

$f(\mathbf{X}^*) = e^{-2}$

$df(\mathbf{X}^*) = h_1 \frac{\partial f}{\partial x_1}(\mathbf{X}^*) + h_2 \frac{\partial f}{\partial x_2}(\mathbf{X}^*) + h_3 \frac{\partial f}{\partial x_3}(\mathbf{X}^*)$
$= \left[ h_1 e^{x_3} + h_2 (2 x_2 x_3) + h_3 (x_2^2 + x_1 e^{x_3}) \right]_{\mathbf{X}^*} = h_1 e^{-2} + h_3 e^{-2}$

$d^2 f(\mathbf{X}^*) = \sum_{i=1}^{3} \sum_{j=1}^{3} h_i h_j \frac{\partial^2 f(\mathbf{X}^*)}{\partial x_i \partial x_j} = \left( h_1^2 \frac{\partial^2 f}{\partial x_1^2} + h_2^2 \frac{\partial^2 f}{\partial x_2^2} + h_3^2 \frac{\partial^2 f}{\partial x_3^2} + 2 h_1 h_2 \frac{\partial^2 f}{\partial x_1 \partial x_2} + 2 h_2 h_3 \frac{\partial^2 f}{\partial x_2 \partial x_3} + 2 h_1 h_3 \frac{\partial^2 f}{\partial x_1 \partial x_3} \right)_{\mathbf{X}^*}$
$= \left[ h_1^2 (0) + h_2^2 (2 x_3) + h_3^2 (x_1 e^{x_3}) + 2 h_1 h_2 (0) + 2 h_2 h_3 (2 x_2) + 2 h_1 h_3 (e^{x_3}) \right]_{\mathbf{X}^*}$
$= -4 h_2^2 + e^{-2} h_3^2 + 2 h_1 h_3 e^{-2}$

Thus, the Taylor series approximation is given by:

$f(\mathbf{X}) \approx e^{-2} + e^{-2}(h_1 + h_3) + \frac{1}{2!}\left( -4 h_2^2 + e^{-2} h_3^2 + 2 h_1 h_3 e^{-2} \right)$

where h1 = x1 - 1, h2 = x2, and h3 = x3 + 2.
Multivariable optimization with no constraints
• Necessary condition
If f (X) has an extreme point (maximum or minimum) at X = X*
and if the first partial derivatives of f (X) exist at X*, then

$\frac{\partial f}{\partial x_1}(\mathbf{X}^*) = \frac{\partial f}{\partial x_2}(\mathbf{X}^*) = \cdots = \frac{\partial f}{\partial x_n}(\mathbf{X}^*) = 0$

• Sufficient condition
A sufficient condition for a stationary point X* to be an
extreme point is that the matrix of second partial derivatives
(Hessian matrix) of f (X) evaluated at X* is
– Positive definite when X* is a relative minimum point
– Negative definite when X* is a relative maximum point
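Both conditions can be verified mechanically; the sketch below (with an assumed example function) solves grad f = 0 and then tests the Hessian's definiteness at each stationary point:

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
f = x1**2 + x1*x2 + 2*x2**2 - 4*x1            # assumed example function

grad = [sp.diff(f, v) for v in (x1, x2)]
H = sp.hessian(f, (x1, x2))

for pt in sp.solve(grad, [x1, x2], dict=True):
    eigs = list(H.subs(pt).eigenvals())       # Hessian eigenvalues at pt
    if all(e > 0 for e in eigs):
        kind = "relative minimum"
    elif all(e < 0 for e in eigs):
        kind = "relative maximum"
    else:
        kind = "not an extreme point / inconclusive"
    print(pt, eigs, kind)
```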
Example
Figure shows two frictionless rigid bodies (carts) A and B connected by
three linear elastic springs having spring constants k1, k2, and k3. The
springs are at their natural positions when the applied force P is zero. Find
the displacements x1 and x2 under the force P by using the principle of
minimum potential energy.

Solution: According to the principle of minimum potential energy, the
system will be in equilibrium under the load P if the potential energy is
a minimum. The potential energy of the system is given by:

Potential energy (U) = strain energy of springs - work done by external forces

$U = \frac{1}{2} k_2 x_1^2 + \frac{1}{2} k_3 (x_2 - x_1)^2 + \frac{1}{2} k_1 x_2^2 - P x_2$

The necessary conditions for the minimum of U are:

$\frac{\partial U}{\partial x_1} = k_2 x_1 - k_3 (x_2 - x_1) = 0$

$\frac{\partial U}{\partial x_2} = k_3 (x_2 - x_1) + k_1 x_2 - P = 0$

which give:

$x_1^* = \frac{P k_3}{k_1 k_2 + k_1 k_3 + k_2 k_3}, \quad x_2^* = \frac{P (k_2 + k_3)}{k_1 k_2 + k_1 k_3 + k_2 k_3}$

Solution cont'd: The sufficiency conditions for the minimum at (x1*, x2*) can also
be verified by testing the positive definiteness of the Hessian matrix of U. The
Hessian matrix of U evaluated at (x1*, x2*) is:

$\mathbf{J}\big|_{(x_1^*, x_2^*)} = \begin{bmatrix} \dfrac{\partial^2 U}{\partial x_1^2} & \dfrac{\partial^2 U}{\partial x_1 \partial x_2} \\ \dfrac{\partial^2 U}{\partial x_1 \partial x_2} & \dfrac{\partial^2 U}{\partial x_2^2} \end{bmatrix} = \begin{bmatrix} k_2 + k_3 & -k_3 \\ -k_3 & k_1 + k_3 \end{bmatrix}$

The determinants of the square submatrices of J are:

$J_1 = |k_2 + k_3| = k_2 + k_3 > 0$

$J_2 = \begin{vmatrix} k_2 + k_3 & -k_3 \\ -k_3 & k_1 + k_3 \end{vmatrix} = k_1 k_2 + k_1 k_3 + k_2 k_3 > 0$

since the spring constants are always positive. Thus the matrix J is positive definite
and hence (x1*, x2*) corresponds to the minimum of potential energy.
Semi-definite case
The sufficient conditions for the case when the Hessian
matrix of the given function is semidefinite:

• In the case of a function of a single variable, the higher-order
derivatives in the Taylor series expansion are investigated.
Semi-definite case
The sufficient conditions for a function of several
variables for the case when the Hessian matrix of the
given function is semidefinite:
• Let the partial derivatives of f of all orders up to the order k ≥ 2 be
continuous in the neighborhood of a stationary point X*, and

$d^r f\big|_{\mathbf{X} = \mathbf{X}^*} = 0, \quad 1 \le r \le k-1; \qquad d^k f\big|_{\mathbf{X} = \mathbf{X}^*} \ne 0$

so that d^k f |X=X* is the first nonvanishing higher-order differential of f
at X*.
• If k is even:
– X* is a relative minimum if d^k f |X=X* is positive definite
– X* is a relative maximum if d^k f |X=X* is negative definite
– If d^k f |X=X* is semidefinite, no general conclusions can be drawn
• If k is odd, X* is not an extreme point of f (X)
Saddle point
• In the case of a function of two variables f (x,y), the
Hessian matrix may be neither positive nor negative
definite at a point (x*,y*) at which

$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = 0$

In such a case, the point (x*,y*) is called a saddle point.

• The characteristic of a saddle point is that it corresponds to
a relative minimum or maximum of f (x,y) with respect to one
variable, say, x (the other variable being fixed at y = y*)
and a relative maximum or minimum of f (x,y) with respect to
the second variable y (the other variable being fixed at x = x*).
Saddle point
Example: Consider the function f (x,y) = x^2 - y^2. For this function:

$\frac{\partial f}{\partial x} = 2x \quad \text{and} \quad \frac{\partial f}{\partial y} = -2y$

These first derivatives are zero at x* = 0 and y* = 0. The Hessian matrix
of f at (x*,y*) is given by:

$\mathbf{J} = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}$

Since this matrix is neither positive definite nor negative definite, the
point (x* = 0, y* = 0) is a saddle point.
Saddle point
Example cont’d:
It can be seen from the figure that f (x, y*) = f (x, 0) has a relative minimum and f
(x*, y) = f (0, y) has a relative maximum at the saddle point (x*, y*).

Example
Find the extreme points of the function

$f(x_1, x_2) = x_1^3 + x_2^3 + 2 x_1^2 + 4 x_2^2 + 6$

Solution: The necessary conditions for the existence of an extreme point are:

$\frac{\partial f}{\partial x_1} = 3 x_1^2 + 4 x_1 = x_1 (3 x_1 + 4) = 0$

$\frac{\partial f}{\partial x_2} = 3 x_2^2 + 8 x_2 = x_2 (3 x_2 + 8) = 0$

These equations are satisfied at the points: (0,0), (0,-8/3), (-4/3,0), and
(-4/3,-8/3).
Example
Solution cont'd: To find the nature of these extreme points,
we have to use the sufficiency conditions. The second-order
partial derivatives of f are given by:

$\frac{\partial^2 f}{\partial x_1^2} = 6 x_1 + 4, \quad \frac{\partial^2 f}{\partial x_2^2} = 6 x_2 + 8, \quad \frac{\partial^2 f}{\partial x_1 \partial x_2} = 0$

The Hessian matrix of f is given by:

$\mathbf{J} = \begin{bmatrix} 6 x_1 + 4 & 0 \\ 0 & 6 x_2 + 8 \end{bmatrix}$
Example
Solution cont'd:

If $J_1 = |6 x_1 + 4|$ and $J_2 = \begin{vmatrix} 6 x_1 + 4 & 0 \\ 0 & 6 x_2 + 8 \end{vmatrix}$, the values of
J1 and J2 and the nature of the extreme point are as given in the next slide:
Example

Point X      Value of J1  Value of J2  Nature of J        Nature of X       f (X)
(0,0)        +4           +32          Positive definite  Relative minimum  6
(0,-8/3)     +4           -32          Indefinite         Saddle point      418/27
(-4/3,0)     -4           -32          Indefinite         Saddle point      194/27
(-4/3,-8/3)  -4           +32          Negative definite  Relative maximum  50/3
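The table can be reproduced programmatically (a sketch assuming SymPy), classifying each stationary point by the leading principal minors J1 and J2 of the Hessian:

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
f = x1**3 + x2**3 + 2*x1**2 + 4*x2**2 + 6
H = sp.hessian(f, (x1, x2))

for pt in sp.solve([sp.diff(f, x1), sp.diff(f, x2)], [x1, x2], dict=True):
    J1, J2 = H[0, 0].subs(pt), H.det().subs(pt)
    if J1 > 0 and J2 > 0:
        nature = "relative minimum"
    elif J1 < 0 and J2 > 0:
        nature = "relative maximum"
    else:
        nature = "saddle point (J2 < 0)"
    print((pt[x1], pt[x2]), J1, J2, nature, f.subs(pt))
```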

Multivariable optimization with equality constraints
• Problem statement:
Minimize f = f (X) subject to gj(X) = 0, j = 1, 2, ..., m, where

$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Bmatrix}$

Here m is less than or equal to n; otherwise the problem becomes
overdefined and, in general, there will be no solution.

• Solution methods:
– Solution by direct substitution
– Solution by the method of constrained variation
– Solution by the method of Lagrange multipliers
Solution by direct substitution

For a problem with n variables and m equality constraints:

• Solve the m equality constraints and express any set of m
variables in terms of the remaining n-m variables.

• Substitute these expressions into the original objective
function; the result is a new objective function involving
only n-m variables.

• The new objective function is not subjected to any
constraint, and hence its optimum can be found by using
the unconstrained optimization techniques.
Solution by direct substitution

• Simple in theory.

• Not convenient from a practical point of view, as the
constraint equations will be nonlinear for most of the
problems.

• Suitable only for simple problems.
Example

Find the dimensions of a box of largest volume that can be inscribed in
a sphere of unit radius.

Solution: Let the origin of the Cartesian coordinate system x1, x2, x3
be at the center of the sphere and the sides of the box be 2x1, 2x2, and
2x3. The volume of the box is given by:

$f(x_1, x_2, x_3) = 8 x_1 x_2 x_3$

Since the corners of the box lie on the surface of the sphere of unit
radius, x1, x2 and x3 have to satisfy the constraint

$x_1^2 + x_2^2 + x_3^2 = 1$
Example

This problem has three design variables and one equality
constraint. Hence the equality constraint can be used to
eliminate any one of the design variables from the
objective function. If we choose to eliminate x3:

$x_3 = (1 - x_1^2 - x_2^2)^{1/2}$

Thus, the objective function becomes:

$f(x_1, x_2) = 8 x_1 x_2 (1 - x_1^2 - x_2^2)^{1/2}$

which can be maximized as an unconstrained function in
two variables.
Example

The necessary conditions for the maximum of f give:

$\frac{\partial f}{\partial x_1} = 8 x_2 \left[ (1 - x_1^2 - x_2^2)^{1/2} - \frac{x_1^2}{(1 - x_1^2 - x_2^2)^{1/2}} \right] = 0$

$\frac{\partial f}{\partial x_2} = 8 x_1 \left[ (1 - x_1^2 - x_2^2)^{1/2} - \frac{x_2^2}{(1 - x_1^2 - x_2^2)^{1/2}} \right] = 0$

which can be simplified as:

$1 - 2 x_1^2 - x_2^2 = 0$
$1 - x_1^2 - 2 x_2^2 = 0$

from which it follows that $x_1^* = x_2^* = 1/\sqrt{3}$ and hence $x_3^* = 1/\sqrt{3}$.
Example

This solution gives the maximum volume of the box as:

$f_{\max} = \frac{8}{3\sqrt{3}}$

To find whether the solution found corresponds to a maximum or a
minimum, we apply the sufficiency conditions to f (x1,x2) of the
equation $f(x_1, x_2) = 8 x_1 x_2 (1 - x_1^2 - x_2^2)^{1/2}$. The
second-order partial derivatives of f at (x1*, x2*) are given by:

$\frac{\partial^2 f}{\partial x_1^2} = -\frac{8 x_1 x_2}{(1 - x_1^2 - x_2^2)^{1/2}} - \frac{8 x_2}{1 - x_1^2 - x_2^2} \left[ \frac{x_1^3}{(1 - x_1^2 - x_2^2)^{1/2}} + 2 x_1 (1 - x_1^2 - x_2^2)^{1/2} \right] = -\frac{32}{\sqrt{3}} \quad \text{at } (x_1^*, x_2^*)$

$\frac{\partial^2 f}{\partial x_2^2} = -\frac{8 x_1 x_2}{(1 - x_1^2 - x_2^2)^{1/2}} - \frac{8 x_1}{1 - x_1^2 - x_2^2} \left[ \frac{x_2^3}{(1 - x_1^2 - x_2^2)^{1/2}} + 2 x_2 (1 - x_1^2 - x_2^2)^{1/2} \right] = -\frac{32}{\sqrt{3}} \quad \text{at } (x_1^*, x_2^*)$

$\frac{\partial^2 f}{\partial x_1 \partial x_2} = 8 (1 - x_1^2 - x_2^2)^{1/2} - \frac{8 x_2^2}{(1 - x_1^2 - x_2^2)^{1/2}} - \frac{8 x_1^2}{1 - x_1^2 - x_2^2} \left[ (1 - x_1^2 - x_2^2)^{1/2} + \frac{x_2^2}{(1 - x_1^2 - x_2^2)^{1/2}} \right] = -\frac{16}{\sqrt{3}} \quad \text{at } (x_1^*, x_2^*)$

Since

$\frac{\partial^2 f}{\partial x_1^2} < 0 \quad \text{and} \quad \frac{\partial^2 f}{\partial x_1^2} \frac{\partial^2 f}{\partial x_2^2} - \left( \frac{\partial^2 f}{\partial x_1 \partial x_2} \right)^2 > 0$

the Hessian matrix of f is negative definite at (x1*, x2*).
Hence the point (x1*, x2*) corresponds to the maximum of f.
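A quick numerical cross-check of this example (a sketch, not part of the original solution) maximizes the reduced objective by minimizing its negative:

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: 8 * x[0] * x[1] * np.sqrt(1 - x[0]**2 - x[1]**2)
res = minimize(lambda x: -f(x), x0=[0.5, 0.5])

print(res.x)                             # -> approx [0.5774, 0.5774] = 1/sqrt(3)
print(-res.fun, 8 / (3 * np.sqrt(3)))    # both approx 1.5396 = f_max
```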

Solution by constrained variation

• Minimize f (x1,x2)
subject to g(x1,x2) = 0

• A necessary condition for f to have a minimum at some point (x1*, x2*)
is that the total derivative of f (x1,x2) with respect to x1 must be zero
at (x1*, x2*):

$df = \frac{\partial f}{\partial x_1} dx_1 + \frac{\partial f}{\partial x_2} dx_2 = 0$

• Since g(x1*, x2*) = 0 at the minimum point, any variations dx1 and dx2
taken about the point (x1*, x2*) are called admissible variations
provided that the new point lies on the constraint:

$g(x_1^* + dx_1, x_2^* + dx_2) = 0$
Solution by constrained variation
• Taylor series expansion of the function about the point (x1*, x2*):

$g(x_1^* + dx_1, x_2^* + dx_2) \approx g(x_1^*, x_2^*) + \frac{\partial g}{\partial x_1}(x_1^*, x_2^*)\, dx_1 + \frac{\partial g}{\partial x_2}(x_1^*, x_2^*)\, dx_2 = 0$

• Since g(x1*, x2*) = 0:

$dg = \frac{\partial g}{\partial x_1} dx_1 + \frac{\partial g}{\partial x_2} dx_2 = 0 \quad \text{at } (x_1^*, x_2^*)$

• Assuming $\partial g / \partial x_2 \ne 0$:

$dx_2 = -\frac{\partial g / \partial x_1}{\partial g / \partial x_2}(x_1^*, x_2^*)\, dx_1$

• Substituting the above equation into $df = \frac{\partial f}{\partial x_1} dx_1 + \frac{\partial f}{\partial x_2} dx_2 = 0$:

$df = \left( \frac{\partial f}{\partial x_1} - \frac{\partial g / \partial x_1}{\partial g / \partial x_2} \frac{\partial f}{\partial x_2} \right)\bigg|_{(x_1^*, x_2^*)} dx_1 = 0$
Solution by constrained variation

$df = \left( \frac{\partial f}{\partial x_1} - \frac{\partial g / \partial x_1}{\partial g / \partial x_2} \frac{\partial f}{\partial x_2} \right)\bigg|_{(x_1^*, x_2^*)} dx_1 = 0$

• The expression on the left-hand side is called the constrained
variation of f.

• Since dx1 can be chosen arbitrarily:

$\left( \frac{\partial f}{\partial x_1} \frac{\partial g}{\partial x_2} - \frac{\partial f}{\partial x_2} \frac{\partial g}{\partial x_1} \right)\bigg|_{(x_1^*, x_2^*)} = 0$

• This equation represents a necessary condition in order to have
(x1*, x2*) as an extreme point (minimum or maximum).
Example

A beam of uniform rectangular cross section is to be cut from a log
having a circular cross section of diameter 2a. The beam has to be
used as a cantilever beam (the length is fixed) to carry a concentrated
load at the free end. Find the dimensions of the beam that correspond
to the maximum tensile (bending) stress carrying capacity.
Example

Solution: From elementary strength of materials, we know that the
tensile stress induced in a rectangular beam (σ) at any fiber located at a
distance y from the neutral axis is given by

$\frac{\sigma}{y} = \frac{M}{I}$

where M is the bending moment acting and I is the moment of inertia
of the cross section about the x axis. If the width and the depth of the
rectangular beam shown in the figure are 2x and 2y, respectively, the
maximum tensile stress induced is given by:

$\sigma_{\max} = \frac{M}{I} y = \frac{M y}{\frac{1}{12}(2x)(2y)^3} = \frac{3}{4} \frac{M}{x y^2}$
Example

Solution cont'd: Thus for any specified bending moment,
the beam is said to have maximum tensile stress carrying
capacity if the maximum induced stress (σmax) is a
minimum. Hence we need to minimize k/(xy^2) or maximize
K xy^2, where k = 3M/4 and K = 1/k, subject to the constraint

$x^2 + y^2 = a^2$

This problem has two variables and one constraint; hence
the equation

$\left( \frac{\partial f}{\partial x_1} \frac{\partial g}{\partial x_2} - \frac{\partial f}{\partial x_2} \frac{\partial g}{\partial x_1} \right)\bigg|_{(x_1^*, x_2^*)} = 0$

can be applied for finding the optimum solution.
Example

Solution: Since

$f(x, y) = k x^{-1} y^{-2}, \qquad g(x, y) = x^2 + y^2 - a^2$

we have:

$\frac{\partial f}{\partial x} = -k x^{-2} y^{-2}, \qquad \frac{\partial g}{\partial x} = 2x$

$\frac{\partial f}{\partial y} = -2k x^{-1} y^{-3}, \qquad \frac{\partial g}{\partial y} = 2y$

The equation $\left( \frac{\partial f}{\partial x_1} \frac{\partial g}{\partial x_2} - \frac{\partial f}{\partial x_2} \frac{\partial g}{\partial x_1} \right)\bigg|_{(x_1^*, x_2^*)} = 0$ gives:

$-k x^{-2} y^{-2} (2y) + 2k x^{-1} y^{-3} (2x) = 0 \quad \text{at } (x^*, y^*)$
Example

Solution: that is,

$y^* = \sqrt{2}\, x^*$

Thus the beam of maximum tensile stress carrying
capacity has a depth of √2 times its breadth. The optimum
values of x and y can be obtained from the above equation
and $g = x^2 + y^2 - a^2 = 0$ as:

$x^* = \frac{a}{\sqrt{3}} \quad \text{and} \quad y^* = \sqrt{2}\, \frac{a}{\sqrt{3}}$
Solution by constrained variation
Necessary conditions for a general problem

• The procedure described can be generalized to a problem with n
variables and m constraints.

• In this case, each constraint equation gj(X) = 0, j = 1, 2, ..., m gives
rise to a linear equation in the variations dxi, i = 1, 2, ..., n.

• Thus, there will be in all m linear equations in n variations. Hence any
m variations can be expressed in terms of the remaining n-m
variations.

• These expressions can be used to express the differential of the
objective function, df, in terms of the n-m independent variations.

• By letting the coefficients of the independent variations vanish in the
equation df = 0, one obtains the necessary conditions for the
constrained optimum of the given function.
Solution by constrained variation
Necessary conditions for a general problem

• These conditions can be expressed as:

$J\left( \frac{f, g_1, g_2, \ldots, g_m}{x_k, x_1, x_2, \ldots, x_m} \right) = \begin{vmatrix} \dfrac{\partial f}{\partial x_k} & \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_m} \\ \dfrac{\partial g_1}{\partial x_k} & \dfrac{\partial g_1}{\partial x_1} & \dfrac{\partial g_1}{\partial x_2} & \cdots & \dfrac{\partial g_1}{\partial x_m} \\ \dfrac{\partial g_2}{\partial x_k} & \dfrac{\partial g_2}{\partial x_1} & \dfrac{\partial g_2}{\partial x_2} & \cdots & \dfrac{\partial g_2}{\partial x_m} \\ \vdots & & & & \vdots \\ \dfrac{\partial g_m}{\partial x_k} & \dfrac{\partial g_m}{\partial x_1} & \dfrac{\partial g_m}{\partial x_2} & \cdots & \dfrac{\partial g_m}{\partial x_m} \end{vmatrix} = 0, \quad k = m+1, m+2, \ldots, n$

• It is to be noted that the variations of the first m variables (dx1, dx2, ...,
dxm) have been expressed in terms of the variations of the remaining
n-m variables (dxm+1, dxm+2, ..., dxn) in deriving the above equation.
Solution by constrained variation
Necessary conditions for a general problem

• This implies that the following relation is satisfied:

$J\left( \frac{g_1, g_2, \ldots, g_m}{x_1, x_2, \ldots, x_m} \right) \ne 0$

• The n-m equations given by the determinantal equation on the
previous slide represent the necessary conditions for the extremum
of f (X) under the m equality constraints gj(X) = 0, j = 1, 2, ..., m.
Example

Minimize $f(\mathbf{Y}) = \frac{1}{2}(y_1^2 + y_2^2 + y_3^2 + y_4^2)$

subject to

$g_1(\mathbf{Y}) = y_1 + 2 y_2 + 3 y_3 + 5 y_4 - 10 = 0$
$g_2(\mathbf{Y}) = y_1 + 2 y_2 + 5 y_3 + 6 y_4 - 15 = 0$

Solution: This problem can be solved by applying the necessary
conditions given by the determinantal equation above.
Example
Solution cont'd: Since n = 4 and m = 2, we have to select two
variables as independent variables. First we show that an arbitrary set
of variables cannot always be chosen as independent variables, since the
remaining (dependent) variables have to satisfy the condition

$J\left( \frac{g_1, g_2, \ldots, g_m}{x_1, x_2, \ldots, x_m} \right) \ne 0$

In terms of the notation of our equations, let us take the independent
variables as x3 = y3 and x4 = y4, so that x1 = y1 and x2 = y2. Then the
Jacobian becomes:

$J\left( \frac{g_1, g_2}{x_1, x_2} \right) = \begin{vmatrix} \dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_2} \\ \dfrac{\partial g_2}{\partial y_1} & \dfrac{\partial g_2}{\partial y_2} \end{vmatrix} = \begin{vmatrix} 1 & 2 \\ 1 & 2 \end{vmatrix} = 0$

and hence the necessary conditions cannot be applied.
Example
Solution cont'd: Next, let us take the independent variables as x3 = y2
and x4 = y4, so that x1 = y1 and x2 = y3. Then the Jacobian becomes:

$J\left( \frac{g_1, g_2}{x_1, x_2} \right) = \begin{vmatrix} \dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_3} \\ \dfrac{\partial g_2}{\partial y_1} & \dfrac{\partial g_2}{\partial y_3} \end{vmatrix} = \begin{vmatrix} 1 & 3 \\ 1 & 5 \end{vmatrix} = 2 \ne 0$

and hence the necessary conditions can be applied.
Example
Solution cont'd: The determinantal equation gives, for k = m+1 = 3:

$J\left( \frac{f, g_1, g_2}{y_2, y_1, y_3} \right) = \begin{vmatrix} \dfrac{\partial f}{\partial y_2} & \dfrac{\partial f}{\partial y_1} & \dfrac{\partial f}{\partial y_3} \\ \dfrac{\partial g_1}{\partial y_2} & \dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_3} \\ \dfrac{\partial g_2}{\partial y_2} & \dfrac{\partial g_2}{\partial y_1} & \dfrac{\partial g_2}{\partial y_3} \end{vmatrix} = \begin{vmatrix} y_2 & y_1 & y_3 \\ 2 & 1 & 3 \\ 2 & 1 & 5 \end{vmatrix} = y_2 (5-3) - y_1 (10-6) + y_3 (2-2) = 2 y_2 - 4 y_1 = 0$
Example
Solution cont'd: For k = m+2 = 4:

$J\left( \frac{f, g_1, g_2}{y_4, y_1, y_3} \right) = \begin{vmatrix} \dfrac{\partial f}{\partial y_4} & \dfrac{\partial f}{\partial y_1} & \dfrac{\partial f}{\partial y_3} \\ \dfrac{\partial g_1}{\partial y_4} & \dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_3} \\ \dfrac{\partial g_2}{\partial y_4} & \dfrac{\partial g_2}{\partial y_1} & \dfrac{\partial g_2}{\partial y_3} \end{vmatrix} = \begin{vmatrix} y_4 & y_1 & y_3 \\ 5 & 1 & 3 \\ 6 & 1 & 5 \end{vmatrix} = y_4 (5-3) - y_1 (25-18) + y_3 (5-6) = 2 y_4 - 7 y_1 - y_3 = 0$

From the two previous equations, the necessary conditions for the
minimum or the maximum of f are obtained as:

$y_1 = \frac{1}{2} y_2$

$y_3 = 2 y_4 - 7 y_1 = 2 y_4 - \frac{7}{2} y_2$
Example
Solution cont'd: When the equations

$y_1 = \frac{1}{2} y_2 \quad \text{and} \quad y_3 = 2 y_4 - \frac{7}{2} y_2$

are substituted, the constraints

$g_1(\mathbf{Y}) = y_1 + 2 y_2 + 3 y_3 + 5 y_4 - 10 = 0$
$g_2(\mathbf{Y}) = y_1 + 2 y_2 + 5 y_3 + 6 y_4 - 15 = 0$

take the form:

$-8 y_2 + 11 y_4 = 10$
$-15 y_2 + 16 y_4 = 15$
Example
Solution cont'd: from which the desired optimum solution can be
obtained as:

$y_1^* = -\frac{5}{74}, \quad y_2^* = -\frac{5}{37}, \quad y_3^* = \frac{155}{74}, \quad y_4^* = \frac{30}{37}$
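Because f is quadratic and the constraints are linear, the optimum also follows from one linear solve of the Lagrangian (KKT) system; the sketch below (an assumed check, not from the slides) confirms the values above:

```python
# For f(Y) = 0.5*||Y||^2 with A Y = b, the minimizer solves
# [[I, A^T], [A, 0]] [Y; lam] = [0; b].
import numpy as np

A = np.array([[1.0, 2.0, 3.0, 5.0],
              [1.0, 2.0, 5.0, 6.0]])
b = np.array([10.0, 15.0])

K = np.block([[np.eye(4), A.T],
              [A, np.zeros((2, 2))]])
sol = np.linalg.solve(K, np.concatenate([np.zeros(4), b]))
print(sol[:4])   # -> [-0.0676 -0.1351  2.0946  0.8108]
                 #    i.e. -5/74, -5/37, 155/74, 30/37
```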

Solution by constrained variation
Sufficiency conditions for a general problem

• By eliminating the first m variables, using the m equality constraints,
the objective function can be made to depend only on the remaining
variables xm+1, xm+2, ..., xn. Then the Taylor series expansion of f, in
terms of these variables, about the extreme point X* gives:

$f(\mathbf{X}^* + d\mathbf{X}) \approx f(\mathbf{X}^*) + \sum_{i=m+1}^{n} \left( \frac{\partial f}{\partial x_i} \right)_g dx_i + \frac{1}{2} \sum_{i=m+1}^{n} \sum_{j=m+1}^{n} \left( \frac{\partial^2 f}{\partial x_i \partial x_j} \right)_g dx_i\, dx_j$

where $(\partial f / \partial x_i)_g$ is used to denote the partial derivative of f with
respect to xi (holding all the other variables xm+1, xm+2, ..., xi-1, xi+1,
xi+2, ..., xn constant) when x1, x2, ..., xm are allowed to change so that
the constraints gj(X* + dX) = 0, j = 1, 2, ..., m are satisfied; the second
derivative $(\partial^2 f / \partial x_i \partial x_j)_g$ is used to denote a similar meaning.
Solution by constrained variation
Example

Consider the problem of minimizing $f(\mathbf{X}) = f(x_1, x_2, x_3)$

subject to the only constraint $g_1(\mathbf{X}) = x_1^2 + x_2^2 + x_3^2 - 8 = 0$

Since n = 3 and m = 1 in this problem, one can think of any of the m
variables, say x1, to be dependent and the remaining n-m variables,
namely x2 and x3, to be independent.

Here the constrained partial derivative $(\partial f / \partial x_2)_g$ means the rate of
change of f with respect to x2 (holding the other independent variable
x3 constant) and at the same time allowing x1 to change about X* so as
to satisfy the constraint g1(X) = 0.
Solution by constrained variation
Example

In the present case, this means that dx1 has to be chosen to satisfy the
relation

$g_1(\mathbf{X}^* + d\mathbf{X}) \approx g_1(\mathbf{X}^*) + \frac{\partial g_1}{\partial x_1}(\mathbf{X}^*)\, dx_1 + \frac{\partial g_1}{\partial x_2}(\mathbf{X}^*)\, dx_2 + \frac{\partial g_1}{\partial x_3}(\mathbf{X}^*)\, dx_3 = 0$

that is,

$2 x_1^* dx_1 + 2 x_2^* dx_2 = 0$

since g1(X*) = 0 at the optimum point and dx3 = 0 (x3 is held constant).
Solution by constrained variation
Example

Notice that $(df/dx_i)_g$ has to be zero for i = m+1, m+2, ..., n since the
dxi appearing in the Taylor series expansion above are all independent.
Thus, the necessary conditions for the existence of a constrained
optimum at X* can also be expressed as:

$\left( \frac{\partial f}{\partial x_i} \right)_g = 0, \quad i = m+1, m+2, \ldots, n$
Solution by constrained variation
Example

It can be shown that the equations

$\left( \frac{\partial f}{\partial x_i} \right)_g = 0, \quad i = m+1, m+2, \ldots, n$

are nothing but the determinantal equation

$J\left( \frac{f, g_1, g_2, \ldots, g_m}{x_k, x_1, x_2, \ldots, x_m} \right) = 0, \quad k = m+1, m+2, \ldots, n$

given earlier.
Sufficiency conditions for a general problem
• A sufficient condition for X* to be a constrained relative minimum
(maximum) is that the quadratic form Q defined by

$Q = \sum_{i=m+1}^{n} \sum_{j=m+1}^{n} \left( \frac{\partial^2 f}{\partial x_i \partial x_j} \right)_g dx_i\, dx_j$

is positive (negative) for all nonvanishing variations dxi, and the matrix

$\begin{bmatrix} \left( \dfrac{\partial^2 f}{\partial x_{m+1}^2} \right)_g & \left( \dfrac{\partial^2 f}{\partial x_{m+1} \partial x_{m+2}} \right)_g & \cdots & \left( \dfrac{\partial^2 f}{\partial x_{m+1} \partial x_n} \right)_g \\ \vdots & & & \vdots \\ \left( \dfrac{\partial^2 f}{\partial x_n \partial x_{m+1}} \right)_g & \left( \dfrac{\partial^2 f}{\partial x_n \partial x_{m+2}} \right)_g & \cdots & \left( \dfrac{\partial^2 f}{\partial x_n^2} \right)_g \end{bmatrix}$

has to be positive (negative) definite to have Q positive (negative) for
all choices of dxi.
Solution by constrained variation
• The computation of the constrained derivatives in the sufficiency
condition is difficult and may be prohibitive for problems with more
than three constraints.

• Simple in theory.

• Difficult to apply, since the necessary conditions involve evaluation of
determinants of order m+1.
Solution by Lagrange multipliers
Problem with two variables and one constraint:

Minimize f (x1,x2)

subject to g(x1,x2) = 0

For this problem, the necessary condition was found to be:

$\left( \frac{\partial f}{\partial x_1} - \frac{\partial f / \partial x_2}{\partial g / \partial x_2} \frac{\partial g}{\partial x_1} \right)\bigg|_{(x_1^*, x_2^*)} = 0$

By defining a quantity λ, called the Lagrange multiplier, as:

$\lambda = -\left( \frac{\partial f / \partial x_2}{\partial g / \partial x_2} \right)\bigg|_{(x_1^*, x_2^*)}$
Solution by Lagrange multipliers
Problem with two variables and one constraint:

Necessary conditions for the point (x1*, x2*) to be an extreme point:

$\left( \frac{\partial f}{\partial x_1} + \lambda \frac{\partial g}{\partial x_1} \right)\bigg|_{(x_1^*, x_2^*)} = 0$

$\left( \frac{\partial f}{\partial x_2} + \lambda \frac{\partial g}{\partial x_2} \right)\bigg|_{(x_1^*, x_2^*)} = 0$

In addition, the constraint equation has to be satisfied at the extreme
point:

$g(x_1, x_2)\big|_{(x_1^*, x_2^*)} = 0$
Solution by Lagrange multipliers
Problem with two variables and one constraint:

• The derivation of the necessary conditions by the method of Lagrange
multipliers requires that at least one of the partial derivatives of
g(x1,x2) be nonzero at an extreme point.

• The necessary conditions are more commonly generated by
constructing a function L, known as the Lagrange function, as

$L(x_1, x_2, \lambda) = f(x_1, x_2) + \lambda\, g(x_1, x_2)$
Solution by Lagrange multipliers
Problem with two variables and one constraint:

• By treating L as a function of the three variables x1, x2 and λ, the
necessary conditions for its extremum are given by:

$\frac{\partial L}{\partial x_1}(x_1, x_2, \lambda) = \frac{\partial f}{\partial x_1}(x_1, x_2) + \lambda \frac{\partial g}{\partial x_1}(x_1, x_2) = 0$

$\frac{\partial L}{\partial x_2}(x_1, x_2, \lambda) = \frac{\partial f}{\partial x_2}(x_1, x_2) + \lambda \frac{\partial g}{\partial x_2}(x_1, x_2) = 0$

$\frac{\partial L}{\partial \lambda}(x_1, x_2, \lambda) = g(x_1, x_2) = 0$
Example
Example: Find the solution of the beam problem using the Lagrange
multiplier method.

Minimize $f(x, y) = k x^{-1} y^{-2}$

subject to

$g(x, y) = x^2 + y^2 - a^2 = 0$

Solution: The Lagrange function is

$L(x, y, \lambda) = f(x, y) + \lambda\, g(x, y) = k x^{-1} y^{-2} + \lambda (x^2 + y^2 - a^2)$

The necessary conditions for the minimum of f (x,y) are:

$\frac{\partial L}{\partial x} = -k x^{-2} y^{-2} + 2 \lambda x = 0$

$\frac{\partial L}{\partial y} = -2 k x^{-1} y^{-3} + 2 \lambda y = 0$

$\frac{\partial L}{\partial \lambda} = x^2 + y^2 - a^2 = 0$
Example
Solution cont'd: which yield:

$2\lambda = \frac{k}{x^3 y^2} = \frac{2k}{x y^4}$

from which the relation $x^* = (1/\sqrt{2})\, y^*$ can be obtained. This relation,
along with $\partial L / \partial \lambda = x^2 + y^2 - a^2 = 0$, gives the optimum solution as:

$x^* = \frac{a}{\sqrt{3}} \quad \text{and} \quad y^* = \sqrt{2}\, \frac{a}{\sqrt{3}}$
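The same stationarity system can be handed to a symbolic solver (a sketch assuming SymPy, with k and a kept symbolic and positive):

```python
import sympy as sp

x, y, lam = sp.symbols("x y lam", positive=True)
k, a = sp.symbols("k a", positive=True)

L = k / (x * y**2) + lam * (x**2 + y**2 - a**2)    # Lagrange function
eqs = [sp.diff(L, v) for v in (x, y, lam)]
print(sp.solve(eqs, [x, y, lam], dict=True))
# expect x = a/sqrt(3), y = sqrt(2)*a/sqrt(3), plus the multiplier lam
```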

Solution by Lagrange multipliers
Necessary conditions for a general problem:

Minimize f (X)

subject to

gj(X) = 0, j = 1, 2, ..., m

The Lagrange function, L, in this case is defined by introducing one
Lagrange multiplier λj for each constraint gj(X) as:

$L(x_1, x_2, \ldots, x_n, \lambda_1, \lambda_2, \ldots, \lambda_m) = f(\mathbf{X}) + \lambda_1 g_1(\mathbf{X}) + \lambda_2 g_2(\mathbf{X}) + \cdots + \lambda_m g_m(\mathbf{X})$
Solution by Lagrange multipliers
By treating L as a function of the n+m unknowns x1, x2, ..., xn, λ1,
λ2, ..., λm, the necessary conditions for the extremum of L, which also
correspond to the solution of the original problem, are given by:

$\frac{\partial L}{\partial x_i} = \frac{\partial f}{\partial x_i} + \sum_{j=1}^{m} \lambda_j \frac{\partial g_j}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n$

$\frac{\partial L}{\partial \lambda_j} = g_j(\mathbf{X}) = 0, \quad j = 1, 2, \ldots, m$

The above equations represent n+m equations in terms of the n+m
unknowns xi and λj.
Solution by Lagrange multipliers
The solution:

$\mathbf{X}^* = \begin{Bmatrix} x_1^* \\ x_2^* \\ \vdots \\ x_n^* \end{Bmatrix} \quad \text{and} \quad \boldsymbol{\lambda}^* = \begin{Bmatrix} \lambda_1^* \\ \lambda_2^* \\ \vdots \\ \lambda_m^* \end{Bmatrix}$

The vector X* corresponds to the relative constrained minimum of
f (X) (sufficient conditions are to be verified) while the vector λ*
provides the sensitivity information.
Solution by Lagrange multipliers
Sufficient condition

A sufficient condition for f (X) to have a constrained relative minimum
at X* is that the quadratic form Q defined by

$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2 L}{\partial x_i \partial x_j} dx_i\, dx_j$

evaluated at X = X* must be positive definite for all values of dX for
which the constraints are satisfied.

If

$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2 L}{\partial x_i \partial x_j}(\mathbf{X}^*, \boldsymbol{\lambda}^*)\, dx_i\, dx_j$

is negative for all choices of the admissible variations dxi, X* will be a
constrained maximum of f (X).
Solution by Lagrange multipliers
A necessary condition for the quadratic form Q to be positive
(negative) definite for all admissible variations dX is that each root z
of the following determinantal equation be positive (negative):

$\begin{vmatrix} L_{11} - z & L_{12} & \cdots & L_{1n} & g_{11} & g_{21} & \cdots & g_{m1} \\ L_{21} & L_{22} - z & \cdots & L_{2n} & g_{12} & g_{22} & \cdots & g_{m2} \\ \vdots & & & \vdots & \vdots & & & \vdots \\ L_{n1} & L_{n2} & \cdots & L_{nn} - z & g_{1n} & g_{2n} & \cdots & g_{mn} \\ g_{11} & g_{12} & \cdots & g_{1n} & 0 & 0 & \cdots & 0 \\ g_{21} & g_{22} & \cdots & g_{2n} & 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots & \vdots & & & \vdots \\ g_{m1} & g_{m2} & \cdots & g_{mn} & 0 & 0 & \cdots & 0 \end{vmatrix} = 0$

where $L_{ij} = \frac{\partial^2 L}{\partial x_i \partial x_j}(\mathbf{X}^*, \boldsymbol{\lambda}^*)$ and $g_{ij} = \frac{\partial g_i}{\partial x_j}(\mathbf{X}^*)$

• The determinantal equation, on expansion, leads to an (n-m)th-order
polynomial in z. If some of the roots of this polynomial are positive
while the others are negative, the point X* is not an extreme point.
Example 1
Find the dimensions of a cylindrical tin (with top and bottom) made
up of sheet metal to maximize its volume such that the total surface
area is equal to A0 = 24π.

Solution

If x1 and x2 denote the radius of the base and length of the tin,
respectively, the problem can be stated as:

Maximize $f(x_1, x_2) = \pi x_1^2 x_2$

subject to

$2\pi x_1^2 + 2\pi x_1 x_2 = A_0 = 24\pi$
Example 1
Solution

Maximize $f(x_1, x_2) = \pi x_1^2 x_2$ subject to $2\pi x_1^2 + 2\pi x_1 x_2 = A_0 = 24\pi$

The Lagrange function is:

$L(x_1, x_2, \lambda) = \pi x_1^2 x_2 + \lambda (2\pi x_1^2 + 2\pi x_1 x_2 - A_0)$

and the necessary conditions for the maximum of f give:

$\frac{\partial L}{\partial x_1} = 2\pi x_1 x_2 + 4\pi \lambda x_1 + 2\pi \lambda x_2 = 0$

$\frac{\partial L}{\partial x_2} = \pi x_1^2 + 2\pi \lambda x_1 = 0$

$\frac{\partial L}{\partial \lambda} = 2\pi x_1^2 + 2\pi x_1 x_2 - A_0 = 0$
Example 1
Solution

$\lambda = -\frac{x_1 x_2}{2 x_1 + x_2} = -\frac{1}{2} x_1$

that is,

$x_1 = \frac{1}{2} x_2$

The above equations give the desired solution as:

$x_1^* = \left( \frac{A_0}{6\pi} \right)^{1/2}, \quad x_2^* = \left( \frac{2 A_0}{3\pi} \right)^{1/2}, \quad \text{and} \quad \lambda^* = -\left( \frac{A_0}{24\pi} \right)^{1/2}$
Example 1
Solution

This gives the maximum value of f as $f^* = \left( \frac{A_0^3}{54\pi} \right)^{1/2}$

If A0 = 24π, the optimum solution becomes

$x_1^* = 2, \quad x_2^* = 4, \quad \lambda^* = -1, \quad \text{and} \quad f^* = 16\pi$

To see that this solution really corresponds to the maximum of f, we
apply the sufficiency condition (the determinantal equation above).
Example 1
Solution
In this case:

$L_{11} = \frac{\partial^2 L}{\partial x_1^2}(\mathbf{X}^*, \lambda^*) = 2\pi x_2^* + 4\pi \lambda^* = 4\pi$

$L_{12} = \frac{\partial^2 L}{\partial x_1 \partial x_2}(\mathbf{X}^*, \lambda^*) = 2\pi x_1^* + 2\pi \lambda^* = 2\pi$

$L_{22} = \frac{\partial^2 L}{\partial x_2^2}(\mathbf{X}^*, \lambda^*) = 0$

$g_{11} = \frac{\partial g_1}{\partial x_1}\bigg|_{(\mathbf{X}^*, \lambda^*)} = 4\pi x_1^* + 2\pi x_2^* = 16\pi$

$g_{12} = \frac{\partial g_1}{\partial x_2}\bigg|_{(\mathbf{X}^*, \lambda^*)} = 2\pi x_1^* = 4\pi$
Example 1
Solution
Thus, the determinantal equation becomes

$\begin{vmatrix} 4\pi - z & 2\pi & 16\pi \\ 2\pi & 0 - z & 4\pi \\ 16\pi & 4\pi & 0 \end{vmatrix} = 0$
Example 1
Solution
that is,

$272 \pi^2 z + 192 \pi^3 = 0 \quad \Rightarrow \quad z = -\frac{12\pi}{17}$

Since the value of z is negative, the point (x1*, x2*) corresponds to the
maximum of f.
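The sufficiency check is easy to reproduce (a sketch assuming SymPy): build the bordered determinant at the optimum and solve for z.

```python
import sympy as sp

z = sp.symbols("z")
pi = sp.pi
M = sp.Matrix([[4*pi - z, 2*pi,  16*pi],
               [2*pi,     -z,    4*pi],
               [16*pi,    4*pi,  0]])
print(sp.solve(M.det(), z))        # -> [-12*pi/17]; negative, so a maximum
print(sp.pi * 2**2 * 4)            # f* = 16*pi at x1* = 2, x2* = 4
```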

Example 2
Find the maximum of the function f (X) = 2x1 + x2 + 10 subject to
g(X) = x1 + 2x2^2 = 3 using the Lagrange multiplier method. Also find the
effect of changing the right-hand side of the constraint on the optimum
value of f.

Solution
The Lagrange function is given by:

$L(\mathbf{X}, \lambda) = 2 x_1 + x_2 + 10 + \lambda (3 - x_1 - 2 x_2^2)$

The necessary conditions for the solution of the problem are:

$\frac{\partial L}{\partial x_1} = 2 - \lambda = 0$

$\frac{\partial L}{\partial x_2} = 1 - 4 \lambda x_2 = 0$

$\frac{\partial L}{\partial \lambda} = 3 - x_1 - 2 x_2^2 = 0$
Example 2
Solution
The solution of these equations is:

$\mathbf{X}^* = \begin{Bmatrix} x_1^* \\ x_2^* \end{Bmatrix} = \begin{Bmatrix} 2.97 \\ 0.13 \end{Bmatrix}, \quad \lambda^* = 2$

The application of the sufficiency condition yields:

$\begin{vmatrix} L_{11} - z & L_{12} & g_{11} \\ L_{21} & L_{22} - z & g_{12} \\ g_{11} & g_{12} & 0 \end{vmatrix} = \begin{vmatrix} -z & 0 & 1 \\ 0 & -4\lambda - z & 4 x_2 \\ 1 & 4 x_2 & 0 \end{vmatrix} = \begin{vmatrix} -z & 0 & 1 \\ 0 & -8 - z & 0.52 \\ 1 & 0.52 & 0 \end{vmatrix} = 0$
Example 2
Solution

$0.2704 z + 8 + z = 0 \quad \Rightarrow \quad z = -6.2972$

Hence X* will be a maximum of f, with f* = f (X*) = 16.07.
Multivariable optimization with inequality constraints

Minimize f (X)

subject to

$g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m$

The inequality constraints can be transformed to equality constraints
by adding nonnegative slack variables yj^2 as

$g_j(\mathbf{X}) + y_j^2 = 0, \quad j = 1, 2, \ldots, m$

where the values of the slack variables are yet unknown.
Multivariable optimization with inequality constraints

Minimize f (X) subject to

$G_j(\mathbf{X}, \mathbf{Y}) = g_j(\mathbf{X}) + y_j^2 = 0, \quad j = 1, 2, \ldots, m$

where $\mathbf{Y} = \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{Bmatrix}$ is the vector of slack variables.

This problem can be solved by the method of Lagrange multipliers.
For this, the Lagrange function L is constructed as:

$L(\mathbf{X}, \mathbf{Y}, \boldsymbol{\lambda}) = f(\mathbf{X}) + \sum_{j=1}^{m} \lambda_j G_j(\mathbf{X}, \mathbf{Y})$

where $\boldsymbol{\lambda} = \begin{Bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_m \end{Bmatrix}$ is the vector of Lagrange multipliers.
Multivariable optimization with
inequality constraints
The stationary points of the Lagrange function can be found by
solving the following equations (necessary conditions):

∂L/∂xi (X, Y, λ) = ∂f/∂xi (X) + Σ_{j=1}^{m} λj ∂gj/∂xi (X) = 0,   i = 1, 2, …, n

∂L/∂λj (X, Y, λ) = Gj(X, Y) = gj(X) + yj² = 0,   j = 1, 2, …, m

∂L/∂yj (X, Y, λ) = 2 λj yj = 0,   j = 1, 2, …, m

This is a system of (n + 2m) equations in (n + 2m) unknowns.

The solution gives the optimum solution vector X*, the Lagrange
multiplier vector λ*, and the slack variable vector Y*.

148
Multivariable optimization with
inequality constraints
Equation

∂L/∂λj (X, Y, λ) = Gj(X, Y) = gj(X) + yj² = 0,   j = 1, 2, …, m

ensures that the constraints

gj(X) ≤ 0,   j = 1, 2, …, m

are satisfied, while the equation

∂L/∂yj (X, Y, λ) = 2 λj yj = 0,   j = 1, 2, …, m

implies that either λj = 0 or yj = 0.

149
Multivariable optimization with
inequality constraints
• If λj = 0, it means that the jth constraint is inactive and hence can be
ignored.

• On the other hand, if yj = 0, it means that the constraint is active
(gj = 0) at the optimum point.

• Consider the division of the constraints into two subsets, J1 and J2,
where J1 ∪ J2 represents the total set of constraints.

• Let the set J1 indicate the indices of those constraints that are active at
the optimum point and J2 include the indices of all the inactive
constraints.

• Those constraints that are satisfied with an equality sign, gj = 0, at the
optimum point are called the active constraints, while those that are
satisfied with a strict inequality sign, gj < 0, are termed inactive
constraints.
150
Multivariable optimization with
inequality constraints
• Thus for j ∈ J1, yj = 0 (constraints are active); for j ∈ J2, λj = 0
(constraints are inactive); and the equation

∂L/∂xi (X, Y, λ) = ∂f/∂xi (X) + Σ_{j=1}^{m} λj ∂gj/∂xi (X) = 0,   i = 1, 2, …, n

can be simplified as:

∂f/∂xi + Σ_{j∈J1} λj ∂gj/∂xi = 0,   i = 1, 2, …, n

151
Multivariable optimization with
inequality constraints
f m g j
 j  0, i  1,2,, n (1)
xi jJ1 xi

Similarly, the equation


L
(X, Y,  )  G j (X, Y)  g j (X)  y 2j  0, , j  1,2,, m
 j
can be written as:
g j ( X)  0, j  J1
(2)
g j ( X)  y  0,
2
j j  J2

The equations (1) and (2) represent n+p+(m-p)=n+m equations in the


n+m unknowns xi (i=1,2,…,n), j (j  J1), and yj (j  J2), where p
denotes the number of active constraints.
152
Multivariable optimization with
inequality constraints
Assuming that the first p constraints are active, the equation

∂f/∂xi + Σ_{j∈J1} λj ∂gj/∂xi = 0,   i = 1, 2, …, n

can be expressed as:

∂f/∂xi = −λ1 ∂g1/∂xi − λ2 ∂g2/∂xi − … − λp ∂gp/∂xi,   i = 1, 2, …, n

These equations can be collectively written as

−∇f = λ1 ∇g1 + λ2 ∇g2 + … + λp ∇gp

where ∇f and ∇gj are the gradients of the objective function and the jth
constraint, respectively.

153
Multivariable optimization with
inequality constraints
∇f = (∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn)ᵀ   and   ∇gj = (∂gj/∂x1, ∂gj/∂x2, …, ∂gj/∂xn)ᵀ

The equation

−∇f = λ1 ∇g1 + λ2 ∇g2 + … + λp ∇gp

indicates that the negative of the gradient of the objective function can
be expressed as a linear combination of the gradients of the active
constraints at the optimum point.

154
Multivariable optimization with
inequality constraints-Feasible region
• A vector S is called a feasible direction from a point X if at least a
small step can be taken along S that does not immediately leave the
feasible region.

• Thus for problems with sufficiently smooth constraint surfaces, a vector
S satisfying the relation

Sᵀ ∇gj < 0

can be called a feasible direction.

155
Multivariable optimization with
inequality constraints-Feasible region
• On the other hand, if the constraint is either linear or concave, any
vector satisfying the relation

Sᵀ ∇gj ≤ 0

can be called a feasible direction.

• The geometric interpretation of a feasible direction is that the vector S
makes an obtuse angle with all the constraint normals.

156
Multivariable optimization with
inequality constraints-Feasible region

(Figure: feasible directions at a boundary point make obtuse angles with
the constraint normals ∇gj.)
157
Multivariable optimization with
inequality constraints
• Further, we can show that in the case of a minimization problem, the λj
values (j ∈ J1) have to be positive. For simplicity of illustration,
suppose that only two constraints (p = 2) are active at the optimum
point.

• Then the equation

−∇f = λ1 ∇g1 + λ2 ∇g2 + … + λp ∇gp

reduces to

−∇f = λ1 ∇g1 + λ2 ∇g2

158
Multivariable optimization with
inequality constraints
• Let S be a feasible direction at the optimum point. By premultiplying
both sides of the equation

−∇f = λ1 ∇g1 + λ2 ∇g2

by Sᵀ, we obtain:

−Sᵀ ∇f = λ1 Sᵀ ∇g1 + λ2 Sᵀ ∇g2

where the superscript T denotes the transpose. Since S is a feasible
direction, it should satisfy the relations:

Sᵀ ∇g1 < 0
Sᵀ ∇g2 < 0

159
Multivariable optimization with
inequality constraints
• Thus if λ1 > 0 and λ2 > 0, the quantity Sᵀ∇f is always positive.

• As ∇f indicates the gradient direction, along which the value of the
function increases at the maximum rate, Sᵀ∇f represents the
component of the increment of f along the direction S.

• If Sᵀ∇f > 0, the function value increases as we move along the
direction S.

• Hence if λ1 and λ2 are positive, we will not be able to find any
direction in the feasible domain along which the function value can be
decreased further.

160
Multivariable optimization with
inequality constraints
• Since the point at which the relations

Sᵀ ∇g1 < 0
Sᵀ ∇g2 < 0

are valid is assumed to be optimum, λ1 and λ2 have to be positive.

• This reasoning can be extended to cases where there are more than
two constraints active. By proceeding in a similar manner, one can
show that the λj values have to be negative for a maximization
problem.

161
Kuhn-Tucker Conditions
• The conditions to be satisfied at a constrained minimum point, X*,
of the problem can be expressed as:

∂f/∂xi + Σ_{j∈J1} λj ∂gj/∂xi = 0,   i = 1, 2, …, n

λj > 0,   j ∈ J1

• These conditions are in general not sufficient to ensure a relative
minimum.

• There is only a class of problems, called convex programming
problems, for which the Kuhn-Tucker conditions are necessary
and sufficient for a global minimum.

162
Kuhn-Tucker Conditions
• Those constraints that are satisfied with an equality sign, gj = 0, at
the optimum point are called the active constraints. If the set of
active constraints is not known, the Kuhn-Tucker conditions can
be stated as follows:

∂f/∂xi + Σ_{j=1}^{m} λj ∂gj/∂xi = 0,   i = 1, 2, …, n
λj gj = 0,   j = 1, 2, …, m
gj ≤ 0,   j = 1, 2, …, m
λj ≥ 0,   j = 1, 2, …, m

163
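These conditions are easy to test numerically. The helper below is our own
illustration (the function names and the sample problem are ours, not the
slides'); it evaluates the Kuhn-Tucker residuals at a candidate (X*, λ*)
using central finite differences:

    # Sketch: numerical Kuhn-Tucker check for minimize f s.t. g_j(x) <= 0.
    import numpy as np

    def grad(fun, x, h=1e-6):
        g = np.zeros_like(x)
        for i in range(x.size):
            e = np.zeros_like(x); e[i] = h
            g[i] = (fun(x + e) - fun(x - e)) / (2*h)
        return g

    def kkt_check(f, gs, x, lams, tol=1e-6):
        stationarity = grad(f, x) + sum(l*grad(g, x) for l, g in zip(lams, gs))
        complementarity = [l*g(x) for l, g in zip(lams, gs)]
        feasible = all(g(x) <= tol for g in gs)
        return (np.allclose(stationarity, 0, atol=1e-4)
                and np.allclose(complementarity, 0, atol=1e-4)
                and feasible and all(l >= 0 for l in lams))

    # Toy problem: minimize (x1-1)^2 + (x2-2)^2 subject to x1 + x2 - 2 <= 0;
    # the optimum is x* = (0.5, 1.5) with lambda* = 1.
    f  = lambda x: (x[0] - 1)**2 + (x[1] - 2)**2
    g1 = lambda x: x[0] + x[1] - 2
    print(kkt_check(f, [g1], np.array([0.5, 1.5]), [1.0]))   # True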
Kuhn-Tucker Conditions
• Note that if the problem is one of maximization or if the
constraints are of the type gj ≥ 0, the λj have to be nonpositive in
the equations below:

∂f/∂xi + Σ_{j=1}^{m} λj ∂gj/∂xi = 0,   i = 1, 2, …, n
λj gj = 0,   j = 1, 2, …, m
gj ≤ 0,   j = 1, 2, …, m
λj ≥ 0,   j = 1, 2, …, m

• On the other hand, if the problem is one of maximization with
constraints in the form gj ≥ 0, the λj have to be nonnegative in the
above equations.

164
Constraint Qualification
• When the optimization problem is stated as:

Minimize f(X)
subject to
gj (X) ≤ 0,   j = 1, 2, …, m
hk(X) = 0,   k = 1, 2, …, p

the Kuhn-Tucker conditions become

∇f + Σ_{j=1}^{m} λj ∇gj − Σ_{k=1}^{p} βk ∇hk = 0

λj gj = 0,   j = 1, 2, …, m
gj ≤ 0,   j = 1, 2, …, m
hk = 0,   k = 1, 2, …, p
λj ≥ 0,   j = 1, 2, …, m

where λj and βk denote the Lagrange multipliers associated with the
constraints gj ≤ 0 and hk = 0, respectively.
165
Constraint Qualification
• Although we found that the Kuhn-Tucker conditions represent the necessary
conditions of optimality, the following theorem gives the precise conditions of
optimality:

• Theorem: Let X* be a feasible solution to the problem of

Minimize f(X)
subject to
gj (X) ≤ 0,   j = 1, 2, …, m
hk(X) = 0,   k = 1, 2, …, p

If ∇gj(X*), j ∈ J1, and ∇hk(X*), k = 1, 2, …, p, are linearly independent,
then there exist λ* and β* such that (X*, λ*, β*) satisfies the
Kuhn-Tucker conditions stated above.
166
Example 1
Consider the problem:

Minimize f(x1, x2) = (x1 − 1)² + x2²

subject to

g1(x1, x2) = x1³ − 2x2 ≤ 0
g2(x1, x2) = x1³ + 2x2 ≤ 0

Determine whether the constraint qualification and the Kuhn-Tucker
conditions are satisfied at the optimum point.

167
Example 1
Solution: The feasible region and the contours of the objective
function are shown in the figure below. It can be seen that the
optimum solution is at (0, 0).

(Figure: feasible region and objective function contours; the optimum
lies at the origin.)
168
Example 1
Solution cont'd: Since g1 and g2 are both active at the optimum
point (0,0), their gradients can be computed as:

∇g1(X*) = [3x1²; −2] evaluated at (0,0) = [0; −2]

∇g2(X*) = [3x1²;  2] evaluated at (0,0) = [0;  2]

It is clear that ∇g1(X*) and ∇g2(X*) are not linearly independent.
Hence the constraint qualification is not satisfied at the optimum
point.

169
Example 1
Solution cont'd: Noting that:

∇f(X*) = [2(x1 − 1); 2x2] evaluated at (0,0) = [−2; 0]

the Kuhn-Tucker conditions can be written, using the equations

∂f/∂xi + Σ_{j∈J1} λj ∂gj/∂xi = 0,   λj > 0, j ∈ J1

as:

−2 + λ1 (0) + λ2 (0) = 0     (E1)
 0 + λ1 (−2) + λ2 (2) = 0    (E2)
λ1 > 0                       (E3)
λ2 > 0                       (E4)

Since equation (E1) cannot be satisfied, and equation (E2) can be satisfied
for negative values of λ1 = λ2 as well, the Kuhn-Tucker conditions are not
satisfied at the optimum point.

170
Example 2

A manufacturing firm producing small refrigerators has entered into


a contract to supply 50 refrigerators at the end of the first month, 50
at the end of the second month, and 50 at the end of the third. The
cost of producing x refrigerators in any month is given by
$(x² + 1000). The firm can produce more refrigerators in any month
and carry them to a subsequent month. However, it costs $20 per
unit for any refrigerator carried over from one month to the next.
Assuming that there is no initial inventory, determine the number of
refrigerators to be produced in each month to minimize the total
cost.

171
Example 2
Solution:

Let x1, x2, x3 represent the number of refrigerators produced in the


first, second and third month respectively. The total cost to be
minimized is given by:

total cost= production cost + holding cost

f(x1, x2, x3) = (x1² + 1000) + (x2² + 1000) + (x3² + 1000) + 20(x1 − 50) + 20(x1 + x2 − 100)
             = x1² + x2² + x3² + 40x1 + 20x2

172
Example 2
Solution cont’d:
The constraints can be stated as:

g1(x1, x2, x3) = x1 − 50 ≥ 0
g2(x1, x2, x3) = x1 + x2 − 100 ≥ 0
g3(x1, x2, x3) = x1 + x2 + x3 − 150 ≥ 0

(Since the constraints are of the gj ≥ 0 type in this minimization problem,
the λj must be nonpositive, as noted earlier.)

The first Kuhn-Tucker condition is given by:

∂f/∂xi + λ1 ∂g1/∂xi + λ2 ∂g2/∂xi + λ3 ∂g3/∂xi = 0,   i = 1, 2, 3

that is,

2x1 + 40 + λ1 + λ2 + λ3 = 0     (E1)
2x2 + 20 + λ2 + λ3 = 0          (E2)
2x3 + λ3 = 0                    (E3)

173
Example 2
Solution cont’d:

The second Kuhn-Tucker condition is given by:

λj gj = 0,   j = 1, 2, 3

that is,

λ1 (x1 − 50) = 0                (E4)
λ2 (x1 + x2 − 100) = 0          (E5)
λ3 (x1 + x2 + x3 − 150) = 0     (E6)

174
Example 2
Solution cont’d:

The third Kuhn-Tucker condition is given by:

gj ≥ 0,   j = 1, 2, 3

that is,

x1 − 50 ≥ 0                (E7)
x1 + x2 − 100 ≥ 0          (E8)
x1 + x2 + x3 − 150 ≥ 0     (E9)

175
Example 2
Solution cont’d:

The fourth Kuhn-Tucker condition (for gj ≥ 0 constraints in a
minimization problem) is given by:

λj ≤ 0,   j = 1, 2, 3

that is,

λ1 ≤ 0     (E10)
λ2 ≤ 0     (E11)
λ3 ≤ 0     (E12)

176
Example 2
Solution cont’d:

The solution of Eqs. (E1) to (E12) can be found in several ways. We
proceed to solve these equations by first noting that either λ1 = 0 or
x1 = 50 according to (E4).

Using this information, we investigate the following cases to identify
the optimum solution of the problem:

• Case I: λ1 = 0

• Case II: x1 = 50

177
Example 2
Solution cont’d:

• Case I: λ1 = 0

Equations (E1) to (E3) give:

x3 = −λ3 / 2

x2 = −10 − λ2/2 − λ3/2     (E13)

x1 = −20 − λ2/2 − λ3/2

178
Example 2
Solution cont’d:

• Case I: λ1 = 0

Substituting Equations (E13) into Eqs. (E5) and (E6) gives:

λ2 (−130 − λ2 − λ3) = 0
                                   (E14)
λ3 (−180 − λ2 − (3/2) λ3) = 0

179
179
Example 2
Solution cont’d:

Case I: λ1 = 0

The four possible solutions of Eqs. (E14) are:

1. λ2 = 0, −180 − λ2 − (3/2) λ3 = 0. These equations, along with Eqs.
(E13), yield the solution:

λ2 = 0,  λ3 = −120,  x1 = 40,  x2 = 50,  x3 = 60

This solution satisfies Eqs. (E10) to (E12) but violates Eqs. (E7)
and (E8) and hence cannot be optimum.
180
180
Example 2
Solution cont’d:

Case I: λ1 = 0

The second possible solution of Eqs. (E14) is:

2. λ3 = 0, −130 − λ2 − λ3 = 0. The solution of these equations leads to:

λ2 = −130,  λ3 = 0,  x1 = 45,  x2 = 55,  x3 = 0

This solution satisfies Eqs. (E10) to (E12) but violates Eqs. (E7)
and (E9) and hence cannot be optimum.

181
Example 2
Solution cont’d:

Case I: λ1 = 0

The third possible solution of Eqs. (E14) is:

3. λ2 = 0, λ3 = 0. Equations (E13) give:

x1 = −20,  x2 = −10,  x3 = 0

This solution satisfies Eqs. (E10) to (E12) but violates the
constraints Eqs. (E7) to (E9) and hence cannot be optimum.

182
Example 2
Solution cont’d:

Case I: λ1 = 0

The fourth possible solution of Eqs. (E14) is:

4. −130 − λ2 − λ3 = 0, −180 − λ2 − (3/2) λ3 = 0. The solution of these
equations and Equations (E13) gives:

λ2 = −30,  λ3 = −100,  x1 = 45,  x2 = 55,  x3 = 50

This solution satisfies Eqs. (E10) to (E12) but violates the
constraint Eq. (E7) and hence cannot be optimum.
183
Example 2
Solution cont’d:

Case II: x1 = 50. In this case, Eqs. (E1) to (E3) give:

λ3 = −2x3
λ2 = −20 − 2x2 − λ3 = −20 − 2x2 + 2x3     (E15)
λ1 = −40 − 2x1 − λ2 − λ3 = −120 + 2x2

Substitution of Eqs. (E15) into Eqs.

λ2 (x1 + x2 − 100) = 0     (E5)
λ3 (x1 + x2 + x3 − 150) = 0     (E6)

gives:

(−20 − 2x2 + 2x3)(x1 + x2 − 100) = 0
                                          (E16)
(−2x3)(x1 + x2 + x3 − 150) = 0
184
Example 2
Solution cont’d:

Case II: x1 = 50. Once again, there are four possible solutions to
Eq. (E16), as indicated below:

1. −20 − 2x2 + 2x3 = 0,  x1 + x2 + x3 − 150 = 0: The solution of these
equations yields:

x1 = 50,  x2 = 45,  x3 = 55

This solution can be seen to violate Eq. (E8), which requires:

x1 + x2 − 100 ≥ 0     (E8)


185
Example 2
Solution cont’d:

Case II: x1 = 50. Once again, there are four possible solutions to
Eq. (E16), as indicated below:

2. −20 − 2x2 + 2x3 = 0,  −2x3 = 0: The solution of these equations
yields:

x1 = 50,  x2 = −10,  x3 = 0

This solution can be seen to violate Eqs. (E8) and (E9), which require:

x1 + x2 − 100 ≥ 0     (E8)
x1 + x2 + x3 − 150 ≥ 0     (E9)
186
Example 2
Solution cont’d:

Case II: x1 = 50. Once again, there are four possible solutions to
Eq. (E16), as indicated below:

3. x1 + x2 − 100 = 0,  −2x3 = 0: The solution of these equations
yields:

x1 = 50,  x2 = 50,  x3 = 0

This solution can be seen to violate Eq. (E9), which requires:

x1 + x2 + x3 − 150 ≥ 0     (E9)

187
Example 2
Solution cont’d:
Case II: x1 = 50. Once again, there are four possible solutions to
Eq. (E16), as indicated below:

4. x1 + x2 − 100 = 0,  x1 + x2 + x3 − 150 = 0: The solution of these
equations yields:

x1 = 50,  x2 = 50,  x3 = 50

This solution can be seen to satisfy all the constraints, Eqs. (E7) to (E9):

x1 − 50 ≥ 0     (E7)
x1 + x2 − 100 ≥ 0     (E8)
x1 + x2 + x3 − 150 ≥ 0     (E9)
188
Example 2
Solution cont’d:
Case II: x1 = 50.

The values of λ1, λ2, and λ3 corresponding to this solution can be
obtained from

λ3 = −2x3
λ2 = −20 − 2x2 + 2x3     (E15)
λ1 = −120 + 2x2

as:

λ1 = −20,  λ2 = −20,  λ3 = −100

189
Example 2
Solution cont’d:
Case II: x1 = 50.

λ1 = −20,  λ2 = −20,  λ3 = −100

Since these values of λi satisfy the requirements

λ1 ≤ 0     (E10)
λ2 ≤ 0     (E11)
λ3 ≤ 0     (E12)

this solution can be identified as the optimum solution. Thus

x1* = 50,  x2* = 50,  x3* = 50

that is, 50 refrigerators should be produced each month, for a minimum
total cost of f* = $10,500.
190
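As a sanity check (ours, not part of the slides), the same optimum is found
by a general-purpose NLP solver; note that SciPy writes inequality
constraints in the form fun(x) ≥ 0:

    # Sketch: verify the production plan with scipy.optimize.minimize (SLSQP).
    import numpy as np
    from scipy.optimize import minimize

    cost = lambda x: sum(xi**2 + 1000 for xi in x) \
                     + 20*(x[0] - 50) + 20*(x[0] + x[1] - 100)

    cons = [{'type': 'ineq', 'fun': lambda x: x[0] - 50},                 # month 1 demand
            {'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 100},        # month 2 demand
            {'type': 'ineq', 'fun': lambda x: x[0] + x[1] + x[2] - 150}] # month 3 demand

    res = minimize(cost, x0=np.array([60.0, 50.0, 40.0]), method='SLSQP',
                   constraints=cons)
    print(np.round(res.x, 3), round(res.fun, 1))   # [50. 50. 50.] 10500.0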
Convex functions
• A function f(X) is said to be convex if for any pair of points

X1 = (x1(1), x2(1), …, xn(1))ᵀ   and   X2 = (x1(2), x2(2), …, xn(2))ᵀ

and all λ, 0 ≤ λ ≤ 1,

f[λX2 + (1 − λ)X1] ≤ λ f(X2) + (1 − λ) f(X1)

that is, if the segment joining the two points lies entirely above or on the
graph of f(X).

• A convex function is always bending upward and hence it is apparent
that the local minimum of a convex function is also a global minimum.
191
Convex functions
• A function f(x) is convex if for any two points x and y, we have

f(y) ≥ f(x) + ∇fᵀ(x) (y − x)

• A function f(X) is convex if the Hessian matrix

H(X) = [∂²f(X) / ∂xi∂xj]

is positive semidefinite.

• Any local minimum of a convex function f(X) is a global minimum.
192
Concave function
• A function f(X) is called a concave function if for any two points X1
and X2, and for all 0 ≤ λ ≤ 1,

f[λX2 + (1 − λ)X1] ≥ λ f(X2) + (1 − λ) f(X1)

that is, if the line segment joining the two points lies entirely below
or on the graph of f(X).

• It can be seen that a concave function bends downward and
hence the local maximum will also be its global maximum.

• It can be seen that the negative of a convex function is a concave
function.

193
Concave function
• Convex and concave functions in one variable

194
Concave function
• Convex and concave functions in two variables

195
Example
Determine whether the following function is convex or concave.

f(x) = eˣ

Solution:

H(x) = d²f/dx² = eˣ > 0 for all real values of x. Hence f(x) is strictly convex.

196
Example
Determine whether the following function is convex or concave.

f(x) = −8x²

Solution:

H(x) = d²f/dx² = −16 < 0 for all real values of x. Hence f(x) is strictly concave.

197
Example
Determine whether the following function is convex or concave.
f(x1, x2) = 2x1³ − 6x2²
Solution:

H(X) = | ∂²f/∂x1²      ∂²f/∂x1∂x2 |  =  | 12x1     0  |
       | ∂²f/∂x1∂x2    ∂²f/∂x2²   |     |   0    −12  |

Here

∂²f/∂x1² = 12x1   { ≥ 0 for x1 ≥ 0;  ≤ 0 for x1 ≤ 0 }

|H(X)| = −144x1   { ≤ 0 for x1 ≥ 0;  ≥ 0 for x1 ≤ 0 }

Hence H(X) will be negative semidefinite and f(X) is concave for x1 ≤ 0

198
Example
Determine whether the following function is convex or concave.

f(x1, x2, x3) = 4x1² + 3x2² + 5x3² + 6x1x2 + x1x3 − 3x1 − 2x2 + 15


Solution:

       | ∂²f/∂x1²     ∂²f/∂x1∂x2   ∂²f/∂x1∂x3 |     | 8   6    1 |
H(X) = | ∂²f/∂x1∂x2   ∂²f/∂x2²     ∂²f/∂x2∂x3 |  =  | 6   6    0 |
       | ∂²f/∂x1∂x3   ∂²f/∂x2∂x3   ∂²f/∂x3²   |     | 1   0   10 |

199
Example
Determine whether the following function is convex or concave.

f(x1, x2, x3) = 4x1² + 3x2² + 5x3² + 6x1x2 + x1x3 − 3x1 − 2x2 + 15


Solution cont'd:
Here the leading principal minors are given by:

|8| = 8 > 0

| 8  6 |
| 6  6 |  =  12 > 0

| 8  6   1 |
| 6  6   0 |  =  114 > 0
| 1  0  10 |

and hence the matrix H(X) is positive definite for all real values of
x1, x2, x3. Therefore, f(X) is a strictly convex function.
200
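A quick numerical confirmation of the same test (our sketch; it assumes
NumPy), using both the leading principal minors and the eigenvalues of H:

    # Sketch: positive-definiteness check of the Hessian by two routes.
    import numpy as np

    H = np.array([[8.0, 6.0, 1.0],
                  [6.0, 6.0, 0.0],
                  [1.0, 0.0, 10.0]])

    minors = [np.linalg.det(H[:k, :k]) for k in (1, 2, 3)]
    print(np.round(minors, 6))       # [  8.  12. 114.] -- all positive
    print(np.linalg.eigvalsh(H))     # all eigenvalues positive as well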
Convex programming problem
• When the optimization problem is stated as:

Minimize f (X)
subject to
gj (X) ≤ 0, j = 1, 2,…,m

it is called a convex programming problem if the objective function f


(X), and the constraint functions, gj (X) are convex.
• Supposing that f(X) and gj(X), j = 1, 2, …, m, are convex functions, the
Lagrange function can be written as:

L(X, Y, λ) = f(X) + Σ_{j=1}^{m} λj [gj(X) + yj²]

201
Convex programming problem

L(X, Y, λ) = f(X) + Σ_{j=1}^{m} λj [gj(X) + yj²]

• If λj ≥ 0, then λj gj(X) is convex, and since λj yj = 0 from

∂L/∂yj (X, Y, λ) = 2 λj yj = 0,   j = 1, 2, …, m

L(X, Y, λ) will be a convex function.

• A necessary condition for f(X) to be a relative minimum at X* is that
L(X, Y, λ) have a stationary point at X*. However, if L(X, Y, λ) is a
convex function, its derivative vanishes only at one point, which must
be an absolute minimum of the function f(X). Thus the Kuhn-Tucker
conditions are both necessary and sufficient for an absolute minimum
of f(X) at X*.

202
Convex programming problem
• If the given optimization problem is known to be a convex programming
problem, there will be no relative minima or saddle points, and hence the
extreme point found by applying the Kuhn-Tucker conditions is guaranteed
to be an absolute minimum of f (X). However, it is often very difficult to
ascertain whether the objective and constraint functions involved in a
practical engineering problem are convex.

203
Linear Programming I:
Simplex method
• Linear programming is an optimization method applicable
for the solution of problems in which the objective function
and the constraints appear as linear functions of the decision
variables.

• Simplex method is the most efficient and popular method


for solving general linear programming problems.

• At least four Nobel prizes were awarded for contributions


related to linear programming (e.g. In 1975, Kantorovich of
the former Soviet Union and T.C. Koopmans of USA were
awarded for application of LP to the economic problem of
allocating resources).

204
Linear Programming I:
Simplex method-Applications
• Petroleum refineries
– choice of buying crude oil from several different sources with
differing compositions and at differing prices
– manufacturing different products such as aviation fuel, diesel fuel,
and gasoline, in varying quantities
– Constraints due to the restrictions on the quantity of the crude oil
from a particular source, the capacity of the refinery to produce a
particular product
– A mix of the purchased crude oil and the manufactured products is
sought that gives the maximum profit
• Optimal production plan in a manufacturing firm
– Pay overtime rates to achieve higher production during periods of
higher demand
• The routing of aircraft and ships can also be decided using
LP
205
Standard Form of a Linear
Programming Problem
• Scalar form

Minimize f(x1, x2, …, xn) = c1x1 + c2x2 + … + cnxn

subject to the constraints

a11x1 + a12x2 + … + a1nxn = b1
a21x1 + a22x2 + … + a2nxn = b2
...
am1x1 + am2x2 + … + amnxn = bm

x1 ≥ 0
x2 ≥ 0
...
xn ≥ 0

where cj, bi, and aij (i = 1, 2, …, m; j = 1, 2, …, n) are known constants,
and the xj are the decision variables.
206
Standard Form of a Linear
Programming Problem
• Matrix form

Minimize f(X) = cᵀX

subject to the constraints

aX = b
X ≥ 0

where

X = (x1, x2, …, xn)ᵀ,   b = (b1, b2, …, bm)ᵀ,   c = (c1, c2, …, cn)ᵀ

    | a11  a12  a13  ...  a1n |
a = | a21  a22  a23  ...  a2n |
    | ...                     |
    | am1  am2  am3  ...  amn |
207
Characteristic of a Linear
Programming Problem

• The objective function is of the minimization type

• All the constraints are of the equality type

• All the decision variables are nonnegative

• The number of the variables in the problem is n. This


includes the slack and surplus variables.

• The number of constraints is m (m < n).

208
Characteristic of a Linear
Programming Problem
• The number of basic variables is m (same as the number of
constraints).

• The number of nonbasic variables is n-m.

• The right-hand-side column b is nonnegative.

• The calculations are organized in a table.

• Only the values of the coefficients are necessary for the


calculations. The table therefore contains only coefficient
values, the matrix A in previous discussions. These are the
coefficients in the constraint equations.
209
Characteristic of a Linear
Programming Problem
• The objective function is the last row in the table. The
constraint coefficients are written first.

• Row operations consist of adding (or subtracting) a definite
multiple of the pivot row to (from) other rows of the table.

210
Transformation of LP Problems into
Standard Form
• The maximization of a function f(x1, x2, …, xn) is equivalent
to the minimization of the negative of the same function.
For example, the objective function

minimize f = c1x1 + c2x2 + … + cnxn

is equivalent to

maximize f′ = −f = −c1x1 − c2x2 − … − cnxn

Consequently, the objective function can be stated in the
minimization form in any linear programming problem.

211
Transformation of LP Problems into
Standard Form
• A variable may be unrestricted in sign in some problems. In
such cases, an unrestricted variable (which can take a
positive, negative or zero value) can be written as the
difference of two nonnegative variables.

• Thus if xj is unrestricted in sign, it can be written as xj = xj′ − xj″,
where

xj′ ≥ 0 and xj″ ≥ 0

• It can be seen that xj will be negative, zero, or positive,
depending on whether xj″ is greater than, equal to, or less
than xj′.

212
Transformation of LP Problems into
Standard Form
If a constraint appears in the form of a "less than or equal
to" type of inequality as:

ak1x1 + ak2x2 + … + aknxn ≤ bk

it can be converted into the equality form by adding a
nonnegative slack variable xn+1 as follows:

ak1x1 + ak2x2 + … + aknxn + xn+1 = bk

213
Transformation of LP Problems into
Standard Form
If a constraint appears in the form of a "greater than or equal
to" type of inequality as:

ak1x1 + ak2x2 + … + aknxn ≥ bk

it can be converted into the equality form by subtracting a
variable as:

ak1x1 + ak2x2 + … + aknxn − xn+1 = bk

where xn+1 is a nonnegative variable known as a surplus
variable.

214
Geometry of LP Problems
Example: A manufacturing firm produces two machine
parts using lathes, milling machines, and grinding machines.
The different machining times required for each part, the
machining times available for different machines, and the
profit on each machine part are given in the following table.
Determine the number of parts I and II to be manufactured
per week to maximize the profit.
Type of machine      Machine time required (min)      Maximum time available
                     Part I          Part II          per week (min)
Lathes               10              5                2500
Milling machines     4               10               2000
Grinding machines    1               1.5              450
Profit per unit      $50             $100
215
Geometry of LP Problems
Solution: Let the machine parts I and II manufactured per
week be denoted by x and y, respectively. The constraints
due to the maximum time limitations on the various
machines are given by:

10x + 5y ≤ 2500     (E1)
4x + 10y ≤ 2000     (E2)
x + 1.5y ≤ 450      (E3)

Since the variables x and y cannot take negative values, we
have

x ≥ 0, y ≥ 0     (E4)

216
Geometry of LP Problems
Solution: The total profit is given by:

f = 50x + 100y     (E5)

Thus the problem is to determine the
nonnegative values of x and y that
satisfy the constraints stated in
Eqs. (E1) to (E4) and maximize the
objective function given by Eq. (E5).
The inequalities (E1) to (E4) can be
plotted in the xy plane and the feasible
region identified as shown in the
figure. Our objective is to find at least
one point out of the infinite points in
the shaded region in the figure which
maximizes the profit function (E5).
217
Geometry of LP Problems
Solution: The contours of the objective
function, f, are defined by the linear
equation:

50x + 100y = k = constant     (E6)

As k is varied, the objective function
line is moved parallel to itself. The
maximum value of f is the largest k
whose objective function line has at
least one point in common with the
feasible region. Such a point can be
identified as point G in the figure.
The optimum solution corresponds to
x* = 187.5, y* = 125, and a
profit of $21,875.00.
218
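The same optimum can be reproduced in a couple of lines (our sketch,
assuming SciPy is available); linprog minimizes, so the profit
coefficients are negated:

    # Sketch: the machine-parts LP with scipy.optimize.linprog.
    from scipy.optimize import linprog

    c = [-50, -100]                    # maximize 50x + 100y  ->  minimize -f
    A_ub = [[10, 5],                   # lathe minutes
            [4, 10],                   # milling minutes
            [1, 1.5]]                  # grinding minutes
    b_ub = [2500, 2000, 450]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2, method="highs")
    print(res.x, -res.fun)             # [187.5 125. ] 21875.0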
Geometry of LP Problems
Solution: In some cases, the optimum
solution may not be unique. For
example, if the profit rates for the
machine parts I and II are $40 and
$100 instead of $50 and $100,
respectively, the contours of the profit
function will be parallel to side CG of
the feasible region as shown in the
figure. In this case, line P″Q″, which
coincides with the boundary line CG,
will correspond to the maximum
feasible profit. Thus there is no
unique optimal solution to the
problem, and any point between C and
G on line P″Q″ can be taken as an
optimum solution with a profit value
of $20,000. 219
Geometry of LP Problems
Solution: There are three other
possibilities. In some problems,
the feasible region may not be
a closed convex polygon. In
such a case, it may happen that
the profit level can be
increased to an infinitely large
value without leaving the
feasible region, as shown in the
figure. In this case, the
solution of the linear
programming problem is said
to be unbounded. 220
Geometry of LP problems
• On the other extreme, the constraint set may be empty in
some problems. This could be due to the inconsistency
of the constraints; or, sometimes, even though the
constraints may be consistent, no point satisfying the
constraints may also satisfy the nonnegativity
restrictions.

• The last possible case is when the feasible region


consists of a single point. This can occur only if the
number of constraints is at least equal to the number of
variables. A problem of this kind is of no interest to us
since there is only one feasible point and there is nothing
to be optimized.
221
Solution of LP Problems

A linear programming problem may have

• A unique and finite optimum solution

• An infinite number of optimal solutions

• An unbounded solution

• No solution

• A unique feasible point


222
Solution of LP Problems

A linear programming problem may have a unique and finite


optimum solution:

The condition necessary for this to occur is:

• The objective function and the constraints have dissimilar


slopes

• The feasible region is bounded/closed.

223
Solution of LP Problems

A linear programming problem may have an infinite number of solutions.

The condition necessary for this to occur is:

• The objective function must be parallel to one of the
constraints.

224
Solution of LP Problems

A linear programming problem may have no solution.

The condition necessary for this to occur is:

• There is no point that is feasible. There is no solution to the


problem.

225
Solution of LP Problems

A linear programming problem may have unbounded


solution.

• In this case, the feasible region is not bounded.

• The presence of an unbounded solution also suggests that
the formulation of the problem may be lacking. Additional
meaningful constraints can be accommodated to define the
solution.

226
Geometrical characteristics of the
graphical solution of LP Problems
• The feasible region is a convex polygon*

• The optimum value occurs at an extreme point or vertex of


the feasible region

*A convex polygon consists of a set of points having the property that the line
segment joining any two points in the set is entirely in the convex set. In
problems having more than two decision variables, the feasible region is called
a convex polyhedron

227
Definitions
• Point in n-Dimensional Space A point X in an n-dimensional space is
characterized by an ordered set of n values or coordinates (x1,x2,…,xn).
The coordinates of X are also called the components of X.

• Line segment in n-Dimensions (L) If the coordinates of two points A
and B are given by xj(1) and xj(2) (j = 1, 2, …, n), the line segment (L)
joining these points is the collection of points X(λ) whose coordinates
are given by xj(λ) = λ xj(1) + (1 − λ) xj(2), j = 1, 2, …, n, with
0 ≤ λ ≤ 1. Thus,

L = { X | X = λ X(1) + (1 − λ) X(2) }

In one dimension, it is easy to see that the definition is in accordance
with our experience:

x(2) − x(λ) = λ [x(2) − x(1)],   0 ≤ λ ≤ 1

(number line: 0 ... x(1) ... x(λ) ... x(2), with A at x(1) and B at x(2))

whence

x(λ) = λ x(1) + (1 − λ) x(2),   0 ≤ λ ≤ 1
228


Definitions
• Hyperplane In n-dimensional space, the set of points whose coordinates
satisfy a linear equation

a1x1 + … + anxn = aᵀX = b

is called a hyperplane.

A hyperplane, H, is represented as

H(a, b) = { X | aᵀX = b }

A hyperplane has n−1 dimensions in an n-dimensional space. For example,
in three-dimensional space it is a plane, and in two-dimensional space it
is a line.
229
Definitions
• Hyperplane continued: The set of points whose coordinates satisfy a
linear inequality like

a1x1 + … + anxn ≤ b

is called a closed half space, closed due to the inclusion of an equality
sign in the inequality above. A hyperplane partitions the n-dimensional
space (Eⁿ) into two closed half-spaces, so that

H⁺ = { X | aᵀX ≥ b }
H⁻ = { X | aᵀX ≤ b }

230
Definitions
• Convex set: A convex set is a collection of points such that if X(1) and
X(2) are any two points in the collection, the line segment joining them
is also in the collection. A convex set, S, can be defined mathematically
as follows:

If X(1), X(2) ∈ S, then X ∈ S, where

X = λ X(1) + (1 − λ) X(2),   0 ≤ λ ≤ 1

A set containing only one point is always considered to be convex. Some
examples of convex sets in two dimensions are shown shaded.

231
Definitions
Nonconvex sets:

• The sets depicted by the shaded region on the figure below are not
convex.

• The L-shaped region, for example, is not a convex set because it is


possible to find two points a and b in the set such that not all points on the
line joining them belong to the set.

232
Definitions
Convex Polyhedron and Convex Polytope:

• A convex polyhedron is a set of points common to one or more half spaces


• A convex polyhedron that is bounded is called a convex polytope

233
Definitions
Vertex or Extreme Point:

• This is a point in the convex set that does not lie on a line segment joining
two other points of the set. e.g.
– Every point on the circumference of a circle
– Each corner point of a polygon

Feasible Solution:

• In an LP problem, any solution that satisfies the constraints


aX = b
X ≥ 0

is called a feasible solution.

234
Definitions
Basic Solution:

• A basic solution is one in which n−m variables are set equal to zero. A
basic solution can be obtained by setting n−m variables to zero and
solving the constraint equations

aX = b

simultaneously. In general, for n design variables and m constraints, the
number of basic solutions is given by the combination

( n )        n!
(   )  =  -----------
( m )     m! (n − m)!
Basis:

• The collection of variables not set equal to zero to obtain the basic solution
is called the basis.

235
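For small systems, the basic solutions can simply be enumerated. The sketch
below is our own illustration (the 2×3 system is made up for the example):
it fixes n−m variables at zero and solves for the remaining m:

    # Sketch: enumerate the basic solutions of aX = b by brute force.
    import itertools
    import numpy as np

    def basic_solutions(a, b):
        m, n = a.shape
        for cols in itertools.combinations(range(n), m):   # candidate basis
            B = a[:, cols]
            if abs(np.linalg.det(B)) < 1e-12:              # singular basis: skip
                continue
            x = np.zeros(n)
            x[list(cols)] = np.linalg.solve(B, b)
            yield cols, x

    a = np.array([[1.0, 1.0, 1.0],
                  [1.0, -1.0, 2.0]])
    b = np.array([4.0, 2.0])
    for basis, x in basic_solutions(a, b):
        tag = "basic feasible" if (x >= 0).all() else "infeasible"
        print(basis, x, tag)   # C(3,2) = 3 basic solutions, two of them feasible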
Definitions
Basic Feasible Solution:

• This is a basic solution that satisfies the nonnegativity conditions of


X0
Nondegenerate Feasible Solution:

• This is a basic feasible solution that has got exactly m positive xi


Optimal Solution:

• A feasible solution that optimizes (minimizes) the objective function is


called an optimal solution

Optimal Basic Solution:

• This is a basic feasible solution for which the objective function is optimal
236
Theorems
Theorem 1:

• The intersection of any number of convex sets is also convex

Theorem 2:

• The feasible region of a linear programming problem is convex

237
Theorems
Theorem 3:

• Any local minimum solution is global for a linear programming problem

238
Theorems
Theorem 4:

• Every basic feasible solution is an extreme point of the convex set of


feasible solutions
Theorem 5:

• Let S be a closed, bounded convex polyhedron with Xi^e, i = 1 to p, as
the set of its extreme points. Then any vector X ∈ S can be written as:

X = Σ_{i=1}^{p} λi Xi^e

λi ≥ 0,   Σ_{i=1}^{p} λi = 1

Theorem 6:

• Let S be a closed convex polyhedron. Then the minimum of a linear


function over S is attained at an extreme point of S.
239
Solution of a system of linear
simultaneous equations
• Consider the following system of n equations in n unknowns:

a11x1 + a12x2 + … + a1nxn = b1     (E1)
a21x1 + a22x2 + … + a2nxn = b2     (E2)
a31x1 + a32x2 + … + a3nxn = b3     (E3)
...
an1x1 + an2x2 + … + annxn = bn     (En)

• Assuming that this set of equations possesses a unique solution, a method
of solving the system consists of reducing the equations to a form known
as canonical form

240
Solution of a system of linear
simultaneous equations
• The system:

a11x1 + a12x2 + … + a1nxn = b1     (E1)
a21x1 + a22x2 + … + a2nxn = b2     (E2)
a31x1 + a32x2 + … + a3nxn = b3     (E3)
...
an1x1 + an2x2 + … + annxn = bn     (En)

will not be altered under the following elementary operations:

– Any equation Er is replaced by the equation kEr, where k is a nonzero constant
– Any equation Er is replaced by the equation Er + kEs, where Es is any other
equation of the system

241
Solution of a system of linear
simultaneous equations
Pivot operations
• By making use of the elementary operations, the system can be reduced
to a convenient equivalent form as follows. Let us select some variable xi
and try to eliminate it from all the equations except the jth one (for
which aji is nonzero). This can be accomplished by dividing the jth
equation by aji and subtracting aki times the result from each of the
other equations, k = 1, 2, …, j−1, j+1, …, n. The resulting system of
equations can be written as:

a′11x1 + a′12x2 + … + a′1,i−1 xi−1 + 0xi + a′1,i+1 xi+1 + … + a′1n xn = b′1
a′21x1 + a′22x2 + … + a′2,i−1 xi−1 + 0xi + a′2,i+1 xi+1 + … + a′2n xn = b′2
...
a′j1x1 + a′j2x2 + … + a′j,i−1 xi−1 + 1xi + a′j,i+1 xi+1 + … + a′jn xn = b′j
...
a′n1x1 + a′n2x2 + … + a′n,i−1 xi−1 + 0xi + a′n,i+1 xi+1 + … + a′nn xn = b′n

where the primes indicate that the a′ij and b′j are changed from the
original system. This procedure of eliminating a particular variable from
all but one of the equations is called a pivot operation.
242
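A pivot operation is only a few lines of code. The function below (our
sketch, using NumPy) divides the pivot row and eliminates the pivot column
from every other row, exactly as described above:

    # Sketch: a single pivot operation on an augmented matrix [a | b].
    import numpy as np

    def pivot(T, j, i):
        """Eliminate variable x_i from all rows except row j (pivot on T[j, i])."""
        T[j, :] = T[j, :] / T[j, i]           # make the pivot element equal to 1
        for k in range(T.shape[0]):
            if k != j:
                T[k, :] -= T[k, i] * T[j, :]  # zero out column i in row k
        return T

    T = np.array([[2.0, 1.0, 4.0],            # 2x1 +  x2 = 4
                  [1.0, 3.0, 7.0]])           #  x1 + 3x2 = 7
    pivot(T, 0, 0)
    pivot(T, 1, 1)
    print(T)   # [[1. 0. 1.], [0. 1. 2.]]  ->  canonical form: x1 = 1, x2 = 2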
Solution of a system of linear
simultaneous equations
• The system of equations

a′11x1 + … + 0xi + … + a′1n xn = b′1
a′21x1 + … + 0xi + … + a′2n xn = b′2
...
a′j1x1 + … + 1xi + … + a′jn xn = b′j
...
a′n1x1 + … + 0xi + … + a′nn xn = b′n

produced by the pivot operation has exactly the same solution as
the original set:

a11x1 + a12x2 + … + a1nxn = b1
a21x1 + a22x2 + … + a2nxn = b2
a31x1 + a32x2 + … + a3nxn = b3
...
an1x1 + an2x2 + … + annxn = bn
243
Solution of linear simultaneous
equations - Canonical form
• Next, if we perform a new pivot operation by eliminating xs, s ≠ i, in all
the equations except the tth equation, t ≠ j, the zeros and the 1 in the
ith column will not be disturbed. The pivotal operations can be repeated
by using a different variable and equation each time until the system of
equations is reduced to the form:

1x1 + 0x2 + 0x3 + … + 0xn = b″1
0x1 + 1x2 + 0x3 + … + 0xn = b″2
0x1 + 0x2 + 1x3 + … + 0xn = b″3
...
0x1 + 0x2 + 0x3 + … + 1xn = b″n

This system of equations is said to be in canonical form and has been
obtained after carrying out n pivot operations.
244
Solution of linear simultaneous
equations - Canonical form
• From the canonical form, the solution vector can be directly obtained
as:

xi = b″i,   i = 1, 2, …, n

• Thus the solution above is the desired solution of the original system:

a11x1 + a12x2 + … + a1nxn = b1
a21x1 + a22x2 + … + a2nxn = b2
a31x1 + a32x2 + … + a3nxn = b3
...
an1x1 + an2x2 + … + annxn = bn

245
Pivotal reduction of a general system
of equations
• Instead of a square system, let us consider a system of m equations in
n variables with n ≥ m. This system of equations is assumed to be
consistent so that it will have at least one solution:

a11x1 + a12x2 + … + a1nxn = b1
a21x1 + a22x2 + … + a2nxn = b2
...
am1x1 + am2x2 + … + amnxn = bm

• The solution vectors X that satisfy the above equations are not evident
from the equations. However, it is possible to reduce this system to an
equivalent canonical system from which at least one solution can
readily be deduced.

246
Pivotal reduction of a general system of
equations
• If pivotal operations with respect to any set of m variables, say,
x1, x2, …, xm, are carried out, the resulting set of equations can be
written as follows (canonical system with pivotal variables x1, x2, …, xm):

1x1 + 0x2 + … + 0xm + a″1,m+1 xm+1 + … + a″1n xn = b″1
0x1 + 1x2 + … + 0xm + a″2,m+1 xm+1 + … + a″2n xn = b″2
...
0x1 + 0x2 + … + 1xm + a″m,m+1 xm+1 + … + a″mn xn = b″m

(pivotal variables)   (nonpivotal or independent variables)   (constants)

• One special solution that can always be deduced is:

xi = { b″i,   i = 1, 2, …, m
     { 0,     i = m+1, m+2, …, n

• This solution is called a basic solution, since the solution vector
contains no more than m nonzero terms.

247
247
Pivotal reduction of a general system
of equations
• One special solution that can always be deduced is:

xi = { b″i,   i = 1, 2, …, m
     { 0,     i = m+1, m+2, …, n

• The pivotal variables xi, i = 1, 2, …, m, are called the basic variables
and the other variables xi, i = m+1, m+2, …, n, are called the nonbasic
variables.

• This is not the only solution, but it is the one most readily deduced
from the canonical system

1x1 + 0x2 + … + 0xm + a″1,m+1 xm+1 + … + a″1n xn = b″1
0x1 + 1x2 + … + 0xm + a″2,m+1 xm+1 + … + a″2n xn = b″2
...
0x1 + 0x2 + … + 1xm + a″m,m+1 xm+1 + … + a″mn xn = b″m
248
Pivotal reduction of a general system
of equations cont’d
If all b″i, i = 1, 2, …, m, in the solution given by

xi = { b″i,   i = 1, 2, …, m
     { 0,     i = m+1, m+2, …, n

are nonnegative, it satisfies the equations

aX = b
X ≥ 0

and hence it can be called a basic feasible solution.

249
Motivation of the simplex method

• Given a system in canonical form corresponding to a basic solution,


we have seen how to move to a neighboring basic solution by a pivot
operation

• Thus one way to find the optimal solution of the given LP problem is
to generate all the basic solutions and pick the one that is feasible and
corresponds to the optimal value of the objective function

• This can be done because the optimal solution, if one exists, always
occurs at an extreme point or vertex of the feasible domain.

250
Motivation of the simplex method

• If there are m equality constraints in n variables with n ≥ m, a basic
solution can be obtained by setting any n−m of the variables equal to
zero.

• The number of basic solutions to be inspected is thus equal to the
number of ways in which m variables can be selected from a set of n
variables, that is,

( n )        n!
(   )  =  -----------
( m )     (n − m)! m!

• e.g. If n = 10 and m=5, we have 252 basic solutions


If n = 20 and m=10, we have 184756 basic solutions

251
Motivation of the simplex method
(contd)
• Usually we do not have to inspect all these basic solutions
since many of them will be infeasible

• However, for large values of n and m , this is still a very large


number to inspect one by one

• Hence, what we really need is a computational scheme that
examines a sequence of basic feasible solutions, each of
which corresponds to a lower value of f, until a minimum is
reached.

• The simplex method of Dantzig is a powerful scheme for


obtaining a basic feasible solution.

252
Motivation of the simplex method
(contd)
• The simplex method of Dantzig is a powerful
scheme for obtaining a basic feasible solution.

• If the solution is not optimal, the method provides


for finding a neighboring basic feasible solution that
has a lower or equal value of f .

• The process is repeated until, in a finite number of


steps, an optimum is found.

253
Motivation of the simplex method
(contd)
• The first step involved in the simplex method is to construct
an auxiliary problem by introducing certain variables known
as artificial variables into the standard form of the linear
programming problem.

• The primary aim of adding artificial variables is to bring the


resulting auxiliary problem into a canonical form from which
its basic feasible solution can be obtained immediately.

• Starting from this canonical form, the optimal solution of the


original linear programming problem is sought in two phases.
254
Motivation of the simplex method
(contd)
• The first phase is intended to find a basic feasible solution to the original
linear programming problem.
– It consists of a sequence of pivot operations that produces a succession of
different canonical forms from which the optimal solution of the auxiliary
problem can be found.

• The second phase is intended to find the optimal solution of the original
LP problem.
– It consists of a second sequence of pivot operations that enables us to move
from one basic feasible solution to the next of the original LP problem.

• In this process, the optimal solution of the problem, if one exists, will be
identified. The sequence of different canonical forms that is necessary in
both phases of the simplex method is generated according to the simplex
algorithm.

255
Simplex Algorithm
• The starting point of the simplex algorithm is always a set of equations,
which includes the objective function along with the equality constraints
of the problem in canonical form.
• Thus, the objective of the simplex algorithm is to find the vector X ≥ 0
that minimizes the function f(X) and satisfies the equations:

1x1 + 0x2 + … + 0xm + a″1,m+1 xm+1 + … + a″1n xn = b″1
0x1 + 1x2 + … + 0xm + a″2,m+1 xm+1 + … + a″2n xn = b″2
...
0x1 + 0x2 + … + 1xm + a″m,m+1 xm+1 + … + a″mn xn = b″m
0x1 + 0x2 + … + 0xm − f + c″m+1 xm+1 + … + c″n xn = −f″0

where the a″ij, c″j, b″i, and f″0 are constants. Notice that (−f) is
treated as a basic variable in the canonical form.
256
Simplex Algorithm
• The basic solution which can be deduced is:

xi = b″i,   i = 1, 2, …, m
f = f″0
xi = 0,   i = m+1, m+2, …, n

• In Phase I of the simplex method, the basic solution corresponding to


the canonical form obtained after the introduction of the artificial
variables will be feasible for the auxiliary problem

• Phase II of the simplex method starts with a basic feasible solution of


the original linear programming problem. Hence the initial canonical
form at the start of the simplex algorithm will always be a basic
feasible solution.
257
Simplex Algorithm
• We know from Theorem
‘Let S be a closed convex polyhedron. Then the minimum of a linear
function over S is attained at an extreme point of S.’
that the optimal solution of a linear programming problem lies at one of
the basic feasible solutions.

• Since the simplex algorithm is intended to move from one basic


feasible solution to the other through pivotal operations, before moving
to the next basic feasible solution, we have to make sure that the
present basic feasible solution is not the optimal solution.

• By merely glancing at the numbers c″j, j = m+1, …, n, we can tell
whether or not the present basic feasible solution is optimal.

258
Simplex Algorithm
Identifying an optimal point:
• Theorem: A basic feasible solution is an optimal solution with a
minimum objective function value of f″0 if all the cost coefficients c″j,
j = m+1, m+2, …, n, in the canonical system

1x1 + 0x2 + … + 0xm + a″1,m+1 xm+1 + … + a″1n xn = b″1
0x1 + 1x2 + … + 0xm + a″2,m+1 xm+1 + … + a″2n xn = b″2
...
0x1 + 0x2 + … + 1xm + a″m,m+1 xm+1 + … + a″mn xn = b″m
0x1 + 0x2 + … + 0xm − f + c″m+1 xm+1 + … + c″n xn = −f″0

are nonnegative.

259
Improving a nonoptimal basic feasible
solution
• From the last row of the canonical system above, we can write the
objective function as:

f = f″0 + Σ_{i=1}^{m} c″i xi + Σ_{j=m+1}^{n} c″j xj = f″0

for the solution given by

xi = b″i,   i = 1, 2, …, m
xi = 0,   i = m+1, m+2, …, n

(in the canonical form the cost coefficients c″i of the basic variables
are zero)
260
Improving a nonoptimal basic feasible
solution
• If at least one c″j is negative, the value of f can be reduced by making
the corresponding xj > 0.

• In other words, the nonbasic variable xj whose cost coefficient c″j is
negative is to be made a basic variable in order to reduce the value of
the objective function.

• At the same time, due to the pivotal operation, one of the current basic
variables will become nonbasic, and hence the values of the new basic
variables are to be adjusted in order to bring the value of f below f″0.

261
Improving a nonoptimal basic feasible
solution
• If there is more than one c″j < 0, the index s of the nonbasic variable xs
which is to be made basic is chosen such that

c″s = minimum { c″j < 0 }

• Although this may not lead to the greatest possible decrease in f (since it
may not be possible to increase xs very far), this is intuitively at least a
good rule for choosing the variable to become basic.

• It is the one generally used in practice because it is simple and it usually
leads to fewer iterations than just choosing any c″j < 0.

• If there is a tie in applying the above rule (i.e., if more than one c″j
has the same minimum value), we select one of them arbitrarily as c″s.
262
Improving a nonoptimal basic feasible
solution
• Having decided on the variable xs to become basic, we increase it from
zero holding all other nonbasic variables zero and observe the effect on
the current basic variables. From the canonical system we can obtain:

x1 = b″1 − a″1s xs,   b″1 ≥ 0
x2 = b″2 − a″2s xs,   b″2 ≥ 0
...
xm = b″m − a″ms xs,   b″m ≥ 0

f = f″0 + c″s xs,   c″s < 0

263
Improving a nonoptimal basic feasible
solution
• Since c″s < 0, the equation

f = f″0 + c″s xs,   c″s < 0

suggests that the value of xs should be made as large as possible in order
to reduce the value of f as much as possible.

• However, in the process of increasing the value of xs, some of the
variables xi (i = 1, 2, …, m) in the equations

x1 = b″1 − a″1s xs,   b″1 ≥ 0
x2 = b″2 − a″2s xs,   b″2 ≥ 0
...
xm = b″m − a″ms xs,   b″m ≥ 0

may become negative.

264
Improving a nonoptimal basic feasible
solution
• If all the coefficients a″is ≤ 0, then xs can be made infinitely large
without making any xi < 0, i = 1, 2, …, m.

• In such a case, the minimum value of f is minus infinity and the linear
programming problem is said to have an unbounded solution.

• If at least one a″is is positive, the maximum value that xs can take
without making xi negative is b″i / a″is.

• If there is more than one a″is > 0, the largest value xs* that xs can take
is given by the minimum of the ratios b″i / a″is for which a″is > 0. Thus,

xs* = b″r / a″rs = minimum { b″i / a″is : a″is > 0 }

265
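Both choices fit in a few lines. The sketch below is ours: it picks the
entering column by the most negative c″j and the leaving row by the
minimum-ratio rule; the data are the first tableau of the worked example
that follows a few slides ahead:

    # Sketch: entering-variable and minimum-ratio (leaving-variable) rules.
    import numpy as np

    def choose_pivot(c_red, A, b):
        s = int(np.argmin(c_red))             # most negative reduced cost
        if c_red[s] >= 0:
            return None                       # already optimal
        pos = A[:, s] > 0
        if not pos.any():
            raise ValueError("unbounded: f can be decreased indefinitely")
        ratios = np.where(pos, b / np.where(pos, A[:, s], 1.0), np.inf)
        return int(np.argmin(ratios)), s      # pivot row r, pivot column s

    A = np.array([[2.0, 1.0, -1.0],           # nonbasic columns x1, x2, x3
                  [2.0, -1.0, 5.0],
                  [4.0, 1.0, 1.0]])
    b = np.array([2.0, 6.0, 6.0])
    c_red = np.array([-1.0, -2.0, -1.0])
    print(choose_pivot(c_red, A, b))  # (0, 1): x2 enters; first row (x4) leaves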
Improving a nonoptimal basic feasible
solution
• The choice of r in the case of a tie, assuming that all b″i > 0, is
arbitrary.

• If any b″i for which a″is > 0 is zero in the equations

x1 = b″1 − a″1s xs,   b″1 ≥ 0
x2 = b″2 − a″2s xs,   b″2 ≥ 0
...

then xs cannot be increased by any amount. Such a solution is called a
degenerate solution.

266
Improving a nonoptimal basic feasible
solution
• In the case of a nondegenerate basic feasible solution, a new basic
feasible solution can be constructed with a lower value of the objective
function.

• This is done by substituting the value of xs* given by

xs* = b″r / a″rs = minimum { b″i / a″is : a″is > 0 }

into

x1 = b″1 − a″1s xs,   b″1 ≥ 0
x2 = b″2 − a″2s xs,   b″2 ≥ 0
...
xm = b″m − a″ms xs,   b″m ≥ 0

f = f″0 + c″s xs,   c″s < 0

267
Improving a nonoptimal basic feasible
solution (contd)
• We obtain

xs = xs*
xi = b″i − a″is xs* ≥ 0,   i = 1, 2, …, m and i ≠ r
xr = 0
xj = 0,   j = m+1, m+2, …, n and j ≠ s

f = f″0 + c″s xs* ≤ f″0

which can readily be seen to be a feasible solution different from the
previous one.

268
Improving a nonoptimal basic feasible
solution (contd)
• Since a″rs > 0, a single pivot operation on the element a″rs in the
canonical system

1x1 + 0x2 + … + 0xm + a″1,m+1 xm+1 + … + a″1n xn = b″1
0x1 + 1x2 + … + 0xm + a″2,m+1 xm+1 + … + a″2n xn = b″2
...
0x1 + 0x2 + … + 1xm + a″m,m+1 xm+1 + … + a″mn xn = b″m
0x1 + 0x2 + … + 0xm − f + c″m+1 xm+1 + … + c″n xn = −f″0

will lead to a new canonical form from which the basic feasible solution

xs = xs*
xi = b″i − a″is xs* ≥ 0,   i = 1, 2, …, m and i ≠ r
xr = 0
xj = 0,   j = m+1, m+2, …, n and j ≠ s

can easily be deduced.


269
Improving a nonoptimal basic feasible
solution (contd)
• Also, the equation

f = f″0 + c″s xs* ≤ f″0

shows that this basic feasible solution corresponds to a lower objective
function value compared to that of

xi = b″i,   i = 1, 2, …, m
f = f″0
xi = 0,   i = m+1, m+2, …, n

• This basic feasible solution can again be tested for optimality by seeing
whether all c″j ≥ 0 in the new canonical form.

• If the solution is not optimal, the entire procedure of moving to another
basic feasible solution from the present one has to be repeated.

270
Improving a nonoptimal basic feasible
solution (contd)
• In the simplex algorithm, this procedure is repeated in an
iterative manner until the algorithm finds either

1. A class of feasible solutions for which f → −∞, or

2. An optimal basic feasible solution with all c″j ≥ 0, j = m+1, …, n.

• Since there are only a finite number of ways to choose a set
of m basic variables out of n variables, the iterative process
of the simplex algorithm will terminate in a finite number of
cycles.
271
Example
Maximize F = x1 + 2x2 + x3
subject to

2x1 + x2 − x3 ≤ 2
−2x1 + x2 − 5x3 ≥ −6
4x1 + x2 + x3 ≤ 6
xi ≥ 0,   i = 1, 2, 3

272
Example
Solution: We first change the sign of the objective function
to convert it to a minimization problem, and the signs of the
inequalities where necessary, so as to obtain nonnegative
values for the bi (to see whether an initial basic feasible
solution can be obtained readily). The resulting problem can
be stated as:

Minimize f = −x1 − 2x2 − x3
subject to

2x1 + x2 − x3 ≤ 2
2x1 − x2 + 5x3 ≤ 6
4x1 + x2 + x3 ≤ 6
xi ≥ 0,   i = 1, 2, 3
273
Example
Solution cont'd: By introducing the slack variables x4 ≥ 0,
x5 ≥ 0, and x6 ≥ 0, the system of equations can be stated in
canonical form as:

2x1 + x2 − x3 + x4 = 2
2x1 − x2 + 5x3 + x5 = 6
4x1 + x2 + x3 + x6 = 6     (E1)
−x1 − 2x2 − x3 − f = 0

where x4, x5, x6, and −f can be treated as basic variables.

274
Example
Solution cont'd: The basic solution corresponding to Eqs.
(E1) is given by:

x4 = 2,  x5 = 6,  x6 = 6   (basic variables)
x1 = x2 = x3 = 0           (nonbasic variables)     (E2)
f = 0

which can be seen to be feasible.

Since the cost coefficients corresponding to the nonbasic
variables in Eqs. (E1) are negative (c″1 = −1, c″2 = −2,
c″3 = −1), the present solution given by Eqs. (E2) is not optimum.
275
Example
Solution cont'd: To improve the present basic feasible
solution, we first decide the variable xs to be brought into the
basis as:

c″s = min { c″j < 0 } = c″2 = −2

Thus x2 enters the next basic set. To obtain the new canonical
form, we select the pivot element a″rs such that

b″r / a″rs = min { b″i / a″is : a″is > 0 }
276
Example
Solution cont'd: In the present case, s = 2, and a″12 and a″32
are > 0. Since b″1/a″12 = 2/1 and b″3/a″32 = 6/1, the smaller ratio
occurs in the first row, so xr = x4 leaves the basis. By pivoting
on a″12, the new system of equations can be obtained as:

2x1 + 1x2 − x3 + x4 = 2
4x1 + 0x2 + 4x3 + x4 + x5 = 8
2x1 + 0x2 + 2x3 − x4 + x6 = 4     (E3)
3x1 + 0x2 − 3x3 + 2x4 − f = 4
277
Example
Solution cont'd: The basic feasible solution corresponding
to this canonical system is given by:

x2 = 2,  x5 = 8,  x6 = 4   (basic variables)
x1 = x3 = x4 = 0           (nonbasic variables)     (E4)
f = −4

Since c″3 = −3, the present solution is not optimum. As
c″s = min { c″j < 0 } = c″3, xs = x3 enters the next basis.
278
Example
Solution cont'd: To find the pivot element a″rs, we form the ratios
b″i / a″is for a″is > 0. In the equations below, only a″23 and a″33 are > 0:

2x1 + 1x2 − x3 + x4 = 2
4x1 + 0x2 + 4x3 + x4 + x5 = 8
2x1 + 0x2 + 2x3 − x4 + x6 = 4     (E3)
3x1 + 0x2 − 3x3 + 2x4 − f = 4

Hence:

b″2 / a″23 = 8/4 = 2   and   b″3 / a″33 = 4/2 = 2
279
Example
Solution cont'd: Since both of these ratios are the same, we arbitrarily
select a″23 as the pivot element. Pivoting on a″23 gives the following
canonical system of equations:

3x1 + 1x2 + 0x3 + (5/4)x4 + (1/4)x5 = 4
1x1 + 0x2 + 1x3 + (1/4)x4 + (1/4)x5 = 2
0x1 + 0x2 + 0x3 − (3/2)x4 − (1/2)x5 + x6 = 0     (E5)
6x1 + 0x2 + 0x3 + (11/4)x4 + (3/4)x5 − f = 10

The basic feasible solution corresponding to this canonical form is:

x2 = 4,  x3 = 2,  x6 = 0   (basic variables)
x1 = x4 = x5 = 0           (nonbasic variables)     (E6)
f = −10
280
Example
Variables

Basic       x1   x2   x3   x4   x5   x6   -f  |  b″i   b″i/a″is for a″is > 0
variables
x4           2    1   -1    1    0    0    0  |   2     2  <- smaller one (x4 drops
x5           2   -1    5    0    1    0    0  |   6        from the next basis)
x6           4    1    1    0    0    1    0  |   6     6
-f          -1   -2   -1    0    0    0    1  |   0

Most negative c″j = −2 (x2 enters the next basis)
281
Example
Result of pivoting
Variables

Basic       x1   x2   x3   x4   x5   x6   -f  |  b″i   b″i/a″is for a″is > 0
variables
x2           2    1   -1    1    0    0    0  |   2
x5           4    0    4    1    1    0    0  |   8     2  <- tie; selected arbitrarily
x6           2    0    2   -1    0    1    0  |   4     2     (x5 drops from the next basis)
-f           3    0   -3    2    0    0    1  |   4

Most negative c″j = −3 (x3 enters the next basis)
282
Example
Result of pivoting:

Basic     |  x1   x2   x3    x4    x5   x6   -f |  bi”
variables |                                     |
----------+-------------------------------------+------
x2        |   3    1    0   5/4   1/4    0    0 |   4
x3        |   1    0    1   1/4   1/4    0    0 |   2
x6        |   0    0    0  -3/2  -1/2    1    0 |   0
-f        |   6    0    0  11/4   3/4    0    1 |  10

All ci” are ≥ 0, and hence the present solution is optimum.
283
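As a cross-check (not part of the original solution), the same problem can be handed to a library LP solver. The sketch below assumes scipy is available; maximizing F is expressed as minimizing f = -F:

from scipy.optimize import linprog

c = [-1, -2, -1]                    # minimize f = -x1 - 2x2 - x3
A_ub = [[2,  1, -1],                # 2x1 + x2 -  x3 <= 2
        [2, -1,  5],                # 2x1 - x2 + 5x3 <= 6
        [4,  1,  1]]                # 4x1 + x2 +  x3 <= 6
b_ub = [2, 6, 6]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
print(res.x, res.fun)               # expected: x = (0, 4, 2) with f = -10

This agrees with (E6): x2 = 4, x3 = 2, and fmin = -10 (i.e., Fmax = 10).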
Example 2
Unbounded solution
Minimize f = -3x1 - 2x2
subject to

x1 - x2 ≤ 1
3x1 - 2x2 ≤ 6
x1 ≥ 0, x2 ≥ 0
284
Example 2
Unbounded solution
Solution: Introducing the slack variables x3 ≥ 0 and x4 ≥ 0,
the given system of equations can be written in canonical
form as:

 x1 -  x2 + x3      = 1
3x1 - 2x2      + x4 = 6    (E1)
-3x1 - 2x2      - f = 0

The basic feasible solution corresponding to this canonical
form is given by:

x3 = 1, x4 = 6 (basic variables)
x1 = x2 = 0 (nonbasic variables)    (E2)
f = 0
285
Example 2
Unbounded solution
Solution: Since the cost coefficients corresponding to the
nonbasic variables are negative, the solution given by
Eqs. (E2) is not optimum. Hence the simplex procedure is
applied to the canonical system of Eqs. (E1), starting from
the solution of Eqs. (E2). The computations are done in
tableau form as shown on the next slides.
287
Example 2
Unbounded solution
Basic     |  x1   x2   x3   x4   -f |  bi” | bi”/ais” for ais” > 0
variables |                         |      |
----------+-------------------------+------+----------------------
x3        |   1   -1    1    0    0 |   1  | 1  <- smaller value (x3 leaves the basis)
x4        |   3   -2    0    1    0 |   6  | 2
-f        |  -3   -2    0    0    1 |   0  |

Most negative ci” (x1 enters the next basis)

288
Example 2
Results of pivoting
Basic     |  x1   x2   x3   x4   -f |  bi” | bi”/ais” for ais” > 0
variables |                         |      |
----------+-------------------------+------+----------------------
x1        |   1   -1    1    0    0 |   1  |
x4        |   0    1   -3    1    0 |   3  | 3  <- smaller value (x4 leaves the basis)
-f        |   0   -5    3    0    1 |   3  |

Most negative ci” (x2 enters the next basis)

289
Example 2
Results of pivoting
Basic     |  x1   x2   x3   x4   -f |  bi” | bi”/ais” for ais” > 0
variables |                         |      |
----------+-------------------------+------+----------------------
x1        |   1    0   -2    1    0 |   4  | both ai3” are negative, i.e.,
x2        |   0    1   -3    1    0 |   3  | no variable leaves the basis
-f        |   0    0  -12    5    1 |  18  |

Most negative ci” (x3 enters the next basis)

290
Example 2
Unbounded solution
•  At this stage, we notice that x3 has the most negative
   cost coefficient, and hence it should be brought into the
   next basis.

•  However, since all the coefficients ai3” are negative, the
   value of f can be decreased indefinitely without violating
   any of the constraints if we bring x3 into the basis. Hence
   the problem has no bounded solution.

•  In general, if all the coefficients of the entering variable
   xs (ais”) have negative or zero values at any iteration, we
   can conclude that the problem has an unbounded solution.
291
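A library solver reports the same conclusion. The sketch below assumes scipy's linprog, whose HiGHS backend signals an unbounded problem with status code 3:

from scipy.optimize import linprog

c = [-3, -2]                        # minimize f = -3x1 - 2x2
A_ub = [[1, -1],                    #  x1 -  x2 <= 1
        [3, -2]]                    # 3x1 - 2x2 <= 6
b_ub = [1, 6]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
print(res.status, res.message)      # status 3: the problem is unbounded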
Example 3
Infinite number of solutions
Minimize f = -40x1 - 100x2

subject to

10x1 + 5x2 ≤ 2500
4x1 + 10x2 ≤ 2000
2x1 + 3x2 ≤ 900
x1 ≥ 0, x2 ≥ 0
292
Example 3
Infinite number of solutions
Solution: By adding the slack variables x3 ≥ 0, x4 ≥ 0, and
x5 ≥ 0, the equations can be written in canonical form as
follows:

10x1 +   5x2 + x3           = 2500
 4x1 +  10x2      + x4      = 2000
 2x1 +   3x2           + x5 = 900
-40x1 - 100x2           - f = 0
293
Example 3
Infinite number of solutions
Basic     |  x1    x2   x3   x4   x5   -f |  bi”  | bi”/ais” for ais” > 0
variables |                               |       |
----------+-------------------------------+-------+----------------------
x3        |  10     5    1    0    0    0 | 2500  | 500
x4        |   4    10    0    1    0    0 | 2000  | 200  <- smaller value (x4 leaves the basis)
x5        |   2     3    0    0    1    0 |  900  | 300
-f        | -40  -100    0    0    0    1 |    0  |

Most negative ci” (x2 enters the next basis)

294
Example 3
Infinite number of solutions
Basic     |   x1   x2   x3     x4   x5   -f |   bi”
variables |                                 |
----------+---------------------------------+--------
x3        |    8    0    1   -1/2    0    0 |   1500
x2        | 4/10    1    0   1/10    0    0 |    200
x5        | 8/10    0    0  -3/10    1    0 |    300
-f        |    0    0    0     10    0    1 | 20,000
295
Example 3
Infinite number of solutions
•  Since all ci” ≥ 0, the present solution is optimum. The optimum values are
   given by:

   x2 = 200, x3 = 1500, x5 = 300 (basic variables)
   x1 = x4 = 0 (nonbasic variables)
   fmin = -20,000

•  It can be observed from the last row of the preceding tableau that the cost
   coefficient corresponding to the nonbasic variable x1 (c1”) is zero. This is
   an indication that an alternative solution exists.

•  Here x1 can be brought into the basis, and the resulting new solution will
   also be a basic feasible solution. For example, introducing x1 into the basis
   in place of x3 (i.e., by pivoting on a13”), we obtain the new canonical
   system of equations shown in the following tableau:
296
Example 3: Infinite number of
solutions
Basic     |  x1   x2     x3     x4   x5   -f |    bi”
variables |                                  |
----------+----------------------------------+---------
x1        |   1    0    1/8  -1/16    0    0 | 1500/8
x2        |   0    1  -1/20    1/8    0    0 |    125
x5        |   0    0  -1/10   -1/4    1    0 |    150
-f        |   0    0      0     10    0    1 | 20,000
297
Example 3: Infinite number of
solutions
•  The solution corresponding to this canonical form is given by:

   x1 = 1500/8, x2 = 125, x5 = 150 (basic variables)
   x3 = x4 = 0 (nonbasic variables)
   fmin = -20,000

•  Thus, the value of f has not changed compared with the preceding
   value, since x1 has a zero cost coefficient in the last row of the
   preceding tableau. Thus, the two basic (optimal) feasible solutions are:

   X1 = (0, 200, 1500, 0, 300)T  and  X2 = (1500/8, 125, 0, 0, 150)T
298
Example 3: Infinite number of
solutions
•  An infinite number of nonbasic (optimal) feasible solutions can be
   obtained by taking any weighted average of the two solutions as:

   X* = λX1 + (1 - λ)X2, that is,

   x1* = (1 - λ)(1500/8)
   x2* = 200λ + 125(1 - λ) = 125 + 75λ
   x3* = 1500λ
   x4* = 0
   x5* = 300λ + 150(1 - λ) = 150 + 150λ

•  It can be verified that the solution X* will always give the same
   value of -20,000 for f, for all 0 ≤ λ ≤ 1.
299
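This is easy to verify numerically. The short sketch below (illustrative, not from the slides) evaluates f at X1, X2, and a few convex combinations:

import numpy as np

c = np.array([-40, -100, 0, 0, 0])           # f as a function of x1..x5
X1 = np.array([0, 200, 1500, 0, 300])
X2 = np.array([1500 / 8, 125, 0, 0, 150])

for lam in (0.0, 0.25, 0.5, 1.0):
    X = lam * X1 + (1 - lam) * X2
    print(lam, c @ X)                        # prints -20000.0 in every case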
Two Phases of the Simplex Method
•  This problem is to find nonnegative values for the
   variables x1, x2, ..., xn that satisfy the equations

   a11x1 + a12x2 + ... + a1nxn = b1
   a21x1 + a22x2 + ... + a2nxn = b2
   ...
   am1x1 + am2x2 + ... + amnxn = bm

   and minimize the objective function given by

   c1x1 + c2x2 + ... + cnxn = f
300
Two Phases of the Simplex Method
•  The general problems encountered in solving this problem are:

   - An initial feasible canonical form may not be readily available. This is
     the case when the linear programming problem does not have slack variables
     for some of the equations, or when the slack variables have negative
     coefficients.

   - The problem may have redundancies and/or inconsistencies, and may not be
     solvable in nonnegative numbers.

•  The two-phase simplex method can be used to solve the problem.
301
Two Phases of the Simplex Method
•  Phase I of the simplex method uses the simplex algorithm
   itself to find whether the linear programming problem has
   a feasible solution. If a feasible solution exists, it
   provides a basic feasible solution in canonical form,
   ready to initiate Phase II of the method.

•  Phase II, in turn, uses the simplex algorithm to find
   whether the problem has a bounded optimum. If a bounded
   optimum exists, it finds the basic feasible solution that
   is optimal.
302
Two Phases of the Simplex Method
The two-phase simplex method is described in the following steps:

1. Arrange the original system of equations below so that all
   the constant terms bi are positive or zero by changing,
   where necessary, the signs on both sides of the equations:

   a11x1 + a12x2 + ... + a1nxn = b1
   a21x1 + a22x2 + ... + a2nxn = b2
   ...
   am1x1 + am2x2 + ... + amnxn = bm
303
Two Phases of the Simplex Method
2. Introduce to this system a set of artificial variables y1, y2,
   ..., ym (which serve as basic variables in Phase I), where each
   yi ≥ 0, so that it becomes:

   a11x1 + a12x2 + ... + a1nxn + y1 = b1
   a21x1 + a22x2 + ... + a2nxn + y2 = b2
   ...
   am1x1 + am2x2 + ... + amnxn + ym = bm,   bi ≥ 0

   The objective function can be written as:

   c1x1 + c2x2 + ... + cnxn + (-f) = 0
304
Two Phases of the Simplex Method
3. Phase I of the method: Define a quantity w as the sum of the
   artificial variables:

   w = y1 + y2 + ... + ym

   and use the simplex algorithm to find xi ≥ 0 (i = 1, 2, ..., n)
   and yi ≥ 0 (i = 1, 2, ..., m) which minimize w and satisfy:

   a11x1 + a12x2 + ... + a1nxn + y1 = b1
   a21x1 + a22x2 + ... + a2nxn + y2 = b2
   ...
   am1x1 + am2x2 + ... + amnxn + ym = bm,   bi ≥ 0

   c1x1 + c2x2 + ... + cnxn + (-f) = 0
305
Two Phases of the Simplex Method
3. Phase I of the method cont'd: Consequently, consider the array:

   a11x1 + a12x2 + ... + a1nxn + y1 = b1
   a21x1 + a22x2 + ... + a2nxn + y2 = b2
   ...
   am1x1 + am2x2 + ... + amnxn + ym = bm
   c1x1 + c2x2 + ... + cnxn + (-f) = 0
   y1 + y2 + ... + ym + (-w) = 0
306
Two Phases of the Simplex Method
3. Phase I of the method cont'd: This array is not in canonical
   form; however, it can be rewritten as a canonical system with
   basic variables y1, y2, ..., ym, -f, and -w by subtracting the
   sum of the first m equations from the last, to obtain the new
   system, which provides the initial basic solution necessary for
   starting Phase I:

   a11x1 + a12x2 + ... + a1nxn + y1 = b1
   a21x1 + a22x2 + ... + a2nxn + y2 = b2
   ...
   am1x1 + am2x2 + ... + amnxn + ym = bm
   c1x1 + c2x2 + ... + cnxn + (-f) = 0
   d1x1 + d2x2 + ... + dnxn + (-w) = -w0

   where

   di = -(a1i + a2i + ... + ami), i = 1, 2, ..., n
   w0 = b1 + b2 + ... + bm

307
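The construction of the -w row is mechanical: negate the column sums of the constraint coefficients, and negate the sum of the constants. A small numpy sketch (the matrix shown is that of the worked example on the following slides):

import numpy as np

A = np.array([[3., -3., 4., 2., -1.],   # constraint coefficients of the
              [1.,  1., 1., 3.,  1.]])  # example that follows
b = np.array([0., 2.])

d = -A.sum(axis=0)     # di = -(a1i + a2i + ... + ami)
w0 = b.sum()           # w0 = b1 + b2 + ... + bm
print(d, -w0)          # d = [-4, 2, -5, -5, 0] and -w0 = -2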
Two Phases of the Simplex Method
4. w is called the infeasibility form, and it has the following
   property: if, as a result of Phase I, the minimum of w is
   greater than zero, then no feasible solution exists for the
   original linear programming problem stated in the equations:

   a11x1 + a12x2 + ... + a1nxn = b1
   a21x1 + a22x2 + ... + a2nxn = b2
   ...
   am1x1 + am2x2 + ... + amnxn = bm

   c1x1 + c2x2 + ... + cnxn + (-f) = 0

   and the procedure is terminated.
308
Two Phases of the Simplex Method
4. On the other hand, if the minimum of w = 0, the resulting
   array will be in canonical form, and Phase II can be initiated
   by eliminating the w equation, as well as the columns
   corresponding to each of the artificial variables y1, y2, ..., ym,
   from the array.

5. Apply the simplex algorithm to the adjusted canonical system
   at the end of Phase I to obtain a solution, if a finite one
   exists, which optimizes the value of f.
309
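One compact way to sketch Phase I in code is to minimize w = y1 + ... + ym directly with a library LP solver over the augmented system [A | I](x, y) = b: if the minimum of w is positive, the original problem is infeasible; if it is zero, the x-part of the minimizer is a feasible starting point for Phase II. The helper below is a hypothetical illustration (names and tolerance are assumptions) using scipy:

import numpy as np
from scipy.optimize import linprog

def phase_one(A, b):
    """Return (feasible, x) for the system A x = b, x >= 0."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = A.shape
    A = A * np.where(b < 0, -1.0, 1.0)[:, None]   # step 1: make every bi >= 0
    b = np.abs(b)
    c = np.r_[np.zeros(n), np.ones(m)]            # w = y1 + y2 + ... + ym
    A_eq = np.hstack([A, np.eye(m)])              # [A | I] acting on (x, y)
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * (n + m))
    feasible = res.status == 0 and res.fun < 1e-9
    return feasible, res.x[:n]

For the worked example that follows, phase_one returns feasible = True (the minimum of w is zero).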
Example
•  Minimize

   f = 2x1 + 3x2 + 2x3 - x4 + x5

   subject to the constraints:

   3x1 - 3x2 + 4x3 + 2x4 - x5 = 0
   x1 + x2 + x3 + 3x4 + x5 = 2
   xi ≥ 0, i = 1 to 5
310
Example
Solution:
Step I: As the constants on the right-hand side of the constraints
are already nonnegative, the application of step I is unnecessary.

Step II: Introducing the artificial variables y1 ≥ 0 and y2 ≥ 0,
the equations can be written as follows:

3x1 - 3x2 + 4x3 + 2x4 - x5 + y1 = 0
 x1 +  x2 +  x3 + 3x4 + x5 + y2 = 2    (E1)
2x1 + 3x2 + 2x3 -  x4 + x5 - f  = 0
311
Example
Solution:
Step III: By defining the infeasibility form w as

w = y1 + y2

the system becomes:

3x1 - 3x2 + 4x3 + 2x4 - x5 + y1 = 0
 x1 +  x2 +  x3 + 3x4 + x5 + y2 = 2
2x1 + 3x2 + 2x3 -  x4 + x5 - f  = 0    (E2)
y1 + y2 - w = 0
312
Example
Solution:
Step III cont'd: This array can be rewritten as a canonical system
with basic variables y1, y2, -f, -w by subtracting the sum of the
first two equations of (E2) from the last equation of (E2). Thus the
last equation of (E2) becomes:

-4x1 + 2x2 - 5x3 - 5x4 + 0x5 - w = -2    (E3)

Since this canonical system (the first three equations of (E2),
together with (E3)) provides an initial basic feasible solution,
Phase I of the simplex method can be started. The Phase I
computations are shown in tableau form on the next slide.
313
Example
Solution: Step III cont'd: Since this canonical system (the first
three equations of (E2), together with (E3)) provides an initial
basic feasible solution, Phase I of the simplex method can be started.

Basic     |      Admissible variables    | Artificial |      | bi”/ais”
variables |   x1    x2    x3    x4    x5 |  y1    y2  |  bi” | for ais” > 0
----------+------------------------------+------------+------+-------------
y1        |    3    -3     4     2    -1 |   1     0  |   0  | 0  <- smaller value (y1 drops from the next basis)
y2        |    1     1     1     3     1 |   0     1  |   2  | 2/3
-f        |    2     3     2    -1     1 |   0     0  |   0  |
-w        |   -4     2    -5    -5     0 |   0     0  |  -2  |

Most negative di” (x4 enters the next basis)

314
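As an end-to-end cross-check of this example (a sketch, not the slides' tableau computation), the equality-constrained problem can also be solved directly with scipy:

from scipy.optimize import linprog

c = [2, 3, 2, -1, 1]                # f = 2x1 + 3x2 + 2x3 - x4 + x5
A_eq = [[3, -3, 4, 2, -1],          # 3x1 - 3x2 + 4x3 + 2x4 - x5 = 0
        [1,  1, 1, 3,  1]]          #  x1 +  x2 +  x3 + 3x4 + x5 = 2
b_eq = [0, 2]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 5)
print(res.x, res.fun)               # expected: x4 = 2/5, x5 = 4/5, f = 2/5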