
CB304 Chemical process modeling and simulation

Sandip Khan
Two-dimensional searches can be visualized in terms of mountains and valleys (Fig. 14.1). For higher-dimensional problems, convenient images are not possible. We have chosen to limit this chapter to the two-dimensional case because the essential features of multidimensional searches are often best communicated visually.
Multidimensional Unconstrained Optimization

Techniques for multidimensional unconstrained optimization can be classified in a number of ways. For purposes of the present discussion, we will divide them depending on whether they require derivative evaluation. The approaches that do not require derivative evaluation are called nongradient, or direct, methods. Those that require derivatives are called gradient, or descent (or ascent), methods.

FIGURE 14.1
The most tangible way to visualize two-dimensional searches is in the context of ascending a mountain (maximization) or descending into a valley (minimization). (a) A 2-D topographic map that corresponds to the 3-D mountain in (b).
14.1 Random Search Method

As the name implies, this method repeatedly evaluates the function at randomly selected values of the independent variables. If a sufficient number of samples are conducted, the optimum will eventually be located.

EXAMPLE 14.1 Random Search Method

Problem Statement. Use a random number generator to locate the maximum of

f(x, y) = y − x − 2x² − 2xy − y²          (E14.1.1)

in the domain bounded by x = −2 to 2 and y = 1 to 3. The domain is depicted in Fig. 14.2. Notice that a single maximum of 1.5 occurs at x = −1 and y = 1.5.

FIGURE 14.2
Equation (E14.1.1) showing the maximum at x = −1 and y = 1.5.

Solution. Random number generators typically generate values between 0 and 1. If we designate such a number as r, the following formula can be used to generate x values randomly within a range between xl and xu:

x = xl + (xu − xl)r

For the present application, xl = −2 and xu = 2, and the formula is

x = −2 + (2 − (−2))r = −2 + 4r

This can be tested by substituting 0 and 1 to yield −2 and 2, respectively. Similarly for y, a formula for the present example could be developed as

y = yl + (yu − yl)r = 1 + (3 − 1)r = 1 + 2r

The following Excel VBA macrocode uses the VBA random number function Rnd to generate (x, y) pairs. These are then substituted into Eq. (E14.1.1). The maximum value from among these random trials is stored in the variable maxf, and the corresponding x and y values in maxx and maxy, respectively.
maxf = -1E9                              'best value found so far
For j = 1 To n                           'n = number of random trials
  x = -2 + 4 * Rnd                       'random x in [-2, 2]
  y = 1 + 2 * Rnd                        'random y in [1, 3]
  fn = y - x - 2 * x ^ 2 - 2 * x * y - y ^ 2
  If fn > maxf Then                      'keep the best point found
    maxf = fn
    maxx = x
    maxy = y
  End If
Next j

A number of iterations yields

Iterations x y f (x, y)

1000 −0.9886 1.4282 1.2462


2000 −1.0040 1.4724 1.2490
3000 −1.0040 1.4724 1.2490
4000 −1.0040 1.4724 1.2490
5000 −1.0040 1.4724 1.2490
6000 −0.9837 1.4936 1.2496
7000 −0.9960 1.5079 1.2498
8000 −0.9960 1.5079 1.2498
9000 −0.9960 1.5079 1.2498
10000 −0.9978 1.5039 1.2500

The results indicate that the technique homes in on the true maximum.

This simple brute force approach works even for discontinuous and nondifferentiable functions. Furthermore, it always finds the global optimum rather than a local optimum. Its major shortcoming is that as the number of independent variables grows, the implementation effort required can become onerous.
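For readers who want to experiment outside Excel, the following Python sketch mirrors the VBA macro above. It is an illustrative translation, not part of the original example; the trial count n = 10000 is chosen to match the last row of the results table.

import random

def f(x, y):
    # Objective function of Eq. (E14.1.1)
    return y - x - 2*x**2 - 2*x*y - y**2

n = 10000                              # number of random trials
maxf, maxx, maxy = -1e9, None, None    # best value and location found so far
for _ in range(n):
    x = -2 + 4 * random.random()       # random x in [-2, 2]
    y = 1 + 2 * random.random()        # random y in [1, 3]
    fn = f(x, y)
    if fn > maxf:                      # keep the best point found
        maxf, maxx, maxy = fn, x, y

print(maxx, maxy, maxf)                # should approach (-1, 1.5, 1.25)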
Univariate Search Method

FIGURE 14.3
A graphical depiction of how a univariate search is conducted.
Ø Let us perform a univariate search graphically, as shown in the figure.
Ø Start at point 1, and move along the x axis with y constant to the maximum at point 2.
Ø You can see that point 2 is a maximum by noticing that the trajectory along the x axis just touches a contour line at the point.
Ø Next, move along the y axis with x constant to point 3.
Ø Continue this process, generating points 4, 5, 6, etc.
Example 1: Minimize

y = 8 + x1²/2 + 2/(x1 x2) + 6x2

for x1, x2 > 0, starting from x1 = x2 = 1.

Holding x2 fixed, setting the partial derivative with respect to x1 to zero gives the one-dimensional minimizer in x1:

∂y/∂x1 = x1 − 2/(x1² x2) = 0   =>   x1 = (2/x2)^(1/3)

Holding x1 fixed, setting the partial derivative with respect to x2 to zero gives the one-dimensional minimizer in x2:

∂y/∂x2 = 6 − 2/(x1 x2²) = 0   =>   x2 = (1/(3x1))^(1/2)

Alternating these two single-variable minimizations produces the following iterates:

x1       x2       y
1        0.578    15.42
1.51     0.46     14.7
1.62     0.45     14.76
1.64     0.45     14.75
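The table can be reproduced with a few lines of Python. This is an illustrative sketch of the univariate (one-variable-at-a-time) search: it alternates the two closed-form single-variable minimizers derived above. The number of passes (four) is an arbitrary choice here, and small differences from the table are due to rounding.

def y_of(x1, x2):
    # Objective of Example 1
    return 8 + x1**2 / 2 + 2 / (x1 * x2) + 6 * x2

x1, x2 = 1.0, 1.0                          # starting point x1 = x2 = 1
for k in range(4):
    x2 = (1.0 / (3.0 * x1)) ** 0.5         # minimize over x2 with x1 held fixed
    print(round(x1, 2), round(x2, 3), round(y_of(x1, x2), 2))
    x1 = (2.0 / x2) ** (1.0 / 3.0)         # minimize over x1 with x2 held fixed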

GRADIENT METHODS

These ideas were useful to us in the one-dimensional search algorithms we explored in Chap. 13. However, to fully understand multidimensional searches, we must first understand how the first and second derivatives are expressed in a multidimensional context.

The Gradient. Suppose we have a two-dimensional function f(x, y). An example might be your elevation on a mountain as a function of your position. Suppose that you are at a specific location on the mountain (a, b) and you want to know the slope in an arbitrary direction. One way to define the direction is along a new axis h that forms an angle θ with the x axis (Fig. 14.6). The elevation along this new axis can be thought of as a new function g(h). If you define your position as being the origin of this axis (that is, h = 0), the slope in this direction would be designated as g′(0). This slope, which is called the directional derivative, can be calculated from the partial derivatives along the x and y axes by

g′(0) = (∂f/∂x) cos θ + (∂f/∂y) sin θ          (14.1)

where the partial derivatives are evaluated at x = a and y = b.

FIGURE 14.6
The directional gradient is defined along an axis h that forms an angle θ with the x axis.

Assuming that your goal is to gain the most elevation with the next step, the next logical question would be: what direction is the steepest ascent? The answer to this question is provided very neatly by what is referred to mathematically as the gradient, which is defined as

∇f = (∂f/∂x) i + (∂f/∂y) j

This vector is also referred to as "del f." It represents the directional derivative of f(x, y) at the point x = a and y = b.

Vector notation provides a concise means to generalize the gradient to n dimensions, as

∇f(x) = [∂f/∂x1(x)   ∂f/∂x2(x)   · · ·   ∂f/∂xn(x)]ᵀ
How do we use the gradient? For the mountain-climbing problem, we are interested in gaining elevation as quickly as possible; the gradient tells us what direction to move locally and how much we will gain by taking that step. These questions are explored in more depth later in this chapter.

EXAMPLE 14.2 Using the Gradient to Evaluate the Path of Steepest Ascent

Problem Statement. Employ the gradient to evaluate the steepest ascent direction for the function

f(x, y) = xy²

at the point (2, 2). Assume that positive x is pointed east and positive y is pointed north.

Solution. First, our elevation can be determined as

f(2, 2) = 2(2)² = 8

Next, the partial derivatives can be evaluated,

∂f/∂x = y² = 2² = 4
∂f/∂y = 2xy = 2(2)(2) = 8

which can be used to determine the gradient as

∇f = 4i + 8j

This vector can be sketched on a topographical map of the function, as in Fig. 14.7. This immediately tells us that the direction we must take is

θ = tan⁻¹(8/4) = 1.107 radians (= 63.4°)

relative to the x axis. The slope in this direction, which is the magnitude of ∇f, can be calculated as

√(4² + 8²) = 8.944

Thus, during our first step, we will initially gain 8.944 units of elevation rise for a unit distance advanced along this steepest path. Observe that Eq. (14.1) yields the same result,

g′(0) = 4 cos(1.107) + 8 sin(1.107) = 8.944

Note that for any other direction, say θ = 1.107/2 = 0.5235, g′(0) = 4 cos(0.5235) + 8 sin(0.5235) = 7.608, which is smaller.

FIGURE 14.7
The arrow follows the direction of steepest ascent calculated with the gradient.

As we move forward, both the direction and magnitude of the steepest path will change. These changes can be quantified at each step using the gradient, and your climbing direction is modified accordingly.
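The numbers in Example 14.2 are easy to verify numerically. The short Python sketch below is an illustration (not part of the original example): it evaluates the gradient of f = xy² at (2, 2), the steepest-ascent angle, and the directional derivative of Eq. (14.1) for several angles, including the θ = 1.3 and θ = 0.9 suggested as an exercise below.

import math

fx, fy = 2.0**2, 2 * 2.0 * 2.0      # partial derivatives of f = x*y**2 at (2, 2): (4, 8)
theta_star = math.atan2(fy, fx)     # 1.107 rad (63.4 degrees)
print(math.hypot(fx, fy))           # magnitude of the gradient: 8.944

for theta in (theta_star, theta_star / 2, 1.3, 0.9):
    g0 = fx * math.cos(theta) + fy * math.sin(theta)   # Eq. (14.1)
    print(round(theta, 4), round(g0, 3))
# the largest g'(0), 8.944, occurs at theta = 1.107 rad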
Try with θ = 1.3 and θ = 0.9.

A final insight can be gained by inspecting Fig. 14.7. As indicated, the direction of steepest ascent is perpendicular, or orthogonal, to the elevation contour at the coordinate (2, 2). This is a general characteristic of the gradient.
The Hessian

You might expect that if the second partial derivatives with respect to x and y are both negative, then you have reached a maximum. Figure 14.8 shows a function where this is not true. The point (a, b) of this graph appears to be a minimum when observed along either the x dimension or the y dimension. In both instances, the second partial derivatives are positive. However, if the function is observed along the line y = x, it can be seen that a maximum occurs at the same point. This shape is called a saddle, and clearly, neither a maximum nor a minimum occurs at the point.

Whether a maximum or a minimum occurs involves not only the partials with respect to x and y but also the second partials with respect to x and y. Assuming that the partial derivatives are continuous at and near the point being evaluated, the following quantity can be computed:

|H| = (∂²f/∂x²)(∂²f/∂y²) − (∂²f/∂x∂y)²          (14.3)

Three cases can occur:

• If |H| > 0 and ∂²f/∂x² > 0, then f(x, y) has a local minimum.
• If |H| > 0 and ∂²f/∂x² < 0, then f(x, y) has a local maximum.
• If |H| < 0, then f(x, y) has a saddle point.

FIGURE 14.8
A saddle point (x = a and y = b). Notice that when the curve is viewed along the x and y directions, the function appears to go through a minimum (positive second derivative), whereas when viewed along an axis x = y, it is concave downward (negative second derivative).
Recall that, for a one-dimensional function, the second derivative tells us whether a stationary point is a maximum [negative f″(x)] or a minimum [positive f″(x)]. In the previous paragraphs, we illustrated how the gradient provides best local trajectories for multidimensional problems. Now, we will examine how the second derivative is used in such contexts.

The quantity |H| is equal to the determinant of a matrix made up of the second derivatives,

H = [ ∂²f/∂x²     ∂²f/∂x∂y ]
    [ ∂²f/∂y∂x    ∂²f/∂y²  ]          (14.4)

where this matrix is formally referred to as the Hessian of f.

Besides providing a way to discern whether a multidimensional function has reached an optimum, the Hessian has other uses in optimization (for example, for the multidimensional form of Newton's method). In particular, it allows searches to include second-order curvature to attain superior results.

Finite-Difference Approximations. It should be mentioned that, for cases where they are difficult or inconvenient to compute analytically, both the gradient and the determinant of the Hessian can be evaluated numerically. In most cases, the approach introduced in Sec. 6.3.3 for the modified secant method is employed. That is, the independent variables can be perturbed slightly to generate the required partial derivatives. For example, if a centered-difference approach is adopted, they can be computed as

∂f/∂x = [f(x + δx, y) − f(x − δx, y)] / (2δx)          (14.5)

∂f/∂y = [f(x, y + δy) − f(x, y − δy)] / (2δy)          (14.6)

∂²f/∂x² = [f(x + δx, y) − 2f(x, y) + f(x − δx, y)] / δx²          (14.7)

∂²f/∂y² = [f(x, y + δy) − 2f(x, y) + f(x, y − δy)] / δy²          (14.8)

∂²f/∂x∂y = [f(x + δx, y + δy) − f(x + δx, y − δy) − f(x − δx, y + δy) + f(x − δx, y − δy)] / (4δx δy)          (14.9)

where δ is some small fractional value. Note that the methods employed in commercial software packages also use forward differences.
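The centered-difference formulas (14.5) through (14.9) translate directly into code. The Python sketch below is a generic illustration; the step sizes are choices made here, and the test function and point are borrowed from Example 14.4 later in these notes, not prescribed by this passage. It builds the numerical gradient and Hessian and applies the |H| test of Eq. (14.3).

def num_gradient(f, x, y, dx=1e-5, dy=1e-5):
    # Centered differences, Eqs. (14.5)-(14.6)
    dfdx = (f(x + dx, y) - f(x - dx, y)) / (2 * dx)
    dfdy = (f(x, y + dy) - f(x, y - dy)) / (2 * dy)
    return dfdx, dfdy

def num_hessian(f, x, y, dx=1e-4, dy=1e-4):
    # Centered differences, Eqs. (14.7)-(14.9)
    fxx = (f(x + dx, y) - 2 * f(x, y) + f(x - dx, y)) / dx**2
    fyy = (f(x, y + dy) - 2 * f(x, y) + f(x, y - dy)) / dy**2
    fxy = (f(x + dx, y + dy) - f(x + dx, y - dy)
           - f(x - dx, y + dy) + f(x - dx, y - dy)) / (4 * dx * dy)
    return fxx, fyy, fxy

f = lambda x, y: 2*x*y + 2*x - x**2 - 2*y**2   # function used in Example 14.4
print(num_gradient(f, 2.0, 1.0))               # both components near zero at the optimum (2, 1)

fxx, fyy, fxy = num_hessian(f, 2.0, 1.0)
detH = fxx * fyy - fxy**2                      # Eq. (14.3)
if detH > 0 and fxx > 0:
    print("local minimum")
elif detH > 0 and fxx < 0:
    print("local maximum")                     # expected here: |H| = 4, fxx = -2
else:
    print("saddle point")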
Steepest Ascent Method
Ø Walk a short distance along the gradient direction.
Ø Re-evaluate the gradient and walk another short distance.
Ø By repeating the process you would eventually get to the top of the hill.

Alternate Way

Ø Continuous re-evaluation of the gradient can be computationally demanding.
Ø A preferred approach involves moving in a fixed path along the initial gradient until f(x, y) stops increasing.
Ø This stopping point becomes the starting point where the gradient is re-evaluated and a new direction is followed.
Ø The process is repeated until the summit is reached.
Thus, the problem boils down to two parts:

(1) Determining the "best" direction to search and
(2) Determining the "best distance" along that search direction.

FIGURE 14.9
A graphical depiction of the method of steepest ascent.

FIGURE 14.10
The relationship between an arbitrary direction h and the x and y coordinates.

Starting at x0, y0, the coordinates of any point in the gradient direction can be expressed as

x = x0 + (∂f/∂x) h          (14.10)

y = y0 + (∂f/∂y) h          (14.11)

where h is distance along the h axis. For example, suppose x0 = 1 and y0 = 2 and ∇f = 3i + 4j, as shown in Fig. 14.10. The coordinates of any point along the h axis are given by

x = 1 + 3h          (14.12)

y = 2 + 4h          (14.13)

The following example illustrates how we can use these transformations to convert a two-dimensional function of x and y into a one-dimensional function in h.
EXAMPLE 14.3 Developing a 1-D Function Along the Gradient Direction

Problem Statement. Suppose we have the following two-dimensional function:

f(x, y) = 2xy + 2x − x² − 2y²

Develop a one-dimensional version of this equation along the gradient direction at the point x = −1 and y = 1.

Solution. The partial derivatives can be evaluated at (−1, 1),

∂f/∂x = 2y + 2 − 2x = 2(1) + 2 − 2(−1) = 6
∂f/∂y = 2x − 4y = 2(−1) − 4(1) = −6

Therefore, the gradient vector is

∇f = 6i − 6j

To find the maximum, we could search along the gradient direction, that is, along an h axis running along the direction of this vector. The function can be expressed along this axis as

f(x0 + (∂f/∂x)h, y0 + (∂f/∂y)h) = f(−1 + 6h, 1 − 6h)
= 2(−1 + 6h)(1 − 6h) + 2(−1 + 6h) − (−1 + 6h)² − 2(1 − 6h)²

where the partial derivatives are evaluated at x = −1 and y = 1. By combining terms, we develop a one-dimensional function g(h) that maps f(x, y) along the h axis,

g(h) = −180h² + 72h − 7
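A quick numerical check of this result (an illustration only; the three sample values of h are arbitrary): evaluating f along the gradient direction at (−1, 1) and fitting a quadratic recovers the coefficients of g(h).

import numpy as np

f = lambda x, y: 2*x*y + 2*x - x**2 - 2*y**2
x0, y0 = -1.0, 1.0
fx, fy = 2*y0 + 2 - 2*x0, 2*x0 - 4*y0     # gradient at (-1, 1): (6, -6)

g = lambda h: f(x0 + fx*h, y0 + fy*h)     # f restricted to the gradient direction

h = np.array([0.0, 0.1, 0.2])             # any three distinct h values suffice
print(np.polyfit(h, g(h), 2))             # approximately [-180, 72, -7]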

Now that we have developed a function along the path of steepest ascent, we can explore how to answer the second question: how far along this path do we travel? One approach might be to move along this path until we find the maximum of this function. We will call the location of this maximum h∗. This is the value of the step that maximizes g (and hence, f) in the gradient direction. The problem is equivalent to finding the maximum of a function of a single variable h, which can be done using the one-dimensional search techniques discussed in Chap. 13.

Thus, we convert from finding the optimum of a two-dimensional function to performing a one-dimensional search along the gradient direction. This method is called steepest ascent when an arbitrary step size h is used. If a value of a single step h∗ is found that brings us directly to the maximum along the gradient direction, the method is called the optimal steepest ascent.

EXAMPLE 14.4 Optimal Steepest Ascent

Problem Statement. Maximize the following function:

f(x, y) = 2xy + 2x − x² − 2y²

using the initial guesses x = −1 and y = 1.

Solution. Because this function is so simple, we can first generate an analytical solution. To do this, the partial derivatives can be evaluated as

∂f/∂x = 2y + 2 − 2x = 0
∂f/∂y = 2x − 4y = 0

This pair of equations can be solved for the optimum, x = 2 and y = 1. The second partial derivatives can also be determined and evaluated at the optimum,

∂²f/∂x² = −2
∂²f/∂y² = −4
∂²f/∂x∂y = ∂²f/∂y∂x = 2

and the determinant of the Hessian is computed [Eq. (14.3)],

|H| = −2(−4) − 2² = 4

Therefore, because |H| > 0 and ∂²f/∂x² < 0, the function value f(2, 1) is a maximum.

FIGURE 14.11
The method of optimal steepest ascent.

Now let us implement steepest ascent. Recall that, at the end of Example 14.3, we had already implemented the initial steps of the problem by generating

g(h) = −180h² + 72h − 7

Now, because this is a simple parabola, we can directly locate the maximum (that is, h = h∗) by solving

g′(h∗) = 0
−360h∗ + 72 = 0
h∗ = 0.2

This means that if we travel along the h axis, g(h) reaches a maximum value when h = h∗ = 0.2. This result can be placed back into Eqs. (14.10) and (14.11) to solve for the (x, y) coordinates corresponding to this point,

x = −1 + 6(0.2) = 0.2
y = 1 − 6(0.2) = −0.2

This step is depicted in Fig. 14.11 as the move from point 0 to 1.

The second step is merely implemented by repeating the procedure. First, the partial derivatives can be evaluated at the new starting point (0.2, −0.2) to give

∂f/∂x = 2(−0.2) + 2 − 2(0.2) = 1.2
∂f/∂y = 2(0.2) − 4(−0.2) = 1.2

Therefore, the gradient vector is

∇f = 1.2i + 1.2j

This means that the steepest direction is now pointed up and to the right at a 45° angle with the x axis (see Fig. 14.11). The coordinates along this new h axis can be expressed as

x = 0.2 + 1.2h
y = −0.2 + 1.2h

Substituting these values into the function yields

f(0.2 + 1.2h, −0.2 + 1.2h) = g(h) = −1.44h² + 2.88h + 0.2

The step h∗ to take us to the maximum along the search direction can then be directly computed as

g′(h∗) = −2.88h∗ + 2.88 = 0
h∗ = 1

This result can be placed back into Eqs. (14.10) and (14.11) to solve for the (x, y) coordinates corresponding to this new point,

x = 0.2 + 1.2(1) = 1.4
y = −0.2 + 1.2(1) = 1

As depicted in Fig. 14.11, we move to the new coordinates, labeled point 2 in the plot, and in so doing move closer to the maximum. The successive points are

x          y
−1         1
0.2        −0.2
1.4        1

The approach can be repeated, with the final result converging on the analytical solution, x = 2 and y = 1.

It can be shown that the method of steepest ascent is linearly convergent. Further, it tends to move very slowly along long, narrow ridges. This is because the new gradient at each maximum point will be perpendicular to the original direction. Thus, the technique takes many small steps criss-crossing the direct route to the summit. Hence, although it is reliable, there are other approaches that converge much more rapidly, particularly in the vicinity of an optimum. The remainder of the section is devoted to such methods.
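The whole iteration of Example 14.4 can be sketched compactly in Python. Note the assumptions: the step h∗ is obtained analytically from the constant Hessian of this particular quadratic (the entries −2, −4, and 2 computed above), which works only because f is quadratic; for a general function the one-dimensional maximization of g(h) would use a search such as golden section.

import numpy as np

def grad(p):
    x, y = p
    return np.array([2*y + 2 - 2*x, 2*x - 4*y])

H = np.array([[-2.0, 2.0],
              [2.0, -4.0]])               # Hessian of f (constant for this quadratic)

p = np.array([-1.0, 1.0])                 # initial guess
for k in range(10):
    d = grad(p)                           # steepest-ascent direction
    if np.linalg.norm(d) < 1e-8:
        break
    h_star = (d @ d) / -(d @ H @ d)       # maximizer of g(h) = f(p + h*d)
    p = p + h_star * d
    print(k + 1, p)                       # first two steps: (0.2, -0.2), then (1.4, 1.0)
# converges to the analytical optimum x = 2, y = 1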
PROBLEMS

Q1: Find the minimum value of

f(x, y) = (x − 3)² + (y − 2)²

starting at x = 1 and y = 1, using the steepest descent method with a stopping criterion of εs = 1%. Explain your results.

Q2: Perform one iteration of the steepest ascent method to locate the maximum of

f(x, y) = 2xy + 1.5y − 1.25x² − 2y² + 5

starting at the point (0.8, 1.2).

Linear Programming

15.1.1 Standard Form

The basic linear programming problem consists of two major parts: the objective function and a set of constraints. For a maximization problem, the objective function is generally expressed as

Maximize Z = c1x1 + c2x2 + · · · + cnxn          (15.1)

where cj = payoff of each unit of the jth activity that is undertaken and xj = magnitude of the jth activity. Thus, the value of the objective function, Z, is the total payoff due to the total number of activities, n.

The constraints can be represented generally as

ai1x1 + ai2x2 + · · · + ainxn ≤ bi          (15.2)

where aij = amount of the ith resource that is consumed for each unit of the jth activity and bi = amount of the ith resource that is available. That is, the resources are limited.

The second general type of constraint specifies that all activities must have a positive value,

xi ≥ 0          (15.3)

In the present context, this expresses the realistic notion that, for some problems, negative activity is physically impossible (for example, we cannot produce negative goods).

Together, the objective function and the constraints specify the linear programming problem. They say that we are trying to maximize the payoff for a number of activities under the constraint that these activities utilize finite amounts of resources. Before showing how this result can be obtained, we will first develop an example.
EXAMPLE 15.1 Setting Up the LP Problem

Problem Statement. The following problem is developed from the area of chemical or petroleum engineering. However, it is relevant to all areas of engineering that deal with producing products with limited resources.

Ø Suppose that a gas-processing plant receives a fixed amount of raw gas each week.
Ø The raw gas is processed into two grades of heating gas, regular and premium quality. These grades are in high demand (that is, they are guaranteed to sell) and yield different profits to the company.
Ø Only one of the grades can be produced at a time.
Ø The facility is open for only 80 hr/week.
Ø There is limited on-site storage for each of the products.
Ø All these factors are listed below (note that a metric ton, or tonne, is equal to 1000 kg):

Resource            Regular        Premium        Resource Availability
Raw gas             7 m3/tonne     11 m3/tonne    77 m3/week
Production time     10 hr/tonne    8 hr/tonne     80 hr/week
Storage             9 tonnes       6 tonnes
Profit              150/tonne      175/tonne

Develop a linear programming formulation to maximize the profits for this operation.

Solution. The engineer operating this plant must decide how much of each gas to produce to maximize profits. If the amounts of regular and premium produced weekly are designated as x1 and x2, respectively, the total weekly profit can be calculated as

Total profit = 150x1 + 175x2

or, written as a linear programming objective function,

Maximize Z = 150x1 + 175x2

The constraints can be developed in a similar fashion. For example, the total raw gas used can be computed as

Total gas used = 7x1 + 11x2

This total cannot exceed the available supply of 77 m3/week, so the constraint can be represented as

7x1 + 11x2 ≤ 77

The remaining constraints can be developed in a similar fashion, with the resulting total LP formulation given by

Maximize Z = 150x1 + 175x2          (maximize profit)

subject to

7x1 + 11x2 ≤ 77          (material constraint)
10x1 + 8x2 ≤ 80          (time constraint)
x1 ≤ 9                   ("regular" storage constraint)
x2 ≤ 6                   ("premium" storage constraint)
x1, x2 ≥ 0               (positivity constraints)

Note that the above set of equations constitutes the total LP formulation. The parenthetical explanations at the right have been appended to clarify the meaning of each term.

15.1.2 Graphical Solution

Because they are limited to two or three dimensions, graphical solutions have limited practical utility. However, they are very useful for demonstrating some basic concepts that underlie the general algebraic techniques used to solve higher-dimensional problems with the computer.

FIGURE 15.1
Graphical solution of a linear programming problem. (a) The constraints define a feasible solution space. (b) The objective function can be increased until it reaches the highest value that obeys all constraints. Graphically, the function moves up and to the right until it touches the feasible space at a single optimal point.

FIGURE 15.2
Aside from a single optimal solution (for example, Fig. 15.1b), there are three other possible outcomes of a linear programming problem: (a) alternative optima, (b) no feasible solution, and (c) an unbounded result.

Unique solution: The maximum objective function intersects a single point.

Alternate solutions: Suppose that the objective function in the example had
coefficients so that it was precisely parallel to one of the constraints. Then, rather than
a single point, the problem would have an infinite number of optima corresponding to
a line segment

No feasible solution: It is possible that the problem is set up so that there is no feasible
solution. This can be due to dealing with an unsolvable problem or due to errors in
setting up the problem. The latter can result if the problem is over- constrained to the
point that no solution can satisfy all the constraints.

Unbounded problems. This usually means that the problem is under- constrained and
therefore open-ended. As with the no-feasible-solution case, it can often arise from
errors committed during problem specification.
Q1: Consider the linear programming problem:

Maximize f(x, y) = 1.75x + 1.25y

subject to

1.2x + 2.25y ≤ 14
x + 1.1y ≤ 8
2.5x + y ≤ 9
x ≥ 0
y ≥ 0
Obtain the solution:
(a) Graphically.
(b) Using the simplex method.
(c) Using an appropriate software package (for example, Excel,
MATLAB, or Mathcad).
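As an illustration of option (c), the sketch below uses SciPy's linprog (assuming SciPy is installed) on the gas-processing LP of Example 15.1; linprog minimizes, so the objective is negated. The result should land near x1 ≈ 4.89, x2 ≈ 3.89 (the intersection of the raw-gas and production-time constraints), giving Z ≈ 1414.

from scipy.optimize import linprog

c = [-150, -175]            # maximize 150*x1 + 175*x2  ->  minimize the negative
A_ub = [[7, 11],            # raw gas:          7*x1 + 11*x2 <= 77
        [10, 8],            # production time: 10*x1 +  8*x2 <= 80
        [1, 0],             # "regular" storage:       x1 <= 9
        [0, 1]]             # "premium" storage:       x2 <= 6
b_ub = [77, 80, 9, 6]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)      # optimal (x1, x2) and the maximum profit Z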
15.1.3 The Simplex Method

The simplex method is predicated on the assumption that the optimal solution will be an extreme point. Thus, the approach must be able to discern whether an extreme point occurs during problem solution. To do this, the constraint equations are reformulated as equalities by introducing what are called slack variables.

Slack Variables. As the name implies, a slack variable measures how much of a constrained resource is available, that is, how much "slack" of the resource is available. For example, recall the resource constraint used in Examples 15.1 and 15.2,

7x1 + 11x2 ≤ 77

We can define a slack variable S1 as the amount of raw gas that is not used for a particular production level (x1, x2). If this quantity is added to the left side of the constraint, it makes the relationship exact,

7x1 + 11x2 + S1 = 77

Now recognize what the slack variable tells us. If it is positive, it means that we have some "slack" for this constraint. That is, we have some surplus resource that is not being fully utilized. If it is negative, it tells us that we have exceeded the constraint. Finally, if it is zero, we exactly meet the constraint. That is, we have used up all the allowable resource. Since this is exactly the condition where constraint lines intersect, the slack variable provides a means to detect extreme points.

A different slack variable is developed for each constraint equation, resulting in what is called the fully augmented version:

Maximize Z = 150x1 + 175x2

subject to

7x1 + 11x2 + S1           = 77          (15.4a)
10x1 + 8x2       + S2     = 80          (15.4b)
x1                    + S3 = 9          (15.4c)
x2                        + S4 = 6      (15.4d)
x1, x2, S1, S2, S3, S4 ≥ 0

Algebraic Solution

Notice how we have set up the four equality equations so that the unknowns are aligned in columns. We did this to underscore that we are now dealing with a system of linear algebraic equations (recall Part Three). In the following section, we will show how these equations can be used to determine extreme points algebraically.

In contrast to Part Three, where we had n equations with n unknowns, our example system [Eqs. (15.4)] is underspecified or underdetermined, that is, it has more unknowns than equations. In general terms, there are n structural variables (that is, the original unknowns), m surplus or slack variables (one per constraint), and n + m total variables (structural plus surplus). For the gas production problem we have 2 structural variables, 4 slack variables, and 6 total variables. Thus, the problem involves solving 4 equations with 6 unknowns.

The difference between the number of unknowns and the number of equations (2 for our problem) is directly related to how we can distinguish a feasible extreme point. Specifically, every feasible corner point has 2 variables out of 6 equal to zero. For example, the five corner points of the area ABCDE have the following zero values:
Extreme Point        Zero Variables
A                    x1, x2
B                    x2, S2
C                    S1, S2
D                    S1, S4
E                    x1, S4

This observation leads to the conclusion that the extreme points can be determined from the standard form by setting two of the variables equal to zero. In our example, this reduces the problem to a solvable form of 4 equations with 4 unknowns. For example, for point E, setting x1 = S4 = 0 reduces the standard form to

11x2 + S1 = 77
8x2 + S2 = 80
S3 = 9
x2 = 6

which can be solved for x2 = 6, S1 = 11, S2 = 32, and S3 = 9. Together with x1 = S4 = 0, these values define point E.

To generalize, a basic solution for m linear equations with n unknowns is developed by setting n − m variables to zero and solving the m equations for the m remaining unknowns. The zero variables are formally referred to as nonbasic variables, whereas the remaining m variables are called basic variables. If all the basic variables are nonnegative, the result is called a basic feasible solution. The optimum will be one of these.

Now a direct approach to determining the optimal solution would be to calculate all the basic solutions, determine which were feasible, and among those, which had the highest value of Z. There are two reasons why this is not a wise approach.

First, for even moderately sized problems, the approach can involve solving a great number of equations. For m equations with n unknowns, this results in solving

C(n, m) = n! / [m!(n − m)!]

simultaneous systems of equations. For example, if there are 10 equations (m = 10) with 16 unknowns (n = 16), you would have 8008 [= 16!/(10! 6!)] 10 × 10 systems of equations to solve!

Second, a significant portion of these may be infeasible. For example, in the present problem, out of the 15 possible extreme points, only 5 are feasible. Clearly, if we could avoid solving all these unnecessary systems, a more efficient algorithm would be developed. Such an approach is described next.

Simplex Method Implementation. The simplex method avoids the inefficiencies outlined in the previous section. It does this by starting with a basic feasible solution. Then it moves through a sequence of other basic feasible solutions that successively improve the value of the objective function. Eventually, the optimal value is reached and the method is terminated.

We will illustrate the approach using the gas-processing problem from Examples 15.1 and 15.2. The first step is to start at a basic feasible solution (that is, at an extreme corner point of the feasible space). For cases like ours, an obvious starting point would be point A; that is, x1 = x2 = 0. At this point the four equations reduce to

S1 = 77
S2 = 80
S3 = 9
S4 = 6

Thus, the starting values for the basic variables are given automatically as being equal to the right-hand sides of the constraints.

Before proceeding to the next step, the beginning information can now be summarized in a convenient tabular format called a tableau. As shown below, the tableau provides a concise summary of the key information constituting the linear programming problem.

Basic    Z      x1      x2     S1    S2    S3    S4    Solution    Intercept
Z        1    −150    −175     0     0     0     0        0
S1       0      7       11     1     0     0     0       77          11
S2       0     10        8     0     1     0     0       80           8
S3       0      1        0     0     0     1     0        9           9
S4       0      0        1     0     0     0     1        6           ∞

Notice that for the purposes of the tableau, the objective function is expressed as

Z − 150x1 − 175x2 − 0S1 − 0S2 − 0S3 − 0S4 = 0          (15.5)

The next step involves moving to a new basic feasible solution that leads to an improvement of the objective function. This is accomplished by increasing a current nonbasic variable (the entering variable) above zero, which drives one of the current basic variables (the leaving variable) to zero.

Ø Moving from A to B
Ø Basic variable S2 will be replaced with x1
Ø S2 is the leaving variable and x1 is the entering variable
Ø Employing the Gauss-Jordan method

For this example, the pivot row is S2 (the leaving variable) and the pivot element is 10 (the coefficient of the entering variable, x1). Dividing the row by 10 and replacing S2 by x1 yields the updated tableau. Recall that the basic strategy behind Gauss-Jordan involves converting the pivot element to 1 and then eliminating the coefficients in the same column above and below the pivot element (recall Sec. 9.7).

At this point, we have moved to point B (x2 = S2 = 0), and the new basic solution becomes

7x1 + S1 = 77
10x1 = 80
x1 + S3 = 9
S4 = 6

The solution of this system of equations effectively defines the values of the basic variables at point B: x1 = 8, S1 = 21, S3 = 1, and S4 = 6.
basic variable
gives (at this point, x1 or x2 ) above zero so that Z increases. Recall that, for the
present example, extreme points must have 2 zero values. Therefore, one of the current
basic variables (S1, S2, S3, or S4) must also be set to zero.
Basic Z x1 x2 S S2 S3 S4 Solution ThisIntercept
observation leads to the conclusion that the extrem
To summarize this important step: one of1 the current nonbasic variables from
must the
be made
standard form by setting two of the variables equal
basic (nonzero).
Z This
1 variable
−150 is −175
called the entering
0 0variable. 0 In the0 process, one
reduces thecur-
0 of the problem to a solvable form of 4 equations with 4
rent basicS1variables0 is made 7 nonbasic 11 (zero). 1 This 0variable0is called 0 the leaving
77 variable.
point E, setting x1 = S4 = 0 reduces the standard form to
Now,x1 let us 0develop a1 mathematical 0.8 approach 0 0.1 0
for choosing 0
the entering 8 and leaving
variables.S3Because 0 1
of the convention 0
by which 0 0
the objective 1
function0
is written9 [(Eq. 2 + S1
11x(15.5)], = 77
S4 0 0 1 0 0 0 1 6
the entering variable can be any variable in the objective function having a negative 8x2 co- + S2 = 80
efficient (because this will make Z bigger). The variable with the largest negative value + S3 = 9
Next, the xR1
is conventionally
1
= R1+
chosen
coefficients R3*150
because
in the it usually
other rows leads
can beto the largest
eliminated. increase
For in
example,Z. For
for our
the
x2
case,
objective =6
function row,R2 the= R2 - R3*7
pivot row is multiplied by −150 and the result subtracted from the first
row to giveR4 = R4 – R3 which can be solved for x2 = 6, S1 = 11, S2 = 32, and S3 = 9
these values define point E.
To generalize, a basic solution for m linear equations with
Z x1 x2 S1 S2 S3 S4 Solution
setting n − m variables to zero, and solving the m equatio
x1 0 1 0.8 0 0.1 0 0 8
S3 0 1 0 0 0 1 0 9
S4 0 0 1 0 0 0 1 6

cha01064_ch15.qxd 3/20/09 12:39 PM Page 394


Next, the x1 coefficients in the other rows can be eliminated. For example, for the objective
function row, the pivot row is multiplied by −150 and the result subtracted from the first
row to give

394 CONSTRAINED OPTIMIZATION


Z x1 x2 S1 S2 S3 S4 Solution

1 −150 −175 0 0 0 0 0 The difference between the n


−0 −(−150) −(−120) −0 −(−15) 0 0 −(−1200)
to 2 for our problem) is directly re
Specifically, every feasible point
1 0 −55 0 15 0 0 1200 five corner points of the area ABC

Similar operations can be performed on the remaining rows to give the new tableau, Extreme Point Ze

A
B
Basic Z x1 x2 S1 S2 S3 S4 Solution Intercept C
D
Z 1 0 −55 0 15 0 0 1200 E
S1 0 0 5.4 1 −0.7 0 0 21 3.889
x1 0 1 0.8 0 0.1 0 0 8 10
S3 0 0 −0.8 0 −0.1 1 0 1 −1.25 This observation leads to the
S4 0 0 1 0 0 0 1 6 6 from the standard form by setting
reduces the problem to a solvable
point E, setting x1 = S4 = 0 reduc
S1 will be replaced with x2 R1 = R1 + (R2/5.4)*55
11x2 + S1 = 77
R3 = R3 – (R2/5.4)*0.8
8x2 + S2 = 80
R2 = R2/5.4 R4 = R3 + (R2/5.4)*0.8
+ S3 = 9
R5 = R5 - (R2/5.4)
x2 =6
which can be solved for x = 6, S
as the entering variable. According to the intercept values (now calculatedSolution.
as theFirst, the constraints can be plotted on the solution space. For example, the first
solution
constraint can be reformulated as a line by replacing the inequality by an equal sign and
column over the coefficients in the x2 column), the first constraint has the smallest
solving for x2positive
:
7
value, and therefore, S1 is selected as the leaving variable. Thus, the simplex method
x2 = − xmoves1+7
11
us from points B to C in Fig. 15.3. Finally, the Gauss-Jordan elimination can be imple-
Thus, as in Fig. 15.1a, the possible values of x1 and x2 that obey this constraint fall below
mented to solve the simultaneous equations. The result is the final tableau,this line (the direction designated in the plot by the small arrow). The other constraints can
be evaluated similarly, as superimposed on Fig. 15.1a. Notice how they encompass a region
where they are all met. This is the feasible solution space (the area ABCDE in the plot).
Aside from defining the feasible space, Fig. 15.1a also provides additional insight. In
Basic Z x1 x2 S1 S2 S3 Sparticular,
4 Solution
we can see that constraint 3 (storage of regular gas) is “redundant.” That is, the
feasible solution space is unaffected if it were deleted.
Z 1 0 0 10.1852 7.8704 0 0 Next, the 1413.889
objective function can be added to the plot. To do this, a value of Z must be
chosen. For example, for Z = 0, the objective function becomes
x2 0 0 1 0.1852 −0.1296 0 0 3.889
0 = 150x1 + 175x2
x1 0 1 0 −0.1481 0.2037 0 0 4.889
S3 0 0 0 0.1481 −0.2037 1 0 4.111
S4 0 0 0 −0.1852 0.1296
FIGURE 15.1 0 1 2.111
Graphical solution of a linear programming problem. (a) The constraints define a feasible
solution space. (b) The objective function can be increased until it reaches the highest value
that obeys all constraints. Graphically, the function moves up and to the right until it touches
the feasible space at a single optimal point.
We know that the result is final because there are no negative coefficients remaining in the
Moving
objective functionB row.
to CThe final solution is tabulated as x1 = 3.889
x and x2 = 4.889, which
2 x2

give a maximum objective function of Z = 1413.889. Further, because S3 and SRedundant


4 are still in
the basis, we know that the solution is limited by the first and second
8 constraints. 8

4 F
E E
15.2 NONLINEAR CONSTRAINED OPTIMIZATION
D 1 D Z!
14
4 00
3
C C
5
Z!
There are a number of approaches for handling nonlinear optimization problems in the 2
60
0
presence of constraints. These can generally be divided into indirect
0
A
and 6direct approaches
B
0
A
Z!
B
4 8 x 1 4 8 x1
(Rao, 1996). A typical indirect approach uses so-called penalty functions. These involve 0

placing additional expressions to make the objective function less optimal (a)as the solution (b)

approaches a constraint. Thus, the solution will be discouraged from violating constraints.
Although such methods can be useful in some problems, they can become arduous when
the problem involves many constraints.
The generalized reduced gradient (GRG) search method is one of the more popular of
the direct methods (for details, see Fylstra et al., 1998; Lasdon et al., 1978; Lasdon and
Smith, 1992). It is, in fact, the nonlinear method used within the Excel Solver.
It first “reduces” the problem to an unconstrained optimization problem. It does this
by solving a set of nonlinear equations for the basic variables in terms of the nonbasic
variables. Then, the unconstrained problem is solved using approaches similar to those
Linear Programming: Simplex Method
!"#$!$%& % = ( ! # x: 5×1 !"-6$# /: !×5 !"-6$#
c: 5×1 !"-6$# +: !×1 !"-6$#
)*+,&(- -. /# = +
# ≥ 0, + ≥ 0

6
The feasible solution space for a linear 5 C
programming problem is a polygon. The
optimal solution will be at one of the corner 4
points. Thus, an enumeration approach to solve B
3
the LPP will be to substitute the coordinates of
each corner point into the objective function 2
and determine which corner points is optimal. 1
But this will be an inefficient for large scale D A
0
problem
0 1 2 3 4 5 6
The Simplex Method: Canonical Form
!"#$!$%& % = (" #" + (# ## + ($ #$ + . . . . . + (% #%

)*+,&(- -. """ #" + ""# ## + … … … … … … + ""% #% = +"


"#" #" + "## ## + … … … … … … + "#% #% = +#



"&" #" + "&# ## + … … … … … … + "&% #% = +&

Where:
#" ≥ 0, ## ≥ 0 … … … … … … … . . , #% ≥ 0
+" ≥ 0, +# ≥ 0 … … … … … … … . . , +& ≥ 0

In general, m <n which leads to infinite number of feasible solutions. Hence selection of best
feasible solution which maximizes z is not an easy problem. To generate the solutions, use
first m variables (x1, …, xm) to reduce the system to canonical form by Gauss-Jordon
elimination.
The Simplex Method: Canonical Form
By performing n pivotal operations for any m variables (say, x1,x2,…,xm) called pivotal
variables the system of equations can be reduced to canonical form as follows:

1#" + 0## + ⋯ + 0#& + "",&(" #&(" + ⋯ + "",) #) + ⋯ + "",% #% = +"


0#" + 1## + ⋯ + 0#& + "#,&(" #&(" + ⋯ + "#,) #) + ⋯ + "#,% #% = +#



0#" + 0## + ⋯ + 1#& + "&,&(" #&(" + ⋯ + "&,) #) + ⋯ + "&,% #% = +&

A pivotal operation is sequence of elementary row operations that reduce the


coefficient of a specified variable to unity in one of the equation and zero elsewhere.
In the canonical form, x1, ….,xm are termed the basic variables or dependent variables.
Xm+1,…,xn are called nonbasic variables or the independent variables or non-pivotal
variables
Basic Variable and Basic Solutions

The solution obtained from a canonical form by setting the nonbasic variable or independent
Variable to zero is called a basic solution.

#* = +<* =.6 $ = 1, … , ! Variable x1,x2, …,xm are known as basic variables


#* = 0 =.6 $ = ! + 1, … , 5 Variable xm+1, ….xn are known as nonbasic variables

A basic feasible solution (BFS) is a basic solution in which the values of basic or dependent
Variables are non-negative. That is, for the above basic solution, +<* ≥ 0
Number of Basic Feasible Solution
Let us consider that we have a m linearly independent equation in n variables. Any
set of m variables can be selected as basic variables out of the possible n variables
to obtain a canonical system and a basic solution. Thus the maximum number of
basic solutions is :

5 5!
=
! !! 5 − ! !

A basic feasible solution (BFS) is a basic solution in which the values of basic or
dependent Variables are non-negative.
The Simplex Method: General Steps
1. Express the LPP in standard form
2. Start with an initial basic feasible solution in canonical form
3. Improve the solution if possible by finding another basic feasible solution with
a better objective function value. At this step, the simplex method eliminates
from consideration all those basic feasible solutions which are inferior
compared to the present basic feasible solution
4. Continue to find improved basic feasible solution. When a particular basic
feasible solution is found and cannot be improved by finding new basic feasible
solution, the optimally is reached.
How to obtain improved BFS?
Let us consider we have an initial basic feasible solution (BFS) in canonical form as
follows:
@"A$( ∶ #* = +<* ≥ 0 =.6 $ = 1, … , !
C.5+"A$( ∶ #* = 0 =.6 $ = ! + 1, … , 5

The set of basic variables is called a basis, xB. Let the objective function coefficient
of the Basic variables be denoted as cB.

#+ = #" , … , #& , (+ = (" , … , (& ,

Since the nonbasic variables are zero, the value of the objective function z
corresponding to initial BFS is given by:

% = (+ #+ = (" +" +. . . . . + (& +& = ∑& <


*," (* +*
How to obtain improved BFS?
Given the initial BFS, the simplex method
1. First examines whether the present BFS is optimal or not
2. If it is not optimal, the simplex method finds an adjacent basic feasible solution with
better (or at least equal) value of objective function (Z)

An adjacent basic solution differs from the present basic solution in exactly one basic
variable. In order to obtain an adjacent basic feasible solution, the simplex method
makes one of the basic variables a nonbasic variable and converts a nonbasic variable to
a basic variable in its Place.

How to select appropriate basic and nonbasic variables that gives maximum
improvement to the objective function?
How to select an adjacent BFS?

A basic feasible solution has non-negative value. The nonbasic variables are always zero.
Thus when we convert a nonbasic variable to a basic variable we increase its value from
zero to some positive quantity.

We have to choose that nonbasic variable that causes maximum improvement in the
objective function. To determine this, we can increase the value of nonbasic variable by
one unit (0 to 1) and check the change in the objective function value. Note that values of
other nonbasic variables will remain zero.
How to select an adjacent BFS?
Let us select the nonbasic variable xs and increase its value from 0 to 1. values of all
other nonbasic variables remain zero. The original canonical form can be rewritten as:

1#" + 0## + ⋯ + 0#& + "",&(" #&(" + ⋯ + "",) #) + ⋯ + "",% #% = +"


0#" + 1## + ⋯ + 0#& + "#,&(" #&(" + ⋯ + "#,) #) + ⋯ + "#,% #% = +#



0#" + 0## + ⋯ + 1#& + "&,&(" #&(" + ⋯ + "&,) #) + ⋯ + "&,% #% = +&

1#" + ⋯ + "",) #) = +"


1## + ⋯ + "#,) #) = +#


1#& + ⋯ + "&,) #) = +&
How to select an adjacent BFS?
When we increase xs from 0 to 1, the new solution are obtained as:

1#" + ⋯ + "",) #) = +" #* = +<* − "*) =.6 $ = 1, … … , !


1## + ⋯ + "#,) #) = +# #) = 1
… #- = 0 =.6 , = ! + 1, … … , 5 "5E , ≠ A

1#& + ⋯ + "&,) #) = +&

!"#$!$%& % = (" #" + (# ## + ($ #$ + . . . . . + (% #%


&
The new value of the objective function becomes: %%./ = G (* (+<* − "*) ) + ()
*,"

Note here xs=1 and cs = cost coefficient of xs


Inner product rule: which nonbasic variable become basic
Variable?
The net change in value of z per unit change in xs (known as relative profit of nonbasic
Variable xs)
& & &

(<) = %%./ − % = G (* (+<* − "*) ) + () − G (* +<* = () − G (* "*)


*," *," *,"

If the relative profit (<) > 0 then the objective function z can be improved by making xs
a basic variable. For a maximization problem, we should choose that nonbasic variable
which has maximum positive relative profit value. Note, for basic variables (<- = 0

The relative profit coefficient of a nonbasic variable xj is given by (<- = (- − (+ K<-

Here CB corresponds to the profit coefficients of the basic variables and K<- corresponds
To the j-th column in the canonical system of the basis under consideration.
Condition of optimality

Inner product rule: for a maximization problem, we should choose that nonbasic
variable which has maximum relative profit.

Condition of optimality: In a maximization problem, a BFS is optimal if the relative


profits of all the nonbasic variables are negative or zero. That is (<- ≤ 0

When the relative profits of all nonbasic variables are less or equal to zero, then every
adjacent BFS has an objective function value lower that the present solution. Thus the
current solution is a local maximum. Since a LPP is a convex programming problem, the
local maximum becomes global maximum.
Which basic variable becomes nonbasic?

Let us consider that (<) = !"#(<- > 0 "5E -ℎ& 5.5+"A$( N"6$"+O& #A ℎ"A +&&5 (ℎ.A&5 -.
enter as basic variable. Then, the values of the basic variables change as:
#* = +<* − "*) #) =.6 $ = 1, … … , !

If "*) < 0, -ℎ&5 #* , $5(6&"A&A "A #) is increased


If "*) = 0, -ℎ&5 #* , E.&A 5.- (ℎ"5Q& "A #) is increased
If "*) > 0, -ℎ&5 #* , E&(6&"A&A "A #) is increased and may turn negative (infeasible)

Thus, the maximum increase in xs is given by the following Minimum Ratio Rule:

+<* +3 +<*
max #) = min , ∀$. $= -ℎ$A ℎ"XX&5A "- $ = 6, #) $A $5(6&"A&E -. = min , ∀$
0!"12 "*) "3) 0!"12 "*)

4
Thus xs is increased 0 # , the basic variables xr becomes zero and is replaced by xs
#"
Inner product rule and minimum ratio rule

& & &

(<) = %%./ − % = G (* (+<* − "*) ) + () − G (* +<* = () − G (* "*)


*," *," *,"

Nonbasic variable which Has maximum positive Relative profit value Becomes basic
variable

+<*
max #) = min , ∀$ Basic variable which has minimum
0!"12 "*)
Ratio becomes nonbasic variable

The simplex method then checks if the current BFS is optimal by calculating the relative
profit coefficient for all nonbasic variables and the cycle is repeated until optimality
conditions are reached.
Summary of steps

1. Express problem in standard form.


2. Start with an initial BFS
3. Use inner product rule to find the relative-profit coefficients.
4. If all the relative profit coefficients are nonpositive, the current BFS is optimal.
otherwise select the nonbasic variable with most positive relative-profit coefficient to
enter as basic variable.
5. Apply Minimum Ration Rule to determine the basic variable that will become
nonbasic variable.
6. Checks if the current BFS is optimal by calculating the relative-profit coefficients for
all nonbasic variables and repeat the cycle until optimality conditions are reached.
Example 1:

Maximize Y = 6#" + 8## Maximize Y = 6#" + 8## + 0A" + 0A#

Subject to 5#" + 10## ≤ 60 Subject to 5#" + 10## + A" = 60


#" + ## ≤ 10 #" + ## + A# = 10
#" , ## ≥ 0 #" , ## , A" , A# ≥ 0

Introducing two slack variables


CB Basis Cj Solution +<*
6 8 0 0 bj "*)
(constants) ratio
#" ## A" A#
0 A" 5 10 1 0 60 60
=6
10
0 A# 1 1 0 1 10 10 S1
= 10 leaves
1
(-̅ 6 8 0 0 Z=0

X2 enter
Maximize Y = 6#" + 8## + 0A" + 0A#
Inner product rule:
Subject to 5#" + 10## + A" = 60
&
#" + ## + A# = 10
(<- = (- − G (+* "*- #" , ## , A" , A# ≥ 0
*,"
CB Basis Cj Solution +<*
6 8 0 0 bj "*)
(constants) ratio
#" ## A" A#
8 ## 1/2 1 1/10 0 6 6
= 12
1/2
0 A# 1/2 0 -1/10 1 4 4
=8
1/2
(-̅ 2 0 -4/5 0 Z=48

&
(<- = (- − G (+* "*- 1 1
, = 1 ⇒ (<- = 6 − 8× + 0× = 2
*," 2 2
1 −1 4
, = 2 ⇒ (<- = 0 − 8× + 0× =−
10 10 5

Y = 6#" + 8 ## + 0. A" + 0. A#
= 8×6 = 48
CB Basis Cj Solution +<*
6 8 0 0 bj "*)
(constants) ratio
#" ## A" A#
8 ## 0 1 1/5 -1 2
6 #" 1 0 -1/5 2 8
(-̅ 0 0 -2/5 -4 Z=64
Example 2:

Maximize Y = 2#" + 3## + #$ Raw Kg required/kg Maximum


Materials of product availability
Subject to #" + ## + 4#$ ≤ 100 D E F per day, kg
#" + 2## + #$ ≤ 150
3#" + 2## + #$ ≤ 320 A 1 1 4 100
B 1 2 1 150
#" , ## , #$ ≥ 0 C 3 2 1 320
Profit /Kg 2 3 1

Maximize Y = 2#" + 3## + #$ + 0A" + 0A# + 0A$

Subject to #" + ## + 4#$ + A" . = 100


#" + 2## + #$ + A# . = 150
3#" + 2## + #$ + A$ = 320

#" , ## , #$ , A" , A# , A$ ≥ 0
CB Basis Cj Solutio +<*
2 3 1 0 0 0 n "*)
bj
#" ## #$ A" A# A$
0 A" 1 1 4 1 0 0 100 100
= 100
1
0 A# 1 2 1 0 1 0 150 150
= 75
2
0 A$ 3 2 1 0 0 1 320 320
= 160
2
(-̅ 2 3 1 0 0 0 Z=0

Maximize Y = 2#" + 3## + #$ + 0A" + 0A# + 0A$ Inner product rule


&
Subject to #" + ## + 4#$ + A" . = 100
#" + 2## + #$ + A# . = 150 (<- = (- − G (+* "*-
3#" + 2## + #$ + A$ = 320 *,"

#" , ## , #$ , A" , A# , A$ ≥ 0
CB Basis Cj Solutio +<*
2 3 1 0 0 0 n "*)
bj
#" ## #$ A" A# A$
0 A" 1 1 4 1 0 0 100 100
= 100
1
3 ## 1 2 1 0 1 0 150 150
= 75
2
0 A$ 3 2 1 0 0 1 320 320
= 160
2
(-̅ Z=

Maximize Y = 2#" + 3## + #$ + 0A" + 0A# + 0A$ Inner product rule


&
Subject to #" + ## + 4#$ + A" . = 100
#" + 2## + #$ + A# . = 150 (<- = (- − G (+* "*-
3#" + 2## + #$ + A$ = 320 *,"

#" , ## , #$ , A" , A# , A$ ≥ 0
CB Basis Cj Solutio +<*
2 3 1 0 0 0 n "*)
bj
#" ## #$ A" A# A$
0 A" 1/2 0 7/2 1 -1/2 0 25 25
= 50
1/2
3 ## 1/2 1 1/2 0 1/2 0 75 75
= 150
1/2
0 A$ 2 0 0 0 -1 1 170 170
= 85
2
(-̅ 1/2 0 -1/2 0 -3/2 0 Z=225

Inner product rule


d#
d# = &
2
d# (<- = (- − G (+* "*-
d" = d" −
2 *,"
d$ = d$ − d#
CB Basis Cj Solutio +<*
2 3 1 0 0 0 n "*)
bj
#" ## #$ A" A# A$
2 #" 1 0 7 2 -1 0 50
3 ## 0 1 -3 -1 1 0 50
0 A$ 0 0 -14 -4 1 1 70
(-̅ 0 0 -4 -1 -1 0 Z=250

Inner product rule


d" = 2 d"
d# = d# − d" &
d$ = d$ − 4d# (<- = (- − G (+* "*-
*,"
Simplex Method: Minimization Problem
There are two approaches to solve minimization problems:

Approach-1:
Convert the minimization problem to an equivalent maximization problem by
Multiplying the objective function by -1

Maximize Y = 2#" + 3## + #$


Minimize Y = −2#" − 3## − #$
Subject to #" + ## + 4#$ ≤ 100
Subject to #" + ## + 4#$ ≤ 100
#" + 2## + #$ ≤ 150
#" + 2## + #$ ≤ 150
3#" + 2## + #$ ≤ 320
3#" + 2## + #$ ≤ 320
#" , ## , #$ ≥ 0
#" , ## , #$ ≥ 0
Simplex Method: Minimization Problem

Approach-2:

The coefficient in the (<- row give the net change in the value of Z per unit
increase in the nonbasic variable. A negative co-efficient in the (<- indicates that
the corresponding nonbasic variable will decrease the value of the objective
function.

Thus, the nonbasic variable with most negative (<- enters the basis. Note that the
minimum ratio rule remain unchanged.

Optimality criteria: All (<- ≥ 0


Minimize Y = −2#" − 3## − #$

Subject to #" + ## + 4#$ ≤ 100


#" + 2## + #$ ≤ 150
3#" + 2## + #$ ≤ 320

#" , ## , #$ ≥ 0
Example 3:

Minimize Y = −2#" − 3## − #$ Raw Kg required/kg Maximum


Materials of product availability
Subject to #" + ## + 4#$ ≤ 100 D E F per day, kg
#" + 2## + #$ ≤ 150
3#" + 2## + #$ ≤ 320 A 1 1 4 100
B 1 2 1 150
#" , ## , #$ ≥ 0 C 3 2 1 320
Profit /Kg 2 3 1

Minimize Y = −2#" − 3## − #$ + 0A" + 0A# + 0A$

Subject to #" + ## + 4#$ + A" . = 100


#" + 2## + #$ + A# . = 150
3#" + 2## + #$ + A$ = 320

#" , ## , #$ , A" , A# , A$ ≥ 0
CB Basis Cj Solutio +<*
-2 -3 -1 0 0 0 n "*)
bj
#" ## #$ A" A# A$
0 A" 1 1 4 1 0 0 100 100
= 100
1
0 A# 1 2 1 0 1 0 150 150
= 75
2
0 A$ 3 2 1 0 0 1 320 320
= 160
2
(-̅ -2 -3 -1 0 0 0 Z=0

Maximize Y = −2#" − 3## − #$ + 0A" + 0A# + 0A$ Inner product rule


&
Subject to #" + ## + 4#$ + A" . = 100
#" + 2## + #$ + A# . = 150 (<- = (- − G (+* "*-
3#" + 2## + #$ + A$ = 320 *,"

#" , ## , #$ , A" , A# , A$ ≥ 0
CB Basis Cj Solutio +<*
-2 -3 -1 0 0 0 n "*)
bj
#" ## #$ A" A# A$
0 A" 1 1 4 1 0 0 100 100
= 100
1
-3 ## 1 2 1 0 1 0 150 150
= 75
2
0 A$ 3 2 1 0 0 1 320 320
= 160
2
(-̅ Z=

Maximize Y = −2#" − 3## − #$ + 0A" + 0A# + 0A$ Inner product rule


&
Subject to #" + ## + 4#$ + A" . = 100
#" + 2## + #$ + A# . = 150 (<- = (- − G (+* "*-
3#" + 2## + #$ + A$ = 320 *,"

#" , ## , #$ , A" , A# , A$ ≥ 0
CB Basis Cj Solutio +<*
-2 -3 -1 0 0 0 n "*)
bj
#" ## #$ A" A# A$
0 A" 1/2 0 7/2 1 -1/2 0 25 25
= 50
1/2
-3 ## 1/2 1 1/2 0 1/2 0 75 75
= 150
1/2
0 A$ 2 0 0 0 -1 1 170 170
= 85
2
(-̅ -1/2 0 1/2 0 3/2 0 Z=-225

Inner product rule


d#
d# = &
2
d# (<- = (- − G (+* "*-
d" = d" −
2 *,"
d$ = d$ − d#
CB Basis Cj Solution +<*
-2 -3 -1 0 0 0 bj "*)
#" ## #$ A" A# A$
-2 #" 1 0 7 2 -1 0 50
-3 ## 0 1 -3 -1 1 0 50
0 A$ 0 0 -14 -4 1 1 70
(-̅ 0 0 4 1 1 0 Z=-250

Inner product rule


d" = 2 d"
d# = d# − d" &
d$ = d$ − 4d# (<- = (- − G (+* "*-
*,"
Simplex Method: Unbounded Problem

A LPP has unbounded optimum when its optimal value is not finite. For maximization
Problem the optimum value tends to +∞ and for minimization problem, the optimum
Value tends to - ∞.

Example: Maximize Y = 4#" + 2##

Subject to #" ≥ 4
## ≤ 2
#" , ## ≥ 0

If minimum ratio rule fails to identify the basic variable that will leave the basis

+<*
max #) = min , ∀$
0!"12 "*)

If all the coefficients ("*) )are negative then the problem is unbounded
CB Basis Cj Solution +<*
4 2 0 0 bj "*)
(constants) ratio
#" ## A" A#
0 A" -1 0 1 0 -4 ?
0 A# 0 1 0 1 2 ?
(-̅ 4 2 0 0 Z=0

X1 enter
Maximize Y = 4#" + 2## +0.s1+0.s2
Inner product rule: Subject to #" − A" = 4
& ## + A# = 2
(<- = (- − G (+* "*- #" , ## ≥ 0
*,"
Simplex Method: Degeneracy and cycling

A BFS in which one or more of the basic variables are zero are called a degenerate
BFS.
Consider that there is a tie between two rows while applying the Minimum Ratio rule
for a simplex method then at least one basic variable will be zero in the next iteration
and lead to degenerate solution.

When we have degenerate BFS, it is possible that the minimum ratio will be zero. This
implies that when a basis change is performed there will not be any improvement in
the Value of the objective function.

Thus simplex method can go through a series of iterations without making any
Improvement in the objective function. This is known as cycling.
Maximize Y = 10#" − 57## − 9#$ − 24#5

Subject to 0.5#" − 5.5## − 2.5#$ + 9#5 ≤ 0


0.5#" − 1.5## − 0.5#$ + #5 ≤ 0
#" ≤ 1
#" , ## , #$ , #5 ≥ 0

Bland’s Rule:

1. Among nonbasic variables that have a positive relative-profit coefficient,


chose the one with least index.
2. Among rows that satisfy the minimum ratio rule, chose the one with least
index (in case of tie)
Maximize Y = #" + ## + #$ Maximize Y = #" + ## + #$ + 0. A" + 0. A#

Subject to #" + ## ≤ 1 Subject to #" + ## + A" = 1


−## + #$ ≤ 0 −## + #$ + A# = 0
#" , ## , #$ ≥ 0 #" , ## , #$ ≥ 0

CB Basis Cj Solution +<*


1 1 1 0 0 bj "*)
(constant ratio
#" ## #$ A" A# s)
0 A" 1 1 0 1 0 1 1
0 A# 0 -1 1 0 1 0
(-̅ 1 1 1 Z=0

#" = 0, ## = 0, #$ = 0, A" = 1, A# = 0
CB Basis Cj Solution +<*
1 1 1 0 0 bj "*)
(constant ratio
#" ## #$ A" A# s)
1 #" 1 1 0 1 0 1
0 A# 0 -1 1 0 1 0 0
(-̅ 0 1 -1 Z=1

#" = 1, ## = 0, #$ = 0, A" = 0, A# = 0

CB Basis Cj Solution +<*


1 1 1 0 0 bj "*)
(constant ratio
#" ## #$ A" A# s)
1 #" 1 1 0 1 0 1 1
1 #$ 0 -1 1 0 1 0
(-̅ 1 -1 -1 Z=1

#" = 1, ## = 0, #$ = 0, A" = 0, A# = 0
Multiple solution
If the relative profit of a nonbasic variable is zero then there will be an alternate optimum.

Example:

Maximize Y = 40#" + 30## Maximize Y = 40#" + 50## + 0. A" + 0. A#

Subject to #" + 2## ≤ 40 Subject to #" + 2## + A" = 40


4#" + 3## ≤ 120 4#" + 3## + A# = 120
#" , ## ≥ 0 #" , ## , A" , A# ≥ 0

CB Basis Cj Solution +<*


40 30 0 0 bj "*)
(constants) ratio
#" ## A" A#
0 A" 1 2 1 0 40 40
0 A# 4 3 0 1 120 30
(-̅ 40 30 Z=0
CB Basis Cj Solution +<*
40 30 0 0 bj "*)
(constants) ratio
#" ## A" A#
0 A" 1 2 1 0 40 40
0 A# 4 3 0 1 120 30
(-̅ 40 30 Z=0

CB Basis Cj Solution +<*


40 30 0 0 bj "*)
(constants) ratio
#" ## A" A#
0 A" 0 5/4 1 -1/4 10 8
40 #" 1 3/4 0 1/4 30 40
(-̅ 0 -10 Z=1200
CB Basis Cj Solution +<*
40 30 0 0 bj "*)
(constants) ratio
#" ## A" A#
0 A" 0 5/4 1 -1/4 10 8
40 #" 1 3/4 0 1/4 30 40
(-̅ 0 -10 Z=1200

X1 = 30. max = 1200

CB Basis Cj Solution +<*


40 30 0 0 bj "*)
(constants) ratio
#" ## A" A#
30 ## 0 1 4/5 -1/5 8 10
40 #" 1 0 -3/5 2/5 24
(-̅ 0 -10 Z=1200

X1 = 24, x2 = 8, max = 1200


Nonlinear optimization with constraints
Method of Lagrange Multipliers: Equality constraints

In this method, an equality-constraints problem is converted to an equivalent


unconstrained problem with help of certain unspecified parameter known as Lagrange
multiplier.

The method of Lagrange multipliers provides a set of necessary condition for optimal
solution of the problem.

min =(#" , ## )

A*+,&(- -. ℎ #" , ## = 0

The method of Lagrange multiplier converts the problem as follows:

g #" , ## , h = = #" , ## + hℎ #" , ##

There is no restriction for sign of Lagrange multiplier


Method of Lagrange Multipliers

The Lagrange function : g #" , ## , h = = #" , ## + hℎ #" , ##

The necessary condition for optimum can be written as follows:

Eg E= Eℎ
# ,# ,h = # ,# +h # ,# = 0
E#" " # E#" " # E#" " #

Eg E= Eℎ
# ,# ,h = # ,# +h # ,# = 0
E## " # E## " # E## " #

Eg
# , # , h = ℎ #" , ## = 0
Eh " #

Solution #"∗ , ##∗ , h∗ can be found out by solving above equations simultenously.

If Hessian matrix of L at the solution is positive definite, then the solution is a local minimum.
If Hessian matrix of L at the solution is negative definite, then the solution is a local maximum.
Method of Lagrange Multipliers: General forms

min =(i) i = n vector #" h"


## h#
i= . p= .
A*+,&(- -. ℎ- i = 0, , = 1,2,3 … . . ! . .
#%
h%
Lagrange of the problem with (n+m) unknowns

g #" , ## , … … … , #% , h" , h# , … … … , h& = = i + h" ℎ" i + ⋯ + h& ℎ& i


= = i + ∑&
-," h- ℎ- i
The necessary conditions are given as:
&
Eg E= Eℎ-
i, q = i + G h- i = 0, i = 1,2 … … , n, j = 1,2, … … . , m
E#* E#* E#*
-,"

Eg
i, q = ℎ- i = 0, , = 1,2, … … , !
Eh-
Problem 1:

m"#$!$%& = #, s = # + s

A*+,&(- -. ℎ #, s = # # + s # − 1 = 0
The Lagrange function : g #, s, h = = #, s + hℎ #, s

= # + s + h(# # + s # − 1)

Eg E= Eℎ
#, s, h = #, s + h #, s = 0
E# E# E#
⇒ 1 + 2h# = 0

Eg E= Eℎ
#, s, h = #, s + h #, s = 0
Es Es E#
⇒ 1 + 2hs = 0

Eg
#, s h = ℎ #, s = 0
Eh
⇒ ## + s# − 1 = 0
The Lagrange function : g #, s, h = = #, s + hℎ #, s

= # + s + h(# # + s # − 1)

Eg E= Eℎ
#, s, h = #, s + h #, s = 0
E# E# E# √2 √2 1
⇒ 1 + 2h# = 0 #= ,s = ,λ = −
2 2 √2
Eg E= Eℎ
#, s, h = #, s + h #, s = 0
Es Es E# = #, s = √2
⇒ 1 + 2hs = 0

Eg 2 √2 1
#, s h = ℎ #, s = 0 #=− ,s = − ,λ =
Eh 2 2 √2
⇒ ## + s# − 1 = 0
= #, s = −√2
Problem 2:

m"#$!$%& = #, s = # # s

A*+,&(- -. ℎ #, s = # # + s # − 3 = 0
Problem 3:

min =(#" , ## ) = (#" − 1.5) # +(## − 1.5) #

A*+,&(- -. ℎ #" , ## = #" + ## − 2 = 0

Solution: #"∗ = 1, ##∗ = 1, h∗ = 1

At optimal point, the gradient of the objective function and the constraint function
are along same line.

Eg E= Eℎ
#" , ## , h = #" , ## + h # ,# = 0
E#" E#" E#" " #
E= Eℎ
#" , ## = −h # ,#
E#" E#" " #
Problem 4:

m"# .6 !$5 = #, s = 8 # # − 2y

A*+,&(- -. ℎ #, s = # # + s # − 1 = 0
Hessian Matrix

Ø If and only if all leading principal minors of the matrix are positive, then the matrix is
positive definite. For the Hessian, this implies the stationary point is a minimum.
Ø If and only if the kth order leading principal minor of the matrix has sign (-1)k, then
the matrix is negative definite. For the Hessian, this implies the stationary point is a
maximum.
Ø If none of the leading principal minors is zero, and neither (a) nor (b) holds, then the
matrix is indefinite. For the Hessian, this implies the stationary point is a saddle point.
Lagrange Multipliers: Inequality constraints

7 min =(#)
g #, h, w = = i + ∑-," h- (Q- i + w-# )
A*+,&(- -. Q # < 0
The necessary conditions are given as:

7
Eg E= EQ-
i, q, w = i + G h- i = 0, i = 1,2 … … , n
E#* E#* E#*
-,"

Eg
i, q, w = Q- i + w-# = 0, , = 1,2, … … , X
Eh-
⟺ Q- i < 0
w-# > 0

Eg
i, q, w = 2h- w- = 0, j = 1,2 … … , p ⟺ h- Q- = 0
Ew-

h- ≥ 0, j = 1,2 … … , p
Karush-Kohn-Tucker (KKT) conditions

7 min =(#)
g i, N, * = = i + ∑&
*," N* ℎ* i + ∑-," *- Q- i A*+,&(- -. Q # ≤0
h # =0
The necessary conditions are given as:
& 7
Eg E= EQ- EQ-
i, q, w = i + G N- i + G *- i = 0, i = 1,2 … … , n
E#* E#* E#* E#*
-," -,"

ℎ- i = 0 , j = 1,2 … … , m Feasibility check


Q- i ≤ 0, j = 1,2 … … , p

*- Q- = 0, j = 1,2 … … , p Switching Condition

*- ≥ 0, j = 1,2 … … , p Non-negativity of Lagrange multiplier


Example :1

min = # = #"# + 2### + 3#$#

A*+,&(- -.. #" −## − 2#$ ≤ 12


#" + 2## − 3#$ ≤ 8

g = #"# + 2### + 3#$# + *" #" − ## − 2#$ + *# #" + 2## − 3#$

E= E= EQ 2#" + *" + *# = 0. (1)


+ *" + *# =0 4## − *" + 2*# = 0 (2)
E# E# E#
6#$ − 2*" − 3*# = 0 (3)

*" #" − ## − 2#$ − 12 = 0 (4)


*- Q- = 0 *# (#" + 2## − 3#$ − 8) = 0 (5)

#" − ## − 2#$ − 12 ≤ 0 (6)


Q- i ≤ 0
(#" + 2## − 3#$ − 8) ≤ 0 (7)

*- ≥ 0 *" ≥ 0 (8)
*# ≥ 0 (9)
Case 1: *" = 0
8 8
From (1), (2) and (3), we get x" = x# = − #$ , x$ = #$
Put these in (5), we get,

u## + 8u# = 0 ⇒ u# = 0 or − 8

Hence, #" = ## = ## = 0

Case 2: (#" − ## − 2#$ − 12) = 0

Using (1), (2) and (3), we get 17u" + 12u# = −144

Since, *- ≥ 0 it is not possible


Example :2

min = # = #" − 1.5 # + ## − 1.5 #

A*+,&(- -.. #" +## − 2 ≤ 0

Solution: #"∗ = 1, ##∗ = 1, h"∗ = 1, w ∗ = 0


Example :3

In order to minimize the cost of operation, we need to determine the optimal values of
reactor volume (V), feed rate (F) and concentration of A in the reactor (CA). Formulate a
constrained optimization problem to determine optimal V,F and CA. Use mass balance
equations on A and B to formulate these constraints. Use method of Lagrange
multipliers to derive the expressions for optimal V,F and CA.

F, CA0

F, CA
Material Balance on A Material Balance on B

0 = Ñ92 Ö − 69 Ü + ÖÑ9 0 = 6+ Ü − ÖÑ+


⇒ Ñ92 − Ñ9 Ö − 0.1Ñ9 Ü = 0 ⇒ 0.1Ñ9 Ü − 10 = 0

6+ = kÑ9 àℎ&6&, â = 0.1ℎ:" !.O


-. X6.E*(& 10 .= @
ℎ6

ä$5$!$%& Ñ! = 5Ñ92 Ö + 0.3Ü

)*+,&(- -.: Ñ92 − Ñ9 Ö − 0.1Ñ9 Ü = 0


0.1Ñ9 Ü − 10 = 0

g = 5Ñ92 Ö + 0.3Ü + h" Ñ92 − Ñ9 Ö − 0.1Ñ9 Ü + h# (0.1Ñ9 Ü − 10)


g = 5Ñ92 Ö + 0.3Ü + h" Ñ92 − Ñ9 Ö − 0.1Ñ9 Ü + h# (0.1Ñ9 Ü − 10)

Eg
= 5Ñ92 + h" Ñ92 − Ñ9 = 0 CA0 0.04 mol/hr

F 12182 m3/hr
Eg V 31455 m3
= 0.3 − 0.1h" Ñ9 + 0.1h# Ñ9 = 0

CA 0.0318 mol/m3
Eg CT 11872.9 Rs/hr
= h" Ö − 0.1h" Ü + 0.1h# Ü = 0
EÑ9

Eg
= Ñ92 − Ñ9 Ö − 0.1Ñ9 Ü = 0
Eh"

Eg
= 0.1Ñ9 Ü − 10 = 0
Eh#
Sequential Linear Programming : NLP

Minimize =(#)

Subject to Q- (#) ≤ 0
ℎ; = 0
#*<+ ≤ #* ≤ #*=+

Minimize = # > + ∇=(# > )(# − # > )

Subject to Q- # > + ∇Q- # > # − # > ≤ 0


ℎ; # > + ∇ℎ; # > # − # > = 0
#*<+ ≤ #* ≤ #*=+
Example 1: Matlab Programming

= = 42
min = # = #"# + ###
a= −4 1
+ = −4
A*+,&(- -. Q # = #"# − ### ≥ 0
O+ = 0.5 0
ℎ # = 2 − #" − ### = 0
*+ = 2.5 3
0.5 ≤ #" ≤ 2.5, 0 ≤ ## ≤ 3
/&å = 1 2
+&å = 3
# = O$5X6.Q(=, ", +, "&å, +&å, O+, *+)
# 2 = (2 1)

min = #, # 2 = 5 + 4 #" − 2 + 2(## − 1)

A*+,&(- -. Q #, # 2 = 3 + 4 #" − 2 − (## − 1)


ℎ #, # 2 = −1 − (#" − 2) − 2(## − 1)
0.5 ≤ #" ≤ 2.5, 0 ≤ ## ≤ 3

"" @
Solution: #" = ,
? ?
. Q #" = 0.6049 > 0 ℎ #" = −0.0123 ≠ 0
"" @
Need to go for next iteration with #" = ,
? ?
Objective function
min = # = #"# + ###
#" = 2,1
2#" 4
∇= = = , = 2,1 = 5
2## 2
5.à, = # > + ∇= # > # − # > Q # = #"# − ## ≥ 0
# −2 ℎ # = 2 − #" − ### = 0
=5+ 4 2 "
## − 1
= 5 + 4 #" − 2 + 2(## − 1)

Constraints
#" = 2,1 #" = 2,1
2#" 4 −1
∇Q = = , Q 2,1 = 3 −1
−1 −1 ∇ℎ = = , ℎ 2,1 = −1
−2## −2
5.à, Q # > + ∇Q # > # − # >
# −2 5.à, ℎ # > + ∇ℎ # > # − # >
= 3 + 4 −1 " # −2
## − 1 = −1 + −1 −2 "
## − 1
= −1 − #" − 2 − 2 ## − 1
= 3 + 4 #" − 2 − (## − 1)
Objective function

11 8
#" = ,
9 9
22
2#" 9 11 8 185
∇= = = , = , =
2## 16 9 9 81
9
5.à, = # + ∇= # > # − # >
>
11
185 22 16 " 9 # −
= +
81 9 9 8
## −
9
185 22 11 16 8
= + #" − + ## −
81 9 9 9 9
Constraints

11 8
#" = , 11 8
9 9 #" = ,
22 9 9
2#" 11 8 49 −1
∇Q = = 9 , Q , = −1 11 8 1
−1 9 9 81 ∇ℎ = = 16 , Q , =−
−1 −2## − 9 9 81
9
5.à, ℎ # > + ∇ℎ # > # − # >
5.à, Q # > + ∇Q # > # − # > 11
11 1 # −
#" − 16 " 9
49 22 9 = − + −1 −
= + −1 81 9 8
81 9 8 ## −
## − 9
9 1 11 16 8
= − − #" − − ## −
49 22 11 8 81 9 9 9
= + #" − − ## −
81 9 9 9
Matlab Functions: fmincon
fmincon solves general nonlinear programming problem

Minimize =(#)
Matlab programming

Subject to # = =!$5(.5(=*5, # 2 , /, +)
# = =!$5(.5(=*5, # 2 , /, +, /&å, +&å)
/. # ≤ +
# = =!$5(.5(=*5, # 2 , /, +, /&å, +&å, O+, *+)
/.A # = +.A # = =!$5(.5(=*5, # 2 , /, +, /&å, +&å, O+, *+, 5.5O$5(.5)
c # ≤0
=*5(-$.5 (, (&å = 5.5O$5(.5 #
(.A # = 0
( # ≤0
g@ ≤ # ≤ ç@ (&å # = 0
&5E
Example 1:

Minimize = #, s = #, s

Subject to : # # + s # = 1

# = =!$5(.5(=*5, # 2 , /, +, /&å, +&å, O+, *+, 5.5O$5(.5)

=*5(-$.5 = = =*5( ë =*5(-$.5 (, (&å = 5.5O$5(.5 ë


%( # ≤ 0
#=ë 1 ; %(&å # = 0
s=ë 2 ;
f= # + s; #=ë 1 ;
&5E s=ë 2 ;
( = [];
(&å = # ^ 2 + s ^ 2 − 1;
&5E
Matlab Functions: fsolve
fsolve solves a system of nonlinear equations

# = =A.ON&(=*5, # 2 )
Matlab Programming
Example 2
=*5(-$.5 = = =*5( #
2#" − ## = & :C%
−#" + 2## = & :C$ = 1 = 2 ∗ # 1 − # 2 − exp −# 1 ;
= 2 = −# 1 + 2 ∗ # 2 − exp −# 2 ;
At #2 = [−5,5]
&5E
CSTR in Series
/→@
F, CA0

F, CA1 F, CA2 F, CA3

The objective is to determine the volume of reactor 1, 2 and 3 that


minimize the concentration of A at the outlet. Provided that the total
volume of three reactor should not exceed 7 m3.

Reaction rate constant K= 1.0 m1.5/hr. kmol0.5


Volumetric flow rate F0=2.0 m3/hr
Feed concentration of reactant A = 1 kmol/m3
From Mass Balance

2 Ñ92 − Ñ9" = 69" Ü"


2 Ñ9" − Ñ9# = 69# Ü#
2 Ñ9# − Ñ9$ = 69$ Ü$ Model Equation
".E
69" = â ∗ Ñ9"
".E
69# = â ∗ Ñ9#
".E
69$ = â ∗ Ñ9$

Ü" + Ü# + Ü$ ≤ 7 Inequality constraints

Minimize Ñ9$
function ex3(V) function F=CSTR(CA)
global V global V
A=[1 1 1]; F0 = 2;
B=7; K=1;
Aeq=[]; CA0 = 1;
Beq=[]; V1 = V(1);
LB=[]; V2 = V(2);
UB=[]; V3 = V(3);
V0=[2,3,2];
[x]=fmincon(@objfun,V0,A,B,Aeq,Beq,LB,UB) R1 = K*CA(1)^1.5
end R2 = K*CA(2)^1.5
R3 = K*CA(3)^1.5
function F = objfun(VOL)
global V
F(1) = F0*(CA0 - CA(1)) - V1*R1;
V = VOL;
F(2) = F0*(CA(1) - CA(2)) - V2*R2;
CA_guess = [1 1 1];
F(3) = F0*(CA(2) - CA(3)) - V3*R3;
CA = fsolve(@CSTR, CA_guess)
F = CA(3)
end
end
Example 4:

F, CA0

F, CA1 F, CA2 F, CAn

1st order reaction rate constant K= 12 hr-1


Volumetric flow rate F=0.9 m3/hr

The cost of reactor depends on it’s volume : Ñ = 137000Ü 2.5


F
Desired concentration change : F&' = 10000
&(

Determine the optimum number of reactors so that cost is minimum.


Find a minimum of an unconstrained multivariable function

# = =!$5*5((=*5, # 2 )

Objective function:
2.5
40 (
(.A- = (1370005) 10000 − 1
3
Process Flow Diagram

From process to Information flow diagram


A process flow diagram depicts the equipment and pipes which make up the plant.
A pipes
The set of are
distinc
ctive symb
shown as bols
arrows willpointing
be adopte ed the
in for direction
the various
of type of unit
material flow. Such a
computation.
diagram can beThes se are giveen
encoded in in figuree 1.4.form for use in computer. This is done in two
numerical
steps
• Conversion of process flow diagram into information flow diagram.
• Conversion of information flow diagram in to numerical form.
Information Flow Diagram

Althouugh the infoormation flow


fl diagraam will gennerally reseemble the pprocess floow
diagramm, these will be diffeerent in som
me stream and units aare not in bboth diagraam.
In the case
c of fig the surge tanks
t of fig
g are absennt because the processs is steadyy
state an
nd capital cost
c is igno ored.
The information flow diagram represents the flow of information via
streams between unit computations. It is constructed as follows:
(a) Each unit computation is represented by a suitable symbol.
(b) Each symbol is given the name of a unit computation
(c) The flows of information between units are drawn as directed
lines (streams) between symbols, with arrows indicating the direction
of information flow.
(d) The steams and symbols are separately numbered, usually
ascending in the direction of flow. The numbering is arbitrary, but no
two symbols or two streams may have the same numbers.
Conversion of Information Diagram into Numerical form

There are four methods


§ The process matrix method
§ The stream connection matrix method
§ The incidence matrix method
§ The adjacency matrix method
The contents of that row are the number of the particular unit, the name of
the unit computation representing the unit and the input stream number (as
positive numbers) followed by the output streams numbered (as negative
Thenumbers).
process matrix:
The process of matrix of Fig 1 &3 are

Table 2.1: Process Metric of information flow diagram figure 1.5

Unit Unit Associated streams numbers


computation
name
1 MIXER 1 7 -2
2 DISTL 2 -8 -3
3 REACT 3 -4
4 DISTL 4 -5 -9
5 DISTL 5 -7 -6
The first entry is the stream number and the second and third are the
numbers of the equipment units from which that stream comes and to which
it goes, respectively. The stream connection matrix of fig 3 is given in table
2.2.stream connection matrix
The

Table 2.2: Stream connection metrics of information flow diagram of figure


1.5.

Stream number From unit number to unit number


1 0 1
2 1 2
3 2 3
4 3 4
5 4 5
6 5 0
7 5 1
8 2 0
9 4 0
It can be seen that of the three items of information is the process matrix
only the first is retained the stream connection matrix. thus there is neither
indication of the type of unit computation nor of the order of input and
The incidence matrix method:
output streams of a unit.

The incidence matrix method:


The incidence metrics of the same example is given in table 2.3.

Table 2.3: Incidence metrics for information flow diagram of figure 1.5.

Unit stream number


no. 1 2 3 4 5 6 7 8 9
Althouugh the infoormation flow
fl diagraam will gennerally reseemble the pprocess floow
diagramm, these will be diffeerent in som
me stream and units aare not in bboth diagraam.
1 1 -1 In the case
c of fig the surge tanks
t g are absen1
of fig -1processs is steadyy
nt because the
state an
nd capital cost
c is igno ored.
2 1 -1
3 1 -1 -1
4 1 -1
5 1 -1 -1
sum 1 0 0 0 0 -1 0 -1 -1
Table 3

The left column contains the equipment number and the remaining columns
he incidence matrix contains the same information as the stream
nnection and thus has less information than the process matrix.

The Adjacency Metrics Method:


ency Metrics Method:

To unit no.
1 2 3 4 5
From 1 1
unit 2 1
No. 3 1
4 1
5 1

ble 2.4: Adjacency Metrics of Information Flow diagram of figure 1.5.


Sequential calculations:
Sequential calculations:
The process matrix can be used to find a workable sequence of calculation
The
for processset.
a recycle matrix can be used to find a workable sequence of calculation
for a recycle set.
Sequential calculations:

Fig 3.1: Process Flow Diagram


Fig 3.1: Process Flow Diagram
Table: 3.2: Process Metrics
Unit no Unit Table: 3.2: Process
Associated streamsMetrics
Unit no compu.
Unit Associated streams
name
compu.
1 name
UNIT1 2 -1 -3 0
21 UNIT1
UNIT2 62 8-1 -7-3 00
32 UNIT2
UNIT3 16 58 -4-7 00
43 UNIT3
UNIT4 71 -55 -8-4 00
54 UNIT4
UNIT5 37 4-5 -2-8 -60
5
The process matrix UNIT5 3 find a workable
can be used to 4 sequence -2 of calculation
-6 for a
The process
recycle set by matrix can besearching
exhaustively used to find
amonga workable sequence
the streams of calculation
of the recycle set. for a
recycle set by exhaustively searching among the streams of the recycle set.
From table 3.2, If stream 1 & 5 were known only unit 3 could be calculated. But if
From table
stream 2 and3.2, If stream
7 were known1 the
& 5entire
wereset
known
couldonly unit 3 could
be calculated be calculated.
in the But
sequence (1, 4 ,if
There are four loops which are (2, 3) , (7, 8), (1, 4, 2), and (4, 6, 7, 5). These loops
can be represented in the cycle matrix given in the table, where rank of a loop is
the number of streams in it and the stream frequency is the number of times a
Identification
stream is in a of Recycle Loops:
loop.

Table 3.3: Identification of Recycle Loops:


Loop Streams no. Loop
no. 1 2 3 4 5 6 7 8 rank
1 1 1 2
2 1 1 2
3 1 1 1 3
4 1 1 1 1 4
Sequential calculations:
Stream 1 2 1 2 1 1 2 1
The process matrix can be used to find a workable sequence of calculation
Frequecy for a recycle set.

14352
A minimum number of streams to be cut (assumed known) in order to eliminate all
recycle can be found in the following manner. A stream is said to be contained in
another j if each loop in which stream is found also involves stream j.

Thus stream 1 & 3 are contained in stream 2, streams 5, 6 & 8 are contained in
stream 7 and 1, 5, & 6 Figare
3.1: Process Flow Diagram
contained in stream 4. Since no more recycle loops
Table: 3.2: Process Metrics
Unit no Unit Associated streams
compu.
name
could be cut by any stream than by the stream which contains it, streams 1, 3, 5, 6,
and 8 cab be eliminated

Table 3.4: Remaining recycle loops:

Loop No. Streams Loop no.


2 4 7
1 1 1 1
2 1
3 1 1 2
4 1 1 2
Stream 2 2 2
frequency

Since a loop of rank 1 can only be cut by cutting the one remaining stream, streams
Since
2 & 7 amust
loopbeof rank
cut. 1 can only
Fortunately this be cut loop
breaks by cutting thetheone
3 & 4 at remaining
same stream,
time. Therefore,
streams
assuming2values
& 7 mustfor thebevariables
cut. Fortunately
of streamsthis
2 &breaks loop a3 direct
7 will allow & 4 at the same
calculations
time. Therefore, assuming values for the variables of streams 2 & 7 will allow
of all the unit of same fig. in the sequence (1, 4, 3, 5,2)
a direct calculations of all the unit of same fig. in the sequence (1, 4, 3, 5,2)
The Adjacency Matrix

The adjacency matrix is another tool for separating serial and recycle sets of units.
The information flow diagram for fig. 3.2 is used as the example for the
development of adjacency matrix and its adjancency is given in table 3.5:
Quiz
min = # = −#"5 ##

A*+,&(- -. Q # = #"# + ### − 25 ≤ 0


ℎ # = #" + 2## − 6 ≤ 0
#" ≥ 0, ## ≥ 0

Initial guess [2 1]

What is the solution after 1 iteration

You might also like