G.Anuradha
Contents
• Derivative-based Optimization
– Descent Methods
– The Method of Steepest Descent
– Classical Newton’s Method
– Step Size Determination
• Derivative-free Optimization
– Genetic Algorithms
– Simulated Annealing
– Random Search
– Downhill Simplex Search
What is Optimization?
• Choosing the best element from some set
of available alternatives
• Solving problems in which one seeks to
minimize or maximize a real function
Notation of Optimization
Optimize
$y = f(x_1, x_2, \ldots, x_n)$  (1)
subject to
$g_j(x_1, x_2, \ldots, x_n) \;\{\le, =, \ge\}\; b_j, \quad j = 1, 2, \ldots, m$  (2)
Eqn (1) is the objective function.
Eqn (2) is the set of constraints imposed on the solution.
$x_1, x_2, \ldots, x_n$ are the decision variables.
Note:- The problem is either to maximize or minimize the value of the objective function.
Complicating factors in optimization
1. Existence of multiple decision variables
2. Complex nature of the relationships
between the decision variables and the
associated income
3. Existence of one or more complex
constraints on the decision variables
Types of optimization
• Constrained:- The objective function is maximized or minimized subject to constraints on the decision variables
• Unconstrained:- No constraints are imposed on the decision variables, and differential calculus can be used to analyze them
Examples
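As an illustration of the notation above (the numbers are invented for this example): minimize $y = (x_1 - 1)^2 + x_2^2$ subject to $x_1 + x_2 \le 2$. Without the constraint, differential calculus gives the minimum directly: setting $\partial y / \partial x_1 = 2(x_1 - 1) = 0$ and $\partial y / \partial x_2 = 2x_2 = 0$ yields $x_1 = 1$, $x_2 = 0$. Since this point also satisfies the constraint $x_1 + x_2 \le 2$, it solves the constrained problem as well.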
Least Square Methods for System
Identification
• System Identification:- Determining a
mathematical model for an unknown system by
observing the input-output data pairs
• System identification is required
– To predict a system's behavior
– To explain the interactions and relationships between inputs and outputs
– To design a controller
• System identification involves two steps
– Structure identification
– Parameter identification
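To make parameter identification concrete, here is a minimal least-squares sketch in Python/NumPy; the model structure and the data points are assumptions for illustration, not from the slides:

```python
import numpy as np

# Hypothetical input-output pairs observed from an unknown system.
u = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Assumed model structure (structure identification): y = theta1*u + theta2.
# Parameter identification: fit theta1, theta2 by least squares.
Phi = np.column_stack([u, np.ones_like(u)])  # regressor matrix
theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(theta)  # approx [1.96, 1.10]
```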
Structure identification
• Apply a priori knowledge about the target system to determine a class of candidate models; parameter identification then fits the parameters of the chosen structure to the observed data.
Conditions for a Minimum
If $\nabla F(x)^T\big|_{x = x^*}\,\Delta x > 0$, then
$F(x^* - \Delta x) \approx F(x^*) - \nabla F(x)^T\big|_{x = x^*}\,\Delta x < F(x^*)$
But this would imply that $x^*$ is not a minimum. Therefore
$\nabla F(x)^T\big|_{x = x^*}\,\Delta x = 0$
must hold for every $\Delta x$, i.e. the gradient must vanish at a minimum: $\nabla F(x^*) = 0$.
For $x^*$ to be a strong minimum, the Hessian matrix must also be positive definite. A matrix $A$ is positive definite if:
$z^T A z > 0$ for any $z \neq 0$.
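A minimal numerical check of positive definiteness, assuming a symmetric matrix: all eigenvalues must be positive, which is equivalent to $z^T A z > 0$ for all $z \neq 0$:

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    # For symmetric A, z^T A z > 0 for all z != 0
    # exactly when every eigenvalue of A is positive.
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

A = np.array([[2.0, 2.0],
              [2.0, 4.0]])      # Hessian used in the later examples
print(is_positive_definite(A))  # True (eigenvalues 0.764 and 5.24)
```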
Descent Methods
$x_{k+1} = x_k + \alpha_k p_k$ or $\Delta x_k = x_{k+1} - x_k = \alpha_k p_k$
$p_k$ - search direction
$\alpha_k$ - learning rate
Steepest Descent
Choose the next step so that the function decreases:
$F(x_{k+1}) < F(x_k)$
where
$g_k \equiv \nabla F(x)\big|_{x = x_k}$
Choosing $p_k = -g_k$, the direction of steepest descent, gives
$x_{k+1} = x_k - \alpha_k g_k$
Example
$F(x) = x_1^2 + 2 x_1 x_2 + 2 x_2^2 + x_1$, $x_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$, $\alpha = 0.1$
$\nabla F(x) = \begin{bmatrix} \partial F / \partial x_1 \\ \partial F / \partial x_2 \end{bmatrix} = \begin{bmatrix} 2x_1 + 2x_2 + 1 \\ 2x_1 + 4x_2 \end{bmatrix}$, so $g_0 = \nabla F(x)\big|_{x = x_0} = \begin{bmatrix} 3 \\ 3 \end{bmatrix}$
$x_1 = x_0 - \alpha g_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - 0.1 \begin{bmatrix} 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 0.2 \\ 0.2 \end{bmatrix}$
[Contour plot of $F$ with the steepest-descent trajectory; axes $x_1, x_2 \in [-2, 2]$.]
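A minimal steepest-descent sketch of this example (function, starting point, and learning rate as on the slide; the 50-iteration budget is an arbitrary choice):

```python
import numpy as np

def grad_F(x):
    # gradient of F(x) = x1^2 + 2*x1*x2 + 2*x2^2 + x1
    return np.array([2*x[0] + 2*x[1] + 1,
                     2*x[0] + 4*x[1]])

x = np.array([0.5, 0.5])       # x_0
alpha = 0.1                    # learning rate
for k in range(50):            # arbitrary iteration budget
    x = x - alpha * grad_F(x)  # x_{k+1} = x_k - alpha * g_k
print(x)                       # approaches the minimum [-1, 0.5]
```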
Effect of learning rate
• The larger the learning rate, the more oscillatory the trajectory becomes.
• Too large a learning rate makes the algorithm unstable.
• For quadratic functions, an upper limit on the stable learning rate can be derived.
Stable Learning Rates (Quadratic)
$F(x) = \frac{1}{2} x^T A x + d^T x + c$
$\nabla F(x) = A x + d$
$x_{k+1} = x_k - \alpha g_k = x_k - \alpha (A x_k + d) = (I - \alpha A) x_k - \alpha d$
Stability is determined by the eigenvalues of the matrix $(I - \alpha A)$:
$(I - \alpha A) z_i = z_i - \alpha A z_i = z_i - \alpha \lambda_i z_i = (1 - \alpha \lambda_i) z_i$
where $(\lambda_i, z_i)$ are the eigenvalues and eigenvectors of $A$. Stability requirement:
$|1 - \alpha \lambda_i| < 1 \;\Rightarrow\; \alpha < \frac{2}{\lambda_i} \;\Rightarrow\; \alpha < \frac{2}{\lambda_{max}}$
Example
$A = \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix}$
$\lambda_1 = 0.764$, $z_1 = \begin{bmatrix} 0.851 \\ -0.526 \end{bmatrix}$; $\lambda_2 = 5.24$, $z_2 = \begin{bmatrix} 0.526 \\ 0.851 \end{bmatrix}$
$\frac{2}{\lambda_{max}} = \frac{2}{5.24} = 0.38$
[Contour plots of the trajectory for $\alpha = 0.37$ (stable, left) and $\alpha = 0.39$ (unstable, right); axes $x_1, x_2 \in [-2, 2]$.]
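The same bound can be reproduced numerically; a small sketch using the Hessian above:

```python
import numpy as np

A = np.array([[2.0, 2.0],
              [2.0, 4.0]])
lam = np.linalg.eigvalsh(A)  # eigenvalues: [0.764, 5.236]
print(2.0 / lam.max())       # stable learning rates: alpha < 0.38
```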
Newton’s Method
Take a second-order Taylor expansion about $x_k$:
$F(x_{k+1}) = F(x_k + \Delta x_k) \approx F(x_k) + g_k^T \Delta x_k + \frac{1}{2} \Delta x_k^T A_k \Delta x_k$
where $A_k \equiv \nabla^2 F(x)\big|_{x = x_k}$ is the Hessian. Setting the gradient of this quadratic model to zero:
$g_k + A_k \Delta x_k = 0 \;\Rightarrow\; \Delta x_k = -A_k^{-1} g_k$
$x_{k+1} = x_k - A_k^{-1} g_k$
Example
$F(x) = x_1^2 + 2 x_1 x_2 + 2 x_2^2 + x_1$, $x_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$
$\nabla F(x) = \begin{bmatrix} 2x_1 + 2x_2 + 1 \\ 2x_1 + 4x_2 \end{bmatrix}$, $g_0 = \nabla F(x)\big|_{x = x_0} = \begin{bmatrix} 3 \\ 3 \end{bmatrix}$, $A = \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix}$
$x_1 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - \begin{bmatrix} 1 & -0.5 \\ -0.5 & 0.5 \end{bmatrix} \begin{bmatrix} 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - \begin{bmatrix} 1.5 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 0.5 \end{bmatrix}$
Because $F$ is quadratic, Newton's method reaches the minimum in a single step.
Plot
[Contour plot of $F$ with the Newton step from $x_0$ to the minimum; axes $x_1, x_2 \in [-2, 2]$.]
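A sketch of the Newton update for this example, with the gradient and Hessian as on the slide; since the Hessian of a quadratic is constant, one solve reaches the minimum:

```python
import numpy as np

def grad_F(x):
    return np.array([2*x[0] + 2*x[1] + 1,
                     2*x[0] + 4*x[1]])

A = np.array([[2.0, 2.0],
              [2.0, 4.0]])                # constant Hessian of F

x0 = np.array([0.5, 0.5])
x1 = x0 - np.linalg.solve(A, grad_F(x0))  # x_{k+1} = x_k - A^{-1} g_k
print(x1)                                 # [-1.   0.5], the minimum
```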
Conjugate Vectors
$F(x) = \frac{1}{2} x^T A x + d^T x + c$
A set of vectors $\{p_k\}$ is conjugate with respect to $A$ if
$p_k^T A p_j = 0, \quad k \neq j$
The eigenvectors of $A$ are conjugate, since
$z_k^T A z_j = \lambda_j z_k^T z_j = 0, \quad k \neq j$
For the quadratic, $\nabla^2 F(x) = A$, and the change in the gradient is
$\Delta g_k = g_{k+1} - g_k = (A x_{k+1} + d) - (A x_k + d) = A \Delta x_k$
where
$\Delta x_k = x_{k+1} - x_k = \alpha_k p_k$
so the conjugacy condition can be restated without the Hessian:
$\alpha_k p_k^T A p_j = \Delta x_k^T A p_j = \Delta g_k^T p_j = 0, \quad k \neq j$
The search directions are built recursively:
$p_0 = -g_0$
$p_k = -g_k + \beta_k p_{k-1}$
where
$\beta_k = \frac{\Delta g_{k-1}^T g_k}{\Delta g_{k-1}^T p_{k-1}}$ (Hestenes-Stiefel) or $\beta_k = \frac{g_k^T g_k}{g_{k-1}^T g_{k-1}}$ (Fletcher-Reeves) or $\beta_k = \frac{\Delta g_{k-1}^T g_k}{g_{k-1}^T g_{k-1}}$ (Polak-Ribiere)
Conjugate Gradient algorithm
• The first search direction is the negative of the gradient:
$p_0 = -g_0$
• Select the learning rate to minimize the function along the line (exact for quadratic functions):
$\alpha_k = -\frac{\nabla F(x)^T\big|_{x = x_k} \, p_k}{p_k^T \, \nabla^2 F(x)\big|_{x = x_k} \, p_k} = -\frac{g_k^T p_k}{p_k^T A_k p_k}$
• Select the next search direction using
$p_k = -g_k + \beta_k p_{k-1}$
Example
$F(x) = \frac{1}{2} x^T \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix} x + \begin{bmatrix} 1 & 0 \end{bmatrix} x$, $x_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$
$\nabla F(x) = \begin{bmatrix} 2x_1 + 2x_2 + 1 \\ 2x_1 + 4x_2 \end{bmatrix}$, $p_0 = -g_0 = -\nabla F(x)\big|_{x = x_0} = \begin{bmatrix} -3 \\ -3 \end{bmatrix}$
$\alpha_0 = -\frac{g_0^T p_0}{p_0^T A p_0} = -\frac{\begin{bmatrix} 3 & 3 \end{bmatrix} \begin{bmatrix} -3 \\ -3 \end{bmatrix}}{\begin{bmatrix} -3 & -3 \end{bmatrix} \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} -3 \\ -3 \end{bmatrix}} = \frac{18}{90} = 0.2$
$x_1 = x_0 + \alpha_0 p_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - 0.2 \begin{bmatrix} 3 \\ 3 \end{bmatrix} = \begin{bmatrix} -0.1 \\ -0.1 \end{bmatrix}$
Example
$g_1 = \nabla F(x)\big|_{x = x_1} = \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} -0.1 \\ -0.1 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0.6 \\ -0.6 \end{bmatrix}$
$\beta_1 = \frac{g_1^T g_1}{g_0^T g_0} = \frac{\begin{bmatrix} 0.6 & -0.6 \end{bmatrix} \begin{bmatrix} 0.6 \\ -0.6 \end{bmatrix}}{\begin{bmatrix} 3 & 3 \end{bmatrix} \begin{bmatrix} 3 \\ 3 \end{bmatrix}} = \frac{0.72}{18} = 0.04$
$p_1 = -g_1 + \beta_1 p_0 = -\begin{bmatrix} 0.6 \\ -0.6 \end{bmatrix} + 0.04 \begin{bmatrix} -3 \\ -3 \end{bmatrix} = \begin{bmatrix} -0.72 \\ 0.48 \end{bmatrix}$
$\alpha_1 = -\frac{g_1^T p_1}{p_1^T A p_1} = -\frac{-0.72}{0.576} = 1.25$
Plots
$x_2 = x_1 + \alpha_1 p_1 = \begin{bmatrix} -0.1 \\ -0.1 \end{bmatrix} + 1.25 \begin{bmatrix} -0.72 \\ 0.48 \end{bmatrix} = \begin{bmatrix} -1 \\ 0.5 \end{bmatrix}$
The minimum is reached in two steps.
[Contour plots of $F$ with the two conjugate-gradient steps reaching the minimum; axes $x_1$ (horizontal) and $x_2$ (vertical), both in $[-2, 2]$.]
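A minimal conjugate-gradient sketch reproducing this example, using the exact line-search $\alpha_k$ and the Fletcher-Reeves $\beta_k$ from the earlier slides; for an $n$-dimensional quadratic, at most $n$ steps are needed:

```python
import numpy as np

A = np.array([[2.0, 2.0],
              [2.0, 4.0]])            # Hessian of the quadratic
d = np.array([1.0, 0.0])              # so that grad F(x) = A @ x + d

x = np.array([0.5, 0.5])              # x_0
g = A @ x + d                         # g_0 = [3, 3]
p = -g                                # p_0 = -g_0
for k in range(2):                    # n = 2 steps suffice for a quadratic
    alpha = -(g @ p) / (p @ A @ p)    # exact line search along p_k
    x = x + alpha * p
    g_new = A @ x + d
    beta = (g_new @ g_new) / (g @ g)  # Fletcher-Reeves beta
    p = -g_new + beta * p
    g = g_new
print(x)                              # [-1.   0.5] after two steps
```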
Step Size Determination
• In practice, the step size $\alpha_k$ is determined by line minimization methods and their stopping criteria:
– Initial bracketing
– Line searches
• Newton's method
• Secant method
• Sectioning method