Exercise1 A Annotated

AMA 615 Exercise 1 2nd Sem, 2021 – 2022
Exercises.
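1. Let $\phi \in C^2(\mathbb{R}^n)$ and suppose that there exists $L > 0$ such that $\|\nabla^2\phi(y)\|_2 \le L$ for all $y \in \mathbb{R}^n$. Show that
\[ \|\nabla\phi(u) - \nabla\phi(v)\|_2 \le L\|u - v\|_2 \quad \forall u, v \in \mathbb{R}^n. \]
Solution: The inequality holds trivially if $\nabla\phi(u) = \nabla\phi(v)$. Now, fix any $u$ and $v \in \mathbb{R}^n$ with $\nabla\phi(u) \ne \nabla\phi(v)$. Define the function $\psi : \mathbb{R} \to \mathbb{R}$ by
\[ \psi(t) := (\nabla\phi(u) - \nabla\phi(v))^T \nabla\phi(v + t(u - v)). \]
Then $\psi \in C^1(\mathbb{R})$ and we have from the mean value theorem that there exists $\xi \in (0, 1)$ such that
\begin{align*}
\|\nabla\phi(u) - \nabla\phi(v)\|_2^2 &= \psi(1) - \psi(0) = \psi'(\xi) = (\nabla\phi(u) - \nabla\phi(v))^T \nabla^2\phi(v + \xi(u - v))(u - v) \\
&\le \|\nabla\phi(u) - \nabla\phi(v)\|_2 \, \|\nabla^2\phi(v + \xi(u - v))\|_2 \, \|u - v\|_2 \\
&\le L\|\nabla\phi(u) - \nabla\phi(v)\|_2 \, \|u - v\|_2.
\end{align*}
Since $\nabla\phi(u) \ne \nabla\phi(v)$, we may divide both sides by $\|\nabla\phi(u) - \nabla\phi(v)\|_2 > 0$ and conclude that
\[ \|\nabla\phi(u) - \nabla\phi(v)\|_2 \le L\|u - v\|_2. \]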
This completes the proof.
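2. Consider the function $f(x) = (x_1 + x_2^2)^2$, the point $x^* = \begin{bmatrix} 1 & 0 \end{bmatrix}^T$ and the direction $d^* = \begin{bmatrix} -1 & 1 \end{bmatrix}^T$. Show that $d^*$ is a descent direction of $f$ at $x^*$, and find all stepsizes that satisfy the exact line search criterion at $x^*$ along $d^*$.
Solution: Note that
\[ \nabla f(x^*) = \begin{bmatrix} 2(x_1^* + (x_2^*)^2) \\ 4x_2^*(x_1^* + (x_2^*)^2) \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}. \]
Thus, we have $[\nabla f(x^*)]^T d^* = -2 < 0$, showing that $d^*$ is a descent direction.
Next, consider
\[ \mathop{\mathrm{Minimize}}_{\alpha \ge 0} \; f(1 - \alpha, \alpha). \]
Since $f(1 - \alpha, \alpha) = (1 - \alpha + \alpha^2)^2 = \left[\left(\alpha - \tfrac12\right)^2 + \tfrac34\right]^2$, we see that the function is minimized at $\alpha = \tfrac12$. Thus, the only stepsize that satisfies the exact line search criterion at $x^*$ along $d^*$ is $\alpha = \tfrac12$.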
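The computation in Question 2 can be checked numerically. The following is a minimal sketch, not part of the original solution: the helper names (`line_objective`, `directional_derivative`, `grid_argmin`) and the search interval $[0, 2]$ are assumptions made for illustration only.

```python
# Numerical sanity check for Question 2 (illustrative sketch only).

def f(x1, x2):
    return (x1 + x2 ** 2) ** 2

def grad_f(x1, x2):
    s = x1 + x2 ** 2
    return (2.0 * s, 4.0 * x2 * s)

def directional_derivative(x=(1.0, 0.0), d=(-1.0, 1.0)):
    # [grad f(x)]^T d; a negative value means d is a descent direction.
    g = grad_f(*x)
    return g[0] * d[0] + g[1] * d[1]

def line_objective(alpha, x=(1.0, 0.0), d=(-1.0, 1.0)):
    # alpha -> f(x + alpha * d), the function minimized by exact line search.
    return f(x[0] + alpha * d[0], x[1] + alpha * d[1])

def grid_argmin(n=20001, upper=2.0):
    # Crude grid search over [0, upper]; the interval is an assumption.
    best_alpha, best_val = 0.0, line_objective(0.0)
    for i in range(n):
        alpha = upper * i / (n - 1)
        val = line_objective(alpha)
        if val < best_val:
            best_alpha, best_val = alpha, val
    return best_alpha

if __name__ == "__main__":
    print(directional_derivative())  # slope of f at x* along d*
    print(grid_argmin())             # approximate exact line search stepsize
```

A grid search is used only because it needs no extra libraries; it approximates the minimizer to within the grid spacing.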
3. Consider the function $f(x) = e^{-x}$. Consider an iterate of the following form
\[ x_{k+1} = x_k + \alpha_k d_k, \]
where $d_k = -f'(x_k)$ and $\alpha_k$ is obtained via Armijo line search by backtracking with $\bar\alpha_k \equiv 1$ and $\sigma = 0.1$. Start with $x_0 = 0$.
(a) Show that $e^{-y} \le 1 - 0.1y$ whenever $y \in [0, 1]$.
(b) Show that $\alpha_0 = 1$ and $x_1 = 1$.
(c) Show that, for all $k \ge 0$, it holds that $x_{k+1} > 0$ and $\alpha_k = 1$.
Solution:
(a) Let $h(y) = e^{-y} - 1 + 0.1y$. Then $h'(y) = -e^{-y} + 0.1$ and $h''(y) = e^{-y}$.
Since $h''(y) > 0$ for all $y$, we see that $h'$ is increasing. Since $h'(1) = 0.1 - e^{-1} \approx -0.2679 < 0$, we conclude that $h'(y) < 0$ for all $y \in [0, 1]$. This shows that $h$ is decreasing on $[0, 1]$. Thus, $h(y) \le h(0) = 0$ for all $y \in [0, 1]$, which is the desired inequality.
(b) Note that for each $\alpha > 0$, $f(x_0) + \sigma\alpha f'(x_0)d_0 = f(x_0) - 0.1\alpha[f'(x_0)]^2 = 1 - 0.1\alpha$. Since
\[ f(x_0 - f'(x_0)) = f(1) = e^{-1} \le 1 - 0.1 = f(x_0) + \sigma f'(x_0)d_0, \]
we conclude that the Armijo rule (with $\bar\alpha_k \equiv 1$ and $\sigma = 0.1$) is satisfied without backtracking at this iterate. Thus, $\alpha_0 = 1$ and $x_1 = 1$.
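(c) The proof is by induction. From (b), the conclusion holds for $k = 0$. Suppose that for some $\ell \ge 0$, it holds that $x_{\ell+1} > 0$ and $\alpha_\ell = 1$.
Note that for each $\alpha > 0$,
\[ f(x_{\ell+1}) + \sigma\alpha f'(x_{\ell+1})d_{\ell+1} = f(x_{\ell+1}) - 0.1\alpha[f'(x_{\ell+1})]^2 = e^{-x_{\ell+1}}\left[1 - 0.1\alpha e^{-x_{\ell+1}}\right]. \]
Moreover,
\begin{align*}
f(x_{\ell+1} - f'(x_{\ell+1})) &= f(x_{\ell+1} + e^{-x_{\ell+1}}) = \exp(-x_{\ell+1} - e^{-x_{\ell+1}}) \\
&= e^{-x_{\ell+1}} e^{-e^{-x_{\ell+1}}} \overset{(i)}{\le} e^{-x_{\ell+1}}\left[1 - 0.1e^{-x_{\ell+1}}\right],
\end{align*}
where (i) holds because of part (a) and the induction assumption $x_{\ell+1} > 0$, so that $e^{-x_{\ell+1}} \in [0, 1]$.
Hence, we conclude that the Armijo rule (with $\bar\alpha_k \equiv 1$ and $\sigma = 0.1$) is satisfied without backtracking at this iterate. Thus, $\alpha_{\ell+1} = 1$ and $x_{\ell+2} = x_{\ell+1} + e^{-x_{\ell+1}} > 0$. This completes the proof by induction.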
Note: Indeed, we have $x_{k+1} = x_k + e^{-x_k}$. One can show that the sequence $\{x_k\}$ is unbounded and has no convergent subsequence. To see this, first note that $\{x_k\}$ is strictly increasing. Thus, $\lim_{k\to\infty} x_k = \sup_k x_k$. If $\xi := \sup_k x_k$ is finite, then by passing to the limit in the relation $x_{k+1} = x_k + e^{-x_k}$, we get $\xi = \xi + e^{-\xi}$, which is a contradiction. Thus, $\xi = \infty$.
This is not surprising because $f(x) = e^{-x}$ does not have a minimizer and Theorem 2.5 asserts that any accumulation point of the sequence generated is a stationary point; thus, the sequence generated must then have NO accumulation point.
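The behaviour established in Question 3 can be observed by running the iteration directly. The sketch below is illustrative only: the function name `armijo_gradient_descent`, the backtracking factor `beta = 0.5` and the iteration count are assumptions, since the exercise fixes only the initial trial stepsize 1 and the Armijo parameter 0.1.

```python
import math

def armijo_gradient_descent(x0=0.0, sigma=0.1, alpha_bar=1.0, beta=0.5, iters=20):
    """Gradient descent on f(x) = exp(-x) with Armijo backtracking.

    beta (the shrink factor) is an assumed value; it never gets used if
    the unit stepsize is always accepted, as parts (b) and (c) predict.
    """
    f = lambda x: math.exp(-x)
    df = lambda x: -math.exp(-x)
    xs, alphas = [x0], []
    x = x0
    for _ in range(iters):
        d = -df(x)                      # steepest descent direction
        alpha = alpha_bar               # initial trial stepsize
        # Backtrack until f(x + alpha d) <= f(x) + sigma * alpha * f'(x) * d.
        while f(x + alpha * d) > f(x) + sigma * alpha * df(x) * d:
            alpha *= beta
        x = x + alpha * d
        xs.append(x)
        alphas.append(alpha)
    return xs, alphas

if __name__ == "__main__":
    xs, alphas = armijo_gradient_descent()
    print(alphas)   # accepted stepsizes
    print(xs)       # iterates, expected to increase without bound
```

The iterates grow slowly (roughly logarithmically in $k$), which is consistent with the sequence being unbounded while $f(x_k) \to 0$.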
4. Let $Q \succ 0$ and $b \in \mathbb{R}^n$. Define
\[ f(x) = \frac12 x^T Q x - b^T x. \]
(a) Suppose that $\bar x$ is not a stationary point of $f$ and suppose the steepest descent method with exact line search is applied to minimizing $f$ starting from $\bar x$.
Show that the stepsize that satisfies the exact line search criterion at $\bar x$ along $-\nabla f(\bar x)$ is given by
\[ \frac{\|\nabla f(\bar x)\|_2^2}{[\nabla f(\bar x)]^T Q \nabla f(\bar x)}. \]
(b) Suppose that $x^*$ is the unique minimizer of $f$ and let $v$ be any eigenvector of $Q$.
i. Let $x_0 = x^* + v$. Show that $x_0$ is not a stationary point of $f$.
ii. Show that the steepest descent method with exact line search initialized at $x_0 = x^* + v$ gives $x_1 = x^*$.
Solution:
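(a) Consider the function
\[ \varphi(\alpha) := f(\bar x - \alpha\nabla f(\bar x)). \]
We need to minimize $\varphi$ subject to $\alpha \ge 0$. Note that
\[ \varphi(\alpha) = \frac12 [\bar x - \alpha\nabla f(\bar x)]^T Q [\bar x - \alpha\nabla f(\bar x)] - b^T[\bar x - \alpha\nabla f(\bar x)]. \]
This is a quadratic with leading coefficient $\frac12 \nabla f(\bar x)^T Q \nabla f(\bar x)$, which is positive because $Q \succ 0$ and $\nabla f(\bar x) \ne 0$. Thus, the minimizer of $\varphi$ over $\mathbb{R}$ is given by its vertex. We have
\begin{align*}
\varphi'(\alpha) &= -\frac12 [\nabla f(\bar x)]^T Q [\bar x - \alpha\nabla f(\bar x)] - \frac12 [\bar x - \alpha\nabla f(\bar x)]^T Q \nabla f(\bar x) + b^T \nabla f(\bar x) \\
&= -[\nabla f(\bar x)]^T Q [\bar x - \alpha\nabla f(\bar x)] + b^T \nabla f(\bar x) = -[\nabla f(\bar x)]^T [Q\bar x - b - \alpha Q\nabla f(\bar x)],
\end{align*}
so that $\varphi'(0) = -\|\nabla f(\bar x)\|_2^2$ and $\varphi''(0) = [\nabla f(\bar x)]^T Q \nabla f(\bar x)$. Thus, the vertex is
\[ -\frac{\varphi'(0)}{\varphi''(0)} = \frac{\|\nabla f(\bar x)\|_2^2}{[\nabla f(\bar x)]^T Q \nabla f(\bar x)} > 0. \]
Since the vertex is positive, it must also be the minimizer of $\varphi$ over the set $[0, \infty)$.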
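(b) i. Note that $x^*$ satisfies $\nabla f(x^*) = Qx^* - b = 0$. Let $Qv = \lambda v$ for some eigenvalue $\lambda > 0$ of $Q$ (since $Q \succ 0$). Then
\[ \nabla f(x_0) = Qx_0 - b = Q(x^* + v) - b = Qv = \lambda v \ne 0. \]
Thus, $x_0$ is not a stationary point.
ii. From (a), we know that $x_1$ is given by
\begin{align*}
x_1 &= x_0 - \frac{\|\nabla f(x_0)\|_2^2}{[\nabla f(x_0)]^T Q \nabla f(x_0)} \nabla f(x_0) = x^* + v - \frac{\lambda^2\|v\|_2^2}{\lambda^2 v^T Q v}(\lambda v) \\
&= x^* + v - \frac{\lambda^2\|v\|_2^2}{\lambda^3\|v\|_2^2}(\lambda v) = x^*,
\end{align*}
where the second last equality holds because $Qv = \lambda v$.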
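The one-step termination in Question 4(b) can be reproduced on a small instance. The sketch below is only an illustration: the particular $Q$, $b$ and eigenvector $v$ are assumed example data, and the helper names are hypothetical.

```python
# One steepest descent step with exact line search on
# f(x) = 0.5 x^T Q x - b^T x (illustrative sketch, example data assumed).

def matvec(M, x):
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def steepest_descent_step(Q, b, x):
    g = [qx - bi for qx, bi in zip(matvec(Q, x), b)]   # gradient Qx - b
    alpha = dot(g, g) / dot(g, matvec(Q, g))           # stepsize from part (a)
    return [xi - alpha * gi for xi, gi in zip(x, g)], alpha

# Assumed example: Q has eigenvalues 1 and 3, and v = (1, 1) satisfies Qv = 3v.
Q = [[2.0, 1.0], [1.0, 2.0]]
b = [1.0, 0.0]
x_star = [2.0 / 3.0, -1.0 / 3.0]          # solves Q x = b
v = [1.0, 1.0]
x0 = [xs + vi for xs, vi in zip(x_star, v)]
x1, alpha = steepest_descent_step(Q, b, x0)

if __name__ == "__main__":
    print(alpha)   # stepsize, which equals 1/lambda for this start
    print(x1)      # should agree with x_star up to rounding
```

For this starting point the stepsize reduces to $1/\lambda$, which is exactly what cancels the displacement $v$.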
5. Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
\[ f(x) = (x_1 + x_2 - 1)_+^2 + \frac12\|x\|_2^2, \]
where
\[ t_+ := \begin{cases} t & \text{if } t \ge 0, \\ 0 & \text{otherwise.} \end{cases} \]
(a) Compute $\nabla f(x)$.
(b) Consider an iterate of the following form:
\[ x^{k+1} = x^k + \alpha_k d^k. \]
Let $d^k$ be the steepest descent direction and $\alpha_k$ be chosen to satisfy the Wolfe conditions. Suppose that $x^k$ is nonstationary for all $k$.
i. Show that the sequence $\{x^k\}$ is bounded.
ii. Show that any accumulation point of $\{x^k\}$ is stationary.
Solution:
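(a) We have
\[ \nabla f(x) = \begin{bmatrix} 2(x_1 + x_2 - 1)_+ + x_1 \\ 2(x_1 + x_2 - 1)_+ + x_2 \end{bmatrix}. \]
(b) i. Using the Armijo rule, we have for any $k \ge 1$ that
\begin{align*}
\frac12\|x^k\|_2^2 \le f(x^k) &\le f(x^{k-1}) - c_1\alpha_{k-1}\|\nabla f(x^{k-1})\|_2^2 \\
&\le f(x^{k-1}) \le \cdots \le f(x^0).
\end{align*}
This shows that $\|x^k\|_2 \le \sqrt{2f(x^0)}$ for all $k \ge 1$, meaning that $\{x^k\}$ is bounded.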
ii. We next apply Zoutendijk's theorem. First, clearly, $f \in C^1(\mathbb{R}^2)$ and $\inf f \ge 0$. Also, we see from part (a) that
\[ \nabla f(x) - \nabla f(y) = \begin{bmatrix} 2(x_1 + x_2 - 1)_+ + x_1 - 2(y_1 + y_2 - 1)_+ - y_1 \\ 2(x_1 + x_2 - 1)_+ + x_2 - 2(y_1 + y_2 - 1)_+ - y_2 \end{bmatrix}. \]
Since $|s_+ - t_+| \le |s - t|$ for all $s, t \in \mathbb{R}$, we have
\begin{align*}
|2(x_1 + x_2 - 1)_+ + x_1 - 2(y_1 + y_2 - 1)_+ - y_1| &\le 2|x_1 + x_2 - y_1 - y_2| + |x_1 - y_1| \le 3\|x - y\|_1, \\
|2(x_1 + x_2 - 1)_+ + x_2 - 2(y_1 + y_2 - 1)_+ - y_2| &\le 2|x_1 + x_2 - y_1 - y_2| + |x_2 - y_2| \le 3\|x - y\|_1,
\end{align*}
and hence
\begin{align*}
\|\nabla f(x) - \nabla f(y)\|_2^2 &= |2(x_1 + x_2 - 1)_+ + x_1 - 2(y_1 + y_2 - 1)_+ - y_1|^2 \\
&\quad + |2(x_1 + x_2 - 1)_+ + x_2 - 2(y_1 + y_2 - 1)_+ - y_2|^2 \\
&\le 18\|x - y\|_1^2 \le 36\|x - y\|_2^2.
\end{align*}
One can take $\ell = 6$ in Zoutendijk's theorem. Since the steepest descent direction is a descent direction and $\alpha_k$ satisfies the Wolfe conditions, we conclude that
\[ \sum_{k=0}^{\infty} \cos^2\theta_k \, \|\nabla f(x^k)\|_2^2 < \infty. \tag{1} \]
Now,
\[ \cos\theta_k = \frac{[\nabla f(x^k)]^T \nabla f(x^k)}{\|\nabla f(x^k)\|_2 \, \|\nabla f(x^k)\|_2} = 1. \]
This together with (1) shows that $\lim_{k\to\infty} \|\nabla f(x^k)\|_2 = 0$. Hence, any accumulation point of $\{x^k\}$ is stationary.
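The gradient formula and the Lipschitz constant $\ell = 6$ used in Question 5 can be probed empirically. This is a rough sketch under stated assumptions: the sampling box $[-3, 3]^2$, the sample size and the seed are arbitrary choices, and random sampling can only fail to find a counterexample, not prove the bound.

```python
import random

def f(x1, x2):
    p = max(x1 + x2 - 1.0, 0.0)          # (x1 + x2 - 1)_+
    return p * p + 0.5 * (x1 * x1 + x2 * x2)

def grad_f(x1, x2):
    p = max(x1 + x2 - 1.0, 0.0)
    return (2.0 * p + x1, 2.0 * p + x2)

def max_lipschitz_ratio(samples=2000, seed=0, box=3.0):
    """Largest observed ||grad f(x) - grad f(y)||_2 / ||x - y||_2 over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x = (rng.uniform(-box, box), rng.uniform(-box, box))
        y = (rng.uniform(-box, box), rng.uniform(-box, box))
        gx, gy = grad_f(*x), grad_f(*y)
        num = ((gx[0] - gy[0]) ** 2 + (gx[1] - gy[1]) ** 2) ** 0.5
        den = ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2) ** 0.5
        if den > 0.0:
            worst = max(worst, num / den)
    return worst

if __name__ == "__main__":
    print(grad_f(1.0, 0.5))        # gradient where the (.)_+ term is active
    print(max_lipschitz_ratio())   # should not exceed the constant 6
```

The observed ratio stays below 6, as expected: the argument in the solution only needs some valid constant, not the smallest one.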
6. Let $f(x) = h(Ax) + \mu\|x\|_2^2$, where $h(y) = \sum_{i=1}^m \ln(1 + e^{-y_i})$, $A \in \mathbb{R}^{m\times n}$, and $\mu > 0$.
(a) By considering $\|\nabla^2 h(y)\|_2$ and using Question 1, show that for any $u$ and $v \in \mathbb{R}^m$, it holds that
\[ \|\nabla h(u) - \nabla h(v)\|_2 \le \frac14\|u - v\|_2. \]
(b) Show that at any nonstationary point, the Newton direction $-[\nabla^2 f(x)]^{-1}\nabla f(x)$ is a descent direction.
(c) Consider an iterate of the following form:
\[ x^{k+1} = x^k + \alpha_k d^k. \]
Let $d^k$ be the Newton direction and $\alpha_k$ be chosen to satisfy the Wolfe conditions. Suppose that $x^k$ is nonstationary for all $k$.
i. Show that the sequence $\{x^k\}$ is bounded.
ii. Show that any accumulation point of $\{x^k\}$ is stationary.
Solution:
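(a) By direct computation, $\nabla^2 h(y)$ is diagonal with the $i$th diagonal entry being
\[ \frac{\partial^2}{\partial y_i^2}\ln(1 + e^{-y_i}) = \frac{\partial}{\partial y_i}\left(\frac{-1}{1 + e^{y_i}}\right) = \frac{e^{y_i}}{(1 + e^{y_i})^2} > 0. \]
Moreover, $\frac{e^{y_i}}{(1 + e^{y_i})^2} \le \frac14$ because $(1 - e^{y_i})^2 \ge 0$. Thus, we have $\|\nabla^2 h(y)\|_2 \le \frac14$ for all $y$.
From this, we conclude (using Question 1) that for any $u$ and $v \in \mathbb{R}^m$,
\[ \|\nabla h(u) - \nabla h(v)\|_2 \le \frac14\|u - v\|_2. \]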
(b) A direct computation shows that $\nabla^2 f(x) = A^T \nabla^2 h(Ax) A + 2\mu I$. Note that $A^T \nabla^2 h(Ax) A \succeq 0$ because, for any $u \in \mathbb{R}^n$, it holds that
\[ u^T[A^T \nabla^2 h(Ax) A]u = [Au]^T \nabla^2 h(Ax) [Au] \ge 0, \]
where the inequality holds since $\nabla^2 h(Ax) \succ 0$ from part (a). Thus, $\nabla^2 f(x) \succeq 2\mu I \succ 0$, meaning that $[\nabla^2 f(x)]^{-1} \succ 0$.
Hence, at a nonstationary point (so that $\nabla f(x) \ne 0$), we have
\[ [\nabla f(x)]^T\left(-[\nabla^2 f(x)]^{-1}\nabla f(x)\right) = -[\nabla f(x)]^T[\nabla^2 f(x)]^{-1}\nabla f(x) < 0. \]
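(c) i. Using the Armijo rule and the fact that $h \ge 0$, we have for any $k \ge 1$ that
\begin{align*}
\mu\|x^k\|_2^2 \le f(x^k) &\le f(x^{k-1}) + c_1\alpha_{k-1}[\nabla f(x^{k-1})]^T d^{k-1} \\
&\le f(x^{k-1}) \le \cdots \le f(x^0).
\end{align*}
This shows that $\|x^k\|_2 \le \sqrt{f(x^0)/\mu}$ for all $k \ge 1$, meaning that $\{x^k\}$ is bounded.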
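ii. We next apply Zoutendijk's theorem. First, clearly, $f \in C^1(\mathbb{R}^n)$ and $\inf f \ge 0$. Also, $\nabla f(x) = A^T\nabla h(Ax) + 2\mu x$. Thus, we see from part (a) that
\begin{align*}
\|\nabla f(x) - \nabla f(y)\|_2 &= \|A^T(\nabla h(Ax) - \nabla h(Ay)) + 2\mu(x - y)\|_2 \\
&\le \|A^T\|_2 \, \|\nabla h(Ax) - \nabla h(Ay)\|_2 + 2\mu\|x - y\|_2 \\
&\le \frac{\|A^T\|_2}{4}\|Ax - Ay\|_2 + 2\mu\|x - y\|_2 \\
&\le \left(\frac{\|A^T\|_2\|A\|_2}{4} + 2\mu\right)\|x - y\|_2.
\end{align*}
One can take $\ell = \frac{\|A^T\|_2\|A\|_2}{4} + 2\mu$ in Zoutendijk's theorem. Since the Newton direction is a descent direction and $\alpha_k$ satisfies the Wolfe conditions, we conclude that
\[ \sum_{k=0}^{\infty} \cos^2\theta_k \, \|\nabla f(x^k)\|_2^2 < \infty. \tag{2} \]
Now, notice that for any $x$, we have $\nabla^2 f(x) = A^T\nabla^2 h(Ax)A + 2\mu I$. Since $A^T\nabla^2 h(Ax)A \succeq 0$ as in part (b), we deduce that $\lambda_{\min}(\nabla^2 f(x)) \ge 2\mu > 0$. Moreover, since $\nabla^2 f(x)$ is symmetric positive definite,
\[ \lambda_{\max}(\nabla^2 f(x)) = \|\nabla^2 f(x)\|_2 \le \|A^T\|_2 \, \|\nabla^2 h(Ax)\|_2 \, \|A\|_2 + 2\mu \le 2\mu + \frac14\|A\|_2^2, \]
where the last inequality used the relation $\|\nabla^2 h(Ax)\|_2 \le \frac14$, which was proved in part (a). We thus have
\begin{align*}
\cos\theta_k &= \frac{[\nabla f(x^k)]^T[\nabla^2 f(x^k)]^{-1}\nabla f(x^k)}{\|\nabla f(x^k)\|_2 \, \|[\nabla^2 f(x^k)]^{-1}\nabla f(x^k)\|_2}
\ge \frac{[\nabla f(x^k)]^T[\nabla^2 f(x^k)]^{-1}\nabla f(x^k)}{\|\nabla f(x^k)\|_2^2 \, \|[\nabla^2 f(x^k)]^{-1}\|_2} \\
&\ge \frac{\lambda_{\min}([\nabla^2 f(x^k)]^{-1})}{\|[\nabla^2 f(x^k)]^{-1}\|_2}
= \frac{\lambda_{\min}([\nabla^2 f(x^k)]^{-1})}{\lambda_{\max}([\nabla^2 f(x^k)]^{-1})} \\
&= \frac{\lambda_{\min}(\nabla^2 f(x^k))}{\lambda_{\max}(\nabla^2 f(x^k))} \ge \frac{2\mu}{2\mu + \frac14\|A\|_2^2} > 0.
\end{align*}
This together with (2) shows that $\lim_{k\to\infty}\|\nabla f(x^k)\|_2 = 0$. Hence, any accumulation point of $\{x^k\}$ is stationary.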
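The lower bound on $\cos\theta_k$ for the Newton direction in Question 6 can be checked on a tiny instance. The following sketch assumes $m = n = 2$, takes $h(y) = \sum_i \ln(1 + e^{-y_i})$, and uses closed-form $2\times 2$ formulas; the function names and the test data are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def newton_cosine(A, mu, x):
    """cos(theta) between -grad f(x) and the Newton direction, for 2x2 A.

    Assumes h(y) = sum_i ln(1 + exp(-y_i)), so that
    h'_i = -1 / (1 + e^{y_i}) and h''_i = e^{y_i} / (1 + e^{y_i})^2.
    """
    y = [A[i][0] * x[0] + A[i][1] * x[1] for i in range(2)]
    gh = [-sigmoid(-yi) for yi in y]                       # grad h(Ax)
    hh = [sigmoid(yi) * (1.0 - sigmoid(yi)) for yi in y]   # diag of Hessian of h
    g = [sum(A[i][j] * gh[i] for i in range(2)) + 2.0 * mu * x[j] for j in range(2)]
    H = [[sum(A[i][r] * hh[i] * A[i][c] for i in range(2)) + (2.0 * mu if r == c else 0.0)
          for c in range(2)] for r in range(2)]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    # Newton direction d = -H^{-1} g via the explicit 2x2 inverse.
    d = [-(H[1][1] * g[0] - H[0][1] * g[1]) / det,
         -(-H[1][0] * g[0] + H[0][0] * g[1]) / det]
    return -(g[0] * d[0] + g[1] * d[1]) / (math.hypot(*g) * math.hypot(*d))

def cosine_lower_bound(A, mu):
    # ||A||_2^2 is the largest eigenvalue of the 2x2 matrix A^T A.
    a = A[0][0] ** 2 + A[1][0] ** 2
    c = A[0][1] ** 2 + A[1][1] ** 2
    b = A[0][0] * A[0][1] + A[1][0] * A[1][1]
    spec_sq = (a + c) / 2.0 + math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return 2.0 * mu / (2.0 * mu + spec_sq / 4.0)

if __name__ == "__main__":
    A, mu, x = [[1.0, 2.0], [3.0, 4.0]], 0.5, [1.0, -1.0]
    print(newton_cosine(A, mu, x), cosine_lower_bound(A, mu))
```

The bound is typically far from tight, which is fine for Zoutendijk's theorem: all that matters is that $\cos\theta_k$ stays bounded away from zero.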
7. Consider
\[ \mathop{\mathrm{Minimize}}_{x \in \mathbb{R}^n} \; f(x) \quad \text{with} \quad f(x) = \frac12 x^T A x + b^T x, \]
where $A = uu^T + I$ for some $u \in \mathbb{R}^n \setminus \{0\}$, and $b \in \mathbb{R}^n$.
(a) Show that $A \succ 0$.
(b) Argue that the eigenvalues of $A$ are $1$ (with multiplicity $n - 1$) and $1 + \|u\|_2^2$.
(c) Suppose that the conjugate gradient method is applied to the above function, with $x^0 = 0$. Argue that $x^2 = -A^{-1}b$.
Solution:
(a) For any $x \in \mathbb{R}^n$, we have
\[ x^T A x = \|x\|_2^2 + (u^T x)^2 \ge 0. \]
Moreover, $x^T A x = 0$ implies $\|x\|_2^2 = 0$, which means $x = 0$. Thus, $A \succ 0$.
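(b) Let $V \in \mathbb{R}^{n\times(n-1)}$ be such that $V^T u = 0$ and $V^T V = I$. Set $W = \begin{bmatrix} V & \frac{u}{\|u\|_2} \end{bmatrix} \in \mathbb{R}^{n\times n}$. Then $W^T W = I$ and
\begin{align*}
W^T A W &= \begin{bmatrix} V^T \\ \frac{u^T}{\|u\|_2} \end{bmatrix} (I + uu^T) \begin{bmatrix} V & \frac{u}{\|u\|_2} \end{bmatrix}
= \begin{bmatrix} V^T + V^T uu^T \\ \frac{u^T}{\|u\|_2} + \frac{u^T uu^T}{\|u\|_2} \end{bmatrix} \begin{bmatrix} V & \frac{u}{\|u\|_2} \end{bmatrix} \\
&= \begin{bmatrix} V^T \\ (1 + \|u\|_2^2)\frac{u^T}{\|u\|_2} \end{bmatrix} \begin{bmatrix} V & \frac{u}{\|u\|_2} \end{bmatrix}
= \begin{bmatrix} I & 0 \\ 0 & 1 + \|u\|_2^2 \end{bmatrix} =: D.
\end{align*}
Since $W^T W = I$ and $W \in \mathbb{R}^{n\times n}$, the above shows that $W D W^T$ is an eigenvalue decomposition of $A$ and the eigenvalues of $A$ are $1$ (with multiplicity $n - 1$) and $1 + \|u\|_2^2$.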
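(c) By Luenberger's theorem, writing $x^* = -A^{-1}b$ and ordering the eigenvalues of $A$ as $\lambda_1 \le \cdots \le \lambda_n$, we have
\[ \|x^2 - x^*\|_2^2 \le (x^2 - x^*)^T A (x^2 - x^*) \le \left(\frac{\lambda_{n-1} - \lambda_1}{\lambda_{n-1} + \lambda_1}\right)^2 (x^0 - x^*)^T A (x^0 - x^*) = 0, \]
where the first inequality holds because $A \succeq I$, and the final equality holds since $\lambda_{n-1} = \lambda_1 = 1$ by part (b). Thus, $x^2 = x^* = -A^{-1}b$.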
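The two-step termination in Question 7(c) can be observed numerically. The sketch below runs a textbook linear conjugate gradient iteration on $f(x) = \frac12 x^T A x + b^T x$; the vectors `u` and `b` are assumed example data, and the helper names are hypothetical.

```python
# Conjugate gradient on f(x) = 0.5 x^T A x + b^T x with A = I + u u^T
# (illustrative sketch; u and b are assumed example data).

def matvec(M, x):
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def conjugate_gradient(A, b, x0, iters):
    x = list(x0)
    r = [ax + bi for ax, bi in zip(matvec(A, x), b)]   # gradient A x + b
    p = [-ri for ri in r]
    for _ in range(iters):
        rr = dot(r, r)
        if rr == 0.0:
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)                         # exact stepsize along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri + alpha * api for ri, api in zip(r, Ap)]
        beta = dot(r, r) / rr
        p = [-ri + beta * pi for ri, pi in zip(r, p)]
    return x

def residual_norm(A, b, x):
    # Max-norm of the gradient A x + b; zero exactly at x = -A^{-1} b.
    return max(abs(ax + bi) for ax, bi in zip(matvec(A, x), b))

u = [1.0, 2.0, 3.0, 4.0]
b = [1.0, -1.0, 2.0, 0.5]
n = len(u)
A = [[(1.0 if i == j else 0.0) + u[i] * u[j] for j in range(n)] for i in range(n)]
x_one = conjugate_gradient(A, b, [0.0] * n, 1)
x_two = conjugate_gradient(A, b, [0.0] * n, 2)

if __name__ == "__main__":
    print(residual_norm(A, b, x_one))   # still clearly nonzero after one step
    print(residual_norm(A, b, x_two))   # at rounding level after two steps
```

Two steps suffice here because $A$ has only two distinct eigenvalues, which is what part (b) establishes.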