Obd 04 PDF
Proof of Lemma 38
In the first step we have expanded the square and in the second step we
have used linearity of expectation.
Weak Convergence
Theorem 39 (Weak Convergence 1)
Choose any $x_0 \in \mathbb{R}^n$ and let $\{x_k\}$ be the random iterates produced by
Algorithm 2. Let $x_* \in \mathcal{L}$ be chosen arbitrarily. Then
$$\mathbb{E}[x_{k+1} - x_*] = \left(I - \omega B^{-1}\mathbb{E}[Z]\right)\mathbb{E}[x_k - x_*]. \tag{35}$$
Theorem 40 (Convergence 2)
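Let $x_* = \Pi^B_{\mathcal{L}}(x_0)$. Then for all $i = 1, 2, \dots, n$,
$$\mathbb{E}\left[u_i^\top B^{1/2}(x_k - x_*)\right] = \begin{cases} 0 & \text{if } \lambda_i = 0, \\ (1 - \omega\lambda_i)^k\, u_i^\top B^{1/2}(x_0 - x_*) & \text{if } \lambda_i > 0. \end{cases} \tag{39}$$
Moreover,
$$\|\mathbb{E}[x_k - x_*]\|_B^2 \le \rho^k(\omega)\,\|x_0 - x_*\|_B^2, \tag{40}$$
where the rate is given by
$$\rho(\omega) \overset{\text{def}}{=} \max_{i:\,\lambda_i > 0} (1 - \omega\lambda_i)^2. \tag{41}$$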
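The recursion (35) and the bound (40) can be checked numerically on a small example. The following is a minimal sketch, not the authors' code. It assumes $B = I$ and a hypothetical $2 \times 2$ matrix `EZ` standing in for $\mathbb{E}[Z]$ (the average of the projections onto the rows $(1, 0)$ and $(1, 1)$, each sampled with probability 1/2). Since this `EZ` is nonsingular, $\mathcal{L}$ is a single point and the expected error is propagated exactly by (35). All function names are illustrative.

```python
import math

# Hypothetical example with B = I: E[Z] for uniform sampling of the rows
# a1 = (1, 0) and a2 = (1, 1), i.e. 0.5 * a1 a1^T / |a1|^2 + 0.5 * a2 a2^T / |a2|^2.
EZ = [[0.75, 0.25],
      [0.25, 0.25]]

def eigenvalues_2x2_sym(M):
    """Eigenvalues of a symmetric 2x2 matrix, smallest first."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)
    return (tr - disc) / 2.0, (tr + disc) / 2.0

def expected_error_step(e, omega):
    """One step of (35) with B = I: e <- (I - omega * E[Z]) e."""
    Ze = [EZ[0][0] * e[0] + EZ[0][1] * e[1],
          EZ[1][0] * e[0] + EZ[1][1] * e[1]]
    return [e[0] - omega * Ze[0], e[1] - omega * Ze[1]]

def check_weak_bound(e0, omega, num_steps):
    """True if ||E e_k||^2 <= rho(omega)^k * ||e0||^2 for k = 0..num_steps."""
    eigs = eigenvalues_2x2_sym(EZ)
    rho = max((1.0 - omega * lam) ** 2 for lam in eigs if lam > 1e-12)
    norm0 = e0[0] ** 2 + e0[1] ** 2
    e = list(e0)
    for k in range(num_steps + 1):
        lhs = e[0] ** 2 + e[1] ** 2
        if lhs > rho ** k * norm0 + 1e-12:  # small tolerance for rounding
            return False
        e = expected_error_step(e, omega)
    return True

print(check_weak_bound([1.0, -2.0], omega=1.0, num_steps=30))
```

The check only exercises the deterministic recursion for the expected error; it says nothing about the behaviour of individual random trajectories.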
Proof of Theorems 39 and 40 - I
We start with a lemma.
Lemma 42
Let Assumption 3 (exactness) hold. Consider arbitrary $x \in \mathbb{R}^n$ and let
$x_* = \Pi^B_{\mathcal{L}}(x)$. If $\lambda_i = 0$, then $u_i^\top B^{1/2}(x - x_*) = 0$.
Proof.
From (19) we see that $x - x_* = B^{-1}A^\top w$ for some $w \in \mathbb{R}^m$. Therefore,
$u_i^\top B^{1/2}(x - x_*) = u_i^\top B^{-1/2}A^\top w$. By Theorem 29, we have
$\mathrm{Range}(u_i : \lambda_i = 0) = \mathrm{Null}(AB^{-1/2})$, so $AB^{-1/2}u_i = 0$, from which it follows that
$u_i^\top B^{-1/2}A^\top = 0$.
Proof of Theorem 39: Algorithm 2 can be written in the form
$$e_{k+1} = (I - \omega B^{-1}Z_k)\,e_k, \tag{42}$$
where $e_k = x_k - x_*$.
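Taking expectations on both sides and using the tower property, we get
$$\mathbb{E}\left[B^{1/2}e_{k+1}\right] = \mathbb{E}\left[\mathbb{E}\left[B^{1/2}e_{k+1} \mid e_k\right]\right] = \left(I - \omega B^{-1/2}\mathbb{E}[Z]B^{-1/2}\right)\mathbb{E}\left[B^{1/2}e_k\right].$$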
Proof of Theorems 39 and 40 - III
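Finally, inequality (40) follows from
$$\begin{aligned}
\|\mathbb{E}[x_k - x_*]\|_B^2 &\overset{(38)}{=} \sum_{i=1}^n (1 - \omega\lambda_i)^{2k}\left(u_i^\top B^{1/2}(x_0 - x_*)\right)^2 \\
&= \sum_{i:\,\lambda_i > 0} (1 - \omega\lambda_i)^{2k}\left(u_i^\top B^{1/2}(x_0 - x_*)\right)^2 \\
&\overset{(41)}{\le} \rho^k(\omega) \sum_{i:\,\lambda_i > 0} \left(u_i^\top B^{1/2}(x_0 - x_*)\right)^2 \\
&= \rho^k(\omega) \sum_{i:\,\lambda_i > 0} \left(u_i^\top B^{1/2}(x_0 - x_*)\right)^2 + \rho^k(\omega) \sum_{i:\,\lambda_i = 0} \left(u_i^\top B^{1/2}(x_0 - x_*)\right)^2 \\
&= \rho^k(\omega) \sum_{i} \left(u_i^\top B^{1/2}(x_0 - x_*)\right)^2 \\
&= \rho^k(\omega) \sum_{i} (x_0 - x_*)^\top B^{1/2} u_i u_i^\top B^{1/2}(x_0 - x_*) \\
&= \rho^k(\omega)\,(x_0 - x_*)^\top B^{1/2}\left(\sum_{i} u_i u_i^\top\right) B^{1/2}(x_0 - x_*) = \rho^k(\omega)\,\|x_0 - x_*\|_B^2.
\end{aligned}$$
The last identity follows from the fact that $\sum_i u_i u_i^\top = UU^\top = I$.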
Convergence Rate as a Function of $\omega$
In view of (40) and (41), the optimal relaxation parameter is the one
solving the following optimization problem:
$$\min_{\omega \in \mathbb{R}} \left\{ \rho(\omega) = \max_{i:\,\lambda_i > 0} (1 - \omega\lambda_i)^2 \right\}. \tag{43}$$
Optimal Stepsize
Theorem 43 (Stepsize Choice)
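Let $\omega^* \overset{\text{def}}{=} 2/(\lambda^+_{\min} + \lambda_{\max})$. Then the objective of (43) is given by
$$\rho(\omega) = \begin{cases} (1 - \omega\lambda_{\max})^2 & \text{if } \omega \le 0, \\ (1 - \omega\lambda^+_{\min})^2 & \text{if } 0 \le \omega \le \omega^*, \\ (1 - \omega\lambda_{\max})^2 & \text{if } \omega \ge \omega^*. \end{cases} \tag{44}$$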
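The closed form (44) and the choice of $\omega^*$ can be sanity-checked on a made-up spectrum. The sketch below is illustrative only: the eigenvalue list `eigs` is hypothetical, and the function names are not from the source. It compares the definition (41) with the piecewise formula (44) on a grid and locates the grid minimizer of $\rho$.

```python
def rho(omega, eigs):
    """Rate (41): max over positive eigenvalues of (1 - omega * lambda_i)^2."""
    return max((1.0 - omega * lam) ** 2 for lam in eigs if lam > 0)

def optimal_stepsize(eigs):
    """omega* = 2 / (lambda_min^+ + lambda_max)."""
    positive = [lam for lam in eigs if lam > 0]
    return 2.0 / (min(positive) + max(positive))

def rho_piecewise(omega, eigs):
    """Closed form (44), which uses only lambda_min^+ and lambda_max."""
    positive = [lam for lam in eigs if lam > 0]
    lam_min, lam_max = min(positive), max(positive)
    if 0.0 <= omega <= optimal_stepsize(eigs):
        return (1.0 - omega * lam_min) ** 2
    return (1.0 - omega * lam_max) ** 2

# Hypothetical spectrum of B^{-1/2} E[Z] B^{-1/2}; zero eigenvalues are ignored.
eigs = [0.0, 0.1, 0.4, 0.9]
omega_star = optimal_stepsize(eigs)
grid = [i / 100.0 for i in range(-100, 401)]
best_on_grid = min(grid, key=lambda w: rho(w, eigs))
print(omega_star, best_on_grid, rho(omega_star, eigs))
```

For this spectrum the rate at $\omega^*$ equals $\left((\lambda_{\max} - \lambda^+_{\min})/(\lambda_{\max} + \lambda^+_{\min})\right)^2$, which follows by substituting $\omega^*$ into either branch of (44) that meets at $\omega^*$.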
$\rho_i(\omega) = (1 - \omega\lambda_i)^2,$
Strong Convergence
Decrease of Distance is Proportional to $f_S$
Remarks: Equation (48) says that for any $x_* \in \mathcal{L}$, in the $k$-th iteration of
Algorithm 2 the squared $B$-norm distance of the current iterate from $x_*$ decreases by the
amount $2\omega(2 - \omega)f_{S_k}(x_k)$.
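The one-step decrease described in the remark can be observed directly in a special case. The sketch below is an illustration under stated assumptions, not the general Algorithm 2: it takes $B = I$ and single-row sketches $S = e_i$, in which case the step is a relaxed Kaczmarz update and $f_S(x) = \tfrac{1}{2}(a_i^\top x - b_i)^2/\|a_i\|^2$. The matrix `A`, the point `x_star`, and all function names are hypothetical.

```python
import random

def kaczmarz_step(x, a, b_i, omega):
    """Relaxed Kaczmarz step for one row a and right-hand side b_i (B = I)."""
    residual = sum(aj * xj for aj, xj in zip(a, x)) - b_i
    scale = omega * residual / sum(aj * aj for aj in a)
    return [xj - scale * aj for xj, aj in zip(x, a)]

def f_S(x, a, b_i):
    """f_S(x) = 0.5 * (a^T x - b_i)^2 / ||a||^2 for S = e_i and B = I."""
    residual = sum(aj * xj for aj, xj in zip(a, x)) - b_i
    return 0.5 * residual ** 2 / sum(aj * aj for aj in a)

def sq_dist(x, y):
    """Squared Euclidean distance between x and y."""
    return sum((xj - yj) ** 2 for xj, yj in zip(x, y))

def max_identity_gap(omega, num_steps=50, seed=0):
    """Largest deviation from the one-step identity along one random run."""
    rng = random.Random(seed)
    A = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # hypothetical consistent system
    x_star = [1.0, -1.0, 0.5]                # one solution; b is built from it
    b = [sum(aj * xj for aj, xj in zip(row, x_star)) for row in A]
    x = [3.0, 2.0, -4.0]
    gap = 0.0
    for _ in range(num_steps):
        i = rng.randrange(len(A))
        x_next = kaczmarz_step(x, A[i], b[i], omega)
        predicted = sq_dist(x, x_star) - 2.0 * omega * (2.0 - omega) * f_S(x, A[i], b[i])
        gap = max(gap, abs(sq_dist(x_next, x_star) - predicted))
        x = x_next
    return gap

print(max_identity_gap(omega=1.5))
```

The system here is underdetermined, so `x_star` is only one of many solutions; the identity is stated for any $x_* \in \mathcal{L}$, which is why an arbitrary solution suffices for the check.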
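$$x^\top B^{-1/2}\mathbb{E}[Z]B^{-1/2}x \ge \lambda^+_{\min}\left(B^{-1/2}\mathbb{E}[Z]B^{-1/2}\right)x^\top x \tag{50}$$
holds for all $x \in \mathrm{Range}(M^\top)$. Applying this with $M = (\mathbb{E}[Z])^{1/2}B^{-1/2}$,
we see that (50) holds for all $x \in \mathrm{Range}\left(B^{-1/2}(\mathbb{E}[Z])^{1/2}\right)$. However,
$$\begin{aligned}
\mathrm{Range}\left(B^{-1/2}(\mathbb{E}[Z])^{1/2}\right) &= \mathrm{Range}\left(B^{-1/2}(\mathbb{E}[Z])^{1/2}\left(B^{-1/2}(\mathbb{E}[Z])^{1/2}\right)^\top\right) \\
&= \mathrm{Range}\left(B^{-1/2}\mathbb{E}[Z]B^{-1/2}\right) = \mathrm{Range}\left(B^{-1/2}A^\top\right),
\end{aligned}$$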
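$$\begin{aligned}
\|x_{k+1} - x_k\|_B^2 &= \omega^2\,\|B^{-1}Z_k(x_k - x_*)\|_B^2 \\
&\overset{(21)}{=} \omega^2\,(x_k - x_*)^\top Z_k (x_k - x_*) \\
&\overset{(22)}{=} 2\omega^2 f_{S_k}(x_k).
\end{aligned} \tag{51}$$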
In a similar vein,
Proof of Lemma 44 - II
establishing (48).
Quadratic Bounds
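$$\lambda^+_{\min} \cdot f(x) \le \frac{1}{2}\|\nabla f(x)\|_B^2 \le \lambda_{\max} \cdot f(x) \tag{53}$$
and
$$f(x) \le \frac{\lambda_{\max}}{2}\,\|x - x_*\|_B^2. \tag{54}$$
Moreover, if Assumption 3 holds, then for all $x \in \mathbb{R}^n$ and $x_* = \Pi^B_{\mathcal{L}}(x)$ we
have
$$\frac{\lambda^+_{\min}}{2}\,\|x - x_*\|_B^2 \le f(x). \tag{55}$$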
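The bounds (53)-(55) can be spot-checked numerically. The sketch below is a hedged illustration, assuming $B = I$ and a hypothetical nonsingular $2 \times 2$ matrix `EZ` in place of $\mathbb{E}[Z]$, so that $\lambda^+_{\min}$ is simply the smaller eigenvalue and $\mathcal{L}$ is a single point. It uses $f(x) = \tfrac{1}{2}e^\top \mathbb{E}[Z]\, e$ and $\nabla f(x) = \mathbb{E}[Z]\, e$ with $e = x - x_*$; the function names are illustrative.

```python
import math

# Hypothetical E[Z] with B = I (nonsingular, eigenvalues (1 +/- sqrt(0.5)) / 2).
EZ = [[0.75, 0.25],
      [0.25, 0.25]]

def matvec(M, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def quadratic_bounds_hold(e, tol=1e-12):
    """Check (53), (54) and (55) for the error vector e = x - x_*."""
    tr = EZ[0][0] + EZ[1][1]
    det = EZ[0][0] * EZ[1][1] - EZ[0][1] * EZ[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)
    lam_min, lam_max = (tr - disc) / 2.0, (tr + disc) / 2.0
    grad = matvec(EZ, e)               # gradient of f when B = I
    f = 0.5 * dot(e, grad)             # f(x) = 0.5 * e^T E[Z] e
    half_grad_sq = 0.5 * dot(grad, grad)
    dist_sq = dot(e, e)
    return (lam_min * f <= half_grad_sq + tol
            and half_grad_sq <= lam_max * f + tol
            and f <= 0.5 * lam_max * dist_sq + tol
            and 0.5 * lam_min * dist_sq <= f + tol)

print(all(quadratic_bounds_hold(e) for e in ([1.0, -2.0], [0.3, 0.7], [5.0, 0.0])))
```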
Proof of Lemma 46 - I
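In view of (17) and (33), we obtain a spectral characterization of $f$:
$$f(x) = \frac{1}{2}\sum_{i=1}^n \lambda_i \left(u_i^\top B^{1/2}(x - x_*)\right)^2, \tag{56}$$
where $x_*$ is any point in $\mathcal{L}$. On the other hand, in view of (28) and (33),
we have
$$\begin{aligned}
\|\nabla f(x)\|_B^2 &= \|B^{-1}\mathbb{E}[Z](x - x_*)\|_B^2 && (57) \\
&= (x - x_*)^\top \mathbb{E}[Z]\,B^{-1}\,\mathbb{E}[Z](x - x_*) \\
&= (x - x_*)^\top B^{1/2}\left(B^{-1/2}\mathbb{E}[Z]B^{-1/2}\right)\left(B^{-1/2}\mathbb{E}[Z]B^{-1/2}\right)B^{1/2}(x - x_*) \\
&= (x - x_*)^\top B^{1/2}\,U\left(U^\top B^{-1/2}\mathbb{E}[Z]B^{-1/2}U\right)^2 U^\top B^{1/2}(x - x_*) \\
&\overset{(33)}{=} (x - x_*)^\top B^{1/2}\,U\Lambda^2 U^\top B^{1/2}(x - x_*) \\
&= \sum_{i=1}^n \lambda_i^2 \left(u_i^\top B^{1/2}(x - x_*)\right)^2. && (58)
\end{aligned}$$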
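Inequality (53) follows by comparing (56) and (58), using the bounds
$$\lambda^+_{\min}\lambda_i \le \lambda_i^2 \le \lambda_{\max}\lambda_i,$$
which hold for every $i$ with $\lambda_i > 0$ (and trivially when $\lambda_i = 0$).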
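We now move to the bounds involving norms. First, note that for any
$x_* \in \mathcal{L}$ we have
$$\begin{aligned}
f(x) &\overset{(17)}{=} \frac{1}{2}(x - x_*)^\top \mathbb{E}[Z](x - x_*) && (59) \\
&= \frac{1}{2}\left(B^{1/2}(x - x_*)\right)^\top \left(B^{-1/2}\mathbb{E}[Z]B^{-1/2}\right) B^{1/2}(x - x_*).
\end{aligned}$$
The upper bound follows by applying the inequality
$B^{-1/2}\mathbb{E}[Z]B^{-1/2} \preceq \lambda_{\max} I$.
If $x_* = \Pi^B_{\mathcal{L}}(x)$, then in view of (19), we have
$$B^{1/2}(x - x_*) \in \mathrm{Range}\left(B^{-1/2}A^\top\right).$$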
Strong Convergence
Theorem 47 (Strong convergence)
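Let Assumption 3 (exactness) hold and set $x_* = \Pi^B_{\mathcal{L}}(x_0)$. Let $\{x_k\}$ be the
random iterates produced by Algorithm 2, where the relaxation parameter
satisfies $0 < \omega < 2$, and let $r_k \overset{\text{def}}{=} \mathbb{E}\left[\|x_k - x_*\|_B^2\right]$. Then for all $k \ge 0$ we
have
$$\left(1 - \omega(2 - \omega)\lambda_{\max}\right)^k r_0 \le r_k \le \left(1 - \omega(2 - \omega)\lambda^+_{\min}\right)^k r_0. \tag{60}$$
The best rate is achieved when $\omega = 1$.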
Proof.
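Let $\varphi_k = \mathbb{E}[f(x_k)]$. We have
$$r_{k+1} \overset{(49)}{=} r_k - 2\omega(2 - \omega)\varphi_k \overset{(55)}{\le} r_k - \omega(2 - \omega)\lambda^+_{\min}\, r_k,$$
and
$$r_{k+1} \overset{(49)}{=} r_k - 2\omega(2 - \omega)\varphi_k \overset{(54)}{\ge} r_k - \omega(2 - \omega)\lambda_{\max}\, r_k.$$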
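The two-sided bound (60) can be verified exactly on a toy instance by propagating the second moment $\mathbb{E}[e_k e_k^\top]$ of the error $e_k = x_k - x_*$, whose trace is $r_k$ when $B = I$. This is a minimal sketch under assumptions that are not in the source: $B = I$, two hypothetical rank-one projections `Z[0]`, `Z[1]` (for the rows $(1, 0)$ and $(1, 1)$) sampled with probability 1/2 each, and illustrative function names.

```python
import math

# Hypothetical setup with B = I: projections onto the rows (1, 0) and (1, 1),
# each sampled with probability 1/2, so E[Z] = [[0.75, 0.25], [0.25, 0.25]].
Z = [[[1.0, 0.0], [0.0, 0.0]],
     [[0.5, 0.5], [0.5, 0.5]]]
PROBS = [0.5, 0.5]

def matmul(P, Q):
    """Product of two 2x2 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def second_moment_step(M, omega):
    """M = E[e e^T]  ->  sum_i p_i (I - omega Z_i) M (I - omega Z_i)^T."""
    new = [[0.0, 0.0], [0.0, 0.0]]
    for p, Zi in zip(PROBS, Z):
        T = [[(1.0 if i == j else 0.0) - omega * Zi[i][j] for j in range(2)]
             for i in range(2)]
        Tt = [[T[j][i] for j in range(2)] for i in range(2)]
        TMT = matmul(matmul(T, M), Tt)
        for i in range(2):
            for j in range(2):
                new[i][j] += p * TMT[i][j]
    return new

def strong_bounds_hold(e0, omega, num_steps, tol=1e-12):
    """Check (60) for k = 0..num_steps, with r_k = trace(E[e_k e_k^T])."""
    disc = math.sqrt(1.0 - 4.0 * 0.125)   # E[Z] has trace 1 and determinant 1/8
    lam_min, lam_max = (1.0 - disc) / 2.0, (1.0 + disc) / 2.0
    c = omega * (2.0 - omega)
    M = [[e0[i] * e0[j] for j in range(2)] for i in range(2)]
    r0 = M[0][0] + M[1][1]
    for k in range(num_steps + 1):
        r_k = M[0][0] + M[1][1]
        lower = (1.0 - c * lam_max) ** k * r0
        upper = (1.0 - c * lam_min) ** k * r0
        if not (lower - tol <= r_k <= upper + tol):
            return False
        M = second_moment_step(M, omega)
    return True

print(strong_bounds_hold([1.0, -2.0], omega=1.0, num_steps=25))
```

Propagating the second moment avoids Monte Carlo noise: the quantity $r_k$ is computed exactly (up to rounding) rather than estimated from sampled trajectories.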
Convergence of $f(x_k)$
Theorem 48 (Convergence of f )
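Choose $x_0 \in \mathbb{R}^n$, and let $\{x_k\}_{k=0}^{\infty}$ be the random iterates produced by
Algorithm 2, where the relaxation parameter satisfies $0 < \omega < 2$.
(i) Let $x_* \in \mathcal{L}$. The average iterate $\hat{x}_k \overset{\text{def}}{=} \frac{1}{k}\sum_{t=0}^{k-1} x_t$ for all $k \ge 1$
satisfies
$$\mathbb{E}[f(\hat{x}_k)] \le \frac{\|x_0 - x_*\|_B^2}{2\omega(2 - \omega)k}. \tag{61}$$
(ii) Now let Assumption 3 hold. For $x_* = \Pi^B_{\mathcal{L}}(x_0)$ and $k \ge 0$ we have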
Proof of Theorem 48
(i) Let $\varphi_k = \mathbb{E}[f(x_k)]$ and $r_k = \mathbb{E}\left[\|x_k - x_*\|_B^2\right]$. By summing up the
identities from (49), we get
$$2\omega(2 - \omega)\sum_{t=0}^{k-1}\varphi_t = r_0 - r_k.$$
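One way to finish the argument from this identity is sketched below. It writes $\varphi_t$ for $\mathbb{E}[f(x_t)]$ and assumes only that $f$ is convex (it is a quadratic with positive semidefinite matrix $\mathbb{E}[Z]$, see (59)), that $r_k \ge 0$, and that $r_0 = \|x_0 - x_*\|_B^2$ because $x_0$ is deterministic. The first inequality is Jensen's inequality applied to the average iterate.

```latex
\mathbb{E}\left[f(\hat{x}_k)\right]
  \le \mathbb{E}\left[\frac{1}{k}\sum_{t=0}^{k-1} f(x_t)\right]
  = \frac{1}{k}\sum_{t=0}^{k-1}\varphi_t
  = \frac{r_0 - r_k}{2\omega(2-\omega)k}
  \le \frac{r_0}{2\omega(2-\omega)k}
  = \frac{\|x_0 - x_*\|_B^2}{2\omega(2-\omega)k}.
```

This recovers the bound (61); the division by $2\omega(2 - \omega)$ is legitimate because $0 < \omega < 2$.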