Professional Documents
Culture Documents
Lecture14 PDF
Lecture14 PDF
Lecture 14 :
Banach fixed point theorem updated 17110119
then Cts CX Y ) is
complete metric with the do metric In
today 's
, a
space .
f X
of fixed point for X Many problems mathematics (
→
a
given in e.g
:
a
.
root finding ,
convex optimisation ,
or
solving ODES ) can ( through varying
degrees of chicanery ) be
phrased as fixed point problems . Hence
, general
theorems about fixed points tend to have widespread application .
1g
⇒
°
-
I :
iterate IR formula
is to
, beginning with
any xo E
,
the
"
" " " "
)
to
"
}
g ( xn )
anti an
I
: = -
I
.
q
Knt 2
Kw
Xnt I
KY g
g
of f ) fx )
'
'
of g (a) to since
precisely a root where g
9 ' a)
a
ya ) Es
g la ) =
O
-
a = -
②
Let V function
Example 44-2 V be a vector space , g
:X → a C
perhaps
linear ) and well and wish to solve
non
suppose we are
given
-
Why bother
rephrasing problems as fixed point problems ? Well ,
because
) f ( )
2
x
,
=
f ( Xo )
,
xz
=
f I x ,
=
f ( Xo ) ,
- .
.
,
X n ti
=
an .
) x* Lima f )
"
(
' '
( whatever
"
In the limit that means we have =
Ko and
,
f ( Lienz f Yao ) ) x*
ht
f- f x* )
'
= =
tizz f ( no ) =
lnirnafhcxo )
=
sequence
(f Tko ) ) 5=0 converges
:
one
way to ensure this is to assume
f- is a contraction ( so the
sequence
is Cauchy ) and that X is complete
*
that ) that
C so
Cauchy sequence converges .
The fact x is a fixed point
follows by precisely the above argument once we know it exists .
Def Let ( X
,
d) be a metric
space .
A function f :
X → X is called
( ) Xd ( ) It X
'
d foe ,
fx
'
E a
,
x
'
a
,
x E
.
Theorem LILI -
I I Banach fixed pt .
theorem ) Suppose ( X
,
d) is a complete
metric
space and f :X →
X is a contraction .
Then f- has
converges
to this fixed point .
d ( p, q ) =
d Cfp , fq ) E Xd C
p, 9)
is
unique .
It remains to
pure
existence .
Given x e X set an f C x
) .
(f 2x ) Xd ( fx fy ) d la ) induction
'
Note that d , f- Zy E
,
E A , y and by
also dffkx ,
fky ) E 7k d C x
, y ) for k > I .
We have for m > he
d ( am
, an ) E d ( am
,
am -
I ) t - .
-
t d ( antz
,
anti ) t d ( anti
,
an )
t d ( frit 2x
,
t d ( f 's
,
'
Xm It 'd ) )
-
S d ( foe ,
x ) t
- . -
t C f x
,
x t And I fx ,
x
[ ] )
'
Xm I d ( foe ,
-
= t . . .
t x
'
( Eino )
-
= Xn Xi dffx ,
x )
E
742
Eo Xi ) dffx ,
x )
= In .
.
d ( fx ,
x ) ( since OCXCI )
④
KI make
Since we
may
the RHS
arbitrarily small by making n
sufficiently
large , and so it follows that ( an ) F- o is
Cauchy .
Since X is complete
this to x* EX that
Cauchy sequence converges ,
say ,
is
"
f Ge )
x*
moan him
= =
*
f ( f ) ) nliynofnt ( )
'
"
flat ) =
Ling
C x = x =
x
so x* is a fixed point .
D
,
some
be still work to
phrased as fixed point problems ,
it
may
some show the
function g
is a contraction ( for example Example 44-2 can be used to
that walks
you through the linear case :
( IR ) Rn ( )
"
Exercise 44 I
Let A E Mn and let IR → be =
A
Av
x v
g- g
-
f
"
IR where
"
Define IR →
by f ( )
=
v t w w
'
-
v
-
is a fixed vector .
Prove that if there exists X e lo , 1) with
§ A X
I
I
Sij ij
E
for each Kien
-
=
,
then Av = w has a
unique solution v
using the
, f
to ?
Banach fixed point theorem applied
"
:
IR → IR
wisely )
"
( Hint :
choose
your
metric on IR .
⑤
Exercise 44-2
Suppose ( X
d ) is a compact metric and for XE to 1) let
space
, ,
,
Cts
,
( X ,
X ) Ects C XX )
be the
subspace of X contraction
mappings with the
-
topology ) .
Prove that the function
fix :
Cts a
( X ,
X ) > X
sending a contraction
mapping
to its
unique fixed point
is continuous .
⑥
(2) There is a
proof of the Implicit Function Theorem ( fundamental to
higher calculus aka differential geometry ) via the fixed point theorem
idea outlined in
Example 44-2 .
Atari
game )
its
utility .
Such a value function determines the agent 's
You be familiar with application from DeepMind
:
behaviour one
.
may
"
Silver Playing Atari with
V. Mnih ,
K .
Kavukcuoglu ,
D .
deep
( published
"
reinforcement arXiv 1312.5602
learning
:
in Nature in 2015 ) .
to an
optimal policy by iteration ( more on that in a moment ) and how
substitute .
The theoretical convergence is based on the Banach fixed pt .
th m :
"
•
S . Russell ,
P .
Norvig
"
Artificial intelligence : a modern approach 3rd ed .
517 . 2.3
"
Reinforcement learning f
"
•
R S . Sutton A. G. Barto 4 I
,
. .
.
-3
⑦
attainment measured
objective the
degree of of which
some is
,
any given is
they are in slate SES then they choose from a finite set of
to transition to state s
'
with
probability
PCs la ) At time step the agent
'
denoted ,
s
.
each receives a
reward Rls )
depending on their state
,
with Rls ) EIR .
We
,
probability Pls la ) only for finitely many !
'
fixed a
,
s the ,
s
is nonzero s
Teg .
8=2 ,
Als ) =
night
'
I if s - Stl a =
{
,
( la )
'
P s
,
s =
1 if s
'
=
s - I a
=
left
,
0 otherwise
R ( ) { 1000, ) right !
and s min es Move to win Note the probabilistic
=
say
instead
It:I
night
'
pcsya.sk
f 1 if s
' -
-
s
-
I
,
a = left
0 otherwise -1
RIE ,
T ) : =
It Tt R ( St ) ( why is this finite ? )
> o
:S A that
policy to be a
complete set of such choices
,
ie .
a function IT → such
"
Given a
starting state s
,
a
policy TL
,
and the
"
transition model ( meaning all the
probabilities obtain
probability
s over
, we a
sequences
s with PCs ) being the probability an
agent initially in state s
, following IT
, ,
( ) § )
"
E U Is) : =
E RH ,
s ) =
PIE ) Rtr ,
E
.
§
&
± "
Tis state the that U (s )
a-
The optimal policy beginning in s is one maximise
It
independent of
the
over all IT and it turns out this is s call it The
,
.
utility
E
E (*, ) Is)
"
of a state s is then U s
which we denote U .
This all seems
I
Now here's the brilliant trick ! To get around the morass in the box we can define
,
S → IR ) of an
) )
§sP la
l Us
'
) 8
'
U I s ) =
Rls t .
s
,
s
ameaays
, -
!
I
^
finite
⑨
,x÷s ) VCs )
Pls la
' '
*
I Cs ) : =
,
s
aargem.mg
.
) sup {
S } X
'
that d I U, U
' =
I U Is ) -
o Is ) I I se makes
prove
a
complete metric space .
⇐sp la ) )
l Us
'
) 8
'
flu ) I s ) =
Rls t .
ameaays
s
,
s
Conclude that by the Banach fixed point theorem f has a unique fixed point
unique som )
"
( hence the Bellman eq has a and that given any initial value
'
f ( )
'
function Uo the Ui : =
Uo
converges to
that IN and hence
sequence
so
,
It
Ufs )
Pls la )
' '
Cs ) s
aargemn.gg?s
: =
,
converges
" "
to
optimal policy an as i → a .
(
set of A discrete
the
policies does not have any reasonable topology as is .