Lecture14 PDF

Lecture 14: Banach fixed point theorem (updated 17/10/19)
We now know that if $X$ is compact and $(Y, d_Y)$ is a complete metric space, then $\mathrm{Cts}(X, Y)$ is a complete metric space with the $d_\infty$ metric. In today's lecture we examine an important theorem about complete metric spaces, the Banach fixed point theorem, which will be applied next lecture to prove the existence and uniqueness of solutions to (ordinary) differential equations.
Def. A fixed point of a function $f : X \to X$ is $x \in X$ such that $f(x) = x$.
A fixed point problem is the problem of proving the existence and/or uniqueness of a fixed point for a given $f : X \to X$. Many problems in mathematics (e.g. root finding, convex optimisation, or solving ODEs) can (through varying degrees of chicanery) be phrased as fixed point problems. Hence, general theorems about fixed points tend to have widespread application.
Example L14-1 (Newton's algorithm). Suppose $g : \mathbb{R} \to \mathbb{R}$ is differentiable and we wish to find a solution $a$ of $g(a) = 0$. Newton's method is to iterate, beginning with any $x_0 \in \mathbb{R}$, the formula

$$x_{n+1} := x_n - \frac{g(x_n)}{g'(x_n)}.$$

[Figure: graph of $g$ showing the successive iterates $x_n$, $x_{n+1}$, $x_{n+2}$.]

Observe that a fixed point of $f(x) = x - \frac{g(x)}{g'(x)}$ is precisely a root of $g$ (where $g'(a) \neq 0$), since

$$a = a - \frac{g(a)}{g'(a)} \iff g(a) = 0.$$
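The following is a minimal sketch of Newton's method written as iteration of the map $f(x) = x - g(x)/g'(x)$. The choice $g(x) = x^2 - 2$, the starting point and the helper name `newton_map` are illustrative assumptions, not taken from the lecture.

```python
def newton_map(g, dg):
    """Return f(x) = x - g(x)/g'(x); its fixed points are roots of g (where g' != 0)."""
    def f(x):
        return x - g(x) / dg(x)
    return f

# Illustrative choice (an assumption): g(x) = x^2 - 2, whose positive root is sqrt(2).
g = lambda x: x * x - 2.0
dg = lambda x: 2.0 * x

f = newton_map(g, dg)
x = 1.0                      # starting point x_0
for _ in range(6):           # x_{n+1} = f(x_n)
    x = f(x)

root = x
residual_fixed_point = abs(f(root) - root)   # how far root is from being fixed by f
residual_root = abs(g(root))                 # how far root is from being a root of g
```

Both residuals are small at the same time, which is the equivalence observed above in numerical form.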
Example L14-2. Let $V$ be a vector space, $g : V \to V$ a function (perhaps non-linear) and suppose we are given $w \in V$ and wish to solve the equation $g(v) = w$ for $v$. This is equivalent to finding a fixed point of $f(v) = v + g(v) - w$.
Why bother rephrasing problems as fixed point problems? Well, because finding a fixed point of a function $f : X \to X$ is (at least conceptually) trivial: just iterate $f$ infinitely many times! Choose $x_0 \in X$ and set

$$x_1 = f(x_0), \quad x_2 = f(x_1) = f^2(x_0), \quad \ldots, \quad x_{n+1} = f(x_n), \quad \ldots$$

In the limit (whatever that means) we have $x^* = \lim_{n \to \infty} f^n(x_0)$ and, provided $f$ is continuous, we find that $x^*$ is a fixed point:

$$f(x^*) = f\Big(\lim_{n \to \infty} f^n(x_0)\Big) = \lim_{n \to \infty} f^{n+1}(x_0) = \lim_{n \to \infty} f^n(x_0) = x^*.$$
This is an informal argument, but it can be made rigorous provided the sequence $(f^n(x_0))_{n=0}^\infty$ converges: one way to ensure this is to assume $f$ is a contraction (so that the sequence is Cauchy) and that $X$ is complete (so that the Cauchy sequence converges). The fact that $x^*$ is a fixed point follows by precisely the above argument once we know it exists.
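The "just iterate" recipe can be sketched in a few lines. The stopping rule (successive iterates closer than `tol`), the function name `iterate` and the test map $\cos$ are assumptions made for illustration. Here $\cos$ maps $[0, 1]$ into itself and contracts distances there.

```python
import math

def iterate(f, x0, tol=1e-12, max_steps=10_000):
    """Iterate x_{n+1} = f(x_n) until two successive iterates are within tol."""
    x = x0
    for n in range(max_steps):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, n + 1      # approximate fixed point and steps used
        x = x_next
    raise RuntimeError("no convergence within max_steps")

# Illustrative map (an assumption): cos on [0, 1].
x_star, steps = iterate(math.cos, 1.0)
```

Stopping when successive iterates agree is only a practical heuristic. The theorem below is what guarantees that the sequence actually converges when $f$ is a contraction.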
Def. Let $(X, d)$ be a metric space. A function $f : X \to X$ is called a contraction if there exists $\lambda \in (0, 1)$ such that

$$d(fx, fx') \le \lambda\, d(x, x') \quad \text{for all } x, x' \in X.$$

(We call $f$ a $\lambda$-contraction.) Clearly a contraction is continuous.
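For completeness, here is one way to spell out the continuity claim. It is a short sketch using only the defining inequality, with $\delta$ chosen equal to $\varepsilon$.

```latex
% Given \varepsilon > 0 take \delta = \varepsilon. If d(x, x') < \delta then
\[
  d(fx, fx') \;\le\; \lambda\, d(x, x') \;\le\; d(x, x') \;<\; \varepsilon ,
\]
% so f is continuous (in fact uniformly continuous, since \delta does not depend on x).
```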
Theorem L14-1 (Banach fixed pt. theorem). (From his PhD thesis! No pressure.) Suppose $(X, d)$ is a complete metric space and $f : X \to X$ is a contraction. Then $f$ has a unique fixed point. For any $x \in X$ the sequence $(f^n x)_{n=0}^\infty$ converges to this fixed point.
Proof. Let $\lambda$ be the contraction factor of $f$ as above. If $f(p) = p$, $f(q) = q$ then

$$d(p, q) = d(fp, fq) \le \lambda\, d(p, q),$$

which is a contradiction unless $d(p, q) = 0$. Hence a fixed point, if it exists, is unique. It remains to prove existence. Given $x \in X$ set $a_n = f^n(x)$. Note that $d(f^2 x, f^2 y) \le \lambda\, d(fx, fy) \le \lambda^2 d(x, y)$ and by induction also $d(f^k x, f^k y) \le \lambda^k d(x, y)$ for $k \ge 1$. We have for $m > n$

$$
\begin{aligned}
d(a_m, a_n) &\le d(a_m, a_{m-1}) + \cdots + d(a_{n+2}, a_{n+1}) + d(a_{n+1}, a_n) \\
&= d(f^m x, f^{m-1} x) + \cdots + d(f^{n+2} x, f^{n+1} x) + d(f^{n+1} x, f^n x) \\
&\le \lambda^{m-1} d(fx, x) + \cdots + \lambda^{n+1} d(fx, x) + \lambda^n d(fx, x) \\
&= \big[ \lambda^{m-1} + \cdots + \lambda^n \big]\, d(fx, x) \\
&= \lambda^n \Big( \sum_{i=0}^{m-n-1} \lambda^i \Big)\, d(fx, x) \\
&\le \lambda^n \Big( \sum_{i=0}^{\infty} \lambda^i \Big)\, d(fx, x) \\
&= \frac{\lambda^n}{1 - \lambda}\, d(fx, x) \qquad (\text{since } 0 < \lambda < 1).
\end{aligned}
$$
Since $\lambda < 1$ we may make the RHS arbitrarily small by making $n$ sufficiently large, and so it follows that $(a_n)_{n=0}^\infty$ is Cauchy. Since $X$ is complete this Cauchy sequence converges, say to $x^* \in X$, that is,

$$x^* = \lim_{n \to \infty} a_n = \lim_{n \to \infty} f^n(x).$$

But by Lemma L8-4 we have

$$f(x^*) = f\Big(\lim_{n \to \infty} f^n(x)\Big) = \lim_{n \to \infty} f^{n+1}(x) = x^*,$$

so $x^*$ is a fixed point. $\square$
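Letting $m \to \infty$ in the estimate from the proof gives the a priori error bound $d(a_n, x^*) \le \frac{\lambda^n}{1-\lambda} d(fx, x)$. The sketch below checks this numerically in one illustrative case, which is an assumption and not from the lecture: $f = \cos$ on the complete space $[0, 1]$, where the mean value theorem gives the contraction factor $\lambda = \sin(1)$.

```python
import math

f = math.cos          # maps [0, 1] into itself
lam = math.sin(1.0)   # sup of |f'| on [0, 1], so f is a lam-contraction there

x0 = 0.0
a = [x0]
for _ in range(200):              # a_n = f^n(x0)
    a.append(f(a[-1]))
x_star = a[-1]                    # numerical stand-in for the limit x*
step0 = abs(a[1] - a[0])          # d(fx, x)

# Compare the true error d(a_n, x*) with the bound lam^n / (1 - lam) * d(fx, x).
rows = [(n, abs(a[n] - x_star), lam ** n / (1 - lam) * step0) for n in range(0, 40, 5)]
bound_holds = all(err <= bound for _, err, bound in rows)
```

In this case the bound is far from sharp, because $\cos$ contracts faster near its fixed point than the worst-case factor $\sin(1)$ suggests.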
While Examples L14-1 and L14-2 hint at some of the problems that may be phrased as fixed point problems, it may still be some work to show that the associated function $f$ is a contraction (for example, Example L14-2 can be used to prove the Implicit Function Theorem, the hypotheses for which create the circumstances for the $f$ built from that particular $g$ to be a contraction). Here is an exercise that walks you through the linear case:
Exercise L14-1. Let $A \in M_n(\mathbb{R})$ and let $g = g_A : \mathbb{R}^n \to \mathbb{R}^n$ be $g(v) = Av$. Define $f : \mathbb{R}^n \to \mathbb{R}^n$ by $f(v) = v + w - Av$ where $w \in \mathbb{R}^n$ is a fixed vector. Prove that if there exists $\lambda \in (0, 1)$ with

$$\sum_{j=1}^{n} \big| \delta_{ij} - A_{ij} \big| \le \lambda \quad \text{for each } 1 \le i \le n,$$

then $Av = w$ has a unique solution $v$, using the Banach fixed point theorem applied to $f : \mathbb{R}^n \to \mathbb{R}^n$. (Hint: choose your metric on $\mathbb{R}^n$ wisely.)
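As a numerical illustration (not a proof, and not a substitute for the exercise), the sketch below iterates $f(v) = v + w - Av$ for one hypothetical matrix $A$ and vector $w$ satisfying the row-sum condition. All the data and the helper name `apply_f` are assumptions chosen for the example.

```python
def apply_f(A, w, v):
    """f(v) = v + w - A v, whose fixed points are exactly the solutions of A v = w."""
    n = len(v)
    return [v[i] + w[i] - sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

# Hypothetical data: each row of (I - A) has absolute sum at most 0.5.
A = [[1.0, 0.2, -0.1],
     [0.3, 0.9, 0.1],
     [-0.2, 0.2, 1.1]]
w = [1.0, 2.0, 3.0]

n = len(A)
# Largest row sum of |delta_ij - A_ij|; the exercise requires this to be < 1.
lam = max(sum(abs((1.0 if i == j else 0.0) - A[i][j]) for j in range(n)) for i in range(n))

v = [0.0] * n
for _ in range(200):              # v_{k+1} = f(v_k)
    v = apply_f(A, w, v)

# Size of A v - w in the largest coordinate; near zero once the iteration has converged.
residual = max(abs(sum(A[i][j] * v[j] for j in range(n)) - w[i]) for i in range(n))
```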
Exercise L14-2. Suppose $(X, d)$ is a compact metric space and for $\lambda \in (0, 1)$ let

$$\mathrm{Cts}_\lambda(X, X) \subseteq \mathrm{Cts}(X, X)$$

be the subspace of $\lambda$-contraction mappings with the subspace topology (as usual $\mathrm{Cts}(X, X)$ has the compact-open topology). Prove that the function

$$\mathrm{fix} : \mathrm{Cts}_\lambda(X, X) \to X$$

sending a contraction mapping to its unique fixed point is continuous.
Example L14-3. Applications of the Banach fixed point theorem include:

(1) The Picard theorem on existence of solutions to ODEs (see Lecture 15).

(2) There is a proof of the Implicit Function Theorem (fundamental to higher calculus, aka differential geometry) via the fixed point theorem (see for example http://www.math.jhu.edu/~jmb/note/invfnthm.pdf ). This follows the idea outlined in Example L14-2.

(3) The Bellman equation is a functional equation (i.e. an equation in which the = sign means equality of functions, as in a DE) which is foundational in optimal control, dynamic programming and reinforcement learning. The functions involved are value functions $U(s)$ assigning to each possible state $s$ (say of an agent playing an Atari game) its utility. Such a value function determines the agent's behaviour. You may be familiar with one application from DeepMind:

V. Mnih, K. Kavukcuoglu, D. Silver et al., "Playing Atari with deep reinforcement learning", arXiv:1312.5602 (published in Nature in 2015).

The paper starts by explaining the Bellman equation, how to converge to an optimal policy by iteration (more on that in a moment) and how that's too slow, so they use a (deep) neural network instead as a substitute. The theoretical convergence is based on the Banach fixed pt. thm:

• S. Russell, P. Norvig, "Artificial intelligence: a modern approach", 3rd ed., §17.2.3
• R. S. Sutton, A. G. Barto, "Reinforcement learning", §4.1-3
The following is Exercise 17.6 of Russell & Norvig's AI book:

Exercise L14-3. Consider an agent acting in an environment in order to achieve some objective, the degree of attainment of which is measured by scalar rewards. At any given time (which is discrete) the agent can be in one of a number of states $S$, and if they are in state $s \in S$ then they choose from a finite set of actions $A(s)$. This action is an interaction with the environment, which causes the agent to transition to state $s'$ with probability denoted $P(s' \mid s, a)$. At each time step the agent receives a reward $R(s)$ depending on their state, with $R(s) \in \mathbb{R}$. We assume the set $\{ R(s) \mid s \in S \} \subseteq \mathbb{R}$ is bounded and that, given fixed $a, s$, the probability $P(s' \mid s, a)$ is nonzero only for finitely many $s'$.
E.g. $S = \mathbb{Z}$, $A(s) = \{ \text{left}, \text{right} \}$ for all $s$, and

$$P(s' \mid s, a) = \begin{cases} 1 & \text{if } s' = s + 1,\ a = \text{right}, \\ 1 & \text{if } s' = s - 1,\ a = \text{left}, \\ 0 & \text{otherwise,} \end{cases}$$

and $R(s) = \min\{1000, s\}$, say. Move right to win! Note the probabilistic aspect is there because sometimes you try and fail, i.e. we could instead take

$$P(s' \mid s, a) = \begin{cases} \ldots & a = \text{right}, \\ 1 & \text{if } s' = s - 1,\ a = \text{left}, \\ 0 & \text{otherwise.} \end{cases}$$
The discounted reward of a sequence of states $\underline{s} = (s_t)_{t=0}^\infty$ is

$$R(\underline{s}, \gamma) := \sum_{t \ge 0} \gamma^t R(s_t) \qquad \text{(why is this finite?)}$$

where $0 < \gamma < 1$ is a fixed discount factor.
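One way to see the finiteness: if $|R(s)| \le M$ for all $s$, then the series is dominated by the geometric series $M \sum_{t \ge 0} \gamma^t = M/(1-\gamma)$. The sketch below compares a truncated discounted reward with this bound. The particular walk, the truncation at 500 steps and the values of $\gamma$ and $M$ are assumptions for illustration.

```python
def discounted_reward(rewards, gamma):
    """Partial sum of gamma^t * R(s_t) over the finitely many rewards supplied."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total

gamma = 0.9
M = 1000.0                                  # assumed bound: |R(s_t)| <= M along this walk
# Walk right from s_0 = 990 with R(s) = min(1000, s), truncated to 500 steps.
rewards = [min(1000, 990 + t) for t in range(500)]

value = discounted_reward(rewards, gamma)
geometric_bound = M / (1 - gamma)           # M * sum_{t >= 0} gamma^t
```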

The optimal control problem (or reinforcement learning problem, or dynamic programming problem, or cybernetic feedback problem, ...) is to determine how the agent should behave so that its sequence of states $s_0, s_1, \ldots$ "maximises" the expected discounted reward. Here by behaviour we mean the choice of action $a \in A(s)$ given a current state $s$. Let $A = \bigcup_{s \in S} A(s)$ and define a policy to be a complete set of such choices, i.e. a function $\pi : S \to A$ such that $\pi(s) \in A(s)$ for all $s \in S$.
Given a starting state $s$, a policy $\pi$, and the "transition model" (meaning all the probabilities $P(s' \mid s, a)$) we obtain a probability distribution over state sequences $\underline{s}$, with $P(\underline{s})$ being the probability that an agent initially in state $s$, following $\pi$, and subject to the transition model, experiences $\underline{s}$ as its sequence of states. The expected discounted reward in this case is

$$U^\pi(s) := E\big[ R(\underline{s}, \gamma) \big] = \sum_{\underline{s}} P(\underline{s})\, R(\underline{s}, \gamma).$$

The optimal policy beginning in state $s$ is one that maximises $U^\pi(s)$ over all $\pi$, and it turns out this is independent of $s$; call it $\pi^*$. The utility of a state $s$ is then $U^{\pi^*}(s)$, which we denote $U(s)$. This all seems unsatisfying: how would you even find such a $\pi^*$? Recall $S$ may be infinite.
Now here's the brilliant trick! To get around the morass above, we can define the value function $U(s)$ as the solution (among functions $U : S \to \mathbb{R}$) of an equation (the Bellman equation)

$$U(s) = R(s) + \gamma \max_{a \in A(s)} \sum_{s'} P(s' \mid s, a)\, U(s').$$

Here the maximum is over the finite set $A(s)$, and the sum has only finitely many nonzero terms.
The idea is that if $U : S \to \mathbb{R}$ is a solution of the Bellman equation then the optimal policy of the agent is derived from it via

$$\pi^*(s) := \operatorname*{argmax}_{a \in A(s)} \sum_{s'} P(s' \mid s, a)\, U(s').$$

So to solve the optimal control problem we need only solve the Bellman equation. But this is obviously a fixed point problem!
(i) Let $X$ be the set of bounded functions $U : S \to \mathbb{R}$ and prove that

$$d(U, U') = \sup \{\, |U(s) - U'(s)| \;:\; s \in S \,\}$$

makes $X$ a complete metric space.

(ii) Prove that $f : X \to X$ defined by

$$f(U)(s) = R(s) + \gamma \max_{a \in A(s)} \sum_{s'} P(s' \mid s, a)\, U(s')$$

is a contraction with contraction factor $\gamma$.

Conclude that by the Banach fixed point theorem $f$ has a unique fixed point (hence the Bellman eqn has a unique soln) and that given any initial value function $U_0$ the sequence $U_i := f^i(U_0)$ converges to $U$ as $i \to \infty$, so that

$$\pi_i(s) := \operatorname*{argmax}_{a \in A(s)} \sum_{s'} P(s' \mid s, a)\, U_i(s')$$

"converges" to an optimal policy as $i \to \infty$. (The set of policies is discrete and does not have any reasonable topology as is. We can however consider probabilistic policies and thereby make sense of this.)
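The iteration $U_i = f^i(U_0)$ is usually called value iteration. Below is a minimal sketch on a finite truncation of the left/right example. The truncation to states $0, \ldots, N$ with walls at both ends, the reward $R(s) = s$, the value $\gamma = 0.9$ and the helper names are all assumptions made so that the example is small and runnable. It records the successive distances $d(U_{i+1}, U_i)$, which should shrink by at least the factor $\gamma$ at each step.

```python
def bellman_update(U, R, P, actions, gamma):
    """One application of f: (fU)(s) = R(s) + gamma * max_a sum_{s'} P(s'|s,a) U(s')."""
    return {
        s: R[s] + gamma * max(
            sum(p * U[s2] for s2, p in P[(s, a)].items()) for a in actions[s]
        )
        for s in U
    }

def greedy_policy(U, P, actions):
    """pi(s) = argmax_a sum_{s'} P(s'|s,a) U(s')."""
    return {
        s: max(actions[s], key=lambda a: sum(p * U[s2] for s2, p in P[(s, a)].items()))
        for s in U
    }

# Assumed finite truncation of the example: states 0..N, walls at both ends.
N, gamma = 10, 0.9
states = range(N + 1)
actions = {s: ["left", "right"] for s in states}
R = {s: float(s) for s in states}
P = {}
for s in states:
    P[(s, "left")] = {max(s - 1, 0): 1.0}     # deterministic step left (or stay at 0)
    P[(s, "right")] = {min(s + 1, N): 1.0}    # deterministic step right (or stay at N)

U = {s: 0.0 for s in states}                  # initial value function U_0
gaps = []
for _ in range(300):
    U_next = bellman_update(U, R, P, actions, gamma)
    gaps.append(max(abs(U_next[s] - U[s]) for s in states))   # d(U_{i+1}, U_i)
    U = U_next

policy = greedy_policy(U, P, actions)
```

In this truncated model the greedy policy derived from the limiting value function moves right in every state, matching the "move right to win" intuition.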
