Professional Documents
Culture Documents
Online Appendix
Online Appendix
Robert L. Bray
August 1, 2021
Introduction
My argument seamlessly extends to this case in which the Monte Carlo sample is drawn from a
general density function. Regardless of the sampling distribution, Rust’s algorithm breaks the curse
of dimensionality only if the number of state variables that are meaningfully history dependent—i.e.,
contingent of the prior period’s state s or action a—is O(log(d)). However, the nature of this near
memorylessness varies with the sampling distribution. Under the uniform sampling distribution,
almost all state variables, ti , have a distribution that’s arbitrarily close to a uniform distribution
from almost all states s ∈ [0, 1]d . But under sampling density µd , almost all state variables, ti , have
a distribution that’s arbitrarily close to the conditional marginal sampling distribution, µid (ti |t<i ),
from almost all states s ∈ [0, 1]d .
Analysis
I will now derive the analog of each of Bray’s (2021) results under general sampling density µd .
Suppose the elements of md = {mid }bi=1
d
are drawn from a distribution with general density function
µd , which has full support over [0, 1]d .
In this case, Rust’s random Bellman operator would be
i µ
Pbd i
i=1 V (md )pda (md |s)
(Γ̂bµ
d V )(s) ≡ max uda (s) + β ,
a∈ a Pbd µ i
i=1 pda (md |s)
1
We will call this operator’s fixed point, V̂dbµ , the Rust value function under sampling density µ.
Note that we can also express the standard Bellman operator in terms of pµda and µd :
Z
(Γd V )(s) = max uda (s) + β V (t)pµda (t|s)µd (dt),
a∈ a t∈[0,1]d
Definition 7. A sequence of dynamic programs is strongly Rust solvable under sampling density
µ if for all > 0 there exists b ∈ b such that supd∈N E(||V̂dbµ − Vd ||) < .
Definition 8. A sequence of dynamic programs is weakly Rust solvable under sampling density µ
if for all > 0 there exists b ∈ b such that supd∈N E(||V̂dbµ − Vd ||1 ) < .
Definition 9. The ith state variable is -dependent under sampling density µ if ||Vdµ,−i − Vd ||1 < ,
where Vdµ,−i is the value function under density function pµ,−i i i
da (t|s) ≡ pda (t|s)µd (ti |t<i )/pd (ti |t<i ).
Definition 10. A sequence of dynamic programs is nearly memoryless under sampling density µ
if for all > 0, the number of state variables that are not -dependent under sampling density µ is
O(log(d)) as d → ∞.
Definition 11. A sequence of dynamic programs whose Rust value functions under sampling density
bµ
µ are {V̂ d }d∈N Rust -approximates under sampling density µ a sequence of dynamic programs
bµ
with value functions {Vd }d∈N if there exists b ∈ b for which supd∈N E(||V̂ d − Vd ||1 ) < .
Assumption 7. The scaled transition density function is Lipschitz continuous in its second argu-
ment: For each d ∈ N and t ∈ [0, 1]d , there exists Lipschitz constant `µd (t) ∈ R+ such that
|pµ µ
maxa∈a supr∈[0,1]d sups∈[0,1]d \r da (t|s)−pda (t|r)|
µ
`d (t)||s−r||2
≤ 1.
Assumption 8. The square integral of the scaled transition density Lipschitz function with re-
spect to measure µd is bounded by a polynomial function of d: There exists Lµ ∈ b such that
supd∈N t∈[0,1]d `µd (t)2 µd (dt)/Lµd ≤ 1.
R
Assumption 9. At the origin, the scaled transition density function is bounded by a polynomial
function of d: there exists M µ ∈ b such that maxa∈a supd∈N supt∈[0,1]d pµda (t|0)/Mdµ ≤ 1.
Assumption 10. The scaled transition density function is bounded by a polynomial function of d:
there exists M µ ∈ b such that maxa∈a supd∈N sups,t∈[0,1]d pµda (t|s)/Mdµ ≤ 1.
2
Proposition 1. Any sequence of dynamic programs that satisfies Assumptions 1, 2, 7, 8, and 9 is
strongly Rust solvable under sampling density µ.
Proposition 5. Any sequence of dynamic programs that (i) is weakly Rust solvable under sampling
density µ and (ii) satisfies Assumptions 1 and 2 can be Rust -approximated under sampling density
µ by a sequence of dynamic programs that is nearly memoryless under sampling density µ, for all
> 0.
Proofs
i µ
Pbd i
i=1 V (md )pda (md |s)
(Γ̂bµ
da V )(s) ≡uda (s) + β Pbd µ ,
i
i=1 pda (md |s)
bd
(Γ̃bµ V (mid )pµda (mid |s)/bd ,
X
da V )(s) ≡uda (s) + β
i=1
bd
bµ
gdi V (mid )pµda (mid |s)/ bd ,
X p
and Zda (s) ≡β
i=1
bd
where {gdi }i=1 is a set of independent standard normal random variables. Since E((Γ̃bµ
da V )(s)) =
(Γda V )(s), where Γda is defined in the proof of Proposition 1, Pollard’s (1989) seventh equation
implies that
2 2π
E ||Γ̃bµ
da V − Γda V || ≤√ E sup Zda bµ
(s)2 . (1)
bd s∈[0,1]d
3
We will now bound the expectation on the right. First, note that
bd
bµ bµ 2 2
|V (mid )(pµda (mid |t) − pµda (mid |s))| /bd
X
E(|Zda (t) − Zda (s)| : md ) =β 2
i=1
bd
2 X
βKd ||t − s||2
≤ `µd (mid )2 /bd . (2)
1−β
i=1
f (x)
Now for a given x > 0, divide [0, 1]d into f (x) ≡ dδdm /(2x)ed equally sized cubes, and define {cix }i=1
as the center points of these cubes. By design, these center points satisfy
x(1 − β)
sup min ||s − cix ||2 ≤ qP ,
s∈[0,1]d i∈{1,··· ,f (x)} βKd bd µ i 2
i=1 `d (md ) /bd
bµ bµ i 2
sup min E(|Zda (s) − Zda (cx )| : md ) ≤ x2 . (4)
s∈[0,1]d i∈{1,··· ,f (x)}
Now combining (3) and (4) with Pollard’s (1989) eighth equation yields
s q
bµ 2 bµ
E sup Zda (s) : md − E(Zda (0)2 : md )
s∈[0,1]d
Z pδdm
≤C log(f (x))dx
0
√ Z 1q
=Cδdm d
log 1/u du
0
√
=Cδdm dπ/2,
where C ≥ 1 is a universal constant that’s independent of all model parameters. And since x−y ≤ z
4
implies x2 ≤ 2y 2 + 2z 2 for all x, y, z ∈ R, this implies that
bµ bµ
E sup Zda (s)2 = E E sup Zda (s)2 : md
s∈[0,1]d s∈[0,1]d
bµ
≤ E 2 E(Zda (0)2 : md ) + (Cδdm )2 dπ
βK M µ 2 dβCK 2
d d d
≤2 +π E(`µd (mid )2 )
1−β 1−β
βK 2
d
≤ (2(Mdµ )2 + πd2 C 2 Lµd ).
1−β
Now combining this with (1) and applying Jensen’s inequality yields for all V ∈ Vd
q
2
E(||Γ̃bµ
da V − Γda V ||) ≤ E ||Γ̃bµ
da V − Γ da V ||
s
2π bµ
≤ √ E sup Zda (s)2
bd s∈[0,1]d
2πβKd
q
≤ 1/4 (Mdµ )2 + d2 C 2 Lµd . (5)
bd (1 − β)
Next, define constant function ιd ∈ Vd , where ιd (t) = Kd /(1 − β) for all t ∈ [0, 1]d . With this,
(5) yields
bd
!
Pbd V (mi )pµ (mi |s)
E(||Γ̂bµ Γ̃bµ pµda (mid |s)/bd
X
da Vd − da Vd ||) =E sup 1− β i=1
Pbd µ
d da d
p (mi |s)
s∈[0,1]d i=1 i=1 da d
bd
!
βKd
pµda (mid |s)/bd
X
≤ E sup 1−
1−β s∈[0,1]d i=1
||Γ̃bµ
=E − Γda ιd ||
da ιd
2πβKd
q
≤ 1/4 (Mdµ )2 + d2 C 2 Lµd .
bd (1 − β)
5
Combining this with (5) yields
E ||Γ̂bµ bµ
d Vd − Vd || = E ||Γ̂d Vd − Γd Vd ||
E ||Γ̂bµ
X
≤ da Vd − Γda Vd ||
a∈ a
E ||Γ̂bµ bµ bµ
X
≤ da Vd − Γ̃da Vd || + E ||Γ̃da Vd − Γda Vd ||
a∈ a
4|a|πβKd
q
≤ 1/4
(Mdµ )2 + d2 C 2 Lµd .
bd (1 − β)
E ||V̂dbµ − Vd || ≤ E ||Γ̂bµ
d Vd − Vd || /(1 − β)
4|a|πβKd
q
≤ 1/4 (Mdµ )2 + d2 C 2 Lµd ,
bd (1 − β) 2
Proof. Assumptions 2 and 10 ensure that |V (t)pµda (t|s)| ≤ Kd Mdµ /(1 − β) for all V ∈ V. Hence, for
V ∈ V, Popoviciu’s inequality implies that Var(V (mid )pµda (mid |s)) ≤ (Kd Mdµ )2 /(1 − β)2 , and thus
that
bd
X (Kd Mdµ )2
Var V (mid )pµda (mid |s)/bd ≤ .
bd (1 − β)2
i=1
(Kd Mdµ )2
Pr(ydb (s) = 1) ≤ ,
δ 2 bd (1 − β)2
( b )
d Z
yd (s) ≡1 i µ
X
b i
where V (md )pda (md |s)/bd − Vd (t)pda (t|s)λd (dt) > δ .
i=1 t∈[0,1]d
6
And this implies for V ∈ V that
bµ
˜ V )(s) − (Γda V )(s)|)
E(|(Γda
bd Z !
(mid )pµda (mid |s)/bd
X
=β E V − Vd (t)pda (t|s)λd (dt)
i=1 t∈[0,1]d
bd Z !
V (mid )pµda (mid |s)/bd −
X
≤ Pr(ydb (s) = 0) E Vd (t)pda (t|s)λd (dt) : ydb (s) = 0
i=1 t∈[0,1]d
bd Z !
V (mid )pµda (mid |s)/bd −
X
+ Pr(ydb (s) = 1) E Vd (t)pda (t|s)λd (dt) : ydb (s) = 1
i=1 t∈[0,1]d
(Kd Mdµ )2
≤1 · δ + 2 · 2Kd Mdµ /(1 − β)
δ bd (1 − β)2
2(Kd Mdµ )3
=δ + 2 ,
δ bd (1 − β)3
where Γ̃bµ
da is defined in the proof of Proposition 1 and Γda is defined in the proof of Proposition 1.
Hence, for V ∈ V we have
2(Kd Mdµ )3
E(||Γ̃bµ
da V − Γda V ||1 ) ≤ δ + . (6)
δ 2 bd (1 − β)3
bd
!
Z Pbd V (mi )pµ (mi |s)
d
E(||Γ̂bµ Γ̃bµ pµda (mid |s)/bd
X
da Vd − da Vd ||1 ) =E 1− β i=1
Pbd µ
d da d
λd (ds)
p (mi |s)
s∈[0,1]d i=1 i=1 da d
bd
Z !
βKd
pµda (mid |s)/bd λd (ds)
X
≤ E 1−
1−β s∈[0,1]d i=1
||Γ̃bµ
=E da ιd − Γda ιd ||1
2(Kd Mdµ )3
≤δ + (7)
δ 2 bd (1 − β)3
7
where Γ̂bµ
da and ιd are defined in the proof of Proposition 1. Now combining (6) and (7) yields
E ||Γ̂bµ bµ
d Vd − Vd ||1 = E ||Γ̂d Vd − Γd Vd ||1
E ||Γ̂bµ
X
≤ da Vd − Γda Vd ||1
a∈ a
E ||Γ̂bµ bµ bµ
X
≤ da Vd − Γ̃da Vd ||1 + E ||Γ̃da Vd − Γda Vd ||1
a∈ a
4|a|(Kd Mdµ )3
≤2|a|δ + .
δ 2 bd (1 − β)3
(1−β) a
8| |(Kd Mdµ )3
which is less than when δ < a
4| | and bd > δ 2 (1−β)4
.
γd ∈ R such that t∈[0,1]d max(0, `µd (t) − γd )µd (dt) = δd . Third, define probability density function
R
1 `µd (t) ≥ γd
ωdµ (t) ≡R .
r∈[0,1]d 1 `d (r) ≥ γd µd (dr)
µ
8
Fourth, Jensen’s inequality yields
Z Z
`µd (t)2 µd (dt) 1 `µd (t) ≥ γd `µd (t)2 µd (dt)
≥
t∈[0,1]d t∈[0,1]d
Z Z
ωdµ (t)`µd (t)2 µd (dt) 1 `µd (r) ≥ γd µd (dr)
=
t∈[0,1]d r∈[0,1]d
Z !2 Z
ωdµ (t)`µd (t)µd (dt) 1 `µd (r) ≥ γd µd (dr)
≥
t∈[0,1]d r∈[0,1]d
!2 Z
max(0, `µd (t) − γd )µd (dt)
R
t∈[0,1]d
1 `µd (r) ≥ γd µd (dr)
= γd + R
r∈[0,1]d 1 `d (r) ≥ γd µd (dr)
µ
r∈[0,1]d
!2 Z
δ
µ d 1 `µd (r) ≥ γd µd (dr)
= γd + R
r∈[0,1]d 1 `d (r) ≥ γd µd (dr) r∈[0,1]d
γd ≤ Lµd /(2δd ).
9
which is polynomially bounded. Also, by design we have
Z
|pµda (t|s) − pµda (t|s)|µd (dt)
t∈[0,1]d
Z Z
= pµda (t|s) +1− pµda (r|s)µd (dr) − pµda (t|s) µd (dt)
t∈[0,1]d r∈[0,1]d
Z Z
pµda (t|s) pµda (r|s) − pµda (r|s) µd (dr) − pµda (t|s) µd (dt)
= +
t∈[0,1]d r∈[0,1]d
Z Z
≤ |pµda (t|s) − pµda (t|s)|µd (dt) + |pµda (r|s) − pµda (r|s)|µd (dr)
t∈[0,1]d r∈[0,1]d
Z
=2 |pµda (t|s) − pµda (t|s)|µd (dt)
t∈[0,1]d
Z √
max 0, `µd (t)||t|| −
≤2 dγd µd (dt)
t∈[0,1]d
√ Z
max 0, `µd (t) − γd ) µd (dt)
≤2 d
t∈[0,1]d
√
=2 dδd . (8)
Now let Γµd represent the Bellman operator under density function pµda . With (8), we find that this
operator satisfies
||Γµd Vd − Vd || =||Γµd Vd − Γd Vd ||
Z
Vd (t)(pµda (t|s) − pµda (t|s))µd (dt)
X
≤ sup β
a∈ a s∈[0,1]d
Z
t∈[0,1]d
βKd X
≤ sup |pµ (t|s) − pµda (t|s)|µd (dt)
1 − β a∈a s∈[0,1]d t∈[0,1]d da
βKd X √
≤ sup 2 dδd
1 − β a∈a s∈[0,1]d
√
2aβδd Kd d
=
1−β
=(1 − β).
And with this, Rust’s (1997) Lemma 2.2 establishes that ||V µd − Vd || ≤ , where V µd is the fixed
point of Γµd .
10
q
Proof. Pinsker’s inequality establishes that ||λ1 − pida (·|s, t<i )||1 ≤ 2κ(pida (·|s, t<i ), λ1 ). And this
implies that
Z
p−i
da (t<i , t≥i |s) − pda (t<i , t≥i |s) λd−i+1 (dt≥i )
t≥i ∈[0,1]d−i+1
Z Z
= p−i
da (t<i , ti , t≥i+1 |s) − pda (t<i , ti t≥i+1 |s) λd−i (dt≥i+1 )λ1 (dti )
t ∈[0,1] t≥i+1 ∈[0,1]d−i
Zi Z
= 1− pida (ti |s, t<i ) p−i
da (t<i , ti , t≥i+1 |s)λd−i (dt≥i+1 )λ1 (dti )
ti ∈[0,1] t≥i+1 ∈[0,1]d−i
R
Z
t≥i+1 ∈[0,1]d−i pda (t<i , ti , t≥i+1 |s)λd−i (dt≥i+1 )
= 1 − pida (ti |s, t<i ) λ1 (dti )
ti ∈[0,1] pida (ti |s, t<i )
p<i+1 (t<i , ti )
Z
= 1 − pida (ti |s, t<i ) di λ1 (dti )
ti ∈[0,1] pda (ti |s, t<i )
Z
<i
=pd (t<i ) 1 − pida (ti |s, t<i ) λ1 (dti )
ti ∈[0,1]
=p<i i
d (t<i )||λ1 − pda (·|s, t<i )||1
q
≤p<i
d (t<i ) 2κ(pida (·|s, t<i ), λ1 ).
N such that
Pd
κ(pida (·|s, t<i ), λ1 ) <
Next, Lemma 1 implies that there exists m, n ∈ i=1 E
m + n log(d), for all d ∈ N. Since κ(pida (·|s, t<i ), λ1 ) ≥ 0, this implies that for a given γ >
q
0 the inequality 2 E κ(pida (·|s, t<i ), λ1 ) ≤ γ holds for at least d − 2(m + n log(d))/γ 2 values
11
sets satisfy
d d Z
1 s ∈ Ωida λd (ds)
X X
λd (Ωida )
=
i=1 i=1 s∈[0,1]d
Z d
1 s ∈ Ωida λd (ds)
X
=
s∈[0,1]d i=1
Z
d − 2(m + n log(d))/γ 2 λd (ds)
≥
s∈[0,1]d
=d − 2(m + n log(d))/γ 2 .
λd (Ωida ) ≥ 1 − δ (9)
holds for at least d − 2(m + n log(d))/(δγ 2 ) values of i. Thus, for i that satisfies (10), we have
Z Z
||Γ−i
da Vd − Γda Vd ||1 = uda (s) + β V (t)p−i
da (t|s)λd (dt)
s∈[0,1]d t∈[0,1]d
Z
− uda (s) − β V (t)pda (t|s)λd (dt) λd (ds)
t∈[0,1]d
Z Z
=β Vd (t)(p−i
da (t|s) − pda (t|s))λd (dt) λd (ds)
s∈[0,1]d t∈[0,1]d
Z
≤β||Vd || ||p−i
da (·|s) − pda (·|s)||1 λd (ds)
s∈[0,1]d
Z
≤β||Vd || ||p−i
da (·|s) − pda (·|s)||1 λd (ds)
s∈Ωida
Z !
||p−i
+ da (·|s)||1 + ||pda (·|s)||1 λd (ds)
s∈[0,1]d \Ωida
Z Z !
≤β||Vd || γλd (ds) + 2λd (ds)
s∈Ωida s∈[0,1]d \Ωida
where Γda is the action-a Bellman operator defined in the proof of Proposition 1, and Γ−i
da is the
12
analogous operator under p−i
da . Thus, for i that satisfies (10), we have
||Γ−i −i
d Vd − Vd ||1 =||Γd Vd − Γd Vd ||1
X
≤ ||Γ−i
da Vd − Γda Vd ||1
a∈ a
≤|a|β(Kd /(1 − β))(γ + 2δ),
where Γ−i −i
d is the analog of Γd under pd . And with this, Rust’s (1997) Lemma 2.2 implies that
||Vd−i − Vd ||1 < |a|βKd (γ+2δ)/(1−β)2 holds for i that satisfies (10). Hence, setting γ = δ = 4| (1−β) 2
3a|βKd
,
|a|βKd
we find that ||Vd−i − Vd ||1 < holds for at least d − 2(m + n log(d))/(δγ 2 ) = d − 128 (1−β) 2 (m +
n log(d)) values of i.
sup pµda (t|s) = exp sup log(pµda (t|s))
t∈[0,1]d t∈[0,1]d
Z
≥ exp log(pµda (t|s))pda (t|s)λd (dt)
t∈[0,1]d
d Z
X pi (t |s, t )
da i <i
= exp log p da (t|s)λ d (dt)
i=1 t∈[0,1]
d µid (ti |t<i )
Xd
= exp E κ(pida (·|s, t<i ), µid (·|t<i )) .
i=1
13
And this implies that
Z
pµ,−i
da (t<i , t≥i |s) − pda (t<i , t≥i |s) λd−i+1 (dt≥i )
t≥i ∈[0,1]d−i+1
Z
= µid (ti |t<i )/pida (ti |s, t<i ) − 1 pda (t<i , ti , t≥i+1 |s)λd−i+1 (dt≥i )
t≥i ∈[0,1]d−i+1
Z
= µid (ti |t<i )/pida (ti |s, t<i ) − 1 p<i+1
d (t<i , ti )λ1 (dti )
ti ∈[0,1]
Z
=p<i
d (t<i ) µid (ti |t<i ) − pida (ti |s, t<i ) λ1 (dti )
ti ∈[0,1]
=p<i i
d (t<i )||µd (·|t<i ) − pida (·|s, t<i )||1
q
≤p<i
d (t<i ) 2κ(pida (·|s, t<i ), µid (·|t<i )).
14
these sets satisfy
d d Z
λd (Ωµ,i 1 s ∈ Ωµ,i
X X
da ) = da λd (ds)
i=1 i=1 s∈[0,1]d
Z d
1 s ∈ Ωµ,i
X
= da λd (ds)
s∈[0,1]d i=1
Z
d − 2(m + n log(d))/γ 2 λd (ds)
≥
s∈[0,1]d
=d − 2(m + n log(d))/γ 2 .
Since λd (Ωµ,i
da ) ≤ 1 this implies that
λd (Ωµ,i
da ) ≥ 1 − δ (10)
holds for at least d − 2(m + n log(d))/(δγ 2 ) values of i. Thus, for i that satisfies (10), we have
Z Z
||Γµ,−i
da Vd − Γda Vd ||1 = uda (s) + β V (t)pµ,−i
da (t|s)λd (dt)
s∈[0,1]d t∈[0,1]d
Z
− uda (s) − β V (t)pda (t|s)λd (dt) λd (ds)
t∈[0,1]d
Z Z
=β Vd (t)(pµ,−i
da (t|s) − pda (t|s))λd (dt) λd (ds)
s∈[0,1]d t∈[0,1]d
Z
≤β||Vd || ||pµ,−i
da (·|s) − pda (·|s)||1 λd (ds)
s∈[0,1]d
Z
≤β||Vd || ||pµ,−i
da (·|s) − pda (·|s)||1 λd (ds)
s∈Ωµ,i
da
Z !
||pµ,−i
+ da (·|s)||1 + ||pda (·|s)||1 λd (ds)
s∈[0,1]d \Ωµ,i
da
Z Z !
≤β||Vd || γλd (ds) + 2λd (ds)
s∈Ωµ,i
da s∈[0,1]d \Ωµ,i
da
where Γda is the action-a Bellman operator defined in the proof of Proposition 1, and Γµ,−i
da is the
15
analogous operator under pµ,−i
da . Thus, for i that satisfies (10), we have
||Γµ,−i
d Vd − Vd ||1 =||Γµ,−i
d Vd − Γd Vd ||1
X µ,−i
≤ ||Γda Vd − Γda Vd ||1
a∈ a
≤|a|β(Kd /(1 − β))(γ + 2δ),
where Γµ,−i
d is the analog of Γd under pµ,−i
d . And with this, Rust’s (1997) Lemma 2.2 implies
that ||Vdµ,−i − Vd ||1 < |a|βKd (γ + 2δ)/(1 − β)2 holds for i that satisfies (10). Hence, setting γ =
(1−β) 2 µ,−i
δ = 4| a |βKd , we find that ||Vd − Vd ||1 < holds for at least d − 2(m + n log(d))/(δγ 2 ) = d −
|a|βKd
3
128 (1−β) 2 (m + n log(d)) values of i.
Proposition 5. Any sequence of dynamic programs that (i) is weakly Rust solvable under sampling
density µ and (ii) satisfies Assumptions 1 and 2 can be Rust -approximated under sampling density
µ by a sequence of dynamic programs that is nearly memoryless under sampling density µ, for all
> 0.
Proof. Define Ξµd as a general set with measure b−2 µd (dt) = b−2
R
d under sampling density µ: t∈Ξµ
d
d .
And use this to define the following probability transition density function:
pµb (t|s) ≡1 pµda (t|s) ≤ b2d pµda (t|s) + b2d 1 t ∈ Ξµd pµda (s),
da
Z
where pµda (s) ≡ 1 pµda (r|s) > b2d pµda (r|s)µd (dr).
r∈[0,1]d
This density function funnels all the mass that exceeds to b2d into Ξµd . Since it never exceeds 2b2d , this
density function satisfies Assumption 10. So Proposition 4 establishes that this density function
corresponds with a sequence of dynamic programs that’s nearly memoryless under sampling density
µ (given Assumptions 1 and 2).
Define the following as the set of points for which the new density function equals the old density
function:
16
The measure of this set under sampling density µ satisfies
Z Z XZ
1 pµda (r|s) > b2d µd (dt)
µd (dt) ≥1 − µd (dt) −
t∈Ωµ
d (s) t∈Ξµ
d a∈ a t∈[0,1]d
XZ
≥1 − b−2 −2
1 pµda (r|s) > b2d pµda (r|s)µd (dt)
d − bd
a∈ a Zt∈[0,1]d
pµda (r|s)µd (dt)
X
≥1 − b−2 −2
d − bd
a∈ a t∈[0,1]d
≥1 − (1 + |a|)/b2d .
The probability that this set contains all mid values satisfies
Z bd
Pr ∪bi=1
d
mid ⊂ Ωbd (s) = µd (dt)
t∈Ωµ
d (s)
µb
Next, define Γ̂d as Rust’s random Bellman operator evaluated under density function pµb
da . Since
V̂dµb ∈ Vd , we have
µb
|(Γ̂d V̂dµb )(s) − (Γ̂µb µb
d V̂d )(s)| ≤ βKd /(1 − β). (12)
And if ∪bi=1
d
mid ⊂ Ωbd (s) then
µb
|(Γ̂d V̂dµb )(s) − (Γ̂µb µb
d V̂d )(s)| = 0. (13)
17
Now combining (11)–(13) yields
µb µb
E ||Γ̂d V̂dµb − V̂dµb ||1 = E ||Γ̂d V̂dµb − Γ̂µb µb
d V̂d ||1
Z
µb
E |(Γ̂d V̂dµb )(s) − (Γ̂µb µb
= d V̂d )(s)| λd (ds)
s∈[0,1]d
Z
βKd /(1 − β) 1 − Pr ∪µb i µb
≤ i=1 md ⊂ Ωd (s) λd (ds)
d
s∈[0,1]d
βKd (1 + |a|)
Z
< λd (ds)
s∈[0,1]d bd (1 − β)
βKd (1 + |a|)
= .
bd (1 − β)
And with this, we can use Rust’s (1997) Lemma 2.2 to establishes that
µb µb βKd (1 + |a|)
E ||V̂ d − V̂dµb ||1 ≤ E ||Γ̂d V̂dµb − V̂dµb ||1 /(1 − β) <
. (14)
bd (1 − β)2
References
Bray, Robert L. 2021. A Comment on ”Using Randomization to Break the Curse of Dimensionality”.
Working Paper .
Pollard, David. 1989. Asymptotics via empirical processes. Statistical Science 4(4) 341–354.
Rust, John. 1997. Using randomization to break the curse of dimensionality. Econometrica 65(3)
487–516.
18