Online Appendix for Bray (2021)
Rust’s Algorithm Under a General Sampling Density
Robert L. Bray
Kellogg School of Management, Northwestern University
August 1, 2021
Introduction
My argument seamlessly extends to the case in which the Monte Carlo sample is drawn from a general density function. Regardless of the sampling distribution, Rust’s algorithm breaks the curse of dimensionality only if the number of state variables that are meaningfully history dependent (i.e., contingent on the prior period’s state $s$ or action $a$) is $O(\log(d))$. However, the nature of this near memorylessness varies with the sampling distribution. Under the uniform sampling distribution, almost all state variables, $t_i$, have a distribution that’s arbitrarily close to a uniform distribution from almost all states $s \in [0,1]^d$. But under sampling density $\mu_d$, almost all state variables, $t_i$, have a distribution that’s arbitrarily close to the conditional marginal sampling distribution, $\mu_d^i(t_i|t_{<i})$, from almost all states $s \in [0,1]^d$.
Analysis
I will now derive the analog of each of Bray’s (2021) results under general sampling density $\mu_d$. Suppose the elements of $m_d = \{m_d^i\}_{i=1}^{b_d}$ are drawn from a distribution with general density function $\mu_d$, which has full support over $[0,1]^d$. In this case, Rust’s random Bellman operator would be
\[
(\hat\Gamma_d^{b\mu}V)(s) \equiv \max_{a\in\mathbf{a}}\; u_{da}(s) + \beta\,\frac{\sum_{i=1}^{b_d}V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)}{\sum_{i=1}^{b_d}p_{da}^{\mu}(m_d^i|s)},
\]
where $p_{da}^{\mu}(t|s) \equiv p_{da}(t|s)/\mu_d(t)$. We will call this operator’s fixed point, $\hat V_d^{b\mu}$, the Rust value function under sampling density $\mu$.
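As a rough illustration of the operator just defined, the following Python sketch evaluates the self-normalized, importance-weighted average at a single state. It assumes the caller supplies the sampled values $V(m_d^i)$, the flow utilities, the transition densities $p_{da}(m_d^i|s)$, and the sampling densities $\mu_d(m_d^i)$ as plain lists; the function name `rust_bellman` and its argument names are hypothetical and do not come from the paper.

```python
def rust_bellman(values, utilities, trans_dens, sample_dens, beta):
    """Evaluate Rust's random Bellman operator at one state s.

    values[i]        -- V(m^i), the value function at the i-th sample point
    utilities[a]     -- u_a(s), the flow utility of action a at state s
    trans_dens[a][i] -- p_a(m^i | s), the transition density at the sample
    sample_dens[i]   -- mu(m^i), the sampling density at the sample
    beta             -- discount factor in (0, 1)
    """
    best = float("-inf")
    for u_a, p_a in zip(utilities, trans_dens):
        # Scaled transition density p^mu = p / mu (the importance weight).
        weights = [p / mu for p, mu in zip(p_a, sample_dens)]
        # Self-normalized weighted average of V over the sample.
        average = sum(v * w for v, w in zip(values, weights)) / sum(weights)
        best = max(best, u_a + beta * average)
    return best


# With a constant V the weights cancel, so the result is max_a u_a + beta * V.
example = rust_bellman(
    values=[2.0, 2.0, 2.0],
    utilities=[0.0, 1.0],
    trans_dens=[[0.5, 1.0, 1.5], [2.0, 0.1, 0.3]],
    sample_dens=[1.0, 0.5, 2.0],
    beta=0.9,
)
```

Because the weights are normalized by their own sum, the operator maps a constant function to a constant function for any sample, which is why the constant function $\iota_d$ is a convenient test case in the proofs.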
Note that we can also express the standard Bellman operator in terms of $p_{da}^{\mu}$ and $\mu_d$:
\[
(\Gamma_d V)(s) = \max_{a\in\mathbf{a}}\; u_{da}(s) + \beta\int_{t\in[0,1]^d}V(t)\,p_{da}^{\mu}(t|s)\,\mu_d(dt),
\]
where $\mu_d(dt) \equiv \mu_d(t)\lambda_d(dt)$.
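The identity behind this re-expression, $\int V p_{da}\,d\lambda_d = \int V p_{da}^{\mu}\,d\mu_d$, can be checked with a small Monte Carlo experiment. The sketch below is only an illustration under assumed inputs: a one-dimensional state and the hypothetical choices $V(t) = t^2$, $p(t|s) = 1$, and $\mu(t) = 0.5 + t$, none of which come from the paper.

```python
import math
import random


def importance_estimate(n, seed=0):
    """Estimate the integral of V(t) p(t|s) over [0, 1] from draws of mu."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        u = rng.random()
        # Inverse-CDF draw from mu(t) = 0.5 + t, whose CDF is t/2 + t^2/2.
        t = (-1.0 + math.sqrt(1.0 + 8.0 * u)) / 2.0
        # V(t) * p(t|s) / mu(t), with V(t) = t^2 and p(t|s) = 1.
        total += (t ** 2) * 1.0 / (0.5 + t)
    return total / n


estimate = importance_estimate(200_000)
exact = 1.0 / 3.0  # integral of t^2 over [0, 1]
```

The sample mean of the weighted values approaches the Lebesgue integral even though no draw is taken from the uniform distribution.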
We will now re-express our definitions, assumptions, and propositions in terms of µd .
Definition 7. A sequence of dynamic programs is strongly Rust solvable under sampling density $\mu$ if for all $\epsilon > 0$ there exists $b \in \mathbf{b}$ such that $\sup_{d\in\mathbb{N}} E(\|\hat V_d^{b\mu} - V_d\|) < \epsilon$.
Definition 8. A sequence of dynamic programs is weakly Rust solvable under sampling density $\mu$ if for all $\epsilon > 0$ there exists $b \in \mathbf{b}$ such that $\sup_{d\in\mathbb{N}} E(\|\hat V_d^{b\mu} - V_d\|_1) < \epsilon$.
Definition 9. The $i$th state variable is $\epsilon$-dependent under sampling density $\mu$ if $\|V_d^{\mu,-i} - V_d\|_1 < \epsilon$, where $V_d^{\mu,-i}$ is the value function under density function $p_{da}^{\mu,-i}(t|s) \equiv p_{da}(t|s)\,\mu_d^i(t_i|t_{<i})/p_d^i(t_i|t_{<i})$.
Definition 10. A sequence of dynamic programs is nearly memoryless under sampling density $\mu$ if for all $\epsilon > 0$, the number of state variables that are not $\epsilon$-dependent under sampling density $\mu$ is $O(\log(d))$ as $d \to \infty$.
Definition 11. A sequence of dynamic programs whose Rust value functions under sampling density $\mu$ are $\{\bar{\hat V}_d^{b\mu}\}_{d\in\mathbb{N}}$ Rust $\epsilon$-approximates under sampling density $\mu$ a sequence of dynamic programs with value functions $\{V_d\}_{d\in\mathbb{N}}$ if there exists $b \in \mathbf{b}$ for which $\sup_{d\in\mathbb{N}} E(\|\bar{\hat V}_d^{b\mu} - V_d\|_1) < \epsilon$.
Assumption 7. The scaled transition density function is Lipschitz continuous in its second argument: For each $d \in \mathbb{N}$ and $t \in [0,1]^d$, there exists Lipschitz constant $\ell_d^{\mu}(t) \in \mathbb{R}_+$ such that
\[
\max_{a\in\mathbf{a}}\;\sup_{r\in[0,1]^d}\;\sup_{s\in[0,1]^d\setminus r}\;\frac{|p_{da}^{\mu}(t|s) - p_{da}^{\mu}(t|r)|}{\ell_d^{\mu}(t)\,\|s-r\|_2} \le 1.
\]
Assumption 8. The square integral of the scaled transition density Lipschitz function with respect to measure $\mu_d$ is bounded by a polynomial function of $d$: There exists $L^{\mu} \in \mathbf{b}$ such that $\sup_{d\in\mathbb{N}} \int_{t\in[0,1]^d} \ell_d^{\mu}(t)^2\,\mu_d(dt)/L_d^{\mu} \le 1$.
Assumption 9. At the origin, the scaled transition density function is bounded by a polynomial function of $d$: there exists $M^{\mu} \in \mathbf{b}$ such that $\max_{a\in\mathbf{a}} \sup_{d\in\mathbb{N}} \sup_{t\in[0,1]^d} p_{da}^{\mu}(t|0)/M_d^{\mu} \le 1$.
Assumption 10. The scaled transition density function is bounded by a polynomial function of $d$: there exists $M^{\mu} \in \mathbf{b}$ such that $\max_{a\in\mathbf{a}} \sup_{d\in\mathbb{N}} \sup_{s,t\in[0,1]^d} p_{da}^{\mu}(t|s)/M_d^{\mu} \le 1$.
Proposition 1. Any sequence of dynamic programs that satisfies Assumptions 1, 2, 7, 8, and 9 is strongly Rust solvable under sampling density $\mu$.
Proposition 2. Any sequence of dynamic programs that satisfies Assumptions 1, 2, and 10 is weakly Rust solvable under sampling density $\mu$.
Proposition 3. Every sequence of dynamic programs that satisfies Assumptions 1, 2, 7, 8, and 9 can be $\epsilon$-approximated by a sequence that satisfies Assumptions 1, 2, and 10, for all $\epsilon > 0$.
Proposition 4. Any sequence of dynamic programs that satisfies Assumptions 1, 2, and 10 is nearly memoryless under sampling density $\mu$.
Proposition 5. Any sequence of dynamic programs that (i) is weakly Rust solvable under sampling density $\mu$ and (ii) satisfies Assumptions 1 and 2 can be Rust $\epsilon$-approximated under sampling density $\mu$ by a sequence of dynamic programs that is nearly memoryless under sampling density $\mu$, for all $\epsilon > 0$.
Proofs
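Proposition 1. Any sequence of dynamic programs that satisfies Assumptions 1, 2, 7, 8, and 9 is strongly Rust solvable under sampling density $\mu$.
Proof. For $d \in \mathbb{N}$, $V \in \mathcal{V}_d$, $a \in \mathbf{a}$, and $b \in \mathbf{b}$ define
\[
\begin{aligned}
(\hat\Gamma_{da}^{b\mu}V)(s) &\equiv u_{da}(s) + \beta\,\frac{\sum_{i=1}^{b_d}V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)}{\sum_{i=1}^{b_d}p_{da}^{\mu}(m_d^i|s)},\\
(\tilde\Gamma_{da}^{b\mu}V)(s) &\equiv u_{da}(s) + \beta\sum_{i=1}^{b_d}V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)/b_d,\\
\text{and}\quad Z_{da}^{b\mu}(s) &\equiv \beta\sum_{i=1}^{b_d}g_d^i\,V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)/\sqrt{b_d},
\end{aligned}
\]
where $\{g_d^i\}_{i=1}^{b_d}$ is a set of independent standard normal random variables. Since $E((\tilde\Gamma_{da}^{b\mu}V)(s)) = (\Gamma_{da}V)(s)$, where $\Gamma_{da}$ is defined in the proof of Proposition 1, Pollard’s (1989) seventh equation implies that
\[
E\big(\|\tilde\Gamma_{da}^{b\mu}V - \Gamma_{da}V\|^2\big) \le \frac{2\pi}{\sqrt{b_d}}\,E\Big(\sup_{s\in[0,1]^d} Z_{da}^{b\mu}(s)^2\Big). \tag{1}
\]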
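We will now bound the expectation on the right. First, note that
\[
\begin{aligned}
E\big(|Z_{da}^{b\mu}(t) - Z_{da}^{b\mu}(s)|^2 : m_d\big) &= \beta^2\sum_{i=1}^{b_d}\big|V(m_d^i)\big(p_{da}^{\mu}(m_d^i|t) - p_{da}^{\mu}(m_d^i|s)\big)\big|^2/b_d\\
&\le \Big(\frac{\beta K_d\|t-s\|_2}{1-\beta}\Big)^2\sum_{i=1}^{b_d}\ell_d^{\mu}(m_d^i)^2/b_d.
\end{aligned} \tag{2}
\]
This expression implies that
\[
\delta_d^m \equiv \frac{\beta K_d}{1-\beta}\sqrt{\sum_{i=1}^{b_d} d\,\ell_d^{\mu}(m_d^i)^2/b_d} \;\ge\; \sup_{s,t\in[0,1]^d}\sqrt{E\big(|Z_{da}^{b\mu}(t) - Z_{da}^{b\mu}(s)|^2 : m_d\big)}. \tag{3}
\]
Now for a given $x > 0$, divide $[0,1]^d$ into $f(x) \equiv \lceil\delta_d^m/(2x)\rceil^d$ equally sized cubes, and define $\{c_x^i\}_{i=1}^{f(x)}$ as the center points of these cubes. By design, these center points satisfy
\[
\sup_{s\in[0,1]^d}\;\min_{i\in\{1,\cdots,f(x)\}}\|s - c_x^i\|_2 \le \frac{x(1-\beta)}{\beta K_d\sqrt{\sum_{i=1}^{b_d}\ell_d^{\mu}(m_d^i)^2/b_d}},
\]
which with (2) implies that
\[
\sup_{s\in[0,1]^d}\;\min_{i\in\{1,\cdots,f(x)\}} E\big(|Z_{da}^{b\mu}(s) - Z_{da}^{b\mu}(c_x^i)|^2 : m_d\big) \le x^2. \tag{4}
\]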
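Now combining (3) and (4) with Pollard’s (1989) eighth equation yields
\[
\begin{aligned}
\sqrt{E\Big(\sup_{s\in[0,1]^d} Z_{da}^{b\mu}(s)^2 : m_d\Big)} - \sqrt{E\big(Z_{da}^{b\mu}(0)^2 : m_d\big)} &\le C\int_0^{\delta_d^m}\sqrt{\log(f(x))}\,dx\\
&\le C\delta_d^m\sqrt{d}\int_0^1\sqrt{\log(1/u)}\,du\\
&= C\delta_d^m\sqrt{d\pi}/2,
\end{aligned}
\]
where $C \ge 1$ is a universal constant that’s independent of all model parameters, and the second inequality uses $\lceil\delta_d^m/(2x)\rceil \le \delta_d^m/x$ for $x \in (0, \delta_d^m]$. And since $x - y \le z$ implies $x^2 \le 2y^2 + 2z^2$ for all $x, y, z \in \mathbb{R}_+$, this implies that
\[
\begin{aligned}
E\Big(\sup_{s\in[0,1]^d} Z_{da}^{b\mu}(s)^2\Big) &= E\Big(E\Big(\sup_{s\in[0,1]^d} Z_{da}^{b\mu}(s)^2 : m_d\Big)\Big)\\
&\le E\Big(2E\big(Z_{da}^{b\mu}(0)^2 : m_d\big) + (C\delta_d^m)^2 d\pi\Big)\\
&\le 2\Big(\frac{\beta K_d M_d^{\mu}}{1-\beta}\Big)^2 + \pi\Big(\frac{d\beta C K_d}{1-\beta}\Big)^2 E\big(\ell_d^{\mu}(m_d^i)^2\big)\\
&\le \Big(\frac{\beta K_d}{1-\beta}\Big)^2\big(2(M_d^{\mu})^2 + \pi d^2C^2L_d^{\mu}\big).
\end{aligned}
\]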
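The closed form used for the entropy integral above, $\int_0^1\sqrt{\log(1/u)}\,du = \sqrt{\pi}/2$, can be sanity-checked numerically. The following is a minimal sketch using a midpoint rule; the function name `entropy_integral` and the grid size are arbitrary illustrative choices.

```python
import math


def entropy_integral(n=200_000):
    """Midpoint-rule approximation of the integral of sqrt(log(1/u)) over (0, 1)."""
    h = 1.0 / n
    # Midpoints (k + 0.5) * h stay strictly inside (0, 1), so the log is positive.
    return sum(math.sqrt(math.log(1.0 / ((k + 0.5) * h))) for k in range(n)) * h


approx = entropy_integral()
exact = math.sqrt(math.pi) / 2.0  # equals Gamma(3/2)
```

The substitution $u = e^{-x}$ turns the integral into $\int_0^\infty\sqrt{x}\,e^{-x}\,dx = \Gamma(3/2)$, which is where the factor $\sqrt{\pi}/2$ comes from.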
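Now combining this with (1) and applying Jensen’s inequality yields for all $V \in \mathcal{V}_d$
\[
\begin{aligned}
E\big(\|\tilde\Gamma_{da}^{b\mu}V - \Gamma_{da}V\|\big) &\le \sqrt{E\big(\|\tilde\Gamma_{da}^{b\mu}V - \Gamma_{da}V\|^2\big)}\\
&\le \sqrt{\frac{2\pi}{\sqrt{b_d}}\,E\Big(\sup_{s\in[0,1]^d} Z_{da}^{b\mu}(s)^2\Big)}\\
&\le \frac{2\pi\beta K_d}{b_d^{1/4}(1-\beta)}\sqrt{(M_d^{\mu})^2 + d^2C^2L_d^{\mu}}.
\end{aligned} \tag{5}
\]
Next, define constant function $\iota_d \in \mathcal{V}_d$, where $\iota_d(t) = K_d/(1-\beta)$ for all $t \in [0,1]^d$. With this, (5) yields
\[
\begin{aligned}
E\big(\|\hat\Gamma_{da}^{b\mu}V_d - \tilde\Gamma_{da}^{b\mu}V_d\|\big) &= E\Bigg(\sup_{s\in[0,1]^d}\Bigg|\Big(1 - \sum_{i=1}^{b_d}p_{da}^{\mu}(m_d^i|s)/b_d\Big)\,\beta\,\frac{\sum_{i=1}^{b_d}V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)}{\sum_{i=1}^{b_d}p_{da}^{\mu}(m_d^i|s)}\Bigg|\Bigg)\\
&\le \frac{\beta K_d}{1-\beta}\,E\Bigg(\sup_{s\in[0,1]^d}\Bigg|1 - \sum_{i=1}^{b_d}p_{da}^{\mu}(m_d^i|s)/b_d\Bigg|\Bigg)\\
&= E\big(\|\tilde\Gamma_{da}^{b\mu}\iota_d - \Gamma_{da}\iota_d\|\big)\\
&\le \frac{2\pi\beta K_d}{b_d^{1/4}(1-\beta)}\sqrt{(M_d^{\mu})^2 + d^2C^2L_d^{\mu}}.
\end{aligned}
\]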
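Combining this with (5) yields
\[
\begin{aligned}
E\big(\|\hat\Gamma_d^{b\mu}V_d - V_d\|\big) &= E\big(\|\hat\Gamma_d^{b\mu}V_d - \Gamma_dV_d\|\big)\\
&\le \sum_{a\in\mathbf{a}} E\big(\|\hat\Gamma_{da}^{b\mu}V_d - \Gamma_{da}V_d\|\big)\\
&\le \sum_{a\in\mathbf{a}} E\big(\|\hat\Gamma_{da}^{b\mu}V_d - \tilde\Gamma_{da}^{b\mu}V_d\|\big) + E\big(\|\tilde\Gamma_{da}^{b\mu}V_d - \Gamma_{da}V_d\|\big)\\
&\le \frac{4|\mathbf{a}|\pi\beta K_d}{b_d^{1/4}(1-\beta)}\sqrt{(M_d^{\mu})^2 + d^2C^2L_d^{\mu}}.
\end{aligned}
\]
And with this, Rust’s (1997) Lemma 2.2 establishes that
\[
\begin{aligned}
E\big(\|\hat V_d^{b\mu} - V_d\|\big) &\le E\big(\|\hat\Gamma_d^{b\mu}V_d - V_d\|\big)/(1-\beta)\\
&\le \frac{4|\mathbf{a}|\pi\beta K_d}{b_d^{1/4}(1-\beta)^2}\sqrt{(M_d^{\mu})^2 + d^2C^2L_d^{\mu}},
\end{aligned}
\]
which is smaller than $\epsilon$ when $b_d$ is larger than
\[
\Bigg(\frac{4|\mathbf{a}|\pi\beta K_d}{\epsilon(1-\beta)^2}\sqrt{(M_d^{\mu})^2 + d^2C^2L_d^{\mu}}\Bigg)^4.
\]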
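To make the final step concrete, the sketch below evaluates the error bound and the corresponding sample-size threshold for one set of made-up constants. It assumes scalar stand-ins for $K_d$, $M_d^{\mu}$, $L_d^{\mu}$, and $C$ at a fixed $d$; the function names and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def strong_error_bound(b, n_actions, beta, K, M, L, C, d):
    """4|a| pi beta K sqrt(M^2 + d^2 C^2 L) / (b^(1/4) (1 - beta)^2)."""
    scale = 4 * n_actions * math.pi * beta * K * math.sqrt(M**2 + d**2 * C**2 * L)
    return scale / (b ** 0.25 * (1 - beta) ** 2)


def strong_sample_threshold(eps, n_actions, beta, K, M, L, C, d):
    """Sample size above which the bound drops below eps."""
    scale = 4 * n_actions * math.pi * beta * K * math.sqrt(M**2 + d**2 * C**2 * L)
    return (scale / (eps * (1 - beta) ** 2)) ** 4


# Illustrative constants only; they are not taken from the paper.
params = dict(n_actions=2, beta=0.9, K=1.0, M=1.0, L=1.0, C=1.0, d=3)
threshold = strong_sample_threshold(eps=0.5, **params)
bound_at_threshold = strong_error_bound(threshold, **params)
bound_above = strong_error_bound(2 * threshold, **params)
```

The bound decays only at rate $b_d^{-1/4}$, so the threshold grows with the fourth power of $1/\epsilon$; it is nevertheless polynomial in $d$ whenever $K_d$, $M_d^{\mu}$, and $L_d^{\mu}$ are.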

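Proposition 2. Any sequence of dynamic programs that satisfies Assumptions 1, 2, and 10 is weakly Rust solvable under sampling density $\mu$.
Proof. Assumptions 2 and 10 ensure that $|V(t)\,p_{da}^{\mu}(t|s)| \le K_dM_d^{\mu}/(1-\beta)$ for all $V \in \mathcal{V}$. Hence, for $V \in \mathcal{V}$, Popoviciu’s inequality implies that $\mathrm{Var}(V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)) \le (K_dM_d^{\mu})^2/(1-\beta)^2$, and thus that
\[
\mathrm{Var}\Bigg(\sum_{i=1}^{b_d}V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)/b_d\Bigg) \le \frac{(K_dM_d^{\mu})^2}{b_d(1-\beta)^2}.
\]
Accordingly, Chebyshev’s inequality establishes, for a given $\delta > 0$ and $V \in \mathcal{V}$, that
\[
\Pr(y_d^b(s) = 1) \le \frac{(K_dM_d^{\mu})^2}{\delta^2b_d(1-\beta)^2},
\]
\[
\text{where}\quad y_d^b(s) \equiv 1\Bigg\{\Bigg|\sum_{i=1}^{b_d}V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)/b_d - \int_{t\in[0,1]^d}V(t)\,p_{da}(t|s)\,\lambda_d(dt)\Bigg| > \delta\Bigg\}.
\]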
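And this implies for $V \in \mathcal{V}$ that
\[
\begin{aligned}
E\big(|(\tilde\Gamma_{da}^{b\mu}V)(s) - (\Gamma_{da}V)(s)|\big) &= \beta\,E\Bigg(\Bigg|\sum_{i=1}^{b_d}V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)/b_d - \int_{t\in[0,1]^d}V(t)\,p_{da}(t|s)\,\lambda_d(dt)\Bigg|\Bigg)\\
&\le \Pr(y_d^b(s) = 0)\,E\Bigg(\Bigg|\sum_{i=1}^{b_d}V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)/b_d - \int_{t\in[0,1]^d}V(t)\,p_{da}(t|s)\,\lambda_d(dt)\Bigg| : y_d^b(s) = 0\Bigg)\\
&\quad + \Pr(y_d^b(s) = 1)\,E\Bigg(\Bigg|\sum_{i=1}^{b_d}V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)/b_d - \int_{t\in[0,1]^d}V(t)\,p_{da}(t|s)\,\lambda_d(dt)\Bigg| : y_d^b(s) = 1\Bigg)\\
&\le 1\cdot\delta + \frac{(K_dM_d^{\mu})^2}{\delta^2b_d(1-\beta)^2}\cdot 2K_dM_d^{\mu}/(1-\beta)\\
&= \delta + \frac{2(K_dM_d^{\mu})^3}{\delta^2b_d(1-\beta)^3},
\end{aligned}
\]
where $\tilde\Gamma_{da}^{b\mu}$ and $\Gamma_{da}$ are defined in the proof of Proposition 1. Hence, for $V \in \mathcal{V}$ we have
\[
E\big(\|\tilde\Gamma_{da}^{b\mu}V - \Gamma_{da}V\|_1\big) \le \delta + \frac{2(K_dM_d^{\mu})^3}{\delta^2b_d(1-\beta)^3}. \tag{6}
\]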
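And this, in turn, implies that
\[
\begin{aligned}
E\big(\|\hat\Gamma_{da}^{b\mu}V_d - \tilde\Gamma_{da}^{b\mu}V_d\|_1\big) &= E\Bigg(\int_{s\in[0,1]^d}\Bigg|\Big(1 - \sum_{i=1}^{b_d}p_{da}^{\mu}(m_d^i|s)/b_d\Big)\,\beta\,\frac{\sum_{i=1}^{b_d}V(m_d^i)\,p_{da}^{\mu}(m_d^i|s)}{\sum_{i=1}^{b_d}p_{da}^{\mu}(m_d^i|s)}\Bigg|\,\lambda_d(ds)\Bigg)\\
&\le \frac{\beta K_d}{1-\beta}\,E\Bigg(\int_{s\in[0,1]^d}\Bigg|1 - \sum_{i=1}^{b_d}p_{da}^{\mu}(m_d^i|s)/b_d\Bigg|\,\lambda_d(ds)\Bigg)\\
&= E\big(\|\tilde\Gamma_{da}^{b\mu}\iota_d - \Gamma_{da}\iota_d\|_1\big)\\
&\le \delta + \frac{2(K_dM_d^{\mu})^3}{\delta^2b_d(1-\beta)^3},
\end{aligned} \tag{7}
\]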
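where $\hat\Gamma_{da}^{b\mu}$ and $\iota_d$ are defined in the proof of Proposition 1. Now combining (6) and (7) yields
\[
\begin{aligned}
E\big(\|\hat\Gamma_d^{b\mu}V_d - V_d\|_1\big) &= E\big(\|\hat\Gamma_d^{b\mu}V_d - \Gamma_dV_d\|_1\big)\\
&\le \sum_{a\in\mathbf{a}} E\big(\|\hat\Gamma_{da}^{b\mu}V_d - \Gamma_{da}V_d\|_1\big)\\
&\le \sum_{a\in\mathbf{a}} E\big(\|\hat\Gamma_{da}^{b\mu}V_d - \tilde\Gamma_{da}^{b\mu}V_d\|_1\big) + E\big(\|\tilde\Gamma_{da}^{b\mu}V_d - \Gamma_{da}V_d\|_1\big)\\
&\le 2|\mathbf{a}|\delta + \frac{4|\mathbf{a}|(K_dM_d^{\mu})^3}{\delta^2b_d(1-\beta)^3}.
\end{aligned}
\]
And with this, Rust’s (1997) Lemma 2.2 establishes that
\[
\begin{aligned}
E\big(\|\hat V_d^{b\mu} - V_d\|_1\big) &\le E\big(\|\hat\Gamma_d^{b\mu}V_d - V_d\|_1\big)/(1-\beta)\\
&\le 2|\mathbf{a}|\delta/(1-\beta) + \frac{4|\mathbf{a}|(K_dM_d^{\mu})^3}{\delta^2b_d(1-\beta)^4},
\end{aligned}
\]
which is less than $\epsilon$ when $\delta < \frac{\epsilon(1-\beta)}{4|\mathbf{a}|}$ and $b_d > \frac{8|\mathbf{a}|(K_dM_d^{\mu})^3}{\epsilon\delta^2(1-\beta)^4}$.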
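The two conditions at the end of this proof each hold one of the two terms in the bound below $\epsilon/2$. The sketch below illustrates that arithmetic for made-up constants, treating $K_d$ and $M_d^{\mu}$ as scalars at a fixed $d$; the function names, the `slack` and `factor` parameters, and the numbers are hypothetical.

```python
def weak_error_bound(b, delta, n_actions, beta, K, M):
    """2|a| delta / (1 - beta) + 4|a| (K M)^3 / (delta^2 b (1 - beta)^4)."""
    first = 2 * n_actions * delta / (1 - beta)
    second = 4 * n_actions * (K * M) ** 3 / (delta**2 * b * (1 - beta) ** 4)
    return first + second


def weak_choices(eps, n_actions, beta, K, M, slack=0.99, factor=1.01):
    """Pick delta just below, and b just above, the thresholds in the proof."""
    delta = slack * eps * (1 - beta) / (4 * n_actions)
    b = factor * 8 * n_actions * (K * M) ** 3 / (eps * delta**2 * (1 - beta) ** 4)
    return delta, b


# Illustrative constants only; they are not taken from the paper.
eps = 0.1
delta, b = weak_choices(eps, n_actions=3, beta=0.95, K=2.0, M=5.0)
bound = weak_error_bound(b, delta, n_actions=3, beta=0.95, K=2.0, M=5.0)
```

Because $\delta$ scales with $\epsilon$, the required sample size scales with $\epsilon^{-3}$, and it remains polynomial in $d$ when $K_d$ and $M_d^{\mu}$ are.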
Proposition 3. Every sequence of dynamic programs that satisfies Assumptions 1, 2, 7, 8, and 9 can be $\epsilon$-approximated by a sequence that satisfies Assumptions 1, 2, and 10, for all $\epsilon > 0$.
Proof. First, choose $\epsilon > 0$ small enough so that $\int_{t\in[0,1]^d}\ell_d^{\mu}(t)\,\mu_d(dt) > \delta_d \equiv \frac{\epsilon(1-\beta)^2}{2|\mathbf{a}|\beta K_d\sqrt{d}}$. Second, set $\gamma_d \in \mathbb{R}$ such that $\int_{t\in[0,1]^d}\max(0, \ell_d^{\mu}(t) - \gamma_d)\,\mu_d(dt) = \delta_d$. Third, define probability density function
\[
\omega_d^{\mu}(t) \equiv \frac{1\{\ell_d^{\mu}(t) \ge \gamma_d\}}{\int_{r\in[0,1]^d}1\{\ell_d^{\mu}(r) \ge \gamma_d\}\,\mu_d(dr)}.
\]
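Fourth, Jensen’s inequality yields
\[
\begin{aligned}
\int_{t\in[0,1]^d}\ell_d^{\mu}(t)^2\,\mu_d(dt) &\ge \int_{t\in[0,1]^d}1\{\ell_d^{\mu}(t) \ge \gamma_d\}\,\ell_d^{\mu}(t)^2\,\mu_d(dt)\\
&= \int_{t\in[0,1]^d}\omega_d^{\mu}(t)\,\ell_d^{\mu}(t)^2\,\mu_d(dt)\int_{r\in[0,1]^d}1\{\ell_d^{\mu}(r) \ge \gamma_d\}\,\mu_d(dr)\\
&\ge \Bigg(\int_{t\in[0,1]^d}\omega_d^{\mu}(t)\,\ell_d^{\mu}(t)\,\mu_d(dt)\Bigg)^2\int_{r\in[0,1]^d}1\{\ell_d^{\mu}(r) \ge \gamma_d\}\,\mu_d(dr)\\
&= \Bigg(\gamma_d + \frac{\int_{t\in[0,1]^d}\max(0, \ell_d^{\mu}(t) - \gamma_d)\,\mu_d(dt)}{\int_{r\in[0,1]^d}1\{\ell_d^{\mu}(r) \ge \gamma_d\}\,\mu_d(dr)}\Bigg)^2\int_{r\in[0,1]^d}1\{\ell_d^{\mu}(r) \ge \gamma_d\}\,\mu_d(dr)\\
&= \Bigg(\gamma_d + \frac{\delta_d}{\int_{r\in[0,1]^d}1\{\ell_d^{\mu}(r) \ge \gamma_d\}\,\mu_d(dr)}\Bigg)^2\int_{r\in[0,1]^d}1\{\ell_d^{\mu}(r) \ge \gamma_d\}\,\mu_d(dr)\\
&\ge \min_{y\in\mathbb{R}_+}\,(\gamma_d + \delta_d/y)^2\,y\\
&\ge 2\gamma_d\delta_d,
\end{aligned}
\]
which with Assumption 8 implies
\[
\gamma_d \le L_d^{\mu}/(2\delta_d).
\]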
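Fifth, define probability density function
\[
\bar p_{da}^{\mu}(t|s) \equiv \underline{p}_{da}^{\mu}(t|s) + 1 - \int_{r\in[0,1]^d}\underline{p}_{da}^{\mu}(r|s)\,\mu_d(dr),
\]
\[
\text{where}\quad \underline{p}_{da}^{\mu}(t|s) \equiv \min\big(p_{da}^{\mu}(t|0) + \sqrt{d}\,\gamma_d,\; p_{da}^{\mu}(t|s)\big).
\]
This density function satisfies Assumption 10, since
\[
\begin{aligned}
\bar p_{da}^{\mu}(t|s) &\le \underline{p}_{da}^{\mu}(t|s) + 1\\
&\le p_{da}^{\mu}(t|0) + \sqrt{d}\,\gamma_d + 1\\
&\le M_d^{\mu} + \sqrt{d}\,L_d^{\mu}/(2\delta_d) + 1\\
&= M_d^{\mu} + \frac{|\mathbf{a}|\beta K_dL_d^{\mu}d}{\epsilon(1-\beta)^2} + 1,
\end{aligned}
\]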
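which is polynomially bounded. Also, by design we have
\[
\begin{aligned}
\int_{t\in[0,1]^d}|\bar p_{da}^{\mu}(t|s) - p_{da}^{\mu}(t|s)|\,\mu_d(dt) &= \int_{t\in[0,1]^d}\Bigg|\underline{p}_{da}^{\mu}(t|s) + 1 - \int_{r\in[0,1]^d}\underline{p}_{da}^{\mu}(r|s)\,\mu_d(dr) - p_{da}^{\mu}(t|s)\Bigg|\,\mu_d(dt)\\
&= \int_{t\in[0,1]^d}\Bigg|\underline{p}_{da}^{\mu}(t|s) + \int_{r\in[0,1]^d}\big(p_{da}^{\mu}(r|s) - \underline{p}_{da}^{\mu}(r|s)\big)\,\mu_d(dr) - p_{da}^{\mu}(t|s)\Bigg|\,\mu_d(dt)\\
&\le \int_{t\in[0,1]^d}|\underline{p}_{da}^{\mu}(t|s) - p_{da}^{\mu}(t|s)|\,\mu_d(dt) + \int_{r\in[0,1]^d}|p_{da}^{\mu}(r|s) - \underline{p}_{da}^{\mu}(r|s)|\,\mu_d(dr)\\
&= 2\int_{t\in[0,1]^d}|\underline{p}_{da}^{\mu}(t|s) - p_{da}^{\mu}(t|s)|\,\mu_d(dt)\\
&\le 2\int_{t\in[0,1]^d}\max\big(0,\; \ell_d^{\mu}(t)\|s\|_2 - \sqrt{d}\,\gamma_d\big)\,\mu_d(dt)\\
&\le 2\sqrt{d}\int_{t\in[0,1]^d}\max\big(0,\; \ell_d^{\mu}(t) - \gamma_d\big)\,\mu_d(dt)\\
&= 2\sqrt{d}\,\delta_d.
\end{aligned} \tag{8}
\]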
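Now let $\bar\Gamma_d^{\mu}$ represent the Bellman operator under density function $\bar p_{da}^{\mu}$. With (8), we find that this operator satisfies
\[
\begin{aligned}
\|\bar\Gamma_d^{\mu}V_d - V_d\| &= \|\bar\Gamma_d^{\mu}V_d - \Gamma_dV_d\|\\
&\le \sum_{a\in\mathbf{a}}\sup_{s\in[0,1]^d}\Bigg|\beta\int_{t\in[0,1]^d}V_d(t)\big(\bar p_{da}^{\mu}(t|s) - p_{da}^{\mu}(t|s)\big)\,\mu_d(dt)\Bigg|\\
&\le \frac{\beta K_d}{1-\beta}\sum_{a\in\mathbf{a}}\sup_{s\in[0,1]^d}\int_{t\in[0,1]^d}|\bar p_{da}^{\mu}(t|s) - p_{da}^{\mu}(t|s)|\,\mu_d(dt)\\
&\le \frac{\beta K_d}{1-\beta}\sum_{a\in\mathbf{a}}\sup_{s\in[0,1]^d}2\sqrt{d}\,\delta_d\\
&= \frac{2|\mathbf{a}|\beta\delta_dK_d\sqrt{d}}{1-\beta}\\
&= \epsilon(1-\beta).
\end{aligned}
\]
And with this, Rust’s (1997) Lemma 2.2 establishes that $\|\bar V_d^{\mu} - V_d\| \le \epsilon$, where $\bar V_d^{\mu}$ is the fixed point of $\bar\Gamma_d^{\mu}$.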

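Proposition 4. Any sequence of dynamic programs that satisfies Assumptions 1, 2, and 10 is nearly memoryless under sampling density $\mu$.
Proof. Pinsker’s inequality establishes that $\|\lambda_1 - p_{da}^i(\cdot|s,t_{<i})\|_1 \le \sqrt{2\kappa(p_{da}^i(\cdot|s,t_{<i}), \lambda_1)}$. And this implies that
\[
\begin{aligned}
&\int_{t_{\ge i}\in[0,1]^{d-i+1}}\big|p_{da}^{-i}(t_{<i},t_{\ge i}|s) - p_{da}(t_{<i},t_{\ge i}|s)\big|\,\lambda_{d-i+1}(dt_{\ge i})\\
&\quad= \int_{t_i\in[0,1]}\int_{t_{\ge i+1}\in[0,1]^{d-i}}\big|p_{da}^{-i}(t_{<i},t_i,t_{\ge i+1}|s) - p_{da}(t_{<i},t_i,t_{\ge i+1}|s)\big|\,\lambda_{d-i}(dt_{\ge i+1})\,\lambda_1(dt_i)\\
&\quad= \int_{t_i\in[0,1]}\int_{t_{\ge i+1}\in[0,1]^{d-i}}\big|1 - p_{da}^i(t_i|s,t_{<i})\big|\,p_{da}^{-i}(t_{<i},t_i,t_{\ge i+1}|s)\,\lambda_{d-i}(dt_{\ge i+1})\,\lambda_1(dt_i)\\
&\quad= \int_{t_i\in[0,1]}\big|1 - p_{da}^i(t_i|s,t_{<i})\big|\,\frac{\int_{t_{\ge i+1}\in[0,1]^{d-i}}p_{da}(t_{<i},t_i,t_{\ge i+1}|s)\,\lambda_{d-i}(dt_{\ge i+1})}{p_{da}^i(t_i|s,t_{<i})}\,\lambda_1(dt_i)\\
&\quad= \int_{t_i\in[0,1]}\big|1 - p_{da}^i(t_i|s,t_{<i})\big|\,\frac{p_d^{<i+1}(t_{<i},t_i)}{p_{da}^i(t_i|s,t_{<i})}\,\lambda_1(dt_i)\\
&\quad= p_d^{<i}(t_{<i})\int_{t_i\in[0,1]}\big|1 - p_{da}^i(t_i|s,t_{<i})\big|\,\lambda_1(dt_i)\\
&\quad= p_d^{<i}(t_{<i})\,\|\lambda_1 - p_{da}^i(\cdot|s,t_{<i})\|_1\\
&\quad\le p_d^{<i}(t_{<i})\sqrt{2\kappa(p_{da}^i(\cdot|s,t_{<i}), \lambda_1)}.
\end{aligned}
\]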
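And with this, Jensen’s inequality yields
\[
\begin{aligned}
\|p_{da}^{-i}(\cdot|s) - p_{da}(\cdot|s)\|_1 &= \int_{t\in[0,1]^d}\big|p_{da}^{-i}(t|s) - p_{da}(t|s)\big|\,\lambda_d(dt)\\
&= \int_{t_{<i}\in[0,1]^{i-1}}\int_{t_{\ge i}\in[0,1]^{d-i+1}}\big|p_{da}^{-i}(t_{<i},t_{\ge i}|s) - p_{da}(t_{<i},t_{\ge i}|s)\big|\,\lambda_{d-i+1}(dt_{\ge i})\,\lambda_{i-1}(dt_{<i})\\
&\le \int_{t_{<i}\in[0,1]^{i-1}}p_d^{<i}(t_{<i})\sqrt{2\kappa(p_{da}^i(\cdot|s,t_{<i}), \lambda_1)}\,\lambda_{i-1}(dt_{<i})\\
&\le \sqrt{2\int_{t_{<i}\in[0,1]^{i-1}}p_d^{<i}(t_{<i})\,\kappa(p_{da}^i(\cdot|s,t_{<i}), \lambda_1)\,\lambda_{i-1}(dt_{<i})}\\
&= \sqrt{2E\big(\kappa(p_{da}^i(\cdot|s,t_{<i}), \lambda_1)\big)}.
\end{aligned}
\]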

Next, Lemma 1 implies that there exist $m, n \in \mathbb{N}$ such that $\sum_{i=1}^{d}E\big(\kappa(p_{da}^i(\cdot|s,t_{<i}), \lambda_1)\big) < m + n\log(d)$, for all $d \in \mathbb{N}$. Since $\kappa(p_{da}^i(\cdot|s,t_{<i}), \lambda_1) \ge 0$, this implies that for a given $\gamma > 0$ the inequality $\sqrt{2E\big(\kappa(p_{da}^i(\cdot|s,t_{<i}), \lambda_1)\big)} < \gamma$ holds for at least $d - 2(m + n\log(d))/\gamma^2$ values of $i$. And with the result above, this implies that $\|p_{da}^{-i}(\cdot|s) - p_{da}(\cdot|s)\|_1 < \gamma$ holds for at least $d - 2(m + n\log(d))/\gamma^2$ values of $i$. Now define $\Omega_{da}^i = \{s \in [0,1]^d : \|p_{da}^{-i}(\cdot|s) - p_{da}(\cdot|s)\|_1 < \gamma\}$ as the set of points that satisfy this inequality for a given $i \in \{1, \cdots, d\}$. The Lebesgue measures of these sets satisfy
\[
\begin{aligned}
\sum_{i=1}^{d}\lambda_d(\Omega_{da}^i) &= \sum_{i=1}^{d}\int_{s\in[0,1]^d}1\{s \in \Omega_{da}^i\}\,\lambda_d(ds)\\
&= \int_{s\in[0,1]^d}\sum_{i=1}^{d}1\{s \in \Omega_{da}^i\}\,\lambda_d(ds)\\
&\ge \int_{s\in[0,1]^d}\big(d - 2(m + n\log(d))/\gamma^2\big)\,\lambda_d(ds)\\
&= d - 2(m + n\log(d))/\gamma^2.
\end{aligned}
\]
Since $\lambda_d(\Omega_{da}^i) \le 1$ this implies that
\[
\lambda_d(\Omega_{da}^i) \ge 1 - \delta \tag{9}
\]
holds for at least $d - 2(m + n\log(d))/(\delta\gamma^2)$ values of $i$. Thus, for $i$ that satisfies (9), we have
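\[
\begin{aligned}
\|\Gamma_{da}^{-i}V_d - \Gamma_{da}V_d\|_1 &= \int_{s\in[0,1]^d}\Bigg|u_{da}(s) + \beta\int_{t\in[0,1]^d}V_d(t)\,p_{da}^{-i}(t|s)\,\lambda_d(dt) - u_{da}(s) - \beta\int_{t\in[0,1]^d}V_d(t)\,p_{da}(t|s)\,\lambda_d(dt)\Bigg|\,\lambda_d(ds)\\
&= \beta\int_{s\in[0,1]^d}\Bigg|\int_{t\in[0,1]^d}V_d(t)\big(p_{da}^{-i}(t|s) - p_{da}(t|s)\big)\,\lambda_d(dt)\Bigg|\,\lambda_d(ds)\\
&\le \beta\|V_d\|\int_{s\in[0,1]^d}\|p_{da}^{-i}(\cdot|s) - p_{da}(\cdot|s)\|_1\,\lambda_d(ds)\\
&\le \beta\|V_d\|\Bigg(\int_{s\in\Omega_{da}^i}\|p_{da}^{-i}(\cdot|s) - p_{da}(\cdot|s)\|_1\,\lambda_d(ds) + \int_{s\in[0,1]^d\setminus\Omega_{da}^i}\big(\|p_{da}^{-i}(\cdot|s)\|_1 + \|p_{da}(\cdot|s)\|_1\big)\,\lambda_d(ds)\Bigg)\\
&\le \beta\|V_d\|\Bigg(\int_{s\in\Omega_{da}^i}\gamma\,\lambda_d(ds) + \int_{s\in[0,1]^d\setminus\Omega_{da}^i}2\,\lambda_d(ds)\Bigg)\\
&= \beta\|V_d\|\big(\gamma\lambda_d(\Omega_{da}^i) + 2(1 - \lambda_d(\Omega_{da}^i))\big)\\
&\le \beta(K_d/(1-\beta))(\gamma + 2\delta),
\end{aligned}
\]
where $\Gamma_{da}$ is the action-$a$ Bellman operator defined in the proof of Proposition 1, and $\Gamma_{da}^{-i}$ is the analogous operator under $p_{da}^{-i}$. Thus, for $i$ that satisfies (9), we have
\[
\begin{aligned}
\|\Gamma_d^{-i}V_d - V_d\|_1 &= \|\Gamma_d^{-i}V_d - \Gamma_dV_d\|_1\\
&\le \sum_{a\in\mathbf{a}}\|\Gamma_{da}^{-i}V_d - \Gamma_{da}V_d\|_1\\
&\le |\mathbf{a}|\beta(K_d/(1-\beta))(\gamma + 2\delta),
\end{aligned}
\]
where $\Gamma_d^{-i}$ is the analog of $\Gamma_d$ under $p_d^{-i}$. And with this, Rust’s (1997) Lemma 2.2 implies that $\|V_d^{-i} - V_d\|_1 < |\mathbf{a}|\beta K_d(\gamma + 2\delta)/(1-\beta)^2$ holds for $i$ that satisfies (9). Hence, setting $\gamma = \delta = \frac{\epsilon(1-\beta)^2}{4|\mathbf{a}|\beta K_d}$, we find that $\|V_d^{-i} - V_d\|_1 < \epsilon$ holds for at least $d - 2(m + n\log(d))/(\delta\gamma^2) = d - 128\Big(\frac{|\mathbf{a}|\beta K_d}{\epsilon(1-\beta)^2}\Big)^3(m + n\log(d))$ values of $i$.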
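Lemma 1. Assumption 10 implies that there exist $m, n \in \mathbb{N}$ such that
\[
\max_{a\in\mathbf{a}}\;\sup_{d\in\mathbb{N}}\;\sup_{s\in[0,1]^d}\;\frac{\sum_{i=1}^{d}E\big(\kappa(p_d^i(\cdot|t_{<i}), \mu_d^i(\cdot|t_{<i}))\big)}{m + n\log(d)} < 1.
\]
Proof. The following implies the result:
\[
\begin{aligned}
\sup_{t\in[0,1]^d}p_{da}^{\mu}(t|s) &= \exp\Big(\sup_{t\in[0,1]^d}\log(p_{da}^{\mu}(t|s))\Big)\\
&\ge \exp\Bigg(\int_{t\in[0,1]^d}\log(p_{da}^{\mu}(t|s))\,p_{da}(t|s)\,\lambda_d(dt)\Bigg)\\
&= \exp\Bigg(\sum_{i=1}^{d}\int_{t\in[0,1]^d}\log\Bigg(\frac{p_{da}^i(t_i|s,t_{<i})}{\mu_d^i(t_i|t_{<i})}\Bigg)\,p_{da}(t|s)\,\lambda_d(dt)\Bigg)\\
&= \exp\Bigg(\sum_{i=1}^{d}E\big(\kappa(p_{da}^i(\cdot|s,t_{<i}), \mu_d^i(\cdot|t_{<i}))\big)\Bigg).
\end{aligned}
\]
Since Assumption 10 bounds the left-hand side by the polynomial $M_d^{\mu}$, the sum of expected divergences is at most $\log(M_d^{\mu})$, which is $O(\log(d))$.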

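Proposition 4. Any sequence of dynamic programs that satisfies Assumptions 1, 2, and 10 is nearly memoryless under sampling density $\mu$.
Proof. Pinsker’s inequality establishes that $\|p_{da}^i(\cdot|s,t_{<i}) - \mu_d^i(\cdot|t_{<i})\|_1 \le \sqrt{2\kappa(p_{da}^i(\cdot|s,t_{<i}), \mu_d^i(\cdot|t_{<i}))}$. And this implies that
\[
\begin{aligned}
&\int_{t_{\ge i}\in[0,1]^{d-i+1}}\big|p_{da}^{\mu,-i}(t_{<i},t_{\ge i}|s) - p_{da}(t_{<i},t_{\ge i}|s)\big|\,\lambda_{d-i+1}(dt_{\ge i})\\
&\quad= \int_{t_{\ge i}\in[0,1]^{d-i+1}}\big|\mu_d^i(t_i|t_{<i})/p_{da}^i(t_i|s,t_{<i}) - 1\big|\,p_{da}(t_{<i},t_i,t_{\ge i+1}|s)\,\lambda_{d-i+1}(dt_{\ge i})\\
&\quad= \int_{t_i\in[0,1]}\big|\mu_d^i(t_i|t_{<i})/p_{da}^i(t_i|s,t_{<i}) - 1\big|\,p_d^{<i+1}(t_{<i},t_i)\,\lambda_1(dt_i)\\
&\quad= p_d^{<i}(t_{<i})\int_{t_i\in[0,1]}\big|\mu_d^i(t_i|t_{<i}) - p_{da}^i(t_i|s,t_{<i})\big|\,\lambda_1(dt_i)\\
&\quad= p_d^{<i}(t_{<i})\,\|\mu_d^i(\cdot|t_{<i}) - p_{da}^i(\cdot|s,t_{<i})\|_1\\
&\quad\le p_d^{<i}(t_{<i})\sqrt{2\kappa(p_{da}^i(\cdot|s,t_{<i}), \mu_d^i(\cdot|t_{<i}))}.
\end{aligned}
\]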
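Pinsker’s inequality, which drives the chain above, can be illustrated on a small discrete example. The sketch below uses two hypothetical three-point distributions as stand-ins for the conditional densities $p_{da}^i(\cdot|s,t_{<i})$ and $\mu_d^i(\cdot|t_{<i})$; the numbers and function names are illustrative assumptions and do not come from the paper.

```python
import math


def l1_distance(p, q):
    """L1 distance between two discrete distributions on the same support."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


def kl_divergence(p, q):
    """Kullback-Leibler divergence of q from p, with natural logarithms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.3, 0.2]  # stand-in for the transition conditional
q = [0.2, 0.5, 0.3]  # stand-in for the sampling conditional
lhs = l1_distance(p, q)
rhs = math.sqrt(2 * kl_divergence(p, q))
```

In this example the L1 distance is 0.6, which sits below the Pinsker bound of roughly 0.67, so a small divergence forces the two conditionals to be close in L1.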
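And with this, Jensen’s inequality yields
\[
\begin{aligned}
\|p_{da}^{\mu,-i}(\cdot|s) - p_{da}(\cdot|s)\|_1 &= \int_{t\in[0,1]^d}\big|p_{da}^{\mu,-i}(t|s) - p_{da}(t|s)\big|\,\lambda_d(dt)\\
&\le \int_{t_{<i}\in[0,1]^{i-1}}p_d^{<i}(t_{<i})\sqrt{2\kappa(p_{da}^i(\cdot|s,t_{<i}), \mu_d^i(\cdot|t_{<i}))}\,\lambda_{i-1}(dt_{<i})\\
&\le \sqrt{2\int_{t_{<i}\in[0,1]^{i-1}}p_d^{<i}(t_{<i})\,\kappa(p_{da}^i(\cdot|s,t_{<i}), \mu_d^i(\cdot|t_{<i}))\,\lambda_{i-1}(dt_{<i})}\\
&= \sqrt{2E\big(\kappa(p_{da}^i(\cdot|s,t_{<i}), \mu_d^i(\cdot|t_{<i}))\big)}.
\end{aligned}
\]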

Next, Lemma 1 implies that there exist $m, n \in \mathbb{N}$ such that $\sum_{i=1}^{d}E\big(\kappa(p_{da}^i(\cdot|s,t_{<i}), \mu_d^i(\cdot|t_{<i}))\big) < m + n\log(d)$, for all $d \in \mathbb{N}$. Since $\kappa(p_{da}^i(\cdot|s,t_{<i}), \mu_d^i(\cdot|t_{<i})) \ge 0$, this implies that for a given $\gamma > 0$ the inequality $\sqrt{2E\big(\kappa(p_{da}^i(\cdot|s,t_{<i}), \mu_d^i(\cdot|t_{<i}))\big)} < \gamma$ holds for at least $d - 2(m + n\log(d))/\gamma^2$ values of $i$. And with the result above, this implies that $\|p_{da}^{\mu,-i}(\cdot|s) - p_{da}(\cdot|s)\|_1 < \gamma$ holds for at least $d - 2(m + n\log(d))/\gamma^2$ values of $i$. Now define $\Omega_{da}^{\mu,i} = \{s \in [0,1]^d : \|p_{da}^{\mu,-i}(\cdot|s) - p_{da}(\cdot|s)\|_1 < \gamma\}$ as the set of points that satisfy this inequality for a given $i \in \{1, \cdots, d\}$. The Lebesgue measures of these sets satisfy
\[
\begin{aligned}
\sum_{i=1}^{d}\lambda_d(\Omega_{da}^{\mu,i}) &= \sum_{i=1}^{d}\int_{s\in[0,1]^d}1\{s \in \Omega_{da}^{\mu,i}\}\,\lambda_d(ds)\\
&= \int_{s\in[0,1]^d}\sum_{i=1}^{d}1\{s \in \Omega_{da}^{\mu,i}\}\,\lambda_d(ds)\\
&\ge \int_{s\in[0,1]^d}\big(d - 2(m + n\log(d))/\gamma^2\big)\,\lambda_d(ds)\\
&= d - 2(m + n\log(d))/\gamma^2.
\end{aligned}
\]
Since $\lambda_d(\Omega_{da}^{\mu,i}) \le 1$ this implies that
\[
\lambda_d(\Omega_{da}^{\mu,i}) \ge 1 - \delta \tag{10}
\]
holds for at least $d - 2(m + n\log(d))/(\delta\gamma^2)$ values of $i$. Thus, for $i$ that satisfies (10), we have
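\[
\begin{aligned}
\|\Gamma_{da}^{\mu,-i}V_d - \Gamma_{da}V_d\|_1 &= \int_{s\in[0,1]^d}\Bigg|u_{da}(s) + \beta\int_{t\in[0,1]^d}V_d(t)\,p_{da}^{\mu,-i}(t|s)\,\lambda_d(dt) - u_{da}(s) - \beta\int_{t\in[0,1]^d}V_d(t)\,p_{da}(t|s)\,\lambda_d(dt)\Bigg|\,\lambda_d(ds)\\
&= \beta\int_{s\in[0,1]^d}\Bigg|\int_{t\in[0,1]^d}V_d(t)\big(p_{da}^{\mu,-i}(t|s) - p_{da}(t|s)\big)\,\lambda_d(dt)\Bigg|\,\lambda_d(ds)\\
&\le \beta\|V_d\|\int_{s\in[0,1]^d}\|p_{da}^{\mu,-i}(\cdot|s) - p_{da}(\cdot|s)\|_1\,\lambda_d(ds)\\
&\le \beta\|V_d\|\Bigg(\int_{s\in\Omega_{da}^{\mu,i}}\|p_{da}^{\mu,-i}(\cdot|s) - p_{da}(\cdot|s)\|_1\,\lambda_d(ds) + \int_{s\in[0,1]^d\setminus\Omega_{da}^{\mu,i}}\big(\|p_{da}^{\mu,-i}(\cdot|s)\|_1 + \|p_{da}(\cdot|s)\|_1\big)\,\lambda_d(ds)\Bigg)\\
&\le \beta\|V_d\|\Bigg(\int_{s\in\Omega_{da}^{\mu,i}}\gamma\,\lambda_d(ds) + \int_{s\in[0,1]^d\setminus\Omega_{da}^{\mu,i}}2\,\lambda_d(ds)\Bigg)\\
&= \beta\|V_d\|\big(\gamma\lambda_d(\Omega_{da}^{\mu,i}) + 2(1 - \lambda_d(\Omega_{da}^{\mu,i}))\big)\\
&\le \beta(K_d/(1-\beta))(\gamma + 2\delta),
\end{aligned}
\]
where $\Gamma_{da}$ is the action-$a$ Bellman operator defined in the proof of Proposition 1, and $\Gamma_{da}^{\mu,-i}$ is the analogous operator under $p_{da}^{\mu,-i}$. Thus, for $i$ that satisfies (10), we have
\[
\begin{aligned}
\|\Gamma_d^{\mu,-i}V_d - V_d\|_1 &= \|\Gamma_d^{\mu,-i}V_d - \Gamma_dV_d\|_1\\
&\le \sum_{a\in\mathbf{a}}\|\Gamma_{da}^{\mu,-i}V_d - \Gamma_{da}V_d\|_1\\
&\le |\mathbf{a}|\beta(K_d/(1-\beta))(\gamma + 2\delta),
\end{aligned}
\]
where $\Gamma_d^{\mu,-i}$ is the analog of $\Gamma_d$ under $p_d^{\mu,-i}$. And with this, Rust’s (1997) Lemma 2.2 implies that $\|V_d^{\mu,-i} - V_d\|_1 < |\mathbf{a}|\beta K_d(\gamma + 2\delta)/(1-\beta)^2$ holds for $i$ that satisfies (10). Hence, setting $\gamma = \delta = \frac{\epsilon(1-\beta)^2}{4|\mathbf{a}|\beta K_d}$, we find that $\|V_d^{\mu,-i} - V_d\|_1 < \epsilon$ holds for at least $d - 2(m + n\log(d))/(\delta\gamma^2) = d - 128\Big(\frac{|\mathbf{a}|\beta K_d}{\epsilon(1-\beta)^2}\Big)^3(m + n\log(d))$ values of $i$.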
Proposition 5. Any sequence of dynamic programs that (i) is weakly Rust solvable under sampling density $\mu$ and (ii) satisfies Assumptions 1 and 2 can be Rust $\epsilon$-approximated under sampling density $\mu$ by a sequence of dynamic programs that is nearly memoryless under sampling density $\mu$, for all $\epsilon > 0$.
Proof. Define $\Xi_d^{\mu}$ as a general set with measure $b_d^{-2}$ under sampling density $\mu$: $\int_{t\in\Xi_d^{\mu}}\mu_d(dt) = b_d^{-2}$. And use this to define the following probability transition density function:
\[
\bar p_{da}^{\mu b}(t|s) \equiv 1\{p_{da}^{\mu}(t|s) \le b_d^2\}\,p_{da}^{\mu}(t|s) + b_d^2\,1\{t \in \Xi_d^{\mu}\}\,\bar p_{da}^{\mu}(s),
\]
\[
\text{where}\quad \bar p_{da}^{\mu}(s) \equiv \int_{r\in[0,1]^d}1\{p_{da}^{\mu}(r|s) > b_d^2\}\,p_{da}^{\mu}(r|s)\,\mu_d(dr).
\]
This density function funnels all the mass that exceeds $b_d^2$ into $\Xi_d^{\mu}$. Since it never exceeds $2b_d^2$, this density function satisfies Assumption 10. So Proposition 4 establishes that this density function corresponds with a sequence of dynamic programs that’s nearly memoryless under sampling density $\mu$ (given Assumptions 1 and 2).
Define the following as the set of points for which the new density function equals the old density function:
\[
\Omega_d^{\mu b}(s) \equiv \big\{t \in [0,1]^d : \bar p_{da}^{\mu b}(t|s) = p_{da}^{\mu}(t|s)\;\; \forall\, a \in \mathbf{a}\big\}.
\]
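The measure of this set under sampling density $\mu$ satisfies
\[
\begin{aligned}
\int_{t\in\Omega_d^{\mu b}(s)}\mu_d(dt) &\ge 1 - \int_{t\in\Xi_d^{\mu}}\mu_d(dt) - \sum_{a\in\mathbf{a}}\int_{t\in[0,1]^d}1\{p_{da}^{\mu}(t|s) > b_d^2\}\,\mu_d(dt)\\
&\ge 1 - b_d^{-2} - b_d^{-2}\sum_{a\in\mathbf{a}}\int_{t\in[0,1]^d}1\{p_{da}^{\mu}(t|s) > b_d^2\}\,p_{da}^{\mu}(t|s)\,\mu_d(dt)\\
&\ge 1 - b_d^{-2} - b_d^{-2}\sum_{a\in\mathbf{a}}\int_{t\in[0,1]^d}p_{da}^{\mu}(t|s)\,\mu_d(dt)\\
&\ge 1 - (1 + |\mathbf{a}|)/b_d^2.
\end{aligned}
\]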
The probability that this set contains all $m_d^i$ values satisfies
\[
\begin{aligned}
\Pr\Big(\cup_{i=1}^{b_d}m_d^i \subset \Omega_d^{\mu b}(s)\Big) &= \Bigg(\int_{t\in\Omega_d^{\mu b}(s)}\mu_d(dt)\Bigg)^{b_d}\\
&\ge \big(1 - (1 + |\mathbf{a}|)/b_d^2\big)^{b_d}\\
&> 1 - (1 + |\mathbf{a}|)/b_d,
\end{aligned} \tag{11}
\]
where the last step is Bernoulli’s inequality, which is strict when $b_d \ge 2$ and $b_d^2 \ge 1 + |\mathbf{a}|$.
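The final inequality in (11), $(1 - (1 + |\mathbf{a}|)/b_d^2)^{b_d} > 1 - (1 + |\mathbf{a}|)/b_d$, can be spot-checked numerically. The sketch below does so for a hypothetical action count of two and a handful of sample sizes; the function names and test values are illustrative assumptions.

```python
def containment_lower_bound(b, n_actions):
    """(1 - (1 + |a|) / b^2)^b, the lower bound on the containment probability."""
    return (1 - (1 + n_actions) / b**2) ** b


def linear_bound(b, n_actions):
    """1 - (1 + |a|) / b, the simpler bound stated in (11)."""
    return 1 - (1 + n_actions) / b


# Compare the two bounds for several sample sizes with |a| = 2.
checks = [
    (b, containment_lower_bound(b, 2), linear_bound(b, 2))
    for b in (2, 5, 10, 100, 1000)
]
all_hold = all(lower > linear for _, lower, linear in checks)
```

The gap between the two bounds shrinks as $b_d$ grows, so the linear bound loses little for the large samples the argument ultimately uses.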
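Next, define $\bar{\hat\Gamma}_d^{b\mu}$ as Rust’s random Bellman operator evaluated under density function $\bar p_{da}^{\mu b}$. Since $\hat V_d^{b\mu} \in \mathcal{V}_d$, we have
\[
\big|(\bar{\hat\Gamma}_d^{b\mu}\hat V_d^{b\mu})(s) - (\hat\Gamma_d^{b\mu}\hat V_d^{b\mu})(s)\big| \le \beta K_d/(1-\beta). \tag{12}
\]
And if $\cup_{i=1}^{b_d}m_d^i \subset \Omega_d^{\mu b}(s)$ then
\[
\big|(\bar{\hat\Gamma}_d^{b\mu}\hat V_d^{b\mu})(s) - (\hat\Gamma_d^{b\mu}\hat V_d^{b\mu})(s)\big| = 0. \tag{13}
\]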
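Now combining (11)–(13) yields
\[
\begin{aligned}
E\big(\|\bar{\hat\Gamma}_d^{b\mu}\hat V_d^{b\mu} - \hat V_d^{b\mu}\|_1\big) &= E\big(\|\bar{\hat\Gamma}_d^{b\mu}\hat V_d^{b\mu} - \hat\Gamma_d^{b\mu}\hat V_d^{b\mu}\|_1\big)\\
&= \int_{s\in[0,1]^d}E\big(|(\bar{\hat\Gamma}_d^{b\mu}\hat V_d^{b\mu})(s) - (\hat\Gamma_d^{b\mu}\hat V_d^{b\mu})(s)|\big)\,\lambda_d(ds)\\
&\le \int_{s\in[0,1]^d}\frac{\beta K_d}{1-\beta}\Big(1 - \Pr\Big(\cup_{i=1}^{b_d}m_d^i \subset \Omega_d^{\mu b}(s)\Big)\Big)\,\lambda_d(ds)\\
&< \int_{s\in[0,1]^d}\frac{\beta K_d(1 + |\mathbf{a}|)}{b_d(1-\beta)}\,\lambda_d(ds)\\
&= \frac{\beta K_d(1 + |\mathbf{a}|)}{b_d(1-\beta)}.
\end{aligned}
\]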
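And with this, we can use Rust’s (1997) Lemma 2.2 to establish that
\[
E\big(\|\bar{\hat V}_d^{b\mu} - \hat V_d^{b\mu}\|_1\big) \le E\big(\|\bar{\hat\Gamma}_d^{b\mu}\hat V_d^{b\mu} - \hat V_d^{b\mu}\|_1\big)/(1-\beta) < \frac{\beta K_d(1 + |\mathbf{a}|)}{b_d(1-\beta)^2}. \tag{14}
\]
Now for a given $\epsilon > 0$ choose $b \in \mathbf{b}$ such that $\frac{\beta K_d(1 + |\mathbf{a}|)}{b_d(1-\beta)^2} < \epsilon/2$ and $E(\|\hat V_d^{b\mu} - V_d\|_1) < \epsilon/2$. And with this, (14) yields $E\big(\|\bar{\hat V}_d^{b\mu} - V_d\|_1\big) \le E\big(\|\bar{\hat V}_d^{b\mu} - \hat V_d^{b\mu}\|_1\big) + E\big(\|\hat V_d^{b\mu} - V_d\|_1\big) < \epsilon$.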
References
Bray, Robert L. 2021. A Comment on “Using Randomization to Break the Curse of Dimensionality”. Working Paper.
Pollard, David. 1989. Asymptotics via empirical processes. Statistical Science 4(4) 341–354.
Rust, John. 1997. Using randomization to break the curse of dimensionality. Econometrica 65(3) 487–516.
