Lecture Notes on Monte Carlo Methods
Fall Semester, 2005
Courant Institute of Mathematical Sciences, NYU
Jonathan Goodman, goodman@cims.nyu.edu

Chapter 2: Simple Sampling of Gaussians.

created August 26, 2005

Generating univariate or multivariate Gaussian random variables is simple and fast. There should be no reason ever to use approximate methods based, for example, on the Central Limit Theorem.
1 Box Muller
It would be nice to get a standard normal from a standard uniform by inverting the distribution function, but there is no closed form formula for this distribution function,
\[ N(x) = P(X < x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt . \]
The Box Muller method is a brilliant trick to overcome this by producing two independent standard normals from two independent uniforms. It is based on the familiar trick for calculating
\[ I = \int_{-\infty}^{\infty} e^{-x^2/2}\, dx . \]
This cannot be calculated by “integration” – the indefinite integral does not have an algebraic expression in terms of elementary functions (exponentials, logs, trig functions). However,
\[ I^2 = \int_{-\infty}^{\infty} e^{-x^2/2}\, dx \int_{-\infty}^{\infty} e^{-y^2/2}\, dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\, dx\, dy . \]
The last integral can be calculated using polar coordinates x = r cos(θ), y = r sin(θ) with area element dx dy = r dr dθ, so that
\[ I^2 = \int_{r=0}^{\infty} \int_{\theta=0}^{2\pi} e^{-r^2/2}\, r\, dr\, d\theta = 2\pi \int_{r=0}^{\infty} e^{-r^2/2}\, r\, dr . \]
Unlike the original x integral, this r integral is elementary. The substitution s = r^2/2 gives ds = r dr and
\[ I^2 = 2\pi \int_{s=0}^{\infty} e^{-s}\, ds = 2\pi . \]
The Box Muller algorithm is a probabilistic interpretation of this trick. If (X, Y) is a pair of independent standard normals, then the probability density is a product:
\[ f(x,y) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \cdot \frac{1}{\sqrt{2\pi}} e^{-y^2/2} = \frac{1}{2\pi} e^{-(x^2+y^2)/2} . \]
Since this density is radially symmetric, it is natural to consider the polar coordinate random variables (R, Θ) defined by 0 ≤ Θ < 2π and X = R cos(Θ), and Y = R sin(Θ). Clearly Θ is uniformly distributed in the interval [0, 2π] and may be sampled using
\[ \Theta = 2\pi U_1 . \]
Unlike the original distribution function N(x), there is a simple expression for the R distribution function:
\[ G(r) = P(R \le r) = \int_{r'=0}^{r} \int_{\theta=0}^{2\pi} \frac{1}{2\pi} e^{-r'^2/2}\, r'\, dr'\, d\theta = \int_{r'=0}^{r} e^{-r'^2/2}\, r'\, dr' . \]
The same change of variable s = r'^2/2, ds = r' dr' (so that s = r^2/2 when r' = r) allows us to calculate
\[ G(r) = \int_{s=0}^{r^2/2} e^{-s}\, ds = 1 - e^{-r^2/2} . \]
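As a quick empirical check (a sketch, not in the original notes, using Python's random.gauss as an independent source of standard normals), the fraction of samples with R ≤ r should match G(r) = 1 − e^{−r²/2}:

```python
import math
import random

random.seed(0)

# Empirical CDF of R = sqrt(X^2 + Y^2) for independent standard normals X, Y.
N = 100_000
rs = [math.hypot(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]

for r in (0.5, 1.0, 2.0):
    empirical = sum(v <= r for v in rs) / N
    exact = 1.0 - math.exp(-r * r / 2.0)
    print(f"r={r}: empirical {empirical:.3f} vs exact {exact:.3f}")
```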
Therefore, we may sample R by solving the distribution function equation^1
\[ G(R) = 1 - e^{-R^2/2} = 1 - U_2 , \]
whose solution is R = \sqrt{-2\ln(U_2)}. Altogether, the Box Muller method takes independent standard uniform random variables U_1 and U_2 and produces independent standard normals X and Y using the formulas
\[ \Theta = 2\pi U_1 , \quad R = \sqrt{-2\ln(U_2)} , \quad X = R\cos(\Theta) , \quad Y = R\sin(\Theta) . \tag{1} \]
It may seem odd that X and Y in (1) are independent given that they use the same R and Θ. Not only does our algebra show that this is true, but we can test the independence computationally, and it will be confirmed.

Part of this method was generating a point “at random” on the unit circle. We suggested doing this by choosing Θ uniformly in the interval [0, 2π] then taking the point on the circle to be (cos(Θ), sin(Θ)). This has the possible drawback that the computer must evaluate the sine and cosine functions. Another way to do this^2 is to choose a point uniformly in the 2 × 2 square −1 ≤ x ≤ 1, −1 ≤ y ≤ 1 and reject it if it falls outside the unit circle. The first accepted point will be uniformly distributed in the unit disk x^2 + y^2 ≤ 1, so its angle will be random and uniformly distributed. The final step is to get a point on the unit circle x^2 + y^2 = 1 by dividing by the length.

The methods have equal accuracy (both are exact in exact arithmetic). What distinguishes them is computer performance (a topic discussed more in a later lecture, hopefully). The rejection method, with an acceptance probability π/4 ≈ 78%, seems efficient, but rejection can break the instruction pipeline and slow a computation by a factor of ten. Also, the square root needed to compute
1
Recall that 1
2
is a standard uniform if
2
is.
2
Suggested, for example, in the dubious book
Numerical Recipies
.
the length may not be faster to evaluate than sine and cosine. Moreover, the rejection method uses two uniforms while the Θ method uses just one.

The method can be reversed to solve another sampling problem: generating a random point on the “unit sphere” in R^n. If we generate n independent standard normals, then the vector X = (X_1, ..., X_n) has all angles equally likely, because the probability density
\[ f(x) = \left( \frac{1}{\sqrt{2\pi}} \right)^{n} \exp\!\left( -(x_1^2 + \cdots + x_n^2)/2 \right) \]
is radially symmetric. Therefore X/‖X‖ is uniformly distributed on the unit sphere, as desired.
1.1 Other methods for univariate normals
The Box Muller method is elegant and reasonably fast and is fine for casual computations, but it may not be the best method for hard core users. Many software packages have native standard normal random number generators, which (if they are any good) use expertly optimized methods. There is very fast and accurate software on the web for directly inverting the normal distribution function N(x). This is particularly important for quasi Monte Carlo, which substitutes equidistributed sequences for random sequences (see a later lecture).
2 Multivariate normals
An n component multivariate normal, X, is characterized by its mean µ = E[X] and its covariance matrix C = E[(X − µ)(X − µ)^t]. We discuss the problem of generating such an X with mean zero, since we achieve mean µ by adding µ to a mean zero multivariate normal. The key to generating such an X is the fact that if Y is an m component mean zero multivariate normal with covariance D and X = AY, then X is a mean zero multivariate normal with covariance
\[ C = E[X X^t] = E[AY (AY)^t] = A\, E[Y Y^t]\, A^t = A D A^t . \]
We know how to sample the n component multivariate normal with D = I: just take the components of Y to be independent univariate standard normals. The formula X = AY will produce the desired covariance matrix if we find A with A A^t = C. A simple way to do this in practice is to use the Choleski decomposition from numerical linear algebra. This is a simple algorithm that produces a lower triangular matrix, L, so that L L^t = C. It works for any positive definite C.
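As a sketch, NumPy's np.linalg.cholesky computes the lower triangular factor L; the covariance matrix C below is an arbitrary positive definite example:

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary symmetric positive definite covariance matrix C.
C = np.array([[2.0, 0.6, 0.3],
              [0.6, 1.0, 0.2],
              [0.3, 0.2, 0.5]])

L = np.linalg.cholesky(C)               # lower triangular, L @ L.T == C
Y = rng.standard_normal((3, 100_000))   # columns: iid standard normal vectors
X = L @ Y                               # columns: mean zero normals with covariance C

print(np.round(np.cov(X), 2))           # close to C
```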
In physical applications it is common that one has not C but its inverse, H. This would happen, for example, if the system has kT = 1 (it’s easy to change this) and energy (1/2) x^t H x, so that the probability density is Z^{-1} exp(−(1/2) x^t H x) (writing Z for the normalization constant). In large scale physical problems it may be impractical to calculate and store the covariance matrix C = H^{-1} though the Choleski factorization H = L L^t is available. Note that H^{-1} = L^{-t} L^{-1},^3 so the choice A = L^{-t} gives A A^t = L^{-t} L^{-1} = C.

^3 We write L^{-t} for the transpose of L^{-1}, which also is the inverse of L^t.
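A NumPy sketch of this situation, under the assumption that only H (or its factor L) is stored; the 2 × 2 matrix H is an arbitrary illustrative choice. The solve applies L^{-t} without ever forming it (scipy.linalg.solve_triangular would additionally exploit the triangular structure):

```python
import numpy as np

rng = np.random.default_rng(1)

# Suppose we hold the precision matrix H = C^{-1}, not the covariance C.
H = np.array([[4.0, 1.0],
              [1.0, 3.0]])
L = np.linalg.cholesky(H)              # H = L L^t

# X = L^{-t} Y has covariance L^{-t} I L^{-1} = (L L^t)^{-1} = H^{-1} = C.
Y = rng.standard_normal((2, 200_000))
X = np.linalg.solve(L.T, Y)            # solve L^t X = Y instead of inverting

print(np.round(np.cov(X) @ H, 2))      # close to the identity, so cov(X) ~ H^{-1}
```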