
Generation of Random Numbers

The simulation of stochastic models is based on random numbers generated by a computer. The generated numbers should satisfy statistical tests for random numbers uniformly distributed on [0, 1].

Let m be an integer. A popular method is the linear congruential method, which produces a sequence of integers between zero and m − 1 using the following relationship:

xi+1 = (a xi + c) mod m (1)


The initial value x0 is called the seed, a is called the multiplier, and c is called the increment. The desired random numbers are

ui = xi/m

Since u1, u2, ... are generated using (1), they are not actually random. Therefore these numbers are often referred to as pseudorandom. Moreover, the considered method generates only numbers from the set

{0, 1/m, 2/m, ..., (m − 1)/m}
The selection of the three parameters, seed, multiplier, and increment, affects the statistical properties of the generated numbers and the length of the cycle. For example, let a = 13, m = 64, c = 0, and x0 = 4. We have

x1 = 13 · 4 mod 64 = 52
x2 = 13 · 52 mod 64 = 36
x3 = 13 · 36 mod 64 = 20
x4 = 13 · 20 mod 64 = 4
So, the length of this cycle is only 4.
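The linear congruential method above can be sketched in a few lines of Python. The parameter values are the ones from the example (chosen to show a short cycle), not recommended production constants:

```python
def lcg(a, c, m, seed):
    """Yield the sequence x1, x2, ... produced by x_{i+1} = (a*xi + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = lcg(a=13, c=0, m=64, seed=4)
xs = [next(gen) for _ in range(4)]
print(xs)                   # the cycle 52, 36, 20, 4 from the example
us = [x / 64 for x in xs]   # the corresponding random numbers ui = xi/m
print(us)
```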

A run is a succession of similar events preceded
and followed by a different event. An up run
is a sequence of numbers each of which is suc-
ceeded by a larger number. A down run is
a sequence of numbers each of which is suc-
ceeded by a smaller number. Let A be the total number of runs in a random sequence; then its mean and variance are

E[A] = (2n − 1)/3 and Var[A] = (16n − 29)/90,
where n is the length of the sequence. For n > 20 the distribution of A is reasonably approximated by a normal distribution, so

Z = (A − E[A])/√Var[A] = (A − (2n − 1)/3)/√((16n − 29)/90)

follows (approximately) the standard normal distribution. Let α be the level of significance, and let z_{α/2} be the corresponding critical value. Then, the test rejects the sequence if

|Z| > z_{α/2}.

For example, consider the following sequence

0.32 0.58 0.77 0.88 0.65 0.93 0.53


0.66 0.45 0.21 0.19 0.68 0.75 0.18
0.73 0.02 0.01 0.41 0.16 0.28 0.18
0.01 0.87 0.69 0.25 0.47 0.31 0.32
0.72 0.53 0.30 0.42 0.73 0.04 0.83
0.47 0.24 0.57 0.63 0.29

Hence n = 40,

E[A] = (2 · 40 − 1)/3 = 26.33

Var[A] = (16 · 40 − 29)/90 = 6.79

It is convenient to mark each number (except the last one) by “+” if this number is followed by a larger number, or by “−” if it is followed by a smaller number.
+ + + − + − +
0.32 0.58 0.77 0.88 0.65 0.93 0.53
− − − + + − +
0.66 0.45 0.21 0.19 0.68 0.75 0.18
− − + − + − −
0.73 0.02 0.01 0.41 0.16 0.28 0.18
+ − − + − + +
0.01 0.87 0.69 0.25 0.47 0.31 0.32
− − + + − + −
0.72 0.53 0.30 0.42 0.73 0.04 0.83
− + + −
0.47 0.24 0.57 0.63 0.29

Hence, there are 26 runs,

Z = (A − E[A])/√Var[A] = (26 − 26.33)/√6.79 = −0.13
Suppose that the level of significance is α =
0.05. Then, the critical value is z0.025 = 1.96,
and since

−1.96 < −0.13 < 1.96


the sequence cannot be rejected.
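The runs-up-and-runs-down test can be sketched in Python and applied to the example sequence. The function name and structure are my own, not from any library:

```python
import math

def runs_up_down_z(seq):
    """Count runs up and down in seq and return (A, Z) for the runs test."""
    # a run ends wherever the direction of successive comparisons changes
    signs = ['+' if b > a else '-' for a, b in zip(seq, seq[1:])]
    runs = 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    n = len(seq)
    mean = (2 * n - 1) / 3
    var = (16 * n - 29) / 90
    return runs, (runs - mean) / math.sqrt(var)

seq = [0.32, 0.58, 0.77, 0.88, 0.65, 0.93, 0.53,
       0.66, 0.45, 0.21, 0.19, 0.68, 0.75, 0.18,
       0.73, 0.02, 0.01, 0.41, 0.16, 0.28, 0.18,
       0.01, 0.87, 0.69, 0.25, 0.47, 0.31, 0.32,
       0.72, 0.53, 0.30, 0.42, 0.73, 0.04, 0.83,
       0.47, 0.24, 0.57, 0.63, 0.29]
A, Z = runs_up_down_z(seq)
print(A, round(Z, 2))     # 26 runs, Z ≈ -0.13
print(abs(Z) <= 1.96)     # fails to reject at α = 0.05
```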
Now consider the following sequence
+ + + − + − +
0.55 0.63 0.77 0.90 0.52 0.91 0.64
− − − + + − +
0.93 0.76 0.67 0.52 0.68 0.87 0.55
− − + − + − −
0.78 0.77 0.74 0.95 0.82 0.86 0.43
+ − − + − + +
0.32 0.36 0.18 0.05 0.33 0.15 0.27
− − + + − + −
0.40 0.34 0.27 0.45 0.49 0.13 0.46
− + + −
0.35 0.25 0.39 0.45 0.12
The sequence of runs up and down is exactly
the same as in the previous example. So, the
considered sequence of numbers would pass
the runs-up and runs-down test. However, the
first 20 numbers are all above the mean

(0.99 + 0)/2 = 0.495
and the last 20 numbers are all below the mean.
If the numbers are random, such an occurrence
is highly unlikely.
Now, let a run be a sequence of numbers all
above (or all below) the mean. Let n1 and n2
be the number of individual observation above
and below the mean, respectively. Let B be
the total number of runs. If the numbers are
random, then
E[B] = 2n1n2/n + 1/2

Var[B] = 2n1n2(2n1n2 − n)/(n²(n − 1))

For n1 or n2 greater than 20, B is approximately normally distributed. So,

Z = (B − E[B])/√Var[B] = (B − 2n1n2/n − 1/2)/√(2n1n2(2n1n2 − n)/(n²(n − 1)))
has (approximately) the standard normal distribution. The sequence of numbers is rejected if

|Z| > z_{α/2},

where z_{α/2} is the critical value corresponding to the level of significance α.
For example, consider the sequence

0.32 0.58 0.77 0.88 0.65 0.93 0.53


0.66 0.45 0.21 0.19 0.68 0.75 0.18
0.73 0.02 0.01 0.41 0.16 0.28 0.18
0.01 0.87 0.69 0.25 0.47 0.31 0.32
0.72 0.53 0.30 0.42 0.73 0.04 0.83
0.47 0.24 0.57 0.63 0.29

It is convenient to assign “+” to each number that is greater than 0.495 and “−” to each number that is below 0.495.

− + + + + + +
0.32 0.58 0.77 0.88 0.65 0.93 0.53
+ − − − + + −
0.66 0.45 0.21 0.19 0.68 0.75 0.18
+ − − − − − −
0.73 0.02 0.01 0.41 0.16 0.28 0.18
− + + − − − −
0.01 0.87 0.69 0.25 0.47 0.31 0.32
+ + − − + − +
0.72 0.53 0.30 0.42 0.73 0.04 0.83
− − + + −
0.47 0.24 0.57 0.63 0.29
Hence, the total number of runs is 17, and
n1 = 18, n2 = 22, and n = n1+n2 = 40.
E[B] = 2 · 18 · 22/40 + 1/2 = 20.3

Var[B] = 2 · 18 · 22 · (2 · 18 · 22 − 40)/(40²(40 − 1)) = 9.54

Since n2 > 20, the normal approximation is acceptable.

Z = (17 − 20.3)/√9.54 = −1.07
For the level of significance α = 0.05, the crit-
ical value z0.025 = 1.96. So, the test fails to
reject the considered sequence of numbers be-
cause
−1.96 < −1.07 < 1.96.
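The runs-above-and-below-the-mean test admits a similar sketch (again, the helper is my own construction), applied to the same 40 numbers:

```python
import math

def runs_above_below_z(seq, mean):
    """Runs above/below the mean: return (B, Z) for the test in the text."""
    signs = [x > mean for x in seq]      # True = above the mean
    B = 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    n1 = sum(signs)                      # observations above the mean
    n2 = len(seq) - n1                   # observations below the mean
    n = n1 + n2
    mu = 2 * n1 * n2 / n + 0.5
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return B, (B - mu) / math.sqrt(var)

seq = [0.32, 0.58, 0.77, 0.88, 0.65, 0.93, 0.53,
       0.66, 0.45, 0.21, 0.19, 0.68, 0.75, 0.18,
       0.73, 0.02, 0.01, 0.41, 0.16, 0.28, 0.18,
       0.01, 0.87, 0.69, 0.25, 0.47, 0.31, 0.32,
       0.72, 0.53, 0.30, 0.42, 0.73, 0.04, 0.83,
       0.47, 0.24, 0.57, 0.63, 0.29]
B, Z = runs_above_below_z(seq, 0.495)
print(B, round(Z, 2))     # 17 runs, Z ≈ -1.07
```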

Markov’s Inequality
Theorem 1 If a random variable X takes on only nonnegative values, then for any a > 0

P(X ≥ a) ≤ E[X]/a
Proof We will prove the theorem only for the
case where X is a continuous random variable
with density function f (x).
E[X] = ∫0^∞ x f(x) dx = ∫0^a x f(x) dx + ∫a^∞ x f(x) dx

≥ ∫a^∞ x f(x) dx ≥ ∫a^∞ a f(x) dx = a ∫a^∞ f(x) dx

= a P(X ≥ a)
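As an illustrative numeric check (my own, not from the text), one can compare both sides of Markov's inequality for an exponential random variable with mean 1, whose tail P(X ≥ a) = e^(−a) is known exactly:

```python
import math, random

random.seed(0)   # fixed seed for a reproducible run

# X exponential with mean 1, so E[X] = 1 and P(X >= a) = e^(-a) exactly.
a = 3.0
n = 100_000
xs = [-math.log(1 - random.random()) for _ in range(n)]
p_tail = sum(x >= a for x in xs) / n
print(p_tail, "<=", 1 / a)   # the Markov bound E[X]/a = 1/3 holds easily
```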

Chebyshev’s Inequality
Theorem 2 For any random variable X with Var[X] > 0 and any k > 0,

P(|X − E[X]| ≥ k√Var[X]) ≤ 1/k²

Proof The inequality

|X − E[X]| ≥ k√Var[X]

is equivalent to the inequality

(X − E[X])²/Var[X] ≥ k².
Since the random variable (X − E[X])²/Var[X] is nonnegative and

E[(X − E[X])²/Var[X]] = 1,

according to Markov's inequality,

P(|X − E[X]| ≥ k√Var[X]) = P((X − E[X])²/Var[X] ≥ k²) ≤ 1/k²

The Weak Law of Large Numbers

Theorem 3 For any sequence of independent and identically distributed random variables X1, X2, ... with expected value µ and finite variance σ², and any ε > 0,

lim_{n→∞} P(|(X1 + ... + Xn)/n − µ| > ε) = 0
Proof Observe that

E[(X1 + ... + Xn)/n] = (E[X1] + ... + E[Xn])/n = µ,

and since X1, X2, ... are independent,

Var[(X1 + ... + Xn)/n] = (Var[X1] + ... + Var[Xn])/n² = σ²/n

Then by Chebyshev's inequality,

P(|(X1 + ... + Xn)/n − µ| ≥ kσ/√n) ≤ 1/k².

Hence, for k such that kσ/√n = ε,

P(|(X1 + ... + Xn)/n − µ| ≥ ε) ≤ σ²/(nε²),

which implies

lim_{n→∞} P(|(X1 + ... + Xn)/n − µ| ≥ ε) = 0

A generalization of the weak law of large num-
bers is the strong law of large numbers, which
states that, with probability 1,
X1 + ... + Xn
lim = µ.
n→∞ n
So, with probability 1, the long-run average of a sequence of independent and identically distributed random variables converges to the expected value.
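A quick simulation (my own illustration, using uniform [0, 1] variables, for which µ = 0.5) shows the sample mean settling near the expected value as n grows:

```python
import random

random.seed(1)   # fixed seed for a reproducible run

# Sample means of n i.i.d. uniform [0, 1] variables; by the law of
# large numbers they should approach µ = 0.5 as n grows.
for n in [10, 1000, 100_000]:
    mean = sum(random.random() for _ in range(n)) / n
    print(n, round(mean, 4))
```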
Numerical Integration Using Random Numbers

Consider the integral ∫0^1 g(x) dx and let U be a random variable uniformly distributed on [0, 1]. Since the density function of U is 1,

∫0^1 g(x) dx = ∫0^1 g(x) · 1 dx = E[g(U)]
Let U1, U2, ... be a sequence of independent random variables uniformly distributed over [0, 1]. Then, g(U1), g(U2), ... is a sequence of independent and identically distributed random variables with expected value ∫0^1 g(x) dx. By the law of large numbers we can approximate ∫0^1 g(x) dx by

(1/n) ∑_{i=1}^n g(ui)

where u1, u2, ..., un are random numbers generated by a computer. This is an example of the approach called the Monte Carlo method.

For example, consider the integral

∫0^1 4√(1 − x²) dx = π

and the random numbers u1 = 0.11, u2 = 0.97, u3 = 0.89, u4 = 0.53, u5 = 0.53, u6 = 0.39, u7 = 0.01, u8 = 0.21, u9 = 0.68, u10 = 0.56. Then,

(1/10) ∑_{i=1}^{10} 4√(1 − ui²) = 3.139665308
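The computation above can be reproduced directly; this sketch simply plugs the ten random numbers from the text into the estimator:

```python
import math

# The ten random numbers from the example, plugged into the estimator
# (1/10) * sum of 4*sqrt(1 - ui^2), which approximates π.
us = [0.11, 0.97, 0.89, 0.53, 0.53, 0.39, 0.01, 0.21, 0.68, 0.56]
estimate = sum(4 * math.sqrt(1 - u * u) for u in us) / len(us)
print(estimate)   # ≈ 3.1397, close to π
```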

Let u1, ..., un be random numbers. In order to evaluate (approximately) the integral

∫a^b g(x) dx

we make the substitution

y = (x − a)/(b − a)

Then

∫a^b g(x) dx = ∫0^1 g((b − a)y + a)(b − a) dy

and consequently

(1/n) ∑_{i=1}^n g((b − a)ui + a)(b − a) ≈ ∫0^1 g((b − a)y + a)(b − a) dy = ∫a^b g(x) dx
In order to evaluate (approximately) the integral

∫0^∞ g(x) dx

we make the substitution

y = 1/(x + 1)

Then,

∫0^∞ g(x) dx = ∫1^0 g(1/y − 1)(−1/y²) dy = ∫0^1 g(1/y − 1)/y² dy

and consequently

(1/n) ∑_{i=1}^n g(1/ui − 1)/ui² ≈ ∫0^1 g(1/y − 1)/y² dy = ∫0^∞ g(x) dx
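A similar sketch covers the half-line case; the test integrand e^(−x), whose integral over [0, ∞) is 1, is my own choice:

```python
import math, random

random.seed(0)   # fixed seed for a reproducible run

def mc_integrate_halfline(g, n=100_000):
    """Monte Carlo estimate of the integral of g over [0, ∞) using the
    substitution y = 1/(x + 1)."""
    total = 0.0
    for _ in range(n):
        u = 1.0 - random.random()          # in (0, 1], so 1/u is safe
        total += g(1.0 / u - 1.0) / (u * u)
    return total / n

approx = mc_integrate_halfline(lambda x: math.exp(-x))
print(approx)   # near the exact value 1
```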

Inverse Transform Technique


Let X be a random variable with strictly increasing cumulative distribution function F(x). Let F −1 be the inverse of F. Consider the random variable Y = F(X). Since F(x) = P(X ≤ x), we have 0 ≤ F(X) ≤ 1. Moreover, for any 0 ≤ y ≤ 1,

P(Y ≤ y) = P(F(X) ≤ y) = P(X ≤ F −1(y)) = F(F −1(y)) = y

Hence, Y is uniformly distributed on [0, 1]. This leads to the following method of sampling from the distribution with cumulative distribution function F(x):
• generate random numbers u1, ..., un;
• compute F −1(u1), ..., F −1(un).
Exponential Distribution

The density function for the exponential distribution is

f(x) = λe^(−λx) if x ≥ 0, and f(x) = 0 if x < 0,

and for any x ≥ 0

F(x) = ∫0^x λe^(−λt) dt = 1 − e^(−λx)

So, the cumulative distribution function of X is F(x) = 1 − e^(−λx), and

U = 1 − e^(−λX)

is uniformly distributed on [0, 1]. We have

e^(−λX) = 1 − U
−λX = log(1 − U)
X = −(1/λ) log(1 − U) (2)
Observe that if U is uniformly distributed on
[0, 1], then
P (1 − U ≤ u) = P (U ≥ 1 − u) = 1 − P (U ≤ 1 − u)
= 1 − (1 − u) = u.
Therefore, 1 − U is also uniformly distributed on [0, 1], and (2) can be rewritten as

X = −(1/λ) log(U)

Hence, to sample from the exponential distribution we need
• to generate random numbers u1, ..., un;
• to compute −(1/λ) log(u1), ..., −(1/λ) log(un).
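A minimal sketch of these two steps (the helper name is assumed; the log(1 − u) form of equation (2) is used so that u = 0 is harmless):

```python
import math, random

random.seed(0)   # fixed seed for a reproducible run

def exp_sample(lam, n):
    """Inverse transform sampling: X = -(1/lam) * log(1 - U), equation (2)."""
    return [-math.log(1 - random.random()) / lam for _ in range(n)]

xs = exp_sample(lam=2.0, n=100_000)
print(sum(xs) / len(xs))   # near the theoretical mean 1/λ = 0.5
```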

Uniform Distribution
Let X be a random variable uniformly distributed
on the interval [a, b]. Then, the density function is

f(x) = 1/(b − a) if a ≤ x ≤ b, and f(x) = 0 otherwise,

and for a ≤ x ≤ b

F(x) = ∫a^x dt/(b − a) = (x − a)/(b − a)

Hence,

U = (X − a)/(b − a)

is uniformly distributed on [0, 1]. We have

X = U(b − a) + a,
and we can sample from the considered uni-
form distribution by
• generating random numbers u1, ..., un;
• computing u1(b − a) + a, ..., un(b − a) + a.
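A sketch of this recipe in Python (the interval [2, 5] is my own choice for illustration):

```python
import random

random.seed(0)   # fixed seed for a reproducible run

def uniform_sample(a, b, n):
    """Sample from the uniform distribution on [a, b] via X = U(b - a) + a."""
    return [random.random() * (b - a) + a for _ in range(n)]

xs = uniform_sample(2.0, 5.0, 100_000)
print(min(xs), max(xs))    # all values fall inside [2, 5]
print(sum(xs) / len(xs))   # near the theoretical mean 3.5
```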

Empirical Continuous Distribution
Suppose that we have the following data
x1 < x2 < ... < xn,
and suppose that the smallest possible value is 0. Then we define x0 = 0 and assign the probability 1/n to each interval [xi−1, xi], 1 ≤ i ≤ n. (Figure: for n = 5, the piecewise linear cumulative distribution function maps a value U on the vertical axis, between the levels 1/5, 2/5, 3/5, 4/5, to F −1(U) between x1, ..., x5 on the horizontal axis.)

In general, let U be a random variable, uniformly distributed on [0, 1]. If (i − 1)/n < U ≤ i/n, then it is easy to see that

(xi − xi−1)/(1/n) = (F −1(U) − xi−1)/(U − (i − 1)/n),
and therefore,

F −1(U) = n(xi − xi−1)(U − (i − 1)/n) + xi−1

This observation leads to the following method of sampling from the continuous empirical distribution:
• generate random numbers u1, ..., un;
• for each 1 ≤ j ≤ n, choose i such that (i − 1)/n < uj ≤ i/n and compute

n(xi − xi−1)(uj − (i − 1)/n) + xi−1

Discrete Distributions
The inverse transform method can also be used
when X is discrete. In this case,

F(x) = P(X ≤ x) = ∑_{xi ≤ x} p(xi),
where p(xi) is the probability mass function.
We can sample from a discrete distribution as
follows:
• generate random numbers u1, ..., un;
• for each 1 ≤ k ≤ n, return xi for the small-
est i satisfying uk ≤ F (xi)
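The two bullets can be sketched as a linear scan over the cumulative sums (the three-point distribution below is hypothetical):

```python
def discrete_sample(xs, ps, u):
    """Return xi for the smallest i with u <= F(xi), where F accumulates
    the probability mass function ps over the points xs."""
    F = 0.0
    for x, p in zip(xs, ps):
        F += p
        if u <= F:
            return x
    return xs[-1]   # guard against floating-point round-off in sum(ps)

# hypothetical three-point distribution
xs, ps = [1, 2, 3], [0.2, 0.5, 0.3]
print(discrete_sample(xs, ps, 0.15))   # 1, since F(1) = 0.2 >= 0.15
print(discrete_sample(xs, ps, 0.65))   # 2, since F(2) = 0.7 >= 0.65
```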
Geometric Distribution
Suppose that a random variable X follows a
geometric distribution with parameter p, i.e.
X has the probability mass function

p(j) = p(1 − p)^j, j = 0, 1, 2, ...

Here xj = j. Let F(x) be the corresponding cumulative distribution function. Then, for any nonnegative integer x,

F(x) = ∑_{j=0}^x p(1 − p)^j = p(1 − (1 − p)^(x+1))/(1 − (1 − p)) = 1 − (1 − p)^(x+1)

Let uk be a random number; then the smallest i satisfying uk ≤ F(xi) satisfies

F(xi−1) = 1 − (1 − p)^i < uk ≤ 1 − (1 − p)^(i+1) = F(xi)


Hence,

(1 − p)^(i+1) ≤ 1 − uk < (1 − p)^i

(i + 1) ln(1 − p) ≤ ln(1 − uk) < i ln(1 − p)

and since ln(1 − p) < 0,

i + 1 ≥ ln(1 − uk)/ln(1 − p) > i,

which gives

ln(1 − uk)/ln(1 − p) − 1 ≤ i < ln(1 − uk)/ln(1 − p) (3)
So, using uk, we generate xi = i satisfying (3). For example, let p = 0.3 and uk = 0.58. Then, the xi = i generated by uk = 0.58 should satisfy

ln(1 − uk)/ln(1 − p) − 1 = 1.43 ≤ i < ln(1 − uk)/ln(1 − p) = 2.43,

which gives i = 2. Suppose that ur = 0.31; then the corresponding i should satisfy

ln(1 − ur)/ln(1 − p) − 1 = 0.04 ≤ i < ln(1 − ur)/ln(1 − p) = 1.04.

Hence, in this case, i = 1.
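Condition (3) amounts to taking a ceiling, which gives a one-line geometric sampler (a sketch; both checks below are the worked cases from the text):

```python
import math

def geometric_sample(p, u):
    """Smallest integer i satisfying (3), i.e. the inverse transform for
    the geometric mass function p(j) = p*(1 - p)**j, j = 0, 1, 2, ..."""
    return math.ceil(math.log(1 - u) / math.log(1 - p) - 1)

print(geometric_sample(0.3, 0.58))   # 2, as computed in the text
print(geometric_sample(0.3, 0.31))   # 1
```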
Convolution Method
The convolution method is applicable when a
random variable X can be expressed as

X = Y1 + ... + Yk , (4)
where Y1, ..., Yk are independent and identi-
cally distributed random variables that can be
generated more readily than the direct genera-
tion of X. In this case we first generate values
of Y1, ..., Yk and then obtain the desired value
of X using (4). For example, the k-Erlang random variable X with mean 1/λ is a sum of k independent exponentially distributed random variables Y1, ..., Yk, each with mean 1/(kλ). So, we first generate a value for each Yi using the formula −(1/(kλ)) ln ui, where u1, ..., uk are random numbers, and then compute a value of X as the sum

(−(1/(kλ)) ln u1) + ... + (−(1/(kλ)) ln uk)
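A sketch of the convolution method for the k-Erlang variable (the parameters k = 3 and λ = 2 are my own choices for illustration):

```python
import math, random

random.seed(0)   # fixed seed for a reproducible run

def erlang_sample(k, lam):
    """k-Erlang variate with mean 1/lam: the sum of k independent
    exponentials, each with mean 1/(k*lam) (convolution method)."""
    return sum(-math.log(1 - random.random()) / (k * lam) for _ in range(k))

xs = [erlang_sample(3, 2.0) for _ in range(100_000)]
print(sum(xs) / len(xs))   # near the theoretical mean 1/λ = 0.5
```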
Acceptance-Rejection Method

In order to illustrate the acceptance-rejection method, consider a Poisson random variable X with probability mass function

p(n) = (λ^n/n!) e^(−λ).

Recall that X can be viewed as the number of arrivals during one unit of time according to a Poisson process with interarrival times exponentially distributed with expected value 1/λ. In order to generate a value of X, using a sequence of random numbers u1, u2, ..., we compute the values −(1/λ) ln u1, −(1/λ) ln u2, ... until the following condition is satisfied:

∑_{i=1}^n (−(1/λ) ln ui) ≤ 1 < ∑_{i=1}^{n+1} (−(1/λ) ln ui)

Then, n is the desired value of X.

For example, consider the following sequence of random numbers: u1 = 0.3, u2 = 0.76, u3 = 0.61, u4 = 0.72, u5 = 0.94, u6 = 0.4, u7 = 0.08, u8 = 0.53, u9 = 0.43, u10 = 0.08. Let λ = 2. We have

−(1/λ) ln u1 = 0.601986402
−(1/λ) ln u2 = 0.137218423
−(1/λ) ln u3 = 0.247148161
−(1/λ) ln u4 = 0.164252033
−(1/λ) ln u5 = 0.030937702
−(1/λ) ln u6 = 0.458145366
−(1/λ) ln u7 = 1.262864322

Since

∑_{i=1}^3 (−(1/λ) ln ui) ≤ 1 < ∑_{i=1}^4 (−(1/λ) ln ui),

the first four random numbers generate the value 3. Similarly, because

∑_{i=5}^6 (−(1/λ) ln ui) ≤ 1 < ∑_{i=5}^7 (−(1/λ) ln ui),

the next three random numbers generate the value 2.
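The procedure can be sketched as a loop that consumes random numbers until the running sum of interarrival times exceeds one unit of time (the helper is my own; the iterator holds the ten numbers from the example):

```python
import math

def poisson_sample(lam, us):
    """Turn random numbers from the iterator us into exponential
    interarrival times -(1/lam)*ln(u); return the count n of arrivals
    whose times sum to at most one unit of time."""
    n, total = 0, 0.0
    for u in us:
        total += -math.log(u) / lam
        if total > 1:
            return n
        n += 1

us = iter([0.3, 0.76, 0.61, 0.72, 0.94, 0.4, 0.08, 0.53, 0.43, 0.08])
a = poisson_sample(2.0, us)   # consumes the first four numbers
b = poisson_sample(2.0, us)   # consumes the next three
print(a, b)                   # 3 and 2, as in the text
```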
