You are on page 1of 47

Test for Random Numbers

1. Frequency test. Uses the Kolmogorov-Smirnov or


the chi-square test to compare the distribution of the
set of numbers generated to a uniform distribution.
2. Runs test. Tests the runs up and down or the runs
above and below the mean by comparing the actual
values to expected values. The statistic for
comparison is the chi-square.
3. Autocorrelation test. Tests the correlation between
numbers and compares the sample correlation to the
expected correlation of zero.

Random Numbers 1
Test for Random Numbers
(cont.)
4. Gap test. Counts the number of digits that appear
between repetitions of a particular digit and then
uses the Kolmogorov-Smirnov test to compare
with the expected number of gaps.
5. Poker test. Treats numbers grouped together as a
poker hand. Then the hands obtained are
compared to what is expected using the chi-square
test.

Random Numbers 2
Test for Random Numbers
(cont.)
In testing for uniformity, the hypotheses are as
follows:
H0: Ri ~ U[0,1]
H1: Ri  U[0,1]
The null hypothesis, H0, reads that the numbers are
distributed uniformly on the interval [0,1].

Random Numbers 3
Test for Random Numbers
(cont.)
In testing for independence, the hypotheses are as
follows;
H0: Ri ~ independently
H1: Ri  independently
This null hypothesis, H0, reads that the numbers are
independent. Failure to reject the null hypothesis
means that no evidence of dependence has been
detected on the basis of this test. This does not imply
that further testing of the generator for independence
is unnecessary.

Random Numbers 4
Test for Random Numbers
(cont.)
Level of significance 
 = P(reject H0 | H0 true)
Frequently,  is set to 0.01 or 0.05
(Hypothesis)
Actually True Actually False
Accept 1 - 
(Type II error)
Reject  1 - 
(Type I error)
Random Numbers 5
FREQUENCY TEST
• The Kolmogrov –Smirnov test. This test
compares the continuous cdf, F(x), of the
uniform distribution to the empirical cdf,
SN(x), of the sample of N observations
Kolmogrov-Smirnov
Table
STEPS:
1.Rank the data from the smallest to largest.
2.Compute
 i 
D  max   Ri 
1i  N N
 
  i 1 
D  max  Ri  
1 i  N
 N 
3. Compute D = max (D+ , D-)
4. Determine the critical value, Dα .
5. If the sample statistic D is greater than the
critical value, Dα,the null hypothesis that the
data are a sample from a uniform
distribution is rejected.

Example:
Suppose that the five numbers 0.44, 0,81,
0,14, 0.05, 0,93 were generated. Perform a
test for uniformity using Kologrov-Smirnov
test with a level of significance α of 0.05
R(i) 0.05 0.14 0.44 0.81 0,93

i/N 0.20 0.40 0.60 0.80 1.00

i/N- R(i) 0.15 0.26 0.16 - 0.07

R- (i-1)/N 0.05 - 0.04 0.21 0.13

D+ = 0.26 and D- = 0.21. Therefore, D = 0.26. The critical


value for α = 0.o5 and N = 5 is 0.565. Since the computed
value, 0.26, is less than the tabulated critical value, 0.565, the
hypothesis of no difference between the distribution of the
generated numbers and the uniform distribution is not rejected
FREQUENCY TEST
• Chi-square Test. The chi square test uses
the sample statistic
2
n
(O  E )
2   i i

i 1 Ei

where Oi is the number in the ith class, Ei


is the expected number in the ith class, and
n is the number of classes. Where
N
Ei 
n
Example
Use the chi square test with α= 0.05 to test
whether the data shown below are uniformly
distributed
0.34 0.90 0.25 0.89 0.87 0.44 0.12 0.21 0.46 0.67
0.83 0.76 0.79 0.64 0.70 0.81 0.94 0.74 0.22 0.74
0.96 0.99 0.77 0.67 0.56 0.41 0.52 0.73 0.99 0.02
0.47 0.30 0.17 0.82 0.56 0.05 0.45 0.31 0.78 0.05
0.79 0.71 0.23 0.19 0.82 0.93 0.65 0.37 0.39 0.42
0.99 0.17 0.99 0.46 0.05 0.66 0.10 0.42 0.18 0.49
0.37 0.51 0.54 0.01 0.81 0.28 0.69 0.34 0.75 0.49
0.72 0.43 0.56 0.97 0.30 0.94 0.96 0.58 0.73 0.05
0.06 0.39 0.84 0.24 0.40 0.64 0.40 0.19 0.79 0.62
0.18 0.26 0.97 0.88 0.64 0.47 0.60 0.11 0.29 0.78
Solution:
Interval Oi Ei Oi-Ei (Oi-Ei)2 (Oi-Ei)2
Ei
0.00-0.09 8 10 -2 4 0.4
0.10-0.19 8 10 -2 4 0.4
0.20-0.29 10 10 0 0 0
0.30-0.39 9 10 -1 1 0.1
0.40-0.49 12 10 2 4 0.4
0.50-0.59 8 10 -2 4 0.4
0.60-0.69 10 10 0 0 0
0.70-0.79 14 10 4 16 1.6
0.80-0.89 10 10 0 0 0
0.90-0.99 11 10 1 1 0.1
The value of the test statistic χ2is 3.4. This is
compared with the critical value χ20.05,9=16.9
Since χ2 is much smaller than the tabulated,
the null hypothesis of no difference between
the sample distribution and the uniform
distribution is not rejected,
RUNS TEST
1. Runs up and runs down
2N 1 Where: N is the number of
a  sequence
3
Zo is the test
and statistic
2 16 N  29 a is the number of
  runs
90
a  a
Zo 
a
RUNS TEST(cont.)
Concerns:
 Number of runs
Length of runs
Note: If N is the number of numbers in a
sequence, the maximum number of runs is N-1,
and the minimum number of runs is one.
If “a” is the total number of runs in a sequence,
the mean and variance of “a” is given by

Random Numbers 16
RUNS TEST (cont.)

For N > 20, the distribution of “a” approximated


by a normal distribution, N(a ,  a2 ).
This approximation can be used to test the
independence of numbers from a generator.
Z0= (a - a) / a

Random Numbers 17
RUNS TEST (Cont.)
Substituting for aanda ==>
Za = {a - [(2N-1)/3]} / {(16N-29)/90},
where Z ~ N(0,1)
Acceptance region for hypothesis of
independence -Z Z0  Z
 / 2  / 2

-Z  / 2 Z  / 2
Random Numbers 18
Example
Based on runs up an down, determine whether the
following sequence of 40 numbers is such that
the hypothesis of independence can be rejected
where α = 0.05
0.41 0.68 0.89 0.94 0.74 0.91 0.55 0.71 0.36 0.30
0.09 0.72 0.86 0.08 0.54 0.02 0.11 0.29 0.16 0.18
0.88 0.91 0.95 0.69 0.09 0.38 0.23 0.32 0.91 0.53
0.31 0.42 0.73 0.12 0.74 0.45 0.13 0.47 0.58 0.29
Runs Test(cont.)
The sequence of runs up and down is as follows:
+ + + +  +  + +  + + + + + + + + + + + + 
There are 26 runs in this sequence. With N=40 and a=26,
a= {2(40) - 1} / 3 = 26.33 and
2
 a {16(40) - 29} / 90 = 6.79
=
Then,
Z0 = (26 - 26.33) / 
Now, the critical value is Z0.025 = 1.96, so the
independence of the numbers cannot be rejected on the
basis of this test.
Random Numbers 20
Runs Test

2. Runs above and below the mean.


2n1 n2
b   (1 / 2)
N
2 2n1n2 (2n1n2  N )
 
b
N 2 ( N  1)

n1 or n2 greater than 20, b, is approximately


normally distributed
b  b
Z0 
 b2
Example
Determine whether the following sequence of 40
numbers is an excessive number of runs above
and below the mean where α = 0.05
0.41 0.68 0.89 0.94 0.74 0.91 0.55 0.71 0.36 0.30
0.09 0.72 0.86 0.08 0.54 0.02 0.11 0.29 0.16 0.18
0.88 0.91 0.95 0.69 0.09 0.38 0.23 0.32 0.91 0.53
0.31 0.42 0.73 0.12 0.74 0.45 0.13 0.47 0.58 0.29
- + +++ ++ + - - -+ +- +- - - --
- - + + -- - - + + - - + - + - - + + -
n1 = 18 n2 = 22 N = 40 b = 17
2(18)(22) 1
b    20.3
40 2
2 2(18)(22)[(2)(18)(22)  40]
b  2
 9.54
(40) (40  1)

Since n2 is greater than 20, the normal


approximation is accpetable, resulting in a
Zo value of
17  20.3
Zo   1.07
9.54
Since Zo=1.96, the hypothesis of
independence is acceptable, resulting in a Zo
value of this test.
RUNS TEST (Cont.)
3) RUNS TEST: LENGTH OF RUNS
Let Yi be the number of runs of length i in a
sequence of N numbers. For an independent
sequence, the expected value of Yi for runs up
and down is given by
2
E (Yi )  [ N (i 2  3i  1)  (i 3  3i 2  i  4)], i  N  2
(i  3)!
2
E (Yi )  , i  N  1
N!
RUNS TEST (Cont.)
For runs and below the mean
The expected value of Yi is approximately
given by
Nwi
E (Yi )  , N  20
E(I )
where wi, the approximate probability than a run
has length i, is given
i by i
 n1   n2   n1  n2 
wi          , N  20
N  N   N  N 
Runs Test (Cont.)
And where E(I), the approximate expected
length of a run, is given by
 n1   n2 
E ( I )      , N  20
 n2   n1 

The approximate expected total number of


runs(of all lengths) in a sequence of length
N, E(A), is given by
N
E ( A)  , N  20
E(I )
The appropriate test is the chi-square test with
Oi being the observed number of runs of
length i. Then the test statistic is
L
[Oi  E (Yi )]2
 
2

i 1 E (Yi )

Where L = N-1 for runs up and down and L=N


for runs above and below the mean. If the
null hypothesis of independence is true t,
then χ2 is approximately chi-square
distributed with L-1 degrees of freedom
Runs Test (cont.)
Example: Given the ff sequence of numbers, can
the hypothesis that the numbers are independent
be rejected on the basis of the length of runs up
and down at alpha=0.05
0.30 0.48 0.36 0.01 0.54 0.34 0.96 0.06 0.61 0.85
0.48 0.86 0.14 0.86 0.89 0.37 0.49 0.60 0.04 0.83
0.42 0.83 0.37 0.21 0.90 0.89 0.91 0.79 0.57 0.99
0.95 0.27 0.41 0.81 0.96 0.31 0.09 0.06 0.23 0.77
0.73 0.47 0.13 0.55 0.11 0.75 0.36 0.25 0.23 0.72
0.60 0.84 0.70 0.30 0.26 0.38 0.05 0.19 0.73 0.44
For this sequence the +’s and –’s are as follows
+--+-+-++-+-++-++-+
-+--+-+--+--+++---++
---+-+---+-+---+-++-
Runs Test (cont.)
+--+-+-++-+-++-++-+
-+--+-+--+--+++---++
---+-+---+-+---+-++-
The length of runs in the sequence is as follows:
1,2,1,1,1,1,2,1,1,1,2,1,2,1,1,1,1,2,1,1
1,2,1,2,3,3,2,3,1,1,1,3,1,1,1,3,1,1,2,1
The number of observed runs of each length is
Run Length,i 1 2 3
Observed Runs, Oi 26 9 5
Runs Test (cont.)
The expected number of runs of lengths, one,
two, and three are computed from
2
E (Yi )  [ N (i 2  3i  1)  (i 3  3i 2  i  4)], i  N  2
(i  3)!
2
E (Yi )  , i  N  1
N!
2
E (Y1 )  [60(1  3  1)  (1  3  1  4)]  25.08
4!
2
E (Y2 )  [60(4  6  1)  (8  12  2  4)]  10.77
5!
2
E (Y3 )  [60(9  9  1)  (27  27  3  4)]  3.04
6!
Runs Test (cont.)
The mean total number of runs(up and down) is
2(60)  1
a   39.67
3

Thus, the E(Yi) for i=1,2 and 3 total 38.89.


Theerfore the length of 4 or more is 39.67-
38.89=0.78
Runs Test (cont.)
Run Length,i Observed Expected [Oi-E(Yi)]2 /
number of number of E(Yi)
Runs, Oi Runs, E(Yi)
1 26 25.08 0.03
2 9 14 10.77 14.59 0.02
≥3 5 3.82
Total 40 39.67 0.05

2
The critical value  is 3.84 (The degrees of
0.05,1
freedom equals the number of class intervals
minus one). Since  0 is less than the
2

critical value, the hypothesis of independence


cannot be rejected on the basis of this test
Runs Test (cont.)
Ex; Given the ff sequence of numbers, can the
hypothesis that the numbers are independent be
rejected on the basis of the length of runs above
and below the mean at α=0.05
0.30 0.48 0.36 0.01 0.54 0.34 0.96 0.06 0.61 0.85
0.48 0.86 0.14 0.86 0.89 0.37 0.49 0.60 0.04 0.83
0.42 0.83 0.37 0.21 0.90 0.89 0.91 0.79 0.57 0.99
0.95 0.27 0.41 0.81 0.96 0.31 0.09 0.06 0.23 0.77
0.73 0.47 0.13 0.55 0.11 0.75 0.36 0.25 0.23 0.72
0.60 0.84 0.70 0.30 0.26 0.38 0.05 0.19 0.73 0.44
For this sequence the +’s and –’s are as follows
----+-+-++-+-++--+-+
-+--+++++++--++----+
+--+-+---++++-----+-
Runs Test (cont.)
The are 28 values above the mean and 32 values
below the mean. The probabilities of runs of
various lengths wi are determined from
i i
 n1   n2   n1  n2 
wi          , N  20
 N   N   N  N 
Thus,
1 1
 28  32 28  32 
w1        0.498
 60  60 60  60 
2 2
 28  32 28  32 
w2        0.249
 60  60 60  60 
3 3
 28  32 28  32 
w3        0.125
 60  60 60  60 
Runs Test (cont.)
The Expected Length of run, E(I) is
 n1   n2 
E ( I )      
 n2   n1 
 28   32 
E(I )        2.02
 n32   28 
Runs Test (cont.)
Then, the expected number of runs of various
lengths E (Y )  Nwi , N  20
i
E(I )
60(0.498)
E (Y1 )   14.79
2.02
60(0.249)
E (Y2 )   7.40
2.02
60(0.125)
E (Y1 )   3.71
2.02
Runs Test (cont.)
The total number of runs expected is
E(A) = N/E(I) = 60/2.02 = 29.7
Run Length,i Observed Expected [Oi-E(Yi)]2 /
number of number of E(Yi)
Runs, Oi Runs, E(Yi)
1 17 14.79 0.33
2 9 7.40 0.35
3 1 6 3.71 7.51 0.30
≥4 5 3.80
Total 32 29.70 0.98
2

The critical value 0.05, 2 is 5.99(The degrees of
freedom equals the number of class intervals
2
minus one). Since  0 = 0.98 is less than the
critical value, the hypothesis of independence
cannot be rejected on the basis of this test
GAP TEST
 The Gap Test measures the number of digits between
successive occurrences of the same digit.
(Example) length of gaps associated with the digit 3.
4, 1, 3, 5, 1, 7, 2, 8, 2, 0, 7, 9, 1, 3, 5, 2, 7, 9, 4, 1, 6, 3
3, 9, 6, 3, 4, 8, 2, 3, 1, 9, 4, 4, 6, 8, 4, 1, 3, 8, 9, 5, 5, 7
3, 9, 5, 9, 8, 5, 3, 2, 2, 3, 7, 4, 7, 0, 3, 6, 3, 5, 9, 9, 5, 5
5, 0, 4, 6, 8, 0, 4, 7, 0, 3, 3, 0, 9, 5, 7, 9, 5, 1, 6, 6, 3, 8
8, 8, 9, 2, 9, 1, 8, 5, 4, 4, 5, 0, 2, 3, 9, 7, 1, 2, 0, 3, 6, 3
Note: eighteen 3’s in list
==> 17 gaps, the first gap is of length 10

Random Numbers 39
GAP TEST (cont.)
We are interested in the frequency of gaps.
P(gap of 10) = P(not 3) P(not 3) P(3) ,
note: there are 10 terms of the type P(not 3)
= (0.9)10 (0.1)
The theoretical frequency distribution for randomly
ordered digit is given
x
by
 (0.9)n = 1 - 0.9x+1
F(x) = 0.1n 0
Note: observed frequencies for all digits are
compared to the theoretical frequency using the
Kolmogorov-Smirnov test.
Random Numbers 40
GAP TEST(cont.)
(Example)
Based on the frequency with which gaps occur,
analyze the 110 digits above to test whether they are
independent. Use = 0.05. The number of gaps is
given by the number of digits minus 10, or 100. The
number of gaps associated with the various digits are
as follows:
Digit 0 1 2 3 4 5 6 7 8 9
# of Gaps 7 8 8 17 10 13 7 8 9 13

Random Numbers 41
GAP TEST (cont.)
Gap Test Example
Relative Cum. Relative
Gap Length Frequency Frequency Frequency F(x) |F(x) - S N(x)|
0-3 35 0.35 0.35 0.3439 0.0061
4-7 22 0.22 0.57 0.5695 0.0005
8-11 17 0.17 0.74 0.7176 0.0224
12-15 9 0.09 0.83 0.8147 0.0153
16-19 5 0.05 0.88 0.8784 0.0016
20-23 6 0.06 0.94 0.9202 0.0198
24-27 3 0.03 0.97 0.9497 0.0223
28-31 0 0.00 0.97 0.9657 0.0043
32-35 0 0.00 0.97 0.9775 0.0075
36-39 2 0.02 0.99 0.9852 0.0043
40-43 0 0.00 0.99 0.9903 0.0003
44-47 1 0.01 1.00 0.9936 0.0064

Random Numbers 42
GAP TEST(cont.)
The critical value of D is given by
D0.05 = 1.36 / 100 = 0.136
Since D = max |F(x) - SN(x)| = 0.0224 is less
than D0.05, do not reject the hypothesis of
independence on the basis of this test.

Random Numbers 43
POKER TEST
 Poker Test - based on the frequency with which
certain digits are repeated.
Example:
0.255 0.577 0.331 0.414 0.828 0.909
Note: a pair of like digits appear in each number
generated.

Random Numbers 44
Test for Random Numbers
(cont.)
In 3-digit numbers, there are only 3 possibilities.
P(3 different digits) =
(2nd diff. from 1st) * P(3rd diff. from 1st & 2nd)
= (0.9) (0.8) = 0.72
P(3 like digits) =
(2nd digit same as 1st)  P(3rd digit same as 1st)
= (0.1) (0.1) = 0.01
P(exactly one pair) = 1 - 0.72 - 0.01 = 0.27

Random Numbers 45
Test for Random Numbers
(cont.)
(Example)
A sequence of 1000 three-digit numbers has been
generated and an analysis indicates that 680 have
three different digits, 289 contain exactly one pair
of like digits, and 31 contain three like digits.
Based on the poker test, are these numbers
independent?
Let  = 0.05.
The test is summarized in next table.

Random Numbers 46
Test for Random Numbers
(cont.)
Observed Expected (Oi - Ei)2
Combination, Frequency, Frequency, -----------
i Oi Ei Ei
Three different digits 680 720 2.24
Three like digits 31 10 44.10
Exactly one pair 289 270 1.33
1000 1000 47.65

The appropriate degrees of freedom are one less than


the number of class intervals. Since 20.05, 2 = 5.99 <
47.65, the independence of the numbers is rejected
on the basis of this test.

Random Numbers 47

You might also like