ENGINEERING ANALYSIS II

ANALYSIS AND SYNTHESIS OF RANDOM VARIABLES

GROUP 7:

1. MUKHLIS ALI (23116014)


2. BUKHAEROTUL BRILLIANTORO (23116002)
3. HUY VISAL (23116708)

DEPARTMENT OF MECHANICAL ENGINEERING
FACULTY OF MECHANICAL AND AEROSPACE ENGINEERING
BANDUNG INSTITUTE OF TECHNOLOGY
2017

A. PROJECT
In this assignment, we compare the synthesis and analysis of Gaussian random variables in Matlab while varying:
1. Number of data (N)
2. Number of bins (N-bin)
3. Variance (using normrnd)
4. Generation using the Box-Muller formula
5. Generation using equal-probability bins

B. SYNTHESIS OF RANDOM VARIABLE


1. Number of Data (N) Variation
In this case we vary the number of data (100; 1,000; 10,000; and 100,000 data). All random variables have zero mean and unit variance, which we obtain in Matlab with “randn(N,1)”.

1.1. Gaussian random variable (100,1)
Generate a Gaussian random variable with 100 data and make a histogram with 10 bins.
>> x=randn(100,1);
>> hist(x)
[Figure: histogram of x, N = 100, 10 bins]

Evaluating the expected frequency (fe) of the random variable in each bin:

>> fe1=normcdf(-2.29)*100;

>> fe2=(normcdf(-1.64)-normcdf(-2.29))*100;
>> fe3=(normcdf(-0.987)-normcdf(-1.64))*100;
>> fe4=(normcdf(-0.335)-normcdf(-0.987))*100;
>> fe5=(normcdf(0.317)-normcdf(-0.335))*100;
>> fe6=(normcdf(0.969)-normcdf(0.317))*100;
>> fe7=(normcdf(1.62)-normcdf(0.969))*100;
>> fe8=(normcdf(2.27)-normcdf(1.62))*100;
>> fe9=(normcdf(2.93)-normcdf(2.27))*100;
>> fe10=(1-normcdf(2.93))*100;
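
The ten statements above can also be written as one vectorized expression. The following is only a sketch: the variable names `edges`, `N`, and `fe` are our own, and the edges are the ones read from the histogram of section 1.1. Because normcdf(-Inf) is 0 and normcdf(Inf) is 1, the two open-ended outer bins need no special case.

```matlab
% Bin edges read from the histogram in section 1.1; the outer bins are open-ended.
edges = [-Inf -2.29 -1.64 -0.987 -0.335 0.317 0.969 1.62 2.27 2.93 Inf];
N = 100;                          % number of data

% diff of the CDF at consecutive edges gives the probability of each bin.
fe = N * diff(normcdf(edges));    % 1-by-10 vector of expected frequencies

disp(round(fe))                   % rounded values, comparable with the fe column
```

The rounded values may differ by one count in a few bins from a hand-rounded table, since rounding each bin separately does not always preserve the total of N.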
The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.
Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -2.29] 1 1 0.0000
2 (-2.29, -1.64] 4 4 0.0000
3 (-1.64, -0.987] 14 11 0.8182
4 (-0.987, -0.335] 13 21 3.0476
5 (-0.335, 0.317] 25 26 0.0385
6 (0.317, 0.969] 22 21 0.0476
7 (0.969, 1.62] 14 11 0.8182
8 (1.62, 2.27] 2 3 0.3333
9 (2.27, 2.93] 3 1 4.0000
10 (2.93, ∞) 2 1 1.0000
Total 100 100 10.1034
So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 10.1034. Using a confidence level F(z) of 0.95 and k − 3 = 7 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,7} = 14.07$.

$$\chi^2 < \chi^2_{5\%,7}$$

Thus, the random variable is acceptable.
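
As a cross-check, the table arithmetic of section 1.1 can be repeated in a few lines. This is a sketch under the assumption that the rounded fo and fe columns of the table are used as they stand. The names `fo`, `fe`, `chi2`, `dof`, and `crit` are our own, and chi2inv (Statistics Toolbox) stands in for the printed table of appendix A.

```matlab
fo = [1 4 14 13 25 22 14 2 3 2];     % observed frequencies from the table
fe = [1 4 11 21 26 21 11 3 1 1];     % rounded expected frequencies from the table

chi2 = sum((fo - fe).^2 ./ fe);      % chi-square statistic, summed over the bins
dof  = numel(fo) - 3;                % k - 3, the convention used in this report
crit = chi2inv(0.95, dof);           % critical value at the 5% level

fprintf('chi2 = %.4f, critical value = %.2f\n', chi2, crit);
if chi2 < crit
    disp('acceptable');
else
    disp('not acceptable');
end
```

The same lines can be reused for the other tables by replacing `fo` and `fe`.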

1.2. Gaussian random variable (1000,1)
Generate a Gaussian random variable with 1,000 data and make a histogram with 10 bins.
>> x=randn(1000,1);
>> hist(x)
[Figure: histogram of x, N = 1,000, 10 bins]

Evaluating the expected frequency (fe) of the random variable in each bin:

>> fe1=normcdf(-2.56)*1000;
>> fe2=(normcdf(-1.88)-normcdf(-2.56))*1000;
>> fe3=(normcdf(-1.2)-normcdf(-1.88))*1000;
>> fe4=(normcdf(-0.529)-normcdf(-1.2))*1000;
>> fe5=(normcdf(0.147)-normcdf(-0.529))*1000;
>> fe6=(normcdf(0.823)-normcdf(0.147))*1000;
>> fe7=(normcdf(1.5)-normcdf(0.823))*1000;
>> fe8=(normcdf(2.17)-normcdf(1.5))*1000;
>> fe9=(normcdf(2.85)-normcdf(2.17))*1000;
>> fe10=(1-normcdf(2.85))*1000;

The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.

Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -2.56] 4 5 0.2000
2 (-2.56, -1.88] 27 25 0.1600
3 (-1.88, -1.2] 84 85 0.0118
4 (-1.2, -0.529] 188 183 0.1366
5 (-0.529, 0.147] 268 260 0.2462
6 (0.147,0.823 ] 240 236 0.0678
7 (0.823,1.5] 128 138 0.7246
8 (1.5, 2.17] 45 52 0.9423
9 (2.17, 2.85] 14 13 0.0769
10 (2.85, ∞) 2 3 0.3333
Total 1000 1000 2.8995

So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 2.8995. Using a confidence level F(z) of 0.95 and k − 3 = 7 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,7} = 14.07$.

$$\chi^2 < \chi^2_{5\%,7}$$

Thus, the random variable is acceptable.

1.3. Gaussian random variable (10000,1)
Generate a Gaussian random variable with 10,000 data and make a histogram with 10 bins.
>> x=randn(10000,1);
>> hist(x)

[Figure: histogram of x, N = 10,000, 10 bins]

Evaluating the expected frequency (fe) of the random variable in each bin:

>> fe1=normcdf(-3.01)*10000;
>> fe2=(normcdf(-2.28)-normcdf(-3.01))*10000;
>> fe3=(normcdf(-1.55)-normcdf(-2.28))*10000;
>> fe4=(normcdf(-0.817)-normcdf(-1.55))*10000;
>> fe5=(normcdf(-0.0862)-normcdf(-0.817))*10000;
>> fe6=(normcdf(0.645)-normcdf(-0.0862))*10000;
>> fe7=(normcdf(1.38)-normcdf(0.645))*10000;
>> fe8=(normcdf(2.11)-normcdf(1.38))*10000;
>> fe9=(normcdf(2.84)-normcdf(2.11))*10000;
>> fe10=(1-normcdf(2.84))*10000;
The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.

Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -3.01] 10 13 0.6923
2 (-3.01, -2.28] 74 100 6.7600
3 (-2.28, -1.55] 517 493 1.1684
4 (-1.55, -0.817] 1430 1464 0.7896
5 (-0.817, -0.0862] 2600 2587 0.0653
6 (-0.0862,0.645] 2820 2749 1.8338
7 (0.645,1.38] 1740 1757 0.1645
8 (1.38, 2.11] 639 664 0.9413
9 (2.11, 2.84] 149 152 0.0592
10 (2.84, ∞) 21 21 0.0000
Total 10000 10000 12.4743

So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 12.4743. Using a confidence level F(z) of 0.95 and k − 3 = 7 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,7} = 14.07$.

$$\chi^2 < \chi^2_{5\%,7}$$

Thus, the random variable is acceptable.

1.4. Gaussian random variable (100000,1)
Generate a Gaussian random variable with 100,000 data and make a histogram with 10 bins.
>> x=randn(100000,1);
>> hist(x)
[Figure: histogram of x, N = 100,000, 10 bins]

Evaluating the expected frequency (fe) of the random variable in each bin:

>> fe1=normcdf(-3.56)*100000;
>> fe2=(normcdf(-2.68)-normcdf(-3.56))*100000;
>> fe3=(normcdf(-1.8)-normcdf(-2.68))*100000;
>> fe4=(normcdf(-0.924)-normcdf(-1.8))*100000;
>> fe5=(normcdf(-0.0467)-normcdf(-0.924))*100000;
>> fe6=(normcdf(0.831)-normcdf(-0.0467))*100000;
>> fe7=(normcdf(1.71)-normcdf(0.831))*100000;
>> fe8=(normcdf(2.59)-normcdf(1.71))*100000;
>> fe9=(normcdf(3.46)-normcdf(2.59))*100000;
>> fe10=(1-normcdf(3.46))*100000;
The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.

Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -3.56] 19 19 0.0000
2 (-3.56, -2.68] 356 350 0.1029
3 (-2.68, -1.8] 3250 3225 0.1938
4 (-1.8, -0.924] 14000 14181 2.3102
5 (-0.924, -0.0467] 30700 30363 3.7404
6 (-0.0467,0.831] 31300 31564 2.2081
7 (0.831,1.71] 15900 15935 0.0769
8 (1.71, 2.59] 4000 3883 3.5254
9 (2.59, 3.46] 449 453 0.0353
10 (3.46, ∞) 26 27 0.0370
Total 100000 100000 12.2299

So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 12.2299. Using a confidence level F(z) of 0.95 and k − 3 = 7 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,7} = 14.07$.

$$\chi^2 < \chi^2_{5\%,7}$$

Thus, the random variable is acceptable.

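Analysis:
From the variation of the number of data, the chi-square value does not change monotonically with N: it is 10.10, 2.90, 12.47, and 12.23 for 100; 1,000; 10,000; and 100,000 data. The larger samples suggest that chi-square may increase with the number of data, but one sample per N is not enough to confirm a trend. With 10 bins and up to 100,000 data, every chi-square value stays below the critical value from the table (14.07), so all four data sets are still accepted as Gaussian random variables.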
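
One way to examine whether the chi-square value really depends on the number of data is to repeat each experiment many times and compare the averages. The sketch below is only an illustration and not part of the original measurements. The seed, the number of repetitions `nTrials`, and all variable names are assumptions, and the bins are built as equal-width bins between the smallest and largest sample with the two outer bins opened to ±Inf, which mimics how the histograms above were read.

```matlab
rng(0);                                % fixed seed (assumption, for repeatability)
Ns      = [100 1000 10000 100000];     % numbers of data used in this section
nTrials = 200;                         % hypothetical number of repetitions per N
k       = 10;                          % number of bins

meanChi2 = zeros(size(Ns));
for j = 1:numel(Ns)
    chi2 = zeros(nTrials, 1);
    for t = 1:nTrials
        x = randn(Ns(j), 1);
        edges = linspace(min(x), max(x), k + 1);   % equal-width bins over the data range
        edges([1 end]) = [-Inf Inf];               % open the two outer bins
        fo = histcounts(x, edges);                 % observed frequencies
        fe = Ns(j) * diff(normcdf(edges));         % expected frequencies
        chi2(t) = sum((fo - fe).^2 ./ fe);
    end
    meanChi2(j) = mean(chi2);                      % average statistic for this N
end

disp([Ns; meanChi2])                   % first row: N, second row: mean chi-square
```

Averaging over repetitions separates a systematic dependence on N from the scatter of a single sample.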

2. Number of Bins (x,N) Variation
Here we keep the number of data constant at 10,000 and vary the number of bins over 12, 15, and 18.

2.1. Gaussian random variable (10000,1) with 12 bins
Generate a Gaussian random variable with 10,000 data and make a histogram with 12 bins.
>> x=randn(10000,1);
>> hist(x,12)
[Figure: histogram of x, N = 10,000, 12 bins]

Evaluating the expected frequency (fe) of the random variable in each bin:

>> fe1=normcdf(-3.13)*10000;
>> fe2=(normcdf(-2.52)-normcdf(-3.13))*10000;
>> fe3=(normcdf(-1.91)-normcdf(-2.52))*10000;
>> fe4=(normcdf(-1.3)-normcdf(-1.91))*10000;
>> fe5=(normcdf(-0.692)-normcdf(-1.3))*10000;
>> fe6=(normcdf(-0.0819)-normcdf(-0.692))*10000;
>> fe7=(normcdf(0.528)-normcdf(-0.0819))*10000;
>> fe8=(normcdf(1.14)-normcdf(0.528))*10000;
>> fe9=(normcdf(1.75)-normcdf(1.14))*10000;

>> fe10=(normcdf(2.36)-normcdf(1.75))*10000;
>> fe11=(normcdf(2.97)-normcdf(2.36))*10000;
>> fe12=(1-normcdf(2.97))*10000;
The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.
Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -3.13] 5 9 1.7778
2 (-3.13, -2.52] 36 50 3.9200
3 (-2.52, -1.91] 224 222 0.0180
4 (-1.91, -1.3] 691 687 0.0233
5 ( -1.3, -0.692] 1460 1477 0.1957
6 (-0.692,-0.0819] 2240 2229 0.0543
7 (-0.0819,0.528] 2401 2339 1.6434
8 (0.528,1.14] 1700 1716 0.1492
9 (1.14, 1.75] 834 871 1.5718
10 (1.75, 2.36] 309 309 0.0000
11 (2.36, 2.97] 80 76 0.2105
12 (2.97, ∞) 20 15 1.6667
Total 10000 10000 11.2306

So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 11.2306. Using a confidence level F(z) of 0.95 and k − 3 = 9 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,9} = 16.92$.

$$\chi^2 < \chi^2_{5\%,9}$$

Thus, the random variable is acceptable.

2.2. Gaussian random variable (10000,1) with 15 bins
Generate a Gaussian random variable with 10,000 data and make a histogram with 15 bins.
>> x=randn(10000,1);
>> hist(x,15)
[Figure: histogram of x, N = 10,000, 15 bins]

Evaluating the expected frequency (fe) of the random variable in each bin:

>> fe1=normcdf(-3.38)*10000;
>> fe2=(normcdf(-2.85)-normcdf(-3.38))*10000;
>> fe3=(normcdf(-2.31)-normcdf(-2.85))*10000;
>> fe4=(normcdf(-1.78)-normcdf(-2.31))*10000;
>> fe5=(normcdf(-1.25)-normcdf(-1.78))*10000;
>> fe6=(normcdf(-0.713)-normcdf(-1.25))*10000;
>> fe7=(normcdf(-0.18)-normcdf(-0.713))*10000;
>> fe8=(normcdf(0.354)-normcdf(-0.18))*10000;
>> fe9=(normcdf(0.888)-normcdf(0.354))*10000;
>> fe10=(normcdf(1.42)-normcdf(0.888))*10000;
>> fe11=(normcdf(1.96)-normcdf(1.42))*10000;
>> fe12=(normcdf(2.49)-normcdf(1.96))*10000;
>> fe13=(normcdf(3.02)-normcdf(2.49))*10000;

>> fe14=(normcdf(3.56)-normcdf(3.02))*10000;
>> fe15=(1-normcdf(3.56))*10000;
The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.
Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -3.38] 2 4 1.0000
2 (-3.38, -2.85] 11 18 2.7222
3 (-2.85, -2.31] 87 83 0.1928
4 (-2.31, -1.78] 270 271 0.0037
5 (-1.78, -1.25] 671 681 0.1468
6 ( -1.25,-0.713] 1330 1323 0.0370
7 (-0.713,-0.18] 1960 1907 1.4730
8 (-0.18,0.354] 2051 2098 1.0529
9 (0.354, 0.888] 1740 1744 0.0092
10 ( 0.888, 1.42] 1160 1095 3.8584
11 (1.42, 1.96] 479 528 4.5473
12 (1.96, 2.49] 184 186 0.0215
13 (2.49,3.02] 40 51 2.3725
14 (3.02, 3.56] 13 9 1.7778
15 (3.56, ∞) 2 2 0.0000
Total 10000 10000 19.2153

So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 19.2153. Using a confidence level F(z) of 0.95 and k − 3 = 12 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,12} = 21.03$.

$$\chi^2 < \chi^2_{5\%,12}$$

Thus, the random variable is acceptable.

2.3. Gaussian random variable (10000,1) with 18 bins
Generate a Gaussian random variable with 10,000 data and make a histogram with 18 bins.
>> x=randn(10000,1);
>> hist(x,18)
[Figure: histogram of x, N = 10,000, 18 bins]

Evaluating the expected frequency (fe) of the random variable in each bin:

>> fe1=normcdf(-3.13)*10000;
>> fe2=(normcdf(-2.71)-normcdf(-3.13))*10000;
>> fe3=(normcdf(-2.29)-normcdf(-2.71))*10000;
>> fe4=(normcdf(-1.87)-normcdf(-2.29))*10000;
>> fe5=(normcdf(-1.45)-normcdf(-1.87))*10000;
>> fe6=(normcdf(-1.03)-normcdf(-1.45))*10000;
>> fe7=(normcdf(-0.605)-normcdf(-1.03))*10000;
>> fe8=(normcdf(-0.183)-normcdf(-0.605))*10000;
>> fe9=(normcdf(0.238)-normcdf(-0.183))*10000;
>> fe10=(normcdf(0.66)-normcdf(0.238))*10000;
>> fe11=(normcdf(1.08)-normcdf(0.66))*10000;
>> fe12=(normcdf(1.5)-normcdf(1.08))*10000;
>> fe13=(normcdf(1.92)-normcdf(1.5))*10000;

>> fe14=(normcdf(2.35)-normcdf(1.92))*10000;
>> fe15=(normcdf(2.77)-normcdf(2.35))*10000;
>> fe16=(normcdf(3.19)-normcdf(2.77))*10000;
>> fe17=(normcdf(3.61)-normcdf(3.19))*10000;
>> fe18=(1-normcdf(3.61))*10000;
The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.
Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -3.13] 6 9 1.0000
2 (-3.13, -2.71] 23 25 0.1600
3 (-2.71, -2.29] 73 76 0.1184
4 (-2.29, -1.87] 201 197 0.0812
5 (-1.87, -1.45] 434 428 0.0841
6 (-1.45, -1.03] 812 780 1.3128
7 (-1.03, -0.605] 1190 1211 0.3642
8 (-0.605, -0.183] 1580 1548 0.6615
9 (-0.183, 0.238] 1616 1667 1.5603
10 (0.238, 0.66] 1530 1513 0.1910
11 (0.66, 1.08] 1130 1146 0.2234
12 (1.08, 1.5] 724 733 0.1105
13 (1.5, 1.92] 425 394 2.4391
14 (1.92, 2.35] 173 180 0.2722
15 (2.35, 2.77] 49 66 4.3788
16 (2.77, 3.19] 20 21 0.0476
17 (3.19, 3.61] 8 4 4.0000
18 (3.61, ∞) 6 2 8.0000
Total 10000 10000 25.0051

So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

Summing the last column of the table gives $\chi^2$ = 25.0051. Using a confidence level F(z) of 0.95 and k − 3 = 15 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,15} = 25.00$.

$$\chi^2 \approx \chi^2_{5\%,15}$$

The statistic is essentially equal to the critical value, so the 18-bin result is borderline rather than clearly acceptable. Bins 17 and 18 contribute 12.0 of the total although their expected frequencies (4 and 2) are very small, so the outcome is sensitive to these two tail bins.

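Analysis:
From the variation of the number of bins, the chi-square value changes with the number of bins, but so does the critical value, because the degrees of freedom are k − 3 (9, 12, and 15). Relative to those critical values the three cases are comparable, possibly because the numbers of bins that we compare are close to each other. The 12- and 15-bin cases are below their critical values. The 18-bin case is the most sensitive one, because its two upper tail bins have expected frequencies below 5 and dominate its chi-square value.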
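
Because the critical value depends on the number of bins, it helps to list the thresholds side by side. The sketch below assumes the k − 3 rule used throughout this report and uses chi2inv (Statistics Toolbox) in place of the printed table; the variable names are our own.

```matlab
k    = [12 15 18];               % numbers of bins used in this section
dof  = k - 3;                    % degrees of freedom, following this report
crit = chi2inv(0.95, dof);       % critical values at the 5% level

% fprintf walks through the matrix column by column: k, dof, crit for each case.
fprintf('%2d bins: dof = %2d, critical value = %.2f\n', [k; dof; crit]);
```

The expected value of a chi-square variable equals its degrees of freedom, so a larger statistic with more bins is not by itself a sign of a worse fit.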

3. Gaussian Random Variable using Normrnd (μ, σ, N, 1)
In this case we generate Gaussian random variables with a mean of 7 and vary the standard deviation (σ) over 1, 4, and 7, then compare the results. All random variables have 10000 data and a 10-bin histogram.

3.1 Gaussian random variable (7,1,10000,1)
Generate a Gaussian random variable with a mean of 7 and a standard deviation of 1.
>> y=normrnd(7,1,10000,1);
>> hist(y)
>> mean(y)
ans =
6.9999
>> std(y)
ans =
0.9862

[Figure: histogram of y, mean 7, standard deviation 1, N = 10000, 10 bins]

Evaluating the expected frequency (fe) of the random variable using the relation

$$F_X(x) = F_Y\left(\frac{x - a_x}{\sigma_x}\right)$$

where $F_Y$ is the standard normal CDF, and $a_x$ and $\sigma_x$ are the sample mean and standard deviation computed above.

>> fe1=normcdf((4.4-6.9999)/0.9862)*10000;
>> fe2=(normcdf((5.16-6.9999)/0.9862)-normcdf((4.4-6.9999)/0.9862))*10000;
>> fe3=(normcdf((5.93-6.9999)/0.9862)-normcdf((5.16-6.9999)/0.9862))*10000;
>> fe4=(normcdf((6.69-6.9999)/0.9862)-normcdf((5.93-6.9999)/0.9862))*10000;
>> fe5=(normcdf((7.46-6.9999)/0.9862)-normcdf((6.69-6.9999)/0.9862))*10000;
>> fe6=(normcdf((8.23-6.9999)/0.9862)-normcdf((7.46-6.9999)/0.9862))*10000;
>> fe7=(normcdf((8.99-6.9999)/0.9862)-normcdf((8.23-6.9999)/0.9862))*10000;
>> fe8=(normcdf((9.76-6.9999)/0.9862)-normcdf((8.99-6.9999)/0.9862))*10000;
>> fe9=(normcdf((10.5-6.9999)/0.9862)-normcdf((9.76-6.9999)/0.9862))*10000;
>> fe10=(1-normcdf((10.5-6.9999)/0.9862))*10000;
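
The standardization in the relation above can also be left to normcdf itself, which accepts the mean and standard deviation as its second and third arguments. The sketch below uses the sample mean, sample standard deviation, and bin edges reported in section 3.1; the variable names are our own and both forms are shown only to illustrate that they agree.

```matlab
mu    = 6.9999;      % sample mean reported in section 3.1
sigma = 0.9862;      % sample standard deviation reported in section 3.1
N     = 10000;       % number of data
edges = [-Inf 4.4 5.16 5.93 6.69 7.46 8.23 8.99 9.76 10.5 Inf];

fe_std    = N * diff(normcdf((edges - mu) / sigma));   % standardize by hand
fe_direct = N * diff(normcdf(edges, mu, sigma));       % let normcdf standardize

disp(round(fe_direct))                                 % comparable with the fe column
```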

The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.
Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, 4.4] 41 42 0.0238
2 (4.4, 5.16] 270 269 0.0037
3 (5.16, 5.93] 1102 1079 0.4903
4 (5.93, 6.69] 2382 2377 0.0105
5 ( 6.69,7.46] 2982 3029 0.7293
6 (7.46,8.23] 2172 2143 0.3924
7 (8.23,8.99] 838 843 0.0297
8 (8.99,9.76] 188 192 0.0833
9 (9.76, 10.5] 24 24 0.0000
10 (10.5, ∞) 1 2 0.5000
Total 10000 10000 2.2630
So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 2.2630. Using a confidence level F(z) of 0.95 and k − 3 = 7 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,7} = 14.07$.

$$\chi^2 < \chi^2_{5\%,7}$$

Thus, the random variable is acceptable.

3.2 Gaussian random variable (7,4,10000,1)
Generate a Gaussian random variable with a mean of 7 and a standard deviation of 4.
>> y=normrnd(7,4,10000,1);
>> hist(y)
>> mean(y)
ans =
6.9576
>> std(y)
ans =
4.0472

[Figure: histogram of y, mean 7, standard deviation 4, N = 10000, 10 bins]

Evaluating the expected frequency (fe) of the random variable using the relation

$$F_X(x) = F_Y\left(\frac{x - a_x}{\sigma_x}\right)$$

where $F_Y$ is the standard normal CDF, and $a_x$ and $\sigma_x$ are the sample mean and standard deviation computed above.
>> fe1=normcdf((-6.74-6.9576)/4.0472)*10000;
>> fe2=(normcdf((-3.48-6.9576)/4.0472)-normcdf((-6.74-6.9576)/4.0472))*10000;
>> fe3=(normcdf((-0.223-6.9576)/4.0472)-normcdf((-3.48-6.9576)/4.0472))*10000;
>> fe4=(normcdf((3.03-6.9576)/4.0472)-normcdf((-0.223-6.9576)/4.0472))*10000;
>> fe5=(normcdf((6.29-6.9576)/4.0472)-normcdf((3.03-6.9576)/4.0472))*10000;

>> fe6=(normcdf((9.55-6.9576)/4.0472)-normcdf((6.29-6.9576)/4.0472))*10000;
>> fe7=(normcdf((12.8-6.9576)/4.0472)-normcdf((9.55-6.9576)/4.0472))*10000;
>> fe8=(normcdf((16.1-6.9576)/4.0472)-normcdf((12.8-6.9576)/4.0472))*10000;
>> fe9=(normcdf((19.3-6.9576)/4.0472)-normcdf((16.1-6.9576)/4.0472))*10000;
>> fe10=(1-normcdf((19.3-6.9576)/4.0472))*10000;

The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.
Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -6.74] 6 4 1.0000
2 (-6.74, -3.48] 38 46 1.3913
3 (-3.48, -0.223] 337 331 0.1088
4 (-0.223, 3.03] 1290 1279 0.0946
5 (3.03,6.29] 2618 2686 1.7215
6 (6.29,9.55] 3148 3046 3.4156
7 (9.55,12.8] 1818 1865 1.1845
8 (12.8,16.1] 615 625 0.1600
9 (16.1, 19.3] 116 108 0.5926
10 (19.3, ∞) 14 10 1.6000
Total 10000 10000 11.2689

So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 11.2689. Using a confidence level F(z) of 0.95 and k − 3 = 7 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,7} = 14.07$.

$$\chi^2 < \chi^2_{5\%,7}$$

Thus, the random variable is acceptable.

3.3 Gaussian random variable (7,7,10000,1)
Generate a Gaussian random variable with a mean of 7 and a standard deviation of 7.
>> y=normrnd(7,7,10000,1);
>> hist(y)
>> mean(y)
ans =
7.0354
>> std(y)
ans =
6.8864
[Figure: histogram of y, mean 7, standard deviation 7, N = 10000, 10 bins]

Evaluating the expected frequency (fe) of the random variable using the relation

$$F_X(x) = F_Y\left(\frac{x - a_x}{\sigma_x}\right)$$

where $F_Y$ is the standard normal CDF, and $a_x$ and $\sigma_x$ are the sample mean and standard deviation computed above.

>> fe1=normcdf((-13.6-7.0354)/6.8864)*10000;
>> fe2=(normcdf((-8.15-7.0354)/6.8864)-normcdf((-13.6-7.0354)/6.8864))*10000;
>> fe3=(normcdf((-2.66-7.0354)/6.8864)-normcdf((-8.15-7.0354)/6.8864))*10000;
>> fe4=(normcdf((2.83-7.0354)/6.8864)-normcdf((-2.66-7.0354)/6.8864))*10000;
>> fe5=(normcdf((8.32-7.0354)/6.8864)-normcdf((2.83-7.0354)/6.8864))*10000;

>> fe6=(normcdf((13.8-7.0354)/6.8864)-normcdf((8.32-7.0354)/6.8864))*10000;
>> fe7=(normcdf((19.3-7.0354)/6.8864)-normcdf((13.8-7.0354)/6.8864))*10000;
>> fe8=(normcdf((24.8-7.0354)/6.8864)-normcdf((19.3-7.0354)/6.8864))*10000;
>> fe9=(normcdf((30.3-7.0354)/6.8864)-normcdf((24.8-7.0354)/6.8864))*10000;
>> fe10=(1-normcdf((30.3-7.0354)/6.8864))*10000;

The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.
Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -13.6] 18 14 1.1429
2 (-13.6, -8.15] 130 124 0.2903
3 (-8.15, -2.66] 636 659 0.8027
4 (-2.66, 2.83] 1888 1911 0.2768
5 (2.83,8.32] 3058 3033 0.2061
6 (8.32,13.8] 2678 2630 0.8760
7 (13.8,19.3] 1220 1255 0.9761
8 (19.3,24.8] 322 325 0.0277
9 (24.8, 30.3] 45 46 0.0217
10 (30.3, ∞) 5 3 1.3333
Total 10000 10000 5.9537

So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 5.9537. Using a confidence level F(z) of 0.95 and k − 3 = 7 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,7} = 14.07$.

$$\chi^2 < \chi^2_{5\%,7}$$

Thus, the random variable is acceptable.

Analysis:
From the variation of the variance, the chi-square value does not follow the increase of the standard deviation: it is 2.26, 11.27, and 5.95 for σ = 1, 4, and 7, so it goes up and down over a wide range. All three values are still below the critical value from the table (14.07), which means all three data sets are accepted as Gaussian random variables.
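
A possible way to check whether the standard deviation itself influences chi-square is to reuse one standard normal sample and only rescale it. The sketch below is an illustration under our own assumptions (fixed seed, equal-width bins between the smallest and largest sample with the outer bins opened to ±Inf, expected frequencies from the sample mean and standard deviation). It is not the procedure used for the tables above, where each σ had its own independent sample.

```matlab
rng(1);                               % fixed seed (assumption, for repeatability)
N = 10000;                            % number of data
k = 10;                               % number of bins
z = randn(N, 1);                      % one standard normal sample, reused below
sigmas = [1 4 7];                     % standard deviations used in this section

chi2 = zeros(size(sigmas));
for j = 1:numel(sigmas)
    y = 7 + sigmas(j) * z;            % same sample, rescaled to mean 7
    edges = linspace(min(y), max(y), k + 1);
    edges([1 end]) = [-Inf Inf];      % open the two outer bins
    fo = histcounts(y, edges);
    fe = N * diff(normcdf(edges, mean(y), std(y)));
    chi2(j) = sum((fo - fe).^2 ./ fe);
end

disp(chi2)                            % one chi-square value per sigma
```

With the same underlying sample the three values should agree up to rounding, which would suggest that the scatter seen in the tables comes from the different random samples rather than from σ.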

4. Gaussian Random Variable using Box-Muller
With the Box-Muller formula we generate a Gaussian random variable by transforming two statistically independent random variables X1 and X2 that are both uniformly distributed on (0,1). The formula is

$$Y = T(X_1, X_2) = \sqrt{-2\ln(X_1)}\,\sin(2\pi X_2)$$

or

$$Y = T(X_1, X_2) = \sqrt{-2\ln(X_1)}\,\cos(2\pi X_2)$$

This transformation produces a Gaussian random variable with zero mean and unit variance, N(0,1). In this analysis we use 10000 data and 10 bins.
We generate the Gaussian random variable using Box-Muller from 10000 data of each uniform random variable. The operations are element-wise, so no loop is needed.
>> x1=rand(10000,1);
>> x2=rand(10000,1);
>> y1=sqrt(-2*log(x1)).*sin(2*pi*x2);
>> hist(y1)
[Figure: histogram of y1 generated with Box-Muller, N = 10000, 10 bins]
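
The sine and cosine forms of the formula can be evaluated from the same pair of uniform variables. The sketch below is an illustration only: the seed and the names `r`, `y2`, and `c` are our own assumptions. It prints the sample mean and variance of both outputs and their correlation, which should be close to 0, 1, and 0 respectively.

```matlab
rng(2);                                  % fixed seed (assumption, for repeatability)
N  = 10000;                              % number of data
x1 = rand(N, 1);                         % uniform on (0,1)
x2 = rand(N, 1);                         % uniform on (0,1), independent of x1

r  = sqrt(-2 * log(x1));                 % radius term shared by both forms
y1 = r .* sin(2 * pi * x2);              % sine form
y2 = r .* cos(2 * pi * x2);              % cosine form

c = corrcoef(y1, y2);                    % 2-by-2 correlation matrix
fprintf('mean: %.3f %.3f\n', mean(y1), mean(y2));
fprintf('var:  %.3f %.3f\n', var(y1), var(y2));
fprintf('corr: %.3f\n', c(1, 2));
```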

Evaluating the expected frequency (fe) of the random variable in each bin:

>> fe1=normcdf(-2.82)*10000;
>> fe2=(normcdf(-2.05)-normcdf(-2.82))*10000;
>> fe3=(normcdf(-1.28)-normcdf(-2.05))*10000;
>> fe4=(normcdf(-0.511)-normcdf(-1.28))*10000;
>> fe5=(normcdf(0.257)-normcdf(-0.511))*10000;
>> fe6=(normcdf(1.03)-normcdf(0.257))*10000;
>> fe7=(normcdf(1.79)-normcdf(1.03))*10000;
>> fe8=(normcdf(2.56)-normcdf(1.79))*10000;
>> fe9=(normcdf(3.33)-normcdf(2.56))*10000;
>> fe10=(1-normcdf(3.33))*10000;

The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.
Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -2.82] 22 24 0.1667
2 (-2.82, -2.05] 185 178 0.2753
3 (-2.05, -1.28] 763 801 1.8027
4 (-1.28, -0.511] 1960 2044 3.4521
5 ( -0.511,0.257] 3031 2967 1.3805
6 (0.257,1.03] 2490 2471 0.1461
7 (1.03,1.79] 1190 1148 1.5366
8 (1.79,2.56] 302 315 0.5365
9 (2.56, 3.33] 56 48 1.3333
10 (3.33, ∞) 1 4 2.2500
Total 10000 10000 12.8798

So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 12.8798. Using a confidence level F(z) of 0.95 and k − 3 = 7 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,7} = 14.07$.

$$\chi^2 < \chi^2_{5\%,7}$$

Thus, the random variable is acceptable.

Analysis:
Based on the synthesis of the data, the chi-square test supports that the Box-Muller formula transforms uniform random variables into a Gaussian random variable: the chi-square value (12.88) is below the critical value (14.07), so the formula performs the transformation effectively.

5. Gaussian Random Variable using Equal Probability Bins
We generate a Gaussian random variable N(0,1) with 1000 data, using randn(1000,1), and 5 bins of equal probability (0.2 each).
From table A8, Normal distribution (Kreyszig E. book), the bin edges are:
$F_X(x_1) = 0.2 \Rightarrow x_1 = -0.842$
$F_X(x_2) = 0.4 \Rightarrow x_2 = -0.253$
$F_X(x_3) = 0.6 \Rightarrow x_3 = 0.253$
$F_X(x_4) = 0.8 \Rightarrow x_4 = 0.842$
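
The same edges can be obtained without the printed table by inverting the standard normal CDF. The sketch below assumes norminv from the Statistics Toolbox; the names `k`, `p`, `interior`, and `edges` are our own.

```matlab
k = 5;                          % number of equal-probability bins
p = (1:k-1) / k;                % cumulative probabilities 0.2, 0.4, 0.6, 0.8
interior = norminv(p);          % quantiles of N(0,1) at those probabilities
edges = [-Inf interior Inf];    % full edge vector, usable with histcounts

disp(interior)                  % comparable with x1..x4 read from table A8
```

Changing `k` gives the edges for any other number of equal-probability bins.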

Observed frequency, fo:
>> x=randn(1000,1);
>> edges=[-inf,-0.842,-0.253,0.253,0.842,inf];
>> histcounts(x,edges)
ans =
189 195 217 200 199

Evaluating the expected frequency (fe) of the random variable in each bin:
>> fe1=normcdf(-0.842)*1000;
>> fe2=(normcdf(-0.253)-normcdf(-0.842))*1000;
>> fe3=(normcdf(0.253)-normcdf(-0.253))*1000;
>> fe4=(normcdf(0.842)-normcdf(0.253))*1000;
>> fe5=(1-normcdf(0.842))*1000;

The observed and expected frequencies are shown in the table below; the last column gives each bin's contribution to chi-square.

Bin No.   Bin edge   fo   fe   (fo − fe)²/fe
1 (−∞, -0.842] 189 200 0.6050
2 (-0.842, -0.253] 195 200 0.1250
3 (-0.253, 0.253] 217 200 1.4450
4 (0.253, 0.842] 200 200 0.0000
5 ( 0.842, ∞) 199 200 0.0050
Total 1000 1000 2.1800

So, by using the chi-square ($\chi^2$) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

The value of $\chi^2$ is 2.1800. Using a confidence level F(z) of 0.95 and k − 3 = 2 degrees of freedom (for a Gaussian distribution), the chi-square distribution table (appendix A) gives $\chi^2_{5\%,2} = 5.99$.

$$\chi^2 < \chi^2_{5\%,2}$$

Thus, the random variable is acceptable.

Analysis:
We can conclude that equal-probability bins give a good chi-square value for a Gaussian random variable. Equal-probability bins also mean that the expected frequency is the same for every bin, so the chi-square test becomes easier to carry out.
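
With a common expected frequency fe = N/k the chi-square sum reduces to (k/N)·Σfo² − N, because the observed frequencies add up to N. The sketch below checks this shortcut against the full formula using the observed counts of this section; the variable names are our own.

```matlab
fo = [189 195 217 200 199];              % observed frequencies from the table above
N  = sum(fo);                            % number of data
k  = numel(fo);                          % number of bins
fe = N / k;                              % same expected frequency for every bin

chi2_full  = sum((fo - fe).^2 / fe);     % full goodness-of-fit formula
chi2_short = (k / N) * sum(fo.^2) - N;   % shortcut for equal-probability bins

fprintf('%.4f %.4f\n', chi2_full, chi2_short);
```

Both expressions reproduce the value 2.1800 of the table.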
