
ENGINEERING ANALYSIS II

Analysis and Synthesis of Random Variable

Group 5 :
Bayu Sutanto (23116301)
Aditya Muhammad Nur (23116313)
Yee Yanglasyjiavue (23116704)

DEPARTMENT OF MECHANICAL ENGINEERING


FACULTY OF MECHANICAL AND AEROSPACE ENGINEERING
BANDUNG INSTITUTE OF TECHNOLOGY
2017
A. PROJECT
Synthesize and analyze Gaussian random variables using Matlab, and
explore the effect of:
1. Number of data (N)
2. Number of bins (N-bin)
3. Generation using normrnd
4. Generation using Box-Muller

B. SYNTHESIS OF RANDOM VARIABLE


1. Synthesis of Gaussian random variable using randn (N,1)
In this case, we generate Gaussian random variables with a varying
number of data (100, 1000 and 10000 data). All random variables have
zero mean and unit variance.

a. Gaussian random variable (100,1)


Generate a Gaussian random variable with 100 data and make a histogram
with 10 bins.
>> y1=randn(100,1);

>> hist(y1)

Evaluating the expected frequency (fe) of random variable.
>> fe1=normcdf(-1.82)*100;

>> fe2=(normcdf(-1.31)-normcdf(-1.82))*100;

>> fe3=(normcdf(-0.794)-normcdf(-1.31))*100;

>> fe4=(normcdf(-0.282)-normcdf(-0.794))*100;

>> fe5=(normcdf(0.229)-normcdf(-0.282))*100;

>> fe6=(normcdf(0.741)-normcdf(0.229))*100;

>> fe7=(normcdf(1.25)-normcdf(0.741))*100;

>> fe8=(normcdf(1.76)-normcdf(1.25))*100;

>> fe9=(normcdf(2.28)-normcdf(1.76))*100;

>> fe10=(1-normcdf(2.28))*100;

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to -1.82           4       3.44      0.09
-1.82 to -1.31        3       6.07      1.55
-1.31 to -0.794       11      11.85     0.06
-0.794 to -0.282      20      17.54     0.35
-0.282 to 0.229       21      20.16     0.04
0.229 to 0.741        17      18.01     0.06
0.741 to 1.25         16      12.37     1.07
1.25 to 1.76          4       6.64      1.05
1.76 to 2.28          3       2.79      0.02
2.28 to ∞             1       1.13      0.02
Sum                   100     100       4.29

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 4.29. With a confidence level F(z) of 0.95 and k-3 = 7 degrees
of freedom (for a Gaussian distribution, where k is the number of bins), the
Chi-square distribution table (appendix A) gives χ²α,dof = 14.07, or by using
Matlab:

>> threshold=chi2inv(.95,7)

threshold =

14.0671

Because χ² < χ²α,dof, the random variable is accepted as a Gaussian random variable.
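
The manual steps of this subsection (reading the bin edges from the histogram, evaluating each fe with normcdf, and summing the table) can also be scripted. The following is only a minimal sketch under stated assumptions: the variable names (edges, fo, fe, chi2, accepted) are illustrative and not part of the original work, the interior bin edges are recovered from the bin centers returned by hist, and the outer bins are extended to -∞ and +∞ as in the table. It uses normcdf and chi2inv exactly as the report does.

```matlab
% Sketch: automate the chi-square test of one data set (illustrative names).
N = 100;                          % number of data
k = 10;                           % number of histogram bins
y = randn(N, 1);                  % zero-mean, unit-variance Gaussian data

[fo, centers] = hist(y, k);       % observed frequencies and bin centers
fo = fo(:).';                     % force row vectors for the steps below
centers = centers(:).';
edges = centers(1:end-1) + diff(centers)/2;   % the k-1 interior bin edges

% Expected frequencies: the first and last bins extend to -Inf and +Inf.
fe = N * diff([0, normcdf(edges), 1]);

chi2 = sum((fo - fe).^2 ./ fe);   % chi-square statistic
threshold = chi2inv(0.95, k - 3); % dof = k-3, the convention used in this report
accepted = chi2 < threshold;      % true when the Gaussian hypothesis is accepted
```

Because randn draws new data on every run, chi2 changes from run to run; only the threshold (14.0671 for 7 degrees of freedom) is fixed.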

b. Gaussian random variable (1000,1)


Generate a Gaussian random variable with 1000 data and make a histogram
with 10 bins.
>> y2=randn(1000,1);

>> hist(y2)

Evaluating the expected frequency (fe) of random variable.


>> fe1=normcdf(-2.55)*1000;

>> fe2=(normcdf(-1.87)-normcdf(-2.55))*1000;

>> fe3=(normcdf(-1.19)-normcdf(-1.87))*1000;

>> fe4=(normcdf(-0.511)-normcdf(-1.19))*1000;

>> fe5=(normcdf(0.169)-normcdf(-0.511))*1000;

>> fe6=(normcdf(0.849)-normcdf(0.169))*1000;

>> fe7=(normcdf(1.53)-normcdf(0.849))*1000;

>> fe8=(normcdf(2.21)-normcdf(1.53))*1000;

>> fe9=(normcdf(2.89)-normcdf(2.21))*1000;

>> fe10=(1-normcdf(2.89))*1000

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to -2.55           3       5.39      1.06
-2.55 to -1.87        25      25.36     0.00
-1.87 to -1.19        85      86.28     0.02
-1.19 to -0.511       186     187.65    0.01
-0.511 to 0.169       272     262.43    0.35
0.169 to 0.849        246     234.96    0.52
0.849 to 1.53         118     134.93    2.12
1.53 to 2.21          45      49.46     0.40
2.21 to 2.89          16      11.63     1.65
2.89 to ∞             4       1.93      2.23
Sum                   1000    1000      8.37

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 8.37. With a confidence level F(z) of 0.95 and k-3 = 7 degrees
of freedom (for a Gaussian distribution), the Chi-square distribution table
(appendix A) gives χ²α,dof = 14.07, or by using Matlab:

>> threshold=chi2inv(.95,7)

threshold =

14.0671

Because χ² < χ²α,dof, the random variable is accepted as a Gaussian random variable.

c. Gaussian random variable (10000,1)


Generate a Gaussian random variable with 10000 data and make a histogram
with 10 bins.
>> y3=randn(10000,1);

>> hist(y3)

Evaluating the expected frequency (fe) of random variable.


>> fe1=normcdf(-2.8)*10000;

>> fe2=(normcdf(-2.04)-normcdf(-2.8))*10000;

>> fe3=(normcdf(-1.28)-normcdf(-2.04))*10000;

>> fe4=(normcdf(-0.521)-normcdf(-1.28))*10000;

>> fe5=(normcdf(0.238)-normcdf(-0.521))*10000;

>> fe6=(normcdf(0.997)-normcdf(0.238))*10000;

>> fe7=(normcdf(1.76)-normcdf(0.997))*10000;

>> fe8=(normcdf(2.52)-normcdf(1.76))*10000;

>> fe9=(normcdf(3.27)-normcdf(2.52))*10000;

>> fe10=(1-normcdf(3.27))*10000

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to -2.8            21      25.55     0.81
-2.8 to -2.04         168     181.20    0.96
-2.04 to -1.28        815     795.97    0.45
-1.28 to -0.521       2029    2009.10   0.20
-0.521 to 0.238       2920    2928.80   0.03
0.238 to 0.997        2489    2465.55   0.22
0.997 to 1.76         1180    1201.80   0.40
1.76 to 2.52          325     333.36    0.21
2.52 to 3.27          41      53.30     2.84
3.27 to ∞             12      5.38      8.16
Sum                   10000   10000     14.27

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 14.27. With a confidence level F(z) of 0.95 and k-3 = 7 degrees
of freedom (for a Gaussian distribution), the Chi-square distribution table
(appendix A) gives χ²α,dof = 14.07, or by using Matlab:

>> threshold=chi2inv(.95,7)

threshold =

14.0671

Because χ² > χ²α,dof, the random variable is not accepted as a Gaussian random
variable.

2. Synthesis of Gaussian random variable by varying the number of bins (y,N)


For this case, we use a zero-mean Gaussian random variable with 10000 data
and vary the number of bins (15, 20 and 25 bins). The 10000-data case is selected
because the previous experiment rejected it with 10 bins, so this experiment
shows whether increasing the number of bins improves the result.
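
The bin-count comparison described above can be sketched as a loop. This is a hedged illustration, not the procedure used in the report: the names (binCounts, chi2, accepted) are hypothetical, and a single data set is shared by all bin counts, whereas the subsections below generate a new data set for each case. The degrees of freedom follow the report's k-3 convention.

```matlab
% Sketch: chi-square test of one data set for several bin counts.
N = 10000;
y = randn(N, 1);                  % one data set shared by all bin counts
binCounts = [15 20 25];
chi2 = zeros(size(binCounts));

for j = 1:numel(binCounts)
    k = binCounts(j);
    [fo, centers] = hist(y, k);   % observed frequencies and bin centers
    fo = fo(:).';
    centers = centers(:).';
    edges = centers(1:end-1) + diff(centers)/2;  % k-1 interior bin edges
    fe = N * diff([0, normcdf(edges), 1]);       % outer bins reach -Inf/+Inf
    chi2(j) = sum((fo - fe).^2 ./ fe);
end

threshold = chi2inv(0.95, binCounts - 3);  % dof = k-3 for each bin count
accepted = chi2 < threshold;
```

The thresholds depend only on the degrees of freedom (12, 17 and 22), so they match the values 21.0261, 27.5871 and 33.9244 quoted in the subsections that follow.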

a. Gaussian random variable (10000,1) with 15 bins


Generate a Gaussian random variable with 10000 data and make a histogram
with 15 bins.
>> y1=randn(10000,1);

>> hist(y1,15)

Evaluating the expected frequency (fe) of random variable.


>> fe1=normcdf(-3.38)*10000;

>> fe2=(normcdf(-2.85)-normcdf(-3.38))*10000;

>> fe3=(normcdf(-2.31)-normcdf(-2.85))*10000;

>> fe4=(normcdf(-1.78)-normcdf(-2.31))*10000;

>> fe5=(normcdf(-1.25)-normcdf(-1.78))*10000;

>> fe6=(normcdf(-0.713)-normcdf(-1.25))*10000;

>> fe7=(normcdf(-0.18)-normcdf(-0.713))*10000;

>> fe8=(normcdf(0.354)-normcdf(-0.18))*10000;

>> fe9=(normcdf(0.888)-normcdf(0.354))*10000;

>> fe10=(normcdf(1.42)-normcdf(0.888))*10000;

>> fe11=(normcdf(1.96)-normcdf(1.42))*10000;

>> fe12=(normcdf(2.49)-normcdf(1.96))*10000;

>> fe13=(normcdf(3.02)-normcdf(2.49))*10000;

>> fe14=(normcdf(3.56)-normcdf(3.02))*10000;

>> fe15=(1-normcdf(3.56))*10000;

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to -3.38           2       3.62      0.73
-3.38 to -2.85        14      18.24     0.98
-2.85 to -2.31        89      82.58     0.50
-2.31 to -1.78        269     270.94    0.01
-1.78 to -1.25        677     681.12    0.02
-1.25 to -0.713       1330    1322.70   0.04
-0.713 to -0.18       1950    1906.50   0.99
-0.18 to 0.354        2034    2097.50   1.92
0.354 to 0.888        1730    1744.00   0.11
0.888 to 1.42         1150    1094.70   2.79
1.42 to 1.96          503     528       1.19
1.96 to 2.49          189     186       0.04
2.49 to 3.02          47      51.2328   0.35
3.02 to 3.56          12      10.7845   0.14
3.56 to ∞             4       1.8543    2.48
Sum                   10000   9999.94   12.31

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 12.31. With a confidence level F(z) of 0.95 and k-3 = 12
degrees of freedom (for a Gaussian distribution), the Chi-square distribution
table (appendix A) gives χ²α,dof = 21.03, or by using Matlab:

>> threshold=chi2inv(.95,12)

threshold =

21.0261

Because χ² < χ²α,dof, the random variable is accepted as a Gaussian random variable.

b. Gaussian random variable (10000,1) with 20 bins
Generate a Gaussian random variable with 10000 data and make a
histogram with 20 bins.

>> y2=randn(10000,1);

>> hist(y2,20)

Evaluating the expected frequency (fe) of random variable.


>> fe1=normcdf(-3.18)*10000;

>> fe2=(normcdf(-2.8)-normcdf(-3.18))*10000;

>> fe3=(normcdf(-2.42)-normcdf(-2.8))*10000;

>> fe4=(normcdf(-2.04)-normcdf(-2.42))*10000;

>> fe5=(normcdf(-1.66)-normcdf(-2.04))*10000;

>> fe6=(normcdf(-1.28)-normcdf(-1.66))*10000;

>> fe7=(normcdf(-0.9)-normcdf(-1.28))*10000;

>> fe8=(normcdf(-0.521)-normcdf(-0.9))*10000;

>> fe9=(normcdf(-0.141)-normcdf(-0.521))*10000;

>> fe10=(normcdf(0.238)-normcdf(-0.141))*10000;

>> fe11=(normcdf(0.618)-normcdf(0.238))*10000;

>> fe12=(normcdf(0.997)-normcdf(0.618))*10000;

>> fe13=(normcdf(1.38)-normcdf(0.997))*10000;

>> fe14=(normcdf(1.76)-normcdf(1.38))*10000;

>> fe15=(normcdf(2.14)-normcdf(1.76))*10000;

>> fe16=(normcdf(2.52)-normcdf(2.14))*10000;

>> fe17=(normcdf(2.89)-normcdf(2.52))*10000;

>> fe18=(normcdf(3.27)-normcdf(2.89))*10000;

>> fe19=(normcdf(3.65)-normcdf(3.27))*10000;

>> fe20=(1-normcdf(3.65))*10000;

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to -3.18           6       7.36      0.25
-3.18 to -2.8         15      18.19     0.56
-2.8 to -2.42         44      52.05     1.25
-2.42 to -2.04        130     129.15    0.01
-2.04 to -1.66        267     277.82    0.42
-1.66 to -1.28        542     518.15    1.10
-1.28 to -0.9         852     837.88    0.24
-0.9 to -0.521        1180    1171.20   0.07
-0.521 to -0.141      1420    1427.50   0.04
-0.141 to 0.238       1483    1501.20   0.22
0.238 to 0.618        1390    1380      0.13
0.618 to 0.997        1110    1090      0.40
0.997 to 1.38         727     755.8893  1.10
1.38 to 1.76          456     445.8942  0.23
1.76 to 2.14          246     230.2652  1.08
2.14 to 2.52          82      103.0964  4.32
2.52 to 2.89          32      39.4153   1.40
2.89 to 3.27          7       13.8847   3.41
3.27 to 3.65          10      4.0662    8.66
3.65 to ∞             1       1.3112    0.07
Sum                   10000   9999.92   24.95

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 24.95. With a confidence level F(z) of 0.95 and k-3 = 17
degrees of freedom (for a Gaussian distribution), the Chi-square distribution
table (appendix A) gives χ²α,dof = 27.59, or by using Matlab:

>> threshold=chi2inv(.95,17)

threshold =

27.5871

Because χ² < χ²α,dof, the random variable is accepted as a Gaussian random variable.

c. Gaussian random variable (10000,1) with 25 bins

Generate a Gaussian random variable with 10000 data and make a histogram
with 25 bins.

>> y3=randn(10000,1);

>> hist(y3,25)

Evaluating the expected frequency (fe) of random variable.
>> fe1=normcdf(-3.91)*10000;

>> fe2=(normcdf(-3.57)-normcdf(-3.91))*10000;

>> fe3=(normcdf(-3.22)-normcdf(-3.57))*10000;

>> fe4=(normcdf(-2.88)-normcdf(-3.22))*10000;

>> fe5=(normcdf(-2.54)-normcdf(-2.88))*10000;

>> fe6=(normcdf(-2.2)-normcdf(-2.54))*10000;

>> fe7=(normcdf(-1.86)-normcdf(-2.2))*10000;

>> fe8=(normcdf(-1.52)-normcdf(-1.86))*10000;

>> fe9=(normcdf(-1.18)-normcdf(-1.52))*10000;

>> fe10=(normcdf(-0.834)-normcdf(-1.18))*10000;

>> fe11=(normcdf(-0.492)-normcdf(-0.834))*10000;

>> fe12=(normcdf(-0.151)-normcdf(-0.492))*10000;

>> fe13=(normcdf(0.191)-normcdf(-0.151))*10000;

>> fe14=(normcdf(0.533)-normcdf(0.191))*10000;

>> fe15=(normcdf(0.874)-normcdf(0.533))*10000;

>> fe16=(normcdf(1.22)-normcdf(0.874))*10000;

>> fe17=(normcdf(1.56)-normcdf(1.22))*10000;

>> fe18=(normcdf(1.9)-normcdf(1.56))*10000;

>> fe19=(normcdf(2.24)-normcdf(1.9))*10000;

>> fe20=(normcdf(2.58)-normcdf(2.24))*10000;

>> fe21=(normcdf(2.92)-normcdf(2.58))*10000;

>> fe22=(normcdf(3.26)-normcdf(2.92))*10000;

>> fe23=(normcdf(3.61)-normcdf(3.26))*10000;

>> fe24=(normcdf(3.95)-normcdf(3.61))*10000;

>> fe25=(1-normcdf(3.95))*10000;

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to -3.91           1       0.46      0.63
-3.91 to -3.57        0       1.32      1.32
-3.57 to -3.22        5       4.62      0.03
-3.22 to -2.88        14      13.47     0.02
-2.88 to -2.54        34      35.54     0.07
-2.54 to -2.2         84      83.61     0.00
-2.2 to -1.86         170     175.39    0.17
-1.86 to -1.52        330     328.13    0.01
-1.52 to -1.18        572     547.45    1.10
-1.18 to -0.834       824     831.40    0.07
-0.834 to -0.492      1053    1090      1.41
-0.492 to -0.151      1340    1290      2.24
-0.151 to 0.191       1340    1360      0.23
0.191 to 0.533        1254    1270      0.27
0.533 to 0.874        1090    1060      0.87
0.874 to 1.22         817     798.27    0.44
1.22 to 1.56          490     518.525   1.57
1.56 to 1.9           320     306.6338  0.58
1.9 to 2.24           147     161.711   1.34
2.24 to 2.58          71      76.0545   0.34
2.58 to 2.92          27      31.8986   0.75
2.92 to 3.26          14      11.931    0.36
3.26 to 3.61          2       4.0396    1.03
3.61 to 3.95          0       1.1402    1.14
3.95 to ∞             1       0.3908    0.95
Sum                   10000   10000.10  16.93

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 16.93. With a confidence level F(z) of 0.95 and k-3 = 22
degrees of freedom (for a Gaussian distribution), the Chi-square distribution
table (appendix A) gives χ²α,dof = 33.92, or by using Matlab:

>> threshold=chi2inv(.95,22)

threshold =

33.9244

Because χ² < χ²α,dof, the random variable is accepted as a Gaussian random variable.

3. Synthesis of Gaussian random variable using normrnd (μ, σ, N, 1)


For this case we generate Gaussian random variables with a mean value of 5
and a standard deviation (σ) of 1, 2 and 5, then evaluate the differences
among them. All random variables have 1000 data and a 10-bin
histogram.
a. Gaussian random variable (5,1,1000,1)
Generate a Gaussian random variable with a mean value of 5 and a
standard deviation of 1.
>> y1=normrnd(5,1,1000,1);

>> hist(y1)

The mean of the random variable is:
>> mean(y1)

ans =

5.0228

Evaluating the expected frequency (fe) of the random variable using the relation:

$$F_X(x) = F_Y\left(\frac{x - a_X}{\sigma_X}\right)$$

so,
>> fe1=normcdf(2.6-5.0228)*1000;

>> fe2=(normcdf(3.23-5.0228)-normcdf(2.6-5.0228))*1000;

>> fe3=(normcdf(3.86-5.0228)-normcdf(3.23-5.0228))*1000;

>> fe4=(normcdf(4.49-5.0228)-normcdf(3.86-5.0228))*1000;

>> fe5=(normcdf(5.13-5.0228)-normcdf(4.49-5.0228))*1000;

>> fe6=(normcdf(5.76-5.0228)-normcdf(5.13-5.0228))*1000;

>> fe7=(normcdf(6.39-5.0228)-normcdf(5.76-5.0228))*1000;

>> fe8=(normcdf(7.02-5.0228)-normcdf(6.39-5.0228))*1000;

>> fe9=(normcdf(7.65-5.0228)-normcdf(7.02-5.0228))*1000;

>> fe10=(1-normcdf(7.65-5.0228))*1000

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to 2.6             5       7.70      0.95
2.6 to 3.23           26      28.80     0.27
3.23 to 3.86          91      85.95     0.30
3.86 to 4.49          173     174.63    0.02
4.49 to 5.13          246     245.60    0.00
5.13 to 5.76          231     226.81    0.08
5.76 to 6.39          140     144.72    0.15
6.39 to 7.02          66      62.88     0.15
7.02 to 7.65          19      18.60     0.01
7.65 to ∞             3       4.30      0.40
Sum                   1000    1000      2.32

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 2.32. With a confidence level F(z) of 0.95 and k-3 = 7 degrees
of freedom (for a Gaussian distribution), the Chi-square distribution table
(appendix A) gives χ²α,dof = 14.07, or by using Matlab:

>> threshold=chi2inv(.95,7)

threshold =

14.0671

Because χ² < χ²α,dof, the random variable is accepted as a Gaussian random variable.
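
The standardization relation used in this section, F_X(x) = F_Y((x - a_X)/σ_X), can be written once for any mean and standard deviation. The sketch below is an illustration only, with hypothetical names (aX, sX, FX, FXdirect) and bin edges recovered from the hist bin centers. It also checks the relation against the three-argument form normcdf(x, mu, sigma), which evaluates the same Gaussian distribution function directly.

```matlab
% Sketch: expected frequencies for a Gaussian with non-zero mean.
N = 1000;
k = 10;
y = normrnd(5, 2, N, 1);          % mean 5, standard deviation 2
aX = mean(y);                     % sample mean
sX = std(y);                      % sample standard deviation

[fo, centers] = hist(y, k);
fo = fo(:).';
centers = centers(:).';
edges = centers(1:end-1) + diff(centers)/2;   % k-1 interior bin edges

% F_X(x) = F_Y((x - aX)/sX), with F_Y the standard normal distribution function.
FX = normcdf((edges - aX) / sX);
FXdirect = normcdf(edges, aX, sX);            % same values, computed directly

fe = N * diff([0, FX, 1]);        % outer bins extend to -Inf and +Inf
chi2 = sum((fo - fe).^2 ./ fe);
```

Dividing by sX matters whenever the standard deviation differs from 1; in subsection a the division is omitted because σ is 1.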

b. Gaussian random variable (5,2,1000,1)


Generate a Gaussian random variable with a mean value of 5 and a
standard deviation of 2.
>> y2=normrnd(5,2,1000,1);

>> hist(y2)

The mean and standard deviation of the random variable are:
>> mean(y2)

ans =

4.8988

>> std(y2)

ans =

2.0019

Evaluating the expected frequency (fe) of the random variable using the relation:

$$F_X(x) = F_Y\left(\frac{x - a_X}{\sigma_X}\right)$$

so,
>> fe1=normcdf((-0.375-4.8988)/2.0019)*1000;

>> fe2=(normcdf((0.916-4.8988)/2.0019)-normcdf((-0.375-4.8988)/2.0019))*1000;

>> fe3=(normcdf((2.21-4.8988)/2.0019)-normcdf((0.916-4.8988)/2.0019))*1000;

>> fe4=(normcdf((3.5-4.8988)/2.0019)-normcdf((2.21-4.8988)/2.0019))*1000;

>> fe5=(normcdf((4.79-4.8988)/2.0019)-normcdf((3.5-4.8988)/2.0019))*1000;

>> fe6=(normcdf((6.08-4.8988)/2.0019)-normcdf((4.79-4.8988)/2.0019))*1000;

>> fe7=(normcdf((7.37-4.8988)/2.0019)-normcdf((6.08-4.8988)/2.0019))*1000;

>> fe8=(normcdf((8.66-4.8988)/2.0019)-normcdf((7.37-4.8988)/2.0019))*1000;

>> fe9=(normcdf((9.95-4.8988)/2.0019)-normcdf((8.66-4.8988)/2.0019))*1000;

>> fe10=(1-normcdf((9.95-4.8988)/2.0019))*1000;

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to -0.375          3       4.21      0.35
-0.375 to 0.916       24      19.11     1.25
0.916 to 2.21         60      66.29     0.60
2.21 to 3.5           153     152.74    0.00
3.5 to 4.79           245     235.97    0.35
4.79 to 6.08          236     244.09    0.27
6.08 to 7.37          166     169.06    0.06
7.37 to 8.66          83      78.39     0.27
8.66 to 9.95          26      24.32     0.12
9.95 to ∞             4       5.81      0.57
Sum                   1000    1000      3.82

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 3.82. With a confidence level F(z) of 0.95 and k-3 = 7 degrees
of freedom (for a Gaussian distribution), the Chi-square distribution table
(appendix A) gives χ²α,dof = 14.07, or by using Matlab:

>> threshold=chi2inv(.95,7)

threshold =

14.0671

Because χ² < χ²α,dof, the random variable is accepted as a Gaussian random variable.

c. Gaussian random variable (5,5,1000,1)
Generate a Gaussian random variable with a mean value of 5 and a
standard deviation of 5.

>> y3=normrnd(5,5,1000,1);

>> hist(y3)

>> mean(y3)

ans =
4.9905

>> std(y3)

ans =

5.1553

Evaluating the expected frequency (fe) of the random variable using the relation:

$$F_X(x) = F_Y\left(\frac{x - a_X}{\sigma_X}\right)$$

so,
>> fe1=normcdf((-11.3-4.9905)/5.1553)*1000;

>> fe2=(normcdf((-7.68-4.9905)/5.1553)-normcdf((-11.3-4.9905)/5.1553))*1000;

>> fe3=(normcdf((-4.04-4.9905)/5.1553)-normcdf((-7.68-4.9905)/5.1553))*1000;

>> fe4=(normcdf((-0.407-4.9905)/5.1553)-normcdf((-4.04-4.9905)/5.1553))*1000;

>> fe5=(normcdf((3.23-4.9905)/5.1553)-normcdf((-0.407-4.9905)/5.1553))*1000;

>> fe6=(normcdf((6.86-4.9905)/5.1553)-normcdf((3.23-4.9905)/5.1553))*1000;

>> fe7=(normcdf((10.5-4.9905)/5.1553)-normcdf((6.86-4.9905)/5.1553))*1000;

>> fe8=(normcdf((14.1-4.9905)/5.1553)-normcdf((10.5-4.9905)/5.1553))*1000;

>> fe9=(normcdf((17.8-4.9905)/5.1553)-normcdf((14.1-4.9905)/5.1553))*1000;

>> fe10=(1-normcdf((17.8-4.9905)/5.1553))*1000;

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to -11.3           1       0.79      0.06
-11.3 to -7.68        10      6.20      2.33
-7.68 to -4.04        28      32.92     0.74
-4.04 to -0.407       105     107.64    0.06
-0.407 to 3.23        214     218.81    0.11
3.23 to 6.86          292     275.20    1.03
6.86 to 10.5          209     215.84    0.22
10.5 to 14.1          104     103.99    0.00
14.1 to 17.8          28      32.13     0.53
17.8 to ∞             9       6.48      0.98
Sum                   1000    1000      6.04

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 6.04. With a confidence level F(z) of 0.95 and k-3 = 7 degrees
of freedom (for a Gaussian distribution), the Chi-square distribution table
(appendix A) gives χ²α,dof = 14.07, or by using Matlab:

>> threshold=chi2inv(.95,7)

threshold =

14.0671

Because χ² < χ²α,dof, the random variable is accepted as a Gaussian random variable.

4. Synthesis of Gaussian random variable using Box-Muller


For this case, we generate a Gaussian random variable using the Box-Muller
formula. The Gaussian random variable results from the transformation of two
statistically independent random variables X1 and X2, both uniformly
distributed on (0,1). The transformation is
$$Y = T(X_1, X_2) = \sqrt{-2\ln(X_1)}\,\sin(2\pi X_2)$$
or
$$Y = T(X_1, X_2) = \sqrt{-2\ln(X_1)}\,\cos(2\pi X_2)$$
This transformation produces a zero-mean, unit-variance Gaussian
random variable. The experiment is conducted by varying the number of data
(100 and 1000), and the histogram has 10 bins.
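
As a quick check of the transformation, both forms can be applied to whole vectors at once. The sketch below is a hedged illustration with hypothetical names (r, ySin, yCos, stats) and a data size chosen only to make the sample statistics stable; it is not one of the experiments of this report.

```matlab
% Sketch: vectorized Box-Muller transformation, sine and cosine forms.
N = 100000;
x1 = rand(N, 1);                  % uniform on (0,1)
x2 = rand(N, 1);                  % uniform on (0,1), independent of x1

r = sqrt(-2 * log(x1));           % radial factor shared by both forms
ySin = r .* sin(2*pi*x2);         % sine form
yCos = r .* cos(2*pi*x2);         % cosine form

% Each row holds [mean variance]; both rows should be close to [0 1].
stats = [mean(ySin) var(ySin); mean(yCos) var(yCos)];
disp(stats)
```

Each pair (x1, x2) yields two Gaussian values, one from each form, so a single pass over the uniform data can supply both.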

a. Gaussian random variable with 100 data


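Generate a Gaussian random variable using Box-Muller with 100 data of
uniform random variables. The transformation acts on whole vectors, so no
loop is required.
>> x1=rand(100,1); % uniform on (0,1)

>> x2=rand(100,1); % uniform on (0,1), independent of x1

>> y1=sqrt(-2*log(x1)).*sin(2*pi*x2); % Box-Muller, sine form

>> hist(y1)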

Evaluating the expected frequency (fe) of random variable.
>> fe1=normcdf(-1.93)*100;

>> fe2=(normcdf(-1.5)-normcdf(-1.93))*100;

>> fe3=(normcdf(-1.08)-normcdf(-1.5))*100;

>> fe4=(normcdf(-0.65)-normcdf(-1.08))*100;

>> fe5=(normcdf(-0.224)-normcdf(-0.65))*100;

>> fe6=(normcdf(0.203)-normcdf(-0.224))*100;

>> fe7=(normcdf(0.629)-normcdf(0.203))*100;

>> fe8=(normcdf(1.06)-normcdf(0.629))*100;

>> fe9=(normcdf(1.48)-normcdf(1.06))*100;

>> fe10=(1-normcdf(1.48))*100;

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to -1.93           1       2.68      1.05
-1.93 to -1.5         1       4.00      2.25
-1.5 to -1.08         7       7.33      0.01
-1.08 to -0.65        12      11.78     0.00
-0.65 to -0.224       13      15.35     0.36
-0.224 to 0.203       20      16.91     0.57
0.203 to 0.629        21      15.49     1.96
0.629 to 1.06         13      12.01     0.08
1.06 to 1.48          9       7.51      0.29
1.48 to ∞             3       6.94      2.24
Sum                   100     100       8.83

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 8.83. With a confidence level F(z) of 0.95 and k-3 = 7 degrees
of freedom (for a Gaussian distribution), the Chi-square distribution table
(appendix A) gives χ²α,dof = 14.07, or by using Matlab:

>> threshold=chi2inv(.95,7)

threshold =

14.0671

Because χ² < χ²α,dof, the random variable is accepted as a Gaussian random variable.

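b. Gaussian random variable with 1000 data


Generate a Gaussian random variable using Box-Muller with 1000 data of
uniform random variables. The transformation acts on whole vectors, so no
loop is required.
>> x1=rand(1000,1); % uniform on (0,1)

>> x2=rand(1000,1); % uniform on (0,1), independent of x1

>> y1=sqrt(-2*log(x1)).*cos(2*pi*x2); % Box-Muller, cosine form

>> hist(y1)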

Evaluating the expected frequency (fe) of random variable.
>> fe1=normcdf(-2.36)*1000;

>> fe2=(normcdf(-1.74)-normcdf(-2.36))*1000;

>> fe3=(normcdf(-1.11)-normcdf(-1.74))*1000;

>> fe4=(normcdf(-0.487)-normcdf(-1.11))*1000;

>> fe5=(normcdf(0.137)-normcdf(-0.487))*1000;

>> fe6=(normcdf(0.762)-normcdf(0.137))*1000;

>> fe7=(normcdf(1.39)-normcdf(0.762))*1000;

>> fe8=(normcdf(2.01)-normcdf(1.39))*1000;

>> fe9=(normcdf(2.63)-normcdf(2.01))*1000;

>> fe10=(1-normcdf(2.63))*1000;

The values of observed frequency (fo) and expected frequency (fe) are shown in
the table. The χ² (Chi-square) of the random variable is also calculated in this
table as follows.

Bins                  fo      fe        (fo - fe)²/fe
-∞ to -2.36           12      9.14      0.90
-2.36 to -1.74        45      31.79     5.49
-1.74 to -1.11        87      92.57     0.34
-1.11 to -0.487       161     179.63    1.93
-0.487 to 0.137       254     241.36    0.66
0.137 to 0.762        202     222.49    1.89
0.762 to 1.39         141     140.77    0.00
1.39 to 2.01          69      60.05     1.33
2.01 to 2.63          24      17.95     2.04
2.63 to ∞             5       4.27      0.13
Sum                   1000    1000      14.70

Using the χ² (Chi-square) goodness-of-fit test with the formula:

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(f_o(i) - f_e(i)\right)^2}{f_e(i)}$$

the value of χ² is 14.70. With a confidence level F(z) of 0.95 and k-3 = 7 degrees
of freedom (for a Gaussian distribution), the Chi-square distribution table
(appendix A) gives χ²α,dof = 14.07, or by using Matlab:

>> threshold=chi2inv(.95,7)

threshold =

14.0671

Because χ² > χ²α,dof (14.70 > 14.07), the random variable is not accepted as a
Gaussian random variable.

C. ANALYSIS OF RANDOM VARIABLE
The generation of normal (Gaussian) random variables with different numbers of
data, numbers of bins, variances and generation methods can be analysed as follows:
1. For the zero-mean, unit-variance Gaussian random variable generated by the
randn command in Matlab, increasing the number of data increased the
value of Chi-square (χ²) in this experiment. With a constant number of bins
(10 bins), the largest data set failed the Chi-square test.
With a confidence level of 95% and k-3 (10-3=7) degrees of freedom,
χ²α,dof is 14.07. Using 100, 1000 and 10000 data produced χ² values of 4.29,
8.37 and 14.27 respectively. So the random variable generated with
10000 data is not accepted as a Gaussian random variable at a confidence
level of 95%.
2. For the random variable with 10000 data generated by the randn command in
Matlab, the experiments with 15, 20 and 25 bins produced χ² values of
12.31, 24.95 and 16.93 respectively, compared with 14.27 for 10 bins.
Increasing the number of bins also increases the degrees of freedom and
therefore χ²α,dof: with a confidence level of 95%, 15, 20 and 25 bins give
χ²α,dof of 21.03, 27.59 and 33.92 respectively. So, increasing the number
of bins made the large data set acceptable at the same
confidence level.
3. The Gaussian random variable generated using the normrnd (μ,σ,N,1)
command in Matlab, with the same number of bins as the first experiment
(10 bins), produced smaller Chi-square values. Using a constant 1000 data
and standard deviations of 1, 2 and 5 produced χ² values of 2.32, 3.82 and
6.04 respectively. Increasing the standard deviation of the Gaussian random
variable spreads the data further from the mean value.
4. The Box-Muller method generates the Gaussian random variable from two
uniform random variables; in Matlab the transformation can be applied to
whole vectors, so an explicit loop is not required. The Box-Muller method
always generates a zero-mean, unit-variance Gaussian random variable. As in
the first experiment, increasing the number of data increased the value of
Chi-square (χ²). Compared with the randn command, the Box-Muller method
produced higher values of χ²: 100 and 1000 data produced χ² values of 8.83
and 14.70 respectively. With a confidence level of 95% and k-3 (10-3=7)
degrees of freedom, χ²α,dof is 14.07. So, the Gaussian random variable with
1000 data generated by the Box-Muller method is not accepted as a Gaussian
random variable at a confidence level of 95%.
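
Each χ² value discussed above comes from a single generated data set, so part of the difference between cases may be run-to-run variation. One way to examine this, offered only as a sketch and not as part of the original experiments, is to repeat the 10-bin test many times for each number of data and record how often it rejects. The names (sizes, nTrials, rejectRate) and the trial count are assumptions chosen for illustration.

```matlab
% Sketch: rejection rate of the 10-bin chi-square test for several data sizes.
sizes = [100 1000 10000];
k = 10;
nTrials = 200;                    % arbitrary number of repetitions
threshold = chi2inv(0.95, k - 3); % dof = k-3, the convention used in this report
rejectRate = zeros(size(sizes));

for s = 1:numel(sizes)
    N = sizes(s);
    nReject = 0;
    for t = 1:nTrials
        y = randn(N, 1);
        [fo, centers] = hist(y, k);
        fo = fo(:).';
        centers = centers(:).';
        edges = centers(1:end-1) + diff(centers)/2;  % k-1 interior bin edges
        fe = N * diff([0, normcdf(edges), 1]);       % outer bins reach -Inf/+Inf
        chi2 = sum((fo - fe).^2 ./ fe);
        nReject = nReject + (chi2 >= threshold);     % count one rejection
    end
    rejectRate(s) = nReject / nTrials;
end
disp(rejectRate)
```

Replacing randn(N, 1) with the Box-Muller expression gives the corresponding rates for that method.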
