
Name : Mahaputra Madani Senen

Risk Management & System Safety


Student No. : M10801850 Homework Day 2
Due date : 16/12/2019

Homework 1
Table 1. Probability Comparison
No.   p      Info(p)       Entropy-p   q      Info(q)       Entropy-q
1     1/12   3.584962501   0.298747    1/12   3.584962501   0.298747
2     1/12   3.584962501   0.298747    1/12   3.584962501   0.298747
3     1/12   3.584962501   0.298747    1/6    2.584962501   0.430827
4     1/12   3.584962501   0.298747    1/6    2.584962501   0.430827
5     1/12   3.584962501   0.298747    1/12   3.584962501   0.298747
6     1/12   3.584962501   0.298747    1/12   3.584962501   0.298747
7     1/12   3.584962501   0.298747    1/12   3.584962501   0.298747
8     1/12   3.584962501   0.298747    1/24   4.584962501   0.191040
9     1/12   3.584962501   0.298747    1/24   4.584962501   0.191040
10    1/12   3.584962501   0.298747    1/24   4.584962501   0.191040
11    1/12   3.584962501   0.298747    1/24   4.584962501   0.191040
12    1/12   3.584962501   0.298747    1/12   3.584962501   0.298747

Sum(p) = 1
Sum (Entropy-p) = 3.584963
Sum(q) = 1
Sum (Entropy-q) = 3.418296
Sum (Entropy-p||q) = 3.751629
Sum (Entropy-q||p) = 3.584963
KL Divergence (p||q) = -0.33333


[Figure: Probability Comparison. Series p and q; vertical axis from 0 to 9/50, horizontal axis from 0 to 14.]

Figure 1. Probability Comparison Graph


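As shown in Table 1 and Figure 1, the two distributions disagree: p is uniform (1/12 for
every outcome), while q assigns 1/6 to outcomes 3 and 4 and 1/24 to outcomes 8 to 11. The
value reported in Table 1 as the K-L Divergence, -0.33333, is the difference
Sum (Entropy-q) - Sum (Entropy-p||q) = 3.418296 - 3.751629. Under the standard definition,
the K-L Divergence is the cross entropy minus the entropy, Sum (Entropy-p||q) -
Sum (Entropy-p) = 3.751629 - 3.584963 = 0.166667, which is non-negative, as a K-L
Divergence must be.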
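The sums in Table 1 can be reproduced with a short script. This is a minimal sketch, not the spreadsheet used for the homework: the function names are illustrative, and it assumes base-2 logarithms (bits), which is what the Info columns of Table 1 imply.

```python
import math

# Distributions p and q from Table 1 (12 outcomes each).
p = [1 / 12] * 12
q = [1/12, 1/12, 1/6, 1/6, 1/12, 1/12, 1/12, 1/24, 1/24, 1/24, 1/24, 1/12]

def cross_entropy(a, b):
    """Cross entropy H(a, b) = sum_i a_i * log2(1 / b_i), in bits."""
    return sum(ai * math.log2(1 / bi) for ai, bi in zip(a, b) if ai > 0)

def entropy(a):
    """Entropy H(a) is the cross entropy of a with itself."""
    return cross_entropy(a, a)

def kl_divergence(a, b):
    """D(a||b) = H(a, b) - H(a); never negative."""
    return cross_entropy(a, b) - entropy(a)

h_p = entropy(p)            # Sum (Entropy-p)
h_q = entropy(q)            # Sum (Entropy-q)
h_pq = cross_entropy(p, q)  # Sum (Entropy-p||q)
h_qp = cross_entropy(q, p)  # Sum (Entropy-q||p)
kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)

print(f"H(p)={h_p:.6f} H(q)={h_q:.6f} H(p,q)={h_pq:.6f} H(q,p)={h_qp:.6f}")
print(f"D(p||q)={kl_pq:.6f} D(q||p)={kl_qp:.6f}")
```

For these two distributions both directions of the divergence happen to equal 1/6; in general D(p||q) and D(q||p) differ.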

Homework 2
Table 2. The Entropy Calculation Result
Head Tail Info(H) Info(T) Entropy
0.0001 0.9999 13.28771 0.000144 0.001473
0.01 0.99 6.643856 0.0145 0.080793
0.02 0.98 5.643856 0.029146 0.141441
0.03 0.97 5.058894 0.043943 0.194392
0.04 0.96 4.643856 0.058894 0.242292
0.05 0.95 4.321928 0.074001 0.286397
……….. Continue until…
0.48 0.52 1.058894 0.943416 0.998846
0.49 0.51 1.029146 0.971431 0.999711
0.50 0.5 1 1 1
0.51 0.49 0.971431 1.029146 0.999711
0.52 0.48 0.943416 1.058894 0.998846
0.53 0.47 0.915936 1.089267 0.997402
0.54 0.46 0.888969 1.120294 0.995378
0.55 0.45 0.862496 1.152003 0.992774
0.56 0.44 0.836501 1.184425 0.989588
0.57 0.43 0.810966 1.217591 0.985815
0.58 0.42 0.785875 1.251539 0.981454
0.59 0.41 0.761213 1.286304 0.9765
0.60 0.4 0.736966 1.321928 0.970951
0.61 0.39 0.713119 1.358454 0.9648
0.62 0.38 0.68966 1.395929 0.958042
……….. Continue until…
0.94 0.06 0.089267 4.058894 0.327445
0.95 0.05 0.074001 4.321928 0.286397
0.96 0.04 0.058894 4.643856 0.242292
0.97 0.03 0.043943 5.058894 0.194392
0.98 0.02 0.029146 5.643856 0.141441
0.99 0.01 0.0145 6.643856 0.080793
0.9999 0.0001 0.000144 13.28771 0.001473

Based on Table 2, the entropy reaches its maximum value of 1 when the probability of Head is
0.5, which is the point of highest uncertainty. The entropy falls toward 0 as the probability
of Head approaches 0 or 1.

[Figure: ENTROPY. Vertical axis from 0 to 1.2; horizontal axis is the row index of Table 2, from 1 to 101.]

Figure 2. The Entropy Result Graph

As shown in Figure 2, the entropy is highest when Head and Tail have the same probability,
0.5. This can be interpreted as follows: when several potential events each have the same
probability (a uniform distribution), the risk taker will find it difficult to decide, since
they cannot guess which event is more likely to happen.
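The Entropy column of Table 2 can be checked with a few lines of code. This is a minimal sketch, assuming base-2 logarithms as in Table 2; the function name binary_entropy and the 0.01 grid are illustrative choices, not part of the original spreadsheet.

```python
import math

def binary_entropy(p_head):
    """Entropy in bits of a coin with P(Head) = p_head."""
    if p_head in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    p_tail = 1.0 - p_head
    return p_head * math.log2(1 / p_head) + p_tail * math.log2(1 / p_tail)

# Scan P(Head) from 0.01 to 0.99 in steps of 0.01.
probs = [i / 100 for i in range(1, 100)]
entropies = [binary_entropy(p) for p in probs]

# Pair each entropy with its probability and pick the largest entropy.
max_entropy, best_p = max(zip(entropies, probs))
print(f"maximum entropy {max_entropy:.6f} at P(Head) = {best_p}")
```

The scan confirms that the maximum sits at P(Head) = 0.5 and that the curve is symmetric around it, matching the mirrored rows of Table 2.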


Homework 3
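In this case, we defined a = 2 and b = 4 and generated 10000 random numbers p. For each
random number we calculated the inverse CDF of the uniform distribution on [a, b] with the
following formula:
$F^{-1}(p) = a + p\,(b - a)$
We then plotted a histogram of the results and calculated the PDF for the sample and for the
theory, together with the KL-Divergence, as shown in Table 3. The value reported as the
KL-Divergence is -0.07761672, which is the difference Sum (Ent-Sample) -
Sum (Entropy-Theory||Sample) = 6.638387169 - 6.716004. Under the standard definition, the
KL-Divergence of the sample from the theory is Sum (Entropy-Sample||Theory) -
Sum (Ent-Sample) = 6.658211483 - 6.638387169 = 0.019824, which is non-negative and close to
0, as expected when the sample follows the theoretical distribution. Figure 3 shows the
histogram of F-1(p) for the 10000 random numbers.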
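The procedure described above can be sketched in code. This is an illustration under stated assumptions, not the spreadsheet used for the homework: it uses 100 equal-width bins over [a, b] (Table 3 uses 101 bins including a "More" bin), a fixed seed, and the standard definition of the KL divergence.

```python
import math
import random

A, B = 2.0, 4.0   # a and b from the homework
N = 10_000        # number of random numbers
BINS = 100        # assumption: 100 equal-width bins over [A, B]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Inverse transform sampling: F^-1(p) = a + p * (b - a).
samples = [A + rng.random() * (B - A) for _ in range(N)]

# Histogram of the samples.
width = (B - A) / BINS
counts = [0] * BINS
for x in samples:
    idx = min(int((x - A) / width), BINS - 1)  # clamp the right edge
    counts[idx] += 1

pdf_sample = [c / N for c in counts]
pdf_theory = [1 / BINS] * BINS  # uniform: every bin equally likely

# D(sample||theory) in bits; empty bins contribute nothing.
kl = sum(s * math.log2(s / t) for s, t in zip(pdf_sample, pdf_theory) if s > 0)
print(f"KL(sample||theory) = {kl:.6f} bits")
```

A divergence close to 0 indicates that the histogram is close to the flat theoretical PDF; the exact value depends on the seed and on the number of bins.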
Table 3. The F-1 calculation result
Bin           Frequency   PDF (Sample)   PDF (Theory)   Info (Sample)   Info (Theory)
2.000200506   1           0.0001         0.00990099     13.28771238     6.658211483
2.02019745    96          0.0096         0.00990099     6.702749879     6.658211483
2.040194395   95          0.0095         0.00990099     6.717856771     6.658211483
2.060191339   98          0.0098         0.00990099     6.673002535     6.658211483
2.080188284   94          0.0094         0.00990099     6.733123528     6.658211483
2.100185228   106         0.0106         0.00990099     6.559791925     6.658211483
2.120182173   109         0.0109         0.00990099     6.519528055     6.658211483
2.140179117   99          0.0099         0.00990099     6.658355759     6.658211483
2.160176062   90          0.009          0.00990099     6.795859283     6.658211483
……. Continue …..
2.640102731   100         0.01           0.00990099     6.64385619      6.658211483
2.660099676   108         0.0108         0.00990099     6.532824877     6.658211483
2.68009662    117         0.0117         0.00990099     6.41734766      6.658211483
2.700093565   104         0.0104         0.00990099     6.587272661     6.658211483
2.720090509   115         0.0115         0.00990099     6.442222329     6.658211483
2.740087454   101         0.0101         0.00990099     6.629500897     6.658211483
2.760084398   99          0.0099         0.00990099     6.658355759     6.658211483
2.780081343   87          0.0087         0.00990099     6.844768884     6.658211483
2.800078287   100         0.01           0.00990099     6.64385619      6.658211483
2.820075232   108         0.0108         0.00990099     6.532824877     6.658211483
2.840072176   115         0.0115         0.00990099     6.442222329     6.658211483
2.860069121   112         0.0112         0.00990099     6.480357457     6.658211483
2.880066066   102         0.0102         0.00990099     6.615287038     6.658211483
2.90006301    104         0.0104         0.00990099     6.587272661     6.658211483
2.920059955   95          0.0095         0.00990099     6.717856771     6.658211483
…. Continue …
3.939904126   99          0.0099         0.00990099     6.658355759     6.658211483
3.959901071   88          0.0088         0.00990099     6.828280761     6.658211483
3.979898015   99          0.0099         0.00990099     6.658355759     6.658211483
More          101         0.0101         0.00990099     6.629500897     6.658211483

Sum (Freq) = 10000
Sum (Ent-Sample) = 6.638387169
Sum (Ent-Theory) = 6.658211
Sum (Entropy-Sample||Theory) = 6.658211483
Sum (Entropy-Theory||Sample) = 6.716004
KL Divergence (Sample||Theory) = -0.07761672


[Figure: Histogram of F-1(p) for 10000 random numbers]

Figure 3. Histogram of F-1 for 10000 random numbers

Homework 4
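Table 4 shows the calculation of the event times. Time advances in steps of 0.01. At each
step a random number is drawn, and an event occurs (Event = 1) if the random number is
greater than 0.7. The event time is the time multiplied by the event indicator. The event
times are then sorted from smallest to largest, and the waiting time is the difference
between consecutive sorted event times.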
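The rule above can be sketched as a small simulation. This is a minimal sketch under stated assumptions rather than the original spreadsheet: it uses 30000 steps (times 0 to 299.99, as in Table 4), a fixed seed, and measures waiting times only between consecutive events, so the wait from time 0 to the first event is not included.

```python
import random

STEP = 0.01        # time step from the homework
N_STEPS = 30_000   # assumption: times 0.00 .. 299.99, as in Table 4
THRESHOLD = 0.7    # an event occurs when the random number exceeds this

rng = random.Random(1)  # fixed seed so the run is repeatable

# Collect the times at which an event occurs (already in ascending order).
event_times = []
for i in range(N_STEPS):
    t = i * STEP
    if rng.random() > THRESHOLD:
        event_times.append(t)

# Waiting time = gap between consecutive event times.
waiting_times = [
    later - earlier for earlier, later in zip(event_times, event_times[1:])
]

mean_wait = sum(waiting_times) / len(waiting_times)
print(f"{len(event_times)} events, mean waiting time {mean_wait:.4f}")
```

Because each step produces an event with probability 0.3, the mean waiting time should be close to 0.01 / 0.3, and the gaps follow a geometric distribution, which is the discrete-time counterpart of the exponential distribution.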

Table 4. Calculation Event time and Waiting time

time     Random Numbers   Event   Event Time   Event Time (Sorted)   Waiting Time
0        0.232811503      0       0            0                     -
0.01     0.89287821       1       0.01         0                     -
0.02     0.246071712      0       0            0                     -
0.03     0.8244096        1       0.03         0                     -
0.04     0.075598858      0       0            0                     -
0.05     0.091304668      0       0            0                     -
0.06     0.802989513      1       0.06         0                     -
…. Continue…
211.13   0.532507311      0       0            0                     -
211.14   0.365201755      0       0            0                     -
211.15   0.579132579      0       0            0.01                  0.01
211.16   0.214858859      0       0            0.03                  0.02
211.17   0.865632165      1       211.17       0.06                  0.03
211.18   0.706847578      1       211.18       0.08                  0.02
211.19   0.993396947      1       211.19       0.1                   0.02
211.2    0.506491226      0       0            0.11                  0.01
211.21   0.234051012      0       0            0.14                  0.03
211.22   0.410951593      0       0            0.15                  0.01
211.23   0.992500051      1       211.23       0.16                  0.01
211.24   0.07743126       0       0            0.2                   0.04
211.25   0.47865561       0       0            0.22                  0.02
211.26   0.557766826      0       0            0.24                  0.02
211.27   0.169395876      0       0            0.33                  0.09
… continue…
299.96   0.03868672       0       0            299.92                0.08
299.97   0.6751632        0       0            299.93                0.01
299.98   0.450127142      0       0            299.94                0.01
299.99   0.862123618      1       299.99       299.99                0.05

Table 5. Calculation for Histogram Waiting time


Bin Frequency
0.01 1515
0.012979 1074
0.015957 0
0.018936 0
0.021915 1888
0.024894 0
0.027872 0
… continue …
0.272128 0
0.275106 0
0.278085 0
0.281064 0
0.284043 0
0.287021 0
More 1
sum 8885


[Figure: Waiting Time (Exponential Dist.)]

Figure 4. Waiting Time (Exponential Dist.)


It can be seen that the histogram of waiting times has the shape of an Exponential
distribution. It summarizes the history of the events over time, and from it we can identify
how long the waiting time between events is likely to be.

Homework 5
1. Cohort definition
In statistics, marketing, and demographics, a cohort is a group of subjects who share a
defining characteristic (usually subjects who experience a common event in a certain
period of time, such as birth or graduation). In other words, a cohort is a group of people
who have something in common and can represent the source population, that is, the
population from which cases of disease arise. Examples include all employees in an office
building, everyone who attended a football game, and all the residents of a neighborhood.
2. Prospective definition
A prospective cohort study is a type of research in which exposure data (baseline) are
collected from subjects recruited before the outcome of interest has developed. The subjects
are then followed through time (into the future) to record when they develop the outcome.
Ways to follow up on research subjects include telephone interviews, face-to-face
interviews, physical exams, medical / laboratory tests, and questionnaires (source:
http://sphweb.bumc.bu.edu/otlt/MPH-Modules/EP/EP713_CohortStudies/EP713_CohortStudies_print.html).

An example of a prospective cohort study is a demographer who wants to measure all male
births in 2018. The demographer must wait until the period is over: 2018 must end before all
the necessary data are available (source:
http://www.statsref.com/HTML/index.html?cohort_studies2.html).
Figure 5 shows an illustration of a prospective cohort study.

Figure 5. The illustration of prospective cohort study

3. Retrospective definition
A retrospective study begins with subjects who are at risk of the outcome or illness of
interest and looks backward from the subjects' status at the start of the study into their
past to identify exposures. Retrospective studies use existing records: clinical records,
education records, birth certificates, death certificates, etc. This can be difficult
because the data needed for the research may not exist. These studies may also involve many
exposures, which can make the study difficult (source:
http://sphweb.bumc.bu.edu/otlt/MPH-Modules/EP/EP713_CohortStudies/EP713_CohortStudies_print.html).
An example of a retrospective cohort study is a demographer who examines a group of people
born in 1970 who have type 1 diabetes. The demographer starts by looking at historical data.
However, if the historical data are inadequate for inferring the source of type 1 diabetes,
the demographer's results will not be accurate (source:
http://sphweb.bumc.bu.edu/otlt/MPH-Modules/EP/EP713_CohortStudies/EP713_CohortStudies5.html).
Figure 6 shows an illustration of a retrospective cohort study.


Figure 6. The illustration of retrospective cohort study

4. Cross-sectional study definition


In medical research, social science, and biology, a cross-sectional study (also known as
cross-sectional analysis, transversal study, or prevalence study) is a type of observational
study that analyzes data from a population, or a representative part of it, at a particular
point in time, that is, cross-sectional data.
In economics, cross-sectional studies usually involve the use of cross-sectional regression
to sort out the existence and magnitude of the causal effects of one or more independent
variables on the dependent variable of interest at a particular point in time. They
differ from time series analyses, where the behavior of one or more economic aggregates
is tracked through time.

Figure 7. The illustration of cross-sectional study


Homework 6
In this homework, we want to see how a statistic can be manipulated. We can change the
correlation of a data set by changing just two data points, as shown in the tables below.

Table 6. Data sets, and the rank with their Correlation value
x             y          rank(x)   rank(y)
0.436553614   0.527864   113       84
0.529963464   0.149845   92        164
0.104397213   0.060304   176       185
0.226603218   0.79876    147       38
0.219025289   0.603033   147       69
0.155444842   0.628113   163       64
0.111213715   0.147148   170       162
0.460870931   0.985491   104       4
0.33675166    0.953943   128       10
0.811181187   0.546267   37        74
….. continue ….
0.358941984   0.971258   8         3
0.540433456   0.873975   7         3
0.577168944   0.572892   5         5
0.769720532   0.580178   4         4
0.783577894   0.103055   3         8
0.79122632    0.971289   2         2
0.571797046   0.182547   2         6
0.261096791   0.443769   3         3
0.894212032   0.653391   1         2
0.185947459   0.308999   2         2
0.263400431   0.997184   1         1
0.146175717   0.295671   1         1

Correlation (x, y) = -0.0697
Correlation (rank(x), rank(y)) = 0.393542

Table 7. Data sets, and the rank with their New Correlation value when we change two
sets of data
x             y          rank(x)   rank(y)
0.436553614   0.527864   113       84
0.529963464   0.149845   92        163
0.104397213   0.060304   175       185
0.226603218   0.79876    146       40
0.219025289   0.603033   146       70
…. Continue ….
0.783577894   0.103055   3         8
0.79122632    0.971289   2         4
0.571797046   0.182547   2         6
50            50         1         1
-50           -50        4         5
0.185947459   0.308999   2         3
0.263400431   0.997184   1         1
0.146175717   0.295671   1         2

Correlation (x, y) = 0.996321
Correlation (rank(x), rank(y)) = 0.401996

Tables 6 and 7 show that when two data points are replaced with one very large and one very
small value, with the same value for x and y in each row (50, 50 and -50, -50), the
correlation of the random numbers becomes much larger, rising from -0.0697 to 0.996321. The
correlation of the ranks, however, barely changes (from 0.393542 to 0.401996).

Table 8. Data sets, and the rank with their New Correlation value when we change two
sets of data
x             y          rank(x)   rank(y)
0.436553614   0.527864   113       84
0.529963464   0.149845   92        163
0.104397213   0.060304   175       184
0.226603218   0.79876    146       39
0.460870931   0.985491   104       5
….. continue …
0.783577894   0.103055   3         7
0.79122632    0.971289   2         3
0.571797046   0.182547   2         5
50            -50        1         6
-50           50         4         1
0.185947459   0.308999   2         2
0.263400431   0.997184   1         1
0.146175717   0.295671   1         1

Correlation (x, y) = -0.99665
Correlation (rank(x), rank(y)) = 0.401699

When the two replaced data points have opposite signs within each row (50, -50 and -50, 50),
the correlation of the random numbers becomes much smaller, falling from -0.0697 to
-0.99665. The correlation of the ranks again barely changes (from 0.393542 to 0.401699).
From this case, we can conclude that even though the correlation can be manipulated, the
rank correlation still lets us find that there is something wrong with the data.
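The effect can be reproduced with a short script. This is a minimal sketch under stated assumptions, not the original spreadsheet: it uses 200 random pairs and a fixed seed, and it ranks each value against the full data set (standard Spearman ranks), so the rank correlations will not match the values in Tables 6 to 8. The helper names are illustrative.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def ranks(values):
    """Rank 1 for the smallest value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def rank_correlation(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

rng = random.Random(2)  # fixed seed so the run is repeatable
x = [rng.random() for _ in range(200)]
y = [rng.random() for _ in range(200)]

# Replace the last two points with outliers, as in Tables 7 and 8.
x_out = x[:-2] + [50, -50]
y_same = y[:-2] + [50, -50]   # same sign in each row
y_opp = y[:-2] + [-50, 50]    # opposite sign in each row

for label, ys, xs in [("original", y, x), ("same sign", y_same, x_out),
                      ("opposite sign", y_opp, x_out)]:
    print(f"{label}: r = {pearson(xs, ys):.4f}, "
          f"rank r = {rank_correlation(xs, ys):.4f}")
```

Two extreme points dominate the sums of squares in the Pearson formula, while in the rank version they can only occupy the first and last ranks, which is why the rank correlation moves so little.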
