Homework Day 2 - Mahaputra

Name : Mahaputra Madani Senen
Risk Management & System Safety

Student No. : M10801850 Homework Day 2
Due date : 16/12/2019
Homework 1
Table 1. Probability Comparison
No.   p      Info(p)       Entropy(p)   q      Info(q)       Entropy(q)
1     1/12   3.584962501   0.298747     1/12   3.584962501   0.298747
2     1/12   3.584962501   0.298747     1/12   3.584962501   0.298747
3     1/12   3.584962501   0.298747     1/6    2.584962501   0.430827
4     1/12   3.584962501   0.298747     1/6    2.584962501   0.430827
5     1/12   3.584962501   0.298747     1/12   3.584962501   0.298747
6     1/12   3.584962501   0.298747     1/12   3.584962501   0.298747
7     1/12   3.584962501   0.298747     1/12   3.584962501   0.298747
8     1/12   3.584962501   0.298747     1/24   4.584962501   0.191040
9     1/12   3.584962501   0.298747     1/24   4.584962501   0.191040
10    1/12   3.584962501   0.298747     1/24   4.584962501   0.191040
11    1/12   3.584962501   0.298747     1/24   4.584962501   0.191040
12    1/12   3.584962501   0.298747     1/12   3.584962501   0.298747

Sum(p) = 1    Sum (Entropy-p) = 3.584963    Sum(q) = 1    Sum (Entropy-q) = 3.418296

Sum (Entropy-p||q) = 3.751629    Sum (Entropy-q||p) = 3.584963

KL Divergence (p||q) = -0.33333
[Chart "Probability Comparison": series p and q; horizontal axis 0 to 14, vertical axis 0 to 9/50]
Figure 1. Probability Comparison Graph

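As shown in Table 1 and Figure 1, the two distributions disagree for outcomes 3, 4, and 8 to
11. The K-L Divergence calculation in Table 1 gives -0.33333. This value equals
Sum (Entropy-q) - Sum (Entropy-p||q) = 3.418296 - 3.751629. A K-L divergence cannot be
negative; with the standard definition D(p||q) = Sum (Entropy-p||q) - Sum (Entropy-p), the
sums in Table 1 give 3.751629 - 3.584963 = 0.166666.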
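The Table 1 quantities can be recomputed with a few lines of Python. This is a minimal sketch, assuming base-2 logarithms (which match the Info values in the table); the function names are illustrative and not part of the original spreadsheet.

```python
import math

# Distributions from Table 1.
p = [1 / 12] * 12
q = [1 / 12, 1 / 12, 1 / 6, 1 / 6, 1 / 12, 1 / 12,
     1 / 12, 1 / 24, 1 / 24, 1 / 24, 1 / 24, 1 / 12]


def info(prob):
    """Information content in bits: -log2(prob)."""
    return -math.log2(prob)


def entropy(dist):
    """Shannon entropy: sum of prob * Info(prob)."""
    return sum(x * info(x) for x in dist)


def cross_entropy(a, b):
    """Cross entropy of a relative to b: sum of a_i * Info(b_i)."""
    return sum(x * info(y) for x, y in zip(a, b))


def kl_divergence(a, b):
    """Standard definition: D(a||b) = cross_entropy(a, b) - entropy(a)."""
    return cross_entropy(a, b) - entropy(a)


print(f"Entropy(p)       = {entropy(p):.6f}")
print(f"Entropy(q)       = {entropy(q):.6f}")
print(f"Entropy(p||q)    = {cross_entropy(p, q):.6f}")
print(f"Entropy(q||p)    = {cross_entropy(q, p):.6f}")
print(f"KL(p||q)         = {kl_divergence(p, q):.6f}")
```

With this definition the divergence is zero only when the two distributions are identical and positive otherwise.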
Homework 2
Table 2. The Entropy Calculation Result
Head Tail Info(H) Info(T) Entropy
0.0001 0.9999 13.28771 0.000144 0.001473
0.01 0.99 6.643856 0.0145 0.080793
0.02 0.98 5.643856 0.029146 0.141441
0.03 0.97 5.058894 0.043943 0.194392
0.04 0.96 4.643856 0.058894 0.242292
0.05 0.95 4.321928 0.074001 0.286397
……….. Continue until…
0.48 0.52 1.058894 0.943416 0.998846
0.49 0.51 1.029146 0.971431 0.999711
0.50 0.5 1 1 1
0.51 0.49 0.971431 1.029146 0.999711
0.52 0.48 0.943416 1.058894 0.998846
0.53 0.47 0.915936 1.089267 0.997402
0.54 0.46 0.888969 1.120294 0.995378
0.55 0.45 0.862496 1.152003 0.992774
0.56 0.44 0.836501 1.184425 0.989588
0.57 0.43 0.810966 1.217591 0.985815
0.58 0.42 0.785875 1.251539 0.981454
0.59 0.41 0.761213 1.286304 0.9765
0.60 0.4 0.736966 1.321928 0.970951
0.61 0.39 0.713119 1.358454 0.9648
0.62 0.38 0.68966 1.395929 0.958042
……….. Continue until…
0.94 0.06 0.089267 4.058894 0.327445
0.95 0.05 0.074001 4.321928 0.286397
0.96 0.04 0.058894 4.643856 0.242292
0.97 0.03 0.043943 5.058894 0.194392
0.98 0.02 0.029146 5.643856 0.141441
0.99 0.01 0.0145 6.643856 0.080793
0.9999 0.0001 0.000144 13.28771 0.001473
Based on Table 2, a probability of 0.5 gives the highest uncertainty: the entropy reaches its
maximum value of 1 there and decreases symmetrically as the probability moves toward 0 or 1.
[Chart "ENTROPY": entropy (vertical axis, 0 to 1.2) against the row index of Table 2 (horizontal axis, 1 to 101)]
Figure 2. The Entropy Result Graph
As shown in Figure 2, the entropy is highest when Head and Tail have equal probability
(0.5 each). It can be interpreted that when there are several potential events and each event
has the same probability (uniform), the risk taker will find it difficult to decide, since
they cannot guess which event is more likely to happen.
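The rows of Table 2 can be reproduced with the short sketch below. It assumes base-2 logarithms, as the Info values in the table suggest; the function name `binary_entropy` and the 0.01 grid are illustrative choices.

```python
import math


def binary_entropy(p_head):
    """Entropy in bits of a coin with P(Head) = p_head and P(Tail) = 1 - p_head."""
    p_tail = 1 - p_head
    if p_head <= 0 or p_tail <= 0:
        return 0.0  # a certain outcome carries no uncertainty
    return -(p_head * math.log2(p_head) + p_tail * math.log2(p_tail))


# Probabilities 0.01, 0.02, ..., 0.99.
grid = [i / 100 for i in range(1, 100)]
best = max(grid, key=binary_entropy)

print("Head  Tail  Entropy")
for p_head in (0.01, 0.25, 0.49, 0.50, 0.51, 0.75, 0.99):
    print(f"{p_head:.2f}  {1 - p_head:.2f}  {binary_entropy(p_head):.6f}")
print(f"Maximum entropy on the grid is at P(Head) = {best}")
```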
Homework 3
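In this case, we set a = 2 and b = 4. We generated 10000 random numbers p and calculated
the inverse CDF, F^-1, by using the following formula:
F^-1(p) = a + p (b - a)
We then obtained a histogram plot based on the calculation result. Furthermore, we calculated
the PDF for the sample and for the theory, and the KL-Divergence, as shown in Table 3. Table 3
reports a KL-Divergence of -0.07761672. This value equals Sum (Ent-Sample) - Sum (Entropy
-Theory||Sample) = 6.638387169 - 6.716004. A K-L divergence cannot be negative; with the
standard definition D(Sample||Theory) = Sum (Entropy-Sample||Theory) - Sum (Ent-Sample), the
sums in Table 3 give 6.658211483 - 6.638387169 = 0.019824. Figure 3 shows the histogram of
F^-1(p) for the 10000 random numbers.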
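The procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the original spreadsheet: it uses 100 equal-width bins on [a, b] (the table appears to use 101 bins including "More"), a fixed seed, and the standard definition D(Sample||Theory) = sum of p_i * log2(p_i / q_i).

```python
import math
import random

A, B = 2.0, 4.0   # a and b from the text
N = 10_000        # number of random numbers
BINS = 100        # assumption: equal-width bins on [A, B]


def inverse_cdf(p, a=A, b=B):
    """Inverse CDF of the uniform distribution on [a, b]."""
    return a + p * (b - a)


rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [inverse_cdf(rng.random()) for _ in range(N)]

# Histogram: count how many samples fall in each bin.
width = (B - A) / BINS
counts = [0] * BINS
for s in samples:
    index = min(int((s - A) / width), BINS - 1)
    counts[index] += 1

sample_pdf = [c / N for c in counts]
theory_pdf = [1 / BINS] * BINS  # uniform: every bin is equally likely

# K-L divergence in bits; empty bins contribute nothing to the sum.
kl = sum(p * math.log2(p / q)
         for p, q in zip(sample_pdf, theory_pdf) if p > 0)

print(f"Sum(Freq) = {sum(counts)}")
print(f"KL(Sample||Theory) = {kl:.6f}")
```

The divergence is small and positive, which is what one expects when the sample follows the theoretical uniform distribution closely.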
Table 3. The F^-1 calculation result
Bin           Frequency   PDF (Sample)   PDF (Theory)   Info (Sample)   Info (Theory)
2.000200506   1           0.0001         0.00990099     13.28771238     6.658211483
2.02019745    96          0.0096         0.00990099     6.702749879     6.658211483
2.040194395   95          0.0095         0.00990099     6.717856771     6.658211483
2.060191339   98          0.0098         0.00990099     6.673002535     6.658211483
2.080188284   94          0.0094         0.00990099     6.733123528     6.658211483
2.100185228   106         0.0106         0.00990099     6.559791925     6.658211483
2.120182173   109         0.0109         0.00990099     6.519528055     6.658211483
2.140179117   99          0.0099         0.00990099     6.658355759     6.658211483
2.160176062   90          0.009          0.00990099     6.795859283     6.658211483
……. Continue …..
2.640102731   100         0.01           0.00990099     6.64385619      6.658211483
2.660099676   108         0.0108         0.00990099     6.532824877     6.658211483
2.68009662    117         0.0117         0.00990099     6.41734766      6.658211483
2.700093565   104         0.0104         0.00990099     6.587272661     6.658211483
2.720090509   115         0.0115         0.00990099     6.442222329     6.658211483
2.740087454   101         0.0101         0.00990099     6.629500897     6.658211483
2.760084398   99          0.0099         0.00990099     6.658355759     6.658211483
2.780081343   87          0.0087         0.00990099     6.844768884     6.658211483
2.800078287   100         0.01           0.00990099     6.64385619      6.658211483
2.820075232   108         0.0108         0.00990099     6.532824877     6.658211483
2.840072176   115         0.0115         0.00990099     6.442222329     6.658211483
2.860069121   112         0.0112         0.00990099     6.480357457     6.658211483
2.880066066   102         0.0102         0.00990099     6.615287038     6.658211483
2.90006301    104         0.0104         0.00990099     6.587272661     6.658211483
2.920059955   95          0.0095         0.00990099     6.717856771     6.658211483
…. Continue …
3.939904126   99          0.0099         0.00990099     6.658355759     6.658211483
3.959901071   88          0.0088         0.00990099     6.828280761     6.658211483
3.979898015   99          0.0099         0.00990099     6.658355759     6.658211483
More          101         0.0101         0.00990099     6.629500897     6.658211483

Sum (Freq) = 10000
Sum (Ent-Sample) = 6.638387169    Sum (Ent-Theory) = 6.658211

Sum (Entropy-Sample||Theory) = 6.658211483    Sum (Entropy-Theory||Sample) = 6.716004

KL Divergence (Sample||Theory) = -0.07761672
Figure 3. Histogram of F-1 for 10000 random numbers
Homework 4
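Table 4 shows the calculation for determining the event time. Time advances in steps of 0.01.
At each step a random number is drawn: if it is greater than 0.7 there is an event (Event = 1),
otherwise there is none (Event = 0). The event time is the time multiplied by the event. The
event times are then sorted from smallest to largest, and the waiting time is the difference
between consecutive sorted event times.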
Table 4. Calculation of Event Time and Waiting Time

Time     Random Number   Event   Event Time   Event Time (Sorted)   Waiting Time
0        0.232811503     0       0            0
0.01     0.89287821      1       0.01         0
0.02     0.246071712     0       0            0
0.03     0.8244096       1       0.03         0
0.04     0.075598858     0       0            0
0.05     0.091304668     0       0            0
0.06     0.802989513     1       0.06         0
…. Continue…
211.13   0.532507311     0       0            0
211.14   0.365201755     0       0            0
211.15   0.579132579     0       0            0.01                  0.01
211.16   0.214858859     0       0            0.03                  0.02
211.17   0.865632165     1       211.17       0.06                  0.03
211.18   0.706847578     1       211.18       0.08                  0.02
211.19   0.993396947     1       211.19       0.1                   0.02
211.2    0.506491226     0       0            0.11                  0.01
211.21   0.234051012     0       0            0.14                  0.03
211.22   0.410951593     0       0            0.15                  0.01
211.23   0.992500051     1       211.23       0.16                  0.01
211.24   0.07743126      0       0            0.2                   0.04
211.25   0.47865561      0       0            0.22                  0.02
211.26   0.557766826     0       0            0.24                  0.02
211.27   0.169395876     0       0            0.33                  0.09
… continue…
299.96   0.03868672      0       0            299.92                0.08
299.97   0.6751632       0       0            299.93                0.01
299.98   0.450127142     0       0            299.94                0.01
299.99   0.862123618     1       299.99       299.99                0.05
Table 5. Calculation for Histogram Waiting time

Bin Frequency
0.01 1515
0.012979 1074
0.015957 0
0.018936 0
0.021915 1888
0.024894 0
0.027872 0
… continue …
0.272128 0
0.275106 0
0.278085 0
0.281064 0
0.284043 0
0.287021 0
More 1
sum 8885
Figure 4. Waiting Time (Exponential Dist.)

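The histogram of the waiting times has the shape of an exponential distribution: short waiting
times are the most frequent, and the frequency falls off as the waiting time grows. From this
history of event times we can identify how long the waiting time between events usually is.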
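The simulation can be sketched in Python as follows. It is a minimal illustration, assuming 30000 steps of 0.01 (times 0 to 299.99, as in Table 4), a fixed seed, and waiting times taken between consecutive events; step counts are kept as integers to avoid floating-point drift, and the variable names are illustrative.

```python
import random
from collections import Counter

STEP = 0.01        # time step from the text
N_STEPS = 30_000   # assumption: times 0, 0.01, ..., 299.99
THRESHOLD = 0.7    # an event occurs when the random number exceeds this

rng = random.Random(1)  # fixed seed so the run is repeatable

# Indices of the time steps at which an event occurs.
event_steps = [i for i in range(N_STEPS) if rng.random() > THRESHOLD]
event_times = [i * STEP for i in event_steps]

# Waiting time = gap between consecutive events (already in sorted order).
waiting_steps = [later - earlier
                 for earlier, later in zip(event_steps, event_steps[1:])]
waiting_times = [w * STEP for w in waiting_steps]

# Histogram of waiting times, keyed by the number of steps waited.
histogram = Counter(waiting_steps)
mean_wait = sum(waiting_times) / len(waiting_times)

print(f"Number of events: {len(event_steps)}")
print(f"Mean waiting time: {mean_wait:.4f}")
for steps in sorted(histogram)[:8]:
    print(f"{steps * STEP:.2f}  {histogram[steps]}")
```

Because time is discrete here, the waiting times follow a geometric distribution, which approximates the exponential shape as the step becomes small. The mean waiting time should be close to 0.01 / 0.3, since each step has a 0.3 chance of an event.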
Homework 5
1. Cohort definition
In statistics, marketing, and demographics, a cohort is a group of subjects who share a
defining characteristic (usually subjects who experience a common event in a certain
period of time, such as birth or graduation). In other words, a cohort is a group of people
who have something in common and who can represent the source population, that is, the
population from which cases of disease arise. Examples include all employees in an office
building, everyone who attended a football game, and all the residents of a neighborhood.
2. Prospective definition
A prospective cohort study is a type of research in which exposure data (baseline) are
collected from subjects recruited before the outcome of interest has developed. The subjects
are then followed through time (into the future) to record when they develop the outcome.
Ways to follow up on research subjects include telephone interviews, face-to-face
interviews, physical exams, medical / laboratory tests, and mailed questionnaires (source:
http://sphweb.bumc.bu.edu/otlt/MPH-Modules/EP/EP713_CohortStudies/EP713_CohortStudies_print.html).
An example of a prospective cohort study: a demographer wants to measure all male births
in 2018. The demographer must wait until the event is over; 2018 must end before all the
necessary data are available (source:
http://www.statsref.com/HTML/index.html?cohort_studies2.html).
Figure 5 shows the illustration of prospective cohort study.
Figure 5. The illustration of prospective cohort study
3. Retrospective definition
Retrospective studies begin with subjects who are at risk of the outcome or illness of
interest, and work backward from the subjects' status when the study began into their past
to identify exposures. Retrospective studies use records: clinical, education, birth
certificate, death certificate, etc. This can be difficult because the data needed may not
exist for a study that is only now starting. These studies may also involve many exposures,
which can make the study difficult (source:
http://sphweb.bumc.bu.edu/otlt/MPH-Modules/EP/EP713_CohortStudies/EP713_CohortStudies_print.html).
An example of a retrospective cohort study: a demographer examines a group of people born in
1970 who have type 1 diabetes. The demographer starts by looking at historical data. However,
if the demographer relies on inadequate data in an attempt to infer the source of type 1
diabetes, the results will not be accurate (source:
http://sphweb.bumc.bu.edu/otlt/MPH-Modules/EP/EP713_CohortStudies/EP713_CohortStudies5.html).
Figure 6 shows the illustration of retrospective cohort study.
Figure 6. The illustration of retrospective cohort study
4. Cross-sectional study definition

In medical research, social science, and biology, cross-sectional studies (also known as
cross-sectional analysis, transversal studies, or prevalence studies) are a type of
observational study that analyzes data from a population, or a representative subset, at a
particular point in time, that is, cross-sectional data.
In economics, cross-sectional studies usually involve the use of cross-sectional regression
to sort out the existence and magnitude of the causal effects of one or more independent
variables on the dependent variable of interest at a particular point in time. They
differ from time series analyses, where the behavior of one or more economic aggregates
is tracked through time.
Figure 7. The illustration of cross-sectional study
Homework 6
In this homework, we want to see how a statistic can be manipulated. We can change the
correlation of a data set just by changing two data points, as shown in the tables below.
Table 6. Data sets, and the rank with their Correlation value
x             y          rank(x)   rank(y)
0.436553614   0.527864   113       84
0.529963464   0.149845   92        164
0.104397213   0.060304   176       185
0.226603218   0.79876    147       38
0.219025289   0.603033   147       69
0.155444842   0.628113   163       64
0.111213715   0.147148   170       162
0.460870931   0.985491   104       4
0.33675166    0.953943   128       10
0.811181187   0.546267   37        74
….. continue ….
0.358941984   0.971258   8         3
0.540433456   0.873975   7         3
0.577168944   0.572892   5         5
0.769720532   0.580178   4         4
0.783577894   0.103055   3         8
0.79122632    0.971289   2         2
0.571797046   0.182547   2         6
0.261096791   0.443769   3         3
0.894212032   0.653391   1         2
0.185947459   0.308999   2         2
0.263400431   0.997184   1         1
0.146175717   0.295671   1         1
Correlation (x, y) = -0.0697    Correlation (rank) = 0.393542
Table 7. Data sets, and the rank with their New Correlation value when we change two
sets of data
x             y          rank(x)   rank(y)
0.436553614   0.527864   113       84
0.529963464   0.149845   92        163
0.104397213   0.060304   175       185
0.226603218   0.79876    146       40
0.219025289   0.603033   146       70
…. Continue ….
0.783577894   0.103055   3         8
0.79122632    0.971289   2         4
0.571797046   0.182547   2         6
50            50         1         1
-50           -50        4         5
0.185947459   0.308999   2         3
0.263400431   0.997184   1         1
0.146175717   0.295671   1         2
Correlation (x, y) = 0.996321    Correlation (rank) = 0.401996
Tables 6 and 7 show that when we replace two data points with one very large and one very
small number, using the same value for x and y in each row (50, 50 and -50, -50), the
correlation of the raw values becomes much larger, from -0.0697 to 0.996321, whereas the
correlation of the ranks does not change significantly (from 0.393542 to 0.401996).
Table 8. Data sets, and the rank with their New Correlation value when we change two
sets of data
x             y          rank(x)   rank(y)
0.436553614   0.527864   113       84
0.529963464   0.149845   92        163
0.104397213   0.060304   175       184
0.226603218   0.79876    146       39
0.460870931   0.985491   104       5
….. continue …
0.783577894   0.103055   3         7
0.79122632    0.971289   2         3
0.571797046   0.182547   2         5
50            -50        1         6
-50           50         4         1
0.185947459   0.308999   2         2
0.263400431   0.997184   1         1
0.146175717   0.295671   1         1

Correlation (x, y) = -0.99665    Correlation (rank) = 0.401699
When the two replaced data points have opposite signs within each row (50, -50 and -50, 50),
the correlation of the raw values becomes much smaller than before, from -0.0697 to
-0.99665, whereas the correlation of the ranks again does not change significantly (from
0.393542 to 0.401699). From this case, we can conclude that even though a statistic can be
manipulated, a check such as the rank correlation can still reveal that there is something
wrong with the data.
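The effect can be reproduced on synthetic data. The sketch below is an illustration, not the original spreadsheet: it generates 200 hypothetical uniform random pairs with a fixed seed, replaces the last two pairs as in Tables 7 and 8, and compares the correlation of the raw values with the correlation of the ranks. Ranks are computed over the whole column and ties are assumed not to occur.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank 1 = largest value, over the whole column (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def rank_correlation(xs, ys):
    """Correlation of the ranks (Spearman's coefficient when there are no ties)."""
    return pearson(ranks(xs), ranks(ys))


rng = random.Random(0)  # fixed seed; the data are hypothetical
x = [rng.random() for _ in range(200)]
y = [rng.random() for _ in range(200)]

# Replace the last two pairs: same sign per row, then opposite sign per row.
x_same, y_same = x[:-2] + [50, -50], y[:-2] + [50, -50]
x_opp, y_opp = x[:-2] + [50, -50], y[:-2] + [-50, 50]

for label, xs, ys in [("original", x, y),
                      ("same sign", x_same, y_same),
                      ("opposite sign", x_opp, y_opp)]:
    print(f"{label:14s} raw = {pearson(xs, ys):+.4f}   "
          f"rank = {rank_correlation(xs, ys):+.4f}")
```

Two extreme points dominate the sums of squares in the raw correlation, while in the rank correlation they can move at most to the top or bottom rank, so their influence stays bounded.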