Pavel Darahakupets
Yihui Ni
MC_sim_par <- function(sigma, r, S, T, n_int, n_per_node, nodes = 1) {
  ## (function body lost in the export; only the console prompts survived)
}
MC_sim_par(sigma=0.3,r=0.01,S=100,T=3,n_int=100,n_per_node=100,nodes=1)
MC_sim_par(sigma=0.3,r=0.01,S=100,T=3,n_int=100,n_per_node=100,nodes=2)
MC_sim_par(sigma=0.3,r=0.01,S=100,T=3,n_int=100,n_per_node=100,nodes=3)
MC_sim_par(sigma=0.3,r=0.01,S=100,T=3,n_int=1000,n_per_node=100,nodes=1)
MC_sim_par(sigma=0.3,r=0.01,S=100,T=3,n_int=1000,n_per_node=100,nodes=2)
MC_sim_par(sigma=0.3,r=0.01,S=100,T=3,n_int=1000,n_per_node=100,nodes=3)
MC_sim_par(sigma=0.3,r=0.01,S=100,T=3,n_int=10000,n_per_node=100,nodes=1)
MC_sim_par(sigma=0.3,r=0.01,S=100,T=3,n_int=10000,n_per_node=100,nodes=2)
MC_sim_par(sigma=0.3,r=0.01,S=100,T=3,n_int=10000,n_per_node=100,nodes=3)
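The interactive definition of MC_sim_par did not survive the export, so the calls above cannot be reproduced directly. The following is a minimal sketch of what such a function could look like with mclapply: the geometric Brownian motion dynamics, the European call payoff, and the strike K = 60 are our assumptions, so the exact figures will differ from the original output.

```r
library(parallel)

## Sketch of a parallel Monte Carlo pricer (our reconstruction, not the
## original code). Each node simulates n_per_node GBM paths with n_int
## time steps and returns the discounted call payoffs; the payoffs of all
## nodes are pooled into a mean and a confidence interval.
MC_sim_par <- function(sigma, r, S, T, n_int, n_per_node, nodes = 1,
                       K = 60) {   # strike K is a hypothetical addition
  dt <- T / n_int
  one_node <- function(i) {
    replicate(n_per_node, {
      ## exact discretisation of one GBM path, summed over the n_int steps
      z   <- rnorm(n_int)
      S_T <- S * exp(sum((r - sigma^2 / 2) * dt + sigma * sqrt(dt) * z))
      exp(-r * T) * max(S_T - K, 0)   # discounted call payoff
    })
  }
  ## mclapply forks one R process per node (mc.cores)
  payoffs <- unlist(mclapply(seq_len(nodes), one_node, mc.cores = nodes))
  m  <- mean(payoffs)
  se <- sd(payoffs) / sqrt(length(payoffs))
  c(mean  = m,
    lower = m - qnorm(0.975) * se,
    upper = m + qnorm(0.975) * se)
}

MC_sim_par(sigma = 0.3, r = 0.01, S = 100, T = 3,
           n_int = 100, n_per_node = 100, nodes = 1)
```

mclapply relies on forking, which is not available on Windows; this is consistent with why the runs above had to be done on a Mac.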
int     nodes   Mean       CI lower (5%)   CI upper (5%)   User (s)   System (s)   Elapsed (s)
100     1       41.38311   36.38854        46.37768        0.086      0.001        0.087
100     2       42.06478   38.47619        45.65338        9.585      0.103        0.117
100     3       42.26371   39.22616        45.30126        0.406      0.080        0.128
1000    1       43.10869   37.89681        48.32058        0.091      0.003        0.094
1000    2       47.72830   43.88944        51.56717        0.107      0.024        0.124
1000    3       45.87276   42.90057        48.84496        0.311      0.063        0.134
10000   1       47.64042   42.18417        53.09667        0.091      0.002        0.093
10000   2       49.07065   45.32661        52.81468        0.221      0.047        0.121
10000   3       44.45592   41.43089        47.48094        0.298      0.052        0.116
As we can see, the total computation time increases with the number of nodes
(the one exception being the user time for int = 100, nodes = 2). This is
contrary to what we expected. However, since running on separate nodes is not
possible on Windows (at least not on our laptops), we ran the code on a Mac
laptop. The time required for the calculation is therefore certainly
influenced by the laptop we used, its CPU, and other technical factors. It may
even be more cumbersome for the laptop to divide the work across several nodes
than to calculate on a single one. All of this may have distorted our results.
Let's check whether this unusual pattern continues on the cluster.
require("parallel")
## number of paths to simulate
n <- 100
MC_sim_par <- function(c1, sigma, r, S, T, n_int, n_per_node, nodes = 1) {
  foo <- function(n, sigma, r, S, T, n_int) {
    undval <- rep(0, n_int + 1)
    undval[1] <- S
    ## ... (remainder of the function body and the cluster setup lost in the export)
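The cluster version of the function is likewise truncated in the transcript above. As a sketch of how the same simulation could be distributed with a PSOCK cluster and parLapply (again under our assumed GBM dynamics and call payoff; the strike K is hypothetical and the role of the original c1 argument is unknown):

```r
library(parallel)

## Worker function: simulate n_per_node GBM paths with n_int time steps and
## return the discounted call payoffs. Defined at top level so parLapply can
## ship it to the PSOCK workers cleanly. (Our reconstruction; the GBM/call
## payoff and the strike K are assumptions.)
sim_node <- function(i, sigma, r, S, T, n_int, n_per_node, K) {
  dt <- T / n_int
  replicate(n_per_node, {
    z   <- rnorm(n_int)
    S_T <- S * exp(sum((r - sigma^2 / 2) * dt + sigma * sqrt(dt) * z))
    exp(-r * T) * max(S_T - K, 0)
  })
}

MC_sim_cluster <- function(sigma, r, S, T, n_int, n_per_node, nodes = 1,
                           K = 60) {
  cl <- makeCluster(nodes)   # one PSOCK worker ("slave") per node
  on.exit(stopCluster(cl))   # always shut the workers down again
  ## one task per worker; all inputs are passed as explicit arguments
  payoffs <- unlist(parLapply(cl, seq_len(nodes), sim_node,
                              sigma = sigma, r = r, S = S, T = T,
                              n_int = n_int, n_per_node = n_per_node, K = K))
  m  <- mean(payoffs)
  se <- sd(payoffs) / sqrt(length(payoffs))
  c(mean  = m,
    lower = m - qnorm(0.975) * se,
    upper = m + qnorm(0.975) * se)
}
```

Unlike the fork-based mclapply version, a PSOCK cluster also works on Windows, since the workers are separate R processes connected over sockets.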
We ran the code separately with int = 100, 1000, 10000 and slots (slaves) = 1, 2, 3.
int     Slaves   Mean       CI lower (5%)   CI upper (5%)   Resources: sec. spent
100     1        42.53960   42.01988        43.05933        2.07
100     2        42.28161   41.76808        42.79513        2.08
100     3        42.19120   41.67619        42.70621        2.08
1000    1        44.70104   44.18664        45.21544        2.08
1000    2        44.81934   44.30977        45.32891        2.07
1000    3        45.09298   44.58342        45.60254        2.07
10000   1        45.87621   45.35819        46.39423        2.07
10000   2        46.11989   45.59478        46.64499        2.09
10000   3        45.82222   45.29974        46.34470        2.08
We can clearly see that adding additional slaves (nodes) decreases the elapsed
time spent on the calculations. This is what we would expect, and it differs
from Case 1.
With the cluster, the range of the overall means is narrower than for the
calculations on the laptop:
mclapply (laptop): min. 41.38311 / max. 49.07065 (range 7.68754)
parLapply (cluster): min. 42.19120 / max. 46.11989 (range 3.92869)
The ranges of the upper and lower confidence bounds are likewise smaller on
the cluster than on the laptop. Calculating on the cluster is evidently more
accurate: for each value of int (100, 1000, 10000) the estimated means should
be similar, and this is only the case on the cluster.
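The quoted ranges can be verified directly from the nine estimated means in the two tables:

```r
## nine estimated means from the laptop (mclapply) and cluster (parLapply) runs
laptop  <- c(41.38311, 42.06478, 42.26371, 43.10869, 47.72830,
             45.87276, 47.64042, 49.07065, 44.45592)
cluster <- c(42.53960, 42.28161, 42.19120, 44.70104, 44.81934,
             45.09298, 45.87621, 46.11989, 45.82222)
diff(range(laptop))    # 7.68754
diff(range(cluster))   # 3.92869
```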
Wall-clock time used per cluster run (d:hh:mm:ss), in the same run order as
the table above:

int     Slaves   wall time used   elapsed
100     1        0:00:00:11       0:00:00:11
100     2        0:00:00:06       0:00:00:06
100     3        0:00:00:05       0:00:00:05
1000    1        0:00:01:40       0:00:01:40
1000    2        0:00:00:56       0:00:00:56
1000    3        0:00:00:37       0:00:00:37
10000   1        0:00:17:19       0:00:17:19
10000   2        0:00:09:14       0:00:09:14
10000   3        0:00:05:48       0:00:05:48