1)
a) Application: “HTS: A Threaded Multilevel Sparse Hybrid Solver”
b) Number of cores: 32
c) Scaling obtained: linear
d) speedup obtained: 20 times (5%)
2)
4)
a) Application: “Virtual-Link: A Scalable Multi-Producer Multi-Consumer Message Queue
Architecture for Cross-Core Communication”
b) Number of cores: 15
c) Scaling obtained: logarithmic
d) speedup obtained: 2.09x
5)
a) Application: “NVMe-CR: A Scalable Ephemeral Storage Runtime for
Checkpoint/Restart with NVMe-over-Fabrics”
b) Number of cores: 16
c) scaling obtained: linear
d) speedup obtained: 2x
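The speedup and core counts listed above can be compared through parallel efficiency, which is speedup divided by the number of cores. The sketch below is only an illustration: it assumes each reported speedup is measured against a single-core baseline, which the entries above do not state, and the function name parallel_efficiency is a hypothetical helper, not something taken from the papers.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Strong-scaling efficiency: speedup divided by the number of cores.

    A value of 1.0 corresponds to ideal (perfectly linear) scaling.
    """
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return speedup / cores


# (application, reported speedup, cores) as listed in the entries above.
# Assumption: every speedup is relative to a single-core run.
entries = [
    ("HTS", 20.0, 32),
    ("Virtual-Link", 2.09, 15),
    ("NVMe-CR", 2.0, 16),
]

for name, speedup, cores in entries:
    eff = parallel_efficiency(speedup, cores)
    print(f"{name}: speedup {speedup}x on {cores} cores, efficiency {eff:.3f}")
```

Under that assumption, an efficiency well below 1.0 suggests the reported speedup may be relative to a different baseline (for example another system rather than one core), so the figures are best read alongside the original papers.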
For weak scaling,
1)
a) Application: “HTS: A Threaded Multilevel Sparse Hybrid Solver”
b) Weak scaling methodology: the input size is doubled along with the number of cores
c) Number of cores up to which the results were shown: 16
d) comment on the weak scaling results: as the problem size increases, HTS
continues to scale.
2)
a) Application: “A scalable adaptive-matrix SPMV for heterogeneous architectures”
b) Weak scaling methodology: 488 DoFs per MPI process
c) Number of cores up to which the results were shown: 66
d) comment on the weak scaling results: 3x speedup in setup time and 1.5x speedup
in SpMV over PETSc-GPU
3)
a) Application: “PARallel Subgraph Enumeration in CUDA”
b) Weak scaling methodology: synthetic random geometric graphs (RGGs) are used;
a synthetic RGG at scale s has exactly 2^s nodes
c) Number of cores: 224
d) comment on the weak scaling results: the number of nodes increases in proportion
to the input size. Execution time grows linearly, as expected.
4)
a) Application: “MICCO: An Enhanced Multi-GPU Scheduling Framework for
Many-Body Correlation Functions”
b) Weak scaling methodology: tensor size increases by 128 each time
c) Number of cores up to which the results were shown: 256 (8 GPUs, each with
32 cores)
d) comment on the weak scaling results: GFLOPS increase in proportion to the tensor size
5)
a) Application: “NVMe-CR: A Scalable Ephemeral Storage Runtime for
Checkpoint/Restart with NVMe-over-Fabrics”
b) Weak scaling methodology used: 32K atoms per process
c) Number of cores: 16
d) comment on the weak scaling results: NVMe-CR achieves near perfect efficiency
(0.96 for checkpoint and 0.99 for recovery) at 448 processes
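The efficiency figures quoted for weak scaling follow the usual definition: the work per process is held fixed, and efficiency is the baseline run time divided by the run time at the larger scale. The sketch below illustrates that calculation. The timings and the process count are hypothetical placeholders, not measurements from any of the papers; only the 488 DoFs per process figure comes from entry 2 above.

```python
def weak_scaling_efficiency(t_base: float, t_scaled: float) -> float:
    """Weak-scaling efficiency: baseline time over scaled-run time.

    With the work per process held constant, 1.0 means the run time
    did not grow as processes (and total problem size) were added.
    """
    if t_base <= 0 or t_scaled <= 0:
        raise ValueError("timings must be positive")
    return t_base / t_scaled


def total_problem_size(work_per_process: int, processes: int) -> int:
    """Total work under weak scaling: fixed per-process work times process count."""
    return work_per_process * processes


# Hypothetical timings in seconds, for illustration only.
t_one_process = 100.0
t_many_processes = 104.0
print(f"weak-scaling efficiency: "
      f"{weak_scaling_efficiency(t_one_process, t_many_processes):.2f}")

# 488 DoFs per MPI process is from entry 2; 64 processes is a hypothetical count.
print(f"total DoFs: {total_problem_size(488, 64)}")
```

An efficiency close to 1.0, such as the 0.96 and 0.99 reported for NVMe-CR, means the run time stayed nearly flat as the process count and total problem size grew together.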