COL380: Parallel and Distributed Programming (CSE, IITD, Semester-II-2020-21) Assignment-1

Assignment by: Pratik Pranav - 2018CS10368

Introduction
In this assignment we had to implement a program that calculates the sum of the first N natural numbers,
and to propose two methods for doing so using parallel computation. This report discusses the two
implementations, their timing results and the observations drawn from them. The table below gives the
execution time of the sequential approach for varying numbers of threads and limits. In all the tables
presented below, each data point is the median of 30 different runs of that particular program.

Time Taken in ms
Limit \ Threads        1        2        4        8
1e3                0.004    0.004    0.004    0.004
1e5                0.331    0.346    0.306    0.267
1e7                24.24     25.4     25.2     25.3
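
The measurement procedure described above (median of 30 runs) could be sketched roughly as follows. This is only an illustration, not the harness actually used for the tables: the names sum_sequential, median and median_time_ms are hypothetical, and the C11 timespec_get wall clock is an assumed choice of timer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Sequential baseline: sum of the first n natural numbers. */
unsigned long long sum_sequential(unsigned long long n) {
    unsigned long long sum = 0;
    for (unsigned long long i = 1; i <= n; i++)
        sum += i;
    return sum;
}

/* qsort comparator for doubles. */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n samples; sorts a private copy so the input is untouched. */
double median(const double *samples, size_t n) {
    assert(n > 0);
    double *copy = malloc(n * sizeof *copy);
    assert(copy != NULL);
    memcpy(copy, samples, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_double);
    double m = (n % 2) ? copy[n / 2]
                       : 0.5 * (copy[n / 2 - 1] + copy[n / 2]);
    free(copy);
    return m;
}

/* Difference between two wall-clock readings, in milliseconds. */
static double elapsed_ms(struct timespec start, struct timespec end) {
    return (double)(end.tv_sec - start.tv_sec) * 1e3
         + (double)(end.tv_nsec - start.tv_nsec) / 1e6;
}

/* Times `runs` calls of sum_sequential(n) and returns the median in ms.
   Assumption: timespec_get(TIME_UTC) is precise enough; it is a wall
   clock and is not guaranteed to be monotonic. */
double median_time_ms(unsigned long long n, size_t runs) {
    assert(runs > 0);
    double *t = malloc(runs * sizeof *t);
    assert(t != NULL);
    volatile unsigned long long sink = 0; /* keeps the call from being dropped */
    for (size_t r = 0; r < runs; r++) {
        struct timespec start, end;
        timespec_get(&start, TIME_UTC);
        sink = sum_sequential(n);
        timespec_get(&end, TIME_UTC);
        t[r] = elapsed_ms(start, end);
    }
    (void)sink;
    double m = median(t, runs);
    free(t);
    return m;
}
```

Calling median_time_ms(10000000ULL, 30) would correspond to one cell of the 1e7 row. The median is less sensitive than the mean to occasional slow runs.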

Approach 1
Description
In the first approach, I parallelized the task with the #pragma omp for construct, dynamically allocating
the numbers to be added among the threads. Each thread adds its numbers to a global variable, and the
final result is the sum of the global variables obtained by the different threads. Each entry in the
table is averaged over 1000 runs.
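
One possible reading of this description is sketched below. The original source code is not reproduced in this report, so the function name sum_dynamic, the per-thread accumulator local and the use of schedule(dynamic) with its default chunk size are assumptions made for illustration only.

```c
#include <assert.h>

/* Sum of 1..n using an OpenMP work-sharing loop with dynamic scheduling.
   Hypothetical sketch: each thread accumulates into its own partial sum,
   and the partial sums are then combined into the shared total. */
unsigned long long sum_dynamic(long long n) {
    assert(n >= 0);
    unsigned long long total = 0;        /* shared result */

    #pragma omp parallel
    {
        unsigned long long local = 0;    /* per-thread partial sum */

        /* Iterations are handed out to threads on demand. */
        #pragma omp for schedule(dynamic)
        for (long long i = 1; i <= n; i++)
            local += (unsigned long long)i;

        /* Combine the partial sums one thread at a time. */
        #pragma omp atomic
        total += local;
    }
    return total;
}
```

With GCC this needs the -fopenmp flag; without it the pragmas are ignored and the function runs sequentially with the same result. Under schedule(dynamic) the default chunk size is one iteration, so every single addition involves a scheduling step, which would be consistent with the slow-downs in the table below.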

Time Taken in ms
Limit \ Threads        1        2        4        8
1e3                0.008    0.074      0.1    2.957
1e5                0.334    0.534    0.555    2.071
1e7                25.48     32.0     34.0     28.1

Speed-up
Limit \ Threads        1        2        4        8
1e3                    1    0.054     0.04    0.001
1e5                    1    0.647    0.551    0.128
1e7                    1    0.793     0.74     0.90

Efficiency
Limit \ Threads        1        2        4        8
1e3                    1    0.027     0.01  0.00012
1e5                    1    0.323    0.137    0.016
1e7                    1    0.396    0.185    0.112
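
The speed-up and efficiency entries for 2, 4 and 8 threads appear consistent with dividing the sequential time by the parallel time measured at the same thread count, and then dividing that speed-up by the number of threads (the 1-thread column is simply reported as 1). A minimal sketch of that arithmetic, with hypothetical function names:

```c
#include <assert.h>

/* Speed-up: sequential time divided by parallel time (same units). */
double speedup(double sequential_ms, double parallel_ms) {
    assert(parallel_ms > 0.0);
    return sequential_ms / parallel_ms;
}

/* Efficiency: speed-up per thread. */
double efficiency(double speedup_value, int threads) {
    assert(threads > 0);
    return speedup_value / threads;
}
```

For the 1e7 limit with 2 threads, speedup(25.4, 32.0) is about 0.794 and efficiency(0.794, 2) is about 0.397, which matches the tabulated 0.793 and 0.396 up to rounding.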

Approach 2
In the second approach, I implemented the tree approach discussed in class to find the final sum. I first
add the element at index i + N/2 to the element at index i and store the sum at index i, then recurse
with the same approach on the array of the first N/2 elements.
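
The halving scheme described above can be sketched as an iterative loop. This is an illustrative version rather than the submitted code: the name tree_sum is hypothetical, the recursion is written as a while loop, and the handling of odd lengths (leaving the unpaired middle element in place) is an assumption, since the description only covers the even case.

```c
#include <assert.h>
#include <stddef.h>

/* Tree-style reduction: repeatedly fold the upper half of the array
   onto the lower half until one element remains. Overwrites a[]. */
unsigned long long tree_sum(unsigned long long *a, size_t n) {
    assert(a != NULL && n > 0);
    size_t len = n;
    while (len > 1) {
        size_t half = (len + 1) / 2;  /* length of the surviving prefix */
        size_t pairs = len / 2;       /* elements that have a partner */

        /* Each addition touches a distinct index i, so the iterations
           are independent and can be shared among threads. */
        #pragma omp parallel for
        for (size_t i = 0; i < pairs; i++)
            a[i] += a[i + half];

        len = half;                   /* continue on the first half */
    }
    return a[0];
}
```

For the array {1, 2, 3, 4, 5} the passes leave 5, then 7, then 15 in a[0]. Each pass ends with an implicit barrier, so there are about log2(N) synchronization points in total.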


Execution Time for Approach 2


Time Taken in ms
Limit \ Threads        1        2        4        8
1e3                0.016    0.022    0.022    0.017
1e5                0.444    0.289    0.156    0.121
1e7                 33.5     19.8     16.8     16.6

Speed-up
Limit \ Threads        1        2        4        8
1e3                    1     0.18     0.18     0.23
1e5                    1     1.19     1.96     2.20
1e7                    1     1.28     1.51     1.52

Efficiency
Limit \ Threads        1        2        4        8
1e3                    1     0.09    0.045    0.028
1e5                    1    0.595    0.490    0.275
1e7                    1     0.64    0.376     0.19

Observation
• In line with Amdahl's law, the efficiency values decrease steadily for both approaches as the number
of threads grows: regardless of how much the resources are improved, the theoretical speed-up is
always limited by the part of the task that cannot benefit from the improvement.

• In Approach 1, the time taken does not improve as the number of threads increases, because the
overhead of distributing the tasks grows and outweighs the cost of the summation itself. In
Approach 2, however, we see the expected behaviour: the total time taken decreases as the number
of threads increases, and the benefit is clearer at the larger limits.

• Overheads such as allocating memory for threads, context switching and releasing the allocated
memory cause parallel programs to run slower than serial ones on simple operations and small
inputs, which explains why the observed data sometimes looks anomalous. Hence, the performance of
a multi-threaded program is better measured using a large amount of data and more complex
operations.

• The time measured in a single run varied widely, so each data point is the median of 30 different
runs.
• For Approach 2, the speed-up increases along each row (that is, with more threads) because more of
the work is distributed.
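
The bound from Amdahl's law mentioned in the first observation can be written as S(p) = 1 / ((1 - f) + f/p), where f is the fraction of the work that can be parallelized and p is the number of threads. The sketch below simply evaluates this formula; the function names are hypothetical, and no value of f was measured in this assignment. Note that the law alone does not model the thread-management overhead described in the third observation, so it cannot by itself explain speed-ups below 1.

```c
#include <assert.h>

/* Amdahl's law: upper bound on speed-up with p threads when a
   fraction f (0 <= f <= 1) of the work can be parallelized. */
double amdahl_speedup(double f, int p) {
    assert(f >= 0.0 && f <= 1.0 && p >= 1);
    return 1.0 / ((1.0 - f) + f / p);
}

/* Corresponding bound on efficiency (speed-up per thread). */
double amdahl_efficiency(double f, int p) {
    return amdahl_speedup(f, p) / p;
}
```

For example, with f = 0.5 the bound on the speed-up is about 1.33 for 2 threads and can never exceed 2 for any number of threads, so the efficiency bound falls as p grows whenever f is below 1.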
