
ICS 2410 – PARALLEL SYSTEMS

ASSIGNMENT 1

MAINA GEORGE (C026-0251-08)

1. Discuss the following with respect to a parallel virtual machine


a. Compiling and running a PVM program
Suppose the program to be run is called foo.c.
- Before a PVM program can be run, it must be compiled. For example:
cc -L~/pvm3/lib/ALPHA foo.c -lpvm3 -o foo
- You will have to change the name ALPHA to the architecture name of your computer. After compiling, you must put the executable file in the directory ~/pvm3/bin/ARCH.
- You also need to compile the program separately for every architecture in your virtual machine. If you use dynamic groups, you must also add -lgpvm3 to the compile command.
- The executable can then be run. To do this, first start PVM. Once PVM is running, the executable may be run from the Unix command line like any other program (a minimal example program is sketched below).
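For illustration only, a minimal foo.c compatible with the compile command above might look like the following sketch; the program content is an assumption, not part of the original notes. It simply enrols in the virtual machine, prints its task id, and exits.

/* foo.c - a minimal PVM task (illustrative sketch only) */
#include <stdio.h>
#include "pvm3.h"

int main(void)
{
    int mytid = pvm_mytid();          /* enrol this process in PVM and obtain its task id */

    if (mytid < 0) {
        fprintf(stderr, "Could not enrol in PVM\n");
        return 1;
    }
    printf("Hello from PVM task t%x\n", mytid);

    pvm_exit();                       /* leave the virtual machine cleanly */
    return 0;
}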

b. Message passing
Sending and receiving messages between two tasks:
Sending:
- pvm_initsend(): clears the default send buffer and specifies the message encoding. After initialization, the sending task must pack all of the data it wishes to send into the send buffer.
- Packing is done with the pvm_pk*() family of functions (for example pvm_pkint() for integers); pvm_packf() is a printf-like function for packing multiple types of data in one call. Once the data have been packed into the send buffer, the message is ready to be sent.
- pvm_send(): info = pvm_send(tid, msgtag) sends the data in the send buffer to the process with task id tid, tagging the message with the integer value msgtag.
A message tag tells the receiving task what kind of data it is receiving. For example, a tag of 5 might mean "add the numbers in the message", while a tag of 10 might mean "multiply them". pvm_mcast() does the same thing as pvm_send(), except that it takes an array of tids instead of just one; this is useful when you want to send the same message to a set of tasks.

Receiving:
- The receiving task calls pvm_recv() to receive a message. bufid = pvm_recv(tid, msgtag) waits for a message from task tid with tag msgtag to arrive and receives it when it does.
- pvm_nrecv() can also be useful. It performs a non-blocking receive: if there is a suitable message it is received, but if there is not, the task does not wait.
- pvm_probe() can be helpful as well. It tells whether a matching message has arrived, but takes no further action.
- When a task has received a message, it must unpack the data from the receive buffer. The pvm_upk*() family of functions (for example pvm_upkint()) accomplishes this, in the same order in which the corresponding pvm_pk*() calls packed the data in.
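As a sketch of the sequence just described, the fragment below packs two integers, sends them with a message tag, and then receives and unpacks them on the other side. The names worker_tid and sender_tid and the tag value 5 are illustrative assumptions.

/* Illustrative send/receive pair using the calls described above */
#include "pvm3.h"

void send_numbers(int worker_tid)
{
    int data[2] = {3, 4};

    pvm_initsend(PvmDataDefault);     /* clear the send buffer, default encoding */
    pvm_pkint(data, 2, 1);            /* pack two integers with stride 1 */
    pvm_send(worker_tid, 5);          /* send to worker_tid, tagged with 5 */
}

void receive_numbers(int sender_tid)
{
    int data[2];

    pvm_recv(sender_tid, 5);          /* block until a tag-5 message from sender_tid arrives */
    pvm_upkint(data, 2, 1);           /* unpack the integers in the order they were packed */
}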

c. Creating and managing dynamic process groups


Dynamic process groups can be used when a set of tasks performs either the same function or a group of closely related functions. Users can give a name to a set of PVM processes, each of which is given a group instance number in addition to its tid.
- inum = pvm_joingroup("group_name"): adds the calling task to the group "group_name". If no such group exists, it is created, and the task's instance number within the group is returned in inum. A task may belong to more than one group at a time.
- pvm_lvgroup(): called when a task wants to leave a group. Be careful: if tasks leave the group without replacements joining, there will be gaps in the instance numbers.
- pvm_getinst(), pvm_gettid(), and pvm_gsize() are group information functions; they return a task's instance number within a group, the tid of a given group member, and the size of the group, respectively. Other useful group functions are pvm_bcast() and pvm_barrier(). pvm_bcast() is very similar to pvm_mcast(), but instead of sending a message to an array of tids, it sends it to all members of a group. pvm_barrier() is used for synchronization: a task that calls pvm_barrier() stops until the specified number of members of its group have called pvm_barrier() as well. An illustrative sketch follows.
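The sketch below strings these calls together; the group name "workers", the member count of 4, and the message tag 7 are assumptions made purely for illustration.

/* Illustrative use of dynamic process groups */
#include "pvm3.h"

void group_example(void)
{
    int value = 42;
    int inum = pvm_joingroup("workers");  /* join (or create) the group, get instance number */

    pvm_barrier("workers", 4);            /* wait until 4 group members reach this point */

    if (inum == 0) {                      /* let instance 0 broadcast a value to the others */
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&value, 1, 1);
        pvm_bcast("workers", 7);          /* broadcast to the rest of the group, tag 7 */
    } else {
        pvm_recv(-1, 7);                  /* -1 matches a message from any sender */
        pvm_upkint(&value, 1, 1);
    }

    pvm_lvgroup("workers");               /* leave the group when finished */
}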

2. Discuss important environment features for parallel programming

- Configurability: In order to satisfy a wide community of users, environments must allow individuals to set preferences. With configurability as a design goal, many users' preferences can be accommodated in the use of the environment without writing special-purpose utilities.

- Scalability: Environments must work not only for small demonstrations, but also for large, realistic field use. Scalability can be measured with respect to several parameters; the primary concern is that environments be able to handle large science and engineering applications of 100,000 lines of code.

- Portability: Environments should allow execution on a variety of platforms.

- Flexibility: This is an important characteristic of general environments. There are many situations in which users wish to incorporate new types of performance data into their environments, so advanced environments must be open with respect to the types of data that can be included and presented.

3. Discuss the relative merits and demerits of various laws for measuring speed-up performance vis-à-vis a parallel algorithm system
- Amdahl's Law:- states that the speedup of a parallel algorithm is effectively limited by the number of operations which must be performed sequentially, i.e. its serial fraction.
What is speedup?
The speedup factor expresses the relative gain achieved by shifting the execution of a task from a sequential computer to a parallel computer; the performance does not increase linearly with the increase in the number of processors.
Illustration:
Let us consider a problem, say P, which has to be solved using a parallel computer. According to Amdahl's law, there are mainly two types of operations; therefore, the problem will have some sequential operations and some parallel operations. We already know that it requires T(1) time to execute the problem using a sequential machine and a sequential algorithm. The time to compute the sequential operations is a fraction α (α ≤ 1) of the total execution time T(1), and the time to compute the parallel operations is the remaining fraction (1 - α) of T(1). On N processors the parallel part is shared among the processors, so the speedup S(N) can be calculated as follows:

S(N) = T(1) / T(N)

S(N) = T(1) / (α*T(1) + (1 - α)*T(1)/N)

Dividing numerator and denominator by T(1):

S(N) = 1 / (α + (1 - α)/N)


Remember that the value of α lies between 0 and 1. Substituting values for the number of processors, we find that S(N) keeps decreasing as the value of α increases, and no matter how large N becomes, S(N) can never exceed 1/α.
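To make this limit concrete, the short sketch below evaluates S(N) = 1/(α + (1 - α)/N) for an assumed serial fraction α = 0.1; the speedup saturates near 1/α = 10 no matter how many processors are added.

/* Numerical check of Amdahl's law for an assumed alpha = 0.1 */
#include <stdio.h>

int main(void)
{
    double alpha = 0.1;                           /* 10% of the work is sequential */
    int n_values[] = {10, 100, 1000};

    for (int i = 0; i < 3; i++) {
        int n = n_values[i];
        double s = 1.0 / (alpha + (1.0 - alpha) / n);
        printf("N = %4d  S(N) = %.2f\n", n, s);   /* approaches 1/alpha = 10 */
    }
    return 0;
}

The printed speedups are roughly 5.3, 9.2, and 9.9, illustrating how quickly the serial fraction dominates.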

Outcomes of analysis of Amdahl’s Law:-

1. To optimize the performance of parallel computers, modified compilers need to be developed which aim to reduce the number of sequential operations, i.e. to reduce the fraction α.

2. Manufacturers of parallel computers were discouraged from building large-scale machines having millions of processors.

One major shortcoming identified in Amdahl's law: it assumes that the problem size is always fixed and that the fraction of sequential operations remains essentially the same.

- Gustafson's Law:-


There are many applications which require that the accuracy of the resultant output be high. At present, computing power has increased substantially owing to the growing number of processors attached to a parallel computer, so it is possible to increase the size of the problem. Gustafson's law assumes that the sequential part of the workload stays fixed while the parallel part is scaled up with the number of processors, which gives the scaled speedup:

S(N) = α + N*(1 - α)
S(N) = N - α*(N - 1)
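For comparison with Amdahl's law, the sketch below evaluates the scaled speedup for the same assumed α = 0.1 and N = 100: it gives about 90.1, whereas Amdahl's fixed-size formula gives roughly 9.2 for the same values, because here the problem size grows with the number of processors.

/* Scaled speedup under Gustafson's law for assumed example values */
#include <stdio.h>

int main(void)
{
    double alpha = 0.1;                 /* sequential fraction of the scaled workload */
    int n = 100;                        /* number of processors */
    double s = n - alpha * (n - 1);     /* = 100 - 0.1*99 = 90.1 */

    printf("Scaled speedup S(%d) = %.1f\n", n, s);
    return 0;
}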

The decrease from the ideal speedup of N is caused by overheads such as inter-processor communication.

- Sun and Ni's Law:-

Sun and Ni's Law is a generalization of both Amdahl's Law and Gustafson's Law. The fundamental concept underlying Sun and Ni's Law is to find the solution to a problem of the maximum possible size within the limited memory available. Nowadays, there are many applications which are bounded by memory rather than by processing speed.

In a distributed-memory parallel computer, each processor has an independent, relatively small memory. In order to solve a problem, it is normally divided into sub-problems which are distributed to the various processors. It may be noted that the size of each sub-problem should be in proportion to the size of the independent local memory available to its processor. The size of the problem can then be increased further so that the available memory is fully utilized. This technique assists in generating a more accurate solution, as the problem size has been increased.

4. Discuss the concept of computational granularity and computational latency

Granularity: This typically refers to the ratio of the amount of computation a process performs to the amount of communication it does (for example, floating-point operations per byte received). Increasing the granularity tends to speed up the application, but the trade-off is a reduction in the available parallelism. Granularity is commonly divided into fine-grained and coarse-grained systems.

In a fine-grained system the parallel parts are relatively small, which means high communication overhead. In a coarse-grained system the parallel parts are relatively large, which means more computation and less communication.

If the granularity is too fine, it is possible that the overhead required for communication and synchronization between tasks takes longer than the computation itself. On the other hand, in a coarse-grained parallel system a relatively large amount of computational work is done between communications; such systems have a high computation-to-communication ratio and therefore offer more opportunity for performance increase.

Communication latency: This refers to the time it takes to initiate a communication, while bandwidth describes how fast you can get information across once the transfer is under way.
A real-life example of high latency may be a service call to your cell phone company: First, you
need to find the number, next you have to dial it, then you wait in the loop for what seems like ages,
subsequently you get passed along until you finally reach someone who can solve your problem.
Thus, how long a communication takes depends on two factors: the time to initiate it (the latency) plus the amount of information you need to transfer divided by the rate at which it can be transmitted (the bandwidth).
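As a rough illustration with made-up figures (a 50-microsecond start-up latency and a 100 MB/s link), the sketch below computes the transfer time for a 1 KB and a 1 MB message; latency dominates the small message, while bandwidth dominates the large one.

/* Simple message-time model: time = latency + message_size / bandwidth */
#include <stdio.h>

int main(void)
{
    double latency = 50e-6;            /* assumed 50 microseconds to initiate the message */
    double bandwidth = 100e6;          /* assumed 100 MB/s link */
    double sizes[] = {1e3, 1e6};       /* 1 KB and 1 MB messages */

    for (int i = 0; i < 2; i++) {
        double t = latency + sizes[i] / bandwidth;
        printf("%.0f bytes -> %.3f ms\n", sizes[i], t * 1e3);
    }
    return 0;
}

With these figures the 1 KB message takes about 0.06 ms (mostly latency), while the 1 MB message takes about 10.05 ms (mostly transfer time).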
References:

1. Computer Science and Mathematics Division. An Introduction to PVM Programming. Retrieved from http://www.csm.ornl.gov/pvm/intro.html

2. Insung Park, Michael J. Voss, et al. Parallel Programming Environment for OpenMP. Retrieved from https://engineering.purdue.edu/paramnt/publications/ompenv.pdf

3. Cardiff School of Computer Science & Informatics. Factors That Limit Speedup. Retrieved from
http://www.cs.cf.ac.uk/Parallel/Year2/section7.html

4. Field-theory.org. Practical example: Latency vs bandwidth. Retrieved from http://www.field-theory.org/articles/latency/index.html
