
PVM and MPI
Which is preferable?

Comparative analysis of PVM and MPI for the development of physical applications on parallel clusters

Ekaterina Elts
Scientific adviser: Assoc. Prof. A.V. Komolkin
Introduction
• Computational Grand Challenge problems
• Parallel processing – the method of having many small tasks solve one large problem
• Two major trends:
  – MPPs (massively parallel processors) – but they cost $$$!
  – distributed computing
Introduction
• The hottest trend today is PC clusters running Linux
• Many universities and companies can afford 16 to 100 nodes
• PVM and MPI are the most widely used tools for parallel programming
Contents
• Parallel Programming
  – A Parallel Machine Model: Cluster
  – A Parallel Programming Model: Message Passing Programming Paradigm
• PVM and MPI
  – Background
  – Definition
  – A Comparison of Features
• Conclusion
A Sequential Machine Model

The von Neumann computer: a central processing unit (CPU) executes a program that performs a sequence of read and write operations on an attached memory.

SISD – Single Instruction Stream, Single Data Stream


A Parallel Machine Model

The cluster: each node is a von Neumann computer; a node can communicate with the other nodes by sending and receiving messages over an interconnection network.

MIMD – Multiple Instruction Stream, Multiple Data Stream
A Parallel Programming Model

[Diagram: a sequential (serial) algorithm transforms input to output in a single process; a parallel algorithm splits the same input-to-output work among several processes.]
Example: scalar product of vectors a, b

Sequential (serial) algorithm:

  input a, b
  do i=1,N
    s=s+a(i)*b(i)
  enddo
  print s

Parallel algorithm (the loop is split between two tasks):

  input a, b
  do i=1,N/2             do i=N/2+1,N
    s1=s1+a(i)*b(i)        s2=s2+a(i)*b(i)
  enddo                  enddo
  S=s1+s2
  print S
A Parallel Programming Model
• Message Passing

[Diagram: many small tasks, numbered 0–5, exchanging messages to solve one large problem – an instantaneous state of the computation, with a detailed picture of a single task.]
Message Passing Paradigm
• Each processor in a message passing program runs a separate process (subprogram, task):
  – written in a conventional sequential language
  – all variables are private
  – processes communicate via special subroutine calls
Messages
• Messages are packets of data moving between processes
• The message passing system has to be told the following information:
  – Sending process
  – Source location
  – Data type
  – Data length
  – Receiving process(es)
  – Destination location
  – Destination size
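
As a hedged illustration (not from the talk; buffer contents and the tag value are made up), this information maps almost one-to-one onto the arguments of MPI's point-to-point calls. Run with at least two processes:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank;
      double buf[4] = {1.0, 2.0, 3.0, 4.0};   /* source/destination location */

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      if (rank == 0) {
          /* data location, length, type; receiving process (1); message tag */
          MPI_Send(buf, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
      } else if (rank == 1) {
          MPI_Status status;
          /* destination location and size; type; sending process (0); tag */
          MPI_Recv(buf, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
          printf("received %g %g %g %g\n", buf[0], buf[1], buf[2], buf[3]);
      }
      MPI_Finalize();
      return 0;
  }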
Message Passing

SPMD (Single Program Multiple Data):
  – the same program runs everywhere
  – each process only knows and operates on a small part of the data

MPMD (Multiple Program Multiple Data):
  – each process performs a different function (input, problem setup, solution, output, display)
What is the Master/Slave principle?
• The master has control over the running application: it controls all data and calls the slaves to do their work.

  PROGRAM
    IF (process = master) THEN
      master-code
    ELSE
      slave-code
    ENDIF
  END
Simple Example:
SPMD & Master/Slave

Every process runs the same strided loop over its own slice of the data:

  do i = 1+rank, N, size
    s = s + a(i)*b(i)
  enddo

For the process with rank 0 this gives
s = a(1)b(1) + a(1+size)b(1+size) + a(1+2*size)b(1+2*size) + …
where size is the total number of processes.

[Diagram: the master holds the input a, b; slaves 1, 2, 3 compute partial sums s1, s2, s3 = Σ a(i)b(i) over their slices; the master combines the partial sums into S and prints the result.]
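
A minimal sketch of this example in C with MPI (not from the original talk; the vectors, their length and all names are illustrative, and MPI_Reduce plays the role of the slaves sending their partial sums to the master):

  #include <mpi.h>
  #include <stdio.h>

  #define N 8

  int main(int argc, char **argv)
  {
      double a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
      double b[N] = {1, 1, 1, 1, 1, 1, 1, 1};
      double s = 0.0, S = 0.0;
      int rank, size, i;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      /* the strided loop from the slide (0-based indexing here) */
      for (i = rank; i < N; i += size)
          s += a[i] * b[i];

      /* combine the partial sums s1, s2, ... into S on the master (rank 0) */
      MPI_Reduce(&s, &S, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

      if (rank == 0)
          printf("S = %g\n", S);

      MPI_Finalize();
      return 0;
  }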
PVM and MPI
Background

PVM: development started in summer 1989 at Oak Ridge National Laboratory (ORNL). PVM was the effort of a single research group, which allowed great flexibility in the design of the system.

MPI: development started in April 1992. MPI was designed by the MPI Forum (a diverse collection of implementors, library writers, and end users) quite independently of any specific implementation.

Timeline (1989–2000): PVM-1 (1989), PVM-2 (1990), PVM-3, PVM-3.4; MPI-1 (1994), MPI-2 (1997).
PVM and MPI
Goals

PVM:
  – a distributed operating system
  – portability
  – heterogeneity
  – handling communication failures

MPI:
  – a library for writing application programs, not a distributed operating system
  – portability
  – high performance
  – heterogeneity
  – well-defined behavior

Note: implementation ≠ specification!
MPI implementations: LAM, MPICH, …
What is MPI?
MPI - Message Passing Interface

• A fixed set of processes is created at program initialization, one process per processor:
    mpirun -np 5 program
• Each process knows its own number (rank)
• Each process knows the total number of processes
• Each process can communicate with the other processes
• A process cannot create new processes (in MPI-1)
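
A minimal sketch (not from the talk) showing these points: every process learns its own rank and the total process count:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size;

      MPI_Init(&argc, &argv);                 /* the fixed process set exists */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* my own number (rank) */
      MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

      printf("process %d of %d\n", rank, size);

      MPI_Finalize();
      return 0;
  }

Launched as, e.g., mpirun -np 5 program, this prints five lines, one per process.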
What is PVM?
PVM - Parallel Virtual Machine

• A software package that allows a heterogeneous collection of workstations (the host pool) to function as a single high-performance (virtual) parallel machine
• Through its virtual machine, PVM provides a simple yet useful distributed operating system
• A daemon runs on every computer making up the virtual machine
PVM Daemon (pvmd)
• A UNIX process which oversees the operation of user processes within a PVM application and coordinates inter-machine PVM communications
• The pvmd serves as a message router and controller
• One pvmd runs on each host of the virtual machine
• The first pvmd (started by hand) is designated the master, while the others (started by the master) are called slaves
• Only the master can start new slaves and add them to the configuration, or delete slave hosts from the machine
[Diagram: on each host, processes executing user computation run alongside the pvmd executing PVM system routines; one pvmd is the master.]
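
For illustration (the host names are hypothetical), the virtual machine is typically assembled from the PVM console, which starts the master pvmd by hand and lets it start the slave pvmds:

  $ pvm                  # start the master pvmd and the console
  pvm> add node2 node3   # the master starts slave pvmds on node2 and node3
  pvm> conf              # list the hosts of the virtual machine
  pvm> halt              # shut the whole virtual machine down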
What is Not Different?

• Portability – source code written for one architecture can be copied to a second architecture, compiled and executed without modification (to some extent)
• Support for MPMD programs as well as SPMD
• Interoperability – the ability of different implementations of the same specification to exchange messages
• Heterogeneity (to some extent)

PVM and MPI are systems designed to provide users with libraries for writing portable, heterogeneous, MPMD programs.
Heterogeneity
• static: architecture, data format, computational speed
• dynamic: machine load, network load
Heterogeneity: MPI
• Different datatypes can be encapsulated in a single derived type, thereby allowing communication of heterogeneous messages. In addition, data can be sent from one architecture to another, with data conversion in heterogeneous networks (big-endian, little-endian).
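
A hedged sketch of such a derived type (the struct and all names are illustrative, not from the talk): an int and a double travel in one message.

  #include <mpi.h>

  struct part { int id; double mass; };

  void send_part(struct part *p, int dest)
  {
      int          blocklens[2] = {1, 1};
      MPI_Aint     displs[2], base;
      MPI_Datatype types[2] = {MPI_INT, MPI_DOUBLE};
      MPI_Datatype parttype;

      /* displacements of the two members relative to the struct start */
      MPI_Get_address(p, &base);
      MPI_Get_address(&p->id, &displs[0]);
      MPI_Get_address(&p->mass, &displs[1]);
      displs[0] -= base;
      displs[1] -= base;

      MPI_Type_create_struct(2, blocklens, displs, types, &parttype);
      MPI_Type_commit(&parttype);
      MPI_Send(p, 1, parttype, dest, 0, MPI_COMM_WORLD);
      MPI_Type_free(&parttype);
  }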
Heterogeneity: PVM
• The PVM system supports heterogeneity in terms of machines, networks, and applications. With regard to message passing, PVM permits messages containing more than one datatype to be exchanged between machines having different data representations.
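
The PVM counterpart, sketched under the same caveats (values and the tag are made up): pvm_initsend(PvmDataDefault) selects XDR encoding, so the packed int and double survive the trip between machines with different data representations.

  #include <pvm3.h>

  void send_pair(int dest_tid)
  {
      int    n = 42;
      double x = 3.14;

      pvm_initsend(PvmDataDefault);   /* XDR encoding for heterogeneity */
      pvm_pkint(&n, 1, 1);            /* pack one int, stride 1 */
      pvm_pkdouble(&x, 1, 1);         /* pack one double, stride 1 */
      pvm_send(dest_tid, 0);          /* send with message tag 0 */
  }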
Process control
• The ability to start and stop tasks, to find out which tasks are running, and possibly where they are running

• PVM contains all of these capabilities – it can spawn and kill tasks dynamically
• MPI-1 has no defined method to start a new task; MPI-2 contains functions to start a group of tasks and to send a kill signal to a group of tasks
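
A hedged sketch of PVM's dynamic task control ("slave" is a hypothetical executable name):

  #include <pvm3.h>

  void start_and_stop(void)
  {
      int tids[4];
      /* start 4 copies of "slave" anywhere in the virtual machine */
      int started = pvm_spawn("slave", (char **)0, PvmTaskDefault,
                              (char *)0, 4, tids);
      if (started > 0)
          pvm_kill(tids[0]);   /* send a kill signal to one of them */
  }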
Resource Control
• PVM is inherently dynamic in nature and has a rich set of resource control functions. Hosts can be added or deleted, which supports:
  – load balancing
  – task migration
  – fault tolerance
  – efficiency

• MPI is specifically designed to be static in nature to improve performance
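
A hedged sketch of run-time reconfiguration ("node4" is a hypothetical host name):

  #include <pvm3.h>

  void reconfigure(void)
  {
      char *hosts[] = {"node4"};
      int   infos[1];

      pvm_addhosts(hosts, 1, infos);   /* grow the virtual machine */
      pvm_delhosts(hosts, 1, infos);   /* and shrink it again */
  }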
Virtual topology
(MPI only)
• Convenient process naming
• Naming scheme to fit the communication pattern
• Simplifies writing of code
• Can allow MPI to optimize communications
Virtual topology example

[Figure: a virtual topology of twelve processes – a grid with a cyclic boundary condition in one direction, e.g. processes 0 and 9 are "connected". The numbers represent the ranks and the conceptual coordinates mapped to the ranks.]
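
A hedged sketch of how such a topology might be declared: a 4×3 grid, periodic in the first dimension, matches the description (with row-major rank ordering, ranks 0 and 9 sit at coordinates (0,0) and (3,0) and are neighbours through the cyclic boundary). Run with exactly twelve processes:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int dims[2]    = {4, 3};   /* 4 x 3 grid of twelve processes */
      int periods[2] = {1, 0};   /* cyclic boundary in one direction */
      int rank, coords[2];
      MPI_Comm grid;

      MPI_Init(&argc, &argv);
      MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &grid);

      if (grid != MPI_COMM_NULL) {
          MPI_Comm_rank(grid, &rank);
          MPI_Cart_coords(grid, rank, 2, coords);   /* rank -> (row, column) */
          printf("rank %d has coordinates (%d,%d)\n", rank, coords[0], coords[1]);
      }

      MPI_Finalize();
      return 0;
  }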
Message Passing operations
• MPI: rich message support
• PVM: simple message passing

Point-to-Point communications

A synchronous communication does not complete until the message has been received. An asynchronous communication completes as soon as the message is on its way.
Non-blocking operations

Non-blocking communication allows useful work to be performed while waiting for the communication to complete.
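
A hedged sketch of this overlap in MPI (buffer size and partner rank are illustrative):

  #include <mpi.h>

  void overlap_example(int partner)
  {
      double      buf[100];
      MPI_Request req;
      MPI_Status  status;

      /* start the receive; the call returns immediately */
      MPI_Irecv(buf, 100, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD, &req);

      /* ... useful computation proceeds while the message is in flight ... */

      MPI_Wait(&req, &status);   /* now the data in buf is ready to use */
  }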
Collective communications

• Broadcast: sends a message to a number of recipients
• Barrier: synchronises a number of processors
• Reduction operations: reduce data from a number of processors to a single item
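
A hedged sketch of all three in MPI (the values are illustrative):

  #include <mpi.h>

  void collectives(void)
  {
      double x = 1.0, total;

      /* broadcast: root 0 sends its x to every process */
      MPI_Bcast(&x, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);

      /* barrier: no process continues until all have arrived here */
      MPI_Barrier(MPI_COMM_WORLD);

      /* reduction: the sum of every process's x ends up in total on root 0 */
      MPI_Reduce(&x, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
  }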
Fault Tolerance: MPI
• The MPI standard is based on a static model
• If a member of a group fails for some reason, the specification mandates that, rather than continuing (which would lead to unknown results in a doomed application), the group is invalidated and the application is halted in a clean manner
• Simply put: if something fails, everything does
Fault Tolerance: MPI

[Figure sequence: a node fails ("There is a failure and…") and the whole application is shut down.]
Fault Tolerance: PVM
• PVM supports a basic fault notification scheme: it doesn't automatically recover an application after a crash, but it does provide notification primitives that allow fault-tolerant applications to be built
• The virtual machine is dynamically reconfigurable
• A pvmd can recover from the loss of any foreign pvmd except the master; the master must never crash
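
A hedged sketch of the notification primitive (the tag name is chosen here, not part of PVM): the master asks to be told when a monitored task dies, so it can respawn the work instead of the whole application failing.

  #include <pvm3.h>

  #define TAG_EXIT 99   /* illustrative message tag */

  void watch_tasks(int *tids, int ntask)
  {
      /* deliver a TAG_EXIT message when any monitored task exits
         or the host it runs on fails */
      pvm_notify(PvmTaskExit, TAG_EXIT, ntask, tids);
      /* a later pvm_recv(-1, TAG_EXIT) identifies the dead task */
  }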
Fault Tolerance: PVM

[Figure sequence: a node of the virtual machine fails; the virtual machine performs a fast host delete or recovers from the fault and keeps running.]
Conclusion
Each API has its unique strengths.

PVM:
• Virtual machine concept
• Simple message passing
• Communication topology unspecified
• Interoperates across host architecture boundaries
• Portability over performance
• Resource and process control
• Robust fault tolerance

MPI:
• No virtual machine abstraction
• Rich message support
• Supports logical communication topologies
• Some implementations do not interoperate across architectural boundaries
• Performance over flexibility
• Primarily concerned with messaging
• More susceptible to faults
Conclusion
Each API has its unique strengths.

PVM is better for:
• heterogeneous clusters, and when resource and process control are needed
• cases where the cluster is large and the program's execution time is long (fault tolerance pays off)

MPI is better for:
• supercomputers (where PVM is not supported)
• applications for MPPs, where maximum performance is the goal
• applications that need rich message support
Acknowledgments
• Scientific adviser: Assoc. Prof. A.V. Komolkin

Thank you for your attention!
