
Parallel Computing

Presented by Justin Reschke


9-14-04
Overview
Concepts and Terminology
Parallel Computer Memory Architectures
Parallel Programming Models
Designing Parallel Programs
Parallel Algorithm Examples
Conclusion
Concepts and Terminology:
What is Parallel Computing?
Traditionally, software has been written for serial computation.
Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem.
Concepts and Terminology:
Why Use Parallel Computing?
Saves time – reduces wall clock time
Cost savings
Overcomes memory constraints
It’s the future of computing
Concepts and Terminology:
Flynn’s Classical Taxonomy
Distinguishes multi-processor architectures by their instruction and data streams
SISD – Single Instruction, Single Data
SIMD – Single Instruction, Multiple Data
MISD – Multiple Instruction, Single Data
MIMD – Multiple Instruction, Multiple Data
Flynn’s Classical Taxonomy:
SISD
A serial computer
Only one instruction stream and one data stream are acted on during any one clock cycle
Flynn’s Classical Taxonomy:
SIMD
All processing units execute the same instruction at any given clock cycle.
Each processing unit operates on a different data element.
Flynn’s Classical Taxonomy:
MISD
Different instructions operate on a single data element.
Very few practical uses for this type of classification.
Example: multiple cryptography algorithms attempting to crack a single coded message.
Flynn’s Classical Taxonomy:
MIMD
Can execute different instructions on different data elements.
Most common type of parallel computer.
Concepts and Terminology:
General Terminology
Task – A logically discrete section of computational work
Parallel Task – A task that can be executed safely by multiple processors
Communications – Data exchange between parallel tasks
Synchronization – The coordination of parallel tasks in real time
Concepts and Terminology:
More Terminology
Granularity – The ratio of computation to
communication
 Coarse – High computation, low communication
 Fine – Low computation, high communication
Parallel Overhead
 Synchronizations
 Data Communications
 Overhead imposed by compilers, libraries, tools, operating systems, etc.
Parallel Computer Memory Architectures:
Shared Memory Architecture
All processors access all memory as a single global address space.
Data sharing is fast.
Lacks scalability: adding CPUs increases traffic on the shared memory-to-CPU path.
Parallel Computer Memory Architectures:
Distributed Memory
Each processor has its own local memory.
Memory is scalable with the number of processors, and there is no overhead for maintaining cache coherency.
The programmer is responsible for many details of communication between processors.
Parallel Programming Models
Exist as an abstraction above hardware and memory architectures
Examples:
 Shared Memory
 Threads
 Message Passing
 Data Parallel
Parallel Programming Models:
Shared Memory Model
Appears to the user as a single shared memory, regardless of the underlying hardware implementation.
Locks and semaphores may be used to control access to the shared memory.
Program development can be simplified since there is no need to explicitly specify communication between tasks.
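The slide does not name a tool; as one concrete illustration, here is a minimal C/OpenMP sketch of the model, where every thread sees the same counter and a critical section plays the role of the locks and semaphores mentioned above.

  /* Shared memory model sketch (compile with an OpenMP flag, e.g. -fopenmp):
     all threads update one variable in a single shared address space. */
  #include <stdio.h>

  int main(void) {
      long counter = 0;                  /* one variable, visible to every thread */

      #pragma omp parallel               /* each thread executes this block */
      {
          for (int i = 0; i < 100000; i++) {
              #pragma omp critical       /* only one thread updates at a time */
              counter++;
          }
      }

      /* final value: 100000 times the number of threads */
      printf("counter = %ld\n", counter);
      return 0;
  }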
Parallel Programming Models:
Threads Model
A single process may have multiple, concurrent execution paths.
Typically used with a shared memory architecture.
The programmer is responsible for determining all parallelism.
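A minimal sketch of the threads model using POSIX threads (pthreads are an assumption here, not named on the slide): one process, several concurrent execution paths, all created and joined explicitly by the programmer.

  /* Threads model sketch (link with -pthread): a single process starts
     several concurrent execution paths and waits for them to finish. */
  #include <pthread.h>
  #include <stdio.h>

  #define NUM_THREADS 4

  static void *work(void *arg) {
      long id = (long)arg;
      printf("thread %ld doing its share of the work\n", id);
      return NULL;
  }

  int main(void) {
      pthread_t threads[NUM_THREADS];

      /* the programmer decides what runs in parallel and starts each thread */
      for (long i = 0; i < NUM_THREADS; i++)
          pthread_create(&threads[i], NULL, work, (void *)i);

      /* and is also responsible for waiting until every path has finished */
      for (long i = 0; i < NUM_THREADS; i++)
          pthread_join(threads[i], NULL);

      return 0;
  }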
Parallel Programming Models:
Message Passing Model
Tasks exchange data by sending and receiving messages.
Typically used with distributed memory architectures.
Data transfer requires cooperative operations to be performed by each process, e.g. a send operation must have a matching receive operation.
MPI (Message Passing Interface) is the de facto standard interface for message passing.
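A minimal MPI sketch of the cooperative send/receive pairing described above (the tag and the value 42 are arbitrary placeholders).

  /* Message passing sketch (compile with mpicc, run with mpirun -np 2):
     a send on one process is matched by a receive on another. */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char *argv[]) {
      int rank, value = 0;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      if (rank == 0) {
          value = 42;                               /* data owned by rank 0 */
          MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
      } else if (rank == 1) {
          /* the receive matches the send: same tag, compatible type and count */
          MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          printf("rank 1 received %d\n", value);
      }

      MPI_Finalize();
      return 0;
  }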
Parallel Programming Models:
Data Parallel Model
Tasks perform the same operations on a data set, with each task working on a separate piece of the set.
Works well with either shared memory or distributed memory architectures.
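A minimal C/OpenMP sketch of the data parallel model (the per-element operation is an arbitrary placeholder): every thread applies the same operation, each to its own chunk of the array.

  /* Data parallel sketch (compile with -fopenmp): the same operation is
     applied to every element, and the iterations are divided among threads. */
  #include <stdio.h>

  #define N 1000000

  static double a[N];

  int main(void) {
      #pragma omp parallel for           /* each thread gets a chunk of i values */
      for (int i = 0; i < N; i++)
          a[i] = 2.0 * i;                /* same operation, different data element */

      printf("a[%d] = %f\n", N - 1, a[N - 1]);
      return 0;
  }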
Designing Parallel Programs:
Automatic Parallelization
Automatic
 Compiler analyzes the code and identifies opportunities for parallelism
 Analysis includes attempting to determine whether the parallelism would actually improve performance
 Loops are the most frequent target for automatic parallelization
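Two illustrative loops, not taken from the slides, showing what this analysis looks for.

  #define N 1000

  void candidate_loops(double a[N], const double b[N]) {
      /* independent iterations: a good target for automatic parallelization */
      for (int i = 0; i < N; i++)
          a[i] = 2.0 * b[i];

      /* loop-carried dependence: each iteration needs the previous result,
         so the compiler must leave this loop serial */
      for (int i = 1; i < N; i++)
          a[i] = a[i - 1] + b[i];
  }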
Designing Parallel Programs:
Manual Parallelization
Understand the problem
 A Parallelizable Problem:
Calculate the potential energy for each of several thousand independent conformations of a molecule. When done, find the minimum energy conformation.
 A Non-Parallelizable Problem:
The Fibonacci Series
 Each term depends on the two preceding terms, so the calculations cannot be performed independently
Designing Parallel Programs:
Domain Decomposition
Each task handles a portion of the data set.
Designing Parallel Programs:
Functional Decomposition
Each task performs a different portion of the overall work, split by function rather than by data.
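A tiny MPI sketch of functional decomposition (the stage functions are hypothetical placeholders): each task runs a different kind of work rather than a different piece of the data.

  /* Functional decomposition sketch: each rank performs a different
     function of the overall work. */
  #include <mpi.h>
  #include <stdio.h>

  static void model_atmosphere(void) { printf("atmosphere model\n"); }
  static void model_ocean(void)      { printf("ocean model\n"); }
  static void model_land(void)       { printf("land surface model\n"); }

  int main(int argc, char *argv[]) {
      int rank;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      switch (rank) {                  /* work is split by function, not by data */
      case 0:  model_atmosphere(); break;
      case 1:  model_ocean();      break;
      default: model_land();       break;
      }

      MPI_Finalize();
      return 0;
  }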
Parallel Algorithm Examples:
Array Processing
Serial Solution
 Perform a function on each element of a 2D array.
 A single processor iterates through every element in the array.
Possible Parallel Solution
 Assign each processor a partition of the array.
 Each processor iterates through its own partition.
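One possible MPI sketch of the parallel solution, assuming the rows divide evenly among processes and using sqrt as a stand-in for the function being applied.

  /* Each process owns a block of rows and iterates only over that block.
     In a full program the partitions would be combined afterwards,
     e.g. with MPI_Gather. */
  #include <math.h>
  #include <mpi.h>
  #include <stdlib.h>

  #define N 1024                               /* the array is N x N */

  int main(int argc, char *argv[]) {
      int rank, size;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      int rows  = N / size;                    /* rows in this process's partition */
      int first = rank * rows;                 /* global index of its first row */
      double *part = malloc((size_t)rows * N * sizeof *part);

      for (int i = 0; i < rows; i++)           /* iterate over the local partition */
          for (int j = 0; j < N; j++)
              part[i * N + j] = sqrt((double)((first + i) * N + j));

      free(part);
      MPI_Finalize();
      return 0;
  }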
Parallel Algorithm Examples:
Odd-Even Transposition Sort
The basic idea is bubble sort, but concurrently comparing each odd-indexed element with its adjacent element, then each even-indexed element with its adjacent element.
If there are n elements in the array and n/2 processors, the algorithm is effectively O(n)!
Parallel Algorithm Examples:
Odd-Even Transposition Sort
Initial array (worst case scenario): 6, 5, 4, 3, 2, 1, 0
Phase 1: 6, 4, 5, 2, 3, 0, 1
Phase 2: 4, 6, 2, 5, 0, 3, 1
Phase 1: 4, 2, 6, 0, 5, 1, 3
Phase 2: 2, 4, 0, 6, 1, 5, 3
Phase 1: 2, 0, 4, 1, 6, 3, 5
Phase 2: 0, 2, 1, 4, 3, 6, 5
Phase 1: 0, 1, 2, 3, 4, 5, 6
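A C sketch of the algorithm traced above; the OpenMP pragma is only an illustration of where the per-pair parallelism lives, not a prescribed implementation.

  /* Odd-even transposition sort: n alternating phases. Within one phase the
     compare-exchange pairs are disjoint, so they can run concurrently
     (one pair per processor in the ideal n/2-processor case). */
  #include <stdio.h>

  static void odd_even_sort(int a[], int n) {
      for (int phase = 0; phase < n; phase++) {
          /* "Phase 1" pairs (1,2),(3,4),...; "Phase 2" pairs (0,1),(2,3),... */
          int start = (phase % 2 == 0) ? 1 : 0;

          #pragma omp parallel for      /* the pairs within a phase are independent */
          for (int i = start; i < n - 1; i += 2)
              if (a[i] > a[i + 1]) {
                  int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
              }
      }
  }

  int main(void) {
      int a[] = {6, 5, 4, 3, 2, 1, 0};  /* the worst-case array from the trace */
      odd_even_sort(a, 7);
      for (int i = 0; i < 7; i++)
          printf("%d ", a[i]);          /* prints 0 1 2 3 4 5 6 */
      printf("\n");
      return 0;
  }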
Other Parallelizable Problems
The n-body problem
Floyd’s Algorithm (see the sketch after this list)
 Serial: O(n^3), Parallel: O(n log p)
Game Trees
Divide and Conquer Algorithms
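A serial Floyd-Warshall sketch for the Floyd's Algorithm item above, showing why it parallelizes well: the outer loop over k must run in order, but for each k the (i, j) updates are independent and can be divided among processors.

  /* Serial all-pairs shortest paths over an N x N distance matrix. */
  #define N 4

  void floyd(double d[N][N]) {
      for (int k = 0; k < N; k++)          /* this loop stays sequential */
          for (int i = 0; i < N; i++)      /* these two loops can be     */
              for (int j = 0; j < N; j++)  /* distributed across processors */
                  if (d[i][k] + d[k][j] < d[i][j])
                      d[i][j] = d[i][k] + d[k][j];
  }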
Conclusion
Parallel computing can be dramatically faster than serial computing for suitable problems.
There are many different approaches and
models of parallel computing.
Parallel computing is the future of
computing.