You are on page 1of 35

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Parallel Computing
Lecture 2
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Seeking Concurrency
Data dependence graphs
Data parallelism
Functional parallelism
Pipelining
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Data Dependence Graph


Directed graph
Vertices = tasks
Edges = dependences
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Data Parallelism
Independent tasks apply same
operation to different elements of a
data set
for i 0 to 99 do
a[i] b[i] + c[i]
endfor

Okay to perform operations


concurrently
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Functional Parallelism
Independent tasks apply different
operations to different data elements

a2
b3
m (a + b) / 2
s (a2 + b2) / 2
v s - m2
First and second statements
Third and fourth statements
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Pipelining
Divide a process into stages
Produce several items
simultaneously
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Partial Sums Pipeline

p [0 ] p [1 ] p [2 ] p [3 ]

p [0 ] p [1 ] p [2 ]
= + + +

a[0 ] a[1 ] a[2 ] a[3 ]


Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Data Clustering
Data mining = looking for meaningful
patterns in large data sets
Data clustering = organizing a data
set into clusters of similar items
Data clustering can speed retrieval of
related items
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Document Vectors

Moon The Geology of Moon Rocks

The Story of Apollo 11

A Biography of Jules Verne

Alice in Wonderland

Rocket
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Document Clustering
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Clustering Algorithm
Compute document vectors
Choose initial cluster centers
Repeat
Compute performance function
Adjust centers
Until function value converges or max
iterations have elapsed
Output cluster centers
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Data Parallelism
Opportunities
Operation being applied to a data set
Examples
Generating document vectors
Finding closest center to each vector
Picking initial values of cluster centers
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Functional Parallelism
Opportunities
Draw data dependence diagram
Look for sets of nodes such that
there are no paths from one node to
another
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Data Dependence Diagram

Build document vectors Choose cluster centers

Compute function value

Adjust cluster centers Output cluster centers


Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Programming Parallel
Computers
Extend compilers: translate sequential
programs into parallel programs
Extend languages: add parallel
operations
Add parallel language layer on top of
sequential language
Define totally new parallel language
and compiler system
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Strategy 1: Extend
Compilers
Parallelizing compiler
Detect parallelism in sequential program
Produce parallel executable program
Focus on making Fortran programs
parallel
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Extend Compilers (cont.)


Advantages
Saves time and labor
Requires no retraining of programmers
Sequential programming easier than
parallel programming
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Extend Compilers (cont.)


Disadvantages
Parallelism may be irretrievably lost
when programs written in sequential
languages
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Extend Language
Add functions to a sequential
language
Create and terminate processes
Synchronize processes
Allow processes to communicate
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Extend Language (cont.)


Advantages
Easiest, quickest, and least expensive
Allows existing compiler technology to
be leveraged
New libraries can be ready soon after
new parallel computers are available
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Extend Language (cont.)


Disadvantages
Lack of compiler support to catch errors
Easy to write programs that are difficult
to debug
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Add a Parallel Programming


Layer
Lower layer
Core of computation
Process manipulates its portion of data
to produce its portion of result
Upper layer
Creation and synchronization of
processes
Partitioning of data among processes
A few research prototypes have been built
based on these principles
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Create a Parallel Language


Develop a parallel language from
scratch
occam is an example
Add parallel constructs to an existing
language
Fortran 90
High Performance Fortran
C*
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

New Parallel Languages


(cont.)
Advantages
Allows programmer to communicate
parallelism to compiler
Improves probability that executable will
achieve high performance
Disadvantages
Requires development of new compilers
New languages may not become
standards
Programmer resistance
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Current Status
Low-level approach is most popular
Augment existing language with low-
level parallel constructs
MPI and OpenMP are examples
Advantages of low-level approach
Efficiency
Portability
Disadvantage: More difficult to program
and debug
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Summary
Power of CPUs keeps growing
exponentially
Parallel programming environments
changing slowly
Two standards have emerged
MPI library, for processes that do not
share memory
OpenMP directives, for processes that do
share memory
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Parallel Architectures
Interconnection networks
Multiprocessors
Multicomputers
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Interconnection Networks
Uses of interconnection networks
Connect processors to shared memory
Connect processors to each other
Interconnection media types
Shared medium
Switched medium
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Shared versus Switched


Media
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Shared Medium
Allows only one message at a time
Messages are broadcast
Each processor listens to every
message
Collisions require resending of
messages
Ethernet is an example
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Switched Medium
Supports point-to-point messages
between pairs of processors
Each processor has its own path to
switch
Advantages over shared media
Allows multiple messages to be sent
simultaneously
Allows scaling of network to
accommodate increase in processors
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Switch Network Topologies


View switched network as a graph
Vertices = processors or switches
Edges = communication paths
Two kinds of topologies
Direct
Indirect
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Direct Topology
Ratio of switch nodes to processor
nodes is 1:1
Every switch node is connected to
1 processor node
At least 1 other switch node
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Indirect Topology
Ratio of switch nodes to processor
nodes is greater than 1:1
Some switches simply connect other
switches
Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Thank you