Parallel Computing

Presented by Justin Reschke 9-14-04

Concepts and Terminology
Parallel Computer Memory Architectures
Parallel Programming Models
Designing Parallel Programs
Parallel Algorithm Examples
Conclusion

Concepts and Terminology: What is Parallel Computing?
Traditionally, software has been written for serial computation.
Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem.

Concepts and Terminology: Why Use Parallel Computing?
Saves time – wall clock time
Cost savings
Overcoming memory constraints
It’s the future of computing

Concepts and Terminology: Flynn’s Classical Taxonomy
Distinguishes multi-processor architecture by instruction and data
SISD – Single Instruction, Single Data
SIMD – Single Instruction, Multiple Data
MISD – Multiple Instruction, Single Data
MIMD – Multiple Instruction, Multiple Data

Flynn’s Classical Taxonomy: SISD
Serial
Only one instruction and data stream is acted on during any one clock cycle

Flynn’s Classical Taxonomy: SIMD
All processing units execute the same instruction at any given clock cycle.
Each processing unit operates on a different data element.

Flynn’s Classical Taxonomy: MISD
Different instructions operate on a single data element.
Very few practical uses for this type of classification.
Example: Multiple cryptography algorithms attempting to crack a single coded message.

Flynn’s Classical Taxonomy: MIMD
Can execute different instructions on different data elements.
Most common type of parallel computer.

Concepts and Terminology: General Terminology
Task – A logically discrete section of computational work
Parallel Task – Task that can be executed by multiple processors safely
Communications – Data exchange between parallel tasks
Synchronization – The coordination of parallel tasks in real time

Concepts and Terminology: More Terminology
Granularity – The ratio of computation to communication
  Coarse – High computation, low communication
  Fine – Low computation, high communication
Parallel Overhead
  Synchronizations
  Data Communications
  Overhead imposed by compilers, libraries, tools, operating systems, etc.

Parallel Computer Memory Architectures: Shared Memory Architecture
All processors access all memory as a single global address space.
Data sharing is fast.
Lack of scalability between memory and CPUs

Parallel Computer Memory Architectures: Distributed Memory
Each processor has its own memory.
Is scalable, no overhead for cache coherency.
Programmer is responsible for many details of communication between processors.

Parallel Programming Models
Exist as an abstraction above hardware and memory architectures
Examples:
  Shared Memory
  Threads
  Message Passing
  Data Parallel

Parallel Programming Models: Shared Memory Model
Appears to the user as a single shared memory, despite hardware implementations.
Locks and semaphores may be used to control shared memory access.
Program development can be simplified since there is no need to explicitly specify communication between tasks.
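A minimal sketch of lock-controlled access to shared memory, using Python threads from the standard library. The function name count_with_lock and its parameters are hypothetical, chosen only for illustration. Every thread reads and writes the same counter, and no explicit communication between tasks is written anywhere.

```python
import threading

def count_with_lock(num_threads=4, increments=10_000):
    """Have several threads add to one shared counter, guarded by a lock."""
    shared = {"counter": 0}      # memory visible to every thread
    lock = threading.Lock()      # controls access to the shared counter

    def worker():
        for _ in range(increments):
            with lock:           # only one thread updates at a time
                shared["counter"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                 # wait until every worker has finished
    return shared["counter"]
```

With the lock in place, count_with_lock(4, 1000) returns 4000 on every run, because no update can be interleaved with another.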

Parallel Programming Models: Threads Model
A single process may have multiple, concurrent execution paths.
Typically used with a shared memory architecture.
Programmer is responsible for determining all parallelism.

Parallel Programming Models: Message Passing Model
Tasks exchange data by sending and receiving messages.
Typically used with distributed memory architectures.
Data transfer requires cooperative operations to be performed by each process. Ex.: a send operation must have a receive operation.
MPI (Message Passing Interface) is the interface standard for message passing.
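The send/receive pairing can be sketched without MPI. The example below is an illustration only: it assumes two Python threads and two queue.Queue channels standing in for separate processes with their own memory, and the name run_exchange and the None shutdown message are hypothetical. The point it shows is that each put (send) is matched by a get (receive) on the other side.

```python
import queue
import threading

def run_exchange(values):
    """Send each value to a receiver task and collect the doubled replies."""
    to_receiver = queue.Queue()   # channel: sender -> receiver
    to_sender = queue.Queue()     # channel: receiver -> sender

    def receiver():
        while True:
            msg = to_receiver.get()      # receive: blocks until a message arrives
            if msg is None:              # None is the assumed shutdown message
                break
            to_sender.put(msg * 2)       # send the reply back

    task = threading.Thread(target=receiver)
    task.start()

    replies = []
    for v in values:
        to_receiver.put(v)               # every send ...
        replies.append(to_sender.get())  # ... is matched by a receive
    to_receiver.put(None)
    task.join()
    return replies
```

Calling run_exchange([1, 2, 3]) returns [2, 4, 6]. If the receiver never called get, the sender would block forever on its own get, which is the cooperative requirement in miniature.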

Parallel Programming Models: Data Parallel Model
Tasks performing the same operations on a set of data.
Each task working on a separate piece of the set.
Works well with either shared memory or distributed memory architectures.
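As a rough sketch of the data parallel model, the code below splits a list into pieces and applies the same operation to each piece through a thread pool. The names square_all and data_parallel_squares are hypothetical, and squaring is an arbitrary stand-in operation. In CPython, threads mainly illustrate the structure; a process pool or a compiled library would be needed for real speedup on CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

def square_all(chunk):
    """The one operation every task applies to its own piece of the data."""
    return [x * x for x in chunk]

def data_parallel_squares(data, num_tasks=4):
    """Split data into at most num_tasks pieces and apply square_all to each."""
    size = -(-len(data) // num_tasks) or 1          # ceiling division, at least 1
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        results = pool.map(square_all, chunks)      # map preserves chunk order
        return [y for part in results for y in part]
```

For example, data_parallel_squares(list(range(10)), 3) splits the input into pieces of 4, 4, and 2 elements and returns the squares in their original order.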

Designing Parallel Programs: Automatic Parallelization
Automatic
  Compiler analyzes code and identifies opportunities for parallelism
  Analysis includes attempting to compute whether or not the parallelism actually improves performance.
  Loops are the most frequent target for automatic parallelism.

Designing Parallel Programs: Manual Parallelization
Understand the problem
  A Parallelizable Problem: Calculate the potential energy for each of several thousand independent conformations of a molecule. When done find the minimum energy conformation.
  A Non-Parallelizable Problem: The Fibonacci Series
    All calculations are dependent
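The contrast between the two problems can be sketched in code. The energy function below is a hypothetical placeholder (a sum of squared coordinates, not real chemistry), and all function names are assumptions made for illustration. The conformations are evaluated independently and then reduced to a minimum, while each Fibonacci step needs the two results before it.

```python
from concurrent.futures import ThreadPoolExecutor

def potential_energy(conformation):
    """Hypothetical stand-in: sum of squared coordinates, not real chemistry."""
    return sum(c * c for c in conformation)

def min_energy_conformation(conformations, workers=4):
    """Evaluate every conformation independently, then reduce to the minimum."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        energies = list(pool.map(potential_energy, conformations))
    best = min(range(len(energies)), key=energies.__getitem__)
    return conformations[best], energies[best]

def fibonacci(n):
    """Each step needs the two previous results, so the loop runs in order."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

Here min_energy_conformation([(3, 4), (1, 1), (2, 0)]) returns ((1, 1), 2), and the order in which the three energies are computed does not matter. In fibonacci, no iteration can start before the previous one finishes.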

Designing Parallel Programs: Domain Decomposition
Each task handles a portion of the data set.
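One possible way to carve up a data set is a contiguous block split, sketched below. The helper name decompose is hypothetical, and block partitioning is only one of several reasonable schemes; it gives each task a portion whose size differs from the others by at most one element.

```python
def decompose(data, num_tasks):
    """Split data into num_tasks contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(len(data), num_tasks)
    blocks, start = [], 0
    for rank in range(num_tasks):
        size = base + (1 if rank < extra else 0)   # first `extra` tasks get one more
        blocks.append(data[start:start + size])
        start += size
    return blocks
```

For ten elements and three tasks, decompose(list(range(10)), 3) returns [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]].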

Designing Parallel Programs: Functional Decomposition
Each task performs a function of the overall work
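In contrast to splitting the data, functional decomposition splits the work by what is done. The sketch below is an assumed example, with the hypothetical name summarize: three different functions run on the same data, one task per function.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize(data):
    """Run three different functions on the same data, one task per function."""
    functions = {"min": min, "max": max, "total": sum}
    with ThreadPoolExecutor(max_workers=len(functions)) as pool:
        # Each submitted task performs a different function of the overall work.
        futures = {name: pool.submit(fn, data) for name, fn in functions.items()}
        return {name: fut.result() for name, fut in futures.items()}
```

Calling summarize([3, 1, 2]) returns {"min": 1, "max": 3, "total": 6}.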

Parallel Algorithm Examples: Array Processing
Serial Solution
  Perform a function on a 2D array.
  Single processor iterates through each element in the array
Possible Parallel Solution
  Assign each processor a partition of the array.
  Each process iterates through its own partition.
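Both solutions can be sketched side by side. The slide does not say how the array is partitioned, so the example assumes bands of consecutive rows; the names serial_apply and parallel_apply are hypothetical, and a thread pool stands in for the processors.

```python
from concurrent.futures import ThreadPoolExecutor

def serial_apply(array, fn):
    """Serial solution: one loop nest visits every element."""
    return [[fn(x) for x in row] for row in array]

def parallel_apply(array, fn, num_workers=2):
    """Parallel solution: each worker gets a band of rows and loops over it."""
    rows_per_worker = -(-len(array) // num_workers) or 1   # ceiling division
    partitions = [array[i:i + rows_per_worker]
                  for i in range(0, len(array), rows_per_worker)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        parts = pool.map(lambda part: serial_apply(part, fn), partitions)
        return [row for part in parts for row in part]
```

Because every element is processed independently, parallel_apply and serial_apply return the same result for any partition size; only the assignment of rows to workers differs.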

Parallel Algorithm Examples: Odd-Even Transposition Sort
Basic idea is bubble sort, but concurrently comparing odd indexed elements with an adjacent element, then even indexed elements.
If there are n elements in an array and there are n/2 processors, the algorithm is effectively O(n)!

Parallel Algorithm Examples: Odd-Even Transposition Sort
Worst case scenario.
Initial array:  6, 5, 4, 3, 2, 1, 0
Phase 1:        6, 4, 5, 2, 3, 0, 1
Phase 2:        4, 6, 2, 5, 0, 3, 1
Phase 1:        4, 2, 6, 0, 5, 1, 3
Phase 2:        2, 4, 0, 6, 1, 5, 3
Phase 1:        2, 0, 4, 1, 6, 3, 5
Phase 2:        0, 2, 1, 4, 3, 6, 5
Phase 1:        0, 1, 2, 3, 4, 5, 6
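The odd-even transposition sort described above can be simulated on a single processor to see how the phases behave. This is a sketch, not a parallel implementation: the inner loop visits the pairs one after another, where n/2 processors would handle them at the same time. The function name and the returned history list are hypothetical, and indices are counted from 0.

```python
def odd_even_transposition_sort(values):
    """Return (sorted list, list of array states after each phase)."""
    a = list(values)
    n = len(a)
    history = []
    for phase in range(n):                  # n phases cover the worst case
        start = 1 if phase % 2 == 0 else 0  # odd-indexed pairs first, then even
        # Pairs (start, start+1), (start+2, start+3), ... do not overlap,
        # so with n/2 processors they could all be compared at the same time.
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
        history.append(list(a))
    return a, history
```

For the reversed input [6, 5, 4, 3, 2, 1, 0], the first phase yields [6, 4, 5, 2, 3, 0, 1] and the array is fully sorted only after the seventh phase. Each phase costs one parallel compare-exchange step, which is where the O(n) figure comes from.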

Other Parallelizable Problems
The n-body problem
Floyd’s Algorithm
  Serial: O(n^3), Parallel: O(n log p)
Game Trees
Divide and Conquer Algorithms
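For reference, the serial form of Floyd’s algorithm (all-pairs shortest paths) is three nested loops, which is where the O(n^3) figure comes from. The sketch below assumes non-negative edge weights and a hypothetical function name; it shows only the serial version and marks the loop whose iterations are independent.

```python
import math

def floyd_all_pairs(dist):
    """Serial Floyd's algorithm on an n x n matrix of edge weights.

    dist[i][j] is the weight of edge i -> j, math.inf if absent, 0 on the diagonal.
    """
    n = len(dist)
    d = [row[:] for row in dist]           # work on a copy
    for k in range(n):                     # the k steps must run in order
        for i in range(n):                 # rows are independent for a fixed k,
            for j in range(n):             # so rows could be split across processors
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```

On a three-node cycle with edges 0 -> 1 (weight 3), 1 -> 2 (weight 1), and 2 -> 0 (weight 2), the result is [[0, 3, 4], [3, 0, 1], [2, 5, 0]].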

Conclusion
Parallel computing is fast.
There are many different approaches and models of parallel computing.
Parallel computing is the future of computing.
