# CS-421 Parallel Processing Handout_11

BE (CIS) Batch 2004-05

Programming Examples for SIMD and MIMD Machines

1. SIMD Programming
The pseudocode below multiplies two matrices on an SIMD computer. It assumes that each processing element (PE) computes one column of the product matrix, so an N x N product matrix requires N PEs.

    /* Matrix multiply, C := A.B. Compute elements of C by
           c_{ij} = \sum_{k=1}^{N} a_{ik} b_{kj}
    */
    for i := 1 step 1 until N
    begin /* Compute one row of C. */
        /* Initialize the sums for each element of a row of C */
        C[i, j] := 0, (1 ≤ j ≤ N);
        /* Loop over the terms of the inner product */
        for k := 1 step 1 until N
            /* Add the kth inner product term across columns in parallel. */
            C[i, j] := C[i, j] + A[i, k] * B[k, j], (1 ≤ j ≤ N);
        /* End of product term loop */
    end /* of all rows */

For parallel algorithms, it is also important to consider the data access requirements, i.e. the layout of data in the machine's memory. For instance, this SIMD algorithm requires that PE j have access to the jth column of matrix B (and to the jth column of C, which it computes), while every PE needs access to the whole of matrix A.
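The data layout described above can be made concrete with a small sequential emulation. This is only a sketch under stated assumptions: the names `PE` and `matmul_simd` are hypothetical, indices are 0-based rather than 1-based, each loop over `pes` stands in for a single instruction executed by all PEs in lock step, and the element A[i, k] is assumed to be broadcast to every PE by the control unit.

```python
class PE:
    """Hypothetical processing element: its local memory holds
    column j of B and column j of C."""

    def __init__(self, b_column):
        self.b = list(b_column)        # column j of B
        self.c = [0] * len(b_column)   # column j of C

    def clear(self, i):
        # C[i, j] := 0 for this PE's column j
        self.c[i] = 0

    def multiply_add(self, i, k, a_ik):
        # C[i, j] := C[i, j] + A[i, k] * B[k, j] for this PE's column j
        self.c[i] += a_ik * self.b[k]


def matmul_simd(A, B):
    """Emulate the SIMD algorithm sequentially for N x N lists of lists."""
    n = len(A)
    # Distribute B by columns: PE j receives column j.
    pes = [PE([B[k][j] for k in range(n)]) for j in range(n)]
    for i in range(n):
        for pe in pes:                 # one lock-step instruction on all PEs
            pe.clear(i)
        for k in range(n):
            a_ik = A[i][k]             # assumed broadcast of A[i, k]
            for pe in pes:             # one lock-step instruction on all PEs
                pe.multiply_add(i, k, a_ik)
    # Gather the columns of C back into row-major form.
    return [[pes[j].c[i] for j in range(n)] for i in range(n)]
```

On a real SIMD machine the two inner loops over `pes` would not exist as loops; they mark where all N PEs execute the same operation simultaneously on their own column.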

2. MIMD Programming
The pseudocode below multiplies two matrices on an MIMD computer (a multiprocessor).

    private i, j, k;
    shared A[N,N], B[N,N], C[N,N], N;
    /* Start N - 1 new processes, each for a different column of C. */
    for j := 1 step 1 until N-1 fork DOCOL;
    /* The original process reaches this point and does the processing for Nth column */
    j := N;
    DOCOL: /* Executed by N processes, each doing one column of C. */
    for i := 1 step 1 until N
    begin /* Compute one row of C. */
        /* Initialize the sums for each element of a row of C */
        C[i, j] := 0;
        /* Loop over the terms of the inner product */
        for k := 1 step 1 until N
            /* Add the kth inner product term */
            C[i, j] := C[i, j] + A[i, k] * B[k, j];
        /* End of product term loop */
    end /* of all rows */
    join N;
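The fork structure of this pseudocode can be sketched with Python threads. This is an illustration under assumptions, not the handout's notation: `docol` and `matmul_mimd` are hypothetical names, indices are 0-based, threads stand in for processes, and `Thread.join` is only a stand-in for `join N` (here the original thread always waits and continues, which differs from the counter-based join the handout uses).

```python
import threading


def docol(A, B, C, j):
    """Body executed by every worker: compute column j of C = A.B."""
    n = len(A)
    for i in range(n):
        C[i][j] = 0                       # initialize the sum
        for k in range(n):
            C[i][j] += A[i][k] * B[k][j]  # add the kth inner product term


def matmul_mimd(A, B):
    n = len(A)
    C = [[None] * n for _ in range(n)]    # shared result matrix
    workers = []
    # Start N - 1 new workers, each for a different column of C.
    for j in range(n - 1):
        t = threading.Thread(target=docol, args=(A, B, C, j))
        t.start()
        workers.append(t)
    # The original thread does the processing for the last column.
    docol(A, B, C, n - 1)
    # Stand-in for "join N": wait until every column is finished.
    for t in workers:
        t.join()
    return C
```

Each worker writes only the elements of its own column, so no locking of C is needed. In standard CPython builds the threads do not execute bytecode simultaneously, so the sketch shows the structure of the algorithm rather than any speedup.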

The join command is of particular importance here. It has an implicit counter, shared by all processes (processors) executing the code, that starts at 0. A process executing join increments this counter and compares it to N, the argument of join; the increment and comparison must happen as one indivisible step so that no two processes observe the same count. If the count is not N, the process terminates itself. The process that finds the count equal to N continues execution beyond the join command. This way exactly one process, the last to arrive, continues past join, and all other processes are terminated.
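The counter behaviour described above can be sketched as follows. The names `JoinCounter` and `run` are hypothetical, threads stand in for processes, a lock is assumed as the means of making the increment and comparison indivisible, and "killing itself" is modelled by the thread simply returning.

```python
import threading


class JoinCounter:
    """Hypothetical shared counter behind a join command."""

    def __init__(self):
        self._count = 0                  # starts at 0, shared by all processes
        self._lock = threading.Lock()

    def join(self, n):
        """Return True only for the caller that brings the count to n."""
        with self._lock:                 # increment and compare as one step
            self._count += 1
            return self._count == n


def run(n):
    """Start n workers and return the ids of those that pass the join."""
    counter = JoinCounter()
    survivors = []

    def process(pid):
        # ... the column computation would happen here ...
        if not counter.join(n):
            return                       # count is not n: process terminates
        survivors.append(pid)            # only the last arriver gets here

    threads = [threading.Thread(target=process, args=(p,)) for p in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                         # test harness only: wait for all
    return survivors
```

Whichever worker happens to arrive last is the one that continues, so the surviving id can differ from run to run, but `run(n)` should always return a list with exactly one element.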
