376 Parallel Processing and Parallel Algorithms

Two-Dimensional Mesh SIMD Model

Given a two-dimensional mesh SIMD architecture with wraparound connections, there is an algorithm that uses n^2 processors to perform matrix-matrix multiplication. Consider an arbitrary element C[i,j] of the product matrix. If B_j denotes the j-th column vector of B and A_i denotes the i-th row vector of A, then C[i,j] is the product of row i of matrix A and column j of matrix B. The parallel algorithm computes the product in three phases.

Initially, the processor P_{i,j} located at position (i,j), row i and column j, stores the elements A[i,j] and B[i,j] of the matrices. In this distribution, only n processors (those with i = j) contain a pair of elements of A and B whose product is a term of their own element of C. However, it is possible to broadcast elements so that every processor has the appropriate elements to produce its specific element of the product C = A * B. This can be done by an upward rotation of the elements of B and a leftward rotation of the elements of A stored in each processor. This initial distribution of the elements of the matrices is phase one of the algorithm.

In phase two of the algorithm, each processor computes the product of its two stored elements. In phase three, building on the result of phase two, the elements of A and B are sent to the neighboring processors in the leftward and upward directions, respectively. After n iterations of phase three of the algorithm, the element C[i,j] of the product is present in the processor P_{i,j}.

Procedure Parallel_Matrix_Matrix(A, B, C)
  Phase 1
    for k = 0 to n - 1 do
      for P_{i,j} where 0 ≤ i, j ≤ n - 1 do in parallel
        if i > k then A[i, j - 1] = A[i, j] endif
        if j > k then B[i - 1, j] = B[i, j] endif
      endfor P_{i,j}
    endfor
  Phase 2
    for P_{i,j} where 0 ≤ i, j ≤ n - 1 do in parallel
      C[i, j] = A[i, j] * B[i, j]
    endfor P_{i,j}
  Phase 3
    for k ... to n - 1 do
      for P_{i,j} where 0 ...
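The three phases can be tried out with a small sequential simulation of the mesh. The sketch below is only an illustration under stated assumptions, not the book's procedure: the function name mesh_matrix_multiply is hypothetical, each "do in parallel" step is modeled by building new register arrays from the old ones, and index arithmetic modulo n stands in for the wraparound links. The phase 3 listing is cut off in the text, so the third loop here assumes the usual rotate-and-accumulate scheme (n - 1 shift steps after the product from phase two).

```python
def mesh_matrix_multiply(A, B):
    """Simulate matrix multiplication on an n x n wraparound mesh.

    a[i][j], b[i][j] and c[i][j] play the role of the local registers
    of processor P(i,j). The function name is hypothetical.
    """
    n = len(A)
    a = [row[:] for row in A]
    b = [row[:] for row in B]

    # Phase 1 (alignment): row i of A is rotated left i times and
    # column j of B is rotated up j times, one step per value of k.
    for k in range(n):
        new_a = [row[:] for row in a]
        new_b = [row[:] for row in b]
        for i in range(n):
            for j in range(n):
                if i > k:
                    new_a[i][(j - 1) % n] = a[i][j]  # send to left neighbor
                if j > k:
                    new_b[(i - 1) % n][j] = b[i][j]  # send to upper neighbor
        a, b = new_a, new_b

    # Phase 2: every processor multiplies its two stored elements.
    c = [[a[i][j] * b[i][j] for j in range(n)] for i in range(n)]

    # Phase 3 (ASSUMPTION, the source listing is truncated here):
    # rotate all of A left and all of B up by one position, then add
    # the new local product to the running sum. Repeat n - 1 times.
    for _ in range(n - 1):
        a = [[a[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b = [[b[(i + 1) % n][j] for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                c[i][j] += a[i][j] * b[i][j]
    return c


print(mesh_matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# prints [[19, 22], [43, 50]]
```

After phase 1 in this sketch, processor P(i,j) holds A[i, (i + j) mod n] and B[(i + j) mod n, j], so the two elements it multiplies always share the same inner index. Each later shift advances that index by one, which is why every processor sees all n terms of its own element of C.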
