# Advanced Parallel Computing for Scientific Applications

Prof. I. F. Sbalzarini, ETH Zentrum, CAB G34, CH-8092 Zürich

Prof. P. Arbenz, ETH Zentrum, CAB H89, CH-8092 Zürich

Autumn Term 2010

Exercise 3
Release: 12 Oct. 2010, Due: 26 Oct. 2010

1 Practice in C/C++

The following two assignments illustrate the effects of caching in matrix operations. C and C++ store matrices and higher-dimensional arrays in row-major order, so the elements of one row lie next to each other in memory. Hence accessing elements row by row is more cache-efficient than accessing them column by column.
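To make the access-pattern difference concrete, here is a minimal sketch that sums a matrix stored as a 1-D row-major array in two ways. The function names `sumRowWise` and `sumColumnWise` are hypothetical and not part of the exercise files; the storage in a `std::vector<double>` is likewise an assumption.

```cpp
#include <cstddef>
#include <vector>

// Element (i, j) of a rows x cols row-major matrix lives at a[i * cols + j].

// Row-wise traversal: the inner loop advances j, so consecutive
// iterations read adjacent memory locations (stride 1).
double sumRowWise(const std::vector<double>& a, std::size_t rows, std::size_t cols) {
    double s = 0.0;
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            s += a[i * cols + j];
    return s;
}

// Column-wise traversal: the inner loop advances i, so consecutive
// iterations are cols elements apart in memory (stride cols).
double sumColumnWise(const std::vector<double>& a, std::size_t rows, std::size_t cols) {
    double s = 0.0;
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            s += a[i * cols + j];
    return s;
}
```

Both functions visit the same elements and differ only in the order of visits. For matrices much larger than the cache, the strided version would typically be expected to run slower, although the size of the gap depends on the machine and compiler flags.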

Question 1: Matrix multiplication
The file matrixMult.cpp contains a program that measures the execution time of the matrix multiplication

$$C = A \cdot B$$

Each matrix is stored as a 1-D array. The multiplication is performed in the method void Multiply(...). To calculate each element of C, the elements of A are accessed row-wise and those of B column-wise, which causes many cache misses, especially for large matrices. Better cache usage can be achieved if the matrix B is transposed and the multiplication is modified accordingly so that it gives the same result as before.

Your task is to implement the methods void InPlaceTranspose(...) and void MultiplyEfficient(...). Compile your code using the default GNU compiler:

g++ -o mult matrixMult.cpp

Do you observe better performance in the case of large matrices?
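The idea of multiplying with a transposed B can be sketched as follows. This is only an illustration under stated assumptions: the matrices are square (n × n) and stored row-major in `std::vector<double>`, and the names `transposeSquareSketch` and `multiplyTransposedSketch` are hypothetical. The actual signatures of InPlaceTranspose(...) and MultiplyEfficient(...) in matrixMult.cpp are not shown here and may differ.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Transpose a square n x n row-major matrix in place by swapping
// each element above the diagonal with its mirror image below it.
void transposeSquareSketch(std::vector<double>& b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j)
            std::swap(b[i * n + j], b[j * n + i]);
}

// Compute C = A * B, where bt holds the TRANSPOSE of B.
// Since bt[j * n + k] == B[k * n + j], the inner loop reads both
// A and bt with stride 1.
void multiplyTransposedSketch(const std::vector<double>& a,
                              const std::vector<double>& bt,
                              std::vector<double>& c,
                              std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * bt[j * n + k];
            c[i * n + j] = sum;
        }
    }
}
```

In this sketch the caller transposes B once, then calls the multiplication with the transposed array; c must already have n * n elements. The transpose costs O(n²) operations, which is small compared with the O(n³) multiplication for large n.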

Question 2: Matrix norm
You have to calculate the 1-norm and infinity norm of a matrix A of size m × n, given by

$$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{ij}|$$

$$\|A\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}|$$
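As a rough sketch of how the two norms could be computed while keeping memory access row-wise, consider the functions below. The names `oneNormSketch` and `infNormSketch`, and the row-major `std::vector<double>` storage, are assumptions made for illustration only. The 1-norm needs column sums, but instead of walking down each column, the sketch keeps one running sum per column and updates all of them while scanning each row.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// 1-norm: maximum absolute column sum of an m x n row-major matrix.
// The matrix is still traversed row by row; colSum[j] accumulates
// the sum of |a_ij| over i for column j.
double oneNormSketch(const std::vector<double>& a, std::size_t m, std::size_t n) {
    std::vector<double> colSum(n, 0.0);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j)
            colSum[j] += std::fabs(a[i * n + j]);
    return colSum.empty() ? 0.0 : *std::max_element(colSum.begin(), colSum.end());
}

// Infinity norm: maximum absolute row sum of an m x n row-major matrix.
double infNormSketch(const std::vector<double>& a, std::size_t m, std::size_t n) {
    double best = 0.0;
    for (std::size_t i = 0; i < m; ++i) {
        double rowSum = 0.0;
        for (std::size_t j = 0; j < n; ++j)
            rowSum += std::fabs(a[i * n + j]);
        best = std::max(best, rowSum);
    }
    return best;
}
```

For example, for the 2 × 3 matrix with rows (1, -2, 3) and (-4, 5, -6), the absolute column sums are 5, 7, 9 and the absolute row sums are 6 and 15, so the 1-norm is 9 and the infinity norm is 15.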
