Part 2

Published by: Vivian Liu on Apr 02, 2012
Copyright: Attribution Non-commercial

Vivian Liu cs61c-ec
Alexander Javad cs61c-dr
Part 2
A brief description of any changes you made to your code from part 1 in order to get it to run well on a range of matrix sizes (max 150 words)

We implemented cache blocking to improve throughput for larger matrices, then padded each matrix so that the fringes would not need a separate case. The padded size of a matrix was its original size rounded up to the next multiple of the block size. We also stored 13 sum vectors locally at a time to optimize performance. We decided not to pad matrices whose size is divisible by 32 and simply used a block size of 32 for those cases.

A brief description of how you used OpenMP pragmas to parallelize your code in sgemm-openmp.c (max 150 words).

We placed the entire body of code inside a '#pragma omp parallel' block and placed '#pragma omp for' statements outside all of the for loops.

A plot showing the speedup of sgemm-all.c over sgemm-naive.c for values of N between 64 and 1024
[Plot: speedup of sgemm-all over sgemm-naive. x-axis: n (ticks from 64 to 993); y-axis: Gflops/s (0 to 16); series: sgemm-naive, sgemm-all.]
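
The padding, cache blocking, and OpenMP placement described in the answers above could look roughly like the following. This is a minimal sketch and not the submitted code. The function names (padded_size, pad_matrix, sgemm_blocked), the square column-major layout, and the C += A * B form are assumptions made for illustration, and the 13 locally stored sum vectors are not shown. When n is already a multiple of the block size, padded_size returns n unchanged, which loosely mirrors the decision not to pad sizes divisible by 32.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Round n up to the next multiple of block (n itself if already a multiple). */
int padded_size(int n, int block) {
    assert(n > 0 && block > 0);
    return ((n + block - 1) / block) * block;
}

/* Copy an n x n column-major matrix into a zero-filled np x np buffer. */
float *pad_matrix(const float *src, int n, int np) {
    float *dst = calloc((size_t)np * np, sizeof(float));
    if (!dst) return NULL;
    for (int j = 0; j < n; j++)
        memcpy(dst + (size_t)j * np, src + (size_t)j * n, n * sizeof(float));
    return dst;
}

/* Hypothetical sketch: C += A * B for n x n column-major matrices.
   Padding with zeros removes the fringe case from the blocked loops. */
int sgemm_blocked(int n, int block, const float *A, const float *B, float *C) {
    int np = padded_size(n, block);
    float *Ap = pad_matrix(A, n, np);
    float *Bp = pad_matrix(B, n, np);
    float *Cp = pad_matrix(C, n, np);
    if (!Ap || !Bp || !Cp) { free(Ap); free(Bp); free(Cp); return -1; }

    #pragma omp parallel
    {
        /* Threads split the column blocks of C, so no two threads
           ever write the same element of Cp. */
        #pragma omp for
        for (int jb = 0; jb < np; jb += block)
            for (int kb = 0; kb < np; kb += block)
                for (int ib = 0; ib < np; ib += block)
                    for (int j = jb; j < jb + block; j++)
                        for (int k = kb; k < kb + block; k++) {
                            float b = Bp[k + (size_t)j * np];
                            for (int i = ib; i < ib + block; i++)
                                Cp[i + (size_t)j * np] += Ap[i + (size_t)k * np] * b;
                        }
    }

    /* Copy the unpadded n x n corner of the result back into C. */
    for (int j = 0; j < n; j++)
        memcpy(C + (size_t)j * n, Cp + (size_t)j * np, n * sizeof(float));
    free(Ap); free(Bp); free(Cp);
    return 0;
}
```

A call such as sgemm_blocked(n, 32, A, B, C) would use the block size of 32 mentioned above. Compiled without OpenMP support, the pragmas are ignored and the same code runs serially.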
