PERFORMANCE ANALYSIS FORMULAS

Amdahl's Law

Let f be the fraction of operations in a computation that must be performed sequentially, where 0 ≤ f ≤ 1. The maximum speedup ψ achievable by a parallel computer with p processors performing the computation is

    ψ ≤ 1 / (f + (1 − f)/p)

Gustafson-Barsis's Law

Given a parallel program solving a problem of size n using p processors, let s denote the fraction of total execution time spent in serial code. The maximum speedup ψ achievable by this program is

    ψ ≤ p + (1 − p)s

Karp-Flatt Metric

Given a parallel computation exhibiting speedup ψ on p processors, where p > 1, the experimentally determined serial fraction e is defined to be

    e = (1/ψ − 1/p) / (1 − 1/p)

Isoefficiency Relation

Suppose a parallel system exhibits efficiency ε(n, p), where n denotes problem size and p denotes number of processors. Define C = ε(n, p) / (1 − ε(n, p)). Let T(n, 1) denote sequential execution time, and let T₀(n, p) denote parallel overhead (the total amount of time spent by all processors performing communications and redundant computations). In order to maintain the same level of efficiency as the number of processors increases, problem size must be increased so that the following inequality is satisfied:

    T(n, 1) ≥ C · T₀(n, p)

Parallel Programming in C with MPI and OpenMP

Michael J. Quinn
Oregon State University

McGraw-Hill Higher Education

Boston  Burr Ridge, IL  Dubuque, IA  Madison, WI  New York  San Francisco  St. Louis
Bangkok  Bogotá  Caracas  Kuala Lumpur  Lisbon  London  Madrid  Mexico City  Milan
Montreal  New Delhi  Santiago  Seoul  Singapore  Sydney  Taipei  Toronto

PARALLEL PROGRAMMING IN C WITH MPI AND OPENMP
International Edition 2003

Exclusive rights by McGraw-Hill Education (Asia) for manufacture and export. This book cannot be re-exported from the country to which it is sold by McGraw-Hill. The International Edition is not available in North America.

Published by McGraw-Hill, a business unit of The McGraw-Hill Companies, Inc., 1221 Avenue of the Americas, New York, NY 10020.
Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written consent of The McGraw-Hill Companies, Inc., including, but not limited to, in any network or other electronic storage or transmission, or broadcast for distance learning. Some ancillaries, including electronic and print components, may not be available to customers outside the United States.

10 09 08 07 06 05 04 03
20 09 08 07 06 05 04 03 CTP SLP

Library of Congress Cataloging-in-Publication Data

Quinn, Michael J. (Michael Jay)
  Parallel programming in C with MPI and OpenMP / Michael J. Quinn.
    p. cm.
  ISBN 0-07-282256-2
  1. C (Computer program language)  2. Parallel programming (Computer science)  I. Title.

When ordering this title, use ISBN 0-07-123265-6

Printed in Singapore

www.mhhe.com

With gratitude for their love, support, and guidance, I dedicate this book to my parents, Edward and Georgia Quinn.

Contents

2.7  Summary 58
2.8  Key Terms 59
2.9  Bibliographic Notes 59
2.10 Exercises 60

CHAPTER 3
Parallel Algorithm Design 63
3.1 Introduction 63
3.2 The Task/Channel Model 63
3.3 Foster's Design Methodology 64
    3.3.1 Partitioning 65
    3.3.2 Communication 67
    3.3.3 Agglomeration 68
    3.3.4 Mapping 70
3.4 Boundary Value Problem 72
    3.4.1 Introduction 72
    3.4.4 Agglomeration and Mapping 75
    3.4.5 Analysis 75
3.5 Finding the Maximum 77
    3.5.1 Introduction 77
    3.5.2 Partitioning 77
    3.5.3 Communication 77
    3.5.4 Agglomeration and Mapping 81
    3.5.5 Analysis 82
3.6 The n-Body Problem 82
    3.6.1 Introduction 82
    3.6.2 Partitioning 83
    3.6.3 Communication 83
    3.6.4 Agglomeration and Mapping 85
    3.6.5 Analysis 85
3.7 Adding Data Input 86
    3.7.2 Communication 87
    3.7.3 Analysis 88
3.8 Summary 89
3.9 Key Terms 90
3.10 Bibliographic Notes 90
3.11 Exercises 90

CHAPTER 4
Message-Passing Programming 93
4.1 Introduction 93
4.2 The Message-Passing Model 94
4.3 The Message-Passing Interface 95
4.4 Circuit Satisfiability 96
    4.4.2 Functions MPI_Comm_rank and MPI_Comm_size
    4.4.4 Compiling MPI Programs 102
    4.4.5 Running MPI Programs 102
4.5 Introducing Collective Communication
    4.5.1 Function MPI_Reduce 108
4.6 Benchmarking Parallel Performance 108
    4.6.1 Functions MPI_Wtime and MPI_Wtick
4.7 Summary 110
4.8 Key Terms 110
4.9 Bibliographic Notes 110
4.10 Exercises 111

CHAPTER 5
The Sieve of Eratosthenes 115
5.1 Introduction 115
5.2 Sequential Algorithm 115
5.3 Sources of Parallelism 117
5.4 Data Decomposition Options 118
    5.4.1 Interleaved Data Decomposition 118
    5.4.2 Block Data Decomposition 118
    5.4.3 Block Decomposition Macros 120
    5.4.4 Local Index versus Global Index 120
    5.4.5 Ramifications of Block Decomposition 121
5.5 Developing the Parallel Algorithm 121
    5.5.1 Function MPI_Bcast 122
5.6 Analysis of Parallel Sieve Algorithm 122
5.7 Documenting the Parallel Program 123
5.8 Benchmarking 128
5.9 Improvements 129
    5.9.1 Delete Even Integers 129
    5.9.2 Eliminate Broadcast 130
    5.9.3 Reorganize Loops 131
    5.9.4 Benchmarking 131
5.10 Summary 133
5.11 Key Terms 134
5.12 Bibliographic Notes 134
5.13 Exercises 134

CHAPTER 6
Floyd's Algorithm 137
6.1 Introduction 137
6.2 The All-Pairs Shortest-Path Problem 137
6.3 Creating Arrays at Run Time 139
6.4 Designing the Parallel Algorithm 140
    6.4.1 Partitioning 140
    6.4.2 Communication 141
    6.4.3 Agglomeration and Mapping 142
    6.4.4 Matrix Input/Output 143
6.5 Point-to-Point Communication 145
    6.5.1 Function MPI_Send 146
    6.5.2 Function MPI_Recv 147
    6.5.3 Deadlock 148
6.6 Documenting the Parallel Program 149
6.7 Analysis and Benchmarking 151
6.8 Summary 154
6.9 Key Terms 154
6.10 Bibliographic Notes 154
6.11 Exercises 154

CHAPTER 7
Performance Analysis 159
7.1 Introduction 159
7.2 Speedup and Efficiency 159
7.3 Amdahl's Law 161
    7.3.1 Limitations of Amdahl's Law 164
    7.3.2 The Amdahl Effect 164
7.4 Gustafson-Barsis's Law 164
7.5 The Karp-Flatt Metric 167
7.6 The Isoefficiency Metric 170
7.7 Summary 174
7.8 Key Terms 175
7.9 Bibliographic Notes 175
7.10 Exercises 176

CHAPTER 8
Matrix-Vector Multiplication 178
8.1 Introduction 178
8.2 Sequential Algorithm 179
8.3 Data Decomposition Options 180
8.4 Rowwise Block-Striped Decomposition 181
    8.4.1 Design and Analysis 181
    8.4.2 Replicating a Block-Mapped Vector 183
    8.4.3 Function MPI_Allgatherv 184
    8.4.4 Replicated Vector Input/Output 185
    8.4.5 Documenting the Parallel Program 187
    8.4.6 Benchmarking 187
8.5 Columnwise Block-Striped Decomposition 189
    8.5.1 Design and Analysis 189
    8.5.2 Reading a Columnwise Block-Striped Matrix 191
    8.5.3 Function MPI_Scatterv 191
    8.5.4 Printing a Columnwise Block-Striped Matrix 193
    8.5.5 Function MPI_Gatherv 193
    8.5.6 Distributing Partial Results 195
