
Parallelizing the Barnes-Hut Algorithm with OpenMP

ABSTRACT
Parallelization is an important technique for problems that demand large amounts of computation, as commonly arise in the basic sciences. The N-body problem is one of the fundamental problems of astrophysics; it concerns the interaction forces between particles in space. There are many approaches to it, among them the Barnes-Hut algorithm. OpenMP is an application programming interface (API) that gives programmers a flexible, portable interface for developing parallel applications on shared-memory computers. This thesis gives an overview of the N-body problem, the Barnes-Hut algorithm, and the OpenMP API. Based on an evaluation of the performance of the Barnes-Hut algorithm, it studies, analyzes, and proposes ways of parallelizing the algorithm with OpenMP.

Lê Thị Lan Phương


ACKNOWLEDGEMENTS
First of all, I would like to express my deepest gratitude to Dr. Nguyễn Hải Châu, who supervised and guided me devotedly throughout the work on this thesis. I sincerely thank my teacher Phạm Kỳ Anh, director of the Center for High Performance Computing at the Hanoi University of Science, Vietnam National University, Hanoi, who gave me the best possible conditions for practicing and testing the algorithm. I also thank all the teachers and staff of the Center, who helped me and answered every question, making it possible for me to complete this thesis. I thank my teacher Đoàn Minh Phương, lecturer in the Department of Computer Networks and Communications, Faculty of Information Technology, College of Technology, who helped me run the experiments on an Intel multiprocessor machine. Finally, I am deeply grateful to my family, who have always cared for me and encouraged me in my studies and in life.

Thesis author: Lê Thị Lan Phương


List of figures

Figure 1: Illustration of an N-body system in space
Figure 2: The net force acting on one particle
Figure 3: Observing the Andromeda galaxy from Earth
Figure 4: The recursive replacement of a cluster by its center point
Figure 5: A quadtree with 4 levels
Figure 6: An octree with 2 levels
Figure 7: The tree after the empty cells are removed
Figure 8: The components of OpenMP
Figure 9: Shared-memory architecture
Figure 10: The fork-join model
Figure 11: Illustration of a parallelized region
Figure 12: Illustration of the Do/for directive
Figure 13: Illustration of the sections directive
Figure 14: Illustration of the single directive
Figure 15: Tree data structure in treecode (1)
Figure 16: Tree data structure in treecode (2)


List of abbreviations and terms

    Meaning                               Abbreviation or term   English term
    Application programming interface     API                    Application Program Interface
    Open directives for multiprocessing   OpenMP                 Open Specifications for Multi Processing
    Thread                                thread                 Thread
    Particle                              body                   Body
    Cell, box                             cell                   Cell
    Node                                  node                   Node
    Message-passing interface             MPI                    Message Passing Interface
    Center of mass                        (none given)           Center of mass


Table of contents

ABSTRACT
ACKNOWLEDGEMENTS
List of figures
List of abbreviations and terms
Table of contents
INTRODUCTION
Chapter 1: THE N-BODY PROBLEM AND THE BARNES-HUT ALGORITHM
    1.1 The N-body problem
        1.1.1 Introduction to the N-body problem
        1.1.2 Methods for speeding up the N-body problem
        1.1.3 The quadtree and octree structures
    1.2 The Barnes-Hut algorithm
        1.2.1 Description of the Barnes-Hut algorithm
Chapter 2: INTRODUCTION TO OPENMP
    2.1 OpenMP (Open specifications for Multi Processing)
    2.2 Shared-memory architecture
    2.3 Goals of OpenMP
    2.4 Environments that support OpenMP
    2.5 The OpenMP programming model
    2.6 Some basic OpenMP directives
        2.6.1 Parallelization directives
        2.6.2 The parallel region directive
        2.6.3 Data environment directives
        2.6.4 Work-sharing directives
        2.6.5 Synchronization directives
        2.6.6 Library routines and some environment variables
    2.7 An example of parallel programming with OpenMP
        2.7.1 omp_hello.c
        2.7.2 How to compile
        2.7.3 Results
Chapter 3: PARALLELIZING THE BARNES-HUT ALGORITHM
    3.1 Treecode
        3.1.1 The tree data structure
        3.1.2 Global variables
    3.2 Testing treecode and evaluating its performance
        3.2.1 Testing the treecode program
        3.2.2 Performance evaluation
    3.3 Parallelizing treecode with OpenMP
        3.3.1 The parallel execution environment
        3.3.2 Parallel implementation
    3.4 Experimental results
CONCLUSION
REFERENCES


INTRODUCTION
The N-body problem is one of the fundamental problems of astrophysics. Many different approaches have been taken to computing the interaction forces between the particles of a system of N particles in space. Two of them are basic: computing the force between every pair of particles directly, with complexity O(N^2), and the iterative potential method, with complexity O(N log N).

The first approach computes the interaction forces almost exactly. However, its running time is very large, roughly O(N^2) where N is the number of particles. Force calculation dominates that time: it took about 96% of the program's running time when tested on a single-CPU Intel machine. The second approach appears to reduce the computation time, but it is less accurate and less general as a simulation of an N-body system. The Barnes-Hut algorithm and its refinements compute the forces with complexity of roughly O(N log N) and give reasonably accurate results.

Parallelizing the Barnes-Hut algorithm is therefore very important for speeding up N-body computations. Parallelizing it on distributed-memory computers using the MPI application programming interface has been studied by many authors, with good results. Parallelizing it on shared-memory multiprocessors, however, has not been studied as much. OpenMP is one of the application programming interfaces for parallel applications on shared-memory multiprocessors. Compared with MPI, OpenMP is flexible and highly portable, and it gives the programmer a simple interface for building and developing parallel applications.

This thesis surveys the N-body problem and studies the Barnes-Hut algorithm and the OpenMP programming interface. From this it draws observations about and evaluates the performance of the algorithm, and it investigates how to parallelize the Barnes-Hut algorithm with OpenMP on the shared-memory model.


Chapter 1: THE N-BODY PROBLEM AND THE BARNES-HUT ALGORITHM

1.1 The N-body problem
1.1.1 Introduction to the N-body problem
The N-body problem is, in essence, the problem of computing the interaction forces between particles in space. N-body simulations play an important role in many applications in astrophysics, molecular dynamics, and vortex flow methods.

Consider a system of N particles in space that interact through gravity. Because there are N particles, each particle is subject to N-1 different forces exerted by the other particles, and the resultant of these N-1 forces changes the particle's velocity and position.

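Figure 1: Illustration of an N-body system in space

The basic algorithm for simulating an N-body system is as follows.

    while (t < tfinal) {
        for i = 1 to n do {
            compute the force f(i) acting on particle i
            update the velocity and position of particle i
        }
        t = t + Δt
    }

Here the force f(i) acting on particle i can be computed simply as:

    for i = 1 to n
        f(i) = sum[ j=1,...,n, j!=i ] f(i,j)   /* f(i,j) is the force exerted by particle j on particle i */
    end for

The interaction force between particles depends on the distance between them. Therefore, as the particles move, the force acting on each of them must be recomputed. Consider a particle moving under the gravitational force

\[ F = G \, \frac{m_a m_b}{d^2} \]

where:
    F is the gravitational force between the two particles a and b.
    G is the gravitational constant, G = (6.6742 ± 0.001) × 10^{-11} N m^2 kg^{-2}.
    m_a, m_b are the masses of particles a and b respectively.
    d is the distance between the two particles.

The net force acting on particle a is the sum of the forces exerted by all the other particles:

\[ \mathbf{F}_{net} = \sum_{b \neq a} \mathbf{F}_{ab} \]

and the acceleration of the particle is then:

\[ \mathbf{a} = \frac{\mathbf{F}_{net}}{m_a} \]

Figure 2: The net force acting on one particle

Consider the times t_0 and t_1, separated by the interval Δt. Under the force F_net, the velocity of the particle becomes:

\[ \mathbf{v}(t_1) = \mathbf{v}(t_0) + \frac{\mathbf{F}_{net}}{m_a} \, \Delta t \]

and its position along the x axis becomes:

\[ x(t_1) = x(t_0) + v_x(t_0) \, \Delta t \]

In three-dimensional space, the distance between two particles a and b is:

\[ d = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2 + (z_a - z_b)^2} \]

The component of the gravitational force along the Ox axis is:

\[ F_x = G \, \frac{m_a m_b}{d^3} \, (x_b - x_a) \]

The N-body computation can therefore be written simply as follows: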

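    for (t = 0; t < tmax; t++) {
        for (i = 0; i < N; i++) {
            F = Calc_force(i);              /* net force on particle i */
            vnew[i] = v[i] + F * dt / m;    /* new velocity */
            xnew[i] = x[i] + v[i] * dt;     /* new position */
        }
        for (i = 0; i < N; i++) {           /* commit the step for all particles */
            x[i] = xnew[i];
            v[i] = vnew[i];
        }
    }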

Clearly, with N particles in space, computing the interaction force directly for every pair of particles gives the problem a complexity of O(N^2). How, then, can the computation time be reduced?

1.1.2 Methods for speeding up the N-body problem

Traditionally, the N-body problem was modeled by direct integration, in which the interaction force is computed for each pair of particles, as described in the previous section. This method describes the dynamical state of the N-particle system almost exactly, but its complexity is O(N^2). The particles can also be simulated with the iterative potential method, whose complexity is only O(N log N), but which is less accurate and less general. The question is therefore how to reduce the complexity of the algorithm while still simulating a general system of N particles in space with reasonable accuracy.

Consider the following practical example. Suppose we need to compute the gravitational force that the Earth exerts on the stars and planets. Looking with the naked eye, we see many points of light that each look like a single star but are in fact a whole cluster of stars (for example the Andromeda nebula), consisting of billions of individual stars. Because they appear so close together, they look like one point of light.
Let D be the size of the box enclosing the Andromeda cluster.
Let r be the distance from the center of mass of the cluster to the Earth.

Form the ratio:

    D/r = (size of the box) / (distance from the center of mass to the Earth)

This ratio is very small, so all the stars in the Andromeda cluster can be replaced, with reasonable accuracy, by a single point x located at the center of mass of the cluster.

Figure 3: Observing the Andromeda galaxy from Earth

This idea was discovered long ago and has been applied to many problems. In classical mechanics, for example, when computing the Earth's pull on a falling apple, Newton treated the Earth as a point located at its center. What is new here is that the idea is applied recursively to solve the N-body problem. For instance, seen from the Andromeda nebula, the Milky Way can be approximated by a point at its center of mass. More importantly, this process can be repeated many times: as long as the ratio D1/r1 is small enough, the stars in a smaller box can likewise be replaced by a point at the center of mass of that box when computing the gravitational force.

Figure 4: The recursive replacement of a cluster by its center point

To make the recursive subdivision of space simple, a special data structure is used: the quadtree and the octree.

1.1.3 The quadtree and octree structures

The quadtree (used in two-dimensional space) and the octree (used in three-dimensional space) are data structures for subdividing space. For simplicity we consider the quadtree; an octree is built in the same way. A quadtree starts from a square in the plane. This large square is divided into 4 smaller squares, each of which is again divided into 4, and the subdivision continues in this way. The figure below shows a quadtree with 4 levels.

Figure 5: A quadtree with 4 levels

Each node of the tree has 4 children, namely the 4 smaller squares created by dividing the larger square. For an octree the process is the same, except that each node has 8 children instead of 4. The figure below shows an octree with 2 levels of subdivision.

Figure 6: An octree with 2 levels

The leaves of the quadtree store the positions and masses of the particles contained in the corresponding box. However, if the particles are not distributed uniformly in space, subdividing as above leaves many leaves of the tree empty, and storing those empty leaves is wasteful. To avoid this, a square is subdivided only if it contains more than one particle. The resulting tree has the following form.

Figure 7: The tree after the empty cells are removed

1.2 The Barnes-Hut algorithm

The Barnes-Hut algorithm was first introduced in the paper "A hierarchical O(N log N) force-calculation algorithm" in December 1986. Although it is less accurate than the FMM (Fast Multipole Method), it computes faster. The Barnes-Hut algorithm is used quite widely in astrophysics. Building on the foundation laid by Barnes and Hut, many refinements and extensions have appeared, such as force calculation on vector computers (Barnes 1990), the algorithm of Greengard (1990), and J. Makino, "Treecode with a special-purpose processor", Publ. Astron. Soc. Japan 43 (1991) 621-638. These refinements aim to improve accuracy, to increase speed under parallelization, and to allow implementation on the special-purpose GRAPE computers.

1.2.1 Description of the Barnes-Hut algorithm


The Barnes-Hut algorithm uses a divide-and-conquer strategy to find clusters of particles in the N-body problem. Suppose all the particles lie inside one three-dimensional box. The octree is built recursively by dividing the box into 8 smaller cells. Cells that contain no particle are discarded. The leaves of the tree are the cells that contain exactly one particle, and the subdivision is repeated until every cell contains only one particle.

For simplicity, we consider the Barnes-Hut algorithm in two-dimensional space. The algorithm can be described in 3 steps:
    Step 1: Build the quadtree.
    Step 2: For each square of the tree, compute the center of mass and the total mass of the particles in the cell.
    Step 3: For each particle, traverse the tree to compute the force acting on it.

Step 1: Build the quadtree.

    procedure QuadtreeBuild
        Quadtree = {empty}
        for i = 1 to n                  ... loop over all particles
            QuadInsert(i, root)         ... insert particle i into the tree
        end for
        traverse the tree and remove the empty leaves

    procedure QuadInsert(i, n)
        ... insert particle i into node n of the tree
        ... while the tree is being built, note that each leaf
        ... contains either 1 or 0 particles
        if the subtree rooted at n contains more than 1 particle
            determine the child c of n in which particle i lies
            QuadInsert(i, c)
        else if the subtree rooted at n contains exactly 1 particle
            add the 4 children of n to the quadtree
            move the particle already in n into the child in which it lies
            let c be the child of n in which particle i lies
            QuadInsert(i, c)
        else if the subtree rooted at n is empty
            store particle i in node n
        end if

Step 2: For each square of the tree, compute the center of mass and the total mass of the particles in it.

    ... compute the center of mass and total mass for every cell
    (mass, cm) = Compute_Mass(root)     ... cm = center of mass

    function (mass, cm) = Compute_Mass(n)
        ... compute the mass and center of mass
        ... of all the particles in the subtree rooted at n
        if n contains 1 particle
            ... mass and cm are those of that particle
            store (mass, cm) at n
            return (mass, cm)
        else
            for all children c(i) of n (i = 1, 2, 3, 4)
                (mass(i), cm(i)) = Compute_Mass(c(i))
            end for
            mass = mass(1) + mass(2) + mass(3) + mass(4)
            cm = (mass(1)*cm(1) + mass(2)*cm(2) + mass(3)*cm(3) + mass(4)*cm(4)) / mass
            store (mass, cm) at n
            return (mass, cm)
        end

Step 3: For each particle, traverse the tree to compute the force acting on it.

To compute the force on a particle, consider the ratio:

    D/r = (size of the box) / (distance from the particle to the center of mass of the box)

If D/r is small enough, the force exerted by the particles in the box can be computed from the total mass and the center of mass of the box. Let θ (theta) be the threshold, or opening angle, used in the computation (usually 0 < θ <= 1). If D/r < θ, the gravitational force on the particle is computed as follows.
    (x, y, z) is the position of the particle in three-dimensional space.
    m is the mass of the particle.
    (xcm, ycm, zcm) is the position of the center of mass of the particles in the box.
    mcm is the total mass of the particles in the box.
    G is the gravitational constant.
The gravitational force is then given by the approximation:

\[ \mathbf{Force} = G \, m \, m_{cm} \left( \frac{x_{cm} - x}{r^3}, \; \frac{y_{cm} - y}{r^3}, \; \frac{z_{cm} - z}{r^3} \right) \qquad (*) \]

where

\[ r = \sqrt{(x_{cm} - x)^2 + (y_{cm} - y)^2 + (z_{cm} - z)^2} \]

is the distance from the particle to the center of mass of the particles in the box.

If D/r >= θ, the force computation is applied recursively: the force on the particle is the sum of the forces exerted on it by the child nodes. The force computation of this step can be described as follows.

    ... for each particle, traverse the tree to compute the force on it
    for i = 1 to n
        f(i) = TreeForce(i, root)
    end for

    function f = TreeForce(i, n)
        ... compute the gravitational force on particle i
        ... due to all the particles in node n
        f = 0
        if n contains 1 particle
            f = force computed with formula (*)    ... zero if that particle is i itself
        else
            r = distance from particle i to the center of mass of n
            D = size of cell n
            if D/r < theta
                compute f with formula (*)
            else
                for all children c of n
                    f = f + TreeForce(i, c)
                end for
            end if
        end if

The algorithm shows that the tree traversal that computes the force on a particle is independent for each particle. This step can therefore be parallelized to speed up the N-body computation.

Many algorithms now exist for parallelizing the force computation in an N-body system. There are two main directions for parallelizing the algorithm:
    1) Parallelization with MPI, using distributed memory
    2) Parallelization with OpenMP, using shared memory
This thesis considers parallelizing the Barnes-Hut algorithm with OpenMP.

Chapter 2: INTRODUCTION TO OPENMP


2.1 OpenMP (Open specifications for Multi Processing)
Parallel programming on shared-memory machines plays an important role in high-performance computing. However, exploiting the facilities of a shared-memory system is not easy for the programmer. The message-passing interface MPI is used mostly on distributed-memory systems. MPI scales and ports very well, but it is of little help when starting from code written for sequential computers, and it does not take advantage of what a shared-memory system offers. Over the years many products were introduced that aimed at portability and high performance on particular systems, but portability problems still arose when using them.

OpenMP is regarded as a standard application program interface (API) for shared-memory programming. It is a combination of compiler directives, library routines, and environment variables used to specify shared-memory parallelism in Fortran or C/C++ programs. OpenMP provides a parallel programming model that is portable across shared-memory architectures from different vendors.

Figure 8: The components of OpenMP

2.2 Shared-memory architecture

A shared-memory system consists of several processors (CPUs), each of which accesses a common memory through an interconnect or a bus. Because a single address space is used, every processor has the same view of memory. Processors communicate by reading and writing data in the shared memory. With this scheme the access time to every piece of data is the same, because all communication goes through the bus.

The advantage of this architecture is that it is easy to program: no explicit communication between processors is required, since they simply access the common memory. Access to the common memory is controlled by techniques developed for multitasking computers, such as semaphores.

To avoid the bottleneck caused by many processors accessing the same memory location, the common memory is divided into modules. Each memory module is connected to a processor through a high-performance switching network.

Figure 9: Shared-memory architecture

Some examples of shared-memory computers:
- SGI Origin2000: an effective combination of shared-memory and distributed-memory architecture. The memory is physically distributed among the nodes, with 2 processors per node. The processors of a node have equal access to the node's local memory. From the shared-memory point of view, all nodes have the same access to the physically distributed memory (http://www.cray.com/products/systems/origin2000)
- Sun HPC servers, such as the Enterprise 3000 (1 to 6 processors) or the Enterprise 10000 (4 to 64 processors) (http://www.sun.com/servers)
- HP Exemplar series, such as the S-class (4 to 16 processors) and the X-class (up to 64 processors) (http://www.hp.com/pressrel/sep96/30sep96a.htm)
- DEC Ultimate Workstation: 2 processors, each running at a very high clock speed (533 MHz) (http://www.workstation.digital.com/products/uwseries/uwproduct.html)

2.3 Goals of OpenMP

- A standard interface: OpenMP provides a standard interface for shared-memory computer systems.
- Simplicity: it consists of a small set of directives that are simple and easy to use for programming shared-memory systems. A parallel program may need only 3 or 4 compiler directives.
- Ease of use:
    o Sequential programs can be parallelized incrementally.
    o Both coarse-grained and fine-grained parallelism can be expressed.
- Portability: it supports Fortran, C, and C++.

2.4 Environments that support OpenMP

As we have seen, OpenMP is a parallel application programming model built on shared-memory architecture. It is portable and scalable, and it gives the programmer a simple and flexible interface for building applications.

OpenMP was developed jointly by the following companies:
- Digital Equipment Corp. (http://www.digital.com/info/hpc/)
- IBM (http://www.ibm.com/)
- Intel Corporation (http://www.intel.com/)
- Kuck & Associates Inc. (http://www.kai.com/)
- Silicon Graphics Inc. (http://www.sgi.com/Technology/OpenMP/)

Today many major hardware vendors, software vendors, and application developers endorse OpenMP. See the website http://www.openmp.org for up-to-date information, versions, and the OpenMP standards.

Some examples of compilers that support OpenMP:
- Absoft Pro FortranMP 6.0 (http://www.absoft.com/pro.win.html)
- IBM XL Fortran (http://www.software.ibm.com/ad/fortran/xlfortran/)
- KAI KAP/Pro Toolset (http://www.kai.com/kpts/_index.html)

2.5 The OpenMP programming model

OpenMP is based on the existence of multiple threads in shared-memory parallel programming. It is an explicit programming model that gives the programmer full control over parallelization. The OpenMP programming model can be viewed as a fork-join model.

Figure 10: The fork-join model

In the fork-join model, every OpenMP program starts as a single thread, the master thread. The master thread executes sequentially until it encounters a directive that declares a parallel region.
- Fork: on encountering the parallel directive, the master thread creates a team of parallel threads. The statements in the parallel region are then executed in parallel by the threads of the team.
- Join: when the threads have finished their work, they synchronize and terminate, leaving only the master thread.

2.6 Some basic OpenMP directives

Code is parallelized by means of compiler directives. Directives are added to the source code to indicate which parts are to be executed in parallel on the shared-memory system, together with some special directives that specify how the parallelization is to be done. The advantage of this technique is that it is easy to use and portable between sequential computers and multiprocessors, because a sequential compiler treats the directives as comments.

A note on syntax: OpenMP supports C/C++ and Fortran. The semantics are the same, but the way directives are written differs. For example:

Fortran:

    !$OMP PARALLEL
          call work(x,y)
    !$OMP END PARALLEL

C/C++:

    #pragma omp parallel
    {
        work(x,y);
    }

The following sections summarize the basic directives used when programming with OpenMP (in C/C++).

2.6.1 Parallelization directives

Parallelization directives added to the source code have the following form.
Syntax:

    #pragma omp directive_name [clauses] <new_line>

Note: every directive must end with a new line.

2.6.2 The parallel region directive

This directive defines the region to be executed in parallel. When a thread encounters a parallel region directive, it creates a team of threads and becomes the master thread of that team. The master thread has ID 0. The number of threads is determined by an environment variable or by a call to a library routine. All the threads of the team execute the code inside the parallel region.

Figure 11: Illustration of a parallelized region (diagram labels: sequential region, parallel region)

Syntax:

    #pragma omp parallel [clause [clause]] <new_line>
        structured-block

where clause can be:
    private
    shared
    default
    firstprivate
    reduction
    if (scalar_logical_expression)
    copyin
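A parallel region can be sketched as follows. This is an illustration only: the function `mark_threads` and the constant `MAX_THREADS` are hypothetical names, and the team size is left to the environment (for example the `OMP_NUM_THREADS` variable). The `_OPENMP` guards let the same file compile with a compiler that ignores the pragmas, in which case the region runs on a single thread.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 64   /* hypothetical upper bound for this sketch */

/* Runs one parallel region in which every thread marks its own slot of
 * `seen`. Returns the size of the team that executed the region. */
int mark_threads(int seen[MAX_THREADS])
{
    assert(seen != NULL);
    int team = 1;
    for (int i = 0; i < MAX_THREADS; i++)
        seen[i] = 0;

    #pragma omp parallel shared(seen, team)
    {
        int id = 0;                    /* declared in the region: private */
#ifdef _OPENMP
        id = omp_get_thread_num();     /* the master thread has ID 0 */
#endif
        if (id < MAX_THREADS)
            seen[id] = 1;              /* each thread writes a different slot */
        if (id == 0) {
#ifdef _OPENMP
            team = omp_get_num_threads();
#endif
        }
    }                                  /* join: only the master continues */
    return team;
}
```

After the call, `seen[0]` is always set, because the thread that encounters the directive becomes the master with ID 0.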

2.6.3 Data environment directives

- private (list): declares the variables in list to be private to each thread. Each thread works on its own copy of the variables, and references to the original variables are replaced by references to the copies.
- firstprivate (list): like private, except that each copy is initialized with the value of the original variable.
- lastprivate (list): like private, except that on leaving the loop or the last section, the original variables in the list are assigned the value from the last iteration or section.
- shared (list): all threads access the same variables declared shared. A shared variable occupies one specific location in memory, and every thread can read and write through that address. It is up to the programmer to ensure that threads access shared variables correctly.
- default (shared | none): sets the default attribute for all variables used in the parallel region. Variables declared threadprivate are not affected by default. Variables can be declared explicitly as private, shared, and so on without a default clause.
- reduction (operator : list): gives each thread a private copy of every (shared) variable in list. Each thread computes with and writes to its private copy. At the end of the reduction, the shared variable in list is obtained by combining the private copies of all threads with operator. The operator can be +, *, -, max, min, ...
- schedule (type [, chunk_size]): specifies how the iterations of a for loop are divided among the threads; it is usually used to balance the load between threads. The type can be static, dynamic, guided, or runtime.
    o static: if chunk_size is not given, it is set to CEILING(total number of iterations / number of threads). The chunks are assigned to the threads in turn (round-robin).
    o dynamic: if chunk_size is not given, it is set to 1. Chunks are handed out on a first-come, first-served basis: whichever thread is idle takes the next chunk.
    o guided: if chunk_size is not given, it is set to 1. The iterations are divided so that the sizes of successive chunks (in order of increasing index) decrease exponentially, and chunk_size is the size of the smallest chunk. The first chunk has size CEILING(number of iterations / number of threads); each following chunk has size CEILING(number of remaining iterations / number of threads). At run time, if there are more chunks than threads, a thread that finishes its chunk takes the next chunk that has not yet been executed.
    o runtime: the schedule is decided when the program runs. The schedule kind is static, or whatever is specified by the environment variable OMP_SCHEDULE (so it may be DYNAMIC, GUIDED, ...).
- threadprivate:

      #pragma omp threadprivate (list) <new_line>

  The threadprivate directive makes global-scope variables local to each thread and persistent in that thread across parallel regions. The directive must appear right after the declaration of the variables. Each thread then works with its own copy of the variables.
- copyin (list): before the parallel region is executed, assigns to the threadprivate variables of every thread the values of the original variables in the master thread. The list names the variables to be copied.
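A few of the clauses above can be sketched in C as follows. The functions `dot` and `shift` are hypothetical names chosen for this illustration, and the schedule kinds are arbitrary examples rather than recommendations.

```c
#include <assert.h>

/* reduction: each thread accumulates into a private copy of `sum`; the
 * copies are combined with + into the shared `sum` when the loop ends. */
double dot(const double *a, const double *b, int n)
{
    assert(n >= 0);
    double sum = 0.0;
    #pragma omp parallel for shared(a, b, n) reduction(+:sum) schedule(static)
    for (int i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* firstprivate: every thread's copy of `offset` starts with the original value.
 * lastprivate: after the loop, `last` holds the value written by the
 * sequentially last iteration (i == n - 1). */
int shift(int *v, int n, int offset)
{
    assert(n > 0);
    int last = 0;
    #pragma omp parallel for firstprivate(offset) lastprivate(last) schedule(dynamic, 4)
    for (int i = 0; i < n; i++) {
        v[i] += offset;
        last = v[i];
    }
    return last;
}
```

Because floating-point addition is not associative, the result of a reduction on doubles may differ in the last bits from the sequential sum; for values that are exactly representable, as in small integer data, the two agree.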

2.6.4 Work-sharing directives
2.6.4.1 Do/for

The Do/for directive specifies that the for loop it applies to must be executed in parallel.

Figure 12: Illustration of the Do/for directive

Syntax:

    #pragma omp for [clause [clause]] <new_line>
        C/C++ for loop

where clause can be:
    private (list)
    firstprivate (list)
    lastprivate (list)
    reduction (operator : list)
    schedule (type [, chunk_size])
    ordered
    nowait

The meanings of private, firstprivate, lastprivate, reduction, and schedule are described in section 2.6.3.
    o ordered: must be present when the ordered directive is used inside the for loop.
    o nowait: the threads do not synchronize at the end of the parallel loop. They continue with the statements after the loop without waiting for any other thread.

The parallel region declaration and the work-sharing directive can be combined in the following form:

    #pragma omp parallel for [clauses] <new_line>
        for loop
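The for directive and the nowait clause can be sketched as below. The function `scale_both` is a hypothetical example; using nowait is safe here only because the two loops touch different arrays.

```c
#include <assert.h>

/* Two independent loops inside one parallel region. `nowait` on the first
 * loop lets a thread that has finished its share start on the second loop
 * without waiting at the first loop's implicit barrier. */
void scale_both(double *a, double *b, int n, double k)
{
    assert(n >= 0);
    #pragma omp parallel shared(a, b, n, k)
    {
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            a[i] *= k;

        #pragma omp for
        for (int i = 0; i < n; i++)
            b[i] += k;
    }   /* implicit barrier at the end of the parallel region */
}
```

If the second loop read values of `a` written by the first loop, the nowait clause would have to be removed so that the barrier orders the two loops.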

2.6.4.2 Sections
The sections directive specifies how blocks of code are divided among the threads. A sections declaration can contain several section blocks that are independent of one another. Each section is executed once, by one thread. If a thread is fast enough and the implementation allows it, one thread may execute more than one section. If there are more threads than sections, some threads may be idle. If there are fewer threads than sections, the implementation determines how the sections are executed.

Figure 13: Illustration of the sections directive

Syntax:

    #pragma omp sections [clause [clause] ...] <new-line>
    {
        [#pragma omp section <new-line>]
            structured-block
        [#pragma omp section <new-line>
            structured-block
        ...]
    }

where clause can be:
    private
    lastprivate
    firstprivate
    reduction
    nowait

The meanings of these clauses are the same as described in the previous sections.

The sections declaration can be combined with the parallel declaration:

    #pragma omp parallel sections [clauses] <new-line>
    {
        [#pragma omp section <new-line>]
            structured-block
        [#pragma omp section <new-line>
            structured-block
        ...]
    }
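As a small illustration of sections, the sketch below runs two independent scans of the same array, one per section. The function name `min_max` is hypothetical; the sections do not conflict because each writes a different variable.

```c
#include <assert.h>

/* Finds the minimum and maximum of v[0..n-1] with one section for each. */
void min_max(const int *v, int n, int *min_out, int *max_out)
{
    assert(n > 0);
    int lo = v[0], hi = v[0];
    #pragma omp parallel sections shared(v, n, lo, hi)
    {
        #pragma omp section
        {
            for (int i = 1; i < n; i++)     /* this section writes only lo */
                if (v[i] < lo) lo = v[i];
        }
        #pragma omp section
        {
            for (int i = 1; i < n; i++)     /* this section writes only hi */
                if (v[i] > hi) hi = v[i];
        }
    }
    *min_out = lo;
    *max_out = hi;
}
```

With a single thread, or without OpenMP support, the two sections simply run one after the other.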

2.6.4.3 Single
The single directive specifies that the enclosed code is executed by exactly one thread. Unless the nowait clause is given, the other threads do not execute the single block and wait at the end of the block.

Figure 14: Illustration of the single directive

Syntax:

    #pragma omp single [clauses] <new-line>
        structured-block

where clause can be:
    private
    firstprivate
    nowait
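The single directive can be sketched as follows: one thread computes a shared value, and the implicit barrier at the end of the single block ensures every thread sees it before the work-sharing loop starts. The function `normalize` is a hypothetical example and assumes the elements do not sum to zero.

```c
#include <assert.h>

/* Divides every element of v by the sum of all elements. */
void normalize(double *v, int n)
{
    assert(n > 0);
    double total = 0.0;
    #pragma omp parallel shared(v, n, total)
    {
        #pragma omp single
        {
            for (int i = 0; i < n; i++)   /* executed by exactly one thread */
                total += v[i];
        }   /* implicit barrier: the other threads wait here */

        #pragma omp for
        for (int i = 0; i < n; i++)
            v[i] /= total;
    }
}
```

Adding nowait to the single block would be wrong in this sketch, because threads could then start dividing before `total` is complete.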

2.6.5 Synchronization directives
Consider the following simple example: 2 threads on 2 different processors both increment the variable x at the same time (assume x = 0).

    THREAD 1:                        THREAD 2:
    increment(x)                     increment(x)
    {                                {
        x = x + 1;                       x = x + 1;
    }                                }

    THREAD 1:                        THREAD 2:
    10 LOAD A, (x address)           10 LOAD A, (x address)
    20 ADD A, 1                      20 ADD A, 1
    30 STORE A, (x address)          30 STORE A, (x address)

The following interleaving can occur:
    thread1 loads the value of x into register A
    thread2 loads the value of x into register A
    thread1 adds 1 to the value in register A
    thread2 adds 1 to the value in register A
    thread1 stores the value of register A at the address of x
    thread2 stores the value of register A at the address of x
Result: x = 1, not 2 as expected. To avoid this, the increment of x must be synchronized between the threads so that the result is correct. The following directives deal with synchronization.

2.6.5.1 Master
The master directive indicates that the enclosed code is executed by the master thread only. The other threads skip this code and continue normally.
Syntax:
    #pragma omp master <new-line>
        structured-block

2.6.5.2 Critical
The critical directive specifies that the enclosed code is executed by only one thread at a time. The other threads must wait until no thread is executing that code.
Syntax:
    #pragma omp critical [(name)] <new-line>
        structured-block

2.6.5.3 Barrier
When a thread encounters a barrier directive, it must wait until all the other threads have also reached this directive.
Syntax:
    #pragma omp barrier <new-line>

2.6.5.4 Atomic
The atomic directive specifies that a particular memory location is updated atomically, so that several threads cannot write to it at the same time. The directive applies only to a single statement.
Syntax:
    #pragma omp atomic <new-line>
        statement_expression
The statement can take one of the forms:
    ++x
    x++
    --x
    x--
or x binop= expr, where binop is one of the operators +, -, *, /, &, ^, |, >> or <<.

2.6.5.5 Flush
The flush directive writes the variables visible to a thread back to memory. By using flush, the programmer can control synchronization directly on the shared memory. The optional list specifies which variables are to be flushed; if it is omitted, all variables are written back to memory.
Syntax:
    #pragma omp flush [(list)] <new-line>

2.6.5.6 Ordered
The ordered directive specifies that the enclosed part of a loop is executed in the same order as it would be on a sequential processor. Ordered may appear only inside a Do/for loop directive. At any time, only one thread executes the code inside the ordered block.
Syntax:
    #pragma omp ordered <new-line>
        structured-block

2.6.6 Library routines and environment variables
2.6.6.1 Some OpenMP library routines

To use the routines provided by OpenMP, the header file omp.h must be included:
    #include <omp.h>
Some basic routines in the OpenMP library:
    void omp_set_num_threads(int num_threads)
    int omp_get_num_threads(void)
    int omp_get_max_threads(void)
    int omp_get_thread_num(void)
    int omp_get_num_procs(void)
    int omp_in_parallel(void)
    void omp_set_dynamic(int dynamic_threads)
    int omp_get_dynamic(void)
    void omp_set_nested(int nested)
    int omp_get_nested(void)

2.6.6.2 Some OpenMP environment variables

Environment variables are used to control the execution of parallel code. How their values are set depends on the compiler and on the shared-memory system in use. A value can be assigned with:
    export VARIABLE=value        (sh, bash)
    setenv VARIABLE value        (csh, tcsh)
For example:
    export OMP_NUM_THREADS=5
    setenv OMP_NUM_THREADS 5
Commonly used environment variables:
    OMP_SCHEDULE
    OMP_NUM_THREADS
    OMP_DYNAMIC
    OMP_NESTED

2.7 Example of parallel programming with OpenMP

2.7.1 omp_hello.c

    /* omp_hello.c */
    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        int nthreads, tid;

        /* Fork a team of threads giving them their own copies of variables */
        #pragma omp parallel private(nthreads, tid)
        {
            /* Obtain thread number */
            tid = omp_get_thread_num();
            printf("Hello World from thread = %d\n", tid);

            /* Only master thread does this */
            if (tid == 0) {
                nthreads = omp_get_num_threads();
                printf("Number of threads = %d\n", nthreads);
            }
        }   /* All threads join master thread and disband */

        return 0;
    }

The program illustrates the behaviour of threads running in parallel. The variable tid is declared private and holds the ID of each thread. The private variable nthreads holds the number of threads taking part in the parallel region.

2.7.2 Compilation

Depending on the vendor and the compiler in use, an OpenMP program can be compiled in several ways.
    On IBM systems, use the flag -qsmp=omp
    On Intel systems, use the flag -openmp
    On Compaq systems, use the flag -omp
On an IBM AIX machine, compile omp_hello.c with the command:
    xlc_r -qsmp=omp omp_hello.c -o hello
To run the program, type:
    ./hello
If the number of threads to be used for the parallel region is not specified, the program uses the default number of threads of the shared-memory system. The number of threads can be set with the library routine omp_set_num_threads(int num_threads) or by assigning a value to the environment variable OMP_NUM_THREADS with the export command. For instance, to use 4 threads in the example above:
    export OMP_NUM_THREADS=4
See http://www.navo.hpc.mil/Resources/Hardware/Romulus_Users_Guide.html#ProgEnv for more details on compiler options.

2.7.3 Result

    Hello World from thread = 0
    Number of threads = 4
    Hello World from thread = 3
    Hello World from thread = 1
    Hello World from thread = 2

Chapter 3: PARALLELIZING THE BARNES-HUT ALGORITHM


3.1 Treecode
To parallelize the Barnes-Hut algorithm for the N-body problem, we take J. Barnes's treecode program as an example. Treecode is one of the programs that simulate the N-body problem. Built on the Barnes-Hut algorithm, treecode runs faster and controls errors better than earlier N-body simulation programs. As described earlier, in the Barnes-Hut algorithm the Quadtree (or Octree) is built first, and the tree is then traversed to compute the force acting on each particle. This traversal, however, is very time-consuming. Treecode reduces the traversal cost by exploiting the fact that neighbouring particles have similar interaction lists. This idea was used earlier to speed up force calculation on vector machines [3], but those programs applied the tree search only to small blocks of particles. Treecode applies it at all levels of the tree. The forces on the particles are computed in a single recursive traversal of the tree, which stores and continuously updates the interaction list. At each level of the tree, the fact that particle b lies inside a cell c is used to narrow down the set of interactions that b may have. When the recursive traversal reaches particle b, the interaction list is used to compute the gravitational force and potential at b.

3.1.1 Tree data structure

The main data structure used in treecode is the Octree, which consists of bodies and cells. The leaves of the tree store the bodies. The internal nodes of the tree are cells. A traversal of the whole tree starts at the root. The structure of the Octree can be depicted simply as follows:


Figure 15: Tree data structure in treecode (1)

The node structure holds the information common to bodies and cells. In principle, each element of the tree could be represented as a union of a body and a cell. That representation is inefficient, however, because bodies and cells require different amounts of memory. The node structure is therefore used as the common part of both bodies and cells. Casts are used to convert a pointer of arbitrary type into a pointer to a node, a body or a cell.

    typedef struct _node {
        short type;             /* node type: BODY or CELL          */
        bool update;            /* are force updates needed?        */
        real mass;              /* total mass of the node           */
        vector pos;             /* position of the node             */
        struct _node *next;     /* next node in the threaded tree   */
    } node, *nodeptr;

    #define Type(x)   (((nodeptr) (x))->type)
    #define Update(x) (((nodeptr) (x))->update)
    #define Mass(x)   (((nodeptr) (x))->mass)
    #define Pos(x)    (((nodeptr) (x))->pos)
    #define Next(x)   (((nodeptr) (x))->next)

where:
    Type(q) returns the type of node q, either CELL or BODY
    Update(q) is a boolean indicating whether the forces on q need to be updated
    Next(q) is a pointer to the node that follows q once all descendants of q have been visited
    Mass(q) is the mass of particle q, or the total mass of all particles in cell q
    Pos(q) is the position of particle q, or the position of the centre of mass of cell q
The body structure represents the particles.
    typedef struct {
        node bodynode;          /* data common to all nodes */
        vector vel;             /* velocity of the body     */
        vector acc;             /* acceleration of the body */
        real phi;               /* potential at the body    */
    } body, *bodyptr;

    #define Vel(x) (((bodyptr) (x))->vel)
    #define Acc(x) (((bodyptr) (x))->acc)
    #define Phi(x) (((bodyptr) (x))->phi)

where:
    Vel(b) is the velocity of particle b
    Acc(b) is the acceleration of particle b
    Phi(b) is the potential of particle b
The cell structure represents the internal nodes of the tree.

    #define NSUB (1 << NDIM)    /* number of subcells per cell */

    typedef struct {
        node cellnode;          /* data common to all nodes */
    #if !defined(QUICKSCAN)
        real rcrit2;            /* critical radius squared  */
    #endif
        nodeptr more;           /* first descendant of the cell */
        union {
            nodeptr subp[NSUB]; /* pointers to the subcells     */
            matrix quad;        /* quadrupole moment matrix     */
        } sorq;
    } cell, *cellptr;

    #if !defined(QUICKSCAN)
    #define Rcrit2(x) (((cellptr) (x))->rcrit2)
    #endif
    #define More(x) (((cellptr) (x))->more)
    #define Subp(x) (((cellptr) (x))->sorq.subp)
    #define Quad(x) (((cellptr) (x))->sorq.quad)

where:
    Subp(c) is the array of pointers to the children of c
    More(c) is a pointer to the first of the children of c
    Quad(c) is the matrix of quadrupole moments
    Rcrit2(c) is the square of the critical radius; a point lying outside this radius may treat cell c as a single cell interaction

Figure 16: Tree data structure in treecode (2)

3.1.2 Global variables

Treecode uses global variables. Global variables are shared by all source files of the program and are declared with the global keyword, which is defined as follows:
    #define global extern
Input parameters:
    theta: sets the accuracy of the force calculation. This parameter is not defined if the QUICKSCAN version is compiled
    options: a string of options that control the run
    usequad: a flag indicating whether quadrupole corrections are used
Variables used in tree construction:
    root: pointer to the root cell
    rsize: size of the root cell
    ncell: number of cells used to build the tree
    tdepth: height of the tree
    cputree: CPU time required to build the tree
Variables used in force calculation:
    actmax: maximum length of the active list during the force calculation
    nbbcalc: number of body-body interactions
    nbccalc: number of body-cell interactions
    cpuforce: CPU time required for the force calculation

3.2 Testing and evaluating the performance of treecode

3.2.1 Running the treecode program
The complete source code of treecode, written by J. Barnes, can be downloaded from http://www.ifa.hawaii.edu/~barnes/treecode/treeguide.html
Treecode is written in C (ANSI C). We assume that the program is compiled with the compiler of the LINUX operating system. Download the file treecode.tar.gz from the web page above. Copy the file into a separate directory and unpack it:
    $ gunzip treecode.tar.gz
    $ tar xvf treecode.tar
The directory then contains the .c and .h files and the Makefile. Before compiling, it may be necessary to edit the Makefile to match the computer architecture and the compiler in use. Here we add the -pg option to the compiler flags CCFLAGS and to LDFLAGS, so that running the program produces the file gmon.out. The modified compiler options in the Makefile are:

    # Compiler options.
    # LINUX:
    CCFLAGS = -pg -DLINUX
    LDFLAGS = -pg
    OPTFLAG = -O3

Run make to compile and link the source files:
    $ make treecode
Run the program with ./treecode [parameters]. The parameters can be listed with the command ./treecode -help:

    treecode                  Hierarchical N-body code (theta scan)
      in=                     Input file with initial conditions
      out=                    Output file of N-body frames
      dtime=1/32              Leapfrog integration timestep
      eps=0.025               Density smoothing length
      theta=1.0               Force accuracy parameter
      usequad=false           if true, use quad moments
      options=                Various control options
      tstop=2.0               Time to stop integration
      dtout=1/4               Data output timestep
      nbody=4096              Number of bodies for test run
      seed=123                Random number seed for test run
      save=                   Write state file as code runs
      restore=                Continue run from state file
      VERSION=1.4             Joshua Barnes  February 21 2001

When the program runs, the results of the computation are printed to the screen in the following form:

    Hierarchical N-body code (theta scan)

       nbody     dtime       eps     theta   usequad     dtout     tstop
        4096   0.03125    0.0250      1.00     false   0.25000    2.0000

       rsize  tdepth   ftree  actmax    nbbtot    nbctot   CPUfc  CPUtot
        64.0      13   3.050    1114   1051990   1417390   0.003   0.004

        time   |T+U|       T      -U    -T/U  |Vcom|  |Jtot|
       0.000 0.24032 0.25082 0.49114 0.51069 0.00000 0.00576

       rsize  tdepth   ftree  actmax    nbbtot    nbctot   CPUfc  CPUtot
        64.0      12   3.108    1121   1054863   1419799   0.003   0.007

        time   |T+U|       T      -U    -T/U  |Vcom|  |Jtot|
       0.031 0.24028 0.25075 0.49104 0.51066 0.00001 0.00576

       rsize  tdepth   ftree  actmax    nbbtot    nbctot   CPUfc  CPUtot
        64.0      13   3.057    1108   1040044   1422316   0.003   0.011

        time   |T+U|       T      -U    -T/U  |Vcom|  |Jtot|
       0.062 0.24029 0.25069 0.49098 0.51059 0.00001 0.00576

       rsize  tdepth   ftree  actmax    nbbtot    nbctot   CPUfc
       ...

3.2.2 Performance evaluation

Although treecode improves on the Barnes-Hut algorithm, with faster force calculation and better error control than earlier programs, the tree traversal still accounts for most of the run time. The time spent in each function of the program can be examined through its profile. Run gprof on treecode to obtain the profile:
    gprof treecode gmon.out > treecode.out
The result written to treecode.out has the following form:

    Flat profile:
    Each sample counts as 0.01 seconds.
      %   cumulative    self                self    total
     time   seconds   seconds     calls   s/call   s/call  name
    96.02    129.09    129.09        65     1.99     1.99  walktree
     0.96    130.37      1.29   2466730     0.00     0.00  subindex
     0.95    131.65      1.28        65     0.02     0.02  diagnostics
     0.76    132.68      1.03    266240     0.00     0.00  loadbody
     0.57    133.44      0.76        65     0.01     0.01  hackcofm
     0.21    133.72      0.28        65     0.00     0.00  threadtree
     0.19    133.97      0.25        65     0.00     0.00  newtree
     0.13    134.15      0.18        64     0.00     2.05  stepsystem
     0.10    134.28      0.13        65     0.00     0.00  expandbox
     0.05    134.35      0.07    130637     0.00     0.00  makecell
     0.03    134.39      0.04     69477     0.00     0.00  xrandom
     0.01    134.41      0.02    130637     0.00     0.00  setrcrit
     0.01    134.43      0.02      8192     0.00     0.00  fpickshell
     0.01    134.45      0.02        65     0.00     2.05  treeforce
     0.01    134.46      0.01         1     0.01     0.07  testdata

This result was obtained by compiling and running the program on an Intel machine with one CPU (Pentium 4, 2.26 GHz), 240 MB RAM, running LINUX. The profile shows that the walktree function accounts for 96.02% of the total run time of the program, while the share of every other function is very small. Therefore, although treecode is faster and controls errors better than earlier N-body simulation programs, we try to optimize it further by parallelizing the program with OpenMP on a 4-CPU Intel machine in order to improve its performance.

3.3 Parallelizing treecode with OpenMP

3.3.1 Parallel execution environment
Treecode is parallelized on an Intel (R) Xeon (TM) machine with 4 CPUs at 2.40 GHz. The detailed configuration of one CPU of the Intel machine is given below:
    processor             : 0
    vendor_id             : GenuineIntel
    cpu family            : 15
    model                 : 2
    model name            : Intel(R) Xeon(TM) CPU 2.40GHz
    stepping              : 7
    cpu MHz               : 2394.914
    cache size            : 512 KB
    Physical processor ID : 0
    Number of siblings    : 2
    fdiv_bug              : no
    hlt_bug               : no
    f00f_bug              : no
    coma_bug              : no
    fpu                   : yes
    fpu_exception         : yes
    cpuid level           : 2
    wp                    : yes
    flags                 : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
    bogomips              : 4750.85

3.3.2 Parallel implementation

According to the performance evaluation of treecode in section 3.2.2, most of the run time is spent in the walktree function. To improve the performance of treecode, we parallelize this function. One difficulty in parallelizing treecode is its dependence on the compiler: compilers for different computer architectures produce different execution times for the individual functions of treecode. Consider, for example, the compiler of the IBM eServer Cluster 1600 running AIX 5.2, with the following configuration:
o 5 pSeries 655 compute nodes, each with 8 IBM Power4+ 64-bit RISC CPUs at 1.7 GHz; cache 5.6 MB ECC L2, 128 MB ECC L3, bandwidth 72.3 GBps; 32 GB RAM, memory bandwidth 51.2 GBps; 6x36 GB HDD. The total computing capacity is about 240 GFlops (expandable to at most 768 GFlops/16 nodes).
o 1 CSM management node p630: Power4+ 64-bit 1.2 GHz; cache 1.5 MB ECC L2, 8 MB ECC L3, bandwidth 12.8 GBps; 1 GB RAM, bandwidth 6.4 GBps; 6x36 GB HDD, DVD ROM.
o 1 hardware control node HCM: Intel Xeon 3.06 GHz, 1 GB RAM, 40 GB HDD, DVD RAM.
o The nodes are interconnected through HPS (High Performance Switch) with 2 GBps bandwidth, and through GEthernet.
o Shared storage: IBM DS4400 and EXP700, connected to the IBM 1600 cluster by optical fibre with 2 Gbps bandwidth.
o The nodes run the AIX 5L operating system, version 5.2.
The profile of treecode on IBM AIX is:
      %   cumulative    self                  self    total
     time   seconds   seconds       calls  ms/call  ms/call  name
     42.3     13.85     13.85                                .sqrt [8]
     34.8     25.23     11.38      532480     0.02     0.02  .sumnode [9]
      7.6     27.73      2.50                                .__mcount [11]
      5.4     29.50      1.77   161645903     0.00     0.00  .sqrtf [12]
      4.0     30.81      1.31    16072949     0.00     0.00  .accept [13]
      2.6     31.65      0.84      570149     0.00     0.03  .walktree_13_6 <cycle 1> [7]
      0.8     31.91      0.26                                .qincrement [15]
      0.5     32.09      0.18          65     2.77     2.77  .diagnostics [18]
      0.4     32.22      0.13     2480994     0.00     0.00  .subindex [19]
      0.4     32.34      0.12                                .__stack_pointer [20]
      0.4     32.46      0.12                                .qincrement1 [21]
      0.3     32.55      0.09      266240     0.00     0.00  .loadbody [16]
      0.2     32.61      0.06          65     0.92     1.25  .hackcofm [22]
      0.1     32.64      0.03          65     0.46     0.46  .threadtree [25]

With the compiler on the Intel (R) Xeon (TM) machine with 4 CPUs at 2.40 GHz, the share of run time taken by the functions of treecode is:

    Flat profile:
    Each sample counts as 0.00195312 seconds.
      %   cumulative    self                 self    total
     time   seconds   seconds      calls  ms/call  ms/call  name
    73.87      9.13      9.13     532480     0.02     0.02  sumnode
    12.84     10.71      1.59   16013212     0.00     0.00  accept
    10.21     11.97      1.26     569962     0.00     0.02  walktree
     0.84     12.08      0.10    2466729     0.00     0.00  subindex
     0.47     12.14      0.06     266240     0.00     0.00  loadbody
     0.46     12.19      0.06     303722     0.00     0.00  walksub
     0.27     12.23      0.03         65     0.51     0.51  diagnostics
     0.25     12.26      0.03        520     0.06     0.07  hackcofm
     0.22     12.28      0.03        520     0.05     0.05  threadtree
     0.19     12.31      0.02     266240     0.00     0.03  gravsum
     0.14     12.33      0.02         64     0.27   189.50  stepsystem

Thus, with different compilers the execution times of the functions of treecode differ completely. Deciding which function consumes the most time, and how it should be parallelized, is therefore a difficult problem.

3.3.2.1 Analysis of the walktree function

The walktree function is the main recursive function used in the force calculation. Its prototype is:

    void walktree(nodeptr *aptr, nodeptr *nptr, cellptr cptr, cellptr bptr,
                  nodeptr p, real psize, vector pmid);

The walktree function computes the gravitational force on all particles within node p by recursively scanning p and its descendants. At each point of the recursion, the information gathered on the levels between the root and p is stored in a set of nodes. This set is the interaction set. It is split into two separate lists of cells and bodies, addressed by the pointers cptr and bptr respectively. The rest of the tree is represented by a set of active nodes, which includes node p and the nodes surrounding it in space. Pointers to these nodes are stored in the array between aptr and nptr. Node p has size psize and position pmid.
In its main loop, walktree scans all active nodes of p and decides which nodes are appended to the interaction list and which are so close to p that their children must be examined at the next level of the recursion. Cells are tested with the accept function. If a cell is far enough from p, that is, if the ratio D/r is small, the cell is appended to the interaction list of p. Otherwise, all of its children are examined and added to the list of active nodes. If a new active list has been created, the recursive scan continues at the next level through a call to walksub, which calls walktree on the children of p. If, on the other hand, no new active list was created, p is examined: if p is a body, the force on p is computed by calling gravsum.
The prototype of walksub is:

    void walksub(nodeptr *nptr, nodeptr *np, cellptr cptr, cellptr bptr,
                 nodeptr p, real psize, vector pmid);

The parameters of walksub have the same values as the corresponding parameters of walktree at the point of the call. There are two cases:
    If p is a cell, walksub loops over all children of p and calls walktree on each of them.
    If p is a body, walksub calls walktree exactly once, to finish scanning its active list.

The details of walktree and walksub are given below.

    local void walktree(nodeptr *aptr, nodeptr *nptr, cellptr cptr, cellptr bptr,
                        nodeptr p, real psize, vector pmid)
    {
        nodeptr *np, *ap, q;
        int actsafe;

        if (Update(p)) {                            /* are new forces needed?   */
            np = nptr;                              /* start new active list    */
            actsafe = actlen - NSUB;                /* leave room for NSUB more */
            for (ap = aptr; ap < nptr; ap++)        /* loop over active nodes   */
                if (Type(*ap) == CELL) {            /* is this node a cell?     */
                    if (accept(*ap, psize, pmid)) { /* does it pass the test?   */
                        Mass(cptr) = Mass(*ap);     /* copy to interaction list */
                        SETV(Pos(cptr), Pos(*ap));
                        SETM(Quad(cptr), Quad(*ap));
                        cptr++;                     /* and bump cell array ptr  */
                    } else {                        /* else it fails the test   */
                        if (np - active >= actsafe) /* check list has room      */
                            error("walktree: active list overflow\n");
                        for (q = More(*ap); q != Next(*ap); q = Next(q))
                                                    /* loop over all subcells   */
                            *np++ = q;              /* put on new active list   */
                    }
                } else                              /* else this node is a body */
                    if (*ap != p) {                 /* if not self-interaction  */
                        --bptr;                     /* bump body array ptr      */
                        Mass(bptr) = Mass(*ap);     /* and copy data to array   */
                        SETV(Pos(bptr), Pos(*ap));
                    }
            actmax = MAX(actmax, np - active);      /* keep track of max active */
            if (np != nptr)                         /* if new actives listed    */
                walksub(nptr, np, cptr, bptr, p, psize, pmid);
                                                    /* then visit next level    */
            else {                                  /* else no actives left, so */
                if (Type(p) != BODY)                /* must have found a body   */
                    error("walktree: recursion terminated with cell\n");
                gravsum((bodyptr) p, cptr, bptr);   /* sum force on the body    */
            }
        }
    }

    local void walksub(nodeptr *nptr, nodeptr *np, cellptr cptr, cellptr bptr,
                       nodeptr p, real psize, vector pmid)
    {
        real poff;
        nodeptr q;
        int k;
        vector nmid;

        poff = psize / 4;                           /* precompute mid. offset   */
        if (Type(p) == CELL) {                      /* fanout over descendents  */
            for (q = More(p); q != Next(p); q = Next(q)) {
                                                    /* loop over all subcells   */
                for (k = 0; k < NDIM; k++)          /* locate each's midpoint   */
                    nmid[k] = pmid[k] + (Pos(q)[k] < pmid[k] ? - poff : poff);
                walktree(nptr, np, cptr, bptr, q, psize / 2, nmid);
                                                    /* recurse on subcell       */
            }
        } else {                                    /* extend virtual tree      */
            for (k = 0; k < NDIM; k++)              /* locate next midpoint     */
                nmid[k] = pmid[k] + (Pos(p)[k] < pmid[k] ? - poff : poff);
            walktree(nptr, np, cptr, bptr, p, psize / 2, nmid);
                                                    /* and search next level    */
        }
    }

3.3.2.2 Parallelizing treecode

The idea for parallelizing treecode is as follows: use the taskq directive, which is supported by the Intel compiler, to parallelize the recursive function walktree. For the other functions, use the Do/for directive to parallelize loops.
The taskq directive sets up an environment for executing units of work (tasks). When a taskq directive is encountered, one of the threads is chosen to execute it. The chosen thread creates an empty queue. The code inside the taskq block is then executed by that single thread. The remaining threads wait for work to be added to the queue.
The task directive specifies a unit of work that may be executed by any of the threads. When a task directive is encountered inside a taskq construct, the code inside the task block is conceptually placed on the queue. The queue is finished when all the work on it has been completed.
Using this queue, the calls to walktree can be modified as follows. In the function gravcalc(), the taskq and task directives are added around the call to walktree.
    void gravcalc(void)
    {
        .
        .
        active[0] = (nodeptr) root;                 /* initialize active list   */
        CLRV(rmid);                                 /* set center of root cell  */

        /* Add parallel region */
        #pragma omp parallel
        {
            #pragma intel omp taskq
            {
                #pragma intel omp task
                {
                    walktree(active, active + 1, interact, interact + actlen,
                             (nodeptr) root, rsize, rmid);
                                                    /* scan tree, update forces */
                }
            }
        }   /* end of parallel region */
        cpuforce = cputime() - cpustart;            /* store CPU time w/o alloc */
        free(active);
        free(interact);
    }

The function walksub calls walktree recursively, so the OpenMP directives are added to walksub as follows:

    local void walksub(nodeptr *nptr, nodeptr *np, cellptr cptr, cellptr bptr,
                       nodeptr p, real psize, vector pmid)
    {
        real poff;
        nodeptr q;
        int k;
        vector nmid;

        poff = psize / 4;                           /* precompute mid. offset   */
        if (Type(p) == CELL) {                      /* fanout over descendents  */
            /* add parallel region */
            #pragma intel omp parallel taskq shared(q)
            {
                for (q = More(p); q != Next(p); q = Next(q)) {
                    #pragma intel omp task captureprivate(q)
                    {
                        for (k = 0; k < NDIM; k++)
                            nmid[k] = pmid[k] + (Pos(q)[k] < pmid[k] ? - poff : poff);
                        walktree(nptr, np, cptr, bptr, q, psize / 2, nmid);
                    }
                }
            }   /* end of parallel region */
        } else {
            for (k = 0; k < NDIM; k++)
                nmid[k] = pmid[k] + (Pos(p)[k] < pmid[k] ? - poff : poff);
            walktree(nptr, np, cptr, bptr, p, psize / 2, nmid);
        }
    }

Once the program has been compiled, the code marked by the OpenMP directives is executed in parallel. The experimental results are given in the next section.

3.4 Experimental results

With the compiler and configuration of the Intel (R) Xeon (TM) machine with 4 CPUs at 2.40 GHz, the treecode program parallelized with the OpenMP directives described above gives the following execution times per function:

    Flat profile:
    Each sample counts as 0.00195312 seconds.
      %   cumulative    self                 self    total
     time   seconds   seconds      calls  ms/call  ms/call  name
    74.84     10.43     10.43     532480     0.02     0.02  sumnode
    11.18     11.99      1.56     569962     0.00     0.02  walktree
    10.83     13.50      1.51   16013212     0.00     0.00  accept
     0.68     13.59      0.09    2466729     0.00     0.00  subindex
     0.48     13.66      0.07     266240     0.00     0.00  loadbody
     0.29     13.70      0.04                               _walksub_202__task4
     0.27     13.73      0.04        520     0.07     0.09  hackcofm
     0.24     13.77      0.03     266240     0.00     0.04  gravsum
     0.22     13.80      0.03     130637     0.00     0.00  _walksub_199__taskq3
     0.20     13.82      0.03         65     0.42     0.42  diagnostics

Because the outcome depends on the compiler and on the configuration of each machine, the experimental results obtained on the Intel multiprocessor machine are only relative.


CONCLUSION
Results achieved
After a period of study, research and evaluation, I find that the Barnes-Hut algorithm and its improvements make an important contribution to solving the N-body problem, with a complexity of only O(N log N). I also find that OpenMP is a simple and easy-to-use parallel application programming interface. It gives the user a flexible and highly portable interface for building and developing parallel applications on shared-memory computer architectures. Installing and testing an improved version of the Barnes-Hut algorithm on Intel and IBM multiprocessor machines gave me practical experience and some experimental results. Through this work I analysed and identified several difficulties in parallelization. Although the experimental results are not yet strong, I have gained experience and deepened my understanding of high-performance computing on multiprocessor machines, as well as of using the Linux operating system.

Future work

Because of limited research time and hardware, I have so far studied and tested the parallelization only on the Intel multiprocessor machine. My next direction is to continue studying, testing and optimizing treecode by parallelizing it on the IBM machine, and to parallelize the algorithm on multiprocessor machines that combine shared memory and distributed memory.


REFERENCES

[1] Josh Barnes & Piet Hut, A hierarchical O(N log N) force calculation algorithm, Nature, v. 324, December 1986.
[2] Fast Hierarchical Methods for the N-body Problem, Part 1, http://www.cs.berkeley.edu/~demmel/cs267/lecture26/lecture26.html
[3] J. E. Barnes, A modified tree code: Don't laugh; It runs, Journal of Computational Physics 87 (1990) 161-170.
[4] A. Kawai, J. Makino, High-accuracy treecode based on pseudoparticle multipole method, Proceedings of the 208th Symposium of the International Astronomical Union (Tokyo, Japan, July 10-13, 2001) 305-314.
[5] A. Kawai, J. Makino, Pseudo-particle multipole method: A simple method to implement a high-accuracy treecode, The Astrophysical Journal, 550 (2001) L143-L146.
[6] A. Kawai, J. Makino, T. Ebisuzaki, Performance analysis of high-accuracy tree code based on the pseudoparticle multipole method, The Astrophysical Journal Supplement 151 (2004) 13-33.
[7] Joshua E. Barnes, Institute for Astronomy, University of Hawaii, Treecode Guide
[8] Giovanni Erbacci, Shared Memory Paradigm, High Performance Systems Department, CINECA
[9] Introduction to OpenMP, Technical User Support, Supercomputing Institute, University of Minnesota
[10] OpenMP exercise, http://www.llnl.gov/computing/tutorials/openMP/exercise.html
[11] Michael Süß, Claudia Leopold, A User's Experience with Parallel Sorting and OpenMP, Talk at the EWOMP'04 conference, Stockholm
[12] Dieter an Mey, Two OpenMP Programming Patterns, Center for Computing and Communication, Aachen University
[13] Paul Graham, Edinburgh Parallel Computing Centre, The University of Edinburgh, OpenMP: A Parallel Programming Model for Shared Memory Architectures, March 1999, version 1.1
