
8.

Case Study: Game of Life Parallel Design

Lero 2013

Parallel Life Design


Start by deciding on a strategy
Decomposition
  Is the grid 1D or 2D? Row, column, or Cartesian? Which is appropriate?
Think about the communication
  Will a row- or column-based decomposition suffice?
  What is the grid datatype?
If we divide our old array of data into blocks (using a Cartesian topology), how will the data update on each timestep?
  Remember: need to update corners, left->right, top->bottom

Parallel Design
What are the data dependencies?
  Cells need to interact with 8 neighbours, so there are dependencies with cells to the north, south, east, west, northeast, northwest, southeast, and southwest
What decomposition strategy will suit such dependencies?
  Row to row (n/s)
  Column (e/w)
  Cartesian (n, s, e, w, ne, nw, se, sw)
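The 8-neighbour dependency can be sketched as below. This is a minimal illustration, not code from the slides: the function names and the flat row-major layout are assumptions, and the update rule is the standard Conway rule (a live cell survives with 2 or 3 live neighbours, a dead cell is born with exactly 3).

```c
#include <assert.h>

/* Count the live neighbours of cell (i, j) in a row-major grid with
 * ncols columns in total (ghost columns included).
 * Assumption: (i, j) is an inner cell, so all 8 neighbours exist. */
int count_live_neighbours(const int *grid, int ncols, int i, int j)
{
    assert(i >= 1 && j >= 1 && j < ncols - 1);
    int count = 0;
    for (int di = -1; di <= 1; di++) {
        for (int dj = -1; dj <= 1; dj++) {
            if (di == 0 && dj == 0)
                continue;                 /* skip the cell itself */
            count += grid[(i + di) * ncols + (j + dj)];
        }
    }
    return count;
}

/* Standard Conway rule: returns 1 if the cell is alive on the next timestep. */
int next_state(int alive, int neighbours)
{
    if (alive)
        return neighbours == 2 || neighbours == 3;
    return neighbours == 3;
}
```

Because every inner cell reads its diagonal neighbours too, the cells on the edge of a local block need data from up to 8 neighbouring ranks, which is what motivates the ghost cells discussed later.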


Row Decomposition

[Figure: the grid divided by rows among Rank 0, Rank 1, ..., Rank size-1]


Column Decomposition

[Figure: the grid divided by columns among Rank 0, Rank 1, ..., Rank size-1]


Cartesian Decomposition

Example: 9 processes, 3x3 decomposition

[Figure: the grid divided into blocks among Rank 0, Rank 1, ..., Rank size-1]


Cartesian Decomposition

ndims = ? number of dimensions
dims = (?,?) number of processes in each dimension
periods = (?,?) specifying whether the grid is periodic (1/true) or not (0/false) in each dimension
reorder = ? determines if the ranking may be reordered (true) or not (false)

[Figure: NI x NJ grid divided into blocks among Rank 0, Rank 1, ..., Rank size-1]
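To get a feel for how ranks map onto a 2D process grid, the sketch below reproduces the mapping in plain C. In a real program these values would come from MPI_Cart_create, MPI_Cart_coords and MPI_Cart_rank. The helper names here are hypothetical, and the code assumes ndims = 2, periods = (1,1), and the row-major rank numbering that MPI uses for Cartesian topologies.

```c
#include <assert.h>

/* Row-major rank -> (row, col) coordinates on a dims[0] x dims[1] process grid.
 * Hypothetical helper mirroring what MPI_Cart_coords reports. */
void rank_to_coords(int rank, const int dims[2], int coords[2])
{
    assert(rank >= 0 && rank < dims[0] * dims[1]);
    coords[0] = rank / dims[1];
    coords[1] = rank % dims[1];
}

/* (row, col) -> rank, wrapping around in both dimensions.
 * Assumption: periods = (1,1), i.e. the process grid is cyclic. */
int coords_to_rank_periodic(int row, int col, const int dims[2])
{
    assert(dims[0] > 0 && dims[1] > 0);
    int r = ((row % dims[0]) + dims[0]) % dims[0];   /* wrap negative rows */
    int c = ((col % dims[1]) + dims[1]) % dims[1];   /* wrap negative cols */
    return r * dims[1] + c;
}
```

For the 3x3 example, rank 3 sits at coords (1,0); its west neighbour wraps around to rank 5 and its south-west neighbour to rank 8.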


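Cartesian Decomposition

[Figure: the global grid divided into local grids, one per rank (Rank 0 to Rank size-1). Each local grid has localInnerNumCols inner columns, and localOuterNumCols columns once the ghost columns are included.]

int localOuterNumCols = localInnerNumCols+2;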



Local variables
To determine the local number of columns allocated to each process, we can use the following logic:
  int localInnerNumCols = NJ/dims[1];
e.g. if NJ=100 and dims[1]=5 then localInnerNumCols=20
Remember: each process is actually allocated 2 extra ghost rows and 2 extra ghost columns for cyclic communication => localInnerNumCols+2
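The column bookkeeping can be collected in one place, as in the sketch below. The struct and function names are hypothetical; the fields mirror the slide's variable names. Note that the integer division only gives every process the same share when NJ is an exact multiple of dims[1], so the sketch asserts that assumption rather than handling a remainder.

```c
#include <assert.h>

/* Column extents of one local grid (hypothetical struct for illustration). */
typedef struct {
    int localInnerNumCols;   /* columns owned by this process */
    int localOuterNumCols;   /* owned columns plus 2 ghost columns */
    int localFirstCol;       /* index of the left ghost column */
    int localLastCol;        /* index of the right ghost column */
    int innerLastCol;        /* index of the last owned column */
} ColExtent;

ColExtent col_extent(int NJ, int procs_in_dim)
{
    /* Assumption: the global size divides evenly among the processes. */
    assert(procs_in_dim > 0 && NJ % procs_in_dim == 0);
    ColExtent e;
    e.localInnerNumCols = NJ / procs_in_dim;
    e.localOuterNumCols = e.localInnerNumCols + 2;
    e.localFirstCol = 0;
    e.localLastCol = e.localInnerNumCols + 1;
    e.innerLastCol = e.localLastCol - 1;
    return e;
}
```

The same logic applies to rows, with NI and dims[0] in place of NJ and dims[1].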


Setting up variables
Need to decide which variables you will need, e.g.:
  int localInnerNumCols = NJ/dims[1];
  int localOuterNumCols = localInnerNumCols+2;
  int localFirstCol = 0;
  int localLastCol = localInnerNumCols+1;
  int innerLastCol = localLastCol-1;

[Figure: local grid of Rank 2, showing localInnerNumCols (inner columns) and localOuterNumCols (inner columns plus ghost columns)]


Global vs Local
I am rank 3, but what chunk of data do I have from the global grid?
The global grid is NI x NJ in size, corresponding to the serial application
The local grid is the grid distributed to each rank: NI/dims[0] x NJ/dims[1]
It may be necessary to determine where on the global grid your local data lies.

[Figure: global grid with rows and columns indexed 0 to 7, showing the block owned by Rank 3, coords (1,0)]



Global vs Local
We can see that rank 3 has the section of data from row 3 to row 4 and from column 1 to column 2 in the global grid
The following logic determines the indices with respect to the global rows/cols according to the local rank's coords:
  myFirstRow_glob = (coords[0]*localInnerNumRows)+1;
  myFirstCol_glob = (coords[1]*localInnerNumCols)+1;
  myLastRow_glob = myFirstRow_glob+localInnerNumRows-1;
  myLastCol_glob = myFirstCol_glob+localInnerNumCols-1;
For rank 3 with coords (1,0), where localInnerNumRows = localInnerNumCols = 2 (9 processes in a 3x3 decomposition):
  myFirstRow_glob=3, myLastRow_glob=4,
  myFirstCol_glob=1, myLastCol_glob=2

[Figure: global grid with rows and columns indexed 0 to 7, showing the block owned by Rank 3, coords (1,0)]
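The index logic above can be wrapped in a small helper and checked against the rank 3 example. This is only a sketch: the struct and function names are hypothetical, and global indices start at 1 because index 0 is the ghost row/column.

```c
#include <assert.h>

/* Global (1-based, ghost-free) index range owned by one rank. */
typedef struct {
    int myFirstRow_glob, myLastRow_glob;
    int myFirstCol_glob, myLastCol_glob;
} GlobalRange;

/* coords are the rank's Cartesian coordinates (row, col) in the process grid. */
GlobalRange global_range(const int coords[2],
                         int localInnerNumRows, int localInnerNumCols)
{
    assert(coords[0] >= 0 && coords[1] >= 0);
    assert(localInnerNumRows > 0 && localInnerNumCols > 0);
    GlobalRange g;
    g.myFirstRow_glob = coords[0] * localInnerNumRows + 1;
    g.myFirstCol_glob = coords[1] * localInnerNumCols + 1;
    g.myLastRow_glob = g.myFirstRow_glob + localInnerNumRows - 1;
    g.myLastCol_glob = g.myFirstCol_glob + localInnerNumCols - 1;
    return g;
}
```

With coords (1,0) and 2 x 2 inner cells per rank this gives rows 3 to 4 and columns 1 to 2, matching the slide.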


Maintaining the same data distribution

What if, in my parallelized version, I want to populate the distributed data in the same order as the globally populated set in the serial code?

  // Loop through the global array (not including ghost cells)
  for (int glob_row = 1; glob_row <= NI; glob_row++) {
    for (int glob_col = 1; glob_col <= NJ; glob_col++) {
      // Generate the global random dataset as you would for the global array.
      // rand() is called for every global cell so the sequence matches the serial code.
      X = rand();
      // Using the variables myFirstRow_glob and myFirstCol_glob (see previous slide),
      // only populate the dataset that has been distributed to my rank
      if ((glob_row >= myFirstRow_glob && glob_row <= myLastRow_glob)
          && (glob_col >= myFirstCol_glob && glob_col <= myLastCol_glob)) {
        // If true, then it fits my local chunk
        i = glob_row - myFirstRow_glob + 1;
        j = glob_col - myFirstCol_glob + 1;
        // Now use this i and j to access the local array, i.e. array[i][j],
        // and populate the local set with the randomly generated data
        array[i][j] = X;
      }
    }
  }
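A self-contained version of this idea is sketched below, so the result can be compared cell by cell against a serially generated grid. The function name, the flat row-major local array, the explicit seed parameter, and the use of rand() % 2 for live/dead cells are all assumptions made for the illustration. Every rank must use the same seed and must draw one value per global cell, even for cells it does not own.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill the inner cells of a local grid (row-major, outer_cols columns per row,
 * ghost cells included) with the values the serial code would generate for
 * global rows first_row..last_row and columns first_col..last_col. */
void populate_local(int *local, int outer_cols,
                    int first_row, int last_row, int first_col, int last_col,
                    int NI, int NJ, unsigned seed)
{
    assert(first_row >= 1 && last_row <= NI && first_col >= 1 && last_col <= NJ);
    assert(outer_cols == last_col - first_col + 3);   /* inner cols + 2 ghosts */
    srand(seed);                        /* same seed on every rank (assumption) */
    for (int glob_row = 1; glob_row <= NI; glob_row++) {
        for (int glob_col = 1; glob_col <= NJ; glob_col++) {
            int x = rand() % 2;         /* drawn for every global cell */
            if (glob_row >= first_row && glob_row <= last_row &&
                glob_col >= first_col && glob_col <= last_col) {
                int i = glob_row - first_row + 1;   /* local row, 1-based */
                int j = glob_col - first_col + 1;   /* local col, 1-based */
                local[i * outer_cols + j] = x;
            }
        }
    }
}
```

The cost of this approach is that every rank walks the whole global index space, which is acceptable for initialisation but would not be for the timestep loop.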


South to North

  int localFirstRow = 0;
  int localLastRow = localInnerNumRows+1;

  MPI_Send(&local_array[1][1], localInnerNumCols, MPI_INT, north_rank, 12, comm_cart);
  MPI_Recv(&local_array[localLastRow][1], localInnerNumCols, MPI_INT, south_rank, 12, comm_cart, &status);

[Figure: local grid with localInnerNumRows = 4. The first inner row, localInnerNumCols cells starting at cell 1,1, is sent to north_rank; the matching row from south_rank is received into the ghost row localLastRow.]

West to East

  MPI_Send(&local_array[1][localInnerNumCols], 1, myCOL_TYPE, east_rank, 0, comm_cart);
  MPI_Recv(&local_array[1][0], 1, myCOL_TYPE, west_rank, 0, comm_cart, &status);

[Figure: local grid with localInnerNumCols = 4. The last inner column, starting at cell 1,4, is sent to east_rank as one element of the derived datatype myCOL_TYPE; the column from west_rank is received into ghost column 0, starting at cell 1,0.]
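The slides do not show how myCOL_TYPE is built. One common construction, assumed here, is MPI_Type_vector(localInnerNumRows, 1, localOuterNumCols, MPI_INT, &myCOL_TYPE) followed by MPI_Type_commit, which only works if local_array is stored contiguously in row-major order. The plain C sketch below (hypothetical helper names) shows the access pattern such a type describes: count elements, each one stride ints after the previous.

```c
#include <assert.h>

/* Gather `count` ints spaced `stride` apart, starting at src, into dst.
 * This is the read pattern of a vector type with blocklength 1. */
void pack_strided(const int *src, int count, int stride, int *dst)
{
    assert(count >= 0 && stride >= 1);
    for (int k = 0; k < count; k++)
        dst[k] = src[k * stride];
}

/* Scatter `count` contiguous ints from src into positions spaced `stride`
 * apart, starting at dst. This is the matching write pattern on receive. */
void unpack_strided(const int *src, int count, int stride, int *dst)
{
    assert(count >= 0 && stride >= 1);
    for (int k = 0; k < count; k++)
        dst[k * stride] = src[k];
}
```

Starting the pattern at &local_array[1][localInnerNumCols] picks out the last inner column, and starting it at &local_array[1][0] targets the left ghost column, which is why the send and receive above each pass a count of 1 with the derived type.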


NorthEast to SouthWest

  MPI_Send(&local_array[innerLastRow][1], 1, MPI_INT, south_west_rank, 13, comm_cart);
  MPI_Recv(&local_array[0][localLastCol], 1, MPI_INT, north_east_rank, 13, comm_cart, &status);

[Figure: the inner corner cell (innerLastRow, 1) is sent to South_West_rank; the corner cell from North_East_rank is received into the ghost corner (0, localLastCol).]
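A useful cross-check for the edge and corner exchanges is the serial case: with a single rank and periodic boundaries, every neighbour is the rank itself, so each send/receive pair reduces to a copy within one array. The sketch below is such a serial reference, not the parallel code. The function name and flat row-major layout are assumptions.

```c
#include <assert.h>

/* Fill the ghost border of an (nrows+2) x (ncols+2) row-major grid from its
 * own inner cells, as if the grid wrapped around in both dimensions. */
void fill_ghosts_periodic(int *g, int nrows, int ncols)
{
    assert(nrows >= 1 && ncols >= 1);
    int w = ncols + 2;                           /* outer number of columns */
    for (int j = 1; j <= ncols; j++) {
        g[(nrows + 1) * w + j] = g[1 * w + j];   /* bottom ghost row <- first inner row */
        g[0 * w + j] = g[nrows * w + j];         /* top ghost row <- last inner row */
    }
    /* Copy columns over all rows, ghost rows included, so the four ghost
     * corners are filled by the same pass. */
    for (int i = 0; i <= nrows + 1; i++) {
        g[i * w + 0] = g[i * w + ncols];         /* left ghost col <- last inner col */
        g[i * w + ncols + 1] = g[i * w + 1];     /* right ghost col <- first inner col */
    }
}
```

After the call, the ghost corner (0, localLastCol) holds the value of the inner cell (innerLastRow, 1), which is the same pairing as the NorthEast to SouthWest send and receive above.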

MPIGen Tool

http://mpigen.lero.ie

Check out the documentation


To use the tool

General steps:
  Copy the generated files into the same directory as your GOL file
  Use the AppFragments.c files for hints and suggestions
  Invoke the necessary functions from within your code
    See the documentation for hints on how to do this
  If using skeleton code, make sure to implement the functionality of the game of life
To compile and run:
  Must include the MPI_Wrappers.c file when compiling
  mpicc gol_par.c xxx_MPIWrappers.c -o gol_par
  mpirun -n 9 gol_par
