You are on page 1of 61

Parallel Programming

OpenMP

Pham Quang Dung

Hanoi, 2012

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 1 / 62
Outline

1 Introduction

2 OpenMP API overview

3 Compiling OpenMP programs

4 OpenMP directives

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 2 / 62
Introduction

OpenMP is an API that may be used to explicitly direct


multi-threaded shared memory parallel applications
OpenMP has three major components :
Compiler directives
Runtime library routines
Environment variables
website : openmp.org

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 3 / 62
Introduction
Fork-Join Model
An OpenMP begins as a single process (master thread). The master
thread runs in a sequential mode until the first parallel region
construct is encountered
FORK : the master thread creates a group of parallel threads
JOIN : When the group of threads finishes the statement in the
paralle region construct, they synchronize and terminate. Only the
master thread continues.

ng.com https://fb.com/tailieudientucntt
source : computing.llnl.gov/tutorials/openMP/#Introduction
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 4 / 62
Outline

1 Introduction

2 OpenMP API overview

3 Compiling OpenMP programs

4 OpenMP directives

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 5 / 62
OpenMP API overview

Compiler directives appear as comments in the source code


When compiling without flag -fopenmp, then the directives are ignored
OpenMP compiler directives are used for
Spawning a parallel region
Dividing blocks of code among threads
Distribute loop iterations between threads
Synchronization of work among threads

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 6 / 62
OpenMP API overview

Run-time Library routines


Setting and querying the number of threads
Querying a thread’s unique identifier (thread ID), a thread’s ancestor’s
identifier, the thread team size
Setting, initializing and terminating locks
Querying wall clock time
...

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 7 / 62
Outline

1 Introduction

2 OpenMP API overview

3 Compiling OpenMP programs

4 OpenMP directives

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 8 / 62
Compiling OpenMP programs

Ubuntu : g++ with flag -fopenmp


Example : g++ -o openmp-helloworld openmp-helloworld.cpp
-fopenmp

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 9 / 62
Outline

1 Introduction

2 OpenMP API overview

3 Compiling OpenMP programs

4 OpenMP directives

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 10 / 62
OpenMP directives

Directives format : #pragma omp directive-name [clause,...]


newline
Example : #pragma omp parallel private(i)
General rules :
Case sensitive
Only one directive-name per directive
Each directive applies to at most one succeeding statement (structured
block)

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 11 / 62
Parallel constructs

Creates a team of threads and becomes the master thread of the team
The code of the parallel region is duplicated and all threads will
execute that code
An implied barrier at the end of the parallel region
Only the master thread continues the execution after this point
If any thread terminates within the parallel region, then all threads of
the team terminate

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 12 / 62
Parallel constructs

syntax : #pragma omp parallel [clause ...] newline


clause can be :
if (scalar expression)
private (list)
shared (list)
default (shared k none)
firstprivate (list)
reduction (operator : list)
copyin (list)
num threads (integer-expression)

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 13 / 62
Parallel constructs

1 #i n c l u d e ”omp . h”
#i n c l u d e < s t d i o . h>
3 #i n c l u d e < s t d l i b . h>

5 main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t nbthreads , t i d ;
7 nbthreads = a t o i ( argv [ 1 ] ) ;

9 /∗ f o r k a team o f t h r e a d s w i t h e a c h t h r e a d h a v i n g a p r i v a t e t i d
v a r i a b l e ∗/
#pragma omp p a r a l l e l i f ( n b t h r e a d s >= 4 ) n u m t h r e a d s ( n b t h r e a d s )
11 {
t i d = omp get thread num () ;
13 p r i n t f ( ” H e l l o w o r l d from t h r e a d %d\n” , t i d ) ;
}
15 }

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 14 / 62
Parallel constructs

Compile : g++ -o openMP-parallel-region-construct


openMP-parallel-region-construct.cpp -fopenmp
Example
Run : ./openMP-parallel-region-construct 3
Output : Hello world from thread 0

Example
Run : ./openMP-parallel-region-construct 6
Hello world from thread 3
Hello world from thread 2
Hello world from thread 0
Output :
Hello world from thread 1
Hello world from thread 4
Hello world from thread 5
ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 15 / 62
Parallel constructs
1 #i n c l u d e ”omp . h”
#i n c l u d e < s t d i o . h>
3 #i n c l u d e < s t d l i b . h>
#i n c l u d e <u n i s t d . h>
5
main ( i n t a r g c , c h a r ∗∗ a r g v ) {
7 i n t nbthreads , t i d ;
nbthreads = a t o i ( argv [ 1 ] ) ;
9 i n t x = 123;
i n t a = 456;
11
/∗ f o r k a team o f t h r e a d s w i t h e a c h t h r e a d h a v i n g a p r i v a t e t i d
v a r i a b l e ∗/
13 #pragma omp p a r a l l e l n u m t h r e a d s ( n b t h r e a d s ) p r i v a t e ( t i d , a )
{
15 t i d = omp get thread num () ;
p r i n t f ( ” H e l l o w o r l d from t h r e a d %d x = %d a = %d\n” , t i d , x , a ) ;
17
x = t i d ∗ 10;
19 sleep (3) ;
p r i n t f ( ” t h r e a d %d h a s x = %d\n” , t i d , x ) ;
21 }
}
ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 16 / 62
Parallel constructs

H e l l o w o r l d from t h r e a d 0 x = 123 a = 0
2 H e l l o w o r l d from t h r e a d 2 x = 123 a = 0
H e l l o w o r l d from t h r e a d 1 x = 123 a = 32634
4 thread 0 has x = 10
thread 2 has x = 10
6 thread 1 has x = 10

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 17 / 62
Parallel constructs
#i n c l u d e ”omp . h”
2 #i n c l u d e < s t d i o . h>
#i n c l u d e < s t d l i b . h>
4 #i n c l u d e <u n i s t d . h>

6 main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t nbthreads , t i d ;
8 nbthreads = a t o i ( argv [ 1 ] ) ;
i n t x = 123;
10 i n t a = 456;

12 /∗ f o r k a team o f t h r e a d s w i t h e a c h t h r e a d h a v i n g a p r i v a t e t i d
v a r i a b l e ∗/
#pragma omp p a r a l l e l n u m t h r e a d s ( n b t h r e a d s ) p r i v a t e ( t i d , a , x )
14 {
t i d = omp get thread num () ;
16 p r i n t f ( ” H e l l o w o r l d from t h r e a d %d x = %d a = %d\n” , t i d , x , a ) ;

18 x = t i d ∗ 10;
sleep (3) ;
20 p r i n t f ( ” t h r e a d %d h a s x = %d\n” , t i d , x ) ;
}
22 }
ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 18 / 62
Parallel constructs

H e l l o w o r l d from t h r e a d 2 x = −495646976 a = 32693


2 H e l l o w o r l d from thread 0 x = 0 a = 0
H e l l o w o r l d from thread 1 x = 0 a = 0
4 thread 2 has x = 20
thread 0 has x = 0
6 thread 1 has x = 10

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 19 / 62
Parallel constructs

firstprivate(var-list) : variables are private and are initialized with the


value of the shared copy before the parallel region
lastprivate(var-list) : variable are private and the value of the thread
executing the last iteration of a parallel loop in sequential order to the
variable outside of the parallel region

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 20 / 62
Parallel constructs
#i n c l u d e ”omp . h”
2 #i n c l u d e < s t d i o . h>
#i n c l u d e < s t d l i b . h>
4 #i n c l u d e <u n i s t d . h>

6 main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t nbthreads , t i d ;
8 nbthreads = a t o i ( argv [ 1 ] ) ;
i n t x = 123;
10 i n t a = 456;

12 /∗ f o r k a team o f t h r e a d s w i t h e a c h t h r e a d h a v i n g a p r i v a t e t i d
v a r i a b l e ∗/
#pragma omp p a r a l l e l n u m t h r e a d s ( n b t h r e a d s ) f i r s t p r i v a t e ( t i d , a , x )
14 {
t i d = omp get thread num () ;
16 p r i n t f ( ” H e l l o w o r l d from t h r e a d %d x = %d a = %d\n” , t i d , x , a ) ;

18 x = t i d ∗ 10;
sleep (3) ;
20 p r i n t f ( ” t h r e a d %d h a s x = %d\n” , t i d , x ) ;
}
22 }
ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 21 / 62
Parallel constructs

H e l l o w o r l d from t h r e a d 0 x = 123 a = 456


2 H e l l o w o r l d from t h r e a d 2 x = 123 a = 456
H e l l o w o r l d from t h r e a d 1 x = 123 a = 456
4 thread 0 has x = 0
thread 2 has x = 20
6 thread 1 has x = 10

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 22 / 62
Work-Sharing constructs

Divide the execution of the enclosed code region among the members
of the team that encounter it
Do not launch new threads
No implied barrier upon the entry to a work-sharing construct
An implied barrier at the end of the work-sharing construct
Type
DO/FOR (Data parallelism)
SECTION (Functional parallelism)
SINGLE

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 23 / 62
Work-Sharing constructs : DO-FOR

Specifies that the iterations of the loop immediately following it must


be executed in parallel by the team
Assumes that a parallel region has already been initiated

#pragma omp f o r [ c l a u s e . . . ] newline


2 s c h e d u l e ( t y p e [ , chunk ] )
ordered
4 private ( l i s t )
firstprivate ( list )
6 lastprivate ( list )
shared ( l i s t )
8 reduction ( operator : l i s t )
collapse (n)
10 nowait

12 for loop

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 24 / 62
Work-Sharing constructs : DO-FOR
#i n c l u d e ”omp . h”
2 #i n c l u d e < s t d i o . h>
#i n c l u d e < s t d l i b . h>
4 #i n c l u d e <u n i s t d . h>

6 main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t nbthreads , tid , i ;
8 nbthreads = 2;
i n t N = 3∗ n b t h r e a d s ;
10
/∗ f o r k a team o f t h r e a d s w i t h e a c h t h r e a d h a v i n g a p r i v a t e tid
v a r i a b l e ∗/
12 #pragma omp p a r a l l e l n u m t h r e a d s ( n b t h r e a d s ) p r i v a t e ( t i d )
{
14 #pragma omp f o r
f o r ( i = 0 ; i < N ; i ++){
16 t i d = omp get thread num () ;
p r i n t f ( ” H e l l o w o r l d from t h r e a d %d i = %d\n” , t i d , i ) ;
18 }
}
20 }

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 25 / 62
Work-Sharing constructs : DO-FOR

Hello world from thread 0 i = 0


2 Hello world from thread 0 i = 1
Hello world from thread 0 i = 2
4 Hello world from thread 1 i = 3
Hello world from thread 1 i = 4
6 Hello world from thread 1 i = 5

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 26 / 62
Work-Sharing constructs : DO-FOR (scheduling strategies)

Schedule clause : schedule(type[,size])


Scheduling types
static : chunks of the specified size are assigned in a round robin
fashion to the threads
dynamic : the iterations are broken into chunks of speficied size. When
a thread finishes the execution of a chunk, the next chunk is assigned
to that thread
guided : similar to dynamic, the size of the chunks is exponentially
decreasing. The size parameter specifies the smallest chunk. The initial
chunk is implementation depedent
runtime : the scheduling and the size of chunks are determined via
environment variables

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 27 / 62
Work-Sharing constructs : DO-FOR (scheduling strategies)
1 #i n c l u d e ”omp . h”
#i n c l u d e < s t d i o . h>
3 #i n c l u d e < s t d l i b . h>

5 main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t nbthreads , tid , i ;
7 nbthreads = 3;// a t o i ( argv [ 1 ] ) ;
i n t N = 3∗ n b t h r e a d s ;
9 #pragma omp p a r a l l e l n u m t h r e a d s ( n b t h r e a d s ) p r i v a t e ( t i d )
{
11 #pragma omp f o r s c h e d u l e ( dynamic , 4 )
f o r ( i = 0 ; i < N ; i ++){
13 t i d = omp get thread num () ;
p r i n t f ( ” H e l l o w o r l d s c h e d u l e ( dynamic , 4 ) from t h r e a d %d i = %d
\n” , t i d , i ) ;
15 }

17 #pragma omp f o r s c h e d u l e ( g u i d e d )
f o r ( i = 0 ; i < N ; i ++){
19 t i d = omp get thread num () ;
p r i n t f ( ” H e l l o w o r l d s c h e d u l e ( g u i d e ) from t h r e a d %d i = %d\n” ,
tid , i ) ;
ng.com }
21
https://fb.com/tailieudientucntt
}
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 28 / 62
Work-Sharing constructs : DO-FOR (scheduling strategies)

1 Hello world s c h e d u l e ( dynamic , 4 ) from t h r e a d 2 i = 0


Hello world s c h e d u l e ( dynamic , 4 ) from t h r e a d 2 i = 1
3 Hello world s c h e d u l e ( dynamic , 4 ) from t h r e a d 2 i = 2
Hello world s c h e d u l e ( dynamic , 4 ) from t h r e a d 2 i = 3
5 Hello world s c h e d u l e ( dynamic , 4 ) from t h r e a d 1 i = 8
Hello world s c h e d u l e ( dynamic , 4 ) from t h r e a d 0 i = 4
7 Hello world s c h e d u l e ( dynamic , 4 ) from t h r e a d 0 i = 5
Hello world s c h e d u l e ( dynamic , 4 ) from t h r e a d 0 i = 6
9 Hello world s c h e d u l e ( dynamic , 4 ) from t h r e a d 0 i = 7
Hello world s c h e d u l e ( g u i d e ) from t h r e a d 0 i = 0
11 Hello world s c h e d u l e ( g u i d e ) from t h r e a d 0 i = 1
Hello world s c h e d u l e ( g u i d e ) from t h r e a d 0 i = 2
13 Hello world s c h e d u l e ( g u i d e ) from t h r e a d 0 i = 7
Hello world s c h e d u l e ( g u i d e ) from t h r e a d 0 i = 8
15 Hello world s c h e d u l e ( g u i d e ) from t h r e a d 2 i = 3
Hello world s c h e d u l e ( g u i d e ) from t h r e a d 2 i = 4
17 Hello world s c h e d u l e ( g u i d e ) from t h r e a d 1 i = 5
Hello world s c h e d u l e ( g u i d e ) from t h r e a d 1 i = 6

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 29 / 62
Work-Sharing constructs : DO-FOR
(firstprivate vs. lastprivate)
#i n c l u d e ”omp . h”
2 #i n c l u d e < s t d i o . h>
#i n c l u d e < s t d l i b . h>
4 #i n c l u d e <u n i s t d . h>

6 main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t nbthreads , tid , i ;
8 nbthreads = a t o i ( argv [ 1 ] ) ;
i n t x = 123;
10 i n t a = 456;
#pragma omp p a r a l l e l n u m t h r e a d s ( n b t h r e a d s ) {
12 #pragma omp f o r l a s t p r i v a t e ( t i d , x , i ) f i r s t p r i v a t e ( a )
f o r ( i = 0 ; i < n b t h r e a d s ; i ++){
14 t i d = omp get thread num () ;
p r i n t f ( ” H e l l o w o r l d from t h r e a d %d x = %d a = %d\n” , t i d , x , a ) ;
16 x = t i d ∗ 10; a = t i d ∗ 10;
sleep (1) ;
18 p r i n t f ( ” t h r e a d %d h a s x = %d a = %d\n” , t i d , x , a ) ;
}
20 }
p r i n t f ( ” t i d = %d x = %d a = %d\n” , t i d , x , a ) ;
ng.com https://fb.com/tailieudientucntt
22 }
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 30 / 62
Work-Sharing constructs : DO-FOR

H e l l o w o r l d from thread 4 x = 0 a = 456


2 H e l l o w o r l d from thread 5 x = 0 a = 456
H e l l o w o r l d from thread 3 x = 0 a = 456
4 H e l l o w o r l d from thread 1 x = 32606 a = 456
H e l l o w o r l d from thread 0 x = 32767 a = 456
6 H e l l o w o r l d from thread 2 x = 0 a = 456
thread 4 has x = 40 a = 40
8 thread 1 has x = 10 a = 10
thread 0 has x = 0 a = 0
10 thread 2 has x = 20 a = 20
thread 3 has x = 30 a = 30
12 thread 5 has x = 50 a = 50
t i d = 5 x = 50 a = 456

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 31 / 62
Work-Sharing constructs : SECTION

Non-iterative work-sharing construct


Specify that the enclosed sections of code are divided among threads
in the team
Each SECTION is executed once by a thread in the team
Different sections may be executed by different threads
It is possible that one thread executes more than one section
There is an implied barrier at the end of a SECTION directive unless
a nowait clause is specified
Restriction : It is illegal to branch (goto) into or out of sections blocks

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 32 / 62
Work-Sharing constructs : SECTION

1 #pragma omp s e c t i o n s [ c l a u s e . . . ] newline


private ( l i s t )
3 firstprivate ( list )
lastprivate ( list )
5 reduction ( operator : l i s t )
nowait
7 {

9 #pragma omp s e c t i o n newline

11 structured block

13 #pragma omp s e c t i o n newline

15 structured block

17 }

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 33 / 62
Work-Sharing constructs : SECTION
1 #i n c l u d e ”omp . h”
#i n c l u d e < s t d i o . h>
3
main ( i n t a r g c , c h a r ∗∗ a r g v ) {
5 int tid , i ,S ,P;
#pragma omp p a r a l l e l n u m t h r e a d s ( 2 ) p r i v a t e ( t i d , i , S , P)
7 {
t i d = omp get thread num () ;
9 #pragma omp s e c t i o n s n o w a i t
{
11 #pragma omp s e c t i o n
{
13 S = 0;
f o r ( i = 1 ; i <= 5 ; i ++) S = S + i ;
15 p r i n t f ( ” Thread %d compute sum S = %d\n” , t i d , S ) ;
}
17 #pragma omp s e c t i o n
{
19 P = 1;
f o r ( i = 1 ; i <= 5 ; i ++) P = P ∗ i ;
21 p r i n t f ( ” Thread %d compute sum P = %d\n” , t i d , P) ;
}
ng.com }
23
https://fb.com/tailieudientucntt
}
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 34 / 62
Work-Sharing constructs : SECTION

1 Thread 0 compute sum P = 120


Thread 1 compute sum S = 15

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 35 / 62
Work-Sharing constructs : SINGLE

The SINGLE directive specifies that the enclosed code is executed by


only one thread in the team
Useful when dealing with sections of code that are not thread safe
(e.g., I/O)
Threads in the team that do not execute the SINGLE directive wait at
the end of the enclosed code block, unless a nowait clause is specified
Restriction : It is illegal to branch into or out of a SINGLE block

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 36 / 62
Work-Sharing constructs : SINGLE

syntax : #pragma omp single [clause ...] newline structured block


clause can be :
private (list)
firstprivate (list)
nowait

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 37 / 62
Work-Sharing constructs : SINGLE

#i n c l u d e ”omp . h”
2 #i n c l u d e < s t d i o . h>

4 main ( i n t a r g c , c h a r ∗∗ a r g v ) {
int tid ;
6 #pragma omp p a r a l l e l n u m t h r e a d s ( 3 ) p r i v a t e ( t i d )
{
8 t i d = omp get thread num () ;
p r i n t f ( ” Thread %d c r e a t e d \n” , t i d ) ;
10 #pragma omp s i n g l e n o w a i t
{
12 p r i n t f ( ” Thread %d e x e c u t e s i n g l e s e c t i o n \n” , t i d ) ;
}
14 }
}

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 38 / 62
Work-Sharing constructs : SINGLE

1 Thread 1 created
Thread 1 executes single section
3 Thread 2 created
Thread 0 created

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 39 / 62
Combined Parallel Work-sharing constructs
#i n c l u d e ”omp . h”
2 #i n c l u d e < s t d i o . h>
#d e f i n e N 10
4 #d e f i n e CHUNK 4

6 main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t t i d , i , chunk ;
8 chunk = CHUNK;
f l o a t a [N] , b [N] , c [N ] ;
10 f o r ( i = 0 ; i < N ; i ++)
a [ i ] = b[ i ] = i ∗1.0;
12
#pragma omp p a r a l l e l f o r \
14 s h a r e d ( a , b , c , chunk ) p r i v a t e ( i , t i d ) \
s c h e d u l e ( s t a t i c , chunk )
16 f o r ( i = 0 ; i < N ; i ++){
c[ i ] = a[ i ] + b[ i ];
18 t i d = omp get thread num () ;
p r i n t f ( ” t h r e a d %d comp utes c [%d ] = %f \n” , t i d , i , c [ i ] ) ;
20 }
}

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 40 / 62
Combined Parallel Work-sharing constructs

1 thread 2 c ompute s c [8] = 16.000000


thread 2 c ompute s c [9] = 18.000000
3 thread 1 c ompute s c [4] = 8.000000
thread 1 c ompute s c [5] = 10.000000
5 thread 1 c ompute s c [6] = 12.000000
thread 1 c ompute s c [7] = 14.000000
7 thread 0 c ompute s c [0] = 0.000000
thread 0 c ompute s c [1] = 2.000000
9 thread 0 c ompute s c [2] = 4.000000
thread 0 c ompute s c [3] = 6.000000

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 41 / 62
TASK construct

TASK construct defines an explicit task


may be executed by the encoutering thread, OR
deferred for execution by another thread in the team
A Task
A specific instance of executable code and its data environment,
generated when a thread encounters a taks construct or a parallel
construct
When a thread executes a task, it produces a task region
Task region
A region consists of all code encountered during the execution of a task
A parallel region consists of one or more implicit task regions
Explicit task : A task generated when a task construct is encountered
during the execution

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 42 / 62
TASK construct

source : http ://openmp.org/wp/resources/


Developers specify tasks in applications and run-time system executes
taks

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 43 / 62
TASK construct

i n t main ( i n t a r g c , c h a r ∗∗ a r g v ) {
2 i n t nbT = a t o i ( a r g v [ 1 ] ) ;
#pragma omp p a r a l l e l n u m t h r e a d s ( nbT )
4 {
#pragma omp s i n g l e
6 {
#pragma omp t a s k
8 printf (” hello ”) ;

10 #pragma omp t a s k
p r i n t f (” world ” ) ;
12
#pragma omp t a s k w a i t
14 p r i n t f ( ” \ n t h a n k you ! ” ) ;
}
16 }
p r i n t f ( ” \n” ) ;
18 }

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 44 / 62
TASK construct

Example
world h e l l o
2 t h a n k you !

Example
h e l l o world
2 t h a n k you !

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 45 / 62
TASK construct
void process ( int tid , int i ){
2 p r i n t f ( ” P r o c e s s (%d,%d ) \n” , t i d , i ) ;
s l e e p (4 −3∗ t i d ) ;
4 }
i n t main ( i n t a r g c , c h a r ∗∗ a r g v ) {
6 i n t nbT = a t o i ( a r g v [ 1 ] ) ;
int i = 1;
8 i n t N = 20;
#pragma omp p a r a l l e l n u m t h r e a d s ( nbT )
10 {
#pragma omp s i n g l e n o w a i t
12 {
w h i l e ( i < N) {
14 #pragma omp t a s k f i r s t p r i v a t e ( i )
{
16 i n t t i d = omp get thread num () ;
process ( tid , i ) ;
18 }
i = i +1;
20 }
}
22 }
}
ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 46 / 62
TASK construct
Run with 2 threads
1 Process (0 ,1)
Process (1 ,2)
3 Process (1 ,3)
Process (1 ,4)
5 Process (1 ,5)
Process (0 ,6)
7 Process (1 ,7)
Process (1 ,8)
9 Process (1 ,9)
Process (1 ,10)
11 Process (0 ,11)
Process (1 ,12)
13 Process (1 ,13)
Process (1 ,14)
15 Process (1 ,15)
Process (0 ,16)
17 Process (1 ,17)
Process (1 ,18)
19 Process (1 ,19)

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 47 / 62
Sychronization constructs

Directives
Critical : identifies that a section of code must be executed by a
single thread at a time
Barrier : specifies a synchronization point at which threads in a
parallel region will wait until all other threads in that section reach
the same point
Statement execution past the omp barrier point then continues in
parallel
Taskwait : specifies a wait on the completion of child tasks generated
by the current task

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 48 / 62
Sychronization constructs - Critical section

1 i n t main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t N = a t o i ( argv [ 1 ] ) ;
3 i n t nbT = a t o i ( a r g v [ 2 ] ) ;
int T = 0;
5 #pragma omp p a r a l l e l
{
7 #pragma omp f o r
f o r ( i n t i = 1 ; i <= 2 0 ; i ++){
9 #pragma omp c r i t i c a l
{
11 T = T + i;
}
13 }
}
15 p r i n t f ( ”T = %d\n” ,T) ;
}

What happen if we remove line 9 ?


ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 49 / 62
Sychronization constructs - barrier

Code without barrier


i n t main ( i n t a r g c , c h a r ∗∗ a r g v ) {
2 i n t nbT = a t o i ( a r g v [ 1 ] ) ;
int tid ;
4 #pragma omp p a r a l l e l n u m t h r e a d s ( nbT ) p r i v a t e ( t i d )
{
6 t i d = omp get thread num () ;
double s t = omp get wtime ( ) ;
8 f o r ( i n t i = 0 ; i < 5 ∗ ( t i d +1) ; i ++){
p r i n t f ( ”T %d , i = %d\n” , t i d , i ) ;
10 sleep (1) ;
}
12 p r i n t f ( ” Thread %d f i n i s h e d a t % l f \n” , t i d , o m p g e t w t i m e ( )−s t ) ;
}
14 }

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 50 / 62
Sychronization constructs - barrier

T 0, i = 0
2 T 1, i = 0
T 0, i = 1
4 T 1, i = 1
T 0, i = 2
6 T 1, i = 2
T 0, i = 3
8 T 1, i = 3
T 0, i = 4
10 T 1, i = 4
Thread 0 f i n i s h e d at 5.000866
12 T 1, i = 5
T 1, i = 6
14 T 1, i = 7
T 1, i = 8
16 T 1, i = 9
Thread 1 f i n i s h e d at 10.001761

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 51 / 62
Sychronization constructs - barrier

Code with barrier


1 i n t main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t nbT = a t o i ( a r g v [ 1 ] ) ;
3 int tid ;
#pragma omp p a r a l l e l n u m t h r e a d s ( nbT ) p r i v a t e ( t i d )
5 {
t i d = omp get thread num () ;
7 double s t = omp get wtime ( ) ;
f o r ( i n t i = 0 ; i < 5 ∗ ( t i d +1) ; i ++){
9 p r i n t f ( ”T %d , i = %d\n” , t i d , i ) ;
sleep (1) ;
11 }
#pragma omp b a r r i e r
13 p r i n t f ( ” Thread %d f i n i s h e d a t % l f \n” , t i d , o m p g e t w t i m e ( )−s t ) ;
}
15 }

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 52 / 62
Sychronization constructs - barrier

1 T 0, i = 0
T 1, i = 0
3 T 0, i = 1
T 1, i = 1
5 T 0, i = 2
T 1, i = 2
7 T 0, i = 3
T 1, i = 3
9 T 0, i = 4
T 1, i = 4
11 T 1, i = 5
T 1, i = 6
13 T 1, i = 7
T 1, i = 8
15 T 1, i = 9
Thread 1 f i n i s h e d at 10.001736
17 Thread 0 f i n i s h e d at 10.001799

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 53 / 62
THREADPRIVATE directive

#pragma omp threadprivate(var list)


Make global file scope variables (C,C++) local and persistent to a
thread throught the execution of multiple parallel regions
The directive must appear after the declaration of listed variables
Each thread get its own local location for variables listed
On the entry of a parallel region, data in THREADPRIVATE variables
are undefined, unless a COPYIN clause is specified in the PARALLEL
directive
Data in THREADPRIVATE objects is guaranteed to persist only if the
dynamic threads mechanism is ”turned off” and the number of
threads in different parallel regions remains constant

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 54 / 62
THREADPRIVATE directive
1 int a , b, tid ;
float x;
3 #pragma omp t h r e a d p r i v a t e ( a , x )
i n t main ( i n t a r g c , c h a r ∗∗ a r g v ) {
5 omp set dynamic (0) ;
a = 123; x = 456;
7 p r i n t f ( ” S e r i a l a = %d , x = %f \n” , a , x ) ;
p r i n t f ( ” 1 s t P a r a l l e l R e g i o n : \ n” ) ;
9 #pragma omp p a r a l l e l p r i v a t e ( b , t i d ) n u m t h r e a d s ( 3 ) c o p y i n ( a )
{
11 t i d = omp get thread num () ;
p r i n t f ( ” a t t h e e n t r y o f p a r a l l e l r e g i o n o f t h r e a d %d a= %d , x =
%f \n” , t i d , a , x ) ;
13 a = t i d ; b = t i d ; x = 10∗ t i d ;
p r i n t f ( ” Thread %d : a , b , x= %d %d %f \n” , t i d , a , b , x ) ;
15 }

17 p r i n t f ( ” 2 nd P a r a l l e l R e g i o n : \ n” ) ;
#pragma omp p a r a l l e l p r i v a t e ( t i d ) n u m t h r e a d s ( 4 )
19 {
t i d = omp get thread num () ;
21 p r i n t f ( ” Thread %d : a , b , x= %d %d %f \n” , t i d , a , b , x ) ;
ng.com} https://fb.com/tailieudientucntt
23 }
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 55 / 62
THREADPRIVATE directive

1 S e r i a l a = 123 , x = 456.000000
1 s t P a r a l l e l Region :
3 a t t h e e n t r y o f p a r a l l e l r e g i o n o f t h r e a d 1 a= 1 2 3 , x = 0 . 0 0 0 0 0 0
Thread 1 : a , b , x = 1 1 10.000000
5 a t t h e e n t r y o f p a r a l l e l r e g i o n o f t h r e a d 0 a= 1 2 3 , x = 4 5 6 . 0 0 0 0 0 0
Thread 0 : a , b , x = 0 0 0.000000
7 a t t h e e n t r y o f p a r a l l e l r e g i o n o f t h r e a d 2 a= 1 2 3 , x = 0 . 0 0 0 0 0 0
Thread 2 : a , b , x = 2 2 20.000000
9 2 nd P a r a l l e l R e g i o n :
Thread 2 : a , b , x = 2 0 20.000000
11 Thread 1 : a , b , x = 1 0 10.000000
Thread 0 : a , b , x = 0 0 0.000000
13 Thread 3 : a , b , x = 0 0 0.000000

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 56 / 62
REDUCTION clause

#pragma omp reduction(+ : var list)


The REDUCTION clause performs a reduction on the variables
appearing in the list
A private copy for each variables listed is created for each thread
At the end of the reduction, the reduction is applied to all private
copies of the shared variables, and the final result is written to the
global shared variable

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 57 / 62
REDUCTION clause

1 i n t main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t N = a t o i ( argv [ 1 ] ) ;
3 i n t nbT = a t o i ( a r g v [ 2 ] ) ;

5 int T = 0;
#pragma omp p a r a l l e l f o r r e d u c t i o n (+:T)
7 f o r ( i n t i = 1 ; i <= N ; i ++)
{
9 i n t t i d = omp get thread num () ;
T = T + i;
11 p r i n t f ( ” Thread %d h a s i = %d compute l o c a l T = %d\n” , t i d , i , T) ;
}
13 p r i n t f ( ”T = %d\n” ,T) ;
}

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 58 / 62
REDUCTION clause

Thread 0 has i = 1 compute l o c a l T = 1


2 Thread 0 has i = 2 compute l o c a l T = 3
Thread 0 has i = 3 compute l o c a l T = 6
4 Thread 1 has i = 4 compute l o c a l T = 4
Thread 1 has i = 5 compute l o c a l T = 9
6 Thread 1 has i = 6 compute l o c a l T = 15
Thread 2 has i = 7 compute l o c a l T = 7
8 Thread 2 has i = 8 compute l o c a l T = 15
Thread 2 has i = 9 compute l o c a l T = 24
10 Thread 3 has i = 10 compute l o c a l T = 10
T = 55

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 59 / 62
REDUCTION clause

1 i n t main ( i n t a r g c , c h a r ∗∗ a r g v ) {
i n t N = a t o i ( argv [ 1 ] ) ;
3 i n t nbT = a t o i ( a r g v [ 2 ] ) ;

5 int T = 1;
#pragma omp p a r a l l e l f o r r e d u c t i o n ( ∗ : T)
7 f o r ( i n t i = 1 ; i <= N ; i ++)
{
9 i n t t i d = omp get thread num () ;
T = T + i;
11 p r i n t f ( ” Thread %d h a s i = %d compute l o c a l T = %d\n” , t i d , i , T) ;
}
13 p r i n t f ( ”T = %d\n” ,T) ;
}

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 60 / 62
REDUCTION clause

Thread 0 h a s i = 1 compute l o c a l T = 2
2 Thread 0 h a s i = 2 compute l o c a l T = 4
Thread 0 h a s i = 3 compute l o c a l T = 7
4 Thread 2 h a s i = 7 compute l o c a l T = 8
Thread 2 h a s i = 8 compute l o c a l T = 16
6 Thread 2 h a s i = 9 compute l o c a l T = 25
Thread 3 h a s i = 10 compute l o c a l T = 11
8 Thread 1 h a s i = 4 compute l o c a l T = 5
Thread 1 h a s i = 5 compute l o c a l T = 10
10 Thread 1 h a s i = 6 compute l o c a l T = 16
T = 30800

ng.com https://fb.com/tailieudientucntt
Pham Quang Dung () Parallel Programming OpenMP Hanoi, 2012 61 / 62

You might also like