Professional Documents
Culture Documents
Module Outline
Refresher on OpenMP
Model of parallelism
Components
Scope of variables
Synchronization
Refresher on OpenMP
Refresher on OpenMP
Sequential Program
void main()
{
int i, k, N=1000;
double A[N], B[N], C[N];
for (i=0; i<N; i++) {
A[i] = B[i] + k*C[i]
}
}
Parallel Program
#include omp.h
void main()
{
int i, k, N=1000;
double A[N], B[N], C[N];
#pragma omp parallel for
for (i=0; i<N; i++) {
A[i] = B[i] + k*C[i];
}
}
During Execution
(Cont.)
Parallel Program
Thread 1
void main()
#include omp.h
Thread 2
void
main()
{
void main()
Thread 3
void main()
{
int i, k, N=1000;
{
void main()
{ N=1000;
int int
i,C[N];
k,
double A[N], B[N],
i, N=1000;
k,
{
intB[N],
i,C[N];
k,C[N];
N=1000;
double
A[N],
B[N],
lb = 0;
double
A[N],
int i,C[N];
k, N=1000;
A[N],
lb#pragma
= 250; omp double
ub = 250;
parallel
for B[N],
double A[N], B[N], C[N];
= 500;
ub =for
500;
for (i=lb;i<ub;i++)
{ (i=0; lb
i<N;
i++) {
lb = 750;
ub =+750;
(i=lb;i<ub;i++)
{
A[i] = B[i] for
+ k*C[i];
A[i] = B[i]
k*C[i];
ub = 1000;
(i=lb;i<ub;i++)
{
A[i]
+ k*C[i];
}
} = B[i] for
(i=lb;i<ub;i++) {
A[i] = B[i] for
+ k*C[i];
} }
}
A[i] = B[i] + k*C[i];
}
}
}
}
}
Thread 0
Parallel
Slave
threads
Serial
Parallel
Serial
OpenMP s Components
Directives format
Refer to http://www.openmp.org and the OpenMP 2.0 specifications in the course web
site for more details
#pragma omp directive-name [clause[ [,] clause]...] new-line
For example,
#pragma omp for [clause[[,] clause] ... ] new-line
for-loop
private(variable-list)
firstprivate(variable-list)
lastprivate(variable-list)
reduction(operator: variable-list)
ordered
schedule(kind[, chunk_size])
nowait
Except here.
Iterations divided
over multiple threads
#include <omp.h>
//
#pragma omp parallel
{
Parallel Section
Parallel sections:
10
Type of variables
Most
11
12
Barriers
Barriers are implicit after each parallel section
When barriers are not needed for
correctness, use nowait clause
schedule clause will be discussed later
#include <omp.h>
//
#pragma omp parallel
{
13
Named locks
omp_init_lock(), omp_set_lock(), omp_unset_lock
(), omp_test_lock(), omp_destroy_lock()
Use when you use multiple locks (fine-grain lock
programming)
Copyright @ 2009-2010 Yan Solihin
14
Declaring Variables
15
Declaring Variables
is equivalent to:
sum = mysum = 0;
#pragma omp parallel firstprivate(mysum) {
#pragma omp for
for (i=0; i<N; i++)
mysum = mysum + a[i];
#pragma omp critical {
sum = sum + mysum
}
} // end omp parallel
Copyright @ 2009-2010 Yan Solihin
16
Restrictions on Reduction
17
Declaring Variables
18
omp_set_num_threads(4);
#pragma omp parallel // 4 threads will run
{
// do work here
}
omp_set_num_threads(omp_num_procs());
#pragma omp parallel // as many threads as CPUs
{
// do work here
}
Copyright @ 2009-2010 Yan Solihin
19
Step 1: Go to Lab/Reduction
Step 2: Parallelize the loop that reduces to sum
put #pragma omp parallel for reduction(+:sum)
20
OpenMP 3.0
if (expression)
untied
shared (list)
private (list)
firstprivate (list)
default( shared | none )
Copyright @ 2009-2010 Yan Solihin
21
Example of Tasking
22