You are on page 1of 61

Parallel Computing with MATLAB

August 18, 2010


Alyssa Silverman Account Manager
Sarah Zaranek Application Engineer

1
The MathWorks at a Glance
Headquarters:
Natick, Massachusetts US
US:
California, Michigan,
Washington DC, Texas
Europe:
UK, France, Germany,
Switzerland, Italy, Spain,
the Netherlands, Sweden
Asia-Pacific:
China, Korea, Australia
Worldwide training
and consulting
Earths topography on an equidistant cylindrical projection,
Distributors in 25 countries created with MATLAB and Mapping Toolbox.

2
MATLAB Central

Open exchange for the MATLAB


and Simulink user community
662,000 visits per month
File Exchange
Upload/download free files including
MATLAB code, Simulink models, and
documents
Rate files, comment, and ask questions
Newsgroup
Web forum and newsgroup for technical
discussions about MATLAB and
Simulink
Blogs
Read posts from key MathWorks
developers
who design and build the products

Visit www.mathworks.com/matlabcentral 3
Visit www.mathworks.com/academia 4
1,200+ Books in 27 Languages

Controls
Signal Processing
Image Processing
Biosciences
Communications
Mechanical Engineering
Aerospace Engineering
Electrical Engineering
Physical sciences
Finance
Mathematics

Visit www.mathworks.com/books 5
Training Services

Training Options
Public training
Offered throughout the world
Schedule and course information
at www.mathworks.com/training Example course topics
Onsite training Introductory and intermediate
Bring training to your site, with training on MATLAB, Simulink,
course customization available Stateflow, and Real-Time
Workshop
Web-based training
Specialized courses in control
Instructor-led e-learning
design, signal processing, parallel
Train at work or at home, with computing, code generation,
flexible dates and times communications, financial analysis,
and other areas

Visit www.mathworks.com/training 6
Further information
Stay for questions

Visit MATLAB Central for shared code

Trials, onsite demonstrations, technical


literature:
Alyssa Silverman
alyssa.silverman@mathworks.com
(508) 647-4343
Company and product information:
www.mathworks.com
7

Workshop:
Parallel Computing with MATLAB

2010 The MathWorks, Inc.


Sarah Zaranek, Senior Applications Engineer

Outline

Introduction to Parallel Computing Tools


Using Parallel Computing Toolbox
Task Parallel Applications
Data Parallel Applications

Parallel Computing with MATLAB

Worker Worker

Worker

TOOLBOXES Worker
Worker

Worker
BLOCKSETS Worker

Worker

10

Parallel Computing with MATLAB

Parallel Computing Toolbox MATLAB Distributed


Computing Server

MATLAB Workers

Users Desktop Compute Cluster


11

Solving Big Technical Problems

Challenges You could Solutions

Long running Run similar tasks


on independent
Wait
Computationally processors in
intensive parallel

Load data onto


Reduce size
Large data set multiple machines
of problem
that work together
in parallel

12

Parallel Computing Toolbox API

Task-parallel Applications
Using the parfor constructs
Using jobs and tasks

Data-parallel Applications
Using distributed arrays
Using the spmd construct

13

Task-parallel Applications

Converting for to parfor


Configurations
Scheduling parfor
Creating jobs and tasks
When to Use parfor vs. jobs and tasks
Resolving parfor Issues
Resolving jobs and tasks Issues

14

Toolboxes with Built-in Support

Optimization Toolbox
Global Optimization Toolbox
Statistics Toolbox Worker
Worker

Simulink Design Optimization TOOLBOXES


BLOCKSETS
Worker Worker
Worker

Bioinformatics Toolbox Worker


Worker

Communications Toolbox
.

Contain functions that directly leverage functions from the


Parallel Computing Toolbox
15

Opening and Closing a matlabpool

Open and close a matlabpool with two labs

16

Determining the Size of the Pool

17

One Pool at a Time

Even if you have not exceeded the number of labs, you can only
open one matlabpool at a time

18

Add Shortcut for Starting the matlabpool

19

Add Shortcut for Stopping the matlabpool

20

Example: Parameter Sweep of ODEs


1.2

Solve a 2nd order ODE 0.8

Displacement (x)
0.6


0.4
5 0.2
m = 5, b = 2, k = 2

m x b x k x 0 0

-0.2
m = 5, b = 5, k = 5

-0.4
1, 2 ,... 1, 2 ,... 0 5 10
Time (s)
15 20 25

Simulate with different


values for b and k 2.5

Peak Displacement (x)


2

Records and plots peak values


1.5

0.5
0
5
2 4
4 3
Damping (b) 2
6 1 Stiffness (k)

\task_parallel\paramSweepScript.m 21

The Mechanics of parfor Loops


1 1
2 23 34 4 55 66 7 8 8 9 9 10 10

Worker
a(i) = i; Worker
a = zeros(10, 1) a(i) = i;

parfor i = 1:10
a(i) = i;
end Worker
Worker
a a(i) = i;
a(i) = i;

Pool of MATLAB Workers


22

Converting for to parfor

Requirements for parfor loops


Task independent
Order independent

Constraints on the loop body


Cannot introduce variables (e.g. eval, load, global, etc.)
Cannot contain break or return statements
Cannot contain another parfor loop

23

Advice for Converting for to parfor

Use M-Lint to diagnose parfor issues

If your for loop cannot be converted to a parfor,


consider wrapping a subset of the body to a function

Read the section in the documentation on


classification of variables

http://blogs.mathworks.com/loren/2009/10/02/using-
parfor-loops-getting-up-and-running/

24

Migrating from Interactive to Scheduled

Work Worker

TOOLBOXES Scheduler
Worker
Worker
Result
BLOCKSETS Worker

25

Interactive to Scheduled

Interactive
Great for prototyping
Immediate access to MATLAB workers

Scheduled
Offloads work to other MATLAB workers (local or on a cluster)
Access to more computing resources for improved performance
Frees up local MATLAB session

26

Using Configurations

Managing configurations
Typically created by Sys Admins
Label configurations based on the version of MATLAB
E.g. linux_r2009a

Import configurations generated by the Sys Admin


Dont modify them with two exceptions
Setting the CaptureCommandWindowOutput to true for debugging
Set the ClusterSize for the local scheduler to the number of cores

27

Validating Configurations

Validate configurations
Ensure new configuration is properly working
Helpful when debugging failed jobs

Think of job configurations like printer queue


configurations

28

Creating and Submitting Jobs

Find resource

Create job

Create task(s)

Submit job

Wait for completion

Check for errors

Get results

Destroy job

Rather than using a shell script to submit a job to a cluster, well write
our jobscript in MATLAB.
task_parallel\basic_jobscript.m 29

Example: Scheduling the ODE Sweep

task_parallel\jobscript_ode.m 30

Example: Retrieving Results

task_parallel\ode_return.m 31

Considerations When Using parfor

parfor automatically quits on error


parfor doesnt provide intermediate results

32

Creating Jobs and Tasks

Rather than submitting a single task


containing a parfor, the jobscript can be
used to create an array of tasks, each
calling a unit of work

33

Example: Using Multiple Tasks

task_parallel\jobscript_tasks.m 34

Example: Retrieving Task Results

task_parallel\tasks_return.m 35

parfor or jobs and tasks

parfor Jobs and tasks


Seamless integration to All tasks run
users code Query results after each
Several for loops task is finished
throughout the code to
convert
Automatic load balancing

Try parfor first. If it doesnt apply to your application,


create jobs and tasks.
36

Resolving parfor Issues

Lets look at a common parfor issues and how to go


resolving them

37

Unclassified Variables

The variable A cannot be properly classified

task_parallel\valid_indexing_error.m 38

parfor Variable Classification

All variables referenced at the top level of the parfor


must be resolved and classified

Classification Description
Loop Serves as a loop index for arrays
Sliced An array whose segments are operated on by different
iterations of the loop
Broadcast A variable defined before the loop whose value is used
inside the loop, but never assigned inside the loop
Reduction Accumulates a value across iterations of the loop,
regardless of iteration order
Temporary Variable created inside the loop, but unlike sliced or
reduction variables, not available outside the loop

39

Variable Classification Example

Loop variable
Temporary variable
Reduction variable
Sliced output variable Sliced input variable

Broadcast variable

40

At the end of this loop, what is the


value of each variable?

task_parallel\what_is_it_parfor.m 41

Results
a: ones(1:10) (broadcast)
b: undefined (temp)
c: undefined (temp)
d: 1:10 (sliced)
e: 55 (reduction)
f: 5 (temp)
g: 20 (reduction)
h: 10 (temp)
idx: undefined (loop)

42

parfor issue: Nested for loops

Within the list of indices for a sliced variable, one of these indices is of the form i, i+k, i-k, k+i,
or k-i, where i is the loop variable and k is a constant or a simple (non-indexed) variable; and
every other index is a constant, a simple variable, colon, or end.

task_parallel\valid_indexing_error.m 43

parfor issue: Solution 1

Create a temporary variable, b to store the row vector. Use the looping index, i, to index
the columns and the colon to assign the row vector to the temporary variable between the
for loops.

task_parallel\valid_indexing_fix1.m 44

parfor issue: Solution 2

Use cell arrays. The restrictions on indexing only apply to the top-level indexing (i.e.
indexing into the cell array). Indexing into contents of the cell arrays is allowed.

task_parallel\valid_indexing_fix2.m 45

Using parfor with Simulink

Can use parfor with sim.


Must make sure that the Simulink workspace contains the
variables you want to use.

Within main parfor body: Use base workspace


Use assignin to place variables in base workspace.
Note: the base workspace when using parfor is different
than the base workspace when running serially.

task_parallel\simParforEx1.m 46

Resolving Jobs & Tasks Issues

Code running on your client machine ought to be able to


resolve functions on your path

When submitting jobs to a cluster, those files need to


either be submitted as part of the job (FileDependencies)
or the folder needs to be accessible (PathDependencies)

There is overhead when adding too many files to the job;


but setting path dependencies requires the Worker to be
able to reach the path

47

Parallel Computing Tools Address


Task-Parallel Data-Parallel

Long computations Large data problems

Multiple independent iterations 11 26 41


12 27 42
13 28 43
parfor i = 1 : n
14 29 44
% do something with i 15 30 45
16 31 46
end 17 32 47
17 33 48
19 34 49
Series of tasks 20 35 50
21 36 51

Task 1 Task 2 Task 3 Task 4 22 37 52

48

Data-parallel Applications

Using distributed arrays


Using spmd
Using mpi based functionality

49

Client-side Distributed Arrays


11 26 41

12 27 42

13 28 43

14 29 44

15 30 45

16 31 46

TOOLBOXES 17 32 47

17 33 48

BLOCKSETS 19 34 49

20 35 50

21 36 51

22 37 52

Remotely Manipulate Array Distributed Array


from Desktop Lives on the Cluster

data_parallel\distributed_example.m 50

Client-side Distributed Arrays and SPMD

Client-side distributed arrays


Class distributed
Can be created and manipulated directly from the client.
Simpler access to memory on labs
Client-side visualization capabilities

spmd
Block of code executed on workers
Worker specific commands
Explicit communication between workers
Mixture of parallel and serial code

51

spmd blocks (Data Parallel)

spmd
% single program across workers
end

Mix data-parallel and serial code in the same function


Run on a pool of MATLAB resources
Single Program runs simultaneously across workers
Distributed arrays, message-passing
Multiple Data spread across multiple workers
Data stays on workers

data_parallel\spmd_example.m 52

The Mechanics of spmd Blocks

Worker
x 1 Worker
x = 1 y = x + 1 x 1
y = x + 1
spmd
y = x + 1
end Worker
Worker
y x 1
x 1
y = x + 1 y = x + 1

Pool of MATLAB Workers


53

Composite Arrays

Created from client


TOOLBOXES
Stored on workers BLOCKSETS

Syntax similar to cell arrays

54

Composite Array in Memory


0x0000
2
0x0008
>> matlabpool open 4
0x0010
1

2
>> x = Composite(4)
0x0000
2
TOOLBOXES 3
0x0008
3
4 5
>> x{1} = 2 BLOCKSETS 0x0010

>> x{2} = [2, 3, 5]


>> x{3} = @sin x
0x0000 0x0000
0x0000

0x0008
>> x{4} = tsobject() 0x0008 0x0008
tsobject @sin
0x0010
0x0010 0x0010

0x0018

0xFFFF

spmd

single program, multiple data


Unlike variables used in multiple parfor loops,
distributed arrays used in multiple spmd blocks retain
state
Use M-Lint to diagnose spmd issues

56

MPI-Based Functions in
Parallel Computing Toolbox
Use when a high degree of control over parallel algorithm is required

High-level abstractions of MPI functions


labSendReceive, labBroadcast, and others
Send, receive, and broadcast any data type in MATLAB

Automatic bookkeeping
Setup: communication, ranks, etc.
Error detection: deadlocks and miscommunications

Pluggable
Use any MPI implementation that is binary-compatible with MPICH2

data_parallel\mpi_example.m 5

Summary for Interactive Functionality

Client-side Distributed Arrays


MATLAB array type across cluster
Accessible from client

SPMD END
Flow control from serial to parallel TOOLBOXES
Fine Grained BLOCKSETS
More control over distributed arrays

Composite Arrays
Generic data container across cluster
Accessible from client

Example: Scheduling Estimating

What is the probability that a randomly dropped needle


will cross a grid line?

(Buffon-Laplace Method) Simulate a


random needles dropping, calculate P,
and get an estimate for .

b
2 l ( a b) l 2 crossing needles
P(l , a, b)
ab total needles l
(xo,yo)
data_parallel\jobscript_Pi.m 5

Summary for Scheduled Functionality

uses function script pure task pure data parallel


matlabpool parallel parallel and serial

batch

matlabpool job

jobs and tasks

parallel job

Recommendations

Profile your code to search for bottlenecks


Make use of M-Lint when coding parfor and spmd
Beware of writing to files
Avoid the use of global variables
Run locally before moving to cluster

61

You might also like