
Advanced Usage of the CBI Cluster

Parallel Jobs, Job Arrays, and the Development Queue
Note on Parallel Jobs
Running parallel jobs on a cluster is especially
advantageous because the work is split into many
processes that can be executed simultaneously.

However, only programs that are both written and
compiled for parallel execution will run in parallel
on the cluster. Such programs are designed to run
under cluster management software and cannot
simply be executed in an ordinary environment.
Writing the Job Script

It is recommended that you place the job script in
its own directory, along with all of the files
required by the job. The output file of the job will
be placed in this directory as well.
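
For example, a minimal sketch of setting up such a
directory (the directory and file names here are
hypothetical):

> mkdir ~/openmpi-test
> cp mpiProgram openmpi-test.sh ~/openmpi-test/
> cd ~/openmpi-test
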
Sample Script for a Parallel Job


#!/bin/bash
#$ -N openmpi-test        # the name of the job, "openmpi-test"
#$ -pe orte 5-10          # the parallel environment "orte", with a minimum of 5 and a maximum of 10 slots
#$ -cwd                   # run the job in the directory it was submitted from (the current working directory)
#$ -j y                   # combine the standard output and error messages into one output file
/opt/openmpi/bin/mpirun -n $NSLOTS mpiProgram   # $NSLOTS is set by the scheduler to the number of slots granted
exit 0
Submit the Parallel Job to the Cluster

The script written for the Parallel Job can be
submitted to the cluster the same way as a
Batch Job (described in the first guide).

Use this command:

> qsub <script file name>
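
For example, if the script above is saved as
openmpi-test.sh (a hypothetical file name), the
submission and acknowledgment look something
like this; the job ID will differ on your system:

> qsub openmpi-test.sh
Your job 3245 ("openmpi-test") has been submitted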


Job Arrays
Job Arrays are a useful way of running one
program over a number of input data sets.

The alternative to using a script with a job array is
to create a script for each data set you need to
input into the program.

The advantage of scripts that use job arrays is
that they are efficient for both the user and the
head node. Also, note that all of the jobs in the
array share the same job ID, while each task is
given an individual task ID.
Example Situation for a Job Array
For this example, we will use this sample script:

#!/bin/bash
/programs/program -i /data/input

Note: You can still add parameters as you would in a
regular Batch or Parallel job script.

This script will run the program once, on a single input file.

If we have 100 input files, then we would need to
create 100 of these scripts to run the program on
each of the input files.
Solution using an Array of Jobs

An efficient solution to the problem presented in
the previous slide is to use a job array.

#!/bin/bash
#$ -t 1-100
/programs/program -i /data/input.$SGE_TASK_ID -o /results/output.$SGE_TASK_ID

The '-t 1-100' specifies that this script will create 100
tasks, numbered 1 to 100. When each task runs,
'$SGE_TASK_ID' is replaced with that task's number.
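
As a concrete illustration, when the scheduler starts
task 7 of this array, $SGE_TASK_ID is set to 7, so the
command that actually executes is:

/programs/program -i /data/input.7 -o /results/output.7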
Deleting Job Arrays
All of the jobs in the array are grouped under a
single job ID. Using this command will kill all of
the jobs in that array:

> qdel <ID of the job array>

Additionally, the individual jobs in the array
(named "tasks") are given task IDs. This
command will kill a specific task without killing
the other jobs in the array:

> qdel <job ID>.<task ID>
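
For example, with a hypothetical job array whose ID is 3245:

> qdel 3245        # kills every task in the array
> qdel 3245.17     # kills only task 17; the remaining tasks keep running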


Development Queue

The Development Queue is to be used for testing
custom code and small data sets.

This queue is limited to a two-hour run time;
after that, the job is killed.

It should be noted that this queue is for testing:
DO NOT test code or data on the main queue.
The advantages of this queue are:
• This queue will be available more often than
the main queue
• Should buggy code or bad data cause the
program to hang, the time limit will prevent it
from taking up slots

To access this queue, use the following as an
example job script:
#!/bin/bash
#$ -N dev-job
#$ -o dev-job.log
#$ -q dev.q       # request the dev queue
#$ -l testonly    # "testonly" is a special flag that is required to access the queue
#$ -cwd
#$ -j y
/path/to/program input
exit 0
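
As with the other job types, this script is submitted
with qsub. Assuming it is saved as dev-job.sh (a
hypothetical name), you can submit it and then use
qstat to confirm it was placed in dev.q:

> qsub dev-job.sh
> qstat -u $USER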
