PBS User Guide


Overview | Job Limits | Submitting a Job | Non-interactive Batch Jobs | Interactive Batch Jobs | Checking Job Status | Deleting a Job | Viewing Job Output | More Information

Overview
PBS is a job resource manager that provides job queuing and execution services in a batch cluster environment. A job is a computational task, such as a simulation or a data analysis run. On the HPC systems, PBS works with the Moab job scheduler: PBS provides job information to Moab, and Moab tells PBS which jobs to run and on which compute nodes of a cluster to run them.

Job Limits
PBS is configured on each system to have a number of separate job queues. There is a default queue on each system that every user has access to. Each funding group has its own queue.

These limits are placed on funding group queues:

Funding Group Queue Limits

    Limit                                Lion-X*
    Maximum Walltime (Hours)             96 (default) to 336+
    Maximum Processors in Use Per User   No Limit
    Maximum Job Size (Processors)        32

Note that the walltime limits placed on funding group queues are arbitrary and can be adjusted at the request of the group's PI. If you believe you're part of a funding group of a system and you don't know what queue you should be using, please email us at helpdesk@rcc.its.psu.edu stating what group you are part of and that you need to know your group queue.

These limits are placed on system default queues:

System Default Queue Limits

    Limit                                Lion-X*    Lion-XK                CyberStar              Clsf
    Maximum Walltime (Hours)             24         24                     96                     No Limit
    Maximum Processors in Use Per User   32         32                     No Limit               No Limit
    Maximum Job Size (Processors)        No Limit   512 (nodes=64:ppn=8)   256 (nodes=32:ppn=8)   128 (nodes=1:ppn=128)

Most queue limits can be checked by running the command qstat -q.

Submitting a Job
Jobs are submitted to a PBS queue so that PBS can dispatch them to run on one or more of a cluster's compute nodes. There are two main types of PBS jobs:

Non-interactive Batch Jobs: This is the most common type of PBS job. A job script is created that contains PBS resource requests and the commands necessary to execute the job. The job script is then submitted to PBS to be run non-interactively.

Interactive Batch Jobs: This is a way to get an interactive terminal on one or more of the compute nodes of a cluster. Commands can then be run interactively through that terminal, directly on the compute nodes, for the duration of the job. Interactive jobs are helpful for tasks such as program debugging and running many short jobs.

Non-interactive Batch Jobs


There are two steps to running a non-interactive batch job:

1. Create a PBS Script

A PBS script is a standard Unix/Linux shell script that contains a few extra comments at the beginning that specify directives to PBS. These comments all begin with #PBS. The most important PBS directives are:

Definition of Important PBS Directives

#PBS -l walltime=HH:MM:SS
    This directive specifies the maximum walltime (real time, not CPU time) that a job should take. If this limit is exceeded, PBS will stop the job. Keeping this limit close to the actual expected time of a job can allow a job to start more quickly than if the maximum walltime is always requested.

#PBS -l pmem=SIZEgb
    This directive specifies the maximum amount of physical memory used by any process in the job. For example, if the job would run four processes and each would use up to 2 GB (gigabytes) of memory, then the directive would read #PBS -l pmem=2gb. The default for this directive on Lion-XF and Lion-LSP is 1 GB (gigabyte) of memory. Other Lion clusters do not currently set a default.

#PBS -l nodes=N:ppn=M
    This specifies the number of nodes (nodes=N) and the number of processors per node (ppn=M) that the job should use. PBS treats a processor core as a processor, so a system with eight cores per compute node can have ppn=8 as its maximum ppn request. Note that unless a job has some inherent parallelism of its own through something like MPI or OpenMP, requesting more than a single processor on a single node is usually wasteful and can impact the job start time.

#PBS -q queuename
    This specifies what PBS queue a job should be submitted to. This is only necessary if a user has access to a special queue. This option can and should be omitted for jobs being submitted to a system's default queue.

#PBS -j oe
    Normally when a command runs it prints its output to the screen. This output is often normal output and error output. This directive tells PBS to put both normal output and error output into the same output file.
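To make the arithmetic behind these resource requests concrete, here is a minimal shell sketch. The helper names to_walltime, total_procs, and total_mem_gb are hypothetical and are not part of PBS. The memory estimate assumes one process per requested processor, which may not hold for every job.

```shell
# Hypothetical helper (not a PBS command): convert a number of seconds
# into the HH:MM:SS form used by "#PBS -l walltime=".
to_walltime() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# Hypothetical helper: total processors requested by nodes=N:ppn=M.
total_procs() {
    echo $(( $1 * $2 ))
}

# Hypothetical helper: total memory in GB implied by nodes=N:ppn=M and
# pmem=SIZEgb, ASSUMING one process runs on each requested processor.
total_mem_gb() {
    echo $(( $1 * $2 * $3 ))
}

to_walltime 14400     # four hours, expressed in seconds
total_procs 2 8       # nodes=2:ppn=8
total_mem_gb 2 8 2    # nodes=2:ppn=8 with pmem=2gb
```

Estimating the total this way before submitting can help check that a request fits within the queue limits listed under Job Limits.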

The following is an example PBS script.


    # This is a sample PBS script. It will request 1 processor on 1 node
    # for 4 hours.
    #
    # Request 1 processor on 1 node
    #
    #PBS -l nodes=1:ppn=1
    #
    # Request 4 hours of walltime
    #
    #PBS -l walltime=4:00:00
    #
    # Request 1 gigabyte of memory per process
    #
    #PBS -l pmem=1gb
    #
    # Request that regular output and error output go to the same file
    #
    #PBS -j oe
    #
    # The following is the body of the script. By default,
    # PBS scripts execute in your home directory, not the
    # directory from which they were submitted. The following
    # line places you in the directory from which the job
    # was submitted.
    #
    cd "$PBS_O_WORKDIR"
    #
    # Now we want to run the program "hello". "hello" is in
    # the directory that this script is being submitted from,
    # $PBS_O_WORKDIR.
    #
    echo " "
    echo " "
    echo "Job started on `hostname` at `date`"
    ./hello
    echo " "
    echo "Job Ended at `date`"
    echo " "

Note that the above example script is for a non-MPI job. Information on how to write PBS scripts for MPI jobs can be found in the MPI software pages.

2. Submit the PBS Script to PBS for Execution

Once a PBS script is created, it needs to be submitted to PBS so that it becomes eligible to be run. The command to submit a script to PBS is qsub. The syntax of qsub is:

    qsub scriptfile

The following is an example of using qsub to submit a PBS script called myjob.

    % qsub myjob
    95.lionxj.rcc.psu.edu

The job script myjob has just been submitted to PBS and has been assigned the Job_ID 95.lionxj.rcc.psu.edu. This Job_ID can later be used to control the job.
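Because qsub prints the Job_ID on standard output, a wrapper script can capture it for later use. The following is a small sketch of that idea. The helper name job_number is hypothetical, and the qsub call itself appears only in a comment so that the sketch runs on a machine without PBS.

```shell
# Hypothetical helper (not a PBS command): keep only the leading
# job number from a full Job_ID such as 95.lionxj.rcc.psu.edu.
job_number() {
    echo "${1%%.*}"
}

# On a cluster this would be: JOBID=$(qsub myjob)
# A fixed value taken from the qsub example is used here instead.
JOBID="95.lionxj.rcc.psu.edu"
echo "Submitted $JOBID (job number $(job_number "$JOBID"))"
```

The captured value can then be passed to the status and deletion commands described in the following sections.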

Interactive Batch Jobs


Interactive PBS jobs are similar to non-interactive PBS jobs in that they are submitted to PBS via the command qsub. Submitting an interactive PBS job differs from submitting a non-interactive PBS job in that a PBS script is not necessary. All PBS directives can be specified on the command line. The syntax of qsub for submitting an interactive PBS job is:

    qsub -I ... pbs directives ...

The -I flag above tells qsub that this is an interactive job. The following example shows using qsub to submit an interactive job using one processor on one node for four hours.

    lionxi:~$ qsub -I -l nodes=1:ppn=1 -l walltime=4:00:00
    qsub: waiting for job 1064159.lionxi.rcc.psu.edu to start
    qsub: job 1064159.lionxi.rcc.psu.edu ready

    lionxi25:~$

There are two things of note here. The first is that the qsub command doesn't exit when run with the interactive -I flag. Instead, it waits until the job is started and gives a prompt on the first compute node assigned to the job. The second is the prompt lionxi25:~$, which shows that commands are now being executed on the compute node lionxi25.

Checking Job Status


The command to check job status is qstat. qstat has many options. Some common ones are:

PBS Commands for Checking Job Status

qstat
    Shows the status of all PBS jobs. The time displayed is the CPU time used by the job.

qstat -s
    Shows the status of all PBS jobs. The time displayed is the walltime used by the job.

qstat -u userid
    Shows the status of all PBS jobs submitted by the user userid. The time displayed is the walltime used by the job.

qstat -n
    Shows the status of all PBS jobs along with a list of the compute nodes that each job is running on.

qstat -f jobid
    Shows detailed information about the job jobid.

A job can be in several different states. The most common ones are:

PBS Job States

    State   Meaning
    Q       The job is queued and is waiting to start.
    R       The job is currently running.
    E       The job is currently ending.
    H       The job has a user or system hold on it and will not be eligible to run until the hold is removed.

Example: qstat output
    lionxj:~$ qstat
    Job id      Name     User     Time Use  S  Queue
    ----------  -------  -------  --------  -  ------------
    10.lionxj   sparse   abc123   188:20:2  R  lionxj
    11.lionxj   test     jwh128   00:00:18  R  lionxj-admin
    ...

Job id: the job's unique identifier
Name: name of the job
User: user that owns the job
Time Use: CPU time used by the job
S: state of the job
Queue: the queue the job is in

Example: qstat -s output
    lionxj:~$ qstat -s

    lionxj.rcc.psu.edu:
                                                                Req'd   Req'd      Elap
    Job ID          Username  Queue     Jobname  SessID NDS TSK Memory  Time    S  Time
    --------------  --------  --------  -------  ------ --- --- ------  -----   -  -----
    10.lionxj.rcc   abc123    lionxj    sparse     5793   4  --    2gb  190:0   R  189:2
    11.lionxj.rcc   jwh128    lionxj-a  test      11946   3  --     --  500:0   R  166:5
    ...

Job id: the job's unique identifier
Username: user that owns the job
Queue: the queue the job is in
Jobname: the name of the job
NDS: the number of compute nodes the job is using
Req'd Memory: the memory requested for the job
Req'd Time: the walltime requested for the job
S: the state of the job
Elap Time: the elapsed walltime for the job
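When many jobs are listed, a short filter can summarize how many are in each state. The sketch below is one possible approach, not a PBS feature. The helper name count_states is hypothetical, and it assumes the default qstat layout (two header lines, with the state in the fifth column) and job names that contain no spaces. The sample rows are made up for illustration.

```shell
# Hypothetical helper: count jobs per state from default "qstat" output
# read on standard input. ASSUMES two header lines and the job state
# in column 5; job names containing spaces would break this.
count_states() {
    awk 'NR > 2 && NF >= 6 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Made-up sample in the default qstat layout.
# On a cluster, use instead: qstat | count_states
count_states <<'EOF'
Job id      Name    User    Time Use  S  Queue
----------  ------  ------  --------  -  ------------
10.lionxj   sparse  abc123  188:20:2  R  lionxj
11.lionxj   test    jwh128  00:00:18  R  lionxj-admin
12.lionxj   demo    abc123  0         Q  lionxj
EOF
```

For the sample above this prints one line per state, with the state letter followed by the number of jobs in that state.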

Deleting a Job
The command to delete a job is qdel. Its syntax is "qdel Job_ID".

PBS Commands for Deleting Jobs

qdel Job_ID
    Deletes the job identified by Job_ID.

qdel $(qselect -u username)
    Deletes all jobs belonging to user username.

Example: deleting a job with Job_ID 10

    lionxj:~$ qdel 10

Example: deleting all jobs belonging to user abc123

    lionxj:~$ qdel $(qselect -u abc123)
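If qselect finds no jobs, the command substitution expands to nothing and qdel is called with no arguments. A cautious wrapper can check for that case first. The following is a dry-run sketch only: the helper name delete_jobs is hypothetical, and it prints the qdel command rather than running it, so it is safe to try anywhere.

```shell
# Hypothetical dry-run helper: print the qdel command that would be run
# for the given Job_IDs, or a short message if none were given.
# Remove the "echo" in front of qdel to actually delete jobs.
delete_jobs() {
    if [ "$#" -eq 0 ]; then
        echo "No jobs to delete"
        return 0
    fi
    echo qdel "$@"
}

# On a cluster this would be: delete_jobs $(qselect -u abc123)
delete_jobs 10.lionxj 11.lionxj
delete_jobs
```

Reviewing the printed command before removing the echo is a simple way to avoid deleting the wrong jobs.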

Viewing Job Output


By default PBS will write screen output from a job to the following files:

PBS Output Files

Jobname.oJob_ID
    This file contains the non-error output that would normally be written to the screen.

Jobname.eJob_ID
    This file contains the error output that would normally be written to the screen.

If the PBS directive #PBS -j oe is used in a PBS script, the non-error and the error output are both written to the Jobname.oJob_ID file.
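A script that post-processes results may need to work out these file names ahead of time. The sketch below shows one way to do that. The helper name output_files is hypothetical, and it assumes that only the numeric part of the Job_ID appears in the file name (for example myjob.o95), which is worth confirming against the files produced on your system.

```shell
# Hypothetical helper: print the expected output and error file names
# for a job name and Job_ID. ASSUMES the file name uses only the
# numeric part of the Job_ID, before the first dot.
output_files() {
    name=$1
    num=${2%%.*}
    echo "$name.o$num"
    echo "$name.e$num"
}

output_files myjob 95.lionxj.rcc.psu.edu
```

With #PBS -j oe in the job script, only the first of the two names would be expected to exist.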

More Information
More information on PBS and PBS scripts can be found in the man pages for the commands qsub, pbs_resources, qstat, and qdel.

Copyright 2011, The Pennsylvania State University. Research Computing and Cyberinfrastructure is a unit of Information Technology Services.