Job Management on Beocat
Beocat uses a fork of the OpenPBS batch system called Torque for job management. It supports all of the normal PBS commands and attributes, and adds some extensions.
Basic introduction
qsub is the tool used to submit jobs to the cluster. A 'job' is simply a shell script. So, for an extremely simple example, my_job.sh is a short shell script that simply reports back the name of the machine and uptime.
#!/bin/bash
echo "$(hostname): $(uptime)"
Please note that the first line is actually unnecessary. All scripts are run through the shell you specify to qsub, defaulting to your normal shell. In fact, job scripts need not even be executable. (The first line was just handy for debugging the script interactively.)
kuffs@[clotho] ~ % qsub my_job.sh
213.clotho.beocat
So we have submitted our job and it was given job number 213. You can use qstat to view the status of the jobs in the queue:
kuffs@[clotho] ~ % qstat
Job id              Name             User            Time Use S Queue
------------------- ---------------- --------------- -------- - -----
213.clotho          my_job.sh        kuffs                  0 Q batch
In this listing we see our job, its name, who launched it, and other details. Under the 'S' (status) column there is a 'Q', meaning that the job is queued and waiting to be assigned somewhere to run. This isn't a big deal; it usually takes a few seconds for the scheduler to figure out what it is going to do.
Sure enough, try a few seconds later and we get new information.
kuffs@[clotho] ~ % qstat
Job id              Name             User            Time Use S Queue
------------------- ---------------- --------------- -------- - -----
213.clotho          my_job.sh        kuffs                  0 E batch
Look under the status column: it is now 'E', meaning the job is exiting after finishing.
A little while later, we check on the status again:
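As a quick reference for reading the 'S' column, here is a minimal sketch of a helper that turns the one-letter state into words. The function name describe_job_state is made up for this example; the letters and meanings follow the Torque qstat man page, so check that page on your system if a letter shows up that is not listed here.

```shell
# Translate the one-letter code from the 'S' column of qstat into words.
# describe_job_state is a hypothetical helper, not part of Torque.
describe_job_state() {
    case "$1" in
        Q) echo "queued, waiting to be scheduled" ;;
        R) echo "running" ;;
        E) echo "exiting after having run" ;;
        C) echo "completed" ;;
        H) echo "held" ;;
        T) echo "being moved to a new location" ;;
        W) echo "waiting for its execution time" ;;
        S) echo "suspended" ;;
        *) echo "unknown state: $1"; return 1 ;;
    esac
}

# The two states seen in the walkthrough.
describe_job_state Q
describe_job_state E
```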
kuffs@[clotho] ~ % qstat
This time we don't receive any output. A quick look in our current directory, however, reveals two new files: 'my_job.sh.e213' and 'my_job.sh.o213'. These two files hold the text written to STDERR and STDOUT by your job, respectively.
kuffs@[clotho] ~ % cat my_job.sh.e213
A quick peek in my_job.sh.e213 shows there is no error output.
kuffs@[clotho] ~ % cat my_job.sh.o213
virgrack6: 14:17:40 up 9 days, 2:47, 0 users, load average: 2.64, 2.28, 2.21
And my_job.sh.o213 has the output from the program.
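The file names above follow a simple pattern: the job name, then '.o' or '.e', then the numeric part of the job id. A small sketch of that rule, useful when a wrapper script needs to find the output afterwards. The helper name job_output_files is invented for this example, and it assumes the default naming (no -o or -e options given to qsub).

```shell
# Print the default stdout and stderr file names for a job.
#   $1: job name (the script name, unless -N was given to qsub)
#   $2: job id as printed by qsub, e.g. 213.clotho.beocat
# job_output_files is a hypothetical helper, not part of Torque.
job_output_files() (
    name=$1
    seq=${2%%.*}    # keep only the part before the first dot: 213
    echo "$name.o$seq"
    echo "$name.e$seq"
)

job_output_files my_job.sh 213.clotho.beocat
```

In a real session the second argument would normally come straight from qsub, for example by capturing its output in a variable.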
I encourage you to check out the man pages for both qsub and qstat for much more information.
Selecting specific nodes
The biggest part of job submission is definitely allocating the correct resources for your job. The official documentation is quite good in this respect.
Check it out at http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml
I will add a few notes, though:
- Use the program pbsnodes to get a listing of the nodes and their attributes so you can choose the right kind of equipment for your task.
- The machines with the Infiniband (read: high-speed) interconnects have the special property 'ib' defined to make selection easy.
- 32-bit machines have the property 'x86' defined; 64-bit systems have 'x86_64' defined instead.
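To make the notes above concrete, here is a rough sketch of filtering a node listing by property. It assumes the usual layout of 'pbsnodes -a' output (an unindented node name followed by indented 'key = value' lines). The helper name nodes_with_property is invented, and the sample listing is made up for illustration; it is not Beocat's real node list.

```shell
# Read "pbsnodes -a" style output on stdin and print the names of nodes
# whose "properties" line lists the property given as $1.
# nodes_with_property is a hypothetical helper, not part of Torque.
nodes_with_property() {
    awk -v want="$1" '
        /^[^ \t]/ { node = $1 }                  # unindented line: node name
        $1 == "properties" && $2 == "=" {
            n = split($3, props, ",")            # properties are comma-separated
            for (i = 1; i <= n; i++)
                if (props[i] == want) print node
        }
    '
}

# Made-up sample listing; node names and properties are assumptions.
sample='virgrack6
     state = free
     np = 2
     properties = ib,x86_64

oldnode1
     state = free
     np = 1
     properties = x86
'
printf '%s' "$sample" | nodes_with_property ib
```

On the cluster itself this would be run as 'pbsnodes -a | nodes_with_property ib'. A property can then be requested at submission time with the node specification syntax described in the official documentation linked above, along the lines of 'qsub -l nodes=1:ib my_job.sh'.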
Helpful tips
Scripting trick
To save yourself a bit of typing on the command line when testing, all the qsub parameters can be added as special comments in your job's shell script.
For example, let's say you have these parameters to give to qsub:
>qsub -l walltime=20:00:00 -S /bin/bash -N sim_pass3 my_job.sh
This would run the script my_job.sh using /bin/bash as the shell, with a maximum run time of 20 hours, under the name 'sim_pass3'. This could get to be a pain to remember and retype every time. Fortunately, we can use special comments in my_job.sh that make our life easier:
#!/bin/bash
#PBS -l walltime=20:00:00
#PBS -S /bin/bash
#PBS -N sim_pass3
echo "$(hostname): $(uptime)"
Now you would submit the job with the much simpler command:
>qsub my_job.sh
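When a script carries its own options, it can be handy to see at a glance what qsub will pick up. The following is only a sketch: pbs_directives is an invented helper, and it approximates qsub's behaviour of reading '#PBS' comments only until the first real command in the script.

```shell
# Print the qsub options embedded in a job script as "#PBS" comments.
# Scanning stops at the first line that is neither blank nor a comment,
# approximating how qsub stops looking for directives at the first command.
# pbs_directives is a hypothetical helper, not part of Torque.
pbs_directives() {
    awk '
        /^#PBS[ \t]/       { sub(/^#PBS[ \t]+/, ""); print; next }
        /^[ \t]*$/ || /^#/ { next }     # blank lines and other comments
        { exit }                        # first command: stop scanning
    ' "$1"
}

# Try it on a temporary copy of the example script, plus one late directive.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/bash
#PBS -l walltime=20:00:00
#PBS -S /bin/bash
#PBS -N sim_pass3
echo "$(hostname): $(uptime)"
#PBS -N placed_too_late
EOF
pbs_directives "$script"
rm -f "$script"
```

The last '#PBS' line in the sample sits after a command, so the sketch skips it; keeping all directives at the top of the script avoids that surprise.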
Troubleshooting
Job won't start, always in queued state
So you've submitted a job, and can't figure out why it won't start.
kuffs@[clotho] ~ % qsub my_job.sh -l mem=100G
219.clotho.beocat
kuffs@[clotho] ~ % qstat -n
clotho.beocat:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
219.clotho.beocat    kuffs    batch    my_job.sh      --     1  --  100gb 00:05 Q   --
    --
Everything looks fine, except that the scheduler doesn't seem to want to select nodes for the job to run on. There is a handy tool, checkjob, for investigating this.
kuffs@[clotho] ~ % checkjob 219
checking job 219

State: Idle  EState: Deferred
Creds:  user:kuffs  group:kuffs_users  class:batch  qos:DEFAULT
WallTime: 00:00:00 of 00:05:00
SubmitTime: Wed Nov 15 17:09:03
  (Time Queued  Total: 00:14:14  Eligible: 00:00:15)

Total Tasks: 1

Req[0]  TaskCount: 1  Partition: ALL
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
Dedicated Resources Per Task: PROCS: 1  MEM: 100G

IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 0
PartitionMask: [ALL]
Flags:       RESTARTABLE

job is deferred.  Reason:  NoResources  (cannot create reservation for job '219' (intital reservation attempt) )
Holds:    Defer  (hold reason:  NoResources)
PE:  38.18  StartPriority:  1
cannot select job 219 for partition DEFAULT (job hold active)
This incredibly detailed output tells us that the scheduler can't find the resources to give our task. Poking through more of the output, we see 'Dedicated Resources Per Task: PROCS: 1 MEM: 100G'... doh! The scheduler happens to be right: there are no nodes in the cluster that have 100G of RAM.
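Since the interesting part of that output is a single line, a short filter can pull the deferral reason out for you. This is a sketch that assumes the 'job is deferred. Reason: ...' wording shown in the transcript above; deferral_reason is an invented name, and the sample line is abbreviated from that transcript.

```shell
# Read checkjob output on stdin and print the deferral reason, if any.
# deferral_reason is a hypothetical helper; it relies on the
# "job is deferred.  Reason:  <Word>" wording seen in the transcript.
deferral_reason() {
    sed -n 's/.*job is deferred\. *Reason: *\([A-Za-z]*\).*/\1/p'
}

# Abbreviated sample line taken from the checkjob output for job 219.
printf '%s\n' "job is deferred.  Reason:  NoResources  (cannot create reservation for job '219')" | deferral_reason
```

On the cluster this would be used as 'checkjob 219 | deferral_reason'; it prints nothing for a job that is not deferred.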
So we delete the broken job.
kuffs@[clotho] ~ % qdel 219
And submit our job, sans the typo.
kuffs@[clotho] ~ % qsub my_job.sh -l mem=100m
221.clotho.beocat
And things run just fine.
kuffs@[clotho] ~ % checkjob 221
checking job 221

State: Running
Creds:  user:kuffs  group:kuffs_users  class:batch  qos:DEFAULT
WallTime: 00:00:00 of 00:05:00
SubmitTime: Wed Nov 15 17:30:17
  (Time Queued  Total: 00:00:14  Eligible: 00:00:14)

StartTime: Wed Nov 15 17:30:31
Total Tasks: 1

Req[0]  TaskCount: 1  Partition: DEFAULT
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
Dedicated Resources Per Task: PROCS: 1  MEM: 100M
Allocated Nodes:
[virgrack6:1]

IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [ALL]
Flags:       RESTARTABLE

Reservation '221' (00:00:00 -> 00:05:00  Duration: 00:05:00)
PE:  1.00  StartPriority:  1
The scheduler has chosen one of the processors on virgrack6 for our task.
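The 100G versus 100m slip in this section is the kind of mistake a small check before submission could catch. The sketch below converts a memory request to megabytes and compares it with the memory of the largest node. Both function names are invented, only the k, m, g, and t suffixes are handled, and the 8192 MB figure is a placeholder, not Beocat's actual hardware.

```shell
# Convert a memory request such as 100G, 100gb, or 100m to whole megabytes.
# Hypothetical helper; handles only k, m, g, t suffixes (optional trailing b).
mem_to_mb() (
    value=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    number=${value%%[a-z]*}     # leading digits
    unit=${value#"$number"}     # whatever follows them
    case "$number" in
        ''|*[!0-9]*) echo "bad number in '$1'" >&2; exit 1 ;;
    esac
    case "$unit" in
        k|kb) echo $(( number / 1024 )) ;;
        m|mb) echo "$number" ;;
        g|gb) echo $(( number * 1024 )) ;;
        t|tb) echo $(( number * 1024 * 1024 )) ;;
        *)    echo "unsupported unit in '$1'" >&2; exit 1 ;;
    esac
)

# Succeed only if request $1 fits within $2 megabytes.
fits_largest_node() (
    request_mb=$(mem_to_mb "$1") || exit 2
    [ "$request_mb" -le "$2" ]
)

largest_node_mb=8192    # placeholder value, an assumption for this example
for request in 100G 100m; do
    if fits_largest_node "$request" "$largest_node_mb"; then
        echo "$request: fits on a node"
    else
        echo "$request: larger than any node, the job would stay queued"
    fi
done
```

The real limit for a cluster can be read from the node listing that pbsnodes prints, rather than hard-coded as it is here.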