qsub to sbatch Translation
Side-by-Side Comparison of Slurm and Moab/Torque
Slurm differs from Torque in several ways, including the commands used to submit and monitor jobs, the syntax used to request resources, and the way environment variables behave.
Some specific differences include:
- What Torque calls queues, Slurm calls partitions
- In Slurm, environment variables of the submitting process are passed to the job by default
Submitting Jobs
To submit jobs in Slurm, replace qsub with one of the commands from the table below.
Task | Torque Command | Slurm Command |
---|---|---|
Submit a batch job to the queue | `qsub <job script>` | `sbatch <job script>` |
Start an interactive job | `qsub -I` | `salloc` |
where `<job script>` needs to be replaced by the name of your job submission script (e.g. `slurm_job.sh`). See below for the changes that need to be made when converting Torque syntax into Slurm syntax.
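For example, a script previously submitted with qsub is submitted to Slurm as shown below (both script names are placeholders):

```bash
# Torque/Moab submission
qsub torque_job.sh

# Slurm equivalent
sbatch slurm_job.sh
```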
Job Submission Options
In Slurm, as with Torque, job options and resource requests can either be set in the job script or at the command line when submitting the job. Below is a summary table.
Option | Torque (qsub) | Slurm (sbatch) |
---|---|---|
Script directive | `#PBS` | `#SBATCH` |
Job name | `-N <name>` | `--job-name=<name>` |
Queue | `-q <queue>` | `--partition=<partition>` |
Wall time limit | `-l walltime=<hh:mm:ss>` | `--time=<hh:mm:ss>` |
Node count | `-l nodes=<count>` | `--nodes=<count>` |
Process count per node | `-l ppn=<count>` | `--ntasks-per-node=<count>` |
Core count (per process) | – | `--cpus-per-task=<count>` |
Memory limit | `-l mem=<limit>` | `--mem=<limit>` |
Minimum memory per processor | `-l pmem=<limit>` | `--mem-per-cpu=<limit>` |
Request GPUs | `-l gpus=<count>` | `--gres=gpu:<count>` |
Request specific nodes | `-l nodes=<node>[+<node2>...]` | `--nodelist=<node>[,<node2>,...]` |
Request node feature | `-l nodes=<count>:<feature>` | `--constraint=<feature>` |
Standard output file | `-o <file path>` | `--output=<file path>` |
Standard error file | `-e <file path>` | `--error=<file path>` |
Combine stdout/stderr to stdout | `-j oe` | `--output=<combined out and err file path>` |
Copy environment | `-V` | `--export=ALL` (default) |
Copy environment variable | `-v <variable[=value][,...]>` | `--export=<variable[=value][,...]>` |
Job dependency | `-W depend=afterok:<job ID>` | `--dependency=afterok:<job ID>` |
Request event notification | `-m <events>` | `--mail-type=<events>` |
Email address | `-M <email address>` | `--mail-user=<email address>` |
Defer job until the specified time | `-a <date/time>` | `--begin=<date/time>` |
Node exclusive job | `-n` | `--exclusive` |
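To illustrate how these options map, below is a sketch of a Slurm job script with the equivalent Torque directives shown as comments; the job name, partition, resource values, email address, and program name are placeholders and should be adapted to your site:

```bash
#!/bin/bash
#SBATCH --job-name=myjob              # Torque: #PBS -N myjob
#SBATCH --partition=normal            # Torque: #PBS -q normal
#SBATCH --time=01:00:00               # Torque: #PBS -l walltime=01:00:00
#SBATCH --nodes=2                     # Torque: #PBS -l nodes=2:ppn=8
#SBATCH --ntasks-per-node=8
#SBATCH --mem=16G                     # Torque: #PBS -l mem=16gb
#SBATCH --output=myjob_%j.out         # Torque: #PBS -o myjob.out
#SBATCH --mail-type=END,FAIL          # Torque: #PBS -m ae
#SBATCH --mail-user=user@example.com  # Torque: #PBS -M user@example.com

# Slurm starts the job in the submit directory and exports the submitting
# environment by default, so no "cd $PBS_O_WORKDIR" or -V is needed.
srun ./my_program
```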
Common Job Commands
Task | Torque Command | Slurm Command |
---|---|---|
Submit a job | `qsub <job script>` | `sbatch <job script>` |
Delete a job | `qdel <job ID>` | `scancel <job ID>` |
Hold a job | `qhold <job ID>` | `scontrol hold <job ID>` |
Release a job | `qrls <job ID>` | `scontrol release <job ID>` |
Start an interactive job | `qsub -I <options>` | `salloc <options>` |
Start an interactive job with X forwarding | `qsub -I -X <options>` | `salloc --x11 <options>` |
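For example, assuming a job ID of 12345 (a placeholder), the day-to-day commands translate as follows:

```bash
sbatch slurm_job.sh      # replaces: qsub torque_job.sh
scontrol hold 12345      # replaces: qhold 12345
scontrol release 12345   # replaces: qrls 12345
scancel 12345            # replaces: qdel 12345

# Interactive session for 30 minutes on one node (replaces: qsub -I)
salloc --nodes=1 --time=00:30:00
```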
Monitoring Resources on the Cluster
Task | Torque Command | Slurm Command |
---|---|---|
Queue list / info | `qstat -q [queue]` | `sinfo [-p <partition>]` |
Node list | `pbsnodes -l` | `sinfo -N` |
Node details | `pbsnodes <node>` | `scontrol show node <node>` |
Cluster status | `qstat -B` | `sinfo` |
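A few usage examples (the node name is a placeholder):

```bash
sinfo                        # partition (queue) summary, replaces qstat -q
sinfo -N -l                  # one line per node, replaces pbsnodes -l
scontrol show node node001   # details for a single node, replaces pbsnodes node001
```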
Monitoring Jobs
Info | Torque Command | Slurm Command |
---|---|---|
Job status (all) | `qstat` | `squeue` |
Job status (by job) | `qstat <job ID>` | `squeue -j <job ID>` |
Job status (by user) | `qstat -u <user>` | `squeue -u <user>` |
Job status (only own jobs) | `qstat -u $USER` | `squeue --me` |
Job status (detailed) | `qstat -f <job ID>` | `scontrol show job -dd <job ID>` |
Show expected start time | `showstart <job ID>` | `squeue -j <job ID> --start` |
Monitor or review a job’s resource usage | `checkjob <job ID>` | `sacct -j <job ID>` (completed) or `sstat -j <job ID>` (running) |
View job batch script | – | `scontrol write batch_script <job ID> [filename]` |
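For example (12345 is a placeholder job ID):

```bash
squeue --me                   # only your own jobs
squeue -j 12345 --start       # expected start time for a pending job
scontrol show job -dd 12345   # detailed job information
sacct -j 12345 --format=JobID,JobName,Elapsed,MaxRSS,State   # resource usage
```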
Valid Job States
Code | State | Meaning |
---|---|---|
CA | Canceled | Job was canceled |
CD | Completed | Job completed |
CF | Configuring | Job resources being configured |
CG | Completing | Job is completing |
F | Failed | Job terminated with non-zero exit code |
NF | Node Fail | Job terminated due to failure of node(s) |
PD | Pending | Job is waiting for compute node(s) |
R | Running | Job is running on compute node(s) |
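These codes appear in the ST column of squeue output and can be used to filter the listing, for example:

```bash
squeue --me -t PENDING,RUNNING   # show only your pending and running jobs
```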
Job Environment and Environment Variables
Slurm sets its own environment variables within a job, as does Torque. A summary is in the table below.
Info | Torque | Slurm | Notes |
---|---|---|---|
Version | `$PBS_VERSION` | – | Can extract from `sbatch --version` |
Job name | `$PBS_JOBNAME` | `$SLURM_JOB_NAME` | |
Job ID | `$PBS_JOBID` | `$SLURM_JOB_ID` | |
Batch or interactive | `$PBS_ENVIRONMENT` | – | |
Submit directory | `$PBS_O_WORKDIR` | `$SLURM_SUBMIT_DIR` | Slurm jobs start from the submit directory by default. |
Submit host | `$PBS_O_HOST` | `$SLURM_SUBMIT_HOST` | |
Node file | `$PBS_NODEFILE` | – | A filename and path that lists the nodes a job has been allocated. |
Node list | `cat $PBS_NODEFILE` | `$SLURM_JOB_NODELIST` | The Slurm variable has a different format to the Torque/PBS one. To get a list of nodes use: `scontrol show hostnames $SLURM_JOB_NODELIST` |
Walltime | `$PBS_WALLTIME` | – | |
Queue name | `$PBS_QUEUE` | `$SLURM_JOB_PARTITION` | |
Number of nodes allocated | `$PBS_NUM_NODES` | `$SLURM_JOB_NUM_NODES` | |
Number of processes | `$PBS_NP` | `$SLURM_NTASKS` | |
Number of processes per node | `$PBS_NUM_PPN` | `$SLURM_TASKS_PER_NODE` | |
List of allocated GPUs | `$PBS_GPUFILE` | – | |
Requested tasks per node | – | `$SLURM_NTASKS_PER_NODE` | |
Requested CPUs per task | – | `$SLURM_CPUS_PER_TASK` | |
Scheduling priority | – | `$SLURM_PRIO_PROCESS` | |
Job user | – | `$SLURM_JOB_USER` | |
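A minimal sketch of how some of these variables might be used inside a Slurm job script (the program name and node file name are placeholders):

```bash
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00

echo "Job ${SLURM_JOB_ID} (${SLURM_JOB_NAME}) submitted from ${SLURM_SUBMIT_DIR}"
echo "Running ${SLURM_NTASKS} tasks on ${SLURM_JOB_NUM_NODES} nodes"

# Recreate a Torque-style node file from the Slurm node list
scontrol show hostnames "${SLURM_JOB_NODELIST}" > nodefile.txt

srun ./my_program
```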
Slurm Documentation
Extensive documentation on Slurm is available at https://slurm.schedmd.com/documentation.html