qsub to sbatch Translation

Side-by-Side Comparison of Slurm and Moab/Torque

Slurm differs from Torque in several ways, including the commands used to submit and monitor jobs, the syntax used to request resources, and the way environment variables behave. Two differences to keep in mind:

  • What Torque calls queues, Slurm calls partitions.
  • In Slurm, the environment variables of the submitting process are passed to the job by default.

Submitting jobs

To submit jobs in Slurm, replace qsub with one of the commands from the table below.

Torque and Slurm Commands for Submitting Jobs

| Task | Torque Command | Slurm Command |
| --- | --- | --- |
| Submit a batch job to the queue | qsub <job script> | sbatch <job script> |
| Start an interactive job | qsub -I <options> | sinteractive <options> |

Here <job script> should be replaced with the name of your job submission script (e.g. slurm_job.sh). The sections below describe the changes needed when converting Torque syntax to Slurm; a minimal conversion example follows.
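
As an illustration, here is a small Torque script and a Slurm equivalent built from the options table below. The job name, queue/partition name, and program are placeholders; adjust them for your site.

    #!/bin/bash
    # Torque version, submitted with: qsub job.sh
    #PBS -N myjob
    #PBS -q batch
    #PBS -l walltime=01:00:00
    #PBS -l nodes=1:ppn=4
    cd $PBS_O_WORKDIR
    ./my_program

    #!/bin/bash
    # Slurm version, submitted with: sbatch job.sh
    #SBATCH --job-name=myjob
    #SBATCH --partition=batch
    #SBATCH --time=01:00:00
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=4
    ./my_program    # no cd needed: Slurm jobs start in the submit directory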

Job Submission Options

In Slurm, as with Torque, job options and resource requests can be set either in the job script or on the command line when submitting the job. The table below summarizes the most common options; a command-line example follows the table.

Options for Job Submission

| Option | Torque (qsub) | Slurm (sbatch) |
| --- | --- | --- |
| Script directive | #PBS | #SBATCH |
| Job name | -N <name> | --job-name=<name> or -J <name> |
| Queue | -q <queue> | --partition=<queue> |
| Wall time limit | -l walltime=<hh:mm:ss> | --time=<hh:mm:ss> |
| Node count | -l nodes=<count> | --nodes=<count> or -N <count> |
| Process count per node | -l ppn=<count> | --ntasks-per-node=<count> |
| Core count (per process) |  | --cpus-per-task=<cores> |
| Memory limit | -l mem=<limit> | --mem=<limit> (memory per node, in megabytes) |
| Minimum memory per processor | -l pmem=<limit> | --mem-per-cpu=<memory> |
| Request GPUs | -l gpus=<count> | --gres=gpu:<count> |
| Request specific nodes | -l nodes=<node>[,node2[,...]] | -w, --nodelist=<node>[,node2[,...]] or -F, --nodefile=<node file> |
| Request node feature | -l nodes=<count>:ppn=<count>:<feature> | --constraint=<feature> |
| Standard output file | -o <file path> | --output=<file path> (path must exist) |
| Standard error file | -e <file path> | --error=<file path> (path must exist) |
| Combine stdout/stderr to stdout | -j oe | --output=<combined out and err file path> |
| Copy environment | -V | --export=ALL (default); --export=NONE to not export the environment |
| Copy environment variable | -v <variable[=value][,variable2=value2[,...]]> | --export=<variable[=value][,variable2=value2[,...]]> |
| Job dependency | -W depend=<type>:jobID[:jobID...] where <type> is after, afterok, afternotok, or afterany | --dependency=<type>:jobID[:jobID...] where <type> is after, afterok, afternotok, or afterany |
| Request event notification | -m <events> | --mail-type=<events> (multiple mail-type values may be specified in a comma-separated list, e.g. --mail-type=BEGIN,END,NONE,FAIL,REQUEUE) |
| Email address | -M <email address> | --mail-user=<email address> |
| Defer job until the specified time | -a <date/time> | --begin=<date/time> |
| Node exclusive job | qsub -n | --exclusive |
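
For example, the same resources can be requested on the command line at submission time; options given to sbatch this way override matching #SBATCH directives in the script. The job name, partition, and script name here are placeholders.

    # Equivalent to putting the same #SBATCH directives inside myjob.sh
    sbatch --job-name=myjob --partition=batch --time=01:00:00 \
           --nodes=2 --ntasks-per-node=8 --mem=4000 myjob.sh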

Common Job Commands

Job submission and management

| Task | Torque Command | Slurm Command |
| --- | --- | --- |
| Submit a job | qsub <job script> | sbatch <job script> |
| Delete a job | qdel <job ID> | scancel <job ID> |
| Hold a job | qhold <job ID> | scontrol hold <job ID> |
| Release a job | qrls <job ID> | scontrol release <job ID> |
| Start an interactive job | qsub -I <options> | sinteractive <options> |
| Start an interactive job with X forwarding | qsub -I -X <options> | sinteractive <options> |
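
As a sketch of how these commands fit together with the dependency option from the table above (step1.sh and step2.sh are hypothetical job scripts; --parsable makes sbatch print only the job ID):

    jobid=$(sbatch --parsable step1.sh)          # submit and capture the job ID
    sbatch --dependency=afterok:$jobid step2.sh  # run only if step1 completes successfully
    scontrol hold $jobid                         # keep step1 from starting for now
    scontrol release $jobid                      # allow it to be scheduled again
    scancel $jobid                               # or cancel it (step2 would then never start)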

Monitoring Resources on the Cluster  

Torque and Slurm Commands for Resource Monitoring

| Task | Torque Command | Slurm Command |
| --- | --- | --- |
| Queue list / info | qstat -q [queue] | scontrol show partition [queue] |
| Node list | pbsnodes -a or mdiag -n -v | scontrol show nodes |
| Node details | pbsnodes <node> | scontrol show node <node> |
| Cluster status | qstat -B | sinfo |

Monitoring Jobs

Torque and Slurm Commands for Monitoring Jobs

| Info | Torque Command | Slurm Command |
| --- | --- | --- |
| Job status (all) | qstat or showq | squeue |
| Job status (by job) | qstat <job ID> | squeue -j <job ID> |
| Job status (by user) | qstat -u <user> | squeue -u <user> |
| Job status (only own jobs) | qstat_me | squeue --me or squeue --me -l |
| Job status (detailed) | qstat -f <job ID> or checkjob <job ID> | scontrol show job -dd <job ID> |
| Show expected start time | showstart <job ID> | squeue -j <job ID> --start |
| Monitor or review a job's resource usage | qstat -f <job ID> | sacct -j <job ID> --format JobID,JobName,NTasks,NodeList,CPUTime,ReqMem,Elapsed |
| View job batch script |  | scontrol write batch_script <job ID> [filename] |
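
For example (the job ID 123456 is a placeholder; MaxRSS and State are additional sacct fields beyond those shown above):

    # While the job is queued or running: check status and estimated start time
    squeue -j 123456 --start
    # After it finishes: summarize what it actually used
    sacct -j 123456 --format=JobID,JobName,Elapsed,MaxRSS,State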

Valid Job States

Codes from Slurm for Possible Job States

| Code | State | Meaning |
| --- | --- | --- |
| CA | Canceled | Job was canceled |
| CD | Completed | Job completed |
| CF | Configuring | Job resources being configured |
| CG | Completing | Job is completing |
| F | Failed | Job terminated with non-zero exit code |
| NF | Node Fail | Job terminated due to failure of node(s) |
| PD | Pending | Job is waiting for compute node(s) |
| R | Running | Job is running on compute node(s) |
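
These codes appear in the ST column of squeue output, and squeue can filter on them, for example:

    # Show only your pending and running jobs (compact codes or full state names both work)
    squeue --me --states=PD,R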

Job Environment and Environment Variables

Slurm sets its own environment variables within a job, as does Torque. A summary is in the table below.

Environment Variables for Torque and Slurm Jobs

| Info | Torque | Slurm | Notes |
| --- | --- | --- | --- |
| Version | $PBS_VERSION |  | Can extract from sbatch --version |
| Job name | $PBS_JOBNAME | $SLURM_JOB_NAME |  |
| Job ID | $PBS_JOBID | $SLURM_JOB_ID |  |
| Batch or interactive | $PBS_ENVIRONMENT |  |  |
| Submit directory | $PBS_O_WORKDIR | $SLURM_SUBMIT_DIR | Slurm jobs start from the submit directory by default. |
| Submit host | $PBS_O_HOST | $SLURM_SUBMIT_HOST |  |
| Node file | $PBS_NODEFILE |  | A filename and path that lists the nodes a job has been allocated. |
| Node list | cat $PBS_NODEFILE | $SLURM_JOB_NODELIST | The Slurm variable has a different format to the Torque/PBS one. To get a list of nodes, use: scontrol show hostnames $SLURM_JOB_NODELIST |
| Walltime | $PBS_WALLTIME |  |  |
| Queue name | $PBS_QUEUE | $SLURM_JOB_PARTITION |  |
| Number of nodes allocated | $PBS_NUM_NODES | $SLURM_JOB_NUM_NODES or $SLURM_NNODES |  |
| Number of processes | $PBS_NP | $SLURM_NTASKS |  |
| Number of processes per node | $PBS_NUM_PPN | $SLURM_TASKS_PER_NODE |  |
| List of allocated GPUs | $PBS_GPUFILE |  |  |
| Requested tasks per node |  | $SLURM_NTASKS_PER_NODE |  |
| Requested CPUs per task |  | $SLURM_CPUS_PER_TASK |  |
| Scheduling priority |  | $SLURM_PRIO_PROCESS |  |
| Job user |  | $SLURM_JOB_USER |  |
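
A short sketch showing some of these variables in use inside a job script (the program and file names are placeholders):

    #!/bin/bash
    #SBATCH --job-name=envdemo
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=4

    echo "Job $SLURM_JOB_ID ($SLURM_JOB_NAME) in partition $SLURM_JOB_PARTITION"
    echo "Running $SLURM_NTASKS tasks on $SLURM_JOB_NUM_NODES nodes"

    # Torque scripts often read $PBS_NODEFILE; an equivalent file can be built with:
    scontrol show hostnames "$SLURM_JOB_NODELIST" > nodes.txt

    # No cd is needed: the job already starts in $SLURM_SUBMIT_DIR
    ./my_program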

Slurm Documentation

Extensive documentation on Slurm is available at https://slurm.schedmd.com/documentation.html