101: How to submit Slurm batch jobs
An introduction to submitting jobs to the Slurm cluster.
Below is a sample Slurm setup for running a Python script.
Your Python script, example1.py:
print("Hello World")
and the Slurm submission script, example1.slurm:
#!/bin/bash
#
#SBATCH --qos=cu_hpc
#SBATCH --partition=cpu
#SBATCH --job-name=example1
#SBATCH --output=example1.txt
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
module purge
#To get worker node information
hostname
uname -a
more /proc/cpuinfo | grep "model name" | head -1
more /proc/cpuinfo | grep "processor" | wc -l
echo "pwd = "`pwd`
echo "TMPDIR = "$TMPDIR
echo "SLURM_SUBMIT_DIR = "$SLURM_SUBMIT_DIR
echo "SLURM_JOBID = "$SLURM_JOBID
#To run python script
python example1.py
Note that:
- For --qos, you should check which QoS you are assigned. You can check with sacctmgr show assoc format=cluster,user,qos (a sketch of its output is shown after this list). The QoS include cu_hpc, cu_htc, cu_math, cu_long, cu_student, and escience.
- For --partition, you can choose cpu or cpugpu for all QoS, except for cu_math (use the math partition). See details of QoS and partitions here.
- You can also use other shells if you want, not limited to bash. See a tcsh/csh example in the CMSSW example.
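For reference, the sacctmgr command above prints one line per association. The exact column widths and values depend on the site; the cluster name and QoS list below are only illustrative:
   Cluster            User                  QOS
---------- --------------- --------------------
        cu  your_user_name       cu_hpc,cu_long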
To submit the job, use sbatch:
sbatch example1.slurm
You will see something like:
Submitted batch job 81942
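The job ID differs for every submission. If you want to reuse it in a script (an optional addition, not part of the original example), sbatch's --parsable option prints only the job ID so it can be captured in a variable:
# Submit and capture the job ID (value is whatever Slurm assigns)
jobid=$(sbatch --parsable example1.slurm)
echo "Submitted job $jobid"
# Query only that job
squeue -j $jobid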
To check the state of your job, use squeue:
squeue -u your_user_name
In the ST column, R means the job is running and PD means it is pending.
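A typical squeue line looks like the following (job ID, elapsed time, and node are illustrative):
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
             81942       cpu example1 your_use  R       0:05      1 cpu-bladeh-01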
Your output file (example1.txt) should look like:
==========================================
SLURM_JOB_ID = 81943
SLURM_NODELIST = cpu-bladeh-01
==========================================
cpu-bladeh-01.stg
Linux cpu-bladeh-01.stg 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
16
pwd = /work/home/your_user_name/slurm/example1
TMPDIR = /work/scratch/your_user_name/81943
SLURM_SUBMIT_DIR = /work/scratch/your_user_name/81943
SLURM_JOBID = 81943
Hello World
The following variant of example1.slurm saves the submission directory, moves to $TMPDIR to run the Python script there, and copies the log file back:
#!/bin/bash
#
#SBATCH --qos=cu_hpc
#SBATCH --partition=cpu
#SBATCH --job-name=example1
#SBATCH --output=example1.txt
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
module purge
#To get worker node information
hostname
uname -a
more /proc/cpuinfo | grep "model name" | head -1
more /proc/cpuinfo | grep "processor" | wc -l
#To set your submission directory
echo "pwd = "`pwd`
export MYCODEDIR=`pwd`
#Check PATHs
echo "MYCODEDIR = "$MYCODEDIR
echo "TMPDIR = "$TMPDIR
echo "SLURM_SUBMIT_DIR = "$SLURM_SUBMIT_DIR
echo "SLURM_JOBID = "$SLURM_JOBID
#Move to TMPDIR and run python script
cp example1.py $TMPDIR
cd $TMPDIR
python example1.py >| test.log
ls -l
cp -rf test.log $MYCODEDIR/
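Here $TMPDIR points to a per-job scratch directory (e.g. /work/scratch/your_user_name/81943 in the sample output above). Once the job finishes, test.log is copied back next to example1.slurm, so from the submission directory you can check it with:
cat test.log
It should contain Hello World, since the Python output was redirected into that file.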