101: How to submit Slurm batch jobs
An introduction to submitting a job to the Slurm cluster.
Below is a sample Slurm script for running Python code.
Your Python script, example1.py:
print("Hello World")
And the Slurm submission script, example1.slurm:
#!/bin/bash
#
#SBATCH --qos=cu_hpc
#SBATCH --partition=cpu
#SBATCH --job-name=example1
#SBATCH --output=example1.txt
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
module purge
#To get worker node information
hostname
uname -a
grep "model name" /proc/cpuinfo | head -1
grep -c "processor" /proc/cpuinfo
echo "pwd = $(pwd)"
echo "TMPDIR = $TMPDIR"
echo "SLURM_SUBMIT_DIR = $SLURM_SUBMIT_DIR"
echo "SLURM_JOBID = $SLURM_JOBID"
#To run python script
python example1.py
Note that:
For --qos, check which QoS you are assigned. You can check by using
sacctmgr show assoc format=cluster,user,qos
The QoS options include cu_hpc, cu_htc, cu_math, cu_long, cu_student, and escience.
For --partition, you can choose cpu or cpugpu for all QoS except cu_math (use the math partition). See details of QoS and partitions here.
You can also use other shells if you want; you are not limited to bash. See an example of tcsh/csh in the CMSSW example.
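The QoS and partition rule in the notes above can be checked before you write the #SBATCH lines. The following is a minimal sketch in Python, not a site tool: the names allowed_partitions and sbatch_header are hypothetical, and the rule and QoS names are taken only from the notes above, which may not be a complete list.

```python
# Hypothetical helper: encodes only the rule stated in the notes above.
KNOWN_QOS = ("cu_hpc", "cu_htc", "cu_math", "cu_long", "cu_student", "escience")


def allowed_partitions(qos):
    """Return the partitions this page lists for a given QoS."""
    if qos not in KNOWN_QOS:
        raise ValueError(f"unknown QoS: {qos!r}")
    if qos == "cu_math":
        return ("math",)
    return ("cpu", "cpugpu")


def sbatch_header(qos, partition):
    """Build the two #SBATCH lines after checking the combination."""
    if partition not in allowed_partitions(qos):
        raise ValueError(f"partition {partition!r} is not listed for QoS {qos!r}")
    return f"#SBATCH --qos={qos}\n#SBATCH --partition={partition}"


if __name__ == "__main__":
    print(sbatch_header("cu_hpc", "cpu"))
```

A combination such as cu_math with cpu raises ValueError, so a mistake shows up before the job is submitted.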
To submit the job, use sbatch:
sbatch example1.slurm
You will see
Submitted batch job 81942
To check which state your job is in:
squeue -u your_user_name
In the ST column, R is running and PD is pending.
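If you want to script these two steps, the job ID can be read from the sbatch message and the state from the ST column of the squeue listing. The following is a minimal sketch in Python, assuming the message has the form shown above and that squeue prints its default header with JOBID and ST columns. The function names and the sample listing are made up for illustration, only a few common state codes are included, and job names containing spaces would break the simple whitespace split.

```python
import re

# A few common squeue state codes; the page above names R and PD.
STATE_NAMES = {
    "PD": "pending",
    "R": "running",
    "CG": "completing",
    "CD": "completed",
    "F": "failed",
    "CA": "cancelled",
    "TO": "timeout",
}


def parse_job_id(sbatch_message):
    """Extract the job ID from a line like 'Submitted batch job 81942'."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_message)
    if match is None:
        raise ValueError(f"unexpected sbatch output: {sbatch_message!r}")
    return int(match.group(1))


def job_states(squeue_text):
    """Map job ID (as text) to a readable state, locating columns by header."""
    lines = [line for line in squeue_text.splitlines() if line.strip()]
    header = lines[0].split()
    id_col, st_col = header.index("JOBID"), header.index("ST")
    states = {}
    for line in lines[1:]:
        fields = line.split()
        code = fields[st_col]
        # Unknown codes are passed through unchanged.
        states[fields[id_col]] = STATE_NAMES.get(code, code)
    return states


if __name__ == "__main__":
    # Illustrative text only, not captured from a real cluster.
    sample = """\
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  81942       cpu example1 your_use  R       0:05      1 cpu-bladeh-01
  81950       cpu example1 your_use PD       0:00      1 (Priority)
"""
    print(parse_job_id("Submitted batch job 81942"))
    print(job_states(sample))
```

On a login node the text would typically come from running squeue through subprocess.run with capture_output=True and text=True, then passing its stdout to job_states.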
Your output should look like
==========================================
SLURM_JOB_ID = 81943
SLURM_NODELIST = cpu-bladeh-01
==========================================
cpu-bladeh-01.stg
Linux cpu-bladeh-01.stg 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
model name	: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
16
pwd = /work/home/your_user_name/slurm/example1
TMPDIR = /work/scratch/your_user_name/81943
SLURM_SUBMIT_DIR = /work/scratch/your_user_name/81943
SLURM_JOBID = 81943
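Hello World
The following variant of the submission script copies example1.py to TMPDIR, runs it there, and copies the log back to the submission directory:
#!/bin/bash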
#
#SBATCH --qos=cu_hpc
#SBATCH --partition=cpu
#SBATCH --job-name=example1
#SBATCH --output=example1.txt
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
module purge
#To get worker node information
hostname
uname -a
grep "model name" /proc/cpuinfo | head -1
grep -c "processor" /proc/cpuinfo
#To set your submission directory
echo "pwd = $(pwd)"
export MYCODEDIR="$(pwd)"
#Check PATHs
echo "MYCODEDIR = $MYCODEDIR"
echo "TMPDIR = $TMPDIR"
echo "SLURM_SUBMIT_DIR = $SLURM_SUBMIT_DIR"
echo "SLURM_JOBID = $SLURM_JOBID"
#Move to TMPDIR and run python script
cp example1.py $TMPDIR
cd $TMPDIR
python example1.py >| test.log
ls -l
cp -rf test.log $MYCODEDIR/
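The stage-in, run, stage-out pattern of the script above can also be sketched in Python, which makes it easy to try without a cluster. This is an illustration under assumptions, not part of the submission script: run_in_scratch is a hypothetical name, and a fresh directory created by tempfile (which honours TMPDIR when it is set) stands in for the per-job scratch directory.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_scratch(script, log_name="test.log"):
    """Copy a script to a scratch directory, run it there, copy the log back."""
    script = Path(script).resolve()
    code_dir = script.parent  # plays the role of MYCODEDIR
    # Creates a new subdirectory under TMPDIR if set, else the system default.
    with tempfile.TemporaryDirectory() as scratch:
        shutil.copy(script, scratch)  # like: cp example1.py $TMPDIR
        with open(Path(scratch) / log_name, "w") as log:
            # like: cd $TMPDIR; python example1.py >| test.log
            subprocess.run(
                [sys.executable, script.name],
                cwd=scratch,
                stdout=log,
                check=True,
            )
        # like: cp test.log $MYCODEDIR/
        shutil.copy(Path(scratch) / log_name, code_dir / log_name)
    return code_dir / log_name


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        example = Path(work) / "example1.py"
        example.write_text('print("Hello World")\n')
        print(run_in_scratch(example).read_text(), end="")
```

Running in scratch and copying back only the files you need keeps heavy I/O off the home directory, which is the design choice the script above makes with TMPDIR and MYCODEDIR.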