101: How to submit Slurm batch jobs

An introduction to submitting batch jobs to the Slurm cluster

Below is a sample Slurm script for running a Python program:

Your Python script, example1.py:

print("Hello World")

and the Slurm submission script, example1.slurm:

#!/bin/bash
#
#SBATCH --qos=cu_hpc
#SBATCH --partition=cpu
#SBATCH --job-name=example1
#SBATCH --output=example1.txt
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G

module purge

#To get worker node information
hostname
uname -a
more /proc/cpuinfo | grep "model name" | head -1
more /proc/cpuinfo | grep "processor" | wc -l
echo "pwd = "`pwd`
echo "TMPDIR = "$TMPDIR
echo "SLURM_SUBMIT_DIR = "$SLURM_SUBMIT_DIR
echo "SLURM_JOBID = "$SLURM_JOBID

#To run python script
python example1.py

Note that:

  1. For --qos, you should check which QoS you are assigned. You can check with sacctmgr show assoc format=cluster,user,qos (see the sketch after this list).

    1. The available QoS values include cu_hpc, cu_htc, cu_math, cu_long, cu_student, and escience.

  2. For --partition, you can choose cpu or cpugpu for all QoS, except for cu_math (use the math partition).
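
For example, a minimal sketch for listing only your own association (assuming the user= filter of sacctmgr is available on this cluster):

# Show the cluster, user, and QoS values assigned to your account
sacctmgr show assoc format=cluster,user,qos where user=$USER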

To submit the job, use sbatch:

sbatch example1.slurm

You will see something like:

Submitted batch job 81942
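
If you submit jobs from a script, it can be handy to capture the job ID directly instead of parsing the "Submitted batch job" line. A minimal sketch, assuming the --parsable option is available in the installed sbatch version:

# --parsable prints only the job ID (and cluster name, if any) instead of the full message
JOBID=$(sbatch --parsable example1.slurm)
echo "Submitted job $JOBID"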

To check which state your job is in:

squeue -u your_user_name

In the ST column, R means running and PD means pending.
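
Once the job has finished, it no longer appears in squeue. A minimal sketch for looking up a completed job with sacct (the job ID and the field list are only illustrative):

# Show the final state and basic resource usage of a finished job
sacct -j 81942 --format=JobID,JobName,State,Elapsed,MaxRSS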

Your output file (example1.txt) should look like:

==========================================
SLURM_JOB_ID = 81943
SLURM_NODELIST = cpu-bladeh-01
==========================================
cpu-bladeh-01.stg
Linux cpu-bladeh-01.stg 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
model name	: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
16
pwd = /work/home/your_user_name/slurm/example1
TMPDIR = /work/scratch/your_user_name/81943
SLURM_SUBMIT_DIR = /work/scratch/your_user_name/81943
SLURM_JOBID = 81943
Hello World

From the Slurm output, you can see that your job runs in the same directory from which you submitted it (e.g. /work/home/your_user_name/slurm/example1). This is not recommended. You should instead run the job in $TMPDIR (or $SLURM_SUBMIT_DIR) and copy the output back when the job is done; $TMPDIR is deleted automatically after the job finishes. Here is a modified example1.slurm that runs in $TMPDIR and copies test.log (the output of the Python script) back to your submission directory.

#!/bin/bash
#
#SBATCH --qos=cu_hpc
#SBATCH --partition=cpu
#SBATCH --job-name=example1
#SBATCH --output=example1.txt
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G

module purge

#To get worker node information
hostname
uname -a
more /proc/cpuinfo | grep "model name" | head -1
more /proc/cpuinfo | grep "processor" | wc -l

#To set your submission directory
echo "pwd = "`pwd`
export MYCODEDIR=`pwd`

#Check PATHs
echo "MYCODEDIR = "$MYCODEDIR
echo "TMPDIR = "$TMPDIR
echo "SLURM_SUBMIT_DIR = "$SLURM_SUBMIT_DIR
echo "SLURM_JOBID = "$SLURM_JOBID

#Move to TMPDIR and run python script
cp example1.py $TMPDIR
cd $TMPDIR
python example1.py >| test.log
ls -l
cp -rf test.log $MYCODEDIR/
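
As written, the final cp only runs if the script reaches its last line. When a job hits its --time limit, Slurm sends SIGTERM to the job before killing it, so one possible refinement, shown purely as a sketch and not part of the original example, is to install a bash trap before the cd $TMPDIR step that copies the log back on normal exit or when that signal arrives:

#Sketch: copy test.log back to the submission directory on normal exit
#or when Slurm sends SIGTERM at the time limit
cleanup () {
    cp -f "$TMPDIR"/test.log "$MYCODEDIR"/
}
trap cleanup TERM EXIT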


See details of QoS and Partition here.

You are not limited to bash; other shells can also be used. See a tcsh/csh example in the CMSSW example.