LPCT Cluster

Job Management

SLURM is installed as the cluster's workload manager.

Note: SSH access to compute nodes is allowed without a reservation for testing, compiling, and running small calculations. However, processes started this way that run for more than 15 minutes will be automatically terminated by system daemons.

This restriction does not apply to the kech and lia partitions.

Check node availability with:

$ sinfo
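
For a node-by-node view with more detail, the standard SLURM flags -N (node-oriented output) and -l (long format) can be added; these are generic SLURM options, not specific to this cluster:

$ sinfo -N -l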

Key states:

idle   - node is free and ready to accept jobs
mix    - node is partially allocated (some cores still free)
alloc  - node is fully allocated
drain  - node is not accepting new jobs (e.g. for maintenance)
down   - node is unavailable

Check your jobs with:

$ squeue -u $USER
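
For more detail, the standard SLURM flags -l (long format) and --start (estimated start time of pending jobs) can be added:

$ squeue -u $USER -l
$ squeue -u $USER --start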

Key statuses:

PD - pending (waiting for resources or priority)
R  - running
CG - completing
CD - completed

SLURM Options

Example SLURM script (job.sh):

#!/bin/bash
#SBATCH --job-name=test          # job name shown by squeue
#SBATCH --nodes=1                # run on a single node
#SBATCH --ntasks-per-node=16     # 16 tasks (processes) on that node
#SBATCH --time=1:00:00           # walltime limit of 1 hour
#SBATCH --partition=lecce2       # partition to submit to

module load python/3.13.2
python my_script.py

Submit with:

$ sbatch job.sh
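
If the script is accepted, sbatch prints the ID of the new job (the number below is only an illustration); this is the ID that squeue reports in its JOBID column:

$ sbatch job.sh
Submitted batch job 123456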

Essential SLURM Options

Option             Description                                  Example
--nodes            Number of nodes                              --nodes=4
--ntasks-per-node  Tasks (processes) per node                   --ntasks-per-node=16 (for 16-core nodes)
--cpus-per-task    CPU cores allocated per process              --cpus-per-task=4 (4 cores per task)
                   (for multi-threaded apps like OpenMP)
--nodelist         Specific nodes                               --nodelist=cns01,cns02
--exclusive        Dedicated node access                        --exclusive
--gres             GPU resources                                --gres=gpu:2
--mem              Memory per node                              --mem=32G
--time             Walltime limit                               --time=24:00:00
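
As a sketch of how these options combine, the script below requests a hybrid MPI/OpenMP job; the partition follows the earlier example and my_hybrid_app is only a placeholder for your executable:

#!/bin/bash
#SBATCH --job-name=hybrid
#SBATCH --nodes=2                # two nodes
#SBATCH --ntasks-per-node=4      # 4 tasks (MPI processes) per node
#SBATCH --cpus-per-task=4        # 4 cores (OpenMP threads) per task
#SBATCH --mem=32G                # memory per node
#SBATCH --time=24:00:00          # walltime limit
#SBATCH --partition=lecce2

# Give each task one OpenMP thread per allocated core
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./my_hybrid_app             # placeholder executable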

Pro Tip: Use sbatch --test-only job.sh to validate scripts without submission.

Interactive Sessions

Launch interactive jobs with salloc:

$ salloc --nodes=1 --ntasks=1 --gres=gpu:1 --time=1:00:00
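
Once the allocation is granted, commands run on the allocated resources via srun from the same shell (my_script.py is the placeholder script from the earlier example); exiting the shell releases the allocation:

$ srun python my_script.py
$ exit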