Statistics SLURM Cluster

What has changed?

The Linux servers in the Statistics department were historically managed as independent units, which does not allow for easy load balancing of work: some nodes end up with a large number of processes while others sit idle. To help address this issue, we implemented a queuing system.

 

What is SLURM?

Our cluster consists of many compute nodes and, at the same time, has many users submitting many jobs, so we need a mechanism to distribute the jobs across the nodes in a reasonable fashion. SLURM is the one we use now. Slurm (Simple Linux Utility for Resource Management) is a highly configurable open-source workload and resource manager designed for Linux clusters of all sizes. Its key features are:

  • extensive scheduling options including advanced reservations,
  • suspend/resume for supporting binaries,
  • scheduler backfill,
  • fair-share scheduling, and
  • preemptive scheduling for critical jobs.

Slurm provides functionality similar to Torque, but some commands differ between the two. For example, to see a list of all jobs on the cluster using Moab/Torque, one would issue the qstat command, whereas the Slurm equivalent is the squeue command.
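As a rough illustration of how the two command sets line up, here is a small lookup sketch. The function name slurm_equivalent is a hypothetical helper and not part of Slurm or Torque; it covers only three common commands, and the cheat sheet linked further down this page has the full list.

```shell
# Hypothetical helper (not a Slurm or Torque tool): print the Slurm command
# that replaces a given Torque/Moab command.
slurm_equivalent() {
    case "$1" in
        qsub)  echo "sbatch"  ;;  # submit a batch script
        qstat) echo "squeue"  ;;  # list jobs in the queue
        qdel)  echo "scancel" ;;  # cancel a job
        *)     echo "no mapping known for: $1" >&2; return 1 ;;
    esac
}

# Look up the replacement for qstat
slurm_equivalent qstat
```

Only these three mappings are covered; for anything else the helper prints a message to standard error and returns a non-zero status.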

How to access the SLURM queue

Log in to the machine pronto.las.iastate.edu from any SSH client. Here is a tutorial for SSH Terminal Access.

How to submit jobs?

A basic job submission workflow can be found at http://www.brightcomputing.com/Blog/bid/174099/Slurm-101-Basic-Slurm-Usage-for-Linux-Clusters.

You can write your job script in either PBS or Slurm format and queue it for execution via SLURM. You can find some example submission scripts on Research IT's SLURM basics page.
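For orientation, a minimal Slurm-format submission might look like the sketch below. The file name example_job.sh, the job name, and the resource values are all assumptions chosen for illustration, and the script uses sbatch, the standard Slurm command for submitting batch scripts. Pronto may require additional options (such as a partition), so check the examples on Research IT's page before relying on it.

```shell
# Write a minimal Slurm batch script to a file (hypothetical name: example_job.sh).
cat > example_job.sh <<'EOF'
#!/bin/bash
# Name shown in the queue listing
#SBATCH --job-name=example
# Run on a single node with a single task
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Wall-clock limit (HH:MM:SS); an illustrative value
#SBATCH --time=00:10:00
# Output file; %j is replaced by the job ID
#SBATCH --output=example_%j.out

echo "Running on $(hostname)"
EOF

# Submit the script only where the Slurm client tools are installed.
if command -v sbatch >/dev/null 2>&1; then
    sbatch example_job.sh
else
    echo "sbatch not found; run this on the cluster login node"
fi
```

The lines beginning with #SBATCH are read by Slurm as resource requests; to the shell they are ordinary comments.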

Some useful commands 

  1. srun myscript: Submits the job. Your jobs are assigned to queues on the basis of the resources requested.
  2. squeue: Shows the status of the jobs in all the queues and the current queue structure.
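The two commands above might be used as in the following sketch. The resource values are illustrative assumptions, and hostname stands in for your own program; the guard simply skips the commands on machines without the Slurm client tools.

```shell
if command -v srun >/dev/null 2>&1; then
    # Run one task under Slurm with a 5-minute limit (illustrative values);
    # "hostname" is a placeholder for your own script or program.
    srun --ntasks=1 --time=00:05:00 hostname

    # List every job currently in the queues
    squeue

    # List only your own jobs
    squeue -u "$USER"
    SLURM_DEMO_STATUS="ran"
else
    SLURM_DEMO_STATUS="skipped"
    echo "Slurm client commands not found; run this on the cluster login node"
fi
```

srun waits for the job to run and streams its output to your terminal, so it is best suited to short or interactive work.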

A helpful comparison cheat sheet is available at http://www.schedmd.com/slurmdocs/rosetta.pdf.

Available Software Modules

To see which software modules are available on the Pronto cluster, use the command "module spider".  More detail on how to use the "module" command can be found at https://researchit.las.iastate.edu/spack-based-software-modules.
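A typical module session might look like the sketch below. The module name "r" is only an assumed example; use whatever name "module spider" reports on Pronto. The guard skips the commands in shells where the module function is not defined.

```shell
# Hypothetical module name; replace with one reported by "module spider".
MODULE_NAME="r"

if command -v module >/dev/null 2>&1; then
    module spider                  # list every module known to the system
    module spider "$MODULE_NAME"   # search for modules matching a name
    module load "$MODULE_NAME"     # add the module to your environment
    module list                    # show the modules currently loaded
else
    echo "module command not available in this shell"
fi
```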

Where to find help

For more details, please refer to the official SLURM documentation. You can also find some examples on Research IT's SLURM basics page.

If you have any further questions, please contact stat-tech@iastate.edu or researchit@iastate.edu.