Limit the number of running jobs in SLURM

I am queuing multiple jobs in SLURM. Can I limit the number of jobs that run concurrently?

Thanks in advance!

+7
4 answers

If you are not an administrator, you can hold some jobs with scontrol hold <JOBID> if you do not want them all to run at the same time, and you can defer the start of some jobs with sbatch --begin=YYYY-MM-DD. In addition, if it is a job array, you can limit how many jobs in the array run simultaneously: for instance, --array=1-100%25 gives 100 jobs in the array, but only 25 of them run at any one time.
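
The commands mentioned above can be collected into one short sketch. The job ID 12345, the script name job.sh, and the run wrapper are hypothetical and only serve the illustration; the wrapper prints each command instead of executing it, so the sketch can be tried on a machine without SLURM. scontrol release, which undoes a hold, and the relative time now+1hour are not in the answer above but are standard scontrol and sbatch usage.

```shell
#!/bin/bash
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a cluster to really run them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

JOBID=12345   # hypothetical ID of a pending job

run scontrol hold "$JOBID"              # keep the pending job from starting
run scontrol release "$JOBID"           # allow it to be scheduled again
run sbatch --begin=now+1hour job.sh     # defer the start by one hour
run sbatch --array=1-100%25 job.sh      # 100 array tasks, at most 25 running
```

Holding applies to jobs that are still pending; it does not stop a job that is already running.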

+8

According to the SLURM Resource Limits documentation, you can limit the total number of jobs that can run at one time for an association or QOS with the MaxJobs parameter. Recall that an association is a combination of cluster, account, user name, and (optionally) partition name.

You can do something similar to:

 sacctmgr modify user <userid> account=<account_name> set MaxJobs=10 

I found this presentation to be very helpful if you have more questions.

+4

According to the SLURM documentation, --array=0-15%4 (with a - sign, not a :) limits the number of simultaneously running tasks from this job array to 4.

I wrote test.sbatch:

 #!/bin/bash
 # test.sbatch
 #
 #SBATCH -J a
 #SBATCH -p campus
 #SBATCH -c 1
 #SBATCH -o %A_%a.output

 mkdir test${SLURM_ARRAY_TASK_ID}

 # sleep for up to 10 minutes to see them running in squeue and
 # different times to check that the number of parallel jobs remain constant
 RANGE=600; number=$RANDOM; let "number %= $RANGE"; echo "$number"

 sleep $number

and run it with sbatch --array=1-15%4 test.sbatch

The jobs run as expected (always 4 in parallel); they just create the directories and keep running for $number seconds.
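
One way to confirm the "always 4 in parallel" observation is to count the RUNNING lines in squeue output. The sketch below is only an illustration: count_running is a hypothetical helper, and the sample text is made-up data in the shape produced by squeue -h -o "%i %T" (job ID, then state), so it runs without a cluster.

```shell
#!/bin/bash
# Hypothetical helper: count lines whose second field is RUNNING.
# Expects "<jobid> <state>" lines on stdin.
count_running() {
    awk '$2 == "RUNNING" { n++ } END { print n + 0 }'
}

# Made-up sample standing in for real squeue output.
sample='1234_1 RUNNING
1234_2 RUNNING
1234_3 RUNNING
1234_4 RUNNING
1234_[5-15%4] PENDING'

running=$(printf '%s\n' "$sample" | count_running)
echo "running tasks: $running"
```

On a cluster the same function could be fed real data, for example squeue -h -j <JOBID> -o "%i %T" | count_running, repeated a few times while the array is active.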

I appreciate comments and suggestions.

0

If your jobs are relatively similar, you can use SLURM's job array feature. I had been trying to figure this out for a while and found this solution at https://docs.id.unibe.ch/ubelix/job-management-with-slurm/array-jobs-with-slurm.

 #!/bin/bash -x
 #SBATCH --mail-type=NONE
 # Submit 419 tasks with only 25 of them running at any time
 #SBATCH --array=1-419%25

 # contains the list of 419 commands I want to run
 cmd_file=s1List_170519.txt

 # Get first argument
 cmd_line=$(cat $cmd_file | awk -v var=${SLURM_ARRAY_TASK_ID} 'NR==var {print $1}')

 # may need to be piped to bash
 $cmd_line

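
The line-selection step can be tried locally without SLURM. The sketch below is a variation, not the script above: the command file and its three echo commands are made up, sed takes the whole line rather than only its first field, and the line is passed to bash -c so that arguments and quoting are handled. Outside a job array SLURM_ARRAY_TASK_ID is unset, so the sketch falls back to task 2.

```shell
#!/bin/bash
# Made-up stand-in for the command list, one command per line.
cmd_file=$(mktemp)
printf '%s\n' 'echo first task' 'echo second task' 'echo third task' > "$cmd_file"

# SLURM sets SLURM_ARRAY_TASK_ID inside an array job; default to 2 for a local test.
task_id=${SLURM_ARRAY_TASK_ID:-2}

# Select the whole line whose number equals the task ID.
cmd_line=$(sed -n "${task_id}p" "$cmd_file")

# Run the selected line through bash and capture what it prints.
result=$(bash -c "$cmd_line")
echo "$result"

rm -f "$cmd_file"
```

Run locally, this prints "second task"; inside an array job each task would pick the line matching its own index.
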
0

Source: https://habr.com/ru/post/1265494/