Job Array Support

Overview

Job arrays offer a mechanism for submitting and managing collections of similar jobs quickly and easily; job arrays with millions of tasks can be submitted in milliseconds (subject to configured size limits). All jobs must have the same initial options (e.g. size, time limit, etc.). However, some of these options can be changed after the job has begun execution by using the scontrol command and specifying either the job ID of the whole array or the ID of an individual array element.

$ scontrol update job=101 ...
$ scontrol update job=101_1 ...

Job arrays are only supported for batch jobs and the array index values are specified using the --array or -a option of the sbatch command. The option argument can be specific array index values, a range of index values, and an optional step size as shown in the examples below. Note that the minimum index value is zero and the maximum value is a Slurm configuration parameter (MaxArraySize minus one). Each job that is part of a job array has the environment variable SLURM_ARRAY_TASK_ID set to its array index value.

# Submit a job array with index values between 0 and 31
$ sbatch --array=0-31    -N1 tmp

# Submit a job array with index values of 1, 3, 5 and 7
$ sbatch --array=1,3,5,7 -N1 tmp

# Submit a job array with index values between 1 and 7
# with a step size of 2 (i.e. 1, 3, 5 and 7)
$ sbatch --array=1-7:2   -N1 tmp

A maximum number of simultaneously running tasks from the job array may be specified using a "%" separator. For example "--array=0-15%4" will limit the number of simultaneously running tasks from this job array to 4.
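
To make the index syntax concrete, the following bash sketch expands an --array specification into the individual index values it denotes. The function name expand_array_spec is hypothetical and is not part of Slurm; it only mirrors the forms described above (single values, ranges, an optional ":step" and an optional "%limit" suffix) and performs no validation.

```shell
#!/usr/bin/env bash
# Illustrative helper (an assumption, not a Slurm command): expand an
# --array index specification into the individual indices it denotes.
expand_array_spec() {
    local spec="${1%%\%*}"   # drop an optional "%N" concurrency limit
    local part range step i
    local -a parts out=()
    IFS=',' read -r -a parts <<< "$spec"
    for part in "${parts[@]}"; do
        step=1
        range="$part"
        if [[ "$part" == *:* ]]; then
            step="${part##*:}"     # text after the colon is the step size
            range="${part%%:*}"    # text before the colon is the range
        fi
        if [[ "$range" == *-* ]]; then
            # A range "first-last": walk it using the step size
            for ((i = ${range%%-*}; i <= ${range##*-}; i += step)); do
                out+=("$i")
            done
        else
            out+=("$range")        # a single index value
        fi
    done
    echo "${out[*]}"
}

expand_array_spec "1,3,5,7"   # prints: 1 3 5 7
expand_array_spec "1-7:2"     # prints: 1 3 5 7
expand_array_spec "0-3%2"     # prints: 0 1 2 3 (the %2 limit does not change the indices)
```

The "%" limit only affects how many tasks run at once, so the sketch discards it before expanding the indices.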

Job ID and Environment Variables

Job arrays will have several additional environment variables set. SLURM_ARRAY_JOB_ID will be set to the first job ID of the array. SLURM_ARRAY_TASK_ID will be set to the job array index value. SLURM_ARRAY_TASK_COUNT will be set to the number of tasks in the job array. SLURM_ARRAY_TASK_MAX will be set to the highest job array index value. SLURM_ARRAY_TASK_MIN will be set to the lowest job array index value. For example, a job submission of this sort

$ sbatch --array=1-3 -N1 tmp

will generate a job array containing three jobs. If the sbatch command responds

Submitted batch job 36

then the environment variables will be set as follows:

SLURM_JOB_ID=36
SLURM_ARRAY_JOB_ID=36
SLURM_ARRAY_TASK_ID=1
SLURM_ARRAY_TASK_COUNT=3
SLURM_ARRAY_TASK_MAX=3
SLURM_ARRAY_TASK_MIN=1

SLURM_JOB_ID=37
SLURM_ARRAY_JOB_ID=36
SLURM_ARRAY_TASK_ID=2
SLURM_ARRAY_TASK_COUNT=3
SLURM_ARRAY_TASK_MAX=3
SLURM_ARRAY_TASK_MIN=1

SLURM_JOB_ID=38
SLURM_ARRAY_JOB_ID=36
SLURM_ARRAY_TASK_ID=3
SLURM_ARRAY_TASK_COUNT=3
SLURM_ARRAY_TASK_MAX=3
SLURM_ARRAY_TASK_MIN=1
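
A common use of these variables is to have each task pick its own input. The batch script below is a minimal sketch under stated assumptions: the input file names and the select_input function are hypothetical, and the index arithmetic only holds for a contiguous index range with a step size of 1. The defaults after ":-" exist solely so the script can also be run outside Slurm.

```shell
#!/usr/bin/env bash
#SBATCH --array=1-3
#SBATCH -N1

# Hypothetical input files, one per array task; replace with real paths.
inputs=(sample_a.dat sample_b.dat sample_c.dat)

# Map this task's array index onto the zero-based bash array.
# Assumes a contiguous index range with step size 1.
select_input() {
    local task_id="${SLURM_ARRAY_TASK_ID:-1}"
    local task_min="${SLURM_ARRAY_TASK_MIN:-1}"
    echo "${inputs[task_id - task_min]}"
}

echo "Task ${SLURM_ARRAY_TASK_ID:-1} of ${SLURM_ARRAY_TASK_COUNT:-3} processes $(select_input)"
```

With the submission shown above, the task with SLURM_ARRAY_TASK_ID=2 would select the second entry of the list.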

All Slurm commands and APIs recognize the SLURM_JOB_ID value. Most commands also recognize the SLURM_ARRAY_JOB_ID plus SLURM_ARRAY_TASK_ID values separated by an underscore as identifying an element of a job array. Using the example above, "37" or "36_2" would be equivalent ways to identify the second array element of job 36. A set of APIs has been developed to operate on an entire job array or select tasks of a job array in a single function call. The function response consists of an array identifying the various error codes for various tasks of a job ID. For example the job_resume2() function might return an array of error codes indicating that tasks 1 and 2 have already completed; tasks 3 through 5 are resumed successfully, and tasks 6 through 99 have not yet started.
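
The two identifier forms can be told apart mechanically. The sketch below is illustrative only: split_task_spec is a hypothetical helper, not a Slurm command or API, and it handles just a plain job ID and the single-task "<ArrayJobID>_<ArrayTaskID>" form.

```shell
#!/usr/bin/env bash
# Illustrative helper (an assumption, not part of Slurm): classify a job
# specification as a plain job ID or an array element identifier.
split_task_spec() {
    local spec="$1"
    if [[ "$spec" =~ ^([0-9]+)_([0-9]+)$ ]]; then
        # Form: SLURM_ARRAY_JOB_ID, underscore, SLURM_ARRAY_TASK_ID
        echo "array_job_id=${BASH_REMATCH[1]} array_task_id=${BASH_REMATCH[2]}"
    elif [[ "$spec" =~ ^[0-9]+$ ]]; then
        # Form: a plain SLURM_JOB_ID
        echo "job_id=$spec"
    else
        echo "unrecognized job specification: $spec" >&2
        return 1
    fi
}

split_task_spec 36_2   # prints: array_job_id=36 array_task_id=2
split_task_spec 37     # prints: job_id=37
```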

File Names

Two additional options are available to specify a job's stdin, stdout, and stderr file names: %A will be replaced by the value of SLURM_ARRAY_JOB_ID (as defined above) and %a will be replaced by the value of SLURM_ARRAY_TASK_ID (as defined above). The default output file format for a job array is "slurm-%A_%a.out". An example of explicit use of the formatting is:

$ sbatch -o slurm-%A_%a.out --array=1-3 -N1 tmp

which would generate output file names of this sort: "slurm-36_1.out", "slurm-36_2.out" and "slurm-36_3.out". If these file name options are used without being part of a job array then "%A" will be replaced by the current job ID and "%a" will be replaced by 4,294,967,294 (equivalent to 0xfffffffe or NO_VAL).
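
The substitution can be imitated in a few lines of shell, which is handy for predicting output file names before submitting. This is only a sketch of the behaviour described above, not Slurm's own implementation; expand_output_name is a hypothetical function, and it defaults the task ID to 4294967294 to reflect the non-array case.

```shell
#!/usr/bin/env bash
# Illustrative only: imitate the %A / %a replacement described above.
# Arguments: pattern, array job ID, optional array task ID.
expand_output_name() {
    local pattern="$1"
    local array_job_id="$2"
    local array_task_id="${3:-4294967294}"   # NO_VAL when not part of a job array
    printf '%s\n' "$pattern" |
        sed -e "s/%A/${array_job_id}/g" -e "s/%a/${array_task_id}/g"
}

expand_output_name "slurm-%A_%a.out" 36 2   # prints: slurm-36_2.out
expand_output_name "slurm-%A_%a.out" 40     # prints: slurm-40_4294967294.out
```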

Scancel Command Use

If the job ID of a job array is specified as input to the scancel command then all elements of that job array will be cancelled. Alternately an array ID, optionally using regular expressions, may be specified for job cancellation.

# Cancel array ID 1 to 3 from job array 20
$ scancel 20_[1-3]

# Cancel array ID 4 and 5 from job array 20
$ scancel 20_4 20_5

# Cancel all elements from job array 20
$ scancel 20

# Cancel the current job or job array element (if job array)
if [[ -z "$SLURM_ARRAY_JOB_ID" ]]; then
  scancel "$SLURM_JOB_ID"
else
  scancel "${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}"
fi

Squeue Command Use

When a job array is submitted to Slurm, only one job record is created. Additional job records will only be created when the state of a task in the job array changes, typically when a task is allocated resources or its state is modified using the scontrol command. By default, the squeue command will report all of the tasks associated with a single job record on one line and use a regular expression to indicate the "array_task_id" values as shown below.

$ squeue
 JOBID     PARTITION  NAME  USER  ST  TIME  NODES NODELIST(REASON)
1080_[5-1024]  debug   tmp   mac  PD  0:00      1 (Resources)
1080_1         debug   tmp   mac   R  0:17      1 tux0
1080_2         debug   tmp   mac   R  0:16      1 tux1
1080_3         debug   tmp   mac   R  0:03      1 tux2
1080_4         debug   tmp   mac   R  0:03      1 tux3

An option of "--array" or "-r" has also been added to the squeue command to print one job array element per line as shown below. The environment variable "SQUEUE_ARRAY" is equivalent to including the "--array" option on the squeue command line.

$ squeue -r
 JOBID PARTITION  NAME  USER  ST  TIME  NODES NODELIST(REASON)
1082_3     debug   tmp   mac  PD  0:00      1 (Resources)
1082_4     debug   tmp   mac  PD  0:00      1 (Priority)
  1080     debug   tmp   mac   R  0:17      1 tux0
  1081     debug   tmp   mac   R  0:16      1 tux1
1082_1     debug   tmp   mac   R  0:03      1 tux2
1082_2     debug   tmp   mac   R  0:03      1 tux3
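
Because "squeue -r" prints one array element per line, its output is easy to post-process. The sketch below counts array elements per state using awk; count_states is a hypothetical helper, and it assumes the default column layout shown above, with the state in the fifth column. Here it is fed the sample output from this section rather than a live squeue call.

```shell
#!/usr/bin/env bash
# Illustrative helper (an assumption, not part of Slurm): count job array
# elements per state in "squeue -r" style output read from stdin.
# Assumes the default column layout, with the state (ST) in column 5.
count_states() {
    awk 'NR > 1 && $1 ~ /_/ { count[$5]++ }   # skip header and non-array jobs
         END { for (s in count) print s, count[s] }' | sort
}

# Sample input copied from the squeue -r output above.
count_states <<'EOF'
 JOBID PARTITION  NAME  USER  ST  TIME  NODES NODELIST(REASON)
1082_3     debug   tmp   mac  PD  0:00      1 (Resources)
1082_4     debug   tmp   mac  PD  0:00      1 (Priority)
  1080     debug   tmp   mac   R  0:17      1 tux0
  1081     debug   tmp   mac   R  0:16      1 tux1
1082_1     debug   tmp   mac   R  0:03      1 tux2
1082_2     debug   tmp   mac   R  0:03      1 tux3
EOF
# prints:
# PD 2
# R 2
```

On a live system the same function could read from "squeue -r" through a pipe.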

The squeue --step/-s and --job/-j options can accept job or step specifications of the same format.

$ squeue -j 1234_2,1234_3
...
$ squeue -s 1234_2.0,1234_3.0
...

Two additional job output format field options have been added to squeue:

%F prints the array_job_id value
%K prints the array_task_id value

(All of the obvious letters to use were already assigned to other job fields.)

Scontrol Command Use

Use of the scontrol show job option shows two new fields related to job array support. The JobID is a unique identifier for the job. The ArrayJobID is the JobID of the first element of the job array. The ArrayTaskID is the array index of this particular entry, either a single number or an expression identifying the entries represented by this job record (e.g. "5-1024"). Neither field is displayed if the job is not part of a job array. The optional job ID specified with the scontrol show job or scontrol show step commands can identify job array elements by specifying ArrayJobId and ArrayTaskId with an underscore between them (e.g. <ArrayJobID>_<ArrayTaskId>).

The scontrol command will operate on all elements of a job array if the job ID specified is ArrayJobID. Individual job array tasks can be modified using the ArrayJobID_ArrayTaskID as shown below.

$ sbatch --array=1-4 -J array ./sleepme 86400
Submitted batch job 21845

$ squeue
 JOBID   PARTITION     NAME     USER  ST  TIME NODES NODELIST
 21845_1    canopo    array    david  R  0:13  1     dario
 21845_2    canopo    array    david  R  0:13  1     dario
 21845_3    canopo    array    david  R  0:13  1     dario
 21845_4    canopo    array    david  R  0:13  1     dario

$ scontrol update JobID=21845_2 name=arturo
$ squeue
 JOBID   PARTITION     NAME     USER  ST   TIME  NODES NODELIST
 21845_1    canopo    array    david  R   17:03   1    dario
 21845_2    canopo   arturo    david  R   17:03   1    dario
 21845_3    canopo    array    david  R   17:03   1    dario
 21845_4    canopo    array    david  R   17:03   1    dario

The scontrol hold, holdu, release, requeue, requeuehold, suspend and resume commands can also either operate on all elements of a job array or individual elements as shown below.

$ scontrol suspend 21845
$ squeue
 JOBID PARTITION      NAME     USER  ST TIME  NODES NODELIST
21845_1    canopo    array    david  S 25:12  1     dario
21845_2    canopo   arturo    david  S 25:12  1     dario
21845_3    canopo    array    david  S 25:12  1     dario
21845_4    canopo    array    david  S 25:12  1     dario
$ scontrol resume 21845
$ squeue
 JOBID PARTITION      NAME     USER  ST TIME  NODES NODELIST
21845_1    canopo    array    david  R 25:14  1     dario
21845_2    canopo   arturo    david  R 25:14  1     dario
21845_3    canopo    array    david  R 25:14  1     dario
21845_4    canopo    array    david  R 25:14  1     dario

$ scontrol suspend 21845_3
$ squeue
 JOBID PARTITION      NAME     USER  ST TIME  NODES NODELIST
21845_1    canopo    array    david  R 25:14  1     dario
21845_2    canopo   arturo    david  R 25:14  1     dario
21845_3    canopo    array    david  S 25:14  1     dario
21845_4    canopo    array    david  R 25:14  1     dario
$ scontrol resume 21845_3
$ squeue
 JOBID PARTITION      NAME     USER  ST TIME  NODES NODELIST
21845_1    canopo    array    david  R 25:14  1     dario
21845_2    canopo   arturo    david  R 25:14  1     dario
21845_3    canopo    array    david  R 25:14  1     dario
21845_4    canopo    array    david  R 25:14  1     dario

Job Dependencies

A job that depends on an entire job array should specify the ArrayJobID in its dependency. Since each array element can have a different exit code, the interpretation of the afterok and afternotok clauses will be based upon the highest exit code from any task in the job array.

When a job dependency specifies the job ID of a job array:

The after clause is satisfied after all tasks in the job array start.
The afterany clause is satisfied after all tasks in the job array complete.
The aftercorr clause is satisfied after the corresponding task ID in the specified job has completed successfully (ran to completion with an exit code of zero).
The afterok clause is satisfied after all tasks in the job array complete successfully.
The afternotok clause is satisfied after all tasks in the job array complete with at least one task not completing successfully.

Examples of use are shown below:

# Wait for specific job array elements
sbatch --depend=after:123_4 my.job
sbatch --depend=afterok:123_4:123_8 my.job2

# Wait for entire job array to complete
sbatch --depend=afterany:123 my.job

# Wait for corresponding job array elements
sbatch --depend=aftercorr:123 my.job

# Wait for entire job array to complete successfully
sbatch --depend=afterok:123 my.job

# Wait for entire job array to complete and at least one task fails
sbatch --depend=afternotok:123 my.job
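
When dependencies are assembled in a script, the colon-separated form used above can be built from a list of IDs. This is a small sketch, not part of Slurm: build_dependency is a hypothetical helper that only joins strings and does not check that the clause name or the job IDs are valid.

```shell
#!/usr/bin/env bash
# Illustrative helper (an assumption, not part of Slurm): join a dependency
# type and one or more job or array element IDs into "type:id[:id...]".
build_dependency() {
    local type="$1"
    shift
    [ "$#" -ge 1 ] || { echo "at least one job ID is required" >&2; return 1; }
    local IFS=':'          # "$*" joins its arguments with the first IFS character
    echo "${type}:$*"
}

build_dependency afterok 123_4 123_8   # prints: afterok:123_4:123_8
build_dependency afterany 123          # prints: afterany:123

# Possible use on a system with Slurm (not executed here):
#   sbatch --dependency="$(build_dependency afterok 123_4 123_8)" my.job2
```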

Other Command Use

The following Slurm commands do not currently recognize job arrays and their use requires the use of Slurm job IDs, which are unique for each array element: sbcast, sprio, sreport, sshare and sstat. The sacct, sattach and strigger commands have been modified to permit specification of either job IDs or job array elements. The sview command has been modified to permit display of a job's ArrayJobId and ArrayTaskId fields. Both fields are displayed with a value of "N/A" if the job is not part of a job array.

System Administration

A new configuration parameter has been added to control the maximum job array size: MaxArraySize. The smallest index that can be specified by a user is zero and the maximum index is MaxArraySize minus one. The default value of MaxArraySize is 1001. The maximum MaxArraySize supported in Slurm is 4000001. Be mindful about the value of MaxArraySize as job arrays offer an easy way for users to submit large numbers of jobs very quickly.
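
The index limit can be checked before submission. The sketch below is a hypothetical pre-submission check, not a Slurm feature: check_array_index takes the highest index a user intends to request and a MaxArraySize value, defaulting to 1001 as stated above, and reports whether the index falls in the permitted range of 0 to MaxArraySize minus one.

```shell
#!/usr/bin/env bash
# Illustrative helper (an assumption, not part of Slurm): verify that an
# array index lies within 0 .. MaxArraySize-1.
# Arguments: index, optional MaxArraySize (defaults to 1001).
check_array_index() {
    local index="$1"
    local max_array_size="${2:-1001}"
    if (( index < 0 || index > max_array_size - 1 )); then
        echo "index $index is outside 0..$((max_array_size - 1))" >&2
        return 1
    fi
}

check_array_index 1000 && echo "1000 is allowed with the default MaxArraySize"
check_array_index 1001 2>/dev/null || echo "1001 is rejected with the default MaxArraySize"

# On a live system the configured value could be looked up with:
#   scontrol show config | grep MaxArraySize
```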

The sched/backfill plugin has been modified to improve performance with job arrays. Once one element of a job array is discovered to not be runnable or impact the scheduling of pending jobs, the remaining elements of that job array will be quickly skipped.

Slurm creates a single job record when a job array is submitted. Additional job records are only created as needed, typically when a task of a job array is started, which provides a very scalable mechanism to manage large job counts. Each task of the job array shares the same ArrayJobId but has its own unique ArrayTaskId. In addition to the ArrayJobId, each job has a unique JobId that is assigned as the tasks are started.

Last modified 3 December 2021
