srun
Section: Slurm Commands (1) Updated: Slurm Commands
NAME
srun - Run parallel jobs.
SYNOPSIS
srun [OPTIONS(0)... [executable(0) [args(0)...]]] [ : [OPTIONS(N)...]] executable(N) [args(N)...]
Option(s) define multiple jobs in a co-scheduled heterogeneous job. For more details about heterogeneous jobs, see https://slurm.schedmd.com/heterogeneous_jobs.html
DESCRIPTION
Run a parallel job on a cluster managed by Slurm. If necessary, srun will first create a resource allocation in which to run the parallel job. The following document describes the influence of various options on the allocation of CPUs to jobs and tasks: https://slurm.schedmd.com/cpu_management.html
RETURN VALUE
srun returns the highest exit code of all tasks run, or the highest signal of any task that exited with a signal (reported with the high-order bit of an 8-bit integer set, i.e. 128 + signal). The value 253 is reserved for out-of-memory errors.
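The return-value rule above can be sketched as a small shell helper. This is only an illustration: describe_srun_status is a hypothetical name, not part of Slurm, and the sketch assumes any status above 128 came from a signal, although a task could also exit with such a code directly.

```shell
# Hypothetical helper (not part of Slurm): interpret the status returned by srun.
describe_srun_status() {
    status=$1
    if [ "$status" -eq 253 ]; then
        # 253 is the value reserved for out-of-memory errors
        echo "out-of-memory error (reserved value 253)"
    elif [ "$status" -gt 128 ]; then
        # Assumption: values above 128 encode 128 + signal number
        echo "task terminated by signal $((status - 128))"
    else
        echo "highest task exit code $status"
    fi
}

describe_srun_status 0
describe_srun_status 137
describe_srun_status 253
```

In a script, the helper would typically be called right after srun with "$?" as its argument.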
EXECUTABLE PATH RESOLUTION
The executable is resolved in the following order:
1. If executable starts with ".", then the path is constructed as: current working directory / executable
2. If executable starts with a "/", then the path is considered absolute.
3. If executable can be resolved through PATH. See path_resolution(7).
4. If executable is in the current working directory.
The current working directory is the calling process's working directory unless the --chdir argument is passed, which will override the current working directory.
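The resolution order above can be approximated in shell. This is a sketch for illustration only, not Slurm's implementation: resolve_executable is a hypothetical name, its optional second argument stands in for --chdir, and empty PATH components are treated as ".".

```shell
# Hypothetical helper mirroring the documented order; not Slurm's code.
resolve_executable() {
    exe=$1
    workdir=${2:-$PWD}              # second argument stands in for --chdir
    case $exe in
        .*) echo "$workdir/$exe"; return 0 ;;   # 1. starts with "."
        /*) echo "$exe"; return 0 ;;            # 2. absolute path
    esac
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do                        # 3. search each PATH entry
        dir=${dir:-.}
        if [ -x "$dir/$exe" ] && [ ! -d "$dir/$exe" ]; then
            IFS=$old_ifs
            echo "$dir/$exe"
            return 0
        fi
    done
    IFS=$old_ifs
    if [ -e "$workdir/$exe" ]; then             # 4. file in working directory
        echo "$workdir/$exe"
        return 0
    fi
    return 1
}

resolve_executable ./a.out /work
resolve_executable /bin/true
```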
OPTIONS
--accel-bind=<options>
Control how tasks are bound to generic resources of type gpu and nic. Multiple options may be specified. Supported options include:
    g  Bind each task to GPUs which are closest to the allocated CPUs.
    n  Bind each task to NICs which are closest to the allocated CPUs.
    v  Verbose mode. Log how tasks are bound to GPU and NIC devices.
This option applies to job allocations.
-A, --account=<account>
Charge resources used by this job to the specified account. The account is an arbitrary string. The account name may be changed after job submission using the scontrol command. This option applies to job allocations.
--acctg-freq=<datatype>=<interval>[,<datatype>=<interval>...]
Define the job accounting and profiling sampling intervals in seconds. This can be used to override the JobAcctGatherFrequency parameter in the slurm.conf file. <datatype>=<interval> specifies the task sampling interval for the jobacct_gather plugin or a sampling interval for a profiling type by the acct_gather_profile plugin. Multiple comma-separated <datatype>=<interval> pairs may be specified. Supported datatype values are:
    task  Sampling interval for the jobacct_gather plugins and for task profiling by the acct_gather_profile plugin. NOTE: This frequency is used to monitor memory usage. If memory limits are enforced, the highest frequency a user can request is what is configured in the slurm.conf file. It cannot be disabled.
    energy  Sampling interval for energy profiling using the acct_gather_energy plugin.
    network  Sampling interval for InfiniBand profiling using the acct_gather_interconnect plugin.
    filesystem  Sampling interval for filesystem profiling using the acct_gather_filesystem plugin.
The default value for the task sampling interval is 30 seconds. The default value for all other intervals is 0. An interval of 0 disables sampling of the specified type. If the task sampling interval is 0, accounting information is collected only at job termination (reducing Slurm interference with the job). Smaller (non-zero) values have a greater impact upon job performance, but a value of 30 seconds is not likely to be noticeable for applications having fewer than 10,000 tasks. This option applies to job allocations.
--bb=<spec>
Burst buffer specification. The form of the specification is system dependent. Also see --bbf. This option applies to job allocations. When the --bb option is used, Slurm parses this option and creates a temporary burst buffer script file that is used internally by the burst buffer plugins. See Slurm's burst buffer guide for more information and examples: https://slurm.schedmd.com/burst_buffer.html
--bbf=<file_name>
Path of file containing burst buffer specification. The form of the specification is system dependent. Also see --bb. This option applies to job allocations. See Slurm's burst buffer guide for more information and examples: https://slurm.schedmd.com/burst_buffer.html
--bcast[=<dest_path>]
Copy executable file to allocated compute nodes. If a file name is specified, copy the executable to the specified destination file path. If the path specified ends with '/' it is treated as a target directory, and the destination file name will be slurm_bcast_<job_id>.<step_id>_<nodename>. If no dest_path is specified and the slurm.conf BcastParameters DestDir is configured, then it is used, and the filename follows the above pattern. If neither is specified, then --chdir is used, and the filename follows the above pattern too. For example, "srun --bcast=/tmp/mine -N3 a.out" will copy the file "a.out" from your current directory to the file "/tmp/mine" on each of the three allocated compute nodes and execute that file. This option applies to step allocations.
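The destination naming rules for --bcast can be sketched as follows. This is a rough model of the description above, not Slurm's code: bcast_destination is a hypothetical helper, and its last argument stands in for whichever fallback directory applies (BcastParameters DestDir if configured, otherwise the --chdir directory). The job ID, step ID and node name are made-up sample values.

```shell
# Hypothetical helper: compute where --bcast would place the executable.
# Arguments: dest_path job_id step_id nodename fallback_dir
bcast_destination() {
    dest=$1 job_id=$2 step_id=$3 node=$4 fallback_dir=$5
    name="slurm_bcast_${job_id}.${step_id}_${node}"
    case $dest in
        "")  echo "${fallback_dir%/}/$name" ;;  # no dest_path: fallback dir + pattern
        */)  echo "${dest}${name}" ;;           # trailing '/': target directory
        *)   echo "$dest" ;;                    # otherwise: exact destination file
    esac
}

bcast_destination /tmp/mine 1234 0 node01 /home/user
bcast_destination /scratch/ 1234 0 node01 /home/user
bcast_destination ""        1234 0 node01 /home/user
```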
--bcast-exclude={NONE|<exclude_path>[,<exclude_path>...]}
Comma-separated list of absolute directory paths to be excluded when autodetecting and broadcasting executable shared object dependencies through --bcast. If the keyword "NONE" is configured, no directory paths will be excluded. The default value is that of slurm.conf BcastExclude and this option overrides it. See also --bcast and --send-libs.
-b, --begin=<time>
Defer initiation of this job until the specified time. It accepts times of the form HH:MM:SS to run a job at a specific time of day (seconds are optional). (If that time is already past, the next day is assumed.) You may also specify midnight, noon, fika (3 PM) or teatime (4 PM), and you can have a time of day suffixed with AM or PM for running in the morning or the evening. You can also say what day the job will be run by specifying a date of the form MMDDYY, MM/DD/YY or YYYY-MM-DD. Combine date and time using the format YYYY-MM-DD[THH:MM[:SS]]. You can also give times like now + count time-units, where the time-units can be seconds (default), minutes, hours, days, or weeks, and you can tell Slurm to run the job today with the keyword today and to run the job tomorrow with the keyword tomorrow. The value may be changed after job submission using the scontrol command.
Notes on date/time specifications:
- Although the 'seconds' field of the HH:MM:SS time specification is allowed by the code, note that the poll time of the Slurm scheduler is not precise enough to guarantee dispatch of the job on the exact second. The job will be eligible to start on the next poll following the specified time. The exact poll interval depends on the Slurm scheduler (e.g., 60 seconds with the default sched/builtin).
- If no time (HH:MM:SS) is specified, the default is (00:00:00).
- If a date is specified without a year (e.g., MM/DD) then the current year is assumed, unless the combination of MM/DD and HH:MM:SS has already passed for that year, in which case the next year is used.
This option applies to job allocations.
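A few --begin values that follow the formats described above, shown as a dry run that only prints the command lines. The program name ./my_app and the specific date are placeholders chosen for illustration, and begin_examples is a hypothetical helper name.

```shell
# Dry run: print each command instead of submitting it.
# ./my_app and the date below are placeholders, not values from this manual.
begin_examples() {
    for begin in 16:00 now+1hour now+60 2030-01-20T12:34:00 tomorrow; do
        echo "srun --begin=$begin ./my_app"
    done
}

begin_examples
```

Here 16:00 is a time of day, now+1hour and now+60 are relative times (a bare count is taken as seconds), and the fourth value combines date and time in the YYYY-MM-DD[THH:MM[:SS]] form.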
-D, --chdir=<path>
Have the remote processes do a chdir to path before beginning execution. The default is to chdir to the current working directory of the srun process. The path can be specified as a full path or as a path relative to the directory where the command is executed. This option applies to job allocations.
--cluster-constraint=<list>
Specifies features that a federated cluster must have to have a sibling job submitted to it. Slurm will attempt to submit a sibling job to a cluster if it has at least one of the specified features.
-M, --clusters=<string>
Clusters to issue commands to. Multiple cluster names may be comma separated. The job will be submitted to the one cluster providing the earliest expected job initiation time. The default value is the current cluster. A value of 'all' will query to run on all clusters. Note the --export option to control environment variables exported between clusters. This option applies only to job allocations. Note that the SlurmDBD must be up for this option to work properly.
--comment=<string>
An arbitrary comment. This option applies to job allocations.
--compress[=type]
Compress file before sending it to compute hosts. The optional argument specifies the data compression library to be used. The default is the BcastParameters Compression= value if set, or "lz4" otherwise. Supported values are "lz4". Some compression libraries may be unavailable on some systems. For use with the --bcast option. This option applies to step allocations.
-C, --constraint=<list>
Nodes can have features assigned to them by the Slurm administrator. Users can specify which of these features are required by their job using the constraint option. If you are looking for 'soft' constraints, see --prefer for more information. Only nodes having features matching the job constraints will be used to satisfy the request. Multiple constraints may be specified with AND, OR, matching OR, resource counts, etc. (some operators are not supported on all system types).
NOTE: If features that are part of the node_features/helpers plugin are requested, then only the Single Name and AND options are supported.
Supported --constraint options include:
    Single Name  Only nodes which have the specified feature will be used. For example, --constraint="intel"
    Node Count  A request can specify the number of nodes needed with some feature by appending an asterisk and count after the feature name. For example, --nodes=16 --constraint="graphics*4 ..." indicates that the job requires 16 nodes and that at least four of those nodes must have the feature "graphics".
    AND  Only nodes with all of the specified features will be used. The ampersand is used for an AND operator. For example, --constraint="intel&gpu"
    OR  Only nodes with at least one of the specified features will be used. The vertical bar is used for an OR operator. For example, --constraint="intel|amd"
    Matching OR  If only one of a set of possible options should be used for all allocated nodes, then use the OR operator and enclose the options within square brackets. For example, --constraint="[rack1|rack2|rack3|rack4]" might be used to specify that all nodes must be allocated on a single rack of the cluster, but any of those four racks can be used.
    Multiple Counts  Specific counts of multiple resources may be specified by using the AND operator and enclosing the options within square brackets. For example, --constraint="[rack1*2&rack2*4]" might be used to specify that two nodes must be allocated from nodes with the feature "rack1" and four nodes must be allocated from nodes with the feature "rack2".
    NOTE: This construct does not support multiple Intel KNL NUMA or MCDRAM modes. For example, while --constraint="[(knl&quad)*2&(knl&hemi)*4]" is not supported, --constraint="[haswell*2&(knl&hemi)*4]" is supported. Specification of multiple KNL modes requires the use of a heterogeneous job.
    NOTE: Multiple Counts can cause jobs to be allocated with a non-optimal network layout.
    Brackets  Brackets can be used to indicate that you are looking for a set of nodes with the different requirements contained within the brackets. For example, --constraint="[(rack1|rack2)*1&(rack3)*2]" will get you one node with either the "rack1" or "rack2" feature and two nodes with the "rack3" feature. The same request without the brackets will try to find a single node that meets those requirements.
    NOTE: Brackets are only reserved for Multiple Counts and Matching OR syntax. AND operators require a count for each feature inside square brackets (i.e. "[quad*2&hemi*1]"). Slurm will only allow a single set of bracketed constraints per job.
    Parentheses  Parentheses can be used to group like node features together. For example, --constraint="[(knl&snc4&flat)*4&haswell*1]" might be used to specify that four nodes with the features "knl", "snc4" and "flat" plus one node with the feature "haswell" are required. All options within parentheses should be grouped with AND (e.g. "&") operands.
WARNING: When srun is executed from within salloc or sbatch, the constraint value can only contain a single feature name. None of the other operators are currently supported for job steps.
This option applies to job and step allocations.
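The constraint operators above use characters (&, |, *, square brackets, parentheses) that are also special to the shell, so the value should be quoted on the command line. A dry-run sketch that prints quoted command lines built from the examples in this section; constraint_examples is a hypothetical helper and ./my_app is a placeholder program name.

```shell
# Dry run: print quoted command lines instead of running srun.
# ./my_app is a placeholder; feature names are the ones used in this section.
constraint_examples() {
    # printf repeats the format once per argument
    printf 'srun --constraint=%s ./my_app\n' \
        '"intel"' \
        '"intel&gpu"' \
        '"intel|amd"' \
        '"[rack1|rack2|rack3|rack4]"' \
        '"[rack1*2&rack2*4]"'
    printf 'srun --nodes=16 --constraint=%s ./my_app\n' '"graphics*4"'
}

constraint_examples
```

Without the quotes, a shell would treat & as a background operator and | as a pipe, so the command would not reach srun as intended.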
--container=<path_to_container>
Absolute path to OCI container bundle.
--contiguous
If set, then the allocated nodes must form a contiguous set.
NOTE: If SelectPlugin=cons_res, this option won't be honored with the topology/tree or topology/3d_torus plugins, both of which can modify the node ordering.
This option applies to job allocations.
-S, --core-spec=<num>
Count of specialized cores per node reserved by the job for system operations and not used by the application. The application will not use these cores, but will be charged for their allocation. Default value is dependent upon the node's configured CoreSpecCount value. If a value of zero is designated and the Slurm configuration option AllowSpecResourcesUsage is enabled, the job will be allowed to override CoreSpecCount and use the specialized resources on nodes it is allocated. This option cannot be used with the --thread-spec option. This option applies to job allocations.
NOTE: This option may implicitly impact the number of tasks if -n was not specified.
NOTE: Explicitly setting a job's specialized core value implicitly sets its --exclusive option, reserving entire nodes for the job.
--cores-per-socket=<cores>
Restrict node selection to nodes with at least the specified number of cores per socket. See additional information under the -B option when the task/affinity plugin is enabled. This option applies to job allocations.
--cpu-bind=[{quiet|verbose},]<type>
Bind tasks to CPUs. Used only when the task/affinity plugin is enabled.
NOTE: To have Slurm always report on the selected CPU binding for all commands executed in a shell, you can enable verbose mode by setting the SLURM_CPU_BIND environment variable value to "verbose".
Informational environment variables are set when --cpu-bind is in use. See the ENVIRONMENT VARIABLES section for a more detailed description of the individual SLURM_CPU_BIND variables. These variables are available only if the task/affinity plugin is configured.
When using --cpus-per-task to run multithreaded tasks, be aware that CPU binding is inherited from the parent of the process. This means that the multithreaded task should either specify or clear the CPU binding itself to avoid having all threads of the multithreaded task use the same mask/CPU as the parent. Alternatively, fat masks (masks which specify more than one allowed CPU) could be used for the tasks in order to provide multiple CPUs for the multithreaded tasks.
Note that a job step can be allocated different numbers of CPUs on each node or be allocated CPUs not starting at location zero. Therefore one of the options which automatically generate the task binding is recommended. Explicitly specified masks or bindings are only honored when the job step has been allocated every available CPU on the node.
Binding a task to a NUMA locality domain means to bind the task to the set of CPUs that belong to the NUMA locality domain or "NUMA node". If NUMA locality domain options are used on systems with no NUMA support, then each socket is considered a locality domain.
If the --cpu-bind option is not used, the default binding mode will depend upon Slurm's configuration and the step's resource allocation. If all allocated nodes have the same configured CpuBind mode, that will be used. Otherwise if the job's Partition has a configured CpuBind mode, that will be used. Otherwise if Slurm has a configured TaskPluginParam value, that mode will be used. Otherwise automatic binding will be performed as described below.
Auto Binding
Applies only when task/affinity is enabled. If the job step allocation includes an allocation with a number of sockets, cores, or threads equal to the number of tasks times cpus-per-task, then the tasks will by default be bound to the appropriate resources (auto binding). Disable this mode of operation by explicitly setting "--cpu-bind=none". Use TaskPluginParam=autobind=[threads|cores|sockets] to set a default CPU binding in case "auto binding" doesn't find a match.
Supported options include:
    q[uiet]  Quietly bind before task runs (default)
    v[erbose]  Verbosely report binding before task runs
    no[ne]  Do not bind tasks to CPUs (default unless auto binding is applied)
    rank  Automatically bind by task rank. The lowest numbered task on each node is bound to socket (or core or thread) zero, etc. Not supported unless the entire node is allocated to the job.
    map_cpu:<list>  Bind by mapping CPU IDs to tasks (or ranks) as specified, where <list> is <cpu_id_for_task_0>,<cpu_id_for_task_1>,... If the number of tasks (or ranks) exceeds the number of elements in this list, elements in the list will be reused as needed starting from the beginning of the list. To simplify support for large task counts, an element of the list may be followed by an asterisk and a repetition count. For example "map_cpu:0*4,3*4".
    mask_cpu:<list>  Bind by setting CPU masks on tasks (or ranks) as specified, where <list> is <cpu_mask_for_task_0>,<cpu_mask_for_task_1>,... The mapping is specified for a node and identical mapping is applied to the tasks on every node (i.e. the lowest task ID on each node is mapped to the first mask specified in the list, etc.). CPU masks are always interpreted as hexadecimal values but can be preceded with an optional '0x'. If the number of tasks (or ranks) exceeds the number of elements in this list, elements in the list will be reused as needed starting from the beginning of the list. To simplify support for large task counts, an element of the list may be followed by an asterisk and a repetition count. For example "mask_cpu:0x0f*4,0xf0*4".
    rank_ldom  Bind to a NUMA locality domain by rank. Not supported unless the entire node is allocated to the job.
    map_ldom:<list>  Bind by mapping NUMA locality domain IDs to tasks as specified, where <list> is <ldom1>,<ldom2>,...<ldomN>. The locality domain IDs are interpreted as decimal values unless they are preceded with '0x', in which case they are interpreted as hexadecimal values. Not supported unless the entire node is allocated to the job.
    mask_ldom:<list>  Bind by setting NUMA locality domain masks on tasks as specified, where <list> is <mask1>,<mask2>,...<maskN>. NUMA locality domain masks are always interpreted as hexadecimal values but can be preceded with an optional '0x'. Not supported unless the entire node is allocated to the job.
    sockets  Automatically generate masks binding tasks to sockets. Only the CPUs on the socket which have been allocated to the job will be used. If the number of tasks differs from the number of allocated sockets this can result in sub-optimal binding.
    cores  Automatically generate masks binding tasks to cores. If the number of tasks differs from the number of allocated cores this can result in sub-optimal binding.
    threads  Automatically generate masks binding tasks to threads. If the number of tasks differs from the number of allocated threads this can result in sub-optimal binding.
    ldoms  Automatically generate masks binding tasks to NUMA locality domains. If the number of tasks differs from the number of allocated locality domains this can result in sub-optimal binding.
    help  Show help message for cpu-bind
This option applies to job and step allocations.
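The list syntax used by map_cpu and mask_cpu (repetition counts plus reuse from the start of the list) can be modelled with two small shell functions. This is a sketch of the documented behaviour, not Slurm's parser: expand_cpu_list and cpu_for_task are hypothetical names, and the second argument of cpu_for_task is assumed to be the task's index among the tasks on one node, counted from zero.

```shell
# Hypothetical helpers modelling the map_cpu/mask_cpu list syntax.

# Expand "0*4,3*4" into "0 0 0 0 3 3 3 3".
# The body runs in a subshell so the IFS and globbing changes stay local.
expand_cpu_list() (
    set -f                      # keep "*" from being treated as a glob
    IFS=,
    out=""
    for item in $1; do
        value=${item%%\**}      # text before the first "*"
        count=1
        case $item in *\**) count=${item#*\*} ;; esac
        i=0
        while [ "$i" -lt "$count" ]; do
            out="$out${out:+ }$value"
            i=$((i + 1))
        done
    done
    echo "$out"
)

# Pick the list element for a task index, wrapping around when the
# index exceeds the list length.
cpu_for_task() (
    list=$(expand_cpu_list "$1")
    rank=$2
    set -f
    set -- $list                # split the expanded list into positional parameters
    shift $(( rank % $# ))
    echo "$1"
)

expand_cpu_list "0*4,3*4"
cpu_for_task "0*4,3*4" 5
cpu_for_task "0x0f*4,0xf0*4" 9
```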
--cpu-freq=<p1>[-p2[:p3]]
Request that the job step initiated by this srun command be run at some requested frequency if possible, on the CPUs selected for the step on the compute node(s).
p1 can be [#### | low | medium | high | highm1] which will set the frequency scaling_speed to the corresponding value, and set the frequency scaling_governor to UserSpace. See below for definition of the values.
p1 can be [Conservative | OnDemand | Performance | PowerSave] which will set the scaling_governor to the corresponding value. The governor has to be in the list set by the slurm.conf option CpuFreqGovernors.
When p2 is present, p1 will be the minimum scaling frequency and p2 will be the maximum scaling frequency.
p2 can be [#### | medium | high | highm1]. p2 must be greater than p1.
p3 can be [Conservative | OnDemand | Performance | PowerSave | SchedUtil | UserSpace] which will set the governor to the corresponding value.
If p3 is UserSpace, the frequency scaling_speed will be set by a power or energy aware scheduling strategy to a value between p1 and p2 that lets the job run within the site's power goal. The job may be delayed if p1 is higher than a frequency that allows the job to run within the goal.
If the current frequency is < min, it will be set to min. Likewise, if the current frequency is > max, it will be set to max.
Acceptable values at present include:
    ####  frequency in kilohertz
    Low  the lowest available frequency
    High  the highest available frequency
    HighM1  (high minus one) will select the next highest available frequency
    Medium  attempts to set a frequency in the middle of the available range
    Conservative  attempts to use the Conservative CPU governor
    OnDemand  attempts to use the OnDemand CPU governor (the default value)
    Performance  attempts to use the Performance CPU governor
    PowerSave  attempts to use the PowerSave CPU governor
    UserSpace  attempts to use the UserSpace CPU governor
The following informational environment variable is set in the job step when the --cpu-freq option is requested: SLURM_CPU_FREQ_REQ
This environment variable can also be used to supply the value for the CPU frequency request if it is set when the 'srun' command is issued. The --cpu-freq on the command line will override the environment variable value. The form of the environment variable is the same as the command line. See the ENVIRONMENT VARIABLES section for a description of the SLURM_CPU_FREQ_REQ variable.
NOTE: This parameter is treated as a request, not a requirement. If the job step's node does not support setting the CPU frequency, or the requested value is outside the bounds of the legal frequencies, an error is logged, but the job step is allowed to continue.
NOTE: Setting the frequency for just the CPUs of the job step implies that the tasks are confined to those CPUs. If task confinement (i.e. the task/affinity TaskPlugin is enabled, or the task/cgroup TaskPlugin is enabled with "ConstrainCores=yes" set in cgroup.conf) is not configured, this parameter is ignored.
NOTE: When the step completes, the frequency and governor of each selected CPU is reset to the previous values.
NOTE: Submitting jobs with the --cpu-freq option with linuxproc as the ProctrackType can cause jobs to run too quickly, before accounting is able to poll for job information. As a result, not all accounting information will be present.
This option applies to job and step allocations.
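To make the <p1>[-p2[:p3]] form concrete, the following sketch splits a --cpu-freq value into its parts. It only illustrates the syntax described above; parse_cpu_freq is a hypothetical helper, it does no validation, and the numeric frequencies are arbitrary sample values in kilohertz.

```shell
# Hypothetical helper: split a --cpu-freq value of the form p1[-p2[:p3]].
parse_cpu_freq() {
    spec=$1
    governor=""
    # Optional ":p3" suffix names the governor
    case $spec in *:*) governor=${spec##*:}; spec=${spec%:*} ;; esac
    case $spec in
        # "p1-p2": p1 is the minimum and p2 the maximum scaling frequency
        *-*) echo "min=${spec%%-*} max=${spec#*-} governor=${governor:-unset}" ;;
        # p1 alone: a single frequency keyword/value or a governor name
        *)   echo "p1=$spec" ;;
    esac
}

parse_cpu_freq 2000000-2400000:UserSpace
parse_cpu_freq low-high
parse_cpu_freq Performance
```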
--cpus-per-gpu=<ncpus>
Advise Slurm that ensuing job steps will require ncpus processors per allocated GPU. Not compatible with the --cpus-per-task option.
-c, --cpus-per-task=<ncpus> Request that ncpus be allocated per process. This may be useful if the job is multithreaded and requires more than one CPU per task for optimal performance. Explicitly requesting this option implies --exact. The default is one CPU per process and does not imply --exact. If -c is specified without -n, as many tasks will be allocated per node as possible while satisfying the -c restriction. For instance on a cluster with 8 CPUs per node, a job request for 4 nodes and 3 CPUs per task may be allocated 3 or 6 CPUs per node (1 or 2 tasks per node) depending upon resource consumption by other jobs. Such a job may be unable to execute more than a total of 4 tasks.
WARNING: There are configurations and options interpreted differently by job and job step requests which can result in inconsistencies for this option. For example srun -c2 --threads-per-core=1 prog may allocate two cores for the job, but if each of those cores contains two threads, the job allocation will include four CPUs. The job step allocation will then launch two threads per CPU for a total of two tasks.
WARNING: When srun is executed from within salloc or sbatch, there are configurations and options which can result in inconsistent allocations when -c has a value greater than -c on salloc or sbatch. The number of cpus per task specified for salloc or sbatch is not automatically inherited by srun and, if desired, must be requested again, either by specifying --cpus-per-task when calling srun, or by setting the SRUN_CPUS_PER_TASK environment variable.
This option applies to job and step allocations.
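The 8-CPU example above comes down to integer division. The following Python sketch only illustrates that arithmetic; the function names are hypothetical, it is not Slurm's allocation logic, and it assumes every task needs the same number of CPUs.

```python
def tasks_per_node(free_cpus: int, cpus_per_task: int) -> int:
    """Whole tasks that fit on a node with `free_cpus` CPUs still available."""
    if cpus_per_task < 1:
        raise ValueError("cpus_per_task must be at least 1")
    return free_cpus // cpus_per_task


def cpus_used(free_cpus: int, cpus_per_task: int) -> int:
    """CPUs actually consumed by those tasks; leftover CPUs stay unused."""
    return tasks_per_node(free_cpus, cpus_per_task) * cpus_per_task


# Idle 8-CPU node with -c 3: two tasks fit and use 6 CPUs.
print(tasks_per_node(8, 3), cpus_used(8, 3))
# Same node with 3 CPUs taken by other jobs: one task fits and uses 3 CPUs.
print(tasks_per_node(5, 3), cpus_used(5, 3))
```

Since each node holds either one or two such tasks depending on load, the total number of tasks a 4-node job can run is not known until the allocation is made.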
--deadline=<OPT> Remove the job if no ending is possible before this deadline (start > (deadline - time[-min])). Default is no deadline. Valid time formats are:
HH:MM[:SS] [AM|PM]
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
MM/DD[/YY]-HH:MM[:SS]
YYYY-MM-DD[THH:MM[:SS]]]
now[+count[seconds(default)|minutes|hours|days|weeks]]
This option applies only to job allocations.
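The removal condition start > (deadline - time[-min]) can be read as a simple comparison of timestamps. The sketch below shows that reading in Python; the function name is hypothetical, and it assumes the relevant duration is the job's minimum time limit.

```python
from datetime import datetime, timedelta


def misses_deadline(start: datetime, deadline: datetime, time_min: timedelta) -> bool:
    """True when a job starting at `start` cannot finish before `deadline`.

    Mirrors the condition quoted in the manual: start > (deadline - time_min).
    """
    return start > deadline - time_min


deadline = datetime(2024, 1, 1, 12, 0)
limit = timedelta(minutes=30)
print(misses_deadline(datetime(2024, 1, 1, 11, 31), deadline, limit))  # too late
print(misses_deadline(datetime(2024, 1, 1, 11, 30), deadline, limit))  # just fits
```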
--delay-boot=<minutes> Do not reboot nodes in order to satisfy this job's feature specification if the job has been eligible to run for less than this time period. If the job has waited for less than the specified period, it will use only nodes which already have the specified features. The argument is in units of minutes. A default value may be set by a system administrator using the delay_boot option of the SchedulerParameters configuration parameter in the slurm.conf file, otherwise the default value is zero (no delay). This option applies only to job allocations.
-d, --dependency=<dependency_list> Defer the start of this job until the specified dependencies have been satisfied. This option does not apply to job steps (executions of srun within an existing salloc or sbatch allocation), only to job allocations. <dependency_list> is of the form <type:job_id[:job_id][,type:job_id[:job_id]]> or <type:job_id[:job_id][?type:job_id[:job_id]]>. All dependencies must be satisfied if the "," separator is used. Any dependency may be satisfied if the "?" separator is used. Only one separator may be used. For instance:
    -d afterok:20:21,afterany:23
means that the job can run only after a 0 return code of jobs 20 and 21 AND the completion of job 23. However:
    -d afterok:20:21?afterany:23
means that any of the conditions (afterok:20 OR afterok:21 OR afterany:23) will be enough to release the job. Many jobs can share the same dependency and these jobs may even belong to different users. The value may be changed after job submission using the scontrol command. Dependencies on remote jobs are allowed in a federation. Once a job dependency fails due to the termination state of a preceding job, the dependent job will never be run, even if the preceding job is requeued and has a different termination state in a subsequent execution. This option applies to job allocations.
after:job_id[[+time][:jobid[+time]...]] After the specified jobs start or are cancelled and 'time' in minutes from job start or cancellation happens, this job can begin execution. If no 'time' is given then there is no delay after start or cancellation.
afterany:job_id[:jobid...] This job can begin execution after the specified jobs have terminated. This is the default dependency type.
afterburstbuffer:job_id[:jobid...] This job can begin execution after the specified jobs have terminated and any associated burst buffer stage out operations have completed.
aftercorr:job_id[:jobid...] A task of this job array can begin execution after the corresponding task ID in the specified job has completed successfully (ran to completion with an exit code of zero).
afternotok:job_id[:jobid...] This job can begin execution after the specified jobs have terminated in some failed state (non-zero exit code, node failure, timed out, etc).
afterok:job_id[:jobid...] This job can begin execution after the specified jobs have successfully executed (ran to completion with an exit code of zero).
singleton This job can begin execution after any previously launched jobs sharing the same job name and user have terminated. In other words, only one job by that name and owned by that user can be running or suspended at any point in time. In a federation, a singleton dependency must be fulfilled on all clusters unless DependencyParameters=disable_remote_singleton is used in slurm.conf.
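Scripts that submit chains of jobs often assemble the dependency list programmatically. The following Python sketch builds a string in the <type:job_id[:job_id]> form described above; the helper name and its arguments are hypothetical, it does not validate the dependency type, and it does not cover the +time suffix of the "after" type.

```python
def build_dependency(clauses, require_all=True):
    """Join (type, job_ids) pairs into a --dependency argument.

    require_all=True uses "," (all clauses must be satisfied);
    require_all=False uses "?" (any clause is enough).
    The two separators are never mixed, as the manual requires.
    """
    separator = "," if require_all else "?"
    parts = []
    for dep_type, job_ids in clauses:
        if dep_type == "singleton":
            parts.append("singleton")  # singleton takes no job IDs
            continue
        if not job_ids:
            raise ValueError(f"{dep_type} needs at least one job ID")
        parts.append(":".join([dep_type, *(str(j) for j in job_ids)]))
    return separator.join(parts)


clauses = [("afterok", [20, 21]), ("afterany", [23])]
print(build_dependency(clauses))                     # all conditions required
print(build_dependency(clauses, require_all=False))  # any condition suffices
```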
-X, --disable-status Disable the display of task status when srun receives a single SIGINT (Ctrl-C). Instead immediately forward the SIGINT to the running job. Without this option a second Ctrl-C in one second is required to forcibly terminate the job and srun will immediately exit. May also be set via the environment variable SLURM_DISABLE_STATUS. This option applies to job allocations.
-m, --distribution={*|block|cyclic|arbitrary|plane=<size>}[:{*|block|cyclic|fcyclic}[:{*|block|cyclic|fcyclic}]][,{Pack|NoPack}] Specify alternate distribution methods for remote processes. For job allocation, this sets environment variables that will be used by subsequent srun requests. Task distribution affects job allocation at the last stage of the evaluation of available resources by the cons_res and cons_tres plugins. Consequently, other options (e.g. --ntasks-per-node, --cpus-per-task) may affect resource selection prior to task distribution. To ensure a specific task distribution, jobs should have access to whole nodes, for instance by using the --exclusive flag.
This option controls the distribution of tasks to the nodes on which resources have been allocated, and the distribution of those resources to tasks for binding (task affinity). The first distribution method (before the first ":") controls the distribution of tasks to nodes. The second distribution method (after the first ":") controls the distribution of allocated CPUs across sockets for binding to tasks. The third distribution method (after the second ":") controls the distribution of allocated CPUs across cores for binding to tasks. The second and third distributions apply only if task affinity is enabled. The third distribution is supported only if the task/cgroup plugin is configured. The default value for each distribution type is specified by *.
Note that with select/cons_res and select/cons_tres, the number of CPUs allocated to each socket and node may be different. Refer to the mc_support document for more information on resource allocation, distribution of tasks to nodes, and binding of tasks to CPUs.
First distribution method (distribution of tasks across nodes):
* Use the default method for distributing tasks to nodes (block).
block The block distribution method will distribute tasks to a node such that consecutive tasks share a node. For example, consider an allocation of three nodes each with two cpus. A four-task block distribution request will distribute those tasks to the nodes with tasks one and two on the first node, task three on the second node, and task four on the third node. Block distribution is the default behavior if the number of tasks exceeds the number of allocated nodes.
cyclic The cyclic distribution method will distribute tasks to a node such that consecutive tasks are distributed over consecutive nodes (in a round-robin fashion). For example, consider an allocation of three nodes each with two cpus. A four-task cyclic distribution request will distribute those tasks to the nodes with tasks one and four on the first node, task two on the second node, and task three on the third node. Note that when SelectType is select/cons_res, the same number of CPUs may not be allocated on each node. Task distribution will be round-robin among all the nodes with CPUs yet to be assigned to tasks. Cyclic distribution is the default behavior if the number of tasks is no larger than the number of allocated nodes.
plane The tasks are distributed in blocks of size <size>. The size must be given or SLURM_DIST_PLANESIZE must be set. The number of tasks distributed to each node is the same as for cyclic distribution, but the taskids assigned to each node depend on the plane size. Additional distribution specifications cannot be combined with this option. For more details (including examples and diagrams), please see the mc_support document and https://slurm.schedmd.com/dist_plane.html
arbitrary The arbitrary method of distribution will allocate processes in-order as listed in the file designated by the environment variable SLURM_HOSTFILE. If this variable is set it will override any other method specified. If not set the method will default to block. The hostfile must contain at minimum the number of hosts requested, listed one per line or comma separated. If specifying a task count (-n, --ntasks=<number>), your tasks will be laid out on the nodes in the order of the file.
NOTE: The arbitrary distribution option on a job allocation only controls the nodes to be allocated to the job and not the allocation of CPUs on those nodes. This option is meant primarily to control a job step's task layout in an existing job allocation for the srun command.
NOTE: If the number of tasks is given and a list of requested nodes is also given, the number of nodes used from that list will be reduced to match that of the number of tasks if the number of nodes in the list is greater than the number of tasks.
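The block and cyclic examples above (four tasks on three two-CPU nodes) can be reproduced with a small model. The Python sketch below is a simplified illustration, not Slurm's algorithm: the function names are hypothetical, task IDs are zero-based (so the manual's "task one" is task 0), the cyclic model ignores per-node CPU limits, and the block model assumes every node receives one task before nodes are filled in order.

```python
def cyclic_layout(num_tasks, num_nodes):
    """Round-robin: consecutive tasks land on consecutive nodes."""
    layout = [[] for _ in range(num_nodes)]
    for task in range(num_tasks):
        layout[task % num_nodes].append(task)
    return layout


def block_layout(num_tasks, cpus_per_node):
    """Consecutive tasks share a node (simplified model, see lead-in)."""
    counts = [0] * len(cpus_per_node)
    remaining = num_tasks
    # Pass 1: one task per node while tasks remain.
    for i in range(len(counts)):
        if remaining == 0:
            break
        counts[i] = 1
        remaining -= 1
    # Pass 2: fill nodes in order, up to one task per CPU.
    for i, cpus in enumerate(cpus_per_node):
        extra = min(remaining, max(cpus - counts[i], 0))
        counts[i] += extra
        remaining -= extra
    if remaining:
        raise ValueError("more tasks than CPUs in the allocation")
    # Number the tasks consecutively, node by node.
    layout, next_task = [], 0
    for count in counts:
        layout.append(list(range(next_task, next_task + count)))
        next_task += count
    return layout


print(block_layout(4, [2, 2, 2]))  # tasks 0,1 | 2 | 3
print(cyclic_layout(4, 3))         # tasks 0,3 | 1 | 2
```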
Second distribution method (distribution of CPUs across sockets for binding):
* Use the default method for distributing CPUs across sockets (cyclic).
block The block distribution method will distribute allocated CPUs consecutively from the same socket for binding to tasks, before using the next consecutive socket.
cyclic The cyclic distribution method will distribute allocated CPUs for binding to a given task consecutively from the same socket, and from the next consecutive socket for the next task, in a round-robin fashion across sockets. Tasks requiring more than one CPU will have all of those CPUs allocated on a single socket if possible.
fcyclic The fcyclic distribution method will distribute allocated CPUs for binding to tasks from consecutive sockets in a round-robin fashion across the sockets. Tasks requiring more than one CPU will have each of their CPUs allocated in a cyclic fashion across sockets.
Third distribution method (distribution of CPUs across cores for binding):
* Use the default method for distributing CPUs across cores (inherited from second distribution method).
block The block distribution method will distribute allocated CPUs consecutively from the same core for binding to tasks, before using the next consecutive core.
cyclic The cyclic distribution method will distribute allocated CPUs for binding to a given task consecutively from the same core, and from the next consecutive core for the next task, in a round-robin fashion across cores.
fcyclic The fcyclic distribution method will distribute allocated CPUs for binding to tasks from consecutive cores in a round-robin fashion across the cores.
Optional control for task distribution over nodes:
Pack Rather than distributing a job step's tasks evenly across its allocated nodes, pack them as tightly as possible on the nodes. This only applies when the "block" task distribution method is used.
NoPack Rather than packing a job step's tasks as tightly as possible on the nodes, distribute them evenly. This user option will supersede the SelectTypeParameters CR_Pack_Nodes configuration parameter.
This option applies to job and step allocations.
--epilog={none|<executable>} srun will run executable just after the job step completes. The command line arguments for executable will be the command and arguments of the job step. If none is specified, then no srun epilog will be run. This parameter overrides the SrunEpilog parameter in slurm.conf. This parameter is completely independent from the Epilog parameter in slurm.conf. This option applies to job allocations.
-e, --error=<filename_pattern> Specify how stderr is to be redirected. By default in interactive mode, srun redirects stderr to the same file as stdout, if one is specified. The --error option is provided to allow stdout and stderr to be redirected to different locations. See IO Redirection below for more options. If the specified file already exists, it will be overwritten. This option applies to job and step allocations.
--exact Allow a step access to only the resources requested for the step. By default, all non-GRES resources on each node in the step allocation will be used. This option only applies to step allocations.
NOTE: Parallel steps will either be blocked or rejected until requested step resources are available unless --overlap is specified. Job resources can be held after the completion of an srun command while Slurm does job cleanup. Step epilogs and/or SPANK plugins can further delay the release of step resources.
-x, --exclude={<host1>[,<host2>...]|<filename>} Request that a specific list of hosts not be included in the resources allocated to this job. The host list will be assumed to be a filename if it contains a "/" character. This option applies to job and step allocations.
--exclusive[={user|mcs}] This option applies to job and job step allocations, and has two slightly different meanings for each one. When used to initiate a job, the job allocation cannot share nodes with other running jobs (or just other users with the "=user" option or "=mcs" option). If user/mcs are not specified (i.e. the job allocation can not share nodes with other running jobs), the job is allocated all CPUs and GRES on all nodes in the allocation, but is only allocated as much memory as it requested. This is by design to support gang scheduling, because suspended jobs still reside in memory. To request all the memory on a node, use --mem=0. The default shared/exclusive behavior depends on system configuration and the partition's OverSubscribe option takes precedence over the job's option.
NOTE: Since shared GRES (MPS) cannot be allocated at the same time as a sharing GRES (GPU) this option only allocates all sharing GRES and no underlying shared GRES.
This option can also be used when initiating more than one job step within an existing resource allocation (default), where you want separate processors to be dedicated to each job step. If sufficient processors are not available to initiate the job step, it will be deferred. This can be thought of as providing a mechanism for resource management to the job within its allocation (--exact implied).
The exclusive allocation of CPUs applies to job steps by default, but --exact is NOT the default. In other words, the default behavior is this: job steps will not share CPUs, but job steps will be allocated all CPUs available to the job on all nodes allocated to the steps.
In order to share the resources use the --overlap option.
See EXAMPLE below.
--export={[ALL,]<environment_variables>|ALL|NONE} Identify which environment variables from the submission environment are propagated to the launched application.
--export=ALL Default mode if --export is not specified. All of the user's environment will be loaded from the caller's environment.
--export=NONE None of the user environment will be defined. User must use absolute path to the binary to be executed that will define the environment. User can not specify explicit environment variables with "NONE". This option is particularly important for jobs that are submitted on one cluster and execute on a different cluster (e.g. with different paths). To avoid steps inheriting environment export settings (e.g. "NONE") from the sbatch command, either set --export=ALL or set the environment variable SLURM_EXPORT_ENV to "ALL".
--export=[ALL,]<environment_variables> Exports all SLURM* environment variables along with explicitly defined variables. Multiple environment variable names should be comma separated. Environment variable names may be specified to propagate the current value (e.g. "--export=EDITOR") or specific values may be exported (e.g. "--export=EDITOR=/bin/emacs"). If "ALL" is specified, then all user environment variables will be loaded and will take precedence over any explicitly given environment variables.
Example: --export=EDITOR,ARG1=test In this example, the propagated environment will only contain the variable EDITOR from the user's environment, SLURM_* environment variables, and ARG1=test.
Example: --export=ALL,EDITOR=/bin/emacs There are two possible outcomes for this example. If the caller has the EDITOR environment variable defined, then the job's environment will inherit the variable from the caller's environment. If the caller doesn't have an environment variable defined for EDITOR, then the job's environment will use the value given by --export.
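The precedence rules in the two --export examples are easy to get wrong, so the following Python sketch models them. It is an illustration built only from the rules stated above, not Slurm code: the function name is hypothetical, --export=NONE is deliberately left unmodeled, and values containing commas are not handled.

```python
def propagated_env(caller_env, export="ALL"):
    """Model which variables reach the launched application."""
    items = [item for item in export.split(",") if item]
    if "NONE" in items:
        raise ValueError("--export=NONE is not modeled in this sketch")

    # SLURM* variables are always exported.
    result = {k: v for k, v in caller_env.items() if k.startswith("SLURM")}

    for item in items:
        if item == "ALL":
            continue
        name, has_value, value = item.partition("=")
        if has_value:
            result[name] = value             # explicit NAME=value
        elif name in caller_env:
            result[name] = caller_env[name]  # propagate current value

    if "ALL" in items:
        # The caller's variables take precedence over explicit values.
        result.update(caller_env)
    return result


caller = {"EDITOR": "vi", "HOME": "/home/user", "SLURM_JOB_ID": "42"}
print(propagated_env(caller, "EDITOR,ARG1=test"))
print(propagated_env(caller, "ALL,EDITOR=/bin/emacs")["EDITOR"])
```

In the second call the caller already defines EDITOR, so the explicit /bin/emacs value is not used, matching the first outcome described above.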
-B, --extra-node-info=<sockets>[:cores[:threads]] Restrict node selection to nodes with at least the specified number of sockets, cores per socket and/or threads per core.
NOTE: These options do not specify the resource allocation size. Each value specified is considered a minimum. An asterisk (*) can be used as a placeholder indicating that all available resources of that type are to be utilized. Values can also be specified as min-max. The individual levels can also be specified in separate options if desired:
    --sockets-per-node=<sockets>
    --cores-per-socket=<cores>
    --threads-per-core=<threads>
If task/affinity plugin is enabled, then specifying an allocation in this manner also sets a default --cpu-bind option of threads if the -B option specifies a thread count, otherwise an option of cores if a core count is specified, otherwise an option of sockets. If SelectType is configured to select/cons_res, it must have a parameter of CR_Core, CR_Core_Memory, CR_Socket, or CR_Socket_Memory for this option to be honored. If not specified, the scontrol show job will display 'ReqS:C:T=*:*:*'. This option applies to job allocations.
NOTE: This option is mutually exclusive with --hint, --threads-per-core and --ntasks-per-core.
NOTE: If the number of sockets, cores and threads were all specified, the number of nodes was specified (as a fixed number, not a range) and the number of tasks was NOT specified, srun will implicitly calculate the number of tasks as one task per thread.
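The rule for the default --cpu-bind value depends only on how many fields the -B argument carries. The Python sketch below illustrates that rule; the function name is hypothetical, and it assumes that any field that is present (including a "*" placeholder or a min-max range) counts as "specified", which the text above does not spell out.

```python
def default_cpu_bind(extra_node_info: str) -> str:
    """Default --cpu-bind implied by a -B value of the form S[:C[:T]]."""
    fields = extra_node_info.split(":")
    if not 1 <= len(fields) <= 3 or any(field == "" for field in fields):
        raise ValueError("expected <sockets>[:cores[:threads]]")
    if len(fields) == 3:
        return "threads"  # a thread count was given
    if len(fields) == 2:
        return "cores"    # a core count was given, but no thread count
    return "sockets"      # only a socket count was given


for value in ("2:4:2", "2:4", "2"):
    print(value, "->", default_cpu_bind(value))
```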
--gid=<group> If srun is run as root, and the --gid option is used, submit the job with group's group access permissions. group may be the group name or the numerical group ID. This option applies to job allocations.
--gpu-bind=[verbose,]<type> Bind tasks to specific GPUs. By default every spawned task can access every GPU allocated to the step. If "verbose," is specified before <type>, then print out GPU binding debug information to the stderr of the tasks. GPU binding is ignored if there is only one task.
Supported type options:
closest Bind each task to the GPU(s) which are closest. In a NUMA environment, each task may be bound to more than one GPU (i.e. all GPUs in that NUMA environment).
map_gpu:<list> Bind by setting GPU masks on tasks (or ranks) as specified where <list> is <gpu_id_for_task_0>,<gpu_id_for_task_1>,... GPU IDs are interpreted as decimal values. If the number of tasks (or ranks) exceeds the number of elements in this list, elements in the list will be reused as needed starting from the beginning of the list. To simplify support for large task counts, an element of the list may be followed by an asterisk and a repetition count. For example "map_gpu:0*4,1*4". If the task/cgroup plugin is used and ConstrainDevices is set in cgroup.conf, then the GPU IDs are zero-based indexes relative to the GPUs allocated to the job (e.g. the first GPU is 0, even if the global ID is 3). Otherwise, the GPU IDs are global IDs, and all GPUs on each node in the job should be allocated for predictable binding results.
mask_gpu:<list> Bind by setting GPU masks on tasks (or ranks) as specified where <list> is <gpu_mask_for_task_0>,<gpu_mask_for_task_1>,... The mapping is specified for a node and identical mapping is applied to the tasks on every node (i.e. the lowest task ID on each node is mapped to the first mask specified in the list, etc.). GPU masks are always interpreted as hexadecimal values but can be preceded with an optional '0x'. To simplify support for large task counts, an element of the list may be followed by an asterisk and a repetition count. For example "mask_gpu:0x0f*4,0xf0*4". If the task/cgroup plugin is used and ConstrainDevices is set in cgroup.conf, then the GPU IDs are zero-based indexes relative to the GPUs allocated to the job (e.g. the first GPU is 0, even if the global ID is 3). Otherwise, the GPU IDs are global IDs, and all GPUs on each node in the job should be allocated for predictable binding results.
none Do not bind tasks to GPUs (turns off binding if --gpus-per-task is requested).
per_task:<gpus_per_task> Each task will be bound to the number of gpus specified in <gpus_per_task>. Gpus are assigned in order to tasks. The first task will be assigned the first x number of gpus on the node etc.
single:<tasks_per_gpu> Like --gpu-bind=closest, except that each task can only be bound to a single GPU, even when it can be bound to multiple GPUs that are equally close. The GPU to bind to is determined by <tasks_per_gpu>, where the first <tasks_per_gpu> tasks are bound to the first GPU available, the second <tasks_per_gpu> tasks are bound to the second GPU available, etc. This is basically a block distribution of tasks onto available GPUs, where the available GPUs are determined by the socket affinity of the task and the socket affinity of the GPUs as specified in gres.conf's Cores parameter.
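The repetition syntax and the wrap-around rule for map_gpu lists can be checked with a short model. The Python sketch below expands a list such as "0*4,1*4" and looks up the GPU for a given task; the function names are hypothetical, the argument is the part after "map_gpu:", and treating the task index as the task's position on its node is an assumption.

```python
def expand_map_gpu(spec: str) -> list:
    """Expand "0*4,1*4" into [0, 0, 0, 0, 1, 1, 1, 1] (decimal GPU IDs)."""
    gpu_ids = []
    for element in spec.split(","):
        value, has_count, count = element.partition("*")
        repeat = int(count) if has_count else 1
        gpu_ids.extend([int(value)] * repeat)
    return gpu_ids


def gpu_for_task(spec: str, task_index: int) -> int:
    """GPU ID for a task; the list is reused from the start when it runs out."""
    gpu_ids = expand_map_gpu(spec)
    return gpu_ids[task_index % len(gpu_ids)]


print(expand_map_gpu("0*4,1*4"))
print(gpu_for_task("0,1", 2))  # third task wraps around to the first element
```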
--gpu-freq=[<type]=value>[,<type=value>][,verbose] Request that GPUs allocated to the job are configured with specific frequency values. This option can be used to independently configure the GPU and its memory frequencies. After the job is completed, the frequencies of all affected GPUs will be reset to the highest possible values. In some cases, system power caps may override the requested values. The field type can be "memory". If type is not specified, the GPU frequency is implied. The value field can either be "low", "medium", "high", "highm1" or a numeric value in megahertz (MHz). If the specified numeric value is not possible, a value as close as possible will be used. See below for definition of the values. The verbose option causes current GPU frequency information to be logged. Examples of use include "--gpu-freq=medium,memory=high" and "--gpu-freq=450".
Supported value definitions:
low the lowest available frequency.
medium attempts to set a frequency in the middle of the available range.
high the highest available frequency.
highm1 (high minus one) will select the next highest available frequency.
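A --gpu-freq argument mixes an untyped GPU value, an optional memory=value pair and the verbose keyword. The Python sketch below parses such a string following the description above; the function name and the "gpu" dictionary key are hypothetical labels, and only "memory" is accepted as an explicit type because that is the only type the text names.

```python
NAMED_VALUES = {"low", "medium", "high", "highm1"}


def parse_gpu_freq(spec: str):
    """Return ({"gpu": ..., "memory": ...}, verbose) for a --gpu-freq string."""
    settings = {}
    verbose = False
    for item in spec.split(","):
        if item == "verbose":
            verbose = True
            continue
        kind, has_type, value = item.partition("=")
        if not has_type:
            kind, value = "gpu", item  # no type given: GPU frequency implied
        elif kind != "memory":
            raise ValueError(f"unsupported type: {kind!r}")
        if value.isdigit():
            settings[kind] = int(value)  # numeric value in MHz
        elif value in NAMED_VALUES:
            settings[kind] = value
        else:
            raise ValueError(f"unsupported value: {value!r}")
    return settings, verbose


print(parse_gpu_freq("medium,memory=high"))
print(parse_gpu_freq("450"))
```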
-G, --gpus=[type:]<number>
Specify the total number of GPUs required for the job. An optional GPU type specification can be supplied. For example "--gpus=volta:3". Multiple options can be requested in a comma separated list, for example: "--gpus=volta:3,kepler:1". See also the --gpus-per-node, --gpus-per-socket and --gpus-per-task options. NOTE: The allocation has to contain at least one GPU per node.
--gpus-per-node=[type:]<number>
Specify the number of GPUs required for the job on each node included in the job's resource allocation. An optional GPU type specification can be supplied. For example "--gpus-per-node=volta:3". Multiple options can be requested in a comma separated list, for example: "--gpus-per-node=volta:3,kepler:1". See also the --gpus, --gpus-per-socket and --gpus-per-task options.
--gpus-per-socket=[type:]<number>
Specify the number of GPUs required for the job on each socket included in the job's resource allocation. An optional GPU type specification can be supplied. For example "--gpus-per-socket=volta:3". Multiple options can be requested in a comma separated list, for example: "--gpus-per-socket=volta:3,kepler:1". Requires the job to specify a sockets per node count (--sockets-per-node). See also the --gpus, --gpus-per-node and --gpus-per-task options. This option applies to job allocations.
--gpus-per-task=[type:]<number>
Specify the number of GPUs required for the job on each task to be spawned in the job's resource allocation. An optional GPU type specification can be supplied. For example "--gpus-per-task=volta:1". Multiple options can be requested in a comma separated list, for example: "--gpus-per-task=volta:3,kepler:1". See also the --gpus, --gpus-per-socket and --gpus-per-node options. This option requires an explicit task count, e.g. -n, --ntasks or "--gpus=X --gpus-per-task=Y" rather than an ambiguous range of nodes with -N, --nodes. This option will implicitly set --gpu-bind=per_task:<gpus_per_task>, but that can be overridden with an explicit --gpu-bind specification.
--gres=<list>
Specifies a comma-delimited list of generic consumable resources. The format of each entry on the list is "name[[:type]:count]". The name is that of the consumable resource. The count is the number of those resources with a default value of 1. The count can have a suffix of "k" or "K" (multiple of 1024), "m" or "M" (multiple of 1024 x 1024), "g" or "G" (multiple of 1024 x 1024 x 1024), "t" or "T" (multiple of 1024 x 1024 x 1024 x 1024), "p" or "P" (multiple of 1024 x 1024 x 1024 x 1024 x 1024). The specified resources will be allocated to the job on each node. The available generic consumable resources are configurable by the system administrator. A list of available generic consumable resources will be printed and the command will exit if the option argument is "help". Examples of use include "--gres=gpu:2", "--gres=gpu:kepler:2", and "--gres=help". NOTE: This option applies to job and step allocations. By default, a job step is allocated all of the generic resources that have been requested by the job, except those implicitly requested when a job is exclusive. To change the behavior so that each job step is allocated no generic resources, explicitly set the value of --gres to specify zero counts for each generic resource OR set "--gres=none" OR set the SLURM_STEP_GRES environment variable to "none".
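The entry format "name[[:type]:count]" and the count suffixes can be illustrated with a short Python sketch. The helper names are hypothetical, and the parser follows only the format as stated above (a two-field entry is read as name:count); it is not a description of how Slurm itself parses the option.

```python
# Power of 1024 implied by each suffix, as listed in the option description.
_SUFFIX_POWER = {"k": 1, "m": 2, "g": 3, "t": 4, "p": 5}


def parse_gres_count(text):
    """Convert a count such as '2', '4k' or '1G' to an integer."""
    text = text.strip()
    if not text:
        raise ValueError("empty count")
    power = _SUFFIX_POWER.get(text[-1].lower())
    digits = text[:-1] if power else text
    if not digits.isdigit():
        raise ValueError("invalid count: " + text)
    return int(digits) * 1024 ** (power or 0)


def parse_gres_entry(entry):
    """Split one 'name[[:type]:count]' entry into (name, type, count)."""
    parts = entry.split(":")
    if len(parts) == 1:
        return parts[0], None, 1  # count defaults to 1
    if len(parts) == 2:
        return parts[0], None, parse_gres_count(parts[1])
    if len(parts) == 3:
        return parts[0], parts[1], parse_gres_count(parts[2])
    raise ValueError("too many fields in entry: " + entry)


def parse_gres(spec):
    """Parse a comma-delimited --gres list into a list of tuples."""
    return [parse_gres_entry(item) for item in spec.split(",")]
```

For instance, parse_gres("gpu:kepler:2") gives one entry with name "gpu", type "kepler" and count 2, and a count of "2k" expands to 2048.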
--gres-flags=<type>
Specify generic resource task binding options.
    disable-binding
        Disable filtering of CPUs with respect to generic resource locality. This option is currently required to use more CPUs than are bound to a GRES (i.e. if a GPU is bound to the CPUs on one socket, but resources on more than one socket are required to run the job). This option may permit a job to be allocated resources sooner than otherwise possible, but may result in lower job performance. This option applies to job allocations. NOTE: This option is specific to SelectType=cons_res.
    enforce-binding
        The only CPUs available to the job/step will be those bound to the selected GRES (i.e. the CPUs identified in the gres.conf file will be strictly enforced). This option may result in delayed initiation of a job. For example, a job requiring two GPUs and one CPU will be delayed until both GPUs on a single socket are available rather than using GPUs bound to separate sockets; however, the application performance may be improved due to improved communication speed. Requires the node to be configured with more than one socket, and resource filtering will be performed on a per-socket basis. NOTE: Job steps that don't use --exact will not be affected. NOTE: This option is specific to SelectType=cons_tres for job allocations.
-h, --help
Display help information and exit.
--het-group=<expr>
Identify each component in a heterogeneous job allocation for which a step is to be created. Applies only to srun commands issued inside a salloc allocation or sbatch script. <expr> is a set of integers corresponding to one or more options offsets on the salloc or sbatch command line. Examples: "--het-group=2", "--het-group=0,4", "--het-group=1,3-5". The default value is --het-group=0.
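The <expr> syntax shown in the examples (single integers, comma-separated values, and inclusive ranges) can be expanded as in the following Python sketch. The function name is hypothetical and the sketch covers only the forms that appear in the examples above.

```python
def expand_het_group(expr):
    """Expand a --het-group expression such as '1,3-5' into a list of offsets."""
    offsets = []
    for part in expr.split(","):
        first, sep, last = part.strip().partition("-")
        start = int(first)
        end = int(last) if sep else start  # a bare integer is a one-element range
        if end < start:
            raise ValueError("descending range: " + part)
        offsets.extend(range(start, end + 1))
    return offsets
```

Applied to the documented examples, "2" selects component 2, "0,4" selects components 0 and 4, and "1,3-5" selects components 1, 3, 4 and 5.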
--hint=<type>
Bind tasks according to application hints. NOTE: This option cannot be used in conjunction with any of --ntasks-per-core, --threads-per-core, --cpu-bind (other than --cpu-bind=verbose) or -B. If --hint is specified as a command line argument, it will take precedence over the environment.
    compute_bound
        Select settings for compute bound applications: use all cores in each socket, one thread per core.
    memory_bound
        Select settings for memory bound applications: use only one core in each socket, one thread per core.
    [no]multithread
        [don't] use extra threads with in-core multi-threading which can benefit communication intensive applications. Only supported with the task/affinity plugin.
    help
        show this help message
This option applies to job allocations.
-H, --hold
Specify the job is to be submitted in a held state (priority of zero). A held job can now be released using scontrol to reset its priority (e.g. "scontrol release <job_id>"). This option applies to job allocations.
-I, --immediate[=<seconds>]
Exit if resources are not available within the time period specified. If no argument is given (seconds defaults to 1), resources must be available immediately for the request to succeed. If defer is configured in SchedulerParameters and seconds=1, the allocation request will fail immediately; defer conflicts with and takes precedence over this option. By default, --immediate is off, and the command will block until resources become available. Since this option's argument is optional, for proper parsing the single letter option must be followed immediately with the value and not include a space between them. For example "-I60" and not "-I 60". This option applies to job and step allocations.
-i, --input=<mode>
Specify how stdin is to be redirected. By default, srun redirects stdin from the terminal to all tasks. See IO Redirection below for more options. For OS X, the poll() function does not support stdin, so input from a terminal is not possible. This option applies to job and step allocations.
-J, --job-name=<jobname>
Specify a name for the job. The specified name will appear along with the job id number when querying running jobs on the system. The default is the supplied executable program's name. NOTE: This information may be written to the slurm_jobacct.log file. This file is space delimited, so if a space is used in the job name it will cause problems in properly displaying the contents of the slurm_jobacct.log file when the sacct command is used. This option applies to job and step allocations.
--jobid=<jobid>
Initiate a job step under an already allocated job with job id <jobid>. Using this option will cause srun to behave exactly as if the SLURM_JOB_ID environment variable was set. This option applies to step allocations.
-K, --kill-on-bad-exit[=0|1]
Controls whether or not to terminate a step if any task exits with a non-zero exit code. If this option is not specified, the default action will be based upon the Slurm configuration parameter of KillOnBadExit. If this option is specified, it will take precedence over KillOnBadExit. An option argument of zero will not terminate the job. A non-zero argument or no argument will terminate the job. Note: This option takes precedence over the -W, --wait option to terminate the job immediately if a task exits with a non-zero exit code. Since this option's argument is optional, for proper parsing the single letter option must be followed immediately with the value and not include a space between them. For example "-K1" and not "-K 1".
-l, --label
Prepend task number to lines of stdout/err. The --label option will prepend lines of output with the remote task id. This option applies to step allocations.
-L, --licenses=<license>[@db][:count][,license[@db][:count]...]
Specification of licenses (or other resources available on all nodes of the cluster) which must be allocated to this job. License names can be followed by a colon and count (the default count is one). Multiple license names should be comma separated (e.g. "--licenses=foo:4,bar"). This option applies to job allocations.
NOTE: When submitting heterogeneous jobs, license requests only work correctly when made on the first component job. For example "srun -L ansys:2 : myexecutable".
--mail-type=<type>
Notify user by email when certain event types occur. Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL (equivalent to BEGIN, END, FAIL, INVALID_DEPEND, REQUEUE, and STAGE_OUT), INVALID_DEPEND (dependency never satisfied), STAGE_OUT (burst buffer stage out and teardown completed), TIME_LIMIT, TIME_LIMIT_90 (reached 90 percent of time limit), TIME_LIMIT_80 (reached 80 percent of time limit), and TIME_LIMIT_50 (reached 50 percent of time limit). Multiple type values may be specified in a comma separated list. The user to be notified is indicated with --mail-user. This option applies to job allocations.
--mail-user=<user>
User to receive email notification of state changes as defined by --mail-type. The default value is the submitting user. This option applies to job allocations.
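As an illustration of the --mail-type values listed above, the following Python sketch validates a comma-separated list and expands ALL into the event types it stands for. The helper name is hypothetical, and the treatment of NONE (it contributes no events) is an assumption made for the example.

```python
# Event types that ALL is documented to be equivalent to.
ALL_EXPANSION = ("BEGIN", "END", "FAIL", "INVALID_DEPEND", "REQUEUE", "STAGE_OUT")

VALID_TYPES = {
    "NONE", "BEGIN", "END", "FAIL", "REQUEUE", "ALL",
    "INVALID_DEPEND", "STAGE_OUT",
    "TIME_LIMIT", "TIME_LIMIT_90", "TIME_LIMIT_80", "TIME_LIMIT_50",
}


def expand_mail_types(spec):
    """Return the set of event types requested by a --mail-type value."""
    events = set()
    for name in spec.split(","):
        name = name.strip().upper()
        if name not in VALID_TYPES:
            raise ValueError("invalid mail type: " + name)
        if name == "ALL":
            events.update(ALL_EXPANSION)
        elif name != "NONE":  # assumption: NONE adds nothing
            events.add(name)
    return events
```

Note that, per the list above, ALL does not include the TIME_LIMIT variants, so a request such as "ALL,TIME_LIMIT_90" names seven event types.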
--mcs-label=<mcs>
Used only when the mcs/group plugin is enabled. This parameter is a group among the groups of the user. The default value is calculated by the mcs plugin if it is enabled. This option applies to job allocations.
--mem=<size>[units]
Specify the real memory required per node. Default units are megabytes. Different units can be specified using the suffix [K|M|G|T]. The default value is DefMemPerNode and the maximum value is MaxMemPerNode. If configured, both parameters can be seen using the scontrol show config command. This parameter would generally be used if whole nodes are allocated to jobs (SelectType=select/linear). Specifying a memory limit of zero for a job step will restrict the job step to the amount of memory allocated to the job, but not remove any of the job's memory allocation from being available to other job steps. Also see --mem-per-cpu and --mem-per-gpu. The --mem, --mem-per-cpu and --mem-per-gpu options are mutually exclusive. If --mem, --mem-per-cpu or --mem-per-gpu are specified as command line arguments, then they will take precedence over the environment (potentially inherited from salloc or sbatch).
NOTE: A memory size specification of zero is treated as a special case and grants the job access to all of the memory on each node for newly submitted jobs and all available job memory to new job steps.
NOTE: Enforcement of memory limits currently relies upon the task/cgroup plugin or enabling of accounting, which samples memory use on a periodic basis (data need not be stored, just collected). In both cases memory use is based upon the job's Resident Set Size (RSS). A task may exceed the memory limit until the next periodic accounting sample.
This option applies to job and step allocations.
--mem-bind=[{quiet|verbose},]<type>
Bind tasks to memory. Used only when the task/affinity plugin is enabled and the NUMA memory functions are available. Note that the resolution of CPU and memory binding may differ on some architectures. For example, CPU binding may be performed at the level of the cores within a processor while memory binding will be performed at the level of nodes, where the definition of "nodes" may differ from system to system. By default no memory binding is performed; any task using any CPU can use any memory. This option is typically used to ensure that each task is bound to the memory closest to its assigned CPU. The use of any type other than "none" or "local" is not recommended. If you want greater control, try running a simple test code with the options "--cpu-bind=verbose,none --mem-bind=verbose,none" to determine the specific configuration.
NOTE: To have Slurm always report on the selected memory binding for all commands executed in a shell, you can enable verbose mode by setting the SLURM_MEM_BIND environment variable value to "verbose".
The following informational environment variables are set when --mem-bind is in use:
See the ENVIRONMENT VARIABLES section for a more detailed description of the individual SLURM_MEM_BIND* variables.
Supported options include:
    help
        show this help message
    local
        Use memory local to the processor in use
    map_mem:<list>
        Bind by setting memory masks on tasks (or ranks) as specified, where <list> is <numa_id_for_task_0>,<numa_id_for_task_1>,... The mapping is specified for a node and identical mapping is applied to the tasks on every node (i.e. the lowest task ID on each node is mapped to the first ID specified in the list, etc.). NUMA IDs are interpreted as decimal values unless they are preceded with '0x', in which case they are interpreted as hexadecimal values. If the number of tasks (or ranks) exceeds the number of elements in this list, elements in the list will be reused as needed starting from the beginning of the list. To simplify support for large task counts, the lists may follow a map with an asterisk and repetition count. For example "map_mem:0x0f*4,0xf0*4". For predictable binding results, all CPUs for each node in the job should be allocated to the job.
    mask_mem:<list>
        Bind by setting memory masks on tasks (or ranks) as specified, where <list> is <numa_mask_for_task_0>,<numa_mask_for_task_1>,... The mapping is specified for a node and identical mapping is applied to the tasks on every node (i.e. the lowest task ID on each node is mapped to the first mask specified in the list, etc.). NUMA masks are always interpreted as hexadecimal values. Note that masks must be preceded with a '0x' if they don't begin with [0-9] so they are seen as numerical values. If the number of tasks (or ranks) exceeds the number of elements in this list, elements in the list will be reused as needed starting from the beginning of the list. To simplify support for large task counts, the lists may follow a mask with an asterisk and repetition count. For example "mask_mem:0*4,1*4". For predictable binding results, all CPUs for each node in the job should be allocated to the job.
    no[ne]
        don't bind tasks to memory (default)
    nosort
        avoid sorting free cache pages (default, LaunchParameters configuration parameter can override this default)
    p[refer]
        Prefer use of first specified NUMA node, but permit use of other available NUMA nodes.
    q[uiet]
        quietly bind before task runs (default)
    rank
        bind by task rank (not recommended)
    sort
        sort free cache pages (run zonesort on Intel KNL nodes)
    v[erbose]
        verbosely report binding before task runs
This option applies to job and step allocations.
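The list rules for map_mem and mask_mem (repetition counts with an asterisk, decimal versus hexadecimal interpretation, and reuse of the list from the beginning) can be sketched in Python as follows. The function names are hypothetical; the sketch only mirrors the rules stated above and does not perform any actual binding.

```python
def expand_bind_list(spec, as_mask=False):
    """Expand a map_mem/mask_mem list such as '0x0f*4,0xf0*4' into integers.

    as_mask=False: values are decimal unless prefixed with '0x' (map_mem rule).
    as_mask=True:  values are always read as hexadecimal (mask_mem rule).
    """
    values = []
    for element in spec.split(","):
        text, sep, repeat = element.strip().partition("*")
        count = int(repeat) if sep else 1
        if as_mask or text.lower().startswith("0x"):
            value = int(text, 16)  # int() accepts an optional '0x' prefix in base 16
        else:
            value = int(text)
        values.extend([value] * count)
    return values


def binding_for_task(values, local_task_index):
    """Pick the list element for a task, wrapping around when tasks outnumber elements."""
    return values[local_task_index % len(values)]
```

For the documented example "mask_mem:0*4,1*4", the first four tasks on each node receive mask 0x0 and the next four receive mask 0x1; a ninth task would wrap around to the first element again.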
--mem-per-cpu=<size>[units]
Minimum memory required per usable allocated CPU. Default units are megabytes. Different units can be specified using the suffix [K|M|G|T]. The default value is DefMemPerCPU and the maximum value is MaxMemPerCPU (see exception below). If configured, both parameters can be seen using the scontrol show config command. Note that if the job's --mem-per-cpu value exceeds the configured MaxMemPerCPU, then the user's limit will be treated as a memory limit per task; --mem-per-cpu will be reduced to a value no larger than MaxMemPerCPU; --cpus-per-task will be set and the value of --cpus-per-task multiplied by the new --mem-per-cpu value will equal the original --mem-per-cpu value specified by the user. This parameter would generally be used if individual processors are allocated to jobs (SelectType=select/cons_res). If resources are allocated by core, socket, or whole nodes, then the number of CPUs allocated to a job may be higher than the task count and the value of --mem-per-cpu should be adjusted accordingly. Specifying a memory limit of zero for a job step will restrict the job step to the amount of memory allocated to the job, but not remove any of the job's memory allocation from being available to other job steps. Also see --mem and --mem-per-gpu. The --mem, --mem-per-cpu and --mem-per-gpu options are mutually exclusive.
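The adjustment made when --mem-per-cpu exceeds MaxMemPerCPU is easier to follow with numbers. The Python sketch below shows one way to satisfy the stated relationship (new --mem-per-cpu no larger than MaxMemPerCPU, and --cpus-per-task times the new value equal to the original request). It is an arithmetic illustration under assumptions: the function name is hypothetical, the job is assumed not to have set --cpus-per-task itself, and the text does not say how Slurm rounds when the request is not evenly divisible.

```python
def adjust_mem_per_cpu(requested_mb, max_mem_per_cpu_mb):
    """Return (mem_per_cpu_mb, cpus_per_task) after applying the MaxMemPerCPU rule."""
    if requested_mb <= max_mem_per_cpu_mb:
        # Within the limit: nothing changes (one CPU per task is assumed here).
        return requested_mb, 1
    # Smallest CPU count whose per-CPU share fits under the maximum (integer ceiling).
    cpus_per_task = -(-requested_mb // max_mem_per_cpu_mb)
    # Spread the original per-task request evenly over those CPUs.
    mem_per_cpu = requested_mb / cpus_per_task
    return mem_per_cpu, cpus_per_task


mem, cpus = adjust_mem_per_cpu(8192, 2048)
```

With a hypothetical MaxMemPerCPU of 2048 MB, a request of --mem-per-cpu=8192 becomes 2048 MB per CPU with four CPUs per task, so each task still has 8192 MB available.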
NOTE: If the final amount of memory requested by a job can't be satisfied by any of the nodes configured in the partition, the job will be rejected. This could happen if --mem-per-cpu is used with the --exclusive option for a job allocation and --mem-per-cpu times the number of CPUs on a node is greater than the total memory of that node.
NOTE: This applies to usable allocated CPUs in a job allocation. This is important when more than one thread per core is configured. If a job requests --threads-per-core with fewer threads on a core than exist on the core (or --hint=nomultithread which implies --threads-per-core=1), the job will be unable to use those extra threads on the core and those threads will not be included in the memory per CPU calculation. But if the job has access to all threads on the core, those threads will be included in the memory per CPU calculation even if the job did not explicitly request those threads.
In the following examples, each core has two threads.
In this first example, two tasks can run on separate hyperthreads in the same core because --threads-per-core is not used. The third task uses both threads of the second core. The allocated memory per cpu includes all threads:
In this second example, because of --threads-per-core=1, each task is allocated an entire core but is only able to use one thread per core. Allocated CPUs includes all threads on each core. However, allocated memory per cpu includes only the usable thread in each core.
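The two cases described above can be reproduced with a small Python calculation. This is a sketch of the accounting only: the function name is hypothetical, the figure of 100 MB per CPU is an assumed value chosen for illustration, and the model assumes one CPU per task with allocation in whole cores of two threads each.

```python
def allocation_summary(ntasks, mem_per_cpu_mb, threads_on_core=2, threads_per_core=None):
    """Estimate cores, CPUs and memory for a job allocated in whole cores.

    threads_per_core=None means the job may use every thread on a core.
    """
    usable_per_core = threads_per_core or threads_on_core
    cores = -(-ntasks // usable_per_core)   # ceiling division: whole cores only
    allocated_cpus = cores * threads_on_core  # every thread on an allocated core
    usable_cpus = cores * usable_per_core     # threads the job can actually use
    return {
        "cores": cores,
        "allocated_cpus": allocated_cpus,
        "usable_cpus": usable_cpus,
        "memory_mb": usable_cpus * mem_per_cpu_mb,
    }


first = allocation_summary(3, 100)                       # no --threads-per-core
second = allocation_summary(3, 100, threads_per_core=1)  # --threads-per-core=1
```

In the first case three tasks fit on two cores, giving four allocated CPUs and memory counted for all four threads. In the second case three cores (six CPUs) are allocated, but memory is counted only for the three usable threads.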
--mem-per-gpu=<size>[units]
Minimum memory required per allocated GPU. Default units are megabytes. Different units can be specified using the suffix [K|M|G|T]. Default value is DefMemPerGPU and is available on both a global and per partition basis. If configured, the parameters can be seen using the scontrol show config and scontrol show partition commands. Also see --mem. The --mem, --mem-per-cpu and --mem-per-gpu options are mutually exclusive.
--mincpus=<n>
Specify a minimum number of logical cpus/processors per node. This option applies to job allocations.
--mpi=<mpi_type>
Identify the type of MPI to be used. May result in unique initiation procedures.
    cray_shasta
        To enable Cray PMI support. This is for applications built with the Cray Programming Environment. The PMI Control Port can be specified with the --resv-ports option or with the MpiParams=ports=<port range> parameter in your slurm.conf. This plugin does not have support for heterogeneous jobs. Support for cray_shasta is included by default.
    list
        Lists available mpi types to choose from.
    pmi2
        To enable PMI2 support. The PMI2 support in Slurm works only if the MPI implementation supports it, in other words if the MPI has the PMI2 interface implemented. The --mpi=pmi2 will load the library lib/slurm/mpi_pmi2.so which provides the server side functionality but the client side must implement PMI2_Init() and the other interface calls.
    pmix
        To enable PMIx support (https://pmix.github.io). The PMIx support in Slurm can be used to launch parallel applications (e.g. MPI) if it supports PMIx, PMI2 or PMI1. Slurm must be configured with pmix support by passing the "--with-pmix=<PMIx installation path>" option to its "./configure" script.
        At the time of writing PMIx is supported in Open MPI starting from version 2.0. PMIx also supports backward compatibility with PMI1 and PMI2 and can be used if MPI was configured with PMI2/PMI1 support pointing to the PMIx library ("libpmix"). If MPI supports PMI1/PMI2 but doesn't provide the way to point to a specific implementation, a hackish solution leveraging LD_PRELOAD can be used to force "libpmix" usage.
    none
        No special MPI processing. This is the default and works with many other versions of MPI.
This option applies to step allocations.
--msg-timeout=<seconds>
Modify the job launch message timeout. The default value is MessageTimeout in the Slurm configuration file slurm.conf. Changes to this are typically not recommended, but could be useful to diagnose problems. This option applies to job allocations.
--multi-prog
Run a job with different programs and different arguments for each task. In this case, the executable program specified is actually a configuration file specifying the executable and arguments for each task. See MULTIPLE PROGRAM CONFIGURATION below for details on the configuration file contents. This option applies to step allocations.
--network=<type>
Specify information pertaining to the switch or network. The interpretation of type is system dependent. This option is supported when running Slurm on a Cray natively. It is used to request using Network Performance Counters. Only one value per request is valid. All options are case-insensitive. In this configuration supported values include:
    system
        Use the system-wide network performance counters. Only nodes requested will be marked in use for the job allocation. If the job does not fill up the entire system, the rest of the nodes are not able to be used by other jobs using NPC; if idle, their state will appear as PerfCnts. These nodes are still available for other jobs not using NPC.
    blade
        Use the blade network performance counters. Only nodes requested will be marked in use for the job allocation. If the job does not fill up the entire blade(s) allocated to the job, those blade(s) are not able to be used by other jobs using NPC; if idle, their state will appear as PerfCnts. These nodes are still available for other jobs not using NPC.
In all cases the job allocation request must specify the --exclusive option and the step cannot specify the --overlap option. Otherwise the request will be denied.
在所有情况下,作业分配请求都必须指定—— only 选项,而步骤不能指定—— 重叠选项。否则请求将被拒绝。
Also with any of these options steps are not allowed to share blades, so resources would remain idle inside an allocation if the step running on a blade does not take up all the nodes on the blade.
此外,这些选项中的任何一个步骤都不允许共享刀片服务器,因此,如果运行在刀片服务器上的步骤不占用刀片服务器上的所有节点,则分配中的资源将保持空闲状态。
The network option is also available on systems with HPE Slingshot networks. It can be used to override the default network resources allocated for the job step. Multiple values may be specified in a comma-separated list.
网络选项也可用于具有 HPE 弹弓网络的系统。它可用于覆盖为作业步骤分配的默认网络资源。可以在逗号分隔的列表中指定多个值。
def_<rsrc>=<val>Def _ < rsrc > = < val >Per-CPU reserved allocation for this resource.此资源的每个 CPU 保留分配。res_<rsrc>=<val>Res _ < rsrc > = < val >Per-node reserved allocation for this resource. If set, overrides the per-CPU allocation.此资源的每个节点保留分配。如果设置了,则重写每个 CPU 的分配。max_<rsrc>=<val>Max _ < rsrc > = < val >Maximum per-node limit for this resource.此资源的最大每节点限制。depth=<depth>深度 = < 深度 >Multiplier for per-CPU resource allocation. Default is the number of reserved CPUs on the node.每个 CPU 资源分配的乘法器。默认值是节点上保留的 CPU 数量。
The resources that may be requested are:
可请求的资源如下:
txqsTxqsTransmit command queues. The default is 3 per-CPU, maximum 1024 per-node.传输命令队列。默认为每个 CPU 3个,每个节点最多1024个。tgqsTGQsTarget command queues. The default is 2 per-CPU, maximum 512 per-node.目标命令队列。默认为每个 CPU 2个,每个节点最多512个。eqs等式Event queues. The default is 8 per-CPU, maximum 2048 per-node.事件队列。默认为每个 CPU 8个,每个节点最多2048个。ctsCTCounters. The default is 2 per-CPU, maximum 2048 per-node.计数器。默认为每个 CPU 2个,每个节点最多2048个。tlesTlesTrigger list entries. The default is 1 per-CPU, maximum 2048 per-node.触发器列表条目。默认为每个 CPU 1个,每个节点最多2048个。ptes(咒语)Portable table entries. The default is 8 per-CPU, maximum 2048 per-node.便携式表条目。默认为每个 CPU 8个,每个节点最多2048个。lesLesList entries. The default is 134 per-CPU, maximum 65535 per-node.列出条目。默认为每个 CPU 134个,每个节点最多65535个。acsAcAddressing contexts. The default is 4 per-CPU, maximum 1024 per-node.寻址上下文。默认为每个 CPU 4个,每个节点最多1024个。
This option applies to job and step allocations.
此选项适用于作业和步骤分配。
--nice[=adjustment]
Run the job with an adjusted scheduling priority within Slurm. With no adjustment value the scheduling priority is decreased by 100. A negative nice value increases the priority; any other value decreases it. The adjustment range is +/- 2147483645. Only privileged users can specify a negative adjustment.
-Z, --no-allocate
Run the specified tasks on a set of nodes without creating a Slurm "job" in the Slurm queue structure, bypassing the normal resource allocation step. The list of nodes must be specified with the -w, --nodelist option. This is a privileged option only available for the users "SlurmUser" and "root". This option applies to job allocations.
-k, --no-kill[=off]
Do not automatically terminate a job if one of the nodes it has been allocated fails. This option applies to job and step allocations. The job will assume all responsibilities for fault-tolerance. Tasks launched using this option will not be considered terminated (e.g. the -K, --kill-on-bad-exit and -W, --wait options will have no effect upon the job step). The active job step (MPI job) will likely suffer a fatal error, but subsequent job steps may be run if this option is specified.
Specify an optional argument of "off" to disable the effect of the SLURM_NO_KILL environment variable.
The default action is to terminate the job upon node failure.
-F, --nodefile=<node_file>
Much like --nodelist, but the list is contained in a file named node_file. The node names of the list may also span multiple lines in the file. Duplicate node names in the file will be ignored. The order of the node names in the list is not important; the node names will be sorted by Slurm.
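A minimal sketch of preparing such a file follows. The node names and the file path are hypothetical, and the sort -u preview only approximates the de-duplicated, sorted set that Slurm derives; it is not Slurm's own logic.

```shell
# Hypothetical node names, one per line, with a deliberate duplicate.
nodefile="${TMPDIR:-/tmp}/nodes.txt"
printf '%s\n' node03 node01 node03 node02 > "$nodefile"

# Preview the distinct node names in sorted order.
sort -u "$nodefile"

# On a Slurm cluster the file would be passed like this
# (shown as a comment because it needs a running cluster):
#   srun --nodefile="$nodefile" hostname
```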
-w, --nodelist={<node_name_list>|<filename>}
Request a specific list of hosts. The job will contain all of these hosts and possibly additional hosts as needed to satisfy resource requirements. The list may be specified as a comma-separated list of hosts, a range of hosts (host[1-5,7,...] for example), or a filename. The host list will be assumed to be a filename if it contains a "/" character. If you specify a minimum node or processor count larger than can be satisfied by the supplied host list, additional resources will be allocated on other nodes as needed. Rather than repeating a host name multiple times, an asterisk and a repetition count may be appended to a host name. For example "host1,host1" and "host1*2" are equivalent. If the number of tasks is given and a list of requested nodes is also given, the number of nodes used from that list will be reduced to match that of the number of tasks if the number of nodes in the list is greater than the number of tasks. This option applies to job and step allocations.
-N, --nodes=<minnodes>[-maxnodes]
Request that a minimum of minnodes nodes be allocated to this job. A maximum node count may also be specified with maxnodes. If only one number is specified, this is used as both the minimum and maximum node count. The partition's node limits supersede those of the job. If a job's node limits are outside of the range permitted for its associated partition, the job will be left in a PENDING state. This permits possible execution at a later time, when the partition limit is changed. If a job node limit exceeds the number of nodes configured in the partition, the job will be rejected. Note that the environment variable SLURM_JOB_NUM_NODES (and SLURM_NNODES for backwards compatibility) will be set to the count of nodes actually allocated to the job. See the ENVIRONMENT VARIABLES section for more information. If -N is not specified, the default behavior is to allocate enough nodes to satisfy the requested resources as expressed by per-job specification options, e.g. -n, -c and --gpus. The job will be allocated as many nodes as possible within the range specified and without delaying the initiation of the job. If the number of tasks is given and a number of requested nodes is also given, the number of nodes used from that request will be reduced to match that of the number of tasks if the number of nodes in the request is greater than the number of tasks. The node count specification may include a numeric value followed by a suffix of "k" (multiplies numeric value by 1,024) or "m" (multiplies numeric value by 1,048,576).
This option applies to job and step allocations.
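The suffix arithmetic can be checked with a few lines of shell. This is a sketch of the multiplication described above, not Slurm's own parser, and the function name expand_node_count is hypothetical.

```shell
# Expand a node count such as "2k" or "1m" into a plain integer.
expand_node_count() {
    case "$1" in
        *k) echo $(( ${1%k} * 1024 )) ;;      # "k" multiplies by 1,024
        *m) echo $(( ${1%m} * 1048576 )) ;;   # "m" multiplies by 1,048,576
        *)  echo "$1" ;;                      # no suffix: value is unchanged
    esac
}

expand_node_count 2k
expand_node_count 1m
expand_node_count 16
```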
-n, --ntasks=<number>
Specify the number of tasks to run. Request that srun allocate resources for ntasks tasks. The default is one task per node, but note that the --cpus-per-task option will change this default. This option applies to job and step allocations.
--ntasks-per-core=<ntasks>
Request the maximum ntasks be invoked on each core. This option applies to the job allocation, but not to step allocations. Meant to be used with the --ntasks option. Related to --ntasks-per-node except at the core level instead of the node level. Masks will automatically be generated to bind the tasks to specific cores unless --cpu-bind=none is specified. NOTE: This option is not supported when using SelectType=select/linear.
--ntasks-per-gpu=<ntasks>
Request that there are ntasks tasks invoked for every GPU. This option can work in two ways: 1) either specify --ntasks in addition, in which case a type-less GPU specification will be automatically determined to satisfy --ntasks-per-gpu, or 2) specify the GPUs wanted (e.g. via --gpus or --gres) without specifying --ntasks, and the total task count will be automatically determined. The number of CPUs needed will be automatically increased if necessary to allow for any calculated task count. This option will implicitly set --gpu-bind=single:<ntasks>, but that can be overridden with an explicit --gpu-bind specification. This option is not compatible with a node range (i.e. -N<minnodes-maxnodes>). This option is not compatible with --gpus-per-task, --gpus-per-socket, or --ntasks-per-node. This option is not supported unless SelectType=cons_tres is configured (either directly or indirectly on Cray systems).
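For the second mode (GPUs given, --ntasks omitted), the task count follows from the per-GPU figure. The sketch below only reproduces that arithmetic with assumed example values; it does not query Slurm.

```shell
# Assumed example values, e.g. from --gpus=4 and --ntasks-per-gpu=2.
gpus=4
ntasks_per_gpu=2

# ntasks tasks are invoked for every GPU, so the total is the product.
total_tasks=$(( gpus * ntasks_per_gpu ))
echo "total tasks: $total_tasks"
```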
--ntasks-per-node=<ntasks>
Request that ntasks be invoked on each node. If used with the --ntasks option, the --ntasks option will take precedence and the --ntasks-per-node will be treated as a maximum count of tasks per node. Meant to be used with the --nodes option. This is related to --cpus-per-task=ncpus, but does not require knowledge of the actual number of cpus on each node. In some cases, it is more convenient to be able to request that no more than a specific number of tasks be invoked on each node. Examples of this include submitting a hybrid MPI/OpenMP app where only one MPI "task/rank" should be assigned to each node while allowing the OpenMP portion to utilize all of the parallelism present in the node, or submitting a single setup/cleanup/monitoring job to each node of a pre-existing allocation as one step in a larger job script. This option applies to job allocations.
--ntasks-per-socket=<ntasks>
Request the maximum ntasks be invoked on each socket. This option applies to the job allocation, but not to step allocations. Meant to be used with the --ntasks option. Related to --ntasks-per-node except at the socket level instead of the node level. Masks will automatically be generated to bind the tasks to specific sockets unless --cpu-bind=none is specified. NOTE: This option is not supported when using SelectType=select/linear.
--open-mode={append|truncate}
Open the output and error files using append or truncate mode as specified. For heterogeneous job steps the default value is "append". Otherwise the default value is specified by the system configuration parameter JobFileAppend. This option applies to job and step allocations.
-o, --output=<filename_pattern>
Specify the "filename pattern" for stdout redirection. By default in interactive mode, srun collects stdout from all tasks and sends this output via TCP/IP to the attached terminal. With --output stdout may be redirected to a file, to one file per task, or to /dev/null. See section IO Redirection below for the various forms of filename pattern. If the specified file already exists, it will be overwritten.
If --error is not also specified on the command line, both stdout and stderr will be directed to the file specified by --output. This option applies to job and step allocations.
-O, --overcommit
Overcommit resources. This option applies to job and step allocations.
When applied to a job allocation (not including jobs requesting exclusive access to the nodes) the resources are allocated as if only one task per node is requested. This means that the requested number of cpus per task (-c, --cpus-per-task) are allocated per node rather than being multiplied by the number of tasks. Options used to specify the number of tasks per node, socket, core, etc. are ignored.
When applied to job step allocations (the srun command when executed within an existing job allocation), this option can be used to launch more than one task per CPU. Normally, srun will not allocate more than one process per CPU. By specifying --overcommit you are explicitly allowing more than one process per CPU. However no more than MAX_TASKS_PER_NODE tasks are permitted to execute per node. NOTE: MAX_TASKS_PER_NODE is defined in the file slurm.h and is not a variable; it is set at Slurm build time.
--overlap
Specifying --overlap allows steps to share all resources (CPUs, memory, and GRES) with all other steps. A step using this option will overlap all other steps, even those that did not specify --overlap.
By default steps do not share resources with other parallel steps. This option applies to step allocations.
-s, --oversubscribe
The job allocation can over-subscribe resources with other running jobs. The resources to be over-subscribed can be nodes, sockets, cores, and/or hyperthreads depending upon configuration. The default over-subscribe behavior depends on system configuration and the partition's OverSubscribe option takes precedence over the job's option. This option may result in the allocation being granted sooner than if the --oversubscribe option was not set and allow higher system utilization, but application performance will likely suffer due to competition for resources. This option applies to job allocations.
-p, --partition=<partition_names>
Request a specific partition for the resource allocation. If not specified, the default behavior is to allow the Slurm controller to select the default partition as designated by the system administrator. If the job can use more than one partition, specify their names in a comma-separated list and the one offering earliest initiation will be used with no regard given to the partition name ordering (although higher priority partitions will be considered first). When the job is initiated, the name of the partition used will be placed first in the job record partition string. This option applies to job allocations.
--power=<flags>
Comma-separated list of power management plugin options. Currently available flags include: level (all nodes allocated to the job should have identical power caps, may be disabled by the Slurm configuration option PowerParameters=job_no_level). This option applies to job allocations.
--prefer=<list>
Nodes can have features assigned to them by the Slurm administrator. Users can specify which of these features are desired but not required by their job using the prefer option. This option operates independently from --constraint and will override whatever is set there if possible. When scheduling, the features in --prefer are tried first; if a node set isn't available with those features then --constraint is attempted. See --constraint for more information; this option behaves the same way.
-E, --preserve-env
Pass the current values of environment variables SLURM_JOB_NUM_NODES and SLURM_NTASKS through to the executable, rather than computing them from command line parameters. This option applies to job allocations.
--priority=<value>
Request a specific job priority. May be subject to configuration specific constraints. value should either be a numeric value or "TOP" (for highest possible value). Only Slurm operators and administrators can set the priority of a job. This option applies to job allocations only.
--profile={all|none|<type>[,<type>...]}
Enables detailed data collection by the acct_gather_profile plugin. Detailed data are typically time-series that are stored in an HDF5 file for the job or an InfluxDB database depending on the configured plugin. This option applies to job and step allocations.
    All
        All data types are collected. (Cannot be combined with other values.)
    None
        No data types are collected. This is the default. (Cannot be combined with other values.)
Valid type values are:
    Energy
        Energy data is collected.
    Task
        Task (I/O, Memory, ...) data is collected.
    Filesystem
        Filesystem data is collected.
    Network
        Network (InfiniBand) data is collected.
--prolog=<executable>
srun will run executable just before launching the job step. The command line arguments for executable will be the command and arguments of the job step. If executable is "none", then no srun prolog will be run. This parameter overrides the SrunProlog parameter in slurm.conf. This parameter is completely independent from the Prolog parameter in slurm.conf. This option applies to job allocations.
--propagate[=rlimit[,rlimit...]]
Allows users to specify which of the modifiable (soft) resource limits to propagate to the compute nodes and apply to their jobs. If no rlimit is specified, then all resource limits will be propagated. The following rlimit names are supported by Slurm (although some options may not be supported on some systems):
    ALL
        All limits listed below (default)
    NONE
        No limits listed below
    AS
        The maximum address space (virtual memory) for a process.
    CORE
        The maximum size of core file
    CPU
        The maximum amount of CPU time
    DATA
        The maximum size of a process's data segment
    FSIZE
        The maximum size of files created. Note that if the user sets FSIZE to less than the current size of the slurmd.log, job launches will fail with a 'File size limit exceeded' error.
    MEMLOCK
        The maximum size that may be locked into memory
    NOFILE
        The maximum number of open files
    NPROC
        The maximum number of processes available
    RSS
        The maximum resident set size. Note that this only has effect with Linux kernels 2.4.30 or older or BSD.
    STACK
        The maximum stack size
This option applies to job allocations.
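Because it is the soft limits of the submitting shell that are propagated, it can help to inspect them before launching. The sketch below uses the shell's ulimit builtin for three of the names listed above; the mapping of NOFILE, STACK and CORE to the -n, -s and -c flags is the usual shell convention, and the program name my_app is hypothetical.

```shell
# Show the current soft limits for a few of the rlimit names listed above.
echo "NOFILE (open files): $(ulimit -S -n)"
echo "STACK  (stack size): $(ulimit -S -s)"
echo "CORE   (core size):  $(ulimit -S -c)"

# On a Slurm cluster, only selected limits could then be propagated
# (shown as a comment because it needs a running cluster):
#   srun --propagate=NOFILE,STACK ./my_app
```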
--pty
Execute task zero in pseudo terminal mode. Implicitly sets --unbuffered. Implicitly sets --error and --output to /dev/null for all tasks except task zero, which may cause those tasks to exit immediately (e.g. shells will typically exit immediately in that situation). This option applies to step allocations.
-q, --qos=<qos>
Request a quality of service for the job. QOS values can be defined for each user/cluster/account association in the Slurm database. Users will be limited to their association's defined set of QOS values when the Slurm configuration parameter, AccountingStorageEnforce, includes "qos" in its definition. This option applies to job allocations.
-Q, --quiet
Suppress informational messages from srun. Errors will still be displayed. This option applies to job and step allocations.
--quit-on-interrupt
Quit immediately on single SIGINT (Ctrl-C). Use of this option disables the status feature normally available when srun receives a single Ctrl-C and causes srun to instead immediately terminate the running job. This option applies to step allocations.
--reboot
Force the allocated nodes to reboot before starting the job. This is only supported with some system configurations and will otherwise be silently ignored. Only root, SlurmUser or admins can reboot nodes. This option applies to job allocations.
-r, --relative=<n>
Run a job step relative to node n of the current allocation. This option may be used to spread several job steps out among the nodes of the current job. If -r is used, the current job step will begin at node n of the allocated nodelist, where the first node is considered node 0. The -r option is not permitted with -w or -x option and will result in a fatal error when not running within a prior allocation (i.e. when SLURM_JOB_ID is not set). The default for n is 0. If the value of --nodes exceeds the number of nodes identified with the --relative option, a warning message will be printed and the --relative option will take precedence. This option applies to step allocations.
--reservation=<reservation_names>
Allocate resources for the job from the named reservation. If the job can use more than one reservation, specify their names in a comma-separated list. Each reservation will be considered in the order it was requested. All reservations will be listed in scontrol/squeue through the life of the job. In accounting, the first reservation will be seen, and after the job starts the reservation actually used will replace it.
--resv-ports[=count]
Reserve communication ports for this job. Users can specify the number of ports they want to reserve. The parameter MpiParams=ports=12000-12999 must be specified in slurm.conf. If the number of reserved ports is zero then no ports are reserved. Used for native Cray's PMI only. This option applies to job and step allocations.
--send-libs[=yes|no]
If set to yes (or no argument), autodetect and broadcast the executable's shared object dependencies to allocated compute nodes. The files are placed in a directory alongside the executable. The LD_LIBRARY_PATH is automatically updated to include this cache directory as well. This overrides the default behavior configured in slurm.conf SbcastParameters send_libs. This option only works in conjunction with --bcast. See also --bcast-exclude.
--signal=[R:]<sig_num>[@sig_time]
When a job is within sig_time seconds of its end time, send it the signal sig_num. Due to the resolution of event handling by Slurm, the signal may be sent up to 60 seconds earlier than specified. sig_num may either be a signal number or name (e.g. "10" or "USR1"). sig_time must have an integer value between 0 and 65535. By default, no signal is sent before the job's end time. If a sig_num is specified without any sig_time, the default time will be 60 seconds. This option applies to job allocations. Use the "R:" option to allow this job to overlap with a reservation with MaxStartDelay set. To have the signal sent at preemption time see the preempt_send_user_signal SlurmctldParameter.
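A program that wants to use the warning has to install a handler for the chosen signal. The sketch below shows the idea in shell with USR1, simulating delivery with kill so it can be tried without a cluster. The handler name on_warning and the script name are hypothetical, and it is an assumption here that the signal reaches the process that installed the handler.

```shell
# Handler for the warning signal; a real job would save its state here.
got_warning=0
on_warning() {
    got_warning=1
    echo "warning signal received, checkpointing"
}
trap on_warning USR1

# Simulate the signal that would arrive shortly before the end time.
kill -s USR1 $$
echo "got_warning=$got_warning"

# On a Slurm cluster the warning would be requested like this
# (shown as a comment because it needs a running cluster):
#   srun --signal=USR1@120 ./my_job.sh
```

Since the signal may arrive up to 60 seconds early, the handler should not assume that exactly sig_time seconds remain.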
--slurmd-debug=<level>
Specify a debug level for slurmd(8). The level may be specified either as an integer value between 0 [quiet, only errors are displayed] and 4 [verbose operation] or as one of the SlurmdDebug tags.
    quiet
        Log nothing
    fatal
        Log only fatal errors
    error
        Log only errors
    info
        Log errors and general informational messages
    verbose
        Log errors and verbose informational messages
The slurmd debug information is copied onto the stderr of the job. By default only errors are displayed. This option applies to job and step allocations.
--sockets-per-node=<sockets>
Restrict node selection to nodes with at least the specified number of sockets. See additional information under -B option above when task/affinity plugin is enabled. This option applies to job allocations. NOTE: This option may implicitly impact the number of tasks if -n was not specified.
--spread-job
Spread the job allocation over as many nodes as possible and attempt to evenly distribute tasks across the allocated nodes. This option disables the topology/tree plugin. This option applies to job allocations.
--switches=<count>[@max-time]
When a tree topology is used, this defines the maximum count of leaf switches desired for the job allocation and optionally the maximum time to wait for that number of switches. If Slurm finds an allocation containing more switches than the count specified, the job remains pending until it either finds an allocation with desired switch count or the time limit expires. If there is no switch count limit, there is no delay in starting the job. Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds". The job's maximum time delay may be limited by the system administrator using the SchedulerParameters configuration parameter with the max_switch_wait parameter option. On a dragonfly network the only switch count supported is 1 since communication performance will be highest when a job is allocated resources on one leaf switch or more than 2 leaf switches. The default max-time is the max_switch_wait SchedulerParameters. This option applies to job allocations.
--task-epilog=<executable>
The slurmstepd daemon will run executable just after each task terminates. This will be executed before any TaskEpilog parameter in slurm.conf is executed. This is meant to be a very short-lived program. If it fails to terminate within a few seconds, it will be killed along with any descendant processes. This option applies to step allocations.
--task-prolog=<executable>
The slurmstepd daemon will run executable just before launching each task. This will be executed after any TaskProlog parameter in slurm.conf is executed. Besides the normal environment variables, this has SLURM_TASK_PID available to identify the process ID of the task being started. Standard output from this program of the form "export NAME=value" will be used to set environment variables for the task being spawned. This option applies to step allocations.
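A task prolog can therefore be as small as a script that prints export lines. The sketch below wraps such a body in a function so it can be run on its own; the variable names MY_TASK_PID and MY_SCRATCH and the scratch path are hypothetical, and SLURM_TASK_PID falls back to a placeholder when the code is run outside Slurm.

```shell
# Sketch of a task prolog body, written as a function so it runs standalone.
task_prolog() {
    # Lines of the form "export NAME=value" on standard output are used
    # to set environment variables for the task being spawned.
    echo "export MY_TASK_PID=${SLURM_TASK_PID:-unknown}"
    echo "export MY_SCRATCH=/tmp/scratch"
}

task_prolog
```

In practice the function body would live in an executable script whose path is given to --task-prolog.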
--test-only
Returns an estimate of when a job would be scheduled to run given the current job queue and all the other srun arguments specifying the job. This limits srun's behavior to just return information; no job is actually submitted. The program will be executed directly by the slurmd daemon. This option applies to job allocations.
--thread-spec=<num>
    Count of specialized threads per node reserved by the job for system operations and not used by the application. The application will not use these threads, but will be charged for their allocation. This option can not be used with the --core-spec option. This option applies to job allocations.
    NOTE: Explicitly setting a job's specialized thread value implicitly sets its --exclusive option, reserving entire nodes for the job.
-T, --threads=<nthreads>
    Allows limiting the number of concurrent threads used to send the job request from the srun process to the slurmd processes on the allocated nodes. Default is to use one thread per allocated node up to a maximum of 60 concurrent threads. Specifying this option limits the number of concurrent threads to nthreads (less than or equal to 60). This should only be used to set a low thread count for testing on very small memory computers. This option applies to job allocations.
--threads-per-core=<threads>
    Restrict node selection to nodes with at least the specified number of threads per core. In task layout, use the specified maximum number of threads per core. Implies --cpu-bind=threads unless overridden by command line or environment options. NOTE: "Threads" refers to the number of processing units on each core rather than the number of application tasks to be launched per core. See additional information under -B option above when task/affinity plugin is enabled. This option applies to job and step allocations. NOTE: This option may implicitly impact the number of tasks if -n was not specified.
-t, --time=<time>
    Set a limit on the total run time of the job allocation. If the requested time limit exceeds the partition's time limit, the job will be left in a PENDING state (possibly indefinitely). The default time limit is the partition's default time limit. When the time limit is reached, each task in each job step is sent SIGTERM followed by SIGKILL. The interval between signals is specified by the Slurm configuration parameter KillWait. The OverTimeLimit configuration parameter may permit the job to run longer than scheduled. Time resolution is one minute and second values are rounded up to the next minute.
    A time limit of zero requests that no time limit be imposed. Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds". This option applies to job and step allocations.
--time-min=<time>
    Set a minimum time limit on the job allocation. If specified, the job may have its --time limit lowered to a value no lower than --time-min if doing so permits the job to begin execution earlier than otherwise possible. The job's time limit will not be changed after the job is allocated resources. This is performed by a backfill scheduling algorithm to allocate resources otherwise reserved for higher priority jobs. Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds". This option applies to job allocations.
--tmp=<size>[units]
    Specify a minimum amount of temporary disk space per node. Default units are megabytes. Different units can be specified using the suffix [K|M|G|T]. This option applies to job allocations.
--uid=<user>
    Attempt to submit and/or run a job as user instead of the invoking user id. The invoking user's credentials will be used to check access permissions for the target partition. User root may use this option to run jobs as a normal user in a RootOnly partition for example. If run as root, srun will drop its permissions to the uid specified after node allocation is successful. user may be the user name or numerical user ID. This option applies to job and step allocations.
-u, --unbuffered
    By default, the connection between slurmstepd and the user-launched application is over a pipe. The stdio output written by the application is buffered by glibc until it is flushed or the output is set as unbuffered. See setbuf(3). If this option is specified the tasks are executed with a pseudo terminal so that the application output is unbuffered. This option applies to step allocations.
--usage
    Display brief help message and exit.
--use-min-nodes
    If a range of node counts is given, prefer the smaller count.
-v, --verbose
    Increase the verbosity of srun's informational messages. Multiple -v's will further increase srun's verbosity. By default only errors will be displayed. This option applies to job and step allocations.
-V, --version
    Display version information and exit.
-W, --wait=<seconds>
    Specify how long to wait after the first task terminates before terminating all remaining tasks. A value of 0 indicates an unlimited wait (a warning will be issued after 60 seconds). The default value is set by the WaitTime parameter in the slurm configuration file (see slurm.conf(5)). This option can be useful to ensure that a job is terminated in a timely fashion in the event that one or more tasks terminate prematurely. Note: The -K, --kill-on-bad-exit option takes precedence over -W, --wait to terminate the job immediately if a task exits with a non-zero exit code. This option applies to job allocations.
--wckey=<wckey>
    Specify wckey to be used with job. If TrackWCKey=no (default) in the slurm.conf this value is ignored. This option applies to job allocations.
--x11[={all|first|last}]
    Sets up X11 forwarding on "all", "first" or "last" node(s) of the allocation. This option is only enabled if Slurm was compiled with X11 support and PrologFlags=x11 is defined in the slurm.conf. Default is "all".
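Several of the options above (--switches, --time, --time-min) accept the same set of time formats. The following Python sketch shows one way those formats could be interpreted. The function names slurm_time_to_seconds and slurm_time_to_minutes are hypothetical helpers written for illustration and are not part of Slurm; the rounding step follows the statement under --time that second values are rounded up to the next minute.

```python
def slurm_time_to_seconds(spec: str) -> int:
    """Convert a time string in one of the documented formats to seconds.

    Hypothetical helper for illustration; not Slurm's own parser.
    """
    days = 0
    if "-" in spec:
        # days-hours, days-hours:minutes, days-hours:minutes:seconds
        day_part, rest = spec.split("-", 1)
        days = int(day_part)
        fields = [int(f) for f in rest.split(":")]
        if len(fields) > 3:
            raise ValueError(f"too many fields in {spec!r}")
        fields += [0] * (3 - len(fields))
        hours, minutes, seconds = fields
    else:
        fields = [int(f) for f in spec.split(":")]
        if len(fields) == 1:
            # minutes
            hours, minutes, seconds = 0, fields[0], 0
        elif len(fields) == 2:
            # minutes:seconds
            hours = 0
            minutes, seconds = fields
        elif len(fields) == 3:
            # hours:minutes:seconds
            hours, minutes, seconds = fields
        else:
            raise ValueError(f"too many fields in {spec!r}")
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def slurm_time_to_minutes(spec: str) -> int:
    """Round up to whole minutes (one-minute resolution, as stated for --time)."""
    return -(-slurm_time_to_seconds(spec) // 60)
```

For example, slurm_time_to_minutes("1:30") treats the value as one minute and thirty seconds and rounds it up to 2 minutes, while "1-12" is one day and twelve hours. The sketch does not special-case a limit of zero, which --time interprets as "no time limit".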
srun will submit the job request to the slurm job controller, then initiate all processes on the remote nodes. If the request cannot be met immediately, srun will block until the resources are free to run the job. If the -I (--immediate) option is specified srun will terminate if resources are not immediately available.
When initiating remote processes srun will propagate the current working directory, unless --chdir=<path> is specified, in which case path will become the working directory for the remote processes.
The -n, -c, and -N options control how CPUs and nodes will be allocated to the job. When specifying only the number of processes to run with -n, a default of one CPU per process is allocated. By specifying the number of CPUs required per task (-c), more than one CPU may be allocated per process. If the number of nodes is specified with -N, srun will attempt to allocate at least the number of nodes specified.
Combinations of the above three options may be used to change how processes are distributed across nodes and CPUs. For instance, by specifying both the number of processes and number of nodes on which to run, the number of processes per node is implied. However, if the number of CPUs per process is more important, then the number of processes (-n) and the number of CPUs per process (-c) should be specified.
srun will refuse to allocate more than one process per CPU unless --overcommit (-O) is also specified.
srun will attempt to meet the above specifications "at a minimum." That is, if 16 nodes are requested for 32 processes, and some nodes do not have 2 CPUs, the allocation of nodes will be increased in order to meet the demand for CPUs. In other words, a minimum of 16 nodes are being requested. However, if 16 nodes are requested for 15 processes, srun will consider this an error, as 15 processes cannot run across 16 nodes.
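The arithmetic behind the -n, -c and -N paragraphs above can be written down as a small Python sketch. It only models the rules stated in the text (one CPU per process by default, more with -c, and an error when more nodes than processes are requested). It is not Slurm's allocation logic, and the function name implied_request and its result keys are hypothetical.

```python
from math import ceil
from typing import Optional


def implied_request(ntasks: int,
                    nodes: Optional[int] = None,
                    cpus_per_task: int = 1) -> dict:
    """Return the minimum resources implied by -n, -N and -c.

    Hypothetical model of the rules in the text, not Slurm's scheduler.
    """
    if nodes is not None and nodes > ntasks:
        # e.g. 16 nodes for 15 processes: the processes cannot span all nodes
        raise ValueError(
            f"{ntasks} processes cannot run across {nodes} nodes")
    result = {"min_cpus": ntasks * cpus_per_task}
    if nodes is not None:
        # the node count is a minimum; srun may allocate more nodes
        result["min_nodes"] = nodes
        # processes per node if tasks are spread evenly over exactly `nodes`
        result["tasks_per_node"] = ceil(ntasks / nodes)
    return result
```

With 32 processes on 16 nodes the sketch reports 2 processes per node and at least 32 CPUs; if some nodes have fewer than 2 CPUs, srun would enlarge the node allocation as described above, which this model does not attempt to reproduce.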
IO Redirection
By default, stdout and stderr will be redirected from all tasks to the stdout and stderr of srun, and stdin will be redirected from the standard input of srun to all remote tasks. If stdin is only to be read by a subset of the spawned tasks, specifying a file to read from rather than forwarding stdin from the srun command may be preferable as it avoids moving and storing data that will never be read.
For OS X, the poll() function does not support stdin, so input from a terminal is not possible.
This behavior may be changed with the --output, --error, and --input (-o, -e, -i) options. Valid format specifications for these options are
all
    stdout and stderr are redirected from all tasks to srun. stdin is broadcast to all remote tasks. (This is the default behavior.)
none
    stdout and stderr are not received from any task. stdin is not sent to any task (stdin is closed).
taskid
    stdout and/or stderr are redirected from only the task with relative id equal to taskid, where 0 <= taskid < ntasks, where ntasks is the total number of tasks in the current job step. stdin is redirected from the stdin of srun to this same task. This file will be written on the node executing the task.
filename
    srun will redirect stdout and/or stderr to the named file from all tasks. stdin will be redirected from the named file and broadcast to all tasks in the job. filename refers to a path on the host that runs srun. Depending on the cluster's file system layout, this may result in the output appearing in different places depending on whether the job is run in batch mode.
filename pattern
    srun allows for a filename pattern to be used to generate the named IO file described above. The following list of format specifiers may be used in the format string to generate a filename that will be unique to a given jobid, stepid, node, or task. In each case, the appropriate number of files are opened and associated with the corresponding tasks. Note that any format string containing %t, %n, and/or %N will be written on the node executing the task rather than the node where srun executes. These format specifiers are not supported on a BGQ system.
    \\
        Do not process any of the replacement symbols.
    %%
        The character "%".
    %A
        Job array's master job allocation number.
    %a
        Job array ID (index) number.
    %J
        jobid.stepid of the running job. (e.g. "128.0")
    %j
        jobid of the running job.
    %s
        stepid of the running job.
    %N
        short hostname. This will create a separate IO file per node.
    %n
        Node identifier relative to current job (e.g. "0" is the first node of the running job). This will create a separate IO file per node.
    %t
        task identifier (rank) relative to current job. This will create a separate IO file per task.
    %u
        User name.
    %x
        Job name.
A number placed between the percent character and format specifier may be used to zero-pad the result in the IO filename. This number is ignored if the format specifier corresponds to non-numeric data (%N for example).
Some examples of how the format string may be used for a 4 task job step with a Job ID of 128 and step id of 0 are included below:
job%J.out
    job128.0.out
job%4j.out
    job0128.out
job%j-%2t.out
    job128-00.out, job128-01.out, ...
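The expansion of these format strings can be imitated with a short Python sketch. This is an illustration of the documented behavior, not Slurm's implementation: the function expand_io_pattern and the values dictionary are hypothetical, the "\\" escape is not handled, and the sketch assumes that zero-padding applies only to values supplied as integers (so "128.0" for %J and hostnames are left unpadded).

```python
import re

# One "%", an optional zero-pad width, then a documented specifier letter.
_SPEC = re.compile(r"%(\d*)([%AaJjsNntux])")


def expand_io_pattern(pattern: str, values: dict) -> str:
    """Expand srun-style IO filename specifiers using caller-supplied values.

    Hypothetical helper; `values` maps specifier letters to their values,
    e.g. {"j": 128, "s": 0, "J": "128.0", "t": 3}.
    """
    def replace(match: re.Match) -> str:
        width, letter = match.group(1), match.group(2)
        if letter == "%":
            return "%"
        value = values[letter]
        # The pad width is ignored for non-numeric data such as %N.
        if width and isinstance(value, int):
            return str(value).zfill(int(width))
        return str(value)

    return _SPEC.sub(replace, pattern)


job = {"j": 128, "s": 0, "J": "128.0"}
names = [expand_io_pattern("job%j-%2t.out", {**job, "t": t}) for t in range(4)]
```

For the 4-task step with job ID 128 and step ID 0 used in the examples above, names holds one filename per task, from job128-00.out to job128-03.out.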
PERFORMANCE
Executing srun sends a remote procedure call to slurmctld. If enough calls from srun or other Slurm client commands that send remote procedure calls to the slurmctld daemon come in at once, it can result in a degradation of performance of the slurmctld daemon, possibly resulting in a denial of service.
Do not run srun or other Slurm client commands that send remote procedure calls to slurmctld from loops in shell scripts or other programs. Ensure that programs limit calls to srun to the minimum necessary for the information you are trying to gather.
INPUT ENVIRONMENT VARIABLES
Upon startup, srun will read and handle the options set in the following environment variables. The majority of these variables are set the same way the options are set, as defined above. For flag options that are defined to expect no argument, the option can be enabled by setting the environment variable without a value (empty or NULL string), the string 'yes', or a non-zero number. Any other value for the environment variable will result in the option not being set. There are a couple exceptions to these rules that are noted below. NOTE: Command line options always override environment variable settings.
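The rule for flag options that take no argument can be summarized in a few lines of Python. This is a sketch of the rule as worded above, with assumptions labeled: the helper name flag_env_enabled is hypothetical, the comparison with 'yes' is assumed to be exact, and "number" is assumed to mean an integer. It does not cover the exceptions noted for individual variables, such as those that must be set to 0 or 1.

```python
from typing import Optional


def flag_env_enabled(value: Optional[str]) -> bool:
    """Decide whether a no-argument flag option is enabled by its variable.

    Hypothetical helper; `value` is the variable's value, or None if unset.
    """
    if value is None:
        return False          # variable not set at all
    if value == "" or value == "yes":
        return True           # empty string or the string 'yes'
    try:
        return int(value) != 0  # any non-zero number enables the option
    except ValueError:
        return False          # any other value leaves the option unset
```

A caller could apply it as flag_env_enabled(os.environ.get("SLURM_OVERCOMMIT")) after importing os. Command line options would still override the result, as the note above says.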
PMI_FANOUT
    This is used exclusively with PMI (MPICH2 and MVAPICH2) and controls the fanout of data communications. The srun command sends messages to application programs (via the PMI library) and those applications may be called upon to forward that data to up to this number of additional tasks. Higher values offload work from the srun command to the applications and likely increase the vulnerability to failures. The default value is 32.
PMI_FANOUT_OFF_HOST
    This is used exclusively with PMI (MPICH2 and MVAPICH2) and controls the fanout of data communications. The srun command sends messages to application programs (via the PMI library) and those applications may be called upon to forward that data to additional tasks. By default, srun sends one message per host and one task on that host forwards the data to other tasks on that host up to PMI_FANOUT. If PMI_FANOUT_OFF_HOST is defined, the user task may be required to forward the data to tasks on other hosts. Setting PMI_FANOUT_OFF_HOST may increase performance. Since more work is performed by the PMI library loaded by the user application, failures also can be more common and more difficult to diagnose. Should be disabled/enabled by setting to 0 or 1.
PMI_TIME
    This is used exclusively with PMI (MPICH2 and MVAPICH2) and controls how much the communications from the tasks to the srun are spread out in time in order to avoid overwhelming the srun command with work. The default value is 500 (microseconds) per task. On relatively slow processors or systems with very large processor counts (and large PMI data sets), higher values may be required.
SLURM_ACCOUNT
    Same as -A, --account
SLURM_ACCTG_FREQ
    Same as --acctg-freq
SLURM_BCAST
    Same as --bcast
SLURM_BCAST_EXCLUDE
    Same as --bcast-exclude
SLURM_BURST_BUFFER
    Same as --bb
SLURM_CLUSTERS
    Same as -M, --clusters
SLURM_COMPRESS
    Same as --compress
SLURM_CONF
    The location of the Slurm configuration file.
SLURM_CONSTRAINT
    Same as -C, --constraint
SLURM_CORE_SPEC
    Same as --core-spec
SLURM_CPU_BIND
    Same as --cpu-bind
SLURM_CPU_FREQ_REQ
    Same as --cpu-freq.
SLURM_CPUS_PER_GPU
    Same as --cpus-per-gpu
SRUN_CPUS_PER_TASK
    Same as -c, --cpus-per-task
SLURM_DEBUG
    Same as -v, --verbose. Must be set to 0 or 1 to disable or enable the option.
SLURM_DEBUG_FLAGS
    Specify debug flags for srun to use. See DebugFlags in the slurm.conf(5) man page for a full list of flags. The environment variable takes precedence over the setting in the slurm.conf.
SLURM_DELAY_BOOT
    Same as --delay-boot
SLURM_DEPENDENCY
    Same as -d, --dependency=<jobid>
SLURM_DISABLE_STATUS
    Same as -X, --disable-status
SLURM_DIST_PLANESIZE
    Plane distribution size. Only used if --distribution=plane, without =<size>, is set.
SLURM_DISTRIBUTION
    Same as -m, --distribution
SLURM_EPILOG
    Same as --epilog
SLURM_EXACT
    Same as --exact
SLURM_EXCLUSIVE
    Same as --exclusive
SLURM_EXIT_ERROR
    Specifies the exit code generated when a Slurm error occurs (e.g. invalid options). This can be used by a script to distinguish application exit codes from various Slurm error conditions. Also see SLURM_EXIT_IMMEDIATE.
SLURM_EXIT_IMMEDIATE
    Specifies the exit code generated when the --immediate option is used and resources are not currently available. This can be used by a script to distinguish application exit codes from various Slurm error conditions. Also see SLURM_EXIT_ERROR.
SLURM_EXPORT_ENV
    Same as --export
SLURM_GPU_BIND
    Same as --gpu-bind
SLURM_GPU_FREQ
    Same as --gpu-freq
SLURM_GPUS
    Same as -G, --gpus
SLURM_GPUS_PER_NODE
    Same as --gpus-per-node
SLURM_GPUS_PER_TASK
    Same as --gpus-per-task
SLURM_GRES
    Same as --gres. Also see SLURM_STEP_GRES
SLURM_GRES_FLAGS
    Same as --gres-flags
SLURM_HINT
    Same as --hint
SLURM_IMMEDIATE
    Same as -I, --immediate
SLURM_JOB_ID
    Same as --jobid
SLURM_JOB_NAME
    Same as -J, --job-name except within an existing allocation, in which case it is ignored to avoid using the batch job's name as the name of each job step.
SLURM_JOB_NUM_NODES
    Same as -N, --nodes. Total number of nodes in the job's resource allocation.
SLURM_KILL_BAD_EXIT
    Same as -K, --kill-on-bad-exit. Must be set to 0 or 1 to disable or enable the option.
SLURM_LABELIO
    Same as -l, --label
SLURM_MEM_BIND
    Same as --mem-bind
SLURM_MEM_PER_CPU
    Same as --mem-per-cpu
SLURM_MEM_PER_GPU
    Same as --mem-per-gpu
SLURM_MEM_PER_NODE
    Same as --mem
SLURM_MPI_TYPE
    Same as --mpi
SLURM_NETWORK
    Same as --network
SLURM_NNODES
    Same as -N, --nodes. Total number of nodes in the job's resource allocation. See SLURM_JOB_NUM_NODES. Included for backwards compatibility.
SLURM_NO_KILL
    Same as -k, --no-kill
SLURM_NPROCS
    Same as -n, --ntasks. See SLURM_NTASKS. Included for backwards compatibility.
SLURM_NTASKS
    Same as -n, --ntasks
SLURM_NTASKS_PER_CORE
    Same as --ntasks-per-core
SLURM_NTASKS_PER_GPU
    Same as --ntasks-per-gpu
SLURM_NTASKS_PER_NODE
    Same as --ntasks-per-node
SLURM_NTASKS_PER_SOCKET
    Same as --ntasks-per-socket
SLURM_OPEN_MODE
    Same as --open-mode
SLURM_OVERCOMMIT
    Same as -O, --overcommit
SLURM_OVERLAP
    Same as --overlap
SLURM_PARTITION
    Same as -p, --partition
SLURM_PMI_KVS_NO_DUP_KEYS
    If set, then PMI key-pairs will contain no duplicate keys. MPI can use this variable to inform the PMI library that it will not use duplicate keys so PMI can skip the check for duplicate keys. This is the case for MPICH2 and reduces overhead in testing for duplicates for improved performance.
SLURM_POWER
    Same as --power
SLURM_PROFILE
    Same as --profile
SLURM_PROLOG
    Same as --prolog
SLURM_QOS
    Same as --qos
SLURM_REMOTE_CWD
    Same as -D, --chdir=
SLURM_REQ_SWITCH
    When a tree topology is used, this defines the maximum count of switches desired for the job allocation and optionally the maximum time to wait for that number of switches. See --switches
SLURM_RESERVATION
    Same as --reservation
SLURM_RESV_PORTS
    Same as --resv-ports
SLURM_SEND_LIBS
    Same as --send-libs
SLURM_SIGNAL
    Same as --signal
SLURM_SPREAD_JOB
    Same as --spread-job
SLURM_SRUN_REDUCE_TASK_EXIT_MSG
    If set and non-zero, successive task exit messages with the same exit code will be printed only once.
SLURM_STDERRMODE
    Same as -e, --error
SLURM_STDINMODE
    Same as -i, --input
SLURM_STDOUTMODE
    Same as -o, --output
SLURM_STEP_GRES
    Same as --gres (only applies to job steps, not to job allocations). Also see SLURM_GRES
SLURM_STEP_KILLED_MSG_NODE_ID=ID
    If set, only the specified node will log when the job or step is killed by a signal.
SLURM_TASK_EPILOG
    Same as --task-epilog
SLURM_TASK_PROLOG
    Same as --task-prolog
SLURM_TEST_EXEC
    If defined, srun will verify existence of the executable program along with user execute permission on the node where srun was called before attempting to launch it on nodes in the step.
SLURM_THREAD_SPEC
    Same as --thread-spec
SLURM_THREADS
    Same as -T, --threads
SLURM_THREADS_PER_CORE
    Same as --threads-per-core
SLURM_TIMELIMIT
    Same as -t, --time
SLURM_UMASK
    If defined, Slurm will use the defined umask to set permissions when creating the output/error files for the job.
SLURM_UNBUFFEREDIO
    Same as -u, --unbuffered
SLURM_USE_MIN_NODES
    Same as --use-min-nodes
SLURM_WAIT
    Same as -W, --wait
SLURM_WAIT4SWITCH
    Max time waiting for requested switches. See --switches
SLURM_WCKEY
    Same as --wckey
SLURM_WORKING_DIR
    Same as -D, --chdir
SLURMD_DEBUG
    Same as -d, --slurmd-debug. Must be set to 0 or 1 to disable or enable the option.
SRUN_CONTAINER
    Same as --container.
SRUN_EXPORT_ENV
    Same as --export, and will override any setting for SLURM_EXPORT_ENV.
OUTPUT ENVIRONMENT VARIABLES
srun will set some environment variables in the environment of the executing tasks on the remote compute nodes. These environment variables are:
SLURM_*_HET_GROUP_#
    For a heterogeneous job allocation, the environment variables are set separately for each component.
SLURM_CLUSTER_NAME
    Name of the cluster on which the job is executing.
SLURM_CPU_BIND_LIST
    --cpu-bind map or mask list (list of Slurm CPU IDs or masks for this node, CPU_ID = Board_ID x threads_per_board + Socket_ID x threads_per_socket + Core_ID x threads_per_core + Thread_ID).
SLURM_CPU_BIND_TYPE
    --cpu-bind type (none,rank,map_cpu:,mask_cpu:).
SLURM_CPU_BIND_VERBOSE
    --cpu-bind verbosity (quiet,verbose).
SLURM_CPU_FREQ_REQ
    Contains the value requested for cpu frequency on the srun command as a numerical frequency in kilohertz, or a coded value for a request of low, medium, highm1 or high for the frequency. See the description of the --cpu-freq option or the SLURM_CPU_FREQ_REQ input environment variable.
SLURM_CPUS_ON_NODE
    Number of CPUs available to the step on this node. NOTE: The select/linear plugin allocates entire nodes to jobs, so the value indicates the total count of CPUs on the node. For the select/cons_res and select/cons_tres plugins, this number indicates the number of CPUs on this node allocated to the step.
SLURM_CPUS_PER_TASK
    Number of cpus requested per task. Only set if the --cpus-per-task option is specified.
SLURM_DISTRIBUTION
    Distribution type for the allocated jobs. Set the distribution with -m, --distribution.
SLURM_GPUS_ON_NODE
    Number of GPUs available to the step on this node.
SLURM_GTIDS
    Global task IDs running on this node. Zero origin and comma separated. It is read internally by pmi if Slurm was built with pmi support. Leaving the variable set may cause problems when using external packages from within the job (Abaqus and Ansys have been known to have problems when it is set - consult the appropriate documentation for 3rd party software).
SLURM_HET_SIZE
    Set to count of components in heterogeneous job.
SLURM_JOB_ACCOUNT
    Account name associated with the job allocation.
SLURM_JOB_CPUS_PER_NODE
    Count of CPUs available to the job on the nodes in the allocation, using the format CPU_count[(xnumber_of_nodes)][,CPU_count [(xnumber_of_nodes)] ...]. For example: SLURM_JOB_CPUS_PER_NODE='72(x2),36' indicates that on the first and second nodes (as listed by SLURM_JOB_NODELIST) the allocation has 72 CPUs, while the third node has 36 CPUs. NOTE: The select/linear plugin allocates entire nodes to jobs, so the value indicates the total count of CPUs on allocated nodes. The select/cons_res and select/cons_tres plugins allocate individual CPUs to jobs, so this number indicates the number of CPUs allocated to the job.
SLURM_JOB_DEPENDENCY
    Set to value of the --dependency option.
SLURM_JOB_GPUS
    The global GPU IDs of the GPUs allocated to this job. The GPU IDs are not relative to any device cgroup, even if devices are constrained with task/cgroup. Only set in batch and interactive jobs.
SLURM_JOB_ID
    Job id of the executing job.
SLURM_JOB_NAME
    Set to the value of the --job-name option or the command name when srun is used to create a new job allocation. Not set when srun is used only to create a job step (i.e. within an existing job allocation).
SLURM_JOB_NODELIST
    List of nodes allocated to the job.
SLURM_JOB_NODES
    Total number of nodes in the job's resource allocation.
SLURM_JOB_PARTITION
    Name of the partition in which the job is running.
SLURM_JOB_QOS
    Quality Of Service (QOS) of the job allocation.
SLURM_JOB_RESERVATION
    Advanced reservation containing the job allocation, if any.
SLURM_JOBID
    Job id of the executing job. See SLURM_JOB_ID. Included for backwards compatibility.
SLURM_LAUNCH_NODE_IPADDR
    IP address of the node from which the task launch was initiated (where the srun command ran from).
SLURM_LOCALID
    Node local task ID for the process within a job.
SLURM_MEM_BIND_LIST
    --mem-bind map or mask list (<list of IDs or masks for this node>).
SLURM_MEM_BIND_PREFER
    --mem-bind prefer (prefer).
SLURM_MEM_BIND_SORT
    Sort free cache pages (run zonesort on Intel KNL nodes).
SLURM_MEM_BIND_TYPE
    --mem-bind type (none,rank,map_mem:,mask_mem:).
SLURM_MEM_BIND_VERBOSE
    --mem-bind verbosity (quiet,verbose).
SLURM_NODE_ALIASES
    Sets of node name, communication address and hostname for nodes allocated to the job from the cloud. Each element in the set is colon separated and each set is comma separated. For example: SLURM_NODE_ALIASES=ec0:1.2.3.4:foo,ec1:1.2.3.5:bar
SLURM_NODEID
    The relative node ID of the current node.
SLURM_NPROCS
    Total number of processes in the current job or job step. See SLURM_NTASKS. Included for backwards compatibility.
SLURM_NTASKS
    Total number of processes in the current job or job step.
SLURM_OVERCOMMIT
    Set to 1 if --overcommit was specified.
SLURM_PRIO_PROCESS
    The scheduling priority (nice value) at the time of job submission. This value is propagated to the spawned processes.
SLURM_PROCID
    The MPI rank (or relative process ID) of the current process.
SLURM_SRUN_COMM_HOST
    IP address of srun communication host.
SLURM_SRUN_COMM_PORT
    srun communication port.
SLURM_CONTAINER
    OCI Bundle for job. Only set if --container is specified.
SLURM_SHARDS_ON_NODE
    Number of GPU Shards available to the step on this node.
SLURM_STEP_GPUS
    The global GPU IDs of the GPUs allocated to this step (excluding batch and interactive steps). The GPU IDs are not relative to any device cgroup, even if devices are constrained with task/cgroup.
SLURM_STEP_ID
    The step ID of the current job.
SLURM_STEP_LAUNCHER_PORT
    Step launcher port.
SLURM_STEP_NODELIST
    List of nodes allocated to the step.
SLURM_STEP_NUM_NODES
    Number of nodes allocated to the step.
SLURM_STEP_NUM_TASKS
    Number of processes in the job step or whole heterogeneous job step.
SLURM_STEP_TASKS_PER_NODE
    Number of processes per node within the step.
SLURM_STEPID
    The step ID of the current job. See SLURM_STEP_ID. Included for backwards compatibility.
SLURM_SUBMIT_DIR
    The directory from which the allocation was invoked.
SLURM_SUBMIT_HOST
    The hostname of the computer from which the allocation was invoked.
SLURM_TASK_PID
    The process ID of the task being started.
SLURM_TASKS_PER_NODE
    Number of tasks to be initiated on each node.
Values are comma separated and in the same order as SLURM_JOB_NODELIST. If two or more consecutive nodes are to have the same task count, that count is followed by "(x#)" where "#" is the repetition count. For example, "SLURM_TASKS_PER_NODE=2(x3),1" indicates that the first three nodes will each execute two tasks and the fourth node will execute one task.每个节点上要启动的任务数。值是以逗号分隔的,并且按照与 SLURM _ JOB _ NODELIST 相同的顺序排列。如果两个或多个连续节点具有相同的任务计数,则该计数后跟“(x #)”,其中“ #”是重复计数。例如,“ SLURM _ TASKS _ PER _ NODE = 2(x3) ,1”表示前三个节点将各自执行两个任务,第四个节点将执行一个任务。SLURM_TOPOLOGY_ADDRTOPOLOGY _ ADDRThis is set only if the system has the topology/tree plugin configured. The value will be set to the names network switches which may be involved in the job's communications from the system's top level switch down to the leaf switch and ending with node name. A period is used to separate each hardware component name.仅当系统配置了拓扑/树插件时才设置此值。该值将被设置为网络交换机的名称,这些交换机可能参与从系统的顶级交换机到叶交换机的作业通信,并以节点名称结束。一个周期用来分隔每个硬件组件的名称。SLURM_TOPOLOGY_ADDR_PATTERN拓扑学 _ ADDR _ PATTERNThis is set only if the system has the topology/tree plugin configured. The value will be set component types listed in SLURM_TOPOLOGY_ADDR. Each component will be identified as either "switch" or "node". A period is used to separate each hardware component type.仅当系统配置了拓扑/树插件时才设置此值。该值将被设置为 SLURM _ TOPOLOGY _ ADDR 中列出的组件类型。每个组件将被标识为“开关”或“节点”。使用句点分隔每个硬件组件类型。SLURM_UMASKUMASKThe umask in effect when the job was submitted.提交作业时有效的 umask。SLURMD_NODENAMESLURMD _ NODENAMEName of the node running the task. In the case of a parallel job executing on multiple compute nodes, the various tasks will have this environment variable set to different values on each compute node.运行任务的节点的名称。对于在多个计算节点上执行的并行作业,不同的任务将把这个环境变量设置为每个计算节点上的不同值。SRUN_DEBUG调试Set to the logging level of the srun command. Default value is 3 (info level). The value is incremented or decremented based upon the --verbose and --quiet options.设置为 srun 命令的日志记录级别。默认值为3(信息级别)。该值根据——减小或增大的选项。
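The compressed per-node notation used by SLURM_TASKS_PER_NODE and SLURM_JOB_CPUS_PER_NODE can be awkward to consume from a script. The following is a minimal Python sketch that expands it into one count per node; the function name expand_counts is hypothetical and is not part of Slurm.

```python
import re


def expand_counts(value):
    """Expand a per-node count string such as '2(x3),1' into [2, 2, 2, 1]."""
    counts = []
    for item in value.split(","):
        # Each element is COUNT or COUNT(xREPEAT), as described for these variables.
        match = re.fullmatch(r"(\d+)(?:\(x(\d+)\))?", item.strip())
        if match is None:
            raise ValueError(f"unrecognized count element: {item!r}")
        count = int(match.group(1))
        repeat = int(match.group(2)) if match.group(2) else 1
        counts.extend([count] * repeat)
    return counts
```

Inside a job, the result of expand_counts(os.environ["SLURM_TASKS_PER_NODE"]) should line up, position by position, with the expanded SLURM_JOB_NODELIST.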
SIGNALS AND ESCAPE SEQUENCES
Signals sent to the srun command are automatically forwarded to the tasks it is controlling, with a few exceptions. The escape sequence <control-c> will report the state of all tasks associated with the srun command. If <control-c> is entered twice within one second, then the associated SIGINT signal will be sent to all tasks and a termination sequence will be entered, sending SIGCONT, SIGTERM, and SIGKILL to all spawned tasks. If a third <control-c> is received, the srun program will be terminated without waiting for remote tasks to exit or their I/O to complete.
The escape sequence <control-z> is presently ignored.
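Because SIGINT and SIGTERM reach the tasks before SIGKILL in the termination sequence described above, a task has a chance to stop cleanly. The following is a minimal Python sketch of a task-side handler. The class and function names are hypothetical, and the loop stands in for real work.

```python
import signal


class GracefulExit:
    """Records that a termination signal arrived so the work loop can stop."""

    def __init__(self):
        self.requested = False
        self.signum = None

    def handle(self, signum, frame):
        # Keep the handler tiny: only record the request.
        self.requested = True
        self.signum = signum


def install(state):
    """Register the handler for the catchable signals (call from the main thread)."""
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, state.handle)


def run_steps(state, steps):
    """Run up to `steps` units of work, stopping early once a signal was seen."""
    done = 0
    for _ in range(steps):
        if state.requested:
            break
        done += 1  # stand-in for one unit of real work
    return done
```

A task would call install(state) once at start-up and save its results as soon as run_steps returns early. SIGKILL cannot be caught, so any cleanup has to finish before it arrives.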
MPI SUPPORT
MPI use depends upon the type of MPI being used. There are three fundamentally different modes of operation used by these various MPI implementations.
1. Slurm directly launches the tasks and performs initialization of communications through the PMI2 or PMIx APIs. For example: "srun -n16 a.out".
2. Slurm creates a resource allocation for the job and then mpirun launches tasks using Slurm's infrastructure (OpenMPI).
3. Slurm creates a resource allocation for the job and then mpirun launches tasks using some mechanism other than Slurm, such as SSH or RSH. These tasks are initiated outside of Slurm's monitoring or control. Slurm's epilog should be configured to purge these tasks when the job's allocation is relinquished; alternatively, the use of pam_slurm_adopt is highly recommended.
See https://slurm.schedmd.com/mpi_guide.html for more information on use of these various MPI implementations with Slurm.
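When Slurm launches the tasks directly (the first mode above), each task can read its position in the job from the output environment variables documented earlier. The following is a minimal Python sketch, not a replacement for PMI2 or PMIx initialization. The function name and the fallback values used outside Slurm are assumptions.

```python
import os


def slurm_task_info(env=None):
    """Collect the task's rank information from Slurm's environment variables."""
    env = os.environ if env is None else env
    return {
        # Fallbacks (assumed) let the same script run outside Slurm as a single task.
        "rank": int(env.get("SLURM_PROCID", 0)),
        "ntasks": int(env.get("SLURM_NTASKS", 1)),
        "local_rank": int(env.get("SLURM_LOCALID", 0)),
        "node_id": int(env.get("SLURM_NODEID", 0)),
    }
```

A script using this helper could, for instance, let only the task with rank 0 print a summary.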
MULTIPLE PROGRAM CONFIGURATION
Comments in the configuration file must have a "#" in column one. The configuration file contains the following fields separated by white space:

Task rank
One or more task ranks to use this configuration. Multiple values may be comma separated. Ranges may be indicated with two numbers separated with a '-' with the smaller number first (e.g. "0-4" and not "4-0"). To indicate all tasks not otherwise specified, specify a rank of '*' as the last line of the file. If an attempt is made to initiate a task for which no executable program is defined, the following error message will be produced: "No executable program specified for this task".

Executable
The name of the program to execute. May be a fully qualified pathname if desired.

Arguments
Program arguments. The expression "%t" will be replaced with the task's number. The expression "%o" will be replaced with the task's offset within this range (e.g. a configured task rank value of "1-5" would have offset values of "0-4"). Single quotes may be used to avoid having the enclosed values interpreted. This field is optional. Any arguments for the program entered on the command line will be added to the arguments specified in the configuration file.

For example:
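The field rules above can be illustrated with a small Python sketch that resolves which command each task rank would run. This is not how srun itself is implemented, and it rests on several assumptions: the names are hypothetical, the first line that claims a rank wins, "%o" counts positions across the whole rank field, and single-quote handling is left out. The sample configuration is invented for illustration.

```python
EXAMPLE_CONFIG = """\
# hypothetical multiple program configuration
4-6   hostname
1,7   echo task:%t
0,2-3 echo offset:%o
"""


def expand_ranks(spec, ntasks, claimed):
    """Expand a task-rank field such as '0,2-3' or '*' into an ordered rank list."""
    if spec == "*":
        return [rank for rank in range(ntasks) if rank not in claimed]
    ranks = []
    for part in spec.split(","):
        if "-" in part:
            low, high = (int(number) for number in part.split("-"))
            if low > high:
                raise ValueError(f"smaller number must come first: {part!r}")
            ranks.extend(range(low, high + 1))
        else:
            ranks.append(int(part))
    return ranks


def resolve_multi_prog(config_text, ntasks):
    """Map every task rank to the command line it would execute."""
    commands = {}
    for line in config_text.splitlines():
        if not line.strip() or line.startswith("#"):  # "#" in column one is a comment
            continue
        fields = line.split(None, 2)
        if len(fields) < 2:
            raise ValueError(f"missing executable: {line!r}")
        spec, executable = fields[0], fields[1]
        args = fields[2] if len(fields) > 2 else ""
        for offset, rank in enumerate(expand_ranks(spec, ntasks, commands)):
            if rank >= ntasks or rank in commands:
                continue
            filled = args.replace("%t", str(rank)).replace("%o", str(offset))
            commands[rank] = f"{executable} {filled}".strip()
    if len(commands) < ntasks:
        raise ValueError("No executable program specified for this task")
    return commands
```

Calling resolve_multi_prog(EXAMPLE_CONFIG, 8) yields one command per rank from 0 to 7. A file like this would be passed to srun with the --multi-prog option.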
EXAMPLES
This simple example demonstrates the execution of the command hostname in eight tasks. At least eight processors will be allocated to the job (the same as the task count) on however many nodes are required to satisfy the request. The output of each task will be preceded by its task number. (The machine "dev" in the example below has a total of two CPUs per node.)
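The node count implied by this example follows from simple arithmetic. The following Python sketch computes it, assuming one CPU per task unless stated otherwise and that every CPU on each node is free for the job. The function name is hypothetical.

```python
import math


def min_nodes(ntasks, cpus_per_node, cpus_per_task=1):
    """Smallest number of nodes whose CPUs can hold every task."""
    tasks_per_node = cpus_per_node // cpus_per_task
    if tasks_per_node < 1:
        raise ValueError("a single task does not fit on one node")
    return math.ceil(ntasks / tasks_per_node)
```

With eight tasks and two CPUs per node, min_nodes(8, 2) gives four nodes.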
The srun -r option is used within a job script to run two job steps on disjoint nodes in the following example. The script is run using allocate mode instead of as a batch job in this case.
The following script runs two job steps in parallel within an allocated set of nodes.
This example demonstrates how one executes a simple MPI job. We use srun to build a list of machines (nodes) to be used by mpirun in its required format. A sample command line and the script to be executed follow.
This simple example demonstrates the execution of different jobs on different nodes in the same srun. You can do this for any number of nodes or any number of jobs. The executables are placed on the nodes indicated by the SLURM_NODEID environment variable, starting at 0 and going up to the number specified on the srun command line.
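The idea of branching on SLURM_NODEID can be sketched in Python as follows. The role names and the script name dispatch.py are hypothetical; only the SLURM_NODEID variable comes from the documentation above.

```python
import os

# Hypothetical mapping from relative node ID to the work done on that node.
ROLES = {0: "coordinator", 1: "worker"}


def role_for_node(env=None):
    """Pick this node's role from SLURM_NODEID (node IDs start at 0)."""
    env = os.environ if env is None else env
    node_id = int(env.get("SLURM_NODEID", 0))  # assumed fallback outside Slurm
    return ROLES.get(node_id, "idle")
```

Such a script might be launched on two nodes with something like "srun -N2 python3 dispatch.py", each copy then acting on the role it is given.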
This example demonstrates use of multi-core options to control layout of tasks. We request that four sockets per node and two cores per socket be dedicated to the job.
This example shows a script in which Slurm is used to provide resource management for a job by executing the various job steps as processors become available for their dedicated use.
This example shows how to launch an application called "server" with one task, 8 CPUs and 16 GB of memory (2 GB per CPU) plus another application called "client" with 16 tasks, 1 CPU per task (the default) and 1 GB of memory per task.
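One way to express the server and client request is as a heterogeneous job, with the components separated by ":" as shown in the SYNOPSIS. The following Python sketch only assembles such a command line. The helper name is hypothetical, and the option values are one reading of the description above rather than the original example.

```python
import shlex


def het_srun_command(components):
    """Join (options, executable) pairs into one srun argv, separated by ':'."""
    argv = ["srun"]
    for index, (options, executable) in enumerate(components):
        if index > 0:
            argv.append(":")  # separator between heterogeneous job components
        argv.extend(options)
        argv.append(executable)
    return argv


# One task with 8 CPUs at 2 GB per CPU, i.e. 16 GB in total.
SERVER = (["--ntasks=1", "--cpus-per-task=8", "--mem-per-cpu=2G"], "server")
# Sixteen tasks with the default of 1 CPU each, so 1 GB per CPU is 1 GB per task.
CLIENT = (["--ntasks=16", "--mem-per-cpu=1G"], "client")
```

shlex.join(het_srun_command([SERVER, CLIENT])) produces a single string that can be inspected before it is run.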
COPYING
Copyright (C) 2006-2007 The Regents of the University of California. Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2008-2010 Lawrence Livermore National Security.
Copyright (C) 2010-2022 SchedMD LLC.
This file is part of Slurm, a resource management program. For details, see <https://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
SEE ALSO
salloc(1), sattach(1), sbatch(1), sbcast(1), scancel(1), scontrol(1), squeue(1), slurm.conf(5), sched_setaffinity(2), numa(3), getrlimit(2)
This document was created by man2html, using the manual pages. Time: 16:41:06 GMT, December 20, 2022