

Chapter 10. LSF Batch Configuration Reference


This chapter describes the LSF Batch configuration files lsb.params, lsb.users, lsb.hosts, and lsb.queues. These files use the same horizontal and vertical section structure as the LIM configuration files (see 'Configuration File Formats'). All LSF Batch configuration files are found in the LSB_CONFDIR/cluster/configdir directory.

The lsb.params File

The lsb.params file defines general parameters used by the LSF Batch cluster. This file contains only one section.

Most of the parameters that can be defined in the lsb.params file control timing within the LSF Batch system. The default settings provide good throughput for long running batch jobs while adding a minimum of processing overhead in the batch daemons.

Parameters

This section and all of its keywords are optional. If a keyword is not present, LSF Batch assumes the default value for that keyword. The valid keywords for this section are:

DEFAULT_QUEUE = queue ...
DEFAULT_QUEUE lists the names of LSF Batch queues defined in the lsb.queues file. When a user submits a job to the LSF Batch system without explicitly specifying a queue and the user's environment variable LSB_DEFAULTQUEUE is not set, LSF Batch queues the job in the first default queue listed that satisfies the job's specifications and other restrictions.

If this keyword is not present or no valid value is given, then LSF Batch automatically creates a default queue named default with all the default parameters (see 'The lsb.queues File').
DEFAULT_HOST_SPEC = host_spec
host_spec must be a host name defined in the lsf.cluster.cluster file, or a host model defined in the lsf.shared file.

The CPU time limit defined by the CPULIMIT parameter in the lsb.queues file or by the user through the -c cpu_limit option of the bsub command is interpreted as the maximum number of minutes of CPU time that a job may run on a host of the default specification. When a job is dispatched to a host for execution, the CPU time limit is then normalized according to the execution host's CPU factor.

If DEFAULT_HOST_SPEC is defined in both the lsb.params file and the lsb.queues file for an individual queue, the value specified for the queue overrides the global value. If a user explicitly gives a host specification with the CPU limit when submitting a job, the user specified host or host model overrides the values defined in both the lsb.params and the lsb.queues files.

Default: the fastest batch server host in the cluster
DEFAULT_PROJECT = proj_name
The default project name for jobs. When a user submits a job without specifying any project name, and the user's environment variable LSB_DEFAULTPROJECT is not set, LSF Batch automatically assigns the job to this default project name. On IRIX 6, the project name must be one of the projects listed in the /etc/project(4) file. On all other platforms, the project name is a string used for accounting purposes.

Default: If this parameter is not present, LSF Batch uses default as the default project name
MBD_SLEEP_TIME = integer
The LSF Batch job dispatching interval. It determines how often the LSF Batch system tries to dispatch pending batch jobs.

Default: 60 (seconds)
SBD_SLEEP_TIME = integer
The LSF Batch job checking interval. It determines how often the LSF Batch system checks the load conditions of each host to decide whether jobs on the host must be suspended or resumed.

Default: 30 (seconds)
JOB_ACCEPT_INTERVAL = integer
The number of MBD_SLEEP_TIME periods to wait after dispatching a job to a host, before dispatching a second job to the same host. If JOB_ACCEPT_INTERVAL is zero, a host may accept more than one job in each job dispatching interval (MBD_SLEEP_TIME).

Default: 1
MAX_SBD_FAIL = integer
The maximum number of retries for reaching a non-responding slave batch daemon, sbatchd. The interval between retries is defined by MBD_SLEEP_TIME. If the master batch daemon fails to reach a host, and has retried MAX_SBD_FAIL times, the host is considered unavailable. When a host becomes unavailable the mbatchd assumes that all jobs running on that host have exited, and all rerunable jobs (jobs submitted with bsub -r) are scheduled to be rerun on another host.

Default: 3
CLEAN_PERIOD = integer
The amount of time (in seconds) that records for finished or killed jobs are kept in core by the master batch daemon. Users can still see all jobs after they have finished using the bjobs command. For jobs that finished more than CLEAN_PERIOD seconds ago, use the bhist command.

Default: 3600 (seconds)
MAX_JOB_NUM = integer
The maximum number of finished jobs whose events are to be stored in an event log file (see the lsb.events(5) manual page). Once the limit is reached, mbatchd switches the event log file. See 'LSF Batch Event Log'.

Default: 1000
HIST_HOURS = integer
The number of hours of resource consumption history taken into account when calculating the priorities of users in a host partition (see 'Host Partitions') or a fairshare queue (see 'The lsb.queues File'). This parameter is meaningful only if a fairshare queue or a host partition is defined. In calculating a user's priority, LSF Batch uses a decay factor which scales the CPU time used by the user's jobs such that 1 hour of CPU time used is equivalent to 0.1 hour after HIST_HOURS hours have elapsed.

Default: 5 (hours)
PG_SUSP_IT = integer
The time interval (in seconds) during which a host should be interactively idle (it > 0) before jobs suspended because of a threshold on the pg load index can be resumed. This parameter is used to prevent the case in which a batch job is suspended and resumed too often as it raises the paging rate while running and lowers it while suspended. If you are not concerned with the interference with interactive jobs caused by paging, the value of this parameter may be set to 0.

Default: 180 (seconds)
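
For illustration, the following is a minimal sketch of a Parameters section that sets several of the keywords described above. The values are examples only; the queues normal and idle are assumed to be defined in the lsb.queues file, and hostA is assumed to be defined in the lsf.cluster.cluster file.

Begin Parameters
DEFAULT_QUEUE       = normal idle
DEFAULT_HOST_SPEC   = hostA
MBD_SLEEP_TIME      = 60
SBD_SLEEP_TIME      = 30
JOB_ACCEPT_INTERVAL = 1
CLEAN_PERIOD        = 3600
MAX_JOB_NUM         = 2000
HIST_HOURS          = 5
PG_SUSP_IT          = 180
End Parameters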

Handling Cray NQS Incompatibilities

Cray NQS is incompatible with some of the public domain versions of NQS. Even worse, different versions of NQS on Cray are incompatible with each other. If your NQS server host is a Cray, some additional parameters may be needed for LSF to understand the NQS protocol correctly.

If the NQS version on a Cray is NQS 80.42 or NQS 71.3, then no extra setup is needed. For other versions of NQS on a Cray, you need to define NQS_REQUESTS_FLAGS and NQS_QUEUES_FLAGS.

NQS_REQUESTS_FLAGS = integer
If the version is NQS 1.1 on a Cray, the value of this flag is 251918848.

For other versions of NQS on a Cray, see 'Handling Cray NQS Incompatibilities' to get the value for this flag.
NQS_QUEUES_FLAGS = integer
See 'Handling Cray NQS Incompatibilities' to get the value for this flag. This flag is used by LSF to get the NQS queue information.
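
For illustration, the flags could be added to the Parameters section of the lsb.params file as sketched below. The NQS_REQUESTS_FLAGS value is the NQS 1.1 value given above; the NQS_QUEUES_FLAGS value is only a placeholder and must be replaced with the value appropriate for your NQS version.

Begin Parameters
.
NQS_REQUESTS_FLAGS = 251918848
# placeholder value; replace with the value for your NQS version
NQS_QUEUES_FLAGS   = 251918848
.
End Parameters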

The lsb.users File

The lsb.users file contains configuration information about individual users and groups of users in an LSF Batch cluster. This file is optional.

UNIX User Groups

User groups defined by UNIX often reflect certain relationships among users. It is natural to control computer resource access using UNIX user groups.

You can specify a UNIX group anywhere an LSF Batch user group can be specified. The UNIX groups recognized by LSF Batch are the groups that are returned by a getgrnam(3) call. Note that only group members listed in the /etc/group file or the group.byname NIS map are accepted; the user's primary group as defined in the /etc/passwd file is ignored.

If both an individual user and a UNIX group have the same name, LSF assumes that the name refers to the individual user. In this case you can specify the UNIX group name by appending a slash '/' to the group name. For example, if you have both a user and a group named admin on your system, LSF interprets admin as the name of the user, and admin/ as the name of the group.
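
For example, a queue that should accept jobs from the user admin as well as from every member of the UNIX group admin could use both forms in its USERS parameter (described in 'The lsb.queues File'); the queue name here is illustrative.

Begin Queue
QUEUE_NAME = maintenance
USERS      = admin admin/
End Queue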

Limitations

Although it is convenient to use UNIX groups as LSF Batch user groups, it may produce unexpected results if the UNIX group definitions are not homogeneous across machines. The UNIX groups picked up by LSF Batch are the groups obtained by calling getgrnam(3) on the master host. If the master host later changes to another host, the groups picked up might be different.

This will not be a problem if all the UNIX user groups referenced by LSF Batch configuration files are uniform across all hosts in the LSF cluster.

LSF Batch User Groups

A user group is a group of users with a name assigned. User groups can be used when defining several parameters in LSF Batch configuration files, including job slot limits (the USER_NAME column in the lsb.users file), fairshare shares (the USER_SHARES parameter of host partitions and the queue-level FAIRSHARE parameter), and queue access and administration lists (the USERS and ADMINISTRATORS parameters in the lsb.queues file).

The optional UserGroup section begins with a line containing the mandatory keywords GROUP_NAME and GROUP_MEMBER. Each subsequent line defines a single group. The first word on the line is the group name. The rest of the line contains a list of group members, enclosed in parentheses and separated by white space. A group can be included in another group; this means that every member of the first group is also a member of the second. For example:

Begin UserGroup
GROUP_NAME   GROUP_MEMBER
eng_users    (user1 user4)
tech_users   (eng_users user7)
acct_users   (user2 user3 user1)
End UserGroup

A user or group can be a member of more than one group. The reserved name all can be used to specify all users.

User and Group Job Slot Limits

Each user or user group can have a cluster-wide job slot limit and a per-processor job slot limit. These limits apply to the total number of job slots used by batch jobs owned by the user or group, in all queues. LSF Batch only dispatches the specified number of jobs at one time; if the user submits too many jobs, they remain pending and other users' jobs are run if hosts are available.

Detailed descriptions about job slot limits and how they are enforced by LSF Batch are described in 'Job Slot Limits'.

If a job slot limit is specified for a user group, the total number of job slots used by all users in that group are counted. If a user is a member of more than one group, each of that user's jobs is counted against the limit for all groups to which that user belongs.

This file can also contain a User section. The first line of this section gives the keywords that apply to the rest of the lines. The possible keywords include:

USER_NAME
Name of a user or user group. This keyword is mandatory. If the name is a group name followed by an '@', the job slot limits apply to each user in that group individually, as if each user in the group were listed in a separate entry in this section.
MAX_JOBS
System-wide job slot limits. This limits the total number of job slots this user or user group can use at any time.
JL/P
Per processor job slot limit. This limits the maximum number of job slots this user or user group can use per processor. This number can be a fraction such as 0.5 so that it can also serve as a per-host limit. This number is rounded up to the nearest integer equal to or greater than the total job slot limits for a host. For example, if JL/P is 0.5, on a 4-CPU multiprocessor host, the user can only use up to 2 job slots at any time. On a uniprocessor machine, the user can use 1 job slot.

The reserved user name default can be used for USER_NAME to set a limit for each user or group not explicitly named. If no default limit is specified, users and groups not listed in this section can run an unlimited number of jobs.

Note
The default per-user job slot limit also applies to groups. If you define groups with many users, you may need to configure a job slot limit for that group explicitly to override the default setting.

Begin User
USER_NAME   MAX_JOBS  JL/P
user3         10        -
user2          4        1
eng_users@    10        1
default        6        1
End User

The lsb.hosts File

The lsb.hosts file contains host related configuration information for the batch server hosts in the cluster. This file is optional.

Host Section

The optional Host section contains per-host configuration information. Each host, host model or host type can be configured to run a maximum number of jobs and a limited number of jobs for each user. Hosts, host models or host types can also be configured to run jobs only under specific load conditions or time windows.

If no hosts, host models or host types are named in this section, LSF Batch uses all hosts in the LSF cluster as batch server hosts. Otherwise, only the named hosts, host models and host types are used by LSF Batch. If a line in the Host section lists the reserved host name default, LSF Batch uses all hosts in the cluster and the settings on that line apply to every host not referenced in the section, either explicitly or by listing its model or type.

The first line of this section gives the keywords that apply to the rest of the lines. The keyword HOST_NAME must appear. Other supported keywords are optional.

HOST_NAME
The name of a host defined in the lsf.cluster.cluster file, a host model or host type defined in the lsf.shared file, or the reserved word default.
MXJ
The maximum number of job slots for the host. On multiprocessor hosts MXJ should be set to at least the number of processors to fully use the CPU resource.

Default: unlimited
JL/U
The maximum number of job slots any single user can use on this host at any time. See 'Job Slot Limits' for details of job slot limits.

Default: unlimited
DISPATCH_WINDOW
Times when this host will accept batch jobs.

Dispatch windows are specified as a series of time windows. See 'Time Windows' for detailed format of time windows.

Default: always open

Note
Earlier versions of LSF used the keyword RUN_WINDOW instead of DISPATCH_WINDOW in the lsb.hosts file. This keyword is still accepted to provide backward compatibility.

MIG
Migration threshold in minutes. If a checkpointable or rerunable job dispatched to this host is suspended for more than MIG minutes, the job is migrated. The suspended job is checkpointed (if possible) and killed. Then LSF restarts or reruns the job on another suitable host if one is available. If LSF is unable to rerun or restart the job immediately, the job reverts to PEND status and is requeued with a higher priority than any submitted job, so it is rerun or restarted before other queued jobs are dispatched.

Each LSF Batch queue can also specify a migration threshold. Jobs are migrated if either the host or the queue specifies a migration threshold. If MIG is defined both here and in lsb.queues, the lower threshold is used.

Jobs that are neither checkpointable nor rerunable are not migrated.

Default: no automatic migration
r15s, r1m, r15m, ut, pg, io, ls, it, tmp, swp, mem, and name
Scheduling and suspending thresholds for the dynamic load indices supported by LIM, including external load index names. Each load index column must contain either the default entry or two numbers separated by a slash '/', with no white space. The first number is the scheduling threshold for the load index; the second number is the suspending threshold. See the 'Resources' chapter of the LSF User's Guide for complete descriptions of the load indices.

Each LSF Batch queue also can specify scheduling and suspending thresholds in lsb.queues. If both files specify thresholds for an index, those that apply are the most restrictive ones.

Default: no threshold
CHKPNT
Defines the form of checkpointing available. Currently, only the value 'C' is accepted. This indicates that checkpoint copy is supported. With checkpoint copy, all opened files are automatically copied to the checkpoint directory by the operating system when a process is checkpointed. Checkpoint copy is currently supported only on ConvexOS.

Default: no checkpoint copy

The keyword line should name only the load indices that you wish to configure on a per-host basis. Load indices not listed on the keyword line do not affect scheduling decisions.

Each following line contains the configuration information for one host, host model or host type. This line must contain one entry for each keyword on the keywords line. Use empty parentheses '()' or a dash '-' to specify the default 'don't care' value for an entry. The entries in a line for a host override the entries in a line for its model or type.

Begin Host
HOST_NAME  MXJ  JL/U   r1m     pg   DISPATCH_WINDOW
hostA      1      -  0.6/1.6  10/20 (5:19:00-1:8:30 20:00-8:30)
SUNSOL     1      -  0.5/2.5    -   23:00-8:00
default    2      1  0.6/1.6  20/40 ()
End Host

This example Host section shows host-specific configuration for a host and a host type, along with default values for all other load-sharing hosts. The server hostA runs one batch job at a time. A job will only be started on hostA if the r1m index is below 0.6 and the pg index is below 10; the running job is stopped if the r1m index goes above 1.6 or the pg index goes above 20. hostA only accepts batch jobs from 19:00 on Friday evening until 8:30 Monday morning, and overnight from 20:00 to 8:30 on all other days.

For hosts of type SUNSOL, the pg index does not have host-specific thresholds and such hosts are only available overnight from 23:00 to 8:00. SUNSOL must be a host type defined in the lsf.shared file.

The entry with host name default applies to each of the other hosts in the LSF cluster. Each host can run up to 2 jobs at the same time, with at most one job from each user. These hosts are available to run jobs at all times. Jobs may be started if the r1m index is below 0.6 and the pg index is below 20, and a job from the lowest priority queue is suspended if r1m goes above 1.6 or pg goes above 40.

Host Groups

The HostGroup section is optional. This section defines names for sets of hosts. The host group name can then be used in other host group, host partition, and batch queue definitions, as well as on an LSF Batch command line. When a host group name is used, it has exactly the same effect as listing all of the host names in the group.

Host groups are specified in the same format as user groups in the lsb.users file.

The host group section must begin with a line containing the mandatory keywords GROUP_NAME and GROUP_MEMBER. Each other line in this section must contain an alphanumeric string for the group name, and a list of host names or previously defined group names enclosed in parentheses and separated by white space.

Host names and host group names can appear in more than one host group. The reserved name all specifies all hosts in the cluster.

Begin HostGroup
GROUP_NAME  GROUP_MEMBER
license1   (hostA hostD)
sys_hosts  (hostF license1 hostK)
End HostGroup

This example section defines two host groups. The group license1 contains the hosts hostA and hostD; the group sys_hosts contains hostF and hostK, along with all hosts in the group license1. Group names must not conflict with host names.

Host Partitions

The HostPartition section is optional, and you can configure more than one such section. See 'Controlling Fairshare' for more discussions of fairshare and host partitions.

Each HostPartition section contains a list of hosts and a list of user shares. Each host can be named in at most one host partition. Hosts that are available for batch jobs, but not included in any host partition are shared on a first-come, first-served basis. The special host name all can be specified to configure a host partition that applies to all hosts in a cluster.

Each user share contains a single user name or user group name, and an integer defining the share of the total CPU time available to that user. The special user name 'others' can be used to configure a total share for all users not explicitly listed. The special name 'default' configures the default per-user share for each user not explicitly named. Only one of others or default may be configured in a single host partition.

The time sharing is calculated by adding up all the shares configured, and giving each user or group the specified portion of that total. Note that the share for a group specifies the total share for all users in that group, unless the group name has a trailing '@'. In this case, the share is for each individual user in the group.
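
For example, the following sketch gives each individual user in the eng_users group five shares, rather than five shares for the group as a whole; the partition name and share values are illustrative.

Begin HostPartition
HPART_NAME  = part2
HOSTS       = hostA hostD
USER_SHARES = [eng_users@, 5] [others, 1]
End HostPartition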

Note
Host partition fair share scheduling is an alternative to queue level fairshare scheduling. You cannot use both in the same LSF cluster.

This example shows a host partition applied to hosts hostA and hostD:

Begin HostPartition
HPART_NAME = part1
HOSTS = hostA hostD
USER_SHARES = [eng_users, 7] [acct_users, 3] [others, 1]
End HostPartition

In the example, the total of all the shares is 7 + 3 + 1 = 11. This host partition specifies that all users in the user group eng_users should get 7/11 of the CPU time, the acct_users group should get 3/11, and all other users together get 1/11.

CPU time usage is accumulated over a period of time configured by the HIST_HOURS optional parameter in the lsb.params file, with the default value of five hours. The batch system keeps track of the cumulative CPU usage of each user or group in the host partition. Each user or group is given a priority based on whether the group has received more or less than its share of the CPU over the time period. If jobs from more than one group are eligible to run on a host in the partition, the job from the group with lower CPU usage relative to the group's share is dispatched first.

Host partitions are only enforced when jobs from more than one user or group are pending. If only one user or group is submitting jobs, those jobs can take all the available time on the partitioned hosts. If another user or group begins to submit jobs, those jobs are dispatched first until the shares reach the configured proportion.

The following example shows a host partition that gives users in the eng_users group very high priority, but allows jobs from other users to run if there are no jobs from the eng_users group waiting:

Begin HostPartition
HPART_NAME = eng
Hosts = all
User_Shares = ([eng_users, 500] [others, 1])
End HostPartition

Hosts belonging to a host partition should not be configured in the HOSTS parameter of a queue together with hosts that do not belong to the same host partition; otherwise, job scheduling in that queue may be affected.

The lsb.queues File

The lsb.queues file contains definitions of the batch queues in an LSF cluster. This file is optional. If no queues are configured, LSF Batch creates a queue named default, with all parameters set to default values (see the description of DEFAULT_QUEUE in 'The lsb.params File').

Queue definitions are horizontal sections that begin with the line Begin Queue and end with the line End Queue. You can define at most 40 queues in an LSF Batch cluster. Each queue definition contains the following parameters:

General Parameters

QUEUE_NAME = string

The name of the queue. This parameter must be defined, and has no default. The queue name can be any string of non-blank characters up to 40 characters long. It is best to use 6 to 8 character names made up of letters, digits, and possibly underscores '_' or dashes '-'.

PRIORITY = integer

This parameter indicates the priority of the queue relative to other LSF Batch queues. Note that this is an LSF Batch dispatching priority, completely independent of the UNIX scheduler's priority system for time-sharing processes. The LSF Batch NICE parameter is used to set the UNIX time-sharing priority for batch jobs.

LSF Batch tries to schedule jobs from queues with larger PRIORITY values first. This does not mean that jobs in lower priority queues are not scheduled unless higher priority queues are empty. Higher priority queues are checked first, but not all jobs in them are necessarily scheduled. For example, a job might be held because no machine with the right resources is available, or all jobs in a queue might be held because the queue's dispatch window or run window (see below) is closed. Lower priority queues are then checked and, if possible, their jobs are scheduled.

LSF Batch tries to suspend jobs from queues with smaller PRIORITY values first.

If more than one queue is configured with the same PRIORITY, LSF Batch schedules jobs from all these queues in first-come, first-served order.

Default: 1

NICE = integer

Adjusts the UNIX scheduling priority at which jobs from this queue execute. The default value of 0 maintains the default scheduling priority for UNIX interactive jobs. This value adjusts the run time priorities for batch jobs on a queue-by-queue basis, to control their effect on other batch or interactive jobs. See the nice(1) manual page for more details.

Default: 0

QJOB_LIMIT = integer

Job slot limit for the queue. This limits the total number of job slots that this queue can use at any time.

Default: unlimited

UJOB_LIMIT = integer

Per user job slot limit for the queue. This limits the total number of job slots any user of this queue can use at any time.

Default: unlimited

PJOB_LIMIT = integer

Per processor job slot limit. This limits the total number of job slots this queue can use on any processor at any time. This limit is configured per processor so that multiprocessor hosts automatically run more jobs.

Default: unlimited

HJOB_LIMIT = integer

Per host job slot limit. This limits the total number of job slots this queue can use on any host at any time. This limit is configured per host regardless of the number of processors it may have. This may be useful if the queue dispatches jobs which require a node-locked license. If there is only one node-locked license per host then the system should not dispatch more than one job to the host even if it is a multiprocessor host. For example, the following will run a maximum of one job on each of hostA, hostB, and hostC:

Begin Queue
.
HJOB_LIMIT = 1
HOSTS=hostA hostB hostC
End Queue

Default: unlimited

RUN_WINDOW = string

The time windows in which jobs are run from this queue. Run windows are described in 'Run Windows'.

When the queue run window closes, the queue stops dispatching jobs and suspends any running jobs in the queue. Jobs suspended because the run window closed are restarted when the window reopens. Suspended jobs can also be switched to a queue whose run window is open; the job restarts as soon as the new queue's scheduling thresholds are met.

Default: always open
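
For example, the following sketch shows a queue that runs jobs only overnight; the queue name and priority are illustrative, and the time window uses the format described in 'Time Windows'.

Begin Queue
QUEUE_NAME = overnight
PRIORITY   = 40
RUN_WINDOW = 20:00-08:00
End Queue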

DISPATCH_WINDOW = string

The time windows in which jobs are dispatched from this queue. Once dispatched, jobs are no longer affected by the dispatch window. Queue dispatch windows are analogous to the host dispatch windows described above under 'DISPATCH_WINDOW'.

Default: always open

ADMINISTRATORS = name ...

A list of queue-level administrators. The list of names can include any valid user name in the system, any UNIX user group name, and any user group name configured in the lsb.users file. Queue administrators can perform operations on any job in the queue as well as on the queue itself (for example, open/close, activate/deactivate). Switching a job from one queue to another requires the administrator to be authorized for both the current and the destination queues.

The -l option of the bqueues command will display configured administrators for each queue.

Default: No queue-level administrators are defined

Processor Reservation for Parallel Jobs

Parallel jobs that require a large number of processors often cannot be started if there are many lower priority sequential jobs in the system. There may not be enough resources at any one instant to satisfy a large parallel job, but there may be enough to allow a sequential job to be started. The processor reservation feature reduces this starvation of parallel jobs.

A host can have multiple 'slots' available for the execution of jobs. The number of slots can be independent of the number of processors and each queue can have its own notion of the number of execution slots available on each host. The number of execution slots on each host is controlled by the PJOB_LIMIT and HJOB_LIMIT parameters. When attempting to schedule parallel jobs requiring N processors (as specified with bsub -n), the system will attempt to find N execution slots across all eligible hosts. It ensures that each job never receives more slots than there are physical processors on any individual host.

When a parallel job cannot be dispatched because there are not enough execution slots to satisfy its minimum processor requirements, the currently available slots are reserved for the job. These reserved job slots are accumulated until there are enough available to start the job. When a slot is reserved for a job it is unavailable to any other job.

The processor reservation feature is disabled by default. To enable it, specify the SLOT_RESERVE keyword in the queue:

Begin Queue
.
PJOB_LIMIT=1
SLOT_RESERVE = MAX_RESERVE_TIME[n]
.
End Queue

The value of the keyword is MAX_RESERVE_TIME[n], where the maximum reservation time is n periods of MBD_SLEEP_TIME (defined in lsb.params). MAX_RESERVE_TIME controls the maximum time a slot is reserved for a job. It is required to avoid deadlock situations in which the system reserves job slots for multiple parallel jobs such that none of them can acquire sufficient resources to start. The system reserves slots for a job for up to n MBD_SLEEP_TIME periods; if enough slots have not been accumulated by then, all reserved slots are freed and made available to other jobs. The maximum reservation time takes effect from the start of the first reservation for a job, and a job can go through multiple reservation cycles before it accumulates enough slots to actually be started.
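
For example, with the default MBD_SLEEP_TIME of 60 seconds, the following sketch reserves slots for a pending parallel job for up to 5 dispatching periods (about 5 minutes) before freeing them:

Begin Queue
.
PJOB_LIMIT   = 1
SLOT_RESERVE = MAX_RESERVE_TIME[5]
.
End Queue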

Flexible Expressions for Queue Scheduling

LSF Batch provides a variety of possibly overlapping options for configuring job scheduling policies.

Queue-Level Resource Requirement

The condition for dispatching a job to a host can be specified through the queue-level RES_REQ parameter. Using a resource requirement string you can specify conditions in a more flexible manner than using the loadSched thresholds (see 'Load Thresholds'). For example:

RES_REQ= select[((type==ALPHA && r1m < 2.0)||(type==HPPA && r1m < 1.0))]

will allow a queue, which contains ALPHA and HPPA hosts, to have different thresholds for the different types of hosts. Using the hname resource in the RES_REQ string allows you to set up different conditions for different hosts in the same queue, for example:

RES_REQ= select[((hname==hostA && mem > 50)||(hname==hostB && mem > 100))]

When RES_REQ is specified in the queue and no job-level resource requirement is specified, then RES_REQ becomes the default resource requirement for the job. This allows administrators to override the LSF default of executing only on the same type as the submission host. If a job level resource requirement is specified together with RES_REQ, then a host must satisfy both requirements to be eligible for running the job. Similarly, the loadSched thresholds, if specified, must also be satisfied for a host to be eligible.

The order and span sections of the resource requirement string can also be specified in the RES_REQ parameter. These sections in RES_REQ are ignored if they are also specified by the user in the job level resource requirement.
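
For example, a queue-level requirement might combine a select section with order and span sections, reusing the host types from the example above; the choice of indices here is illustrative only.

RES_REQ= select[type==ALPHA || type==HPPA] order[r1m] span[hosts=1]

This orders candidate hosts by the r1m index and forces all processors allocated to a parallel job onto a single host.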

Queue-Level Resource Reservation

The resource reservation feature allows users to specify that the system should reserve resources after a job starts. This feature is also available at the queue level.

The queue-level resource reservation can be configured as part of the RES_REQ parameter. The RES_REQ can include a rusage section to specify the amount of resources a job should reserve after it is started. For example:

Begin Queue
.
RES_REQ = swap>50 rusage[swp=40:duration=5h:decay=1]
.
End Queue

If duration is not specified, the default is to reserve the resource for the lifetime of the job. If decay is specified as 1, then the reserved resource will be linearly decreased over the time specified by duration. If decay is not specified, then the resource reserved will not decrease over time. See 'Specifying Resource Reservation' in the LSF User's Guide and lsfintro(1) for detailed syntax of the rusage parameter.

Note
The use of RES_REQ affects the pending reasons as displayed by bjobs. If RES_REQ is specified in the queue and the loadSched thresholds are not specified the pending reasons for each individual load index will not be displayed.

Suspending Condition

The condition for stopping a job can be specified using a resource requirement string in the queue level STOP_COND parameter. If loadStop thresholds have been specified, then a job will be suspended if either the STOP_COND is TRUE or the loadStop thresholds are violated. For example, the following will suspend a job based on the idle time for desktop machines and based on availability of swap and memory on compute servers. Note that cs is a boolean resource defined in the lsf.shared file and configured in the lsf.cluster.cluster file to indicate that a host is a compute server:

Begin Queue
.
STOP_COND= select[((!cs && it < 5) || (cs && mem < 15 && swap < 50))]
.
End Queue

Note
Only the select section of the resource requirement string is considered when stopping a job. All other sections are ignored.

The use of STOP_COND affects the suspending reasons as displayed by the bjobs command. If STOP_COND is specified in the queue and the loadStop thresholds are not specified, the suspending reasons for each individual load index will not be displayed.

Note
LSF Batch will not suspend a job if the job is the only batch job running on the host and the machine is interactively idle (it >0).

Resume Condition

A separate RESUME_COND allows you to specify the condition that must be satisfied on a host if a suspended job is to be resumed. If RESUME_COND is not defined, then the loadSched thresholds are used to control resuming of jobs. The loadSched thresholds are ignored if RESUME_COND is defined.

Note that only the select section of the resource requirement string is considered when resuming a job. All other sections are ignored.
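
For example, the following sketch resumes suspended jobs only once a machine has been interactively idle for a while and its paging rate has dropped; the threshold values are illustrative.

Begin Queue
.
RESUME_COND = select[it > 10 && pg < 5]
.
End Queue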

Load Thresholds

The queue definition can contain thresholds for 0 or more of the load indices. Any load index that does not have a configured threshold has no effect on job scheduling. A description of all the load indices is given in the 'Resources' chapter of the LSF User's Guide.

Each load index is configured on a separate line with the format:

index = loadSched/loadStop

index is the name of the load index, for example r1m for the 1-minute CPU run queue length or pg for the paging rate. loadSched is the scheduling threshold for this load index. loadStop is the suspending threshold.

The loadSched and loadStop thresholds permit the specification of conditions using simple AND/OR logic. For example, the specification:

MEM=100/10
SWAP=200/30

translates into a loadSched condition of mem>=100 && swap>=200 and a loadStop condition of mem < 10 || swap < 30. The loadSched condition must be satisfied by a host before a job is dispatched to it and also before a job suspended on a host can be resumed. If the loadStop condition is satisfied, a job is suspended.

Note
LSF Batch will not suspend a job if the job is the only batch job running on the host and the machine is interactively idle (it >0).

The scheduling threshold also defines the host load conditions under which suspended jobs in this queue may be resumed.

When LSF Batch suspends or resumes a job, it invokes the SUSPEND or RESUME action as described in 'Configurable Job Control Actions'. The default SUSPEND action is to send signal SIGSTOP, while default action for RESUME is to send signal SIGCONT.

Note
The r15s, r1m, and r15m CPU run queue length conditions are compared to the effective queue length as reported by lsload -E, which is normalised for multiprocessor hosts. Thresholds for these parameters should be set at appropriate levels for single processor hosts.

Resource Limits

Batch queues can enforce resource limits on jobs. LSF Batch supports most of the resource limits that the underlying operating system supports. In addition, LSF Batch also supports a few limits that the underlying operating system does not support.

CPULIMIT = [hour:]minute[/host_spec]

Maximum CPU time allowed for a job running in this queue. This limit applies to the whole job, no matter how many processes the job may contain. If a job consists of multiple processes, the CPULIMIT parameter applies to all processes in a job. If a job dynamically spawns processes, the CPU time used by these processes is accumulated over the life of the job. Processes that exist for less than 30 seconds may be ignored.

The limit is scaled; the job is allowed to run longer on a slower host, so that a job can do roughly the same amount of work no matter what speed of host it is dispatched to.

The time limit is given in the form [hour:]minute[/host_spec]. minute may be greater than 59. Three and a half hours can be specified either as 3:30, or 210. host_spec is shared by CPULIMIT and RUNLIMIT. It may be a host name or a host model name which is used to adjust the CPU time limit or the wall-clock run time limit. In its absence, the DEFAULT_HOST_SPEC defined for this queue or defined for the whole cluster is assumed. If DEFAULT_HOST_SPEC is not defined, the LSF Batch server host with the largest CPU factor is assumed.

CPU time limits are normalized by multiplying the CPULIMIT parameter by the CPU factor of the specified or default host, and then dividing by the CPU factor of the execution host. If the specified host has a CPU factor of 2 and another host has a factor of 1, then a CPULIMIT value of 10 minutes allows jobs on the specified host to run for 10 minutes, and jobs on the slower host to run for 20 minutes (2 * 10 / 1). See 'Host Models' and 'Descriptive Fields' for more discussion of CPU factors.

Default: unlimited
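
For example, assuming hostA is defined in the lsf.cluster.cluster file, the following line limits jobs to three and a half hours of CPU time normalized to hostA's CPU factor:

CPULIMIT = 3:30/hostA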

RUNLIMIT = [hour:]minute[/host_spec]

Maximum wall clock running time allowed for batch jobs in this queue. Jobs that are in the RUN state for longer than RUNLIMIT are killed by LSF Batch. RUNLIMIT is available on all host types. For an explanation of the form of the time limit, see CPULIMIT above.

Default: unlimited

FILELIMIT = integer

The per-process (hard) file size limit (in KB) for all the processes belonging to a job from this queue (see getrlimit(2)).

Default: unlimited

MEMLIMIT = integer

The per-process (hard) process resident set size limit (in KB) for all the processes belonging to a job from this queue (see getrlimit(2)). The process resident set size limit cannot be set on HP-UX and Sun Solaris 2.x, so this limit has no effect on an HP-UX or a Sun Solaris 2.x machine.

Default: unlimited

DATALIMIT = integer

The per-process (hard) data segment size limit (in KB) for all the processes belonging to a job from this queue (see getrlimit(2)). The data segment size limit cannot be set on HP-UX, so this limit has no effect on an HP-UX machine.

Default: unlimited

STACKLIMIT = integer

The per-process (hard) stack segment size limit (in KB) for all the processes belonging to a job from this queue (see getrlimit(2)). The stack segment size limit cannot be set on HP-UX, so this limit has no effect on an HP-UX machine.

Default: unlimited

CORELIMIT = integer

The per-process (hard) core file size limit (in KB) for all the processes belonging to a job from this queue (see getrlimit(2)). The core file size limit cannot be set on HP-UX, so this limit has no effect on an HP- UX machine.

Default: unlimited

PROCLIMIT = integer

The maximum number of job slots that can be allocated to a parallel job in the queue. Jobs which request more job slots via the -n option of bsub than the queue can accept will be rejected.

Default: unlimited

PROCESSLIMIT = integer

This limits the number of concurrent processes that can be part of a job.

Default: unlimited

SWAPLIMIT = integer

The total virtual memory limit (in KB) for a job from this queue. This limit applies to the whole job, no matter how many processes the job may contain.

The action taken when a job exceeds its SWAPLIMIT or PROCESSLIMIT is to send SIGQUIT, SIGINT, and SIGTERM, and then SIGKILL in sequence. For CPULIMIT, SIGXCPU is sent before SIGINT, SIGTERM, and SIGKILL.

Default: unlimited
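
The following sketch combines several of these limits in one queue: two hours of normalized CPU time, four hours of wall-clock time, a per-process resident set limit of roughly 100 MB, a per-job virtual memory limit of roughly 200 MB, and no core files. The queue name and all values are illustrative.

Begin Queue
QUEUE_NAME = limited
CPULIMIT   = 2:00
RUNLIMIT   = 4:00
MEMLIMIT   = 100000
SWAPLIMIT  = 200000
CORELIMIT  = 0
End Queue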

NEW_JOB_SCHED_DELAY = integer

This parameter controls when a scheduling session should be started after a new job is submitted. For example:

Begin Queue
.
NEW_JOB_SCHED_DELAY=0
.
End Queue

If NEW_JOB_SCHED_DELAY is 0 seconds, a new scheduling session is started as soon as a job is submitted to this queue. This parameter can be used to obtain faster response times for jobs in a queue such as a queue for interactive jobs.

Note
Setting a value of 0 can cause mbatchd to be busy if there are a lot of submissions.

Default: 10 seconds

JOB_ACCEPT_INTERVAL = integer

This parameter has the same effect as JOB_ACCEPT_INTERVAL defined in the lsb.params file, except that it applies to this queue.

Default: JOB_ACCEPT_INTERVAL defined in lsb.params or 1 if it is not defined in lsb.params file

INTERACTIVE = NO|ONLY

An interactive job can be submitted via the -I option of the bsub command. By default, a queue accepts both interactive and background jobs. This parameter allows the LSF cluster administrator to configure a queue so that it does not accept interactive jobs (NO), or accepts only interactive jobs (ONLY).

Eligible Hosts and Users

Each queue can have a list of users and user groups who are allowed to submit batch jobs to the queue, and a list of hosts and host groups that restricts where jobs from the queue can be dispatched.

USERS = name ...

The list of users who can submit jobs to this queue. The list of names can include any valid user name in the system, any UNIX user group name, and any user group name configured in the lsb.users file. The reserved word all may be used to specify all users.

Note
The LSF cluster administrator can submit jobs to any queue, even if the administrator's login name is not defined in the USERS parameter of the queue. The LSF cluster administrator can also switch a user's jobs into this queue from other queues, even if that user's login name is not defined in the USERS parameter.

Default: all

HOSTS = name[+pref_level] ...

The list of hosts on which jobs from this queue can be run. Each name in the list must be a valid host name, host group name or host partition name as configured in the lsb.hosts file. The name can be optionally followed by +pref_level to indicate the preference for dispatching a job to that host, host group, or host partition. pref_level is a positive number specifying the preference level of that host. If a host preference is not given, it is assumed to be 0.

Hosts at the same level of preference are ordered by load. For example:

HOSTS = hostA+1 hostB hostC+1 servers+3

where servers is a host group name referring to all computer servers. This defines three levels of preferences: run jobs on servers as much as possible, or else on hostA and hostC. Jobs should not run on hostB unless all other hosts are too busy to accept more jobs.

If you use the reserved word 'others', it means jobs should run on all hosts not explicitly listed. You do not need to define this parameter if you want to use all batch server hosts and you do not need host preferences.

All the members of the host list should either belong to a single host partition or not belong to any host partition. Otherwise, job scheduling may be affected (see 'Host Partitions').

Default: all batch hosts.
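
Combining the two parameters, the following sketch restricts a queue to the eng_users group and user7, and prefers hosts in the servers host group over all other batch server hosts. The names reuse examples from earlier in this chapter; the queue name is illustrative.

Begin Queue
QUEUE_NAME = engineering
USERS      = eng_users user7
HOSTS      = servers+1 others
End Queue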

Scheduling Policy

LSF Batch allows many policies to be defined at the queue level. These policies affect the order in which jobs in the queue are scheduled.

Queue Level Fairshare

The concept of queue level fairshare was discussed in 'Scheduling Policy'. The configuration syntax for this policy is:

FAIRSHARE = USER_SHARES[[username, share] [username, share] ...]

Note
These are real square brackets, not syntactic notation.

username is a user login name, a user group name, the reserved word default, or the reserved word others. share is a positive integer specifying the number of shares of resources that a user or user group has in the cluster.

The USER_SHARES assignment for a queue is interpreted in the same way as the USER_SHARES assignment in a host partition definition in the lsb.hosts file. See 'Host Partitions' for explanation of USER_SHARES. In general, a job has a higher scheduling priority if the job's owner has more shares, fewer running jobs, has used less CPU time and has waited longer in the queue.

Note the differences between the following two definitions:

FAIRSHARE = USER_SHARES[[grp1, share1] [grp2, share2]]
FAIRSHARE = USER_SHARES[[grp1, share1] [grp2@, share2]]

The '@' immediately after a user group name means that the share applies to each individual user in the user group. Without '@', the share applies to the user group as a whole.

See 'Controlling Fairshare' for examples of fairshare configuration.

Note
Queue level fair share scheduling is an alternative to host partition fair share scheduling. You cannot use both in the same LSF cluster for the same host(s).

Preemption Scheduling

The concept of preemptive scheduling was discussed in 'Preemptive and Preemptable'. PREEMPTION takes two possible parameters, PREEMPTIVE and PREEMPTABLE. The configuration syntax is:

PREEMPTION = PREEMPTIVE[q1 q2 ...] PREEMPTABLE

where q1 q2 ... is an optional list of the names of lower priority queues.

Note
These are real square brackets, not syntactic notation.

If PREEMPTIVE is defined, this defines a preemptive queue that will preempt jobs in the listed queues (q1, q2, and so on). Jobs in a preemptive queue can preempt jobs from the specified lower priority queues running on a host by suspending some of them and starting the higher priority jobs on the host.

If PREEMPTIVE is specified without a list of queue names, then this queue preempts all lower priority queues.

If the PREEMPTIVE policy is not specified, jobs dispatched from this queue will not suspend jobs from lower priority queues.

A queue can be specified as being preemptable by defining PREEMPTABLE in the PREEMPTION parameter of the queue.

Jobs from a preemptable queue can be preempted by jobs in any higher priority queues even if the higher priority queues do not have PREEMPTIVE defined. A preemptable queue is complementary to the preemptive queue. You can define a queue that is both preemptive as well as preemptable by defining both PREEMPTIVE and PREEMPTABLE. Thus the queue will preempt lower priority queues while it can also be preempted by higher priority queues.
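
For example, the following sketch defines a queue that preempts jobs in two hypothetical lower priority queues, normal and idle, and can itself be preempted by any higher priority queue; the queue name and priority are illustrative.

Begin Queue
QUEUE_NAME = urgent
PRIORITY   = 70
PREEMPTION = PREEMPTIVE[normal idle] PREEMPTABLE
End Queue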

Exclusive Queue

An exclusive queue is created by specifying EXCLUSIVE in the policies of a queue.

If the EXCLUSIVE policy is specified, this queue performs exclusive scheduling. A job only runs exclusively if it is submitted to a queue with exclusive scheduling, and the job is submitted with the -x option to bsub. An exclusive job runs by itself on a host---it is dispatched only to a host with no other batch jobs running.

Once an exclusive job is started on a host, the LSF Batch system locks that host out of load sharing by sending a request to the underlying LSF so that the host is no longer available for load sharing by any other task (either interactive or batch) until the exclusive job finishes.

Because exclusive jobs are not dispatched until a host has no other batch jobs running, it is possible for an exclusive job to wait indefinitely if no batch server host is ever completely idle. This can be avoided by configuring some hosts to run only one batch job at a time; that way the host is certain to have no batch jobs running when the previous batch job completes, so the exclusive job can be dispatched there.

The exclusive scheduling policy is specified using the following syntax:

EXCLUSIVE = {Y | N}
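
For example, the following sketch defines an exclusive queue; the queue name and priority are illustrative.

Begin Queue
QUEUE_NAME = exclusive
PRIORITY   = 50
EXCLUSIVE  = Y
End Queue

A user would then submit an exclusive job with, for example, bsub -q exclusive -x myjob, where myjob stands for the user's own command.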

Migration

The MIG parameter controls automatic migration of suspended jobs.

MIG = number

If MIG is specified, then number is the migration threshold in minutes. If a checkpointable or rerunable job is suspended for more than MIG minutes and no other job on the same host is being migrated, LSF Batch checkpoints (if possible) and kills the job. Then LSF Batch restarts or reruns the job on another suitable host if one is available. If LSF is unable to rerun or restart the job immediately, the job reverts to PEND status and is requeued with a higher priority than any submitted job, so it is rerun or restarted before other queued jobs are dispatched.

The lsb.hosts file can also specify a migration threshold. Jobs are migrated if either the host or the queue specifies a migration threshold. If MIG is defined both here and in lsb.hosts, the lower threshold is used.

Jobs that are neither checkpointable nor rerunable are not migrated.

Default: no migration

Queue-Level Pre-/Post-Execution Commands

Pre- and post-execution commands can be configured on a per-queue basis. These commands are run on the execution host before and after a job from this queue is run, respectively. By configuring appropriate pre- and/or post-execution commands, various situations can be handled, such as checking for the availability of software licenses before a job starts.

Note that the job-level pre-exec specified with the -E option of bsub is also supported. In some situations (for example, license checking) it is possible to specify a queue-level pre-execution command instead of requiring every job be submitted with the -E option.

The execution commands are specified using the PRE_EXEC and POST_EXEC keywords. For example:

Begin Queue
QUEUE_NAME     = priority
PRIORITY       = 43
NICE           = 10
PRE_EXEC       = /usr/people/lsf/pri_prexec
POST_EXEC      = /usr/people/lsf/pri_postexec
End Queue

The pre- and post-execution commands for a queue should be set up carefully, since they are run on the execution host for every job dispatched from the queue.

Default: no pre- and post-execution commands

Job Starter

A Job starter can be defined for each queue to bring the actual job into the desired environment before execution. The configuration syntax for job starter is:

JOB_STARTER = starter

where starter is any executable string that can be used to start the job command line. When LSF Batch runs the job, it executes /bin/sh -c "JOB_STARTER job_command_line". Thus a job starter can be anything that can be run together with the job command line.
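
For example, the following sketch uses a hypothetical wrapper script as the job starter; the script path is a placeholder.

Begin Queue
.
JOB_STARTER = /usr/local/bin/job_starter
.
End Queue

With this setting, LSF Batch runs each job in the queue as /bin/sh -c "/usr/local/bin/job_starter job_command_line".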

Configurable Job Control Actions

Job control in LSF Batch refers to some well-known control actions that will cause a job's status to change. These actions include:

SUSPEND
Change a running job to SSUSP or USUSP. The default action is to send signal SIGTSTP for parallel or interactive jobs and SIGSTOP for other jobs.
RESUME
Change a suspended job to RUN status. The default action is to send signal SIGCONT.
TERMINATE
Terminate a job and possibly cause the job to change to EXIT status. The default action is to send SIGINT first, then send SIGTERM 10 seconds after SIGINT, then send SIGKILL 10 seconds after SIGTERM.

Note
On Windows NT, actions equivalent to the above UNIX signals have been implemented to do the default job control actions.

Several situations may require overriding or augmenting the default actions for job control (for example, checkpointing a job rather than signalling it when it is suspended).

It is possible to override the actions used for job control by specifying the JOB_CONTROLS parameter in the lsb.queues file. The format is:

Begin Queue
.
JOB_CONTROLS = SUSPEND[signal | CHKPNT | command] \ 
               RESUME[signal | command]  \
               TERMINATE[signal | CHKPNT | command]
.
End Queue

where signal is a UNIX signal name (such as SIGSTOP, SIGTSTP, etc.). CHKPNT is a special action, which causes the system to checkpoint the job. Alternatively command specifies a /bin/sh command line to be invoked.

When LSF Batch needs to suspend or resume a job it will invoke the corresponding action as specified by the SUSPEND or RESUME parameters, respectively.

If the action is a signal, then the signal is sent to the job. If the action is a command, it is run as a /bin/sh command line, with the following environment variables set for it:

LSB_JOBPGIDS
A list of current process group IDs of the job.
LSB_JOBPIDS
A list of current process IDs of the job.

For the SUSPEND action command, the following environment variable is also set:

LSB_SUSP_REASON
An integer representing a bit map of suspending reasons as defined in lsbatch.h.

The suspending reason can allow the command to take different actions based on the reason for suspending the job.

The SUSPEND action causes the job state to be changed from RUN state to the USUSP (in response to bstop) state or the SSUSP (otherwise) state when the action is completed. The RESUME action causes the job to go from SSUSP or USUSP state to the RUN state when the action is completed.

If the SUSPEND action is CHKPNT, then the job is checkpointed and then stopped by sending the SIGSTOP signal to the job atomically.

LSF Batch invokes the SUSPEND action to bring a job into SSUSP or USUSP status in the following situations:

  • When the user or the LSF administrator issues a bstop command on the job
  • When the load conditions on the execution host satisfy the suspend condition
  • When the queue's run window closes
  • When the job is being preempted by a higher priority job

However, in certain situations you may want to terminate the job instead of calling the SUSPEND action. For example, you may want to kill jobs if the run window of the queue is closed. This can be achieved by configuring the queue to invoke the TERMINATE action instead of SUSPEND by specifying the following parameter:

TERMINATE_WHEN = WINDOW | LOAD | PREEMPT

If the TERMINATE action is CHKPNT, then the job is checkpointed and killed atomically.

If the execution of an action is in progress, no further actions will be initiated unless it is the TERMINATE action. A TERMINATE action is issued regardless of the current state of the job.

The following defines a night queue that will kill jobs if the run window closes:

Begin Queue
QUEUE_NAME     = night
RUN_WINDOW     = 20:00-08:00
TERMINATE_WHEN = WINDOW
JOB_CONTROLS   = TERMINATE[ kill -KILL $LSB_JOBPIDS; mail -s "job $LSB_JOBID \
                 killed by queue run window" $USER < /dev/null ]
End Queue

Note that the command line inside an action definition must not be quoted.

LSF Batch invokes the TERMINATE action when a SUSPEND action would be invoked but has been redirected to TERMINATE by the TERMINATE_WHEN parameter, or when the job reaches its RUNLIMIT or PROCESSLIMIT.

Since the stdout and stderr of the job control action command are redirected to /dev/null, there is no direct way of knowing whether the command runs correctly. You should make sure the command line is correct. If you want to see the output from the command line for testing purposes, redirect the output to a file inside the command line.

Automatic Job Requeue

A queue can be configured to automatically requeue a job if the job exits with particular exit values. The REQUEUE_EXIT_VALUES parameter specifies a list of exit codes that cause an exited job to be requeued; for example:

Begin Queue
PRIORITY            = 43
REQUEUE_EXIT_VALUES = 99 100
End Queue

This configuration causes jobs that exit with 99 or 100 to be requeued to the head of the queue from which they were dispatched. When a job is requeued, the output from the failed run is not saved and no mail is sent. The user only receives notification when the job exits with a value different from the values listed in the REQUEUE_EXIT_VALUES parameter. A job terminated by a signal is not requeued.

Default: Jobs in a queue are not requeued

Exclusive Job Requeue

The queue parameter REQUEUE_EXIT_VALUES controls job requeue behaviour. It defines a list of exit code values; if a batch job exits with one of these values, the job is requeued. There is a special requeue method called exclusive requeue. If an exit value is defined as EXCLUDE(value), the job will be requeued when it exits with the given value, but it will not be dispatched to the same host where it exited with that value. For example:

Begin Queue
.
REQUEUE_EXIT_VALUES=30 EXCLUDE(20)
HOSTS=hostA hostB hostC
.
End Queue

A job in this queue can be dispatched to hostA, hostB, or hostC. If the job exits with value 30, it can again be dispatched to any of hostA, hostB, or hostC. If the job exits with value 20 on hostA, when requeued, it will only be dispatched to hostB or hostC. Similarly, if the job again exits with a value of 20 on hostB, it will only be dispatched on hostC. Finally, if the job exits with value 20 on hostC, the job will remain pending forever.

Note
If mbatchd is restarted, it does not remember the previous host(s) where the job exited with an exclusive requeue exit code. In this situation it is possible for a job to be dispatched to a host on which the job has previously exited with an exclusive exit code.

    Default Host Specification for CPU Speed Scaling

    LSF runs jobs on heterogeneous machines. To let users set CPU time limits in a platform-independent way, LSF scales the CPU time limit by the CPU factors of the hosts involved.

    The DEFAULT_HOST_SPEC parameter defines a default host or host model that is used to normalize the CPU time limit of all jobs, giving users consistent behaviour.

    DEFAULT_HOST_SPEC = host_spec

    host_spec must be a host name defined in the lsf.cluster.cluster file, or a host model defined in the lsf.shared file.

    The CPU time limit defined in this file or by the user through the -c cpu_limit option of the bsub command is interpreted as the maximum number of minutes of CPU time that a job may run on a host of the default specification. When a job is dispatched to a host for execution, the CPU time limit is then normalized according to the execution host's CPU factor.

    If DEFAULT_HOST_SPEC is defined in both the lsb.params file and the lsb.queues file for an individual queue, the value specified for the queue overrides the global value. If a user explicitly gives a host specification when submitting a job, the user specified host or host model overrides the values defined in both the lsb.params and the lsb.queues files.

    Default: DEFAULT_HOST_SPEC in the lsb.params file
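    As a worked example (the host model and CPU factors are hypothetical), suppose the default host specification is a host model hostM with CPU factor 2:

    DEFAULT_HOST_SPEC = hostM

    A user then submits a job with a 100-minute CPU limit, for example with 'bsub -c 100 myjob'. The limit is interpreted relative to hostM, so if the job is dispatched to a host with CPU factor 4 (roughly twice as fast as hostM), the limit enforced there works out to approximately 100 * 2 / 4 = 50 minutes of CPU time, and the job can do about the same amount of work wherever it runs.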

    NQS Forward Queues

    To interoperate with NQS, you must configure one or more LSF Batch queues to forward jobs to remote NQS hosts. An NQS forward queue is an LSF Batch queue with the parameter NQS_QUEUES defined.

    NQS_QUEUES = queue_name@host_name ...

    host_name is an NQS host name which can be the official host name or an alias name known to the LSF master host through gethostbyname(3). queue_name is the name of an NQS queue on this host. NQS destination queues are considered for job routing in the order in which they are listed here. If a queue accepts the job, then it is routed to that queue. If no queue accepts the job, it remains pending in the NQS forward queue.

    The lsb.nqsmaps file (see 'The lsb.nqsmaps File') must be present in order for LSF Batch to route jobs in this queue to NQS systems.

    Since many features of LSF are not supported by NQS, the following queue configuration parameters are ignored for NQS forward queues: PJOB_LIMIT, POLICIES, RUN_WINDOW, DISPATCH_WINDOW, RUNLIMIT, HOSTS, and MIG. In addition, scheduling load threshold parameters are ignored because NQS does not provide load information about hosts.

    Default: undefined
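    For example, an NQS forward queue might be configured as follows (the queue names and host names are hypothetical):

    Begin Queue
    NAME        = nqsforward
    PRIORITY    = 30
    NQS_QUEUES  = pipe@cray001 batch1@sun0101
    DESCRIPTION = Forward jobs to NQS queues on cray001 and sun0101
    End Queue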

    DESCRIPTION = text

    A brief description of the job queue. This information is displayed by the bqueues -l command. The description can include any characters, including white space. It can be extended to multiple lines by ending the preceding line with a backslash ('\'). The maximum length of the description is 512 characters.

    This description should clearly describe the service features of this queue to help users select the proper queue for each job.
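    For example (the wording of the description is only illustrative):

    Begin Queue
    NAME        = night
    DESCRIPTION = Jobs submitted to this queue are dispatched only at night, \
                  when the hosts are lightly loaded.
    End Queue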

    The lsb.nqsmaps File

    The lsb.nqsmaps file contains information on configuring LSF for interoperation with NQS. This file is optional.

    Hosts

    NQS uses a machine identification number (MID) to identify each host in the network that communicates using the NQS protocol. This MID must be unique and must be the same in the NQS database of every host in the network. The MID is assigned and entered into the NQS database using the NQS program nmapmgr(1m) or the Cray NQS command qmgr(8). mbatchd uses the NQS protocol to talk with NQS daemons for routing, monitoring, signalling, and deleting LSF Batch jobs that run on NQS hosts. Therefore, the MIDs of the LSF master host, and of any LSF host that might become the master host if the current master host goes down, must be assigned and entered into the NQS database of every host that may process LSF Batch jobs.

    In the mandatory Hosts section, list the MIDs of the LSF master host (and potential master hosts) and the NQS hosts that are specified in the lsb.queues file. If an NQS destination queue specified in the lsb.queues file is a pipe queue, the MIDs of all the destination hosts of this pipe queue must be listed here. If a destination queue of this pipe queue is itself a pipe queue, the MIDs of the destination hosts of this queue must also be listed, and so forth.

    There are three mandatory keywords in this section:

    HOST_NAME

    The name of an LSF or NQS host. It can be the official host name or an alias host name known to the master batch daemon (mbatchd) through gethostbyname(3).

    MID

    The machine identification number of an LSF or NQS host. It is assigned by the NQS administrator to each host communicating using the NQS protocol.

    OS_TYPE

    The operating system (OS) type of the NQS host. At present, its value can be one of ULTRIX, HPUX, AIX, SOLARIS, SUNOS, IRIX, OSF1, CONVEX, or UNICOS. mbatchd uses it to deliver the correct signals to LSF Batch jobs running on the NQS host; an incorrect OS type causes unpredictable results. If the host is an LSF host, OS_TYPE is ignored because the host type is already specified by the type field of the Host section in the lsf.cluster.cluster file; '-' must be used as a placeholder.

    Begin Hosts
    HOST_NAME        MID    OS_TYPE
    cray001          1      UNICOS      #NQS host, must specify OS_TYPE
    sun0101          2      SOLARIS     #NQS host
    sgi006           3      IRIX        #NQS host
    hostA            4      -           #LSF host; OS_TYPE is ignored
    hostD            5      -           #LSF host
    hostC            6      -           #LSF host
    End Hosts

    Users

    LSF assumes shared and uniform user accounts on all of the LSF hosts. However, if the user accounts on NQS hosts are not the same as on LSF hosts, account mapping is needed so that the network server on the remote NQS host can take on the proper identity attributes. The mapping is performed for all NQS network conversations. In addition, the user name and the remote host name may need to match an entry either in the .rhosts file in the user's home directory, or in the /etc/hosts.equiv file, or in the /etc/hosts.nqs file on the server host. For Cray NQS, the entry may be either in the .rhosts file or in the .nqshosts file in the user's home directory.

    This optional section defines the user name mapping from the LSF master host to each of the NQS hosts listed in the Host section above, that is, the hosts on which the jobs routed by LSF Batch may run. There are two mandatory keywords:

    FROM_NAME

    The name of an LSF Batch user. It is a valid login name on the LSF master host.

    TO_NAME

    A list of user names on NQS hosts to which the corresponding FROM_NAME is mapped. Each of the user names is specified in the form username@hostname. The hostname is the official name or an alias name of an NQS host, while the username is a valid login name on this NQS host. The TO_NAME of a user on a specific NQS host should always be the same when the user's name is mapped from different hosts. If no TO_NAME is specified for an NQS host, LSF Batch assumes that the user has the same user name on this NQS host as on an LSF host.

    Begin Users
    FROM_NAME       TO_NAME
    user3          (user3l@cray001 luser3@sgi006)
    user1          (suser1@cray001) # assumed to be user1@sgi006
    End Users

    If a user is not specified in the lsb.nqsmaps file, jobs are sent to NQS hosts with the same name the user has in LSF.


    Copyright © 1994-1997 Platform Computing Corporation.
    All rights reserved.