

Chapter 5. Using LSF Batch


LSF Batch is a distributed load-sharing batch system for clusters of UNIX and Windows NT computers. LSF Batch uses LSF's load information and other services to schedule batch jobs. Because LSF Batch is integrated with other LSF services, batch and interactive jobs have a consistent view of system resources and load levels.

With LSF Batch, you can use a heterogeneous network of computers as a single system. All batch jobs go through a consistent interface, independent of the resources they need or the hosts they run on.

LSF Batch has the same view of the cluster and master host as the LIM, although LSF Batch may use only some of the hosts in the cluster as servers. The slave batch daemon, sbatchd, runs on every host that the LSF administrator configures as an LSF Batch server. The master batch daemon, mbatchd, always runs on the same host as the master LIM. See 'Finding the Master' for more information on the master LIM.

Note
Figure 1. 'Structure of LSF' shows how LSF Batch fits into the LSF system. Figure 2. 'Structure of LSF Batch' shows the structure of LSF Batch.

The rest of this chapter provides important background information on how LSF Batch works and describes the commands that give information about your LSF Batch system. Topics include batch jobs and their states, scheduling policies and parameters, run and dispatch windows, batch queues, batch users and hosts, user and host groups, configuration parameters, and user controlled account mapping.

Batch Jobs

Each LSF Batch job goes through a series of state transitions until it eventually completes its task, crashes, or is terminated. Figure 9 shows the possible states of a job during its life cycle.

Figure 9. Batch Job States


Many jobs enter only three states:

PEND
waiting in the queue
RUN
dispatched to a host and running
DONE
terminated normally
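
You can see which state each of your jobs is in from the STAT column of the bjobs command (see bjobs(1)); the jobs and values in this hypothetical display are for illustration only:

% bjobs
JOBID USER  STAT  QUEUE      FROM_HOST    EXEC_HOST    JOB_NAME   SUBMIT_TIME
1266  user1 RUN   normal     hostA        hostD        sim_run    Oct 27 01:40
1267  user1 PEND  night      hostA                     backup     Oct 27 01:45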

A job remains pending until all conditions for its execution are met. These conditions may include the start time specified when the job was submitted, the load conditions on qualified hosts, the dispatch and run windows of the queue and hosts, and the job slot limits configured for the queue, host, or user.

A job may terminate abnormally for various reasons, for example, because it is killed by its owner or the LSF administrator, fails to start on the execution host, or crashes during execution. Job termination may happen from any state. An abnormally terminated job goes into EXIT state.

Jobs may also be suspended at any time. A job can be suspended by its owner, by the LSF administrator, or by the LSF Batch system. There are three different states for suspended jobs:

PSUSP
suspended by its owner or the LSF administrator while in PEND state
USUSP
suspended by its owner or the LSF administrator after being dispatched
SSUSP
suspended by the LSF Batch system after being dispatched

After a job has been dispatched and started on a host, it is suspended by the LSF Batch system if the load on the execution host or hosts becomes too high. In such a case, batch jobs may be interfering with each other or with interactive jobs. Either way, some jobs should be suspended to maximize host performance or to guarantee interactive response time. LSF Batch suspends jobs according to their priority.

When a host is busy, LSF Batch suspends lower priority jobs first unless the scheduling policy associated with the job dictates otherwise. A job may also be suspended by the system if the job queue has a time window and the current time goes outside the time window.

A job suspended by the system is resumed by LSF Batch when the load on the execution host falls back within the scheduling thresholds or when a closed time window of the queue opens again.

Scheduling Policy

The simple First-Come-First-Served (FCFS) job scheduling policy is often insufficient for an environment with many competing users. This section describes the additional scheduling policies that make job scheduling in LSF Batch powerful and flexible.

Host Partition Fairshare Scheduling

Host partition fairshare scheduling gives competing users fair access to the resources of a group of hosts. In a host partition, each user or group is assigned a share of the total CPU time available on the shared hosts. The total available CPU time is divided by the total number of shares configured, and each user is given CPU time in proportion to the number of shares held.

The LSF Batch system keeps track of the total CPU time used by each user or group and the number of jobs currently running for each user or group. Each user or group is assigned a priority based on the number of jobs running and the CPU time usage averaged over the last HIST_HOURS hours (see 'Configuration Parameters'). If users or groups have used less than their share of the processing resources, their pending jobs (if any) are scheduled first, jumping ahead of other jobs in the batch queues.

The special user names others and default can also be assigned shares. If a share is assigned to the user name others, then the CPU time for all users not explicitly listed in the host partition is totalled up and compared to the configured share. If a share is assigned to the user name default, then the CPU time is counted separately for each user not explicitly named, and each of these users is allowed the configured share. The special host name all can be used to configure a host partition that applies to all hosts in the cluster.

The bhpart command displays the current cumulative CPU usage and scheduling priority for each user or group in a host partition.
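
For example, the following hypothetical display shows the usage of two groups in a host partition named cpuServers (the partition name, group names, values, and exact columns are illustrative and depend on your configuration and LSF version):

% bhpart cpuServers
HOST_PARTITION_NAME:  cpuServers
HOSTS:  hostA hostD

USER/GROUP     SHARES   PRIORITY   STARTED   RESERVED   CPU_TIME   RUN_TIME
groupA             70     10.500         1          0      180.5        300
groupB             15      3.200         0          0       55.0         80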

If more than one user or group is configured, each is assigned a priority based on how far its current CPU usage is above or below its share. Jobs from the user or group that is farthest below its share are scheduled first, followed by jobs from the next farthest below, as long as there are jobs and hosts available.

Note:
The CPU time used for host partition scheduling is not normalized by the host CPU speed factors.

Queue-Level Fairshare Scheduling

Queue-level fairshare policies allow a sharing policy to be defined for individual queues, rather than through host partitions, which apply to all queues. This provides the flexibility of applying different policies to different classes of jobs.

The LSF administrator can define policies such as FCFS, equal share, or unequal share at the queue level. The basic mechanism of queue level fairshare scheduling is the same as that of host partition level fairshare scheduling.

By default, jobs in a batch queue are scheduled first-come, first-served: jobs are considered for scheduling in the order in which they are submitted. The LSF administrator can modify this policy by explicitly specifying the FAIRSHARE policy in the queue definition. The factors governing fairshare scheduling decisions include the number of shares assigned to each user or group, the number of jobs each user currently has running, the CPU time each user has consumed in the recent past, and how long each job has waited.

Preemptive and Preemptable Scheduling

When LSF Batch schedules jobs, those in higher priority queues are considered first. Jobs in lower priority queues are only started if all higher priority jobs are waiting for specified resources, hosts, starting times, or other constraints.

When a high priority job is ready to run, all the LSF Batch server hosts may already be running lower priority jobs. The high priority job ends up waiting for the low priority jobs to finish. If the low priority jobs take a long time to complete, the higher priority jobs may be blocked for an unacceptably long time.

LSF solves this problem by allowing preemptive scheduling within LSF Batch queues. Jobs pending in a preemptive queue can preempt lower priority jobs on a host by suspending them and starting the higher priority jobs on the host.

A queue can also be defined as preemptable. In this case, jobs in higher priority queues can preempt jobs in the preemptable queue even if the higher priority queues are not specified as preemptive.

Note:
When the preemptive scheduling policy is used, jobs in preemptive queues may violate the user or host job slot limits. However, LSF Batch ensures that the total number of slots used by running jobs (excluding jobs that are suspended) does not exceed the job slot limits. This is done by suspending lower priority jobs.

Exclusive Scheduling

Some queues accept exclusive jobs. A job can run exclusively only if it is submitted with the -x option to the bsub command, specifying a queue that is configured to accept exclusive jobs. An exclusive job runs by itself on a host: it is dispatched only to a host with no other batch jobs running, and LSF does not send any other jobs to the host until the exclusive job completes.

Once an exclusive job is started on a host, the LSF Batch system locks that host out of load sharing by sending a request to the underlying LSF to change the host's status to lockU. The host is no longer available for load sharing by any other task (either interactive or batch) until the exclusive job finishes.
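
For example, assuming your site has a queue named exclusive that is configured to accept exclusive jobs (the queue name and job ID are hypothetical; check bqueues at your site):

% bsub -x -q exclusive my_benchmark
Job <1290> is submitted to queue <exclusive>.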

Interactive Batch Scheduling

Any batch queue can be defined to accept interactive jobs. A queue can be configured to dispatch jobs quickly; however, this will apply to all jobs submitted to the queue, batch or interactive. All batch scheduling policies and host selection features for resource intensive jobs apply to interactive jobs.

Scheduling Parameters

Scheduling parameters specify the load conditions under which pending jobs are dispatched, running jobs are suspended, and suspended jobs are resumed. These parameters are configured by the LSF administrator in a variety of ways.

Load Thresholds

Load thresholds can be configured by your LSF administrator to schedule jobs in queues. There are two types of load threshold: loadSched and loadStop. Each load threshold specifies a load index value. A loadSched threshold is the scheduling threshold, which determines the load condition for dispatching pending jobs. If a host's load exceeds any defined loadSched threshold, a job is not started on the host. This threshold is also used as the condition for resuming suspended jobs. A loadStop threshold is the suspending condition, which determines when running jobs should be suspended.

Thresholds can be configured for each queue, for each host, or a combination of both. To schedule a job on a host, the load levels on that host must satisfy both the thresholds configured for that host and the thresholds for the queue from which the job is being dispatched.

The value of a load index may either increase or decrease with load, depending on the meaning of the specific load index. Therefore, when comparing the host load conditions with the threshold values, either greater than (>) or less than (<) is used, depending on the load index. For example, a host is busier as its CPU utilization (ut) rises, but busier as its idle time (it) falls.

When jobs are running on a host, LSF Batch periodically checks the load levels on that host. If any load index exceeds the corresponding per-host or per-queue suspending threshold for a job, LSF Batch suspends the job. The job remains suspended until the load levels satisfy the scheduling thresholds. For example, the normal queue shown later in this chapter defines loadSched pg=4.0 and loadStop pg=8.0 for the paging rate: a job from that queue is suspended if the paging rate on its host climbs above 8.0 and is resumed only after the rate drops back below 4.0.

To find out what parameters are configured for your cluster, see 'Detailed Queue Information' and 'Batch Hosts'.

Resource Requirement Parameters

In addition to load thresholds, your LSF administrator can also define scheduling conditions in terms of resource requirements. Three parameters, RES_REQ, STOP_COND, and RESUME_COND, can be specified in the definition of a queue. These parameters take resource requirement strings as values (see 'Resource Requirement Strings' for more details), which allows scheduling conditions to be specified more flexibly.

The resource requirement conditions for dispatching a job to a host can be specified through the queue level RES_REQ parameter (see 'Queue-Level Resource Requirement' of the LSF Administrator's Guide for further details).

You can also specify the resource requirements for your job using the -R option to the bsub command. If you specify resource requirements that are already defined in the queue, the host must satisfy both requirements to be eligible for running the job. In some cases, the queue specification sets an upper or lower bound on a resource. If you attempt to exceed that bound, your job will be rejected.
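
For example, the following hypothetical submission uses the select and order sections of a resource requirement string (see 'Resource Requirement Strings') to request a host with more than 20 megabytes of available swap space, preferring hosts with a short 1-minute run queue:

% bsub -R "select[swp>20] order[r1m]" myjob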

The condition for suspending a job can be specified using the queue level STOP_COND parameter. It is defined by a resource requirement string (see 'Suspending Condition' of the LSF Administrator's Guide). The stopping condition can only be specified in the queue.

The resource requirement conditions that must be satisfied on a host before a suspended job can be resumed are specified using the queue-level RESUME_COND parameter (for more detail see 'Resume Condition' of the LSF Administrator's Guide). The resume condition can only be specified in the queue.

To find out details about the parameters of your cluster, see 'Detailed Queue Information' and 'Batch Hosts'.

Run and Dispatch Windows for Queues and Hosts

Separate time windows can be defined to control when jobs can be dispatched and when they are to be suspended.

Run Windows

Run windows are time windows during which jobs are allowed to run. When the windows are closed, running jobs are suspended and no new jobs are dispatched. The default is no restriction, or always open. Run windows can only be defined for queues (see 'Detailed Queue Information').

Note
These windows are only applicable to batch jobs. Interactive jobs scheduled by the Load Information Manager (LIM) of LSF are controlled by another set of run windows (see 'Listing Hosts').

Dispatch Windows

Dispatch windows are time windows during which jobs are allowed to be started. However, dispatch windows have no effect on jobs that have already started. This means that jobs are allowed to run outside the dispatch windows, but no new jobs will be started. The default is no restriction, or always open. Note that no jobs are allowed to start when the run windows are closed. Dispatch windows can be defined for both queues (see 'Detailed Queue Information') and batch server hosts (see 'Batch Hosts').

Batch Queues

Batch queues represent different job scheduling and control policies. All jobs submitted to the same queue share the same scheduling and control policy. Batch queues do not correspond to individual hosts; each job queue can use all server hosts in the cluster, or a configured subset of the server hosts.

The LSF administrator can configure job queues to control access to resources by different users and types of applications. Users select the job queue that best fits each job.

Finding Out What Queues Are Available

The bqueues command lists the available LSF Batch queues.

% bqueues
QUEUE_NAME     PRIO      STATUS      MAX  JL/U JL/P JL/H NJOBS  PEND  RUN  SUSP
interactive     400   Open:Active      -    -    -    -     2     0     2     0
priority        43    Open:Active      -    -    -    -    16     4    11     1
night           40   Open:Inactive     -    -    -    -     4     4     0     0
short           35    Open:Active      -    -    -    -     6     1     5     0
license         33    Open:Active      -    -    -    -     0     0     0     0
normal          30    Open:Active      -    -    -    -     0     0     0     0
idle            20    Open:Active      -    -    -    -     5     3     1     2

The PRIO column gives the priority of the queue. The bigger the value, the higher the priority. Queue priorities are used by LSF Batch for job scheduling and control. Jobs from higher priority queues are dispatched first. Jobs from lower priority queues are suspended first when hosts are overloaded.

The STATUS column shows the queue status. A queue accepts new jobs only if it is open and dispatches jobs only if it is active. A queue can be opened or closed only by the LSF administrator. Jobs submitted to a queue that is later closed are still dispatched as long as the queue is active. A queue can be made active or inactive either by the LSF administrator or by the run and dispatch windows of the queue.

The MAX column shows the limit on the number of jobs dispatched from this queue at one time. This limit prevents jobs from a single queue from using too many hosts in a cluster at one time.

The JL/U column shows the limit on the number of jobs dispatched at one time from this queue for each user. This prevents a single user from occupying too many hosts in a cluster while other users' jobs are waiting in the queue.

The JL/P column shows the limit on the number of jobs from this queue dispatched to each processor. This prevents a single queue from occupying too many of the resources on a host.

The JL/H column shows the maximum number of job slots a host can allocate for this queue. This limit controls the number of job slots for the queue on each host, regardless of the type of host: uniprocessor or multiprocessor.

The NJOBS column shows the total number of job slots required by all jobs in the queue, including jobs that have not been dispatched and jobs that have been dispatched but have not finished.

Note
A parallel job with N components would require N job slots.

The PEND column shows the number of job slots needed by pending jobs in this queue.

The RUN column shows the number of job slots used by running jobs in this queue.

The SUSP column shows the number of job slots required by suspended jobs in this queue.
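
In the display above, NJOBS equals the sum of the PEND, RUN, and SUSP columns for each queue; for example, the priority queue shows 16 = 4 + 11 + 1.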

Detailed Queue Information

The -l option to the bqueues command displays the complete status and configuration for each queue. You can specify queue names on the command line to select specific queues:

% bqueues -l normal

QUEUE: normal
  -- For normal low priority jobs, running only if hosts are lightly loaded. 
This is the default queue.

PARAMETERS/STATISTICS
 PRIO NICE     STATUS       MAX JL/U JL/P NJOBS  PEND  RUN  SSUSP USUSP
  40   20    Open:Active    100   50   11    1     1     0     0     0
Migration threshold is 30 min.

 CPULIMIT             RUNLIMIT
 20 min of IBM350     342800 min of IBM350

 FILELIMIT    DATALIMIT    STACKLIMIT   CORELIMIT    MEMLIMIT     PROCLIMIT
 20000 K      20000 K        2048 K     20000 K      5000 K       3

SCHEDULING PARAMETERS
           r15s   r1m  r15m   ut    pg    io   ls    it    tmp    swp    mem
 loadSched   -    0.7   1.0  0.2   4.0    50    -     -     -      -      -
 loadStop    -    1.5   2.5    -   8.0   240    -     -     -      -      -

SCHEDULING POLICIES:  FAIRSHARE  PREEMPTIVE PREEMPTABLE EXCLUSIVE
USER_SHARES:  [groupA, 70] [groupB, 15]  [default, 1]

DEFAULT HOST SPECIFICATION : IBM350

RUN_WINDOWS:  2:40-23:00 23:30-1:30

DISPATCH_WINDOWS:  1:00-23:50

USERS: groupA/ groupB/ user5
HOSTS:  hostA, hostD, hostB
ADMINISTRATORS:  user7
PRE_EXEC: /tmp/apex_pre.x > /tmp/preexec.log 2>&1
POST_EXEC:  /tmp/apex_post.x > /tmp/postexec.log 2>&1
REQUEUE_EXIT_VALUES:  45

The bqueues -l command only displays fields that apply to the queue. Any field that is not displayed has a default value that does not affect job scheduling or execution. In addition to the fields displayed by the default bqueues command, the fields that may be displayed are:

DESCRIPTION
A description of the typical use of the queue.
Default queue indication
Indicates that this is the default queue.
SSUSP
The number of job slots required by jobs suspended by the system because of load levels or run windows.
USUSP
The number of job slots required by jobs suspended by the user or the LSF administrator.
RSV
The number of job slots in the queue that are reserved by LSF Batch for pending jobs.
Migration threshold
The time that a job dispatched from this queue can be suspended by the system before LSF Batch attempts to migrate the job to another host.
CPULIMIT
The maximum CPU time a job can use, in minutes relative to the CPU factor of the named host. CPULIMIT is scaled by the CPU factor of the execution host so that jobs are allowed more time on slower hosts.
When the job-level CPULIMIT is reached, the system sends SIGXCPU to all processes in the job.
RUNLIMIT
The maximum wall clock time a process can use, in minutes. RUNLIMIT is scaled by the CPU factor of the execution host. When a job has been in the RUN state for a total of RUNLIMIT minutes, LSF Batch sends a SIGUSR2 signal to the job. If the job does not exit within 10 minutes, LSF Batch sends a SIGKILL signal to kill the job.
FILELIMIT
The maximum file size a process can create, in kilobytes. This limit is enforced by the UNIX setrlimit system call if it supports the RLIMIT_FSIZE option, or the ulimit system call if it supports the UL_SETFSIZE option.
DATALIMIT
The maximum size of the data segment of a process, in kilobytes. This restricts the amount of memory a process can allocate. DATALIMIT is enforced by the setrlimit system call if it supports the RLIMIT_DATA option, and is not enforced otherwise.
STACKLIMIT
The maximum size of the stack segment of a process, in kilobytes. This restricts the amount of memory a process can use for local variables or recursive function calls. STACKLIMIT is enforced by the setrlimit system call if it supports the RLIMIT_STACK option.
CORELIMIT
The maximum size of a core file, in kilobytes. This limit is enforced by the setrlimit system call if it supports the RLIMIT_CORE option.
MEMLIMIT
The maximum resident set size (RSS) of a process, in kilobytes. If a process uses more than MEMLIMIT kilobytes of memory, its priority is reduced so that other processes are more likely to be paged in to available memory. This limit is enforced by the setrlimit system call if it supports the RLIMIT_RSS option.
PROCLIMIT
The maximum number of processors allocated to a job. Jobs requesting more processors than the queue's PROCLIMIT are rejected.
PROCESSLIMIT
The maximum number of concurrent processes allocated to a job. If PROCESSLIMIT is reached, the system sends the following signals in sequence to all processes in the job: SIGINT, SIGTERM, and SIGKILL.
SWAPLIMIT
The swap space limit that a job may use. If SWAPLIMIT is reached, the system sends the following signals in sequence to all processes in the job: SIGINT, SIGTERM, and SIGKILL.
loadSched
The load thresholds LSF Batch uses to determine whether a pending job in this queue can be dispatched to a host, and to determine when a suspended job can be resumed. The load indices are explained in 'Load Indices'.
loadStop
The load thresholds LSF Batch uses to determine when to suspend a running batch job in this queue.
SCHEDULING POLICIES
Scheduling policies of the queue. Optionally, one or more of the following policies may be configured:
FAIRSHARE
Jobs in this queue are scheduled based on a fairshare policy. In general, a job is dispatched before other jobs in this queue if the job's owner has more shares (see USER_SHARES below), has fewer running jobs, has used less CPU time in the recent past, and the job has waited longer. If all users have the same shares, jobs in this queue are scheduled in a round-robin fashion.
If the fairshare policy is not specified, jobs in this queue are scheduled based on the conventional first-come, first-served (FCFS) policy. That is, jobs are dispatched in the order they were submitted.
PREEMPTIVE
Jobs in this queue may preempt running jobs from lower priority queues. That is, jobs in this queue may be able to start even though the job slot limit of a host or a user has been reached, as long as some of the slots counted toward the limit are occupied by jobs from queues with lower priority than this queue. Those lower priority jobs are suspended so that the number of running jobs (excluding suspended jobs) stays within the corresponding job slot limit. If the preemptive policy is not specified, the default is not to preempt any job.
PREEMPTABLE
Jobs in this queue may be preempted by jobs in higher priority queues, even if the higher priority queues are not specified as preemptive.
EXCLUSIVE
Jobs dispatched from this queue can run exclusively on a host if the user so specifies at job submission time (see 'Other bsub Options'). Exclusive execution means that the job is sent to a host with no other batch jobs running there, and no further job, batch or interactive, is dispatched to that host while the job is running. The default is not to allow exclusive jobs.
USER_SHARES
A list of [username, share] pairs. username is either a user name or a user group name. share is the number of shares of resources assigned to the user or user group. A party will get a portion of the resources proportional to the party's share divided by the sum of the shares of all parties specified in this queue.
DEFAULT HOST SPECIFICATION
A host name or host model name. The appropriate CPU scaling factor of the host or host model (see lsinfo(1)) is used to adjust the actual CPU time limit at the execution host (see CPULIMIT above). This specification overrides the system default DEFAULT_HOST_SPEC (see 'Configuration Parameters').
RUN_WINDOWS
One or more run windows in a week during which jobs in this queue may execute. When a queue is out of its window or windows, no job in this queue will be dispatched. In addition, when the end of a run window is reached, any running jobs from this queue are suspended until the beginning of the next run window, when they are resumed. The default is no restriction, or always open.
A window is displayed in the format begin_time-end_time. Time is specified in the format [day:]hour[:minute], where all fields are numbers in their respective legal ranges: 0 (Sunday) through 6 for day, 0 through 23 for hour, and 0 through 59 for minute. The default value for minute is 0 (on the hour); the default for day is every day of the week. The begin_time and end_time of a window are separated by '-', with no blank characters (SPACE or TAB) in between. Both begin_time and end_time must be present for a window. Windows are separated by blank characters. If only the character '-' is displayed, the windows are always open. A worked example follows this list.
DISPATCH_WINDOWS
One or more dispatch windows in a week during which jobs in this queue may be dispatched to run. When a queue is out of its windows, no job in this queue can be dispatched. Jobs already dispatched are not affected by the dispatch windows. The default is no restriction, or always open. Dispatch windows are displayed in the same format as run windows (see RUN_WINDOWS above).
USERS
The list of users allowed to submit jobs to this queue.
HOSTS
The list of hosts to which this queue can dispatch jobs.
NQS DESTINATION QUEUES
The list of NQS queues to which this queue can dispatch jobs.
ADMINISTRATORS
A list of administrators of the queue. The users whose names are specified here are allowed to operate on the jobs in the queue and on the queue itself.
PRE_EXEC
Queue's pre-execution command. This command is executed before the real batch job is run on the execution host (or on the first host selected for a parallel batch job).
POST_EXEC
Queue's post-execution command. This command is executed on the execution host when a job terminates.
REQUEUE_EXIT_VALUES
Jobs that exit with these values are automatically requeued.
RES_REQ
Resource requirements of the queue. Only hosts that satisfy this resource requirement can be used by the queue.
RESUME_COND
The condition(s) that must be satisfied to resume a suspended job on a host.
STOP_COND
The condition(s) that determine whether a job running on a host should be suspended.
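
As a worked example of the window format described under RUN_WINDOWS above, a hypothetical window displayed as

5:18:30-1:8:30

opens at 6:30 p.m. on Friday (day 5) and closes at 8:30 a.m. on Monday (day 1). In the sample output above, the RUN_WINDOWS value 2:40-23:00 has no day field, so it opens at 2:40 a.m. and closes at 11:00 p.m. every day of the week.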

Note that some parameters are displayed only if they are defined.

Automatic Queue Selection

When more than one batch queue is available, you need to decide which queue to use. If you submit a job without specifying a queue name, the LSF Batch system automatically chooses a suitable queue for the job from the candidate default queues, based on the requirements of the job.

Specifying Default Queues

LSF Batch has default queues. The bparams command displays them:

% bparams
Default Queues: normal
...

The user can override this list by defining the environment variable LSB_DEFAULTQUEUE.
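
For example, to make the priority queue from the earlier bqueues display your personal default (C shell syntax; in the Bourne shell, use LSB_DEFAULTQUEUE=priority; export LSB_DEFAULTQUEUE):

% setenv LSB_DEFAULTQUEUE priority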

Queue Selection Mechanism

Although simple to use, automatic queue selection may not behave as expected if you do not choose your candidate queues properly. To select a suitable queue, LSF Batch checks each candidate queue against the requirements of the job, for example, whether you are allowed to submit jobs to the queue, whether the hosts the job needs can be used by the queue, and whether the job's requirements exceed the limits of the queue.

If multiple queues satisfy the above requirements, then the first queue listed in the candidate queues (as defined by DEFAULT_QUEUE or LSB_DEFAULTQUEUE) that satisfies the requirements is selected.

Choosing a Queue

The default queues are normally suitable to run most jobs for most users, but they may have a very low priority or restrictive execution conditions to minimize interference with other jobs. If automatic queue selection is not satisfactory, you should choose the most suitable queue for each job.

The factors affecting your decision are user access restrictions, size of the job, resource limits of the queue, scheduling priority of the queue, active time windows of the queue, hosts used by the queue, the scheduling load conditions, and the queue description displayed by the bqueues -l command.

The -u user_name option specifies a user or user group so that bqueues displays only the queues that accept jobs from these users.

The -m host_name option allows users to specify a host name or host group name so that bqueues displays only the queues that use these hosts to run jobs.
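
For example, using names from the examples in this chapter:

% bqueues -u user1
% bqueues -m hostA

The first command lists only the queues that accept jobs from user1; the second lists only the queues that can dispatch jobs to hostA.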

You must also be sure that the queue is open and active.

The following examples are based on the queues defined in the default LSF configuration. Your LSF administrator may have configured different queues.

To run a job during off hours because the job generates a very high load on both the file server and the network, you can submit it to the night queue; use bsub -q night.

If you have an urgent job to run, you may want to submit it to the priority queue; use bsub -q priority.

If you want to use hosts owned by others without disturbing the owners, you can submit your low priority jobs to the idle queue; as soon as an owner returns, your jobs are suspended.

If you are running small jobs and do not want to wait too long to get the results, you can submit jobs to the short queue to be dispatched with higher priority. Make sure your jobs are short enough that they are not killed for exceeding the CPU time limit of the queue (check the resource limits of the queue, if any).

Batch Users

The busers command displays the maximum number of jobs a user or group may execute on a single processor, the maximum number of job slots a user or group may use in the cluster, the total number of job slots required by all submitted jobs of the user, and the number of job slots in the PEND, RUN, SSUSP, and USUSP states. If no user is specified, the default is to display information about the user who invokes this command. Here is an example of the output from the busers command:

% busers all
USER/GROUP       JL/P  MAX  NJOBS  PEND  RUN  SSUSP USUSP RSV 
default            1    12     -     -     -     -     -   - 
user9              1    12    34    22    10     2     0   0 
groupA             -   100    19     7    11     1     1   0 

Note that if the reserved user name all is specified, busers reports all users who currently have jobs in the system, as well as default, which represents a typical user. The purpose of listing default in the output is to show the job slot limits (JL/P and MAX) of a typical user. No other parameters make sense for default.

Note
The counters displayed by busers treat a parallel job requesting N processors the same as N jobs requesting one processor.

Batch Hosts

LSF Batch uses some (or all) of the hosts in an LSF cluster as execution hosts. The host list is configured by the LSF administrator. The bhosts command displays information about these hosts.

% bhosts
HOST_NAME     STATUS   JL/U  MAX   NJOBS   RUN  SSUSP USUSP RSV 
hostA         ok         2     2     0     0     0     0     0 
hostD         ok         2     4     2     1     0     0     1 
hostB         ok         1     2     2     1     0     1     0 

STATUS gives the status of the host and the sbatchd daemon. If a host is down or the LIM is unreachable, the STATUS is unavail. If the LIM is reachable but the sbatchd is not up, STATUS is unreach.

JL/U is the job slot limit per user. The host will not allocate more than JL/U job slots for one user at the same time. MAX gives the maximum number of job slots that are allowed on this host. This does not mean that the host has to always allocate this many job slots if there are waiting jobs; the host must also satisfy its configured load conditions to accept more jobs.

The columns NJOBS, RUN, SSUSP, USUSP, and RSV show the number of job slots used by jobs currently dispatched to the host, running on the host, suspended by the system, suspended by the user, and reserved on the host respectively.

The -l option to the bhosts command gives all information about each batch server host such as the CPU speed factor and the load threshold values for starting, resuming and suspending jobs. You can also specify host names on the command line to list the information for specific hosts.

% bhosts -l hostB

HOST: hostB
 STATUS        CPUF  JL/U  MAX NJOBS  RUN SSUSP USUSP  RSV DISPATCH_WINDOWS
 ok             9      1    2    2    1     0     0     1    2:00-20:30

           r15s   r1m  r15m   ut    pg    io   ls    it    tmp    swp    mem
 loadSched   -     -     -     -     -     -    -     -     -      -      -
 loadStop    -     -     -     -    40     -    -     -     -      -      -

 Migration threshold is 40 min.
 Files are copied at checkpoint.

The DISPATCH_WINDOWS column shows the time windows during which jobs can be started on the host. See 'Detailed Queue Information' for a description of the format of the DISPATCH_WINDOWS column. Unlike the queue run windows, jobs are not suspended when the host dispatch windows close. Jobs running when the host dispatch windows close continue running, but no new jobs are started until the windows reopen.

CPUF is the host CPU factor. loadSched and loadStop are the scheduling and suspending thresholds for the host. If a threshold is not defined, the threshold from the queue definition applies. If both the host and the queue define a threshold for a load index, the most restrictive threshold is used.

The migration threshold is the time that a job dispatched to this host can be suspended by the system before LSF Batch attempts to migrate the job to another host.

If the host supports checkpoint copy, this is indicated here. With checkpoint copy, the operating system automatically copies all open files to the checkpoint directory when a process is checkpointed. Checkpoint copy is currently supported only on ConvexOS and Cray systems.

User and Host Groups

The LSF administrator can configure user and host groups. The group names act as convenient aliases wherever lists of user or host names can be specified on the command line or in configuration files. The administrator can also limit the total number of running jobs belonging to a user or a group of users.

The bugroup and bmgroup commands list the configured group names and members for user and host groups respectively.

% bugroup acct_users
GROUP_NAME    USERS
acct_users : user1 user2 user4

% bmgroup big_servers
GROUP_NAME    HOSTS
big_servers  : hostD hostK

Specifying a user or host group to an LSF Batch command is the same as specifying all the user or host names in the group. For example, the command bsub -m big_servers specifies that the job may be dispatched to either of the hosts hostD or hostK. The command bjobs -l lists detailed information about the job, including the specified hosts and the load thresholds that apply to the job.

% bsub -m big_servers hostname
Job <31556> is submitted to default queue <normal>.

% bjobs -l 31556

Job Id <31556>, User <user1>, Status <DONE>, Queue <normal>, Command <hostname>
Thu Oct 27 01:47:51: Submitted from host <hostA>, CWD <$HOME>, Specified
                     Hosts <big_servers>;
Thu Oct 27 01:47:52: Started on <hostK>;
Thu Oct 27 01:47:53: Done successfully. The CPU time used is 0.2 seconds.

           r15s   r1m  r15m   ut    pg    io   ls    it    tmp    swp    mem
 loadSched   -     -     -     -     -     -    -     -     -     12      -
 loadStop    -     -     -     -    55     -    -     -     -      -      -

Configuration Parameters

The bparams command reports some generic configuration parameters of the LSF Batch system. These include the default queues, default host or host model for CPU speed scaling, job dispatch interval, job checking interval, job accepting interval, etc. The command can display such information in either short format or long format. The short format summarizes a few key parameters. For example:

% bparams
Default Queues:  normal idle
Default Host Specification:  DECAXP
Job Dispatch Interval:  20 seconds
Job Checking Interval:  15 seconds
Job Accepting Interval:  20 seconds

The -l option to the bparams command displays the information in long format, which gives a brief description of each parameter as well as the name of the parameter as it appears in the lsb.params file. In addition, the long format lists every parameter defined in the lsb.params file. Here is an example of the output from the long format of the bparams command:

% bparams -l

System default queues for automatic queue selection:
    DEFAULT_QUEUE = normal idle

The interval for dispatching jobs by master batch daemon:
    MBD_SLEEP_TIME = 20 (seconds)

The interval for checking jobs by slave batch daemon:
    SBD_SLEEP_TIME = 15 (seconds)

The interval for a host to accept two batch jobs subsequently:
    JOB_ACCEPT_INTERVAL = 1 (* MBD_SLEEP_TIME)

The idle time of a host for resuming pg suspended jobs:
    PG_SUSP_IT = 180 (seconds)

The amount of time during which finished jobs are kept in core:
    CLEAN_PERIOD = 3600 (seconds)

The maximum number of finished jobs that are logged in current event file:
    MAX_JOB_NUM = 2000

The maximum number of retries for reaching a slave batch daemon:
    MAX_SBD_FAIL = 3

The number of hours of resource consumption history:
    HIST_HOURS = 5

The default project assigned to jobs:
    DEFAULT_PROJECT = default
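
Note that JOB_ACCEPT_INTERVAL is expressed as a multiple of MBD_SLEEP_TIME, so the value 1 shown above means a host accepts at most one new batch job every 20 seconds; this matches the Job Accepting Interval of 20 seconds reported by the short format.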

User Controlled Account Mapping

By default, LSF assumes a uniform user name space within a cluster. Some sites do not satisfy this assumption. For such sites, LSF provides support for the execution of batch jobs within a cluster with a non-uniform user name space.

You can set up a hidden .lsfhosts file in your home directory that tells LSF which account to use when you send jobs to remote hosts and which remote users are allowed to run jobs under your local account. This is similar to the .rhosts file used by rcp, rlogin, and rsh.

The .lsfhosts file consists of multiple lines, where each line is of the form:

hostname|clustername username [send|recv]

A '+' in the hostname or username field indicates any LSF host or user respectively. The keyword send indicates that if you send a job to host hostname, then the account username should be used. The keyword recv indicates that your local account is enabled to run jobs from user username on host hostname. If neither send nor recv are specified, then your local account can both send jobs to and receive jobs from the account username on hostname.

Note
The clustername argument is used for the LSF MultiCluster product. See 'Using LSF MultiCluster'.

Lines beginning with '#' are ignored.

Note
The permission on your .lsfhosts file must be 0600 (that is, read/write only by the user). Otherwise, your .lsfhosts file is silently ignored.
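
If the permissions on the file are wrong, you can correct them as follows:

% chmod 600 ~/.lsfhosts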

For example, assume that hostB and hostA in your cluster do not share the same user name/user ID space. You have an account user1 on host hostB and an account ruser_1 on host hostA. You want to be able to submit jobs from hostB to run on hostA.

Your .lsfhosts files should be set up as follows:

On hostB:

% cat ~user1/.lsfhosts
hostA ruser_1 send

On hostA:

% cat ~ruser_1/.lsfhosts
hostB user1 recv

As another example, assume you have account user1 on host hostB and want to use the lsfguest account when sending jobs to be run on host hostA. The lsfguest account is intended to be used by any user submitting jobs from any LSF host.

The .lsfhosts files should be set up as follows:

On hostB:

% cat ~user1/.lsfhosts
hostA lsfguest send

On hostA:

% cat ~lsfguest/.lsfhosts
+  + recv

When account mapping is used, your job is always started as a login shell so that the start-up files of the account under which your job runs are sourced.

Your .lsfhosts file is read at job submission time. Subsequent changes made to this file will not affect the account used to run the job. Jobs submitted after the changes are made will pick up the new entries.

If you attempt to map to an account for which you have no permission, your job is put into PSUSP state. You can modify the .lsfhosts file of the execution account to give appropriate permission and resume the job.
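
For example, assuming job 31556 was suspended because ruser_1 on hostA had not yet granted permission to user1 (the names and job ID are hypothetical; see bresume(1)):

On hostA, as ruser_1:

% echo "hostB user1 recv" >> ~/.lsfhosts
% chmod 600 ~/.lsfhosts

On hostB, as user1:

% bresume 31556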

Note
The bpeek command will not work on a job running under a different user account.

File transfer using the -f option to the bsub command will not work when running under a different user account unless rcp(1) is set up to do the file copying.

