Ben

Manual

SYNOPSIS

ben command [options]

DESCRIPTION

Ben is a batch job scheduler. It maintains a queue of jobs that are to be executed. Each job is defined by a shell command or a shell script. A central server handles coordination, while clients run the jobs. Executions can be distributed across a number of computers on a network and run in parallel.

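The ben command has three modes of operation:

  • server: “ben server” runs the central server that maintains the job queue
  • client: “ben client” runs a client that executes jobs
  • control: the remaining commands (“ben add”, “ben rm”, “ben exec”, “ben scale”, “ben kill”, “ben list”, “ben status”, “ben nodes”, “ben exit”) inspect or modify the queue and the clients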

Connections to the server are established over named Unix domain sockets or TCP/IP. Clients can connect and disconnect dynamically. As a result, commands may be interrupted, in which case the server requeues them until they have run to completion on a client. Likewise, commands can be added to the queue or removed from it at any moment; the output of completed commands, however, is not removed from the server.

By default, the server creates a Unix domain socket, and each client connects to a local Unix domain socket on its own computer. The latter socket is created using ssh and forwards to the server. This way, authentication relies on Unix user permissions. Support for TCP/IP sockets is kept for compatibility, but it comes with no security: with TCP/IP sockets, one must trust all users on all computers involved, and anyone who can connect to them.

SETTING UP SERVER AND CLIENTS

“ben server” starts the server. One frequent option is “-d” to detach from the current terminal (this creates a so-called daemon). All options are detailed below.

“ben client” starts a client. Again, one can specify “-d” to detach from the current terminal. Use “-n processes” to specify the number of jobs that can run simultaneously on this client. If the server is on a remote computer that we can reach using ssh, then we can let the client talk to the server by using the “-f” option:

ben client -f user@server-host

If instead the client computer can be reached from the server, we can run the command

ben client -r user@client-host

from the computer that runs the server. Note that in the latter case, the “ben” binary must be in the $PATH on the client computer. Alternatively, the path to “ben” can be specified using “--remote-path /path/to/ben”.

CONTROLLING JOBS

The same command may be used to run multiple jobs, simply by specifying multiple job names. Distinct jobs can be differentiated (even if they share the same shell command definition) through three environment variables that are defined at execution time: “$outdir”, “$options” and “$job”. They contain, respectively, the output directory on the server for the current set of jobs, the options passed when the job was queued, and the job name. (For compatibility with earlier versions, the variable “$task” is also defined with the same value as “$job”.)
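As an illustration, a job script can branch on these variables. The sketch below is not part of ben: the function name job_body and the fallback values are hypothetical, and only “$job”, “$options” and “$outdir” come from the description above.

```shell
#!/bin/sh
# Sketch of a job script. ben is expected to define $job, $options and
# $outdir at execution time; the fallbacks below are only an assumption
# that makes the script runnable by hand, outside of ben.
job_body() {
    printf 'job=%s options=%s outdir=%s\n' \
        "${job:-unnamed}" "${options:-}" "${outdir:-.}"
}

job_body
```

Run by hand, the script prints the fallback values; queued through ben, each job would print its own name and options.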

“ben add” queues a series of jobs, all using a single specified shell command. A typical use is

ben add -o output-directory -C script.sh -J job-list.txt

In this example, the file “job-list.txt” contains the names of the jobs we want to run. For each of them, “script.sh” is run with the variable $job set to the appropriate value (the job name). The output will be stored on the server side, in files called “output-directory/job-name.out” for stdout and “output-directory/job-name.log” for stderr, where job-name is the name of each job.
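The naming scheme above can be sketched as a small helper that lists the files a batch should eventually produce. The helper name expected_outputs is hypothetical, and the sketch assumes that the job list holds one job name per line, which this manual does not state explicitly.

```shell
#!/bin/sh
# Print the server-side files that a batch is expected to produce,
# following the "output-directory/job-name.out" and ".log" naming above.
# Assumption: the job list holds one job name per line.
expected_outputs() {
    target_dir=$1
    list_file=$2
    while IFS= read -r name; do
        [ -n "$name" ] || continue          # skip empty lines
        printf '%s/%s.out\n%s/%s.log\n' \
            "$target_dir" "$name" "$target_dir" "$name"
    done < "$list_file"
}

# Build a small job list in a temporary file and show the result.
tmp_list=$(mktemp)
printf '%s\n' run-1 run-2 > "$tmp_list"
expected_outputs output-directory "$tmp_list"
rm -f "$tmp_list"
```

Such a list can be compared with the actual content of the output directory once the batch is done.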

Note that many options come in two forms: with a lower-case letter, the option is specified directly; with an upper-case letter, a file is given which contains the option. For example “-c ls” specifies the command “ls”, and is equivalent to “-C file.sh” if “file.sh” itself contains just the command “ls”.

“ben rm” removes and stops jobs matching the provided server-side output directory, job status, and job name. If one parameter is not provided, any value for that parameter will be matched. Note that for safety reasons, if no parameter is provided, a help message is displayed and no operation is performed. Use “ben rm -t drp” to remove all jobs.

“ben exec” queues one copy of the given job per connected client (regardless of the number of processes allowed on that client). This job is different from other jobs in that it is synchronizing: (a) it can only run once no preceding jobs are pending in the queue, and (b) no other job can run simultaneously on the same client. In other words, for a given client, all preceding jobs can only run strictly before the synchronizing command, and all subsequent jobs after. Synchronizing jobs are useful to schedule a code update or a recompilation, then immediately start queuing jobs for the updated code. Note that “ben exec” is blocking: it waits until all the jobs it schedules have completed, displaying their output. You can use -d to background the command, or type Ctrl+C to interrupt the wait (this does not affect the execution of the command).
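The update-then-queue pattern described above might look like the following sketch. The helper names run and update_and_queue, the DRY_RUN switch and the “make -C ~/project” build command are all hypothetical; only the “ben exec -c” and “ben add” invocations follow this manual. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: rebuild on every client, then queue jobs for the new build.
# DRY_RUN defaults to 1 so that the commands are only printed; set
# DRY_RUN=0 to really invoke ben. "make -C ~/project" is a placeholder.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

update_and_queue() {
    # Synchronizing job: one copy per connected client, run after all
    # previously queued jobs and before all subsequent ones.
    run ben exec -c 'make -C ~/project' || return 1
    # Jobs added afterwards therefore see the rebuilt code.
    run ben add -o output -C script.sh -J jobs.txt
}

update_and_queue
```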

“ben scale” updates the number of processes running on a given client.

“ben kill” stops a client.

“ben list” shows the content of the server queue.

“ben status” shows a summary of job progress per output directory.

“ben nodes” shows a list of the connected clients.

“ben exit” stops the server.

OPTIONS

General options

ben command [-s path | host[:port]] [-x ext | :shift] [-f user@host] [--local-socket path | host[:port]] [--local-extension ext | :shift] [--bridge] [--chdir dir] [-d] [--color auto|never|always|html] [command options]

-s path | host[:port], --socket path | host[:port]

Specify the address of the ben server. For Unix-domain sockets, bind or connect to the file path (by default “/tmp/ben-$USER/socket-default”). For TCP/IP sockets, bind or connect to hostname host. If host is empty or *, bind to all addresses. If host is “localhost”, bind or connect to the loopback interface. If port is not specified, the default is 9000.

-x ext | :shift, --extension ext | :shift

Shorthand for -s address specification. For Unix-domain sockets, bind or connect to the file “/tmp/ben-$USER/socket-ext”. For TCP/IP sockets, bind or connect to localhost on port 9000+shift.
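To make the shorthand concrete, the mapping can be written as a small shell function. This is only a sketch of the rule stated above, not ben's own code: the function name ben_address is hypothetical, and treating an empty shift (as in “-x:”) as zero is an assumption.

```shell
#!/bin/sh
# Sketch of the -x shorthand: map "ext" or ":shift" to the address it
# stands for according to the description above.
# Assumption: an empty shift (plain ":") means a shift of zero.
ben_address() {
    user=${USER:-$(id -un)}
    case $1 in
        :*) offset=${1#:}
            echo "localhost:$((9000 + ${offset:-0}))" ;;
        *)  echo "/tmp/ben-$user/socket-$1" ;;
    esac
}

ben_address test
ben_address :2
```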

-f user@host, --forward user@host

Connect to a ben server running on host. This works by setting up an ssh forwarding from local connections to user@host. When specified, the -s and -x options refer to the ben server running on host (for Unix-domain sockets, $USER becomes user in the default path).

--local-socket path | host[:port]

When -f is used, specify the local address ssh listens to. By default, Unix-domain sockets are used and the local path is “/tmp/ben-$USER/socket-seq” where seq is a random sequence. For TCP/IP sockets, if host is empty or *, bind to all addresses. If host is “localhost”, bind or connect to the loopback interface. If port is not specified, the default is 9000.

--local-extension ext | :shift

When -f is used, shorthand for --local-socket address specification. Similar to -x for -s. For Unix-domain sockets, the path is “/tmp/ben-$USER/socket-ext”. For TCP/IP sockets, bind or connect to localhost on port 9000+shift.

--bridge

When -f is used, sets a default local path of “/tmp/ben-$USER/socket-default” instead of using a random sequence. Useful for configuring forwards and remotes that will be exploited by other ben commands. This also sets to zero the default number of simultaneous jobs for the ben client command, see Client options below.

--chdir dir

Change current directory to dir (creating it if necessary) after reading input files.

-d, --daemon

Detach from terminal (daemonize) after successful network init.

--color auto|never|always|html

Specify when to use color output. The default is auto, i.e., use color when the output is a terminal. The html mode uses html <span> tags for color.

Server options

ben server [general options] [--snapshot path]

--snapshot path

Maintain a snapshot of the queue in the file indicated by path. The file is updated (overwritten) whenever the queue state changes. This option allows one to recover the queue content from the filesystem in case the server process gets interrupted. See “ben add -i” for queue recovery.

Client options

ben client [general options] [--name name] [-n processes] [-r user@host] [--remote-socket path | host[:port]] [--remote-extension ext | :shift] [--bridge]

--name name

Set the name advertised to the server for this client. By default, this is the content of the HOSTNAME environment variable if it is found, or the value returned by gethostname() otherwise.

-n count, --processes count

Set the maximum number of jobs to be run simultaneously by this client instance to count. The default value is 1. If count is negative, it is set to the output of sysconf(_SC_NPROCESSORS_ONLN), i.e., the number of online CPUs.
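To preview the value that a negative count would resolve to on a given machine, the same quantity can be queried from the shell. The sketch below is an approximation: getconf _NPROCESSORS_ONLN is widely available but not guaranteed everywhere, hence the fallbacks, and the function name online_cpus is hypothetical.

```shell
#!/bin/sh
# Shell-level counterpart of sysconf(_SC_NPROCESSORS_ONLN).
# getconf _NPROCESSORS_ONLN is common but not required by POSIX,
# so fall back to nproc, then to 1.
online_cpus() {
    getconf _NPROCESSORS_ONLN 2>/dev/null || nproc 2>/dev/null || echo 1
}

# Print the equivalent explicit invocation instead of running it.
echo "ben client -n $(online_cpus)"
```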

--bridge

See General options above. Sets the number of jobs to zero, unless -n is specified. This is useful for configuring clients whose sole purpose is to maintain forwards and remotes.

-r user@host, --remote user@host

Run the client on the remote computer host instead of locally. This works by setting up an ssh forwarding on the remote host to the local computer. It requires $PATH on the remote host to contain the path to the “ben” binary, unless --remote-path is specified as well. Note that the command will be run in a non-login ssh shell, so $PATH will not contain directories added by the user’s .profile and variants. Also, ssh is not able to create the directory containing the remote socket (“/tmp/ben-$USER/” by default, unless --remote-socket is specified, see below), so this directory must already exist. If it does not, ben will attempt to create it, but this will necessitate multiple invocations of ssh, three in total: 1. failure to create socket, 2. creation of directory, 3. second attempt.

--remote-socket path | host[:port]

When -r is used, specify the address ssh listens to on the remote host. By default, Unix-domain sockets are used and the path is “/tmp/ben-$USER/socket-seq” where seq is a random sequence. For TCP/IP sockets, if host is empty or *, bind to all addresses. If host is “localhost”, listen on the loopback interface. If port is not specified, the default is 9000.

--remote-extension ext | :shift

When -r is used, shorthand for --remote-socket address specification. Similar to -x for -s. For Unix-domain sockets, the path is “/tmp/ben-$USER/socket-ext”. For TCP/IP sockets, listen on localhost on port 9000+shift.

--remote-path path-to-ben

When -r is used, specify where to find the “ben” binary on the remote client.

Control options

ben add [general options] [-o dir] {-c command|-C file} [-q options|-Q file] {name…|-J file|-i file}

ben rm [general options] [-o dir] [-t d|r|p…] [name…|-J file]

ben exec [general options] {-c command|-C file} [-q options|-Q file]

ben scale|kill [general options] [node…|-B file] [-n count] [--retire]

ben list|status [general options] [-l] [-o dir] [-t d|r|p…] [name…|-J file]

ben nodes [general options] [-v]

ben exit [general options] [--retire]

-o dir, --outdir dir

Specify the output directory on the server side. For rm and list, if dir starts with ~, it is taken as a POSIX extended regular expression. Beware of some shells expanding the ~ character; quotes may be necessary.

-c command, --command command, -C file, --command-file file

Specify a command, or a script file containing the command. That command will be passed to the shell designated by the SHELL environment variable.

-q options, --options options, -Q file, --options-file file

Specify the content of the “options” environment variable, or a file containing it. Beware of options starting with a dash (-), as ben can mistake them for its own command-line options.

name…, -j name, --job name, -J file, --job-file file

Specify a set of jobs, or a file containing a set of jobs. For rm and list, jobs can be described by their name, their numerical ID, a range of numerical IDs (e.g. 5-8), or (if name starts with ~) a POSIX extended regular expression. Beware of some shells expanding the ~ character; quotes may be necessary. If name is not specified, match all jobs.
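Before removing jobs by regular expression, it can help to preview locally which names a pattern selects. The sketch below uses grep -E, which implements POSIX extended regular expressions; the job names are made up, and whether ben matches the whole name or any part of it is not stated here, so the pattern carries explicit ^ and $ anchors.

```shell
#!/bin/sh
# Preview which job names a POSIX extended regular expression selects.
# grep -E searches for the pattern anywhere in a line, so the pattern
# is anchored explicitly with ^ and $.
pattern='^run-[0-9]+$'
printf '%s\n' run-1 run-22 run-x other | grep -E "$pattern"

# The leading ~ must reach ben unexpanded, hence the single quotes
# in the command printed below.
echo "ben rm -o output '~$pattern'"
```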

node…, -b node, --node node, -B file, --node-file file

Specify a set of nodes, or a file containing a set of nodes. Nodes can be described by their name, their numerical ID, a range of numerical IDs (e.g. 5-8), or (if node starts with ~) a POSIX extended regular expression. Beware of some shells expanding the ~ character; quotes may be necessary. If node is not specified, match all nodes.

-n count, --processes count

Update the maximum number of jobs to be run simultaneously by this client instance to count. The default value is zero.

--retire

For scale or kill, do not interrupt currently-running jobs, even if they are over the updated capacity. For exit, wait until all pending jobs are finished before stopping the server.

-l, --long

Long form (display job start and stop times).

-t d|r|p…, --status d|r|p

Select only jobs that have a given status:

  • d: done, successfully completed
  • r: running, currently executing on a client
  • p: pending, in the queue

-v, --verbose

Verbose mode (display internal buffers status).

-i file, --input file

Specify a file describing jobs and their properties. The file contents follow the same format as the output of the --snapshot server option, allowing for queue recovery after a server process was interrupted. It is a .ini file containing a series of “[job]” sections. For each job, the entries are as follows:

  • type: if this optional field is present and its value is “done”, then the job is ignored
  • shell: the shell command to be used on the client side
  • dir: the output directory on the server side
  • options: the content for the “options” environment variable
  • name: the job name
  • command: the job command
  • sync: if this optional field is “yes” (default: “no”), then the job is a synchronizing one, as if queued using “ben exec”
  • restrict_name: if this optional field is present, then the job can only run on a client with the specified name

A few other entries are ignored but accepted without warning, for compatibility with the output of the --snapshot server option: “id”, “running_id”, “running_name”, “restrict_id”, “forward_id”, “forward_name”, “ran_id”, “ran_name”, “stdout_path”, “stderr_path”, “start_time”, “stop_time”, “duration”.

Mandatory entries that are missing take their values from command-line parameters.

All values can be quoted (with double-quotes), in which case the escape sequences “\\”, “\"” and “\n” are understood.
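A job file of this kind can be generated from a script. The sketch below is hedged in several ways: the helper name make_job_file is hypothetical, it assumes the common “key = value” .ini syntax (which should be checked against a real --snapshot file), and it does not escape backslashes or double quotes inside values.

```shell
#!/bin/sh
# Sketch: emit one "[job]" section per job name, for use with "ben add -i".
# Assumption: entries use the common "key = value" .ini syntax; compare
# with a real --snapshot file before relying on this.
# Limitation: values containing \ or " are not escaped here.
make_job_file() {
    job_dir=$1
    job_cmd=$2
    shift 2
    for name in "$@"; do
        printf '[job]\n'
        printf 'dir = "%s"\n' "$job_dir"
        printf 'name = "%s"\n' "$name"
        printf 'command = "%s"\n\n' "$job_cmd"
    done
}

make_job_file output 'echo $job' run-1 run-2
```

The “shell” and “options” entries are left out on purpose; as stated above, missing mandatory entries take their values from command-line parameters.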

OLD VERSIONS OF SSH

Unix-domain sockets are supported by ssh since version 6.7 (August 2014). Older versions will print the error message:

Bad remote forwarding specification

In order to work around the issue, we need “ben” to perform all networking over TCP/IP (with the caveat mentioned above: anyone with network access to the “ben” server must then be trusted). To achieve this, run the server with “ben server -s:” to listen on all interfaces, or “ben server -x:” to listen on the loopback interface (localhost) only. All further commands will require the corresponding “-s:” or “-x:”. Then, all forwarding commands will have to specify “--local-extension :”, and all remote commands will have to specify “--remote-extension :”.
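Whether the workaround is needed can be checked from the version banner printed by “ssh -V”. The sketch below assumes that the banner starts with “OpenSSH_<major>.<minor>”, as OpenSSH prints it; the function name supports_unix_forwarding is hypothetical, and any banner it cannot parse is treated as unsupported.

```shell
#!/bin/sh
# Decide from an "ssh -V" banner whether Unix-domain socket forwarding
# (OpenSSH 6.7 or later) is available.
# Assumption: the banner starts with "OpenSSH_<major>.<minor>".
supports_unix_forwarding() {
    version=${1#OpenSSH_}
    [ "$version" != "$1" ] || return 1       # not an OpenSSH banner
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%[!0-9]*}
    case $major in
        ''|*[!0-9]*) return 1 ;;             # unparsable major version
    esac
    [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "${minor:-0}" -ge 7 ]; }
}

# ssh -V prints its banner on stderr.
banner=$(ssh -V 2>&1) || banner=unknown
if supports_unix_forwarding "$banner"; then
    echo "Unix-domain socket forwarding available"
else
    echo "fall back to TCP/IP (-s: or -x:)"
fi
```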

MANUAL SSH SETUP

For situations where -f (--forward) and -r (--remote) are not sufficient, one may want to set up ssh forwarding manually. The two typical situations are (a) we can connect to the clients from the server and (b) we can connect to the server from the clients.

If we can connect to the clients from the server, the server-side ssh command to create clients would typically look like:

ssh -f -R 9000:localhost:9000 user@client-host \
    "ben client -s:"

Instead, if we can connect to the server from the client, then we can run on each client:

ssh -f -N -L 9000:localhost:9000 user@server-host

then

ben client -s: -d

EXAMPLES

Set up a server:

ben server -d

Start a remote client:

ben client --remote user@machine -n 4 -d

Run script script.sh with job names stored in file jobs.txt, and store output in directory output/:

ben add -o output -C script.sh -J jobs.txt

Monitor progress:

ben list

Remove a job named bad-job from the previously-added batch:

ben rm -o output bad-job

Stop running jobs on the client machine after it is done with all currently-running jobs:

ben scale machine -n 0 --retire

Stop server and all clients:

ben exit

CONFIGURATION AND FILES

Whenever “ben add” or “ben exec” is run, the $SHELL environment variable is captured along with the command specification. Clients will attempt to execute the commands in that same shell.

When the -d (--daemon) option is specified, output is redirected to /tmp/ben-$USER/server-$PID, /tmp/ben-$USER/client-$PID or /tmp/ben-$USER/remote-seq. Note that $PID is the process ID of “ben” before it calls fork() to create a daemon. It is typically smaller than the PID of the daemon created. The random string seq is identical to the one used in the path of the remote client-side Unix domain socket by default.

BUGS

As noted above, the -r option to ben client has several caveats. First, ben has to be installed on the remote, and has to be in the $PATH. Adding its path to the $PATH variable in a user’s .profile (or, say, .bashrc) will not work, because ssh runs ben in a non-login shell. This can be mitigated by specifying --remote-path. Second, by default, a listening Unix socket will be created on the remote in the directory /tmp/ben-$USER. If this directory does not exist, ssh will not be able to create the socket. ben will attempt to detect such a situation, run mkdir on the remote, and retry. However, it will need three ssh connections for that, so you may need to type a password three times.
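One way to sidestep the retry sequence is to create the remote directory before starting the client. The sketch below only prints the commands by default; the helper names run and start_remote_client and the DRY_RUN switch are hypothetical, user@client-host is a placeholder, and the path assumes the default remote socket location.

```shell
#!/bin/sh
# Sketch: create the remote socket directory up front so that
# "ben client -r" does not need its mkdir-and-retry fallback.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to run for real.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

start_remote_client() {
    target=$1
    # \$USER is left for the remote shell to expand, so the directory
    # matches the default path on the remote host.
    run ssh "$target" "mkdir -p /tmp/ben-\$USER" || return 1
    run ben client -r "$target" -d
}

start_remote_client user@client-host
```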

The reason for putting sockets in a per-user subdirectory is to ensure that permissions work properly. POSIX does not state anything about the permissions of the socket file itself.

SEE ALSO

ssh(1), unix(7), ip(7), sshd(8)

AUTHOR

Copyright 2012-2024 Laurent Poirrier. Released under the GPL version 3.
